Commit Graph

29 Commits

Author SHA1 Message Date
Kovid Goyal
a7c06b38e6 We don't actually need vzeroupper at start of function
GCC emits vzeroupper automatically when compiling with native
optimizations but we still need it otherwise
2024-02-25 09:57:43 +05:30
Kovid Goyal
720618bc37 Use go 1.22 for building
It supports PCALIGN on non-ARM arches as well
2024-02-25 09:57:43 +05:30
Kovid Goyal
ede4d7fbca ... 2024-02-25 09:57:42 +05:30
Kovid Goyal
c01b959723 Fix Go unaligned index implementation 2024-02-25 09:57:42 +05:30
Kovid Goyal
7467307200 Add some alignment tests 2024-02-25 09:57:42 +05:30
Kovid Goyal
bbdb0b15f3 DRYer 2024-02-25 09:57:42 +05:30
Kovid Goyal
b5edd9ad57 Don't precalculate mask in loop body
No need since we don't shift. Avoids the extra mask instructions for the
not found case.
2024-02-25 09:57:42 +05:30
Kovid Goyal
f9fd6ffd46 Use only aligned loads for index funcs
Also obviates the necessity for safe slice wrappers
2024-02-25 09:57:41 +05:30
Kovid Goyal
31a5fcf297 DRYer 2024-02-25 09:57:41 +05:30
Kovid Goyal
561712090d Fix cmplt implementation 2024-02-25 09:57:41 +05:30
Kovid Goyal
d9190ea675 DRYer 2024-02-25 09:57:41 +05:30
Kovid Goyal
57f4ea4d4a Add some tests for broadcast from constant intrinsic 2024-02-25 09:57:41 +05:30
Kovid Goyal
9b0ae8d403 Don't use VEX-encoded instructions for 128-bit ISA 2024-02-25 09:57:41 +05:30
Kovid Goyal
aed0611fb8 Avoid double trailing RET 2024-02-25 09:57:40 +05:30
Kovid Goyal
5a5e31c38b Also zero upper at start of function 2024-02-25 09:57:40 +05:30
Kovid Goyal
db2e0e816d Fix mixing of register types in the same function 2024-02-25 09:57:40 +05:30
Kovid Goyal
a298781b85 DRYer 2024-02-25 09:57:40 +05:30
Kovid Goyal
d5cd9ef2ca ... 2024-02-25 09:57:40 +05:30
Kovid Goyal
da31db3212 ... 2024-02-25 09:57:40 +05:30
Kovid Goyal
601c4ad4df Fix some typos 2024-02-25 09:57:40 +05:30
Kovid Goyal
68d800d4fa make clean should clean generated asm as well 2024-02-25 09:57:40 +05:30
Kovid Goyal
9fc3db1dd1 Work on C0 index func 2024-02-25 09:57:40 +05:30
Kovid Goyal
161eae78b6 Make generated asm_* files world readable 2024-02-25 09:57:40 +05:30
Kovid Goyal
77cfd44f24 More efficient clearing of register to all zeros or all ones 2024-02-25 09:57:39 +05:30
Kovid Goyal
59be7213cf Make set1_epi8 more general 2024-02-25 09:57:39 +05:30
Kovid Goyal
d60dacbd09 Implement > and < intrinsics for vector registers 2024-02-25 09:57:39 +05:30
Kovid Goyal
82b7b4fcce Make a reusable template for generating ASM index functions with different tests 2024-02-25 09:57:39 +05:30
Kovid Goyal
4e6138d785 Generate SIMD code during build 2024-02-25 09:57:39 +05:30
Kovid Goyal
de8c1e0206 Work on porting SIMD VT parser to Go for the kittens 2024-02-25 09:57:39 +05:30