Commit Graph

13410 Commits

Author SHA1 Message Date
Kovid Goyal
86a55e2c0a Use an aligned slice for file reads 2024-02-25 09:57:39 +05:30
Kovid Goyal
de8c1e0206 Work on porting SIMD vt parser to Go for the kittens 2024-02-25 09:57:39 +05:30
Kovid Goyal
131716da00 Ignore another warning on some compiler versions in simde 2024-02-25 09:57:39 +05:30
Kovid Goyal
4d35fc2928 Use a custom movmask for ARM rather than the one from simde
Supposedly faster, not that I can measure it, but...
Also gives neater code, so keep it.
2024-02-25 09:57:39 +05:30
Kovid Goyal
3b65c1a58a remove declaration without implementation 2024-02-25 09:57:39 +05:30
Kovid Goyal
9bca415af2 Use aligned loads when finding either of two bytes
No measurable performance improvement, but neater algorithm anyway.
2024-02-25 09:57:39 +05:30
Kovid Goyal
60bc8e6c25 ... 2024-02-25 09:57:39 +05:30
Kovid Goyal
8aa1b112b8 Turns out the simde implementation of movemask is not slow enough to compensate for the speed bump from 256 bit 2024-02-25 09:57:39 +05:30
Kovid Goyal
0bd47d8457 Cleanup KITTY_NO_SIMD compilation 2024-02-25 09:57:39 +05:30
Kovid Goyal
fcbda63023 Move finding byte code into separate functions
movemask() is inefficient on ARM64; this will allow us to use a dedicated
implementation for finding bytes on that platform
2024-02-25 09:57:38 +05:30
Kovid Goyal
1d59bfade3 ... 2024-02-25 09:57:38 +05:30
Kovid Goyal
fd7d0f8787 Fix event loop continuously ticking every input_delay seconds even when no input is available 2024-02-25 09:57:38 +05:30
Kovid Goyal
fa11858a72 Make bash integration tests more robust on macOS 2024-02-25 09:57:38 +05:30
Kovid Goyal
1293ee60e0 ... 2024-02-25 09:57:38 +05:30
Kovid Goyal
66341aa28e Make the env var controlling which SIMD level to use more capable 2024-02-25 09:57:38 +05:30
Kovid Goyal
73342411bc Don't build any SIMD code when the target is neither ARM64 nor x86/amd64 2024-02-25 09:57:38 +05:30
Kovid Goyal
8dd6f9b07c Get universal builds working again
Now we use lipo and build individually so we can pass the correct
compiler flags per arch
2024-02-25 09:57:38 +05:30
Kovid Goyal
7e77a196e6 Build only the SIMD code with SIMD compiler flags 2024-02-25 09:57:38 +05:30
Kovid Goyal
465616223c Drop using the v2 microarch
No significant performance impact and small risk of breakage
2024-02-25 09:57:38 +05:30
Kovid Goyal
9d4193f4ea Fix texture ref not useable on repurposed image object 2024-02-25 09:57:38 +05:30
Kovid Goyal
dafb876d75 Skip simd parser tests on machines without SIMD instructions 2024-02-25 09:57:38 +05:30
Kovid Goyal
4b846e0106 Turns out that using 256 bit code on ARM is slightly faster even though it is emulated with 128 bit registers 2024-02-25 09:57:38 +05:30
Kovid Goyal
76c6630084 Don't use 256 bit code paths on ARM
ARM only has 128 bit registers. simde simulates 256 bit operations using
them, which is fairly pointless for us.
2024-02-25 09:57:38 +05:30
Kovid Goyal
23a4012aeb Add an env var to turn off use of SIMD instructions 2024-02-25 09:57:38 +05:30
Kovid Goyal
eee14ae148 Workaround for machines on GitHub Actions that incorrectly report CPU vector instruction availability 2024-02-25 09:57:37 +05:30
Kovid Goyal
b0ccaa09be Clean up test env reporting 2024-02-25 09:57:37 +05:30
Kovid Goyal
bbaccfdaae DRYer 2024-02-25 09:57:37 +05:30
Kovid Goyal
cb5a2cce53 ... 2024-02-25 09:57:37 +05:30
Kovid Goyal
4fec11af05 Run dsymutil in post link phase 2024-02-25 09:57:37 +05:30
Kovid Goyal
5a9304e1b8 DRYer 2024-02-25 09:57:37 +05:30
Kovid Goyal
2b9c646c5b Build dSYM bundles on CI 2024-02-25 09:57:37 +05:30
Kovid Goyal
6b6f3e0ece ... 2024-02-25 09:57:37 +05:30
Kovid Goyal
b560fe34c9 Give the functions for creating various objects unique names so they are easily recognized in macOS's non-fully-symbolicated crash reports 2024-02-25 09:57:37 +05:30
Kovid Goyal
e5b27d066c Output macOS crash reports on CI with nicer formatting 2024-02-25 09:57:37 +05:30
Kovid Goyal
8762a939c0 Don't specify arch/tune when building universal binary 2024-02-25 09:57:37 +05:30
Kovid Goyal
06da31019c Micro-optimize clearing of lines
Use a doubling strategy to memset arrays to a fixed value. Makes the
memset O(log(N)) from O(N) in number of calls to memcpy.
2024-02-25 09:57:37 +05:30
Kovid Goyal
d0621cb82a Better ipd crash report printing 2024-02-25 09:57:37 +05:30
Kovid Goyal
9935b5ddb2 ... 2024-02-25 09:57:37 +05:30
Kovid Goyal
49d664bb0d Fix incorrect line mapping when clearing screen using optimized code 2024-02-25 09:57:37 +05:30
Kovid Goyal
c6c0d0ed60 Sleep for a minute in the hope that macOS crash log will become available 2024-02-25 09:57:37 +05:30
Kovid Goyal
6f74d1b0c1 ... 2024-02-25 09:57:36 +05:30
Kovid Goyal
5eb852532f Use coredumpctl on Linux CI 2024-02-25 09:57:36 +05:30
Kovid Goyal
43e0281ab5 No ulimit on Linux CI 2024-02-25 09:57:36 +05:30
Kovid Goyal
99d1eec021 ... 2024-02-25 09:57:36 +05:30
Kovid Goyal
0a158f3577 More attempts at finding a core dump on macOS 2024-02-25 09:57:36 +05:30
Kovid Goyal
89c431a624 Optimize implementation of clear screen escape code 2024-02-25 09:57:36 +05:30
Kovid Goyal
b48b70aedf Speed up CSI benchmark by another 10% 2024-02-25 09:57:36 +05:30
Kovid Goyal
f105bc5f4e ... 2024-02-25 09:57:36 +05:30
Kovid Goyal
d5fae07ab7 More help text for the benchmark kitten 2024-02-25 09:57:36 +05:30
Kovid Goyal
58dbcf0840 ... 2024-02-25 09:57:36 +05:30
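
Note on 06da31019c: the doubling strategy that commit message describes can be sketched roughly as below. This is an illustration only, not the code from the commit. The function name fill_by_doubling and its signature are hypothetical, and the sketch assumes that value does not point inside dest. The first element is seeded once, then the already-filled prefix is copied onto the unfilled tail, so the filled region doubles with each memcpy call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not taken from the commit): fill `count` elements of
 * `elem_size` bytes each, starting at `dest`, with copies of `*value`.
 * Assumes `value` does not point inside the `dest` buffer. */
static void
fill_by_doubling(void *dest, const void *value, size_t elem_size, size_t count) {
    assert(dest != NULL && value != NULL);
    if (count == 0 || elem_size == 0) return;

    unsigned char *base = dest;
    const size_t total = elem_size * count;

    /* Seed the first element. */
    memcpy(base, value, elem_size);
    size_t filled = elem_size;

    /* Copy the filled prefix onto the tail; the prefix doubles each pass. */
    while (filled < total) {
        size_t chunk = filled;
        if (chunk > total - filled) chunk = total - filled;
        /* Source [0, chunk) and destination [filled, filled + chunk) cannot
         * overlap because chunk <= filled, so memcpy is safe here. */
        memcpy(base + filled, base, chunk);
        filled += chunk;
    }
}
```

For N elements this makes about 1 + ceil(log2(N)) calls to memcpy, rather than one call per element. The number of bytes written is still proportional to N, which matches the commit message's wording: the saving is in the number of calls.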