kitty

mirror of https://github.com/kovidgoyal/kitty synced 2026-07-27 10:41:58 +02:00

Author	SHA1	Message	Date
Kovid Goyal	3abdc54e4b	...	2024-02-25 09:57:41 +05:30
Kovid Goyal	618aeec709	Finally got gnome-terminal to run on my system. Apparently it needed some kind of GTK desktop portal or the other 🙄 Interesting that its numbers are basically the same as alacritty's. Lot better than I remember, I guess the recent libvte performance work was good.	2024-02-25 09:57:41 +05:30
Kovid Goyal	4585361161	Micro optimization	2024-02-25 09:57:41 +05:30
Kovid Goyal	f64739c29b	Fix regression that broke handling of single byte control chars when cursor is on second cell of wide character	2024-02-25 09:57:41 +05:30
Kovid Goyal	f3830aa854	Avoid unnecessary if	2024-02-25 09:57:41 +05:30
Kovid Goyal	f1fe0bf40a	Code to easily compare SIMD and scalar decode in a live instance. Also remove -mtune=intel as it fails with clang	2024-02-25 09:57:41 +05:30
Kovid Goyal	561712090d	Fix cmplt implementation	2024-02-25 09:57:41 +05:30
Kovid Goyal	d5f34c401d	Better vector registers to pre-calculate before the loop	2024-02-25 09:57:41 +05:30
Kovid Goyal	d9190ea675	DRYer	2024-02-25 09:57:41 +05:30
Kovid Goyal	57f4ea4d4a	Add some tests for broadcast from constant intrinsic	2024-02-25 09:57:41 +05:30
Kovid Goyal	9b0ae8d403	Don't use VEX encoded instructions for 128 bit ISA	2024-02-25 09:57:41 +05:30
Kovid Goyal	aed0611fb8	Avoid double trailing RET	2024-02-25 09:57:40 +05:30
Kovid Goyal	920b8a2496	Use VZEROUPPER in avx functions. See https://www.intel.com/content/dam/develop/external/us/en/documents/11mc12-avoiding-2bavx-sse-2btransition-2bpenalties-2brh-2bfinal-809104.pdf	2024-02-25 09:57:40 +05:30
Kovid Goyal	5a5e31c38b	Also zero upper at start of function	2024-02-25 09:57:40 +05:30
Kovid Goyal	db2e0e816d	Fix mixing of register types in the same function	2024-02-25 09:57:40 +05:30
Kovid Goyal	a298781b85	DRYer	2024-02-25 09:57:40 +05:30
Kovid Goyal	d5cd9ef2ca	...	2024-02-25 09:57:40 +05:30
Kovid Goyal	55c909c656	Use -mtune=intel for SIMD files when building without native optimizations	2024-02-25 09:57:40 +05:30
Kovid Goyal	da31db3212	...	2024-02-25 09:57:40 +05:30
Kovid Goyal	601c4ad4df	Fix some typos	2024-02-25 09:57:40 +05:30
Kovid Goyal	2549b4328f	Update throughput comparison table in light of latest improvements	2024-02-25 09:57:40 +05:30
Kovid Goyal	68d800d4fa	make clean should clean generated asm as well	2024-02-25 09:57:40 +05:30
Kovid Goyal	9fc3db1dd1	Work on C0 index func	2024-02-25 09:57:40 +05:30
Kovid Goyal	d4c4805f96	const away to glory	2024-02-25 09:57:40 +05:30
Kovid Goyal	161eae78b6	Make generated asm_* files world readable	2024-02-25 09:57:40 +05:30
Kovid Goyal	6cdc7ac91d	A further 5% speedup for UTF-8 decoding. Achieved by decoding in larger chunks thereby amortizing the cost of creating various constant vectors over larger chunks.	2024-02-25 09:57:40 +05:30
Kovid Goyal	0bccada9d1	No longer need to abort after dealing with trailing bytes	2024-02-25 09:57:40 +05:30
Kovid Goyal	9cb9373274	Allow unbounded output in UTF8Decoder. This will allow us to eventually decode more than a single vector's worth in a fast inner loop	2024-02-25 09:57:39 +05:30
Kovid Goyal	d987ffe49a	Use unaligned stores. Makes no measurable difference in the benchmark. And will eventually allow us to process larger chunks of data without needing to reset a bunch of vector registers to constant values each time.	2024-02-25 09:57:39 +05:30
Kovid Goyal	77cfd44f24	More efficient clearing of register to all zeros or all ones	2024-02-25 09:57:39 +05:30
Kovid Goyal	59be7213cf	Make set1_epi8 more general	2024-02-25 09:57:39 +05:30
Kovid Goyal	d60dacbd09	Implement > and < intrinsics for vector registers	2024-02-25 09:57:39 +05:30
Kovid Goyal	82b7b4fcce	Make a re-useable template for generating ASM index functions with different tests	2024-02-25 09:57:39 +05:30
Kovid Goyal	fa9a2b1e2e	Switch file input to use new SIMD parser to search for \n and \r in parallel	2024-02-25 09:57:39 +05:30
Kovid Goyal	4e6138d785	Generate SIMD code during build	2024-02-25 09:57:39 +05:30
Kovid Goyal	86a55e2c0a	Use an aligned slice for file reads	2024-02-25 09:57:39 +05:30
Kovid Goyal	de8c1e0206	Work on porting SIMD vt parser to Go for the kittens	2024-02-25 09:57:39 +05:30
Kovid Goyal	131716da00	Ignore another warning on some compiler versions in simde	2024-02-25 09:57:39 +05:30
Kovid Goyal	4d35fc2928	Use a custom movmask for ARM rather than the one from simde. Supposedly faster, not that I can measure it, but... Also gives neater code, so keep it.	2024-02-25 09:57:39 +05:30
Kovid Goyal	3b65c1a58a	remove declaration without implementation	2024-02-25 09:57:39 +05:30
Kovid Goyal	9bca415af2	Use aligned loads when finding either of two bytes. No measurable performance improvement, but neater algorithm anyway.	2024-02-25 09:57:39 +05:30
Kovid Goyal	60bc8e6c25	...	2024-02-25 09:57:39 +05:30
Kovid Goyal	8aa1b112b8	Turns out the simde implementation of movemask is not slow enough to compensate for the speed bump from 256 bit	2024-02-25 09:57:39 +05:30
Kovid Goyal	0bd47d8457	Cleanup KITTY_NO_SIMD compilation	2024-02-25 09:57:39 +05:30
Kovid Goyal	fcbda63023	Move finding byte code into separate functions. movemask() is inefficient on ARM64; this will allow us to use a dedicated implementation for finding bytes on that platform	2024-02-25 09:57:38 +05:30
Kovid Goyal	1d59bfade3	...	2024-02-25 09:57:38 +05:30
Kovid Goyal	fd7d0f8787	Fix event loop continuously ticking every input_delay seconds even when no input is available	2024-02-25 09:57:38 +05:30
Kovid Goyal	fa11858a72	Make bash integration tests more robust on macOS	2024-02-25 09:57:38 +05:30
Kovid Goyal	1293ee60e0	...	2024-02-25 09:57:38 +05:30
Kovid Goyal	66341aa28e	Make the env var controlling which SIMD level to use more capable	2024-02-25 09:57:38 +05:30

13445 Commits