Mirror of https://github.com/kovidgoyal/kitty (synced 2026-06-08 22:28:24 +02:00)
Modified `start_classification` in `utf8_decode_to_esc` in `simd-string-impl.h`, which now:

- Rejects the lead bytes `0xC0`, `0xC1` and `0xF5..0xFF`, which can never start a valid UTF-8 sequence.
- Enforces restricted ranges for the second byte after the lead bytes `0xE0`, `0xED`, `0xF0` and `0xF4`, to prevent overlong sequences, surrogates, and code points above U+10FFFF.
- Accumulates UTF-8 validation errors in a single vector to avoid many conditional branches.
- Worsens unicode benchmark performance by about 4%.
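For reference, the validation rules listed above can be sketched in scalar C. This is not the SIMD code from `simd-string-impl.h`; the names `bad_lead`, `bad_second` and `utf8_valid_sketch` are hypothetical and exist only for illustration. The sketch assumes the standard UTF-8 well-formedness ranges and ORs every failed check into a single `errors` value, loosely mirroring the idea of accumulating errors instead of branching on each one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if `b` can never be a valid UTF-8 lead byte. */
static unsigned bad_lead(uint8_t b) {
    return (unsigned)(b == 0xC0) | (unsigned)(b == 0xC1) | (unsigned)(b >= 0xF5);
}

/* Hypothetical helper: nonzero if `second` is outside the range allowed
 * after `lead`. Most lead bytes allow 0x80..0xBF; four are special. */
static unsigned bad_second(uint8_t lead, uint8_t second) {
    uint8_t lo = 0x80, hi = 0xBF;
    switch (lead) {
        case 0xE0: lo = 0xA0; break; /* below 0xA0 would be an overlong 3-byte form */
        case 0xED: hi = 0x9F; break; /* above 0x9F would encode a surrogate */
        case 0xF0: lo = 0x90; break; /* below 0x90 would be an overlong 4-byte form */
        case 0xF4: hi = 0x8F; break; /* above 0x8F would exceed U+10FFFF */
        default: break;
    }
    return (unsigned)(second < lo) | (unsigned)(second > hi);
}

/* Scalar sketch: returns true if s[0..n) is well-formed UTF-8. */
static bool utf8_valid_sketch(const uint8_t *s, size_t n) {
    unsigned errors = 0; /* all failed checks are ORed into this value */
    size_t i = 0;
    while (i < n) {
        const uint8_t lead = s[i];
        size_t len;
        if (lead < 0x80) len = 1;
        else if (lead < 0xC0) { errors |= 1u; len = 1; } /* stray continuation byte */
        else if (lead < 0xE0) len = 2;
        else if (lead < 0xF0) len = 3;
        else len = 4;
        errors |= bad_lead(lead);
        if (len > n - i) { errors |= 1u; break; } /* truncated sequence */
        if (len >= 2) errors |= bad_second(lead, s[i + 1]);
        for (size_t k = 2; k < len; k++) {
            /* third and fourth bytes must be plain continuation bytes */
            errors |= (unsigned)((s[i + k] & 0xC0) != 0x80);
        }
        i += len;
    }
    return errors == 0;
}
```

With this sketch, `"\xE2\x82\xAC"` (U+20AC) is accepted, while `"\xC0\x80"` (overlong), `"\xED\xA0\x80"` (surrogate) and `"\xF4\x90\x80\x80"` (above U+10FFFF) are rejected.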