Improve performance, especially on non-ASCII codepoints #28
This PR introduces two performance-oriented changes. The main one is a new table-lookup strategy for non-ASCII codepoints, for which most of the table-generation code in `unicode.py` was rewritten (see the note to reviewers at the end). The second is using `if` instead of `match` in the ASCII fast-path, which leads to a ~15% performance improvement on ASCII text. This was previously suggested in "UnicodeWidthChar should inline an ascii fast path" (#1); it's been 7 years since that issue was created and `match` is still not optimized as well as `if`, so at this point I think it's justified to replace it with an `if` statement. I hope that this can be reverted after future compiler improvements, since the `match` expression was much more readable.
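For illustration, the shape of that change is roughly the following (a sketch, not the crate's actual code; `lookup_width` is a hypothetical stand-in for the non-ASCII table lookup):

```rust
// Hypothetical stand-in for the crate's non-ASCII table lookup; it
// delegates to the published per-char API just so the sketch runs.
fn lookup_width(c: char) -> Option<usize> {
    unicode_width::UnicodeWidthChar::width(c)
}

// Before (sketch), the ASCII fast path was one arm of a `match`:
//
//     match c {
//         ' '..='\u{7E}' => Some(1),
//         _ => lookup_width(c),
//     }
//
// After (sketch): an explicit `if`, which current rustc compiles to a
// cheaper range check than the equivalent `match`.
fn width(c: char) -> Option<usize> {
    if c >= ' ' && c <= '\u{7E}' {
        Some(1) // printable ASCII is always one column wide
    } else {
        lookup_width(c)
    }
}

fn main() {
    assert_eq!(width('A'), Some(1));
    assert_eq!(width('あ'), Some(2));
}
```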
I created two additional benchmarks to test these changes' performance.
`enwik8` runs `UnicodeWidthStr::width` on ~97MB of English Wikipedia XML data; `jawiki` runs `UnicodeWidthStr::width` on ~120MB of Japanese Wikipedia page metadata. I haven't included these benchmarks in the commit because I'm unsure about copyright/licensing on the data, and I can't remember where I obtained the `jawiki` data from.
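The benchmarks themselves are just whole-file `width()` calls, along these lines (a sketch, assuming nightly's `test::Bencher` and a hypothetical local path for the data):

```rust
#![feature(test)]
extern crate test;

use test::{black_box, Bencher};
use unicode_width::UnicodeWidthStr;

#[bench]
fn enwik8(b: &mut Bencher) {
    // Hypothetical path; the data file is not checked into the repo.
    let data = std::fs::read_to_string("benches/enwik8").unwrap();
    b.iter(|| black_box(data.as_str()).width());
}

// `jawiki` is identical apart from the input file.
```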
Here are the benchmarks for the current version of `unicode-width`:

And here are the benchmarks with this PR applied:

`jawiki` still includes a lot of ASCII characters, so the performance improvement for non-ASCII characters is likely larger than 70%.

I also tested the new code against `wcswidth`. This wasn't a perfect comparison: `wcswidth` was at a disadvantage in some respects, while `UnicodeWidthStr` was handicapped by having to pass the input string through `.chars()` rather than operating directly on `wchar_t`s as `wcswidth` does. I accounted for the latter by collecting `.chars()` into a `Vec` and then benchmarking `vec.iter().map(|&c| cw::width(c, false).unwrap_or(0)).fold(0, Add::add)`.
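In full, the measured Rust side looks roughly like this (a sketch; the `cw` module is a stand-in that here just delegates to the crate's public per-char API, with the `is_cjk` flag unused):

```rust
use std::ops::Add;

// Stand-in for the `cw` width helper; the real one presumably dispatches
// on the `is_cjk` flag, which is ignored here.
mod cw {
    pub fn width(c: char, _is_cjk: bool) -> Option<usize> {
        unicode_width::UnicodeWidthChar::width(c)
    }
}

fn main() {
    let text = "日本語 and ASCII, mixed";
    // Decode once, up front, so the Rust side starts from pre-decoded
    // codepoints just as `wcswidth` starts from a `wchar_t` buffer.
    let vec: Vec<char> = text.chars().collect();
    // The benchmarked expression:
    let total = vec.iter().map(|&c| cw::width(c, false).unwrap_or(0)).fold(0, Add::add);
    println!("total width: {total}");
}
```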
With these disclaimers, the new code is ~30% faster than the system implementation of `wcswidth` and ~60% faster than http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c when run on `jawiki`.

I've run the following code to ensure that I didn't introduce any behavioral regressions:
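(A sketch of such a check; it assumes the previous release is pulled in under a hypothetical `unicode_width_old` rename in `Cargo.toml`.)

```rust
// Cargo.toml (hypothetical rename of the prior release):
//     unicode_width_old = { package = "unicode-width", version = "0.1" }
use unicode_width::UnicodeWidthChar as New;
use unicode_width_old::UnicodeWidthChar as Old;

fn main() {
    // Compare both width variants across every Unicode scalar value.
    for cp in 0u32..=0x10FFFF {
        if let Some(c) = char::from_u32(cp) {
            assert_eq!(New::width(c), Old::width(c), "width differs at U+{cp:04X}");
            assert_eq!(New::width_cjk(c), Old::width_cjk(c), "width_cjk differs at U+{cp:04X}");
        }
    }
    println!("all codepoints agree");
}
```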
Apologies in advance to the reviewer(s): I experimented with a few different optimization methods (binary search variants, B+ trees, etc.) and by the time I was finished most of the code in `unicode.py` had been rewritten, so there's a lot to look through.