The performance is still not great compared to miniz/zlib on files with long runs of the same byte.
EDIT:
See next post.
Profiling reveals that lz77::longest_match and lz77::get_match_length are where most of the time is spent.
get_match_length is particularly problematic for data with many repetitions of a single byte, since that causes a large number of calls to this function. (There will be a large number of entries in the hash chain for the 3-byte sequences of that byte.) Currently it uses two zipped iterators to compare the matches, which may not be ideal performance-wise. C implementations of deflate seem to check multiple bytes at once by casting the bytes to larger data types. I've tested this, but it didn't seem to make a difference.
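For reference, the multi-byte comparison could look roughly like the sketch below. It is not the code in the crate: the name `match_length_chunked` is made up for illustration, and it reads 8 bytes at a time through `u64::from_le_bytes` instead of pointer casts, so no unsafe code is needed.

```rust
use std::convert::TryInto;

/// Longest match length allowed by the DEFLATE format.
const MAX_MATCH: usize = 258;

/// Hypothetical helper: count how many leading bytes of `a` and `b` are
/// equal, capped at MAX_MATCH, comparing 8 bytes per iteration.
pub fn match_length_chunked(a: &[u8], b: &[u8]) -> usize {
    let max = a.len().min(b.len()).min(MAX_MATCH);
    let (a, b) = (&a[..max], &b[..max]);
    let mut len = 0;
    for (ca, cb) in a.chunks_exact(8).zip(b.chunks_exact(8)) {
        let x = u64::from_le_bytes(ca.try_into().unwrap());
        let y = u64::from_le_bytes(cb.try_into().unwrap());
        let diff = x ^ y;
        if diff != 0 {
            // The bytes were loaded as little-endian, so the lowest set bit
            // of the XOR falls inside the first byte that differs.
            return len + (diff.trailing_zeros() / 8) as usize;
        }
        len += 8;
    }
    // Fewer than 8 bytes remain: finish byte by byte.
    len + a[len..]
        .iter()
        .zip(&b[len..])
        .take_while(|(x, y)| x == y)
        .count()
}
```

Whether this beats the zipped iterators depends on how well the compiler already vectorises the simple loop, which may be why the earlier experiment showed no difference.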
In the longest_match function, array lookups seem to be the main cause of the slowdown (maybe because subsequent instructions depend on the value read from the array?). If we can find a way to reduce the number of lookups, or the length of the hash chains, without hurting the compression ratio, that would help performance.
For lower compression levels, other compressors simply hard-limit the length of the hash chains, and further adaptively reduce the chain length when there is already a decent match.
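A rough sketch of that idea, assuming a zlib-like scheme: `ChainLimits`, `build_chains` and this `longest_match` signature are all hypothetical and do not match the crate's real data structures (the real chains are window-relative arrays, not a `HashMap`). The search stops after `max_chain` candidates, and only gets a quarter of that budget when the previous position already produced a match of at least `good_length`.

```rust
use std::collections::HashMap;

const MIN_MATCH: usize = 3;
const MAX_MATCH: usize = 258;

/// Hypothetical per-level tuning knobs, named after zlib's parameters.
pub struct ChainLimits {
    pub max_chain: usize,   // hard cap on candidates visited
    pub good_length: usize, // previous match length considered "decent"
}

/// Toy chain builder for the example: links every position before `pos` to
/// the previous position starting with the same 3 bytes, and returns the
/// most recent candidate for `pos` together with the links.
pub fn build_chains(data: &[u8], pos: usize) -> (Option<usize>, Vec<Option<usize>>) {
    let mut head: HashMap<[u8; 3], usize> = HashMap::new();
    let mut prev = vec![None; data.len()];
    for i in 0..pos.min(data.len().saturating_sub(2)) {
        prev[i] = head.insert([data[i], data[i + 1], data[i + 2]], i);
    }
    let first = if pos + 2 < data.len() {
        head.get(&[data[pos], data[pos + 1], data[pos + 2]]).copied()
    } else {
        None
    };
    (first, prev)
}

/// Walk the chain from `first`, returning the best (length, distance) found.
pub fn longest_match(
    data: &[u8],
    pos: usize,
    first: Option<usize>,
    prev: &[Option<usize>],
    prev_length: usize,
    limits: &ChainLimits,
) -> Option<(usize, usize)> {
    // Adaptive reduction: spend a quarter of the budget if we already
    // have a decent match from the previous position.
    let mut budget = if prev_length >= limits.good_length {
        (limits.max_chain / 4).max(1)
    } else {
        limits.max_chain
    };
    let mut best: Option<(usize, usize)> = None;
    let mut candidate = first;
    while let Some(cand) = candidate {
        if budget == 0 || cand >= pos {
            break;
        }
        budget -= 1;
        let len = data[cand..]
            .iter()
            .zip(&data[pos..])
            .take(MAX_MATCH)
            .take_while(|(a, b)| a == b)
            .count();
        if len >= MIN_MATCH && best.map_or(true, |(l, _)| len > l) {
            best = Some((len, pos - cand));
            if len == MAX_MATCH {
                break; // cannot be improved
            }
        }
        candidate = prev[cand];
    }
    best
}
```

With a small `max_chain`, a long run of one byte costs at most that many comparisons per position instead of one per earlier occurrence, at the price of occasionally missing a longer match further down the chain.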
Avoid bounds checks in various functions (huffman table length/distance lookups, hash chain, huffman length generation and others)
Check for faster CRC generators (the currently used library seems to be abandoned/inactive, with some open PRs containing improvements)
Consider using threads for some operations
Avoid buffering input when it's not needed (e.g. for the simple functions (deflate_*), and if write_all is called with a very long buffer).
Specialisation is unstable, so we might want to make a VecEncoder or something similar instead for now
Avoid buffering output when using a writer where writing is "guaranteed" (excluding OOM) to succeed, e.g. Vec.
Try to heap-allocate buffers close together in memory. I don't think this is currently doable in an easy way in stable Rust without blowing up the stack (box syntax is unstable, and even that may not work in debug mode).
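On the last point, one possible workaround on stable is to skip fixed-size arrays and carve the tables out of a single boxed slice. This is only a sketch: `ChainTables`, `HASH_SIZE` and `WINDOW_SIZE` are illustrative names and sizes, not the crate's actual types. As far as I know, `vec![0u16; n]` allocates zeroed memory directly on the heap, so no large temporary is built on the stack first, even in debug builds.

```rust
const HASH_SIZE: usize = 32 * 1024; // assumed size of the head table
const WINDOW_SIZE: usize = 32 * 1024; // assumed size of the prev table

/// Hypothetical container keeping both hash chain tables in one allocation.
pub struct ChainTables {
    storage: Box<[u16]>,
}

impl ChainTables {
    pub fn new() -> Self {
        ChainTables {
            // One heap allocation holding head followed by prev.
            storage: vec![0u16; HASH_SIZE + WINDOW_SIZE].into_boxed_slice(),
        }
    }

    /// Borrow the two tables as separate, adjacent mutable slices.
    pub fn split_mut(&mut self) -> (&mut [u16], &mut [u16]) {
        self.storage.split_at_mut(HASH_SIZE)
    }
}
```

The downside is that the table lengths are no longer part of the type, so indexing goes back to being bounds-checked unless the indices are masked to a power-of-two size.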