performance optimization #3

stevengj · 2023-05-10T12:26:00Z

I haven't done any benchmarking yet, but it seems likely that the current algorithm will be fairly slow. It can probably be made much faster if needed.

For the "plain text" of the document, PATTERN does a regex match one character at a time via the final (.) pattern. (And for each regex match there is a bunch of type-unstable code that executes.) One simple improvement would be to update the regex so that a single match can consume a long run of plain text.
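To illustrate the idea, here is a rough sketch in Python rather than Julia, using a made-up toy grammar (backslash control words, braces, and plain text). The names `SLOW`, `FAST`, `tokenize`, and `plain_text` are hypothetical, and neither regex is the package's actual PATTERN. The only difference between the two patterns is the final alternative: `FAST` tries a negated character class with `+` first and falls back to `.` for a stray special character.

```python
import re

# Toy grammar (an assumption for illustration, not the real PATTERN):
#   group 1: control word such as \par
#   group 2: a brace
#   group 3: plain text
SLOW = re.compile(r"\\([a-z]+)|([{}])|(.)", re.DOTALL)

# Same grammar, but the last alternative consumes a whole run of
# non-special characters; the trailing `.` still catches a lone
# backslash that does not start a control word.
FAST = re.compile(r"\\([a-z]+)|([{}])|([^\\{}]+|.)", re.DOTALL)


def tokenize(pattern, text):
    """Return the matched substrings in document order."""
    return [m.group(0) for m in pattern.finditer(text)]


def plain_text(pattern, text):
    """Concatenate only the plain-text matches (group 3)."""
    return "".join(
        m.group(3) for m in pattern.finditer(text) if m.group(3) is not None
    )


sample = r"{\b Hello, world}\par more text"
slow_tokens = tokenize(SLOW, sample)
fast_tokens = tokenize(FAST, sample)
```

Both patterns recover the same plain text from `sample`, but `SLOW` needs one match per character while `FAST` needs one match per run, so any per-match overhead is paid far less often.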

There might be other ways to improve performance. Relying on StringEncodings/iconv for translating a few bytes at a time (we have to flush before every print because Unicode and Windows-codepage encodings are intermixed) is surely inefficient, and also type-unstable because the code page is in the type of the encoder stream. But I don't really want to implement a Julia-native code-page conversion routine myself.
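One possible direction, sketched here in Python under loud assumptions: instead of converting a few bytes at a time, buffer consecutive bytes that share a code page and decode each run once, flushing only when the encoding changes. The function `decode_runs` and the `(encoding, bytes)` chunk format are hypothetical, and Python's built-in codecs stand in for StringEncodings/iconv. Whether this is workable depends on how the parser interleaves Unicode and code-page text, which the sketch does not model.

```python
def decode_runs(chunks):
    """Decode (encoding, bytes) chunks, one decode call per same-encoding run.

    `chunks` is a hypothetical input format: an iterable of
    (encoding_name, raw_bytes) pairs in document order.
    """
    out = []
    current = None      # encoding of the bytes currently buffered
    buf = bytearray()   # pending bytes for `current`
    for enc, data in chunks:
        if enc != current and buf:
            # Encoding changed: flush the buffered run in one call.
            out.append(bytes(buf).decode(current))
            buf.clear()
        current = enc
        buf += data
    if buf:
        out.append(bytes(buf).decode(current))
    return "".join(out)


text = decode_runs([
    ("cp1252", b"caf"),
    ("cp1252", b"\xe9"),          # same code page: joins the previous run
    ("cp1251", b"\xef\xf0\xe8"),  # code page changes: forces one flush
])
```

Here the first two chunks are decoded together, so the number of conversion calls scales with the number of encoding changes rather than with the number of chunks.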


stevengj added the enhancement New feature or request label May 10, 2023
