performance optimization #3

Open
stevengj opened this issue May 10, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

@stevengj
Member

I haven't done any benchmarking yet, but it seems likely that the current algorithm will be fairly slow. It can probably be made much faster if needed.

For the "plain text" of the document, PATTERN does a regex match one character at a time via the final (.) pattern. (And for each regex match there is a bunch of type-unstable code that executes.) One simple improvement would be to update the regex so that it can match a long run of plain text in a single match.
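
A minimal sketch of that idea, using toy patterns rather than the package's actual PATTERN (which is not shown in this issue). It assumes a simplified markup in which a backslash followed by lowercase letters is a control word and everything else is plain text; the names `SLOW`, `FAST`, and `doc` are hypothetical.

```julia
# Toy stand-ins for illustration only; not the package's real PATTERN.

# Plain text falls through to the final (.) and is matched one character at a time.
const SLOW = r"\\([a-z]+)|(.)"s

# Plain text is matched as a run of non-backslash characters.
# The final (.) is kept as a fallback for a stray backslash.
const FAST = r"\\([a-z]+)|([^\\]+)|(.)"s

doc = "hello \\b world\\i !"

n_slow = length(collect(eachmatch(SLOW, doc)))   # 16 matches
n_fast = length(collect(eachmatch(FAST, doc)))   # 5 matches

# Both patterns cover the whole input; only the number of matches differs.
roundtrip = join(m.match for m in eachmatch(FAST, doc))

println((n_slow, n_fast, roundtrip == doc))
```

Since some per-match code runs for every match, cutting the number of matches for plain text should cut that overhead roughly in proportion, though this is untested against the real pattern.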

There might be other ways to improve performance. Relying on StringEncodings/iconv for translating a few bytes at a time (we have to flush before every print because Unicode and Windows-codepage encodings are intermixed) is surely inefficient, and also type-unstable because the code page is in the type of the encoder stream. But I don't really want to implement a Julia-native code-page conversion routine myself.
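
To illustrate the type-instability point, here is a small sketch using hypothetical stand-in types, not the StringEncodings API. It contrasts putting the code page in a type parameter with storing it as a runtime field, and uses `Base.return_types` (an unexported introspection helper) to show what the compiler can infer in each case.

```julia
# Hypothetical stand-ins for illustration; these are not StringEncodings types.
struct TypedDecoder{CP} end      # code page stored as a type parameter
struct FieldDecoder              # code page stored as a runtime field
    codepage::Int
end

# The concrete return type depends on the runtime value `cp`,
# so inference cannot pin down a single concrete type.
make_typed(cp::Int) = TypedDecoder{cp}()

# Always returns the same concrete type, whatever `cp` is.
make_field(cp::Int) = FieldDecoder(cp)

codepage(::TypedDecoder{CP}) where {CP} = CP
codepage(d::FieldDecoder) = d.codepage

typed_inferred = only(Base.return_types(make_typed, (Int,)))
field_inferred = only(Base.return_types(make_field, (Int,)))

println(isconcretetype(typed_inferred))   # false
println(isconcretetype(field_inferred))   # true
```

Whether a field-based design is practical here depends on how the underlying iconv wrapper is structured, so this only shows where the instability comes from, not a drop-in fix.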

@stevengj stevengj added the enhancement New feature or request label May 10, 2023