Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Bottleneck in jwt::decoded_jwt due to Linear Search #362

Open
ItsAMeMarcel opened this issue Oct 1, 2024 · 2 comments · May be fixed by #364
Open

Performance Bottleneck in jwt::decoded_jwt due to Linear Search #362

ItsAMeMarcel opened this issue Oct 1, 2024 · 2 comments · May be fixed by #364

Comments

@ItsAMeMarcel
Copy link

What would you like to see added?

Lookup Table for base64/bas64url Decoding

Additional Context

Hi folks,

Over the past few days, I’ve been researching performance issues related to jwt-cpp and have noticed a bottleneck in the jwt::decoded_jwt / jwt::alphabet::index functionality. The main cause of this appears to be the use of find_if during Base64 decoding. Since find_if performs a linear search, this results in an O(n) complexity for the decoding process.

In contrast, the encoding process in jwt-cpp uses a lookup table, which operates in O(1), making it significantly faster.

I’d like to suggest two possible ways of improvement:

  1. Use OpenSSL’s Base64 implementation (OpenSSL using lookup tables): One challenge with this approach is that OpenSSL doesn’t natively support Base64URL, meaning we’d have to manually preprocess and postprocess the data. This would require caution to avoid modifying user data and introducing unnecessary copying.

  2. Implement a reverse lookup table for char-to-index mapping: This would align the decoding process with the encoding by using an O(1) lookup table. One consideration here is the limitation of constexpr in C++11. Also some changes in the internal functions of decode would be needed.

In my opinion, option 2 seems easier to implement.

What are your thoughts about this?

@prince-chrismc
Copy link
Collaborator

Any contributions will be very welcomed!

Any performance related topics need to have an objective numerical analysis. Which we currently do not have any bench marks so that would likely need to be added first. GoogleBenchmark is an easy choice https://github.com/google/benchmark

From an application point of view, the TLS connection and DNS looks tend to be the bottleneck so the impact of decoding tokens is less critical.

If you are looking to improve the code base by all means please submit a PR. Option 2 does sound better... I would also love to see char type templates for wide string support and a lookup table Could help there as well.

The decode is templates and you could quickly test the openssl option as well. Theres discussions of that in other issues #83

@Thalhammer
Copy link
Owner

One consideration here is the limitation of constexpr in C++11

I don't think that's really an issue for jwt-cpp, since it is not usable at compile time anyway ,and due to the vast amount of system crypto calls likely will never be.
I am not opposed to generating a 256 byte lookup table to speed up decoding.

I don't wanna use OpenSSL's code because it's

  • ugly and cumbersome to use
  • doesn't support base64url (or really any custom alphabet)
  • ties the code further to openssl which is the opposite of what we want
  • Is likely gonna end up being slower than the current solution (for the sizes we talk about with tokens) because of all the overhead of setting up stuff for openssl.

@ItsAMeMarcel ItsAMeMarcel linked a pull request Oct 18, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants