Performance Bottleneck in jwt::decoded_jwt due to Linear Search #362

ItsAMeMarcel · 2024-10-01T14:54:41Z

What would you like to see added?

Lookup Table for base64/bas64url Decoding

Additional Context

Hi folks,

Over the past few days, I’ve been researching performance issues related to jwt-cpp and have noticed a bottleneck in the jwt::decoded_jwt / jwt::alphabet::index functionality. The main cause of this appears to be the use of find_if during Base64 decoding. Since find_if performs a linear search, this results in an O(n) complexity for the decoding process.

In contrast, the encoding process in jwt-cpp uses a lookup table, which operates in O(1), making it significantly faster.

I’d like to suggest two possible ways of improvement:

Use OpenSSL’s Base64 implementation (OpenSSL using lookup tables): One challenge with this approach is that OpenSSL doesn’t natively support Base64URL, meaning we’d have to manually preprocess and postprocess the data. This would require caution to avoid modifying user data and introducing unnecessary copying.
Implement a reverse lookup table for char-to-index mapping: This would align the decoding process with the encoding by using an O(1) lookup table. One consideration here is the limitation of constexpr in C++11. Also some changes in the internal functions of decode would be needed.

In my opinion, option 2 seems easier to implement.

What are your thoughts about this?

prince-chrismc · 2024-10-01T15:09:02Z

Any contributions will be very welcomed!

Any performance related topics need to have an objective numerical analysis. Which we currently do not have any bench marks so that would likely need to be added first. GoogleBenchmark is an easy choice https://github.com/google/benchmark

From an application point of view, the TLS connection and DNS looks tend to be the bottleneck so the impact of decoding tokens is less critical.

If you are looking to improve the code base by all means please submit a PR. Option 2 does sound better... I would also love to see char type templates for wide string support and a lookup table Could help there as well.

The decode is templates and you could quickly test the openssl option as well. Theres discussions of that in other issues #83

Thalhammer · 2024-10-08T08:30:25Z

One consideration here is the limitation of constexpr in C++11

I don't think that's really an issue for jwt-cpp, since it is not usable at compile time anyway ,and due to the vast amount of system crypto calls likely will never be.
I am not opposed to generating a 256 byte lookup table to speed up decoding.

I don't wanna use OpenSSL's code because it's

ugly and cumbersome to use
doesn't support base64url (or really any custom alphabet)
ties the code further to openssl which is the opposite of what we want
Is likely gonna end up being slower than the current solution (for the sizes we talk about with tokens) because of all the overhead of setting up stuff for openssl.

ItsAMeMarcel added the enhancement label Oct 1, 2024

ItsAMeMarcel linked a pull request Oct 18, 2024 that will close this issue

Refactor base64 decoding to use a lookup table #364

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Performance Bottleneck in jwt::decoded_jwt due to Linear Search #362

Performance Bottleneck in jwt::decoded_jwt due to Linear Search #362

ItsAMeMarcel commented Oct 1, 2024

prince-chrismc commented Oct 1, 2024

Thalhammer commented Oct 8, 2024

Performance Bottleneck in jwt::decoded_jwt due to Linear Search #362

Performance Bottleneck in jwt::decoded_jwt due to Linear Search #362

Comments

ItsAMeMarcel commented Oct 1, 2024

What would you like to see added?

Additional Context

prince-chrismc commented Oct 1, 2024

Thalhammer commented Oct 8, 2024