Reduce stack usage when parsing JSON selection #6526

Merged
merged 2 commits into dev from pubmodmatt/connectors/stackoverflow on Jan 23, 2025

Conversation

pubmodmatt
Contributor

The `tuple` and `alt` combinators in nom can result in high stack usage because they take a number of parsers as parameters, all of which are on the stack at once. Breaking up `alt` calls into smaller chunks of parsers, and eliminating `tuple` in favor of calling parsers incrementally, greatly reduces the amount of data that must be on the stack at any one time.

This allows parsing a greater depth of nested JSON selection syntax within a given stack size (from a depth of 33 to 46 on my local machine for the test case I was using, roughly a 40% improvement).
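The `tuple` elimination described above can be illustrated with a minimal sketch. This is not the router's actual code and uses hypothetical names (`expect`, `parse_abc`); it mimics nom's `(remaining_input, value)` result shape with plain functions to show the before/after pattern:

```rust
// nom-style result: remaining input plus the parsed value, or an error.
type PResult<'a, T> = Result<(&'a str, T), &'static str>;

// Tiny stand-in for a nom parser: consume one expected character.
fn expect(input: &str, c: char) -> PResult<'_, char> {
    match input.chars().next() {
        Some(first) if first == c => Ok((&input[first.len_utf8()..], first)),
        _ => Err("unexpected character"),
    }
}

// Before (tuple-style), all sub-parsers are constructed up front and sit
// on the stack together for the duration of the call:
//
//     let (rest, (a, b, c)) = tuple((char('a'), char('b'), char('c')))(input)?;
//
// After (incremental), each step runs and releases its state before the
// next begins, so less data is live on the stack at once:
fn parse_abc(input: &str) -> PResult<'_, (char, char, char)> {
    let (input, a) = expect(input, 'a')?;
    let (input, b) = expect(input, 'b')?;
    let (input, c) = expect(input, 'c')?;
    Ok((input, (a, b, c)))
}
```

The incremental form also threads the remaining input explicitly through each step, which is what nom's `tuple` does internally anyway.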


Checklist

Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.

  • Changes are compatible1
  • Documentation2 completed
  • Performance impact assessed and acceptable
  • Tests added and passing3
    • Unit Tests
    • Integration Tests
    • Manual Tests

Exceptions

Note any exceptions here

Notes

Footnotes

  1. It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.

  2. Configuration is an important part of many changes. Where applicable please try to document configuration examples.

  3. Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions.

@pubmodmatt pubmodmatt self-assigned this Jan 9, 2025
@pubmodmatt pubmodmatt requested a review from a team as a code owner January 9, 2025 13:55
@router-perf

router-perf bot commented Jan 9, 2025

CI performance tests

  • connectors-const - Connectors stress test that runs with a constant number of users
  • const - Basic stress test that runs with a constant number of users
  • demand-control-instrumented - A copy of the step test, but with demand control monitoring and metrics enabled
  • demand-control-uninstrumented - A copy of the step test, but with demand control monitoring enabled
  • enhanced-signature - Enhanced signature enabled
  • events - Stress test for events with a lot of users and deduplication ENABLED
  • events_big_cap_high_rate - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity
  • events_big_cap_high_rate_callback - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity using callback mode
  • events_callback - Stress test for events with a lot of users and deduplication ENABLED in callback mode
  • events_without_dedup - Stress test for events with a lot of users and deduplication DISABLED
  • events_without_dedup_callback - Stress test for events with a lot of users and deduplication DISABLED using callback mode
  • extended-reference-mode - Extended reference mode enabled
  • large-request - Stress test with a 1 MB request payload
  • no-tracing - Basic stress test, no tracing
  • reload - Reload test over a long period of time at a constant rate of users
  • step-jemalloc-tuning - Clone of the basic stress test for jemalloc tuning
  • step-local-metrics - Field stats that are generated from the router rather than FTV1
  • step-with-prometheus - A copy of the step test with the Prometheus metrics exporter enabled
  • step - Basic stress test that steps up the number of users over time
  • xlarge-request - Stress test with 10 MB request payload
  • xxlarge-request - Stress test with 100 MB request payload

Member

@dylan-apollo dylan-apollo left a comment

In doing stack analysis, I feel like we should identify where we want the bottleneck. Like, certainly we never want a stack overflow at runtime during ApplyTo execution, it'd be much better to fail at composition (parsing or validation).

After this change, is the parser still the bottleneck? Or is it possible to parse selections which then overflow the stack when executing? Maybe this requires setting up fuzzing of some sort...

@pubmodmatt
Contributor Author

> In doing stack analysis, I feel like we should identify where we want the bottleneck. Like, certainly we never want a stack overflow at runtime during ApplyTo execution, it'd be much better to fail at composition (parsing or validation).
>
> After this change, is the parser still the bottleneck? Or is it possible to parse selections which then overflow the stack when executing? Maybe this requires setting up fuzzing of some sort...

@dylan-apollo - the goal here is just to make the parser code more efficient in its use of the stack space. This is not attempting to solve the larger problem of limiting the input selection size to avoid stack overflow generally. In the cases I've tested, the parser is still the first point of failure, but that is far from conclusive.

@pubmodmatt pubmodmatt requested a review from a team January 10, 2025 02:52
Member

@dylan-apollo dylan-apollo left a comment

Looks okay to me, and actually is easier to read to me, too! But that's because I'm not comfortable with nom, so maybe wait for a second ✅ 😅

```rust
)),
let (input, _) = spaces_or_comments(input)?;
alt((
Self::parse_primitive,
```
Member

Is moving this just about code organization? I'd assume adding another function increases stack usage rather than decreasing it?

Contributor Author

This actually does help considerably. My (possibly incorrect) understanding is that this reduces the number of parsers on the stack at this level. It adds a level for primitives, but that path is a dead end in terms of stack depth - primitives are processed with a very shallow depth of parsers. In contrast, the parse_object path can be very deep, so when we go down that path, anything on the alt parameter list at this level is amplified because it stays on the stack as we go down that branch.
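The effect described here can be sketched with a toy grammar. This is a hedged, self-contained illustration (hypothetical names and a deliberately simplified grammar, not the router's parser): the shallow primitive alternatives live behind their own `parse_primitive` function, so by the time the deep recursive `parse_object` branch is taken, the primitive parsers' frames have already been popped rather than staying live on the `alt` parameter list:

```rust
#[derive(Debug, PartialEq)]
enum Value {
    Digit(u32),
    Object(Box<Value>),
}

type PResult<'a, T> = Result<(&'a str, T), &'static str>;

// Dead-end branch: shallow, never recurses.
fn parse_primitive(input: &str) -> PResult<'_, Value> {
    match input.chars().next() {
        Some(c) if c.is_ascii_digit() => {
            Ok((&input[1..], Value::Digit(c.to_digit(10).unwrap())))
        }
        _ => Err("not a primitive"),
    }
}

// Deep branch: recurses back into parse_value for the nested body.
fn parse_object(input: &str) -> PResult<'_, Value> {
    let input = input.strip_prefix('{').ok_or("expected '{'")?;
    let (input, inner) = parse_value(input)?;
    let input = input.strip_prefix('}').ok_or("expected '}'")?;
    Ok((input, Value::Object(Box::new(inner))))
}

// Two-way alternation: the cheap branch runs and returns (or fails and
// unwinds) before the deep branch is entered, so its state does not ride
// along on every level of recursion.
fn parse_value(input: &str) -> PResult<'_, Value> {
    parse_primitive(input).or_else(|_| parse_object(input))
}
```

With a single wide `alt`, every alternative's parser value stays on the stack for the duration of the recursive descent; factoring the shallow alternatives into one call shrinks what each recursion level carries.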

@svc-apollo-docs
Collaborator

svc-apollo-docs commented Jan 10, 2025

✅ Docs preview has no changes

The preview was not built because there were no changes.

Build ID: 83886935cc02f1d2286be4ff

@pubmodmatt pubmodmatt force-pushed the pubmodmatt/connectors/stackoverflow branch from fede284 to 7966fb2 Compare January 10, 2025 18:37
Base automatically changed from next to dev January 20, 2025 15:02
@pubmodmatt pubmodmatt merged commit 7229525 into dev Jan 23, 2025
15 checks passed
@pubmodmatt pubmodmatt deleted the pubmodmatt/connectors/stackoverflow branch January 23, 2025 19:58