Reduce stack usage when parsing JSON selection #6526

Merged
merged 2 commits into dev from pubmodmatt/connectors/stackoverflow on Jan 23, 2025

Conversation

pubmodmatt
Contributor

The `tuple` and `alt` combinators in nom can result in high stack usage because they take a number of parsers as parameters, all of which are on the stack at once. Breaking up `alt` calls into smaller chunks of parsers, and eliminating `tuple` in favor of calling parsers incrementally, greatly reduces the amount of data that must be on the stack at any one time.

This allows parsing a greater depth of nested JSON selection syntax within a given stack size (from a depth of 33 to 46 on my local machine for the test case I was using, roughly a 40% improvement).
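The `tuple` elimination described above can be illustrated with a minimal sketch. This is not the router's actual code and uses hypothetical names (`expect`, `parse_abc`); it mimics nom's `(remaining_input, value)` result shape with plain functions to show the before/after pattern:

```rust
// nom-style result: remaining input plus the parsed value, or an error.
type PResult<'a, T> = Result<(&'a str, T), &'static str>;

// Tiny stand-in for a nom parser: consume one expected character.
fn expect(input: &str, c: char) -> PResult<'_, char> {
    match input.chars().next() {
        Some(first) if first == c => Ok((&input[first.len_utf8()..], first)),
        _ => Err("unexpected character"),
    }
}

// Before (tuple-style), all sub-parsers are constructed up front and sit
// on the stack together for the duration of the call:
//
//     let (rest, (a, b, c)) = tuple((char('a'), char('b'), char('c')))(input)?;
//
// After (incremental), each step runs and releases its state before the
// next begins, so less data is live on the stack at once:
fn parse_abc(input: &str) -> PResult<'_, (char, char, char)> {
    let (input, a) = expect(input, 'a')?;
    let (input, b) = expect(input, 'b')?;
    let (input, c) = expect(input, 'c')?;
    Ok((input, (a, b, c)))
}
```

The incremental form also threads the remaining input explicitly through each step, which is what nom's `tuple` does internally anyway.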


Checklist

Complete the checklist (and note appropriate exceptions) before the PR is marked ready-for-review.

  • Changes are compatible1
  • Documentation2 completed
  • Performance impact assessed and acceptable
  • Tests added and passing3
    • Unit Tests
    • Integration Tests
    • Manual Tests

Exceptions

Note any exceptions here

Notes

Footnotes

  1. It may be appropriate to bring upcoming changes to the attention of other (impacted) groups. Please endeavour to do this before seeking PR approval. The mechanism for doing this will vary considerably, so use your judgement as to how and when to do this.

  2. Configuration is an important part of many changes. Where applicable please try to document configuration examples.

  3. Tick whichever testing boxes are applicable. If you are adding Manual Tests, please document the manual testing (extensively) in the Exceptions.

@pubmodmatt pubmodmatt self-assigned this Jan 9, 2025
@pubmodmatt pubmodmatt requested a review from a team as a code owner January 9, 2025 13:55
@router-perf

router-perf bot commented Jan 9, 2025

CI performance tests

  • connectors-const - Connectors stress test that runs with a constant number of users
  • const - Basic stress test that runs with a constant number of users
  • demand-control-instrumented - A copy of the step test, but with demand control monitoring and metrics enabled
  • demand-control-uninstrumented - A copy of the step test, but with demand control monitoring enabled
  • enhanced-signature - Enhanced signature enabled
  • events - Stress test for events with a lot of users and deduplication ENABLED
  • events_big_cap_high_rate - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity
  • events_big_cap_high_rate_callback - Stress test for events with a lot of users, deduplication enabled and high rate event with a big queue capacity using callback mode
  • events_callback - Stress test for events with a lot of users and deduplication ENABLED in callback mode
  • events_without_dedup - Stress test for events with a lot of users and deduplication DISABLED
  • events_without_dedup_callback - Stress test for events with a lot of users and deduplication DISABLED using callback mode
  • extended-reference-mode - Extended reference mode enabled
  • large-request - Stress test with a 1 MB request payload
  • no-tracing - Basic stress test, no tracing
  • reload - Reload test over a long period of time at a constant rate of users
  • step-jemalloc-tuning - Clone of the basic stress test for jemalloc tuning
  • step-local-metrics - Field stats that are generated from the router rather than FTV1
  • step-with-prometheus - A copy of the step test with the Prometheus metrics exporter enabled
  • step - Basic stress test that steps up the number of users over time
  • xlarge-request - Stress test with 10 MB request payload
  • xxlarge-request - Stress test with 100 MB request payload

Member

@dylan-apollo dylan-apollo left a comment

In doing stack analysis, I feel like we should identify where we want the bottleneck. Like, certainly we never want a stack overflow at runtime during ApplyTo execution, it'd be much better to fail at composition (parsing or validation).

After this change, is the parser still the bottleneck? Or is it possible to parse selections which then overflow the stack when executing? Maybe this requires setting up fuzzing of some sort...

@pubmodmatt
Contributor Author

> In doing stack analysis, I feel like we should identify where we want the bottleneck. Like, certainly we never want a stack overflow at runtime during ApplyTo execution, it'd be much better to fail at composition (parsing or validation).
>
> After this change, is the parser still the bottleneck? Or is it possible to parse selections which then overflow the stack when executing? Maybe this requires setting up fuzzing of some sort...

@dylan-apollo - the goal here is just to make the parser code more efficient in its use of the stack space. This is not attempting to solve the larger problem of limiting the input selection size to avoid stack overflow generally. In the cases I've tested, the parser is still the first point of failure, but that is far from conclusive.

@pubmodmatt pubmodmatt requested a review from a team January 10, 2025 02:52
Member

@dylan-apollo dylan-apollo left a comment

Looks okay to me, and actually is easier to read to me, too! But that's because I'm not comfortable with nom, so maybe wait for a second ✅ 😅

```rust
)),
let (input, _) = spaces_or_comments(input)?;
alt((
Self::parse_primitive,
```
Member

Is moving this just about code organization? I'd assume adding another function increases stack usage rather than decreasing it?

Contributor Author

This actually does help considerably. My (possibly incorrect) understanding is that this reduces the number of parsers on the stack at this level. It adds a level for primitives, but that path is a dead end in terms of stack depth - primitives are processed with a very shallow depth of parsers. In contrast, the parse_object path can be very deep, so when we go down that path, anything on the alt parameter list at this level is amplified because it stays on the stack as we go down that branch.
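The effect described here can be sketched with a toy grammar. This is a hedged, self-contained illustration (hypothetical names and a deliberately simplified grammar, not the router's parser): the shallow primitive alternatives live behind their own `parse_primitive` function, so by the time the deep recursive `parse_object` branch is taken, the primitive parsers' frames have already been popped rather than staying live on the `alt` parameter list:

```rust
#[derive(Debug, PartialEq)]
enum Value {
    Digit(u32),
    Object(Box<Value>),
}

type PResult<'a, T> = Result<(&'a str, T), &'static str>;

// Dead-end branch: shallow, never recurses.
fn parse_primitive(input: &str) -> PResult<'_, Value> {
    match input.chars().next() {
        Some(c) if c.is_ascii_digit() => {
            Ok((&input[1..], Value::Digit(c.to_digit(10).unwrap())))
        }
        _ => Err("not a primitive"),
    }
}

// Deep branch: recurses back into parse_value for the nested body.
fn parse_object(input: &str) -> PResult<'_, Value> {
    let input = input.strip_prefix('{').ok_or("expected '{'")?;
    let (input, inner) = parse_value(input)?;
    let input = input.strip_prefix('}').ok_or("expected '}'")?;
    Ok((input, Value::Object(Box::new(inner))))
}

// Two-way alternation: the cheap branch runs and returns (or fails and
// unwinds) before the deep branch is entered, so its state does not ride
// along on every level of recursion.
fn parse_value(input: &str) -> PResult<'_, Value> {
    parse_primitive(input).or_else(|_| parse_object(input))
}
```

With a single wide `alt`, every alternative's parser value stays on the stack for the duration of the recursive descent; factoring the shallow alternatives into one call shrinks what each recursion level carries.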

@svc-apollo-docs
Collaborator

svc-apollo-docs commented Jan 10, 2025

✅ Docs preview has no changes

The preview was not built because there were no changes.

Build ID: 83886935cc02f1d2286be4ff

@pubmodmatt pubmodmatt force-pushed the pubmodmatt/connectors/stackoverflow branch from fede284 to 7966fb2 Compare January 10, 2025 18:37
Base automatically changed from next to dev January 20, 2025 15:02
@pubmodmatt pubmodmatt merged commit 7229525 into dev Jan 23, 2025
15 checks passed
@pubmodmatt pubmodmatt deleted the pubmodmatt/connectors/stackoverflow branch January 23, 2025 19:58