Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix infinite loop due to integer overflow when reading large strings #1350

Open
wants to merge 3 commits into
base: 2.19
Choose a base branch
from

Conversation

adamjshook
Copy link

If the StreamReadConstraints are set to Integer.MAX_VALUE, reading a large string results in an infinite loop due to the value of max overflowing to a negative value. while (ptr < max) is always false so break ascii_loop is never reached.

This change uses Math.addExact to throw an ArthimeticException if an overflow were to occur and sets max to max integer, allowing the loop to break and large strings to be properly deserialized.

This can be reproduced like so, where the call to nextTextValue will trigger the infinite loop. After applying the patch, the code completes. I can add the test case somewhere (or put it in a new test class) however it requires a large JVM to run and I'm not sure the size of the JVMs used for testing.

    @Test
    void testLargeStringDeserialization() throws Exception {
        String largeString = RandomString.make(Integer.MAX_VALUE - 1024);
        JsonFactory f = JsonFactory.builder()
                .streamReadConstraints(StreamReadConstraints.builder()
                        .maxStringLength(Integer.MAX_VALUE)
                        .build())
                .build();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        JsonGenerator g = f.createGenerator(bytes);
        g.writeString(largeString);
        g.close();

        JsonParser parser = f.createParser(bytes.toByteArray());
        String parsedString = parser.nextTextValue();

        assertEquals(largeString, parsedString);
    }

If the StreamReadConstraints are set to Integer.MAX_VALUE, reading a string results in an infinite loop due to the value of max overflowing to a negative value. while (ptr < max) is always false so break ascii_loop is never reached.

This change uses Math.addExact to throw an ArthimeticException if an overflow were to occur and sets max to max integer, allowing the loop to break and large strings to be properly deserialized.
@pjfanning
Copy link
Member

Fair enough. Can't you add the test in the PR?

try {
max = Math.min(_inputEnd, Math.addExact(ptr, outBuf.length - outPtr));
} catch (ArithmeticException ex) {
max = Integer.MAX_VALUE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way that this could be set to something like the StreamReadConstraints max number size? For people who might set a bigger max number size but that don't go for the absolute max.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eg would _inputEnd work?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, either _inputEnd or _streamReadConstraints.getMaxStringLength() will fix the issue. Do you have a preference?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

min of both?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, looking at the code Integer.MAX_VALUE looks like a good fallback

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I think Integer.MAX_VALUE would be fine since we're just trying to prevent an overflow here. max would generally be larger than the max string length or _inputEnd. Verification that we don't go beyond the max string length happens when data is read.

@adamjshook
Copy link
Author

Fair enough. Can't you add the test in the PR?

Added the test case but I see the jobs are failing with Java heap space errors. The JVMs would need to be scaled up for the test and I'm not sure the appetite on that one for this change. I needed to set the JVM args to 16G to get it to pass.

@pjfanning
Copy link
Member

@cowtowncoder is there some kind of annotation or pattern that we could use to mark a test as too big to run normally?

@pjfanning
Copy link
Member

pjfanning commented Oct 21, 2024

@adamjshook The same issue appears to occur in 2 places in NonBlockingUtf8JsonParserBase - https://github.com/search?q=repo%3AFasterXML%2Fjackson-core%20%22ptr%20%2B%20(outBuf.length%20-%20outPtr)%22&type=code

And one more:

int max2 = _inputPtr + (outBuf.length - outPtr);

@adamjshook
Copy link
Author

@adamjshook The same issue appears to occur in 2 places in NonBlockingUtf8JsonParserBase - https://github.com/search?q=repo%3AFasterXML%2Fjackson-core%20%22ptr%20%2B%20(outBuf.length%20-%20outPtr)%22&type=code

And one more:

int max2 = _inputPtr + (outBuf.length - outPtr);

Sure, can update in these other places as well.

@cowtowncoder
Copy link
Member

cowtowncoder commented Oct 21, 2024

@cowtowncoder is there some kind of annotation or pattern that we could use to mark a test as too big to run normally?

I don't think there is a specific annotation.

But I think there is already one test that is skipped by default, for running long; but can then be run from IDE manually. I'll see if I can find it.

EDIT: it's in src/test/java/com/fasterxml/jackson/core/io/NumberOutputTest.java but basic @Disabled, so locally need to comment out

@cowtowncoder
Copy link
Member

Ok the basic idea is sound; even if we have no way to automate test runs. I'm ok with that.

Couple of notes...

First, I think this would be good to get in 2.18(.1); while unlikely to occur, infinite loops are nasty. So I think PR should be against 2.18.

Second: I'd prefer not using exception or Math.addExact() -- can simply check that result is positive number (in this case it cannot roll over to 0).
But I think it could/should be extracted as a helper method in a base class (I can help find the space).

@cowtowncoder cowtowncoder added the 2.18 Issues planned at earliest for 2.18 label Oct 22, 2024
@adamjshook
Copy link
Author

Ok the basic idea is sound; even if we have no way to automate test runs. I'm ok with that.

I can add @Disabled to the existing test so at least it is available should someone choose to run it.

First, I think this would be good to get in 2.18(.1); while unlikely to occur, infinite loops are nasty. So I think PR should be against 2.18.

Agreed, I can rebase on 2.18 and change the base of this PR to 2.18 as well.

Second: I'd prefer not using exception or Math.addExact() -- can simply check that result is positive number (in this case it cannot roll over to 0). But I think it could/should be extracted as a helper method in a base class (I can help find the space).

Sure, can check after the fact if it is negative and set it to Integer.MAX_VALUE if so. Let me know what base class you had in mind and I can add a function there and update the locations pointer out here to use it. I'll get it updated tomorrow morning and push.

As an aside, I'll be unavailable Oct 24th to Oct 29th. I can continue addressing any additional comments when I return on the 30th, but feel free to finish it up if there is some urgency to do so.

@cowtowncoder
Copy link
Member

Thank you @adamjshook ! I'll see if I can play with this PR wrt base class/method to use (or if shared method even makes sense).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.18 Issues planned at earliest for 2.18
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants