Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IO#gets should have same result regardless of #peek availability #13882

Conversation

compumike
Copy link
Contributor

At Heii On-Call we ran into a problem with Crystal 1.10.0. We found that when parsing HTTP responses, HTTP.parse_headers_and_body was behaving differently depending on whether the IO had #peek or not. We run a lot of HEAD requests for monitoring customer APIs/websites, so the IO contents often looks something like: "Header: Value\r\n\r\n" where there's an empty line at the end, not followed by any body contents. That was working fine until Crystal 1.10.0 😮

It turns out thatHTTP.parse_headers_and_body totally breaks if io.gets(chomp: true) returns a nil before it returns an empty "", see https://github.com/crystal-lang/crystal/blob/1.10.0/src/http/common.cr#L121 . (There is no case in parse_headers_and_body for handling the nil from read_header_line.)

We then found that IO#gets was actually behaving differently depending on whether the IO had a #peek or not! ❗

We expect that an IO wrapping an empty string "" should have #gets returning nil, while an IO wrapping a newline "\n" or a CRLF newline "\r\n" should have #gets returning an empty string "".

In 1.10.0, an IO without #peek that wraps just a newline will incorrectly have gets(chomp: true) returning nil, when it should return "". This PR fixes that behavior, and adds tests.

@compumike
Copy link
Contributor Author

@asterite would be interested in your eyes on this, where https://github.com/crystal-lang/crystal/blob/1.10.0/src/io.cr#L725 especially came from #11242 and is the only other call site for gets_slow

@Blacksmoke16 Blacksmoke16 added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib kind:regression Something that used to correctly work but no longer works labels Oct 11, 2023
@straight-shoota straight-shoota added this to the 1.10.1 milestone Oct 11, 2023
src/io.cr Outdated
@@ -776,6 +778,7 @@ abstract class IO
end

char2, char_bytesize2 = info2
bytes_read += char_bytesize2
Copy link
Contributor

@HertzDevil HertzDevil Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the exact number of bytes read needs to be computed, other than that it is non-zero. So bytes_read can be represented by a Bool instead and this line can be dropped; bytes_read += char_bytesize above already implies at least 1 byte is consumed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.
And actually I think it can be achieved even easier by replacing break in line 766 with return false. That's the only place where the method can return without having read anything. The last line would be true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HertzDevil I've just pushed a second commit 17752d1 which tracks bytes_read as a Bool and drops the second accumulation bytes_read += char_bytesize2 line as you've suggested.

@straight-shoota If I were to change line 766 from break to return false, I think this might incorrectly return false in any case where we've already read some bytes in previous iterations of the while loop and then hit EOF?

@beta-ziliani
Copy link
Member

Thanks so much for this @compumike ! While at it, I noted that Heii On Call is not listed in https://crystal-lang.org/used_in_prod/ Do you mind adding it? We love to hear how people use Crystal!

src/io.cr Outdated Show resolved Hide resolved
@@ -801,6 +803,8 @@ abstract class IO
break if char_bytesize >= limit
limit -= char_bytesize
end

bytes_read
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing break on line 766 to return false would allow to get rid of the variable altogether.

Suggested change
bytes_read
true

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Join same discussion here #13882 (comment) (I'm not sure this would be correct)

@compumike
Copy link
Contributor Author

Thanks so much for this @compumike ! While at it, I noted that Heii On Call is not listed in https://crystal-lang.org/used_in_prod/ Do you mind adding it? We love to hear how people use Crystal!

If find src/ spec/ | grep \.cr$ | xargs cat | grep -v '^$' | grep -E -v "^\s+#" | wc -l is a reasonable measure, it looks like we have a little over 6K lines of Crystal code in production! Our status page https://heiioncall.com/status gives a good description of how we actually use Crystal as part of the architecture if you're curious.

I just made an internal request to have someone write a concise description and submit it to the list, hopefully later this week. Thanks ❤️

@straight-shoota straight-shoota merged commit 4867d81 into crystal-lang:master Oct 12, 2023
54 checks passed
Blacksmoke16 pushed a commit to Blacksmoke16/crystal that referenced this pull request Dec 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:bug A bug in the code. Does not apply to documentation, specs, etc. kind:regression Something that used to correctly work but no longer works topic:stdlib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants