Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent Request headers canonicalization #607

Merged
merged 10 commits into from
Dec 31, 2024

Conversation

ErikPelli
Copy link
Collaborator

Fixes #559.
Thanks to a suggestion made by @alessiodallapiazza, I found this thread where they talk about the request canonicalization problem: golang/go#37834.
Basically, if the request specifies an header value, the Go http parser can rewrite its name to make it compliant with the RFC standard, but this behavior will break some requests made to websites that expects a different case (e.g. everything lowercase) for the name.

In this PR I tried to implement a (not too ugly) solution for this problem.
I still rely on the standard Go implementation, which handles all the edge cases, but I duplicate the data stream to parse manually only a list of the header names by ourselves.
To extract headers I used a simplified version of https://github.com/evanphx/wildcat, modified by me.
Then, for each header, I manually replace the name saved inside the Headers map only if it's different than the canonicalized one.

I added some additional tests for everything.
Does someone want to review this? Is there some missing edge case?

@alessiodallapiazza
Copy link
Contributor

Hi Erik,

I reviewed the PR. I’m lucky to have control over the client, so for certain scenarios, I can pass metadata directly and didn’t have to implement it the way you did.

That said, the PR looks good overall. It does consume more RAM since you're cloning data. For the tests, you might consider adding some new elements, for instance, from here:
https://github.com/golang/go/blob/master/src/net/http/readrequest_test.go

Additionally, instead of parsing as you’ve done, why not leverage customize the original package that also have more integrity checks?

https://github.com/golang/go/blob/2b794ed86cb1b718bc212ee90fecbb8f3b28a744/src/net/http/request.go#L1080

This way, you’d avoid cloning everything and could simply rely on Go's implementation from the source. If some functions is private I think you can expose it instead of copy/paste by abusing linkname like already done here:

https://github.com/SagerNet/sing/blob/aa7d2543a3ce2b3a639c6bf8cb8f241159f4c487/protocol/http/link.go#L9

Let me know what you think!

@ErikPelli
Copy link
Collaborator Author

Thanks for the review, I really appreciate that.
You have exposed some valid suggestions that I will fix & evaluate.

I'm a bit skeptical about the abuse of linkname, for a library like that which is used by different versions of the Go compiler, which can break itself if the standard library changes the signature.
I will evaluate this one specifically, deeply. Http/1 parsing is straightforward, I'll evaluate if keep the current custom solution or if it's better to rely on the Go textparser.

@ErikPelli ErikPelli force-pushed the prevent-request-canonicalization branch from a53387b to ed3a6e1 Compare December 28, 2024 20:47
@ErikPelli ErikPelli force-pushed the prevent-request-canonicalization branch 2 times, most recently from 9b495a8 to 3df826e Compare December 28, 2024 22:02
@ErikPelli
Copy link
Collaborator Author

ErikPelli commented Dec 28, 2024

Updates:

  • I added more tests to make sure to properly handle the HTTP data correctly
  • I made sure to avoid the data duplication when the PreventCanonicalization option is set to false, to avoid increased memory usage when it's not needed
  • Replaced the custom HTTP parser with the Go textproto implementation (no need to use link name since ~30 lines of code are enough to integrate it in our header parsing usage)

If someone wants to add anything else to the review, I'm happy to evaluate new changes. However, next week I will merge this Pull Request.

@ErikPelli ErikPelli force-pushed the prevent-request-canonicalization branch from 3df826e to f18e376 Compare December 28, 2024 22:16
@alessiodallapiazza
Copy link
Contributor

LGTM!

@ErikPelli ErikPelli merged commit 3d6017a into master Dec 31, 2024
1 check passed
@ErikPelli ErikPelli deleted the prevent-request-canonicalization branch January 2, 2025 18:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Prevent header canonicalization in request
2 participants