Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Honor parameters order when parsing query and form parameters #7599

Merged
merged 3 commits into from
Feb 16, 2022

Conversation

jebeaudet
Copy link
Contributor

When parsing the query or form parameters in Request, the values are stored in a MultiMap. This class extends HashMap which does not preserve the order of insertion so a request with parameters first=1&second=2 might end up in a map where second will come first when iterating on the entry set.

The order is necessary in some case where the request is signed off the body and/or the query parameters. When the order is not preserved, it is impossible to reconstruct the original request sent to form a digest, unless using the Request::getInputStream which consumes the stream and makes subsequent calls to Request::getParameters to don't return the form parameters which can be misleading. The same behavior applied to query parameters, by using Request::getQueryString, you get the correct order but Request::getParameters will not.

Moreoever, if the application is behind a reverse proxy using Jetty that is proxying using Request::getParameters which consume the request InputStream, it will be completely impossible to reconstruct the original request.

I believe changing MultiMap to extends LinkedHashMap should have no performance impact and a minimal additional memory footprint. I'm of course open to alternatives if you have any.

I've based off this PR against the 9.4 branch which is still widely used by Spring Boot notably. I believe it can also be easily merged in other branch since changes are quite trivial.

Thanks!

When parsing the query or form parameters in Request, the values are stored in a MultiMap. This class extends HashMap which does not preserve the order of insertion so a request with parameters "first=1&second=2" might end up in a map where "second" will come first when iterating on the entry set. 

The order is necessary in some case where the request is signed off the body and/or the query parameters. When the order is not preserved, it is impossible to reconstruct the original request sent, unless using the Request::getInputStream which consumes the stream and makes subsequent calls to Request::getParameters to don't return the form parameters which can be misleading. The same behavior applied to query parameters, by using Request::getQueryString, you get the correct order but Request::getParameters will not.

Moreoever, if the application is behind a reverse proxy using Jetty that is proxying using Request::getParameters which consume the request InputStream, it will be completely impossible to reconstruct the original request.
@joakime
Copy link
Contributor

joakime commented Feb 15, 2022

Digest is not handled at this level.
There's no safe way to build a digest from the Servlet API. (case differences in headers, header caching, multi value header field value spread across multiple headers in the request/response, query parameter decoding differences, form value decoding differences, etc)
Digest has to be done during parsing/generating to be sane.

There is no query parameter order with the Servlet spec (which is what Jetty 9.x/10.x/11.x are built against, we are changing this with Jetty 12, the core server will be disconnected from the servlet spec and behaviors).
The parameters are not what you think, and even the simple use of RequestDispatcher.include and RequestDispatcher.forward can change the appearance of the query parameters per Servlet spec (eg: mandatory query parameter merging)

@lorban what do you think about this change from a performance point of view? can you test this branch against our performance baseline?

@jebeaudet
Copy link
Contributor Author

jebeaudet commented Feb 15, 2022

Thanks for your input.

To give you some additional context, we have a reverse proxy using Jetty that uses, for hard to change historical reasons, the Request::getParameterMap to build the upstream proxied request for application/x-www-form-urlencoded requests. These requests ends up on another Jetty server that uses the Slack SDK to validate the signature using this class and this class to fetch the body as a string directly from the input stream to build the digest.

I understand that the spec doesn't specify ordering and that digest is hard/impossible to do so based solely on the Servlet spec. However, the "happy path" of a request that is not forwarded and has no special characters that can be decoded to + or %20 works by using this PR, we have been running it in our dev environment for a few days. IMHO, keeping the parameters order is a natural expectation of the returned argument, even if not specified by the spec.

For what it's worth (absolutely not trying to start a finger pointing fight), Tomcat seems to preserve ordering 1 2.

@joakime
Copy link
Contributor

joakime commented Feb 16, 2022

Tomcat seems to preserve ordering 1 2.

Tomcat still has the same servlet spec issues with regards to RequestDispatcher and parameter merging, that order is transient in the extreme in the tomcat case, don't trust it.

Also, with your change, what happens if the query is ?a=1&b=2&c=3&a=4 ?
Both tomcat and jetty will merge a as described in servlet spec, returning 2 values for a lookup of a.
(Incidentally, this isn't limited to the query string, a urlencoded form has the same behavior)

If you have a combination of query and form params, merging occurs again on the parameters map.
Also, if the form is multipart/form-data the param names/keys are not allowed to clash within that content form, but allowed to clash across the query params too. This is not the same behavior when the content is application/x-www-form-urlencoded where the params are allowed to clash even within that same form.

@joakime
Copy link
Contributor

joakime commented Feb 16, 2022

What would your expected param order be for the following request?

POST /foo?a=1&b=2&c=3&a=4
Host: bar.machine.com
Connection: close
Content-Type: application/x-www-form-urlencoded
Content-Length: 12

c=5&b=6&a=7

String responses = _connector.getResponse(request);
Map<String, String[]> returnedMap = reference.get();
assertTrue(responses.startsWith("HTTP/1.1 200"));
assertThat(new ArrayList<>(returnedMap.keySet()), is(Arrays.asList("a", "b", "c")));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't make sense.
The params are filled in by logic in Request.getParameters(), which does the form content first, then the query params. Which means this this should be "c", "b", "a" as the form sets the order, not the query.

Copy link
Contributor Author

@jebeaudet jebeaudet Feb 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are parsed before but when getting the map, they are injected in the order defined by the servlet spec.

https://github.com/eclipse/jetty.project/blob/a8e0689a08c66342b151a9e9de7a4fc1f2659d22/jetty-server/src/main/java/org/eclipse/jetty/server/Request.java#L398-L400

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That block is concerned with the value order when merging between query/content (what the Servlet Section 3.1 is about), the key order is different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Data from the query string and the post body are aggregated into the request parameter set. Query string data is presented before post body data. For example, if a request is made with a query string of a=hello and a post body of a=goodbye&a=world, the resulting parameter set would be ordered a=(hello, goodbye, world).

I wish the spec would give another example with different keys than a... You're right that it could be read as the order for the data (data as in the values, not the keys) but I personally read it as data == key+value. So I'd expect the key+values to be ordered with query parameters first, then form parameters.

@jebeaudet
Copy link
Contributor Author

jebeaudet commented Feb 16, 2022

Also, with your change, what happens if the query is ?a=1&b=2&c=3&a=4 ? Both tomcat and jetty will merge a as described in servlet spec, returning 2 values for a lookup of a.

It will end up like {a=[1, 4],b=[2],c=[3]} which I would say is the most intuitive way I'd expect the map to return.

As for combinations, this is defined in the spec AFAICT in section 3.1 :

Data from the query string and the post body are aggregated into the request parameter set. Query string data is presented before post body data

So for

What would your expected param order be for the following request?
I'd expect {a=[1, 4, 7],b=[2, 6],c=[3, 5]}.

I've added an additional test to show this behavior if you want to look at it.

I agree that this PR is not perfect but I believe that for simple requests that don't have duplicate parameters, the returned variable is better when it's ordered.

@joakime
Copy link
Contributor

joakime commented Feb 16, 2022

We missed a few ECA requirements on this PR.
Can you address those? (even if it means a rebase / force-push to a single commit)

@jebeaudet
Copy link
Contributor Author

We missed a few ECA requirements on this PR. Can you address those? (even if it means a rebase / force-push to a single commit)

Should be fixed, sorry I commited the last test with my work email.

@joakime
Copy link
Contributor

joakime commented Feb 16, 2022

We missed a few ECA requirements on this PR. Can you address those? (even if it means a rebase / force-push to a single commit)

Should be fixed, sorry I commited the last test with my work email.

No problem, it happens. Thanks for the quick fix.

@jebeaudet
Copy link
Contributor Author

Seems like I'm not the only one to worry about parameters ordering : jakartaee/servlet#393

Here is the bug when Tomcat changed from HashMap to LinkedHashMap (still trying to find the commit)

@lorban
Copy link
Contributor

lorban commented Feb 16, 2022

@joakime from a performance point of view, I tested this change against our baseline and it makes no measurable difference.

@joakime
Copy link
Contributor

joakime commented Feb 16, 2022

Seems like I'm not the only one to worry about parameters ordering : eclipse-ee4j/servlet-api#393

Here is the bug when Tomcat changed from HashMap to LinkedHashMap (still trying to find the commit)

Yep, I was down that rabbit hole already. :-)

Here's the commit btw, apache/tomcat@90556a9

@jebeaudet
Copy link
Contributor Author

Seems like I'm not the only one to worry about parameters ordering : eclipse-ee4j/servlet-api#393
Here is the bug when Tomcat changed from HashMap to LinkedHashMap (still trying to find the commit)

Yep, I was down that rabbit hole already. :-)

Here's the commit btw, apache/tomcat@90556a9

Hahaha good find, that commit looks a lot like this PR, minus the tests 😄

@joakime joakime self-assigned this Feb 16, 2022
@joakime joakime added this to the 9.4.x milestone Feb 16, 2022
@joakime joakime merged commit 68bc443 into jetty:jetty-9.4.x Feb 16, 2022
joakime pushed a commit that referenced this pull request Feb 16, 2022
* Honor parameters order when parsing query and form parameters

When parsing the query or form parameters in Request, the values are stored in a MultiMap. This class extends HashMap which does not preserve the order of insertion so a request with parameters "first=1&second=2" might end up in a map where "second" will come first when iterating on the entry set. 

The order is necessary in some case where the request is signed off the body and/or the query parameters. When the order is not preserved, it is impossible to reconstruct the original request sent, unless using the Request::getInputStream which consumes the stream and makes subsequent calls to Request::getParameters to don't return the form parameters which can be misleading. The same behavior applied to query parameters, by using Request::getQueryString, you get the correct order but Request::getParameters will not.

Moreoever, if the application is behind a reverse proxy using Jetty that is proxying using Request::getParameters which consume the request InputStream, it will be completely impossible to reconstruct the original request.

* Added a test with parameter merging
@jebeaudet jebeaudet deleted the fix/parameter-order branch February 16, 2022 15:31
@joakime
Copy link
Contributor

joakime commented Feb 16, 2022

Merged into jetty-9.4.x.
Forward port/cherry-pick into jetty-10.0.x available as PR #7605

joakime added a commit that referenced this pull request Feb 16, 2022
…#7605)

* Honor parameters order when parsing query and form parameters

When parsing the query or form parameters in Request, the values are stored in a MultiMap. This class extends HashMap which does not preserve the order of insertion so a request with parameters "first=1&second=2" might end up in a map where "second" will come first when iterating on the entry set. 

The order is necessary in some case where the request is signed off the body and/or the query parameters. When the order is not preserved, it is impossible to reconstruct the original request sent, unless using the Request::getInputStream which consumes the stream and makes subsequent calls to Request::getParameters to don't return the form parameters which can be misleading. The same behavior applied to query parameters, by using Request::getQueryString, you get the correct order but Request::getParameters will not.

Moreoever, if the application is behind a reverse proxy using Jetty that is proxying using Request::getParameters which consume the request InputStream, it will be completely impossible to reconstruct the original request.

* Added a test with parameter merging

Co-authored-by: Jacques-Etienne Beaudet <[email protected]>
joakime added a commit that referenced this pull request Feb 16, 2022
…#7605)

* Honor parameters order when parsing query and form parameters

When parsing the query or form parameters in Request, the values are stored in a MultiMap. This class extends HashMap which does not preserve the order of insertion so a request with parameters "first=1&second=2" might end up in a map where "second" will come first when iterating on the entry set.

The order is necessary in some case where the request is signed off the body and/or the query parameters. When the order is not preserved, it is impossible to reconstruct the original request sent, unless using the Request::getInputStream which consumes the stream and makes subsequent calls to Request::getParameters to don't return the form parameters which can be misleading. The same behavior applied to query parameters, by using Request::getQueryString, you get the correct order but Request::getParameters will not.

Moreoever, if the application is behind a reverse proxy using Jetty that is proxying using Request::getParameters which consume the request InputStream, it will be completely impossible to reconstruct the original request.

* Added a test with parameter merging

Co-authored-by: Jacques-Etienne Beaudet <[email protected]>
sbordet pushed a commit that referenced this pull request Feb 17, 2022
…#7605)

* Honor parameters order when parsing query and form parameters

When parsing the query or form parameters in Request, the values are stored in a MultiMap. This class extends HashMap which does not preserve the order of insertion so a request with parameters "first=1&second=2" might end up in a map where "second" will come first when iterating on the entry set.

The order is necessary in some case where the request is signed off the body and/or the query parameters. When the order is not preserved, it is impossible to reconstruct the original request sent, unless using the Request::getInputStream which consumes the stream and makes subsequent calls to Request::getParameters to don't return the form parameters which can be misleading. The same behavior applied to query parameters, by using Request::getQueryString, you get the correct order but Request::getParameters will not.

Moreoever, if the application is behind a reverse proxy using Jetty that is proxying using Request::getParameters which consume the request InputStream, it will be completely impossible to reconstruct the original request.

* Added a test with parameter merging

Co-authored-by: Jacques-Etienne Beaudet <[email protected]>
magnolia-k added a commit to scala-steward-bot/scalatra that referenced this pull request Apr 2, 2022
A change was made in Jetty 9.4.46.v20220331 to guarantee the order
of request parameter insertion, but the change is jetty-specific,
so the test is not dependent on that behavior.

jetty/jetty.project#7599
magnolia-k added a commit to scalatra/scalatra that referenced this pull request Apr 2, 2022
* Update jetty-servlet, jetty-webapp to 9.4.46.v20220331

* Removed tests that depended on the order of params

A change was made in Jetty 9.4.46.v20220331 to guarantee the order
of request parameter insertion, but the change is jetty-specific,
so the test is not dependent on that behavior.

jetty/jetty.project#7599

Co-authored-by: Magnolia.K <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants