-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Using Jetty with Spring Webclient causes thread leak #22
Comments
Can you take a thread dump to see what are those threads doing? |
Sure. You can find the thread dump here |
@sbordet Could you get any insight out of it? |
@thomas-br sorry for the late reply, I was in vacation. |
From your logs, there are 439 different instances of That is not good, as there should be only 1. I suspect there is something really wrong in the way you setup your application. |
@sbordet On the one handside same procedure does not cause such a thread behavior with Netty HttpClient (see graph below). On the other, if you say ideally a singleton instance of the |
When creating the |
Can you detail this? Typically when using a proxy, you want to say that requests to a particular origin should be proxied - see for example how they are configured in a browser. Do you have a case where, for the same origin, you want to use a proxy for one request, but not for another? |
Yes sure. So basically the use case would be that the service itself would not know any request information on application startup, but everything is basically dynamic configuration per "job". So it get's at runtime the task to execute the following request with the passed configuration. We cannot suppose any similarity on, for example, same hosts. Imagine the first request to example.com is configured to don't use any proxy, second requests is configured to use a certain proxy and third one to the same host / url can even have a complete different proxy setting in place. Same applies for the expected timeout behavior, which is also another problem with a single Still for testing purposes, I have tried a singleton instance of
Thread / Memory behavior seems to be okay, but from a certain point I receive the following error:
Which is within the load period acceptable. But after stopping the load, all WebClient requests are at latest done after 1s. Still this message stays for more than 30min, thus making the instance unusable in such a scenario. It seems although the WebClient request is cancelled, internally something seems to be queued still. Why is this the case? Why doesn't the triggered timeout by Spring WebClient on the |
It's the first time that I encounter the need of different proxy configurations for different requests for the same origin server (e.g. browsers do not support this). Jetty's The workaround is to use different About the load test you describe in the above comment, I am not sure I understand "Still this message stays for more than 30min". |
Okay. Accepting that the proxy configuration is bound per HttpClient level. For the other issue:
It seems to be a deadlock. I am not talking about any retry behavior. But more on the mechanism when the My observation is that after heavy load of the same requests performed to the same host, which are all terminated by the |
Any chance that you can write a simple reproducer? |
Sure, will probably find some time next week :) |
Hi @sbordet Simple reproducing project can be found here Simply insert in the Fire a lot of requests, for example with jmeter. I used 10 threads in parallel. |
Added test case - there appear to be no issue though. Signed-off-by: Simone Bordet <[email protected]>
@thomas-br I simplified your reproducer and I obtain this with
When I fix the All seems good to me. |
@sbordet I think you did not get my point. Let me try to clarify it again:
The issue is just that when a lot of long requests to the same host are being cancelled with
And apparently due to the fact that all requests are cancelled by So in a nutshell my observation is that an attacker could enforce a lot of timeouts to for example |
@thomas-br so I can confirm that it's Spring that does not call Because Spring wraps and hides the Jetty APIs, application code does not have, IIUC, any API to manually call I'm leaning towards this being a Spring issue rather than an issue for this project - they should either call |
@sbordet I would expect the Jetty reactive client to handle Reactive Streams signals. In this case case the By comparison Spring also wraps use of Reactor Netty as an alternative which does handle cancellations and we don't have to intercept the cancellation signal ourselves. Likewise any other Subscriber (not related to Spring or Reactor) should be able to cancel the subscription and that should abort the request. |
@rstoyanchev the problem with That is why I think it's probably best for the Reactor implementation to actually abort the request with the actual exception that is known to Reactor, also because Reactor completely wraps the Jetty request, so it has access to both the Jetty request and the failure. I can react to |
@sbordet I don't see why it matters whether it is timeout or not. A cancellation means the
The request is wrapped by Spring's JettyClientHttpConnector in order to plug it into Spring's
Such an exception cannot be passed to the
This is in Reactor Netty which is a project on top of Netty that adds Reactive Streams semantics and back-pressure. It is comparable to how |
@rstoyanchev wrote:
But it's not a cancellation. It's a timeout. Plus, a single HTTP request is not really a "stream", so when you timeout an HTTP request/response that has already been passed to the subscriber (in the form of a response element), and the subscriber calls And even if you stretch the RS semantic to "cancel" the current element, the publisher may need to know the failure type because it may need to dispose resources in one case of failure, but not in others. Having the implementation correctly abort HTTP requests, rather than relying on the limited semantic of RS, would be best IMHO. Having said that, I can abort with a generic exception in |
Apologies for the slow reply. It's true "cancel" can lack certainty, i.e. does the downstream want to abort or is it simply not interested in consuming the rest? In my experience the distinction rarely matters. The only exception I've encountered so far is a database transaction but even there it's more intuitive to interpret a cancellation as aborting the the transaction. In the context of an HTTP response I can see that "cancel" can be interpreted to mean close the connection immediately (disposing of resources) or consume and release the rest of the response without passing it downstream. Is that the concern here, or if not what is the choice of what could be done? What I don't think is okay is to ignore a cancel signal altogether and do nothing, so the question to me is what is the appropriate default action to take? Another thing to mention, in this example it's only the Reactor
I realize this part is not obvious if you only think of a WebClient.builder().clientConnector(new JettyClientHttpConnector(this.httpClient)).build()
.get().uri(new URI(URI))
.retrieve()
.toEntity(String.class)
.timeout(Duration.ofMillis(1000)) // 1sec to "fail" fast Effectively the response publisher (which is wrapped in a
I think by "implementation" you mean Spring of Reactor but as I mentioned neither Reactor has knowledge of what the application does upstream, nor does any code upstream have any knowledge of the reason for the cancellation.
I think it would be better to do something vs to completely ignore it. |
@rstoyanchev my turn to apologize for the delay. I have implemented the cancellation, but now I am facing another problem: normal requests always fail due to cancellation. I seem to be hitting The test that fails is: https://github.com/jetty-project/jetty-reactive-httpclient/blob/1.1.x/src/test/java/org/eclipse/jetty/reactive/client/ReactorTest.java#L42 Help? |
I'll have a look. |
@sbordet, both tests in |
@rstoyanchev I did not commit your suggested change to |
Okay thanks, I see the problem on our side. We're using I have confirmed locally that after the fix all tests in |
@rstoyanchev not to beat a dead horse, but this is actually the example where an implementation of However, with the "fix" the producer holds on to the item it just passed to And the solution of "oh, I know this producer will only produce one item [how do you know?], so let's make a special case, and this time only not cancel the subscription". All of this seems extremely fragile IMHO and I'd like to reiterate that aborting the request upon a timeout generated and controlled by external subscriber should be done by the external subscriber, not by the implementation as a reaction to a generic call to |
I know, this doesn't seem very re-assuring :) ... but allow me to elaborate.
ReactiveRequest reactiveRequest = ReactiveRequest.newBuilder(request).build();
Publisher<JettyClientHttpResponse> publisher = reactiveRequest..response((response, chunks) -> {
Flux<DataBuffer> content = Flux.from(chunks).map(this::toDataBuffer);
return Mono.just(new JettyClientHttpResponse(response, content));
}); So we know only 1 item will come but since the Jetty API returns
Not sure what you mean. If you're referring to Reactive Streams does not have any specialization for a Publisher of 0..1, see reactive-streams/reactive-streams-jvm#490. It leaves it as a separate exercise for reactive libraries and this is why virtually all reactive libraries that offer declarative composition have a special type for a Publisher of 1 or 0..1 vs many because that distinction is so important for expressing APIs. If there was such a type in Reactive Streams I imagine the Jetty Reactive client would have two response methods with a callback, one for a Publisher of 1 and another for a Publisher of many. Feel free to conclude this as you prefer. If you decide to leave it as is, we'll certainly address it on the Spring side for spring applications at least but it seems a fix could be available more broadly. In the very least the Javadoc should address what is done or is not done on cancel. I still believe it is better to provide default handling for cancel than to do ignore it. In the former case you can be surprised by a cancel that you didn't expect which resulted in the connection being closed, which can then be corrected (like we are going to with spring-projects/spring-framework#25849). In the latter case it leads to the leak reported here. |
Greetings, There is no thread leak in Jetty reactive client or Spring's WebClient when used properly. We've been using I would also like to state that the error For instance, if you are using non blocking IO, why do you even care about early timeout? More so if you want to protect the callee (the server being called) from traffic spikes. If anything, having early timeout increases the likelihood of your application hitting the callee/server with even more requests, because you reply to your caller that much faster with a negative response. The whole point of non blocking IO is to consume less resources in your application, save for file descriptors count, so that you can soak in much much bigger concurrent traffic/load without hassle. You can wait a little bit more for sure. If you are dealing with a slow callee/server, you can use Also note, that Jetty tries to be a good citizen. The examples tell you, you should consume what you asked for even if the payload is big. I think that is a good default behaviour. It does not feel right, that you should be able to abort in-flight requests at whim, just because you can with |
I still think forcing publishers to hold the current element in case a subscriber cancels the stream, with the unspecified meaning to "cancel" the current element is an ad-hoc interpretation of reactive streams that has no basis in the specification. Having said that, I'm willing to add a boolean configuration parameter to Hopefully this satisfies everybody, and if not shout here rather soon -- I'll probably make a release early next week. |
It's good to have the choice and we'll certainly put it to use in WebFlux. Thanks! |
@thomas-br can you please try the latest code in branch |
Introduced abortOnCancel in ReactiveRequest.Builder to configure the behavior in case of cancellation from the content subscriber. Signed-off-by: Simone Bordet <[email protected]>
Hi,
I am currently looking into using the jetty httpclient to replace netty for my Spring Webclient which is responsible for external calls. As there is a hard limit when these external requests shall be cancelled, I am using a timeout in the Webclient execution call:
I am using Jetty as ClientHttpConnector, simplified:
In a load test scenario where all calls are cancelled before it can finish, I am observing a thread leak on the Jetty HttpClient Threads:
Even after a long cooldown they are not coming back. And at the peak new requests are failing due to no free resources.
The text was updated successfully, but these errors were encountered: