-
Notifications
You must be signed in to change notification settings - Fork 23
Add support for scheduled retry with retryAfter #94
Add support for scheduled retry with retryAfter #94
Conversation
Discussion with @shuhei
|
this.options.dropAllRequestsAfter - | ||
(performance.now() - timerInitial); | ||
if (params.dropRequestAfter) { | ||
params.dropRequestAfter = Math.min( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This params
object is reused across retries. Is it safe to mutate it? I prefer to clone it for each retry unless its reference equality is used somewhere.
if (!hasRequestEnded) { | ||
requestObject.abort(); | ||
const err = new UserTimeoutError(options, timings); | ||
logEvent(EventSource.HTTP_REQUEST, EventName.ERROR, err.message); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to mark aborted requests as timeouts when aborting because of a successful request?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logging is one thing, but we also need to make sure that this doesn't open the circuit breaker when aborting because of another successful request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once one on the request has succeed, retryOperation will be marked as resolved
which will bypass any reject "bubble", see:
catch((error: ServiceClientError) => {
if (retryOperation.isResolved()) {
return;
}
retryErrors.push(error);
failure();
...
So actually clients are not even aware of the result of this promise. There is a vicious side effect though.: this request is actually never resolved or rejected: is it a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, so the circuit breaker command's success()
or failure()
are not called. Then, it won't make false failure()
calls at least!
But isn't the circuit breaker designed with an assumption that one of success()
or failure()
is called when an operation ends? It seems that the circuit breaker thinks that it is a timeout if success()
or failure()
are not called. And timeouts contribute to error rate calculation in the circuit breaker, which means it can open the circuit breaker without actual errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, we should stop the timeout timer in the circuit breaker when aborting requests because of another successful request. Because calling success()
will lower the actual error rate, how about creating another function (cancel
or something) in Command
?
We may want to mark failures in some cases though. For example, if the first request doesn't respond and the second request succeeds, it makes more sense to mark the first request as a failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow very good catch! Will evaluate adding this new function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok this approach seems to work. I have to write more tests now to cover all the different cases:
- both requests work
- first fails
- second fails
- first timeouts
- second timeouts
- both fail
- both timeouts
success(); | ||
result.retryErrors = retryErrors; | ||
resolve(result); | ||
if (retryAfter) { | ||
requestsAbortCallbacks.forEach(cb => cb()); | ||
requestsAbortCallbacks.length = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, this is a cleanup for the happy case. Is this array of callbacks cleaned up in the sad case (all requests fail)? I'm not sure if it causes a memory leak only from the code, but I think it's worth checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes good catch! Will check that
Fixes #93
Changes
retry operation now does not pass
currentAttempt
as argument.currentAttempt
is now actually returned to retry() so that we can use it outside of the current retry operation (for multiple retries in parallel).retry operation now passes
scheduledRetry
as boolean argument.scheduledRetry
helps determine that we need to propagate error when it fails.support for
dropAllRequestsAfter
which wraps all requests (initial and potential retries) sent and will drop all of them after that timesupport for
retryAfter
which enables retrying a request without having to wait for first one to fail and still have it in flight while we send second one