I'm harvesting from a server which frequently times out (requests.exceptions.Timeout). Then, the request is not retried even though I set max_retries, since the retry functionality only covers the case where you actually get a response from the server.
I would like to extend the retry functionality to also cover timeouts, but rather than increasing the complexity of the _request method further, I think it's worth considering switching to the tried and tested urllib3 Retry class. For some background on the class, see https://kevin.burke.dev/kevin/urllib3-retries/
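To make the proposal concrete, here is a minimal sketch of how Retry could be wired into a requests session. The helper name `build_session`, its default values, and the choice of `status_forcelist=[503]` are assumptions for illustration only, not existing code in this project.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def build_session(max_retries=5, backoff_factor=2):
    """Return a requests.Session whose adapters retry via urllib3's Retry.

    Hypothetical helper: the name and the defaults are illustrative only.
    """
    retry = Retry(
        total=max_retries,              # overall cap on retries per request
        backoff_factor=backoff_factor,  # growing sleep between attempts
        status_forcelist=[503],         # assumption: also retry on 503 responses
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    # Mount the adapter for both schemes so every request uses the policy.
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
```

A timeout still has to be passed on each call, for example `build_session().get(url, timeout=30)`. As far as I understand, connection and read timeouts on GET requests then count against the retry budget instead of being raised on the first failure, but I have not verified this against every urllib3 version.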
Retry also handles the Retry-After header, so it shouldn't be that different from the current behaviour. The main difference is that it uses a backoff factor instead of a fixed sleep time:
sleep_time = backoff_factor * (2 ** retry_number)
Since OAI-PMH servers can be quite slow, we could set the default backoff factor to something like 2, so that the sleep time increases quickly. The sleep time is capped at BACKOFF_MAX=120 seconds by default.
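As a worked example of the formula above, this small sketch lists the sleep times for a backoff factor of 2. It is plain arithmetic, not a call into urllib3. The function name `sleep_times` is hypothetical, and counting retries from 1 is an assumption; the exact exponent offset, and whether the first retry sleeps at all, may differ between urllib3 versions.

```python
BACKOFF_MAX = 120  # cap in seconds, as quoted above


def sleep_times(backoff_factor, retries, backoff_max=BACKOFF_MAX):
    """Sleep time before each retry, assuming retry_number starts at 1."""
    return [
        min(backoff_max, backoff_factor * (2 ** retry_number))
        for retry_number in range(1, retries + 1)
    ]


print(sleep_times(2, 7))  # [4, 8, 16, 32, 64, 120, 120]
```

With these assumptions, the cap is reached on the sixth retry, and the total wait over seven retries is 364 seconds.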
By the way, it looks like Retry also handles formatted dates, so I think it should take care of the issue described in #28, although I haven't tested it.
Breaking change: This means that the default_retry_after argument would no longer be supported.

Let me know what you think, and whether there is a chance a PR for this would be accepted.