[SVCS-894] Cache File During Contiguous Upload For Dropbox #368

Open
wants to merge 2 commits into
base: develop

cslzchen
Contributor

@cslzchen cslzchen commented Oct 1, 2018

Ticket

https://openscience.atlassian.net/browse/SVCS-894

Purpose

Dropbox's rate limiting may cause uploading, copying, or moving many files to fail with "too_many_requests" or "too_many_write_operations". The former literally means too many requests; the latter indicates namespace lock contention. Unlike GitHub, Dropbox doesn't provide detailed information on how it rate-limits requests; it simply asks clients to retry according to the "Retry-After" header in the 429 response.

Retrying is not straightforward for uploads (inter copy and move use upload internally), since the stream has already been consumed by the time the request finishes. The solution is to cache the stream locally in a temporary file and stream from that file for both the initial request and any following 429 retries.
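To illustrate why caching makes a retry possible, here is a minimal sketch using only the Python standard library. It is not the code in this PR: `cache_stream`, `send_with_replay`, and the `send` callable are hypothetical names, and `send` is assumed to read the file object it is given and return an HTTP status code.

```python
import tempfile

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size for copying


def cache_stream(stream, chunk_size=CHUNK_SIZE):
    """Copy a one-shot readable stream into a temporary file and rewind it."""
    cache = tempfile.TemporaryFile()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        cache.write(chunk)
    cache.seek(0)
    return cache


def send_with_replay(stream, send, attempts=2):
    """Call send(file_obj) up to `attempts` times, replaying the cached body.

    `send` is a hypothetical callable that consumes the file object and
    returns an HTTP status code. The cache is rewound before every attempt,
    so each attempt sees the full body even though the original stream can
    only be read once.
    """
    with cache_stream(stream) as cache:
        status = None
        for _ in range(attempts):
            cache.seek(0)
            status = send(cache)
            if status != 429:
                break
        return status
```

The temporary file is removed when the `with` block exits, so nothing is left on disk once the upload has succeeded or the attempts are exhausted.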

Changes

TBD

Side effects

TBD

QA Notes

TBD

Deployment Notes

No

@coveralls

coveralls commented Oct 1, 2018

Coverage Status

Coverage decreased (-0.06%) to 91.789% when pulling a214e84 on cslzchen:fix/dropbox-too-many-write-ops into 1e2841c on CenterForOpenScience:develop.

Contributor Author

@cslzchen cslzchen left a comment

TODO: Back to Add'l Dev to update branch and (re)test locally due to previously fixed conflicts.

Dropbox's rate limiting may cause uploading/copying/moving many
files in parallel to fail with either "too_many_requests" or
"too_many_write_operations". The former literally means too many
requests, while the latter indicates namespace lock contention. See
https://www.dropbox.com/developers/reference/data-ingress-guide for
more details.

In addition, Dropbox doesn't reveal how it rate-limits requests and
is actively testing different algorithms. It recommends that clients
retry according to the "Retry-After" header in the 429 response.

However, retrying is not straightforward for upload requests, since
the stream will already have been consumed by the time the request
finishes. Note that both inter copy and move use upload internally.
The solution is to cache the stream locally in a temporary file and
stream from it for both the initial request and any following 429
retries.

During local testing, every request that failed with 429 succeeded
on the first retry. The cause turned out to be "namespace lock
contention" rather than "too many requests". It is therefore
reasonable to set the default maximum to 2 retries per upload.
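
The retry policy described above could be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `request_with_retry`, `parse_retry_after`, and the `do_request` callable (assumed to return a `(status, headers)` pair) are hypothetical names, and only the delay-in-seconds form of "Retry-After" is handled; anything else falls back to a default delay.

```python
import time

MAX_RETRIES = 2  # default maximum suggested in the commit message


def parse_retry_after(headers, default=1.0):
    """Return the Retry-After delay in seconds, or `default` if missing/invalid."""
    try:
        return max(0.0, float(headers.get("Retry-After", default)))
    except (TypeError, ValueError):
        return default


def request_with_retry(do_request, max_retries=MAX_RETRIES, sleep=time.sleep):
    """Call do_request() and retry on 429, at most `max_retries` times.

    `do_request` is a hypothetical callable returning (status, headers).
    Returns the final status and the number of retries performed.
    """
    status, headers = do_request()
    retries = 0
    while status == 429 and retries < max_retries:
        sleep(parse_retry_after(headers))  # wait as long as the server asked
        status, headers = do_request()
        retries += 1
    return status, retries
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays. For uploads, `do_request` would need to rewind the cached temporary file before each call.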