Include bucket region in domain name of S3 URLs #1853
Comments
Thanks for the issue report. Can you say why you need the region name?
Good to know! I thought that redirecting to the underlying region was generic S3 functionality, possibly tied to having buckets replicated across regions, etc. I didn't know that it is specific to "old" regions. I wonder whether changing it this late in the game would cause us some disturbance, as we already have a good number of such URLs "dumped" in a good number of places (e.g., dandiset manifests on S3, datalad dandisets).
I don't think there's much we can do about this--we use django-storages with the S3 backend (which in turn uses boto3) to manage our bucket usage, and we directly use the URLs provided by boto3, which are of the form that is suboptimal for your use case, @jwodder. Suggested workarounds:
For some extra background info: I did read that article, and then I did some experimentation of my own with buckets in us-east-2 and also in a us-west zone. I was never able to get boto3 to provide me with presigned URLs that had the region name in the URL. I don't know what to make of that, except:
afaict, this is in fact what is happening. John's linked article mentions that URLs without a region are sent to us-east, and if necessary a 307 response can then redirect you to the correct region, but in my testing I was never able to get a redirect from boto3's URLs. And
I suppose S3 might "prefer" these URLs in some mild, inferred way, but according to my reading of that article, S3 is happy to respond properly to all forms of these URLs. Fortunately for us (and unfortunately for the S3 folks), I doubt they will deprecate these other forms of the URLs anytime soon. And if they do, we would likely be relying on boto3 to give us better URLs.
(This is a low-priority request, but I thought I'd get it out there anyway.)
Currently, the S3 URLs in assets' `contentUrl` metadata fields have domains of the form `{bucket}.s3.amazonaws.com`, known as the "legacy global endpoint." However, certain S3 SDKs (such as the official Rust one) require supplying an S3 region in order to query an S3 bucket. While a bucket's region can be found via a `HEAD` request to `https://{bucket}.s3.amazonaws.com`, it would be more efficient if this weren't required, i.e., if our S3 URLs had domain names of the form `{bucket}.s3.{region}.amazonaws.com`, as seems to currently be preferred by S3. See https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html for more information.
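
For illustration, here is a minimal client-side sketch of the lookup described above, using only the Python standard library. It assumes the region is reported in the `x-amz-bucket-region` response header of the `HEAD` request; the function names `bucket_region` and `regionalize` and the bucket name `example-bucket` are hypothetical and are not part of the archive's code.

```python
from typing import Optional
from urllib.error import HTTPError
from urllib.parse import urlsplit, urlunsplit
from urllib.request import Request, urlopen

# Host suffix of the "legacy global endpoint" form: {bucket}.s3.amazonaws.com
LEGACY_SUFFIX = ".s3.amazonaws.com"


def bucket_region(bucket: str, timeout: float = 10.0) -> Optional[str]:
    """Look up a bucket's region with a HEAD request to the legacy endpoint.

    Assumption: S3 reports the region in the x-amz-bucket-region header.
    Requires network access; returns None if the header is absent.
    """
    req = Request(f"https://{bucket}{LEGACY_SUFFIX}", method="HEAD")
    try:
        with urlopen(req, timeout=timeout) as resp:
            return resp.headers.get("x-amz-bucket-region")
    except HTTPError as err:
        # Error responses (e.g. 403 for a private bucket) may still carry
        # the header, so check it before giving up.
        return err.headers.get("x-amz-bucket-region")


def regionalize(url: str, region: str) -> str:
    """Rewrite {bucket}.s3.amazonaws.com to {bucket}.s3.{region}.amazonaws.com.

    URLs whose host is not in the legacy form are returned unchanged.
    """
    parts = urlsplit(url)
    host = parts.netloc
    if not host.endswith(LEGACY_SUFFIX):
        return url
    bucket = host[: -len(LEGACY_SUFFIX)]
    new_host = f"{bucket}.s3.{region}.amazonaws.com"
    return urlunsplit(parts._replace(netloc=new_host))
```

A caller would do the `HEAD` lookup once per bucket, cache the result, and then apply `regionalize` to each `contentUrl`. That one extra round trip is the cost this request is trying to avoid.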