
Announcement: S3 default integrity change #3166

Open
aBurmeseDev opened this issue Jan 16, 2025 · 3 comments
Labels
announcement This is an announcement issue

Comments

@aBurmeseDev (Member) commented Jan 16, 2025

In aws-sdk-s3 v1.178.0, we released changes to the S3 client that adopt new default integrity protections. For more information on the default integrity behavior, please refer to the official SDK documentation. Starting with this version, clients default to computing an additional checksum on all Put calls and validating checksums on Get calls.

You can disable the default integrity protections for S3. We do not recommend this, because checksums are important to S3's integrity posture. Integrity protections can be disabled by setting the relevant config options to WHEN_REQUIRED, or by using the corresponding AWS shared config file settings or environment variables.

Disclaimer: the AWS SDKs and CLI are designed for usage with official AWS services. We may introduce and enable new features by default, such as these new default integrity protections, prior to them being supported or otherwise handled by third-party service implementations. You can disable the new behavior with the WHEN_REQUIRED value for the request_checksum_calculation and response_checksum_validation configuration options covered in Data Integrity Protections for Amazon S3.
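For readers looking for the concrete opt-out, the two settings named above can go in the AWS shared config file (a sketch based on the option names in this announcement; the profile name is illustrative, and the equivalent environment variables are noted in comments):

```ini
# ~/.aws/config
[default]
# Restore pre-v1.178.0 behavior: only checksum when the operation requires it.
# Environment-variable equivalents (assumption, per the SDK docs):
#   AWS_REQUEST_CHECKSUM_CALCULATION=when_required
#   AWS_RESPONSE_CHECKSUM_VALIDATION=when_required
request_checksum_calculation = when_required
response_checksum_validation = when_required
```

The same option names can also be passed directly when constructing the client in code.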

@marckohlbrugge (Contributor) commented:

We experienced data loss in production due to this minor-level update.

While I understand the focus of AWS SDKs and CLI is on official AWS services, it's important to acknowledge the impact on users who rely on third-party storage providers. I believe AWS holds a responsibility to ensure the following when implementing changes of this nature:

  1. Advance Notification: Provide sufficient notice to third-party service providers and users about upcoming changes that might affect data integrity or compatibility. This allows stakeholders to prepare and adapt their systems accordingly.
  2. Opt-in Approach: Consider making potentially disruptive changes opt-in rather than default. This ensures that users can test and adapt to new features in a controlled manner without unexpected disruptions to their existing workflows.
  3. Versioning Practices: When introducing changes that are not backwards-compatible, it would be prudent to increment the major version of the library. This traditional versioning practice signals to users that substantial changes have been made, allowing them to manage updates with greater caution.

These steps would help mitigate risks and maintain trust among users who depend on AWS SDKs in diverse environments. Your consideration of these points would be greatly appreciated.

@mullermp (Contributor) commented:

I'm so sorry you experienced data loss from this change. Can you please be specific about how that happened so I can forward that feedback to s3?

Regarding your points:

  1. Totally agree with this. The SDK team called this out as a risk but s3 wanted to proceed anyway. The intent was secrecy for re:Invent which is why it was not announced ahead of time. While I personally (and others) disagree with this, we had no choice. In Ruby's case, we thankfully had an "out" for calculating md5 using drop-in code.

  2. This behavior was already opt-in for over 3 years. S3 has supported crc32, crc32c, sha1, and sha256 via checksum parameters on various operations. Adoption was not quick enough, prompting checksumming by default.

  3. While I understand your sentiment, this was not realistic across all language SDKs, given the measurable history of how slowly customers move to newer major versions. I agree that the change was a bit too severe, but it was intended to be mostly transparent.
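For context on point 2: the flexible-checksums feature works by sending a Base64-encoded checksum of the request body in a header such as `x-amz-checksum-crc32`. A minimal Ruby sketch of the value an SDK computes for the CRC32 case, using only the standard library (the body string here is just an example):

```ruby
require "zlib"
require "base64"

# The x-amz-checksum-crc32 header value is the CRC32 of the body,
# packed as 4 big-endian bytes and then Base64-encoded.
body = "hello world"
crc32 = Zlib.crc32(body)                              # 32-bit unsigned integer
header_value = Base64.strict_encode64([crc32].pack("N"))
puts header_value
```

The other algorithms (crc32c, sha1, sha256) follow the same pattern of "raw digest bytes, Base64-encoded"; opting in per request was a matter of passing the corresponding `checksum_algorithm` parameter to the operation.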

Hopefully the ruby implementation (which I did) was not too painful. I simply implemented what was asked, so please don't shoot the messenger here 😊.

@marckohlbrugge (Contributor) commented:

Thank you for your quick reply!

> Can you please be specific about how that happened so I can forward that feedback to s3?

What happened is that I updated to the latest aws-sdk-ruby version without much thought, given it was a minor version increment. My web application allows people to post todos with attachments. We process file uploads asynchronously, so there was no immediate feedback to the user that something went wrong. When a background job tried to upload a file to Cloudflare R2 (through the aws-sdk-ruby gem with a custom endpoint), we ran into an Aws::S3::Errors::InvalidRequest error with the message "You can only specify one checksum at a time."

Once I finally discovered this exception, I went and added the following configuration settings to my ActiveStorage storage.yml which I previously found elsewhere on the internet:

request_checksum_calculation: when_required
response_checksum_validation: when_required
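For anyone else hitting this with ActiveStorage, those two keys go under the service entry in config/storage.yml. A sketch for an S3-compatible provider; the service name, credentials, endpoint, and bucket below are illustrative, not taken from this thread:

```yaml
# config/storage.yml (names and endpoint are illustrative)
cloudflare_r2:
  service: S3
  access_key_id: <%= ENV["R2_ACCESS_KEY_ID"] %>
  secret_access_key: <%= ENV["R2_SECRET_ACCESS_KEY"] %>
  endpoint: https://<account_id>.r2.cloudflarestorage.com
  region: auto
  bucket: my-bucket
  request_checksum_calculation: when_required
  response_checksum_validation: when_required
```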

The good news is that after writing my previous post, I was able to recover the lost files. The missing files were initially uploaded through a Telegram chatbot, which meant the files were still stored on Telegram's servers, and I was able to recover the original file identifiers from the logs of the failed background jobs. If it weren't for those logs, there would have been no way to recover the file identifiers, and we would have essentially lost that data.

> This behavior was already opt-in for over 3 years. S3 supported crc32, crc32c, sha1 and sha256 using checksum parameters to various operations. The adoption was not quick enough, prompting checksumming by default.

I see. I was unaware of this. I believe recent Rails versions raise deprecation warnings as errors by default in the development environment. Perhaps a similar mechanism could be used to warn people of such changes: keep request_checksum_calculation and response_checksum_validation at a backwards-compatible default, while emitting a deprecation warning when the user has not set an explicit value. That way the gem continues to work in production while alerting developers. After a grace period, the default could be changed to the one preferred by AWS.

There might be better approaches. This is just one idea that came to mind.
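The grace-period idea could look something like this inside a gem's config resolution. This is purely illustrative: the method, constant, and option values below are made up for the sketch and are not the actual aws-sdk-ruby internals.

```ruby
# Illustrative sketch of a deprecation grace period (not actual SDK code):
# keep the old default when the user hasn't chosen a value, but warn loudly
# so developers notice before the default flips in a later release.
LEGACY_DEFAULTS = {
  request_checksum_calculation: "when_required",
  response_checksum_validation: "when_required"
}.freeze

def resolve_checksum_option(user_config, key)
  # An explicit user choice always wins and never warns.
  return user_config[key] if user_config.key?(key)

  warn "DEPRECATION: #{key} currently defaults to " \
       "#{LEGACY_DEFAULTS[key].inspect} but will default to \"when_supported\" " \
       "in a future release; set it explicitly to silence this warning."
  LEGACY_DEFAULTS[key]
end
```

Usage: `resolve_checksum_option({}, :request_checksum_calculation)` warns on stderr and returns the legacy default, while a user-supplied value is returned silently.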

> Hopefully the ruby implementation (which I did) was not too painful. I simply implemented what was asked, so please don't shoot the messenger here 😊.

Once I realized what was going on, the fix was relatively easy. I don't blame you. I appreciate your work on this gem, and I understand the difficult position you must be in, trying to balance your employer's wishes with developers' needs. Thank you for your hard work 🙏
