-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-36346: [C++] Safe S3 finalization #36442
Conversation
This is an alternative approach to #36437. @westonpace @kou |
cpp/src/arrow/filesystem/s3fs.cc
Outdated
AwsInstance() : is_initialized_(false), is_finalized_(false) {} | ||
~AwsInstance() { Finalize(/*from_destructor=*/true); } | ||
|
||
// Returns true iff the instance was newly initialized with `options` | ||
Result<bool> EnsureInitialized(const S3GlobalOptions& options) { | ||
bool expected = false; | ||
// XXX The individual accesses are atomic but the entire sequence below is not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I don't think this should be a problem in practice, as calls to EnsureInitialized
and Finalize
should be serialized at the application level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with your opinion.
How about adding this note to this comment or documents for InitializeS3()
/FinalizeS3()
?
(I think that we can remove the "XXX" mark here.)
@github-actions crossbow submit -g python -g wheel |
This comment was marked as outdated.
This comment was marked as outdated.
cpp/src/arrow/filesystem/s3fs.cc
Outdated
AwsInstance() : is_initialized_(false), is_finalized_(false) {} | ||
~AwsInstance() { Finalize(/*from_destructor=*/true); } | ||
|
||
// Returns true iff the instance was newly initialized with `options` | ||
Result<bool> EnsureInitialized(const S3GlobalOptions& options) { | ||
bool expected = false; | ||
// XXX The individual accesses are atomic but the entire sequence below is not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with your opinion.
How about adding this note to this comment or documents for InitializeS3()
/FinalizeS3()
?
(I think that we can remove the "XXX" mark here.)
@github-actions crossbow submit -g python -g wheel |
Thanks a lot for the review @kou . I agree with all your comments and made the necessary changes. |
Revision: c171b31 Submitted crossbow builds: ursacomputing/crossbow @ actions-1c2290f238 |
@github-actions crossbow submit -g cpp |
Revision: c171b31 Submitted crossbow builds: ursacomputing/crossbow @ actions-0d14e99a00 |
The remaining CI failures seem unrelated. I think I'm going to merge so as to improve CI, as we're nearing the release. |
### Rationale for this change It is required to call `FinalizeS3` at process end to cleanly shut down resources used by the AWS SDK before it's too late. However, once `FinalizeS3` has been called, another problem appears: no AWS SDK API can be called safely anymore. Even object destructors can be dangerous. We therefore need a way to prevent unsafe use of the AWS SDK APIs, regardless of how the application issues filesystems calls. ### What changes are included in this PR? Shield all uses of the internal `S3Client` class behind a safe RAII facility `S3ClientLock`. Obtaining a `S3ClientLock` ensures that S3 finalization has not been called yet _and_ that it will not be called until the `S3ClientLock` goes out of scope. ### Are these changes tested? Yes, some Python tests exercise calling S3 APIs after explicit finalization. ### Are there any user-facing changes? Not for correctly written application code, ideally. * Closes: apache#36346 Authored-by: Antoine Pitrou <[email protected]> Signed-off-by: Antoine Pitrou <[email protected]>
Conbench analyzed the 6 benchmark runs on commit There were 7 benchmark results indicating a performance regression:
The full Conbench report has more details. |
### Rationale for this change In #12227 we decided to use a bundled version of the AWS SDK when compiling Python wheels, in order to downgrade the AWS SDK version. Now that we have fixed S3 finalization issues (#36442), it should be ok to rely on the vcpkg-installed version of the AWS SDK again. ### What changes are included in this PR? Remove use of bundled AWS SDK and use S3 vcpkg feature for requirements. ### Are these changes tested? On CI and via crossbow ### Are there any user-facing changes? No * Closes: #36752 Authored-by: Raúl Cumplido <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
Did this make it into /Users/voltrondata/github-actions-runner/_work/crossbow/crossbow/arrow/cpp/src/arrow/filesystem/s3fs.cc:2829: arrow::fs::FinalizeS3 was not called even though S3 was initialized. This could lead to a segmentation fault at exit
zsh: segmentation fault ./test_routing_pybind
(venv) test % pip list | grep pyarrow
pyarrow 13.0.0 |
Yes, it should be in 13.0.0. Can you please open an issue with a reproducer script? |
…apache#36925) ### Rationale for this change In apache#12227 we decided to use a bundled version of the AWS SDK when compiling Python wheels, in order to downgrade the AWS SDK version. Now that we have fixed S3 finalization issues (apache#36442), it should be ok to rely on the vcpkg-installed version of the AWS SDK again. ### What changes are included in this PR? Remove use of bundled AWS SDK and use S3 vcpkg feature for requirements. ### Are these changes tested? On CI and via crossbow ### Are there any user-facing changes? No * Closes: apache#36752 Authored-by: Raúl Cumplido <[email protected]> Signed-off-by: Sutou Kouhei <[email protected]>
Rationale for this change
It is required to call
FinalizeS3
at process end to cleanly shut down resources used by the AWS SDK before it's too late.However, once
FinalizeS3
has been called, another problem appears: no AWS SDK API can be called safely anymore. Even object destructors can be dangerous.We therefore need a way to prevent unsafe use of the AWS SDK APIs, regardless of how the application issues filesystems calls.
What changes are included in this PR?
Shield all uses of the internal
S3Client
class behind a safe RAII facilityS3ClientLock
. Obtaining aS3ClientLock
ensures that S3 finalization has not been called yet and that it will not be called until theS3ClientLock
goes out of scope.Are these changes tested?
Yes, some Python tests exercise calling S3 APIs after explicit finalization.
Are there any user-facing changes?
Not for correctly written application code, ideally.