Implement syncbn for TensorFlow #18671
Also, I am not sure whether it would be better to move the framework-specific code (e.g. TensorFlow) to
Codecov Report
Coverage diff, master vs. #18671:

             master    #18671    +/-
Coverage     78.53%    78.47%    -0.07%
Files        335       335
Lines        32943     32975     +32
Branches     6450      6454      +4
Hits         25873     25878     +5
Misses       5512      5538      +26
Partials     1558      1559      +1
Thanks for the PR!
@qlzh727 can you advise on how to proceed for testing the feature with 2 devices?
LGTM, thanks. Will post-process this on our side.
For the sync batch norm test, you can configure 2 virtual CPUs and use them in a mirrored strategy test: https://www.tensorflow.org/api_docs/python/tf/config/set_logical_device_configuration
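A minimal sketch of that setup, not the actual Keras test: it splits the one physical CPU into two logical devices, builds a `tf.distribute.MirroredStrategy` over them, and compares each replica's local batch mean with the all-reduced mean that synchronized batch norm relies on. The helper names `make_two_cpu_strategy` and `replica_means` are hypothetical, and the cross-replica mean here only stands in for the layer's real statistics.

```python
import tensorflow as tf


def make_two_cpu_strategy():
    """Split the physical CPU into 2 logical devices and mirror across them.

    Must be called before TensorFlow initializes its devices, otherwise
    set_logical_device_configuration raises a RuntimeError.
    """
    cpu = tf.config.list_physical_devices("CPU")[0]
    tf.config.set_logical_device_configuration(
        cpu,
        [
            tf.config.LogicalDeviceConfiguration(),
            tf.config.LogicalDeviceConfiguration(),
        ],
    )
    return tf.distribute.MirroredStrategy(["CPU:0", "CPU:1"])


def replica_means(strategy, shards):
    """Return (local_means, synced_means), one entry per replica.

    `shards` holds one list of floats per replica (hypothetical test data).
    """

    def value_fn(ctx):
        # Give each replica its own slice of the global batch.
        return tf.constant(shards[ctx.replica_id_in_sync_group], dtype=tf.float32)

    per_replica_x = strategy.experimental_distribute_values_from_function(value_fn)

    @tf.function
    def step(x):
        def replica_fn(x):
            # Unsynchronized statistic: computed from this replica's data only.
            local_mean = tf.reduce_mean(x)
            # Synchronized statistic: averaged across all replicas.
            ctx = tf.distribute.get_replica_context()
            synced_mean = ctx.all_reduce(tf.distribute.ReduceOp.MEAN, local_mean)
            return local_mean, synced_mean

        return strategy.run(replica_fn, args=(x,))

    local, synced = step(per_replica_x)

    def to_floats(value):
        return [float(t.numpy()) for t in strategy.experimental_local_results(value)]

    return to_floats(local), to_floats(synced)
```

Calling `replica_means(make_two_cpu_strategy(), [[1.0, 2.0], [3.0, 4.0]])` gives different local means per replica but an identical synced mean, which is the property a sync batch norm test would presumably check on the layer's outputs or moving statistics.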
@qlzh727 the test for this feature has now been moved to How should we modify the test to ensure correctness? (Right now I think the sync branch is never actually run.)
Ack, I will add a test for that. The sync logic only behaves differently in a distributed setting.
Add a SyncBN implementation (via the synchronized argument in layers.BatchNormalization) for TensorFlow. Note that I don't know how to write a test for it in the Keras online tests (it requires multiple GPUs). #18667