[Feature] Store SOAP condition matrices as the dtype of their parameters #335

Merged
merged 4 commits into from
Feb 2, 2025

kylevedder
Contributor

After #333, SOAP functions correctly, but it uses significantly more VRAM than necessary when training models with reduced-precision weights (e.g. bfloat16).

This PR initializes and updates the condition matrices based on the precision of the parameters themselves rather than defaulting to float32 for everything.
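
For a rough sense of scale, here is a back-of-the-envelope sketch of the saving. It assumes one square condition matrix per parameter dimension (an m×m and an n×n matrix for an m×n weight); the helper name and the example shape are hypothetical and not taken from this PR.

```python
def condition_matrix_bytes(shape, bytes_per_element):
    """Bytes needed for one square (d x d) matrix per dimension d of `shape`.

    Assumption: one condition matrix per parameter dimension. Any extra
    per-dimension state (e.g. an orthogonal basis of the same size) scales
    the total but leaves the float32 / bfloat16 ratio unchanged.
    """
    return sum(d * d for d in shape) * bytes_per_element


shape = (4096, 1024)  # hypothetical weight matrix

fp32_bytes = condition_matrix_bytes(shape, 4)  # float32: 4 bytes per element
bf16_bytes = condition_matrix_bytes(shape, 2)  # bfloat16: 2 bytes per element

print(f"float32:  {fp32_bytes / 2**20:.1f} MiB")  # float32:  68.0 MiB
print(f"bfloat16: {bf16_bytes / 2**20:.1f} MiB")  # bfloat16: 34.0 MiB
```

Under these assumptions, storing the matrices in the parameter dtype halves that part of the optimizer state for bfloat16 models.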

The only exception is the QR factorization used to obtain the orthogonal matrix Q, which is done in float32 regardless of the underlying matrix precision, because:

  • only float32 has CUDA kernel support as of PyTorch 2.5.1
  • precision matters for making the matrix actually orthogonal (I attempted to hand-roll Newton iteration but it was causing NaNs during optimization)
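
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `init_condition_matrix` and `orthogonalize` are hypothetical helper names, and the exponential-moving-average update is an assumed stand-in for the optimizer's actual update rule.

```python
import torch


def init_condition_matrix(param: torch.Tensor, dim: int) -> torch.Tensor:
    # Hypothetical helper: allocate the state in the parameter's own dtype
    # instead of defaulting to float32.
    n = param.shape[dim]
    return torch.zeros(n, n, dtype=param.dtype, device=param.device)


def orthogonalize(matrix: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: run QR in float32, then cast Q back to the
    # storage dtype of the input matrix.
    q, _ = torch.linalg.qr(matrix.to(torch.float32))
    return q.to(matrix.dtype)


torch.manual_seed(0)
param = torch.randn(8, 16).to(torch.bfloat16)
grad = torch.randn(8, 16).to(torch.bfloat16)

gg = init_condition_matrix(param, dim=0)  # 8 x 8, stored as bfloat16
beta = 0.95  # assumed decay factor, for illustration only
gg.mul_(beta).add_(grad @ grad.T, alpha=1 - beta)  # update stays in bfloat16

q = orthogonalize(gg)  # factorized in float32, stored as bfloat16
print(gg.dtype, q.dtype)
```

Only the temporary float32 copy made inside `orthogonalize` uses full precision; the persistent state stays in the parameter dtype.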

@kylevedder
Contributor Author

should now comply with the formatting requirements :)

codecov bot commented Jan 29, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (c7496b0) to head (b177ac2).
Report is 23 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##              main      #335    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files          108       110     +2     
  Lines         8509      8731   +222     
==========================================
+ Hits          8509      8731   +222     

kozistr
kozistr previously approved these changes Feb 1, 2025
Owner

@kozistr kozistr left a comment

Thanks for your fix! I left a few minor review comments, and everything else looks great to me :)

--- updated

I've committed those changes myself so that this PR can be merged.

@kozistr added the "bug" label (Something isn't working) on Feb 1, 2025
@kozistr merged commit aca76b6 into kozistr:main on Feb 2, 2025
4 checks passed
Labels: bug (Something isn't working), size/S