[Feature] Store SOAP condition matrices as the dtype of their parameters #335

kylevedder · 2025-01-29T22:14:45Z

After #333, SOAP functions correctly, but it uses significantly more VRAM than necessary when training models with reduced-precision weights (e.g. bfloat16).

This PR initializes and updates the condition matrices based on the precision of the parameters themselves rather than defaulting to float32 for everything.
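To give a rough sense of the saving, here is a back-of-the-envelope sketch. It assumes (this is not taken from soap.py) that a 2D weight of shape (m, n) keeps one m x m and one n x n statistics matrix plus an orthogonal basis of each shape; the layer size is hypothetical.

```python
# Bytes per element for the two dtypes discussed in this PR.
BYTES_PER_ELEMENT = {"float32": 4, "bfloat16": 2}


def conditioner_bytes(shape, dtype):
    """Rough size of the per-parameter conditioner state for a 2D weight.

    Assumption (not taken from soap.py): for a weight of shape (m, n) the
    optimizer keeps one m x m and one n x n statistics matrix, plus one
    orthogonal basis Q of each of those shapes.
    """
    m, n = shape
    elements = 2 * (m * m + n * n)
    return elements * BYTES_PER_ELEMENT[dtype]


shape = (4096, 4096)  # hypothetical layer size
fp32 = conditioner_bytes(shape, "float32")
bf16 = conditioner_bytes(shape, "bfloat16")
print(f"float32:  {fp32 / 2**20:.0f} MiB")   # float32:  256 MiB
print(f"bfloat16: {bf16 / 2**20:.0f} MiB")   # bfloat16: 128 MiB
```

Under these assumptions, matching the parameter dtype halves the conditioner footprint for a bfloat16 model; real numbers depend on which dimensions are actually preconditioned.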

The only exception is the QR factorization used to obtain the orthogonal matrix Q, which is done in float32 regardless of the underlying matrix precision, because:

- only float32 has CUDA kernel support as of PyTorch 2.5.1
- precision matters for making the matrix actually orthogonal (I attempted to hand-roll a Newton iteration, but it caused NaNs during optimization)
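A minimal sketch of the upcast-then-downcast pattern described above. `orthogonalize` is a hypothetical helper for illustration, not the function in soap.py, and the random matrix is only a stand-in for a condition matrix.

```python
import torch


def orthogonalize(matrix: torch.Tensor) -> torch.Tensor:
    """Return an orthogonal Q for `matrix`, stored in the input's dtype.

    Hypothetical helper, not the function used in soap.py: the QR
    factorization runs in float32 and only the result is cast back.
    """
    original_dtype = matrix.dtype
    q, _ = torch.linalg.qr(matrix.to(torch.float32))
    return q.to(original_dtype)


torch.manual_seed(0)
stats = torch.randn(64, 64).to(torch.bfloat16)  # stand-in for a condition matrix
q = orthogonalize(stats)

# Q keeps the storage dtype; Q^T Q matches the identity only up to
# bfloat16 rounding error introduced by the final cast.
error = (q.float().T @ q.float() - torch.eye(64)).abs().max()
print(q.dtype, error.item())
```

The float32 copy is temporary, so the extra memory is only held for the duration of the factorization, while the stored Q stays in the parameter dtype.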

kylevedder · 2025-01-29T22:33:18Z

should now comply with the formatting requirements :)

codecov · 2025-01-29T22:34:00Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (c7496b0) to head (b177ac2).
Report is 23 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##              main      #335    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files          108       110     +2     
  Lines         8509      8731   +222     
==========================================
+ Hits          8509      8731   +222


kozistr

Thanks for your fix! I left a few minor review comments; everything else looks great to me :)

--- updated

I just committed those changes myself so that this PR can be merged.

pytorch_optimizer/optimizer/soap.py

kylevedder requested a review from kozistr as a code owner January 29, 2025 22:14

pull-request-size bot added the size/S label Jan 29, 2025

kylevedder force-pushed the main branch from 92723ac to 9fc7dc9 Compare January 29, 2025 22:21

Added parameter dtype support throughout the conditioner code

b94f3cc

kylevedder force-pushed the main branch from 9fc7dc9 to b94f3cc Compare January 29, 2025 22:32

kozistr previously approved these changes Feb 1, 2025

3 review comments on pytorch_optimizer/optimizer/soap.py (outdated, resolved)

kozistr assigned kylevedder Feb 1, 2025

kozistr added the bug Something isn't working label Feb 1, 2025

Update pytorch_optimizer/optimizer/soap.py

47155e1

kozistr dismissed their stale review via 47155e1 February 2, 2025 04:40

kozistr added 2 commits February 2, 2025 13:40

Update pytorch_optimizer/optimizer/soap.py

4eaa77c

Update pytorch_optimizer/optimizer/soap.py

b177ac2

kozistr merged commit aca76b6 into kozistr:main Feb 2, 2025
4 checks passed
