Features/1458 add incremental SVD/PCA #1629

Open
wants to merge 34 commits into base: main

mrfh92 (Collaborator) commented Aug 22, 2024

Due Diligence

  • General:
  • Implementation:
    • unit tests: all split configurations tested
    • unit tests: multiple dtypes tested
    • documentation updated where needed

Description

Issue/s resolved: #1458
(Should be merged after #1561, since this branch already contains those changes.)

Changes proposed:

Adds incremental SVD (see M. Brand, "Fast low-rank modifications of the thin singular value decomposition", Linear Algebra and its Applications 415 (2006)) and a corresponding interface for PCA.
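
For readers unfamiliar with the technique, the following is a minimal single-process NumPy sketch of the column-append update described in the cited paper. It is not the code added in this PR: the function name `isvd_append` and its parameters are hypothetical, and the sketch ignores distribution across MPI ranks, split configurations, and dtype handling.

```python
import numpy as np


def isvd_append(U, S, Vt, C, rank=None):
    """Update a thin SVD A ~ U @ diag(S) @ Vt after appending columns C.

    Hypothetical helper for illustration only; returns the factors of [A, C].
    """
    r, n = Vt.shape
    c = C.shape[1]
    L = U.T @ C                      # part of C inside span(U)
    H = C - U @ L                    # part of C orthogonal to span(U)
    J, K = np.linalg.qr(H)           # orthonormal basis for the new directions
    k = K.shape[0]
    # Small core matrix whose SVD yields the updated factors.
    Q = np.block([[np.diag(S), L], [np.zeros((k, r)), K]])
    Uq, Sq, Vqt = np.linalg.svd(Q, full_matrices=False)
    U_new = np.hstack([U, J]) @ Uq
    W = np.block([[Vt, np.zeros((r, c))], [np.zeros((c, n)), np.eye(c)]])
    Vt_new = Vqt @ W
    if rank is not None:             # optional truncation to a target rank
        U_new, Sq, Vt_new = U_new[:, :rank], Sq[:rank], Vt_new[:rank]
    return U_new, Sq, Vt_new


rng = np.random.default_rng(0)
A = rng.standard_normal((20, 6))     # data seen so far
C = rng.standard_normal((20, 3))     # newly arriving columns
U, S, Vt = np.linalg.svd(A, full_matrices=False)
U_new, S_new, Vt_new = isvd_append(U, S, Vt, C)
```

Without truncation the update reproduces the SVD of the concatenated matrix up to rounding. Passing `rank` keeps only the leading singular triplets, which is what makes the incremental scheme attractive when the data arrive in batches.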

Type of change

new feature

Does this change modify the behaviour of other functions? If so, which?

no

mrfh92 self-assigned this Aug 22, 2024
mrfh92 added the labels ESAPCA (relevant for the ESA-funded project "ESAPCA") and linalg Aug 22, 2024
mrfh92 marked this pull request as ready for review August 22, 2024 11:33
Contributor

Thank you for the PR!

Hoppe and others added 3 commits August 22, 2024 13:45
…w-split=1 needs to be ruled out for the moment due to numerical instabilities of the combination of the respective algorithms.
ClaudiaComito requested a review from JuanPedroGHM and removed the request for mtar September 30, 2024 07:59
JuanPedroGHM (Member) left a comment

Looks good to me, but we have to make sure to merge randomized SVD first, and then solve all the merge conflicts that will appear from this. I'm guessing this was branched from that, but some of the functions have been renamed.

Also, it would be nice to have benchmarks for randomized SVD and incremental SVD.

mrfh92 (Collaborator, Author) commented Oct 4, 2024

@JuanPedroGHM @ClaudiaComito I have merged the rSVD branch into this one and resolved the conflicts. Once rSVD itself is merged, this PR should merge without conflicts.

mrfh92 (Collaborator, Author) commented Oct 4, 2024

@JuanPedroGHM You're right regarding the benchmarks. I have added this as a to-do to the corresponding issue, #1615.

JuanPedroGHM (Member) commented Oct 4, 2024

Benchmarks results - Sponsored by perun

| function | mpi_ranks | device | metric | value | ref_value | std | % change | type | alert | lower_quantile | upper_quantile |
|---|---|---|---|---|---|---|---|---|---|---|---|
| apply_inplace_normalizer | 4 | CPU | RUNTIME | 0.00301788 | 0.00101893 | 0.00588907 | 196.181 | jump-detection | True | nan | nan |
| qr_split_0 | 4 | GPU | RUNTIME | 0.0536457 | 0.0701262 | 0.0036273 | -23.5012 | jump-detection | True | nan | nan |
| qr_split_1 | 4 | GPU | RUNTIME | 0.0522529 | 0.0664146 | 0.0012327 | -21.3232 | jump-detection | True | nan | nan |
| apply_inplace_standard_scaler_and_inverse | 4 | GPU | RUNTIME | 0.0111402 | 0.009996 | 0.00162195 | 11.4462 | jump-detection | True | nan | nan |
| qr_split_1 | 4 | CPU | RUNTIME | 0.169213 | 0.185829 | 0.00172974 | -8.94135 | trend-deviation | True | 0.179067 | 0.196003 |
| hierachical_svd_rank | 4 | CPU | RUNTIME | 0.0515223 | 0.0472803 | 0.00999751 | 8.97223 | trend-deviation | True | 0.0464289 | 0.0487015 |
| hierachical_svd_tol | 4 | CPU | RUNTIME | 0.0564265 | 0.052285 | 0.0129836 | 7.92092 | trend-deviation | True | 0.051572 | 0.0534139 |
| kmeans | 4 | CPU | RUNTIME | 0.306328 | 0.317888 | 0.00495235 | -3.63626 | trend-deviation | True | 0.307904 | 0.329682 |
| qr_split_1 | 4 | GPU | RUNTIME | 0.0522529 | 0.0671226 | 0.0012327 | -22.1529 | trend-deviation | True | 0.0622825 | 0.0740967 |
| hierachical_svd_tol | 4 | GPU | RUNTIME | 0.122298 | 0.119138 | 0.00582516 | 2.65215 | trend-deviation | True | 0.116942 | 0.121381 |
| reshape | 4 | GPU | RUNTIME | 0.260241 | 0.248041 | 0.0216572 | 4.91848 | trend-deviation | True | 0.238506 | 0.259244 |
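
The "% change" column appears to be the relative difference between `value` and `ref_value`, expressed in percent. This reading is inferred from the reported numbers rather than from the benchmarking tool's documentation, and the helper name `percent_change` below is hypothetical. A short sketch for checking a row by hand:

```python
def percent_change(value, ref_value):
    """Relative change of value with respect to ref_value, in percent.

    Assumed formula, inferred from the benchmark rows above.
    """
    return (value - ref_value) / ref_value * 100.0


# Two rows taken from the benchmark results, as (value, ref_value, reported % change).
rows = {
    "apply_inplace_normalizer (CPU)": (0.00301788, 0.00101893, 196.181),
    "qr_split_0 (GPU)": (0.0536457, 0.0701262, -23.5012),
}

for name, (value, ref_value, reported) in rows.items():
    computed = percent_change(value, ref_value)
    print(f"{name}: computed {computed:.3f}, reported {reported}")
```

Under this reading, a negative entry means the run was faster than the reference and a positive entry means it was slower.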

Grafana Dashboard
Last updated: 2024-10-28T08:47:41Z

Labels
benchmark PR, ESAPCA (relevant for the ESA-funded project "ESAPCA"), high-level functions (High-level machine-learning algorithms), linalg
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add incremental SVD/PCA
3 participants