Minor performance difference between BanditPAM and sklearn for a small number of data points #255

Open
lukeleeai opened this issue Jun 16, 2023 · 0 comments

lukeleeai (Collaborator) commented Jun 16, 2023

Description:
I have observed a small performance difference between BanditPAM and sklearn on the MNIST dataset for n < 20000: at these sizes, BanditPAM is slower than sklearn.

Reproducibility:
You can reproduce the results by running the code in the "sklearn_comparison" branch of the BanditPAM repository. To run the experiment, execute the following command:
python experiments/run_scaling_experiment.py

I installed banditpam with:
pip install banditpam
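
For readers who do not have the branch checked out, the kind of measurement reported here can be sketched roughly as follows. This is not the contents of run_scaling_experiment.py: the names `time_fit`, `run_scaling`, and `placeholder_fit` are hypothetical, and the placeholder workload is not a clustering algorithm. It only shows the shape of a harness that times each algorithm once per dataset size; a real run would pass callables that load n points and fit the actual models.

```python
import time
from typing import Callable, Dict, Sequence


def time_fit(fit: Callable[[int], object], n: int) -> float:
    """Return the wall-clock seconds taken by one call to fit(n)."""
    start = time.perf_counter()
    fit(n)
    return time.perf_counter() - start


def run_scaling(
    algorithms: Dict[str, Callable[[int], object]],
    sizes: Sequence[int],
) -> Dict[int, Dict[str, float]]:
    """Time every algorithm at every dataset size and print a simple log."""
    results: Dict[int, Dict[str, float]] = {}
    for n in sizes:
        print("Num data: ", n)
        results[n] = {}
        for name, fit in algorithms.items():
            print(f"<Running {name}>")
            results[n][name] = time_fit(fit, n)
            print(results[n][name])
    return results


def placeholder_fit(n: int) -> int:
    """Stand-in workload (NOT a clustering algorithm); swap in a real fit."""
    return sum(i * i for i in range(n))


# Hypothetical usage: one entry per algorithm being compared.
demo = run_scaling({"placeholder": placeholder_fit}, sizes=[1000, 10000])
```

A single run per size is noisy, especially at n = 1000 where the totals are well under a second, so repeating each measurement and reporting the minimum or median would make small gaps easier to interpret.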

You should then see results similar to the following:

Num data:  1000

<Running  SKLEARN >
0.19861984252929688

<Running  BanditPAM VA with caching >
0.8404459953308105

Num data:  10000

<Running  SKLEARN >
15.577669143676758

<Running  BanditPAM VA with caching >
20.48973298072815

Fortunately, for larger n, BanditPAM significantly outperforms sklearn:

Num data:  20000

<Running  SKLEARN >
42.05375599861145

<Running  BanditPAM VA with caching >
29.887195110321045
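
To put the gap in relative terms, the timings above can be turned into BanditPAM/sklearn ratios. The sketch below only re-uses the numbers pasted in this issue; the dictionary layout and the `slowdown_ratios` helper are illustrative, not part of the experiment script.

```python
# Timings copied from the log output in this issue, keyed by number of points.
timings = {
    1000: {"sklearn": 0.19861984252929688, "banditpam": 0.8404459953308105},
    10000: {"sklearn": 15.577669143676758, "banditpam": 20.48973298072815},
    20000: {"sklearn": 42.05375599861145, "banditpam": 29.887195110321045},
}


def slowdown_ratios(results):
    """Return BanditPAM time divided by sklearn time for each dataset size.

    A ratio above 1 means BanditPAM was slower; below 1 means it was faster.
    """
    return {n: t["banditpam"] / t["sklearn"] for n, t in results.items()}


ratios = slowdown_ratios(timings)
for n, ratio in sorted(ratios.items()):
    verdict = "slower" if ratio > 1 else "faster"
    print(f"n={n}: BanditPAM/sklearn = {ratio:.2f} ({verdict})")
```

The ratio is above 1 at n = 1000 and n = 10000 and drops below 1 at n = 20000, so the crossover lies somewhere between the last two sizes.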

lukeleeai changed the title from "Significant performance difference between BanditPAM and sklearn for mnist n10000 k5" to "Significant performance difference between BanditPAM and sklearn" on Jun 16, 2023
lukeleeai changed the title from "Significant performance difference between BanditPAM and sklearn" to "Minor performance difference between BanditPAM and sklearn for a small number of data points" on Jun 18, 2023