Transitioning to a faster MTEB #482
Yeah, we can think about this in more detail when MMTEB is in its final stages. It may also make sense to change the dataset composition of the default English average.
If we end up subsampling datasets to speed things up, I might recommend we have something like an MMTEB lite and an MMTEB full. I know when BEIR came out it confused a lot of folks, since the MSMARCO they use is not the same as the standard eval setup that had previously been done. Or, if that is not desirable, it would be helpful to have some way of distinguishing that the datasets have been altered and that results are incomparable with previous results on the full benchmark.
This work may be relevant: https://arxiv.org/abs/2402.14992
Perfect, I will go with this approach when updating datasets.
I was thinking of doing something like that, but it would be frustrating if performance on task A could differ between two sets (so I would rather decide on one subsampling strategy for each dataset than allow it to differ depending on the benchmark). A mini benchmark would then just be a set of representative tasks.
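One way to guarantee that a task's subsample is identical across benchmarks is to derive the sampling seed from the task itself rather than from the benchmark. A minimal sketch of that idea (the function and its names are illustrative, not mteb's actual API):

```python
import hashlib
import random


def subsample_ids(ids, n, task_name):
    """Deterministically pick n example ids for a task.

    The seed comes from the task name alone, so every benchmark that
    includes this task gets exactly the same subsample, and scores on
    task A cannot differ between benchmark variants.
    """
    if len(ids) <= n:
        return list(ids)
    # Stable seed derived from the task name, independent of the benchmark.
    seed = int(hashlib.sha256(task_name.encode()).hexdigest(), 16) % (2**32)
    rng = random.Random(seed)
    return sorted(rng.sample(list(ids), n))
```

Because the seed is a pure function of the task name, re-running the subsampling in a different benchmark (or a later release) reproduces the same subset, which keeps lite and full results internally consistent.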
Yeah, that is a pretty cool paper. We can do a similar thing if we treat each dataset as a sample (otherwise we would need to refactor how we handle samples, which I believe is too much).
Ah, a new suggestion I made in #481 is to add `task.superseeded_by = "new_dataset_name"`, which raises a warning when the original dataset is run. I believe this keeps backward compatibility, lets us see which datasets are outdated, and allows us to update datasets without influencing previous benchmarks.
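The proposed mechanism could look roughly like this: a class attribute that, when set, emits a deprecation warning at run time while still producing results. A minimal sketch (class and method names are illustrative, not mteb's actual API; the attribute spelling is kept as in the proposal):

```python
import warnings


class Task:
    """Base task: warns when a newer dataset supersedes it."""

    superseeded_by = None  # spelling as used in the proposal

    def run(self):
        if self.superseeded_by is not None:
            # Old task still runs, so previous benchmarks stay reproducible,
            # but the user is nudged toward the updated dataset.
            warnings.warn(
                f"{type(self).__name__} is superseded by "
                f"{self.superseeded_by!r}; consider running that task instead.",
                DeprecationWarning,
            )
        return "scores"


class OldTask(Task):
    superseeded_by = "new_dataset_name"
```

Running `OldTask().run()` still returns scores, so nothing breaks for existing benchmarks; the warning just flags that a newer variant exists.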
This issue seems outdated, with multiple newer issues; to get an overview, #784 is probably a good place to start.
@Muennighoff it seems like we might run into an issue where we will need to update MTEB to MMTEB, where we might e.g. want to speed up existing tasks. One solution might be to keep the leaderboard on an older version of MTEB until we move the entire leaderboard to the newest version (thereby also outdating older results).
As far as my experiments with speeding up the clustering task go, there are plenty of ways to improve speed, but most of them sacrifice score comparability with the old versions.