Fix for concurrency bug #70

Merged
merged 4 commits into from
Jul 16, 2024

Conversation

miclegr
Contributor

@miclegr miclegr commented Jul 11, 2024

Description

When adding a point to an index without providing an id, we rely on TypedIndex::currentLabel to get one, but we do not read and update it in a thread-safe way. This PR fixes that and solves #65.

Changes Made

C++

TypedIndex::currentLabel is now a std::atomic, and it is post-incremented atomically to obtain a new id when needed.
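As a rough illustration of the change described above, here is a minimal sketch of an atomic counter handing out ids to several threads. LabelCounter, nextLabel, drawLabelsConcurrently and allDistinct are hypothetical names invented for this sketch and are not part of the project; only std::atomic and fetch_add come from the PR.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <set>
#include <thread>
#include <vector>

// Hypothetical stand-in for the label counter; not the project's actual class.
struct LabelCounter {
  std::atomic<std::size_t> currentLabel{0};

  // fetch_add(1) returns the value held before the increment, so every
  // caller receives a distinct id even when threads race.
  std::size_t nextLabel() { return currentLabel.fetch_add(1); }
};

// Spawns numThreads threads that each draw idsPerThread labels and returns
// every label that was handed out.
std::vector<std::size_t> drawLabelsConcurrently(LabelCounter &counter,
                                                std::size_t numThreads,
                                                std::size_t idsPerThread) {
  assert(numThreads > 0);
  // Each thread writes only to its own slot, so no extra locking is needed.
  std::vector<std::vector<std::size_t>> perThread(numThreads);
  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < numThreads; ++t) {
    workers.emplace_back([&counter, &perThread, t, idsPerThread] {
      for (std::size_t i = 0; i < idsPerThread; ++i)
        perThread[t].push_back(counter.nextLabel());
    });
  }
  for (auto &w : workers) w.join();

  std::vector<std::size_t> all;
  for (const auto &labels : perThread)
    all.insert(all.end(), labels.begin(), labels.end());
  return all;
}

// True when no label appears twice.
bool allDistinct(const std::vector<std::size_t> &labels) {
  return std::set<std::size_t>(labels.begin(), labels.end()).size() ==
         labels.size();
}
```

With a plain size_t counter, concurrent reads and increments would be a data race; with the atomic, the labels drawn by all threads should be distinct and the counter should end at the total number of labels drawn.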

Testing

Checklist

  • My code follows the code style of this project.
  • I have added and/or updated appropriate documentation (if applicable).
  • All new and existing tests pass locally with these changes.
  • I have run static code analysis (if available) and resolved any issues.
  • I have considered backward compatibility (if applicable).
  • I have confirmed that this PR does not introduce any security vulnerabilities.

Additional Comments

@miclegr
Contributor Author

miclegr commented Jul 11, 2024

C++ linting is failing due to this

Contributor

@markkohdev markkohdev left a comment


I left a question about the usage of the atomic. I'm not sure this will result in behavioral parity with the existing implementation.

@@ -360,7 +360,7 @@ class TypedIndex : public Index {

       int start = 0;
       if (!ep_added) {
-        size_t id = ids.size() ? ids.at(0) : (currentLabel);
+        size_t id = ids.size() ? ids.at(0) : (currentLabel.fetch_add(1));
Contributor


Is .fetch_add(1) actually what we want throughout this change? Wouldn't that result in us incrementing currentLabel multiple times where we actually only wanted to increment it once?

Contributor Author


Correct, I'm reverting this

Contributor Author

@miclegr miclegr Jul 12, 2024


Actually no, my original implementation is correct. On the line above we assign an id to the first row, then here:

start = 1;

we set start = 1 so that when we loop via ParallelFor we start from 1:

ParallelFor(start, rows, numThreads, [&](size_t row, size_t threadId) {

Therefore currentLabel must already be incremented by the time the loop starts.
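To make the argument concrete, here is a simplified, single-threaded sketch of the two variants under discussion. assignLabels and incrementForFirstRow are hypothetical names, the sequential loop only stands in for ParallelFor, and the sketch assumes (this is not confirmed by the diff shown here) that every later row also draws its id with fetch_add(1).

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simplification of the flow described above; not the project's code.
// incrementForFirstRow selects between the two variants discussed in the review.
std::vector<std::size_t> assignLabels(std::size_t rows,
                                      bool incrementForFirstRow) {
  std::atomic<std::size_t> currentLabel{0};
  std::vector<std::size_t> labels(rows);
  if (rows == 0) return labels;

  // The first row is handled before the loop, as in the `if (!ep_added)` branch.
  labels[0] = incrementForFirstRow ? currentLabel.fetch_add(1)
                                   : currentLabel.load();
  std::size_t start = 1;

  // Sequential stand-in for ParallelFor(start, rows, ...).
  // Assumption: each remaining row draws its own id with fetch_add(1).
  for (std::size_t row = start; row < rows; ++row)
    labels[row] = currentLabel.fetch_add(1);

  assert(labels.size() == rows);
  return labels;
}
```

Under these assumptions, reading currentLabel without incrementing it gives the first row and the row at index 1 the same label, whereas fetch_add(1) on the first row yields one distinct label per row.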

Contributor

@markkohdev markkohdev left a comment


👍 Thanks for addressing the comment! This looks good to me, although fixes for race conditions are always a little tricky to verify.

@markkohdev markkohdev merged commit 3fc184a into spotify:main Jul 16, 2024
53 checks passed
@miclegr miclegr mentioned this pull request Jul 17, 2024
2 participants