[core] Revert change to use ready to create plasma_object_ids #50467

Open
wants to merge 1 commit into
base: master

dayshah commented Feb 12, 2025

Why are these changes needed?

This reverts the change in #49218 that used ready to create plasma_object_ids.

An example Ray Data workload that highlights the problem is here: https://gist.github.com/dayshah/1080db0cd3fb561119bca17c85215117. It takes 5 seconds without that change and 50 seconds with it.

The problem is that ready is capped at num_returns, and object pulling is driven by plasma_object_ids, which was being created from ready after that change. Ray Data calls ray.wait(10_refs, num_returns=1). If plasma_object_ids is created from ready alone, it contains only one object in this situation, so only that object is pulled. If plasma_object_ids is created from memory_object_ids instead, it contains all 10 objects, and all 10 start being pulled even if the ray.wait call returns as soon as one is ready.
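
To make the difference concrete, here is a minimal, self-contained C++ sketch of the two ways of building the set of objects to pull. It is not Ray's code: ObjectID is modeled as a plain int, every object is assumed to live in plasma, and CapReady, PullSetFromReady, and PullSetFromAll are hypothetical helpers named only for illustration.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for Ray's ObjectID; a plain int keeps the sketch small.
using ObjectID = int;

// Models `ready` being capped at num_returns: at most num_returns ids survive.
// Assumption: every id passed in is already "ready" from the memory store's view.
std::set<ObjectID> CapReady(const std::vector<ObjectID> &ids, std::size_t num_returns) {
  std::set<ObjectID> ready;
  for (const ObjectID &id : ids) {
    if (ready.size() >= num_returns) {
      break;
    }
    ready.insert(id);
  }
  return ready;
}

// Behavior being reverted: the pull set is derived from the capped `ready` set.
std::set<ObjectID> PullSetFromReady(const std::set<ObjectID> &ready) {
  return ready;
}

// Behavior after the revert: the pull set is derived from every id passed to wait.
std::set<ObjectID> PullSetFromAll(const std::vector<ObjectID> &ids) {
  return std::set<ObjectID>(ids.begin(), ids.end());
}
```

With 10 ids and num_returns = 1, PullSetFromReady yields a single id while PullSetFromAll yields all 10, which mirrors the ray.wait(10_refs, num_returns=1) case described above.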

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

dayshah added the "go" label (add ONLY when ready to merge, run all tests) on Feb 12, 2025
dayshah requested a review from edoakes on February 12, 2025 03:18
const auto &obj_id = *iter;
auto found = memory_store->GetIfExists(obj_id);
if (found != nullptr && found->IsInPlasmaError()) {
  plasma_object_ids.insert(obj_id);
  ready.erase(iter);
  memory_object_ids.erase(obj_id);

We don't use memory_object_ids after this point, so there's no need for this erase.

dayshah commented Feb 12, 2025

I think a possible fix would be to pass two vectors into the plasma provider's wait:

  • one with all the objects in plasma, which we'd use to start pulling
  • one with only the objects we need from ready, which we'd use to make the actual WaitRequest
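
The two-vector idea above could look roughly like the following C++ sketch. This is an assumption-laden illustration, not the plasma provider's real interface: ObjectID is a plain int, and PlasmaWaitArgs, BuildPlasmaWaitArgs, and the in_plasma predicate are hypothetical names that do not exist in Ray.

```cpp
#include <functional>
#include <set>
#include <vector>

// Hypothetical stand-in for Ray's ObjectID.
using ObjectID = int;

// Hypothetical argument bundle for a plasma-provider wait call.
struct PlasmaWaitArgs {
  std::set<ObjectID> ids_to_pull;  // every object in plasma: start pulling all of these
  std::set<ObjectID> ids_to_wait;  // only the plasma objects that are also in capped `ready`
};

// Splits the caller's ids into the two sets described in the comment above.
// `in_plasma` is an assumed predicate that reports whether an object lives in plasma.
PlasmaWaitArgs BuildPlasmaWaitArgs(const std::vector<ObjectID> &all_ids,
                                   const std::set<ObjectID> &ready,
                                   const std::function<bool(ObjectID)> &in_plasma) {
  PlasmaWaitArgs args;
  for (const ObjectID &id : all_ids) {
    if (!in_plasma(id)) {
      continue;  // objects outside plasma need no pull
    }
    args.ids_to_pull.insert(id);
    if (ready.count(id) > 0) {
      args.ids_to_wait.insert(id);
    }
  }
  return args;
}
```

Keeping the two sets separate would let pulling start for every plasma object while the wait request itself stays limited to what ready asks for.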
