Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clearer byte range definition in get_range and get_ranges #156

Merged
merged 3 commits into from
Jan 28, 2025

Conversation

kylebarron
Copy link
Member

@kylebarron kylebarron commented Jan 17, 2025

Closes #155

@kylebarron kylebarron changed the title Kyle/get-range-params-required Clearer byte range definition in get_range and get_ranges Jan 17, 2025
@kylebarron kylebarron requested a review from maxrjones January 17, 2025 18:53
@kylebarron
Copy link
Member Author

cc @maxrjones A Rust match on two optional integers should be extremely cheap, and much cheaper than matching Python objects. I guess it really depends on the Python overhead of translating keyword arguments into rust. In terms of the Rust overhead of the option match, I'd expect that should be on the order of nanoseconds. Maybe any additional Python overhead would add up to a microsecond extra?

@maxrjones Would you be able to rerun your zarr benchmark from this branch?

@maxrjones
Copy link
Member

@maxrjones Would you be able to rerun your zarr benchmark from this branch?

Happy to. Can you provide guidance on how to install the crate + bindings from git? usually I'd use uv add "obstore @ git+https://github.com/developmentseed/obstore@0ce55be897206d9867c6a1170be95c0c1ffc4d32" but that raises Package metadata name "test-env" does not match given name "obstore".

@kylebarron
Copy link
Member Author

I think you need subdirectory=obstore https://stackoverflow.com/a/19516714

@kylebarron
Copy link
Member Author

Just checked and

pip install git+https://github.com/developmentseed/obstore@0ce55be897206d9867c6a1170be95c0c1ffc4d32#subdirectory=obstore

worked for me

Copy link
Member

@maxrjones maxrjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, took me a second to be able to use compiled code on the VEDA Hub and I wanted to run the benchmarks in us-west-2. Anyways, the new rate for loading a ~3.7GB array with Zarr is ~0.36% slower which IMO is acceptable.

@kylebarron
Copy link
Member Author

No worries! Thanks for running it!

Anyways, the new rate for loading a ~3.7GB array with Zarr is ~0.36% slower which IMO is acceptable.

So it was 7.92s before this change and 7.948s after this change? I figure that could just be noise?

@maxrjones
Copy link
Member

I used a instance with half as many cores so it was ~2x that time for each. If you'd be concerned about that relative difference, if real, I could record the individual timings to determine if there's a statistically significant difference.

@kylebarron
Copy link
Member Author

Nah, that's good. Thanks for checking that benchmark!

@kylebarron kylebarron merged commit 4e88107 into main Jan 28, 2025
4 checks passed
@kylebarron kylebarron deleted the kyle/get-range-params-required branch January 28, 2025 04:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change get_range start and end parameters to be kwarg only, add length as alternative
2 participants