Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix compute() on plate grid #299

Merged
merged 9 commits into from
Sep 29, 2023
Merged

Conversation

will-moore
Copy link
Member

@will-moore will-moore commented Aug 21, 2023

Fixes #297

This fixes the issue above: The get_tile() already returns a dask array so it doesn't need to be wrapped in dask.delayed().

Test using the code sample on the ticket above.

@codecov
Copy link

codecov bot commented Aug 21, 2023

Codecov Report

All modified lines are covered by tests ✅

Files Coverage Δ
ome_zarr/reader.py 91.37% <100.00%> (+2.24%) ⬆️

📢 Thoughts on this report? Let us know!.

@ziw-liu
Copy link

ziw-liu commented Aug 21, 2023

Hi @will-moore I installed from this branch with:

pip install git+https://github.com/will-moore/ome-zarr-py.git@plate_dask_compute

And I can see the updated lines in the site package of my environment.

I can confirm that 286a4f2 fixes the plate level. However the well level still has the same issue:

import dask.array as da
from ome_zarr.io import parse_url
from ome_zarr.reader import Reader

# well level in an HCS store
location = parse_url("hcs.ome.zarr/A/1")
reader = Reader(location)

data = next(reader()).data[0]
# this passes
assert isinstance(data, da.Array), type(data)

result = data.compute()
# this fails
assert not isinstance(result, da.Array), type(result)

@@ -541,7 +541,8 @@ def get_tile(tile_name: str) -> np.ndarray:
LOGGER.debug("LOADING tile... %s with shape: %s", path, tile_shape)

try:
data = self.zarr.load(path)
# compute() to get the data from dask - we want to return array
data = self.zarr.load(path).compute()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the computation not happen as late as possible, e.g., during the concatenation below, to make use of dask's scheduler?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't we already making use of dask's scheduler? Or do you mean make better use of it?

Would that actually give better performance? Not being very familiar with it, I'm reading https://docs.dask.org/en/stable/scheduler-overview.html which says there are gains to be made if you have a large graph and "share many intermediates", but I don't see that we can do that here?

I've simplified the code a fair bit in ee9ef7d to remove the extra layer of delayed() wrapping. This looks cleaner and still works as expected. It just means that we call self.zarr.load(path) for every tile right at the start, even if we never need to load/compute that region. I don't know if this makes any difference to performance but probably not?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better use, yes, so that the right things happen parallel.

@will-moore
Copy link
Member Author

@ziw-liu Thanks for that - fixed for Wells too now. And I added some tests.
@joshmoore any ideas for why readthedocs has started to fail? (also seen on #300)

@joshmoore
Copy link
Member

No, I don't. The error is:

NameError: name 'html_theme' is not define

Possibly due to the deprecation back in 2021? https://www.sphinx-doc.org/en/master/changes.html#release-4-1-0-released-jul-12-2021

@jburel
Copy link
Member

jburel commented Aug 24, 2023

@will-moore @joshmoore This is a problem with readthedocs. I faced the same problem on ome/omero-scripts#187
I ended up adding the dependencies see ome/omero-scripts@07e130d

@will-moore will-moore added this to the 0.8.1 milestone Sep 13, 2023
@will-moore
Copy link
Member Author

will-moore commented Sep 29, 2023

That last commit fixes failing tests with a bug that comes from #300.
In that PR, image_path = image_paths[field_index] was failing when viewing a Well of e.g. 3 fields, since it tries to build a 2 x 2 grid - the field_index of the bottom right spot is 3, but there are only 3 image_paths.

That bug wasn't caught in the tests for that PR since that code is only run when the you call dask.compute().
It was failing in tests above because in this PR, that code gets run immediately, since it's not wrapped in dask.delayed() and we also add tests for loading a single Well.

Also tested that fix in napari.
Viewing a 3-Field Well with this branch now, it works...

Screenshot 2023-09-29 at 12 23 27

But doing the same with current origin/master branch containing #300, I get the IndexError exception from within dask.compute()

(napari-env) LS30778:data wmoore$ napari --plugin napari-ome-zarr plates/54.ome.zarr/B/10
Traceback (most recent call last):
  File "/opt/anaconda3/envs/napari-env/bin/napari", line 10, in <module>
    sys.exit(main())
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/__main__.py", line 431, in main
    _run()
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/__main__.py", line 308, in _run
    viewer = view_path(  # noqa: F841
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/view_layers.py", line 167, in view_path
    return _make_viewer_then('open', args, kwargs)
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/view_layers.py", line 117, in _make_viewer_then
    method(*args, **kwargs)
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/components/viewer_model.py", line 885, in open
    self._add_layers_with_plugins(
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/components/viewer_model.py", line 958, in _add_layers_with_plugins
    added.extend(self._add_layer_from_data(*_data))
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/components/viewer_model.py", line 1032, in _add_layer_from_data
    layer = add_method(data, **(meta or {}))
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/components/viewer_model.py", line 725, in add_image
    layer = Image(image, **i_kwargs)
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/layers/image/image.py", line 332, in __init__
    self._update_dims()
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/layers/base/base.py", line 641, in _update_dims
    self.refresh()
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/layers/base/base.py", line 1109, in refresh
    self.set_view_slice()
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/layers/base/base.py", line 902, in set_view_slice
    self._set_view_slice()
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/layers/image/image.py", line 700, in _set_view_slice
    self._load_slice(data)
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/layers/image/image.py", line 721, in _load_slice
    if self._slice.load(data):
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/layers/image/_image_slice.py", line 123, in load
    return self.loader.load(data)
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/layers/image/_image_loader.py", line 22, in load
    data.load_sync()
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/napari/layers/image/_image_slice_data.py", line 44, in load_sync
    self.image = np.asarray(self.image)
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/dask/array/core.py", line 1581, in __array__
    x = self.compute()
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/dask/base.py", line 288, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/dask/base.py", line 571, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/dask/threaded.py", line 79, in get
    results = get_async(
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/dask/local.py", line 507, in get_async
    raise_exception(exc, tb)
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/dask/local.py", line 315, in reraise
    raise exc
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/dask/local.py", line 220, in execute_task
    result = _execute_task(task, data)
  File "/opt/anaconda3/envs/napari-env/lib/python3.9/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/Users/wmoore/Desktop/ZARR/ome-zarr-py/ome_zarr/reader.py", line 427, in get_field
    image_path = image_paths[field_index]
IndexError: list index out of range
Bus error: 10

@will-moore will-moore merged commit 03de064 into ome:master Sep 29, 2023
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dask arrays for HCS nodes do not compute to NumPy arrays
4 participants