Fix parallel buffered slice writer bug #922
Conversation
I added a boolean argument
Cool stuff! I added a few comments, mostly about naming things and defaults. Could you have a look?
```diff
@@ -34,6 +34,7 @@ def __init__(
         self,
         view: "View",
         offset: Optional[Vec3IntLike] = None,
+        update_bbox: bool = False,
```
I’d say the default should be True, and there could be a comment explaining what it does and encouraging setting it to False if parallel access is intended.
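For illustration, the suggested change might look roughly like this; the comment wording is mine and the type references are only sketched, not taken from the PR:

```python
from typing import Optional

# "View" and "Vec3IntLike" are types from the webknossos package; they are only
# referenced by name here to keep the sketch self-contained.

def __init__(
    self,
    view: "View",
    offset: Optional["Vec3IntLike"] = None,
    # If True (the suggested default), the layer's bounding box is enlarged to
    # contain the written data. Set it to False if several workers write in
    # parallel, and update the bounding box manually afterwards.
    update_bbox: bool = True,
) -> None:
    ...
```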
```diff
@@ -192,6 +192,7 @@ def write(
         self,
         data: np.ndarray,
         offset: Optional[Vec3IntLike] = None,  # deprecated, relative, in current mag
+        update_bbox: bool = True,
```
Maybe in this context it is rather something like `allow_write_outside_view_bbox: bool = False`? (This method does not update the bbox, right?)
Also, did you test what happens if there is a write outside of the existing box? Does it just work?
- I agree that the name `allow_write_outside_view_bbox` fits better for `View.write()`, but it seems a bit odd to have different signatures for View and MagView.
- Yes, it just worked for the files that I used to test it.
@philippotto if you have time, could you also have a look at these changes? I’m not super sure about the semantics with View vs MagView, and I think another opinion would make sense.
As a bandaid hotfix this seems okay to me (however, shouldn’t the default for `update_bbox` be False everywhere to avoid the runtime errors?).

Is there a plan for a proper fix? If I understand it correctly, the PR could potentially cause bounding boxes to not be set correctly?
```python
if isinstance(self.view, MagView):
    self.view.write(
        data,
        offset=buffer_start.add_or_none(self.offset),
        relative_offset=buffer_start_mag1.add_or_none(self.relative_offset),
        absolute_offset=buffer_start_mag1.add_or_none(self.absolute_offset),
        update_bbox=self.update_bbox,
    )
else:
    self.view.write(
        data,
        offset=buffer_start.add_or_none(self.offset),
        relative_offset=buffer_start_mag1.add_or_none(self.relative_offset),
        absolute_offset=buffer_start_mag1.add_or_none(self.absolute_offset),
        allow_write_outside_bbox=not self.update_bbox,
    )
```
This can be DRYed with something like this, I think:

```python
if isinstance(self.view, MagView):
    kwargs = {"update_bbox": self.update_bbox}
else:
    kwargs = {"allow_write_outside_bbox": not self.update_bbox}
self.view.write(
    data,
    offset=buffer_start.add_or_none(self.offset),
    relative_offset=buffer_start_mag1.add_or_none(self.relative_offset),
    absolute_offset=buffer_start_mag1.add_or_none(self.absolute_offset),
    **kwargs
)
```
```python
offset=buffer_start.add_or_none(self.offset),
relative_offset=buffer_start_mag1.add_or_none(self.relative_offset),
absolute_offset=buffer_start_mag1.add_or_none(self.absolute_offset),
allow_write_outside_bbox=not self.update_bbox,
```
I don't understand this logic. If the bbox must not be updated, writing outside of the bbox is allowed? Shouldn't this be inverted?
I’m unsure how to go about this. The idea here was to move the responsibility of setting the full bbox from the subjobs (which do the write call) to the main process (afterwards). The benefit would be that the subjobs don’t run into this parallel editing of the json problem, and yet the final bbox would be correct (as the caller, in this case `add_layer_from_images`, can then set it). The idea was to have the defaults be True, which is the intuitive behavior (everything updates), and only set them to False explicitly in the cases where concurrent usage is intended (creating a need to set the bbox in the caller code afterwards). A problem that arose from this is that the write calls must now be able to write outside of the (previous, smaller) bbox, because it will only be set at the end. That is why the `allow_write_outside_bbox` flag is needed.

I agree that this is all a bit hard to comprehend from the code alone. I am not super happy with this solution, but I also don’t have an idea for a “proper fix” – if you have ideas, please share them!

Another idea was to set the bbox to a huge one (millions of voxels per dimension) before calling the subjobs. In that case, the subjobs would not touch the json as nothing is bigger than expected. That would alleviate the need for passing these booleans. However, it would introduce magic numbers for this huge bbox and also not feel like a proper fix. We could probably also implement a locking mechanism for the json, but I’m not sure how to do that (in a way that does not impede performance).
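A rough sketch of the intended split of responsibility; the parameter names follow the discussion above, and `compute_union_of_written_boxes` is a hypothetical helper:

```python
# Sketch only: subjobs write without touching datasource-properties.json,
# and the caller grows the bounding box once after all jobs have finished.

def write_chunk(job):
    view, data, offset = job
    # update_bbox=False keeps the subjob from editing the json concurrently;
    # as a consequence, the write may temporarily fall outside the current bbox.
    view.write(data, absolute_offset=offset, update_bbox=False)

def import_in_parallel(layer, jobs, executor):
    list(executor.map(write_chunk, jobs))
    # Only the main process updates the bounding box, so no race can occur.
    layer.bounding_box = compute_union_of_written_boxes(jobs)  # hypothetical helper
```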
I see two possible solutions for this problem (I'm preferring (2)):

I think I'm in strong favor of (2), since it seems way simpler to me.
I would have assumed that file locks are not guaranteed to be multiprocessing-safe, but I may be wrong about that. Do we use it in voxelytics in the same way without issue? Then I’m ok with that approach. Maybe you could give @markbader an introduction to how it is done in voxelytics so that he can adapt it here?
I think it should be safe. The package describes the locking mechanism as "a simple way of inter-process communication".

In vx we use the locking to avoid running the same workflow task multiple times (e.g., due to user error). So I think this should work for the libs use case, too.

@markbader See this code. I think it's as simple as importing the module and adding the lock.
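A minimal sketch of how the filelock package could guard the json update (the lock path and the json key path are illustrative assumptions, not taken from the libs code):

```python
import json
from filelock import FileLock  # pip install filelock

def update_properties_json(properties_path: str, new_bbox: dict) -> None:
    # All workers agree on one lock file next to the json.
    lock = FileLock(properties_path + ".lock")
    with lock:  # blocks until no other process holds the lock
        with open(properties_path) as f:
            properties = json.load(f)
        properties["dataLayers"][0]["boundingBox"] = new_bbox  # illustrative key path
        with open(properties_path, "w") as f:
            json.dump(properties, f, indent=2)
```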
@philippotto Thanks for your feedback and implementation ideas. I tried to implement the filelock approach, but it does not work for S3 buckets.
I just discussed with @daniel-wer that, under the given circumstances, the implementation of your first proposed solution would be better. @philippotto, do you have another idea for an implementation of a locking mechanism, or would you agree that I should implement solution 1?
I would hope that a filelock implementation would be possible with S3, but I also don't see an easy way to achieve it without changing the entire approach that is used by the file-lock library. Alternatively, one could use a file lock somewhere else (e.g. in the current working directory) that is shared between all workers. For multiprocessing this would be easy (the only downside is that this wouldn't catch independent actors writing to the same DS). However, I don't see how this would generalize to slurm etc. Therefore, I'm fine with doing (1) :)
Great, looks good to me :) Two things:

- The documentation should state somewhere explicitly which restrictions exist for the parallel access of wk datasets (see the sketch below). Namely:
  - When writing shards in parallel, `json_update_allowed` should be set to False to disable the automatic update of the bounding box metadata. Otherwise, race conditions may happen. The user is responsible for updating the bounding box manually.
  - When writing to chunks in shards, one chunk may only be written to by one actor at any time.
  - When writing to compressed shards, one shard may only be written to by one actor at any time.
  - Reading in parallel should be fine.
- See my two review comments.

@normanrz could you double check my restrictions from above?
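A hedged sketch of what such a documentation example could look like, using the `json_update_allowed` name mentioned above; the exact write signature and the shard-splitting helper are assumptions:

```python
# Split the z-range into shard-aligned slabs so that no two workers ever touch
# the same chunk or (compressed) shard.
def z_slabs(z_size: int, shard_z: int):
    for z in range(0, z_size, shard_z):
        yield z, min(z + shard_z, z_size)

def write_slab(mag_view, data, z_start):
    # json_update_allowed=False disables the automatic bbox update; the caller
    # must update the layer's bounding box manually after all workers finished.
    mag_view.write(data, absolute_offset=(0, 0, z_start), json_update_allowed=False)
```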
Ok for now. With Zarr, parallel writes to a shard will no longer be allowed at all. Maybe we should already put such wording in.

Reading in parallel without concurrent writes is ok.
@philippotto Thanks for your review! I added a paragraph about parallel access of Datasets in the Dataset Usage part of the docs.
Excellent, looks good to me :)
That sounds like trouble for some of our applications that do slice-wise operations. What would be the right way to deal with that? Shards with a very small z size so that we can do the parallelization? Or ditching the fine-grained parallelization and writing each of the large chunk files sequentially? Or re-chunking in a second pass? All seem to have drawbacks compared to now.
For the vx align case, picking a small shard-z dimension should be a good option imo, especially because zarr supports non-cubic shards (so x and y can still have larger sizes). After materialization, a recubing would be necessary (not strictly, but good for WK), which is a drawback, but on the other hand a low shard-z size would allow higher parallelization.