[Backend Configuration Va] Basic user documentation #802
Conversation
And likewise for ``AVAILABLE_ZARR_COMPRESSION_METHODS``.

We can confirm these values are saved by re-printing that particular dataset configuration...
Suggested change:
- We can confirm these values are saved by re-printing that particular dataset configuration...
+ We can confirm these values are saved by re-printing that particular dataset configuration:
It's not clear from this that these configurations made it back into the larger object. It's also not clear how you would use this configuration to modify an in-memory NWBFile object. It would also be great to write this file and then show that the written file followed the chunking and compression parameters we chose. This could be done using HDFView or some other app or API.
It's not clear from this that these configurations made it back into the larger object.
This was obscured by the odd injection of all the notes; with those moved, I put the two printouts side by side and added some spacing.
It would also be great to write this file and then show that the written file followed the chunking and compression parameters we chose.
The point the demo here makes is just that they get saved in the ``backend_configuration`` object itself. I can add something to prove that it actually writes the file in that manner (for chunks and compression, anyway; can't prove buffering that way).
Though I will probably just demonstrate it with h5py, since mentioning/showcasing HDFView could lead to annoyance: if a user wanted to try it, they'd need to sign up with the HDF Group to download it. Neurosift doesn't display this information either, though I'm sure you could just request that. We could also use the PyNWB HTML repr after hdmf-dev/hdmf#1100 is merged, but this section is currently for Python script / IPython usage, not specific to a Jupyter notebook.
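For reference, verifying chunking and compression of a written HDF5 file with h5py could look like the following minimal sketch. The file name, dataset path, and settings here are hypothetical, for illustration only; a real NWB file would store data under paths like ``/acquisition/ElectricalSeries/data``.

```python
import h5py
import numpy as np

# Write a small HDF5 file with explicit chunking and compression
# (file and dataset names are hypothetical, for illustration only)
with h5py.File("example.h5", "w") as file:
    file.create_dataset(
        "data",
        data=np.random.rand(1000, 64),
        chunks=(100, 64),        # chunk shape we want to verify later
        compression="gzip",
        compression_opts=4,
    )

# Re-open the written file and confirm the settings were applied on disk
with h5py.File("example.h5", "r") as file:
    dataset = file["data"]
    print(dataset.chunks)            # (100, 64)
    print(dataset.compression)       # 'gzip'
    print(dataset.compression_opts)  # 4
```

The same ``chunks`` / ``compression`` properties can be read from any dataset in the file, so this check works on files written by NeuroConv as well.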
Added a section to the FAQ for how to check that customizations (or the default for that matter) actually configure the item correctly in the resulting file
OK great, could you please add a section on writing the data and confirming (somehow) that the data was written with the proper settings?
I did; it's the last section of the FAQ. It uses the file written in the preceding section.
OK, that section of the FAQ looks great, but I still feel like it would be nice to have another snippet here showing how to apply the configs to an in-memory NWBFile object and write to disk.
I don't understand what you mean here - what are you imagining and how is it different from the preceding section?
Co-authored-by: Ben Dichter <[email protected]>
…before and after closer together
**How do I disable chunking and compression completely?**

To completely disable chunking (i.e., 'contiguous' layout), set both ``chunk_shape=None`` and ``compression_method=None``.
Suggested change:
- To completely disable chunking (i.e., 'contiguous' layout), set both ``chunk_shape=None`` and ``compression_method=None``.
+ To completely disable chunking for HDF5 backends (i.e., 'contiguous' layout), set both ``chunk_shape=None`` and ``compression_method=None``. Zarr requires all datasets to be chunked.
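For the HDF5 side of this suggestion, a minimal h5py sketch can confirm that ``chunks=None`` (the h5py default when no compression is requested) really does produce a contiguous layout. The file and dataset names are hypothetical:

```python
import h5py
import numpy as np

# chunks=None without compression yields a contiguous (unchunked) layout
with h5py.File("contiguous.h5", "w") as file:
    dataset = file.create_dataset(
        "data", data=np.random.rand(20, 20, 20), chunks=None
    )
    print(dataset.chunks)       # None -> contiguous, no chunking
    print(dataset.compression)  # None -> no compression
```

Contrast this with Zarr below, where even a "single chunk" array is still stored as a chunked dataset.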
Maybe not that important, and this may be what you meant, but you can pass ``chunks=None`` to zarr and get all the data in one file:

import zarr
import numpy as np

# Create a Zarr group backed by a directory store
zarr_folder = zarr.open_group('test_zarr.zarr', mode='w')

# Create an array and add it to the Zarr group
array_data = np.random.rand(20, 20, 20)
zarr_array = zarr_folder.create_dataset('array', data=array_data, chunks=None)
zarr_array.info
Name               : /array
Type               : zarr.core.Array
Data type          : float64
Shape              : (20, 20, 20)
Chunk shape        : (20, 20, 20)
Order              : C
Read-only          : False
Compressor         : Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0)
Store type         : zarr.storage.DirectoryStore
No. bytes          : 64000 (62.5K)
No. bytes stored   : 56455 (55.1K)
Storage ratio      : 1.1
Chunks initialized : 1/1
I did not check myself (trusted @bendichter on that one). Our config literally just passes the values on to downstream calls, so I assume this would work here, in which case we can remove this line as a minor point (even though ``chunks=None`` is the option, and even though the array says 1/1 chunks, that's all 'equivalent' to contiguous layout in a sense).
Co-authored-by: Ben Dichter <[email protected]>
I haven't had a chance to run through the code. If it all runs as advertised, this looks good to me.
I ran through it all again, fixed a couple of minor things now that the upstream ecephys change is through, and confirmed everything works as advertised. We should keep in the back of our minds how to ensure this into the future, however. Doctest doesn't really work well for this scenario, since we broke up various steps for the sake of readability and doctest needs the variables to be in the same code block. That style works better when using a notebook as a doc base (as the PyNWB tutorials do), but we've not done that yet on NeuroConv.
fix #797