This is a filter for HDF5 that uses the Blosc2 compressor; by installing this filter, you can read and write HDF5 files with Blosc2-compressed datasets.
Take care when using this filter: do not enable HDF5's own shuffle filter; enable shuffling in Blosc2 itself instead. Blosc2 uses a SIMD shuffle internally, which is much faster.
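To make the term concrete, here is a minimal scalar sketch of what a byte shuffle does. It is an illustration only, not Blosc2's SIMD implementation, and the function names byte_shuffle and byte_unshuffle are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar byte shuffle (illustration only, not Blosc2's SIMD code).
 * For n elements of `typesize` bytes each, write byte 0 of every
 * element first, then byte 1 of every element, and so on. */
void byte_shuffle(const unsigned char *src, unsigned char *dst,
                  size_t n, size_t typesize)
{
    assert(src != NULL && dst != NULL && src != dst);
    for (size_t j = 0; j < typesize; j++)       /* byte position in element */
        for (size_t i = 0; i < n; i++)          /* element index */
            dst[j * n + i] = src[i * typesize + j];
}

/* Inverse transform: restore the element-by-element layout. */
void byte_unshuffle(const unsigned char *src, unsigned char *dst,
                    size_t n, size_t typesize)
{
    assert(src != NULL && dst != NULL && src != dst);
    for (size_t j = 0; j < typesize; j++)
        for (size_t i = 0; i < n; i++)
            dst[i * typesize + j] = src[j * n + i];
}
```

Grouping the bytes this way tends to help compression of numeric arrays, because neighbouring values often share their high-order bytes and the shuffled buffer then contains long runs of similar bytes.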
Instead of just linking this Blosc2 filter into your HDF5 application, it is possible to install it as a system-wide HDF5 plugin (with HDF5 1.8.11 or later). This is useful because it allows every HDF5-using program on your system to transparently read Blosc2-compressed HDF5 files.
As described in the HDF5 plugin documentation, you just need to compile the Blosc2 plugin into a shared library and copy it to the plugin directory (which defaults to /usr/local/hdf5/lib/plugin on non-Windows systems).
Following the cmake instructions below produces a libH5Zblosc2.so shared library file (or .dylib/.dll on Mac/Windows), which you can copy to the HDF5 plugin directory.
To write Blosc2-compressed HDF5 files, on the other hand, an HDF5-using program must be specially modified to enable the Blosc2 filter when writing HDF5 datasets, as described below.
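If the plugin cannot be copied to the default directory, HDF5 also consults the HDF5_PLUGIN_PATH environment variable when searching for plugins. The following sketch sets it from inside a program on a POSIX system; the helper name use_hdf5_plugin_dir and the directory in the usage note are hypothetical, and the variable should be set before the HDF5 library initializes, so the safest place to call the helper is at the very start of main.

```c
#define _POSIX_C_SOURCE 200112L   /* exposes setenv() under strict C modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: point HDF5 at a custom plugin directory by
 * setting HDF5_PLUGIN_PATH for the current process.
 * Returns 0 on success and -1 on failure (as setenv() does). */
int use_hdf5_plugin_dir(const char *dir)
{
    assert(dir != NULL && strlen(dir) > 0);
    return setenv("HDF5_PLUGIN_PATH", dir, 1);   /* 1 = overwrite any old value */
}
```

For example, calling use_hdf5_plugin_dir("/opt/hdf5/plugins") before any HDF5 call would make HDF5 look for libH5Zblosc2.so in that directory. Exporting the variable in the shell before starting the program has the same effect and needs no code change.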
Instead of (or in addition to) installing the Blosc2 plugin system-wide as
described above, you can also link the Blosc2 filter directly into your
application. Although this only makes the Blosc2 filter available in
your application (as opposed to other HDF5-using applications), it
is useful in cases where installing the plugin is inconvenient. Compile
the Blosc2 filter as described below, but link libblosc2_filter.a
(generated by make) directly into your program.
In order to register Blosc2 in your HDF5 application, you then need to call a function in blosc2_filter.h, with the following signature:
int register_blosc2(char **version, char **date)
Calling this registers the filter with the HDF5 library and returns information about the Blosc2 release through the version and date pointers.
A non-negative return value indicates success. If the registration fails, an error is pushed onto the current error stack and a negative value is returned.
An example C program ('src/example.c') is included which demonstrates the proper use of the filter.
This filter has been tested against HDF5 versions 1.6.5 through 1.8.10. It is released under the MIT license (see LICENSE.txt for details).
Assuming the filter is installed (either as a system-wide plugin or registered directly in your program as described above), your application can transparently read HDF5 files with Blosc2-compressed datasets. (The HDF5 library will detect that the dataset is Blosc2-compressed and invoke the filter automatically.)
To write an HDF5 file with a Blosc2-compressed dataset, you call the H5Pset_filter function on the property list of the dataset you are creating, and pass FILTER_BLOSC2 (defined in blosc2_filter.h) for the filter_id parameter. In addition, HDF5 only supports compression for "chunked" datasets; this just means that you need to call H5Pset_chunk to specify a chunk size (e.g. 1MB chunks), and the subsequent chunking of the dataset I/O is performed transparently by HDF5.
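Choosing chunk dimensions that land near a target size such as 1 MB is simple arithmetic. The sketch below shows one way to do it for a two-dimensional dataset whose chunks span full rows; the helper chunk_rows is hypothetical and is not part of this filter or of HDF5.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many full rows to put in one chunk so that
 * the chunk is at most `target_bytes`, for rows of `ncols` elements of
 * `elem_size` bytes each. The result is clamped to [1, nrows], so a
 * chunk exceeds the target only when a single row is already larger. */
size_t chunk_rows(size_t nrows, size_t ncols, size_t elem_size,
                  size_t target_bytes)
{
    assert(nrows > 0 && ncols > 0 && elem_size > 0);
    size_t row_bytes = ncols * elem_size;
    size_t rows = target_bytes / row_bytes;   /* rows that fit in the target */
    if (rows < 1)
        rows = 1;                             /* a chunk needs at least one row */
    if (rows > nrows)
        rows = nrows;                         /* never larger than the dataset */
    return rows;
}
```

For a dataset of 10000 x 1024 four-byte values and a 1 MiB target, this gives 256 rows per chunk. The resulting pair {rows, ncols} is what you would store in the hsize_t array passed to H5Pset_chunk.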
The filter consists of a single 'src/blosc2_filter.c' source file and
'src/blosc2_filter.h' header, which will need the Blosc2 library
installed to work. It is simplest to just use the provided cmake
build scripts, which compile both the filter and the Blosc2 library
into a library for you.
Assuming you have cmake and other standard Unix build tools installed, do:
    mkdir build
    cd build
    cmake ..
    make
This generates the library/plugin files required above in the build
directory.
See THANKS.rst.
Enjoy data!