R/W Interface examples
Mantid has utility functions for reading and writing standard strings and vectors (std::string, std::vector), in addition to std::map for attributes. These can be seen here: https://github.com/mantidproject/mantid/blob/master/Framework/DataHandling/src/H5Util.cpp
The following is a suggestion for how a C++ interface to HDF5 data could be used. Each line of importance is commented individually after the complete suggestion. Note that all names are also suggestions.
#include <iostream>
#include <vector>
#include "hdf5"
int main() {
    hdf5::File hdf5File("some_file_name.h5");
    hdf5::Group someGroup(hdf5File, "name_of_group");
    std::vector<hdf5::hsize_t> dimensions = {2, 2};
    hdf5::DataSet<double> someDataSet(someGroup, "name_of_data_set", dimensions);
    std::vector<double> someData = {1, 2, 3, 4};
    someDataSet({{0, 1}, {0, 1}}) = someData;
    std::cout << someDataSet << std::endl;
    return 0;
}
hdf5::File hdf5File("some_file_name.h5");
This is pretty straightforward. What is not shown here is that there needs to be an interface for setting "additional parameters" when opening a file, for example read-only mode or SWMR mode. One way this could be accomplished is with the Boost Parameter Library.
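To make the idea of "additional parameters" more concrete, here is a minimal sketch that uses a plain options object with fluent setters instead of the Boost Parameter Library. Everything in it is an assumption made for illustration: the sketch namespace, AccessMode, OpenOptions, readOnly() and withSwmr() are hypothetical names, and the File class is a stand-in that only records the requested settings without touching HDF5.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical access modes; these names are assumptions, not an existing API.
enum class AccessMode { ReadOnly, ReadWrite };

// Hypothetical bundle of "additional parameters" with default values.
struct OpenOptions {
  AccessMode mode = AccessMode::ReadWrite;
  bool swmr = false;

  // Fluent setters, so that call sites read like named parameters.
  OpenOptions &readOnly() {
    mode = AccessMode::ReadOnly;
    return *this;
  }
  OpenOptions &withSwmr(bool enabled = true) {
    swmr = enabled;
    return *this;
  }
};

// Stand-in for hdf5::File: it only records how the file should be opened.
class File {
public:
  explicit File(std::string name, OpenOptions options = OpenOptions())
      : name_(std::move(name)), options_(options) {
    assert(!name_.empty()); // a file needs a name
  }

  const std::string &name() const { return name_; }
  AccessMode mode() const { return options_.mode; }
  bool swmr() const { return options_.swmr; }

private:
  std::string name_;
  OpenOptions options_;
};

} // namespace sketch
```

With this shape, the simple call sketch::File file("some_file_name.h5"); keeps working unchanged, while sketch::File file("some_file_name.h5", sketch::OpenOptions().readOnly().withSwmr()); opts in to the extra settings. Boost Parameter would give a similar call-site feel with keyword arguments, at the cost of an extra dependency.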
hdf5::Group someGroup(hdf5File, "name_of_group");
I think it is important that, in order to create a group or a dataset inside another group, you pass the parent to the constructor. This matters because it allows for easy inheritance if a user wants to extend the functionality of, for example, the Group class.
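The inheritance argument can be illustrated with a small sketch. The File and Group classes below are in-memory stand-ins that only track a path, and DetectorGroup is a hypothetical user extension; none of these names or the path layout are part of an existing library.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace sketch {

// Stand-in for hdf5::File; it only knows that it is the root of the hierarchy.
class File {
public:
  explicit File(std::string name) : name_(std::move(name)) {}
  std::string path() const { return "/"; }

private:
  std::string name_;
};

// Stand-in for hdf5::Group: the parent is always given in the constructor.
class Group {
public:
  Group(const File &parent, const std::string &name)
      : path_(parent.path() + name) {
    assert(!name.empty());
  }
  Group(const Group &parent, const std::string &name)
      : path_(parent.path() + "/" + name) {
    assert(!name.empty());
  }
  virtual ~Group() = default;

  const std::string &path() const { return path_; }

private:
  std::string path_;
};

// Hypothetical user extension: it forwards the parent to the base class
// and adds its own state on top.
class DetectorGroup : public Group {
public:
  DetectorGroup(const Group &parent, int detectorId)
      : Group(parent, "detector_" + std::to_string(detectorId)),
        id_(detectorId) {}

  int id() const { return id_; }

private:
  int id_;
};

} // namespace sketch
```

Because the base class receives its parent at construction, the derived class is fully placed in the hierarchy as soon as its constructor returns, and no separate "attach to parent" step is needed.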
hdf5::DataSet<double> someDataSet(someGroup, "name_of_data_set", dimensions);
Once again, the parent is given as a parameter in the constructor in order to set the location of the dataset in the HDF5 hierarchy. The dimensions are given as a std::vector. I have not set a cache size or a chunk size here; I think the library should choose sensible default values for anything that is not set explicitly. This goes for other parameters as well.
someDataSet({{0, 1}, {0, 1}}) = someData;
This line illustrates functionality that would be nice to have in the long term. A data set can be accessed using the ()-operator. In this case the operator has the following interface:
InterObj<dType> operator()(std::vector<std::pair<size_t, size_t>>)
I am only considering accessing the elements in a data set using ranges as I think this is the most common use case. For different access types, you would have to be more explicit about which elements to access (e.g.):
someDataSet({Point({0, 0}), Point({0, 1})}) = someData;
The ()-operator returns an intermediary object, i.e. InterObj<dType>. This object in turn defines the =-operator for setting the actual data in the dataset. The interface could look something like the following:
InterObj operator=(std::vector<dType> data)
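The proxy idea described above can be sketched with an in-memory stand-in for the data set. This is only an illustration under stated assumptions: the ranges are treated as inclusive (which is what the 2x2 example with {0, 1} suggests), the storage is row-major, and the write() member, the Range alias and the sketch namespace are hypothetical names. No HDF5 calls are made.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace sketch {

// Assumption: a range is inclusive, i.e. {0, 1} selects indices 0 and 1.
using Range = std::pair<std::size_t, std::size_t>;

template <class dType> class DataSet;

// Intermediary object returned by DataSet::operator().
template <class dType> class InterObj {
public:
  InterObj(DataSet<dType> &dataSet, std::vector<Range> ranges)
      : dataSet_(dataSet), ranges_(std::move(ranges)) {}

  // Assigning a vector writes it into the selected region.
  InterObj &operator=(const std::vector<dType> &data) {
    dataSet_.write(ranges_, data);
    return *this;
  }

private:
  DataSet<dType> &dataSet_;
  std::vector<Range> ranges_;
};

// In-memory stand-in for hdf5::DataSet with row-major storage.
template <class dType> class DataSet {
public:
  explicit DataSet(std::vector<std::size_t> dimensions)
      : dimensions_(std::move(dimensions)), values_(elementCount(dimensions_)) {}

  InterObj<dType> operator()(std::vector<Range> ranges) {
    return InterObj<dType>(*this, std::move(ranges));
  }

  // Hypothetical helper used by InterObj; copies data into the selection.
  void write(const std::vector<Range> &ranges, const std::vector<dType> &data) {
    assert(ranges.size() == dimensions_.size());
    std::size_t count = 1;
    std::vector<std::size_t> index(ranges.size());
    for (std::size_t d = 0; d < ranges.size(); ++d) {
      assert(ranges[d].first <= ranges[d].second);
      assert(ranges[d].second < dimensions_[d]);
      count *= ranges[d].second - ranges[d].first + 1;
      index[d] = ranges[d].first;
    }
    assert(data.size() == count);
    for (std::size_t n = 0; n < data.size(); ++n) {
      values_[flatten(index)] = data[n];
      // Advance the multi-index, last dimension fastest.
      for (std::size_t d = ranges.size(); d-- > 0;) {
        if (index[d] < ranges[d].second) {
          ++index[d];
          break;
        }
        index[d] = ranges[d].first;
      }
    }
  }

  const std::vector<dType> &values() const { return values_; }

private:
  static std::size_t elementCount(const std::vector<std::size_t> &dimensions) {
    std::size_t count = 1;
    for (std::size_t extent : dimensions)
      count *= extent;
    return count;
  }

  std::size_t flatten(const std::vector<std::size_t> &index) const {
    std::size_t offset = 0;
    for (std::size_t d = 0; d < index.size(); ++d)
      offset = offset * dimensions_[d] + index[d];
    return offset;
  }

  std::vector<std::size_t> dimensions_;
  std::vector<dType> values_;
};

} // namespace sketch
```

With these definitions, dataSet({{0, 1}, {0, 1}}) = someData; compiles and fills the whole 2x2 data set, and a narrower selection such as {{1, 1}, {0, 1}} writes only the second row. A real implementation would translate the ranges into an HDF5 selection rather than looping over indices.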
std::cout << someDataSet << std::endl;
This is yet another thing that would be nice to have in the long term. For printing the data, a function is defined that does this for you. See below:
template<class dType>
std::ostream &operator<<(std::ostream &os, DataSet<dType> const &dSet) {
...
}
A similar function would of course have to be defined for the InterObj class as well.
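As a rough illustration of what such a printing function might do, the sketch below streams an in-memory stand-in for the data set, one row of the last dimension per line. The output layout, the toString() helper and the stand-in DataSet class (which simply holds its dimensions and values) are assumptions made for this example; the body of the suggested operator above is left open on purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// In-memory stand-in for hdf5::DataSet: dimensions plus row-major values.
template <class dType> class DataSet {
public:
  DataSet(std::vector<std::size_t> dimensions, std::vector<dType> values)
      : dimensions_(std::move(dimensions)), values_(std::move(values)) {
    std::size_t count = 1;
    for (std::size_t extent : dimensions_)
      count *= extent;
    assert(values_.size() == count);
  }

  const std::vector<std::size_t> &dimensions() const { return dimensions_; }
  const std::vector<dType> &values() const { return values_; }

private:
  std::vector<std::size_t> dimensions_;
  std::vector<dType> values_;
};

// Assumed layout: values separated by spaces, with a line break after each
// run of the last dimension and no trailing separator.
template <class dType>
std::ostream &operator<<(std::ostream &os, DataSet<dType> const &dSet) {
  const std::size_t rowLength =
      dSet.dimensions().empty() ? 1 : dSet.dimensions().back();
  const std::vector<dType> &values = dSet.values();
  for (std::size_t i = 0; i < values.size(); ++i) {
    os << values[i];
    const bool last = (i + 1 == values.size());
    const bool endOfRow = ((i + 1) % rowLength == 0);
    if (!last)
      os << (endOfRow ? '\n' : ' ');
  }
  return os;
}

// Hypothetical convenience helper built on top of the stream operator.
template <class dType> std::string toString(DataSet<dType> const &dSet) {
  std::ostringstream os;
  os << dSet;
  return os.str();
}

} // namespace sketch
```

For a real data set the operator would first have to read the values from the file, so it may be worth limiting the output for large data sets.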