Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDF5 Data reader enhancements #2328

Draft
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

bvanessen
Copy link
Collaborator

Added a check in the HDF5 data reader to check that the metadata for
each field actually matches the dimensions of the data fields.

Added a helper function for conduit to allow the calculation of a product of a data array's elements.

Comment on lines 363 to 365
int n_elts = node[pathname].dtype().number_of_elements();
conduit::int64_array data_array_dims = metadata[HDF5_METADATA_KEY_DIMS].value();
auto expected_n_elts = data_array_prod(data_array_dims);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int n_elts = node[pathname].dtype().number_of_elements();
conduit::int64_array data_array_dims = metadata[HDF5_METADATA_KEY_DIMS].value();
auto expected_n_elts = data_array_prod(data_array_dims);
int const n_elts = node[pathname].dtype().number_of_elements();
conduit::int64_array const data_array_dims = metadata[HDF5_METADATA_KEY_DIMS].value();
auto const expected_n_elts = data_array_prod(data_array_dims);

Comment on lines 29 to 45
namespace conduit {

template <typename T>
T
data_array_prod(DataArray<T> a)
{
T res = 1;
for(index_t i = 0; i < a.number_of_elements(); i++)
{
const T &val = a.element(i);
res *= val;
}

return res;
}

} // conduit
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
namespace conduit {
template <typename T>
T
data_array_prod(DataArray<T> a)
{
T res = 1;
for(index_t i = 0; i < a.number_of_elements(); i++)
{
const T &val = a.element(i);
res *= val;
}
return res;
}
} // conduit
namespace lbann {
template <typename T>
T
data_array_prod(DataArray<T> const& a)
{
T res = 1;
for(index_t i = 0; i < a.number_of_elements(); i++)
{
const T &val = a.element(i);
res *= val;
}
return res;
}
} // namespace lbann

IMO we shouldn't be extending their namespace; just put it in lbann. Also, conduit::DataArray doesn't own any data, so it's just a "shallow copy" (pointer and size), but the "expected" API would probably be pass-by-const-reference. (I'm moderately surprised that they don't expose begin() and end() functions (and suitable iterators) for this container!)

Comment on lines 368 to 375
LBANN_WARNING("Ingesting sample field ",
pathname,
" for sample ",
sample_name,
" where the dimensions in the metadata don't match the actual field: ",
expected_n_elts,
" != ",
n_elts);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to highlight in the output which is the observed vs the expected value -- shouldn't have to look in source code to decipher warnings...

each field actually matches the dimensions of the data fields.

Added a helper function for conduit to allow the calculation of a
product of a data array's elements.
@bvanessen bvanessen force-pushed the enhance_hdf5_reader branch from 903e928 to 90cb6e7 Compare January 26, 2024 22:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants