Describe the bug
When an IntCategory axis with growth=False and overflow=False is filled with values that fall outside the defined bins, the result can be 0 for the entire histogram, at least when a StrCategory axis and Weight storage are also used. This should be a valid combination, and it is useful in certain contexts to 'ignore' invalid values without masking them explicitly. The same behavior appears directly when using the named versions of these axis types in scikit-hep/hist, and it produces a stranger error in dask_histogram.
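To make the expected behavior concrete, here is a minimal pure-Python reference model of what such a fill should yield. It is a sketch under stated assumptions, not boost-histogram's implementation and not the original reproducer: `expected_fill`, the category lists, and the sample data are all hypothetical, and it assumes both axes are fixed (no growth, no overflow), so any entry with an undefined category is simply dropped.

```python
from collections import defaultdict


def expected_fill(int_categories, str_categories, ints, strs, weights):
    """Hypothetical reference model (not the library's code).

    Accumulates weights per (int, str) bin and silently drops any entry
    whose value is not a defined category on either axis, which is the
    behavior the report expects from growth=False, overflow=False.
    """
    int_set, str_set = set(int_categories), set(str_categories)
    bins = defaultdict(float)
    for i, s, w in zip(ints, strs, weights):
        if i in int_set and s in str_set:
            bins[(i, s)] += w
    return dict(bins)


# Hypothetical sample data: 99 and -5 are not defined integer categories.
int_cats = [1, 2, 3]
str_cats = ["a", "b"]
ints = [1, 2, 3, 99, -5, 2]
strs = ["a", "b", "a", "a", "b", "b"]
weights = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

# Unmasked fill: out-of-range entries are expected to be ignored.
unmasked = expected_fill(int_cats, str_cats, ints, strs, weights)

# Explicitly masked fill: drop the out-of-range entries before filling.
keep = [i in int_cats for i in ints]
masked = expected_fill(
    int_cats,
    str_cats,
    [i for i, k in zip(ints, keep) if k],
    [s for s, k in zip(strs, keep) if k],
    [w for w, k in zip(weights, keep) if k],
)

# Both routes should agree, and neither total should collapse to zero.
print(unmasked == masked, sum(unmasked.values()))
```

A histogram whose sum is 0 after the unmasked fill, while the masked fill gives the non-zero total, is the discrepancy this report describes.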
Steps to reproduce
The reproducer also covers hist and dask_histogram, to demonstrate some of the discrepancies.
This produces the following output, where the unmasked fill into the histogram without an overflow on the IntCategory produces a sum of 0 for (boost-)hist, and a junk-like value in DaskHistogram (presumably the partitioned fill plays an additional role here).
This bug was reproduced with boost-histogram 1.4.1 on an M1 Max chip running macOS 14.4; it was originally discovered on AlmaLinux8 (on Fermilab's LPC cluster) using a coffea container. The other principal libraries used are a mix of development versions, mostly corresponding to the latest git version from approximately a month ago, and can be provided if necessary. Given the reproduction across different environments, it seems to be a relatively 'stable' bug.
NJManganelli changed the title from "[BUG]" to "[BUG] Histogram with IntCategory and StrCategory axes can be zeroed by out-of-bounds fill values." on Sep 14, 2024
I don't have the time to build a reproducer upstream in boost::histogram, but I'd expect the bug is present there too, so I'm posting this report here as much for user visibility as because I lack a strict reproducer there.
Other principal libraries used are a mix of development versions corresponding mostly to the latest git version from approximately a month ago, and can be provided if necessary, but given the reproduction across different environments, it seems to be a relatively 'stable' bug
@NJManganelli Just for developer time savings, can you provide the minimal install list required to reproduce this? (So the high level requirements.txt that you would pip install from, not the output of pip list.)