[BUG] Variances in Weighted Histogram can become negative (or nan) if scaled by too large integer #964

Superharz · 2024-10-15T23:17:38Z

Dividing a weighted histogram by a large scalar integer can result in negative (or nan) variances. This only happens when the scalar is an integer; dividing by the same value as a float gives the correct result.

Code to reproduce:

import boost_histogram as bh
hist = bh.Histogram(bh.axis.Regular(2,0,1), storage=bh.storage.Weight())
x = [0,1]
weight = [10.0,10.0]
hist.fill(x, weight=weight)

print(hist.values(), hist.variances())
#>>> [10.  0.] [100.   0.]

hist_2 = hist / (123456789)
print(hist_2.values(), hist_2.variances())
#>>> [8.10000007e-08 0.00000000e+00] [-5.68861947e-08 -0.00000000e+00]

hist_3 = hist / float(123456789)
print(hist_3.values(), hist_3.variances())
#>>> [8.10000007e-08 0.00000000e+00] [6.56100012e-15 0.00000000e+00]

Observed behavior:

The variance in hist_2 turns negative.

Dividing by 2**N with N>15 results in variances of [inf, nan].
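The report does not state a cause. One reading that matches the numbers above is 32-bit integer wraparound: if the scalar is converted to a NumPy int32 array (the default integer type on Windows with NumPy 1.x) and then squared to rescale the variances, the square overflows silently. That this is what happens inside boost-histogram is an assumption; the sketch below only reproduces the arithmetic with plain NumPy.

```python
import numpy as np

# Assumption: the integer scalar ends up as an int32 array and is squared
# before the variances are divided by it. This mimics that step only.
scalar = np.array([123456789], dtype=np.int32)
squared = scalar * scalar  # int32 array multiplication wraps without a warning
print(squared)             # [-1757895751]
print(100.0 / squared)     # [-5.68861947e-08], the negative variance from hist_2

# 2**16 squared is 2**32, which wraps to exactly 0 in int32.
zero_sq = np.array([2**16], dtype=np.int32) * np.array([2**16], dtype=np.int32)
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = np.array([100.0, 0.0]) / zero_sq  # 100/0 -> inf, 0/0 -> nan
print(zero_sq, ratio)      # [0] [inf nan]
```

If this reading is right, any integer whose square exceeds the int32 range would be affected, and N>15 is exactly the point where the square of 2**N wraps to zero.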

Expected behavior:

The variance with a weight of 10 after dividing by 123456789 should be the one from hist_3, i.e. 100 / 123456789**2 ≈ 6.561e-15.

Workaround:

Cast the scalar to a float before dividing (as is done for hist_3).

IMHO this should happen automatically or a warning should be given to the user.

Version:

Windows 10, 64 bit, AMD Ryzen 3800X
Python 3.12.7
boost-histogram 1.5.0
numpy 1.26.4
