[BUG] Variances in Weighted Histogram can become negative (or nan) if scaled by too large integer #964

Open
Superharz opened this issue Oct 15, 2024 · 0 comments

Dividing a weighted histogram by a large scalar integer can result in negative (or nan) variances. This only happens with integers.

Code to reproduce:

import boost_histogram as bh
hist = bh.Histogram(bh.axis.Regular(2,0,1), storage=bh.storage.Weight())
x = [0,1]
weight = [10.0,10.0]
hist.fill(x, weight=weight)

print(hist.values(), hist.variances())
#>>> [10.  0.] [100.   0.]

hist_2 = hist / (123456789)
print(hist_2.values(), hist_2.variances())
#>>> [8.10000007e-08 0.00000000e+00] [-5.68861947e-08 -0.00000000e+00]

hist_3 = hist / float(123456789)
print(hist_3.values(), hist_3.variances())
#>>> [8.10000007e-08 0.00000000e+00] [6.56100012e-15 0.00000000e+00]

Observed behavior:

The variance in hist_2 turns negative.

Dividing by 2**N with N > 15 results in variances of [inf, nan].
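The reported numbers are consistent with the squared divisor wrapping around in signed 32-bit integer arithmetic before the variances are divided by it. This is an inference from the output only; the issue does not confirm where in boost-histogram or NumPy this would happen. A minimal pure-Python sketch under that assumption (the helper `wrap_int32` is hypothetical and only emulates the wraparound):

```python
def wrap_int32(n: int) -> int:
    """Interpret n modulo 2**32 as a signed 32-bit integer (emulated overflow)."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n


divisor = 123456789
variance = 100.0  # variance of the filled bin before scaling

# Assumption: the variance is divided by divisor**2 computed in int32.
wrapped_square = wrap_int32(divisor * divisor)  # -1757895751
bad_variance = variance / wrapped_square        # negative, about -5.6886e-08

# With a float divisor the square does not wrap.
good_variance = variance / float(divisor) ** 2  # about 6.561e-15

# For 2**N with N > 15 the wrapped square is exactly 0, so the division
# becomes x / 0 (inf for the filled bin, nan for the empty one).
zero_square = wrap_int32((2**16) ** 2)          # 0

print(wrapped_square, bad_variance, good_variance, zero_square)
```

Under this assumption the sketch reproduces both the negative variance of hist_2 and the N > 15 threshold, since (2**15)**2 = 2**30 still fits in 32 bits while (2**16)**2 = 2**32 does not.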

Expected behavior:

Dividing by the integer 123456789 should give the same variances as hist_3, i.e. 100 / 123456789**2 ≈ 6.561e-15 for the filled bin.

Workaround:

Cast the scalar to a float before dividing (as is done for hist_3).

IMHO this should happen automatically or a warning should be given to the user.
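Until something like that exists, the cast can be wrapped in a small helper on the user side. This is only a sketch: `divide_by_scalar` is a hypothetical name, not part of boost-histogram, and it relies on nothing beyond the `/` operator of the object passed in.

```python
import numbers


def divide_by_scalar(hist, scalar):
    """Divide hist by scalar, converting integer scalars to float first.

    Hypothetical user-side helper: it only applies the workaround from
    this issue and works with any object that supports the / operator.
    """
    if isinstance(scalar, numbers.Integral):
        scalar = float(scalar)
    return hist / scalar
```

With the histogram from the reproduction, `divide_by_scalar(hist, 123456789)` takes the same path as `hist / float(123456789)`, i.e. the hist_3 case.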

Version:

  • Windows 10, 64 bit, AMD Ryzen 3800X
  • Python 3.12.7
  • boost-histogram 1.5.0
  • numpy 1.26.4