randomInt(min, max) has biased distribution #2720
Replies: 9 comments
-
Thanks for bringing this up. If anyone is interested in working out a solution for |
-
This comment is for the uniform distribution. I haven't looked at how to avoid this problem for other distributions. I think the code at the end of the blog post can be used as the basis for an implementation. The question would be: how large a range do we require? (According to the docs Is 2^51 large enough? Do we want to add BigNumber support to |
-
So far the random functions only support numbers. Adding support for BigNumbers would be great, but apart from that it would be nice to improve the implementation for regular numbers. Would you be interested in giving this a try, Dimitri?
-
Sure! Let me get acquainted with the error handling strategy used in the library and also how to write code that will work both in the browser with |
-
Cool! If you have any questions just ask. For crypto you will probably have to create a switch to check whether in a browser environment or in node.js, and we should make sure that node.js crypto library isn't bundled with math.js (it's quite large). Maybe there are libraries out there to do this for you, I'm not sure. And also, feel free to refactor code in |
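To make the browser/node.js switch mentioned above concrete, here is a minimal sketch. The helper name `createRandomUint32` is hypothetical and not part of math.js. It assumes `globalThis.crypto.getRandomValues` is available in browsers (and recent node.js versions) and falls back to node's `crypto.randomBytes` otherwise. Note that a plain `require('crypto')` like this may still be picked up by a bundler, so a real implementation would need bundler configuration or a different loading strategy to keep the node.js crypto library out of the bundle.

```javascript
// Hypothetical helper, not part of math.js: returns a function that produces
// a uniformly distributed unsigned 32-bit integer from a cryptographic source.
function createRandomUint32 () {
  const webCrypto = typeof globalThis !== 'undefined' ? globalThis.crypto : undefined

  // Browser (and recent node.js): Web Crypto API
  if (webCrypto && typeof webCrypto.getRandomValues === 'function') {
    const buffer = new Uint32Array(1)
    return () => webCrypto.getRandomValues(buffer)[0]
  }

  // Older node.js (CommonJS): built-in crypto module
  if (typeof require === 'function') {
    const nodeCrypto = require('crypto')
    return () => nodeCrypto.randomBytes(4).readUInt32BE(0)
  }

  throw new Error('No cryptographic random source available')
}
```

The returned function can then serve as the raw bit source for whatever sampling algorithm is built on top of it.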
-
In my view, this interface:
(defined in
Neither of these is true for the normal distribution. A sample from the normal distribution can be any value from -infinity to +infinity, and those values are not discrete. Other distributions are discrete but not bounded. For example, the geometric distribution is discrete but not bounded: its samples lie in [1, +infinity). This means the interface defined by In short, I believe we should not have this interface. (

    // create a normal distribution with zero mean and variance 1
    var normal = math.distribution('normal', 0, 1);
    var x = normal.sample(); // it's unlikely but possible that x = -11.2345

    // create a continuous uniform distribution in [2,5)
    var uniform = math.distribution('continuous-uniform', 2, 5);
    var y = uniform.sample(); // it's possible that y = 4.2867

    // create a discrete uniform distribution on the integers 2..5
    var disc = math.distribution('discrete-uniform', 2, 5);
    var z = disc.sample(); // z can only be one of [2,3,4,5]

This makes clear the difference between a continuous uniform distribution and a discrete one. So, I recommend a few breaking changes:
Thoughts?
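For illustration only, the shape of the proposal above could be sketched as a standalone factory. Everything here is an assumption rather than existing math.js code: the function `distribution`, the optional `rng` parameter, and the choice of the Box-Muller transform for the normal case. The discrete uniform case uses plain scale-then-floor for brevity, so it still carries the small bias this thread is about.

```javascript
// Hypothetical factory mirroring the proposed distribution(name, a, b) shape.
// `rng` is any function returning a float in [0, 1); defaults to Math.random.
function distribution (name, a, b, rng = Math.random) {
  switch (name) {
    case 'normal': {
      // a = mean, b = variance; samples via the Box-Muller transform
      const sd = Math.sqrt(b)
      return {
        sample: () => {
          const u1 = 1 - rng() // in (0, 1], avoids log(0)
          const u2 = rng()
          return a + sd * Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2)
        }
      }
    }
    case 'continuous-uniform':
      // real-valued samples, nominally in [a, b)
      return { sample: () => a + (b - a) * rng() }
    case 'discrete-uniform':
      // integer samples in {a, a+1, ..., b}; scale-then-floor, slightly biased
      return { sample: () => a + Math.floor((b - a + 1) * rng()) }
    default:
      throw new Error('Unknown distribution: ' + name)
  }
}
```

Each call returns an object with a single `sample()` method, so bounded, unbounded, discrete and continuous distributions all share one interface without promising a min/max range.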
-
Thanks, your suggestion for a new API makes sense to me; it sounds good and simple (from a usage point of view). Because the idea of I'm not sure what's the easiest approach to realize this - maybe start from scratch and build a completely new implementation of |
-
@josdejong |
-
Thanks for the update, Dimitri. I fully understand :D no worries
-
The final definition of randomInt uses a scale-then-floor algorithm to fit the distribution to the range:

This scaling introduces a tiny bias in the resulting distributions. If N random bits of precision are used in distribution(), this bias is only of order 2^(-N), but it may be a problem for some scientific and crypto applications.

I have now seen this problem everywhere, so I wrote a blog post to explain the details.

You may want to consider giving a precision parameter to the definition of distribution and then performing the scaling at that level to make sure the bias can always be made as small as desired. (A different solution for uniform distributions is presented in the blog post.)
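To illustrate where the bias comes from, here is a small sketch with a deliberately tiny random source. The helper names `bucketCounts` and `unbiasedInt` are hypothetical and are not taken from math.js or from the blog post. The first function counts how scale-then-floor spreads a source of `sourceSize` equally likely values over `range` buckets. The second shows rejection sampling, one common way to remove the bias for uniform integers, which may or may not match the blog post's solution.

```javascript
// Hypothetical helpers for illustration; not part of math.js.

// Scale-then-floor: map each of `sourceSize` equally likely source values
// onto one of `range` buckets and count how many values land in each bucket.
function bucketCounts (sourceSize, range) {
  const counts = new Array(range).fill(0)
  for (let v = 0; v < sourceSize; v++) {
    counts[Math.floor((v / sourceSize) * range)]++
  }
  return counts
}

// Rejection sampling: `nextValue()` must return a uniform integer in
// [0, sourceSize). Values in the incomplete last block are discarded, so
// every result in [0, range) is produced by the same number of source values.
function unbiasedInt (range, sourceSize, nextValue) {
  if (!Number.isInteger(range) || range < 1 || range > sourceSize) {
    throw new RangeError('range must be an integer in [1, sourceSize]')
  }
  const limit = sourceSize - (sourceSize % range) // largest multiple of range <= sourceSize
  let v
  do {
    v = nextValue()
  } while (v >= limit)
  return v % range
}

// With 3 random bits (8 source values) and 3 buckets, the buckets are uneven:
// bucketCounts(8, 3) gives [3, 3, 2]
```

With N bits the same unevenness remains, only smaller: some buckets receive one more source value than others, which matches the order 2^(-N) bias described above. The rejection loop trades that bias for a variable number of draws per sample.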