-
Notifications
You must be signed in to change notification settings - Fork 228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Index out of bounds in MergingDigest #107
Comments
this may be out of date; i just ran it into current master and it passes. |
Thanks for checking.
(and what a huge relief)
…On Wed, Sep 5, 2018 at 4:00 PM dariomx ***@***.***> wrote:
this may be out of date; i just ran it into current master and it passes.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#107 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAPSeivTS7IIUfOyFZoN4KrkAKIenbXmks5uYFeTgaJpZM4T0JWi>
.
|
I'm not sure I'd say this is out of date, per se. The above code snippet runs OK with the current master version. However, the latest release version (AFAICT 3.2, released August 2017) will crash with these inputs, and we've seen similar failures in production systems with different inputs. Is there a plan to release a new version of the library to maven central? It's a really nice library, so we're keen to continue using it and would prefer not to have to run our own internal fork! |
Yes.
I will be releasing a new version before long.
There are a number of improvements to be had, particularly in terms of
accuracy and predictability.
Would it help you if I released a patch release before the larger release?
…On Mon, Nov 26, 2018 at 3:37 PM Richard North ***@***.***> wrote:
I'm not sure I'd say this is out of date, per se. The above code snippet
runs OK with the current master version. However, the latest release
version (AFAICT 3.2, released August 2017) will crash with these inputs,
and we've seen similar failures in production systems with different inputs.
Is there a plan to release a new version of the library to maven central?
It's a really nice library, so we're keen to continue using it and would
prefer not to have to run our own internal fork!
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#107 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAPSekutHVlNN2nRukJ7XZW_yO36Jag7ks5uzArBgaJpZM4T0JWi>
.
|
A compression factor of 1 is pretty extreme, isn't it? I wouldn't understand the use of any value less than, say, 10 or 20. |
Thanks for such a quick response. Yes, a patch release would be greatly appreciated. We have this occurring at a compression factor of 100 (and obviously an entirely different volume of input data points). |
Where is the exception happening? The problem that I just looked at had to do with the bufferSize getting set too small so that the array copy failed on merging. Where is your problem appearing? Same place? |
In both cases (the repro case above and our error in production) both failed at line 302 of MergingDigest.java, which is the following (in v3.2): System.arraycopy(mean, 0, incomingMean, incomingCount, lastUsedCell); Relevant part of the stack trace:
I'm afraid right now I can't give much more information about the state of the fields at the time this is occurring, aside from saying that we've seen this on one of our busiest services. I shall try and gather some more useful data tomorrow, though, if it helps. |
Is this fixed? |
Should be fixed in master. Let me verify.
…On Tue, May 14, 2019 at 1:55 PM Carotene ***@***.***> wrote:
Is this fixed?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#107?email_source=notifications&email_token=AAB5E6VF3ZB7DNQLN2QF4B3PVMRL7A5CNFSM4E6QSWRKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODVMYLCI#issuecomment-492406153>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAB5E6USGFRC2WK4UCYYWZDPVMRL7ANCNFSM4E6QSWRA>
.
|
Verified. |
Is this fix released and in which version? |
Related exception in 3.2: ERROR in (issue-201-incorrect-result-column-count) (System.java:-2)
Uncaught exception, not in assertion.
expected: nil
actual: java.lang.ArrayIndexOutOfBoundsException: null
at java.lang.System.arraycopy (System.java:-2)
com.tdunning.math.stats.MergingDigest.merge (MergingDigest.java:302)
com.tdunning.math.stats.MergingDigest.mergeNewValues (MergingDigest.java:291)
com.tdunning.math.stats.MergingDigest.quantile (MergingDigest.java:661) This is after do a parallelized aggregation where TDigest.add was called to merge separate thread contexts. It happens only once in a while, not every time which is a bummer for a large aggregation. |
Chris,
It doesn't sound like you have a test case for this, but could you say
which version you are using?
…On Wed, Mar 24, 2021 at 6:45 AM Chris Nuernberger ***@***.***> wrote:
Related NPE I believe:
ERROR in (issue-201-incorrect-result-column-count) (System.java:-2)Uncaught exception, not in assertion.expected: nil actual: java.lang.ArrayIndexOutOfBoundsException: null at java.lang.System.arraycopy (System.java:-2) com.tdunning.math.stats.MergingDigest.merge (MergingDigest.java:302) com.tdunning.math.stats.MergingDigest.mergeNewValues (MergingDigest.java:291) com.tdunning.math.stats.MergingDigest.quantile (MergingDigest.java:661)
This is after do a parallelized aggregation where TDigest.add was called
to merge separate thread contexts. It happens only once in a while, not
every time which is a bummer for a large aggregation.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#107 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAB5E6V2TL2MNEHCXG43TGTTFHUHVANCNFSM4E6QSWRA>
.
|
I have a test case that does this about 3 times out of 100 if I run it in a loop. I am using version 3.2. I will get test with master:HEAD and see if that eliminates/changes the problem. |
I would love to have the test case as well. Not useful as it stands as a
unit case, but very, very useful to use to collect data.
One of the pressures for a new release is that several related issues have
been fixed in the main branch. (note that the master branch got renamed not
long ago)
…On Wed, Mar 24, 2021 at 12:06 PM Chris Nuernberger ***@***.***> wrote:
I have a test case that does this about 3 times out of 100 if I run it in
a loop. I am using version 3.2.
I will get test with master:HEAD and see if that eliminates/changes the
problem.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#107 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAB5E6XBFS2ZCNGAPADEGCLTFIZ25ANCNFSM4E6QSWRA>
.
|
100% main:HEAD fixes the test. I ran it many thousands of times just now and saw zero failures. The test case involves a Clojure datatframe library which means you would be calling Clojure in your unit tests. Are you sure you want the test case :-)? I basically to a N-core reduction across Y datasets followed by This library is fantastic so I am down to help you any way you like. |
OK.
This is yet another vote for a release.
Gonna have to do it even without the serialization stuff that is cooking.
Speaking of which, please weigh in on your thoughts.
https://docs.google.com/document/d/1CX9Colw1g58nLHfJ_IR4XeFHhUaiXGTOG0aKkHpTut4/edit#heading=h.oukzhuq9hijs
…On Wed, Mar 24, 2021 at 12:25 PM Chris Nuernberger ***@***.***> wrote:
100% main:HEAD fixes the test. I ran it many thousands of times just now
and saw zero failures.
The test case involves a Clojure datatframe library which means you would
be calling Clojure in your unit tests. Are you sure you want the test case
:-)? I basically to a N-core reduction across Y datasets followed by
thread0_digest.add(rest-of-digests).
This library is fantastic so I am down to help you any way you like.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#107 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAB5E6WIXOOG24OQNGHM5ILTFI4D3ANCNFSM4E6QSWRA>
.
|
@cnuernber I would like to take you up on your offer. Can you look at the current main branch plus live issues and identify what you would say are blockers for a point release? If none and if this sounds like consensus, I will start the process with what we have. |
Yes, will do. |
I looked through the issues and have them categorized and such. Here are my opinions- In short I think a point release is worth it now to get the fixes for mergetree.
I messaged you on linked in and we can continue offline from there or we can continue the discussion here; whichever way works for you. |
I will close this after the upcoming release. |
(Slight) progress note. I have decoded the new behavior of gpg and interactions with the maven code signing (short answer, much better than before). Should be able to cut a release shortly. |
Hi @cnuernber, could you explain how you solved the bug ? I'm having the same one, really rarely also. |
Well, two ways. Honestly I think this library has been superceded by apache datasketches. What we did before I started work with datasketches was I patched the library and released it on Clojars under a different name. If you are doing work in this area of course I have to recommend our data processing system :-). |
hi,@cnuernber, I saw that your fork just submitted two configurations, how did you solve the current problem?(main...techascent:t-digest:main) |
@cnuernber Interesting to see the work on tech.ml.dataset. It frankly looks a bit like a reinvention of Arrow, but just in Clojure, at least for the columnar in-memory format work. |
Hi,
The following variant of the
testSmallCountQuantile
test results in ajava.lang.ArrayIndexOutOfBoundsException
error. Dropping the last element will have the test pass.The text was updated successfully, but these errors were encountered: