Merge TDigest with different compression factors #221
Comments
The merge can be done, but you can only guarantee that the result meets the
invariant for the weaker compression of the digests being merged.
This is even more complex than it sounds because changes between 3.2 and 3.3
altered the effective meaning of the compression factor.
One way to deal with this is to do the merge and then examine the result to
determine how high the compression factor can still be.
Does this make sense?
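The "merge, then examine the result" idea can be sketched as follows. This is only an illustration under stated assumptions, not the library's code: it assumes the arcsine scale function k(q) = compression / (2π) · asin(2q − 1), the invariant that every non-singleton centroid spans at most 1 unit of k, and unit-weight samples (so a centroid of weight 1 is a singleton). The function name `max_supported_compression` is hypothetical, and because the normalization changed between 3.2 and 3.3, the absolute values it returns are illustrative only.

```python
import math

def max_supported_compression(centroids):
    """Largest compression for which every centroid still satisfies the
    size invariant, under the assumptions stated above.

    centroids: iterable of (mean, weight) pairs.
    """
    pts = sorted(centroids)
    total = sum(w for _, w in pts)
    best = math.inf
    seen = 0.0  # weight of all centroids to the left of the current one
    for _, w in pts:
        q_left = seen / total
        q_right = min(1.0, (seen + w) / total)
        seen += w
        if w <= 1:
            continue  # assumed singleton: always satisfies the invariant
        # k(q_right) - k(q_left) <= 1 rearranges to
        # compression <= 2*pi / (asin(2*q_right - 1) - asin(2*q_left - 1))
        span = math.asin(2 * q_right - 1) - math.asin(2 * q_left - 1)
        best = min(best, 2 * math.pi / span)
    return best
```

If the value returned is below the compression you intended to claim for the merged digest, the merged result only meets the invariant at that lower value.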
On Tue, Sep 24, 2024 at 9:31 PM jameswang2015 wrote:
Sometimes we get a TDigest field with different compression factors (for
example, we changed the compression factor, but some users still run old
versions of the product that use the old compression factor). The current
Merge function can't handle that. Can we extend the Merge function to
support it?
On the other hand, we currently follow these steps to merge it. Would this
make sense?
1. Deconstruct each tdigest into a list of centroids and weights.
2. Unnest the lists of centroids and weights.
3. Merge the centroids and weights into a single tdigest with the
specified compression by using TDIGEST_AGG(m, w, compression)
Thanks.
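The three steps above can be sketched in plain Python. This is a simplified illustration, not the implementation behind TDIGEST_AGG or the library's Merge: it assumes each digest is available as a list of (mean, weight) pairs, uses the arcsine scale function k(q) = compression / (2π) · asin(2q − 1), and greedily merges adjacent centroids while the combined centroid spans at most 1 unit of k. The names `k1`, `recluster`, and `merge_digests` are hypothetical.

```python
import math

def k1(q, compression):
    """Assumed arcsine scale function; q is clamped against rounding error."""
    q = min(1.0, max(0.0, q))
    return compression / (2 * math.pi) * math.asin(2 * q - 1)

def recluster(centroids, compression):
    """Greedily merge sorted (mean, weight) pairs under the k-size bound."""
    pts = sorted(centroids)
    if not pts:
        return []
    total = sum(w for _, w in pts)
    merged = []
    cur_mean, cur_w = pts[0]
    done = 0.0  # total weight of centroids already emitted
    k_left = k1(0.0, compression)
    for mean, w in pts[1:]:
        q_right = (done + cur_w + w) / total
        if k1(q_right, compression) - k_left <= 1.0:
            # Absorb the point: update the weighted mean incrementally.
            cur_w += w
            cur_mean += (mean - cur_mean) * w / cur_w
        else:
            # Emit the current centroid and start a new one.
            merged.append((cur_mean, cur_w))
            done += cur_w
            k_left = k1(done / total, compression)
            cur_mean, cur_w = mean, w
    merged.append((cur_mean, cur_w))
    return merged

def merge_digests(digests, compression):
    """Steps 1-2: flatten all centroid lists; step 3: re-cluster them."""
    flat = [c for d in digests for c in d]
    return recluster(flat, compression)

# Two digests that may have been built with different compression factors.
a = [(float(i), 1.0) for i in range(1000)]
b = [(i + 0.5, 2.0) for i in range(500)]
combined = merge_digests([a, b], compression=100)
```

Total weight is preserved by construction. As noted in the reply above, the accuracy of the result is still limited by the weaker of the input digests, whatever target compression is passed in.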
Thanks @tdunning for the advice! What changed in the effective meaning of the compression factor between 3.2 and 3.3? We observe that the compression factor is now roughly equal to the number of centroids (the size of the means/weights arrays). Does that sound right to you?
Yes, that is about right. But in 3.2, the number of centroids was about
twice as many for the same compression value.