
Question: what is the correct way to measure the compressed size? #9

Open
MikeB2019x opened this issue Oct 29, 2024 · 0 comments

MikeB2019x commented Oct 29, 2024

The compress function returns a bytes object and the original size. But how do you see the new compressed size?
The reason I ask is that I am compressing tuples to compute the normalized compression distance (NCD) on them. Below are some example tuples, each converted to a single string, followed by the results of unishox2 compression as [original size, len(unishox2.compress()[0])].
The challenge is that tuples of identical length all have the same compressed length regardless of differences in their contents (e.g. all zeros). That is unexpected to me, so I'm wondering whether it is because I am measuring the compressed size incorrectly.

[131100, 131100] 131100131100 [12, 8]
[120100, 410100] 120100410100 [12, 8]
[131100, 510100, 800400] 131100510100800400 [18, 11]
[131150, 512100] 131150512100 [12, 8]
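
A minimal sketch of the measurement described above, for anyone who wants to reproduce it. To keep it runnable with only the standard library, `zlib` is swapped in for unishox2 here, so the sizes it prints will not match the numbers in the table. The helper names `join_tuple`, `zlib_size`, and `ncd` are hypothetical, and the commented-out unishox2 variant assumes only what this issue states: that `compress` returns the compressed bytes first and the original size second.

```python
import zlib
from typing import Callable, Iterable


def join_tuple(values: Iterable[int]) -> str:
    """Concatenate the tuple's numbers into one string, as in the table above."""
    return "".join(str(v) for v in values)


def zlib_size(text: str) -> int:
    """Compressed size in bytes: the length of the compressed bytes object.

    zlib is a stand-in for unishox2, not what the issue actually uses.
    """
    return len(zlib.compress(text.encode("utf-8")))


def ncd(x: str, y: str, size: Callable[[str], int] = zlib_size) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy, cxy = size(x), size(y), size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


a = join_tuple([131100, 131100])
b = join_tuple([120100, 410100])

# [original size, compressed size] for each string, mirroring the table's last column.
print([len(a), zlib_size(a)])
print([len(b), zlib_size(b)])
print(ncd(a, b))

# Assumed unishox2 equivalent (not executed here), based on the return
# shape described in this issue: (compressed bytes, original size).
# import unishox2
# unishox2_size = lambda s: len(unishox2.compress(s)[0])
# print(ncd(a, b, size=unishox2_size))
```

Passing the size function as a parameter makes it easy to compare how different compressors behave on the same strings.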

MikeB2019x changed the title from "Question: how do you see the compressed size?" to "Question: what is the correct way to measure the compressed size?" on Nov 1, 2024