-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Crc polynomial #43752
base: feature/storage/content-validation
Are you sure you want to change the base?
Crc polynomial #43752
Conversation
API change check API changes are not detected in this pull request. |
...age-common/src/main/java/com/azure/storage/common/implementation/StorageCrc64Calculator.java
Outdated
Show resolved
Hide resolved
...age-common/src/main/java/com/azure/storage/common/implementation/StorageCrc64Calculator.java
Show resolved
Hide resolved
long b0, b1, b2, b3; | ||
|
||
// Load and XOR data with CRC | ||
b0 = ByteBuffer.wrap(src, pData, 8).order(ByteOrder.LITTLE_ENDIAN).getLong() ^ uCrc0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If additional performance is needed converting the 8 bytes to a long could be done manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! This is looking good. Really only had minor comments and questions.
I think it would be also worth to sync up with @jaschrep-msft and @jalauzon-msft at some point to see if they notice anything with the implementation since they implemented it for .NET and Python.
|
||
private static final long POLY = 0x9A6C9329AC4BC9B5L; | ||
|
||
private static final long[] M_U1 = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking over the .NET and Python implementations:
- .NET: https://github.com/Azure/azure-sdk-for-net/blob/feature/storage/stg96base2/sdk/storage/Azure.Storage.Common/src/Shared/StorageCrc64Calculator.cs
- Python: https://github.com/Azure/azure-sdk-for-python/blob/feature/storage-content-validation/sdk/storage/azure-storage-extensions/azure/storage/extensions/crc64/helpers.h
They both represent these tables as unsigned 64-bit integers, but for us we are using longs, which is slightly different because it allows for negative numbers. We should double check if there is a specific reason unsigned integers were used for these implementations and see if we need to do something similar.
private static final long POLY = 0x9A6C9329AC4BC9B5L; | ||
|
||
private static final long[] M_U1 = { | ||
0x0000000000000000L, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sort of a nitpick, but it would be nice if these tables were ordered in columns of four so it is easier to compare these values with the Python and .NET values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately, Java defaults it to be a single value on each line whenever we build the package. It might be due to some underlying linting mechanism.
b3 >>>= 8; | ||
|
||
// Repeat for remaining bytes | ||
for (int i = 6; i >= 0; i--) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I noticed looking over the .NET implementation: https://github.com/Azure/azure-sdk-for-net/blob/c85cc69742b4a5a325eb2f077cbdaf548e7e6362/sdk/storage/Azure.Storage.Common/src/Shared/StorageCrc64Calculator.cs#L840-L891 was that it did not use any for loops when doing these XORs. I'd prefer what we have for readability but curious if there was a specific reason the .NET implementation was done that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we explicitly unrolled the loop. Whoever built this way back in the day for the service did the perf testing and wanted it that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's good to know. Thanks for confirming. @ibrahimrabab once we get the checksum working fully and some more performance benchmarking in place, let's try unrolling it to see if it makes a difference performance wise in the implementation.
Description
Implementation of CRC64 Polynomial for Content Validation.