Add hashing API for model signing #188
Merged
mihaimaruseac force-pushed the `api-hashing` branch from 6756577 to 5124fe0 on May 27, 2024 13:02
Signed-off-by: Mihai Maruseac <[email protected]>
mihaimaruseac force-pushed the `api-hashing` branch from 5124fe0 to d4d82c5 on May 28, 2024 12:01
Remove bad design patterns around the hashing engine concerns. Signed-off-by: Mihai Maruseac <[email protected]>
This is now ready for review.
laurentsimon approved these changes on May 29, 2024
Thanks!
mihaimaruseac added a commit to mihaimaruseac/model-transparency that referenced this pull request on Jun 3, 2024
Missed this in sigstore#188, but found out I need it when working on sigstore#190. The `serialize_v0`/`serialize_v1` methods all had headers in front of the files, so we need to do that too. Will update usage of header on sigstore#190 shortly. As a benefit, we can simulate hashing a file with a header for the first portion of the file and a sharded hasher for the remainder of the file. Signed-off-by: Mihai Maruseac <[email protected]>
This was referenced Jun 3, 2024
Summary
This is the lowest layer of the model signing API (#172). It only supports computing the digest of a single object, in a flexible way (#140). We add a precomputed hasher to allow benchmarking (it removes startup time and similar overhead from measurements; this will come up later). We add one memory hasher for now, but we might add others later (#13).
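To make the two hasher kinds concrete, here is a minimal sketch of how such a layer could look. The class and method names (`HashEngine`, `SHA256MemoryHasher`, `PrecomputedHasher`, `compute`) are hypothetical and chosen for illustration; they are not necessarily the names used in this PR, and SHA-256 is only an assumed choice of algorithm.

```python
import hashlib
from abc import ABC, abstractmethod


class HashEngine(ABC):
    """Hypothetical interface: computes the digest of a single object."""

    @abstractmethod
    def compute(self) -> bytes:
        """Return the digest as raw bytes."""


class SHA256MemoryHasher(HashEngine):
    """Hashes data held in memory (assumed here to use SHA-256)."""

    def __init__(self, data: bytes = b"") -> None:
        self._hasher = hashlib.sha256(data)

    def update(self, data: bytes) -> None:
        # Feed more bytes into the running hash.
        self._hasher.update(data)

    def compute(self) -> bytes:
        return self._hasher.digest()


class PrecomputedHasher(HashEngine):
    """Returns a digest that was computed elsewhere.

    Useful in benchmarks: callers pay no hashing cost, so the
    surrounding code can be timed in isolation.
    """

    def __init__(self, digest: bytes) -> None:
        self._digest = digest

    def compute(self) -> bytes:
        return self._digest
```

Because both classes share one interface in this sketch, higher layers could swap a real hasher for a precomputed one without any other change.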
The important part of the API is under `hashing/file`. We have 2 ways to hash a file: either completely or by passing a span (a start, end pair) and only hashing the contents within the span. The next-level API will generate the corresponding spans to hash a file in a distributed fashion. This is also useful for models that are loaded in a distributed way: each host is able to hash only the part that it accesses.

Tests are added in this CL; microbenchmarks will follow later.
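The two modes of hashing a file can be sketched with the standard library alone. The function name `hash_file`, its parameters, the half-open `[start, end)` span convention, and the use of SHA-256 are all assumptions made for this illustration, not the API added by this PR.

```python
import hashlib
from typing import Optional


def hash_file(
    path: str,
    start: int = 0,
    end: Optional[int] = None,
    chunk_size: int = 8192,
) -> bytes:
    """Hash the bytes in [start, end) of a file.

    Hypothetical helper: with the defaults it hashes the whole file;
    with a span it hashes only the bytes inside that span.
    """
    if start < 0 or (end is not None and end < start):
        raise ValueError("span must satisfy 0 <= start <= end")

    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        f.seek(start)
        # None means "read until end of file".
        remaining = None if end is None else end - start
        while remaining is None or remaining > 0:
            to_read = chunk_size if remaining is None else min(chunk_size, remaining)
            chunk = f.read(to_read)
            if not chunk:
                break  # Reached end of file before the span ended.
            hasher.update(chunk)
            if remaining is not None:
                remaining -= len(chunk)
    return hasher.digest()
```

Under these assumptions, a host that loads only bytes 1024 to 2047 of a model file would call `hash_file(path, 1024, 2048)` and never touch the rest of the file.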
Drive-by fixes:
- `.gitignore`
- `serialize_test.py` import to work with directory packages. We will remove this later once the API is implemented and code migrated, with testing in the proper places
- `flake8` max line length to 80 to match Google style

Release Note
NONE
Documentation
NONE