Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create in-toto signing payload for single digests. #263

Merged
merged 1 commit into from
Jul 31, 2024

Conversation

mihaimaruseac
Copy link
Collaborator

@mihaimaruseac mihaimaruseac commented Jul 27, 2024

Summary

This converts DigestManifests objects to an in-toto format where the model is identified by its complete digest (as result of the serialization).

For Sigstore signing, this payload can be signed via sign_intoto, producing a Sigstore Bundle as the signature.

CC @susperius for converting manifest to in-toto. This should cover #111, #224, and #248 (first part of the machinery). CC @laurentsimon to make sure I did not mishandle in-toto.

Note: Looking at the goldens, I see that some of the hashes are not properly converted to the one we use in serialization (should always be file-sha256 in the current testing scenario). I was planning to debug that before pushing this work, but given #260 I'll send it now to reduce work duplication and will debug the issue next week.

Note: Disabling C0103 and E1101 in the pylint configuration. The first one is about the name of the class constant, it wants it to be all uppercase, but that goes against established style for class/instance variables. Worse, the second one fails to detect that protobuf generated code has certain attributes/names. This is because pylint does not use .pyi files, which is what protobuf generates, but we can disable it given this missing attribute would be detected by pytype, which works correctly. Maybe when switching linter we'd get rid of this issue.

Release Note

NONE

Documentation

NONE

self.statement = statement.Statement(
subjects=[descriptor],
predicate_type=self.predicate_type,
predicate={"actual_hash_algorithm": digest_algorithm},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps consider adding this as annotation to be aligned with the later PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a predicate. If I move this to an annotation then I'll have to add a useless predicate


predicate_type: Final[str] = "https://model_signing/Digest/v0.1"

def __init__(self, *, digest_hex: str, digest_algorithm: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a metadata dict that can be used as predicate with a default {'unused': True} or something like that. It could be used to provide model information down the road.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add that when needed

laurentsimon
laurentsimon previously approved these changes Jul 31, 2024
],
"predicateType": "https://model_signing/Digest/v0.1",
"predicate": {
"actual_hash_algorithm": "file-sha256"
Copy link
Collaborator

@laurentsimon laurentsimon Jul 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: We could also name this field "serialization_xxx" instead of hash_xxx. Might be more consistent with the codebase (we have distinct hash and serialization code), but more confusing since intoto does not have this concept. On the other hand, I recall Hayden saying he'd like to see cosign and other tools have a dedicated serialization layer, iirc

Note 2: In SLSA we typically use camel case for all field names.

Those are just fyis, feel free to ignore

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, serializationAlgorithm instead of actual_hash_algorithm? I'll do that after the set lands, otherwise due to having to regenerate goldens there will be a ton of conflicts.

model_signing/signing/in_toto.py Show resolved Hide resolved
@@ -0,0 +1,14 @@
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe add the extension .intoto to the file, to give a hint to readers of what the file contains

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me do this after merging all of these, to do it across the board (rebase would be harder otherwise, with a lot of conflicts due to file moves).

Copy link
Contributor

@spencerschrock spencerschrock Aug 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: If you decide to rename, this would go against the naming convention. So maybe .intoto.json ?

If stored in a dedicated file by itself, and not as part of a Bundle, an Envelope SHOULD use the suffix .json.

EDIT: I may be confusing statement and envelope here whoops

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's correct. Yes, it should be .intoto.json

This converts `DigestManifest`s objects to an in-toto format where the
model is identified by its complete digest (as result of the
serialization).

For Sigstore signing, this payload can be signed via `sign_intoto`,
producing a Sigstore `Bundle` as the signature.

Signed-off-by: Mihai Maruseac <[email protected]>
@mihaimaruseac mihaimaruseac merged commit 494dd72 into sigstore:main Jul 31, 2024
20 checks passed
@mihaimaruseac mihaimaruseac deleted the in-toto-digests branch July 31, 2024 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants