Recursively build checksum for SLSA #46

Merged: 3 commits into sigstore:main on Oct 25, 2023

Conversation

mihaimaruseac
Collaborator

Some models (TF SavedModel format) are saved as a directory of assets, not just a single file. Hence, we need special handling to cover them when generating SLSA provenance.

Collaborator

@laurentsimon laurentsimon left a comment

Is this a stable hash? What if you verify on a different OS? Is the order of files stable?
How about using sha256sum $(find . -type f | sort) | sha256sum, which is what Golang does #12?
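
The ordering concern can be sketched in Python. This is only an illustration, not the code in this PR: the function names are hypothetical, paths are sorted byte-wise (roughly what `sort` does under `LC_ALL=C`; other locales may order differently), and the listing format mimics `sha256sum` output for ordinary file names.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hex SHA-256 of one file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def directory_digests(root: Path) -> list[tuple[str, str]]:
    """(relative POSIX path, hex digest) for every file under root.

    Sorting the relative paths byte-wise makes the result independent of
    the order in which the file system happens to list entries.
    Note: is_file() follows symlinks, unlike `find -type f`.
    """
    files = (p for p in root.rglob("*") if p.is_file())
    names = sorted(
        (p.relative_to(root).as_posix() for p in files),
        key=lambda s: s.encode("utf-8"),
    )
    return [(name, file_sha256(root / name)) for name in names]


def directory_sha256(root: Path) -> str:
    """Hash of the sorted "<digest>  ./<path>" listing (assumed format)."""
    listing = "".join(f"{d}  ./{n}\n" for n, d in directory_digests(root))
    return hashlib.sha256(listing.encode("utf-8")).hexdigest()
```

With a fixed sort key and POSIX-style relative paths, two copies of the same directory should produce the same per-file list and the same combined digest, regardless of the order in which the files were created.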

@mihaimaruseac
Collaborator Author

Let me run a verification for both cases. I think double hashing might have issues for models that are just a single file.
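
To make the single-file concern concrete, here is a small Python sketch. It assumes the verifier hashes the artifact bytes directly; the file name and contents are made up for illustration.

```python
import hashlib


def plain_sha256(data: bytes) -> str:
    """Digest a verifier would compute by hashing the file itself."""
    return hashlib.sha256(data).hexdigest()


def listing_sha256(name: str, data: bytes) -> str:
    """Digest of the one-line "<digest>  <name>" listing (double hashing)."""
    line = f"{plain_sha256(data)}  {name}\n"
    return hashlib.sha256(line.encode("utf-8")).hexdigest()


weights = b"example model bytes"  # stand-in for a real model file
direct = plain_sha256(weights)
double = listing_sha256("./pytorch_model.pth", weights)
print(direct == double)  # False: the two schemes hash different inputs
```

If the provenance recorded `double` while the verifier computed `direct`, a single-file model would fail verification even though the file is unchanged.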

@laurentsimon
Collaborator

laurentsimon commented Oct 24, 2023

Good point. Oh, and the slsa-verifier does not understand folder hashing yet, so anything involving folders won't be verifiable for now.

@smeiklej
Collaborator

Silly question, but is this the same hash we get in the model signing library? It would be nice for them to be aligned.

@mihaimaruseac
Collaborator Author

Unfortunately, right now I don't think they are the same. But we should create an issue for this in the future to match.

We'll also need to change the verifier to record that this hash is computed by this library, and we probably need to record this in the provenance as well.

@mihaimaruseac
Collaborator Author

@laurentsimon I actually think this PR works. I just downloaded the artifacts from https://github.com/mihaimaruseac/model-transparency/actions/runs/6632440974 (which was run on my fork just before creating this PR) and unzipped them.

The attestation (which, it seems, also needed to be unzipped) has the following subjects:

# cat multiple.intoto.jsonl | jq -r .payload | base64 -d | jq .subject
[
  {
    "name": "tensorflow_saved_model/fingerprint.pb",
    "digest": {
      "sha256": "7d52a854279e02d473e068dd326fb98a1e905b8ba7e6ec7a3b0f42f5987ca0e2"
    }
  },
  {
    "name": "tensorflow_saved_model/keras_metadata.pb",
    "digest": {
      "sha256": "29c28d5a8128920a1af478da068e338d58cf9306d66c57aee7ee36de11dc6c82"
    }
  },
  {
    "name": "tensorflow_saved_model/variables/variables.data-00000-of-00001",
    "digest": {
      "sha256": "f0add8686ac32726901782837596cda826c9f033591f0602f97bef50ee7add31"
    }
  },
  {
    "name": "tensorflow_saved_model/variables/variables.index",
    "digest": {
      "sha256": "617f47238c089df58c055ce329c73014ce064650899b1c85aadc1066dbbf57ea"
    }
  },
  {
    "name": "tensorflow_saved_model/saved_model.pb",
    "digest": {
      "sha256": "ff7fb63a9f8218737e1effa463940b27c2330f8b2c5f2dfb966e2d2544279bd8"
    }
  }
]
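
The same extraction as the `jq`/`base64` pipeline above can be sketched in Python. The field names `payload` and `subject` come from that command; the function name and the sample envelope are hypothetical.

```python
import base64
import json


def subjects_from_envelope(line: str) -> list[dict]:
    """Decode one line of a .intoto.jsonl file and return its subjects.

    The envelope carries the statement as base64 in "payload"; the decoded
    statement lists the attested artifacts under "subject".
    """
    envelope = json.loads(line)
    statement = json.loads(base64.b64decode(envelope["payload"]))
    return statement["subject"]


# Synthetic envelope for illustration only (not the real attestation).
sample_statement = {
    "subject": [{"name": "model/example.pb", "digest": {"sha256": "ab" * 32}}]
}
sample_line = json.dumps(
    {"payload": base64.b64encode(json.dumps(sample_statement).encode()).decode()}
)
for subject in subjects_from_envelope(sample_line):
    print(subject["name"], subject["digest"]["sha256"])
```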

Verification also succeeds, once you pass the arguments correctly:

mihaimaruseac@ankh:~/src/repos/gosst/slsa-verifier$ go run ./cli/slsa-verifier/{main,verify}.go verify-artifact --provenance-path /tmp/verify/tf/multiple.intoto.jsonl --source-uri github.com/mihaimaruseac/model-transparency /tmp/verify/tf/{fingerprint.pb,keras_metadata.pb,saved_model.pb,variables/*} 
Verified signature against tlog entry index 45180918 at URL: https://rekor.sigstore.dev/api/v1/log/entries/24296fb24b8ad77a11ce42c9f7aa985a05c7d30467a77d14c9d96bddf7b9fa29657f72a86cde7b82
Verified build using builder "https://github.com/slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@refs/tags/v1.9.0" at commit 831b9521692f564156a61998b2478378e7dc6f49
Verifying artifact /tmp/verify/tf/fingerprint.pb: PASSED

Verified signature against tlog entry index 45180918 at URL: https://rekor.sigstore.dev/api/v1/log/entries/24296fb24b8ad77a11ce42c9f7aa985a05c7d30467a77d14c9d96bddf7b9fa29657f72a86cde7b82
Verified build using builder "https://github.com/slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@refs/tags/v1.9.0" at commit 831b9521692f564156a61998b2478378e7dc6f49
Verifying artifact /tmp/verify/tf/keras_metadata.pb: PASSED

Verified signature against tlog entry index 45180918 at URL: https://rekor.sigstore.dev/api/v1/log/entries/24296fb24b8ad77a11ce42c9f7aa985a05c7d30467a77d14c9d96bddf7b9fa29657f72a86cde7b82
Verified build using builder "https://github.com/slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@refs/tags/v1.9.0" at commit 831b9521692f564156a61998b2478378e7dc6f49
Verifying artifact /tmp/verify/tf/saved_model.pb: PASSED

Verified signature against tlog entry index 45180918 at URL: https://rekor.sigstore.dev/api/v1/log/entries/24296fb24b8ad77a11ce42c9f7aa985a05c7d30467a77d14c9d96bddf7b9fa29657f72a86cde7b82
Verified build using builder "https://github.com/slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@refs/tags/v1.9.0" at commit 831b9521692f564156a61998b2478378e7dc6f49
Verifying artifact /tmp/verify/tf/variables/variables.data-00000-of-00001: PASSED

Verified signature against tlog entry index 45180918 at URL: https://rekor.sigstore.dev/api/v1/log/entries/24296fb24b8ad77a11ce42c9f7aa985a05c7d30467a77d14c9d96bddf7b9fa29657f72a86cde7b82
Verified build using builder "https://github.com/slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@refs/tags/v1.9.0" at commit 831b9521692f564156a61998b2478378e7dc6f49
Verifying artifact /tmp/verify/tf/variables/variables.index: PASSED

PASSED: Verified SLSA provenance
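
The local paths passed above differ from the subject names in the attestation, which is consistent with artifacts being matched by digest; that is an inference from this output, not a statement about slsa-verifier internals. A Python sketch of just that digest comparison follows (hypothetical function name; it does not check the signature, the transparency log entry, the builder, or the source).

```python
import hashlib
from pathlib import Path


def matching_subjects(path: Path, subjects: list[dict]) -> list[str]:
    """Names of subjects whose sha256 digest equals the file's digest."""
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return [
        s["name"]
        for s in subjects
        if s.get("digest", {}).get("sha256") == actual
    ]
```

An empty result would mean the file is not covered by the attestation's subject list.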

@mihaimaruseac
Collaborator Author

After unzipping the files from https://github.com/mihaimaruseac/model-transparency/actions/runs/6632442016 we can also verify the single file scenario:

mihaimaruseac@ankh:~/src/repos/gosst/slsa-verifier$ go run ./cli/slsa-verifier/{main,verify}.go verify-artifact --provenance-path /tmp/verify/pt/pytorch_model.pth.intoto.jsonl --source-uri github.com/mihaimaruseac/model-transparency /tmp/verify/pt/pytorch_model.pth
Verified signature against tlog entry index 45172090 at URL: https://rekor.sigstore.dev/api/v1/log/entries/24296fb24b8ad77aba51e418ce36828f790c58a0f1304246a31eaadc35f36c2a0d03aabeb4b9ab07
Verified build using builder "https://github.com/slsa-framework/slsa-github-generator/.github/workflows/generator_generic_slsa3.yml@refs/tags/v1.9.0" at commit 831b9521692f564156a61998b2478378e7dc6f49
Verifying artifact /tmp/verify/pt/pytorch_model.pth: PASSED

PASSED: Verified SLSA provenance

My proposal is to go with this PR for now, since each component of the model has its own provenance. Later we can work on generating provenance for a model as a single entity, matching the hash of the entire model from the signing part.

@haydentherapper @smeiklej what do you think?

Some models (TF SavedModel format) are saved as a directory of assets,
not just a single file. Hence, we need special handling to cover them
when generating SLSA provenance.

Signed-off-by: Mihai Maruseac <[email protected]>
@mihaimaruseac
Collaborator Author

This will stay pending until after #49, at which point we might use a different scheme.

@laurentsimon
Collaborator

laurentsimon commented Oct 25, 2023

I've updated the PR to:

  1. Remove the directory models
  2. Fix the hash generation to be for single-file models
  3. Set permissions: read-all as default permissions

@mihaimaruseac mihaimaruseac merged commit 13812e8 into sigstore:main Oct 25, 2023
7 checks passed