Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Snapshot abstraction #1603

Closed

Conversation

mnm678
Copy link
Contributor

@mnm678 mnm678 commented Oct 4, 2021

This moves generation of snapshot metadata into a new interface so that variations on snapshot generation like snapshot merkle trees in TAP 16 and snapshot RSA accumulators can be added to the interface and more easily swapped out.

This is a draft pr that is missing a few things:

  • impact of changes to snapshot metadata on the fields in timestamp metadata (ie including the merkle tree root hash)
  • better utilization of add_to_snapshot and remove_from_snapshot in the implementation of ManifestSnapshot (for example, maintaining fileinfodict in ManifestSnapshot, rather than re-computing the whole dict every time)
  • more tests

I would appreciate any general feedback about the abstraction design, whether this is a good idea, or thoughts about what should/shouldn't be moved to this snapshot object.

cc @znewman01

mnm678 added 3 commits October 1, 2021 12:53
This design was inspired by the StorageBackendInterface
in securesystemslib.

Signed-off-by: Marina Moore <[email protected]>
This commit moves the implementation of generate_snapshot_metadata
to the snapshot backend. In the future, this implementation
may be simplified by keeping track of the fileinfodict in the
ManifestSnapshot class instead of re-computing these on each run.

Signed-off-by: Marina Moore <[email protected]>
Adds roles to the snapshot backend when they are
written to disk so that the backend may maintain
current snapshot information, and removes delegated
targets from the snapshot backend when they are deleted

TODO: add tests, more uses of add_to_snapshot

Signed-off-by: Marina Moore <[email protected]>
@coveralls
Copy link

Pull Request Test Coverage Report for Build 1305274445

  • 76 of 78 (97.44%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.07%) to 97.203%

Changes Missing Coverage Covered Lines Changed/Added Lines %
tuf/repository_tool.py 12 14 85.71%
Totals Coverage Status
Change from base Build 1294989370: -0.07%
Covered Lines: 4002
Relevant Lines: 4092

💛 - Coveralls

Copy link

@znewman01 znewman01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the direction you're going in, I'd just like you to go further :)

Specifically, I'd like to see the verification procedures moved in as well, and in general a move towards a more OOP-y design.

<Purpose>
Create the snapshot metadata

<Arguments>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of these seem to me like the belong to the class (and specifically, should correspond to different classes in some cases). At least the following:

  • consistent_snapshot
  • use_length
  • use_hashes

__metaclass__ = abc.ABCMeta

@abc.abstractmethod
def add_to_snapshot(self, rolename : str):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming nit: maybe just add() (and similar throughout)?

SNAPSHOT_FILENAME = 'snapshot' + METADATA_EXTENSION
TIMESTAMP_FILENAME = 'timestamp' + METADATA_EXTENSION

class SnapshotInterface():

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we distinguish between:

  • the repository state (i.e., all metadata about all current packages)
  • the artifact that clients will download

I think that will help make the roles/names of various classes clearer.






def generate_timestamp_metadata(snapshot_file_path, version, expiration_date,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this stays here? That kind of makes sense, but is a little asymmetric.

def add_to_snapshot(self, rolename : str):
"""
<Purpose>
Indicate to the snapshot interface that 'rolename'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused by this. What does it mean to add a role to a snapshot?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC from the PR description, this would allow increased efficiency.


class SnapshotInterface():
"""
<Purpose>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure from the code/naming whether this should have state or it's just a way to swap out implementations.

the snapshot metadata object.

<Side Effects>
The 'root.json' and 'targets.json' files are read.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO nicer to move the file i/o out of this method.

So the "SnapshotInterface" would be responsible for giving you a serializable snapshot object that you could distribute.

@jku
Copy link
Member

jku commented Oct 6, 2021

I would appreciate any general feedback about the abstraction design, whether this is a good idea, or thoughts about what should/shouldn't be moved to this snapshot object.

I'll try to do that later but I'll leave this feedback for now:

I totally understand you are doing this in the legacy code base because it has loads and loads of repository side code in it and adding one more argument to a function call there is almost certainly the path of least resistance... but I also hope you understand that this makes reviewing much less interesting: It's hard to see the long term value.

I think the project should aim to build repository side components on top of Metadata API -- I am not sure what exactly should be included in those components but a "snapshot update handler" sounds like a prime candidate and I'd definitely be interested in collaborating on something like that.

@lukpueh
Copy link
Member

lukpueh commented Feb 17, 2022

Ping @mnm678, same questions as in #1113 (comment).

@mnm678
Copy link
Contributor Author

mnm678 commented Feb 17, 2022

Closing in favor of writing this on top of the new Metadata api.

@mnm678 mnm678 closed this Feb 17, 2022
@coveralls
Copy link

coveralls commented Oct 11, 2024

Pull Request Test Coverage Report for Build 1305274445

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 76 of 78 (97.44%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.6%) to 97.866%

Changes Missing Coverage Covered Lines Changed/Added Lines %
tuf/repository_tool.py 12 14 85.71%
Totals Coverage Status
Change from base Build 1294989370: 0.6%
Covered Lines: 3766
Relevant Lines: 3815

💛 - Coveralls

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants