-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
yarn: SBOM components #739
base: main
Are you sure you want to change the base?
yarn: SBOM components #739
Conversation
The class is very often interpreted as `algorithm:hash`. Let's make use of builtin __str__ magic method and return the same format. Signed-off-by: Michal Šoltis <[email protected]>
Each "yarn classic" package type has a property that returns the package URL based on its attributes and community PURL specification [1]. All package types share the same base -> name, version, and type which is set to "npm" ("yarn" does not exist). - `FilePackage`, `WorkspacePackage`, `LinkPackage` have in addition subpath component (extra subpath within a package, relative to the package root) - `UrlPackage` has one extra qualifier -> its URL as it is definied - `GitPackage` has one extra qualifier -> package version control system URL with a specific syntax [2] - `RegistryPackage` has two extra qualifiers -> repository_url (default repository/registry for "npm" is https://registry.npmjs.org so alternative registries such as https://registry.yarnpkg.com should be qualified via the qualifier) [3], [4] + the checksum of the package converted from Subresource Integrity representation --- [1]: https://github.com/package-url/purl-spec/blob/master/PURL-SPECIFICATION.rst [2]: https://github.com/spdx/spdx-spec/blob/cfa1b9d08903/chapters/3-package-information.md#37-package-download-location- [3]: https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#npm [4]: https://github.com/package-url/purl-spec/blob/master/PURL-SPECIFICATION.rst#known-qualifiers-keyvalue-pairs Signed-off-by: Michal Šoltis <[email protected]>
After a successful pre-fetching of all packages, report all downloaded packages as components in the final SBOM. Create the `Component` object from each package based on package attributes. Dev packages should have `cdx:npm:package:development` property, that is added to the component if package is marked for development -> `dev` attribute is set to True. Move the rest of the unit test logic to `test_fetch_yarn_source` from its predecessor in yarn-berry implementation. closes containerbuildsystem#636 Signed-off-by: Michal Šoltis <[email protected]>
The commit follows the previous one, that implements generating SBOM components. Now e2e tests that generate SBOMs should be updated to reflect this change by running integration tests and setting our custom env variable: `CACHI2_GENERATE_TEST_DATA=true` Signed-off-by: Michal Šoltis <[email protected]>
1fb7c7b
to
4941272
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with a couple small questions/nitpicks. Good to go once those are resolved.
"""Return package URL.""" | ||
qualifiers = {} | ||
|
||
if self.url != DEFAULT_NPM_REGISTRY: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that self.url can be anything? If not then could it be constrained to a union of literals?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do have a check for that:
https://github.com/containerbuildsystem/cachi2/blob/main/cachi2/core/package_managers/yarn_classic/resolver.py#L184
IOW, if a package is RegistryPackage
it is one of those npm registries.
@@ -26,6 +29,9 @@ | |||
re.compile(r"^https?:.+\.git#.+"), | |||
) | |||
|
|||
DEFAULT_NPM_REGISTRY = "https://registry.npmjs.org" | |||
ALTERNATIVE_NPM_REGISTRY = "https://registry.yarnpkg.com" | |||
|
|||
|
|||
class _BasePackage(BaseModel): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make this inherit from ABC and have purl
as an abstract method?
return _create_sbom_component(packages) | ||
|
||
|
||
def _create_sbom_component(packages: Iterable[YarnClassicPackage]) -> list[Component]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor naming nitpick: it creates multiple components, so better be named _create_sbom_components
. The docstring seems to agree with me on this one.
Another controversial naming scheme to consider is this: _sbom_components_created_from
.
Then in the invocation spot you'll have:
return _sbom_components_created_from(packages)
which is mostly a regular English sentence (except for weird punctuation).
I am absolutely not insisting on the second one, it is just a readability idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be _create_sbom_components
(probably just a typo).
qualifiers = {"download_url": self.url} | ||
return PackageURL( | ||
type="npm", | ||
name=self.name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some of the non-registry packages, we may want to read their respective package.json files from the cached tarballs to get their true names. As a user, I can call these non-registry dependencies whatever I want in my project's package.json and it could be completely different from what it really is:
"dependencies": {
"not-fecha": "https://github.com/taylorhakes/fecha.git"
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the filename in the cache come from package.json or the URL ?
Maintainers will complete the following section
Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:
/ok-to-test
(as is the standard for Pipelines as Code)