Add a rebuild cache #4863

Open
Trenly opened this issue Oct 10, 2024 · 3 comments
Labels
Issue-Feature This is a feature request for the Windows Package Manager client.

Comments

@Trenly
Contributor

Trenly commented Oct 10, 2024

Description of the new feature / enhancement

Rebuilding the entire index takes a long time because each manifest must be fully parsed and re-indexed. However, many of these manifests may not have changed since the last rebuild. With nearly 60,000 manifests, it would be beneficial to have some way of doing a partial rebuild.

Proposed technical implementation details

When a rebuild is performed, a copy of the manifests and the indexes could be saved to a storage blob as a gzip archive. On the next rebuild, this archive could be downloaded and expanded, and the indexes loaded into memory as if this were the publishing pipeline. Then, instead of rebuilding the index from scratch, each manifest could be compared with its cached copy. If the manifest has changed, the index is updated based on the diff between the old manifest file and the new one. If the manifest has not changed, the index does not need to be updated. Once all the manifests have been processed, the new indexes can be published, and a copy of the manifests and indexes can be saved as the cache for the next rebuild.
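The comparison step could be sketched roughly as follows. This is only an illustration, not how the actual pipeline works: it classifies manifests by comparing SHA-256 content hashes rather than diffing manifest fields, it assumes manifests are `*.yaml` files under a single root directory, and `hash_manifests` and `diff_manifests` are hypothetical names.

```python
import hashlib
from pathlib import Path


def hash_manifests(root: Path) -> dict[str, str]:
    """Map each manifest path (relative to root) to the SHA-256 of its bytes.

    Assumption: every manifest is a *.yaml file somewhere under root.
    """
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*.yaml"))
    }


def diff_manifests(
    cached: dict[str, str], current: dict[str, str]
) -> tuple[list[str], list[str], list[str]]:
    """Return (added, changed, removed) manifest paths.

    Paths whose hash is identical in both maps appear in none of the lists,
    so the index would not be touched for them.
    """
    added = sorted(current.keys() - cached.keys())
    removed = sorted(cached.keys() - current.keys())
    changed = sorted(
        path for path in current.keys() & cached.keys()
        if current[path] != cached[path]
    )
    return added, changed, removed
```

A rebuild would call `hash_manifests` on both the expanded cache and the current manifests, then apply index updates only for the paths returned by `diff_manifests`.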

Of course, the pipelines will still need an option to perform a full rebuild when necessary, but adding a caching layer could significantly reduce rebuild time by starting from the last known-good index.
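As a rough sketch of the cache save/restore step with a full-rebuild escape hatch: the functions below pack a directory into a gzip-compressed tarball and unpack it again, returning `None` whenever the caller should rebuild from scratch. The names `save_cache`, `restore_cache`, and `force_full` are hypothetical, the upload to and download from a storage blob is left out, and the archive is assumed to be trusted because the pipeline itself produced it.

```python
import tarfile
from pathlib import Path
from typing import Optional


def save_cache(source_dir: Path, archive: Path) -> None:
    """Pack everything under source_dir (manifests and indexes) into a .tar.gz."""
    with tarfile.open(archive, "w:gz") as tar:
        for child in sorted(source_dir.iterdir()):
            tar.add(child, arcname=child.name)


def restore_cache(
    archive: Path, dest_dir: Path, force_full: bool = False
) -> Optional[Path]:
    """Unpack the cache into dest_dir, or return None to signal a full rebuild.

    A full rebuild is signalled when it was explicitly requested, when no
    cache archive exists, or when the archive cannot be read.
    """
    if force_full or not archive.is_file():
        return None
    dest_dir.mkdir(parents=True, exist_ok=True)
    try:
        # Assumption: the archive is trusted, since the pipeline wrote it.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(dest_dir)
    except (tarfile.TarError, OSError, EOFError):
        return None  # unreadable cache: fall back to a full rebuild
    return dest_dir
```

Treating a missing or unreadable cache the same way as an explicit full-rebuild request keeps the incremental path optional: the pipeline can always fall back to rebuilding from scratch.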

With this caching strategy, it could also be beneficial to perform a rebuild on a regular cadence (every 3 months?) to help ensure a well-maintained cache.

@Trenly Trenly added the Issue-Feature This is a feature request for the Windows Package Manager client. label Oct 10, 2024
@microsoft-github-policy-service microsoft-github-policy-service bot added the Needs-Triage Issue need to be triaged label Oct 10, 2024
@stephengillie

This seems to cover the pipelines, not the CLI application - should it be moved to winget-pkgs?

@microsoft-github-policy-service microsoft-github-policy-service bot removed the Needs-Triage Issue need to be triaged label Oct 10, 2024
@Trenly
Contributor Author

Trenly commented Oct 10, 2024

I'll leave that up to @denelon, but since index creation is part of the CLI implementation, I opted to put it here, mostly for planning purposes within the team, especially since the rebuild pipeline isn't typically run as a regular part of verification/publishing.

@denelon
Contributor

denelon commented Oct 10, 2024

I'll let the engineering team take a look to see if this is beneficial, and if it should be here or at winget-pkgs. 😊
