This repository has been archived by the owner on Jan 2, 2023. It is now read-only.

Implement Garbage Collection #10

Open
flokli opened this issue Dec 10, 2021 · 3 comments
Labels
enhancement New feature or request

Comments

@flokli
Owner

flokli commented Dec 10, 2021

There should be a way to remove castr chunks that are not referenced by any of the caibx files.

@Mic92

Mic92 commented Dec 12, 2021

An easy-to-implement alternative would be to rotate S3 buckets: keep older buckets read-only and upload only to the latest one. This should not be the end goal, but it is at least better than having no GC, which makes the project not really usable without infinite storage.
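A rough sketch of what bucket rotation could look like, using in-memory dicts as stand-ins for S3 buckets. The class `RotatingStore` and its methods are hypothetical names for illustration, not part of nix-casync or any S3 client; dropping the oldest bucket is what reclaims space here.

```python
from typing import Dict, List, Optional


class RotatingStore:
    """Hypothetical stand-in for a list of buckets, oldest first, newest last."""

    def __init__(self) -> None:
        self.buckets: List[Dict[str, bytes]] = [{}]

    def put(self, key: str, data: bytes) -> None:
        # Uploads only ever go to the newest bucket; older ones are read-only.
        self.buckets[-1][key] = data

    def get(self, key: str) -> Optional[bytes]:
        # Reads fall through from the newest bucket to the oldest.
        for bucket in reversed(self.buckets):
            if key in bucket:
                return bucket[key]
        return None

    def rotate(self, keep: int) -> None:
        # Open a fresh bucket, then drop the oldest ones beyond `keep`.
        if keep < 1:
            raise ValueError("keep must be at least 1")
        self.buckets.append({})
        del self.buckets[:-keep]


store = RotatingStore()
store.put("a", b"1")
store.rotate(keep=2)   # "a" now lives in a read-only bucket
store.put("b", b"2")
a_before = store.get("a")
store.rotate(keep=2)   # the bucket holding "a" is dropped
a_after = store.get("a")
```

In this sketch, anything that only lives in a dropped bucket is gone, whether or not it is still in use, which is why this is a stopgap and not real GC.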

@bbigras

bbigras commented Dec 12, 2021

Does nix-casync use a database? Could it try to delete files that were not modified or accessed recently, like cachix does?

@flokli
Owner Author

flokli commented Dec 12, 2021

There are multiple layers of GC here, going from the bottom to the top:

Garbage Collection of unreferenced Chunks

To do this, we need to assemble the list of all chunks referenced by any caibx file in the store.
Chunks that exist in the chunk store but are not part of that list can be safely removed.
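A minimal sketch of that mark-and-sweep step, assuming the chunk IDs of each caibx file have already been parsed into a plain mapping. The function name `unreferenced_chunks` and the sample file and chunk names are made up for illustration; parsing caibx files and listing the chunk store are out of scope here.

```python
from typing import Dict, Iterable, Set


def unreferenced_chunks(indexes: Dict[str, Iterable[str]], stored: Set[str]) -> Set[str]:
    """Return chunk IDs that are in the chunk store but listed by no index."""
    # Mark: union of every chunk ID mentioned by any index file.
    referenced: Set[str] = set()
    for chunk_ids in indexes.values():
        referenced.update(chunk_ids)
    # Sweep: whatever is stored but was never marked is garbage.
    return stored - referenced


# Hypothetical data: two index files sharing one chunk, plus an orphan chunk.
indexes = {
    "a.caibx": ["c1", "c2"],
    "b.caibx": ["c2", "c3"],
}
stored = {"c1", "c2", "c3", "c4"}
garbage = unreferenced_chunks(indexes, stored)
```

Here `garbage` contains only `c4`. The sketch ignores uploads that happen while the sweep is running.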

Garbage collection of Narfiles

To do this, we need to assemble the list of all Narfiles referenced by any Narinfo file.
Narfiles that are not referenced by any Narinfo file can be safely removed.

Removal of Narinfo files

We can only remove Narinfo files that are not referenced by any other Narinfo file.

We can start with files that are not referenced by any other Narinfo file and check their last-access time. If it's too old, we remove the file and add all of its References to the next iteration (so we slowly walk our way up).

Asking the "referred by" question, as well as tracking access times, requires some sort of database (so this is something for #9).
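The iteration described above could be sketched roughly like this, with two in-memory dicts standing in for the database. Everything here is an assumption for illustration: the function `collect_narinfos`, the `max_age` threshold, and the sample names. Self-references are ignored when asking "referred by".

```python
from typing import Dict, List, Set


def collect_narinfos(
    references: Dict[str, Set[str]],
    last_access: Dict[str, float],
    now: float,
    max_age: float,
) -> List[str]:
    """Return the Narinfo names that can be removed, in removal order."""
    alive = set(references)
    removed: List[str] = []

    def is_unreferenced(name: str) -> bool:
        # True if no other surviving Narinfo refers to `name`.
        return not any(name in references[other] for other in alive if other != name)

    candidates = [n for n in sorted(alive) if is_unreferenced(n)]
    while candidates:
        next_round: Set[str] = set()
        for name in candidates:
            if name not in alive or not is_unreferenced(name):
                continue
            if now - last_access[name] <= max_age:
                continue  # recently used: keep it and everything it refers to
            alive.discard(name)
            removed.append(name)
            # Its References may have become unreferenced; check them next.
            next_round.update(r for r in references[name] if r in alive and r != name)
        candidates = sorted(next_round)
    return removed


# Hypothetical store: "tool" was accessed recently, everything else is stale.
references = {
    "app": {"lib", "glibc"},
    "lib": {"glibc"},
    "glibc": set(),
    "tool": {"glibc"},
}
last_access = {"app": 0.0, "lib": 0.0, "glibc": 0.0, "tool": 90.0}
removed = collect_narinfos(references, last_access, now=100.0, max_age=30.0)
```

With this data, `app` and then `lib` are removed, while `glibc` survives because the recently used `tool` still refers to it.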

--

A locally deployed "cache" would probably not need to do the complicated "safe Narinfo removal" if we silently fetch the Narinfo again when it's requested.
