Strip nix store references from chunks and store them in metadata instead #37
Comments
iirc this is exactly what people are trying to implement in nix itself with CA derivations
I know much of this is similar, but I don't think they do any chunking as nix-casync does, do they?
Right, yes. The part about "This way, two paths made from the same derivation that are distinct because of an inconsequential stdenv rebuild but (except for the references) hold exactly the same data, would still dedup against each other" is exactly the same goal. So it would still make sense to upload CA store paths into nix-casync, because of the chunking, but the whole reference rewriting is probably better implemented in nix and is supposed to be part of nix 4.0.
Oh yeah, absolutely, Nix is the proper place to implement all of this. I'm pretty sure the purpose of this repo is to show off a PoC, rather than something that's actually usable.
This was also something that was discussed. There are already plans to add a reference scanner while ingesting .nar files, and replacing the references it finds with placeholders before feeding the data to the chunker could provide more deduplication benefits. With this, we'd be able to deduplicate blocks that contain a lot of strings and differ only in their store paths. This is something I'd also like to test out with a large dataset (#2).
This is an optimisation idea I had while reading your blog post:
Strip all Nix store references from the actual chunks and put them into per-path metadata instead.
This way, two paths made from the same derivation that are distinct because of an inconsequential stdenv rebuild but (except for the references) hold exactly the same data, would still dedup against each other.
What this could look like:
/nix/store/aa...-test:
becomes chunk C:
and aa....meta now contains:
which it can then use to recreate the store path, substituting the references in sequential order into the placeholders.
If there was an inconsequential stdenv rebuild (a comment in a string somewhere for example), the same derivation might evaluate to /nix/store/bb...-test:
but the metadata would look like this:
As you can see, chunk C is reused.
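To make the idea concrete, here is a minimal Python sketch of the strip-and-restore round trip. It is not how nix-casync works today; the function names, the `?` placeholder, and the simplified pattern for store hashes (32 lowercase alphanumerics after `/nix/store/`) are all assumptions made for illustration, and the sample file contents are made up.

```python
import hashlib
import re

# Assumption: a reference looks like /nix/store/<32 lowercase alphanumerics>-<name>.
# Real Nix hashes use a narrower base32 alphabet; this pattern is a simplification.
STORE_HASH = re.compile(rb"(?<=/nix/store/)[0-9a-z]{32}(?=-)")

# Hypothetical placeholder: same length as a hash, and "?" can never match STORE_HASH.
PLACEHOLDER = b"?" * 32
PLACEHOLDER_RE = re.compile(rb"(?<=/nix/store/)" + re.escape(PLACEHOLDER) + rb"(?=-)")


def strip_references(data: bytes) -> tuple[bytes, list[bytes]]:
    """Replace every store hash with the placeholder; return the data and the hashes in order."""
    refs = STORE_HASH.findall(data)
    return STORE_HASH.sub(PLACEHOLDER, data), refs


def restore_references(chunk: bytes, refs: list[bytes]) -> bytes:
    """Substitute the recorded hashes back into the placeholders in sequential order."""
    remaining = iter(refs)
    return PLACEHOLDER_RE.sub(lambda _match: next(remaining), chunk)


def make_script(bash_hash: bytes) -> bytes:
    """Made-up file contents that differ only in the hash of one dependency."""
    return (
        b"#!/nix/store/" + bash_hash + b"-bash/bin/sh\n"
        b"exec /nix/store/" + b"c" * 32 + b"-hello/bin/hello\n"
    )


# Two builds of the same file that differ only in a referenced store path.
path_a = make_script(b"a" * 32)
path_b = make_script(b"b" * 32)

chunk_a, refs_a = strip_references(path_a)
chunk_b, refs_b = strip_references(path_b)

# The stripped data is identical, so a content-addressed chunk store keeps it once.
chunk_id_a = hashlib.sha256(chunk_a).hexdigest()
chunk_id_b = hashlib.sha256(chunk_b).hexdigest()

# The per-path metadata is what differs, and it is enough to rebuild each original.
restored_a = restore_references(chunk_a, refs_a)
restored_b = restore_references(chunk_b, refs_b)
```

Keeping the placeholder the same length as the hash means byte offsets in the data do not shift, which seems like the safer choice if the stripping happens on NAR contents before chunking. A real implementation would also need to handle data that happens to contain the placeholder already.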