Root CID does not match js/go-ipfs when packing a dir with 10k sub dirs #91

Open
olizilla opened this issue Sep 17, 2021 · 3 comments

@olizilla
Contributor

Reported by @obo20

Single-file uploads went fine, and a folder of 100 files went fine, but when I tried a folder of 10k files (a super common use case), I received a different CID from adding to IPFS (go-ipfs/js-ipfs) than from the ipfs-car output.

The CID I get from go-ipfs and js-ipfs is: bafybeihq6az265aar27wuhzltxrgge5ywwllcgux7wui4z3ddq4i2cskky
The CID I get from ipfs-car is: bafybeigww4x6shkc7vbp7c5slmnw3vo6ioj4gnar6ign5eqbkfpijcavk4

large-folder-10k.zip

@olizilla olizilla self-assigned this Sep 17, 2021
@olizilla
Contributor Author

More context: the go and js implementations diverge in how they shard large directories.

Yes, they diverge:
Currently go-ipfs will either do no sharding (the default) or, with sharding enabled, will shard even small folders, while js-ipfs has a cutoff.
For v0.11.0, go-ipfs should have sharding enabled by default and is planning to shard directories that are close to 1MiB in size (there's a deterministic function for this). js-ipfs may choose to do the same, but it's not strictly necessary.
– @aschmahmann

The auto-shard PR for js-ipfs is here: ipfs/js-ipfs-unixfs#171
The size limit is currently 256KiB, but it'll align with go-ipfs before it's merged and will be overridable by the user.
– @achingbrain

ipfs-car will adopt the changes in ipfs/js-ipfs-unixfs#171 once they land so that its CID derivation stays in sync with js-ipfs.
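
To make the size-based approach quoted above concrete, here is a rough sketch of what "shard when the directory block gets too big" could look like. It is not the go-ipfs or js-ipfs-unixfs implementation: `estimateDirectorySize` and `shouldShardBySize` are hypothetical names, and the estimate (name bytes plus CID bytes per entry) is an assumption made for illustration only.

```javascript
// Hypothetical sketch of size-based auto-sharding.
// NOT the actual go-ipfs or js-ipfs-unixfs code; names and the size
// estimate are assumptions made for illustration.
const KiB = 1024
const MiB = 1024 * KiB

const encoder = new TextEncoder()

// Assumed estimate: each entry contributes the UTF-8 byte length of its
// name plus the byte length of the CID it links to.
function estimateDirectorySize (entries) {
  let total = 0
  for (const { name, cidByteLength } of entries) {
    total += encoder.encode(name).length + cidByteLength
  }
  return total
}

// Shard only when the estimated directory block exceeds the limit.
function shouldShardBySize (entries, maxBlockSize) {
  return estimateDirectorySize(entries) > maxBlockSize
}

// A directory of 10k files named file-0.txt ... file-9999.txt.
const entries = Array.from({ length: 10000 }, (_, i) => ({
  name: `file-${i}.txt`,
  cidByteLength: 36 // CIDv1 with a sha2-256 multihash
}))

const shardsAt256KiB = shouldShardBySize(entries, 256 * KiB)
const shardsAt1MiB = shouldShardBySize(entries, 1 * MiB)
console.log({ shardsAt256KiB, shardsAt1MiB })
```

Under these assumptions the same 10k-entry directory is sharded with a 256KiB limit but not with a 1MiB limit, which is why the limits need to line up across implementations for the root CIDs to match.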

@obo20

obo20 commented Sep 17, 2021

For more context:

The code I'm using looks like:

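// Assumed import paths, following the ipfs-car 0.5.x README
import { packToFs } from 'ipfs-car/pack/fs'
import { FsBlockStore } from 'ipfs-car/blockstore/fs'

const results = await packToFs({
  input: contentFilePath, // path to the folder being packed
  output: `${destinationFolder}/data.car`,
  blockstore: new FsBlockStore(),
  wrapWithDirectory: false,
  maxChildrenPerNode: 1024,
  maxChunkSize: 262144 // 256KiB
})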

The npm versions look like:

    "ipfs-car": "^0.5.8",
    "ipfs-core": "^0.10.6",

@olizilla
Contributor Author

As you've found, the defaults are different. If sharding is not enabled in js-ipfs (the default), it passes Infinity for shardSplitThreshold. ipfs-car passes nothing, so ipfs-unixfs-importer falls back to its default of 1000.
So yes, you just need to expose a shardSplitThreshold option in ipfs-car and pass it on to the importer; then you can manipulate the args to get the same CID as js-ipfs and go-ipfs.
Though of course with the knowledge that the default behaviour is going to change soon(ish) to auto-shard based on final block size rather than the number of entries in a directory.
– @achingbrain
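
A minimal sketch of why the two defaults produce different root CIDs for the 10k-entry folder. The option name `shardSplitThreshold` and the values 1000 and Infinity come from the comment above; the helper functions are hypothetical, and the exact comparison the importer uses at the boundary is an assumption.

```javascript
// Hypothetical sketch of count-based sharding defaults.
// `resolveShardSplitThreshold` and `willShard` are illustrative helpers,
// not functions exported by ipfs-car or ipfs-unixfs-importer.

// Importer default when the caller passes nothing (per the comment above).
const IMPORTER_DEFAULT_SHARD_SPLIT_THRESHOLD = 1000

// Use the caller's value when provided, otherwise the importer default.
function resolveShardSplitThreshold (userOptions = {}) {
  return userOptions.shardSplitThreshold ?? IMPORTER_DEFAULT_SHARD_SPLIT_THRESHOLD
}

// Assumption: a directory is sharded once its entry count reaches the
// threshold; the real importer may differ at the exact boundary.
function willShard (entryCount, shardSplitThreshold) {
  return entryCount >= shardSplitThreshold
}

const entryCount = 10000

// ipfs-car passing nothing: the importer default of 1000 applies.
const ipfsCarDefault = willShard(entryCount, resolveShardSplitThreshold({}))

// js-ipfs with sharding disabled: Infinity is passed, so nothing is sharded.
const jsIpfsDefault = willShard(
  entryCount,
  resolveShardSplitThreshold({ shardSplitThreshold: Infinity })
)

console.log({ ipfsCarDefault, jsIpfsDefault })
```

One side builds a sharded directory and the other a flat one, so the root blocks differ and so do the CIDs. Passing the same threshold on both sides removes that difference.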
