Implementing SuperCDC 2022 #29

Open
sylvain101010 opened this issue Apr 2, 2023 · 1 comment
Comments

sylvain101010 commented Apr 2, 2023

Hi there 👋

First, thank you very much for this awesome crate. I learned a lot reading the code, and the documentation is top tier!

I would like to know if you are interested in adding SuperCDC (https://www.computer.org/csdl/proceedings-article/iccd/2022/618600a170/1JeFWRlp7gs) to this crate?

I've only skimmed the paper, but it reports a throughput improvement of roughly 1.5x to 8-10x over FastCDC 2020.

If you are interested, please let me know how I can help.

@nlfiedler
Owner

There are a couple of reasons why SuperCDC would not make sense in this crate. First, the name is different, so that's a no-go right off the bat. Second, FastCDC is suited only to chunking a single stream of bytes, while SuperCDC and its ilk are about managing whole systems composed of many files. These solutions employ additional storage to track chunks (and files) in order to predict the size of the upcoming chunks, in the hope of speeding up the process slightly. That's really neat, but it's far more complex and grander in scheme than the modest and simple FastCDC. Implementing SuperCDC involves making decisions that are better suited to an application than to a library. It's either that, or the library would have to offer several options for tuning the performance vs. accuracy dial to suit the application.
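
To illustrate what "chunking a single stream of bytes" means here, the following is a minimal sketch of a gear-hash content-defined chunker. It is not this crate's API and it is not SuperCDC; it also leaves out FastCDC's normalized chunking. The names `gear_table`, `next_cut`, and `chunk_lengths`, the splitmix64-style table generator, and the parameter values are all assumptions made for illustration. Note that the chunker keeps no state beyond the rolling hash, whereas the approach described above would need extra storage to remember previously seen chunks.

```rust
/// Build a 256-entry table of pseudo-random 64-bit values (a "gear" table).
/// The splitmix64-style generator and the seed are arbitrary choices for this sketch.
fn gear_table() -> [u64; 256] {
    let mut table = [0u64; 256];
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    for entry in table.iter_mut() {
        state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        *entry = z ^ (z >> 31);
    }
    table
}

/// Return the length of the next chunk at the start of `data`.
/// The first `min` bytes are skipped without hashing, and a cut is forced at `max`.
fn next_cut(data: &[u8], table: &[u64; 256], min: usize, max: usize, mask: u64) -> usize {
    if data.len() <= min {
        return data.len();
    }
    let end = data.len().min(max);
    let mut hash: u64 = 0;
    for i in min..end {
        // Gear rolling hash: shift left by one bit, then add the table entry for this byte.
        hash = (hash << 1).wrapping_add(table[data[i] as usize]);
        if hash & mask == 0 {
            return i + 1; // content-defined cut point
        }
    }
    end // no cut point found before the limit
}

/// Split one in-memory byte stream into chunk lengths.
/// Only the rolling hash is tracked; nothing about earlier chunks is stored.
fn chunk_lengths(data: &[u8], min: usize, max: usize, mask: u64) -> Vec<usize> {
    assert!(min < max, "min must be smaller than max");
    let table = gear_table();
    let mut lengths = Vec::new();
    let mut offset = 0;
    while offset < data.len() {
        let len = next_cut(&data[offset..], &table, min, max, mask);
        lengths.push(len);
        offset += len;
    }
    lengths
}
```

Calling `chunk_lengths(&bytes, 64, 1024, 0xFF)` on a buffer returns the chunk sizes in order. The same input always yields the same cut points, which is the property deduplication relies on.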
