Implementing SuperCDC 2022 #29

sylvain101010 · 2023-04-02T10:00:50Z

Hi there 👋

First, thank you very much for this awesome crate. I learned a lot reading the code, and the documentation is top tier!

I would like to know if you are interested in adding SuperCDC (https://www.computer.org/csdl/proceedings-article/iccd/2022/618600a170/1JeFWRlp7gs) to this crate?

I've only skimmed the paper, but it reports a throughput improvement of roughly 1.5x to 8-10x over FastCDC 2020.

If you are interested, please let me know how I can help.

nlfiedler · 2023-04-07T04:13:53Z

There are a couple of reasons why SuperCDC would not make sense in this crate. First, the name is different, so that's a no-go right off the bat. Second, FastCDC is suited only to chunking a single stream of bytes, while SuperCDC and its ilk are about managing whole systems made up of many files. These solutions employ additional storage to track chunks (and files) in order to predict the size of upcoming chunks, in the hope of speeding up the process slightly. That's really neat, but it's far more complex and grander in scope than the modest and simple FastCDC. Implementing SuperCDC involves making decisions that are better suited to an application than to a library. It's either that, or the library would have to offer several options for tuning the performance vs. accuracy dial to suit the application.
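
To make the "single stream of bytes" point concrete, here is a minimal sketch of a stateless content-defined chunker. It is not this crate's implementation and not the algorithm from either paper: the names `gear_table` and `chunk_stream`, the xorshift-generated table, and the single fixed mask are all assumptions made for illustration. The thing to notice is that it needs nothing but the input slice, with no index of previously seen chunks or files.

```rust
/// Build a 256-entry gear table from a fixed seed with a xorshift64 generator.
/// Illustrative only: this is NOT the table used by any real crate.
fn gear_table() -> [u64; 256] {
    let mut table = [0u64; 256];
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    for entry in table.iter_mut() {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        *entry = state;
    }
    table
}

/// Split `data` into content-defined chunks, returned as (offset, length) pairs.
/// A cut is made where the rolling gear hash has all `mask` bits clear,
/// but never before `min` bytes and always by `max` bytes.
/// (Hypothetical helper; parameters are assumptions, not a library API.)
fn chunk_stream(data: &[u8], min: usize, max: usize, mask: u64) -> Vec<(usize, usize)> {
    assert!(max > 0 && min <= max, "need 0 < max and min <= max");
    let gear = gear_table();
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < data.len() {
        // Never look past `max` bytes or the end of the input.
        let limit = (data.len() - start).min(max);
        // Default cut point: the size limit (or end of data) if no boundary is found.
        let mut len = limit;
        let mut hash: u64 = 0;
        for i in 0..limit {
            // Gear hash update: shift, then add the table entry for this byte.
            hash = (hash << 1).wrapping_add(gear[data[start + i] as usize]);
            if i + 1 >= min && hash & mask == 0 {
                len = i + 1;
                break;
            }
        }
        chunks.push((start, len));
        start += len;
    }
    chunks
}
```

Calling `chunk_stream(&bytes, 256, 4096, (1 << 10) - 1)` would target chunks of roughly 1 KiB beyond the minimum. Everything the function knows comes from the bytes it is handed, which is why it fits in a library. Predicting chunk sizes from previously stored chunks would require the caller to supply and maintain that extra state.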
