Increases compressibility of data with fixed-size records.
Compress a single file with shuffly and zstd
# Compress
cat my_file | shuffly -e | pzstd -o my_file.shuffly.pzstd
# Extract
cat my_file.shuffly.pzstd | pzstd -d | shuffly -d --output my_file
Compress a folder with tar, shuffly, and zstd
# Compress
tar -cf - my_folder | shuffly -e | pzstd -o my_folder.tar.shuffly.pzstd
# Extract
cat my_folder.tar.shuffly.pzstd | pzstd -d | shuffly -d | tar -xvf -
The CLI app can be installed with Cargo:
cargo install shuffly
Compressing OpenStreetMap data stored in the osmflat format is a good example of where shuffly
can dramatically improve compression ratios.
The following table shows how the original data compresses with zstd alone versus shuffly combined with zstd;
the compressed columns give the resulting size as a percentage of the original.
Compression was done with pzstd -4 on the planet dataset converted to osmflat.
The table lists all files larger than 1 GB.
File | Original Size | Compressed (zstd) | Compressed (shuffly + zstd)
---|---|---|---
nodes | 103.62 GB | 45.69% | 29.79%
nodes_index | 40.97 GB | 50.17% | 25.27%
ids/nodes | 34.54 GB | 51.99% | 1.44%
tags_index | 11.95 GB | 18.99% | 19.73%
ways | 7.70 GB | 33.71% | 7.95%
ids/ways | 3.85 GB | 53.68% | 1.78%
stringtable | 1.92 GB | 40.51% | 40.78%
tags | 1.20 GB | 28.36% | 24.45%
relation_members | 1.09 GB | 24.53% | 27.12%
Shuffly
detects fixed-size data patterns by trying out different pattern sizes between 1 and 64 bytes. For each candidate pattern size it reorders the bytes so that byte X of every record is grouped together, and stores deltas of these bytes instead of the original data.
The resulting data stream is much more compressible for pattern-based compression algorithms like deflate (gz, zip, etc.), zstd, or lzma.
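To illustrate the idea, here is a minimal Rust sketch of a byte-shuffle plus delta transform. It is not shuffly's actual implementation; the function name `shuffle_delta`, the fixed record size passed by the caller, and the choice of wrapping byte deltas are assumptions made for this example only.

```rust
/// Transpose bytes so that byte `i` of every fixed-size record is stored
/// contiguously, then replace each byte with the delta to its predecessor
/// within that group. Wrapping subtraction keeps the transform reversible.
fn shuffle_delta(data: &[u8], record_size: usize) -> Vec<u8> {
    assert_eq!(data.len() % record_size, 0);
    let records = data.len() / record_size;
    let mut out = Vec::with_capacity(data.len());
    for byte_pos in 0..record_size {
        let mut prev = 0u8;
        for rec in 0..records {
            let b = data[rec * record_size + byte_pos];
            out.push(b.wrapping_sub(prev));
            prev = b;
        }
    }
    out
}

fn main() {
    // 4-byte little-endian counters: highly regular fixed-size records.
    let data: Vec<u8> = (1000u32..1008).flat_map(|v| v.to_le_bytes()).collect();
    let transformed = shuffle_delta(&data, 4);
    // After the transform most bytes are 0 or 1, which generic
    // compressors such as zstd handle very well.
    println!("{:?}", transformed);
}
```

On data like the osmflat node IDs above, which are mostly small, monotonically increasing increments, a transform of this kind is what turns a ~50% zstd ratio into a few percent.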