Fragmentation Prevention #147

Merged
ylow merged 7 commits into main from ylow/fragmentation_prevention on Jan 30, 2025
Conversation

@ylow ylow commented Jan 23, 2025

For https://linear.app/xet/issue/XET-246/fragmentation-prevention. We use average chunks per range as a fragmentation estimator, targeting an average of 16 chunks per range, which roughly equates to 1MB per range (implying an average chunk size of about 64KB). The average is computed over a sliding window of the last 32 ranges. If it drops below the target, dedupe is disabled until the average rises above the target again.

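As a rough illustration, here is a minimal sketch of such a gate in Rust, with hypothetical names and constants (not the PR's actual types):

```rust
// Sketch of the fragmentation estimator described above
// (hypothetical names, not the actual types from this PR).
use std::collections::VecDeque;

const TARGET_CHUNKS_PER_RANGE: f64 = 16.0; // ~1MB per range at ~64KB chunks
const WINDOW_SIZE: usize = 32;             // averaged over the last 32 ranges

struct FragmentationGate {
    window: VecDeque<usize>, // chunk counts of the most recent ranges
    dedupe_enabled: bool,
}

impl FragmentationGate {
    fn new() -> Self {
        Self {
            window: VecDeque::with_capacity(WINDOW_SIZE),
            dedupe_enabled: true,
        }
    }

    /// Record a newly emitted range and update the dedupe decision.
    fn record_range(&mut self, chunks_in_range: usize) {
        if self.window.len() == WINDOW_SIZE {
            self.window.pop_front();
        }
        self.window.push_back(chunks_in_range);

        let avg = self.window.iter().sum::<usize>() as f64 / self.window.len() as f64;
        // Below target => ranges are too short (fragmented): stop deduping.
        // At or above target => safe to dedupe again.
        self.dedupe_enabled = avg >= TARGET_CHUNKS_PER_RANGE;
    }

    fn should_dedupe(&self) -> bool {
        self.dedupe_enabled
    }
}
```

Each time a range is emitted, `record_range` would be called with its chunk count, and `should_dedupe` consulted before attempting the next match.
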
Running on the first 1GB of a highly fragmented file (a few hundred KB of an existing file, followed by a hundred KB of zeros, repeated) we see the following:

  • Baseline: 1000000001 bytes -> 726845953 bytes, 2975 ranges, 336134 average bytes per range
  • 512KB target (anti-fragmentation goal of 8 chunks per range): 1000000001 bytes -> 873515521 bytes, 1465 ranges, 682594 average bytes per range
  • 1MB target (anti-fragmentation goal of 16 chunks per range): 1000000001 bytes -> 932235777 bytes, 829 ranges, 1206273 average bytes per range

This also includes a hysteresis implementation (a sketch follows the numbers below):

  • 512KB target (anti-fragmentation goal of 8 chunks per range): 1000000001 bytes -> 873515521 bytes, 1657 ranges, 603500 average bytes per range.

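Concretely, the hysteresis can be pictured as a low and a high water mark around the target, with the gate holding its previous state inside the band. A minimal sketch, assuming hypothetical threshold constants (the PR's actual thresholds and names may differ):

```rust
// Hysteresis variant of the gate sketched earlier (illustrative only).
// Dedupe turns off only when the windowed average falls below the low
// water mark, and turns back on only once it climbs above the high water
// mark; inside the band the previous decision is kept, so the gate does
// not flip back and forth on every range near the target.
const LOW_WATER_CHUNKS_PER_RANGE: f64 = 8.0;   // hypothetical, e.g. the 512KB goal
const HIGH_WATER_CHUNKS_PER_RANGE: f64 = 16.0; // hypothetical, e.g. the 1MB goal

fn update_with_hysteresis(dedupe_enabled: &mut bool, avg_chunks_per_range: f64) {
    if *dedupe_enabled && avg_chunks_per_range < LOW_WATER_CHUNKS_PER_RANGE {
        *dedupe_enabled = false; // too fragmented: stop deduping
    } else if !*dedupe_enabled && avg_chunks_per_range > HIGH_WATER_CHUNKS_PER_RANGE {
        *dedupe_enabled = true; // ranges long enough again: resume dedupe
    }
}
```
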
The hysteresis turned out to be pretty important for deduping a content-defined-chunked variant of Parquet.
Without hysteresis (the only concern is how v2 dedupes against v1):

parquet file v1: 5728317968 bytes -> 5728137283 bytes
parquet file v2: 5726717793 bytes -> 4544391399 bytes (11.14 chunks per range)

With hysteresis

parquet file v1: 5728317968 bytes -> 5728137283 bytes
parquet file v2: 5726717793 bytes -> 3568275084 bytes (8.11 chunks per range)

So with the hysteresis implementation we are closer to the target chunks per range while still deduping pretty well: v2 shrinks by about 37.7%, versus about 40.6% with no fragmentation prevention at all. For comparison, without any fragmentation prevention:

parquet file v1: 5728317968 bytes -> 5728137283 bytes
parquet file v2: 5726717793 bytes -> 3402767500 bytes (6.89 chunks per range)

@ylow ylow requested review from hoytak and seanses January 23, 2025 00:07
@ylow ylow commented Jan 24, 2025

Other ideas which may or may not improve things are:

  • have a high-water and low-water mark so if we are near the 1MB boundary we don't keep jumping back and forth between dedupe and no-dedupe.
  • look ahead a bunch of chunks (this is somewhat complicated)

@ylow ylow marked this pull request as ready for review January 24, 2025 18:44
data/src/constants.rs: review thread (outdated, resolved)
@seanses seanses (Collaborator) left a comment

LGTM!

@ylow ylow merged commit 5cf29c1 into main Jan 30, 2025
2 checks passed
@ylow ylow deleted the ylow/fragmentation_prevention branch January 30, 2025 21:40