Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New syncing logic #63

Merged
merged 34 commits into from
Jun 14, 2024
Merged

New syncing logic #63

merged 34 commits into from
Jun 14, 2024

Conversation

twitu
Copy link
Collaborator

@twitu twitu commented May 20, 2024

Closes #37

Copy link

Benchmark for 4a65eca

Click to view benchmark
Test Base PR %
../test-assets/lena.jpg/compute_bytes 14.1±0.04µs 14.5±0.11µs +2.84%
../test-assets/test.pdf/compute_bytes 114.6±0.86µs 116.5±1.51µs +1.66%
compute_bytes_large/compute_bytes 146.3±2.16µs 148.1±5.07µs +1.23%
compute_bytes_medium/compute_bytes 27.9±0.55µs 27.8±0.37µs -0.36%
compute_bytes_small/compute_bytes 131.8±2.92ns 131.3±1.37ns -0.38%
index_build/index_build/../test-assets/ 159.2±0.70µs 160.6±2.35µs +0.88%

@kirillt
Copy link
Member

kirillt commented May 23, 2024

I think we have also discussed "smart rounding":

round HH:MM:ss:xxx always to the lowest side when we memoize the timestamp of latest scan
round HH:MM:ss:yyy always to the highest side when we retrieve the timestamp of file modification

@twitu
Copy link
Collaborator Author

twitu commented May 24, 2024

SystemTime records time at a nanosecond level depending on the operating system. The as_millis function implements the standard logic for rounding the timestamp up to milliseconds. The default implementation should be sufficient for our purpose.

    pub const fn as_millis(&self) -> u128 {
        self.secs as u128 * MILLIS_PER_SEC as u128 + (self.nanos.0 / NANOS_PER_MILLI) as u128
    }

@kirillt
Copy link
Member

kirillt commented May 24, 2024

It's still possible theoretically:

  1. Read the storage file at T_r. If T_r has non-zero nanos, then T_r is rounded up to the next millisecond.
  2. Some writes arrive to the disk at T_r + 0.5ms.
  3. During next scan, we ignore updates from step 2, because they happened before round(T_r).

I believe, that with millisecond precision, it becomes very rare case.

But if we round read time down, then we eliminate this corner case completely.

@twitu
Copy link
Collaborator Author

twitu commented May 25, 2024

If we want to reduce the possibility of a race condition, we can use nano second resolution without any loss in terms of performance and increase in complexity. We are already is storing the SystemTime which is represented as a u128.

    pub const fn as_nanos(&self) -> u128 {
        self.secs as u128 * NANOS_PER_SEC as u128 + self.nanos.0 as u128
    }

Copy link

Benchmark for 1d6e14f

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 247.1±0.92µs 247.9±0.94µs +0.32%
blake3_resource_id_creation/compute_from_bytes:medium 15.5±0.02µs 15.5±0.04µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1351.8±3.55ns 1349.8±5.56ns -0.15%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.0±0.35µs 195.6±2.81µs -0.20%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1707.0±5.40µs 1721.3±29.31µs +0.84%
crc32_resource_id_creation/compute_from_bytes:large 86.6±0.24µs 86.7±0.47µs +0.12%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.02µs 5.4±0.03µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.3±0.25ns 92.4±0.54ns +0.11%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 64.4±0.33µs 64.7±0.38µs +0.47%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 933.2±3.00µs 932.1±7.60µs -0.12%
index_build/index_build/../test-assets/ 1066.4±6.16µs 1055.8±5.28µs -0.99%

@kirillt
Copy link
Member

kirillt commented May 25, 2024

@twitu yes, nanos are possible and are provided by the filesystem API, but iirc they are not real nanos and there are zeros at the end.. But why you don't want to adopt different rounding logic for reading timestamp? I think this would solve the issue for any precision.

Copy link

Benchmark for 02c5daf

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 248.2±0.92µs 254.3±0.96µs +2.46%
blake3_resource_id_creation/compute_from_bytes:medium 15.5±0.03µs 15.6±0.03µs +0.65%
blake3_resource_id_creation/compute_from_bytes:small 1350.6±1.77ns 1350.4±2.18ns -0.01%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 195.9±0.58µs 196.3±0.46µs +0.20%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1732.2±2.75µs 1728.8±26.12µs -0.20%
crc32_resource_id_creation/compute_from_bytes:large 86.9±0.78µs 86.7±0.31µs -0.23%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.05µs 5.4±0.18µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.4±0.44ns 92.3±0.16ns -0.11%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 64.6±0.17µs 64.7±0.49µs +0.15%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 927.6±7.79µs 929.0±4.57µs +0.15%
index_build/index_build/../test-assets/ 1062.0±5.31µs 1064.7±6.22µs +0.25%

@twitu
Copy link
Collaborator Author

twitu commented May 26, 2024

I just noticed that because of the recent changes in the monoids PR. The semantics for the syncing logic has changed. We're now checking for timestamp equality, so rounding logic won't work. We can only increase the resolution of the timestamp.

Also from the SystemTime docs, the resolution is OS dependent. Linux supports ns resolution and windows and mac atleast 100ns resolution. So the best way to cover all OS would be to use microsecond resolution.

    /// Compare the timestamp of the storage file
    /// with the timestamp of the in-memory storage update
    /// to determine if either of the two requires syncing.
    fn needs_syncing(&self) -> Result<bool> {
        match fs::metadata(&self.path) {
            Ok(metadata) => {
                let get_duration_since_epoch = |time: SystemTime| {
                    time.duration_since(SystemTime::UNIX_EPOCH)
                        .unwrap()
                        .as_secs()
                };

                let fs_modified =
                    get_duration_since_epoch(metadata.modified()?);
                let self_modified = get_duration_since_epoch(self.modified);

                Ok(fs_modified != self_modified)

@twitu twitu force-pushed the millis-timestamp branch from 04fdc25 to 2719559 Compare May 26, 2024 15:30
Copy link

Benchmark for 2b24fe0

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 250.0±1.77µs 249.2±1.35µs -0.32%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.08µs 15.5±0.07µs -0.64%
blake3_resource_id_creation/compute_from_bytes:small 1342.0±8.24ns 1346.7±10.74ns +0.35%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 195.6±0.56µs 196.1±1.64µs +0.26%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1731.3±8.75µs 1736.2±20.19µs +0.28%
crc32_resource_id_creation/compute_from_bytes:large 86.9±0.34µs 86.9±0.48µs 0.00%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.02µs 5.4±0.03µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.3±0.23ns 92.4±0.55ns +0.11%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 64.2±0.25µs 64.2±0.29µs 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 929.7±13.46µs 926.3±3.06µs -0.37%
index_build/index_build/../test-assets/ 1066.1±8.74µs 1063.5±3.74µs -0.24%

Copy link

Benchmark for 4b74ce1

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 250.4±0.86µs 248.6±0.80µs -0.72%
blake3_resource_id_creation/compute_from_bytes:medium 15.5±0.06µs 15.5±0.04µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1349.0±1.70ns 1347.1±2.90ns -0.14%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.9±0.44µs 196.5±1.91µs -0.20%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1732.7±7.20µs 1722.7±10.15µs -0.58%
crc32_resource_id_creation/compute_from_bytes:large 86.7±0.32µs 86.9±2.36µs +0.23%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.03µs 5.4±0.02µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.4±0.29ns 92.3±0.46ns -0.11%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 64.6±0.28µs 64.5±1.14µs -0.15%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 931.8±14.97µs 922.4±3.09µs -1.01%
index_build/index_build/../test-assets/ 1063.7±4.27µs 1068.1±5.83µs +0.41%

@tareknaser
Copy link
Collaborator

I just wanted to note that it's great the macOS CI is now failing, so we were able to reproduce the error.

Using microseconds also results in the same error on macOS systems.

@kirillt
Copy link
Member

kirillt commented May 30, 2024

I just noticed that because of the recent changes in the monoids PR. The semantics for the syncing logic has changed. We're now checking for timestamp equality, so rounding logic won't work. We can only increase the resolution of the timestamp.

Agree, this changes everything...

Ready to merge as soon as build is fixed.

@twitu twitu changed the title Use millisecond resolution when comparing timestamps New syncing logic May 30, 2024
Copy link

Benchmark for ef107de

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 249.1±0.53µs 248.5±2.48µs -0.24%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.06µs 15.6±0.23µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1347.1±5.71ns 1340.7±9.44ns -0.48%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.2±0.84µs 196.0±1.06µs -0.10%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1725.2±5.60µs 1731.9±28.71µs +0.39%
crc32_resource_id_creation/compute_from_bytes:large 86.7±0.30µs 87.1±1.80µs +0.46%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.08µs 5.4±0.04µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.4±0.51ns 92.7±3.31ns +0.32%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 64.8±0.35µs 64.7±0.96µs -0.15%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 925.9±4.68µs 921.5±4.26µs -0.48%
index_build/index_build/../test-assets/ 1067.2±10.85µs 1067.0±8.47µs -0.02%

@twitu
Copy link
Collaborator Author

twitu commented May 30, 2024

A completely revamped syncing logic. This affects read_fs, write_fs, new FileStorage and some minor changes to the BaseStorage api.

Needs a solid review.

@twitu twitu self-assigned this May 30, 2024
@twitu twitu requested review from kirillt and tareknaser May 30, 2024 18:56
Copy link

Benchmark for 1bd21fb

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 248.7±1.16µs 248.8±1.10µs +0.04%
blake3_resource_id_creation/compute_from_bytes:medium 15.5±0.07µs 15.6±0.24µs +0.65%
blake3_resource_id_creation/compute_from_bytes:small 1350.0±1.93ns 1334.8±9.91ns -1.13%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.2±0.55µs 197.9±1.20µs +0.87%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1725.9±7.29µs 1721.5±12.79µs -0.25%
crc32_resource_id_creation/compute_from_bytes:large 87.0±0.91µs 86.9±0.44µs -0.11%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.03µs 5.4±0.07µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.4±0.37ns 92.4±0.80ns 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 64.4±0.29µs 64.4±0.17µs 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 928.4±3.51µs 926.4±16.63µs -0.22%
index_build/index_build/../test-assets/ 1062.8±5.78µs 1072.1±29.03µs +0.88%

Copy link

Benchmark for 897bb10

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 251.8±1.68µs 250.4±1.70µs -0.56%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.04µs 15.6±0.40µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1370.4±17.60ns 1357.3±10.91ns -0.96%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.1±0.59µs 196.1±0.59µs 0.00%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1747.5±7.84µs 1751.2±17.27µs +0.21%
crc32_resource_id_creation/compute_from_bytes:large 87.0±1.83µs 87.1±1.12µs +0.11%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.02µs 5.4±0.16µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.7±0.35ns 92.6±0.27ns -0.11%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 65.8±1.50µs 65.6±1.41µs -0.30%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 952.7±5.17µs 959.6±6.89µs +0.72%
index_build/index_build/../test-assets/ 1042.5±19.58µs 1043.5±4.53µs +0.10%

Copy link

Benchmark for a925154

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 249.0±0.55µs 248.0±0.71µs -0.40%
blake3_resource_id_creation/compute_from_bytes:medium 15.7±0.08µs 15.6±0.11µs -0.64%
blake3_resource_id_creation/compute_from_bytes:small 1367.0±9.47ns 1355.3±8.81ns -0.86%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.2±0.88µs 196.0±1.70µs -0.10%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1741.5±4.68µs 1744.2±21.92µs +0.16%
crc32_resource_id_creation/compute_from_bytes:large 86.7±0.48µs 86.8±0.46µs +0.12%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.04µs 5.4±0.17µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.7±0.40ns 92.7±0.26ns 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 65.4±0.74µs 65.1±0.24µs -0.46%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 955.0±5.43µs 954.2±10.96µs -0.08%
index_build/index_build/../test-assets/ 1040.1±9.10µs 1045.1±5.69µs +0.48%

Copy link

Benchmark for 559d655

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 251.4±0.52µs 250.3±1.65µs -0.44%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.07µs 15.6±0.09µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1354.9±2.76ns 1351.3±8.07ns -0.27%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.0±0.55µs 196.3±0.53µs +0.15%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1752.0±5.35µs 1754.3±25.97µs +0.13%
crc32_resource_id_creation/compute_from_bytes:large 86.7±0.17µs 87.0±1.22µs +0.35%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.02µs 5.4±0.03µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.7±0.44ns 92.9±2.01ns +0.22%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 65.1±0.30µs 66.2±0.45µs +1.69%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 956.8±9.36µs 953.7±3.99µs -0.32%
index_build/index_build/../test-assets/ 1040.3±8.22µs 1058.7±2.97µs +1.77%

Copy link

Benchmark for f5a617a

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 252.8±1.05µs 250.3±1.44µs -0.99%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.02µs 15.6±0.61µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1364.1±1.98ns 1350.4±6.39ns -1.00%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.1±0.47µs 196.2±0.34µs +0.05%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1741.8±4.12µs 1743.1±13.80µs +0.07%
crc32_resource_id_creation/compute_from_bytes:large 86.8±0.34µs 86.7±0.20µs -0.12%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.02µs 5.4±0.01µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.8±0.55ns 92.7±0.36ns -0.11%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 65.4±0.33µs 65.4±0.45µs 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 957.3±8.56µs 952.2±4.37µs -0.53%
index_build/index_build/../test-assets/ 1039.8±6.15µs 1041.7±5.36µs +0.18%

Copy link

Benchmark for 1ee6652

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 249.2±1.15µs 248.8±1.41µs -0.16%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.06µs 15.6±0.03µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1354.3±10.27ns 1357.3±8.67ns +0.22%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 195.7±0.44µs 196.3±2.31µs +0.31%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1768.2±4.17µs 1741.1±30.73µs -1.53%
crc32_resource_id_creation/compute_from_bytes:large 86.9±0.35µs 86.8±0.34µs -0.12%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.01µs 5.4±0.01µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.7±0.45ns 92.7±0.27ns 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 65.3±0.31µs 64.9±1.82µs -0.61%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 956.4±5.62µs 952.2±5.34µs -0.44%
index_build/index_build/../test-assets/ 1041.2±5.24µs 1042.5±3.56µs +0.12%

@tareknaser
Copy link
Collaborator

Another weird timestamp related bug. It seems write_fs doesn't even change the modified timestamp in the file metadata.

Could you write a minimal test to reproduce it?
locally for me, all tests are passing at 4a13396

Copy link

Benchmark for 66d9bdd

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 251.1±0.69µs 249.6±1.91µs -0.60%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.04µs 15.6±0.09µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1349.2±3.14ns 1350.8±11.41ns +0.12%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 195.8±0.47µs 195.7±0.59µs -0.05%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1747.2±7.89µs 1742.8±19.18µs -0.25%
crc32_resource_id_creation/compute_from_bytes:large 86.8±0.32µs 86.7±0.61µs -0.12%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.05µs 5.4±0.03µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.7±0.40ns 92.7±0.47ns 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 65.3±0.57µs 65.3±0.44µs 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 955.9±7.30µs 966.9±6.73µs +1.15%
index_build/index_build/../test-assets/ 1041.7±6.43µs 1051.8±7.27µs +0.97%

Copy link

Benchmark for 1a602a7

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 249.3±2.52µs 252.4±12.70µs +1.24%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.13µs 15.7±0.17µs +0.64%
blake3_resource_id_creation/compute_from_bytes:small 1365.3±14.47ns 1364.8±1.88ns -0.04%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.5±2.06µs 196.0±0.44µs -0.25%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1745.8±10.03µs 1743.5±23.57µs -0.13%
crc32_resource_id_creation/compute_from_bytes:large 86.8±0.56µs 86.7±0.21µs -0.12%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.07µs 5.4±0.05µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.8±1.31ns 92.7±0.72ns -0.11%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 66.1±1.71µs 65.4±1.61µs -1.06%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 954.5±7.48µs 961.0±10.60µs +0.68%
index_build/index_build/../test-assets/ 1054.5±15.77µs 1040.2±5.65µs -1.36%

Copy link
Collaborator Author

@twitu twitu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an MRE in the failing test_file_metadata_timestamp_updated test

@tareknaser
Copy link
Collaborator

Added an MRE in the failing test_file_metadata_timestamp_updated test

It looks like this issue might be related to the file system's timestamp granularity. From what I've found, it seems to depend on the file system being used.

For Linux:

  • EXT3 supports timestamps with a precision up to one second.
  • EXT4 can support timestamps with a precision up to nanoseconds, but not always

I am not entirely sure about these numbers. I couldn't find a definitive resource to list them.

I've pushed a commit that adds a one-second delay after each file write to fix the tests. This seems to be enough to fix the tests. I also reviewed the Rust file system documentation and added a call to sync_all().

I’m wondering if users might run into similar issues with fs-storage. It might be worth adding a note about this in the FileStorage’s doc comment.

Copy link

Benchmark for aca4451

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 250.2±1.84µs 251.6±2.09µs +0.56%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.07µs 15.6±0.13µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1376.1±20.54ns 1365.4±2.45ns -0.78%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.3±0.66µs 196.0±1.85µs -0.15%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1741.9±4.77µs 1749.5±26.87µs +0.44%
crc32_resource_id_creation/compute_from_bytes:large 86.9±0.39µs 87.0±0.27µs +0.12%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.03µs 5.4±0.01µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 92.9±0.37ns 92.9±0.31ns 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 65.5±0.19µs 65.3±1.13µs -0.31%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 950.5±4.67µs 958.0±4.07µs +0.79%
index_build/index_build/../test-assets/ 1035.6±6.52µs 1038.1±3.50µs +0.24%

@twitu
Copy link
Collaborator Author

twitu commented Jun 14, 2024

It's unfortunate that the OS does not update the metadata fast enough for it to be useful for us. However, we can work around it by setting the modified time our self. With this logic we can set the modified timestamp consistently and the tests pass without needing a sleep.

        fs::create_dir_all(parent_dir)?;
        let mut file = File::create(&self.path)?;
        file.write_all(serde_json::to_string_pretty(&self.data)?.as_bytes())?;
        file.flush()?;

        let new_timestamp = SystemTime::now();
        file.set_modified(new_timestamp)?;
        file.sync_all()?;

Copy link

Benchmark for d50b7a3

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 248.1±0.67µs 250.1±1.11µs +0.81%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.18µs 15.6±0.14µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1348.8±6.12ns 1351.1±15.27ns +0.17%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 195.7±0.47µs 196.6±1.54µs +0.46%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1737.2±3.63µs 1748.5±21.43µs +0.65%
crc32_resource_id_creation/compute_from_bytes:large 86.6±0.16µs 86.6±0.29µs 0.00%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.04µs 5.4±0.03µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 93.0±0.82ns 93.0±0.28ns 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 64.9±0.50µs 65.5±1.47µs +0.92%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 953.1±3.16µs 956.9±3.17µs +0.40%
index_build/index_build/../test-assets/ 1037.2±3.24µs 1037.2±4.07µs 0.00%

Copy link

Benchmark for d86f413

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 250.5±1.06µs 250.2±0.88µs -0.12%
blake3_resource_id_creation/compute_from_bytes:medium 15.7±0.10µs 15.7±0.11µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1354.5±9.53ns 1347.1±2.42ns -0.55%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.1±0.43µs 196.3±0.64µs +0.10%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1735.2±5.13µs 1750.0±13.22µs +0.85%
crc32_resource_id_creation/compute_from_bytes:large 86.7±0.48µs 86.7±0.33µs 0.00%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.02µs 5.4±0.08µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 93.0±0.41ns 93.3±2.53ns +0.32%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 65.3±0.35µs 65.3±0.89µs 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 957.3±7.10µs 961.9±3.68µs +0.48%
index_build/index_build/../test-assets/ 1043.8±5.14µs 1038.3±5.20µs -0.53%

Copy link
Collaborator

@tareknaser tareknaser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making all the requested changes. I've added a few minor comments.

Once you rebase, we should be ready to merge.

fs-storage/src/file_storage.rs Outdated Show resolved Hide resolved
fs-storage/src/file_storage.rs Show resolved Hide resolved
fs-storage/src/file_storage.rs Outdated Show resolved Hide resolved
Copy link

Benchmark for 83e4f37

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 248.5±1.19µs 250.3±0.70µs +0.72%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.09µs 15.6±0.04µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1366.6±8.27ns 1364.9±36.24ns -0.12%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.5±0.51µs 196.8±0.52µs +0.15%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1738.5±8.73µs 1745.2±16.75µs +0.39%
crc32_resource_id_creation/compute_from_bytes:large 86.7±0.45µs 86.9±1.75µs +0.23%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.07µs 5.4±0.03µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 93.0±0.30ns 92.9±0.39ns -0.11%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 65.1±0.19µs 65.5±2.12µs +0.61%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 958.1±4.79µs 960.4±7.49µs +0.24%
index_build/index_build/../test-assets/ 1035.0±6.23µs 1036.0±11.22µs +0.10%

Copy link

Benchmark for 7c12c54

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 248.8±0.77µs 250.0±2.80µs +0.48%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.14µs 15.6±0.08µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1356.1±9.80ns 1354.2±9.60ns -0.14%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 198.4±0.40µs 195.7±0.60µs -1.36%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1747.0±9.22µs 1770.9±19.79µs +1.37%
crc32_resource_id_creation/compute_from_bytes:large 86.8±0.49µs 86.7±0.26µs -0.12%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.03µs 5.4±0.02µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 93.0±0.35ns 92.9±0.36ns -0.11%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 65.1±0.36µs 68.4±0.38µs +5.07%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 955.6±5.14µs 964.1±5.87µs +0.89%
index_build/index_build/../test-assets/ 1039.0±9.51µs 1053.1±3.46µs +1.36%

Copy link

Benchmark for a31c7a4

Click to view benchmark
Test Base PR %
blake3_resource_id_creation/compute_from_bytes:large 252.3±0.94µs 247.8±1.69µs -1.78%
blake3_resource_id_creation/compute_from_bytes:medium 15.6±0.07µs 15.6±0.09µs 0.00%
blake3_resource_id_creation/compute_from_bytes:small 1354.0±9.26ns 1373.0±7.76ns +1.40%
blake3_resource_id_creation/compute_from_path:../test-assets/lena.jpg 196.0±0.40µs 196.4±0.70µs +0.20%
blake3_resource_id_creation/compute_from_path:../test-assets/test.pdf 1737.4±4.48µs 1738.8±17.20µs +0.08%
crc32_resource_id_creation/compute_from_bytes:large 86.5±0.38µs 86.5±0.27µs 0.00%
crc32_resource_id_creation/compute_from_bytes:medium 5.4±0.02µs 5.4±0.06µs 0.00%
crc32_resource_id_creation/compute_from_bytes:small 93.0±0.31ns 93.0±0.30ns 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/lena.jpg 65.0±0.27µs 65.0±0.25µs 0.00%
crc32_resource_id_creation/compute_from_path:../test-assets/test.pdf 955.2±7.21µs 956.4±9.06µs +0.13%
index_build/index_build/../test-assets/ 1035.2±3.08µs 1037.8±16.52µs +0.25%

@twitu twitu merged commit 6c5f786 into main Jun 14, 2024
4 checks passed
@kirillt
Copy link
Member

kirillt commented Jun 15, 2024

Good job @twitu @Pushkarm029 @tareknaser, this new logic looks very robust!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fs-storage: Millisecond timestamps and proper rounding
4 participants