fix: Avoid spurious PMTUD resets #2293

Open
wants to merge 10 commits into main

Conversation

larseggert
Collaborator

Previously, after PMTUD had completed, we could end up restarting PMTUD when the loss counters for packets larger than the current PMTU exceeded the limit. We now make sure that does not happen.
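
Below is a minimal Rust sketch of the idea. The names (`SearchState`, `on_packet_lost`, `MAX_LOST`, the per-entry loss counters) are illustrative and do not claim to match the actual neqo-transport internals: once the search has completed, losses of packets larger than the confirmed PMTU are no longer counted, so they cannot trip the loss limit and restart the search.

```rust
// Illustrative stand-in for the PMTUD loss accounting, not the actual
// neqo-transport types.
#[derive(PartialEq)]
enum SearchState {
    Searching,
    Completed,
}

const MAX_LOST: usize = 3; // hypothetical per-size loss limit

struct Pmtud {
    state: SearchState,
    mtu: usize,               // current (confirmed) PMTU
    search_table: Vec<usize>, // candidate MTUs, ascending
    lost_counts: Vec<usize>,  // losses observed per search_table entry
}

impl Pmtud {
    /// Record the loss of a packet of `len` bytes.
    fn on_packet_lost(&mut self, len: usize) {
        // The fix: after the search has completed, packets larger than the
        // confirmed PMTU are failed probes whose loss is expected. Counting
        // them would eventually exceed MAX_LOST and restart the search.
        if self.state == SearchState::Completed && len > self.mtu {
            return;
        }
        // Attribute the loss to the smallest candidate size that could have
        // carried the packet.
        if let Some(i) = self.search_table.iter().position(|&c| len <= c) {
            self.lost_counts[i] += 1;
            if self.lost_counts[i] >= MAX_LOST {
                self.restart();
            }
        }
    }

    fn restart(&mut self) {
        self.state = SearchState::Searching;
        self.lost_counts.iter_mut().for_each(|c| *c = 0);
    }
}
```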


codecov bot commented Dec 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.31%. Comparing base (7bbf900) to head (7b09086).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2293      +/-   ##
==========================================
+ Coverage   95.29%   95.31%   +0.02%     
==========================================
  Files         114      114              
  Lines       36850    36871      +21     
  Branches    36850    36871      +21     
==========================================
+ Hits        35117    35145      +28     
+ Misses       1727     1718       -9     
- Partials        6        8       +2     

☔ View full report in Codecov by Sentry.


github-actions bot commented Dec 19, 2024

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to 085fa62.

neqo-latest as client

neqo-latest as server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

larseggert changed the title from "fix: Avoid suprious PMTUD resets" to "fix: Avoid spurious PMTUD resets" on Dec 19, 2024
neqo-transport/src/path.rs (review thread, outdated, resolved)
neqo-transport/src/pmtud.rs (review thread, outdated, resolved)
neqo-transport/src/pmtud.rs (review thread, resolved)
neqo-transport/src/pmtud.rs (review thread, resolved)
larseggert and others added 3 commits December 27, 2024 10:58
Co-authored-by: Martin Thomson <[email protected]>
Signed-off-by: Lars Eggert <[email protected]>

github-actions bot commented Jan 13, 2025

Benchmark results

Performance differences relative to 085fa62.

decode 4096 bytes, mask ff: No change in performance detected.
       time:   [11.165 µs 11.204 µs 11.248 µs]
       change: [-0.4610% -0.0496% +0.3528%] (p = 0.81 > 0.05)

Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low severe
4 (4.00%) low mild
1 (1.00%) high mild
9 (9.00%) high severe

decode 1048576 bytes, mask ff: No change in performance detected.
       time:   [3.0176 ms 3.0273 ms 3.0384 ms]
       change: [-1.0225% -0.3356% +0.2533%] (p = 0.33 > 0.05)

Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) high mild
10 (10.00%) high severe

decode 4096 bytes, mask 7f: No change in performance detected.
       time:   [19.492 µs 19.527 µs 19.570 µs]
       change: [-0.3941% +0.2469% +1.3374%] (p = 0.67 > 0.05)

Found 16 outliers among 100 measurements (16.00%)
3 (3.00%) low mild
3 (3.00%) high mild
10 (10.00%) high severe

decode 1048576 bytes, mask 7f: No change in performance detected.
       time:   [5.1589 ms 5.1703 ms 5.1833 ms]
       change: [-1.3764% -0.4628% +0.1637%] (p = 0.30 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
13 (13.00%) high severe

decode 4096 bytes, mask 3f: No change in performance detected.
       time:   [5.5396 µs 5.5724 µs 5.6095 µs]
       change: [-0.4446% +0.1083% +0.7153%] (p = 0.72 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
3 (3.00%) high mild
10 (10.00%) high severe

decode 1048576 bytes, mask 3f: No change in performance detected.
       time:   [1.7580 ms 1.7608 ms 1.7651 ms]
       change: [-0.0183% +0.1477% +0.3879%] (p = 0.22 > 0.05)

Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) low mild
5 (5.00%) high severe

coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [98.827 ns 99.508 ns 100.51 ns]
       change: [-0.6682% +0.6242% +2.6596%] (p = 0.69 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
8 (8.00%) high mild
5 (5.00%) high severe

coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [116.49 ns 116.82 ns 117.18 ns]
       change: [-0.8637% -0.3494% +0.1265%] (p = 0.18 > 0.05)

Found 10 outliers among 100 measurements (10.00%)
10 (10.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [116.23 ns 116.65 ns 117.17 ns]
       change: [-0.4721% +0.3398% +1.2799%] (p = 0.49 > 0.05)

Found 20 outliers among 100 measurements (20.00%)
5 (5.00%) low severe
2 (2.00%) low mild
6 (6.00%) high mild
7 (7.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [97.156 ns 97.311 ns 97.479 ns]
       change: [-1.3302% -0.5288% +0.1523%] (p = 0.19 > 0.05)

Found 5 outliers among 100 measurements (5.00%)
2 (2.00%) high mild
3 (3.00%) high severe

RxStreamOrderer::inbound_frame(): No change in performance detected.
       time:   [111.25 ms 111.29 ms 111.34 ms]
       change: [-0.1634% -0.0048% +0.1115%] (p = 0.95 > 0.05)

Found 10 outliers among 100 measurements (10.00%)
3 (3.00%) low severe
1 (1.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe

SentPackets::take_ranges: No change in performance detected.
       time:   [5.5069 µs 5.6635 µs 5.8278 µs]
       change: [-3.1223% -0.2761% +2.5083%] (p = 0.85 > 0.05)

Found 7 outliers among 100 measurements (7.00%)
5 (5.00%) high mild
2 (2.00%) high severe

transfer/pacing-false/varying-seeds: Change within noise threshold.
       time:   [40.630 ms 40.725 ms 40.826 ms]
       change: [-2.9836% -2.6961% -2.3847%] (p = 0.00 < 0.05)

Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe

transfer/pacing-true/varying-seeds: Change within noise threshold.
       time:   [40.737 ms 40.819 ms 40.908 ms]
       change: [-3.2317% -2.9449% -2.6494%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

transfer/pacing-false/same-seed: Change within noise threshold.
       time:   [40.799 ms 40.859 ms 40.928 ms]
       change: [-2.9773% -2.7618% -2.5440%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

transfer/pacing-true/same-seed: Change within noise threshold.
       time:   [40.507 ms 40.573 ms 40.645 ms]
       change: [-2.4402% -2.2014% -1.9494%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: 💚 Performance has improved.
       time:   [878.83 ms 888.80 ms 899.20 ms]
       thrpt:  [111.21 MiB/s 112.51 MiB/s 113.79 MiB/s]
change:
       time:   [-5.4778% -3.9878% -2.3566%] (p = 0.00 < 0.05)
       thrpt:  [+2.4135% +4.1534% +5.7953%]

Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild

1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: Change within noise threshold.
       time:   [321.26 ms 323.27 ms 325.37 ms]
       thrpt:  [30.734 Kelem/s 30.934 Kelem/s 31.128 Kelem/s]
change:
       time:   [+0.4718% +1.4110% +2.3404%] (p = 0.00 < 0.05)
       thrpt:  [-2.2869% -1.3913% -0.4696%]

Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe

1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected.
       time:   [34.155 ms 34.341 ms 34.553 ms]
       thrpt:  [28.941  elem/s 29.119  elem/s 29.278  elem/s]
change:
       time:   [-0.6599% +0.0824% +0.9133%] (p = 0.84 > 0.05)
       thrpt:  [-0.9050% -0.0823% +0.6642%]

Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low mild
1 (1.00%) high mild
6 (6.00%) high severe

1-conn/1-100mb-resp/mtu-1504 (aka. Upload)/client: 💔 Performance has regressed.
       time:   [1.7401 s 1.7576 s 1.7748 s]
       thrpt:  [56.346 MiB/s 56.897 MiB/s 57.469 MiB/s]
change:
       time:   [+1.4209% +2.8458% +4.2099%] (p = 0.00 < 0.05)
       thrpt:  [-4.0398% -2.7671% -1.4009%]

Client/server transfer results

Transfer of 33554432 bytes over loopback.

| Client | Server | CC | Pacing | MTU | Mean [ms] | Min [ms] | Max [ms] |
|---|---|---|---|---|---|---|---|
| gquiche | gquiche | | | 1504 | 563.2 ± 78.7 | 504.3 | 703.2 |
| neqo | gquiche | reno | on | 1504 | 723.6 ± 19.6 | 705.4 | 756.3 |
| neqo | gquiche | reno | | 1504 | 755.4 ± 25.6 | 703.2 | 803.2 |
| neqo | gquiche | cubic | on | 1504 | 762.8 ± 77.3 | 706.9 | 977.2 |
| neqo | gquiche | cubic | | 1504 | 747.5 ± 67.8 | 706.9 | 930.2 |
| msquic | msquic | | | 1504 | 124.8 ± 53.7 | 95.0 | 317.2 |
| neqo | msquic | reno | on | 1504 | 260.2 ± 88.7 | 202.9 | 461.8 |
| neqo | msquic | reno | | 1504 | 212.2 ± 9.4 | 198.4 | 227.0 |
| neqo | msquic | cubic | on | 1504 | 258.5 ± 76.3 | 212.4 | 428.4 |
| neqo | msquic | cubic | | 1504 | 216.1 ± 14.1 | 201.1 | 239.3 |
| gquiche | neqo | reno | on | 1504 | 672.6 ± 87.4 | 527.0 | 792.6 |
| gquiche | neqo | reno | | 1504 | 675.1 ± 91.1 | 547.5 | 830.3 |
| gquiche | neqo | cubic | on | 1504 | 688.2 ± 95.3 | 535.7 | 854.1 |
| gquiche | neqo | cubic | | 1504 | 696.0 ± 78.4 | 556.4 | 811.6 |
| msquic | neqo | reno | on | 1504 | 460.3 ± 10.8 | 446.7 | 482.7 |
| msquic | neqo | reno | | 1504 | 473.8 ± 9.8 | 456.7 | 486.7 |
| msquic | neqo | cubic | on | 1504 | 517.9 ± 84.0 | 462.8 | 714.8 |
| msquic | neqo | cubic | | 1504 | 466.0 ± 9.4 | 449.6 | 478.1 |
| neqo | neqo | reno | on | 1504 | 448.0 ± 28.6 | 430.8 | 526.2 |
| neqo | neqo | reno | | 1504 | 456.9 ± 8.2 | 442.6 | 469.8 |
| neqo | neqo | cubic | on | 1504 | 497.1 ± 90.5 | 443.9 | 724.2 |
| neqo | neqo | cubic | | 1504 | 467.7 ± 11.8 | 447.4 | 490.6 |

⬇️ Download logs

Comment on lines +269 to +270
let largest_ok_idx = first_failed - 1;
let largest_ok_mtu = self.search_table[largest_ok_idx];
Collaborator


Unrelated: Do we ensure that first_failed is never 0 and thus we never index self.search_table with usize::MAX?

Is it that if 1280 is always lost, we would never make it to MAX_PROBES?

Collaborator Author


I've added a test for this.
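
For context on the question above, a hedged sketch of the underflow concern; only the `first_failed` and `search_table` names come from the quoted snippet, the helper itself is invented. If `first_failed` could ever be 0, `first_failed - 1` would panic in debug builds and wrap to `usize::MAX` in release builds, so a defensive variant might look like:

```rust
// Hypothetical helper, not the code under review.
fn largest_ok_mtu(search_table: &[usize], first_failed: usize) -> usize {
    match first_failed.checked_sub(1) {
        // Normal case: the largest candidate below the first failed probe.
        Some(idx) => search_table[idx],
        // first_failed == 0 would mean even the smallest (1280-byte) probe
        // failed; fall back to the minimum rather than indexing out of range.
        None => search_table[0],
    }
}
```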

neqo-transport/src/path.rs (review thread, outdated, resolved)
@larseggert
Collaborator Author

@codecov-ai-reviewer review

neqo-transport/src/pmtud.rs (review thread, outdated, resolved)
Labels: None yet
Projects: None yet
3 participants