fix: Avoid spurious PMTUD resets #2293

Open
wants to merge 10 commits into main

Conversation

larseggert
Collaborator

Previously, after PMTUD had completed, we could end up restarting PMTUD when the loss counters for packets larger than the current PMTU exceeded the limit. We now make sure that does not happen.
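
Below is a minimal Rust sketch of the idea. The names (`SearchState`, `on_packet_lost`, `MAX_LOST`, the per-entry loss counters) are illustrative and do not claim to match the actual neqo-transport internals: once the search has completed, losses of packets larger than the confirmed PMTU are no longer counted, so they cannot trip the loss limit and restart the search.

```rust
// Illustrative stand-in for the PMTUD loss accounting, not the actual
// neqo-transport types.
#[derive(PartialEq)]
enum SearchState {
    Searching,
    Completed,
}

const MAX_LOST: usize = 3; // hypothetical per-size loss limit

struct Pmtud {
    state: SearchState,
    mtu: usize,               // current (confirmed) PMTU
    search_table: Vec<usize>, // candidate MTUs, ascending
    lost_counts: Vec<usize>,  // losses observed per search_table entry
}

impl Pmtud {
    /// Record the loss of a packet of `len` bytes.
    fn on_packet_lost(&mut self, len: usize) {
        // The fix: after the search has completed, packets larger than the
        // confirmed PMTU are failed probes whose loss is expected. Counting
        // them would eventually exceed MAX_LOST and restart the search.
        if self.state == SearchState::Completed && len > self.mtu {
            return;
        }
        // Attribute the loss to the smallest candidate size that could have
        // carried the packet.
        if let Some(i) = self.search_table.iter().position(|&c| len <= c) {
            self.lost_counts[i] += 1;
            if self.lost_counts[i] >= MAX_LOST {
                self.restart();
            }
        }
    }

    fn restart(&mut self) {
        self.state = SearchState::Searching;
        self.lost_counts.iter_mut().for_each(|c| *c = 0);
    }
}
```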


codecov bot commented Dec 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.31%. Comparing base (7bbf900) to head (7b09086).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2293      +/-   ##
==========================================
+ Coverage   95.29%   95.31%   +0.02%     
==========================================
  Files         114      114              
  Lines       36850    36871      +21     
  Branches    36850    36871      +21     
==========================================
+ Hits        35117    35145      +28     
+ Misses       1727     1718       -9     
- Partials        6        8       +2     

☔ View full report in Codecov by Sentry.


github-actions bot commented Dec 19, 2024

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to 085fa62.

neqo-latest as client

neqo-latest as server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

larseggert changed the title from "fix: Avoid suprious PMTUD resets" to "fix: Avoid spurious PMTUD resets" on Dec 19, 2024
neqo-transport/src/path.rs (review thread, outdated, resolved)
neqo-transport/src/pmtud.rs (review thread, outdated, resolved)
neqo-transport/src/pmtud.rs (review thread, resolved)
neqo-transport/src/pmtud.rs (review thread, resolved)
larseggert and others added 3 commits December 27, 2024 10:58
Co-authored-by: Martin Thomson <[email protected]>
Signed-off-by: Lars Eggert <[email protected]>

github-actions bot commented Jan 13, 2025

Benchmark results

Performance differences relative to 085fa62.

decode 4096 bytes, mask ff: No change in performance detected.
       time:   [11.165 µs 11.204 µs 11.248 µs]
       change: [-0.4610% -0.0496% +0.3528%] (p = 0.81 > 0.05)

Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low severe
4 (4.00%) low mild
1 (1.00%) high mild
9 (9.00%) high severe

decode 1048576 bytes, mask ff: No change in performance detected.
       time:   [3.0176 ms 3.0273 ms 3.0384 ms]
       change: [-1.0225% -0.3356% +0.2533%] (p = 0.33 > 0.05)

Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) high mild
10 (10.00%) high severe

decode 4096 bytes, mask 7f: No change in performance detected.
       time:   [19.492 µs 19.527 µs 19.570 µs]
       change: [-0.3941% +0.2469% +1.3374%] (p = 0.67 > 0.05)

Found 16 outliers among 100 measurements (16.00%)
3 (3.00%) low mild
3 (3.00%) high mild
10 (10.00%) high severe

decode 1048576 bytes, mask 7f: No change in performance detected.
       time:   [5.1589 ms 5.1703 ms 5.1833 ms]
       change: [-1.3764% -0.4628% +0.1637%] (p = 0.30 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
13 (13.00%) high severe

decode 4096 bytes, mask 3f: No change in performance detected.
       time:   [5.5396 µs 5.5724 µs 5.6095 µs]
       change: [-0.4446% +0.1083% +0.7153%] (p = 0.72 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
3 (3.00%) high mild
10 (10.00%) high severe

decode 1048576 bytes, mask 3f: No change in performance detected.
       time:   [1.7580 ms 1.7608 ms 1.7651 ms]
       change: [-0.0183% +0.1477% +0.3879%] (p = 0.22 > 0.05)

Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) low mild
5 (5.00%) high severe

coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [98.827 ns 99.508 ns 100.51 ns]
       change: [-0.6682% +0.6242% +2.6596%] (p = 0.69 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
8 (8.00%) high mild
5 (5.00%) high severe

coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [116.49 ns 116.82 ns 117.18 ns]
       change: [-0.8637% -0.3494% +0.1265%] (p = 0.18 > 0.05)

Found 10 outliers among 100 measurements (10.00%)
10 (10.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [116.23 ns 116.65 ns 117.17 ns]
       change: [-0.4721% +0.3398% +1.2799%] (p = 0.49 > 0.05)

Found 20 outliers among 100 measurements (20.00%)
5 (5.00%) low severe
2 (2.00%) low mild
6 (6.00%) high mild
7 (7.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [97.156 ns 97.311 ns 97.479 ns]
       change: [-1.3302% -0.5288% +0.1523%] (p = 0.19 > 0.05)

Found 5 outliers among 100 measurements (5.00%)
2 (2.00%) high mild
3 (3.00%) high severe

RxStreamOrderer::inbound_frame(): No change in performance detected.
       time:   [111.25 ms 111.29 ms 111.34 ms]
       change: [-0.1634% -0.0048% +0.1115%] (p = 0.95 > 0.05)

Found 10 outliers among 100 measurements (10.00%)
3 (3.00%) low severe
1 (1.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe

SentPackets::take_ranges: No change in performance detected.
       time:   [5.5069 µs 5.6635 µs 5.8278 µs]
       change: [-3.1223% -0.2761% +2.5083%] (p = 0.85 > 0.05)

Found 7 outliers among 100 measurements (7.00%)
5 (5.00%) high mild
2 (2.00%) high severe

transfer/pacing-false/varying-seeds: Change within noise threshold.
       time:   [40.630 ms 40.725 ms 40.826 ms]
       change: [-2.9836% -2.6961% -2.3847%] (p = 0.00 < 0.05)

Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe

transfer/pacing-true/varying-seeds: Change within noise threshold.
       time:   [40.737 ms 40.819 ms 40.908 ms]
       change: [-3.2317% -2.9449% -2.6494%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

transfer/pacing-false/same-seed: Change within noise threshold.
       time:   [40.799 ms 40.859 ms 40.928 ms]
       change: [-2.9773% -2.7618% -2.5440%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

transfer/pacing-true/same-seed: Change within noise threshold.
       time:   [40.507 ms 40.573 ms 40.645 ms]
       change: [-2.4402% -2.2014% -1.9494%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe

1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: 💚 Performance has improved.
       time:   [878.83 ms 888.80 ms 899.20 ms]
       thrpt:  [111.21 MiB/s 112.51 MiB/s 113.79 MiB/s]
change:
       time:   [-5.4778% -3.9878% -2.3566%] (p = 0.00 < 0.05)
       thrpt:  [+2.4135% +4.1534% +5.7953%]

Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild

1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: Change within noise threshold.
       time:   [321.26 ms 323.27 ms 325.37 ms]
       thrpt:  [30.734 Kelem/s 30.934 Kelem/s 31.128 Kelem/s]
change:
       time:   [+0.4718% +1.4110% +2.3404%] (p = 0.00 < 0.05)
       thrpt:  [-2.2869% -1.3913% -0.4696%]

Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe

1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected.
       time:   [34.155 ms 34.341 ms 34.553 ms]
       thrpt:  [28.941  elem/s 29.119  elem/s 29.278  elem/s]
change:
       time:   [-0.6599% +0.0824% +0.9133%] (p = 0.84 > 0.05)
       thrpt:  [-0.9050% -0.0823% +0.6642%]

Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low mild
1 (1.00%) high mild
6 (6.00%) high severe

1-conn/1-100mb-resp/mtu-1504 (aka. Upload)/client: 💔 Performance has regressed.
       time:   [1.7401 s 1.7576 s 1.7748 s]
       thrpt:  [56.346 MiB/s 56.897 MiB/s 57.469 MiB/s]
change:
       time:   [+1.4209% +2.8458% +4.2099%] (p = 0.00 < 0.05)
       thrpt:  [-4.0398% -2.7671% -1.4009%]

Client/server transfer results

Transfer of 33554432 bytes over loopback.

| Client | Server | CC | Pacing | MTU | Mean [ms] | Min [ms] | Max [ms] |
|---|---|---|---|---|---|---|---|
| gquiche | gquiche | | | 1504 | 563.2 ± 78.7 | 504.3 | 703.2 |
| neqo | gquiche | reno | on | 1504 | 723.6 ± 19.6 | 705.4 | 756.3 |
| neqo | gquiche | reno | | 1504 | 755.4 ± 25.6 | 703.2 | 803.2 |
| neqo | gquiche | cubic | on | 1504 | 762.8 ± 77.3 | 706.9 | 977.2 |
| neqo | gquiche | cubic | | 1504 | 747.5 ± 67.8 | 706.9 | 930.2 |
| msquic | msquic | | | 1504 | 124.8 ± 53.7 | 95.0 | 317.2 |
| neqo | msquic | reno | on | 1504 | 260.2 ± 88.7 | 202.9 | 461.8 |
| neqo | msquic | reno | | 1504 | 212.2 ± 9.4 | 198.4 | 227.0 |
| neqo | msquic | cubic | on | 1504 | 258.5 ± 76.3 | 212.4 | 428.4 |
| neqo | msquic | cubic | | 1504 | 216.1 ± 14.1 | 201.1 | 239.3 |
| gquiche | neqo | reno | on | 1504 | 672.6 ± 87.4 | 527.0 | 792.6 |
| gquiche | neqo | reno | | 1504 | 675.1 ± 91.1 | 547.5 | 830.3 |
| gquiche | neqo | cubic | on | 1504 | 688.2 ± 95.3 | 535.7 | 854.1 |
| gquiche | neqo | cubic | | 1504 | 696.0 ± 78.4 | 556.4 | 811.6 |
| msquic | neqo | reno | on | 1504 | 460.3 ± 10.8 | 446.7 | 482.7 |
| msquic | neqo | reno | | 1504 | 473.8 ± 9.8 | 456.7 | 486.7 |
| msquic | neqo | cubic | on | 1504 | 517.9 ± 84.0 | 462.8 | 714.8 |
| msquic | neqo | cubic | | 1504 | 466.0 ± 9.4 | 449.6 | 478.1 |
| neqo | neqo | reno | on | 1504 | 448.0 ± 28.6 | 430.8 | 526.2 |
| neqo | neqo | reno | | 1504 | 456.9 ± 8.2 | 442.6 | 469.8 |
| neqo | neqo | cubic | on | 1504 | 497.1 ± 90.5 | 443.9 | 724.2 |
| neqo | neqo | cubic | | 1504 | 467.7 ± 11.8 | 447.4 | 490.6 |

⬇️ Download logs

Comment on lines +269 to +270
let largest_ok_idx = first_failed - 1;
let largest_ok_mtu = self.search_table[largest_ok_idx];
Collaborator


Unrelated: Do we ensure that first_failed is never 0 and thus we never index self.search_table with usize::MAX?

Is it that if 1280 is always lost, we would never make it to MAX_PROBES?

Collaborator Author


I've added a test for this.
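
For context on the question above, a hedged sketch of the underflow concern; only the `first_failed` and `search_table` names come from the quoted snippet, the helper itself is invented. If `first_failed` could ever be 0, `first_failed - 1` would panic in debug builds and wrap to `usize::MAX` in release builds, so a defensive variant might look like:

```rust
// Hypothetical helper, not the code under review.
fn largest_ok_mtu(search_table: &[usize], first_failed: usize) -> usize {
    match first_failed.checked_sub(1) {
        // Normal case: the largest candidate below the first failed probe.
        Some(idx) => search_table[idx],
        // first_failed == 0 would mean even the smallest (1280-byte) probe
        // failed; fall back to the minimum rather than indexing out of range.
        None => search_table[0],
    }
}
```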

neqo-transport/src/path.rs (review thread, outdated, resolved)
@larseggert
Collaborator Author

@codecov-ai-reviewer review

neqo-transport/src/pmtud.rs (review thread, outdated, resolved)
Labels: None yet
Projects: None yet
3 participants