Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Android is broken on master (GSO is enabled) #2042

Closed
stormshield-damiend opened this issue Nov 14, 2024 · 9 comments · Fixed by #2047
Closed

Android is broken on master (GSO is enabled) #2042

stormshield-damiend opened this issue Nov 14, 2024 · 9 comments · Fixed by #2047

Comments

@stormshield-damiend
Copy link
Contributor

stormshield-damiend commented Nov 14, 2024

The following commit adc4a06 (MacOs optimizations) introduced a side effect on android.

These two modifications lead to BATCH_SIZE being set to 32 on android platform, enabling GSO :

Tests are done on and Android 14, armv8, phone using Wifi with and MTU of 1500 :

  • With BATCH_SIZE = 1 (before linked commit) and debug_assertions -> all fine
  • With BATCH_SIZE = 32 (with the linked commit) without debug_assertions -> Quic connection have sporadic disconnections
  • With BATCH_SIZE = 32 (with the linked commit) and debug_assertions -> We reach the debug_assert in PacketBuilder::new()
    debug_assert!(max_size >= min_size);
  • With GSO disabled in TransportConfig using enable_segmentation_offload(false) -> everything is fine

We observed similar issue on an android 9 x86 emulator.

@djc
Copy link
Member

djc commented Nov 14, 2024

cc @larseggert / @mxinden -- have you deployed quinn-udp 0.5.7 to Nightly already?

@mxinden
Copy link
Collaborator

mxinden commented Nov 14, 2024

Taking a look now.

cc @larseggert / @mxinden -- have you deployed quinn-udp 0.5.7 to Nightly already?

We have not yet. Also, we don't do GSO (yet).

@mxinden
Copy link
Collaborator

mxinden commented Nov 14, 2024

The following commit adc4a06 (MacOs optimizations) introduced a side effect on android.

Yes, you are right. Sorry for not catching this earlier.

On a more general note, ignoring the fact that this should not have happened in adc4a06, what is the reason to disable GSO on Android?

I am not familiar with quinn-proto, e.g. the above debug_assert.

@Ralith
Copy link
Collaborator

Ralith commented Nov 16, 2024

what is the reason to disable GSO on Android?

I think this was an accident. Probably it didn't occur to me that android didn't count as linux for target_os purposes when I wrote the original code.

We reach the debug_assert in PacketBuilder::new()

Is this fixed by #2046?

@Ralith
Copy link
Collaborator

Ralith commented Nov 16, 2024

#2047 should get this stuff working as intended on non-Linux Unixes in general.

@thomaseizinger
Copy link
Contributor

I just tested latest main is still broken for me. Specifically, I am getting:

I/O error (os error 5)

Which then in turn triggers the following on the first packet:

11-18 13:22:22.241 3732 3850 E connlib : quinn_udp::imp: got transmit error, halting segmentation offload

@thomaseizinger
Copy link
Contributor

thomaseizinger commented Nov 18, 2024

I/O error (os error 5)

So interestingly enough, this only happens when my phone is on mobile data. When I am on WiFI, GSO and GRO work just fine. Additionally, halting GSO doesn't seem to have any impact in that case.

I have a suspicion that Android doesn't like to be told to "do GSO" even if the segment length is just 1. So perhaps we need a special-case to check if segment_length == payload_length and not set any GSO related flags in that case.

@thomaseizinger
Copy link
Contributor

I have a suspicion that Android doesn't like to be told to "do GSO" even if the segment length is just 1. So perhaps we need a special-case to check if segment_length == payload_length and not set any GSO related flags in that case.

So with #2050 applied, I am seeing the following:

  1. The newly added log when setting the initial socket option does not show up (meaning setting the initial socket option works).
  2. Sending individual packets works again.
  3. Attempting to send packets using GSO still fails but due to the new condition of not setting segment size if == contents.len(), resetting max_gso_segments actually works (assuming the calling code respects that value.

Here is a tidied up log excerpt from our application:

11-18 16:41:30.733 V wire::net::send: num_packets=1 segment_size=240
11-18 16:41:30.832 V wire::net::recv: num_packets=1 segment_size=28 trailing_bytes=0
11-18 16:41:31.172 V wire::net::send: num_packets=1 segment_size=240
11-18 16:41:31.769 V wire::net::recv: num_packets=1 segment_size=28 trailing_bytes=0
11-18 16:41:41.363 V wire::net::send: num_packets=2 segment_size=244
11-18 16:41:41.364 E quinn_udp::imp: got transmit error, halting segmentation offload
11-18 16:41:41.364 D connlib_client_shared::eventloop: Tunnel error: I/O error (os error 5)
11-18 16:41:41.365 V wire::net::send: num_packets=1 segment_size=256
11-18 16:41:41.365 V wire::net::send: num_packets=1 segment_size=256
11-18 16:41:41.366 V wire::net::send: num_packets=1 segment_size=88
11-18 16:41:41.367 V wire::net::send: num_packets=1 segment_size=88
11-18 16:41:43.109 V wire::net::recv: num_packets=2 segment_size=1312 trailing_bytes=386

As you can see, we first send and receive some smaller packets. Then, we try to send one with GSO, that fails and we then continue without GSO.

@stormshield-gt
Copy link
Contributor

We have the same kind of problem (connlib : quinn_udp::imp: got transmit error, halting segmentation offload) with GSO and post-quantum enabled on android with the latest master. I will try to do a repro case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants