gazette/runtime: use HTTP/2 keep-alive intervals
HTTP/2 keep-alive sends a PING frame every interval, and fails the
connection if the peer doesn't respond in time. This verifies the
end-to-end health of the HTTP/2 transport and catches issues like
servers which have bound sockets but aren't actively listening.

Also use HTTP/2 keep-alive when connecting to local containers. We've
observed that `podman` can fail in ways that leave the reactor believing
it has an established connection to flow-connector-init, even though the
container has failed and the network namespace has been torn down.
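
The failure mode described above can be reproduced with the standard library alone. The following is a minimal sketch, not code from this commit: `probe` and `unaccepted_peer_demo` are hypothetical names, and the "ping" bytes stand in for an HTTP/2 PING frame. It opens a listener that never accepts, shows that the TCP connect still succeeds (so a connect timeout cannot catch the problem), and shows that only an application-level probe with its own deadline notices that nobody is answering.

```rust
use std::io::{ErrorKind, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::time::Duration;

/// Hypothetical application-level probe: send a few bytes and wait up to
/// `timeout` for any reply. Returns Ok(true) only if the peer answered.
fn probe(stream: &mut TcpStream, timeout: Duration) -> std::io::Result<bool> {
    stream.set_read_timeout(Some(timeout))?;
    stream.write_all(b"ping")?;
    let mut buf = [0u8; 4];
    match stream.read(&mut buf) {
        Ok(0) => Ok(false), // peer closed the connection
        Ok(_) => Ok(true),  // peer sent something back
        // A read timeout surfaces as WouldBlock or TimedOut depending on platform.
        Err(e) if matches!(e.kind(), ErrorKind::WouldBlock | ErrorKind::TimedOut) => Ok(false),
        Err(e) => Err(e),
    }
}

/// Connects to a listener that never calls `accept` and reports
/// (TCP connect succeeded, peer answered the probe).
fn unaccepted_peer_demo() -> std::io::Result<(bool, bool)> {
    // The OS completes TCP handshakes for this socket from its backlog,
    // even though no code ever accepts or reads from the connection.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    let mut stream = TcpStream::connect_timeout(&addr, Duration::from_secs(5))?;
    let connected = stream.peer_addr().is_ok();

    let answered = probe(&mut stream, Duration::from_millis(200))?;

    // Keep the listener alive until the probe is done, then release it.
    drop(listener);
    Ok((connected, answered))
}
```

Calling `unaccepted_peer_demo()` is expected to report a successful connect together with an unanswered probe, which is the gap that a periodic PING with a deadline is meant to close.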
jgraettinger committed Nov 5, 2024
1 parent acc461f commit dfe28b9
Showing 2 changed files with 4 additions and 4 deletions.
7 changes: 3 additions & 4 deletions crates/gazette/src/lib.rs
@@ -87,11 +87,10 @@ pub fn dial_channel(endpoint: &str) -> Result<tonic::transport::Channel> {
// Note this connect_timeout accounts only for TCP connection time and
// does not apply to time required for TLS or HTTP/2 transport start,
// which can block indefinitely if the server is bound but not listening.
// Callers MUST implement per-RPC timeouts if that's important.
// This timeout is only a best-effort sanity check.
.connect_timeout(Duration::from_secs(5))
.keep_alive_timeout(Duration::from_secs(120))
.keep_alive_while_idle(true)
// HTTP/2 keep-alive sends a PING frame every interval to confirm the
// health of the end-to-end HTTP/2 transport.
.http2_keep_alive_interval(std::time::Duration::from_secs(5))
.tls_config(
tonic::transport::ClientTlsConfig::new()
.with_native_roots()
1 change: 1 addition & 0 deletions crates/runtime/src/container.rs
@@ -227,6 +227,7 @@ pub async fn start(
let channel = tonic::transport::Endpoint::new(init_address.clone())
.expect("formatting endpoint address")
.connect_timeout(std::time::Duration::from_secs(5))
.http2_keep_alive_interval(std::time::Duration::from_secs(5))
.connect()
.await
.with_context(|| {
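
For a sense of how quickly a dead peer is noticed with a 5 second interval, a rough upper bound is one interval plus the time allowed for the PING acknowledgement. The sketch below is illustrative only: `worst_case_detection` is a hypothetical helper, and the 20 second acknowledgement timeout used in the usage note is an assumption about the transport's default when no timeout is set explicitly, not something stated in this commit.

```rust
use std::time::Duration;

/// Hypothetical helper: approximate upper bound on how long a dead peer can
/// go unnoticed. Up to one full `interval` may pass before the next PING is
/// sent, and the connection fails only after `ack_timeout` elapses without
/// an acknowledgement.
fn worst_case_detection(interval: Duration, ack_timeout: Duration) -> Duration {
    interval + ack_timeout
}
```

Under the assumed 20 second acknowledgement timeout, `worst_case_detection(Duration::from_secs(5), Duration::from_secs(20))` gives 25 seconds. This models timing only and ignores scheduling and network delay.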