proxy_h2: acknowledge downstream RST_STREAM while the upstream request-body write is blocked#911
proxy_h2: acknowledge downstream RST_STREAM while the upstream request-body write is blocked#911molocule wants to merge 8 commits into
Conversation
There was a problem hiding this comment.
I don't think we want to mark the entire upstream connection for shutdown on a downstream error. Sending a CANCEL seems fine on the impacted stream.
| } | ||
| // ignore downstream error so that upstream can continue to write cache | ||
| downstream_state.to_errored(); | ||
| warn!( |
There was a problem hiding this comment.
Let's use the same pattern of checking suppress_proxy_warn_log before emitting a warn here.
| /// For HTTP/2 this resolves when the client resets the stream (RST_STREAM) or the | ||
| /// stream errors. Other protocols have no out-of-band abort signal (detecting a | ||
| /// close would require consuming reads), so this future is pending forever for them. | ||
| pub async fn watch_h2_stream_reset(&mut self) -> Result<h2::Reason> { |
There was a problem hiding this comment.
I'd prefer us return the future here Option<Idle<'_>>, we then avoid the future and select for downstream HTTP/1.1 clients in proxy_h2. We also avoid the case where a caller mistakenly invokes this for something other than H2 outside of select and stays pending.
There was a problem hiding this comment.
ah good point, I just fixed this
|
Thank you, will update the label on this when it's merged internally. |
This PR closes #910
Problem
In
bidirection_down_to_up, the downstream-read select arm awaitssend_body_to2()inside the arm body. While that write is parked on upstream flow control, the downstreamRecvStreamis not polled, so a downstreamRST_STREAMis never observed.The task keeps holding the downstream stream handles, and since the
h2crate only returns a reset stream's connection-window credit once all handles drop (release_closed_capacityatref_count == 0), the cancelled stream pins its share of the downstream connection window for as long as the upstream write stays blocked (potentially indefinitely, because there is no default write timeout).This is dangerous because cancelling a stream immediately frees its stream slot (max_concurrent_streams) on both ends, so the connection appears to have plenty of capacity and the client (e.g. Envoy or any gRPC client) keeps multiplexing new requests onto it. But the shared connection send window does not come back: credit the client spent can only be restored by the receiver's connection-level WINDOW_UPDATEs, and pingora never sends them because the cancelled streams' unconsumed credit stays pinned to their zombie streams. Flow-control accounting is correct, but clients will not be able to send data upstream.
Changes
pingora-core: addSession::watch_h2_stream_reset()to the HTTP server session enum. For H2 it resolves when the client resets the stream, reusing the existingIdle/poll_resetfuture; for H1/other protocols it is pending forever (there is no out-of-band abort signal — detecting an H1 close would require destructively reading the socket).proxy_h2::send_body_to2: racewrite_bodyagainstwatch_h2_stream_reset(). A reset surfaces as anH2ErrorwithErrorSource::Downstream.bidirection_down_to_up: on a downstream-sourced error fromsend_body_to2, fail so the stream handles drop and the window credit is reclaimed immediately, except when a cache fill is in progress, in which case the downstream error is swallowed and the upstream response continues to be admitted to cache, mirroring the existing policy for downstream read errors.proxy_down_to_up: on downstream-sourced errors, sendRST_STREAM CANCELon the upstream stream so the upstream peer also releases its stream resources promptly (previously only done for upstream read timeouts).Notes
The race is placed around the raw
write_body(after the request-body filters) rather than around all ofsend_body_to2or at the call site. This is because it is the only spot where the borrows are disjoint (&mut SendStreamvs.&mut Session), and it means a reset only ever cancels anh2frame write on a stream that is being torn down, it never cancels user-defined body filters mid-await, which were never required to be cancel-safe.Testing
Two integration tests (plus an
h2clistener for the cache test service on :6154, since rawh2frame control is needed):test_h2_downstream_rst_while_upstream_write_blocked: upstream never grants window updates; client fills the upstream stream window until the proxy parks, then sendsRST_STREAM. Asserts the upstream stream is cancelled (RST_STREAM CANCELreceived) within a bounded time. Hangs without this fix.test_h2_downstream_rst_during_cache_fill: same blocked-write reset, but with a cacheable response mid-admission; the upstream withholds the response tail until after the reset. Asserts the fill still completes: a follow-up request is a cache hit with the full body.