tcp: register tcp.rcv_wnd_max and tcp.snd_mss_max as runtime-tunable sysctls #46

Open
doronz88 wants to merge 1 commit into ccie18643:master from doronz88:feature/tcp-throughput-sysctls

@doronz88

Two TCP policy values were effectively hard-coded, forcing applications that embed PyTCP to monkeypatch internals to tune throughput on fast or asymmetric paths:

  • The advertised receive-window ceiling was fixed at 65535 bytes (WindowState.rcv_wnd_max). A bulk inbound transfer is bound by window / RTT, so on a high bandwidth-delay-product path (fast link, tunnel) 64 KiB throttles the peer far below the link rate even though PyTCP already negotiates RFC 7323 window scaling.

  • The send-side MSS always tracked the egress interface MTU, with no way to emit smaller segments while still advertising a large receive MSS. Overlay/tunnel deployments whose host->peer path MTU is below the local interface MTU need exactly that asymmetry (and classical PMTUD cannot discover the smaller hop when it sits past a relay that drops ICMP PTB).
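
To put numbers on the window / RTT bound from the first point, here is a minimal arithmetic sketch. The function names and the 50 ms / 1 Gbit/s figures are illustrative assumptions, not values taken from PyTCP or from this PR.

```python
def window_limited_throughput_bps(window_bytes: int, rtt_ms: int) -> float:
    """Upper bound on throughput (bits/s) when at most one receive
    window can be in flight per round trip."""
    return window_bytes * 8 * 1000 / rtt_ms


def window_for_rate_bytes(rate_bps: int, rtt_ms: int) -> int:
    """Smallest window (bytes) able to sustain rate_bps at the given RTT,
    i.e. the bandwidth-delay product rounded up."""
    # Integer ceiling division keeps the result exact.
    return -(-rate_bps * rtt_ms // 8000)


if __name__ == "__main__":
    # Illustrative path: 50 ms RTT with the old 65535-byte ceiling.
    print(window_limited_throughput_bps(65535, 50) / 1e6, "Mbit/s")
    # Window needed to fill a 1 Gbit/s link at the same RTT.
    print(window_for_rate_bytes(1_000_000_000, 50), "bytes")
```

Under these assumed figures the 65535-byte ceiling limits the peer to roughly 10.5 Mbit/s, while filling the link would need a window of about 6.25 MB.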

Expose both as registered sysctls, consistent with the existing 'tcp.base_mss' / 'tcp.mtu_probing' knobs:

  • 'tcp.rcv_wnd_max' (flat; Linux net.ipv4.tcp_rmem parity) — default 65535, seeded into WindowState.rcv_wnd_max at session creation, so behaviour is unchanged until an operator raises it.

  • 'tcp.snd_mss_max' (per-interface, like 'tcp.base_mss') — default 0 (uncapped); a non-zero value clamps _mss_ceiling() last, bounding the segments we EMIT without lowering the advertised receive MSS. Floor 88 (Linux TCP_MIN_MSS); 0 reserved for "off".
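
The intended asymmetry can be sketched as follows. This is not the PR's code: `send_mss_ceiling` and `advertised_rcv_mss` are hypothetical stand-ins for `_mss_ceiling()` and the receive-MSS computation, and the peer-MSS input is an assumption about which other limits apply before the cap.

```python
def send_mss_ceiling(interface_mss: int, peer_mss: int, snd_mss_max: int = 0) -> int:
    """Hypothetical model of the send-side ceiling: apply the usual
    limits first, then the operator cap last."""
    ceiling = min(interface_mss, peer_mss)
    if snd_mss_max != 0:  # 0 means "uncapped"
        ceiling = min(ceiling, snd_mss_max)
    return ceiling


def advertised_rcv_mss(interface_mss: int) -> int:
    """Hypothetical model of the advertised receive MSS: it follows the
    interface ceiling and ignores snd_mss_max entirely."""
    return interface_mss


if __name__ == "__main__":
    # Illustrative values: 1500-byte interface MTU over IPv4 gives MSS 1460.
    print(send_mss_ceiling(1460, 1460))        # cap disabled
    print(send_mss_ceiling(1460, 1460, 1200))  # cap bounds emitted segments
    print(advertised_rcv_mss(1460))            # advertisement unchanged
```

Applying the cap last means it can only lower the emitted segment size, never raise it above what the interface or the peer allows.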

Both ride the 'sysctls={...}' bag in stack.init(); neither warrants an explicit kwarg yet.

Tests at tests/integration/protocols/tcp/test__tcp__sysctls.py pin registration, defaults, validator rejection (rcv_wnd_max must be non-zero; snd_mss_max must be 0 or >= 88) and the per-interface storage semantics; test__tcp__session__throughput_knobs.py pins the session behaviour (rcv_wnd_max seeds the window; snd_mss_max caps _mss_ceiling while leaving rcv_mss at the interface ceiling).
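
As a rough illustration of the validator rules those tests pin, the checks might look like the sketch below. The function names and the use of `ValueError` are assumptions for illustration, and rejecting negative values for rcv_wnd_max is an added assumption; the actual validators in the PR may be shaped differently.

```python
TCP_MIN_MSS = 88  # floor borrowed from Linux include/net/tcp.h


def validate_rcv_wnd_max(value: int) -> int:
    """Hypothetical validator: the receive-window ceiling must be non-zero.
    Negative values are also rejected here (an assumption of this sketch)."""
    if value <= 0:
        raise ValueError(f"tcp.rcv_wnd_max must be positive, got {value}")
    return value


def validate_snd_mss_max(value: int) -> int:
    """Hypothetical validator: 0 disables the cap; otherwise the value
    must be at least TCP_MIN_MSS."""
    if value != 0 and value < TCP_MIN_MSS:
        raise ValueError(
            f"tcp.snd_mss_max must be 0 or >= {TCP_MIN_MSS}, got {value}"
        )
    return value


if __name__ == "__main__":
    print(validate_rcv_wnd_max(65535))
    print(validate_snd_mss_max(0))
    print(validate_snd_mss_max(1200))
```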

Reference: Linux net.ipv4.tcp_rmem (receive-window max).
Reference: Linux include/net/tcp.h TCP_MIN_MSS=88.

@doronz88 force-pushed the feature/tcp-throughput-sysctls branch from 815fd47 to 85b2aec on June 20, 2026 16:20