linux/net/ipv4
Eric Dumazet aa251c8463 tcp: fix too slow tcp_rcvbuf_grow() action
While the blamed commits apparently avoided an overshoot,
they also limited how fast a sender can increase BDP at each RTT.

This is not exactly a revert, we do not add the 16 * tp->advmss
cushion we had, and we are keeping the out_of_order_queue
contribution.

Do the same in mptcp_rcvbuf_grow().

Tested:

emulated 50ms rtt (tcp_stream --tcp-tx-delay 50000), cubic 20 second flow.
net.ipv4.tcp_rmem set to "4096 131072 67000000"

perf record -a -e tcp:tcp_rcvbuf_grow sleep 20
perf script

Before:

We can see we fail to roughly double RWIN at each RTT.
Sender is RWIN limited while CWND is ramping up (before getting tcp_wmem
limited).

tcp_stream 33793 [010]  825.717525: tcp:tcp_rcvbuf_grow: time=100869 rtt_us=50428 copied=49152 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=103970 window_clamp=112128 rcv_wnd=106496
tcp_stream 33793 [010]  825.768966: tcp:tcp_rcvbuf_grow: time=51447 rtt_us=50362 copied=86016 inq=0 space=49152 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=107474 window_clamp=112128 rcv_wnd=106496
tcp_stream 33793 [010]  825.821539: tcp:tcp_rcvbuf_grow: time=52577 rtt_us=50243 copied=114688 inq=0 space=86016 ooo=0 scaling_ratio=219 rcvbuf=201096 rcv_ssthresh=167377 window_clamp=172031 rcv_wnd=167936
tcp_stream 33793 [010]  825.871781: tcp:tcp_rcvbuf_grow: time=50248 rtt_us=50237 copied=167936 inq=0 space=114688 ooo=0 scaling_ratio=219 rcvbuf=268129 rcv_ssthresh=224722 window_clamp=229375 rcv_wnd=225280
tcp_stream 33793 [010]  825.922475: tcp:tcp_rcvbuf_grow: time=50698 rtt_us=50183 copied=241664 inq=0 space=167936 ooo=0 scaling_ratio=219 rcvbuf=392617 rcv_ssthresh=331217 window_clamp=335871 rcv_wnd=323584
tcp_stream 33793 [010]  825.973326: tcp:tcp_rcvbuf_grow: time=50855 rtt_us=50213 copied=339968 inq=0 space=241664 ooo=0 scaling_ratio=219 rcvbuf=564986 rcv_ssthresh=478674 window_clamp=483327 rcv_wnd=462848
tcp_stream 33793 [010]  826.023970: tcp:tcp_rcvbuf_grow: time=50647 rtt_us=50248 copied=491520 inq=0 space=339968 ooo=0 scaling_ratio=219 rcvbuf=794811 rcv_ssthresh=671778 window_clamp=679935 rcv_wnd=651264
tcp_stream 33793 [010]  826.074612: tcp:tcp_rcvbuf_grow: time=50648 rtt_us=50227 copied=700416 inq=0 space=491520 ooo=0 scaling_ratio=219 rcvbuf=1149124 rcv_ssthresh=974881 window_clamp=983039 rcv_wnd=942080
tcp_stream 33793 [010]  826.125452: tcp:tcp_rcvbuf_grow: time=50845 rtt_us=50225 copied=987136 inq=8192 space=700416 ooo=0 scaling_ratio=219 rcvbuf=1637502 rcv_ssthresh=1392674 window_clamp=1400831 rcv_wnd=1339392
tcp_stream 33793 [010]  826.175698: tcp:tcp_rcvbuf_grow: time=50250 rtt_us=50198 copied=1347584 inq=0 space=978944 ooo=0 scaling_ratio=219 rcvbuf=2288672 rcv_ssthresh=1949729 window_clamp=1957887 rcv_wnd=1945600
tcp_stream 33793 [010]  826.225947: tcp:tcp_rcvbuf_grow: time=50252 rtt_us=50240 copied=1945600 inq=0 space=1347584 ooo=0 scaling_ratio=219 rcvbuf=3150516 rcv_ssthresh=2687010 window_clamp=2695167 rcv_wnd=2691072
tcp_stream 33793 [010]  826.276175: tcp:tcp_rcvbuf_grow: time=50233 rtt_us=50224 copied=2691072 inq=0 space=1945600 ooo=0 scaling_ratio=219 rcvbuf=4548617 rcv_ssthresh=3883041 window_clamp=3891199 rcv_wnd=3887104
tcp_stream 33793 [010]  826.326403: tcp:tcp_rcvbuf_grow: time=50233 rtt_us=50229 copied=3887104 inq=0 space=2691072 ooo=0 scaling_ratio=219 rcvbuf=6291456 rcv_ssthresh=5370482 window_clamp=5382144 rcv_wnd=5373952
tcp_stream 33793 [010]  826.376723: tcp:tcp_rcvbuf_grow: time=50323 rtt_us=50218 copied=5373952 inq=0 space=3887104 ooo=0 scaling_ratio=219 rcvbuf=9087658 rcv_ssthresh=7755537 window_clamp=7774207 rcv_wnd=7757824
tcp_stream 33793 [010]  826.426991: tcp:tcp_rcvbuf_grow: time=50274 rtt_us=50196 copied=7757824 inq=180224 space=5373952 ooo=0 scaling_ratio=219 rcvbuf=12563759 rcv_ssthresh=10729233 window_clamp=10747903 rcv_wnd=10575872
tcp_stream 33793 [010]  826.477229: tcp:tcp_rcvbuf_grow: time=50241 rtt_us=50078 copied=10731520 inq=180224 space=7577600 ooo=0 scaling_ratio=219 rcvbuf=17715667 rcv_ssthresh=15136529 window_clamp=15155199 rcv_wnd=14983168
tcp_stream 33793 [010]  826.527482: tcp:tcp_rcvbuf_grow: time=50258 rtt_us=50153 copied=15138816 inq=360448 space=10551296 ooo=0 scaling_ratio=219 rcvbuf=24667870 rcv_ssthresh=21073410 window_clamp=21102591 rcv_wnd=20766720
tcp_stream 33793 [010]  826.577712: tcp:tcp_rcvbuf_grow: time=50234 rtt_us=50228 copied=21073920 inq=0 space=14778368 ooo=0 scaling_ratio=219 rcvbuf=34550339 rcv_ssthresh=29517041 window_clamp=29556735 rcv_wnd=29519872
tcp_stream 33793 [010]  826.627982: tcp:tcp_rcvbuf_grow: time=50275 rtt_us=50220 copied=29519872 inq=540672 space=21073920 ooo=0 scaling_ratio=219 rcvbuf=49268707 rcv_ssthresh=42090625 window_clamp=42147839 rcv_wnd=41627648
tcp_stream 33793 [010]  826.678274: tcp:tcp_rcvbuf_grow: time=50296 rtt_us=50185 copied=42053632 inq=761856 space=28979200 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57238168 window_clamp=57316406 rcv_wnd=56606720
tcp_stream 33793 [010]  826.728627: tcp:tcp_rcvbuf_grow: time=50357 rtt_us=50128 copied=43913216 inq=851968 space=41291776 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56524800
tcp_stream 33793 [010]  827.131364: tcp:tcp_rcvbuf_grow: time=50239 rtt_us=50127 copied=43843584 inq=655360 space=43061248 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56696832
tcp_stream 33793 [010]  827.181613: tcp:tcp_rcvbuf_grow: time=50254 rtt_us=50115 copied=43843584 inq=524288 space=43188224 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56807424
tcp_stream 33793 [010]  828.339635: tcp:tcp_rcvbuf_grow: time=50283 rtt_us=50110 copied=43843584 inq=458752 space=43319296 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56864768
tcp_stream 33793 [010]  828.440350: tcp:tcp_rcvbuf_grow: time=50404 rtt_us=50099 copied=43843584 inq=393216 space=43384832 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=56922112
tcp_stream 33793 [010]  829.195106: tcp:tcp_rcvbuf_grow: time=50154 rtt_us=50077 copied=43843584 inq=196608 space=43450368 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57290728 window_clamp=57316406 rcv_wnd=57090048

After:

It takes few steps to increase RWIN. Sender is no longer RWIN limited.

tcp_stream 50826 [010]  935.634212: tcp:tcp_rcvbuf_grow: time=100788 rtt_us=50315 copied=49152 inq=0 space=40960 ooo=0 scaling_ratio=219 rcvbuf=131072 rcv_ssthresh=103970 window_clamp=112128 rcv_wnd=106496
tcp_stream 50826 [010]  935.685642: tcp:tcp_rcvbuf_grow: time=51437 rtt_us=50361 copied=86016 inq=0 space=49152 ooo=0 scaling_ratio=219 rcvbuf=160875 rcv_ssthresh=132969 window_clamp=137623 rcv_wnd=131072
tcp_stream 50826 [010]  935.738299: tcp:tcp_rcvbuf_grow: time=52660 rtt_us=50256 copied=139264 inq=0 space=86016 ooo=0 scaling_ratio=219 rcvbuf=502741 rcv_ssthresh=411497 window_clamp=430079 rcv_wnd=413696
tcp_stream 50826 [010]  935.788544: tcp:tcp_rcvbuf_grow: time=50249 rtt_us=50233 copied=307200 inq=0 space=139264 ooo=0 scaling_ratio=219 rcvbuf=728690 rcv_ssthresh=618717 window_clamp=623371 rcv_wnd=618496
tcp_stream 50826 [010]  935.838796: tcp:tcp_rcvbuf_grow: time=50258 rtt_us=50202 copied=618496 inq=0 space=307200 ooo=0 scaling_ratio=219 rcvbuf=2450338 rcv_ssthresh=1855709 window_clamp=2096187 rcv_wnd=1859584
tcp_stream 50826 [010]  935.889140: tcp:tcp_rcvbuf_grow: time=50347 rtt_us=50166 copied=1261568 inq=0 space=618496 ooo=0 scaling_ratio=219 rcvbuf=4376503 rcv_ssthresh=3725291 window_clamp=3743961 rcv_wnd=3706880
tcp_stream 50826 [010]  935.939435: tcp:tcp_rcvbuf_grow: time=50300 rtt_us=50185 copied=2478080 inq=24576 space=1261568 ooo=0 scaling_ratio=219 rcvbuf=9082648 rcv_ssthresh=7733731 window_clamp=7769921 rcv_wnd=7692288
tcp_stream 50826 [010]  935.989681: tcp:tcp_rcvbuf_grow: time=50251 rtt_us=50221 copied=4915200 inq=114688 space=2453504 ooo=0 scaling_ratio=219 rcvbuf=16574936 rcv_ssthresh=14108110 window_clamp=14179339 rcv_wnd=14024704
tcp_stream 50826 [010]  936.039967: tcp:tcp_rcvbuf_grow: time=50289 rtt_us=50279 copied=9830400 inq=114688 space=4800512 ooo=0 scaling_ratio=219 rcvbuf=32695050 rcv_ssthresh=27896187 window_clamp=27969593 rcv_wnd=27815936
tcp_stream 50826 [010]  936.090172: tcp:tcp_rcvbuf_grow: time=50211 rtt_us=50200 copied=19841024 inq=114688 space=9715712 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57245176 window_clamp=57316406 rcv_wnd=57163776
tcp_stream 50826 [010]  936.140430: tcp:tcp_rcvbuf_grow: time=50262 rtt_us=50197 copied=39501824 inq=114688 space=19726336 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57245176 window_clamp=57316406 rcv_wnd=57163776
tcp_stream 50826 [010]  936.190527: tcp:tcp_rcvbuf_grow: time=50101 rtt_us=50071 copied=43655168 inq=262144 space=39387136 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57032704
tcp_stream 50826 [010]  936.240719: tcp:tcp_rcvbuf_grow: time=50197 rtt_us=50057 copied=43843584 inq=262144 space=43393024 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57032704
tcp_stream 50826 [010]  936.341271: tcp:tcp_rcvbuf_grow: time=50297 rtt_us=50123 copied=43843584 inq=131072 space=43581440 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57147392
tcp_stream 50826 [010]  936.642503: tcp:tcp_rcvbuf_grow: time=50131 rtt_us=50084 copied=43843584 inq=0 space=43712512 ooo=0 scaling_ratio=219 rcvbuf=67000000 rcv_ssthresh=57259192 window_clamp=57316406 rcv_wnd=57262080

Fixes: 65c5287892 ("tcp: fix sk_rcvbuf overshoot")
Fixes: e118cdc34d ("mptcp: rcvbuf auto-tuning improvement")
Reported-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/589
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Link: https://patch.msgid.link/20251028-net-tcp-recv-autotune-v3-4-74b43ba4c84c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-29 17:30:19 -07:00
..
netfilter netfilter: nf_reject: don't reply to icmp error messages 2025-09-11 15:40:55 +02:00
af_inet.c net: gro: remove unnecessary df checks 2025-09-25 12:42:49 +02:00
ah4.c
arp.c net: set net.core.rmem_max and net.core.wmem_max to 4 MB 2025-08-20 19:35:00 -07:00
bpf_tcp_ca.c tcp: Pass flags to __tcp_send_ack 2025-03-17 13:56:38 +00:00
cipso_ipv4.c ipv4: icmp: Pass IPv4 control block structure as an argument to __icmp_send() 2025-09-11 12:22:38 +02:00
datagram.c net: dst: annotate data-races around dst->obsolete 2025-07-02 14:32:29 -07:00
devinet.c ipv4: Fix NULL vs error pointer check in inet_blackhole_dev_init() 2025-09-03 16:58:44 -07:00
esp4.c tcp: Don't pass hashinfo to socket lookup helpers. 2025-08-25 17:53:35 -07:00
esp4_offload.c
fib_frontend.c ipv4: Convert ->flowi4_tos to dscp_t. 2025-08-26 17:34:31 -07:00
fib_lookup.h
fib_notifier.c
fib_rules.c ipv4: Convert ->flowi4_tos to dscp_t. 2025-08-26 17:34:31 -07:00
fib_semantics.c net: s/dev_get_flags/netif_get_flags/ 2025-07-18 17:27:47 -07:00
fib_trie.c ipv4: fib: Move fib_valid_key_len() to rtm_to_fib_config(). 2025-03-03 15:04:11 -08:00
fou_bpf.c
fou_core.c net: gro: remove is_ipv6 from napi_gro_cb 2025-09-25 12:42:49 +02:00
fou_nl.c netlink: specs: fou: change local-v6/peer-v6 check 2025-09-03 15:16:49 -07:00
fou_nl.h
gre_demux.c net: ip_gre: Fix spelling mistake "demultiplexor" -> "demultiplexer" 2025-04-24 18:20:40 -07:00
gre_offload.c
icmp.c ipv4: icmp: Fix source IP derivation in presence of VRFs 2025-09-11 12:22:38 +02:00
igmp.c ipv4: adopt dst_dev, skb_dst_dev and skb_dst_dev_net[_rcu] 2025-07-02 14:32:30 -07:00
igmp_internal.h netlink: support dumping IPv4 multicast addresses 2025-02-11 11:26:53 +01:00
inet_connection_sock.c tcp: Update bind bucket state on port release 2025-09-23 10:12:15 +02:00
inet_diag.c inet_diag: avoid cache line misses in inet_diag_bc_sk() 2025-08-29 19:29:24 -07:00
inet_fragment.c net: replace use of system_wq with system_percpu_wq 2025-09-22 17:40:30 -07:00
inet_hashtables.c tcp: Update bind bucket state on port release 2025-09-23 10:12:15 +02:00
inet_timewait_sock.c Networking changes for 6.18. 2025-10-02 15:17:01 -07:00
inetpeer.c inetpeer: use EXPORT_IPV6_MOD[_GPL]() 2025-02-14 13:09:39 -08:00
ip_forward.c
ip_fragment.c ipv4: start using dst_dev_rcu() 2025-08-29 19:36:32 -07:00
ip_gre.c ipv4: Convert ->flowi4_tos to dscp_t. 2025-08-26 17:34:31 -07:00
ip_input.c net: ipv4: convert ip_rcv_options to drop reasons 2025-09-19 17:35:52 -07:00
ip_options.c net: Switch to skb_dstref_steal/skb_dstref_restore for ip_route_input callers 2025-08-19 17:54:35 -07:00
ip_output.c net: psp: don't assume reply skbs will have a socket 2025-10-03 10:23:50 -07:00
ip_sockglue.c
ip_tunnel.c net/ip6_tunnel: Prevent perpetual tunnel growth 2025-10-13 17:43:46 -07:00
ip_tunnel_core.c tunnels: reset the GSO metadata before reusing the skb 2025-09-09 13:03:33 +02:00
ip_vti.c ipv4: adopt dst_dev, skb_dst_dev and skb_dst_dev_net[_rcu] 2025-07-02 14:32:30 -07:00
ipcomp.c xfrm: delete x->tunnel as we delete x 2025-07-08 13:28:27 +02:00
ipconfig.c net: ipconfig: convert timeouts to secs_to_jiffies() 2025-07-09 19:25:01 -07:00
ipip.c ipv4: ip_tunnel: Convert ip_tunnel_delete_nets() callers to ->exit_rtnl(). 2025-04-14 17:08:42 -07:00
ipmr.c ipv4: start using dst_dev_rcu() 2025-08-29 19:36:32 -07:00
ipmr_base.c
Kconfig net: Retire DCCP socket. 2025-04-11 18:58:10 -07:00
Makefile
metrics.c
netfilter.c ipv4: Convert ->flowi4_tos to dscp_t. 2025-08-26 17:34:31 -07:00
netlink.c
nexthop.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-09-25 11:00:59 -07:00
ping.c inet: ping: use EXPORT_IPV6_MOD[_GPL]() 2025-09-01 13:15:14 -07:00
proc.c ipv4: snmp: do not use SNMP_MIB_SENTINEL anymore 2025-09-08 18:06:20 -07:00
protocol.c
raw.c inet: raw: add drop_counters to raw sockets 2025-08-28 13:14:50 +02:00
raw_diag.c inet_diag: change inet_diag_bc_sk() first argument 2025-08-29 19:29:24 -07:00
route.c ipv4: icmp: Pass IPv4 control block structure as an argument to __icmp_send() 2025-09-11 12:22:38 +02:00
syncookies.c tcp: accecn: AccECN negotiation 2025-09-18 08:47:51 +02:00
sysctl_net_ipv4.c tcp: accecn: AccECN option send control 2025-09-18 08:47:52 +02:00
tcp.c tcp: take care of zero tp->window_clamp in tcp_set_rcvlowat() 2025-10-06 13:08:48 -07:00
tcp_ao.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-09-18 11:26:06 -07:00
tcp_bbr.c
tcp_bic.c
tcp_bpf.c tcp_bpf: Call sk_msg_free() when tcp_bpf_send_verdict() fails to allocate psock->cork. 2025-09-10 06:53:56 -07:00
tcp_cdg.c tcp: cdg: remove redundant __GFP_NOWARN 2025-08-12 14:05:43 -07:00
tcp_cong.c
tcp_cubic.c
tcp_dctcp.c tcp: helpers for ECN mode handling 2025-03-17 13:54:11 +00:00
tcp_dctcp.h tcp: Pass flags to __tcp_send_ack 2025-03-17 13:56:38 +00:00
tcp_diag.c inet_diag: change inet_diag_bc_sk() first argument 2025-08-29 19:29:24 -07:00
tcp_fastopen.c tcp: use dst_dev_rcu() in tcp_fastopen_active_disable_ofo_check() 2025-08-29 19:36:32 -07:00
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c
tcp_input.c tcp: fix too slow tcp_rcvbuf_grow() action 2025-10-29 17:30:19 -07:00
tcp_ipv4.c net: psp: update the TCP MSS to reflect PSP packet overhead 2025-09-18 12:32:06 +02:00
tcp_lp.c
tcp_metrics.c Networking changes for 6.18. 2025-10-02 15:17:01 -07:00
tcp_minisocks.c tcp: add datapath logic for PSP with inline key exchange 2025-09-18 12:32:06 +02:00
tcp_nv.c
tcp_offload.c net: gso: restore ids of outer ip headers correctly 2025-09-25 12:42:49 +02:00
tcp_output.c tcp: fix tcp_tso_should_defer() vs large RTT 2025-10-14 12:21:48 +02:00
tcp_plb.c
tcp_rate.c
tcp_recovery.c tcp: update the outdated ref draft-ietf-tcpm-rack 2025-07-08 09:01:52 -07:00
tcp_scalable.c
tcp_sigpool.c
tcp_timer.c tcp: annotate data-races around icsk->icsk_probes_out 2025-08-25 16:20:59 -07:00
tcp_ulp.c
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c
tunnel4.c
udp.c udp: do not use skb_release_head_state() before skb_attempt_defer_free() 2025-10-16 16:03:07 +02:00
udp_bpf.c
udp_diag.c inet_diag: change inet_diag_bc_sk() first argument 2025-08-29 19:29:24 -07:00
udp_impl.h udp: move udp_memory_allocated into net_aligned_data 2025-07-02 14:22:02 -07:00
udp_offload.c net: gro: remove is_ipv6 from napi_gro_cb 2025-09-25 12:42:49 +02:00
udp_tunnel_core.c ipv4: Convert ->flowi4_tos to dscp_t. 2025-08-26 17:34:31 -07:00
udp_tunnel_nic.c udp_tunnel: use netdev_warn() instead of netdev_WARN() 2025-09-11 19:09:48 -07:00
udp_tunnel_stub.c
udplite.c udp: move udp_memory_allocated into net_aligned_data 2025-07-02 14:22:02 -07:00
xfrm4_input.c xfrm: Set transport header to fix UDP GRO handling 2025-07-02 09:19:56 +02:00
xfrm4_output.c ipv4: adopt dst_dev, skb_dst_dev and skb_dst_dev_net[_rcu] 2025-07-02 14:32:30 -07:00
xfrm4_policy.c ipv4: Convert ->flowi4_tos to dscp_t. 2025-08-26 17:34:31 -07:00
xfrm4_protocol.c
xfrm4_state.c
xfrm4_tunnel.c