|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5 |
|
| #
3ba07527 |
| 25-Feb-2025 |
Eric Dumazet <[email protected]> |
tcp: be less liberal in TSEcr received while in SYN_RECV state
Yong-Hao Zou mentioned that linux was not strict as other OS in 3WHS, for flows using TCP TS option (RFC 7323)
As hinted by an old com
tcp: be less liberal in TSEcr received while in SYN_RECV state
Yong-Hao Zou mentioned that linux was not strict as other OS in 3WHS, for flows using TCP TS option (RFC 7323)
As hinted by an old comment in tcp_check_req(), we can check the TSEcr value in the incoming packet corresponds to one of the SYNACK TSval values we have sent.
In this patch, I record the oldest and most recent values that SYNACK packets have used.
Send a challenge ACK if we receive a TSEcr outside of this range, and increase a new SNMP counter.
nstat -az | grep TSEcrRejected TcpExtTSEcrRejected 0 0.0
Due to TCP fastopen implementation, do not apply yet these checks for fastopen flows.
v2: No longer use req->num_timeout, but treq->snt_tsval_first to detect when first SYNACK is prepared. This means we make sure to not send an initial zero TSval. Make sure MPTCP and TCP selftests are passing. Change MIB name to TcpExtTSEcrRejected
v1: https://lore.kernel.org/netdev/CADVnQykD8i4ArpSZaPKaoNxLJ2if2ts9m4As+=Jvdkrgx1qMHw@mail.gmail.com/T/
Reported-by: Yong-Hao Zou <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Matthieu Baerts (NGI0) <[email protected]> Reviewed-by: Neal Cardwell <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc4, v6.14-rc3 |
|
| #
6dc4c252 |
| 12-Feb-2025 |
Eric Dumazet <[email protected]> |
tcp: use EXPORT_IPV6_MOD[_GPL]()
Use EXPORT_IPV6_MOD[_GPL]() for symbols that don't need to be exported unless CONFIG_IPV6=m
tcp_hashinfo and tcp_openreq_init_rwin() are no longer used from any mod
tcp: use EXPORT_IPV6_MOD[_GPL]()
Use EXPORT_IPV6_MOD[_GPL]() for symbols that don't need to be exported unless CONFIG_IPV6=m
tcp_hashinfo and tcp_openreq_init_rwin() are no longer used from any module anyway.
Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Mateusz Polchlopek <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5 |
|
| #
46a02aa3 |
| 17-Jun-2024 |
Yan Zhai <[email protected]> |
tcp: use sk_skb_reason_drop to free rx packets
Replace kfree_skb_reason with sk_skb_reason_drop and pass the receiving socket to the tracepoint.
Reported-by: kernel test robot <[email protected]> Close
tcp: use sk_skb_reason_drop to free rx packets
Replace kfree_skb_reason with sk_skb_reason_drop and pass the receiving socket to the tracepoint.
Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/r/[email protected]/ Signed-off-by: Yan Zhai <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3 |
|
| #
f410cbea |
| 04-Apr-2024 |
Eric Dumazet <[email protected]> |
tcp: annotate data-races around tp->window_clamp
tp->window_clamp can be read locklessly, add READ_ONCE() and WRITE_ONCE() annotations.
Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by
tcp: annotate data-races around tp->window_clamp
tp->window_clamp can be read locklessly, add READ_ONCE() and WRITE_ONCE() annotations.
Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Jason Xing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.9-rc2, v6.9-rc1 |
|
| #
956c0d61 |
| 15-Mar-2024 |
Kuniyuki Iwashima <[email protected]> |
tcp: Clear req->syncookie in reqsk_alloc().
syzkaller reported a read of uninit req->syncookie. [0]
Originally, req->syncookie was used only in tcp_conn_request() to indicate if we need to encode S
tcp: Clear req->syncookie in reqsk_alloc().
syzkaller reported a read of uninit req->syncookie. [0]
Originally, req->syncookie was used only in tcp_conn_request() to indicate if we need to encode SYN cookie in SYN+ACK, so the field remains uninitialised in other places.
The commit 695751e31a63 ("bpf: tcp: Handle BPF SYN Cookie in cookie_v[46]_check().") added another meaning in ACK path; req->syncookie is set true if SYN cookie is validated by BPF kfunc.
After the change, cookie_v[46]_check() always read req->syncookie, but it is not initialised in the normal SYN cookie case as reported by KMSAN.
Let's make sure we always initialise req->syncookie in reqsk_alloc().
[0]: BUG: KMSAN: uninit-value in cookie_v4_check+0x22b7/0x29e0 net/ipv4/syncookies.c:477 cookie_v4_check+0x22b7/0x29e0 net/ipv4/syncookies.c:477 tcp_v4_cookie_check net/ipv4/tcp_ipv4.c:1855 [inline] tcp_v4_do_rcv+0xb17/0x10b0 net/ipv4/tcp_ipv4.c:1914 tcp_v4_rcv+0x4ce4/0x5420 net/ipv4/tcp_ipv4.c:2322 ip_protocol_deliver_rcu+0x2a3/0x13d0 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x332/0x500 net/ipv4/ip_input.c:233 NF_HOOK include/linux/netfilter.h:314 [inline] ip_local_deliver+0x21f/0x490 net/ipv4/ip_input.c:254 dst_input include/net/dst.h:460 [inline] ip_rcv_finish+0x4a2/0x520 net/ipv4/ip_input.c:449 NF_HOOK include/linux/netfilter.h:314 [inline] ip_rcv+0xcd/0x380 net/ipv4/ip_input.c:569 __netif_receive_skb_one_core net/core/dev.c:5538 [inline] __netif_receive_skb+0x319/0x9e0 net/core/dev.c:5652 process_backlog+0x480/0x8b0 net/core/dev.c:5981 __napi_poll+0xe7/0x980 net/core/dev.c:6632 napi_poll net/core/dev.c:6701 [inline] net_rx_action+0x89d/0x1820 net/core/dev.c:6813 __do_softirq+0x1c0/0x7d7 kernel/softirq.c:554 do_softirq+0x9a/0x100 kernel/softirq.c:455 __local_bh_enable_ip+0x9f/0xb0 kernel/softirq.c:382 local_bh_enable include/linux/bottom_half.h:33 [inline] rcu_read_unlock_bh include/linux/rcupdate.h:820 [inline] __dev_queue_xmit+0x2776/0x52c0 net/core/dev.c:4362 dev_queue_xmit include/linux/netdevice.h:3091 [inline] neigh_hh_output include/net/neighbour.h:526 [inline] neigh_output include/net/neighbour.h:540 [inline] ip_finish_output2+0x187a/0x1b70 net/ipv4/ip_output.c:235 __ip_finish_output+0x287/0x810 ip_finish_output+0x4b/0x550 net/ipv4/ip_output.c:323 NF_HOOK_COND include/linux/netfilter.h:303 [inline] ip_output+0x15f/0x3f0 net/ipv4/ip_output.c:433 dst_output include/net/dst.h:450 [inline] ip_local_out net/ipv4/ip_output.c:129 [inline] __ip_queue_xmit+0x1e93/0x2030 net/ipv4/ip_output.c:535 ip_queue_xmit+0x60/0x80 net/ipv4/ip_output.c:549 __tcp_transmit_skb+0x3c70/0x4890 net/ipv4/tcp_output.c:1462 tcp_transmit_skb net/ipv4/tcp_output.c:1480 [inline] tcp_write_xmit+0x3ee1/0x8900 net/ipv4/tcp_output.c:2792 __tcp_push_pending_frames net/ipv4/tcp_output.c:2977 [inline] tcp_send_fin+0xa90/0x12e0 net/ipv4/tcp_output.c:3578 tcp_shutdown+0x198/0x1f0 net/ipv4/tcp.c:2716 inet_shutdown+0x33f/0x5b0 net/ipv4/af_inet.c:923 __sys_shutdown_sock net/socket.c:2425 [inline] __sys_shutdown net/socket.c:2437 [inline] __do_sys_shutdown net/socket.c:2445 [inline] __se_sys_shutdown+0x2a4/0x440 net/socket.c:2443 __x64_sys_shutdown+0x6c/0xa0 net/socket.c:2443 do_syscall_64+0xd5/0x1f0 entry_SYSCALL_64_after_hwframe+0x6d/0x75
Uninit was stored to memory at: reqsk_alloc include/net/request_sock.h:148 [inline] inet_reqsk_alloc+0x651/0x7a0 net/ipv4/tcp_input.c:6978 cookie_tcp_reqsk_alloc+0xd4/0x900 net/ipv4/syncookies.c:328 cookie_tcp_check net/ipv4/syncookies.c:388 [inline] cookie_v4_check+0x289f/0x29e0 net/ipv4/syncookies.c:420 tcp_v4_cookie_check net/ipv4/tcp_ipv4.c:1855 [inline] tcp_v4_do_rcv+0xb17/0x10b0 net/ipv4/tcp_ipv4.c:1914 tcp_v4_rcv+0x4ce4/0x5420 net/ipv4/tcp_ipv4.c:2322 ip_protocol_deliver_rcu+0x2a3/0x13d0 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x332/0x500 net/ipv4/ip_input.c:233 NF_HOOK include/linux/netfilter.h:314 [inline] ip_local_deliver+0x21f/0x490 net/ipv4/ip_input.c:254 dst_input include/net/dst.h:460 [inline] ip_rcv_finish+0x4a2/0x520 net/ipv4/ip_input.c:449 NF_HOOK include/linux/netfilter.h:314 [inline] ip_rcv+0xcd/0x380 net/ipv4/ip_input.c:569 __netif_receive_skb_one_core net/core/dev.c:5538 [inline] __netif_receive_skb+0x319/0x9e0 net/core/dev.c:5652 process_backlog+0x480/0x8b0 net/core/dev.c:5981 __napi_poll+0xe7/0x980 net/core/dev.c:6632 napi_poll net/core/dev.c:6701 [inline] net_rx_action+0x89d/0x1820 net/core/dev.c:6813 __do_softirq+0x1c0/0x7d7 kernel/softirq.c:554
Uninit was created at: __alloc_pages+0x9a7/0xe00 mm/page_alloc.c:4592 __alloc_pages_node include/linux/gfp.h:238 [inline] alloc_pages_node include/linux/gfp.h:261 [inline] alloc_slab_page mm/slub.c:2175 [inline] allocate_slab mm/slub.c:2338 [inline] new_slab+0x2de/0x1400 mm/slub.c:2391 ___slab_alloc+0x1184/0x33d0 mm/slub.c:3525 __slab_alloc mm/slub.c:3610 [inline] __slab_alloc_node mm/slub.c:3663 [inline] slab_alloc_node mm/slub.c:3835 [inline] kmem_cache_alloc+0x6d3/0xbe0 mm/slub.c:3852 reqsk_alloc include/net/request_sock.h:131 [inline] inet_reqsk_alloc+0x66/0x7a0 net/ipv4/tcp_input.c:6978 tcp_conn_request+0x484/0x44e0 net/ipv4/tcp_input.c:7135 tcp_v4_conn_request+0x16f/0x1d0 net/ipv4/tcp_ipv4.c:1716 tcp_rcv_state_process+0x2e5/0x4bb0 net/ipv4/tcp_input.c:6655 tcp_v4_do_rcv+0xbfd/0x10b0 net/ipv4/tcp_ipv4.c:1929 tcp_v4_rcv+0x4ce4/0x5420 net/ipv4/tcp_ipv4.c:2322 ip_protocol_deliver_rcu+0x2a3/0x13d0 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x332/0x500 net/ipv4/ip_input.c:233 NF_HOOK include/linux/netfilter.h:314 [inline] ip_local_deliver+0x21f/0x490 net/ipv4/ip_input.c:254 dst_input include/net/dst.h:460 [inline] ip_sublist_rcv_finish net/ipv4/ip_input.c:580 [inline] ip_list_rcv_finish net/ipv4/ip_input.c:631 [inline] ip_sublist_rcv+0x15f3/0x17f0 net/ipv4/ip_input.c:639 ip_list_rcv+0x9ef/0xa40 net/ipv4/ip_input.c:674 __netif_receive_skb_list_ptype net/core/dev.c:5581 [inline] __netif_receive_skb_list_core+0x15c5/0x1670 net/core/dev.c:5629 __netif_receive_skb_list net/core/dev.c:5681 [inline] netif_receive_skb_list_internal+0x106c/0x16f0 net/core/dev.c:5773 gro_normal_list include/net/gro.h:438 [inline] napi_complete_done+0x425/0x880 net/core/dev.c:6113 virtqueue_napi_complete drivers/net/virtio_net.c:465 [inline] virtnet_poll+0x149d/0x2240 drivers/net/virtio_net.c:2211 __napi_poll+0xe7/0x980 net/core/dev.c:6632 napi_poll net/core/dev.c:6701 [inline] net_rx_action+0x89d/0x1820 net/core/dev.c:6813 __do_softirq+0x1c0/0x7d7 kernel/softirq.c:554
CPU: 0 PID: 16792 Comm: syz-executor.2 Not tainted 6.8.0-syzkaller-05562-g61387b8dcf1d #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
Fixes: 695751e31a63 ("bpf: tcp: Handle BPF SYN Cookie in cookie_v[46]_check().") Reported-by: syzkaller <[email protected]> Reported-by: Eric Dumazet <[email protected]> Closes: https://lore.kernel.org/bpf/CANn89iKdN9c+C_2JAUbc+VY3DDQjAQukMtiBbormAmAk9CdvQA@mail.gmail.com/ Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.8, v6.8-rc7 |
|
| #
a4a69a37 |
| 26-Feb-2024 |
Jason Xing <[email protected]> |
tcp: use drop reasons in cookie check for ipv4
Now it's time to use the prepared definitions to refine this part. Four reasons used might enough for now, I think.
Signed-off-by: Jason Xing <kernelx
tcp: use drop reasons in cookie check for ipv4
Now it's time to use the prepared definitions to refine this part. Four reasons used might enough for now, I think.
Signed-off-by: Jason Xing <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Reviewed-by: David Ahern <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
65be4393 |
| 26-Feb-2024 |
Jason Xing <[email protected]> |
tcp: directly drop skb in cookie check for ipv4
Only move the skb drop from tcp_v4_do_rcv() to cookie_v4_check() itself, no other changes made. It can help us refine the specific drop reasons later.
tcp: directly drop skb in cookie check for ipv4
Only move the skb drop from tcp_v4_do_rcv() to cookie_v4_check() itself, no other changes made. It can help us refine the specific drop reasons later.
Signed-off-by: Jason Xing <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1 |
|
| #
695751e3 |
| 15-Jan-2024 |
Kuniyuki Iwashima <[email protected]> |
bpf: tcp: Handle BPF SYN Cookie in cookie_v[46]_check().
We will support arbitrary SYN Cookie with BPF in the following patch.
If BPF prog validates ACK and kfunc allocates a reqsk, it will be carr
bpf: tcp: Handle BPF SYN Cookie in cookie_v[46]_check().
We will support arbitrary SYN Cookie with BPF in the following patch.
If BPF prog validates ACK and kfunc allocates a reqsk, it will be carried to cookie_[46]_check() as skb->sk. If skb->sk is not NULL, we call cookie_bpf_check().
Then, we clear skb->sk and skb->destructor, which are needed not to hold refcnt for reqsk and the listener. See the following patch for details.
After that, we finish initialisation for the remaining fields with cookie_tcp_reqsk_init().
Note that the server side WScale is set only for non-BPF SYN Cookie.
Signed-off-by: Kuniyuki Iwashima <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
| #
b18afb6f |
| 15-Jan-2024 |
Kuniyuki Iwashima <[email protected]> |
tcp: Move tcp_ns_to_ts() to tcp.h
We will support arbitrary SYN Cookie with BPF.
When BPF prog validates ACK and kfunc allocates a reqsk, we need to call tcp_ns_to_ts() to calculate an offset of TS
tcp: Move tcp_ns_to_ts() to tcp.h
We will support arbitrary SYN Cookie with BPF.
When BPF prog validates ACK and kfunc allocates a reqsk, we need to call tcp_ns_to_ts() to calculate an offset of TSval for later use:
time t0 : Send SYN+ACK -> tsval = Initial TSval (Random Number)
t1 : Recv ACK of 3WHS -> tsoff = TSecr - tcp_ns_to_ts(usec_ts_ok, tcp_clock_ns()) = Initial TSval - t1
t2 : Send ACK -> tsval = t2 + tsoff = Initial TSval + (t2 - t1) = Initial TSval + Time Delta (x)
(x) Note that the time delta does not include the initial RTT from t0 to t1.
Let's move tcp_ns_to_ts() to tcp.h.
Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
show more ...
|
|
Revision tags: v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4 |
|
| #
8e7bab6b |
| 29-Nov-2023 |
Kuniyuki Iwashima <[email protected]> |
tcp: Factorise cookie-dependent fields initialisation in cookie_v[46]_check()
We will support arbitrary SYN Cookie with BPF, and then kfunc at TC will preallocate reqsk and initialise some fields th
tcp: Factorise cookie-dependent fields initialisation in cookie_v[46]_check()
We will support arbitrary SYN Cookie with BPF, and then kfunc at TC will preallocate reqsk and initialise some fields that should not be overwritten later by cookie_v[46]_check().
To simplify the flow in cookie_v[46]_check(), we move such fields' initialisation to cookie_tcp_reqsk_alloc() and factorise non-BPF SYN Cookie handling into cookie_tcp_check(), where we validate the cookie and allocate reqsk, as done by kfunc later.
Note that we set ireq->ecn_ok in two steps, the latter of which will be shared by the BPF case. As cookie_ecn_ok() is one-liner, now it's inlined.
Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
de5626b9 |
| 29-Nov-2023 |
Kuniyuki Iwashima <[email protected]> |
tcp: Factorise cookie-independent fields initialisation in cookie_v[46]_check().
We will support arbitrary SYN Cookie with BPF, and then some reqsk fields are initialised in kfunc, and others are do
tcp: Factorise cookie-independent fields initialisation in cookie_v[46]_check().
We will support arbitrary SYN Cookie with BPF, and then some reqsk fields are initialised in kfunc, and others are done in cookie_v[46]_check().
This patch factorises the common part as cookie_tcp_reqsk_init() and calls it in cookie_tcp_reqsk_alloc() to minimise the discrepancy between cookie_v[46]_check().
Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
7b0f570f |
| 29-Nov-2023 |
Kuniyuki Iwashima <[email protected]> |
tcp: Move TCP-AO bits from cookie_v[46]_check() to tcp_ao_syncookie().
We initialise treq->af_specific in cookie_tcp_reqsk_alloc() so that we can look up a key later in tcp_create_openreq_child().
tcp: Move TCP-AO bits from cookie_v[46]_check() to tcp_ao_syncookie().
We initialise treq->af_specific in cookie_tcp_reqsk_alloc() so that we can look up a key later in tcp_create_openreq_child().
Initially, that change was added for MD5 by commit ba5a4fdd63ae ("tcp: make sure treq->af_specific is initialized"), but it has not been used since commit d0f2b7a9ca0a ("tcp: Disable header prediction for MD5 flow.").
Now, treq->af_specific is used only by TCP-AO, so, we can move that initialisation into tcp_ao_syncookie().
In addition to that, l3index in cookie_v[46]_check() is only used for tcp_ao_syncookie(), so let's move it as well.
While at it, we move down tcp_ao_syncookie() in cookie_v4_check() so that it will be called after security_inet_conn_request() to make functions order consistent with cookie_v6_check().
Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
efce3d1f |
| 29-Nov-2023 |
Kuniyuki Iwashima <[email protected]> |
tcp: Don't initialise tp->tsoffset in tcp_get_cookie_sock().
When we create a full socket from SYN Cookie, we initialise tcp_sk(sk)->tsoffset redundantly in tcp_get_cookie_sock() as the field is inh
tcp: Don't initialise tp->tsoffset in tcp_get_cookie_sock().
When we create a full socket from SYN Cookie, we initialise tcp_sk(sk)->tsoffset redundantly in tcp_get_cookie_sock() as the field is inherited from tcp_rsk(req)->ts_off.
cookie_v[46]_check |- treq->ts_off = 0 `- tcp_get_cookie_sock |- tcp_v[46]_syn_recv_sock | `- tcp_create_openreq_child | `- newtp->tsoffset = treq->ts_off `- tcp_sk(child)->tsoffset = tsoff
Let's initialise tcp_rsk(req)->ts_off with the correct offset and remove the second initialisation of tcp_sk(sk)->tsoffset.
Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
7577bc82 |
| 29-Nov-2023 |
Kuniyuki Iwashima <[email protected]> |
tcp: Don't pass cookie to __cookie_v[46]_check().
tcp_hdr(skb) and SYN Cookie are passed to __cookie_v[46]_check(), but none of the callers passes cookie other than ntohl(th->ack_seq) - 1.
Let's fe
tcp: Don't pass cookie to __cookie_v[46]_check().
tcp_hdr(skb) and SYN Cookie are passed to __cookie_v[46]_check(), but none of the callers passes cookie other than ntohl(th->ack_seq) - 1.
Let's fetch it in __cookie_v[46]_check() instead of passing the cookie over and over.
Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
50468cdd |
| 29-Nov-2023 |
Kuniyuki Iwashima <[email protected]> |
tcp: Clean up goto labels in cookie_v[46]_check().
We will support arbitrary SYN Cookie with BPF, and then reqsk will be preallocated before cookie_v[46]_check().
Depending on how validation fails,
tcp: Clean up goto labels in cookie_v[46]_check().
We will support arbitrary SYN Cookie with BPF, and then reqsk will be preallocated before cookie_v[46]_check().
Depending on how validation fails, we send RST or just drop skb.
To make the error handling easier, let's clean up goto labels.
Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
45c28509 |
| 29-Nov-2023 |
Kuniyuki Iwashima <[email protected]> |
tcp: Cache sock_net(sk) in cookie_v[46]_check().
sock_net(sk) is used repeatedly in cookie_v[46]_check(). Let's cache it in a variable.
Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed
tcp: Cache sock_net(sk) in cookie_v[46]_check().
sock_net(sk) is used repeatedly in cookie_v[46]_check(). Let's cache it in a variable.
Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
34efc9cf |
| 29-Nov-2023 |
Kuniyuki Iwashima <[email protected]> |
tcp: Clean up reverse xmas tree in cookie_v[46]_check().
We will grow and cut the xmas tree in cookie_v[46]_check(). This patch cleans it up to make later patches tidy.
Signed-off-by: Kuniyuki Iwas
tcp: Clean up reverse xmas tree in cookie_v[46]_check().
We will grow and cut the xmas tree in cookie_v[46]_check(). This patch cleans it up to make later patches tidy.
Signed-off-by: Kuniyuki Iwashima <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.7-rc3, v6.7-rc2, v6.7-rc1 |
|
| #
cdbab623 |
| 31-Oct-2023 |
Eric Dumazet <[email protected]> |
tcp: fix fastopen code vs usec TS
After blamed commit, TFO client-ack-dropped-then-recovery-ms-timestamps packetdrill test failed.
David Morley and Neal Cardwell started investigating and Neal poin
tcp: fix fastopen code vs usec TS
After blamed commit, TFO client-ack-dropped-then-recovery-ms-timestamps packetdrill test failed.
David Morley and Neal Cardwell started investigating and Neal pointed that we had :
tcp_conn_request() tcp_try_fastopen() -> tcp_fastopen_create_child -> child = inet_csk(sk)->icsk_af_ops->syn_recv_sock() -> tcp_create_openreq_child() -> copy req_usec_ts from req: newtp->tcp_usec_ts = treq->req_usec_ts; // now the new TFO server socket always does usec TS, no matter // what the route options are... send_synack() -> tcp_make_synack() // disable tcp_rsk(req)->req_usec_ts if route option is not present: if (tcp_rsk(req)->req_usec_ts < 0) tcp_rsk(req)->req_usec_ts = dst_tcp_usec_ts(dst);
tcp_conn_request() has the initial dst, we can initialize tcp_rsk(req)->req_usec_ts there instead of later in send_synack();
This means tcp_rsk(req)->req_usec_ts can be a boolean.
Many thanks to David an Neal for their help.
Fixes: 614e8316aa4c ("tcp: add support for usec resolution in TCP TS values") Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-lkp/[email protected] Suggested-by: Neal Cardwell <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: David Morley <[email protected]> Acked-by: Neal Cardwell <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.6 |
|
| #
248411b8 |
| 23-Oct-2023 |
Dmitry Safonov <[email protected]> |
net/tcp: Wire up l3index to TCP-AO
Similarly how TCP_MD5SIG_FLAG_IFINDEX works for TCP-MD5, TCP_AO_KEYF_IFINDEX is an AO-key flag that binds that MKT to a specified by L3 ifinndex. Similarly, withou
net/tcp: Wire up l3index to TCP-AO
Similarly how TCP_MD5SIG_FLAG_IFINDEX works for TCP-MD5, TCP_AO_KEYF_IFINDEX is an AO-key flag that binds that MKT to a specified by L3 ifinndex. Similarly, without this flag the key will work in the default VRF l3index = 0 for connections. To prevent AO-keys from overlapping, it's restricted to add key B for a socket that has key A, which have the same sndid/rcvid and one of the following is true: - !(A.keyflags & TCP_AO_KEYF_IFINDEX) or !(B.keyflags & TCP_AO_KEYF_IFINDEX) so that any key is non-bound to a VRF - A.l3index == B.l3index both want to work for the same VRF
Additionally, it's restricted to match TCP-MD5 keys for the same peer the following way: |--------------|--------------------|----------------|---------------| | | MD5 key without | MD5 key | MD5 key | | | l3index | l3index=0 | l3index=N | |--------------|--------------------|----------------|---------------| | TCP-AO key | | | | | without | reject | reject | reject | | l3index | | | | |--------------|--------------------|----------------|---------------| | TCP-AO key | | | | | l3index=0 | reject | reject | allow | |--------------|--------------------|----------------|---------------| | TCP-AO key | | | | | l3index=N | reject | allow | reject | |--------------|--------------------|----------------|---------------|
This is done with the help of tcp_md5_do_lookup_any_l3index() to reject adding AO key without TCP_AO_KEYF_IFINDEX if there's TCP-MD5 in any VRF. This is important for case where sysctl_tcp_l3mdev_accept = 1 Similarly, for TCP-AO lookups tcp_ao_do_lookup() may be used with l3index < 0, so that __tcp_ao_key_cmp() will match TCP-AO key in any VRF.
Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
06b22ef2 |
| 23-Oct-2023 |
Dmitry Safonov <[email protected]> |
net/tcp: Wire TCP-AO to request sockets
Now when the new request socket is created from the listening socket, it's recorded what MKT was used by the peer. tcp_rsk_used_ao() is a new helper for check
net/tcp: Wire TCP-AO to request sockets
Now when the new request socket is created from the listening socket, it's recorded what MKT was used by the peer. tcp_rsk_used_ao() is a new helper for checking if TCP-AO option was used to create the request socket. tcp_ao_copy_all_matching() will copy all keys that match the peer on the request socket, as well as preparing them for the usage (creating traffic keys).
Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.6-rc7 |
|
| #
614e8316 |
| 20-Oct-2023 |
Eric Dumazet <[email protected]> |
tcp: add support for usec resolution in TCP TS values
Back in 2015, Van Jacobson suggested to use usec resolution in TCP TS values. This has been implemented in our private kernels.
Goals were :
1
tcp: add support for usec resolution in TCP TS values
Back in 2015, Van Jacobson suggested to use usec resolution in TCP TS values. This has been implemented in our private kernels.
Goals were :
1) better observability of delays in networking stacks. 2) better disambiguation of events based on TSval/ecr values. 3) building block for congestion control modules needing usec resolution.
Back then we implemented a schem based on private SYN options to negotiate the feature.
For upstream submission, we chose to use a route attribute, because this feature is probably going to be used in private networks [1] [2].
ip route add 10/8 ... features tcp_usec_ts
Note that RFC 7323 recommends a "timestamp clock frequency in the range 1 ms to 1 sec per tick.", but also mentions "the maximum acceptable clock frequency is one tick every 59 ns."
[1] Unfortunately RFC 7323 5.5 (Outdated Timestamps) suggests to invalidate TS.Recent values after a flow was idle for more than 24 days. This is the part making usec_ts a problem for peers following this recommendation for long living idle flows.
[2] Attempts to standardize usec ts went nowhere:
https://www.ietf.org/proceedings/97/slides/slides-97-tcpm-tcp-options-for-low-latency-00.pdf https://datatracker.ietf.org/doc/draft-wang-tcpm-low-latency-opt/
Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
003e07a1 |
| 20-Oct-2023 |
Eric Dumazet <[email protected]> |
tcp: move tcp_ns_to_ts() to net/ipv4/syncookies.c
tcp_ns_to_ts() is only used once from cookie_init_timestamp().
Also add the 'bool usec_ts' parameter to enable usec TS later.
Signed-off-by: Eric
tcp: move tcp_ns_to_ts() to net/ipv4/syncookies.c
tcp_ns_to_ts() is only used once from cookie_init_timestamp().
Also add the 'bool usec_ts' parameter to enable usec TS later.
Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
73ed8e03 |
| 20-Oct-2023 |
Eric Dumazet <[email protected]> |
tcp: fix cookie_init_timestamp() overflows
cookie_init_timestamp() is supposed to return a 64bit timestamp suitable for both TSval determination and setting of skb->tstamp.
Unfortunately it uses 32
tcp: fix cookie_init_timestamp() overflows
cookie_init_timestamp() is supposed to return a 64bit timestamp suitable for both TSval determination and setting of skb->tstamp.
Unfortunately it uses 32bit fields and overflows after 2^32 * 10^6 nsec (~49 days) of uptime.
Generated TSval are still correct, but skb->tstamp might be set far away in the past, potentially confusing other layers.
tcp_ns_to_ts() is changed to return a full 64bit value, ts and ts_now variables are changed to u64 type, and TSMASK is removed in favor of shifts operations.
While we are at it, change this sequence: ts >>= TSBITS; ts--; ts <<= TSBITS; ts |= options; to: ts -= (1UL << TSBITS);
Fixes: 9a568de4818d ("tcp: switch TCP TS option (RFC 7323) to 1ms clock") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6 |
|
| #
6f8a76f8 |
| 05-Jun-2023 |
Guillaume Nault <[email protected]> |
tcp: Set route scope properly in cookie_v4_check().
RT_CONN_FLAGS(sk) overloads flowi4_tos with the RTO_ONLINK bit when sk has the SOCK_LOCALROUTE flag set. This allows ip_route_output_key_hash() to
tcp: Set route scope properly in cookie_v4_check().
RT_CONN_FLAGS(sk) overloads flowi4_tos with the RTO_ONLINK bit when sk has the SOCK_LOCALROUTE flag set. This allows ip_route_output_key_hash() to eventually adjust flowi4_scope.
Instead of relying on special handling of the RTO_ONLINK bit, we can just set the route scope correctly. This will eventually allow to avoid special interpretation of tos variables and to convert ->flowi4_tos to dscp_t.
Signed-off-by: Guillaume Nault <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1 |
|
| #
3fff8818 |
| 10-Dec-2022 |
Matthieu Baerts <[email protected]> |
mptcp: remove MPTCP 'ifdef' in TCP SYN cookies
To ease the maintenance, it is often recommended to avoid having #ifdef preprocessor conditions.
Here the section related to CONFIG_MPTCP was quite sh
mptcp: remove MPTCP 'ifdef' in TCP SYN cookies
To ease the maintenance, it is often recommended to avoid having #ifdef preprocessor conditions.
Here the section related to CONFIG_MPTCP was quite short but the next commit needs to add more code around. It is then cleaner to move specific MPTCP code to functions located in net/mptcp directory.
Now that mptcp_subflow_request_sock_ops structure can be static, it can also be marked as "read only after init".
Suggested-by: Paolo Abeni <[email protected]> Reviewed-by: Mat Martineau <[email protected]> Cc: [email protected] Signed-off-by: Matthieu Baerts <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|