|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1 |
|
| #
1f615422 |
| 25-Mar-2025 |
Jakub Kicinski <[email protected]> |
Revert "udp_tunnel: GRO optimizations"
Revert "udp_tunnel: use static call for GRO hooks when possible" This reverts commit 311b36574ceaccfa3f91b74054a09cd4bb877702.
Revert "udp_tunnel: create a fa
Revert "udp_tunnel: GRO optimizations"
Revert "udp_tunnel: use static call for GRO hooks when possible" This reverts commit 311b36574ceaccfa3f91b74054a09cd4bb877702.
Revert "udp_tunnel: create a fastpath GRO lookup." This reverts commit 8d4880db378350f8ed8969feea13bdc164564fc1.
There are multiple small issues with the series. In the interest of unblocking the merge window let's opt for a revert.
Link: https://lore.kernel.org/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.14, v6.14-rc7 |
|
| #
8d4880db |
| 11-Mar-2025 |
Paolo Abeni <[email protected]> |
udp_tunnel: create a fastpath GRO lookup.
Most UDP tunnels bind a socket to a local port, with ANY address, no peer and no interface index specified. Additionally it's quite common to have a single
udp_tunnel: create a fastpath GRO lookup.
Most UDP tunnels bind a socket to a local port, with ANY address, no peer and no interface index specified. Additionally it's quite common to have a single tunnel device per namespace.
Track in each namespace the UDP tunnel socket respecting the above. When only a single one is present, store a reference in the netns.
When such reference is not NULL, UDP tunnel GRO lookup just need to match the incoming packet destination port vs the socket local port.
The tunnel socket never sets the reuse[port] flag[s]. When bound to no address and interface, no other socket can exist in the same netns matching the specified local port.
Matching packets with non-local destination addresses will be aggregated, and eventually segmented as needed - no behavior changes intended.
Note that the UDP tunnel socket reference is stored into struct netns_ipv4 for both IPv4 and IPv6 tunnels. That is intentional to keep all the fastpath-related netns fields in the same struct and allow cacheline-based optimization. Currently both the IPv4 and IPv6 socket pointer share the same cacheline as the `udp_table` field.
Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/4d5c319c4471161829f50cb8436841de81a5edae.1741718157.git.pabeni@redhat.com Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12 |
|
| #
dab78a17 |
| 14-Nov-2024 |
Philo Lu <[email protected]> |
net/udp: Add 4-tuple hash list basis
Add a new hash list, hash4, in udp table. It will be used to implement 4-tuple hash for connected udp sockets. This patch adds the hlist to table, and implements
net/udp: Add 4-tuple hash list basis
Add a new hash list, hash4, in udp table. It will be used to implement 4-tuple hash for connected udp sockets. This patch adds the hlist to table, and implements helpers and the initialization. 4-tuple hash is implemented in the following patch.
hash4 uses hlist_nulls to avoid moving wrongly onto another hlist due to concurrent rehash, because rehash() can happen with lookup().
Co-developed-by: Cambda Zhu <[email protected]> Signed-off-by: Cambda Zhu <[email protected]> Co-developed-by: Fred Chen <[email protected]> Signed-off-by: Fred Chen <[email protected]> Co-developed-by: Yubing Qiu <[email protected]> Signed-off-by: Yubing Qiu <[email protected]> Signed-off-by: Philo Lu <[email protected]> Acked-by: Willem de Bruijn <[email protected]> Acked-by: Paolo Abeni <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7 |
|
| #
b3e90f37 |
| 05-May-2024 |
Yoann Congal <[email protected]> |
printk: Change type of CONFIG_BASE_SMALL to bool
CONFIG_BASE_SMALL is currently a type int but is only used as a boolean.
So, change its type to bool and adapt all usages: CONFIG_BASE_SMALL == 0 be
printk: Change type of CONFIG_BASE_SMALL to bool
CONFIG_BASE_SMALL is currently a type int but is only used as a boolean.
So, change its type to bool and adapt all usages: CONFIG_BASE_SMALL == 0 becomes !IS_ENABLED(CONFIG_BASE_SMALL) and CONFIG_BASE_SMALL != 0 becomes IS_ENABLED(CONFIG_BASE_SMALL).
Reviewed-by: Petr Mladek <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Reviewed-by: Masahiro Yamada <[email protected]> Signed-off-by: Yoann Congal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Petr Mladek <[email protected]>
show more ...
|
|
Revision tags: v6.9-rc6, v6.9-rc5, v6.9-rc4 |
|
| #
1382e3b6 |
| 11-Apr-2024 |
Yuri Benditovich <[email protected]> |
net: change maximum number of UDP segments to 128
The commit fc8b2a619469 ("net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation") adds check of potential number of UDP segments vs UDP_MAX_SEGMENTS
net: change maximum number of UDP segments to 128
The commit fc8b2a619469 ("net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation") adds check of potential number of UDP segments vs UDP_MAX_SEGMENTS in linux/virtio_net.h. After this change certification test of USO guest-to-guest transmit on Windows driver for virtio-net device fails, for example with packet size of ~64K and mss of 536 bytes. In general the USO should not be more restrictive than TSO. Indeed, in case of unreasonably small mss a lot of segments can cause queue overflow and packet loss on the destination. Limit of 128 segments is good for any practical purpose, with minimal meaningful mss of 536 the maximal UDP packet will be divided to ~120 segments. The number of segments for UDP packets is validated vs UDP_MAX_SEGMENTS also in udp.c (v4,v6), this does not affect quest-to-guest path but does affect packets sent to host, for example. It is important to mention that UDP_MAX_SEGMENTS is kernel-only define and not available to user mode socket applications. In order to request MSS smaller than MTU the applications just uses setsockopt with SOL_UDP and UDP_SEGMENT and there is no limitations on socket API level.
Fixes: fc8b2a619469 ("net: more strict VIRTIO_NET_HDR_GSO_UDP_L4 validation") Signed-off-by: Yuri Benditovich <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.9-rc3, v6.9-rc2 |
|
| #
3d010c80 |
| 26-Mar-2024 |
Antoine Tenart <[email protected]> |
udp: do not accept non-tunnel GSO skbs landing in a tunnel
When rx-udp-gro-forwarding is enabled UDP packets might be GROed when being forwarded. If such packets might land in a tunnel this can caus
udp: do not accept non-tunnel GSO skbs landing in a tunnel
When rx-udp-gro-forwarding is enabled UDP packets might be GROed when being forwarded. If such packets might land in a tunnel this can cause various issues and udp_gro_receive makes sure this isn't the case by looking for a matching socket. This is performed in udp4/6_gro_lookup_skb but only in the current netns. This is an issue with tunneled packets when the endpoint is in another netns. In such cases the packets will be GROed at the UDP level, which leads to various issues later on. The same thing can happen with rx-gro-list.
We saw this with geneve packets being GROed at the UDP level. In such case gso_size is set; later the packet goes through the geneve rx path, the geneve header is pulled, the offset are adjusted and frag_list skbs are not adjusted with regard to geneve. When those skbs hit skb_fragment, it will misbehave. Different outcomes are possible depending on what the GROed skbs look like; from corrupted packets to kernel crashes.
One example is a BUG_ON[1] triggered in skb_segment while processing the frag_list. Because gso_size is wrong (geneve header was pulled) skb_segment thinks there is "geneve header size" of data in frag_list, although it's in fact the next packet. The BUG_ON itself has nothing to do with the issue. This is only one of the potential issues.
Looking up for a matching socket in udp_gro_receive is fragile: the lookup could be extended to all netns (not speaking about performances) but nothing prevents those packets from being modified in between and we could still not find a matching socket. It's OK to keep the current logic there as it should cover most cases but we also need to make sure we handle tunnel packets being GROed too early.
This is done by extending the checks in udp_unexpected_gso: GSO packets lacking the SKB_GSO_UDP_TUNNEL/_CSUM bits and landing in a tunnel must be segmented.
[1] kernel BUG at net/core/skbuff.c:4408! RIP: 0010:skb_segment+0xd2a/0xf70 __udp_gso_segment+0xaa/0x560
Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.") Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets") Signed-off-by: Antoine Tenart <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6 |
|
| #
f796feab |
| 20-Feb-2024 |
Paolo Abeni <[email protected]> |
udp: add local "peek offset enabled" flag
We want to re-organize the struct sock layout. The sk_peek_off field location is problematic, as most protocols want it in the RX read area, while UDP wants
udp: add local "peek offset enabled" flag
We want to re-organize the struct sock layout. The sk_peek_off field location is problematic, as most protocols want it in the RX read area, while UDP wants it on a cacheline different from sk_receive_queue.
Create a local (inside udp_sock) copy of the 'peek offset is enabled' flag and place it inside the same cacheline of reader_queue.
Check such flag before reading sk_peek_off. This will save potential false sharing and cache misses in the fast-path.
Tested under UDP flood with small packets. The struct sock layout update causes a 4% performance drop, and this patch restores completely the original tput.
Signed-off-by: Paolo Abeni <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/r/67ab679c15fbf49fa05b3ffe05d91c47ab84f147.1708426665.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2 |
|
| #
882af43a |
| 12-Sep-2023 |
Eric Dumazet <[email protected]> |
udplite: fix various data-races
udp->pcflag, udp->pcslen and udp->pcrlen reads/writes are racy.
Move udp->pcflag to udp->udp_flags for atomicity, and add READ_ONCE()/WRITE_ONCE() annotations for pc
udplite: fix various data-races
udp->pcflag, udp->pcslen and udp->pcrlen reads/writes are racy.
Move udp->pcflag to udp->udp_flags for atomicity, and add READ_ONCE()/WRITE_ONCE() annotations for pcslen and pcrlen.
Fixes: ba4e58eca8aa ("[NET]: Supporting UDP-Lite (RFC 3828) in Linux") Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| #
729549aa |
| 12-Sep-2023 |
Eric Dumazet <[email protected]> |
udplite: remove UDPLITE_BIT
This flag is set but never read, we can remove it.
Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Pa
udplite: remove UDPLITE_BIT
This flag is set but never read, we can remove it.
Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| #
ac9a7f4c |
| 12-Sep-2023 |
Eric Dumazet <[email protected]> |
udp: lockless UDP_ENCAP_L2TPINUDP / UDP_GRO
Move udp->encap_enabled to udp->udp_flags.
Add udp_test_and_set_bit() helper to allow lockless udp_tunnel_encap_enable() implementation.
Signed-off-by:
udp: lockless UDP_ENCAP_L2TPINUDP / UDP_GRO
Move udp->encap_enabled to udp->udp_flags.
Add udp_test_and_set_bit() helper to allow lockless udp_tunnel_encap_enable() implementation.
Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| #
f5f52f08 |
| 12-Sep-2023 |
Eric Dumazet <[email protected]> |
udp: move udp->accept_udp_{l4|fraglist} to udp->udp_flags
These are read locklessly, move them to udp_flags to fix data-races.
Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Willem
udp: move udp->accept_udp_{l4|fraglist} to udp->udp_flags
These are read locklessly, move them to udp_flags to fix data-races.
Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| #
e1dc0615 |
| 12-Sep-2023 |
Eric Dumazet <[email protected]> |
udp: move udp->gro_enabled to udp->udp_flags
syzbot reported that udp->gro_enabled can be read locklessly. Use one atomic bit from udp->udp_flags.
Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain
udp: move udp->gro_enabled to udp->udp_flags
syzbot reported that udp->gro_enabled can be read locklessly. Use one atomic bit from udp->udp_flags.
Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.") Reported-by: syzbot <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| #
bcbc1b1d |
| 12-Sep-2023 |
Eric Dumazet <[email protected]> |
udp: move udp->no_check6_rx to udp->udp_flags
syzbot reported that udp->no_check6_rx can be read locklessly. Use one atomic bit from udp->udp_flags.
Fixes: 1c19448c9ba6 ("net: Make enabling of zero
udp: move udp->no_check6_rx to udp->udp_flags
syzbot reported that udp->no_check6_rx can be read locklessly. Use one atomic bit from udp->udp_flags.
Fixes: 1c19448c9ba6 ("net: Make enabling of zero UDP6 csums more restrictive") Reported-by: syzbot <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| #
a0002127 |
| 12-Sep-2023 |
Eric Dumazet <[email protected]> |
udp: move udp->no_check6_tx to udp->udp_flags
syzbot reported that udp->no_check6_tx can be read locklessly. Use one atomic bit from udp->udp_flags
Fixes: 1c19448c9ba6 ("net: Make enabling of zero
udp: move udp->no_check6_tx to udp->udp_flags
syzbot reported that udp->no_check6_tx can be read locklessly. Use one atomic bit from udp->udp_flags
Fixes: 1c19448c9ba6 ("net: Make enabling of zero UDP6 csums more restrictive") Reported-by: syzbot <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| #
81b36803 |
| 12-Sep-2023 |
Eric Dumazet <[email protected]> |
udp: introduce udp->udp_flags
According to syzbot, it is time to use proper atomic flags for various UDP flags.
Add udp_flags field, and convert udp->corkflag to first bit in it.
Signed-off-by: Er
udp: introduce udp->udp_flags
According to syzbot, it is time to use proper atomic flags for various UDP flags.
Add udp_flags field, and convert udp->corkflag to first bit in it.
Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
|
Revision tags: v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3 |
|
| #
94c540fb |
| 17-Mar-2023 |
Eric Dumazet <[email protected]> |
udp: preserve const qualifier in udp_sk()
We can change udp_sk() to propagate const qualifier of its argument, thanks to container_of_const()
This should avoid some potential errors caused by accid
udp: preserve const qualifier in udp_sk()
We can change udp_sk() to propagate const qualifier of its argument, thanks to container_of_const()
This should avoid some potential errors caused by accidental (const -> not_const) promotion.
Signed-off-by: Eric Dumazet <[email protected]> Cc: Willem de Bruijn <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6 |
|
| #
9804985b |
| 14-Nov-2022 |
Kuniyuki Iwashima <[email protected]> |
udp: Introduce optional per-netns hash table.
The maximum hash table size is 64K due to the nature of the protocol. [0] It's smaller than TCP, and fewer sockets can cause a performance drop.
On an
udp: Introduce optional per-netns hash table.
The maximum hash table size is 64K due to the nature of the protocol. [0] It's smaller than TCP, and fewer sockets can cause a performance drop.
On an EC2 c5.24xlarge instance (192 GiB memory), after running iperf3 in different netns, creating 32Mi sockets without data transfer in the root netns causes regression for the iperf3's connection.
uhash_entries sockets length Gbps 64K 1 1 5.69 1Mi 16 5.27 2Mi 32 4.90 4Mi 64 4.09 8Mi 128 2.96 16Mi 256 2.06 32Mi 512 1.12
The per-netns hash table breaks the lengthy lists into shorter ones. It is useful on a multi-tenant system with thousands of netns. With smaller hash tables, we can look up sockets faster, isolate noisy neighbours, and reduce lock contention.
The max size of the per-netns table is 64K as well. This is because the possible hash range by udp_hashfn() always fits in 64K within the same netns and we cannot make full use of the whole buckets larger than 64K.
/* 0 < num < 64K -> X < hash < X + 64K */ (num + net_hash_mix(net)) & mask;
Also, the min size is 128. We use a bitmap to search for an available port in udp_lib_get_port(). To keep the bitmap on the stack and not fire the CONFIG_FRAME_WARN error at build time, we round up the table size to 128.
The sysctl usage is the same with TCP:
$ dmesg | cut -d ' ' -f 6- | grep "UDP hash" UDP hash table entries: 65536 (order: 9, 2097152 bytes, vmalloc)
# sysctl net.ipv4.udp_hash_entries net.ipv4.udp_hash_entries = 65536 # can be changed by uhash_entries
# sysctl net.ipv4.udp_child_hash_entries net.ipv4.udp_child_hash_entries = 0 # disabled by default
# ip netns add test1 # ip netns exec test1 sysctl net.ipv4.udp_hash_entries net.ipv4.udp_hash_entries = -65536 # share the global table
# sysctl -w net.ipv4.udp_child_hash_entries=100 net.ipv4.udp_child_hash_entries = 100
# ip netns add test2 # ip netns exec test2 sysctl net.ipv4.udp_hash_entries net.ipv4.udp_hash_entries = 128 # own a per-netns table with 2^n buckets
We could optimise the hash table lookup/iteration further by removing the netns comparison for the per-netns one in the future. Also, we could optimise the sparse udp_hslot layout by putting it in udp_table.
[0]: https://lore.kernel.org/netdev/[email protected]/
Signed-off-by: Kuniyuki Iwashima <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1 |
|
| #
42fb06b3 |
| 12-Oct-2022 |
David Howells <[email protected]> |
net: Change the udp encap_err_rcv to allow use of {ip,ipv6}_icmp_error()
Change the udp encap_err_rcv signature to match ip_icmp_error() and ipv6_icmp_error() so that those can be used from the call
net: Change the udp encap_err_rcv to allow use of {ip,ipv6}_icmp_error()
Change the udp encap_err_rcv signature to match ip_icmp_error() and ipv6_icmp_error() so that those can be used from the called function and export them.
Signed-off-by: David Howells <[email protected]> cc: Marc Dionne <[email protected]> cc: [email protected] cc: [email protected]
show more ...
|
| #
8a3854c7 |
| 20-Oct-2022 |
Paolo Abeni <[email protected]> |
udp: track the forward memory release threshold in an hot cacheline
When the receiver process and the BH runs on different cores, udp_rmem_release() experience a cache miss while accessing sk_rcvbuf
udp: track the forward memory release threshold in an hot cacheline
When the receiver process and the BH runs on different cores, udp_rmem_release() experience a cache miss while accessing sk_rcvbuf, as the latter shares the same cacheline with sk_forward_alloc, written by the BH.
With this patch, UDP tracks the rcvbuf value and its update via custom SOL_SOCKET socket options, and copies the forward memory threshold value used by udp_rmem_release() in a different cacheline, already accessed by the above function and uncontended.
Since the UDP socket init operation grown a bit, factor out the common code between v4 and v6 in a shared helper.
Overall the above give a 10% peek throughput increase under UDP flood.
Signed-off-by: Paolo Abeni <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Acked-by: Kuniyuki Iwashima <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3 |
|
| #
ac56a0b4 |
| 26-Aug-2022 |
David Howells <[email protected]> |
rxrpc: Fix ICMP/ICMP6 error handling
Because rxrpc pretends to be a tunnel on top of a UDP/UDP6 socket, allowing it to siphon off UDP packets early in the handling of received UDP packets thereby av
rxrpc: Fix ICMP/ICMP6 error handling
Because rxrpc pretends to be a tunnel on top of a UDP/UDP6 socket, allowing it to siphon off UDP packets early in the handling of received UDP packets thereby avoiding the packet going through the UDP receive queue, it doesn't get ICMP packets through the UDP ->sk_error_report() callback. In fact, it doesn't appear that there's any usable option for getting hold of ICMP packets.
Fix this by adding a new UDP encap hook to distribute error messages for UDP tunnels. If the hook is set, then the tunnel driver will be able to see ICMP packets. The hook provides the offset into the packet of the UDP header of the original packet that caused the notification.
An alternative would be to call the ->error_handler() hook - but that requires that the skbuff be cloned (as ip_icmp_error() or ipv6_cmp_error() do, though isn't really necessary or desirable in rxrpc's case is we want to parse them there and then, not queue them).
Changes ======= ver #3) - Fixed an uninitialised variable.
ver #2) - Fixed some missing CONFIG_AF_RXRPC_IPV6 conditionals.
Fixes: 5271953cad31 ("rxrpc: Use the UDP encap_rcv hook") Signed-off-by: David Howells <[email protected]>
show more ...
|
|
Revision tags: v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2 |
|
| #
cc81df83 |
| 26-Jan-2022 |
Jakub Kicinski <[email protected]> |
udp: remove inner_udp_hdr()
Not used since added in v3.8.
Signed-off-by: Jakub Kicinski <[email protected]> Acked-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <davem@davem
udp: remove inner_udp_hdr()
Not used since added in v3.8.
Signed-off-by: Jakub Kicinski <[email protected]> Acked-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6 |
|
| #
d18931a9 |
| 30-Mar-2021 |
Paolo Abeni <[email protected]> |
vxlan: allow L4 GRO passthrough
When passing up an UDP GSO packet with L4 aggregation, there is no need to segment it at the vxlan level. We can propagate the packet untouched and let it be segmente
vxlan: allow L4 GRO passthrough
When passing up an UDP GSO packet with L4 aggregation, there is no need to segment it at the vxlan level. We can propagate the packet untouched and let it be segmented later, if needed.
Introduce an helper to allow let the UDP socket to accept any L4 aggregation and use it in the vxlan driver.
v1 -> v2: - updated to use the newly introduced UDP socket 'accept*' fields
Signed-off-by: Paolo Abeni <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
78352f73 |
| 30-Mar-2021 |
Paolo Abeni <[email protected]> |
udp: never accept GSO_FRAGLIST packets
Currently the UDP protocol delivers GSO_FRAGLIST packets to the sockets without the expected segmentation.
This change addresses the issue introducing and mai
udp: never accept GSO_FRAGLIST packets
Currently the UDP protocol delivers GSO_FRAGLIST packets to the sockets without the expected segmentation.
This change addresses the issue introducing and maintaining a couple of new fields to explicitly accept SKB_GSO_UDP_L4 or GSO_FRAGLIST packets. Additionally updates udp_unexpected_gso() accordingly.
UDP sockets enabling UDP_GRO stil keep accept_udp_fraglist zeroed.
v1 -> v2: - use 2 bits instead of a whole GSO bitmask (Willem)
Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.") Signed-off-by: Paolo Abeni <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3 |
|
| #
2874c5fd |
| 27-May-2019 |
Thomas Gleixner <[email protected]> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify it under the terms of th
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 3029 file(s).
Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Allison Randal <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
show more ...
|
|
Revision tags: v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3, v5.1-rc2, v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2 |
|
| #
a36e185e |
| 08-Nov-2018 |
Stefano Brivio <[email protected]> |
udp: Handle ICMP errors for tunnels with same destination port on both endpoints
For both IPv4 and IPv6, if we can't match errors to a socket, try tunnels before ignoring them. Look up a socket with
udp: Handle ICMP errors for tunnels with same destination port on both endpoints
For both IPv4 and IPv6, if we can't match errors to a socket, try tunnels before ignoring them. Look up a socket with the original source and destination ports as found in the UDP packet inside the ICMP payload, this will work for tunnels that force the same destination port for both endpoints, i.e. VXLAN and GENEVE.
Actually, lwtunnels could break this assumption if they are configured by an external control plane to have different destination ports on the endpoints: in this case, we won't be able to trace ICMP messages back to them.
For IPv6 redirect messages, call ip6_redirect() directly with the output interface argument set to the interface we received the packet from (as it's the very interface we should build the exception on), otherwise the new nexthop will be rejected. There's no such need for IPv4.
Tunnels can now export an encap_err_lookup() operation that indicates a match. Pass the packet to the lookup function, and if the tunnel driver reports a matching association, continue with regular ICMP error handling.
v2: - Added newline between network and transport header sets in __udp{4,6}_lib_err_encap() (David Miller) - Removed redundant skb_reset_network_header(skb); in __udp4_lib_err_encap() - Removed redundant reassignment of iph in __udp4_lib_err_encap() (Sabrina Dubroca) - Edited comment to __udp{4,6}_lib_err_encap() to reflect the fact this won't work with lwtunnels configured to use asymmetric ports. By the way, it's VXLAN, not VxLAN (Jiri Benc)
Signed-off-by: Stefano Brivio <[email protected]> Reviewed-by: Sabrina Dubroca <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|