|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1 |
|
| #
b27055a0 |
| 02-Apr-2025 |
Lin Ma <[email protected]> |
net: fix geneve_opt length integer overflow
struct geneve_opt uses 5 bit length for each single option, which means every vary size option should be smaller than 128 bytes.
However, all current rel
net: fix geneve_opt length integer overflow
struct geneve_opt uses 5 bit length for each single option, which means every vary size option should be smaller than 128 bytes.
However, all current related Netlink policies cannot promise this length condition and the attacker can exploit a exact 128-byte size option to *fake* a zero length option and confuse the parsing logic, further achieve heap out-of-bounds read.
One example crash log is like below:
[ 3.905425] ================================================================== [ 3.905925] BUG: KASAN: slab-out-of-bounds in nla_put+0xa9/0xe0 [ 3.906255] Read of size 124 at addr ffff888005f291cc by task poc/177 [ 3.906646] [ 3.906775] CPU: 0 PID: 177 Comm: poc-oob-read Not tainted 6.1.132 #1 [ 3.907131] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 [ 3.907784] Call Trace: [ 3.907925] <TASK> [ 3.908048] dump_stack_lvl+0x44/0x5c [ 3.908258] print_report+0x184/0x4be [ 3.909151] kasan_report+0xc5/0x100 [ 3.909539] kasan_check_range+0xf3/0x1a0 [ 3.909794] memcpy+0x1f/0x60 [ 3.909968] nla_put+0xa9/0xe0 [ 3.910147] tunnel_key_dump+0x945/0xba0 [ 3.911536] tcf_action_dump_1+0x1c1/0x340 [ 3.912436] tcf_action_dump+0x101/0x180 [ 3.912689] tcf_exts_dump+0x164/0x1e0 [ 3.912905] fw_dump+0x18b/0x2d0 [ 3.913483] tcf_fill_node+0x2ee/0x460 [ 3.914778] tfilter_notify+0xf4/0x180 [ 3.915208] tc_new_tfilter+0xd51/0x10d0 [ 3.918615] rtnetlink_rcv_msg+0x4a2/0x560 [ 3.919118] netlink_rcv_skb+0xcd/0x200 [ 3.919787] netlink_unicast+0x395/0x530 [ 3.921032] netlink_sendmsg+0x3d0/0x6d0 [ 3.921987] __sock_sendmsg+0x99/0xa0 [ 3.922220] __sys_sendto+0x1b7/0x240 [ 3.922682] __x64_sys_sendto+0x72/0x90 [ 3.922906] do_syscall_64+0x5e/0x90 [ 3.923814] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 [ 3.924122] RIP: 0033:0x7e83eab84407 [ 3.924331] Code: 48 89 fa 4c 89 df e8 38 aa 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 faf [ 3.925330] RSP: 002b:00007ffff505e370 EFLAGS: 00000202 ORIG_RAX: 000000000000002c [ 3.925752] RAX: ffffffffffffffda RBX: 00007e83eaafa740 RCX: 00007e83eab84407 [ 3.926173] RDX: 00000000000001a8 RSI: 00007ffff505e3c0 RDI: 0000000000000003 [ 3.926587] RBP: 00007ffff505f460 R08: 00007e83eace1000 R09: 000000000000000c [ 3.926977] R10: 0000000000000000 R11: 0000000000000202 R12: 00007ffff505f3c0 [ 3.927367] R13: 00007ffff505f5c8 R14: 00007e83ead1b000 R15: 00005d4fbbe6dcb8
Fix these issues by enforing correct length condition in related policies.
Fixes: 925d844696d9 ("netfilter: nft_tunnel: add support for geneve opts") Fixes: 4ece47787077 ("lwtunnel: add options setting and dumping for geneve") Fixes: 0ed5269f9e41 ("net/sched: add tunnel option support to act_tunnel_key") Fixes: 0a6e77784f49 ("net/sched: allow flower to match tunnel options") Signed-off-by: Lin Ma <[email protected]> Reviewed-by: Xin Long <[email protected]> Acked-by: Cong Wang <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
89304247 |
| 29-Mar-2025 |
Guillaume Nault <[email protected]> |
tunnels: Accept PACKET_HOST in skb_tunnel_check_pmtu().
Because skb_tunnel_check_pmtu() doesn't handle PACKET_HOST packets, commit 30a92c9e3d6b ("openvswitch: Set the skbuff pkt_type for proper pmtu
tunnels: Accept PACKET_HOST in skb_tunnel_check_pmtu().
Because skb_tunnel_check_pmtu() doesn't handle PACKET_HOST packets, commit 30a92c9e3d6b ("openvswitch: Set the skbuff pkt_type for proper pmtud support.") forced skb->pkt_type to PACKET_OUTGOING for openvswitch packets that are sent using the OVS_ACTION_ATTR_OUTPUT action. This allowed such packets to invoke the iptunnel_pmtud_check_icmp() or iptunnel_pmtud_check_icmpv6() helpers and thus trigger PMTU update on the input device.
However, this also broke other parts of PMTU discovery. Since these packets don't have the PACKET_HOST type anymore, they won't trigger the sending of ICMP Fragmentation Needed or Packet Too Big messages to remote hosts when oversized (see the skb_in->pkt_type condition in __icmp_send() for example).
These two skb->pkt_type checks are therefore incompatible as one requires skb->pkt_type to be PACKET_HOST, while the other requires it to be anything but PACKET_HOST.
It makes sense to not trigger ICMP messages for non-PACKET_HOST packets as these messages should be generated only for incoming l2-unicast packets. However there doesn't seem to be any reason for skb_tunnel_check_pmtu() to ignore PACKET_HOST packets.
Allow both cases to work by allowing skb_tunnel_check_pmtu() to work on PACKET_HOST packets and not overriding skb->pkt_type in openvswitch anymore.
Fixes: 30a92c9e3d6b ("openvswitch: Set the skbuff pkt_type for proper pmtud support.") Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets") Signed-off-by: Guillaume Nault <[email protected]> Reviewed-by: Stefano Brivio <[email protected]> Reviewed-by: Aaron Conole <[email protected]> Tested-by: Aaron Conole <[email protected]> Link: https://patch.msgid.link/eac941652b86fddf8909df9b3bf0d97bc9444793.1743208264.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2 |
|
| #
5832c4a7 |
| 27-Mar-2024 |
Alexander Lobakin <[email protected]> |
ip_tunnel: convert __be16 tunnel flags to bitmaps
Historically, tunnel flags like TUNNEL_CSUM or TUNNEL_ERSPAN_OPT have been defined as __be16. Now all of those 16 bits are occupied and there's no m
ip_tunnel: convert __be16 tunnel flags to bitmaps
Historically, tunnel flags like TUNNEL_CSUM or TUNNEL_ERSPAN_OPT have been defined as __be16. Now all of those 16 bits are occupied and there's no more free space for new flags. It can't be simply switched to a bigger container with no adjustments to the values, since it's an explicit Endian storage, and on LE systems (__be16)0x0001 equals to (__be64)0x0001000000000000. We could probably define new 64-bit flags depending on the Endianness, i.e. (__be64)0x0001 on BE and (__be64)0x00010000... on LE, but that would introduce an Endianness dependency and spawn a ton of Sparse warnings. To mitigate them, all of those places which were adjusted with this change would be touched anyway, so why not define stuff properly if there's no choice.
Define IP_TUNNEL_*_BIT counterparts as a bit number instead of the value already coded and a fistful of <16 <-> bitmap> converters and helpers. The two flags which have a different bit position are SIT_ISATAP_BIT and VTI_ISVTI_BIT, as they were defined not as __cpu_to_be16(), but as (__force __be16), i.e. had different positions on LE and BE. Now they both have strongly defined places. Change all __be16 fields which were used to store those flags, to IP_TUNNEL_DECLARE_FLAGS() -> DECLARE_BITMAP(__IP_TUNNEL_FLAG_NUM) -> unsigned long[1] for now, and replace all TUNNEL_* occurrences to their bitmap counterparts. Use the converters in the places which talk to the userspace, hardware (NFP) or other hosts (GRE header). The rest must explicitly use the new flags only. This must be done at once, otherwise there will be too many conversions throughout the code in the intermediate commits. Finally, disable the old __be16 flags for use in the kernel code (except for the two 'irregular' flags mentioned above), to prevent any accidental (mis)use of them. For the userspace, nothing is changed, only additions were made.
Most noticeable bloat-o-meter difference (.text):
vmlinux: 307/-1 (306) gre.ko: 62/0 (62) ip_gre.ko: 941/-217 (724) [*] ip_tunnel.ko: 390/-900 (-510) [**] ip_vti.ko: 138/0 (138) ip6_gre.ko: 534/-18 (516) [*] ip6_tunnel.ko: 118/-10 (108)
[*] gre_flags_to_tnl_flags() grew, but still is inlined [**] ip_tunnel_find() got uninlined, hence such decrease
The average code size increase in non-extreme case is 100-200 bytes per module, mostly due to sizeof(long) > sizeof(__be16), as %__IP_TUNNEL_FLAG_NUM is less than %BITS_PER_LONG and the compilers are able to expand the majority of bitmap_*() calls here into direct operations on scalars.
Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
117aef12 |
| 27-Mar-2024 |
Alexander Lobakin <[email protected]> |
ip_tunnel: use a separate struct to store tunnel params in the kernel
Unlike IPv6 tunnels which use purely-kernel __ip6_tnl_parm structure to store params inside the kernel, IPv4 tunnel code uses th
ip_tunnel: use a separate struct to store tunnel params in the kernel
Unlike IPv6 tunnels which use purely-kernel __ip6_tnl_parm structure to store params inside the kernel, IPv4 tunnel code uses the same ip_tunnel_parm which is being used to talk with the userspace. This makes it difficult to alter or add any fields or use a different format for whatever data. Define struct ip_tunnel_parm_kern, a 1:1 copy of ip_tunnel_parm for now, and use it throughout the code. Define the pieces, where the copy user <-> kernel happens, as standalone functions, and copy the data there field-by-field, so that the kernel-side structure could be easily modified later on and the users wouldn't have to care about this.
Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3 |
|
| #
d75abeec |
| 01-Feb-2024 |
Antoine Tenart <[email protected]> |
tunnels: fix out of bounds access when building IPv6 PMTU error
If the ICMPv6 error is built from a non-linear skb we get the following splat,
BUG: KASAN: slab-out-of-bounds in do_csum+0x220/0x24
tunnels: fix out of bounds access when building IPv6 PMTU error
If the ICMPv6 error is built from a non-linear skb we get the following splat,
BUG: KASAN: slab-out-of-bounds in do_csum+0x220/0x240 Read of size 4 at addr ffff88811d402c80 by task netperf/820 CPU: 0 PID: 820 Comm: netperf Not tainted 6.8.0-rc1+ #543 ... kasan_report+0xd8/0x110 do_csum+0x220/0x240 csum_partial+0xc/0x20 skb_tunnel_check_pmtu+0xeb9/0x3280 vxlan_xmit_one+0x14c2/0x4080 vxlan_xmit+0xf61/0x5c00 dev_hard_start_xmit+0xfb/0x510 __dev_queue_xmit+0x7cd/0x32a0 br_dev_queue_push_xmit+0x39d/0x6a0
Use skb_checksum instead of csum_partial who cannot deal with non-linear SKBs.
Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets") Signed-off-by: Antoine Tenart <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5 |
|
| #
6a7ac3d2 |
| 03-Aug-2023 |
Florian Westphal <[email protected]> |
tunnels: fix kasan splat when generating ipv4 pmtu error
If we try to emit an icmp error in response to a nonliner skb, we get
BUG: KASAN: slab-out-of-bounds in ip_compute_csum+0x134/0x220 Read of
tunnels: fix kasan splat when generating ipv4 pmtu error
If we try to emit an icmp error in response to a nonliner skb, we get
BUG: KASAN: slab-out-of-bounds in ip_compute_csum+0x134/0x220 Read of size 4 at addr ffff88811c50db00 by task iperf3/1691 CPU: 2 PID: 1691 Comm: iperf3 Not tainted 6.5.0-rc3+ #309 [..] kasan_report+0x105/0x140 ip_compute_csum+0x134/0x220 iptunnel_pmtud_build_icmp+0x554/0x1020 skb_tunnel_check_pmtu+0x513/0xb80 vxlan_xmit_one+0x139e/0x2ef0 vxlan_xmit+0x1867/0x2760 dev_hard_start_xmit+0x1ee/0x4f0 br_dev_queue_push_xmit+0x4d1/0x660 [..]
ip_compute_csum() cannot deal with nonlinear skbs, so avoid it. After this change, splat is gone and iperf3 is no longer stuck.
Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets") Signed-off-by: Florian Westphal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0 |
|
| #
b86fca80 |
| 29-Sep-2022 |
Liu Jian <[email protected]> |
net: Add helper function to parse netlink msg of ip_tunnel_parm
Add ip_tunnel_netlink_parms to parse netlink msg of ip_tunnel_parm. Reduces duplicate code, no actual functional changes.
Signed-off-
net: Add helper function to parse netlink msg of ip_tunnel_parm
Add ip_tunnel_netlink_parms to parse netlink msg of ip_tunnel_parm. Reduces duplicate code, no actual functional changes.
Signed-off-by: Liu Jian <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
537dd2d9 |
| 29-Sep-2022 |
Liu Jian <[email protected]> |
net: Add helper function to parse netlink msg of ip_tunnel_encap
Add ip_tunnel_netlink_encap_parms to parse netlink msg of ip_tunnel_encap. Reduces duplicate code, no actual functional changes.
Sig
net: Add helper function to parse netlink msg of ip_tunnel_encap
Add ip_tunnel_netlink_encap_parms to parse netlink msg of ip_tunnel_encap. Reduces duplicate code, no actual functional changes.
Signed-off-by: Liu Jian <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4 |
|
| #
853a7614 |
| 24-Jun-2022 |
Eric Dumazet <[email protected]> |
tunnels: do not assume mac header is set in skb_tunnel_check_pmtu()
Recently added debug in commit f9aefd6b2aa3 ("net: warn if mac header was not set") caught a bug in skb_tunnel_check_pmtu(), as sh
tunnels: do not assume mac header is set in skb_tunnel_check_pmtu()
Recently added debug in commit f9aefd6b2aa3 ("net: warn if mac header was not set") caught a bug in skb_tunnel_check_pmtu(), as shown in this syzbot report [1].
In ndo_start_xmit() paths, there is really no need to use skb->mac_header, because skb->data is supposed to point at it.
[1] WARNING: CPU: 1 PID: 8604 at include/linux/skbuff.h:2784 skb_mac_header_len include/linux/skbuff.h:2784 [inline] WARNING: CPU: 1 PID: 8604 at include/linux/skbuff.h:2784 skb_tunnel_check_pmtu+0x5de/0x2f90 net/ipv4/ip_tunnel_core.c:413 Modules linked in: CPU: 1 PID: 8604 Comm: syz-executor.3 Not tainted 5.19.0-rc2-syzkaller-00443-g8720bd951b8e #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:skb_mac_header_len include/linux/skbuff.h:2784 [inline] RIP: 0010:skb_tunnel_check_pmtu+0x5de/0x2f90 net/ipv4/ip_tunnel_core.c:413 Code: 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c 02 00 0f 84 b9 fe ff ff 4c 89 ff e8 7c 0f d7 f9 e9 ac fe ff ff e8 c2 13 8a f9 <0f> 0b e9 28 fc ff ff e8 b6 13 8a f9 48 8b 54 24 70 48 b8 00 00 00 RSP: 0018:ffffc90002e4f520 EFLAGS: 00010212 RAX: 0000000000000324 RBX: ffff88804d5fd500 RCX: ffffc90005b52000 RDX: 0000000000040000 RSI: ffffffff87f05e3e RDI: 0000000000000003 RBP: ffffc90002e4f650 R08: 0000000000000003 R09: 000000000000ffff R10: 000000000000ffff R11: 0000000000000000 R12: 000000000000ffff R13: 0000000000000000 R14: 000000000000ffcd R15: 000000000000001f FS: 00007f3babba9700(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000080 CR3: 0000000075319000 CR4: 00000000003506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> geneve_xmit_skb drivers/net/geneve.c:927 [inline] geneve_xmit+0xcf8/0x35d0 drivers/net/geneve.c:1107 __netdev_start_xmit include/linux/netdevice.h:4805 [inline] netdev_start_xmit include/linux/netdevice.h:4819 [inline] __dev_direct_xmit+0x500/0x730 net/core/dev.c:4309 dev_direct_xmit include/linux/netdevice.h:3007 [inline] packet_direct_xmit+0x1b8/0x2c0 net/packet/af_packet.c:282 packet_snd net/packet/af_packet.c:3073 [inline] packet_sendmsg+0x21f4/0x55d0 net/packet/af_packet.c:3104 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:734 ____sys_sendmsg+0x6eb/0x810 net/socket.c:2489 ___sys_sendmsg+0xf3/0x170 net/socket.c:2543 __sys_sendmsg net/socket.c:2572 [inline] __do_sys_sendmsg net/socket.c:2581 [inline] __se_sys_sendmsg net/socket.c:2579 [inline] __x64_sys_sendmsg+0x132/0x220 net/socket.c:2579 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f3baaa89109 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f3babba9168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f3baab9bf60 RCX: 00007f3baaa89109 RDX: 0000000000000000 RSI: 0000000020000a00 RDI: 0000000000000003 RBP: 00007f3baaae305d R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffe74f2543f R14: 00007f3babba9300 R15: 0000000000022000 </TASK>
Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets") Reported-by: syzbot <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: Stefano Brivio <[email protected]> Reviewed-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3 |
|
| #
fda4fde2 |
| 07-Jan-2021 |
Julian Wiedmann <[email protected]> |
net: ip_tunnel: clean up endianness conversions
sparse complains about some harmless endianness issues:
> net/ipv4/ip_tunnel_core.c:225:43: warning: cast to restricted __be16 > net/ipv4/ip_tunnel_c
net: ip_tunnel: clean up endianness conversions
sparse complains about some harmless endianness issues:
> net/ipv4/ip_tunnel_core.c:225:43: warning: cast to restricted __be16 > net/ipv4/ip_tunnel_core.c:225:43: warning: incorrect type in initializer (different base types) > net/ipv4/ip_tunnel_core.c:225:43: expected restricted __be16 [usertype] mtu > net/ipv4/ip_tunnel_core.c:225:43: got unsigned short [usertype]
iptunnel_pmtud_build_icmp() uses the wrong flavour of byte-order conversion when storing the MTU into the ICMPv4 packet. Use htons(), just like iptunnel_pmtud_build_icmpv6() does.
> net/ipv4/ip_tunnel_core.c:248:35: warning: cast from restricted __be16 > net/ipv4/ip_tunnel_core.c:248:35: warning: incorrect type in argument 3 (different base types) > net/ipv4/ip_tunnel_core.c:248:35: expected unsigned short type > net/ipv4/ip_tunnel_core.c:248:35: got restricted __be16 [usertype] > net/ipv4/ip_tunnel_core.c:341:35: warning: cast from restricted __be16 > net/ipv4/ip_tunnel_core.c:341:35: warning: incorrect type in argument 3 (different base types) > net/ipv4/ip_tunnel_core.c:341:35: expected unsigned short type > net/ipv4/ip_tunnel_core.c:341:35: got restricted __be16 [usertype]
eth_header() wants the Ethertype in host-order, use the correct flavour of byte-order conversion.
> net/ipv4/ip_tunnel_core.c:600:45: warning: restricted __be16 degrades to integer > net/ipv4/ip_tunnel_core.c:609:30: warning: incorrect type in assignment (different base types) > net/ipv4/ip_tunnel_core.c:609:30: expected int type > net/ipv4/ip_tunnel_core.c:609:30: got restricted __be16 [usertype] > net/ipv4/ip_tunnel_core.c:619:30: warning: incorrect type in assignment (different base types) > net/ipv4/ip_tunnel_core.c:619:30: expected int type > net/ipv4/ip_tunnel_core.c:619:30: got restricted __be16 [usertype] > net/ipv4/ip_tunnel_core.c:629:30: warning: incorrect type in assignment (different base types) > net/ipv4/ip_tunnel_core.c:629:30: expected int type > net/ipv4/ip_tunnel_core.c:629:30: got restricted __be16 [usertype]
The TUNNEL_* types are big-endian, so adjust the type of the local variable in ip_tun_parse_opts().
Signed-off-by: Julian Wiedmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3 |
|
| #
682036b2 |
| 07-Nov-2020 |
Heiner Kallweit <[email protected]> |
net: remove ip_tunnel_get_stats64
After having migrated all users remove ip_tunnel_get_stats64().
Signed-off-by: Heiner Kallweit <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]
net: remove ip_tunnel_get_stats64
After having migrated all users remove ip_tunnel_get_stats64().
Signed-off-by: Heiner Kallweit <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
77a2d673 |
| 06-Nov-2020 |
Stefano Brivio <[email protected]> |
tunnels: Fix off-by-one in lower MTU bounds for ICMP/ICMPv6 replies
Jianlin reports that a bridged IPv6 VXLAN endpoint, carrying IPv6 packets over a link with a PMTU estimation of exactly 1350 bytes
tunnels: Fix off-by-one in lower MTU bounds for ICMP/ICMPv6 replies
Jianlin reports that a bridged IPv6 VXLAN endpoint, carrying IPv6 packets over a link with a PMTU estimation of exactly 1350 bytes, won't trigger ICMPv6 Packet Too Big replies when the encapsulated datagrams exceed said PMTU value. VXLAN over IPv6 adds 70 bytes of overhead, so an ICMPv6 reply indicating 1280 bytes as inner MTU would be legitimate and expected.
This comes from an off-by-one error I introduced in checks added as part of commit 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets"), whose purpose was to prevent sending ICMPv6 Packet Too Big messages with an MTU lower than the smallest permissible IPv6 link MTU, i.e. 1280 bytes.
In iptunnel_pmtud_check_icmpv6(), avoid triggering a reply only if the advertised MTU would be less than, and not equal to, 1280 bytes.
Also fix the analogous comparison for IPv4, that is, skip the ICMP reply only if the resulting MTU is strictly less than 576 bytes.
This becomes apparent while running the net/pmtu.sh bridged VXLAN or GENEVE selftests with adjusted lower-link MTU values. Using e.g. GENEVE, setting ll_mtu to the values reported below, in the test_pmtu_ipvX_over_bridged_vxlanY_or_geneveY_exception() test function, we can see failures on the following tests:
test | ll_mtu -------------------------------|-------- pmtu_ipv4_br_geneve4_exception | 626 pmtu_ipv6_br_geneve4_exception | 1330 pmtu_ipv6_br_geneve6_exception | 1350
owing to the different tunneling overheads implied by the corresponding configurations.
Reported-by: Jianlin Shi <[email protected]> Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets") Signed-off-by: Stefano Brivio <[email protected]> Link: https://lore.kernel.org/r/4f5fc2f33bfdf8409549fafd4f952b008bf04d63.1604681709.git.sbrivio@redhat.com Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v5.10-rc2, v5.10-rc1 |
|
| #
cf89f18f |
| 12-Oct-2020 |
Heiner Kallweit <[email protected]> |
iptunnel: use new function dev_fetch_sw_netstats
Simplify the code by using new function dev_fetch_sw_netstats().
Signed-off-by: Heiner Kallweit <[email protected]> Link: https://lore.kernel.org
iptunnel: use new function dev_fetch_sw_netstats
Simplify the code by using new function dev_fetch_sw_netstats().
Signed-off-by: Heiner Kallweit <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5 |
|
| #
681d2cfb |
| 13-Sep-2020 |
Xin Long <[email protected]> |
lwtunnel: only keep the available bits when setting vxlan md->gbp
As we can see from vxlan_build/parse_gbp_hdr(), when processing metadata on vxlan rx/tx path, only dont_learn/policy_applied/policy_
lwtunnel: only keep the available bits when setting vxlan md->gbp
As we can see from vxlan_build/parse_gbp_hdr(), when processing metadata on vxlan rx/tx path, only dont_learn/policy_applied/policy_id fields can be set to or parse from the packet for vxlan gbp option.
So do the mask when set it in lwtunnel, as it does in act_tunnel_key and cls_flower.
Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1 |
|
| #
96436094 |
| 12-Aug-2020 |
David S. Miller <[email protected]> |
Revert "ipv4: tunnel: fix compilation on ARCH=um"
This reverts commit 06a7a37be55e29961c9ba2abec4d07c8e0e21861.
The bug was already fixed, this added a dup include.
Reported-by: Eric Dumazet <eric
Revert "ipv4: tunnel: fix compilation on ARCH=um"
This reverts commit 06a7a37be55e29961c9ba2abec4d07c8e0e21861.
The bug was already fixed, this added a dup include.
Reported-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
06a7a37b |
| 12-Aug-2020 |
Johannes Berg <[email protected]> |
ipv4: tunnel: fix compilation on ARCH=um
With certain configurations, a 64-bit ARCH=um errors out here with an unknown csum_ipv6_magic() function. Include the right header file to always have it.
S
ipv4: tunnel: fix compilation on ARCH=um
With certain configurations, a 64-bit ARCH=um errors out here with an unknown csum_ipv6_magic() function. Include the right header file to always have it.
Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
8ed54f16 |
| 05-Aug-2020 |
Stefano Brivio <[email protected]> |
ip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM
On architectures defining _HAVE_ARCH_IPV6_CSUM, we get csum_ipv6_magic() defined by means of arch checksum.h headers. On other archit
ip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM
On architectures defining _HAVE_ARCH_IPV6_CSUM, we get csum_ipv6_magic() defined by means of arch checksum.h headers. On other architectures, we actually need to include net/ip6_checksum.h to be able to use it.
Without this include, building with defconfig breaks at least for s390.
Reported-by: Stephen Rothwell <[email protected]> Fixes: 4cb47a8644cc ("tunnels: PMTU discovery support for directly bridged IP packets") Signed-off-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
4cb47a86 |
| 04-Aug-2020 |
Stefano Brivio <[email protected]> |
tunnels: PMTU discovery support for directly bridged IP packets
It's currently possible to bridge Ethernet tunnels carrying IP packets directly to external interfaces without assigning them addresse
tunnels: PMTU discovery support for directly bridged IP packets
It's currently possible to bridge Ethernet tunnels carrying IP packets directly to external interfaces without assigning them addresses and routes on the bridged network itself: this is the case for UDP tunnels bridged with a standard bridge or by Open vSwitch.
PMTU discovery is currently broken with those configurations, because the encapsulation effectively decreases the MTU of the link, and while we are able to account for this using PMTU discovery on the lower layer, we don't have a way to relay ICMP or ICMPv6 messages needed by the sender, because we don't have valid routes to it.
On the other hand, as a tunnel endpoint, we can't fragment packets as a general approach: this is for instance clearly forbidden for VXLAN by RFC 7348, section 4.3:
VTEPs MUST NOT fragment VXLAN packets. Intermediate routers may fragment encapsulated VXLAN packets due to the larger frame size. The destination VTEP MAY silently discard such VXLAN fragments.
The same paragraph recommends that the MTU over the physical network accomodates for encapsulations, but this isn't a practical option for complex topologies, especially for typical Open vSwitch use cases.
Further, it states that:
Other techniques like Path MTU discovery (see [RFC1191] and [RFC1981]) MAY be used to address this requirement as well.
Now, PMTU discovery already works for routed interfaces, we get route exceptions created by the encapsulation device as they receive ICMP Fragmentation Needed and ICMPv6 Packet Too Big messages, and we already rebuild those messages with the appropriate MTU and route them back to the sender.
Add the missing bits for bridged cases:
- checks in skb_tunnel_check_pmtu() to understand if it's appropriate to trigger a reply according to RFC 1122 section 3.2.2 for ICMP and RFC 4443 section 2.4 for ICMPv6. This function is already called by UDP tunnels
- a new function generating those ICMP or ICMPv6 replies. We can't reuse icmp_send() and icmp6_send() as we don't see the sender as a valid destination. This doesn't need to be generic, as we don't cover any other type of ICMP errors given that we only provide an encapsulation function to the sender
While at it, make the MTU check in skb_tunnel_check_pmtu() accurate: we might receive GSO buffers here, and the passed headroom already includes the inner MAC length, so we don't have to account for it a second time (that would imply three MAC headers on the wire, but there are just two).
This issue became visible while bridging IPv6 packets with 4500 bytes of payload over GENEVE using IPv4 with a PMTU of 4000. Given the 50 bytes of encapsulation headroom, we would advertise MTU as 3950, and we would reject fragmented IPv6 datagrams of 3958 bytes size on the wire. We're exclusively dealing with network MTU here, though, so we could get Ethernet frames up to 3964 octets in that case.
v2: - moved skb_tunnel_check_pmtu() to ip_tunnel_core.c (David Ahern) - split IPv4/IPv6 functions (David Ahern)
Signed-off-by: Stefano Brivio <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4 |
|
| #
2606aff9 |
| 30-Jun-2020 |
Jason A. Donenfeld <[email protected]> |
net: ip_tunnel: add header_ops for layer 3 devices
Some devices that take straight up layer 3 packets benefit from having a shared header_ops so that AF_PACKET sockets can inject packets that are re
net: ip_tunnel: add header_ops for layer 3 devices
Some devices that take straight up layer 3 packets benefit from having a shared header_ops so that AF_PACKET sockets can inject packets that are recognized. This shared infrastructure will be used by other drivers that currently can't inject packets using AF_PACKET. It also exposes the parser function, as it is useful in standalone form too.
Signed-off-by: Jason A. Donenfeld <[email protected]> Acked-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6 |
|
| #
faee6769 |
| 27-Mar-2020 |
Alexander Aring <[email protected]> |
net: add net available in build_state
The build_state callback of lwtunnel doesn't contain the net namespace structure yet. This patch will add it so we can check on specific address configuration a
net: add net available in build_state
The build_state callback of lwtunnel doesn't contain the net namespace structure yet. This patch will add it so we can check on specific address configuration at creation time of rpl source routes.
Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4 |
|
| #
1841b982 |
| 21-Nov-2019 |
Xin Long <[email protected]> |
lwtunnel: check erspan options before allocating tun_info
As Jakub suggested on another patch, it's better to do the check on erspan options before allocating memory.
Signed-off-by: Xin Long <lucie
lwtunnel: check erspan options before allocating tun_info
As Jakub suggested on another patch, it's better to do the check on erspan options before allocating memory.
Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
7b6a70f7 |
| 21-Nov-2019 |
Xin Long <[email protected]> |
lwtunnel: be STRICT to validate the new LWTUNNEL_IP(6)_OPTS
LWTUNNEL_IP(6)_OPTS are the new items in ip(6)_tun_policy, which are parsed by nla_parse_nested_deprecated(). We should check it strictly
lwtunnel: be STRICT to validate the new LWTUNNEL_IP(6)_OPTS
LWTUNNEL_IP(6)_OPTS are the new items in ip(6)_tun_policy, which are parsed by nla_parse_nested_deprecated(). We should check it strictly by setting .strict_start_type = LWTUNNEL_IP(6)_OPTS.
This patch also adds missing LWTUNNEL_IP6_OPTS in ip6_tun_policy.
Fixes: 4ece47787077 ("lwtunnel: add options setting and dumping for geneve") Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
2f1d370b |
| 19-Nov-2019 |
Xin Long <[email protected]> |
lwtunnel: add support for multiple geneve opts
geneve RFC (draft-ietf-nvo3-geneve-14) allows a geneve packet to carry multiple geneve opts, so it's necessary for lwtunnel to support adding multiple
lwtunnel: add support for multiple geneve opts
geneve RFC (draft-ietf-nvo3-geneve-14) allows a geneve packet to carry multiple geneve opts, so it's necessary for lwtunnel to support adding multiple geneve opts in one lwtunnel route. But vxlan and erspan opts are still only allowed to add one option.
With this patch, iproute2 could make it like:
# ip r a 1.1.1.0/24 encap ip id 1 geneve_opts 0:0:12121212,1:2:12121212 \ dst 10.1.0.2 dev geneve1
# ip r a 1.1.1.0/24 encap ip id 1 vxlan_opts 456 \ dst 10.1.0.2 dev erspan1
# ip r a 1.1.1.0/24 encap ip id 1 erspan_opts 1:123:0:0 \ dst 10.1.0.2 dev erspan1
Which are pretty much like cls_flower and act_tunnel_key.
Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
3132174b |
| 18-Nov-2019 |
Xin Long <[email protected]> |
lwtunnel: change to use nla_put_u8 for LWTUNNEL_IP_OPT_ERSPAN_VER
LWTUNNEL_IP_OPT_ERSPAN_VER is u8 type, and nla_put_u8 should have been used instead of nla_put_u32(). This is a copy-paste error.
F
lwtunnel: change to use nla_put_u8 for LWTUNNEL_IP_OPT_ERSPAN_VER
LWTUNNEL_IP_OPT_ERSPAN_VER is u8 type, and nla_put_u8 should have been used instead of nla_put_u32(). This is a copy-paste error.
Fixes: b0a21810bd5e ("lwtunnel: add options setting and dumping for erspan") Signed-off-by: Xin Long <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.4-rc8, v5.4-rc7 |
|
| #
0c06d166 |
| 10-Nov-2019 |
Xin Long <[email protected]> |
lwtunnel: ignore any TUNNEL_OPTIONS_PRESENT flags set by users
TUNNEL_OPTIONS_PRESENT (TUNNEL_GENEVE_OPT|TUNNEL_VXLAN_OPT| TUNNEL_ERSPAN_OPT) flags should be set only according to tb[LWTUNNEL_IP_OPT
lwtunnel: ignore any TUNNEL_OPTIONS_PRESENT flags set by users
TUNNEL_OPTIONS_PRESENT (TUNNEL_GENEVE_OPT|TUNNEL_VXLAN_OPT| TUNNEL_ERSPAN_OPT) flags should be set only according to tb[LWTUNNEL_IP_OPTS], which is done in ip_tun_parse_opts().
When setting info key.tun_flags, the TUNNEL_OPTIONS_PRESENT bits in tb[LWTUNNEL_IP(6)_FLAGS] passed from users should be ignored.
While at it, replace all (TUNNEL_GENEVE_OPT|TUNNEL_VXLAN_OPT| TUNNEL_ERSPAN_OPT) with 'TUNNEL_OPTIONS_PRESENT'.
Fixes: 3093fbe7ff4b ("route: Per route IP tunnel metadata via lightweight tunnel") Fixes: 32a2b002ce61 ("ipv6: route: per route IP tunnel metadata via lightweight tunnel") Signed-off-by: Xin Long <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|