|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1 |
|
| #
8fa7292f |
| 05-Apr-2025 |
Thomas Gleixner <[email protected]> |
treewide: Switch/rename to timer_delete[_sync]()
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree over and remove the historical wrapper inlines.
Conversion was done with c
treewide: Switch/rename to timer_delete[_sync]()
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree over and remove the historical wrapper inlines.
Conversion was done with coccinelle plus manual fixups where necessary.
Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
show more ...
|
|
Revision tags: v6.14, v6.14-rc7 |
|
| #
8033d2ae |
| 12-Mar-2025 |
Stanislav Fomichev <[email protected]> |
Revert "net: replace dev_addr_sem with netdev instance lock"
This reverts commit df43d8bf10316a7c3b1e47e3cc0057a54df4a5b8.
Cc: Kohei Enju <[email protected]> Reviewed-by: Kuniyuki Iwashima <kuniyu@a
Revert "net: replace dev_addr_sem with netdev instance lock"
This reverts commit df43d8bf10316a7c3b1e47e3cc0057a54df4a5b8.
Cc: Kohei Enju <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Fixes: df43d8bf1031 ("net: replace dev_addr_sem with netdev instance lock") Signed-off-by: Stanislav Fomichev <[email protected]> Link: https://patch.msgid.link/[email protected] Tested-by: Lei Yang <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc6 |
|
| #
df43d8bf |
| 05-Mar-2025 |
Stanislav Fomichev <[email protected]> |
net: replace dev_addr_sem with netdev instance lock
Lockdep reports possible circular dependency in [0]. Instead of fixing the ordering, replace global dev_addr_sem with netdev instance lock. Most o
net: replace dev_addr_sem with netdev instance lock
Lockdep reports possible circular dependency in [0]. Instead of fixing the ordering, replace global dev_addr_sem with netdev instance lock. Most of the paths that set/get mac are RTNL protected. Two places where it's not, convert to explicit locking: - sysfs address_show - dev_get_mac_address via dev_ioctl
0: https://netdev-3.bots.linux.dev/vmksft-forwarding-dbg/results/993321/24-router-bridge-1d-lag-sh/stderr
Signed-off-by: Stanislav Fomichev <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
0ca23a4d |
| 05-Mar-2025 |
Marcus Wichelmann <[email protected]> |
net: tun: Enable transfer of XDP metadata to skb
When the XDP metadata area was used, it is expected that the same metadata can also be accessed from TC, as can be read in the description of the bpf
net: tun: Enable transfer of XDP metadata to skb
When the XDP metadata area was used, it is expected that the same metadata can also be accessed from TC, as can be read in the description of the bpf_xdp_adjust_meta helper function. In the tun driver, this was not yet implemented.
To make this work, the skb that is being built on XDP_PASS should know of the current size of the metadata area. This is ensured by adding calls to skb_metadata_set. For the tun_xdp_one code path, an additional check is necessary to handle the case where the externally initialized xdp_buff has no metadata support (xdp->data_meta == xdp->data + 1).
More information about this feature can be found in the commit message of commit de8f3a83b0a0 ("bpf: add meta pointer for direct access").
Signed-off-by: Marcus Wichelmann <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Acked-by: Jason Wang <[email protected]> Link: https://patch.msgid.link/[email protected]
show more ...
|
| #
c2315ebb |
| 05-Mar-2025 |
Marcus Wichelmann <[email protected]> |
net: tun: Enable XDP metadata support
Enable the support for the bpf_xdp_adjust_meta helper function for XDP buffers initialized by the tun driver. This allows to reserve a metadata area that is use
net: tun: Enable XDP metadata support
Enable the support for the bpf_xdp_adjust_meta helper function for XDP buffers initialized by the tun driver. This allows to reserve a metadata area that is useful to pass any information from one XDP program to another one, for example when using tail-calls.
Whether this helper function can be used in an XDP program depends on how the xdp_buff was initialized. Most net drivers initialize the xdp_buff in a way, that allows bpf_xdp_adjust_meta to be used. In case of the tun driver, this is currently not the case.
There are two code paths in the tun driver that lead to a bpf_prog_run_xdp and where metadata support should be enabled:
1. tun_build_skb, which is called by tun_get_user and is used when writing packets from userspace into the device. In this case, the xdp_buff created in tun_build_skb has no support for bpf_xdp_adjust_meta and calls of that helper function result in ENOTSUPP.
For this code path, it's sufficient to set the meta_valid argument of the xdp_prepare_buff call. The reserved headroom is large enough already.
2. tun_xdp_one, which is called by tun_sendmsg which again is called by other drivers (e.g. vhost_net). When the TUN_MSG_PTR mode is used, another driver may pass a batch of xdp_buffs to the tun driver. In this case, that other driver is the one initializing the xdp_buff.
See commit 043d222f93ab ("tuntap: accept an array of XDP buffs through sendmsg()") for details.
For now, the vhost_net driver is the only one using TUN_MSG_PTR and it already initializes the xdp_buffs with metadata support and sufficient headroom. But the tun driver disables it again, so the xdp_set_data_meta_invalid call has to be removed.
Signed-off-by: Marcus Wichelmann <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Acked-by: Jason Wang <[email protected]> Link: https://patch.msgid.link/[email protected]
show more ...
|
|
Revision tags: v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2 |
|
| #
1d41e2fa |
| 07-Feb-2025 |
Akihiko Odaki <[email protected]> |
tun: Extract the vnet handling code
The vnet handling code will be reused by tap.
Functions are renamed to ensure that their names contain "vnet" to clarify that they are part of the decoupled vnet
tun: Extract the vnet handling code
The vnet handling code will be reused by tap.
Functions are renamed to ensure that their names contain "vnet" to clarify that they are part of the decoupled vnet handling code.
Signed-off-by: Akihiko Odaki <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
2506251e |
| 07-Feb-2025 |
Akihiko Odaki <[email protected]> |
tun: Decouple vnet handling
Decouple the vnet handling code so that we can reuse it for tap.
Signed-off-by: Akihiko Odaki <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]
tun: Decouple vnet handling
Decouple the vnet handling code so that we can reuse it for tap.
Signed-off-by: Akihiko Odaki <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
60df67b9 |
| 07-Feb-2025 |
Akihiko Odaki <[email protected]> |
tun: Decouple vnet from tun_struct
Decouple vnet-related functions from tun_struct so that we can reuse them for tap in the future.
Signed-off-by: Akihiko Odaki <[email protected]> Reviewed-
tun: Decouple vnet from tun_struct
Decouple vnet-related functions from tun_struct so that we can reuse them for tap in the future.
Signed-off-by: Akihiko Odaki <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
07e8b3ba |
| 07-Feb-2025 |
Akihiko Odaki <[email protected]> |
tun: Keep hdr_len in tun_get_user()
hdr_len is repeatedly used so keep it in a local variable.
Signed-off-by: Akihiko Odaki <[email protected]> Reviewed-by: Willem de Bruijn <willemb@google.
tun: Keep hdr_len in tun_get_user()
hdr_len is repeatedly used so keep it in a local variable.
Signed-off-by: Akihiko Odaki <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
5a9c5e5d |
| 07-Feb-2025 |
Akihiko Odaki <[email protected]> |
tun: Refactor CONFIG_TUN_VNET_CROSS_LE
Check IS_ENABLED(CONFIG_TUN_VNET_CROSS_LE) to save some lines and make future changes easier.
Signed-off-by: Akihiko Odaki <[email protected]> Reviewed
tun: Refactor CONFIG_TUN_VNET_CROSS_LE
Check IS_ENABLED(CONFIG_TUN_VNET_CROSS_LE) to save some lines and make future changes easier.
Signed-off-by: Akihiko Odaki <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
a70c7b3c |
| 04-Feb-2025 |
Willem de Bruijn <[email protected]> |
tun: revert fix group permission check
This reverts commit 3ca459eaba1bf96a8c7878de84fa8872259a01e3.
The blamed commit caused a regression when neither tun->owner nor tun->group is set. This is int
tun: revert fix group permission check
This reverts commit 3ca459eaba1bf96a8c7878de84fa8872259a01e3.
The blamed commit caused a regression when neither tun->owner nor tun->group is set. This is intended to be allowed, but now requires CAP_NET_ADMIN.
Discussion in the referenced thread pointed out that the original issue that prompted this patch can be resolved in userspace.
The relaxed access control may also make a device accessible when it previously wasn't, while existing users may depend on it to not be.
This is a clean pure git revert, except for fixing the indentation on the gid_valid line that checkpatch correctly flagged.
Fixes: 3ca459eaba1b ("tun: fix group permission check") Link: https://lore.kernel.org/netdev/CAFqZXNtkCBT4f+PwyVRmQGoT3p1eVa01fCG_aNtpt6dakXncUg@mail.gmail.com/ Signed-off-by: Willem de Bruijn <[email protected]> Cc: Ondrej Mosnacek <[email protected]> Cc: Stas Sergeev <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4 |
|
| #
a126061c |
| 17-Dec-2024 |
Eric Dumazet <[email protected]> |
ptr_ring: do not block hard interrupts in ptr_ring_resize_multiple()
Jakub added a lockdep_assert_no_hardirq() check in __page_pool_put_page() to increase test coverage.
syzbot found a splat caused
ptr_ring: do not block hard interrupts in ptr_ring_resize_multiple()
Jakub added a lockdep_assert_no_hardirq() check in __page_pool_put_page() to increase test coverage.
syzbot found a splat caused by hard irq blocking in ptr_ring_resize_multiple() [1]
As current users of ptr_ring_resize_multiple() do not require hard irqs being masked, replace it to only block BH.
Rename helpers to better reflect they are safe against BH only.
- ptr_ring_resize_multiple() to ptr_ring_resize_multiple_bh() - skb_array_resize_multiple() to skb_array_resize_multiple_bh()
[1]
WARNING: CPU: 1 PID: 9150 at net/core/page_pool.c:709 __page_pool_put_page net/core/page_pool.c:709 [inline] WARNING: CPU: 1 PID: 9150 at net/core/page_pool.c:709 page_pool_put_unrefed_netmem+0x157/0xa40 net/core/page_pool.c:780 Modules linked in: CPU: 1 UID: 0 PID: 9150 Comm: syz.1.1052 Not tainted 6.11.0-rc3-syzkaller-00202-gf8669d7b5f5d #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 RIP: 0010:__page_pool_put_page net/core/page_pool.c:709 [inline] RIP: 0010:page_pool_put_unrefed_netmem+0x157/0xa40 net/core/page_pool.c:780 Code: 74 0e e8 7c aa fb f7 eb 43 e8 75 aa fb f7 eb 3c 65 8b 1d 38 a8 6a 76 31 ff 89 de e8 a3 ae fb f7 85 db 74 0b e8 5a aa fb f7 90 <0f> 0b 90 eb 1d 65 8b 1d 15 a8 6a 76 31 ff 89 de e8 84 ae fb f7 85 RSP: 0018:ffffc9000bda6b58 EFLAGS: 00010083 RAX: ffffffff8997e523 RBX: 0000000000000000 RCX: 0000000000040000 RDX: ffffc9000fbd0000 RSI: 0000000000001842 RDI: 0000000000001843 RBP: 0000000000000000 R08: ffffffff8997df2c R09: 1ffffd40003a000d R10: dffffc0000000000 R11: fffff940003a000e R12: ffffea0001d00040 R13: ffff88802e8a4000 R14: dffffc0000000000 R15: 00000000ffffffff FS: 00007fb7aaf716c0(0000) GS:ffff8880b9300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa15a0d4b72 CR3: 00000000561b0000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tun_ptr_free drivers/net/tun.c:617 [inline] __ptr_ring_swap_queue include/linux/ptr_ring.h:571 [inline] ptr_ring_resize_multiple_noprof include/linux/ptr_ring.h:643 [inline] tun_queue_resize drivers/net/tun.c:3694 [inline] tun_device_event+0xaaf/0x1080 drivers/net/tun.c:3714 notifier_call_chain+0x19f/0x3e0 kernel/notifier.c:93 call_netdevice_notifiers_extack net/core/dev.c:2032 [inline] call_netdevice_notifiers net/core/dev.c:2046 [inline] dev_change_tx_queue_len+0x158/0x2a0 net/core/dev.c:9024 do_setlink+0xff6/0x41f0 net/core/rtnetlink.c:2923 rtnl_setlink+0x40d/0x5a0 net/core/rtnetlink.c:3201 rtnetlink_rcv_msg+0x73f/0xcf0 net/core/rtnetlink.c:6647 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2550
Fixes: ff4e538c8c3e ("page_pool: add a lockdep check for recycling in hardirq") Reported-by: [email protected] Closes: https://lore.kernel.org/netdev/[email protected]/T/ Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Acked-by: Jason Wang <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc3 |
|
| #
429fde2d |
| 12-Dec-2024 |
Eric Dumazet <[email protected]> |
net: tun: fix tun_napi_alloc_frags()
syzbot reported the following crash [1]
Issue came with the blamed commit. Instead of going through all the iov components, we keep using the first one and end
net: tun: fix tun_napi_alloc_frags()
syzbot reported the following crash [1]
Issue came with the blamed commit. Instead of going through all the iov components, we keep using the first one and end up with a malformed skb.
[1]
kernel BUG at net/core/skbuff.c:2849 ! Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 0 UID: 0 PID: 6230 Comm: syz-executor132 Not tainted 6.13.0-rc1-syzkaller-00407-g96b6fcc0ee41 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/25/2024 RIP: 0010:__pskb_pull_tail+0x1568/0x1570 net/core/skbuff.c:2848 Code: 38 c1 0f 8c 32 f1 ff ff 4c 89 f7 e8 92 96 74 f8 e9 25 f1 ff ff e8 e8 ae 09 f8 48 8b 5c 24 08 e9 eb fb ff ff e8 d9 ae 09 f8 90 <0f> 0b 66 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 RSP: 0018:ffffc90004cbef30 EFLAGS: 00010293 RAX: ffffffff8995c347 RBX: 00000000fffffff2 RCX: ffff88802cf45a00 RDX: 0000000000000000 RSI: 00000000fffffff2 RDI: 0000000000000000 RBP: ffff88807df0c06a R08: ffffffff8995b084 R09: 1ffff1100fbe185c R10: dffffc0000000000 R11: ffffed100fbe185d R12: ffff888076e85d50 R13: ffff888076e85c80 R14: ffff888076e85cf4 R15: ffff888076e85c80 FS: 00007f0dca6ea6c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f0dca6ead58 CR3: 00000000119da000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> skb_cow_data+0x2da/0xcb0 net/core/skbuff.c:5284 tipc_aead_decrypt net/tipc/crypto.c:894 [inline] tipc_crypto_rcv+0x402/0x24e0 net/tipc/crypto.c:1844 tipc_rcv+0x57e/0x12a0 net/tipc/node.c:2109 tipc_l2_rcv_msg+0x2bd/0x450 net/tipc/bearer.c:668 __netif_receive_skb_list_ptype net/core/dev.c:5720 [inline] __netif_receive_skb_list_core+0x8b7/0x980 net/core/dev.c:5762 __netif_receive_skb_list net/core/dev.c:5814 [inline] netif_receive_skb_list_internal+0xa51/0xe30 net/core/dev.c:5905 gro_normal_list include/net/gro.h:515 [inline] napi_complete_done+0x2b5/0x870 net/core/dev.c:6256 napi_complete include/linux/netdevice.h:567 [inline] tun_get_user+0x2ea0/0x4890 drivers/net/tun.c:1982 tun_chr_write_iter+0x10d/0x1f0 drivers/net/tun.c:2057 do_iter_readv_writev+0x600/0x880 vfs_writev+0x376/0xba0 fs/read_write.c:1050 do_writev+0x1b6/0x360 fs/read_write.c:1096 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes: de4f5fed3f23 ("iov_iter: add iter_iovec() helper") Reported-by: [email protected] Closes: https://lore.kernel.org/netdev/[email protected]/T/#u Cc: [email protected] Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Joe Damato <[email protected]> Reviewed-by: Jens Axboe <[email protected]> Acked-by: Willem de Bruijn <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc2 |
|
| #
3ca459ea |
| 05-Dec-2024 |
Stas Sergeev <[email protected]> |
tun: fix group permission check
Currently tun checks the group permission even if the user have matched. Besides going against the usual permission semantic, this has a very interesting implication:
tun: fix group permission check
Currently tun checks the group permission even if the user have matched. Besides going against the usual permission semantic, this has a very interesting implication: if the tun group is not among the supplementary groups of the tun user, then effectively no one can access the tun device. CAP_SYS_ADMIN still can, but its the same as not setting the tun ownership.
This patch relaxes the group checking so that either the user match or the group match is enough. This avoids the situation when no one can access the device even though the ownership is properly set.
Also I simplified the logic by removing the redundant inversions: tun_not_capable() --> !tun_capable()
Signed-off-by: Stas Sergeev <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Acked-by: Jason Wang <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2 |
|
| #
b63c755c |
| 30-Sep-2024 |
Dr. David Alan Gilbert <[email protected]> |
appletalk: Remove deadcode
alloc_ltalkdev in net/appletalk/dev.c is dead since commit 00f3696f7555 ("net: appletalk: remove cops support")
Removing it (and it's helper) leaves dev.c and if_ltalk.h
appletalk: Remove deadcode
alloc_ltalkdev in net/appletalk/dev.c is dead since commit 00f3696f7555 ("net: appletalk: remove cops support")
Removing it (and it's helper) leaves dev.c and if_ltalk.h empty; remove them and the Makefile entry.
tun.c was including that if_ltalk.h but actually wanted the uapi version for LTALK_ALEN, fix up the path.
Signed-off-by: Dr. David Alan Gilbert <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Acked-by: Jason Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.12-rc1 |
|
| #
cb787f4a |
| 27-Sep-2024 |
Al Viro <[email protected]> |
[tree-wide] finally take no_llseek out
no_llseek had been defined to NULL two years ago, in commit 868941b14441 ("fs: remove no_llseek")
To quote that commit,
At -rc1 we'll need do a mechanical
[tree-wide] finally take no_llseek out
no_llseek had been defined to NULL two years ago, in commit 868941b14441 ("fs: remove no_llseek")
To quote that commit,
At -rc1 we'll need do a mechanical removal of no_llseek -
git grep -l -w no_llseek | grep -v porting.rst | while read i; do sed -i '/\<no_llseek\>/d' $i done
would do it.
Unfortunately, that hadn't been done. Linus, could you do that now, so that we could finally put that thing to rest? All instances are of the form .llseek = no_llseek, so it's obviously safe.
Signed-off-by: Al Viro <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v6.11, v6.11-rc7, v6.11-rc6 |
|
| #
00d066a4 |
| 29-Aug-2024 |
Alexander Lobakin <[email protected]> |
netdev_features: convert NETIF_F_LLTX to dev->lltx
NETIF_F_LLTX can't be changed via Ethtool and is not a feature, rather an attribute, very similar to IFF_NO_QUEUE (and hot). Free one netdev_featur
netdev_features: convert NETIF_F_LLTX to dev->lltx
NETIF_F_LLTX can't be changed via Ethtool and is not a feature, rather an attribute, very similar to IFF_NO_QUEUE (and hot). Free one netdev_features_t bit and make it a "hot" private flag.
Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
|
Revision tags: v6.11-rc5, v6.11-rc4, v6.11-rc3 |
|
| #
1934b212 |
| 09-Aug-2024 |
Christian Brauner <[email protected]> |
file: reclaim 24 bytes from f_owner
We do embedd struct fown_struct into struct file letting it take up 32 bytes in total. We could tweak struct fown_struct to be more compact but really it shouldn'
file: reclaim 24 bytes from f_owner
We do embedd struct fown_struct into struct file letting it take up 32 bytes in total. We could tweak struct fown_struct to be more compact but really it shouldn't even be embedded in struct file in the first place.
Instead, actual users of struct fown_struct should allocate the struct on demand. This frees up 24 bytes in struct file.
That will have some potentially user-visible changes for the ownership fcntl()s. Some of them can now fail due to allocation failures. Practically, that probably will almost never happen as the allocations are small and they only happen once per file.
The fown_struct is used during kill_fasync() which is used by e.g., pipes to generate a SIGIO signal. Sending of such signals is conditional on userspace having set an owner for the file using one of the F_OWNER fcntl()s. Such users will be unaffected if struct fown_struct is allocated during the fcntl() call.
There are a few subsystems that call __f_setown() expecting file->f_owner to be allocated:
(1) tun devices file->f_op->fasync::tun_chr_fasync() -> __f_setown()
There are no callers of tun_chr_fasync().
(2) tty devices
file->f_op->fasync::tty_fasync() -> __tty_fasync() -> __f_setown()
tty_fasync() has no additional callers but __tty_fasync() has. Note that __tty_fasync() only calls __f_setown() if the @on argument is true. It's called from:
file->f_op->release::tty_release() -> tty_release() -> __tty_fasync() -> __f_setown()
tty_release() calls __tty_fasync() with @on false => __f_setown() is never called from tty_release(). => All callers of tty_release() are safe as well.
file->f_op->release::tty_open() -> tty_release() -> __tty_fasync() -> __f_setown()
__tty_hangup() calls __tty_fasync() with @on false => __f_setown() is never called from tty_release(). => All callers of __tty_hangup() are safe as well.
From the callchains it's obvious that (1) and (2) end up getting called via file->f_op->fasync(). That can happen either through the F_SETFL fcntl() with the FASYNC flag raised or via the FIOASYNC ioctl(). If FASYNC is requested and the file isn't already FASYNC then file->f_op->fasync() is called with @on true which ends up causing both (1) and (2) to call __f_setown().
(1) and (2) are the only subsystems that call __f_setown() from the file->f_op->fasync() handler. So both (1) and (2) have been updated to allocate a struct fown_struct prior to calling fasync_helper() to register with the fasync infrastructure. That's safe as they both call fasync_helper() which also does allocations if @on is true.
The other interesting case are file leases:
(3) file leases lease_manager_ops->lm_setup::lease_setup() -> __f_setown()
Which in turn is called from:
generic_add_lease() -> lease_manager_ops->lm_setup::lease_setup() -> __f_setown()
So here again we can simply make generic_add_lease() allocate struct fown_struct prior to the lease_manager_ops->lm_setup::lease_setup() which happens under a spinlock.
With that the two remaining subsystems that call __f_setown() are:
(4) dnotify (5) sockets
Both have their own custom ioctls to set struct fown_struct and both have been converted to allocate a struct fown_struct on demand from their respective ioctls.
Interactions with O_PATH are fine as well e.g., when opening a /dev/tty as O_PATH then no file->f_op->open() happens thus no file->f_owner is allocated. That's fine as no file operation will be set for those and the device has never been opened. fcntl()s called on such things will just allocate a ->f_owner on demand. Although I have zero idea why'd you care about f_owner on an O_PATH fd.
Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Christian Brauner <[email protected]>
show more ...
|
|
Revision tags: v6.11-rc2, v6.11-rc1 |
|
| #
04958480 |
| 24-Jul-2024 |
Dongli Zhang <[email protected]> |
tun: add missing verification for short frame
The cited commit missed to check against the validity of the frame length in the tun_xdp_one() path, which could cause a corrupted skb to be sent downst
tun: add missing verification for short frame
The cited commit missed to check against the validity of the frame length in the tun_xdp_one() path, which could cause a corrupted skb to be sent downstack. Even before the skb is transmitted, the tun_xdp_one-->eth_type_trans() may access the Ethernet header although it can be less than ETH_HLEN. Once transmitted, this could either cause out-of-bound access beyond the actual length, or confuse the underlayer with incorrect or inconsistent header length in the skb metadata.
In the alternative path, tun_get_user() already prohibits short frame which has the length less than Ethernet header size from being transmitted for IFF_TAP.
This is to drop any frame shorter than the Ethernet header size just like how tun_get_user() does.
CVE: CVE-2024-41091 Inspired-by: https://lore.kernel.org/netdev/[email protected]/ Fixes: 043d222f93ab ("tuntap: accept an array of XDP buffs through sendmsg()") Cc: [email protected] Signed-off-by: Dongli Zhang <[email protected]> Reviewed-by: Si-Wei Liu <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Reviewed-by: Paolo Abeni <[email protected]> Reviewed-by: Jason Wang <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.10, v6.10-rc7 |
|
| #
fecef4cd |
| 04-Jul-2024 |
Sebastian Andrzej Siewior <[email protected]> |
tun: Assign missing bpf_net_context.
During the introduction of struct bpf_net_context handling for XDP-redirect, the tun driver has been missed. Jakub also pointed out that there is another call ch
tun: Assign missing bpf_net_context.
During the introduction of struct bpf_net_context handling for XDP-redirect, the tun driver has been missed. Jakub also pointed out that there is another call chain to do_xdp_generic() originating from netif_receive_skb() and drivers may use it outside from the NAPI context.
Set the bpf_net_context before invoking BPF XDP program within the TUN driver. Set the bpf_net_context also in do_xdp_generic() if a xdp program is available.
Reported-by: [email protected] Reported-by: [email protected] Reported-by: [email protected] Fixes: 401cb7dae8130 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.") Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5 |
|
| #
7ab4f16f |
| 19-Apr-2024 |
Pavel Begunkov <[email protected]> |
net: extend ubuf_info callback to ops structure
We'll need to associate additional callbacks with ubuf_info, introduce a structure holding ubuf_info callbacks. Apart from a more smarter io_uring not
net: extend ubuf_info callback to ops structure
We'll need to associate additional callbacks with ubuf_info, introduce a structure holding ubuf_info callbacks. Apart from a more smarter io_uring notification management introduced in next patches, it can be used to generalise msg_zerocopy_put_abort() and also store ->sg_from_iter, which is currently passed in struct msghdr.
Reviewed-by: Jens Axboe <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: Pavel Begunkov <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/all/a62015541de49c0e2a8a0377a1d5d0a5aeb07016.1713369317.git.asml.silence@gmail.com Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
f8bbc07a |
| 15-Apr-2024 |
Lei Chen <[email protected]> |
tun: limit printing rate when illegal packet received by tun dev
vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet c
tun: limit printing rate when illegal packet received by tun dev
vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet contents. When console is enabled, it will costs much more cpu time to dump packet and soft lockup will be detected.
net_ratelimit mechanism can be used to limit the dumping rate.
PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980" #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e #3 [fffffe00003fced0] do_nmi at ffffffff8922660d #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663 [exception RIP: io_serial_in+20] RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002 RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000 RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0 RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020 R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #5 [ffffa655314979e8] io_serial_in at ffffffff89792594 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558 #10 [ffffa65531497ac8] console_unlock at ffffffff89316124 #11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07 #12 [ffffa65531497b68] printk at ffffffff89318306 #13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765 #14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun] #15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun] #16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net] #17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost] #18 [ffffa65531497f10] kthread at ffffffff892d2e72 #19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f
Fixes: ef3db4a59542 ("tun: avoid BUG, dump packet on GSO errors") Signed-off-by: Lei Chen <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8 |
|
| #
490a79fa |
| 06-Mar-2024 |
Eric Dumazet <[email protected]> |
net: introduce include/net/rps.h
Move RPS related structures and helpers from include/linux/netdevice.h and include/net/sock.h to a new include file.
Signed-off-by: Eric Dumazet <[email protected]
net: introduce include/net/rps.h
Move RPS related structures and helpers from include/linux/netdevice.h and include/net/sock.h to a new include file.
Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Soheil Hassas Yeganeh <[email protected]> Reviewed-by: David Ahern <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
4166204d |
| 04-Mar-2024 |
Breno Leitao <[email protected]> |
net: tap: Remove generic .ndo_get_stats64
Commit 3e2f544dd8a33 ("net: get stats64 if device if driver is configured") moved the callback to dev_get_tstats64() to net core, so, unless the driver is d
net: tap: Remove generic .ndo_get_stats64
Commit 3e2f544dd8a33 ("net: get stats64 if device if driver is configured") moved the callback to dev_get_tstats64() to net core, so, unless the driver is doing some custom stats collection, it does not need to set .ndo_get_stats64.
Since this driver is now relying in NETDEV_PCPU_STAT_TSTATS, then, it doesn't need to set the dev_get_tstats64() generic .ndo_get_stats64 function pointer.
Signed-off-by: Breno Leitao <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
| #
46f480ec |
| 04-Mar-2024 |
Breno Leitao <[email protected]> |
net: tuntap: Leverage core stats allocator
With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and convert veth & vrf"), stats allocation could be done on net core instead of in th
net: tuntap: Leverage core stats allocator
With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and convert veth & vrf"), stats allocation could be done on net core instead of in this driver.
With this new approach, the driver doesn't have to bother with error handling (allocation failure checking, making sure free happens in the right spot, etc). This is core responsibility now.
Remove the allocation in the tun/tap driver and leverage the network core allocation instead.
Signed-off-by: Breno Leitao <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|