|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6 |
|
| #
8ef890df |
| 07-Mar-2025 |
Jakub Kicinski <[email protected]> |
net: move misc netdev_lock flavors to a separate header
Move the more esoteric helpers for netdev instance lock to a dedicated header. This avoids growing netdevice.h to infinity and makes rebuildin
net: move misc netdev_lock flavors to a separate header
Move the more esoteric helpers for netdev instance lock to a dedicated header. This avoids growing netdevice.h to infinity and makes rebuilding the kernel much faster (after touching the header with the helpers).
The main netdev_lock() / netdev_unlock() functions are used in static inlines in netdevice.h and will probably be used most commonly, so keep them in netdevice.h.
Acked-by: Stanislav Fomichev <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc5 |
|
| #
0c493da8 |
| 28-Feb-2025 |
Nicolas Dichtel <[email protected]> |
net: rename netns_local to netns_immutable
The name 'netns_local' is confusing. A following commit will export it via netlink, so let's use a more explicit name.
Reported-by: Eric Dumazet <edumazet
net: rename netns_local to netns_immutable
The name 'netns_local' is confusing. A following commit will export it via netlink, so let's use a more explicit name.
Reported-by: Eric Dumazet <[email protected]> Suggested-by: Kuniyuki Iwashima <[email protected]> Signed-off-by: Nicolas Dichtel <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc4 |
|
| #
0e4427f8 |
| 20-Feb-2025 |
Ido Schimmel <[email protected]> |
net: loopback: Avoid sending IP packets without an Ethernet header
After commit 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev") IPv4 neighbors can be constructed on the blackhole net dev
net: loopback: Avoid sending IP packets without an Ethernet header
After commit 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev") IPv4 neighbors can be constructed on the blackhole net device, but they are constructed with an output function (neigh_direct_output()) that simply calls dev_queue_xmit(). The latter will transmit packets via 'skb->dev' which might not be the blackhole net device if dst_dev_put() switched 'dst->dev' to the blackhole net device while another CPU was using the dst entry in ip_output(), but after it already initialized 'skb->dev' from 'dst->dev'.
Specifically, the following can happen:
CPU1 CPU2
udp_sendmsg(sk1) udp_sendmsg(sk2) udp_send_skb() [...] ip_output() skb->dev = skb_dst(skb)->dev dst_dev_put() dst->dev = blackhole_netdev ip_finish_output2() resolves neigh on dst->dev neigh_output() neigh_direct_output() dev_queue_xmit()
This will result in IPv4 packets being sent without an Ethernet header via a valid net device:
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode listening on enp9s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes 22:07:02.329668 20:00:40:11:18:fb > 45:00:00:44:f4:94, ethertype Unknown (0x58c6), length 68: 0x0000: 8dda 74ca f1ae ca6c ca6c 0098 969c 0400 ..t....l.l...... 0x0010: 0000 4730 3f18 6800 0000 0000 0000 9971 ..G0?.h........q 0x0020: c4c9 9055 a157 0a70 9ead bf83 38ca ab38 ...U.W.p....8..8 0x0030: 8add ab96 e052 .....R
Fix by making sure that neighbors are constructed on top of the blackhole net device with an output function that simply consumes the packets, in a similar fashion to dst_discard_out() and blackhole_netdev_xmit().
Fixes: 8d7017fd621d ("blackhole_netdev: use blackhole_netdev to invalidate dst entries") Fixes: 22600596b675 ("ipv4: give an IPv4 dev to blackhole_netdev") Reported-by: Florian Meister <[email protected]> Closes: https://lore.kernel.org/netdev/[email protected]/ Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13 |
|
| #
2248c053 |
| 14-Jan-2025 |
Kuniyuki Iwashima <[email protected]> |
net: loopback: Hold rtnl_net_lock() in blackhole_netdev_init().
blackhole_netdev is the global device in init_net.
Let's hold rtnl_net_lock(&init_net) in blackhole_netdev_init().
While at it, the
net: loopback: Hold rtnl_net_lock() in blackhole_netdev_init().
blackhole_netdev is the global device in init_net.
Let's hold rtnl_net_lock(&init_net) in blackhole_netdev_init().
While at it, the unnecessary dev_net_set() call is removed, which is done in alloc_netdev_mqs().
Signed-off-by: Kuniyuki Iwashima <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6 |
|
| #
05c1280a |
| 29-Aug-2024 |
Alexander Lobakin <[email protected]> |
netdev_features: convert NETIF_F_NETNS_LOCAL to dev->netns_local
"Interface can't change network namespaces" is rather an attribute, not a feature, and it can't be changed via Ethtool. Make it a "co
netdev_features: convert NETIF_F_NETNS_LOCAL to dev->netns_local
"Interface can't change network namespaces" is rather an attribute, not a feature, and it can't be changed via Ethtool. Make it a "cold" private flag instead of a netdev_feature and free one more bit.
Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
| #
00d066a4 |
| 29-Aug-2024 |
Alexander Lobakin <[email protected]> |
netdev_features: convert NETIF_F_LLTX to dev->lltx
NETIF_F_LLTX can't be changed via Ethtool and is not a feature, rather an attribute, very similar to IFF_NO_QUEUE (and hot). Free one netdev_featur
netdev_features: convert NETIF_F_LLTX to dev->lltx
NETIF_F_LLTX can't be changed via Ethtool and is not a feature, rather an attribute, very similar to IFF_NO_QUEUE (and hot). Free one netdev_features_t bit and make it a "hot" private flag.
Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
show more ...
|
|
Revision tags: v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7 |
|
| #
3b5933e9 |
| 29-Apr-2024 |
Breno Leitao <[email protected]> |
net: loopback: Do not allocate lstats explicitly
With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and convert veth & vrf"), stats allocation could be done on net core instead of
net: loopback: Do not allocate lstats explicitly
With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and convert veth & vrf"), stats allocation could be done on net core instead of in this driver.
With this new approach, the driver doesn't have to bother with error handling (allocation failure checking, making sure free happens in the right spot, etc). This is core responsibility now.
Remove the allocation in the loopback driver and leverage the network core allocation instead.
Signed-off-by: Breno Leitao <[email protected]> Reviewed-by: Sabrina Dubroca <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5 |
|
| #
0bef5120 |
| 12-Feb-2024 |
Eric Dumazet <[email protected]> |
net: add netdev_lockdep_set_classes() to virtual drivers
Based on a syzbot report, it appears many virtual drivers do not yet use netdev_lockdep_set_classes(), triggerring lockdep false positives.
net: add netdev_lockdep_set_classes() to virtual drivers
Based on a syzbot report, it appears many virtual drivers do not yet use netdev_lockdep_set_classes(), triggerring lockdep false positives.
WARNING: possible recursive locking detected 6.8.0-rc4-next-20240212-syzkaller #0 Not tainted
syz-executor.0/19016 is trying to acquire lock: ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline] ffff8880162cb298 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340
but task is already holding lock: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline] ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340
other info that might help us debug this: Possible unsafe locking scenario:
CPU0 lock(_xmit_ETHER#2); lock(_xmit_ETHER#2);
*** DEADLOCK ***
May be due to missing lock nesting notation
9 locks held by syz-executor.0/19016: #0: ffffffff8f385208 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline] #0: ffffffff8f385208 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x82c/0x1040 net/core/rtnetlink.c:6603 #1: ffffc90000a08c00 ((&in_dev->mr_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0xc0/0x600 kernel/time/timer.c:1697 #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:298 [inline] #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:750 [inline] #2: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1360 net/ipv4/ip_output.c:228 #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: local_bh_disable include/linux/bottom_half.h:20 [inline] #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: rcu_read_lock_bh include/linux/rcupdate.h:802 [inline] #3: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x2c4/0x3b10 net/core/dev.c:4284 #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: spin_trylock include/linux/spinlock.h:361 [inline] #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: qdisc_run_begin include/net/sch_generic.h:195 [inline] #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_xmit_skb net/core/dev.c:3771 [inline] #4: ffff8880416e3258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_queue_xmit+0x1262/0x3b10 net/core/dev.c:4325 #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: __netif_tx_lock include/linux/netdevice.h:4452 [inline] #5: ffff8880223db4d8 (_xmit_ETHER#2){+.-.}-{2:2}, at: sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340 #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:298 [inline] #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:750 [inline] #6: ffffffff8e131520 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1360 net/ipv4/ip_output.c:228 #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: local_bh_disable include/linux/bottom_half.h:20 [inline] #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: rcu_read_lock_bh include/linux/rcupdate.h:802 [inline] #7: ffffffff8e131580 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x2c4/0x3b10 net/core/dev.c:4284 #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: spin_trylock include/linux/spinlock.h:361 [inline] #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: qdisc_run_begin include/net/sch_generic.h:195 [inline] #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_xmit_skb net/core/dev.c:3771 [inline] #8: ffff888014d9d258 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{2:2}, at: __dev_queue_xmit+0x1262/0x3b10 net/core/dev.c:4325
stack backtrace: CPU: 1 PID: 19016 Comm: syz-executor.0 Not tainted 6.8.0-rc4-next-20240212-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 check_deadlock kernel/locking/lockdep.c:3062 [inline] validate_chain+0x15c1/0x58e0 kernel/locking/lockdep.c:3856 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] __netif_tx_lock include/linux/netdevice.h:4452 [inline] sch_direct_xmit+0x1c4/0x5f0 net/sched/sch_generic.c:340 __dev_xmit_skb net/core/dev.c:3784 [inline] __dev_queue_xmit+0x1912/0x3b10 net/core/dev.c:4325 neigh_output include/net/neighbour.h:542 [inline] ip_finish_output2+0xe66/0x1360 net/ipv4/ip_output.c:235 iptunnel_xmit+0x540/0x9b0 net/ipv4/ip_tunnel_core.c:82 ip_tunnel_xmit+0x20ee/0x2960 net/ipv4/ip_tunnel.c:831 erspan_xmit+0x9de/0x1460 net/ipv4/ip_gre.c:720 __netdev_start_xmit include/linux/netdevice.h:4989 [inline] netdev_start_xmit include/linux/netdevice.h:5003 [inline] xmit_one net/core/dev.c:3555 [inline] dev_hard_start_xmit+0x242/0x770 net/core/dev.c:3571 sch_direct_xmit+0x2b6/0x5f0 net/sched/sch_generic.c:342 __dev_xmit_skb net/core/dev.c:3784 [inline] __dev_queue_xmit+0x1912/0x3b10 net/core/dev.c:4325 neigh_output include/net/neighbour.h:542 [inline] ip_finish_output2+0xe66/0x1360 net/ipv4/ip_output.c:235 igmpv3_send_cr net/ipv4/igmp.c:723 [inline] igmp_ifc_timer_expire+0xb71/0xd90 net/ipv4/igmp.c:813 call_timer_fn+0x17e/0x600 kernel/time/timer.c:1700 expire_timers kernel/time/timer.c:1751 [inline] __run_timers+0x621/0x830 kernel/time/timer.c:2038 run_timer_softirq+0x67/0xf0 kernel/time/timer.c:2051 __do_softirq+0x2bc/0x943 kernel/softirq.c:554 invoke_softirq kernel/softirq.c:428 [inline] __irq_exit_rcu+0xf2/0x1c0 kernel/softirq.c:633 irq_exit_rcu+0x9/0x30 kernel/softirq.c:645 instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1076 [inline] sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1076 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702 RIP: 0010:resched_offsets_ok kernel/sched/core.c:10127 [inline] RIP: 0010:__might_resched+0x16f/0x780 kernel/sched/core.c:10142 Code: 00 4c 89 e8 48 c1 e8 03 48 ba 00 00 00 00 00 fc ff df 48 89 44 24 38 0f b6 04 10 84 c0 0f 85 87 04 00 00 41 8b 45 00 c1 e0 08 <01> d8 44 39 e0 0f 85 d6 00 00 00 44 89 64 24 1c 48 8d bc 24 a0 00 RSP: 0018:ffffc9000ee069e0 EFLAGS: 00000246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8880296a9e00 RDX: dffffc0000000000 RSI: ffff8880296a9e00 RDI: ffffffff8bfe8fa0 RBP: ffffc9000ee06b00 R08: ffffffff82326877 R09: 1ffff11002b5ad1b R10: dffffc0000000000 R11: ffffed1002b5ad1c R12: 0000000000000000 R13: ffff8880296aa23c R14: 000000000000062a R15: 1ffff92001dc0d44 down_write+0x19/0x50 kernel/locking/rwsem.c:1578 kernfs_activate fs/kernfs/dir.c:1403 [inline] kernfs_add_one+0x4af/0x8b0 fs/kernfs/dir.c:819 __kernfs_create_file+0x22e/0x2e0 fs/kernfs/file.c:1056 sysfs_add_file_mode_ns+0x24a/0x310 fs/sysfs/file.c:307 create_files fs/sysfs/group.c:64 [inline] internal_create_group+0x4f4/0xf20 fs/sysfs/group.c:152 internal_create_groups fs/sysfs/group.c:192 [inline] sysfs_create_groups+0x56/0x120 fs/sysfs/group.c:218 create_dir lib/kobject.c:78 [inline] kobject_add_internal+0x472/0x8d0 lib/kobject.c:240 kobject_add_varg lib/kobject.c:374 [inline] kobject_init_and_add+0x124/0x190 lib/kobject.c:457 netdev_queue_add_kobject net/core/net-sysfs.c:1706 [inline] netdev_queue_update_kobjects+0x1f3/0x480 net/core/net-sysfs.c:1758 register_queue_kobjects net/core/net-sysfs.c:1819 [inline] netdev_register_kobject+0x265/0x310 net/core/net-sysfs.c:2059 register_netdevice+0x1191/0x19c0 net/core/dev.c:10298 bond_newlink+0x3b/0x90 drivers/net/bonding/bond_netlink.c:576 rtnl_newlink_create net/core/rtnetlink.c:3506 [inline] __rtnl_newlink net/core/rtnetlink.c:3726 [inline] rtnl_newlink+0x158f/0x20a0 net/core/rtnetlink.c:3739 rtnetlink_rcv_msg+0x885/0x1040 net/core/rtnetlink.c:6606 netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2543 netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline] netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1367 netlink_sendmsg+0xa3c/0xd70 net/netlink/af_netlink.c:1908 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x221/0x270 net/socket.c:745 __sys_sendto+0x3a4/0x4f0 net/socket.c:2191 __do_sys_sendto net/socket.c:2203 [inline] __se_sys_sendto net/socket.c:2199 [inline] __x64_sys_sendto+0xde/0x100 net/socket.c:2199 do_syscall_64+0xfb/0x240 entry_SYSCALL_64_after_hwframe+0x6d/0x75 RIP: 0033:0x7fc3fa87fa9c
Reported-by: syzbot <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7 |
|
| #
31d929de |
| 23-Nov-2022 |
Rasmus Villemoes <[email protected]> |
net: loopback: use NET_NAME_PREDICTABLE for name_assign_type
When the name_assign_type attribute was introduced (commit 685343fc3ba6, "net: add name_assign_type netdev attribute"), the loopback devi
net: loopback: use NET_NAME_PREDICTABLE for name_assign_type
When the name_assign_type attribute was introduced (commit 685343fc3ba6, "net: add name_assign_type netdev attribute"), the loopback device was explicitly mentioned as one which would make use of NET_NAME_PREDICTABLE:
The name_assign_type attribute gives hints where the interface name of a given net-device comes from. These values are currently defined: ... NET_NAME_PREDICTABLE: The ifname has been assigned by the kernel in a predictable way that is guaranteed to avoid reuse and always be the same for a given device. Examples include statically created devices like the loopback device [...]
Switch to that so that reading /sys/class/net/lo/name_assign_type produces something sensible instead of returning -EINVAL.
Signed-off-by: Rasmus Villemoes <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3 |
|
| #
068c38ad |
| 26-Oct-2022 |
Thomas Gleixner <[email protected]> |
net: Remove the obsolte u64_stats_fetch_*_irq() users (drivers).
Now that the 32bit UP oddity is gone and 32bit uses always a sequence count, there is no need for the fetch_irq() variants anymore.
net: Remove the obsolte u64_stats_fetch_*_irq() users (drivers).
Now that the 32bit UP oddity is gone and 32bit uses always a sequence count, there is no need for the fetch_irq() variants anymore.
Convert to the regular interface.
Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7 |
|
| #
d6f938ce |
| 13-May-2022 |
Eric Dumazet <[email protected]> |
net: loopback: enable BIG TCP packets
Set the driver limit to GSO_MAX_SIZE (512 KB).
This allows the admin/user to set a GSO limit up to this value.
Tested:
ip link set dev lo gso_max_size 200000
net: loopback: enable BIG TCP packets
Set the driver limit to GSO_MAX_SIZE (512 KB).
This allows the admin/user to set a GSO limit up to this value.
Tested:
ip link set dev lo gso_max_size 200000 netperf -H ::1 -t TCP_RR -l 100 -- -r 80000,80000 &
tcpdump shows :
18:28:42.962116 IP6 ::1 > ::1: HBH 40051 > 63780: Flags [P.], seq 3626480001:3626560001, ack 3626560001, win 17743, options [nop,nop,TS val 3771179265 ecr 3771179265], length 80000 18:28:42.962138 IP6 ::1.63780 > ::1.40051: Flags [.], ack 3626560001, win 17743, options [nop,nop,TS val 3771179265 ecr 3771179265], length 0 18:28:42.962152 IP6 ::1 > ::1: HBH 63780 > 40051: Flags [P.], seq 3626560001:3626640001, ack 3626560001, win 17743, options [nop,nop,TS val 3771179265 ecr 3771179265], length 80000 18:28:42.962157 IP6 ::1.40051 > ::1.63780: Flags [.], ack 3626640001, win 17743, options [nop,nop,TS val 3771179265 ecr 3771179265], length 0 18:28:42.962180 IP6 ::1 > ::1: HBH 40051 > 63780: Flags [P.], seq 3626560001:3626640001, ack 3626640001, win 17743, options [nop,nop,TS val 3771179265 ecr 3771179265], length 80000 18:28:42.962214 IP6 ::1.63780 > ::1.40051: Flags [.], ack 3626640001, win 17743, options [nop,nop,TS val 3771179266 ecr 3771179265], length 0 18:28:42.962228 IP6 ::1 > ::1: HBH 63780 > 40051: Flags [P.], seq 3626640001:3626720001, ack 3626640001, win 17743, options [nop,nop,TS val 3771179266 ecr 3771179265], length 80000 18:28:42.962233 IP6 ::1.40051 > ::1.63780: Flags [.], ack 3626720001, win 17743, options [nop,nop,TS val 3771179266 ecr 3771179266], length 0
Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Alexander Duyck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7 |
|
| #
de799101 |
| 02-Mar-2022 |
Martin KaFai Lau <[email protected]> |
net: Add skb_clear_tstamp() to keep the mono delivery_time
Right now, skb->tstamp is reset to 0 whenever the skb is forwarded.
If skb->tstamp has the mono delivery_time, clearing it can hurt the pe
net: Add skb_clear_tstamp() to keep the mono delivery_time
Right now, skb->tstamp is reset to 0 whenever the skb is forwarded.
If skb->tstamp has the mono delivery_time, clearing it can hurt the performance when it finally transmits out to fq@phy-dev.
The earlier patch added a skb->mono_delivery_time bit to flag the skb->tstamp carrying the mono delivery_time.
This patch adds skb_clear_tstamp() helper which keeps the mono delivery_time and clears everything else.
The delivery_time clearing will be postponed until the stack knows the skb will be delivered locally. It will be done in a latter patch.
Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.17-rc6, v5.17-rc5, v5.17-rc4 |
|
| #
baebdf48 |
| 11-Feb-2022 |
Sebastian Andrzej Siewior <[email protected]> |
net: dev: Makes sure netif_rx() can be invoked in any context.
Dave suggested a while ago (eleven years by now) "Let's make netif_rx() work in all contexts and get rid of netif_rx_ni()". Eric agreed
net: dev: Makes sure netif_rx() can be invoked in any context.
Dave suggested a while ago (eleven years by now) "Let's make netif_rx() work in all contexts and get rid of netif_rx_ni()". Eric agreed and pointed out that modern devices should use netif_receive_skb() to avoid the overhead. In the meantime someone added another variant, netif_rx_any_context(), which behaves as suggested.
netif_rx() must be invoked with disabled bottom halves to ensure that pending softirqs, which were raised within the function, are handled. netif_rx_ni() can be invoked only from process context (bottom halves must be enabled) because the function handles pending softirqs without checking if bottom halves were disabled or not. netif_rx_any_context() invokes on the former functions by checking in_interrupts().
netif_rx() could be taught to handle both cases (disabled and enabled bottom halves) by simply disabling bottom halves while invoking netif_rx_internal(). The local_bh_enable() invocation will then invoke pending softirqs only if the BH-disable counter drops to zero.
Eric is concerned about the overhead of BH-disable+enable especially in regard to the loopback driver. As critical as this driver is, it will receive a shortcut to avoid the additional overhead which is not needed.
Add a local_bh_disable() section in netif_rx() to ensure softirqs are handled if needed. Provide __netif_rx() which does not disable BH and has a lockdep assert to ensure that interrupts are disabled. Use this shortcut in the loopback driver and in drivers/net/*.c. Make netif_rx_ni() and netif_rx_any_context() invoke netif_rx() so they can be removed once they are no more users left.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Reviewed-by: Toke Høiland-Jørgensen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8 |
|
| #
b6459415 |
| 29-Dec-2021 |
Jakub Kicinski <[email protected]> |
net: Don't include filter.h from net/sock.h
sock.h is pretty heavily used (5k objects rebuilt on x86 after it's touched). We can drop the include of filter.h from it and add a forward declaration of
net: Don't include filter.h from net/sock.h
sock.h is pretty heavily used (5k objects rebuilt on x86 after it's touched). We can drop the include of filter.h from it and add a forward declaration of struct sk_filter instead. This decreases the number of rebuilt objects when bpf.h is touched from ~5k to ~1k.
There's a lot of missing includes this was masking. Primarily in networking tho, this time.
Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Marc Kleine-Budde <[email protected]> Acked-by: Florian Fainelli <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Acked-by: Stefano Garzarella <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
show more ...
|
|
Revision tags: v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11 |
|
| #
1edb5cbf |
| 09-Feb-2021 |
Petr Machata <[email protected]> |
Revert "net-loopback: set lo dev initial state to UP"
In commit c9dca822c729 ("net-loopback: set lo dev initial state to UP"), linux started automatically bringing up the loopback device of a newly
Revert "net-loopback: set lo dev initial state to UP"
In commit c9dca822c729 ("net-loopback: set lo dev initial state to UP"), linux started automatically bringing up the loopback device of a newly created namespace. However, an existing user script might reasonably have the following stanza when creating a new namespace -- and in fact at least tools/testing/selftests/net/fib_nexthops.sh in Linux's very own testsuite does:
# set -e # ip netns add foo # ip -netns foo addr add 127.0.0.1/8 dev lo # ip -netns foo link set lo up # set +e
This will now fail, because the kernel reasonably rejects "ip addr add" of a duplicate address. The described change of behavior therefore constitutes a breakage. Revert it.
Fixes: c9dca822c729 ("net-loopback: set lo dev initial state to UP") Signed-off-by: Petr Machata <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.11-rc7 |
|
| #
c9dca822 |
| 01-Feb-2021 |
Jian Yang <[email protected]> |
net-loopback: set lo dev initial state to UP
Traditionally loopback devices come up with initial state as DOWN for any new network-namespace. This would mean that anyone needing this device would ha
net-loopback: set lo dev initial state to UP
Traditionally loopback devices come up with initial state as DOWN for any new network-namespace. This would mean that anyone needing this device would have to bring this UP by issuing something like 'ip link set lo up'. This can be avoided if the initial state is set as UP.
Signed-off-by: Mahesh Bandewar <[email protected]> Signed-off-by: Jian Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7 |
|
| #
fd2f4737 |
| 08-Nov-2019 |
Eric Dumazet <[email protected]> |
net: use u64_stats_t in struct pcpu_lstats
In order to fix the data-race found by KCSAN, we can use the new u64_stats_t type and its accessors instead of plain u64 fields. This will still generate o
net: use u64_stats_t in struct pcpu_lstats
In order to fix the data-race found by KCSAN, we can use the new u64_stats_t type and its accessors instead of plain u64 fields. This will still generate optimal code for both 32 and 64 bit platforms.
Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
dd5382a0 |
| 08-Nov-2019 |
Eric Dumazet <[email protected]> |
net: provide dev_lstats_add() helper
Many network drivers need it and hand-coded the same function.
In order to ease u64_stats_t adoption, it is time to factorize.
Signed-off-by: Eric Dumazet <edu
net: provide dev_lstats_add() helper
Many network drivers need it and hand-coded the same function.
In order to ease u64_stats_t adoption, it is time to factorize.
Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
de7d5084 |
| 08-Nov-2019 |
Eric Dumazet <[email protected]> |
net: provide dev_lstats_read() helper
Many network drivers use hand-coded implementation of the same thing, let's factorize things so that u64_stats_t adoption is done once.
Signed-off-by: Eric Dum
net: provide dev_lstats_read() helper
Many network drivers use hand-coded implementation of the same thing, let's factorize things so that u64_stats_t adoption is done once.
Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2 |
|
| #
d62962b3 |
| 03-Jul-2019 |
Mahesh Bandewar <[email protected]> |
loopback: fix lockdep splat
dev_init_scheduler() and dev_activate() expect the caller to hold RTNL. Since we don't want blackhole device to be initialized per ns, we are initializing at init.
[
loopback: fix lockdep splat
dev_init_scheduler() and dev_activate() expect the caller to hold RTNL. Since we don't want blackhole device to be initialized per ns, we are initializing at init.
[ 3.855027] Call Trace: [ 3.855034] dump_stack+0x67/0x95 [ 3.855037] lockdep_rcu_suspicious+0xd5/0x110 [ 3.855044] dev_init_scheduler+0xe3/0x120 [ 3.855048] ? net_olddevs_init+0x60/0x60 [ 3.855050] blackhole_netdev_init+0x45/0x6e [ 3.855052] do_one_initcall+0x6c/0x2fa [ 3.855058] ? rcu_read_lock_sched_held+0x8c/0xa0 [ 3.855066] kernel_init_freeable+0x1e5/0x288 [ 3.855071] ? rest_init+0x260/0x260 [ 3.855074] kernel_init+0xf/0x180 [ 3.855076] ? rest_init+0x260/0x260 [ 3.855078] ret_from_fork+0x24/0x30
Fixes: 4de83b88c66 ("loopback: create blackhole net device similar to loopack.") Reported-by: Geert Uytterhoeven <[email protected]> Cc: Eric Dumazet <[email protected]> Signed-off-by: Mahesh Bandewar <[email protected]> Tested-by: Geert Uytterhoeven <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
4de83b88 |
| 01-Jul-2019 |
Mahesh Bandewar <[email protected]> |
loopback: create blackhole net device similar to loopack.
Create a blackhole net device that can be used for "dead" dst entries instead of loopback device. This blackhole device differs from loopbac
loopback: create blackhole net device similar to loopack.
Create a blackhole net device that can be used for "dead" dst entries instead of loopback device. This blackhole device differs from loopback in few aspects: (a) It's not per-ns. (b) MTU on this device is ETH_MIN_MTU (c) The xmit function is essentially kfree_skb(). and (d) since it's not registered it won't have ifindex.
Lower MTU effectively make the device not pass the MTU check during the route check when a dst associated with the skb is dead.
Signed-off-by: Mahesh Bandewar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3 |
|
| #
2874c5fd |
| 27-May-2019 |
Thomas Gleixner <[email protected]> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify it under the terms of th
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 3029 file(s).
Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Allison Randal <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
show more ...
|
|
Revision tags: v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5 |
|
| #
af730342 |
| 12-Apr-2019 |
Julian Wiedmann <[email protected]> |
net: loopback: use generic helper to report timestamping info
For reporting the common set of SW timestamping capabilities, use ethtool_op_get_ts_info() instead of re-implementing it.
Signed-off-by
net: loopback: use generic helper to report timestamping info
For reporting the common set of SW timestamping capabilities, use ethtool_op_get_ts_info() instead of re-implementing it.
Signed-off-by: Julian Wiedmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.1-rc4, v5.1-rc3, v5.1-rc2, v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1, v4.19 |
|
| #
4c16128b |
| 20-Oct-2018 |
Eric Dumazet <[email protected]> |
net: loopback: clear skb->tstamp before netif_rx()
At least UDP / TCP stacks can now cook skbs with a tstamp using MONOTONIC base (or arbitrary values with SCM_TXTIME)
Since loopback driver does no
net: loopback: clear skb->tstamp before netif_rx()
At least UDP / TCP stacks can now cook skbs with a tstamp using MONOTONIC base (or arbitrary values with SCM_TXTIME)
Since loopback driver does not call (directly or indirectly) skb_scrub_packet(), we need to clear skb->tstamp so that net_timestamp_check() can eventually resample the time, using ktime_get_real().
Fixes: 80b14dee2bea ("net: Add a new socket option for a future transmit time.") Fixes: fb420d5d91c1 ("tcp/fq: move back to CLOCK_MONOTONIC") Signed-off-by: Eric Dumazet <[email protected]> Cc: Willem de Bruijn <[email protected]> Cc: Soheil Hassas Yeganeh <[email protected]> Acked-by: Soheil Hassas Yeganeh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v4.19-rc8, v4.19-rc7, v4.19-rc6, v4.19-rc5, v4.19-rc4 |
|
| #
52bb6677 |
| 14-Sep-2018 |
Li RongQing <[email protected]> |
net: move definition of pcpu_lstats to header file
pcpu_lstats is defined in several files, so unify them as one and move to header file
Signed-off-by: Zhang Yu <[email protected]> Signed-off-by:
net: move definition of pcpu_lstats to header file
pcpu_lstats is defined in several files, so unify them as one and move to header file
Signed-off-by: Zhang Yu <[email protected]> Signed-off-by: Li RongQing <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|