History log of /linux-6.15/net/core/drop_monitor.c (Results 1 – 25 of 104)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1
# 8fa7292f 05-Apr-2025 Thomas Gleixner <[email protected]>

treewide: Switch/rename to timer_delete[_sync]()

timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree
over and remove the historical wrapper inlines.

Conversion was done with c

treewide: Switch/rename to timer_delete[_sync]()

timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree
over and remove the historical wrapper inlines.

Conversion was done with coccinelle plus manual fixups where necessary.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

show more ...


Revision tags: v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3
# 07b598c0 13-Feb-2025 Gavrilov Ilia <[email protected]>

drop_monitor: fix incorrect initialization order

Syzkaller reports the following bug:

BUG: spinlock bad magic on CPU#1, syz-executor.0/7995
lock: 0xffff88805303f3e0, .magic: 00000000, .owner: <non

drop_monitor: fix incorrect initialization order

Syzkaller reports the following bug:

BUG: spinlock bad magic on CPU#1, syz-executor.0/7995
lock: 0xffff88805303f3e0, .magic: 00000000, .owner: <none>/-1, .owner_cpu: 0
CPU: 1 PID: 7995 Comm: syz-executor.0 Tainted: G E 5.10.209+ #1
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x119/0x179 lib/dump_stack.c:118
debug_spin_lock_before kernel/locking/spinlock_debug.c:83 [inline]
do_raw_spin_lock+0x1f6/0x270 kernel/locking/spinlock_debug.c:112
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:117 [inline]
_raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
reset_per_cpu_data+0xe6/0x240 [drop_monitor]
net_dm_cmd_trace+0x43d/0x17a0 [drop_monitor]
genl_family_rcv_msg_doit+0x22f/0x330 net/netlink/genetlink.c:739
genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
genl_rcv_msg+0x341/0x5a0 net/netlink/genetlink.c:800
netlink_rcv_skb+0x14d/0x440 net/netlink/af_netlink.c:2497
genl_rcv+0x29/0x40 net/netlink/genetlink.c:811
netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline]
netlink_unicast+0x54b/0x800 net/netlink/af_netlink.c:1348
netlink_sendmsg+0x914/0xe00 net/netlink/af_netlink.c:1916
sock_sendmsg_nosec net/socket.c:651 [inline]
__sock_sendmsg+0x157/0x190 net/socket.c:663
____sys_sendmsg+0x712/0x870 net/socket.c:2378
___sys_sendmsg+0xf8/0x170 net/socket.c:2432
__sys_sendmsg+0xea/0x1b0 net/socket.c:2461
do_syscall_64+0x30/0x40 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x62/0xc7
RIP: 0033:0x7f3f9815aee9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f3f972bf0c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f3f9826d050 RCX: 00007f3f9815aee9
RDX: 0000000020000000 RSI: 0000000020001300 RDI: 0000000000000007
RBP: 00007f3f981b63bd R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000006e R14: 00007f3f9826d050 R15: 00007ffe01ee6768

If drop_monitor is built as a kernel module, syzkaller may have time
to send a netlink NET_DM_CMD_START message during the module loading.
This will call the net_dm_monitor_start() function that uses
a spinlock that has not yet been initialized.

To fix this, let's place resource initialization above the registration
of a generic netlink family.

Found by InfoTeCS on behalf of Linux Verification Center
(linuxtesting.org) with Syzkaller.

Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol")
Cc: [email protected]
Signed-off-by: Ilia Gavrilov <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

show more ...


Revision tags: v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2
# 5f60d5f6 01-Oct-2024 Al Viro <[email protected]>

move asm/unaligned.h to linux/unaligned.h

asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-

move asm/unaligned.h to linux/unaligned.h

asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.

auto-generated by the following:

for i in `git grep -l -w asm/unaligned.h`; do
sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h

show more ...


Revision tags: v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5
# c53795d4 17-Jun-2024 Yan Zhai <[email protected]>

net: add rx_sk to trace_kfree_skb

skb does not include enough information to find out receiving
sockets/services and netns/containers on packet drops. In theory
skb->dev tells about netns, but it ca

net: add rx_sk to trace_kfree_skb

skb does not include enough information to find out receiving
sockets/services and netns/containers on packet drops. In theory
skb->dev tells about netns, but it can get cleared/reused, e.g. by TCP
stack for OOO packet lookup. Similarly, skb->sk often identifies a local
sender, and tells nothing about a receiver.

Allow passing an extra receiving socket to the tracepoint to improve
the visibility on receiving drops.

Signed-off-by: Yan Zhai <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

show more ...


Revision tags: v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4
# f1e197a6 11-Apr-2024 Wander Lairson Costa <[email protected]>

drop_monitor: replace spin_lock by raw_spin_lock

trace_drop_common() is called with preemption disabled, and it acquires
a spin_lock. This is problematic for RT kernels because spin_locks are
sleepi

drop_monitor: replace spin_lock by raw_spin_lock

trace_drop_common() is called with preemption disabled, and it acquires
a spin_lock. This is problematic for RT kernels because spin_locks are
sleeping locks in this configuration, which causes the following splat:

BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 449, name: rcuc/47
preempt_count: 1, expected: 0
RCU nest depth: 2, expected: 2
5 locks held by rcuc/47/449:
#0: ff1100086ec30a60 ((softirq_ctrl.lock)){+.+.}-{2:2}, at: __local_bh_disable_ip+0x105/0x210
#1: ffffffffb394a280 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0xbf/0x130
#2: ffffffffb394a280 (rcu_read_lock){....}-{1:2}, at: __local_bh_disable_ip+0x11c/0x210
#3: ffffffffb394a160 (rcu_callback){....}-{0:0}, at: rcu_do_batch+0x360/0xc70
#4: ff1100086ee07520 (&data->lock){+.+.}-{2:2}, at: trace_drop_common.constprop.0+0xb5/0x290
irq event stamp: 139909
hardirqs last enabled at (139908): [<ffffffffb1df2b33>] _raw_spin_unlock_irqrestore+0x63/0x80
hardirqs last disabled at (139909): [<ffffffffb19bd03d>] trace_drop_common.constprop.0+0x26d/0x290
softirqs last enabled at (139892): [<ffffffffb07a1083>] __local_bh_enable_ip+0x103/0x170
softirqs last disabled at (139898): [<ffffffffb0909b33>] rcu_cpu_kthread+0x93/0x1f0
Preemption disabled at:
[<ffffffffb1de786b>] rt_mutex_slowunlock+0xab/0x2e0
CPU: 47 PID: 449 Comm: rcuc/47 Not tainted 6.9.0-rc2-rt1+ #7
Hardware name: Dell Inc. PowerEdge R650/0Y2G81, BIOS 1.6.5 04/15/2022
Call Trace:
<TASK>
dump_stack_lvl+0x8c/0xd0
dump_stack+0x14/0x20
__might_resched+0x21e/0x2f0
rt_spin_lock+0x5e/0x130
? trace_drop_common.constprop.0+0xb5/0x290
? skb_queue_purge_reason.part.0+0x1bf/0x230
trace_drop_common.constprop.0+0xb5/0x290
? preempt_count_sub+0x1c/0xd0
? _raw_spin_unlock_irqrestore+0x4a/0x80
? __pfx_trace_drop_common.constprop.0+0x10/0x10
? rt_mutex_slowunlock+0x26a/0x2e0
? skb_queue_purge_reason.part.0+0x1bf/0x230
? __pfx_rt_mutex_slowunlock+0x10/0x10
? skb_queue_purge_reason.part.0+0x1bf/0x230
trace_kfree_skb_hit+0x15/0x20
trace_kfree_skb+0xe9/0x150
kfree_skb_reason+0x7b/0x110
skb_queue_purge_reason.part.0+0x1bf/0x230
? __pfx_skb_queue_purge_reason.part.0+0x10/0x10
? mark_lock.part.0+0x8a/0x520
...

trace_drop_common() also disables interrupts, but this is a minor issue
because we could easily replace it with a local_lock.

Replace the spin_lock with raw_spin_lock to avoid sleeping in atomic
context.

Signed-off-by: Wander Lairson Costa <[email protected]>
Reported-by: Hu Chunyu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

show more ...


Revision tags: v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7
# cd4d7263 20-Dec-2023 Ido Schimmel <[email protected]>

genetlink: Use internal flags for multicast groups

As explained in commit e03781879a0d ("drop_monitor: Require
'CAP_SYS_ADMIN' when joining "events" group"), the "flags" field in the
multicast group

genetlink: Use internal flags for multicast groups

As explained in commit e03781879a0d ("drop_monitor: Require
'CAP_SYS_ADMIN' when joining "events" group"), the "flags" field in the
multicast group structure reuses uAPI flags despite the field not being
exposed to user space. This makes it impossible to extend its use
without adding new uAPI flags, which is inappropriate for internal
kernel checks.

Solve this by adding internal flags (i.e., "GENL_MCAST_*") and convert
the existing users to use them instead of the uAPI flags.

Tested using the reproducers in commit 44ec98ea5ea9 ("psample: Require
'CAP_NET_ADMIN' when joining "packets" group") and commit e03781879a0d
("drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group").

No functional changes intended.

Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Mat Martineau <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

show more ...


Revision tags: v6.7-rc6, v6.7-rc5
# e0378187 06-Dec-2023 Ido Schimmel <[email protected]>

drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group

The "NET_DM" generic netlink family notifies drop locations over the
"events" multicast group. This is problematic since by default

drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" group

The "NET_DM" generic netlink family notifies drop locations over the
"events" multicast group. This is problematic since by default generic
netlink allows non-root users to listen to these notifications.

Fix by adding a new field to the generic netlink multicast group
structure that when set prevents non-root users or root without the
'CAP_SYS_ADMIN' capability (in the user namespace owning the network
namespace) from joining the group. Set this field for the "events"
group. Use 'CAP_SYS_ADMIN' rather than 'CAP_NET_ADMIN' because of the
nature of the information that is shared over this group.

Note that the capability check in this case will always be performed
against the initial user namespace since the family is not netns aware
and only operates in the initial network namespace.

A new field is added to the structure rather than using the "flags"
field because the existing field uses uAPI flags and it is inappropriate
to add a new uAPI flag for an internal kernel check. In net-next we can
rework the "flags" field to use internal flags and fold the new field
into it. But for now, in order to reduce the amount of changes, add a
new field.

Since the information can only be consumed by root, mark the control
plane operations that start and stop the tracing as root-only using the
'GENL_ADMIN_PERM' flag.

Tested using [1].

Before:

# capsh -- -c ./dm_repo
# capsh --drop=cap_sys_admin -- -c ./dm_repo

After:

# capsh -- -c ./dm_repo
# capsh --drop=cap_sys_admin -- -c ./dm_repo
Failed to join "events" multicast group

[1]
$ cat dm.c
#include <stdio.h>
#include <netlink/genl/ctrl.h>
#include <netlink/genl/genl.h>
#include <netlink/socket.h>

int main(int argc, char **argv)
{
struct nl_sock *sk;
int grp, err;

sk = nl_socket_alloc();
if (!sk) {
fprintf(stderr, "Failed to allocate socket\n");
return -1;
}

err = genl_connect(sk);
if (err) {
fprintf(stderr, "Failed to connect socket\n");
return err;
}

grp = genl_ctrl_resolve_grp(sk, "NET_DM", "events");
if (grp < 0) {
fprintf(stderr,
"Failed to resolve \"events\" multicast group\n");
return grp;
}

err = nl_socket_add_memberships(sk, grp, NFNLGRP_NONE);
if (err) {
fprintf(stderr, "Failed to join \"events\" multicast group\n");
return err;
}

return 0;
}
$ gcc -I/usr/include/libnl3 -lnl-3 -lnl-genl-3 -o dm_repo dm.c

Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol")
Reported-by: "The UK's National Cyber Security Centre (NCSC)" <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

show more ...


Revision tags: v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3
# 071c0fc6 19-Apr-2023 Johannes Berg <[email protected]>

net: extend drop reasons for multiple subsystems

Extend drop reasons to make them usable by subsystems
other than core by reserving the high 16 bits for a
new subsystem ID, of which 0 of course is u

net: extend drop reasons for multiple subsystems

Extend drop reasons to make them usable by subsystems
other than core by reserving the high 16 bits for a
new subsystem ID, of which 0 of course is used for the
existing reasons immediately.

To still be able to have string reasons, restructure
that code a bit to make the loopup under RCU, the only
user of this (right now) is drop_monitor.

Link: https://lore.kernel.org/netdev/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

show more ...


Revision tags: v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4
# 20b0b53a 04-Nov-2022 Jakub Kicinski <[email protected]>

genetlink: introduce split op representation

We currently have two forms of operations - small ops and "full" ops
(or just ops). The former does not have pointers for some of the less
commonly used

genetlink: introduce split op representation

We currently have two forms of operations - small ops and "full" ops
(or just ops). The former does not have pointers for some of the less
commonly used features (namely dump start/done and policy).

The "full" ops, however, still don't contain all the necessary
information. In particular the policy is per command ID, while
do and dump often accept different attributes. It's also not
possible to define different pre_doit and post_doit callbacks
for different commands within the family.

At the same time a lot of commands do not support dumping and
therefore all the dump-related information is wasted space.

Create a new command representation which can hold info about
a do implementation or a dump implementation, but not both at
the same time.

Use this new representation on the command execution path
(genl_family_rcv_msg) as we either run a do or a dump and
don't have to create a "full" op there.

Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

show more ...


Revision tags: v6.1-rc3
# d120d1a6 26-Oct-2022 Thomas Gleixner <[email protected]>

net: Remove the obsolte u64_stats_fetch_*_irq() users (net).

Now that the 32bit UP oddity is gone and 32bit uses always a sequence
count, there is no need for the fetch_irq() variants anymore.

Conv

net: Remove the obsolte u64_stats_fetch_*_irq() users (net).

Now that the 32bit UP oddity is gone and 32bit uses always a sequence
count, there is no need for the fetch_irq() variants anymore.

Convert to the regular interface.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

show more ...


Revision tags: v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3
# 9c5d03d3 25-Aug-2022 Jakub Kicinski <[email protected]>

genetlink: start to validate reserved header bytes

We had historically not checked that genlmsghdr.reserved
is 0 on input which prevents us from using those precious
bytes in the future.

One use ca

genetlink: start to validate reserved header bytes

We had historically not checked that genlmsghdr.reserved
is 0 on input which prevents us from using those precious
bytes in the future.

One use case would be to extend the cmd field, which is
currently just 8 bits wide and 256 is not a lot of commands
for some core families.

To make sure that new families do the right thing by default
put the onus of opting out of validation on existing families.

Signed-off-by: Jakub Kicinski <[email protected]>
Acked-by: Paul Moore <[email protected]> (NetLabel)
Signed-off-by: David S. Miller <[email protected]>

show more ...


Revision tags: v6.0-rc2
# 70986397 18-Aug-2022 Wolfram Sang <[email protected]>

net: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by

net: move from strlcpy with unused retval to strscpy

Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

show more ...


Revision tags: v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2
# c6cce71e 08-Jun-2022 Eric Dumazet <[email protected]>

drop_monitor: adopt u64_stats_t

As explained in commit 316580b69d0a ("u64_stats: provide u64_stats_t type")
we should use u64_stats_t and related accessors to avoid load/store tearing.

Signed-off-b

drop_monitor: adopt u64_stats_t

As explained in commit 316580b69d0a ("u64_stats: provide u64_stats_t type")
we should use u64_stats_t and related accessors to avoid load/store tearing.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

show more ...


# d62607c3 08-Jun-2022 Jakub Kicinski <[email protected]>

net: rename reference+tracking helpers

Netdev reference helpers have a dev_ prefix for historic
reasons. Renaming the old helpers would be too much churn
but we can rename the tracking ones which ar

net: rename reference+tracking helpers

Netdev reference helpers have a dev_ prefix for historic
reasons. Renaming the old helpers would be too much churn
but we can rename the tracking ones which are relatively
recent and should be the default for new code.

Rename:
dev_hold_track() -> netdev_hold()
dev_put_track() -> netdev_put()
dev_replace_track() -> netdev_ref_replace()

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

show more ...


# ec43908d 06-Jun-2022 Menglong Dong <[email protected]>

net: skb: use auto-generation to convert skb drop reason to string

It is annoying to add new skb drop reasons to 'enum skb_drop_reason'
and TRACE_SKB_DROP_REASON in trace/event/skb.h, and it's easy

net: skb: use auto-generation to convert skb drop reason to string

It is annoying to add new skb drop reasons to 'enum skb_drop_reason'
and TRACE_SKB_DROP_REASON in trace/event/skb.h, and it's easy to forget
to add the new reasons we added to TRACE_SKB_DROP_REASON.

TRACE_SKB_DROP_REASON is used to convert drop reason of type number
to string. For now, the string we passed to user space is exactly the
same as the name in 'enum skb_drop_reason' with a 'SKB_DROP_REASON_'
prefix. Therefore, we can use 'auto-generation' to generate these
drop reasons to string at build time.

The new source 'dropreason_str.c' will be auto generated during build
time, which contains the string array
'const char * const drop_reasons[]'.

Signed-off-by: Menglong Dong <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>

show more ...


Revision tags: v5.19-rc1, v5.18, v5.18-rc7
# a3af33ab 13-May-2022 Menglong Dong <[email protected]>

net: dm: check the boundary of skb drop reasons

The 'reason' will be set to 'SKB_DROP_REASON_NOT_SPECIFIED' if it not
small that SKB_DROP_REASON_MAX in net_dm_packet_trace_kfree_skb_hit(),
but it ca

net: dm: check the boundary of skb drop reasons

The 'reason' will be set to 'SKB_DROP_REASON_NOT_SPECIFIED' if it not
small that SKB_DROP_REASON_MAX in net_dm_packet_trace_kfree_skb_hit(),
but it can't avoid it to be 0, which is invalid and can cause NULL
pointer in drop_reasons.

Therefore, reset it to SKB_DROP_REASON_NOT_SPECIFIED when 'reason <= 0'.

Reviewed-by: Jiang Biao <[email protected]>
Reviewed-by: Hao Peng <[email protected]>
Signed-off-by: Menglong Dong <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

show more ...


Revision tags: v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6
# b26ef81c 22-Feb-2022 Eric Dumazet <[email protected]>

drop_monitor: remove quadratic behavior

drop_monitor is using an unique list on which all netdevices in
the host have an element, regardless of their netns.

This scales poorly, not only at device u

drop_monitor: remove quadratic behavior

drop_monitor is using an unique list on which all netdevices in
the host have an element, regardless of their netns.

This scales poorly, not only at device unregister time (what I
caught during my netns dismantle stress tests), but also at packet
processing time whenever trace_napi_poll_hit() is called.

If the intent was to avoid adding one pointer in 'struct net_device'
then surely we prefer O(1) behavior.

Signed-off-by: Eric Dumazet <[email protected]>
Cc: Neil Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

show more ...


Revision tags: v5.17-rc5, v5.17-rc4
# dcd54265 10-Feb-2022 Eric Dumazet <[email protected]>

drop_monitor: fix data-race in dropmon_net_event / trace_napi_poll_hit

trace_napi_poll_hit() is reading stat->dev while another thread can write
on it from dropmon_net_event()

Use READ_ONCE()/WRITE

drop_monitor: fix data-race in dropmon_net_event / trace_napi_poll_hit

trace_napi_poll_hit() is reading stat->dev while another thread can write
on it from dropmon_net_event()

Use READ_ONCE()/WRITE_ONCE() here, RCU rules are properly enforced already,
we only have to take care of load/store tearing.

BUG: KCSAN: data-race in dropmon_net_event / trace_napi_poll_hit

write to 0xffff88816f3ab9c0 of 8 bytes by task 20260 on cpu 1:
dropmon_net_event+0xb8/0x2b0 net/core/drop_monitor.c:1579
notifier_call_chain kernel/notifier.c:84 [inline]
raw_notifier_call_chain+0x53/0xb0 kernel/notifier.c:392
call_netdevice_notifiers_info net/core/dev.c:1919 [inline]
call_netdevice_notifiers_extack net/core/dev.c:1931 [inline]
call_netdevice_notifiers net/core/dev.c:1945 [inline]
unregister_netdevice_many+0x867/0xfb0 net/core/dev.c:10415
ip_tunnel_delete_nets+0x24a/0x280 net/ipv4/ip_tunnel.c:1123
vti_exit_batch_net+0x2a/0x30 net/ipv4/ip_vti.c:515
ops_exit_list net/core/net_namespace.c:173 [inline]
cleanup_net+0x4dc/0x8d0 net/core/net_namespace.c:597
process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
worker_thread+0x616/0xa70 kernel/workqueue.c:2454
kthread+0x1bf/0x1e0 kernel/kthread.c:377
ret_from_fork+0x1f/0x30

read to 0xffff88816f3ab9c0 of 8 bytes by interrupt on cpu 0:
trace_napi_poll_hit+0x89/0x1c0 net/core/drop_monitor.c:292
trace_napi_poll include/trace/events/napi.h:14 [inline]
__napi_poll+0x36b/0x3f0 net/core/dev.c:6366
napi_poll net/core/dev.c:6432 [inline]
net_rx_action+0x29e/0x650 net/core/dev.c:6519
__do_softirq+0x158/0x2de kernel/softirq.c:558
do_softirq+0xb1/0xf0 kernel/softirq.c:459
__local_bh_enable_ip+0x68/0x70 kernel/softirq.c:383
__raw_spin_unlock_bh include/linux/spinlock_api_smp.h:167 [inline]
_raw_spin_unlock_bh+0x33/0x40 kernel/locking/spinlock.c:210
spin_unlock_bh include/linux/spinlock.h:394 [inline]
ptr_ring_consume_bh include/linux/ptr_ring.h:367 [inline]
wg_packet_decrypt_worker+0x73c/0x780 drivers/net/wireguard/receive.c:506
process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
worker_thread+0x616/0xa70 kernel/workqueue.c:2454
kthread+0x1bf/0x1e0 kernel/kthread.c:377
ret_from_fork+0x1f/0x30

value changed: 0xffff88815883e000 -> 0x0000000000000000

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 26435 Comm: kworker/0:1 Not tainted 5.17.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: wg-crypt-wg2 wg_packet_decrypt_worker

Fixes: 4ea7e38696c7 ("dropmon: add ability to detect when hardware dropsrxpackets")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Neil Horman <[email protected]>
Reported-by: syzbot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

show more ...


# 5cad527d 09-Feb-2022 Menglong Dong <[email protected]>

net: drop_monitor: support drop reason

In the commit c504e5c2f964 ("net: skb: introduce kfree_skb_reason()")
drop reason is introduced to the tracepoint of kfree_skb. Therefore,
drop_monitor is able

net: drop_monitor: support drop reason

In the commit c504e5c2f964 ("net: skb: introduce kfree_skb_reason()")
drop reason is introduced to the tracepoint of kfree_skb. Therefore,
drop_monitor is able to report the drop reason to users by netlink.

The drop reasons are reported as string to users, which is exactly
the same as what we do when reporting it to ftrace.

Signed-off-by: Menglong Dong <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

show more ...


Revision tags: v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16
# c504e5c2 09-Jan-2022 Menglong Dong <[email protected]>

net: skb: introduce kfree_skb_reason()

Introduce the interface kfree_skb_reason(), which is able to pass
the reason why the skb is dropped to 'kfree_skb' tracepoint.

Add the 'reason' field to 'trac

net: skb: introduce kfree_skb_reason()

Introduce the interface kfree_skb_reason(), which is able to pass
the reason why the skb is dropped to 'kfree_skb' tracepoint.

Add the 'reason' field to 'trace_kfree_skb', therefor user can get
more detail information about abnormal skb with 'drop_monitor' or
eBPF.

All drop reasons are defined in the enum 'skb_drop_reason', and
they will be print as string in 'kfree_skb' tracepoint in format
of 'reason: XXX'.

( Maybe the reasons should be defined in a uapi header file, so that
user space can use them? )

Signed-off-by: Menglong Dong <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

show more ...


Revision tags: v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4
# 4dbd24f6 05-Dec-2021 Eric Dumazet <[email protected]>

drop_monitor: add net device refcount tracker

We want to track all dev_hold()/dev_put() to ease leak hunting.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Jakub Kicinski <kuba@k

drop_monitor: add net device refcount tracker

We want to track all dev_hold()/dev_put() to ease leak hunting.

Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>

show more ...


Revision tags: v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5
# 1160dfa1 05-Aug-2021 Yajun Deng <[email protected]>

net: Remove redundant if statements

The 'if (dev)' statement already move into dev_{put , hold}, so remove
redundant if statements.

Signed-off-by: Yajun Deng <[email protected]>
Signed-off-by: D

net: Remove redundant if statements

The 'if (dev)' statement already move into dev_{put , hold}, so remove
redundant if statements.

Signed-off-by: Yajun Deng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

show more ...


Revision tags: v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4
# a835f903 18-Mar-2021 Xiong Zhenwu <[email protected]>

/net/core/: fix misspellings using codespell tool

A typo is found out by codespell tool in 1734th line of drop_monitor.c:

$ codespell ./net/core/
./net/core/drop_monitor.c:1734: guarnateed ==> gua

/net/core/: fix misspellings using codespell tool

A typo is found out by codespell tool in 1734th line of drop_monitor.c:

$ codespell ./net/core/
./net/core/drop_monitor.c:1734: guarnateed ==> guaranteed

Fix a typo found by codespell.

Signed-off-by: Xiong Zhenwu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

show more ...


Revision tags: v5.12-rc3
# 9398e9c0 10-Mar-2021 Ido Schimmel <[email protected]>

drop_monitor: Perform cleanup upon probe registration failure

In the rare case that drop_monitor fails to register its probe on the
'napi_poll' tracepoint, it will not deactivate its hysteresis time

drop_monitor: Perform cleanup upon probe registration failure

In the rare case that drop_monitor fails to register its probe on the
'napi_poll' tracepoint, it will not deactivate its hysteresis timer as
part of the error path. If the hysteresis timer was armed by the shortly
lived 'kfree_skb' probe and user space retries to initiate tracing, a
warning will be emitted for trying to initialize an active object [1].

Fix this by properly undoing all the operations that were done prior to
probe registration, in both software and hardware code paths.

Note that syzkaller managed to fail probe registration by injecting a
slab allocation failure [2].

[1]
ODEBUG: init active (active state 0) object type: timer_list hint: sched_send_work+0x0/0x60 include/linux/list.h:135
WARNING: CPU: 1 PID: 8649 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505
Modules linked in:
CPU: 1 PID: 8649 Comm: syz-executor.0 Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505
[...]
Call Trace:
__debug_object_init+0x524/0xd10 lib/debugobjects.c:588
debug_timer_init kernel/time/timer.c:722 [inline]
debug_init kernel/time/timer.c:770 [inline]
init_timer_key+0x2d/0x340 kernel/time/timer.c:814
net_dm_trace_on_set net/core/drop_monitor.c:1111 [inline]
set_all_monitor_traces net/core/drop_monitor.c:1188 [inline]
net_dm_monitor_start net/core/drop_monitor.c:1295 [inline]
net_dm_cmd_trace+0x720/0x1220 net/core/drop_monitor.c:1339
genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:739
genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:800
netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502
genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline]
netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338
netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:672
____sys_sendmsg+0x6e8/0x810 net/socket.c:2348
___sys_sendmsg+0xf3/0x170 net/socket.c:2402
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2435
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae

[2]
FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 1
CPU: 1 PID: 8645 Comm: syz-executor.0 Not tainted 5.11.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
dump_stack+0xfa/0x151
should_fail.cold+0x5/0xa
should_failslab+0x5/0x10
__kmalloc+0x72/0x3f0
tracepoint_add_func+0x378/0x990
tracepoint_probe_register+0x9c/0xe0
net_dm_cmd_trace+0x7fc/0x1220
genl_family_rcv_msg_doit+0x228/0x320
genl_rcv_msg+0x328/0x580
netlink_rcv_skb+0x153/0x420
genl_rcv+0x24/0x40
netlink_unicast+0x533/0x7d0
netlink_sendmsg+0x856/0xd90
sock_sendmsg+0xcf/0x120
____sys_sendmsg+0x6e8/0x810
___sys_sendmsg+0xf3/0x170
__sys_sendmsg+0xe5/0x1b0
do_syscall_64+0x2d/0x70
entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 70c69274f354 ("drop_monitor: Initialize timer and work item upon tracing enable")
Fixes: 8ee2267ad33e ("drop_monitor: Convert to using devlink tracepoint")
Reported-by: [email protected]
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

show more ...


Revision tags: v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8
# 66a9b928 02-Oct-2020 Jakub Kicinski <[email protected]>

genetlink: move to smaller ops wherever possible

Bulk of the genetlink users can use smaller ops, move them.

Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Johannes Berg <johannes@sip

genetlink: move to smaller ops wherever possible

Bulk of the genetlink users can use smaller ops, move them.

Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Johannes Berg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

show more ...


12345