|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5 |
|
| #
70d0bb45 |
| 22-Aug-2024 |
Simon Horman <[email protected]> |
net: Correct spelling in headers
Correct spelling in Networking headers. As reported by codespell.
Signed-off-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/20240822-net-spell-v
net: Correct spelling in headers
Correct spelling in Networking headers. As reported by codespell.
Signed-off-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
show more ...
|
|
Revision tags: v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2 |
|
| #
94dfc73e |
| 07-Apr-2022 |
Gustavo A. R. Silva <[email protected]> |
treewide: uapi: Replace zero-length arrays with flexible-array members
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a stru
treewide: uapi: Replace zero-length arrays with flexible-array members
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2].
This code was transformed with the help of Coccinelle: (linux-5.19-rc2$ spatch --jobs $(getconf _NPROCESSORS_ONLN) --sp-file script.cocci --include-headers --dir . > output.patch)
@@ identifier S, member, array; type T1, T2; @@
struct S { ... T1 member; T2 array[ - 0 ]; };
-fstrict-flex-arrays=3 is coming and we need to land these changes to prevent issues like these in the short future:
../fs/minix/dir.c:337:3: warning: 'strcpy' will always overflow; destination buffer has size 0, but the source string has length 2 (including NUL byte) [-Wfortify-source] strcpy(de3->name, "."); ^
Since these are all [0] to [] changes, the risk to UAPI is nearly zero. If this breaks anything, we can use a union with a new member name.
[1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays
Link: https://github.com/KSPP/linux/issues/78 Build-tested-by: kernel test robot <[email protected]> Link: https://lore.kernel.org/lkml/62b675ec.wKX6AOZ6cbE71vtF%[email protected]/ Acked-by: Dan Williams <[email protected]> # For ndctl.h Signed-off-by: Gustavo A. R. Silva <[email protected]>
show more ...
|
|
Revision tags: v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4 |
|
| #
c1077616 |
| 01-Sep-2020 |
Wei Wang <[email protected]> |
ip: expose inet sockopts through inet_diag
Expose all exisiting inet sockopt bits through inet_diag for debug purpose. Corresponding changes in iproute2 ss will be submitted to output all these valu
ip: expose inet sockopts through inet_diag
Expose all exisiting inet sockopt bits through inet_diag for debug purpose. Corresponding changes in iproute2 ss will be submitted to output all these values.
Signed-off-by: Wei Wang <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Mahesh Bandewar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5 |
|
| #
3f935c75 |
| 09-Jul-2020 |
Paolo Abeni <[email protected]> |
inet_diag: support for wider protocol numbers
After commit bf9765145b85 ("sock: Make sk_protocol a 16-bit value") the current size of 'sdiag_protocol' is not sufficient to represent the possible pro
inet_diag: support for wider protocol numbers
After commit bf9765145b85 ("sock: Make sk_protocol a 16-bit value") the current size of 'sdiag_protocol' is not sufficient to represent the possible protocol values.
This change introduces a new inet diag request attribute to let user space specify the relevant protocol number using u32 values.
The attribute is parsed by inet diag core on get/dump command and the extended protocol value, if available, is preferred to 'sdiag_protocol' to lookup the diag handler.
The parse attributed are exposed to all the diag handlers via the cb->data.
Note that inet_diag_dump_one_icsk() is left unmodified, as it will not be used by protocol using the extended attribute.
Suggested-by: David S. Miller <[email protected]> Co-developed-by: Christoph Paasch <[email protected]> Signed-off-by: Christoph Paasch <[email protected]> Acked-by: Mat Martineau <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4 |
|
| #
b1f3e43d |
| 30-Apr-2020 |
Dmitry Yakunin <[email protected]> |
inet_diag: add support for cgroup filter
This patch adds ability to filter sockets based on cgroup v2 ID. Such filter is helpful in ss utility for filtering sockets by cgroup pathname.
Signed-off-b
inet_diag: add support for cgroup filter
This patch adds ability to filter sockets based on cgroup v2 ID. Such filter is helpful in ss utility for filtering sockets by cgroup pathname.
Signed-off-by: Dmitry Yakunin <[email protected]> Reviewed-by: Konstantin Khlebnikov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
6e3a401f |
| 30-Apr-2020 |
Dmitry Yakunin <[email protected]> |
inet_diag: add cgroup id attribute
This patch adds cgroup v2 ID to common inet diag message attributes. Cgroup v2 ID is kernfs ID (ino or ino+gen). This attribute allows filter inet diag output by c
inet_diag: add cgroup id attribute
This patch adds cgroup v2 ID to common inet diag message attributes. Cgroup v2 ID is kernfs ID (ino or ino+gen). This attribute allows filter inet diag output by cgroup ID obtained by name_to_handle_at() syscall. When net_cls or net_prio cgroup is activated this ID is equal to 1 (root cgroup ID) for newly created sockets.
Some notes about this ID:
1) gets initialized in socket() syscall 2) incoming socket gets ID from listening socket (not during accept() syscall) 3) not changed when process get moved to another cgroup 4) can point to deleted cgroup (refcounting)
v2: - use CONFIG_SOCK_CGROUP_DATA instead if CONFIG_CGROUPS
v3: - fix attr size by using nla_total_size_64bit() (Eric Dumazet) - more detailed commit message (Konstantin Khlebnikov)
Signed-off-by: Dmitry Yakunin <[email protected]> Reviewed-by: Konstantin Khlebnikov <[email protected]> Acked-By: Tejun Heo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6 |
|
| #
5147dfb5 |
| 27-Mar-2020 |
Davide Caratti <[email protected]> |
mptcp: allow dumping subflow context to userspace
add ulp-specific diagnostic functions, so that subflow information can be dumped to userspace programs like 'ss'.
v2 -> v3: - uapi: use bit macros
mptcp: allow dumping subflow context to userspace
add ulp-specific diagnostic functions, so that subflow information can be dumped to userspace programs like 'ss'.
v2 -> v3: - uapi: use bit macros appropriate for userspace
Co-developed-by: Matthieu Baerts <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Co-developed-by: Paolo Abeni <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Davide Caratti <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4 |
|
| #
085c20ca |
| 25-Feb-2020 |
Martin KaFai Lau <[email protected]> |
bpf: inet_diag: Dump bpf_sk_storages in inet_diag_dump()
This patch will dump out the bpf_sk_storages of a sk if the request has the INET_DIAG_REQ_SK_BPF_STORAGES nlattr.
An array of SK_DIAG_BPF_ST
bpf: inet_diag: Dump bpf_sk_storages in inet_diag_dump()
This patch will dump out the bpf_sk_storages of a sk if the request has the INET_DIAG_REQ_SK_BPF_STORAGES nlattr.
An array of SK_DIAG_BPF_STORAGE_REQ_MAP_FD can be specified in INET_DIAG_REQ_SK_BPF_STORAGES to select which bpf_sk_storage to dump. If no map_fd is specified, all bpf_sk_storages of a sk will be dumped.
bpf_sk_storages can be added to the system at runtime. It is difficult to find a proper static value for cb->min_dump_alloc.
This patch learns the nlattr size required to dump the bpf_sk_storages of a sk. If it happens to be the very first nlmsg of a dump and it cannot fit the needed bpf_sk_storages, it will try to expand the skb by "pskb_expand_head()".
Instead of expanding it in inet_sk_diag_fill(), it is expanded at a sleepable context in __inet_diag_dump() so __GFP_DIRECT_RECLAIM can be used. In __inet_diag_dump(), it will retry as long as the skb is empty and the cb->min_dump_alloc becomes larger than before. cb->min_dump_alloc is bounded by KMALLOC_MAX_SIZE. The min_dump_alloc is also changed from 'u16' to 'u32' to accommodate a sk that may have a few large bpf_sk_storages.
The updated cb->min_dump_alloc will also be used to allocate the skb in the next dump. This logic already exists in netlink_dump().
Here is the sample output of a locally modified 'ss' and it could be made more readable by using BTF later: [root@arch-fb-vm1 ~]# ss --bpf-map-id 14 --bpf-map-id 13 -t6an 'dst [::1]:8989' State Recv-Q Send-Q Local Address:Port Peer Address:PortProcess ESTAB 0 0 [::1]:51072 [::1]:8989 bpf_map_id:14 value:[ 3feb ] bpf_map_id:13 value:[ 3f ] ESTAB 0 0 [::1]:51070 [::1]:8989 bpf_map_id:14 value:[ 3feb ] bpf_map_id:13 value:[ 3f ]
[root@arch-fb-vm1 ~]# ~/devshare/github/iproute2/misc/ss --bpf-maps -t6an 'dst [::1]:8989' State Recv-Q Send-Q Local Address:Port Peer Address:Port Process ESTAB 0 0 [::1]:51072 [::1]:8989 bpf_map_id:14 value:[ 3feb ] bpf_map_id:13 value:[ 3f ] bpf_map_id:12 value:[ 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000... total:65407 ] ESTAB 0 0 [::1]:51070 [::1]:8989 bpf_map_id:14 value:[ 3feb ] bpf_map_id:13 value:[ 3f ] bpf_map_id:12 value:[ 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000... total:65407 ]
Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
show more ...
|
| #
0df6d328 |
| 25-Feb-2020 |
Martin KaFai Lau <[email protected]> |
inet_diag: Move the INET_DIAG_REQ_BYTECODE nlattr to cb->data
The INET_DIAG_REQ_BYTECODE nlattr is currently re-found every time when the "dump()" is re-started.
In a latter patch, it will also nee
inet_diag: Move the INET_DIAG_REQ_BYTECODE nlattr to cb->data
The INET_DIAG_REQ_BYTECODE nlattr is currently re-found every time when the "dump()" is re-started.
In a latter patch, it will also need to parse the new INET_DIAG_REQ_SK_BPF_STORAGES nlattr to learn the map_fds. Thus, this patch takes this chance to store the parsed nlattr in cb->data during the "start" time of a dump.
By doing this, the "bc" argument also becomes unnecessary and is removed. Also, the two copies of the INET_DIAG_REQ_BYTECODE parsing-audit logic between compat/current version can be consolidated to one.
Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
show more ...
|
|
Revision tags: v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7 |
|
| #
26811cc9 |
| 30-Aug-2019 |
Davide Caratti <[email protected]> |
net: tls: export protocol version, cipher, tx_conf/rx_conf to socket diag
When an application configures kernel TLS on top of a TCP socket, it's now possible for inet_diag_handler() to collect infor
net: tls: export protocol version, cipher, tx_conf/rx_conf to socket diag
When an application configures kernel TLS on top of a TCP socket, it's now possible for inet_diag_handler() to collect information regarding the protocol version, the cipher type and TX / RX configuration, in case INET_DIAG_INFO is requested.
Signed-off-by: Davide Caratti <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
| #
61723b39 |
| 30-Aug-2019 |
Davide Caratti <[email protected]> |
tcp: ulp: add functions to dump ulp-specific information
currently, only getsockopt(TCP_ULP) can be invoked to know if a ULP is on top of a TCP socket. Extend idiag_get_aux() and idiag_get_aux_size(
tcp: ulp: add functions to dump ulp-specific information
currently, only getsockopt(TCP_ULP) can be invoked to know if a ULP is on top of a TCP socket. Extend idiag_get_aux() and idiag_get_aux_size(), introduced by commit b37e88407c1d ("inet_diag: allow protocols to provide additional data"), to report the ULP name and other information that can be made available by the ULP through optional functions.
Users having CAP_NET_ADMIN privileges will then be able to retrieve this information through inet_diag_handler, if they specify INET_DIAG_INFO in the request.
Signed-off-by: Davide Caratti <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3, v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3, v5.1-rc2, v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7, v5.0-rc6 |
|
| #
1ec17dbd |
| 09-Feb-2019 |
Konstantin Khlebnikov <[email protected]> |
inet_diag: fix reporting cgroup classid and fallback to priority
Field idiag_ext in struct inet_diag_req_v2 used as bitmap of requested extensions has only 8 bits. Thus extensions starting from DCTC
inet_diag: fix reporting cgroup classid and fallback to priority
Field idiag_ext in struct inet_diag_req_v2 used as bitmap of requested extensions has only 8 bits. Thus extensions starting from DCTCPINFO cannot be requested directly. Some of them included into response unconditionally or hook into some of lower 8 bits.
Extension INET_DIAG_CLASS_ID has not way to request from the beginning.
This patch bundle it with INET_DIAG_TCLASS (ipv6 tos), fixes space reservation, and documents behavior for other extensions.
Also this patch adds fallback to reporting socket priority. This filed is more widely used for traffic classification because ipv4 sockets automatically maps TOS to priority and default qdisc pfifo_fast knows about that. But priority could be changed via setsockopt SO_PRIORITY so INET_DIAG_TOS isn't enough for predicting class.
Also cgroup2 obsoletes net_cls classid (it always zero), but we cannot reuse this field for reporting cgroup2 id because it is 64-bit (ino+gen).
So, after this patch INET_DIAG_CLASS_ID will report socket priority for most common setup when net_cls isn't set and/or cgroup2 in use.
Fixes: 0888e372c37f ("net: inet: diag: expose sockets cgroup classid") Signed-off-by: Konstantin Khlebnikov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1, v4.19, v4.19-rc8, v4.19-rc7, v4.19-rc6, v4.19-rc5, v4.19-rc4, v4.19-rc3, v4.19-rc2, v4.19-rc1, v4.18, v4.18-rc8, v4.18-rc7, v4.18-rc6, v4.18-rc5, v4.18-rc4, v4.18-rc3, v4.18-rc2, v4.18-rc1, v4.17, v4.17-rc7, v4.17-rc6, v4.17-rc5, v4.17-rc4, v4.17-rc3, v4.17-rc2, v4.17-rc1, v4.16, v4.16-rc7, v4.16-rc6, v4.16-rc5, v4.16-rc4, v4.16-rc3, v4.16-rc2, v4.16-rc1, v4.15, v4.15-rc9, v4.15-rc8, v4.15-rc7, v4.15-rc6 |
|
| #
bbb6189d |
| 27-Dec-2017 |
Kristian Evensen <[email protected]> |
inet_diag: Add equal-operator for ports
inet_diag currently provides less/greater than or equal operators for comparing ports when filtering sockets. An equal comparison can be performed by combinin
inet_diag: Add equal-operator for ports
inet_diag currently provides less/greater than or equal operators for comparing ports when filtering sockets. An equal comparison can be performed by combining the two existing operators, or a user can for example request a port range and then do the final filtering in userspace. However, these approaches both have drawbacks. Implementing equal using LE/GE causes the size and complexity of a filter to grow quickly as the number of ports increase, while it on busy machines would be great if the kernel only returns information about relevant sockets.
This patch introduces source and destination port equal operators. INET_DIAG_BC_S_EQ is used to match a source port, INET_DIAG_BC_D_EQ a destination port, and usage is the same as for the existing port operators. I.e., the port to match is stored in the no-member of the next inet_diag_bc_op-struct in the filter.
Signed-off-by: Kristian Evensen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v4.15-rc5, v4.15-rc4, v4.15-rc3, v4.15-rc2, v4.15-rc1, v4.14, v4.14-rc8 |
|
| #
6f52b16c |
| 01-Nov-2017 |
Greg Kroah-Hartman <[email protected]> |
License cleanup: add SPDX license identifier to uapi header files with no license
Many user space API headers are missing licensing information, which makes it hard for compliance tools to determine
License cleanup: add SPDX license identifier to uapi header files with no license
Many user space API headers are missing licensing information, which makes it hard for compliance tools to determine the correct license.
By default are files without license information under the default license of the kernel, which is GPLV2. Marking them GPLV2 would exclude them from being included in non GPLV2 code, which is obviously not intended. The user space API headers fall under the syscall exception which is in the kernels COPYING file:
NOTE! This copyright does *not* cover user programs that use kernel services by normal system calls - this is merely considered normal use of the kernel, and does *not* fall under the heading of "derived work".
otherwise syscall usage would not be possible.
Update the files which contain no license information with an SPDX license identifier. The chosen identifier is 'GPL-2.0 WITH Linux-syscall-note' which is the officially assigned identifier for the Linux syscall exception. SPDX license identifiers are a legally binding shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. See the previous patch in this series for the methodology of how this patch was researched.
Reviewed-by: Kate Stewart <[email protected]> Reviewed-by: Philippe Ombredanne <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
show more ...
|
|
Revision tags: v4.14-rc7, v4.14-rc6, v4.14-rc5, v4.14-rc4, v4.14-rc3, v4.14-rc2, v4.14-rc1, v4.13 |
|
| #
c03fa9bc |
| 31-Aug-2017 |
Ivan Delalande <[email protected]> |
tcp_diag: report TCP MD5 signing keys and addresses
Report TCP MD5 (RFC2385) signing keys, addresses and address prefixes to processes with CAP_NET_ADMIN requesting INET_DIAG_INFO. Currently it is n
tcp_diag: report TCP MD5 signing keys and addresses
Report TCP MD5 (RFC2385) signing keys, addresses and address prefixes to processes with CAP_NET_ADMIN requesting INET_DIAG_INFO. Currently it is not possible to retrieve these from the kernel once they have been configured on sockets.
Signed-off-by: Ivan Delalande <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v4.13-rc7, v4.13-rc6 |
|
| #
0888e372 |
| 17-Aug-2017 |
Levin, Alexander (Sasha Levin) <[email protected]> |
net: inet: diag: expose sockets cgroup classid
This is useful for directly looking up a task based on class id rather than having to scan through all open file descriptors.
Signed-off-by: Sasha Lev
net: inet: diag: expose sockets cgroup classid
This is useful for directly looking up a task based on class id rather than having to scan through all open file descriptors.
Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v4.13-rc5, v4.13-rc4, v4.13-rc3, v4.13-rc2, v4.13-rc1, v4.12, v4.12-rc7, v4.12-rc6, v4.12-rc5, v4.12-rc4, v4.12-rc3, v4.12-rc2, v4.12-rc1, v4.11, v4.11-rc8, v4.11-rc7, v4.11-rc6, v4.11-rc5, v4.11-rc4, v4.11-rc3, v4.11-rc2, v4.11-rc1, v4.10, v4.10-rc8, v4.10-rc7, v4.10-rc6, v4.10-rc5, v4.10-rc4, v4.10-rc3, v4.10-rc2, v4.10-rc1, v4.9, v4.9-rc8, v4.9-rc7, v4.9-rc6, v4.9-rc5, v4.9-rc4, v4.9-rc3, v4.9-rc2 |
|
| #
432490f9 |
| 21-Oct-2016 |
Cyrill Gorcunov <[email protected]> |
net: ip, diag -- Add diag interface for raw sockets
In criu we are actively using diag interface to collect sockets present in the system when dumping applications. And while for unix, tcp, udp[lite
net: ip, diag -- Add diag interface for raw sockets
In criu we are actively using diag interface to collect sockets present in the system when dumping applications. And while for unix, tcp, udp[lite], packet, netlink it works as expected, the raw sockets do not have. Thus add it.
v2: - add missing sock_put calls in raw_diag_dump_one (by eric.dumazet@) - implement @destroy for diag requests (by dsa@)
v3: - add export of raw_abort for IPv6 (by dsa@) - pass net-admin flag into inet_sk_diag_fill due to changes in net-next branch (by dsa@)
v4: - use @pad in struct inet_diag_req_v2 for raw socket protocol specification: raw module carries sockets which may have custom protocol passed from socket() syscall and sole @sdiag_protocol is not enough to match underlied ones - start reporting protocol specifed in socket() call when sockets are raw ones for the same reason: user space tools like ss may parse this attribute and use it for socket matching
v5 (by eric.dumazet@): - use sock_hold in raw_sock_get instead of atomic_inc, we're holding (raw_v4_hashinfo|raw_v6_hashinfo)->lock when looking up so counter won't be zero here.
v6: - use sdiag_raw_protocol() helper which will access @pad structure used for raw sockets protocol specification: we can't simply rename this member without breaking uapi
v7: - sine sdiag_raw_protocol() helper is not suitable for uapi lets rather make an alias structure with proper names. __check_inet_diag_req_raw helper will catch if any of structure unintentionally changed.
CC: David S. Miller <[email protected]> CC: Eric Dumazet <[email protected]> CC: David Ahern <[email protected]> CC: Alexey Kuznetsov <[email protected]> CC: James Morris <[email protected]> CC: Hideaki YOSHIFUJI <[email protected]> CC: Patrick McHardy <[email protected]> CC: Andrey Vagin <[email protected]> CC: Stephen Hemminger <[email protected]> Signed-off-by: Cyrill Gorcunov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v4.9-rc1, v4.8, v4.8-rc8 |
|
| #
0f8782ea |
| 20-Sep-2016 |
Neal Cardwell <[email protected]> |
tcp_bbr: add BBR congestion control
This commit implements a new TCP congestion control algorithm: BBR (Bottleneck Bandwidth and RTT). A detailed description of BBR will be published in ACM Queue, V
tcp_bbr: add BBR congestion control
This commit implements a new TCP congestion control algorithm: BBR (Bottleneck Bandwidth and RTT). A detailed description of BBR will be published in ACM Queue, Vol. 14 No. 5, September-October 2016, as "BBR: Congestion-Based Congestion Control".
BBR has significantly increased throughput and reduced latency for connections on Google's internal backbone networks and google.com and YouTube Web servers.
BBR requires only changes on the sender side, not in the network or the receiver side. Thus it can be incrementally deployed on today's Internet, or in datacenters.
The Internet has predominantly used loss-based congestion control (largely Reno or CUBIC) since the 1980s, relying on packet loss as the signal to slow down. While this worked well for many years, loss-based congestion control is unfortunately out-dated in today's networks. On today's Internet, loss-based congestion control causes the infamous bufferbloat problem, often causing seconds of needless queuing delay, since it fills the bloated buffers in many last-mile links. On today's high-speed long-haul links using commodity switches with shallow buffers, loss-based congestion control has abysmal throughput because it over-reacts to losses caused by transient traffic bursts.
In 1981 Kleinrock and Gale showed that the optimal operating point for a network maximizes delivered bandwidth while minimizing delay and loss, not only for single connections but for the network as a whole. Finding that optimal operating point has been elusive, since any single network measurement is ambiguous: network measurements are the result of both bandwidth and propagation delay, and those two cannot be measured simultaneously.
While it is impossible to disambiguate any single bandwidth or RTT measurement, a connection's behavior over time tells a clearer story. BBR uses a measurement strategy designed to resolve this ambiguity. It combines these measurements with a robust servo loop using recent control systems advances to implement a distributed congestion control algorithm that reacts to actual congestion, not packet loss or transient queue delay, and is designed to converge with high probability to a point near the optimal operating point.
In a nutshell, BBR creates an explicit model of the network pipe by sequentially probing the bottleneck bandwidth and RTT. On the arrival of each ACK, BBR derives the current delivery rate of the last round trip, and feeds it through a windowed max-filter to estimate the bottleneck bandwidth. Conversely it uses a windowed min-filter to estimate the round trip propagation delay. The max-filtered bandwidth and min-filtered RTT estimates form BBR's model of the network pipe.
Using its model, BBR sets control parameters to govern sending behavior. The primary control is the pacing rate: BBR applies a gain multiplier to transmit faster or slower than the observed bottleneck bandwidth. The conventional congestion window (cwnd) is now the secondary control; the cwnd is set to a small multiple of the estimated BDP (bandwidth-delay product) in order to allow full utilization and bandwidth probing while bounding the potential amount of queue at the bottleneck.
When a BBR connection starts, it enters STARTUP mode and applies a high gain to perform an exponential search to quickly probe the bottleneck bandwidth (doubling its sending rate each round trip, like slow start). However, instead of continuing until it fills up the buffer (i.e. a loss), or until delay or ACK spacing reaches some threshold (like Hystart), it uses its model of the pipe to estimate when that pipe is full: it estimates the pipe is full when it notices the estimated bandwidth has stopped growing. At that point it exits STARTUP and enters DRAIN mode, where it reduces its pacing rate to drain the queue it estimates it has created.
Then BBR enters steady state. In steady state, PROBE_BW mode cycles between first pacing faster to probe for more bandwidth, then pacing slower to drain any queue that created if no more bandwidth was available, and then cruising at the estimated bandwidth to utilize the pipe without creating excess queue. Occasionally, on an as-needed basis, it sends significantly slower to probe for RTT (PROBE_RTT mode).
BBR has been fully deployed on Google's wide-area backbone networks and we're experimenting with BBR on Google.com and YouTube on a global scale. Replacing CUBIC with BBR has resulted in significant improvements in network latency and application (RPC, browser, and video) metrics. For more details please refer to our upcoming ACM Queue publication.
Example performance results, to illustrate the difference between BBR and CUBIC:
Resilience to random loss (e.g. from shallow buffers): Consider a netperf TCP_STREAM test lasting 30 secs on an emulated path with a 10Gbps bottleneck, 100ms RTT, and 1% packet loss rate. CUBIC gets 3.27 Mbps, and BBR gets 9150 Mbps (2798x higher).
Low latency with the bloated buffers common in today's last-mile links: Consider a netperf TCP_STREAM test lasting 120 secs on an emulated path with a 10Mbps bottleneck, 40ms RTT, and 1000-packet bottleneck buffer. Both fully utilize the bottleneck bandwidth, but BBR achieves this with a median RTT 25x lower (43 ms instead of 1.09 secs).
Our long-term goal is to improve the congestion control algorithms used on the Internet. We are hopeful that BBR can help advance the efforts toward this goal, and motivate the community to do further research.
Test results, performance evaluations, feedback, and BBR-related discussions are very welcome in the public e-mail list for BBR:
https://groups.google.com/forum/#!forum/bbr-dev
NOTE: BBR *must* be used with the fq qdisc ("man tc-fq") with pacing enabled, since pacing is integral to the BBR design and implementation. BBR without pacing would not function properly, and may incur unnecessary high packet loss rates.
Signed-off-by: Van Jacobson <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Nandita Dukkipati <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Soheil Hassas Yeganeh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v4.8-rc7, v4.8-rc6 |
|
| #
d545caca |
| 07-Sep-2016 |
Lorenzo Colitti <[email protected]> |
net: inet: diag: expose the socket mark to privileged processes.
This adds the capability for a process that has CAP_NET_ADMIN on a socket to see the socket mark in socket dumps.
Commit a52e95abf77
net: inet: diag: expose the socket mark to privileged processes.
This adds the capability for a process that has CAP_NET_ADMIN on a socket to see the socket mark in socket dumps.
Commit a52e95abf772 ("net: diag: allow socket bytecode filters to match socket marks") recently gave privileged processes the ability to filter socket dumps based on mark. This patch is complementary: it ensures that the mark is also passed to userspace in the socket's netlink attributes. It is useful for tools like ss which display information about sockets.
Tested: https://android-review.googlesource.com/270210 Signed-off-by: Lorenzo Colitti <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v4.8-rc5, v4.8-rc4 |
|
| #
a52e95ab |
| 24-Aug-2016 |
Lorenzo Colitti <[email protected]> |
net: diag: allow socket bytecode filters to match socket marks
This allows a privileged process to filter by socket mark when dumping sockets via INET_DIAG_BY_FAMILY. This is useful on systems that
net: diag: allow socket bytecode filters to match socket marks
This allows a privileged process to filter by socket mark when dumping sockets via INET_DIAG_BY_FAMILY. This is useful on systems that use mark-based routing such as Android.
The ability to filter socket marks requires CAP_NET_ADMIN, which is consistent with other privileged operations allowed by the SOCK_DIAG interface such as the ability to destroy sockets and the ability to inspect BPF filters attached to packet sockets.
Tested: https://android-review.googlesource.com/261350 Signed-off-by: Lorenzo Colitti <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v4.8-rc3, v4.8-rc2, v4.8-rc1, v4.7, v4.7-rc7, v4.7-rc6, v4.7-rc5 |
|
| #
637c841d |
| 24-Jun-2016 |
David Ahern <[email protected]> |
net: diag: Add support to filter on device index
Add support to inet_diag facility to filter sockets based on device index. If an interface index is in the filter only sockets bound to that index (s
net: diag: Add support to filter on device index
Add support to inet_diag facility to filter sockets based on device index. If an interface index is in the filter only sockets bound to that index (sk_bound_dev_if) are returned.
Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v4.7-rc4, v4.7-rc3, v4.7-rc2, v4.7-rc1, v4.6, v4.6-rc7, v4.6-rc6 |
|
| #
6ed46d12 |
| 26-Apr-2016 |
Nicolas Dichtel <[email protected]> |
sock_diag: align nlattr properly when needed
I also fix the value of INET_DIAG_MAX. It's wrong since commit 8f840e47f190 which is only in net-next right now, thus I didn't make a separate patch.
Fi
sock_diag: align nlattr properly when needed
I also fix the value of INET_DIAG_MAX. It's wrong since commit 8f840e47f190 which is only in net-next right now, thus I didn't make a separate patch.
Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file") Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v4.6-rc5, v4.6-rc4 |
|
| #
8f840e47 |
| 14-Apr-2016 |
Xin Long <[email protected]> |
sctp: add the sctp_diag.c file
This one will implement all the interface of inet_diag, inet_diag_handler. which includes sctp_diag_dump, sctp_diag_dump_one and sctp_diag_get_info.
It will work as a
sctp: add the sctp_diag.c file
This one will implement all the interface of inet_diag, inet_diag_handler. which includes sctp_diag_dump, sctp_diag_dump_one and sctp_diag_get_info.
It will work as a module, and register inet_diag_handler when loading.
v2->v3: - fix the mistake in inet_assoc_attr_size().
- change inet_diag_msg_laddrs_fill() name to inet_diag_msg_sctpladdrs_fill.
- change inet_diag_msg_paddrs_fill() name to inet_diag_msg_sctpaddrs_fill.
- add inet_diag_msg_sctpinfo_fill() to make asoc/ep fill code clearer.
- add inet_diag_msg_sctpasoc_fill() to make asoc fill code clearer.
- merge inet_asoc_diag_fill() and inet_ep_diag_fill() to inet_sctp_diag_fill().
- call sctp_diag_get_info() directly, instead by handler, cause the caller is in the same file with it.
- call lock_sock in sctp_tsp_dump_one() to make sure we call get sctp info safely.
- after lock_sock(sk), we should check sk != assoc->base.sk.
- change mem[SK_MEMINFO_WMEM_ALLOC] to asoc->sndbuf_used for asoc dump when asoc->ep->sndbuf_policy is set. don't use INET_DIAG_MEMINFO attr any more.
Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v4.6-rc3, v4.6-rc2, v4.6-rc1, v4.5, v4.5-rc7, v4.5-rc6, v4.5-rc5, v4.5-rc4, v4.5-rc3, v4.5-rc2, v4.5-rc1, v4.4, v4.4-rc8, v4.4-rc7, v4.4-rc6, v4.4-rc5, v4.4-rc4, v4.4-rc3, v4.4-rc2, v4.4-rc1, v4.3, v4.3-rc7, v4.3-rc6, v4.3-rc5, v4.3-rc4, v4.3-rc3, v4.3-rc2, v4.3-rc1, v4.2, v4.2-rc8, v4.2-rc7, v4.2-rc6, v4.2-rc5, v4.2-rc4, v4.2-rc3, v4.2-rc2, v4.2-rc1 |
|
| #
20462155 |
| 24-Jun-2015 |
Phil Sutter <[email protected]> |
net: inet_diag: export IPV6_V6ONLY sockopt
For AF_INET6 sockets, the value of struct ipv6_pinfo.ipv6only is exported to userspace. It indicates whether a socket bound to in6addr_any listens on IPv4
net: inet_diag: export IPV6_V6ONLY sockopt
For AF_INET6 sockets, the value of struct ipv6_pinfo.ipv6only is exported to userspace. It indicates whether a socket bound to in6addr_any listens on IPv4 as well as IPv6. Since the socket is natively IPv6, it is not listed by e.g. 'ss -l -4'.
This patch is accompanied by an appropriate one for iproute2 to enable the additional information in 'ss -e'.
Signed-off-by: Phil Sutter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|
|
Revision tags: v4.1 |
|
| #
35ac838a |
| 15-Jun-2015 |
Craig Gallek <[email protected]> |
sock_diag: implement a get_info handler for inet
This get_info handler will simply dispatch to the appropriate existing inet protocol handler.
This patch also includes a new netlink attribute (INET
sock_diag: implement a get_info handler for inet
This get_info handler will simply dispatch to the appropriate existing inet protocol handler.
This patch also includes a new netlink attribute (INET_DIAG_PROTOCOL). This attribute is currently only used for multicast messages. Without this attribute, there is no way of knowing the IP protocol used by the socket information being broadcast. This attribute is not necessary in the 'dump' variant of this protocol (though it could easily be added) because dump requests are issued for specific family/protocol pairs.
Tested: ss -E (note, the -E option has not yet been merged into the upstream version of ss).
Signed-off-by: Craig Gallek <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
show more ...
|