History log of /linux-6.15/include/linux/list.h (Results 1 – 25 of 103)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1
# cec78a59 17-Sep-2024 Zijun Hu <[email protected]>

list: Remove duplicated and unused macro list_for_each_reverse

Remove macro list_for_each_reverse due to below reasons:

- it is same as list_for_each_prev.
- it is not used by current kernel tree.

list: Remove duplicated and unused macro list_for_each_reverse

Remove macro list_for_each_reverse due to below reasons:

- it is same as list_for_each_prev.
- it is not used by current kernel tree.

Signed-off-by: Zijun Hu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

show more ...


Revision tags: v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4
# 2932fb0a 08-Feb-2024 Wei Yang <[email protected]>

list: leverage list_is_head() for list_entry_is_head()

This is what list_is_head() exactly do.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Wei Ya

list: leverage list_is_head() for list_entry_is_head()

This is what list_is_head() exactly do.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Wei Yang <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7
# a43c4756 04-Jan-2024 Pierre Gondois <[email protected]>

list: add hlist_count_nodes()

Add a generic hlist_count_nodes() function and use it in two drivers.


This patch (of 3):

Add a function to count nodes in a hlist. hlist_count_nodes() is similar
to

list: add hlist_count_nodes()

Add a generic hlist_count_nodes() function and use it in two drivers.


This patch (of 3):

Add a function to count nodes in a hlist. hlist_count_nodes() is similar
to list_count_nodes().

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Pierre Gondois <[email protected]>
Reviewed-by: Carlos Llamas <[email protected]>
Acked-by: Coly Li <[email protected]>
Acked-by: Marco Elver <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Cc: Arve Hjønnevåg <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Joel Fernandes (Google) <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Martijn Coenen <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Todd Kjos <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3
# 083772c9 26-Nov-2023 Jakub Kicinski <[email protected]>

net: page_pool: record pools per netdev

Link the page pools with netdevs. This needs to be netns compatible
so we have two options. Either we record the pools per netns and
have to worry about movin

net: page_pool: record pools per netdev

Link the page pools with netdevs. This needs to be netns compatible
so we have two options. Either we record the pools per netns and
have to worry about moving them as the netdev gets moved.
Or we record them directly on the netdev so they move with the netdev
without any extra work.

Implement the latter option. Since pools may outlast netdev we need
a place to store orphans. In time honored tradition use loopback
for this purpose.

Reviewed-by: Mina Almasry <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>

show more ...


Revision tags: v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3
# 8bf0cdfa 21-Sep-2023 Ingo Molnar <[email protected]>

<linux/list.h>: Introduce the list_for_each_reverse() method

The list_head counterpart of list_for_each_entry_reverse() was missing,
add it to complete the list handling APIs in <linux/list.h>.

[ T

<linux/list.h>: Introduce the list_for_each_reverse() method

The list_head counterpart of list_for_each_entry_reverse() was missing,
add it to complete the list handling APIs in <linux/list.h>.

[ This new API is also relied on by a WIP scheduler patch, so this
variant is not a theoretical possibility only. ]

Signed-off-by: Ingo Molnar <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: [email protected]

show more ...


Revision tags: v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6
# aebc7b0d 11-Aug-2023 Marco Elver <[email protected]>

list: Introduce CONFIG_LIST_HARDENED

Numerous production kernel configs (see [1, 2]) are choosing to enable
CONFIG_DEBUG_LIST, which is also being recommended by KSPP for hardened
configs [3]. The m

list: Introduce CONFIG_LIST_HARDENED

Numerous production kernel configs (see [1, 2]) are choosing to enable
CONFIG_DEBUG_LIST, which is also being recommended by KSPP for hardened
configs [3]. The motivation behind this is that the option can be used
as a security hardening feature (e.g. CVE-2019-2215 and CVE-2019-2025
are mitigated by the option [4]).

The feature has never been designed with performance in mind, yet common
list manipulation is happening across hot paths all over the kernel.

Introduce CONFIG_LIST_HARDENED, which performs list pointer checking
inline, and only upon list corruption calls the reporting slow path.

To generate optimal machine code with CONFIG_LIST_HARDENED:

1. Elide checking for pointer values which upon dereference would
result in an immediate access fault (i.e. minimal hardening
checks). The trade-off is lower-quality error reports.

2. Use the __preserve_most function attribute (available with Clang,
but not yet with GCC) to minimize the code footprint for calling
the reporting slow path. As a result, function size of callers is
reduced by avoiding saving registers before calling the rarely
called reporting slow path.

Note that all TUs in lib/Makefile already disable function tracing,
including list_debug.c, and __preserve_most's implied notrace has
no effect in this case.

3. Because the inline checks are a subset of the full set of checks in
__list_*_valid_or_report(), always return false if the inline
checks failed. This avoids redundant compare and conditional
branch right after return from the slow path.

As a side-effect of the checks being inline, if the compiler can prove
some condition to always be true, it can completely elide some checks.

Since DEBUG_LIST is functionally a superset of LIST_HARDENED, the
Kconfig variables are changed to reflect that: DEBUG_LIST selects
LIST_HARDENED, whereas LIST_HARDENED itself has no dependency on
DEBUG_LIST.

Running netperf with CONFIG_LIST_HARDENED (using a Clang compiler with
"preserve_most") shows throughput improvements, in my case of ~7% on
average (up to 20-30% on some test cases).

Link: https://r.android.com/1266735 [1]
Link: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/blob/main/config [2]
Link: https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project/Recommended_Settings [3]
Link: https://googleprojectzero.blogspot.com/2019/11/bad-binder-android-in-wild-exploit.html [4]
Signed-off-by: Marco Elver <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Kees Cook <[email protected]>

show more ...


# b16c42c8 11-Aug-2023 Marco Elver <[email protected]>

list_debug: Introduce inline wrappers for debug checks

Turn the list debug checking functions __list_*_valid() into inline
functions that wrap the out-of-line functions. Care is taken to ensure
the

list_debug: Introduce inline wrappers for debug checks

Turn the list debug checking functions __list_*_valid() into inline
functions that wrap the out-of-line functions. Care is taken to ensure
the inline wrappers are always inlined, so that additional compiler
instrumentation (such as sanitizers) does not result in redundant
outlining.

This change is preparation for performing checks in the inline wrappers.

No functional change intended.

Signed-off-by: Marco Elver <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Kees Cook <[email protected]>

show more ...


Revision tags: v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8
# 4d70c746 30-Nov-2022 Andy Shevchenko <[email protected]>

i915: Move list_count() to list.h as list_count_nodes() for broader use

Some of the existing users, and definitely will be new ones, want to
count existing nodes in the list. Provide a generic API f

i915: Move list_count() to list.h as list_count_nodes() for broader use

Some of the existing users, and definitely will be new ones, want to
count existing nodes in the list. Provide a generic API for that by
moving code from i915 to list.h.

Reviewed-by: Lucas De Marchi <[email protected]>
Acked-by: Jani Nikula <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

show more ...


# 51daa42d 30-Nov-2022 Greg Kroah-Hartman <[email protected]>

Revert "i915: Move list_count() to list.h for broader use"

This reverts commit a9efc04cfd05690e91279f41c2325c46335c43ef as it
breaks the build.

Link: https://lore.kernel.org/r/20221130131854.35b58b

Revert "i915: Move list_count() to list.h for broader use"

This reverts commit a9efc04cfd05690e91279f41c2325c46335c43ef as it
breaks the build.

Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Cc: Lucas De Marchi <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Reported-by: Stephen Rothwell <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

show more ...


Revision tags: v6.1-rc7
# a9efc04c 23-Nov-2022 Andy Shevchenko <[email protected]>

i915: Move list_count() to list.h for broader use

Some of the existing users, and definitely will be new ones, want to
count existing nodes in the list. Provide a generic API for that by
moving code

i915: Move list_count() to list.h for broader use

Some of the existing users, and definitely will be new ones, want to
count existing nodes in the list. Provide a generic API for that by
moving code from i915 to list.h.

Reviewed-by: Lucas De Marchi <[email protected]>
Acked-by: Jani Nikula <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

show more ...


Revision tags: v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18
# ad25f5cb 21-May-2022 David Howells <[email protected]>

rxrpc: Fix locking issue

There's a locking issue with the per-netns list of calls in rxrpc. The
pieces of code that add and remove a call from the list use write_lock()
and the calls procfile uses

rxrpc: Fix locking issue

There's a locking issue with the per-netns list of calls in rxrpc. The
pieces of code that add and remove a call from the list use write_lock()
and the calls procfile uses read_lock() to access it. However, the timer
callback function may trigger a removal by trying to queue a call for
processing and finding that it's already queued - at which point it has a
spare refcount that it has to do something with. Unfortunately, if it puts
the call and this reduces the refcount to 0, the call will be removed from
the list. Unfortunately, since the _bh variants of the locking functions
aren't used, this can deadlock.

================================
WARNING: inconsistent lock state
5.18.0-rc3-build4+ #10 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
ksoftirqd/2/25 [HC0[0]:SC1[1]:HE1:SE0] takes:
ffff888107ac4038 (&rxnet->call_lock){+.?.}-{2:2}, at: rxrpc_put_call+0x103/0x14b
{SOFTIRQ-ON-W} state was registered at:
...
Possible unsafe locking scenario:

CPU0
----
lock(&rxnet->call_lock);
<Interrupt>
lock(&rxnet->call_lock);

*** DEADLOCK ***

1 lock held by ksoftirqd/2/25:
#0: ffff8881008ffdb0 ((&call->timer)){+.-.}-{0:0}, at: call_timer_fn+0x5/0x23d

Changes
=======
ver #2)
- Changed to using list_next_rcu() rather than rcu_dereference() directly.

Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <[email protected]>
cc: Marc Dionne <[email protected]>
cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>

show more ...


Revision tags: v5.18-rc7, v5.18-rc6
# 2fbdf45d 06-May-2022 Ricardo Martinez <[email protected]>

list: Add list_next_entry_circular() and list_prev_entry_circular()

Add macros to get the next or previous entries and wraparound if
needed. For example, calling list_next_entry_circular() on the la

list: Add list_next_entry_circular() and list_prev_entry_circular()

Add macros to get the next or previous entries and wraparound if
needed. For example, calling list_next_entry_circular() on the last
element should return the first element in the list.

Signed-off-by: Ricardo Martinez <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

show more ...


Revision tags: v5.18-rc5
# d679ae94 29-Apr-2022 Kuniyuki Iwashima <[email protected]>

list: fix a data-race around ep->rdllist

ep_poll() first calls ep_events_available() with no lock held and checks
if ep->rdllist is empty by list_empty_careful(), which reads
rdllist->prev. Thus al

list: fix a data-race around ep->rdllist

ep_poll() first calls ep_events_available() with no lock held and checks
if ep->rdllist is empty by list_empty_careful(), which reads
rdllist->prev. Thus all accesses to it need some protection to avoid
store/load-tearing.

Note INIT_LIST_HEAD_RCU() already has the annotation for both prev
and next.

Commit bf3b9f6372c4 ("epoll: Add busy poll support to epoll with socket
fds.") added the first lockless ep_events_available(), and commit
c5a282e9635e ("fs/epoll: reduce the scope of wq lock in epoll_wait()")
made some ep_events_available() calls lockless and added single call under
a lock, finally commit e59d3c64cba6 ("epoll: eliminate unnecessary lock
for zero timeout") made the last ep_events_available() lockless.

BUG: KCSAN: data-race in do_epoll_wait / do_epoll_wait

write to 0xffff88810480c7d8 of 8 bytes by task 1802 on cpu 0:
INIT_LIST_HEAD include/linux/list.h:38 [inline]
list_splice_init include/linux/list.h:492 [inline]
ep_start_scan fs/eventpoll.c:622 [inline]
ep_send_events fs/eventpoll.c:1656 [inline]
ep_poll fs/eventpoll.c:1806 [inline]
do_epoll_wait+0x4eb/0xf40 fs/eventpoll.c:2234
do_epoll_pwait fs/eventpoll.c:2268 [inline]
__do_sys_epoll_pwait fs/eventpoll.c:2281 [inline]
__se_sys_epoll_pwait+0x12b/0x240 fs/eventpoll.c:2275
__x64_sys_epoll_pwait+0x74/0x80 fs/eventpoll.c:2275
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0xffff88810480c7d8 of 8 bytes by task 1799 on cpu 1:
list_empty_careful include/linux/list.h:329 [inline]
ep_events_available fs/eventpoll.c:381 [inline]
ep_poll fs/eventpoll.c:1797 [inline]
do_epoll_wait+0x279/0xf40 fs/eventpoll.c:2234
do_epoll_pwait fs/eventpoll.c:2268 [inline]
__do_sys_epoll_pwait fs/eventpoll.c:2281 [inline]
__se_sys_epoll_pwait+0x12b/0x240 fs/eventpoll.c:2275
__x64_sys_epoll_pwait+0x74/0x80 fs/eventpoll.c:2275
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae

value changed: 0xffff88810480c7d0 -> 0xffff888103c15098

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 1799 Comm: syz-fuzzer Tainted: G W 5.17.0-rc7-syzkaller-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Link: https://lkml.kernel.org/r/[email protected]
Fixes: e59d3c64cba6 ("epoll: eliminate unnecessary lock for zero timeout")
Fixes: c5a282e9635e ("fs/epoll: reduce the scope of wq lock in epoll_wait()")
Fixes: bf3b9f6372c4 ("epoll: Add busy poll support to epoll with socket fds.")
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Reported-by: [email protected]
Cc: Al Viro <[email protected]>, Andrew Morton <[email protected]>
Cc: Kuniyuki Iwashima <[email protected]>
Cc: Kuniyuki Iwashima <[email protected]>
Cc: "Soheil Hassas Yeganeh" <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: "Sridhar Samudrala" <[email protected]>
Cc: Alexander Duyck <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1
# 04254730 20-Jan-2022 Andy Shevchenko <[email protected]>

list: introduce list_is_head() helper and re-use it in list.h

Introduce list_is_head() in the similar (*) way as it's done for
list_entry_is_head(). Make use of it in the list.h.

*) it's done as i

list: introduce list_is_head() helper and re-use it in list.h

Introduce list_is_head() in the similar (*) way as it's done for
list_entry_is_head(). Make use of it in the list.h.

*) it's done as inliner and not a macro to be aligned with other
list_is_*() APIs; while at it, make all three to have the same
style.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Andy Shevchenko <[email protected]>
Cc: Heikki Krogerus <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1
# cd7187e1 09-Nov-2021 Andy Shevchenko <[email protected]>

include/linux/list.h: replace kernel.h with the necessary inclusions

When kernel.h is used in the headers it adds a lot into dependency hell,
especially when there are circular dependencies are invo

include/linux/list.h: replace kernel.h with the necessary inclusions

When kernel.h is used in the headers it adds a lot into dependency hell,
especially when there are circular dependencies are involved.

Replace kernel.h inclusion with the list of what is really being used.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Andy Shevchenko <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Brendan Higgins <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Laurent Pinchart <[email protected]>
Cc: Mauro Carvalho Chehab <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: Sakari Ailus <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Thorsten Leemhuis <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5
# 4704bd31 16-Nov-2020 Mauro Carvalho Chehab <[email protected]>

list: Fix a typo at the kernel-doc markup

hlist_add_behing -> hlist_add_behind

Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>


Revision tags: v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6
# 1eafe075 20-Sep-2020 Asif Rasheed <[email protected]>

list.h: Update comment to explicitly note circular lists

The students in the Operating System Lecture Section at the
American University of Sharjah were confused by the header comment
in include/lin

list.h: Update comment to explicitly note circular lists

The students in the Operating System Lecture Section at the
American University of Sharjah were confused by the header comment
in include/linux/list.h, which says "Simple doubly linked list
implementation". This comment means "simple" as in "not complex",
but "simple" is often used in this context to mean "not circular".
This commit therefore avoids this ambiguity by explicitly calling out
"circular".

Signed-off-by: Asif Rasheed <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>

show more ...


# e1308161 16-Oct-2020 Andy Shevchenko <[email protected]>

include/linux/list.h: add a macro to test if entry is pointing to the head

Add a macro to test if entry is pointing to the head of the list which is
useful in cases like:

list_for_each_entry(pos,

include/linux/list.h: add a macro to test if entry is pointing to the head

Add a macro to test if entry is pointing to the head of the list which is
useful in cases like:

list_for_each_entry(pos, &head, member) {
if (cond)
break;
}
if (list_entry_is_head(pos, &head, member))
return -ERRNO;

that allows to avoid additional variable to be added to track if loop has
not been stopped in the middle.

While here, convert list_for_each_entry*() family of macros to use a new one.

Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Cezary Rojewski <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7
# c6fe44d9 23-Jul-2020 Linus Torvalds <[email protected]>

list: add "list_del_init_careful()" to go with "list_empty_careful()"

That gives us ordering guarantees around the pair.

Signed-off-by: Linus Torvalds <[email protected]>


Revision tags: v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3, v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3, v5.1-rc2
# c84716c4 21-Mar-2019 NeilBrown <[email protected]>

list/hashtable: minor documentation corrections.

hash_for_each_safe() and hash_for_each_possible_safe()
need to be passed a temp 'struct hlist_node' pointer, but
do not say that in the documentation

list/hashtable: minor documentation corrections.

hash_for_each_safe() and hash_for_each_possible_safe()
need to be passed a temp 'struct hlist_node' pointer, but
do not say that in the documentation - they just say
a 'struct'.

Also the documentation for hlist_for_each_entry_safe()
describes @n as "another" hlist_node, but in reality it is
the only one.

Signed-off-by: NeilBrown <[email protected]>
Reviewed-by: Mukesh Ojha <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>

show more ...


# 46deb744 09-Nov-2019 Paul E. McKenney <[email protected]>

rcu: Add and update docbook header comments in list.h

[ paulmck: Fix typo found by kbuild test robot. ]
Signed-off-by: Paul E. McKenney <[email protected]>


# 28ca0d6d 28-Nov-2019 Pavel Begunkov <[email protected]>

list: introduce list_for_each_continue()

As other *continue() helpers, this continues iteration from a given
position.

Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axb

list: introduce list_for_each_continue()

As other *continue() helpers, this continues iteration from a given
position.

Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

show more ...


# c54a2744 07-Nov-2019 Eric Dumazet <[email protected]>

list: Add hlist_unhashed_lockless()

We would like to use hlist_unhashed() from timer_pending(),
which runs without protection of a lock.

Note that other callers might also want to use this variant.

list: Add hlist_unhashed_lockless()

We would like to use hlist_unhashed() from timer_pending(),
which runs without protection of a lock.

Note that other callers might also want to use this variant.

Instead of forcing a READ_ONCE() for all hlist_unhashed()
callers, add a new helper with an explicit _lockless suffix
in the name to better document what is going on.

Also add various WRITE_ONCE() in __hlist_del(), hlist_add_head()
and hlist_add_before()/hlist_add_behind() to pair with
the READ_ONCE().

Signed-off-by: Eric Dumazet <[email protected]>
Cc: Thomas Gleixner <[email protected]>
[ paulmck: Also add WRITE_ONCE() to rculist.h. ]
Signed-off-by: Paul E. McKenney <[email protected]>

show more ...


# c8af5cd7 28-Jun-2019 Toke Høiland-Jørgensen <[email protected]>

xskmap: Move non-standard list manipulation to helper

Add a helper in list.h for the non-standard way of clearing a list that is
used in xskmap. This makes it easier to reuse it in the other map typ

xskmap: Move non-standard list manipulation to helper

Add a helper in list.h for the non-standard way of clearing a list that is
used in xskmap. This makes it easier to reuse it in the other map types,
and also makes sure this usage is not forgotten in any list refactorings in
the future.

Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>

show more ...


# e900a918 14-May-2019 Dan Williams <[email protected]>

mm: shuffle initial free memory to improve memory-side-cache utilization

Patch series "mm: Randomize free memory", v10.

This patch (of 3):

Randomization of the page allocator improves the average

mm: shuffle initial free memory to improve memory-side-cache utilization

Patch series "mm: Randomize free memory", v10.

This patch (of 3):

Randomization of the page allocator improves the average utilization of
a direct-mapped memory-side-cache. Memory side caching is a platform
capability that Linux has been previously exposed to in HPC
(high-performance computing) environments on specialty platforms. In
that instance it was a smaller pool of high-bandwidth-memory relative to
higher-capacity / lower-bandwidth DRAM. Now, this capability is going
to be found on general purpose server platforms where DRAM is a cache in
front of higher latency persistent memory [1].

Robert offered an explanation of the state of the art of Linux
interactions with memory-side-caches [2], and I copy it here:

It's been a problem in the HPC space:
http://www.nersc.gov/research-and-development/knl-cache-mode-performance-coe/

A kernel module called zonesort is available to try to help:
https://software.intel.com/en-us/articles/xeon-phi-software

and this abandoned patch series proposed that for the kernel:
https://lkml.kernel.org/r/[email protected]

Dan's patch series doesn't attempt to ensure buffers won't conflict, but
also reduces the chance that the buffers will. This will make performance
more consistent, albeit slower than "optimal" (which is near impossible
to attain in a general-purpose kernel). That's better than forcing
users to deploy remedies like:
"To eliminate this gradual degradation, we have added a Stream
measurement to the Node Health Check that follows each job;
nodes are rebooted whenever their measured memory bandwidth
falls below 300 GB/s."

A replacement for zonesort was merged upstream in commit cc9aec03e58f
("x86/numa_emulation: Introduce uniform split capability"). With this
numa_emulation capability, memory can be split into cache sized
("near-memory" sized) numa nodes. A bind operation to such a node, and
disabling workloads on other nodes, enables full cache performance.
However, once the workload exceeds the cache size then cache conflicts
are unavoidable. While HPC environments might be able to tolerate
time-scheduling of cache sized workloads, for general purpose server
platforms, the oversubscribed cache case will be the common case.

The worst case scenario is that a server system owner benchmarks a
workload at boot with an un-contended cache only to see that performance
degrade over time, even below the average cache performance due to
excessive conflicts. Randomization clips the peaks and fills in the
valleys of cache utilization to yield steady average performance.

Here are some performance impact details of the patches:

1/ An Intel internal synthetic memory bandwidth measurement tool, saw a
3X speedup in a contrived case that tries to force cache conflicts.
The contrived cased used the numa_emulation capability to force an
instance of the benchmark to be run in two of the near-memory sized
numa nodes. If both instances were placed on the same emulated they
would fit and cause zero conflicts. While on separate emulated nodes
without randomization they underutilized the cache and conflicted
unnecessarily due to the in-order allocation per node.

2/ A well known Java server application benchmark was run with a heap
size that exceeded cache size by 3X. The cache conflict rate was 8%
for the first run and degraded to 21% after page allocator aging. With
randomization enabled the rate levelled out at 11%.

3/ A MongoDB workload did not observe measurable difference in
cache-conflict rates, but the overall throughput dropped by 7% with
randomization in one case.

4/ Mel Gorman ran his suite of performance workloads with randomization
enabled on platforms without a memory-side-cache and saw a mix of some
improvements and some losses [3].

While there is potentially significant improvement for applications that
depend on low latency access across a wide working-set, the performance
may be negligible to negative for other workloads. For this reason the
shuffle capability defaults to off unless a direct-mapped
memory-side-cache is detected. Even then, the page_alloc.shuffle=0
parameter can be specified to disable the randomization on those systems.

Outside of memory-side-cache utilization concerns there is potentially
security benefit from randomization. Some data exfiltration and
return-oriented-programming attacks rely on the ability to infer the
location of sensitive data objects. The kernel page allocator, especially
early in system boot, has predictable first-in-first out behavior for
physical pages. Pages are freed in physical address order when first
onlined.

Quoting Kees:
"While we already have a base-address randomization
(CONFIG_RANDOMIZE_MEMORY), attacks against the same hardware and
memory layouts would certainly be using the predictability of
allocation ordering (i.e. for attacks where the base address isn't
important: only the relative positions between allocated memory).
This is common in lots of heap-style attacks. They try to gain
control over ordering by spraying allocations, etc.

I'd really like to see this because it gives us something similar
to CONFIG_SLAB_FREELIST_RANDOM but for the page allocator."

While SLAB_FREELIST_RANDOM reduces the predictability of some local slab
caches it leaves vast bulk of memory to be predictably in order allocated.
However, it should be noted, the concrete security benefits are hard to
quantify, and no known CVE is mitigated by this randomization.

Introduce shuffle_free_memory(), and its helper shuffle_zone(), to perform
a Fisher-Yates shuffle of the page allocator 'free_area' lists when they
are initially populated with free memory at boot and at hotplug time. Do
this based on either the presence of a page_alloc.shuffle=Y command line
parameter, or autodetection of a memory-side-cache (to be added in a
follow-on patch).

The shuffling is done in terms of CONFIG_SHUFFLE_PAGE_ORDER sized free
pages where the default CONFIG_SHUFFLE_PAGE_ORDER is MAX_ORDER-1 i.e. 10,
4MB this trades off randomization granularity for time spent shuffling.
MAX_ORDER-1 was chosen to be minimally invasive to the page allocator
while still showing memory-side cache behavior improvements, and the
expectation that the security implications of finer granularity
randomization is mitigated by CONFIG_SLAB_FREELIST_RANDOM. The
performance impact of the shuffling appears to be in the noise compared to
other memory initialization work.

This initial randomization can be undone over time so a follow-on patch is
introduced to inject entropy on page free decisions. It is reasonable to
ask if the page free entropy is sufficient, but it is not enough due to
the in-order initial freeing of pages. At the start of that process
putting page1 in front or behind page0 still keeps them close together,
page2 is still near page1 and has a high chance of being adjacent. As
more pages are added ordering diversity improves, but there is still high
page locality for the low address pages and this leads to no significant
impact to the cache conflict rate.

[1]: https://itpeernetwork.intel.com/intel-optane-dc-persistent-memory-operating-modes/
[2]: https://lkml.kernel.org/r/AT5PR8401MB1169D656C8B5E121752FC0F8AB120@AT5PR8401MB1169.NAMPRD84.PROD.OUTLOOK.COM
[3]: https://lkml.org/lkml/2018/10/12/309

[[email protected]: fix shuffle enable]
Link: http://lkml.kernel.org/r/154943713038.3858443.4125180191382062871.stgit@dwillia2-desk3.amr.corp.intel.com
[[email protected]: fix SHUFFLE_PAGE_ALLOCATOR help texts]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/154899811738.3165233.12325692939590944259.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Qian Cai <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Keith Busch <[email protected]>
Cc: Robert Elliott <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


12345