History log of /linux-6.15/ipc/sem.c (Results 1 – 25 of 217)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1
# b46fae06 09-Jul-2023 Christophe JAILLET <[email protected]>

ipc/sem: use flexible array in 'struct sem_undo'

Turn 'semadj' in 'struct sem_undo' into a flexible array.

The advantages are:
- save the size of a pointer when the new undo structure is allocat

ipc/sem: use flexible array in 'struct sem_undo'

Turn 'semadj' in 'struct sem_undo' into a flexible array.

The advantages are:
- save the size of a pointer when the new undo structure is allocated
- avoid some always ugly pointer arithmetic to get the address of semadj
- avoid an indirection when the array is accessed

While at it, use struct_size() to compute the size of the new undo
structure.

Link: https://lkml.kernel.org/r/1ba993d443ad7e16ac2b1902adab1f05ebdfa454.1688918791.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <[email protected]>
Reviewed-by: Manfred Spraul <[email protected]>
Reviewed-by: Davidlohr Bueso <[email protected]>
Cc: Christophe JAILLET <[email protected]>
Cc: Jann Horn <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1
# b52be557 05-Dec-2022 Jann Horn <[email protected]>

ipc/sem: Fix dangling sem_array access in semtimedop race

When __do_semtimedop() goes to sleep because it has to wait for a
semaphore value becoming zero or becoming bigger than some threshold, it
l

ipc/sem: Fix dangling sem_array access in semtimedop race

When __do_semtimedop() goes to sleep because it has to wait for a
semaphore value becoming zero or becoming bigger than some threshold, it
links the on-stack sem_queue to the sem_array, then goes to sleep
without holding a reference on the sem_array.

When __do_semtimedop() comes back out of sleep, one of two things must
happen:

a) We prove that the on-stack sem_queue has been disconnected from the
(possibly freed) sem_array, making it safe to return from the stack
frame that the sem_queue exists in.

b) We stabilize our reference to the sem_array, lock the sem_array, and
detach the sem_queue from the sem_array ourselves.

sem_array has RCU lifetime, so for case (b), the reference can be
stabilized inside an RCU read-side critical section by locklessly
checking whether the sem_queue is still connected to the sem_array.

However, the current code does the lockless check on sem_queue before
starting an RCU read-side critical section, so the result of the
lockless check immediately becomes useless.

Fix it by doing rcu_read_lock() before the lockless check. Now RCU
ensures that if we observe the object being on our queue, the object
can't be freed until rcu_read_unlock().

This bug is only hittable on kernel builds with full preemption support
(either CONFIG_PREEMPT or PREEMPT_DYNAMIC with preempt=full).

Fixes: 370b262c896e ("ipc/sem: avoid idr tree lookup for interrupted semop")
Cc: [email protected]
Signed-off-by: Jann Horn <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7
# 49c9dd0d 10-May-2022 Prakash Sangappa <[email protected]>

ipc: update semtimedop() to use hrtimer

semtimedop() should be converted to use hrtimer like it has been done for
most of the system calls with timeouts. This system call already takes a
struct tim

ipc: update semtimedop() to use hrtimer

semtimedop() should be converted to use hrtimer like it has been done for
most of the system calls with timeouts. This system call already takes a
struct timespec as an argument and can therefore provide finer granularity
timed wait.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Prakash Sangappa <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Reviewed-by: Davidlohr Bueso <[email protected]>
Reviewed-by: Manfred Spraul <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


# 0e900029 10-May-2022 Michal Orzel <[email protected]>

ipc/sem: remove redundant assignments

Get rid of redundant assignments which end up in values not being
read either because they are overwritten or the function ends.

Reported by clang-tidy [deadco

ipc/sem: remove redundant assignments

Get rid of redundant assignments which end up in values not being
read either because they are overwritten or the function ends.

Reported by clang-tidy [deadcode.DeadStores]

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Orzel <[email protected]>
Reviewed-by: Tom Rix <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>

show more ...


Revision tags: v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3
# 520ba724 04-Feb-2022 Minghao Chi <[email protected]>

ipc/sem: do not sleep with a spin lock held

We can't call kvfree() with a spin lock held, so defer it.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: fc37a3b8

ipc/sem: do not sleep with a spin lock held

We can't call kvfree() with a spin lock held, so defer it.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: fc37a3b8b438 ("[PATCH] ipc sem: use kvmalloc for sem_undo allocation")
Reported-by: Zeal Robot <[email protected]>
Signed-off-by: Minghao Chi <[email protected]>
Reviewed-by: Shakeel Butt <[email protected]>
Reviewed-by: Manfred Spraul <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Yang Guang <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Bhaskar Chowdhury <[email protected]>
Cc: Vasily Averin <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1
# 6a4746ba 11-Sep-2021 Vasily Averin <[email protected]>

ipc: remove memcg accounting for sops objects in do_semtimedop()

Linus proposes to revert an accounting for sops objects in
do_semtimedop() because it's really just a temporary buffer
for a single s

ipc: remove memcg accounting for sops objects in do_semtimedop()

Linus proposes to revert an accounting for sops objects in
do_semtimedop() because it's really just a temporary buffer
for a single semtimedop() system call.

This object can consume up to 2 pages, syscall is sleeping
one, size and duration can be controlled by user, and this
allocation can be repeated by many thread at the same time.

However Shakeel Butt pointed that there are much more popular
objects with the same life time and similar memory
consumption, the accounting of which was decided to be
rejected for performance reasons.

Considering at least 2 pages for task_struct and 2 pages for
the kernel stack, a back of the envelope calculation gives a
footprint amplification of <1.5 so this temporal buffer can be
safely ignored.

The factor would IMO be interesting if it was >> 2 (from the
PoV of excessive (ab)use, fine-grained accounting seems to be
currently unfeasible due to performance impact).

Link: https://lore.kernel.org/lkml/[email protected]/
Fixes: 18319498fdd4 ("memcg: enable accounting of ipc resources")
Signed-off-by: Vasily Averin <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reviewed-by: Michal Koutný <[email protected]>
Acked-by: Shakeel Butt <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


# 18319498 02-Sep-2021 Vasily Averin <[email protected]>

memcg: enable accounting of ipc resources

When user creates IPC objects it forces kernel to allocate memory for
these long-living objects.

It makes sense to account them to restrict the host's memo

memcg: enable accounting of ipc resources

When user creates IPC objects it forces kernel to allocate memory for
these long-living objects.

It makes sense to account them to restrict the host's memory consumption
from inside the memcg-limited container.

This patch enables accounting for IPC shared memory segments, messages
semaphores and semaphore's undo lists.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Vasily Averin <[email protected]>
Reviewed-by: Shakeel Butt <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Andrei Vagin <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: "J. Bruce Fields" <[email protected]>
Cc: Jeff Layton <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Kirill Tkhai <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Serge Hallyn <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: Yutian Yang <[email protected]>
Cc: Zefan Li <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v5.14, v5.14-rc7, v5.14-rc6
# bdec0145 11-Aug-2021 Arnd Bergmann <[email protected]>

ARM: 9114/1: oabi-compat: rework sys_semtimedop emulation

sys_oabi_semtimedop() is one of the last users of set_fs() on Arm. To
remove this one, expose the internal code of the actual implementation

ARM: 9114/1: oabi-compat: rework sys_semtimedop emulation

sys_oabi_semtimedop() is one of the last users of set_fs() on Arm. To
remove this one, expose the internal code of the actual implementation
that operates on a kernel pointer and call it directly after copying.

There should be no measurable impact on the normal execution of this
function, and it makes the overly long function a little shorter, which
may help readability.

While reworking the oabi version, make it behave a little more like
the native one, using kvmalloc_array() and restructure the code
flow in a similar way.

The naming of __do_semtimedop() is not very good, I hope someone can
come up with a better name.

One regression was spotted by kernel test robot <[email protected]>
and fixed before the first mailing list submission.

Acked-by: Christoph Hellwig <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Russell King (Oracle) <[email protected]>

show more ...


Revision tags: v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1
# 17d056e0 01-Jul-2021 Manfred Spraul <[email protected]>

ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock

The patch solves three weaknesses in ipc/sem.c:

1) The initial read of use_global_lock in sem_lock() is an intentional
race. KCSAN de

ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock

The patch solves three weaknesses in ipc/sem.c:

1) The initial read of use_global_lock in sem_lock() is an intentional
race. KCSAN detects these accesses and prints a warning.

2) The code assumes that plain C read/writes are not mangled by the CPU
or the compiler.

3) The comment it sysvipc_sem_proc_show() was hard to understand: The
rest of the comments in ipc/sem.c speaks about sem_perm.lock, and
suddenly this function speaks about ipc_lock_object().

To solve 1) and 2), use READ_ONCE()/WRITE_ONCE(). Plain C reads are used
in code that owns sma->sem_perm.lock.

The comment is updated to solve 3)

[[email protected]: use READ_ONCE()/WRITE_ONCE() for use_global_lock]
Link: https://lkml.kernel.org/r/[email protected]

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Manfred Spraul <[email protected]>
Reviewed-by: Paul E. McKenney <[email protected]>
Reviewed-by: Davidlohr Bueso <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


# fc37a3b8 01-Jul-2021 Vasily Averin <[email protected]>

ipc sem: use kvmalloc for sem_undo allocation

Patch series "ipc: allocations cleanup", v2.

Some ipc objects use the wrong allocation functions: small objects can use
kmalloc(), and vice versa, pote

ipc sem: use kvmalloc for sem_undo allocation

Patch series "ipc: allocations cleanup", v2.

Some ipc objects use the wrong allocation functions: small objects can use
kmalloc(), and vice versa, potentially large objects can use kmalloc().

This patch (of 2):

Size of sem_undo can exceed one page and with the maximum possible nsems =
32000 it can grow up to 64Kb. Let's switch its allocation to kvmalloc to
avoid user-triggered disruptive actions like OOM killer in case of
high-order memory shortage.

User triggerable high order allocations are quite a problem on heavily
fragmented systems. They can be a DoS vector.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Vasily Averin <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reviewed-by: Shakeel Butt <[email protected]>
Acked-by: Roman Gushchin <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Dmitry Safonov <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Manfred Spraul <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3
# a11ddb37 23-May-2021 Varad Gautam <[email protected]>

ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry

do_mq_timedreceive calls wq_sleep with a stack local address. The
sender (do_mq_timedsend) uses this address to later call p

ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry

do_mq_timedreceive calls wq_sleep with a stack local address. The
sender (do_mq_timedsend) uses this address to later call pipelined_send.

This leads to a very hard to trigger race where a do_mq_timedreceive
call might return and leave do_mq_timedsend to rely on an invalid
address, causing the following crash:

RIP: 0010:wake_q_add_safe+0x13/0x60
Call Trace:
__x64_sys_mq_timedsend+0x2a9/0x490
do_syscall_64+0x80/0x680
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f5928e40343

The race occurs as:

1. do_mq_timedreceive calls wq_sleep with the address of `struct
ext_wait_queue` on function stack (aliased as `ewq_addr` here) - it
holds a valid `struct ext_wait_queue *` as long as the stack has not
been overwritten.

2. `ewq_addr` gets added to info->e_wait_q[RECV].list in wq_add, and
do_mq_timedsend receives it via wq_get_first_waiter(info, RECV) to call
__pipelined_op.

3. Sender calls __pipelined_op::smp_store_release(&this->state,
STATE_READY). Here is where the race window begins. (`this` is
`ewq_addr`.)

4. If the receiver wakes up now in do_mq_timedreceive::wq_sleep, it
will see `state == STATE_READY` and break.

5. do_mq_timedreceive returns, and `ewq_addr` is no longer guaranteed
to be a `struct ext_wait_queue *` since it was on do_mq_timedreceive's
stack. (Although the address may not get overwritten until another
function happens to touch it, which means it can persist around for an
indefinite time.)

6. do_mq_timedsend::__pipelined_op() still believes `ewq_addr` is a
`struct ext_wait_queue *`, and uses it to find a task_struct to pass to
the wake_q_add_safe call. In the lucky case where nothing has
overwritten `ewq_addr` yet, `ewq_addr->task` is the right task_struct.
In the unlucky case, __pipelined_op::wake_q_add_safe gets handed a
bogus address as the receiver's task_struct causing the crash.

do_mq_timedsend::__pipelined_op() should not dereference `this` after
setting STATE_READY, as the receiver counterpart is now free to return.
Change __pipelined_op to call wake_q_add_safe on the receiver's
task_struct returned by get_task_struct, instead of dereferencing `this`
which sits on the receiver's stack.

As Manfred pointed out, the race potentially also exists in
ipc/msg.c::expunge_all and ipc/sem.c::wake_up_sem_queue_prepare. Fix
those in the same way.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: c5b2cbdbdac563 ("ipc/mqueue.c: update/document memory barriers")
Fixes: 8116b54e7e23ef ("ipc/sem.c: document and update memory barriers")
Fixes: 0d97a82ba830d8 ("ipc/msg.c: update and document memory barriers")
Signed-off-by: Varad Gautam <[email protected]>
Reported-by: Matthias von Faber <[email protected]>
Acked-by: Davidlohr Bueso <[email protected]>
Acked-by: Manfred Spraul <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: "Eric W. Biederman" <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v5.13-rc2, v5.13-rc1
# 7497835f 07-May-2021 Bhaskar Chowdhury <[email protected]>

ipc/sem.c: spelling fix

s/purpuse/purpose/

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Bhaskar Chowdhury <[email protected]>
Acked-by: Randy Dunl

ipc/sem.c: spelling fix

s/purpuse/purpose/

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Bhaskar Chowdhury <[email protected]>
Acked-by: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


# b1989a3d 07-May-2021 Bhaskar Chowdhury <[email protected]>

ipc/sem.c: mundane typo fixes

s/runtine/runtime/
s/AQUIRE/ACQUIRE/
s/seperately/separately/
s/wont/won\'t/
s/succesfull/successful/

Link: https://lkml.kernel.org/r/20210326022240.26375-1-unixbhaska

ipc/sem.c: mundane typo fixes

s/runtine/runtime/
s/AQUIRE/ACQUIRE/
s/seperately/separately/
s/wont/won\'t/
s/succesfull/successful/

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Bhaskar Chowdhury <[email protected]>
Acked-by: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3
# df561f66 23-Aug-2020 Gustavo A. R. Silva <[email protected]>

treewide: Use fallthrough pseudo-keyword

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through mar

treewide: Use fallthrough pseudo-keyword

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <[email protected]>

show more ...


Revision tags: v5.9-rc2, v5.9-rc1
# 00898e85 12-Aug-2020 Alexey Dobriyan <[email protected]>

ipc: uninline functions

Two functions are only called via function pointers, don't bother
inlining them.

Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <akpm@linu

ipc: uninline functions

Two functions are only called via function pointers, don't bother
inlining them.

Signed-off-by: Alexey Dobriyan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Manfred Spraul <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3
# edf28f40 21-Feb-2020 Ioanna Alifieraki <[email protected]>

Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()"

This reverts commit a97955844807e327df11aa33869009d14d6b7de0.

Commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list loc

Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()"

This reverts commit a97955844807e327df11aa33869009d14d6b7de0.

Commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage
in exit_sem()") removes a lock that is needed. This leads to a process
looping infinitely in exit_sem() and can also lead to a crash. There is
a reproducer available in [1] and with the commit reverted the issue
does not reproduce anymore.

Using the reproducer found in [1] is fairly easy to reach a point where
one of the child processes is looping infinitely in exit_sem between
for(;;) and if (semid == -1) block, while it's trying to free its last
sem_undo structure which has already been freed by freeary().

Each sem_undo struct is on two lists: one per semaphore set (list_id)
and one per process (list_proc). The list_id list tracks undos by
semaphore set, and the list_proc by process.

Undo structures are removed either by freeary() or by exit_sem(). The
freeary function is invoked when the user invokes a syscall to remove a
semaphore set. During this operation freeary() traverses the list_id
associated with the semaphore set and removes the undo structures from
both the list_id and list_proc lists.

For this case, exit_sem() is called at process exit. Each process
contains a struct sem_undo_list (referred to as "ulp") which contains
the head for the list_proc list. When the process exits, exit_sem()
traverses this list to remove each sem_undo struct. As in freeary(),
whenever a sem_undo struct is removed from list_proc, it is also removed
from the list_id list.

Removing elements from list_id is safe for both exit_sem() and freeary()
due to sem_lock(). Removing elements from list_proc is not safe;
freeary() locks &un->ulp->lock when it performs
list_del_rcu(&un->list_proc) but exit_sem() does not (locking was
removed by commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list
lock usage in exit_sem()").

This can result in the following situation while executing the
reproducer [1] : Consider a child process in exit_sem() and the parent
in freeary() (because of semctl(sid[i], NSEM, IPC_RMID)).

- The list_proc for the child contains the last two undo structs A and
B (the rest have been removed either by exit_sem() or freeary()).

- The semid for A is 1 and semid for B is 2.

- exit_sem() removes A and at the same time freeary() removes B.

- Since A and B have different semid sem_lock() will acquire different
locks for each process and both can proceed.

The bug is that they remove A and B from the same list_proc at the same
time because only freeary() acquires the ulp lock. When exit_sem()
removes A it makes ulp->list_proc.next to point at B and at the same
time freeary() removes B setting B->semid=-1.

At the next iteration of for(;;) loop exit_sem() will try to remove B.

The only way to break from for(;;) is for (&un->list_proc ==
&ulp->list_proc) to be true which is not. Then exit_sem() will check if
B->semid=-1 which is and will continue looping in for(;;) until the
memory for B is reallocated and the value at B->semid is changed.

At that point, exit_sem() will crash attempting to unlink B from the
lists (this can be easily triggered by running the reproducer [1] a
second time).

To prove this scenario instrumentation was added to keep information
about each sem_undo (un) struct that is removed per process and per
semaphore set (sma).

CPU0 CPU1
[caller holds sem_lock(sma for A)] ...
freeary() exit_sem()
... ...
... sem_lock(sma for B)
spin_lock(A->ulp->lock) ...
list_del_rcu(un_A->list_proc) list_del_rcu(un_B->list_proc)

Undo structures A and B have different semid and sem_lock() operations
proceed. However they belong to the same list_proc list and they are
removed at the same time. This results into ulp->list_proc.next
pointing to the address of B which is already removed.

After reverting commit a97955844807 ("ipc,sem: remove uneeded
sem_undo_list lock usage in exit_sem()") the issue was no longer
reproducible.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1694779

Link: http://lkml.kernel.org/r/[email protected]
Fixes: a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()")
Signed-off-by: Ioanna Alifieraki <[email protected]>
Acked-by: Manfred Spraul <[email protected]>
Acked-by: Herton R. Krzesinski <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: <[email protected]>
Cc: Joel Fernandes (Google) <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Jay Vosburgh <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v5.6-rc2, v5.6-rc1
# 8116b54e 04-Feb-2020 Manfred Spraul <[email protected]>

ipc/sem.c: document and update memory barriers

Document and update the memory barriers in ipc/sem.c:

- Add smp_store_release() to wake_up_sem_queue_prepare() and
document why it is needed.

- Rea

ipc/sem.c: document and update memory barriers

Document and update the memory barriers in ipc/sem.c:

- Add smp_store_release() to wake_up_sem_queue_prepare() and
document why it is needed.

- Read q->status using READ_ONCE+smp_acquire__after_ctrl_dep().
as the pair for the barrier inside wake_up_sem_queue_prepare().

- Add comments to all barriers, and mention the rules in the block
regarding locking.

- Switch to using wake_q_add_safe().

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Manfred Spraul <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1
# 984035ad 25-Sep-2019 Joel Fernandes (Google) <[email protected]>

ipc/sem.c: convert to use built-in RCU list checking

CONFIG_PROVE_RCU_LIST requires list_for_each_entry_rcu() to pass a lockdep
expression if using srcu or locking for protection. It can only check

ipc/sem.c: convert to use built-in RCU list checking

CONFIG_PROVE_RCU_LIST requires list_for_each_entry_rcu() to pass a lockdep
expression if using srcu or locking for protection. It can only check
regular RCU protection, all other protection needs to be passed as lockdep
expression.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Joel Fernandes (Google) <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: "Gustavo A. R. Silva" <[email protected]>
Cc: Jonathan Derrick <[email protected]>
Cc: Keith Busch <[email protected]>
Cc: Lorenzo Pieralisi <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3, v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3, v5.1-rc2, v5.1-rc1
# 4a2ae929 08-Mar-2019 Gustavo A. R. Silva <[email protected]>

ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size

Use kvzalloc() instead of kvmalloc() and memset().

Also, make use of the struct_size() helper instead of the open-coded
version

ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size

Use kvzalloc() instead of kvmalloc() and memset().

Also, make use of the struct_size() helper instead of the open-coded
version in order to avoid any potential type mistakes.

This code was detected with the help of Coccinelle.

Link: http://lkml.kernel.org/r/20190131214221.GA28930@embeddedor
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Manfred Spraul <[email protected]>
Cc: Kees Cook <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


# 667da6a2 08-Mar-2019 Mathieu Malaterre <[email protected]>

ipc: annotate implicit fall through

There is a plan to build the kernel with -Wimplicit-fallthrough and this
place in the code produced a warning (W=1).

This commit remove the following warning:

ipc: annotate implicit fall through

There is a plan to build the kernel with -Wimplicit-fallthrough and this
place in the code produced a warning (W=1).

This commit remove the following warning:

ipc/sem.c:1683:6: warning: this statement may fall through [-Wimplicit-fallthrough=]

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mathieu Malaterre <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


Revision tags: v5.0, v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1
# 8dabe724 06-Jan-2019 Arnd Bergmann <[email protected]>

y2038: syscalls: rename y2038 compat syscalls

A lot of system calls that pass a time_t somewhere have an implementation
using a COMPAT_SYSCALL_DEFINEx() on 64-bit architectures, and have
been rework

y2038: syscalls: rename y2038 compat syscalls

A lot of system calls that pass a time_t somewhere have an implementation
using a COMPAT_SYSCALL_DEFINEx() on 64-bit architectures, and have
been reworked so that this implementation can now be used on 32-bit
architectures as well.

The missing step is to redefine them using the regular SYSCALL_DEFINEx()
to get them out of the compat namespace and make it possible to build them
on 32-bit architectures.

Any system call that ends in 'time' gets a '32' suffix on its name for
that version, while the others get a '_time32' suffix, to distinguish
them from the normal version, which takes a 64-bit time argument in the
future.

In this step, only 64-bit architectures are changed, doing this rename
first lets us avoid touching the 32-bit architectures twice.

Acked-by: Catalin Marinas <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>

show more ...


# 275f2214 31-Dec-2018 Arnd Bergmann <[email protected]>

ipc: rename old-style shmctl/semctl/msgctl syscalls

The behavior of these system calls is slightly different between
architectures, as determined by the CONFIG_ARCH_WANT_IPC_PARSE_VERSION
symbol. Mo

ipc: rename old-style shmctl/semctl/msgctl syscalls

The behavior of these system calls is slightly different between
architectures, as determined by the CONFIG_ARCH_WANT_IPC_PARSE_VERSION
symbol. Most architectures that implement the split IPC syscalls don't set
that symbol and only get the modern version, but alpha, arm, microblaze,
mips-n32, mips-n64 and xtensa expect the caller to pass the IPC_64 flag.

For the architectures that so far only implement sys_ipc(), i.e. m68k,
mips-o32, powerpc, s390, sh, sparc, and x86-32, we want the new behavior
when adding the split syscalls, so we need to distinguish between the
two groups of architectures.

The method I picked for this distinction is to have a separate system call
entry point: sys_old_*ctl() now uses ipc_parse_version, while sys_*ctl()
does not. The system call tables of the five architectures are changed
accordingly.

As an additional benefit, we no longer need the configuration specific
definition for ipc_parse_version(), it always does the same thing now,
but simply won't get called on architectures with the modern interface.

A small downside is that on architectures that do set
ARCH_WANT_IPC_PARSE_VERSION, we now have an extra set of entry points
that are never called. They only add a few bytes of bloat, so it seems
better to keep them compared to adding yet another Kconfig symbol.
I considered adding new syscall numbers for the IPC_64 variants for
consistency, but decided against that for now.

Signed-off-by: Arnd Bergmann <[email protected]>

show more ...


Revision tags: v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1, v4.19, v4.19-rc8, v4.19-rc7, v4.19-rc6, v4.19-rc5, v4.19-rc4, v4.19-rc3, v4.19-rc2, v4.19-rc1, v4.18, v4.18-rc8, v4.18-rc7, v4.18-rc6, v4.18-rc5
# 9afc5eee 13-Jul-2018 Arnd Bergmann <[email protected]>

y2038: globally rename compat_time to old_time32

Christoph Hellwig suggested a slightly different path for handling
backwards compatibility with the 32-bit time_t based system calls:

Rather than si

y2038: globally rename compat_time to old_time32

Christoph Hellwig suggested a slightly different path for handling
backwards compatibility with the 32-bit time_t based system calls:

Rather than simply reusing the compat_sys_* entry points on 32-bit
architectures unchanged, we get rid of those entry points and the
compat_time types by renaming them to something that makes more sense
on 32-bit architectures (which don't have a compat mode otherwise),
and then share the entry points under the new name with the 64-bit
architectures that use them for implementing the compatibility.

The following types and interfaces are renamed here, and moved
from linux/compat_time.h to linux/time32.h:

old new
--- ---
compat_time_t old_time32_t
struct compat_timeval struct old_timeval32
struct compat_timespec struct old_timespec32
struct compat_itimerspec struct old_itimerspec32
ns_to_compat_timeval() ns_to_old_timeval32()
get_compat_itimerspec64() get_old_itimerspec32()
put_compat_itimerspec64() put_old_itimerspec32()
compat_get_timespec64() get_old_timespec32()
compat_put_timespec64() put_old_timespec32()

As we already have aliases in place, this patch addresses only the
instances that are relevant to the system call interface in particular,
not those that occur in device drivers and other modules. Those
will get handled separately, while providing the 64-bit version
of the respective interfaces.

I'm not renaming the timex, rusage and itimerval structures, as we are
still debating what the new interface will look like, and whether we
will need a replacement at all.

This also doesn't change the names of the syscall entry points, which can
be done more easily when we actually switch over the 32-bit architectures
to use them, at that point we need to change COMPAT_SYSCALL_DEFINEx to
SYSCALL_DEFINEx with a new name, e.g. with a _time32 suffix.

Suggested-by: Christoph Hellwig <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Arnd Bergmann <[email protected]>

show more ...


# 27c331a1 22-Aug-2018 Manfred Spraul <[email protected]>

ipc/util.c: further variable name cleanups

The varable names got a mess, thus standardize them again:

id: user space id. Called semid, shmid, msgid if the type is known.
Most functions use "id"

ipc/util.c: further variable name cleanups

The varable names got a mess, thus standardize them again:

id: user space id. Called semid, shmid, msgid if the type is known.
Most functions use "id" already.
idx: "index" for the idr lookup
Right now, some functions use lid, ipc_addid() already uses idx as
the variable name.
seq: sequence number, to avoid quick collisions of the user space id
key: user space key, used for the rhash tree

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Manfred Spraul <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


# eae04d25 22-Aug-2018 Davidlohr Bueso <[email protected]>

ipc: simplify ipc initialization

Now that we know that rhashtable_init() will not fail, we can get rid of a
lot of the unnecessary cleanup paths when the call errored out.

[[email protected]

ipc: simplify ipc initialization

Now that we know that rhashtable_init() will not fail, we can get rid of a
lot of the unnecessary cleanup paths when the call errored out.

[[email protected]: variable name added to util.h to resolve checkpatch warning]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Davidlohr Bueso <[email protected]>
Signed-off-by: Manfred Spraul <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

show more ...


123456789