|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1 |
|
| #
b46fae06 |
| 09-Jul-2023 |
Christophe JAILLET <[email protected]> |
ipc/sem: use flexible array in 'struct sem_undo'
Turn 'semadj' in 'struct sem_undo' into a flexible array.
The advantages are: - save the size of a pointer when the new undo structure is allocat
ipc/sem: use flexible array in 'struct sem_undo'
Turn 'semadj' in 'struct sem_undo' into a flexible array.
The advantages are: - save the size of a pointer when the new undo structure is allocated - avoid some always ugly pointer arithmetic to get the address of semadj - avoid an indirection when the array is accessed
While at it, use struct_size() to compute the size of the new undo structure.
Link: https://lkml.kernel.org/r/1ba993d443ad7e16ac2b1902adab1f05ebdfa454.1688918791.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Manfred Spraul <[email protected]> Reviewed-by: Davidlohr Bueso <[email protected]> Cc: Christophe JAILLET <[email protected]> Cc: Jann Horn <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
show more ...
|
|
Revision tags: v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1 |
|
| #
b52be557 |
| 05-Dec-2022 |
Jann Horn <[email protected]> |
ipc/sem: Fix dangling sem_array access in semtimedop race
When __do_semtimedop() goes to sleep because it has to wait for a semaphore value becoming zero or becoming bigger than some threshold, it l
ipc/sem: Fix dangling sem_array access in semtimedop race
When __do_semtimedop() goes to sleep because it has to wait for a semaphore value becoming zero or becoming bigger than some threshold, it links the on-stack sem_queue to the sem_array, then goes to sleep without holding a reference on the sem_array.
When __do_semtimedop() comes back out of sleep, one of two things must happen:
a) We prove that the on-stack sem_queue has been disconnected from the (possibly freed) sem_array, making it safe to return from the stack frame that the sem_queue exists in.
b) We stabilize our reference to the sem_array, lock the sem_array, and detach the sem_queue from the sem_array ourselves.
sem_array has RCU lifetime, so for case (b), the reference can be stabilized inside an RCU read-side critical section by locklessly checking whether the sem_queue is still connected to the sem_array.
However, the current code does the lockless check on sem_queue before starting an RCU read-side critical section, so the result of the lockless check immediately becomes useless.
Fix it by doing rcu_read_lock() before the lockless check. Now RCU ensures that if we observe the object being on our queue, the object can't be freed until rcu_read_unlock().
This bug is only hittable on kernel builds with full preemption support (either CONFIG_PREEMPT or PREEMPT_DYNAMIC with preempt=full).
Fixes: 370b262c896e ("ipc/sem: avoid idr tree lookup for interrupted semop") Cc: [email protected] Signed-off-by: Jann Horn <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7 |
|
| #
49c9dd0d |
| 10-May-2022 |
Prakash Sangappa <[email protected]> |
ipc: update semtimedop() to use hrtimer
semtimedop() should be converted to use hrtimer like it has been done for most of the system calls with timeouts. This system call already takes a struct tim
ipc: update semtimedop() to use hrtimer
semtimedop() should be converted to use hrtimer like it has been done for most of the system calls with timeouts. This system call already takes a struct timespec as an argument and can therefore provide finer granularity timed wait.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Prakash Sangappa <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Reviewed-by: Davidlohr Bueso <[email protected]> Reviewed-by: Manfred Spraul <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
show more ...
|
| #
0e900029 |
| 10-May-2022 |
Michal Orzel <[email protected]> |
ipc/sem: remove redundant assignments
Get rid of redundant assignments which end up in values not being read either because they are overwritten or the function ends.
Reported by clang-tidy [deadco
ipc/sem: remove redundant assignments
Get rid of redundant assignments which end up in values not being read either because they are overwritten or the function ends.
Reported by clang-tidy [deadcode.DeadStores]
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Michal Orzel <[email protected]> Reviewed-by: Tom Rix <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
show more ...
|
|
Revision tags: v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3 |
|
| #
520ba724 |
| 04-Feb-2022 |
Minghao Chi <[email protected]> |
ipc/sem: do not sleep with a spin lock held
We can't call kvfree() with a spin lock held, so defer it.
Link: https://lkml.kernel.org/r/[email protected] Fixes: fc37a3b8
ipc/sem: do not sleep with a spin lock held
We can't call kvfree() with a spin lock held, so defer it.
Link: https://lkml.kernel.org/r/[email protected] Fixes: fc37a3b8b438 ("[PATCH] ipc sem: use kvmalloc for sem_undo allocation") Reported-by: Zeal Robot <[email protected]> Signed-off-by: Minghao Chi <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Reviewed-by: Manfred Spraul <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Yang Guang <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Bhaskar Chowdhury <[email protected]> Cc: Vasily Averin <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1 |
|
| #
6a4746ba |
| 11-Sep-2021 |
Vasily Averin <[email protected]> |
ipc: remove memcg accounting for sops objects in do_semtimedop()
Linus proposes to revert an accounting for sops objects in do_semtimedop() because it's really just a temporary buffer for a single s
ipc: remove memcg accounting for sops objects in do_semtimedop()
Linus proposes to revert an accounting for sops objects in do_semtimedop() because it's really just a temporary buffer for a single semtimedop() system call.
This object can consume up to 2 pages, syscall is sleeping one, size and duration can be controlled by user, and this allocation can be repeated by many thread at the same time.
However Shakeel Butt pointed that there are much more popular objects with the same life time and similar memory consumption, the accounting of which was decided to be rejected for performance reasons.
Considering at least 2 pages for task_struct and 2 pages for the kernel stack, a back of the envelope calculation gives a footprint amplification of <1.5 so this temporal buffer can be safely ignored.
The factor would IMO be interesting if it was >> 2 (from the PoV of excessive (ab)use, fine-grained accounting seems to be currently unfeasible due to performance impact).
Link: https://lore.kernel.org/lkml/[email protected]/ Fixes: 18319498fdd4 ("memcg: enable accounting of ipc resources") Signed-off-by: Vasily Averin <[email protected]> Acked-by: Michal Hocko <[email protected]> Reviewed-by: Michal Koutný <[email protected]> Acked-by: Shakeel Butt <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
| #
18319498 |
| 02-Sep-2021 |
Vasily Averin <[email protected]> |
memcg: enable accounting of ipc resources
When user creates IPC objects it forces kernel to allocate memory for these long-living objects.
It makes sense to account them to restrict the host's memo
memcg: enable accounting of ipc resources
When user creates IPC objects it forces kernel to allocate memory for these long-living objects.
It makes sense to account them to restrict the host's memory consumption from inside the memcg-limited container.
This patch enables accounting for IPC shared memory segments, messages semaphores and semaphore's undo lists.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Vasily Averin <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Andrei Vagin <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Dmitry Safonov <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "J. Bruce Fields" <[email protected]> Cc: Jeff Layton <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Kirill Tkhai <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Serge Hallyn <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Yutian Yang <[email protected]> Cc: Zefan Li <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.14, v5.14-rc7, v5.14-rc6 |
|
| #
bdec0145 |
| 11-Aug-2021 |
Arnd Bergmann <[email protected]> |
ARM: 9114/1: oabi-compat: rework sys_semtimedop emulation
sys_oabi_semtimedop() is one of the last users of set_fs() on Arm. To remove this one, expose the internal code of the actual implementation
ARM: 9114/1: oabi-compat: rework sys_semtimedop emulation
sys_oabi_semtimedop() is one of the last users of set_fs() on Arm. To remove this one, expose the internal code of the actual implementation that operates on a kernel pointer and call it directly after copying.
There should be no measurable impact on the normal execution of this function, and it makes the overly long function a little shorter, which may help readability.
While reworking the oabi version, make it behave a little more like the native one, using kvmalloc_array() and restructure the code flow in a similar way.
The naming of __do_semtimedop() is not very good, I hope someone can come up with a better name.
One regression was spotted by kernel test robot <[email protected]> and fixed before the first mailing list submission.
Acked-by: Christoph Hellwig <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Russell King (Oracle) <[email protected]>
show more ...
|
|
Revision tags: v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1 |
|
| #
17d056e0 |
| 01-Jul-2021 |
Manfred Spraul <[email protected]> |
ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock
The patch solves three weaknesses in ipc/sem.c:
1) The initial read of use_global_lock in sem_lock() is an intentional race. KCSAN de
ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock
The patch solves three weaknesses in ipc/sem.c:
1) The initial read of use_global_lock in sem_lock() is an intentional race. KCSAN detects these accesses and prints a warning.
2) The code assumes that plain C read/writes are not mangled by the CPU or the compiler.
3) The comment it sysvipc_sem_proc_show() was hard to understand: The rest of the comments in ipc/sem.c speaks about sem_perm.lock, and suddenly this function speaks about ipc_lock_object().
To solve 1) and 2), use READ_ONCE()/WRITE_ONCE(). Plain C reads are used in code that owns sma->sem_perm.lock.
The comment is updated to solve 3)
[[email protected]: use READ_ONCE()/WRITE_ONCE() for use_global_lock] Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Manfred Spraul <[email protected]> Reviewed-by: Paul E. McKenney <[email protected]> Reviewed-by: Davidlohr Bueso <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
| #
fc37a3b8 |
| 01-Jul-2021 |
Vasily Averin <[email protected]> |
ipc sem: use kvmalloc for sem_undo allocation
Patch series "ipc: allocations cleanup", v2.
Some ipc objects use the wrong allocation functions: small objects can use kmalloc(), and vice versa, pote
ipc sem: use kvmalloc for sem_undo allocation
Patch series "ipc: allocations cleanup", v2.
Some ipc objects use the wrong allocation functions: small objects can use kmalloc(), and vice versa, potentially large objects can use kmalloc().
This patch (of 2):
Size of sem_undo can exceed one page and with the maximum possible nsems = 32000 it can grow up to 64Kb. Let's switch its allocation to kvmalloc to avoid user-triggered disruptive actions like OOM killer in case of high-order memory shortage.
User triggerable high order allocations are quite a problem on heavily fragmented systems. They can be a DoS vector.
Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Vasily Averin <[email protected]> Acked-by: Michal Hocko <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Acked-by: Roman Gushchin <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Dmitry Safonov <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Manfred Spraul <[email protected]> Cc: Vladimir Davydov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3 |
|
| #
a11ddb37 |
| 23-May-2021 |
Varad Gautam <[email protected]> |
ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry
do_mq_timedreceive calls wq_sleep with a stack local address. The sender (do_mq_timedsend) uses this address to later call p
ipc/mqueue, msg, sem: avoid relying on a stack reference past its expiry
do_mq_timedreceive calls wq_sleep with a stack local address. The sender (do_mq_timedsend) uses this address to later call pipelined_send.
This leads to a very hard to trigger race where a do_mq_timedreceive call might return and leave do_mq_timedsend to rely on an invalid address, causing the following crash:
RIP: 0010:wake_q_add_safe+0x13/0x60 Call Trace: __x64_sys_mq_timedsend+0x2a9/0x490 do_syscall_64+0x80/0x680 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7f5928e40343
The race occurs as:
1. do_mq_timedreceive calls wq_sleep with the address of `struct ext_wait_queue` on function stack (aliased as `ewq_addr` here) - it holds a valid `struct ext_wait_queue *` as long as the stack has not been overwritten.
2. `ewq_addr` gets added to info->e_wait_q[RECV].list in wq_add, and do_mq_timedsend receives it via wq_get_first_waiter(info, RECV) to call __pipelined_op.
3. Sender calls __pipelined_op::smp_store_release(&this->state, STATE_READY). Here is where the race window begins. (`this` is `ewq_addr`.)
4. If the receiver wakes up now in do_mq_timedreceive::wq_sleep, it will see `state == STATE_READY` and break.
5. do_mq_timedreceive returns, and `ewq_addr` is no longer guaranteed to be a `struct ext_wait_queue *` since it was on do_mq_timedreceive's stack. (Although the address may not get overwritten until another function happens to touch it, which means it can persist around for an indefinite time.)
6. do_mq_timedsend::__pipelined_op() still believes `ewq_addr` is a `struct ext_wait_queue *`, and uses it to find a task_struct to pass to the wake_q_add_safe call. In the lucky case where nothing has overwritten `ewq_addr` yet, `ewq_addr->task` is the right task_struct. In the unlucky case, __pipelined_op::wake_q_add_safe gets handed a bogus address as the receiver's task_struct causing the crash.
do_mq_timedsend::__pipelined_op() should not dereference `this` after setting STATE_READY, as the receiver counterpart is now free to return. Change __pipelined_op to call wake_q_add_safe on the receiver's task_struct returned by get_task_struct, instead of dereferencing `this` which sits on the receiver's stack.
As Manfred pointed out, the race potentially also exists in ipc/msg.c::expunge_all and ipc/sem.c::wake_up_sem_queue_prepare. Fix those in the same way.
Link: https://lkml.kernel.org/r/[email protected] Fixes: c5b2cbdbdac563 ("ipc/mqueue.c: update/document memory barriers") Fixes: 8116b54e7e23ef ("ipc/sem.c: document and update memory barriers") Fixes: 0d97a82ba830d8 ("ipc/msg.c: update and document memory barriers") Signed-off-by: Varad Gautam <[email protected]> Reported-by: Matthias von Faber <[email protected]> Acked-by: Davidlohr Bueso <[email protected]> Acked-by: Manfred Spraul <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.13-rc2, v5.13-rc1 |
|
| #
7497835f |
| 07-May-2021 |
Bhaskar Chowdhury <[email protected]> |
ipc/sem.c: spelling fix
s/purpuse/purpose/
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Bhaskar Chowdhury <[email protected]> Acked-by: Randy Dunl
ipc/sem.c: spelling fix
s/purpuse/purpose/
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Bhaskar Chowdhury <[email protected]> Acked-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
| #
b1989a3d |
| 07-May-2021 |
Bhaskar Chowdhury <[email protected]> |
ipc/sem.c: mundane typo fixes
s/runtine/runtime/ s/AQUIRE/ACQUIRE/ s/seperately/separately/ s/wont/won\'t/ s/succesfull/successful/
Link: https://lkml.kernel.org/r/20210326022240.26375-1-unixbhaska
ipc/sem.c: mundane typo fixes
s/runtine/runtime/ s/AQUIRE/ACQUIRE/ s/seperately/separately/ s/wont/won\'t/ s/succesfull/successful/
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Bhaskar Chowdhury <[email protected]> Acked-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3 |
|
| #
df561f66 |
| 23-Aug-2020 |
Gustavo A. R. Silva <[email protected]> |
treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through mar
treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case.
[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through
Signed-off-by: Gustavo A. R. Silva <[email protected]>
show more ...
|
|
Revision tags: v5.9-rc2, v5.9-rc1 |
|
| #
00898e85 |
| 12-Aug-2020 |
Alexey Dobriyan <[email protected]> |
ipc: uninline functions
Two functions are only called via function pointers, don't bother inlining them.
Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <akpm@linu
ipc: uninline functions
Two functions are only called via function pointers, don't bother inlining them.
Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Manfred Spraul <[email protected]> Cc: Davidlohr Bueso <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3 |
|
| #
edf28f40 |
| 21-Feb-2020 |
Ioanna Alifieraki <[email protected]> |
Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()"
This reverts commit a97955844807e327df11aa33869009d14d6b7de0.
Commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list loc
Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()"
This reverts commit a97955844807e327df11aa33869009d14d6b7de0.
Commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()") removes a lock that is needed. This leads to a process looping infinitely in exit_sem() and can also lead to a crash. There is a reproducer available in [1] and with the commit reverted the issue does not reproduce anymore.
Using the reproducer found in [1] is fairly easy to reach a point where one of the child processes is looping infinitely in exit_sem between for(;;) and if (semid == -1) block, while it's trying to free its last sem_undo structure which has already been freed by freeary().
Each sem_undo struct is on two lists: one per semaphore set (list_id) and one per process (list_proc). The list_id list tracks undos by semaphore set, and the list_proc by process.
Undo structures are removed either by freeary() or by exit_sem(). The freeary function is invoked when the user invokes a syscall to remove a semaphore set. During this operation freeary() traverses the list_id associated with the semaphore set and removes the undo structures from both the list_id and list_proc lists.
For this case, exit_sem() is called at process exit. Each process contains a struct sem_undo_list (referred to as "ulp") which contains the head for the list_proc list. When the process exits, exit_sem() traverses this list to remove each sem_undo struct. As in freeary(), whenever a sem_undo struct is removed from list_proc, it is also removed from the list_id list.
Removing elements from list_id is safe for both exit_sem() and freeary() due to sem_lock(). Removing elements from list_proc is not safe; freeary() locks &un->ulp->lock when it performs list_del_rcu(&un->list_proc) but exit_sem() does not (locking was removed by commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()").
This can result in the following situation while executing the reproducer [1] : Consider a child process in exit_sem() and the parent in freeary() (because of semctl(sid[i], NSEM, IPC_RMID)).
- The list_proc for the child contains the last two undo structs A and B (the rest have been removed either by exit_sem() or freeary()).
- The semid for A is 1 and semid for B is 2.
- exit_sem() removes A and at the same time freeary() removes B.
- Since A and B have different semid sem_lock() will acquire different locks for each process and both can proceed.
The bug is that they remove A and B from the same list_proc at the same time because only freeary() acquires the ulp lock. When exit_sem() removes A it makes ulp->list_proc.next to point at B and at the same time freeary() removes B setting B->semid=-1.
At the next iteration of for(;;) loop exit_sem() will try to remove B.
The only way to break from for(;;) is for (&un->list_proc == &ulp->list_proc) to be true which is not. Then exit_sem() will check if B->semid=-1 which is and will continue looping in for(;;) until the memory for B is reallocated and the value at B->semid is changed.
At that point, exit_sem() will crash attempting to unlink B from the lists (this can be easily triggered by running the reproducer [1] a second time).
To prove this scenario instrumentation was added to keep information about each sem_undo (un) struct that is removed per process and per semaphore set (sma).
CPU0 CPU1 [caller holds sem_lock(sma for A)] ... freeary() exit_sem() ... ... ... sem_lock(sma for B) spin_lock(A->ulp->lock) ... list_del_rcu(un_A->list_proc) list_del_rcu(un_B->list_proc)
Undo structures A and B have different semid and sem_lock() operations proceed. However they belong to the same list_proc list and they are removed at the same time. This results into ulp->list_proc.next pointing to the address of B which is already removed.
After reverting commit a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()") the issue was no longer reproducible.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1694779
Link: http://lkml.kernel.org/r/[email protected] Fixes: a97955844807 ("ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()") Signed-off-by: Ioanna Alifieraki <[email protected]> Acked-by: Manfred Spraul <[email protected]> Acked-by: Herton R. Krzesinski <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: <[email protected]> Cc: Joel Fernandes (Google) <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Jay Vosburgh <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.6-rc2, v5.6-rc1 |
|
| #
8116b54e |
| 04-Feb-2020 |
Manfred Spraul <[email protected]> |
ipc/sem.c: document and update memory barriers
Document and update the memory barriers in ipc/sem.c:
- Add smp_store_release() to wake_up_sem_queue_prepare() and document why it is needed.
- Rea
ipc/sem.c: document and update memory barriers
Document and update the memory barriers in ipc/sem.c:
- Add smp_store_release() to wake_up_sem_queue_prepare() and document why it is needed.
- Read q->status using READ_ONCE+smp_acquire__after_ctrl_dep(). as the pair for the barrier inside wake_up_sem_queue_prepare().
- Add comments to all barriers, and mention the rules in the block regarding locking.
- Switch to using wake_q_add_safe().
Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Manfred Spraul <[email protected]> Cc: Waiman Long <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1 |
|
| #
984035ad |
| 25-Sep-2019 |
Joel Fernandes (Google) <[email protected]> |
ipc/sem.c: convert to use built-in RCU list checking
CONFIG_PROVE_RCU_LIST requires list_for_each_entry_rcu() to pass a lockdep expression if using srcu or locking for protection. It can only check
ipc/sem.c: convert to use built-in RCU list checking
CONFIG_PROVE_RCU_LIST requires list_for_each_entry_rcu() to pass a lockdep expression if using srcu or locking for protection. It can only check regular RCU protection, all other protection needs to be passed as lockdep expression.
Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Joel Fernandes (Google) <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: "Gustavo A. R. Silva" <[email protected]> Cc: Jonathan Derrick <[email protected]> Cc: Keith Busch <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3, v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3, v5.1-rc2, v5.1-rc1 |
|
| #
4a2ae929 |
| 08-Mar-2019 |
Gustavo A. R. Silva <[email protected]> |
ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size
Use kvzalloc() instead of kvmalloc() and memset().
Also, make use of the struct_size() helper instead of the open-coded version
ipc/sem.c: replace kvmalloc/memset with kvzalloc and use struct_size
Use kvzalloc() instead of kvmalloc() and memset().
Also, make use of the struct_size() helper instead of the open-coded version in order to avoid any potential type mistakes.
This code was detected with the help of Coccinelle.
Link: http://lkml.kernel.org/r/20190131214221.GA28930@embeddedor Signed-off-by: Gustavo A. R. Silva <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Manfred Spraul <[email protected]> Cc: Kees Cook <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
| #
667da6a2 |
| 08-Mar-2019 |
Mathieu Malaterre <[email protected]> |
ipc: annotate implicit fall through
There is a plan to build the kernel with -Wimplicit-fallthrough and this place in the code produced a warning (W=1).
This commit remove the following warning:
ipc: annotate implicit fall through
There is a plan to build the kernel with -Wimplicit-fallthrough and this place in the code produced a warning (W=1).
This commit remove the following warning:
ipc/sem.c:1683:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mathieu Malaterre <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.0, v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1 |
|
| #
8dabe724 |
| 06-Jan-2019 |
Arnd Bergmann <[email protected]> |
y2038: syscalls: rename y2038 compat syscalls
A lot of system calls that pass a time_t somewhere have an implementation using a COMPAT_SYSCALL_DEFINEx() on 64-bit architectures, and have been rework
y2038: syscalls: rename y2038 compat syscalls
A lot of system calls that pass a time_t somewhere have an implementation using a COMPAT_SYSCALL_DEFINEx() on 64-bit architectures, and have been reworked so that this implementation can now be used on 32-bit architectures as well.
The missing step is to redefine them using the regular SYSCALL_DEFINEx() to get them out of the compat namespace and make it possible to build them on 32-bit architectures.
Any system call that ends in 'time' gets a '32' suffix on its name for that version, while the others get a '_time32' suffix, to distinguish them from the normal version, which takes a 64-bit time argument in the future.
In this step, only 64-bit architectures are changed, doing this rename first lets us avoid touching the 32-bit architectures twice.
Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
show more ...
|
| #
275f2214 |
| 31-Dec-2018 |
Arnd Bergmann <[email protected]> |
ipc: rename old-style shmctl/semctl/msgctl syscalls
The behavior of these system calls is slightly different between architectures, as determined by the CONFIG_ARCH_WANT_IPC_PARSE_VERSION symbol. Mo
ipc: rename old-style shmctl/semctl/msgctl syscalls
The behavior of these system calls is slightly different between architectures, as determined by the CONFIG_ARCH_WANT_IPC_PARSE_VERSION symbol. Most architectures that implement the split IPC syscalls don't set that symbol and only get the modern version, but alpha, arm, microblaze, mips-n32, mips-n64 and xtensa expect the caller to pass the IPC_64 flag.
For the architectures that so far only implement sys_ipc(), i.e. m68k, mips-o32, powerpc, s390, sh, sparc, and x86-32, we want the new behavior when adding the split syscalls, so we need to distinguish between the two groups of architectures.
The method I picked for this distinction is to have a separate system call entry point: sys_old_*ctl() now uses ipc_parse_version, while sys_*ctl() does not. The system call tables of the five architectures are changed accordingly.
As an additional benefit, we no longer need the configuration specific definition for ipc_parse_version(), it always does the same thing now, but simply won't get called on architectures with the modern interface.
A small downside is that on architectures that do set ARCH_WANT_IPC_PARSE_VERSION, we now have an extra set of entry points that are never called. They only add a few bytes of bloat, so it seems better to keep them compared to adding yet another Kconfig symbol. I considered adding new syscall numbers for the IPC_64 variants for consistency, but decided against that for now.
Signed-off-by: Arnd Bergmann <[email protected]>
show more ...
|
|
Revision tags: v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1, v4.19, v4.19-rc8, v4.19-rc7, v4.19-rc6, v4.19-rc5, v4.19-rc4, v4.19-rc3, v4.19-rc2, v4.19-rc1, v4.18, v4.18-rc8, v4.18-rc7, v4.18-rc6, v4.18-rc5 |
|
| #
9afc5eee |
| 13-Jul-2018 |
Arnd Bergmann <[email protected]> |
y2038: globally rename compat_time to old_time32
Christoph Hellwig suggested a slightly different path for handling backwards compatibility with the 32-bit time_t based system calls:
Rather than si
y2038: globally rename compat_time to old_time32
Christoph Hellwig suggested a slightly different path for handling backwards compatibility with the 32-bit time_t based system calls:
Rather than simply reusing the compat_sys_* entry points on 32-bit architectures unchanged, we get rid of those entry points and the compat_time types by renaming them to something that makes more sense on 32-bit architectures (which don't have a compat mode otherwise), and then share the entry points under the new name with the 64-bit architectures that use them for implementing the compatibility.
The following types and interfaces are renamed here, and moved from linux/compat_time.h to linux/time32.h:
old new --- --- compat_time_t old_time32_t struct compat_timeval struct old_timeval32 struct compat_timespec struct old_timespec32 struct compat_itimerspec struct old_itimerspec32 ns_to_compat_timeval() ns_to_old_timeval32() get_compat_itimerspec64() get_old_itimerspec32() put_compat_itimerspec64() put_old_itimerspec32() compat_get_timespec64() get_old_timespec32() compat_put_timespec64() put_old_timespec32()
As we already have aliases in place, this patch addresses only the instances that are relevant to the system call interface in particular, not those that occur in device drivers and other modules. Those will get handled separately, while providing the 64-bit version of the respective interfaces.
I'm not renaming the timex, rusage and itimerval structures, as we are still debating what the new interface will look like, and whether we will need a replacement at all.
This also doesn't change the names of the syscall entry points, which can be done more easily when we actually switch over the 32-bit architectures to use them, at that point we need to change COMPAT_SYSCALL_DEFINEx to SYSCALL_DEFINEx with a new name, e.g. with a _time32 suffix.
Suggested-by: Christoph Hellwig <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Arnd Bergmann <[email protected]>
show more ...
|
| #
27c331a1 |
| 22-Aug-2018 |
Manfred Spraul <[email protected]> |
ipc/util.c: further variable name cleanups
The varable names got a mess, thus standardize them again:
id: user space id. Called semid, shmid, msgid if the type is known. Most functions use "id"
ipc/util.c: further variable name cleanups
The varable names got a mess, thus standardize them again:
id: user space id. Called semid, shmid, msgid if the type is known. Most functions use "id" already. idx: "index" for the idr lookup Right now, some functions use lid, ipc_addid() already uses idx as the variable name. seq: sequence number, to avoid quick collisions of the user space id key: user space key, used for the rhash tree
Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Manfred Spraul <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Kees Cook <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
| #
eae04d25 |
| 22-Aug-2018 |
Davidlohr Bueso <[email protected]> |
ipc: simplify ipc initialization
Now that we know that rhashtable_init() will not fail, we can get rid of a lot of the unnecessary cleanup paths when the call errored out.
[[email protected]
ipc: simplify ipc initialization
Now that we know that rhashtable_init() will not fail, we can get rid of a lot of the unnecessary cleanup paths when the call errored out.
[[email protected]: variable name added to util.h to resolve checkpatch warning] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Davidlohr Bueso <[email protected]> Signed-off-by: Manfred Spraul <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Kees Cook <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|