History log of /linux-6.15/kernel/workqueue.c (Results 1 – 25 of 846)
Revision Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1
# 8fa7292f 05-Apr-2025 Thomas Gleixner <[email protected]>

treewide: Switch/rename to timer_delete[_sync]()

timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree
over and remove the historical wrapper inlines.

Conversion was done with coccinelle plus manual fixups where necessary.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


Revision tags: v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3
# 8221fd1a 14-Feb-2025 Will Deacon <[email protected]>

workqueue: Log additional details when rejecting work

Syzbot regularly runs into the following warning on arm64:

| WARNING: CPU: 1 PID: 6023 at kernel/workqueue.c:2257 current_wq_worker kernel/workqueue_internal.h:69 [inline]
| WARNING: CPU: 1 PID: 6023 at kernel/workqueue.c:2257 is_chained_work kernel/workqueue.c:2199 [inline]
| WARNING: CPU: 1 PID: 6023 at kernel/workqueue.c:2257 __queue_work+0xe50/0x1308 kernel/workqueue.c:2256
| Modules linked in:
| CPU: 1 UID: 0 PID: 6023 Comm: klogd Not tainted 6.13.0-rc2-syzkaller-g2e7aff49b5da #0
| Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
| pstate: 404000c5 (nZcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : __queue_work+0xe50/0x1308 kernel/workqueue_internal.h:69
| lr : current_wq_worker kernel/workqueue_internal.h:69 [inline]
| lr : is_chained_work kernel/workqueue.c:2199 [inline]
| lr : __queue_work+0xe50/0x1308 kernel/workqueue.c:2256

[...]

| __queue_work+0xe50/0x1308 kernel/workqueue.c:2256 (L)
| delayed_work_timer_fn+0x74/0x90 kernel/workqueue.c:2485
| call_timer_fn+0x1b4/0x8b8 kernel/time/timer.c:1793
| expire_timers kernel/time/timer.c:1839 [inline]
| __run_timers kernel/time/timer.c:2418 [inline]
| __run_timer_base+0x59c/0x7b4 kernel/time/timer.c:2430
| run_timer_base kernel/time/timer.c:2439 [inline]
| run_timer_softirq+0xcc/0x194 kernel/time/timer.c:2449

The warning is probably because we are trying to queue work into a
destroyed workqueue, but the softirq context makes it hard to pinpoint
the problematic caller.

Extend the warning diagnostics to print both the function we are trying
to queue as well as the name of the workqueue.

Cc: Tejun Heo <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Link: https://syzkaller.appspot.com/bug?extid=e13e654d315d4da1277c
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


Revision tags: v6.14-rc2, v6.14-rc1
# e7694611 23-Jan-2025 Lai Jiangshan <[email protected]>

workqueue: Put the pwq after detaching the rescuer from the pool

The commit 68f83057b913 ("workqueue: Reap workers via kthread_stop() and
remove detach_completion") adds code to reap the normal workers but
mistakenly does not handle the rescuer and also removes the code waiting
for the rescuer in put_unbound_pool(), which caused a use-after-free bug
reported by Cheung Wall.

To avoid the use-after-free bug, the pool’s reference must be held until
the detachment is complete. Therefore, move the code that puts the pwq
after detaching the rescuer from the pool.

Reported-by: cheung wall <[email protected]>
Cc: cheung wall <[email protected]>
Link: https://lore.kernel.org/lkml/CAKHoSAvP3iQW+GwmKzWjEAOoPvzeWeoMO0Gz7Pp3_4kxt-RMoA@mail.gmail.com/
Fixes: 68f83057b913("workqueue: Reap workers via kthread_stop() and remove detach_completion")
Signed-off-by: Lai Jiangshan <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


Revision tags: v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1
# d40797d6 22-Nov-2024 Peter Zijlstra <[email protected]>

kasan: make kasan_record_aux_stack_noalloc() the default behaviour

kasan_record_aux_stack_noalloc() was introduced to record a stack trace
without allocating memory in the process. It has been added to callers
which were invoked while a raw_spinlock_t was held. More and more callers
were identified and changed over time. Is it a good thing to have this
while functions try their best to do a lockless setup? The only
downside of having kasan_record_aux_stack() not allocate any memory is
that we end up without a stacktrace if stackdepot runs out of memory and
at the same time the stacktrace was not recorded before. To quote Marco Elver from
https://lore.kernel.org/all/CANpmjNPmQYJ7pv1N3cuU8cP18u7PP_uoZD8YxwZd4jtbof9nVQ@mail.gmail.com/

| I'd be in favor, it simplifies things. And stack depot should be
| able to replenish its pool sufficiently in the "non-aux" cases
| i.e. regular allocations. Worst case we fail to record some
| aux stacks, but I think that's only really bad if there's a bug
| around one of these allocations. In general the probabilities
| of this being a regression are extremely small [...]

Make the kasan_record_aux_stack_noalloc() behaviour default as
kasan_record_aux_stack().

[[email protected]: dressed the diff as patch]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 7cb3007ce2da ("kasan: generic: introduce kasan_record_aux_stack_noalloc()")
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Reported-by: [email protected]
Closes: https://lore.kernel.org/all/[email protected]
Reviewed-by: Andrey Konovalov <[email protected]>
Reviewed-by: Marco Elver <[email protected]>
Reviewed-by: Waiman Long <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Ben Segall <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Hyeonggon Yoo <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Joel Fernandes (Google) <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Neeraj Upadhyay <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: [email protected]
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Zqiang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>


# da30ba22 09-Jan-2025 Imran Khan <[email protected]>

workqueue: warn if delayed_work is queued to an offlined cpu.

A delayed_work submitted to an offlined cpu will not get executed
after the specified delay if the cpu remains offline. If the cpu
never comes online, the work will never get executed.
Checking for an online cpu in __queue_delayed_work() does not sound
like a good idea, because to do this reliably we need the hotplug lock,
and since work may be submitted from atomic contexts, we would
have to use cpus_read_trylock(). But if the trylock fails, we would queue
the work on any cpu, and this may not be optimal because our intended
cpu might still be online.

Putting a WARN_ON_ONCE for an already offlined cpu will alert users
of queue_delayed_work_on() if they are (wrongly) trying to queue
delayed_work on an offlined cpu. Also document the problem of using
an offlined cpu with queue_delayed_work_on() in its description.

Signed-off-by: Imran Khan <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


Revision tags: v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1
# b04e317b 26-Sep-2024 Frederic Weisbecker <[email protected]>

treewide: Introduce kthread_run_worker[_on_cpu]()

kthread_create() creates a kthread without running it yet. kthread_run()
creates a kthread and runs it.

On the other hand, kthread_create_worker() creates a kthread worker and
runs it.

This difference in behaviours is confusing. Also there is no way to
create a kthread worker and affine it using kthread_bind_mask() or
kthread_affine_preferred() before starting it.

Consolidate the behaviours and introduce kthread_run_worker[_on_cpu]()
that behaves just like kthread_run(). kthread_create_worker[_on_cpu]()
will now only create a kthread worker without starting it.

Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>


# d57212f2 24-Dec-2024 Su Hui <[email protected]>

workqueue: add printf attribute to __alloc_workqueue()

Fix a compiler warning with W=1:
kernel/workqueue.c: error:
function ‘__alloc_workqueue’ might be a candidate for ‘gnu_printf’
format attribute[-Werror=suggest-attribute=format]
5657 | name_len = vsnprintf(wq->name, sizeof(wq->name), fmt, args);
| ^~~~~~~~
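
The warning above asks for a printf format attribute on a function that forwards a format string to vsnprintf(). A minimal userspace sketch of that annotation follows, using the GCC/Clang attribute syntax directly. fill_name() and fill_namef() are hypothetical names for illustration, not the kernel functions, and the kernel spells the attribute through its own wrapper macro.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper, not the kernel function. format(printf, 3, 0) says
 * argument 3 is a printf-style format string; the 0 means the values arrive
 * as a va_list, so only the format string itself can be checked.
 */
__attribute__((format(printf, 3, 0)))
static int fill_name(char *buf, size_t len, const char *fmt, va_list args)
{
	return vsnprintf(buf, len, fmt, args);
}

/*
 * Hypothetical variadic front end. format(printf, 3, 4) lets the compiler
 * check each caller's arguments (starting at argument 4) against the format.
 */
__attribute__((format(printf, 3, 4)))
static int fill_namef(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = fill_name(buf, len, fmt, args);
	va_end(args);
	return n;
}
```

With the attribute in place, a mismatched call such as passing an integer for a %s conversion would be expected to draw a -Wformat diagnostic at the call site.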

Fixes: 9b59a85a84dc ("workqueue: Don't call va_start / va_end twice")
Signed-off-by: Su Hui <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


# de35994e 19-Dec-2024 Tvrtko Ursulin <[email protected]>

workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker

After commit
746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
amdgpu started seeing the following warning:

[ ] workqueue: WQ_MEM_RECLAIM sdma0:drm_sched_run_job_work [gpu_sched] is flushing !WQ_MEM_RECLAIM events:amdgpu_device_delay_enable_gfx_off [amdgpu]
...
[ ] Workqueue: sdma0 drm_sched_run_job_work [gpu_sched]
...
[ ] Call Trace:
[ ] <TASK>
...
[ ] ? check_flush_dependency+0xf5/0x110
...
[ ] cancel_delayed_work_sync+0x6e/0x80
[ ] amdgpu_gfx_off_ctrl+0xab/0x140 [amdgpu]
[ ] amdgpu_ring_alloc+0x40/0x50 [amdgpu]
[ ] amdgpu_ib_schedule+0xf4/0x810 [amdgpu]
[ ] ? drm_sched_run_job_work+0x22c/0x430 [gpu_sched]
[ ] amdgpu_job_run+0xaa/0x1f0 [amdgpu]
[ ] drm_sched_run_job_work+0x257/0x430 [gpu_sched]
[ ] process_one_work+0x217/0x720
...
[ ] </TASK>

The intent of the verification done in check_flush_dependency is to ensure
forward progress during memory reclaim, by flagging cases when either a
memory reclaim process, or a memory reclaim work item is flushed from a
context not marked as memory reclaim safe.

This is correct when flushing, but when called from the
cancel(_delayed)_work_sync() paths it is a false positive because work is
either already running, or will not be running at all. Therefore
cancelling it is safe and we can relax the warning criteria by letting the
helper know of the calling context.

Signed-off-by: Tvrtko Ursulin <[email protected]>
Fixes: fca839c00a12 ("workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue")
References: 746ae46c1113 ("drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM")
Cc: Tejun Heo <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Christian König <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: <[email protected]> # v4.5+
Signed-off-by: Tejun Heo <[email protected]>


# 85f0d8e3 15-Nov-2024 Wangyang Guo <[email protected]>

workqueue: Reduce expensive locks for unbound workqueue

For unbound workqueue, pwqs usually map to just a few pools. Most of
the time, pwqs will be linked sequentially to wq->pwqs list by cpu
index. Usually, consecutive CPUs have the same workqueue attribute
(e.g. belong to the same NUMA node). This makes pwqs with the same
pool cluster together in the pwq list.

Only do lock/unlock if the pool has changed in flush_workqueue_prep_pwqs().
This reduces the number of expensive lock operations.
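
The idea of taking the lock only when the pool changes can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the kernel code: struct pool, struct pwq, pool_lock(), pool_unlock() and visit_pwqs() are hypothetical stand-ins, and the "lock" is just a counter so the saving is observable.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct pool {
	int lock_count;		/* how many times this pool was locked */
};

struct pwq {
	struct pool *pool;	/* pool backing this pwq */
};

static void pool_lock(struct pool *p)
{
	p->lock_count++;	/* a real lock acquisition would go here */
}

static void pool_unlock(struct pool *p)
{
	(void)p;		/* a real lock release would go here */
}

/*
 * Walk the pwq array, keeping the current pool locked across consecutive
 * pwqs that share it and switching locks only when the pool changes.
 */
static void visit_pwqs(struct pwq *pwqs, size_t n)
{
	struct pool *current_pool = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		struct pool *pool = pwqs[i].pool;

		if (current_pool != pool) {
			if (current_pool)
				pool_unlock(current_pool);
			current_pool = pool;
			pool_lock(current_pool);
		}
		/* per-pwq work would run here, under current_pool's lock */
	}

	if (current_pool)
		pool_unlock(current_pool);
}
```

When pwqs backed by the same pool sit next to each other, five pwqs over two pools cost two lock acquisitions in this sketch instead of five.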

The performance data shows this change boosts FIO by 65x in some cases
when multiple concurrent threads write to xfs mount points with fsync.

FIO Benchmark Details
- FIO version: v3.35
- FIO Options: ioengine=libaio,iodepth=64,norandommap=1,rw=write,
size=128M,bs=4k,fsync=1
- FIO Job Configs: 64 jobs in total writing to 4 mount points (ramdisks
formatted as xfs file system).
- Kernel Codebase: v6.12-rc5
- Test Platform: Xeon 8380 (2 sockets)

Reviewed-by: Tim Chen <[email protected]>
Signed-off-by: Wangyang Guo <[email protected]>
Reviewed-by: Lai Jiangshan <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


Revision tags: v6.11
# 73613840 12-Sep-2024 Lai Jiangshan <[email protected]>

workqueue: Clear worker->pool in the worker thread context

Marc Hartmayer reported:
[ 23.133876] Unable to handle kernel pointer dereference in virtual kernel address space
[ 23.133950] Failing address: 0000000000000000 TEID: 0000000000000483
[ 23.133954] Fault in home space mode while using kernel ASCE.
[ 23.133957] AS:000000001b8f0007 R3:0000000056cf4007 S:0000000056cf3800 P:000000000000003d
[ 23.134207] Oops: 0004 ilc:2 [#1] SMP
(snip)
[ 23.134516] Call Trace:
[ 23.134520] [<0000024e326caf28>] worker_thread+0x48/0x430
[ 23.134525] ([<0000024e326caf18>] worker_thread+0x38/0x430)
[ 23.134528] [<0000024e326d3a3e>] kthread+0x11e/0x130
[ 23.134533] [<0000024e3264b0dc>] __ret_from_fork+0x3c/0x60
[ 23.134536] [<0000024e333fb37a>] ret_from_fork+0xa/0x38
[ 23.134552] Last Breaking-Event-Address:
[ 23.134553] [<0000024e333f4c04>] mutex_unlock+0x24/0x30
[ 23.134562] Kernel panic - not syncing: Fatal exception: panic_on_oops

After debugging and analysis, worker_thread() accesses the nullified
worker->pool when the newly created worker is destroyed before being
woken up, in which case worker_thread() can see the result of detach_worker()
resetting worker->pool to NULL at the beginning.

Move the code "worker->pool = NULL;" out from detach_worker() to fix the
problem.

worker->pool had been designed to be constant for regular workers and
changeable for the rescuer. To share attaching/detaching code for regular
and rescuer workers and to avoid worker->pool being accessed inadvertently
when the worker has been detached, worker->pool is reset to NULL when
detached, whether or not the worker is a rescuer.

To keep worker->pool being reset after detachment, move the code
"worker->pool = NULL;" into the worker thread context after detaching.

This happens either in the regular worker thread context after PF_WQ_WORKER
is cleared or in the rescuer worker thread context with wq_pool_attach_mutex
held, so it is safe to do so.

Cc: Marc Hartmayer <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Reported-by: Marc Hartmayer <[email protected]>
Fixes: f4b7b53c94af ("workqueue: Detach workers directly in idle_cull_fn()")
Cc: [email protected] # v6.11+
Signed-off-by: Lai Jiangshan <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


# b4722b85 11-Sep-2024 Baoquan He <[email protected]>

kernel/workqueue.c: fix DEFINE_PER_CPU_SHARED_ALIGNED expansion

'make tags' always produces the annoying warnings below:

ctags: Warning: kernel/workqueue.c:470: null expansion of name pattern "\1"
ctags: Warning: kernel/workqueue.c:474: null expansion of name pattern "\1"
ctags: Warning: kernel/workqueue.c:478: null expansion of name pattern "\1"

In commit 25528213fe9f ("tags: Fix DEFINE_PER_CPU expansions"), code in
several places was adjusted, including the cpu_worker_pools definition. I noticed
that in commit 4cb1ef64609f ("workqueue: Implement BH workqueues to eventually
replace tasklets"), the cpu_worker_pools definition was unfolded back. I am not
sure whether this was done intentionally or by oversight.

Make a change to mute these warnings specifically.

Signed-off-by: Baoquan He <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


Revision tags: v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4
# 84c425be 15-Aug-2024 Sergey Senozhatsky <[email protected]>

workqueue: fix null-ptr-deref on __alloc_workqueue() error

wq->lockdep_map is set only after __alloc_workqueue()
successfully returns. However, on its error path
__alloc_workqueue() may call destroy_workqueue() which
expects wq->lockdep_map to be already set, which results
in a null-ptr-deref in touch_wq_lockdep_map().

Add a simple NULL-check to touch_wq_lockdep_map().

Oops: general protection fault, probably for non-canonical address
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
RIP: 0010:__lock_acquire+0x81/0x7800
[..]
Call Trace:
<TASK>
? __die_body+0x66/0xb0
? die_addr+0xb2/0xe0
? exc_general_protection+0x300/0x470
? asm_exc_general_protection+0x22/0x30
? __lock_acquire+0x81/0x7800
? mark_lock+0x94/0x330
? __lock_acquire+0x12fd/0x7800
? __lock_acquire+0x3439/0x7800
lock_acquire+0x14c/0x3e0
? __flush_workqueue+0x167/0x13a0
? __init_swait_queue_head+0xaf/0x150
? __flush_workqueue+0x167/0x13a0
__flush_workqueue+0x17d/0x13a0
? __flush_workqueue+0x167/0x13a0
? lock_release+0x50f/0x830
? drain_workqueue+0x94/0x300
drain_workqueue+0xe3/0x300
destroy_workqueue+0xac/0xc40
? workqueue_sysfs_register+0x159/0x2f0
__alloc_workqueue+0x1506/0x1760
alloc_workqueue+0x61/0x150
...

Signed-off-by: Sergey Senozhatsky <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


# 9b59a85a 20-Aug-2024 Matthew Brost <[email protected]>

workqueue: Don't call va_start / va_end twice

Calling va_start / va_end multiple times is undefined and causes
problems with certain compilers / platforms.

Change alloc_ordered_workqueue_lockdep_map to a macro and update
__alloc_workqueue to take a va_list argument.
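
The shape of that change, one variadic entry point that starts the argument list once, an internal helper that takes a va_list, and a macro rather than a second variadic function for the "ordered" variant, might look like the userspace sketch below. All names here (struct named, FLAG_ORDERED, init_named_va(), init_named(), init_named_ordered()) are hypothetical and only illustrate the pattern.

```c
#include <stdarg.h>
#include <stdio.h>

#define FLAG_ORDERED 0x1u	/* hypothetical flag bit */

struct named {
	unsigned int flags;
	char name[24];
};

/* Internal helper: consumes a va_list that the caller already started. */
static int init_named_va(struct named *n, unsigned int flags,
			 const char *fmt, va_list args)
{
	n->flags = flags;
	return vsnprintf(n->name, sizeof(n->name), fmt, args);
}

/* The only variadic function: va_start/va_end happen exactly once here. */
static int init_named(struct named *n, unsigned int flags,
		      const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = init_named_va(n, flags, fmt, args);
	va_end(args);
	return len;
}

/*
 * The "ordered" variant is a macro, so the format string and its arguments
 * are forwarded textually instead of being re-captured in a second
 * variadic function.
 */
#define init_named_ordered(n, flags, ...) \
	init_named((n), (flags) | FLAG_ORDERED, __VA_ARGS__)
```

Because the macro expands to a direct call, the argument list is started in exactly one place no matter which entry point the caller uses.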

Cc: Sergey Senozhatsky <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


Revision tags: v6.11-rc3
# ec0a7d44 09-Aug-2024 Matthew Brost <[email protected]>

workqueue: Add interface for user-defined workqueue lockdep map

Add an interface for a user-defined workqueue lockdep map, which is
helpful when multiple workqueues are created for the same purpose. This
also helps avoid leaking lockdep maps on each workqueue creation.

v2:
- Add alloc_workqueue_lockdep_map (Tejun)
v3:
- Drop __WQ_USER_OWNED_LOCKDEP (Tejun)
- static inline alloc_ordered_workqueue_lockdep_map (Tejun)

Cc: Tejun Heo <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


# 4f022f43 09-Aug-2024 Matthew Brost <[email protected]>

workqueue: Change workqueue lockdep map to pointer

Will help enable user-defined lockdep maps for workqueues.

Cc: Tejun Heo <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


# b188c57a 09-Aug-2024 Matthew Brost <[email protected]>

workqueue: Split alloc_workqueue into internal function and lockdep init

Will help enable user-defined lockdep maps for workqueues.

Cc: Tejun Heo <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


# 073107b3 06-Aug-2024 Sangmoon Kim <[email protected]>

workqueue: add cmdline parameter workqueue.panic_on_stall

When we want to debug the workqueue stall, we can immediately make
a panic to get the information we want.

In some systems, it may be necessary to quickly reboot the system to
escape from a workqueue lockup situation. In this case, we can control
the number of stall detections to generate panic.

workqueue.panic_on_stall sets the number of detected stalls that triggers
a panic. 0 disables the panic on stall.
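
The counting behaviour described above can be modelled in a few lines of userspace C. This is a sketch under assumptions, not the kernel implementation: struct stall_state and note_stall() are hypothetical, and the function returns a flag where the kernel would actually panic.

```c
#include <stdbool.h>

/* Hypothetical model of the parameter and its running counter. */
struct stall_state {
	unsigned int panic_on_stall;	/* threshold; 0 disables the feature */
	unsigned int stall_count;	/* stalls detected so far */
};

/*
 * Record one detected stall. Returns true when the threshold has been
 * reached, which is the point where a panic would be triggered.
 */
static bool note_stall(struct stall_state *s)
{
	if (!s->panic_on_stall)
		return false;

	s->stall_count++;
	return s->stall_count >= s->panic_on_stall;
}
```

With a threshold of 3, the first two detected stalls only increment the counter and the third reports that the limit was hit.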

Signed-off-by: Sangmoon Kim <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


# c4c8f369 05-Aug-2024 Uros Bizjak <[email protected]>

workqueue: Correct declaration of cpu_pwq in struct workqueue_struct

cpu_pwq is used in various percpu functions that expect variable in
__percpu address space. Correct the declaration of cpu_pwq to

struct pool_workqueue __rcu * __percpu *cpu_pwq

to declare the variable as __percpu pointer.

The patch also fixes following sparse errors:

workqueue.c:380:37: warning: duplicate [noderef]
workqueue.c:380:37: error: multiple address spaces given: __rcu & __percpu
workqueue.c:2271:15: error: incompatible types in comparison expression (different address spaces):
workqueue.c:2271:15: struct pool_workqueue [noderef] __rcu *
workqueue.c:2271:15: struct pool_workqueue [noderef] __percpu *

and uncovers a couple of existing "incorrect type in assignment"
warnings (from __rcu address space), which this patch does not address.

Found by GCC's named address space checks.

There were no changes in the resulting object files.

Signed-off-by: Uros Bizjak <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


# 8bc35475 05-Aug-2024 Tejun Heo <[email protected]>

workqueue: Fix spurious data race in __flush_work()

When flushing a work item for cancellation, __flush_work() knows that it
exclusively owns the work item through its PENDING bit. 134874e2eee9
("workqueue: Allow cancel_work_sync() and disable_work() from atomic
contexts on BH work items") added a read of @work->data to determine whether
to use busy wait for BH work items that are being canceled. While the read
is safe when @from_cancel, @work->data was read before testing @from_cancel
to simplify code structure:

data = *work_data_bits(work);
if (from_cancel &&
!WARN_ON_ONCE(data & WORK_STRUCT_PWQ) && (data & WORK_OFFQ_BH)) {

While the read data was never used if !@from_cancel, this could trigger
KCSAN data race detection spuriously:

==================================================================
BUG: KCSAN: data-race in __flush_work / __flush_work

write to 0xffff8881223aa3e8 of 8 bytes by task 3998 on cpu 0:
instrument_write include/linux/instrumented.h:41 [inline]
___set_bit include/asm-generic/bitops/instrumented-non-atomic.h:28 [inline]
insert_wq_barrier kernel/workqueue.c:3790 [inline]
start_flush_work kernel/workqueue.c:4142 [inline]
__flush_work+0x30b/0x570 kernel/workqueue.c:4178
flush_work kernel/workqueue.c:4229 [inline]
...

read to 0xffff8881223aa3e8 of 8 bytes by task 50 on cpu 1:
__flush_work+0x42a/0x570 kernel/workqueue.c:4188
flush_work kernel/workqueue.c:4229 [inline]
flush_delayed_work+0x66/0x70 kernel/workqueue.c:4251
...

value changed: 0x0000000000400000 -> 0xffff88810006c00d

Reorganize the code so that @from_cancel is tested before @work->data is
accessed. The only problem is triggering KCSAN detection spuriously. This
shouldn't need READ_ONCE() or other access qualifiers.
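
The reordering amounts to guarding the read with the flag instead of reading first and testing afterwards. A simplified userspace sketch follows. The names struct work, OFFQ_BH_FLAG, read_data() and should_busy_wait() are hypothetical, the read counter exists only so the effect can be observed, and the WARN_ON_ONCE() check from the original snippet is left out.

```c
#include <stdbool.h>

#define OFFQ_BH_FLAG 0x1UL	/* hypothetical stand-in for the BH flag bit */

struct work {
	unsigned long data;
};

static int data_reads;		/* instrumentation: counts reads of ->data */

static unsigned long read_data(const struct work *w)
{
	data_reads++;
	return w->data;
}

/*
 * Test from_cancel first, so ->data is only read when the caller is
 * known to own the work item exclusively.
 */
static bool should_busy_wait(const struct work *w, bool from_cancel)
{
	if (from_cancel) {
		unsigned long data = read_data(w);

		if (data & OFFQ_BH_FLAG)
			return true;
	}
	return false;
}
```

On the plain flush path (from_cancel false) the function now returns without touching the shared word at all, which is the access a race detector would otherwise flag.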

No functional changes.

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: [email protected]
Fixes: 134874e2eee9 ("workqueue: Allow cancel_work_sync() and disable_work() from atomic contexts on BH work items")
Link: http://lkml.kernel.org/r/[email protected]
Cc: Jens Axboe <[email protected]>


Revision tags: v6.11-rc2, v6.11-rc1
# 98cc1730 25-Jul-2024 Lai Jiangshan <[email protected]>

workqueue: Remove incorrect "WARN_ON_ONCE(!list_empty(&worker->entry));" from dying worker

The commit 68f83057b913 ("workqueue: Reap workers via kthread_stop()
and remove detach_completion") changes the procedure of destroying
workers; the dying workers are kept in the cull_list in wake_dying_workers()
with the pool lock held and removed from the cull_list by the newly
added reap_dying_workers() without the pool lock.

This can cause a warning if the dying worker is woken up before it is
reaped, as reported by Marc:

2024/07/23 18:01:21 [M83LP63]: [ 157.267727] ------------[ cut here ]------------
2024/07/23 18:01:21 [M83LP63]: [ 157.267735] WARNING: CPU: 21 PID: 725 at kernel/workqueue.c:3340 worker_thread+0x54e/0x558
2024/07/23 18:01:21 [M83LP63]: [ 157.267746] Modules linked in: binfmt_misc nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables sunrpc dm_service_time s390_trng vfio_ccw mdev vfio_iommu_type1 vfio sch_fq_codel
2024/07/23 18:01:21 [M83LP63]: loop dm_multipath configfs nfnetlink lcs ctcm fsm zfcp scsi_transport_fc ghash_s390 prng chacha_s390 libchacha aes_s390 des_s390 libdes sha3_512_s390 sha3_256_s390 sha512_s390 sha256_s390 sha1_s390 sha_common scm_block eadm_sch scsi_dh_rdac scsi_dh_emc scsi_dh_alua pkey zcrypt rng_core autofs4
2024/07/23 18:01:21 [M83LP63]: [ 157.267792] CPU: 21 PID: 725 Comm: kworker/dying Not tainted 6.10.0-rc2-00239-g68f83057b913 #95
2024/07/23 18:01:21 [M83LP63]: [ 157.267796] Hardware name: IBM 3906 M04 704 (LPAR)
2024/07/23 18:01:21 [M83LP63]: [ 157.267802] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
2024/07/23 18:01:21 [M83LP63]: [ 157.267797] Krnl PSW : 0704d00180000000 000003d600fcd9fa (worker_thread+0x552/0x558)
2024/07/23 18:01:21 [M83LP63]: [ 157.267806] Krnl GPRS: 6479696e6700776f 000002c901b62780 000003d602493ec8 000002c914954600
2024/07/23 18:01:21 [M83LP63]: [ 157.267809] 0000000000000000 0000000000000008 000002c901a85400 000002c90719e840
2024/07/23 18:01:21 [M83LP63]: [ 157.267811] 000002c90719e880 000002c901a85420 000002c91127adf0 000002c901a85400
2024/07/23 18:01:21 [M83LP63]: [ 157.267813] 000002c914954600 0000000000000000 000003d600fcd772 000003560452bd98
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] Krnl Code: 000003d600fcd9ec: c0e500674262 brasl %r14,000003d601cb5eb0
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] 000003d600fcd9f2: a7f4ffc8 brc 15,000003d600fcd982
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] #000003d600fcd9f6: af000000 mc 0,0
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] >000003d600fcd9fa: a7f4fec2 brc 15,000003d600fcd77e
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] 000003d600fcd9fe: 0707 bcr 0,%r7
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] 000003d600fcda00: c00400682e10 brcl 0,000003d601cd3620
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] 000003d600fcda06: eb7ff0500024 stmg %r7,%r15,80(%r15)
2024/07/23 18:01:21 [M83LP63]: [ 157.267822] 000003d600fcda0c: b90400ef lgr %r14,%r15
2024/07/23 18:01:21 [M83LP63]: [ 157.267853] Call Trace:
2024/07/23 18:01:21 [M83LP63]: [ 157.267855] [<000003d600fcd9fa>] worker_thread+0x552/0x558
2024/07/23 18:01:21 [M83LP63]: [ 157.267859] ([<000003d600fcd772>] worker_thread+0x2ca/0x558)
2024/07/23 18:01:21 [M83LP63]: [ 157.267862] [<000003d600fd6c80>] kthread+0x120/0x128
2024/07/23 18:01:21 [M83LP63]: [ 157.267865] [<000003d600f5305c>] __ret_from_fork+0x3c/0x58
2024/07/23 18:01:21 [M83LP63]: [ 157.267868] [<000003d601cc746a>] ret_from_fork+0xa/0x30
2024/07/23 18:01:21 [M83LP63]: [ 157.267873] Last Breaking-Event-Address:
2024/07/23 18:01:21 [M83LP63]: [ 157.267874] [<000003d600fcd778>] worker_thread+0x2d0/0x558

Since the procedure of destroying workers is changed, the WARN_ON_ONCE()
becomes incorrect and should be removed.

Cc: Marc Hartmayer <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Reported-by: Marc Hartmayer <[email protected]>
Fixes: 68f83057b913 ("workqueue: Reap workers via kthread_stop() and remove detach_completion")
Cc: [email protected] # v6.11+
Signed-off-by: Lai Jiangshan <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


# 38f7e145 30-Jul-2024 Will Deacon <[email protected]>

workqueue: Fix UBSAN 'subtraction overflow' error in shift_and_mask()

UBSAN reports the following 'subtraction overflow' error when booting
in a virtual machine on Android:

| Internal error: UBSAN: integer subtraction overflow: 00000000f2005515 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.10.0-00006-g3cbe9e5abd46-dirty #4
| Hardware name: linux,dummy-virt (DT)
| pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : cancel_delayed_work+0x34/0x44
| lr : cancel_delayed_work+0x2c/0x44
| sp : ffff80008002ba60
| x29: ffff80008002ba60 x28: 0000000000000000 x27: 0000000000000000
| x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
| x23: 0000000000000000 x22: 0000000000000000 x21: ffff1f65014cd3c0
| x20: ffffc0e84c9d0da0 x19: ffffc0e84cab3558 x18: ffff800080009058
| x17: 00000000247ee1f8 x16: 00000000247ee1f8 x15: 00000000bdcb279d
| x14: 0000000000000001 x13: 0000000000000075 x12: 00000a0000000000
| x11: ffff1f6501499018 x10: 00984901651fffff x9 : ffff5e7cc35af000
| x8 : 0000000000000001 x7 : 3d4d455453595342 x6 : 000000004e514553
| x5 : ffff1f6501499265 x4 : ffff1f650ff60b10 x3 : 0000000000000620
| x2 : ffff80008002ba78 x1 : 0000000000000000 x0 : 0000000000000000
| Call trace:
| cancel_delayed_work+0x34/0x44
| deferred_probe_extend_timeout+0x20/0x70
| driver_register+0xa8/0x110
| __platform_driver_register+0x28/0x3c
| syscon_init+0x24/0x38
| do_one_initcall+0xe4/0x338
| do_initcall_level+0xac/0x178
| do_initcalls+0x5c/0xa0
| do_basic_setup+0x20/0x30
| kernel_init_freeable+0x8c/0xf8
| kernel_init+0x28/0x1b4
| ret_from_fork+0x10/0x20
| Code: f9000fbf 97fffa2f 39400268 37100048 (d42aa2a0)
| ---[ end trace 0000000000000000 ]---
| Kernel panic - not syncing: UBSAN: integer subtraction overflow: Fatal exception

This is due to shift_and_mask() using a signed immediate to construct
the mask and being called with a shift of 31 (WORK_OFFQ_POOL_SHIFT) so
that it ends up decrementing from INT_MIN.

Use an unsigned constant '1U' to generate the mask in shift_and_mask().
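
The effect of the unsigned constant can be illustrated with a minimal userspace sketch. The helper name `extract_field()` and its parameter names are hypothetical and chosen for illustration only; this is not the kernel's function body. It assumes the field width is less than 32.

```c
#include <stdint.h>

/*
 * Hypothetical userspace helper (not the kernel code): return `bits`
 * bits of `v`, starting at bit position `shift`. Assumes bits < 32.
 */
static unsigned long extract_field(unsigned long v, uint32_t shift, uint32_t bits)
{
	/*
	 * With a signed immediate, (1 << 31) lands on INT_MIN and the
	 * following "- 1" is a signed overflow. Using 1U keeps both the
	 * shift and the subtraction in unsigned arithmetic, so a width of
	 * 31 yields the mask 0x7fffffff.
	 */
	return (v >> shift) & ((1U << bits) - 1);
}
```

With a width of 31, the mask is built as `(1U << 31) - 1`, which is well defined for an unsigned operand.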

Cc: Tejun Heo <[email protected]>
Cc: Lai Jiangshan <[email protected]>
Fixes: 1211f3b21c2a ("workqueue: Preserve OFFQ bits in cancel[_sync] paths")
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>



# aa868475 15-Jul-2024 Lai Jiangshan <[email protected]>

workqueue: Remove unneeded lockdep_assert_cpus_held()

Commit 19af45757383 ("workqueue: Remove cpus_read_lock() from
apply_wqattrs_lock()") removed the unneeded cpus_read_lock() after the pwq
creations and installations were reworked to be based on wq_online_cpumask
rather than cpu_online_mask, which made cpus_read_lock() unnecessary during
wqattrs changes.

But it doesn't remove the lockdep_assert_cpus_held() checks during wqattrs
changes, which leads to complaints from lockdep reported by the kernel test
robot:

[ 15.726567][ T131] ------------[ cut here ]------------
[ 15.728117][ T131] WARNING: CPU: 1 PID: 131 at kernel/cpu.c:525 lockdep_assert_cpus_held (kernel/cpu.c:525)
[ 15.731191][ T131] Modules linked in: floppy(+) parport_pc(+) parport qemu_fw_cfg rtc_cmos
[ 15.733423][ T131] CPU: 1 PID: 131 Comm: systemd-udevd Tainted: G T 6.10.0-rc2-00254-g19af45757383 #1 df6f039f42e8818bf9a534449362ebad1aad32e2
[ 15.737011][ T131] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 15.739760][ T131] EIP: lockdep_assert_cpus_held (kernel/cpu.c:525)
[ 15.741326][ T131] Code: 97 c2 03 72 20 83 3d f4 73 97 c2 00 74 17 55 89 e5 b8 fc bd 4d c2 ba ff ff ff ff e8 e4 57 d1 00 85 c0 74 06 5d 31 c0 31 d2 c3 <0f> 0b eb f6 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55 89 e5 b8

Fix it by removing the unneeded lockdep_assert_cpus_held().
Also remove the unneeded cpus_read_lock() from wq_affn_dfl_set().

tj: Dropped the removal of cpus_read_lock/unlock() in wq_affn_dfl_set() to
keep this patch fix only.

Cc: kernel test robot <[email protected]>
Fixes: 19af45757383("workqueue: Remove cpus_read_lock() from apply_wqattrs_lock()")
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-lkp/[email protected]
Signed-off-by: Lai Jiangshan <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>



Revision tags: v6.10, v6.10-rc7
# 58629d48 03-Jul-2024 Lai Jiangshan <[email protected]>

workqueue: Always queue work items to the newest PWQ for order workqueues

To ensure non-reentrancy, __queue_work() attempts to enqueue a work
item to the pool of the currently executing worker. This is not only
unnecessary for an ordered workqueue, where order inherently suggests
non-reentrancy, but it could also disrupt the sequence if the item is
not enqueued on the newest PWQ.

Just queue it to the newest PWQ and let order management guarantee
non-reentrancy.

Signed-off-by: Lai Jiangshan <[email protected]>
Fixes: 4c065dbce1e8 ("workqueue: Enable unbound cpumask update on ordered workqueues")
Cc: [email protected] # v6.9+
Signed-off-by: Tejun Heo <[email protected]>
(cherry picked from commit 74347be3edfd11277799242766edf844c43dd5d3)



# b2b1f933 11-Jul-2024 Lai Jiangshan <[email protected]>

workqueue: Rename wq_update_pod() to unbound_wq_update_pwq()

What wq_update_pod() does is just to update the pwq of the specific
cpu. Rename it and update the comments.

Signed-off-by: Lai Jiangshan <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>



# d160a58d 11-Jul-2024 Lai Jiangshan <[email protected]>

workqueue: Remove the arguments @hotplug_cpu and @online from wq_update_pod()

The arguments @hotplug_cpu and @online are not used in wq_update_pod()
since the functions called by wq_update_pod() don't need them.

Signed-off-by: Lai Jiangshan <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>


