|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5 |
|
| #
d6834d9c |
| 21-Oct-2024 |
Li Huafei <[email protected]> |
watchdog/hardlockup/perf: Fix perf_event memory leak
During stress-testing, we found a kmemleak report for perf_event:
unreferenced object 0xff110001410a33e0 (size 1328):
  comm "kworker/4:11", pid 288, jiffies 4294916004
  hex dump (first 32 bytes):
    b8 be c2 3b 02 00 11 ff 22 01 00 00 00 00 ad de  ...;....".......
    f0 33 0a 41 01 00 11 ff f0 33 0a 41 01 00 11 ff  .3.A.....3.A....
  backtrace (crc 24eb7b3a):
    [<00000000e211b653>] kmem_cache_alloc_node_noprof+0x269/0x2e0
    [<000000009d0985fa>] perf_event_alloc+0x5f/0xcf0
    [<00000000084ad4a2>] perf_event_create_kernel_counter+0x38/0x1b0
    [<00000000fde96401>] hardlockup_detector_event_create+0x50/0xe0
    [<0000000051183158>] watchdog_hardlockup_enable+0x17/0x70
    [<00000000ac89727f>] softlockup_start_fn+0x15/0x40
    ...
Our stress test includes CPU online and offline cycles, and updating the watchdog configuration.
After reading the code, I found that there may be a race between cleaning up perf_event after updating watchdog and disabling event when the CPU goes offline:
CPU0                          CPU1                      CPU2
(update watchdog)                                       (hotplug offline CPU1)

...                                                     _cpu_down(CPU1)
cpus_read_lock()                                        // waiting for cpu lock
  softlockup_start_all
    smp_call_on_cpu(CPU1)
                              softlockup_start_fn
                              ...
                                watchdog_hardlockup_enable(CPU1)
                                  perf create E1
                                  watchdog_ev[CPU1] = E1
cpus_read_unlock()
                                                        cpus_write_lock()
                                                        cpuhp_kick_ap_work(CPU1)
                              cpuhp_thread_fun
                              ...
                                watchdog_hardlockup_disable(CPU1)
                                  watchdog_ev[CPU1] = NULL
                                  dead_event[CPU1] = E1
__lockup_detector_cleanup
  for each dead_events_mask
    release each dead_event
    /*
     * CPU1 has not been added to
     * dead_events_mask, then E1
     * will not be released
     */
                                CPU1 -> dead_events_mask
  cpumask_clear(&dead_events_mask)
  // dead_events_mask is cleared, E1 is leaked
In this case, the leaked perf_event E1 matches the perf_event leak reported by kmemleak. Because the problem recurs with low probability (it was reported only once), I added some hack delays to the code:
 static void __lockup_detector_reconfigure(void)
 {
 	...
 	watchdog_hardlockup_start();
 	cpus_read_unlock();
+	mdelay(100);
 	/*
 	 * Must be called outside the cpus locked section to prevent
 	 * recursive locking in the perf code.
 	...
 }

 void watchdog_hardlockup_disable(unsigned int cpu)
 {
 	...
 	perf_event_disable(event);
 	this_cpu_write(watchdog_ev, NULL);
 	this_cpu_write(dead_event, event);
+	mdelay(100);
 	cpumask_set_cpu(smp_processor_id(), &dead_events_mask);
 	atomic_dec(&watchdog_cpus);
 	...
 }

 void hardlockup_detector_perf_cleanup(void)
 {
 	...
 		perf_event_release_kernel(event);
 		per_cpu(dead_event, cpu) = NULL;
 	}
+	mdelay(100);
 	cpumask_clear(&dead_events_mask);
 }
Then, simultaneously performing CPU on/off cycles and switching the watchdog, the leak reproduces almost every time.
The problem here is that releasing perf_event is not within the CPU hotplug read-write lock. Commit:
941154bd6937 ("watchdog/hardlockup/perf: Prevent CPU hotplug deadlock")
introduced deferred release to solve the deadlock caused by calling get_online_cpus() when releasing perf_event. Later, commit:
efe951d3de91 ("perf/x86: Fix perf,x86,cpuhp deadlock")
removed the get_online_cpus() call on the perf_event release path to solve another deadlock problem.
Therefore, it is now possible to move the release of perf_event back into the CPU hotplug read-write lock, and release the event immediately after disabling it.
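A sketch of the resulting shape, using the functions named above (the actual patch may differ in detail):

	void watchdog_hardlockup_disable(unsigned int cpu)
	{
		struct perf_event *event = this_cpu_read(watchdog_ev);

		if (event) {
			perf_event_disable(event);
			this_cpu_write(watchdog_ev, NULL);
			/*
			 * Safe since efe951d3de91: perf_event_release_kernel()
			 * no longer takes the CPU hotplug lock, so the event
			 * can be freed right here instead of being parked on
			 * dead_event/dead_events_mask for deferred cleanup.
			 */
			perf_event_release_kernel(event);
			atomic_dec(&watchdog_cpus);
		}
	}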
Fixes: 941154bd6937 ("watchdog/hardlockup/perf: Prevent CPU hotplug deadlock") Signed-off-by: Li Huafei <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected]
|
| #
7c0db8a4 |
| 17-Jan-2025 |
Hamza Mahfooz <[email protected]> |
cpu: export lockdep_assert_cpus_held()
If CONFIG_HYPERV=m, lockdep_assert_cpus_held() is undefined for HyperV. So, export the function so that GPL drivers can use it more broadly.
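The change itself is a one-line export; a sketch, with the helper body as it appears in kernel/cpu.c at the time of writing:

	void lockdep_assert_cpus_held(void)
	{
		/* Early boot paths legitimately run before hotplug is possible */
		if (system_state < SYSTEM_RUNNING)
			return;

		percpu_rwsem_assert_held(&cpu_hotplug_lock);
	}
	+EXPORT_SYMBOL_GPL(lockdep_assert_cpus_held);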
Cc: Michael Kelley <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Reviewed-by: Michael Kelley <[email protected]> Tested-by: Michael Kelley <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Wei Liu <[email protected]> Message-ID: <[email protected]>
|
| #
2f8dea16 |
| 20-Dec-2024 |
Koichiro Den <[email protected]> |
hrtimers: Handle CPU state correctly on hotplug
Consider a scenario where a CPU transitions from CPUHP_ONLINE to halfway through a CPU hotunplug down to CPUHP_HRTIMERS_PREPARE, and then back to CPUHP_ONLINE:
Since hrtimers_prepare_cpu() does not run, cpu_base.hres_active remains set to 1 throughout. However, during a CPU unplug operation, the tick and the clockevents are shut down at CPUHP_AP_TICK_DYING. On return to the online state, CFS, for instance, incorrectly assumes that the hrtick is already active, and the clockevent device also permanently loses its chance to transition to oneshot mode, unless the CPU first goes back to a state below CPUHP_HRTIMERS_PREPARE.
This round-trip reveals another issue: cpu_base.online is not set to 1 after the transition, which triggers the WARN_ON_ONCE in enqueue_hrtimer().
Aside from that, the bulk of the per-CPU state is not reset either, which means there are dangling pointers in the worst case.
Address this by adding a corresponding startup() callback, which resets the stale per CPU state and sets the online flag.
[ tglx: Make the new callback unconditionally available, remove the online modification in the prepare() callback and clear the remaining state in the starting callback instead of the prepare callback ]
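A sketch of the fix, assuming the new callback is wired up as the startup() counterpart of the existing dying() callback (the exact fields reset here are illustrative):

	[CPUHP_AP_HRTIMERS_DYING] = {
		.name		 = "hrtimers:dying",
		.startup.single	 = hrtimers_cpu_starting,	/* new */
		.teardown.single = hrtimers_cpu_dying,
	},

	int hrtimers_cpu_starting(unsigned int cpu)
	{
		struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);

		/* Clear stale state left behind by an aborted offline */
		cpu_base->expires_next = KTIME_MAX;
		cpu_base->online = 1;
		return 0;
	}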
Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier") Signed-off-by: Koichiro Den <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/all/[email protected]
|
| #
21641bd9 |
| 04-Nov-2024 |
Nicholas Piggin <[email protected]> |
lazy tlb: fix hotplug exit race with MMU_LAZY_TLB_SHOOTDOWN
CPU unplug first calls __cpu_disable(), and that's where powerpc calls cleanup_cpu_mmu_context(), which clears this CPU from mm_cpumask() of all mms in the system.
However this CPU may still be using a lazy tlb mm, whose bit for this CPU has now been cleared from its mm_cpumask. The CPU does not switch away from the lazy tlb mm until arch_cpu_idle_dead() calls idle_task_exit().
If that user mm exits in this window, it will not be subject to the lazy tlb mm shootdown and may be freed while in use as a lazy mm by the CPU that is being unplugged.
cleanup_cpu_mmu_context() could be moved later, but it looks better to move the lazy tlb mm switching earlier. The problem with doing the lazy mm switching in idle_task_exit() is explained in commit bf2c59fce4074 ("sched/core: Fix illegal RCU from offline CPUs"), which added a wart to switch away from the mm but leave it set in active_mm to be cleaned up later.
So instead, switch away from the lazy tlb mm at sched_cpu_wait_empty(), which is the last hotplug state before teardown (CPUHP_AP_SCHED_WAIT_EMPTY). This CPU will never switch to a user thread from this point, so it has no chance to pick up a new lazy tlb mm. This removes the lazy tlb mm handling wart in CPU unplug.
With this, idle_task_exit() is not needed anymore and can be cleaned up. This leaves the prototype alone, to be cleaned up after this change.
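A sketch of the idea (not the exact upstream diff) at sched_cpu_wait_empty():

	int sched_cpu_wait_empty(unsigned int cpu)
	{
		struct mm_struct *mm = current->active_mm;

		balance_hotplug_wait();

		/*
		 * Switch away from the lazy tlb mm before teardown so that a
		 * concurrent lazy tlb shootdown can no longer miss this CPU.
		 * No user thread can run here anymore, so no new lazy tlb mm
		 * can be picked up afterwards.
		 */
		if (mm != &init_mm) {
			mmgrab_lazy_tlb(&init_mm);
			current->active_mm = &init_mm;
			switch_mm(mm, &init_mm, current);
			finish_arch_post_lock_switch();
			mmdrop_lazy_tlb(mm);
		}
		return 0;
	}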
herton: took the suggestions from https://lore.kernel.org/all/87jzvyprsw.ffs@tglx/ and made adjustments on the initial patch proposed by Nicholas.
Link: https://lkml.kernel.org/r/[email protected] Link: https://lore.kernel.org/all/[email protected]/ Link: https://lkml.kernel.org/r/[email protected] Fixes: 2655421ae69f ("lazy tlb: shoot lazies, non-refcounting lazy tlb mm reference handling scheme") Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Herton R. Krzesinski <[email protected]> Suggested-by: Thomas Gleixner <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Michael Ellerman <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
|
| #
7f15d4ab |
| 20-Dec-2024 |
Dr. David Alan Gilbert <[email protected]> |
cpu: Remove unused init_cpu_online
The last use of init_cpu_online() was removed by commit cf8e8658100d ("arch: Remove Itanium (IA-64) architecture").
Remove it.
Signed-off-by: Dr. David Alan Gilbert <[email protected]> Signed-off-by: Yury Norov <[email protected]>
|
| #
e7240bd9 |
| 18-Nov-2024 |
Thomas Weißschuh <[email protected]> |
cpu: Remove spurious NULL in attribute_group definition
This NULL value is most likely a copy-paste error from an array definition. The NULL doesn't have any effect.
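For illustration, the pattern being removed looks like this (struct and field names hypothetical):

	static struct attribute_group cpu_example_attr_group = {
		.attrs = cpu_example_attrs,
	-	NULL	/* positional initializer: zero-initializes the next
			 * member, which is already zero - i.e. no effect */
	};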
Signed-off-by: Thomas Weißschuh <[email protected]> Link: https://lore.kernel.org/r/20241118-sysfs-const-attribute_group-fixes-v1-3-48e0b0ad8cba@weissschuh.net Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
| #
3b1596a2 |
| 29-Oct-2024 |
Frederic Weisbecker <[email protected]> |
clockevents: Shutdown and unregister current clockevents at CPUHP_AP_TICK_DYING
The way the clockevent devices are finally stopped while a CPU is offlining is currently chaotic. In order, the sequence is:
1) tick_sched_timer_dying() stops the tick and the underlying clockevent, but only for the oneshot case. The periodic tick and its related clockevent still run.
2) tick_broadcast_offline() detaches and stops the per-cpu oneshot broadcast device and appends it to the released list.
3) Some individual clockevent drivers stop the clockevents (a second time if the tick is oneshot).
4) Once the CPU is dead, a control CPU remotely detaches and stops (a third time in oneshot mode) the CPU's clockevent and adds it to the released list.
5) The released list, containing the broadcast device released in step 2) and the remotely detached clockevent from step 4), is unregistered.
These random events can be factorized if the current clockevent is detached and stopped at the generic layer by the dying CPU itself:
a) Stop the tick
b) Stop/detach the underlying per-cpu oneshot broadcast clockevent
c) Stop/detach the underlying clockevent
d) Release / unregister the clockevents from b) and c)
e) Release / unregister the remaining clockevents from the dying CPU.
   This part could be performed by the dying CPU
This way the drivers and the tick layer don't need to care about clockevent operations during cpuhotplug down. This also unifies the tick behaviour on offline CPUs between oneshot and periodic modes, avoiding offline ticks altogether for sanity.
Adopt the simplification.
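Conceptually, the dying CPU now performs steps a) to e) itself at CPUHP_AP_TICK_DYING; a pseudo-sketch (function names are illustrative, not the real tick/clockevents API):

	/* Runs on the dying CPU at CPUHP_AP_TICK_DYING (illustrative) */
	static void tick_and_clockevents_dying(void)
	{
		stop_tick();                      /* a) oneshot and periodic  */
		detach_and_stop_broadcast_dev();  /* b) per-cpu oneshot bcast */
		detach_and_stop_local_dev();      /* c) current clockevent    */
		release_and_unregister_devs();    /* d) + e) released list    */
	}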
[ tglx: Remove the WARN_ON() in clockevents_register_device() as that is called from an upcoming CPU before the CPU is marked online ]
Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected]
|
|
Revision tags: v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1 |
|
| #
0784181b |
| 26-Sep-2024 |
David Woodhouse <[email protected]> |
lockdep: Add lockdep_cleanup_dead_cpu()
Add a function to check that an offline CPU has left the tracing infrastructure in a sane state.
Commit 9bb69ba4c177 ("ACPI: processor_idle: use raw_safe_halt() in acpi_idle_play_dead()") fixed an issue where the acpi_idle_play_dead() function called safe_halt() instead of raw_safe_halt(), which had the side-effect of setting the hardirqs_enabled flag for the offline CPU.
On x86 this triggered warnings from lockdep_assert_irqs_disabled() when the CPU was brought back online again later. These warnings were too early for the exception to be handled correctly, leading to a triple-fault.
Add lockdep_cleanup_dead_cpu() to check for this kind of failure mode, print the events leading up to it, and correct it so that the CPU can come online again correctly. Re-introducing the original bug now merely results in this warning instead:
[   61.556652] smpboot: CPU 1 is now offline
[   61.556769] CPU 1 left hardirqs enabled!
[   61.556915] irq event stamp: 128149
[   61.556965] hardirqs last enabled at (128149): [<ffffffff81720a36>] acpi_idle_play_dead+0x46/0x70
[   61.557055] hardirqs last disabled at (128148): [<ffffffff81124d50>] do_idle+0x90/0xe0
[   61.557117] softirqs last enabled at (128078): [<ffffffff81cec74c>] __do_softirq+0x31c/0x423
[   61.557199] softirqs last disabled at (128065): [<ffffffff810baae1>] __irq_exit_rcu+0x91/0x100
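A sketch of the check (close to, but not guaranteed identical to, the upstream helper):

	void lockdep_cleanup_dead_cpu(unsigned int cpu, struct task_struct *idle)
	{
		if (unlikely(!debug_locks))
			return;

		if (unlikely(per_cpu(hardirqs_enabled, cpu))) {
			pr_warn("CPU %u left hardirqs enabled!\n", cpu);
			/* Dump the irq events recorded against the idle task */
			print_irqtrace_events(idle);
			/* Correct the state so the CPU can come online again */
			per_cpu(hardirqs_enabled, cpu) = 0;
		}
	}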
[boqun: Capitalize the title and reword the message a bit]
Signed-off-by: David Woodhouse <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Boqun Feng <[email protected]> Link: https://lore.kernel.org/r/[email protected]
|
|
Revision tags: v6.11, v6.11-rc7 |
|
| #
662a1bfb |
| 04-Sep-2024 |
Anna-Maria Behnsen <[email protected]> |
cpu: Use already existing usleep_range()
usleep_range() is a wrapper around usleep_range_state() which hands in TASK_UNINTERRUPTIBLE as the state argument.
Use the already existing wrapper usleep_range(). No functional change.
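The change, in effect (argument names hypothetical; usleep_range() is defined in include/linux/delay.h as exactly this wrapper):

	-	usleep_range_state(min_us, max_us, TASK_UNINTERRUPTIBLE);
	+	usleep_range(min_us, max_us);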
Signed-off-by: Anna-Maria Behnsen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Link: https://lore.kernel.org/all/20240904-devel-anna-maria-b4-timers-flseep-v1-2-e98760256370@linutronix.de
|
|
Revision tags: v6.11-rc6 |
|
| #
8db70fae |
| 25-Aug-2024 |
Thorsten Blum <[email protected]> |
cpu: Fix W=1 build kernel-doc warning
Building the kernel with W=1 generates the following warning:
kernel/cpu.c:2693: warning: This comment starts with '/**', but isn't a kernel-doc comment.
The function topology_is_core_online() is a simple helper function and doesn't need a kernel-doc comment.
Use a normal comment instead.
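The fix is a one-character change to the comment opener (comment text abridged here):

	-/**
	+/*
	  * topology_is_core_online - ... plain comment, ignored by kernel-doc
	  */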
Signed-off-by: Thorsten Blum <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected]
|
|
Revision tags: v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2 |
|
| #
6c17ea1f |
| 31-Jul-2024 |
Nysal Jan K.A <[email protected]> |
cpu/SMT: Enable SMT only if a core is online
If a core is offline then enabling SMT should not online CPUs of this core. By enabling SMT, what is intended is either changing the SMT value from "off" to "on" or setting the SMT level (threads per core) from a lower to higher value.
On PowerPC the ppc64_cpu utility can be used, among other things, to perform the following functions:
ppc64_cpu --cores-on                # Get the number of online cores
ppc64_cpu --cores-on=X              # Put exactly X cores online
ppc64_cpu --offline-cores=X[,Y,...] # Put specified cores offline
ppc64_cpu --smt={on|off|value}      # Enable, disable or change SMT level
If the user has decided to offline certain cores, enabling SMT should not online CPUs in those cores. This patch fixes the issue and changes the behaviour as described, by introducing an arch specific function topology_is_core_online(). It is currently implemented only for PowerPC.
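A sketch of the generic fallback plus its use on the SMT-enable path (loop body simplified; the powerpc override checks whether any sibling in the CPU's core is online):

	/* Generic fallback; architectures may override via #define */
	#ifndef topology_is_core_online
	static inline bool topology_is_core_online(unsigned int cpu)
	{
		return true;
	}
	#endif

	/* In the cpuhp SMT-enable path (simplified) */
	for_each_present_cpu(cpu) {
		/* Don't online threads belonging to an offlined core */
		if (!topology_is_core_online(cpu))
			continue;
		ret = _cpu_up(cpu, 0, CPUHP_ONLINE);
	}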
Fixes: 73c58e7e1412 ("powerpc: Add HOTPLUG_SMT support") Reported-by: Tyrel Datwyler <[email protected]> Closes: https://groups.google.com/g/powerpc-utils-devel/c/wrwVzAAnRlI/m/5KJSoqP4BAAJ Signed-off-by: Nysal Jan K.A <[email protected]> Reviewed-by: Shrikanth Hegde <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
|
|
Revision tags: v6.11-rc1 |
|
| #
2dce9931 |
| 16-Jul-2024 |
Jiaxun Yang <[email protected]> |
cpu/hotplug: Provide weak fallback for arch_cpuhp_init_parallel_bringup()
CONFIG_HOTPLUG_PARALLEL expects the architecture to implement arch_cpuhp_init_parallel_bringup() to decide whether parallel hotplug is possible and to do the necessary architecture specific initialization.
There are architectures which can enable it unconditionally and do not require architecture specific initialization.
Provide a weak fallback for arch_cpuhp_init_parallel_bringup() so that such architectures are not forced to implement empty stub functions.
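The fallback itself is tiny: a weak definition that architectures can override, returning true so that parallel bringup works with no extra setup:

	bool __weak arch_cpuhp_init_parallel_bringup(void)
	{
		return true;
	}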
Signed-off-by: Jiaxun Yang <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected]
|
| #
c0e81a45 |
| 16-Jul-2024 |
Jiaxun Yang <[email protected]> |
cpu/hotplug: Make HOTPLUG_PARALLEL independent of HOTPLUG_SMT
Provide stub functions for the SMT-related parallel bringup functions so that HOTPLUG_PARALLEL can work without HOTPLUG_SMT.
Signed-off-by: Jiaxun Yang <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected]
|
|
Revision tags: v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2 |
|
| #
4e1a7df4 |
| 29-May-2024 |
James Morse <[email protected]> |
cpumask: Add enabled cpumask for present CPUs that can be brought online
The 'offline' file in sysfs shows all offline CPUs, including those that aren't present. User-space is expected to remove not-present CPUs from this list to learn which CPUs could be brought online.
CPUs can be present but not-enabled. These CPUs can't be brought online until the firmware policy changes, which comes with an ACPI notification that will register the CPUs.
With only the offline and present files, user-space is unable to determine which CPUs it can try to bring online. Add a new CPU mask that shows this based on all the registered CPUs.
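A sketch of the new mask, mirroring how the existing possible/present/online masks are declared (exact declaration and sysfs plumbing may differ):

	/* include/linux/cpumask.h (sketch) */
	extern struct cpumask __cpu_enabled_mask;
	#define cpu_enabled_mask   ((const struct cpumask *)&__cpu_enabled_mask)

	/* Set when a present CPU is registered and may be brought online */
	static inline void set_cpu_enabled(unsigned int cpu, bool enabled)
	{
		if (enabled)
			cpumask_set_cpu(cpu, &__cpu_enabled_mask);
		else
			cpumask_clear_cpu(cpu, &__cpu_enabled_mask);
	}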
Signed-off-by: James Morse <[email protected]> Tested-by: Miguel Luis <[email protected]> Tested-by: Vishnu Pajjuri <[email protected]> Tested-by: Jianyong Wu <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Signed-off-by: Russell King (Oracle) <[email protected]> Reviewed-by: Gavin Shan <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
|
| #
6ef8eb51 |
| 18-Jun-2024 |
Huacai Chen <[email protected]> |
cpu: Fix broken cmdline "nosmp" and "maxcpus=0"
After the rework of "Parallel CPU bringup", the cmdline "nosmp" and "maxcpus=0" parameters are not working anymore. These parameters set setup_max_cpus to zero and that's handed to bringup_nonboot_cpus().
The code there does a decrement before checking for zero, which brings it into the negative space and brings up all CPUs.
Add a zero check at the beginning of the function to prevent this.
[ tglx: Massaged change log ]
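The fix, in essence (the decrement referred to above happens further down, in cpuhp_bringup_mask()):

	void __init bringup_nonboot_cpus(unsigned int max_cpus)
	{
		/* "nosmp" / "maxcpus=0": bring up the boot CPU only */
		if (!max_cpus)
			return;

		/* ... existing parallel/sequential bringup ... */
	}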
Fixes: 18415f33e2ac4ab382 ("cpu/hotplug: Allow "parallel" bringup up to CPUHP_BP_KICK_AP_STATE") Fixes: 06c6796e0304234da6 ("cpu/hotplug: Fix off by one in cpuhp_bringup_mask()") Signed-off-by: Huacai Chen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected]
|
| #
66e48e49 |
| 14-Jun-2024 |
Kirill A. Shutemov <[email protected]> |
cpu/hotplug, x86/acpi: Disable CPU offlining for ACPI MADT wakeup
ACPI MADT doesn't allow offlining a CPU after it has been woken up.
Currently, CPU hotplug is prevented based on the confidential computing attribute which is set for Intel TDX. But TDX is not the only possible user of the wakeup method. Any platform that uses the ACPI MADT wakeup method cannot offline CPUs.
Disable CPU offlining on ACPI MADT wakeup enumeration.
This has no visible effects for users: currently, TDX guest is the only platform that uses the ACPI MADT wakeup method.
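A sketch of the call site in the MADT wakeup enumeration path (surrounding code elided; the parse helper's exact signature is an assumption):

	static int __init acpi_parse_mp_wake(union acpi_subtable_headers *header,
					     const unsigned long end)
	{
		/* ... record the mailbox address, install the wakeup method ... */

		/* A CPU woken via the MADT mailbox cannot be parked again */
		cpu_hotplug_disable_offlining();
		return 0;
	}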
Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Tested-by: Tao Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
|
| #
1037e4c5 |
| 14-Jun-2024 |
Kirill A. Shutemov <[email protected]> |
cpu/hotplug: Add support for declaring CPU offlining not supported
The ACPI MADT mailbox wakeup method doesn't allow offlining a CPU after it has been woken up.
Currently, offlining is prevented based on the confidential computing attribute which is set for Intel TDX. But TDX is not the only possible user of the wake up method. The MADT wakeup can be implemented outside of a confidential computing environment. Offline support is a property of the wakeup method, not the CoCo implementation.
Introduce cpu_hotplug_disable_offlining() that can be called to indicate that CPU offlining should be disabled.
This function is going to replace CC_ATTR_HOTPLUG_DISABLED for ACPI MADT wakeup method.
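A sketch of the new interface and the corresponding check on the offline path (-EOPNOTSUPP matches what the CC_ATTR_HOTPLUG_DISABLED check returns today):

	static bool cpu_hotplug_offline_disabled __ro_after_init;

	void __init cpu_hotplug_disable_offlining(void)
	{
		cpu_maps_update_begin();
		cpu_hotplug_offline_disabled = true;
		cpu_maps_update_done();
	}

	/* In cpu_down_maps_locked() or equivalent: */
	if (cpu_hotplug_offline_disabled)
		return -EOPNOTSUPP;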
Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Tested-by: Tao Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected]
|
|
Revision tags: v6.10-rc1 |
|
| #
fde78e46 |
| 24-May-2024 |
Stanislav Spassov <[email protected]> |
cpu/hotplug: Reverse order of iteration in freeze_secondary_cpus()
Whenever CPU hotplug state callbacks are registered, the startup callback is invoked on CPUs that have already reached the provided state in order of ascending CPU IDs.
In freeze_secondary_cpus() the teardown of CPUs happens in the same ascending order. This is known to make a difference in the current implementation of these callbacks in arch/x86/events/intel/uncore.c:
- uncore_event_cpu_online() designates the first CPU it is invoked for on each package as the uncore event collector for that package
- uncore_event_cpu_offline() if the CPU being offlined is the event collector for its package, transfers that responsibility over to the next (by ascending CPU id) one in the same package
With the current order of CPU teardowns in freeze_secondary_cpus(), the latter ends up doing the ownership transfer work on every single CPU. That work involves a synchronize_rcu() call, ultimately unnecessarily degrading the performance of CPU offlining.
To address this make freeze_secondary_cpus() iterate through the CPUs in reverse order, so that the teardown happens in order of descending CPU IDs.
[ tglx: Massage change log ]
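A sketch of the reversed loop (details simplified; note the signed index so the countdown can terminate):

	int cpu;
	unsigned int primary = cpumask_first(cpu_online_mask);

	/* Tear down in order of descending CPU IDs */
	for (cpu = nr_cpu_ids - 1; cpu >= 0; cpu--) {
		if (!cpu_online(cpu) || cpu == primary)
			continue;
		error = _cpu_down(cpu, 1, CPUHP_OFFLINE);
		if (error)
			break;
	}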
Signed-off-by: Stanislav Spassov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
|
| #
932d8476 |
| 15-May-2024 |
Yuntao Wang <[email protected]> |
cpu/hotplug: Fix dynstate assignment in __cpuhp_setup_state_cpuslocked()
Commit 4205e4786d0b ("cpu/hotplug: Provide dynamic range for prepare stage") added a dynamic range for the prepare states, but did not handle the assignment of the dynstate variable in __cpuhp_setup_state_cpuslocked().
This causes the corresponding startup callback not to be invoked when calling __cpuhp_setup_state_cpuslocked() with the CPUHP_BP_PREPARE_DYN parameter, even though it should be.
Currently, the users of __cpuhp_setup_state_cpuslocked(), for one reason or another, have not triggered this bug.
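The fix, in essence, extends the dynamic-state detection in __cpuhp_setup_state_cpuslocked() to the prepare range (sketch):

	-	dynstate = state == CPUHP_AP_ONLINE_DYN;
	+	dynstate = state == CPUHP_AP_ONLINE_DYN ||
	+		   state == CPUHP_BP_PREPARE_DYN;

so that the reserved dynamic state is propagated and the startup callback is invoked for CPUHP_BP_PREPARE_DYN users as well.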
Fixes: 4205e4786d0b ("cpu/hotplug: Provide dynamic range for prepare stage") Signed-off-by: Yuntao Wang <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected]
|
|
Revision tags: v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5 |
|
| #
ce0abef6 |
| 20-Apr-2024 |
Sean Christopherson <[email protected]> |
cpu: Ignore "mitigations" kernel parameter if CPU_MITIGATIONS=n
Explicitly disallow enabling mitigations at runtime for kernels that were built with CONFIG_CPU_MITIGATIONS=n, as some architectures m
cpu: Ignore "mitigations" kernel parameter if CPU_MITIGATIONS=n
Explicitly disallow enabling mitigations at runtime for kernels that were built with CONFIG_CPU_MITIGATIONS=n, as some architectures may omit code entirely if mitigations are disabled at compile time.
E.g. on x86, a large pile of Kconfigs are buried behind CPU_MITIGATIONS, and trying to provide sane behavior for retroactively enabling mitigations is extremely difficult, bordering on impossible. E.g. page table isolation and call depth tracking require build-time support, BHI mitigations will still be off without additional kernel parameters, etc.
[ bp: Touchups. ]
Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Acked-by: Borislav Petkov (AMD) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
|
| #
fe42754b |
| 20-Apr-2024 |
Sean Christopherson <[email protected]> |
cpu: Re-enable CPU mitigations by default for !X86 architectures
Rename x86's SPECULATION_MITIGATIONS to CPU_MITIGATIONS, define it in generic code, and force it on for all architectures except x86. A recent commit to turn mitigations off by default if SPECULATION_MITIGATIONS=n kinda sorta missed that "cpu_mitigations" is completely generic, whereas SPECULATION_MITIGATIONS is x86-specific.
Rename x86's SPECULATION_MITIGATIONS instead of keeping both, and have it select CPU_MITIGATIONS, as having two configs for the same thing is unnecessary and confusing. This will also allow x86 to use the knob to manage mitigations that aren't strictly related to speculative execution.
Use another Kconfig to communicate to common code that CPU_MITIGATIONS is already defined instead of having x86's menu depend on the common CPU_MITIGATIONS. This allows keeping a single point of contact for all of x86's mitigations, and it's not clear that other architectures *want* to allow disabling mitigations at compile-time.
Fixes: f337a6a21e2f ("x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n") Closes: https://lkml.kernel.org/r/20240413115324.53303a68%40canb.auug.org.au Reported-by: Stephen Rothwell <[email protected]> Reported-by: Michael Ellerman <[email protected]> Reported-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Acked-by: Borislav Petkov (AMD) <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected]
|
|
Revision tags: v6.9-rc4 |
|
| #
f337a6a2 |
| 09-Apr-2024 |
Sean Christopherson <[email protected]> |
x86/cpu: Actually turn off mitigations by default for SPECULATION_MITIGATIONS=n
Initialize cpu_mitigations to CPU_MITIGATIONS_OFF if the kernel is built with CONFIG_SPECULATION_MITIGATIONS=n, as the help text quite clearly states that disabling SPECULATION_MITIGATIONS is supposed to turn off all mitigations by default.
  │ If you say N, all mitigations will be disabled. You really
  │ should know what you are doing to say so.
As is, the kernel still defaults to CPU_MITIGATIONS_AUTO, which results in some mitigations being enabled in spite of SPECULATION_MITIGATIONS=n.
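The fix makes the boot-time default follow the Kconfig; a sketch of the kernel/cpu.c change:

	static enum cpu_mitigations cpu_mitigations __ro_after_init =
		IS_ENABLED(CONFIG_SPECULATION_MITIGATIONS) ? CPU_MITIGATIONS_AUTO :
							     CPU_MITIGATIONS_OFF;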
Fixes: f43b9876e857 ("x86/retbleed: Add fine grained Kconfig knobs") Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Daniel Sneddon <[email protected]> Cc: [email protected] Cc: Linus Torvalds <[email protected]> Link: https://lore.kernel.org/r/[email protected]
|
|
Revision tags: v6.9-rc3 |
|
| #
2125c003 |
| 04-Apr-2024 |
Waiman Long <[email protected]> |
cgroup/cpuset: Make cpuset hotplug processing synchronous
Since commit 3a5a6d0c2b03 ("cpuset: don't nest cgroup_mutex inside get_online_cpus()"), cpuset hotplug was done asynchronously via a work function. This was done to avoid recursive locking of cgroup_mutex.
Since then, the cgroup locking scheme has changed quite a bit. A cpuset_mutex was introduced to protect cpuset specific operations. The cpuset_mutex is then replaced by a cpuset_rwsem. With commit d74b27d63a8b ("cgroup/cpuset: Change cpuset_rwsem and hotplug lock order"), cpu_hotplug_lock is acquired before cpuset_rwsem. Later on, cpuset_rwsem is reverted back to cpuset_mutex. All these locking changes allow the hotplug code to call into cpuset core directly.
The following commits were also merged due to the asynchronous nature of cpuset hotplug processing.
- commit b22afcdf04c9 ("cpu/hotplug: Cure the cpusets trainwreck")
- commit 50e76632339d ("sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs")
- commit 28b89b9e6f7b ("cpuset: handle race between CPU hotplug and cpuset_hotplug_work")
Clean up all these bandages by making cpuset hotplug processing synchronous again with the exception that the call to cgroup_transfer_tasks() to transfer tasks out of an empty cgroup v1 cpuset, if necessary, will still be done via a work function due to the existing cgroup_mutex -> cpu_hotplug_lock dependency. It is possible to reverse that dependency, but that will require updating a number of different cgroup controllers. This special hotplug code path should be rarely taken anyway.
As all the cpuset states will be updated by the end of the hotplug operation, we can revert most of the above commits, except commit 50e76632339d ("sched/cpuset/pm: Fix cpuset vs. suspend-resume bugs") which is only partially reverted. Also remove some cpus_read_lock trylock attempts in the cpuset partition code, as they are no longer necessary now that the cpu_hotplug_lock is held for the whole duration of the cpuset hotplug code path.
Signed-off-by: Waiman Long <[email protected]> Tested-by: Valentin Schneider <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
|
|
Revision tags: v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7 |
|
| #
4c8a4985 |
| 27-Feb-2024 |
Ingo Molnar <[email protected]> |
smp: Avoid 'setup_max_cpus' namespace collision/shadowing
bringup_nonboot_cpus() gets passed the 'setup_max_cpus' variable from init/main.c - which is also the name of the function's parameter, shadowing the global.
To reduce confusion and to allow the 'setup_max_cpus' value to be #defined in the <linux/smp.h> header, use the name 'max_cpus' for the function parameter instead.
Signed-off-by: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected]
|
|
Revision tags: v6.8-rc6 |
|
| #
500f8f9b |
| 25-Feb-2024 |
Frederic Weisbecker <[email protected]> |
tick: Assume timekeeping is correctly handed over upon last offline idle call
The timekeeping duty is handed over from the outgoing CPU on stop machine, then the oneshot tick is stopped right after. Therefore it's guaranteed that the current CPU isn't the timekeeper upon its last call to idle.
Besides, calling tick_nohz_idle_stop_tick() while the dying CPU goes into idle suggests that the tick is about to be stopped, while it is actually already stopped from the appropriate CPU hotplug state.
Remove the confusing call and the obsolete case handling and convert it to a sanity check that verifies the above assumption.
Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
|