Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5

9f98a4f4 | 28-Feb-2025 | Vishal Annapurve <[email protected]>
x86/tdx: Fix arch_safe_halt() execution for TDX VMs
Direct HLT instruction execution causes #VEs for TDX VMs, which are routed to the hypervisor via TDCALL. If HLT is executed in an STI-shadow, the resulting #VE handler will enable interrupts before the TDCALL is routed to the hypervisor, leading to missed wakeup events, because the current TDX spec doesn't expose interruptibility state information that would allow the #VE handler to selectively enable interrupts.
Commit bfe6ed0c6727 ("x86/tdx: Add HLT support for TDX guests") prevented the idle routines from executing the HLT instruction in an STI-shadow. But it missed the paravirt routine, which can be reached, for example, via this path:
kvm_wait() => safe_halt() => raw_safe_halt() => arch_safe_halt() => irq.safe_halt() => pv_native_safe_halt()
To reliably handle arch_safe_halt() for TDX VMs, introduce an explicit dependency on CONFIG_PARAVIRT and override the paravirt halt()/safe_halt() routines with TDX-safe versions that execute a direct TDCALL and the needed interrupt flag updates. Executing a direct TDCALL brings the additional benefit of avoiding HLT-related #VEs altogether.
As tested by Ryan Afranji:
"Tested with the specjbb2015 benchmark. It has heavy lock contention which leads to many halt calls. TDX VMs suffered a poor score before this patchset.
Verified the major performance improvement with this patchset applied."

Fixes: bfe6ed0c6727 ("x86/tdx: Add HLT support for TDX guests")
Signed-off-by: Vishal Annapurve <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Kirill A. Shutemov <[email protected]>
Tested-by: Ryan Afranji <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
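
A minimal sketch of the approach described above, assuming the existing TDX guest hypercall plumbing (struct tdx_module_args, __tdx_hypercall(), hcall_func()); the exact pv_ops wiring in the patch differs in detail:

    /*
     * Issue the HLT hypercall directly instead of executing HLT and
     * taking a #VE. Telling the VMM whether interrupts are masked
     * avoids the lost-wakeup window of HLT in an STI-shadow.
     */
    static void __cpuidle tdx_safe_halt(void)
    {
            struct tdx_module_args args = {
                    .r10 = TDX_HYPERCALL_STANDARD,
                    .r11 = hcall_func(EXIT_REASON_HLT),
                    .r12 = false,   /* interrupts stay enabled */
            };

            if (__tdx_hypercall(&args))
                    WARN_ONCE(1, "HLT emulation failed\n");
    }

    /* During TDX guest init, overriding the paravirt routine makes
     * arch_safe_halt() reach it: */
    pv_ops.irq.safe_halt = tdx_safe_halt;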

9a93e29f | 14-Mar-2025 | Brian Gerst <[email protected]>
x86/syscall: Move sys_ni_syscall()
Move sys_ni_syscall() to kernel/process.c, and remove the now-empty entry/common.c.
No functional changes.

Signed-off-by: Brian Gerst <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Sohil Mehta <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
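
For reference, the stub being moved is the catch-all for unwired syscall slots; a sketch of what now lives in kernel/process.c:

    /* Catch-all for syscall numbers without an implementation. */
    SYSCALL_DEFINE0(ni_syscall)
    {
            return -ENOSYS;
    }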

5d3b81d4 | 26-Feb-2025 | Benjamin Berg <[email protected]>
x86/fpu: Avoid copying dynamic FP state from init_task in arch_dup_task_struct()
The init_task instance of struct task_struct is statically allocated and may not contain the full FP state for userspace. As such, limit the copy to the valid area of both init_task and 'dst' and ensure all memory is initialized.
Note that the FP state is only needed for userspace, and as such it is entirely reasonable for init_task to not contain parts of it.

Fixes: 5aaeb5c01c5b ("x86/fpu, sched: Introduce CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and use it on x86")
Signed-off-by: Benjamin Berg <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
----
v2: - Fix the case where arch_task_struct_size < sizeof(init_task) by using memcpy_and_pad().
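
A sketch of the resulting copy logic, following the v2 note above (the rest of arch_dup_task_struct() is omitted):

    int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
    {
            /*
             * init_task is statically allocated and lacks the dynamic
             * part of the FP state. Copy only its valid bytes and
             * zero-fill the remainder instead of reading past it.
             */
            if (unlikely(src == &init_task))
                    memcpy_and_pad(dst, arch_task_struct_size, src,
                                   sizeof(init_task), 0);
            else
                    memcpy(dst, src, arch_task_struct_size);

            return 0;
    }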

Revision tags: v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1

2df1ad0d | 02-Feb-2025 | Brian Gerst <[email protected]>
x86/arch_prctl: Simplify sys_arch_prctl()
Use in_ia32_syscall() instead of a compat syscall entry.
No change in functionality intended.

Signed-off-by: Brian Gerst <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
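
A sketch of the simplified entry point; the exact placement of the in_ia32_syscall() test is an assumption, but it replaces the separate compat syscall entry:

    SYSCALL_DEFINE2(arch_prctl, int, option, unsigned long, arg2)
    {
            long ret = -EINVAL;

            /* 64-bit-only options are rejected for ia32 callers. */
            if (!in_ia32_syscall())
                    ret = do_arch_prctl_64(current, option, arg2);

            if (ret == -EINVAL)
                    ret = do_arch_prctl_common(option, arg2);

            return ret;
    }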

Revision tags: v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3

e5d3a578 | 13-Dec-2024 | Dave Hansen <[email protected]>
x86/cpu: Make all CPUID leaf names consistent
The leaf names are not consistent. Give them all a CPUID_LEAF_ prefix for consistency and vertical alignment.

Signed-off-by: Dave Hansen <[email protected]>
Acked-by: Dave Jiang <[email protected]> # for ioatdma bits
Link: https://lore.kernel.org/all/20241213205040.7B0C3241%40davehans-spike.ostc.intel.com

497f7028 | 13-Dec-2024 | Dave Hansen <[email protected]>
x86/cpu: Move MWAIT leaf definition to common header
Begin constructing a common place to keep all CPUID leaf definitions. Move CPUID_MWAIT_LEAF to the CPUID header and include it where needed.

Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Zhao Liu <[email protected]>
Link: https://lore.kernel.org/all/20241213205028.EE94D02A%40davehans-spike.ostc.intel.com
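
The pattern being established, sketched with the MWAIT leaf (leaf 0x05 per the SDM; the helper below is illustrative, not from the patch):

    /* In the common CPUID header: */
    #define CPUID_LEAF_MWAIT        0x05

    /* Callers then share one definition instead of local copies: */
    static inline unsigned int mwait_substates(void)
    {
            /* EDX of leaf 0x05 enumerates MWAIT C-state sub-states. */
            return cpuid_edx(CPUID_LEAF_MWAIT);
    }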

Revision tags: v6.13-rc2

29188c16 | 03-Dec-2024 | Juergen Gross <[email protected]>
x86/paravirt: Remove the WBINVD callback
The pv_ops::cpu.wbinvd paravirt callback is a leftover of lguest times. Today it is no longer needed, as all users use the native WBINVD implementation.
Remove the callback and rename native_wbinvd() to wbinvd().

Signed-off-by: Juergen Gross <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
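
With the callback gone, wbinvd() reduces to the native instruction; a sketch of the resulting helper (formerly native_wbinvd()):

    static __always_inline void wbinvd(void)
    {
            /* Write back and invalidate all cache lines. */
            asm volatile("wbinvd" : : : "memory");
    }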

Revision tags: v6.13-rc1

2190966f | 28-Nov-2024 | Peter Zijlstra <[email protected]>
x86: Convert unreachable() to BUG()
Avoid unreachable() as it can (and will in the absence of UBSAN) generate fallthrough code. Use BUG() so we get a UD2 trap (with unreachable annotation).

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
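
The shape of the conversion, as a sketch (the call site is hypothetical):

    /* Before: without UBSAN, unreachable() lets the compiler emit
     * whatever fall-through code happens to come next. */
    cpu_dead_helper();      /* hypothetical call that never returns */
    unreachable();

    /* After: BUG() emits a UD2 trap and carries the unreachable
     * annotation, so falling through is impossible. */
    cpu_dead_helper();
    BUG();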

Revision tags: v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4

26ba7353 | 14-Jun-2024 | Kirill A. Shutemov <[email protected]>
x86/smp: Add smp_ops.stop_this_cpu() callback
If the helper is defined, it is called instead of halt() to stop the CPU at the end of stop_this_cpu() and on crash CPU shutdown.
ACPI MADT will use it to hand over the CPU to BIOS in order to be able to wake it up again after kexec.

Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Acked-by: Kai Huang <[email protected]>
Tested-by: Tao Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
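
A sketch of how the hook slots into stop_this_cpu(), assuming the existing smp_ops structure; the surrounding shutdown work is elided:

    void stop_this_cpu(void *dummy)
    {
            /* ... disable interrupts, mark the CPU offline, wbinvd ... */

            /* Prefer a platform override when one is registered. */
            if (smp_ops.stop_this_cpu) {
                    smp_ops.stop_this_cpu();
                    unreachable();  /* the callback must not return */
            }

            for (;;)
                    halt();
    }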

Revision tags: v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7

35ce6492 | 28-Feb-2024 | Thomas Gleixner <[email protected]>
x86/idle: Select idle routine only once
The idle routine selection is done on every CPU bringup operation and has a guard in place which is only effective after the first invocation, making every subsequent invocation a pointless exercise.
Invoke it once on the boot CPU and mark the related functions __init. The guard check has to stay as xen_set_default_idle() runs early.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/87edcu6vaq.ffs@tglx

5f75916e | 29-Feb-2024 | Thomas Gleixner <[email protected]>
x86/idle: Let prefer_mwait_c1_over_halt() return bool
The return value is truly boolean. Make it so.
No functional change.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

f3d7eab7 | 29-Feb-2024 | Thomas Gleixner <[email protected]>
x86/idle: Cleanup idle_setup()
Updating the static call for x86_idle() from idle_setup() is counter-intuitive.
Let select_idle_routine() handle it like the other idle choices, which allows the idle selection to be simplified later on.
While at it, rewrite the comments and return a proper error code instead of -1.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

0ab56287 | 29-Feb-2024 | Thomas Gleixner <[email protected]>
x86/idle: Clean up idle selection
Clean up the code to make it readable. No functional change.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

cb81deef | 28-Feb-2024 | Thomas Gleixner <[email protected]>
x86/idle: Sanitize X86_BUG_AMD_E400 handling
amd_e400_idle(), the idle routine for AMD CPUs affected by erratum 400, violates the RCU constraints by invoking tick_broadcast_enter() and tick_broadcast_exit() after the core code has marked RCU non-idle. The functions can end up in lockdep or tracing, which rightfully triggers an RCU warning.
The core code now provides a static branch for conditional invocation of the broadcast functions.
Remove amd_e400_idle(), enforce default_idle() and enable the static branch on affected CPUs to cure this.
[ bp: Fold in a fix for an IS_ENABLED() check failure missing a "CONFIG_" prefix, which tglx spotted. ]

Reported-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/877cim6sis.ffs@tglx
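
A sketch of the cure; the static key name below is illustrative, while the mechanism (the core idle loop enters tick broadcast itself, before RCU goes idle) is as described above:

    /* Boot-time quirk setup for affected AMD CPUs: */
    if (boot_cpu_has_bug(X86_BUG_AMD_E400)) {
            /*
             * Keep default_idle(); with this key enabled, the core
             * idle loop performs the tick broadcast enter/exit while
             * RCU is still watching.
             */
            static_branch_enable(&arch_needs_tick_broadcast);
    }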

Revision tags: v6.8-rc6, v6.8-rc5

44c76825 | 17-Feb-2024 | Kees Cook <[email protected]>
x86: Increase brk randomness entropy for 64-bit systems
In commit c1d171a00294 ("x86: randomize brk"), arch_randomize_brk() was defined to use a 32MB range (13 bits of entropy), but was never increased when moving to 64-bit. The default arch_randomize_brk() uses 32MB for 32-bit tasks, and 1GB (18 bits of entropy) for 64-bit tasks.
Update x86_64 to match the entropy used by arm64 and other 64-bit architectures.

Reported-by: [email protected]
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Jiri Kosina <[email protected]>
Closes: https://lore.kernel.org/linux-hardening/CA+2EKTVLvc8hDZc+2Yhwmus=dzOUG5E4gV7ayCbu0MPJTZzWkw@mail.gmail.com/
Link: https://lore.kernel.org/r/[email protected]
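
A sketch of the updated arch_randomize_brk(), assuming randomize_page() and a 32-bit-task helper along current x86 lines:

    unsigned long arch_randomize_brk(struct mm_struct *mm)
    {
            /* 32-bit tasks keep the historical 32MB window... */
            if (task_size_32bit())
                    return randomize_page(mm->brk, SZ_32M);

            /* ...64-bit tasks get 1GB (18 bits of entropy), matching
             * the generic version and arm64. */
            return randomize_page(mm->brk, SZ_1G);
    }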

8078f4d6 | 13-Feb-2024 | Thomas Gleixner <[email protected]>
x86/cpu/topology: Rename smp_num_siblings
It's really a non-intuitive name. Rename it to __max_threads_per_core, which is obvious.

Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Michael Kelley <[email protected]>
Tested-by: Sohil Mehta <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Revision tags: v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7

54aa699e | 03-Jan-2024 | Bjorn Helgaas <[email protected]>
arch/x86: Fix typos
Fix typos, most reported by "codespell arch/x86". Only touches comments, no code changes.

Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Randy Dunlap <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

Revision tags: v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1

748c90c6 | 08-Sep-2023 | Rick Edgecombe <[email protected]>
x86/shstk: Remove useless clone error handling
When clone fails after the shadow stack is allocated, any allocated shadow stack is cleaned up in exit_thread() in copy_process(). So the logic in copy_thread() is unneeded, and also will not handle failures that happen outside of copy_thread().
In addition, since there is a second attempt to unmap the same shadow stack, there is a race where a newly mapped region could get unmapped.
So remove the logic in copy_thread() and rely on exit_thread() to handle clone failure.

Fixes: b2926a36b97a ("x86/shstk: Handle thread shadow stack")
Signed-off-by: Rick Edgecombe <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Tested-by: H.J. Lu <[email protected]>
Link: https://lore.kernel.org/all/20230908203655.543765-3-rick.p.edgecombe%40intel.com

Revision tags: v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7

b2926a36 | 13-Jun-2023 | Rick Edgecombe <[email protected]>
x86/shstk: Handle thread shadow stack
When a process is duplicated, but the child shares the address space with the parent, there is potential for the threads sharing a single stack to cause conflicts for each other. In the normal non-CET case this is handled in two ways.
With regular CLONE_VM a new stack is provided by userspace such that the parent and child have different stacks.
For vfork, the parent is suspended until the child exits. So as long as the child doesn't return from the vfork()/CLONE_VFORK calling function and sticks to a limited set of operations, the parent and child can share the same stack.
For shadow stack, these scenarios present similar sharing problems. For the CLONE_VM case, the child and the parent must have separate shadow stacks. Instead of changing clone to take a shadow stack, have the kernel just allocate one and switch to it.
Use the stack_size passed from the clone3() syscall for the thread shadow stack size. A compat-mode thread shadow stack size is further reduced to 1/4. This allows more threads to run in a 32-bit address space. clone() does not pass stack_size, which was only added in clone3(). In that case, use the RLIMIT_STACK size, capped to 4 GB.
For shadow stack enabled vfork(), the parent and child can share the same shadow stack, like they can share a normal stack. Since the parent is suspended until the child terminates, the child will not interfere with the parent while executing, as long as it doesn't return from the vfork()/CLONE_VFORK calling function and overwrite the shadow stack above the current pointer. The child can safely overwrite below the current pointer, as the parent can just overwrite this later. So CET does not add any additional limitations for vfork().
Free the shadow stack on thread exit by doing it in mm_release(). Skip this when exiting a vfork() child since the stack is shared in the parent.
During this operation, the shadow stack pointer of the new thread needs to be updated to point to the newly allocated shadow stack. Since the ability to do this is confined to the FPU subsystem, change fpu_clone() to take the new shadow stack pointer, and update it internally inside the FPU subsystem. This part was suggested by Thomas Gleixner.

Co-developed-by: Yu-cheng Yu <[email protected]>
Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/all/20230613001108.3040476-30-rick.p.edgecombe%40intel.com
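
A sketch of the sizing policy described above (the compat-mode 1/4 reduction and the allocation itself are elided; names follow the shstk code but details are paraphrased):

    static unsigned long adjust_shstk_size(unsigned long size)
    {
            /* clone3() supplied an explicit stack_size: mirror it. */
            if (size)
                    return PAGE_ALIGN(size);

            /* Plain clone(): fall back to RLIMIT_STACK, capped at 4GB. */
            return PAGE_ALIGN(min_t(unsigned long long,
                                    rlimit(RLIMIT_STACK), SZ_4G));
    }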

3aec4ecb | 23-Jun-2023 | Brian Gerst <[email protected]>
x86: Rewrite ret_from_fork() in C
When kCFI is enabled, special handling is needed for the indirect call to the kernel thread function. Rewrite the ret_from_fork() function in C so that the compiler can properly handle the indirect call.

Suggested-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Brian Gerst <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Sami Tolvanen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
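
The shape of the C version, sketched from the description (argument layout and the execve fix-up are paraphrased); with kCFI, the compiler can now instrument the indirect fn() call properly:

    __visible void ret_from_fork(struct task_struct *prev,
                                 struct pt_regs *regs,
                                 int (*fn)(void *), void *fn_arg)
    {
            schedule_tail(prev);

            /* A non-NULL fn means this is a kernel thread. */
            if (unlikely(fn)) {
                    fn(fn_arg);
                    /*
                     * fn() may return only after a successful
                     * kernel_execve(); exit to user mode in that case.
                     */
                    regs->ax = 0;
            }

            syscall_exit_to_user_mode(regs);
    }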

9b040453 | 15-Jun-2023 | Tony Battersby <[email protected]>
x86/smp: Don't access non-existing CPUID leaf
stop_this_cpu() tests CPUID leaf 0x8000001f::EAX unconditionally. Intel CPUs return the content of the highest supported leaf when a non-existing leaf is read, while AMD CPUs return all zeros for unsupported leaves.
So the result of the test on Intel CPUs is a lottery.
While harmless, it's incorrect and causes the conditional wbinvd() to be issued where not required.
Check whether the leaf is supported before reading it.
[ tglx: Adjusted changelog ]

Fixes: 08f253ec3767 ("x86/cpu: Clear SME feature flag when not in use")
Signed-off-by: Tony Battersby <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Mario Limonciello <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
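
The fix, sketched: consult the maximum extended leaf before reading 0x8000001f (the patch may use the cached extended_cpuid_level; cpuid_eax(0x80000000) is the uncached equivalent):

    /* Read leaf 0x8000001f (SME/SEV) only if it exists; Intel CPUs
     * would otherwise echo the highest supported leaf here. */
    if (cpuid_eax(0x80000000) >= 0x8000001f &&
        (cpuid_eax(0x8000001f) & BIT(0)))   /* SME enumerated */
            native_wbinvd();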

Revision tags: v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1

1f5e7eb7 | 26-Apr-2023 | Thomas Gleixner <[email protected]>
x86/smp: Make stop_other_cpus() more robust
Tony reported intermittent lockups on poweroff. His analysis identified the wbinvd() in stop_this_cpu() as the culprit. This was added to ensure that on SME enabled machines a kexec() does not leave any stale data in the caches when switching from encrypted to non-encrypted mode or vice versa.
That wbinvd() is conditional on the SME feature bit which is read directly from CPUID. But that readout does not check whether the CPUID leaf is available or not. If it's not available, the CPU will return the value of the highest supported leaf instead. Depending on the content, the "SME" bit might be set or not.
That's incorrect but harmless. Making the CPUID readout conditional makes the observed hangs go away, but it does not fix the underlying problem:
    CPU0                                  CPU1

     stop_other_cpus()
       send_IPIs(REBOOT);                 stop_this_cpu()
       while (num_online_cpus() > 1);       set_online(false);
       proceed... -> hang
                                            wbinvd()
WBINVD is an expensive operation, and if multiple CPUs issue it at the same time, the resulting delays are even larger.
But CPU0 has already observed num_online_cpus() going down to 1 and proceeds, which causes the system to hang.
This issue exists independent of WBINVD, but the delays caused by WBINVD make it more prominent.
Make this more robust by adding a cpumask which is initialized to the online CPU mask before sending the IPIs; CPUs clear their bit in stop_this_cpu() after the WBINVD has completed. Check for that cpumask to become empty in stop_other_cpus() instead of watching num_online_cpus().
The cpumask cannot plug all holes either, but it's better than a raw counter and allows the NMI fallback IPI to be restricted to the CPUs which have not reported within the timeout window.

Fixes: 08f253ec3767 ("x86/cpu: Clear SME feature flag when not in use")
Reported-by: Tony Battersby <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Ashok Raj <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/all/[email protected]
Link: https://lore.kernel.org/r/87h6r770bv.ffs@tglx
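
A sketch of the handshake described above; the mask name follows the description, and the surrounding code is elided:

    static cpumask_t cpus_stop_mask;

    /* In stop_this_cpu(), run by each IPI'd CPU: */
            /* ... the conditional wbinvd() for SME happens first ... */

            /*
             * Report completion only after the cache flush, so the
             * initiating CPU cannot proceed while WBINVD still runs.
             */
            cpumask_clear_cpu(smp_processor_id(), &cpus_stop_mask);
            for (;;)
                    native_halt();

    /* In stop_other_cpus(), instead of watching num_online_cpus(): */
            while (!cpumask_empty(&cpus_stop_mask) && timeout--)
                    udelay(1);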

Revision tags: v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2

2f8794bd | 12-Mar-2023 | Kirill A. Shutemov <[email protected]>
x86/mm: Provide arch_prctl() interface for LAM
Add a few arch_prctl() handlers:
- ARCH_ENABLE_TAGGED_ADDR enables LAM. The argument is the required number of tag bits. It is rounded up to the nearest LAM mode that can provide it. For now only LAM_U57 is supported, with 6 tag bits.
- ARCH_GET_UNTAG_MASK returns the untag mask. It indicates where the tag bits are located in the address.
- ARCH_GET_MAX_TAG_BITS returns the maximum number of tag bits a user can request. Zero if LAM is not supported.

Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Alexander Potapenko <[email protected]>
Link: https://lore.kernel.org/all/20230312112612.31869-9-kirill.shutemov%40linux.intel.com
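
A userspace usage sketch; the ARCH_* values below are believed to match what this series adds to <asm/prctl.h> (0x4001-0x4003), but check your headers:

    #include <sys/syscall.h>
    #include <unistd.h>

    #define ARCH_GET_UNTAG_MASK     0x4001
    #define ARCH_ENABLE_TAGGED_ADDR 0x4002
    #define ARCH_GET_MAX_TAG_BITS   0x4003

    /* Ask for 6 tag bits; the kernel rounds up to LAM_U57. */
    static int enable_lam(void)
    {
            return syscall(SYS_arch_prctl, ARCH_ENABLE_TAGGED_ADDR, 6UL);
    }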

74c228d2 | 12-Mar-2023 | Kirill A. Shutemov <[email protected]>
x86/uaccess: Provide untagged_addr() and remove tags before address check
untagged_addr() is a helper used by the core-mm to strip tag bits and bring the address to its canonical shape, based on the rules of the current thread. It only handles userspace addresses.
The untagging mask is stored in a per-CPU variable and set on context switch to the task.
The tags must not be included in the check of whether it's okay to access the userspace address. Strip the tags in access_ok().

Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Alexander Potapenko <[email protected]>
Link: https://lore.kernel.org/all/20230312112612.31869-7-kirill.shutemov%40linux.intel.com
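
The masking trick, sketched without the per-CPU plumbing: kernel pointers have the sign bit set, so OR-ing the sign into the mask leaves them untouched, while user pointers get their tag bits cleared:

    static inline u64 __untag_addr(u64 addr, u64 untag_mask)
    {
            /* All-ones for kernel addresses, zero for user ones. */
            s64 sign = (s64)addr >> 63;

            return addr & (untag_mask | sign);
    }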

Revision tags: v6.3-rc1, v6.2

b4c108d7 | 14-Feb-2023 | Philippe Mathieu-Daudé <[email protected]>
x86/cpu: Expose arch_cpu_idle_dead()'s prototype definition
Include <linux/cpu.h> to make sure arch_cpu_idle_dead() matches its prototype going forward.

Inspired-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Philippe Mathieu-Daudé <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Josh Poimboeuf <[email protected]>