History log of /linux-6.15/arch/x86/kernel/process_64.c (Results 1 – 25 of 321)
Revision Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6
# a1e4cc01 03-Mar-2025 Brian Gerst <[email protected]>

x86/percpu: Move current_task to percpu hot section

No functional change.

Signed-off-by: Brian Gerst <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Uros Bizjak <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# 385f72c8 03-Mar-2025 Brian Gerst <[email protected]>

x86/percpu: Move top_of_stack to percpu hot section

No functional change.

Signed-off-by: Brian Gerst <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Uros Bizjak <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# c6a09180 03-Mar-2025 Brian Gerst <[email protected]>

x86/irq: Move irq stacks to percpu hot section

No functional change.

Signed-off-by: Brian Gerst <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Uros Bizjak <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# 399fd7a2 03-Mar-2025 Brian Gerst <[email protected]>

x86/asm: Merge KSTK_ESP() implementations

Commit:

  263042e4630a ("Save user RSP in pt_regs->sp on SYSCALL64 fastpath")

simplified the 64-bit implementation of KSTK_ESP() which is
now identical to 32-bit. Merge them into a common definition.

No functional change.

Signed-off-by: Brian Gerst <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
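
[ Illustration, not part of the commit: a minimal sketch of what the
  merged definition can look like, assuming both variants read the
  saved user stack pointer out of pt_regs; the exact placement and
  form in processor.h may differ. ]

	/* Common to 32-bit and 64-bit: user SP is saved in pt_regs->sp. */
	#define KSTK_ESP(task)	(task_pt_regs(task)->sp)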


Revision tags: v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1
# 684b1291 02-Feb-2025 Brian Gerst <[email protected]>

x86/arch_prctl/64: Clean up ARCH_MAP_VDSO_32

process_64.c is not built on native 32-bit, so CONFIG_X86_32 will never
be set.

No change in functionality intended.

Signed-off-by: Brian Gerst <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# 2df1ad0d 02-Feb-2025 Brian Gerst <[email protected]>

x86/arch_prctl: Simplify sys_arch_prctl()

Use in_ia32_syscall() instead of a compat syscall entry.

No change in functionality intended.

Signed-off-by: Brian Gerst <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


Revision tags: v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7
# b7c35279 02-Jul-2024 Yosry Ahmed <[email protected]>

x86/mm: Cleanup prctl_enable_tagged_addr() nr_bits error checking

There are two separate checks in prctl_enable_tagged_addr() that nr_bits
is in the correct range. The checks are arranged such that the correct
case is sandwiched between both error cases, which do exactly the same
thing.

Simplify the if condition and pull the correct case outside with the
rest of the success code path.

Signed-off-by: Yosry Ahmed <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Kirill A. Shutemov <[email protected]>
Link: https://lore.kernel.org/all/20240702132139.3332013-4-yosryahmed%40google.com
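
[ Illustration, not part of the commit: a hedged sketch of the reshaped
  check, assuming the LAM_U57_BITS and X86_CR3_LAM_U57 definitions used
  elsewhere in the LAM series; locking and the rest of the enable
  sequence are omitted. ]

	/* One condition covers both former error branches: */
	if (!nr_bits || nr_bits > LAM_U57_BITS)
		return -EINVAL;

	/* The success path now sits un-nested with the rest: */
	mm->context.lam_cr3_mask = X86_CR3_LAM_U57;
	mm->context.untag_mask = ~GENMASK(62, 57);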


# ec225f8c 02-Jul-2024 Yosry Ahmed <[email protected]>

x86/mm: Fix LAM inconsistency during context switch

LAM can only be enabled when a process is single-threaded. But _kernel_
threads can temporarily use a single-threaded process's mm. That means
that a context-switching kernel thread can race and observe the mm's LAM
metadata (mm->context.lam_cr3_mask) change.

The context switch code does two logical things with that metadata:
populate CR3 and populate 'cpu_tlbstate.lam'. If it hits this race,
'cpu_tlbstate.lam' and CR3 can end up out of sync.

This de-synchronization is currently harmless. But it is confusing and
might lead to warnings or real bugs.

Update set_tlbstate_lam_mode() to take in the LAM mask and untag mask
instead of an mm_struct pointer, and while we are at it, rename it to
cpu_tlbstate_update_lam(). This should also make it clearer that we are
updating cpu_tlbstate. In switch_mm_irqs_off(), read the LAM mask once
and use it for both the cpu_tlbstate update and the CR3 update.

Signed-off-by: Yosry Ahmed <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Kirill A. Shutemov <[email protected]>
Link: https://lore.kernel.org/all/20240702132139.3332013-3-yosryahmed%40google.com
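
[ Illustration, not part of the commit: a sketch of the reshaped helper
  and the single read in the caller, with the signature assumed from
  the description above. ]

	static inline void cpu_tlbstate_update_lam(unsigned long lam,
						   u64 untag_mask)
	{
		this_cpu_write(cpu_tlbstate.lam, lam >> X86_CR3_LAM_U57_BIT);
		this_cpu_write(tlbstate_untag_mask, untag_mask);
	}

	/* In switch_mm_irqs_off(): read the LAM mask once, use it twice. */
	unsigned long new_lam = mm_lam_cr3_mask(next);

	cpu_tlbstate_update_lam(new_lam, mm_untag_mask(next));
	/* ... and the same new_lam value feeds the CR3 write. */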


# 3b299b99 02-Jul-2024 Yosry Ahmed <[email protected]>

x86/mm: Use IPIs to synchronize LAM enablement

LAM can only be enabled when a process is single-threaded. But _kernel_
threads can temporarily use a single-threaded process's mm.

If LAM is enabled by a userspace process while a kthread is using its
mm, the kthread will not observe LAM enablement (i.e. LAM will be
disabled in CR3). This could be fine for the kthread itself, as LAM only
affects userspace addresses. However, if the kthread context switches to
a thread in the same userspace process, CR3 may or may not be updated
because the mm_struct doesn't change (based on pending TLB flushes). If
CR3 is not updated, the userspace thread will run incorrectly with LAM
disabled, which may cause page faults when using tagged addresses.
Example scenario:

  CPU 1                                 CPU 2
  /* kthread */
  kthread_use_mm()
                                        /* user thread */
                                        prctl_enable_tagged_addr()
                                        /* LAM enabled on CPU 2 */
  /* LAM disabled on CPU 1 */
  context_switch() /* to CPU 1 */
  /* Switching to user thread */
  switch_mm_irqs_off()
  /* CR3 not updated */
  /* LAM is still disabled on CPU 1 */

Synchronize LAM enablement by sending an IPI to all CPUs running with
the mm_struct to enable LAM. This makes sure LAM is enabled on CPU 1
in the above scenario before prctl_enable_tagged_addr() returns and
userspace starts using tagged addresses, and before it's possible to
run the userspace process on CPU 1.

In switch_mm_irqs_off(), move reading the LAM mask until after
mm_cpumask() is updated. This ensures that if an outdated LAM mask is
written to CR3, an IPI is received to update it right after IRQs are
re-enabled.

[ dhansen: Add a LAM enabling helper and comment it ]

Fixes: 82721d8b25d7 ("x86/mm: Handle LAM on context switch")
Suggested-by: Andy Lutomirski <[email protected]>
Signed-off-by: Yosry Ahmed <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Kirill A. Shutemov <[email protected]>
Link: https://lore.kernel.org/all/20240702132139.3332013-2-yosryahmed%40google.com
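
[ Illustration, not part of the commit: a sketch of the IPI-based
  enabling this describes; the helper name and exact body are
  assumptions based on the commit text. ]

	static void enable_lam_func(void *__mm)
	{
		struct mm_struct *mm = __mm;

		/* Only CPUs currently running this mm need the fixup. */
		if (this_cpu_read(cpu_tlbstate.loaded_mm) == mm) {
			unsigned long lam = mm_lam_cr3_mask(mm);

			write_cr3(__read_cr3() | lam);
			cpu_tlbstate_update_lam(lam, mm_untag_mask(mm));
		}
	}

	/* In prctl_enable_tagged_addr(), after the mm metadata is set: */
	on_each_cpu_mask(mm_cpumask(mm), enable_lam_func, mm, true);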


Revision tags: v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5
# b53c6bd5 21-Apr-2024 David Kaplan <[email protected]>

x86/cpu: Fix check for RDPKRU in __show_regs()

cpu_feature_enabled(X86_FEATURE_OSPKE) does not necessarily reflect
whether CR4.PKE is set on the CPU. In particular, they may differ on
non-BSP CPUs before setup_pku() is executed. In this scenario, RDPKRU
will #UD, causing the system to hang.

Fix by checking CR4 for PKE enablement, which is always correct for the
current CPU.

The scenario happens by inserting a WARN* before setup_pku() in
identify_cpu() or some other diagnostic which would lead to calling
__show_regs().

[ bp: Massage commit message. ]

Signed-off-by: David Kaplan <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
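
[ Illustration, not part of the commit: __show_regs() already reads CR4
  into a local variable for printing, so a sketch of the fix is a
  one-line condition change. ]

	/* Test the architectural enable bit, not the global feature flag: */
	if (cr4 & X86_CR4_PKE)
		printk("%sPKRU: %08x\n", log_lvl, rdpkru());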


Revision tags: v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5
# ad41a14c 05-Dec-2023 H. Peter Anvin (Intel) <[email protected]>

x86/fred: Allow single-step trap and NMI when starting a new task

Entering a new task is logically speaking a return from a system call
(exec, fork, clone, etc.). As such, if ptrace enables single stepping,
a single-step exception should be allowed to trigger immediately upon
entering user space. This is not optional.

NMI should *never* be disabled in user space. As such, this is an
optional, opportunistic way to catch errors.

Allow single-step trap and NMI when starting a new task; thus, once
the new task enters user space, single-step trap and NMI are both
enabled immediately.

Signed-off-by: H. Peter Anvin (Intel) <[email protected]>
Signed-off-by: Xin Li <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Tested-by: Shan Kang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


# 09794f68 05-Dec-2023 H. Peter Anvin (Intel) <[email protected]>

x86/fred: Disallow the swapgs instruction when FRED is enabled

SWAPGS is no longer needed, and thus not allowed, with FRED because
FRED transitions ensure that an operating system can _always_ operate
with its own GS base address:

- For events that occur in ring 3, FRED event delivery swaps the GS
  base address with the IA32_KERNEL_GS_BASE MSR.

- ERETU (the FRED transition that returns to ring 3) also swaps the
  GS base address with the IA32_KERNEL_GS_BASE MSR.

And the operating system can still set up the GS segment for a user
thread without needing to load a user thread's GS base:

- Using LKGS, available with FRED, to modify other attributes of the
  GS segment without compromising its ability always to operate with
  its own GS base address.

- Accessing the GS segment base address for a user thread as before,
  using RDMSR or WRMSR on the IA32_KERNEL_GS_BASE MSR.

Note, LKGS loads the GS base address into the IA32_KERNEL_GS_BASE MSR
instead of the GS segment's descriptor cache. As such, the operating
system never changes its runtime GS base address.

Signed-off-by: H. Peter Anvin (Intel) <[email protected]>
Signed-off-by: Xin Li <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Tested-by: Shan Kang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
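
[ Illustration, not part of the commit: in process_64.c this constraint
  shows up in the inactive-GS-base helpers. A simplified sketch of such
  a guard; the Xen PV path and other details are omitted, so treat this
  as illustrative rather than a verbatim diff. ]

	static noinstr unsigned long __rdgsbase_inactive(void)
	{
		unsigned long gsbase;

		lockdep_assert_irqs_disabled();

		if (cpu_feature_enabled(X86_FEATURE_FRED)) {
			/* No SWAPGS with FRED: read the MSR directly. */
			gsbase = __rdmsr(MSR_KERNEL_GS_BASE);
		} else {
			native_swapgs();
			gsbase = rdgsbase();
			native_swapgs();
		}

		return gsbase;
	}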


# ee63291a 05-Dec-2023 Xin Li <[email protected]>

x86/ptrace: Cleanup the definition of the pt_regs structure

struct pt_regs is hard to read because the member and section related
comments are not aligned with the members.

The 'cs' and 'ss' members of pt_regs are of type 'unsigned long' while
in reality they are only 16 bits wide. This works so far because the
remaining space is unused, but FRED will use the remaining bits for
other purposes.

To prepare for FRED:

- Cleanup the formatting
- Convert 'cs' and 'ss' to u16 and embed them into a union with a u64
- Fixup the related printk() format strings

Suggested-by: Thomas Gleixner <[email protected]>
Originally-by: H. Peter Anvin (Intel) <[email protected]>
Signed-off-by: Xin Li <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov (AMD) <[email protected]>
Tested-by: Shan Kang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
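
[ Illustration, not part of the commit: a sketch of the embedding
  described above; the 'csx' ("cs extended") field name is assumed from
  the FRED series. ]

	union {
		u64	csx;	/* full 64-bit slot, used by FRED later */
		u16	cs;	/* the CS selector proper */
	};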


Revision tags: v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7
# 24b8a236 18-Oct-2023 Linus Torvalds <[email protected]>

x86/fpu: Clean up FPU switching in the middle of task switching

It happens to work, but it's very very wrong, because our 'current'
macro is magic that is supposedly loading a stable value.

It just happens to be not quite stable enough and the compilers
re-load the value enough for this code to work. But it's wrong.

The whole

	struct fpu *prev_fpu = &prev->fpu;

thing in __switch_to() is pretty ugly. There's no reason why we
should look at that 'prev_fpu' pointer there, or pass it down.

And it only generates worse code, in how it loads 'current' when
__switch_to() has the right task pointers.

The attached patch not only cleans this up, it actually
generates better code too:

(a) it removes one push/pop pair at entry/exit because there's one
    less register used (no 'current')

(b) it removes that pointless load of 'current' because it just uses
    the right argument:

	-	movq	%gs:pcpu_hot(%rip), %r12
	-	testq	$16384, (%r12)
	+	testq	$16384, (%rdi)

Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Uros Bizjak <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]


Revision tags: v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7
# 67840ad0 13-Jun-2023 Rick Edgecombe <[email protected]>

x86/shstk: Add ARCH_SHSTK_STATUS

CRIU and GDB need to get the current shadow stack and WRSS enablement
status. This information is already available via /proc/pid/status, but
this is inconvenient for CRIU because it involves parsing the text output
in an area of the code where this is difficult. Provide a status
arch_prctl(), ARCH_SHSTK_STATUS, for retrieving the status. Have arg2 be
a userspace address, and make the new arch_prctl simply copy the features
out to userspace.

Suggested-by: Mike Rapoport <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/all/20230613001108.3040476-43-rick.p.edgecombe%40intel.com
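
[ Illustration, not part of the commit: a sketch of the handle as
  described, copying the per-thread feature bitmap to the user pointer
  passed in arg2; the exact shape is an assumption. ]

	/* In the shadow stack prctl handler: */
	if (option == ARCH_SHSTK_STATUS)
		return put_user(task->thread.features,
				(unsigned long __user *)arg2);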


# 680ed2f1 13-Jun-2023 Mike Rapoport <[email protected]>

x86/shstk: Add ARCH_SHSTK_UNLOCK

Userspace loaders may lock features before a CRIU restore operation has
the chance to set them to whatever state is required by the process
being restored. Allow a way for CRIU to unlock features. Add it as an
arch_prctl() like the other shadow stack operations, but restrict it to
being called via the ptrace arch_prctl() interface.

[Merged into recent API changes, added commit log and docs]

Signed-off-by: Mike Rapoport <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/all/20230613001108.3040476-42-rick.p.edgecombe%40intel.com
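
[ Illustration, not part of the commit: a sketch of the unlock handle.
  Rejecting task == current is one way to restrict the operation to the
  ptrace path, since a tracer operates on a stopped tracee; the details
  here are assumptions. ]

	if (option == ARCH_SHSTK_UNLOCK) {
		/* Only reachable via ptrace on another task: */
		if (task == current)
			return -EINVAL;
		task->thread.features_locked &= ~arg2;
		return 0;
	}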


# 98cfa463 13-Jun-2023 Rick Edgecombe <[email protected]>

x86: Introduce userspace API for shadow stack

Add three new arch_prctl() handles:

- ARCH_SHSTK_ENABLE/DISABLE enables or disables the specified
  feature. Returns 0 on success or a negative value on error.

- ARCH_SHSTK_LOCK prevents future disabling or enabling of the
  specified feature. Returns 0 on success or a negative value
  on error.

The features are handled per-thread and inherited over fork(2)/clone(2),
but reset on exec().

Co-developed-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/all/20230613001108.3040476-27-rick.p.edgecombe%40intel.com
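
[ Illustration, not part of the commit: in process_64.c this lands as
  new do_arch_prctl_64() cases that forward to the shadow stack code; a
  sketch, with the handler name shstk_prctl() assumed from the series. ]

	case ARCH_SHSTK_ENABLE:
	case ARCH_SHSTK_DISABLE:
	case ARCH_SHSTK_LOCK:
		return shstk_prctl(task, option, arg2);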


Revision tags: v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6
# 97740266 03-Apr-2023 Kirill A. Shutemov <[email protected]>

x86/mm/iommu/sva: Do not allow to set FORCE_TAGGED_SVA bit from outside

arch_prctl(ARCH_FORCE_TAGGED_SVA) overrides the default and allows LAM
and SVA to co-exist in the process. It is expected to be called by the
process when it knows what it is doing.

arch_prctl() operates on the current process, but the same code is
reachable from ptrace, where it can be called on an arbitrary task.

Make it strict and only allow setting MM_CONTEXT_FORCE_TAGGED_SVA for
the current process.

Fixes: 23e5d9ec2bab ("x86/mm/iommu/sva: Make LAM and SVA mutually exclusive")
Suggested-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Dmitry Vyukov <[email protected]>
Link: https://lore.kernel.org/all/20230403111020.3136-3-kirill.shutemov%40linux.intel.com
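
[ Illustration, not part of the commit: a sketch of the strictness
  check in the arch_prctl() handler; the flag name is from the series,
  the setter shape is an assumption. ]

	case ARCH_FORCE_TAGGED_SVA:
		/* Reject the ptrace path: only the task itself may opt in. */
		if (current != task)
			return -EINVAL;
		set_bit(MM_CONTEXT_FORCE_TAGGED_SVA, &task->mm->context.flags);
		return 0;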


# fca1fdd2 03-Apr-2023 Kirill A. Shutemov <[email protected]>

x86/mm/iommu/sva: Fix error code for LAM enabling failure due to SVA

Normally, LAM and SVA are mutually exclusive. LAM enabling will fail if
SVA is already in use.

Correct the error code for the failure. EINTR is nonsensical there.

Fixes: 23e5d9ec2bab ("x86/mm/iommu/sva: Make LAM and SVA mutually exclusive")
Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Dmitry Vyukov <[email protected]>
Link: https://lore.kernel.org/all/CACT4Y+YfqSMsZArhh25TESmG-U4jO5Hjphz87wKSnTiaw2Wrfw@mail.gmail.com
Link: https://lore.kernel.org/all/20230403111020.3136-2-kirill.shutemov%40linux.intel.com
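
[ Illustration, not part of the commit: the rejection in
  prctl_enable_tagged_addr() might then read as below; -EBUSY is an
  assumption here, since the message only says what the code must not
  return. ]

	/* LAM cannot be enabled while SVA holds a valid PASID: */
	if (mm_valid_pasid(mm) &&
	    !test_bit(MM_CONTEXT_FORCE_TAGGED_SVA, &mm->context.flags))
		return -EBUSY;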


Revision tags: v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2
# 23e5d9ec 12-Mar-2023 Kirill A. Shutemov <[email protected]>

x86/mm/iommu/sva: Make LAM and SVA mutually exclusive

IOMMU and SVA-capable devices know nothing about LAM and only expect
canonical addresses. An attempt to pass down a tagged pointer will lead
to an address translation failure.

By default, do not allow enabling both LAM and SVA in the same process.

The new ARCH_FORCE_TAGGED_SVA arch_prctl() overrides the limitation.
By using the arch_prctl(), userspace takes responsibility to never pass
a tagged address to the device.

Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Ashok Raj <[email protected]>
Reviewed-by: Jacob Pan <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/all/20230312112612.31869-12-kirill.shutemov%40linux.intel.com


# 2f8794bd 12-Mar-2023 Kirill A. Shutemov <[email protected]>

x86/mm: Provide arch_prctl() interface for LAM

Add a few arch_prctl() handles:

- ARCH_ENABLE_TAGGED_ADDR enables LAM. The argument is the required
  number of tag bits. It is rounded up to the nearest LAM mode that
  can provide it. For now only LAM_U57 is supported, with 6 tag bits.

- ARCH_GET_UNTAG_MASK returns the untag mask. It indicates where the
  tag bits are located in the address.

- ARCH_GET_MAX_TAG_BITS returns the maximum number of tag bits the
  user can request. Zero if LAM is not supported.

Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Alexander Potapenko <[email protected]>
Link: https://lore.kernel.org/all/20230312112612.31869-9-kirill.shutemov%40linux.intel.com
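
[ Illustration, not part of the commit: a sketch of the three handles
  in do_arch_prctl_64(), assuming LAM_U57_BITS == 6 and the untag mask
  living in mm->context. ]

	case ARCH_GET_UNTAG_MASK:
		return put_user(task->mm->context.untag_mask,
				(unsigned long __user *)arg2);
	case ARCH_ENABLE_TAGGED_ADDR:
		return prctl_enable_tagged_addr(task->mm, arg2);
	case ARCH_GET_MAX_TAG_BITS:
		if (!cpu_feature_enabled(X86_FEATURE_LAM))
			return put_user(0, (unsigned long __user *)arg2);
		return put_user(LAM_U57_BITS, (unsigned long __user *)arg2);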


# 5ef495e5 12-Mar-2023 Kirill A. Shutemov <[email protected]>

x86: Allow atomic MM_CONTEXT flags setting

So far there has been no need for atomic setting of MM context flags in
mm_context_t::flags. The flags are set early in exec and never change
after that.

LAM enabling requires atomic flag setting. The upcoming flag
MM_CONTEXT_FORCE_TAGGED_SVA can be set much later in the process
lifetime, when multiple threads exist.

Convert the field to unsigned long and do MM_CONTEXT_* accesses with
__set_bit() and test_bit().

No functional changes.

Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Alexander Potapenko <[email protected]>
Link: https://lore.kernel.org/all/20230312112612.31869-3-kirill.shutemov%40linux.intel.com
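
[ Illustration, not part of the commit: a sketch of the converted
  accesses; the MM_CONTEXT_* names become bit numbers rather than
  masks. ]

	/* Set (still single-threaded at exec time, but now atomic-ready): */
	__set_bit(MM_CONTEXT_UPROBE_IA32, &mm->context.flags);

	/* Test: */
	bool ia32 = test_bit(MM_CONTEXT_UPROBE_IA32, &mm->context.flags);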


# 7fef0997 07-Mar-2023 Linus Torvalds <[email protected]>

x86/resctl: fix scheduler confusion with 'current'

The implementation of 'current' on x86 is very intentionally special: it
is a very common thing to look up, and it uses 'this_cpu_read_stable()'
to get the current thread pointer efficiently from per-cpu storage.

And the keyword in there is 'stable': the current thread pointer never
changes as far as a single thread is concerned. Even when a thread is
preempted, or moved to another CPU, or even across an explicit call to
'schedule()', that thread will still have the same value for 'current'.

It is, after all, the kernel base pointer to thread-local storage.
That's why it's stable to begin with, but it's also why it's important
enough that we have that special 'this_cpu_read_stable()' access for it.

So this is all done very intentionally to allow the compiler to treat
'current' as a value that never visibly changes, so that the compiler
can do CSE and combine multiple different 'current' accesses into one.

However, there is obviously one very special situation when the
currently running thread does actually change: inside the scheduler
itself.

So the scheduler code paths are special, and do not have a 'current'
thread at all. Instead there are _two_ threads: the previous and the
next thread - typically called 'prev' and 'next' (or prev_p/next_p)
internally.

So this is all actually quite straightforward and simple, and not all
that complicated.

Except for when you then have special code that is run in scheduler
context, that code then has to be aware that 'current' isn't really a
valid thing. Did you mean 'prev'? Did you mean 'next'?

In fact, even if you then look at the code and use 'current' after the
new value has been assigned to the percpu variable, we have explicitly
told the compiler that 'current' is magical and always stable. So the
compiler is quite free to use an older (or newer) value of 'current',
and the actual assignment to the percpu storage is not relevant even if
it might look that way.

Which is exactly what happened in the resctl code, that blithely used
'current' in '__resctrl_sched_in()' when it really wanted the new
process state (as implied by the name: we're scheduling 'into' that new
resctl state). And clang would end up just using the old thread pointer
value at least in some configurations.

This could have happened with gcc too, and purely depends on random
compiler details. Clang just seems to have been more aggressive about
moving the read of the per-cpu current_task pointer around.

The fix is trivial: just make the resctl code adhere to the scheduler
rules of using the prev/next thread pointer explicitly, instead of using
'current' in a situation where it just wasn't valid.

That same code is then also used outside of the scheduler context (when
a thread's resctl state is explicitly changed), and then we will just
pass in 'current' as that pointer, of course. There is no ambiguity in
that case.

The fix may be trivial, but noticing and figuring out what went wrong
was not. The credit for that goes to Stephane Eranian.

Reported-by: Stephane Eranian <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/lkml/[email protected]/
Reviewed-by: Nick Desaulniers <[email protected]>
Tested-by: Tony Luck <[email protected]>
Tested-by: Stephane Eranian <[email protected]>
Tested-by: Babu Moger <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
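
[ Illustration, not part of the commit: a sketch of the rule being
  applied; the scheduler path passes the incoming task explicitly, and
  only non-scheduler callers pass current. The function shape is
  assumed from the description. ]

	/* In __switch_to(): we are scheduling in 'next_p', so say so. */
	resctrl_sched_in(next_p);

	/* Outside scheduler context there is no ambiguity: */
	resctrl_sched_in(current);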


Revision tags: v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4
# 6007878a 04-Nov-2022 Juergen Gross <[email protected]>

x86/cpu: Switch to cpu_feature_enabled() for X86_FEATURE_XENPV

Convert the remaining cases of static_cpu_has(X86_FEATURE_XENPV) and
boot_cpu_has(X86_FEATURE_XENPV) to use cpu_feature_enabled(), allowing
more efficient code in case the kernel is configured without
CONFIG_XEN_PV.

Signed-off-by: Juergen Gross <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
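
[ Illustration, not part of the commit: the conversion itself is
  mechanical, amounting to one-liners of this form in the affected
  files. ]

	-	if (static_cpu_has(X86_FEATURE_XENPV))
	+	if (cpu_feature_enabled(X86_FEATURE_XENPV))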


Revision tags: v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6
# d7b6d709 15-Sep-2022 Thomas Gleixner <[email protected]>

x86/percpu: Move irq_stack variables next to current_task

Further extend struct pcpu_hot with the hard and soft irq stack
pointers.

Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
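
[ Illustration, not part of the commit: a trimmed sketch of what struct
  pcpu_hot looks like after this step. The exact field set and ordering
  are assumptions; the cache-line padding is the point of the design. ]

	struct pcpu_hot {
		union {
			struct {
				struct task_struct	*current_task;
				int			preempt_count;
				int			cpu_number;
				unsigned long		top_of_stack;
				void			*hardirq_stack_ptr;
				u16			softirq_pending;
				bool			hardirq_stack_inuse;
			};
			u8	pad[64];  /* hot data stays in one cache line */
		};
	};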

