|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6 |
|
| #
a1e4cc01 |
| 03-Mar-2025 |
Brian Gerst <[email protected]> |
x86/percpu: Move current_task to percpu hot section
No functional change.
Signed-off-by: Brian Gerst <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Uros Bizjak <ubizjak
x86/percpu: Move current_task to percpu hot section
No functional change.
Signed-off-by: Brian Gerst <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Uros Bizjak <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
385f72c8 |
| 03-Mar-2025 |
Brian Gerst <[email protected]> |
x86/percpu: Move top_of_stack to percpu hot section
No functional change.
Signed-off-by: Brian Gerst <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Uros Bizjak <ubizjak
x86/percpu: Move top_of_stack to percpu hot section
No functional change.
Signed-off-by: Brian Gerst <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Uros Bizjak <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1 |
|
| #
2df1ad0d |
| 02-Feb-2025 |
Brian Gerst <[email protected]> |
x86/arch_prctl: Simplify sys_arch_prctl()
Use in_ia32_syscall() instead of a compat syscall entry.
No change in functionality intended.
Signed-off-by: Brian Gerst <[email protected]> Signed-off-by
x86/arch_prctl: Simplify sys_arch_prctl()
Use in_ia32_syscall() instead of a compat syscall entry.
No change in functionality intended.
Signed-off-by: Brian Gerst <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7 |
|
| #
24b8a236 |
| 18-Oct-2023 |
Linus Torvalds <[email protected]> |
x86/fpu: Clean up FPU switching in the middle of task switching
It happens to work, but it's very very wrong, because our 'current' macro is magic that is supposedly loading a stable value.
It just
x86/fpu: Clean up FPU switching in the middle of task switching
It happens to work, but it's very very wrong, because our 'current' macro is magic that is supposedly loading a stable value.
It just happens to be not quite stable enough and the compilers re-load the value enough for this code to work. But it's wrong.
The whole
struct fpu *prev_fpu = &prev->fpu;
thing in __switch_to() is pretty ugly. There's no reason why we should look at that 'prev_fpu' pointer there, or pass it down.
And it only generates worse code, in how it loads 'current' when __switch_to() has the right task pointers.
The attached patch not only cleans this up, it actually generates better code too:
(a) it removes one push/pop pair at entry/exit because there's one less register used (no 'current')
(b) it removes that pointless load of 'current' because it just uses the right argument:
- movq %gs:pcpu_hot(%rip), %r12 - testq $16384, (%r12) + testq $16384, (%rdi)
Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Uros Bizjak <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2 |
|
| #
7fef0997 |
| 07-Mar-2023 |
Linus Torvalds <[email protected]> |
x86/resctl: fix scheduler confusion with 'current'
The implementation of 'current' on x86 is very intentionally special: it is a very common thing to look up, and it uses 'this_cpu_read_stable()' to
x86/resctl: fix scheduler confusion with 'current'
The implementation of 'current' on x86 is very intentionally special: it is a very common thing to look up, and it uses 'this_cpu_read_stable()' to get the current thread pointer efficiently from per-cpu storage.
And the keyword in there is 'stable': the current thread pointer never changes as far as a single thread is concerned. Even if when a thread is preempted, or moved to another CPU, or even across an explicit call 'schedule()' that thread will still have the same value for 'current'.
It is, after all, the kernel base pointer to thread-local storage. That's why it's stable to begin with, but it's also why it's important enough that we have that special 'this_cpu_read_stable()' access for it.
So this is all done very intentionally to allow the compiler to treat 'current' as a value that never visibly changes, so that the compiler can do CSE and combine multiple different 'current' accesses into one.
However, there is obviously one very special situation when the currently running thread does actually change: inside the scheduler itself.
So the scheduler code paths are special, and do not have a 'current' thread at all. Instead there are _two_ threads: the previous and the next thread - typically called 'prev' and 'next' (or prev_p/next_p) internally.
So this is all actually quite straightforward and simple, and not all that complicated.
Except for when you then have special code that is run in scheduler context, that code then has to be aware that 'current' isn't really a valid thing. Did you mean 'prev'? Did you mean 'next'?
In fact, even if then look at the code, and you use 'current' after the new value has been assigned to the percpu variable, we have explicitly told the compiler that 'current' is magical and always stable. So the compiler is quite free to use an older (or newer) value of 'current', and the actual assignment to the percpu storage is not relevant even if it might look that way.
Which is exactly what happened in the resctl code, that blithely used 'current' in '__resctrl_sched_in()' when it really wanted the new process state (as implied by the name: we're scheduling 'into' that new resctl state). And clang would end up just using the old thread pointer value at least in some configurations.
This could have happened with gcc too, and purely depends on random compiler details. Clang just seems to have been more aggressive about moving the read of the per-cpu current_task pointer around.
The fix is trivial: just make the resctl code adhere to the scheduler rules of using the prev/next thread pointer explicitly, instead of using 'current' in a situation where it just wasn't valid.
That same code is then also used outside of the scheduler context (when a thread resctl state is explicitly changed), and then we will just pass in 'current' as that pointer, of course. There is no ambiguity in that case.
The fix may be trivial, but noticing and figuring out what went wrong was not. The credit for that goes to Stephane Eranian.
Reported-by: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Link: https://lore.kernel.org/lkml/[email protected]/ Reviewed-by: Nick Desaulniers <[email protected]> Tested-by: Tony Luck <[email protected]> Tested-by: Stephane Eranian <[email protected]> Tested-by: Babu Moger <[email protected]> Cc: [email protected] Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6 |
|
| #
c063a217 |
| 15-Sep-2022 |
Thomas Gleixner <[email protected]> |
x86/percpu: Move current_top_of_stack next to current_task
Extend the struct pcpu_hot cacheline with current_top_of_stack; another very frequently used value.
Signed-off-by: Thomas Gleixner <tglx@l
x86/percpu: Move current_top_of_stack next to current_task
Extend the struct pcpu_hot cacheline with current_top_of_stack; another very frequently used value.
Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
e57ef2ed |
| 15-Sep-2022 |
Thomas Gleixner <[email protected]> |
x86: Put hot per CPU variables into a struct
The layout of per-cpu variables is at the mercy of the compiler. This can lead to random performance fluctuations from build to build.
Create a structur
x86: Put hot per CPU variables into a struct
The layout of per-cpu variables is at the mercy of the compiler. This can lead to random performance fluctuations from build to build.
Create a structure to hold some of the hottest per-cpu variables, starting with current_task.
Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7 |
|
| #
f5c0b4f3 |
| 12-May-2022 |
Thomas Gleixner <[email protected]> |
x86/prctl: Remove pointless task argument
The functions invoked via do_arch_prctl_common() can only operate on the current task and none of these function uses the task argument.
Signed-off-by: Tho
x86/prctl: Remove pointless task argument
The functions invoked via do_arch_prctl_common() can only operate on the current task and none of these function uses the task argument.
Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/87lev7vtxj.ffs@tglx
show more ...
|
|
Revision tags: v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1 |
|
| #
3a24a608 |
| 25-Mar-2022 |
Brian Gerst <[email protected]> |
x86/32: Remove lazy GS macros
GS is always a user segment now.
Signed-off-by: Brian Gerst <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Thomas Gleixner <tglx@linutron
x86/32: Remove lazy GS macros
GS is always a user segment now.
Signed-off-by: Brian Gerst <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6 |
|
| #
63e81807 |
| 15-Oct-2021 |
Thomas Gleixner <[email protected]> |
x86/fpu: Move context switch and exit to user inlines into sched.h
internal.h is a kitchen sink which needs to get out of the way to prepare for the upcoming changes.
Move the context switch and ex
x86/fpu: Move context switch and exit to user inlines into sched.h
internal.h is a kitchen sink which needs to get out of the way to prepare for the upcoming changes.
Move the context switch and exit to user inlines into a separate header, which is all that code needs.
Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
show more ...
|
| #
9568bfb4 |
| 15-Oct-2021 |
Thomas Gleixner <[email protected]> |
x86/fpu: Remove pointless argument from switch_fpu_finish()
Unused since the FPU switching rework.
Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Li
x86/fpu: Remove pointless argument from switch_fpu_finish()
Unused since the FPU switching rework.
Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4 |
|
| #
44e21535 |
| 29-Jun-2020 |
Dmitry Safonov <[email protected]> |
x86/dumpstack: Add log_lvl to __show_regs()
show_trace_log_lvl() provides x86 platform-specific way to unwind backtrace with a given log level. Unfortunately, registers dump(s) are not printed with
x86/dumpstack: Add log_lvl to __show_regs()
show_trace_log_lvl() provides x86 platform-specific way to unwind backtrace with a given log level. Unfortunately, registers dump(s) are not printed with the same log level - instead, KERN_DEFAULT is always used.
Arista's switches uses quite common setup with rsyslog, where only urgent messages goes to console (console_log_level=KERN_ERR), everything else goes into /var/log/ as the console baud-rate often is indecently slow (9600 bps).
Backtrace dumps without registers printed have proven to be as useful as morning standups. Furthermore, in order to introduce KERN_UNSUPPRESSED (which I believe is still the most elegant way to fix raciness of sysrq[1]) the log level should be passed down the stack to register dumping functions. Besides, there is a potential use-case for printing traces with KERN_DEBUG level [2] (where registers dump shouldn't appear with higher log level).
Add log_lvl parameter to __show_regs(). Keep the used log level intact to separate visible change.
[1]: https://lore.kernel.org/lkml/[email protected]/ [2]: https://lore.kernel.org/linux-doc/[email protected]/
Signed-off-by: Dmitry Safonov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Petr Mladek <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.8-rc3, v5.8-rc2, v5.8-rc1 |
|
| #
e31cf2f4 |
| 09-Jun-2020 |
Mike Rapoport <[email protected]> |
mm: don't include asm/pgtable.h if linux/mm.h is already included
Patch series "mm: consolidate definitions of page table accessors", v2.
The low level page table accessors (pXY_index(), pXY_offset
mm: don't include asm/pgtable.h if linux/mm.h is already included
Patch series "mm: consolidate definitions of page table accessors", v2.
The low level page table accessors (pXY_index(), pXY_offset()) are duplicated across all architectures and sometimes more than once. For instance, we have 31 definition of pgd_offset() for 25 supported architectures.
Most of these definitions are actually identical and typically it boils down to, e.g.
static inline unsigned long pmd_index(unsigned long address) { return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1); }
static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address) { return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address); }
These definitions can be shared among 90% of the arches provided XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.
For architectures that really need a custom version there is always possibility to override the generic version with the usual ifdefs magic.
These patches introduce include/linux/pgtable.h that replaces include/asm-generic/pgtable.h and add the definitions of the page table accessors to the new header.
This patch (of 12):
The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the functions involving page table manipulations, e.g. pte_alloc() and pmd_alloc(). So, there is no point to explicitly include <asm/pgtable.h> in the files that include <linux/mm.h>.
The include statements in such cases are remove with a simple loop:
for f in $(git grep -l "include <linux/mm.h>") ; do sed -i -e '/include <asm\/pgtable.h>/ d' $f done
Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Cain <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chris Zankel <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Greentime Hu <[email protected]> Cc: Greg Ungerer <[email protected]> Cc: Guan Xuetao <[email protected]> Cc: Guo Ren <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Helge Deller <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ley Foon Tan <[email protected]> Cc: Mark Salter <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Matt Turner <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Simek <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Nick Hu <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Rich Felker <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tony Luck <[email protected]> Cc: Vincent Chen <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yoshinori Sato <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
show more ...
|
|
Revision tags: v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5 |
|
| #
8dd97c65 |
| 05-May-2020 |
Reinette Chatre <[email protected]> |
x86/resctrl: Rename asm/resctrl_sched.h to asm/resctrl.h
asm/resctrl_sched.h is dedicated to the code used for configuration of the CPU resource control state when a task is scheduled.
Rename resct
x86/resctrl: Rename asm/resctrl_sched.h to asm/resctrl.h
asm/resctrl_sched.h is dedicated to the code used for configuration of the CPU resource control state when a task is scheduled.
Rename resctrl_sched.h to resctrl.h in preparation of additions that will no longer make this file dedicated to work done during scheduling.
No functional change.
Suggested-by: Borislav Petkov <[email protected]> Signed-off-by: Reinette Chatre <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/6914e0ef880b539a82a6d889f9423496d471ad1d.1588715690.git.reinette.chatre@intel.com
show more ...
|
|
Revision tags: v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6 |
|
| #
ffd75b37 |
| 13-Mar-2020 |
Brian Gerst <[email protected]> |
x86: Remove unneeded includes
Clean up includes of and in <asm/syscalls.h>
Signed-off-by: Brian Gerst <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kern
x86: Remove unneeded includes
Clean up includes of and in <asm/syscalls.h>
Signed-off-by: Brian Gerst <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3 |
|
| #
2b10906f |
| 19-Dec-2019 |
Brian Gerst <[email protected]> |
x86: Remove force_iret()
force_iret() was originally intended to prevent the return to user mode with the SYSRET or SYSEXIT instructions, in cases where the register state could have been changed to
x86: Remove force_iret()
force_iret() was originally intended to prevent the return to user mode with the SYSRET or SYSEXIT instructions, in cases where the register state could have been changed to be incompatible with those instructions. The entry code has been significantly reworked since then, and register state is validated before SYSRET or SYSEXIT are used. force_iret() no longer serves its original purpose and can be eliminated.
Signed-off-by: Brian Gerst <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8 |
|
| #
a24ca997 |
| 11-Nov-2019 |
Thomas Gleixner <[email protected]> |
x86/iopl: Remove legacy IOPL option
The IOPL emulation via the I/O bitmap is sufficient. Remove the legacy cruft dealing with the (e)flags based IOPL mechanism.
Signed-off-by: Thomas Gleixner <tglx
x86/iopl: Remove legacy IOPL option
The IOPL emulation via the I/O bitmap is sufficient. Remove the legacy cruft dealing with the (e)flags based IOPL mechanism.
Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Juergen Gross <[email protected]> (Paravirt and Xen parts) Acked-by: Andy Lutomirski <[email protected]>
show more ...
|
| #
2fff071d |
| 11-Nov-2019 |
Thomas Gleixner <[email protected]> |
x86/process: Unify copy_thread_tls()
While looking at the TSS io bitmap it turned out that any change in that area would require identical changes to copy_thread_tls(). The 32 and 64 bit variants sh
x86/process: Unify copy_thread_tls()
While looking at the TSS io bitmap it turned out that any change in that area would require identical changes to copy_thread_tls(). The 32 and 64 bit variants share sufficient code to consolidate them into a common function to avoid duplication of upcoming modifications.
Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Andy Lutomirski <[email protected]>
show more ...
|
|
Revision tags: v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3, v5.2-rc2, v5.2-rc1 |
|
| #
3c88c692 |
| 07-May-2019 |
Peter Zijlstra <[email protected]> |
x86/stackframe/32: Provide consistent pt_regs
Currently pt_regs on x86_32 has an oddity in that kernel regs (!user_mode(regs)) are short two entries (esp/ss). This means that any code trying to use
x86/stackframe/32: Provide consistent pt_regs
Currently pt_regs on x86_32 has an oddity in that kernel regs (!user_mode(regs)) are short two entries (esp/ss). This means that any code trying to use them (typically: regs->sp) needs to jump through some unfortunate hoops.
Change the entry code to fix this up and create a full pt_regs frame.
This then simplifies various trampolines in ftrace and kprobes, the stack unwinder, ptrace, kdump and kgdb.
Much thanks to Josh for help with the cleanups!
Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Josh Poimboeuf <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
show more ...
|
|
Revision tags: v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4 |
|
| #
5f409e20 |
| 03-Apr-2019 |
Rik van Riel <[email protected]> |
x86/fpu: Defer FPU state load until return to userspace
Defer loading of FPU state until return to userspace. This gives the kernel the potential to skip loading FPU state for tasks that stay in ker
x86/fpu: Defer FPU state load until return to userspace
Defer loading of FPU state until return to userspace. This gives the kernel the potential to skip loading FPU state for tasks that stay in kernel mode, or for tasks that end up with repeated invocations of kernel_fpu_begin() & kernel_fpu_end().
The fpregs_lock/unlock() section ensures that the registers remain unchanged. Otherwise a context switch or a bottom half could save the registers to its FPU context and the processor's FPU registers would became random if modified at the same time.
KVM swaps the host/guest registers on entry/exit path. This flow has been kept as is. First it ensures that the registers are loaded and then saves the current (host) state before it loads the guest's registers. The swap is done at the very end with disabled interrupts so it should not change anymore before theg guest is entered. The read/save version seems to be cheaper compared to memcpy() in a micro benchmark.
Each thread gets TIF_NEED_FPU_LOAD set as part of fork() / fpu__copy(). For kernel threads, this flag gets never cleared which avoids saving / restoring the FPU state for kernel threads and during in-kernel usage of the FPU registers.
[ bp: Correct and update commit message and fix checkpatch warnings. s/register/registers/ where it is used in plural. minor comment corrections. remove unused trace_x86_fpu_activate_state() TP. ]
Signed-off-by: Rik van Riel <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Aubrey Li <[email protected]> Cc: Babu Moger <[email protected]> Cc: "Chang S. Bae" <[email protected]> Cc: Dmitry Safonov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jann Horn <[email protected]> Cc: "Jason A. Donenfeld" <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Konrad Rzeszutek Wilk <[email protected]> Cc: kvm ML <[email protected]> Cc: Nicolai Stange <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: "Radim Krčmář" <[email protected]> Cc: Tim Chen <[email protected]> Cc: Waiman Long <[email protected]> Cc: x86-ml <[email protected]> Cc: Yi Wang <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
show more ...
|
| #
2722146e |
| 03-Apr-2019 |
Sebastian Andrzej Siewior <[email protected]> |
x86/fpu: Remove fpu->initialized
The struct fpu.initialized member is always set to one for user tasks and zero for kernel tasks. This avoids saving/restoring the FPU registers for kernel threads.
x86/fpu: Remove fpu->initialized
The struct fpu.initialized member is always set to one for user tasks and zero for kernel tasks. This avoids saving/restoring the FPU registers for kernel threads.
The ->initialized = 0 case for user tasks has been removed in previous changes, for instance, by doing an explicit unconditional init at fork() time for FPU-less systems which was otherwise delayed until the emulated opcode.
The context switch code (switch_fpu_prepare() + switch_fpu_finish()) can't unconditionally save/restore registers for kernel threads. Not only would it slow down the switch but also load a zeroed xcomp_bv for XSAVES.
For kernel_fpu_begin() (+end) the situation is similar: EFI with runtime services uses this before alternatives_patched is true. Which means that this function is used too early and it wasn't the case before.
For those two cases, use current->mm to distinguish between user and kernel thread. For kernel_fpu_begin() skip save/restore of the FPU registers.
During the context switch into a kernel thread don't do anything. There is no reason to save the FPU state of a kernel thread.
The reordering in __switch_to() is important because the current() pointer needs to be valid before switch_fpu_finish() is invoked so ->mm is seen of the new task instead the old one.
N.B.: fpu__save() doesn't need to check ->mm because it is called by user tasks only.
[ bp: Massage. ]
Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Aubrey Li <[email protected]> Cc: Babu Moger <[email protected]> Cc: "Chang S. Bae" <[email protected]> Cc: Dmitry Safonov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jann Horn <[email protected]> Cc: "Jason A. Donenfeld" <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: kvm ML <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Nicolai Stange <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: Will Deacon <[email protected]> Cc: x86-ml <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
show more ...
|
| #
6dd677a0 |
| 03-Apr-2019 |
Sebastian Andrzej Siewior <[email protected]> |
x86/fpu: Remove fpu__restore()
There are no users of fpu__restore() so it is time to remove it. The comment regarding fpu__restore() and TS bit is stale since commit
b3b0870ef3ffe ("i387: do not
x86/fpu: Remove fpu__restore()
There are no users of fpu__restore() so it is time to remove it. The comment regarding fpu__restore() and TS bit is stale since commit
b3b0870ef3ffe ("i387: do not preload FPU state at task switch time")
and has no meaning since.
Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Aubrey Li <[email protected]> Cc: Babu Moger <[email protected]> Cc: "Chang S. Bae" <[email protected]> Cc: Dmitry Safonov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jann Horn <[email protected]> Cc: "Jason A. Donenfeld" <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: kvm ML <[email protected]> Cc: [email protected] Cc: Nicolai Stange <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: Rik van Riel <[email protected]> Cc: x86-ml <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.1-rc3, v5.1-rc2, v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7 |
|
| #
6690e86b |
| 14-Feb-2019 |
Peter Zijlstra <[email protected]> |
sched/x86: Save [ER]FLAGS on context switch
Effectively reverts commit:
2c7577a75837 ("sched/x86_64: Don't save flags on context switch")
Specifically because SMAP uses FLAGS.AC which invalidate
sched/x86: Save [ER]FLAGS on context switch
Effectively reverts commit:
2c7577a75837 ("sched/x86_64: Don't save flags on context switch")
Specifically because SMAP uses FLAGS.AC which invalidates the claim that the kernel has clean flags.
In particular; while preemption from interrupt return is fine (the IRET frame on the exception stack contains FLAGS) it breaks any code that does synchonous scheduling, including preempt_enable().
This has become a significant issue ever since commit:
5b24a7a2aa20 ("Add 'unsafe' user access functions for batched accesses")
provided for means of having 'normal' C code between STAC / CLAC, exposing the FLAGS.AC state. So far this hasn't led to trouble, however fix it before it comes apart.
Reported-by: Julien Thierry <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Fixes: 5b24a7a2aa20 ("Add 'unsafe' user access functions for batched accesses") Signed-off-by: Ingo Molnar <[email protected]>
show more ...
|
|
Revision tags: v5.0-rc6, v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5 |
|
| #
e08e3211 |
| 28-Nov-2018 |
Sebastian Andrzej Siewior <[email protected]> |
x86/process/32: Remove asm/math_emu.h include
The math_emu.h header files contains the definition of struct math_emu_info which is not used in this file.
Remove the asm/math_emu.h include.
Signed-
x86/process/32: Remove asm/math_emu.h include
The math_emu.h header files contains the definition of struct math_emu_info which is not used in this file.
Remove the asm/math_emu.h include.
Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Andy Lutomirski <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: "Jason A. Donenfeld" <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jann Horn <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: kvm ML <[email protected]> Cc: x86-ml <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v4.20-rc4 |
|
| #
ff16701a |
| 25-Nov-2018 |
Thomas Gleixner <[email protected]> |
x86/process: Consolidate and simplify switch_to_xtra() code
Move the conditional invocation of __switch_to_xtra() into an inline function so the logic can be shared between 32 and 64 bit.
Remove th
x86/process: Consolidate and simplify switch_to_xtra() code
Move the conditional invocation of __switch_to_xtra() into an inline function so the logic can be shared between 32 and 64 bit.
Remove the handthrough of the TSS pointer and retrieve the pointer directly in the bitmap handling function. Use this_cpu_ptr() instead of the per_cpu() indirection.
This is a preparatory change so integration of conditional indirect branch speculation optimization happens only in one place.
Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Tim Chen <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Casey Schaufler <[email protected]> Cc: Asit Mallick <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: Jon Masters <[email protected]> Cc: Waiman Long <[email protected]> Cc: Greg KH <[email protected]> Cc: Dave Stewart <[email protected]> Cc: Kees Cook <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
show more ...
|