Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3

d262a192 | 12-Feb-2025 | Christophe Leroy <[email protected]>
powerpc/code-patching: Fix KASAN hit by not flagging text patching area as VM_ALLOC

Erhard reported the following KASAN hit while booting his PowerMac G4 with a KASAN-enabled kernel 6.13-rc6:
BUG: KASAN: vmalloc-out-of-bounds in copy_to_kernel_nofault+0xd8/0x1c8
Write of size 8 at addr f1000000 by task chronyd/1293

CPU: 0 UID: 123 PID: 1293 Comm: chronyd Tainted: G W 6.13.0-rc6-PMacG4 #2
Tainted: [W]=WARN
Hardware name: PowerMac3,6 7455 0x80010303 PowerMac
Call Trace:
[c2437590] [c1631a84] dump_stack_lvl+0x70/0x8c (unreliable)
[c24375b0] [c0504998] print_report+0xdc/0x504
[c2437610] [c050475c] kasan_report+0xf8/0x108
[c2437690] [c0505a3c] kasan_check_range+0x24/0x18c
[c24376a0] [c03fb5e4] copy_to_kernel_nofault+0xd8/0x1c8
[c24376c0] [c004c014] patch_instructions+0x15c/0x16c
[c2437710] [c00731a8] bpf_arch_text_copy+0x60/0x7c
[c2437730] [c0281168] bpf_jit_binary_pack_finalize+0x50/0xac
[c2437750] [c0073cf4] bpf_int_jit_compile+0xb30/0xdec
[c2437880] [c0280394] bpf_prog_select_runtime+0x15c/0x478
[c24378d0] [c1263428] bpf_prepare_filter+0xbf8/0xc14
[c2437990] [c12677ec] bpf_prog_create_from_user+0x258/0x2b4
[c24379d0] [c027111c] do_seccomp+0x3dc/0x1890
[c2437ac0] [c001d8e0] system_call_exception+0x2dc/0x420
[c2437f30] [c00281ac] ret_from_syscall+0x0/0x2c
--- interrupt: c00 at 0x5a1274
NIP: 005a1274 LR: 006a3b3c CTR: 005296c8
REGS: c2437f40 TRAP: 0c00 Tainted: G W (6.13.0-rc6-PMacG4)
MSR: 0200f932 <VEC,EE,PR,FP,ME,IR,DR,RI> CR: 24004422 XER: 00000000
GPR00: 00000166 af8f3fa0 a7ee3540 00000001 00000000 013b6500 005a5858 0200f932
GPR08: 00000000 00001fe9 013d5fc8 005296c8 2822244c 00b2fcd8 00000000 af8f4b57
GPR16: 00000000 00000001 00000000 00000000 00000000 00000001 00000000 00000002
GPR24: 00afdbb0 00000000 00000000 00000000 006e0004 013ce060 006e7c1c 00000001
NIP [005a1274] 0x5a1274
LR [006a3b3c] 0x6a3b3c
--- interrupt: c00
The buggy address belongs to the virtual mapping at [f1000000, f1002000) created by:
text_area_cpu_up+0x20/0x190

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:00000000 index:0x0 pfn:0x76e30
flags: 0x80000000(zone=2)
raw: 80000000 00000000 00000122 00000000 00000000 00000000 ffffffff 00000001
raw: 00000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 f0ffff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 f0ffff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>f1000000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
           ^
 f1000080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
 f1000100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
==================================================================
f8 corresponds to KASAN_VMALLOC_INVALID, which means the area is not initialised and hence not supposed to be used yet.
The powerpc text patching infrastructure allocates a virtual memory area using get_vm_area() and flags it as VM_ALLOC. But that flag is meant for vmalloc(), and vmalloc()-allocated memory is not supposed to be used before a call to __vmalloc_node_range(), which is never called for that area.

That went undetected until commit e4137f08816b ("mm, kasan, kmsan: instrument copy_from/to_kernel_nofault").

The area allocated by text_area_cpu_up() is not vmalloc memory; it is mapped directly on demand when needed by map_kernel_page(). There is no VM flag corresponding to such usage, so just pass no flag. That way the area will be unpoisoned and usable immediately.
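For illustration, a minimal sketch of the fix described above, assuming the text_area_cpu_up() shape (names simplified, not the literal diff):

#include <linux/vmalloc.h>

/* Sketch: request the patch area with no VM flags. With VM_ALLOC, KASAN
 * leaves the shadow poisoned (f8 == KASAN_VMALLOC_INVALID) until
 * __vmalloc_node_range() runs, which never happens for this area. */
static int __init text_area_alloc_sketch(void)
{
        struct vm_struct *area = get_vm_area(PAGE_SIZE, 0); /* was VM_ALLOC */

        if (!area)
                return -ENOMEM;

        /* Pages are mapped on demand later via map_kernel_page(). */
        return 0;
}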
Reported-by: Erhard Furtner <[email protected]>
Closes: https://lore.kernel.org/all/20250112135832.57c92322@yea/
Fixes: 37bc3e5fd764 ("powerpc/lib/code-patching: Use alternate map for patch_instruction()")
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Madhavan Srinivasan <[email protected]>
Link: https://patch.msgid.link/06621423da339b374f48c0886e3a5db18e896be8.1739342693.git.christophe.leroy@csgroup.eu
Revision tags: v6.14-rc2

dc9c5166 | 03-Feb-2025 | Christophe Leroy <[email protected]>
powerpc/code-patching: Disable KASAN report during patching via temporary mm

Erhard reports the following KASAN hit on Talos II (power9) with kernel 6.13:
[ 12.028126] ==================================================================
[ 12.028198] BUG: KASAN: user-memory-access in copy_to_kernel_nofault+0x8c/0x1a0
[ 12.028260] Write of size 8 at addr 0000187e458f2000 by task systemd/1
[ 12.028346] CPU: 87 UID: 0 PID: 1 Comm: systemd Tainted: G T 6.13.0-P9-dirty #3
[ 12.028408] Tainted: [T]=RANDSTRUCT
[ 12.028446] Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV
[ 12.028500] Call Trace:
[ 12.028536] [c000000008dbf3b0] [c000000001656a48] dump_stack_lvl+0xbc/0x110 (unreliable)
[ 12.028609] [c000000008dbf3f0] [c0000000006e2fc8] print_report+0x6b0/0x708
[ 12.028666] [c000000008dbf4e0] [c0000000006e2454] kasan_report+0x164/0x300
[ 12.028725] [c000000008dbf600] [c0000000006e54d4] kasan_check_range+0x314/0x370
[ 12.028784] [c000000008dbf640] [c0000000006e6310] __kasan_check_write+0x20/0x40
[ 12.028842] [c000000008dbf660] [c000000000578e8c] copy_to_kernel_nofault+0x8c/0x1a0
[ 12.028902] [c000000008dbf6a0] [c0000000000acfe4] __patch_instructions+0x194/0x210
[ 12.028965] [c000000008dbf6e0] [c0000000000ade80] patch_instructions+0x150/0x590
[ 12.029026] [c000000008dbf7c0] [c0000000001159bc] bpf_arch_text_copy+0x6c/0xe0
[ 12.029085] [c000000008dbf800] [c000000000424250] bpf_jit_binary_pack_finalize+0x40/0xc0
[ 12.029147] [c000000008dbf830] [c000000000115dec] bpf_int_jit_compile+0x3bc/0x930
[ 12.029206] [c000000008dbf990] [c000000000423720] bpf_prog_select_runtime+0x1f0/0x280
[ 12.029266] [c000000008dbfa00] [c000000000434b18] bpf_prog_load+0xbb8/0x1370
[ 12.029324] [c000000008dbfb70] [c000000000436ebc] __sys_bpf+0x5ac/0x2e00
[ 12.029379] [c000000008dbfd00] [c00000000043a228] sys_bpf+0x28/0x40
[ 12.029435] [c000000008dbfd20] [c000000000038eb4] system_call_exception+0x334/0x610
[ 12.029497] [c000000008dbfe50] [c00000000000c270] system_call_vectored_common+0xf0/0x280
[ 12.029561] --- interrupt: 3000 at 0x3fff82f5cfa8
[ 12.029608] NIP: 00003fff82f5cfa8 LR: 00003fff82f5cfa8 CTR: 0000000000000000
[ 12.029660] REGS: c000000008dbfe80 TRAP: 3000 Tainted: G T (6.13.0-P9-dirty)
[ 12.029735] MSR: 900000000280f032 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI> CR: 42004848 XER: 00000000
[ 12.029855] IRQMASK: 0
GPR00: 0000000000000169 00003fffdcf789a0 00003fff83067100 0000000000000005
GPR04: 00003fffdcf78a98 0000000000000090 0000000000000000 0000000000000008
GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000000000 00003fff836ff7e0 c000000000010678 0000000000000000
GPR16: 0000000000000000 0000000000000000 00003fffdcf78f28 00003fffdcf78f90
GPR20: 0000000000000000 0000000000000000 0000000000000000 00003fffdcf78f80
GPR24: 00003fffdcf78f70 00003fffdcf78d10 00003fff835c7239 00003fffdcf78bd8
GPR28: 00003fffdcf78a98 0000000000000000 0000000000000000 000000011f547580
[ 12.030316] NIP [00003fff82f5cfa8] 0x3fff82f5cfa8
[ 12.030361] LR [00003fff82f5cfa8] 0x3fff82f5cfa8
[ 12.030405] --- interrupt: 3000
[ 12.030444] ==================================================================
Commit c28c15b6d28a ("powerpc/code-patching: Use temporary mm for Radix MMU") is inspired by x86, but unlike x86 it doesn't disable KASAN reports during patching. This wasn't a problem at the beginning because __patch_mem() is not instrumented.

Commit 465cabc97b42 ("powerpc/code-patching: introduce patch_instructions()") uses copy_to_kernel_nofault() to copy several instructions at once. But when using a temporary mm, the destination is not regular kernel memory but a kind of kernel-like memory located in the user address space. Because it is not in the kernel address space, it is not covered by KASAN shadow memory. Since commit e4137f08816b ("mm, kasan, kmsan: instrument copy_from/to_kernel_nofault") KASAN reports bad accesses from copy_to_kernel_nofault(). Here a bad access to user memory is reported because KASAN detects the lack of shadow memory and the address is below TASK_SIZE.

Do as x86 did in commit b3fd8e83ada0 ("x86/alternatives: Use temporary mm for text poking") and disable KASAN reports during patching when using a temporary mm.
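A minimal sketch of that approach, assuming a copy_to_kernel_nofault()-based inner loop as in patch_instructions() (the wrapper name is illustrative):

#include <linux/kasan.h>
#include <linux/uaccess.h>

/* Sketch: suppress KASAN reports while writing through the temporary-mm
 * alias, which sits below TASK_SIZE and has no shadow memory. */
static int patch_copy_sketch(void *patch_addr, const void *code, size_t len)
{
        int err;

        kasan_disable_current();
        err = copy_to_kernel_nofault(patch_addr, code, len);
        kasan_enable_current();

        return err;
}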
Reported-by: Erhard Furtner <[email protected]>
Closes: https://lore.kernel.org/all/20250201151435.48400261@yea/
Fixes: 465cabc97b42 ("powerpc/code-patching: introduce patch_instructions()")
Signed-off-by: Christophe Leroy <[email protected]>
Acked-by: Michael Ellerman <[email protected]>
Signed-off-by: Madhavan Srinivasan <[email protected]>
Link: https://patch.msgid.link/1c05b2a1b02ad75b981cfc45927e0b4a90441046.1738577687.git.christophe.leroy@csgroup.eu
Revision tags: v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5

0c3beacf | 23-Oct-2024 | Mike Rapoport (Microsoft) <[email protected]>
asm-generic: introduce text-patching.h

Several architectures support text patching, but they name the header files that declare patching functions differently.

Make all such headers consistently named text-patching.h and add an empty header in asm-generic for architectures that do not support text patching.
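The generic fallback is essentially an empty header; a sketch (the guard name is illustrative, not copied from the commit):

/* include/asm-generic/text-patching.h */
#ifndef _ASM_GENERIC_TEXT_PATCHING_H
#define _ASM_GENERIC_TEXT_PATCHING_H

/* Architectures that support text patching provide their own
 * <asm/text-patching.h>; this empty header keeps common code
 * building on architectures that do not. */

#endif /* _ASM_GENERIC_TEXT_PATCHING_H */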
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]> # m68k
Acked-by: Arnd Bergmann <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Tested-by: kdevops <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Borislav Petkov (AMD) <[email protected]>
Cc: Brian Cain <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dinh Nguyen <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Johannes Berg <[email protected]>
Cc: John Paul Adrian Glaubitz <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu (Google) <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Russell King <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Steven Rostedt (Google) <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Revision tags: v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1

dbf828aa | 15-May-2024 | Benjamin Gray <[email protected]>
powerpc/code-patching: Add data patch alignment check

The new data patching still needs to be aligned within a cacheline too for the flushes to work correctly. To simplify this requirement, we just say data patches must be aligned.
Detect when data patching is not aligned, returning an invalid argument error.
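A sketch of such a guard, assuming a patch_mem()-style entry point (names illustrative); natural alignment keeps the whole write inside one cacheline for the flush sequence:

#include <linux/align.h>
#include <linux/errno.h>

static int check_data_patch_alignment_sketch(unsigned long addr, size_t len)
{
        /* len is 4 or 8 here, so IS_ALIGNED()'s power-of-two rule holds. */
        if (!IS_ALIGNED(addr, len))
                return -EINVAL;

        return 0;
}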
Signed-off-by: Benjamin Gray <[email protected]>
Reviewed-by: Hari Bathini <[email protected]>
Acked-by: Naveen N Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
e6b8940e | 15-May-2024 | Benjamin Gray <[email protected]>
powerpc/code-patching: Add generic memory patching

patch_instruction() is designed for patching instructions in otherwise readonly memory. Other consumers also sometimes need to patch readonly memory, so have abused patch_instruction() for arbitrary data patches.
This is a problem on ppc64 as patch_instruction() decides on the patch width using the 'instruction' opcode to see if it's a prefixed instruction. Data that triggers this can lead to larger writes, possibly crossing a page boundary and failing the write altogether.
Introduce patch_uint() and patch_ulong(), with aliases patch_u32() and patch_u64() (on ppc64), designed for aligned data patches. The patch size is now determined by the called function, and is passed as an additional parameter to generic internals.
While the instruction flushing is not required for data patches, it remains unconditional in this patch. A followup series is possible if benchmarking shows fewer flushes gives an improvement in some data-patching workload.
ppc32 does not support prefixed instructions, so is unaffected by the original issue. Care is taken in not exposing the size parameter in the public (non-static) interface, so the compiler can const-propagate it away.
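A sketch of that shape (names assumed from the description above): the width-taking helper stays static, while the exported helpers pin the size at the call site so it const-propagates:

static int patch_mem_sketch(void *addr, unsigned long val, bool is_dword)
{
        /* ... shared map/copy/flush path with an explicit width ... */
        return 0;
}

int patch_uint(void *addr, unsigned int val)
{
        return patch_mem_sketch(addr, val, false);
}

int patch_ulong(void *addr, unsigned long val)
{
        return patch_mem_sketch(addr, val, sizeof(unsigned long) == 8);
}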
Signed-off-by: Benjamin Gray <[email protected]>
Reviewed-by: Hari Bathini <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
Revision tags: v6.9, v6.9-rc7

0a956d52 | 05-May-2024 | Mike Rapoport (IBM) <[email protected]>
powerpc: use CONFIG_EXECMEM instead of CONFIG_MODULES where appropriate

There are places where CONFIG_MODULES guards the code that depends on memory allocation being done with module_alloc().

Replace CONFIG_MODULES with CONFIG_EXECMEM in such places.
Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>
Revision tags: v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2

c3710ee7 | 25-Mar-2024 | Benjamin Gray <[email protected]>
powerpc/code-patching: Use dedicated memory routines for patching

The patching page set up as a writable alias may be in quadrant 0 (userspace) if the temporary mm path is used, which causes sanitiser failures. Sanitiser failures also occur on the non-mm path because the plain memset family is instrumented, and KASAN treats the patching window as poisoned.
Introduce locally defined patch_* variants of memset that perform an uninstrumented lower level set, as well as detecting write errors like the original single patch variant does.
copy_to_user() is not correct here, as the PTE makes it a proper kernel page (the EAA is privileged access only, RW). It just happens to be in quadrant 0 because that's the hardware's mechanism for using the current PID vs PID 0 in translations. Importantly, it's incorrect to allow user page accesses.
Now that the patching memsets are used, we also propagate a failure up to the caller as the single patch variant does.
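A sketch of one such variant, using the kernel's __put_kernel_nofault() exception-table primitive so a faulting store surfaces as -EPERM like the single-instruction path (the function name is illustrative):

#include <linux/uaccess.h>

static int patch_memset32_sketch(u32 *addr, u32 val, size_t count)
{
        u32 *end = addr + count;

        /* Uninstrumented stores: KASAN never sees the patching window. */
        for (; addr < end; addr++)
                __put_kernel_nofault(addr, &val, u32, failed);

        return 0;

failed:
        return -EPERM;
}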
Signed-off-by: Benjamin Gray <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
Revision tags: v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7

465cabc9 | 20-Oct-2023 | Hari Bathini <[email protected]>
powerpc/code-patching: introduce patch_instructions()

patch_instruction() entails setting up pte, patching the instruction, clearing the pte and flushing the tlb. If multiple instructions need to be patched, every instruction would have to go through the above drill unnecessarily. Instead, introduce a patch_instructions() function that sets up the pte, clears the pte and flushes the tlb only once per page range of instructions to be patched.

Duplicate most of the patch_instruction() code instead of merging with it, to avoid the performance degradation observed on ppc32 for patch_instruction() with the code paths merged.

Also, set up poking_init() always, as BPF expects poking_init() to be set up even when STRICT_KERNEL_RWX is off.
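A sketch of the batched flow (the map/unmap helpers are stand-ins for the PTE setup, pte_clear() and TLB flush described above):

static int patch_instructions_sketch(void *addr, void *code, size_t len)
{
        int err = 0;

        while (len > 0) {
                /* Patch at most up to the end of the current page. */
                size_t chunk = min_t(size_t, len,
                                     PAGE_SIZE - offset_in_page(addr));
                void *patch_addr = map_patch_area_sketch(addr);

                err = copy_to_kernel_nofault(patch_addr, code, chunk);
                unmap_patch_area_sketch(patch_addr);
                if (err)
                        break;

                addr += chunk;
                code += chunk;
                len -= chunk;
        }

        return err;
}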
Signed-off-by: Hari Bathini <[email protected]>
Acked-by: Song Liu <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
Revision tags: v6.6-rc6, v6.6-rc5

74726fda | 07-Oct-2023 | Christophe Leroy <[email protected]>
powerpc/code-patching: Perform hwsync in __patch_instruction() in case of failure

Commit c28c15b6d28a ("powerpc/code-patching: Use temporary mm for Radix MMU") added a hwsync for when __patch_instruction() fails, which results in quite odd, unbalanced logic.
Instead of calling mb() when __patch_instruction() returns an error, call mb() in the __patch_instruction()'s error path directly.
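A sketch of the rebalanced error path (names illustrative; the success path's cache flushing is elided):

static int __patch_instruction_sketch(u32 *patch_addr, u32 val)
{
        __put_kernel_nofault(patch_addr, &val, u32, failed);
        return 0;

failed:
        mb();   /* hwsync, now in the failing function itself */
        return -EPERM;
}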
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/e88b154eaf2efd9ff177d472d3411dcdec8ff4f5.1696675567.git.christophe.leroy@csgroup.eu
Revision tags: v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1

980411a4 | 16-Dec-2022 | Michael Ellerman <[email protected]>
powerpc/code-patching: Fix oops with DEBUG_VM enabled

Nathan reported that the new per-cpu mm patching oopses if DEBUG_VM is enabled:
------------[ cut here ]------------
kernel BUG at arch/powerpc/mm/pgtable.c:333!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc2+ #1
Hardware name: IBM PowerNV (emulated by qemu) POWER9 0x4e1200 opal:v7.0 PowerNV
...
NIP assert_pte_locked+0x180/0x1a0
LR assert_pte_locked+0x170/0x1a0
Call Trace:
0x60000000 (unreliable)
patch_instruction+0x618/0x6d0
arch_prepare_kprobe+0xfc/0x2d0
register_kprobe+0x520/0x7c0
arch_init_kprobes+0x28/0x3c
init_kprobes+0x108/0x184
do_one_initcall+0x60/0x2e0
kernel_init_freeable+0x1f0/0x3e0
kernel_init+0x34/0x1d0
ret_from_kernel_thread+0x5c/0x64
It's caused by the assert_spin_locked() failing in assert_pte_locked(). The assert fails because the PTE was unlocked in text_area_cpu_up_mm(), and never relocked.
The PTE page shouldn't be freed, the patching_mm is only used for patching on this CPU, only that single PTE is ever mapped, and it's only unmapped at CPU offline.
In fact assert_pte_locked() has a special case to ignore init_mm entirely, and the patching_mm is more-or-less like init_mm, so possibly the check could be skipped for patching_mm too.
But for now be conservative, and use the proper PTE accessors at patching time, so that the PTE lock is held while the PTE is used. That also avoids the warning in assert_pte_locked().
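A sketch of the locked-PTE pattern the fix relies on; get_locked_pte() and pte_unmap_unlock() are the standard mm helpers, while patching_mm/patch_addr stand in for the per-cpu context fields:

static int map_patch_window_sketch(struct mm_struct *patching_mm,
                                   unsigned long patch_addr,
                                   struct page *page)
{
        spinlock_t *ptl;
        pte_t *ptep = get_locked_pte(patching_mm, patch_addr, &ptl);

        if (!ptep)
                return -ENOMEM;

        set_pte_at(patching_mm, patch_addr, ptep, mk_pte(page, PAGE_KERNEL));

        /* ... patch while the PTE lock is held ... */

        pte_clear(patching_mm, patch_addr, ptep);
        pte_unmap_unlock(ptep, ptl);
        return 0;
}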
With that it's no longer necessary to save the PTE in cpu_patching_context for the mm_patch_enabled() case.
Fixes: c28c15b6d28a ("powerpc/code-patching: Use temporary mm for Radix MMU")
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Revision tags: v6.1, v6.1-rc8

6f3a81b6 | 02-Dec-2022 | Christophe Leroy <[email protected]>
powerpc/code-patching: Remove protection against patching init addresses after init

Once the init section is freed, attempting to patch init code ends up in the weeds.
Commit 51c3c62b58b3 ("powerpc: Avoid code patching freed init sections") protected patch_instruction() against that, but it is the responsibility of the caller to ensure that the patched memory is valid.
All callers have now been verified and fixed so the check can be removed.
This improves ftrace activation by about 2% on 8xx.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/504310828f473d424e2ed229eff57bf075f52796.1669969781.git.christophe.leroy@csgroup.eu
84ecfe6f | 02-Dec-2022 | Christophe Leroy <[email protected]>
powerpc/code-patching: Remove #ifdef CONFIG_STRICT_KERNEL_RWX

No need to have one implementation of patch_instruction() for CONFIG_STRICT_KERNEL_RWX and one for !CONFIG_STRICT_KERNEL_RWX.

In patch_instruction(), call raw_patch_instruction() when !CONFIG_STRICT_KERNEL_RWX.
In poking_init(), bail out immediately; it will be equivalent to the weak default implementation.
Everything else is declared static and will be discarded by GCC when !CONFIG_STRICT_KERNEL_RWX.
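A sketch of the resulting single entry point (the helper names follow the description above; treat them as assumptions here):

int patch_instruction_sketch(u32 *addr, ppc_inst_t instr)
{
        /* Compile-time constant: the whole strict-RWX machinery is
         * discarded by GCC when CONFIG_STRICT_KERNEL_RWX is off. */
        if (!IS_ENABLED(CONFIG_STRICT_KERNEL_RWX))
                return raw_patch_instruction(addr, instr);

        return do_patch_instruction(addr, instr);
}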
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/f67d2a109404d03e8fdf1ea15388c8778337a76b.1669969781.git.christophe.leroy@csgroup.eu
Revision tags: v6.1-rc7, v6.1-rc6, v6.1-rc5

2f228ee1 | 09-Nov-2022 | Benjamin Gray <[email protected]>
powerpc/code-patching: Consolidate and cache per-cpu patching context

With the temp mm context support, there are CPU local variables to hold the patch address and pte. Use these in the non-temp mm path as well instead of adding a level of indirection through the text_poke_area vm_struct and pointer chasing the pte.
As both paths use these fields now, there is no need to let unreferenced variables be dropped by the compiler, so it is cleaner to merge them into a single context struct. This has the additional benefit of removing a redundant CPU local pointer, as only one of cpu_patching_mm / text_poke_area is ever used, while remaining well-typed. It also groups each CPU's data into a single cacheline.
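A sketch of the consolidated per-CPU context (field names per the description above; 'area' and 'mm' are mutually exclusive depending on the patching path):

struct patch_context_sketch {
        union {
                struct vm_struct *area; /* text_poke_area path */
                struct mm_struct *mm;   /* temporary-mm path */
        };
        unsigned long addr;
        pte_t *pte;
};

static DEFINE_PER_CPU(struct patch_context_sketch, cpu_patching_context);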
Signed-off-by: Benjamin Gray <[email protected]>
[mpe: Shorten name to 'area' as suggested by Christophe]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
c28c15b6 | 09-Nov-2022 | Christopher M. Riedl <[email protected]>
powerpc/code-patching: Use temporary mm for Radix MMU

x86 supports the notion of a temporary mm which restricts access to temporary PTEs to a single CPU. A temporary mm is useful for situations where a CPU needs to perform sensitive operations (such as patching a STRICT_KERNEL_RWX kernel) requiring temporary mappings without exposing said mappings to other CPUs. Another benefit is that other CPU TLBs do not need to be flushed when the temporary mm is torn down.
Mappings in the temporary mm can be set in the userspace portion of the address-space.
Interrupts must be disabled while the temporary mm is in use. HW breakpoints, which may have been set by userspace as watchpoints on addresses now within the temporary mm, are saved and disabled when loading the temporary mm. The HW breakpoints are restored when unloading the temporary mm. All HW breakpoints are indiscriminately disabled while the temporary mm is in use - this may include breakpoints set by perf.
Use the `poking_init` init hook to prepare a temporary mm and patching address. Initialize the temporary mm using mm_alloc(). Choose a randomized patching address inside the temporary mm userspace address space. The patching address is randomized between PAGE_SIZE and DEFAULT_MAP_WINDOW-PAGE_SIZE.
Bits of entropy with 64K page size on BOOK3S_64:

  bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)

  PAGE_SIZE = 64K, DEFAULT_MAP_WINDOW_USER64 = 128TB
  bits of entropy = log2(128TB / 64K)
  bits of entropy = 31
The upper limit is DEFAULT_MAP_WINDOW due to how the Book3s64 Hash MMU operates - by default the space above DEFAULT_MAP_WINDOW is not available. Currently the Hash MMU does not use a temporary mm so technically this upper limit isn't necessary; however, a larger randomization range does not further "harden" this overall approach and future work may introduce patching with a temporary mm on Hash as well.
Randomization occurs only once during initialization for each CPU as it comes online.
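A sketch of the per-CPU address choice under the stated constraints (the helper name is illustrative):

#include <linux/random.h>

static unsigned long choose_patch_addr_sketch(void)
{
        /* Candidate pages strictly between PAGE_SIZE and
         * DEFAULT_MAP_WINDOW - PAGE_SIZE, page-aligned. */
        unsigned long pages = (DEFAULT_MAP_WINDOW >> PAGE_SHIFT) - 2;

        return PAGE_SIZE + (get_random_long() % pages) * PAGE_SIZE;
}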
The patching page is mapped with PAGE_KERNEL to set EAA[0] for the PTE which ignores the AMR (so no need to unlock/lock KUAP) according to PowerISA v3.0b Figure 35 on Radix.
Based on x86 implementation:
commit 4fc19708b165 ("x86/alternatives: Initialize temporary mm for patching")
and:
commit b3fd8e83ada0 ("x86/alternatives: Use temporary mm for text poking")
From: Benjamin Gray <[email protected]>
Synchronisation is done according to ISA 3.1B Book 3 Chapter 13 "Synchronization Requirements for Context Alterations". Switching the mm is a change to the PID, which requires a CSI before and after the change, and a hwsync between the last instruction that performs address translation for an associated storage access.
Instruction fetch is an associated storage access, but the instruction address mappings are not being changed, so it should not matter which context they use. We must still perform a hwsync to guard arbitrary prior code that may have accessed a userspace address.
TLB invalidation is local and VA specific. Local because only this core used the patching mm, and VA specific because we only care that the writable mapping is purged. Leaving the other mappings intact is more efficient, especially when performing many code patches in a row (e.g., as ftrace would).
Signed-off-by: Christopher M. Riedl <[email protected]>
Signed-off-by: Benjamin Gray <[email protected]>
[mpe: Use mm_alloc() per 107b6828a7cd ("x86/mm: Use mm_alloc() in poking_init()")]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
071c95c1 | 09-Nov-2022 | Benjamin Gray <[email protected]>
powerpc/code-patching: Use WARN_ON and fix check in poking_init

BUG_ON() when failing to initialise the code patching window is unnecessary, and use of BUG_ON is discouraged. We don't set poking_init_done in this case, so failure to init the boot CPU will result in a strict RWX error when a following patch_instruction uses raw_patch_instruction. If it only fails for later CPUs, they won't be onlined in the first place.
The return value of cpuhp_setup_state() is also >= 0 on success, so check for < 0.
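A sketch of the corrected check (callback names as used in this file at the time, assumed here); cpuhp_setup_state() returns the positive dynamic state number for CPUHP_AP_ONLINE_DYN, so only negative values are errors:

void __init poking_init_sketch(void)
{
        int ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
                                    "powerpc/text_poke:online",
                                    text_area_cpu_up,
                                    text_area_cpu_down);

        /* Boot-CPU init failure: warn and leave poking_init_done unset. */
        if (WARN_ON(ret < 0))
                return;

        static_branch_enable(&poking_init_done);
}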
Signed-off-by: Benjamin Gray <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Revision tags: v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2

8b4bb0ad | 15-Aug-2022 | Christophe Leroy <[email protected]>
powerpc/code-patching: Speed up page mapping/unmapping

Since commit 591b4b268435 ("powerpc/code-patching: Pre-map patch area") the patch area is premapped so intermediate page tables are already allocated.

Use __set_pte_at() directly instead of the heavy map_kernel_page(), and for unmapping just do a pte_clear() followed by a flush.
__set_pte_at() can be used directly without the filters in set_pte_at() because we are mapping a normal, non-executable page.
Make sure gcc knows text_poke_area is page aligned in order to optimise the flush.
This change reduces by 66% the time needed to activate ftrace on an 8xx (588000 tb ticks instead of 1744000).
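A sketch of the lightweight map/unmap pair, assuming the pre-mapped patch area so no page-table levels need allocating (powerpc's __set_pte_at() takes a trailing 'percpu' flag):

static void map_patch_page_sketch(pte_t *ptep, unsigned long addr,
                                  struct page *page)
{
        __set_pte_at(&init_mm, addr, ptep, mk_pte(page, PAGE_KERNEL), 0);
}

static void unmap_patch_page_sketch(pte_t *ptep, unsigned long addr)
{
        pte_clear(&init_mm, addr, ptep);
        flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
}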
Signed-off-by: Christophe Leroy <[email protected]>
[mpe: Add ptesync needed on radix to avoid spurious fault]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Revision tags: v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7

bbffdd2f | 09-May-2022 | Christophe Leroy <[email protected]>
powerpc/ftrace: Use patch_instruction() return directly

Instead of returning -EPERM when patch_instruction() fails, just return what patch_instruction() returns.

That simplifies ftrace_modify_code():
   0:  94 21 ff c0   stwu    r1,-64(r1)
   4:  93 e1 00 3c   stw     r31,60(r1)
   8:  7c 7f 1b 79   mr.     r31,r3
   c:  40 80 00 30   bge     3c <ftrace_modify_code+0x3c>
  10:  93 c1 00 38   stw     r30,56(r1)
  14:  7c 9e 23 78   mr      r30,r4
  18:  7c a4 2b 78   mr      r4,r5
  1c:  80 bf 00 00   lwz     r5,0(r31)
  20:  7c 1e 28 40   cmplw   r30,r5
  24:  40 82 00 34   bne     58 <ftrace_modify_code+0x58>
  28:  83 c1 00 38   lwz     r30,56(r1)
  2c:  7f e3 fb 78   mr      r3,r31
  30:  83 e1 00 3c   lwz     r31,60(r1)
  34:  38 21 00 40   addi    r1,r1,64
  38:  48 00 00 00   b       38 <ftrace_modify_code+0x38>
                     38: R_PPC_REL24 patch_instruction
Before:
   0:  94 21 ff c0   stwu    r1,-64(r1)
   4:  93 e1 00 3c   stw     r31,60(r1)
   8:  7c 7f 1b 79   mr.     r31,r3
   c:  40 80 00 4c   bge     58 <ftrace_modify_code+0x58>
  10:  93 c1 00 38   stw     r30,56(r1)
  14:  7c 9e 23 78   mr      r30,r4
  18:  7c a4 2b 78   mr      r4,r5
  1c:  80 bf 00 00   lwz     r5,0(r31)
  20:  7c 08 02 a6   mflr    r0
  24:  90 01 00 44   stw     r0,68(r1)
  28:  7c 1e 28 40   cmplw   r30,r5
  2c:  40 82 00 48   bne     74 <ftrace_modify_code+0x74>
  30:  7f e3 fb 78   mr      r3,r31
  34:  48 00 00 01   bl      34 <ftrace_modify_code+0x34>
                     34: R_PPC_REL24 patch_instruction
  38:  80 01 00 44   lwz     r0,68(r1)
  3c:  20 63 00 00   subfic  r3,r3,0
  40:  83 c1 00 38   lwz     r30,56(r1)
  44:  7c 63 19 10   subfe   r3,r3,r3
  48:  7c 08 03 a6   mtlr    r0
  4c:  83 e1 00 3c   lwz     r31,60(r1)
  50:  38 21 00 40   addi    r1,r1,64
  54:  4e 80 00 20   blr
It improves ftrace activation/deactivation duration by about 3%.
Modify patch_instruction() to return -EPERM on failure in order to match ftrace's expectations. Other users of patch_instruction() do not care about the exact error value returned.
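A sketch of the simplified caller; the final patch_instruction() call becomes the tail call ('b patch_instruction') visible in the first listing (helper names assumed from powerpc's inst accessors):

static int ftrace_modify_code_sketch(unsigned long ip, ppc_inst_t old,
                                     ppc_inst_t new_inst)
{
        ppc_inst_t replaced;

        if (copy_inst_from_kernel_nofault(&replaced, (u32 *)ip))
                return -EFAULT;

        if (!ppc_inst_equal(replaced, old))
                return -EINVAL;

        /* Propagate patch_instruction()'s own error code directly. */
        return patch_instruction((u32 *)ip, new_inst);
}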
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/49a8597230713e2633e7d9d7b56140787c4a7e20.1652074503.git.christophe.leroy@csgroup.eu
d2f47dab | 09-May-2022 | Christophe Leroy <[email protected]>
powerpc/code-patching: Inline create_branch()

create_branch() is a good candidate for inlining because:
- Flags can be folded in.
- Range tests are likely to be already done.

This reduces create_branch() to only a set of instructions, so inline it.
It improves ftrace activation by 10%.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/69851cc9a7bf8f03d025e6d29e165f2d0bd3bb6e.1652074503.git.christophe.leroy@csgroup.eu
1acbf27e | 09-May-2022 | Christophe Leroy <[email protected]>
powerpc/code-patching: Inline is_offset_in_{cond}_branch_range()

The tests in is_offset_in_branch_range() and is_offset_in_cond_branch_range() are simple tests that are worth inlining.
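The tests themselves, as they read once inlined (per the powerpc branch encodings: 26-bit and 16-bit signed, word-aligned offsets):

static inline bool is_offset_in_branch_range(long offset)
{
        return (offset >= -0x2000000 && offset <= 0x1fffffc && !(offset & 0x3));
}

static inline bool is_offset_in_cond_branch_range(long offset)
{
        return offset >= -0x8000 && offset <= 0x7fff && !(offset & 0x3);
}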
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/a05be0ccb7373e6a9789a1988fcd0c810f5f9269.1652074503.git.christophe.leroy@csgroup.eu
Revision tags: v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1

17512892 | 22-Mar-2022 | Christophe Leroy <[email protected]>
powerpc/code-patching: Use jump_label to check if poking_init() is done

It's only during early startup that poking_init() is not done yet, for instance when calling ftrace_init().
Once poking_init() has been called there must be a poking area, so there is no need to check it every time patch_instruction() is called.
ftrace activation time is reduced by 7% with the change on an 8xx.
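A sketch of the jump_label pattern (the key name follows the description above): the fast path costs a patched-in nop rather than a load and compare:

#include <linux/jump_label.h>

static DEFINE_STATIC_KEY_FALSE(poking_init_done);

/* Flipped once, at the end of poking_init(). */
void __init poking_init_sketch(void)
{
        /* ... set up the patching area ... */
        static_branch_enable(&poking_init_done);
}

static int patch_sketch(u32 *addr, ppc_inst_t instr)
{
        if (!static_branch_likely(&poking_init_done))
                return raw_patch_instruction(addr, instr);

        /* ... normal patching path ... */
        return 0;
}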
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/8d6088aca7b63247377b6d9e4897d08d935fbe93.1647962456.git.christophe.leroy@csgroup.eu
b0337678 | 22-Mar-2022 | Christophe Leroy <[email protected]>
powerpc/code-patching: Use jump_label for testing freed initmem

Once init is done, initmem is freed forever so no need to test system_state at every call to patch_instruction().

Use jump_label.
This reduces the time needed to activate ftrace on an 8xx by 2%.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/0aee964721cab7316cffde21a2ca223cee14d373.1647962456.git.christophe.leroy@csgroup.eu
cb3ac452 | 22-Mar-2022 | Christophe Leroy <[email protected]>
powerpc/code-patching: Don't call is_vmalloc_or_module_addr() without CONFIG_MODULES

If CONFIG_MODULES is not set, there is no point in checking whether text is in the module area.
This reduced the time needed to activate/deactivate ftrace by more than 10% on an 8xx.
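A sketch of the guarded check, shaped like a pfn lookup in the patching path (treat the surrounding names as assumptions):

static unsigned long patch_pfn_sketch(void *addr)
{
        /* IS_ENABLED() folds to 0 without modules, so both the test and
         * the call vanish at compile time. */
        if (IS_ENABLED(CONFIG_MODULES) && is_vmalloc_or_module_addr(addr))
                return vmalloc_to_pfn(addr);

        return __pa_symbol(addr) >> PAGE_SHIFT;
}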
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/f3c701cce00a38620788c0fc43ff0b611a268c54.1647962456.git.christophe.leroy@csgroup.eu
Revision tags: v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6

591b4b26 | 23-Feb-2022 | Michael Ellerman <[email protected]>
powerpc/code-patching: Pre-map patch area

Paul reported a warning with DEBUG_ATOMIC_SLEEP=y:
BUG: sleeping function called from invalid context at include/linux/sched/mm.h:256
in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
preempt_count: 0, expected: 0
...
Call Trace:
dump_stack_lvl+0xa0/0xec (unreliable)
__might_resched+0x2f4/0x310
kmem_cache_alloc+0x220/0x4b0
__pud_alloc+0x74/0x1d0
hash__map_kernel_page+0x2cc/0x390
do_patch_instruction+0x134/0x4a0
arch_jump_label_transform+0x64/0x78
__jump_label_update+0x148/0x180
static_key_enable_cpuslocked+0xd0/0x120
static_key_enable+0x30/0x50
check_kvm_guest+0x60/0x88
pSeries_smp_probe+0x54/0xb0
smp_prepare_cpus+0x3e0/0x430
kernel_init_freeable+0x20c/0x43c
kernel_init+0x30/0x1a0
ret_from_kernel_thread+0x5c/0x64
Peter pointed out that this is because do_patch_instruction() has disabled interrupts, but then map_patch_area() calls map_kernel_page() then hash__map_kernel_page() which does a sleeping memory allocation.
We only see the warning in KVM guests with SMT enabled, which is not particularly common, or on other platforms if CONFIG_KPROBES is disabled, also not common. The reason we don't see it in most configurations is that another path that happens to have interrupts enabled has allocated the required page tables for us, e.g. there's a path in kprobes init that does that. That's just pure luck though.
As Christophe suggested, the simplest solution is to do a dummy map/unmap when we initialise the patching, so that any required page table levels are pre-allocated before the first call to do_patch_instruction(). This works because the unmap doesn't free any page tables that were allocated by the map, it just clears the PTE, leaving the page table levels there for the next map.
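A sketch of the pre-mapping trick, with assumed helper names; the dummy map allocates any missing page-table levels while sleeping is still allowed, and the unmap only clears the PTE:

static int __init premap_patch_area_sketch(unsigned long text_poke_addr)
{
        int err;

        /* Dummy map: may sleep to allocate PUD/PMD/PTE levels. */
        err = map_patch_area(empty_zero_page, text_poke_addr);
        if (err)
                return err;

        /* Clears the PTE but frees no page tables, so later
         * do_patch_instruction() calls with IRQs off need no allocation. */
        unmap_patch_area(text_poke_addr);
        return 0;
}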
Reported-by: Paul Menzel <[email protected]>
Debugged-by: Peter Zijlstra <[email protected]>
Suggested-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Revision tags: v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4

f30a578d | 02-Dec-2021 | Christophe Leroy <[email protected]>
powerpc/code-patching: Move code patching selftests in its own file

Code patching selftests are half of code-patching.c. As they are guarded by CONFIG_CODE_PATCHING_SELFTESTS, they'd be better in their own file.

Also add a missing __init for instr_is_branch_to_addr().
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/c0c30504f04eb546a48ff77127a8bccd12a3d809.1638446239.git.christophe.leroy@csgroup.eu
31acc599 | 02-Dec-2021 | Christophe Leroy <[email protected]>
powerpc/code-patching: Move instr_is_branch_{i/b}form() in code-patching.h

To enable moving selftests to their own C file in a following patch, move instr_is_branch_iform() and instr_is_branch_bform() to code-patching.h.
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/fca0f3b191211b3681020885a611bf73eef20563.1638446239.git.christophe.leroy@csgroup.eu