History log of /linux-6.15/arch/powerpc/lib/code-patching.c (Results 1 – 25 of 87)
Revision Date Author Comments
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3
# d262a192 12-Feb-2025 Christophe Leroy <[email protected]>

powerpc/code-patching: Fix KASAN hit by not flagging text patching area as VM_ALLOC

Erhard reported the following KASAN hit while booting his PowerMac G4
with a KASAN-enabled kernel 6.13-rc6:

BUG: KASAN: vmalloc-out-of-bounds in copy_to_kernel_nofault+0xd8/0x1c8
Write of size 8 at addr f1000000 by task chronyd/1293

CPU: 0 UID: 123 PID: 1293 Comm: chronyd Tainted: G W 6.13.0-rc6-PMacG4 #2
Tainted: [W]=WARN
Hardware name: PowerMac3,6 7455 0x80010303 PowerMac
Call Trace:
[c2437590] [c1631a84] dump_stack_lvl+0x70/0x8c (unreliable)
[c24375b0] [c0504998] print_report+0xdc/0x504
[c2437610] [c050475c] kasan_report+0xf8/0x108
[c2437690] [c0505a3c] kasan_check_range+0x24/0x18c
[c24376a0] [c03fb5e4] copy_to_kernel_nofault+0xd8/0x1c8
[c24376c0] [c004c014] patch_instructions+0x15c/0x16c
[c2437710] [c00731a8] bpf_arch_text_copy+0x60/0x7c
[c2437730] [c0281168] bpf_jit_binary_pack_finalize+0x50/0xac
[c2437750] [c0073cf4] bpf_int_jit_compile+0xb30/0xdec
[c2437880] [c0280394] bpf_prog_select_runtime+0x15c/0x478
[c24378d0] [c1263428] bpf_prepare_filter+0xbf8/0xc14
[c2437990] [c12677ec] bpf_prog_create_from_user+0x258/0x2b4
[c24379d0] [c027111c] do_seccomp+0x3dc/0x1890
[c2437ac0] [c001d8e0] system_call_exception+0x2dc/0x420
[c2437f30] [c00281ac] ret_from_syscall+0x0/0x2c
--- interrupt: c00 at 0x5a1274
NIP: 005a1274 LR: 006a3b3c CTR: 005296c8
REGS: c2437f40 TRAP: 0c00 Tainted: G W (6.13.0-rc6-PMacG4)
MSR: 0200f932 <VEC,EE,PR,FP,ME,IR,DR,RI> CR: 24004422 XER: 00000000

GPR00: 00000166 af8f3fa0 a7ee3540 00000001 00000000 013b6500 005a5858 0200f932
GPR08: 00000000 00001fe9 013d5fc8 005296c8 2822244c 00b2fcd8 00000000 af8f4b57
GPR16: 00000000 00000001 00000000 00000000 00000000 00000001 00000000 00000002
GPR24: 00afdbb0 00000000 00000000 00000000 006e0004 013ce060 006e7c1c 00000001
NIP [005a1274] 0x5a1274
LR [006a3b3c] 0x6a3b3c
--- interrupt: c00

The buggy address belongs to the virtual mapping at
[f1000000, f1002000) created by:
text_area_cpu_up+0x20/0x190

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:00000000 index:0x0 pfn:0x76e30
flags: 0x80000000(zone=2)
raw: 80000000 00000000 00000122 00000000 00000000 00000000 ffffffff 00000001
raw: 00000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
f0ffff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0ffff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>f1000000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
^
f1000080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
f1000100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
==================================================================

f8 corresponds to KASAN_VMALLOC_INVALID, which means the area is not
initialised and hence not supposed to be used yet.

The powerpc text patching infrastructure allocates a virtual memory area
using get_vm_area() and flags it as VM_ALLOC. But that flag is meant
for vmalloc(), and vmalloc()-allocated memory is not supposed to be
used before a call to __vmalloc_node_range(), which is never made for
that area.

That went undetected until commit e4137f08816b ("mm, kasan, kmsan:
instrument copy_from/to_kernel_nofault").

The area allocated by text_area_cpu_up() is not vmalloc memory; it is
mapped directly on demand when needed by map_kernel_page(). There is
no VM flag corresponding to such usage, so just pass no flag. That way
the area will be unpoisoned and usable immediately.
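
For reference, a minimal sketch of what the allocation then looks like,
assuming the current shape of text_area_cpu_up() (error handling trimmed):

  struct vm_struct *area;

  /*
   * Was: get_vm_area(PAGE_SIZE, VM_ALLOC). VM_ALLOC tells KASAN the range
   * is vmalloc memory, which stays poisoned until __vmalloc_node_range()
   * unpoisons it; that call never happens for this area, so pass no flag.
   */
  area = get_vm_area(PAGE_SIZE, 0);
  if (!area)
          return -ENOMEM;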

Reported-by: Erhard Furtner <[email protected]>
Closes: https://lore.kernel.org/all/20250112135832.57c92322@yea/
Fixes: 37bc3e5fd764 ("powerpc/lib/code-patching: Use alternate map for patch_instruction()")
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Madhavan Srinivasan <[email protected]>
Link: https://patch.msgid.link/06621423da339b374f48c0886e3a5db18e896be8.1739342693.git.christophe.leroy@csgroup.eu



Revision tags: v6.14-rc2
# dc9c5166 03-Feb-2025 Christophe Leroy <[email protected]>

powerpc/code-patching: Disable KASAN report during patching via temporary mm

Erhard reports the following KASAN hit on Talos II (power9) with kernel 6.13:

[ 12.028126] ==================================================================
[ 12.028198] BUG: KASAN: user-memory-access in copy_to_kernel_nofault+0x8c/0x1a0
[ 12.028260] Write of size 8 at addr 0000187e458f2000 by task systemd/1

[ 12.028346] CPU: 87 UID: 0 PID: 1 Comm: systemd Tainted: G T 6.13.0-P9-dirty #3
[ 12.028408] Tainted: [T]=RANDSTRUCT
[ 12.028446] Hardware name: T2P9D01 REV 1.01 POWER9 0x4e1202 opal:skiboot-bc106a0 PowerNV
[ 12.028500] Call Trace:
[ 12.028536] [c000000008dbf3b0] [c000000001656a48] dump_stack_lvl+0xbc/0x110 (unreliable)
[ 12.028609] [c000000008dbf3f0] [c0000000006e2fc8] print_report+0x6b0/0x708
[ 12.028666] [c000000008dbf4e0] [c0000000006e2454] kasan_report+0x164/0x300
[ 12.028725] [c000000008dbf600] [c0000000006e54d4] kasan_check_range+0x314/0x370
[ 12.028784] [c000000008dbf640] [c0000000006e6310] __kasan_check_write+0x20/0x40
[ 12.028842] [c000000008dbf660] [c000000000578e8c] copy_to_kernel_nofault+0x8c/0x1a0
[ 12.028902] [c000000008dbf6a0] [c0000000000acfe4] __patch_instructions+0x194/0x210
[ 12.028965] [c000000008dbf6e0] [c0000000000ade80] patch_instructions+0x150/0x590
[ 12.029026] [c000000008dbf7c0] [c0000000001159bc] bpf_arch_text_copy+0x6c/0xe0
[ 12.029085] [c000000008dbf800] [c000000000424250] bpf_jit_binary_pack_finalize+0x40/0xc0
[ 12.029147] [c000000008dbf830] [c000000000115dec] bpf_int_jit_compile+0x3bc/0x930
[ 12.029206] [c000000008dbf990] [c000000000423720] bpf_prog_select_runtime+0x1f0/0x280
[ 12.029266] [c000000008dbfa00] [c000000000434b18] bpf_prog_load+0xbb8/0x1370
[ 12.029324] [c000000008dbfb70] [c000000000436ebc] __sys_bpf+0x5ac/0x2e00
[ 12.029379] [c000000008dbfd00] [c00000000043a228] sys_bpf+0x28/0x40
[ 12.029435] [c000000008dbfd20] [c000000000038eb4] system_call_exception+0x334/0x610
[ 12.029497] [c000000008dbfe50] [c00000000000c270] system_call_vectored_common+0xf0/0x280
[ 12.029561] --- interrupt: 3000 at 0x3fff82f5cfa8
[ 12.029608] NIP: 00003fff82f5cfa8 LR: 00003fff82f5cfa8 CTR: 0000000000000000
[ 12.029660] REGS: c000000008dbfe80 TRAP: 3000 Tainted: G T (6.13.0-P9-dirty)
[ 12.029735] MSR: 900000000280f032 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI> CR: 42004848 XER: 00000000
[ 12.029855] IRQMASK: 0
GPR00: 0000000000000169 00003fffdcf789a0 00003fff83067100 0000000000000005
GPR04: 00003fffdcf78a98 0000000000000090 0000000000000000 0000000000000008
GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000000000 00003fff836ff7e0 c000000000010678 0000000000000000
GPR16: 0000000000000000 0000000000000000 00003fffdcf78f28 00003fffdcf78f90
GPR20: 0000000000000000 0000000000000000 0000000000000000 00003fffdcf78f80
GPR24: 00003fffdcf78f70 00003fffdcf78d10 00003fff835c7239 00003fffdcf78bd8
GPR28: 00003fffdcf78a98 0000000000000000 0000000000000000 000000011f547580
[ 12.030316] NIP [00003fff82f5cfa8] 0x3fff82f5cfa8
[ 12.030361] LR [00003fff82f5cfa8] 0x3fff82f5cfa8
[ 12.030405] --- interrupt: 3000
[ 12.030444] ==================================================================

Commit c28c15b6d28a ("powerpc/code-patching: Use temporary mm for
Radix MMU") is inspired by x86, but unlike x86 it doesn't disable
KASAN reports during patching. This wasn't a problem at the beginning
because __patch_mem() is not instrumented.

Commit 465cabc97b42 ("powerpc/code-patching: introduce
patch_instructions()") uses copy_to_kernel_nofault() to copy several
instructions at once. But when using the temporary mm the destination is
not regular kernel memory but a kind of kernel-like memory located
in user address space. Because it is not in kernel address space it is
not covered by KASAN shadow memory. Since commit e4137f08816b ("mm,
kasan, kmsan: instrument copy_from/to_kernel_nofault") KASAN reports
bad accesses from copy_to_kernel_nofault(). Here a bad access to user
memory is reported because KASAN detects the lack of shadow memory and
the address is below TASK_SIZE.

Do as x86 does in commit b3fd8e83ada0 ("x86/alternatives: Use temporary
mm for text poking") and disable KASAN reports during patching when
using the temporary mm.
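
A minimal sketch of that approach, assuming the same
kasan_disable_current()/kasan_enable_current() helpers the x86 commit
uses (patch_addr, code and len are placeholders for the patching
window, the source buffer and its length):

  kasan_disable_current();        /* no reports for the unshadowed alias */
  err = copy_to_kernel_nofault(patch_addr, code, len);
  kasan_enable_current();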

Reported-by: Erhard Furtner <[email protected]>
Closes: https://lore.kernel.org/all/20250201151435.48400261@yea/
Fixes: 465cabc97b42 ("powerpc/code-patching: introduce patch_instructions()")
Signed-off-by: Christophe Leroy <[email protected]>
Acked-by: Michael Ellerman <[email protected]>
Signed-off-by: Madhavan Srinivasan <[email protected]>
Link: https://patch.msgid.link/1c05b2a1b02ad75b981cfc45927e0b4a90441046.1738577687.git.christophe.leroy@csgroup.eu



Revision tags: v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5
# 0c3beacf 23-Oct-2024 Mike Rapoport (Microsoft) <[email protected]>

asm-generic: introduce text-patching.h

Several architectures support text patching, but they name the header
files that declare patching functions differently.

Make all such headers consistently named text-patching.h and add an empty
header in asm-generic for architectures that do not support text patching.
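
On powerpc, for instance, this boils down to an include rename in files
such as code-patching.c:

  /* Before (architecture-specific name): */
  #include <asm/code-patching.h>

  /* After (consistent across architectures): */
  #include <asm/text-patching.h>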

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Mike Rapoport (Microsoft) <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]> # m68k
Acked-by: Arnd Bergmann <[email protected]>
Reviewed-by: Luis Chamberlain <[email protected]>
Tested-by: kdevops <[email protected]>
Cc: Andreas Larsson <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Borislav Petkov (AMD) <[email protected]>
Cc: Brian Cain <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dinh Nguyen <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Johannes Berg <[email protected]>
Cc: John Paul Adrian Glaubitz <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu (Google) <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Russell King <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Steven Rostedt (Google) <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>



Revision tags: v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1
# dbf828aa 15-May-2024 Benjamin Gray <[email protected]>

powerpc/code-patching: Add data patch alignment check

The new data patching still needs to be aligned within a
cacheline too for the flushes to work correctly. To simplify
this requirement, we just say data patches must be aligned.

Detect when data patching is not aligned, returning an invalid
argument error.
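
An illustrative form of the check (the variable names are placeholders,
not the exact kernel code):

  /* Data patches must be naturally aligned so the cacheline flush works. */
  if (!IS_ALIGNED((unsigned long)addr, size))
          return -EINVAL;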

Signed-off-by: Benjamin Gray <[email protected]>
Reviewed-by: Hari Bathini <[email protected]>
Acked-by: Naveen N Rao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]



# e6b8940e 15-May-2024 Benjamin Gray <[email protected]>

powerpc/code-patching: Add generic memory patching

patch_instruction() is designed for patching instructions in otherwise
readonly memory. Other consumers also sometimes need to patch readonly
memory, so they have abused patch_instruction() for arbitrary data patches.

This is a problem on ppc64 as patch_instruction() decides on the patch
width using the 'instruction' opcode to see if it's a prefixed
instruction. Data that triggers this can lead to larger writes, possibly
crossing a page boundary and failing the write altogether.

Introduce patch_uint() and patch_ulong(), with aliases patch_u32() and
patch_u64() (on ppc64), designed for aligned data patches. The patch
size is now determined by the called function, and is passed as an
additional parameter to the generic internals.
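
A hedged usage sketch, with the prototypes assumed from the description
above rather than copied from the header:

  #include <asm/text-patching.h>

  /* Hypothetical caller: update one word of otherwise read-only data. */
  static int example_set_ro_flag(u32 *ro_flag)
  {
          /* Aligned 4-byte data patch; patch_u64() is the 8-byte variant. */
          return patch_u32(ro_flag, 1);
  }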

While the instruction flushing is not required for data patches, it
remains unconditional in this patch. A followup series is possible if
benchmarking shows fewer flushes gives an improvement in some
data-patching workload.

ppc32 does not support prefixed instructions, so is unaffected by the
original issue. Care is taken in not exposing the size parameter in the
public (non-static) interface, so the compiler can const-propagate it
away.

Signed-off-by: Benjamin Gray <[email protected]>
Reviewed-by: Hari Bathini <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]



Revision tags: v6.9, v6.9-rc7
# 0a956d52 05-May-2024 Mike Rapoport (IBM) <[email protected]>

powerpc: use CONFIG_EXECMEM instead of CONFIG_MODULES where appropriate

There are places where CONFIG_MODULES guards the code that depends on
memory allocation being done with module_alloc().

Replace CONFIG_MODULES with CONFIG_EXECMEM in such places.

Signed-off-by: Mike Rapoport (IBM) <[email protected]>
Signed-off-by: Luis Chamberlain <[email protected]>



Revision tags: v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2
# c3710ee7 25-Mar-2024 Benjamin Gray <[email protected]>

powerpc/code-patching: Use dedicated memory routines for patching

The patching page set up as a writable alias may be in quadrant 0
(userspace) if the temporary mm path is used. This causes sanitiser
failures in that case. Sanitiser failures also occur on the non-mm path
because the plain memset family is instrumented, and KASAN treats the
patching window as poisoned.

Introduce locally defined patch_* variants of memset that perform an
uninstrumented lower level set, as well as detecting write errors like
the original single patch variant does.

copy_to_user() is not correct here, as the PTE makes it a proper kernel
page (the EAA is privileged access only, RW). It just happens to be in
quadrant 0 because that's the hardware's mechanism for using the current
PID vs PID 0 in translations. Importantly, it's incorrect to allow user
page accesses.

Now that the patching memsets are used, we also propagate a failure up
to the caller as the single patch variant does.
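
A sketch of what one such routine can look like, built on the generic
__put_kernel_nofault() store, which KASAN does not instrument (the
function name is illustrative, not necessarily the one added here):

  static int patch_memset32(u32 *addr, u32 val, size_t count)
  {
          u32 *end = addr + count;

          /* Raw, uninstrumented stores; a faulting store jumps to the label. */
          for (; addr < end; addr++)
                  __put_kernel_nofault(addr, &val, u32, failed);

          return 0;

  failed:
          return -EPERM;
  }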

Signed-off-by: Benjamin Gray <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]



Revision tags: v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7
# 465cabc9 20-Oct-2023 Hari Bathini <[email protected]>

powerpc/code-patching: introduce patch_instructions()

patch_instruction() entails setting up the pte, patching the instruction,
clearing the pte and flushing the tlb. If multiple instructions need
to be patched, every instruction would have to go through the above
drill unnecessarily. Instead, introduce a patch_instructions() function
that sets up the pte, clears the pte and flushes the tlb only once
per page range of instructions to be patched. Duplicate most of the
patch_instruction() code instead of merging with it, to avoid the
performance degradation observed on ppc32 for patch_instruction()
when the code paths were merged. Also, always set up poking_init(), as
BPF expects poking_init() to be set up even when STRICT_KERNEL_RWX is off.
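
The assumed interface of the new batch primitive (check the header for
the exact prototype; dst, src and len are placeholders):

  /*
   * Copy len bytes of already-encoded instructions with a single PTE
   * setup/teardown and TLB flush per page range; the last argument
   * selects whether a single instruction is repeated across the range.
   */
  err = patch_instructions(dst, src, len, false);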

Signed-off-by: Hari Bathini <[email protected]>
Acked-by: Song Liu <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]



Revision tags: v6.6-rc6, v6.6-rc5
# 74726fda 07-Oct-2023 Christophe Leroy <[email protected]>

powerpc/code-patching: Perform hwsync in __patch_instruction() in case of failure

Commit c28c15b6d28a ("powerpc/code-patching: Use temporary mm for
Radix MMU") added a hwsync for when __patch_instruction() fails,
which results in a quite odd unbalanced logic.

Instead of calling mb() when __patch_instruction() returns an error,
call mb() in __patch_instruction()'s error path directly.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/e88b154eaf2efd9ff177d472d3411dcdec8ff4f5.1696675567.git.christophe.leroy@csgroup.eu



Revision tags: v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1
# 980411a4 16-Dec-2022 Michael Ellerman <[email protected]>

powerpc/code-patching: Fix oops with DEBUG_VM enabled

Nathan reported that the new per-cpu mm patching oopses if DEBUG_VM is
enabled:

------------[ cut here ]------------
kernel BUG at arch/powerpc/mm/pgtable.c:333!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc2+ #1
Hardware name: IBM PowerNV (emulated by qemu) POWER9 0x4e1200 opal:v7.0 PowerNV
...
NIP assert_pte_locked+0x180/0x1a0
LR assert_pte_locked+0x170/0x1a0
Call Trace:
0x60000000 (unreliable)
patch_instruction+0x618/0x6d0
arch_prepare_kprobe+0xfc/0x2d0
register_kprobe+0x520/0x7c0
arch_init_kprobes+0x28/0x3c
init_kprobes+0x108/0x184
do_one_initcall+0x60/0x2e0
kernel_init_freeable+0x1f0/0x3e0
kernel_init+0x34/0x1d0
ret_from_kernel_thread+0x5c/0x64

It's caused by the assert_spin_locked() failing in assert_pte_locked().
The assert fails because the PTE was unlocked in text_area_cpu_up_mm(),
and never relocked.

The PTE page shouldn't be freed, the patching_mm is only used for
patching on this CPU, only that single PTE is ever mapped, and it's only
unmapped at CPU offline.

In fact assert_pte_locked() has a special case to ignore init_mm
entirely, and the patching_mm is more-or-less like init_mm, so possibly
the check could be skipped for patching_mm too.

But for now be conservative, and use the proper PTE accessors at
patching time, so that the PTE lock is held while the PTE is used. That
also avoids the warning in assert_pte_locked().

With that it's no longer necessary to save the PTE in
cpu_patching_context for the mm_patch_enabled() case.
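
A sketch of the locked-PTE pattern referred to above, assuming the
generic get_locked_pte()/pte_unmap_unlock() helpers and the
patching_mm/patching_addr naming of the earlier commits:

  spinlock_t *ptl;
  pte_t *ptep;

  ptep = get_locked_pte(patching_mm, patching_addr, &ptl);
  if (!ptep)
          return -ENOMEM;
  /* ... install the writable alias, patch, then clear the PTE ... */
  pte_unmap_unlock(ptep, ptl);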

Fixes: c28c15b6d28a ("powerpc/code-patching: Use temporary mm for Radix MMU")
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]



Revision tags: v6.1, v6.1-rc8
# 6f3a81b6 02-Dec-2022 Christophe Leroy <[email protected]>

powerpc/code-patching: Remove protection against patching init addresses after init

Once the init section is freed, attempting to patch init code
ends up in the weeds.

Commit 51c3c62b58b3 ("powerpc: Avoid code patching freed init sections")
protected patch_instruction() against that, but it is the responsibility
of the caller to ensure that the patched memory is valid.

All callers have now been verified and fixed so the check
can be removed.

This improves ftrace activation by about 2% on 8xx.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/504310828f473d424e2ed229eff57bf075f52796.1669969781.git.christophe.leroy@csgroup.eu



# 84ecfe6f 02-Dec-2022 Christophe Leroy <[email protected]>

powerpc/code-patching: Remove #ifdef CONFIG_STRICT_KERNEL_RWX

No need to have one implementation of patch_instruction() for
CONFIG_STRICT_KERNEL_RWX and one for !CONFIG_STRICT_KERNEL_RWX.

In patch_instruction(), call raw_patch_instruction() when
!CONFIG_STRICT_KERNEL_RWX.

In poking_init(), bail out immediately; it will be equivalent
to the weak default implementation.

Everything else is declared static and will be discarded by
GCC when !CONFIG_STRICT_KERNEL_RWX.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/f67d2a109404d03e8fdf1ea15388c8778337a76b.1669969781.git.christophe.leroy@csgroup.eu



Revision tags: v6.1-rc7, v6.1-rc6, v6.1-rc5
# 2f228ee1 09-Nov-2022 Benjamin Gray <[email protected]>

powerpc/code-patching: Consolidate and cache per-cpu patching context

With the temp mm context support, there are CPU local variables to hold
the patch address and pte. Use these in the non-temp mm path as well
instead of adding a level of indirection through the text_poke_area
vm_struct and pointer chasing the pte.

As both paths use these fields now, there is no need to let unreferenced
variables be dropped by the compiler, so it is cleaner to merge them
into a single context struct. This has the additional benefit of
removing a redundant CPU local pointer, as only one of cpu_patching_mm /
text_poke_area is ever used, while remaining well-typed. It also groups
each CPU's data into a single cacheline.

Signed-off-by: Benjamin Gray <[email protected]>
[mpe: Shorten name to 'area' as suggested by Christophe]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]



# c28c15b6 09-Nov-2022 Christopher M. Riedl <[email protected]>

powerpc/code-patching: Use temporary mm for Radix MMU

x86 supports the notion of a temporary mm which restricts access to
temporary PTEs to a single CPU. A temporary mm is useful for situations
where a CPU needs to perform sensitive operations (such as patching a
STRICT_KERNEL_RWX kernel) requiring temporary mappings without exposing
said mappings to other CPUs. Another benefit is that other CPU TLBs do
not need to be flushed when the temporary mm is torn down.

Mappings in the temporary mm can be set in the userspace portion of the
address-space.

Interrupts must be disabled while the temporary mm is in use. HW
breakpoints, which may have been set by userspace as watchpoints on
addresses now within the temporary mm, are saved and disabled when
loading the temporary mm. The HW breakpoints are restored when unloading
the temporary mm. All HW breakpoints are indiscriminately disabled while
the temporary mm is in use - this may include breakpoints set by perf.

Use the `poking_init` init hook to prepare a temporary mm and patching
address. Initialize the temporary mm using mm_alloc(). Choose a
randomized patching address inside the temporary mm userspace address
space. The patching address is randomized between PAGE_SIZE and
DEFAULT_MAP_WINDOW-PAGE_SIZE.

Bits of entropy with 64K page size on BOOK3S_64:

bits of entropy = log2(DEFAULT_MAP_WINDOW_USER64 / PAGE_SIZE)

PAGE_SIZE=64K, DEFAULT_MAP_WINDOW_USER64=128TB
bits of entropy = log2(128TB / 64K)
bits of entropy = 31

The upper limit is DEFAULT_MAP_WINDOW due to how the Book3s64 Hash MMU
operates - by default the space above DEFAULT_MAP_WINDOW is not
available. Currently the Hash MMU does not use a temporary mm so
technically this upper limit isn't necessary; however, a larger
randomization range does not further "harden" this overall approach and
future work may introduce patching with a temporary mm on Hash as well.

Randomization occurs only once during initialization for each CPU as it
comes online.
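
A sketch of an address choice meeting those constraints (the exact
expression used when the CPU comes up may differ):

  /* Page-aligned address in [PAGE_SIZE, DEFAULT_MAP_WINDOW - PAGE_SIZE). */
  patching_addr = PAGE_SIZE +
          (get_random_long() % (DEFAULT_MAP_WINDOW - 2 * PAGE_SIZE));
  patching_addr &= PAGE_MASK;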

The patching page is mapped with PAGE_KERNEL to set EAA[0] for the PTE
which ignores the AMR (so no need to unlock/lock KUAP) according to
PowerISA v3.0b Figure 35 on Radix.

Based on x86 implementation:

commit 4fc19708b165
("x86/alternatives: Initialize temporary mm for patching")

and:

commit b3fd8e83ada0
("x86/alternatives: Use temporary mm for text poking")

From: Benjamin Gray <[email protected]>

Synchronisation is done according to ISA 3.1B Book 3 Chapter 13
"Synchronization Requirements for Context Alterations". Switching the mm
is a change to the PID, which requires a CSI before and after the change,
and a hwsync between the last instruction that performs address
translation for an associated storage access.

Instruction fetch is an associated storage access, but the instruction
address mappings are not being changed, so it should not matter which
context they use. We must still perform a hwsync to guard arbitrary
prior code that may have accessed a userspace address.

TLB invalidation is local and VA specific. Local because only this core
used the patching mm, and VA specific because we only care that the
writable mapping is purged. Leaving the other mappings intact is more
efficient, especially when performing many code patches in a row (e.g.,
as ftrace would).

Signed-off-by: Christopher M. Riedl <[email protected]>
Signed-off-by: Benjamin Gray <[email protected]>
[mpe: Use mm_alloc() per 107b6828a7cd ("x86/mm: Use mm_alloc() in poking_init()")]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]



# 071c95c1 09-Nov-2022 Benjamin Gray <[email protected]>

powerpc/code-patching: Use WARN_ON and fix check in poking_init

BUG_ON() when failing to initialise the code patching window is
unnecessary, and use of BUG_ON is discouraged. We don't set
poking_init_done in this case, so failure to init the boot CPU will
result in a strict RWX error when a following patch_instruction uses
raw_patch_instruction. If it only fails for later CPUs, they won't be
onlined in the first place.

The return value of cpuhp_setup_state() is also >= 0 on success,
so check for < 0.
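
A sketch of the corrected setup (the cpuhp state name and the down
callback are assumptions; cpuhp_setup_state() with CPUHP_AP_ONLINE_DYN
returns the dynamically allocated state number, hence >= 0, on success):

  void __init poking_init(void)
  {
          int ret;

          ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
                                  "powerpc/text_poke:online",
                                  text_area_cpu_up, text_area_cpu_down);
          /* Only a negative value is an error; WARN instead of BUG. */
          if (WARN_ON(ret < 0))
                  return;

          static_branch_enable(&poking_init_done);
  }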

Signed-off-by: Benjamin Gray <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]



Revision tags: v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2
# 8b4bb0ad 15-Aug-2022 Christophe Leroy <[email protected]>

powerpc/code-patching: Speed up page mapping/unmapping

Since commit 591b4b268435 ("powerpc/code-patching: Pre-map patch area")
the patch area is premapped so intermediate page tables are already
allocated.

Use __set_pte_at() directly instead of the heavy map_kernel_page(),
and for unmapping just do a pte_clear() followed by a flush.

__set_pte_at() can be used directly without the filters in
set_pte_at() because we are mapping a normal, non-executable page.

Make sure gcc knows text_poke_area is page aligned in order to
optimise the flush.

This change reduces by 66% the time needed to activate ftrace on
an 8xx (588000 tb ticks instead of 1744000).
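
A simplified sketch of the resulting map/unmap pair (function names are
illustrative; the ptesync mentioned in the note below is omitted):

  static void patch_area_map(pte_t *ptep, unsigned long addr, struct page *page)
  {
          /* Intermediate page tables already exist: the area is pre-mapped. */
          __set_pte_at(&init_mm, addr, ptep, mk_pte(page, PAGE_KERNEL), 0);
  }

  static void patch_area_unmap(pte_t *ptep, unsigned long addr)
  {
          pte_clear(&init_mm, addr, ptep);
          flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
  }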

Signed-off-by: Christophe Leroy <[email protected]>
[mpe: Add ptesync needed on radix to avoid spurious fault]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]



Revision tags: v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7
# bbffdd2f 09-May-2022 Christophe Leroy <[email protected]>

powerpc/ftrace: Use patch_instruction() return directly

Instead of returning -EPERM when patch_instruction() fails,
just return what patch_instruction returns.

That simplifies ftrace_modify_code():

0: 94 21 ff c0 stwu r1,-64(r1)
4: 93 e1 00 3c stw r31,60(r1)
8: 7c 7f 1b 79 mr. r31,r3
c: 40 80 00 30 bge 3c <ftrace_modify_code+0x3c>
10: 93 c1 00 38 stw r30,56(r1)
14: 7c 9e 23 78 mr r30,r4
18: 7c a4 2b 78 mr r4,r5
1c: 80 bf 00 00 lwz r5,0(r31)
20: 7c 1e 28 40 cmplw r30,r5
24: 40 82 00 34 bne 58 <ftrace_modify_code+0x58>
28: 83 c1 00 38 lwz r30,56(r1)
2c: 7f e3 fb 78 mr r3,r31
30: 83 e1 00 3c lwz r31,60(r1)
34: 38 21 00 40 addi r1,r1,64
38: 48 00 00 00 b 38 <ftrace_modify_code+0x38>
38: R_PPC_REL24 patch_instruction

Before:

0: 94 21 ff c0 stwu r1,-64(r1)
4: 93 e1 00 3c stw r31,60(r1)
8: 7c 7f 1b 79 mr. r31,r3
c: 40 80 00 4c bge 58 <ftrace_modify_code+0x58>
10: 93 c1 00 38 stw r30,56(r1)
14: 7c 9e 23 78 mr r30,r4
18: 7c a4 2b 78 mr r4,r5
1c: 80 bf 00 00 lwz r5,0(r31)
20: 7c 08 02 a6 mflr r0
24: 90 01 00 44 stw r0,68(r1)
28: 7c 1e 28 40 cmplw r30,r5
2c: 40 82 00 48 bne 74 <ftrace_modify_code+0x74>
30: 7f e3 fb 78 mr r3,r31
34: 48 00 00 01 bl 34 <ftrace_modify_code+0x34>
34: R_PPC_REL24 patch_instruction
38: 80 01 00 44 lwz r0,68(r1)
3c: 20 63 00 00 subfic r3,r3,0
40: 83 c1 00 38 lwz r30,56(r1)
44: 7c 63 19 10 subfe r3,r3,r3
48: 7c 08 03 a6 mtlr r0
4c: 83 e1 00 3c lwz r31,60(r1)
50: 38 21 00 40 addi r1,r1,64
54: 4e 80 00 20 blr

It improves ftrace activation/deactivation duration by about 3%.

Modify patch_instruction() return on failure to -EPERM in order to
match with ftrace expectations. Other users of patch_instruction()
do not care about the exact error value returned.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/49a8597230713e2633e7d9d7b56140787c4a7e20.1652074503.git.christophe.leroy@csgroup.eu



# d2f47dab 09-May-2022 Christophe Leroy <[email protected]>

powerpc/code-patching: Inline create_branch()

create_branch() is a good candidate for inlining because:
- Flags can be folded in.
- Range tests are likely to be already done.

This reduces create_branch() to only a set of instructions.

So inline it.

It improves ftrace activation by 10%.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/69851cc9a7bf8f03d025e6d29e165f2d0bd3bb6e.1652074503.git.christophe.leroy@csgroup.eu



# 1acbf27e 09-May-2022 Christophe Leroy <[email protected]>

powerpc/code-patching: Inline is_offset_in_{cond}_branch_range()

The tests in is_offset_in_branch_range() and is_offset_in_cond_branch_range()
are simple and worth inlining.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/a05be0ccb7373e6a9789a1988fcd0c810f5f9269.1652074503.git.christophe.leroy@csgroup.eu



Revision tags: v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1
# 17512892 22-Mar-2022 Christophe Leroy <[email protected]>

powerpc/code-patching: Use jump_label to check if poking_init() is done

It's only during early startup that poking_init() is not done yet,
for instance when calling ftrace_init().

Once poking_init() has been called there must be a poking area, so there
is no need to check it every time patch_instruction() is called.

ftrace activation time is reduced by 7% with the change on an 8xx.
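
The shape of the resulting fast path, with names assumed from the
surrounding commits rather than copied from the kernel source:

  static DEFINE_STATIC_KEY_FALSE(poking_init_done);

  int patch_instruction(u32 *addr, ppc_inst_t instr)
  {
          /* Only during early boot, before poking_init(), take the raw path. */
          if (!static_branch_likely(&poking_init_done))
                  return raw_patch_instruction(addr, instr);

          return do_patch_instruction(addr, instr);
  }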

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/8d6088aca7b63247377b6d9e4897d08d935fbe93.1647962456.git.christophe.leroy@csgroup.eu



# b0337678 22-Mar-2022 Christophe Leroy <[email protected]>

powerpc/code-patching: Use jump_label for testing freed initmem

Once init is done, initmem is freed forever, so there is no need to
test system_state at every call to patch_instruction().

Use jump_label.

This reduces by 2% the time needed to activate ftrace on an 8xx.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/0aee964721cab7316cffde21a2ca223cee14d373.1647962456.git.christophe.leroy@csgroup.eu



# cb3ac452 22-Mar-2022 Christophe Leroy <[email protected]>

powerpc/code-patching: Don't call is_vmalloc_or_module_addr() without CONFIG_MODULES

If CONFIG_MODULES is not set, there is no point in checking
whether text is in the module area.

This reduces the time needed to activate/deactivate ftrace
by more than 10% on an 8xx.

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/f3c701cce00a38620788c0fc43ff0b611a268c54.1647962456.git.christophe.leroy@csgroup.eu



Revision tags: v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6
# 591b4b26 23-Feb-2022 Michael Ellerman <[email protected]>

powerpc/code-patching: Pre-map patch area

Paul reported a warning with DEBUG_ATOMIC_SLEEP=y:

BUG: sleeping function called from invalid context at include/linux/sched/mm.h:256
in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
preempt_count: 0, expected: 0
...
Call Trace:
dump_stack_lvl+0xa0/0xec (unreliable)
__might_resched+0x2f4/0x310
kmem_cache_alloc+0x220/0x4b0
__pud_alloc+0x74/0x1d0
hash__map_kernel_page+0x2cc/0x390
do_patch_instruction+0x134/0x4a0
arch_jump_label_transform+0x64/0x78
__jump_label_update+0x148/0x180
static_key_enable_cpuslocked+0xd0/0x120
static_key_enable+0x30/0x50
check_kvm_guest+0x60/0x88
pSeries_smp_probe+0x54/0xb0
smp_prepare_cpus+0x3e0/0x430
kernel_init_freeable+0x20c/0x43c
kernel_init+0x30/0x1a0
ret_from_kernel_thread+0x5c/0x64

Peter pointed out that this is because do_patch_instruction() has
disabled interrupts, but then map_patch_area() calls map_kernel_page()
then hash__map_kernel_page() which does a sleeping memory allocation.

We only see the warning in KVM guests with SMT enabled, which is not
particularly common, or on other platforms if CONFIG_KPROBES is
disabled, also not common. The reason we don't see it in most
configurations is that another path that happens to have interrupts
enabled has allocated the required page tables for us, eg. there's a
path in kprobes init that does that. That's just pure luck though.

As Christophe suggested, the simplest solution is to do a dummy
map/unmap when we initialise the patching, so that any required page
table levels are pre-allocated before the first call to
do_patch_instruction(). This works because the unmap doesn't free any
page tables that were allocated by the map; it just clears the PTE,
leaving the page table levels there for the next map.
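
A sketch of that init-time dummy map/unmap, reusing the existing
map_patch_area()/unmap_patch_area() helpers (the choice of
empty_zero_page as the dummy target is an assumption):

  unsigned long addr = (unsigned long)area->addr;
  int err;

  /* Allocate any missing page-table levels while sleeping is still allowed. */
  err = map_patch_area(empty_zero_page, addr);
  if (err)
          return err;
  /* Clears only the PTE; the upper levels stay in place for later maps. */
  unmap_patch_area(addr);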

Reported-by: Paul Menzel <[email protected]>
Debugged-by: Peter Zijlstra <[email protected]>
Suggested-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]



Revision tags: v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4
# f30a578d 02-Dec-2021 Christophe Leroy <[email protected]>

powerpc/code-patching: Move code patching selftests in its own file

Code patching selftests are half of code-patching.c.
As they are guarded by CONFIG_CODE_PATCHING_SELFTESTS,
they'd be better in their own file.

Also add a missing __init for instr_is_branch_to_addr()

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/c0c30504f04eb546a48ff77127a8bccd12a3d809.1638446239.git.christophe.leroy@csgroup.eu



# 31acc599 02-Dec-2021 Christophe Leroy <[email protected]>

powerpc/code-patching: Move instr_is_branch_{i/b}form() in code-patching.h

To enable moving selftests in their own C file in following patch,
move instr_is_branch_iform() and instr_is_branch_bform()
to code-patching.h

Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/fca0f3b191211b3681020885a611bf73eef20563.1638446239.git.christophe.leroy@csgroup.eu


