|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4 |
|
| #
491db21d |
| 17-Oct-2024 |
Suzuki K Poulose <[email protected]> |
efi: arm64: Map Device with Prot Shared
Device mappings need to be emulated by the VMM so must be mapped shared with the host.
Reviewed-by: Gavin Shan <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Signed-off-by: Suzuki K Poulose <[email protected]> Signed-off-by: Steven Price <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
|
|
Revision tags: v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4 |
|
| #
46e27b99 |
| 13-Jun-2024 |
Waiman Long <[email protected]> |
efi/arm64: Fix kmemleak false positive in arm64_efi_rt_init()
The kmemleak code sometimes complains about the following leak:
unreferenced object 0xffff8000102e0000 (size 32768):
  comm "swapper/0", pid 1, jiffies 4294937323 (age 71.240s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<00000000db9a88a3>] __vmalloc_node_range+0x324/0x450
    [<00000000ff8903a4>] __vmalloc_node+0x90/0xd0
    [<000000001a06634f>] arm64_efi_rt_init+0x64/0xdc
    [<0000000007826a8d>] do_one_initcall+0x178/0xac0
    [<0000000054a87017>] do_initcalls+0x190/0x1d0
    [<00000000308092d0>] kernel_init_freeable+0x2c0/0x2f0
    [<000000003e7b99e0>] kernel_init+0x28/0x14c
    [<000000002246af5b>] ret_from_fork+0x10/0x20
The memory object in this case is for efi_rt_stack_top and is allocated in an initcall. So this is certainly a false positive. Mark the object as not a leak to quash it.
Signed-off-by: Waiman Long <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
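For illustration, the annotation involved looks roughly like this (a minimal sketch, not the verbatim arm64_efi_rt_init() change; the initcall and variable names below are made up for the example):

#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/init.h>
#include <linux/kmemleak.h>
#include <linux/sizes.h>
#include <linux/vmalloc.h>

static u64 *efi_rt_stack_base;	/* long-lived global holding the allocation */

static int __init efi_rt_stack_sketch_init(void)
{
	efi_rt_stack_base = __vmalloc(SZ_32K, GFP_KERNEL);
	if (!efi_rt_stack_base)
		return -ENOMEM;

	/* Intentionally never freed: tell kmemleak it is not a leak. */
	kmemleak_not_leak(efi_rt_stack_base);
	return 0;
}
core_initcall(efi_rt_stack_sketch_init);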
|
|
Revision tags: v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1 |
|
| #
0069455b |
| 21-Mar-2024 |
Kent Overstreet <[email protected]> |
fix missing vmalloc.h includes
Patch series "Memory allocation profiling", v6.
Overview: Low overhead [1] per-callsite memory allocation profiling. Not just for debug kernels, overhead low enough to be deployed in production.
Example output:
root@moria-kvm:~# sort -rn /proc/allocinfo
 127664128    31168 mm/page_ext.c:270 func:alloc_page_ext
  56373248     4737 mm/slub.c:2259 func:alloc_slab_page
  14880768     3633 mm/readahead.c:247 func:page_cache_ra_unbounded
  14417920     3520 mm/mm_init.c:2530 func:alloc_large_system_hash
  13377536      234 block/blk-mq.c:3421 func:blk_mq_alloc_rqs
  11718656     2861 mm/filemap.c:1919 func:__filemap_get_folio
   9192960     2800 kernel/fork.c:307 func:alloc_thread_stack_node
   4206592        4 net/netfilter/nf_conntrack_core.c:2567 func:nf_ct_alloc_hashtable
   4136960     1010 drivers/staging/ctagmod/ctagmod.c:20 [ctagmod] func:ctagmod_start
   3940352      962 mm/memory.c:4214 func:alloc_anon_folio
   2894464    22613 fs/kernfs/dir.c:615 func:__kernfs_new_node
...
Usage:
kconfig options:
- CONFIG_MEM_ALLOC_PROFILING
- CONFIG_MEM_ALLOC_PROFILING_ENABLED_BY_DEFAULT
- CONFIG_MEM_ALLOC_PROFILING_DEBUG
  adds warnings for allocations that weren't accounted because of a missing annotation
sysctl: /proc/sys/vm/mem_profiling
Runtime info: /proc/allocinfo
Notes:
[1]: Overhead
To measure the overhead we are comparing the following configurations:
(1) Baseline with CONFIG_MEMCG_KMEM=n
(2) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n)
(3) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y)
(4) Enabled at runtime (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n && /proc/sys/vm/mem_profiling=1)
(5) Baseline with CONFIG_MEMCG_KMEM=y && allocating with __GFP_ACCOUNT
(6) Disabled by default (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=n) && CONFIG_MEMCG_KMEM=y
(7) Enabled by default (CONFIG_MEM_ALLOC_PROFILING=y && CONFIG_MEM_ALLOC_PROFILING_BY_DEFAULT=y) && CONFIG_MEMCG_KMEM=y
Performance overhead: To evaluate performance we implemented an in-kernel test executing multiple get_free_page/free_page and kmalloc/kfree calls with allocation sizes growing from 8 to 240 bytes with CPU frequency set to max and CPU affinity set to a specific CPU to minimize the noise. Below are results from running the test on Ubuntu 22.04.2 LTS with 6.8.0-rc1 kernel on 56 core Intel Xeon:
                        kmalloc              pgalloc
(1 baseline)            6.764s               16.902s
(2 default disabled)    6.793s  (+0.43%)     17.007s (+0.62%)
(3 default enabled)     7.197s  (+6.40%)     23.666s (+40.02%)
(4 runtime enabled)     7.405s  (+9.48%)     23.901s (+41.41%)
(5 memcg)               13.388s (+97.94%)    48.460s (+186.71%)
(6 def disabled+memcg)  13.332s (+97.10%)    48.105s (+184.61%)
(7 def enabled+memcg)   13.446s (+98.78%)    54.963s (+225.18%)
Memory overhead: Kernel size:
        text       data       bss        dec        diff
(1)     26515311   18890222   17018880   62424413
(2)     26524728   19423818   16740352   62688898   264485
(3)     26524724   19423818   16740352   62688894   264481
(4)     26524728   19423818   16740352   62688898   264485
(5)     26541782   18964374   16957440   62463596   39183
Memory consumption on a 56 core Intel CPU with 125GB of memory:
Code tags:  192 kB
PageExts:   262144 kB (256MB)
SlabExts:   9876 kB (9.6MB)
PcpuExts:   512 kB (0.5MB)
Total overhead is 0.2% of total memory.
Benchmarks:
Hackbench tests run 100 times:
hackbench -s 512 -l 200 -g 15 -f 25 -P
        baseline    disabled profiling    enabled profiling
avg     0.3543      0.3559 (+0.0016)      0.3566 (+0.0023)
stdev   0.0137      0.0188                0.0077
hackbench -l 10000
        baseline    disabled profiling    enabled profiling
avg     6.4218      6.4306 (+0.0088)      6.5077 (+0.0859)
stdev   0.0933      0.0286                0.0489
stress-ng tests:
stress-ng --class memory --seq 4 -t 60
stress-ng --class cpu --seq 4 -t 60
Results posted at: https://evilpiepirate.org/~kent/memalloc_prof_v4_stress-ng/
[2] https://lore.kernel.org/all/[email protected]/
This patch (of 37):
The next patch drops vmalloc.h from a system header in order to fix a circular dependency; this adds it to all the files that were pulling it in implicitly.
[[email protected]: fix arch/alpha/lib/memcpy.c] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: fix arch/x86/mm/numa_32.c] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: a few places were depending on sizes.h] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: fix mm/kasan/hw_tags.c] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: fix arc build] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kent Overstreet <[email protected]> Signed-off-by: Suren Baghdasaryan <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Pasha Tatashin <[email protected]> Tested-by: Kees Cook <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Alex Gaynor <[email protected]> Cc: Alice Ryhl <[email protected]> Cc: Andreas Hindborg <[email protected]> Cc: Benno Lossin <[email protected]> Cc: "Björn Roy Baron" <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Gary Guo <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Wedson Almeida Filho <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
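For the file whose history is shown here, the mechanical part of the series is simply spelling the dependency out instead of inheriting it through another header (illustrative sketch; the helper below is made up for the example):

#include <linux/errno.h>
#include <linux/vmalloc.h>	/* now included explicitly where vmalloc()/vfree() are used */

static void *example_buf;

static int example_setup(void)
{
	example_buf = vmalloc(4096);	/* would no longer build without the explicit include */
	return example_buf ? 0 : -ENOMEM;
}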
|
|
Revision tags: v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5 |
|
| #
5a00bfd6 |
| 15-Feb-2024 |
Ryan Roberts <[email protected]> |
arm64/mm: new ptep layer to manage contig bit
Create a new layer for the in-table PTE manipulation APIs. For now, the existing API is prefixed with a double underscore to become the arch-private API, and the public API is just a simple wrapper that calls the private API.
The public API implementation will subsequently be used to transparently manipulate the contiguous bit where appropriate. But since there are already some contig-aware users (e.g. hugetlb, kernel mapper), we must first ensure those users use the private API directly so that the future contig-bit manipulations in the public API do not interfere with those existing uses.
The following APIs are treated this way:
- ptep_get
- set_pte
- set_ptes
- pte_clear
- ptep_get_and_clear
- ptep_test_and_clear_young
- ptep_clear_flush_young
- ptep_set_wrprotect
- ptep_set_access_flags
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Tested-by: John Hubbard <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
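As a rough sketch of the layering (close in spirit to, but not necessarily verbatim, the arm64 change; assume pte_t and the READ_ONCE/WRITE_ONCE helpers are already in scope, as they are in the pgtable headers):

/* Arch-private layer: always operates directly on the entry in the table. */
static inline pte_t __ptep_get(pte_t *ptep)
{
	return READ_ONCE(*ptep);
}

static inline void __set_pte(pte_t *ptep, pte_t pte)
{
	WRITE_ONCE(*ptep, pte);
}

/* Public layer: a plain wrapper today, and later the place where the
 * contiguous bit is managed transparently. */
static inline pte_t ptep_get(pte_t *ptep)
{
	return __ptep_get(ptep);
}

static inline void set_pte(pte_t *ptep, pte_t pte)
{
	__set_pte(ptep, pte);
}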
|
| #
53273655 |
| 15-Feb-2024 |
Ryan Roberts <[email protected]> |
arm64/mm: convert READ_ONCE(*ptep) to ptep_get(ptep)
There are a number of places in the arch code that read a pte by using the READ_ONCE() macro. Refactor these call sites to instead use the ptep_get() helper, which itself is a READ_ONCE(). Generated code should be the same.
This will benefit us when we shortly introduce the transparent contpte support. In this case, ptep_get() will become more complex so we now have all the code abstracted through it.
Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ryan Roberts <[email protected]> Tested-by: John Hubbard <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Barry Song <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Morse <[email protected]> Cc: Kefeng Wang <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
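At a typical call site the conversion is a one-line change; an illustrative example (the surrounding function is hypothetical):

#include <linux/pgtable.h>

static bool example_pte_is_young(pte_t *ptep)
{
	/* before: pte_t pte = READ_ONCE(*ptep); */
	pte_t pte = ptep_get(ptep);	/* same load today, contpte-aware later */

	return pte_young(pte);
}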
|
|
Revision tags: v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7 |
|
| #
b8466fe8 |
| 17-Oct-2023 |
Arnd Bergmann <[email protected]> |
efi: move screen_info into efi init code
After the vga console no longer relies on global screen_info, there are only two remaining use cases:
- on the x86 architecture, it is used for multiple boot methods (bzImage, EFI, Xen, kexec) to communicate the initial VGA or framebuffer settings to a number of device drivers.
- on other architectures, it is only used as part of the EFI stub, and only for the three sysfb framebuffers (simpledrm, simplefb, efifb).
Remove the duplicate data structure definitions by moving it into the efi-init.c file that sets it up initially for the EFI case, leaving x86 as an exception that retains its own definition for non-EFI boots.
The added #ifdefs here are optional; I added them to further limit the reach of screen_info to configurations that have at least one of the users enabled.
Reviewed-by: Ard Biesheuvel <[email protected]> Reviewed-by: Javier Martinez Canillas <[email protected]> Acked-by: Helge Deller <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
| #
bbbb6577 |
| 16-Oct-2023 |
Mark Rutland <[email protected]> |
arm64: Avoid cpus_have_const_cap() for ARM64_HAS_BTI
In system_supports_bti() we use cpus_have_const_cap() to check for ARM64_HAS_BTI, but this is not necessary and alternative_has_cap_*() or cpus_have_final_*cap() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than it needs to be. Before cpucaps are finalized, it will perform a bitmap test of the system_cpucaps bitmap, and once cpucaps are finalized it will use an alternative branch. This used to be necessary to handle some race conditions in the window between cpucap detection and the subsequent patching of alternatives and static branches, where different branches could be out-of-sync with one another (or w.r.t. alternative sequences). Now that we use alternative branches instead of static branches, these are all patched atomically w.r.t. one another, and there are only a handful of cases that need special care in the window between cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(), or cpus_have_cap() depending on their requirements. This will remove redundant instructions and improve code generation, and will make it easier to determine how each callsite will behave before, during, and after alternative patching.
When CONFIG_ARM64_BTI_KERNEL=y, the ARM64_HAS_BTI cpucap is a strict boot cpu feature which is detected and patched early on the boot cpu. All uses guarded by CONFIG_ARM64_BTI_KERNEL happen after the boot CPU has detected ARM64_HAS_BTI and patched boot alternatives, and hence can safely use alternative_has_cap_*() or cpus_have_final_boot_cap().
Regardless of CONFIG_ARM64_BTI_KERNEL, all other uses of ARM64_HAS_BTI happen after system capabilities have been finalized and alternatives have been patched. Hence these can safely use alternative_has_cap_*() or cpus_have_final_cap().
This patch splits system_supports_bti() into system_supports_bti() and system_supports_bti_kernel(), with the former handling where the cpucap affects userspace functionality, and the latter handling where the cpucap affects kernel functionality. The use of cpus_have_const_cap() is replaced by cpus_have_final_cap() in system_supports_bti(), and cpus_have_final_boot_cap() in system_supports_bti_kernel(). This will avoid generating code to test the system_cpucaps bitmap and should be better for all subsequent calls at runtime. The use of cpus_have_final_cap() and cpus_have_final_boot_cap() will make it easier to spot if code is changed such that these run before the ARM64_HAS_BTI cpucap is guaranteed to have been finalized.
Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: Mark Brown <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
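The resulting shape of the two helpers is roughly the following (a sketch based on the description above, not necessarily the exact header text):

static inline bool system_supports_bti(void)
{
	/* Userspace-facing: capabilities are finalized by the time this matters. */
	return cpus_have_final_cap(ARM64_HAS_BTI);
}

static inline bool system_supports_bti_kernel(void)
{
	/* Kernel-facing: strict boot CPU feature, detected and patched early. */
	return IS_ENABLED(CONFIG_ARM64_BTI_KERNEL) &&
	       cpus_have_final_boot_cap(ARM64_HAS_BTI);
}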
|
|
Revision tags: v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6 |
|
| #
c37ce235 |
| 08-Aug-2023 |
Ard Biesheuvel <[email protected]> |
efi/arm64: Move EFI runtime call setup/teardown helpers out of line
Only the arch_efi_call_virt() macro that some architectures override needs to be a macro, given that it is variadic and encapsulates calls via function pointers that have different prototypes.
The associated setup and teardown code are not special in this regard, and don't need to be instantiated at each call site. So turn them into ordinary C functions and move them out of line.
Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
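In outline the change looks like this (a simplified sketch assuming the existing arm64 helpers efi_virtmap_load()/efi_virtmap_unload() and the efi_rt_lock spinlock are in scope; the real functions also handle FP/SIMD state):

#include <linux/spinlock.h>

/* Previously macros instantiated at every call site; now ordinary
 * functions defined once in arm64 code. */
void arch_efi_call_virt_setup(void)
{
	efi_virtmap_load();		/* switch to the EFI page tables */
	raw_spin_lock(&efi_rt_lock);	/* serialize runtime service calls */
}

void arch_efi_call_virt_teardown(void)
{
	raw_spin_unlock(&efi_rt_lock);
	efi_virtmap_unload();
}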
|
|
Revision tags: v6.5-rc5, v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1 |
|
| #
8b0d1354 |
| 06-Jul-2023 |
Thomas Zimmermann <[email protected]> |
efi: Do not include <linux/screen_info.h> from EFI header
The header file <linux/efi.h> does not need anything from <linux/screen_info.h>. Declare struct screen_info and remove the include statements. Update a number of source files that require struct screen_info's definition.
v2: * update loongarch (Jingfeng)
Signed-off-by: Thomas Zimmermann <[email protected]> Reviewed-by: Javier Martinez Canillas <[email protected]> Reviewed-by: Sui Jingfeng <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Russell King <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
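The header-side idea is the usual forward-declaration trick (illustrative; the function shown is hypothetical):

/* linux/efi.h, before: #include <linux/screen_info.h> */
struct screen_info;				/* after: forward declaration        */

void efi_example_setup(struct screen_info *si);	/* pointers only need the struct tag */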
|
|
Revision tags: v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2 |
|
| #
0e68b551 |
| 15-Feb-2023 |
Pierre Gondois <[email protected]> |
arm64: efi: Make efi_rt_lock a raw_spinlock
Running an rt-kernel based on 6.2.0-rc3-rt1 on an Ampere Altra outputs the following:
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9, name: kworker/u320:0
preempt_count: 2, expected: 0
RCU nest depth: 0, expected: 0
3 locks held by kworker/u320:0/9:
#0: ffff3fff8c27d128 ((wq_completion)efi_rts_wq){+.+.}-{0:0}, at: process_one_work (./include/linux/atomic/atomic-long.h:41)
#1: ffff80000861bdd0 ((work_completion)(&efi_rts_work.work)){+.+.}-{0:0}, at: process_one_work (./include/linux/atomic/atomic-long.h:41)
#2: ffffdf7e1ed3e460 (efi_rt_lock){+.+.}-{3:3}, at: efi_call_rts (drivers/firmware/efi/runtime-wrappers.c:101)
Preemption disabled at:
efi_virtmap_load (./arch/arm64/include/asm/mmu_context.h:248)
CPU: 0 PID: 9 Comm: kworker/u320:0 Tainted: G W 6.2.0-rc3-rt1
Hardware name: WIWYNN Mt.Jade Server System B81.03001.0005/Mt.Jade Motherboard, BIOS 1.08.20220218 (SCP: 1.08.20220218) 2022/02/18
Workqueue: efi_rts_wq efi_call_rts
Call trace:
dump_backtrace (arch/arm64/kernel/stacktrace.c:158)
show_stack (arch/arm64/kernel/stacktrace.c:165)
dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4))
dump_stack (lib/dump_stack.c:114)
__might_resched (kernel/sched/core.c:10134)
rt_spin_lock (kernel/locking/rtmutex.c:1769 (discriminator 4))
efi_call_rts (drivers/firmware/efi/runtime-wrappers.c:101)
[...]
This seems to come from commit ff7a167961d1 ("arm64: efi: Execute runtime services from a dedicated stack") which adds a spinlock. This spinlock is taken through:
efi_call_rts()
\-efi_call_virt()
  \-efi_call_virt_pointer()
    \-arch_efi_call_virt_setup()
Make 'efi_rt_lock' a raw_spinlock to avoid being preempted.
[ardb: The EFI runtime services are called with a different set of translation tables, and are permitted to use the SIMD registers. The context switch code preserves/restores neither, and so EFI calls must be made with preemption disabled, rather than only disabling migration.]
Fixes: ff7a167961d1 ("arm64: efi: Execute runtime services from a dedicated stack") Signed-off-by: Pierre Gondois <[email protected]> Cc: <[email protected]> # v6.1+ Signed-off-by: Ard Biesheuvel <[email protected]>
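A minimal sketch of the fix (the lock name matches the commit, the surrounding helpers are illustrative): on PREEMPT_RT a spinlock_t becomes a sleeping lock and cannot be taken in the non-preemptible section opened by efi_virtmap_load(), while a raw_spinlock_t remains a true spinning lock:

#include <linux/spinlock.h>

static DEFINE_RAW_SPINLOCK(efi_rt_lock);	/* was: DEFINE_SPINLOCK(efi_rt_lock) */

static void efi_rt_call_begin(void)
{
	raw_spin_lock(&efi_rt_lock);		/* was: spin_lock(&efi_rt_lock) */
}

static void efi_rt_call_end(void)
{
	raw_spin_unlock(&efi_rt_lock);
}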
|
|
Revision tags: v6.2-rc8, v6.2-rc7 |
|
| #
1d959312 |
| 01-Feb-2023 |
Ard Biesheuvel <[email protected]> |
efi: arm64: Wire up BTI annotation in memory attributes table
UEFI v2.10 extends the EFI memory attributes table with a flag that indicates whether or not all RuntimeServicesCode regions were constructed with BTI landing pads, permitting the OS to map these regions with BTI restrictions enabled.
So let's take this into account on arm64.
Signed-off-by: Ard Biesheuvel <[email protected]> Reviewed-by: Kees Cook <[email protected]> Acked-by: Mark Rutland <[email protected]> Reviewed-by: Will Deacon <[email protected]>
|
| #
cf1d2ffc |
| 01-Feb-2023 |
Ard Biesheuvel <[email protected]> |
efi: Discover BTI support in runtime services regions
Add the generic plumbing to detect whether or not the runtime code regions were constructed with BTI/IBT landing pads by the firmware, permitting the OS to enable enforcement when mapping these regions into the OS's address space.
Signed-off-by: Ard Biesheuvel <[email protected]> Reviewed-by: Kees Cook <[email protected]>
|
|
Revision tags: v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3 |
|
| #
8a9a1a18 |
| 28-Oct-2022 |
Ard Biesheuvel <[email protected]> |
arm64: efi: Avoid workqueue to check whether EFI runtime is live
Comparing current_work() against efi_rts_work.work is sufficient to decide whether current is currently running EFI runtime services code at any level in its call stack.
However, there are other potential users of the EFI runtime stack, such as the ACPI subsystem, which may invoke efi_call_virt_pointer() directly, and so any sync exceptions occurring in firmware during those calls are currently misidentified.
So instead, let's check whether the stashed value of the thread stack pointer points into current's thread stack. This can only be the case if current was interrupted while running EFI runtime code. Note that this implies that we should clear the stashed value after switching back, to avoid false positives.
Reviewed-by: Mark Rutland <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
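Conceptually the check becomes something like this (an illustrative sketch, not the exact helper; it assumes, for the sake of the example, that the stashed thread stack pointer is stored in the word just below efi_rt_stack_top):

#include <linux/sched/task_stack.h>

extern u64 *efi_rt_stack_top;

static bool current_in_efi_sketch(void)
{
	unsigned long stashed_sp = READ_ONCE(efi_rt_stack_top[-1]);
	unsigned long stack = (unsigned long)task_stack_page(current);

	/* True only if current was interrupted while running EFI runtime code;
	 * the stashed value is cleared again after switching back, to avoid
	 * false positives. */
	return stashed_sp >= stack && stashed_sp < stack + THREAD_SIZE;
}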
|
| #
e8dfdf31 |
| 28-Oct-2022 |
Ard Biesheuvel <[email protected]> |
arm64: efi: Recover from synchronous exceptions occurring in firmware
Unlike x86, which has machinery to deal with page faults that occur during the execution of EFI runtime services, arm64 has nothing like that, and a synchronous exception raised by firmware code brings down the whole system.
With more EFI based systems appearing that were not built to run Linux (such as the Windows-on-ARM laptops based on Qualcomm SOCs), as well as the introduction of PRM (platform specific firmware routines that are callable just like EFI runtime services), we are more likely to run into issues of this sort, and it is much more likely that we can identify and work around such issues if they don't bring down the system entirely.
Since we already use an EFI runtime services call wrapper in assembler, we can quite easily add some code that captures the execution state at the point where the call is made, allowing us to revert to this state and resume execution if the call triggered a synchronous exception.
Given that the kernel and the firmware don't share any data structures that could end up in an indeterminate state, we can happily continue running, as long as we mark the EFI runtime services as unavailable from that point on.
Signed-off-by: Ard Biesheuvel <[email protected]> Acked-by: Catalin Marinas <[email protected]>
|
| #
ff7a1679 |
| 05-Dec-2022 |
Ard Biesheuvel <[email protected]> |
arm64: efi: Execute runtime services from a dedicated stack
With the introduction of PRMT in the ACPI subsystem, the EFI rts workqueue is no longer the only caller of efi_call_virt_pointer() in the kernel. This means the EFI runtime services lock is no longer sufficient to manage concurrent calls into firmware, but also that firmware calls may occur that are not marshalled via the workqueue mechanism, but originate directly from the caller context.
For added robustness, and to ensure that the runtime services have 8 KiB of stack space available as per the EFI spec, introduce a spinlock protected EFI runtime stack of 8 KiB, where the spinlock also ensures serialization between the EFI rts workqueue (which itself serializes EFI runtime calls) and other callers of efi_call_virt_pointer().
While at it, use the stack pivot to avoid reloading the shadow call stack pointer from the ordinary stack, as doing so could produce a gadget to defeat it.
Signed-off-by: Ard Biesheuvel <[email protected]>
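In outline the arrangement is (a sketch; variable names and the initcall level are illustrative, only the 8 KiB size comes from the commit):

#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/init.h>
#include <linux/sizes.h>
#include <linux/spinlock.h>
#include <linux/vmalloc.h>

static DEFINE_SPINLOCK(efi_rt_lock);	/* serializes all efi_call_virt_pointer() users */
static u64 *efi_rt_stack_top;		/* top of the dedicated 8 KiB EFI runtime stack */

static int __init efi_rt_stack_init(void)
{
	void *p = __vmalloc(SZ_8K, GFP_KERNEL);

	efi_rt_stack_top = p ? p + SZ_8K : NULL;
	return p ? 0 : -ENOMEM;
}
core_initcall(efi_rt_stack_init);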
|
| #
7572ac3c |
| 30-Nov-2022 |
Ard Biesheuvel <[email protected]> |
arm64: efi: Revert "Recover from synchronous exceptions ..."
This reverts commit 23715a26c8d81291, which introduced some code in assembler that manipulates both the ordinary and the shadow call stack pointer in a way that could potentially be taken advantage of. So let's revert it, and do a better job the next time around.
Signed-off-by: Ard Biesheuvel <[email protected]>
|
| #
9b9eaee9 |
| 06-Nov-2022 |
Ard Biesheuvel <[email protected]> |
arm64: efi: Fix handling of misaligned runtime regions and drop warning
Currently, when mapping the EFI runtime regions in the EFI page tables, we complain about misaligned regions in a rather noisy way, using WARN().
Not only does this produce a lot of irrelevant clutter in the log, it is factually incorrect, as misaligned runtime regions are actually allowed by the EFI spec as long as they don't require conflicting memory types within the same 64k page.
So let's drop the warning, and tweak the code so that we:
- take both the start and end of the region into account when checking for misalignment
- only revert to RWX mappings for non-code regions if misaligned code regions are also known to exist
Cc: <[email protected]> Acked-by: Linus Torvalds <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]>
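The relaxed check boils down to looking at both ends of the region (a sketch close to what the patch describes, not guaranteed to be the verbatim helper):

#include <linux/efi.h>
#include <linux/mm.h>

static bool region_is_misaligned(const efi_memory_desc_t *md)
{
	u64 end = md->phys_addr + md->num_pages * EFI_PAGE_SIZE;

	/* Misaligned at either the start or the end of the region. */
	return !PAGE_ALIGNED(md->phys_addr) || !PAGE_ALIGNED(end);
}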
|
| #
23715a26 |
| 28-Oct-2022 |
Ard Biesheuvel <[email protected]> |
arm64: efi: Recover from synchronous exceptions occurring in firmware
Unlike x86, which has machinery to deal with page faults that occur during the execution of EFI runtime services, arm64 has nothing like that, and a synchronous exception raised by firmware code brings down the whole system.
With more EFI based systems appearing that were not built to run Linux (such as the Windows-on-ARM laptops based on Qualcomm SOCs), as well as the introduction of PRM (platform specific firmware routines that are callable just like EFI runtime services), we are more likely to run into issues of this sort, and it is much more likely that we can identify and work around such issues if they don't bring down the system entirely.
Since we already use an EFI runtime services call wrapper in assembler, we can quite easily add some code that captures the execution state at the point where the call is made, allowing us to revert to this state and resume execution if the call triggered a synchronous exception.
Given that the kernel and the firmware don't share any data structures that could end up in an indeterminate state, we can happily continue running, as long as we mark the EFI runtime services as unavailable from that point on.
Signed-off-by: Ard Biesheuvel <[email protected]> Acked-by: Catalin Marinas <[email protected]>
|
|
Revision tags: v6.1-rc2, v6.1-rc1, v6.0, v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5 |
|
| #
9b16c213 |
| 04-Aug-2021 |
Michael Kelley <[email protected]> |
arm64: efi: Export screen_info
The Hyper-V frame buffer driver may be built as a module, and it needs access to screen_info. So export screen_info.
Signed-off-by: Michael Kelley <[email protected]> Acked-by: Ard Biesheuvel <[email protected]> Acked-by: Marc Zyngier <[email protected]> Acked-by: Mark Rutland <[email protected]> Acked-by: Catalin Marinas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Wei Liu <[email protected]>
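The change itself is essentially a one-liner next to the existing definition (sketch):

#include <linux/export.h>
#include <linux/screen_info.h>

struct screen_info screen_info;
EXPORT_SYMBOL(screen_info);	/* now visible to modular drivers such as hyperv_fb */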
|
|
Revision tags: v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7, v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1 |
|
| #
33def849 |
| 22-Oct-2020 |
Joe Perches <[email protected]> |
treewide: Convert macro and uses of __section(foo) to __section("foo")
Use a more generic form for __section that requires quotes to avoid complications with clang and gcc differences.
Remove the quote operator # from compiler_attributes.h __section macro.
Convert all unquoted __section(foo) uses to quoted __section("foo"). Also convert __attribute__((section("foo"))) uses to __section("foo") even if the __attribute__ has multiple list entry forms.
Conversion done using the script at:
https://lore.kernel.org/lkml/[email protected]/2-convert_section.pl
Signed-off-by: Joe Perches <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Reviewed-by: Miguel Ojeda <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
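Example of what the conversion looks like at a use site (variable and section names are illustrative):

#include <linux/compiler.h>

/* Before: the macro stringified a bare token argument.
 *	static int bootdata __section(.init.data);
 * After: callers pass a quoted string, which gcc and clang handle identically. */
static int bootdata __section(".init.data");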
|
|
Revision tags: v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4, v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3, v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1 |
|
| #
8b1e0f81 |
| 12-Jul-2019 |
Anshuman Khandual <[email protected]> |
mm/pgtable: drop pgtable_t variable from pte_fn_t functions
Drop the pgtable_t variable from all implementations of pte_fn_t as none of them use it. apply_to_pte_range() should stop computing it as well. This should help us save some cycles.
Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Anshuman Khandual <[email protected]> Acked-by: Matthew Wilcox <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Russell King <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Logan Gunthorpe <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Dan Williams <[email protected]> Cc: <[email protected]> Cc: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
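After the change a pte_fn_t callback takes only the entry pointer, the address, and the opaque data pointer; a minimal callback in the new shape, usable with apply_to_page_range() (illustrative):

#include <linux/mm.h>
#include <linux/pgtable.h>

static int clear_pte_cb(pte_t *ptep, unsigned long addr, void *data)
{
	pte_clear(&init_mm, addr, ptep);	/* no pgtable_t parameter any more */
	return 0;
}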
|
|
Revision tags: v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4 |
|
| #
d2912cb1 |
| 04-Jun-2019 |
Thomas Gleixner <[email protected]> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation
this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation #
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Enrico Weigelt <[email protected]> Reviewed-by: Kate Stewart <[email protected]> Reviewed-by: Allison Randal <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
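In each affected file the visible result is just the identifier replacing the paragraph of boilerplate, e.g.:

// SPDX-License-Identifier: GPL-2.0-only
/*
 * (The "This program is free software; you can redistribute it and/or
 *  modify it..." paragraph that used to sit here is removed.)
 */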
|
|
Revision tags: v5.2-rc3, v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3, v5.1-rc2, v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7, v5.0-rc6, v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4, v4.20-rc3, v4.20-rc2, v4.20-rc1, v4.19, v4.19-rc8, v4.19-rc7, v4.19-rc6, v4.19-rc5, v4.19-rc4, v4.19-rc3, v4.19-rc2, v4.19-rc1, v4.18, v4.18-rc8, v4.18-rc7, v4.18-rc6, v4.18-rc5, v4.18-rc4, v4.18-rc3, v4.18-rc2, v4.18-rc1, v4.17, v4.17-rc7, v4.17-rc6, v4.17-rc5, v4.17-rc4, v4.17-rc3, v4.17-rc2, v4.17-rc1, v4.16, v4.16-rc7, v4.16-rc6, v4.16-rc5 |
|
| #
7e611e7d |
| 08-Mar-2018 |
Ard Biesheuvel <[email protected]> |
efi/arm64: Check whether x18 is preserved by runtime services calls
Whether or not we will ever decide to start using x18 as a platform register in Linux is uncertain, but by that time, we will need to ensure that UEFI runtime services calls don't corrupt it.
So let's start issuing warnings now for this, and increase the likelihood that these firmware images have all been replaced by that time.
This has been fixed on the EDK2 side in commit:
6d73863b5464 ("BaseTools/tools_def AARCH64: mark register x18 as reserved")
dated July 13, 2017.
Signed-off-by: Ard Biesheuvel <[email protected]> Acked-by: Will Deacon <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
|
|
Revision tags: v4.16-rc4, v4.16-rc3, v4.16-rc2 |
|
| #
20a004e7 |
| 15-Feb-2018 |
Will Deacon <[email protected]> |
arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables
In many cases, page tables can be accessed concurrently by either another CPU (due to things like fast gup) or by the hardware page table walker itself, which may set access/dirty bits. In such cases, it is important to use READ_ONCE/WRITE_ONCE when accessing page table entries so that entries cannot be torn, merged or subject to apparent loss of coherence due to compiler transformations.
Whilst there are some scenarios where this cannot happen (e.g. pinned kernel mappings for the linear region), the overhead of using READ_ONCE/WRITE_ONCE everywhere is minimal and makes the code an awful lot easier to reason about. This patch consistently uses these macros in the arch code, as well as explicitly namespacing pointers to page table entries from the entries themselves by adopting a 'p' suffix for the former (as is sometimes used elsewhere in the kernel source).
Tested-by: Yury Norov <[email protected]> Tested-by: Richard Ruigrok <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
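The pattern applied throughout the arch code, in an illustrative example (the helper is hypothetical):

#include <linux/pgtable.h>

static void example_make_young(pte_t *ptep)
{
	pte_t pte = READ_ONCE(*ptep);			/* single, untorn load  */

	if (!pte_young(pte))
		WRITE_ONCE(*ptep, pte_mkyoung(pte));	/* single, untorn store */
}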
|
|
Revision tags: v4.16-rc1, v4.15, v4.15-rc9, v4.15-rc8, v4.15-rc7 |
|
| #
1e9de1d2 |
| 02-Jan-2018 |
Ard Biesheuvel <[email protected]> |
arm64/efi: Ignore EFI_MEMORY_XP attribute if RP and/or WP are set
The UEFI memory map is a bit vague about how to interpret the EFI_MEMORY_XP attribute when it is combined with EFI_MEMORY_RP and/or EFI_MEMORY_WP, which have retroactively been redefined as cacheability attributes rather than permission attributes.
So let's ignore EFI_MEMORY_XP if _RP and/or _WP are also set. In this case, it is likely that they are being used to describe the capability of the region (i.e., whether it has the controls to reconfigure it as non-executable) rather than the nature of the contents of the region (i.e., whether it contains data that we will never attempt to execute)
Reported-by: Stephen Boyd <[email protected]> Tested-by: Stephen Boyd <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]> Cc: Arvind Yadav <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tyler Baicar <[email protected]> Cc: Vasyl Gomonovych <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
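The decision reduces to something like the following (a sketch of the rule as described, not the exact arm64 code):

#include <linux/efi.h>

static bool efi_region_is_nx(const efi_memory_desc_t *md)
{
	/* Only honour XP when neither RP nor WP is set; otherwise those bits
	 * are likely describing capabilities, not the region's contents. */
	if (md->attribute & (EFI_MEMORY_RP | EFI_MEMORY_WP))
		return false;

	return md->attribute & EFI_MEMORY_XP;
}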
|