|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6, v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6 |
|
| #
d3568644 |
| 24-Oct-2023 |
Maria Yu <[email protected]> |
arm64: module: Fix PLT counting when CONFIG_RANDOMIZE_BASE=n
The counting of module PLTs has been broken when CONFIG_RANDOMIZE_BASE=n since commit:
3e35d303ab7d22c4 ("arm64: module: rework module VA range selection")
Prior to that commit, when CONFIG_RANDOMIZE_BASE=n, the kernel image and all modules were placed within a 128M region, and no PLTs were necessary for B or BL. Hence count_plts() and partition_branch_plt_relas() skipped handling B and BL when CONFIG_RANDOMIZE_BASE=n.
After that commit, modules can be placed anywhere within a 2G window regardless of CONFIG_RANDOMIZE_BASE, and hence PLTs may be necessary for B and BL even when CONFIG_RANDOMIZE_BASE=n. Unfortunately that commit failed to update count_plts() and partition_branch_plt_relas() accordingly.
Due to this, module_emit_plt_entry() may fail if an insufficient number of PLT entries have been reserved, resulting in modules failing to load with -ENOEXEC.
Fix this by counting PLTs regardless of CONFIG_RANDOMIZE_BASE in count_plts() and partition_branch_plt_relas().
Fixes: 3e35d303ab7d ("arm64: module: rework module VA range selection") Signed-off-by: Maria Yu <[email protected]> Cc: <[email protected]> # 6.5.x Acked-by: Ard Biesheuvel <[email protected]> Reviewed-by: Mark Rutland <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
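To make the fix concrete, here is a minimal standalone C sketch of the counting policy described above; the types, names and the in-range test are illustrative stand-ins, not the kernel's actual count_plts():

  #include <stdbool.h>
  #include <stddef.h>

  enum reloc_type { R_CALL26, R_JUMP26, R_OTHER };

  struct reloc {
          enum reloc_type type;
          bool target_in_range;   /* stand-in for the real reachability test */
  };

  static size_t count_plts_example(const struct reloc *rela, size_t num)
  {
          size_t plts = 0;

          for (size_t i = 0; i < num; i++) {
                  switch (rela[i].type) {
                  case R_CALL26:
                  case R_JUMP26:
                          /*
                           * Before the fix this case was skipped entirely when
                           * CONFIG_RANDOMIZE_BASE=n; now a slot is reserved for
                           * any branch that may land out of direct range.
                           */
                          if (!rela[i].target_in_range)
                                  plts++;
                          break;
                  default:
                          break;
                  }
          }
          return plts;
  }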
|
|
Revision tags: v6.6-rc7 |
|
| #
0a285dfe |
| 16-Oct-2023 |
Mark Rutland <[email protected]> |
arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_843419
In count_plts() and is_forbidden_offset_for_adrp() we use cpus_have_const_cap() to check for ARM64_WORKAROUND_843419, but this is not necessary and cpus_have_final_cap() would be preferable.
For historical reasons, cpus_have_const_cap() is more complicated than it needs to be. Before cpucaps are finalized, it will perform a bitmap test of the system_cpucaps bitmap, and once cpucaps are finalized it will use an alternative branch. This used to be necessary to handle some race conditions in the window between cpucap detection and the subsequent patching of alternatives and static branches, where different branches could be out-of-sync with one another (or w.r.t. alternative sequences). Now that we use alternative branches instead of static branches, these are all patched atomically w.r.t. one another, and there are only a handful of cases that need special care in the window between cpucap detection and alternative patching.
Due to the above, it would be nice to remove cpus_have_const_cap(), and migrate callers over to alternative_has_cap_*(), cpus_have_final_cap(), or cpus_have_cap() depending on their requirements. This will remove redundant instructions and improve code generation, and will make it easier to determine how each callsite will behave before, during, and after alternative patching.
It's not possible to load a module in the window between detecting the ARM64_WORKAROUND_843419 cpucap and patching alternatives. The module VA range limits are initialized much later in module_init_limits() which is a subsys_initcall, and module loading cannot happen before this. Hence it's not necessary for count_plts() or is_forbidden_offset_for_adrp() to use cpus_have_const_cap().
This patch replaces the use of cpus_have_const_cap() with cpus_have_final_cap() which will avoid generating code to test the system_cpucaps bitmap and should be better for all subsequent calls at runtime. Using cpus_have_final_cap() clearly documents that we do not expect this code to run before cpucaps are finalized, and will make it easier to spot issues if code is changed in future to allow modules to be loaded earlier. The ARM64_WORKAROUND_843419 cpucap is added to cpucap_is_possible() so that code can be elided entirely when this is not possible, and redundant IS_ENABLED() checks are removed.
Signed-off-by: Mark Rutland <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
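As a rough illustration of the contract change, here is a standalone sketch that models the two helpers as a bitmap plus a "finalized" flag; the real arm64 cpufeature code is considerably more involved (alternatives patching, per-CPU checks), so treat this purely as a mental model:

  #include <assert.h>
  #include <stdbool.h>

  #define NUM_CAPS_EXAMPLE 64

  static bool caps_finalized;
  static bool system_caps[NUM_CAPS_EXAMPLE];

  /* May be called at any time: reads the bitmap before finalization. */
  static bool have_const_cap_example(int cap)
  {
          return system_caps[cap];
  }

  /* Encodes the stronger contract: callers must run after caps are finalized. */
  static bool have_final_cap_example(int cap)
  {
          assert(caps_finalized); /* the real helper treats an early call as a bug */
          return system_caps[cap];
  }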
|
|
Revision tags: v6.6-rc6, v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5 |
|
| #
f928f8b1 |
| 01-Aug-2023 |
James Morse <[email protected]> |
arm64: module: Use module_init_layout_section() to spot init sections
Today module_frob_arch_sections() spots init sections from their 'init' prefix, and uses this to keep the init PLTs separate from the rest.
module_emit_plt_entry() uses within_module_init() to determine if a location is in the init text or not, but this depends on whether core code thought this was an init section.
Naturally the logic is different.
module_init_layout_section() groups the init and exit text together if module unloading is disabled, as the exit code will never run. The result is kernels with this configuration can't load all their modules because there are not enough PLTs for the combined init+exit section.
This results in the following:
| WARNING: CPU: 2 PID: 51 at arch/arm64/kernel/module-plts.c:99 module_emit_plt_entry+0x184/0x1cc
| Modules linked in: crct10dif_common
| CPU: 2 PID: 51 Comm: modprobe Not tainted 6.5.0-rc4-yocto-standard-dirty #15208
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
| pc : module_emit_plt_entry+0x184/0x1cc
| lr : module_emit_plt_entry+0x94/0x1cc
| sp : ffffffc0803bba60
[...]
| Call trace:
|  module_emit_plt_entry+0x184/0x1cc
|  apply_relocate_add+0x2bc/0x8e4
|  load_module+0xe34/0x1bd4
|  init_module_from_file+0x84/0xc0
|  __arm64_sys_finit_module+0x1b8/0x27c
|  invoke_syscall.constprop.0+0x5c/0x104
|  do_el0_svc+0x58/0x160
|  el0_svc+0x38/0x110
|  el0t_64_sync_handler+0xc0/0xc4
|  el0t_64_sync+0x190/0x194
A previous patch exposed module_init_layout_section(), use that so the logic is the same.
Reported-by: Adam Johnston <[email protected]> Tested-by: Adam Johnston <[email protected]> Fixes: 055f23b74b20 ("module: check for exit sections in layout_sections() instead of module_init_section()") Cc: <[email protected]> # 5.15.x: 60a0aab7463ee69 arm64: module-plts: inline linux/moduleloader.h Cc: <[email protected]> # 5.15.x Signed-off-by: James Morse <[email protected]> Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Luis Chamberlain <[email protected]>
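A hedged sketch of the mismatch being fixed, with invented helper names: the arch PLT code and the core layout code must agree on what counts as an "init" section, otherwise branches in .exit.text are budgeted against the wrong PLT pool when module unloading is disabled:

  #include <stdbool.h>
  #include <string.h>

  static bool module_unload_enabled;

  /* Simplified version of what core module layout code decides. */
  static bool init_layout_section_example(const char *name)
  {
          if (!module_unload_enabled && strncmp(name, ".exit", 5) == 0)
                  return true;    /* exit text is laid out with init text */
          return strncmp(name, ".init", 5) == 0;
  }

  /*
   * The arch side must ask the same question with the same answer; before
   * the fix it only checked for the ".init" prefix.
   */
  static bool counts_against_init_plts(const char *secname)
  {
          return init_layout_section_example(secname);
  }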
|
|
Revision tags: v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6, v6.4-rc5, v6.4-rc4, v6.4-rc3 |
|
| #
60a0aab7 |
| 16-May-2023 |
Arnd Bergmann <[email protected]> |
arm64: module-plts: inline linux/moduleloader.h
module_frob_arch_sections() is declared in moduleloader.h, but that is not included before the definition:
arch/arm64/kernel/module-plts.c:286:5: error: no previous prototype for 'module_frob_arch_sections' [-Werror=missing-prototypes]
Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Kees Cook <[email protected]> Acked-by: Ard Biesheuvel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
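A tiny illustration of the class of warning being fixed, with hypothetical names rather than the real declaration: under -Wmissing-prototypes, a global definition must be preceded by a prototype, which normally comes from including the header that declares it:

  /* In the header that callers include (a moduleloader-style header): */
  int frob_sections_example(void);

  /* In the .c file: with the prototype above in scope, no warning fires. */
  int frob_sections_example(void)
  {
          return 0;
  }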
|
|
Revision tags: v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4, v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8 |
|
| #
ac3b4328 |
| 07-Feb-2023 |
Song Liu <[email protected]> |
module: replace module_layout with module_memory
module_layout manages different types of memory (text, data, rodata, etc.) in one allocation, which is problematic for some reasons:
1. It is hard to enable CONFIG_STRICT_MODULE_RWX.
2. It is hard to use huge pages in modules (and not break strict rwx).
3. Many archs use module_layout for arch-specific data, but it is not obvious how these data are used (are they RO, RX, or RW?)
Improve the scenario by replacing 2 (or 3) module_layout per module with up to 7 module_memory per module:
MOD_TEXT, MOD_DATA, MOD_RODATA, MOD_RO_AFTER_INIT, MOD_INIT_TEXT, MOD_INIT_DATA, MOD_INIT_RODATA,
and allocating them separately. This adds slightly more entries to mod_tree (from up to 3 entries per module, to up to 7 entries per module). However, this at most adds a small constant overhead to __module_address(), which is expected to be fast.
Various archs use module_layout for different data. These data are put into different module_memory based on their location in module_layout. IOW, data that used to go with text is allocated with MOD_MEM_TYPE_TEXT; data that used to go with data is allocated with MOD_MEM_TYPE_DATA, etc.
module_memory simplifies quite some of the module code. For example, ARCH_WANTS_MODULES_DATA_IN_VMALLOC is a lot cleaner, as it just uses a different allocator for the data. kernel/module/strict_rwx.c is also much cleaner with module_memory.
Signed-off-by: Song Liu <[email protected]> Cc: Luis Chamberlain <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: Christophe Leroy <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Reviewed-by: Luis Chamberlain <[email protected]> Signed-off-by: Luis Chamberlain <[email protected]>
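For orientation, a toy model of the new arrangement; the enumerators follow the list in the commit message, while the struct fields are simplified placeholders rather than the kernel's exact definitions:

  enum mod_mem_type_example {
          MOD_TEXT,
          MOD_DATA,
          MOD_RODATA,
          MOD_RO_AFTER_INIT,
          MOD_INIT_TEXT,
          MOD_INIT_DATA,
          MOD_INIT_RODATA,
          MOD_MEM_NUM_TYPES,
  };

  struct module_memory_example {
          void *base;             /* separate allocation per type ...           */
          unsigned int size;      /* ... so each can get its own permissions    */
  };

  struct module_example {
          struct module_memory_example mem[MOD_MEM_NUM_TYPES];
  };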
|
|
Revision tags: v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7, v6.1-rc6, v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0 |
|
| #
3fb420f5 |
| 29-Sep-2022 |
Li Huafei <[email protected]> |
arm64: module: Make plt_equals_entry() static
Since commit 4e69ecf4da1e ("arm64/module: ftrace: deal with place relative nature of PLTs"), plt_equals_entry() is not used outside of module-plts.c, so make it static.
Signed-off-by: Li Huafei <[email protected]> Acked-by: Mark Rutland <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
|
|
Revision tags: v6.0-rc7, v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17 |
|
| #
dd671f16 |
| 18-Mar-2022 |
Julia Lawall <[email protected]> |
arm64: fix typos in comments
Various spelling mistakes in comments. Detected with the help of Coccinelle.
Signed-off-by: Julia Lawall <[email protected]> Link: https://lore.kernel.org/r/[email protected] [will: Squashed in [email protected]] Signed-off-by: Will Deacon <[email protected]>
|
|
Revision tags: v5.17-rc8, v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7, v5.15-rc6, v5.15-rc5, v5.15-rc4, v5.15-rc3, v5.15-rc2, v5.15-rc1, v5.14, v5.14-rc7, v5.14-rc6, v5.14-rc5, v5.14-rc4, v5.14-rc3, v5.14-rc2, v5.14-rc1, v5.13, v5.13-rc7, v5.13-rc6, v5.13-rc5, v5.13-rc4, v5.13-rc3, v5.13-rc2, v5.13-rc1, v5.12, v5.12-rc8, v5.12-rc7, v5.12-rc6, v5.12-rc5, v5.12-rc4, v5.12-rc3, v5.12-rc2, v5.12-rc1, v5.12-rc1-dontuse, v5.11, v5.11-rc7 |
|
| #
d9f1b52a |
| 04-Feb-2021 |
Zhiyuan Dai <[email protected]> |
arm64: improve whitespace
In a few places we don't have whitespace between macro parameters, which makes them hard to read. This patch adds whitespace to clearly separate the parameters.
In a few places we have unnecessary whitespace around unary operators, which is confusing. This patch removes the unnecessary whitespace.
Signed-off-by: Zhiyuan Dai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
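A two-line illustration of the kind of cleanup described, using an example macro that is not taken from the patch:

  /* Harder to read: no whitespace between the macro parameters. */
  #define ALIGN_UP_BAD(x,a)   (((x) + (a) - 1) & ~((a) - 1))

  /* After the cleanup: parameters clearly separated. */
  #define ALIGN_UP(x, a)      (((x) + (a) - 1) & ~((a) - 1))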
|
|
Revision tags: v5.11-rc6, v5.11-rc5, v5.11-rc4, v5.11-rc3, v5.11-rc2, v5.11-rc1, v5.10, v5.10-rc7, v5.10-rc6, v5.10-rc5, v5.10-rc4, v5.10-rc3, v5.10-rc2, v5.10-rc1, v5.9, v5.9-rc8, v5.9-rc7, v5.9-rc6, v5.9-rc5, v5.9-rc4 |
|
| #
e0328fed |
| 01-Sep-2020 |
Jessica Yu <[email protected]> |
arm64/module: set trampoline section flags regardless of CONFIG_DYNAMIC_FTRACE
In the arm64 module linker script, the section .text.ftrace_trampoline is specified unconditionally regardless of whether CONFIG_DYNAMIC_FTRACE is enabled (this is simply due to the limitation that module linker scripts are not preprocessed like the vmlinux one).
Normally, for .plt and .text.ftrace_trampoline, the section flags present in the module binary wouldn't matter since module_frob_arch_sections() would assign them manually anyway. However, the arm64 module loader only sets the section flags for .text.ftrace_trampoline when CONFIG_DYNAMIC_FTRACE=y. That's only become problematic recently due to a recent change in binutils-2.35, where the .text.ftrace_trampoline section (along with the .plt section) is now marked writable and executable (WAX).
We no longer allow writable and executable sections to be loaded due to commit 5c3a7db0c7ec ("module: Harden STRICT_MODULE_RWX"), so this is causing all modules linked with binutils-2.35 to be rejected under arm64. Drop the IS_ENABLED(CONFIG_DYNAMIC_FTRACE) check in module_frob_arch_sections() so that the section flags for .text.ftrace_trampoline get properly set to SHF_EXECINSTR|SHF_ALLOC, without SHF_WRITE.
Signed-off-by: Jessica Yu <[email protected]> Acked-by: Will Deacon <[email protected]> Acked-by: Ard Biesheuvel <[email protected]> Link: http://lore.kernel.org/r/20200831094651.GA16385@linux-8ccs Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
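A simplified sketch of the resulting behaviour, using standard ELF constants; the loop is a stand-in for the real module_frob_arch_sections() and deliberately handles only the trampoline section discussed above:

  #include <elf.h>
  #include <string.h>

  static void frob_trampoline_flags(Elf64_Shdr *sechdrs, unsigned int shnum,
                                    const char *secstrings)
  {
          for (unsigned int i = 0; i < shnum; i++) {
                  const char *name = secstrings + sechdrs[i].sh_name;

                  if (strcmp(name, ".text.ftrace_trampoline") == 0) {
                          /* Allocated and executable, never writable. */
                          sechdrs[i].sh_flags = SHF_EXECINSTR | SHF_ALLOC;
                  }
          }
  }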
|
|
Revision tags: v5.9-rc3, v5.9-rc2, v5.9-rc1, v5.8, v5.8-rc7, v5.8-rc6, v5.8-rc5, v5.8-rc4, v5.8-rc3 |
|
| #
d4e03409 |
| 23-Jun-2020 |
Saravana Kannan <[email protected]> |
arm64/module: Optimize module load time by optimizing PLT counting
When loading a module, module_frob_arch_sections() tries to figure out the number of PLTs that'll be needed to handle all the RELAs. While doing this, it tries to dedupe PLT allocations for multiple R_AARCH64_CALL26 relocations to the same symbol. It does the same for R_AARCH64_JUMP26 relocations.
To make checks for duplicates easier/faster, it sorts the relocation list by type, symbol and addend. That way, to check for a duplicate relocation, it just needs to compare with the previous entry.
However, sorting the entire relocation array is unnecessary and expensive (O(n log n)) because there are a lot of other relocation types that don't need deduping or can't be deduped.
So this commit partitions the array into entries that need deduping and those that don't. And then sorts just the part that needs deduping. And when CONFIG_RANDOMIZE_BASE is disabled, the sorting is skipped entirely because PLTs are not allocated for R_AARCH64_CALL26 and R_AARCH64_JUMP26 if it's disabled.
This gives significant reduction in module load time for modules with large number of relocations with no measurable impact on modules with a small number of relocations. In my test setup with CONFIG_RANDOMIZE_BASE enabled, these were the results for a few downstream modules:
Module          Size (MB)
wlan            14
video codec     3.8
drm             1.8
IPA             2.5
audio           1.2
gpu             1.8
Without this patch:
Module          Number of entries sorted    Module load time (ms)
wlan            243739                      283
video codec     74029                       138
drm             53837                       67
IPA             42800                       90
audio           21326                       27
gpu             20967                       32
Total time to load all these modules: 637 ms
With this patch:
Module          Number of entries sorted    Module load time (ms)
wlan            22454                       61
video codec     10150                       47
drm             13014                       40
IPA             8097                        63
audio           4606                        16
gpu             6527                        20
Total time to load all these modules: 247 ms
Time saved during boot for just these 6 modules: 390 ms
Signed-off-by: Saravana Kannan <[email protected]> Acked-by: Will Deacon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
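The shape of the optimization, as a self-contained C sketch: partition the dedupe-able (branch) relocations to the front of the array, then sort only that prefix. Relocation type values and fields are invented for the example:

  #include <stdbool.h>
  #include <stdlib.h>

  struct rel { int type; int sym; long addend; };

  static bool is_branch(const struct rel *r)
  {
          return r->type == 1 /* e.g. CALL26 */ || r->type == 2 /* e.g. JUMP26 */;
  }

  static int cmp_rel(const void *a, const void *b)
  {
          const struct rel *x = a, *y = b;

          if (x->type != y->type)
                  return x->type < y->type ? -1 : 1;
          if (x->sym != y->sym)
                  return x->sym < y->sym ? -1 : 1;
          if (x->addend != y->addend)
                  return x->addend < y->addend ? -1 : 1;
          return 0;
  }

  /* Returns the number of entries that were sorted (the dedupe-able prefix). */
  static size_t partition_and_sort(struct rel *rels, size_t n)
  {
          size_t head = 0;

          for (size_t i = 0; i < n; i++) {
                  if (is_branch(&rels[i])) {
                          struct rel tmp = rels[head];

                          rels[head++] = rels[i];
                          rels[i] = tmp;
                  }
          }
          qsort(rels, head, sizeof(*rels), cmp_rel);      /* O(head log head) */
          return head;
  }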
|
|
Revision tags: v5.8-rc2, v5.8-rc1, v5.7, v5.7-rc7, v5.7-rc6, v5.7-rc5, v5.7-rc4, v5.7-rc3, v5.7-rc2, v5.7-rc1, v5.6, v5.6-rc7, v5.6-rc6, v5.6-rc5, v5.6-rc4, v5.6-rc3, v5.6-rc2, v5.6-rc1, v5.5, v5.5-rc7, v5.5-rc6, v5.5-rc5, v5.5-rc4, v5.5-rc3, v5.5-rc2, v5.5-rc1, v5.4, v5.4-rc8, v5.4-rc7, v5.4-rc6, v5.4-rc5, v5.4-rc4, v5.4-rc3, v5.4-rc2, v5.4-rc1, v5.3, v5.3-rc8, v5.3-rc7, v5.3-rc6, v5.3-rc5, v5.3-rc4, v5.3-rc3, v5.3-rc2, v5.3-rc1, v5.2, v5.2-rc7, v5.2-rc6, v5.2-rc5, v5.2-rc4, v5.2-rc3, v5.2-rc2, v5.2-rc1, v5.1, v5.1-rc7, v5.1-rc6, v5.1-rc5, v5.1-rc4, v5.1-rc3, v5.1-rc2, v5.1-rc1, v5.0, v5.0-rc8, v5.0-rc7, v5.0-rc6 |
|
| #
3b23e499 |
| 08-Feb-2019 |
Torsten Duwe <[email protected]> |
arm64: implement ftrace with regs
This patch implements FTRACE_WITH_REGS for arm64, which allows a traced function's arguments (and some other registers) to be captured into a struct pt_regs, allowing these to be inspected and/or modified. This is a building block for live-patching, where a function's arguments may be forwarded to another function. This is also necessary to enable ftrace and in-kernel pointer authentication at the same time, as it allows the LR value to be captured and adjusted prior to signing.
Using GCC's -fpatchable-function-entry=N option, we can have the compiler insert a configurable number of NOPs between the function entry point and the usual prologue. This also ensures functions are AAPCS compliant (e.g. disabling inter-procedural register allocation).
For example, with -fpatchable-function-entry=2, GCC 8.1.0 compiles the following:
| unsigned long bar(void);
|
| unsigned long foo(void)
| {
|         return bar() + 1;
| }
... to:
| <foo>:
|         nop
|         nop
|         stp     x29, x30, [sp, #-16]!
|         mov     x29, sp
|         bl      0 <bar>
|         add     x0, x0, #0x1
|         ldp     x29, x30, [sp], #16
|         ret
This patch builds the kernel with -fpatchable-function-entry=2, prefixing each function with two NOPs. To trace a function, we replace these NOPs with a sequence that saves the LR into a GPR, then calls an ftrace entry assembly function which saves this and other relevant registers:
| mov x9, x30
| bl <ftrace-entry>
Since patchable functions are AAPCS compliant (and the kernel does not use x18 as a platform register), x9-x18 can be safely clobbered in the patched sequence and the ftrace entry code.
There are now two ftrace entry functions, ftrace_regs_entry (which saves all GPRs), and ftrace_entry (which saves the bare minimum). A PLT is allocated for each within modules.
Signed-off-by: Torsten Duwe <[email protected]> [Mark: rework asm, comments, PLTs, initialization, commit message] Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: Amit Daniel Kachhap <[email protected]> Reviewed-by: Ard Biesheuvel <[email protected]> Reviewed-by: Torsten Duwe <[email protected]> Tested-by: Amit Daniel Kachhap <[email protected]> Tested-by: Torsten Duwe <[email protected]> Cc: AKASHI Takahiro <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Julien Thierry <[email protected]> Cc: Will Deacon <[email protected]>
|
| #
b3e089cd |
| 30-Jul-2019 |
Chuhong Yuan <[email protected]> |
arm64: Replace strncmp with str_has_prefix
In commit b6b2735514bc ("tracing: Use str_has_prefix() instead of using fixed sizes") the newly introduced str_has_prefix() was used to replace error-prone strncmp(str, const, len). Here fix codes with the same pattern.
Signed-off-by: Chuhong Yuan <[email protected]> Signed-off-by: Will Deacon <[email protected]>
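A standalone illustration of the pattern; str_has_prefix() is a kernel helper, so an equivalent is defined locally here to keep the example self-contained:

  #include <stdbool.h>
  #include <string.h>

  /* Local stand-in for the kernel's str_has_prefix(). */
  static bool str_has_prefix_example(const char *str, const char *prefix)
  {
          return strncmp(str, prefix, strlen(prefix)) == 0;
  }

  static bool is_init_name(const char *name)
  {
          /* Before: strncmp(name, ".init", 5) == 0 -- the 5 kept in sync by hand. */
          return str_has_prefix_example(name, ".init");
  }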
|
| #
d2912cb1 |
| 04-Jun-2019 |
Thomas Gleixner <[email protected]> |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s):
this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation
this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation #
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 4122 file(s).
Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Enrico Weigelt <[email protected]> Reviewed-by: Kate Stewart <[email protected]> Reviewed-by: Allison Randal <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
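In practice the replacement is a single machine-readable line at the top of each affected file, for example in its C-source form:

  // SPDX-License-Identifier: GPL-2.0-only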
|
|
Revision tags: v5.0-rc5, v5.0-rc4, v5.0-rc3, v5.0-rc2, v5.0-rc1, v4.20, v4.20-rc7, v4.20-rc6, v4.20-rc5, v4.20-rc4 |
|
| #
bdb85cd1 |
| 22-Nov-2018 |
Ard Biesheuvel <[email protected]> |
arm64/module: switch to ADRP/ADD sequences for PLT entries
Now that we have switched to the small code model entirely, and reduced the extended KASLR range to 4 GB, we can be sure that the targets of relative branches that are out of range are in range for a ADRP/ADD pair, which is one instruction shorter than our current MOVN/MOVK/MOVK sequence, and is more idiomatic and so it is more likely to be implemented efficiently by micro-architectures.
So switch over the ordinary PLT code and the special handling of the Cortex-A53 ADRP errata, as well as the ftrace trampoline handling.
Reviewed-by: Torsten Duwe <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]> [will: Added a couple of comments in the plt equality check] Signed-off-by: Will Deacon <[email protected]>
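A sketch of the resulting PLT slot layout, with illustrative types; the comments show the effective instruction sequence rather than real encodings:

  #include <stdint.h>

  /*
   * Three instruction words per slot, implementing in effect:
   *
   *      adrp x16, <target page>
   *      add  x16, x16, #<offset within page>
   *      br   x16
   *
   * replacing the longer movn/movk/movk immediate-building sequence.
   */
  struct plt_entry_example {
          uint32_t adrp;
          uint32_t add;
          uint32_t br;
  };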
|
|
Revision tags: v4.20-rc3, v4.20-rc2 |
|
| #
c8ebf64e |
| 05-Nov-2018 |
Jessica Yu <[email protected]> |
arm64/module: use plt section indices for relocations
Instead of saving a pointer to the .plt and .init.plt sections to apply plt-based relocations, save and use their section indices instead.
The mod->arch.{core,init}.plt pointers were problematic for livepatch because they pointed within temporary section headers (provided by the module loader via info->sechdrs) that would be freed after module load. Since livepatch modules may need to apply relocations post-module-load (for example, to patch a module that is loaded later), using section indices to offset into the section headers (instead of accessing them through a saved pointer) allows livepatch modules on arm64 to pass in their own copy of the section headers to apply_relocate_add() to apply delayed relocations.
Reviewed-by: Ard Biesheuvel <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Signed-off-by: Jessica Yu <[email protected]> Signed-off-by: Will Deacon <[email protected]>
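A hedged sketch of the idea, with simplified names: record a section index rather than a pointer, and resolve it against whichever copy of the section headers the caller supplies (the loader's at load time, or a livepatch module's own copy later):

  #include <elf.h>
  #include <stdint.h>

  struct mod_plt_sec_example {
          unsigned int plt_shndx;         /* index stays valid after load */
          unsigned int plt_num_entries;
  };

  static void *plt_base(const struct mod_plt_sec_example *plt_sec,
                        const Elf64_Shdr *sechdrs)
  {
          return (void *)(uintptr_t)sechdrs[plt_sec->plt_shndx].sh_addr;
  }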
|
|
Revision tags: v4.20-rc1, v4.19, v4.19-rc8, v4.19-rc7, v4.19-rc6, v4.19-rc5, v4.19-rc4, v4.19-rc3, v4.19-rc2, v4.19-rc1, v4.18, v4.18-rc8, v4.18-rc7, v4.18-rc6, v4.18-rc5, v4.18-rc4, v4.18-rc3, v4.18-rc2, v4.18-rc1, v4.17, v4.17-rc7, v4.17-rc6, v4.17-rc5, v4.17-rc4, v4.17-rc3 |
|
| #
ed231ae3 |
| 24-Apr-2018 |
Kim Phillips <[email protected]> |
arm64/kernel: rename module_emit_adrp_veneer->module_emit_veneer_for_adrp
Commit a257e02579e ("arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419") introduced a function whose name ends with "_veneer".
This clashes with commit bd8b22d2888e ("Kbuild: kallsyms: ignore veneers emitted by the ARM linker"), which removes symbols ending in "_veneer" from kallsyms.
The problem was manifested as 'perf test -vvvvv vmlinux' failed, correctly claiming the symbol 'module_emit_adrp_veneer' was present in vmlinux, but not in kallsyms.
... ERR : 0xffff00000809aa58: module_emit_adrp_veneer not on kallsyms ... test child finished with -1 ---- end ---- vmlinux symtab matches kallsyms: FAILED!
Fix the problem by renaming module_emit_adrp_veneer to module_emit_veneer_for_adrp. Now the test passes.
Fixes: a257e02579e ("arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419") Acked-by: Ard Biesheuvel <[email protected]> Cc: Will Deacon <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Michal Marek <[email protected]> Signed-off-by: Kim Phillips <[email protected]> Signed-off-by: Will Deacon <[email protected]>
|
|
Revision tags: v4.17-rc2, v4.17-rc1, v4.16, v4.16-rc7, v4.16-rc6, v4.16-rc5 |
|
| #
ca79acca |
| 06-Mar-2018 |
Ard Biesheuvel <[email protected]> |
arm64/kernel: enable A53 erratum #843419 handling at runtime
Omit patching of ADRP instruction at module load time if the current CPUs are not susceptible to the erratum.
Signed-off-by: Ard Biesheuvel <[email protected]> [will: Drop duplicate initialisation of .def_scope field] Signed-off-by: Will Deacon <[email protected]>
|
| #
a257e025 |
| 06-Mar-2018 |
Ard Biesheuvel <[email protected]> |
arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419
Working around Cortex-A53 erratum #843419 involves special handling of ADRP instructions that end up in the last two instruction slots of a 4k page, or whose output register gets overwritten without having been read. (Note that the latter instruction sequence is never emitted by a properly functioning compiler, which is why it is disregarded by the handling of the same erratum in the bfd.ld linker which we rely on for the core kernel)
Normally, this gets taken care of by the linker, which can spot such sequences at final link time, and insert a veneer if the ADRP ends up at a vulnerable offset. However, linux kernel modules are partially linked ELF objects, and so there is no 'final link time' other than the runtime loading of the module, at which time all the static relocations are resolved.
For this reason, we have implemented the #843419 workaround for modules by avoiding ADRP instructions altogether, by using the large C model, and by passing -mpc-relative-literal-loads to recent versions of GCC that may emit adrp/ldr pairs to perform literal loads. However, this workaround forces us to keep literal data mixed with the instructions in the executable .text segment, and literal data may inadvertently turn into an exploitable speculative gadget depending on the relative offsets of arbitrary symbols.
So let's reimplement this workaround in a way that allows us to switch back to the small C model, and to drop the -mpc-relative-literal-loads GCC switch, by patching affected ADRP instructions at runtime:
- ADRP instructions that do not appear at 4k relative offset 0xff8 or 0xffc are ignored
- ADRP instructions that are within 1 MB of their target symbol are converted into ADR instructions
- remaining ADRP instructions are redirected via a veneer that performs the load using an unaffected movn/movk sequence.
Signed-off-by: Ard Biesheuvel <[email protected]> [will: tidied up ADRP -> ADR instruction patching.] [will: use ULL suffix for 64-bit immediate] Signed-off-by: Will Deacon <[email protected]>
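The decision logic above, reduced to a standalone sketch; the page offsets and the 1 MB ADR limit come from the commit text, everything else (names, types) is illustrative:

  #include <stdint.h>

  enum adrp_fixup { ADRP_KEEP, ADRP_TO_ADR, ADRP_VIA_VENEER };

  static enum adrp_fixup classify_adrp(uint64_t place, int64_t target_offset)
  {
          /* Only the last two instruction slots of a 4K page are affected. */
          if ((place & 0xfff) != 0xff8 && (place & 0xfff) != 0xffc)
                  return ADRP_KEEP;

          /* Within +/- 1 MB the ADRP can be rewritten as a plain ADR. */
          if (target_offset >= -(1 << 20) && target_offset < (1 << 20))
                  return ADRP_TO_ADR;

          /* Otherwise branch through a veneer using an unaffected sequence. */
          return ADRP_VIA_VENEER;
  }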
|
| #
5e8307b9 |
| 06-Mar-2018 |
Ard Biesheuvel <[email protected]> |
arm64: module: don't BUG when exceeding preallocated PLT count
When PLTs are emitted at relocation time, we really should not exceed the number that we counted when parsing the relocation tables, and so currently, we BUG() on this condition. However, even though this is a clear bug in this particular piece of code, we can easily recover by failing to load the module.
So instead, return 0 from module_emit_plt_entry() if this condition occurs, which is not a valid kernel address, and can hence serve as a flag value that makes the relocation routine bail out.
Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Will Deacon <[email protected]>
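A minimal sketch of the error-handling shape described above, with invented names and a fake address computation; the point is only that 0 doubles as an "out of slots" flag which the relocation path turns into a load failure:

  #include <errno.h>
  #include <stdint.h>

  static uint64_t emit_plt_entry_example(unsigned int *used, unsigned int max)
  {
          uint64_t addr;

          if (*used >= max)
                  return 0;       /* not a valid kernel address: flag value */

          addr = 0xffff000000000000ULL + (uint64_t)*used * 16;    /* fake slot address */
          (*used)++;
          return addr;
  }

  static int apply_branch_reloc_example(unsigned int *used, unsigned int max)
  {
          uint64_t val = emit_plt_entry_example(used, max);

          if (!val)
                  return -ENOEXEC;        /* fail the module load instead of BUG() */
          /* ... otherwise patch the branch to target 'val' ... */
          return 0;
  }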
|
|
Revision tags: v4.16-rc4, v4.16-rc3, v4.16-rc2, v4.16-rc1, v4.15, v4.15-rc9, v4.15-rc8, v4.15-rc7, v4.15-rc6, v4.15-rc5, v4.15-rc4, v4.15-rc3, v4.15-rc2, v4.15-rc1 |
|
| #
be0f272b |
| 20-Nov-2017 |
Ard Biesheuvel <[email protected]> |
arm64: ftrace: emit ftrace-mod.o contents through code
When building the arm64 kernel with both CONFIG_ARM64_MODULE_PLTS and CONFIG_DYNAMIC_FTRACE enabled, the ftrace-mod.o object file is built with the kernel and contains a trampoline that is linked into each module, so that modules can be loaded far away from the kernel and still reach the ftrace entry point in the core kernel with an ordinary relative branch, as is emitted by the compiler instrumentation code dynamic ftrace relies on.
In order to be able to build out of tree modules, this object file needs to be included into the linux-headers or linux-devel packages, which is undesirable, as it makes arm64 a special case (although a precedent does exist for 32-bit PPC).
Given that the trampoline essentially consists of a PLT entry, let's not bother with a source or object file for it, and simply patch it in whenever the trampoline is being populated, using the existing PLT support routines.
Cc: <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Will Deacon <[email protected]>
|
| #
7e8b9c1d |
| 20-Nov-2017 |
Ard Biesheuvel <[email protected]> |
arm64: module-plts: factor out PLT generation code for ftrace
To allow the ftrace trampoline code to reuse the PLT entry routines, factor it out and move it into asm/module.h.
Cc: <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Will Deacon <[email protected]>
|
|
Revision tags: v4.14, v4.14-rc8, v4.14-rc7, v4.14-rc6, v4.14-rc5, v4.14-rc4, v4.14-rc3, v4.14-rc2, v4.14-rc1, v4.13, v4.13-rc7, v4.13-rc6, v4.13-rc5, v4.13-rc4, v4.13-rc3, v4.13-rc2, v4.13-rc1, v4.12, v4.12-rc7, v4.12-rc6, v4.12-rc5, v4.12-rc4, v4.12-rc3, v4.12-rc2, v4.12-rc1, v4.11, v4.11-rc8, v4.11-rc7, v4.11-rc6, v4.11-rc5, v4.11-rc4, v4.11-rc3, v4.11-rc2, v4.11-rc1 |
|
| #
24af6c4e |
| 21-Feb-2017 |
Ard Biesheuvel <[email protected]> |
arm64: module: split core and init PLT sections
The arm64 module PLT code allocates all PLT entries in a single core section, since the overhead of having a separate init PLT section is not justified by the small number of PLT entries usually required for init code.
However, the core and init module regions are allocated independently, and there is a corner case where the core region may be allocated from the VMALLOC region if the dedicated module region is exhausted, but the init region, being much smaller, can still be allocated from the module region. This leads to relocation failures if the distance between those regions exceeds 128 MB. (In fact, this corner case is highly unlikely to occur on arm64, but the issue has been observed on ARM, whose module region is much smaller).
So split the core and init PLT regions, and name the latter ".init.plt" so it gets allocated along with (and sufficiently close to) the .init sections that it serves. Also, given that init PLT entries may need to be emitted for branches that target the core module, modify the logic that disregards defined symbols to only disregard symbols that are defined in the same section as the relocated branch instruction.
Since there may now be two PLT entries associated with each entry in the symbol table, we can no longer hijack the symbol::st_size fields to record the addresses of PLT entries as we emit them for zero-addend relocations. So instead, perform an explicit comparison to check for duplicate entries.
Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
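As a sketch with simplified types and invented names, the counting now keeps two pools and picks one based on the section holding the relocated branch, while only disregarding targets defined in that same section:

  #include <stdbool.h>

  struct plt_counts_example {
          unsigned int core_plts;
          unsigned int init_plts;
  };

  static void count_branch_example(struct plt_counts_example *c,
                                   bool branch_is_in_init_section,
                                   bool target_in_same_section)
  {
          /*
           * Only branches that leave their own section can need a PLT; a
           * symbol defined in the same section is always in direct range.
           */
          if (target_in_same_section)
                  return;

          if (branch_is_in_init_section)
                  c->init_plts++;
          else
                  c->core_plts++;
  }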
|
|
Revision tags: v4.10, v4.10-rc8, v4.10-rc7, v4.10-rc6, v4.10-rc5, v4.10-rc4, v4.10-rc3, v4.10-rc2, v4.10-rc1, v4.9, v4.9-rc8, v4.9-rc7, v4.9-rc6, v4.9-rc5, v4.9-rc4, v4.9-rc3, v4.9-rc2, v4.9-rc1, v4.8, v4.8-rc8, v4.8-rc7, v4.8-rc6, v4.8-rc5, v4.8-rc4, v4.8-rc3, v4.8-rc2, v4.8-rc1, v4.7, v4.7-rc7, v4.7-rc6, v4.7-rc5, v4.7-rc4, v4.7-rc3, v4.7-rc2, v4.7-rc1, v4.6, v4.6-rc7, v4.6-rc6, v4.6-rc5, v4.6-rc4, v4.6-rc3, v4.6-rc2, v4.6-rc1, v4.5, v4.5-rc7, v4.5-rc6, v4.5-rc5, v4.5-rc4, v4.5-rc3, v4.5-rc2, v4.5-rc1, v4.4, v4.4-rc8, v4.4-rc7, v4.4-rc6, v4.4-rc5, v4.4-rc4, v4.4-rc3 |
|
| #
fd045f6c |
| 24-Nov-2015 |
Ard Biesheuvel <[email protected]> |
arm64: add support for module PLTs
This adds support for emitting PLTs at module load time for relative branches that are out of range. This is a prerequisite for KASLR, which may place the kernel and the modules anywhere in the vmalloc area, making it more likely that branch target offsets exceed the maximum range of +/- 128 MB.
In this version, I removed the distinction between relocations against .init executable sections and ordinary executable sections. The reason is that it is hardly worth the trouble, given that .init.text usually does not contain that many far branches, and this version now only reserves PLT entry space for jump and call relocations against undefined symbols (since symbols defined in the same module can be assumed to be within +/- 128 MB)
For example, the mac80211.ko module (which is fairly sizable at ~400 KB) built with -mcmodel=large gives the following relocation counts:
                  relocs    branches   unique     !local
  .text           3925      3347       518        219
  .init.text      11        8          7          1
  .exit.text      4         4          4          1
  .text.unlikely  81        67         36         17
('unique' means branches to unique type/symbol/addend combos, of which !local is the subset referring to undefined symbols)
IOW, we are only emitting a single PLT entry for the .init sections, and we are better off just adding it to the core PLT section instead.
Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
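For context on the "+/- 128 MB" figure: B and BL encode a signed 26-bit word offset, so the direct range is 2^25 words of 4 bytes in either direction. A standalone range check under that assumption, with illustrative names:

  #include <stdbool.h>
  #include <stdint.h>

  #define BRANCH_RANGE_EXAMPLE    (128 * 1024 * 1024LL)   /* 2^25 words * 4 bytes */

  static bool branch_in_range(uint64_t place, uint64_t target)
  {
          int64_t offset = (int64_t)(target - place);

          return offset >= -BRANCH_RANGE_EXAMPLE && offset < BRANCH_RANGE_EXAMPLE;
  }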
|