|
Revision tags: v6.15, v6.15-rc7, v6.15-rc6, v6.15-rc5, v6.15-rc4, v6.15-rc3, v6.15-rc2, v6.15-rc1, v6.14, v6.14-rc7, v6.14-rc6, v6.14-rc5, v6.14-rc4, v6.14-rc3, v6.14-rc2, v6.14-rc1, v6.13, v6.13-rc7, v6.13-rc6, v6.13-rc5, v6.13-rc4, v6.13-rc3, v6.13-rc2, v6.13-rc1, v6.12, v6.12-rc7, v6.12-rc6, v6.12-rc5, v6.12-rc4, v6.12-rc3, v6.12-rc2, v6.12-rc1, v6.11, v6.11-rc7, v6.11-rc6, v6.11-rc5, v6.11-rc4, v6.11-rc3, v6.11-rc2, v6.11-rc1, v6.10, v6.10-rc7, v6.10-rc6 |
|
| #
7bf5f056 |
| 28-Jun-2024 |
Michael Ellerman <[email protected]> |
powerpc: Replace CONFIG_4xx with CONFIG_44x
Replace 4xx usage with 44x, and replace 4xx_SOC with 44x.
Also, as pointed out by Christophe, if 44x || BOOKE can be simplified to just test BOOKE, becau
powerpc: Replace CONFIG_4xx with CONFIG_44x
Replace 4xx usage with 44x, and replace 4xx_SOC with 44x.
Also, as pointed out by Christophe, if 44x || BOOKE can be simplified to just test BOOKE, because 44x always selects BOOKE.
Retain the CONFIG_4xx symbol, as there are drivers that use it to mean 4xx || 44x, those will need updating before CONFIG_4xx can be removed.
Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
| #
002b27a5 |
| 28-Jun-2024 |
Michael Ellerman <[email protected]> |
powerpc/4xx: Remove CONFIG_BOOKE_OR_40x
Now that 40x is gone, replace CONFIG_BOOKE_OR_40x by CONFIG_BOOKE.
Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/202406281212
powerpc/4xx: Remove CONFIG_BOOKE_OR_40x
Now that 40x is gone, replace CONFIG_BOOKE_OR_40x by CONFIG_BOOKE.
Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
| #
732b32da |
| 28-Jun-2024 |
Christophe Leroy <[email protected]> |
powerpc: Remove core support for 40x
Now that 40x platforms have gone, remove support for 40x in the core of powerpc arch.
Signed-off-by: Christophe Leroy <[email protected]> Signed-off-b
powerpc: Remove core support for 40x
Now that 40x platforms have gone, remove support for 40x in the core of powerpc arch.
Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
|
Revision tags: v6.10-rc5, v6.10-rc4, v6.10-rc3, v6.10-rc2, v6.10-rc1, v6.9, v6.9-rc7, v6.9-rc6, v6.9-rc5, v6.9-rc4, v6.9-rc3, v6.9-rc2, v6.9-rc1, v6.8, v6.8-rc7, v6.8-rc6, v6.8-rc5, v6.8-rc4, v6.8-rc3, v6.8-rc2, v6.8-rc1, v6.7, v6.7-rc8, v6.7-rc7, v6.7-rc6, v6.7-rc5, v6.7-rc4, v6.7-rc3, v6.7-rc2, v6.7-rc1, v6.6, v6.6-rc7, v6.6-rc6 |
|
| #
f0eee815 |
| 10-Oct-2023 |
Michael Ellerman <[email protected]> |
powerpc/47x: Fix 47x syscall return crash
Eddie reported that newer kernels were crashing during boot on his 476 FSP2 system:
kernel tried to execute user page (b7ee2000) - exploit attempt? (uid:
powerpc/47x: Fix 47x syscall return crash
Eddie reported that newer kernels were crashing during boot on his 476 FSP2 system:
kernel tried to execute user page (b7ee2000) - exploit attempt? (uid: 0) BUG: Unable to handle kernel instruction fetch Faulting instruction address: 0xb7ee2000 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=4K FSP-2 Modules linked in: CPU: 0 PID: 61 Comm: mount Not tainted 6.1.55-d23900f.ppcnf-fsp2 #1 Hardware name: ibm,fsp2 476fpe 0x7ff520c0 FSP-2 NIP: b7ee2000 LR: 8c008000 CTR: 00000000 REGS: bffebd83 TRAP: 0400 Not tainted (6.1.55-d23900f.ppcnf-fs p2) MSR: 00000030 <IR,DR> CR: 00001000 XER: 20000000 GPR00: c00110ac bffebe63 bffebe7e bffebe88 8c008000 00001000 00000d12 b7ee2000 GPR08: 00000033 00000000 00000000 c139df10 48224824 1016c314 10160000 00000000 GPR16: 10160000 10160000 00000008 00000000 10160000 00000000 10160000 1017f5b0 GPR24: 1017fa50 1017f4f0 1017fa50 1017f740 1017f630 00000000 00000000 1017f4f0 NIP [b7ee2000] 0xb7ee2000 LR [8c008000] 0x8c008000 Call Trace: Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX ---[ end trace 0000000000000000 ]---
The problem is in ret_from_syscall where the check for icache_44x_need_flush is done. When the flush is needed the code jumps out-of-line to do the flush, and then intends to jump back to continue the syscall return.
However the branch back to label 1b doesn't return to the correct location, instead branching back just prior to the return to userspace, causing bogus register values to be used by the rfi.
The breakage was introduced by commit 6f76a01173cc ("powerpc/syscall: implement system call entry/exit logic in C for PPC32") which inadvertently removed the "1" label and reused it elsewhere.
Fix it by adding named local labels in the correct locations. Note that the return label needs to be outside the ifdef so that CONFIG_PPC_47x=n compiles.
Fixes: 6f76a01173cc ("powerpc/syscall: implement system call entry/exit logic in C for PPC32") Cc: [email protected] # v5.12+ Reported-by: Eddie James <[email protected]> Tested-by: Eddie James <[email protected]> Link: https://lore.kernel.org/linuxppc-dev/[email protected]/ Signed-off-by: Michael Ellerman <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
|
Revision tags: v6.6-rc5, v6.6-rc4, v6.6-rc3, v6.6-rc2, v6.6-rc1, v6.5, v6.5-rc7, v6.5-rc6, v6.5-rc5 |
|
| #
3eb3f168 |
| 06-Aug-2023 |
Masahiro Yamada <[email protected]> |
powerpc: remove unneeded #include <asm/export.h>
There is no EXPORT_SYMBOL line there, hence #include <asm/export.h> is unneeded.
Signed-off-by: Masahiro Yamada <[email protected]> Signed-off-by
powerpc: remove unneeded #include <asm/export.h>
There is no EXPORT_SYMBOL line there, hence #include <asm/export.h> is unneeded.
Signed-off-by: Masahiro Yamada <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
|
Revision tags: v6.5-rc4, v6.5-rc3, v6.5-rc2, v6.5-rc1, v6.4, v6.4-rc7, v6.4-rc6 |
|
| #
afc63868 |
| 06-Jun-2023 |
Nicholas Piggin <[email protected]> |
powerpc: merge 32-bit and 64-bit _switch implementation
The _switch stack frame setup are substantially the same, so are the comments. The difference in how the stack and current are switched, and o
powerpc: merge 32-bit and 64-bit _switch implementation
The _switch stack frame setup are substantially the same, so are the comments. The difference in how the stack and current are switched, and other hardware and software housekeeping is done is moved into macros.
Generated code should be unchanged.
Signed-off-by: Nicholas Piggin <[email protected]> [mpe: Tweak include orer to fix compile errors on some configs] Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
| #
6958ad05 |
| 06-Jun-2023 |
Nicholas Piggin <[email protected]> |
powerpc/32: Rearrange _switch to prepare for 32/64 merge
Change the order of some operations and change some register numbers in preparation to merge 32-bit and 64-bit switch.
Signed-off-by: Nichol
powerpc/32: Rearrange _switch to prepare for 32/64 merge
Change the order of some operations and change some register numbers in preparation to merge 32-bit and 64-bit switch.
Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
| #
fc8562c9 |
| 06-Jun-2023 |
Nicholas Piggin <[email protected]> |
powerpc/32: Remove sync from _switch
64-bit has removed the sync from _switch since commit 9145effd626d1 ("powerpc/64: Drop explicit hwsync in context switch"). The same logic there should apply to
powerpc/32: Remove sync from _switch
64-bit has removed the sync from _switch since commit 9145effd626d1 ("powerpc/64: Drop explicit hwsync in context switch"). The same logic there should apply to 32-bit. Remove the sync and replace with a placeholder comment (32 and 64 will be merged with a later change).
Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
|
Revision tags: v6.4-rc5, v6.4-rc4, v6.4-rc3, v6.4-rc2, v6.4-rc1, v6.3, v6.3-rc7, v6.3-rc6, v6.3-rc5, v6.3-rc4 |
|
| #
b504b6aa |
| 25-Mar-2023 |
Nicholas Piggin <[email protected]> |
powerpc: differentiate kthread from user kernel thread start
Kernel created user threads start similarly to kernel threads in that they call a kernel function after first returning from _switch, so
powerpc: differentiate kthread from user kernel thread start
Kernel created user threads start similarly to kernel threads in that they call a kernel function after first returning from _switch, so they share ret_from_kernel_thread for this. Kernel threads never return from that function though, whereas user threads often do (although some don't, e.g., IO threads).
Split these startup functions in two, and catch kernel threads that improperly return from their function. This is intended to make the complicated code a little bit easier to understand.
Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
| #
af5ca9d5 |
| 25-Mar-2023 |
Nicholas Piggin <[email protected]> |
powerpc: use switch frame for ret_from_kernel_thread parameters
The kernel thread path in copy_thread creates a user interrupt frame on stack and stores the function and arg parameters there, and re
powerpc: use switch frame for ret_from_kernel_thread parameters
The kernel thread path in copy_thread creates a user interrupt frame on stack and stores the function and arg parameters there, and ret_from_kernel_thread loads them. This is a slightly confusing way to overload that frame. Non-volatile registers are loaded from the switch frame, so the parameters can be stored there. The user interrupt frame is now only used by user threads when they return to user.
Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
| #
959791e4 |
| 25-Mar-2023 |
Nicholas Piggin <[email protected]> |
powerpc: copy_thread make ret_from_fork register setup consistent
The ret_from_fork code for 64e and 32-bit set r3 for syscall_exit_prepare the same way that 64s does, so there should be no need to
powerpc: copy_thread make ret_from_fork register setup consistent
The ret_from_fork code for 64e and 32-bit set r3 for syscall_exit_prepare the same way that 64s does, so there should be no need to special-case them in copy_thread.
Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected]
show more ...
|
|
Revision tags: v6.3-rc3, v6.3-rc2, v6.3-rc1, v6.2, v6.2-rc8, v6.2-rc7, v6.2-rc6, v6.2-rc5, v6.2-rc4, v6.2-rc3, v6.2-rc2, v6.2-rc1, v6.1, v6.1-rc8, v6.1-rc7 |
|
| #
6f291a03 |
| 27-Nov-2022 |
Nicholas Piggin <[email protected]> |
powerpc: add a define for the switch frame size and regs offset
This is open-coded in process.c, ppc32 uses a different define with the same value, and the C definition is name differently which mak
powerpc: add a define for the switch frame size and regs offset
This is open-coded in process.c, ppc32 uses a different define with the same value, and the C definition is name differently which makes it an extra indirection to grep for.
Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
d2e8ff9f |
| 27-Nov-2022 |
Nicholas Piggin <[email protected]> |
powerpc: add a definition for the marker offset within the interrupt frame
Define a constant rather than open-code the offset for the "regs" marker.
Signed-off-by: Nicholas Piggin <[email protected]
powerpc: add a definition for the marker offset within the interrupt frame
Define a constant rather than open-code the offset for the "regs" marker.
Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
c03be0a3 |
| 27-Nov-2022 |
Nicholas Piggin <[email protected]> |
powerpc: add definition for pt_regs offset within an interrupt frame
This is a common offset that currently uses the overloaded STACK_FRAME_OVERHEAD constant. It's easier to read and more flexible t
powerpc: add definition for pt_regs offset within an interrupt frame
This is a common offset that currently uses the overloaded STACK_FRAME_OVERHEAD constant. It's easier to read and more flexible to use a specific regs offset for this.
Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.1-rc6 |
|
| #
2da37761 |
| 14-Nov-2022 |
Christophe Leroy <[email protected]> |
powerpc/32: Fix objtool unannotated intra-function call warnings
Fix several annotations in assembly files on PPC32.
[Sathvika Vasireddy: Changed subject line and removed Kconfig change to enable
powerpc/32: Fix objtool unannotated intra-function call warnings
Fix several annotations in assembly files on PPC32.
[Sathvika Vasireddy: Changed subject line and removed Kconfig change to enable objtool, as it is a part of "objtool/powerpc: Enable objtool to be built on ppc" patch in this series.]
Tested-by: Naveen N. Rao <[email protected]> Reviewed-by: Naveen N. Rao <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Sathvika Vasireddy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.1-rc5, v6.1-rc4, v6.1-rc3, v6.1-rc2, v6.1-rc1, v6.0 |
|
| #
17773afd |
| 26-Sep-2022 |
Nicholas Piggin <[email protected]> |
powerpc/64: use 32-bit immediate for STACK_FRAME_REGS_MARKER
Using a 32-bit constant for this marker allows it to be loaded with two ALU instructions, like 32-bit. This avoids a TOC entry and a TOC
powerpc/64: use 32-bit immediate for STACK_FRAME_REGS_MARKER
Using a 32-bit constant for this marker allows it to be loaded with two ALU instructions, like 32-bit. This avoids a TOC entry and a TOC load that depends on the r2 value that has just been loaded from the PACA.
This changes the value for 32-bit as well, so both have the same value in the low 4 bytes and 64-bit has 0 in the top bytes.
Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v6.0-rc7 |
|
| #
f8971c62 |
| 21-Sep-2022 |
Rohan McLure <[email protected]> |
powerpc: Change system_call_exception calling convention
Change system_call_exception arguments to pass a pointer to a stack frame container caller state, as well as the original r0, which determine
powerpc: Change system_call_exception calling convention
Change system_call_exception arguments to pass a pointer to a stack frame container caller state, as well as the original r0, which determines the number of the syscall. This has been observed to yield improved performance to passing them by registers, circumventing the need to allocate a stack frame.
Signed-off-by: Rohan McLure <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> [mpe: Retain clearing of high bits of args for compat tasks] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
15ba7450 |
| 21-Sep-2022 |
Rohan McLure <[email protected]> |
powerpc/32: Clarify interrupt restores with REST_GPR macro in entry_32.S
Restoring the register state of the interrupted thread involves issuing a large number of predictable loads to the kernel sta
powerpc/32: Clarify interrupt restores with REST_GPR macro in entry_32.S
Restoring the register state of the interrupted thread involves issuing a large number of predictable loads to the kernel stack frame. Issue the REST_GPR{,S} macros to clearly signal when this is happening, and bunch together restores at the end of the interrupt handler where the saved value is not consumed earlier in the handler code.
Signed-off-by: Rohan McLure <[email protected]> Reported-by: Christophe Leroy <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
2c27d4a4 |
| 21-Sep-2022 |
Rohan McLure <[email protected]> |
powerpc: Save caller r3 prior to system_call_exception
This reverts commit 8875f47b7681 ("powerpc/syscall: Save r3 in regs->orig_r3 ").
Save caller's original r3 state to the kernel stackframe befo
powerpc: Save caller r3 prior to system_call_exception
This reverts commit 8875f47b7681 ("powerpc/syscall: Save r3 in regs->orig_r3 ").
Save caller's original r3 state to the kernel stackframe before entering system_call_exception. This allows for user registers to be cleared by the time system_call_exception is entered, reducing the influence of user registers on speculation within the kernel.
Prior to this commit, orig_r3 was saved at the beginning of system_call_exception. Instead, save orig_r3 while the user value is still live in r3.
Also replicate this early save in 32-bit. A similar save was removed in commit 6f76a01173cc ("powerpc/syscall: implement system call entry/exit logic in C for PPC32") when 32-bit adopted system_call_exception. Revert its removal of orig_r3 saves.
Signed-off-by: Rohan McLure <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
| #
aa5f59df |
| 19-Sep-2022 |
Christophe Leroy <[email protected]> |
powerpc: Remove CONFIG_PPC_BOOK3E_MMU
CONFIG_PPC_BOOK3E_MMU is redundant with CONFIG_PPC_E500.
Remove it.
Also rename mmu-book3e.h to mmu-e500.h
Signed-off-by: Christophe Leroy <christophe.leroy@
powerpc: Remove CONFIG_PPC_BOOK3E_MMU
CONFIG_PPC_BOOK3E_MMU is redundant with CONFIG_PPC_E500.
Remove it.
Also rename mmu-book3e.h to mmu-e500.h
Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/c5549cd59a131204ff94ab909cad2e2dad4ddf2f.1663606876.git.christophe.leroy@csgroup.eu
show more ...
|
| #
688de017 |
| 19-Sep-2022 |
Christophe Leroy <[email protected]> |
powerpc: Change CONFIG_E500 to CONFIG_PPC_E500
It will be used outside arch/powerpc, make it clear its a powerpc configuration item.
And we already have CONFIG_PPC_E500MC, so that will make it more
powerpc: Change CONFIG_E500 to CONFIG_PPC_E500
It will be used outside arch/powerpc, make it clear its a powerpc configuration item.
And we already have CONFIG_PPC_E500MC, so that will make it more consistent.
Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/e63b22083c11c4300f4a82d3123a46e5fdd54fa6.1663606876.git.christophe.leroy@csgroup.eu
show more ...
|
|
Revision tags: v6.0-rc6, v6.0-rc5, v6.0-rc4, v6.0-rc3, v6.0-rc2, v6.0-rc1, v5.19, v5.19-rc8, v5.19-rc7, v5.19-rc6, v5.19-rc5, v5.19-rc4, v5.19-rc3, v5.19-rc2, v5.19-rc1, v5.18, v5.18-rc7, v5.18-rc6, v5.18-rc5, v5.18-rc4, v5.18-rc3, v5.18-rc2, v5.18-rc1, v5.17, v5.17-rc8 |
|
| #
838ee286 |
| 08-Mar-2022 |
Nicholas Piggin <[email protected]> |
powerpc/rtas: Move rtas entry assembly into its own file
This makes working on the code a bit easier.
Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <mpe@ellerma
powerpc/rtas: Move rtas entry assembly into its own file
This makes working on the code a bit easier.
Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
show more ...
|
|
Revision tags: v5.17-rc7, v5.17-rc6, v5.17-rc5, v5.17-rc4, v5.17-rc3, v5.17-rc2, v5.17-rc1, v5.16, v5.16-rc8, v5.16-rc7, v5.16-rc6, v5.16-rc5, v5.16-rc4, v5.16-rc3, v5.16-rc2, v5.16-rc1, v5.15, v5.15-rc7 |
|
| #
047a6fd4 |
| 19-Oct-2021 |
Christophe Leroy <[email protected]> |
powerpc/config: Add CONFIG_BOOKE_OR_40x
We have many functionnalities common to 40x and BOOKE, it leads to many places with #if defined(CONFIG_BOOKE) || defined(CONFIG_40x).
We are going to add a f
powerpc/config: Add CONFIG_BOOKE_OR_40x
We have many functionnalities common to 40x and BOOKE, it leads to many places with #if defined(CONFIG_BOOKE) || defined(CONFIG_40x).
We are going to add a few more with KUAP for booke/40x, so create a new symbol which is defined when either BOOKE or 40x is defined.
Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/9a3dbd60924cb25c9f944d3d8205ac5a0d15e229.1634627931.git.christophe.leroy@csgroup.eu
show more ...
|
| #
70428da9 |
| 19-Oct-2021 |
Christophe Leroy <[email protected]> |
powerpc/32s: Save content of sr0 to avoid 'mfsr'
Calling 'mfsr' to get the content of segment registers is heavy, in addition it requires clearing of the 'reserved' bits.
In order to avoid this ope
powerpc/32s: Save content of sr0 to avoid 'mfsr'
Calling 'mfsr' to get the content of segment registers is heavy, in addition it requires clearing of the 'reserved' bits.
In order to avoid this operation, save it in mm context and in thread struct.
The saved sr0 is the one used by kernel, this means that on locking entry it can be used as is.
For unlocking, the only thing to do is to clear SR_NX.
This improves null_syscall selftest by 12 cycles, ie 4%.
Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/b02baf2ed8f09bad910dfaeeb7353b2ae6830525.1634627931.git.christophe.leroy@csgroup.eu
show more ...
|
| #
526d4a4c |
| 19-Oct-2021 |
Christophe Leroy <[email protected]> |
powerpc/32s: Do kuep_lock() and kuep_unlock() in assembly
When interrupt and syscall entries where converted to C, KUEP locking and unlocking was also converted. It improved performance by unrolling
powerpc/32s: Do kuep_lock() and kuep_unlock() in assembly
When interrupt and syscall entries where converted to C, KUEP locking and unlocking was also converted. It improved performance by unrolling the loop, and allowed easily implementing boot time deactivation of KUEP.
However, null_syscall selftest shows that KUEP is still heavy (361 cycles with KUEP, 212 cycles without).
A way to improve more is to group 'mtsr's together, instead of repeating 'addi' + 'mtsr' several times.
In order to do that, more registers need to be available. In C, GCC will always be able to provide the requested number of registers, but at the cost of saving some data on the stack, which is counter performant here.
So let's do it in assembly, when we have full control of which register can be used. It also has the advantage of locking earlier and unlocking later and it helps GCC generating less tricky code. The only drawback is to make boot time deactivation less straight forward and require 'hand' instruction patching.
Group 'mtsr's by 4.
With this change, null_syscall selftest reports 336 cycles. Without the change it was 361 cycles, that's a 7% reduction.
Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/115cb279e9b9948dfd93a065e047081c59e3a2a6.1634627931.git.christophe.leroy@csgroup.eu
show more ...
|