|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6 |
|
| #
c42a2255 |
| 13-Jun-2022 |
zhongyunde <[email protected]> |
[MachineScheduler] Order more stores by ascending address
Following D125377, we order STP Qs by ascending address. On some targets, paired 128-bit loads and stores are slow, so the STP will be split into STRQ and STUR; these stores should also be ordered. Also add the subtarget feature ascend-store-address to control the aggressive ordering.
Reviewed By: dmgreen, fhahn
Differential Revision: https://reviews.llvm.org/D126700
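As an illustrative sketch (registers and offsets below are hypothetical, not taken from the patch): when 128-bit pairing is slow and an STP is split, the feature orders the resulting stores by ascending address:

```asm
// Without the feature, the split stores may be emitted in either order:
str q1, [x0, #16]
str q0, [x0]

// With +ascend-store-address, the scheduler prefers ascending addresses:
str q0, [x0]
str q1, [x0, #16]
```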
|
|
Revision tags: llvmorg-14.0.5 |
|
| #
ad73ce31 |
| 26-May-2022 |
Zongwei Lan <[email protected]> |
[Target] use getSubtarget<> instead of static_cast<>(getSubtarget())
Differential Revision: https://reviews.llvm.org/D125391
|
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2 |
|
| #
e0ff354b |
| 18-Apr-2022 |
Momchil Velikov <[email protected]> |
[AArch64] Async unwind - Adjust unwind info in AArch64LoadStoreOptimizer
[Re-commit after fixing a dereference of "end" iterator]
The AArch64LoadStoreOptimizer pass may merge a register increment/decrement with a following memory operation. In doing so, it may break CFI by moving a stack pointer adjustment past the CFI instruction that described *that* adjustment.
This patch fixes this issue by moving said CFI instruction after the merged instruction, where the SP increment/decrement actually takes place.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D114547
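A hypothetical prologue sketch (registers and offsets are illustrative) of what the fix does:

```asm
// Before the pass runs:
sub  sp, sp, #16
.cfi_def_cfa_offset 16      // describes the SP decrement above
stp  x19, x20, [sp]

// The pass merges the SP decrement into a pre-indexed STP; the fix moves
// the CFI instruction after the merged instruction, where the SP
// adjustment now actually takes place:
stp  x19, x20, [sp, #-16]!
.cfi_def_cfa_offset 16
```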
|
| #
62d4686b |
| 14-Apr-2022 |
Momchil Velikov <[email protected]> |
Revert "[AArch64] Async unwind - Adjust unwind info in AArch64LoadStoreOptimizer"
This reverts commit ecbf32dd88fc91b4fe709dc14bb3493dda6e8854.
It's possible this patch is the reason for an assertion failure `!NodePtr->isKnownSentinel()` in `AArch64LoadStoreOpt::mergeUpdateInsn` (https://lab.llvm.org/buildbot/#/builders/185/builds/1555); reverting while I investigate.
|
| #
ecbf32dd |
| 13-Apr-2022 |
Momchil Velikov <[email protected]> |
[AArch64] Async unwind - Adjust unwind info in AArch64LoadStoreOptimizer
The AArch64LoadStoreOptimizer pass may merge a register increment/decrement with a following memory operation. In doing so, it may break CFI by moving a stack pointer adjustment past the CFI instruction that described *that* adjustment.
This patch fixes this issue by moving said CFI instruction after the merged instruction, where the SP increment/decrement actually takes place.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D114547
|
|
Revision tags: llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
d2c8aa0b |
| 01-Mar-2022 |
Florian Hahn <[email protected]> |
[AArch64] Pass Reg instead of MI to tryToFindRenameRegister (NFC).
FirstMI is only used to get the load/store operand and the machine function. Pass the MF and register explicitly, so the helper can be used to find rename registers for other instructions in the future.
|
| #
45c969de |
| 01-Mar-2022 |
Florian Hahn <[email protected]> |
[AArch64] Remove unused argument from tryToFindRegisterToRename (NFC).
The MI argument is not used by the function. Remove it.
|
| #
1d74b531 |
| 10-Feb-2022 |
Huihui Zhang <[email protected]> |
[AArch64][LoadStoreOptimizer] Ignore undef registers when checking rename register used between paired instructions.
The contents of undef registers are not used in meaningful ways; when checking whether a rename register is used between paired instructions, we should ignore undef registers.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D119305
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3 |
|
| #
d6b07348 |
| 19-Jan-2022 |
Jim Lin <[email protected]> |
[NFC] Use Register instead of unsigned
|
|
Revision tags: llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
| #
3d41ef68 |
| 20-Aug-2021 |
Tim Northover <[email protected]> |
AArch64: don't form indexed paired ops if base reg overlaps operands.
The registers involved might not be identical, but can still overlap (e.g. "str w0, [x0, #4]!").
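A sketch of the overlap case (hypothetical sequence):

```asm
// Candidate input:
str w0, [x0]        // stores the low half of x0
add x0, x0, #4      // increments the base

// Must not be combined into a post-indexed form such as:
//   str w0, [x0], #4
// because the transfer register (w0) overlaps the write-back base (x0),
// which the architecture treats as unpredictable.
```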
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3 |
|
| #
1cb7849a |
| 23-Jun-2021 |
Martin Storsjö <[email protected]> |
Revert "[AArch64LoadStoreOptimizer] Recommit: Generate more STPs by renaming registers earlier"
This reverts commit ea011ec5ed53599305de62ca5fcfd31f4b3448c3.
This still causes some miscompiles, I'll follow up in the phabricator review with a sample of that issue (which is part of the sample of the previous issue).
|
|
Revision tags: llvmorg-12.0.1-rc2 |
|
| #
ea011ec5 |
| 14-Jun-2021 |
Meera Nakrani <[email protected]> |
[AArch64LoadStoreOptimizer] Recommit: Generate more STPs by renaming registers earlier
This is a recommit that fixes unwanted STP generation by checking that the base register has not been modified or used elsewhere.
Our initial motivating case was memcpys with alignments > 16. The loads/stores to which small memcpys expand are kept together in several places, so that we get a sequence like this for a 64-bit copy:

    LD w0
    LD w1
    ST w0
    ST w1

The load/store optimiser can generate an LDP/STP w0, w1 from this because the registers read/written are consecutive. In our case, however, the sequence is optimised during ISel, resulting in:

    LD w0
    ST w0
    LD w0
    ST w0

This instruction reordering allows reuse of registers. Since the registers are no longer consecutive (i.e. they are the same), it inhibits LDP/STP creation. The approach here is to perform renaming:

    LD w0
    ST w0
    LD w1
    ST w1

to enable the folding of the stores into an STP. We do not yet generate the LDP due to a limitation in the renaming implementation, but plan to look at that in a follow-up so that we fully support this case. While this was initially motivated by certain memcpys, this is a general approach and thus beneficial for other cases too, as can be seen in some test changes.
Differential Revision: https://reviews.llvm.org/D103597
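Concretely, with hypothetical AArch64 registers, the renaming turns:

```asm
ldr x8, [x1]
str x8, [x0]
ldr x8, [x1, #8]     // x8 is reused
str x8, [x0, #8]
```

into a sequence whose stores fold into a pair:

```asm
ldr x8, [x1]
str x8, [x0]
ldr x9, [x1, #8]     // second use renamed to a free register
str x9, [x0, #8]     // the two stores now fold into: stp x8, x9, [x0]
```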
|
| #
99653702 |
| 10-Jun-2021 |
Martin Storsjö <[email protected]> |
Revert "[AArch64LoadStoreOptimizer] Generate more STPs by renaming registers earlier"
This reverts commit d96ea46629803641038ebe46d8cd512f8cf7e20f, as it caused various misoptimizations, see https://reviews.llvm.org/D103597 for discussion on the issues.
|
| #
d96ea466 |
| 02-Jun-2021 |
Meera Nakrani <[email protected]> |
[AArch64LoadStoreOptimizer] Generate more STPs by renaming registers earlier
Our initial motivating case was memcpys with alignments > 16. The loads/stores to which small memcpys expand are kept together in several places, so that we get a sequence like this for a 64-bit copy:

    LD w0
    LD w1
    ST w0
    ST w1

The load/store optimiser can generate an LDP/STP w0, w1 from this because the registers read/written are consecutive. In our case, however, the sequence is optimised during ISel, resulting in:

    LD w0
    ST w0
    LD w0
    ST w0

This instruction reordering allows reuse of registers. Since the registers are no longer consecutive (i.e. they are the same), it inhibits LDP/STP creation. The approach here is to perform renaming:

    LD w0
    ST w0
    LD w1
    ST w1

to enable the folding of the stores into an STP. We do not yet generate the LDP due to a limitation in the renaming implementation, but plan to look at that in a follow-up so that we fully support this case. While this was initially motivated by certain memcpys, this is a general approach and thus beneficial for other cases too, as can be seen in some test changes.
Differential Revision: https://reviews.llvm.org/D103597
|
|
Revision tags: llvmorg-12.0.1-rc1 |
|
| #
3f4bad5e |
| 05-May-2021 |
Stelios Ioannou <[email protected]> |
[AArch64] Fix for the pre-indexed paired load/store optimization.
This patch fixes an issue where a pre-indexed store, e.g. STR x1, [x0, #24]!, and a store like STR x0, [x0, #8] were merged into a single store: STP x1, x0, [x0, #24]!. They should not be merged, because the second store uses x0 as both the stored value and the address, so it needs the updated x0. Therefore, it should not be folded into an STP<>pre.
Additionally a new test case is added to verify this fix.
Differential Revision: https://reviews.llvm.org/D101888
Change-Id: I26f1985ac84e970961e2cdca23c590fa6773851a
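The problematic merge, spelled out (taken from the description above, with the transformation made explicit):

```asm
// Input:
str x1, [x0, #24]!   // pre-indexed: x0 is updated first
str x0, [x0, #8]     // stores the *updated* x0

// Rejected merge:
//   stp x1, x0, [x0, #24]!
// This would store the pre-update value of x0 (and pairs the write-back
// base with a transfer register), so the pass no longer folds it.
```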
|
| #
936c777e |
| 30-Apr-2021 |
Stelios Ioannou <[email protected]> |
[AArch64] Adds a pre-indexed paired Load/Store optimization for LDR-STR.
This patch merges STR<S,D,Q,W,X>pre-STR<S,D,Q,W,X>ui and LDR<S,D,Q,W,X>pre-LDR<S,D,Q,W,X>ui instruction pairs into a single STP<S,D,Q,W,X>pre and LDP<S,D,Q,W,X>pre instruction, respectively. For each pair, there is a MIR test that verifies this optimization.
Differential Revision: https://reviews.llvm.org/D99272
Change-Id: Ie97a20c8c716c08492fe229c22e14e3c98ef08b7
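A hypothetical instance of the STRpre/STRui merge (registers and offsets are illustrative):

```asm
// Input pair:
str x1, [x0, #-16]!  // STRXpre: pre-indexed store with write-back
str x2, [x0, #8]     // STRXui: store at the updated base + 8

// Merged result:
stp x1, x2, [x0, #-16]!
```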
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init |
|
| #
0146d206 |
| 21-Jan-2021 |
Amara Emerson <[email protected]> |
[AArch64] Do not fold SP adjustments into pre-increment addr modes if it overflows the redzone.
Instead of outright disabling this completely with the noredzone attribute, we only avoid doing the optimization if there are memory operations between the adjustment and the load/store that the adjustment would be folded into. This avoids the case of something like a stack cookie being corrupted if an exception happens before the pre-increment to the SP occurs.
This also prevents the folding happening if we have a redzone, but the offset being folded is above the redzone amount (128 bytes in this case).
rdar://73269336
Differential Revision: https://reviews.llvm.org/D95179
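A sketch of the rejected fold (the intervening store is hypothetical):

```asm
// Input:
sub sp, sp, #32
str x8, [x9]         // intervening memory op, e.g. a stack cookie write
                     // into the new frame
str x0, [sp]

// Folding the decrement forward would give:
//   str x8, [x9]
//   str x0, [sp, #-32]!
// so the intervening store now happens before SP is decremented; an
// exception at that point could clobber the data below SP, hence the
// fold is rejected.
```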
|
|
Revision tags: llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5 |
|
| #
f4b9dfd9 |
| 29-Sep-2020 |
Martin Storsjö <[email protected]> |
[AArch64] Don't merge sp decrement into later stores when using WinCFI
This matches the corresponding existing case in AArch64LoadStoreOpt::findMatchingUpdateInsnForward.
Both cases could also be modified to check MBBI->getFlag(FrameSetup/FrameDestroy) instead of forbidding any optimization involving SP, but the effect is probably pretty much the same.
Differential Revision: https://reviews.llvm.org/D88541
|
| #
8d8cb1ad |
| 30-Sep-2020 |
Congzhe Cao <congzhe.cao@huawei.com> |
[AArch64] Avoid pairing loads when the base reg is modified
When pairing loads, we should check if in between the two loads the base register has been modified. If that is the case then avoid pairing them because the second load actually loads from a different address.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D86956
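A minimal sketch of the rejected pairing (hypothetical registers):

```asm
ldr x2, [x10]
add x10, x10, #16    // base register modified between the loads
ldr x3, [x10, #8]

// Pairing into a single  ldp x2, x3, [x10, ...]  would make the second
// load read from the wrong address, so the pair is no longer formed.
```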
|
|
Revision tags: llvmorg-11.0.0-rc4 |
|
| #
c2deacd9 |
| 23-Sep-2020 |
Andrew Wei <[email protected]> |
[AArch64] Fix ldst optimization of non-immediate store offset
When matching a store instruction for ldst opt, we should make sure the store is in 'reg+imm' form like the load; otherwise we hit an assertion in isLdOffsetInRangeOfSt, since it uses getImm() directly.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D87905
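A sketch of the mismatch (hypothetical instructions): the load uses a reg+imm addressing mode, while the store uses a register offset, so calling getImm() on the store's offset operand would assert:

```asm
ldr w1, [x0, #4]          // reg+imm form: offset operand is an immediate
str w2, [x0, x3, lsl #2]  // reg+reg form: offset operand is a register
```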
|
| #
4edb3d36 |
| 22-Sep-2020 |
Congzhe Cao <congzhe.cao@huawei.com> |
[AArch64] Avoid pairing loads with same result reg
When pairing ldr instructions to an ldp instruction, we cannot pair two ldr destination registers where one is a sub or super register of the other.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D86906
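A sketch of the overlap (w1 is the low half of x1):

```asm
ldr w1, [x0]
ldr x1, [x0, #8]
// Pairing these into an LDP would give the overlapping destination an
// ambiguous final value, so such loads are no longer paired.
```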
|
|
Revision tags: llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
| #
1975ff9a |
| 08-Jun-2020 |
Florian Hahn <[email protected]> |
[AArch64] Fix ldst-opt of multiple disjunct subregs.
Currently aarch64-ldst-opt will incorrectly rename registers with multiple disjunct subregisters (e.g. result of LD3). This patch updates the canRenameUpToDef to bail out if it encounters such a register class that contains the register to rename.
Fixes PR46105.
Reviewers: efriedma, dmgreen, paquette, t.p.northover
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D81108
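A sketch of the case (hypothetical): an LD3 result is a register tuple whose parts are disjunct subregisters of one register-class member:

```asm
ld3 { v0.4s, v1.4s, v2.4s }, [x0]
// Renaming, say, v1 for a later pairing would require renaming the whole
// tuple consistently; canRenameUpToDef now bails out when it encounters
// a register class containing such a register.
```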
|
|
Revision tags: llvmorg-10.0.1-rc1 |
|
| #
505685a6 |
| 24-Apr-2020 |
Jean-Michel Gorius <[email protected]> |
[llvm][CodeGen] Check for memory instructions when querying for alias status
Summary: Add a check to make sure that MachineInstr::mayAlias returns prematurely if at least one of its instruction parameters does not access memory. This prevents calls to TargetInstrInfo::areMemAccessesTriviallyDisjoint with incompatible instructions.
A side effect of this change is to render the mayAlias helper in the AArch64 load/store optimizer obsolete. We can now directly call the MachineInstr::mayAlias member function.
Reviewers: hfinkel, t.p.northover, mcrosier, eli.friedman, efriedma
Reviewed By: efriedma
Subscribers: efriedma, kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78823
|
| #
c2c2dc52 |
| 18-Apr-2020 |
Vedant Kumar <[email protected]> |
[AArch64LoadStoreOptimizer] Skip debug insts during pattern matching [12/14]
Do not count the presence of debug insts against the limit set by LdStLimit, and allow the optimizer to find matching insts by skipping over debug insts.
Differential Revision: https://reviews.llvm.org/D78411
|
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5 |
|
| #
83cdb654 |
| 13-Mar-2020 |
Diogo Sampaio <[email protected]> |
[AArch64][Fix] LdSt optimization generate premature stack-popping
Summary: When moving add and sub into memory-operand instructions, aarch64-ldst-opt would prematurely pop the stack pointer before memory instructions that access the stack using indirect loads. E.g.

```
int foo(int offset){
  int local[4] = {0};
  return local[offset];
}
```

would generate:

```
sub sp, sp, #16           ; Push the stack
mov x8, sp                ; Save stack in register
stp xzr, xzr, [sp], #16   ; Zero-initialize stack, and post-increment,
                          ; making it invalid
; ------ If an exception goes here, the stack value might be corrupted
ldr w0, [x8, w0, sxtw #2] ; Access correct position, but it is not
                          ; guarded by SP
```
Reviewers: fhahn, foad, thegameg, eli.friedman, efriedma
Reviewed By: efriedma
Subscribers: efriedma, kristof.beyls, hiraditya, danielkiss, llvm-commits, simon_tatham
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75755
|