Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init

# 57bdd989 | 25-Jul-2022 | Nikita Popov <[email protected]>
[ARM] Add target feature to force 32-bit atomics
This adds a +atomic-32 target feature, which instructs LLVM to assume that lock-free 32-bit atomics are available for this target, even if they usually wouldn't be.
If only atomic loads/stores are used, then this won't emit libcalls. If atomic CAS is used, then the user is responsible for providing any necessary __sync implementations (e.g. by masking interrupts for single-core privileged use cases).
See https://reviews.llvm.org/D120026#3674333 for context on this change. The tl;dr is that the thumbv6m target in Rust has historically made atomic load/store only available, which is incompatible with the change from D120026, which switched these to use libatomic.
Differential Revision: https://reviews.llvm.org/D130480
(cherry picked from commit b1b1086973d5be26f127540852ace59c5119e90a)
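A minimal sketch of the user-facing contract, assuming a bare-metal thumbv6m build with -mattr=+atomic-32 (function names are illustrative, not from the commit):

```cpp
#include <atomic>
#include <cstdint>

std::atomic<uint32_t> counter{0};

// With +atomic-32, plain loads/stores of 32-bit atomics can lower to
// ordinary ldr/str (plus any needed fences) instead of libcalls.
uint32_t load_counter() {
  return counter.load(std::memory_order_relaxed);
}

void store_counter(uint32_t v) {
  counter.store(v, std::memory_order_relaxed);
}

// CAS still has no native lowering on such cores: the backend emits a
// __sync_* libcall that the user must supply, e.g. by masking interrupts
// in a single-core privileged environment.
bool try_swap(uint32_t expected, uint32_t desired) {
  return counter.compare_exchange_strong(expected, desired);
}
```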

# 6cb95290 | 19-Jul-2022 | David Green <[email protected]>
[ARM] Remove VBICimm if no cleared bits are demanded
If none of the bits of a VBICimm are demanded, we can remove the node entirely using the input operand instead.
Differential Revision: https://reviews.llvm.org/D129966
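A standalone illustration of the underlying identity (not the backend code itself): masking with `x & ~imm`, which is what VBICimm computes, is invisible to any consumer whose demanded bits avoid `imm`, so the node can be replaced by its input.

```cpp
#include <cassert>
#include <cstdint>

int main() {
  const uint32_t imm = 0x0000FF00;      // bits the VBIC immediate clears
  const uint32_t demanded = 0xFF0000FF; // bits the consumer reads; disjoint from imm
  for (uint32_t x : {0u, 0x12345678u, 0xFFFFFFFFu}) {
    uint32_t vbic = x & ~imm;           // result of the VBICimm
    // On the demanded bits, the VBICimm output equals its input.
    assert((vbic & demanded) == (x & demanded));
  }
  return 0;
}
```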

# 0f6b0461 | 19-Jul-2022 | Simon Pilgrim <[email protected]>
[DAG] SimplifyDemandedBits - relax "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" to match only demanded bits
The "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" fold is currently limited to the XOR mask being a shifted all-bits mask, but we can relax this to only need to match under the demanded bits.
This helps expose more bit extraction/clearing patterns and fixes the PowerPC testCompares*.ll regressions from D127115.
Alive2: https://alive2.llvm.org/ce/z/fl7T7K
Differential Revision: https://reviews.llvm.org/D129933
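A standalone check of the relaxed condition (illustrative constants): XorC no longer has to equal the full shifted all-ones mask, only to match it on the demanded bits.

```cpp
#include <cassert>
#include <cstdint>

int main() {
  const unsigned ShiftC = 28;
  const uint32_t demanded = 0x3; // consumer reads only the low 2 bits
  const uint32_t XorC = 0x3;     // agrees with (~0u >> 28) == 0xF on those bits
  for (uint32_t x : {0u, 0x87654321u, 0xFFFFFFFFu}) {
    uint32_t lhs = (x >> ShiftC) ^ XorC; // xor (X >> ShiftC), XorC
    uint32_t rhs = ~x >> ShiftC;         // (not X) >> ShiftC
    assert((lhs & demanded) == (rhs & demanded));
  }
  return 0;
}
```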

# 259c36e7 | 18-Jul-2022 | Simon Pilgrim <[email protected]>
[DAG] Add asserts to isDesirableToCommuteWithShift overrides to ensure it is being called from a shift. NFC.

# 07ab0cb4 | 18-Jul-2022 | Simon Pilgrim <[email protected]>
[DAG] Add missing asserts to shouldFoldConstantShiftPairToMask overrides to ensure a shl/srl pair is used. NFC.

# ddd94851 | 08-Jul-2022 | John Brawn <[email protected]>
[MVE] Don't distribute add of vecreduce if it has more than one use
If the add has more than one use then applying the transformation won't cause it to be removed, so we can end up applying it again causing an infinite loop.
Differential Revision: https://reviews.llvm.org/D129361
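The shape of the guard, sketched against SelectionDAG types (names assumed; not the verbatim patch):

```cpp
#include "llvm/CodeGen/SelectionDAG.h"
using namespace llvm;

// Sketch: distributing the add is only profitable when it dies afterwards.
// With extra users the original add survives, the combine matches it again,
// and DAGCombine loops forever.
static SDValue performVecreduceAddCombine(SDNode *N) {
  SDValue Add = N->getOperand(0); // the add being distributed (assumed shape)
  if (!Add.hasOneUse())
    return SDValue(); // other users keep the add alive; bail out
  // ... distribute the add over the vecreduce ...
  return SDValue();
}
```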

# 0a11ad2a | 11-Jul-2022 | David Green <[email protected]>
[ARM] Expand MVE i1 fptoint and inttofp if mve.fp is not present.
If MVE.fp is not present then we cannot select the vector i1 fp operations to VCMP instructions, so we need to expand them.
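Roughly what such legalization setup looks like, as a sketch assuming ARM's ISel-lowering constructor (the exact types and guard may differ):

```cpp
// Without MVE.fp there is no VCMP-based selection for vector-i1 FP<->int
// conversions, so mark them Expand and let legalization break them down.
if (!Subtarget->hasMVEFloatOps()) {
  for (MVT VT : {MVT::v16i1, MVT::v8i1, MVT::v4i1}) {
    setOperationAction(ISD::FP_TO_SINT, VT, Expand);
    setOperationAction(ISD::FP_TO_UINT, VT, Expand);
    setOperationAction(ISD::SINT_TO_FP, VT, Expand);
    setOperationAction(ISD::UINT_TO_FP, VT, Expand);
  }
}
```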

Revision tags: llvmorg-14.0.6

# 0916d96d | 21-Jun-2022 | Kazu Hirata <[email protected]>
Don't use Optional::hasValue (NFC)
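A hypothetical before/after of the mechanical change (llvm::Optional was converging on the std::optional spelling; `Opt` and `use` are illustrative):

```cpp
// Before: deprecated accessors.
if (Opt.hasValue())
  use(Opt.getValue());

// After: std::optional-compatible spellings, or the contextual-bool form.
if (Opt.has_value())
  use(Opt.value());
if (Opt)
  use(*Opt);
```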

# 6725d806 | 14-Jun-2022 | Guillaume Chatelet <[email protected]>
[NFC][Alignment] Use Align in shouldAlignPointerArgs

Revision tags: llvmorg-14.0.5

# 007917b9 | 08-Jun-2022 | David Sherwood <[email protected]>
[MVE] Fold fadd(select(..., +0.0)) into a predicated fadd
We already have patterns for matching fadd(select(..., -0.0)), but an upcoming patch will lead to patterns using +0.0 as the identity instead of -0.0. I'm adding support for these patterns now to avoid any regressions for MVE.
Differential Revision: https://reviews.llvm.org/D127275
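A scalar model of the lane-wise behavior (an illustration, not the MVE pattern itself): lanes with a false predicate add the identity and leave the accumulator unchanged. Note that +0.0 is an fadd identity only when signed zeros can be ignored (-0.0 + +0.0 == +0.0), which is why +0.0 needs its own patterns alongside the existing -0.0 ones.

```cpp
// Equivalent, per lane, to a predicated fadd: p ? acc + x : acc
// (modulo the acc == -0.0 corner case noted above).
float predicated_fadd_model(bool p, float acc, float x) {
  return acc + (p ? x : +0.0f);
}
```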

# 07881861 | 03-Jun-2022 | Guillaume Chatelet <[email protected]>
[Alignment][NFC] Remove usage of MemSDNode::getAlignment
I can't remove the function just yet as it is used in the generated .inc files. I would also like to provide a way to compare alignment with TypeSize since it came up a few times.
Differential Revision: https://reviews.llvm.org/D126910
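A hypothetical call-site migration of the kind this cleanup performs (`MemNode` is illustrative):

```cpp
// Before: raw byte count, easy to confuse with log2 encodings.
unsigned AlignBytes = MemNode->getAlignment();
// After: the type-safe llvm::Align wrapper.
llvm::Align A = MemNode->getAlign();
```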

Revision tags: llvmorg-14.0.4

# 668bb963 | 19-May-2022 | Martin Storsjö <[email protected]>
[ARM] Implement lowering of the sponentry intrinsic
This is needed for SEH based setjmp on Windows.
Differential Revision: https://reviews.llvm.org/D126763

# ad73ce31 | 26-May-2022 | Zongwei Lan <[email protected]>
[Target] use getSubtarget<> instead of static_cast<>(getSubtarget())
Differential Revision: https://reviews.llvm.org/D125391
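The rewrite, illustrated on ARM (a representative call site; the patch touches many targets):

```cpp
// Before: manual downcast of the generic subtarget.
const ARMSubtarget &STOld =
    static_cast<const ARMSubtarget &>(MF.getSubtarget());
// After: the templated accessor performs the same cast internally.
const ARMSubtarget &STNew = MF.getSubtarget<ARMSubtarget>();
```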

# a86cfaea | 21-May-2022 | David Green <[email protected]>
[ARM] Add register-mask for tail returns
The TC_RETURN/TCRETURNdi under Arm does not currently add the register-mask operand when tail folding, which leads to the register (like LR) not being 'used' by the return. This changes the code to unconditionally set the register mask on the call, as opposed to skipping it for tail calls.
I don't believe this will currently alter any codegen, but should glue things together better post-frame lowering. It matches the AArch64 code better.
Differential Revision: https://reviews.llvm.org/D125906
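A sketch of the shape of the change (variable names assumed, not the verbatim patch):

```cpp
// Attach the call-preserved register mask to the call node unconditionally,
// instead of skipping it when building a tail call.
const uint32_t *Mask = TRI->getCallPreservedMask(MF, CallConv);
assert(Mask && "missing call-preserved mask for calling convention");
Ops.push_back(DAG.getRegisterMask(Mask)); // previously guarded by !isTailCall
```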

# f848798b | 04-May-2022 | David Green <[email protected]>
[ARM] Delay creation of MVE Imm shifts to legalization
The reasoning for creating VSHLIMM/VSHRsIMM/VSHRuIMM nodes in a combine - because matching i64 constants is difficult - does not apply for MVE, as there are no v2i64 shifts. Delaying the creation of the nodes allows extra transforms on the target-independent shl/shr.

Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1

# c4ea925f | 05-Apr-2022 | Matt Arsenault <[email protected]>
AtomicExpand: Change return type for shouldExpandAtomicStoreInIR
Use the same enum as the other atomic instructions for consistency, in preparation for addition of another strategy.
Introduce a new "Expand" option, since the store expansion does not use cmpxchg. Alternatively, the existing CmpXChg strategy could be renamed to Expand.
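Roughly what the new hook shape lets a target express (a sketch, not the committed code; `storeIsNativelyAtomic` is an assumed stand-in for the real check):

```cpp
// The hook now shares AtomicExpansionKind with the other atomic queries,
// so a store can request plain expansion rather than a cmpxchg loop.
TargetLowering::AtomicExpansionKind
shouldExpandAtomicStoreInIR(StoreInst *SI) const /* override */ {
  if (storeIsNativelyAtomic(SI))
    return AtomicExpansionKind::None;
  return AtomicExpansionKind::Expand; // new option; CmpXChg remains available
}
```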

# 662b9fa0 | 28-Mar-2022 | Shao-Ce SUN <[email protected]>
[NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D122557
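The convenience this enables inside a target's TargetLowering constructor (illustrative opcode set):

```cpp
// Before: one call per opcode.
setTargetDAGCombine(ISD::ADD);
setTargetDAGCombine(ISD::SUB);
setTargetDAGCombine(ISD::MUL);
// After: a single call taking an ArrayRef via an initializer list.
setTargetDAGCombine({ISD::ADD, ISD::SUB, ISD::MUL});
```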

# ddca6662 | 18-Mar-2022 | Eli Friedman <[email protected]>
[ARM] Fix shouldExpandAtomicLoadInIR for subtargets without ldrexd.
Regression from 2f497ec3; we should not try to generate ldrexd on targets that don't have it.
Also, while I'm here, fix shouldExpandAtomicStoreInIR, for consistency. That doesn't really have any practical effect, though. On Thumb targets where we need to use __sync_* libcalls, there is no libcall for stores, so SelectionDAG calls __sync_lock_test_and_set_8 anyway.

Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2

# 2f497ec3 | 17-Feb-2022 | Eli Friedman <[email protected]>
[ARM] Fix ARM backend to correctly use atomic expansion routines.
Without this patch, clang would generate calls to __sync_* routines on targets where it does not make sense; we can't assume the routines exist on unknown targets. Linux has special implementations of the routines that work on old ARM targets; other targets have no such routines. In general, atomics operations which aren't natively supported should go through libatomic (__atomic_*) APIs, which can support arbitrary atomics through locks.
ARM targets older than v6, where this patch makes a difference, are rare in practice, but not completely extinct. See, for example, discussion on D116088.
This also affects Cortex-M0, but I don't think __sync_* routines actually exist in any Cortex-M0 libraries. So in practice this just leads to a slightly different linker error for those cases, I think.
Mechanically, this patch does the following:
- Ensures we run atomic expansion unconditionally; it never makes sense to completely skip it.
- Fixes getMaxAtomicSizeInBitsSupported() so it returns an appropriate number on all ARM subtargets.
- Fixes shouldExpandAtomicRMWInIR() and shouldExpandAtomicCmpXchgInIR() to correctly handle subtargets that don't have atomic instructions.
Differential Revision: https://reviews.llvm.org/D120026
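A standalone sketch of the size-cap logic (the predicates are stand-ins, not real ARMSubtarget API): reporting the widest natively supported atomic honestly is what routes oversized operations to __atomic_* libcalls instead of nonexistent __sync_* routines.

```cpp
// Sketch only: pick the widest lock-free atomic the core can actually do.
unsigned maxAtomicSizeInBits(bool hasLdrexd, bool hasLdrex) {
  if (hasLdrexd) return 64; // 64-bit exclusive pair available
  if (hasLdrex)  return 32; // only 32-bit exclusives
  return 0;                 // none: AtomicExpand lowers everything to libcalls
}
```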

# 831ab35b | 23-Feb-2022 | Tomas Matheson <[email protected]>
[ARM][AArch64] generate subtarget feature flags
Reland of D120906 after sanitizer failures.
This patch aims to reduce a lot of the boilerplate around adding new subtarget features. From the SubtargetFeatures tablegen definitions, a series of calls to the macro GET_SUBTARGETINFO_MACRO are generated in ARM/AArch64GenSubtargetInfo.inc. ARMSubtarget/AArch64Subtarget can then use this macro to define bool members and the corresponding getter methods.
Some naming inconsistencies have been fixed to allow this, and one unused member removed.
This implementation only applies to boolean members; in future both BitVector and enum members could also be generated.
Differential Revision: https://reviews.llvm.org/D120906
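The macro pattern, sketched and simplified from the description above (the feature name is invented, and the real generated .inc and macro arity may differ):

```cpp
struct SubtargetSketch {
#define GET_SUBTARGETINFO_MACRO(ATTRIBUTE, DEFAULT, GETTER)                   \
  bool ATTRIBUTE = DEFAULT;                                                   \
  bool GETTER() const { return ATTRIBUTE; }
  // In the real code, expansions like the one below come from
  // #include "ARMGenSubtargetInfo.inc" (or the AArch64 equivalent).
  GET_SUBTARGETINFO_MACRO(HasExampleFeature, false, hasExampleFeature)
#undef GET_SUBTARGETINFO_MACRO
};
```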

# 62c48154 | 18-Mar-2022 | Tomas Matheson <[email protected]>
Revert "[ARM][AArch64] generate subtarget feature flags"
This reverts commit dd8b0fecb95df7689aac26c2ef9ebd1f527f9f46.

# dd8b0fec | 23-Feb-2022 | Tomas Matheson <[email protected]>
[ARM][AArch64] generate subtarget feature flags
This patch aims to reduce a lot of the boilerplate around adding new subtarget features. From the SubtargetFeatures tablegen definitions, a series of calls to the macro GET_SUBTARGETINFO_MACRO are generated in ARM/AArch64GenSubtargetInfo.inc. ARMSubtarget/AArch64Subtarget can then use this macro to define bool members and the corresponding getter methods.
Some naming inconsistencies have been fixed to allow this, and one unused member removed.
This implementation only applies to boolean members; in future both BitVector and enum members could also be generated.
Differential Revision: https://reviews.llvm.org/D120906

# bd38234d | 17-Mar-2022 | Sterling Augustine <[email protected]>
Reland "Use a stable-sort when combining bases"
Differential Revision: https://reviews.llvm.org/D121922

# 84810e1f | 17-Mar-2022 | Sterling Augustine <[email protected]>
Revert "Use a stable-sort when combining bases"
This reverts commit 81417261a15f46284f2613118120d7d6de2bc02d.

# 81417261 | 16-Mar-2022 | Sterling Augustine <[email protected]>
Use a stable-sort when combining bases
While experimenting with different algorithms for std::sort I discovered that combine-vmovdrr.ll fails if this sort is not stable.
I suspect that the test is too stringent in its check--the resultant code looks functionally identical to me under both stable and unstable sorting, but a generic fix is quite a bit more difficult to implement.
Thanks to [email protected] for finding the proper fix.
Differential Revision: https://reviews.llvm.org/D121870
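A standalone illustration of why stability matters here: std::sort may order equal-comparing elements differently across implementations, so downstream output that depends on tie order (like the checked assembly in combine-vmovdrr.ll) can flip, while std::stable_sort fixes ties to input order.

```cpp
#include <algorithm>
#include <cstdio>
#include <utility>
#include <vector>

int main() {
  // Sort by .first only; the two {1, ...} entries compare equal.
  std::vector<std::pair<int, char>> v{{2, 'a'}, {1, 'b'}, {1, 'c'}};
  std::stable_sort(v.begin(), v.end(), [](const auto &l, const auto &r) {
    return l.first < r.first;
  });
  for (const auto &e : v)
    std::printf("%d%c ", e.first, e.second); // guaranteed: 1b 1c 2a
  return 0;
}
```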