|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
| #
e5c892dd |
| 08-May-2022 |
Paul Walker <[email protected]> |
[SVE][SelectionDAG] Use INDEX to generate matching instances of BUILD_VECTOR.
This patch starts small, only detecting sequences of the form <a, a+n, a+2n, a+3n, ...> where a and n are ConstantSDNode
[SVE][SelectionDAG] Use INDEX to generate matching instances of BUILD_VECTOR.
This patch starts small, only detecting sequences of the form <a, a+n, a+2n, a+3n, ...> where a and n are ConstantSDNodes.
Differential Revision: https://reviews.llvm.org/D125194
show more ...
|
| #
428c0f2a |
| 24-Jul-2022 |
Simon Pilgrim <[email protected]> |
[DAG] getNode - assert that SMUL_LOHI/UMUL_LOHI nodes have the correct ops + types
|
| #
0708771c |
| 24-Jul-2022 |
Simon Pilgrim <[email protected]> |
[DAG] MaskedVectorIsZero - don't bother with (-1).isSubsetOf mask check. NFC.
Just use KnownBits::isZero() to ensure all the bits are known zero.
|
| #
ac8be213 |
| 23-Jul-2022 |
Simon Pilgrim <[email protected]> |
[DAG] isSplatValue - don't attempt to merge any BITCAST sub elements if they contain UNDEFs
We still haven't found a solution that correctly handles 'don't care' sub elements properly - given how cl
[DAG] isSplatValue - don't attempt to merge any BITCAST sub elements if they contain UNDEFs
We still haven't found a solution that correctly handles 'don't care' sub elements properly - given how close it is to the next release branch, I'm making this fail safe change and we can revisit this later if we can't find alternatives.
NOTE: This isn't a reversion of D128570 - it's the removal of undef handling across bitcasts entirely
Fixes #56520
show more ...
|
| #
89372524 |
| 23-Jul-2022 |
Simon Pilgrim <[email protected]> |
[DAG] computeKnownBits - add basic shift-by-parts handling
Concat KnownBits from ISD::SHL_PARTS / ISD::SRA_PARTS / ISD::SRL_PARTS lo/hi operands and perform the KnownBits calculation by the shift am
[DAG] computeKnownBits - add basic shift-by-parts handling
Concat KnownBits from ISD::SHL_PARTS / ISD::SRA_PARTS / ISD::SRL_PARTS lo/hi operands and perform the KnownBits calculation by the shift amount on the extended type, before splitting the KnownBits based on the requested lo/hi result.
show more ...
|
| #
23d6186b |
| 21-Jul-2022 |
David Green <[email protected]> |
[SelectionDAG] Fix fptoi.sat scalable vector lowering
Vector fptosi_sat and fptoui_sat were being expanded by unrolling the vector operation. This doesn't work for scalable vector, so this patch add
[SelectionDAG] Fix fptoi.sat scalable vector lowering
Vector fptosi_sat and fptoui_sat were being expanded by unrolling the vector operation. This doesn't work for scalable vector, so this patch adds a call to TLI.expandFP_TO_INT_SAT if the vector is scalable.
Scalable tests are added for AArch64 and RISCV. Some of the AArch64 fptoi_sat operations should be legal, but that will be handled in another patch.
Differential Revision: https://reviews.llvm.org/D130028
show more ...
|
| #
029e83b4 |
| 20-Jul-2022 |
Simon Pilgrim <[email protected]> |
[DAG] getNode - don't bother creating ADDO(X,0) or SUBO(X,0) nodes.
Similar to what we already do in getNode for basic ADD/SUB nodes, return the X operand directly, but here we know that there will
[DAG] getNode - don't bother creating ADDO(X,0) or SUBO(X,0) nodes.
Similar to what we already do in getNode for basic ADD/SUB nodes, return the X operand directly, but here we know that there will be no/zero overflow as well.
As noted on D127115 - this path is being exercised by llvm/test/CodeGen/ARM/dsp-mlal.ll, although I haven't been able to get any codegen without a topological worklist.
show more ...
|
| #
766cd954 |
| 20-Jul-2022 |
Simon Pilgrim <[email protected]> |
[DAG] getNode - assert that ADDO/SUBO nodes have the correct ops + types
|
| #
8d0383eb |
| 24-Jun-2022 |
Matt Arsenault <[email protected]> |
CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is
CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is rematerializable. I also don't think this was entirely correct, since it was implicitly assuming constant loads are also dereferenceable.
Remove this and rely only on the invariant+dereferenceable flags in the memory operand. Set the flag based on the AA query upfront. This should have the same net benefit, but has the possible disadvantage of making this AA query nonlazy.
Preserve the behavior of assuming pointsToConstantMemory implying dereferenceable for now, but maybe this should be changed.
show more ...
|
| #
26ce3370 |
| 17-Jul-2022 |
Simon Pilgrim <[email protected]> |
[DAG] computeKnownBits - move UDIV handling to same place as UREM/SREM. NFC.
|
| #
5ec47c6d |
| 17-Jul-2022 |
Simon Pilgrim <[email protected]> |
[DAG] Add MERGE_VALUE computeKnownBits/ComputeNumSignBits handling.
Just forward the value tracking to the operand specified by the ResNo
|
| #
9e6d1f4b |
| 17-Jul-2022 |
Kazu Hirata <[email protected]> |
[CodeGen] Qualify auto variables in for loops (NFC)
|
| #
dde2a7fb |
| 12-Jul-2022 |
Philip Reames <[email protected]> |
[RISCV] Exploit fact that vscale is always power of two to replace urem sequence
When doing scalable vectorization, the loop vectorizer uses a urem in the computation of the vector trip count. The R
[RISCV] Exploit fact that vscale is always power of two to replace urem sequence
When doing scalable vectorization, the loop vectorizer uses a urem in the computation of the vector trip count. The RHS of that urem is a (possibly shifted) call to @llvm.vscale.
vscale is effectively the number of "blocks" in the vector register. (That is, types such as <vscale x 8 x i8> and <vscale x 1 x i8> both fill one 64 bit block, and vscale is essentially how many of those blocks there are in a single vector register at runtime.)
We know from the RISCV V extension specification that VLEN must be a power of two between ELEN and 2^16. Since our block size is 64 bits, the must be a power of two numbers of blocks. (For everything other than VLEN<=32, but that's already broken.)
It is worth noting that AArch64 SVE specification explicitly allows non-power-of-two sizes for the vector registers and thus can't claim that vscale is a power of two by this logic.
Differential Revision: https://reviews.llvm.org/D129609
show more ...
|
| #
ede60037 |
| 29-Jun-2022 |
Nicolai Hähnle <[email protected]> |
ManagedStatic: remove many straightforward uses in llvm
(Reapply after revert in e9ce1a588030d8d4004f5d7e443afe46245e9a92 due to Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other
ManagedStatic: remove many straightforward uses in llvm
(Reapply after revert in e9ce1a588030d8d4004f5d7e443afe46245e9a92 due to Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other than error categories, to be checked in more detail and reapplied separately.)
Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it.
Differential Revision: https://reviews.llvm.org/D129120
show more ...
|
| #
e9ce1a58 |
| 10-Jul-2022 |
Nicolai Hähnle <[email protected]> |
Revert "ManagedStatic: remove many straightforward uses in llvm"
This reverts commit e6f1f062457c928c18a88c612f39d9e168f65a85.
Reverting due to a failure on the fuchsia-x86_64-linux buildbot.
|
| #
e6f1f062 |
| 29-Jun-2022 |
Nicolai Hähnle <[email protected]> |
ManagedStatic: remove many straightforward uses in llvm
Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases,
ManagedStatic: remove many straightforward uses in llvm
Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it.
Differential Revision: https://reviews.llvm.org/D129120
show more ...
|
| #
2247fdc8 |
| 08-Jul-2022 |
Sergei Barannikov <[email protected]> |
[SelectionDAG] computeKnownBits / ComputeNumSignBits for the remaining overflow-aware nodes
Some overflow-aware nodes were missing from the switches in computeKnownBits and ComputeNumSignBits.
|
| #
1023ddaf |
| 06-Jul-2022 |
Shilei Tian <[email protected]> |
[LLVM] Add the support for fmax and fmin in atomicrmw instruction
This patch adds the support for `fmax` and `fmin` operations in `atomicrmw` instruction. For now (at least in this patch), the instr
[LLVM] Add the support for fmax and fmin in atomicrmw instruction
This patch adds the support for `fmax` and `fmin` operations in `atomicrmw` instruction. For now (at least in this patch), the instruction will be expanded to CAS loop. There are already a couple of targets supporting the feature. I'll create another patch(es) to enable them accordingly.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D127041
show more ...
|
| #
72a23cef |
| 01-Jul-2022 |
Xiang1 Zhang <[email protected]> |
[ISel] Match all bits when merge undefs for DAG combine
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D128570
|
| #
64f44a90 |
| 01-Jul-2022 |
Xiang1 Zhang <[email protected]> |
Revert "[ISel] Match all bits when merge undef(s) for DAG combine"
This reverts commit 5fe5aa284efed1ee1492e1f266351b35f0a8bb69.
|
| #
5fe5aa28 |
| 30-Jun-2022 |
Xiang1 Zhang <[email protected]> |
[ISel] Match all bits when merge undef(s) for DAG combine
|
| #
4aafebce |
| 28-Jun-2022 |
Tim Northover <[email protected]> |
SelectionDAG: allow FP extensions when folding extract/insert.
Before, we were trying to sign extend half -> float, and asserted in getNode.
|
| #
2c3a4a93 |
| 22-Jun-2022 |
Simon Pilgrim <[email protected]> |
[DAG] SelectionDAG::GetDemandedBits - don't recurse back into GetDemandedBits
Another minor cleanup as we work toward removing GetDemandedBits entirely - call SimplifyMultipleUseDemandedBits directl
[DAG] SelectionDAG::GetDemandedBits - don't recurse back into GetDemandedBits
Another minor cleanup as we work toward removing GetDemandedBits entirely - call SimplifyMultipleUseDemandedBits directly.
show more ...
|
| #
8cecb6be |
| 21-Jun-2022 |
Simon Pilgrim <[email protected]> |
[DAG] Remove SelectionDAG::GetDemandedBits DemandedElts variant. NFC.
We're slowly removing SelectionDAG::GetDemandedBits and replacing it with SimplifyMultipleUseDemandedBits, we no longer have any
[DAG] Remove SelectionDAG::GetDemandedBits DemandedElts variant. NFC.
We're slowly removing SelectionDAG::GetDemandedBits and replacing it with SimplifyMultipleUseDemandedBits, we no longer have any uses for the vector demanded elt variant.
show more ...
|
| #
7a47ee51 |
| 21-Jun-2022 |
Kazu Hirata <[email protected]> |
[llvm] Don't use Optional::getValue (NFC)
|