|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
601b3a13 |
| 17-Jul-2022 |
Kazu Hirata <[email protected]> |
[Analysis] Qualify auto variables in for loops (NFC)
|
| #
92a1b2af |
| 16-Jul-2022 |
Kazu Hirata <[email protected]> |
[Analysis] Remove isArithmeticRecurrenceKind
The last use was removed on Jul 30, 2021 in commit 9d355949937038c32c7608ebb558bbc3984f6340.
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
4e5e042d |
| 15-Sep-2021 |
Igor Kirillov <[email protected]> |
[LoopVectorize] Support reductions that store intermediary result
Adds ability to vectorize loops containing a store to a loop-invariant address as part of a reduction that isn't converted to SSA form due to lack of aliasing info. Runtime checks are generated to ensure the store does not alias any other accesses in the loop.
Ordered fadd reductions are not yet supported.
Differential Revision: https://reviews.llvm.org/D110235
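For illustration, the loop shape this change targets might look like the following C++ sketch. The function name `accumulateInto` is hypothetical and does not come from the commit. The running sum is written through `*Sum` on every iteration, and the address of `*Sum` never changes inside the loop. Without aliasing information the compiler cannot assume that `Sum` and `Src` do not overlap, which is why the commit relies on runtime checks.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: a reduction whose intermediate result is stored to a
// loop-invariant address on every iteration.
void accumulateInto(int *Sum, const int *Src, std::size_t N) {
  assert(Sum != nullptr && (Src != nullptr || N == 0));
  for (std::size_t I = 0; I < N; ++I)
    *Sum += Src[I]; // load, add, and store to the same address each iteration
}
```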
|
| #
9727c77d |
| 25-Apr-2022 |
David Green <[email protected]> |
[NFC] Rename Instrinsic to Intrinsic
|
| #
a2979c83 |
| 07-Mar-2022 |
Florian Hahn <[email protected]> |
[IVDescriptors] Bail out instead of asserting that order is expected.
When dealing with multiple phis that depend on each other, the order might have been changed and may not match the expectation. If that happens, bail out, rather than asserting.
Fixes https://github.com/llvm/llvm-project/issues/54218
Fixes https://github.com/llvm/llvm-project/issues/54233
Fixes https://github.com/llvm/llvm-project/issues/54254
|
| #
de8ac485 |
| 05-Mar-2022 |
Florian Hahn <[email protected]> |
[IVDescriptor] Remove SinkCandidate from SinkAfter before re-sinking.
This ensures the right order in the sink-after map is maintained. If we re-sink an instruction, it must be sunk after all earlier instructions have been sunk.
Fixes https://github.com/llvm/llvm-project/issues/54223
|
| #
5a60260e |
| 04-Mar-2022 |
Florian Hahn <[email protected]> |
[IVDescriptor] Use DT to check order of Previous, OtherPrev.
Previous and OtherPrev may not be in the same block. Use DT::dominates instead of local comesBefore. DT::dominates is already used earlier to check the order of Previous and SinkCandidate.
Fixes https://github.com/llvm/llvm-project/issues/54195
|
| #
139215af |
| 03-Mar-2022 |
Florian Hahn <[email protected]> |
[IVDescriptor] Find original 'Previous' for first-order recurrences.
This patch extends first-order recurrence handling to support cases where we already sunk an instruction for a different recurrence, but LastPrev comes before Previous.
To handle those cases correctly, we need to find the earliest entry for the sink-after chain, because this references the Previous from the original recurrence. This is needed to ensure we use the correct instruction as the sink point.
Depends on D118558.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D118642
|
| #
71c3a551 |
| 28-Feb-2022 |
serge-sans-paille <[email protected]> |
Cleanup includes: LLVMAnalysis
Number of lines output by preprocessor: before: 1065940348 after: 1065307662
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120659
|
| #
12fb133e |
| 22-Feb-2022 |
Kerry McLaughlin <[email protected]> |
[LoopVectorize] Support conditional in-loop vector reductions
Extends getReductionOpChain to look through Phis which may be part of the reduction chain. adjustRecipesForReductions will now also create a CondOp for VPReductionRecipe if the block is predicated and not only if foldTailByMasking is true.
Changes were required in tryToBlend to ensure that we don't attempt to convert the reduction Phi into a select by returning a VPBlendRecipe. The VPReductionRecipe will create a select between the Phi and the reduction.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D117580
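As a rough source-level picture of a conditional reduction, the sketch below shows the same sum written with a branch and with a select between the old and the updated value. The function names and the separate `Cond` array are assumptions made for the example; the commit itself works on VPlan recipes, not on C++ source.

```cpp
#include <cassert>
#include <cstddef>

// Branch form: the reduction update only runs when the condition holds.
int sumIfBranch(const int *Src, const bool *Cond, std::size_t N) {
  assert((Src != nullptr && Cond != nullptr) || N == 0);
  int Sum = 0;
  for (std::size_t I = 0; I < N; ++I)
    if (Cond[I])
      Sum += Src[I];
  return Sum;
}

// Select form: always compute the update, then choose between the previous
// value (the "phi") and the updated value.
int sumIfSelect(const int *Src, const bool *Cond, std::size_t N) {
  assert((Src != nullptr && Cond != nullptr) || N == 0);
  int Sum = 0;
  for (std::size_t I = 0; I < N; ++I) {
    const int Next = Sum + Src[I];
    Sum = Cond[I] ? Next : Sum;
  }
  return Sum;
}
```

Both forms compute the same value; the select form is closer to what a predicated vector reduction has to express.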
|
| #
b2f5164d |
| 14-Feb-2022 |
zhongyunde <[email protected]> |
[IVDescriptors] Support FOR where we have multiple sink pointed
Handles the case where Previous doesn't come before LastPrev incorrectly.
Fix https://github.com/llvm/llvm-project/issues/53483
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D118558
|
| #
1badfbb4 |
| 01-Feb-2022 |
David Sherwood <[email protected]> |
Fix incorrect TypeSize->uint64_t cast in InductionDescriptor::isInductionPHI
The code was relying upon the implicit conversion of TypeSize to uint64_t and assuming the type in question was always fixed. However, I discovered an issue when running the canon-freeze pass with some IR loops that contain scalable vector types. I've changed the code to bail out if the size is unknown at compile time, since we cannot compute whether the step is a multiple of the type size or not.
I added a test here:
Transforms/CanonicalizeFreezeInLoops/phis.ll
Differential Revision: https://reviews.llvm.org/D118696
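The bail-out described above can be sketched outside of LLVM as follows. This is only a model: `MaybeFixedSize` and `stepIsMultipleOfSize` are hypothetical names, and `std::optional` stands in for a size that may be unknown at compile time. It is not the TypeSize API.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a type size that may be unknown at compile time
// (as with scalable vectors): std::nullopt means "not a fixed size".
using MaybeFixedSize = std::optional<std::uint64_t>;

// Returns true only when the size is fixed, nonzero, and divides the step.
bool stepIsMultipleOfSize(std::int64_t Step, MaybeFixedSize Size) {
  if (!Size || *Size == 0)
    return false; // bail out instead of guessing a size
  return Step % static_cast<std::int64_t>(*Size) == 0;
}
```

Returning `false` for the unknown case is the conservative choice: the caller simply does not treat the PHI as an induction of that form.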
|
| #
f3e1f443 |
| 27-Jan-2022 |
Congzhe Cao <[email protected]> |
[IVDescriptor] Get the exact FP instruction that does not allow reordering
This is a bugfix in IVDescriptor.cpp.
The helper function `RecurrenceDescriptor::getExactFPMathInst()` is supposed to return the 1st FP instruction that does not allow reordering. However, when constructing the RecurrenceDescriptor, we trace the use-def chain starting from a PHI node and for each instruction in the use-def chain, its descriptor overrides the previous one. Therefore in the final RecurrenceDescriptor we constructed, we lose previous FP instructions that do not allow reordering.
Reviewed By: kmclaughlin
Differential Revision: https://reviews.llvm.org/D118073
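A toy model of the intended behaviour, under the assumption that each step of the chain can be reduced to an id and a flag: the first step that forbids reordering is remembered and later steps never overwrite it. `ChainStep` and `firstExactStep` are hypothetical names for this sketch and are not part of IVDescriptors.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of one instruction in a reduction's use-def chain.
struct ChainStep {
  int Id;             // stand-in for the instruction
  bool AllowsReorder; // stand-in for a "reordering allowed" fast-math flag
};

// Returns the Id of the first step that forbids reordering, or -1 if none.
int firstExactStep(const std::vector<ChainStep> &Chain) {
  int Exact = -1;
  for (const ChainStep &S : Chain)
    if (!S.AllowsReorder && Exact == -1)
      Exact = S.Id; // keep the first hit; later steps must not override it
  return Exact;
}
```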
|
| #
aa97bc11 |
| 21-Jan-2022 |
Nikita Popov <[email protected]> |
[NFC] Remove uses of PointerType::getElementType()
Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType().
This is part of D117885, in preparation for deprecating the API.
|
| #
961f51fd |
| 08-Nov-2021 |
Rosie Sumpter <[email protected]> |
[LoopVectorize][CostModel] Choose smaller VFs for in-loop reductions without loads/stores
For loops that contain in-loop reductions but no loads or stores, large VFs are chosen because LoopVectorizationCostModel::getSmallestAndWidestTypes has no element types to check through and so returns the default widths (-1U for the smallest and 8 for the widest). This results in the widest VF being chosen for the following example,
  float s = 0;
  for (int i = 0; i < N; ++i)
    s += (float) i*i;
which, for more computationally intensive loops, leads to large loop sizes when the operations end up being scalarized.
In this patch, for the case where ElementTypesInLoop is empty, the widest type is determined by finding the smallest type used by recurrences in the loop instead of falling back to a default value of 8 bits. This results in the cost model choosing a more sensible VF for loops like the one above.
Differential Revision: https://reviews.llvm.org/D113973
|
| #
d74a8a78 |
| 09-Dec-2021 |
Florian Hahn <[email protected]> |
[LV] Mark various functions as const (NFC).
Make sure various accessors do not modify any state, in preparation for D115111.
|
| #
c2441b6b |
| 11-Oct-2021 |
Rosie Sumpter <[email protected]> |
[LoopVectorize] Add vector reduction support for fmuladd intrinsic
Enables LoopVectorize to handle reduction patterns involving the llvm.fmuladd intrinsic.
Differential Revision: https://reviews.llvm.org/D111555
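A typical source pattern behind such a reduction is a dot product, sketched below. Whether the multiply-add in the loop body is actually emitted as `llvm.fmuladd` depends on the front end and its floating-point contraction settings; that is an assumption of this example, not something the commit states. The function name `dot` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Dot product: each iteration multiplies two elements and adds the product
// to the running sum, which is the multiply-add shape of the reduction.
float dot(const float *A, const float *B, std::size_t N) {
  assert((A != nullptr && B != nullptr) || N == 0);
  float Sum = 0.0f;
  for (std::size_t I = 0; I < N; ++I)
    Sum += A[I] * B[I];
  return Sum;
}
```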
|
| #
ff64b293 |
| 16-Nov-2021 |
Kerry McLaughlin <[email protected]> |
[LoopVectorize] Check the number of uses of an FAdd before classifying as ordered
checkOrderedReductions looks for Phi nodes which can be classified as in-order, meaning they can be vectorised without unsafe math. In order to vectorise the reduction it should also be classified as in-loop by getReductionOpChain, which checks that the reduction has two uses.
In this patch, a similar check is added to checkOrderedReductions so that we now return false if there are more than two uses of the FAdd instruction. This fixes PR52515.
Reviewed By: fhahn, david-arm
Differential Revision: https://reviews.llvm.org/D114002
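To see why an in-order classification matters for floating-point sums, the sketch below compares a strict left-to-right sum with a sum split into two interleaved partial sums, roughly what a reassociating vectorization with two lanes would compute. The function names are hypothetical and the two-lane split is only an illustration of reassociation, not the vectorizer's actual code.

```cpp
#include <cassert>
#include <cstddef>

// Strict left-to-right summation, matching the scalar loop's order.
double sumInOrder(const double *V, std::size_t N) {
  double Sum = 0.0;
  for (std::size_t I = 0; I < N; ++I)
    Sum += V[I];
  return Sum;
}

// Reassociated summation: even and odd elements are accumulated separately
// and combined at the end, as two independent lanes would do.
double sumTwoLanes(const double *V, std::size_t N) {
  assert(N % 2 == 0);
  double Even = 0.0, Odd = 0.0;
  for (std::size_t I = 0; I < N; I += 2) {
    Even += V[I];
    Odd += V[I + 1];
  }
  return Even + Odd;
}
```

With inputs that mix very large and very small magnitudes, the two functions can return different results, which is why an ordered reduction has to preserve the original order of additions.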
|
| #
112c1c34 |
| 15-Nov-2021 |
Florian Hahn <[email protected]> |
[IVDescriptor] Make sure the sign is included for negative extension.
At the moment, computeRecurrenceType does not include any sign bits in the maximum bit width. If the value can be negative, this means the sign bit will be missing and the sext won't properly extend the value.
If the value can be negative, increment the bitwidth by one to make sure there is at least one sign bit in the result value.
Note that the increment is also needed *if* the value is *known* to be negative, as a sign bit needs to be preserved for the sext to work.
Note that this at the moment prevents vectorization, because the analysis computes i1 as type for the recurrence when looking through the AND in lookThroughAnd.
Fixes PR51794, PR52485.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D113056
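The need for the extra bit can be checked with a small truncate-then-sign-extend helper. `truncThenSext` is a hypothetical name for this sketch: the magnitude of 5 fits in three bits, but -5 only survives the round trip when a fourth bit is kept for the sign.

```cpp
#include <cassert>
#include <cstdint>

// Truncate V to its low Bits bits, then sign-extend back to 32 bits.
std::int32_t truncThenSext(std::int32_t V, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 32);
  if (Bits == 32)
    return V;
  const std::uint32_t Mask = (std::uint32_t{1} << Bits) - 1;
  const std::uint32_t Low = static_cast<std::uint32_t>(V) & Mask;
  const std::uint32_t SignBit = std::uint32_t{1} << (Bits - 1);
  // Sign-extension identity (x ^ s) - s, evaluated in unsigned arithmetic so
  // that the subtraction wraps instead of overflowing.
  return static_cast<std::int32_t>((Low ^ SignBit) - SignBit);
}
```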
|
| #
73797367 |
| 14-Nov-2021 |
Kazu Hirata <[email protected]> |
[llvm] Use range-based for loops with User::operands (NFC)
|
|
Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
| #
26b7d9d6 |
| 04-Aug-2021 |
David Sherwood <[email protected]> |
[LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns
This patch adds further support for vectorisation of loops that involve selecting an integer value based on a previous comparison. Consider the following C++ loop:
  int r = a;
  for (int i = 0; i < n; i++) {
    if (src[i] > 3) {
      r = b;
    }
    src[i] += 2;
  }
We should be able to vectorise this loop because all we are doing is selecting between two states - 'a' and 'b' - both of which are loop invariant. This just involves building a vector of values that contain either 'a' or 'b', where the final reduced value will be 'b' if any lane contains 'b'.
The IR generated by clang typically looks like this:
  %phi = phi i32 [ %a, %entry ], [ %phi.update, %for.body ]
  ...
  %pred = icmp ugt i32 %val, 3
  %phi.update = select i1 %pred, i32 %b, i32 %phi
We already detect min/max patterns, which also involve a select + cmp. However, with the min/max patterns we are selecting loaded values (and hence loop variant) in the loop. In addition we only support certain cmp predicates. This patch adds a new pattern matching function (isSelectCmpPattern) and new RecurKind enums - SelectICmp & SelectFCmp. We only support selecting values that are integer and loop invariant, however we can support any kind of compare - integer or float.
Tests have been added here:
  Transforms/LoopVectorize/AArch64/sve-select-cmp.ll
  Transforms/LoopVectorize/select-cmp-predicated.ll
  Transforms/LoopVectorize/select-cmp.ll
Differential Revision: https://reviews.llvm.org/D108136
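The lane-wise idea described in this message can be simulated in plain C++. The sketch below assumes a fixed width of four lanes (`kLanes`) and hypothetical function names, and it leaves out the `src[i] += 2` update because that part is not involved in the reduction. Each lane starts as `A` and is switched to `B` when its element exceeds 3; the final value is `B` if any lane differs from `A`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4; // hypothetical vector width

// Scalar reference: the result becomes B once any element exceeds 3.
int selectCmpScalar(const int *Src, std::size_t N, int A, int B) {
  int R = A;
  for (std::size_t I = 0; I < N; ++I)
    if (Src[I] > 3)
      R = B;
  return R;
}

// Lane-wise sketch: each lane holds A or B; the result is B if any lane was
// switched away from A.
int selectCmpLanes(const int *Src, std::size_t N, int A, int B) {
  assert(Src != nullptr || N == 0);
  std::array<int, kLanes> Vec;
  Vec.fill(A);
  std::size_t I = 0;
  for (; I + kLanes <= N; I += kLanes)
    for (std::size_t L = 0; L < kLanes; ++L)
      if (Src[I + L] > 3)
        Vec[L] = B;
  int R = A;
  for (int V : Vec) // final reduction across lanes
    if (V != A)
      R = B;
  for (; I < N; ++I) // scalar remainder for leftover elements
    if (Src[I] > 3)
      R = B;
  return R;
}
```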
|
| #
685f1bfd |
| 01-Oct-2021 |
Krasimir Georgiev <[email protected]> |
Revert "[LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns"
It appears to cause stage2 clang build failures, e.g., https://lab.llvm.org/buildbot/#/builders/74/builds/7145.
This reverts commit 1fb37334bdb3cdb028977382fbd84cebde64ebb2.
|
| #
1fb37334 |
| 04-Aug-2021 |
David Sherwood <[email protected]> |
[LoopVectorize] Permit vectorisation of more select(cmp(), X, Y) reduction patterns
This patch adds further support for vectorisation of loops that involve selecting an integer value based on a previous comparison. Consider the following C++ loop:
  int r = a;
  for (int i = 0; i < n; i++) {
    if (src[i] > 3) {
      r = b;
    }
    src[i] += 2;
  }
We should be able to vectorise this loop because all we are doing is selecting between two states - 'a' and 'b' - both of which are loop invariant. This just involves building a vector of values that contain either 'a' or 'b', where the final reduced value will be 'b' if any lane contains 'b'.
The IR generated by clang typically looks like this:
  %phi = phi i32 [ %a, %entry ], [ %phi.update, %for.body ]
  ...
  %pred = icmp ugt i32 %val, 3
  %phi.update = select i1 %pred, i32 %b, i32 %phi
We already detect min/max patterns, which also involve a select + cmp. However, with the min/max patterns we are selecting loaded values (and hence loop variant) in the loop. In addition we only support certain cmp predicates. This patch adds a new pattern matching function (isSelectCmpPattern) and new RecurKind enums - SelectICmp & SelectFCmp. We only support selecting values that are integer and loop invariant, however we can support any kind of compare - integer or float.
Tests have been added here:
  Transforms/LoopVectorize/AArch64/sve-select-cmp.ll
  Transforms/LoopVectorize/select-cmp-predicated.ll
  Transforms/LoopVectorize/select-cmp.ll
Differential Revision: https://reviews.llvm.org/D108136
|
| #
61cc873a |
| 15-Sep-2021 |
David Green <[email protected]> |
[LV] Recognize intrinsic min/max reductions
This extends the reduction logic in the vectorizer to handle intrinsic versions of min and max, both the floating point variants already created by instcombine under fastmath and the integer variants from D98152.
As a bonus this allows us to match a chain of min or max operations into a single reduction, similar to how add/mul/etc work.
Differential Revision: https://reviews.llvm.org/D109645
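A chained max reduction of the kind mentioned above could come from source like the following sketch, where two `std::max` calls per iteration feed one running maximum. Whether a given compiler represents these calls as min/max intrinsics before vectorization is an assumption of the example; the function name `maxOfBoth` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Running maximum over two arrays: the two max operations per iteration form
// a chain that ends in a single reduction value.
int maxOfBoth(const int *A, const int *B, std::size_t N, int Init) {
  assert((A != nullptr && B != nullptr) || N == 0);
  int M = Init;
  for (std::size_t I = 0; I < N; ++I)
    M = std::max(std::max(M, A[I]), B[I]);
  return M;
}
```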
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3 |
|
| #
02f74ead |
| 23-Jun-2021 |
Nikita Popov <[email protected]> |
[IVDescriptors] Make pointer inductions compatible with opaque pointers
Store the used element type in the InductionDescriptor. For typed pointers, it remains the pointer element type. For opaque pointers, we always use an i8 element type, such that the step is a simple offset.
A previous version of this patch instead tried to guess the element type from an induction GEP, but this is not reliable, as the GEP may be hidden (see @both in iv_outside_user.ll).
Differential Revision: https://reviews.llvm.org/D104795
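The "step as a simple offset" view can be pictured at the source level by walking an array through a byte pointer, as in the sketch below. The function name is hypothetical and the code is only an analogy for an i8-based pointer induction: the pointer advances by a byte count (`sizeof(int)`) each iteration instead of by one typed element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Sums an int array by advancing a byte pointer with a byte-sized step.
int sumViaByteSteps(const int *Src, std::size_t N) {
  assert(Src != nullptr || N == 0);
  const unsigned char *P = reinterpret_cast<const unsigned char *>(Src);
  const std::size_t StepBytes = sizeof(int); // the step expressed in bytes
  int Sum = 0;
  for (std::size_t I = 0; I < N; ++I) {
    int V;
    std::memcpy(&V, P, sizeof V); // read one element at the current offset
    Sum += V;
    P += StepBytes; // pointer induction: plain byte offset
  }
  return Sum;
}
```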
|