|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
872f7000 |
| 03-Apr-2022 |
Dávid Bolvanský <[email protected]> |
Revert "[NFCI] Regenerate SROA/LoopVectorize test checks"
This reverts commit 14e3450fb57305aa9ff3e9e60687b458e43835c9.
|
| #
a113a582 |
| 03-Apr-2022 |
Dávid Bolvanský <[email protected]> |
[NFCI] Regenerate LoopVectorize test checks
|
|
Revision tags: llvmorg-14.0.0 |
|
| #
95f76bff |
| 13-Mar-2022 |
Florian Hahn <[email protected]> |
[LV] Create & use VPScalarIVSteps for all scalar users.
This patch is a follow-up to D115953. It updates optimizeInductions to also introduce new VPScalarIVStepsRecipes if an IV has both vector and
[LV] Create & use VPScalarIVSteps for all scalar users.
This patch is a follow-up to D115953. It updates optimizeInductions to also introduce new VPScalarIVStepsRecipes if an IV has both vector and scalar uses.
It updates all uses that only need scalar values to use the newly created recipe for the scalar steps.
This completes untangling of VPWidenIntOrFpInductionRecipe code-generation. Now the recipe *only* creates the widened vector values, as it says on the tin.
The code to genereate IR has been moved directly to VPWidenIntOrFpInductionRecipe::execute.
Note that the recipe has been updated to hold a reference to ScalarEvolution, which is needed to expand the step, until we can place the corresponding SCEV expansion in the pre-header.
Depends on D120827.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D120828
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
b3e8ace1 |
| 28-Feb-2022 |
Florian Hahn <[email protected]> |
Recommit "[VPlan] Introduce recipe to build scalar steps."
This reverts the revert commit ff93260bf6bddfbad1fa65c4d5184988885b900f.
The underlying issue causing the PPC bot failures has been fixed
Recommit "[VPlan] Introduce recipe to build scalar steps."
This reverts the revert commit ff93260bf6bddfbad1fa65c4d5184988885b900f.
The underlying issue causing the PPC bot failures has been fixed in cbaac1473403 and a corresponding test case has been added in ad2cad1c521c.
Original message:
This patch adds a new VPScalarIVStepsRecipe to handle building scalar steps.
In the first patch, it only handles the case where there is no vector induction variable needed.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D115953
show more ...
|
| #
ff93260b |
| 27-Feb-2022 |
Florian Hahn <[email protected]> |
Revert "[VPlan] Introduce recipe to build scalar steps."
This reverts commit 49b23f451cf713036c99573a35daed308d2ac894.
This appears to break some PPC build bots. Revert while I investigate.
|
| #
49b23f45 |
| 27-Feb-2022 |
Florian Hahn <[email protected]> |
[VPlan] Introduce recipe to build scalar steps.
This patch adds a new VPScalarIVStepsRecipe to handle building scalar steps.
In the first patch, it only handles the case where there is no vector in
[VPlan] Introduce recipe to build scalar steps.
This patch adds a new VPScalarIVStepsRecipe to handle building scalar steps.
In the first patch, it only handles the case where there is no vector induction variable needed.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D115953
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3 |
|
| #
42b34fac |
| 14-Jan-2022 |
Florian Hahn <[email protected]> |
Recommit "[LV] Inline CreateSplatIV call for scalar VFs."
This reverts the revert commit 073c27b5e5851f13d99d383e047309299b68827d.
A reduced test case has been added in 5e4966cbae7ba5 and the code
Recommit "[LV] Inline CreateSplatIV call for scalar VFs."
This reverts the revert commit 073c27b5e5851f13d99d383e047309299b68827d.
A reduced test case has been added in 5e4966cbae7ba5 and the code has been updated to handle the case where getInductionOpcode returns BinaryOpsEnd. In this case, the original code was always using Instruction::Add. Do the same in the patch.
Note this commit may slightly change the value naming, because it now also assigns the 'induction' name in the floating point case.
show more ...
|
| #
073c27b5 |
| 14-Jan-2022 |
James Y Knight <[email protected]> |
Revert "[LV] Inline CreateSplatIV call for scalar VFs (NFC)."
Causes a crash with the following (creduce'd) test-case:
clang -O3 '--target=aarch64-grtev4-linux-gnu' -xc - -c -o /dev/null <<EOF int
Revert "[LV] Inline CreateSplatIV call for scalar VFs (NFC)."
Causes a crash with the following (creduce'd) test-case:
clang -O3 '--target=aarch64-grtev4-linux-gnu' -xc - -c -o /dev/null <<EOF int *e; int f; int g() { int h; int *j = 0; while (&f - j > 0) { int k; k = j; if (e == j && *e) k = 5; h = k; j++; } return h; } EOF
This reverts commit 7ce48be0fd83fb4fe3d0104f324bbbcfcc82983c.
show more ...
|
| #
7ce48be0 |
| 13-Jan-2022 |
Florian Hahn <[email protected]> |
[LV] Inline CreateSplatIV call for scalar VFs (NFC).
This is a NFC change split off from D116123, as suggested there. D116123 will remove the last user of CreateSplatIV.
|
|
Revision tags: llvmorg-13.0.1-rc2 |
|
| #
e6ad9ef4 |
| 14-Dec-2021 |
Philip Reames <[email protected]> |
[instcombine] Canonicalize constant index type to i64 for extractelement/insertelement
The basic idea to this is that a) having a single canonical type makes CSE easier, and b) many of our transform
[instcombine] Canonicalize constant index type to i64 for extractelement/insertelement
The basic idea to this is that a) having a single canonical type makes CSE easier, and b) many of our transforms are inconsistent about which types we end up with based on visit order.
I'm restricting this to constants as for non-constants, we'd have to decide whether the simplicity was worth extra instructions. For constants, there are no extra instructions.
We chose the canonical type as i64 arbitrarily. We might consider changing this to something else in the future if we have cause.
Differential Revision: https://reviews.llvm.org/D115387
show more ...
|
| #
eb052f6b |
| 13-Dec-2021 |
Philip Reames <[email protected]> |
Reapply: Autogen more vectorizer tests in advance of D115387.
Drop changes to consecutive-ptr-uniforms.ll since that test checks boths IR output and debug messages. I'd missed this in the original
Reapply: Autogen more vectorizer tests in advance of D115387.
Drop changes to consecutive-ptr-uniforms.ll since that test checks boths IR output and debug messages. I'd missed this in the original commit, and Florian pointed it out in post-commit review.
Original commit message:
These are the ones my first round of scripting couldn't handle that required a bit of manual messaging. This should be the last batch in llvm-check.
This reverts commit bbba86764ae8f9365a1a3908c50eb54698b2b203.
show more ...
|
| #
bbba8676 |
| 13-Dec-2021 |
Philip Reames <[email protected]> |
Revert "Autogen more vectorizer tests in advance of D115387."
This reverts commit bbfaf0b170b6070e08f1dc22419dfedc75b9a0fe.
Post commit review noted a case where my manual update lost intentional c
Revert "Autogen more vectorizer tests in advance of D115387."
This reverts commit bbfaf0b170b6070e08f1dc22419dfedc75b9a0fe.
Post commit review noted a case where my manual update lost intentional check lines. Given I've abandoned the motivating patch, I'm just reverting the autogen prep.
show more ...
|
| #
bbfaf0b1 |
| 13-Dec-2021 |
Philip Reames <[email protected]> |
Autogen more vectorizer tests in advance of D115387.
These are the ones my first round of scripting couldn't handle that required a bit of manual messaging. This should be the last batch in llvm-ch
Autogen more vectorizer tests in advance of D115387.
These are the ones my first round of scripting couldn't handle that required a bit of manual messaging. This should be the last batch in llvm-check.
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
e90d55e1 |
| 15-Sep-2021 |
Florian Hahn <[email protected]> |
[VPlan] Support sinking recipes with uniform users outside sink target.
This is a first step towards addressing the last remaining limitation of the VPlan version of sinkScalarOperands: the legacy v
[VPlan] Support sinking recipes with uniform users outside sink target.
This is a first step towards addressing the last remaining limitation of the VPlan version of sinkScalarOperands: the legacy version can partially sink operands. For example, if a GEP has uniform users outside the sink target block, then the legacy version will sink all scalar GEPs, other than the one for lane 0.
This patch works towards addressing this case in the VPlan version by detecting such cases and duplicating the sink candidate. All users outside of the sink target will be updated to use the uniform clone.
Note that this highlights an issue with VPValue naming. If we duplicate a replicate recipe, they will share the same underlying IR value and both VPValues will have the same name ir<%gep>.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D104254
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
23c2f2e6 |
| 07-Jun-2021 |
Florian Hahn <[email protected]> |
[LV] Mark increment of main vector loop induction variable as NUW.
This patch marks the induction increment of the main induction variable of the vector loop as NUW when not folding the tail.
If th
[LV] Mark increment of main vector loop induction variable as NUW.
This patch marks the induction increment of the main induction variable of the vector loop as NUW when not folding the tail.
If the tail is not folded, we know that End - Start >= Step (either statically or through the minimum iteration checks). We also know that both Start % Step == 0 and End % Step == 0. We exit the vector loop if %IV + %Step == %End. Hence we must exit the loop before %IV + %Step unsigned overflows and we can mark the induction increment as NUW.
This should make SCEV return more precise bounds for the created vector loops, used by later optimizations, like late unrolling.
At the moment quite a few tests still need to be updated, but before doing so I'd like to get initial feedback to make sure I am not missing anything.
Note that this could probably be further improved by using information from the original IV.
Attempt of modeling of the assumption in Alive2: https://alive2.llvm.org/ce/z/H_DL_g
Part of a set of fixes required for PR50412.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D103255
show more ...
|
|
Revision tags: llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
c8893f3b |
| 18-Mar-2021 |
Sanjay Patel <[email protected]> |
[LoopVectorize] relax FMF constraint for FP induction
This makes the induction part of the loop vectorizer match the reduction part. We do not need all of the fast-math-flags. For example, there are
[LoopVectorize] relax FMF constraint for FP induction
This makes the induction part of the loop vectorizer match the reduction part. We do not need all of the fast-math-flags. For example, there are some that clearly are not in play like arcp or afn.
If we want to make FMF constraints consistent across the IR optimizer, we might want to add nsz too, but that's up for debate (users can't expect associative FP math and preservation of sign-of-zero at the same time?).
The calling code was fixed to avoid miscompiles with: 1bee549737ac
Differential Revision: https://reviews.llvm.org/D98708
show more ...
|
|
Revision tags: llvmorg-12.0.0-rc3 |
|
| #
1bee5497 |
| 04-Mar-2021 |
Sanjay Patel <[email protected]> |
[LoopVectorize] propagate fast-math-flags from induction instructions
This code assumed that FP math was only permissable if it was fully "fast", so it hard-coded "fast" when creating new instructio
[LoopVectorize] propagate fast-math-flags from induction instructions
This code assumed that FP math was only permissable if it was fully "fast", so it hard-coded "fast" when creating new instructions.
The underlying code already allows matching recurrences/reductions that are only "reassoc", so this change should prevent the potential miscompile seen in the test diffs (we created "fast" ops even though none existed in the original code).
I don't know if we need to create the temporary IRBuilder objects used here, so that could be follow-up clean-up.
There's an open question about whether we should require "nsz" in addition to "reassoc" here. InstCombine uses that combo for its reassociative folds, but I think codegen is not as strict.
show more ...
|
| #
36a489d1 |
| 04-Mar-2021 |
Sanjay Patel <[email protected]> |
[Analysis][LoopVectorize] rename "Unsafe" variables/methods; NFC
Similar to b3a33553aec7, but this shows a TODO and a potential miscompile is already present.
We are tracking an FP instruction that
[Analysis][LoopVectorize] rename "Unsafe" variables/methods; NFC
Similar to b3a33553aec7, but this shows a TODO and a potential miscompile is already present.
We are tracking an FP instruction that does *not* have FMF (reassoc) properties, so calling that "Unsafe" seems opposite of the common reading.
I also removed one getter method by rolling the null check into the access. Further simplification may be possible.
The motivation is to clean up the interactions between FMF and function-level attributes in these classes and their callers.
The new test shows that there is an existing bug somewhere in the callers. We assumed that the original code was fully 'fast' and so we produced IR with 'fast' even though it was just 'reassoc'.
show more ...
|
|
Revision tags: llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1 |
|
| #
278aa65c |
| 24-Dec-2020 |
Juneyoung Lee <[email protected]> |
[IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder
This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder.
Reviewed By: n
[IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder
This patch updates IRBuilder to create insertelement/shufflevector using poison as a placeholder.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D93793
show more ...
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
| #
5cce4aff |
| 16-Dec-2020 |
Roman Lebedev <[email protected]> |
[SimplifyCFG] TryToSimplifyUncondBranchFromEmptyBlock() already knows how to preserve DomTree
... so just ensure that we pass DomTreeUpdater it into it.
Fixes DomTree preservation for a large numbe
[SimplifyCFG] TryToSimplifyUncondBranchFromEmptyBlock() already knows how to preserve DomTree
... so just ensure that we pass DomTreeUpdater it into it.
Fixes DomTree preservation for a large number of tests, all of which are marked as such so that they do not regress.
show more ...
|
|
Revision tags: llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1 |
|
| #
cee313d2 |
| 17-Apr-2019 |
Eric Christopher <[email protected]> |
Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.
Will be re-reverting again.
llvm-svn: 358552
|
|
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2 |
|
| #
9e4bbe80 |
| 01-May-2018 |
Daniel Neilson <[email protected]> |
[LV] Preserve inbounds on created GEPs
Summary: This is a fix for PR23997.
The loop vectorizer is not preserving the inbounds property of GEPs that it creates. This is inhibiting some optimizations
[LV] Preserve inbounds on created GEPs
Summary: This is a fix for PR23997.
The loop vectorizer is not preserving the inbounds property of GEPs that it creates. This is inhibiting some optimizations. This patch preserves the inbounds property in the case where a load/store is being fed by an inbounds GEP.
Reviewers: mkuper, javed.absar, hsaito
Reviewed By: hsaito
Subscribers: dcaballe, hsaito, llvm-commits
Differential Revision: https://reviews.llvm.org/D46191
llvm-svn: 331269
show more ...
|
|
Revision tags: llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1 |
|
| #
b0491731 |
| 28-Oct-2017 |
Sanjay Patel <[email protected]> |
[SimplifyCFG] use pass options and remove the latesimplifycfg pass
This is no-functional-change-intended.
This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and D35411
[SimplifyCFG] use pass options and remove the latesimplifycfg pass
This is no-functional-change-intended.
This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and D35411 (defer folding unconditional branches) with pass parameters rather than a named "latesimplifycfg" pass. Now that we have individual options to control the functionality, we could decouple when these fire (but that's an independent patch if desired).
The next planned step would be to add another option bit to disable the sinking transform mentioned in D38566. This should also make it clear that the new pass manager needs to be updated to limit simplifycfg in the same way as the old pass manager.
Differential Revision: https://reviews.llvm.org/D38631
llvm-svn: 316835
show more ...
|
|
Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1 |
|
| #
b05a5578 |
| 19-Jul-2017 |
Balaram Makam <[email protected]> |
[SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure.
Summary: When simplifying unconditional branches from empty blocks, we pre-test if t
[SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure.
Summary: When simplifying unconditional branches from empty blocks, we pre-test if the BB belongs to a set of loop headers and keep the block to prevent passes from destroying canonical loop structure. However, the current algorithm fails if the destination of the branch is a loop header. Especially when such a loop's latch block is folded into loop header it results in additional backedges and LoopSimplify turns it into a nested loop which prevent later optimizations from being applied (e.g., loop unrolling and loop interleaving).
This patch augments the existing algorithm by further checking if the destination of the branch belongs to a set of loop headers and defer eliminating it if yes to LateSimplifyCFG.
Fixes PR33605: https://bugs.llvm.org/show_bug.cgi?id=33605
Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl
Reviewed By: efriedma
Subscribers: ashutosh.nema, gberry, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D35411
llvm-svn: 308422
show more ...
|
| #
8c452d76 |
| 19-Jul-2017 |
Ayal Zaks <[email protected]> |
[LV] Test once if vector trip count is zero, instead of twice
Generate a single test to decide if there are enough iterations to jump to the vectorized loop, or else go to the scalar remainder loop.
[LV] Test once if vector trip count is zero, instead of twice
Generate a single test to decide if there are enough iterations to jump to the vectorized loop, or else go to the scalar remainder loop. This test compares the Scalar Trip Count: if STC < VF * UF go to the scalar loop. If requiresScalarEpilogue() holds, at-least one iteration must remain scalar; the rest can be used to form vector iterations. So in this case the test checks instead if (STC - 1) < VF * UF by comparing STC <= VF * UF, and going to the scalar loop if so. Otherwise the vector loop is entered for at-least one vector iteration.
This test covers the case where incrementing the backedge-taken count will overflow leading to an incorrect trip count of zero. In this (rare) case we will also avoid the vector loop and jump to the scalar loop.
This patch simplifies the existing tests and effectively removes the basic-block originally named "min.iters.checked", leaving the single test in block "vector.ph".
Original observation and initial patch by Evgeny Stupachenko.
Differential Revision: https://reviews.llvm.org/D34150
llvm-svn: 308421
show more ...
|