|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
bfb9b8e0 |
| 25-Jul-2022 |
Sanjay Patel <[email protected]> |
[Passes] add a tail-call-elim pass near the end of the opt pipeline
We call tail-call-elim near the beginning of the pipeline, but that is too early to annotate calls that get added later.
In the m
[Passes] add a tail-call-elim pass near the end of the opt pipeline
We call tail-call-elim near the beginning of the pipeline, but that is too early to annotate calls that get added later.
In the motivating case from issue #47852, the missing 'tail' on memset leads to sub-optimal codegen.
I experimented with removing the early instance of tail-call-elim instead of just adding another pass, but that appears to be slightly worse for compile-time: +0.15% vs. +0.08% time. "tailcall" shows adding the pass; "tailcall2" shows moving the pass to later, then adding the original early pass back (so 1596886802 is functionally equivalent to 180b0439dc ): https://llvm-compile-time-tracker.com/index.php?config=NewPM-O3&stat=instructions&remote=rotateright
Note that there was an effort to split the tail call functionality into 2 passes - that could help reduce compile-time if we find that this change costs more in compile-time than expected based on the preliminary testing: D60031
Differential Revision: https://reviews.llvm.org/D130374
show more ...
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
37ead201 |
| 12-Nov-2021 |
Philip Reames <[email protected]> |
[runtime-unroll] Use incrementing IVs instead of decrementing ones
This is one of those wonderful "in theory X doesn't matter, but in practice is does" changes. In this particular case, we shift the
[runtime-unroll] Use incrementing IVs instead of decrementing ones
This is one of those wonderful "in theory X doesn't matter, but in practice is does" changes. In this particular case, we shift the IVs inserted by the runtime unroller to clamp iteration count of the loops* from decrementing to incrementing.
Why does this matter? A couple of reasons: * SCEV doesn't have a native subtract node. Instead, all subtracts (A - B) are represented as A + -1 * B and drops any flags invalidated by such. As a result, SCEV is slightly less good at reasoning about edge cases involving decrementing addrecs than incrementing ones. (You can see this in the inferred flags in some of the test cases.) * Other parts of the optimizer produce incrementing IVs, and they're common in idiomatic source language. We do have support for reversing IVs, but in general if we produce one of each, the pair will persist surprisingly far through the optimizer before being coalesced. (You can see this looking at nearby phis in the test cases.)
Note that if the hardware prefers decrementing (i.e. zero tested) loops, LSR should convert back immediately before codegen.
* Mostly irrelevant detail: The main loop of the prolog case is handled independently and will simple use the original IV with a changed start value. We could in theory use this scheme for all iteration clamping, but that's a larger and more invasive change.
show more ...
|
| #
15fefcb9 |
| 18-Oct-2021 |
Arthur Eubanks <[email protected]> |
[opt] Directly translate -O# to -passes='default<O#>'
Right now when we see -O# we add the corresponding 'default<O#>' into the list of passes to run when translating legacy -pass-name. This has the
[opt] Directly translate -O# to -passes='default<O#>'
Right now when we see -O# we add the corresponding 'default<O#>' into the list of passes to run when translating legacy -pass-name. This has the side effect of not using the default AA pipeline.
Instead, treat -O# as -passes='default<O#>', but don't allow any other -passes or -pass-name. I think we can keep `opt -O#` as shorthand for `opt -passes='default<O#>` but disallow anything more than just -O#.
Tests need to be updated to not use `opt -O# -pass-name`.
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D112036
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
23c2f2e6 |
| 07-Jun-2021 |
Florian Hahn <[email protected]> |
[LV] Mark increment of main vector loop induction variable as NUW.
This patch marks the induction increment of the main induction variable of the vector loop as NUW when not folding the tail.
If th
[LV] Mark increment of main vector loop induction variable as NUW.
This patch marks the induction increment of the main induction variable of the vector loop as NUW when not folding the tail.
If the tail is not folded, we know that End - Start >= Step (either statically or through the minimum iteration checks). We also know that both Start % Step == 0 and End % Step == 0. We exit the vector loop if %IV + %Step == %End. Hence we must exit the loop before %IV + %Step unsigned overflows and we can mark the induction increment as NUW.
This should make SCEV return more precise bounds for the created vector loops, used by later optimizations, like late unrolling.
At the moment quite a few tests still need to be updated, but before doing so I'd like to get initial feedback to make sure I am not missing anything.
Note that this could probably be further improved by using information from the original IV.
Attempt of modeling of the assumption in Alive2: https://alive2.llvm.org/ce/z/H_DL_g
Part of a set of fixes required for PR50412.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D103255
show more ...
|
|
Revision tags: llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
c8893f3b |
| 18-Mar-2021 |
Sanjay Patel <[email protected]> |
[LoopVectorize] relax FMF constraint for FP induction
This makes the induction part of the loop vectorizer match the reduction part. We do not need all of the fast-math-flags. For example, there are
[LoopVectorize] relax FMF constraint for FP induction
This makes the induction part of the loop vectorizer match the reduction part. We do not need all of the fast-math-flags. For example, there are some that clearly are not in play like arcp or afn.
If we want to make FMF constraints consistent across the IR optimizer, we might want to add nsz too, but that's up for debate (users can't expect associative FP math and preservation of sign-of-zero at the same time?).
The calling code was fixed to avoid miscompiles with: 1bee549737ac
Differential Revision: https://reviews.llvm.org/D98708
show more ...
|
| #
d2eae990 |
| 16-Mar-2021 |
Sanjay Patel <[email protected]> |
[LoopVectorize] add FP induction test with minimal FMF; NFC
|
|
Revision tags: llvmorg-12.0.0-rc3 |
|
| #
b46c085d |
| 26-Feb-2021 |
Roman Lebedev <[email protected]> |
[NFCI] SCEVExpander: emit intrinsics for integral {u,s}{min,max} SCEV expressions
These intrinsics, not the icmp+select are the canonical form nowadays, so we might as well directly emit them.
This
[NFCI] SCEVExpander: emit intrinsics for integral {u,s}{min,max} SCEV expressions
These intrinsics, not the icmp+select are the canonical form nowadays, so we might as well directly emit them.
This should not cause any regressions, but if it does, then then they would needed to be fixed regardless.
Note that this doesn't deal with `SCEVExpander::isHighCostExpansion()`, but that is a pessimization, not a correctness issue.
Additionally, the non-intrinsic form has issues with undef, see https://reviews.llvm.org/D88287#2587863
show more ...
|
|
Revision tags: llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1 |
|
| #
ab93c18c |
| 27-Jan-2021 |
Sanjay Patel <[email protected]> |
[LoopVectorize] use IR fast-math-flags exclusively (not FP function attributes)
I am trying to untangle the fast-math-flags propagation logic in the vectorizers (see a6f022127 for SLP).
The loop ve
[LoopVectorize] use IR fast-math-flags exclusively (not FP function attributes)
I am trying to untangle the fast-math-flags propagation logic in the vectorizers (see a6f022127 for SLP).
The loop vectorizer has a mix of checking FP function attributes, IR-level FMF, and just wrong assumptions.
I am trying to avoid regressions while fixing this, and I think the IR-level logic is good enough for that, but it's hard to say for sure. This would be the 1st step in the clean-up.
The existing test that I changed to include 'fast' actually shows a miscompile: the function only had the equivalent of nnan, but we created new instructions that had fast (all FMF set). This is similar to the example in https://llvm.org/PR35538
Differential Revision: https://reviews.llvm.org/D95452
show more ...
|
|
Revision tags: llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
| #
5cce4aff |
| 16-Dec-2020 |
Roman Lebedev <[email protected]> |
[SimplifyCFG] TryToSimplifyUncondBranchFromEmptyBlock() already knows how to preserve DomTree
... so just ensure that we pass DomTreeUpdater it into it.
Fixes DomTree preservation for a large numbe
[SimplifyCFG] TryToSimplifyUncondBranchFromEmptyBlock() already knows how to preserve DomTree
... so just ensure that we pass DomTreeUpdater it into it.
Fixes DomTree preservation for a large number of tests, all of which are marked as such so that they do not regress.
show more ...
|
|
Revision tags: llvmorg-11.0.1-rc1 |
|
| #
4e68bc09 |
| 16-Nov-2020 |
Sanjay Patel <[email protected]> |
Revert "[InstCombine] add multi-use demanded bits fold for add with low-bit mask"
This reverts commit e56103d25016c9ce4e98f652ac1a09379793ccf5. There is a stage2 msan failure blamed on this commit:
Revert "[InstCombine] add multi-use demanded bits fold for add with low-bit mask"
This reverts commit e56103d25016c9ce4e98f652ac1a09379793ccf5. There is a stage2 msan failure blamed on this commit: http://lab.llvm.org:8011/#/builders/74/builds/888/steps/9/logs/stdio
show more ...
|
| #
e56103d2 |
| 15-Nov-2020 |
Sanjay Patel <[email protected]> |
[InstCombine] add multi-use demanded bits fold for add with low-bit mask
I noticed an add example like the one from D91343, so here's a similar patch. The logic is based on existing code for the sin
[InstCombine] add multi-use demanded bits fold for add with low-bit mask
I noticed an add example like the one from D91343, so here's a similar patch. The logic is based on existing code for the single-use demanded bits fold. But I only matched a constant instead of using compute known bits on the operands because that was the motivating patterni that I noticed.
I think this will allow removing a special-case (but incomplete) dedicated fold within visitAnd(), but I need to untangle the existing code to be sure.
https://rise4fun.com/Alive/V6fP
Name: add with low mask Pre: (C1 & (-1 u>> countLeadingZeros(C2))) == 0 %a = add i8 %x, C1 %r = and i8 %a, C2 => %r = and i8 %x, C2
Differential Revision: https://reviews.llvm.org/D91415
show more ...
|
| #
9e0c3565 |
| 12-Nov-2020 |
Sanjay Patel <[email protected]> |
[LoopVectorize] regenerate test checks; NFC
|
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4 |
|
| #
1badf7c3 |
| 06-Mar-2020 |
Roman Lebedev <[email protected]> |
[InstComine] Forego of one-use check in `(X - (X & Y)) --> (X & ~Y)` if Y is a constant
Summary: This is potentially more friendly for further optimizations, analysies, e.g.: https://godbolt.org
[InstComine] Forego of one-use check in `(X - (X & Y)) --> (X & ~Y)` if Y is a constant
Summary: This is potentially more friendly for further optimizations, analysies, e.g.: https://godbolt.org/z/G24anE
This resolves phase-ordering bug that was introduced in D75145 for https://godbolt.org/z/2gBwF2 https://godbolt.org/z/XvgSua
Reviewers: spatel, nikic, dmgreen, xbolva00
Reviewed By: nikic, xbolva00
Subscribers: hiraditya, zzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75757
show more ...
|
|
Revision tags: llvmorg-10.0.0-rc3 |
|
| #
d6f47aeb |
| 25-Feb-2020 |
Roman Lebedev <[email protected]> |
[SCEV] SCEVExpander::isHighCostExpansionHelper(): cost-model min/max (PR44668)
Summary: Previosly we simply always said that `SCEVMinMaxExpr` is too costly to expand. But this isn't really true, it
[SCEV] SCEVExpander::isHighCostExpansionHelper(): cost-model min/max (PR44668)
Summary: Previosly we simply always said that `SCEVMinMaxExpr` is too costly to expand. But this isn't really true, it expands into just a comparison+swap pair. And again much like with add/mul, there will be one less such pair than the number of operands. And we need to count the cost of operands themselves.
This does change a number of testcases, and as far as i can tell, all of these changes are improvements, in the sense that we fixed up more latches to do the [in]equality comparison.
This concludes cost-modelling changes, no other SCEV expressions exist as of now.
This is a part of addressing [[ https://bugs.llvm.org/show_bug.cgi?id=44668 | PR44668 ]].
Reviewers: reames, mkazantsev, wmi, sanjoy
Reviewed By: mkazantsev
Subscribers: hiraditya, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73744
show more ...
|
|
Revision tags: llvmorg-10.0.0-rc2 |
|
| #
3bd33ccf |
| 12-Feb-2020 |
Roman Lebedev <[email protected]> |
[NFC?][SCEV][LoopVectorize] Add datalayout to the X86/float-induction-x86.ll test
Summary: Currently, `SCEVExpander::isHighCostExpansionHelper()` has the following logic: ``` if (auto *UDivExpr =
[NFC?][SCEV][LoopVectorize] Add datalayout to the X86/float-induction-x86.ll test
Summary: Currently, `SCEVExpander::isHighCostExpansionHelper()` has the following logic: ``` if (auto *UDivExpr = dyn_cast<SCEVUDivExpr>(S)) { // If the divisor is a power of two and the SCEV type fits in a native // integer (and the LHS not expensive), consider the division cheap // irrespective of whether it occurs in the user code since it can be // lowered into a right shift. if (auto *SC = dyn_cast<SCEVConstant>(UDivExpr->getRHS())) if (SC->getAPInt().isPowerOf2()) { if (isHighCostExpansionHelper(UDivExpr->getLHS(), L, At, BudgetRemaining, TTI, Processed)) return true; const DataLayout &DL = L->getHeader()->getParent()->getParent()->getDataLayout(); unsigned Width = cast<IntegerType>(UDivExpr->getType())->getBitWidth(); return DL.isIllegalInteger(Width); } ```
Since this test does not have a datalayout specified, `SCEVExpander::isHighCostExpansionHelper()` says that `[[TMP2:%.*]] = lshr exact i64 [[TMP1]], 5` is high-cost, and didn't perform it.
But future patches will change that logic to solely rely on cost-model, without any such datalayout checks, so i think it is best to show that that change is ephemeral, and can already happen without costmodel changes.
Reviewers: reames, fhahn, sanjoy, craig.topper, RKSimon
Reviewed By: RKSimon
Subscribers: javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73717
show more ...
|
|
Revision tags: llvmorg-10.0.0-rc1 |
|
| #
7bca4a28 |
| 27-Jan-2020 |
Roman Lebedev <[email protected]> |
[NFC][LoopVectorize] Autogenerate tests affected by isHighCostExpansionHelper() cost modelling (PR44668)
|
|
Revision tags: llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1 |
|
| #
cee313d2 |
| 17-Apr-2019 |
Eric Christopher <[email protected]> |
Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.
Will be re-reverting again.
llvm-svn: 358552
|
|
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1 |
|
| #
b0491731 |
| 28-Oct-2017 |
Sanjay Patel <[email protected]> |
[SimplifyCFG] use pass options and remove the latesimplifycfg pass
This is no-functional-change-intended.
This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and D35411
[SimplifyCFG] use pass options and remove the latesimplifycfg pass
This is no-functional-change-intended.
This is repackaging the functionality of D30333 (defer switch-to-lookup-tables) and D35411 (defer folding unconditional branches) with pass parameters rather than a named "latesimplifycfg" pass. Now that we have individual options to control the functionality, we could decouple when these fire (but that's an independent patch if desired).
The next planned step would be to add another option bit to disable the sinking transform mentioned in D38566. This should also make it clear that the new pass manager needs to be updated to limit simplifycfg in the same way as the old pass manager.
Differential Revision: https://reviews.llvm.org/D38631
llvm-svn: 316835
show more ...
|
|
Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1 |
|
| #
b05a5578 |
| 19-Jul-2017 |
Balaram Makam <[email protected]> |
[SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure.
Summary: When simplifying unconditional branches from empty blocks, we pre-test if t
[SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure.
Summary: When simplifying unconditional branches from empty blocks, we pre-test if the BB belongs to a set of loop headers and keep the block to prevent passes from destroying canonical loop structure. However, the current algorithm fails if the destination of the branch is a loop header. Especially when such a loop's latch block is folded into loop header it results in additional backedges and LoopSimplify turns it into a nested loop which prevent later optimizations from being applied (e.g., loop unrolling and loop interleaving).
This patch augments the existing algorithm by further checking if the destination of the branch belongs to a set of loop headers and defer eliminating it if yes to LateSimplifyCFG.
Fixes PR33605: https://bugs.llvm.org/show_bug.cgi?id=33605
Reviewers: efriedma, mcrosier, pacxx, hsung, davidxl
Reviewed By: efriedma
Subscribers: ashutosh.nema, gberry, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D35411
llvm-svn: 308422
show more ...
|
| #
8c452d76 |
| 19-Jul-2017 |
Ayal Zaks <[email protected]> |
[LV] Test once if vector trip count is zero, instead of twice
Generate a single test to decide if there are enough iterations to jump to the vectorized loop, or else go to the scalar remainder loop.
[LV] Test once if vector trip count is zero, instead of twice
Generate a single test to decide if there are enough iterations to jump to the vectorized loop, or else go to the scalar remainder loop. This test compares the Scalar Trip Count: if STC < VF * UF go to the scalar loop. If requiresScalarEpilogue() holds, at-least one iteration must remain scalar; the rest can be used to form vector iterations. So in this case the test checks instead if (STC - 1) < VF * UF by comparing STC <= VF * UF, and going to the scalar loop if so. Otherwise the vector loop is entered for at-least one vector iteration.
This test covers the case where incrementing the backedge-taken count will overflow leading to an incorrect trip count of zero. In this (rare) case we will also avoid the vector loop and jump to the scalar loop.
This patch simplifies the existing tests and effectively removes the basic-block originally named "min.iters.checked", leaving the single test in block "vector.ph".
Original observation and initial patch by Evgeny Stupachenko.
Differential Revision: https://reviews.llvm.org/D34150
llvm-svn: 308421
show more ...
|
|
Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1 |
|
| #
9eed0bee |
| 26-Apr-2017 |
Matthew Simpson <[email protected]> |
[LV] Handle external uses of floating-point induction variables
Reference: https://bugs.llvm.org/show_bug.cgi?id=32758 Differential Revision: https://reviews.llvm.org/D32445
llvm-svn: 301428
|
|
Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1, llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1 |
|
| #
376a18bd |
| 24-Jul-2016 |
Elena Demikhovsky <[email protected]> |
[Loop Vectorizer] Handling loops FP induction variables.
Allowed loop vectorization with secondary FP IVs. Like this: float *A; float x = init; for (int i=0; i < N; ++i) { A[i] = x; x -= fp_inc;
[Loop Vectorizer] Handling loops FP induction variables.
Allowed loop vectorization with secondary FP IVs. Like this: float *A; float x = init; for (int i=0; i < N; ++i) { A[i] = x; x -= fp_inc; }
The auto-vectorization is possible when the induction binary operator is "fast" or the function has "unsafe" attribute.
Differential Revision: https://reviews.llvm.org/D21330
llvm-svn: 276554
show more ...
|