|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
a7938c74 |
| 26-Jun-2022 |
Kazu Hirata <[email protected]> |
[llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.
|
| #
3b7c3a65 |
| 25-Jun-2022 |
Kazu Hirata <[email protected]> |
Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3ac6c78ee8f67bf033976fc7d68bc6d.
|
| #
aa8feeef |
| 25-Jun-2022 |
Kazu Hirata <[email protected]> |
Don't use Optional::hasValue (NFC)
|
|
Revision tags: llvmorg-14.0.6 |
|
| #
cfc741bc |
| 20-Jun-2022 |
Florian Hahn <[email protected]> |
[LoopPeel] Forget SCEV for updated exit phi values.
LoopPeel adds new incoming values to exit phi nodes, which can change the SCEV for the phi after 20d798bd47ec51.
Forget SCEVs for such phis.
Fixes #56044.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D128164
|
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3 |
|
| #
39ce6888 |
| 26-Apr-2022 |
Igor Kudrin <[email protected]> |
[LoopPeel][NFCI] Simplify the code to calculate peel count for PGO
This reorganizes the code as a preparation for D123865:
* Use more descriptive names for variables
* Simplify a condition by using an already calculated value for `MaxPeelCount`
* Remove a duplicate log entry
* Report basic values for loop costs
Differential Revision: https://reviews.llvm.org/D124388
|
| #
c71890e1 |
| 26-Apr-2022 |
Igor Kudrin <[email protected]> |
[LoopPeel][NFC] Exit early if there is no room for peeling
Differential Revision: https://reviews.llvm.org/D123864
|
|
Revision tags: llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
2f6c1481 |
| 02-Mar-2022 |
Stephen Long <[email protected]> |
[LoopPeel] Add EXPENSIVE_CHECKS ifdef guard around domtree verify call
The verify call was taking 50% of the compile time in our internal LLVM fork when trying to unroll many loops.
Differential Revision: https://reviews.llvm.org/D113028
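The guard pattern this commit describes can be sketched as follows. This is a minimal illustration, not the code from the patch: `verifyTree`, `VerifyCalls`, and `afterPeel` are hypothetical stand-ins for the dominator-tree verification and its call site.

```cpp
#include <cassert>

// Hypothetical counter, used only to observe whether the check ran.
static int VerifyCalls = 0;

// Hypothetical stand-in for an expensive consistency check.
bool verifyTree() {
  ++VerifyCalls;
  return true;
}

// Hypothetical call site: the expensive check is compiled in only when
// the build defines EXPENSIVE_CHECKS.
void afterPeel() {
#ifdef EXPENSIVE_CHECKS
  assert(verifyTree() && "tree is inconsistent after peeling");
#endif
}
```

In a build without `EXPENSIVE_CHECKS`, calling `afterPeel()` never reaches `verifyTree()`, so the verification cost disappears from ordinary builds.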
|
|
Revision tags: llvmorg-14.0.0-rc2 |
|
| #
a494ae43 |
| 01-Mar-2022 |
serge-sans-paille <[email protected]> |
Cleanup includes: TransformsUtils
Estimation on the impact on preprocessor output:
before: 1065307662
after: 1064800684
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120741
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init |
|
| #
bc48a266 |
| 01-Feb-2022 |
Anna Thomas <[email protected]> |
[LoopPeel] Use reference instead of pointer for DT argument
Clean up code in the peelLoop API. We already use DT without guarding against a null DT, so this change constant-folds the remaining null DT checks. Also make the argument a reference so that it is clear the argument is a non-null DT. Extracted from D118472.
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3 |
|
| #
02d9a4d5 |
| 20-Jan-2022 |
Craig Topper <[email protected]> |
[LoopPeel] Pass TripCount to computePeelCount by value instead of by reference. NFC
The TripCount is not modified by the function so it doesn't need to be passed by reference. Verified by passing it as const reference before changing to value.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D117735
|
| #
1507786c |
| 20-Jan-2022 |
Craig Topper <[email protected]> |
[LoopPeeling] Fix stale comments. NFC
These comments were not updated when PeelingPreferences split from UnrollingPreferences.
|
|
Revision tags: llvmorg-13.0.1-rc2 |
|
| #
2b7be47b |
| 17-Dec-2021 |
Kazu Hirata <[email protected]> |
[llvm] Strip redundant lambda (NFC)
|
| #
2d31b025 |
| 09-Dec-2021 |
Philip Reames <[email protected]> |
Compute estimated trip counts for multiple exit loops
This change allows us to estimate the trip count from profile metadata for all multiple-exit loops. We still do the estimate only from the latch, but that's fine as it causes us to overestimate the trip count at worst.
Reviewing the uses of the API, all but one are cases where we restrict a loop transformation (unroll and vectorize, respectively) when we know the trip count is short enough. As a result, the change makes these passes strictly less aggressive. The test change illustrates a case where we'd previously have runtime unrolled a loop which ran fewer iterations than the unroll factor. This is definitely unprofitable.
The one case where an upper bound on the estimated trip count could drive a more aggressive transform is peeling, and I duplicated the logic being removed from the generic estimation there to keep it the same. The resulting heuristic makes no sense and should probably be immediately removed, but we can do that in a separate change.
This was noticed when analyzing regressions on D113939.
I plan to come back and incorporate estimated trip counts from other exits, but that's a minor improvement which can follow separately.
Differential Revision: https://reviews.llvm.org/D115362
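As a rough illustration of estimating a trip count from the latch's profile data, the sketch below divides the back-edge weight by the exit weight (rounded to nearest) and adds one for the final iteration. The function name and the exact rounding are assumptions made for illustration; this is not the LLVM API.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper: BackedgeWeight and ExitWeight stand for the two
// branch weights recorded on the latch terminator.
std::optional<uint64_t> estimateTripCount(uint64_t BackedgeWeight,
                                          uint64_t ExitWeight) {
  // Without any recorded exits there is nothing to base an estimate on.
  if (ExitWeight == 0)
    return std::nullopt;
  // Average number of back edges taken per loop entry, rounded to nearest.
  uint64_t BackedgeTaken = (BackedgeWeight + ExitWeight / 2) / ExitWeight;
  // The body runs once more than the back edge is taken.
  return BackedgeTaken + 1;
}
```

Because only the latch exit is counted, a loop that often leaves through another exit is seen as running longer than it does, which matches the overestimate the commit message accepts.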
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
cb728cb8 |
| 09-Nov-2021 |
Max Kazantsev <[email protected]> |
[NFC] Get rid of hardcoded magical constant and use Optionals instead
Refactor calculateIterationsToInvariance so that it doesn't need a magical constant to signify unknown answer.
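The benefit of an optional over a sentinel constant can be sketched with a small helper that combines two partial answers. `combineIterations` is a hypothetical function, and `std::optional` is used here as a stand-in for the `Optional` type the commit refers to.

```cpp
#include <algorithm>
#include <optional>

// Hypothetical helper: each operand is either a known iteration count or
// "unknown" (std::nullopt). The result is known only if both inputs are.
std::optional<unsigned> combineIterations(std::optional<unsigned> LHS,
                                          std::optional<unsigned> RHS) {
  if (!LHS || !RHS)
    return std::nullopt; // unknown propagates explicitly
  return std::max(*LHS, *RHS);
}
```

With a sentinel value, the same `std::max` would silently treat "unknown" as an ordinary number unless every caller remembered to test for it first.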
|
| #
e09958d5 |
| 02-Nov-2021 |
Dmitry Makogon <[email protected]> |
[LoopPeel] Peel loops with exits followed by an unreachable or deopt block
Added support for peeling loops with exits that are followed either by an unreachable-terminated block or by a block that has a terminating deoptimize call. All blocks in the sequence must have a unique successor, except possibly the last one.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D110922
|
| #
d4c74cd4 |
| 26-Oct-2021 |
Max Kazantsev <[email protected]> |
[NFC] [LoopPeel] Update IDoms of non-loop blocks dominated by the loop
When peeling a loop, we assume that the latch has a `br` terminator and that all loop exits are either terminated with an `unreachable` or have a terminating deoptimize call. So when we peel off the 1st iteration, we change the IDom of all loop exits to the peeled copy of `NCD(IDom(Exit), Latch)`. This works now, but if we add logic to support loops with exits that are followed by a block with an `unreachable` or a terminating deoptimize call, changing the exit's IDom wouldn't be enough and DT would be broken.
For example, let `Exit1` and `Exit2` be loop exits, each of which unconditionally branches to the same `unreachable`-terminated block. Neither of the exits dominates this unreachable block. If we change the IDoms of the exits to some peeled loop block, we don't update the dominators of the unreachable block. Currently we just don't get to the peeling logic, saying that we can't peel such loops.
Previously we stored exits' IDoms in a map before peeling a loop and then, after peeling off one iteration, we changed their IDoms. Now we use the same logic not only for exits but for all non-loop blocks dominated by the loop. So when we add logic to support peeling loops with exits which branch, for example, to an unreachable-terminated block, we would update the IDoms not only for exits, but for their successors.
Patch by Dmitry Makogon!
Differential Revision: https://reviews.llvm.org/D111611
Reviewed By: mkazantsev, nikic
|
| #
baad10c0 |
| 18-Oct-2021 |
Max Kazantsev <[email protected]> |
Revert "[NFC] [LoopPeel] Change the way DT is updated for loop exits"
This reverts commit fa16329ae0721023376f24c7577b9020d438df1a.
See comments in discussion. Merged by mistake, not entirely getting what the problem was.
|
| #
fa16329a |
| 18-Oct-2021 |
Max Kazantsev <[email protected]> |
[NFC] [LoopPeel] Change the way DT is updated for loop exits
When peeling a loop, we assume that the latch has a `br` terminator and that all loop exits are either terminated with an `unreachable` or have a terminating deoptimize call. So when we peel off the 1st iteration, we change the IDom of all loop exits to the peeled copy of `NCD(IDom(Exit), Latch)`. This works now, but if we add logic to support loops with exits that are followed by a block with an `unreachable` or a terminating deoptimize call, changing the exit's IDom wouldn't be enough and DT would be broken.
For example, let `Exit1` and `Exit2` be loop exits, each of which unconditionally branches to the same `unreachable`-terminated block. Neither of the exits dominates this unreachable block. If we change the IDoms of the exits to some peeled loop block, we don't update the dominators of the unreachable block. Currently we just don't get to the peeling logic, saying that we can't peel such loops.
With this NFC we just insert edges from cloned exiting blocks to their exits after peeling each iteration (we accumulate the insertion updates and then after peeling apply the updates to DT).
This patch was a part of D110922.
Patch by Dmitry Makogon!
Differential Revision: https://reviews.llvm.org/D111611
Reviewed By: mkazantsev
|
| #
40d85f16 |
| 12-Oct-2021 |
Florian Hahn <[email protected]> |
[LoopPeel] Use any_of & contains instead of for & find.
Using contains was suggested in D108114, but I forgot to include it when landing the patch.
|
| #
cd0ba9dc |
| 12-Oct-2021 |
Florian Hahn <[email protected]> |
[LoopPeel] Peel if it turns invariant loads dereferenceable.
This patch adds a new cost heuristic that allows peeling a single iteration off read-only loops, if the loop contains a load that
1. is feeding an exit condition,
2. dominates the latch,
3. is not already known to be dereferenceable,
4. and has a loop invariant address.
If all non-latch exits are terminated with unreachable, such loads in the loop are guaranteed to be dereferenceable after peeling, enabling hoisting/CSE'ing them.
This enables vectorization of loops with certain runtime-checks, like multiple calls to `std::vector::at` if the vector is passed as pointer.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D108114
|
| #
94052179 |
| 08-Oct-2021 |
Arthur Eubanks <[email protected]> |
Revert "Recommit "[LoopPeel] Peel loops with deoptimizing exits""
This reverts commit d68b59f3ebb253ee7119a25a71c51cf19b73e030.
This is causing crashes, see D110922 for details.
|
| #
d68b59f3 |
| 08-Oct-2021 |
Max Kazantsev <[email protected]> |
Recommit "[LoopPeel] Peel loops with deoptimizing exits"
Removed obsolete DT verification that should not be there because the strategy of DT updates has changed.
Differential Revision: https://reviews.llvm.org/D110922
|
| #
48a5a2d1 |
| 08-Oct-2021 |
Max Kazantsev <[email protected]> |
Revert "[LoopPeel] Peel loops with deoptimizing exits"
This reverts commit 8a959625c433f311233682afa7bfe1c76367700d.
Reported failures with LLVM_ENABLE_EXPENSIVE_CHECKS, need to investigate.
|
| #
8a959625 |
| 08-Oct-2021 |
Max Kazantsev <[email protected]> |
[LoopPeel] Peel loops with deoptimizing exits
Added support for peeling loops with "deoptimizing" exits: exits for which the exit block or any of its children (or any of their children, etc.) either has a @llvm.experimental.deoptimize call prior to the terminating return instruction of the basic block or is terminated with unreachable. All blocks in the sequence must have a single successor, except possibly the last one.
Previously we only checked the exit block for being deoptimizing. Now we check if the last reachable block from the exit is deoptimizing.
Patch by Dmitry Makogon!
Differential Revision: https://reviews.llvm.org/D110922
Reviewed By: mkazantsev
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
| #
90d09eb3 |
| 25-Aug-2021 |
Florian Hahn <[email protected]> |
[LoopPeel] Allow peeling with multiple unreachable-terminated exit blocks.
Support for peeling with multiple exit blocks was added in D63921/77bb3a486fa6.
So far it has only been enabled for loops where all non-latch exits are 'de-optimizing' exits (D63923). But peeling of multi-exit loops can be highly beneficial in other cases too, like if all non-latch exiting blocks are unreachable.
The motivating cases are loops with runtime checks, like the C++ example below. The main issue preventing vectorization is that the invariant accesses to load the bounds of B are conditionally executed in the loop and cannot be hoisted out. If we peel off the first iteration, they become dereferenceable in the loop, because they must execute before the loop is executed, as all non-latch exits are terminated with unreachable. This subsequently allows hoisting the loads and runtime checks out of the loop, allowing vectorization of the loop.
    int sum(std::vector<int> *A, std::vector<int> *B, int N) {
      int cost = 0;
      for (int i = 0; i < N; ++i)
        cost += A->at(i) + B->at(i);
      return cost;
    }
This gives a ~20-30% increase of score for Geekbench5/HDR on AArch64.
Note that this requires a follow-up improvement to the peeling cost model to actually peel iterations off loops as above. I will share that shortly.
Also, peeling of multi-exits might be beneficial for exit blocks with other terminators, but I would like to keep the scope limited to known high-reward cases for now.
I removed the option to disable peeling for multi-deopt exits because the code is more general now. Alternatively, the option could also be generalized, but I am not sure if there's much value in the option?
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D108108
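At the source level, peeling the first iteration of the `sum` example from the commit message above can be pictured roughly as below. This is a hand-written illustration only: the actual transform operates on LLVM IR, and `sumPeeled` is a hypothetical name.

```cpp
#include <vector>

// Hand-peeled variant of the sum() loop: iteration 0 runs before the loop,
// so the bounds of A and B have already been loaded once on entry to it.
int sumPeeled(std::vector<int> *A, std::vector<int> *B, int N) {
  int cost = 0;
  if (N > 0) {
    // Peeled first iteration; at() throws if index 0 is out of range.
    cost += A->at(0) + B->at(0);
    // Remaining iterations.
    for (int i = 1; i < N; ++i)
      cost += A->at(i) + B->at(i);
  }
  return cost;
}
```

The result is the same as for the original loop; the difference is that every path into the remaining loop has executed the bounds loads at least once, which is what lets later passes treat them as safe to hoist.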
|