Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4
# 81c648a3 18-May-2022 Nikita Popov <[email protected]>

[LoopUnroll] Freeze tripcount rather than condition

This is a followup to D125754. We introduce two branches, one
before the unrolled loop and one before the epilogue (and similar
for the prologue c

[LoopUnroll] Freeze tripcount rather than condition

This is a followup to D125754. We introduce two branches, one
before the unrolled loop and one before the epilogue (and similar
for the prologue case). The previous patch only froze the
condition on the first branch.

Rather than independently freezing the second condition, this patch
instead freezes TripCount and bases BECount on it. These are the
two quantities involved in the conditions, and this ensures that
both work on a consistent, non-poisonous trip count.

Differential Revision: https://reviews.llvm.org/D125896

show more ...


# 323514de 17-May-2022 Nikita Popov <[email protected]>

[LoopUnroll] Avoid branch on poison for runtime unroll with multiple exits

When performing runtime unrolling with multiple exits, one of the
earlier (non-latch) exits may exit the loop on the first

[LoopUnroll] Avoid branch on poison for runtime unroll with multiple exits

When performing runtime unrolling with multiple exits, one of the
earlier (non-latch) exits may exit the loop on the first iteration,
such that we never branch on the latch exit condition. As such, we
need to freeze the condition of the new branch that is introduced
before the loop, as it now executes unconditionally.

Differential Revision: https://reviews.llvm.org/D125754

show more ...


Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 72031407 03-Jan-2022 Philip Reames <[email protected]>

Revert "[unroll] Prune all but first copy of invariant exit"

This reverts commit 9bd22595bad36cd19f5e7ae18ccd9f41cba29dc5.

Seeing some bot failures which look plausibly connected. Revert while inv

Revert "[unroll] Prune all but first copy of invariant exit"

This reverts commit 9bd22595bad36cd19f5e7ae18ccd9f41cba29dc5.

Seeing some bot failures which look plausibly connected. Revert while investigating/waiting for bots to stablize.

e.g. https://lab.llvm.org/buildbot#builders/36/builds/15933

show more ...


# 9bd22595 03-Jan-2022 Philip Reames <[email protected]>

[unroll] Prune all but first copy of invariant exit

If we have an exit which is controlled by a loop invariant condition and which dominates the latch, we know only the copy in the first unrolled it

[unroll] Prune all but first copy of invariant exit

If we have an exit which is controlled by a loop invariant condition and which dominates the latch, we know only the copy in the first unrolled iteration can be taken. All other copies are dead.

The change itself is pretty straight forward, but let me add two points of context:
* I'd have expected other transform passes to catch this after unrolling, but I'm seeing multiple examples where we get to the end of O2/O3 without simplifying.
* I'd like to do a stronger change which did CSE during unroll and accounted for invariant expressions (as defined by SCEV instead of trivial ones from LoopInfo), but that doesn't fit cleanly into the current code structure.

Differential Revision: https://reviews.llvm.org/D116496

show more ...


# 8906a0fe 29-Nov-2021 Philip Reames <[email protected]>

[SCEVExpander] Drop poison generating flags when reusing instructions

The basic problem we have is that we're trying to reuse an instruction which is mapped to some SCEV. Since we can have multiple

[SCEVExpander] Drop poison generating flags when reusing instructions

The basic problem we have is that we're trying to reuse an instruction which is mapped to some SCEV. Since we can have multiple such instructions (potentially with different flags), this is analogous to our need to drop flags when performing CSE. A trivial implementation would simply drop flags on any instruction we decided to reuse, and that would be correct.

This patch is almost that trivial patch except that we preserve flags on the reused instruction when existing users would imply UB on overflow already. Adding new users can, at most, refine this program to one which doesn't execute UB which is valid.

In practice, this fixes two conceptual problems with the previous code: 1) a binop could have been canonicalized into a form with different opcode or operands, or 2) the inbounds GEP case which was simply unhandled.

On the test changes, most are pretty straight forward. We loose some flags (in some cases, they'd have been dropped on the next CSE pass anyways). The one that took me the longest to understand was the ashr-expansion test. What's happening there is that we're considering reuse of the mul, previously we disallowed it entirely, now we allow it with no flags. The surrounding diffs are all effects of generating the same mul with a different operand order, and then doing simple DCE.

The loss of the inbounds is unfortunate, but even there, we can recover most of those once we actually treat branch-on-poison as immediate UB.

Differential Revision: https://reviews.llvm.org/D112734

show more ...


Revision tags: llvmorg-13.0.1-rc1
# da327e72 15-Nov-2021 Philip Reames <[email protected]>

Fix a misleading FIXME in an unroll test


# 37ead201 12-Nov-2021 Philip Reames <[email protected]>

[runtime-unroll] Use incrementing IVs instead of decrementing ones

This is one of those wonderful "in theory X doesn't matter, but in practice is does" changes. In this particular case, we shift the

[runtime-unroll] Use incrementing IVs instead of decrementing ones

This is one of those wonderful "in theory X doesn't matter, but in practice is does" changes. In this particular case, we shift the IVs inserted by the runtime unroller to clamp iteration count of the loops* from decrementing to incrementing.

Why does this matter? A couple of reasons:
* SCEV doesn't have a native subtract node. Instead, all subtracts (A - B) are represented as A + -1 * B and drops any flags invalidated by such. As a result, SCEV is slightly less good at reasoning about edge cases involving decrementing addrecs than incrementing ones. (You can see this in the inferred flags in some of the test cases.)
* Other parts of the optimizer produce incrementing IVs, and they're common in idiomatic source language. We do have support for reversing IVs, but in general if we produce one of each, the pair will persist surprisingly far through the optimizer before being coalesced. (You can see this looking at nearby phis in the test cases.)

Note that if the hardware prefers decrementing (i.e. zero tested) loops, LSR should convert back immediately before codegen.

* Mostly irrelevant detail: The main loop of the prolog case is handled independently and will simple use the original IV with a changed start value. We could in theory use this scheme for all iteration clamping, but that's a larger and more invasive change.

show more ...


# de2fed61 12-Nov-2021 Philip Reames <[email protected]>

[unroll] Keep unrolled iterations with initial iteration

The unrolling code was previously inserting new cloned blocks at the end of the function. The result of this with typical loop structures is

[unroll] Keep unrolled iterations with initial iteration

The unrolling code was previously inserting new cloned blocks at the end of the function. The result of this with typical loop structures is that the new iterations are placed far from the initial iteration.

With unrolling, the general assumption is that the a) the loop is reasonable hot, and b) the first Count-1 copies of the loop are rarely (if ever) loop exiting. As such, placing Count-1 copies out of line is a fairly poor code placement choice. We'd much rather fall through into the hot (non-exiting) path. For code with branch profiles, later layout would fix this, but this may have a positive impact on non-PGO compiled code.

However, the real motivation for this change isn't performance. Its readability and human understanding. Having to jump around long distances in an IR file to trace an unrolled loop structure is error prone and tedious.

show more ...


# e01c91f2 12-Nov-2021 Philip Reames <[email protected]>

[tests] Add coverage for cases we can prune exits when runtlme unrolling


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3
# fa82a3d0 02-Sep-2021 Philip Reames <[email protected]>

[runtimeunroll] Support epilogue unrolling with a parent loop

This patch adds support for unrolling inner loops using epilogue unrolling. The basic issue is that the original latch exit block of the

[runtimeunroll] Support epilogue unrolling with a parent loop

This patch adds support for unrolling inner loops using epilogue unrolling. The basic issue is that the original latch exit block of the inner loop could be outside the outer loop. When we clone the inner loop and split the latch exit, the cloned blocks need to be in the outer loop.

Differential Revision: https://reviews.llvm.org/D108476

show more ...


# b604fcb7 31-Aug-2021 Philip Reames <[email protected]>

[runtime] Move prolog/epilog block to a post-simplify strategy

The runtime unroller will try to produce a non-loop if the unroll count is 2 and thus the prolog/epilog loop would only run at most one

[runtime] Move prolog/epilog block to a post-simplify strategy

The runtime unroller will try to produce a non-loop if the unroll count is 2 and thus the prolog/epilog loop would only run at most one iteration. The old implementation did this by avoiding loop construction entirely. This patches instead constructs the trivial loop and then explicitly breaks the backedge and simplifies. This does result in some additional code churn when triggered, but a) results in better quality code and b) removes a codepath which didn't work properly for multiple exit epilogs.

One oddity that I want to draw to reviewer attention is that this somehow changes revisit order. The new order looks equivalent to me, but I don't understand how creating and erasing an extra loop here creates this effect.

Differential Revision: https://reviews.llvm.org/D108521

show more ...


Revision tags: llvmorg-13.0.0-rc2
# 17b9cb18 19-Aug-2021 Philip Reames <[email protected]>

[runtimeunroll] Support multiple exits to latch exit w/prolog loop

This patch extends the runtime unrolling infrastructure to support unrolling a loop with multiple exiting blocks branching to the s

[runtimeunroll] Support multiple exits to latch exit w/prolog loop

This patch extends the runtime unrolling infrastructure to support unrolling a loop with multiple exiting blocks branching to the same exit block used by the latch. It intentionally does not include a cost model change to enable this functionality unless appropriate force flags are used.

This is the prolog companion to D107381. Since this was LGTMed, a problem with DT updating was reported against that patch. I roled in the analogous fix here as it seemed obvious, and not worth re-review.

As an aside, our prolog form leaves a lot of potential value on the floor when there is an invariant load or invariant condition in the loop being runtime unrolled. We should probably consider a "required prolog" heuristic. (Alternatively, maybe we should be peeling these cases more aggressively?)

Differential Revision: https://reviews.llvm.org/D108262

show more ...


# 94d09142 18-Aug-2021 Philip Reames <[email protected]>

[runtimeunroll] Support multiple exits to latch exit w/epilogue loop

This patch extends the runtime unrolling infrastructure to support unrolling a loop with multiple exiting blocks branching to the

[runtimeunroll] Support multiple exits to latch exit w/epilogue loop

This patch extends the runtime unrolling infrastructure to support unrolling a loop with multiple exiting blocks branching to the same exit block used by the latch. It intentionally does not include a cost model change to enable this functionality unless appropriate force flags are used.

I decided to restrict this to the epilogue case. Given the changes ended up being pretty generic, we may be able to unblock the prolog case too, but I want to do that in a separate change to reduce the amount of code we all have to understand at one time.

Differential Revision: https://reviews.llvm.org/D107381

show more ...


# 54934923 18-Aug-2021 Philip Reames <[email protected]>

[test] Remove a redundant test line

This was made redundant when I removed -instcombine from output in 70ffd65c, but I didn't notice. nikic pointed that out in review of D107381


# 911991d2 03-Aug-2021 Philip Reames <[email protected]>

[tests] Autogen an unroll test for ease of update


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# 95848ea1 26-Aug-2020 Roman Lebedev <[email protected]>

[Value][InstCombine] Fix one-use checks in PHI-of-op -> Op-of-PHI[s] transforms to be one-user checks

As FIXME said, they really should be checking for a single user,
not use, so let's do that. It i

[Value][InstCombine] Fix one-use checks in PHI-of-op -> Op-of-PHI[s] transforms to be one-user checks

As FIXME said, they really should be checking for a single user,
not use, so let's do that. It is not *that* unusual to have
the same value as incoming value in a PHI node, not unlike
how a PHI may have the same incoming basic block more than once.

There isn't a nice way to do that, Value::users() isn't uniqified,
and Value only tracks it's uses, not Users, so the check is
potentially costly since it does indeed potentially involes
traversing the entire use list of a value.

show more ...


Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3
# c3b8bd1e 04-Jul-2020 Roman Lebedev <[email protected]>

[InstCombine] Always try to invert non-canonical predicate of an icmp

Summary:
The actual transform i was going after was:
https://rise4fun.com/Alive/Tp9H
```
Name: zz
Pre: isPowerOf2(C0) && isPower

[InstCombine] Always try to invert non-canonical predicate of an icmp

Summary:
The actual transform i was going after was:
https://rise4fun.com/Alive/Tp9H
```
Name: zz
Pre: isPowerOf2(C0) && isPowerOf2(C1) && C1 == C0
%t0 = and i8 %x, C0
%r = icmp eq i8 %t0, C1
=>
%t = icmp eq i8 %t0, 0
%r = xor i1 %t, -1

Name: zz
Pre: isPowerOf2(C0)
%t0 = and i8 %x, C0
%r = icmp ne i8 %t0, 0
=>
%t = icmp eq i8 %t0, 0
%r = xor i1 %t, -1
```
but as it can be seen from the current tests, we already canonicalize most of it,
and we are only missing handling multi-use non-canonical icmp predicates.

If we have both `!=0` and `==0`, even though we can CSE them,
we end up being stuck with them. We should canonicalize to the `==0`.

I believe this is one of the cleanup steps i'll need after `-scalarizer`
if i end up proceeding with my WIP alloca promotion helper pass.

Reviewers: spatel, jdoerfert, nikic

Reviewed By: nikic

Subscribers: zzheng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D83139

show more ...


# 17a15c32 03-Jul-2020 Roman Lebedev <[email protected]>

[NFCI][LoopUnroll] s/%tmp/%i/ in one test to silence update script warning


Revision tags: llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4
# 1badf7c3 06-Mar-2020 Roman Lebedev <[email protected]>

[InstComine] Forego of one-use check in `(X - (X & Y)) --> (X & ~Y)` if Y is a constant

Summary:
This is potentially more friendly for further optimizations,
analysies, e.g.: https://godbolt.org

[InstComine] Forego of one-use check in `(X - (X & Y)) --> (X & ~Y)` if Y is a constant

Summary:
This is potentially more friendly for further optimizations,
analysies, e.g.: https://godbolt.org/z/G24anE

This resolves phase-ordering bug that was introduced
in D75145 for https://godbolt.org/z/2gBwF2
https://godbolt.org/z/XvgSua

Reviewers: spatel, nikic, dmgreen, xbolva00

Reviewed By: nikic, xbolva00

Subscribers: hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75757

show more ...


Revision tags: llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2
# 0f22e783 02-Dec-2019 Roman Lebedev <[email protected]>

[InstCombine] Revert rL341831: relax one-use check in foldICmpAddConstant() (PR44100)

rL341831 moved one-use check higher up, restricting a few folds
that produced a single instruction from two inst

[InstCombine] Revert rL341831: relax one-use check in foldICmpAddConstant() (PR44100)

rL341831 moved one-use check higher up, restricting a few folds
that produced a single instruction from two instructions to the case
where the inner instruction would go away.

Original commit message:
> InstCombine: move hasOneUse check to the top of foldICmpAddConstant
>
> There were two combines not covered by the check before now,
> neither of which actually differed from normal in the benefit analysis.
>
> The most recent seems to be because it was just added at the top of the
> function (naturally). The older is from way back in 2008 (r46687)
> when we just didn't put those checks in so routinely, and has been
> diligently maintained since.

From the commit message alone, there doesn't seem to be a
deeper motivation, deeper problem that was trying to solve,
other than 'fixing the wrong one-use check'.

As i have briefly discusses in IRC with Tim, the original motivation
can no longer be recovered, too much time has passed.

However i believe that the original fold was doing the right thing,
we should be performing such a transformation even if the inner `add`
will not go away - that will still unchain the comparison from `add`,
it will no longer need to wait for `add` to compute.

Doing so doesn't seem to break any particular idioms,
as least as far as i can see.

References https://bugs.llvm.org/show_bug.cgi?id=44100

show more ...


Revision tags: llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3
# 5a89ba73 24-Jun-2019 Matt Arsenault <[email protected]>

InstCombine: Preserve nuw when reassociating nuw ops [1/3]

Alive says this is OK.

llvm-svn: 364233


Revision tags: llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1
# cee313d2 17-Apr-2019 Eric Christopher <[email protected]>

Revert "Temporarily Revert "Add basic loop fusion pass.""

The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

llvm-svn: 358552


Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1
# bae11e79 28-Dec-2018 Anna Thomas <[email protected]>

[UnrollRuntime] NFC: Updated exiting tests and added more tests

Added more tests for multiple exiting blocks to the LatchExit.
Today these cases are not supported. Patch to follow soon.

llvm-svn: 3

[UnrollRuntime] NFC: Updated exiting tests and added more tests

Added more tests for multiple exiting blocks to the LatchExit.
Today these cases are not supported. Patch to follow soon.

llvm-svn: 350135

show more ...


# 98743fa7 28-Dec-2018 Anna Thomas <[email protected]>

[UnrollRuntime] NFC: Add comment and verify LCSSA

Added -verify-loop-lcssa to test cases.
Updated comments in ConnectProlog.

llvm-svn: 350131


Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1
# 512dde77 15-Sep-2017 Anna Thomas <[email protected]>

[RuntimeUnrolling] Populate the VMap entry correctly when default generated through lookup

During runtime unrolling on loops with multiple exits, we update the
exit blocks with the correct phi value

[RuntimeUnrolling] Populate the VMap entry correctly when default generated through lookup

During runtime unrolling on loops with multiple exits, we update the
exit blocks with the correct phi values from both original and remainder
loop.
In this process, we lookup the VMap for the mapped incoming phi values,
but did not update the VMap if a default entry was generated in the VMap
during the lookup. This default value is generated when constants or
values outside the current loop are looked up.
This patch fixes the assertion failure when null entries are present in
the VMap because of this lookup. Added a testcase that showcases the
problem.

llvm-svn: 313358

show more ...


12