Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# a30e77b6 23-Jul-2022 Nuno Lopes <[email protected]>

fix tests for commit 9df0b254d24eca098


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5
# 997ecb00 01-Jun-2022 David Sherwood <[email protected]>

[LoopVectorize] Add FastMathFlags to the select used for reductions with tail-folding

Based on reviewer comments on https://reviews.llvm.org/D126692 I've
added FastMathFlags to the select instructio

[LoopVectorize] Add FastMathFlags to the select used for reductions with tail-folding

Based on reviewer comments on https://reviews.llvm.org/D126692 I've
added FastMathFlags to the select instruction used when tail-folding
with reductions. These flags can then be used by InstCombine to
decide upon the most optimal floating point identity value for
fadd/fsub. Doing so unlocks further optimisations, such as folding
selects into masked loads.

Differential Revision: https://reviews.llvm.org/D126778

show more ...


Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 872f7000 03-Apr-2022 Dávid Bolvanský <[email protected]>

Revert "[NFCI] Regenerate SROA/LoopVectorize test checks"

This reverts commit 14e3450fb57305aa9ff3e9e60687b458e43835c9.


# a113a582 03-Apr-2022 Dávid Bolvanský <[email protected]>

[NFCI] Regenerate LoopVectorize test checks


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# a266af72 14-Feb-2022 Nikita Popov <[email protected]>

[InstCombine] Canonicalize SPF to min/max intrinsics

Now that integer min/max intrinsics have good support in both
InstCombine and other passes, start canonicalizing SPF min/max
to intrinsic min/max

[InstCombine] Canonicalize SPF to min/max intrinsics

Now that integer min/max intrinsics have good support in both
InstCombine and other passes, start canonicalizing SPF min/max
to intrinsic min/max.

Once this sticks, we can stop matching SPF min/max in various
places, and can remove hacks we have for preventing infinite loops
and breaking of SPF canonicalization.

Differential Revision: https://reviews.llvm.org/D98152

show more ...


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# d4a8fc3a 12-Jan-2022 Florian Hahn <[email protected]>

[VPlan] Introduce and use BranchOnCount VPInstruction.

This patch adds a new BranchOnCount VPInstruction opcode with 2
operands. It first compares its 2 operands (increment of canonical
induction an

[VPlan] Introduce and use BranchOnCount VPInstruction.

This patch adds a new BranchOnCount VPInstruction opcode with 2
operands. It first compares its 2 operands (increment of canonical
induction and vector trip count), followed by a branch to either the
exit block or back to the vector header.

It must be the last recipe in the exit block of the topmost vector loop
region.

This extracts parts from D113224 and was discussed in D113223.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D116479

show more ...


# 65c4d619 05-Jan-2022 Florian Hahn <[email protected]>

[VPlan] Add VPCanonicalIVPHIRecipe, partly retire createInductionVariable.

At the moment, the primary induction variable for the vector loop is
created as part of the skeleton creation. This is tied

[VPlan] Add VPCanonicalIVPHIRecipe, partly retire createInductionVariable.

At the moment, the primary induction variable for the vector loop is
created as part of the skeleton creation. This is tied to creating the
vector loop latch outside of VPlan. This prevents from modeling the
*whole* vector loop in VPlan, which in turn is required to model
preheader and exit blocks in VPlan as well.

This patch introduces a new recipe VPCanonicalIVPHIRecipe to represent the
primary IV in VPlan and CanonicalIVIncrement{NUW} opcodes for
VPInstruction to model the increment.

This allows us to partly retire createInductionVariable. At the moment,
a bit of patching up is done after executing all blocks in the plan.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D113223

show more ...


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init
# 20b0fa91 26-Jul-2021 Dylan Fleming <[email protected]>

[SVE] Add support for folding for select + masked loads

Add folds to instcombine to support the removal of select instruction when the masked_load is guaranteed to zero the same lanes, i.e. select(m

[SVE] Add support for folding for select + masked loads

Add folds to instcombine to support the removal of select instruction when the masked_load is guaranteed to zero the same lanes, i.e. select(mask, mload(,,mask,0), 0) -> mload(,,mask,0).

Patch originally authored by @paulwalker-arm

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D106376

show more ...


# 1c9bec72 22-Jul-2021 Simon Pilgrim <[email protected]>

[InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069)

As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx

[InstCombine] Fold (gep (oneuse(gep Ptr, Idx0)), Idx1) -> (gep Ptr, (add Idx0, Idx1)) (PR51069)

As noticed on D106352, after we've folded "(select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0))" if the inner Ptr was also a (now one use) gep we could then merge the geps, using the sum of the indices instead.

I've limited this to basic 2-op geps - a more general case further down InstCombinerImpl.visitGetElementPtrInst doesn't have the one-use limitation but only creates the add if it can be created via SimplifyAddInst.

https://alive2.llvm.org/ce/z/f8pLfD (Thanks Roman!)

Differential Revision: https://reviews.llvm.org/D106450

show more ...


Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# 23c2f2e6 07-Jun-2021 Florian Hahn <[email protected]>

[LV] Mark increment of main vector loop induction variable as NUW.

This patch marks the induction increment of the main induction variable
of the vector loop as NUW when not folding the tail.

If th

[LV] Mark increment of main vector loop induction variable as NUW.

This patch marks the induction increment of the main induction variable
of the vector loop as NUW when not folding the tail.

If the tail is not folded, we know that End - Start >= Step (either
statically or through the minimum iteration checks). We also know that both
Start % Step == 0 and End % Step == 0. We exit the vector loop if %IV +
%Step == %End. Hence we must exit the loop before %IV + %Step unsigned
overflows and we can mark the induction increment as NUW.

This should make SCEV return more precise bounds for the created vector
loops, used by later optimizations, like late unrolling.

At the moment quite a few tests still need to be updated, but before
doing so I'd like to get initial feedback to make sure I am not missing
anything.

Note that this could probably be further improved by using information
from the original IV.

Attempt of modeling of the assumption in Alive2:
https://alive2.llvm.org/ce/z/H_DL_g

Part of a set of fixes required for PR50412.

Reviewed By: mkazantsev

Differential Revision: https://reviews.llvm.org/D103255

show more ...


Revision tags: llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1
# 4a8e6ed2 05-Jan-2021 Juneyoung Lee <[email protected]>

[SLP,LV] Use poison constant vector for shufflevector/initial insertelement

This patch makes SLP and LV emit operations with initial vectors set to poison constant instead of undef.
This is a part o

[SLP,LV] Use poison constant vector for shufflevector/initial insertelement

This patch makes SLP and LV emit operations with initial vectors set to poison constant instead of undef.
This is a part of efforts for using poison vector instead of undef to represent "doesn't care" vector.
The goal is to make nice shufflevector optimizations valid that is currently incorrect due to the tricky interaction between undef and poison (see https://bugs.llvm.org/show_bug.cgi?id=44185 ).

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D94061

show more ...


Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1
# 20b386aa 29-Oct-2020 Nikita Popov <[email protected]>

[LoopUtils] Fix neutral value for vector.reduce.fadd

Use -0.0 instead of 0.0 as the start value. The previous use of 0.0
was fine for all existing uses of this function though, as it is
always gener

[LoopUtils] Fix neutral value for vector.reduce.fadd

Use -0.0 instead of 0.0 as the start value. The previous use of 0.0
was fine for all existing uses of this function though, as it is
always generated with fast flags right now, and thus nsz.

show more ...


# be6e8e50 11-Oct-2020 David Green <[email protected]>

[LV] Tail folded inloop reductions.

This expands upon the inloop reductions added in e9761688e41cb9e976,
allowing them to be inserted into tail folded loops. Reductions are
generates with the form:

[LV] Tail folded inloop reductions.

This expands upon the inloop reductions added in e9761688e41cb9e976,
allowing them to be inserted into tail folded loops. Reductions are
generates with the form:

x = select(mask, vecop, zero)
v = vecreduce.add(x)
c = add chain, v

Where zero here is chosen as the identity value for add reductions. The
backend is then expected to fold the select and the vecreduce into a
single predicated instruction.

Most of the code is fairly straight forward, except for the creation of
blockmasks which need to ensure they are created in dominance order. The
order they are added is altered to be after any phis, keeping the
requirements for the underlying IR.

Differential Revision: https://reviews.llvm.org/D84451

show more ...


Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6
# 322d0afd 03-Oct-2020 Amara Emerson <[email protected]>

[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.

This change renames the intrinsics to not have "experimental" in the name.

The autoupgrader will handle lega

[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.

This change renames the intrinsics to not have "experimental" in the name.

The autoupgrader will handle legacy intrinsics.

Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html

Differential Revision: https://reviews.llvm.org/D88787

show more ...


Revision tags: llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# ae366479 25-Aug-2020 Sjoerd Meijer <[email protected]>

[LV] get.active.lane.mask consuming tripcount instead of backedge-taken count

This adapts LV to the new semantics of get.active.lane.mask as discussed in
D86147, which means that the LV now emits in

[LV] get.active.lane.mask consuming tripcount instead of backedge-taken count

This adapts LV to the new semantics of get.active.lane.mask as discussed in
D86147, which means that the LV now emits intrinsic get.active.lane.mask with
the loop tripcount instead of the backedge-taken count as its second argument.
The motivation for this is described in D86147.

Differential Revision: https://reviews.llvm.org/D86304

show more ...


# 2b69efde 21-Aug-2020 David Green <[email protected]>

[ARM][LV] Add a preferPredicatedReductionSelect target hook

As part of D84741, this adds a target hook for the
preferPredicatedReductionSelect option and makes use
of it under MVE, allowing us to ta

[ARM][LV] Add a preferPredicatedReductionSelect target hook

As part of D84741, this adds a target hook for the
preferPredicatedReductionSelect option and makes use
of it under MVE, allowing us to tail predicate most
reduction loops.

Differential Revision: https://reviews.llvm.org/D85980

show more ...


Revision tags: llvmorg-11.0.0-rc2
# b8088ada 18-Aug-2020 David Green <[email protected]>

[LV] Predicated reduction tests. NFC