Revision Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 1ed8b293 06-Jul-2022 Nikita Popov <[email protected]>

[LoopVectorizationLegality] Drop unused variable (NFC)


# 8ee913d8 06-Jul-2022 Nikita Popov <[email protected]>

[IR] Remove Constant::canTrap() (NFC)

As integer div/rem constant expressions are no longer supported,
constants can no longer trap and are always safe to speculate.
Remove the Constant::canTrap() method and its usages.


# 644a965c 04-Jul-2022 Florian Hahn <[email protected]>

[LV] Vectorize cases with larger number of RT checks, execute only if profitable.

This patch replaces the tight hard cut-off for the number of runtime
checks with a more accurate cost-driven approach.

The new approach allows vectorization with a larger number of runtime
checks in general, but only executes the vector loop (and runtime checks) if
considered profitable at runtime. Profitable here means that the cost-model
indicates that the runtime check cost + vector loop cost < scalar loop cost.

To do that, LV computes the minimum trip count for which runtime check cost
+ vector-loop-cost < scalar loop cost.

Note that there is still a hard cut-off to avoid excessive compile-time/code-size
increases, but it is much larger than the original limit.
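
As a rough sketch of the trip-count computation above (hypothetical names and
cost units; not the actual LoopVectorize.cpp implementation):

```
#include <cmath>
#include <cstdint>
#include <limits>

// Vectorization (including its runtime checks) pays off once
//   RtCheckCost + (TC / VF) * VecIterCost < TC * ScalarIterCost,
// i.e. once TC > RtCheckCost / (ScalarIterCost - VecIterCost / VF).
// Returns the smallest trip count TC satisfying that inequality.
uint64_t minProfitableTripCount(double RtCheckCost, double VecIterCost,
                                double ScalarIterCost, unsigned VF) {
  double PerElementSaving = ScalarIterCost - VecIterCost / VF;
  if (PerElementSaving <= 0.0) // the vector loop never amortizes the checks
    return std::numeric_limits<uint64_t>::max();
  return static_cast<uint64_t>(std::floor(RtCheckCost / PerElementSaving)) + 1;
}
```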

The performance impact on standard test-suites like SPEC2006/SPEC2017/MultiSource
is mostly neutral, but the new approach can give substantial gains in cases where
we failed to vectorize before due to the over-aggressive cut-offs.

On AArch64 with -O3, I didn't observe any regressions outside the noise level (<0.4%)
and there are the following execution time improvements. Both `IRSmk` and `srad` are relatively short running, but the changes are far above the noise level for them on my benchmark system.

```
CFP2006/447.dealII/447.dealII -1.9%
CINT2017rate/525.x264_r/525.x264_r -2.2%
ASC_Sequoia/IRSmk/IRSmk -9.2%
Rodinia/srad/srad -36.1%
```

`size` regressions on AArch64 with -O3 are

```
MultiSource/Applications/hbd/hbd 90256.00 106768.00 18.3%
MultiSourc...ks/ASCI_Purple/SMG2000/smg2000 240676.00 257268.00 6.9%
MultiSourc...enchmarks/mafft/pairlocalalign 472603.00 489131.00 3.5%
External/S...2017rate/525.x264_r/525.x264_r 613831.00 630343.00 2.7%
External/S...NT2006/464.h264ref/464.h264ref 818920.00 835448.00 2.0%
External/S...te/538.imagick_r/538.imagick_r 1994730.00 2027754.00 1.7%
MultiSourc...nchmarks/tramp3d-v4/tramp3d-v4 1236471.00 1253015.00 1.3%
MultiSource/Applications/oggenc/oggenc 2108147.00 2124675.00 0.8%
External/S.../CFP2006/447.dealII/447.dealII 4742999.00 4759559.00 0.3%
External/S...rate/510.parest_r/510.parest_r 14206377.00 14239433.00 0.2%
```

Reviewed By: lebedev.ri, ebrevnov, dmgreen

Differential Revision: https://reviews.llvm.org/D109368


Revision tags: llvmorg-14.0.6
# 6bb40552 17-Jun-2022 Malhar Jajoo <[email protected]>

[LoopVectorize] Add support for invariant stores of ordered reductions

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D126772


Revision tags: llvmorg-14.0.5
# d1570194 01-Jun-2022 Florian Hahn <[email protected]>

[VPlan] Remove unused native utilities incompatible with nested regions.

The implementations of VPlanDominatorTree, VPlanLoopInfo and VPlanPredicator
are all incompatible with modeling loops in VPlans as regions without
explicit back-edges.

Those pieces are not actively used and only exercised by a few gtest
unit tests. They are at the moment blocking progress towards unifying
the native and inner-loop vectorizer paths in D121624 and D123005.

I think we should not block forward progress on unused pieces of code,
so this patch removes the utilities for now. The plan is to re-introduce
them as needed in a way that is compatible with the unified VPlan scheme
used in both the inner loop vectorizer and the native path.

Reviewed By: sguggill

Differential Revision: https://reviews.llvm.org/D123017


Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4
# 4e5e042d 15-Sep-2021 Igor Kirillov <[email protected]>

[LoopVectorize] Support reductions that store intermediary result

Adds ability to vectorize loops containing a store to a loop-invariant
address as part of a reduction that isn't converted to SSA form due to
lack of aliasing info. Runtime checks are generated to ensure the store
does not alias any other accesses in the loop.
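
A minimal sketch (hypothetical function, not taken from the patch) of the kind
of loop this enables:

```
// A sum reduction whose running value is also stored to the loop-invariant
// address *dst. Without aliasing information the store cannot be promoted to
// SSA form, so runtime checks must prove dst does not alias a[0..n).
void reduce_and_store(const int *a, int *dst, int n) {
  int sum = 0;
  for (int i = 0; i < n; ++i) {
    sum += a[i];
    *dst = sum; // store to a loop-invariant address as part of the reduction
  }
}
```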

Ordered fadd reductions are not yet supported.

Differential Revision: https://reviews.llvm.org/D110235


# 6f81903e 03-May-2022 David Green <[email protected]>

[LV][SLP] Mark fptosi_sat as vectorizable

This adds fptosi_sat and fptoui_sat to the list of trivially
vectorizable functions, mainly so that the loop vectorizer can vectorize
the instruction. Marking them as trivially vectorizable also allows them
to be SLP vectorized, and Scalarized.

The signature of a fptosi_sat requires two type overrides
(@llvm.fptosi.sat.v2i32.v2f32), unlike other intrinsics that often only
take a single one. This patch alters hasVectorInstrinsicOverloadedScalarOpd
to isVectorIntrinsicWithOverloadTypeAtArg, so that it can mark the first
operand of the intrinsic as an overloaded (but not scalar) operand.

Differential Revision: https://reviews.llvm.org/D124358


# 9727c77d 25-Apr-2022 David Green <[email protected]>

[NFC] Rename Instrinsic to Intrinsic


# 46432a00 24-Mar-2022 Florian Hahn <[email protected]>

[VPlan] Add VPWidenPointerInductionRecipe.

This patch moves pointer induction handling from VPWidenPHIRecipe to its
own recipe. In the process, it adds all information required to generate
code for pointer inductions without relying on Legal to access the list
of induction phis.

Alternatively VPWidenPHIRecipe could also take an optional pointer to InductionDescriptor.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D121615


# ed98c1b3 09-Mar-2022 serge-sans-paille <[email protected]>

Cleanup includes: DebugInfo & CodeGen

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332


# 5ba11503 10-Feb-2022 Philip Reames <[email protected]>

[PSE] Remove assumption that top level predicate is union from public interface [NFC*]

Note that this doesn't actually cause the top level predicate to become a non-union just yet.

The * above comes from a case in the LoopVectorizer where a predicate which is later proven no longer blocks vectorization, due to a change from checking whether predicates exist to checking whether the predicate is possibly false.


# 7c68ed88 21-Dec-2021 Paul Walker <[email protected]>

[SVE] Reintroduce -scalable-vectorization=preferred as an alias to "on".

Some buildbots still rely on the experimental flag, so let's keep
it until everything has been migrated to the new "on by default"
state.


# b1ff20fd 13-Dec-2021 Sander de Smalen <[email protected]>

[LV] Enable scalable vectorization by default for SVE cores.

The availability of SVE should be sufficient to enable scalable
auto-vectorization.

This patch adds a new TTI interface to query the target for the style of
vectorization it wants when scalable vectors are available. For other
targets than AArch64, this currently defaults to 'FixedWidthOnly'.

Differential Revision: https://reviews.llvm.org/D115651


# 978883d2 10-Dec-2021 Florian Hahn <[email protected]>

[VPlan] Add InductionDescriptor to VPWidenIntOrFpInduction. (NFC)

This allows easier access to the induction descriptor from VPlan,
without needing to go through Legal. VPReductionPHIRecipe already
contains a RecurrenceDescriptor in a similar fashion.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D115111


# d74a8a78 09-Dec-2021 Florian Hahn <[email protected]>

[LV] Mark various functions as const (NFC).

Make sure various accessors do not modify any state, in preparation for
D115111.


# 4f0225f6 01-Oct-2021 Kazu Hirata <[email protected]>

[Transforms] Migrate from getNumArgOperands to arg_size (NFC)

Note that getNumArgOperands is considered a legacy name. See
llvm/include/llvm/IR/InstrTypes.h for details.
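
An illustrative before/after of the migration (the helper function is
hypothetical; the two accessors are the ones named above):

```
#include "llvm/IR/Instructions.h"
using namespace llvm;

// Counts a call's arguments using the preferred accessor.
static unsigned countCallArgs(const CallInst &CI) {
  // Before: return CI.getNumArgOperands();  (legacy name)
  return CI.arg_size();
}
```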


Revision tags: llvmorg-13.0.0-rc3
# 45c46734 11-Sep-2021 Nikita Popov <[email protected]>

[LAA] Pass access type to getPtrStride()

Pass the access type to getPtrStride(), so it is not determined
from the pointer element type. Many cases still fetch the element
type at a higher level though, so this only partially addresses
the issue.


Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init
# 95346ba8 15-Jul-2021 Philip Reames <[email protected]>

[LV] Enable vectorization of multiple exit loops w/computable exit counts

This change enables vectorization of multiple exit loops when the exit count is statically computable. That requirement - shared with the rest of LV - in turn requires each exit to be analyzeable and to dominate the latch.

The majority of work to support this was done in a set of previous patches. In particular, 72314466 avoids having multiple edges from the middle block to the exits, and 4b33b2387 added support for non-latch single exit and multiple exits with a single exiting block. As a result, this change is basically just removing a bailout and adjusting some tests now that the prerequisite work is done and has stuck in tree for a bit.
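
A hypothetical example (not from the patch) of a loop this enables, where both
exits are analyzeable and dominate the latch:

```
// Sums at most the first k elements of a; the early exit at i == k and the
// latch exit at i == n both have statically computable exit counts.
int sum_first_k(const int *a, int n, int k) {
  int s = 0;
  for (int i = 0; i < n; ++i) {
    if (i == k)
      break; // non-latch exit with a computable exit count
    s += a[i];
  }
  return s;
}
```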

Differential Revision: https://reviews.llvm.org/D105817


# 287d39dd 01-Jul-2021 Paul Walker <[email protected]>

[NFC] Fix a few whitespace issues and typos.


Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# 5e6bfb66 11-Jun-2021 Simon Pilgrim <[email protected]>

[Analysis] Pass RecurrenceDescriptor as const reference. NFCI.

We were passing the RecurrenceDescriptor by value to most of the reduction analysis methods, despite it being rather bulky with TrackingVH members (that can be costly to copy). In all these cases we're only using the RecurrenceDescriptor for rather basic purposes (access to types/kinds etc.).

Differential Revision: https://reviews.llvm.org/D104029


# 4f01122c 08-Jun-2021 Joachim Meyer <[email protected]>

[LV] Parallel annotated loop does not imply all loads can be hoisted.

As noted in https://bugs.llvm.org/show_bug.cgi?id=46666, the current behavior of assuming if-conversion safety when a loop is annotated parallel (`!llvm.loop.parallel_accesses`) is not what users expect and can lead to invalid reads; the documentation for this behavior has since been removed from the LangRef again.
This was observed in POCL (https://github.com/pocl/pocl/issues/757) and would require similar workarounds in current work at hipSYCL.
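
A hypothetical illustration of such an invalid read (the pragma spelling is
clang's; the function itself is made up):

```
// Treating the conditional load a[idx[i]] as safe to execute unmasked is
// wrong: on iterations where cond[i] is false, idx[i] may point at memory
// that must never be touched.
void gather_if(float *out, const float *a, const int *idx,
               const bool *cond, int n) {
#pragma clang loop vectorize(assume_safety)
  for (int i = 0; i < n; ++i)
    if (cond[i])
      out[i] = a[idx[i]]; // speculating this load can fault
}
```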

The question remains why this was initially added and what the implications of removing this optimization would be.
Do we need an alternative mechanism to propagate the information about legality of if-conversion?
Or is the idea that conditional loads in `#pragma clang loop vectorize(assume_safety)` can be executed unmasked without additional checks flawed in general?
I think this implication is not part of what a user of that pragma (and corresponding metadata) would expect and thus dangerous.

Only two additional tests failed, which are adapted in this patch. Depending on the further direction force-ifcvt.ll should be removed or further adapted.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D103907


# 9f76a852 26-May-2021 Kerry McLaughlin <[email protected]>

[LoopVectorize] Enable strict reductions when allowReordering() returns false

When loop hints are passed via metadata, the allowReordering function
in LoopVectorizationLegality will allow the order of floating point
operations to be changed:

    bool allowReordering() const {
      // When enabling loop hints are provided we allow the vectorizer to change
      // the order of operations that is given by the scalar loop. This is not
      // enabled by default because can be unsafe or inefficient.

The -enable-strict-reductions flag introduced in D98435 will currently only
vectorize reductions in-loop if hints are used, since canVectorizeFPMath()
will return false if reordering is not allowed.

This patch changes canVectorizeFPMath() to query whether it is safe to
vectorize the loop with ordered reductions if no hints are used. For
testing purposes, an additional flag (-hints-allow-reordering) has been
added to disable the reordering behaviour described above.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D101836


Revision tags: llvmorg-12.0.1-rc1
# 4f86aa65 08-Apr-2021 Sander de Smalen <[email protected]>

[LV] Add -scalable-vectorization=<option> flag.

This patch adds a new option to the LoopVectorizer to control how
scalable vectors can be used.

Initially, this suggests three levels to control scalable
vectorization, although other more aggressive options can be added in
the future.

The possible options are:
- Disabled: Disables vectorization with scalable vectors.
- Enabled: Vectorize loops using scalable vectors or fixed-width
  vectors, but favors fixed-width vectors when the cost is a tie.
- Preferred: Like 'Enabled', but favoring scalable vectors when the
  cost-model is inconclusive.

Reviewed By: paulwalker-arm, vkmr

Differential Revision: https://reviews.llvm.org/D101945


# f82966d1 14-May-2021 Sander de Smalen <[email protected]>

[LoopVectorizationLegality] NFC: Mark some interfaces as 'const'

This patch marks blockNeedsPredication, isConsecutivePtr, isMaskRequired
and getSymbolicStrides as 'const'.


# ddb3b26a 28-Apr-2021 Bardia Mahjour <[email protected]>

[LV] Consider Loop Unroll Hints When Making Interleave Decisions

This patch causes the loop vectorizer to not interleave loops that have
nounroll loop hints (llvm.loop.unroll.disable and llvm.loop.unroll_count(1)).
Note that if a particular interleave count is being requested
(through llvm.loop.interleave_count), it will still be honoured, regardless
of the presence of nounroll hints.
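
A hypothetical source-level example of the hint in question:

```
// The unroll(disable) pragma lowers to llvm.loop.unroll.disable metadata;
// with this patch the vectorizer also stops interleaving the loop, unless an
// explicit llvm.loop.interleave_count is requested.
void saxpy(float *y, const float *x, float a, int n) {
#pragma clang loop unroll(disable)
  for (int i = 0; i < n; ++i)
    y[i] += a * x[i];
}
```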

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D101374

