History log of /llvm-project-15.0.7/llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp (Results 1 – 25 of 113)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# 6bec3e93 06-Oct-2021 Jay Foad <[email protected]>

[APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf

Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op).

[APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf

Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.

The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.

Differential Revision: https://reviews.llvm.org/D125557

show more ...


# 1fb415fe 13-Apr-2022 Johannes Doerfert <[email protected]>

[AMDGPU][FIX] Proper load-store-vectorizer result with opaque pointers

The original code relied on the fact that we needed a bitcast
instruction (for non constant base objects). With opaque pointers

[AMDGPU][FIX] Proper load-store-vectorizer result with opaque pointers

The original code relied on the fact that we needed a bitcast
instruction (for non constant base objects). With opaque pointers there
might not be a bitcast. Always check if reordering is required instead.

Fixes: https://github.com/llvm/llvm-project/issues/54896

Differential Revision: https://reviews.llvm.org/D123694

show more ...


# ed98c1b3 09-Mar-2022 serge-sans-paille <[email protected]>

Cleanup includes: DebugInfo & CodeGen

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332


# a494ae43 01-Mar-2022 serge-sans-paille <[email protected]>

Cleanup includes: TransformsUtils

Estimation on the impact on preprocessor output:
before: 1065307662
after: 1064800684

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-

Cleanup includes: TransformsUtils

Estimation on the impact on preprocessor output:
before: 1065307662
after: 1064800684

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120741

show more ...


# 0776f6e0 13-Jan-2022 Benjamin Kramer <[email protected]>

[LSV] Vectorize loads of vectors by turning it into a larger vector

Use shufflevector to do the subvector extracts. This allows a lot more
load merging on AMDGPU and also on NVPTX when <2 x half> is

[LSV] Vectorize loads of vectors by turning it into a larger vector

Use shufflevector to do the subvector extracts. This allows a lot more
load merging on AMDGPU and also on NVPTX when <2 x half> is involved.

Differential Revision: https://reviews.llvm.org/D117219

show more ...


# 330cb032 03-Jan-2022 Nikita Popov <[email protected]>

[LoadStoreVectorizer] Check for guaranteed-to-transfer (PR52950)

Rather than checking for nounwind in particular, make sure the
instruction is guaranteed to transfer execution, which will also
handl

[LoadStoreVectorizer] Check for guaranteed-to-transfer (PR52950)

Rather than checking for nounwind in particular, make sure the
instruction is guaranteed to transfer execution, which will also
handle non-willreturn calls correctly.

Fixes https://github.com/llvm/llvm-project/issues/52950.

show more ...


# 5f2f6118 03-Oct-2021 Dávid Bolvanský <[email protected]>

Fixed more warnings in LLVM produced by -Wbitwise-instead-of-logical


# cf284f6c 03-Oct-2021 hyeongyu kim <[email protected]>

[LSV] Change the default value of InstertElement to poison

This patch is changing the InsertElement's placeholder to poison without changing the LSV's behavior.

Regardless of whether `StoreTy` is F

[LSV] Change the default value of InstertElement to poison

This patch is changing the InsertElement's placeholder to poison without changing the LSV's behavior.

Regardless of whether `StoreTy` is FixedVectorType or not, the poison value will be overwritten with a different value.
Therefore, whether the InsertElement's placeholder is poison or undef will not affect the result of the program.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D111005

show more ...


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3
# 9d720dcb 31-Aug-2021 Nikita Popov <[email protected]>

[LoadStoreVectorizer] Make aliasing check more precise

The load store vectorizer currently uses isNoAlias() to determine
whether memory-accessing instructions should prevent vectorization.
However,

[LoadStoreVectorizer] Make aliasing check more precise

The load store vectorizer currently uses isNoAlias() to determine
whether memory-accessing instructions should prevent vectorization.
However, this only works for loads and stores. Additionally, a
couple of intrinsics like assume are special-cased to be ignored.

Instead use getModRefInfo() to generically determine whether the
instruction accesses/modifies the relevant location. This will
automatically handle all inaccessiblememonly intrinsics correctly
(as well as other calls that don't modref for other reasons).
This requires generalizing the code a bit, as it was previously
only considering loads and stored in particular.

Differential Revision: https://reviews.llvm.org/D109020

show more ...


Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4
# a9129f89 27-Jun-2021 Nikita Popov <[email protected]>

[LoadStoreVectorizer] Support opaque pointers

There are remaining redundant bitcasts.


Revision tags: llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# 11996586 10-Jun-2021 Slava Nikolaev <[email protected]>

LoadStoreVectorizer: support different operand orders in the add sequence match

First we refactor the code which does no wrapping add sequences
match: we need to allow different operand orders for
t

LoadStoreVectorizer: support different operand orders in the add sequence match

First we refactor the code which does no wrapping add sequences
match: we need to allow different operand orders for
the key add instructions involved in the match.

Then we use the refactored code trying 4 variants of matching operands.

Originally the code relied on the fact that the matching operands
of the two last add instructions of memory index calculations
had the same LHS argument. But which operand is the same
in the two instructions is actually not essential, so now we allow
that to be any of LHS or RHS of each of the two instructions.
This increases the chances of vectorization to happen.

Reviewed By: volkan

Differential Revision: https://reviews.llvm.org/D103912

show more ...


Revision tags: llvmorg-12.0.1-rc1
# e7d26ace 12-May-2021 Justin Bogner <[email protected]>

Change the context instruction for computeKnownBits in LoadStoreVectorizer pass

This change enables cases for which the index value for the first
load/store instruction in a pair could be a function

Change the context instruction for computeKnownBits in LoadStoreVectorizer pass

This change enables cases for which the index value for the first
load/store instruction in a pair could be a function argument. This
allows using llvm.assume to provide known bits information in such
cases.

Patch by Viacheslav Nikolaev. Thanks!

Differential Revision: https://reviews.llvm.org/D101680

show more ...


# 95427210 30-Apr-2021 Justin Bogner <[email protected]>

Add support for llvm.assume intrinsic to the LoadStoreVectorizer pass

Patch by Viacheslav Nikolaev. Thanks!


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2
# 11ef356d 05-Feb-2021 Craig Topper <[email protected]>

[TargetLowering] Use Align in allowsMisalignedMemoryAccesses.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96097


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1
# f3c44569 18-Nov-2020 Hongtao Yu <[email protected]>

[CSSPGO] IR intrinsic for pseudo-probe block instrumentation

This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://review

[CSSPGO] IR intrinsic for pseudo-probe block instrumentation

This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story.

A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persisting. The LLVM PGO instrumentation also instruments in similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. While these operations are very persisting or optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple issues:

1. IR Optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they are on the IR till very end of compilation, they can still prevent certain IR optimizations and result in lower code quality.
2. The counter atomics may not be fully cleaned up from the code stream eventually.
3. Extra work is needed for re-targeting.

We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that comes with an atomic operation but does not block desired optimizations as much as possible. More specifically the semantics associated with the new intrinsic enforces a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places they are probing are not going to be messed up by the optimizer while most of the IR optimizations still work. The core flags given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but is does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus it does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to an intrinsic so that they are uniquely identified and not merged in order to achieve good correlation quality.

Let's now look at an example. Given the following LLVM IR:

```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
%cmp = icmp eq i32 %x, 0
br i1 %cmp, label %bb1, label %bb2
bb1:
br label %bb3
bb2:
br label %bb3
bb3:
ret void
}
```

The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID.

```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
%cmp = icmp eq i32 %x, 0
call void @llvm.pseudoprobe(i64 837061429793323041, i64 1)
br i1 %cmp, label %bb1, label %bb2
bb1:
call void @llvm.pseudoprobe(i64 837061429793323041, i64 2)
br label %bb3
bb2:
call void @llvm.pseudoprobe(i64 837061429793323041, i64 3)
br label %bb3
bb3:
call void @llvm.pseudoprobe(i64 837061429793323041, i64 4)
ret void
}

```

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D86490

show more ...


Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# 5e630834 27-Aug-2020 Christopher Tetreault <[email protected]>

[SVE] Remove calls to VectorType::getNumElements from Transforms/Vectorize

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D82056


Revision tags: llvmorg-11.0.0-rc2
# b0eb40ca 31-Jul-2020 Vitaly Buka <[email protected]>

[NFC] Remove unused GetUnderlyingObject paramenter

Depends on D84617.

Differential Revision: https://reviews.llvm.org/D84621


# 89051eba 31-Jul-2020 Vitaly Buka <[email protected]>

[NFC] GetUnderlyingObject -> getUnderlyingObject

I am going to touch them in the next patch anyway


Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# 1507fc15 26-Jun-2020 Guillaume Chatelet <[email protected]>

[Alignment][NFC] Migrate TTI::isLegalToVectorize{Load,Store}Chain to Align

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/piperm

[Alignment][NFC] Migrate TTI::isLegalToVectorize{Load,Store}Chain to Align

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Differential Revision: https://reviews.llvm.org/D82653

show more ...


# d2befc66 29-May-2020 Christopher Tetreault <[email protected]>

[SVE] Eliminate calls to default-false VectorType::get() from Vectorize

Reviewers: efriedma, c-rhodes, david-arm, fhahn

Reviewed By: david-arm

Subscribers: tschuett, hiraditya, rkruppe, psnobl, ll

[SVE] Eliminate calls to default-false VectorType::get() from Vectorize

Reviewers: efriedma, c-rhodes, david-arm, fhahn

Reviewed By: david-arm

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80339

show more ...


Revision tags: llvmorg-10.0.1-rc1
# 63081dc6 18-May-2020 Volkan Keles <[email protected]>

LoadStoreVectorizer: Match nested adds to prove vectorization is safe

If both OpA and OpB is an add with NSW/NUW and with the same LHS operand,
we can guarantee that the transformation is safe if we

LoadStoreVectorizer: Match nested adds to prove vectorization is safe

If both OpA and OpB is an add with NSW/NUW and with the same LHS operand,
we can guarantee that the transformation is safe if we can prove that OpA
won't overflow when IdxDiff added to the RHS of OpA.

Review: https://reviews.llvm.org/D79817

show more ...


# 52e98f62 17-May-2020 Nikita Popov <[email protected]>

[Alignment] Remove unnecessary getValueOrABITypeAlignment calls (NFC)

Now that load/store alignment is required, we no longer need most
of them. Also switch the getLoadStoreAlignment() helper to ret

[Alignment] Remove unnecessary getValueOrABITypeAlignment calls (NFC)

Now that load/store alignment is required, we no longer need most
of them. Also switch the getLoadStoreAlignment() helper to return
Align instead of MaybeAlign.

show more ...


# 68b2e507 21-Apr-2020 Craig Topper <[email protected]>

[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign.

Differential Revision: https://reviews.llvm.org/D78443


# fcc9d702 20-Apr-2020 Craig Topper <[email protected]>

Revert "[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign."

This is breaking the clang build.

This reverts commit 897409fb56f4525639b0e47e88960f24cd91c924.


# 897409fb 20-Apr-2020 Craig Topper <[email protected]>

[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign.

Differential Revision: https://reviews.llvm.org/D78443


12345