|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
6bec3e93 |
| 06-Oct-2021 |
Jay Foad <[email protected]> |
[APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf
Most clients only used these methods because they wanted to be able to extend or truncate to the same bit width (which is a no-op). Now that the standard zext, sext and trunc allow this, there is no reason to use the OrSelf versions.
The OrSelf versions additionally have the strange behaviour of allowing extending to a *smaller* width, or truncating to a *larger* width, which are also treated as no-ops. A small amount of client code relied on this (ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and needed rewriting.
Differential Revision: https://reviews.llvm.org/D125557
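The change boils down to the standard extension/truncation methods now tolerating a same-width request. A minimal C++ sketch of the resulting call pattern, assuming the post-D125557 APInt API; the helper name `widenToIndexWidth` is hypothetical, not from the commit:
```
// Sketch only: zext/sext/trunc now accept the value's current width as a no-op.
#include "llvm/ADT/APInt.h"
using namespace llvm;

APInt widenToIndexWidth(const APInt &V, unsigned IndexWidth) {
  // Before this change, callers needed zextOrSelf() to tolerate
  // V.getBitWidth() == IndexWidth; plain zext() now handles that case.
  if (V.getBitWidth() <= IndexWidth)
    return V.zext(IndexWidth);  // widen, or no-op at equal width
  return V.trunc(IndexWidth);   // explicitly truncate wider values
}
```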
|
| #
1fb415fe |
| 13-Apr-2022 |
Johannes Doerfert <[email protected]> |
[AMDGPU][FIX] Proper load-store-vectorizer result with opaque pointers
The original code relied on the fact that we needed a bitcast instruction (for non-constant base objects). With opaque pointers there might not be a bitcast. Always check if reordering is required instead.
Fixes: https://github.com/llvm/llvm-project/issues/54896
Differential Revision: https://reviews.llvm.org/D123694
|
| #
ed98c1b3 |
| 09-Mar-2022 |
serge-sans-paille <[email protected]> |
Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
|
| #
a494ae43 |
| 01-Mar-2022 |
serge-sans-paille <[email protected]> |
Cleanup includes: TransformsUtils
Estimation on the impact on preprocessor output: before: 1065307662 after: 1064800684
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120741
|
| #
0776f6e0 |
| 13-Jan-2022 |
Benjamin Kramer <[email protected]> |
[LSV] Vectorize loads of vectors by turning it into a larger vector
Use shufflevector to do the subvector extracts. This allows a lot more load merging on AMDGPU and also on NVPTX when <2 x half> is involved.
Differential Revision: https://reviews.llvm.org/D117219
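As a rough illustration of the shufflevector-based subvector extraction described above (helper name and setup are hypothetical, not the pass's actual code):
```
// Sketch: pull a contiguous slice of NumElts elements, starting at Start,
// out of a wider vectorized load using a shufflevector.
#include "llvm/ADT/SmallVector.h"
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

Value *extractSubvector(IRBuilder<> &B, Value *WideLoad,
                        unsigned Start, unsigned NumElts) {
  SmallVector<int, 8> Mask;
  for (unsigned I = 0; I != NumElts; ++I)
    Mask.push_back(Start + I);
  // The single-operand form shuffles against poison for the second operand.
  return B.CreateShuffleVector(WideLoad, Mask);
}
```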
|
| #
330cb032 |
| 03-Jan-2022 |
Nikita Popov <[email protected]> |
[LoadStoreVectorizer] Check for guaranteed-to-transfer (PR52950)
Rather than checking for nounwind in particular, make sure the instruction is guaranteed to transfer execution, which will also handle non-willreturn calls correctly.
Fixes https://github.com/llvm/llvm-project/issues/52950.
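A hedged sketch of the check described above (the helper name is illustrative, not the pass's code):
```
// Bail out on any instruction that is not guaranteed to transfer execution
// to its successor; this covers throwing calls as well as non-willreturn
// calls, which a plain nounwind check would miss.
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Instruction.h"
using namespace llvm;

bool mayBlockVectorization(const Instruction &I) {
  return !isGuaranteedToTransferExecutionToSuccessor(&I);
}
```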
|
| #
5f2f6118 |
| 03-Oct-2021 |
Dávid Bolvanský <[email protected]> |
Fixed more warnings in LLVM produced by -Wbitwise-instead-of-logical
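For context, a small illustrative example of the pattern this warning targets (not code from the commit): a bitwise operator applied to boolean operands where a short-circuiting logical operator was intended.
```
bool isCandidate(bool HasLegalType, bool HasLegalAlign) {
  // return HasLegalType & HasLegalAlign;  // -Wbitwise-instead-of-logical:
  //                                       // both operands always evaluated
  return HasLegalType && HasLegalAlign;    // logical AND short-circuits
}
```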
|
| #
cf284f6c |
| 03-Oct-2021 |
hyeongyu kim <[email protected]> |
[LSV] Change the default value of InsertElement to poison
This patch changes the InsertElement placeholder to poison without changing the LSV's behavior.
Regardless of whether `StoreTy` is a FixedVectorType or not, the poison value will be overwritten with a different value. Therefore, whether the InsertElement placeholder is poison or undef does not affect the result of the program.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D111005
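A minimal sketch of the pattern being changed (names are hypothetical): the vector accumulator starts out as poison instead of undef, and every lane is subsequently overwritten, so the placeholder never reaches the final result.
```
#include "llvm/ADT/ArrayRef.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

Value *buildVector(IRBuilder<> &B, ArrayRef<Value *> Elts,
                   FixedVectorType *VecTy) {
  Value *Vec = PoisonValue::get(VecTy);  // previously UndefValue::get(VecTy)
  for (unsigned I = 0, E = Elts.size(); I != E; ++I)
    Vec = B.CreateInsertElement(Vec, Elts[I], B.getInt32(I));
  return Vec;
}
```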
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
| #
9d720dcb |
| 31-Aug-2021 |
Nikita Popov <[email protected]> |
[LoadStoreVectorizer] Make aliasing check more precise
The load store vectorizer currently uses isNoAlias() to determine whether memory-accessing instructions should prevent vectorization. However, this only works for loads and stores. Additionally, a couple of intrinsics like assume are special-cased to be ignored.
Instead use getModRefInfo() to generically determine whether the instruction accesses/modifies the relevant location. This will automatically handle all inaccessiblememonly intrinsics correctly (as well as other calls that don't modref for other reasons). This requires generalizing the code a bit, as it was previously only considering loads and stores in particular.
Differential Revision: https://reviews.llvm.org/D109020
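A hedged sketch of the more general query (function name and structure are illustrative, not the pass's code): ask alias analysis whether an arbitrary instruction may read or write the location being vectorized, instead of an isNoAlias() check that only applies to loads and stores.
```
#include "llvm/Analysis/AliasAnalysis.h"
using namespace llvm;

bool mayClobber(AAResults &AA, Instruction &Barrier, const MemoryLocation &Loc) {
  // inaccessiblememonly intrinsics such as llvm.assume report NoModRef here,
  // so they no longer need to be special-cased.
  return isModOrRefSet(AA.getModRefInfo(&Barrier, Loc));
}
```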
|
|
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4 |
|
| #
a9129f89 |
| 27-Jun-2021 |
Nikita Popov <[email protected]> |
[LoadStoreVectorizer] Support opaque pointers
There are remaining redundant bitcasts.
|
|
Revision tags: llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
11996586 |
| 10-Jun-2021 |
Slava Nikolaev <[email protected]> |
LoadStoreVectorizer: support different operand orders in the add sequence match
First we refactor the code that matches no-wrap add sequences: we need to allow different operand orders for the key add instructions involved in the match.
Then we use the refactored code to try the 4 variants of matching operands.
Originally the code relied on the fact that the matching operands of the last two add instructions of the memory index calculations had the same LHS argument. But which operand is shared between the two instructions is not actually essential, so now we allow it to be either the LHS or the RHS of each of the two instructions. This increases the chances of vectorization happening.
Reviewed By: volkan
Differential Revision: https://reviews.llvm.org/D103912
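An illustrative sketch of trying all operand pairings when looking for the operand that the two adds share (not the pass's actual code; names are hypothetical):
```
#include "llvm/IR/Instructions.h"
using namespace llvm;

bool findSharedOperand(BinaryOperator *OpA, BinaryOperator *OpB,
                       Value *&Shared, Value *&DiffA, Value *&DiffB) {
  // The shared operand may be the LHS or the RHS of either add.
  for (unsigned IA = 0; IA != 2; ++IA)
    for (unsigned IB = 0; IB != 2; ++IB)
      if (OpA->getOperand(IA) == OpB->getOperand(IB)) {
        Shared = OpA->getOperand(IA);
        DiffA = OpA->getOperand(1 - IA);  // the non-shared operands carry
        DiffB = OpB->getOperand(1 - IB);  // the index difference
        return true;
      }
  return false;
}
```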
|
|
Revision tags: llvmorg-12.0.1-rc1 |
|
| #
e7d26ace |
| 12-May-2021 |
Justin Bogner <[email protected]> |
Change the context instruction for computeKnownBits in LoadStoreVectorizer pass
This change enables cases for which the index value for the first load/store instruction in a pair could be a function argument. This allows using llvm.assume to provide known bits information in such cases.
Patch by Viacheslav Nikolaev. Thanks!
Differential Revision: https://reviews.llvm.org/D101680
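A hedged sketch of the query pattern (variable names are illustrative): passing the memory access as the context instruction lets llvm.assume facts that dominate it contribute to the known bits of an index that is a function argument.
```
#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/Support/KnownBits.h"
using namespace llvm;

KnownBits knownBitsAtUse(const Value *IdxVal, const DataLayout &DL,
                         AssumptionCache *AC, const Instruction *MemInst,
                         const DominatorTree *DT) {
  // Context instruction = the load/store being vectorized, not the
  // definition point of IdxVal.
  return computeKnownBits(IdxVal, DL, /*Depth=*/0, AC, MemInst, DT);
}
```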
|
| #
95427210 |
| 30-Apr-2021 |
Justin Bogner <[email protected]> |
Add support for llvm.assume intrinsic to the LoadStoreVectorizer pass
Patch by Viacheslav Nikolaev. Thanks!
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
| #
11ef356d |
| 05-Feb-2021 |
Craig Topper <[email protected]> |
[TargetLowering] Use Align in allowsMisalignedMemoryAccesses.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D96097
|
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
| #
f3c44569 |
| 18-Nov-2020 |
Hongtao Yu <[email protected]> |
[CSSPGO] IR intrinsic for pseudo-probe block instrumentation
This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story.
A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to persist through the compilation pipeline. The LLVM PGO instrumentation instruments similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. Since those operations are persistent, i.e. optimization-resilient, in theory we could borrow the atomic read/write implementation from the PGO counters and cut it off at the end of compilation, with all the atomics converted into binary data. This was our initial design and we saw promising sample correlation quality with it. However, the atomics approach has a couple of issues:
1. IR optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they stay in the IR until the very end of compilation, they can still prevent certain IR optimizations and result in lower code quality.
2. The counter atomics may not be fully cleaned up from the code stream eventually.
3. Extra work is needed for re-targeting.
We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that come with an atomic operation while blocking desired optimizations as little as possible. More specifically, the semantics associated with the new intrinsic enforce that a pseudo probe is virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places it probes are not disturbed by the optimizer while most IR optimizations still work. The core flag given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and has a side effect, so it is not removable, but it does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to each intrinsic call so that they are uniquely identified and not merged, in order to achieve good correlation quality.
Let's now look at an example. Given the following LLVM IR:
```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
  %cmp = icmp eq i32 %x, 0
  br i1 %cmp, label %bb1, label %bb2
bb1:
  br label %bb3
bb2:
  br label %bb3
bb3:
  ret void
}
```
The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID.
```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
  %cmp = icmp eq i32 %x, 0
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 1)
  br i1 %cmp, label %bb1, label %bb2
bb1:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 2)
  br label %bb3
bb2:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 3)
  br label %bb3
bb3:
  call void @llvm.pseudoprobe(i64 837061429793323041, i64 4)
  ret void
}
```
Reviewed By: wmi
Differential Revision: https://reviews.llvm.org/D86490
|
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
| #
5e630834 |
| 27-Aug-2020 |
Christopher Tetreault <[email protected]> |
[SVE] Remove calls to VectorType::getNumElements from Transforms/Vectorize
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D82056
|
|
Revision tags: llvmorg-11.0.0-rc2 |
|
| #
b0eb40ca |
| 31-Jul-2020 |
Vitaly Buka <[email protected]> |
[NFC] Remove unused GetUnderlyingObject parameter
Depends on D84617.
Differential Revision: https://reviews.llvm.org/D84621
|
| #
89051eba |
| 31-Jul-2020 |
Vitaly Buka <[email protected]> |
[NFC] GetUnderlyingObject -> getUnderlyingObject
I am going to touch them in the next patch anyway
|
|
Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
| #
1507fc15 |
| 26-Jun-2020 |
Guillaume Chatelet <[email protected]> |
[Alignment][NFC] Migrate TTI::isLegalToVectorize{Load,Store}Chain to Align
This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Differential Revision: https://reviews.llvm.org/D82653
|
| #
d2befc66 |
| 29-May-2020 |
Christopher Tetreault <[email protected]> |
[SVE] Eliminate calls to default-false VectorType::get() from Vectorize
Reviewers: efriedma, c-rhodes, david-arm, fhahn
Reviewed By: david-arm
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80339
|
|
Revision tags: llvmorg-10.0.1-rc1 |
|
| #
63081dc6 |
| 18-May-2020 |
Volkan Keles <[email protected]> |
LoadStoreVectorizer: Match nested adds to prove vectorization is safe
If both OpA and OpB are adds with NSW/NUW and with the same LHS operand, we can guarantee that the transformation is safe if we can prove that OpA won't overflow when IdxDiff is added to the RHS of OpA.
Review: https://reviews.llvm.org/D79817
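A worked sketch of that condition using PatternMatch (a hypothetical shape, not the pass's code): if `OpA = add nuw %base, %x` and `OpB = add nuw %base, %y` where `%y = add nuw %x, IdxDiff`, and the inner add cannot wrap, then `OpB == OpA + IdxDiff`, so the two addresses are exactly IdxDiff apart.
```
#include <cstdint>
#include "llvm/IR/PatternMatch.h"
using namespace llvm;
using namespace llvm::PatternMatch;

bool nestedAddsMatch(Value *OpA, Value *OpB, uint64_t IdxDiff) {
  Value *Base = nullptr, *X = nullptr;
  // OpA = add nuw Base, X
  if (!match(OpA, m_NUWAdd(m_Value(Base), m_Value(X))))
    return false;
  // OpB = add nuw Base, (add nuw X, IdxDiff)
  return match(OpB, m_NUWAdd(m_Specific(Base),
                             m_NUWAdd(m_Specific(X), m_SpecificInt(IdxDiff))));
}
```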
|
| #
52e98f62 |
| 17-May-2020 |
Nikita Popov <[email protected]> |
[Alignment] Remove unnecessary getValueOrABITypeAlignment calls (NFC)
Now that load/store alignment is required, we no longer need most of them. Also switch the getLoadStoreAlignment() helper to return Align instead of MaybeAlign.
|
| #
68b2e507 |
| 21-Apr-2020 |
Craig Topper <[email protected]> |
[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign.
Differential Revision: https://reviews.llvm.org/D78443
|
| #
fcc9d702 |
| 20-Apr-2020 |
Craig Topper <[email protected]> |
Revert "[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign."
This is breaking the clang build.
This reverts commit 897409fb56f4525639b0e47e88960f24cd91c924.
|
| #
897409fb |
| 20-Apr-2020 |
Craig Topper <[email protected]> |
[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign.
Differential Revision: https://reviews.llvm.org/D78443
|