|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3 |
|
| #
d945a2c9 |
| 09-Aug-2022 |
Dinar Temirbulatov <[email protected]> |
[AArch64][LoopVectorize] Introduce trip count minimal value threshold to ignore tail-folding.
After D121595 was commited, I noticed regressions assosicated with small trip count numbersvectorisation
[AArch64][LoopVectorize] Introduce trip count minimal value threshold to ignore tail-folding.
After D121595 was commited, I noticed regressions assosicated with small trip count numbersvectorisation by tail folding with scalable vectors. As a solution for those issues I propose to introduce the minimal trip count threshold value.
Differential Revision: https://reviews.llvm.org/D130755
(cherry picked from commit cab6cd68340255be241b7cf169c67a1899ced115)
show more ...
|
|
Revision tags: llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
41958f76 |
| 22-Jul-2022 |
Malhar Jajoo <[email protected]> |
[Costmodel] Add "type-based-intrinsic-cost" cli option
This patch adds a command line flag to be able to test the type based cost-model analysis for Intrinsics.
Differential Revision: https://revie
[Costmodel] Add "type-based-intrinsic-cost" cli option
This patch adds a command line flag to be able to test the type based cost-model analysis for Intrinsics.
Differential Revision: https://reviews.llvm.org/D129109
show more ...
|
| #
f15b6b29 |
| 12-Jul-2022 |
David Sherwood <[email protected]> |
[AArch64] Add target hook for preferPredicateOverEpilogue
This patch adds the AArch64 hook for preferPredicateOverEpilogue, which currently returns true if SVE is enabled and one of the following co
[AArch64] Add target hook for preferPredicateOverEpilogue
This patch adds the AArch64 hook for preferPredicateOverEpilogue, which currently returns true if SVE is enabled and one of the following conditions (non-exhaustive) is met:
1. The "sve-tail-folding" option is set to "all", or 2. The "sve-tail-folding" option is set to "all+noreductions" and the loop does not contain reductions, 3. The "sve-tail-folding" option is set to "all+norecurrences" and the loop has no first-order recurrences.
Currently the default option is "disabled", but this will be changed in a later patch.
I've added new tests to show the options behave as expected here:
Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll
Differential Revision: https://reviews.llvm.org/D129560
show more ...
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
| #
03fee671 |
| 10-May-2022 |
David Sherwood <[email protected]> |
[LoopVectorize] Add option to use active lane mask for loop control flow
Currently, for vectorised loops that use the get.active.lane.mask intrinsic we only use the mask for predicated vector operat
[LoopVectorize] Add option to use active lane mask for loop control flow
Currently, for vectorised loops that use the get.active.lane.mask intrinsic we only use the mask for predicated vector operations, such as masked loads and stores, etc. The loop itself is still controlled by comparing the canonical induction variable with the trip count. However, for some targets this is inefficient when it's cheap to use the mask itself to control the loop.
This patch adds support for using the active lane mask for control flow by:
1. Generating the active lane mask for the next iteration of the vector loop, rather than the current one. If there are still any remaining iterations then at least the first bit of the mask will be set. 2. Extract the first bit of this mask and use this bit for the conditional branch.
I did this by creating a new VPActiveLaneMaskPHIRecipe that sets up the initial PHI values in the vector loop pre-header. I've also made use of the new BranchOnCond VPInstruction for the final instruction in the loop region.
Differential Revision: https://reviews.llvm.org/D125301
show more ...
|
| #
0b5ead65 |
| 29-Jun-2022 |
Chuanqi Xu <[email protected]> |
[WebAssembly] Don't set musttail for coroutines when tail-call is not enabled
The C++20 Coroutines couldn't be compiled to WebAssembly due to an optimization named symmetric transfer requires the su
[WebAssembly] Don't set musttail for coroutines when tail-call is not enabled
The C++20 Coroutines couldn't be compiled to WebAssembly due to an optimization named symmetric transfer requires the support for musttail calls but WebAssembly doesn't support it yet.
This patch tries to fix the problem by adding a supportsTailCalls method to TargetTransformImpl to skip the symmetric transfer when tail-call feature is not supported.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D128794
show more ...
|
| #
7a9ad257 |
| 22-Jun-2022 |
Vasileios Porpodas <[email protected]> |
Recommit "[SLP][X86] Improve reordering to consider alternate instruction bundles"
This reverts commit 6d6268dcbf0f48e43f6f9fe46b3a28c29ba63c7d.
Review: https://reviews.llvm.org/D125712
|
| #
6d6268dc |
| 22-Jun-2022 |
Vasileios Porpodas <[email protected]> |
Revert "[SLP][X86] Improve reordering to consider alternate instruction bundles"
This reverts commit 6f88acf410b48f3e6c1526df2dc32ed86f249685.
|
| #
6f88acf4 |
| 13-May-2022 |
Vasileios Porpodas <[email protected]> |
[SLP][X86] Improve reordering to consider alternate instruction bundles
During the reordering transformation we should try to avoid reordering bundles like fadd,fsub because this may block them bein
[SLP][X86] Improve reordering to consider alternate instruction bundles
During the reordering transformation we should try to avoid reordering bundles like fadd,fsub because this may block them being matched into a single vector instruction in x86. We do this by checking if a TreeEntry is such a pattern and adding it to the list of TreeEntries with orders that need to be considered.
Differential Revision: https://reviews.llvm.org/D125712
show more ...
|
| #
a9dccb00 |
| 16-Jun-2022 |
Congzhe Cao <[email protected]> |
[TargetTransformInfo] Added an opt/llc option for cache line size
In some passes we need a valid number of cache line size to do analysis or transformation, e.g., loop cache analysis and loop date p
[TargetTransformInfo] Added an opt/llc option for cache line size
In some passes we need a valid number of cache line size to do analysis or transformation, e.g., loop cache analysis and loop date prefetch. However, for some backend targets, `TTIImpl->getCacheLineSize()` is not implemented and hence 'TTI.getCacheLineSize()' would just return 0 which eventually might produce invalid result.
In this patch we add a user-specified opt/llc option for cache line size. If the option is specified by users we use the value supplied, otherwise we fall-back to the default value obtained from `TTIImpl->->getCacheLineSize()`. The powerpc target already has such an option, this patch generalizes this option to TargetTransformInfo.cpp.
Reviewed By: bmahjour, #loopoptwg
Differential Revision: https://reviews.llvm.org/D127342
show more ...
|
| #
6a845792 |
| 25-May-2022 |
eopXD <[email protected]> |
[LSR][TTI][PowerPC][SystemZ][X86] Add const-ness to TTI::isLSRCostLess. NFC
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D126350
|
| #
bb82f746 |
| 23-May-2022 |
Jingu Kang <[email protected]> |
Revert "Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth""
This reverts commit 42ebfa8269470e6b1fe2de996d3f1db6d142e16a.
The commmit from https://reviews.llvm.org/D125918 has fix
Revert "Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth""
This reverts commit 42ebfa8269470e6b1fe2de996d3f1db6d142e16a.
The commmit from https://reviews.llvm.org/D125918 has fixed the stage 2 build failure.
Differential Revision: https://reviews.llvm.org/D118979
show more ...
|
| #
ade47bdc |
| 16-May-2022 |
Peter Waller <[email protected]> |
[LV] Improve register pressure estimate at high VFs
Previously, `getRegUsageForType` was implemented using `getTypeLegalizationCost`. `getRegUsageForType` is used by the loop vectorizer to estimate
[LV] Improve register pressure estimate at high VFs
Previously, `getRegUsageForType` was implemented using `getTypeLegalizationCost`. `getRegUsageForType` is used by the loop vectorizer to estimate the register pressure caused by using a vector type. However, `getTypeLegalizationCost` currently only appears to understand splitting and not scalarization, so significantly underestimates the register requirements.
Instead, use `getNumRegisters`, which understands when scalarization can occur (via computeRegisterProperties).
This was discovered while investigating D118979 (Set maximum VF with shouldMaximizeVectorBandwidth), where under fixed-length 512-bit SVE the loop vectorizer previously ends up costing an v128i1 as 2 v64i* registers where it actually occupies 128 i32 registers.
I'm sending this patch early for comment, I'm still doing some sanity checking with LNT. I note that getRegisterClassForType appears to return VectorRC even though the type in question (large vNi1 types) end up occupying scalar registers. That might be worth fixing too.
Differential Revision: https://reviews.llvm.org/D125918
show more ...
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2 |
|
| #
9dc4ced2 |
| 22-Apr-2022 |
Alexey Bataev <[email protected]> |
[SLP]Try partial store vectorization if supported by target.
We can try to vectorize number of stores less than MinVecRegSize / scalar_value_size, if it is allowed by target. Gives an extra opportun
[SLP]Try partial store vectorization if supported by target.
We can try to vectorize number of stores less than MinVecRegSize / scalar_value_size, if it is allowed by target. Gives an extra opportunity for the vectorization.
Fixes PR54985.
Differential Revision: https://reviews.llvm.org/D124284
show more ...
|
| #
fa8a9fea |
| 26-Apr-2022 |
Vasileios Porpodas <[email protected]> |
Recommit "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`"
This reverts commit 6a9bbd9f20dcd700e28738788bb63a160c6c088c.
Code review: https://reviews.llvm.or
Recommit "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`"
This reverts commit 6a9bbd9f20dcd700e28738788bb63a160c6c088c.
Code review: https://reviews.llvm.org/D124202
show more ...
|
| #
6a9bbd9f |
| 26-Apr-2022 |
Vasileios Porpodas <[email protected]> |
Revert "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`"
This reverts commit 55ce296d6f217fd0defed2592ff7b74b79b2c1f0.
|
| #
55ce296d |
| 21-Apr-2022 |
Vasileios Porpodas <[email protected]> |
[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`
Before this patch `Args` was used to pass a broadcat's arguments by SLP. This patch changes this. `Args` is no
[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`
Before this patch `Args` was used to pass a broadcat's arguments by SLP. This patch changes this. `Args` is now used for passing the operands of the shuffle.
Differential Revision: https://reviews.llvm.org/D124202
show more ...
|
| #
889588ee |
| 20-Apr-2022 |
Vasileios Porpodas <[email protected]> |
[SLP] Refactoring isLegalBroadcastLoad() to use `ElementCount`.
Replacing `unsigned` with `ElementCount` in the argument of `isLegalBroadcastLoad()`. This helps reduce the diff of a future SLP patch
[SLP] Refactoring isLegalBroadcastLoad() to use `ElementCount`.
Replacing `unsigned` with `ElementCount` in the argument of `isLegalBroadcastLoad()`. This helps reduce the diff of a future SLP patch for AArch64.
show more ...
|
| #
42ebfa82 |
| 12-Apr-2022 |
Muhammad Omair Javaid <[email protected]> |
Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth"
This reverts commit 64b6192e812977092242ae34d6eafdcd42fea39d.
This broke LLVM AArch64 buildbot clang-aarch64-sve-vls-2stage:
ht
Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth"
This reverts commit 64b6192e812977092242ae34d6eafdcd42fea39d.
This broke LLVM AArch64 buildbot clang-aarch64-sve-vls-2stage:
https://lab.llvm.org/buildbot/#/builders/176/builds/1515
llvm-tblgen crashes after applying this patch.
show more ...
|
|
Revision tags: llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1 |
|
| #
da41214d |
| 04-Feb-2022 |
Evgeniy Brevnov <[email protected]> |
Add support for atomic memory copy lowering
Currently, the utility supports lowering of non atomic memory transfer routines only. This patch adds support for atomic version of memcopy. This may be u
Add support for atomic memory copy lowering
Currently, the utility supports lowering of non atomic memory transfer routines only. This patch adds support for atomic version of memcopy. This may be useful for targets not supporting atomic memcopy.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D118443
show more ...
|
| #
64b6192e |
| 05-Apr-2022 |
Jingu Kang <[email protected]> |
[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth
Set the maximum VF of AArch64 with 128 / the size of smallest type in loop.
Differential Revision: https://reviews.llvm.org/D118979
|
| #
39aa202a |
| 24-Mar-2022 |
Vasileios Porpodas <[email protected]> |
Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354
This reverts commit e6ead19b774718113007ecb1a4
Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354
This reverts commit e6ead19b774718113007ecb1a4449d7af0cbcfeb.
show more ...
|
| #
e6ead19b |
| 23-Mar-2022 |
Arthur Eubanks <[email protected]> |
Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash."
This reverts commit 27bd8f94928201f87f6b659fc2228efd539e8245.
Causes crashes, see comme
Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash."
This reverts commit 27bd8f94928201f87f6b659fc2228efd539e8245.
Causes crashes, see comments in D121973
show more ...
|
| #
27bd8f94 |
| 22-Mar-2022 |
Vasileios Porpodas <[email protected]> |
Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354
This reverts commit f7d7d2a08d16356c57f6d2d36b
Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 2, fixed assertion crash.
Original review: https://reviews.llvm.org/D121354
This reverts commit f7d7d2a08d16356c57f6d2d36bc2fc0589a55df9.
show more ...
|
| #
f7d7d2a0 |
| 22-Mar-2022 |
Arthur Eubanks <[email protected]> |
Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads.""
This reverts commit 79613185d305013de743cdbd6690e4d77c8af27e.
Causes crashes, see comments in https://reviews.llvm.org/D1
Revert "Recommit "[SLP] Fix lookahead operand reordering for splat loads.""
This reverts commit 79613185d305013de743cdbd6690e4d77c8af27e.
Causes crashes, see comments in https://reviews.llvm.org/D121973.
show more ...
|
| #
79613185 |
| 18-Mar-2022 |
Vasileios Porpodas <[email protected]> |
Recommit "[SLP] Fix lookahead operand reordering for splat loads."
Original review: https://reviews.llvm.org/D121354
The original commit 9136145eb019e1d18c966d4d06a3df349b88cc14 broke the build on
Recommit "[SLP] Fix lookahead operand reordering for splat loads."
Original review: https://reviews.llvm.org/D121354
The original commit 9136145eb019e1d18c966d4d06a3df349b88cc14 broke the build on several targets.
Differential Revision: https://reviews.llvm.org/D121973
show more ...
|