|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
9df0b254 |
| 23-Jul-2022 |
Nuno Lopes <[email protected]> |
[NFC] Switch a few uses of undef to poison as placeholders for unreachable code
|
| #
cc88445a |
| 30-Jun-2022 |
Sanjay Patel <[email protected]> |
[InstCombine] canonicalize 'icmp (trunc X), C' to 'icmp (X & Mask), C'
I looked at canonicalizing in the other direction, but that causes many potential regressions and infinite loops because we alr
[InstCombine] canonicalize 'icmp (trunc X), C' to 'icmp (X & Mask), C'
I looked at canonicalizing in the other direction, but that causes many potential regressions and infinite loops because we already (possibly wrongly) canonicalize "trunc X to i1" into an and+icmp.
This has a data layout restriction to avoid creating illegal mask instructions, but we could remove that if we can show that the backend can undo this when needed.
The motivating example from issue #56119 is modeled by the PhaseOrdering test.
show more ...
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
872f7000 |
| 03-Apr-2022 |
Dávid Bolvanský <[email protected]> |
Revert "[NFCI] Regenerate SROA/LoopVectorize test checks"
This reverts commit 14e3450fb57305aa9ff3e9e60687b458e43835c9.
|
| #
a113a582 |
| 03-Apr-2022 |
Dávid Bolvanský <[email protected]> |
[NFCI] Regenerate LoopVectorize test checks
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
a266af72 |
| 14-Feb-2022 |
Nikita Popov <[email protected]> |
[InstCombine] Canonicalize SPF to min/max intrinsics
Now that integer min/max intrinsics have good support in both InstCombine and other passes, start canonicalizing SPF min/max to intrinsic min/max
[InstCombine] Canonicalize SPF to min/max intrinsics
Now that integer min/max intrinsics have good support in both InstCombine and other passes, start canonicalizing SPF min/max to intrinsic min/max.
Once this sticks, we can stop matching SPF min/max in various places, and can remove hacks we have for preventing infinite loops and breaking of SPF canonicalization.
Differential Revision: https://reviews.llvm.org/D98152
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
e6ad9ef4 |
| 14-Dec-2021 |
Philip Reames <[email protected]> |
[instcombine] Canonicalize constant index type to i64 for extractelement/insertelement
The basic idea to this is that a) having a single canonical type makes CSE easier, and b) many of our transform
[instcombine] Canonicalize constant index type to i64 for extractelement/insertelement
The basic idea to this is that a) having a single canonical type makes CSE easier, and b) many of our transforms are inconsistent about which types we end up with based on visit order.
I'm restricting this to constants as for non-constants, we'd have to decide whether the simplicity was worth extra instructions. For constants, there are no extra instructions.
We chose the canonical type as i64 arbitrarily. We might consider changing this to something else in the future if we have cause.
Differential Revision: https://reviews.llvm.org/D115387
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
c2441b6b |
| 11-Oct-2021 |
Rosie Sumpter <[email protected]> |
[LoopVectorize] Add vector reduction support for fmuladd intrinsic
Enables LoopVectorize to handle reduction patterns involving the llvm.fmuladd intrinsic.
Differential Revision: https://reviews.ll
[LoopVectorize] Add vector reduction support for fmuladd intrinsic
Enables LoopVectorize to handle reduction patterns involving the llvm.fmuladd intrinsic.
Differential Revision: https://reviews.llvm.org/D111555
show more ...
|
| #
9cd7c534 |
| 22-Nov-2021 |
Huihui Zhang <[email protected]> |
[InstCombine] Enable fold select into operand for FAdd, FMul, FSub and FDiv.
For FAdd, FMul, FSub and FDiv, fold select into one of the operands to enable further optimizations, i.e., floating-poin
[InstCombine] Enable fold select into operand for FAdd, FMul, FSub and FDiv.
For FAdd, FMul, FSub and FDiv, fold select into one of the operands to enable further optimizations, i.e., floating-point reduction detection.
Turn code: %C = fadd %A, %B %D = select %cond, %C, %A
into: %C = select %cond, %B, -0.000000e+00 %D = fadd %A, %C
Alive2 verification (with --disable-undef-input), timed out otherwise. FAdd - https://alive2.llvm.org/ce/z/eUxN4Y FMul - https://alive2.llvm.org/ce/z/5SWZz4 FSub - https://alive2.llvm.org/ce/z/Dhj8dU FDiv - https://alive2.llvm.org/ce/z/Yj_NA2
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D113442
show more ...
|
| #
dcb8222d |
| 26-Oct-2021 |
Rosie Sumpter <[email protected]> |
[LoopVectorize] Propagate fast-math flags for inloop reductions
This patch updates VPReductionRecipe::execute so that the fast-math flags associated with the underlying instruction of the VPRecipe a
[LoopVectorize] Propagate fast-math flags for inloop reductions
This patch updates VPReductionRecipe::execute so that the fast-math flags associated with the underlying instruction of the VPRecipe are propagated through to the reductions which are created.
Differential Revision: https://reviews.llvm.org/D112548
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1 |
|
| #
9d355949 |
| 30-Jul-2021 |
Kerry McLaughlin <[email protected]> |
Reland "[LV] Use lookThroughAnd with logical reductions"
If a reduction Phi has a single user which `AND`s the Phi with a type mask, `lookThroughAnd` will return the user of the Phi and the narrower
Reland "[LV] Use lookThroughAnd with logical reductions"
If a reduction Phi has a single user which `AND`s the Phi with a type mask, `lookThroughAnd` will return the user of the Phi and the narrower type represented by the mask. Currently this is only used for arithmetic reductions, whereas loops containing logical reductions will create a reduction intrinsic using the widened type, for example:
for.body: %phi = phi i32 [ %and, %for.body ], [ 255, %entry ] %mask = and i32 %phi, 255 %gep = getelementptr inbounds i8, i8* %ptr, i32 %iv %load = load i8, i8* %gep %ext = zext i8 %load to i32 %and = and i32 %mask, %ext ...
^ this will generate an and reduction intrinsic such as the following: call i32 @llvm.vector.reduce.and.v8i32(<8 x i32>...)
The same example for an add instruction would create an intrinsic of type i8: call i8 @llvm.vector.reduce.add.v8i8(<8 x i8>...)
This patch changes AddReductionVar to call lookThroughAnd for other integer reductions, allowing loops similar to the example above with reductions such as and, or & xor to vectorize.
Reviewed By: david-arm, dmgreen
Differential Revision: https://reviews.llvm.org/D105632
show more ...
|
|
Revision tags: llvmorg-14-init |
|
| #
be753b20 |
| 21-Jul-2021 |
Kerry McLaughlin <[email protected]> |
Revert "[LV] Use lookThroughAnd with logical reductions"
Reverting patch due to buildbot failures.
This reverts commit e22a59967251294ccdac6b43a06f48c1b7075240.
|
| #
e22a5996 |
| 19-Jul-2021 |
Kerry McLaughlin <[email protected]> |
[LV] Use lookThroughAnd with logical reductions
If a reduction Phi has a single user which `AND`s the Phi with a type mask, `lookThroughAnd` will return the user of the Phi and the narrower type rep
[LV] Use lookThroughAnd with logical reductions
If a reduction Phi has a single user which `AND`s the Phi with a type mask, `lookThroughAnd` will return the user of the Phi and the narrower type represented by the mask. Currently this is only used for arithmetic reductions, whereas loops containing logical reductions will create a reduction intrinsic using the widened type, for example:
for.body: %phi = phi i32 [ %and, %for.body ], [ 255, %entry ] %mask = and i32 %phi, 255 %gep = getelementptr inbounds i8, i8* %ptr, i32 %iv %load = load i8, i8* %gep %ext = zext i8 %load to i32 %and = and i32 %mask, %ext ...
^ this will generate an and reduction intrinsic such as the following: call i32 @llvm.vector.reduce.and.v8i32(<8 x i32>...)
The same example for an add instruction would create an intrinsic of type i8: call i8 @llvm.vector.reduce.add.v8i8(<8 x i8>...)
This patch changes AddReductionVar to call lookThroughAnd for other integer reductions, allowing loops similar to the example above with reductions such as and, or & xor to vectorize.
Reviewed By: david-arm, dmgreen
Differential Revision: https://reviews.llvm.org/D105632
show more ...
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
23c2f2e6 |
| 07-Jun-2021 |
Florian Hahn <[email protected]> |
[LV] Mark increment of main vector loop induction variable as NUW.
This patch marks the induction increment of the main induction variable of the vector loop as NUW when not folding the tail.
If th
[LV] Mark increment of main vector loop induction variable as NUW.
This patch marks the induction increment of the main induction variable of the vector loop as NUW when not folding the tail.
If the tail is not folded, we know that End - Start >= Step (either statically or through the minimum iteration checks). We also know that both Start % Step == 0 and End % Step == 0. We exit the vector loop if %IV + %Step == %End. Hence we must exit the loop before %IV + %Step unsigned overflows and we can mark the induction increment as NUW.
This should make SCEV return more precise bounds for the created vector loops, used by later optimizations, like late unrolling.
At the moment quite a few tests still need to be updated, but before doing so I'd like to get initial feedback to make sure I am not missing anything.
Note that this could probably be further improved by using information from the original IV.
Attempt of modeling of the assumption in Alive2: https://alive2.llvm.org/ce/z/H_DL_g
Part of a set of fixes required for PR50412.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D103255
show more ...
|
|
Revision tags: llvmorg-12.0.1-rc1 |
|
| #
8a156d1c |
| 02-May-2021 |
Juneyoung Lee <[email protected]> |
[InstCombine] Fully disable select to and/or i1 folding
This is a patch that disables the poison-unsafe select -> and/or i1 folding.
It has been blocking D72396 and also has been the source of a fe
[InstCombine] Fully disable select to and/or i1 folding
This is a patch that disables the poison-unsafe select -> and/or i1 folding.
It has been blocking D72396 and also has been the source of a few miscompilations described in llvm.org/pr49688 . D99674 conditionally blocked this folding and successfully fixed the latter one. The former one was still blocked, and this patch addresses it.
Note that a few test functions that has `_logical` suffix are now deoptimized. These are created by @nikic to check the impact of disabling this optimization by copying existing original functions and replacing and/or with select.
I can see that most of these are poison-unsafe; they can be revived by introducing freeze instruction. I left comments at fcmp + select optimizations (or-fcmp.ll, and-fcmp.ll) because I think they are good targets for freeze fix.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D101191
show more ...
|
| #
e639bcce |
| 02-May-2021 |
Juneyoung Lee <[email protected]> |
run update_test_checks.py for the tests in D101191 (NFC)
This is an NFC that reruns update_test_checks.py on the tests that are going to be updated in D101191.
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
| #
ed253ef7 |
| 09-Feb-2021 |
Juneyoung Lee <[email protected]> |
[LoopVectorize] Fix VPRecipeBuilder::createEdgeMask to correctly generate the mask
This patch fixes pr48832 by correctly generating the mask when a poison value is involved.
Consider this CFG (whic
[LoopVectorize] Fix VPRecipeBuilder::createEdgeMask to correctly generate the mask
This patch fixes pr48832 by correctly generating the mask when a poison value is involved.
Consider this CFG (which is a part of the input):
``` for.body: ; preds = %for.cond br i1 true, label %cond.false, label %land.rhs
land.rhs: ; preds = %for.body br i1 poison, label %cond.end, label %cond.false
cond.false: ; preds = %for.body, %land.rhs br label %cond.end
cond.end: ; preds = %land.rhs, %cond.false %cond = phi i32 [ 0, %cond.false ], [ 1, %land.rhs ]
```
The path for.body -> land.rhs -> cond.end should be taken when 'select i1 false, i1 poison, i1 false' holds (which means it's never taken); but VPRecipeBuilder::createEdgeMask was emitting 'and i1 false, poison' instead. The former one successfully blocks poison propagation whereas the latter one doesn't, making the condition poison and thus causing the miscompilation.
SimplifyCFG has a similar bug (which didn't expose a real-world bug yet), and a patch for this is also ongoing (see https://reviews.llvm.org/D95026).
Reviewed By: bjope
Differential Revision: https://reviews.llvm.org/D95217
show more ...
|
| #
79b1b4a5 |
| 12-Feb-2021 |
Sanjay Patel <[email protected]> |
[Vectorizers][TTI] remove option to bypass creation of vector reduction intrinsics
The vector reduction intrinsics started life as experimental ops, so backend support was lacking. As part of promot
[Vectorizers][TTI] remove option to bypass creation of vector reduction intrinsics
The vector reduction intrinsics started life as experimental ops, so backend support was lacking. As part of promoting them to 1st-class intrinsics, however, codegen support was added/improved: D58015 D90247
So I think it is safe to now remove this complication from IR.
Note that we still have an IR-level codegen expansion pass for these as discussed in D95690. Removing that is another step in simplifying the logic. Also note that x86 was already unconditionally forming reductions in IR, so there should be no difference for x86.
I spot checked a couple of the tests here by running them through opt+llc and did not see any asm diffs.
If we do find functional differences for other targets, it should be possible to (at least temporarily) restore the shuffle IR with the ExpandReductions IR pass.
Differential Revision: https://reviews.llvm.org/D96552
show more ...
|
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1 |
|
| #
9d70dbdc |
| 27-Dec-2020 |
Juneyoung Lee <[email protected]> |
[InstCombine] use poison as placeholder for undemanded elems
Currently undef is used as a don’t-care vector when constructing a vector using a series of insertelement. However, this is problematic b
[InstCombine] use poison as placeholder for undemanded elems
Currently undef is used as a don’t-care vector when constructing a vector using a series of insertelement. However, this is problematic because undef isn’t undefined enough. Especially, a sequence of insertelement can be optimized to shufflevector, but using undef as its placeholder makes shufflevector a poison-blocking instruction because undef cannot be optimized to poison. This makes a few straightforward optimizations incorrect, such as:
``` ; https://bugs.llvm.org/show_bug.cgi?id=44185
define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) { %xv = insertelement <4 x float> %q, float %x, i32 2 %r = shufflevector <4 x float> %y, <4 x float> %xv, <4 x i32> { 0, 6, 2, undef } ret <4 x float> %r ; %r[3] is undef } => define <4 x float> @insert_not_undef_shuffle_translate_commute(float %x, <4 x float> %y, <4 x float> %q) { %r = insertelement <4 x float> %y, float %x, i32 1 ret <4 x float> %r ; %r[3] = %y[3], incorrect if %y[3] = poison }
Transformation doesn't verify! ERROR: Target is more poisonous than source ```
I’d like to suggest 1. Using poison as insertelement’s placeholder value (IRBuilder::CreateVectorSplat should be patched too) 2. Updating shufflevector’s semantics to return poison element if mask is undef
Note that poison is currently lowered into UNDEF in SelDag, so codegen part is okay. m_Undef() matches PoisonValue as well, so existing optimizations will still fire.
The only concern is hidden miscompilations that will go incorrect when poison constant is given. A conservative way is copying all tests having `insertelement undef` & replacing it with `insertelement poison` & run Alive2 on it, but it will create many tests and people won’t like it. :(
Instead, I’ll simply locally maintain the tests and run Alive2. If there is any bug found, I’ll report it.
Relevant links: https://bugs.llvm.org/show_bug.cgi?id=43958 , http://lists.llvm.org/pipermail/llvm-dev/2019-November/137242.html
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D93586
show more ...
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
| #
20b386aa |
| 29-Oct-2020 |
Nikita Popov <[email protected]> |
[LoopUtils] Fix neutral value for vector.reduce.fadd
Use -0.0 instead of 0.0 as the start value. The previous use of 0.0 was fine for all existing uses of this function though, as it is always gener
[LoopUtils] Fix neutral value for vector.reduce.fadd
Use -0.0 instead of 0.0 as the start value. The previous use of 0.0 was fine for all existing uses of this function though, as it is always generated with fast flags right now, and thus nsz.
show more ...
|
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6 |
|
| #
322d0afd |
| 03-Oct-2020 |
Amara Emerson <[email protected]> |
[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.
This change renames the intrinsics to not have "experimental" in the name.
The autoupgrader will handle lega
[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.
This change renames the intrinsics to not have "experimental" in the name.
The autoupgrader will handle legacy intrinsics.
Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html
Differential Revision: https://reviews.llvm.org/D88787
show more ...
|
|
Revision tags: llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2 |
|
| #
745bf6cf |
| 06-Aug-2020 |
David Green <[email protected]> |
[LoopVectorizer] Inloop vector reductions
Arm MVE has multiple instructions such as VMLAVA.s8, which (in this case) can take two 128bit vectors, sign extend the inputs to i32, multiplying them toget
[LoopVectorizer] Inloop vector reductions
Arm MVE has multiple instructions such as VMLAVA.s8, which (in this case) can take two 128bit vectors, sign extend the inputs to i32, multiplying them together and sum the result into a 32bit general purpose register. So taking 16 i8's as inputs, they can multiply and accumulate the result into a single i32 without any rounding/truncating along the way. There are also reduction instructions for plain integer add and min/max, and operations that sum into a pair of 32bit registers together treated as a 64bit integer (even though MVE does not have a plain 64bit addition instruction). So giving the vectorizer the ability to use these instructions both enables us to vectorize at higher bitwidths, and to vectorize things we previously could not.
In order to do that we need a way to represent that the reduction operation, specified with a llvm.experimental.vector.reduce when vectorizing for Arm, occurs inside the loop not after it like most reductions. This patch attempts to do that, teaching the vectorizer about in-loop reductions. It does this through a vplan recipe representing the reductions that the original chain of reduction operations is replaced by. Cost modelling is currently just done through a prefersInloopReduction TTI hook (which follows in a later patch).
Differential Revision: https://reviews.llvm.org/D75069
show more ...
|
| #
3c39db0c |
| 05-Aug-2020 |
Jordan Rupprecht <[email protected]> |
Revert "[LoopVectorizer] Inloop vector reductions"
This reverts commit e9761688e41cb979a1fa6a79eb18145a75104933. It breaks the build:
``` ~/src/llvm-project/llvm/lib/Analysis/IVDescriptors.cpp:868:
Revert "[LoopVectorizer] Inloop vector reductions"
This reverts commit e9761688e41cb979a1fa6a79eb18145a75104933. It breaks the build:
``` ~/src/llvm-project/llvm/lib/Analysis/IVDescriptors.cpp:868:10: error: no viable conversion from returned value of type 'SmallVector<[...], 8>' to function return type 'SmallVector<[...], 4>' return ReductionOperations; ```
show more ...
|
| #
e9761688 |
| 05-Aug-2020 |
David Green <[email protected]> |
[LoopVectorizer] Inloop vector reductions
Arm MVE has multiple instructions such as VMLAVA.s8, which (in this case) can take two 128bit vectors, sign extend the inputs to i32, multiplying them toget
[LoopVectorizer] Inloop vector reductions
Arm MVE has multiple instructions such as VMLAVA.s8, which (in this case) can take two 128bit vectors, sign extend the inputs to i32, multiplying them together and sum the result into a 32bit general purpose register. So taking 16 i8's as inputs, they can multiply and accumulate the result into a single i32 without any rounding/truncating along the way. There are also reduction instructions for plain integer add and min/max, and operations that sum into a pair of 32bit registers together treated as a 64bit integer (even though MVE does not have a plain 64bit addition instruction). So giving the vectorizer the ability to use these instructions both enables us to vectorize at higher bitwidths, and to vectorize things we previously could not.
In order to do that we need a way to represent that the reduction operation, specified with a llvm.experimental.vector.reduce when vectorizing for Arm, occurs inside the loop not after it like most reductions. This patch attempts to do that, teaching the vectorizer about in-loop reductions. It does this through a vplan recipe representing the reductions that the original chain of reduction operations is replaced by. Cost modelling is currently just done through a prefersInloopReduction TTI hook (which follows in a later patch).
Differential Revision: https://reviews.llvm.org/D75069
show more ...
|
|
Revision tags: llvmorg-11.0.0-rc1 |
|
| #
2f4c3e80 |
| 17-Jul-2020 |
David Green <[email protected]> |
[LV] Add additional InLoop redution tests. NFC
|
|
Revision tags: llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3 |
|
| #
ec7e4a9a |
| 03-Mar-2020 |
David Green <[email protected]> |
[LoopVectorizer] Add reduction tests for inloop reductions. NFC
Also adds a force-reduction-intrinsics option for testing, for forcing the generation of reduction intrinsics even when the backend is
[LoopVectorizer] Add reduction tests for inloop reductions. NFC
Also adds a force-reduction-intrinsics option for testing, for forcing the generation of reduction intrinsics even when the backend is not requesting them.
show more ...
|