|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
9df0b254 |
| 23-Jul-2022 |
Nuno Lopes <[email protected]> |
[NFC] Switch a few uses of undef to poison as placeholders for unreachable code
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5 |
|
| #
997ecb00 |
| 01-Jun-2022 |
David Sherwood <[email protected]> |
[LoopVectorize] Add FastMathFlags to the select used for reductions with tail-folding
Based on reviewer comments on https://reviews.llvm.org/D126692 I've added FastMathFlags to the select instructio
[LoopVectorize] Add FastMathFlags to the select used for reductions with tail-folding
Based on reviewer comments on https://reviews.llvm.org/D126692 I've added FastMathFlags to the select instruction used when tail-folding with reductions. These flags can then be used by InstCombine to decide upon the most optimal floating point identity value for fadd/fsub. Doing so unlocks further optimisations, such as folding selects into masked loads.
Differential Revision: https://reviews.llvm.org/D126778
show more ...
|
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
a266af72 |
| 14-Feb-2022 |
Nikita Popov <[email protected]> |
[InstCombine] Canonicalize SPF to min/max intrinsics
Now that integer min/max intrinsics have good support in both InstCombine and other passes, start canonicalizing SPF min/max to intrinsic min/max
[InstCombine] Canonicalize SPF to min/max intrinsics
Now that integer min/max intrinsics have good support in both InstCombine and other passes, start canonicalizing SPF min/max to intrinsic min/max.
Once this sticks, we can stop matching SPF min/max in various places, and can remove hacks we have for preventing infinite loops and breaking of SPF canonicalization.
Differential Revision: https://reviews.llvm.org/D98152
show more ...
|
| #
4517488e |
| 10-Feb-2022 |
Simon Pilgrim <[email protected]> |
[LoopVectorize] Regenerate reduction-predselect.ll test checks
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4 |
|
| #
80aa7e14 |
| 28-Jun-2021 |
Florian Hahn <[email protected]> |
[VPlan] Merge predicated-triangle regions, after sinking.
Sinking scalar operands into predicated-triangle regions may allow merging regions. This patch adds a VPlan-to-VPlan transform that tries to
[VPlan] Merge predicated-triangle regions, after sinking.
Sinking scalar operands into predicated-triangle regions may allow merging regions. This patch adds a VPlan-to-VPlan transform that tries to merge predicate-triangle regions after sinking.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D100260
show more ...
|
|
Revision tags: llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
| #
79b1b4a5 |
| 12-Feb-2021 |
Sanjay Patel <[email protected]> |
[Vectorizers][TTI] remove option to bypass creation of vector reduction intrinsics
The vector reduction intrinsics started life as experimental ops, so backend support was lacking. As part of promot
[Vectorizers][TTI] remove option to bypass creation of vector reduction intrinsics
The vector reduction intrinsics started life as experimental ops, so backend support was lacking. As part of promoting them to 1st-class intrinsics, however, codegen support was added/improved: D58015 D90247
So I think it is safe to now remove this complication from IR.
Note that we still have an IR-level codegen expansion pass for these as discussed in D95690. Removing that is another step in simplifying the logic. Also note that x86 was already unconditionally forming reductions in IR, so there should be no difference for x86.
I spot checked a couple of the tests here by running them through opt+llc and did not see any asm diffs.
If we do find functional differences for other targets, it should be possible to (at least temporarily) restore the shuffle IR with the ExpandReductions IR pass.
Differential Revision: https://reviews.llvm.org/D96552
show more ...
|
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
| #
20b386aa |
| 29-Oct-2020 |
Nikita Popov <[email protected]> |
[LoopUtils] Fix neutral value for vector.reduce.fadd
Use -0.0 instead of 0.0 as the start value. The previous use of 0.0 was fine for all existing uses of this function though, as it is always gener
[LoopUtils] Fix neutral value for vector.reduce.fadd
Use -0.0 instead of 0.0 as the start value. The previous use of 0.0 was fine for all existing uses of this function though, as it is always generated with fast flags right now, and thus nsz.
show more ...
|
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6 |
|
| #
322d0afd |
| 03-Oct-2020 |
Amara Emerson <[email protected]> |
[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.
This change renames the intrinsics to not have "experimental" in the name.
The autoupgrader will handle lega
[llvm][mlir] Promote the experimental reduction intrinsics to be first class intrinsics.
This change renames the intrinsics to not have "experimental" in the name.
The autoupgrader will handle legacy intrinsics.
Relevant ML thread: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140729.html
Differential Revision: https://reviews.llvm.org/D88787
show more ...
|
|
Revision tags: llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
| #
bda8fbe2 |
| 26-Aug-2020 |
Sjoerd Meijer <[email protected]> |
[LV] Fallback strategies if tail-folding fails
This implements 2 different vectorisation fallback strategies if tail-folding fails: 1) don't vectorise at all, or 2) vectorise using a scalar epilogue
[LV] Fallback strategies if tail-folding fails
This implements 2 different vectorisation fallback strategies if tail-folding fails: 1) don't vectorise at all, or 2) vectorise using a scalar epilogue. This can be controlled with option -prefer-predicate-over-epilogue, that has been changed to take a numeric value corresponding to the tail-folding preference and preferred fallback.
Patch by: Pierre van Houtryve, Sjoerd Meijer.
Differential Revision: https://reviews.llvm.org/D79783
show more ...
|
|
Revision tags: llvmorg-11.0.0-rc2 |
|
| #
816097e4 |
| 20-Aug-2020 |
David Green <[email protected]> |
[LV] Allow tail folded reduction selects to remain in the loop
The normal scheme for tail folding reductions is to use:
loop: p = phi(0, a) mask = ... x = masked_load(..., mask) a = add(x,
[LV] Allow tail folded reduction selects to remain in the loop
The normal scheme for tail folding reductions is to use:
loop: p = phi(0, a) mask = ... x = masked_load(..., mask) a = add(x, p) s = select(mask, a, p)
This means we need to keep the register p and a alive out of the loop, plus the mask. On a target with predicated operations we can instead generate the phi as p = phi(0, s). This ensures the select in the loop and we can fold select(m, add(a, b), c) to something like a vaddt c, a, b using the m predicate. This in turn allows us to tail predicate the entire loop.
Differential Revision: https://reviews.llvm.org/D84741
show more ...
|
| #
b8088ada |
| 18-Aug-2020 |
David Green <[email protected]> |
[LV] Predicated reduction tests. NFC
|