History log of /llvm-project-15.0.7/llvm/lib/Target/X86/X86PartialReduction.cpp (Results 1 – 17 of 17)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 18a1085e 08-Jul-2022 Haohai Wen <[email protected]>

[X86] Fix collectLeaves for adds used by phi that forms loop

When add has additional users, we should indentify whether add's
user is phi that forms loop rather than root's.

Reviewed By: LuoYuanke

[X86] Fix collectLeaves for adds used by phi that forms loop

When add has additional users, we should indentify whether add's
user is phi that forms loop rather than root's.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D129169

show more ...


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# ca33d74c 30-Mar-2022 Wei Xiao <[email protected]>

[X86] Improve x86-partial-reduction to support abs intrinsic

Current implementation only recognizes absolute operation implemented by
select instruction. This patch adds support for abs intrinsic.

[X86] Improve x86-partial-reduction to support abs intrinsic

Current implementation only recognizes absolute operation implemented by
select instruction. This patch adds support for abs intrinsic.

Differential Revision: https://reviews.llvm.org/D122777

show more ...


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 5dea7a86 29-Dec-2021 Luo, Yuanke <[email protected]>

Combine to vpdpbusd when operand is constant and small enough.

Differential Revision: https://reviews.llvm.org/D116363


# 21babe4d 20-Dec-2021 Luo, Yuanke <[email protected]>

[X86] Combine reduce(add (mul x, y)) to VNNI instruction.

For below C code, we can use VNNI to combine the mul and add operation.
int usdot_prod_qi(unsigned char *restrict a, char *restrict b, int c

[X86] Combine reduce(add (mul x, y)) to VNNI instruction.

For below C code, we can use VNNI to combine the mul and add operation.
int usdot_prod_qi(unsigned char *restrict a, char *restrict b, int c,
int n) {
int i;
for (i = 0; i < 32; i++) {
c += ((int)a[i] * (int)b[i]);
}
return c;
}
We didn't support the combine acoss basic block in this patch.

Differential Revision: https://reviews.llvm.org/D116039

show more ...


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init
# 05444417 24-Jan-2021 Kazu Hirata <[email protected]>

[Target] Use llvm::append_range (NFC)


# e4847a7f 23-Jan-2021 Kazu Hirata <[email protected]>

Revert "[Target] Use llvm::append_range (NFC)"

This reverts commit cc7a23828657f35f706343982cf96bb6583d4d73.

The X86WinEHState.cpp hunk seems to break certain builds.


# cc7a2382 23-Jan-2021 Kazu Hirata <[email protected]>

[Target] Use llvm::append_range (NFC)


Revision tags: llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3
# 0da1e7eb 29-Jun-2020 Christopher Tetreault <[email protected]>

[SVE] Remove calls to VectorType::getNumElements from X86

Reviewers: efriedma, RKSimon, craig.topper, fpetrogalli, c-rhodes

Reviewed By: RKSimon

Subscribers: tschuett, hiraditya, rkruppe, psnobl,

[SVE] Remove calls to VectorType::getNumElements from X86

Reviewers: efriedma, RKSimon, craig.topper, fpetrogalli, c-rhodes

Reviewed By: RKSimon

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82508

show more ...


Revision tags: llvmorg-10.0.1-rc2
# 8abe8300 31-May-2020 Craig Topper <[email protected]>

[X86] Rewrite how X86PartialReduction finds candidates to consider optimizing.

Previously we walked the users of any vector binop looking for
more binops with the same opcode or phis that eventually

[X86] Rewrite how X86PartialReduction finds candidates to consider optimizing.

Previously we walked the users of any vector binop looking for
more binops with the same opcode or phis that eventually ended up
in a reduction. While this is simple it also means visiting the
same nodes many times since we'll do a forward walk for each
BinaryOperator in the chain. It was also far more general than what
we have tests for or expect to see.

This patch replaces the algorithm with a new method that starts at
extract elements looking for a horizontal reduction. Once we find
a reduction we walk through backwards through phis and adds to
collect leaves that we can consider for rewriting.

We only consider single use adds and phis. Except for a special
case if the Add is used by a phi that forms a loop back to the
Add. Including other single use Adds to support unrolled loops.

Ultimately, I want to narrow the Adds, Phis, and final reduction
based on the partial reduction we're doing. I still haven't
figured out exactly what that looks like yet. But restricting
the types of graphs we expect to handle seemed like a good first
step. As does having all the leaves and the reduction at once.

Differential Revision: https://reviews.llvm.org/D79971

show more ...


# 5a99ec10 29-May-2020 Christopher Tetreault <[email protected]>

[SVE] Eliminate calls to default-false VectorType::get() from X86

Reviewers: efriedma, sdesmalen, c-rhodes, craig.topper

Reviewed By: craig.topper

Subscribers: tschuett, hiraditya, rkruppe, psnobl

[SVE] Eliminate calls to default-false VectorType::get() from X86

Reviewers: efriedma, sdesmalen, c-rhodes, craig.topper

Reviewed By: craig.topper

Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D80331

show more ...


Revision tags: llvmorg-10.0.1-rc1
# 2b0b9b11 14-May-2020 Craig Topper <[email protected]>

[X86] Fix a regression caused by moving combineLoopMAddPattern to IR

When I moved combineLoopMAddPattern to an IR pass. I didn't match the behavior of canReduceVMulWidth that was used in the Selecti

[X86] Fix a regression caused by moving combineLoopMAddPattern to IR

When I moved combineLoopMAddPattern to an IR pass. I didn't match the behavior of canReduceVMulWidth that was used in the SelectionDAG version. canReduceVMulWidth just calls computeSignBits and assumes a truncate is always profitable. The version I put in IR just looks for constants and zext/sext. Though I neglected to check the number of bits in input of the zext/sext.

This patch adds a check for the number of input bits to the sext/zext. And it adds a special case for add/sub with zext/sext inputs which can be handled by combineTruncatedArithmetic. Match the original SelectionDAG behavior appears to be a regression in some cases if the truncate isn't removed and becomes pack and permq. So enabling only this specific case is the conservative approach.

Differential Revision: https://reviews.llvm.org/D79909

show more ...


# fa8c2ae7 14-May-2020 Craig Topper <[email protected]>

[X86] Return true from trySADReplacement in the partial reduction pass when a change is made.

Otherwise we don't signal to the pass manager that we changed IR.


# dd24fb38 17-Apr-2020 Christopher Tetreault <[email protected]>

Clean up usages of asserting vector getters in Type

Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicate

Clean up usages of asserting vector getters in Type

Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: craig.topper, sdesmalen, efriedma, RKSimon

Reviewed By: efriedma

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77264

show more ...


# 6f64daca 15-Apr-2020 Benjamin Kramer <[email protected]>

Upgrade calls to CreateShuffleVector to use the preferred form of passing an array of ints

No functionality change intended.


# 1ee6ec2b 31-Mar-2020 Eli Friedman <[email protected]>

Remove "mask" operand from shufflevector.

Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and p

Remove "mask" operand from shufflevector.

Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and paves the way for further changes to add new
shuffles for scalable vectors.

This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.

I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors. Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it. Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.

Differential Revision: https://reviews.llvm.org/D72467

show more ...


# d2dd0fac 28-Mar-2020 Michael Liao <[email protected]>

Fix `-Wsign-compare` warning. NFC.


# 9f7d4150 26-Mar-2020 Craig Topper <[email protected]>

[X86] Move combineLoopMAddPattern and combineLoopSADPattern to an IR pass before SelecitonDAG.

These transforms rely on a vector reduction flag on the SDNode
set by SelectionDAGBuilder. This flag ex

[X86] Move combineLoopMAddPattern and combineLoopSADPattern to an IR pass before SelecitonDAG.

These transforms rely on a vector reduction flag on the SDNode
set by SelectionDAGBuilder. This flag exists because SelectionDAG
can't see across basic blocks so SelectionDAGBuilder is looking
across and saving the info. X86 is the only target that uses this
flag currently. By removing the X86 code we can remove the flag
and the SelectionDAGBuilder code.

This pass adds a dedicated IR pass for X86 that looks across the
blocks and transforms the IR into a form that the X86 SelectionDAG
can finish.

An advantage of this new approach is that we can enhance it to
shrink the phi nodes and final reduction tree based on the zeroes
that we need to concatenate to bring the partially reduced
reduction back up to the original width.

Differential Revision: https://reviews.llvm.org/D76649

show more ...