History log of /llvm-project-15.0.7/llvm/lib/Target/X86/X86TargetTransformInfo.cpp (Results 1 – 25 of 688)
Revision Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# f1879481 16-Jul-2022 Phoebe Wang <[email protected]>

[X86][FP16] Enable vector support for FP16 emulation

This is a follow-up to D107082 that enables vector support according to the psABI.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D127982


Revision tags: llvmorg-14.0.6
# 7a9ad257 22-Jun-2022 Vasileios Porpodas <[email protected]>

Recommit "[SLP][X86] Improve reordering to consider alternate instruction bundles"

This reverts commit 6d6268dcbf0f48e43f6f9fe46b3a28c29ba63c7d.

Review: https://reviews.llvm.org/D125712


# 6d6268dc 22-Jun-2022 Vasileios Porpodas <[email protected]>

Revert "[SLP][X86] Improve reordering to consider alternate instruction bundles"

This reverts commit 6f88acf410b48f3e6c1526df2dc32ed86f249685.


Revision tags: llvmorg-14.0.5, llvmorg-14.0.4
# 6f88acf4 13-May-2022 Vasileios Porpodas <[email protected]>

[SLP][X86] Improve reordering to consider alternate instruction bundles

During the reordering transformation we should try to avoid reordering bundles
like fadd,fsub because this may block them being matched into a single vector
instruction in x86.
We do this by checking if a TreeEntry is such a pattern and adding it to the
list of TreeEntries with orders that need to be considered.

Differential Revision: https://reviews.llvm.org/D125712


# cf2072bc 15-Jun-2022 Simon Pilgrim <[email protected]>

[X86] X86TargetTransformInfo.cpp - use InstructionCost type to accumulate instructions costs


# 6a845792 25-May-2022 eopXD <[email protected]>

[LSR][TTI][PowerPC][SystemZ][X86] Add const-ness to TTI::isLSRCostLess. NFC

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D126350


# 6c80267d 24-May-2022 Simon Pilgrim <[email protected]>

[CostModel][X86] getScalarizationOverhead - improve extraction costs for > 128-bit vectors

We were using the default getScalarizationOverhead expansion for extraction costs, which adds up all the individual element extraction costs.

This is fine for 128-bit vectors, but for 256/512-bit vectors each element extraction also has to account for extracting the upper 128-bit subvector before it can handle the element. For scalarization costs we only need to extract each demanded subvector once.

Differential Revision: https://reviews.llvm.org/D125527
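
The difference between the two accounting schemes described in this commit can be illustrated with a toy model. This is a minimal sketch, not the code from X86TargetTransformInfo.cpp: the names `ToyCosts`, `perElementCost` and `perSubvectorCost` are hypothetical, the unit costs are made up, and it assumes that elements in the low 128 bits need no subvector extraction.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical unit costs, chosen only for illustration.
struct ToyCosts {
  unsigned ElementExtract = 1;   // extract one element from a 128-bit vector
  unsigned SubvectorExtract = 1; // extract one upper 128-bit subvector
};

// Per-element accounting: every demanded element above the low 128 bits
// pays for its own subvector extraction.
unsigned perElementCost(const std::vector<unsigned> &Demanded,
                        unsigned EltsPer128, const ToyCosts &C) {
  assert(EltsPer128 > 0 && "need at least one element per 128-bit lane");
  unsigned Cost = 0;
  for (unsigned Idx : Demanded) {
    Cost += C.ElementExtract;
    if (Idx / EltsPer128 != 0)
      Cost += C.SubvectorExtract;
  }
  return Cost;
}

// Per-subvector accounting: each demanded upper 128-bit subvector is
// extracted once, however many of its elements are demanded.
unsigned perSubvectorCost(const std::vector<unsigned> &Demanded,
                          unsigned EltsPer128, const ToyCosts &C) {
  assert(EltsPer128 > 0 && "need at least one element per 128-bit lane");
  std::set<unsigned> UpperLanes;
  unsigned Cost = 0;
  for (unsigned Idx : Demanded) {
    Cost += C.ElementExtract;
    unsigned Lane = Idx / EltsPer128;
    if (Lane != 0)
      UpperLanes.insert(Lane);
  }
  return Cost + C.SubvectorExtract * static_cast<unsigned>(UpperLanes.size());
}
```

For a 256-bit vector of eight 32-bit elements (four per 128-bit lane) with every element demanded, the first function charges the upper subvector four times while the second charges it once.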


Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# 6bec3e93 06-Oct-2021 Jay Foad <[email protected]>

[APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf

Most clients only used these methods because they wanted to be able to
extend or truncate to the same bit width (which is a no-op). Now that
the standard zext, sext and trunc allow this, there is no reason to use
the OrSelf versions.

The OrSelf versions additionally have the strange behaviour of allowing
extending to a *smaller* width, or truncating to a *larger* width, which
are also treated as no-ops. A small amount of client code relied on this
(ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and
needed rewriting.

Differential Revision: https://reviews.llvm.org/D125557


# 3d107ce2 06-May-2022 Simon Pilgrim <[email protected]>

[CostModel][X86] Relax fcmp costs on SSE41 targets or later

Only pre-SSE41 targets double-pump the fp comparison ops


# cbfa8573 06-May-2022 Simon Pilgrim <[email protected]>

[CostModel][X86] Adjust 128-bit select costs to account for slow BLENDV op

Based off the script from D103695 - Jaguar, Bulldozer, Silvermont (et al) and Haswell all have slow BLENDV ops, so adjust the worst-case cost values


# d21bf514 06-May-2022 Simon Pilgrim <[email protected]>

[CostModel][X86] Adjust pre-SSE41 fp scalar select costs to account for vector ops

Based off the script from D103695, we now mainly use BLENDV or OR(AND,ANDN) to select scalar float/double ops


# f0e8c1d6 06-May-2022 Simon Pilgrim <[email protected]>

[CostModel][X86] Adjust 256-bit select costs to account for slow BLENDV op

Based off the script from D103695, on AVX1, Jaguar/Bulldozer both have low throughput for ymm select patterns (BLENDV + OR(AND,ANDN)), and even on AVX2 Haswell still struggles with BLENDV ops


# 86bb7df6 02-May-2022 Simon Pilgrim <[email protected]>

[CostModel][X86] getScalarizationOverhead - handle vXi1 extracts with MOVMSK (pre-AVX512)

We can quickly extract multiple elements of a bool vector using MOVMSK ops - since we don't know what generated the vXi1, I've been optimistic and assumed we can use PMOVMSKB to extract the maximum number of bools with a single op.

The MOVMSK pattern isn't great for extract+insert round trips as vXi1 type legalization can interfere with this a lot - so this relies on us remaining good at using getScalarizationOverhead properly (and tagging both Insert and Extract modes) for those round trip cases.

The AVX512 KMOV codegen for bool extraction is a bit of a mess so for now I've not included that - the per-element cost is a lot more accurate for current codegen.
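
The idea of pulling many bools out of a vector with one mask-move can be sketched portably. The code below is an illustrative stand-in only: `toyMoveMask` and `toyExtractBool` are hypothetical names that emulate in scalar C++ what a PMOVMSKB-style op does for 16 byte lanes, and they are not part of LLVM or of this commit.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Scalar emulation of a PMOVMSKB-style op: gather the top bit of each of
// 16 byte lanes into a 16-bit integer mask (bit I comes from byte I).
std::uint16_t toyMoveMask(const std::array<std::uint8_t, 16> &Bytes) {
  std::uint16_t Mask = 0;
  for (unsigned I = 0; I < 16; ++I) {
    unsigned TopBit = (Bytes[I] >> 7) & 1u;
    Mask = static_cast<std::uint16_t>(Mask | (TopBit << I));
  }
  return Mask;
}

// Once the mask sits in a scalar register, each bool is a shift and a test.
bool toyExtractBool(std::uint16_t Mask, unsigned Index) {
  assert(Index < 16 && "lane index out of range");
  return ((Mask >> Index) & 1u) != 0;
}
```

In this sketch a single mask-move is paid once and each subsequent bool costs only scalar bit operations, which is the optimistic assumption the commit message describes.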


# d5198cf9 01-May-2022 Simon Pilgrim <[email protected]>

[CostModel][X86] Check for 'null op' truncations

If the legalized src/dst types are the same, assume the "truncation" is free.

This fixes some edge cases such as mul lo/hi ops and bool vectors which will get legalized back to legal vector widths


# c2964746 01-May-2022 Simon Pilgrim <[email protected]>

[CostModel][X86] Reduce cost of vector selects on SSE2/AVX1 targets

Based off the script from D103695, we were exaggerating the cost of the OR(AND(X,M),AND(Y,~M)) expansion using instruction count instead of effective throughput
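
For readers unfamiliar with the OR(AND(X,M),AND(Y,~M)) expansion, here is a minimal scalar sketch of it for one 32-bit lane. The function name `toySelect` is hypothetical, and the sketch assumes the mask lane is all-ones or all-zeros, as a vector compare would produce.

```cpp
#include <cassert>
#include <cstdint>

// Bitwise select for one 32-bit lane: take X where the mask is set and Y
// where it is clear. Assumes M is all-ones or all-zeros for the lane.
std::uint32_t toySelect(std::uint32_t M, std::uint32_t X, std::uint32_t Y) {
  assert((M == 0u || M == 0xFFFFFFFFu) && "mask lane must be all-ones or all-zeros");
  return (X & M) | (Y & ~M);
}
```

The expansion is three bitwise operations (AND, ANDN, OR). Counting those instructions one for one gives a higher figure than a throughput-based estimate, which is the exaggeration the commit message refers to.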


# 371412e0 29-Apr-2022 Alexey Bataev <[email protected]>

[COST]Fix crash for non-power-2 vector shuffle mask.

Need to normalize the mask to avoid possible crashes during attempts
to estimate cost of the very long shuffles with non-power-2 number of
elements in masks.


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# 75e1cf4a 14-Apr-2021 Alexey Bataev <[email protected]>

[COST]Improve cost model for shuffles in SLP.

Introduced masks where they are not added and improved target dependent
cost models to avoid returning of the incorrect cost results after
adding masks.

Differential Revision: https://reviews.llvm.org/D100486


# 9861ca0c 28-Apr-2022 Alexey Bataev <[email protected]>

Revert "[COST]Improve cost model for shuffles in SLP."

This reverts commit 29a470e3804ca216d4e76c88a38086eb61c200f9 to fix
a crash reported in https://reviews.llvm.org/D100486#3479989.


# 29a470e3 14-Apr-2021 Alexey Bataev <[email protected]>

[COST]Improve cost model for shuffles in SLP.

Introduced masks where they are not added and improved target dependent
cost models to avoid returning of the incorrect cost results after
adding masks.

Differential Revision: https://reviews.llvm.org/D100486


# fa8a9fea 26-Apr-2022 Vasileios Porpodas <[email protected]>

Recommit "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`"

This reverts commit 6a9bbd9f20dcd700e28738788bb63a160c6c088c.

Code review: https://reviews.llvm.org/D124202


# 6a9bbd9f 26-Apr-2022 Vasileios Porpodas <[email protected]>

Revert "[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`"

This reverts commit 55ce296d6f217fd0defed2592ff7b74b79b2c1f0.


# 55ce296d 21-Apr-2022 Vasileios Porpodas <[email protected]>

[SLP][TTI] Refactoring of `getShuffleCost` `Args` to work like `getArithmeticInstrCost`

Before this patch `Args` was used to pass a broadcast's arguments by SLP.
This patch changes this. `Args` is now used for passing the operands of
the shuffle.

Differential Revision: https://reviews.llvm.org/D124202


# 889588ee 20-Apr-2022 Vasileios Porpodas <[email protected]>

[SLP] Refactoring isLegalBroadcastLoad() to use `ElementCount`.

Replacing `unsigned` with `ElementCount` in the argument of `isLegalBroadcastLoad()`.
This helps reduce the diff of a future SLP patch for AArch64.


# d663166a 25-Mar-2022 Simon Pilgrim <[email protected]>

[CostModel][X86] Reduce cost of v2i64 icmp base cost on SSE2 targets

Based off the script from D103695, we were exaggerating the cost of the v2i64 comparison expansion using instruction count instead of effective throughput


# 39aa202a 24-Mar-2022 Vasileios Porpodas <[email protected]>

Recommit "[SLP] Fix lookahead operand reordering for splat loads." attempt 3, fixed assertion crash.

Original review: https://reviews.llvm.org/D121354

This reverts commit e6ead19b774718113007ecb1a4449d7af0cbcfeb.

