History log of /llvm-project-15.0.7/llvm/lib/Target/PowerPC/PPCISelLowering.cpp (Results 1 – 25 of 1722)
Revision    Date    Author    Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 96515df8 06-Jul-2022 Masoud Ataei <[email protected]>

[PowerPC] Fix the check for scalar MASS conversion

Proposing to move the check for scalar MASS conversion from constructor
of PPCTargetLowering to the lowerLibCallBase function which decides
about the lowering.

The target machine option Options.PPCGenScalarMASSEntries is set in
PPCTargetMachine.cpp, but an object of the class PPCTargetLowering is
created in one of the included header files, so the constructor runs
before PPCGenScalarMASSEntries is set to the correct value. Therefore we
cannot check this option in the constructor.
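
A minimal standalone sketch of the initialization-order pitfall described
above, assuming hypothetical stand-in types (not the actual LLVM classes):

  #include <iostream>

  // Hypothetical stand-ins for the target-machine options and the lowering class.
  struct TargetOptions { bool PPCGenScalarMASSEntries = false; };

  struct Lowering {
    const TargetOptions &Opts;
    bool CachedAtConstruction;
    explicit Lowering(const TargetOptions &O)
        : Opts(O), CachedAtConstruction(O.PPCGenScalarMASSEntries) {} // read too early
    // Reading the option at lowering time (as lowerLibCallBase now does) sees the final value.
    bool shouldConvertToMASS() const { return Opts.PPCGenScalarMASSEntries; }
  };

  int main() {
    TargetOptions Opts;
    Lowering L(Opts);                    // constructed before the option is set
    Opts.PPCGenScalarMASSEntries = true; // set afterwards, as in PPCTargetMachine.cpp
    std::cout << L.CachedAtConstruction << ' '    // 0: stale snapshot from the constructor
              << L.shouldConvertToMASS() << '\n'; // 1: correct when checked at lowering time
  }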

Differential: https://reviews.llvm.org/D128653
Reviewer: @bmahjour



# 88b6d227 28-Jun-2022 Ting Wang <[email protected]>

[PowerPC] Improve getNormalLoadInput to reach more splat load
opportunities

There are straightforward splat load opportunities blocked by
getNormalLoadInput(), since those cases involve consecutive bitcasts.
Improve this by looking through bitcasts.
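
A toy sketch of the look-through idea, assuming made-up node types rather
than SelectionDAG's real SDNode/SDValue:

  #include <cstring>
  #include <iostream>
  #include <memory>

  // Toy node standing in for a DAG node; bitcast wrappers hide the load underneath.
  struct Node {
    const char *Kind;               // e.g. "load", "bitcast"
    std::shared_ptr<Node> Operand;  // single operand, enough for this sketch
  };

  // Peel off any chain of consecutive bitcasts to reach the underlying node.
  std::shared_ptr<Node> lookThroughBitcasts(std::shared_ptr<Node> N) {
    while (N && std::strcmp(N->Kind, "bitcast") == 0)
      N = N->Operand;
    return N;
  }

  int main() {
    auto Load = std::make_shared<Node>(Node{"load", nullptr});
    auto Outer = std::make_shared<Node>(
        Node{"bitcast", std::make_shared<Node>(Node{"bitcast", Load})});
    std::cout << lookThroughBitcasts(Outer)->Kind << '\n'; // prints: load
  }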

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D128703



Revision tags: llvmorg-14.0.6
# e09f6ff3 20-Jun-2022 Nemanja Ivanovic <[email protected]>

[PowerPC] Disable automatic generation of STXVP

There are instances where using paired vector stores leads to significant
performance degradation due to issues with store forwarding. To avoid falling
into this trap with compiler-generated code, we will not emit these
instructions unless the user requests them explicitly (with a builtin or by
specifying the option).

Reviewed By: lei, amyk, saghir

Differential Revision: https://reviews.llvm.org/D127218



# 34033a84 15-Jun-2022 Amy Kwan <[email protected]>

[PowerPC] Skip combine for vector_shuffles when two scalar_to_vector nodes are different vector types.

Currently in `combineVectorShuffle()`, we update the shuffle mask if either
input vector comes from a scalar_to_vector, and we keep the respective input
vectors in their permuted form by producing PPCISD::SCALAR_TO_VECTOR_PERMUTED.
However, it is possible that we end up in a situation where both input vectors
to the vector_shuffle are scalar_to_vector, and are different vector types.
In situations like this, the shuffle mask is updated incorrectly as the current
code assumes both scalar_to_vector inputs are the same vector type.

This patch skips the combine for vector_shuffle if both input vectors are
scalar_to_vector and of different vector types. A follow-up patch will fix
this issue so that the shuffle mask is updated correctly.

Differential Revision: https://reviews.llvm.org/D127818



Revision tags: llvmorg-14.0.5
# 335e8bf1 08-Jun-2022 Quinn Pham <[email protected]>

[PowerPC] emit VSX instructions instead of VMX instructions for vector loads and stores

This patch changes the PowerPC backend to generate VSX load/store instructions
for all vector loads/stores on Power8 and earlier (LE) instead of VMX
load/store instructions. The reason for this change is that VMX instructions
require the vector to be 16-byte aligned, so a vector load/store will fail with
VMX instructions if the vector is misaligned. Also, `gcc` generates VSX
instructions in this situation which allow for unaligned access but require a
swap instruction after loading/before storing. This is not an issue for BE
because we already emit VSX instructions since no swap is required. And this is
not an issue on Power9 and up since we have access to `lxv[x]`/`stxv[x]` which
allow for unaligned access and do not require swaps.

This patch also delays the VSX load/store for LE combines until after
LegalizeOps to prioritize other load/store combines.

Reviewed By: #powerpc, stefanp

Differential Revision: https://reviews.llvm.org/D127309



# 263f1b2f 14-Jun-2022 Stefan Pintilie <[email protected]>

[PowerPC] Fix combine step for shufflevector.

The combine step for shufflevector will sometimes replace undef in the mask
with a defined value. This can cause an infinite loop in some cases as another
combine will then put the undef back in the mask.

This patch fixes the issue so that undefs are not replaced when doing a combine.

Reviewed By: ZarkoCA, amyk, quinnp, saghir

Differential Revision: https://reviews.llvm.org/D127439



# 07881861 03-Jun-2022 Guillaume Chatelet <[email protected]>

[Alignment][NFC] Remove usage of MemSDNode::getAlignment

I can't remove the function just yet as it is used in the generated .inc files.
I would also like to provide a way to compare alignment with TypeSize since it came up a few times.

Differential Revision: https://reviews.llvm.org/D126910



# ad73ce31 26-May-2022 Zongwei Lan <[email protected]>

[Target] use getSubtarget<> instead of static_cast<>(getSubtarget())

Differential Revision: https://reviews.llvm.org/D125391


Revision tags: llvmorg-14.0.4
# 115c1888 06-May-2022 David Green <[email protected]>

[DAG][PowerPC] Combine shuffle(bitcast(X), Mask) to bitcast(shuffle(X, Mask'))

If the mask is made up of elements that form a mask in the higher type,
we can convert shuffle(bitcast(X)) into a shuffle of the bitcast type,
simplifying the instruction sequence. A v4i32 2,3,0,1 shuffle, for example, can be treated as a
1,0 v2i64 shuffle. This helps clean up some of the AArch64 concat load
combines, along with helping simplify a number of other tests.

The PowerPC combine for v16i8 splat vector loads needed some fixes to
keep it working for v16i8 vectors. This improves the handling of v2i64
shuffles to match too, hopefully improving them in general.
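
A standalone sketch of the mask arithmetic described above, with a
hypothetical helper name (this is not the DAGCombiner code):

  #include <cstddef>
  #include <iostream>
  #include <optional>
  #include <vector>

  // Try to express a shuffle mask over narrow elements as a mask over elements
  // that are Scale times wider (e.g. v4i32 -> v2i64 with Scale = 2). Succeeds only
  // if every group of Scale consecutive indices starts at a multiple of Scale and
  // is consecutive.
  std::optional<std::vector<int>> widenMask(const std::vector<int> &Mask, int Scale) {
    if (Scale <= 0 || Mask.size() % Scale != 0)
      return std::nullopt;
    std::vector<int> Wide;
    for (std::size_t I = 0; I < Mask.size(); I += Scale) {
      if (Mask[I] % Scale != 0)
        return std::nullopt;
      for (int J = 1; J < Scale; ++J)
        if (Mask[I + J] != Mask[I] + J)
          return std::nullopt;
      Wide.push_back(Mask[I] / Scale);
    }
    return Wide;
  }

  int main() {
    // The example from the message: a v4i32 2,3,0,1 shuffle is a v2i64 1,0 shuffle.
    if (auto Wide = widenMask({2, 3, 0, 1}, 2)) {
      for (int Idx : *Wide)
        std::cout << Idx << ' '; // prints: 1 0
      std::cout << '\n';
    }
  }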

Differential Revision: https://reviews.llvm.org/D123801



Revision tags: llvmorg-14.0.3, llvmorg-14.0.2
# fb193db2 20-Apr-2022 Fangrui Song <[email protected]>

[PowerPC] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds


Revision tags: llvmorg-14.0.1
# 18679ac0 08-Apr-2022 Kai Luo <[email protected]>

[PowerPC] Adjust `MaxAtomicSizeInBitsSupported` on PPC64

AtomicExpandPass uses this variable to determine whether to emit libcalls. The
default value is 1024, and if we don't specify it for PPC64 explicitly,
AtomicExpandPass won't emit `__atomic_*` libcalls for those targets unable to
inline atomic ops, and the backend finally emits `__sync_*` libcalls. Thanks
@efriedma for pointing it out.
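
A simplified model of that decision, with illustrative names rather than
AtomicExpandPass's real interface:

  #include <cstdio>

  // Which lowering an atomic op of OpSizeInBits gets, per the description above.
  const char *atomicLoweringFor(unsigned OpSizeInBits,
                                unsigned MaxAtomicSizeInBitsSupported,
                                bool SubtargetCanInline) {
    if (OpSizeInBits > MaxAtomicSizeInBitsSupported)
      return "__atomic_* libcall from AtomicExpandPass";
    return SubtargetCanInline ? "inline atomic instructions"
                              : "__sync_* libcall from the backend";
  }

  int main() {
    // A 128-bit atomic op on a subtarget that cannot inline it:
    std::printf("default 1024 bits: %s\n", atomicLoweringFor(128, 1024, false));
    std::printf("capped at 64 bits: %s\n", atomicLoweringFor(128, 64, false));
  }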

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D122868



# 549e118e 08-Apr-2022 Kai Luo <[email protected]>

[PowerPC] Support 16-byte lock free atomics on pwr8 and up

Make the 16-byte atomic type 16-byte aligned on PPC64, consistent with GCC.
Also enable inlining 16-byte atomics on non-AIX targets on PPC64.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D122377



# b389354b 06-Apr-2022 Ting Wang <[email protected]>

[Clang][PowerPC] Add max/min intrinsics to Clang and PPC backend

Add support for builtin_[max|min], which have the following prototype:
A builtin_max (A1, A2, A3, ...)
All arguments must have the same type; they must all be float, double, or long double.
Internally use SelectCC to get the result.
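
A sketch of how a variadic max can be realized as a chain of pairwise
compare-and-select operations (the SelectCC idea mentioned above);
illustrative only, not the Clang builtin or the backend lowering:

  #include <cstdio>

  double selectGT(double A, double B) { return A > B ? A : B; } // one compare + select

  template <typename... Ts> double chainedMax(double First, Ts... Rest) {
    double Result = First;
    ((Result = selectGT(Result, Rest)), ...); // fold over the remaining arguments
    return Result;
  }

  int main() {
    std::printf("%g\n", chainedMax(1.5, 3.25, 2.0)); // prints: 3.25
  }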

Reviewed By: qiucf

Differential Revision: https://reviews.llvm.org/D122478



# 585c85ab 31-Mar-2022 Stefan Pintilie <[email protected]>

[PowerPC] Fix lowering of byval parameters for sizes greater than 8 bytes.

To store a byval parameter, the existing code would store as many 8-byte
elements as were required to cover the full size of the byval parameter.
For example, a parameter of size 16 would store two elements of 8 bytes.
A parameter of size 12 would also store two elements of 8 bytes.
This would sometimes store too many bytes, as the size of the parameter is
not always a multiple of 8.

This patch fixes that issue and byval parameters are now stored with the
correct number of bytes.
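
A sketch of splitting a byval parameter into stores that cover exactly its
size, assuming the tail is stored with progressively smaller pieces (an
assumption about the shape of the fix, not the actual PPCISelLowering code):

  #include <cstdio>

  // Plan stores for a byval parameter of Size bytes: 8-byte chunks first,
  // then smaller pieces for the remainder, writing exactly Size bytes.
  void planByValStores(unsigned Size) {
    unsigned Offset = 0;
    for (unsigned Chunk = 8; Chunk != 0; Chunk /= 2)
      while (Size - Offset >= Chunk) {
        std::printf("store %u bytes at offset %u\n", Chunk, Offset);
        Offset += Chunk;
      }
    // At this point Offset == Size, so no extra bytes are written.
  }

  int main() {
    planByValStores(12); // 8 bytes at 0, then 4 bytes at 8 -- never 16 bytes total
  }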

Reviewed By: nemanjai, #powerpc, quinnp, amyk

Differential Revision: https://reviews.llvm.org/D121430



# 662b9fa0 28-Mar-2022 Shao-Ce SUN <[email protected]>

[NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D122557


# c1a31ee6 20-Mar-2022 Aaron Puchert <[email protected]>

[PPCISelLowering] Avoid emitting calls to __multi3, __muloti4

After D108936, @llvm.smul.with.overflow.i64 was lowered to __multi3
instead of __mulodi4, which also doesn't exist on PowerPC 32-bit, not
even with compiler-rt. Block it as well so that we get inline code.

Because libgcc doesn't have __muloti4, we block that as well.

Fixes #54460.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D122090



# 300e1293 15-Mar-2022 Qiu Chaofan <[email protected]>

[PowerPC] Disable perfect shuffle by default

We are going to remove the old 'perfect shuffle' optimization since it
brings a performance penalty in hot loops around vectors. For example, in the
following loop sharing the same mask:

%v.1 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>
%v.2 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>

The generated instructions will be `vmrglw-vmrghw-vmrglw-vmrghw` instead
of `vperm-vperm`. In some large loop cases, this causes 20%+ performance
penalty.

The original attempt to resolve this was to pre-record the masks of every
shufflevector operation in the DAG, but that is somewhat complex and brings
unnecessary computation (scanning all nodes) into the optimization. Here we
disable it by default. There are indeed some cases that become worse after
this, which will be fixed in a more careful way in future patches.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D121082



Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 30f30e1c 08-Mar-2022 Masoud Ataei <[email protected]>

[PowerPC] Fix the non-tail call in scalar MASS conversion
This patch proposes a fix for patch https://reviews.llvm.org/D101759
on non-tail-call math function conversion to MASS calls.

Differential: https://reviews.llvm.org/D121016

reviewer: @nemanjai



# b2497e54 07-Mar-2022 Qiu Chaofan <[email protected]>

[PowerPC] Add generic fnmsub intrinsic

Currently in Clang, we have two types of builtins for the fnmsub operation:
one for float/double vectors, which are transformed into IR operations, and
one for float/double scalars, which generate the corresponding intrinsics.

But for the vector version of the builtin, the 3-op chain may be recognized
as expensive by some passes (like early CSE). We need some way to keep
the fnmsub form until code generation.

This patch introduces the ppc.fnmsub.* intrinsic to unify the four fnmsub
intrinsics.
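
A small semantics sketch of fnmsub, -(a*b - c), contrasting the 3-op chain
with a fused form (assumption: rounding of the fused form can differ from
the separate-ops form):

  #include <cmath>
  #include <cstdio>

  // The "3 op chain": multiply, subtract, negate as separate operations.
  double fnmsubSeparateOps(double A, double B, double C) { return -(A * B - C); }
  // Modeling a single fused instruction via fma.
  double fnmsubFused(double A, double B, double C) { return -std::fma(A, B, -C); }

  int main() {
    std::printf("%g %g\n", fnmsubSeparateOps(2.0, 3.0, 1.0),
                fnmsubFused(2.0, 3.0, 1.0)); // prints: -5 -5
  }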

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D116015



Revision tags: llvmorg-14.0.0-rc2
# 43d48ed2 20-Feb-2022 Qiu Chaofan <[email protected]>

[PowerPC] Add option to disable perfect shuffle

Perfect shuffle was introduced into PowerPC backend years ago, and only
available in big-endian subtargets. This optimization has good effects
in simple cases, but brings serious negative impact in large programs
with many shuffle instructions sharing the same mask.

This introduces a temporary hidden backend option to control it until we
implement a better way to fix the gap in vector shuffle decomposition.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D120072



# 097a95f2 10-Feb-2022 Ting Wang <[email protected]>

[PowerPC] Add custom lowering for SELECT_CC fp128 using xsmaxcqp

Power ISA 3.1 adds xsmaxcqp/xsmincqp for quad-precision type-c max/min selection,
and this opens the opportunity to improve instruction selection on: llvm.maxnum.f128,
llvm.minnum.f128, and select_cc ordered gt/lt and (don't care) gt/lt.

Reviewed By: nemanjai, shchenz, amyk

Differential Revision: https://reviews.llvm.org/D117006



Revision tags: llvmorg-14.0.0-rc1
# 149195f5 07-Feb-2022 Nikita Popov <[email protected]>

[PPCISelLowering] Avoid use of getPointerElementType()

Use the value type instead.


# 8ce13bc9 04-Feb-2022 Masoud Ataei <[email protected]>

[PowerPC] Option controlling scalar MASS conversion

differential: https://reviews.llvm.org/D119035

reviewer: bmahjour


# 85243124 04-Feb-2022 Benjamin Kramer <[email protected]>

Tweak some uses of std::iota to skip initializing the underlying storage. NFCI.


# 256d2533 02-Feb-2022 Masoud Ataei <[email protected]>

[PowerPC] Scalar IBM MASS library conversion pass

This patch introduces the conversions from math function calls
to MASS library calls. To resolve calls generated with these conversions, one
needs to link the libxlopt.a library. This patch is tested on PowerPC Linux
and AIX.

Differential: https://reviews.llvm.org/D101759

Reviewer: bmahjour

show more ...

