Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |

# 96515df8 | 06-Jul-2022 | Masoud Ataei <[email protected]>

[PowerPC] Fix the check for scalar MASS conversion
Proposing to move the check for scalar MASS conversion from constructor of PPCTargetLowering to the lowerLibCallBase function which decides about the lowering.
The target machine option Options.PPCGenScalarMASSEntries is set in PPCTargetMachine.cpp, but an object of the class PPCTargetLowering is created in one of the included header files, so the constructor runs before PPCGenScalarMASSEntries is set to its correct value. We therefore cannot check this option in the constructor.
Differential: https://reviews.llvm.org/D128653
Reviewer: @bmahjour

# 88b6d227 | 28-Jun-2022 | Ting Wang <[email protected]>

[PowerPC] Improve getNormalLoadInput to reach more splat load opportunities
There are straightforward splat load opportunities blocked by getNormalLoadInput(), since those cases involve consecutive bitcasts. Improve the function by looking through bitcasts.
Reviewed By: nemanjai
Differential Revision: https://reviews.llvm.org/D128703

Revision tags: llvmorg-14.0.6


# e09f6ff3 | 20-Jun-2022 | Nemanja Ivanovic <[email protected]>

[PowerPC] Disable automatic generation of STXVP
There are instances where using paired vector stores leads to significant performance degradation due to issues with store forwarding. To avoid falling into this trap with compiler-generated code, we will not emit these instructions unless the user requests them explicitly (with a builtin or by specifying the option).
Reviewed By: lei, amyk, saghir
Differential Revision: https://reviews.llvm.org/D127218

# 34033a84 | 15-Jun-2022 | Amy Kwan <[email protected]>

[PowerPC] Skip combine for vector_shuffles when two scalar_to_vector nodes are different vector types.
Currently in `combineVectorShuffle()`, we update the shuffle mask if either input vector comes from a scalar_to_vector, and we keep the respective input vectors in their permuted form by producing PPCISD::SCALAR_TO_VECTOR_PERMUTED. However, it is possible to end up in a situation where both input vectors to the vector_shuffle are scalar_to_vector and have different vector types. In situations like this, the shuffle mask is updated incorrectly, as the current code assumes both scalar_to_vector inputs are of the same vector type.
This patch skips the combine for vector_shuffle if both input vectors are scalar_to_vector of different vector types. A follow-up patch will fix this case properly, so that the shuffle mask is updated correctly.
Differential Revision: https://reviews.llvm.org/D127818

Revision tags: llvmorg-14.0.5


# 335e8bf1 | 08-Jun-2022 | Quinn Pham <[email protected]>

[PowerPC] emit VSX instructions instead of VMX instructions for vector loads and stores
This patch changes the PowerPC backend to generate VSX load/store instructions for all vector loads/stores on Power8 and earlier (LE) instead of VMX load/store instructions. The reason for this change is that VMX instructions require the vector to be 16-byte aligned, so a vector load/store will fail with VMX instructions if the vector is misaligned. Also, `gcc` generates VSX instructions in this situation, which allow unaligned access but require a swap instruction after loading/before storing. This is not an issue for BE because we already emit VSX instructions, since no swap is required. And it is not an issue on Power9 and up, since we have access to `lxv[x]`/`stxv[x]`, which allow unaligned access and do not require swaps.
This patch also delays the VSX load/store for LE combines until after LegalizeOps to prioritize other load/store combines.
Reviewed By: #powerpc, stefanp
Differential Revision: https://reviews.llvm.org/D127309

# 263f1b2f | 14-Jun-2022 | Stefan Pintilie <[email protected]>

[PowerPC] Fix combine step for shufflevector.
The combine step for shufflevector will sometimes replace undef in the mask with a defined value. This can cause an infinite loop in some cases as another combine will then put the undef back in the mask.
This patch fixes the issue so that undefs are not replaced when doing a combine.
Reviewed By: ZarkoCA, amyk, quinnp, saghir
Differential Revision: https://reviews.llvm.org/D127439

# 07881861 | 03-Jun-2022 | Guillaume Chatelet <[email protected]>

[Alignment][NFC] Remove usage of MemSDNode::getAlignment
I can't remove the function just yet as it is used in the generated .inc files. I would also like to provide a way to compare alignment with TypeSize since it came up a few times.
Differential Revision: https://reviews.llvm.org/D126910

# ad73ce31 | 26-May-2022 | Zongwei Lan <[email protected]>

[Target] use getSubtarget<> instead of static_cast<>(getSubtarget())
Differential Revision: https://reviews.llvm.org/D125391

Revision tags: llvmorg-14.0.4


# 115c1888 | 06-May-2022 | David Green <[email protected]>

[DAG][PowerPC] Combine shuffle(bitcast(X), Mask) to bitcast(shuffle(X, Mask'))
If the mask is made up of elements that form a mask in the higher type, we can convert shuffle(bitcast(X), Mask) into bitcast(shuffle(X, Mask')), simplifying the instruction sequence. A v4i32 2,3,0,1 shuffle, for example, can be treated as a 1,0 v2i64 shuffle. This helps clean up some of the AArch64 concat load combines, along with helping simplify a number of other tests.
The PowerPC combine for v16i8 splat vector loads needed some fixes to keep it working for v16i8 vectors. This improves the handling of v2i64 shuffles to match too, hopefully improving them in general.
Differential Revision: https://reviews.llvm.org/D123801

Revision tags: llvmorg-14.0.3, llvmorg-14.0.2


# fb193db2 | 20-Apr-2022 | Fangrui Song <[email protected]>

[PowerPC] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds

Revision tags: llvmorg-14.0.1


# 18679ac0 | 08-Apr-2022 | Kai Luo <[email protected]>

[PowerPC] Adjust `MaxAtomicSizeInBitsSupported` on PPC64
AtomicExpandPass uses this variable to determine whether to emit libcalls. The default value is 1024, and if we don't set it explicitly for PPC64, AtomicExpandPass won't emit `__atomic_*` libcalls for targets that are unable to inline atomic ops, and the backend ends up emitting `__sync_*` libcalls instead. Thanks @efriedma for pointing it out.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D122868

# 549e118e | 08-Apr-2022 | Kai Luo <[email protected]>

[PowerPC] Support 16-byte lock free atomics on pwr8 and up
Make the 16-byte atomic type 16-byte aligned on PPC64, consistent with GCC. Also enable inlining 16-byte atomics on non-AIX targets on PPC64.
Reviewed By: hubert.reinterpretcast
Differential Revision: https://reviews.llvm.org/D122377

# b389354b | 06-Apr-2022 | Ting Wang <[email protected]>

[Clang][PowerPC] Add max/min intrinsics to Clang and PPC backend
Add support for builtin_[max|min], which have the prototype: A builtin_max(A1, A2, A3, ...). All arguments must have the same type; they must all be float, double, or long double. Internally, SelectCC is used to compute the result.
Reviewed By: qiucf
Differential Revision: https://reviews.llvm.org/D122478

# 585c85ab | 31-Mar-2022 | Stefan Pintilie <[email protected]>

[PowerPC] Fix lowering of byval parameters for sizes greater than 8 bytes.
To store a byval parameter, the existing code would store as many 8-byte elements as were required to cover the full size of the byval parameter. For example, a parameter of size 16 would store two elements of 8 bytes. A parameter of size 12 would also store two elements of 8 bytes. This would sometimes store too many bytes, as the size of the parameter is not always a multiple of 8.
This patch fixes that issue; byval parameters are now stored with the correct number of bytes.
Reviewed By: nemanjai, #powerpc, quinnp, amyk
Differential Revision: https://reviews.llvm.org/D121430

# 662b9fa0 | 28-Mar-2022 | Shao-Ce SUN <[email protected]>

[NFC][CodeGen] Add a setTargetDAGCombine use ArrayRef
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D122557

# c1a31ee6 | 20-Mar-2022 | Aaron Puchert <[email protected]>

[PPCISelLowering] Avoid emitting calls to __multi3, __muloti4
After D108936, @llvm.smul.with.overflow.i64 was lowered to __multi3 instead of __mulodi4, which also doesn't exist on PowerPC 32-bit, not even with compiler-rt. Block it as well so that we get inline code.
Because libgcc doesn't have __muloti4, we block that as well.
Fixes #54460.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D122090

# 300e1293 | 15-Mar-2022 | Qiu Chaofan <[email protected]>

[PowerPC] Disable perfect shuffle by default
We are going to remove the old 'perfect shuffle' optimization since it brings a performance penalty in hot loops over vectors. For example, in the following loop both shuffles share the same mask:
%v.1 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>
%v.2 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>
The generated instructions will be `vmrglw-vmrghw-vmrglw-vmrghw` instead of `vperm-vperm`. In some large loop cases, this causes a 20%+ performance penalty.
The original attempt to resolve this was to pre-record the masks of every shufflevector operation in the DAG, but that is somewhat complex and brings unnecessary computation (scanning all nodes) into the optimization. Here we disable it by default. There are indeed some cases that become worse after this; they will be fixed in a more careful way in future patches.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D121082

Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3


# 30f30e1c | 08-Mar-2022 | Masoud Ataei <[email protected]>

[PowerPC] Fix the non-tail-call case in scalar MASS conversion
This patch proposes a fix for patch https://reviews.llvm.org/D101759 regarding the conversion of non-tail-call math functions to MASS calls.
Differential: https://reviews.llvm.org/D121016
Reviewer: @nemanjai

# b2497e54 | 07-Mar-2022 | Qiu Chaofan <[email protected]>

[PowerPC] Add generic fnmsub intrinsic
Currently in Clang, we have two types of builtins for the fnmsub operation: one for float/double vectors, which are transformed into IR operations, and one for float/double scalars, which generate the corresponding intrinsics.
But for the vector version of the builtin, the three-op chain may be recognized as expensive by some passes (like early CSE). We need some way to keep the fnmsub form until code generation.
This patch introduces the ppc.fnmsub.* intrinsic to unify the four fnmsub intrinsics.
Reviewed By: shchenz
Differential Revision: https://reviews.llvm.org/D116015

Revision tags: llvmorg-14.0.0-rc2


# 43d48ed2 | 20-Feb-2022 | Qiu Chaofan <[email protected]>

[PowerPC] Add option to disable perfect shuffle
Perfect shuffle was introduced into the PowerPC backend years ago and is only available on big-endian subtargets. This optimization has good effects in simple cases, but brings a serious negative impact in large programs with many shuffle instructions sharing the same mask.
This patch introduces a temporary hidden backend option to control it until we implement a better way to close the gap in vector shuffle decomposition.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D120072

# 097a95f2 | 10-Feb-2022 | Ting Wang <[email protected]>

[PowerPC] Add custom lowering for SELECT_CC fp128 using xsmaxcqp
Power ISA 3.1 adds xsmaxcqp/xsmincqp for quad-precision type-c max/min selection, and this opens the opportunity to improve instruction selection on: llvm.maxnum.f128, llvm.minnum.f128, and select_cc ordered gt/lt and (don't care) gt/lt.
Reviewed By: nemanjai, shchenz, amyk
Differential Revision: https://reviews.llvm.org/D117006

Revision tags: llvmorg-14.0.0-rc1


# 149195f5 | 07-Feb-2022 | Nikita Popov <[email protected]>

[PPCISelLowering] Avoid use of getPointerElementType()
Use the value type instead.

# 8ce13bc9 | 04-Feb-2022 | Masoud Ataei <[email protected]>

[PowerPC] Option controlling scalar MASS conversion
Differential: https://reviews.llvm.org/D119035
Reviewer: bmahjour

# 85243124 | 04-Feb-2022 | Benjamin Kramer <[email protected]>

Tweak some uses of std::iota to skip initializing the underlying storage. NFCI.

# 256d2533 | 02-Feb-2022 | Masoud Ataei <[email protected]>

[PowerPC] Scalar IBM MASS library conversion pass
This patch introduces the conversion of math function calls to MASS library calls. To resolve calls generated by these conversions, one needs to link the libxlopt.a library. This patch was tested on PowerPC Linux and AIX.
Differential: https://reviews.llvm.org/D101759
Reviewer: bmahjour