|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
2e62a26f |
| 15-Jul-2022 |
Edd Barrett <[email protected]> |
[stackmaps] Legalise patchpoint arguments.
This is similar to D125680, but for llvm.experimental.patchpoint (instead of llvm.experimental.stackmap).
Differential review: https://reviews.llvm.org/D1
[stackmaps] Legalise patchpoint arguments.
This is similar to D125680, but for llvm.experimental.patchpoint (instead of llvm.experimental.stackmap).
Differential review: https://reviews.llvm.org/D129268
show more ...
|
|
Revision tags: llvmorg-14.0.6 |
|
| #
ed8ef65f |
| 17-Jun-2022 |
Edd Barrett <[email protected]> |
[stackmaps] Start legalizing live variable operands
Prior to this change, live variable operands passed to `llvm.experimental.stackmap` would be emitted directly to target nodes, meaning that they d
[stackmaps] Start legalizing live variable operands
Prior to this change, live variable operands passed to `llvm.experimental.stackmap` would be emitted directly to target nodes, meaning that they don't get legalised. The upshot of this is that LLVM may crash when encountering illegally typed target nodes.
e.g. https://github.com/llvm/llvm-project/issues/21657
This change introduces a platform independent stackmap DAG node whose operands are legalised as per usual, thus avoiding aforementioned crashes.
Note that some kinds of argument are still not handled properly, namely vectors, structs, and large integers, like i128s. These will need to be addressed in follow-up changes.
Note also that this does not change the behaviour of `llvm.experimental.patchpoint`. A follow up change will do the same for this intrinsic.
Differential review: https://reviews.llvm.org/D125680
show more ...
|
|
Revision tags: llvmorg-14.0.5 |
|
| #
fb34d531 |
| 03-Jun-2022 |
Benjamin Kramer <[email protected]> |
Promote bf16 to f32 when the target doesn't support it
This is modeled after the half-precision fp support. Two new nodes are introduced for casting from and to bf16. Since casting from bf16 is a si
Promote bf16 to f32 when the target doesn't support it
This is modeled after the half-precision fp support. Two new nodes are introduced for casting from and to bf16. Since casting from bf16 is a simple operation I opted to always directly lower it to integer arithmetic. The other way round is more complicated if you want to preserve IEEE semantics, so it's handled by a new __truncsfbf2 compiler-rt builtin.
This is of course very bare bones, but sufficient to get a semi-softened fadd on x86.
Possible future improvements: - Targets with bf16 conversion instructions can now make fp_to_bf16 legal - The software conversion to bf16 can be replaced by a trivial implementation under fast math.
Differential Revision: https://reviews.llvm.org/D126953
show more ...
|
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
436b875e |
| 30-Mar-2022 |
Sanjay Patel <[email protected]> |
[SDAG] avoid libcalls to fmin/fmax for soft-float targets
This is an extension of D70965 to avoid creating a mathlib call where it did not exist in the original source. Also see D70852 for discussio
[SDAG] avoid libcalls to fmin/fmax for soft-float targets
This is an extension of D70965 to avoid creating a mathlib call where it did not exist in the original source. Also see D70852 for discussion about an alternative proposal that was abandoned.
In the motivating bug report: https://github.com/llvm/llvm-project/issues/54554 ...we also have a more general issue about handling "no-builtin" options.
Differential Revision: https://reviews.llvm.org/D122610
show more ...
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
38efa68b |
| 07-Jan-2022 |
Victor Perez <[email protected]> |
[LegalizeTypes][VP] Add splitting support for vp.select
Split vp.select in a similar way as vselect, splitting also the length parameter.
Reviewed By: craig.topper
Differential Revision: https://r
[LegalizeTypes][VP] Add splitting support for vp.select
Split vp.select in a similar way as vselect, splitting also the length parameter.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D116651
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
ebe9944a |
| 25-Sep-2021 |
Xiang1 Zhang <[email protected]> |
[ISel] Legalized arithmetic.fence.f128 for 32-bits target
Reviewed By: Craig Topper, Wang Pengfei
Differential Revision: https://reviews.llvm.org/D110467
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
| #
735f4671 |
| 09-Sep-2021 |
Chris Lattner <[email protected]> |
[APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero` instead of `getNullValue` and renames predicates like `isAl
[APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero` instead of `getNullValue` and renames predicates like `isAllOnesValue` to simply `isAllOnes`. This achieves two things:
1) This starts standardizing predicates across the LLVM codebase, following (in this case) ConstantInt. The word "Value" doesn't convey anything of merit, and is missing in some of the other things.
2) Calling an integer "null" doesn't make any sense. The original sin here is mine and I've regretted it for years. This moves us to calling it "zero" instead, which is correct!
APInt is widely used and I don't think anyone is keen to take massive source breakage on anything so core, at least not all in one go. As such, this doesn't actually delete any entrypoints, it "soft deprecates" them with a comment.
Included in this patch are changes to a bunch of the codebase, but there are more. We should normalize SelectionDAG and other APIs as well, which would make the API change more mechanical.
Differential Revision: https://reviews.llvm.org/D109483
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc2 |
|
| #
6f7f5b54 |
| 10-Aug-2021 |
Wang, Pengfei <[email protected]> |
[X86] AVX512FP16 instructions enabling 1/6
1. Enable FP16 type support and basic declarations used by following patches. 2. Enable new instructions VMOVW and VMOVSH.
Ref.: https://software.intel.co
[X86] AVX512FP16 instructions enabling 1/6
1. Enable FP16 type support and basic declarations used by following patches. 2. Enable new instructions VMOVW and VMOVSH.
Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D105263
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
4c7f820b |
| 26-Mar-2021 |
Bjorn Pettersson <[email protected]> |
Update @llvm.powi to handle different int sizes for the exponent
This can be seen as a follow up to commit 0ee439b705e82a4fe20e2, that changed the second argument of __powidf2, __powisf2 and __powit
Update @llvm.powi to handle different int sizes for the exponent
This can be seen as a follow up to commit 0ee439b705e82a4fe20e2, that changed the second argument of __powidf2, __powisf2 and __powitf2 in compiler-rt from si_int to int. That was to align with how those runtimes are defined in libgcc. One thing that seem to have been missing in that patch was to make sure that the rest of LLVM also handle that the argument now depends on the size of int (not using the si_int machine mode for 32-bit). When using __builtin_powi for a target with 16-bit int clang crashed. And when emitting libcalls to those rtlib functions, typically when lowering @llvm.powi), the backend would always prepare the exponent argument as an i32 which caused miscompiles when the rtlib was compiled with 16-bit int.
The solution used here is to use an overloaded type for the second argument in @llvm.powi. This way clang can use the "correct" type when lowering __builtin_powi, and then later when emitting the libcall it is assumed that the type used in @llvm.powi matches the rtlib function.
One thing that needed some extra attention was that when vectorizing calls several passes did not support that several arguments could be overloaded in the intrinsics. This patch allows overload of a scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with an entry for powi.
Differential Revision: https://reviews.llvm.org/D99439
show more ...
|
| #
536e02a2 |
| 24-May-2021 |
Bjorn Pettersson <[email protected]> |
[CodeGen] Refactor libcall lookups for RTLIB::POWI_*
Use RuntimeLibcalls to get a common way to pick correct RTLIB::POWI_* libcall for a given value type.
This includes a small refactoring of Expan
[CodeGen] Refactor libcall lookups for RTLIB::POWI_*
Use RuntimeLibcalls to get a common way to pick correct RTLIB::POWI_* libcall for a given value type.
This includes a small refactoring of ExpandFPLibCall and ExpandArgFPLibCall in SelectionDAGLegalize to share a bit of code, plus adding an ExpandFPLibCall version that can be called directly when expanding FPOWI/STRICT_FPOWI to ensure that we actually use the same RTLIB::Libcall when expanding the libcall as we used when checking the legality of such a call by doing a getLibcallName check.
Differential Revision: https://reviews.llvm.org/D103050
show more ...
|
| #
9d86095f |
| 03-May-2021 |
Tomas Matheson <[email protected]> |
Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0"
This reverts commit 753185031d939711f8733639a77a6fdc3bdbad22.
|
| #
75318503 |
| 31-Mar-2021 |
Tomas Matheson <[email protected]> |
[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0
atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert s
[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0
atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert spills between the exclusive loads and stores, which invalidates the exclusive monitor and can lead to infinite loops.
To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them after register allocation.
Floating point legalisation: f16 ATOMIC_LOAD_FADD(*f16, f16) is legalised to f32 ATOMIC_LOAD_FADD(*i16, f32) and then eventually f32 ATOMIC_LOAD_FADD_16(*i16, f32)
Differential Revision: https://reviews.llvm.org/D101164
Originally submitted as 3338290c187b254ad071f4b9cbf2ddb2623cefc0. Reverted in c7df6b1223d88dfd15248fbf7b7b83dacad22ae3.
show more ...
|
| #
c7df6b12 |
| 30-Apr-2021 |
Tomas Matheson <[email protected]> |
Revert "[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0"
This reverts commit 3338290c187b254ad071f4b9cbf2ddb2623cefc0.
Broke expensive checks on debian.
|
| #
3338290c |
| 31-Mar-2021 |
Tomas Matheson <[email protected]> |
[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0
atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert s
[CodeGen][ARM] Implement atomicrmw as pseudo operations at -O0
atomicrmw instructions are expanded by AtomicExpandPass before register allocation into cmpxchg loops. Register allocation can insert spills between the exclusive loads and stores, which invalidates the exclusive monitor and can lead to infinite loops.
To avoid this, reimplement atomicrmw operations as pseudo-instructions and expand them after register allocation.
Floating point legalisation: f16 ATOMIC_LOAD_FADD(*f16, f16) is legalised to f32 ATOMIC_LOAD_FADD(*i16, f32) and then eventually f32 ATOMIC_LOAD_FADD_16(*i16, f32)
Differential Revision: https://reviews.llvm.org/D101164
show more ...
|
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
| #
f776d8b1 |
| 18-Jan-2021 |
Qiu Chaofan <[email protected]> |
[Legalizer] Promote result type in expanding FP_TO_XINT
This patch promotes result integer type of FP_TO_XINT in expanding. So crash in conversion from ppc_fp128 to i1 will be fixed.
Reviewed By: s
[Legalizer] Promote result type in expanding FP_TO_XINT
This patch promotes result integer type of FP_TO_XINT in expanding. So crash in conversion from ppc_fp128 to i1 will be fixed.
Reviewed By: steven.zhang
Differential Revision: https://reviews.llvm.org/D92473
show more ...
|
|
Revision tags: llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
| #
a89d751f |
| 17-Dec-2020 |
Bjorn Pettersson <[email protected]> |
Add intrinsics for saturating float to int casts
This patch adds support for the fptoui.sat and fptosi.sat intrinsics, which provide basically the same functionality as the existing fptoui and fptos
Add intrinsics for saturating float to int casts
This patch adds support for the fptoui.sat and fptosi.sat intrinsics, which provide basically the same functionality as the existing fptoui and fptosi instructions, but will saturate (or return 0 for NaN) on values unrepresentable in the target type, instead of returning poison. Related mailing list discussion can be found at: https://groups.google.com/d/msg/llvm-dev/cgDFaBmCnDQ/CZAIMj4IBAAJ
The intrinsics have overloaded source and result type and support vector operands:
i32 @llvm.fptoui.sat.i32.f32(float %f) i100 @llvm.fptoui.sat.i100.f64(double %f) <4 x i32> @llvm.fptoui.sat.v4i32.v4f16(half %f) // etc
On the SelectionDAG layer two new ISD opcodes are added, FP_TO_UINT_SAT and FP_TO_SINT_SAT. These opcodes have two operands and one result. The second operand is an integer constant specifying the scalar saturation width. The idea here is that initially the second operand and the scalar width of the result type are the same, but they may change during type legalization. For example:
i19 @llvm.fptsi.sat.i19.f32(float %f) // builds i19 fp_to_sint_sat f, 19 // type legalizes (through integer result promotion) i32 fp_to_sint_sat f, 19
I went for this approach, because saturated conversion does not compose well. There is no good way of "adjusting" a saturating conversion to i32 into one to i19 short of saturating twice. Specifying the saturation width separately allows directly saturating to the correct width.
There are two baseline expansions for the fp_to_xint_sat opcodes. If the integer bounds can be exactly represented in the float type and fminnum/fmaxnum are legal, we can expand to something like:
f = fmaxnum f, FP(MIN) f = fminnum f, FP(MAX) i = fptoxi f i = select f uo f, 0, i # unnecessary if unsigned as 0 = MIN
If the bounds cannot be exactly represented, we expand to something like this instead:
i = fptoxi f i = select f ult FP(MIN), MIN, i i = select f ogt FP(MAX), MAX, i i = select f uo f, 0, i # unnecessary if unsigned as 0 = MIN
It should be noted that this expansion assumes a non-trapping fptoxi.
Initial tests are for AArch64, x86_64 and ARM. This exercises all of the scalar and vector legalization. ARM is included to test float softening.
Original patch by @nikic and @ebevhan (based on D54696).
Differential Revision: https://reviews.llvm.org/D54749
show more ...
|
| #
38b44421 |
| 15-Dec-2020 |
Qiu Chaofan <[email protected]> |
[NFC] [Legalizer] Use common method for expanding fp-to-int operands
Reviewed By: RKSimon, steven.zhang
Differential Revision: https://reviews.llvm.org/D92481
|
|
Revision tags: llvmorg-11.0.1-rc1 |
|
| #
09e34048 |
| 10-Nov-2020 |
Chen Zheng <[email protected]> |
[SelectionDAG] fminnum should be a binary operator
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D91163
|
| #
c126eb75 |
| 04-Nov-2020 |
Cameron McInally <[email protected]> |
[SelectionDAG] Add legalizations for VECREDUCE_SEQ_FMUL
Hook up legalizations for VECREDUCE_SEQ_FMUL. This is following up on the VECREDUCE_SEQ_FADD work from D90247.
Differential Revision: https:/
[SelectionDAG] Add legalizations for VECREDUCE_SEQ_FMUL
Hook up legalizations for VECREDUCE_SEQ_FMUL. This is following up on the VECREDUCE_SEQ_FADD work from D90247.
Differential Revision: https://reviews.llvm.org/D90644
show more ...
|
| #
1f852ba8 |
| 01-Nov-2020 |
Qiu Chaofan <[email protected]> |
[PowerPC] Avoid unnecessary fadd for unsigned to ppcf128
Unsigned 32-bit or shorter integer to ppcf128 conversion are currently expanded as signed-to-double with an extra fadd to 'complement'. But o
[PowerPC] Avoid unnecessary fadd for unsigned to ppcf128
Unsigned 32-bit or shorter integer to ppcf128 conversion are currently expanded as signed-to-double with an extra fadd to 'complement'. But on PowerPC we have native instruction to directly convert unsigned to double since ISA v2.06. This patch exploits it.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D89786
show more ...
|
| #
dda1e74b |
| 30-Oct-2020 |
Cameron McInally <[email protected]> |
[Legalize] Add legalizations for VECREDUCE_SEQ_FADD
Add Legalization support for VECREDUCE_SEQ_FADD, so that we don't need to depend on ExpandReductionsPass.
Differential Revision: https://reviews.
[Legalize] Add legalizations for VECREDUCE_SEQ_FADD
Add Legalization support for VECREDUCE_SEQ_FADD, so that we don't need to depend on ExpandReductionsPass.
Differential Revision: https://reviews.llvm.org/D90247
show more ...
|
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6 |
|
| #
1127662c |
| 05-Oct-2020 |
Craig Topper <[email protected]> |
[SelectionDAG] Make sure FMF are propagated when getSetcc canonicalizes FP constants to RHS.
getNode handling for ISD:SETCC calls FoldSETCC which can canonicalize FP constants to the RHS. When this
[SelectionDAG] Make sure FMF are propagated when getSetcc canonicalizes FP constants to RHS.
getNode handling for ISD:SETCC calls FoldSETCC which can canonicalize FP constants to the RHS. When this happens we should create the node with the FMF that was requested. By using FlagInserter when can ensure any calls to getNode/getSetcc during canonicalization will also get the flags.
Differential Revision: https://reviews.llvm.org/D88063
show more ...
|
|
Revision tags: llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3 |
|
| #
1d782c29 |
| 21-Sep-2020 |
Qiu Chaofan <[email protected]> |
[PowerPC] Pass nofpexcept flag to custom lowered constrained ops
This is a follow-up of D86605. For strict DAG FP node, if its FP exception behavior metadata is ignore, it should have nofpexcept fla
[PowerPC] Pass nofpexcept flag to custom lowered constrained ops
This is a follow-up of D86605. For strict DAG FP node, if its FP exception behavior metadata is ignore, it should have nofpexcept flag. But during custom lowering, this flag isn't passed down.
This is also seen on X86 target.
Reviewed By: uweigand
Differential Revision: https://reviews.llvm.org/D87390
show more ...
|
| #
53f36f06 |
| 12-Sep-2020 |
Nikita Popov <[email protected]> |
[Legalize][ARM][X86] Add float legalization for VECREDUCE
This adds SoftenFloatRes, PromoteFloatRes and SoftPromoteHalfRes legalizations for VECREDUCE, to fill the remaining hole in the SDAG legaliz
[Legalize][ARM][X86] Add float legalization for VECREDUCE
This adds SoftenFloatRes, PromoteFloatRes and SoftPromoteHalfRes legalizations for VECREDUCE, to fill the remaining hole in the SDAG legalization. These legalizations simply expand the reduction and let it be recursively legalized. For the PromoteFloatRes case at least it is possible to do better than that, but it's pretty tricky (because we need to consider the interaction of three different vector legalizations and the type promotion) and probably not really worthwhile.
I haven't added ExpandFloatRes support, as I am not familiar with ppc_fp128.
Differential Revision: https://reviews.llvm.org/D87569
show more ...
|
| #
b1e68f88 |
| 08-Sep-2020 |
Craig Topper <[email protected]> |
[SelectionDAGBuilder] Pass fast math flags to getNode calls rather than trying to set them after the fact.:
This removes the after the fact FMF handling from D46854 in favor of passing fast math fla
[SelectionDAGBuilder] Pass fast math flags to getNode calls rather than trying to set them after the fact.:
This removes the after the fact FMF handling from D46854 in favor of passing fast math flags to getNode. This should be a superset of D87130.
This required adding a SDNodeFlags to SelectionDAG::getSetCC.
Now we manage to contant fold some stuff undefs during the initial getNode that we don't do in later DAG combines.
Differential Revision: https://reviews.llvm.org/D87200
show more ...
|