|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
8aff88fd |
| 19-Jul-2022 |
Benjamin Kramer <[email protected]> |
[LegalizeDAG] Propagate alignment in ExpandExtractFromVectorThroughStack
Unlike the name suggests, this can reuse any store as a base for a memory-based vector extract. If that store is underaligned, the loads created to extract will have an invalid alignment. Since most CPUs are forgiving wrt alignment this is almost never an issue; on x86 it is only reproducible by extracting a 128-bit vector out of a wider vector.
I tried making a test case in the context of https://reviews.llvm.org/D127982 but it's really really fragile, as the output pretty much looks like a missed optimization.
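As a rough illustration of the alignment reasoning above, the following Python sketch (not LLVM code) derives the alignment that a load at `base + offset` can safely claim from the alignment of the reused store. The helper name `common_alignment` is hypothetical and exists only for this example.

```python
def common_alignment(base_align: int, offset: int) -> int:
    """Largest power of two guaranteed to divide (base_address + offset).

    base_align: known alignment of the base address, in bytes.
    offset:     non-negative byte offset of the extracted element.
    """
    if base_align <= 0 or base_align & (base_align - 1):
        raise ValueError("alignment must be a positive power of two")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if offset == 0:
        return base_align
    # The lowest set bit of the offset is the largest power of two dividing it.
    offset_align = offset & -offset
    return min(base_align, offset_align)
```

For example, taking the upper 128-bit half (byte offset 16) of a wider vector whose store is only 4-byte aligned gives `common_alignment(4, 16) == 4`, so a load that assumed 16-byte alignment would be claiming more than the store guarantees.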
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5 |
|
| #
fb34d531 |
| 03-Jun-2022 |
Benjamin Kramer <[email protected]> |
Promote bf16 to f32 when the target doesn't support it
This is modeled after the half-precision fp support. Two new nodes are introduced for casting from and to bf16. Since casting from bf16 is a simple operation I opted to always directly lower it to integer arithmetic. The other way round is more complicated if you want to preserve IEEE semantics, so it's handled by a new __truncsfbf2 compiler-rt builtin.
This is of course very bare bones, but sufficient to get a semi-softened fadd on x86.
Possible future improvements:
- Targets with bf16 conversion instructions can now make fp_to_bf16 legal
- The software conversion to bf16 can be replaced by a trivial implementation under fast math.
Differential Revision: https://reviews.llvm.org/D126953
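The asymmetry between the two conversion directions can be sketched in Python. This is an illustration under stated assumptions, not the code from the patch or from compiler-rt: widening is a 16-bit left shift of the bit pattern, while narrowing here assumes round-to-nearest-even and quiets NaNs. All function names are hypothetical.

```python
import struct

def f32_bits(x: float) -> int:
    """Bit pattern of x as an IEEE-754 binary32 value."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_f32(bits: int) -> float:
    """binary32 value with the given bit pattern."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]

def bf16_to_f32(h: int) -> float:
    """Widening: bf16 is the top 16 bits of a binary32, so shift left by 16."""
    return bits_f32((h & 0xFFFF) << 16)

def f32_to_bf16(x: float) -> int:
    """Narrowing with round-to-nearest-even (assumed rounding mode)."""
    bits = f32_bits(x)
    is_nan = (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0
    if is_nan:
        # Keep the sign and exponent, force the quiet bit so the result stays a NaN.
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Add 0x7FFF plus the lowest kept bit: ties round to the even result.
    rounding = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding) >> 16) & 0xFFFF
```

A fast-math variant of the narrowing, as suggested in the list of possible improvements, could presumably drop the rounding and NaN handling and keep only the shift.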
|
| #
a1121c31 |
| 31-May-2022 |
Paul Walker <[email protected]> |
[SVE] Fix incorrect code generation for bitcasts of unpacked vector types.
Bitcasting between unpacked scalable vector types of different element counts is not a NOP because the live elements are laid out differently.
e.g.
          01234567
nxv2i32 = XX??XX??
nxv4f16 = X?X?X?X?
Differential Revision: https://reviews.llvm.org/D126957
|
|
Revision tags: llvmorg-14.0.4 |
|
| #
2ea8f203 |
| 07-May-2022 |
Xiang1 Zhang <[email protected]> |
[CodeGen] Fix ConvertNodeToLibcall for STRICT_FPOWI
Reviewed By: PengfeiWang
Differential Revision: https://reviews.llvm.org/D125159
|
| #
70306542 |
| 03-May-2022 |
serge-sans-paille <[email protected]> |
[iwyu] Handle regressions in libLLVM header include
Running iwyu-diff on LLVM codebase since fa5a4e1b95c8f37796 detected a few regressions, fixing them.
Differential Revision: https://reviews.llvm.org/D124847
|
| #
f10a8f67 |
| 30-Apr-2022 |
Paul Walker <[email protected]> |
[LegalizeDAG] Fix TypeSize conversion error when expanding SIGN_EXTEND_INREG
SIGN_EXTEND_INREG expansion can trigger a TypeSize error because "VT.getSizeInBits() == 1" is used to detect for a boolean without first verifying VT is a scalar.
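For readers unfamiliar with the operation, here is a small Python model of what SIGN_EXTEND_INREG computes on a scalar, using the common shift-left then arithmetic-shift-right formulation. It is a sketch for illustration only; the function name and the choice of expansion are assumptions, not taken from the patch.

```python
def sign_extend_inreg(value: int, from_bits: int, width: int) -> int:
    """Sign-extend the low `from_bits` bits of a `width`-bit value.

    Returns the result as an unsigned `width`-bit pattern.
    """
    if not 0 < from_bits <= width:
        raise ValueError("from_bits must be in 1..width")
    mask = (1 << width) - 1
    shift = width - from_bits
    shifted = (value << shift) & mask        # SHL: move the sign bit to the top
    if shifted >> (width - 1):               # reinterpret as a signed integer
        shifted -= 1 << width
    return (shifted >> shift) & mask         # SRA: Python's >> is arithmetic
```

The boolean case mentioned in the commit corresponds to `from_bits == 1`, where the result is either all zeros or all ones.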
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
170a9031 |
| 10-Oct-2021 |
Serge Pavlov <[email protected]> |
Intrinsic for checking floating point class
This change introduces a new intrinsic, `llvm.is.fpclass`, which checks if the provided floating-point number belongs to any of the specified value classes. The intrinsic implements the checks made by C standard library functions `isnan`, `isinf`, `isfinite`, `isnormal`, `issubnormal`, `issignaling` and corresponding IEEE-754 operations.
The primary motivation for this intrinsic is the support of strict FP mode. In this mode using compare instructions or other FP operations is not possible, because if the value is a signaling NaN, floating-point exception `Invalid` is raised, but the aforementioned functions must never raise exceptions.
Currently there are two solutions for this problem, both only partially implemented. One of them is using integer operations to implement the check. It was implemented in https://reviews.llvm.org/D95948 for `isnan`. It solves the problem of exceptions, but offers one solution for all targets, although some can do the check in a more efficient way.
The other, implemented in https://reviews.llvm.org/D96568, introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects target-specific code into IR to implement `isnan` and some other functions. It is convenient for targets that have a dedicated instruction to determine FP data class. However, using a target-specific intrinsic complicates analysis and can prevent some optimizations.
A special intrinsic for value class checks allows representing data class tests with enough flexibility. During IR transformations it represents the check in a target-independent way and saves it from undesired transformations. In the instruction selector it allows efficient lowering depending on the used target and mode.
This implementation is an extended variant of `llvm.isnan` introduced in https://reviews.llvm.org/D104854. It is limited to minimal intrinsic support. Target-specific treatment will be implemented in separate patches.
Differential Revision: https://reviews.llvm.org/D112025
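The "integer operations" approach described above can be sketched in Python for binary32 values. The checks below look only at the bit pattern, so no floating-point comparison is performed on the value being tested. Function names are hypothetical, and the signaling-NaN test assumes the common convention that a set top mantissa bit means "quiet".

```python
import struct

EXP_MASK = 0x7F800000   # binary32 exponent field
MANT_MASK = 0x007FFFFF  # binary32 mantissa field
QUIET_BIT = 0x00400000  # assumed quiet-NaN bit (top mantissa bit)

def f32_bits(x: float) -> int:
    """Bit pattern of x as an IEEE-754 binary32 value."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def is_nan_bits(bits: int) -> bool:
    return (bits & 0x7FFFFFFF) > EXP_MASK

def is_inf_bits(bits: int) -> bool:
    return (bits & 0x7FFFFFFF) == EXP_MASK

def is_signaling_nan_bits(bits: int) -> bool:
    return is_nan_bits(bits) and (bits & QUIET_BIT) == 0

def is_subnormal_bits(bits: int) -> bool:
    return (bits & EXP_MASK) == 0 and (bits & MANT_MASK) != 0

def is_normal_bits(bits: int) -> bool:
    exp = bits & EXP_MASK
    return exp != 0 and exp != EXP_MASK
```

A class mask in the style of the intrinsic would then be an OR of such predicates, one per requested class.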
|
| #
12c10226 |
| 09-Dec-2021 |
John Brawn <[email protected]> |
[AArch64] Lowering and legalization of strict FP16
For strict FP16 to work correctly, some changes are needed in lowering and legalization:
* SelectionDAGLegalize::PromoteNode was missing handling for some strict fp opcodes.
* Some of the custom lowering of strict fp operations needed to be adjusted to work with FP16.
* Custom lowering needed to be added for round-to-int operations.
With this, and the previous patches for the rest of the strict fp isel, we can set IsStrictFPEnabled = true.
Differential Revision: https://reviews.llvm.org/D115620
|
| #
8216255c |
| 04-Apr-2022 |
Fraser Cormack <[email protected]> |
[RISCV][VP] Add basic RVV codegen for vp.fcmp
This patch adds the necessary infrastructure to lower vp.fcmp via ISD::VP_SETCC to RVV instructions.
Most notably this patch adds cond-code legalization for VP_SETCC, reusing the existing TargetLowering::LegalizeSetCCCondCode by passing in additional SDValue parameters for the Mask and EVL. This method then uses VP operations to legalize the condcode.
There is still a general lack of canonicalization on VP_SETCC as opposed to SETCC which results in worse code than is theoretically possible.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D123051
|
| #
1ad36487 |
| 06-Apr-2022 |
Craig Topper <[email protected]> |
[LegalizeDAG] Use SelectionDAG::getBoolConstant to simplify some code. NFC
|
| #
6be5e875 |
| 30-Mar-2022 |
Fraser Cormack <[email protected]> |
[RISCV][VP] Add basic RVV codegen for vp.icmp
This patch adds the minimum required to successfully lower vp.icmp via the new ISD::VP_SETCC node to RVV instructions.
Regular ISD::SETCC goes through a lot of canonicalization that targets may rely on and that has not yet been ported to VP_SETCC. It also supports expansion of individual condition codes and a non-boolean return type. Support for all of that will follow in later patches.
In the case of RVV this largely isn't a problem as the vector integer comparison instructions are plentiful enough that it can lower all VP_SETCC nodes on legal integer vectors except for boolean vectors, which regular SETCC folds away immediately into logical operations.
Floating-point VP_SETCC operations aren't as well supported in RVV and the backend relies on condition code expansion, so support for those operations will come in later patches.
Portions of this code were taken from the VP reference patches.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D122743
|
| #
09854f2a |
| 22-Feb-2022 |
Matthias Gehre <[email protected]> |
[SelectionDAG] Emit calls to __divei4 and friends for division/remainder of large integers
Emit calls to __divei4 and friends for division/remainder of large integers.
This fixes https://github.com/llvm/llvm-project/issues/44994.
The overall RFC is in https://discourse.llvm.org/t/rfc-add-support-for-division-of-large-bitint-builtins-selectiondag-globalisel-clang/60329
The compiler-rt part is in https://reviews.llvm.org/D120327
Differential Revision: https://reviews.llvm.org/D120329
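To give a feel for what a library routine for wide-integer division has to do, the Python sketch below divides two signed integers stored as little-endian arrays of 32-bit limbs, truncating toward zero as C-style signed division does. It is only a model: the helper names, the limb layout, and the argument order are assumptions for this example and do not reproduce the real `__divei4` signature.

```python
LIMB_BITS = 32
LIMB_MASK = (1 << LIMB_BITS) - 1

def from_limbs(limbs, bits):
    """Decode little-endian 32-bit limbs as a signed `bits`-wide integer."""
    value = 0
    for i, limb in enumerate(limbs):
        value |= (limb & LIMB_MASK) << (LIMB_BITS * i)
    value &= (1 << bits) - 1
    if value >> (bits - 1):          # sign bit set: two's complement negative
        value -= 1 << bits
    return value

def to_limbs(value, bits):
    """Encode a signed integer as little-endian 32-bit limbs, `bits` wide."""
    value &= (1 << bits) - 1
    count = (bits + LIMB_BITS - 1) // LIMB_BITS
    return [(value >> (LIMB_BITS * i)) & LIMB_MASK for i in range(count)]

def sdiv_limbs(a, b, bits):
    """Signed division truncating toward zero (hypothetical helper)."""
    x, y = from_limbs(a, bits), from_limbs(b, bits)
    if y == 0:
        raise ZeroDivisionError("division by zero")
    quotient = abs(x) // abs(y)
    if (x < 0) != (y < 0):
        quotient = -quotient
    return to_limbs(quotient, bits)
```

Python's arbitrary-precision integers hide the limb-by-limb long division that a C implementation would need, so this shows the interface shape and rounding behaviour rather than the algorithm itself.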
|
| #
ed98c1b3 |
| 09-Mar-2022 |
serge-sans-paille <[email protected]> |
Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332
|
| #
28cfa764 |
| 10-Mar-2022 |
Lorenzo Albano <[email protected]> |
[VP] Strided loads/stores
This patch introduces two new experimental IR intrinsics and SDAG nodes to represent vector strided loads and stores.
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D114884
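A strided, predicated load can be modelled in a few lines of Python. This sketch assumes a byte stride, little-endian elements, and that lanes disabled by the mask or at positions beyond the explicit vector length carry no defined value (shown as `None`). The function name and parameter order are hypothetical.

```python
def vp_strided_load(memory: bytes, base: int, stride: int,
                    elem_size: int, mask, evl: int):
    """Load one element per lane from base + lane * stride (in bytes).

    Only lanes with lane < evl and mask[lane] set are read; other lanes
    are returned as None to mark them as undefined.
    """
    result = []
    for lane, active in enumerate(mask):
        if lane < evl and active:
            addr = base + lane * stride
            chunk = memory[addr:addr + elem_size]
            result.append(int.from_bytes(chunk, "little"))
        else:
            result.append(None)
    return result
```

With `memory = bytes(range(32))`, a stride of 8 bytes, and 2-byte elements, lanes 0 and 2 read the bytes at offsets 0 and 16, while a masked-off lane 1 and an out-of-range lane 3 stay undefined.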
|
| #
7d926b71 |
| 02-Feb-2022 |
Simon Moll <[email protected]> |
[VE] LEGALAVL and staged VVP legalization
The new LEGALAVL node annotates that the AVL refers to packs of 64 bits. We use a two-stage lowering approach with LEGALAVL:
First, standard SDNodes are translated into illegal VVP layer nodes. Regardless of source (VP or standard), all VVP nodes have a mask and AVL parameter. The AVL parameter refers to the element position (just as in VP intrinsics).
Second, we legalize the AVL usage in VVP layer nodes. If the element size is < 64bit, the EVL parameter has to be adjusted to refer to packs of 64bits. We wrap the legalized AVL in a LEGALAVL node to track this.
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D118321
|
| #
63b17eb9 |
| 12-Jan-2022 |
Craig Topper <[email protected]> |
[RISCV] Add strictfp support for compares.
This adds support for STRICT_FSETCC(quiet) and STRICT_FSETCCS(signaling).
FEQ matches well to STRICT_FSETCC oeq. FLT/FLE matches well to STRICT_FSETCCS olt/ole.
Others require commuting operands or multiple instructions.
STRICT_FSETCC olt/ole/ogt/oge/ult/ule/ugt/uge uses FLT/FLE, but we need to save/restore FFLAGS around them to avoid spurious exceptions. I've implemented pseudo instructions with a CustomInserter to insert the save/restore CSR instructions. Unfortunately, this doesn't honor exceptions for signaling NaNs, but I'm not sure if signaling NaNs are really supported by the constrained intrinsics.
STRICT_FSETCC one and ueq expand to a pair of FLT instructions with a save/restore of fflags around each. This could be improved in the future.
There may be some opportunities to generate better code for strict comparisons mixed with nonans fast math flags. I've left FIXMEs in the .td files for that.
Co-Authored-by: ShihPo Hung <[email protected]>
Reviewed By: arcbbb
Differential Revision: https://reviews.llvm.org/D116694
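The idea of building `one` and `ueq` out of two less-than comparisons can be checked with a short Python model. It covers only the truth values, not the FFLAGS save/restore, and assumes that `ueq` is obtained by inverting `one`. The function names are hypothetical.

```python
def setcc_one(a: float, b: float) -> bool:
    """Ordered and not equal: true if a < b or b < a.

    Both comparisons are false when either operand is NaN, so the
    result is false for unordered inputs.
    """
    return (a < b) or (b < a)

def setcc_ueq(a: float, b: float) -> bool:
    """Unordered or equal: the logical inverse of `one`."""
    return not setcc_one(a, b)
```

Because a NaN operand makes both less-than tests false, `setcc_one` returns false and `setcc_ueq` returns true for unordered inputs, which matches the meaning of the two condition codes.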
|
| #
52d2f353 |
| 07-Dec-2021 |
Simon Pilgrim <[email protected]> |
[DAG] Update expandFunnelShift/expandROT to return the expansion directly. NFCI.
Don't return a bool to indicate if the expansion was successful, just return the SDValue result directly, like we do for most other basic expansions.
|
| #
15826eb4 |
| 01-Dec-2021 |
Qiu Chaofan <[email protected]> |
[Legalizer] Avoid expansion to BR_CC if illegal
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D110616
|
| #
82bc6a09 |
| 13-Nov-2021 |
Craig Topper <[email protected]> |
[X86] Promote f16 STRICT_FROUND to f32 and call libc.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D113817
|
| #
99d5cbbd |
| 12-Nov-2021 |
Kazu Hirata <[email protected]> |
[CodeGen] Use SDNode::uses (NFC)
|
| #
04c184bb |
| 22-Oct-2021 |
Craig Topper <[email protected]> |
[TargetLowering] Simplify the interface of expandABS. NFC
Instead of returning a bool to indicate success and a separate SDValue, return the SDValue and have the callers check if it is null.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D112331
|
| #
996123e5 |
| 21-Oct-2021 |
Craig Topper <[email protected]> |
[TargetLowering] Simplify the interface for expandCTPOP/expandCTLZ/expandCTTZ.
There is no need to return a bool and have an SDValue output parameter. Just return the SDValue and let the caller check if it is null.
I have another patch to add more callers of these so I thought I'd clean up the interface first.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D112267
|
| #
6678db00 |
| 15-Oct-2021 |
Dávid Bolvanský <[email protected]> |
[X86] Enable promotion of i16 popcnt (PR52056)
Solves https://bugs.llvm.org/show_bug.cgi?id=52056
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D111507
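Promotion of a narrow population count is safe because zero-extending the operand adds no set bits. The Python sketch below models that: a 32-bit bit-twiddling popcount is applied to the zero-extended 16-bit value. It illustrates the principle only and is not the code generated by the X86 backend; both function names are hypothetical.

```python
def ctpop32(x: int) -> int:
    """Classic parallel bit count on a 32-bit value."""
    x &= 0xFFFFFFFF
    x = x - ((x >> 1) & 0x55555555)                  # 2-bit partial sums
    x = (x & 0x33333333) + ((x >> 2) & 0x33333333)   # 4-bit partial sums
    x = (x + (x >> 4)) & 0x0F0F0F0F                  # 8-bit partial sums
    return ((x * 0x01010101) & 0xFFFFFFFF) >> 24     # add the four bytes

def ctpop16_promoted(x: int) -> int:
    """i16 popcount computed as a 32-bit popcount of the zero-extended value."""
    widened = x & 0xFFFF          # zero-extend: upper 16 bits are zero
    return ctpop32(widened) & 0xFFFF
```

The same argument would not hold for a sign-extended operand, since the copied sign bits would be counted.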
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
| #
e7c879a6 |
| 05-Aug-2021 |
Fraser Cormack <[email protected]> |
[RISCV][VP] Add support for VP_REDUCE_* operations
This patch adds codegen support for lowering the vector-predicated reduction intrinsics to RVV instructions. The process is similar to that of the other reduction intrinsics, save for the fact that every VP reduction has a start value. We reuse the existing custom "VL" nodes, adding extra patterns where required to handle non-true masks.
To support these nodes, the `RISCVISD::VECREDUCE_*_VL` nodes have been given an explicit "merge" operand. This is to facilitate the VP reductions, where we must be careful to ensure that even if no operation is performed (when VL=0) we still produce the start value. The RVV reductions don't update the destination register under these conditions, so we tie the splatted start value to the output register.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D107657
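The role of the start value is easiest to see in a scalar model of a vector-predicated add reduction. The Python sketch below is an illustration with a hypothetical function name: only lanes below the explicit vector length whose mask bit is set contribute, and the start value is returned unchanged when nothing is active.

```python
def vp_reduce_add(start: int, vec, mask, evl: int) -> int:
    """Sum the active lanes of `vec` on top of `start`.

    A lane is active when its index is below `evl` and its mask bit is set.
    With evl == 0 or an all-false mask, the result is exactly `start`.
    """
    acc = start
    for lane in range(min(evl, len(vec))):
        if mask[lane]:
            acc += vec[lane]
    return acc
```

This is the property the "merge" operand has to preserve in hardware: with VL=0 the reduction instruction does nothing, so the destination must already hold the start value.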
|
| #
9af8f1b1 |
| 09-Sep-2021 |
Craig Topper <[email protected]> |
[SelectionDAG] Add isZero/isAllOnes methods to ConstantSDNode.
Soft deprecate isNullValue/isAllOnesValue and update in-tree callers. This matches the changes to the APInt interface from D109483.
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D109535
|