|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
2e62a26f |
| 15-Jul-2022 |
Edd Barrett <[email protected]> |
[stackmaps] Legalise patchpoint arguments.
This is similar to D125680, but for llvm.experimental.patchpoint (instead of llvm.experimental.stackmap).
Differential review: https://reviews.llvm.org/D1
[stackmaps] Legalise patchpoint arguments.
This is similar to D125680, but for llvm.experimental.patchpoint (instead of llvm.experimental.stackmap).
Differential review: https://reviews.llvm.org/D129268
show more ...
|
|
Revision tags: llvmorg-14.0.6 |
|
| #
ed8ef65f |
| 17-Jun-2022 |
Edd Barrett <[email protected]> |
[stackmaps] Start legalizing live variable operands
Prior to this change, live variable operands passed to `llvm.experimental.stackmap` would be emitted directly to target nodes, meaning that they d
[stackmaps] Start legalizing live variable operands
Prior to this change, live variable operands passed to `llvm.experimental.stackmap` would be emitted directly to target nodes, meaning that they don't get legalised. The upshot of this is that LLVM may crash when encountering illegally typed target nodes.
e.g. https://github.com/llvm/llvm-project/issues/21657
This change introduces a platform independent stackmap DAG node whose operands are legalised as per usual, thus avoiding aforementioned crashes.
Note that some kinds of argument are still not handled properly, namely vectors, structs, and large integers, like i128s. These will need to be addressed in follow-up changes.
Note also that this does not change the behaviour of `llvm.experimental.patchpoint`. A follow up change will do the same for this intrinsic.
Differential review: https://reviews.llvm.org/D125680
show more ...
|
|
Revision tags: llvmorg-14.0.5 |
|
| #
fb34d531 |
| 03-Jun-2022 |
Benjamin Kramer <[email protected]> |
Promote bf16 to f32 when the target doesn't support it
This is modeled after the half-precision fp support. Two new nodes are introduced for casting from and to bf16. Since casting from bf16 is a si
Promote bf16 to f32 when the target doesn't support it
This is modeled after the half-precision fp support. Two new nodes are introduced for casting from and to bf16. Since casting from bf16 is a simple operation I opted to always directly lower it to integer arithmetic. The other way round is more complicated if you want to preserve IEEE semantics, so it's handled by a new __truncsfbf2 compiler-rt builtin.
This is of course very bare bones, but sufficient to get a semi-softened fadd on x86.
Possible future improvements: - Targets with bf16 conversion instructions can now make fp_to_bf16 legal - The software conversion to bf16 can be replaced by a trivial implementation under fast math.
Differential Revision: https://reviews.llvm.org/D126953
show more ...
|
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
170a9031 |
| 10-Oct-2021 |
Serge Pavlov <[email protected]> |
Intrinsic for checking floating point class
This change introduces a new intrinsic, `llvm.is.fpclass`, which checks if the provided floating-point number belongs to any of the the specified value cl
Intrinsic for checking floating point class
This change introduces a new intrinsic, `llvm.is.fpclass`, which checks if the provided floating-point number belongs to any of the the specified value classes. The intrinsic implements the checks made by C standard library functions `isnan`, `isinf`, `isfinite`, `isnormal`, `issubnormal`, `issignaling` and corresponding IEEE-754 operations.
The primary motivation for this intrinsic is the support of strict FP mode. In this mode using compare instructions or other FP operations is not possible, because if the value is a signaling NaN, floating-point exception `Invalid` is raised, but the aforementioned functions must never raise exceptions.
Currently there are two solutions for this problem, both are implemented partially. One of them is using integer operations to implement the check. It was implemented in https://reviews.llvm.org/D95948 for `isnan`. It solves the problem of exceptions, but offers one solution for all targets, although some can do the check in more efficient way.
The other, implemented in https://reviews.llvm.org/D96568, introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects a target specific code into IR to implement `isnan` and some other functions. It is convenient for targets that have dedicated instruction to determine FP data class. However using target-specific intrinsic complicates analysis and can prevent some optimizations.
A special intrinsic for value class checks allows representing data class tests with enough flexibility. During IR transformations it represents the check in target-independent way and saves it from undesired transformations. In the instruction selector it allows efficient lowering depending on the used target and mode.
This implementation is an extended variant of `llvm.isnan` introduced in https://reviews.llvm.org/D104854. It is limited to minimal intrinsic support. Target-specific treatment will be implemented in separate patches.
Differential Revision: https://reviews.llvm.org/D112025
show more ...
|
| #
62a983eb |
| 06-Apr-2022 |
Daniil Kovalev <[email protected]> |
Revert "[CodeGen] Place SDNode debug ID declaration under appropriate #if"
This reverts commit 83a798d4b0e17ac41d5430f1290d3661343eee1e.
As discussed in D120714 with @thakis, the patch added unneed
Revert "[CodeGen] Place SDNode debug ID declaration under appropriate #if"
This reverts commit 83a798d4b0e17ac41d5430f1290d3661343eee1e.
As discussed in D120714 with @thakis, the patch added unneeded complexity without noticeable benefits.
show more ...
|
| #
83a798d4 |
| 06-Apr-2022 |
Daniil Kovalev <[email protected]> |
[CodeGen] Place SDNode debug ID declaration under appropriate #if
Place PersistentId declaration under #if LLVM_ENABLE_ABI_BREAKING_CHECKS to reduce memory usage when it is not needed.
Differential
[CodeGen] Place SDNode debug ID declaration under appropriate #if
Place PersistentId declaration under #if LLVM_ENABLE_ABI_BREAKING_CHECKS to reduce memory usage when it is not needed.
Differential Revision: https://reviews.llvm.org/D120714
show more ...
|
| #
73f0af10 |
| 22-Mar-2022 |
Craig Topper <[email protected]> |
[SelectionDAG] Add printing support for the Align value of AssertAlign nodes.
Differential Revision: https://reviews.llvm.org/D122262
|
| #
ed98c1b3 |
| 09-Mar-2022 |
serge-sans-paille <[email protected]> |
Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121332
|
| #
4072e362 |
| 11-Feb-2022 |
David Green <[email protected]> |
[ISel] Port AArch64 HADD and RHADD to ISel
This ports the aarch64 combines for HADD and RHADD over to DAG combine, so that they can be used in more architectures (notably MVE in a followup patch). T
[ISel] Port AArch64 HADD and RHADD to ISel
This ports the aarch64 combines for HADD and RHADD over to DAG combine, so that they can be used in more architectures (notably MVE in a followup patch). They are renamed to AVGFLOOR and AVGCEIL in the process, to avoid confusion with instructions such as X86 hadd. The code was also rewritten slightly to remove the AArch64 idiosyncrasies.
The general pattern for a AVGFLOORS is %xe = sext i8 %x to i32 %ye = sext i8 %y to i32 %a = add i32 %xe, %ye %r = lshr i32 %a, 1 %t = trunc i32 %r to i8
An AVGFLOORU is equivalent with zext. Because of the truncate lshr==ashr, as the top bits are not demanded. An AVGCEIL also includes an extra rounding, so includes an extra add of 1.
Differential Revision: https://reviews.llvm.org/D106237
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
| #
3f1f08f0 |
| 02-Sep-2021 |
Roman Lebedev <[email protected]> |
Revert @llvm.isnan intrinsic patchset.
Please refer to https://lists.llvm.org/pipermail/llvm-dev/2021-September/152440.html (and that whole thread.)
TLDR: the original patch had no prior RFC, yet i
Revert @llvm.isnan intrinsic patchset.
Please refer to https://lists.llvm.org/pipermail/llvm-dev/2021-September/152440.html (and that whole thread.)
TLDR: the original patch had no prior RFC, yet it had some changes that really need a proper RFC discussion. It won't be productive to discuss such an RFC, once it's actually posted, while said patch is already committed, because that introduces bias towards already-committed stuff, and the tree is potentially in broken state meanwhile.
While the end result of discussion may lead back to the current design, it may also not lead to the current design.
Therefore i take it upon myself to revert the tree back to last known good state.
This reverts commit 4c4093e6e39fe6601f9c95a95a6bc242ef648cd5. This reverts commit 0a2b1ba33ae6dcaedb81417f7c4cc714f72a5968. This reverts commit d9873711cb03ac7aedcaadcba42f82c66e962e6e. This reverts commit 791006fb8c6fff4f33c33cb513a96b1d3f94c767. This reverts commit c22b64ef66f7518abb6f022fcdfd86d16c764caf. This reverts commit 72ebcd3198327da12804305bda13d9b7088772a8. This reverts commit 5fa6039a5fc1b6392a3c9a3326a76604e0cb1001. This reverts commit 9efda541bfbd145de90f7db38d935db6246dc45a. This reverts commit 94d3ff09cfa8d7aecf480e54da9a5334e262e76b.
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc2 |
|
| #
4c4093e6 |
| 05-Aug-2021 |
Serge Pavlov <[email protected]> |
Introduce intrinsic llvm.isnan
This is recommit of the patch 16ff91ebccda1128c43ff3cee104e2c603569fb2, reverted in 0c28a7c990c5218d6aec47c5052a51cba686ec5e because it had an error in call of getFast
Introduce intrinsic llvm.isnan
This is recommit of the patch 16ff91ebccda1128c43ff3cee104e2c603569fb2, reverted in 0c28a7c990c5218d6aec47c5052a51cba686ec5e because it had an error in call of getFastMathFlags (base type should be FPMathOperator but not Instruction). The original commit message is duplicated below:
Clang has builtin function '__builtin_isnan', which implements C library function 'isnan'. This function now is implemented entirely in clang codegen, which expands the function into set of IR operations. There are three mechanisms by which the expansion can be made.
* The most common mechanism is using an unordered comparison made by instruction 'fcmp uno'. This simple solution is target-independent and works well in most cases. It however is not suitable if floating point exceptions are tracked. Corresponding IEEE 754 operation and C function must never raise FP exception, even if the argument is a signaling NaN. Compare instructions usually does not have such property, they raise 'invalid' exception in such case. So this mechanism is unsuitable when exception behavior is strict. In particular it could result in unexpected trapping if argument is SNaN.
* Another solution was implemented in https://reviews.llvm.org/D95948. It is used in the cases when raising FP exceptions by 'isnan' is not allowed. This solution implements 'isnan' using integer operations. It solves the problem of exceptions, but offers one solution for all targets, however some can do the check in more efficient way.
* Solution implemented by https://reviews.llvm.org/D96568 introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects target specific code into IR. Now only SystemZ implements this hook and it generates a call to target specific intrinsic function.
Although these mechanisms allow to implement 'isnan' with enough efficiency, expanding 'isnan' in clang has drawbacks:
* The operation 'isnan' is hidden behind generic integer operations or target-specific intrinsics. It complicates analysis and can prevent some optimizations.
* IR can be created by tools other than clang, in this case treatment of 'isnan' has to be duplicated in that tool.
Another issue with the current implementation of 'isnan' comes from the use of options '-ffast-math' or '-fno-honor-nans'. If such option is specified, 'fcmp uno' may be optimized to 'false'. It is valid optimization in general, but it results in 'isnan' always returning 'false'. For example, in some libc++ implementations the following code returns 'false':
std::isnan(std::numeric_limits<float>::quiet_NaN())
The options '-ffast-math' and '-fno-honor-nans' imply that FP operation operands are never NaNs. This assumption however should not be applied to the functions that check FP number properties, including 'isnan'. If such function returns expected result instead of actually making checks, it becomes useless in many cases. The option '-ffast-math' is often used for performance critical code, as it can speed up execution by the expense of manual treatment of corner cases. If 'isnan' returns assumed result, a user cannot use it in the manual treatment of NaNs and has to invent replacements, like making the check using integer operations. There is a discussion in https://reviews.llvm.org/D18513#387418, which also expresses the opinion, that limitations imposed by '-ffast-math' should be applied only to 'math' functions but not to 'tests'.
To overcome these drawbacks, this change introduces a new IR intrinsic function 'llvm.isnan', which realizes the check as specified by IEEE-754 and C standards in target-agnostic way. During IR transformations it does not undergo undesirable optimizations. It reaches instruction selection, where is lowered in target-dependent way. The lowering can vary depending on options like '-ffast-math' or '-ffp-model' so the resulting code satisfies requested semantics.
Differential Revision: https://reviews.llvm.org/D104854
show more ...
|
| #
0c28a7c9 |
| 04-Aug-2021 |
Serge Pavlov <[email protected]> |
Revert "Introduce intrinsic llvm.isnan"
This reverts commit 16ff91ebccda1128c43ff3cee104e2c603569fb2. Several errors were reported mainly test-suite execution time. Reverted for investigation.
|
| #
16ff91eb |
| 04-Aug-2021 |
Serge Pavlov <[email protected]> |
Introduce intrinsic llvm.isnan
Clang has builtin function '__builtin_isnan', which implements C library function 'isnan'. This function now is implemented entirely in clang codegen, which expands th
Introduce intrinsic llvm.isnan
Clang has builtin function '__builtin_isnan', which implements C library function 'isnan'. This function now is implemented entirely in clang codegen, which expands the function into set of IR operations. There are three mechanisms by which the expansion can be made.
* The most common mechanism is using an unordered comparison made by instruction 'fcmp uno'. This simple solution is target-independent and works well in most cases. It however is not suitable if floating point exceptions are tracked. Corresponding IEEE 754 operation and C function must never raise FP exception, even if the argument is a signaling NaN. Compare instructions usually does not have such property, they raise 'invalid' exception in such case. So this mechanism is unsuitable when exception behavior is strict. In particular it could result in unexpected trapping if argument is SNaN.
* Another solution was implemented in https://reviews.llvm.org/D95948. It is used in the cases when raising FP exceptions by 'isnan' is not allowed. This solution implements 'isnan' using integer operations. It solves the problem of exceptions, but offers one solution for all targets, however some can do the check in more efficient way.
* Solution implemented by https://reviews.llvm.org/D96568 introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects target specific code into IR. Now only SystemZ implements this hook and it generates a call to target specific intrinsic function.
Although these mechanisms allow to implement 'isnan' with enough efficiency, expanding 'isnan' in clang has drawbacks:
* The operation 'isnan' is hidden behind generic integer operations or target-specific intrinsics. It complicates analysis and can prevent some optimizations.
* IR can be created by tools other than clang, in this case treatment of 'isnan' has to be duplicated in that tool.
Another issue with the current implementation of 'isnan' comes from the use of options '-ffast-math' or '-fno-honor-nans'. If such option is specified, 'fcmp uno' may be optimized to 'false'. It is valid optimization in general, but it results in 'isnan' always returning 'false'. For example, in some libc++ implementations the following code returns 'false':
std::isnan(std::numeric_limits<float>::quiet_NaN())
The options '-ffast-math' and '-fno-honor-nans' imply that FP operation operands are never NaNs. This assumption however should not be applied to the functions that check FP number properties, including 'isnan'. If such function returns expected result instead of actually making checks, it becomes useless in many cases. The option '-ffast-math' is often used for performance critical code, as it can speed up execution by the expense of manual treatment of corner cases. If 'isnan' returns assumed result, a user cannot use it in the manual treatment of NaNs and has to invent replacements, like making the check using integer operations. There is a discussion in https://reviews.llvm.org/D18513#387418, which also expresses the opinion, that limitations imposed by '-ffast-math' should be applied only to 'math' functions but not to 'tests'.
To overcome these drawbacks, this change introduces a new IR intrinsic function 'llvm.isnan', which realizes the check as specified by IEEE-754 and C standards in target-agnostic way. During IR transformations it does not undergo undesirable optimizations. It reaches instruction selection, where is lowered in target-dependent way. The lowering can vary depending on options like '-ffast-math' or '-ffp-model' so the resulting code satisfies requested semantics.
Differential Revision: https://reviews.llvm.org/D104854
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc1 |
|
| #
3c0b596e |
| 30-Jul-2021 |
Simon Pilgrim <[email protected]> |
SelectionDAGDumper.cpp - remove nested if-else return chain. NFCI.
Match style and don't use an else after a return.
|
| #
986841cc |
| 30-Jul-2021 |
Simon Pilgrim <[email protected]> |
SelectionDAGDumper.cpp - printrWithDepthHelper - remove dead code. NFCI.
Fixes coverity warning - we have an early-out for unsigned depth == 0, so the depth < 1 early-out later on is dead code.
|
|
Revision tags: llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4 |
|
| #
2887f146 |
| 26-Jun-2021 |
David Green <[email protected]> |
[ISel] Port AArch64 SABD and UABD to DAGCombine
This ports the AArch64 SABD and USBD over to DAG Combine, where they can be used by more backends (notably MVE in a follow-up patch). The matching cod
[ISel] Port AArch64 SABD and UABD to DAGCombine
This ports the AArch64 SABD and USBD over to DAG Combine, where they can be used by more backends (notably MVE in a follow-up patch). The matching code has changed very little, just to handle legal operations and types differently. It selects from (ABS (SUB (EXTEND a), (EXTEND b))), producing a ubds/abdu which is zexted to the original type.
Differential Revision: https://reviews.llvm.org/D91937
show more ...
|
|
Revision tags: llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
bb8ce25e |
| 14-Jun-2021 |
Jeroen Dobbelaere <[email protected]> |
Intrinsic::getName: require a Module argument
Ensure that we provide a `Module` when checking if a rename of an intrinsic is necessary.
This fixes the issue that was detected by https://bugs.chromi
Intrinsic::getName: require a Module argument
Ensure that we provide a `Module` when checking if a rename of an intrinsic is necessary.
This fixes the issue that was detected by https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=32288 (as mentioned by @fhahn), after committing D91250.
Note that the `LLVMIntrinsicCopyOverloadedName` is being deprecated in favor of `LLVMIntrinsicCopyOverloadedName2`.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D99173
show more ...
|
|
Revision tags: llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
| #
748ae528 |
| 08-Feb-2021 |
David Sherwood <[email protected]> |
[IR][SVE] Add new llvm.experimental.stepvector intrinsic
This patch adds a new llvm.experimental.stepvector intrinsic, which takes no arguments and returns a linear integer sequence of values of the
[IR][SVE] Add new llvm.experimental.stepvector intrinsic
This patch adds a new llvm.experimental.stepvector intrinsic, which takes no arguments and returns a linear integer sequence of values of the form <0, 1, ...>. It is primarily intended for scalable vectors, although it will work for fixed width vectors too. It is intended that later patches will make use of this new intrinsic when vectorising induction variables, currently only supported for fixed width. I've added a new CreateStepVector method to the IRBuilder, which will generate a call to this intrinsic for scalable vectors and fall back on creating a ConstantVector for fixed width.
For scalable vectors this intrinsic is lowered to a new ISD node called STEP_VECTOR, which takes a single constant integer argument as the step. During lowering this argument is set to a value of 1. The reason for this additional argument at the codegen level is because in future patches we will introduce various generic DAG combines such as
mul step_vector(1), 2 -> step_vector(2) add step_vector(1), step_vector(1) -> step_vector(2) shl step_vector(1), 1 -> step_vector(2) etc.
that encourage a canonical format for all targets. This hopefully means all other targets supporting scalable vectors can benefit from this too.
I've added cost model tests for both fixed width and scalable vectors:
llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll
as well as codegen lowering tests for fixed width and scalable vectors:
llvm/test/CodeGen/AArch64/neon-stepvector.ll llvm/test/CodeGen/AArch64/sve-stepvector.ll
See this thread for discussion of the intrinsic: https://lists.llvm.org/pipermail/llvm-dev/2021-January/147943.html
show more ...
|
| #
9106d045 |
| 10-Mar-2021 |
Craig Topper <[email protected]> |
[RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32.
On riscv32, i64 isn't a legal scalar type but we would l
[RISCV][SelectionDAG] Introduce an ISD::SPLAT_VECTOR_PARTS node that can represent a splat of 2 i32 values into a nxvXi64 vector for riscv32.
On riscv32, i64 isn't a legal scalar type but we would like to support scalable vectors of i64.
This patch introduces a new node that can represent a splat made of multiple scalar values. I've used this new node to solve the current crashes we experience when getConstant is used after type legalization.
For RISCV, we are now default expanding SPLAT_VECTOR to SPLAT_VECTOR_PARTS when needed and then handling the SPLAT_VECTOR_PARTS later during LegalizeOps. I've remove the special case I previously put in for ABS for D97991 as the default expansion is now able to succesfully use getConstant.
Reviewed By: frasercrmck
Differential Revision: https://reviews.llvm.org/D98004
show more ...
|
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1 |
|
| #
2750f3ed |
| 08-Jan-2021 |
Cullen Rhodes <[email protected]> |
[IR] Introduce llvm.experimental.vector.splice intrinsic
This patch introduces a new intrinsic @llvm.experimental.vector.splice that constructs a vector of the same type as the two input vectors, ba
[IR] Introduce llvm.experimental.vector.splice intrinsic
This patch introduces a new intrinsic @llvm.experimental.vector.splice that constructs a vector of the same type as the two input vectors, based on a immediate where the sign of the immediate distinguishes two variants. A positive immediate specifies an index into the first vector and a negative immediate specifies the number of trailing elements to extract from the first vector.
For example:
@llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1) ==> <B, C, D, E> ; index @llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing element count
These intrinsics support both fixed and scalable vectors, where the former is lowered to a shufflevector to maintain existing behaviour, although while marked as experimental the recommended way to express this operation for fixed-width vectors is to use shufflevector. For scalable vectors where it is not possible to express a shufflevector mask for this operation, a new ISD node has been implemented.
This is one of the named shufflevector intrinsics proposed on the mailing-list in the RFC at [1].
Patch by Paul Walker and Cullen Rhodes.
[1] https://lists.llvm.org/pipermail/llvm-dev/2020-November/146864.html
Reviewed By: sdesmalen
Differential Revision: https://reviews.llvm.org/D94708
show more ...
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4 |
|
| #
9525af7b |
| 28-Sep-2020 |
gbtozers <[email protected]> |
[DebugInfo] Support representation of multiple location operands in SDDbgValue
This patch modifies the class that represents debug values during ISel, SDDbgValue, to support multiple location operan
[DebugInfo] Support representation of multiple location operands in SDDbgValue
This patch modifies the class that represents debug values during ISel, SDDbgValue, to support multiple location operands (to represent a dbg.value that uses a DIArgList). Part of this class's functionality has been split off into a new class, SDDbgOperand.
The new class SDDbgOperand represents a single value, corresponding to an SSA value or MachineOperand in the IR and MIR respectively. Members of SDDbgValue that were previously related to that specific value (as opposed to the variable or DIExpression), such as the Kind enum, have been moved to SDDbgOperand. SDDbgValue now contains an array of SDDbgOperand instead, allowing it to hold more than one of these values.
All changes outside SDDbgValue are simply updates to use the new interface.
Differential Revision: https://reviews.llvm.org/D88585
show more ...
|
| #
0b417ba2 |
| 21-Feb-2021 |
Kazu Hirata <[email protected]> |
[CodeGen] Use range-based for loops (NFC)
|
| #
2d728bbf |
| 15-Jan-2021 |
Caroline Concatto <[email protected]> |
[CodeGen][SelectionDAG]Add new intrinsic experimental.vector.reverse
This patch adds a new intrinsic experimental.vector.reduce that takes a single vector and returns a vector of matching type but
[CodeGen][SelectionDAG]Add new intrinsic experimental.vector.reverse
This patch adds a new intrinsic experimental.vector.reduce that takes a single vector and returns a vector of matching type but with the original lane order reversed. For example:
``` vector.reverse(<A,B,C,D>) ==> <D,C,B,A> ```
The new intrinsic supports fixed and scalable vectors types. The fixed-width vector relies on shufflevector to maintain existing behaviour. Scalable vector uses the new ISD node - VECTOR_REVERSE.
This new intrinsic is one of the named shufflevector intrinsics proposed on the mailing-list in the RFC at [1].
Patch by Paul Walker (@paulwalker-arm).
[1] https://lists.llvm.org/pipermail/llvm-dev/2020-November/146864.html
Differential Revision: https://reviews.llvm.org/D94883
show more ...
|
|
Revision tags: llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2 |
|
| #
bf416d16 |
| 03-Feb-2020 |
Serge Pavlov <[email protected]> |
[FPEnv] Intrinsic for setting rounding mode
To set non-default rounding mode user usually calls function 'fesetround' from standard C library. This way has some disadvantages.
* It creates unnecess
[FPEnv] Intrinsic for setting rounding mode
To set non-default rounding mode user usually calls function 'fesetround' from standard C library. This way has some disadvantages.
* It creates unnecessary dependency on libc. On the other hand, setting rounding mode requires few instructions and could be made by compiler. Sometimes standard C library even is not available, like in the case of GPU or AI cores that execute small kernels. * Compiler could generate more effective code if it knows that a particular call just sets rounding mode.
This change introduces new IR intrinsic, namely 'llvm.set.rounding', which sets current rounding mode, similar to 'fesetround'. It however differs from the latter, because it is a lower level facility:
* 'llvm.set.rounding' does not return any value, whereas 'fesetround' returns non-zero value in the case of failure. In glibc 'fesetround' reports failure if its argument is invalid or unsupported or if floating point operations are unavailable on the hardware. Compiler usually knows what core it generates code for and it can validate arguments in many cases. * Rounding mode is specified in 'fesetround' using constants like 'FE_TONEAREST', which are target dependent. It is inconvenient to work with such constants at IR level.
C standard provides a target-independent way to specify rounding mode, it is used in FLT_ROUNDS, however it does not define standard way to set rounding mode using this encoding.
This change implements only IR intrinsic. Lowering it to machine code is target-specific and will be implemented latter. Mapping of 'fesetround' to 'llvm.set.rounding' is also not implemented here.
Differential Revision: https://reviews.llvm.org/D74729
show more ...
|
| #
a89d751f |
| 17-Dec-2020 |
Bjorn Pettersson <[email protected]> |
Add intrinsics for saturating float to int casts
This patch adds support for the fptoui.sat and fptosi.sat intrinsics, which provide basically the same functionality as the existing fptoui and fptos
Add intrinsics for saturating float to int casts
This patch adds support for the fptoui.sat and fptosi.sat intrinsics, which provide basically the same functionality as the existing fptoui and fptosi instructions, but will saturate (or return 0 for NaN) on values unrepresentable in the target type, instead of returning poison. Related mailing list discussion can be found at: https://groups.google.com/d/msg/llvm-dev/cgDFaBmCnDQ/CZAIMj4IBAAJ
The intrinsics have overloaded source and result type and support vector operands:
i32 @llvm.fptoui.sat.i32.f32(float %f) i100 @llvm.fptoui.sat.i100.f64(double %f) <4 x i32> @llvm.fptoui.sat.v4i32.v4f16(half %f) // etc
On the SelectionDAG layer two new ISD opcodes are added, FP_TO_UINT_SAT and FP_TO_SINT_SAT. These opcodes have two operands and one result. The second operand is an integer constant specifying the scalar saturation width. The idea here is that initially the second operand and the scalar width of the result type are the same, but they may change during type legalization. For example:
i19 @llvm.fptsi.sat.i19.f32(float %f) // builds i19 fp_to_sint_sat f, 19 // type legalizes (through integer result promotion) i32 fp_to_sint_sat f, 19
I went for this approach, because saturated conversion does not compose well. There is no good way of "adjusting" a saturating conversion to i32 into one to i19 short of saturating twice. Specifying the saturation width separately allows directly saturating to the correct width.
There are two baseline expansions for the fp_to_xint_sat opcodes. If the integer bounds can be exactly represented in the float type and fminnum/fmaxnum are legal, we can expand to something like:
f = fmaxnum f, FP(MIN) f = fminnum f, FP(MAX) i = fptoxi f i = select f uo f, 0, i # unnecessary if unsigned as 0 = MIN
If the bounds cannot be exactly represented, we expand to something like this instead:
i = fptoxi f i = select f ult FP(MIN), MIN, i i = select f ogt FP(MAX), MAX, i i = select f uo f, 0, i # unnecessary if unsigned as 0 = MIN
It should be noted that this expansion assumes a non-trapping fptoxi.
Initial tests are for AArch64, x86_64 and ARM. This exercises all of the scalar and vector legalization. ARM is included to test float softening.
Original patch by @nikic and @ebevhan (based on D54696).
Differential Revision: https://reviews.llvm.org/D54749
show more ...
|