|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3 |
|
| #
f510045d |
| 14-Jan-2022 |
Jay Foad <[email protected]> |
[CodeGen] Remove unneeded regex escaping in FileCheck patterns. NFC.
Take advantage of D117117 to simplify all {{\[}} to [ and {{\]}} to ].
Differential Revision: https://reviews.llvm.org/D117298
|
| #
0776f6e0 |
| 13-Jan-2022 |
Benjamin Kramer <[email protected]> |
[LSV] Vectorize loads of vectors by turning it into a larger vector
Use shufflevector to do the subvector extracts. This allows a lot more load merging on AMDGPU and also on NVPTX when <2 x half> is
[LSV] Vectorize loads of vectors by turning it into a larger vector
Use shufflevector to do the subvector extracts. This allows a lot more load merging on AMDGPU and also on NVPTX when <2 x half> is involved.
Differential Revision: https://reviews.llvm.org/D117219
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
ed0f4415 |
| 15-Jul-2021 |
alex-t <[email protected]> |
[AMDGPU] Divergence-driven compare operations instruction selection
Description: This change enables the compare operations to be selected to SALU/VALU form dependent of the SDNode dive
[AMDGPU] Divergence-driven compare operations instruction selection
Description: This change enables the compare operations to be selected to SALU/VALU form dependent of the SDNode divergence flag.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D106079
show more ...
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2 |
|
| #
62d8b8a2 |
| 30-Jul-2020 |
Piotr Sobczak <[email protected]> |
Fix 64-bit copy to SCC
Fix 64-bit copy to SCC by restricting the pattern resulting in such a copy to subtargets supporting 64-bit scalar compare, and mapping the copy to S_CMP_LG_U64.
Before introd
Fix 64-bit copy to SCC
Fix 64-bit copy to SCC by restricting the pattern resulting in such a copy to subtargets supporting 64-bit scalar compare, and mapping the copy to S_CMP_LG_U64.
Before introducing the S_CSELECT pattern with explicit SCC (0045786f146e78afee49eee053dc29ebc842fee1), there was no need for handling 64-bit copy to SCC ($scc = COPY sreg_64).
The proposed handling to read only the low bits was however based on a false premise that it is only one bit that matters, while in fact the copy source might be a vector of booleans and all bits need to be considered.
The practical problem of mapping the 64-bit copy to SCC is that the natural instruction to use (S_CMP_LG_U64) is not available on old hardware. Fix it by restricting the problematic pattern to subtargets supporting the instruction (hasScalarCompareEq64).
Differential Revision: https://reviews.llvm.org/D85207
show more ...
|
|
Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4 |
|
| #
0045786f |
| 04-Mar-2020 |
Piotr Sobczak <[email protected]> |
[AMDGPU] Select s_cselect
Summary: Add patterns to select s_cselect in the isel.
Handle more cases of implicit SCC accesses in si-fix-sgpr-copies to allow new patterns to work.
Subscribers: arsenm
[AMDGPU] Select s_cselect
Summary: Add patterns to select s_cselect in the isel.
Handle more cases of implicit SCC accesses in si-fix-sgpr-copies to allow new patterns to work.
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, asbirlea, kerbowa, llvm-commits
Tags: #llvm
Re-commit D81925 with a bugfix D82370.
Differential Revision: https://reviews.llvm.org/D81925 Differential Revision: https://reviews.llvm.org/D82370
show more ...
|
| #
778351df |
| 24-Jun-2020 |
Matt Arsenault <[email protected]> |
Revert "[AMDGPU] Enable compare operations to be selected by divergence"
This reverts commit 521ac0b5cea02f629d035f807460affbb65ae7ad.
Reported to break thousands of piglit tests.
|
| #
521ac0b5 |
| 19-Jun-2020 |
alex-t <[email protected]> |
[AMDGPU] Enable compare operations to be selected by divergence
Summary: Details: This patch enables SETCC to be selected to S_CMP_* if uniform and V_CMP_* if divergent.
Reviewers: rampitec, arsenm
[AMDGPU] Enable compare operations to be selected by divergence
Summary: Details: This patch enables SETCC to be selected to S_CMP_* if uniform and V_CMP_* if divergent.
Reviewers: rampitec, arsenm
Reviewed By: rampitec
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82194
show more ...
|
| #
6d9565d6 |
| 19-Jun-2020 |
Piotr Sobczak <[email protected]> |
Revert "[AMDGPU] Select s_cselect"
This caused some failures detected by the buildbot with expensive checks enabled.
This reverts commit 4067de569f119a81419fbf2e79d5f3307dfdda5b.
|
| #
4067de56 |
| 04-Mar-2020 |
Piotr Sobczak <[email protected]> |
[AMDGPU] Select s_cselect
Summary: Add patterns to select s_cselect in the isel.
Handle more cases of implicit SCC accesses in si-fix-sgpr-copies to allow new patterns to work.
Subscribers: arsenm
[AMDGPU] Select s_cselect
Summary: Add patterns to select s_cselect in the isel.
Handle more cases of implicit SCC accesses in si-fix-sgpr-copies to allow new patterns to work.
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, asbirlea, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81925
show more ...
|
|
Revision tags: llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1 |
|
| #
033f99a2 |
| 22-Mar-2019 |
Tim Renouf <[email protected]> |
[AMDGPU] Added v5i32 and v5f32 register classes
They are not used by anything yet, but a subsequent commit will start using them for image ops that return 5 dwords.
Differential Revision: https://r
[AMDGPU] Added v5i32 and v5f32 register classes
They are not used by anything yet, but a subsequent commit will start using them for image ops that return 5 dwords.
Differential Revision: https://reviews.llvm.org/D58903
Change-Id: I63e1904081e39a6d66e4eb96d51df25ad399d271 llvm-svn: 356735
show more ...
|
| #
361b5b21 |
| 21-Mar-2019 |
Tim Renouf <[email protected]> |
[AMDGPU] Support for v3i32/v3f32
Added support for dwordx3 for most load/store types, but not DS, and not intrinsics yet.
SI (gfx6) does not have dwordx3 instructions, so they are not enabled there
[AMDGPU] Support for v3i32/v3f32
Added support for dwordx3 for most load/store types, but not DS, and not intrinsics yet.
SI (gfx6) does not have dwordx3 instructions, so they are not enabled there.
Some of this patch is from Matt Arsenault, also of AMD.
Differential Revision: https://reviews.llvm.org/D58902
Change-Id: I913ef54f1433a7149da8d72f4af54dbb13436bd9 llvm-svn: 356659
show more ...
|
|
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1 |
|
| #
02dc7e19 |
| 15-Jun-2018 |
Matt Arsenault <[email protected]> |
AMDGPU: Make v4i16/v4f16 legal
Some image loads return these, and it's awkward working around them not being legal.
llvm-svn: 334835
|
|
Revision tags: llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2 |
|
| #
1349a04e |
| 22-May-2018 |
Matt Arsenault <[email protected]> |
AMDGPU: Make v2i16/v2f16 legal on VI
This usually results in better code. Fixes using inline asm with short2, and also fixes having a different ABI for function parameters between VI and gfx9.
Part
AMDGPU: Make v2i16/v2f16 legal on VI
This usually results in better code. Fixes using inline asm with short2, and also fixes having a different ABI for function parameters between VI and gfx9.
Partially cleans up the mess used for lowering of the d16 operations. Making v4f16 legal will help clean this up more, but this requires additional work.
llvm-svn: 332953
show more ...
|
|
Revision tags: llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1 |
|
| #
d4ae470d |
| 19-Sep-2017 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Prevent post-RA scheduler from breaking memory clauses
The pre-RA scheduler does load/store clustering, but post-RA scheduler undoes it. Add mutation to prevent it.
Differential Revision:
[AMDGPU] Prevent post-RA scheduler from breaking memory clauses
The pre-RA scheduler does load/store clustering, but post-RA scheduler undoes it. Add mutation to prevent it.
Differential Revision: https://reviews.llvm.org/D38014
llvm-svn: 313670
show more ...
|
|
Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2 |
|
| #
8728c5f2 |
| 07-Aug-2017 |
Matt Arsenault <[email protected]> |
AMDGPU: Cleanup subtarget features
Try to avoid mutually exclusive features. Don't use a real default GPU, and use a fake "generic". The goal is to make it easier to see which set of features are in
AMDGPU: Cleanup subtarget features
Try to avoid mutually exclusive features. Don't use a real default GPU, and use a fake "generic". The goal is to make it easier to see which set of features are incompatible between feature strings.
Most of the test changes are due to random scheduling changes from not having a default fullspeed model.
llvm-svn: 310258
show more ...
|
|
Revision tags: llvmorg-5.0.0-rc1 |
|
| #
982aee6a |
| 04-Jul-2017 |
Alexander Timofeev <[email protected]> |
[AMDGPU] Switch scalarize global loads ON by default Differential revision: https://reviews.llvm.org/D34407
llvm-svn: 307097
|
| #
e4a74137 |
| 04-Jul-2017 |
NAKAMURA Takumi <[email protected]> |
Revert r307026, "[AMDGPU] Switch scalarize global loads ON by default"
It broke a testcase.
Failing Tests (1): LLVM :: CodeGen/AMDGPU/alignbit-pat.ll
llvm-svn: 307054
|
| #
ea7f08be |
| 03-Jul-2017 |
Alexander Timofeev <[email protected]> |
[AMDGPU] Switch scalarize global loads ON by default
Differential revision: https://reviews.llvm.org/D34407
llvm-svn: 307026
|
| #
a179d25b |
| 27-Jun-2017 |
Sam Kolton <[email protected]> |
[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions
Summary: 1. Instruction V_CVT_U32_F32 allow omod operand (see SIInstrInfo.td:1435). In fact this operand shouldn't be allowed here. This
[AMDGPU] SDWA: several fixes for V_CVT and VOPC instructions
Summary: 1. Instruction V_CVT_U32_F32 allow omod operand (see SIInstrInfo.td:1435). In fact this operand shouldn't be allowed here. This fix checks if SDWA pseudo instruction has OMod operand and then copy it. 2. There were several problems with support of VOPC instructions in SDWA peephole pass.
Reviewers: tstellar, arsenm, vpykhtin, airlied, kzhuravl
Subscribers: wdng, nhaehnle, yaxunl, dstuttard, tpr, sarnex, t-tye
Differential Revision: https://reviews.llvm.org/D34626
llvm-svn: 306413
show more ...
|
|
Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1 |
|
| #
4a48623e |
| 19-Apr-2017 |
Matt Arsenault <[email protected]> |
AMDGPU: Custom lower illegal small select types
Promote them to i32 vectors to avoid unpacking and re-packing the vectors.
llvm-svn: 300754
|
| #
3dbeefa9 |
| 21-Mar-2017 |
Matt Arsenault <[email protected]> |
AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default ca
AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel
Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel.
Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel).
llvm-svn: 298444
show more ...
|
|
Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2 |
|
| #
7aad8fd8 |
| 24-Jan-2017 |
Matt Arsenault <[email protected]> |
Enable FeatureFlatForGlobal on Volcanic Islands
This switches to the workaround that HSA defaults to for the mesa path.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@mi
Enable FeatureFlatForGlobal on Volcanic Islands
This switches to the workaround that HSA defaults to for the mesa path.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <[email protected]>
llvm-svn: 292982
show more ...
|
|
Revision tags: llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1 |
|
| #
0a1a7b6b |
| 17-Nov-2016 |
Konstantin Zhuravlyov <[email protected]> |
Revert "AMDGPU: Enable ConstrainCopy DAG mutation"
This reverts commit r287146.
This breaks few conformance tests.
llvm-svn: 287233
|
| #
3b36bb1d |
| 16-Nov-2016 |
Matt Arsenault <[email protected]> |
AMDGPU: Enable ConstrainCopy DAG mutation
This fixes a probably unintended divergence from the default scheduler behavior.
llvm-svn: 287146
|
| #
5d8eb25e |
| 30-Sep-2016 |
Matt Arsenault <[email protected]> |
AMDGPU: Use unsigned compare for eq/ne
For some reason there are both of these available, except for scalar 64-bit compares which only has u64. I'm not sure why there are both (I'm guessing it's for
AMDGPU: Use unsigned compare for eq/ne
For some reason there are both of these available, except for scalar 64-bit compares which only has u64. I'm not sure why there are both (I'm guessing it's for the one bit inputs we don't use), but for consistency always using the unsigned one.
llvm-svn: 282832
show more ...
|