|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1 |
|
| #
58f1417e |
| 12-May-2020 |
Carl Ritson <[email protected]> |
[AMDGPU] Order pos exports before param exports
Summary: Modify export clustering DAG mutation to move position exports before other exports types.
Reviewers: foad, arsenm, rampitec, nhaehnle
Revi
[AMDGPU] Order pos exports before param exports
Summary: Modify export clustering DAG mutation to move position exports before other exports types.
Reviewers: foad, arsenm, rampitec, nhaehnle
Reviewed By: foad
Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79670
show more ...
|
| #
5f7ea85e |
| 30-Apr-2020 |
Jay Foad <[email protected]> |
[AMDGPU] Remove unnecessary s_waitcnt between VMEM loads
VMEM loads of the same type (sampler vs no sampler) are guaranteed to write their result registers in order, so there is no need for an s_wai
[AMDGPU] Remove unnecessary s_waitcnt between VMEM loads
VMEM loads of the same type (sampler vs no sampler) are guaranteed to write their result registers in order, so there is no need for an s_waitcnt even if they write to overlapping vgprs.
Differential Revision: https://reviews.llvm.org/D79176
show more ...
|
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1 |
|
| #
20ca49b6 |
| 16-Jan-2020 |
Matt Arsenault <[email protected]> |
AMDGPU: Update tests to use modern buffer intrinsics
|
|
Revision tags: llvmorg-11-init |
|
| #
0412f518 |
| 17-Dec-2019 |
Jay Foad <[email protected]> |
[AMDGPU] Fix typo in SIInstrInfo::memOpsHaveSameBasePtr
Summary: The typo has been present since memOpsHaveSameBasePtr was introduced in r313208.
It caused SIInstrInfo::shouldClusterMemOps to clust
[AMDGPU] Fix typo in SIInstrInfo::memOpsHaveSameBasePtr
Summary: The typo has been present since memOpsHaveSameBasePtr was introduced in r313208.
It caused SIInstrInfo::shouldClusterMemOps to cluster more mem ops than it was supposed to.
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71616
show more ...
|
|
Revision tags: llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1 |
|
| #
1022c0df |
| 19-Jul-2019 |
Matt Arsenault <[email protected]> |
AMDGPU: Decompose all values to 32-bit pieces for calling conventions
This is the more natural lowering, and presents more opportunities to reduce 64-bit ops to 32-bit.
This should also help avoid
AMDGPU: Decompose all values to 32-bit pieces for calling conventions
This is the more natural lowering, and presents more opportunities to reduce 64-bit ops to 32-bit.
This should also help avoid issues graphics shaders have had with 64-bit values, and simplify argument lowering in globalisel.
llvm-svn: 366578
show more ...
|
|
Revision tags: llvmorg-10-init |
|
| #
51a05d72 |
| 12-Jul-2019 |
Matt Arsenault <[email protected]> |
AMDGPU: Drop remnants of byval support for shaders
Before 2018, mesa used to use byval interchangably with inreg, which didn't really make sense. Fix tests still using it to avoid breaking in a futu
AMDGPU: Drop remnants of byval support for shaders
Before 2018, mesa used to use byval interchangably with inreg, which didn't really make sense. Fix tests still using it to avoid breaking in a future commit.
llvm-svn: 365953
show more ...
|
|
Revision tags: llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3 |
|
| #
0124b548 |
| 13-Feb-2018 |
Yaxun Liu <[email protected]> |
[AMDGPU] Change constant addr space to 4
Differential Revision: https://reviews.llvm.org/D43170
llvm-svn: 325030
|
|
Revision tags: llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1 |
|
| #
b600e138 |
| 03-Apr-2017 |
Matt Arsenault <[email protected]> |
AMDGPU: Remove llvm.SI.vs.load.input
llvm-svn: 299391
|
|
Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3 |
|
| #
3ea06336 |
| 22-Feb-2017 |
Matt Arsenault <[email protected]> |
AMDGPU: Remove some uses of llvm.SI.export in tests
Merge some of the old, smaller tests into more complete versions.
llvm-svn: 295792
|
|
Revision tags: llvmorg-4.0.0-rc2 |
|
| #
7aad8fd8 |
| 24-Jan-2017 |
Matt Arsenault <[email protected]> |
Enable FeatureFlatForGlobal on Volcanic Islands
This switches to the workaround that HSA defaults to for the mesa path.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <vedran@mi
Enable FeatureFlatForGlobal on Volcanic Islands
This switches to the workaround that HSA defaults to for the mesa path.
This should be applied to the 4.0 branch.
Patch by Vedran Miletić <[email protected]>
llvm-svn: 292982
show more ...
|
|
Revision tags: llvmorg-4.0.0-rc1 |
|
| #
3336f681 |
| 11-Dec-2016 |
Sanjoy Das <[email protected]> |
[Verifier] Add verification for TBAA metadata
Summary: This change adds some verification in the IR verifier around struct path TBAA metadata.
Other than some basic sanity checks (e.g. we get const
[Verifier] Add verification for TBAA metadata
Summary: This change adds some verification in the IR verifier around struct path TBAA metadata.
Other than some basic sanity checks (e.g. we get constant integers where we expect constant integers), this checks:
- That by the time an struct access tuple `(base-type, offset)` is "reduced" to a scalar base type, the offset is `0`. For instance, in C++ you can't start from, say `("struct-a", 16)`, and end up with `("int", 4)` -- by the time the base type is `"int"`, the offset better be zero. In particular, a variant of this invariant is needed for `llvm::getMostGenericTBAA` to be correct.
- That there are no cycles in a struct path.
- That struct type nodes have their offsets listed in an ascending order.
- That when generating the struct access path, you eventually reach the access type listed in the tbaa tag node.
Reviewers: dexonsmith, chandlerc, reames, mehdi_amini, manmanren
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D26438
llvm-svn: 289402
show more ...
|
|
Revision tags: llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1, llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1, llvmorg-3.8.1, llvmorg-3.8.1-rc1 |
|
| #
df3a20cd |
| 06-Apr-2016 |
Nicolai Haehnle <[email protected]> |
AMDGPU: Add a shader calling convention
This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders.
Patch By: Bas Nieuwenhuizen <bas@basnie
AMDGPU: Add a shader calling convention
This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders.
Patch By: Bas Nieuwenhuizen <[email protected]>
Differential Revision: http://reviews.llvm.org/D18559
llvm-svn: 265589
show more ...
|
|
Revision tags: llvmorg-3.8.0, llvmorg-3.8.0-rc3 |
|
| #
9c47dd58 |
| 11-Feb-2016 |
Matt Arsenault <[email protected]> |
AMDGPU: Remove some old intrinsic uses from tests
llvm-svn: 260493
|
|
Revision tags: llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1 |
|
| #
2aed6ca1 |
| 19-Dec-2015 |
Matt Arsenault <[email protected]> |
AMDGPU: Switch barrier intrinsics to using convergent
noduplicate prevents unrolling of small loops that happen to have barriers in them. If a loop has a barrier in it, it is OK to duplicate it for
AMDGPU: Switch barrier intrinsics to using convergent
noduplicate prevents unrolling of small loops that happen to have barriers in them. If a loop has a barrier in it, it is OK to duplicate it for the unroll.
llvm-svn: 256075
show more ...
|
|
Revision tags: llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1, llvmorg-3.7.0, llvmorg-3.7.0-rc4 |
|
| #
bd8a0856 |
| 21-Aug-2015 |
Tom Stellard <[email protected]> |
AMDGPU/SI: Better handle s_wait insertion
We can wait on either VM, EXP or LGKM. The waits are independent.
Without this patch, a wait inserted because of one of them would also wait for all the pr
AMDGPU/SI: Better handle s_wait insertion
We can wait on either VM, EXP or LGKM. The waits are independent.
Without this patch, a wait inserted because of one of them would also wait for all the previous others. This patch makes s_wait only wait for the ones we need for the next instruction.
Here's an example of subtle perf reduction this patch solves:
This is without the patch:
buffer_load_format_xyzw v[8:11], v0, s[44:47], 0 idxen buffer_load_format_xyzw v[12:15], v0, s[48:51], 0 idxen s_load_dwordx4 s[44:47], s[8:9], 0xc s_waitcnt lgkmcnt(0) buffer_load_format_xyzw v[16:19], v0, s[52:55], 0 idxen s_load_dwordx4 s[48:51], s[8:9], 0x10 s_waitcnt vmcnt(1) buffer_load_format_xyzw v[20:23], v0, s[44:47], 0 idxen
The s_waitcnt vmcnt(1) is useless. The reason it is added is because the last buffer_load_format_xyzw needs s[44:47], which was issued by the first s_load_dwordx4. It waits for all VM before that call to have finished.
Internally after every instruction, 3 counters (for VM, EXP and LGTM) are updated after every instruction. For example buffer_load_format_xyzw will increase the VM counter, and s_load_dwordx4 the LGKM one.
Without the patch, for every defined register, the current 3 counters are stored, and are used to know how long to wait when an instruction needs the register.
Because of that, the s[44:47] counter includes that to use the register you need to wait for the previous buffer_load_format_xyzw.
Instead this patch stores only the counters that matter for the register, and puts zero for the other ones, since we don't need any wait for them.
Patch by: Axel Davy
Differential Revision: http://reviews.llvm.org/D11883
llvm-svn: 245755
show more ...
|
|
Revision tags: llvmorg-3.7.0-rc3, llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1, llvmorg-3.6.2, llvmorg-3.6.2-rc1 |
|
| #
45bb48ea |
| 13-Jun-2015 |
Tom Stellard <[email protected]> |
R600 -> AMDGPU rename
llvm-svn: 239657
|