|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
3ccd88f2 |
| 27-Jul-2022 |
Jon Chesterfield <[email protected]> |
[amdgpu][nfc] Separate processUsedLDS into independent pieces, rename it
|
| #
9981afdd |
| 27-Jul-2022 |
Jon Chesterfield <[email protected]> |
[amdgpu][nfc] Extract kernel annotation from processUsedLDS
|
| #
923b90bd |
| 26-Jul-2022 |
Jon Chesterfield <[email protected]> |
[amdgpu][nfc] Separate LDS struct creation from RAUW
|
| #
26dcc7e6 |
| 26-Jul-2022 |
Jon Chesterfield <[email protected]> |
[amdgpu][nfc] Skip operations on padding fields in LDS struct
|
| #
2224bbcd |
| 19-Jul-2022 |
Jon Chesterfield <[email protected]> |
[nfc][amdgpu] LDS. Move selection logic up the stack.
|
| #
eda2bcad |
| 15-Jul-2022 |
Jon Chesterfield <[email protected]> |
[nfc][amdgpu] Remove dead variable and function
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
| #
bc78c099 |
| 04-May-2022 |
Jon Chesterfield <[email protected]> |
[amdgpu] Elide module lds allocation in kernels with no callees
Introduces a string attribute, amdgpu-requires-module-lds, to allow eliding the module.lds block from kernels. Will allocate the block as before if the attribute is missing or has its default value of true.
Patch uses the new attribute to detect the simplest possible instance of this, where a kernel makes no calls and thus cannot call any functions that use LDS.
Tests updated to match; coverage was already good. The interesting case is in lower-module-lds-offsets, where annotating the kernel allows the backend to pick a different (in this case better) variable ordering than previously. A later patch will avoid moving kernel variables into module.lds when the kernel can have this attribute, allowing optimal ordering and locally unused variable elimination.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D122091
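The decision described in this commit can be modeled in a few lines. This is a toy Python sketch, not the LLVM implementation: `Kernel`, `annotate`, and `needs_module_lds` are hypothetical names introduced for illustration. Only the attribute name `amdgpu-requires-module-lds` and its default of true come from the commit message.

```python
from dataclasses import dataclass, field

ATTR = "amdgpu-requires-module-lds"  # attribute name taken from the commit message


@dataclass
class Kernel:
    """Hypothetical stand-in for a kernel function; not an LLVM class."""
    name: str
    callees: list = field(default_factory=list)
    attrs: dict = field(default_factory=dict)


def annotate(kernel: Kernel) -> None:
    """Simplest case from the commit: a kernel that makes no calls
    cannot reach any function that uses LDS, so mark it as not needing the block."""
    if not kernel.callees:
        kernel.attrs[ATTR] = "false"


def needs_module_lds(kernel: Kernel) -> bool:
    """A missing attribute means the default (true): allocate as before."""
    return kernel.attrs.get(ATTR, "true") != "false"
```

In this model a kernel with no callees ends up annotated and skips the allocation, while a kernel that calls anything, or one that was never annotated, keeps the module.lds block.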
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
6527b2a4 |
| 18-Feb-2022 |
Sebastian Neubauer <[email protected]> |
[AMDGPU][NFC] Fix typos
Fix some typos in the amdgpu backend.
Differential Revision: https://reviews.llvm.org/D119235
|
| #
c7eb8463 |
| 11-Feb-2022 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Merge AMDGPULDSUtils into AMDGPUMemoryUtils
Differential Revision: https://reviews.llvm.org/D119502
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
24b28db8 |
| 12-Dec-2021 |
Jon Chesterfield <[email protected]> |
[amdgpu] Increase alignment of all LDS variables
Currently the superalign option only increases the alignment of variables that are moved into the module.lds block. Change that to all LDS variables. Also only increase the alignment once, instead of once per function.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D115488
|
| #
d395befa |
| 11-Dec-2021 |
Kazu Hirata <[email protected]> |
[llvm] Use range-based for loops (NFC)
|
| #
86caf517 |
| 11-Dec-2021 |
Jon Chesterfield <[email protected]> |
Revert "[amdgpu][nfc] Delete dead code in LowerModuleLDS"
This reverts commit 7b9ab06d10a6a989f76e6c5ecf89d906f838fe7d. Said code is better removed as part of a larger change.
|
| #
7b9ab06d |
| 10-Dec-2021 |
Jon Chesterfield <[email protected]> |
[amdgpu][nfc] Delete dead code in LowerModuleLDS
|
| #
04b2f6ea |
| 09-Dec-2021 |
Jon Chesterfield <[email protected]> |
[amdgpu][nfc] Drop dead PtrSet, fix a comment
|
| #
f0e3b39a |
| 08-Dec-2021 |
Jon Chesterfield <[email protected]> |
[amdgpu][nfc] Move non-shared code out of LDSUtils
|
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
cbdf624b |
| 19-Sep-2021 |
Brendon Cahoon <[email protected]> |
[AMDGPU] Correctly merge alias.scope and noalias metadata for memops
When adding alias.scope and noalias metadata to a memcpy function, the alias.scope and noalias metadata from the operands are merged. The rule for merging alias.scope is to take the intersection of the domains and the union of the scopes within those domains. The rule for merging noalias is to take the intersection.
The bug is that AMDGPULowerModuleLDS was using concatenation for both alias.scope and noalias. For example, suppose f1 and f2 are added to the LDS structure and there is a memcpy(f2, f1, sizeof(f1)). Concatenation then creates noalias metadata for the memcpy that includes both {f1, f2}. That means the memcpy is assumed not to alias a prior load of f2, which enables the optimizer to remove a load of f2 that occurs after the memcpy.
The function MDNode::getMostGenericAliasScope defines the semantics for alias.scope. There is a function, combineMetadata in Local.cpp, that uses intersection for noalias.
Differential Revision: https://reviews.llvm.org/D110049
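The two merge rules stated in this commit message can be sketched with plain sets. This is an illustrative Python model, not LLVM code: scopes are modeled as (domain, scope) string pairs, and the function names and the per-field tags in the usage note are assumptions made for the example.

```python
def merge_alias_scope(a: set, b: set) -> set:
    """alias.scope rule from the commit message: keep only domains present
    in both operands, then take the union of the scopes within those domains.
    Each element is a (domain, scope) pair."""
    shared_domains = {d for d, _ in a} & {d for d, _ in b}
    return {(d, s) for d, s in a | b if d in shared_domains}


def merge_noalias(a: set, b: set) -> set:
    """noalias rule from the commit message: plain intersection."""
    return a & b


def concat_noalias(a: set, b: set) -> set:
    """The buggy behaviour described above: concatenation, i.e. a union."""
    return a | b
```

Assuming the access to f1 carries noalias {f2} and the access to f2 carries noalias {f1}, `merge_noalias({"f2"}, {"f1"})` is empty, so the memcpy claims nothing, whereas `concat_noalias` yields both fields and wrongly asserts that the memcpy does not alias f2.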
|
| #
dc6e8dfd |
| 20-Sep-2021 |
Jacob Lambert <[email protected]> |
[AMDGPU][NFC] Correct typos in lib/Target/AMDGPU/AMDGPU*.cpp files. Test commit for new contributor.
|
|
Revision tags: llvmorg-13.0.0-rc3 |
|
| #
ce51c5d4 |
| 26-Aug-2021 |
Matt Arsenault <[email protected]> |
AMDGPU: Fix crashing on kernel declarations when lowering LDS
This was trying to insert the used marker into a declaration.
|
|
Revision tags: llvmorg-13.0.0-rc2 |
|
| #
8d7d89b0 |
| 17-Aug-2021 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Add alias.scope metadata to lowered LDS struct
Without it, alias analysis cannot disambiguate accesses to the structure fields the way it can for distinct variables. As a result, ds_read and ds_write operations cannot be combined when any store sits in between, because that store is always considered clobbering.
Differential Revision: https://reviews.llvm.org/D108315
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
9dc26366 |
| 16-Jul-2021 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Disable LDS lowering for GFX shaders
Apparently these need external LDS symbols to remain.
Fixes: SC1-3279
Differential Revision: https://reviews.llvm.org/D106288
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3 |
|
| #
d274d64e |
| 23-Jun-2021 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Check for pointer operand while refining LDS align
Also skips the propagation if alignment is 1.
Differential Revision: https://reviews.llvm.org/D104796
|
|
Revision tags: llvmorg-12.0.1-rc2 |
|
| #
2b43209e |
| 15-Jun-2021 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Propagate LDS align into instructions
Differential Revision: https://reviews.llvm.org/D104316
|
| #
d797a7f8 |
| 15-Jun-2021 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Use performOptimizedStructLayout for LDS sort
This gives better packing.
Differential Revision: https://reviews.llvm.org/D104331
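Why field order affects packing can be shown with a small calculation. The Python sketch below only illustrates the padding effect. Sorting by descending alignment is a simple heuristic chosen for the example and is not claimed to be the algorithm inside performOptimizedStructLayout. The field sizes are made up.

```python
def align_to(offset: int, align: int) -> int:
    """Round offset up to the next multiple of align."""
    return (offset + align - 1) // align * align


def struct_size(fields: list) -> int:
    """fields: (size, align) pairs in layout order.
    Returns the end offset of the last field (tail padding ignored)."""
    offset = 0
    for size, align in fields:
        offset = align_to(offset, align) + size
    return offset


# Hypothetical LDS variables as (size, align) pairs.
fields = [(1, 1), (8, 8), (2, 2), (4, 4)]
by_align = sorted(fields, key=lambda f: f[1], reverse=True)

print(struct_size(fields))    # 24: padding is inserted before the 8-byte and 4-byte fields
print(struct_size(by_align))  # 15: no padding is needed in this order
```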
|
| #
80fd5fa5 |
| 21-Jun-2021 |
hsmahesha <[email protected]> |
[AMDGPU] Replace non-kernel function uses of LDS globals by pointers.
The main motivation for replacing LDS uses within non-kernel functions by pointers is to *avoid* having the subsequent LDS lowering pass pack LDS (assume large LDS) directly into a struct type, which would otherwise allocate a huge amount of memory for the struct instance within every kernel.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D103225
|
| #
f6632f11 |
| 10-Jun-2021 |
hsmahesha <[email protected]> |
[AMDGPU] Fix missing lowering of LDS used in global scope.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D103431
|