|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
9fa425c1 |
| 13-Jul-2022 |
Abinav Puthan Purayil |
[AMDGPU] Set amdgpu-memory-bound if a basic block has dense global memory access
AMDGPUPerfHintAnalysis doesn't set the memory bound attribute if FuncInfo::InstCost outweighs MemInstCost even if we have a basic block with relatively high global memory access. GCNSchedStrategy could revert optimal scheduling in favour of occupancy which seems to degrade performance for some kernels. This change introduces the HasDenseGlobalMemAcc metric in the heuristic that makes the analysis more conservative in these cases.
This fixes SWDEV-334259/SWDEV-343932
Differential Revision: https://reviews.llvm.org/D129759
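The effect described in this commit message can be illustrated with a minimal sketch. It is not the code from the patch: the `BlockStats` struct, the `isMemoryBound` function, and both percentage thresholds are hypothetical, and the exact criterion the patch uses for a "dense" block is not reproduced here. The sketch only shows how a per-block flag can mark a function as memory bound even when the function-wide ratio of memory cost to instruction cost stays below the threshold.

```cpp
#include <cassert>
#include <vector>

// Hypothetical per-block counters; not the data structures used by the pass.
struct BlockStats {
  unsigned InstCost;          // cost of all instructions in the block
  unsigned GlobalMemInstCost; // cost of the global loads/stores in the block
};

// Illustrative thresholds in percent; assumptions, not the upstream values.
constexpr unsigned FuncMemBoundPct = 50;
constexpr unsigned BlockDensePct = 40;

// True if Part makes up at least Pct percent of a non-zero Total.
inline bool atLeastPercent(unsigned Part, unsigned Total, unsigned Pct) {
  return Total != 0 && Part * 100 >= Total * Pct;
}

// A function is treated as memory bound if either the function-wide ratio
// crosses its threshold or any single block is dense in global accesses.
bool isMemoryBound(const std::vector<BlockStats> &Blocks) {
  unsigned InstCost = 0, MemInstCost = 0;
  bool HasDenseGlobalMemAcc = false;
  for (const BlockStats &B : Blocks) {
    assert(B.GlobalMemInstCost <= B.InstCost && "memory cost is part of total");
    InstCost += B.InstCost;
    MemInstCost += B.GlobalMemInstCost;
    HasDenseGlobalMemAcc |=
        atLeastPercent(B.GlobalMemInstCost, B.InstCost, BlockDensePct);
  }
  return HasDenseGlobalMemAcc ||
         atLeastPercent(MemInstCost, InstCost, FuncMemBoundPct);
}
```

With these assumed thresholds, a function made of one large compute block and one small block dominated by global accesses is reported as memory bound, although its overall memory share is low.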
|
| #
deac0ac5 |
| 16-Jul-2022 |
Kazu Hirata |
[AMDGPU] Use default member initialization (NFC)
Identified with modernize-use-default-member-init.
|
|
Revision tags: llvmorg-14.0.6 |
|
| #
6181c192 |
| 16-Jun-2022 |
LiaoChunyu |
[AMDGPU][NFC] Remove isConstantAddr
Fix: isConstantAddr was defined but not used.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D127959
|
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
c246b7bd |
| 31-Mar-2022 |
Jay Foad |
[AMDGPU] Only count global-to-global as indirect accesses
Previously any load (global, local or constant) feeding into a global load or store would be counted as an indirect access. This patch only counts global loads feeding into a global load or store. The rationale is that the latency for global loads is generally much larger than the other kinds.
As a side effect, this makes it easier to write small kernel test cases that are not counted as having indirect accesses, despite the fact that arguments to the kernel are accessed with an SMEM load.
Differential Revision: https://reviews.llvm.org/D122804
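The before/after rule in this commit message can be sketched as two predicates. This is a simplified model, not the pass itself: `AddrSpace`, `MemAccess`, and both function names are hypothetical, and the real analysis works on LLVM IR values rather than on a struct like this.

```cpp
#include <optional>

enum class AddrSpace { Global, Local, Constant };

// Hypothetical model of a load or store: the address space it accesses and,
// if its address was produced by another load, that load's address space.
struct MemAccess {
  AddrSpace Space;
  std::optional<AddrSpace> AddressLoadedFrom;
};

// Previous rule: any load feeding a global access counts as indirect.
bool isIndirectOld(const MemAccess &A) {
  return A.Space == AddrSpace::Global && A.AddressLoadedFrom.has_value();
}

// Rule after this change: only a global load feeding a global access counts.
bool isIndirectNew(const MemAccess &A) {
  return A.Space == AddrSpace::Global &&
         A.AddressLoadedFrom == AddrSpace::Global;
}
```

Under this model, a global access whose pointer comes from a constant-address-space load (as with kernel arguments) is indirect under the old rule only.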
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
1822a5ec |
| 16-Feb-2022 |
Jay Foad |
[AMDGPU] Return better Changed status from AMDGPUPerfHintAnalysis
Differential Revision: https://reviews.llvm.org/D119944
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
220815a9 |
| 13-Dec-2021 |
Nikita Popov |
[AMDGPUPerfHintAnalysis] Avoid getPointerElementType()
Extract the load/store type from the instruction rather than fetching it from the pointer element type.
|
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
a397c1c8 |
| 08-Jul-2021 |
Stanislav Mekhanoshin |
[AMDGPU] Tune perfhint analysis to account for access width
A function with fewer memory instructions but wider accesses is the same as a function with more but narrower accesses in terms of memory boundness. In fact, without this change the pass would give different answers before and after vectorization.
Differential Revision: https://reviews.llvm.org/D105651
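The idea of weighting each memory instruction by its width can be sketched as follows. The 32-bit unit and the helper names `accessCost` and `memInstCost` are assumptions made for illustration; the actual weighting in the pass may differ.

```cpp
#include <cassert>
#include <vector>

// Cost of one memory access, counted in 32-bit units and rounded up.
// The 32-bit unit is an assumption made for this sketch.
unsigned accessCost(unsigned Bits) {
  assert(Bits > 0 && "an access must have a non-zero size");
  return (Bits + 31) / 32;
}

// Total memory cost of a function, given the bit width of each access.
unsigned memInstCost(const std::vector<unsigned> &AccessBits) {
  unsigned Cost = 0;
  for (unsigned Bits : AccessBits)
    Cost += accessCost(Bits);
  return Cost;
}
```

With this weighting, four 32-bit loads and a single vectorized 128-bit load contribute the same memory cost, so the result no longer depends on whether vectorization has already run.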
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
99142003 |
| 06-Jun-2021 |
Nikita Popov |
[CodeGen] Add missing includes (NFC)
These currently rely on the IRBuilder.h include in TargetLowering.h. Make them explicit.
|
|
Revision tags: llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1 |
|
| #
6a87e9b0 |
| 25-Dec-2020 |
dfukalov |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
| #
71bed820 |
| 24-May-2020 |
Simon Pilgrim |
AMDGPU.h - reduce TargetMachine.h include. NFC.
Replace TargetMachine.h include with forward declaration and CodeGen.h include in AMDGPU.h.
Exposes a couple of implicit dependencies that require additional forward declarations/includes.
|
|
Revision tags: llvmorg-10.0.1-rc1 |
|
| #
447e2c30 |
| 14-Apr-2020 |
Mircea Trofin |
[llvm][NFC][CallSite] Remove Implementation uses of CallSite
Reviewers: dblaikie, davidxl, craig.topper
Subscribers: arsenm, dschuff, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78142
|
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4 |
|
| #
5e9610a3 |
| 05-Jul-2019 |
Matt Arsenault |
AMDGPU: Fix assert in clang test
llvm-svn: 365245
|
| #
e7e23e3e |
| 05-Jul-2019 |
Matt Arsenault |
AMDGPU: Make AMDGPUPerfHintAnalysis an SCC pass
Add a string attribute instead of directly setting MachineFunctionInfo. This avoids trying to get the analysis in the MachineFunctionInfo in a way that doesn't work with the new pass manager.
This will also avoid re-visiting the call graph for every single function.
llvm-svn: 365241
|
|
Revision tags: llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1 |
|
| #
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
|
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3 |
|
| #
0da6350d |
| 31-Aug-2018 |
Matt Arsenault |
AMDGPU: Remove remnants of old address space mapping
llvm-svn: 341165
|
|
Revision tags: llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3 |
|
| #
7ba3fc73 |
| 11-Jun-2018 |
Stanislav Mekhanoshin |
[AMDGPU] Do not consider indirect access through phi for wave limiter
Rationale: an indirect access is usually an issue because the load is not ready by the time of its use. However, if the use is inside a loop and the load is outside it, that is potentially an issue for the first iteration only.
Differential Revision: https://reviews.llvm.org/D47740
llvm-svn: 334420
|
|
Revision tags: llvmorg-6.0.1-rc2 |
|
| #
7fc1cee0 |
| 25-May-2018 |
Stanislav Mekhanoshin |
[AMDGPU] Fixed test failure with AMDGPUPerfHint
We must not keep an iterator to a map while the map is modified; this leads to a broken map.
llvm-svn: 333298
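The hazard named in this commit message can be shown with a small, self-contained sketch. It uses `std::unordered_map`, whose iterators are invalidated when an insertion triggers a rehash; the map type used by the pass is not stated in the message. The `Func`, `Program`, and `visit` names are hypothetical, and the call graph is assumed to be acyclic. The safe pattern is to accumulate into a local variable and look the entry up again after any call that may insert.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical function description: its own memory cost and its callees.
struct Func {
  unsigned OwnMemCost;
  std::vector<std::string> Callees;
};
using Program = std::unordered_map<std::string, Func>;
using InfoMap = std::unordered_map<std::string, unsigned>;

// Computes the memory cost of Name including its callees, caching results in
// Infos. Assumes an acyclic call graph and that every callee exists in P.
unsigned visit(const std::string &Name, const Program &P, InfoMap &Infos) {
  if (auto It = Infos.find(Name); It != Infos.end())
    return It->second;

  const Func &F = P.at(Name);
  // Accumulate into a local. Holding an iterator into Infos across the
  // recursive calls would be unsafe, because they may insert and rehash.
  unsigned Cost = F.OwnMemCost;
  for (const std::string &Callee : F.Callees)
    Cost += visit(Callee, P, Infos);

  Infos[Name] = Cost; // fresh lookup after all insertions are done
  return Cost;
}
```

The unsafe variant would create the entry for `Name` first, keep the iterator, and update it inside the loop while the recursive calls insert further entries.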
|
| #
1c538423 |
| 25-May-2018 |
Stanislav Mekhanoshin |
[AMDGPU] Add perf hints to functions
This is an adoption of the HSAIL perfhint pass. Two types of hints are produced:
1. Function is memory bound.
2. Kernel can use wave limiter.
Currently these hints are used in the scheduler. If a function is suspected to be memory bound, we allow occupancy to decrease to 4 waves in the course of scheduling.
Differential Revision: https://reviews.llvm.org/D46992
llvm-svn: 333289
|