|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
77e793d0 |
| 16-Feb-2022 |
Jay Foad <[email protected]> |
[AMDGPU] Return better Changed status from AMDGPUAnnotateUniformValues
Differential Revision: https://reviews.llvm.org/D119943
|
| #
290e5722 |
| 10-Feb-2022 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Improve clobbering checks in the kernel argument promotion
Use same MSSA clobbering checks as in the AMDGPUAnnotateUniformValues. Kernel argument promotion needs exactly the same informatio
[AMDGPU] Improve clobbering checks in the kernel argument promotion
Use same MSSA clobbering checks as in the AMDGPUAnnotateUniformValues. Kernel argument promotion needs exactly the same information so factor out utility function isClobberedInFunction.
Differential Revision: https://reviews.llvm.org/D119480
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc1 |
|
| #
b9cf52bc |
| 03-Feb-2022 |
Jay Foad <[email protected]> |
[AMDGPU] Simplify AMDGPUAnnotateUniformValues::visitLoadInst
Always set uniform metadata on the pointer if it is an instruction, but otherwise do not bother to create a trivial getelementptr instruc
[AMDGPU] Simplify AMDGPUAnnotateUniformValues::visitLoadInst
Always set uniform metadata on the pointer if it is an instruction, but otherwise do not bother to create a trivial getelementptr instruction, because AMDGPUInstrInfo::isUniformMMO can already detect that various non-instruction pointers are uniform.
Most of the test case churn is from tests that used undef as a pointer, which AMDGPUInstrInfo::isUniformMMO treats as uniform.
Differential Revision: https://reviews.llvm.org/D118909
show more ...
|
| #
ddd3807e |
| 02-Feb-2022 |
Jay Foad <[email protected]> |
[AMDGPU] Use new target MMO flag MONoClobber
This allows us to set the noclobber flag on (the MMO of) a load instruction instead of on the pointer. This fixes a bug where noclobber was being applied
[AMDGPU] Use new target MMO flag MONoClobber
This allows us to set the noclobber flag on (the MMO of) a load instruction instead of on the pointer. This fixes a bug where noclobber was being applied to all loads from the same pointer, even if some of them were clobbered.
Differential Revision: https://reviews.llvm.org/D118775
show more ...
|
|
Revision tags: llvmorg-15-init |
|
| #
79606ee8 |
| 31-Jan-2022 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Check atomics aliasing in the clobbering annotation
MemorySSA considers any atomic a def to any operation it dominates just like a barrier or fence. That is correct from memory state perspe
[AMDGPU] Check atomics aliasing in the clobbering annotation
MemorySSA considers any atomic a def to any operation it dominates just like a barrier or fence. That is correct from memory state perspective, but not required for the no-clobber metadata since we are not using it for reordering. Skip such atomics during the scan just like a barrier if it does not alias with the load.
Differential Revision: https://reviews.llvm.org/D118661
show more ...
|
| #
c2b18a3c |
| 28-Jan-2022 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Allow scalar loads after barrier
Currently we cannot convert a vector load into scalar if there is dominating barrier or fence. It is considered a clobbering memory access to prevent memory
[AMDGPU] Allow scalar loads after barrier
Currently we cannot convert a vector load into scalar if there is dominating barrier or fence. It is considered a clobbering memory access to prevent memory operations reordering. While reordering is not possible the actual memory is not being clobbered by a barrier or fence and we can still use a scalar load for a uniform pointer.
The solution is not to bail on a first clobbering access but traverse MemorySSA to the root excluding barriers and fences.
Differential Revision: https://reviews.llvm.org/D118419
show more ...
|
| #
0dcc8b86 |
| 31-Jan-2022 |
Jay Foad <[email protected]> |
[AMDGPU] AMDGPUAnnotateUniformValues: inline a single-use lambda. NFC.
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
2c3afa32 |
| 27-May-2021 |
Arthur Eubanks <[email protected]> |
[OpaquePtr] Clean up some uses of Type::getPointerElementType()
These depend on pointee types.
|
|
Revision tags: llvmorg-12.0.1-rc1 |
|
| #
ab90ae6f |
| 05-May-2021 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Switch AnnotateUniformValues to MemorySSA
This shall speedup compilation and also remove threshold limitations used by memory dependency analysis.
It also seem to fix the bug in the coales
[AMDGPU] Switch AnnotateUniformValues to MemorySSA
This shall speedup compilation and also remove threshold limitations used by memory dependency analysis.
It also seem to fix the bug in the coalescer_remat.ll where an SMRD load was used in presence of a potentially clobbering store.
Fixes: SWDEV-272132
Differential Revision: https://reviews.llvm.org/D101962
show more ...
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
fd142e6c |
| 23-Mar-2021 |
Jay Foad <[email protected]> |
[AMDGPU] Simplify AMDGPUAnnotateUniformValues::visitBranchInst. NFC.
A BranchInst is always the terminator of its containing BasicBlock.
|
| #
5d48b45c |
| 12-Mar-2021 |
Jay Foad <[email protected]> |
[AMDGPU] Use depth first iterator instead of recursive DFS. NFCI.
The reason for this is to avoid deep recursion in DFS() which can cause stack overflow on large CFGs, especially on Windows.
Differ
[AMDGPU] Use depth first iterator instead of recursive DFS. NFCI.
The reason for this is to avoid deep recursion in DFS() which can cause stack overflow on large CFGs, especially on Windows.
Differential Revision: https://reviews.llvm.org/D98528
show more ...
|
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1 |
|
| #
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
| #
cb5b52a0 |
| 05-Jan-2021 |
Changpeng Fang <[email protected]> |
AMDGPU: Annotate amdgpu.noclobber for global loads only
Summary: This is to avoid unnecessary analysis since amdgpu.noclobber is only used for globals.
Reviewers: arsenm
Fixes: SWDEV-239161
AMDGPU: Annotate amdgpu.noclobber for global loads only
Summary: This is to avoid unnecessary analysis since amdgpu.noclobber is only used for globals.
Reviewers: arsenm
Fixes: SWDEV-239161
Differential Revision: https://reviews.llvm.org/D94107
show more ...
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2 |
|
| #
ce0c0013 |
| 15-Dec-2020 |
Changpeng Fang <[email protected]> |
AMDGPU: If a store defines (alias) a load, it clobbers the load.
Summary: If a store defines (must alias) a load, it clobbers the load.
Fixes: SWDEV-258915
Reviewers: arsenm
Differential Revis
AMDGPU: If a store defines (alias) a load, it clobbers the load.
Summary: If a store defines (must alias) a load, it clobbers the load.
Fixes: SWDEV-258915
Reviewers: arsenm
Differential Revision: https://reviews.llvm.org/D92951
show more ...
|
|
Revision tags: llvmorg-11.0.1-rc1 |
|
| #
4df8efce |
| 17-Nov-2020 |
Nikita Popov <[email protected]> |
[AA] Split up LocationSize::unknown()
Currently, we have some confusion in the codebase regarding the meaning of LocationSize::unknown(): Some parts (including most of BasicAA) assume that LocationS
[AA] Split up LocationSize::unknown()
Currently, we have some confusion in the codebase regarding the meaning of LocationSize::unknown(): Some parts (including most of BasicAA) assume that LocationSize::unknown() only allows accesses after the base pointer. Some parts (various callers of AA) assume that LocationSize::unknown() allows accesses both before and after the base pointer (but within the underlying object).
This patch splits up LocationSize::unknown() into LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer() to make this completely unambiguous. I tried my best to determine which one is appropriate for all the existing uses.
The test changes in cs-cs.ll in particular illustrate a previously clearly incorrect AA result: We were effectively assuming that argmemonly functions were only allowed to access their arguments after the passed pointer, but not before it. I'm pretty sure that this was not intentional, and it's certainly not specified by LangRef that way.
Differential Revision: https://reviews.llvm.org/D91649
show more ...
|
| #
393b9e9d |
| 19-Nov-2020 |
Nikita Popov <[email protected]> |
[MemLoc] Require LocationSize argument (NFC)
When constructing a MemoryLocation by hand, require that a LocationSize is explicitly specified. D91649 will split up LocationSize::unknown() into two di
[MemLoc] Require LocationSize argument (NFC)
When constructing a MemoryLocation by hand, require that a LocationSize is explicitly specified. D91649 will split up LocationSize::unknown() into two different states, and callers should make an explicit choice regarding the kind of MemoryLocation they want to have.
show more ...
|
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2 |
|
| #
243376cd |
| 30-Jul-2020 |
Changpeng Fang <[email protected]> |
AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst
Summary: This is in response to the review of https://reviews.llvm.org/D84873: The expensive check should be reorder
AMDGPU: Put inexpensive ops first in AMDGPUAnnotateUniformValues::visitLoadInst
Summary: This is in response to the review of https://reviews.llvm.org/D84873: The expensive check should be reordered last
Reviewers: arsenm
Differential Revision: https://reviews.llvm.org/D84890
show more ...
|
|
Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
| #
85117e28 |
| 30-May-2020 |
Matt Arsenault <[email protected]> |
AMDGPU: Fix not using scalar loads for global reads in shaders
The pass which infers when it's legal to load a global address space as SMRD was only considering amdgpu_kernel, and ignoring the shade
AMDGPU: Fix not using scalar loads for global reads in shaders
The pass which infers when it's legal to load a global address space as SMRD was only considering amdgpu_kernel, and ignoring the shader entry type calling conventions.
show more ...
|
|
Revision tags: llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
| #
05da2fe5 |
| 13-Nov-2019 |
Reid Kleckner <[email protected]> |
Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of reco
Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout: recompiles touches affected_files header 342380 95 3604 llvm/include/llvm/ADT/STLExtras.h 314730 234 1345 llvm/include/llvm/InitializePasses.h 307036 118 2602 llvm/include/llvm/ADT/APInt.h 213049 59 3611 llvm/include/llvm/Support/MathExtras.h 170422 47 3626 llvm/include/llvm/Support/Compiler.h 162225 45 3605 llvm/include/llvm/ADT/Optional.h 158319 63 2513 llvm/include/llvm/ADT/Triple.h 140322 39 3598 llvm/include/llvm/ADT/StringRef.h 137647 59 2333 llvm/include/llvm/Support/Error.h 131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
show more ...
|
|
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1 |
|
| #
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <[email protected]> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the ne
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
show more ...
|
| #
85af701e |
| 18-Jan-2019 |
Matt Arsenault <[email protected]> |
AMDGPU: Remove llvm.SI.load.const
It's taken 3 years, but now all of the old AMDGPU and SI intrinsics are finally gone
llvm-svn: 351586
|
| #
f77e2e84 |
| 07-Jan-2019 |
Rhys Perry <[email protected]> |
AMDGPU: test for uniformity of branch instruction, not its condition
Summary: If a divergent branch instruction is marked as divergent by propagation rule 2 in DivergencePropagator::exploreSyncDepen
AMDGPU: test for uniformity of branch instruction, not its condition
Summary: If a divergent branch instruction is marked as divergent by propagation rule 2 in DivergencePropagator::exploreSyncDependency() and its condition is uniform, that branch would incorrectly be assumed to be uniform.
Reviewers: arsenm, tstellar
Reviewed By: arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D56331
llvm-svn: 350532
show more ...
|
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3 |
|
| #
0da6350d |
| 31-Aug-2018 |
Matt Arsenault <[email protected]> |
AMDGPU: Remove remnants of old address space mapping
llvm-svn: 341165
|
| #
35617ed4 |
| 30-Aug-2018 |
Nicolai Haehnle <[email protected]> |
[NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysis
Summary: This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433).
The purpose of this patch is to free up the
[NFC] Rename the DivergenceAnalysis to LegacyDivergenceAnalysis
Summary: This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433).
The purpose of this patch is to free up the name DivergenceAnalysis for the new generic implementation. The generic implementation class will be shared by specialized divergence analysis classes.
Patch by: Simon Moll
Reviewed By: nhaehnle
Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D50434
Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb llvm-svn: 341071
show more ...
|
|
Revision tags: llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1 |
|
| #
ce34ac58 |
| 12-Jul-2017 |
Matt Arsenault <[email protected]> |
AMDGPU: Fix converting unanalyzable global loads to SMRD
Not all memory dependence queries succeed, so this needs to be conservative if it fails.
llvm-svn: 307861
|