AMDGPUPromoteAlloca.cpp - OpenGrok history log for /llvm-project-15.0.7/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5
# e0039b8d	05-Jun-2022	Kazu Hirata <[email protected]>	Use llvm::less_second (NFC)
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 3ed643ea	10-Mar-2022	Nikita Popov <[email protected]>	[AMDGPUPromoteAlloca] Make compatible with opaque pointers This mainly changes the handling of bitcasts to not check the types being casted from/to -- we should only care about the actual load/store types. The GEP handling is also changed to not care about types, and just make sure that we get an offset corresponding to a vector element. This was a bit of a struggle for me, because this code seems to be pretty sensitive to small changes. The end result seems to produce strictly better results for the existing test coverage though, because we can now deal with more situations involving bitcasts. Differential Revision: https://reviews.llvm.org/D121371
Revision tags: llvmorg-14.0.0-rc2
# 6527b2a4	18-Feb-2022	Sebastian Neubauer <[email protected]>	[AMDGPU][NFC] Fix typos Fix some typos in the amdgpu backend. Differential Revision: https://reviews.llvm.org/D119235
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init
# e188aae4	31-Jan-2022	serge-sans-paille <[email protected]>	Cleanup header dependencies in LLVMCore Based on the output of include-what-you-use. This is a big chunk of changes. It is very likely to break downstream code unless they took a lot of care in avoiding hidden ehader dependencies, something the LLVM codebase doesn't do that well :-/ I've tried to summarize the biggest change below: - llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h - llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h - llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h - llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h - llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h - llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h - llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h And the usual count of preprocessed lines: $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l before: 6400831 after: 6189948 200k lines less to process is no that bad ;-) Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D118652
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3
# 15f54dd5	19-Jan-2022	Yaxun (Sam) Liu <[email protected]>	AMDGPU: Account for usage HIP-style dynamic LDS Disable promote alloca to LDS when HIP-style dynamic LDS since the size is unknown at compile time. Patch by: Siu Chi Chan Reviewed by: Matt Arsenault, Yaxun Liu Differential Revision: https://reviews.llvm.org/D117494
Revision tags: llvmorg-13.0.1-rc2
# 1172712f	08-Dec-2021	Arthur Eubanks <[email protected]>	[NFC] Replace some deprecated getAlignment() calls with getAlign() Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D115370
Revision tags: llvmorg-13.0.1-rc1
# d1f45ed5	11-Nov-2021	Neubauer, Sebastian <[email protected]>	[AMDGPU][NFC] Fix typos Differential Revision: https://reviews.llvm.org/D113672
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4
# cf74ef13	23-Sep-2021	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Limit promote alloca max size in functions Non-entry functions have 32 caller saved VGPRs available. If we promote alloca to consume more registers we will have to spill CSRs. There is no reason to eliminate scratch access to get another scratch access instead. Differential Revision: https://reviews.llvm.org/D110372
Revision tags: llvmorg-13.0.0-rc3
# b9b419a1	01-Sep-2021	Arthur Eubanks <[email protected]>	[NFC] Remove redundant code added in 04ce2de3
Revision tags: llvmorg-13.0.0-rc2
# 04ce2de3	13-Aug-2021	Matt Arsenault <[email protected]>	AMDGPU: Remove implicit argument attributes when introducing new calls In a future patch, a new set of amdgpu-no-* attributes will be introduced to indicate when a function does not need an implicitly passed input. This pass introduces new instances of these intrinsic calls, and should remove the attributes if they were present before.
# 44a3241f	19-Aug-2021	Arthur Eubanks <[email protected]>	[NFC] Replace some attribute methods that use confusing indexes
# 3f4d00bc	18-Aug-2021	Arthur Eubanks <[email protected]>	[NFC] More get/removeAttribute() cleanup
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# 89612938	27-May-2021	Arthur Eubanks <[email protected]>	[OpaquePtr] Create API to make a copy of a PointerType with some address space Some existing places use getPointerElementType() to create a copy of a pointer type with some new address space. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D103429
Revision tags: llvmorg-12.0.1-rc1
# 544be708	29-Apr-2021	Christudasan Devadasan <[email protected]>	[AMDGPU] Skip promote-alloca for insertelement/insertvalue users It is difficult to track the users of vector and aggregate types. Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D101562
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# 13e49dce	15-Mar-2021	Jon Chesterfield <[email protected]>	[amdgpu] Implement lower function LDS pass Local variables are allocated at kernel launch. This pass collects global variables that are used from non-kernel functions, moves them into a new struct type, and allocates an instance of that type in every kernel. Uses are then replaced with a constantexpr offset. Prior to this pass, accesses from a function are compiled to trap. With this pass, most such accesses are removed before reaching codegen. The trap logic is left unchanged by this pass. It is still reachable for the cases this pass misses, notably the extern shared construct from hip and variables marked constant which survive the optimizer. This is of interest to the openmp project because the deviceRTL runtime library uses cuda shared variables from functions that cannot be inlined. Trunk llvm therefore cannot compile some openmp kernels for amdgpu. In addition to the unit tests attached, this patch applied to ROCm llvm with fixed-abi enabled and the function pointer hashing scheme deleted passes the openmp suite. This lowering will use more LDS than strictly necessary. It is intended to be a functionally correct fallback for cases that are difficult to target from future optimisation passes. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D94648
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2
# cb41ee92	10-Feb-2021	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Fix promote alloca with double use in a same insn If we have an instruction where more than one pointer operands are derived from the same promoted alloca, we are fixing it for one argument and do not fix a second use considering this user done. Fix this by deferring processing of memory intrinsics until all potential operands are replaced. Fixes: SWDEV-271358 Differential Revision: https://reviews.llvm.org/D96386
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# 560d7e04	20-Jan-2021	dfukalov <[email protected]>	[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets ... to reduce headers dependency. Reviewed By: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D95036
# 352fcfc6	17-Jan-2021	Kazu Hirata <[email protected]>	[llvm] Use llvm::sort (NFC)
Revision tags: llvmorg-11.1.0-rc1
# 6a87e9b0	25-Dec-2020	dfukalov <[email protected]>	[NFC][AMDGPU] Reduce include files dependency. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D93813
# 0e9abcfc	28-Dec-2020	Arthur Eubanks <[email protected]>	[AMDGPU][NewPM] Port amdgpu-promote-alloca(-to-vector) And add to AMDGPU opt pipeline. Don't pin an opt run to the legacy PM when -enable-new-pm=1 if these passes (or passes introduced in https://reviews.llvm.org/D93863) are in the list of passes. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D93875
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2
# b0eb40ca	31-Jul-2020	Vitaly Buka <[email protected]>	[NFC] Remove unused GetUnderlyingObject paramenter Depends on D84617. Differential Revision: https://reviews.llvm.org/D84621
# 89051eba	31-Jul-2020	Vitaly Buka <[email protected]>	[NFC] GetUnderlyingObject -> getUnderlyingObject I am going to touch them in the next patch anyway
Revision tags: llvmorg-11.0.0-rc1
# d42c7b22	18-Jul-2020	Matt Arsenault <[email protected]>	AMDGPU: Account for the size of LDS globals used through constant expressions. Also "fix" the longstanding bug where the computed size depends on the order of the visitation. We could try to predict the allocation order used by legalization, but it would never be 100% perfect. Until we start fixing the addresses somehow (or have a more reliable allocation scheme later), just try to compute the size based on the worst case padding.
# 84704d98	18-Jul-2020	Matt Arsenault <[email protected]>	AMDGPU: Fix not accounting for constantexpr uses of LDS globals This was failing to add the size of LDS globals that weren't directly used by an instruction. They could be used by constant expressions which are transitively used by the function. This requires a better search, but just abort on this for now for correctness.
Revision tags: llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3
# 54e2dc75	01-Jul-2020	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Limit promote alloca to vector with VGPR budget Allow only up to 1/4 of available VGPRs for the vectorization of any given alloca. Differential Revision: https://reviews.llvm.org/D82990