|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5 |
|
| #
e0039b8d |
| 05-Jun-2022 |
Kazu Hirata <[email protected]> |
Use llvm::less_second (NFC)
|
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
3ed643ea |
| 10-Mar-2022 |
Nikita Popov <[email protected]> |
[AMDGPUPromoteAlloca] Make compatible with opaque pointers
This mainly changes the handling of bitcasts to not check the types being casted from/to -- we should only care about the actual load/store
[AMDGPUPromoteAlloca] Make compatible with opaque pointers
This mainly changes the handling of bitcasts to not check the types being casted from/to -- we should only care about the actual load/store types. The GEP handling is also changed to not care about types, and just make sure that we get an offset corresponding to a vector element.
This was a bit of a struggle for me, because this code seems to be pretty sensitive to small changes. The end result seems to produce strictly better results for the existing test coverage though, because we can now deal with more situations involving bitcasts.
Differential Revision: https://reviews.llvm.org/D121371
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc2 |
|
| #
6527b2a4 |
| 18-Feb-2022 |
Sebastian Neubauer <[email protected]> |
[AMDGPU][NFC] Fix typos
Fix some typos in the amdgpu backend.
Differential Revision: https://reviews.llvm.org/D119235
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init |
|
| #
e188aae4 |
| 31-Jan-2022 |
serge-sans-paille <[email protected]> |
Cleanup header dependencies in LLVMCore
Based on the output of include-what-you-use.
This is a big chunk of changes. It is very likely to break downstream code unless they took a lot of care in avo
Cleanup header dependencies in LLVMCore
Based on the output of include-what-you-use.
This is a big chunk of changes. It is very likely to break downstream code unless they took a lot of care in avoiding hidden ehader dependencies, something the LLVM codebase doesn't do that well :-/
I've tried to summarize the biggest change below:
- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h - llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h - llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h - llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h - llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h - llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h - llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h
And the usual count of preprocessed lines: $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l before: 6400831 after: 6189948
200k lines less to process is no that bad ;-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D118652
show more ...
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3 |
|
| #
15f54dd5 |
| 19-Jan-2022 |
Yaxun (Sam) Liu <[email protected]> |
AMDGPU: Account for usage HIP-style dynamic LDS
Disable promote alloca to LDS when HIP-style dynamic LDS since the size is unknown at compile time.
Patch by: Siu Chi Chan
Reviewed by: Matt Arsenau
AMDGPU: Account for usage HIP-style dynamic LDS
Disable promote alloca to LDS when HIP-style dynamic LDS since the size is unknown at compile time.
Patch by: Siu Chi Chan
Reviewed by: Matt Arsenault, Yaxun Liu
Differential Revision: https://reviews.llvm.org/D117494
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc2 |
|
| #
1172712f |
| 08-Dec-2021 |
Arthur Eubanks <[email protected]> |
[NFC] Replace some deprecated getAlignment() calls with getAlign()
Reviewed By: gchatelet
Differential Revision: https://reviews.llvm.org/D115370
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
d1f45ed5 |
| 11-Nov-2021 |
Neubauer, Sebastian <[email protected]> |
[AMDGPU][NFC] Fix typos
Differential Revision: https://reviews.llvm.org/D113672
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
cf74ef13 |
| 23-Sep-2021 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Limit promote alloca max size in functions
Non-entry functions have 32 caller saved VGPRs available. If we promote alloca to consume more registers we will have to spill CSRs. There is no r
[AMDGPU] Limit promote alloca max size in functions
Non-entry functions have 32 caller saved VGPRs available. If we promote alloca to consume more registers we will have to spill CSRs. There is no reason to eliminate scratch access to get another scratch access instead.
Differential Revision: https://reviews.llvm.org/D110372
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc3 |
|
| #
b9b419a1 |
| 01-Sep-2021 |
Arthur Eubanks <[email protected]> |
[NFC] Remove redundant code added in 04ce2de3
|
|
Revision tags: llvmorg-13.0.0-rc2 |
|
| #
04ce2de3 |
| 13-Aug-2021 |
Matt Arsenault <[email protected]> |
AMDGPU: Remove implicit argument attributes when introducing new calls
In a future patch, a new set of amdgpu-no-* attributes will be introduced to indicate when a function does not need an implicit
AMDGPU: Remove implicit argument attributes when introducing new calls
In a future patch, a new set of amdgpu-no-* attributes will be introduced to indicate when a function does not need an implicitly passed input. This pass introduces new instances of these intrinsic calls, and should remove the attributes if they were present before.
show more ...
|
| #
44a3241f |
| 19-Aug-2021 |
Arthur Eubanks <[email protected]> |
[NFC] Replace some attribute methods that use confusing indexes
|
| #
3f4d00bc |
| 18-Aug-2021 |
Arthur Eubanks <[email protected]> |
[NFC] More get/removeAttribute() cleanup
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
89612938 |
| 27-May-2021 |
Arthur Eubanks <[email protected]> |
[OpaquePtr] Create API to make a copy of a PointerType with some address space
Some existing places use getPointerElementType() to create a copy of a pointer type with some new address space.
Revie
[OpaquePtr] Create API to make a copy of a PointerType with some address space
Some existing places use getPointerElementType() to create a copy of a pointer type with some new address space.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D103429
show more ...
|
|
Revision tags: llvmorg-12.0.1-rc1 |
|
| #
544be708 |
| 29-Apr-2021 |
Christudasan Devadasan <[email protected]> |
[AMDGPU] Skip promote-alloca for insertelement/insertvalue users
It is difficult to track the users of vector and aggregate types.
Reviewed by: arsenm
Differential Revision: https://reviews.llvm.o
[AMDGPU] Skip promote-alloca for insertelement/insertvalue users
It is difficult to track the users of vector and aggregate types.
Reviewed by: arsenm
Differential Revision: https://reviews.llvm.org/D101562
show more ...
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
13e49dce |
| 15-Mar-2021 |
Jon Chesterfield <[email protected]> |
[amdgpu] Implement lower function LDS pass
[amdgpu] Implement lower function LDS pass
Local variables are allocated at kernel launch. This pass collects global variables that are used from non-kern
[amdgpu] Implement lower function LDS pass
[amdgpu] Implement lower function LDS pass
Local variables are allocated at kernel launch. This pass collects global variables that are used from non-kernel functions, moves them into a new struct type, and allocates an instance of that type in every kernel. Uses are then replaced with a constantexpr offset.
Prior to this pass, accesses from a function are compiled to trap. With this pass, most such accesses are removed before reaching codegen. The trap logic is left unchanged by this pass. It is still reachable for the cases this pass misses, notably the extern shared construct from hip and variables marked constant which survive the optimizer.
This is of interest to the openmp project because the deviceRTL runtime library uses cuda shared variables from functions that cannot be inlined. Trunk llvm therefore cannot compile some openmp kernels for amdgpu. In addition to the unit tests attached, this patch applied to ROCm llvm with fixed-abi enabled and the function pointer hashing scheme deleted passes the openmp suite.
This lowering will use more LDS than strictly necessary. It is intended to be a functionally correct fallback for cases that are difficult to target from future optimisation passes.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D94648
show more ...
|
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2 |
|
| #
cb41ee92 |
| 10-Feb-2021 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Fix promote alloca with double use in a same insn
If we have an instruction where more than one pointer operands are derived from the same promoted alloca, we are fixing it for one argument
[AMDGPU] Fix promote alloca with double use in a same insn
If we have an instruction where more than one pointer operands are derived from the same promoted alloca, we are fixing it for one argument and do not fix a second use considering this user done.
Fix this by deferring processing of memory intrinsics until all potential operands are replaced.
Fixes: SWDEV-271358
Differential Revision: https://reviews.llvm.org/D96386
show more ...
|
|
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
| #
560d7e04 |
| 20-Jan-2021 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets
... to reduce headers dependency.
Reviewed By: rampitec, arsenm
Differential Revision: https://reviews.llvm.org/D95036
|
| #
352fcfc6 |
| 17-Jan-2021 |
Kazu Hirata <[email protected]> |
[llvm] Use llvm::sort (NFC)
|
|
Revision tags: llvmorg-11.1.0-rc1 |
|
| #
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
| #
0e9abcfc |
| 28-Dec-2020 |
Arthur Eubanks <[email protected]> |
[AMDGPU][NewPM] Port amdgpu-promote-alloca(-to-vector)
And add to AMDGPU opt pipeline.
Don't pin an opt run to the legacy PM when -enable-new-pm=1 if these passes (or passes introduced in https://r
[AMDGPU][NewPM] Port amdgpu-promote-alloca(-to-vector)
And add to AMDGPU opt pipeline.
Don't pin an opt run to the legacy PM when -enable-new-pm=1 if these passes (or passes introduced in https://reviews.llvm.org/D93863) are in the list of passes.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D93875
show more ...
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2 |
|
| #
b0eb40ca |
| 31-Jul-2020 |
Vitaly Buka <[email protected]> |
[NFC] Remove unused GetUnderlyingObject paramenter
Depends on D84617.
Differential Revision: https://reviews.llvm.org/D84621
|
| #
89051eba |
| 31-Jul-2020 |
Vitaly Buka <[email protected]> |
[NFC] GetUnderlyingObject -> getUnderlyingObject
I am going to touch them in the next patch anyway
|
|
Revision tags: llvmorg-11.0.0-rc1 |
|
| #
d42c7b22 |
| 18-Jul-2020 |
Matt Arsenault <[email protected]> |
AMDGPU: Account for the size of LDS globals used through constant expressions.
Also "fix" the longstanding bug where the computed size depends on the order of the visitation. We could try to predict
AMDGPU: Account for the size of LDS globals used through constant expressions.
Also "fix" the longstanding bug where the computed size depends on the order of the visitation. We could try to predict the allocation order used by legalization, but it would never be 100% perfect. Until we start fixing the addresses somehow (or have a more reliable allocation scheme later), just try to compute the size based on the worst case padding.
show more ...
|
| #
84704d98 |
| 18-Jul-2020 |
Matt Arsenault <[email protected]> |
AMDGPU: Fix not accounting for constantexpr uses of LDS globals
This was failing to add the size of LDS globals that weren't directly used by an instruction. They could be used by constant expressio
AMDGPU: Fix not accounting for constantexpr uses of LDS globals
This was failing to add the size of LDS globals that weren't directly used by an instruction. They could be used by constant expressions which are transitively used by the function. This requires a better search, but just abort on this for now for correctness.
show more ...
|
|
Revision tags: llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3 |
|
| #
54e2dc75 |
| 01-Jul-2020 |
Stanislav Mekhanoshin <[email protected]> |
[AMDGPU] Limit promote alloca to vector with VGPR budget
Allow only up to 1/4 of available VGPRs for the vectorization of any given alloca.
Differential Revision: https://reviews.llvm.org/D82990
|