History log of /llvm-project-15.0.7/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp (Results 1 – 25 of 114)
Revision Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5
# e0039b8d 05-Jun-2022 Kazu Hirata <[email protected]>

Use llvm::less_second (NFC)


Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 3ed643ea 10-Mar-2022 Nikita Popov <[email protected]>

[AMDGPUPromoteAlloca] Make compatible with opaque pointers

This mainly changes the handling of bitcasts to not check the types
being casted from/to -- we should only care about the actual
load/store types. The GEP handling is also changed to not care about
types, and just make sure that we get an offset corresponding to
a vector element.

This was a bit of a struggle for me, because this code seems to be
pretty sensitive to small changes. The end result seems to produce
strictly better results for the existing test coverage though,
because we can now deal with more situations involving bitcasts.

Differential Revision: https://reviews.llvm.org/D121371


Revision tags: llvmorg-14.0.0-rc2
# 6527b2a4 18-Feb-2022 Sebastian Neubauer <[email protected]>

[AMDGPU][NFC] Fix typos

Fix some typos in the amdgpu backend.

Differential Revision: https://reviews.llvm.org/D119235


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init
# e188aae4 31-Jan-2022 serge-sans-paille <[email protected]>

Cleanup header dependencies in LLVMCore

Based on the output of include-what-you-use.

This is a big chunk of changes. It is very likely to break downstream code
unless they took a lot of care in avoiding hidden header dependencies, something
the LLVM codebase doesn't do that well :-/

I've tried to summarize the biggest change below:

- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h
- llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h
- llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h
- llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h
- llvm/IR/LegacyPassManager.h no longer includes llvm/Pass.h
- llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h
- llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h

And the usual count of preprocessed lines:
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 6400831
after: 6189948

200k fewer lines to process is not that bad ;-)

Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup

Differential Revision: https://reviews.llvm.org/D118652


Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3
# 15f54dd5 19-Jan-2022 Yaxun (Sam) Liu <[email protected]>

AMDGPU: Account for usage HIP-style dynamic LDS

Disable promote alloca to LDS when HIP-style dynamic LDS since the size
is unknown at compile time.

Patch by: Siu Chi Chan

Reviewed by: Matt Arsenault, Yaxun Liu

Differential Revision: https://reviews.llvm.org/D117494


Revision tags: llvmorg-13.0.1-rc2
# 1172712f 08-Dec-2021 Arthur Eubanks <[email protected]>

[NFC] Replace some deprecated getAlignment() calls with getAlign()

Reviewed By: gchatelet

Differential Revision: https://reviews.llvm.org/D115370


Revision tags: llvmorg-13.0.1-rc1
# d1f45ed5 11-Nov-2021 Neubauer, Sebastian <[email protected]>

[AMDGPU][NFC] Fix typos

Differential Revision: https://reviews.llvm.org/D113672


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4
# cf74ef13 23-Sep-2021 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Limit promote alloca max size in functions

Non-entry functions have 32 caller saved VGPRs available. If we
promote alloca to consume more registers we will have to spill
CSRs. There is no reason to eliminate scratch access to get
another scratch access instead.

Differential Revision: https://reviews.llvm.org/D110372


Revision tags: llvmorg-13.0.0-rc3
# b9b419a1 01-Sep-2021 Arthur Eubanks <[email protected]>

[NFC] Remove redundant code added in 04ce2de3


Revision tags: llvmorg-13.0.0-rc2
# 04ce2de3 13-Aug-2021 Matt Arsenault <[email protected]>

AMDGPU: Remove implicit argument attributes when introducing new calls

In a future patch, a new set of amdgpu-no-* attributes will be
introduced to indicate when a function does not need an implicitly
passed input. This pass introduces new instances of these intrinsic
calls, and should remove the attributes if they were present before.


# 44a3241f 19-Aug-2021 Arthur Eubanks <[email protected]>

[NFC] Replace some attribute methods that use confusing indexes


# 3f4d00bc 18-Aug-2021 Arthur Eubanks <[email protected]>

[NFC] More get/removeAttribute() cleanup


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# 89612938 27-May-2021 Arthur Eubanks <[email protected]>

[OpaquePtr] Create API to make a copy of a PointerType with some address space

Some existing places use getPointerElementType() to create a copy of a
pointer type with some new address space.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D103429


Revision tags: llvmorg-12.0.1-rc1
# 544be708 29-Apr-2021 Christudasan Devadasan <[email protected]>

[AMDGPU] Skip promote-alloca for insertelement/insertvalue users

It is difficult to track the users of vector and aggregate types.

Reviewed by: arsenm

Differential Revision: https://reviews.llvm.org/D101562


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# 13e49dce 15-Mar-2021 Jon Chesterfield <[email protected]>

[amdgpu] Implement lower function LDS pass

Local variables are allocated at kernel launch. This pass collects global
variables that are used from non-kernel functions, moves them into a new struct
type, and allocates an instance of that type in every kernel. Uses are then
replaced with a constantexpr offset.

Prior to this pass, accesses from a function are compiled to trap. With this
pass, most such accesses are removed before reaching codegen. The trap logic
is left unchanged by this pass. It is still reachable for the cases this pass
misses, notably the extern shared construct from hip and variables marked
constant which survive the optimizer.

This is of interest to the openmp project because the deviceRTL runtime library
uses cuda shared variables from functions that cannot be inlined. Trunk llvm
therefore cannot compile some openmp kernels for amdgpu. In addition to the
unit tests attached, this patch applied to ROCm llvm with fixed-abi enabled
and the function pointer hashing scheme deleted passes the openmp suite.

This lowering will use more LDS than strictly necessary. It is intended to be
a functionally correct fallback for cases that are difficult to target from
future optimisation passes.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D94648


Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2
# cb41ee92 10-Feb-2021 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Fix promote alloca with double use in a same insn

If we have an instruction where more than one pointer operands
are derived from the same promoted alloca, we are fixing it for
one argument and do not fix a second use considering this user
done.

Fix this by deferring processing of memory intrinsics until all
potential operands are replaced.

Fixes: SWDEV-271358

Differential Revision: https://reviews.llvm.org/D96386


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# 560d7e04 20-Jan-2021 dfukalov <[email protected]>

[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets

... to reduce headers dependency.

Reviewed By: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D95036


# 352fcfc6 17-Jan-2021 Kazu Hirata <[email protected]>

[llvm] Use llvm::sort (NFC)


Revision tags: llvmorg-11.1.0-rc1
# 6a87e9b0 25-Dec-2020 dfukalov <[email protected]>

[NFC][AMDGPU] Reduce include files dependency.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D93813


# 0e9abcfc 28-Dec-2020 Arthur Eubanks <[email protected]>

[AMDGPU][NewPM] Port amdgpu-promote-alloca(-to-vector)

And add to AMDGPU opt pipeline.

Don't pin an opt run to the legacy PM when -enable-new-pm=1 if these
passes (or passes introduced in https://reviews.llvm.org/D93863) are in
the list of passes.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D93875


Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2
# b0eb40ca 31-Jul-2020 Vitaly Buka <[email protected]>

[NFC] Remove unused GetUnderlyingObject paramenter

Depends on D84617.

Differential Revision: https://reviews.llvm.org/D84621


# 89051eba 31-Jul-2020 Vitaly Buka <[email protected]>

[NFC] GetUnderlyingObject -> getUnderlyingObject

I am going to touch them in the next patch anyway


Revision tags: llvmorg-11.0.0-rc1
# d42c7b22 18-Jul-2020 Matt Arsenault <[email protected]>

AMDGPU: Account for the size of LDS globals used through constant
expressions.

Also "fix" the longstanding bug where the computed size depends on the
order of the visitation. We could try to predict the allocation order
used by legalization, but it would never be 100% perfect. Until we
start fixing the addresses somehow (or have a more reliable allocation
scheme later), just try to compute the size based on the worst case
padding.
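
As a rough illustration of the order-independent estimate described in this message, the sketch below sorts the LDS globals by alignment and pads before each one, so the result no longer depends on visitation order. The function names, the (size, alignment) tuple layout, and the ascending sort are assumptions made for illustration; they are not taken from the commit itself.

```python
def align_to(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def estimate_lds_usage(lds_globals):
    """Padding-heavy, order-independent estimate of LDS bytes in use.

    lds_globals is an iterable of (size_in_bytes, alignment) pairs
    (a hypothetical representation chosen for this sketch).
    """
    total = 0
    # Sorting first makes the estimate independent of the input order.
    for size, alignment in sorted(lds_globals, key=lambda g: (g[1], g[0])):
        total = align_to(total, alignment) + size
    return total
```

Because the input is sorted before accumulation, any permutation of the same globals yields the same estimate, which is the property the message is after.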


# 84704d98 18-Jul-2020 Matt Arsenault <[email protected]>

AMDGPU: Fix not accounting for constantexpr uses of LDS globals

This was failing to add the size of LDS globals that weren't directly
used by an instruction. They could be used by constant expressions
which are transitively used by the function. This requires a better
search, but just abort on this for now for correctness.


Revision tags: llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3
# 54e2dc75 01-Jul-2020 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Limit promote alloca to vector with VGPR budget

Allow only up to 1/4 of available VGPRs for the vectorization
of any given alloca.

Differential Revision: https://reviews.llvm.org/D82990
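
The 1/4 budget can be pictured with a minimal sketch. It assumes each VGPR holds one 32-bit value per lane and that the alloca size is known in bits; the function and constant names are hypothetical and do not come from the pass.

```python
VGPR_BITS = 32  # assumption: one VGPR holds a 32-bit value per lane


def fits_vgpr_budget(alloca_bits: int, max_vgprs: int) -> bool:
    """Return True if the alloca needs at most 1/4 of the available VGPRs."""
    budget_bits = max_vgprs * VGPR_BITS // 4
    return alloca_bits <= budget_bits
```

Under these assumptions, with 256 available VGPRs an alloca of 64 32-bit elements is accepted, while one of 65 elements is left in scratch.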

