Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 3a205977 19-Jul-2022 Jon Chesterfield <[email protected]>

[amdgpu] Implement lds kernel id intrinsic

Implement an intrinsic for use lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS varia

[amdgpu] Implement lds kernel id intrinsic

Implement an intrinsic for use lowering LDS variables to different
addresses from different kernels. This will allow kernels that cannot
reach an LDS variable to avoid wasting space for it.

There are a number of implicit arguments accessed by intrinsic already
so this implementation closely follows the existing handling. It is slightly
novel in that this SGPR is written by the kernel prologue.

It is necessary in the general case to put variables at different addresses
such that they can be compactly allocated and thus necessary for an
indirect function call to have some means of determining where a
given variable was allocated. Claiming an arbitrary SGPR into which
an integer can be written by the kernel, in this implementation based
on metadata associated with that kernel, which is then passed on to
indirect call sites is sufficient to determine the variable address.

The intent is to emit a __const array of LDS addresses and index into it.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D125060

show more ...


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2
# 8edaf259 12-Apr-2022 Changpeng Fang <[email protected]>

AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally

Summary:
Introduce a new function attribute, amdgpu-no-multigrid-sync-arg, which is default.
We use implicitarg_ptr + offset t

AMDGPU: Emit metadata for the hidden_multigrid_sync_arg conditionally

Summary:
Introduce a new function attribute, amdgpu-no-multigrid-sync-arg, which is default.
We use implicitarg_ptr + offset to check whether the multigrid synchronization
pointer is used. If yes, we remove this attribute and also remove
amdgpu-no-implicitarg-ptr. We generate metadata for the hidden_multigrid_sync_arg
only when the amdgpu-no-multigrid-sync-arg attribute is removed from the function.

Reviewers: arsenm, sameerds, b-sumner and foad

Differential Revision: https://reviews.llvm.org/D123548

show more ...


Revision tags: llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# ca62b1db 25-Feb-2022 Changpeng Fang <[email protected]>

[AMDGPU][NFC]: Emit metadata for hidden_heap_v1 kernarg

Summary:
Emit metadata for hidden_heap_v1 kernarg

Reviewers:
sameerds, b-sumner

Fixes:
SWDEV-307188

Differential Revision:
https://

[AMDGPU][NFC]: Emit metadata for hidden_heap_v1 kernarg

Summary:
Emit metadata for hidden_heap_v1 kernarg

Reviewers:
sameerds, b-sumner

Fixes:
SWDEV-307188

Differential Revision:
https://reviews.llvm.org/D119027

show more ...


# d8f99bb6 11-Feb-2022 Sameer Sahasrabuddhe <[email protected]>

[AMDGPU] replace hostcall module flag with function attribute

The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
r

[AMDGPU] replace hostcall module flag with function attribute

The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.

If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.

The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implictarg_ptr does not result in a load of any byte in the hostcall
pointer argument.

Reviewed By: jdoerfert, arsenm, kpyzhov

Differential Revision: https://reviews.llvm.org/D119216

show more ...


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 0eebe2e3 02-Dec-2021 Matt Arsenault <[email protected]>

AMDGPU: Sanitized functions require implicit arguments

Do not infer no-amdgpu-implicitarg-ptr for sanitized functions. If a
function is explicitly marked amdgpu-no-implicitarg-ptr and
sanitize_addre

AMDGPU: Sanitized functions require implicit arguments

Do not infer no-amdgpu-implicitarg-ptr for sanitized functions. If a
function is explicitly marked amdgpu-no-implicitarg-ptr and
sanitize_address, infer that it is required.

show more ...


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# db4963d0 14-Aug-2021 Matt Arsenault <[email protected]>

AMDGPU: Use attributor to propagate uniform-work-group-size

Drop the legacy version in AMDGPUAnnotateKernelFeatures. This has the
side effect of now respecting the linkage, and not changing external

AMDGPU: Use attributor to propagate uniform-work-group-size

Drop the legacy version in AMDGPUAnnotateKernelFeatures. This has the
side effect of now respecting the linkage, and not changing externally
visible functions.

show more ...


# 722b8e0e 14-Aug-2021 Matt Arsenault <[email protected]>

AMDGPU: Invert ABI attribute handling

Previously we assumed all callable functions did not need any
implicitly passed inputs, and added attributes to functions to
indicate when they were necessary.

AMDGPU: Invert ABI attribute handling

Previously we assumed all callable functions did not need any
implicitly passed inputs, and added attributes to functions to
indicate when they were necessary. Requiring attributes for
correctness is pretty ugly, and it makes supporting indirect and
external calls more complicated.

This inverts the direction of the attributes, so an undecorated
function is assumed to need all implicit imputs. This enables
AMDGPUAttributor by default to mark when functions are proven to not
need a given input. This strips the equivalent functionality from the
legacy AMDGPUAnnotateKernelFeatures pass.

However, AMDGPUAnnotateKernelFeatures is not fully removed at this
point although it should be in the future. It is still necessary for
the two hacky amdgpu-calls and amdgpu-stack-objects attributes, which
would be better served by a trivial analysis on the IR during
selection. Additionally, AMDGPUAnnotateKernelFeatures still
redundantly handles the uniform-work-group-size attribute to be
removed in a future commit.

At this point when not using -amdgpu-fixed-function-abi, we are still
modifying the ABI based on these newly negated attributes. In the
future, this option will be removed and the locations for implicit
inputs will always be fixed. We will then use the new attributes to
avoid passing the values when unnecessary.

show more ...


# 088cc636 11-Aug-2021 Matt Arsenault <[email protected]>

AMDGPU: Invert AMDGPUAttributor

Switch to using BitIntegerState for each of the inputs, and invert
their meanings.

This now diverges more from the old AMDGPUAnnotateKernelFeatures, but
this isn't u

AMDGPU: Invert AMDGPUAttributor

Switch to using BitIntegerState for each of the inputs, and invert
their meanings.

This now diverges more from the old AMDGPUAnnotateKernelFeatures, but
this isn't used yet anyway.

show more ...


# 46d82e73 13-Aug-2021 Matt Arsenault <[email protected]>

AMDGPU: Restrict attributor transforms

We only really want this to add the custom attributes. Theoretically
the regular transforms were already run at this point. Touching
undefined behavior breaks

AMDGPU: Restrict attributor transforms

We only really want this to add the custom attributes. Theoretically
the regular transforms were already run at this point. Touching
undefined behavior breaks a lot of tests when this is enabled by
default, many of which are expecting to test handling of undef
operations.

show more ...


# cf32d61a 11-Aug-2021 Matt Arsenault <[email protected]>

AMDGPU: Remove hacky attribute deduction from AMDGPUAttributor

amdgpu-calls and amdgpu-stack-objects don't really belong as
attributes, and are currently a hacky way of passing an analysis into
the

AMDGPU: Remove hacky attribute deduction from AMDGPUAttributor

amdgpu-calls and amdgpu-stack-objects don't really belong as
attributes, and are currently a hacky way of passing an analysis into
the DAG. These don't really belong in the IR, and don't really fit in
with the other attributes. Remove these to facilitate inverting the
pass.

I don't exactly understand the indirect call test changes. These tests
are using calls which are trivially replacable with a direct call, so
I'm not sure what the point is.

show more ...


# 98d7aa43 14-Aug-2021 Matt Arsenault <[email protected]>

AMDGPU: Stop inferring use of llvm.amdgcn.kernarg.segment.ptr

We no longer use this intrinsic outside of the backend and no longer
support using it outside of kernels.


# a77ae4aa 12-Aug-2021 Matt Arsenault <[email protected]>

AMDGPU: Stop attributor adding attributes to intrinsic declarations


# 152ceec1 11-Aug-2021 Matt Arsenault <[email protected]>

AMDGPU: Add indirect and extern calls to attributor test


# a420f80b 11-Aug-2021 Johannes Doerfert <[email protected]>

[Attributor] Do not delete volatile stores to null/undef

See D106309.

Differential Revision: https://reviews.llvm.org/D107906


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init
# cdb4cfe8 27-Jul-2021 Johannes Doerfert <[email protected]>

[Attributor][FIX] Update AMDGPU attributor test

The test contains UB and should be improved, for now we update the check
lines pass it.


Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4
# 96709823 27-Jun-2021 Kuter Dinel <[email protected]>

[AMDGPU] Deduce attributes with the Attributor

This patch introduces a pass that uses the Attributor to deduce AMDGPU specific attributes.

Reviewed By: jdoerfert, arsenm

Differential Revision: htt

[AMDGPU] Deduce attributes with the Attributor

This patch introduces a pass that uses the Attributor to deduce AMDGPU specific attributes.

Reviewed By: jdoerfert, arsenm

Differential Revision: https://reviews.llvm.org/D104997

show more ...


# a7749c3f 13-Jul-2021 Kuter Dinel <[email protected]>

[AMDGPU] Use update_test_checks.py script for annotate kernel features tests.

This patch makes the annotate kernel features tests use the update_tests_checks.py
script. Which makes it easy to update

[AMDGPU] Use update_test_checks.py script for annotate kernel features tests.

This patch makes the annotate kernel features tests use the update_tests_checks.py
script. Which makes it easy to update the tests.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D105864

show more ...


Revision tags: llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# 6a4d9cb7 16-Apr-2021 madhur13490 <[email protected]>

[AMDGPU] Remove error check for indirect calls and add missing queue-ptr

This patch removes -fixed-abi check for indirect calls
and also adds queue-ptr which is required for indirect calls to work.

[AMDGPU] Remove error check for indirect calls and add missing queue-ptr

This patch removes -fixed-abi check for indirect calls
and also adds queue-ptr which is required for indirect calls to work.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D100633

show more ...


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# 5682ae2f 25-Mar-2021 madhur13490 <[email protected]>

[AMDGPU] Set implicit arg attributes for indirect calls

This patch adds attributes corresponding to
implicits to functions/kernels if
1. it has an indirect call OR
2. it's address is taken.

Once su

[AMDGPU] Set implicit arg attributes for indirect calls

This patch adds attributes corresponding to
implicits to functions/kernels if
1. it has an indirect call OR
2. it's address is taken.

Once such attributes are set, rest of the codegen would work
out-of-box for indirect calls. This patch eliminates
the potential overhead -fixed-abi imposes even though indirect functions
calls are not used.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D99347

show more ...


Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# f0abefaf 28-May-2020 Matt Arsenault <[email protected]>

AMDGPU: Add IntrWillReturn to intrinsic definitions

This should probably be implied for all the speculatable ones. I think
the only ones where this plausibly doesn't apply is s_sendmsghalt and
maybe

AMDGPU: Add IntrWillReturn to intrinsic definitions

This should probably be implied for all the speculatable ones. I think
the only ones where this plausibly doesn't apply is s_sendmsghalt and
maybe kill.

show more ...


Revision tags: llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0
# 6bfe28e9 11-Sep-2018 Matt Arsenault <[email protected]>

AMDGPU: Fix annotate kernel features through casted calls

I thought I was testing this before, but the workitem id x
case isn't great since it's mandatory in the parent kernel.


# bb862209 12-Mar-2020 Matt Arsenault <[email protected]>

AMDGPU: Don't handle kernarg.segment.ptr in functions

Just lower this to null. Pass implicitarg.ptr in its place in the
argument list.


# ccc6e780 11-Mar-2020 Matt Arsenault <[email protected]>

AMDGPU: Directly annotate functions if they have calls

Currently we infer whether the flat-scratch-init kernel input should
be enabled based on calls. Move this handling, so we can decide if the
ful

AMDGPU: Directly annotate functions if they have calls

Currently we infer whether the flat-scratch-init kernel input should
be enabled based on calls. Move this handling, so we can decide if the
full set of ABI inputs is needed in kernels. Ideally we would have an
analysis of some sort, rather than the function attributes.

show more ...


# c56d2afc 07-Mar-2019 Aakanksha Patil <[email protected]>

AMDGPU: Handle "uniform-work-group-size" attribute (fix for RADV)

A previous patch for "uniform-work-group-size" attribute was found to break
some RADV and possibly radeon SI tests and had to be ret

AMDGPU: Handle "uniform-work-group-size" attribute (fix for RADV)

A previous patch for "uniform-work-group-size" attribute was found to break
some RADV and possibly radeon SI tests and had to be retracted.
This patch fixes that.

Differential Revision: http://reviews.llvm.org/D58993

llvm-svn: 355574

show more ...


# bc568766 13-Dec-2018 Aakanksha Patil <[email protected]>

Revert r348971: [AMDGPU] Support for "uniform-work-group-size" attribute

This patch breaks RADV (and probably RadeonSI as well)

llvm-svn: 349084


12