History log of /llvm-project-15.0.7/llvm/test/CodeGen/AMDGPU/GlobalISel/function-returns.ll (Results 1 – 19 of 19)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 8d0383eb 24-Jun-2022 Matt Arsenault <[email protected]>

CodeGen: Remove AliasAnalysis from regalloc

This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is

CodeGen: Remove AliasAnalysis from regalloc

This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.

Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.

Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.

show more ...


Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 04fff547 07-Mar-2022 Venkata Ramanaiah Nalamothu <[email protected]>

[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range

Currently the return address ABI registers s[30:31], which fall in the call
clobbered register range, are added a

[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range

Currently the return address ABI registers s[30:31], which fall in the call
clobbered register range, are added as a live-in on the function entry to
preserve its value when we have calls so that it gets saved and restored
around the calls.

But the DWARF unwind information (CFI) needs to track where the return address
resides in a frame and the above approach makes it difficult to track the
return address when the CFI information is emitted during the frame lowering,
due to the involvment of understanding the control flow.

This patch moves the return address ABI registers s[30:31] into callee saved
registers range and stops adding live-in for return address registers, so that
the CFI machinery will know where the return address resides when CSR
save/restore happen during the frame lowering.

And doing the above poses an issue that now the return instruction uses undefined
register `sgpr30_sgpr31`. This is resolved by hiding the return address register
use by the return instruction through the `SI_RETURN` pseudo instruction, which
doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the
`S_SETPC_B64_return` during the `expandPostRAPseudo()`.

As an added benefit, this patch simplifies overall return instruction handling.

Note: The AMDGPU CFI changes are there only in the downstream code and another
version of this patch will be posted for review for the downstream code.

Reviewed By: arsenm, ronlieb

Differential Revision: https://reviews.llvm.org/D114652

show more ...


Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 29f88b93 23-Dec-2021 Petar Avramovic <[email protected]>

[GlobalISel] Rework more/fewer elements for vectors

Artifact combiner is not able to access individual elements after using
LCMTy style merge/unmerge, extract and insert to change vector number of
e

[GlobalISel] Rework more/fewer elements for vectors

Artifact combiner is not able to access individual elements after using
LCMTy style merge/unmerge, extract and insert to change vector number of
elements (pad with undef or split to sub-vector instructions).
Use unmerge to individual elements instead and then merge elements into
requested types.
Change argument lowering for vectors and moreElementsVector to use
buildPadVectorWithUndefElements and buildDeleteTrailingVectorElements.
FewerElementsVector had a few helpers that had different behavior,
introduce new helper for most of the opcodes.
FewerElementsVector helper is more flexible since it can create leftover
instruction smaller then requested type (useful in case target wants to
avoid pad with undef and use fewer registers). If target does not want
leftover of different type it should call more elements first.
Some helpers were performing more elements first to have split without
leftover. Opcodes that used this helper use clampMaxNumElementsStrict
(does more elements first) in LegalizerInfo to avoid test changes.
Fixes failures caused by failing to combine artifacts created during
more/fewer elements vector.

Differential Revision: https://reviews.llvm.org/D114198

show more ...


# d2863088 23-Dec-2021 Petar Avramovic <[email protected]>

GlobalISel: Regen vector mir tests, add tests for vector arg lowering

Precommit for D114198 (Rework more/fewer elements for vectors).
Regenerate auto-generated mir tests for vectors (use CHECK-NEXT

GlobalISel: Regen vector mir tests, add tests for vector arg lowering

Precommit for D114198 (Rework more/fewer elements for vectors).
Regenerate auto-generated mir tests for vectors (use CHECK-NEXT instead
of CHECK). Remove -global-isel-abort=0 where it is no longer needed.
Add mir tests for different AMDGPU sub-targets and they way they lower
function vector arguments (tests for legalization artifact combiner).

show more ...


# 09b53296 22-Dec-2021 Ron Lieberman <[email protected]>

Revert "[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range"

This reverts commit 9075009d1fd5f2bf9aa6c2f362d2993691a316b3.

Failed amdgpu runtime buildbot # 3514


# 9075009d 22-Dec-2021 RamNalamothu <[email protected]>

[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range

Currently the return address ABI registers s[30:31], which fall in the call
clobbered register range, are added a

[AMDGPU] Move call clobbered return address registers s[30:31] to callee saved range

Currently the return address ABI registers s[30:31], which fall in the call
clobbered register range, are added as a live-in on the function entry to
preserve its value when we have calls so that it gets saved and restored
around the calls.

But the DWARF unwind information (CFI) needs to track where the return address
resides in a frame and the above approach makes it difficult to track the
return address when the CFI information is emitted during the frame lowering,
due to the involvment of understanding the control flow.

This patch moves the return address ABI registers s[30:31] into callee saved
registers range and stops adding live-in for return address registers, so that
the CFI machinery will know where the return address resides when CSR
save/restore happen during the frame lowering.

And doing the above poses an issue that now the return instruction uses undefined
register `sgpr30_sgpr31`. This is resolved by hiding the return address register
use by the return instruction through the `SI_RETURN` pseudo instruction, which
doesn't take any input operands, until the `SI_RETURN` pseudo gets lowered to the
`S_SETPC_B64_return` during the `expandPostRAPseudo()`.

As an added benefit, this patch simplifies overall return instruction handling.

Note: The AMDGPU CFI changes are there only in the downstream code and another
version of this patch will be posted for review for the downstream code.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D114652

show more ...


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init
# 67d61324 16-Jul-2021 Matt Arsenault <[email protected]>

GlobalISel: Preserve memory types for implicit sret load/stores


# 51f115b0 16-Jul-2021 Matt Arsenault <[email protected]>

AMDGPU/GlobalISel: Add a few tests for struct arguments

Test structs with pointers and vectors of pointers since this stresses
a future patch.


Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# fae05692 20-May-2021 Matt Arsenault <[email protected]>

CodeGen: Print/parse LLTs in MachineMemOperands

This will currently accept the old number of bytes syntax, and convert
it to a scalar. This should be removed in the near future (I think I
converted

CodeGen: Print/parse LLTs in MachineMemOperands

This will currently accept the old number of bytes syntax, and convert
it to a scalar. This should be removed in the near future (I think I
converted all of the tests already, but likely missed a few).

Not sure what the exact syntax and policy should be. We can continue
printing the number of bytes for non-generic instructions to avoid
test churn and only allow non-scalar types for generic instructions.

This will currently print the LLT in parentheses, but accept parsing
the existing integers and implicitly converting to scalar. The
parentheses are a bit ugly, but the parser logic seems unable to deal
without either parentheses or some keyword to indicate the start of a
type.

show more ...


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3
# 25e60f64 27-Feb-2021 Matt Arsenault <[email protected]>

AMDGPU/GlobalISel: Add subtarget to a test

SelectionDAG forces us to have a weird ABI for 16-bit values without
legal 16-bit operations, but currently GlobalISel bypasses this and
sometimes ends up

AMDGPU/GlobalISel: Add subtarget to a test

SelectionDAG forces us to have a weird ABI for 16-bit values without
legal 16-bit operations, but currently GlobalISel bypasses this and
sometimes ends up using the gfx8+ ABI in some contexts. Make sure
we're testing the normal ABI to avoid a test change in a future patch.

show more ...


Revision tags: llvmorg-12.0.0-rc2
# 62d946e1 07-Feb-2021 Matt Arsenault <[email protected]>

GlobalISel: Merge some AMDGPU ABI lowering code to generic code

AMDGPU currently has a lot of pre-processing code to pre-split
argument types into 32-bit pieces before passing it to the generic
code

GlobalISel: Merge some AMDGPU ABI lowering code to generic code

AMDGPU currently has a lot of pre-processing code to pre-split
argument types into 32-bit pieces before passing it to the generic
code in handleAssignments. This is a bit sloppy and also requires some
overly fancy iterator work when building the calls. It's better if all
argument marshalling code is handled directly in
handleAssignments. This handles more situations like decomposing large
element vectors into sub-element sized pieces.

This should mostly be NFC, but does change the generated code by
shifting where the initial argument packing instructions are placed. I
think this is nicer looking, since it now emits the packing code
directly after the relevant copies, rather than after the copies for
the remaining arguments.

This doubles down on gfx6/gfx7 using the gfx8+ ABI for 16-bit
types. This is ultimately the better option, but incompatible with the
DAG. Fixing this requires more work, especially for f16.

show more ...


Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1
# ae25a397 06-Jan-2021 Christudasan Devadasan <[email protected]>

AMDGPU/GlobalISel: Enable sret demotion


Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1
# 20c43d6b 20-Nov-2020 Matt Arsenault <[email protected]>

OpaquePtr: Bulk update tests to use typed sret


Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# 71131db6 30-Aug-2020 Matt Arsenault <[email protected]>

AMDGPU: Improve <2 x i24> arguments and return value handling

This was asserting for GlobalISel. For SelectionDAG, this was
passing this on the stack. Instead, scalarize this as if it were a
32-bit

AMDGPU: Improve <2 x i24> arguments and return value handling

This was asserting for GlobalISel. For SelectionDAG, this was
passing this on the stack. Instead, scalarize this as if it were a
32-bit vector.

show more ...


Revision tags: llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3
# bcff3dea 06-Jul-2020 Matt Arsenault <[email protected]>

AMDGPU/GlobalISel: Add some missing return tests


Revision tags: llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3
# eb416277 22-Feb-2020 Matt Arsenault <[email protected]>

AMDGPU/GlobalISel: Improve handling of illegal return types

Most importantly, this fixes ret i8. Also make sure to handle
signext/zeroext for odd types > i32. Some of the corresponding
argument pass

AMDGPU/GlobalISel: Improve handling of illegal return types

Most importantly, this fixes ret i8. Also make sure to handle
signext/zeroext for odd types > i32. Some of the corresponding
argument passing fixes also need to be handled.

show more ...


Revision tags: llvmorg-10.0.0-rc2
# 9087ef07 04-Feb-2020 Matt Arsenault <[email protected]>

GlobalISel: Allow CSE of G_IMPLICIT_DEF

The legalizer produces a lot of these, and they make reading legalized
MIR annoying. For some reason, this does seem to sometimes introduce
copies of implicit

GlobalISel: Allow CSE of G_IMPLICIT_DEF

The legalizer produces a lot of these, and they make reading legalized
MIR annoying. For some reason, this does seem to sometimes introduce
copies of implicit def, which is dumb.

show more ...


Revision tags: llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# e74c5b96 01-Nov-2019 Daniel Sanders <[email protected]>

[globalisel] Rename G_GEP to G_PTR_ADD

Summary:
G_GEP is rather poorly named. It's a simple pointer+scalar addition and
doesn't support any of the complexities of getelementptr. I therefore
propose

[globalisel] Rename G_GEP to G_PTR_ADD

Summary:
G_GEP is rather poorly named. It's a simple pointer+scalar addition and
doesn't support any of the complexities of getelementptr. I therefore
propose that we rename it. There's a G_PTR_MASK so let's follow that
convention and go with G_PTR_ADD

Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm

Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69734

show more ...


Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1
# a9ea8a9a 26-Jul-2019 Matt Arsenault <[email protected]>

AMDGPU/GlobalISel: Handle most function return types

handleAssignments gives up pretty easily on structs, and i8 values for
some reason. The other case that doesn't work is when an implicit sret
nee

AMDGPU/GlobalISel: Handle most function return types

handleAssignments gives up pretty easily on structs, and i8 values for
some reason. The other case that doesn't work is when an implicit sret
needs to be inserted if the return size exceeds the number of return
registers.

llvm-svn: 367082

show more ...