|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
21a0ef8d |
| 15-Jul-2021 |
Matt Arsenault <[email protected]> |
AMDGPU/GlobalISel: Redo kernel argument load handling
This avoids relying on G_EXTRACT on unusual types, and also properly decomposes structs into multiple registers. This also preserves the LLTs in the memory operands.
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2 |
|
| #
6a70874d |
| 12-Jan-2021 |
Matt Arsenault <[email protected]> |
AMDGPU/GlobalISel: Implement tail calls
Or at least the sibling call cases which the DAG already handles.
|
| #
fd82cbcf |
| 09-Feb-2021 |
Matt Arsenault <[email protected]> |
GlobalISel: Merge and cleanup more AMDGPU call lowering code
This merges more AMDGPU ABI lowering code into the generic call lowering. Start cleaning up by factoring away more of the pack/unpack logic into the buildCopy{To|From}Parts functions. These could use more improvement, and the SelectionDAG versions are significantly more complex, and we'll eventually have to emulate all of those cases too.
This is mostly NFC, but does result in some minor instruction reordering. It also removes some of the limitations with mismatched sizes the old code had. However, similarly to the merge on the input, this is forcing gfx6/gfx7 to use the gfx8+ ABI (which is what we actually want, but SelectionDAG is stuck using the weird emergent ABI).
This also changes the load/store size for stack passed EVTs for AArch64, which makes it consistent with the DAG behavior.
|
| #
6c260d3b |
| 28-Feb-2021 |
Matt Arsenault <[email protected]> |
GlobalISel: Move splitToValueTypes to generic code
I copied the nearly identical function from AArch64 into AMDGPU, so fix this duplication.
Mips and X86 have their own more exotic versions which should be removed. However replacing those is better left for a separate patch since it requires other changes to avoid regressions.
|
|
Revision tags: llvmorg-11.1.0-rc1 |
|
| #
ae25a397 |
| 06-Jan-2021 |
Christudasan Devadasan <[email protected]> |
AMDGPU/GlobalISel: Enable sret demotion
|
| #
6a87e9b0 |
| 25-Dec-2020 |
dfukalov <[email protected]> |
[NFC][AMDGPU] Reduce include files dependency.
Reviewed By: rampitec
Differential Revision: https://reviews.llvm.org/D93813
|
|
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init |
|
| #
6b7d5a92 |
| 07-Jul-2020 |
Matt Arsenault <[email protected]> |
AMDGPU/GlobalISel: Start cleaning up calling convention lowering
There are various hacks working around limitations in handleAssignments, and the logical split between different parts isn't correct. Start separating the type legalization to satisfy going through the DAG infrastructure from the code required to split into register types. The type splitting should be moved to generic code.
|
| #
d68458bd |
| 23-Dec-2020 |
Christudasan Devadasan <[email protected]> |
[GlobalISel] Base implementation for sret demotion.
If the return values can't be lowered to registers SelectionDAG performs the sret demotion. This patch contains the basic implementation for the same in the GlobalISel pipeline.
Furthermore, targets should bring relevant changes during lowerFormalArguments, lowerReturn and lowerCall to make use of this feature.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D92953
|
| #
16bcd545 |
| 26-Jul-2020 |
Matt Arsenault <[email protected]> |
AMDGPU/GlobalISel: Mark GlobalISel classes as final
|
|
Revision tags: llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1 |
|
| #
1168119c |
| 07-May-2020 |
Matt Arsenault <[email protected]> |
AMDGPU: Start interpreting byref on kernel arguments
These are treated identically to value aggregates placed in the kernel argument list. A %struct.foo or %struct.foo addrspace(4)* byref(sizeof(%struct.foo)) align(alignof(%struct.foo)) argument should produce the same offsets and argument metadata.
This handles all 3 kernel ABI implementations, and the two HSA metadata emission paths.
|
| #
61f1f2a2 |
| 03-Jul-2020 |
Matt Arsenault <[email protected]> |
AMDGPU/GlobalISel: Initial Implementation of calls
Return values, and tail calls are not yet handled.
|
| #
0de874ad |
| 31-Mar-2020 |
Guillaume Chatelet <[email protected]> |
[Alignment][NFC] Transition to inferAlignFromPtrInfo
Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790
Reviewers: courbet
Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77120
|
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3 |
|
| #
eb416277 |
| 22-Feb-2020 |
Matt Arsenault <[email protected]> |
AMDGPU/GlobalISel: Improve handling of illegal return types
Most importantly, this fixes ret i8. Also make sure to handle signext/zeroext for odd types > i32. Some of the corresponding argument passing fixes also need to be handled.
|
|
Revision tags: llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4 |
|
| #
06c8cb03 |
| 09-Sep-2019 |
Austin Kerbow <[email protected]> |
AMDGPU/GlobalISel: Rename MIRBuilder to B. NFC
Reviewers: arsenm
Reviewed By: arsenm
Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67374
llvm-svn: 371467
|
|
Revision tags: llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1 |
|
| #
a9ea8a9a |
| 26-Jul-2019 |
Matt Arsenault <[email protected]> |
AMDGPU/GlobalISel: Handle most function return types
handleAssignments gives up pretty easily on structs, and i8 values for some reason. The other case that doesn't work is when an implicit sret needs to be inserted if the return size exceeds the number of return registers.
llvm-svn: 367082
|
| #
b60a2ae4 |
| 19-Jul-2019 |
Matt Arsenault <[email protected]> |
AMDGPU/GlobalISel: Support arguments with multiple registers
Handles structs used directly in argument lists.
llvm-svn: 366584
|
| #
fecf43eb |
| 19-Jul-2019 |
Matt Arsenault <[email protected]> |
AMDGPU/GlobalISel: Rewrite lowerFormalArguments
This should now handle everything except structs passed as multiple registers.
I think most of the packing logic should be handled by handleAssignments, but I'm unclear on what the contract is for multiple registers. This is copying how x86 handles this.
This does change the behavior of the test_sgpr_alignment0 amdgpu_vs test. I don't think shader arguments should try to follow the alignment, and registers need to be repacked. I also don't think it matters, since I think the pointers are packed to the beginning of the argument list anyway.
llvm-svn: 366582
|
|
Revision tags: llvmorg-10-init |
|
| #
b725d273 |
| 11-Jul-2019 |
Matt Arsenault <[email protected]> |
AMDGPU/GlobalISel: Move kernel argument handling to separate function
llvm-svn: 365782
|
|
Revision tags: llvmorg-8.0.1, llvmorg-8.0.1-rc4 |
|
| #
c3dbe239 |
| 27-Jun-2019 |
Diana Picus <[email protected]> |
[GlobalISel] Accept multiple vregs in lowerFormalArgs
Change the interface of CallLowering::lowerFormalArguments to accept several virtual registers for each formal argument, instead of just one. This is a follow-up to D46018.
CallLowering::lowerReturn was similarly refactored in D49660. lowerCall will be refactored in the same way in follow-up patches.
With this change, we forward the virtual registers generated for aggregates to CallLowering. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. We also copy the pack/unpackRegs helpers to CallLowering to facilitate this.
ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions.
AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was put into a s64 instead of a p0. Added a test-case which illustrates the problem more clearly (it crashes without this patch) and fixed the existing test-case to expect p0.
AMDGPU has been updated to unpack into the virtual registers for kernels. I think the other code paths fall back for aggregates, so this should be NFC.
Mips doesn't support aggregates yet, so it's also NFC.
x86 seems to have code for dealing with aggregates, but I couldn't find the tests for it, so I just added a fallback to DAGISel if we get more than one virtual register for an argument.
Differential Revision: https://reviews.llvm.org/D63549
llvm-svn: 364510
|
|
Revision tags: llvmorg-8.0.1-rc3 |
|
| #
faeaedf8 |
| 24-Jun-2019 |
Matt Arsenault <[email protected]> |
GlobalISel: Remove unsigned variant of SrcOp
Force using Register.
One downside is the generated register enums require explicit conversion.
llvm-svn: 364194
|
| #
e3a676e9 |
| 24-Jun-2019 |
Matt Arsenault <[email protected]> |
CodeGen: Introduce a class for registers
Avoids using a plain unsigned for registers throughout codegen. Doesn't attempt to change every register use, just something a little more than the set needed to build after changing the return type of MachineOperand::getReg().
llvm-svn: 364191
|
|
Revision tags: llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1 |
|
| #
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <[email protected]> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
|
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3 |
|
| #
0da6350d |
| 31-Aug-2018 |
Matt Arsenault <[email protected]> |
AMDGPU: Remove remnants of old address space mapping
llvm-svn: 341165
|
|
Revision tags: llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1 |
|
| #
49168f67 |
| 02-Aug-2018 |
Alexander Ivchenko <[email protected]> |
[GlobalISel] Rewrite CallLowering::lowerReturn to accept multiple VRegs per Value
This is a logical continuation of https://reviews.llvm.org/D46018 (r332449)
Differential Revision: https://reviews.llvm.org/D49660
llvm-svn: 338685
|
| #
29f30379 |
| 05-Jul-2018 |
Matt Arsenault <[email protected]> |
AMDGPU/GlobalISel: Implement custom kernel arg lowering
Avoid using allocateKernArg / AssignFn. We do not want any of the type splitting properties of normal calling convention lowering.
For now at least this exists alongside the IR argument lowering pass. This is necessary to handle struct padding correctly while some arguments are still skipped by the IR argument lowering pass.
llvm-svn: 336373
|