AMDGPUInstructionSelector.h - OpenGrok history log for /llvm-project-15.0.7/llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.h

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 432cbd78	18-Jul-2022	Ivan Kosarev <[email protected]>	[AMDGPU][CodeGen] Support (register + immediate) SMRD offsets. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D129381
# 4874838a	28-Jun-2022	Piotr Sobczak <[email protected]>	[AMDGPU] gfx11 WMMA instruction support gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate) instructions. Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D1287 [AMDGPU] gfx11 WMMA instruction support gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate) instructions. Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D128756 show more ...
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5
# 20d20156	09-Jun-2022	Joe Nash <[email protected]>	[AMDGPU] gfx11 VINTERP intrinsics and ISel support Depends on D127664 Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D127756
# 2d43de13	15-Jun-2022	Joe Nash <[email protected]>	[AMDGPU] gfx11 new dot instruction codegen support Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D127904
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 7b9f620e	06-Apr-2022	Jay Foad <[email protected]>	[AMDGPU] Work around GFX11 flat scratch SVS swizzling bug Differential Revision: https://reviews.llvm.org/D127635
# 5df2893a	28-Apr-2022	Nicolai Hähnle <[email protected]>	AMDGPU: Add G_AMDGPU_MAD_64_32 instructions These generic instructions are trivially selected to V_MAD_[IU]64_[IU]32 instructions when run on the VALU. When at least both factors are scalar, it is AMDGPU: Add G_AMDGPU_MAD_64_32 instructions These generic instructions are trivially selected to V_MAD_[IU]64_[IU]32 instructions when run on the VALU. When at least both factors are scalar, it is usually better to execute some or all of the instruction on the SALU. To this end, we lower the instruction to simpler instructions that are supported on the SALU when applying the register bank mapping. Differential Revision: https://reviews.llvm.org/D124843 show more ...
# dee31902	17-May-2022	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Add llvm.amdgcn.global.load.lds intrinsic Differential Revision: https://reviews.llvm.org/D125279
# 791ec1c6	13-May-2022	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Add intrinsics llvm.amdgcn.{raw\|struct}.buffer.load.lds Differential Revision: https://reviews.llvm.org/D124884
# 6e3e14f6	21-Mar-2022	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Support gfx940 smfmac instructions Differential Revision: https://reviews.llvm.org/D122191
# f59cb41b	15-Mar-2022	Abinav Puthan Purayil <[email protected]>	[AMDGPU] Select buffer_atomic_cmpswap* in tblgen This change replaces the manual selection of buffer_atomic_cmpswap* instructions in SelectionDAG and GlobalISel with a tblgen based selection in BUFI [AMDGPU] Select buffer_atomic_cmpswap* in tblgen This change replaces the manual selection of buffer_atomic_cmpswap* instructions in SelectionDAG and GlobalISel with a tblgen based selection in BUFInstructions.td. This allows us to select the return and no-return variants in tblgen. Differential Revision: https://reviews.llvm.org/D121770 show more ...
# c4500de2	14-Mar-2022	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] gfx940: disable OP_SEL on V_DOT instructions Differential Revision: https://reviews.llvm.org/D121634
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 36fe3f13	08-Mar-2022	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] flat scratch SVS addressing mode for gfx940 Both VADDR and SADDR are used in SVS mode. Differential Revision: https://reviews.llvm.org/D121254
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 7f26a102	12-Jan-2022	Matt Arsenault <[email protected]>	AMDGPU/GlobalISel: Introduce pseudo to copy sp in call sequences Arbitrary stack pointers are accessed using MUBUF instructions with the voffset field, which is interpreted as the swizzled address. AMDGPU/GlobalISel: Introduce pseudo to copy sp in call sequences Arbitrary stack pointers are accessed using MUBUF instructions with the voffset field, which is interpreted as the swizzled address. We want to fold fold into the MUBUF form to use the SP in the SGPR offset, and previously we were special casing the interpretation of the pointer value if the access memory operand said it was relative to the stack pointer. 690f5b7a0128a210093e9b217932743ad35b5c5a removed this check, and moved the DAG path to special casing copies from SGPRs. This is not an entirely sound approach, since it's still changing the interpretation of pointer values based the context. Introduce a new pseudo which corresponds to the wave-to-vector address transform. This way the memory instruction has consistent semantics where the incoming pointer is always interpreted as a vector address, and we're not obligated to optimize into the MUBUF offset-only addressing mode. The DAG should probably have an equivalent pseudo. This should fix some correctness issues, and folding this into addressing modes will be a future optimization patch. show more ...
# 41bfac6a	02-Jan-2022	Kazu Hirata <[email protected]>	[Target] Remove unused forward declarations (NFC)
Revision tags: llvmorg-13.0.1-rc1
# 078da26b	08-Nov-2021	Abinav Puthan Purayil <[email protected]>	[AMDGPU] Check for unneeded shift mask in shift PatFrags. The existing constrained shift PatFrags only dealt with masked shift from OpenCL front-ends. This change copies the X86DAGToDAGISel::isUnnee [AMDGPU] Check for unneeded shift mask in shift PatFrags. The existing constrained shift PatFrags only dealt with masked shift from OpenCL front-ends. This change copies the X86DAGToDAGISel::isUnneededShiftMask() function to AMDGPU and uses it in the shift PatFrag predicates. Differential Revision: https://reviews.llvm.org/D113448 show more ...
# aee86f9b	07-Nov-2021	Kazu Hirata <[email protected]>	[AMDGPU] Remove unused declaration selectSMRD (NFC) The function body proper was removed on Feb 20, 2019 in commit 79b5c3842b684f873d1ffad502336e973616ea51.
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# 48958d02	23-Aug-2021	Daniil Fukalov <[email protected]>	[NFC][AMDGPU] Reduce includes dependencies. 1. Splitted out some parts of R600 target to separate modules/headers. 2. Reduced some include lists in headers. 3. Found and fixed issue with override `G [NFC][AMDGPU] Reduce includes dependencies. 1. Splitted out some parts of R600 target to separate modules/headers. 2. Reduced some include lists in headers. 3. Found and fixed issue with override `GCNTargetMachine::getSubtargetImpl()` and `R600TargetMachine::getSubtargetImpl()` had different return value type than base class. 4. Minor forward declarations cleanup. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D108596 show more ...
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# f9f5d415	30-Apr-2021	Brendon Cahoon <[email protected]>	[AMDGPU][GlobalISel] Legalize and select G_SBFX and G_UBFX Adds legalizer, register bank select, and instruction select support for G_SBFX and G_UBFX. These opcodes generate scalar or vector ALU bit [AMDGPU][GlobalISel] Legalize and select G_SBFX and G_UBFX Adds legalizer, register bank select, and instruction select support for G_SBFX and G_UBFX. These opcodes generate scalar or vector ALU bitfield extract instructions for AMDGPU. The instructions allow both constant or register values for the offset and width operands. The 32-bit scalar version is expanded to a sequence that combines the offset and width into a single register. There are no 64-bit vgpr bitfield extract instructions, so the operations are expanded to a sequence of instructions that implement the operation. If the width is a constant, then the 32-bit bitfield extract instructions are used. Moved the AArch64 specific code for creating G_SBFX to CombinerHelper.cpp so that it can be used by other targets. Only bitfield extracts with constant offset and width values are handled currently. Differential Revision: https://reviews.llvm.org/D100149 show more ...
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# cc7add52	30-Mar-2021	Sebastian Neubauer <[email protected]>	[AMDGPU] Use SIInstrFlags for flat variants. NFC Use SIInstrFlags to differentiate between the different variants of flat instructions (flat, global and scratch). This should make it easier to bundl [AMDGPU] Use SIInstrFlags for flat variants. NFC Use SIInstrFlags to differentiate between the different variants of flat instructions (flat, global and scratch). This should make it easier to bundle the immediate offset logic in a single place and implement restrictions and bug workarounds. Fixed version of D99587, which does not rely on the address space. Differential Revision: https://reviews.llvm.org/D99743 show more ...
Revision tags: llvmorg-12.0.0-rc3
# 5d0e9ddf	01-Mar-2021	Jay Foad <[email protected]>	[AMDGPU][GlobalISel] Add support for global atomicrmw fadd This includes gfx908 which only has a no-return version of the global_atomic_add_f32 instruction, using the same hack that was previously i [AMDGPU][GlobalISel] Add support for global atomicrmw fadd This includes gfx908 which only has a no-return version of the global_atomic_add_f32 instruction, using the same hack that was previously implemented for selecting from the llvm.amdgcn.global.atomic.fadd intrinsic. Differential Revision: https://reviews.llvm.org/D97767 show more ...
Revision tags: llvmorg-12.0.0-rc2
# 3bffb1cd	09-Feb-2021	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Use single cache policy operand Replace individual operands GLC, SLC, and DLC with a single cache_policy bitmask operand. This will reduce the number of operands in MIR and I hope the amoun [AMDGPU] Use single cache policy operand Replace individual operands GLC, SLC, and DLC with a single cache_policy bitmask operand. This will reduce the number of operands in MIR and I hope the amount of code. These operands are mostly 0 anyway. Additional advantage that parser will accept these flags in any order unlike now. Differential Revision: https://reviews.llvm.org/D96469 show more ...
# 8a316045	25-Feb-2021	Amara Emerson <[email protected]>	[AArch64][GlobalISel] Enable use of the optsize predicate in the selector. To do this while supporting the existing functionality in SelectionDAG of using PGO info, we add the ProfileSummaryInfo and [AArch64][GlobalISel] Enable use of the optsize predicate in the selector. To do this while supporting the existing functionality in SelectionDAG of using PGO info, we add the ProfileSummaryInfo and LazyBlockFrequencyInfo analysis dependencies to the instruction selector pass. Then, use the predicate to generate constant pool loads for f32 materialization, if we're targeting optsize/minsize. Differential Revision: https://reviews.llvm.org/D97732 show more ...
# a8d9d507	17-Feb-2021	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] gfx90a support Differential Revision: https://reviews.llvm.org/D96906
Revision tags: llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# 7e9ceed9	27-Aug-2020	Jay Foad <[email protected]>	[TableGen][GlobalISel] Allow duplicate RendererFns Allow different GICustomOperandRenderers to use the same RendererFn. This avoids the need for targets to define a bunch of identical C++ renderer f [TableGen][GlobalISel] Allow duplicate RendererFns Allow different GICustomOperandRenderers to use the same RendererFn. This avoids the need for targets to define a bunch of identical C++ renderer functions with different names. Without this fix TableGen would have emitted code that tried to define the GICR enumeration with duplicate enumerators. Differential Revision: https://reviews.llvm.org/D96587 show more ...
# 6bde0853	27-Jan-2021	Kazu Hirata <[email protected]>	[AMDGPU] Forward-declare TargetRegisterClass (NFC) AMDGPUInstructionSelector.h needs TargetRegisterClass but relies on a forward declaration of TargetRegisterClass in InstructionSelector.h. This pat [AMDGPU] Forward-declare TargetRegisterClass (NFC) AMDGPUInstructionSelector.h needs TargetRegisterClass but relies on a forward declaration of TargetRegisterClass in InstructionSelector.h. This patch adds a forward declaration right in AMDGPUInstructionSelector.h. While we are at it, this patch removes the one in InstructionSelector.h, where it is unnecessary. show more ...
12 3 4 5 6