History log of /llvm-project-15.0.7/llvm/test/CodeGen/AMDGPU/inline-asm.ll (Results 1 – 25 of 28)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3
# f510045d 14-Jan-2022 Jay Foad <[email protected]>

[CodeGen] Remove unneeded regex escaping in FileCheck patterns. NFC.

Take advantage of D117117 to simplify all {{\[}} to [ and {{\]}} to ].

Differential Revision: https://reviews.llvm.org/D117298


Revision tags: llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# 8a52bd82 19-Nov-2021 Jay Foad <[email protected]>

[AMDGPU] Only select VOP3 forms of VOP2 instructions

Change VOP_PAT_GEN to default to not generating an instruction selection
pattern for the VOP2 (e32) form of an instruction, only for the VOP3
(e6

[AMDGPU] Only select VOP3 forms of VOP2 instructions

Change VOP_PAT_GEN to default to not generating an instruction selection
pattern for the VOP2 (e32) form of an instruction, only for the VOP3
(e64) form. This allows SIFoldOperands maximum freedom to fold copies
into the operands of an instruction, before SIShrinkInstructions tries
to shrink it back to the smaller encoding.

This affects the following VOP2 instructions:
v_min_i32
v_max_i32
v_min_u32
v_max_u32
v_and_b32
v_or_b32
v_xor_b32
v_lshr_b32
v_ashr_i32
v_lshl_b32

A further cleanup could simplify or remove VOP_PAT_GEN, since its
optional second argument is never used.

Differential Revision: https://reviews.llvm.org/D114252

show more ...


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init
# 381ded34 28-Jun-2021 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Add S_MOV_B64_IMM_PSEUDO for wide constants

This is to allow 64 bit constant rematerialization. If a constant
is split into two separate moves initializing sub0 and sub1 like
now RA cannot

[AMDGPU] Add S_MOV_B64_IMM_PSEUDO for wide constants

This is to allow 64 bit constant rematerialization. If a constant
is split into two separate moves initializing sub0 and sub1 like
now RA cannot rematerizalize a 64 bit register.

This gives 10-20% uplift in a set of huge apps heavily using double
precession math.

Fixes: SWDEV-292645

Differential Revision: https://reviews.llvm.org/D104874

show more ...


Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1
# 2f499b9a 19-Dec-2020 Tony <[email protected]>

[AMDGPU] Add volatile support to SIMemoryLegalizer

Treat a non-atomic volatile load and store as a relaxed atomic at
system scope for the address spaces accessed. This will ensure all
relevant cache

[AMDGPU] Add volatile support to SIMemoryLegalizer

Treat a non-atomic volatile load and store as a relaxed atomic at
system scope for the address spaces accessed. This will ensure all
relevant caches will be bypassed.

A volatile atomic is not changed and still only bypasses caches upto
the level specified by the SyncScope operand.

Differential Revision: https://reviews.llvm.org/D94214

show more ...


Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2
# cb7b661d 03-Feb-2020 Matt Arsenault <[email protected]>

AMDGPU: Analyze divergence of inline asm


# 5df1ac78 31-Jan-2020 alex-t <[email protected]>

[AMDGPU] fixed divergence driven shift operations selection

Differential Revision: https://reviews.llvm.org/D73483

Reviewers: rampitec


Revision tags: llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# 21bc8e5a 28-Oct-2019 Matt Arsenault <[email protected]>

AMDGPU: Make VReg_1 only include 1 artificial register

When TableGen is inferring register classes from contexts, it uses a
sorting function based on the number of registers in the class. Since
this

AMDGPU: Make VReg_1 only include 1 artificial register

When TableGen is inferring register classes from contexts, it uses a
sorting function based on the number of registers in the class. Since
this was being treated as an alias of VGPR_32, they had exactly the
same size. The sort used wasn't a stable sort, and even if it were, I
believe the tie breaker would effectively end up being the
alphabetical ordering of the class name. There appear to be issues
trying to use an empty set of registers, so add only one so this will
always sort to the end.

Also add a comment explaining how VReg_1 is a dirty hack for
SelectionDAG.

This does end up changing the behavior of i1 with inline asm and VGPR
constraints, but the existing behavior was was already nonsensical and
inconsistent. It should probably be disallowed anyway.

Fixes bug 43699

show more ...


Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2
# 5fc1dfa7 28-May-2019 Michael Liao <[email protected]>

[AMDGPU] Correct the handling of inlineasm output registers.

Summary:
- There's a regression due to the cross-block RC assignment. Use the
proper way to derive the output register RC in inline asm

[AMDGPU] Correct the handling of inlineasm output registers.

Summary:
- There's a regression due to the cross-block RC assignment. Use the
proper way to derive the output register RC in inline asm.

Reviewers: rampitec, alex-t

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, eraman, hiraditya, llvm-commits, yaxunl

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62537

llvm-svn: 361868

show more ...


Revision tags: llvmorg-8.0.1-rc1
# fbdd2a18 15-Apr-2019 Matt Arsenault <[email protected]>

AMDGPU: Fix printed format of SReg_96

These are artificial, so I think this should only come up with inline
asm comments.

llvm-svn: 358446


Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1
# 814abb59 31-Oct-2018 Nicolai Haehnle <[email protected]>

AMDGPU: Rewrite SILowerI1Copies to always stay on SALU

Summary:
Instead of writing boolean values temporarily into 32-bit VGPRs
if they are involved in PHIs or are observed from outside a loop,
we u

AMDGPU: Rewrite SILowerI1Copies to always stay on SALU

Summary:
Instead of writing boolean values temporarily into 32-bit VGPRs
if they are involved in PHIs or are observed from outside a loop,
we use bitwise masking operations to combine lane masks in a way
that is consistent with wave control flow.

Move SIFixSGPRCopies to before this pass, since that pass
incorrectly attempts to move SGPR phis to VGPRs.

This should recover most of the code quality that was lost with
the bug fix in "AMDGPU: Remove PHI loop condition optimization".

There are still some relevant cases where code quality could be
improved, in particular:

- We often introduce redundant masks with EXEC. Ideally, we'd
have a generic computeKnownBits-like analysis to determine
whether masks are already masked by EXEC, so we can avoid this
masking both here and when lowering uniform control flow.

- The criterion we use to determine whether a def is observed
from outside a loop is conservative: it doesn't check whether
(loop) branch conditions are uniform.

Change-Id: Ibabdb373a7510e426b90deef00f5e16c5d56e64b

Reviewers: arsenm, rampitec, tpr

Subscribers: kzhuravl, jvesely, wdng, mgorny, yaxunl, dstuttard, t-tye, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D53496

llvm-svn: 345719

show more ...


# 611b533f 30-Oct-2018 Jonas Paulsson <[email protected]>

[SchedModel] Fix for read advance cycles with implicit pseudo operands.

The SchedModel allows the addition of ReadAdvances to express that certain
operands of the instructions are needed at a later

[SchedModel] Fix for read advance cycles with implicit pseudo operands.

The SchedModel allows the addition of ReadAdvances to express that certain
operands of the instructions are needed at a later point than the others.

RegAlloc may add pseudo operands that are not part of the instruction
descriptor, and therefore cannot have any read advance entries. This meant
that in some cases the desired read advance was nullified by such a pseudo
operand, which still had the original latency.

This patch fixes this by making sure that such pseudo operands get a zero
latency during DAG construction.

Review: Matthias Braun, Ulrich Weigand.
https://reviews.llvm.org/D49671

llvm-svn: 345606

show more ...


Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2
# 36cd1859 09-Aug-2017 Matt Arsenault <[email protected]>

AMDGPU: Fix assert on n inline asm constraint

llvm-svn: 310515


Revision tags: llvmorg-5.0.0-rc1
# 9aa45f04 06-Jul-2017 Matt Arsenault <[email protected]>

AMDGPU: Add macro fusion schedule DAG mutation

Try to increase opportunities to shrink vcc uses.

llvm-svn: 307313


Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3
# 3c7581bb 08-Jun-2017 Matt Arsenault <[email protected]>

AMDGPU: Use correct register names in inline assembly

Fixes using physical registers in inline asm from clang.

llvm-svn: 305004


Revision tags: llvmorg-4.0.1-rc2
# 2b1f9aa5 17-May-2017 Matt Arsenault <[email protected]>

AMDGPU: Start defining a calling convention

Partially implement callee-side for arguments and return values.
byval doesn't work properly, and most likely sret or other on-stack
return values most as

AMDGPU: Start defining a calling convention

Partially implement callee-side for arguments and return values.
byval doesn't work properly, and most likely sret or other on-stack
return values most as well.

llvm-svn: 303308

show more ...


# 2a80369a 29-Apr-2017 Matt Arsenault <[email protected]>

AMDGPU: Fix copies from physical registers in SIFixSGPRCopies

This would assert when there were multiple defs of
a physical register.

We just need to move all of the users of it.

llvm-svn: 301730


Revision tags: llvmorg-4.0.1-rc1
# 0d0d6c2f 12-Apr-2017 Matt Arsenault <[email protected]>

AMDGPU: Fix invalid copies when copying i1 to phys reg

Insert a VReg_1 virtual register so the i1 workaround pass
can handle it.

llvm-svn: 300113


# fe78ffba 11-Apr-2017 Matt Arsenault <[email protected]>

AMDGPU: Fix folding reg_sequence into copy to phys reg

This was producing an illegal reg_sequence defining
a physical register with virtual register inputs.

llvm-svn: 299997


# 3dbeefa9 21-Mar-2017 Matt Arsenault <[email protected]>

AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel

Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
ca

AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel

Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.

Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).

llvm-svn: 298444

show more ...


Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2
# 7aad8fd8 24-Jan-2017 Matt Arsenault <[email protected]>

Enable FeatureFlatForGlobal on Volcanic Islands

This switches to the workaround that HSA defaults to
for the mesa path.

This should be applied to the 4.0 branch.

Patch by Vedran Miletić <vedran@mi

Enable FeatureFlatForGlobal on Volcanic Islands

This switches to the workaround that HSA defaults to
for the mesa path.

This should be applied to the 4.0 branch.

Patch by Vedran Miletić <[email protected]>

llvm-svn: 292982

show more ...


Revision tags: llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1
# 5d8eb25e 30-Sep-2016 Matt Arsenault <[email protected]>

AMDGPU: Use unsigned compare for eq/ne

For some reason there are both of these available, except
for scalar 64-bit compares which only has u64. I'm not sure
why there are both (I'm guessing it's for

AMDGPU: Use unsigned compare for eq/ne

For some reason there are both of these available, except
for scalar 64-bit compares which only has u64. I'm not sure
why there are both (I'm guessing it's for the one bit inputs we
don't use), but for consistency always using the
unsigned one.

llvm-svn: 282832

show more ...


Revision tags: llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1
# accddacb 01-Jul-2016 Matt Arsenault <[email protected]>

TII: Fix inlineasm size counting comments as insts

The main problem was counting comments on their own
line as instructions.

llvm-svn: 274405


# a9720c67 20-Jun-2016 Matt Arsenault <[email protected]>

AMDGPU: Use correct method for determining instruction size

llvm-svn: 273172


Revision tags: llvmorg-3.8.1, llvmorg-3.8.1-rc1
# df3a20cd 06-Apr-2016 Nicolai Haehnle <[email protected]>

AMDGPU: Add a shader calling convention

This makes it possible to distinguish between mesa shaders
and other kernels even in the presence of compute shaders.

Patch By: Bas Nieuwenhuizen <bas@basnie

AMDGPU: Add a shader calling convention

This makes it possible to distinguish between mesa shaders
and other kernels even in the presence of compute shaders.

Patch By: Bas Nieuwenhuizen <[email protected]>

Differential Revision: http://reviews.llvm.org/D18559

llvm-svn: 265589

show more ...


# 9f2e00de 09-Mar-2016 Tom Stellard <[email protected]>

SelectionDAG: Fix a crash on inline asm when output register supports multiple types

Summary:
The code in SelectionDAG did not handle the case where the
register type and output types were different

SelectionDAG: Fix a crash on inline asm when output register supports multiple types

Summary:
The code in SelectionDAG did not handle the case where the
register type and output types were different, but had the same size.

Reviewers: arsenm, echristo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17940

llvm-svn: 263022

show more ...


12