History log of /llvm-project-15.0.7/llvm/test/CodeGen/AMDGPU/loop_break.ll (Results 1 – 25 of 32)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init
# d2e5d351 31-Jan-2022 Jay Foad <[email protected]>

[StructurizeCFG] Clean up some boolean not instructions

In some cases StructurizeCFG inserts i1 xor instructions to invert
predicates. Add a quick loop to clean these up afterwards if we can get
awa

[StructurizeCFG] Clean up some boolean not instructions

In some cases StructurizeCFG inserts i1 xor instructions to invert
predicates. Add a quick loop to clean these up afterwards if we can get
away with modifying an existing compare instruction instead.
(StructurizeCFG is generally run late in the pipeline so instcombine
does not clean them up for us.)

Differential Revision: https://reviews.llvm.org/D118623

show more ...


# 8faad296 31-Jan-2022 Jay Foad <[email protected]>

Revert "[Local] invertCondition: try modifying an existing ICmpInst"

This reverts commit a6b54ddaba2d5dc0f72dcc4591c92b9544eb0016.

Apparently it is not safe to modify the condition even if it passe

Revert "[Local] invertCondition: try modifying an existing ICmpInst"

This reverts commit a6b54ddaba2d5dc0f72dcc4591c92b9544eb0016.

Apparently it is not safe to modify the condition even if it passes the
hasOneUse test, because StructurizeCFG might have other references to
the condition that are not manifest in the IR use-def chains.

show more ...


# a6b54dda 28-Jan-2022 Jay Foad <[email protected]>

[Local] invertCondition: try modifying an existing ICmpInst

This avoids various cases where StructurizeCFG would otherwise insert an
xor i1 instruction, and it since it generally runs late in the pi

[Local] invertCondition: try modifying an existing ICmpInst

This avoids various cases where StructurizeCFG would otherwise insert an
xor i1 instruction, and it since it generally runs late in the pipeline,
instcombine does not clean up the xor-of-cmp pattern.

Differential Revision: https://reviews.llvm.org/D118478

show more ...


Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# 18f93512 19-Nov-2021 RamNalamothu <[email protected]>

[AMDGPU] Do not generate ELF symbols for the local branch target labels

The compiler was generating symbols in the final code object for local
branch target labels. This bloats the code object, slow

[AMDGPU] Do not generate ELF symbols for the local branch target labels

The compiler was generating symbols in the final code object for local
branch target labels. This bloats the code object, slows down the loader,
and is only used to simplify disassembly.

Use '--symbolize-operands' with llvm-objdump to improve readability of the
branch target operands in disassembly.

Fixes: SWDEV-312223

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D114273

show more ...


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init
# 9b1bcaea 21-Jul-2021 Matt Arsenault <[email protected]>

AMDGPU: Update tests for lower i1 change

I forgot to squash the test updates for b32d3d9e81cdd9275d19cd2a396c461edc9e7189


Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# 8de4db69 19-May-2021 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Lower kernel LDS into a sorted structure

Differential Revision: https://reviews.llvm.org/D102954


Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# 778351df 24-Jun-2020 Matt Arsenault <[email protected]>

Revert "[AMDGPU] Enable compare operations to be selected by divergence"

This reverts commit 521ac0b5cea02f629d035f807460affbb65ae7ad.

Reported to break thousands of piglit tests.


# 521ac0b5 19-Jun-2020 alex-t <[email protected]>

[AMDGPU] Enable compare operations to be selected by divergence

Summary: Details: This patch enables SETCC to be selected to S_CMP_* if uniform and V_CMP_* if divergent.

Reviewers: rampitec, arsenm

[AMDGPU] Enable compare operations to be selected by divergence

Summary: Details: This patch enables SETCC to be selected to S_CMP_* if uniform and V_CMP_* if divergent.

Reviewers: rampitec, arsenm

Reviewed By: rampitec

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D82194

show more ...


Revision tags: llvmorg-10.0.1-rc1
# 3cbbded6 28-Mar-2020 Sameer Sahasrabuddhe <[email protected]>

Introduce unify-loop-exits pass.

For each natural loop with multiple exit blocks, this pass creates a
new block N such that all exiting blocks now branch to N, and then
control flow is redistributed

Introduce unify-loop-exits pass.

For each natural loop with multiple exit blocks, this pass creates a
new block N such that all exiting blocks now branch to N, and then
control flow is redistributed to all the original exit blocks.

The bulk of the tranformation is a new function introduced in
BasicBlockUtils that an redirect control flow from a set of incoming
blocks to a set of outgoing blocks via a common "hub".

This is a useful workaround for a limitation in the structurizer which
incorrectly orders blocks when processing a nest of loops. This pass
bypasses that issue by ensuring that each natural loop is recognized
as a separate region. Since the structurizer is a region pass, it no
longer sees a nest of loops in a single region, and instead processes
each "level" in the nesting as a separate region.

The AMDGPU backend provides a new option to enable this pass before
the structurizer, which may eventually be enabled by default.

Reviewers: madhur13490, arsenm, nhaehnle

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D75865

show more ...


Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4
# 42febbab 05-Mar-2020 Sameer Sahasrabuddhe <[email protected]>

StructurizeCFG: simplify phi nodes when possible

After structurization, some phi nodes can have a single incoming edge
and can be simplified away. This change runs a simplify query on all
phis that

StructurizeCFG: simplify phi nodes when possible

After structurization, some phi nodes can have a single incoming edge
and can be simplified away. This change runs a simplify query on all
phis that are either modified or added by the structurizer. This also
moves some phis closer to their use as a side benefit.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D75500

show more ...


Revision tags: llvmorg-10.0.0-rc3
# 534d8866 28-Feb-2020 Sameer Sahasrabuddhe <[email protected]>

[AMDGPU] add generated checks for some LIT tests

This is in prepration for further changes that affect these tests.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D75403


Revision tags: llvmorg-10.0.0-rc2
# 00b22df7 03-Feb-2020 Matt Arsenault <[email protected]>

AMDGPU: Fix extra type mangling on llvm.amdgcn.if.break

These have to be the same mask type.


Revision tags: llvmorg-10.0.0-rc1, llvmorg-11-init
# 01a4b831 08-Jan-2020 Michael Liao <[email protected]>

[codegen,amdgpu] Enhance MIR DIE and re-arrange it for AMDGPU.

Summary:
- `dead-mi-elimination` assumes MIR in the SSA form and cannot be
arranged after phi elimination or DeSSA. It's enhanced to

[codegen,amdgpu] Enhance MIR DIE and re-arrange it for AMDGPU.

Summary:
- `dead-mi-elimination` assumes MIR in the SSA form and cannot be
arranged after phi elimination or DeSSA. It's enhanced to handle the
dead register definition by skipping use check on it. Once a register
def is `dead`, all its uses, if any, should be `undef`.
- Re-arrange the DIE in RA phase for AMDGPU by placing it directly after
`detect-dead-lanes`.
- Many relevant tests are refined due to different register assignment.

Reviewers: rampitec, qcolombet, sunfish

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72709

show more ...


Revision tags: llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1
# 008e65a7 18-Nov-2019 vpykhtin <[email protected]>

[AMDGPU] Fix emitIfBreak CF lowering: use temp reg to make register coalescer life easier.

Differential revision: https://reviews.llvm.org/D70405


# c4d256a5 14-Oct-2019 Alexander Timofeev <[email protected]>

[AMDGPU] Come back patch for the 'Assign register class for cross block values according to the divergence.'

Detailed description:

After https://reviews.llvm.org/D59990 submit several issues

[AMDGPU] Come back patch for the 'Assign register class for cross block values according to the divergence.'

Detailed description:

After https://reviews.llvm.org/D59990 submit several issues were discovered.
Changes in common code were preserved but AMDGPU specific part was reverted to keep the backend working correctly.

Discovered issues were addressed in the following commits:

https://reviews.llvm.org/D67662
https://reviews.llvm.org/D67101
https://reviews.llvm.org/D63953
https://reviews.llvm.org/D63731

This change brings back AMDGPU specific changes.

Reviewed by: rampitec, arsenm

Differential Revision: https://reviews.llvm.org/D68635

llvm-svn: 374767

show more ...


Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3
# d2210af3 14-Jun-2019 Guozhi Wei <[email protected]>

[MBP] Move a latch block with conditional exit and multi predecessors to top of loop

Current findBestLoopTop can find and move one kind of block to top, a latch block has one successor. Another comm

[MBP] Move a latch block with conditional exit and multi predecessors to top of loop

Current findBestLoopTop can find and move one kind of block to top, a latch block has one successor. Another common case is:

* a latch block
* it has two successors, one is loop header, another is exit
* it has more than one predecessors

If it is below one of its predecessors P, only P can fall through to it, all other predecessors need a jump to it, and another conditional jump to loop header. If it is moved before loop header, all its predecessors jump to it, then fall through to loop header. So all its predecessors except P can reduce one taken branch.

Differential Revision: https://reviews.llvm.org/D43256

llvm-svn: 363471

show more ...


# 68a2fef9 13-Jun-2019 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] gfx1010 wave32 icmp/fcmp intrinsic changes for wave32

Differential Revision: https://reviews.llvm.org/D63301

llvm-svn: 363339


Revision tags: llvmorg-8.0.1-rc2
# 37bd9bd1 06-Jun-2019 Alexander Timofeev <[email protected]>

[AMDGPU] Partial revert for the ba447bae7448435c9986eece0811da1423972fdd

"Divergence driven ISel. Assign register class for cross block values
according to the divergence."
that dis

[AMDGPU] Partial revert for the ba447bae7448435c9986eece0811da1423972fdd

"Divergence driven ISel. Assign register class for cross block values
according to the divergence."
that discovered the design flaw leading to several issues that
required to be solved before.

This change reverts AMDGPU specific changes and keeps common part
unaffected.

llvm-svn: 362749

show more ...


# ba447bae 26-May-2019 Alexander Timofeev <[email protected]>

[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.

Details: To make instruction selection really divergence driven it is necessary to assi

[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.

Details: To make instruction selection really divergence driven it is necessary to assign
the correct register classes to the cross block values beforehand. For the divergent targets
same value type requires different register classes dependent on the value divergence.

Reviewers: rampitec, nhaehnle

Differential Revision: https://reviews.llvm.org/D59990

This commit was reverted because of the build failure.
The reason was mlformed patch.
Build failure fixed.

llvm-svn: 361741

show more ...


# 3b937374 25-May-2019 Peter Collingbourne <[email protected]>

Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence."

Broke sanitizer bots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-

Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence."

Broke sanitizer bots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio

llvm-svn: 361688

show more ...


# dffedea0 24-May-2019 Alexander Timofeev <[email protected]>

[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.

Details: To make instruction selection really divergence driven it is necessary to assign

[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.

Details: To make instruction selection really divergence driven it is necessary to assign
the correct register classes to the cross block values beforehand. For the divergent targets
same value type requires different register classes dependent on the value divergence.

Reviewers: rampitec, nhaehnle

Differential Revision: https://reviews.llvm.org/D59990

llvm-svn: 361644

show more ...


Revision tags: llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1
# 814abb59 31-Oct-2018 Nicolai Haehnle <[email protected]>

AMDGPU: Rewrite SILowerI1Copies to always stay on SALU

Summary:
Instead of writing boolean values temporarily into 32-bit VGPRs
if they are involved in PHIs or are observed from outside a loop,
we u

AMDGPU: Rewrite SILowerI1Copies to always stay on SALU

Summary:
Instead of writing boolean values temporarily into 32-bit VGPRs
if they are involved in PHIs or are observed from outside a loop,
we use bitwise masking operations to combine lane masks in a way
that is consistent with wave control flow.

Move SIFixSGPRCopies to before this pass, since that pass
incorrectly attempts to move SGPR phis to VGPRs.

This should recover most of the code quality that was lost with
the bug fix in "AMDGPU: Remove PHI loop condition optimization".

There are still some relevant cases where code quality could be
improved, in particular:

- We often introduce redundant masks with EXEC. Ideally, we'd
have a generic computeKnownBits-like analysis to determine
whether masks are already masked by EXEC, so we can avoid this
masking both here and when lowering uniform control flow.

- The criterion we use to determine whether a def is observed
from outside a loop is conservative: it doesn't check whether
(loop) branch conditions are uniform.

Change-Id: Ibabdb373a7510e426b90deef00f5e16c5d56e64b

Reviewers: arsenm, rampitec, tpr

Subscribers: kzhuravl, jvesely, wdng, mgorny, yaxunl, dstuttard, t-tye, eraman, llvm-commits

Differential Revision: https://reviews.llvm.org/D53496

llvm-svn: 345719

show more ...


# 28212cc6 31-Oct-2018 Nicolai Haehnle <[email protected]>

AMDGPU: Remove PHI loop condition optimization

Summary:
The optimization to early break out of loops if all threads are dead was
never fully implemented.

But the PHI node analyzing is actually caus

AMDGPU: Remove PHI loop condition optimization

Summary:
The optimization to early break out of loops if all threads are dead was
never fully implemented.

But the PHI node analyzing is actually causing a number of problems, so
remove all the extra code for it.

(This does actually regress code quality in a few places because it
ends up relying more heavily on phi's of i1, which we don't do a
great job with. However, since it fixes real bugs in the wild, we
should take this change. I have some prototype changes to improve
i1 lowering in general -- not just for control flow -- which should
help recover the code quality, I just need to make those changes
fit for general consumption. -- Nicolai)

Change-Id: I6fc6c6c8961857ac6009fcfb9f7e5e48dc23fbb1
Patch-by: Christian König <[email protected]>

Reviewers: arsenm, rampitec, tpr

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D53359

llvm-svn: 345718

show more ...


Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3
# 25528d6d 04-Dec-2017 Francis Visoiu Mistrih <[email protected]>

[CodeGen] Unify MBB reference format in both MIR and debug output

As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR n

[CodeGen] Unify MBB reference format in both MIR and debug output

As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix

Differential Revision: https://reviews.llvm.org/D40422

llvm-svn: 319665

show more ...


Revision tags: llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3
# a9487d92 16-Aug-2017 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Eliminate no effect instructions before s_endpgm

Differential Revision: https://reviews.llvm.org/D36585

llvm-svn: 310987


12