History log of /llvm-project-15.0.7/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (Results 1 – 25 of 98)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 4874838a 28-Jun-2022 Piotr Sobczak <[email protected]>

[AMDGPU] gfx11 WMMA instruction support

gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate)
instructions.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D1287

[AMDGPU] gfx11 WMMA instruction support

gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate)
instructions.

Reviewed By: arsenm, #amdgpu

Differential Revision: https://reviews.llvm.org/D128756

show more ...


Revision tags: llvmorg-14.0.6
# 13107c27 16-Jun-2022 Jay Foad <[email protected]>

[AMDGPU] Add support for GFX11 LDSDIR hazards

Detect LDS direct WAR/WAW hazards and compute values for
wait_vdst (va_vdst) parameter. Where appropriate this
raises wait_vdst from the default 0 to a

[AMDGPU] Add support for GFX11 LDSDIR hazards

Detect LDS direct WAR/WAW hazards and compute values for
wait_vdst (va_vdst) parameter. Where appropriate this
raises wait_vdst from the default 0 to allow concurrent
issue of LDS direct with VALU execution.

Also detect LDS direct versus VMEM source VGPR hazards
and insert vm_vsrc=0 waits using s_waitcnt_depctr.

Differential Revision: https://reviews.llvm.org/D127963

show more ...


# 9dff14be 15-Jun-2022 Jay Foad <[email protected]>

[AMDGPU] Add support for GFX11 hazards

Add support for partial stall over EXEC hazard and trans use hazard.

Differential Revision: https://reviews.llvm.org/D127872


Revision tags: llvmorg-14.0.5, llvmorg-14.0.4
# bd9eed3a 22-May-2022 Austin Kerbow <[email protected]>

[AMDGPU] Add isMFMA helper function. NFC

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D127124


# 5c974d08 08-Jun-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Fix hazard handling of v_cmpx to permlane

- VOP3 and SDWA forms of V_CMPX were not handled
- Hazard only exists if the compare defines EXEC (i.e. V_CMPX)
forwarded to the permlane.

Diffe

[AMDGPU] Fix hazard handling of v_cmpx to permlane

- VOP3 and SDWA forms of V_CMPX were not handled
- Hazard only exists if the compare defines EXEC (i.e. V_CMPX)
forwarded to the permlane.

Differential Revision: https://reviews.llvm.org/D127344

show more ...


Revision tags: llvmorg-14.0.3
# 63f21f4c 27-Apr-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Handle LDS DMA and LDS_DIRECT hazards

There shall be 1 wait state between M0 write and LDS DMA/LDS_DIRECT use.

Differential Revision: https://reviews.llvm.org/D124550


Revision tags: llvmorg-14.0.2
# d951d937 13-Apr-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Increate hazard for store dwordx3/4 to 2 waitstates on gfx940

Fixes: SWDEV-327053

Differential Revision: https://reviews.llvm.org/D123687


Revision tags: llvmorg-14.0.1
# f311f934 23-Mar-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] gfx940 VALU hazard recognizer

Differntial Revision: https://reviews.llvm.org/D122339


# 64838ba3 23-Mar-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Use GenericTable to classify DGEMM

Since there is a table introduced for MAI instructions extend it
to use for DGEMM classification.

Differential Revision: https://reviews.llvm.org/D122337


# cad9de71 23-Mar-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] gfx940 MAI hazard recognizer

Differential Revision: https://reviews.llvm.org/D122263


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 1e15adba 04-Mar-2022 Austin Kerbow <[email protected]>

[AMDGPU] Add s_nop WaitStates between neighboring mfma

In some cases padding bubbles between sequential MFMA instructions may
lead to increased inter-wave performance. Add option to request to pad
s

[AMDGPU] Add s_nop WaitStates between neighboring mfma

In some cases padding bubbles between sequential MFMA instructions may
lead to increased inter-wave performance. Add option to request to pad
some portion of these stall cycles with s_nops.

Fixes: SWDEV-326925

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D121437

show more ...


# e9a49c64 17-Mar-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] gfx940 basic speed model

This is incomplete and will handle more instructions as they are added.

Differential Revision: https://reviews.llvm.org/D121966


Revision tags: llvmorg-14.0.0-rc2
# 380ff31d 22-Feb-2022 Thomas Symalla <[email protected]>

[AMDGPU] Fix typo in comment [NFC]

This replaces "V_MOB_B32" with "V_MOV_B32" in some comment.


# 6527b2a4 18-Feb-2022 Sebastian Neubauer <[email protected]>

[AMDGPU][NFC] Fix typos

Fix some typos in the amdgpu backend.

Differential Revision: https://reviews.llvm.org/D119235


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init
# dbf278b9 21-Jan-2022 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Prevent aliasing of SrcC and Dst in MAI

Form the MAI spec: It’s ok that Src_C and vDst are the exact same VGPRs
or Src_C and vDst are completely separated. The case that Src_C and vDst
are

[AMDGPU] Prevent aliasing of SrcC and Dst in MAI

Form the MAI spec: It’s ok that Src_C and vDst are the exact same VGPRs
or Src_C and vDst are completely separated. The case that Src_C and vDst
are overlapping should be avoid as new value could be written to accumulator
input before it gets read.

Note that this inevitably increases register pressure to the point where
some programs will become uncompilable.

This patch separates MAC and FMA versions of MFMA instructions using either
tied dst and src2 or earlyclobber dst.

Fixes: SWDEV-318900

Differential Revision: https://reviews.llvm.org/D117844

show more ...


Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3
# d6b07348 19-Jan-2022 Jim Lin <[email protected]>

[NFC] Use Register instead of unsigned


Revision tags: llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# 661a232e 19-Nov-2021 Stanislav Mekhanoshin <[email protected]>

[AMDGPU] Remove a no-op check in the gfx90a hazard recognizer

Also rename helper function accordingly.

Differential Revision: https://reviews.llvm.org/D114289


# d1f45ed5 11-Nov-2021 Neubauer, Sebastian <[email protected]>

[AMDGPU][NFC] Fix typos

Differential Revision: https://reviews.llvm.org/D113672


Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2
# 4f5ba46e 17-Aug-2021 Christudasan Devadasan <[email protected]>

[AMDGPU] Set wait state for meta instructions to zero

It looked more reasonable to set the wait state to
zero for all non-instructions. With that we can avoid
the special handling for them in `getWa

[AMDGPU] Set wait state for meta instructions to zero

It looked more reasonable to set the wait state to
zero for all non-instructions. With that we can avoid
the special handling for them in `getWaitStatesSince`
and `AdvanceCycle`. This NFC patch makes the handling
more generic.

show more ...


# 68660767 13-Aug-2021 Christudasan Devadasan <[email protected]>

[AMDGPU] Skip pseudo MIs in hazard recognizer

Instructions like WAVE_BARRIER and SI_MASKED_UNREACHABLE
are only placeholders to prevent certain unwanted
transformations and will get discarded during

[AMDGPU] Skip pseudo MIs in hazard recognizer

Instructions like WAVE_BARRIER and SI_MASKED_UNREACHABLE
are only placeholders to prevent certain unwanted
transformations and will get discarded during assembly
emission. They should not be counted during nop insertion.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D108022

show more ...


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# e0c382a9 14-Jun-2021 Piotr Sobczak <[email protected]>

[AMDGPU] Limit runs of fixLdsBranchVmemWARHazard

The code in fixLdsBranchVmemWARHazard looks for patterns of a vmem/lds
access followed by a branch, followed by an lds/vmem access.

The handling of

[AMDGPU] Limit runs of fixLdsBranchVmemWARHazard

The code in fixLdsBranchVmemWARHazard looks for patterns of a vmem/lds
access followed by a branch, followed by an lds/vmem access.

The handling of the hazard requires an arbitrary number of instructions
to process. In the worst case where a function has a vmem access, but no lds
accesses, all instructions are examined only to conclude that the hazard
cannot occur.

Add the pre-processing stage which detects if there is both lds and vmem
present in the function and only then does the more costly search.

This patch significantly improves compilation time in the cases the hazard
cannot happen. In one pathological case I looked at IsHazardInst is needlesly
called 88.6 milions times.

The numbers could also be improved by introducing a map around the
inner calls to ::getWaitStatesSince in fixLdsBranchVmemWARHazard, but
nothing will beat not running fixLdsBranchVmemWARHazard at all in the cases
detected by shouldRunLdsBranchVmemWARHazardFixup().

Differential Revision: https://reviews.llvm.org/D104219

show more ...


Revision tags: llvmorg-12.0.1-rc1
# f251379a 30-Apr-2021 Jay Foad <[email protected]>

[AMDGPU] Simplify getWaitStatesSince. NFC.


# 424f1f6f 30-Apr-2021 Carl Ritson <[email protected]>

[AMDGPU][NFC] Refactor hazard recognition IsHazardFn and IsExpiredFn

Refactor IsHazardFn and IsExpiredFn to use constant references as these should not be mutating the instructions visited and the i

[AMDGPU][NFC] Refactor hazard recognition IsHazardFn and IsExpiredFn

Refactor IsHazardFn and IsExpiredFn to use constant references as these should not be mutating the instructions visited and the instruction can never be null.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D101430

show more ...


# 749702fc 29-Apr-2021 Carl Ritson <[email protected]>

[AMDGPU] Remove dead early-out in GCNHazardRecognizer

Remove an early-out in wait state counting which can never be
taken.

Reviewed By: foad, rampitec

Differential Revision: https://reviews.llvm.o

[AMDGPU] Remove dead early-out in GCNHazardRecognizer

Remove an early-out in wait state counting which can never be
taken.

Reviewed By: foad, rampitec

Differential Revision: https://reviews.llvm.org/D101520

show more ...


# 12011b52 27-Apr-2021 Jay Foad <[email protected]>

[AMDGPU] GCNHazardRecognizer: ignore all meta instructions

This is hopefully NFC, but should be more robust in ignoring all
instructions that should be ignored, instead of just some of them.

Differ

[AMDGPU] GCNHazardRecognizer: ignore all meta instructions

This is hopefully NFC, but should be more robust in ignoring all
instructions that should be ignored, instead of just some of them.

Differential Revision: https://reviews.llvm.org/D101372

show more ...


1234