History log of /llvm-project-15.0.7/mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp (Results 1 – 9 of 9)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 2789c4f5 26-Jul-2022 Kazu Hirata <[email protected]>

[mlir] Use value_or (NFC)


Revision tags: llvmorg-14.0.6
# 8df54a6a 19-Jun-2022 Jacques Pienaar <[email protected]>

[mlir] Update accessors to prefixed form (NFC)

Follow up from flipping dialects to both, flip accessor used to prefixed
variant ahead to flipping from _Both to _Prefixed. This just flips to
the acce

[mlir] Update accessors to prefixed form (NFC)

Follow up from flipping dialects to both, flip accessor used to prefixed
variant ahead to flipping from _Both to _Prefixed. This just flips to
the accessors introduced in the preceding change which are just prefixed
forms of the existing accessor changed from.

Mechanical change using helper script
https://github.com/jpienaar/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/AddGetterCheck.cpp and clang-format.

show more ...


Revision tags: llvmorg-14.0.5
# 51b925df 07-Jun-2022 Christopher Bate <[email protected]>

[mlir][nvgpu] shared memory access optimization pass

This change adds a transformation and pass to the NvGPU dialect that
attempts to optimize reads/writes from a memref representing GPU shared
mem

[mlir][nvgpu] shared memory access optimization pass

This change adds a transformation and pass to the NvGPU dialect that
attempts to optimize reads/writes from a memref representing GPU shared
memory in order to avoid bank conflicts. Given a value representing a
shared memory memref, it traverses all reads/writes within the parent op
and, subject to suitable conditions, rewrites all last dimension index
values such that element locations in the final (col) dimension are
given by
`newColIdx = col % vecSize + perm[row](col/vecSize,row)`
where `perm` is a permutation function indexed by `row` and `vecSize`
is the vector access size in elements (currently assumes 128bit
vectorized accesses, but this can be made a parameter). This specific
transformation can help optimize typical distributed & vectorized accesses
common to loading matrix multiplication operands to/from shared memory.

Differential Revision: https://reviews.llvm.org/D127457

show more ...


# d7ef488b 09-Jun-2022 Mogball <[email protected]>

[mlir][gpu] Move GPU headers into IR/ and Transforms/

Depends on D127350

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127352


Revision tags: llvmorg-14.0.4
# 7085cb60 17-May-2022 Christopher Bate <[email protected]>

[mlir][NvGpuToNVVM] Fix byte size calculation in async copy lowering

AsyncCopyOp lowering converted "size in elements" to "size in bytes"
assuming the element type size is at least one byte. This re

[mlir][NvGpuToNVVM] Fix byte size calculation in async copy lowering

AsyncCopyOp lowering converted "size in elements" to "size in bytes"
assuming the element type size is at least one byte. This removes
that restriction, allowing for types such as i4 and b1 to be handled
correctly.

Differential Revision: https://reviews.llvm.org/D125838

show more ...


# 334f63e7 20-May-2022 Christopher Bate <[email protected]>

[mlir][NvGpuToNVVM] Fix missing i4 support for nvgpu.mma.sync

This changes adds missing support for the i4 data type. Tests are added
to ensure proper lowering of an nvgpu.mma.sync operation targeti

[mlir][NvGpuToNVVM] Fix missing i4 support for nvgpu.mma.sync

This changes adds missing support for the i4 data type. Tests are added
to ensure proper lowering of an nvgpu.mma.sync operation targeting the
16x8x64xi4 and 16x8x32xi4 MMA variants in the NVVM dialect.

Differential Revision: https://reviews.llvm.org/D126092

show more ...


# 15bcc36e 09-May-2022 Thomas Raoux <[email protected]>

[mlir][gpu] Move async copy ops to NVGPU and add caching hints

Move async copy operations to NVGPU as they only exist on NV target and are
designed to match ptx semantic. This allows us to also add

[mlir][gpu] Move async copy ops to NVGPU and add caching hints

Move async copy operations to NVGPU as they only exist on NV target and are
designed to match ptx semantic. This allows us to also add more fine grain
caching hint attribute to the op.
Add hint to bypass L1 and hook it up to NVVM op.

Differential Revision: https://reviews.llvm.org/D125244

show more ...


# 98798073 05-May-2022 Christopher Bate <[email protected]>

[mlir][NvGpu] Fix nvgpu.mma.sync lowering to NVVM for f32, tf32 types

Adds missing logic in the lowering from NvGPU to NVVM to support fp32
(in an accumulator operand) and tf32 (in multiplicand oper

[mlir][NvGpu] Fix nvgpu.mma.sync lowering to NVVM for f32, tf32 types

Adds missing logic in the lowering from NvGPU to NVVM to support fp32
(in an accumulator operand) and tf32 (in multiplicand operand) types.
Fixes logic in one of the helper functions for converting the result
of a mma.sync operation with multiple 8x256bit output tiles, which is
the case for f32 outputs.

Differential Revision: https://reviews.llvm.org/D124533

show more ...


Revision tags: llvmorg-14.0.3, llvmorg-14.0.2
# 894a591c 14-Apr-2022 Thomas Raoux <[email protected]>

[mlir][nvgpu] Move mma.sync and ldmatrix in nvgpu dialect

Move gpu operation mma.sync and ldmatrix in nvgpu as they are specific
to nvidia target.

Differential Revision: https://reviews.llvm.org/D1

[mlir][nvgpu] Move mma.sync and ldmatrix in nvgpu dialect

Move gpu operation mma.sync and ldmatrix in nvgpu as they are specific
to nvidia target.

Differential Revision: https://reviews.llvm.org/D123824

show more ...