|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
2789c4f5 |
| 26-Jul-2022 |
Kazu Hirata <[email protected]> |
[mlir] Use value_or (NFC)
|
|
Revision tags: llvmorg-14.0.6 |
|
| #
8df54a6a |
| 19-Jun-2022 |
Jacques Pienaar <[email protected]> |
[mlir] Update accessors to prefixed form (NFC)
Follow up from flipping dialects to both, flip accessor used to prefixed variant ahead to flipping from _Both to _Prefixed. This just flips to the acce
[mlir] Update accessors to prefixed form (NFC)
Follow up from flipping dialects to both, flip accessor used to prefixed variant ahead to flipping from _Both to _Prefixed. This just flips to the accessors introduced in the preceding change which are just prefixed forms of the existing accessor changed from.
Mechanical change using helper script https://github.com/jpienaar/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/AddGetterCheck.cpp and clang-format.
show more ...
|
|
Revision tags: llvmorg-14.0.5 |
|
| #
51b925df |
| 07-Jun-2022 |
Christopher Bate <[email protected]> |
[mlir][nvgpu] shared memory access optimization pass
This change adds a transformation and pass to the NvGPU dialect that attempts to optimize reads/writes from a memref representing GPU shared mem
[mlir][nvgpu] shared memory access optimization pass
This change adds a transformation and pass to the NvGPU dialect that attempts to optimize reads/writes from a memref representing GPU shared memory in order to avoid bank conflicts. Given a value representing a shared memory memref, it traverses all reads/writes within the parent op and, subject to suitable conditions, rewrites all last dimension index values such that element locations in the final (col) dimension are given by `newColIdx = col % vecSize + perm[row](col/vecSize,row)` where `perm` is a permutation function indexed by `row` and `vecSize` is the vector access size in elements (currently assumes 128bit vectorized accesses, but this can be made a parameter). This specific transformation can help optimize typical distributed & vectorized accesses common to loading matrix multiplication operands to/from shared memory.
Differential Revision: https://reviews.llvm.org/D127457
show more ...
|
| #
d7ef488b |
| 09-Jun-2022 |
Mogball <[email protected]> |
[mlir][gpu] Move GPU headers into IR/ and Transforms/
Depends on D127350
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D127352
|
|
Revision tags: llvmorg-14.0.4 |
|
| #
7085cb60 |
| 17-May-2022 |
Christopher Bate <[email protected]> |
[mlir][NvGpuToNVVM] Fix byte size calculation in async copy lowering
AsyncCopyOp lowering converted "size in elements" to "size in bytes" assuming the element type size is at least one byte. This re
[mlir][NvGpuToNVVM] Fix byte size calculation in async copy lowering
AsyncCopyOp lowering converted "size in elements" to "size in bytes" assuming the element type size is at least one byte. This removes that restriction, allowing for types such as i4 and b1 to be handled correctly.
Differential Revision: https://reviews.llvm.org/D125838
show more ...
|
| #
334f63e7 |
| 20-May-2022 |
Christopher Bate <[email protected]> |
[mlir][NvGpuToNVVM] Fix missing i4 support for nvgpu.mma.sync
This changes adds missing support for the i4 data type. Tests are added to ensure proper lowering of an nvgpu.mma.sync operation targeti
[mlir][NvGpuToNVVM] Fix missing i4 support for nvgpu.mma.sync
This changes adds missing support for the i4 data type. Tests are added to ensure proper lowering of an nvgpu.mma.sync operation targeting the 16x8x64xi4 and 16x8x32xi4 MMA variants in the NVVM dialect.
Differential Revision: https://reviews.llvm.org/D126092
show more ...
|
| #
15bcc36e |
| 09-May-2022 |
Thomas Raoux <[email protected]> |
[mlir][gpu] Move async copy ops to NVGPU and add caching hints
Move async copy operations to NVGPU as they only exist on NV target and are designed to match ptx semantic. This allows us to also add
[mlir][gpu] Move async copy ops to NVGPU and add caching hints
Move async copy operations to NVGPU as they only exist on NV target and are designed to match ptx semantic. This allows us to also add more fine grain caching hint attribute to the op. Add hint to bypass L1 and hook it up to NVVM op.
Differential Revision: https://reviews.llvm.org/D125244
show more ...
|
| #
98798073 |
| 05-May-2022 |
Christopher Bate <[email protected]> |
[mlir][NvGpu] Fix nvgpu.mma.sync lowering to NVVM for f32, tf32 types
Adds missing logic in the lowering from NvGPU to NVVM to support fp32 (in an accumulator operand) and tf32 (in multiplicand oper
[mlir][NvGpu] Fix nvgpu.mma.sync lowering to NVVM for f32, tf32 types
Adds missing logic in the lowering from NvGPU to NVVM to support fp32 (in an accumulator operand) and tf32 (in multiplicand operand) types. Fixes logic in one of the helper functions for converting the result of a mma.sync operation with multiple 8x256bit output tiles, which is the case for f32 outputs.
Differential Revision: https://reviews.llvm.org/D124533
show more ...
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2 |
|
| #
894a591c |
| 14-Apr-2022 |
Thomas Raoux <[email protected]> |
[mlir][nvgpu] Move mma.sync and ldmatrix in nvgpu dialect
Move gpu operation mma.sync and ldmatrix in nvgpu as they are specific to nvidia target.
Differential Revision: https://reviews.llvm.org/D1
[mlir][nvgpu] Move mma.sync and ldmatrix in nvgpu dialect
Move gpu operation mma.sync and ldmatrix in nvgpu as they are specific to nvidia target.
Differential Revision: https://reviews.llvm.org/D123824
show more ...
|