NVGPUToNVVM.cpp - OpenGrok history log for /llvm-project-15.0.7/mlir/lib/Conversion/NVGPUToNVVM/NVGPUToNVVM.cpp

Revision	Date	Author	Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 2789c4f5	26-Jul-2022	Kazu Hirata <[email protected]>	[mlir] Use value_or (NFC)
Revision tags: llvmorg-14.0.6
# 8df54a6a	19-Jun-2022	Jacques Pienaar <[email protected]>	[mlir] Update accessors to prefixed form (NFC) Follow up from flipping dialects to both, flip accessor used to prefixed variant ahead to flipping from _Both to _Prefixed. This just flips to the accessors introduced in the preceding change which are just prefixed forms of the existing accessor changed from. Mechanical change using helper script https://github.com/jpienaar/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/AddGetterCheck.cpp and clang-format.
Revision tags: llvmorg-14.0.5
# 51b925df	07-Jun-2022	Christopher Bate <[email protected]>	[mlir][nvgpu] shared memory access optimization pass This change adds a transformation and pass to the NvGPU dialect that attempts to optimize reads/writes from a memref representing GPU shared memory in order to avoid bank conflicts. Given a value representing a shared memory memref, it traverses all reads/writes within the parent op and, subject to suitable conditions, rewrites all last dimension index values such that element locations in the final (col) dimension are given by `newColIdx = col % vecSize + perm[row](col/vecSize,row)` where `perm` is a permutation function indexed by `row` and `vecSize` is the vector access size in elements (currently assumes 128bit vectorized accesses, but this can be made a parameter). This specific transformation can help optimize typical distributed & vectorized accesses common to loading matrix multiplication operands to/from shared memory. Differential Revision: https://reviews.llvm.org/D127457
# d7ef488b	09-Jun-2022	Mogball <[email protected]>	[mlir][gpu] Move GPU headers into IR/ and Transforms/ Depends on D127350 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D127352
Revision tags: llvmorg-14.0.4
# 7085cb60	17-May-2022	Christopher Bate <[email protected]>	[mlir][NvGpuToNVVM] Fix byte size calculation in async copy lowering AsyncCopyOp lowering converted "size in elements" to "size in bytes" assuming the element type size is at least one byte. This removes that restriction, allowing for types such as i4 and b1 to be handled correctly. Differential Revision: https://reviews.llvm.org/D125838
# 334f63e7	20-May-2022	Christopher Bate <[email protected]>	[mlir][NvGpuToNVVM] Fix missing i4 support for nvgpu.mma.sync This changes adds missing support for the i4 data type. Tests are added to ensure proper lowering of an nvgpu.mma.sync operation targeting the 16x8x64xi4 and 16x8x32xi4 MMA variants in the NVVM dialect. Differential Revision: https://reviews.llvm.org/D126092
# 15bcc36e	09-May-2022	Thomas Raoux <[email protected]>	[mlir][gpu] Move async copy ops to NVGPU and add caching hints Move async copy operations to NVGPU as they only exist on NV target and are designed to match ptx semantic. This allows us to also add more fine grain caching hint attribute to the op. Add hint to bypass L1 and hook it up to NVVM op. Differential Revision: https://reviews.llvm.org/D125244
# 98798073	05-May-2022	Christopher Bate <[email protected]>	[mlir][NvGpu] Fix nvgpu.mma.sync lowering to NVVM for f32, tf32 types Adds missing logic in the lowering from NvGPU to NVVM to support fp32 (in an accumulator operand) and tf32 (in multiplicand operand) types. Fixes logic in one of the helper functions for converting the result of a mma.sync operation with multiple 8x256bit output tiles, which is the case for f32 outputs. Differential Revision: https://reviews.llvm.org/D124533
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2
# 894a591c	14-Apr-2022	Thomas Raoux <[email protected]>	[mlir][nvgpu] Move mma.sync and ldmatrix in nvgpu dialect Move gpu operation mma.sync and ldmatrix in nvgpu as they are specific to nvidia target. Differential Revision: https://reviews.llvm.org/D123824
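
Note on revision 7085cb60 (async copy byte size): the sketch below illustrates the kind of arithmetic that commit message describes. It is not code taken from NVGPUToNVVM.cpp. The function names `copySizeInBytes` and `copySizeInBytesPerElement` are hypothetical, and the second function shows only one plausible way an "element is at least one byte" assumption could appear. The whole-byte check is likewise an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the actual MLIR code: converts a copy size given
// in elements to a size in bytes for an element type that is
// `elementBitWidth` bits wide. Multiplying before dividing keeps sub-byte
// element types (4-bit, 1-bit) from contributing zero bytes per element.
int64_t copySizeInBytes(int64_t numElements, int64_t elementBitWidth) {
  int64_t totalBits = numElements * elementBitWidth;
  // Assumption of this sketch: the copy covers a whole number of bytes.
  assert(totalBits % 8 == 0 && "copy size must be a whole number of bytes");
  return totalBits / 8;
}

// Hypothetical illustration of a computation that assumes each element is at
// least one byte wide: the per-element byte size truncates to 0 for any
// element type narrower than 8 bits.
int64_t copySizeInBytesPerElement(int64_t numElements,
                                  int64_t elementBitWidth) {
  return numElements * (elementBitWidth / 8);
}
```

With these definitions, 32 elements of a 4-bit type give 16 bytes from `copySizeInBytes`, while the per-element variant gives 0. For 32-bit elements the two agree.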