[mlir][NVGPU] Verifier for nvgpu.ldmatrix

* Adds verifiers for `nvgpu.ldmatrix` op
* Adds tests to `mlir/test/Dialect/NVGPU/invalid.mlir`

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D129669

[mlir][NVGPU] Verifiers for nvgpu.mma.sync Op

- Adds verification for `nvgpu.mma.sync` op
- Adds tests to `mlir/test/Dialect/NVGPU/invalid.mlir`
- `nvgpu.mma.sync` verifier caught a bug and triggered a failure in m16n8k4_tf32_f32 variant in `mlir/test/Conversion/NVGPUToNVVM/nvgpu-to-nvvm.mlir`
  - The output shape of vector holding thread-level accumulators was inconsistent and fixed in this change

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D129400

[mlir] Flip accessors to prefixed form (NFC)

Another mechanical sweep to keep diff small for flip to _Prefixed.

[mlir] Update accessors to prefixed form (NFC)

Follow up from flipping dialects to both, flip accessor used to prefixed variant ahead to flipping from _Both to _Prefixed. This just flips to the accessors introduced in the preceding change which are just prefixed forms of the existing accessor changed from.

Mechanical change using helper script https://github.com/jpienaar/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/AddGetterCheck.cpp and clang-format.

[mlir][nvgpu] fix MSVC warning regarding left shift

Differential Revision: https://reviews.llvm.org/D128088

[mlir][nvgpu] fix missing build dependency for NVGPUTransforms

Fixes build failure caused by 51b925df941a66349deff2467203acc200de5e78

[mlir][nvgpu] shared memory access optimization pass

This change adds a transformation and pass to the NvGPU dialect that attempts to optimize reads/writes from a memref representing GPU shared memory in order to avoid bank conflicts. Given a value representing a shared memory memref, it traverses all reads/writes within the parent op and, subject to suitable conditions, rewrites all last dimension index values such that element locations in the final (col) dimension are given by

`newColIdx = col % vecSize + perm[row](col/vecSize,row)`

where `perm` is a permutation function indexed by `row` and `vecSize` is the vector access size in elements (currently assumes 128-bit vectorized accesses, but this can be made a parameter). This specific transformation can help optimize typical distributed & vectorized accesses common to loading matrix multiplication operands to/from shared memory.

Differential Revision: https://reviews.llvm.org/D127457

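The index formula in the commit message above can be illustrated with a small standalone sketch. The message does not spell out `perm`, so this sketch assumes an XOR-based swizzle of the vector index with the row, which is one common choice; it is not necessarily what the pass implements. The names `permuteVector`, `swizzledColIdx`, `vecsPerRow`, and `rowWidth` are hypothetical, and the sketch assumes `rowWidth / vecSize` is a power of two so that the XOR stays within the row.

```cpp
#include <cstdint>

// Hypothetical stand-in for perm[row](col / vecSize, row): XOR the vector
// index with the row (wrapped to the number of vectors per row) and scale
// the result back to an element offset. This XOR choice is an assumption.
int64_t permuteVector(int64_t vecIdx, int64_t row, int64_t vecsPerRow,
                      int64_t vecSize) {
  return (vecIdx ^ (row % vecsPerRow)) * vecSize;
}

// newColIdx = col % vecSize + perm[row](col / vecSize, row)
// The position inside a vector is kept; only whole vectors are moved, so
// vectorized accesses remain contiguous.
int64_t swizzledColIdx(int64_t row, int64_t col, int64_t vecSize,
                       int64_t rowWidth) {
  // Assumed to be a power of two (e.g. 32 elements / 8 per vector = 4).
  int64_t vecsPerRow = rowWidth / vecSize;
  return col % vecSize + permuteVector(col / vecSize, row, vecsPerRow, vecSize);
}
```

With `vecSize = 8` (eight 16-bit elements per 128-bit access) and `rowWidth = 32`, column 5 stays at 5 in row 0 but moves to 13 in row 1, so the same logical column lands in a different vector slot on each row.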
[mlir] (NFC) Clean up bazel and CMake target names

All dialect targets in bazel have been named *Dialect and all dialect targets in CMake have been named MLIR*Dialect.

[mlir][gpu] Move GPU headers into IR/ and Transforms/

Depends on D127350

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127352

[mlir][gpu] Move async copy ops to NVGPU and add caching hints

Move async copy operations to NVGPU as they only exist on NV targets and are designed to match PTX semantics. This also allows us to add a more fine-grained caching hint attribute to the op.

Add hint to bypass L1 and hook it up to NVVM op.

Differential Revision: https://reviews.llvm.org/D125244

[mlir][nvgpu] Add NVGPU dialect (architectural specific gpu dialect)

This introduces a new dialect for vendor-specific PTX operations. This also adds the first operation, ldmatrix, as an example. More operations will be added in follow-up patches.

This new dialect is meant to be a bridge between the GPU and Vector dialects and the NVVM dialect.

This is based on the RFC proposed here:
https://discourse.llvm.org/t/rfc-add-nv-gpu-dialect-hw-specific-extension-of-gpu-dialect-for-nvidia-gpus/61466/8

Differential Revision: https://reviews.llvm.org/D123266