[mlir] Remove VectorToROCDL

Between issues such as https://github.com/llvm/llvm-project/issues/56323, the fact that this lowering (unlike the code in amdgpu-to-rocdl) does not correctly set up bounds checks (and thus will cause page faults on reads that might need to be padded instead), and that fixing these problems would, essentially, involve replicating amdgpu-to-rocdl, remove --vector-to-rocdl for being broken. In addition, the lowering does not support many aspects of transfer_{read,write}, like supervectors, and may not work correctly in their presence.

We (the MLIR-based convolution generator at AMD) do not use this conversion pass, nor are we aware of any other clients.

Migration strategies:
- Use VectorToLLVM
- If buffer ops are particularly needed in your application, use amdgpu.raw_buffer_{load,store}

A VectorToAMDGPU pass may be introduced in the future.

Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D129308

[mlir] Fix the names of exported functions

The names of the functions that are supposed to be exported do not match the implementations. This is due in part to https://github.com/llvm/llvm-project/commit/cac7aabbd8236bef2909bfc0dbba17644f7aaade.

This change makes the implementations and declarations match and adds a couple of missing declarations.

The new names follow the pattern of the existing `verify` functions, where the prefix is maintained as `_mlir_ciface_` but the suffix follows the new naming convention.

Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D124891

[mlir][NFC] Update textual references of `func` to `func.func` in Integration tests

The special case parsing of `func` operations is being removed.

[mlir] Split out a new ControlFlow dialect from Standard

This dialect is intended to model lower-level, branch-based control-flow constructs. The initial set of operations is: AssertOp, BranchOp, CondBranchOp, SwitchOp; all split out from the current standard dialect.

See https://discourse.llvm.org/t/standard-dialect-the-final-chapter/6061

Differential Revision: https://reviews.llvm.org/D118966

[mlir] Replace StrEnumAttr -> EnumAttr in core dialects

Removes uses of `StrEnumAttr` in core dialects.

Reviewed By: mehdi_amini, rriddle
Differential Revision: https://reviews.llvm.org/D117514

[MLIR][GPU] Define gpu.printf op and its lowerings

- Define a gpu.printf op, which can be lowered to any GPU printf() support (which is present in CUDA, HIP, and OpenCL). This op only supports constant format strings and scalar arguments.
- Define the lowering of gpu.printf to a call to printf() (which is what is required for AMD GPUs when using OpenCL) as well as to the hostcall interface present in the AMD Open Compute device library, which is the interface present when kernels are running under HIP.
- Add a "runtime" enum that allows specifying which of the possible runtimes a ROCDL kernel will be executed under, or that the runtime is unknown. This enum controls how gpu.printf is lowered.

This change does not enable lowering for Nvidia GPUs, but such a lowering should be possible in principle.

And:

[MLIR][AMDGPU] Always set amdgpu-implicitarg-num-bytes=56 on kernels

This is something that Clang always sets on both OpenCL and HIP kernels, and failing to include it causes mysterious crashes with printf() support.

In addition, revert the max-flat-work-group-size to (1, 256) to avoid triggering bugs in the AMDGPU backend.

Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D110448

[MLIR] Make the ROCM integration tests runnable

- Move the #defines to the GPU Transform library from GPU Ops so that SerializeToHsaco is non-trivially compiled
- Add required includes to SerializeToHsaco
- Move MCSubtargetInfo creation to the correct point in the compilation process
- Change mlir in ROCM tests to account for renamed/moved ops

Differential Revision: https://reviews.llvm.org/D114184

[MLIR][GPU] Add target arguments to SerializeToHsaco

Compiling code for AMD GPUs requires knowledge of which chipset is being targeted, especially if the code uses chipset-specific intrinsics (which is the case in a downstream convolution generator). This commit adds `target`, `chipset` and `features` arguments to the SerializeToHsaco constructor to enable passing in this required information.

It also amends the ROCm integration tests to pass in the target chipset, which is set to the chipset of the first GPU on the system executing the tests.

Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D114107

[MLIR] Replace std ops with arith dialect ops

Precursor: https://reviews.llvm.org/D110200

Removed redundant ops from the standard dialect that were moved to the `arith` or `math` dialects.

Renamed all instances of operations in the codebase and in tests.

Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D110797

[mlir] Remove mlir-rocm-runner

This change combines for ROCm what was done for CUDA in D97463, D98203, D98360, and D98396.

I did not try to compile SerializeToHsaco.cpp or test mlir/test/Integration/GPU/ROCM because I don't have an AMD card. I fixed the things that had obvious bit-rot, though.

Reviewed By: whchung
Differential Revision: https://reviews.llvm.org/D98447