[mlir] (NFC) Clean up bazel and CMake target names. All dialect targets in bazel have been named *Dialect and all dialect targets in CMake have been named MLIR*Dialect.
[mlir] Refactor DialectRegistry delayed interface support into a general DialectExtension mechanism. The current dialect registry allows for attaching delayed interfaces, which are added to attrs/dialects/ops/etc. when the owning dialect gets loaded. This is clunky for quite a few reasons, e.g. each interface type has a separate tracking structure, and is also quite limiting. This commit refactors this delayed mutation of dialect constructs into a more general DialectExtension mechanism. This mechanism is essentially a registration callback that is invoked when a set of dialects have been loaded. This allows for attaching interfaces directly on the loaded constructs, and also allows for loading new dependent dialects. The latter is extremely useful, as it will now enable dependent dialects to only apply in the contexts in which they are necessary. For example, a dialect dependency can now be conditional on whether a user actually needs the interface that relies on it. Differential Revision: https://reviews.llvm.org/D120367
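
The callback-on-load idea in this commit message can be illustrated with a small standalone model. This is only a sketch: `ToyContext`, `ToyExtension`, and `ToyRegistry` are hypothetical names invented here and do not reflect MLIR's actual `DialectRegistry` or `DialectExtension` API. It shows an extension that waits for a set of dialects, runs once when all of them are loaded, and may itself load a further dependent dialect.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy model only: hypothetical types, NOT MLIR's classes.
struct ToyContext {
  std::set<std::string> loadedDialects;
  std::vector<std::string> attachedInterfaces; // records what extensions did
};

// An extension is a callback plus the set of dialects it waits for.
struct ToyExtension {
  std::set<std::string> requiredDialects;
  std::function<void(ToyContext &)> apply;
  bool applied;
};

class ToyRegistry {
public:
  void addExtension(std::set<std::string> required,
                    std::function<void(ToyContext &)> apply) {
    assert(apply && "extension callback must be callable");
    extensions.push_back({std::move(required), std::move(apply), false});
  }

  // Load a dialect, then run every pending extension whose required
  // dialects are now all loaded. Each extension runs at most once.
  void loadDialect(ToyContext &ctx, const std::string &name) {
    if (!ctx.loadedDialects.insert(name).second)
      return; // already loaded, nothing new can become ready
    // Index-based loop: a callback may re-enter loadDialect or add
    // extensions, which could invalidate iterators.
    for (std::size_t i = 0; i < extensions.size(); ++i) {
      if (extensions[i].applied)
        continue;
      bool ready = true;
      for (const std::string &dialect : extensions[i].requiredDialects) {
        if (ctx.loadedDialects.count(dialect) == 0) {
          ready = false;
          break;
        }
      }
      if (!ready)
        continue;
      extensions[i].applied = true;
      // Copy the callback first so it stays valid if the vector grows.
      std::function<void(ToyContext &)> callback = extensions[i].apply;
      callback(ctx);
    }
  }

private:
  std::vector<ToyExtension> extensions;
};
```

In this model, an extension registered for dialects "a" and "b" does nothing when only "a" is loaded; once "b" is loaded as well, its callback runs and can call `loadDialect` for a dependent dialect, which is the conditional-dependency behavior the message describes.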
Fix clang-tidy issues in mlir/ (NFC). Reviewed By: ftynse. Differential Revision: https://reviews.llvm.org/D115956
[MLIR][GPU] Make max flat work group size for ROCDL kernels configurable. While the default value for the amdgpu-flat-work-group-size attribute, "1, 256", matches the defaults from Clang, some users of the ROCDL dialect, namely Tensorflow, use larger workgroups, such as 1024. Therefore, instead of hardcoding this value, we add a rocdl.max_flat_work_group_size attribute that can be set on GPU kernels to override the default value. Reviewed By: whchung. Differential Revision: https://reviews.llvm.org/D115741
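
The default-with-override behavior described above can be sketched in a few lines. This is an illustration under assumptions, not the code from the commit: the function name `flatWorkGroupSizeValue` and the use of a `std::map` to stand in for a kernel's attributes are hypothetical, and the "1, N" string layout simply follows the spelling used in the commit message.

```cpp
#include <map>
#include <string>

// Hypothetical helper: computes the value for amdgpu-flat-work-group-size.
// The map stands in for a kernel's attribute dictionary (an assumption made
// for this sketch; the real lowering works on MLIR attributes).
std::string flatWorkGroupSizeValue(const std::map<std::string, int> &kernelAttrs) {
  int maxSize = 256; // default named in the commit message
  auto it = kernelAttrs.find("rocdl.max_flat_work_group_size");
  if (it != kernelAttrs.end())
    maxSize = it->second; // per-kernel override
  return "1, " + std::to_string(maxSize);
}
```

A kernel without the attribute keeps the default upper bound, while one carrying `rocdl.max_flat_work_group_size = 1024` gets the larger bound.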
[MLIR][GPU] Define gpu.printf op and its lowerings
- Define a gpu.printf op, which can be lowered to any GPU printf() support (which is present in CUDA, HIP, and OpenCL). This op only supports constant format strings and scalar arguments.
- Define the lowering of gpu.printf to a call to printf() (which is what is required for AMD GPUs when using OpenCL) as well as to the hostcall interface present in the AMD Open Compute device library, which is the interface present when kernels are running under HIP.
- Add a "runtime" enum that allows specifying which of the possible runtimes a ROCDL kernel will be executed under or that the runtime is unknown. This enum controls how gpu.printf is lowered.
This change does not enable lowering for Nvidia GPUs, but such a lowering should be possible in principle.
And:
[MLIR][AMDGPU] Always set amdgpu-implicitarg-num-bytes=56 on kernels
This is something that Clang always sets on both OpenCL and HIP kernels, and failing to include it causes mysterious crashes with printf() support.
In addition, revert the max-flat-work-group-size to (1, 256) to avoid triggering bugs in the AMDGPU backend.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D110448
Adjust "end namespace" comment in MLIR to match new agreed coding style. See D115115 and this mailing list discussion: https://lists.llvm.org/pipermail/llvm-dev/2021-December/154199.html Differential Revision: https://reviews.llvm.org/D115309
[mlir] Convert NamedAttribute to be a class. NamedAttribute is currently represented as a std::pair, but this creates an extremely clunky .first/.second API. This commit converts it to a class, with better accessors (getName/getValue), and also opens the door for a more convenient API in the future. Differential Revision: https://reviews.llvm.org/D113956
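
The readability difference between a pair and a class with named accessors can be shown with a toy comparison. The types below are hypothetical stand-ins (`ToyPairAttr`, `ToyNamedAttr`, with a string name and an integer value); only the accessor names getName/getValue come from the commit message, and MLIR's real NamedAttribute holds different types.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration; not MLIR's actual types.

// Before: a bare pair, read through .first and .second.
using ToyPairAttr = std::pair<std::string, int>;

// After: a small class whose accessors say what each field means.
class ToyNamedAttr {
public:
  ToyNamedAttr(std::string name, int value)
      : name(std::move(name)), value(value) {
    assert(!this->name.empty() && "attribute name must not be empty");
  }

  const std::string &getName() const { return name; }
  int getValue() const { return value; }

private:
  std::string name;
  int value;
};
```

With the pair, a call site reads `attr.first` and `attr.second`; with the class, the same call site reads `attr.getName()` and `attr.getValue()`, and the class leaves room to add further helpers later without touching callers.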
[mlir] make implementations of translation to LLVM IR interfaces private. There is no need for the interface implementations to be exposed; opaque registration functions are sufficient for all users, similarly to passes. Reviewed By: mehdi_amini. Differential Revision: https://reviews.llvm.org/D97852
[mlir] add verifiers for NVVM and ROCDL kernel attributes. Make sure they can only be attached to LLVM functions as a result of converting GPU functions to the LLVM Dialect.
[mlir] Use the interface-based translation for LLVM "intrinsic" dialects. Port the translation of five dialects that define LLVM IR intrinsics (LLVMAVX512, LLVMArmNeon, LLVMArmSVE, NVVM, ROCDL) to the new dialect interface-based mechanism. This allows us to remove individual translations that were created for each of these dialects and just use one common MLIR-to-LLVM-IR translation that potentially supports all dialects instead, based on what is registered and including any combination of translatable dialects. This removal was one of the main goals of the refactoring. To support the addition of GPU-related metadata, the translation interface is extended with the `amendOperation` function that allows the interface implementation to post-process any translated operation with dialect attributes from the dialect for which the interface is implemented regardless of the operation's dialect. This is currently applied to "kernel" functions, but can be used to construct other metadata in dialect-specific ways without necessarily affecting operations. Depends On D96591, D96504. Reviewed By: nicolasvasilache. Differential Revision: https://reviews.llvm.org/D96592