|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
b7f93c28 |
| 14-Jul-2022 |
Jeff Niu <[email protected]> |
[mlir] (NFC) run clang-format on all files
|
|
Revision tags: llvmorg-14.0.6 |
|
| #
6d5fc1e3 |
| 21-Jun-2022 |
Kazu Hirata <[email protected]> |
[mlir] Don't use Optional::getValue (NFC)
|
|
Revision tags: llvmorg-14.0.5 |
|
| #
a2cdb979 |
| 02-Jun-2022 |
Krzysztof Drewniak <[email protected]> |
[mlir][AMDGPU] Set ABI version constant when linking device libs
Currently, linking the device libraries requires setting a constant that indicates the code object ABI version the compilation is targeting.
This fixes the MLIR linking process by setting this constant to 400, which is the value corresponding to the current code object ABI default, version 4.
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D126913
|
| #
d7ef488b |
| 09-Jun-2022 |
Mogball <[email protected]> |
[mlir][gpu] Move GPU headers into IR/ and Transforms/
Depends on D127350
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D127352
|
|
Revision tags: llvmorg-14.0.4 |
|
| #
59c3be74 |
| 16-May-2022 |
Mehdi Amini <[email protected]> |
Apply clang-tidy fixes for performance-move-const-arg in SerializeToHsaco.cpp (NFC)
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
5e50dd04 |
| 31-Mar-2022 |
River Riddle <[email protected]> |
[mlir] Rework the implementation of TypeID
This commit restructures how TypeID is implemented to ideally avoid the current problems related to shared libraries. This is done by changing the "implicit" fallback path to use the name of the type, instead of using a static template variable (which breaks shared libraries). The major downside to this is that it adds some additional initialization costs for the implicit path. Given the use of type names for uniqueness in the fallback, we also no longer allow types defined in anonymous namespaces to have an implicit TypeID. To simplify defining an ID for these classes, a new `MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID` macro was added to allow for explicitly defining a TypeID directly on an internal class.
To help identify when types are using the fallback, `-debug-only=typeid` can be used to log which types are using implicit ids.
This change generally only requires changes to the test passes, which are all defined in anonymous namespaces, and thus can't use the fallback any longer.
Differential Revision: https://reviews.llvm.org/D122775
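The name-based fallback described in this commit can be pictured with a small standalone sketch. This is not MLIR's implementation: `SketchTypeID`, `lookupByName`, and `getSketchTypeID` are hypothetical names, and `typeid(T).name()` is only an assumed stand-in for however the real code obtains a type's name. The sketch shows why the fallback has a lookup cost and why it depends on type names being unique.

```cpp
#include <string>
#include <typeinfo>
#include <unordered_map>

// Illustrative stand-in for a type ID: the address of a unique storage slot.
// This is NOT MLIR's TypeID class.
struct SketchTypeID {
  const void *storage;
  bool operator==(const SketchTypeID &rhs) const {
    return storage == rhs.storage;
  }
  bool operator!=(const SketchTypeID &rhs) const {
    return storage != rhs.storage;
  }
};

// Name-keyed registry: each distinct name owns exactly one storage slot.
// Pointers to unordered_map elements stay valid when the map grows.
// Not thread-safe; a real implementation would need synchronization.
inline SketchTypeID lookupByName(const std::string &name) {
  static std::unordered_map<std::string, char> registry;
  return SketchTypeID{&registry[name]};
}

// Fallback path: resolve the ID through the type's name, paying for a string
// hash and lookup on each call. Using typeid(T).name() to get the name is an
// assumption made for this sketch.
template <typename T>
SketchTypeID getSketchTypeID() {
  return lookupByName(typeid(T).name());
}
```

Because the ID is keyed on the name alone, two unrelated types that report the same name would share an ID, which is consistent with the commit's decision to stop giving implicit IDs to types in anonymous namespaces.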
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
4e817b3f |
| 03-Mar-2022 |
Krzysztof Drewniak <[email protected]> |
[MLIR][AMDGPU] Fix typo and add comment to SerializeToHsaco
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D120943
|
|
Revision tags: llvmorg-14.0.0-rc2 |
|
| #
2aed07e9 |
| 16-Feb-2022 |
Shao-Ce SUN <[email protected]> |
[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D119846
|
| #
9cc49c19 |
| 16-Feb-2022 |
Shao-Ce SUN <[email protected]> |
Revert "[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`"
This reverts commit fe25c06cc5bdc2ef9427309f8ec1434aad69dc7a.
|
| #
fe25c06c |
| 15-Feb-2022 |
Shao-Ce SUN <[email protected]> |
[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`
It seems that `MCRegisterInfo` has not been used by any target for ten years.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D119846
|
| #
1aa71944 |
| 15-Feb-2022 |
Krzysztof Drewniak <[email protected]> |
[MLIR][GPU] Add missing include to SerializeToHsaco
Differential Revision: https://reviews.llvm.org/D119852
|
| #
d8f99bb6 |
| 11-Feb-2022 |
Sameer Sahasrabuddhe <[email protected]> |
[AMDGPU] replace hostcall module flag with function attribute
The module flag to indicate use of hostcall is insufficient to catch all cases where hostcall might be in use by a kernel. This is now replaced by a function attribute that gets propagated to top-level kernel functions via their respective call-graph.
If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the default behaviour is to emit kernel metadata indicating that the kernel uses the hostcall buffer pointer passed as an implicit argument.
The attribute may be placed explicitly by the user, or inferred by the AMDGPU attributor by examining the call-graph. The attribute is inferred only if the function is not being sanitized, and the implicitarg_ptr does not result in a load of any byte in the hostcall pointer argument.
Reviewed By: jdoerfert, arsenm, kpyzhov
Differential Revision: https://reviews.llvm.org/D119216
|
| #
1ce314ce |
| 10-Feb-2022 |
Krzysztof Drewniak <[email protected]> |
[MLIR][GPU][lld] Use LLD bundled in ROCm, removing workaround
Having clarified that executing the SerializeToHsaco pass can depend on a ROCm installation, switch from calling lld as a library to using the copy of lld guaranteed to be included in a ROCm install.
This removes the workaround introduced in D119277
Reviewed By: whchung
Differential Revision: https://reviews.llvm.org/D119463
|
| #
c37b3e41 |
| 10-Feb-2022 |
Krzysztof Drewniak <[email protected]> |
[MLIR][GPU] Add now-required include to SerializeToHsaco
Reviewed By: whchung
Differential Revision: https://reviews.llvm.org/D119455
|
|
Revision tags: llvmorg-14.0.0-rc1 |
|
| #
1e661e58 |
| 08-Feb-2022 |
Alexandre Ganea <[email protected]> |
[MLIR] Temporary workaround for calling the LLD ELF driver as-a-lib
This fixes the situation described in https://github.com/llvm/llvm-project/issues/53475 with a repro exposed by https://github.com/ROCmSoftwarePlatform/D108850-lld-bug-reproduction
This is purposely just a workaround to unblock users. This could be transplanted to the release/14.x branch if need be. A proper fix will later be provided in https://reviews.llvm.org/D119049.
Differential Revision: https://reviews.llvm.org/D119277
|
|
Revision tags: llvmorg-15-init |
|
| #
e7d0dae7 |
| 28-Jan-2022 |
Krzysztof Drewniak <[email protected]> |
[MLIR][GPU] Add missing #include to SerializeToHsaco.cpp
llvm/Support/Path.h was likely previously implicitly included, and a refactoring removed that inclusion, breaking the pass.
Differential Revision: https://reviews.llvm.org/D118508
|
| #
1cf98766 |
| 28-Jan-2022 |
Alexandre Ganea <[email protected]> |
[mlir] Fix build after 83d59e05b201
Differential Revision: https://reviews.llvm.org/D118510
|
| #
6842ec42 |
| 26-Jan-2022 |
River Riddle <[email protected]> |
[mlir][NFC] Add a using for llvm::SMLoc/llvm::SMRange to LLVM.h
These are used pervasively during parsing.
Differential Revision: https://reviews.llvm.org/D118291
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
40aef79d |
| 10-Jan-2022 |
Krzysztof Drewniak <[email protected]> |
[MLIR][GPU] Add debug output to enable dumping GPU assembly
- Set the DEBUG_TYPE of SerializeToBlob to serialize-to-blob
- Add debug output to print the assembly or PTX for GPU modules before they are assembled and linked
Note that, as SerializeToBlob is a superclass of SerializeToCubin and SerializeToHsaco, --debug-only=serialize-to-blob will dump the intermediate compiler result for both of these passes.
In addition, if LLVM options such as --stop-after are used to control the GPU kernel compilation process, the debug output will contain the appropriate intermediate IR.
Reviewed By: herhut
Differential Revision: https://reviews.llvm.org/D117519
|
| #
b77d4d54 |
| 12-Jan-2022 |
Duncan P. N. Exon Smith <[email protected]> |
mlir: Avoid SmallVector::set_size in SerializeToHsacoPass::loadLibraries
Spotted this in a final grep of projects I don't usually build before pushing https://reviews.llvm.org/D115380, which makes `SmallVector::set_size()` private.
Update to `truncate()`, a new-ish variant of `resize()` that asserts the new size is not bigger and that avoids pulling in the allocation and initialization code for growing. Doesn't really look like the perf impact of that would matter here, but since `dirLength` is known to be a smaller size then we might as well.
Differential Revision: https://reviews.llvm.org/D117073
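The shrink-only pattern this commit describes can be sketched without LLVM. In the sketch below, `truncateTo` and `buildLibraryPaths` are hypothetical helpers and `std::string` stands in for the `SmallVector` buffer; only the idea (assert that the new size is not bigger, then cut back to `dirLength` before appending the next file name) comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not the LLVM API: shrink `buffer` to its first
// `newSize` characters. It asserts that the new size is not bigger than the
// current one, so it never needs to grow or initialize new elements.
inline void truncateTo(std::string &buffer, std::size_t newSize) {
  assert(newSize <= buffer.size() && "truncateTo() cannot grow the buffer");
  buffer.resize(newSize);
}

// Reuse one buffer for several files under the same directory: append a
// name, record the path, then truncate back to the directory prefix.
inline std::vector<std::string>
buildLibraryPaths(const std::string &dir,
                  const std::vector<std::string> &fileNames) {
  std::string path = dir;
  const std::size_t dirLength = path.size();
  std::vector<std::string> result;
  for (const std::string &name : fileNames) {
    truncateTo(path, dirLength); // drop the previous file name, keep the dir
    path += '/';
    path += name;
    result.push_back(path);
  }
  return result;
}
```

Calling `buildLibraryPaths("/opt/lib", {"a.bc", "b.bc"})` yields `/opt/lib/a.bc` and `/opt/lib/b.bc`; the directory and file names here are made up for illustration.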
|
| #
e1da6291 |
| 08-Dec-2021 |
Krzysztof Drewniak <[email protected]> |
[MLIR][GPU] Define gpu.printf op and its lowerings
- Define a gpu.printf op, which can be lowered to any GPU printf() support (which is present in CUDA, HIP, and OpenCL). This op only supports constant format strings and scalar arguments.
- Define the lowering of gpu.printf to a call to printf() (which is what is required for AMD GPUs when using OpenCL) as well as to the hostcall interface present in the AMD Open Compute device library, which is the interface present when kernels are running under HIP.
- Add a "runtime" enum that allows specifying which of the possible runtimes a ROCDL kernel will be executed under or that the runtime is unknown. This enum controls how gpu.printf is lowered.
This change does not enable lowering for Nvidia GPUs, but such a lowering should be possible in principle.
And: [MLIR][AMDGPU] Always set amdgpu-implicitarg-num-bytes=56 on kernels
This is something that Clang always sets on both OpenCL and HIP kernels, and failing to include it causes mysterious crashes with printf() support.
In addition, revert the max-flat-work-group-size to (1, 256) to avoid triggering bugs in the AMDGPU backend.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D110448
|
| #
be0a7e9f |
| 07-Dec-2021 |
Mehdi Amini <[email protected]> |
Adjust "end namespace" comment in MLIR to match new agree'd coding style
See D115115 and this mailing list discussion: https://lists.llvm.org/pipermail/llvm-dev/2021-December/154199.html
Differential Revision: https://reviews.llvm.org/D115309
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
a6f53afb |
| 18-Nov-2021 |
Krzysztof Drewniak <[email protected]> |
[MLIR][GPU] Link in device libraries during HSA compilation if needed
To perform some operations, such as sin() or printf(), code compiled for AMD GPUs must be linked to a series of device libraries. This commit adds support for linking in these libraries.
However, since these device libraries are delivered as LLVM bitcode, raising the possibility of version incompatibilities, this commit only links in libraries when the functions from those libraries are called by the code being compiled.
This code also sets the math flags to their most conservative values, as MLIR doesn't have a `-ffast-math` equivalent.
Depends on D114114
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D114117
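The "link only when needed" rule from this commit can be illustrated with a generic sketch. Everything named here (`DeviceLibrary`, `selectLibrariesToLink`, and the library and function names in the usage note) is hypothetical; the actual pass may detect needed libraries differently. The sketch only shows the decision itself: a library is selected when it defines at least one function that the compiled code calls.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical description of a bitcode library: its name and the functions
// it defines.
struct DeviceLibrary {
  std::string name;
  std::set<std::string> definedFunctions;
};

// Return the names of the libraries that define at least one function the
// module calls but does not define itself. Libraries that provide nothing
// the module needs are skipped, which limits exposure to bitcode version
// mismatches.
inline std::vector<std::string>
selectLibrariesToLink(const std::set<std::string> &unresolvedCalls,
                      const std::vector<DeviceLibrary> &available) {
  std::vector<std::string> selected;
  for (const DeviceLibrary &lib : available) {
    for (const std::string &callee : unresolvedCalls) {
      if (lib.definedFunctions.count(callee) != 0) {
        selected.push_back(lib.name);
        break; // one needed function is enough to require this library
      }
    }
  }
  return selected;
}
```

For example, with a made-up `mathlib` defining `sin_impl` and a made-up `iolib` defining `printf_impl`, a module whose only unresolved call is `sin_impl` selects just `mathlib`, and a module with no unresolved calls selects nothing.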
|
| #
20f79f8c |
| 18-Nov-2021 |
Krzysztof Drewniak <[email protected]> |
[MLIR][GPU] Make the path to ROCm a runtime option
Our current build assumes that the path to ROCm we find at build time will be the path at which ROCm is located when the built code is executed. This commit adds a --rocm-path option to SerializeToHsaco, and removes the HIP dependency that the SerializeToHsaco previously had.
Depends on D114113
(though the dependency is to ensure the diffs apply cleanly and to capture the dependency on D114107)
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D114114
|
| #
bd22554a |
| 18-Nov-2021 |
Krzysztof Drewniak <[email protected]> |
[MLIR][GPU] Run generic LLVM optimizations when serializing (on AMD)
- Adds hooks that allow SerializeTo* passes to arbitrarily transform the produced LLVM Module before it is passed to the code generation passes.
- Uses these hooks within the SerializeToHsaco pass in order to run LLVM optimizations and to set the optimization level on the TargetMachine.
- Adds an optLevel parameter to SerializeToHsaco
Future work may include moving much of what's been added to SerializeToHsaco to SerializeToBlob, but that would require confirmation from the NVVM backend maintainers that it would be appropriate to do so.
Depends on D114107
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D114113
|