History log of /llvm-project-15.0.7/mlir/lib/Dialect/GPU/Transforms/SerializeToHsaco.cpp (Results 1 – 25 of 32)
Revision Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# b7f93c28 14-Jul-2022 Jeff Niu

[mlir] (NFC) run clang-format on all files


Revision tags: llvmorg-14.0.6
# 6d5fc1e3 21-Jun-2022 Kazu Hirata

[mlir] Don't use Optional::getValue (NFC)


Revision tags: llvmorg-14.0.5
# a2cdb979 02-Jun-2022 Krzysztof Drewniak

[mlir][AMDGPU] Set ABI version constant when linking device libs

Currently, linking the device libraries requires setting a constant
that indicates the code object ABI version the compilation is
targeting.

This fixes the MLIR linking process by setting this constant to 400,
which is the value corresponding to the current code object ABI
default, version 4.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D126913


# d7ef488b 09-Jun-2022 Mogball

[mlir][gpu] Move GPU headers into IR/ and Transforms/

Depends on D127350

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D127352


Revision tags: llvmorg-14.0.4
# 59c3be74 16-May-2022 Mehdi Amini

Apply clang-tidy fixes for performance-move-const-arg in SerializeToHsaco.cpp (NFC)


Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 5e50dd04 31-Mar-2022 River Riddle

[mlir] Rework the implementation of TypeID

This commit restructures how TypeID is implemented to ideally avoid
the current problems related to shared libraries. This is done by changing
the "implicit" fallback path to use the name of the type, instead of using
a static template variable (which breaks shared libraries). The major downside to this
is that it adds some additional initialization costs for the implicit path. Given the
use of type names for uniqueness in the fallback, we also no longer allow types
defined in anonymous namespaces to have an implicit TypeID. To simplify defining
an ID for these classes, a new `MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID` macro
was added to allow for explicitly defining a TypeID directly on an internal class.

To help identify when types are using the fallback, `-debug-only=typeid` can be
used to log which types are using implicit ids.

This change generally only requires changes to the test passes, which are all defined
in anonymous namespaces, and thus can't use the fallback any longer.

Differential Revision: https://reviews.llvm.org/D122775


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 4e817b3f 03-Mar-2022 Krzysztof Drewniak

[MLIR][AMDGPU] Fix typo and add comment to SerializeToHsaco

Reviewed By: bondhugula

Differential Revision: https://reviews.llvm.org/D120943


Revision tags: llvmorg-14.0.0-rc2
# 2aed07e9 16-Feb-2022 Shao-Ce SUN

[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D119846


# 9cc49c19 16-Feb-2022 Shao-Ce SUN

Revert "[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`"

This reverts commit fe25c06cc5bdc2ef9427309f8ec1434aad69dc7a.


# fe25c06c 15-Feb-2022 Shao-Ce SUN

[NFC][MC] remove unused argument `MCRegisterInfo` in `MCCodeEmitter`

For ten years, it seems that `MCRegisterInfo` is not used by any target.

Reviewed By: skan

Differential Revision: https://reviews.llvm.org/D119846


# 1aa71944 15-Feb-2022 Krzysztof Drewniak

[MLIR][GPU] Add missing include to SerilazeToHsaco

Differential Revision: https://reviews.llvm.org/D119852


# d8f99bb6 11-Feb-2022 Sameer Sahasrabuddhe

[AMDGPU] replace hostcall module flag with function attribute

The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.

If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.

The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implicitarg_ptr does not result in a load of any byte in the hostcall
pointer argument.

Reviewed By: jdoerfert, arsenm, kpyzhov

Differential Revision: https://reviews.llvm.org/D119216


# 1ce314ce 10-Feb-2022 Krzysztof Drewniak

[MLIR][GPU][lld] Use LLD bundled in ROCm, removing workaround

Having clarified that executing the SerializeToHsaco pass can
depend on a ROCm installation, switch from calling lld as a library to
using the copy of lld guaranteed to be included in a ROCm install.

This removes the workaround introduced in D119277

Reviewed By: whchung

Differential Revision: https://reviews.llvm.org/D119463


# c37b3e41 10-Feb-2022 Krzysztof Drewniak

[MLIR][GPU] Add now-required include to SerializeToHsaco

Reviewed By: whchung

Differential Revision: https://reviews.llvm.org/D119455


Revision tags: llvmorg-14.0.0-rc1
# 1e661e58 08-Feb-2022 Alexandre Ganea

[MLIR] Temporary workaround for calling the LLD ELF driver as-a-lib

This fixes the situation described in https://github.com/llvm/llvm-project/issues/53475 with a repro exposed by https://github.com/ROCmSoftwarePlatform/D108850-lld-bug-reproduction

This is purposely just a workaround to unblock users. This could be transplanted to the release/14.x branch if need be. A proper fix will later be provided in https://reviews.llvm.org/D119049.

Differential Revision: https://reviews.llvm.org/D119277


Revision tags: llvmorg-15-init
# e7d0dae7 28-Jan-2022 Krzysztof Drewniak

[MLIR][GPU] Add missing #include to SerializeToHsaco.cpp

llvm/Support/Path.h was likely previously implicitly included, and a
refactoring removed that inclusion, breaking the pass.

Differential Revision: https://reviews.llvm.org/D118508


# 1cf98766 28-Jan-2022 Alexandre Ganea

[mlir] Fix build after 83d59e05b201

Differential Revision: https://reviews.llvm.org/D118510


# 6842ec42 26-Jan-2022 River Riddle

[mlir][NFC] Add a using for llvm::SMLoc/llvm::SMRange to LLVM.h

These are used pervasively during parsing.

Differential Revision: https://reviews.llvm.org/D118291


Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 40aef79d 10-Jan-2022 Krzysztof Drewniak

[MLIR][GPU] Add debug output to enable dumping GPU assembly

- Set the DEBUG_TYPE of SerializeToBlob to serialize-to-blob
- Add debug output to print the assembly or PTX for GPU modules before
they are assembled and linked

Note that, as SerializeToBlob is a superclass of SerializeToCubin and
SerializeToHsaco, --debug-only=serialize-to-blob will dump the
intermediate compiler result for both of these passes.

In addition, if LLVM options such as --stop-after are used to control
the GPU kernel compilation process, the debug output will contain the
appropriate intermediate IR.

Reviewed By: herhut

Differential Revision: https://reviews.llvm.org/D117519


# b77d4d54 12-Jan-2022 Duncan P. N. Exon Smith

mlir: Avoid SmallVector::set_size in SerializeToHsacoPass::loadLibraries

Spotted this in a final grep of projects I don't usually build before
pushing https://reviews.llvm.org/D115380, which makes
`SmallVector::set_size()` private.

Update to `truncate()`, a new-ish variant of `resize()` that asserts the
new size is not bigger and that avoids pulling in the allocation and
initialization code for growing. Doesn't really look like the perf
impact of that would matter here, but since `dirLength` is known to be a
smaller size then we might as well.

Differential Revision: https://reviews.llvm.org/D117073


# e1da6291 08-Dec-2021 Krzysztof Drewniak

[MLIR][GPU] Define gpu.printf op and its lowerings

- Define a gpu.printf op, which can be lowered to any GPU printf() support (which is present in CUDA, HIP, and OpenCL). This op only supports constant format strings and scalar arguments
- Define the lowering of gpu.printf to a call to printf() (which is what is required for AMD GPUs when using OpenCL) as well as to the hostcall interface present in the AMD Open Compute device library, which is the interface present when kernels are running under HIP.
- Add a "runtime" enum that allows specifying which of the possible runtimes a ROCDL kernel will be executed under or that the runtime is unknown. This enum controls how gpu.printf is lowered

This change does not enable lowering for Nvidia GPUs, but such a lowering should be possible in principle.

And:
[MLIR][AMDGPU] Always set amdgpu-implicitarg-num-bytes=56 on kernels

This is something that Clang always sets on both OpenCL and HIP kernels, and failing to include it causes mysterious crashes with printf() support.

In addition, revert the max-flat-work-group-size to (1, 256) to avoid triggering bugs in the AMDGPU backend.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D110448


# be0a7e9f 07-Dec-2021 Mehdi Amini

Adjust "end namespace" comment in MLIR to match new agree'd coding style

See D115115 and this mailing list discussion:
https://lists.llvm.org/pipermail/llvm-dev/2021-December/154199.html

Differential Revision: https://reviews.llvm.org/D115309


Revision tags: llvmorg-13.0.1-rc1
# a6f53afb 18-Nov-2021 Krzysztof Drewniak

[MLIR][GPU] Link in device libraries during HSA compilation if needed

To perform some operations, such as sin() or printf(), code compiled
for AMD GPUs must be linked to a series of device libraries. This
commit adds support for linking in these libraries.

However, since these device libraries are delivered as LLVM bitcode,
raising the possibility of version incompatibilities, this commit only
links in libraries when the functions from those libraries are called
by the code being compiled.

This code also sets the math flags to their most conservative values,
as MLIR doesn't have a `-ffast-math` equivalent.

Depends on D114114

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D114117


# 20f79f8c 18-Nov-2021 Krzysztof Drewniak

[MLIR][GPU] Make the path to ROCm a runtime option

Our current build assumes that the path to ROCm we find at build time
will be the path at which ROCm is located when the built code is
executed. This commit adds a --rocm-path option to SerializeToHsaco,
and removes the HIP dependency that the SerializeToHsaco previously had.

Depends on D114113

(though the dependency is to ensure the diffs apply cleanly and to capture the dependency on D114107)

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D114114


# bd22554a 18-Nov-2021 Krzysztof Drewniak

[MLIR][GPU] Run generic LLVM optimizations when serializing (on AMD)

- Adds hooks that allow SerializeTo* passes to arbitrarily transform
the produced LLVM Module before it is passed to the code generation
passes.

- Uses these hooks within the SerializeToHsaco pass in order to run
LLVM optimizations and to set the optimization level on the
TargetMachine.

- Adds an optLevel parameter to SerializeToHsaco

Future work may include moving much of what's been added to
SerializeToHsaco to SerializeToBlob, but that would require
confirmation from the NVVM backend maintainers that it would be
appropriate to do so.

Depends on D114107

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D114113

