|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1 |
|
| #
fd8fd9e5 |
| 27-Jul-2022 |
Joseph Huber <[email protected]> |
Revert "[OpenMP] Remove noinline attributes in the device runtime"
The behaviour of this patch is not great, but it has some side-effects that are required for OpenMPOpt to work. The problem is that
Revert "[OpenMP] Remove noinline attributes in the device runtime"
The behaviour of this patch is not great, but it has some side-effects that are required for OpenMPOpt to work. The problem is that when we use `-mlink-builtin-bitcode` we only import used symbols from the runtime. Then OpenMPOpt will insert calls to symbols that were not previously included. This patch removed this implicit behaviour as these functions were kept alive by the `noinline` simply because it kept calls to them in the module. This caused regression in some tests that relied on some OpenMPOpt passes without using LTO. Reverting for the LLVM15 release but will try to fix it more correctly on main.
This reverts commit d61d72dae604c3258e25c00622b1a85861450303.
Fixes #56752
(cherry picked from commit b08369f7f288b6efb0897953da42ed54e60cfc0b)
show more ...
|
|
Revision tags: llvmorg-16-init |
|
| #
d61d72da |
| 25-Jul-2022 |
Joseph Huber <[email protected]> |
[OpenMP] Remove noinline attributes in the device runtime
We previously used the `noinline` attributes to specify some defintions which should be kept alive in the runtime. These were then stripped
[OpenMP] Remove noinline attributes in the device runtime
We previously used the `noinline` attributes to specify some defintions which should be kept alive in the runtime. These were then stripped immediately in the OpenMPOpt module pass. However, Since the changes in D130298, we not explicitly state which functions will have external visiblity in the bitcode library. Additionally the OpenMPOpt module pass should run before the inliner pass, so this shouldn't make a difference in whether or not the functions will be alive for the initial pass of OpenMPOpt. This should simplify the interface, and additionally save time spend on scanning funciton names for noinline.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D130368
show more ...
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5 |
|
| #
421b1f55 |
| 31-May-2022 |
Joseph Huber <[email protected]> |
[Libomptarget] Do not use retaining attributes for the static library
When we build the libomptarget device runtime library targeting bitcode, we need special care to make sure that certain function
[Libomptarget] Do not use retaining attributes for the static library
When we build the libomptarget device runtime library targeting bitcode, we need special care to make sure that certain functions are not optimized out. This is because we manually internalize and optimize these definitions, ignoring their standard linkage semantics. When we build with the static library, we can maintain these semantics and we do not need these to be kept-alive. Furthermore, if they are kept-alive it prevents them from being removed during LTO. This prevents us from completely internalizing `IsSPMDMode` and removing several other functions. This patch removes these for the static library target by using a macro definition to enable them.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D126701
show more ...
|
|
Revision tags: llvmorg-14.0.4 |
|
| #
ce0caf41 |
| 10-May-2022 |
Joseph Huber <[email protected]> |
[Libomptarget] Address existing warnings in the device runtime library
This patche attemps to address the current warnings in the OpenMP offloading device runtime. Previously we did not see these be
[Libomptarget] Address existing warnings in the device runtime library
This patche attemps to address the current warnings in the OpenMP offloading device runtime. Previously we did not see these because we compiled the runtime without the standard warning flags enabled. However, these warnings are used when we now build the static library version of this runtime. This became extremely noisy when coupled with the fact the we compile each file roughly 32 times when all the architectures are considered. So it would be ideal to not have all these warnings show up when building.
Most of these errors were simply implicit switch-case fallthroughs, which can be addressed using C++17's fallthrough attribute. Additionally there was a volatile variable that was being casted away. This is most likely safe to remove because we cast it away before its even used and didn't seem to affect anything in testing.
Depends on D125260
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D125339
show more ...
|
| #
b4f8443d |
| 09-May-2022 |
Joseph Huber <[email protected]> |
[Libomptarget] Allow the device runtime to be compiled for the host
Currently the OpenMP offloading device runtime is only expected to be compiled for the specific architecture it's targeting. This
[Libomptarget] Allow the device runtime to be compiled for the host
Currently the OpenMP offloading device runtime is only expected to be compiled for the specific architecture it's targeting. This is problematic if we want to make compiling the device runtime more general via the standar `clang` driver rather than invoking the clang front-end directly. This patch addresses this by primarily changing the declare type to `nohost` so the host will not contain any of this code. Additionally we forward declare the functions that are defined via variants, otherwise these would cause problems on the host.
Reviewed By: jdoerfert, tianshilei1992
Differential Revision: https://reviews.llvm.org/D125260
show more ...
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
e2dcc221 |
| 04-Mar-2022 |
Joseph Huber <[email protected]> |
[Libomptarget] Work around bug in initialization of libomptarget
Libomptarget uses some shared variables to track certain internal stated in the runtime. This causes problems when we have code that
[Libomptarget] Work around bug in initialization of libomptarget
Libomptarget uses some shared variables to track certain internal stated in the runtime. This causes problems when we have code that contains no OpenMP kernels. These variables are normally initialized upon kernel entry, but if there are no kernels we will see no initialization. Currently we load the runtime into each source file when not running in LTO mode, so these variables will be erroneously considered undefined or dead and removed, causing miscompiles. This patch temporarily works around the most obvious case, but others still exhibit this problem. We will need to fix this more soundly later.
Fixes #54208.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D121007
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc2 |
|
| #
57b4c526 |
| 14-Feb-2022 |
Johannes Doerfert <[email protected]> |
[OpenMP][FIX] Eliminate race on the IsSPMD global
The `IsSPMD` global can only be read by threads other than the main thread *after* initialization is complete. To allow usage of `mapping::getBlockS
[OpenMP][FIX] Eliminate race on the IsSPMD global
The `IsSPMD` global can only be read by threads other than the main thread *after* initialization is complete. To allow usage of `mapping::getBlockSize` before initialization is done, we can pass the `IsSPMD` state explicitly. This is similar to other APIs that take `IsSPMD` explicitly to avoid such a race, e.g., `mapping::isInitialThreadInLevel0(IsSPMD)`
Fixes https://github.com/llvm/llvm-project/issues/53857
show more ...
|
| #
48e3dcec |
| 14-Feb-2022 |
Joseph Huber <[email protected]> |
[Libomptarget][NFC] Remove constexpr to hide warnings
Currently whenever we compile the device runtime we get the following 'Mapping.cpp:32:32: warning: inline function '_OMP::impl::getGridValue' is
[Libomptarget][NFC] Remove constexpr to hide warnings
Currently whenever we compile the device runtime we get the following 'Mapping.cpp:32:32: warning: inline function '_OMP::impl::getGridValue' is not defined [-Wundefined-inline]' warning. This can be silenced by removing the constexpr attribute for this function. Doing this doesn't change the generated bitcode at all but prevents the screen from getting filled with warnings whenver we build the runtime.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D119747
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
737c4a26 |
| 09-Nov-2021 |
Atmn Patel <[email protected]> |
[clang][openmp][NFC] Remove arch-specific CGOpenMPRuntimeGPU files
The existing CGOpenMPRuntimeAMDGCN and CGOpenMPRuntimeNVPTX classes are just code bloat. By removing them, the codebase gets a bit
[clang][openmp][NFC] Remove arch-specific CGOpenMPRuntimeGPU files
The existing CGOpenMPRuntimeAMDGCN and CGOpenMPRuntimeNVPTX classes are just code bloat. By removing them, the codebase gets a bit cleaner.
Reviewed By: jdoerfert, JonChesterfield, tianshilei1992
Differential Revision: https://reviews.llvm.org/D113421
show more ...
|
| #
ef717f38 |
| 09-Nov-2021 |
Atmn Patel <[email protected]> |
Revert "[clang][openmp][NFC] Remove arch-specific CGOpenMPRuntimeGPU files"
This reverts commit 81a7cad2ffc18f15b732f69d991c8398c979c5ca.
|
| #
81a7cad2 |
| 09-Nov-2021 |
Atmn Patel <[email protected]> |
[clang][openmp][NFC] Remove arch-specific CGOpenMPRuntimeGPU files
The existing CGOpenMPRuntimeAMDGCN and CGOpenMPRuntimeNVPTX classes are just code bloat. By removing them, the codebase gets a bit
[clang][openmp][NFC] Remove arch-specific CGOpenMPRuntimeGPU files
The existing CGOpenMPRuntimeAMDGCN and CGOpenMPRuntimeNVPTX classes are just code bloat. By removing them, the codebase gets a bit cleaner.
Reviewed By: jdoerfert, JonChesterfield, tianshilei1992
Differential Revision: https://reviews.llvm.org/D113421
show more ...
|
| #
93bebdc7 |
| 03-Nov-2021 |
Johannes Doerfert <[email protected]> |
[OpenMP][NFCI] Cleanup new device RT mapping interface
Minimize the `impl` interface and clean up some uses of mapping functions.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.o
[OpenMP][NFCI] Cleanup new device RT mapping interface
Minimize the `impl` interface and clean up some uses of mapping functions.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D112154
show more ...
|
| #
ccb5d272 |
| 30-Oct-2021 |
Johannes Doerfert <[email protected]> |
[OpenMP][FIX] Avoid a race between initialization and first state reads
When we pick state 0 to initialize state but thread N is going to be the "main thread", in generic mode, we would require extr
[OpenMP][FIX] Avoid a race between initialization and first state reads
When we pick state 0 to initialize state but thread N is going to be the "main thread", in generic mode, we would require extra synchronization. Instead, we should pick the main thread to initialize state in generic mode and any thread in SPMD mode.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D112874
show more ...
|
| #
74f91741 |
| 18-Oct-2021 |
Joseph Huber <[email protected]> |
[OpenMP] Use function tracing RAII for runtime functions.
This patch adds support for using function tracing features to track the executino of runtime functions in the device runtime library. This
[OpenMP] Use function tracing RAII for runtime functions.
This patch adds support for using function tracing features to track the executino of runtime functions in the device runtime library. This is enabled by first compiling the new runtime with `-fopenmp-target-debug=3` and running with `LIBOMPTARGET_DEVICE_RTL_DEBUG=3`. The output only tracks team 0 and thread 0 so there isn't much output when using a generic region.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D112002
show more ...
|
| #
7272982e |
| 19-Oct-2021 |
Jon Chesterfield <[email protected]> |
[libomptarget] Refactor DeviceRTL prior to AMDGPU bringup
Subset of D111993. Fix typos, rename read to load.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D111999
|
| #
bad44d5f |
| 09-Oct-2021 |
Joseph Huber <[email protected]> |
[OpenMP] Add RTL function for getting number of threads in block.
This patch adds support for the `__kmpc_get_hardware_num_threads_in_block` function that returns the number of threads. This was mis
[OpenMP] Add RTL function for getting number of threads in block.
This patch adds support for the `__kmpc_get_hardware_num_threads_in_block` function that returns the number of threads. This was missing in the new runtime and was used by the AMDGPU plugin which prevented it from using the new runtime. This patchs also unified the interface for getting the thread numbers in the frontend.
Originally authored by jdoerfert.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D111475
show more ...
|
| #
85ad5663 |
| 09-Oct-2021 |
Joseph Huber <[email protected]> |
[OpenMP] Avoid calling `isSPMDMode` during RT initialization
Until we hit the first barrier we should not call `mapping::isSPMDMode` with all threads. Instead, we now have (and use during initializa
[OpenMP] Avoid calling `isSPMDMode` during RT initialization
Until we hit the first barrier we should not call `mapping::isSPMDMode` with all threads. Instead, we now have (and use during initialization) a `mapping::isMainThreadInGenericMode` overload that takes the known SPMD-mode state and one that queries it.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D111381
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
60a40cf3 |
| 22-Sep-2021 |
Joseph Huber <[email protected]> |
[OpenMP] Fix KeepAlive usage
Summary: Functions were called the wrong way around, this didn't keep the symbol alive.
|
| #
1cf86df8 |
| 22-Sep-2021 |
Joseph Huber <[email protected]> |
[OpenMP] Make sure the Thread ID function is not removed
Summary: The thread ID function was reintroduced in D110195, but could potentially be removed by the optimizer. Make the function noinline to
[OpenMP] Make sure the Thread ID function is not removed
Summary: The thread ID function was reintroduced in D110195, but could potentially be removed by the optimizer. Make the function noinline to preserve the call sites and add it to the externalization RAII so its definition is not removed by the attributor.
show more ...
|
| #
e95731cc |
| 21-Sep-2021 |
Joseph Huber <[email protected]> |
[OpenMP] Add thread ID function into new RTL
The new device runtime library currently lacks the `kmpc_get_hardware_thread_id_in_block` function which is currently used when doing the SPMDzation opti
[OpenMP] Add thread ID function into new RTL
The new device runtime library currently lacks the `kmpc_get_hardware_thread_id_in_block` function which is currently used when doing the SPMDzation optimization. This call would be introduced through the optimization and then cause a linking error because it was not present. This patch adds support for this runtime call.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D110195
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
| #
842f875c |
| 23-Aug-2021 |
Jon Chesterfield <[email protected]> |
[openmp] Use llvm GridValues from devicertl
Add include path to the cmakefiles and set the target_impl enums from the llvm constants instead of copying the values.
Reviewed By: jdoerfert
Different
[openmp] Use llvm GridValues from devicertl
Add include path to the cmakefiles and set the target_impl enums from the llvm constants instead of copying the values.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D108391
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
67ab875f |
| 25-Jul-2021 |
Johannes Doerfert <[email protected]> |
[OpenMP] Prototype opt-in new GPU device RTL
The "old" OpenMP GPU device runtime (D14254) has served us well for many years but modernizing it has caused some pain recently. This patch introduces an
[OpenMP] Prototype opt-in new GPU device RTL
The "old" OpenMP GPU device runtime (D14254) has served us well for many years but modernizing it has caused some pain recently. This patch introduces an alternative which is mostly written from scratch embracing OpenMP 5.X, C++, LLVM coding style (where applicable), and conceptual interfaces. This new runtime is opt-in through a clang flag (D106793). The new runtime is currently only build for nvptx and has "-new" in its name.
The design is tailored towards middle-end optimizations rather than front-end code generation choices, a trend we already started in the old runtime a while back. In contrast to the old one, state is organized in a simple manner rather than a "smart" one. While this can induce costs it helps optimizations. Our expectation is that the majority of codes can be optimized and a "simple" design is therefore preferable. The new runtime does also avoid users to pay for things they do not use, especially wrt. memory. The unlikely case of nested parallelism is supported but costly to make the more likely case use less resources.
The worksharing and reduction implementation have been taken from the old runtime and will be rewritten in the future if necessary.
Documentation and debug features are still mostly missing and will be added over time.
All external symbols start with `__kmpc` for legacy reasons but should be renamed once we switch over to a single runtime. All internal symbols are placed in appropriate namespaces (anonymous or `_OMP`) to avoid name clashes with user symbols.
Differential Revision: https://reviews.llvm.org/D106803
show more ...
|