|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
| #
b4f8443d |
| 09-May-2022 |
Joseph Huber <[email protected]> |
[Libomptarget] Allow the device runtime to be compiled for the host
Currently the OpenMP offloading device runtime is only expected to be compiled for the specific architecture it's targeting. This
[Libomptarget] Allow the device runtime to be compiled for the host
Currently the OpenMP offloading device runtime is only expected to be compiled for the specific architecture it's targeting. This is problematic if we want to make compiling the device runtime more general via the standar `clang` driver rather than invoking the clang front-end directly. This patch addresses this by primarily changing the declare type to `nohost` so the host will not contain any of this code. Additionally we forward declare the functions that are defined via variants, otherwise these would cause problems on the host.
Reviewed By: jdoerfert, tianshilei1992
Differential Revision: https://reviews.llvm.org/D125260
show more ...
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
0870a4f5 |
| 18-Feb-2022 |
Joseph Huber <[email protected]> |
[OpenMP] Add flag for disabling thread state in runtime
The runtime uses thread state values to indicate when we use an ICV or are in nested parallelism. This is done for OpenMP correctness, but it
[OpenMP] Add flag for disabling thread state in runtime
The runtime uses thread state values to indicate when we use an ICV or are in nested parallelism. This is done for OpenMP correctness, but it not needed in the majority of cases. The new flag added is `-fopenmp-assume-no-thread-state`.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D120106
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3 |
|
| #
26feef08 |
| 20-Jan-2022 |
Joseph Huber <[email protected]> |
[Libomptarget] Change visibility to hidden for device RTL
This patch changes the visibility for all construct in the new device RTL to be hidden by default. This is done after the changes introduced
[Libomptarget] Change visibility to hidden for device RTL
This patch changes the visibility for all construct in the new device RTL to be hidden by default. This is done after the changes introduced in D117806 changed the visibility from being hidden by default for all device compilations. This asserts that the visibility for the device runtime library will be hidden except for the internal environment variable. This is done to aid optimization and linking of the device library.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D117807
show more ...
|
| #
4746e38f |
| 13-Jan-2022 |
Joseph Huber <[email protected]> |
[Libomptarget] Fix multiply defined symbol during linking
This patch adds the `weak` identifier to the openmp device environment variable. The changes introduced in https://reviews.llvm.org/D117211
[Libomptarget] Fix multiply defined symbol during linking
This patch adds the `weak` identifier to the openmp device environment variable. The changes introduced in https://reviews.llvm.org/D117211 result in multiply defined symbols. Because the symbol is potentially included multiple times for each offloading file we will get symbol colisions, and because it needs to have external visiblity it should be weak.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D117231
show more ...
|
| #
43956089 |
| 13-Jan-2022 |
Jon Chesterfield <[email protected]> |
[openmp] Mark used variables as retain as well
D97446 changed the behaviour of 'used'. Compensate.
Reviewed By: ronlieb
Differential Revision: https://reviews.llvm.org/D117211
|
|
Revision tags: llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
4d50803c |
| 28-Oct-2021 |
Jon Chesterfield <[email protected]> |
[libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987. CI is showing a different set of pass/fails to local, committing this withou
[libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987. CI is showing a different set of pass/fails to local, committing this without the tests enabled by default while debugging that difference.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D112227
show more ...
|
| #
6c7b203d |
| 28-Oct-2021 |
Jon Chesterfield <[email protected]> |
Revert "[libomptarget] Build DeviceRTL for amdgpu" - more tests failing on CI than failed locally when writing this patch
This reverts commit 33427fdb7b52b79ce5e25b7e14e0f1a44d876bd2.
|
| #
33427fdb |
| 27-Oct-2021 |
Jon Chesterfield <[email protected]> |
[libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D11
[libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D112227
show more ...
|
| #
0c554a47 |
| 07-Oct-2021 |
Jon Chesterfield <[email protected]> |
[libomptarget] Move device environment to shared header, remove divergence
Follow on to D110006, related to D110957
Where implementations have diverged this resolves to match the new DeviceRTL
- r
[libomptarget] Move device environment to shared header, remove divergence
Follow on to D110006, related to D110957
Where implementations have diverged this resolves to match the new DeviceRTL
- replaces definitions of this struct in deviceRTL and plugins with include - changes the dynamic_shared_size field from D110006 to 32 bits - handles stdint being unavailable in DeviceRTL - adds a zero initializer for the field to amdgpu - moves the extern declaration for deviceRTL to target_interface (omptarget.h is more natural, but doesn't work due to include order with debug.h) - Renames the fields everywhere to match the LLVM format used in DeviceRTL - Makes debug_level uint32_t everywhere (previously sometimes int32_t)
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D111069
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
277b681e |
| 21-Sep-2021 |
Joseph Huber <[email protected]> |
[OpenMP] Add function tracing debugging to device RTL
This patch adds support for an RAII struct that will print function traces when placed inside of a function declaration. Each successive call wi
[OpenMP] Add function tracing debugging to device RTL
This patch adds support for an RAII struct that will print function traces when placed inside of a function declaration. Each successive call will increase the indentation to make it easier to visually inspect.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110202
show more ...
|
| #
f1c821fa |
| 17-Sep-2021 |
Joseph Huber <[email protected]> |
[OpenMP] Add support for dynamic shared memory in new RTL
This patch adds support for using dynamic shared memory in the new device runtime. The new function `__kmpc_get_dynamic_shared` will return
[OpenMP] Add support for dynamic shared memory in new RTL
This patch adds support for using dynamic shared memory in the new device runtime. The new function `__kmpc_get_dynamic_shared` will return a pointer to the buffer of dynamic shared memory. Currently the amount of memory allocated is set by an environment variable.
In the future this amount will be added to the amount used for the smart stack which will be configured in a similar way.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D110006
show more ...
|
| #
ec02c34b |
| 16-Sep-2021 |
Joseph Huber <[email protected]> |
[OpenMP] Add additional fields to device environment
This patch adds fields for the device number and number of devices into the device environment struct and debugging values.
Reviewed By: jdoerfe
[OpenMP] Add additional fields to device environment
This patch adds fields for the device number and number of devices into the device environment struct and debugging values.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110004
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
67ab875f |
| 25-Jul-2021 |
Johannes Doerfert <[email protected]> |
[OpenMP] Prototype opt-in new GPU device RTL
The "old" OpenMP GPU device runtime (D14254) has served us well for many years but modernizing it has caused some pain recently. This patch introduces an
[OpenMP] Prototype opt-in new GPU device RTL
The "old" OpenMP GPU device runtime (D14254) has served us well for many years but modernizing it has caused some pain recently. This patch introduces an alternative which is mostly written from scratch embracing OpenMP 5.X, C++, LLVM coding style (where applicable), and conceptual interfaces. This new runtime is opt-in through a clang flag (D106793). The new runtime is currently only build for nvptx and has "-new" in its name.
The design is tailored towards middle-end optimizations rather than front-end code generation choices, a trend we already started in the old runtime a while back. In contrast to the old one, state is organized in a simple manner rather than a "smart" one. While this can induce costs it helps optimizations. Our expectation is that the majority of codes can be optimized and a "simple" design is therefore preferable. The new runtime does also avoid users to pay for things they do not use, especially wrt. memory. The unlikely case of nested parallelism is supported but costly to make the more likely case use less resources.
The worksharing and reduction implementation have been taken from the old runtime and will be rewritten in the future if necessary.
Documentation and debug features are still mostly missing and will be added over time.
All external symbols start with `__kmpc` for legacy reasons but should be renamed once we switch over to a single runtime. All internal symbols are placed in appropriate namespaces (anonymous or `_OMP`) to avoid name clashes with user symbols.
Differential Revision: https://reviews.llvm.org/D106803
show more ...
|