|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
d61aac76 |
| 31-Oct-2021 |
Johannes Doerfert <[email protected]> |
[OpenMP][FIX] Do not signal SPMD-mode but then keep generic-mode
If we assume SPMD-mode during the fixpoint iteration we have to execute the kernel in SPMD-mode. If we change our mind during manifes
[OpenMP][FIX] Do not signal SPMD-mode but then keep generic-mode
If we assume SPMD-mode during the fixpoint iteration we have to execute the kernel in SPMD-mode. If we change our mind during manifest there is the chance of a mismatch between the simplification, e.g., of `__kmpc_is_spmd_exec_mode` calls, and the execution mode. This problem was introduced in D109438.
This patch is compromise to resolve the problem purely in OpenMP-opt while trying to keep the benefits of D109438 around. This might not always work, see `get_hardware_num_threads_in_block_fold` but it often does. At the same time we do keep value specialization and execution mode in sync.
Proper solutions to this problem should be considered. I believe a new execution mode is the easiest way forward (Singleton-SPMD). Alternatively, SPMD-mode execution can be used with a way to provide a new thread_limit (here 1) to the runtime. This is more general and could be useful if we see `num_threads` clauses or workshared loops with small trip counts in the kernel. In either proposal we need to disable the guarding for the kernel (which was the motivation for D109438).
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D112894
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
423d34f7 |
| 22-Sep-2021 |
Shilei Tian <[email protected]> |
[OpenMP][Offloading] Change `bool IsSPMD` to `int8_t Mode` in `__kmpc_target_init` and `__kmpc_target_deinit`
This is a follow-up of D110029, which uses bitset to indicate execution mode. This patch
[OpenMP][Offloading] Change `bool IsSPMD` to `int8_t Mode` in `__kmpc_target_init` and `__kmpc_target_deinit`
This is a follow-up of D110029, which uses bitset to indicate execution mode. This patches makes the changes in the function call.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110279
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc3 |
|
| #
6b9a3ec3 |
| 08-Sep-2021 |
Joseph Huber <[email protected]> |
[OpenMP] Do not SPMDize generic regions with no parallel
This patch changes SPMDization to not trigger for regions with no parallelism. Otherwise, this will introduce unnecessary barriers that will
[OpenMP] Do not SPMDize generic regions with no parallel
This patch changes SPMDization to not trigger for regions with no parallelism. Otherwise, this will introduce unnecessary barriers that will slow the single-threaded region down.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D109438
show more ...
|
| #
29a74a39 |
| 31-Aug-2021 |
Joseph Huber <[email protected]> |
[OpenMP] Add an option to always inline OpenMP device functions.
Performance on GPU targets can be highly variable, sometimes inlining everything hurts performance and sometimes it greatly improves
[OpenMP] Add an option to always inline OpenMP device functions.
Performance on GPU targets can be highly variable, sometimes inlining everything hurts performance and sometimes it greatly improves it. Add an option to toggle this behaviour to better investigate it.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D109014
show more ...
|