|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
27177b82 |
| 10-Nov-2021 |
Jon Chesterfield <[email protected]> |
[OpenMP] Lower printf to __llvm_omp_vprintf
Extension of D112504. Lower amdgpu printf to `__llvm_omp_vprintf` which takes the same const char*, void* arguments as cuda vprintf and also passes the si
[OpenMP] Lower printf to __llvm_omp_vprintf
Extension of D112504. Lower amdgpu printf to `__llvm_omp_vprintf` which takes the same const char*, void* arguments as cuda vprintf and also passes the size of the void* alloca which will be needed by a non-stub implementation of `__llvm_omp_vprintf` for amdgpu.
This removes the amdgpu link error on any printf in a target region in favour of silently compiling code that doesn't print anything to stdout.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D112680
show more ...
|
| #
0fa45d6d |
| 08-Nov-2021 |
Jon Chesterfield <[email protected]> |
Revert "[OpenMP] Lower printf to __llvm_omp_vprintf"
This reverts commit db81d8f6c4d6c4f8dfaa036d6959528c9f14e7d7.
|
| #
db81d8f6 |
| 08-Nov-2021 |
Jon Chesterfield <[email protected]> |
[OpenMP] Lower printf to __llvm_omp_vprintf
Extension of D112504. Lower amdgpu printf to `__llvm_omp_vprintf` which takes the same const char*, void* arguments as cuda vprintf and also passes the si
[OpenMP] Lower printf to __llvm_omp_vprintf
Extension of D112504. Lower amdgpu printf to `__llvm_omp_vprintf` which takes the same const char*, void* arguments as cuda vprintf and also passes the size of the void* alloca which will be needed by a non-stub implementation of `__llvm_omp_vprintf` for amdgpu.
This removes the amdgpu link error on any printf in a target region in favour of silently compiling code that doesn't print anything to stdout.
The exact set of changes to check-openmp probably needs revision before commit
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D112680
show more ...
|
| #
4d50803c |
| 28-Oct-2021 |
Jon Chesterfield <[email protected]> |
[libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987. CI is showing a different set of pass/fails to local, committing this withou
[libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987. CI is showing a different set of pass/fails to local, committing this without the tests enabled by default while debugging that difference.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D112227
show more ...
|
| #
6c7b203d |
| 28-Oct-2021 |
Jon Chesterfield <[email protected]> |
Revert "[libomptarget] Build DeviceRTL for amdgpu" - more tests failing on CI than failed locally when writing this patch
This reverts commit 33427fdb7b52b79ce5e25b7e14e0f1a44d876bd2.
|
| #
33427fdb |
| 27-Oct-2021 |
Jon Chesterfield <[email protected]> |
[libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D11
[libomptarget] Build DeviceRTL for amdgpu
Passes same tests as the current deviceRTL. Includes cmake change from D111987.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D112227
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
| #
ad0f6e1d |
| 19-Aug-2021 |
Jon Chesterfield <[email protected]> |
[openmp] Disable the tests that block CI for amdgpu and host offloading.
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
c2c43132 |
| 23-Jul-2021 |
Shilei Tian <[email protected]> |
[OpenMP] Fix bug 50022
Bug 50022 [0] reports target nowait fails in certain case, which is added in this patch. The root cause of the failure is, when the second task is created, its parent's `td_in
[OpenMP] Fix bug 50022
Bug 50022 [0] reports target nowait fails in certain case, which is added in this patch. The root cause of the failure is, when the second task is created, its parent's `td_incomplete_child_tasks` will not be incremented because there is no parallel region here thus its team is serialized. Therefore, when the initial thread is waiting for its unfinished children tasks, it thought there is only one, the first task, because it is hidden helper task, so it is tracked. The second task will only be pushed to the queue when the first task is finished. However, when the first task finishes, it first decrements the counter of its parent, and then release dependences. Once the counter is decremented, the thread will move on because its counter is reset, but actually, the second task has not been executed at all. As a result, since in this case, the main function finishes, then `libomp` starts to destroy. When the second task is pushed somewhere, all some of the structures might already have already been destroyed, then anything could happen.
This patch simply moves `__kmp_release_deps` ahead of decrement of the counter. In this way, we can make sure that the initial thread is aware of the existence of another task(s) so it will not move on. In addition, in order to tackle dependence chain starting with hidden helper thread, when hidden helper task is encountered, we force the task to release dependences.
Reference: [0] https://bugs.llvm.org/show_bug.cgi?id=50022
Reviewed By: AndreyChurbanov
Differential Revision: https://reviews.llvm.org/D106519
show more ...
|