|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
1f940b69 |
| 15-Jul-2022 |
Joseph Huber <[email protected]> |
[Libomptarget][NFC] Fix signed comparison warnings
Summary: Non-functional change, just fixing some sign comparison warnings by making both match.
|
| #
c9353eb4 |
| 27-Jun-2022 |
Joseph Huber <[email protected]> |
[Libomptarget] Use new tripcount argument in the runtime.
The previous patch added an argument to the `__tgt_target_kernel` runtime function which includes the tripcount used for the loop clause. Th
[Libomptarget] Use new tripcount argument in the runtime.
The previous patch added an argument to the `__tgt_target_kernel` runtime function which includes the tripcount used for the loop clause. This was originally passed in via the `__kmpc_push_target_tripcount` function. Now we move this logic to the kernel launch itself and remove the need for the push function.
Depends on D128816
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D128817
show more ...
|
| #
d27d0a67 |
| 01-Jul-2022 |
Joseph Huber <[email protected]> |
[Libomptarget][NFC] Make Libomptarget use the LLVM naming convention
Libomptarget grew out of a project that was originally not in LLVM. As we develop libomptarget this has led to an increasingly la
[Libomptarget][NFC] Make Libomptarget use the LLVM naming convention
Libomptarget grew out of a project that was originally not in LLVM. As we develop libomptarget this has led to an increasingly large clash between the naming conventions used. This patch fixes most of the variable names that did not confrom to the LLVM standard, that is `VariableName` for variables and `functionName` for functions.
This patch was primarily done using my editor's linting messages, if there are any issues I missed arising from the automation let me know.
Reviewed By: saiislam
Differential Revision: https://reviews.llvm.org/D128997
show more ...
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2 |
|
| #
5ad07ac4 |
| 25-Apr-2022 |
Joseph Huber <[email protected]> |
[Libomptarget] Use entry name for global info
Currently, globals on the device will have an infinite reference count and an unknown name when using `LIBOMPTARGET_INFO` to print the mapping table. We
[Libomptarget] Use entry name for global info
Currently, globals on the device will have an infinite reference count and an unknown name when using `LIBOMPTARGET_INFO` to print the mapping table. We already store the name of the global in the offloading entry so we should be able to use it, although there will be no source location. To do this we need to create a valid `ident_t` string from a name only.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D124381
show more ...
|
|
Revision tags: llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
b3161268 |
| 02-Mar-2022 |
Johannes Doerfert <[email protected]> |
[OpenMP][FIX] Avoid races in the handling of to be deleted mapping entries
If we decided to delete a mapping entry we did not act on it right away but first issued and waited for memory copies. In t
[OpenMP][FIX] Avoid races in the handling of to be deleted mapping entries
If we decided to delete a mapping entry we did not act on it right away but first issued and waited for memory copies. In the meantime some other thread might reuse the entry. While there was some logic to avoid colliding on the actual "deletion" part, there were two races happening:
1) The data transfer back of the thread deleting the entry and the data transfer back of the thread taking over the entry raced. 2) The update to the shadow map happened regardless if the entry was actually reused by another thread which left the shadow map in a inconsistent state.
To fix both issues we will now update the shadow map and delete the entry only if we are sure the thread is responsible for deletion, hence no other thread took over the entry and reused it. We also wait for a potential former data transfer from the device to finish before we issue another one that would race with it.
Fixes https://github.com/llvm/llvm-project/issues/54216
Differential Revision: https://reviews.llvm.org/D121058
show more ...
|
| #
4e34f061 |
| 05-Mar-2022 |
Johannes Doerfert <[email protected]> |
[OpenMP][FIX] Ensure exclusive access to the HDTT map
This patch solves two problems with the `HostDataToTargetMap` (HDTT map) which caused races and crashes before:
1) Any access to the HDTT map n
[OpenMP][FIX] Ensure exclusive access to the HDTT map
This patch solves two problems with the `HostDataToTargetMap` (HDTT map) which caused races and crashes before:
1) Any access to the HDTT map needs to be exclusive access. This was not the case for the "dump table" traversals that could collide with updates by other threads. The new `Accessor` and `ProtectedObject` wrappers will ensure we have a hard time introducing similar races in the future. Note that we could allow multiple concurrent read-accesses but that feature can be added to the `Accessor` API later. 2) The elements of the HDTT map were `HostDataToTargetTy` objects which meant that they could be copied/moved/deleted as the map was changed. However, we sometimes kept pointers to these elements around after we gave up the map lock which caused potential races again. The new indirection through `HostDataToTargetMapKeyTy` will allows us to modify the map while keeping the (interesting part of the) entries valid. To offset potential cost we duplicate the ordering key of the entry which avoids an additional indirect lookup.
We should replace more objects with "protected objects" as we go.
Differential Revision: https://reviews.llvm.org/D121057
show more ...
|
| #
307bbd3c |
| 02-Mar-2022 |
Johannes Doerfert <[email protected]> |
[OpenMP][NFCI] Use RAII lock guards in libomptarget where possible
Differential Revision: https://reviews.llvm.org/D121060
|
| #
7ead7e90 |
| 07-Mar-2022 |
Johannes Doerfert <[email protected]> |
Revert "[OpenMP][NFCI] Use RAII lock guards in libomptarget where possible"
This reverts commit ff50e81b500800708db927cbccca2ab52ec11884 as it broke the buildbots, see https://reviews.llvm.org/D1210
Revert "[OpenMP][NFCI] Use RAII lock guards in libomptarget where possible"
This reverts commit ff50e81b500800708db927cbccca2ab52ec11884 as it broke the buildbots, see https://reviews.llvm.org/D121060#3362737.
show more ...
|
| #
ff50e81b |
| 02-Mar-2022 |
Johannes Doerfert <[email protected]> |
[OpenMP][NFCI] Use RAII lock guards in libomptarget where possible
Differential Revision: https://reviews.llvm.org/D121060
|
|
Revision tags: llvmorg-14.0.0-rc2 |
|
| #
7b731f4d |
| 18-Feb-2022 |
Carlo Bertolli <[email protected]> |
[OpenMP][libomptarget] Delay restore of shadow pointers in structs to after H2D memory copies are completed
When using asynchronous plugin calls, shadow pointer restore could happen before the D2H c
[OpenMP][libomptarget] Delay restore of shadow pointers in structs to after H2D memory copies are completed
When using asynchronous plugin calls, shadow pointer restore could happen before the D2H copy for the entire struct has completed, effectively leaving a device pointer in a host struct. This patch fixes the problem by delaying restore's to after a synchronization happens (target regions) and by calling early synchronization (target update).
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D119968
show more ...
|
| #
c27f530d |
| 13-Feb-2022 |
Shilei Tian <[email protected]> |
[OpenMP][Offloading] Fix infinite loop in applyToShadowMapEntries
This patch fixes the issue that the for loop in `applyToShadowMapEntries` is infinite because `Itr` is not incremented in `CB`. Fixe
[OpenMP][Offloading] Fix infinite loop in applyToShadowMapEntries
This patch fixes the issue that the for loop in `applyToShadowMapEntries` is infinite because `Itr` is not incremented in `CB`. Fixes #53727.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D119471
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init |
|
| #
ad0a306a |
| 31-Jan-2022 |
Joseph Huber <[email protected]> |
[OpenMP][NFC] Change error message on offloading failure to mention documentation
This patch changes the error message to instead mention the documentation page for the debugging options provided by
[OpenMP][NFC] Change error message on offloading failure to mention documentation
This patch changes the error message to instead mention the documentation page for the debugging options provided by libomptarget and the bitcode runtimes. Add some extra information to the documentation to help users more quickly identify debugging resources.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D118626
show more ...
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
b0789a1b |
| 03-Nov-2021 |
Johannes Doerfert <[email protected]> |
[OpenMP] Avoid costly shadow map traversals whenever possible
In the OpenMC app we saw `omp target update` spending an awful lot of time in the shadow map traversal without ever doing any update the
[OpenMP] Avoid costly shadow map traversals whenever possible
In the OpenMC app we saw `omp target update` spending an awful lot of time in the shadow map traversal without ever doing any update there. There are two cases that allow us to avoid the traversal completely. The simplest thing is that small updates cannot (reasonably) contain an attached pointer part. The other case requires to track in the mapping table if an entry might contain an attached pointer as part. Given that we have a single location shadow map entries are created, the latter is actually fairly easy as well.
Differential Revision: https://reviews.llvm.org/D113124
show more ...
|
| #
1e447d03 |
| 19-Jan-2022 |
Johannes Doerfert <[email protected]> |
[OpenMP] Introduce an environment variable to disable atomic map clauses
Atomic handling of map clauses was introduced to comply with the OpenMP standard (see D104418). However, many apps won't need
[OpenMP] Introduce an environment variable to disable atomic map clauses
Atomic handling of map clauses was introduced to comply with the OpenMP standard (see D104418). However, many apps won't need this feature which can be costly in certain situations. To allow for applications to opt-out we now introduce the `LIBOMPTARGET_MAP_FORCE_ATOMIC` environment flag that voids the atomicity guarantee of the standard for map clauses again, shifting the burden to the user.
This patch also de-duplicates the code that introduces the events used to enforce atomicity as a cleanup.
Differential Revision: https://reviews.llvm.org/D117627
show more ...
|
| #
9584c6fa |
| 06-Jan-2022 |
Shilei Tian <[email protected]> |
[OpenMP][Offloading] Fixed data race in libomptarget caused by async data movement
The async data movement can cause data race if the target supports it. Details can be found in [1]. This patch trie
[OpenMP][Offloading] Fixed data race in libomptarget caused by async data movement
The async data movement can cause data race if the target supports it. Details can be found in [1]. This patch tries to fix this problem by attaching an event to the entry of data mapping table. Here are the details.
For each issued data movement, a new event is generated and returned to `libomptarget` by calling `createEvent`. The event will be attached to the corresponding mapping table entry.
For each data mapping lookup, if there is no need for a data movement, the attached event has to be inserted into the queue to gaurantee that all following operations in the queue can only be executed if the event is fulfilled.
This design is to avoid synchronization on host side.
Note that we are using CUDA terminolofy here. Similar mechanism is assumped to be supported by another targets. Even if the target doesn't support it, it can be easily implemented in the following fall back way: - `Event` can be any kind of flag that has at least two status, 0 and 1. - `waitEvent` can directly busy loop if `Event` is still 0.
My local test shows that `bug49334.cpp` can pass.
Reference: [1] https://bugs.llvm.org/show_bug.cgi?id=49940
Reviewed By: grokos, JonChesterfield, ye-luo
Differential Revision: https://reviews.llvm.org/D104418
show more ...
|
| #
8425bde8 |
| 10-Dec-2021 |
Joseph Huber <[email protected]> |
Revert "[OpenMP] Avoid costly shadow map traversals whenever possible"
This reverts commit 7c8f4e7b85ed98497f37571d72609f39a8eed447. Fails a few OpenMP tests, causes a few updates to segfault.
|
| #
7c8f4e7b |
| 10-Dec-2021 |
Joseph Huber <[email protected]> |
[OpenMP] Avoid costly shadow map traversals whenever possible
In the OpenMC app we saw `omp target update` spending an awful lot of time in the shadow map traversal without ever doing any update the
[OpenMP] Avoid costly shadow map traversals whenever possible
In the OpenMC app we saw `omp target update` spending an awful lot of time in the shadow map traversal without ever doing any update there. There are two cases that allow us to avoid the traversal completely. The simplest thing is that small updates cannot (reasonably) contain an attached pointer part. The other case requires to track in the mapping table if an entry might contain an attached pointer as part. Given that we have a single location shadow map entries are created, the latter is actually fairly easy as well.
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D113124
show more ...
|
| #
2feafa2e |
| 25-Oct-2021 |
Georgios Rokos <[email protected]> |
[libomptarget][NFC] Add comment explaining why we pass argument bases and offsets as two separate entities to the plugins.
|
| #
2a30c03c |
| 25-Oct-2021 |
Shilei Tian <[email protected]> |
[OpenMP][Offloading] Only get trip count if team construct
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D112475
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
| #
2cfe1a09 |
| 06-Sep-2021 |
Ye Luo <[email protected]> |
[OpenMP][libomptarget][NFC] Change checkDeviceAndCtors return type to bool.
What is exactly needed is only a boolean. Pulling OFFLOAD_SUCCESS/FAIL only adds confusion.
Differential Revision: https:
[OpenMP][libomptarget][NFC] Change checkDeviceAndCtors return type to bool.
What is exactly needed is only a boolean. Pulling OFFLOAD_SUCCESS/FAIL only adds confusion.
Differential Revision: https://reviews.llvm.org/D109303
show more ...
|
| #
c3aecf87 |
| 04-Sep-2021 |
Ye Luo <[email protected]> |
[OpenMP][libomptarget] Change device vector elements to unique_ptr type
Using std::vector<DeviceTy> requires implementing copy constructor and copied assign operator for DeviceTy. Indeed DeviceTy sh
[OpenMP][libomptarget] Change device vector elements to unique_ptr type
Using std::vector<DeviceTy> requires implementing copy constructor and copied assign operator for DeviceTy. Indeed DeviceTy should never be copied. After changing to std::vector<std::unique_ptr<DeviceTy>>, All the unsafe copy constructor and copy assign operator implementations can be removed. Compilers mark them deleted due to mutex or underlying objects and this is the desired behavior.
Differential Revision: https://reviews.llvm.org/D109276
show more ...
|
| #
786a1406 |
| 01-Sep-2021 |
Joel E. Denny <[email protected]> |
[OpenMP] Use IsHostPtr where needed in rest of omptarget.cpp
As started in D107925, this patch replaces the remaining occurrences of `UNIFIED_SHARED_MEMORY && TgtPtrBegin == HstPtrBegin` in `omptarg
[OpenMP] Use IsHostPtr where needed in rest of omptarget.cpp
As started in D107925, this patch replaces the remaining occurrences of `UNIFIED_SHARED_MEMORY && TgtPtrBegin == HstPtrBegin` in `omptarget.cpp` with `IsHostPtr`. The former condition is broken in the rare case that the device and host happen to use the same address for their mapped allocations. I don't know how to write a test that's likely to reveal this case.
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D107928
show more ...
|
| #
d11bab0b |
| 01-Sep-2021 |
Joel E. Denny <[email protected]> |
[OpenMP] Use IsHostPtr where needed for targetDataBegin
As discussed in D105990, without this patch, `targetDataBegin` determines whether to transfer data (as opposed to assuming it's in shared memo
[OpenMP] Use IsHostPtr where needed for targetDataBegin
As discussed in D105990, without this patch, `targetDataBegin` determines whether to transfer data (as opposed to assuming it's in shared memory) using the condition `!UseUSM || HasCloseModifier`. However, this condition is broken if use of discrete memory was forced by `omp_target_associate_ptr`. This patch extends `unified_shared_memory/associate_ptr.c` to reveal this case, and it fixes it using `!IsHostPtr` in `DeviceTy::getTargetPointer` to replace this condition.
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D107927
show more ...
|
| #
fa6c2755 |
| 01-Sep-2021 |
Joel E. Denny <[email protected]> |
[OpenMP][NFC] Eliminate CopyMember from targetDataEnd
This patch is based on comments in D105990. It is NFC according to the following observations:
1. `CopyMember` is computed as `!IsHostPtr && I
[OpenMP][NFC] Eliminate CopyMember from targetDataEnd
This patch is based on comments in D105990. It is NFC according to the following observations:
1. `CopyMember` is computed as `!IsHostPtr && IsLast`. 2. `DelEntry` is true only if `IsLast` is true.
We apply those observations in order:
``` if ((DelEntry || Always || CopyMember) && !IsHostPtr)
if ((DelEntry || Always || IsLast) && !IsHostPtr)
if ((Always || IsLast) && !IsHostPtr) ```
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D107926
show more ...
|
| #
8e4836b2 |
| 01-Sep-2021 |
Joel E. Denny <[email protected]> |
[OpenMP] Use IsHostPtr where needed for targetDataEnd
As discussed in D105990, without this patch, `targetDataEnd` determines whether to transfer data or delete a device mapping (as opposed to assum
[OpenMP] Use IsHostPtr where needed for targetDataEnd
As discussed in D105990, without this patch, `targetDataEnd` determines whether to transfer data or delete a device mapping (as opposed to assuming it's in shared memory) using two different conditions, each of which is broken for some cases:
1. `!(UNIFIED_SHARED_MEMORY && TgtPtrBegin == HstPtrBegin)`: The broken case is rare: the device and host might happen to use the same address for their mapped allocations. I don't know how to write a test that's likely to reveal this case, but this patch does fix it, as discussed below. 2. `!UNIFIED_SHARED_MEMORY || HasCloseModifier`: There are at least two broken cases: 1. The `close` modifier might have been specified on an `omp target enter data` but not the corresponding `omp target exit data`, which thus might falsely assume a mapping is in shared memory. The test `unified_shared_memory/close_enter_exit.c` already has a missing deletion as a result, and this patch adds a check for that. This patch also adds the new test `close_member.c` to reveal a missing transfer and deletion. 2. Use of discrete memory might have been forced by `omp_target_associate_ptr`, as in the test `unified_shared_memory/api.c`. In the current `targetDataEnd` implementation, this condition turns out not be used for this case: because the reference count is infinite, a transfer is possible only with an `always` modifier, and this condition is never used in that case. To ensure it's never used for that case in the future, this patch adds the test `unified_shared_memory/associate_ptr.c`.
Fortunately, `DeviceTy::getTgtPtrBegin` already has a solution: it reports whether the allocation was found in shared memory via the variable `IsHostPtr`.
After this patch, `HasCloseModifier` is no longer used in `targetDataEnd`, and I wonder if the `close` modifier is ever useful on an `omp target data end`.
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D107925
show more ...
|