|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
| #
17dcde5f |
| 18-May-2022 |
AndreyChurbanov <[email protected]> |
[OpenMP][libomp] Allow reset affinity mask after parallel
Added a control to reset the affinity of the primary thread after the outermost parallel region to the initial affinity encountered before the OpenMP runtime was initialized. Introduced the reset/noreset modifier for the KMP_AFFINITY environment variable. Default behavior is unchanged.
Differential Revision: https://reviews.llvm.org/D125993
|
| #
a01d274f |
| 19-May-2022 |
AndreyChurbanov <[email protected]> |
[OpenMP][libomp] Fix /dev/shm pollution after forked child process terminates
Made library registration conditional: it is skipped in the __kmp_atfork_child handler and postponed until middle initialization in the child. This fixes the problem for applications that use, e.g., popen/pclose, which terminate the forked child process.
Differential Revision: https://reviews.llvm.org/D125996
|
| #
d4a7b8de |
| 24-Jun-2022 |
Daniel Douglas <[email protected]> |
[OpenMP][libomp] avoid spin wait and yield on arm64 macOS
This patch changes the default behavior to avoid spin waiting and yielding. (See “Don’t Keep Threads Active And Idle” section here: https://developer.apple.com/documentation/apple-silicon/tuning-your-code-s-performance-for-apple-silicon)
We verified using Instruments traces that the changes improve scheduling behavior on macOS.
We also collected results using EPCC schedbench (https://github.com/LangdalP/EPCC-OpenMP-micro-benchmarks), attached here, that show a reduction in standard deviation and max test run time across all scheduling types. Static scheduling sees dramatic improvements with these changes: we see a 2-4x average runtime improvement in the benchmark.
Differential Revision: https://reviews.llvm.org/D126510
|
| #
b7b49865 |
| 05-May-2022 |
Jonathan Peyton <[email protected]> |
[OpenMP][libomp] Hold old __kmp_threads arrays until library shutdown
When many nested teams are formed, __kmp_threads may be reallocated to accommodate new threads. This reallocation causes a data race when another existing team's thread simultaneously references __kmp_threads. This patch keeps the old thread arrays around until library shutdown so these lingering references can complete without issue and access to __kmp_threads remains a simple array reference.
Fixes: https://github.com/llvm/llvm-project/issues/54708
Differential Revision: https://reviews.llvm.org/D125013
|
| #
f58fe2e1 |
| 03-Jun-2022 |
Vadim Paretsky <[email protected]> |
[OpenMP] allow loc to be NULL in __kmp_determine_reduction_method for MSVC
MSVC may not supply source location information to kmpc_reduce, passing NULL for the value. The patch adds a check for the loc value being NULL in __kmp_determine_reduction_method.
Differential Revision: https://reviews.llvm.org/D126564
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init |
|
| #
1234011b |
| 31-Jan-2022 |
Jonathan Peyton <[email protected]> |
[OpenMP][libomp] Introduce oneAPI compiler support
Introduce KMP_COMPILER_ICX macro to represent compilation with oneAPI compiler.
Fixup flag detection and compiler ID detection in CMake. Older CMake versions detect IntelLLVM as Clang.
Fix compiler warnings.
Fixup many of the tests to have non-empty parallel regions, as empty ones are elided by the oneAPI compiler.
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
2e02579a |
| 14-Dec-2021 |
Terry Wilmarth <[email protected]> |
[OpenMP] Add use of TPAUSE
Add use of TPAUSE (from WAITPKG) to the runtime for Intel hardware, with an environment variable to turn it on in a particular C-state. The runtime always uses TPAUSE if it is selected and the hardware supports it (WAITPKG is present); otherwise it falls back to the old way of checking __kmp_use_yield, etc.
Differential Revision: https://reviews.llvm.org/D115758
|
| #
458db51c |
| 30-Dec-2021 |
Shilei Tian <[email protected]> |
[OpenMP] Add missing `tt_hidden_helper_task_encountered` along with `tt_found_proxy_tasks`
In most cases, hidden helper tasks behave similarly to detached tasks. That means, for example, that if we have to wait for detached tasks, we have to do the same for hidden helper tasks as well. This patch adds the missing condition for hidden helper tasks alongside the one for detached tasks.
Reviewed By: AndreyChurbanov
Differential Revision: https://reviews.llvm.org/D107316
|
| #
4dd8fccb |
| 08-Dec-2021 |
AndreyChurbanov <[email protected]> |
[OpenMP] libomp: Fix crash if application send us negative thread_limit value
Although the specification requires thread_limit to be positive, it is better to warn the user than to crash when the value is negative.
Differential Revision: https://reviews.llvm.org/D115340
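
The "warn instead of crash" policy described above can be sketched in a few lines. This is only an illustrative Python model, not the runtime's C code: the function name `sanitize_thread_limit` and the choice to fall back to 0 (treated here as "no limit requested") are assumptions made for the example.

```python
import warnings


def sanitize_thread_limit(requested: int) -> int:
    """Return a usable thread limit for this sketch.

    Assumption: 0 stands for "no explicit limit"; the real runtime may
    choose a different fallback.
    """
    if requested < 0:
        # Warn the caller rather than failing hard on an invalid value.
        warnings.warn(f"thread_limit={requested} is negative; ignoring it")
        return 0
    return requested
```

A caller would pass whatever value the application supplied and continue with the returned limit, so a bad input degrades to a diagnostic rather than a crash.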
|
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
50b68a3d |
| 15-Sep-2021 |
Peyton, Jonathan L <[email protected]> |
[OpenMP][host runtime] Add support for teams affinity
This patch implements teams affinity on the host. The default is spread. A user can specify either spread, close, or primary using the KMP_TEAMS_PROC_BIND environment variable. Unlike OMP_PROC_BIND, KMP_TEAMS_PROC_BIND takes only a single value, not a list of values. The values follow the same semantics under the OpenMP specification for parallel regions, except that T is the number of teams in a league instead of the number of threads in a parallel region.
Differential Revision: https://reviews.llvm.org/D109921
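
To make the "single value, default spread" rule concrete, here is a small Python model of how such a setting could be validated. The function name `parse_teams_proc_bind` is hypothetical, and raising an error on bad input is a choice made for this sketch; the message above does not say how the runtime reacts to invalid values.

```python
VALID_TEAMS_BIND = ("spread", "close", "primary")


def parse_teams_proc_bind(value=None):
    """Model of KMP_TEAMS_PROC_BIND handling as described in the commit message."""
    if value is None or not value.strip():
        return "spread"  # documented default
    candidate = value.strip()
    if "," in candidate:
        # Unlike OMP_PROC_BIND, a list of values is not accepted.
        raise ValueError("KMP_TEAMS_PROC_BIND takes a single value, not a list")
    if candidate not in VALID_TEAMS_BIND:
        raise ValueError(f"unsupported teams binding: {candidate!r}")
    return candidate
```

For example, `parse_teams_proc_bind(os.environ.get("KMP_TEAMS_PROC_BIND"))` would yield the effective policy in this model (with `os` imported by the caller).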
|
| #
6e98ec9b |
| 13-Oct-2021 |
AndreyChurbanov <[email protected]> |
[OpenMP] libomp: fix ittnotify usage.
Instead of storing the ittnotify domain array index in the location info structure (which is now read-only), the runtime stores (location info address + ittnotify domain + team size) in a hash map. Replaced the __kmp_itt_barrier_domains and __kmp_itt_imbalance_domains arrays with the __kmp_itt_barrier_domains hash map, and the __kmp_itt_region_domains and __kmp_itt_region_team_size arrays with the __kmp_itt_region_domains hash map. Basic functionality did not change (at least tried to not change).
The patch fixes https://bugs.llvm.org/show_bug.cgi?id=48644.
Differential Revision: https://reviews.llvm.org/D111580
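
The idea of moving per-location state out of a read-only structure and into a side table keyed by the structure's address can be sketched as follows. This is a Python analogy only: `SourceLocation`, `RegionDomains`, and the domain naming are hypothetical stand-ins, and `id()` plays the role of the location info address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceLocation:
    """Stand-in for the read-only location info structure."""
    text: str


class RegionDomains:
    """Side table keyed by location address, so the location is never mutated."""

    def __init__(self):
        self._by_address = {}

    def get(self, loc, team_size):
        key = id(loc)  # analogous to using the structure's address as the key
        entry = self._by_address.get(key)
        if entry is None:
            # Keep a reference to loc so its id() stays valid for the table's lifetime.
            entry = (loc, f"domain:{loc.text}", team_size)
            self._by_address[key] = entry
        return entry[1], entry[2]
```

The first lookup for a location creates its entry; later lookups for the same location return the stored domain and team size without touching the location object itself.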
|
|
Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1 |
|
| #
49fabd9d |
| 30-Jul-2021 |
Pirama Arumuga Nainar <[email protected]> |
[openmp] Do not use shared memory on Android
Android provides ashmem/ASharedMemory support on newer releases, which we can use if requested by openmp users on Android.
Also refactor the preprocessor check for using shared memory to kmp_config.h.cmake.
Differential Revision: https://reviews.llvm.org/D107181
|
| #
8b81524c |
| 30-Jul-2021 |
AndreyChurbanov <[email protected]> |
[OpenMP][NFC] libomp: silence warnings on unused variables.
Put declarations/definitions of unused variables under corresponding macros to silence clang build warnings.
Differential Revision: https://reviews.llvm.org/D106608
|
|
Revision tags: llvmorg-14-init |
|
| #
d8e4cb91 |
| 15-Jul-2021 |
Terry Wilmarth <[email protected]> |
[OpenMP] libomp: Add new experimental barrier: two-level distributed barrier
Two-level distributed barrier is a new experimental barrier designed for Intel hardware that has better performance in some cases than the default hyper barrier.
This barrier is designed to handle fine granularity parallelism where barriers are used frequently with little compute and memory access between barriers. There is no need to use it for codes with few barriers and large granularity compute, or memory intensive applications, as little difference will be seen between this barrier and the default hyper barrier. This barrier is designed to work optimally with a fixed number of threads, and has a significant setup time, so should NOT be used in situations where the number of threads in a team is varied frequently.
The two-level distributed barrier is off by default -- hyper barrier is used by default. To use this barrier, you must set all barrier patterns to use this type, because it will not work with other barrier patterns. Thus, to turn it on, the following settings are required:
KMP_FORKJOIN_BARRIER_PATTERN=dist,dist
KMP_PLAIN_BARRIER_PATTERN=dist,dist
KMP_REDUCTION_BARRIER_PATTERN=dist,dist
Branching factors (set with KMP_FORKJOIN_BARRIER, KMP_PLAIN_BARRIER, and KMP_REDUCTION_BARRIER) are ignored by the two-level distributed barrier.
Patch fixed for ITTNotify disabled builds and non-x86 builds
Co-authored-by: Jonathan Peyton <[email protected]>
Co-authored-by: Vladislav Vinogradov <[email protected]>
Differential Revision: https://reviews.llvm.org/D103121
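
Because the distributed barrier only works when all three barrier patterns select it, a launcher script might set and check the variables together. The following Python sketch uses the three variable names and the `dist,dist` value from the message above; the helper names `dist_barrier_env` and `uses_dist_consistently` are hypothetical.

```python
DIST_BARRIER_VARS = (
    "KMP_FORKJOIN_BARRIER_PATTERN",
    "KMP_PLAIN_BARRIER_PATTERN",
    "KMP_REDUCTION_BARRIER_PATTERN",
)


def dist_barrier_env(base_env):
    """Return a copy of base_env with all three barrier patterns set to dist,dist."""
    env = dict(base_env)
    for name in DIST_BARRIER_VARS:
        env[name] = "dist,dist"
    return env


def uses_dist_consistently(env):
    """True if every pattern is dist,dist, or if none of them mentions dist."""
    values = [env.get(name, "") for name in DIST_BARRIER_VARS]
    all_dist = all(v == "dist,dist" for v in values)
    none_dist = all("dist" not in v.split(",") for v in values)
    return all_dist or none_dist
```

The resulting dictionary could be passed as the `env` argument of `subprocess.run` when starting an OpenMP program, and the check can flag a mixed configuration before the program is launched.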
|
| #
424f14f0 |
| 13-Jul-2021 |
Peyton, Jonathan L <[email protected]> |
[OpenMP] Fix one sign-compare warning from GCC
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
681055ea |
| 04-Jun-2021 |
Joachim Protze <[email protected]> |
[OpenMP] Remove TSAN annotations from libomp
The annotations in libomp were never built by default. The annotations are also superseded by the annotations which the OMPT tool libarcher.so provides. With respect to libarcher, libomp behaves as if libarcher were the last element of OMP_TOOL_LIBRARIES. I.e., if no other OMPT tool becomes active, libarcher will check if an OpenMP application is built with TSan.
Since libarcher gets loaded by default, enabling LIBOMP_TSAN_SUPPORT would result in redundant annotations for TSan, which slightly differ in details and coverage (e.g. task dependencies are not handled well by the annotations in libomp).
This patch removes all TSan annotations from the OpenMP runtime code.
Differential Revision: https://reviews.llvm.org/D103767
|
| #
f1b9ce27 |
| 30-Jun-2021 |
Hansang Bae <[email protected]> |
[OpenMP] Fix a few issues with hidden helper task
This patch includes the following changes to address a few issues when using hidden helper task.
- Assertion is triggered when there are inadvertent calls to hidden helper functions on non-Linux OS
- Added deinit code in __kmp_internal_end_library function to fix random shutdown crashes
- Moved task data access into the lock-guarded region in __kmp_push_task
Differential Revision: https://reviews.llvm.org/D105308
|
| #
4eb90e89 |
| 29-Jun-2021 |
Johannes Doerfert <[email protected]> |
Revert "[OpenMP] Add Two-level Distributed Barrier"
This reverts commit 25073a4ecfc9b2e3cb76776185e63bfdb094cd98.
This has been breaking non-x86 OpenMP builds for a while now. Until a solution is ready to be upstreamed, we revert the feature and unblock those builds. See: https://reviews.llvm.org/rG25073a4ecfc9b2e3cb76776185e63bfdb094cd98#1005821 and https://reviews.llvm.org/rG25073a4ecfc9b2e3cb76776185e63bfdb094cd98#1005821
The currently proposed fix (D104788) seems not to be ready yet: https://reviews.llvm.org/D104788#2841928
|
|
Revision tags: llvmorg-12.0.1-rc1 |
|
| #
25073a4e |
| 21-May-2021 |
Terry Wilmarth <[email protected]> |
[OpenMP] Add Two-level Distributed Barrier
Two-level distributed barrier is a new experimental barrier designed for Intel hardware that has better performance in some cases than the default hyper barrier.
This barrier is designed to handle fine granularity parallelism where barriers are used frequently with little compute and memory access between barriers. There is no need to use it for codes with few barriers and large granularity compute, or memory intensive applications, as little difference will be seen between this barrier and the default hyper barrier. This barrier is designed to work optimally with a fixed number of threads, and has a significant setup time, so should NOT be used in situations where the number of threads in a team is varied frequently.
The two-level distributed barrier is off by default -- hyper barrier is used by default. To use this barrier, you must set all barrier patterns to use this type, because it will not work with other barrier patterns. Thus, to turn it on, the following settings are required:
KMP_FORKJOIN_BARRIER_PATTERN=dist,dist
KMP_PLAIN_BARRIER_PATTERN=dist,dist
KMP_REDUCTION_BARRIER_PATTERN=dist,dist
Branching factors (set with KMP_FORKJOIN_BARRIER, KMP_PLAIN_BARRIER, and KMP_REDUCTION_BARRIER) are ignored by the two-level distributed barrier.
Differential Revision: https://reviews.llvm.org/D103121
|
| #
cff21556 |
| 15-Jun-2021 |
Joachim Protze <[email protected]> |
[OpenMP] Remove unused variables from libomp code
Several variables were left unused as a result of different patches removing their use.
Two variables have some use: `poll_count` is used by the KMP_BLOCKING macro only under certain conditions. Adding (void) to tell the compiler to ignore the unused variable.
`padding` is a dummy stack allocation with no intent to be used. Also adding (void) to make the compiler ignore the unused variable.
Differential Revision: https://reviews.llvm.org/D104303
|
| #
0ddde4d8 |
| 02-Jun-2021 |
Peyton, Jonathan L <[email protected]> |
[OpenMP] Lazily assign root affinity
Lazily set affinity for root threads. Previously, the root thread executing middle initialization would attempt to assign affinity to other existing root threads. This was not working properly, as the set_system_affinity() function wasn't setting the affinity for the target thread. Instead, the middle init thread was resetting its own affinity using the target thread's affinity mask.
Differential Revision: https://reviews.llvm.org/D103625
|
| #
f61602b0 |
| 08-Jun-2021 |
Vignesh Balasubramanian <[email protected]> |
[OpenMP][OMPD] Implementation of OMPD debugging library - libompd.
This is the first of seven patches that implement OMPD, a debugging interface to support debugging of OpenMP programs. It contains support code required in "openmp/runtime" for the OMPD implementation.
Reviewed By: @hbae
Differential Revision: https://reviews.llvm.org/D100181
|
| #
8ec9aa23 |
| 10-May-2021 |
Terry Wilmarth <[email protected]> |
[OpenMP] Add experimental nesting mode feature
Nesting mode is a new experimental feature in the OpenMP runtime. It allows a user to set up nesting for an application in a way that corresponds to the hardware topology levels on the machine an application is being run on. For example, if a machine has 2 sockets, each with 12 cores, then use of nesting mode could set up an outer level of nesting that uses 2 threads per parallel region, and an inner level of nesting that uses 12 threads per parallel region.
Nesting mode is controlled with the KMP_NESTING_MODE environment variable as follows:
1) KMP_NESTING_MODE = 0: Nesting mode is off (default); max-active-levels-var is set to 1 (the default -- nesting is off, nested parallel regions are serialized).
2) KMP_NESTING_MODE = 1: Nesting mode is on, and a number of threads will be assigned for each level discovered in the machine topology; max-active-levels-var is set to the number of levels discovered.
3) KMP_NESTING_MODE = n, n>1: [Note: this option is experimental and may change or be removed in the future.] Nesting mode is on, and a number of threads will be assigned for each topology level discovered on the machine, up to k<=n levels (since there may be fewer than n levels discovered in the topology), and beyond the kth level, nested parallel regions will be serialized; NOTE: max-active-levels-var is 1 (the default -- nesting is off, and nested parallel regions are serialized) until the user changes max-active-levels-var.
If the user sets OMP_NUM_THREADS or OMP_MAX_ACTIVE_LEVELS, they will override KMP_NESTING_MODE settings for the associated environment variables. The detected topology may be limited by an affinity mask setting on the initial thread, or if the user sets KMP_HW_SUBSET. See also: KMP_HOT_TEAMS_MAX_LEVEL for controlling use of hot teams for nested parallel regions. Note that this feature only sets numbers of threads used at nesting levels. The user should make use of OMP_PLACES and OMP_PROC_BIND or KMP_AFFINITY for affinitizing those threads, if desired.
Differential Revision: https://reviews.llvm.org/D102188
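
The three KMP_NESTING_MODE cases listed above can be summarized as a small Python model. This is a reading of the commit message, not the runtime's implementation: the function name `nesting_plan` and the representation of the topology as a list of per-level counts (outermost level first) are assumptions, and overrides from OMP_NUM_THREADS or OMP_MAX_ACTIVE_LEVELS are left out.

```python
def nesting_plan(mode, topology_levels):
    """Return (threads_per_level, max_active_levels) for a given nesting mode.

    topology_levels: units discovered at each hardware level, outermost first,
    e.g. [2, 12] for 2 sockets with 12 cores each.
    """
    if mode < 0:
        raise ValueError("nesting mode must be non-negative")
    if mode == 0:
        # Nesting mode off: nested parallel regions are serialized.
        return [], 1
    if mode == 1:
        # One nesting level per discovered topology level.
        return list(topology_levels), len(topology_levels)
    # mode = n > 1: use at most k <= n levels; max-active-levels stays at 1
    # until the user changes it.
    used = min(mode, len(topology_levels))
    return list(topology_levels[:used]), 1
```

With the 2-socket, 12-core example from the message, mode 1 gives 2 threads at the outer level, 12 at the inner level, and two active levels in this model.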
|
| #
c765d140 |
| 10-May-2021 |
Peyton, Jonathan L <[email protected]> |
[OpenMP] Fix hidden helper + affinity
When KMP_AFFINITY is set, each worker thread's gtid value is used as an index into the place list to determine the thread's placement. With hidden helpers enabled, this gtid value is shifted down leading to unexpected shifted thread placement. This patch restores the previous behavior by adjusting the mask index to take the number of hidden helper threads into account.
Hidden helper threads are given the full initial mask and do not participate in any of the other affinity mechanisms (place partitioning, balanced affinity). Their affinity is only printed for debug builds.
Differential Revision: https://reviews.llvm.org/D101882
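
The index adjustment described above can be illustrated with a short Python sketch. It rests on assumptions that the message does not spell out: that gtid 0 is the initial thread, that hidden helper threads occupy gtids 1 through `num_hidden_helpers`, and that a thread's place is its (adjusted) gtid modulo the number of places. The function name `place_index` is hypothetical.

```python
def place_index(gtid, num_places, num_hidden_helpers):
    """Return the place-list index for a thread, or None for hidden helpers.

    Assumed gtid layout: 0 = initial thread, 1..num_hidden_helpers = hidden
    helper threads, larger values = regular worker threads.
    """
    if 1 <= gtid <= num_hidden_helpers:
        # Hidden helpers get the full initial mask and take no part in placement.
        return None
    adjusted = gtid
    if gtid > num_hidden_helpers:
        # Undo the shift introduced by the hidden helper gtids.
        adjusted = gtid - num_hidden_helpers
    return adjusted % num_places
```

With 8 hidden helpers and 6 places, the first regular worker (gtid 9 under the assumed layout) maps to place 1, the same place gtid 1 would get with no hidden helpers at all.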
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
467f3924 |
| 11-Mar-2021 |
Hansang Bae <[email protected]> |
[OpenMP] Misc. changes that add or remove pointer/bound checks
-- Added or moved checks to appropriate places.
-- Removed ineffective null check where the pointer is already being dereferenced around the code.
-- Initialized variables that can be used without definitions.
-- Added call to dlclose/FreeLibrary in OMPT tool activation.
-- Added a new build compiler definition.
Differential Revision: https://reviews.llvm.org/D98584
|