|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6 |
|
| #
7effcbda |
| 19-Jun-2022 |
Nico Weber <[email protected]> |
Rename parallelForEachN to just parallelFor
Patch created by running:
rg -l parallelForEachN | xargs sed -i '' -c 's/parallelForEachN/parallelFor/'
No behavior change.
Differential Revision: https://reviews.llvm.org/D128140
|
|
Revision tags: llvmorg-14.0.5 |
|
| #
4290ef54 |
| 25-May-2022 |
Andrew Ng <[email protected]> |
[Support] Reduce allocations in parallelForEach with move
Differential Revision: https://reviews.llvm.org/D126458
|
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init |
|
| #
8e382ae9 |
| 23-Jan-2022 |
Fangrui Song <[email protected]> |
[Support] Simplify parallelForEach{,N}
* Merge parallel_for_each into parallelForEach (this removes 1 `Fn(...)` call)
* Change parallelForEach to use parallelForEachN
* Move parallelForEachN into Parallel.cpp
My x86-64 `lld` executable is 100KiB smaller. No noticeable difference in performance.
Reviewed By: lattner
Differential Revision: https://reviews.llvm.org/D117510
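The second bullet above (a range-style for-each expressed through an index-based loop) can be sketched roughly as follows. This is not the LLVM code: `indexLoop` and `forEachViaIndex` are hypothetical names, and `indexLoop` runs serially here as a stand-in for an index-based parallel primitive.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <vector>

// Hypothetical stand-in for an index-based parallel loop. It runs serially
// here; a real implementation would hand index ranges to worker threads.
inline void indexLoop(std::size_t Begin, std::size_t End,
                      const std::function<void(std::size_t)> &Fn) {
  for (std::size_t I = Begin; I != End; ++I)
    Fn(I);
}

// Range-style for-each written as a thin wrapper over the index-based loop.
// Requires random-access iterators so that Begin[I] is valid.
template <class RandomIt, class Func>
void forEachViaIndex(RandomIt Begin, RandomIt End, Func Fn) {
  assert(Begin <= End && "invalid range");
  const std::size_t N = static_cast<std::size_t>(std::distance(Begin, End));
  // Each index is mapped back to the element it denotes.
  indexLoop(0, N, [&](std::size_t I) { Fn(Begin[I]); });
}
```

Calling `forEachViaIndex(V.begin(), V.end(), Fn)` on a `std::vector` applies `Fn` to every element. Only the index-based loop would need to know about threads, which is presumably why funnelling both entry points through it keeps the templated code small.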
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
7b25fa8c |
| 18-Sep-2021 |
Alexandre Ganea <[email protected]> |
[Support] Attempt to fix deadlock in ThreadGroup
This is an attempt to fix the situation described by https://reviews.llvm.org/D104207#2826290 and PR41508. See sequence of operations leading to the bug in https://reviews.llvm.org/D104207#3004689
We ensure that the Latch is completely "free" before decrementing the number of TaskGroupInstances.
Differential revision: https://reviews.llvm.org/D109914
|
|
Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init |
|
| #
4137ab62 |
| 08-Jul-2020 |
Fangrui Song <[email protected]> |
[Support] Define llvm::parallel::strategy for -DLLVM_ENABLE_THREADS=off builds after D76885
|
|
Revision tags: llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5 |
|
| #
eb4663d8 |
| 17-Mar-2020 |
Fangrui Song <[email protected]> |
[lld][COFF][ELF][WebAssembly] Replace --[no-]threads /threads[:no] with --threads={1,2,...} /threads:{1,2,...}
--no-threads is a name copied from gold. gold has --no-thread, --thread-count and several other --thread-count-*.
There is a need to customize the number of threads (running several lld processes concurrently or customizing the number of LTO threads). Having a single --threads=N is a straightforward replacement of gold's --no-threads + --thread-count.
--no-threads is used rarely. So just delete --no-threads instead of keeping it for compatibility for a while.
If --threads= is specified (ELF,wasm; COFF /threads: is similar), --thinlto-jobs= defaults to --threads=, otherwise all available hardware threads are used.
There is currently no way to override a --threads={1,2,...}. Whether we should use --threads=all is still under debate.
Reviewed By: rnk, aganea
Differential Revision: https://reviews.llvm.org/D76885
|
|
Revision tags: llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3 |
|
| #
8404aeb5 |
| 14-Feb-2020 |
Alexandre Ganea <[email protected]> |
[Support] On Windows, ensure hardware_concurrency() extends to all CPU sockets and all NUMA groups
The goal of this patch is to maximize CPU utilization on multi-socket or high core count systems, so that parallel computations such as LLD/ThinLTO can use all hardware threads in the system. Before this patch, on Windows, at most 64 hardware threads could be used, in some cases dispatched only on one CPU socket.
== Background ==
Windows doesn't have a flat cpu_set_t like Linux. Instead, it projects hardware CPUs (or NUMA nodes) to applications through a concept of "processor groups". A "processor" is the smallest unit of execution on a CPU, that is, a hyper-thread if SMT is active; a core otherwise. There's a limit of 32 processors on older 32-bit versions of Windows, which was later raised to 64 processors with 64-bit versions of Windows. This limit comes from the affinity mask, which has historically been the size of a pointer (sizeof(void*)). Consequently, the concept of "processor groups" was introduced for dealing with systems with more than 64 hyper-threads.
By default, the Windows OS assigns only one "processor group" to each starting application, in a round-robin manner. If the application wants to use more processors, it needs to programmatically enable it, by assigning threads to other "processor groups". This also means that affinity cannot cross "processor group" boundaries; one can only specify a "preferred" group on start-up, but the application is free to allocate more groups if it wants to.
This creates a peculiar situation, where newer CPUs like the AMD EPYC 7702P (64-cores, 128-hyperthreads) are projected by the OS as two (2) "processor groups". This means that by default, an application can only use half of the cores. This situation could only get worse in the years to come, as dies with more cores will appear on the market.
== The problem ==
The heavyweight_hardware_concurrency() API was introduced so that only *one hardware thread per core* was used. Once that API returns, the original intention is lost; only the number of threads is retained. Consider a situation, on Windows, where the system has 2 CPU sockets, 18 cores each, each core having 2 hyper-threads, for a total of 72 hyper-threads. Both heavyweight_hardware_concurrency() and hardware_concurrency() currently return 36, because on Windows they are simply wrappers over std::thread::hardware_concurrency() -- which can only return processors from the current "processor group".
== The changes in this patch ==
To solve this situation, we capture (and retain) the initial intention until the point of usage, through a new ThreadPoolStrategy class. The number of threads to use is deferred as late as possible, until the moment where the std::threads are created (ThreadPool in the case of ThinLTO).
When using hardware_concurrency(), setting ThreadCount to 0 now means to use all the possible hardware CPU (SMT) threads. Providing a ThreadCount above the maximum number of threads will have no effect; the maximum will be used instead. The heavyweight_hardware_concurrency() is similar to hardware_concurrency(), except that only one thread per hardware *core* will be used.
When LLVM_ENABLE_THREADS is OFF, the threading APIs will always return 1, to ensure any caller loops will be exercised at least once.
Differential Revision: https://reviews.llvm.org/D71775
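The thread-count rules described above (0 means "use everything", larger requests are capped, and the heavyweight variant counts cores rather than SMT threads) can be illustrated with a small sketch. `HardwareInfo`, `Strategy` and `computeThreadCount` are hypothetical names chosen for this example; they are not the ThreadPoolStrategy interface, and the hardware numbers are passed in rather than queried from the OS.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical description of the machine; not an LLVM type.
struct HardwareInfo {
  unsigned Cores;          // physical cores across all sockets
  unsigned ThreadsPerCore; // SMT threads per core
};

// Hypothetical strategy record, loosely modelled on the description above.
struct Strategy {
  unsigned ThreadsRequested = 0; // 0 means "use the maximum"
  bool OneThreadPerCore = false; // the "heavyweight" variant
};

// Resolve the strategy to a concrete thread count at the point of use.
inline unsigned computeThreadCount(const Strategy &S, const HardwareInfo &HW) {
  assert(HW.Cores > 0 && HW.ThreadsPerCore > 0);
  const unsigned Max =
      S.OneThreadPerCore ? HW.Cores : HW.Cores * HW.ThreadsPerCore;
  if (S.ThreadsRequested == 0)
    return Max;
  // Requests above the maximum have no effect beyond the maximum.
  return std::min(S.ThreadsRequested, Max);
}
```

For the 2-socket, 18-core, 2-way SMT machine mentioned earlier (36 cores in total), this sketch yields 72 for the default strategy and 36 for the one-thread-per-core variant.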
|
|
Revision tags: llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1 |
|
| #
564481ae |
| 16-Mar-2019 |
Andrew Ng <[email protected]> |
[Support] ThreadPoolExecutor fixes for Windows/MinGW
Changed ThreadPoolExecutor to no longer use detached threads and instead to join threads on destruction. This is to prevent intermittent crashing on Windows when doing a normal full exit, e.g. via exit().
Changed ThreadPoolExecutor to be a ManagedStatic so that it can be stopped on llvm_shutdown(). Without this, it would only be stopped in the destructor when doing a full exit. This is required to avoid intermittent crashing on Windows due to a race condition between the ThreadPoolExecutor starting up threads and the process doing a fast exit, e.g. via _exit().
The Windows crashes appear to only occur with the MSVC static runtimes and are more frequent with the debug static runtime.
These changes also prevent intermittent deadlocks on exit with the MinGW runtime.
Differential Revision: https://reviews.llvm.org/D70447
|
| #
d4960032 |
| 10-Oct-2019 |
Nico Weber <[email protected]> |
win: Move Parallel.h off concrt to cross-platform code
r179397 added Parallel.h and implemented it in terms of concrt in 2013.
In 2015, a cross-platform implementation of the functions appeared and is in use everywhere but on Windows (r232419). r246219 hints that <thread> had issues in MSVC2013, but r296906 suggests they've been fixed now that we require 2015+.
So remove the concrt code. It's less code, and it sounds like concrt has conceptual and performance issues, see PR41198.
I built blink_core.dll in a debug component build with full symbols and in a release component build without any symbols. I couldn't measure a performance difference for linking blink_core.dll before and after this patch.
Differential Revision: https://reviews.llvm.org/D68820
llvm-svn: 374421
|
| #
f6a62909 |
| 25-Apr-2019 |
Fangrui Song <[email protected]> |
Parallel: only allow the first TaskGroup to run tasks parallelly
Summary: Concurrent (e.g. nested) llvm::parallel::for_each() may lead to deadlocks. See PR35788 (fixed by rLLD322041) and PR41508 (fixed by D60757).
When parallel_for_each() is about to return, in ~Latch() called by ~TaskGroup(), a thread (in the default executor) may block in Latch::sync() waiting for Count to become zero. If all threads in the default executor are blocked, it is a deadlock.
To fix this, force serial execution if the current TaskGroup is not the first one. For a nested llvm::parallel::for_each(), this parallelizes the outermost loop and serializes inner loops.
Differential Revision: https://reviews.llvm.org/D61115
llvm-svn: 359182
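The "first group runs in parallel, later groups run serially" rule can be sketched with a counter of live groups. `GroupGuard` is a hypothetical class written for this illustration, not LLVM's TaskGroup; it only records the decision and does not schedule any work.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical guard: each instance bumps a shared counter on construction.
// Only the instance that finds the counter at zero may run tasks in parallel;
// any group created while another one is alive falls back to serial execution.
class GroupGuard {
  static std::atomic<int> &active() {
    static std::atomic<int> Count{0};
    return Count;
  }
  bool Parallel;

public:
  GroupGuard() : Parallel(active().fetch_add(1) == 0) {}
  ~GroupGuard() { active().fetch_sub(1); }
  GroupGuard(const GroupGuard &) = delete;
  GroupGuard &operator=(const GroupGuard &) = delete;

  // True for the outermost group, false for nested or concurrent ones.
  bool runsInParallel() const { return Parallel; }
};
```

With this guard, an outer loop that creates a `GroupGuard` sees `runsInParallel() == true`, while a loop started from inside one of its tasks sees `false` and would execute its body inline, so no worker ends up blocked waiting on work that needs another worker.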
|
|
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1 |
|
| #
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <[email protected]> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
|
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2 |
|
| #
0f2a48c1 |
| 11-May-2018 |
Nico Weber <[email protected]> |
Remove unused SyncExecutor and make it clearer that the whole file is only used if LLVM_ENABLE_THREADS
llvm-svn: 332098
|
| #
5f8f34e4 |
| 01-May-2018 |
Adrian Prantl <[email protected]> |
Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers in our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
llvm-svn: 331272
|
|
Revision tags: llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1 |
|
| #
8c0ff950 |
| 04-Oct-2017 |
Rafael Espindola <[email protected]> |
Bring r314809 back.
But now include a check for CPU_COUNT so we still build on 10 year old versions of glibc.
Original message:
Use sched_getaffinity instead of std::thread::hardware_concurrency.
The issue with std::thread::hardware_concurrency is that it forwards to libc and some implementations (like glibc) don't take thread affinity into consideration.
With this change an LLVM program that can execute on only 2 cores will use 2 threads, even if the machine has 32 cores.
This makes benchmarking a lot easier, but should also help if someone doesn't want to use all cores for compilation, for example.
llvm-svn: 314931
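An affinity-aware thread count along the lines described above might look like the following sketch. The function name `affinityAwareThreadCount` is hypothetical, the code is not taken from LLVM, and it assumes a Linux/glibc environment where `sched_getaffinity` and the `CPU_COUNT` macro are available; elsewhere it falls back to std::thread::hardware_concurrency.

```cpp
#include <cassert>
#include <thread>
#if defined(__linux__)
#include <sched.h>
#endif

// Returns the number of CPUs this process is allowed to run on, when the
// platform can report it, and the plain hardware thread count otherwise.
inline unsigned affinityAwareThreadCount() {
  unsigned Result = 0;
#if defined(__linux__) && defined(CPU_COUNT)
  // CPU_COUNT is checked because older glibc versions do not provide it.
  cpu_set_t Set;
  CPU_ZERO(&Set);
  if (sched_getaffinity(0, sizeof(Set), &Set) == 0) {
    const int N = CPU_COUNT(&Set);
    if (N > 0)
      Result = static_cast<unsigned>(N);
  }
#endif
  if (Result == 0)
    Result = std::thread::hardware_concurrency();
  if (Result == 0)
    Result = 1; // hardware_concurrency() may return 0 when it cannot tell
  assert(Result >= 1);
  return Result;
}
```

Run under something like `taskset -c 0,1`, a program using this sketch would be expected to see 2 rather than the machine's full core count, which is the behaviour the commit message describes.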
|
| #
bef94bcb |
| 04-Oct-2017 |
Daniel Neilson <[email protected]> |
Revert D38481 due to missing cmake check for CPU_COUNT
Summary: This reverts D38481. The change breaks systems with older versions of glibc. It injects a use of CPU_COUNT() from sched.h without checking to ensure that the function exists first.
Reviewers:
Subscribers:
llvm-svn: 314922
|
| #
6e182fba |
| 03-Oct-2017 |
Rafael Espindola <[email protected]> |
Use sched_getaffinity instead of std::thread::hardware_concurrency.
The issue with std::thread::hardware_concurrency is that it forwards to libc and some implementations (like glibc) don't take thread affinity into consideration.
With this change an LLVM program that can execute on only 2 cores will use 2 threads, even if the machine has 32 cores.
This makes benchmarking a lot easier, but should also help if someone doesn't want to use all cores for compilation, for example.
llvm-svn: 314809
|
|
Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2 |
|
| #
905da745 |
| 11-May-2017 |
Hans Wennborg <[email protected]> |
Fix -DLLVM_ENABLE_THREADS=OFF build after r302748
llvm-svn: 302806
|
| #
7a2d5681 |
| 11-May-2017 |
Zachary Turner <[email protected]> |
Final (hopefully) fix for the build bots.
This time it actually occurred to me to change the #defines to actually test the pre-processed out codepath. Hopefully this time it works.
llvm-svn: 302752
|
| #
20c8e919 |
| 11-May-2017 |
Zachary Turner <[email protected]> |
Try again to fix the buildbots.
TaskGroup and Latch need to be in llvm::parallel::detail, not in llvm::detail.
llvm-svn: 302751
|
| #
bfb8e189 |
| 11-May-2017 |
Zachary Turner <[email protected]> |
Fix build errors with Parallel.
llvm-svn: 302749
|
| #
3a57fbd6 |
| 11-May-2017 |
Zachary Turner <[email protected]> |
[Support] Move Parallel algorithms from LLD to LLVM.
Differential Revision: https://reviews.llvm.org/D33024
llvm-svn: 302748
|