|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
b52d33e6 |
| 27-Jun-2022 |
Johannes Doerfert <[email protected]> |
[OpenMP][NFC] Reuse check lines for Clang/OpenMP tests
I used a script to reuse existing check lines rather than creating new ones. There are more opportunities to reduce the line count but the "che
[OpenMP][NFC] Reuse check lines for Clang/OpenMP tests
I used a script to reuse existing check lines rather than creating new ones. There are more opportunities to reduce the line count but the "check generated functions" logic makes that somewhat tricky.
FWIW, we really should redo the update script with all these use cases in mind...
Differential Revision: https://reviews.llvm.org/D128686
show more ...
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
532dc62b |
| 07-Apr-2022 |
Nikita Popov <[email protected]> |
[OpaquePtrs][Clang] Add -no-opaque-pointers to tests (NFC)
This adds -no-opaque-pointers to clang tests whose output will change when opaque pointers are enabled by default. This is intended to be p
[OpaquePtrs][Clang] Add -no-opaque-pointers to tests (NFC)
This adds -no-opaque-pointers to clang tests whose output will change when opaque pointers are enabled by default. This is intended to be part of the migration approach described in https://discourse.llvm.org/t/enabling-opaque-pointers-by-default/61322/9.
The patch has been produced by replacing %clang_cc1 with %clang_cc1 -no-opaque-pointers for tests that fail with opaque pointers enabled. Worth noting that this doesn't cover all tests, there's a remaining ~40 tests not using %clang_cc1 that will need a followup change.
Differential Revision: https://reviews.llvm.org/D123115
show more ...
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init |
|
| #
53d5757e |
| 01-Feb-2022 |
Joseph Huber <[email protected]> |
[OpenMP] Add kernel string attribute to kernel function
This patch adds a function attribute to the kernel function generated in OpenMP offloading. We already create a `nvvm.annotations` metadata no
[OpenMP] Add kernel string attribute to kernel function
This patch adds a function attribute to the kernel function generated in OpenMP offloading. We already create a `nvvm.annotations` metadata node indicating the kernels present in the program. However, this created some indirection when trying to identify if a specific function was an entry. We add a single function attribute for each function now to simplify this.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D118708
show more ...
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3 |
|
| #
1b1c8d83 |
| 16-Jan-2022 |
hyeongyu kim <[email protected]> |
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc2 |
|
| #
bc9c4d72 |
| 09-Dec-2021 |
Joseph Huber <[email protected]> |
[OpenMP][FIX] Pass the num_threads value directly to parallel_51
The problem with the old scheme is that we would need to keep track of the "next region" and reset the num_threads value after it. Th
[OpenMP][FIX] Pass the num_threads value directly to parallel_51
The problem with the old scheme is that we would need to keep track of the "next region" and reset the num_threads value after it. The new RT doesn't do it and an assertion is triggered. The old RT doesn't do it either, I haven't tested it but I assume a num_threads clause might impact multiple parallel regions "accidentally". Further, in SPMD mode num_threads was simply ignored, for some reason beyond me.
In any case, parallel_51 is designed to take the clause value directly, so let's do that instead.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D113623
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1 |
|
| #
fd9b0999 |
| 08-Nov-2021 |
hyeongyu kim <[email protected]> |
Revert "[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default"
This reverts commit aacfbb953eb705af2ecfeb95a6262818fa85dd92.
Revert "Fix lit test failu
Revert "[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default"
This reverts commit aacfbb953eb705af2ecfeb95a6262818fa85dd92.
Revert "Fix lit test failures in CodeGenCoroutines"
This reverts commit 63fff0f5bffe20fa2c84a45a41161afa0043cb34.
show more ...
|
| #
aacfbb95 |
| 15-Oct-2021 |
hyeongyukim <[email protected]> |
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)
This patch updates test files after D105169. Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows:
(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.
(2) The remaining tests are updated manually.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D108453
Resolve lit failures in clang after 8ca4b3e's land
Fix lit test failures in clang-ppc* and clang-x64-windows-msvc
Fix missing failures in clang-ppc64be* and retry fixing clang-x64-windows-msvc
Fix internal_clone(aarch64) inline assembly
show more ...
|
| #
89ad2822 |
| 06-Nov-2021 |
Juneyoung Lee <[email protected]> |
Revert "[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default"
This reverts commit 7584ef766a7219b6ee5a400637206d26e0fa98ac.
|
| #
7584ef76 |
| 06-Nov-2021 |
Juneyoung Lee <[email protected]> |
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default
Turning on `enable_noundef_analysis` flag allows better codegen by removing freeze instructions. I modified clang by renaming `enable_noundef_analysis` flag to `disable-noundef-analysis` and turning it off by default.
Test updates are made as a separate patch: D108453
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D105169
show more ...
|
| #
f193bcc7 |
| 18-Oct-2021 |
Juneyoung Lee <[email protected]> |
Revert D105169 due to the two-stage failure in ASAN
This reverts the following commits: 37ca7a795b277c20c02a218bf44052278c03344b 9aa6c72b92b6c89cc6d23b693257df9af7de2d15 705387c5074bcca36d626882462e
Revert D105169 due to the two-stage failure in ASAN
This reverts the following commits: 37ca7a795b277c20c02a218bf44052278c03344b 9aa6c72b92b6c89cc6d23b693257df9af7de2d15 705387c5074bcca36d626882462ebbc2bcc3bed4 8ca4b3ef19fe82d7ad6a6e1515317dcc01b41515 80dba72a669b5416e97a42fd2c2a7bc5a6d3f44a
show more ...
|
| #
8ca4b3ef |
| 15-Oct-2021 |
Juneyoung Lee <[email protected]> |
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)
This patch updates test files after D105169. Autogenerated test codes are changed by `utils/up
[Clang/Test]: Rename enable_noundef_analysis to disable-noundef-analysis and turn it off by default (2)
This patch updates test files after D105169. Autogenerated test codes are changed by `utils/update_cc_test_checks.py,` and non-autogenerated test codes are changed as follows:
(1) I wrote a python script that (partially) updates the tests using regex: {F18594904} The script is not perfect, but I believe it gives hints about which patterns are updated to have `noundef` attached.
(2) The remaining tests are updated manually.
Reviewed By: eugenis
Differential Revision: https://reviews.llvm.org/D108453
show more ...
|
| #
f7de6962 |
| 12-Oct-2021 |
hsmahesha <[email protected]> |
[CFE][Codegen][In-progress] Remove CodeGenFunction::InitTempAlloca()
Sequel patch to https://reviews.llvm.org/D111293.
Remove call to CodeGenFunction::InitTempAlloca() from OpenMP related codegen p
[CFE][Codegen][In-progress] Remove CodeGenFunction::InitTempAlloca()
Sequel patch to https://reviews.llvm.org/D111293.
Remove call to CodeGenFunction::InitTempAlloca() from OpenMP related codegen part.
Also remove the metadata `!llvm.access.group` from the updated lit tests.
Reviewed By: rjmccall
Differential Revision: https://reviews.llvm.org/D111316
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
423d34f7 |
| 22-Sep-2021 |
Shilei Tian <[email protected]> |
[OpenMP][Offloading] Change `bool IsSPMD` to `int8_t Mode` in `__kmpc_target_init` and `__kmpc_target_deinit`
This is a follow-up of D110029, which uses bitset to indicate execution mode. This patch
[OpenMP][Offloading] Change `bool IsSPMD` to `int8_t Mode` in `__kmpc_target_init` and `__kmpc_target_deinit`
This is a follow-up of D110029, which uses bitset to indicate execution mode. This patches makes the changes in the function call.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110279
show more ...
|
| #
ac90dfc4 |
| 21-Sep-2021 |
Giorgis Georgakoudis <[email protected]> |
Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit 1d66649adf28d48ae1731516d87fb899426e3349.
Revert to fix AMG GPU issue.
|
| #
1d66649a |
| 21-Sep-2021 |
Giorgis Georgakoudis <[email protected]> |
[OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument lis
[OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert, jhuber6
Differential Revision: https://reviews.llvm.org/D102107
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init |
|
| #
fb0cf017 |
| 19-Jul-2021 |
Giorgis Georgakoudis <[email protected]> |
Revert "[OpenMP] Codegen aggregate for outlined function captures"
This reverts commit e9c7291cb25f071f1a1dfa4049ed9f5a8a217b3e.
Fix failing tests
|
|
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
e9c7291c |
| 15-Jun-2021 |
Giorgis Georgakoudis <[email protected]> |
[OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument lis
[OpenMP] Codegen aggregate for outlined function captures
Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D102107
show more ...
|
| #
e2cfbfcc |
| 17-Jun-2021 |
Johannes Doerfert <[email protected]> |
[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime
[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime. The OMPIRBuilder is used for reuse in Flang. A custom state machine will be generated in the follow up patch.
The "surplus" threads of the "master warp" will not exit early anymore so we need to use non-aligned barriers. The new runtime will not have an extra warp but also require these non-aligned barriers.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
This was in parts extracted from D59319.
Reviewed By: ABataev, JonChesterfield
Differential Revision: https://reviews.llvm.org/D101976
show more ...
|
|
Revision tags: llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
68d133a3 |
| 22-Mar-2021 |
Joseph Huber <[email protected]> |
[OpenMP] Simplify GPU memory globalization
Summary: Memory globalization is required to maintain OpenMP standard semantics for data sharing between worker and master threads. The GPU cannot share da
[OpenMP] Simplify GPU memory globalization
Summary: Memory globalization is required to maintain OpenMP standard semantics for data sharing between worker and master threads. The GPU cannot share data between its threads so must allocate global or shared memory to store the data in. Currently this is implemented fully in the frontend using the `__kmpc_data_sharing_push_stack` and __kmpc_data_sharing_pop_stack` functions to emulate standard CPU stack sharing. The front-end scans the target region for variables that escape the region and must be shared between the threads. Each variable then has a field created for it in a global record type.
This patch replaces this functinality with a single allocation command, effectively mimicing an alloca instruction for the variables that must be shared between the threads. This will be much slower than the current solution, but makes it much easier to optimize as we can analyze each variable independently and determine if it is not captured. In the future, we can replace these calls with an `alloca` and small allocations can be pushed to shared memory.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D97680
show more ...
|
| #
df729e2b |
| 22-Apr-2021 |
Johannes Doerfert <[email protected]> |
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling and extends it to support `omp begin declare target` as well.
This started with
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to avoid the "ref" global use introduced for globals with internal linkage. From there it went down the rabbit hole, e.g., all variables, even `nohost` ones, were emitted into the device code so it was impossible to determine if "ref" was needed late in the game (based on the name only). To make it really useful, `begin declare target` was needed as it can carry the `device_type`. Not emitting variables eagerly had a ripple effect. Finally, the precedence of the (explicit) declare target list items needed to be taken into account, that meant we cannot just look for any declare target attribute to make a decision. This caused the handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse declarations and defintions" as part of OpenMP parsing, this will always break at some point. Instead, we keep track what region we are in and act on definitions and declarations instead, this is what we do for declare variant and other begin/end directives already.
Highlights: - new diagnosis for restrictions specificed in the standard, - delayed emission of globals not mentioned in an explicit list of a declare target, - omission of `nohost` globals on the host and `host` globals on the device, - no explicit parsing of declarations in-between `omp [begin] declare variant` and the corresponding end anymore, regular parsing instead, - precedence for explicit mentions in `declare target` lists over implicit mentions in the declaration-definition-seq, and - `omp allocate` declarations will now replace an earlier emitted global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do on their own lead to "inconsistent states", which seem less desirable overall.
After working through this I feel the standard should remove the explicit declare target forms as the delayed emission is horrible. That said, while we delay things anyway, it seems to me we check too often for the current status even though that is often not sufficient to act upon. There seems to be a lot of duplication that can probably be trimmed down. Eagerly emitting some things seems pretty weak as an argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
show more ...
|
| #
207b08a9 |
| 05-May-2021 |
Giorgis Georgakoudis <[email protected]> |
[OpenMP][NFC] Refactor Clang OpenMP tests using update_cc_test_checks
This patch refactors a subset of Clang OpenMP tests, generating checklines using the update_cc_test_checks script. This refactor
[OpenMP][NFC] Refactor Clang OpenMP tests using update_cc_test_checks
This patch refactors a subset of Clang OpenMP tests, generating checklines using the update_cc_test_checks script. This refactoring facilitates updating the Clang OpenMP code generation codebase by automating test generation.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D101849
show more ...
|
| #
78a7d8c4 |
| 05-May-2021 |
Giorgis Georgakoudis <[email protected]> |
[Utils][NFC] Rename replace-function-regex in update_cc_test_checks
This patch renames the replace-function-regex to replace-value-regex to indicate that the existing regex replacement functionality
[Utils][NFC] Rename replace-function-regex in update_cc_test_checks
This patch renames the replace-function-regex to replace-value-regex to indicate that the existing regex replacement functionality can replace any IR value besides functions.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D101934
show more ...
|
| #
f016c06a |
| 05-May-2021 |
Giorgis Georgakoudis <[email protected]> |
Revert "[OpenMP][NFC] Refactor Clang OpenMP tests using update_cc_test_checks"
This reverts commit 956cae2f09b21429dbcb02066c99e35a239aa4bf.
|
| #
956cae2f |
| 04-May-2021 |
Giorgis Georgakoudis <[email protected]> |
[OpenMP][NFC] Refactor Clang OpenMP tests using update_cc_test_checks
This patch refactors a subset of Clang OpenMP tests, generating checklines using the update_cc_test_checks script. This refactor
[OpenMP][NFC] Refactor Clang OpenMP tests using update_cc_test_checks
This patch refactors a subset of Clang OpenMP tests, generating checklines using the update_cc_test_checks script. This refactoring facilitates updating the Clang OpenMP code generation codebase by automating test generation.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D101849
show more ...
|
| #
a2dbfb6b |
| 21-Apr-2021 |
Giorgis Georgakoudis <[email protected]> |
[OpenMP] Simplify offloading parallel call codegen
This revision simplifies Clang codegen for parallel regions in OpenMP GPU target offloading and corresponding changes in libomptarget: SPMD/non-SPM
[OpenMP] Simplify offloading parallel call codegen
This revision simplifies Clang codegen for parallel regions in OpenMP GPU target offloading and corresponding changes in libomptarget: SPMD/non-SPMD parallel calls are unified under a single `kmpc_parallel_51` runtime entry point for parallel regions (which will be commonized between target, host-side parallel regions), data sharing is internalized to the runtime. Tests have been auto-generated using `update_cc_test_checks.py`. Also, the revision contains changes to OpenMPOpt for remark creation on target offloading regions.
Reviewed By: jdoerfert, Meinersbur
Differential Revision: https://reviews.llvm.org/D95976
show more ...
|