|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2 |
|
| #
a9ac5ac7 |
| 06-Aug-2022 |
Shilei Tian <[email protected]> |
[Clang][OpenMP] Fix the issue that `llvm.lifetime.end` is emitted too early for variables captured in linear clause
Currently if an OpenMP program uses `linear` clause, and is compiled with optimiza
[Clang][OpenMP] Fix the issue that `llvm.lifetime.end` is emitted too early for variables captured in linear clause
Currently if an OpenMP program uses `linear` clause, and is compiled with optimization, `llvm.lifetime.end` for variables listed in `linear` clause are emitted too early such that there could still be uses after that. Let's take the following code as example: ``` // loop.c int j; int *u;
void loop(int n) { int i; for (i = 0; i < n; ++i) { ++j; u = &j; } } ``` We compile using the command: ``` clang -cc1 -fopenmp-simd -O3 -x c -triple x86_64-apple-darwin10 -emit-llvm loop.c -o loop.ll ``` The following IR (simplified) will be generated: ``` @j = local_unnamed_addr global i32 0, align 4 @u = local_unnamed_addr global ptr null, align 8
define void @loop(i32 noundef %n) local_unnamed_addr { entry: %j = alloca i32, align 4 %cmp = icmp sgt i32 %n, 0 br i1 %cmp, label %simd.if.then, label %simd.if.end
simd.if.then: ; preds = %entry call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %j) store ptr %j, ptr @u, align 8 call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j) %0 = load i32, ptr %j, align 4 store i32 %0, ptr @j, align 4 br label %simd.if.end
simd.if.end: ; preds = %simd.if.then, %entry ret void } ``` The most important part is: ``` call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j) %0 = load i32, ptr %j, align 4 store i32 %0, ptr @j, align 4 ``` `%j` is still loaded after `@llvm.lifetime.end.p0(i64 4, ptr nonnull %j)`. This could cause the backend incorrectly optimizes the code and further generates incorrect code. The root cause is, when we emit a construct that could have `linear` clause, it usually has the following pattern: ``` EmitOMPLinearClauseInit(S) { OMPPrivateScope LoopScope(*this); ... EmitOMPLinearClause(S, LoopScope); ... (void)LoopScope.Privatize(); ... } EmitOMPLinearClauseFinal(S, [](CodeGenFunction &) { return nullptr; }); ``` Variables that need to be privatized are added into `LoopScope`, which also serves as a RAII object. When `LoopScope` is destructed and if optimization is enabled, a `@llvm.lifetime.end` is also emitted for each privatized variable. However, the writing back to original variables in `linear` clause happens after the scope in `EmitOMPLinearClauseFinal`, causing the issue we see above.
A quick "fix" seems to be, moving `EmitOMPLinearClauseFinal` inside the scope. However, it doesn't work. That's because the local variable map has been updated by `LoopScope` such that a variable declaration is mapped to the privatized variable, instead of the actual one. In that way, the following code will be generated: ``` %0 = load i32, ptr %j, align 4 store i32 %0, ptr %j, align 4 call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j) ``` Well, now the life time is correct, but apparently the writing back is broken.
In this patch, a new function `OMPPrivateScope::restoreMap` is added and called before calling `EmitOMPLinearClauseFinal`. This can make sure that `EmitOMPLinearClauseFinal` can find the orignal varaibls to write back.
Fixes #56913.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D131272
(cherry picked from commit e21202dac18ed7f718d26a0e131f96b399b4891c)
show more ...
|
|
Revision tags: llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
6678f8e5 |
| 27-Jun-2022 |
Yuanfang Chen <[email protected]> |
[ubsan] Using metadata instead of prologue data for function sanitizer
Information in the function `Prologue Data` is intentionally opaque. When a function with `Prologue Data` is duplicated. The se
[ubsan] Using metadata instead of prologue data for function sanitizer
Information in the function `Prologue Data` is intentionally opaque. When a function with `Prologue Data` is duplicated. The self (global value) references inside `Prologue Data` is still pointing to the original function. This may cause errors like `fatal error: error in backend: Cannot represent a difference across sections`.
This patch detaches the information from function `Prologue Data` and attaches it to a function metadata node.
This and D116130 fix https://github.com/llvm/llvm-project/issues/49689.
Reviewed By: pcc
Differential Revision: https://reviews.llvm.org/D115844
show more ...
|
| #
8322fe20 |
| 27-Jun-2022 |
Ritanya B Bharadwaj <[email protected]> |
Adding support for target in_reduction
Implementing target in_reduction by wrapping target task with host task with in_reduction and if clause. This is in compliance with OpenMP 5.0 section: 2.19.5.
Adding support for target in_reduction
Implementing target in_reduction by wrapping target task with host task with in_reduction and if clause. This is in compliance with OpenMP 5.0 section: 2.19.5.6. So, this
``` for (int i=0; i<N; i++) { res = res+i } ```
will become
```
#pragma omp task in_reduction(+:res) if(0) #pragma omp target map(res) for (int i=0; i<N; i++) { res = res+i } ```
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D125669
show more ...
|
|
Revision tags: llvmorg-14.0.6 |
|
| #
8ad4c6e4 |
| 17-Jun-2022 |
Yaxun (Sam) Liu <[email protected]> |
[HIP] add -fhip-kernel-arg-name
Add option -fhip-kernel-arg-name to emit kernel argument name metadata, which is needed for certain HIP applications.
Reviewed by: Artem Belevich, Fangrui Song, Bria
[HIP] add -fhip-kernel-arg-name
Add option -fhip-kernel-arg-name to emit kernel argument name metadata, which is needed for certain HIP applications.
Reviewed by: Artem Belevich, Fangrui Song, Brian Sumner
Differential Revision: https://reviews.llvm.org/D128022
show more ...
|
| #
ad7ce1e7 |
| 20-Jun-2022 |
Kazu Hirata <[email protected]> |
Don't use Optional::hasValue (NFC)
|
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4 |
|
| #
cefe472c |
| 17-May-2022 |
Yaxun (Sam) Liu <[email protected]> |
[clang] Fix __has_builtin
Fix __has_builtin to return 1 only if the requested target features of a builtin are enabled by refactoring the code for checking required target features of a builtin and
[clang] Fix __has_builtin
Fix __has_builtin to return 1 only if the requested target features of a builtin are enabled by refactoring the code for checking required target features of a builtin and use it in evaluation of __has_builtin.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D125829
show more ...
|
|
Revision tags: llvmorg-14.0.3 |
|
| #
ff289fee |
| 26-Apr-2022 |
Michael Kruse <[email protected]> |
[OpenMPIRBuilder] Remove ContinuationBB argument from Body callback.
The callback is expected to create a branch to the ContinuationBB (sometimes called FiniBB in some lambdas) argument when finishi
[OpenMPIRBuilder] Remove ContinuationBB argument from Body callback.
The callback is expected to create a branch to the ContinuationBB (sometimes called FiniBB in some lambdas) argument when finishing. This creates problems:
1. The InsertPoint used for CodeGenIP does not need to be the end of a block. If it is not, a naive callback will insert a branch instruction into the middle of the block.
2. The BasicBlock the CodeGenIP is pointing to may or may not have a terminator. There is an conflict where to branch to if the block already has a terminator.
3. Some API functions work only with block having a terminator. Some workarounds have been used to insert a temporary terminator that is removed again.
4. Some callbacks are sensitive to whether the BasicBlock has a terminator or not. This creates a callback ordering problem where different callback may have different behaviour depending on whether a previous callback created a terminator or not. The problem also exists for FinalizeCallbackTy where some callbacks do create branch to another "continue" block, but unlike BodyGenCallbackTy does not receive the target as argument. This is not addressed in this patch.
With this patch, the callback receives an CodeGenIP into a BasicBlock where to insert instructions. If it has to insert control flow, it can split the block at that position as needed but otherwise no separate ContinuationBB is needed. In particular, a callback can be empty without breaking the emitted IR. If the caller needs the control flow to branch to a specific target, it can insert the branch instruction itself and pass an InsertPoint before the terminator to the callback.
Certain frontends such as Clang may expect the current IRBuilder position to be at the end of a basic block. In this case its callbacks must split the block at CodeGenIP before setting the IRBuilder position such that the instructions after CodeGenIP are moved to another basic block and before returning create a new branch instruction to the split block.
Some utility functions such as `splitBB` are supporting correct splitting of BasicBlocks, independent of whether they have a terminator or not, returning/setting the InsertPoint of an IRBuilder to the end of split predecessor block, and optionally omitting creating a branch to the split successor block to be added later.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D118409
show more ...
|
|
Revision tags: llvmorg-14.0.2 |
|
| #
6f20744b |
| 13-Apr-2022 |
Erich Keane <[email protected]> |
Add support for ignored bitfield conditional codegen.
Currently we emit an error in just about every case of conditionals with a 'non simple' branch if treated as an LValue. This patch adds support
Add support for ignored bitfield conditional codegen.
Currently we emit an error in just about every case of conditionals with a 'non simple' branch if treated as an LValue. This patch adds support for the special case where this is an 'ignored' lvalue, which permits the side effects from happening.
It also splits up the emit for conditional LValue in a way that should be usable to handle simple assignment expressions in similar situations.
Differential Revision: https://reviews.llvm.org/D123680
show more ...
|
|
Revision tags: llvmorg-14.0.1 |
|
| #
8b62dd3c |
| 18-Mar-2022 |
Nikita Popov <[email protected]> |
Reapply [CodeGen] Avoid deprecated Address ctor in EmitLoadOfPointer()
This requires some adjustment in caller code, because there was a confusion regarding the meaning of the PtrTy argument: This a
Reapply [CodeGen] Avoid deprecated Address ctor in EmitLoadOfPointer()
This requires some adjustment in caller code, because there was a confusion regarding the meaning of the PtrTy argument: This argument is the type of the pointer being loaded, not the addresses being loaded from.
Reapply after fixing the specified pointer type for one call in 47eb4f7dcd845878b16a53dadd765195b9c24b6e, where the used type is important for determining alignment.
show more ...
|
| #
27f6cee1 |
| 23-Mar-2022 |
Nikita Popov <[email protected]> |
Revert "[CodeGen] Avoid deprecated Address ctor in EmitLoadOfPointer()"
This reverts commit 767ec883e37510a247ea5695921876ef67cf5b3f.
This results in a some incorrect alignments which are not cover
Revert "[CodeGen] Avoid deprecated Address ctor in EmitLoadOfPointer()"
This reverts commit 767ec883e37510a247ea5695921876ef67cf5b3f.
This results in a some incorrect alignments which are not covered by existing tests.
show more ...
|
| #
767ec883 |
| 18-Mar-2022 |
Nikita Popov <[email protected]> |
[CodeGen] Avoid deprecated Address ctor in EmitLoadOfPointer()
This requires some adjustment in caller code, because there was a confusion regarding the meaning of the PtrTy argument: This argument
[CodeGen] Avoid deprecated Address ctor in EmitLoadOfPointer()
This requires some adjustment in caller code, because there was a confusion regarding the meaning of the PtrTy argument: This argument is the type of the pointer being loaded, not the addresses being loaded from.
show more ...
|
| #
74992f4a |
| 18-Mar-2022 |
Nikita Popov <[email protected]> |
[CodeGen] Store element type in DominatingValue<RValue>
For aggregate rvalues, we need to store the element type in the dominating value, so we can recover the element type for the address.
|
| #
0aab3441 |
| 16-Mar-2022 |
Simon Moll <[email protected]> |
[Clang] Allow "ext_vector_type" applied to Booleans
This is the `ext_vector_type` alternative to D81083.
This patch extends Clang to allow 'bool' as a valid vector element type (attribute ext_vecto
[Clang] Allow "ext_vector_type" applied to Booleans
This is the `ext_vector_type` alternative to D81083.
This patch extends Clang to allow 'bool' as a valid vector element type (attribute ext_vector_type) in C/C++.
This is intended as the canonical type for SIMD masks and facilitates clean vector intrinsic declarations. Vectors of i1 are supported on IR level and below down to many SIMD ISAs, such as AVX512, ARM SVE (fixed vector length) and the VE target (NEC SX-Aurora TSUBASA).
The RFC on cfe-dev: https://lists.llvm.org/pipermail/cfe-dev/2020-May/065434.html
Reviewed By: erichkeane
Differential Revision: https://reviews.llvm.org/D88905
show more ...
|
| #
003c0b93 |
| 14-Mar-2022 |
Dávid Bolvanský <[email protected]> |
[Clang] always_inline statement attribute
Motivation:
``` int test(int x, int y) { int r = 0; [[clang::always_inline]] r += foo(x, y); // force compiler to inline this function here re
[Clang] always_inline statement attribute
Motivation:
``` int test(int x, int y) { int r = 0; [[clang::always_inline]] r += foo(x, y); // force compiler to inline this function here return r; } ```
In 2018, @kuhar proposed "Introduce per-callsite inline intrinsics" in https://reviews.llvm.org/D51200 to solve this motivation case (and many others).
This patch solves this problem with call site attribute. "noinline" statement attribute already landed in D119061. Also, some LLVM Inliner fixes landed so call site attribute is stronger than function attribute.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D120717
show more ...
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
c0a6433f |
| 07-Mar-2022 |
David Blaikie <[email protected]> |
Simplify OpenMP Lambda use
* Use default ref capture for non-escaping lambdas (this makes maintenance easier by allowing new uses, removing uses, having conditional uses (such as in assertions)
Simplify OpenMP Lambda use
* Use default ref capture for non-escaping lambdas (this makes maintenance easier by allowing new uses, removing uses, having conditional uses (such as in assertions) not require updates to an explicit capture list) * Simplify addPrivate API not to take a lambda, since it calls it unconditionally/immediately anyway - most callers are simply passing in a named value or short expression anyway and the lambda syntax just adds noise/overhead
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D121077
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc2 |
|
| #
223b8240 |
| 28-Feb-2022 |
Dávid Bolvanský <[email protected]> |
[Clang] noinline call site attribute
Motivation:
``` int foo(int x, int y) { // any compiler will happily inline this function return x / y; }
int test(int x, int y) { int r = 0; [[cla
[Clang] noinline call site attribute
Motivation:
``` int foo(int x, int y) { // any compiler will happily inline this function return x / y; }
int test(int x, int y) { int r = 0; [[clang::noinline]] r += foo(x, y); // for some reason we don't want any inlining here return r; }
```
In 2018, @kuhar proposed "Introduce per-callsite inline intrinsics" in https://reviews.llvm.org/D51200 to solve this motivation case (and many others).
This patch solves this problem with call site attribute. The implementation is "smaller" wrt approach which uses new intrinsics and thanks to https://reviews.llvm.org/D79121 (Add nomerge statement attribute to clang), we have got some basic infrastructure to deal with attrs on statements with call expressions.
GCC devs are more inclined to call attribute solution as well, as builtins are problematic for them - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104187. But they have no patch proposal yet so.. We have free hands here.
If this approach makes sense, next future steps would be support for call site attributes for always_inline / flatten.
Reviewed By: aaron.ballman, kuhar
Differential Revision: https://reviews.llvm.org/D119061
show more ...
|
| #
4cb24ef9 |
| 23-Feb-2022 |
Arthur Eubanks <[email protected]> |
[clang] Remove Address::deprecated() from CGClass.cpp
|
| #
6eec4835 |
| 23-Feb-2022 |
Arthur Eubanks <[email protected]> |
[clang] Remove getPointerElementType() in EmitVTableTypeCheckedLoad()
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init |
|
| #
5aa24558 |
| 27-Jan-2022 |
Sri Hari Krishna Narayanan <[email protected]> |
OMPIRBuilder for Interop directive
Implements the OMPIRBuilder portion for the Interop directive.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105876
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
d1b127b5 |
| 08-Jan-2022 |
Kazu Hirata <[email protected]> |
[clang] Remove unused forward declarations (NFC)
|
| #
e8b98a52 |
| 05-Jan-2022 |
Nikita Popov <[email protected]> |
[CodeGen] Emit elementtype attributes for indirect inline asm constraints
This implements the clang side of D116531. The elementtype attribute is added for all indirect constraints (*) and tests are
[CodeGen] Emit elementtype attributes for indirect inline asm constraints
This implements the clang side of D116531. The elementtype attribute is added for all indirect constraints (*) and tests are updated accordingly.
Differential Revision: https://reviews.llvm.org/D116666
show more ...
|
| #
d677a7cb |
| 02-Jan-2022 |
Kazu Hirata <[email protected]> |
[clang] Remove redundant member initialization (NFC)
Identified with readability-redundant-member-init.
|
| #
1f07a4a5 |
| 24-Dec-2021 |
Nikita Popov <[email protected]> |
[CodeGen] Avoid more pointer element type accesses
|
| #
5eb27188 |
| 22-Dec-2021 |
Alok Kumar Sharma <[email protected]> |
[clang][OpenMP][DebugInfo] Debug support for variables in shared clause of OpenMP task construct
Currently variables appearing inside shared clause of OpenMP task construct are not visible inside ll
[clang][OpenMP][DebugInfo] Debug support for variables in shared clause of OpenMP task construct
Currently variables appearing inside shared clause of OpenMP task construct are not visible inside lldb debugger.
After the current patch, lldb is able to show the variable
``` * thread #1, name = 'a.out', stop reason = breakpoint 1.1 frame #0: 0x0000000000400934 a.out`.omp_task_entry. [inlined] .omp_outlined.(.global_tid.=0, .part_id.=0x000000000071f0d0, .privates.=0x000000000071f0e8, .copy_fn.=(a.out`.omp_task_privates_map. at testshared.cxx:8), .task_t.=0x000000000071f0c0, __context=0x000000000071f0f0) at testshared.cxx:10:34 7 else { 8 #pragma omp task shared(svar) firstprivate(n) 9 { -> 10 printf("Task svar = %d\n", svar); 11 printf("Task n = %d\n", n); 12 svar = fib(n - 1); 13 } (lldb) p svar (int) $0 = 9 ```
Reviewed By: djtodoro
Differential Revision: https://reviews.llvm.org/D115510
show more ...
|
| #
55d7a12b |
| 21-Dec-2021 |
Nikita Popov <[email protected]> |
[CodeGen] Avoid pointee type access during global var declaration
All callers pass in a GlobalVariable, so we can conveniently fetch the type from there.
|