| #
a4005e13 |
| 28-Jul-2018 |
Yaxun Liu <[email protected]> |
[CUDA][HIP] Allow function-scope static const variable
CUDA 8.0 E.3.9.4 says: Within the body of a __device__ or __global__ function, only __shared__ variables or variables without any device memory qualifiers may be declared with static storage class.
It is unclear how a function-scope non-const static variable without device memory qualifier is implemented, therefore only static const variable without device memory qualifier is allowed, which can be emitted as a global variable in constant address space.
Currently clang only allows function-scope static variables that carry the __shared__ qualifier.
This patch also allows function-scope static const variable without device memory qualifier and emits it as a global variable in constant address space.
Differential Revision: https://reviews.llvm.org/D49931
llvm-svn: 338188
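A minimal sketch of the kind of declaration this patch accepts: a function-scope static const variable with no device memory qualifier inside a `__device__` function. The names `square_of` and `kSquares` are hypothetical, and the macro fallback is only an assumption that lets the same text build as ordinary C++ outside a CUDA/HIP compilation.

```cpp
#include <cassert>

// Outside a CUDA/HIP compilation the qualifier does not exist, so define it
// away (assumption made for this sketch so it also builds as plain C++).
#if !defined(__CUDACC__) && !defined(__HIP__)
#define __device__
#endif

// Hypothetical helper: the table is static const and carries no device
// memory qualifier, which is the case described in the commit message.
__device__ int square_of(int i) {
    static const int kSquares[4] = {0, 1, 4, 9};
    return kSquares[i & 3];  // mask keeps the index inside the table
}
```

When built as plain C++, `square_of` is an ordinary function and can be called directly; in a device compilation, the commit message says such a table is emitted as a global variable in the constant address space.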
|
| #
e1bfbc58 |
| 21-Jul-2018 |
Yaxun Liu <[email protected]> |
[HIP] Support -fcuda-flush-denormals-to-zero for amdgcn
Differential Revision: https://reviews.llvm.org/D48287
llvm-svn: 337639
|
| #
3efe0020 |
| 20-Jul-2018 |
Erich Keane <[email protected]> |
Implement cpu_dispatch/cpu_specific Multiversioning
As documented here: https://software.intel.com/en-us/node/682969 and https://software.intel.com/en-us/node/523346. cpu_dispatch multiversioning is an ICC feature that provides for function multiversioning.
This feature is implemented with two attributes: First, cpu_specific, which specifies the individual function versions. Second, cpu_dispatch, which specifies the location of the resolver function and the list of resolvable functions.
This is valuable since it provides a mechanism where the resolver's TU can be specified in one location, and the individual implementations each in their own translation units.
The goal of this patch is to be source-compatible with ICC, so this implementation diverges from the ICC implementation only in a few ways:
1- Linux x86/64 only: This implementation uses ifuncs in order to properly dispatch functions. This is a valuable performance benefit over the ICC implementation. A future patch will be provided to enable this feature on Windows, but it will obviously more closely fit ICC's implementation.
2- CPU Identification functions: ICC uses a set of custom functions to identify the feature list of the host processor. This patch uses the cpu_supports functionality in order to better align with 'target' multiversioning.
3- cpu_dispatch function def/decl: ICC's cpu_dispatch requires that the function marked cpu_dispatch be an empty definition. This patch supports that as well, however declarations are also permitted, since the linker will solve the issue of multiple emissions.
Differential Revision: https://reviews.llvm.org/D47474
llvm-svn: 337552
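The resolver idea can be approximated by hand. The sketch below is not what clang emits for cpu_dispatch; it is a hand-written analogue that queries CPU features through the `__builtin_cpu_supports` builtin and returns a function pointer. The names `sum_generic`, `sum_avx2`, and `resolve_sum` are hypothetical, and the "tuned" version is only a stand-in that computes the same result.

```cpp
#include <cassert>
#include <cstddef>

using SumFn = long (*)(const int*, std::size_t);

// Baseline version, valid on any CPU.
static long sum_generic(const int* v, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) total += v[i];
    return total;
}

// Stand-in for a CPU-specific version (hypothetical); same result.
static long sum_avx2(const int* v, std::size_t n) {
    return sum_generic(v, n);
}

// Hand-written analogue of a resolver: pick an implementation from the
// features of the host CPU, falling back to the generic version.
SumFn resolve_sum() {
#if defined(__x86_64__) && defined(__GNUC__)
    __builtin_cpu_init();  // make sure the feature data is initialized
    if (__builtin_cpu_supports("avx2")) return sum_avx2;
#endif
    return sum_generic;
}
```

With an ifunc the selection happens once, when the dynamic loader binds the symbol, so callers pay no per-call check; the sketch instead returns the pointer and leaves caching to the caller.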
|
| #
7963e8be |
| 18-Jul-2018 |
Erich Keane <[email protected]> |
Add support for __declspec(code_seg("segname"))
This patch uses CodeSegAttr to represent __declspec(code_seg) rather than building on the existing support for #pragma code_seg. The code_seg declspec is applied to functions and classes. This attribute enables the placement of code into separate named segments, including compiler-generated code and template instantiations.
For more information, please see the following: https://msdn.microsoft.com/en-us/library/dn636922.aspx
This patch fixes the regression in the support for __attribute__((section)). https://github.com/llvm-mirror/clang/commit/746b78de7812bc785fbb5207b788348040b23fa7
Patch by Soumi Manna (Manna) Differential Revision: https://reviews.llvm.org/D48841
llvm-svn: 337420
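A minimal usage sketch of the declspec on a function and on a class. The segment name `my_code`, the macro `IN_MY_SEGMENT`, and the functions are hypothetical; the macro expands to nothing on compilers that are not in Microsoft-compatible mode, which is an assumption made so the sketch builds everywhere.

```cpp
#include <cassert>

// Only Microsoft-compatible compilers understand __declspec(code_seg(...)).
#if defined(_MSC_VER)
#define IN_MY_SEGMENT __declspec(code_seg("my_code"))
#else
#define IN_MY_SEGMENT
#endif

// Function-level use: this function's code is placed in the named segment.
IN_MY_SEGMENT int report_failure(int code) { return -code; }

// Class-level use: the attribute is applied to the class, so it covers
// the member functions as well.
struct IN_MY_SEGMENT ErrorPaths {
    int fatal(int code) { return report_failure(code) - 1; }
};
```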
|
| #
a934f9da |
| 11-Jul-2018 |
Petr Pavlu <[email protected]> |
Fix setting of empty implicit-section-name attribute
Code in `CodeGenModule::SetFunctionAttributes()` could set an empty attribute `implicit-section-name` on a function that is affected by `#pragma clang section text="section"`. This is incorrect because the attribute should contain a valid section name. If the function also used `__attribute__((section("section")))` then this could result in emitting the function in a section with an empty name.
The patch fixes the issue by removing the problematic code that sets empty `implicit-section-name` from `CodeGenModule::SetFunctionAttributes()` because it is sufficient to set this attribute only from a similar code in `setNonAliasAttributes()` when the function is emitted.
Differential Revision: https://reviews.llvm.org/D48916
llvm-svn: 336842
|
| #
c66d254d |
| 27-Jun-2018 |
Artem Belevich <[email protected]> |
[CUDA] Use atexit() to call module destructor.
This matches the way NVCC does it. Doing module cleanup at global destructor phase used to work, but is, apparently, too late for the CUDA runtime in CUDA-9.2, which ends up crashing with double-free.
Differential Revision: https://reviews.llvm.org/D48613
llvm-svn: 335763
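The general pattern, registering teardown with atexit() from the setup code, can be sketched in ordinary C++. This is only an illustration of the registration pattern, not the code clang generates and not the CUDA runtime API; `module_ctor`, `module_dtor`, and `module_is_loaded` are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>

static bool g_module_loaded = false;

// Teardown: runs when the program exits normally, via the atexit() list.
static void module_dtor() { g_module_loaded = false; }

bool module_is_loaded() { return g_module_loaded; }

// Setup analogue: mark the module as loaded and register its teardown
// with atexit(). std::atexit returns 0 when registration succeeds.
bool module_ctor() {
    g_module_loaded = true;
    return std::atexit(module_dtor) == 0;
}
```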
|
| #
c69e0676 |
| 26-Jun-2018 |
Evgeniy Stepanov <[email protected]> |
Revert "[MS] Use mangled names and comdats for string merging with ASan"
Depends on r334313, which has been reverted in r335681.
llvm-svn: 335684
|
| #
e44acadf |
| 26-Jun-2018 |
Peter Collingbourne <[email protected]> |
Implement CFI for indirect calls via a member function pointer.
Similarly to CFI on virtual and indirect calls, this implementation tries to use program type information to make the checks as precise as possible. The basic way that it works is as follows, where `C` is the name of the class being defined or the target of a call and the function type is assumed to be `void()`.
For virtual calls:
- Attach type metadata to the addresses of function pointers in vtables (not the functions themselves) of type `void (B::*)()` for each `B` that is a recursive dynamic base class of `C`, including `C` itself. This type metadata has an annotation that the type is for virtual calls (to distinguish it from the non-virtual case).
- At the call site, check that the computed address of the function pointer in the vtable has type `void (C::*)()`.
For non-virtual calls:
- Attach type metadata to each non-virtual member function whose address can be taken with a member function pointer. The type of a function in class `C` of type `void()` is each of the types `void (B::*)()` where `B` is a most-base class of `C`. A most-base class of `C` is defined as a recursive base class of `C`, including `C` itself, that does not have any bases.
- At the call site, check that the function pointer has one of the types `void (B::*)()` where `B` is a most-base class of `C`.
Differential Revision: https://reviews.llvm.org/D47567
llvm-svn: 335569
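A small sketch of the kind of non-virtual call this scheme covers. The classes and `call_on_derived` are hypothetical. Here `Base` has no bases, so it is the most-base class of `Derived`; under the rules above, both member functions would carry the type `int (Base::*)()`, which is why a pointer to either one passes the check at the call site. The check is only active when CFI is enabled (to my understanding via `-fsanitize=cfi-mfcall` together with LTO and hidden visibility); without it the code behaves as ordinary C++.

```cpp
#include <cassert>

// Base has no bases of its own, so it is a most-base class of Derived.
struct Base {
    int tag() { return 1; }
};

struct Derived : Base {
    int twice() { return 2 * tag(); }
};

// Indirect call through a member function pointer: the call site that the
// scheme instruments.
int call_on_derived(Derived& d, int (Derived::*fn)()) {
    return (d.*fn)();
}
```

Note that `&Derived::tag` has type `int (Base::*)()` and converts implicitly to `int (Derived::*)()`, so attaching metadata per most-base class keeps such legal calls valid.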
|
| #
96edb2e3 |
| 25-Jun-2018 |
Alexey Bataev <[email protected]> |
[OPENMP] Do not consider address constant vars as possibly threadprivate.
Do not delay emission of the address constant variables in OpenMP mode as they cannot be defined as threadprivate.
llvm-svn: 335483
|
|
Revision tags: llvmorg-6.0.1, llvmorg-6.0.1-rc3 |
|
| #
6c10a66e |
| 12-Jun-2018 |
Yaxun Liu <[email protected]> |
[CUDA][HIP] Set kernel calling convention before arrange function
Currently clang sets the kernel calling convention for CUDA/HIP after arranging the function, which produces an incorrect kernel function type, since the type depends on the calling convention.
This patch sets the kernel calling convention before arranging the function.
Differential Revision: https://reviews.llvm.org/D47733
llvm-svn: 334457
|
| #
3513fdcc |
| 11-Jun-2018 |
Reid Kleckner <[email protected]> |
[MS] Use mangled names and comdats for string merging with ASan
This should reduce the binary size penalty of ASan on Windows. After r334313, ASan will add red zones to globals in comdats, so we will still find OOB accesses to string literals.
llvm-svn: 334417
|
| #
1a83d067 |
| 07-Jun-2018 |
Gabor Buella <[email protected]> |
[CodeGen] Improve diagnostics related to target attributes
Summary: When requirements imposed by __target__ attributes on functions are not satisfied, prefer printing the requirements that are explicitly mentioned in the attributes.
This makes such messages more useful, e.g. printing avx512f instead of avx2 in the following scenario:
```
$ cat foo.c
static inline void __attribute__((__always_inline__, __target__("avx512f")))
x(void)
{
}

int
main(void)
{
	x();
}
$ clang foo.c
foo.c:7:2: error: always_inline function 'x' requires target feature 'avx2', but would be inlined into function 'main' that is compiled without support for 'avx2'
	x();
	^
1 error generated.
```
bugzilla: https://bugs.llvm.org/show_bug.cgi?id=37338
Reviewers: craig.topper, echristo, dblaikie
Reviewed By: craig.topper, echristo
Differential Revision: https://reviews.llvm.org/D46541
llvm-svn: 334174
|
| #
6328f9a9 |
| 05-Jun-2018 |
Yaxun Liu <[email protected]> |
[CUDA][HIP] Do not emit type info when compiling for device
CUDA/HIP does not support RTTI on the device side, therefore there is no point in emitting type info when compiling for device.
Emitting type info for device not only clutters the IR with useless global variables, but also causes an undefined symbol at link time, since the vtable for cxxabiv1::class_type_info has external linkage.
Differential Revision: https://reviews.llvm.org/D47694
llvm-svn: 334021
|
|
Revision tags: llvmorg-6.0.1-rc2 |
|
| #
b534510c |
| 30-May-2018 |
Richard Smith <[email protected]> |
Make the mangled name collision diagnostic a bit more useful by listing the mangling.
This helps especially when the collision is for a template specialization, where the template arguments are not available from anywhere else in the diagnostic, and are likely relevant to the problem.
llvm-svn: 333489
|
| #
daceb1ea |
| 14-May-2018 |
Yaxun Liu <[email protected]> |
CodeGen: Emit string literal in constant address space
Some targets have a constant address space (e.g. amdgcn). For them, string literals should be emitted in the constant address space and then cast to the default address space.
Differential Revision: https://reviews.llvm.org/D46643
llvm-svn: 332279
|
| #
4fbf84c1 |
| 09-May-2018 |
Manoj Gupta <[email protected]> |
[Clang] Implement function attribute no_stack_protector.
Summary: This attribute tells clang to skip stack protector instrumentation for this function when the -fstack-protector option is passed. The GCC syntax for this is: __attribute__((__optimize__("no-stack-protector"))) and the equivalent clang syntax would be: __attribute__((no_stack_protector))
This is used in the Linux kernel to selectively disable the stack protector in certain functions.
Reviewers: aaron.ballman, rsmith, rnk, probinson
Reviewed By: aaron.ballman
Subscribers: probinson, srhines, cfe-commits
Differential Revision: https://reviews.llvm.org/D46300
llvm-svn: 331925
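A minimal usage sketch of the attribute. The macro `NO_STACK_PROTECTOR` and the function `checksum` are hypothetical; the `__has_attribute` guard is an assumption that lets the code build on compilers that do not know the attribute, in which case the macro expands to nothing.

```cpp
#include <cassert>
#include <cstring>

// Use the attribute only where the compiler reports support for it.
#if defined(__has_attribute)
#if __has_attribute(no_stack_protector)
#define NO_STACK_PROTECTOR __attribute__((no_stack_protector))
#endif
#endif
#ifndef NO_STACK_PROTECTOR
#define NO_STACK_PROTECTOR
#endif

// A function with a local character buffer would normally be a candidate
// for a stack protector; the attribute opts this one function out.
NO_STACK_PROTECTOR int checksum(const char* text) {
    char buf[32] = {0};
    std::strncpy(buf, text, sizeof(buf) - 1);  // always leaves a terminator
    int sum = 0;
    for (char c : buf) sum += static_cast<unsigned char>(c);
    return sum;
}
```

The attribute only affects instrumentation, so the function's result is the same whether or not the stack protector is enabled for the rest of the translation unit.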
|
| #
9fc8faf9 |
| 09-May-2018 |
Adrian Prantl <[email protected]> |
Remove \brief commands from doxygen comments.
This is similar to the LLVM change https://reviews.llvm.org/D46290.
We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers in our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all.
Patch produced by
for i in $(git grep -l '\@brief'); do perl -pi -e 's/\@brief //g' $i & done
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46320
llvm-svn: 331834
|
| #
d7ff6d64 |
| 07-May-2018 |
Alexey Bataev <[email protected]> |
[OPENMP, NVPTX] Added support for L2 parallelism.
Added initial codegen for level 2, 3, etc. parallelism. Currently, all second-level and deeper parallel regions run sequentially.
llvm-svn: 331642
|
| #
6d944109 |
| 02-May-2018 |
Alexey Bataev <[email protected]> |
[OPENMP] Support C++ member functions in the device constructs.
Added correct emission of the C++ member functions for the device function when they are used in the device constructs.
llvm-svn: 331365
|
| #
a534f07f |
| 26-Apr-2018 |
Faisal Vali <[email protected]> |
Revert rC330794 and some dependent tiny bug fixes
See Richard's humbling feedback here: http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180423/226482.html http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20180423/226486.html
Wish I'd had the patience to solicit the feedback prior to committing :)
Sorry for the noise guys.
Thank you Richard for being the steward that clang deserves!
llvm-svn: 330888
|
| #
936de9d6 |
| 25-Apr-2018 |
Faisal Vali <[email protected]> |
[c++2a] [concepts] Add rudimentary parsing support for template concept declarations
This patch is a tweak of changyu's patch: https://reviews.llvm.org/D40381. It differs in that the recognition of the 'concept' token is moved into the machinery that recognizes declaration-specifiers - this allows us to leverage the attribute handling machinery more seamlessly.
See the test file to get a sense of the basic parsing that this patch supports.
There is much more work to be done before concepts are usable...
Thanks Changyu!
llvm-svn: 330794
|
| #
4a4e7a31 |
| 23-Apr-2018 |
Mikhail Maltsev <[email protected]> |
[CodeGen] Reland r330442: Add an option to suppress output of llvm.ident
The test case in the original patch was overly constrained and failed on PPC targets.
llvm-svn: 330575
|
| #
42b2a0e1 |
| 20-Apr-2018 |
Mikhail Maltsev <[email protected]> |
Revert r330442, CodeGen/no-ident-version.c is failing on PPC
llvm-svn: 330451
|
| #
4306f208 |
| 20-Apr-2018 |
Yaxun Liu <[email protected]> |
[CUDA] Set LLVM calling convention for CUDA kernel
Some targets need a special LLVM calling convention for CUDA kernels. This patch sets it through a TargetCodeGenInfo hook.
It only affects the amdgcn target.
Patch by Greg Rodgers. Revised and lit tests added by Yaxun Liu.
Differential Revision: https://reviews.llvm.org/D45223
llvm-svn: 330447
|
| #
6550c139 |
| 20-Apr-2018 |
Mikhail Maltsev <[email protected]> |
[CodeGen] Add an option to suppress output of llvm.ident
Summary: By default Clang outputs its version (including git commit hash, in case of trunk builds) into object and assembly files. It might be useful to have an option to disable this, especially for debugging purposes. This patch implements new command line flags -Qn and -Qy (the names are chosen for compatibility with GCC). -Qn disables output of the 'llvm.ident' metadata string and the 'producer' debug info. -Qy (enabled by default) does the opposite.
Reviewers: faisalv, echristo, aprantl
Reviewed By: aprantl
Subscribers: aprantl, cfe-commits, JDevlieghere, rogfer01
Differential Revision: https://reviews.llvm.org/D45255
llvm-svn: 330442
|