| #
e2e82c99 |
| 10-Jan-2021 |
Fangrui Song <[email protected]> |
[CodeGenModule] Drop dso_local on function declarations for ELF -fno-pic -fno-direct-access-external-data
ELF -fno-pic sets dso_local on a function declaration to allow direct accesses when taking i
[CodeGenModule] Drop dso_local on function declarations for ELF -fno-pic -fno-direct-access-external-data
ELF -fno-pic sets dso_local on a function declaration to allow direct accesses when taking its address (similar to a data symbol). The emitted code follows the traditional GCC/Clang -fno-pic behavior: an absolute relocation is produced.
If the function is not defined in the executable, a canonical PLT entry will be needed at link time. This is similar to a copy relocation and is incompatible with (-Bsymbolic or --dynamic-list linked shared objects / protected symbols in a shared object).
This patch gives -fno-pic code a way to avoid such a canonical PLT entry.
The FIXME was about a generalization for -fpie -mpie-copy-relocations (now -fpie -fdirect-access-external-data). While we could set dso_local to avoid GOT when taking the address of a function declaration (there is an ignorable difference about R_386_PC32 vs R_386_PLT32 on i386), it likely does not provide any benefit and can just cause trouble, so we don't make the generalization.
show more ...
|
| #
38a716c3 |
| 09-Jan-2021 |
Fangrui Song <[email protected]> |
Make -fno-pic respect -fno-direct-access-external-data
D92633 added -f[no-]direct-access-external-data to supersede -m[no-]pie-copy-relocations. (The option works for -fpie but is a no-op for -fno-p
Make -fno-pic respect -fno-direct-access-external-data
D92633 added -f[no-]direct-access-external-data to supersede -m[no-]pie-copy-relocations. (The option works for -fpie but is a no-op for -fno-pic and -fpic.)
This patch makes -fno-pic -fno-direct-access-external-data drop dso_local from global variable declarations. This usually causes the backend to emit a GOT indirection for external data access. With a GOT relocation, the subsequent -no-pie link will not have copy relocation even if the data symbol turns out to be defined by a shared object.
Differential Revision: https://reviews.llvm.org/D92714
show more ...
|
| #
1d3ebbf5 |
| 09-Jan-2021 |
Fangrui Song <[email protected]> |
Add -f[no-]direct-access-external-data to supersede -mpie-copy-relocations
GCC r218397 "x86-64: Optimize access to globals in PIE with copy reloc" made -fpie code emit R_X86_64_PC32 to reference ext
Add -f[no-]direct-access-external-data to supersede -mpie-copy-relocations
GCC r218397 "x86-64: Optimize access to globals in PIE with copy reloc" made -fpie code emit R_X86_64_PC32 to reference external data symbols by default. Clang adopted -mpie-copy-relocations D19996 as a flexible alternative.
The name -mpie-copy-relocations can be improved [1] and does not capture the idea that this option can apply to -fno-pic and -fpic [2], so this patch introduces -f[no-]direct-access-external-data and makes -mpie-copy-relocations their aliases for compatibility.
[1] For ``` extern int var; int get() { return var; } ``` if var is defined in another translation unit in the link unit, there is no copy relocation.
[2] -fno-pic -fno-direct-access-external-data is useful to avoid copy relocations. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65888 If a shared object is linked with -Bsymbolic or --dynamic-list and exports a data symbol, normally the data symbol cannot be accessed by -fno-pic code (because by default an absolute relocation is produced which will lead to a copy relocation). -fno-direct-access-external-data can prevent copy relocations.
-fpic -fdirect-access-external-data can avoid GOT indirection. This is like the undefined counterpart of -fno-semantic-interposition. However, the user should define var in another translation unit and link with -Bsymbolic or --dynamic-list, otherwise the linker will error in a -shared link. Generally the user has better tools for their goal but I want to mention that this combination is valid.
On COFF, the behavior is like always -fdirect-access-external-data. `__declspec(dllimport)` is needed to enable indirect access.
There is currently no plan to affect non-ELF behaviors or -fpic behaviors.
-fno-pic -fno-direct-access-external-data will be implemented in the subsequent patch.
GCC feature request https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D92633
show more ...
|
| #
d1fd7234 |
| 31-Dec-2020 |
Fangrui Song <[email protected]> |
Refactor how -fno-semantic-interposition sets dso_local on default visibility external linkage definitions
The idea is that the CC1 default for ELF should set dso_local on default visibility externa
Refactor how -fno-semantic-interposition sets dso_local on default visibility external linkage definitions
The idea is that the CC1 default for ELF should set dso_local on default visibility external linkage definitions in the default -mrelocation-model pic mode (-fpic/-fPIC) to match COFF/Mach-O and make output IR similar.
The refactoring is made available by 2820a2ca3a0e69c3f301845420e0067ffff2251b.
Currently only x86 supports local aliases. We move the decision to the driver. There are three CC1 states:
* -fsemantic-interposition: make some linkages interposable and make default visibility external linkage definitions dso_preemptable. * (default): selected if the target supports .Lfoo$local: make default visibility external linkage definitions dso_local * -fhalf-no-semantic-interposition: if neither option is set or the target does not support .Lfoo$local: like -fno-semantic-interposition but local aliases are not used. So references can be interposed if not optimized out.
Add -fhalf-no-semantic-interposition to a few tests using the half-based semantic interposition behavior.
show more ...
|
| #
809a1e0f |
| 31-Dec-2020 |
Fangrui Song <[email protected]> |
[CodeGenModule] Set dso_local for Mach-O GlobalValue
* static relocation model: always * other relocation models: if isStrongDefinitionForLinker
This will make LLVM IR emitted for COFF/Mach-O and e
[CodeGenModule] Set dso_local for Mach-O GlobalValue
* static relocation model: always * other relocation models: if isStrongDefinitionForLinker
This will make LLVM IR emitted for COFF/Mach-O and executable ELF similar.
show more ...
|
| #
2820a2ca |
| 30-Dec-2020 |
Fangrui Song <[email protected]> |
Move -fno-semantic-interposition dso_local logic from TargetMachine to Clang CodeGenModule
This simplifies TargetMachine::shouldAssumeDSOLocal and and gives frontend the decision to use dso_local. F
Move -fno-semantic-interposition dso_local logic from TargetMachine to Clang CodeGenModule
This simplifies TargetMachine::shouldAssumeDSOLocal and and gives frontend the decision to use dso_local. For LLVM synthesized functions/globals, they may lose inferred dso_local but such optimizations are probably not very useful.
Note: the hasComdat() condition in canBenefitFromLocalAlias (D77429) may be dead now. (llvm/CodeGen/X86/semantic-interposition-comdat.ll) (Investigate whether we need test coverage when Fuchsia C++ ABI is clearer)
show more ...
|
| #
3733463d |
| 18-Dec-2020 |
Rong Xu <[email protected]> |
[IR][PGO] Add hot func attribute and use hot/cold attribute in func section
Clang FE currently has hot/cold function attribute. But we only have cold function attribute in LLVM IR.
This patch adds
[IR][PGO] Add hot func attribute and use hot/cold attribute in func section
Clang FE currently has hot/cold function attribute. But we only have cold function attribute in LLVM IR.
This patch adds support of hot function attribute to LLVM IR. This attribute will be used in setting function section prefix/suffix. Currently .hot and .unlikely suffix only are added in PGO (Sample PGO) compilation (through isFunctionHotInCallGraph and isFunctionColdInCallGraph).
This patch changes the behavior. The new behavior is: (1) If the user annotates a function as hot or isFunctionHotInCallGraph is true, this function will be marked as hot. Otherwise, (2) If the user annotates a function as cold or isFunctionColdInCallGraph is true, this function will be marked as cold.
The changes are: (1) user annotated function attribute will used in setting function section prefix/suffix. (2) hot attribute overwrites profile count based hotness. (3) profile count based hotness overwrite user annotated cold attribute.
The intention for these changes is to provide the user a way to mark certain function as hot in cases where training input is hard to cover all the hot functions.
Differential Revision: https://reviews.llvm.org/D92493
show more ...
|
| #
fb0f7288 |
| 08-Dec-2020 |
Zequan Wu <[email protected]> |
[Clang] Make nomerge attribute a function attribute as well as a statement attribute.
Differential Revision: https://reviews.llvm.org/D92800
|
| #
c36f31c4 |
| 15-Dec-2020 |
Rong Xu <[email protected]> |
[PGO] remove unintentional code in early commit
Remove unintentional code in commit 54e03d [PGO] Verify BFI counts after loading profile data.
|
| #
54e03d03 |
| 14-Dec-2020 |
Rong Xu <[email protected]> |
[PGO] Verify BFI counts after loading profile data
This patch adds the functionality to compare BFI counts with real profile counts right after reading the profile. It will print remarks under -Rpas
[PGO] Verify BFI counts after loading profile data
This patch adds the functionality to compare BFI counts with real profile counts right after reading the profile. It will print remarks under -Rpass-analysis=pgo, or the internal option -pass-remarks-analysis=pgo.
Differential Revision: https://reviews.llvm.org/D91813
show more ...
|
| #
1ab9327d |
| 05-Dec-2020 |
Fangrui Song <[email protected]> |
[TargetMachine][CodeGenModule] Delete unneeded ppc32 special case from shouldAssumeDSOLocal
PPCMCInstLower does not actually call shouldAssumeDSOLocal for ppc32 so this is dead. Actually Clang ppc32
[TargetMachine][CodeGenModule] Delete unneeded ppc32 special case from shouldAssumeDSOLocal
PPCMCInstLower does not actually call shouldAssumeDSOLocal for ppc32 so this is dead. Actually Clang ppc32 does produce a pair of absolute relocations which match GCC.
This also fixes a comment (R_PPC_COPY and R_PPC64_COPY do exist).
show more ...
|
| #
0cbf61be |
| 03-Dec-2020 |
Nico Weber <[email protected]> |
[mac/arm] Fix rtti codegen tests when running on an arm mac
shouldRTTIBeUnique() returns false for iOS64CXXABI, which causes RTTI objects to be emitted hidden. Update two tests that didn't expect th
[mac/arm] Fix rtti codegen tests when running on an arm mac
shouldRTTIBeUnique() returns false for iOS64CXXABI, which causes RTTI objects to be emitted hidden. Update two tests that didn't expect this to happen for the default triple.
Also rename iOS64CXXABI to AppleARM64CXXABI, since it's used for arm64-apple-macos triples too.
Part of PR46644.
Differential Revision: https://reviews.llvm.org/D91904
show more ...
|
| #
e42021d5 |
| 23-Nov-2020 |
Ben Dunbobbin <[email protected]> |
[Clang][-fvisibility-from-dllstorageclass] Set DSO Locality from final visibility
Ensure that the DSO Locality of the globals in the IR is derived from their final visibility when using -fvisibility
[Clang][-fvisibility-from-dllstorageclass] Set DSO Locality from final visibility
Ensure that the DSO Locality of the globals in the IR is derived from their final visibility when using -fvisibility-from-dllstorageclass.
To accomplish this we reset the DSO locality of globals (before setting their visibility from their dllstorageclass) at the end of IRGen in Clang. This removes any effects that visibility options or annotations may have had on the DSO locality.
The resulting DSO locality of the globals will be pessimistic w.r.t. to the normal compiler IRGen.
Differential Revision: https://reviews.llvm.org/D91779
show more ...
|
| #
17497ec5 |
| 19-Nov-2020 |
Xiangling Liao <[email protected]> |
[AIX][FE] Support constructor/destructor attribute
Support attribute((constructor)) and attribute((destructor)) on AIX
Differential Revision: https://reviews.llvm.org/D90892
|
| #
f4c6080a |
| 18-Nov-2020 |
Nick Desaulniers <[email protected]> |
Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch"
This reverts commit b7926ce6d7a83cdf70c68d82bc3389c04009b841.
Going with a simpler approach.
|
| #
b637148e |
| 21-Sep-2020 |
Richard Smith <[email protected]> |
[c++20] For P0732R2 / P1907R1: Basic code generation and name mangling support for non-type template parameters of class type and template parameter objects.
The Itanium side of this follows the app
[c++20] For P0732R2 / P1907R1: Basic code generation and name mangling support for non-type template parameters of class type and template parameter objects.
The Itanium side of this follows the approach I proposed in https://github.com/itanium-cxx-abi/cxx-abi/issues/47 on 2020-09-06.
The MSVC side of this was determined empirically by observing MSVC's output.
Differential Revision: https://reviews.llvm.org/D89998
show more ...
|
| #
d093401a |
| 09-Nov-2020 |
Tyker <[email protected]> |
[NFC] Remove string parameter of annotation attribute from AST childs. this simplifies using annotation attributes when using clang as library
|
| #
8930032f |
| 08-Nov-2020 |
Simon Pilgrim <[email protected]> |
Don't dereference a dyn_cast<> result - use cast<> instead. NFCI.
We were relying on the dyn_cast<> succeeding - better use cast<> and have it assert that its the correct type than dereference a nul
Don't dereference a dyn_cast<> result - use cast<> instead. NFCI.
We were relying on the dyn_cast<> succeeding - better use cast<> and have it assert that its the correct type than dereference a null result.
show more ...
|
| #
d2e7dca5 |
| 05-Nov-2020 |
Jan Ole Hüser <[email protected]> |
[CodeGen] Fix Bug 47499: __unaligned extension inconsistent behaviour with C and C++
For the language C++ the keyword __unaligned (a Microsoft extension) had no effect on pointers.
The reason, why
[CodeGen] Fix Bug 47499: __unaligned extension inconsistent behaviour with C and C++
For the language C++ the keyword __unaligned (a Microsoft extension) had no effect on pointers.
The reason, why there was a difference between C and C++ for the keyword __unaligned: For C, the Method getAsCXXREcordDecl() returns nullptr. That guarantees that hasUnaligned() is called. If the language is C++, it is not guaranteed, that hasUnaligend() is called and evaluated.
Here are some links:
The Bug: https://bugs.llvm.org/show_bug.cgi?id=47499 Thread on the cfe-dev mailing list: http://lists.llvm.org/pipermail/cfe-dev/2020-September/066783.html Diff, that introduced the check hasUnaligned() in getNaturalTypeAlignment(): https://reviews.llvm.org/D30166
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D90630
show more ...
|
| #
7ad6010f |
| 03-Nov-2020 |
Ben Dunbobbin <[email protected]> |
Fix - [Clang] Add the ability to map DLL storage class to visibility
415f7ee883 had a silly typo introduced when I inlined some code into a loop from its own function.
Original commit message:
For
Fix - [Clang] Add the ability to map DLL storage class to visibility
415f7ee883 had a silly typo introduced when I inlined some code into a loop from its own function.
Original commit message:
For PlayStation we offer source code compatibility with Microsoft's dllimport/export annotations; however, our file format is based on ELF.
To support this we translate from DLL storage class to ELF visibility at the end of codegen in Clang.
Other toolchains have used similar strategies (e.g. see the documentation for this ARM toolchain:
https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0)
This patch adds the ability to perform this translation. Options are provided to support customizing the mapping behaviour.
Differential Revision: https://reviews.llvm.org/D89970
show more ...
|
| #
abd8cd91 |
| 28-Oct-2020 |
Yaxun (Sam) Liu <[email protected]> |
[CUDA][HIP] Fix linkage for -fgpu-rdc
Currently for explicit template function instantiation in CUDA/HIP device compilation clang emits instantiated kernel with external linkage and instantiated dev
[CUDA][HIP] Fix linkage for -fgpu-rdc
Currently for explicit template function instantiation in CUDA/HIP device compilation clang emits instantiated kernel with external linkage and instantiated device function with internal linkage.
This is fine for -fno-gpu-rdc since there is only one TU.
However this causes duplicate symbols for kernels for -fgpu-rdc if the same instantiation happen in multiple TU. Or missing symbols if a device function calls an explicitly instantiated template function in a different TU.
To make explicit template function instantiation work for -fgpu-rdc we need to follow the C++ linkage paradigm, i.e. use weak_odr linkage.
Differential Revision: https://reviews.llvm.org/D90311
show more ...
|
| #
ae9231ca |
| 02-Nov-2020 |
Ben Dunbobbin <[email protected]> |
Reland - [Clang] Add the ability to map DLL storage class to visibility
415f7ee883 had LIT test failures on any build where the clang executable was not called "clang". I have adjusted the LIT CHECK
Reland - [Clang] Add the ability to map DLL storage class to visibility
415f7ee883 had LIT test failures on any build where the clang executable was not called "clang". I have adjusted the LIT CHECKs to remove the binary name to fix this.
Original commit message:
For PlayStation we offer source code compatibility with Microsoft's dllimport/export annotations; however, our file format is based on ELF.
To support this we translate from DLL storage class to ELF visibility at the end of codegen in Clang.
Other toolchains have used similar strategies (e.g. see the documentation for this ARM toolchain:
https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0)
This patch adds the ability to perform this translation. Options are provided to support customizing the mapping behaviour.
Differential Revision: https://reviews.llvm.org/D89970
show more ...
|
| #
5024d3aa |
| 02-Nov-2020 |
Ben Dunbobbin <[email protected]> |
Revert "[Clang] Add the ability to map DLL storage class to visibility"
This reverts commit 415f7ee8836944942d8beb70e982e95a312866a7.
The added tests were failing on the build bots!
|
| #
415f7ee8 |
| 02-Nov-2020 |
Ben Dunbobbin <[email protected]> |
[Clang] Add the ability to map DLL storage class to visibility
For PlayStation we offer source code compatibility with Microsoft's dllimport/export annotations; however, our file format is based on
[Clang] Add the ability to map DLL storage class to visibility
For PlayStation we offer source code compatibility with Microsoft's dllimport/export annotations; however, our file format is based on ELF.
To support this we translate from DLL storage class to ELF visibility at the end of codegen in Clang.
Other toolchains have used similar strategies (e.g. see the documentation for this ARM toolchain:
https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0)
This patch adds the ability to perform this translation. Options are provided to support customizing the mapping behaviour.
Differential Revision: https://reviews.llvm.org/D89970
show more ...
|
| #
0949f96d |
| 29-Sep-2020 |
Teresa Johnson <[email protected]> |
[MemProf] Pass down memory profile name with optional path from clang
Similar to -fprofile-generate=, add -fmemory-profile= which takes a directory path. This is passed down to LLVM via a new module
[MemProf] Pass down memory profile name with optional path from clang
Similar to -fprofile-generate=, add -fmemory-profile= which takes a directory path. This is passed down to LLVM via a new module flag metadata. LLVM in turn provides this name to the runtime via the new __memprof_profile_filename variable.
Additionally, always pass a default filename (in $cwd if a directory name is not specified vi the = form of the option). This is also consistent with the behavior of the PGO instrumentation. Since the memory profiles will generally be fairly large, it doesn't make sense to dump them to stderr. Also, importantly, the memory profiles will eventually be dumped in a compact binary format, which is another reason why it does not make sense to send these to stderr by default.
Change the existing memprof tests to specify log_path=stderr when that was being relied on.
Depends on D89086.
Differential Revision: https://reviews.llvm.org/D89087
show more ...
|