| #
a596b54d |
| 08-Jun-2021 |
Nick Desaulniers <[email protected]> |
Revert "[IR] make -stack-alignment= into a module attr"
This reverts commit 433c8d950cb3a1fa0977355ce0367e8c763a3f13.
Breaks the MIPS build.
|
| #
433c8d95 |
| 08-Jun-2021 |
Nick Desaulniers <[email protected]> |
[IR] make -stack-alignment= into a module attr
Similar to D102742, specifying the stack alignment via CodegenOpts means that this flag gets dropped during LTO, unless the command line is re-specifie
[IR] make -stack-alignment= into a module attr
Similar to D102742, specifying the stack alignment via CodegenOpts means that this flag gets dropped during LTO, unless the command line is re-specified as a plugin opt. Instead, encode this information as a module level attribute so that we don't have to expose this llvm internal flag when linking the Linux kernel with LTO.
Looks like external dependencies might need a fix: * https://github.com/llvm-hs/llvm-hs/issues/345 * https://github.com/halide/Halide/issues/6079
Link: https://github.com/ClangBuiltLinux/linux/issues/1377
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D103048
show more ...
|
| #
b31f41e7 |
| 24-May-2021 |
Andrew Savonichev <[email protected]> |
[Clang] Support a user-defined __dso_handle
This fixes PR49198: Wrong usage of __dso_handle in user code leads to a compiler crash.
When Init is an address of the global itself, we need to track it
[Clang] Support a user-defined __dso_handle
This fixes PR49198: Wrong usage of __dso_handle in user code leads to a compiler crash.
When Init is an address of the global itself, we need to track it across RAUW. Otherwise the initializer can be destroyed if the global is replaced.
Differential Revision: https://reviews.llvm.org/D101156
show more ...
|
| #
0e4cf807 |
| 22-May-2021 |
Martin Storsjö <[email protected]> |
[clang] [MinGW] Don't mark emutls variables as DSO local
These actually can be automatically imported from another DLL. (This works properly as long as the actual implementation of emutls is linked
[clang] [MinGW] Don't mark emutls variables as DSO local
These actually can be automatically imported from another DLL. (This works properly as long as the actual implementation of emutls is linked dynamically from e.g. libgcc; if the implementation comes from compiler-rt or a statically linked libgcc, it doesn't work as intended.)
This fixes PR50146 and https://github.com/msys2/MINGW-packages/issues/8706 (fixing calling std::call_once in a dynamically linked libstdc++); since f73183958482602c4588b0f4a1c3a096e7542947 the dso_local attribute on the TLS variable affected the actual generated code for accessing the emutls variable.
The dso_local attribute on the emutls variable made those accesses to use 32 bit relative addressing in code, which requires runtime pseudo relocations in the text section, and breaks entirely if the actual other variable ends up loaded too far away in the virtual address space.
Differential Revision: https://reviews.llvm.org/D102970
show more ...
|
| #
033138ea |
| 21-May-2021 |
Nick Desaulniers <[email protected]> |
[IR] make stack-protector-guard-* flags into module attrs
D88631 added initial support for:
- -mstack-protector-guard= - -mstack-protector-guard-reg= - -mstack-protector-guard-offset=
flags, and D
[IR] make stack-protector-guard-* flags into module attrs
D88631 added initial support for:
- -mstack-protector-guard= - -mstack-protector-guard-reg= - -mstack-protector-guard-offset=
flags, and D100919 extended these to AArch64. Unfortunately, these flags aren't retained for LTO. Make them module attributes rather than TargetOptions.
Link: https://github.com/ClangBuiltLinux/linux/issues/1378
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D102742
show more ...
|
| #
4cb42564 |
| 19-May-2021 |
Yaxun (Sam) Liu <[email protected]> |
[CUDA][HIP] Fix device variables used by host
variables emitted on both host and device side with different addresses when ODR-used by host function should not cause device side counter-part to be f
[CUDA][HIP] Fix device variables used by host
variables emitted on both host and device side with different addresses when ODR-used by host function should not cause device side counter-part to be force emitted.
This fixes the regression caused by https://reviews.llvm.org/D102237
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D102801
show more ...
|
| #
8f20ac95 |
| 13-May-2021 |
Reid Kleckner <[email protected]> |
[PGO] Don't reference functions unless value profiling is enabled
This reduces the size of chrome.dll.pdb built with optimizations, coverage, and line table info from 4,690,210,816 to 2,181,128,192,
[PGO] Don't reference functions unless value profiling is enabled
This reduces the size of chrome.dll.pdb built with optimizations, coverage, and line table info from 4,690,210,816 to 2,181,128,192, which makes it possible to fit under the 4GB limit.
This change can greatly reduce binary size in coverage builds, which do not need value profiling. IR PGO builds are unaffected. There is a minor behavior change for frontend PGO.
PGO and coverage both use InstrProfiling to create profile data with counters. PGO records the address of each function in the __profd_ global. It is used later to map runtime function pointer values back to source-level function names. Coverage does not appear to use this information.
Recording the address of every function with code coverage drastically increases code size. Consider this program:
void foo(); void bar(); inline void inlineMe(int x) { if (x > 0) foo(); else bar(); } int getVal(); int main() { inlineMe(getVal()); }
With code coverage, the InstrProfiling pass runs before inlining, and it captures the address of inlineMe in the __profd_ global. This greatly increases code size, because now the compiler can no longer delete trivial code.
One downside to this approach is that users of frontend PGO must apply the -mllvm -enable-value-profiling flag globally in TUs that enable PGO. Otherwise, some inline virtual method addresses may not be recorded and will not be able to be promoted. My assumption is that this mllvm flag is not popular, and most frontend PGO users don't enable it.
Differential Revision: https://reviews.llvm.org/D102818
show more ...
|
| #
37561ba8 |
| 19-May-2021 |
Fangrui Song <[email protected]> |
-fno-semantic-interposition: Don't set dso_local on GlobalVariable
`clang -fpic -fno-semantic-interposition` may set dso_local on variables for -fpic.
GCC folks consider there are 'address interpos
-fno-semantic-interposition: Don't set dso_local on GlobalVariable
`clang -fpic -fno-semantic-interposition` may set dso_local on variables for -fpic.
GCC folks consider there are 'address interposition' and 'semantic interposition', and 'disabling semantic interposition' can optimize function calls but cannot change variable references to use local aliases (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100483).
This patch removes dso_local for variables in `clang -fpic -fno-semantic-interposition` mode so that the built shared objects can work with copy relocations. Building llvm-project tiself with -fno-semantic-interposition (D102453) should now be safe with trunk Clang.
Example: ``` // a.c int var; int *addr() { return var; }
// old: cannot be interposed movslq .Lvar$local(%rip), %rax // new: can be interposed movq var@GOTPCREL(%rip), %rax movslq (%rax), %rax ```
The local alias lowering for `GlobalVariable`s is kept in case there is a future option allowing local aliases.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D102583
show more ...
|
| #
797ad701 |
| 18-May-2021 |
Ten Tzen <[email protected]> |
[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1
This patch is the Part-1 (FE Clang) implementation of HW Exception handling.
This new feature adds the support of Hardware Exception
[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1
This patch is the Part-1 (FE Clang) implementation of HW Exception handling.
This new feature adds the support of Hardware Exception for Microsoft Windows SEH (Structured Exception Handling). This is the first step of this project; only X86_64 target is enabled in this patch.
Compiler options: For clang-cl.exe, the option is -EHa, the same as MSVC. For clang.exe, the extra option is -fasync-exceptions, plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.
NOTE:: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change.
The rules for C code: For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow three rules: * First, no exception can move in or out of _try region., i.e., no "potential faulty instruction can be moved across _try boundary. * Second, the order of exceptions for instructions 'directly' under a _try must be preserved (not applied to those in callees). * Finally, global states (local/global/heap variables) that can be read outside of _try region must be updated in memory (not just in register) before the subsequent exception occurs.
The impact to C++ code: Although SEH is a feature for C code, -EHa does have a profound effect on C++ side. When a C++ function (in the same compilation unit with option -EHa ) is called by a SEH C function, a hardware exception occurs in C++ code can also be handled properly by an upstream SEH _try-handler or a C++ catch(...). As such, when that happens in the middle of an object's life scope, the dtor must be invoked the same way as C++ Synchronous Exception during unwinding process.
Design: A natural way to achieve the rules above in LLVM today is to allow an EH edge added on memory/computation instruction (previous iload/istore idea) so that exception path is modeled in Flow graph preciously. However, tracking every single memory instruction and potential faulty instruction can create many Invokes, complicate flow graph and possibly result in negative performance impact for downstream optimization and code generation. Making all optimizations be aware of the new semantic is also substantial.
This design does not intend to model exception path at instruction level. Instead, the proposed design tracks and reports EH state at BLOCK-level to reduce the complexity of flow graph and minimize the performance-impact on CPP code under -EHa option.
One key element of this design is the ability to compute State number at block-level. Our algorithm is based on the following rationales:
A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping into a _try is not allowed. The single entry must start with a seh_try_begin() invoke with a correct State number that is the initial state of the SEME. Through control-flow, state number is propagated into all blocks. Side exits marked by seh_try_end() will unwind to parent state based on existing SEHUnwindMap[]. Note side exits can ONLY jump into parent scopes (lower state number). Thus, when a block succeeds various states from its predecessors, the lowest State triumphs others. If some exits flow to unreachable, propagation on those paths terminate, not affecting remaining blocks. For CPP code, object lifetime region is usually a SEME as SEH _try. However there is one rare exception: jumping into a lifetime that has Dtor but has no Ctor is warned, but allowed:
Warning: jump bypasses variable with a non-trivial destructor
In that case, the region is actually a MEME (multiple entry multiple exits). Our solution is to inject a eha_scope_begin() invoke in the side entry block to ensure a correct State.
Implementation: Part-1: Clang implementation described below.
Two intrinsic are created to track CPP object scopes; eha_scope_begin() and eha_scope_end(). _scope_begin() is immediately added after ctor() is called and EHStack is pushed. So it must be an invoke, not a call. With that it's also guaranteed an EH-cleanup-pad is created regardless whether there exists a call in this scope. _scope_end is added before dtor(). These two intrinsics make the computation of Block-State possible in downstream code gen pass, even in the presence of ctor/dtor inlining.
Two intrinsic, seh_try_begin() and seh_try_end(), are added for C-code to mark _try boundary and to prevent from exceptions being moved across _try boundary. All memory instructions inside a _try are considered as 'volatile' to assure 2nd and 3rd rules for C-code above. This is a little sub-optimized. But it's acceptable as the amount of code directly under _try is very small.
Part-2 (will be in Part-2 patch): LLVM implementation described below.
For both C++ & C-code, the state of each block is computed at the same place in BE (WinEHPreparing pass) where all other EH tables/maps are calculated. In addition to _scope_begin & _scope_end, the computation of block state also rely on the existing State tracking code (UnwindMap and InvokeStateMap).
For both C++ & C-code, the state of each block with potential trap instruction is marked and reported in DAG Instruction Selection pass, the same place where the state for -EHsc (synchronous exceptions) is done. If the first instruction in a reported block scope can trap, a Nop is injected before this instruction. This nop is needed to accommodate LLVM Windows EH implementation, in which the address in IPToState table is offset by +1. (note the purpose of that is to ensure the return address of a call is in the same scope as the call address.
The handler for catch(...) for -EHa must handle HW exception. So it is 'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches C++ exceptions). Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW exceptions can be passed through.
Original llvm-dev [RFC] discussions can be found in these two threads below: https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html
Differential Revision: https://reviews.llvm.org/D80344/new/
show more ...
|
| #
9f7d552c |
| 17-May-2021 |
Arthur Eubanks <[email protected]> |
[NFC] Pass GV value type instead of pointer type to GetOrCreateLLVMGlobal
For opaque pointers, to avoid PointerType::getElementType().
Reviewed By: dblaikie
Differential Revision: https://reviews.
[NFC] Pass GV value type instead of pointer type to GetOrCreateLLVMGlobal
For opaque pointers, to avoid PointerType::getElementType().
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D102638
show more ...
|
| #
a624cec5 |
| 13-May-2021 |
Roman Lebedev <[email protected]> |
[Clang][Codegen] Do not annotate thunk's this/return types with align/deref/nonnull attrs
As it was discovered in post-commit feedback for 0aa0458f1429372038ca6a4edc7e94c96cd9a753, we handle thunks
[Clang][Codegen] Do not annotate thunk's this/return types with align/deref/nonnull attrs
As it was discovered in post-commit feedback for 0aa0458f1429372038ca6a4edc7e94c96cd9a753, we handle thunks incorrectly, and end up annotating their this/return with attributes that are valid for their callees, not for thunks themselves.
While it would be good to fix this properly, and keep annotating them on thunks, i've tried doing that in https://reviews.llvm.org/D100388 with little success, and the patch is stuck for a month now.
So for now, as a stopgap measure, subj.
show more ...
|
| #
98575708 |
| 11-May-2021 |
Yaxun (Sam) Liu <[email protected]> |
[CUDA][HIP] Fix device template variables
Currently clang does not emit device template variables instantiated only in host functions, however, nvcc is able to do that:
https://godbolt.org/z/fneEff
[CUDA][HIP] Fix device template variables
Currently clang does not emit device template variables instantiated only in host functions, however, nvcc is able to do that:
https://godbolt.org/z/fneEfferY
This patch fixes this issue by refactoring and extending the existing mechanism for emitting static device var ODR-used by host only. Basically clang records device variables ODR-used by host code and force them to be emitted in device compilation. The existing mechanism makes sure these device variables ODR-used by host code are added to llvm.compiler-used, therefore they are guaranteed not to be deleted.
It also fixes non-ODR-use of static device variable by host code causing static device variable to be emitted and registered, which should not.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D102237
show more ...
|
| #
df729e2b |
| 22-Apr-2021 |
Johannes Doerfert <[email protected]> |
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling and extends it to support `omp begin declare target` as well.
This started with
[OpenMP] Overhaul `declare target` handling
This patch fixes various issues with our prior `declare target` handling and extends it to support `omp begin declare target` as well.
This started with PR49649 in mind, trying to provide a way for users to avoid the "ref" global use introduced for globals with internal linkage. From there it went down the rabbit hole, e.g., all variables, even `nohost` ones, were emitted into the device code so it was impossible to determine if "ref" was needed late in the game (based on the name only). To make it really useful, `begin declare target` was needed as it can carry the `device_type`. Not emitting variables eagerly had a ripple effect. Finally, the precedence of the (explicit) declare target list items needed to be taken into account, that meant we cannot just look for any declare target attribute to make a decision. This caused the handling of functions to require fixup as well.
I tried to clean up things while I was at it, e.g., we should not "parse declarations and defintions" as part of OpenMP parsing, this will always break at some point. Instead, we keep track what region we are in and act on definitions and declarations instead, this is what we do for declare variant and other begin/end directives already.
Highlights: - new diagnosis for restrictions specificed in the standard, - delayed emission of globals not mentioned in an explicit list of a declare target, - omission of `nohost` globals on the host and `host` globals on the device, - no explicit parsing of declarations in-between `omp [begin] declare variant` and the corresponding end anymore, regular parsing instead, - precedence for explicit mentions in `declare target` lists over implicit mentions in the declaration-definition-seq, and - `omp allocate` declarations will now replace an earlier emitted global, if necessary.
---
Notes:
The patch is larger than I hoped but it turns out that most changes do on their own lead to "inconsistent states", which seem less desirable overall.
After working through this I feel the standard should remove the explicit declare target forms as the delayed emission is horrible. That said, while we delay things anyway, it seems to me we check too often for the current status even though that is often not sufficient to act upon. There seems to be a lot of duplication that can probably be trimmed down. Eagerly emitting some things seems pretty weak as an argument to keep so much logic around.
---
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D101030
show more ...
|
| #
84c47543 |
| 21-Apr-2021 |
Leonard Chan <[email protected]> |
[clang] Add -fc++-abi= flag for specifying which C++ ABI to use
This implements the flag proposed in RFC http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html.
The goal is to add a way to
[clang] Add -fc++-abi= flag for specifying which C++ ABI to use
This implements the flag proposed in RFC http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html.
The goal is to add a way to override the default target C++ ABI through a compiler flag. This makes it easier to test and transition between different C++ ABIs through compile flags rather than build flags.
In this patch:
- Store -fc++-abi= in a LangOpt. This isn't stored in a CodeGenOpt because there are instances outside of codegen where Clang needs to know what the ABI is (particularly through ASTContext::createCXXABI), and we should be able to override the target default if the flag is provided at that point. - Expose the existing ABIs in TargetCXXABI as values that can be passed through this flag. - Create a .def file for these ABIs to make it easier to check flag values. - Add an error for diagnosing bad ABI flag values.
Differential Revision: https://reviews.llvm.org/D85802
show more ...
|
| #
2669abae |
| 04-May-2021 |
Alex Lorenz <[email protected]> |
[clang][CodeGen] Use llvm::stable_sort for multi version resolver options
The use of llvm::sort causes periodic failures on the bot with EXPENSIVE_CHECKS enabled, as the regular sort pre-shuffles th
[clang][CodeGen] Use llvm::stable_sort for multi version resolver options
The use of llvm::sort causes periodic failures on the bot with EXPENSIVE_CHECKS enabled, as the regular sort pre-shuffles the array in the expensive checks mode, leading to a non-deterministic test result which causes the CodeGenCXX/attr-cpuspecific-outoflinedefs.cpp testcase to fail on the bot (http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-expensive/).
show more ...
|
|
Revision tags: llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init |
|
| #
7818906c |
| 20-Dec-2019 |
Alexey Bader <[email protected]> |
[SYCL] Implement SYCL address space attributes handling
Default address space (applies when no explicit address space was specified) maps to generic (4) address space.
Added SYCL named address spac
[SYCL] Implement SYCL address space attributes handling
Default address space (applies when no explicit address space was specified) maps to generic (4) address space.
Added SYCL named address spaces `sycl_global`, `sycl_local` and `sycl_private` defined as sub-sets of the default address space.
Static variables without address space now reside in global address space when compile for SPIR target, unless they have an explicit address space qualifier in source code.
Differential Revision: https://reviews.llvm.org/D89909
show more ...
|
| #
2786e673 |
| 23-Apr-2021 |
Fangrui Song <[email protected]> |
[IR][sanitizer] Add module flag "frame-pointer" and set it for cc1 -mframe-pointer={non-leaf,all}
The Linux kernel objtool diagnostic `call without frame pointer save/setup` arise in multiple instru
[IR][sanitizer] Add module flag "frame-pointer" and set it for cc1 -mframe-pointer={non-leaf,all}
The Linux kernel objtool diagnostic `call without frame pointer save/setup` arise in multiple instrumentation passes (asan/tsan/gcov). With the mechanism introduced in D100251, it's trivial to respect the command line -m[no-]omit-leaf-frame-pointer/-f[no-]omit-frame-pointer, so let's do it.
Fix: https://github.com/ClangBuiltLinux/linux/issues/1236 (tsan) Fix: https://github.com/ClangBuiltLinux/linux/issues/1238 (asan)
Also document the function attribute "frame-pointer" which is long overdue.
Differential Revision: https://reviews.llvm.org/D101016
show more ...
|
| #
775a9483 |
| 21-Apr-2021 |
Fangrui Song <[email protected]> |
[IR][sanitizer] Set nounwind on module ctor/dtor, additionally set uwtable if -fasynchronous-unwind-tables
On ELF targets, if a function has uwtable or personality, or does not have nounwind (`needs
[IR][sanitizer] Set nounwind on module ctor/dtor, additionally set uwtable if -fasynchronous-unwind-tables
On ELF targets, if a function has uwtable or personality, or does not have nounwind (`needsUnwindTableEntry`), it marks that `.eh_frame` is needed in the module.
Then, a function gets `.eh_frame` if `needsUnwindTableEntry` or `-g[123]` is specified. (i.e. If -g[123], every function gets `.eh_frame`. This behavior is strange but that is the status quo on GCC and Clang.)
Let's take asan as an example. Other sanitizers are similar. `asan.module_[cd]tor` has no attribute. `needsUnwindTableEntry` returns true, so every function gets `.eh_frame` if `-g[123]` is specified. This is the root cause that `-fno-exceptions -fno-asynchronous-unwind-tables -g` produces .debug_frame while `-fno-exceptions -fno-asynchronous-unwind-tables -g -fsanitize=address` produces .eh_frame.
This patch
* sets the nounwind attribute on sanitizer module ctor/dtor. * let Clang emit a module flag metadata "uwtable" for -fasynchronous-unwind-tables. If "uwtable" is set, sanitizer module ctor/dtor additionally get the uwtable attribute.
The "uwtable" mechanism is generic: synthesized functions not cloned/specialized from existing ones should consider `Function::createWithDefaultAttr` instead of `Function::create` if they want to get some default attributes which have more of module semantics.
Other candidates: "frame-pointer" (https://github.com/ClangBuiltLinux/linux/issues/955 https://github.com/ClangBuiltLinux/linux/issues/1238), dso_local, etc.
Differential Revision: https://reviews.llvm.org/D100251
show more ...
|
| #
0ed61361 |
| 20-Apr-2021 |
Erich Keane <[email protected]> |
Ensure target-multiversioning emits deferred declarations
As reported in PR50025, sometimes we would end up not emitting functions needed by inline multiversioned variants. This is because we typica
Ensure target-multiversioning emits deferred declarations
As reported in PR50025, sometimes we would end up not emitting functions needed by inline multiversioned variants. This is because we typically use the 'deferred decl' mechanism to emit these. However, the variants are emitted after that typically happens. This fixes that by ensuring we re-run deferred decls after this happens. Also, the multiversion emission is done recursively to ensure that MV functions that require other MV functions to be emitted get emitted.
show more ...
|
| #
6a72ed23 |
| 19-Apr-2021 |
Jan Svoboda <[email protected]> |
[clang] NFC: Fix range-based for loop warnings related to decl lookup
|
| #
f6f21dcd |
| 24-Mar-2021 |
Alexey Bataev <[email protected]> |
[OPENMP]Fix PR49636: Assertion `(!Entry.getAddress() || Entry.getAddress() == Addr) && "Resetting with the new address."' failed.
The original issue is caused by the fact that the variable is alloca
[OPENMP]Fix PR49636: Assertion `(!Entry.getAddress() || Entry.getAddress() == Addr) && "Resetting with the new address."' failed.
The original issue is caused by the fact that the variable is allocated with incorrect type i1 instead of i8. This causes the bitcasting of the declaration to i8 type and the bitcast expression does not match the original variable. To fix the problem, the UndefValue initializer and the original variable should be emitted with type i8, not i1.
Differential Revision: https://reviews.llvm.org/D99297
show more ...
|
| #
d672d521 |
| 11-Mar-2021 |
Alex Lorenz <[email protected]> |
Revert "[CodeGenModule] Set dso_local for Mach-O GlobalValue"
This reverts commit 809a1e0ffd7af40ee27270ff8ba2ffc927330e71.
Mach-O doesn't support dso_local and this change broke XNU because of the
Revert "[CodeGenModule] Set dso_local for Mach-O GlobalValue"
This reverts commit 809a1e0ffd7af40ee27270ff8ba2ffc927330e71.
Mach-O doesn't support dso_local and this change broke XNU because of the use of dso_local.
Differential Revision: https://reviews.llvm.org/D98458
show more ...
|
| #
440f6bdf |
| 16-Mar-2021 |
Luke Drummond <[email protected]> |
[OpenCL][NFCI] Prefer CodeGenFunction::EmitRuntimeCall
`CodeGenFunction::EmitRuntimeCall` automatically sets the right calling convention for the callee so we can avoid setting it ourselves.
As req
[OpenCL][NFCI] Prefer CodeGenFunction::EmitRuntimeCall
`CodeGenFunction::EmitRuntimeCall` automatically sets the right calling convention for the callee so we can avoid setting it ourselves.
As requested in https://reviews.llvm.org/D98411
Reviewed by: anastasia Differential Revision: https://reviews.llvm.org/D98705
show more ...
|
| #
fcfd3fda |
| 10-Mar-2021 |
Luke Drummond <[email protected]> |
[OpenCL] Respect calling convention for builtin
`__translate_sampler_initializer` has a calling convention of `spir_func`, but clang generated calls to it using the default CC.
Instruction Combinin
[OpenCL] Respect calling convention for builtin
`__translate_sampler_initializer` has a calling convention of `spir_func`, but clang generated calls to it using the default CC.
Instruction Combining was lowering these mismatching calling conventions to `store i1* undef` which itself was subsequently lowered to a trap instruction by simplifyCFG resulting in runtime `SIGILL`
There are arguably two bugs here: but whether there's any wisdom in converting an obviously invalid call into a runtime crash over aborting with a sensible error message will require further discussion. So for now it's enough to set the right calling convention on the runtime helper.
Reviewed By: svenh, bader
Differential Revision: https://reviews.llvm.org/D98411
show more ...
|
| #
cdb42a4c |
| 12-Mar-2021 |
Sriraman Tallam <[email protected]> |
Disable unique linkage suffixes ifor global vars until demanglers can be fixed.
D96109 added support for unique internal linkage names for both internal linkage functions and global variables. There
Disable unique linkage suffixes ifor global vars until demanglers can be fixed.
D96109 added support for unique internal linkage names for both internal linkage functions and global variables. There was a lot of discussion on how to get the demangling right for functions but I completely missed the point that demanglers do not support suffixes for global vars. For example:
$ c++filt _ZL3foo foo $ c++filt _ZL3foo.uniq.123 _ZL3foo.uniq.123
The demangling for functions works as expected.
I am not sure of the impact of this. I don't understand how debuggers and other tools depend on the correctness of global variable demangling so I am pre-emptively disabling it until we can get the demangling support added.
Importantly, uniquefying global variables is not needed right now as we do not do profile attribution to global vars based on sampling. It was added for completeness and so this feature is not exactly missed.
Differential Revision: https://reviews.llvm.org/D98392
show more ...
|