History log of /llvm-project-15.0.7/clang/lib/CodeGen/CodeGenModule.cpp (Results 76 – 100 of 1864)
Revision Date Author Comments
# 806bbc49 21-Feb-2022 Joseph Huber <[email protected]>

[OpenMP] Try to embed offloading objects after codegen

Currently we use the `-fembed-offload-object` option to embed a binary
file into the host as a named section. This is currently only used as a
codegen action, meaning we only handle this option correctly when the
input is a bitcode file. This patch adds the same handling to embed an
offloading object after we complete code generation. This allows us to
embed the object correctly if the input file is source or bitcode.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120270


# dc152659 10-Mar-2022 Erich Keane <[email protected]>

Have cpu-specific variants set 'tune-cpu' as an optimization hint

Due to various implementation constraints, despite the programmer
choosing a 'processor' cpu_dispatch/cpu_specific needs to use the
'feature' list of a processor to identify it. This results in the
identified processor in source-code not being propagated to the
optimizer, and thus, not able to be tuned for.

This patch changes to use the actual cpu as written for tune-cpu so that
opt can make decisions based on the cpu-as-spelled, which should better
match the behavior expected by the programmer.

Note that the 'valid' list of processors for x86 is in
llvm/include/llvm/Support/X86TargetParser.def. At the moment, this list
contains only Intel processors, but other vendors may wish to add their
own entries as 'alias'es (or with different feature lists!).

If this is not done, there are two potential performance issues with the
patch, but I believe them to be worth it in light of the improvements to
behavior and performance.

1- In the event that the user spelled "ProcessorB", but we only have the
features available to test for "ProcessorA" (where A is B minus features),
AND there is an optimization opportunity for "B" that negatively affects
"A", the optimizer will likely choose to do so.

2- In the event that the user spelled VendorI's processor, and the feature
list allows it to run on VendorA's processor of similar features, AND there
is an optimization opportunity for VendorIs that negatively affects "A"s,
the optimizer will likely choose to do so. This can be fixed by adding an
alias to X86TargetParser.def.

Differential Revision: https://reviews.llvm.org/D121410


# f3480390 29-Jan-2022 Itay Bookstein <[email protected]>

[clang][CodeGen] Avoid emitting ifuncs with undefined resolvers

The purpose of this change is to fix the following codegen bug:

```
// main.c
__attribute__((cpu_specific(generic)))
int *foo(void) { static int z; return &z;}
int main() { return *foo() = 5; }

// other.c
__attribute__((cpu_dispatch(generic))) int *foo(void);

// run:
clang main.c other.c -o main; ./main
```

This will segfault prior to the change, and return the correct
exit code 5 after the change.

The underlying cause is that when a translation unit contains
a cpu_specific function without the corresponding cpu_dispatch
the generated code binds the reference to foo() against a
GlobalIFunc whose resolver is undefined. This is invalid: the
resolver must be defined in the same translation unit as the
ifunc, but historically the LLVM bitcode verifier did not check
that. The generated code then binds against the resolver rather
than the ifunc, so it ends up calling the resolver rather than
the resolvee. In the example above it treats its return value as
an int *, therefore trying to write to program text.

The root issue at the representation level is that GlobalIFunc,
like GlobalAlias, does not support a "declaration" state. The
object which provides the correct semantics in these cases
is a Function declaration, but unlike Functions, changing a
declaration to a definition in the GlobalIFunc case constitutes
a change of the object type, as opposed to simply emitting code
into a Function.

I think this limitation is unlikely to change, so I implemented
the fix by returning a function declaration rather than an ifunc
when encountering cpu_specific, and upgrading it to an ifunc
when emitting cpu_dispatch.
This uses `takeName` + `replaceAllUsesWith` in similar vein to
other places where the correct IR object type cannot be known
locally/up-front, like in `CodeGenModule::EmitAliasDefinition`.

Previous discussion in: https://reviews.llvm.org/D112349

Signed-off-by: Itay Bookstein <[email protected]>

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D120266


# 50650766 16-Feb-2022 Nikita Popov <[email protected]>

[CodeGen] Rename deprecated Address constructor

To make uses of the deprecated constructor easier to spot, and to
ensure that no new uses are introduced, rename it to
Address::deprecated().

While doing the rename, I've filled in element types in cases
where it was relatively obvious, but we're still left with 135
calls to the deprecated constructor.


# 6398903a 14-Feb-2022 Momchil Velikov <[email protected]>

Extend the `uwtable` attribute with unwind table kind

We have the `clang -cc1` command-line option `-funwind-tables=1|2` and
the codegen option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind
tables (1) or asynchronous unwind tables (2)`. However, this is
encoded in LLVM IR by the presence or the absence of the `uwtable`
attribute, i.e. we lose the information whether to generate just
some unwind tables or asynchronous unwind tables.

Asynchronous unwind tables take more space in the runtime image, I'd
estimate something like 80-90% more, as the difference is adding
roughly the same number of CFI directives as for prologues, only a bit
simpler (e.g. `.cfi_offset reg, off` vs. `.cfi_restore reg`). Or even
more, if you consider tail duplication of epilogue blocks.
Asynchronous unwind tables could also restrict code generation to
having only a finite number of frame pointer adjustments (an example
of *not* having a finite number of `SP` adjustments is on AArch64 when
untagging the stack (MTE): in some cases the compiler can modify `SP`
in a loop).
Having the CFI precise up to an instruction generally also means one
cannot bundle together CFI instructions once the prologue is done,
they need to be interspersed with ordinary instructions, which means
extra `DW_CFA_advance_loc` commands, further increasing the unwind
tables size.

That is to say, async unwind tables impose a non-negligible overhead,
yet for the most common use cases (like C++ exceptions), they are not
even needed.

This patch extends the `uwtable` attribute with an optional
value:
- `uwtable` (default to `async`)
- `uwtable(sync)`, synchronous unwind tables
- `uwtable(async)`, asynchronous (instruction precise) unwind tables
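
The three spellings listed above can be modelled with a small mapping. The following is only a sketch: `UnwindTableKind`, `kindFromOption`, `attributeSpelling` and `kindFromAttribute` are hypothetical names chosen for illustration, not the LLVM API, and treating the option value 0 as "no unwind tables" is an assumption based on the codegen option's default.

```cpp
#include <optional>
#include <string>

// Hypothetical model of the unwind table kinds described above.
enum class UnwindTableKind { None = 0, Sync = 1, Async = 2 };

// Map a -funwind-tables value to a kind. Values 1 and 2 come from the
// text; treating 0 as "none" is an assumption for this sketch.
std::optional<UnwindTableKind> kindFromOption(int value) {
  switch (value) {
  case 0: return UnwindTableKind::None;
  case 1: return UnwindTableKind::Sync;
  case 2: return UnwindTableKind::Async;
  default: return std::nullopt;
  }
}

// Spell the attribute for a kind; an empty string means "no attribute".
std::string attributeSpelling(UnwindTableKind kind) {
  switch (kind) {
  case UnwindTableKind::Sync: return "uwtable(sync)";
  case UnwindTableKind::Async: return "uwtable(async)";
  case UnwindTableKind::None: break;
  }
  return "";
}

// Read an attribute spelling back; a bare `uwtable` defaults to async.
std::optional<UnwindTableKind> kindFromAttribute(const std::string &text) {
  if (text == "uwtable" || text == "uwtable(async)")
    return UnwindTableKind::Async;
  if (text == "uwtable(sync)")
    return UnwindTableKind::Sync;
  return std::nullopt;
}
```

With this model, option value 1 round-trips through `uwtable(sync)`, while a bare `uwtable` reads back as the asynchronous kind.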

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D114543


# 87dd3d35 11-Feb-2022 Arthur Eubanks <[email protected]>

[clang][OpaquePtr] Remove call to getPointerElementType() in CodeGenModule::GetAddrOfGlobalTemporary()


# d8f99bb6 11-Feb-2022 Sameer Sahasrabuddhe <[email protected]>

[AMDGPU] replace hostcall module flag with function attribute

The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.

If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.

The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implicitarg_ptr does not result in a load of any byte in the hostcall
pointer argument.

Reviewed By: jdoerfert, arsenm, kpyzhov

Differential Revision: https://reviews.llvm.org/D119216


# 1d97cb1f 04-Feb-2022 Yaxun (Sam) Liu <[email protected]>

[HIP] Emit amdgpu_code_object_version module flag

The code object version determines the ABI, so versions should not be mixed.

This patch emits amdgpu_code_object_version module flag in LLVM IR
based on code object version (default 4).

The amdgpu_code_object_version value is code object version times 100.

LLVM IR with different amdgpu_code_object_version module flag cannot
be linked.
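
As a sketch of the arithmetic and the linking rule stated above (the helper names `codeObjectVersionFlag` and `canLinkModules` are hypothetical and are not part of Clang or LLVM):

```cpp
// Hypothetical helper: the module flag value is the code object version
// multiplied by 100, as described in the text.
constexpr unsigned codeObjectVersionFlag(unsigned version) {
  return version * 100;
}

// Hypothetical helper: two modules are linkable only if their
// amdgpu_code_object_version flag values are equal.
constexpr bool canLinkModules(unsigned flagA, unsigned flagB) {
  return flagA == flagB;
}

// The default code object version named in the text is 4.
static_assert(codeObjectVersionFlag(4) == 400, "version 4 maps to 400");
static_assert(!canLinkModules(codeObjectVersionFlag(4),
                              codeObjectVersionFlag(5)),
              "different versions must not be linked");
```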

The -cc1 option -mcode-object-version=none is for ROCm device library use
only, which supports multiple ABIs.

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D119026


# 171da443 04-Feb-2022 Yaxun (Sam) Liu <[email protected]>

[HIPSPV] Fix literals are mapped to Generic address space

This issue is an oversight in D108621.

Literals in HIP are emitted as global constant variables with default
address space which maps to Generic address space for HIPSPV. In
SPIR-V such variables translate to OpVariable instructions with
Generic storage class which are not legal. Fix by mapping literals
to CrossWorkGroup address space.

The literals are not mapped to UniformConstant because the “flat”
pointers in HIP may reference them and “flat” pointers are modeled
as Generic pointers in SPIR-V. In SPIR-V/OpenCL UniformConstant
pointers may not be cast to Generic.

Patch by: Henry Linjamäki

Reviewed by: Yaxun Liu

Differential Revision: https://reviews.llvm.org/D118876


# 853e0aa4 04-Feb-2022 Hans Wennborg <[email protected]>

Don't dllexport reference temporaries

Even if the reference itself is dllexport, the temporary should not be.
In fact, we're already giving it internal linkage, so dllexporting it
is not just wasteful, but will fail to link, as in the example below:

$ cat /tmp/a.cc
void _DllMainCRTStartup() {}
const int __declspec(dllexport) &foo = 42;

$ clang-cl -fuse-ld=lld /tmp/a.cc /Zl /link /dll /out:a.dll
lld-link: error: <root>: undefined symbol: int const &foo::$RT1

Differential revision: https://reviews.llvm.org/D118980


# 1f08b086 28-Jan-2022 Amilendra Kodithuwakku <[email protected]>

[clang][ARM] Emit warnings when PACBTI-M is used with unsupported architectures

Branch protection in M-class is supported by
- Armv8.1-M.Main
- Armv8-M.Main
- Armv7-M

Attempting to enable this for other architectures, either by
command-line (e.g. -mbranch-protection=bti) or by target attribute
in source code (e.g. __attribute__((target("branch-protection=..."))) )
will generate a warning.

In both cases function attributes related to branch protection will not
be emitted. Regardless of the warning, module level attributes related to
branch protection will be emitted when it is enabled via the command-line.

The following people also contributed to this patch:
- Victor Campos

Reviewed By: chill

Differential Revision: https://reviews.llvm.org/D115501


# 82af9502 21-Jan-2022 Joao Moreira <[email protected]>

[X86] Enable ibt-seal optimization when LTO is used in Kernel

Intel's CET/IBT requires every indirect branch target to be an ENDBR instruction. Because of that, the compiler needs to correctly emit these instructions in function prologues. Because this is a security feature, it is desirable that only functions that are actually targeted by indirect branches are emitted with ENDBRs. While it is possible to identify address-taken functions through LTO, minimizing these ENDBR instructions remains a hard task for user-space binaries because exported functions may end up being reachable through PLT entries, which use an indirect branch. Because this cannot be determined at compile time, the compiler currently emits ENDBRs in every non-local-linkage function.

Despite the challenge presented for user-space, the kernel landscape is different as no PLTs are used. With the intent of providing the best-fitting ENDBR emission for the kernel, kernel developers proposed an optimization named "ibt-seal" which replaces the ENDBRs with NOPs directly in the binary. The discussion of this feature can be seen in [1].

This diff enables the flag -mibt-seal, which in combination with LTO enforces a different policy for ENDBR placement when the code-model is set to "kernel". In this scenario, the compiler will only emit ENDBRs in address-taken functions, ignoring non-address-taken functions that don't have local linkage.

A comparison between LTO-compiled kernel binaries without and with the -mibt-seal feature enabled shows that when -mibt-seal was used, the number of ENDBRs in the vmlinux.o binary patched by objtool decreased from 44383 to 33192, and that the number of superfluous ENDBR instructions nopped-out decreased from 11730 to 540.

The 540 missed superfluous ENDBRs need to be investigated further, but hypotheses are: assembly code not being taken care of by the compiler, kernel exported symbols mechanisms creating bogus address taken situations or even these being removed due to other binary optimizations like kernel's static_calls. For now, I assume that the large drop in the number of ENDBR instructions already justifies the feature being merged.

[1] - https://lkml.org/lkml/2021/11/22/591

Reviewed By: xiangzhangllvm

Differential Revision: https://reviews.llvm.org/D116070


# 85c2bd2a 19-Jan-2022 Yaxun (Sam) Liu <[email protected]>

Prevent adding module flag amdgpu_hostcall multiple times

HIP program with printf call fails to compile with -fsanitize=address
option, because of appending module flag - amdgpu_hostcall twice, one
for printf and one for sanitize option. This patch fixes that issue.

Patch by: Praveen Velliengiri

Reviewed by: Yaxun Liu, Roman Lebedev

Differential Revision: https://reviews.llvm.org/D116216


# c63a3175 15-Jan-2022 Nikita Popov <[email protected]>

[AttrBuilder] Remove ctor accepting AttributeList and Index

Use the AttributeSet constructor instead. There's no good reason
why AttrBuilder itself should extract the AttributeSet from the
AttributeList. Moving this out of the AttrBuilder generally results
in cleaner code.


# 2bcba21c 14-Jan-2022 Erich Keane <[email protected]>

[CPU-Dispatch] Make sure Dispatch names get updated if previously mangled

Cases where there is a mangling of a cpu-dispatch/cpu-specific function
before the function becomes 'multiversion' (such as a member function)
cause the wrong name to be emitted for one of the variants/resolver,
since the name is cached. Make sure we invalidate the cache in
cpu-dispatch/cpu-specific modes, like we previously did for just target
multiversioning.


# b699e8b1 13-Jan-2022 Erich Keane <[email protected]>

Add another assert to cpu-dispatch emission to help track down a tough
to repro error.

As mentioned yesterday, I've got a problem that I can only reproduce on
Godbolt (none of the build configs on my local machine!), so this is at
least somewhat usable until I figure out a cause.


# 6e77ad11 12-Jan-2022 Erich Keane <[email protected]>

Add an assert in cpudispatch emit to try to track down an error.

I'm attempting to debug an issue that I can only get to happen on
godbolt, where the cpu-dispatch resolver for an out of line member
function is generated with the wrong name, causing a link failure.


# d2cc6c2d 03-Jan-2022 Serge Guelton <[email protected]>

Use a sorted array instead of a map to store AttrBuilder string attributes

Using a std::map<SmallString, SmallString> for target dependent attributes is
inefficient: it makes its constructor slightly heavier, and involves extra
allocation for each new string attribute. Storing the attribute key/value as
strings implies extra allocation/copy step.

Use a sorted vector instead. Given the low number of attributes generally
involved, this is cheaper, as showcased by

https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions
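
The idea of replacing the map with a sorted vector can be sketched in plain standard C++. This is not the AttrBuilder implementation: `SortedStringAttrs` and its members are hypothetical names, and `std::string` stands in for the LLVM string types.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for string-attribute storage: key/value pairs kept
// sorted by key in one contiguous vector instead of a node-based std::map.
class SortedStringAttrs {
  using Entry = std::pair<std::string, std::string>;
  std::vector<Entry> Attrs; // invariant: sorted by Entry::first, unique keys

  static bool keyLess(const Entry &E, const std::string &Key) {
    return E.first < Key;
  }

public:
  // Insert a new attribute or overwrite an existing one, keeping the order.
  void set(const std::string &Key, const std::string &Value) {
    auto It = std::lower_bound(Attrs.begin(), Attrs.end(), Key, keyLess);
    if (It != Attrs.end() && It->first == Key)
      It->second = Value;
    else
      Attrs.emplace(It, Key, Value);
  }

  // Binary search for a key; returns nullptr when the key is absent.
  const std::string *get(const std::string &Key) const {
    auto It = std::lower_bound(Attrs.begin(), Attrs.end(), Key, keyLess);
    return (It != Attrs.end() && It->first == Key) ? &It->second : nullptr;
  }

  // Remove a key if present; returns whether anything was removed.
  bool erase(const std::string &Key) {
    auto It = std::lower_bound(Attrs.begin(), Attrs.end(), Key, keyLess);
    if (It == Attrs.end() || It->first != Key)
      return false;
    Attrs.erase(It);
    return true;
  }

  std::size_t size() const { return Attrs.size(); }
};
```

Insertion into the middle of a vector is linear, but for the handful of attributes typically involved that cost is small compared with one heap node per map entry.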

Differential Revision: https://reviews.llvm.org/D116599


# 40446663 09-Jan-2022 Kazu Hirata <[email protected]>

[clang] Use true/false instead of 1/0 (NFC)

Identified with modernize-use-bool-literals.


# 9290ccc3 04-Jan-2022 serge-sans-paille <[email protected]>

Introduce the AttributeMask class

This class is solely used as a lightweight and clean way to build a set of
attributes to be removed from an AttrBuilder. Previously AttrBuilder was used
both for building and removing, which introduced odd situations such as
creating an Attribute with a dummy value because the only relevant part was
the attribute kind.

Differential Revision: https://reviews.llvm.org/D116110


# ec2e26ea 10-Aug-2021 Sami Tolvanen <[email protected]>

[Clang] Add __builtin_function_start

Control-Flow Integrity (CFI) replaces references to address-taken
functions with pointers to the CFI jump table. This is a problem
for low-level code, such as operating system kernels, which may
need the address of an actual function body without the jump table
indirection.

This change adds the __builtin_function_start() builtin, which
accepts an argument that can be constant-evaluated to a function,
and returns the address of the function body.
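
A minimal usage sketch follows. The function names `handler` and `bodyAddress` are hypothetical. The builtin is only used when the compiler reports it through `__has_builtin`; otherwise the sketch falls back to the ordinary function address, which under CFI may point at a jump table entry rather than the body.

```cpp
// Hypothetical function whose body address we want.
static int handler(int x) { return x + 1; }

// Returns the address of handler's body when the builtin is available,
// and the ordinary function address otherwise.
static const void *bodyAddress() {
#if defined(__has_builtin)
#if __has_builtin(__builtin_function_start)
  return __builtin_function_start(handler);
#else
  return reinterpret_cast<const void *>(&handler);
#endif
#else
  return reinterpret_cast<const void *>(&handler);
#endif
}
```

Without CFI enabled, both branches are expected to yield the same address; the difference only matters when references to address-taken functions are redirected to the jump table.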

Link: https://github.com/ClangBuiltLinux/linux/issues/1353

Depends on D108478

Reviewed By: pcc, rjmccall

Differential Revision: https://reviews.llvm.org/D108479


# c3b624a1 15-Dec-2021 Nikita Popov <[email protected]>

[CodeGen] Avoid deprecated ConstantAddress constructor

Change all uses of the deprecated constructor to pass the
element type explicitly and drop it.

For cases where the correct element type was not immediately
obvious to me or would require a slightly larger change I'm
falling back to explicitly calling getPointerElementType() for now.


# 0a14674f 03-Dec-2021 Peter Collingbourne <[email protected]>

CodeGen: Strip exception specifications from function types in CFI type names.

With C++17 the exception specification has been made part of the
function type, and therefore part of mangled type names.

However, it's valid to convert function pointers with an exception
specification to function pointers with the same argument and return
types but without an exception specification, which means that e.g. a
function of type "void () noexcept" can be called through a pointer
of type "void ()". We must therefore consider the two types to be
compatible for CFI purposes.

We can do this by stripping the exception specification before mangling
the type name, which is what this patch does.
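
The language rule behind this can be checked with a short C++17 sketch. `StripNoexcept` is a hypothetical type-level helper that only handles plain non-member function types; it illustrates the idea of stripping the specification and is not the code in this patch.

```cpp
#include <type_traits>

using Plain = void();
using NoExcept = void() noexcept;

// Since C++17 the two are distinct function types...
static_assert(!std::is_same<Plain, NoExcept>::value,
              "noexcept is part of the function type");
// ...but a noexcept function pointer converts to a plain one, not vice versa.
static_assert(std::is_convertible<NoExcept *, Plain *>::value,
              "noexcept pointer converts to plain pointer");
static_assert(!std::is_convertible<Plain *, NoExcept *>::value,
              "plain pointer does not convert to noexcept pointer");

// Hypothetical helper: drop the exception specification from a type.
template <typename T> struct StripNoexcept { using type = T; };
template <typename R, typename... Args>
struct StripNoexcept<R(Args...) noexcept> { using type = R(Args...); };

static_assert(std::is_same<StripNoexcept<NoExcept>::type, Plain>::value,
              "stripped type matches the plain type");

static int calls = 0;
void bump() noexcept { ++calls; }

// Calls a noexcept function through a pointer without the specification.
int callThroughPlainPointer() {
  Plain *fp = bump;
  fp();
  return calls;
}
```

After stripping, both function types map to the same type, which is the property the CFI type name needs so that the indirect call in `callThroughPlainPointer` is considered compatible.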

Differential Revision: https://reviews.llvm.org/D115015


# e3b2f022 01-Dec-2021 Ties Stuij <[email protected]>

[clang][ARM] PACBTI-M frontend support

Handle branch protection option on the commandline as well as a function
attribute. One patch for both mechanisms, as they use the same underlying
parsing mechanism.

These are recorded in a set of LLVM IR module-level attributes like we do for
AArch64 PAC/BTI (see https://reviews.llvm.org/D85649):

- command-line options are "translated" to module-level LLVM IR
attributes (metadata).

- functions have PAC/BTI specific attributes iff the
__attribute__((target("branch-protection=..."))) was used in the function
declaration.

- command-line option -mbranch-protection to armclang targeting Arm,
following this grammar:

branch-protection ::= "-mbranch-protection=" <protection>
protection ::= "none" | "standard" | "bti" [ "+" <pac-ret-clause> ]
| <pac-ret-clause> [ "+" "bti"]
pac-ret-clause ::= "pac-ret" [ "+" <pac-ret-option> ]
pac-ret-option ::= "leaf" ["+" "b-key"] | "b-key" ["+" "leaf"]

b-key is simply a placeholder to make it consistent with AArch64's
version. In Arm, however, it triggers a warning informing that b-key is
unsupported and a-key will be selected instead.

- Handle __attribute__((target("branch-protection=..."))) for AArch32 with the
same grammar as the commandline options.
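
The `<protection>` grammar given above can be checked with a small hand-written validator. This is only a sketch of the grammar as written: `splitOnPlus` and `isValidProtection` are hypothetical names, the code is not Clang's parser, and it does not model the b-key warning.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Split on '+'. Empty pieces are kept so that inputs such as "bti+" are
// rejected by the validator below.
static std::vector<std::string> splitOnPlus(const std::string &text) {
  std::vector<std::string> parts;
  std::string::size_type start = 0;
  while (true) {
    std::string::size_type plus = text.find('+', start);
    if (plus == std::string::npos) {
      parts.push_back(text.substr(start));
      return parts;
    }
    parts.push_back(text.substr(start, plus - start));
    start = plus + 1;
  }
}

// Returns true if text matches the <protection> rule of the grammar.
bool isValidProtection(const std::string &text) {
  const std::vector<std::string> t = splitOnPlus(text);
  const std::size_t n = t.size();
  std::size_t i = 0;

  if (n == 1 && (t[0] == "none" || t[0] == "standard"))
    return true;

  // "bti" [ "+" <pac-ret-clause> ]
  const bool leadingBti = (t[0] == "bti");
  if (leadingBti) {
    i = 1;
    if (i == n)
      return true;
  }

  // <pac-ret-clause> ::= "pac-ret" [ "+" <pac-ret-option> ]
  if (t[i] != "pac-ret")
    return false;
  ++i;

  // <pac-ret-option>: "leaf" and "b-key" in either order, each at most once.
  bool sawLeaf = false, sawBKey = false;
  while (i < n) {
    if (t[i] == "leaf" && !sawLeaf)
      sawLeaf = true;
    else if (t[i] == "b-key" && !sawBKey)
      sawBKey = true;
    else
      break;
    ++i;
  }

  // <pac-ret-clause> [ "+" "bti" ], only when "bti" did not come first.
  if (!leadingBti && i < n && t[i] == "bti")
    ++i;

  return i == n;
}
```

For example, `pac-ret+leaf+bti` and `bti+pac-ret+b-key+leaf` are accepted, while `bti+pac-ret+bti` and `none+bti` are rejected.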

This patch is part of a series that adds support for the PACBTI-M extension of
the Armv8.1-M architecture, as detailed here:

https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension

The PACBTI-M specification can be found in the Armv8-M Architecture Reference
Manual:

https://developer.arm.com/documentation/ddi0553/latest

The following people contributed to this patch:

- Momchil Velikov
- Victor Campos
- Ties Stuij

Reviewed By: vhscampos

Differential Revision: https://reviews.llvm.org/D112421


# fc53eb69 29-Nov-2021 Erich Keane <[email protected]>

Reapply 'Implement target_clones multiversioning'

See discussion in D51650, this change was a little aggressive in an
error while doing a 'while we were here', so this removes that error
condition, as it is apparently useful.

This reverts commit bb4934601d731465e01e2e22c80ce2dbe687d73f.

