|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
af58684f |
| 14-Jul-2022 |
Ellis Hoag <[email protected]> |
[InstrProf] Add options to profile function groups
Add two options, `-fprofile-function-groups=N` and `-fprofile-selected-function-group=i` used to partition functions into `N` groups and only instr
[InstrProf] Add options to profile function groups
Add two options, `-fprofile-function-groups=N` and `-fprofile-selected-function-group=i` used to partition functions into `N` groups and only instrument the functions in group `i`. Similar options were added to xray in https://reviews.llvm.org/D87953 and the goal is the same; to reduce instrumented size overhead by spreading the overhead across multiple builds. Raw profiles from different groups can be added like normal using the `llvm-profdata merge` command.
Reviewed By: ianlevesque
Differential Revision: https://reviews.llvm.org/D129594
show more ...
|
| #
2240d72f |
| 12-Jul-2022 |
Nick Desaulniers <[email protected]> |
[X86] initial -mfunction-return=thunk-extern support
Adds support for: * `-mfunction-return=<value>` command line flag, and * `__attribute__((function_return("<value>")))` function attribute
Where
[X86] initial -mfunction-return=thunk-extern support
Adds support for: * `-mfunction-return=<value>` command line flag, and * `__attribute__((function_return("<value>")))` function attribute
Where the supported <value>s are: * keep (disable) * thunk-extern (enable)
thunk-extern enables clang to change ret instructions into jmps to an external symbol named __x86_return_thunk, implemented as a new MachineFunctionPass named "x86-return-thunks", keyed off the new IR attribute fn_ret_thunk_extern.
The symbol __x86_return_thunk is expected to be provided by the runtime the compiled code is linked against and is not defined by the compiler. Enabling this option alone doesn't provide mitigations without corresponding definitions of __x86_return_thunk!
This new MachineFunctionPass is very similar to "x86-lvi-ret".
The <value>s "thunk" and "thunk-inline" are currently unsupported. It's not clear yet that they are necessary: whether the thunk pattern they would emit is beneficial or used anywhere.
Should the <value>s "thunk" and "thunk-inline" become necessary, x86-return-thunks could probably be merged into x86-retpoline-thunks which has pre-existing machinery for emitting thunks (which could be used to implement the <value> "thunk").
Has been found to build+boot with corresponding Linux kernel patches. This helps the Linux kernel mitigate RETBLEED. * CVE-2022-23816 * CVE-2022-28693 * CVE-2022-29901
See also: * "RETBLEED: Arbitrary Speculative Code Execution with Return Instructions." * AMD SECURITY NOTICE AMD-SN-1037: AMD CPU Branch Type Confusion * TECHNICAL GUIDANCE FOR MITIGATING BRANCH TYPE CONFUSION REVISION 1.0 2022-07-12 * Return Stack Buffer Underflow / Return Stack Buffer Underflow / CVE-2022-29901, CVE-2022-28693 / INTEL-SA-00702
SystemZ may eventually want to support "thunk-extern" and "thunk"; both options are used by the Linux kernel's CONFIG_EXPOLINE.
This functionality has been available in GCC since the 8.1 release, and was backported to the 7.3 release.
Many thanks for folks that provided discrete review off list due to the embargoed nature of this hardware vulnerability. Many Bothans died to bring us this information.
Link: https://www.youtube.com/watch?v=IF6HbCKQHK8 Link: https://github.com/llvm/llvm-project/issues/54404 Link: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-01/msg01197.html Link: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/return-stack-buffer-underflow.html Link: https://arstechnica.com/information-technology/2022/07/intel-and-amd-cpus-vulnerable-to-a-new-speculative-execution-attack/?comments=1 Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce114c866860aa9eae3f50974efc68241186ba60 Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00702.html Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00707.html
Reviewed By: aaron.ballman, craig.topper
Differential Revision: https://reviews.llvm.org/D129572
show more ...
|
|
Revision tags: llvmorg-14.0.6 |
|
| #
8ad4c6e4 |
| 17-Jun-2022 |
Yaxun (Sam) Liu <[email protected]> |
[HIP] add -fhip-kernel-arg-name
Add option -fhip-kernel-arg-name to emit kernel argument name metadata, which is needed for certain HIP applications.
Reviewed by: Artem Belevich, Fangrui Song, Bria
[HIP] add -fhip-kernel-arg-name
Add option -fhip-kernel-arg-name to emit kernel argument name metadata, which is needed for certain HIP applications.
Reviewed by: Artem Belevich, Fangrui Song, Brian Sumner
Differential Revision: https://reviews.llvm.org/D128022
show more ...
|
| #
d4bcb45d |
| 12-Jun-2022 |
Jez Ng <[email protected]> |
[MC][re-land] Omit DWARF unwind info if compact unwind is present where eligible
This reverts commit d941d597837d9e1405086f008c9bd6a71e7263c9.
Differential Revision: https://reviews.llvm.org/D122258
|
| #
d941d597 |
| 12-Jun-2022 |
Jez Ng <[email protected]> |
Revert "[MC] Omit DWARF unwind info if compact unwind is present where eligible"
This reverts commit ef501bf85d8c869248e51371f0e74bcec0e7b229.
|
| #
ef501bf8 |
| 12-Jun-2022 |
Jez Ng <[email protected]> |
[MC] Omit DWARF unwind info if compact unwind is present where eligible
Previously, omitting unnecessary DWARF unwinds was only done in two cases: * For Darwin + aarch64, if no DWARF unwind info is
[MC] Omit DWARF unwind info if compact unwind is present where eligible
Previously, omitting unnecessary DWARF unwinds was only done in two cases: * For Darwin + aarch64, if no DWARF unwind info is needed for all the functions in a TU, then the `__eh_frame` section would be omitted entirely. If any one function needed DWARF unwind, then MC would emit DWARF unwind entries for all the functions in the TU. * For watchOS, MC would omit DWARF unwind on a per-function basis, as long as compact unwind was available for that function.
This diff makes it so that we omit DWARF unwind on a per-function basis for Darwin + aarch64 as well. In addition, we introduce the flag `--emit-dwarf-unwind=` which can toggle between `always`, `no-compact-unwind` (only emit DWARF when CU cannot be emitted for a given function), and the target platform `default`. `no-compact-unwind` is particularly useful for newer x86_64 platforms: we don't want to omit DWARF unwind for x86_64 in general due to possible backwards compat issues, but we should make it possible for people to opt into this behavior if they are only targeting newer platforms.
**Motivation:** I'm working on adding support for `__eh_frame` to LLD, but I'm concerned that we would suffer a perf hit. Processing compact unwind is already expensive, and that's a simpler format than EH frames. Given that MC currently produces one EH frame entry for every compact unwind entry, I don't think processing them will be cheap. I tried to do something clever on LLD's end to drop the unnecessary EH frames at parse time, but this made the code significantly more complex. So I'm looking at fixing this at the MC level instead.
**Addendum:** It turns out that there was a latent bug in the X86 backend when `OmitDwarfIfHaveCompactUnwind` is naively enabled, which is not too surprising given that this combination has not been heretofore used.
For functions that have unwind info that cannot be encoded with CU, MC would end up dropping both the compact unwind entry (OK; existing behavior) as well as the DWARF entries (not OK). This diff fixes things so that we emit the DWARF entry, as well as a CU entry with encoding `UNWIND_X86_MODE_DWARF` -- this basically tells the unwinder to look for the DWARF entry. I'm not 100% sure the `UNWIND_X86_MODE_DWARF` CU entry is necessary, this was the simplest fix. ld64 seems to be able to handle both the absence and presence of this CU entry. Ultimately ld64 (and LLD) will synthesize `UNWIND_X86_MODE_DWARF` if it is absent, so there is no impact to the final binary size.
Reviewed By: davide, lhames
Differential Revision: https://reviews.llvm.org/D122258
show more ...
|
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
bac6cd5b |
| 01-Apr-2022 |
Paul Kirth <[email protected]> |
[misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata.
New checks
[misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered.
2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
show more ...
|
| #
2978d026 |
| 11-Apr-2022 |
Nikita Popov <[email protected]> |
[Clang] Remove support for legacy pass manager
This removes the -flegacy-pass-manager and -fno-experimental-new-pass-manager options, and the corresponding support code in BackendUtil. The -fno-lega
[Clang] Remove support for legacy pass manager
This removes the -flegacy-pass-manager and -fno-experimental-new-pass-manager options, and the corresponding support code in BackendUtil. The -fno-legacy-pass-manager and -fexperimental-new-pass-manager options are retained as no-ops.
Differential Revision: https://reviews.llvm.org/D123609
show more ...
|
| #
d69e9f9d |
| 04-Apr-2022 |
Nikita Popov <[email protected]> |
[OpaquePtrs][Clang] Add -opaque-pointers/-no-opaque-pointers cc1 options
This adds cc1 options for enabling and disabling opaque pointers on the clang side. This is not super useful now (because -ml
[OpaquePtrs][Clang] Add -opaque-pointers/-no-opaque-pointers cc1 options
This adds cc1 options for enabling and disabling opaque pointers on the clang side. This is not super useful now (because -mllvm -opaque-pointers and -Xclang -opaque-pointers have the same visible effect) but will be important once opaque pointers are enabled by default in clang. In that case, it will only be possible to disable them using the cc1 -no-opaque-pointers option.
Differential Revision: https://reviews.llvm.org/D123034
show more ...
|
| #
fc7573f2 |
| 31-Mar-2022 |
Jorge Gorbe Moya <[email protected]> |
Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit 46774df307159444d65083c2fd82f8574f0ab1d9.
|
| #
46774df3 |
| 29-Mar-2022 |
Paul Kirth <[email protected]> |
[misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata.
New checks
[misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered.
2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
show more ...
|
| #
90cb325a |
| 29-Mar-2022 |
Paul Kirth <[email protected]> |
Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit 2add3fbd976d7b80a3a7fc14ef0deb9b1ca6beee.
|
| #
2add3fbd |
| 19-Mar-2022 |
Paul Kirth <[email protected]> |
[misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata.
New checks
[misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered.
2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D115907
show more ...
|
| #
964398cc |
| 18-Mar-2022 |
Paul Kirth <[email protected]> |
Revert "Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics"""
This reverts commit 6cf560d69a222bff4af4e1d092437fd77f0f981c.
|
| #
6cf560d6 |
| 18-Mar-2022 |
Paul Kirth <[email protected]> |
Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics""
I mistakenly reverted my commit, so I'm relanding it.
This reverts commit 10866a1df4a82cdc54187330c509a2d46235455d.
|
| #
10866a1d |
| 17-Mar-2022 |
Paul Kirth <[email protected]> |
Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit e7749d4713a5ec886011ceb0fc821c6723061724.
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
e7749d47 |
| 09-Mar-2022 |
Paul Kirth <[email protected]> |
[misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata.
New checks
[misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata.
New checks rely on 2 invariants:
1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered.
2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles.
These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes.
Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line.
Differential Revision: https://reviews.llvm.org/D115907
show more ...
|
| #
bc1f8b6b |
| 07-Mar-2022 |
Yi Kong <[email protected]> |
Fix the comment for EnableNoundefAttrs, NFC
|
|
Revision tags: llvmorg-14.0.0-rc2 |
|
| #
b529744c |
| 18-Feb-2022 |
hyeongyukim <[email protected]> |
[Clang] Rename `disable-noundef-analysis` flag to `-[no-]enable-noundef-analysis`
This flag was previously renamed `enable_noundef_analysis` to `disable-noundef-analysis,` which is not a conventiona
[Clang] Rename `disable-noundef-analysis` flag to `-[no-]enable-noundef-analysis`
This flag was previously renamed `enable_noundef_analysis` to `disable-noundef-analysis,` which is not a conventional name. (Driver and CC1's boolean options are using [no-] prefix) As discussed at https://reviews.llvm.org/D105169, this patch reverts its name to `[no-]enable_noundef_analysis` and enables noundef-analysis as default.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D119998
show more ...
|
| #
f9270214 |
| 10-Feb-2022 |
Yuanfang Chen <[email protected]> |
Reland "[clang-cl] Support the /JMC flag"
This relands commit b380a31de084a540cfa38b72e609b25ea0569bb7.
Restrict the tests to Windows only since the flag symbol hash depends on system-dependent pat
Reland "[clang-cl] Support the /JMC flag"
This relands commit b380a31de084a540cfa38b72e609b25ea0569bb7.
Restrict the tests to Windows only since the flag symbol hash depends on system-dependent path normalization.
show more ...
|
| #
b380a31d |
| 10-Feb-2022 |
Yuanfang Chen <[email protected]> |
Revert "[clang-cl] Support the /JMC flag"
This reverts commit bd3a1de683f80d174ea9c97000db3ec3276bc022.
Break bots: https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x6
Revert "[clang-cl] Support the /JMC flag"
This reverts commit bd3a1de683f80d174ea9c97000db3ec3276bc022.
Break bots: https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x64/b8822587673277278177/overview
show more ...
|
| #
bd3a1de6 |
| 10-Feb-2022 |
Yuanfang Chen <[email protected]> |
[clang-cl] Support the /JMC flag
The introduction and some examples are on this page: https://devblogs.microsoft.com/cppblog/announcing-jmc-stepping-in-visual-studio/
The `/JMC` flag enables these
[clang-cl] Support the /JMC flag
The introduction and some examples are on this page: https://devblogs.microsoft.com/cppblog/announcing-jmc-stepping-in-visual-studio/
The `/JMC` flag enables these instrumentations: - Insert at the beginning of every function immediately after the prologue with a call to `void __fastcall __CheckForDebuggerJustMyCode(unsigned char *JMC_flag)`. The argument for `__CheckForDebuggerJustMyCode` is the address of a boolean global variable (the global variable is initialized to 1) with the name convention `__<hash>_<filename>`. All such global variables are placed in the `.msvcjmc` section. - The `<hash>` part of `__<hash>_<filename>` has a one-to-one mapping with a directory path. MSVC uses some unknown hashing function. Here I used DJB. - Add a dummy/empty COMDAT function `__JustMyCode_Default`. - Add `/alternatename:__CheckForDebuggerJustMyCode=__JustMyCode_Default` link option via ".drectve" section. This is to prevent failure in case `__CheckForDebuggerJustMyCode` is not provided during linking.
Implementation: All the instrumentations are implemented in an IR codegen pass. The pass is placed immediately before CodeGenPrepare pass. This is to not interfere with mid-end optimizations and make the instrumentation target-independent (I'm still working on an ELF port in a separate patch).
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D118428
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc1 |
|
| #
deaf22bc |
| 09-Feb-2022 |
Bill Wendling <[email protected]> |
[X86] Implement -fzero-call-used-regs option
The "-fzero-call-used-regs" option tells the compiler to zero out certain registers before the function returns. It's also available as a function attrib
[X86] Implement -fzero-call-used-regs option
The "-fzero-call-used-regs" option tells the compiler to zero out certain registers before the function returns. It's also available as a function attribute: zero_call_used_regs.
The two upper categories are:
- "used": Zero out used registers. - "all": Zero out all registers, whether used or not.
The individual options are:
- "skip": Don't zero out any registers. This is the default. - "used": Zero out all used registers. - "used-arg": Zero out used registers that are used for arguments. - "used-gpr": Zero out used registers that are GPRs. - "used-gpr-arg": Zero out used GPRs that are used as arguments. - "all": Zero out all registers. - "all-arg": Zero out all registers used for arguments. - "all-gpr": Zero out all GPRs. - "all-gpr-arg": Zero out all GPRs used for arguments.
This is used to help mitigate Return-Oriented Programming exploits.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D110869
show more ...
|
|
Revision tags: llvmorg-15-init |
|
| #
82af9502 |
| 21-Jan-2022 |
Joao Moreira <[email protected]> |
[X86] Enable ibt-seal optimization when LTO is used in Kernel
Intel's CET/IBT requires every indirect branch target to be an ENDBR instruction. Because of that, the compiler needs to correctly emit
[X86] Enable ibt-seal optimization when LTO is used in Kernel
Intel's CET/IBT requires every indirect branch target to be an ENDBR instruction. Because of that, the compiler needs to correctly emit these instruction on function's prologues. Because this is a security feature, it is desirable that only actual indirect-branch-targeted functions are emitted with ENDBRs. While it is possible to identify address-taken functions through LTO, minimizing these ENDBR instructions remains a hard task for user-space binaries because exported functions may end being reachable through PLT entries, that will use an indirect branch for such. Because this cannot be determined during compilation-time, the compiler currently emits ENDBRs to every non-local-linkage function.
Despite the challenge presented for user-space, the kernel landscape is different as no PLTs are used. With the intent of providing the most fit ENDBR emission for the kernel, kernel developers proposed an optimization named "ibt-seal" which replaces the ENDBRs for NOPs directly in the binary. The discussion of this feature can be seen in [1].
This diff brings the enablement of the flag -mibt-seal, which in combination with LTO enforces a different policy for ENDBR placement in when the code-model is set to "kernel". In this scenario, the compiler will only emit ENDBRs to address taken functions, ignoring non-address taken functions that are don't have local linkage.
A comparison between an LTO-compiled kernel binaries without and with the -mibt-seal feature enabled shows that when -mibt-seal was used, the number of ENDBRs in the vmlinux.o binary patched by objtool decreased from 44383 to 33192, and that the number of superfluous ENDBR instructions nopped-out decreased from 11730 to 540.
The 540 missed superfluous ENDBRs need to be investigated further, but hypotheses are: assembly code not being taken care of by the compiler, kernel exported symbols mechanisms creating bogus address taken situations or even these being removed due to other binary optimizations like kernel's static_calls. For now, I assume that the large drop in the number of ENDBR instructions already justifies the feature being merged.
[1] - https://lkml.org/lkml/2021/11/22/591
Reviewed By: xiangzhangllvm
Differential Revision: https://reviews.llvm.org/D116070
show more ...
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3 |
|
| #
5af2433e |
| 20-Jan-2022 |
Alexandre Ganea <[email protected]> |
[clang-cl] Support the /HOTPATCH flag
This patch adds support for the MSVC /HOTPATCH flag: https://docs.microsoft.com/sv-se/cpp/build/reference/hotpatch-create-hotpatchable-image?view=msvc-170&viewF
[clang-cl] Support the /HOTPATCH flag
This patch adds support for the MSVC /HOTPATCH flag: https://docs.microsoft.com/sv-se/cpp/build/reference/hotpatch-create-hotpatchable-image?view=msvc-170&viewFallbackFrom=vs-2019
The flag is translated to a new -fms-hotpatch flag, which in turn adds a 'patchable-function' attribute for each function in the TU. This is then picked up by the PatchableFunction pass which would generate a TargetOpcode::PATCHABLE_OP of minsize = 2 (which means the target instruction must resolve to at least two bytes). TargetOpcode::PATCHABLE_OP is only implemented for x86/x64. When targetting ARM/ARM64, /HOTPATCH isn't required (instructions are always 2/4 bytes and suitable for hotpatching).
Additionally, when using /Z7, we generate a 'hot patchable' flag in the CodeView debug stream, in the S_COMPILE3 record. This flag is then picked up by LLD (or link.exe) and is used in conjunction with the linker /FUNCTIONPADMIN flag to generate extra space before each function, to accommodate for live patching long jumps. Please see: https://github.com/llvm/llvm-project/blob/d703b922961e0d02a5effdd4bfbb23ad50a3cc9f/lld/COFF/Writer.cpp#L1298
The outcome is that we can finally use Live++ or Recode along with clang-cl.
NOTE: It seems that MSVC cl.exe always enables /HOTPATCH on x64 by default, although if we did the same I thought we might generate sub-optimal code (if this flag was active by default). Additionally, MSVC always generates a .debug$S section and a S_COMPILE3 record, which Clang doesn't do without /Z7. Therefore, the following MSVC command-line "cl /c file.cpp" would have to be written with Clang such as "clang-cl /c file.cpp /HOTPATCH /Z7" in order to obtain the same result.
Depends on D43002, D80833 and D81301 for the full feature.
Differential Revision: https://reviews.llvm.org/D116511
show more ...
|