Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |

# 190518da | 15-Jul-2022 | Phoebe Wang <[email protected]>
[X86] Use generic tuning for "x86-64" if "tune-cpu" is not specified
This is an alternative to D129154. See discussions on https://discourse.llvm.org/t/fast-scalar-fsqrt-tuning-in-x86/63605
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D129647
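A minimal sketch of the idea, assuming the usual subtarget plumbing (the helper name `selectTuneCPU` and its placement are illustrative, not taken from the patch):

```cpp
#include "llvm/ADT/StringRef.h"
#include "llvm/IR/Function.h"

using namespace llvm;

// Illustrative only: pick the tuning CPU for a function. With this change,
// a plain "x86-64" target CPU without a "tune-cpu" attribute falls back to
// generic tuning instead of carrying its own tuning flags.
static StringRef selectTuneCPU(const Function &F, StringRef TargetCPU) {
  Attribute TuneAttr = F.getFnAttribute("tune-cpu");
  if (TuneAttr.isValid())
    return TuneAttr.getValueAsString();
  return TargetCPU == "x86-64" ? StringRef("generic") : TargetCPU;
}
```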

# 2240d72f | 12-Jul-2022 | Nick Desaulniers <[email protected]>
[X86] initial -mfunction-return=thunk-extern support
Adds support for:
* `-mfunction-return=<value>` command line flag, and
* `__attribute__((function_return("<value>")))` function attribute
Where the supported <value>s are:
* keep (disable)
* thunk-extern (enable)
thunk-extern enables clang to change ret instructions into jmps to an external symbol named __x86_return_thunk, implemented as a new MachineFunctionPass named "x86-return-thunks", keyed off the new IR attribute fn_ret_thunk_extern.
The symbol __x86_return_thunk is expected to be provided by the runtime the compiled code is linked against and is not defined by the compiler. Enabling this option alone doesn't provide mitigations without corresponding definitions of __x86_return_thunk!
This new MachineFunctionPass is very similar to "x86-lvi-ret".
The <value>s "thunk" and "thunk-inline" are currently unsupported. It's not clear yet that they are necessary: whether the thunk pattern they would emit is beneficial or used anywhere.
Should the <value>s "thunk" and "thunk-inline" become necessary, x86-return-thunks could probably be merged into x86-retpoline-thunks which has pre-existing machinery for emitting thunks (which could be used to implement the <value> "thunk").
Has been found to build+boot with corresponding Linux kernel patches. This helps the Linux kernel mitigate RETBLEED.
* CVE-2022-23816
* CVE-2022-28693
* CVE-2022-29901
See also:
* "RETBLEED: Arbitrary Speculative Code Execution with Return Instructions."
* AMD SECURITY NOTICE AMD-SN-1037: AMD CPU Branch Type Confusion
* TECHNICAL GUIDANCE FOR MITIGATING BRANCH TYPE CONFUSION REVISION 1.0 2022-07-12
* Return Stack Buffer Underflow / Return Stack Buffer Underflow / CVE-2022-29901, CVE-2022-28693 / INTEL-SA-00702
SystemZ may eventually want to support "thunk-extern" and "thunk"; both options are used by the Linux kernel's CONFIG_EXPOLINE.
This functionality has been available in GCC since the 8.1 release, and was backported to the 7.3 release.
Many thanks to the folks who provided discreet review off-list due to the embargoed nature of this hardware vulnerability. Many Bothans died to bring us this information.
Link: https://www.youtube.com/watch?v=IF6HbCKQHK8
Link: https://github.com/llvm/llvm-project/issues/54404
Link: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-01/msg01197.html
Link: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/return-stack-buffer-underflow.html
Link: https://arstechnica.com/information-technology/2022/07/intel-and-amd-cpus-vulnerable-to-a-new-speculative-execution-attack/?comments=1
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce114c866860aa9eae3f50974efc68241186ba60
Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00702.html
Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00707.html
Reviewed By: aaron.ballman, craig.topper
Differential Revision: https://reviews.llvm.org/D129572
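For illustration, a small usage sketch built from the flag and attribute named above (the function bodies are hypothetical):

```cpp
// Build with: clang -mfunction-return=thunk-extern ...
// Every `ret` becomes `jmp __x86_return_thunk`; the thunk itself must be
// provided by the runtime (e.g. the kernel), not by the compiler.

// Hypothetical per-function opt-out back to a normal return:
__attribute__((function_return("keep")))
void early_entry_code(void) {}

void ordinary_function(void) {}  // return lowered via __x86_return_thunk
```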

# 5cb09798 | 25-Jun-2022 | Luo, Yuanke <[email protected]>
[X86][AMX] Split greedy RA for tile register
When we fill the shape into the tile configure memory, the shape is taken from the AMX pseudo instruction. However, the register holding the shape may be split or spilled by greedy RA, which can cause the shape to be written to the config memory after ldtilecfg has already executed, leaving the shape configuration wrong. This patch splits tile register allocation out of greedy register allocation, so that after the tile registers are allocated the shape registers are still virtual registers. The shape registers may only be redefined or multi-defined by the PHI elimination and two-address passes, which does not affect the tile register configuration.
Differential Revision: https://reviews.llvm.org/D128584
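For background, the 64-byte block that ldtilecfg consumes looks roughly like the struct below (taken from Intel's AMX documentation, not from this patch); the "shape" being filled is the rows/colsb pair for each tile register:

```cpp
#include <cstdint>

// Background sketch of the ldtilecfg memory operand (per the AMX
// architecture documentation). The pass writes each tile's shape
// (rows and bytes-per-row) into this block before ldtilecfg runs.
struct AMXTileConfig {
  uint8_t  palette_id;     // currently must be 1
  uint8_t  start_row;
  uint8_t  reserved[14];
  uint16_t colsb[16];      // bytes per row for tmm0..tmm15
  uint8_t  rows[16];       // number of rows for tmm0..tmm15
};
static_assert(sizeof(AMXTileConfig) == 64, "ldtilecfg operand is 64 bytes");
```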

Revision tags: llvmorg-14.0.6

# e0e687a6 | 20-Jun-2022 | Kazu Hirata <[email protected]>
[llvm] Don't use Optional::hasValue (NFC)

# 654a835c | 15-Jun-2022 | Paul Robinson <[email protected]>
[PS5] Trap after noreturn calls, with special case for stack-check-fail
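A rough illustration of the intent, based only on the commit title (the emitted instruction and behaviour are assumptions here, not details from the patch):

```cpp
// Assumed behaviour sketch: if a function marked noreturn ever does return,
// the inserted trap stops execution right in the caller instead of falling
// through into whatever code happens to follow the call.
[[noreturn]] void fatal_error(const char *msg);

void run() {
  fatal_error("unreachable");
  // a trap (e.g. ud2) is assumed to be emitted here on the PS5 target
}
```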

Revision tags: llvmorg-14.0.5, llvmorg-14.0.4

# 496156ac | 03-May-2022 | Luo, Yuanke <[email protected]>
[X86][AMX] Multiple configure for AMX register.
The previous solution depended on the variable name to record the shape information. However, that is not reliable, because in release builds the compiler does not set variable names. It can be worked around with the additional option `-fno-discard-value-names`, but that is not acceptable for users. This patch instead preconfigures the tile registers with machine instructions, following the same approach as the single configure. In the future we can fall back to multiple configure when single configure fails due to the shape dependency issue.
The algorithm to configure the tile registers is simple in this patch and may be improved later. It configures tile registers per basic block: the compiler spills a tile register if it is live out of the basic block, so after the configuration there should be no spill across a tile configure during register allocation. Just like fast register allocation, the algorithm walks the instructions in reverse order; when the shape dependency is not met, it inserts ldtilecfg after the last instruction that defines the shape. In post-configuration the compiler also walks the basic block to collect the physical tile register numbers and generates instructions to fill the stack slots with the corresponding shape information.
TODO: There is follow-up work in D125602. The risk is that modifying fast RA may cause regressions, since fast RA is used for other targets; we may create an independent RA for tile registers.
Differential Revision: https://reviews.llvm.org/D125075

# 58951792 | 05-May-2022 | Craig Topper <[email protected]>
[X86] Call initializeX86PreTileConfigPass from LLVMInitializeX86Target.
Without this, the pass doesn't show up in print-before/after-all.
Differential Revision: https://reviews.llvm.org/D124973
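The registration pattern looks roughly like this (a simplified sketch of the target's initializer hook, not an exact copy of the patch):

```cpp
#include "llvm/PassRegistry.h"

namespace llvm {
void initializeX86PreTileConfigPass(PassRegistry &);
} // namespace llvm
using namespace llvm;

// Sketch: pass initializers are called from the target's LLVMInitialize hook
// so the pass is registered with the PassRegistry and therefore visible to
// -print-before-all / -print-after-all.
extern "C" void LLVMInitializeX86Target() {
  PassRegistry &PR = *PassRegistry::getPassRegistry();
  initializeX86PreTileConfigPass(PR);  // the call this commit adds
  // ... target registration and other X86 pass initializers ...
}
```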

Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3

# ed98c1b3 | 09-Mar-2022 | serge-sans-paille <[email protected]>
Cleanup includes: DebugInfo & CodeGen
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121332

Revision tags: llvmorg-14.0.0-rc2

# c4b1a63a | 25-Feb-2022 | Jameson Nash <[email protected]>
mark getTargetTransformInfo and getTargetIRAnalysis as const
Seems like this can be const, since Passes shouldn't modify it.
Reviewed By: wsmoses
Differential Revision: https://reviews.llvm.org/D120518
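In effect the declarations become const-qualified, along these lines (a simplified sketch, not the full TargetMachine declaration):

```cpp
namespace llvm {
class Function;
class TargetTransformInfo;
class TargetIRAnalysis;

// Simplified sketch of the signature change on TargetMachine.
class TargetMachine {
public:
  // was: TargetTransformInfo getTargetTransformInfo(const Function &F);
  TargetTransformInfo getTargetTransformInfo(const Function &F) const;

  // was: TargetIRAnalysis getTargetIRAnalysis();
  TargetIRAnalysis getTargetIRAnalysis() const;
};
} // namespace llvm
```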

# f9270214 | 10-Feb-2022 | Yuanfang Chen <[email protected]>
Reland "[clang-cl] Support the /JMC flag"
This relands commit b380a31de084a540cfa38b72e609b25ea0569bb7.
Restrict the tests to Windows only since the flag symbol hash depends on system-dependent pat
Reland "[clang-cl] Support the /JMC flag"
This relands commit b380a31de084a540cfa38b72e609b25ea0569bb7.
Restrict the tests to Windows only since the flag symbol hash depends on system-dependent path normalization.

# b380a31d | 10-Feb-2022 | Yuanfang Chen <[email protected]>
Revert "[clang-cl] Support the /JMC flag"
This reverts commit bd3a1de683f80d174ea9c97000db3ec3276bc022.
Break bots: https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x6
Revert "[clang-cl] Support the /JMC flag"
This reverts commit bd3a1de683f80d174ea9c97000db3ec3276bc022.
Breaks bots: https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x64/b8822587673277278177/overview

# bd3a1de6 | 10-Feb-2022 | Yuanfang Chen <[email protected]>
[clang-cl] Support the /JMC flag
The introduction and some examples are on this page: https://devblogs.microsoft.com/cppblog/announcing-jmc-stepping-in-visual-studio/
The `/JMC` flag enables these instrumentations:
- Insert, at the beginning of every function immediately after the prologue, a call to `void __fastcall __CheckForDebuggerJustMyCode(unsigned char *JMC_flag)`. The argument for `__CheckForDebuggerJustMyCode` is the address of a boolean global variable (initialized to 1) named following the convention `__<hash>_<filename>`. All such global variables are placed in the `.msvcjmc` section.
- The `<hash>` part of `__<hash>_<filename>` has a one-to-one mapping with a directory path. MSVC uses some unknown hashing function; here DJB is used.
- Add a dummy/empty COMDAT function `__JustMyCode_Default`.
- Add the `/alternatename:__CheckForDebuggerJustMyCode=__JustMyCode_Default` link option via the ".drectve" section. This prevents a link failure in case `__CheckForDebuggerJustMyCode` is not provided at link time.
Implementation: all the instrumentation is implemented in an IR codegen pass. The pass is placed immediately before the CodeGenPrepare pass. This is to avoid interfering with mid-end optimizations and to keep the instrumentation target-independent (I'm still working on an ELF port in a separate patch).
Reviewed By: hans
Differential Revision: https://reviews.llvm.org/D118428
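Conceptually, an instrumented function looks like the C++ below (a hand-written approximation: the real pass rewrites LLVM IR, and the hash in the flag name is made up for the example):

```cpp
// Approximation of what the /JMC instrumentation does to each function.
extern "C" void __fastcall __CheckForDebuggerJustMyCode(unsigned char *JMC_flag);

// One flag global per file, named __<hash>_<filename>, initialized to 1;
// the pass places it in the .msvcjmc section ("D32E1F" is a made-up hash).
unsigned char __D32E1F_example_cpp = 1;

void example() {
  // Call inserted immediately after the prologue:
  __CheckForDebuggerJustMyCode(&__D32E1F_example_cpp);
  // ... original function body ...
}
```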

Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init

# 37d1d022 | 23-Jan-2022 | Phoebe Wang <[email protected]>
[X86][MS] Change the alignment of f80 to 16 bytes on Windows 32bits to match with ICC
MSVC currently doesn't support an 80-bit long double. ICC supports it when the option `/Qlong-double` is specified. Change the alignment of f80 to 16 bytes so that we are compatible with ICC's option.
Reviewed By: rnk, craig.topper
Differential Revision: https://reviews.llvm.org/D115942

Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3

# f63a805a | 15-Jan-2022 | Phoebe Wang <[email protected]>
Revert "[X86][MS] Change the alignment of f80 to 16 bytes on Windows 32bits to match with ICC"
This reverts commit 1bb0caf561688681be67cc91560348c9e43fcbf3.

Revision tags: llvmorg-13.0.1-rc2

# 1bb0caf5 | 17-Dec-2021 | Phoebe Wang <[email protected]>
[X86][MS] Change the alignment of f80 to 16 bytes on Windows 32bits to match with ICC
MSVC currently doesn't support an 80-bit long double. ICC supports it when the option `/Qlong-double` is specified. Change the alignment of f80 to 16 bytes so that we are compatible with ICC's option.
Reviewed By: rnk, craig.topper
Differential Revision: https://reviews.llvm.org/D115942

# ff3b085a | 14-Dec-2021 | Florian Hahn <[email protected]>
[X86] Use bundle for CALL_RVMARKER expansion.
This patch updates expandCALL_RVMARKER to wrap the call, marker and objc runtime call in an instruction bundle. This ensures later passes, like machine block placement, cannot break them up.
On AArch64, the instruction sequence is already wrapped in a bundle. Keeping the whole instruction sequence together is highly desirable for performance and outweighs potential other benefits from breaking the sequence up.
Reviewed By: ahatanak
Differential Revision: https://reviews.llvm.org/D115230
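To make the constraint concrete, here is a rough source-level picture of the protected sequence (a conceptual illustration only; the patch operates on MachineIR bundles, and the exact marker instruction is target-specific):

```cpp
// Conceptual illustration of the call + marker + ObjC runtime call sequence
// that the bundle keeps contiguous (not code from the patch).
extern "C" void *objc_retainAutoreleasedReturnValue(void *obj);

void *make_object();            // some call returning an autoreleased object

void *caller() {
  void *obj = make_object();    // the call
  // <marker instruction emitted here by the backend>
  return objc_retainAutoreleasedReturnValue(obj);  // the ObjC runtime call
}
```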

Revision tags: llvmorg-13.0.1-rc1

# 89b57061 | 08-Oct-2021 | Reid Kleckner <[email protected]>
Move TargetRegistry.(h|cpp) from Support to MC
This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually use the target, so we might as well move this out of Support.
This allows us to ensure that Support doesn't have includes from MC/*.
Differential Revision: https://reviews.llvm.org/D111454

Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4

# d9b511d8 | 22-Sep-2021 | Hongtao Yu <[email protected]>
[CSSPGO] Set PseudoProbeInserter as a default pass.
Currently PseudoProbeInserter is a pass conditioned on a target switch. It works well with a single clang invocation. It doesn't work so well when the backend is called separately (i.e., through the linker or llc), where the user always has to pass -pseudo-probe-for-profiling explicitly. I'm making the pass a default pass that requires no command line argument to trigger, but it will actually run depending on whether the CU comes with `llvm.pseudo_probe_desc` metadata.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D110209
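The gating condition amounts to checking the module for that metadata, roughly as below (a simplified sketch, not the code from the patch):

```cpp
#include "llvm/IR/Module.h"

// Simplified sketch: the inserter only does work when the compilation unit
// carries pseudo-probe descriptors.
static bool moduleHasPseudoProbes(const llvm::Module &M) {
  return M.getNamedMetadata("llvm.pseudo_probe_desc") != nullptr;
}
```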

# 4ceea774 | 17-Sep-2021 | Amara Emerson <[email protected]>
[X86] Rename the X86WinAllocaExpander pass and related symbols to "DynAlloca". NFC.
For x86 Darwin, we have a stack checking feature which re-uses some of this machinery around stack probing on Windows. Renaming this to be more appropriate for a generic feature.
Differential Revision: https://reviews.llvm.org/D109993

Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2

# 3787ee45 | 08-Jun-2021 | Nick Desaulniers <[email protected]>
reland [IR] make -stack-alignment= into a module attr
Relands commit 433c8d950cb3a1fa0977355ce0367e8c763a3f13 with fixes for MIPS.
Similar to D102742, specifying the stack alignment via CodegenOpts means that this flag gets dropped during LTO, unless the command line is re-specified as a plugin opt. Instead, encode this information as a module level attribute so that we don't have to expose this llvm internal flag when linking the Linux kernel with LTO.
Looks like external dependencies might need a fix:
* https://github.com/llvm-hs/llvm-hs/issues/345
* https://github.com/halide/Halide/issues/6079
Link: https://github.com/ClangBuiltLinux/linux/issues/1377
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D103048
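On the consuming side, a backend can now recover the value from a module flag instead of a codegen option; a minimal sketch (the flag name "override-stack-alignment" is an assumption for illustration, not quoted from the patch):

```cpp
#include "llvm/IR/Constants.h"
#include "llvm/IR/Module.h"

using namespace llvm;

// Minimal sketch: read the stack alignment from module metadata so the value
// survives LTO. The flag name below is an assumption, not from the patch.
static unsigned getStackAlignmentOverride(const Module &M) {
  if (auto *CI = mdconst::extract_or_null<ConstantInt>(
          M.getModuleFlag("override-stack-alignment")))
    return CI->getZExtValue();
  return 0;  // no override; use the target's default
}
```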

# a596b54d | 08-Jun-2021 | Nick Desaulniers <[email protected]>
Revert "[IR] make -stack-alignment= into a module attr"
This reverts commit 433c8d950cb3a1fa0977355ce0367e8c763a3f13.
Breaks the MIPS build.

# 433c8d95 | 08-Jun-2021 | Nick Desaulniers <[email protected]>
[IR] make -stack-alignment= into a module attr
Similar to D102742, specifying the stack alignment via CodegenOpts means that this flag gets dropped during LTO, unless the command line is re-specified as a plugin opt. Instead, encode this information as a module level attribute so that we don't have to expose this llvm internal flag when linking the Linux kernel with LTO.
Looks like external dependencies might need a fix:
* https://github.com/llvm-hs/llvm-hs/issues/345
* https://github.com/halide/Halide/issues/6079
Link: https://github.com/ClangBuiltLinux/linux/issues/1377
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D103048

# 75521bd9 | 07-Jun-2021 | Harald van Dijk <[email protected]>
[X32] Add Triple::isX32(), use it.
So far, support for x86_64-linux-gnux32 has been handled by explicit comparisons of Triple.getEnvironment() to GNUX32. This worked as long as x86_64-linux-gnux32 was the only X32 environment to worry about, but we now have x86_64-linux-muslx32 as well. To support this, this change adds an isX32() function and uses it. It replaces all checks for GNUX32 or MuslX32 by isX32(), except for the following:
- Triple::isGNUEnvironment() and Triple::isMusl() are supposed to treat GNUX32 and MuslX32 differently.
- computeTargetTriple() needs to be able to transform triples to add or remove X32 from the environment and needs to map GNU to GNUX32, and Musl to MuslX32.
- getMultiarchTriple() completely lacks any Musl support and retains the explicit check for GNUX32 as it can only return x86_64-linux-gnux32.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D103777
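The helper itself is essentially a two-way environment check, along these lines (a sketch based on the description above; the real predicate is a member of llvm::Triple):

```cpp
#include "llvm/ADT/Triple.h"  // moved to llvm/TargetParser/Triple.h in newer trees

// Sketch of the new predicate: one place that knows both X32 environments.
static bool isX32(const llvm::Triple &T) {
  return T.getEnvironment() == llvm::Triple::GNUX32 ||
         T.getEnvironment() == llvm::Triple::MuslX32;
}
```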

Revision tags: llvmorg-12.0.1-rc1

# d4bdeca5 | 08-May-2021 | Xiang1 Zhang <[email protected]>
[X86] Support AMX fast register allocation
Differential Revision: https://reviews.llvm.org/D100026

# bebafe01 | 08-May-2021 | Xiang1 Zhang <[email protected]>
Revert "[X86] Support AMX fast register allocation"
This reverts commit 77e2e5e07d01fe0b83c39d0c527c0d3d2e659146.