|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
e3e63f30 |
| 26-Jul-2022 |
Dmitry Vassiliev <[email protected]> |
[CodeGen] Fixed ambiguous symbol ExtAddrMode in case of NDEBUG and LLVM_ENABLE_DUMP
This patch fixes the following error with MSVC 16.9.2 in case of NDEBUG and LLVM_ENABLE_DUMP: llvm/lib/CodeGen/Cod
[CodeGen] Fixed ambiguous symbol ExtAddrMode in case of NDEBUG and LLVM_ENABLE_DUMP
This patch fixes the following error with MSVC 16.9.2 in case of NDEBUG and LLVM_ENABLE_DUMP: llvm/lib/CodeGen/CodeGenPrepare.cpp(2581): error C2872: 'ExtAddrMode': ambiguous symbol llvm/include/llvm/CodeGen/TargetInstrInfo.h(86): note: could be 'llvm::ExtAddrMode' llvm/lib/CodeGen/CodeGenPrepare.cpp(2447): note: or '`anonymous-namespace'::ExtAddrMode' llvm/lib/CodeGen/CodeGenPrepare.cpp(2581): error C2039: 'print': is not a member of 'llvm::ExtAddrMode'
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D130426
show more ...
|
| #
76e18cc4 |
| 20-Jul-2022 |
Kazu Hirata <[email protected]> |
[llvm] Use llvm::any_of and llvm::none_of (NFC)
|
| #
9e6d1f4b |
| 17-Jul-2022 |
Kazu Hirata <[email protected]> |
[CodeGen] Qualify auto variables in for loops (NFC)
|
| #
a323dfc0 |
| 16-Jul-2022 |
Tim Besard <[email protected]> |
Don't sink ptrtoint/inttoptr sequences into non-noop addrspacecasts.
In https://reviews.llvm.org/D30114, support for mismatching address spaces was introduced to CodeGenPrepare's optimizeMemoryInst,
Don't sink ptrtoint/inttoptr sequences into non-noop addrspacecasts.
In https://reviews.llvm.org/D30114, support for mismatching address spaces was introduced to CodeGenPrepare's optimizeMemoryInst, using addrspacecast as it was argued that only no-op addrspacecasts would be considered when constructing the address mode. However, by doing inttoptr/ptrtoint, it's possible to get CGP to emit an addrspace that's not actually no-op, introducing a miscompilation:
define void @kernel(i8* %julia_ptr) { %intptr = ptrtoint i8* %julia_ptr to i64 %ptr = inttoptr i64 %intptr to i32 addrspace(3)*
br label %end end:
store atomic i32 1, i32 addrspace(3)* %ptr unordered, align 4 ret void }
Gets compiled to:
define void @kernel(i8* %julia_ptr) { end: %0 = addrspacecast i8* %julia_ptr to i32 addrspace(3)* store atomic i32 1, i32 addrspace(3)* %0 unordered, align 4 ret void }
In the case of NVPTX, this introduces a cvta.to.shared, whereas leaving out the %end block and branch doesn't trigger this optimization. This results in illegal memory accesses as seen in https://github.com/JuliaGPU/CUDA.jl/issues/558
In this change, I introduced a check before doing the pointer cast that verifies address spaces are the same. If not, it emits a ptrtoint/inttoptr combination to get a no-op cast between address spaces. I decided against disallowing ptrtoint/inttoptr with non-default AS in matchOperationAddr, because now its still possible to look through multiple sequences of them that ultimately do not result in a address space mismatch (i.e. the second lit test).
show more ...
|
| #
373571db |
| 30-Jun-2022 |
Nuno Lopes <[email protected]> |
[NFC] Switch a few uses of undef to poison as placeholders for unreachble code
|
| #
44b456e5 |
| 26-Jun-2022 |
Craig Topper <[email protected]> |
[CodeGenPrepare] Avoid double map lookup. NFCI
|
|
Revision tags: llvmorg-14.0.6 |
|
| #
4271a1ff |
| 18-Jun-2022 |
Kazu Hirata <[email protected]> |
[llvm] Call *set::insert without checking membership first (NFC)
|
| #
6725d806 |
| 14-Jun-2022 |
Guillaume Chatelet <[email protected]> |
[NFC][Alignment] Use Align in shouldAlignPointerArgs
|
|
Revision tags: llvmorg-14.0.5 |
|
| #
c10921fa |
| 30-May-2022 |
Nikita Popov <[email protected]> |
[CGP] Also freeze ctlz/cttz operand when despeculating
D125887 changed the ctlz/cttz despeculation transform to insert a freeze for the introduced branch on zero. While this does fix the "branch on
[CGP] Also freeze ctlz/cttz operand when despeculating
D125887 changed the ctlz/cttz despeculation transform to insert a freeze for the introduced branch on zero. While this does fix the "branch on poison" issue, we may still get in trouble if we pick a different value for the branch and for the ctz argument (i.e. non-zero for the branch, but zero for the ctz). To avoid this, we should use the same frozen value in both positions.
This does cause a regression in RISCV codegen by introducing an additional sext. The DAG looks like this:
t0: ch = EntryToken t2: i64,ch = CopyFromReg t0, Register:i64 %3 t4: i64 = AssertSext t2, ValueType:ch:i32 t23: i64 = freeze t4 t9: ch = CopyToReg t0, Register:i64 %0, t23 t16: ch = CopyToReg t0, Register:i64 %4, Constant:i64<32> t18: ch = TokenFactor t9, t16 t25: i64 = sign_extend_inreg t23, ValueType:ch:i32 t24: i64 = setcc t25, Constant:i64<0>, seteq:ch t28: i64 = and t24, Constant:i64<1> t19: ch = brcond t18, t28, BasicBlock:ch<cond.end 0x8311f68> t21: ch = br t19, BasicBlock:ch<cond.false 0x8311e80>
I don't see a really obvious way to improve this, as we can't push the freeze past the AssertSext (which may produce poison).
Differential Revision: https://reviews.llvm.org/D126638
show more ...
|
| #
b8c2781f |
| 09-Jun-2022 |
Simon Moll <[email protected]> |
[NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with
[NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with the cleanup of InstructionSimplify in one commit.
This is the alternative to the less invasive clang-format only patch: D126783
Reviewed By: spatel, rengolin
Differential Revision: https://reviews.llvm.org/D126889
show more ...
|
| #
0e10f128 |
| 08-Jun-2022 |
Chuanqi Xu <[email protected]> |
[NFC] Remove commented cerr debugging loggings
There are some unused cerr debugging loggings in the codes. It is weird to remain such commented debug helpers in the product.
|
| #
d86a206f |
| 05-Jun-2022 |
Fangrui Song <[email protected]> |
Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options
|
| #
557efc9a |
| 04-Jun-2022 |
Fangrui Song <[email protected]> |
[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!` error. More were added due to cargo cult. Since the err
[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!` error. More were added due to cargo cult. Since the error has been removed, cl::ZeroOrMore is unneeded.
Also remove cl::init(false) while touching the lines.
show more ...
|
| #
08cc0585 |
| 27-May-2022 |
Rahman Lavaee <[email protected]> |
Reland "[Propeller] Promote functions with propeller profiles to .text.hot."
This relands commit 4d8d2580c53e130c3c3dd3877384301e3c495554.
The major change here is using 'addUsedIfAvailable<BasicBl
Reland "[Propeller] Promote functions with propeller profiles to .text.hot."
This relands commit 4d8d2580c53e130c3c3dd3877384301e3c495554.
The major change here is using 'addUsedIfAvailable<BasicBlockSectionsProfileReader>()` to make sure we don't change the pipeline tests.
Differential Revision: https://reviews.llvm.org/D126518
show more ...
|
| #
3aa24932 |
| 27-May-2022 |
Rahman Lavaee <[email protected]> |
Revert "[Propeller] Promote functions with propeller profiles to .text.hot."
This reverts commit 4d8d2580c53e130c3c3dd3877384301e3c495554.
|
|
Revision tags: llvmorg-14.0.4 |
|
| #
4d8d2580 |
| 24-May-2022 |
Rahman Lavaee <[email protected]> |
[Propeller] Promote functions with propeller profiles to .text.hot.
Today, text section prefixes (none, .unlikely, .hot, and .unkown) are determined based on PGO profile. However, Propeller may deem
[Propeller] Promote functions with propeller profiles to .text.hot.
Today, text section prefixes (none, .unlikely, .hot, and .unkown) are determined based on PGO profile. However, Propeller may deem a function hot when PGO doesn't. Besides, when `-Wl,-keep-text-section-prefix=true` Propeller cannot enforce a global section ordering as the linker can only reorder sections within each output section (.text, .text.hot, .text.unlikely).
This patch promotes all functions with Propeller profiles (functions listed in the basic-block-sections profile) to .text.hot. The feature is hidden behind the flag `--bbsections-guided-section-prefix` which defaults to `true`.
The new implementation refactors the parsing of basic block sections profile into a new `BasicBlockSectionsProfileReader` analysis pass. This allows us to use the information earlier in `CodeGenPrepare` in order to set the functions text prefix. `BasicBlockSectionsProfileReader` will be used both by `BasicBlockSections` pass and `CodeGenPrepare`.
Differential Revision: https://reviews.llvm.org/D122930
show more ...
|
| #
5126c380 |
| 18-May-2022 |
Nikita Popov <[email protected]> |
[CGP] Freeze condition when despeculating ctlz/cttz
Freeze the condition of the newly introduced conditional branch, to avoid immediate undefined behavior if the input to ctlz/cttz was originally po
[CGP] Freeze condition when despeculating ctlz/cttz
Freeze the condition of the newly introduced conditional branch, to avoid immediate undefined behavior if the input to ctlz/cttz was originally poison.
Differential Revision: https://reviews.llvm.org/D125887
show more ...
|
| #
8d03c49f |
| 03-May-2022 |
Matthias Braun <[email protected]> |
Extend switch condition in optimizeSwitchPhiConst when free
In a case like:
switch((i32)x) { case 42: phi((i64)42, ...); }
replace `(i64)42` with `zext(x)` when we can do so for free.
This fi
Extend switch condition in optimizeSwitchPhiConst when free
In a case like:
switch((i32)x) { case 42: phi((i64)42, ...); }
replace `(i64)42` with `zext(x)` when we can do so for free.
This fixes a part of https://github.com/llvm/llvm-project/issues/55153
Differential Revision: https://reviews.llvm.org/D124897
show more ...
|
| #
ed1cb01b |
| 13-May-2022 |
Nikita Popov <[email protected]> |
[IRBuilder] Add IsInBounds parameter to CreateGEP()
We commonly want to create either an inbounds or non-inbounds GEP based on a boolean value, e.g. when preserving inbounds from existing GEPs. Dire
[IRBuilder] Add IsInBounds parameter to CreateGEP()
We commonly want to create either an inbounds or non-inbounds GEP based on a boolean value, e.g. when preserving inbounds from existing GEPs. Directly accept such a boolean in the API, rather than requiring a ternary between CreateGEP and CreateInBoundsGEP.
This change is not entirely NFC, because we now preserve an inbounds flag in a constant expression edge-case in InstCombine.
show more ...
|
| #
edbf390d |
| 11-May-2022 |
Craig Topper <[email protected]> |
[CodeGenPrepare] Use const reference to avoid unnecessary APInt copy. NFC
Spotted while looking at Matthias' patches.
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D124985
|
| #
de9ad98d |
| 11-May-2022 |
Matthias Braun <[email protected]> |
Fix endless loop in optimizePhiConst with integer constant switch condition
Avoid endless loop in degenerate case with an integer constant as switch condition as reported in https://reviews.llvm.org
Fix endless loop in optimizePhiConst with integer constant switch condition
Avoid endless loop in degenerate case with an integer constant as switch condition as reported in https://reviews.llvm.org/D124552
show more ...
|
|
Revision tags: llvmorg-14.0.3 |
|
| #
f0ea9c9c |
| 27-Apr-2022 |
Matthias Braun <[email protected]> |
CodeGenPrepare: Replace constant PHI arguments with switch condition value
We often see code like the following after running SCCP:
switch (x) { case 42: phi(42, ...); }
This tends to produce
CodeGenPrepare: Replace constant PHI arguments with switch condition value
We often see code like the following after running SCCP:
switch (x) { case 42: phi(42, ...); }
This tends to produce bad code as we currently materialize the constant phi-argument in the switch-block. This increases register pressure and if the pattern repeats for `n` case statements, we end up generating `n` constant values.
This changes CodeGenPrepare to catch this pattern and revert it back to:
switch (x) { case 42: phi(x, ...); }
Differential Revision: https://reviews.llvm.org/D124552
show more ...
|
| #
cd19af74 |
| 03-May-2022 |
Matthias Braun <[email protected]> |
Avoid 8 and 16bit switch conditions on x86
This adds a `TargetLoweringBase::getSwitchConditionType` callback to give targets a chance to control the type used in `CodeGenPrepare::optimizeSwitchInst`
Avoid 8 and 16bit switch conditions on x86
This adds a `TargetLoweringBase::getSwitchConditionType` callback to give targets a chance to control the type used in `CodeGenPrepare::optimizeSwitchInst`.
Implement callback for X86 to avoid i8 and i16 types where possible as they often incur extra zero-extensions.
This is NFC for non-X86 targets.
Differential Revision: https://reviews.llvm.org/D124894
show more ...
|
|
Revision tags: llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
46f83cae |
| 22-Mar-2022 |
Jonas Paulsson <[email protected]> |
[InlineAsm] Add support for address operands ("p").
This patch adds support for inline assembly address operands using the "p" constraint on X86 and SystemZ.
This was in fact broken on X86 (see exa
[InlineAsm] Add support for address operands ("p").
This patch adds support for inline assembly address operands using the "p" constraint on X86 and SystemZ.
This was in fact broken on X86 (see example at https://reviews.llvm.org/D110267, Nov 23).
These operands should probably be treated the same as memory operands by CodeGenPrepare, which have been commented with "TODO" there.
Review: Xiang Zhang and Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D122220
show more ...
|
| #
989f1c72 |
| 15-Mar-2022 |
serge-sans-paille <[email protected]> |
Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169
after: 1061034926 before: 1063332844
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-in
Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169
after: 1061034926 before: 1063332844
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681
show more ...
|