|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6 |
|
| #
1cf9b24d |
| 12-Jun-2022 |
Simon Pilgrim <[email protected]> |
[DAG] Enable ISD::FSHL/R SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits
This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the source
[DAG] Enable ISD::FSHL/R SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits
This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts.
This helps with several of the regressions from D125836
show more ...
|
|
Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
0aef747b |
| 11-Jun-2021 |
Roman Lebedev <[email protected]> |
[NFC][X86][Codegen] Megacommit: mass-regenerate all check lines that were already autogenerated
The motivation is that the update script has at least two deviations (`<...>@GOT`/`<...>@PLT`/ and not
[NFC][X86][Codegen] Megacommit: mass-regenerate all check lines that were already autogenerated
The motivation is that the update script has at least two deviations (`<...>@GOT`/`<...>@PLT`/ and not hiding pointer arithmetics) from what pretty much all the checklines were generated with, and most of the tests are still not updated, so each time one of the non-up-to-date tests is updated to see the effect of the code change, there is a lot of noise. Instead of having to deal with that each time, let's just deal with everything at once.
This has been done via: ``` cd llvm-project/llvm/test/CodeGen/X86 grep -rl "; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py" | xargs -L1 <...>/llvm-project/llvm/utils/update_llc_test_checks.py --llc-binary <...>/llvm-project/build/bin/llc ```
Not all tests were regenerated, however.
show more ...
|
|
Revision tags: llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
6cb7ddda |
| 12-Mar-2021 |
Simon Pilgrim <[email protected]> |
[X86][AVX] Insert zeros byte elements into 256/512-bit vectors using shuffle/and
Avoid extracting/inserting subvectors which makes it more difficult for shuffle combining to merge them together.
|
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1 |
|
| #
17eafe08 |
| 26-Jul-2020 |
Simon Pilgrim <[email protected]> |
[X86][SSE] lowerV2I64Shuffle - use undef elements in PSHUFD mask widening
If we lower a v2i64 shuffle to PSHUFD, we currently clamp undef elements to 0, (elements 0,1 of the v4i32) which can result
[X86][SSE] lowerV2I64Shuffle - use undef elements in PSHUFD mask widening
If we lower a v2i64 shuffle to PSHUFD, we currently clamp undef elements to 0, (elements 0,1 of the v4i32) which can result in the shuffle referencing more elements of the source vector than expected, affecting later shuffle combines and KnownBits/SimplifyDemanded calls.
By ensuring we widen the undef mask element we allow getV4X86ShuffleImm8 to use inline elements as the default, which are more likely to fold.
show more ...
|
|
Revision tags: llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1 |
|
| #
6d103ca8 |
| 02-May-2020 |
LemonBoy <[email protected]> |
[SelectionDAG] Unify scalarizeVectorLoad and VectorLegalizer::ExpandLoad
The two code paths have the same goal, legalizing a load of a non-byte-sized vector by loading the "flattened" representation
[SelectionDAG] Unify scalarizeVectorLoad and VectorLegalizer::ExpandLoad
The two code paths have the same goal, legalizing a load of a non-byte-sized vector by loading the "flattened" representation in memory, slicing off each single element and then building a vector out of those pieces.
The technique employed by `ExpandLoad` is slightly more convoluted and produces slightly better codegen on ARM, AMDGPU and x86 but suffers from some bugs (D78480) and is wrong for BE machines.
Differential Revision: https://reviews.llvm.org/D79096
show more ...
|
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1 |
|
| #
b3d719e1 |
| 22-Jul-2019 |
Simon Pilgrim <[email protected]> |
[X86] EltsFromConsecutiveLoads - support common source loads (REAPPLIED)
This patch enables us to find the source loads for each element, splitting them into a Load and ByteOffset, and attempts to r
[X86] EltsFromConsecutiveLoads - support common source loads (REAPPLIED)
This patch enables us to find the source loads for each element, splitting them into a Load and ByteOffset, and attempts to recognise consecutive loads that are in fact from the same source load.
A helper function, findEltLoadSrc, recurses to find a LoadSDNode and determines the element's byte offset within it. When attempting to match consecutive loads, byte offsetted loads then attempt to matched against a previous load that has already been confirmed to be a consecutive match.
Next step towards PR16739 - after this we just need to account for shuffling/repeated elements to create a vector load + shuffle.
Fixed out of bounds load assert identified in rL366501
Differential Revision: https://reviews.llvm.org/D64551
llvm-svn: 366681
show more ...
|
| #
ba9c9e62 |
| 18-Jul-2019 |
Reid Kleckner <[email protected]> |
Revert [X86] EltsFromConsecutiveLoads - support common source loads
This reverts r366441 (git commit 48104ef7c9c653bbb732b66d7254957389fea337)
This causes clang to fail to compile some file in Skia
Revert [X86] EltsFromConsecutiveLoads - support common source loads
This reverts r366441 (git commit 48104ef7c9c653bbb732b66d7254957389fea337)
This causes clang to fail to compile some file in Skia. Reduction soon.
llvm-svn: 366501
show more ...
|
| #
48104ef7 |
| 18-Jul-2019 |
Simon Pilgrim <[email protected]> |
[X86] EltsFromConsecutiveLoads - support common source loads
This patch enables us to find the source loads for each element, splitting them into a Load and ByteOffset, and attempts to recognise con
[X86] EltsFromConsecutiveLoads - support common source loads
This patch enables us to find the source loads for each element, splitting them into a Load and ByteOffset, and attempts to recognise consecutive loads that are in fact from the same source load.
A helper function, findEltLoadSrc, recurses to find a LoadSDNode and determines the element's byte offset within it. When attempting to match consecutive loads, byte offsetted loads then attempt to matched against a previous load that has already been confirmed to be a consecutive match.
Next step towards PR16739 - after this we just need to account for shuffling/repeated elements to create a vector load + shuffle.
Differential Revision: https://reviews.llvm.org/D64551
llvm-svn: 366441
show more ...
|
|
Revision tags: llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1 |
|
| #
858303b8 |
| 30-Oct-2018 |
Simon Pilgrim <[email protected]> |
[SelectionDAG] Add FoldBUILD_VECTOR to simplify new BUILD_VECTOR nodes Similar to FoldCONCAT_VECTORS, this patch adds FoldBUILD_VECTOR to simplify cases that can avoid the creation of the BUILD_VEC
[SelectionDAG] Add FoldBUILD_VECTOR to simplify new BUILD_VECTOR nodes Similar to FoldCONCAT_VECTORS, this patch adds FoldBUILD_VECTOR to simplify cases that can avoid the creation of the BUILD_VECTOR - if all the operands are UNDEF or if the BUILD_VECTOR simplifies to a copy.
This exposed an assumption in some AMDGPU code that getBuildVector was guaranteed to be a BUILD_VECTOR node that I've tried to handle. Differential Revision: https://reviews.llvm.org/D53760
llvm-svn: 345578
show more ...
|
|
Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3 |
|
| #
4e2f757d |
| 16-Feb-2018 |
Simon Pilgrim <[email protected]> |
[X86][SSE] Allow float domain crossing if we are merging 2 or more shuffles and the root started as a float domain shuffle
llvm-svn: 325349
|
|
Revision tags: llvmorg-6.0.0-rc2 |
|
| #
7ad28863 |
| 20-Jan-2018 |
Jonas Paulsson <[email protected]> |
[SelectionDAG] Fix codegen of vector stores with non byte-sized elements.
This was completely broken, but hopefully fixed by this patch.
In cases where it is needed, a vector with non byte-sized el
[SelectionDAG] Fix codegen of vector stores with non byte-sized elements.
This was completely broken, but hopefully fixed by this patch.
In cases where it is needed, a vector with non byte-sized elements is stored by extracting, zero-extending, shift:ing and or:ing the elements into an integer of the same width as the vector, which is then stored.
Review: Eli Friedman, Ulrich Weigand https://reviews.llvm.org/D42100#inline-369520 https://bugs.llvm.org/show_bug.cgi?id=35520
llvm-svn: 323042
show more ...
|
|
Revision tags: llvmorg-6.0.0-rc1 |
|
| #
940eae3c |
| 15-Jan-2018 |
Simon Pilgrim <[email protected]> |
[X86][SSE] Add custom execution domain fixing for BLENDPD/BLENDPS/PBLENDD/PBLENDW (PR34873)
Add support for custom execution domain fixing and implement support for BLENDPD/BLENDPS/PBLENDD/PBLENDW.
[X86][SSE] Add custom execution domain fixing for BLENDPD/BLENDPS/PBLENDD/PBLENDW (PR34873)
Add support for custom execution domain fixing and implement support for BLENDPD/BLENDPS/PBLENDD/PBLENDW.
Differential Revision: https://reviews.llvm.org/D42042
llvm-svn: 322524
show more ...
|
|
Revision tags: llvmorg-5.0.1, llvmorg-5.0.1-rc3 |
|
| #
25528d6d |
| 04-Dec-2017 |
Francis Visoiu Mistrih <[email protected]> |
[CodeGen] Unify MBB reference format in both MIR and debug output
As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'.
The MIR printer prints the IR n
[CodeGen] Unify MBB reference format in both MIR and debug output
As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'.
The MIR printer prints the IR name of a MBB only for block definitions.
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g' * find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g' * find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix
Differential Revision: https://reviews.llvm.org/D40422
llvm-svn: 319665
show more ...
|
|
Revision tags: llvmorg-5.0.1-rc2 |
|
| #
61145d5c |
| 26-Nov-2017 |
Simon Pilgrim <[email protected]> |
[X86][SSE] Add SSE42 tests to the clear upper tests
llvm-svn: 319003
|
|
Revision tags: llvmorg-5.0.1-rc1 |
|
| #
7765c93b |
| 18-Sep-2017 |
Sanjay Patel <[email protected]> |
[DAG, x86] allow store merging before and after legalization (PR34217)
rL310710 allowed store merging to occur after legalization to catch stores that are created late, but this exposes a logic hole
[DAG, x86] allow store merging before and after legalization (PR34217)
rL310710 allowed store merging to occur after legalization to catch stores that are created late, but this exposes a logic hole seen in PR34217: https://bugs.llvm.org/show_bug.cgi?id=34217
We will miss merging stores if the target lowers vector extracts into target-specific operations. This patch allows store merging to occur both before and after legalization if the target chooses to get maximum merging.
I don't think the potential regressions in the other tests are relevant. The tests are for correctness of weird IR constructs rather than perf tests, and I think those are still correct.
Differential Revision: https://reviews.llvm.org/D37987
llvm-svn: 313564
show more ...
|
| #
a6054328 |
| 18-Sep-2017 |
Craig Topper <[email protected]> |
[X86] Teach the execution domain fixing tables to use movlhps inplace of unpcklpd for the packed single domain.
MOVLHPS has a smaller encoding than UNPCKLPD in the legacy encodings. With VEX and EVE
[X86] Teach the execution domain fixing tables to use movlhps inplace of unpcklpd for the packed single domain.
MOVLHPS has a smaller encoding than UNPCKLPD in the legacy encodings. With VEX and EVEX encodings it doesn't matter.
llvm-svn: 313509
show more ...
|
| #
87f7381e |
| 18-Sep-2017 |
Craig Topper <[email protected]> |
[X86] Teach execution domain fixing to convert between FP and int unpack instructions.
llvm-svn: 313508
|
| #
76f44015 |
| 04-Sep-2017 |
Craig Topper <[email protected]> |
[X86] Add a combine to recognize when we have two insert subvectors that together write the whole vector, but the starting vector isn't undef.
In this case we should replace the starting vector with
[X86] Add a combine to recognize when we have two insert subvectors that together write the whole vector, but the starting vector isn't undef.
In this case we should replace the starting vector with undef.
llvm-svn: 312462
show more ...
|
| #
fa82efb5 |
| 03-Sep-2017 |
Craig Topper <[email protected]> |
[X86] Add VBLENDPS/VPBLENDD to the execution domain fixing tables.
llvm-svn: 312449
|
|
Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2 |
|
| #
aead31a3 |
| 27-Jul-2017 |
Dinar Temirbulatov <[email protected]> |
[X86] SET0 to use XMM registers where possible PR26018 PR32862
Differential Revision: https://reviews.llvm.org/D35839
llvm-svn: 309298
|
|
Revision tags: llvmorg-5.0.0-rc1 |
|
| #
b320ef9f |
| 05-Jul-2017 |
Nirav Dave <[email protected]> |
Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset
Relanding after rewriting undef.ll test to avoid host-dependant endianness.
As discussed in D34087, rewrite areNonVolatileConsecutiveLo
Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset
Relanding after rewriting undef.ll test to avoid host-dependant endianness.
As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using generic checks. Also, propagate missing local handling from there to BaseIndexOffset checks.
Tests of note:
* test/CodeGen/X86/build-vector* - Improved. * test/CodeGen/BPF/undef.ll - Improved store alignment allows an additional store merge
* test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a case we already do not handle well. Here, the DAG is improved, but scheduling causes a code size degradation.
Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab
Subscribers: nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D34472
llvm-svn: 307114
show more ...
|
| #
a35938d8 |
| 30-Jun-2017 |
Nirav Dave <[email protected]> |
Revert "[DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset"
This reverts commit r306819 which appears be exposing underlying issues in a stage1 ppc64be build
llvm-svn: 306820
|
| #
c5a48c1e |
| 30-Jun-2017 |
Nirav Dave <[email protected]> |
[DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset
As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using generic checks. Also, propagate missing local handling from t
[DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset
As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using generic checks. Also, propagate missing local handling from there to BaseIndexOffset checks.
Tests of note:
* test/CodeGen/X86/build-vector* - Improved. * test/CodeGen/BPF/undef.ll - Improved store alignment allows an additional store merge
* test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a case we already do not handle well. Here, the DAG is improved, but scheduling causes a code size degradation.
Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab
Subscribers: nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D34472
llvm-svn: 306819
show more ...
|
| #
4822b5b6 |
| 20-Jun-2017 |
Simon Pilgrim <[email protected]> |
[X86][SSE] Relax 0/-1 vector element insertion to work for any vector with >=16bit elements
Shuffle lowering/combining now does a good job for 256/512-bit vectors - we don't need to prevent this
ll
[X86][SSE] Relax 0/-1 vector element insertion to work for any vector with >=16bit elements
Shuffle lowering/combining now does a good job for 256/512-bit vectors - we don't need to prevent this
llvm-svn: 305801
show more ...
|
|
Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3 |
|
| #
58f5be27 |
| 07-Jun-2017 |
Simon Pilgrim <[email protected]> |
[X86][SSE] Fix an issue with PEXTRW/PEXTRB indices during shuffle combining
We were checking that the index was in range of the destination vector type, not the (larger) source vector type
llvm-svn
[X86][SSE] Fix an issue with PEXTRW/PEXTRB indices during shuffle combining
We were checking that the index was in range of the destination vector type, not the (larger) source vector type
llvm-svn: 304894
show more ...
|