Revision Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6
# 1cf9b24d 12-Jun-2022 Simon Pilgrim <[email protected]>

[DAG] Enable ISD::FSHL/R SimplifyMultipleUseDemandedBits handling inside SimplifyDemandedBits

This patch allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits in cases where the source operand has other uses, enabling us to peek through the shifted value if we don't demand all the bits/elts.
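
To illustrate why a funnel shift need not demand every bit of both operands, here is a minimal Python sketch. It is not LLVM code: the helper names `fshl` and `demanded_operand_bits` are hypothetical, and it assumes 32-bit operands and a constant shift amount.

```python
def fshl(hi: int, lo: int, amt: int, width: int = 32) -> int:
    """Funnel shift left: concatenate hi:lo, shift left by amt, keep the top `width` bits."""
    mask = (1 << width) - 1
    amt %= width
    if amt == 0:
        return hi & mask
    return ((hi << amt) | (lo >> (width - amt))) & mask


def demanded_operand_bits(amt: int, demanded: int, width: int = 32):
    """Map demanded result bits back to the bits needed from each operand.

    Result bit i (i >= amt) comes from hi bit (i - amt);
    result bit i (i < amt) comes from lo bit (i + width - amt).
    """
    mask = (1 << width) - 1
    amt %= width
    hi_demanded = (demanded & mask) >> amt
    lo_demanded = ((demanded & mask) << (width - amt)) & mask
    return hi_demanded, lo_demanded


# If only the low 8 bits of fshl(hi, lo, 8) are demanded, no bit of `hi` is needed,
# so `hi` could be replaced by a simpler value even if it has other uses.
hi_bits, lo_bits = demanded_operand_bits(8, 0xFF)
print(hex(hi_bits), hex(lo_bits))
```

When the demanded mask for an operand is smaller than the full width, a multi-use operand can be looked through for this one use without changing what its other users see.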

This helps with several of the regressions from D125836.


Revision tags: llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2
# 0aef747b 11-Jun-2021 Roman Lebedev <[email protected]>

[NFC][X86][Codegen] Megacommit: mass-regenerate all check lines that were already autogenerated

The motivation is that the update script deviates in at least two ways
(`<...>@GOT`/`<...>@PLT`, and not hiding pointer arithmetic) from
what nearly all of the check lines were generated with.
Most of the tests have still not been updated, so each time an
out-of-date test is regenerated to see the effect of a code change,
the diff contains a lot of noise. Instead of dealing with that each
time, let's deal with everything at once.

This has been done via:
```
cd llvm-project/llvm/test/CodeGen/X86
grep -rl "; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py" | xargs -L1 <...>/llvm-project/llvm/utils/update_llc_test_checks.py --llc-binary <...>/llvm-project/build/bin/llc
```

Not all tests were regenerated, however.


Revision tags: llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# 6cb7ddda 12-Mar-2021 Simon Pilgrim <[email protected]>

[X86][AVX] Insert zero byte elements into 256/512-bit vectors using shuffle/and

Avoid extracting/inserting subvectors, which makes it more difficult for shuffle combining to merge them together.


Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1
# 17eafe08 26-Jul-2020 Simon Pilgrim <[email protected]>

[X86][SSE] lowerV2I64Shuffle - use undef elements in PSHUFD mask widening

If we lower a v2i64 shuffle to PSHUFD, we currently clamp undef elements to 0 (elements 0,1 of the v4i32), which can result in the shuffle referencing more elements of the source vector than expected, affecting later shuffle combines and KnownBits/SimplifyDemanded calls.

By ensuring we widen the undef mask element we allow getV4X86ShuffleImm8 to use inline elements as the default, which are more likely to fold.
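
The difference between clamping and widening undef mask elements can be sketched in Python as follows. This is an illustrative model only: `UNDEF`, `widen_v2i64_mask`, `widen_clamped` and `pshufd_imm8` are hypothetical names, and the immediate encoding assumes undef lanes default to their own (in-place) index.

```python
UNDEF = -1  # sentinel for an undefined shuffle mask element (assumption)


def widen_v2i64_mask(mask):
    """Widen a 2 x 64-bit shuffle mask to 4 x 32-bit, keeping undef lanes undef."""
    widened = []
    for m in mask:
        if m == UNDEF:
            widened += [UNDEF, UNDEF]
        else:
            widened += [2 * m, 2 * m + 1]
    return widened


def widen_clamped(mask):
    """Older behaviour as described above: clamp undef to element 0 before widening."""
    return widen_v2i64_mask([max(m, 0) for m in mask])


def pshufd_imm8(mask4):
    """Encode a 4-element mask as an 8-bit immediate, 2 bits per lane.

    Undef lanes fall back to the in-place index i.
    """
    imm = 0
    for i, m in enumerate(mask4):
        idx = i if m == UNDEF else m
        imm |= (idx & 3) << (2 * i)
    return imm


mask = [1, UNDEF]
print(widen_clamped(mask), hex(pshufd_imm8(widen_clamped(mask))))
print(widen_v2i64_mask(mask), hex(pshufd_imm8(widen_v2i64_mask(mask))))
```

In the clamped form the upper lanes explicitly reference source elements 0 and 1, while the widened form leaves them undef so that later combines are free to pick any value for them.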


Revision tags: llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1
# 6d103ca8 02-May-2020 LemonBoy <[email protected]>

[SelectionDAG] Unify scalarizeVectorLoad and VectorLegalizer::ExpandLoad

The two code paths have the same goal, legalizing a load of a non-byte-sized vector by loading the "flattened" representation in memory, slicing off each single element and then building a vector out of those pieces.

The technique employed by `ExpandLoad` is slightly more convoluted and produces slightly better codegen on ARM, AMDGPU and x86 but suffers from some bugs (D78480) and is wrong for BE machines.
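
As a rough illustration of the "load the flattened representation, then slice off each element" approach, here is a small Python model. The function name `scalarize_vector_load` is hypothetical, and the big-endian element ordering shown is an assumption made for the sketch.

```python
def scalarize_vector_load(flat: int, num_elts: int, elt_bits: int,
                          big_endian: bool = False):
    """Split an integer holding a packed vector into its elements.

    `flat` models the num_elts * elt_bits wide integer loaded from memory.
    On little-endian, element 0 occupies the least significant bits;
    on big-endian (assumed here), element 0 occupies the most significant bits.
    """
    elt_mask = (1 << elt_bits) - 1
    elts = []
    for i in range(num_elts):
        pos = (num_elts - 1 - i) if big_endian else i
        elts.append((flat >> (pos * elt_bits)) & elt_mask)
    return elts


# A <4 x i2> vector packed into a single byte.
print(scalarize_vector_load(0b11100100, num_elts=4, elt_bits=2))
print(scalarize_vector_load(0b11100100, num_elts=4, elt_bits=2, big_endian=True))
```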

Differential Revision: https://reviews.llvm.org/D79096


Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1
# b3d719e1 22-Jul-2019 Simon Pilgrim <[email protected]>

[X86] EltsFromConsecutiveLoads - support common source loads (REAPPLIED)

This patch enables us to find the source loads for each element, splitting them into a Load and ByteOffset, and attempts to recognise consecutive loads that are in fact from the same source load.

A helper function, findEltLoadSrc, recurses to find a LoadSDNode and determines the element's byte offset within it. When attempting to match consecutive loads, byte-offset loads are then matched against a previous load that has already been confirmed to be a consecutive match.

Next step towards PR16739 - after this we just need to account for shuffling/repeated elements to create a vector load + shuffle.

Fixed the out-of-bounds load assert identified in rL366501.

Differential Revision: https://reviews.llvm.org/D64551

llvm-svn: 366681


# ba9c9e62 18-Jul-2019 Reid Kleckner <[email protected]>

Revert [X86] EltsFromConsecutiveLoads - support common source loads

This reverts r366441 (git commit 48104ef7c9c653bbb732b66d7254957389fea337)

This causes clang to fail to compile some file in Skia. Reduction soon.

llvm-svn: 366501


# 48104ef7 18-Jul-2019 Simon Pilgrim <[email protected]>

[X86] EltsFromConsecutiveLoads - support common source loads

This patch enables us to find the source loads for each element, splitting them into a Load and ByteOffset, and attempts to recognise consecutive loads that are in fact from the same source load.

A helper function, findEltLoadSrc, recurses to find a LoadSDNode and determines the element's byte offset within it. When attempting to match consecutive loads, byte-offset loads are then matched against a previous load that has already been confirmed to be a consecutive match.

Next step towards PR16739 - after this we just need to account for shuffling/repeated elements to create a vector load + shuffle.

Differential Revision: https://reviews.llvm.org/D64551

llvm-svn: 366441


Revision tags: llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1
# 858303b8 30-Oct-2018 Simon Pilgrim <[email protected]>

[SelectionDAG] Add FoldBUILD_VECTOR to simplify new BUILD_VECTOR nodes

Similar to FoldCONCAT_VECTORS, this patch adds FoldBUILD_VECTOR to simplify cases that can avoid the creation of the BUILD_VECTOR - if all the operands are UNDEF or if the BUILD_VECTOR simplifies to a copy.
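
The two folds mentioned above (all operands undef, or the node reducing to a copy) might look roughly like this in a toy Python model. Everything here is an assumption for illustration: operands are modelled as `(source_name, index)` pairs or `UNDEF`, and `fold_build_vector` is a hypothetical stand-in, not the SelectionDAG implementation.

```python
UNDEF = None  # models an undef operand (assumption)


def fold_build_vector(ops, source_lengths):
    """Try to simplify a build-vector before creating it.

    ops: list of operands, each either UNDEF or (source_name, element_index),
         modelling "extract element `element_index` from `source_name`".
    source_lengths: dict mapping source_name to its number of elements.
    Returns "undef", a source name (the copy case), or None if no fold applies.
    """
    # Fold 1: every operand is undef, so the whole vector is undef.
    if all(op is UNDEF for op in ops):
        return "undef"

    # Fold 2: operand i is element i of one same-sized source, i.e. a plain copy.
    first = ops[0]
    if first is not UNDEF:
        src = first[0]
        same_size = source_lengths.get(src) == len(ops)
        if same_size and all(op == (src, i) for i, op in enumerate(ops)):
            return src

    return None  # no simplification; a real node would be created


print(fold_build_vector([UNDEF, UNDEF], {}))
print(fold_build_vector([("v", 0), ("v", 1)], {"v": 2}))
print(fold_build_vector([("v", 1), ("v", 0)], {"v": 2}))
```

Because a fold can return something other than a build-vector node, callers cannot assume the result has that opcode, which is the kind of assumption the message says was exposed in AMDGPU code.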

This exposed an assumption in some AMDGPU code that getBuildVector was guaranteed to be a BUILD_VECTOR node that I've tried to handle.

Differential Revision: https://reviews.llvm.org/D53760

llvm-svn: 345578


Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3
# 4e2f757d 16-Feb-2018 Simon Pilgrim <[email protected]>

[X86][SSE] Allow float domain crossing if we are merging 2 or more shuffles and the root started as a float domain shuffle

llvm-svn: 325349


Revision tags: llvmorg-6.0.0-rc2
# 7ad28863 20-Jan-2018 Jonas Paulsson <[email protected]>

[SelectionDAG] Fix codegen of vector stores with non byte-sized elements.

This was completely broken, but hopefully fixed by this patch.

In cases where it is needed, a vector with non byte-sized elements is stored
by extracting, zero-extending, shifting and OR-ing the elements into an
integer of the same width as the vector, which is then stored.
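
The extract / zero-extend / shift / OR sequence can be modelled in a few lines of Python. This is a sketch under assumptions: `pack_vector_for_store` is a hypothetical name, and the big-endian ordering (element 0 in the most significant bits) is assumed for illustration.

```python
def pack_vector_for_store(elts, elt_bits: int, big_endian: bool = False) -> int:
    """Pack vector elements into one integer of width len(elts) * elt_bits.

    Each element is masked to elt_bits (zero-extension into the wide integer),
    shifted to its position and OR-ed into the result.
    """
    elt_mask = (1 << elt_bits) - 1
    n = len(elts)
    flat = 0
    for i, elt in enumerate(elts):
        pos = (n - 1 - i) if big_endian else i
        flat |= (elt & elt_mask) << (pos * elt_bits)
    return flat


# A <4 x i1> vector becomes a 4-bit integer; a <4 x i2> vector becomes one byte.
print(bin(pack_vector_for_store([1, 0, 1, 1], elt_bits=1)))
print(bin(pack_vector_for_store([0, 1, 2, 3], elt_bits=2)))
```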

Review: Eli Friedman, Ulrich Weigand
https://reviews.llvm.org/D42100#inline-369520
https://bugs.llvm.org/show_bug.cgi?id=35520

llvm-svn: 323042


Revision tags: llvmorg-6.0.0-rc1
# 940eae3c 15-Jan-2018 Simon Pilgrim <[email protected]>

[X86][SSE] Add custom execution domain fixing for BLENDPD/BLENDPS/PBLENDD/PBLENDW (PR34873)

Add support for custom execution domain fixing and implement support for BLENDPD/BLENDPS/PBLENDD/PBLENDW.

Differential Revision: https://reviews.llvm.org/D42042

llvm-svn: 322524


Revision tags: llvmorg-5.0.1, llvmorg-5.0.1-rc3
# 25528d6d 04-Dec-2017 Francis Visoiu Mistrih <[email protected]>

[CodeGen] Unify MBB reference format in both MIR and debug output

As part of the unification of the debug format and the MIR format, print
MBB references as '%bb.5'.

The MIR printer prints the IR name of a MBB only for block definitions.

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber\(\)/" << printMBBReference(*\1)/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber\(\)/" << printMBBReference(\1)/g'
* find . \( -name "*.txt" -o -name "*.s" -o -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g'
* grep -nr 'BB#' and fix
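
For readers who prefer to see the third substitution in isolation, the same `BB#<n>` to `%bb.<n>` rewrite can be expressed with Python's `re.sub`. The function name `rewrite_mbb_refs` is hypothetical; only the pattern is taken from the `sed` command above.

```python
import re


def rewrite_mbb_refs(text: str) -> str:
    """Rewrite old-style 'BB#5' references to the unified '%bb.5' form."""
    return re.sub(r"BB#([0-9]+)", r"%bb.\1", text)


print(rewrite_mbb_refs("; CHECK: jne BB#12"))
print(rewrite_mbb_refs("Successors according to CFG: BB#1 BB#2"))
```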

Differential Revision: https://reviews.llvm.org/D40422

llvm-svn: 319665


Revision tags: llvmorg-5.0.1-rc2
# 61145d5c 26-Nov-2017 Simon Pilgrim <[email protected]>

[X86][SSE] Add SSE42 tests to the clear upper tests

llvm-svn: 319003


Revision tags: llvmorg-5.0.1-rc1
# 7765c93b 18-Sep-2017 Sanjay Patel <[email protected]>

[DAG, x86] allow store merging before and after legalization (PR34217)

rL310710 allowed store merging to occur after legalization to catch stores that are created late,
but this exposes a logic hole seen in PR34217:
https://bugs.llvm.org/show_bug.cgi?id=34217

We will miss merging stores if the target lowers vector extracts into target-specific operations.
This patch allows store merging to occur both before and after legalization if the target chooses
to get maximum merging.

I don't think the potential regressions in the other tests are relevant. The tests are for
correctness of weird IR constructs rather than perf tests, and I think those are still correct.

Differential Revision: https://reviews.llvm.org/D37987

llvm-svn: 313564


# a6054328 18-Sep-2017 Craig Topper <[email protected]>

[X86] Teach the execution domain fixing tables to use movlhps in place of unpcklpd for the packed single domain.

MOVLHPS has a smaller encoding than UNPCKLPD in the legacy encodings. With VEX and EVEX encodings it doesn't matter.
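
The substitution relies on the two instructions producing the same 128-bit result, which a small Python model can demonstrate. The lane-list representation and the helper names (`movlhps`, `unpcklpd`, `as_64bit_lanes`) are assumptions made for this sketch.

```python
def movlhps(dst, src):
    """MOVLHPS on four 32-bit lanes: keep dst's low half, copy src's low half to the high half."""
    return [dst[0], dst[1], src[0], src[1]]


def unpcklpd(dst, src):
    """UNPCKLPD on two 64-bit lanes: interleave the low lane of dst and src."""
    return [dst[0], src[0]]


def as_64bit_lanes(v32):
    """Reinterpret four 32-bit lanes (lane 0 lowest) as two 64-bit lanes."""
    return [v32[0] | (v32[1] << 32), v32[2] | (v32[3] << 32)]


a = [0x11111111, 0x22222222, 0x33333333, 0x44444444]
b = [0x55555555, 0x66666666, 0x77777777, 0x88888888]

# Both instructions yield the same bits: [a.low64, b.low64].
print(as_64bit_lanes(movlhps(a, b)) == unpcklpd(as_64bit_lanes(a), as_64bit_lanes(b)))
```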

llvm-svn: 313509


# 87f7381e 18-Sep-2017 Craig Topper <[email protected]>

[X86] Teach execution domain fixing to convert between FP and int unpack instructions.

llvm-svn: 313508


# 76f44015 04-Sep-2017 Craig Topper <[email protected]>

[X86] Add a combine to recognize when we have two insert subvectors that together write the whole vector, but the starting vector isn't undef.

In this case we should replace the starting vector with undef.
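
Why the starting vector can become undef is easy to check with a toy model: once two subvector inserts together overwrite every lane, nothing from the base survives. The helpers `insert_subvector` and `covers_whole_vector` below are hypothetical Python stand-ins, not the DAG nodes themselves.

```python
def insert_subvector(base, sub, index):
    """Return a copy of `base` with `sub` written starting at lane `index`."""
    out = list(base)
    out[index:index + len(sub)] = sub
    return out


def covers_whole_vector(width, inserts):
    """inserts: list of (start_index, length). True if every lane is overwritten."""
    covered = set()
    for start, length in inserts:
        covered.update(range(start, start + length))
    return covered == set(range(width))


lo, hi = [1, 2], [3, 4]
from_real_base = insert_subvector(insert_subvector([9, 9, 9, 9], lo, 0), hi, 2)
from_undef_base = insert_subvector(insert_subvector([None] * 4, lo, 0), hi, 2)

# The base contributes nothing, so it may as well be undef.
print(covers_whole_vector(4, [(0, 2), (2, 2)]), from_real_base == from_undef_base)
```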

llvm-svn: 312462


# fa82efb5 03-Sep-2017 Craig Topper <[email protected]>

[X86] Add VBLENDPS/VPBLENDD to the execution domain fixing tables.

llvm-svn: 312449


Revision tags: llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2
# aead31a3 27-Jul-2017 Dinar Temirbulatov <[email protected]>

[X86] SET0 to use XMM registers where possible PR26018 PR32862

Differential Revision: https://reviews.llvm.org/D35839

llvm-svn: 309298


Revision tags: llvmorg-5.0.0-rc1
# b320ef9f 05-Jul-2017 Nirav Dave <[email protected]>

Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset

Relanding after rewriting undef.ll test to avoid host-dependent
endianness.

As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using
generic checks. Also, propagate missing local handling from there to
BaseIndexOffset checks.

Tests of note:

* test/CodeGen/X86/build-vector* - Improved.
* test/CodeGen/BPF/undef.ll - Improved store alignment allows an additional store merge.
* test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a case we already do not handle well. Here, the DAG is improved, but scheduling causes a code size degradation.

Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D34472

llvm-svn: 307114


# a35938d8 30-Jun-2017 Nirav Dave <[email protected]>

Revert "[DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset"

This reverts commit r306819, which appears to be exposing underlying
issues in a stage1 ppc64be build.

llvm-svn: 306820


# c5a48c1e 30-Jun-2017 Nirav Dave <[email protected]>

[DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset

As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using
generic checks. Also, propagate missing local handling from there to
BaseIndexOffset checks.

Tests of note:

* test/CodeGen/X86/build-vector* - Improved.
* test/CodeGen/BPF/undef.ll - Improved store alignment allows an additional store merge.
* test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a case we already do not handle well. Here, the DAG is improved, but scheduling causes a code size degradation.

Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab

Subscribers: nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D34472

llvm-svn: 306819


# 4822b5b6 20-Jun-2017 Simon Pilgrim <[email protected]>

[X86][SSE] Relax 0/-1 vector element insertion to work for any vector with >=16bit elements

Shuffle lowering/combining now does a good job for 256/512-bit vectors, so we don't need to prevent this.

llvm-svn: 305801


Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3
# 58f5be27 07-Jun-2017 Simon Pilgrim <[email protected]>

[X86][SSE] Fix an issue with PEXTRW/PEXTRB indices during shuffle combining

We were checking that the index was in range of the destination vector type, not the (larger) source vector type.

llvm-svn: 304894
