History log of /llvm-project-15.0.7/llvm/utils/PerfectShuffle/PerfectShuffle.cpp (Results 1 – 25 of 30)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4
# 4c6a070a 17-May-2022 David Green <[email protected]>

[AArch64] Teach perfect shuffles tables about D-lane movs

Similar to D123386, this adds D-Movs to the AArch64 perfect shuffle
tables, slightly lowering the costs a little more. This is a rough
impro

[AArch64] Teach perfect shuffles tables about D-lane movs

Similar to D123386, this adds D-Movs to the AArch64 perfect shuffle
tables, slightly lowering the costs a little more. This is a rough
improvement in general, especially if you ignore mov v0.16b, v2.16b type
moves that are often artefacts of the calling convention.

The D register movs are encoded as (0x4 | LaneIdx), and to generate a D
register move we are required to bitcast into a higher type, but it is
otherwise very similar to the S-lane mov's already supported.

Differential Revision: https://reviews.llvm.org/D125477

show more ...


Revision tags: llvmorg-14.0.3, llvmorg-14.0.2
# cc034141 19-Apr-2022 David Green <[email protected]>

[PerfectShuffle] Remove unused variables from D123386. NFC


# 73dc9964 19-Apr-2022 David Green <[email protected]>

[AArch64] Add lane moves to PerfectShuffle tables

This teaches the perfect shuffle tables about lane inserts, that can
help reduce the cost of many entries. Many of the shuffle masks are
one-away fr

[AArch64] Add lane moves to PerfectShuffle tables

This teaches the perfect shuffle tables about lane inserts, that can
help reduce the cost of many entries. Many of the shuffle masks are
one-away from being correct, and a simple lane move can be a lot simpler
than trying to use ext/zip/etc. Because they are not exactly like the
other masks handled in the perfect shuffle tables, they require special
casing to generate them, with a special InsOp Operator.

The lane to insert into is encoded as the RHSID, and the move from is
grabbed from the original mask. This helps reduce the maximum perfect
shuffle entry cost to 3, with many more shuffles being generatable in a
single instruction.

Differential Revision: https://reviews.llvm.org/D123386

show more ...


# 50af8270 19-Apr-2022 David Green <[email protected]>

[AArch64] Cost all perfect shuffles entries as cost 1

A brief introduction to perfect shuffles - AArch64 NEON has a number of
shuffle operations - dups, zips, exts, movs etc that can in some way
shu

[AArch64] Cost all perfect shuffles entries as cost 1

A brief introduction to perfect shuffles - AArch64 NEON has a number of
shuffle operations - dups, zips, exts, movs etc that can in some way
shuffle around the lanes of a vector. Given a shuffle of size 4 with 2
inputs, some shuffle masks can be easily codegen'd to a single
instruction. A <0,0,1,1> mask for example is a zip LHS, LHS. This is
great, but some masks are not so simple, like a <0,0,1,2>. It turns out
we can generate that from zip LHS, <0,2,0,2>, having generated
<0,2,0,2> from uzp LHS, LHS, producing the result in 2 instructions.

It is not obvious from a given mask how to get there though. So we have
a simple program (PerfectShuffle.cpp in the util folder) that can scan
through all combinations of 4-element vectors and generate the perfect
combination of results needed for each shuffle mask (for some definition
of perfect). This is run offline to generate a table that is queried for
generating shuffle instructions. (Because the table could get quite big,
it is limited to 4 element vectors).

In the perfect shuffle tables zip, unz and trn shuffles were being cost
as 2, which is higher than needed and skews the perfect shuffle tables
to create inefficient combinations. This sets them to 1 and regenerates
the tables. The codegen will usually be better and the costs should be
more precise (but it can get less second-order re-use of values from
multiple shuffles, these cases should be fixed up in subsequent patches.

Differential Revision: https://reviews.llvm.org/D123379

show more ...


Revision tags: llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 2aed0813 07-Jan-2022 Kazu Hirata <[email protected]>

[llvm] Use true/false instead of 1/0 (NFC)

Identified with modernize-use-bool-literals.


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1
# 13d267e3 25-Mar-2020 Nico Weber <[email protected]>

revert parts of d7888149aa813f that several bots do not like


# d7888149 25-Mar-2020 Nico Weber <[email protected]>

Suppress a few -Wunreachable-code warnings.

No behavior change. Also fix a comment to say match reality.


Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1
# 2946cd70 19-Jan-2019 Chandler Carruth <[email protected]>

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the ne

Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636

show more ...


Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1, llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1, llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1, llvmorg-3.8.1, llvmorg-3.8.1-rc1, llvmorg-3.8.0, llvmorg-3.8.0-rc3, llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1
# 46642ffe 14-Dec-2015 Ben Craig <[email protected]>

Reordering fields to reduce padding in LLVM. NFC

llvm-svn: 255554


Revision tags: llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1, llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1, llvmorg-3.6.2, llvmorg-3.6.2-rc1, llvmorg-3.6.1, llvmorg-3.6.1-rc1, llvmorg-3.5.2, llvmorg-3.5.2-rc1, llvmorg-3.6.0, llvmorg-3.6.0-rc4, llvmorg-3.6.0-rc3, llvmorg-3.6.0-rc2, llvmorg-3.6.0-rc1, llvmorg-3.5.1, llvmorg-3.5.1-rc2, llvmorg-3.5.1-rc1, llvmorg-3.5.0, llvmorg-3.5.0-rc4, llvmorg-3.5.0-rc3, llvmorg-3.5.0-rc2, llvmorg-3.5.0-rc1
# 66f09ad0 08-Jun-2014 Craig Topper <[email protected]>

[C++11] Use 'nullptr'.

llvm-svn: 210442


Revision tags: llvmorg-3.4.2, llvmorg-3.4.2-rc1, llvmorg-3.4.1, llvmorg-3.4.1-rc2, llvmorg-3.4.1-rc1, llvmorg-3.4.0, llvmorg-3.4.0-rc3, llvmorg-3.4.0-rc2, llvmorg-3.4.0-rc1, llvmorg-3.3.1-rc1, llvmorg-3.3.0, llvmorg-3.3.0-rc3, llvmorg-3.3.0-rc2, llvmorg-3.3.0-rc1, llvmorg-3.2.0, llvmorg-3.2.0-rc3
# 91d19d8e 04-Dec-2012 Chandler Carruth <[email protected]>

Sort the #include lines for utils/...

I've tried to find main moudle headers where possible, but the TableGen
stuff may warrant someone else looking at it.

llvm-svn: 169251


Revision tags: llvmorg-3.2.0-rc2, llvmorg-3.2.0-rc1, llvmorg-3.1.0, llvmorg-3.1.0-rc3, llvmorg-3.1.0-rc2, llvmorg-3.1.0-rc1, llvmorg-3.0.0, llvmorg-3.0.0-rc4, llvmorg-3.0.0-rc3, llvmorg-3.0.0-rc2, llvmorg-3.0.0-rc1
# 48b182c3 18-May-2011 Tanya Lattner <[email protected]>

In r131488 I misunderstood how VREV works. It splits the vector in half and splits each half. Therefore, the real problem was that we were using a VREV64 for a 4xi16, when we should have been using a

In r131488 I misunderstood how VREV works. It splits the vector in half and splits each half. Therefore, the real problem was that we were using a VREV64 for a 4xi16, when we should have been using a VREV32.
Updated test case and reverted change to the PerfectShuffle Table.

llvm-svn: 131529

show more ...


# c7e291b3 17-May-2011 Tanya Lattner <[email protected]>

vrev is incorrectly defined in the perfect shuffle table. The ordering is backwards (should be 0x3210 versus 0x1032) which exposed a bug when doing a shuffle on a 4xi16. I've attached a test case.

l

vrev is incorrectly defined in the perfect shuffle table. The ordering is backwards (should be 0x3210 versus 0x1032) which exposed a bug when doing a shuffle on a 4xi16. I've attached a test case.

llvm-svn: 131488

show more ...


Revision tags: llvmorg-2.9.0, llvmorg-2.9.0-rc3, llvmorg-2.9.0-rc2, llvmorg-2.9.0-rc1
# dba47755 14-Oct-2010 Jim Grosbach <[email protected]>

Teach PerfectShuffle to not generate files with embedded tab characters.

llvm-svn: 116458


Revision tags: llvmorg-2.8.0, llvmorg-2.8.0-rc3, llvmorg-2.8.0-rc2, llvmorg-2.8.0-rc1, llvmorg-2.8.0-rc0, llvmorg-2.7.0, llvmorg-2.6.0
# 8493c5de 25-Sep-2009 Anton Korobeynikov <[email protected]>

Provide proper masks for neon perfect shuffle table.
I definitely need to read documentation better :(

llvm-svn: 82813


# 9a232f46 21-Aug-2009 Anton Korobeynikov <[email protected]>

Add lowering of ARM 4-element shuffles to multiple instructios via perfectshuffle-generated table.

llvm-svn: 79624


# 84e24388 21-Aug-2009 Anton Korobeynikov <[email protected]>

Add simple cost model to perfect shuffle. Currently we're doing just greedy search for cost instead of (proper) dynamic programming approach.

llvm-svn: 79623


# b539e06f 21-Aug-2009 Anton Korobeynikov <[email protected]>

Whitespace cleanup

llvm-svn: 79618


Revision tags: llvmorg-2.5.0, llvmorg-2.4.0, llvmorg-2.3.0
# 579f0713 20-Feb-2008 Anton Korobeynikov <[email protected]>

Unbreak build with gcc 4.3: provide missed includes and silence most annoying warnings.

llvm-svn: 47367


Revision tags: llvmorg-2.2.0
# 8adcd9f3 29-Dec-2007 Chris Lattner <[email protected]>

remove attributions from utils.

llvm-svn: 45419


Revision tags: llvmorg-2.1.0, llvmorg-2.0.0, llvmorg-1.9.0
# 16e05c37 03-Nov-2006 Chris Lattner <[email protected]>

silence warning

llvm-svn: 31402


# de46e484 02-Nov-2006 Reid Spencer <[email protected]>

For PR786:
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fall out by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to dele

For PR786:
Turn on -Wunused and -Wno-unused-parameter. Clean up most of the resulting
fall out by removing unused variables. Remaining warnings have to do with
unused functions (I didn't want to delete code without review) and unused
variables in generated code. Maintainers should clean up the remaining
issues when they see them. All changes pass DejaGnu tests and Olden.

llvm-svn: 31380

show more ...


# aa237256 24-May-2006 Chris Lattner <[email protected]>

Patches to make the LLVM sources more -pedantic clean. Patch provided
by Anton Korobeynikov! This is a step towards closing PR786.

llvm-svn: 28447


# 3db20563 18-Apr-2006 Chris Lattner <[email protected]>

Fix a build failure on Vladimir's tester.

llvm-svn: 27785


# e2e2cc5b 17-Apr-2006 Chris Lattner <[email protected]>

Encode a cost of zero as a cost of 1.

llvm-svn: 27745


12