History log of /llvm-project-15.0.7/mlir/lib/Dialect/X86Vector/Transforms/AVXTranspose.cpp (Results 1 – 15 of 15)
Revision Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2
# eda6f907 22-Apr-2022 River Riddle

[mlir][NFC] Shift a bunch of dialect includes from the .h to the .cpp

Now that dialect constructors are generated in the .cpp file, we can
drop all of the dependent dialect includes from the .h file.

Differential Revision: https://reviews.llvm.org/D124298



Revision tags: llvmorg-14.0.1
# 7c38fd60 28-Mar-2022 Jacques Pienaar

[mlir] Flip Vector dialect accessors used to prefixed form.

This has been on _Both for a couple of weeks. Flip usages in core with
intention to flip flag to _Prefixed in follow up. Needed to add a couple
of helper methods in AffineOps and Linalg to facilitate a pure flag flip
in follow up as some of these classes are used in templates and so
sensitive to Vector dialect changes.

Differential Revision: https://reviews.llvm.org/D122151



Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# 875bbce9 25-Feb-2022 Diego Caballero

[mlir][Vector] Prevent AVX2 lowering for non-f32 transpose ops

The AVX2 lowering for transpose operations is only applicable to f32 vector types.

Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D120427



# d7e0a084 25-Feb-2022 Diego Caballero

[mlir][Vector] Generalize AVX2 transpose lowering to n-D vectors

The existing AVX2 lowering patterns for the transpose op only trigger if the
input vector is 2-D. This patch extends the patterns to trigger for n-D vectors
which are effectively 2-D vectors (e.g., vector<1x4x1x8x1xf32>). The main constraint
for the generalized AVX2 patterns to be applicable to these vectors is that the
dimensions that are greater than one must be transposed. Otherwise, the existing
patterns are not applicable.
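
The applicability condition described above can be sketched as a standalone check. This is a minimal illustration under stated assumptions, not the code from AVXTranspose.cpp: the function name `isEffectively2DTranspose`, the use of `std::vector`, and the convention that result dimension `i` takes source dimension `perm[i]` are all assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of MLIR): returns true when `shape` has
// exactly two dimensions greater than one and `perm` swaps their relative
// order, i.e. the n-D transpose is effectively a 2-D transpose.
// Assumed convention: result dimension i takes source dimension perm[i].
bool isEffectively2DTranspose(const std::vector<std::int64_t> &shape,
                              const std::vector<std::int64_t> &perm) {
  if (shape.size() != perm.size())
    return false;

  // Collect the source dimensions whose size is greater than one.
  std::vector<std::int64_t> nonUnitDims;
  for (std::size_t i = 0; i < shape.size(); ++i)
    if (shape[i] > 1)
      nonUnitDims.push_back(static_cast<std::int64_t>(i));
  if (nonUnitDims.size() != 2)
    return false;

  // Find where each of the two non-unit dimensions lands in the result.
  const std::size_t notFound = perm.size();
  std::size_t posFirst = notFound, posSecond = notFound;
  for (std::size_t i = 0; i < perm.size(); ++i) {
    if (perm[i] == nonUnitDims[0])
      posFirst = i;
    if (perm[i] == nonUnitDims[1])
      posSecond = i;
  }
  if (posFirst == notFound || posSecond == notFound)
    return false;

  // The two dimensions are transposed when their relative order is reversed.
  return posSecond < posFirst;
}
```

Under these assumptions, shape `1x4x1x8x1` with permutation `[0, 3, 2, 1, 4]` qualifies, while the identity permutation on the same shape does not.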

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D119505



Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init
# 42398b51 26-Jan-2022 Nicolas Vasilache

[mlir][LLVM] Add support for operand_attrs to InlineAsmOp

This revision adds enough support to allow InlineAsmOp to work properly with indirect memory constraints "*m".
These require an explicit "elementtype" TypeAttr on the operands to pass LLVM verification and need to be provided.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D118006



# 99ef9eeb 31-Jan-2022 Matthias Springer

[mlir][vector][NFC] Split into IR, Transforms and Utils

This reduces the dependencies of the MLIRVector target and makes the dialect consistent with other dialects.

Differential Revision: https://reviews.llvm.org/D118533



# 7ebd22c5 26-Jan-2022 Mehdi Amini

Revert "[mlir][LLVM] Add support for operand_attrs to InlineAsmOp"

This reverts commit e6ce2c0b8d5f8253791bf87145669c58328c30db.

The test is failing in CI right now.


# e6ce2c0b 26-Jan-2022 Nicolas Vasilache

[mlir][LLVM] Add support for operand_attrs to InlineAsmOp

This revision adds enough support to allow InlineAsmOp to work properly with indirect memory constraints "*m".
These require an explicit "elementtype" TypeAttr on the operands to pass LLVM verification and need to be provided.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D118006



Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 02b6fb21 20-Dec-2021 Mehdi Amini

Fix clang-tidy issues in mlir/ (NFC)

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D115956


Revision tags: llvmorg-13.0.1-rc1
# b2729fda 22-Nov-2021 Nicolas Vasilache

[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)

This revision follows up on the conversation titled:

```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths```

The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics-based and an inline_asm implementation.

This results in roughly 20% fewer cycles as reported by llvm-mca:

After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations: 100
Instructions: 5900
Total Cycles: 2415
Total uOps: 7300

Dispatch Width: 6
uOps Per Cycle: 3.02
IPC: 2.44
Block RThroughput: 24.0

Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
Resource Pressure [ 89.65% ]
- SKXPort1 [ 0.04% ]
- SKXPort2 [ 12.42% ]
- SKXPort3 [ 12.42% ]
- SKXPort5 [ 89.52% ]
Data Dependencies: [ 37.06% ]
- Register Dependencies [ 37.06% ]
- Memory Dependencies [ 0.00% ]
```

After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations: 100
Instructions: 6300
Total Cycles: 2015
Total uOps: 7700

Dispatch Width: 6
uOps Per Cycle: 3.82
IPC: 3.13
Block RThroughput: 20.0

Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
Resource Pressure [ 83.18% ]
- SKXPort0 [ 14.49% ]
- SKXPort1 [ 14.54% ]
- SKXPort2 [ 19.70% ]
- SKXPort3 [ 19.70% ]
- SKXPort5 [ 83.03% ]
- SKXPort6 [ 14.49% ]
Data Dependencies: [ 39.75% ]
- Register Dependencies [ 39.75% ]
- Memory Dependencies [ 0.00% ]
```
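
The "roughly 20%" figure can be checked against the two `Total Cycles` values reported above (2415 for the intrinsic version, 2015 for the inline_asm version). The following is a minimal sketch of that arithmetic; the helper names are hypothetical and the only inputs are the two numbers from the llvm-mca reports.

```cpp
// Hypothetical helpers for comparing two cycle counts.

// Fraction of cycles saved when going from `before` to `after`.
constexpr double relativeReduction(double before, double after) {
  return (before - after) / before;
}

// How many times more cycles `before` needs compared to `after`.
constexpr double cycleRatio(double before, double after) {
  return before / after;
}

// Values taken from the llvm-mca output above.
constexpr double kIntrinsicCycles = 2415.0;
constexpr double kInlineAsmCycles = 2015.0;

// (2415 - 2015) / 2415 is about 0.166: about 16.6% fewer cycles.
constexpr double kSaved = relativeReduction(kIntrinsicCycles, kInlineAsmCycles);

// 2415 / 2015 is about 1.199: the intrinsic version needs about 19.9% more cycles.
constexpr double kRatio = cycleRatio(kIntrinsicCycles, kInlineAsmCycles);
```

Read as a ratio, the intrinsic version takes close to 20% more cycles than the inline_asm version; read as a reduction relative to the intrinsic version, the saving is closer to 17%.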

An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).

Differential Revision: https://reviews.llvm.org/D114393



# e0b7bee7 22-Nov-2021 Mehdi Amini

Revert "[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)"

This reverts commit a9e236bed835c58be381dadb973a1db0681e4795.
This broke the Windows build:

mlir\include\mlir/Dialect/X86Vector/Transforms.h(28): error C2061: syntax error: identifier 'uint'
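
The error is consistent with `uint` not being a standard C++ type name: some C libraries (for example glibc, via `<sys/types.h>`) provide it as a typedef, while MSVC does not. A minimal sketch of a portable declaration follows; the struct and its members are hypothetical and are not taken from Transforms.h, and the log does not say how the header was actually fixed.

```cpp
#include <cstdint>

// Hypothetical options struct, for illustration only.
struct TransposeOptionsSketch {
  // Writing `uint numLanes = 8;` compiles only where a library header
  // happens to typedef `uint`. Standard spellings are portable:
  unsigned numLanes = 8;          // built-in unsigned int
  std::uint32_t blendMask = 0xCC; // fixed-width alternative from <cstdint>
};
```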



# a9e236be 22-Nov-2021 Nicolas Vasilache

[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)

This revision follows up on the conversation titled:

```[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths```

The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics-based and an inline_asm implementation.

This results in roughly 20% fewer cycles as reported by llvm-mca:

After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations: 100
Instructions: 5900
Total Cycles: 2415
Total uOps: 7300

Dispatch Width: 6
uOps Per Cycle: 3.02
IPC: 2.44
Block RThroughput: 24.0

Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
Resource Pressure [ 89.65% ]
- SKXPort1 [ 0.04% ]
- SKXPort2 [ 12.42% ]
- SKXPort3 [ 12.42% ]
- SKXPort5 [ 89.52% ]
Data Dependencies: [ 37.06% ]
- Register Dependencies [ 37.06% ]
- Memory Dependencies [ 0.00% ]
```

After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations: 100
Instructions: 6300
Total Cycles: 2015
Total uOps: 7700

Dispatch Width: 6
uOps Per Cycle: 3.82
IPC: 3.13
Block RThroughput: 20.0

Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
Resource Pressure [ 83.18% ]
- SKXPort0 [ 14.49% ]
- SKXPort1 [ 14.54% ]
- SKXPort2 [ 19.70% ]
- SKXPort3 [ 19.70% ]
- SKXPort5 [ 83.03% ]
- SKXPort6 [ 14.49% ]
Data Dependencies: [ 39.75% ]
- Register Dependencies [ 39.75% ]
- Memory Dependencies [ 0.00% ]
```

An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).

Reviewed By: ftynse, dcaballe

Differential Revision: https://reviews.llvm.org/D114335



# f04a1237 11-Nov-2021 Benjamin Kramer

[mlir][X86Vector] Fix unused variable warning


# a085c4b5 11-Nov-2021 Nicolas Vasilache

[mlir][Vector] Silence recently introduced warnings


# 34ff8573 10-Nov-2021 Nicolas Vasilache

[mlir][X86Vector] Add specialized vector.transpose lowering patterns for AVX2

This revision adds an implementation of 2-D vector.transpose for 4x8 and 8x8 for
AVX2 and surfaces it to the Linalg level of control.

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D113347
