[mlir] add complex type to getZeroAttr

Fixes issue encountered with <sparse> complex constant
https://github.com/llvm/llvm-project/issues/56428

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D129325
[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)

This revision follows up on the conversation titled:
```
[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths
```

The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics and an inline_asm implementation.

This results in roughly 20% fewer cycles as reported by llvm-mca:

After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations:        100
Instructions:      5900
Total Cycles:      2415
Total uOps:        7300

Dispatch Width:    6
uOps Per Cycle:    3.02
IPC:               2.44
Block RThroughput: 24.0

Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
  Resource Pressure       [ 89.65% ]
  - SKXPort1  [ 0.04% ]
  - SKXPort2  [ 12.42% ]
  - SKXPort3  [ 12.42% ]
  - SKXPort5  [ 89.52% ]
  Data Dependencies:      [ 37.06% ]
  - Register Dependencies [ 37.06% ]
  - Memory Dependencies   [ 0.00% ]
```

After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations:        100
Instructions:      6300
Total Cycles:      2015
Total uOps:        7700

Dispatch Width:    6
uOps Per Cycle:    3.82
IPC:               3.13
Block RThroughput: 20.0

Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
  Resource Pressure       [ 83.18% ]
  - SKXPort0  [ 14.49% ]
  - SKXPort1  [ 14.54% ]
  - SKXPort2  [ 19.70% ]
  - SKXPort3  [ 19.70% ]
  - SKXPort5  [ 83.03% ]
  - SKXPort6  [ 14.49% ]
  Data Dependencies:      [ 39.75% ]
  - Register Dependencies [ 39.75% ]
  - Memory Dependencies   [ 0.00% ]
```

An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).

Differential Revision: https://reviews.llvm.org/D114393
Revert "[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)"

This reverts commit a9e236bed835c58be381dadb973a1db0681e4795.

This broke the Windows build:

mlir\include\mlir/Dialect/X86Vector/Transforms.h(28): error C2061: syntax error: identifier 'uint'
[mlir][Vector] Add a vblendps-based impl for transpose8x8 (both intrin and inline_asm)

This revision follows up on the conversation titled:
```
[llvm-dev] Understanding and controlling some of the AVX shuffle emission paths
```

The revision adds a vblendps-based implementation for transpose8x8 and further distinguishes between an intrinsics and an inline_asm implementation.

This results in roughly 20% fewer cycles as reported by llvm-mca:

After this revision (intrinsic version, resolves to virtually identical assembly as per the llvm-dev discussion, no vblendps instruction is emitted):
```
Iterations:        100
Instructions:      5900
Total Cycles:      2415
Total uOps:        7300

Dispatch Width:    6
uOps Per Cycle:    3.02
IPC:               2.44
Block RThroughput: 24.0

Cycles with backend pressure increase [ 89.90% ]
Throughput Bottlenecks:
  Resource Pressure       [ 89.65% ]
  - SKXPort1  [ 0.04% ]
  - SKXPort2  [ 12.42% ]
  - SKXPort3  [ 12.42% ]
  - SKXPort5  [ 89.52% ]
  Data Dependencies:      [ 37.06% ]
  - Register Dependencies [ 37.06% ]
  - Memory Dependencies   [ 0.00% ]
```

After this revision (inline_asm version, vblendps instructions are indeed emitted):
```
Iterations:        100
Instructions:      6300
Total Cycles:      2015
Total uOps:        7700

Dispatch Width:    6
uOps Per Cycle:    3.82
IPC:               3.13
Block RThroughput: 20.0

Cycles with backend pressure increase [ 83.47% ]
Throughput Bottlenecks:
  Resource Pressure       [ 83.18% ]
  - SKXPort0  [ 14.49% ]
  - SKXPort1  [ 14.54% ]
  - SKXPort2  [ 19.70% ]
  - SKXPort3  [ 19.70% ]
  - SKXPort5  [ 83.03% ]
  - SKXPort6  [ 14.49% ]
  Data Dependencies:      [ 39.75% ]
  - Register Dependencies [ 39.75% ]
  - Memory Dependencies   [ 0.00% ]
```

An accessible copy of the conversation is available [here](https://gist.github.com/nicolasvasilache/68c7f34012584b0e00f335bcb374ede0).

Reviewed By: ftynse, dcaballe

Differential Revision: https://reviews.llvm.org/D114335
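As a sanity check on the "roughly 20%" figure, the derived ratios can be recomputed from the raw counts in the two llvm-mca summaries. This is a minimal sketch: the dictionary and function names are illustrative and not part of the revision; only the counts come from the reports quoted in the commit message.

```python
# Raw counts copied from the two llvm-mca summaries in the commit message.
# The names `intrin`, `inline_asm`, `ipc`, `uops_per_cycle` are illustrative only.
intrin = {"instructions": 5900, "cycles": 2415, "uops": 7300}
inline_asm = {"instructions": 6300, "cycles": 2015, "uops": 7700}


def ipc(report):
    """Instructions retired per cycle."""
    return report["instructions"] / report["cycles"]


def uops_per_cycle(report):
    """Micro-ops dispatched per cycle."""
    return report["uops"] / report["cycles"]


# Fraction of cycles saved by the inline_asm version relative to the intrinsic one.
cycle_reduction = 1 - inline_asm["cycles"] / intrin["cycles"]

# Equivalent speedup factor (old cycles / new cycles).
speedup = intrin["cycles"] / inline_asm["cycles"]

print(f"IPC: {ipc(intrin):.2f} -> {ipc(inline_asm):.2f}")
print(f"uOps/cycle: {uops_per_cycle(intrin):.2f} -> {uops_per_cycle(inline_asm):.2f}")
print(f"cycle reduction: {cycle_reduction:.1%}, speedup: {speedup:.2f}x")
```

On these counts the cycle reduction works out to about 16.6%, which corresponds to a speedup of about 1.20x, so the "roughly 20%" in the message reads most naturally as the speedup factor.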
Move the MLIR integration tests as a subdirectory of test (NFC)

This does not change the behavior directly: the tests only run when `-DMLIR_INCLUDE_INTEGRATION_TESTS=ON` is configured. However, running `ninja check-mlir` will now run all the tests within a single lit invocation. The previous behavior would wait for all the integration tests to complete before starting to run the first regular test. The test results were also reported separately. This change unifies all of this and allows concurrent execution of the integration tests with regular non-regression and unit tests.

Differential Revision: https://reviews.llvm.org/D97241