|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
| #
d5999bd3 |
| 20-May-2022 |
Luo, Yuanke <[email protected]> |
[X86][AMX][NFC] Refactor X86LowerAMXCast.cpp
Change static function to X86LowerAMXCast member function.
Differential Revision: https://reviews.llvm.org/D126058
|
| #
ce9c0fac |
| 02-May-2022 |
Simon Pilgrim <[email protected]> |
[X86][AMX] combineLdSt - don't dereference dyn_cast. NFC
This leads to null pointer dereference warnings - use cast<> which will assert that the cast is correct.
|
|
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2 |
|
| #
942ec5c3 |
| 25-Apr-2022 |
Luo, Yuanke <[email protected]> |
[X86][AMX] combine tile cast and load/store instruction.
The `llvm.x86.cast.tile.to.vector` intrinsic is lowered to `llvm.x86.tilestored64.internal` and `load <256 x i32>`. The `llvm.x86.cast.vector.to.tile` intrinsic is lowered to `store <256 x i32>` and `llvm.x86.tileloadd64.internal`. When the result of `llvm.x86.cast.tile.to.vector` is used by `store <256 x i32>`, or the result of `load <256 x i32>` is used by `llvm.x86.cast.vector.to.tile`, the pair can be combined into a single `llvm.x86.tilestored64.internal` or `llvm.x86.tileloadd64.internal`, respectively.
Differential Revision: https://reviews.llvm.org/D124378
|
| #
9727c77d |
| 25-Apr-2022 |
David Green <[email protected]> |
[NFC] Rename Instrinsic to Intrinsic
|
|
Revision tags: llvmorg-14.0.1 |
|
| #
690bed0c |
| 10-Apr-2022 |
Luo, Yuanke <[email protected]> |
[X86][AMX] Fix infinite loop of getShape.
When walking the user chain to get the shape of a phi node, if a user in the chain is itself a phi node, we should walk to the users of that phi node instead of the users of the original phi node.
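A minimal sketch of the traversal described above, using a toy graph instead of LLVM IR. The `Node` type, its fields, and `findShape` are hypothetical names chosen for illustration and are not the code in the patch. The shape is reduced to a single `int`, and the visited set is an extra safeguard assumed here for cyclic phi chains rather than something taken from the commit.

```cpp
#include <cassert>
#include <optional>
#include <unordered_set>
#include <vector>

// Toy stand-in for an IR value; this is not an LLVM type.
struct Node {
  bool isPhi = false;
  std::optional<int> shape;  // set on users that carry a known shape
  std::vector<Node *> users;
};

// Walks the user chain of `v` until a user with a known shape is found.
std::optional<int> findShape(const Node *v,
                             std::unordered_set<const Node *> &visited) {
  if (!visited.insert(v).second)
    return std::nullopt;  // this node was already walked
  for (const Node *user : v->users) {
    if (user->shape)
      return user->shape;
    if (user->isPhi) {
      // Continue from the phi that was just reached. Recursing on `v`
      // again at this point would revisit the same users forever.
      if (std::optional<int> s = findShape(user, visited))
        return s;
    }
  }
  return std::nullopt;
}

// Convenience overload that owns the visited set.
std::optional<int> findShape(const Node *v) {
  std::unordered_set<const Node *> visited;
  return findShape(v, visited);
}
```

Calling `findShape(&phi)` on a phi whose only user is another phi continues the search from that second phi, which is the behavior the commit message describes.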
|
| #
6753eb0c |
| 30-Mar-2022 |
Luo, Yuanke <[email protected]> |
[X86][AMX] Materialize undef or zero value to tilezero
The AMX combiner would store undef or zero to the stack and invoke tileload to load the data into a tile register. To avoid the store/load, we can materialize an undef or zero value as tilezero.
Differential Revision: https://reviews.llvm.org/D122714
|
| #
1141c8b6 |
| 30-Mar-2022 |
Luo, Yuanke <[email protected]> |
[X86][AMX] Fix bug for amx cast transform
After combining AMX cast operations, some AMX cast intrinsics may become dead code. This patch deletes such dead code to avoid a crash.
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init |
|
| #
e188aae4 |
| 31-Jan-2022 |
serge-sans-paille <[email protected]> |
Cleanup header dependencies in LLVMCore
Based on the output of include-what-you-use.
This is a big chunk of changes. It is very likely to break downstream code unless they took a lot of care in avoiding hidden header dependencies, something the LLVM codebase doesn't do that well :-/
I've tried to summarize the biggest changes below:
- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h
- llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h
- llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h
- llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h
- llvm/IR/LegacyPassManager.h no longer includes llvm/Pass.h
- llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h
- llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h
And the usual count of preprocessed lines:
$ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 6400831
after: 6189948
200k fewer lines to process is not that bad ;-)
Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D118652
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
7ca14f60 |
| 18-Nov-2021 |
Kazu Hirata <[email protected]> |
[llvm] Use range-based for loops (NFC)
|
| #
2c4ba3e9 |
| 05-Nov-2021 |
Kazu Hirata <[email protected]> |
[Target] Use make_early_inc_range (NFC)
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
|
| #
55ba1de7 |
| 27-Aug-2021 |
Vince Bridgers <[email protected]> |
[X86] Remove X86LowerAMXType::getRowFromCol from X86LowerAMXType.cpp
Remove method X86LowerAMXType::getRowFromCol since it's not used, and it's causing a warning.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D108862
|
|
Revision tags: llvmorg-13.0.0-rc2 |
|
| #
c17f5afa |
| 26-Aug-2021 |
Simon Pilgrim <[email protected]> |
[X86] getShape - don't dereference dyn_cast<>
dyn_cast can return nullptr, use cast<> to assert we have the correct type.
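The distinction behind this change can be sketched in plain C++ without LLVM headers. This is an illustration only: `try_cast` and `checked_cast` are hypothetical stand-ins for LLVM's `dyn_cast<>` and `cast<>`, built on `dynamic_cast` rather than LLVM's own RTTI scheme, and `Value`, `Instruction`, and `Constant` are toy classes. `opcodeOf` is likewise a made-up helper.

```cpp
#include <cassert>

// Toy class hierarchy; these are NOT the LLVM classes of the same name.
struct Value {
  virtual ~Value() = default;
};
struct Instruction : Value {
  int opcode = 0;
};
struct Constant : Value {};

// Hypothetical analogue of dyn_cast<>: returns nullptr when the type does
// not match, so the result must be checked before it is dereferenced.
template <typename To, typename From>
To *try_cast(From *v) {
  return dynamic_cast<To *>(v);
}

// Hypothetical analogue of cast<>: asserts that the type matches, so the
// result can be dereferenced directly.
template <typename To, typename From>
To *checked_cast(From *v) {
  To *result = dynamic_cast<To *>(v);
  assert(result && "checked_cast: argument has an incompatible type");
  return result;
}

// Reads the opcode of a value that the caller knows to be an Instruction.
int opcodeOf(Value *v) {
  // Writing try_cast<Instruction>(v)->opcode here would dereference a
  // possibly-null pointer, which is what static analyzers warn about.
  return checked_cast<Instruction>(v)->opcode;
}
```

When the caller already knows the dynamic type, the asserting form states that expectation explicitly. When the type is genuinely uncertain, the nullable form followed by an explicit null check remains the appropriate choice.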
|
| #
ffe58de3 |
| 18-Aug-2021 |
Bing1 Yu <[email protected]> |
[X86] [AMX] Fix the test case failure caused by D107544.
The issue can be reproduced when EXPENSIVE_CHECKS is enabled for the LLVM build. Thanks to Simon for reporting this issue at https://bugs.llvm.org/show_bug.cgi?id=51513. We need to return the correct value when the IR is changed.
Reviewed By: RKSimon, LuoYuanke
Differential Revision: https://reviews.llvm.org/D108269
|
| #
bcec4ccd |
| 05-Aug-2021 |
Bing1 Yu <[email protected]> |
[X86] [AMX] Replace bitcast with specific AMX intrinsics with X86 specific cast.
There is some discussion on the bitcast for vector and x86_amx at https://reviews.llvm.org/D99152. This patch introduces an x86-specific cast for vector and x86_amx, so that it can avoid some unnecessary optimization by the middle-end. On the other hand, we have to optimize the x86-specific cast ourselves. This patch also optimizes the cast operation to eliminate redundant code.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D107544
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2 |
|
| #
56d5c46b |
| 11-Jun-2021 |
Bing1 Yu <[email protected]> |
[X86] Support __tile_stream_loadd intrinsic for new AMX interface
Adding support for __tile_stream_loadd intrinsic.
Reviewed By: LuoYuanke
Differential Revision: https://reviews.llvm.org/D103784
|
| #
63233da7 |
| 10-Jun-2021 |
Luo, Yuanke <[email protected]> |
[X86][NFC] Fix typo.
|
|
Revision tags: llvmorg-12.0.1-rc1 |
|
| #
d4bdeca5 |
| 08-May-2021 |
Xiang1 Zhang <[email protected]> |
[X86] Support AMX fast register allocation
Differential Revision: https://reviews.llvm.org/D100026
|
| #
bebafe01 |
| 08-May-2021 |
Xiang1 Zhang <[email protected]> |
Revert "[X86] Support AMX fast register allocation"
This reverts commit 77e2e5e07d01fe0b83c39d0c527c0d3d2e659146.
|
| #
77e2e5e0 |
| 07-May-2021 |
Xiang1 Zhang <[email protected]> |
[X86] Support AMX fast register allocation
|
| #
df323ba4 |
| 29-Apr-2021 |
Benjamin Kramer <[email protected]> |
Revert "[X86] Support AMX fast register allocation"
This reverts commit 3b8ec86fd576b9808dc63da620d9a4f7bbe04372.
Revert "[X86] Refine AMX fast register allocation"
This reverts commit c3f95e9197643b699b891ca416ce7d72cf89f5fc.
This pass breaks using LLVM in a multi-threaded environment by introducing global state.
|
| #
3b8ec86f |
| 07-Apr-2021 |
Xiang1 Zhang <[email protected]> |
[X86] Support AMX fast register allocation
Differential Revision: https://reviews.llvm.org/D100026
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
4f198b0c |
| 16-Mar-2021 |
Bing1 Yu <[email protected]> |
[X86] Pass to transform amx intrinsics to scalar operation.
This pass runs in all situations, but we skip it when the optimization level is not O0 and the function doesn't have the optnone attribute. With -O0, the definition of the shape passed to the AMX intrinsics is near the AMX intrinsics code. We are not able to find a point which post-dominates all the shape definitions and dominates all the AMX intrinsics. To decouple the dependency on the shape, we transform AMX intrinsics to scalar operations, so that compilation doesn't fail. In the long term, we should improve fast register allocation to allocate AMX registers.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D93594
|
|
Revision tags: llvmorg-12.0.0-rc3 |
|
| #
3fd2fa12 |
| 05-Mar-2021 |
Simon Pilgrim <[email protected]> |
Revert rG8198d83965ba4b9db6922b44ef3041030b2bac39: "[X86] Pass to transform amx intrinsics to scalar operation."
This reverts commit 8198d83965ba4b9db6922b44ef3041030b2bac39 due to buildbot breakages.
|
| #
8198d839 |
| 04-Mar-2021 |
Luo, Yuanke <[email protected]> |
[X86] Pass to transform amx intrinsics to scalar operation.
This pass runs in all situations, but we skip it when the optimization level is not O0 and the function doesn't have the optnone attribute. With -O0, the definition of the shape passed to the AMX intrinsics is near the AMX intrinsics code. We are not able to find a point which post-dominates all the shape definitions and dominates all the AMX intrinsics. To decouple the dependency on the shape, we transform AMX intrinsics to scalar operations, so that compilation doesn't fail. In the long term, we should improve fast register allocation to allocate AMX registers.
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D93594
|
| #
4bc7c863 |
| 24-Feb-2021 |
Liu, Chen3 <[email protected]> |
[X86] Support amx-bf16 intrinsic.
Adding support for intrinsics of AMX-BF16. This patch also fixes a bug where AMX-INT8 instructions would be selected with the wrong predicate.
Differential Revision: https://reviews.llvm.org/D97358
|