|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5 |
|
| #
335e8bf1 |
| 08-Jun-2022 |
Quinn Pham <[email protected]> |
[PowerPC] emit VSX instructions instead of VMX instructions for vector loads and stores
This patch changes the PowerPC backend to generate VSX load/store instructions for all vector loads/stores on
[PowerPC] emit VSX instructions instead of VMX instructions for vector loads and stores
This patch changes the PowerPC backend to generate VSX load/store instructions for all vector loads/stores on Power8 and earlier (LE) instead of VMX load/store instructions. The reason for this change is because VMX instructions require the vector to be 16-byte aligned. So, a vector load/store will fail with VMX instructions if the vector is misaligned. Also, `gcc` generates VSX instructions in this situation which allow for unaligned access but require a swap instruction after loading/before storing. This is not an issue for BE because we already emit VSX instructions since no swap is required. And this is not an issue on Power9 and up since we have access to `lxv[x]`/`stxv[x]` which allow for unaligned access and do not require swaps.
This patch also delays the VSX load/store for LE combines until after LegalizeOps to prioritize other load/store combines.
Reviewed By: #powerpc, stefanp
Differential Revision: https://reviews.llvm.org/D127309
show more ...
|
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
300e1293 |
| 15-Mar-2022 |
Qiu Chaofan <[email protected]> |
[PowerPC] Disable perfect shuffle by default
We are going to remove the old 'perfect shuffle' optimization since it brings performance penalty in hot loop around vectors. For example, in following l
[PowerPC] Disable perfect shuffle by default
We are going to remove the old 'perfect shuffle' optimization since it brings performance penalty in hot loop around vectors. For example, in following loop sharing the same mask:
%v.1 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27> %v.2 = shufflevector ... <0,1,2,3,8,9,10,11,16,17,18,19,24,25,26,27>
The generated instructions will be `vmrglw-vmrghw-vmrglw-vmrghw` instead of `vperm-vperm`. In some large loop cases, this causes 20%+ performance penalty.
The original attempt to resolve this is to pre-record masks of every shufflevector operation in DAG, but that is somewhat complex and brings unnecessary computation (to scan all nodes) in optimization. Here we disable it by default. There're indeed some cases becoming worse after this, which will be fixed in a more careful way in future patches.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D121082
show more ...
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4 |
|
| #
766ca2c5 |
| 11-Mar-2022 |
Nemanja Ivanovic <[email protected]> |
[PowerPC] Add missed VSX shuffles instead of Altivec ones
VSX introduced some permute instructions that are direct replacements for Altivec ones except they can target all the VSX registers. We have
[PowerPC] Add missed VSX shuffles instead of Altivec ones
VSX introduced some permute instructions that are direct replacements for Altivec ones except they can target all the VSX registers. We have added code generation for most of these but somehow missed the low/hi word merges (XXMRG[LH]W). This caused some additional spills on some large computationally intensive code.
This patch simply adds the missed patterns.
show more ...
|
|
Revision tags: llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
4195ed99 |
| 30-Sep-2021 |
Albion Fung <[email protected]> |
[PowerPC] Improved codegen related to xscvdpsxws/xscvdpuxws
This patch removes the uneccessary mf/mtvsr generated in conjunction with xscvdpsxws/xscvdpuxws.
Differential revision: https://reviews.l
[PowerPC] Improved codegen related to xscvdpsxws/xscvdpuxws
This patch removes the uneccessary mf/mtvsr generated in conjunction with xscvdpsxws/xscvdpuxws.
Differential revision: https://reviews.llvm.org/D109902
show more ...
|
| #
3678df5a |
| 24-Sep-2021 |
Albion Fung <[email protected]> |
[PowerPC][NFC] Add test case in preparation for codegen change
This test case tests doubles inserted into vector ints, and help make apparent the optimizations a future patch will make.
|