ds-sub-offset.ll - OpenGrok history log for /llvm-project-15.0.7/llvm/test/CodeGen/AMDGPU/ds-sub-offset.ll

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init
# 8b756713	08-Jul-2022	Sanjay Patel <[email protected]>	[SDAG] try to replace subtract-from-constant with xor This is almost the same as the abandoned D48529, but it allows splat vector constants too. This replaces the x86-specific code that was added w [SDAG] try to replace subtract-from-constant with xor This is almost the same as the abandoned D48529, but it allows splat vector constants too. This replaces the x86-specific code that was added with the alternate patch D48557 with the original generic combine. This transform is a less restricted form of an existing InstCombine and the proposed SDAG equivalent for that in D128080: https://alive2.llvm.org/ce/z/OUm6N_ Differential Revision: https://reviews.llvm.org/D128123 show more ...
# 5cae8816	06-Jul-2022	Jay Foad <[email protected]>	[AMDGPU] Add GFX11 test coverage Add GFX11 test coverage to a bunch of tests where it was easy to do so, mostly because the checks are autogenerated and/or GFX11 can share the same checks as GFX10. [AMDGPU] Add GFX11 test coverage Add GFX11 test coverage to a bunch of tests where it was easy to do so, mostly because the checks are autogenerated and/or GFX11 can share the same checks as GFX10. Differential Revision: https://reviews.llvm.org/D129295 show more ...
# e44dcfb0	30-Jun-2022	Sanjay Patel <[email protected]>	[AMDGPU] add alternate tests for max-offset codegen; NFC As discussed in D128123, the existing test shows a possible regression when converting sub to xor. This adds tests that avoid that pattern bu [AMDGPU] add alternate tests for max-offset codegen; NFC As discussed in D128123, the existing test shows a possible regression when converting sub to xor. This adds tests that avoid that pattern but still has a offset near 65535. Also, add a test with the canonical IR for the existing test to show if the transform is happening with the expected pattern in IR. show more ...
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# 16cf9e6d	07-Apr-2022	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Fix handling of gfx10 LDS misaligned access bug It was only handled for FLAT initially because we did not have unaligned DS instructions lowering. Now it is implemented but the bug is not h [AMDGPU] Fix handling of gfx10 LDS misaligned access bug It was only handled for FLAT initially because we did not have unaligned DS instructions lowering. Now it is implemented but the bug is not handled. Differential Revision: https://reviews.llvm.org/D123338 show more ...
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3
# 89c447e4	16-Jan-2022	Matt Arsenault <[email protected]>	AMDGPU: Stop reserving 36-bytes before kernel arguments for amdpal This was inheriting the mesa behavior, and as far as I know nobody is using opencl kernels with amdpal. The isMesaKernel check was AMDGPU: Stop reserving 36-bytes before kernel arguments for amdpal This was inheriting the mesa behavior, and as far as I know nobody is using opencl kernels with amdpal. The isMesaKernel check was irrelevant because this property needs to be held for all functions. show more ...
Revision tags: llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1
# 3f34f75a	22-Oct-2021	Jay Foad <[email protected]>	[AMDGPU] Fix latency for implicit vcc_lo operands on GFX10 wave32 As described in the comment, the way we change vcc to vcc_lo in these operands confuses addPhysRegDataDeps into treating them as imp [AMDGPU] Fix latency for implicit vcc_lo operands on GFX10 wave32 As described in the comment, the way we change vcc to vcc_lo in these operands confuses addPhysRegDataDeps into treating them as implicit pseudo operands. Fix this by setting the correct latency from the SchedModel after addPhysRegDataDeps wrongly set it to 0. Differential Revision: https://reviews.llvm.org/D112317 show more ...
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3
# 3ce1b963	08-Sep-2021	Joe Nash <[email protected]>	[AMDGPU] Switch PostRA sched to MachineSched Use GCNHazardRecognizer in postra sched. Updated tests for the new schedules. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D1095 [AMDGPU] Switch PostRA sched to MachineSched Use GCNHazardRecognizer in postra sched. Updated tests for the new schedules. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D109536 Change-Id: Ia86ba2ae168f12fb34b4d8efdab491f84d936cde show more ...
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init
# 4a3b0556	09-Jul-2021	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Fix flags of V_MOV_B64_PSEUDO In particular it was not rematerializable. Differential Revision: https://reviews.llvm.org/D105724
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# caf1294d	26-Apr-2021	Baptiste Saleil <[email protected]>	[AMDGPU] Experiments show that the GCNRegBankReassign pass significantly impacts the compilation time and there is no case for which we see any improvement in performance. This patch removes this pas [AMDGPU] Experiments show that the GCNRegBankReassign pass significantly impacts the compilation time and there is no case for which we see any improvement in performance. This patch removes this pass and its associated test cases from the tree. Differential Revision: https://reviews.llvm.org/D101313 Change-Id: I0599169a7609c19a887f8d847a71e664030cc141 show more ...
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4
# b082e6f8	29-Mar-2021	Petar Avramovic <[email protected]>	[AMDGPU] Extend gfx10 test coverage. NFC. Differential Revision: https://reviews.llvm.org/D99267
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1
# 2f499b9a	19-Dec-2020	Tony <[email protected]>	[AMDGPU] Add volatile support to SIMemoryLegalizer Treat a non-atomic volatile load and store as a relaxed atomic at system scope for the address spaces accessed. This will ensure all relevant cache [AMDGPU] Add volatile support to SIMemoryLegalizer Treat a non-atomic volatile load and store as a relaxed atomic at system scope for the address spaces accessed. This will ensure all relevant caches will be bypassed. A volatile atomic is not changed and still only bypasses caches upto the level specified by the SyncScope operand. Differential Revision: https://reviews.llvm.org/D94214 show more ...
# 1ebe86ad	05-Jan-2021	Mircea Trofin <[email protected]>	[NFC] Removed unused prefixes in test/CodeGen/AMDGPU More patches to follow. Differential Revision: https://reviews.llvm.org/D94121
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1
# 040c5027	02-Nov-2020	Jay Foad <[email protected]>	[AMDGPU] Fix ds_read2/write2 with unaligned offsets These instructions use a scaled offset. We were wrongly selecting them even when the required offset was not a multiple of the scale factor. Diff [AMDGPU] Fix ds_read2/write2 with unaligned offsets These instructions use a scaled offset. We were wrongly selecting them even when the required offset was not a multiple of the scale factor. Differential Revision: https://reviews.llvm.org/D90607 show more ...
# 32897c05	03-Nov-2020	Jay Foad <[email protected]>	[AMDGPU] Specify a triple to avoid codegen changes depending on host OS
# 0892d2a3	02-Nov-2020	Jay Foad <[email protected]>	Revert "Fix ds_read2/write2 unaligned offsets" This reverts commit 2e7e898c8f0b38dc11fbce2553fc715067aaf42f. It was committed by mistake.
# 2e7e898c	02-Nov-2020	Jay Foad <[email protected]>	Fix ds_read2/write2 unaligned offsets
# f3881d65	02-Nov-2020	Jay Foad <[email protected]>	[AMDGPU] Generate test checks. NFC.
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3
# 2710171a	25-Jun-2019	Nicolai Haehnle <[email protected]>	AMDGPU: Write LDS objects out as global symbols in code generation Summary: The symbols use the processor-specific SHN_AMDGPU_LDS section index introduced with a previous change. The linker is then AMDGPU: Write LDS objects out as global symbols in code generation Summary: The symbols use the processor-specific SHN_AMDGPU_LDS section index introduced with a previous change. The linker is then expected to resolve relocations, which are also emitted. Initially disabled for HSA and PAL environments until they have caught up in terms of linker and runtime loader. Some notes: - The llvm.amdgcn.groupstaticsize intrinsics can no longer be lowered to a constant at compile times, which means some tests can no longer be applied. The current "solution" is a terrible hack, but the intrinsic isn't used by Mesa, so we can keep it for now. - We no longer know the full LDS size per kernel at compile time, which means that we can no longer generate a relevant error message at compile time. It would be possible to add a check for the size of individual variables, but ultimately the linker will have to perform the final check. Change-Id: If66dbf33fccfbf3609aefefa2558ac0850d42275 Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin Subscribers: qcolombet, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61494 llvm-svn: 364297 show more ...
Revision tags: llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1
# 9bef688b	02-Apr-2019	Michael Liao <[email protected]>	[AMDGPU] Add more test cases of D59608. Summary: - Add more test cases. Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm [AMDGPU] Add more test cases of D59608. Summary: - Add more test cases. Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60071 llvm-svn: 357442 show more ...
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3
# 84445dd1	30-Nov-2017	Matt Arsenault <[email protected]>	AMDGPU: Use gfx9 carry-less add/sub instructions llvm-svn: 319491
Revision tags: llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2, llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1
# 3dbeefa9	21-Mar-2017	Matt Arsenault <[email protected]>	AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default ca AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444 show more ...
Revision tags: llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1, llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1, llvmorg-3.8.1, llvmorg-3.8.1-rc1, llvmorg-3.8.0, llvmorg-3.8.0-rc3
# 9c47dd58	11-Feb-2016	Matt Arsenault <[email protected]>	AMDGPU: Remove some old intrinsic uses from tests llvm-svn: 260493
Revision tags: llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1
# 2aed6ca1	19-Dec-2015	Matt Arsenault <[email protected]>	AMDGPU: Switch barrier intrinsics to using convergent noduplicate prevents unrolling of small loops that happen to have barriers in them. If a loop has a barrier in it, it is OK to duplicate it for AMDGPU: Switch barrier intrinsics to using convergent noduplicate prevents unrolling of small loops that happen to have barriers in them. If a loop has a barrier in it, it is OK to duplicate it for the unroll. llvm-svn: 256075 show more ...
Revision tags: llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1
# 966a94f8	08-Sep-2015	Matt Arsenault <[email protected]>	AMDGPU: Handle sub of constant for DS offset folding sub C, x - > add (sub 0, x), C for DS offsets. This is mostly to fix regressions that show up when SeparateConstOffsetFromGEP is enabled. llvm- AMDGPU: Handle sub of constant for DS offset folding sub C, x - > add (sub 0, x), C for DS offsets. This is mostly to fix regressions that show up when SeparateConstOffsetFromGEP is enabled. llvm-svn: 247054 show more ...