SIPeepholeSDWA.cpp - OpenGrok history log for /llvm-project-15.0.7/llvm/lib/Target/AMDGPU/SIPeepholeSDWA.cpp

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2
# 6527b2a4	18-Feb-2022	Sebastian Neubauer <[email protected]>	[AMDGPU][NFC] Fix typos Fix some typos in the amdgpu backend. Differential Revision: https://reviews.llvm.org/D119235
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4
# 399b7de0	18-Sep-2021	Christudasan Devadasan <[email protected]>	[AMDGPU] Add a regclass flag for scalar registers Along with vector RC flags, this scalar flag will make various regclass queries like `isVGPR` more accurate. Regclasses other than vectors are curr [AMDGPU] Add a regclass flag for scalar registers Along with vector RC flags, this scalar flag will make various regclass queries like `isVGPR` more accurate. Regclasses other than vectors are currently set with the new flag even though certain unallocatable classes aren't truly scalars. It would be ok as long as they remain unallocatable. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D110053 show more ...
Revision tags: llvmorg-13.0.0-rc3
# 654c89d8	06-Sep-2021	Christudasan Devadasan <[email protected]>	[AMDGPU] Make vector superclasses allocatable The combined vector register classes with both VGPRs and AGPRs are currently unallocatable. This patch turns them into allocatable as a prerequisite to [AMDGPU] Make vector superclasses allocatable The combined vector register classes with both VGPRs and AGPRs are currently unallocatable. This patch turns them into allocatable as a prerequisite to enable copy between VGPR and AGPR registers during regalloc. Also, added the missing AV register classes from 192b to 1024b. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D109300 show more ...
# d1f45ed5	11-Nov-2021	Neubauer, Sebastian <[email protected]>	[AMDGPU][NFC] Fix typos Differential Revision: https://reviews.llvm.org/D113672
Revision tags: llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2
# 560d7e04	20-Jan-2021	dfukalov <[email protected]>	[NFC][AMDGPU] Split AMDGPUSubtarget.h to R600 and GCN subtargets ... to reduce headers dependency. Reviewed By: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D95036
Revision tags: llvmorg-11.1.0-rc1
# 314e29ed	07-Jan-2021	Joe Nash <[email protected]>	[AMDGPU] Add _e64 suffix to VOP3 Insts Previously, instructions which could be expressed as VOP3 in addition to another encoding had a _e64 suffix on the tablegen record name, while those only avail [AMDGPU] Add _e64 suffix to VOP3 Insts Previously, instructions which could be expressed as VOP3 in addition to another encoding had a _e64 suffix on the tablegen record name, while those only available as VOP3 did not. With this patch, all VOP3s will have the _e64 suffix. The assembly does not change, only the mir. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D94341 Change-Id: Ia8ec8890d47f8f94bbbdac43745b4e9dd2b03423 show more ...
# 6a87e9b0	25-Dec-2020	dfukalov <[email protected]>	[NFC][AMDGPU] Reduce include files dependency. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D93813
Revision tags: llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3
# 34978602	20-Aug-2020	Jay Foad <[email protected]>	[AMDGPU] Remove uses of Register::isPhysicalRegister/isVirtualRegister ... in favour of the isPhysical/isVirtual methods.
Revision tags: llvmorg-11.0.0-rc2
# 0462aef5	13-Aug-2020	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Inhibit SDWA if target instruction has FI Differential Revision: https://reviews.llvm.org/D85918
Revision tags: llvmorg-11.0.0-rc1, llvmorg-12-init
# 79f67cae	14-Jul-2020	Matt Arsenault <[email protected]>	AMDGPU: Rename add/sub with carry out instructions The hardware has created a real mess in the naming for add/sub, which have been renamed basically every generation. Switch the carry out pseudos to AMDGPU: Rename add/sub with carry out instructions The hardware has created a real mess in the naming for add/sub, which have been renamed basically every generation. Switch the carry out pseudos to have the gfx9/gfx10 names. We were using the original SI/CI v_add_i32/v_sub_i32 names. Later targets reintroduced these names as carryless instructions with a saturating clamp bit, which we do not define. Do this rename so we can unambiguously add these missing instructions. The carry-in versions should also be renamed, but at least those had a consistent _u32 name to begin with. The 16-bit instructions were also renamed, but aren't ambiguous. This does regress assembler error message quality in some cases. In mismatched wave32/wave64 situations, this will switch from "unsupported instruction" to "invalid operand", with the error pointing at the wrong position. I couldn't quite follow how the assembler selects these, but the previous behavior seemed accidental to me. It looked like there was a partial attempt to handle this which was never completed (i.e. there is an AMDGPUOperand::isBoolReg but it isn't used for anything). show more ...
Revision tags: llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2
# 07cd19ef	27-May-2020	Matt Arsenault <[email protected]>	AMDGPU: Fix dropping MI flags when rewriting instructions All 3 passes that change instruction encodings were dropping MI flags. This avoids scheduling regressions caused by setting mayRaiseFPExcept AMDGPU: Fix dropping MI flags when rewriting instructions All 3 passes that change instruction encodings were dropping MI flags. This avoids scheduling regressions caused by setting mayRaiseFPExceptions on FP instructions for non-strictfp functions. show more ...
Revision tags: llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2
# a19de320	12-Feb-2020	Hans Wennborg <[email protected]>	Fix unused function warning (PR44808)
Revision tags: llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2
# 3d5ba7c6	27-Nov-2019	Tim Renouf <[email protected]>	AMDGPU: Fixed indeterminate map iteration in SIPeepholeSDWA Differential Revision: https://reviews.llvm.org/D70783 Change-Id: Ic26f915a4acb4c00ecefa9d09d7c24cec370ed06
Revision tags: llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3
# 0c476111	15-Aug-2019	Daniel Sanders <[email protected]>	Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Re Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM Summary: This clang-tidy check is looking for unsigned integer variables whose initializer starts with an implicit cast from llvm::Register and changes the type of the variable to llvm::Register (dropping the llvm:: where possible). Partial reverts in: X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned& MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register PPCFastISel.cpp - No Register::operator-=() PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned& MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor Manual fixups in: ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned& HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register. PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned& Depends on D65919 Reviewers: arsenm, bogner, craig.topper, RKSimon Reviewed By: arsenm Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65962 llvm-svn: 369041 show more ...
# 0eaee545	15-Aug-2019	Jonas Devlieghere <[email protected]>	[llvm] Migrate llvm::make_unique to std::make_unique Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of [llvm] Migrate llvm::make_unique to std::make_unique Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013 show more ...
Revision tags: llvmorg-9.0.0-rc2
# 2bea69bf	01-Aug-2019	Daniel Sanders <[email protected]>	Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC llvm-svn: 367633
Revision tags: llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3
# 52500216	16-Jun-2019	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] gfx10 conditional registers handling This is cpp source part of wave32 support, excluding overriden getRegClass(). Differential Revision: https://reviews.llvm.org/D63351 llvm-svn: 363513
Revision tags: llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1
# 28a1936f	04-May-2019	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] gfx1010: use fmac instructions Differential Revision: https://reviews.llvm.org/D61527 llvm-svn: 359959
Revision tags: llvmorg-8.0.0
# da644c02	13-Mar-2019	Stanislav Mekhanoshin <[email protected]>	[AMDGPU] Silence gcc 7 warnings Differential Revision: https://reviews.llvm.org/D59330 llvm-svn: 356100
Revision tags: llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1
# 2946cd70	19-Jan-2019	Chandler Carruth <[email protected]>	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the ne Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636 show more ...
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3
# 16de4fd2	03-Dec-2018	Ron Lieberman <[email protected]>	[AMDGPU] Add sdwa support for ADD\|SUB U64 decomposed Pseudos The introduction of S_{ADD\|SUB}_U64_PSEUDO instructions which are decomposed into VOP3 instruction pairs for S_ADD_U64_PSEUDO: V_ADD_I3 [AMDGPU] Add sdwa support for ADD\|SUB U64 decomposed Pseudos The introduction of S_{ADD\|SUB}_U64_PSEUDO instructions which are decomposed into VOP3 instruction pairs for S_ADD_U64_PSEUDO: V_ADD_I32_e64 V_ADDC_U32_e64 and for S_SUB_U64_PSEUDO V_SUB_I32_e64 V_SUBB_U32_e64 preclude the use of SDWA to encode a constant. SDWA: Sub-Dword addressing is supported on VOP1 and VOP2 instructions, but not on VOP3 instructions. We desire to fold the bit-and operand into the instruction encoding for the V_ADD_I32 instruction. This requires that we transform the VOP3 into a VOP2 form of the instruction (_e32). %19:vgpr_32 = V_AND_B32_e32 255, killed %16:vgpr_32, implicit $exec %47:vgpr_32, %49:sreg_64_xexec = V_ADD_I32_e64 %26.sub0:vreg_64, %19:vgpr_32, implicit $exec %48:vgpr_32, dead %50:sreg_64_xexec = V_ADDC_U32_e64 %26.sub1:vreg_64, %54:vgpr_32, killed %49:sreg_64_xexec, implicit $exec which then allows the SDWA encoding and becomes %47:vgpr_32 = V_ADD_I32_sdwa 0, %26.sub0:vreg_64, 0, killed %16:vgpr_32, 0, 6, 0, 6, 0, implicit-def $vcc, implicit $exec %48:vgpr_32 = V_ADDC_U32_e32 0, %26.sub1:vreg_64, implicit-def $vcc, implicit $vcc, implicit $exec Differential Revision: https://reviews.llvm.org/D54882 llvm-svn: 348132 show more ...
Revision tags: llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1
# 5bfbae5c	11-Jul-2018	Tom Stellard <[email protected]>	AMDGPU: Refactor Subtarget classes Summary: This is a follow-up to r335942. - Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget - Rename AMDGPUCommonSubtarget to AMDGPUSubtarget - Me AMDGPU: Refactor Subtarget classes Summary: This is a follow-up to r335942. - Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget - Rename AMDGPUCommonSubtarget to AMDGPUSubtarget - Merge R600Subtarget::Generation and GCNSubtarget::Generation into AMDGPUSubtarget::Generation. Reviewers: arsenm, jvesely Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D49037 llvm-svn: 336851 show more ...
Revision tags: llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2
# 44b30b45	22-May-2018	Tom Stellard <[email protected]>	AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers Summary: MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction and register defintions, which are hu AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers Summary: MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction and register defintions, which are huge so we only want to include them where needed. This will also make it easier if we want to split the R600 and GCN definitions into separate tablegenerated files. I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h because it uses some enums from the header to initialize default values for the SIMachineFunction class, so I ended up having to remove includes of SIMachineFunctionInfo.h from headers too. Reviewers: arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D46272 llvm-svn: 332930 show more ...
# d34e60ca	14-May-2018	Nicola Zaghen <[email protected]>	Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' \| xargs sed -i 's/\bDEBUG\s\?(/ Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' \| xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master \| ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240 show more ...
# 432a3883	30-Apr-2018	Nico Weber <[email protected]>	IWYU for llvm-config.h in llvm, additions. See r331124 for how I made a list of files missing the include. I then ran this Python script: for f in open('filelist.txt'): f = f.strip() IWYU for llvm-config.h in llvm, additions. See r331124 for how I made a list of files missing the include. I then ran this Python script: for f in open('filelist.txt'): f = f.strip() fl = open(f).readlines() found = False for i in xrange(len(fl)): p = '#include "llvm/' if not fl[i].startswith(p): continue if fl[i][len(p):] > 'Config': fl.insert(i, '#include "llvm/Config/llvm-config.h"\n') found = True break if not found: print 'not found', f else: open(f, 'w').write(''.join(fl)) and then looked through everything with `svn diff \| diffstat -l \| xargs -n 1000 gvim -p` and tried to fix include ordering and whatnot. No intended behavior change. llvm-svn: 331184 show more ...
12 3