|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
| #
7165e671 |
| 23-Aug-2021 |
Kai Luo <[email protected]> |
[PowerPC] Use int64_t to represent stack object offset and frame size
This is the first step to enable PPC64 support huge frame size(>2G). Also fix an assertion error for frame size, i.e.,`int x; !i
[PowerPC] Use int64_t to represent stack object offset and frame size
This is the first step to enable PPC64 support huge frame size(>2G). Also fix an assertion error for frame size, i.e.,`int x; !isInt<32>(x);` should be always evaluated false, so the guard code for frame size is impossible to hit.
Reviewed By: jsji
Differential Revision: https://reviews.llvm.org/D107435
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
2432d80d |
| 20-Apr-2021 |
Qiu Chaofan <[email protected]> |
[PowerPC] Use mtvsrdd to put callee-saved GPR into VSR
This patch exploits mtvsrdd instruction (available in ISA3.0+) to save two callee-saved GPR registers into a single VSR, making it more efficie
[PowerPC] Use mtvsrdd to put callee-saved GPR into VSR
This patch exploits mtvsrdd instruction (available in ISA3.0+) to save two callee-saved GPR registers into a single VSR, making it more efficient.
Reviewed By: jsji, nemanjai
Differential Revision: https://reviews.llvm.org/D62565
show more ...
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3 |
|
| #
c352e088 |
| 03-Jul-2020 |
Kai Luo <[email protected]> |
[PowerPC] Implement probing for prologue
This patch is part of supporting `-fstack-clash-protection`. Implemented probing when emitting prologue.
Differential Revision: https://reviews.llvm.org/D81
[PowerPC] Implement probing for prologue
This patch is part of supporting `-fstack-clash-protection`. Implemented probing when emitting prologue.
Differential Revision: https://reviews.llvm.org/D81460
show more ...
|
|
Revision tags: llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1 |
|
| #
4dad4914 |
| 19-May-2020 |
Matt Arsenault <[email protected]> |
CodeGen: Use Register
|
|
Revision tags: llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3 |
|
| #
186dd631 |
| 29-Feb-2020 |
Benjamin Kramer <[email protected]> |
ArrayRef'ize restoreCalleeSavedRegisters. NFCI.
restoreCalleeSavedRegisters can mutate the contents of the CalleeSavedInfos, so use a MutableArrayRef.
|
| #
8efc2f57 |
| 24-Feb-2020 |
Sean Fertile <[email protected]> |
[PowerPC][AIX] Spill/restore the callee-saved condition register bits.
Extends the existing support for spilling and restoring the condition register to the linkage area for 32-bit targets, and enab
[PowerPC][AIX] Spill/restore the callee-saved condition register bits.
Extends the existing support for spilling and restoring the condition register to the linkage area for 32-bit targets, and enables for AIX.
Differential Revision: https://reviews.llvm.org/D74349
show more ...
|
|
Revision tags: llvmorg-10.0.0-rc2 |
|
| #
e4230a9f |
| 08-Feb-2020 |
Benjamin Kramer <[email protected]> |
ArrayRef'ize spillCalleeSavedRegisters. NFCI.
|
|
Revision tags: llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3 |
|
| #
09967050 |
| 13-Aug-2019 |
Hubert Tong <[email protected]> |
Reland r368691: "[AIX] Implement LR prolog/epilog save/restore"
Trying again with the code changes (and not just the new test).
Summary: This patch fixes the offsets of fields in the stack frame li
Reland r368691: "[AIX] Implement LR prolog/epilog save/restore"
Trying again with the code changes (and not just the new test).
Summary: This patch fixes the offsets of fields in the stack frame linkage save area for AIX.
Reviewers: sfertile, hubert.reinterpretcast, jasonliu, Xiangling_L, xingxue, ZarkoCA, daltenty
Reviewed By: hubert.reinterpretcast
Subscribers: wuzish, nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D64424
Patch by Chris Bowler!
llvm-svn: 368721
show more ...
|
|
Revision tags: llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1 |
|
| #
9bd22fec |
| 26-Jul-2019 |
Sean Fertile <[email protected]> |
[PowerPC] Add getCRSaveOffset to improve readability. [NFC]
In preperation for AIX support in FrameLowering: replace a number of literal '8' that represent the stack offset of the condition register
[PowerPC] Add getCRSaveOffset to improve readability. [NFC]
In preperation for AIX support in FrameLowering: replace a number of literal '8' that represent the stack offset of the condition register save area with a member in PPCFrameLowering.
Patch by Chris Bowler.
llvm-svn: 367111
show more ...
|
|
Revision tags: llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2 |
|
| #
6fc4c1cc |
| 05-Jun-2019 |
Dmitri Gribenko <[email protected]> |
Include what you use in PPCFrameLowering.h
llvm-svn: 362590
|
|
Revision tags: llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4 |
|
| #
bd5429ef |
| 28-Feb-2019 |
Stefan Pintilie <[email protected]> |
[PowerPC] Move the stack pointer update instruction later in the prologue and earlier in the epilogue.
Move the stdu instruction in the prologue and epilogue. This should provide a small performance
[PowerPC] Move the stack pointer update instruction later in the prologue and earlier in the epilogue.
Move the stdu instruction in the prologue and epilogue. This should provide a small performance boost in functions that are able to do this. I've kept this change rather conservative at the moment and functions with frame pointers or base pointers will not try to move the stack pointer update.
Differential Revision: https://reviews.llvm.org/D42590
llvm-svn: 355085
show more ...
|
|
Revision tags: llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1 |
|
| #
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <[email protected]> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the ne
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
show more ...
|
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3 |
|
| #
5c179bf1 |
| 09-Nov-2018 |
Zaara Syeda <[email protected]> |
[Power9] Allow gpr callee saved spills in prologue to vectors registers
Currently in llvm, CalleeSavedInfo can only assign a callee saved register to stack frame index to be spilled in the prologue.
[Power9] Allow gpr callee saved spills in prologue to vectors registers
Currently in llvm, CalleeSavedInfo can only assign a callee saved register to stack frame index to be spilled in the prologue. We would like to enable spilling gprs to vector registers. This patch adds the capability to spill to other registers aside from just the stack. It also adds the changes for power9 to spill gprs to volatile vector registers when they are available. This happens only for leaf functions when using the option -ppc-enable-pe-vector-spills.
Differential Revision: https://reviews.llvm.org/D39386
llvm-svn: 346512
show more ...
|
|
Revision tags: llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1, llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2 |
|
| #
5f8f34e4 |
| 01-May-2018 |
Adrian Prantl <[email protected]> |
Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they ar
Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
llvm-svn: 331272
show more ...
|
|
Revision tags: llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2 |
|
| #
1be62f03 |
| 03-Nov-2017 |
David Blaikie <[email protected]> |
Move TargetFrameLowering.h to CodeGen where it's implemented
This header already includes a CodeGen header and is implemented in lib/CodeGen, so move the header there to match.
This fixes a link er
Move TargetFrameLowering.h to CodeGen where it's implemented
This header already includes a CodeGen header and is implemented in lib/CodeGen, so move the header there to match.
This fixes a link error with modular codegeneration builds - where a header and its implementation are circularly dependent and so need to be in the same library, not split between two like this.
llvm-svn: 317379
show more ...
|
|
Revision tags: llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2 |
|
| #
bea30c62 |
| 10-Aug-2017 |
Krzysztof Parzyszek <[email protected]> |
Add "Restored" flag to CalleeSavedInfo
The liveness-tracking code assumes that the registers that were saved in the function's prolog are live outside of the function. Specifically, that registers t
Add "Restored" flag to CalleeSavedInfo
The liveness-tracking code assumes that the registers that were saved in the function's prolog are live outside of the function. Specifically, that registers that were saved are also live-on-exit from the function. This isn't always the case as illustrated by the LR register on ARM.
Differential Revision: https://reviews.llvm.org/D36160
llvm-svn: 310619
show more ...
|
|
Revision tags: llvmorg-5.0.0-rc1, llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1, llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2, llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1, llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2, llvmorg-3.9.0-rc1, llvmorg-3.8.1, llvmorg-3.8.1-rc1 |
|
| #
f8b592f2 |
| 01-Apr-2016 |
Chuang-Yu Cheng <[email protected]> |
[PPC64] Bug fix: when enabling sibling-call-opt and shrink-wrapping, the tail call branch instruction might disappear
Bug Pattern: # BB#0: # %entry cmpldi 3
[PPC64] Bug fix: when enabling sibling-call-opt and shrink-wrapping, the tail call branch instruction might disappear
Bug Pattern: # BB#0: # %entry cmpldi 3, 0 beq- 0, .LBB0_2 # BB#1: # %exit lwz 4, 0(3) #TC_RETURNd8 LVComputationKind 0 .LBB0_2: # %cond.false mflr 0 std 0, 16(1) stdu 1, -96(1) .Ltmp0: .cfi_def_cfa_offset 96 .Ltmp1: .cfi_offset lr, 16 bl __assert_fail nop
The branch instruction for tail call return is not generated, because the shrink-wrapping pass choosing a new Restore Point: %cond.false, so %exit block is not sent to emitEpilogue, that's why the branch is not generated.
Thanks Kit's opinions! Reviewers: nemanjai hfinkel tjablin kbarton
http://reviews.llvm.org/D17606
llvm-svn: 265112
show more ...
|
| #
e1a2e90f |
| 31-Mar-2016 |
Hans Wennborg <[email protected]> |
Change eliminateCallFramePseudoInstr() to return an iterator
This will become necessary in a subsequent change to make this method merge adjacent stack adjustments, i.e. it might erase the previous
Change eliminateCallFramePseudoInstr() to return an iterator
This will become necessary in a subsequent change to make this method merge adjacent stack adjustments, i.e. it might erase the previous and/or next instruction.
It also greatly simplifies the calls to this function from Prolog- EpilogInserter. Previously, that had a bunch of logic to resume iteration after the call; now it just continues with the returned iterator.
Note that this changes the behaviour of PEI a little. Previously, it attempted to re-visit the new instruction created by eliminateCallFramePseudoInstr(). That code was added in r36625, but I can't see any reason for it: the new instructions will obviously not be pseudo instructions, they will not have FrameIndex operands, and we have already accounted for the stack adjustment.
Differential Revision: http://reviews.llvm.org/D18627
llvm-svn: 265036
show more ...
|
|
Revision tags: llvmorg-3.8.0, llvmorg-3.8.0-rc3 |
|
| #
ae22101c |
| 20-Feb-2016 |
Nemanja Ivanovic <[email protected]> |
Fix for PR 26500
This patch corresponds to review: http://reviews.llvm.org/D17294
It ensures that whatever block we are emitting the prologue/epilogue into, we have the necessary scratch registers.
Fix for PR 26500
This patch corresponds to review: http://reviews.llvm.org/D17294
It ensures that whatever block we are emitting the prologue/epilogue into, we have the necessary scratch registers. It takes away the hard-coded register numbers for use as scratch registers as registers that are guaranteed to be available in the function prologue/epilogue are not guaranteed to be available within the function body. Since we shrink-wrap, the prologue/epilogue may end up in the function body.
llvm-svn: 261441
show more ...
|
|
Revision tags: llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1, llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1 |
|
| #
9c432ae1 |
| 16-Nov-2015 |
Kit Barton <[email protected]> |
Find available scratch register to use in function prologue and epilogue as part of shrink wrapping.
Phabricator: http://reviews.llvm.org/D13955 llvm-svn: 253247
|
| #
d3b904d4 |
| 10-Sep-2015 |
Kit Barton <[email protected]> |
Enable the shrink wrapping optimization for PPC64.
The changes in this patch are as follows: 1. Modify the emitPrologue and emitEpilogue methods to work properly when the prologue and epilogue blo
Enable the shrink wrapping optimization for PPC64.
The changes in this patch are as follows: 1. Modify the emitPrologue and emitEpilogue methods to work properly when the prologue and epilogue blocks are not the first/last blocks in the function 2. Fix a bug in PPCEarlyReturn optimization caused by an empty entry block in the function 3. Override the runShrinkWrap PredicateFtor (defined in TargetMachine) to check whether shrink wrapping should run: Shrink wrapping will run on PPC64 (Little Endian and Big Endian) unless -enable-shrink-wrap=false is specified on command line
A new test case, ppc-shrink-wrapping.ll was created based on the existing shrink wrapping tests for x86, arm, and arm64.
Phabricator review: http://reviews.llvm.org/D11817
llvm-svn: 247237
show more ...
|
|
Revision tags: llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1 |
|
| #
02564865 |
| 14-Jul-2015 |
Matthias Braun <[email protected]> |
PrologEpilogInserter: Rewrite API to determine callee save regsiters.
This changes TargetFrameLowering::processFunctionBeforeCalleeSavedScan():
- Rename the function to determineCalleeSaves() - Pas
PrologEpilogInserter: Rewrite API to determine callee save regsiters.
This changes TargetFrameLowering::processFunctionBeforeCalleeSavedScan():
- Rename the function to determineCalleeSaves() - Pass a bitset of callee saved registers by reference, thus avoiding the function-global PhysRegUsed bitset in MachineRegisterInfo. - Without PhysRegUsed the implementation is fine tuned to not save physcial registers which are only read but never modified.
Related to rdar://21539507
Differential Revision: http://reviews.llvm.org/D10909
llvm-svn: 242165
show more ...
|
|
Revision tags: llvmorg-3.6.2, llvmorg-3.6.2-rc1 |
|
| #
f00654e3 |
| 23-Jun-2015 |
Alexander Kornienko <[email protected]> |
Revert r240137 (Fixed/added namespace ending comments using clang-tidy. NFC)
Apparently, the style needs to be agreed upon first.
llvm-svn: 240390
|
| #
70bc5f13 |
| 19-Jun-2015 |
Alexander Kornienko <[email protected]> |
Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-c
Fixed/added namespace ending comments using clang-tidy. NFC
The patch is generated using this command:
tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \ -checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \ llvm/lib/
Thanks to Eugene Kosov for the original patch!
llvm-svn: 240137
show more ...
|
|
Revision tags: llvmorg-3.6.1, llvmorg-3.6.1-rc1 |
|
| #
61b305ed |
| 05-May-2015 |
Quentin Colombet <[email protected]> |
[ShrinkWrap] Add (a simplified version) of shrink-wrapping.
This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find
[ShrinkWrap] Add (a simplified version) of shrink-wrapping.
This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find safe points that are cheaper than the entry and exits blocks.
As an example and to avoid regressions to be introduce, this patch also implements the required bits to enable the shrink-wrapping pass for AArch64.
** Context **
Currently we insert the prologue and epilogue of the method/function in the entry and exits blocks. Although this is correct, we can do a better job when those are not immediately required and insert them at less frequently executed places. The job of the shrink-wrapping pass is to identify such places.
** Motivating example **
Let us consider the following function that perform a call only in one branch of a if: define i32 @f(i32 %a, i32 %b) { %tmp = alloca i32, align 4 %tmp2 = icmp slt i32 %a, %b br i1 %tmp2, label %true, label %false
true: store i32 %a, i32* %tmp, align 4 %tmp4 = call i32 @doSomething(i32 0, i32* %tmp) br label %false
false: %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ] ret i32 %tmp.0 }
On AArch64 this code generates (removing the cfi directives to ease readabilities): _f: ; @f ; BB#0: stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething LBB0_2: ; %false mov sp, x29 ldp x29, x30, [sp], #16 ret
With shrink-wrapping we could generate: _f: ; @f ; BB#0: cmp w0, w1 b.ge LBB0_2 ; BB#1: ; %true stp x29, x30, [sp, #-16]! mov x29, sp sub sp, sp, #16 ; =16 stur w0, [x29, #-4] sub x1, x29, #4 ; =4 mov w0, wzr bl _doSomething add sp, x29, #16 ; =16 ldp x29, x30, [sp], #16 LBB0_2: ; %false ret
Therefore, we would pay the overhead of setting up/destroying the frame only if we actually do the call.
** Proposed Solution **
This patch introduces a new machine pass that perform the shrink-wrapping analysis (See the comments at the beginning of ShrinkWrap.cpp for more details). It then stores the safe save and restore point into the MachineFrameInfo attached to the MachineFunction. This information is then used by the PrologEpilogInserter (PEI) to place the related code at the right place. This pass runs right before the PEI.
Unlike the original paper of Chow from PLDI’88, this implementation of shrink-wrapping does not use expensive data-flow analysis and does not need hack to properly avoid frequently executed point. Instead, it relies on dominance and loop properties.
The pass is off by default and each target can opt-in by setting the EnableShrinkWrap boolean to true in their derived class of TargetPassConfig. This setting can also be overwritten on the command line by using -enable-shrink-wrap.
Before you try out the pass for your target, make sure you properly fix your emitProlog/emitEpilog/adjustForXXX method to cope with basic blocks that are not necessarily the entry block.
** Design Decisions **
1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but for debugging and clarity I thought it was best to have its own file. 2. Right now, we only support one save point and one restore point. At some point we can expand this to several save point and restore point, the impacted component would then be: - The pass itself: New algorithm needed. - MachineFrameInfo: Hold a list or set of Save/Restore point instead of one pointer. - PEI: Should loop over the save point and restore point. Anyhow, at least for this first iteration, I do not believe this is interesting to support the complex cases. We should revisit that when we motivating examples.
Differential Revision: http://reviews.llvm.org/D9210
<rdar://problem/3201744>
llvm-svn: 236507
show more ...
|