|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1 |
|
| #
37ead201 |
| 12-Nov-2021 |
Philip Reames <[email protected]> |
[runtime-unroll] Use incrementing IVs instead of decrementing ones
This is one of those wonderful "in theory X doesn't matter, but in practice is does" changes. In this particular case, we shift the
[runtime-unroll] Use incrementing IVs instead of decrementing ones
This is one of those wonderful "in theory X doesn't matter, but in practice is does" changes. In this particular case, we shift the IVs inserted by the runtime unroller to clamp iteration count of the loops* from decrementing to incrementing.
Why does this matter? A couple of reasons: * SCEV doesn't have a native subtract node. Instead, all subtracts (A - B) are represented as A + -1 * B and drops any flags invalidated by such. As a result, SCEV is slightly less good at reasoning about edge cases involving decrementing addrecs than incrementing ones. (You can see this in the inferred flags in some of the test cases.) * Other parts of the optimizer produce incrementing IVs, and they're common in idiomatic source language. We do have support for reversing IVs, but in general if we produce one of each, the pair will persist surprisingly far through the optimizer before being coalesced. (You can see this looking at nearby phis in the test cases.)
Note that if the hardware prefers decrementing (i.e. zero tested) loops, LSR should convert back immediately before codegen.
* Mostly irrelevant detail: The main loop of the prolog case is handled independently and will simple use the original IV with a changed start value. We could in theory use this scheme for all iteration clamping, but that's a larger and more invasive change.
show more ...
|
|
Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1, llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2 |
|
| #
a95796a3 |
| 26-Jun-2020 |
Arthur Eubanks <[email protected]> |
[NewPM][LoopUnroll] Rename unroll* to loop-unroll*
The legacy pass is called "loop-unroll", but in the new PM it's called "unroll". Also applied to unroll-and-jam and unroll-full.
Fixes various che
[NewPM][LoopUnroll] Rename unroll* to loop-unroll*
The legacy pass is called "loop-unroll", but in the new PM it's called "unroll". Also applied to unroll-and-jam and unroll-full.
Fixes various check-llvm tests when NPM is turned on.
Reviewed By: Whitney, dmgreen
Differential Revision: https://reviews.llvm.org/D82590
show more ...
|
|
Revision tags: llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1, llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1, llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1 |
|
| #
cee313d2 |
| 17-Apr-2019 |
Eric Christopher <[email protected]> |
Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.
Will be re-reverting again.
llvm-svn: 358552
|
|
Revision tags: llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1, llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1, llvmorg-7.0.1, llvmorg-7.0.1-rc3, llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1 |
|
| #
412ed347 |
| 31-Oct-2018 |
Fedor Sergeev <[email protected]> |
[LoopUnroll] allow customization for new-pass-manager version of LoopUnroll
Unlike its legacy counterpart new pass manager's LoopUnrollPass does not provide any means to select which flavors of unro
[LoopUnroll] allow customization for new-pass-manager version of LoopUnroll
Unlike its legacy counterpart new pass manager's LoopUnrollPass does not provide any means to select which flavors of unroll to run (runtime, peeling, partial), relying on global defaults.
In some cases having ability to run a restricted LoopUnroll that does more than LoopFullUnroll is needed.
Introduced LoopUnrollOptions to select optional unroll behaviors. Added 'unroll<peeling>' to PassRegistry mainly for the sake of testing.
Reviewers: chandlerc, tejohnson Differential Revision: https://reviews.llvm.org/D53440
llvm-svn: 345723
show more ...
|
| #
f923e862 |
| 29-Oct-2018 |
Fedor Sergeev <[email protected]> |
[LoopUnroll] NFC. Factor out runtime-loop.ll common test behavior.
Adding COMMON prefix to get common part handled there. Needed to simplify test changes for D53440.
llvm-svn: 345538
|
|
Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3, llvmorg-7.0.0-rc2, llvmorg-7.0.0-rc1, llvmorg-6.0.1, llvmorg-6.0.1-rc3, llvmorg-6.0.1-rc2, llvmorg-6.0.1-rc1, llvmorg-5.0.2, llvmorg-5.0.2-rc2, llvmorg-5.0.2-rc1, llvmorg-6.0.0, llvmorg-6.0.0-rc3, llvmorg-6.0.0-rc2, llvmorg-6.0.0-rc1, llvmorg-5.0.1, llvmorg-5.0.1-rc3, llvmorg-5.0.1-rc2, llvmorg-5.0.1-rc1, llvmorg-5.0.0, llvmorg-5.0.0-rc5, llvmorg-5.0.0-rc4, llvmorg-5.0.0-rc3, llvmorg-5.0.0-rc2 |
|
| #
ecd90131 |
| 02-Aug-2017 |
Teresa Johnson <[email protected]> |
[PM] Split LoopUnrollPass and make partial unroller a function pass
Summary: This is largely NFC*, in preparation for utilizing ProfileSummaryInfo and BranchFrequencyInfo analyses. In this patch I a
[PM] Split LoopUnrollPass and make partial unroller a function pass
Summary: This is largely NFC*, in preparation for utilizing ProfileSummaryInfo and BranchFrequencyInfo analyses. In this patch I am only doing the splitting for the New PM, but I can do the same for the legacy PM as a follow-on if this looks good.
*Not NFC since for partial unrolling we lose the updates done to the loop traversal (adding new sibling and child loops) - according to Chandler this is not very useful for partial unrolling, but it also means that the debugging flag -unroll-revisit-child-loops no longer works for partial unrolling.
Reviewers: chandlerc
Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D36157
llvm-svn: 309886
show more ...
|
|
Revision tags: llvmorg-5.0.0-rc1 |
|
| #
8e431a98 |
| 12-Jul-2017 |
Anna Thomas <[email protected]> |
[LoopUnrollRuntime] NFC: Refactored safety checks of unrolling multi-exit loop
Refactored the code and separated out a function `canSafelyUnrollMultiExitLoop` to reduce redundant checks and make it
[LoopUnrollRuntime] NFC: Refactored safety checks of unrolling multi-exit loop
Refactored the code and separated out a function `canSafelyUnrollMultiExitLoop` to reduce redundant checks and make it easier to add profitability heuristics later. Added tests to runtime unrolling to make sure that unrolling for multi-exit loops is not done unless the option -unroll-runtime-multi-exit is true.
llvm-svn: 307843
show more ...
|
|
Revision tags: llvmorg-4.0.1, llvmorg-4.0.1-rc3, llvmorg-4.0.1-rc2, llvmorg-4.0.1-rc1, llvmorg-4.0.0, llvmorg-4.0.0-rc4, llvmorg-4.0.0-rc3, llvmorg-4.0.0-rc2 |
|
| #
ce40fa13 |
| 25-Jan-2017 |
Chandler Carruth <[email protected]> |
[PM] Teach LoopUnroll to update the LPM infrastructure as it unrolls loops.
We do this by reconstructing the newly added loops after the unroll completes to avoid threading pass manager details thro
[PM] Teach LoopUnroll to update the LPM infrastructure as it unrolls loops.
We do this by reconstructing the newly added loops after the unroll completes to avoid threading pass manager details through all the mess of the unrolling infrastructure.
I've enabled some extra assertions in the LPM to try and catch issues here and enabled a bunch of unroller tests to try and make sure this is sane.
Currently, I'm manually running loop-simplify when needed. That should go away once it is folded into the LPM infrastructure.
Differential Revision: https://reviews.llvm.org/D28848
llvm-svn: 293011
show more ...
|
|
Revision tags: llvmorg-4.0.0-rc1, llvmorg-3.9.1, llvmorg-3.9.1-rc3, llvmorg-3.9.1-rc2, llvmorg-3.9.1-rc1, llvmorg-3.9.0, llvmorg-3.9.0-rc3, llvmorg-3.9.0-rc2 |
|
| #
b2738e41 |
| 02-Aug-2016 |
Michael Zolotukhin <[email protected]> |
[LoopUnroll] Switch the default value of -unroll-runtime-epilog back to its original value.
As agreed in post-commit review of r265388, I'm switching the flag to its original value until the 90% run
[LoopUnroll] Switch the default value of -unroll-runtime-epilog back to its original value.
As agreed in post-commit review of r265388, I'm switching the flag to its original value until the 90% runtime performance regression on SingleSource/Benchmarks/Stanford/Bubblesort is addressed.
llvm-svn: 277524
show more ...
|
| #
d9b6ad3c |
| 02-Aug-2016 |
Michael Zolotukhin <[email protected]> |
[LoopUnroll] Ensure we create prolog loops in simplified form.
llvm-svn: 277502
|
|
Revision tags: llvmorg-3.9.0-rc1, llvmorg-3.8.1, llvmorg-3.8.1-rc1 |
|
| #
23ce61b6 |
| 27-Apr-2016 |
Evgeny Stupachenko <[email protected]> |
The patch fixes PR27392. Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can p
The patch fixes PR27392. Summary: It is incorrect to compare TripCount (which is BECount + 1) with extraiters (or Count) to check if we should enter unrolled loop or not, because TripCount can potentially overflow (when BECount is max unsigned integer). While comparing BECount with (Count - 1) is overflow safe and therefore correct.
Reviewer: hfinkel
Differential Revision: http://reviews.llvm.org/D19256
From: Evgeny Stupachenko <[email protected]> llvm-svn: 267662
show more ...
|
| #
188de5ae |
| 05-Apr-2016 |
David L Kreitzer <[email protected]> |
Adds the ability to use an epilog remainder loop during loop unrolling and makes this the default behavior.
Patch by Evgeny Stupachenko ([email protected]).
Differential Revision: http://reviews.l
Adds the ability to use an epilog remainder loop during loop unrolling and makes this the default behavior.
Patch by Evgeny Stupachenko ([email protected]).
Differential Revision: http://reviews.llvm.org/D18158
llvm-svn: 265388
show more ...
|
|
Revision tags: llvmorg-3.8.0, llvmorg-3.8.0-rc3, llvmorg-3.8.0-rc2, llvmorg-3.8.0-rc1, llvmorg-3.7.1, llvmorg-3.7.1-rc2, llvmorg-3.7.1-rc1, llvmorg-3.7.0, llvmorg-3.7.0-rc4, llvmorg-3.7.0-rc3, llvmorg-3.7.0-rc2, llvmorg-3.7.0-rc1, llvmorg-3.6.2, llvmorg-3.6.2-rc1, llvmorg-3.6.1, llvmorg-3.6.1-rc1 |
|
| #
e178f469 |
| 14-Apr-2015 |
Sanjoy Das <[email protected]> |
[LoopUnrollRuntime] Avoid high-cost trip count computation.
Summary: Runtime unrolling of loops needs to emit an expression to compute the loop's runtime trip-count. Avoid runtime unrolling if this
[LoopUnrollRuntime] Avoid high-cost trip count computation.
Summary: Runtime unrolling of loops needs to emit an expression to compute the loop's runtime trip-count. Avoid runtime unrolling if this computation will be expensive.
Depends on D8993.
Reviewers: atrick
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D8994
llvm-svn: 234846
show more ...
|
|
Revision tags: llvmorg-3.5.2, llvmorg-3.5.2-rc1 |
|
| #
715b01e9 |
| 09-Mar-2015 |
Kevin Qin <[email protected]> |
Introduce runtime unrolling disable matadata and use it to mark the scalar loop from vectorization.
Runtime unrolling is an expensive optimization which can bring benefit only if the loop is hot and
Introduce runtime unrolling disable matadata and use it to mark the scalar loop from vectorization.
Runtime unrolling is an expensive optimization which can bring benefit only if the loop is hot and iteration number is relatively large enough. For some loops, we know they are not worth to be runtime unrolled. The scalar loop from vectorization is one of the cases.
llvm-svn: 231631
show more ...
|
| #
a79ac14f |
| 27-Feb-2015 |
David Blaikie <[email protected]> |
[opaque pointer type] Add textual IR support for explicit type parameter to load instruction
Essentially the same as the GEP change in r230786.
A similar migration script can be used to update test
[opaque pointer type] Add textual IR support for explicit type parameter to load instruction
Essentially the same as the GEP change in r230786.
A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278)
import fileinput import sys import re
pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")
for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))
Reviewers: rafael, dexonsmith, grosser
Differential Revision: http://reviews.llvm.org/D7649
llvm-svn: 230794
show more ...
|
| #
79e6c749 |
| 27-Feb-2015 |
David Blaikie <[email protected]> |
[opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction
One of several parallel first steps to remove the target type of pointers, replacing them with a
[opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction
One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type.
This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions.
* This doesn't modify gep operators, only instructions (operators will be handled separately)
* Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes.
* geps of vectors are transformed as: getelementptr <4 x float*> %x, ... ->getelementptr float, <4 x float*> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float.
* address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x ->getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x
Importantly, the massive amount of test case churn has been automated by same crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout, I then wrapped that in a shell script to handle replacing files, then using the usual find+xargs to migrate all the files.
update.py: import fileinput import sys import re
ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))") normrep = re.compile( r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
def conv(match, line): if not match: return line line = match.groups()[0] if len(match.groups()[5]) == 0: line += match.groups()[2] line += match.groups()[3] line += ", " line += match.groups()[1] line += "\n" return line
for line in sys.stdin: if line.find("getelementptr ") == line.find("getelementptr inbounds"): if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("): line = conv(re.match(ibrep, line), line) elif line.find("getelementptr ") != line.find("getelementptr ("): line = conv(re.match(normrep, line), line) sys.stdout.write(line)
apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done
The actual commands: From llvm/src: find test/ -name *.ll | xargs ./apply.sh From llvm/src/tools/clang: find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll | xargs ./apply.sh
After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out).
The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases.
Reviewers: rafael, dexonsmith, grosser
Differential Revision: http://reviews.llvm.org/D7636
llvm-svn: 230786
show more ...
|
|
Revision tags: llvmorg-3.6.0, llvmorg-3.6.0-rc4 |
|
| #
11b279a8 |
| 18-Feb-2015 |
Sanjoy Das <[email protected]> |
Partial fix for bug 22589
Don't spend the entire iteration space in the scalar loop prologue if computing the trip count overflows. This change also gets rid of the backedge check in the prologue l
Partial fix for bug 22589
Don't spend the entire iteration space in the scalar loop prologue if computing the trip count overflows. This change also gets rid of the backedge check in the prologue loop and the extra check for overflowing trip-count.
Differential Revision: http://reviews.llvm.org/D7715
llvm-svn: 229731
show more ...
|
|
Revision tags: llvmorg-3.6.0-rc3, llvmorg-3.6.0-rc2, llvmorg-3.6.0-rc1 |
|
| #
090a19bd |
| 08-Jan-2015 |
Duncan P. N. Exon Smith <[email protected]> |
IR: Add 'distinct' MDNodes to bitcode and assembly
Propagate whether `MDNode`s are 'distinct' through the other types of IR (assembly and bitcode). This adds the `distinct` keyword to assembly.
Cu
IR: Add 'distinct' MDNodes to bitcode and assembly
Propagate whether `MDNode`s are 'distinct' through the other types of IR (assembly and bitcode). This adds the `distinct` keyword to assembly.
Currently, no one actually calls `MDNode::getDistinct()`, so these nodes only get created for:
- self-references, which are never uniqued, and - nodes whose operands are replaced that hit a uniquing collision.
The concept of distinct nodes is still not quite first-class, since distinct-ness doesn't yet survive across `MapMetadata()`.
Part of PR22111.
llvm-svn: 225474
show more ...
|
|
Revision tags: llvmorg-3.5.1, llvmorg-3.5.1-rc2 |
|
| #
be7ea19b |
| 15-Dec-2014 |
Duncan P. N. Exon Smith <[email protected]> |
IR: Make metadata typeless in assembly
Now that `Metadata` is typeless, reflect that in the assembly. These are the matching assembly changes for the metadata/value split in r223802.
- Only use
IR: Make metadata typeless in assembly
Now that `Metadata` is typeless, reflect that in the assembly. These are the matching assembly changes for the metadata/value split in r223802.
- Only use the `metadata` type when referencing metadata from a call intrinsic -- i.e., only when it's used as a `Value`.
- Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode` when referencing it from call intrinsics.
So, assembly like this:
define @foo(i32 %v) { call void @llvm.foo(metadata !{i32 %v}, metadata !0) call void @llvm.foo(metadata !{i32 7}, metadata !0) call void @llvm.foo(metadata !1, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{metadata !3}, metadata !0) ret void, !bar !2 } !0 = metadata !{metadata !2} !1 = metadata !{i32* @global} !2 = metadata !{metadata !3} !3 = metadata !{}
turns into this:
define @foo(i32 %v) { call void @llvm.foo(metadata i32 %v, metadata !0) call void @llvm.foo(metadata i32 7, metadata !0) call void @llvm.foo(metadata i32* @global, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{!3}, metadata !0) ret void, !bar !2 } !0 = !{!2} !1 = !{i32* @global} !2 = !{!3} !3 = !{}
I wrote an upgrade script that handled almost all of the tests in llvm and many of the tests in cfe (even handling many `CHECK` lines). I've attached it (or will attach it in a moment if you're speedy) to PR21532 to help everyone update their out-of-tree testcases.
This is part of PR21532.
llvm-svn: 224257
show more ...
|
|
Revision tags: llvmorg-3.5.1-rc1 |
|
| #
fc02e3c3 |
| 29-Sep-2014 |
Kevin Qin <[email protected]> |
Use a loop to simplify the runtime unrolling prologue.
Runtime unrolling will create a prologue to execute the extra iterations which is can't divided by the unroll factor. It generates an if-then-e
Use a loop to simplify the runtime unrolling prologue.
Runtime unrolling will create a prologue to execute the extra iterations which is can't divided by the unroll factor. It generates an if-then-else sequence to jump into a factor -1 times unrolled loop body, like
extraiters = tripcount % loopfactor if (extraiters == 0) jump Loop: if (extraiters == loopfactor) jump L1 if (extraiters == loopfactor-1) jump L2 ... L1: LoopBody; L2: LoopBody; ... if tripcount < loopfactor jump End Loop: ... End:
It means if the unroll factor is 4, the loop body will be 7 times unrolled, 3 are in loop prologue, and 4 are in the loop. This commit is to use a loop to execute the extra iterations in prologue, like
extraiters = tripcount % loopfactor if (extraiters == 0) jump Loop: else jump Prol Prol: LoopBody; extraiters -= 1 // Omitted if unroll factor is 2. if (extraiters != 0) jump Prol: // Omitted if unroll factor is 2. if (tripcount < loopfactor) jump End Loop: ... End:
Then when unroll factor is 4, the loop body will be copied by only 5 times, 1 in the prologue loop, 4 in the original loop. And if the unroll factor is 2, new loop won't be created, just as the original solution.
llvm-svn: 218604
show more ...
|
|
Revision tags: llvmorg-3.5.0, llvmorg-3.5.0-rc4, llvmorg-3.5.0-rc3, llvmorg-3.5.0-rc2, llvmorg-3.5.0-rc1 |
|
| #
0bf086f8 |
| 21-Jun-2014 |
Benjamin Kramer <[email protected]> |
LoopUnrollRuntime: Check for overflow in the trip count calculation.
Fixes PR19823.
llvm-svn: 211436
|
|
Revision tags: llvmorg-3.4.2, llvmorg-3.4.2-rc1, llvmorg-3.4.1, llvmorg-3.4.1-rc2, llvmorg-3.4.1-rc1, llvmorg-3.4.0, llvmorg-3.4.0-rc3, llvmorg-3.4.0-rc2, llvmorg-3.4.0-rc1, llvmorg-3.3.1-rc1, llvmorg-3.3.0, llvmorg-3.3.0-rc3, llvmorg-3.3.0-rc2, llvmorg-3.3.0-rc1, llvmorg-3.2.0, llvmorg-3.2.0-rc3, llvmorg-3.2.0-rc2, llvmorg-3.2.0-rc1, llvmorg-3.1.0, llvmorg-3.1.0-rc3, llvmorg-3.1.0-rc2, llvmorg-3.1.0-rc1 |
|
| #
d04d1529 |
| 09-Dec-2011 |
Andrew Trick <[email protected]> |
Add -unroll-runtime for unrolling loops with run-time trip counts.
Patch by Brendon Cahoon!
This extends the existing LoopUnroll and LoopUnrollPass. Brendon measured no regressions in the llvm test
Add -unroll-runtime for unrolling loops with run-time trip counts.
Patch by Brendon Cahoon!
This extends the existing LoopUnroll and LoopUnrollPass. Brendon measured no regressions in the llvm test suite with -unroll-runtime enabled. This implementation works by using the existing loop unrolling code to unroll the loop by a power-of-two (default 8). It generates an if-then-else sequence of code prior to the loop to execute the extra iterations before entering the unrolled loop.
llvm-svn: 146245
show more ...
|