|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1, llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
c1b96725 |
| 23-Feb-2022 |
Bill Wendling <[email protected]> |
[NFC] Add #include for constants
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init |
|
| #
95b981ca |
| 26-Jan-2022 |
Giorgis Georgakoudis <[email protected]> |
[CodeExtractor] Enable partial aggregate arguments
Summary: Enable CodeExtractor to construct output functions that partially aggregate inputs/outputs in their argument list. A use case is the OMPIR
[CodeExtractor] Enable partial aggregate arguments
Summary: Enable CodeExtractor to construct output functions that partially aggregate inputs/outputs in their argument list. A use case is the OMPIRBuilder to create outlined functions for parallel regions that aggregate in a struct the payload variables for the region while passing as scalars thread and bound identifiers.
Differential Revision: https://reviews.llvm.org/D96854
show more ...
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2 |
|
| #
144cd22b |
| 23-Aug-2021 |
Andrew Litteken <[email protected]> |
[CodeExtractor] Creating exit stubs based off original order branch instructions.
Previously the CodeExtractor created exit stubs, and the subsequent return value of the outlined function based on t
[CodeExtractor] Creating exit stubs based off original order branch instructions.
Previously the CodeExtractor created exit stubs, and the subsequent return value of the outlined function based on the order of out-of-region blocks after splitting any phi nodes, and collecting the blocks to be outlined. This could cause differences in order if there was a difference of exit block phi nodes between the two regions. This patch moves the collection of the output target blocks to be before this occurs, so that the assignment of target block to output value will be the same, regardless of the contents of the output block.
Reviewers: paquette, roelofs
Differential Revision: https://reviews.llvm.org/D108657
show more ...
|
| #
9d2c859e |
| 26-Aug-2021 |
Andrew Litteken <[email protected]> |
[CodeExtractor] Making the arguments outlined easier to access from the outside
The Code Extractor does not provide an easy mechanism for determining the inputs and outputs after extraction has occu
[CodeExtractor] Making the arguments outlined easier to access from the outside
The Code Extractor does not provide an easy mechanism for determining the inputs and outputs after extraction has occurred, this patch gives the ability to pass in empty SetVectors to be filled with the inputs and outputs if they need to be analyzed.
Added Tests: - InputOutputMonitoring in unittests/Transforms/Utils/CodeExtractorTests.cpp
Reviewers: paquette
Differential Revision: https://reviews.llvm.org/D106991
show more ...
|
|
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4, llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
| #
700d2417 |
| 03-Nov-2020 |
Giorgis Georgakoudis <[email protected]> |
[CodeExtractor] Replace uses of extracted bitcasts in out-of-region lifetime markers
CodeExtractor handles bitcasts in the extracted region that have lifetime markers users in the outer region as ou
[CodeExtractor] Replace uses of extracted bitcasts in out-of-region lifetime markers
CodeExtractor handles bitcasts in the extracted region that have lifetime markers users in the outer region as outputs. That creates unnecessary alloca/reload instructions and extra lifetime markers. The patch identifies those cases, and replaces uses in out-of-region lifetime markers with new bitcasts in the outer region.
**Example** ``` define void @foo() { entry: %0 = alloca i32 br label %extract
extract: %1 = bitcast i32* %0 to i8* call void @llvm.lifetime.start.p0i8(i64 4, i8* %1) call void @use(i32* %0) br label %exit
exit: call void @use(i32* %0) call void @llvm.lifetime.end.p0i8(i64 4, i8* %1) ret void } ```
**Current extraction** ``` define void @foo() { entry: %.loc = alloca i8*, align 8 %0 = alloca i32, align 4 br label %codeRepl
codeRepl: ; preds = %entry %lt.cast = bitcast i8** %.loc to i8* call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast) %lt.cast1 = bitcast i32* %0 to i8* call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1) call void @foo.extract(i32* %0, i8** %.loc) %.reload = load i8*, i8** %.loc, align 8 call void @llvm.lifetime.end.p0i8(i64 -1, i8* %lt.cast) br label %exit
exit: ; preds = %codeRepl call void @use(i32* %0) call void @llvm.lifetime.end.p0i8(i64 4, i8* %.reload) ret void }
define internal void @foo.extract(i32* %0, i8** %.out) { newFuncRoot: br label %extract
exit.exitStub: ; preds = %extract ret void
extract: ; preds = %newFuncRoot %1 = bitcast i32* %0 to i8* store i8* %1, i8** %.out, align 8 call void @use(i32* %0) br label %exit.exitStub } ```
**Extraction with patch** ``` define void @foo() { entry: %0 = alloca i32, align 4 br label %codeRepl
codeRepl: ; preds = %entry %lt.cast1 = bitcast i32* %0 to i8* call void @llvm.lifetime.start.p0i8(i64 -1, i8* %lt.cast1) call void @foo.extract(i32* %0) br label %exit
exit: ; preds = %codeRepl call void @use(i32* %0) %lt.cast = bitcast i32* %0 to i8* call void @llvm.lifetime.end.p0i8(i64 4, i8* %lt.cast) ret void }
define internal void @foo.extract(i32* %0) { newFuncRoot: br label %extract
exit.exitStub: ; preds = %extract ret void
extract: ; preds = %newFuncRoot %1 = bitcast i32* %0 to i8* call void @use(i32* %0) br label %exit.exitStub } ```
Reviewed By: vsk
Differential Revision: https://reviews.llvm.org/D90689
show more ...
|
|
Revision tags: llvmorg-11.0.0, llvmorg-11.0.0-rc6, llvmorg-11.0.0-rc5, llvmorg-11.0.0-rc4, llvmorg-11.0.0-rc3, llvmorg-11.0.0-rc2, llvmorg-11.0.0-rc1, llvmorg-12-init, llvmorg-10.0.1, llvmorg-10.0.1-rc4, llvmorg-10.0.1-rc3, llvmorg-10.0.1-rc2, llvmorg-10.0.1-rc1, llvmorg-10.0.0, llvmorg-10.0.0-rc6, llvmorg-10.0.0-rc5, llvmorg-10.0.0-rc4, llvmorg-10.0.0-rc3, llvmorg-10.0.0-rc2, llvmorg-10.0.0-rc1 |
|
| #
8359511c |
| 29-Jan-2020 |
Vedant Kumar <[email protected]> |
[CodeExtractor] Remove stale llvm.assume calls from extracted region
During extraction, stale llvm.assume handles may be retained in the original function. The setup is:
1) CodeExtractor unregister
[CodeExtractor] Remove stale llvm.assume calls from extracted region
During extraction, stale llvm.assume handles may be retained in the original function. The setup is:
1) CodeExtractor unregisters assumptions in the blocks that are to be extracted.
2) Extraction happens. There are now two functions: f1 and f1.extracted.
3) Leftover assumptions in f1 (/not/ removed as they were not in the set of blocks to be extracted) now have affected-value llvm.assume handles in f1.extracted.
When assumptions for a value used in f1 are looked up, ValueTracking can assert as some of the handles are in the wrong function. To fix this, simply erase the llvm.assume calls in the extracted function.
Alternatives include flushing the assumption cache in the original function, or walking all values used in the original function to prune stale affected-value handles. Both seem more expensive.
Testing: check-llvm, LNT run with -mllvm -hot-cold-split enabled
rdar://58460728
show more ...
|
|
Revision tags: llvmorg-11-init, llvmorg-9.0.1, llvmorg-9.0.1-rc3, llvmorg-9.0.1-rc2, llvmorg-9.0.1-rc1 |
|
| #
9852699d |
| 08-Oct-2019 |
Vedant Kumar <[email protected]> |
[CodeExtractor] Factor out and reuse shrinkwrap analysis
Factor out CodeExtractor's analysis of allocas (for shrinkwrapping purposes), and allow the analysis to be reused.
This resolves a quadratic
[CodeExtractor] Factor out and reuse shrinkwrap analysis
Factor out CodeExtractor's analysis of allocas (for shrinkwrapping purposes), and allow the analysis to be reused.
This resolves a quadratic compile-time bug observed when compiling AMDGPUDisassembler.cpp.o.
Pre-patch (Release + LTO clang):
``` ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 176.5278 ( 57.8%) 0.4915 ( 18.5%) 177.0192 ( 57.4%) 177.4112 ( 57.3%) Hot Cold Splitting ```
Post-patch (ReleaseAsserts clang):
``` ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 1.4051 ( 3.3%) 0.0079 ( 0.3%) 1.4129 ( 3.2%) 1.4129 ( 3.2%) Hot Cold Splitting ```
Testing: check-llvm, and comparing the AMDGPUDisassembler.cpp.o binary pre- vs. post-patch.
An alternate approach is to hide CodeExtractorAnalysisCache from clients of CodeExtractor, and to recompute the analysis from scratch inside of CodeExtractor::extractCodeRegion(). This eliminates some redundant work in the shrinkwrapping legality check. However, some clients continue to exhibit O(n^2) compile time behavior as computing the analysis is O(n).
rdar://55912966
Differential Revision: https://reviews.llvm.org/D68616
llvm-svn: 374089
show more ...
|
| #
50afaa9d |
| 04-Oct-2019 |
Aditya Kumar <[email protected]> |
Add a unittest to verify for assumption cache
Reviewers: vsk, tejohnson
Reviewed By: vsk
Differential Revision: https://reviews.llvm.org/D68095
llvm-svn: 373811
|
|
Revision tags: llvmorg-9.0.0, llvmorg-9.0.0-rc6, llvmorg-9.0.0-rc5, llvmorg-9.0.0-rc4, llvmorg-9.0.0-rc3, llvmorg-9.0.0-rc2, llvmorg-9.0.0-rc1, llvmorg-10-init, llvmorg-8.0.1, llvmorg-8.0.1-rc4, llvmorg-8.0.1-rc3, llvmorg-8.0.1-rc2, llvmorg-8.0.1-rc1, llvmorg-8.0.0, llvmorg-8.0.0-rc5, llvmorg-8.0.0-rc4, llvmorg-8.0.0-rc3, llvmorg-7.1.0, llvmorg-7.1.0-rc1 |
|
| #
0e5dd512 |
| 08-Feb-2019 |
Vedant Kumar <[email protected]> |
[CodeExtractor] Restore outputs after creating exit stubs
When CodeExtractor saves the result of InvokeInst at the first insertion point of the 'normal destination' basic block, this block can be om
[CodeExtractor] Restore outputs after creating exit stubs
When CodeExtractor saves the result of InvokeInst at the first insertion point of the 'normal destination' basic block, this block can be omitted in the outlined region, so store is placed outside of the function. The suggested solution is to process saving outputs after creating exit stubs for new function, and stores will be placed in that blocks before return in this case.
Patch by Sergei Kachkov!
Fixes llvm.org/PR40455.
Differential Revision: https://reviews.llvm.org/D57919
llvm-svn: 353562
show more ...
|
|
Revision tags: llvmorg-8.0.0-rc2, llvmorg-8.0.0-rc1 |
|
| #
2946cd70 |
| 19-Jan-2019 |
Chandler Carruth <[email protected]> |
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the ne
Update the file headers across all of the LLVM projects in the monorepo to reflect the new license.
We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.
llvm-svn: 351636
show more ...
|
|
Revision tags: llvmorg-7.0.1, llvmorg-7.0.1-rc3 |
|
| #
b2a6f8e5 |
| 07-Dec-2018 |
Vedant Kumar <[email protected]> |
[CodeExtractor] Store outputs at the first valid insertion point
When CodeExtractor outlines values which are used by the original function, it must store those values in some in-out parameter. This
[CodeExtractor] Store outputs at the first valid insertion point
When CodeExtractor outlines values which are used by the original function, it must store those values in some in-out parameter. This store instruction must not be inserted in between a PHI and an EH pad instruction, as that results in invalid IR.
This fixes the following verifier failure seen while outlining within ObjC methods with live exit values:
The unwind destination does not have an exception handling instruction! %call35 = invoke i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i8* (i8*, i8*)*)(i8* %exn.adjusted, i8* %1) to label %invoke.cont34 unwind label %lpad33, !dbg !4183 The unwind destination does not have an exception handling instruction! invoke void @objc_exception_throw(i8* %call35) #12 to label %invoke.cont36 unwind label %lpad33, !dbg !4184 LandingPadInst not the first non-PHI instruction in the block. %3 = landingpad { i8*, i32 } catch i8* null, !dbg !1411
rdar://46540815
llvm-svn: 348562
show more ...
|
| #
d129569e |
| 03-Dec-2018 |
Vedant Kumar <[email protected]> |
[CodeExtractor] Split PHI nodes with incoming values from outlined region (PR39433)
If a PHI node out of extracted region has multiple incoming values from it, split this PHI on two parts. First PHI
[CodeExtractor] Split PHI nodes with incoming values from outlined region (PR39433)
If a PHI node out of extracted region has multiple incoming values from it, split this PHI on two parts. First PHI has incomings only from region and extracts with it (they are placed to the separate basic block that added to the list of outlined), and incoming values in original PHI are replaced by first PHI. Similar solution is already used in CodeExtractor for PHIs in entry block (severSplitPHINodes method). It covers PR39433 bug.
Patch by Sergei Kachkov!
Differential Revision: https://reviews.llvm.org/D55018
llvm-svn: 348205
show more ...
|
|
Revision tags: llvmorg-7.0.1-rc2, llvmorg-7.0.1-rc1 |
|
| #
c2990068 |
| 24-Oct-2018 |
Vedant Kumar <[email protected]> |
[HotColdSplitting] Identify larger cold regions using domtree queries
The current splitting algorithm works in three stages:
1) Identify cold blocks, then 2) Use forward/backward propagation to
[HotColdSplitting] Identify larger cold regions using domtree queries
The current splitting algorithm works in three stages:
1) Identify cold blocks, then 2) Use forward/backward propagation to mark hot blocks, then 3) Grow a SESE region of blocks *outside* of the set of hot blocks and start outlining.
While testing this pass on Apple internal frameworks I noticed that some kinds of control flow (e.g. loops) are never outlined, even though they unconditionally lead to / follow cold blocks. I noticed two other issues related to how cold regions are identified:
- An inconsistency can arise in the internal state of the hotness propagation stage, as a block may end up in both the ColdBlocks set and the HotBlocks set. Further inconsistencies can arise as these sets do not match what's in ProfileSummaryInfo.
- It isn't necessary to limit outlining to single-exit regions.
This patch teaches the splitting algorithm to identify maximal cold regions and outline them. A maximal cold region is defined as the set of blocks post-dominated by a cold sink block, or dominated by that sink block. This approach can successfully outline loops in the cold path. As a side benefit, it maintains less internal state than the current approach.
Due to a limitation in CodeExtractor, blocks within the maximal cold region which aren't dominated by a single entry point (a so-called "max ancestor") are filtered out.
Results: - X86 (LNT + -Os + externals): 134KB of TEXT were outlined compared to 47KB pre-patch, or a ~3x improvement. Did not see a performance impact across two runs. - AArch64 (LNT + -Os + externals + Apple-internal benchmarks): 149KB of TEXT were outlined. Ditto re: performance impact. - Outlining results improve marginally in the internal frameworks I tested.
Follow-ups: - Outline more than once per function, outline large single basic blocks, & try to remove unconditional branches in outlined functions.
Differential Revision: https://reviews.llvm.org/D53627
llvm-svn: 345209
show more ...
|
|
Revision tags: llvmorg-7.0.0, llvmorg-7.0.0-rc3 |
|
| #
8267b333 |
| 03-Sep-2018 |
Nico Weber <[email protected]> |
Rename a few unittests/.../Foo.cpp files to FooTest.cpp
The convention for unit test sources is that they're called FooTest.cpp.
No behavior change. https://reviews.llvm.org/D51579
llvm-svn: 341313
|