Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init

# 30d3f56e | 13-Jul-2022 | Kazu Hirata <[email protected]>
[Analysis] clang-format InlineAdvisor.cpp (NFC)

Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4

# e0d06959 | 12-May-2022 | Mingming Liu <[email protected]>
[Inline] Annotate inline pass name with link phase information for analysis.
The annotation is flag gated; the flag is turned off by default.
Differential Revision: https://reviews.llvm.org/D125495

# bc856eb3 | 01-Jun-2022 | Mingming Liu <[email protected]>
[SampleProfile][Inline] Annotate sample profile inline remarks with link phase (prelink/postlink) information.
Differential Revision: https://reviews.llvm.org/D126833

# 7a47ee51 | 21-Jun-2022 | Kazu Hirata <[email protected]>
[llvm] Don't use Optional::getValue (NFC)

# e0e687a6 | 20-Jun-2022 | Kazu Hirata <[email protected]>
[llvm] Don't use Optional::hasValue (NFC)

# 129b531c | 19-Jun-2022 | Kazu Hirata <[email protected]>
[llvm] Use value_or instead of getValueOr (NFC)
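
Taken together, these three cleanups move callers off the old llvm::Optional accessors and onto the std::optional-style interface. A minimal, self-contained sketch of the value_or pattern the last commit refers to (illustrative only, using std::optional rather than code from the patches):

```
#include <optional>

// Old spelling on llvm::Optional:  Threshold = MaybeThreshold.getValueOr(Default);
// New std::optional-style spelling:
int pickThreshold(std::optional<int> MaybeThreshold, int Default) {
  return MaybeThreshold.value_or(Default); // returns Default when no value is present
}
```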

# 9f2b873a | 14-Jun-2022 | Jin Xin Ng <[email protected]>
[inliner] Add per-SCC-pass InlineAdvisor printing option
Adds an option to print the contents of the InlineAdvisor after each SCC inliner pass.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D127689

# 8601f269 | 01-Jun-2022 | Mingming Liu <[email protected]>
[Inline][Remark][NFC] Optionally provide inline context to inline advisor.
This patch has no functional change and is merely a preparation patch for the main functional change. The motivating use case is to annotate the inline remark pass name with context information (e.g. prelink or postlink, CGSCC or always-inliner); see D125495 for more details.
Differential Revision: https://reviews.llvm.org/D126824

Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1

# 9aa52ba5 | 21-Mar-2022 | Kazu Hirata <[email protected]>
[Analysis] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC)

Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2

# 71c3a551 | 28-Feb-2022 | serge-sans-paille <[email protected]>
Cleanup includes: LLVMAnalysis
Number of lines output by preprocessor: before: 1065940348 after: 1065307662
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120659

Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init, llvmorg-13.0.1, llvmorg-13.0.1-rc3

# f29256a6 | 20-Jan-2022 | Mircea Trofin <[email protected]>
[MLGO] Improved support for AOT cross-targeting scenarios
The tensorflow AOT compiler can cross-target, but it can't run on (for example) arm64. We added earlier support where the AOT-ed header and object would be built on a separate builder and then passed at build time to a build host where the AOT compiler can't run, but clang can otherwise be built.
To simplify such scenarios, given that we now support more than one AOT-able case (regalloc and inliner), we make the AOT scenario centered on whether files are generated, case by case (this includes the "passed from a different builder" scenario). This means we shouldn't need an 'umbrella' LLVM_HAVE_TF_AOT, in favor of case-by-case control. A builder can opt out of an AOT case by passing that case's model path as `none`. Note that the overrides still take precedence.
This patch controls conditional compilation with case-specific flags, which can be enabled locally, for the component where those are available. We still keep an overall flag for some tests.
The 'development/training' mode is unchanged, because there the model is passed from the command line and interpreted.
Differential Revision: https://reviews.llvm.org/D117752

# 3e8553aa | 13-Jan-2022 | Mircea Trofin <[email protected]>
[mlgo][inline] Improve global state tracking
The global state refers to the number of nodes currently in the module, and the number of direct calls between nodes, across the module.
Node counts are not a problem; edge counts are, because we want strictly the kind of edges that affect inlining (direct calls), and that is not easily obtainable without iterating over the whole module.
This patch avoids relying on analysis invalidation because it turned out to be too aggressive in some cases. It leverages the fact that Node objects are stable (they do not get deleted while cgscc passes are run over the module) and the cgscc pass manager invariants.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D115847
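
To make the cost of the whole-module alternative concrete, here is a hypothetical sketch (not code from the patch) of counting the direct-call edges that matter for inlining by walking every instruction in the module:

```
#include "llvm/IR/InstrTypes.h" // llvm::CallBase
#include "llvm/IR/Module.h"

// Count direct calls by scanning every instruction in every function -- the
// module-wide iteration the commit above wants to avoid redoing per pass.
static unsigned countDirectCallEdges(const llvm::Module &M) {
  unsigned Edges = 0;
  for (const llvm::Function &F : M)
    for (const llvm::BasicBlock &BB : F)
      for (const llvm::Instruction &I : BB)
        if (const auto *CB = llvm::dyn_cast<llvm::CallBase>(&I))
          if (CB->getCalledFunction()) // direct call (indirect calls return null)
            ++Edges;
  return Edges;
}
```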

Revision tags: llvmorg-13.0.1-rc2

# 248d55af | 10-Jan-2022 | Mircea Trofin <[email protected]>
[NFC][MLGO] Use LazyCallGraph::Node to track functions.
This avoids the InlineAdvisor carrying the responsibility of deleting Function objects. We use LazyCallGraph::Node objects instead, which are stable in memory for the duration of the module-wide run of CGSCC passes started under the same ModuleToPostOrderCGSCCPassAdaptor (which is the case here).
Differential Revision: https://reviews.llvm.org/D116964

# a8c2ba10 | 10-Dec-2021 | Nikita Popov <[email protected]>
[Inline] Disable deferred inlining
After the switch to the new pass manager, we have observed multiple instances of catastrophic inlining, where the inliner produces huge functions with many hundreds of thousands of instructions from small input IR. We were forced to back out the switch to the new pass manager for this reason. This patch fixes at least one of the root cause issues.
LLVM uses a bottom-up inliner, and the fact that functions are processed bottom-up is not just a question of optimality -- it is an important requirement to prevent runaway inlining. The premise of the current inlining approach and cost model is that, after all calls inside a function have been inlined, it may get large enough that inlining it into its callers is no longer considered profitable. This safeguard does not exist if inlining doesn't happen bottom-up, as inlining the callees, and their callees, and their callees etc. will always seem individually profitable, and the inliner can easily flatten the whole call tree.
There are instances where we necessarily have to deviate from bottom-up inlining: When inlining in an SCC there is no natural "bottom", so inlining effectively happens top-down. This requires special care, and the inliner avoids exponential blowup by ensuring that functions in the SCC grow in a balanced way and will eventually hit the threshold.
However, there is one instance where the inlining advisor explicitly violates the bottom-up principle: Deferred inlining tries to "defer" inlining a call if it determines that inlining the caller into all its call-sites would be more profitable. Something very important to understand about deferred inlining is that it doesn't make one inlining choice in place of another -- it effectively chooses to do both. If we have a call chain A -> B -> C and cost modelling tells us that inlining B -> C is profitable, but we defer this and instead inline A -> B first, then we'll now have a call A -> C, and the cost model will (a few special cases notwithstanding) still tell us that this is profitable. So the end result is that we inlined *both* B and C, even though under the usual cost model function B would have been too large to further inline after C has been integrated into it.
Because deferred inlining violates the bottom-up invariant of the inliner, it can result in exponential inlining. The exponential-deferred-inlining.ll test case illustrates this on a simple example (see https://gist.github.com/nikic/1262b5f7d27278e1b34a190ae10947f5 for a much more catastrophic case with about 5000x size blowup). If the call chain A -> B -> C is not a chain but a tree of calls, then we end up deferring inlining across the tree and end up flattening everything into the root node.
This patch proposes to address this by disabling deferred inlining entirely (currently still behind an option). Beyond the issue of exponential inlining, I don't think that the whole concept makes sense, at least as long as deferred inlining still ends up inlining both call edges.
I believe the motivation for having deferred inlining in the first place is that you might have a small wrapper function with local linkage that could be eliminated if inlined. This would automatically happen if there was a single caller, due to the large "last call to local" bonus. However, this bonus is not extended if there are multiple callers, even if we would eventually end up inlining into all of them (if the bonus were extended).
Now, unlike the normal inlining cost model, the deferred inlining cost model does look at all callers, and will extend the "last call to local" bonus if it determines that we could inline all of them as long as we defer the current inlining decision. This makes very little sense. The "last call to local" bonus doesn't really cost model anything. It's basically an "infinite" bonus that ensures we always inline the last call to a local. The fact that it's not literally infinite just prevents inlining of huge functions, which can easily result in scalability issues. I very much doubt that it was an intentional cost-modelling choice to say that getting rid of a small local function is worth adding 15000 instructions elsewhere, yet this is exactly how this value is getting used here.
The main alternative I see to complete removal is to change deferred inlining to an actual either/or decision. That is, to mark deferred calls as noinline so we're actually trading off one inlining decision against another, and not just adding a side-channel to the cost model to do both.
Apart from fixing the catastrophic inlining case, the effect on rustc is a modest compile-time improvement on average (up to 8% for a parsing-type crate, where tree-like calls are expected) and pretty neutral where run-time performance is concerned (a mix of small wins and losses, usually in the sub-1% category).
Differential Revision: https://reviews.llvm.org/D115497
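
A minimal C++ sketch of the A -> B -> C shape described above (the names and bodies are illustrative only, not taken from the patch or its tests):

```
// If cost modelling says inlining B -> C is profitable but the advisor defers it
// and inlines A -> B first, the resulting direct call A -> C still looks
// profitable, so both edges end up inlined rather than one instead of the other.
static int C(int x) { return x * x; }     // small leaf callee
static int B(int x) { return C(x) + 1; }  // local-linkage wrapper the deferral tries to preserve
int A(int x) { return B(x) - 1; }         // caller; with a tree of such calls, everything flattens into A
```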

# 7abf299f | 14-Dec-2021 | Nikita Popov <[email protected]>
[InlineAdvisor] Add option to control deferred inlining (NFC)
This change is split out from D115497 to add the option independently from the switch of the default value.

# 3beafece | 09-Dec-2021 | Nikita Popov <[email protected]>
[InlineAdvisor] Remove outdated comment (NFC)
This just returns None nowadays, so this comment doesn't apply anymore.

Revision tags: llvmorg-13.0.1-rc1

# 5caad9b5 | 29-Oct-2021 | modimo <[email protected]>
[InlineAdvisor] Add fallback/format switches and negative remark processing to Replay Inliner
Adds the following switches:
1. --sample-profile-inline-replay-fallback/--cgscc-inline-replay-fallback: controls what the replay advisor does for inline sites that are not present in the replay. Options are:
   1. Original: defers to the original advisor
   2. AlwaysInline: inline all sites not in the replay
   3. NeverInline: inline no sites not in the replay
2. --sample-profile-inline-replay-format/--cgscc-inline-replay-format: controls what format should be generated to match against the replay remarks. Options are:
   1. Line
   2. LineColumn
   3. LineDiscriminator
   4. LineColumnDiscriminator
Adds support for negative inlining decisions. These are denoted by "will not be inlined into" as compared to the positive "inlined into" in the remarks.
All of these, together with the previous `--sample-profile-inline-replay-scope/--cgscc-inline-replay-scope`, allow tweaking how replay is applied. In my testing, I'm using:
1. --sample-profile-inline-replay-scope/--cgscc-inline-replay-scope = Function, to only replay on a function
2. --sample-profile-inline-replay-fallback/--cgscc-inline-replay-fallback = NeverInline, since I'm feeding in only positive remarks to the replay system
3. --sample-profile-inline-replay-format/--cgscc-inline-replay-format = Line, since I'm generating the remarks from DWARF information from GCC, which can conflict quite heavily in column number compared to Clang
An alternative configuration could be to do Function, AlwaysInline, Line fallback with negative remarks, which more closely matches the final call-sites. Note that this can lead to unbounded inlining if a negative remark doesn't match/exist for one reason or another.
Updated various tests to cover the new switches and negative remarks.
Testing: ninja check-all
Reviewed By: wenlei, mtrofin
Differential Revision: https://reviews.llvm.org/D112040

# 313c657f | 18-Oct-2021 | modimo <[email protected]>
[InlineAdvisor] Add -inline-replay-scope=<Function|Module> to control replay scope
The goal is to allow grafting an inline tree from Clang or GCC into a new compilation without affecting other functions. For GCC, we're doing this by extracting the inline tree from DWARF information and generating the equivalent remarks.
This allows easier side-by-side asm analysis and a trial way to see if a particular inlining setup provides benefits by itself.
Testing: ninja check-all
Reviewed By: wenlei, mtrofin
Differential Revision: https://reviews.llvm.org/D110658

# 7d541eb4 | 30-Sep-2021 | Mircea Trofin <[email protected]>
[inliner] Mandatory inlining decisions produce remarks
This also removes the need to disable the mandatory inlining phase in tests.
In a departure from the previous remark, we don't output a 'cost' in this case, because there's no such thing. We just report that inlining happened because of the attribute.
Differential Revision: https://reviews.llvm.org/D110891

Revision tags: llvmorg-13.0.0, llvmorg-13.0.0-rc4

# 0bb767e7 | 23-Sep-2021 | Fangrui Song <[email protected]>
[InlineAdvisor] Use one single quote

Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2

# 3f4d00bc | 18-Aug-2021 | Arthur Eubanks <[email protected]>
[NFC] More get/removeAttribute() cleanup

# 76093b17 | 10-Aug-2021 | Fangrui Song <[email protected]>
[InlineAdvisor] Add single quotes around caller/callee names
Clang diagnostics refer to identifier names in quotes. This patch makes inline remarks conform to the convention. New behavior:
```
% clang -O2 -Rpass=inline -Rpass-missed=inline -S a.c
a.c:4:25: remark: 'foo' inlined into 'bar' with (cost=-30, threshold=337) at callsite bar:0:25; [-Rpass=inline]
int bar(int a) { return foo(a); }
                        ^
```
Reviewed By: hoy
Differential Revision: https://reviews.llvm.org/D107791

Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init

# 935dea2c | 27-Jul-2021 | Mircea Trofin <[email protected]>
[MLGO] fix silly LLVM_DEBUG misuse

# eb76ca57 | 27-Jul-2021 | Mircea Trofin <[email protected]>
[NFC][MLGO] Debug messages for what inline advisor is selected
We already have an indication (error) if the desired inline advisor cannot be enabled, but we don't have a positive indication. Added LLVM_DEBUG messages for the latter.
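
For readers unfamiliar with the mechanism, such messages use LLVM's standard debug-output macro; a hypothetical sketch of the pattern (not the exact strings added by the commit):

```
#include "llvm/Support/Debug.h"

#define DEBUG_TYPE "inline"

// Emitted only in asserts-enabled builds, and only when -debug or
// -debug-only=inline is passed.
static void noteAdvisorChoice() {
  LLVM_DEBUG(llvm::dbgs() << "Using default (heuristic) inline advisor\n");
}
```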

Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1, llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4

# 1ce2b584 | 12-Mar-2021 | serge-sans-paille <[email protected]>
[NFC] Use llvm::raw_string_ostream instead of std::stringstream
That's more efficient, and we don't lose any valuable features when doing so.
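
As a brief illustration of the replacement pattern (a hypothetical sketch, not the code touched by the commit): llvm::raw_string_ostream writes straight into a caller-owned std::string, avoiding the locale and buffering overhead of std::stringstream.

```
#include "llvm/Support/raw_ostream.h"
#include <string>

// Build a message directly into Buf instead of going through std::stringstream.
std::string formatRemark(int Cost, int Threshold) {
  std::string Buf;
  llvm::raw_string_ostream OS(Buf); // stream output lands in Buf
  OS << "cost=" << Cost << ", threshold=" << Threshold;
  OS.flush();
  return Buf;
}
```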