|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init |
|
| #
19c5638e |
| 26-Jul-2022 |
Phoebe Wang <[email protected]> |
[ArgPromotion] Transfer metadata nontemporal to promoted loads
Fixes #56703
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D130536
|
| #
acf648b5 |
| 24-Jul-2022 |
Kazu Hirata <[email protected]> |
Use llvm::less_first and llvm::less_second (NFC)
|
| #
3d9ce9e4 |
| 28-Jun-2022 |
Pavel Samolysov <[email protected]> |
[ArgPromotion] Remove all the getters and ReplaceCallSite (NFC)
AARGetter is an abstraction over a source of the `AAResults` introduced to support the legacy pass manager as well as the modern one.
[ArgPromotion] Remove all the getters and ReplaceCallSite (NFC)
AARGetter is an abstraction over a source of the `AAResults` introduced to support the legacy pass manager as well as the modern one. Since the Argument Promotion pass doesn't support the legacy pass manager anymore, the abstraction is not required and `AAResults` may be used directly.
The instance of the `FunctionAnalysisManager` is passed through the functions to get all the required analyses just wherever they are required and do not use the awkward getter callbacks.
The `ReplaceCallSite` parameter was required for the legacy pass manager only and isn't used anymore, so the parameter has been eliminated.
Differential Revision: https://reviews.llvm.org/D128727
show more ...
|
|
Revision tags: llvmorg-14.0.6, llvmorg-14.0.5, llvmorg-14.0.4 |
|
| #
8958057f |
| 18-May-2022 |
Pavel Samolysov <[email protected]> |
[ArgPromotion] Move isDenselyPacked static member (NFC)
The `isDenselyPacked` static member of the `ArgumentPromotionPass` class is not used in the class itself anymore. The single known user of the
[ArgPromotion] Move isDenselyPacked static member (NFC)
The `isDenselyPacked` static member of the `ArgumentPromotionPass` class is not used in the class itself anymore. The single known user of the function is in the `AttributorAttributes.cpp` file, so the function has been moved into the file.
Differential Revision: https://reviews.llvm.org/D128725
show more ...
|
| #
170c4d21 |
| 04-May-2022 |
Pavel Samolysov <[email protected]> |
[ArgPromotion] Unify byval promotion with non-byval
It makes sense to handle byval promotion in the same way as non-byval but also allowing `store` instructions. However, these should use the same c
[ArgPromotion] Unify byval promotion with non-byval
It makes sense to handle byval promotion in the same way as non-byval but also allowing `store` instructions. However, these should use the same checks as the `load` instructions do, i.e. be part of the `ArgsToPromote` collection. For these instructions, the check for interfering modifications can be disabled, though. The promotion algorithm itself has been modified a lot: all the accesses (i.e. loads and stores) are rewritten to the emitted `alloca` instructions. To optimize these new `alloca`s out, the `PromoteMemToReg` function from `Transforms/Utils/PromoteMemoryToRegister.cpp` file is invoked after promotion.
In order to let the `PromoteMemToReg` promote as many `alloca`s as it is possible, there should be no `GEP`s from the `alloca`s. To eliminate the `GEP`s, its own `alloca` is generated for every argument part because a single `alloca` for the whole argument (that significantly simplifies the code of the pass though) unfortunately cannot be used.
The idea comes from the following discussion: https://reviews.llvm.org/D124514#3479676
Differential Revision: https://reviews.llvm.org/D125485
show more ...
|
| #
217e8576 |
| 24-Jun-2022 |
Nikita Popov <[email protected]> |
[ArgPromotion] Remove legacy PM support
Support for the legacy pass manager in ArgPromotion causes complications in D125485. As the legacy pass manager for middle-end optimizations is unsupported, d
[ArgPromotion] Remove legacy PM support
Support for the legacy pass manager in ArgPromotion causes complications in D125485. As the legacy pass manager for middle-end optimizations is unsupported, drop ArgPromotion from the legacy pipeline, rather than introducing additional complexity to deal with it.
Differential Revision: https://reviews.llvm.org/D128536
show more ...
|
| #
d46fa1fc |
| 26-Jun-2022 |
Nuno Lopes <[email protected]> |
[ArgumentPromotion] use poison when replacing dead instructions instead of undef [NFC]
|
| #
098afdb0 |
| 12-May-2022 |
Pavel Samolysov <[email protected]> |
[ArgPromotion] Make a non-byval promotion attempt first
It makes sense to make a non-byval promotion attempt first and then fall back to the byval one. The non-byval ('usual') promotion is generally
[ArgPromotion] Make a non-byval promotion attempt first
It makes sense to make a non-byval promotion attempt first and then fall back to the byval one. The non-byval ('usual') promotion is generally better, for example it does promotion even when a structure has more elements than 'MaxElements' but not all of them are actually used in the function.
Differential Revision: https://reviews.llvm.org/D124514
show more ...
|
| #
7c044542 |
| 02-May-2022 |
Phoebe Wang <[email protected]> |
[ArgPromotion][Attributor] Update min-legal-vector-width when do promotion
X86 codegen uses function attribute `min-legal-vector-width` to select the proper ABI. The intention of the attribute is to
[ArgPromotion][Attributor] Update min-legal-vector-width when do promotion
X86 codegen uses function attribute `min-legal-vector-width` to select the proper ABI. The intention of the attribute is to reflect user's requirement when they passing or returning vector arguments. So Clang front-end will iterate the vector arguments and set `min-legal-vector-width` to the width of the maximum for both caller and callee.
It is assumed any middle end optimizations won't care of the attribute expect inlining and argument promotion. - For inlining, we will propagate the attribute of inlined functions because the inlining functions become the newer caller. - For argument promotion, we check the `min-legal-vector-width` of the caller and callee and refuse to promote when they don't match.
The problem comes from the optimizations' combination, as shown by https://godbolt.org/z/zo3hba8xW. The caller `foo` has two callees `bar` and `baz`. When doing argument promotion, both `foo` and `bar` has the same `min-legal-vector-width`. So the argument was promoted to vector. Then the inlining inlines `baz` to `foo` and updates `min-legal-vector-width`, which results in ABI mismatch between `foo` and `bar`.
This patch fixes the problem by expanding the concept of `min-legal-vector-width` to indicator of functions arguments. That says, any passes touch functions arguments have to set `min-legal-vector-width` to the value reflects the width of vector arguments. It makes sense to me because any arguments modifications are ABI related and should response for the ABI compatibility.
Differential Revision: https://reviews.llvm.org/D123284
show more ...
|
|
Revision tags: llvmorg-14.0.3 |
|
| #
9197959e |
| 28-Apr-2022 |
Pavel Samolysov <[email protected]> |
[ArgPromotion] Move ArgPart and OffsetAndArgPart to anonymous namespace
The structure ArgPart and alias OffsetAndArgPart have been moved into the anonymous namespace. NFC.
Reviewed By: aeubanks
Di
[ArgPromotion] Move ArgPart and OffsetAndArgPart to anonymous namespace
The structure ArgPart and alias OffsetAndArgPart have been moved into the anonymous namespace. NFC.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D124617
show more ...
|
| #
6b825e50 |
| 28-Apr-2022 |
Pavel Samolysov <[email protected]> |
[ArgPromotion] Change the condition to check the promotion limit
The condition should be 'ArgParts.size() > MaxElements', so that if we have exactly 3 elements in the 'ArgParts' vector, the promotio
[ArgPromotion] Change the condition to check the promotion limit
The condition should be 'ArgParts.size() > MaxElements', so that if we have exactly 3 elements in the 'ArgParts' vector, the promotion should be allowed because the 'MaxElement' threshold is not exceeded yet.
The default value for 'MaxElement' has been decreased to 2 in order to avoid an actual change in argument promoting behavior. However, this changes byval argument transformation behavior by allowing adding not more than 2 arguments to the function instead of 3 allowed before.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D124178
show more ...
|
| #
744a8378 |
| 28-Apr-2022 |
Pavel Samolysov <[email protected]> |
[ArgPromotion] Rename variables according to the code style. NFC
Some loop counters ('i', 'e') and variables ('type') were named not in accordance with the code style and clang-tidy issues warnings
[ArgPromotion] Rename variables according to the code style. NFC
Some loop counters ('i', 'e') and variables ('type') were named not in accordance with the code style and clang-tidy issues warnings about the using of such variables. This patch renames the variables and fixes some typos in the comments within the source file.
Differential Revision: https://reviews.llvm.org/D123662
show more ...
|
|
Revision tags: llvmorg-14.0.2 |
|
| #
51561b5e |
| 12-Apr-2022 |
Arthur Eubanks <[email protected]> |
[ArgPromo][OpaquePointer] Don't promote mismatched function types
Mismatched call/callee function types is considered an indirect call.
Fixes crash in https://reviews.llvm.org/D123300#3446023.
|
|
Revision tags: llvmorg-14.0.1 |
|
| #
f1985a3f |
| 21-Mar-2022 |
serge-sans-paille <[email protected]> |
Cleanup includes: Transforms/IPO
Preprocessor output diff: -238205 lines Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.ll
Cleanup includes: Transforms/IPO
Preprocessor output diff: -238205 lines Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D122183
show more ...
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2 |
|
| #
e2406781 |
| 10-Feb-2022 |
Nikita Popov <[email protected]> |
[ArgPromotion] Protect harder against recursive promotion (PR42028)
In addition to the self-recursion check, also check whether there is more than one node in the SCC, which implies that there is a
[ArgPromotion] Protect harder against recursive promotion (PR42028)
In addition to the self-recursion check, also check whether there is more than one node in the SCC, which implies that there is a larger cycle. I believe checking SCC structure (rather than something like norecurse) is the right thing to do here, because this is specifically about preventing infinite loops over the SCC.
Fixes https://github.com/llvm/llvm-project/issues/42028.
Differential Revision: https://reviews.llvm.org/D119418
show more ...
|
| #
8018d6be |
| 10-Feb-2022 |
Nikita Popov <[email protected]> |
[ArgPromotion] Transfer metadata to promoted loads
Also transfer selected non-AA metadata to the promoted load. Only metadata from guaranteed to execute loads is transferred.
|
|
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init |
|
| #
68c1eeb4 |
| 28-Jan-2022 |
Nikita Popov <[email protected]> |
[ArgPromotion] Make implementation offset based
This rewrites ArgPromotion to be based on offsets rather than GEP structure. We inspect all loads at constant offsets and remember which types are loa
[ArgPromotion] Make implementation offset based
This rewrites ArgPromotion to be based on offsets rather than GEP structure. We inspect all loads at constant offsets and remember which types are loaded at which offsets. Then we promote based on those types.
This generalizes ArgPromotion to work with bitcasted loads, and is compatible with opaque pointers.
This patch also fixes incorrect handling of alignment during argument promotion. Previously, the implementation only checked that the pointer is dereferenceable, but was happy to speculate overaligned loads. (I would have fixed this separately in advance, but I found this hard to do with the previous implementation approach).
Differential Revision: https://reviews.llvm.org/D118685
show more ...
|
| #
b8963348 |
| 08-Feb-2022 |
Nikita Popov <[email protected]> |
[ArgPromotion] Check dereferenceability on argument as well
Before walking all the callers, check whether we have a dereferenceable attribute directly on the argument.
Also make it clearer that the
[ArgPromotion] Check dereferenceability on argument as well
Before walking all the callers, check whether we have a dereferenceable attribute directly on the argument.
Also make it clearer that the code currently does not treat alignment correctly.
show more ...
|
| #
79179a37 |
| 01-Feb-2022 |
Nikita Popov <[email protected]> |
[ArgPromotion] Use range-based for loop (NFC)
|
| #
0ebbf343 |
| 28-Jan-2022 |
Nikita Popov <[email protected]> |
[ArgPromotion] Don't assume all entry block instrs are executed
We should abort this walk if we hit any instruction that is not guaranteed to transfer.
|
| #
8b36c437 |
| 28-Jan-2022 |
Nikita Popov <[email protected]> |
[ArgPromotion] Make areFunctionArgsABICompatible() static (NFC)
This function used to be shared with the Attributor, but can now be made private.
|
| #
aa97bc11 |
| 21-Jan-2022 |
Nikita Popov <[email protected]> |
[NFC] Remove uses of PointerType::getElementType()
Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType().
This is part of D117885, in preparation for deprecatin
[NFC] Remove uses of PointerType::getElementType()
Instead use either Type::getPointerElementType() or Type::getNonOpaquePointerElementType().
This is part of D117885, in preparation for deprecating the API.
show more ...
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
f5ac23b5 |
| 20-Dec-2021 |
Nikita Popov <[email protected]> |
[ArgPromotion][TTI] Pass types to ABI compatibility hook
The areFunctionArgsABICompatible() hook currently accepts a list of pointer arguments, though what we're actually interested in is the ABI co
[ArgPromotion][TTI] Pass types to ABI compatibility hook
The areFunctionArgsABICompatible() hook currently accepts a list of pointer arguments, though what we're actually interested in is the ABI compatibility after these pointer arguments have been converted into value arguments.
This means that a) the current API is incompatible with opaque pointers (because it requires inspection of pointee types) and b) it can only be used in the specific context of ArgPromotion. I would like to reuse the API when inspecting calls during inlining.
This patch converts it into an areTypesABICompatible() hook, which accepts a list of types. This makes the method more generally usable, and compatible with opaque pointers from an API perspective (the actual usage in ArgPromotion/Attributor is still incompatible, I'll follow up on that in separate patches).
Differential Revision: https://reviews.llvm.org/D116031
show more ...
|
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
19867de9 |
| 03-May-2021 |
Arthur Eubanks <[email protected]> |
[NewPM] Only invalidate modified functions' analyses in CGSCC passes + turn on eagerly invalidate analyses
Previously, any change in any function in an SCC would cause all analyses for all functions
[NewPM] Only invalidate modified functions' analyses in CGSCC passes + turn on eagerly invalidate analyses
Previously, any change in any function in an SCC would cause all analyses for all functions in the SCC to be invalidated. With this change, we now manually invalidate analyses for functions we modify, then let the pass manager know that all function analyses should be preserved since we've already handled function analysis invalidation.
So far this only touches the inliner, argpromotion, function-attrs, and updateCGAndAnalysisManager(), since they are the most used.
This is part of an effort to investigate running the function simplification pipeline less on functions we visit multiple times in the inliner pipeline.
However, this causes major memory regressions especially on larger IR. To counteract this, turn on the option to eagerly invalidate function analyses. This invalidates analyses on functions immediately after they're processed in a module or scc to function adaptor for specific parts of the pipeline.
Within an SCC, if a pass only modifies one function, other functions in the SCC do not have their analyses invalidated, so in later function passes in the SCC pass manager the analyses may still be cached. It is only after the function passes that the eager invalidation takes effect. For the default pipelines this makes sense because the inliner pipeline runs the function simplification pipeline after all other SCC passes (except CoroSplit which doesn't request any analyses).
Overall this has mostly positive effects on compile time and positive effects on memory usage. https://llvm-compile-time-tracker.com/compare.php?from=7f627596977624730f9298a1b69883af1555765e&to=39e824e0d3ca8a517502f13032dfa67304841c90&stat=instructions https://llvm-compile-time-tracker.com/compare.php?from=7f627596977624730f9298a1b69883af1555765e&to=39e824e0d3ca8a517502f13032dfa67304841c90&stat=max-rss
D113196 shows that we slightly regressed compile times in exchange for some memory improvements when turning on eager invalidation. D100917 shows that we slightly improved compile times in exchange for major memory regressions in some cases when invalidating less in SCC passes. Turning these on at the same time keeps the memory improvements while keeping compile times neutral/slightly positive.
Reviewed By: asbirlea, nikic
Differential Revision: https://reviews.llvm.org/D113304
show more ...
|
| #
73797367 |
| 14-Nov-2021 |
Kazu Hirata <[email protected]> |
[llvm] Use range-based for loops with User::operands (NFC)
|