|
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6, llvmorg-14.0.5 |
|
| #
78c6b148 |
| 02-Jun-2022 |
Florian Hahn <[email protected]> |
[CaptureTracking] Increase limit and use it for all visited uses.
Currently the MaxUsesToExplore limit only applies to the number of users per value, not the total number of users to explore.
The current limit of 20 pessimizes IR with opaque pointers in some cases. Without opaque pointers, we have deeper pointer def-use chains in general due to extra bitcasts and geps for structs with index 0.
With opaque pointers the def-use chain is not as deep but wider, due to bitcasts & 0-geps missing.
To improve the situation for opaque pointers, this patch does 2 things:
1. Apply the limit to the total number of uses visited. From the wording in the description of the option it seems like this may be the original intention. With the current implementation we could still end up walking a lot of uses.
2. Increase the limit to 100. This is quite arbitrary, but enables a good number of additional optimizations.
Those adjustments have a noticeable compile-time impact though. In part that is likely due to additional transformations (and conversely the current baseline misses optimizations after switching to opaque pointers).
This recovers some regressions that showed up after enabling opaque pointers.
Limit=100:
* NewPM-O3: +0.21%
* NewPM-ReleaseThinLTO: +0.87%
* NewPM-ReleaseLTO-g: +0.46%
https://llvm-compile-time-tracker.com/compare.php?from=2e50ecb2ef4e1da1aeab05bcf66380068e680991&to=7e6fbe519d958d09f32f01d5d44a622f551e2031&stat=instructions
Limit=60:
* NewPM-O3: +0.14%
* NewPM-ReleaseThinLTO: +0.41%
* NewPM-ReleaseLTO-g: +0.21%
https://llvm-compile-time-tracker.com/compare.php?from=aeb19817d66f1a15754163c7f48e01e9ebdd6d45&to=520563fdc146319aae90d06f88d87f2e9e1247b7&stat=instructions
Limit=40:
* NewPM-O3: +0.11%
* NewPM-ReleaseThinLTO: +0.12%
* NewPM-ReleaseLTO-g: +0.09%
https://llvm-compile-time-tracker.com/compare.php?from=aeb19817d66f1a15754163c7f48e01e9ebdd6d45&to=c9182576e9fe3f1c84a71479665aef91a416318c&stat=instructions
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D126236
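The difference between a per-value limit and a limit on the total number of visited uses can be illustrated with a toy def-use graph. This is only a sketch under stated assumptions: `UseGraph`, `exploreWithPerValueLimit` and `exploreWithTotalLimit` are hypothetical names invented for this example, not LLVM APIs, and the graph is a plain adjacency list rather than LLVM IR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy def-use graph (hypothetical, not LLVM's): Users[V] lists the values
// that use value V.
using UseGraph = std::vector<std::vector<int>>;

// Per-value limit: gives up only if a single value has more than MaxUses users.
bool exploreWithPerValueLimit(const UseGraph &Users, int Root,
                              std::size_t MaxUses) {
  std::vector<bool> Seen(Users.size(), false);
  std::vector<int> Worklist{Root};
  Seen[Root] = true;
  while (!Worklist.empty()) {
    int V = Worklist.back();
    Worklist.pop_back();
    if (Users[V].size() > MaxUses)
      return false; // this one value is too widely used
    for (int U : Users[V])
      if (!Seen[U]) {
        Seen[U] = true;
        Worklist.push_back(U);
      }
  }
  return true;
}

// Total limit: gives up as soon as the whole walk has visited more than
// MaxUses uses, no matter how they are spread over values.
bool exploreWithTotalLimit(const UseGraph &Users, int Root,
                           std::size_t MaxUses) {
  std::vector<bool> Seen(Users.size(), false);
  std::vector<int> Worklist{Root};
  Seen[Root] = true;
  std::size_t Visited = 0;
  while (!Worklist.empty()) {
    int V = Worklist.back();
    Worklist.pop_back();
    for (int U : Users[V]) {
      if (++Visited > MaxUses)
        return false; // budget for the entire walk is exhausted
      if (!Seen[U]) {
        Seen[U] = true;
        Worklist.push_back(U);
      }
    }
  }
  return true;
}
```

On a wide but shallow graph (a root with three users that each have three users), a limit of 5 is never hit per value, yet the total walk of 12 uses exceeds it. That is the situation the commit message describes for opaque pointers.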
|
|
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
|
| #
b22ffc7b |
| 07-Apr-2022 |
Arthur Eubanks <[email protected]> |
[CaptureTracking] Ignore ephemeral values in EarliestEscapeInfo
And thread DSE's ephemeral values to EarliestEscapeInfo.
This allows more precise analysis in DSEState::isReadClobber() via BatchAA.
Followup to D123162.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D123342
|
| #
17fdaccc |
| 05-Apr-2022 |
Arthur Eubanks <[email protected]> |
[CaptureTracking] Ignore ephemeral values when determining pointer escapeness
Ephemeral values cannot cause a pointer to escape.
No change in compile time: https://llvm-compile-time-tracker.com/compare.php?from=4371710085ba1c376a094948b806ddd3b88319de&to=c5ddbcc4866f38026737762ee8d7b9b00395d4f4&stat=instructions
This partially fixes some regressions caused by more calls to `__builtin_assume` (D122397).
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D123162
|
|
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
|
| #
d6e09ce8 |
| 08-Mar-2022 |
Johannes Doerfert <[email protected]> |
[CaptureTracking][NFCI] Expose capture tracking logic
The logic exposed by this patch via `llvm::DetermineUseCaptureKind` was part of `llvm::PointerMayBeCaptured`. In the Attributor we want to keep track of the work list items but still reuse the logic if a use might capture a value. A follow up for the Attributor removes ~100 lines of code and complexity while making future handling of simplified values possible.
Differential Revision: https://reviews.llvm.org/D121272
|
|
Revision tags: llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1 |
|
| #
3a3cb929 |
| 07-Feb-2022 |
Kazu Hirata <[email protected]> |
[llvm] Use = default (NFC)
|
|
Revision tags: llvmorg-15-init |
|
| #
b752eb88 |
| 24-Jan-2022 |
Kazu Hirata <[email protected]> |
[Analysis] Use default member initialization (NFC)
Identified with modernize-use-default-member-init.
|
|
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
|
| #
793c0da8 |
| 17-Dec-2021 |
Philip Reames <[email protected]> |
[capturetracking] Explicitly check for callee operand [NFC]
Pull out an explicit check rather than relying on the fact that the callee operand is not a data operand. The only real value is it gives us a clear place to move the comment, and makes the code slightly more understandable.
|
|
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4 |
|
| #
6f28fb70 |
| 24-Sep-2021 |
Florian Hahn <[email protected]> |
Recommit "[DSE] Track earliest escape, use for loads in isReadClobber."
This reverts the revert commit df56fc6ebbee6c458b0473185277b7860f7e3408.
This version of the patch adjusts the location where the EarliestEscapes cache is cleared when an instruction gets removed. The earliest escaping instruction does not have to be a memory instruction.
It could be a ptrtoint instruction like in the added test @earliest_escape_ptrtoint, which subsequently gets removed. We need to invalidate the EarliestEscape entry referring to the ptrtoint when deleting it.
This fixes the crash mentioned in https://bugs.chromium.org/p/chromium/issues/detail?id=1252762#c6
|
| #
df56fc6e |
| 24-Sep-2021 |
Nico Weber <[email protected]> |
Revert "[DSE] Track earliest escape, use for loads in isReadClobber."
This reverts commit 5ce89279c0986d0bcbe526dce52f91dd0c16427c. Makes clang crash, see comments on https://reviews.llvm.org/D109844
|
| #
5ce89279 |
| 23-Sep-2021 |
Florian Hahn <[email protected]> |
[DSE] Track earliest escape, use for loads in isReadClobber.
At the moment, DSE only considers whether a pointer may be captured at all in a function. This leads to cases where we fail to remove stores to local objects because we do not check if they escape before potential read-clobbers or after.
Doing context-sensitive escape queries in isReadClobber was removed a while ago in d1a1cce5b130 to save compile-time. See PR50220 for more context.
This patch introduces a new capture tracker, which keeps track of the 'earliest' capture. An instruction A is considered earlier than instruction B if A dominates B. If 2 escapes do not dominate each other, the terminator of the common dominator is chosen. If not all uses can be analyzed, the earliest escape is set to the first instruction in the function entry block.
If the query instruction dominates the earliest escape and is not in a cycle, then the pointer does not escape before the query instruction.
This patch uses this information when checking if a load of a loaded underlying object may alias a write to a stack object. If the stack object does not escape before the load, they do not alias.
I will share a follow-up patch to also use the information for call instructions to fix PR50220.
In terms of compile-time, the impact is low in general:
* NewPM-O3: +0.05%
* NewPM-ReleaseThinLTO: +0.05%
* NewPM-ReleaseLTO-g: +0.03%
with the largest change being tramp3d-v4 (+0.30%):
http://llvm-compile-time-tracker.com/compare.php?from=1a3b3301d7aa9ab25a8bdf045c77298b087e3930&to=bc6c6899cae757c3480f4ad4874a76fc1eafb0be&stat=instructions
Compared to always computing the capture information on demand, we get the following benefits from the caching:
* NewPM-O3: -0.03%
* NewPM-ReleaseThinLTO: -0.08%
* NewPM-ReleaseLTO-g: -0.04%
The biggest speedup is tramp3d-v4 (-0.21%).
http://llvm-compile-time-tracker.com/compare.php?from=0b0c99177d1511469c633282ef67f20c851f58b1&to=bc6c6899cae757c3480f4ad4874a76fc1eafb0be&stat=instructions
Overall there is a small, but noticeable benefit from caching. I am not entirely sure if the speedups warrant the extra complexity of caching. The way the caching works also means that we might miss a few cases, as it is less precise. Also, there may be a better way to cache things.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D109844
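The rule for combining two escapes into an 'earliest' one can be sketched on a toy model. Everything below is an assumption made for illustration: `Inst`, `ToyFunction`, `dominates`, `nearestCommonDominator` and `mergeEarliest` are hypothetical names, instructions are reduced to (block, position) pairs, and dominance comes from a hand-written immediate-dominator array rather than LLVM's DominatorTree.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy model: an instruction is a (block, position) pair.
struct Inst {
  int Block;
  int Index;
};

struct ToyFunction {
  std::vector<int> IDom;       // IDom[B] = immediate dominator of B; entry maps to itself
  std::vector<int> Terminator; // position of each block's terminator
};

// True if block A dominates block B (every block dominates itself).
bool blockDominates(const ToyFunction &F, int A, int B) {
  while (B != A) {
    if (F.IDom[B] == B)
      return false; // reached the entry block without meeting A
    B = F.IDom[B];
  }
  return true;
}

// Instruction-level dominance: program order inside a block, block
// dominance across blocks.
bool dominates(const ToyFunction &F, Inst A, Inst B) {
  if (A.Block == B.Block)
    return A.Index <= B.Index;
  return blockDominates(F, A.Block, B.Block);
}

// Walks up from A until a block that also dominates B is found.
// Assumes a well-formed tree in which the entry dominates every block.
int nearestCommonDominator(const ToyFunction &F, int A, int B) {
  while (!blockDominates(F, A, B))
    A = F.IDom[A];
  return A;
}

// Combines two escape points following the rule in the commit message:
// keep the dominating one, else fall back to the terminator of the
// common dominator block.
Inst mergeEarliest(const ToyFunction &F, Inst X, Inst Y) {
  if (dominates(F, X, Y))
    return X;
  if (dominates(F, Y, X))
    return Y;
  int NCD = nearestCommonDominator(F, X.Block, Y.Block);
  return Inst{NCD, F.Terminator[NCD]};
}
```

For a diamond-shaped CFG, two escapes in the two branches merge to the terminator of the entry block, which is a conservative but cheap-to-cache answer. The cycle check mentioned in the commit message is not modelled here.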
|
| #
7f6a4826 |
| 20-Sep-2021 |
Florian Hahn <[email protected]> |
[CaptureTracking] Allow passing LI to PointerMayBeCapturedBefore (NFC).
isPotentiallyReachable can use LoopInfo to return earlier. This patch allows passing an optional LI to PointerMayBeCapturedBefore. Used in D109844.
Reviewed By: nikic, asbirlea
Differential Revision: https://reviews.llvm.org/D109978
|
|
Revision tags: llvmorg-13.0.0-rc3, llvmorg-13.0.0-rc2, llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
|
| #
72431201 |
| 16-May-2021 |
Nikita Popov <[email protected]> |
[CaptureTracking] Simplify reachability check (NFCI)
This code was re-implementing the same-BB case of isPotentiallyReachable(). Historically, this was done because CaptureTracking used additional caching for local dominance queries. Now that it is no longer needed, the code is effectively the same as isPotentiallyReachable().
The only differences are extra checks for invoke/phis. These are misleading checks related to dominance in the value availability sense that are not relevant for control reachability. The invoke check was correct but redundant in that invokes are always terminators, so `I` could never come before the invoke. The phi check is a matter of interpretation (should an earlier phi node be considered reachable from a later phi node in the same block?) but ultimately doesn't matter because phis don't capture anyway.
|
| #
656296b1 |
| 16-May-2021 |
Nikita Popov <[email protected]> |
Reapply [CaptureTracking] Do not check domination
Reapply after adjusting the synchronized.m test case, where the TODO is now resolved. The pointer is only captured on the exception handling path.
-----
For the CapturesBefore tracker, it is sufficient to check that I can not reach BeforeHere. This does not necessarily require that BeforeHere dominates I, it can also occur if the capture happens on an entirely disjoint path.
This change was previously accepted in D90688, but had to be reverted due to large compile-time impact in some cases: It increases the number of reachability queries that are performed.
After recent changes, the compile-time impact is largely mitigated, so I'm reapplying this patch. The remaining compile-time impact is largely proportional to changes in code-size.
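Why reachability alone is enough can be seen on a diamond-shaped control-flow graph. The sketch below is a block-level toy, not LLVM code: `ToyCFG`, `isReachable` and `captureMayHappenBefore` are hypothetical names, and a block is conservatively treated as reachable from itself.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy CFG: Succs[B] lists the successor blocks of block B.
using ToyCFG = std::vector<std::vector<int>>;

// Depth-first search; a block counts as reachable from itself.
bool isReachable(const ToyCFG &Succs, int From, int To) {
  std::vector<bool> Seen(Succs.size(), false);
  std::vector<int> Worklist{From};
  while (!Worklist.empty()) {
    int B = Worklist.back();
    Worklist.pop_back();
    if (B == To)
      return true;
    if (Seen[B])
      continue;
    Seen[B] = true;
    for (int S : Succs[B])
      Worklist.push_back(S);
  }
  return false;
}

// A capture in CaptureBlock can only matter for a query in QueryBlock if
// control can flow from the capture to the query.
bool captureMayHappenBefore(const ToyCFG &Succs, int CaptureBlock,
                            int QueryBlock) {
  return isReachable(Succs, CaptureBlock, QueryBlock);
}
```

With blocks 0 -> {1, 2} -> 3, a capture in block 1 cannot reach a query in block 2, so it can be ignored even though block 2 does not dominate block 1. A check that also required domination would have reported a capture here.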
|
| #
541c2845 |
| 16-May-2021 |
Nikita Popov <[email protected]> |
Revert "[CaptureTracking] Do not check domination"
This reverts commit 6b8b43e7af3074124e3c9e429e1fb08165799be4.
This causes clang test to fail (CodeGenObjC/synchronized.m). Revert until I can figure out whether that's an expected change.
|
| #
6b8b43e7 |
| 16-May-2021 |
Nikita Popov <[email protected]> |
[CaptureTracking] Do not check domination
For the CapturesBefore tracker, it is sufficient to check that I can not reach BeforeHere. This does not necessarily require that BeforeHere dominates I, it can also occur if the capture happens on an entirely disjoint path.
This change was previously accepted in D90688, but had to be reverted due to large compile-time impact in some cases: It increases the number of reachability queries that are performed.
After recent changes, the compile-time impact is largely mitigated, so I'm reapplying this patch. The remaining compile-time impact is largely proportional to changes in code-size.
|
| #
6e9363c9 |
| 15-May-2021 |
Nikita Popov <[email protected]> |
[CaptureTracking] Only check reachability for capture candidates
Reachability queries are very expensive, and currently performed for each instruction we look at, even though most of them will not lead to a capture and are thus ultimately irrelevant. It is more efficient to walk a few unnecessary instructions than to perform unnecessary reachability queries.
Theoretically, this may produce worse results, because the additional instructions considered may cause us to hit the use count limit earlier. In practice, this does not appear to be a problem, e.g. on test-suite O3 we report only one more captured-before with this change, with no resulting codegen differences.
This makes PointerMayBeCapturedBefore() significantly cheaper in practice, hopefully allowing it to be used in more places.
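The reordering can be sketched by counting how often the expensive query runs. This is a toy model under stated assumptions: `ToyUse`, `WalkStats`, `walkEager` and `walkLazy` are hypothetical names, and the reachability answer is precomputed in a field so that only the number of queries is modelled.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy use: whether it could capture the pointer, and what a
// reachability query for it would answer.
struct ToyUse {
  bool MayCapture;
  bool ReachesQuery;
};

struct WalkStats {
  bool CapturedBefore;
  int ReachabilityQueries;
};

// Old order: ask the expensive reachability question for every use.
WalkStats walkEager(const std::vector<ToyUse> &Uses) {
  WalkStats R{false, 0};
  for (const ToyUse &U : Uses) {
    ++R.ReachabilityQueries;
    if (!U.ReachesQuery)
      continue;
    if (U.MayCapture)
      R.CapturedBefore = true;
  }
  return R;
}

// New order: only uses that are capture candidates trigger a query.
WalkStats walkLazy(const std::vector<ToyUse> &Uses) {
  WalkStats R{false, 0};
  for (const ToyUse &U : Uses) {
    if (!U.MayCapture)
      continue;
    ++R.ReachabilityQueries;
    if (U.ReachesQuery)
      R.CapturedBefore = true;
  }
  return R;
}
```

Both walks agree on the answer in this model; the lazy one simply issues fewer queries. The toy does not model the use-count limit, which is where the commit message notes the results could differ in theory.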
|
| #
fb9ed197 |
| 15-May-2021 |
Nikita Popov <[email protected]> |
[IR] Add BasicBlock::isEntryBlock() (NFC)
This is a recurring and somewhat awkward pattern. Add a helper method for it.
|
| #
f765e54d |
| 15-May-2021 |
Nikita Popov <[email protected]> |
[CaptureTracking] Clean up same instruction check (NFC)
Check the BeforeHere == I case once in shouldExplore, instead of handling it in four different places.
|
| #
425781bc |
| 13-May-2021 |
Nikita Popov <[email protected]> |
[CaptureTracking] Use isIdentifiedFunctionLocal() (NFC)
These conditions together exactly match isIdentifiedFunctionLocal(), and this is also what we logically want to check for here.
|
|
Revision tags: llvmorg-12.0.0, llvmorg-12.0.0-rc5, llvmorg-12.0.0-rc4 |
|
| #
5698537f |
| 19-Mar-2021 |
Philip Reames <[email protected]> |
Update basic deref API to account for possibility of free [NFC]
This patch is plumbing to support work towards the goal outlined in the recent llvm-dev post "[llvm-dev] RFC: Decomposing deref(N) into deref(N) + nofree".
The point of this change is purely to simplify iteration on other pieces on the way to making the switch. Rebuilding with a change to Value.h is slow and painful, so I want to get the API change landed. Once that's done, I plan to more closely audit each caller, add the inference rules in their own patch, then post a patch with the langref changes and test diffs. The value of the command line flag is that we can exercise the inference logic in standalone patches without needing the whole switch ready to go just yet.
Differential Revision: https://reviews.llvm.org/D98908
|
|
Revision tags: llvmorg-12.0.0-rc3, llvmorg-12.0.0-rc2, llvmorg-11.1.0, llvmorg-11.1.0-rc3, llvmorg-12.0.0-rc1, llvmorg-13-init, llvmorg-11.1.0-rc2, llvmorg-11.1.0-rc1, llvmorg-11.0.1, llvmorg-11.0.1-rc2, llvmorg-11.0.1-rc1 |
|
| #
57b3bc8c |
| 07-Nov-2020 |
Nikita Popov <[email protected]> |
[CaptureTracking] Add statistics (NFC)
Add basic statistics on the number of pointers that have been determined to maybe capture / not capture.
|
| #
f63ab188 |
| 07-Nov-2020 |
Nikita Popov <[email protected]> |
[CaptureTracking] Early abort on too many uses (NFCI)
If there are too many uses, we should directly return -- there's no point in inspecting the remaining uses in the worklist, as we have to conservatively assume a capture anyway. This also means that tooManyUses() gets called exactly once, rather than potentially many times.
This restores the behavior prior to e9832dfdf366ddffba68164adb6855d17c9f87c1, where this was accidentally changed while moving the AddUses logic into a closure, thus making the return a return from the closure rather than the whole function.
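The pitfall described here, a `return` that leaves only the closure, can be reproduced in a small sketch. The names `WalkResult` and `walk` and the `AbortEarly` switch are hypothetical and exist only to contrast the two behaviours; the graph is assumed to be an acyclic adjacency list, not LLVM IR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct WalkResult {
  bool Captured;
  int TooManyUsesCalls; // how often the "too many uses" path was taken
};

// Hypothetical worklist walk over an acyclic use graph (Users[V] = users of V).
// With AbortEarly == false the caller ignores the lambda's result, which
// mirrors a `return` that only exits the closure.
WalkResult walk(const std::vector<std::vector<int>> &Users, int Root,
                std::size_t Limit, bool AbortEarly) {
  std::vector<int> Worklist;
  std::size_t Count = 0;
  WalkResult R{false, 0};
  auto AddUses = [&](int V) -> bool {
    for (int U : Users[V]) {
      if (Count++ >= Limit) {
        ++R.TooManyUsesCalls;
        R.Captured = true; // conservatively assume a capture
        return false;      // leaves the lambda, not walk()
      }
      Worklist.push_back(U);
    }
    return true;
  };
  if (!AddUses(Root) && AbortEarly)
    return R;
  while (!Worklist.empty()) {
    int V = Worklist.back();
    Worklist.pop_back();
    if (!AddUses(V) && AbortEarly)
      return R; // propagate the abort to the whole function
  }
  return R;
}
```

Both variants report a capture once the limit is hit, but only the early-abort variant stops immediately; the other keeps draining the worklist and hits the limit path again for every remaining value with uses.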
|
| #
d35366bc |
| 07-Nov-2020 |
Nikita Popov <[email protected]> |
[CaptureTracking] Correctly handle multiple uses in one instruction
If the same value is used multiple times in the same instruction, CaptureTracking may end up reporting the wrong use as being captured, and/or report the same use as being captured multiple times.
Make sure that all checks take the use operand number into account, rather than performing unreliable comparisons against the used value.
I'm not sure whether this can cause any problems in practice, but at least some capture trackers (ArgUsesTracker, AACaptureUseTracker) do care about which call argument is captured.
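The difference between comparing against the used value and using the operand number can be shown with a toy call that passes the same value twice. `ToyCall`, `ToyUse`, `capturesByValue` and `capturesByOperand` are hypothetical names for this sketch; values are plain integers and the per-argument `NoCapture` flags stand in for call-site attributes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy call: argument values plus a per-argument nocapture flag.
struct ToyCall {
  std::vector<int> Args;
  std::vector<bool> NoCapture;
};

// A use identifies the user and the operand slot, not just the value.
struct ToyUse {
  const ToyCall *User;
  unsigned OperandNo;
};

// Unreliable: looks for the first argument equal to V, so a second use of
// the same value in another slot is never examined.
bool capturesByValue(const ToyCall &C, int V) {
  for (std::size_t I = 0; I < C.Args.size(); ++I)
    if (C.Args[I] == V)
      return !C.NoCapture[I];
  return false;
}

// Reliable: consults exactly the operand slot the use refers to.
bool capturesByOperand(const ToyUse &U) {
  return !U.User->NoCapture[U.OperandNo];
}
```

For a call whose first argument is nocapture and whose second is not, the value-based check only ever sees the first slot and misses the capture through the second one.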
|
| #
bac97993 |
| 07-Nov-2020 |
Nikita Popov <[email protected]> |
[CaptureTracking] Avoid duplicate shouldExplore() check (NFCI)
We check shouldExplore() before adding uses to the worklist, so uses that should not be explored will not reach captured() in the first place.
|
| #
afe92642 |
| 05-Nov-2020 |
Anna Thomas <[email protected]> |
Revert "[CaptureTracking] Avoid overly restrictive dominates check"
This reverts commit 15694fd6ad955c6a16b446a6324364111a49ae8b. Need to investigate and fix a failing clang test: synchronized.m. Might need a test update.
|