Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6 |
# 3c126d5f | 22-Jun-2022 | Guillaume Chatelet <[email protected]>
[Alignment] Replace commonAlignment with std::min
`commonAlignment` is a shortcut to pick the smallest of two `Align` objects. As-is it doesn't bring much value compared to `std::min`.
Differential Revision: https://reviews.llvm.org/D128345
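For context, a minimal standalone sketch of the equivalence (the `Align` struct below is a simplified stand-in, not LLVM's actual class): two power-of-two alignments are jointly satisfied exactly by the smaller one, so `std::min` expresses the same thing `commonAlignment` did.

```cpp
// Simplified stand-in for llvm::Align, just to illustrate the point.
#include <algorithm>
#include <cassert>
#include <cstdint>

struct Align {
  uint64_t Value;                  // always a power of two
  bool operator<(const Align &O) const { return Value < O.Value; }
};

// Equivalent of the removed commonAlignment(): an address aligned to both
// A and B is guaranteed to be aligned to min(A, B).
Align commonAlignment(Align A, Align B) { return std::min(A, B); }

int main() {
  assert(commonAlignment(Align{16}, Align{4}).Value == 4);
}
```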
# 57ffff6d | 22-Jun-2022 | Guillaume Chatelet <[email protected]>
Revert "[NFC] Remove dead code"
This reverts commit 8ba2cbff70f2c49a8926451c59cc260d67b706cf.
# 8ba2cbff | 22-Jun-2022 | Guillaume Chatelet <[email protected]>
[NFC] Remove dead code
# f9bb8c24 | 13-Jun-2022 | Guillaume Chatelet <[email protected]>
[NFC][Alignment] Convert MemCpyOptimizer.cpp
Revision tags: llvmorg-14.0.5 |
# d86a206f | 05-Jun-2022 | Fangrui Song <[email protected]>
Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options
Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1 |
# a5c3b574 | 01-Apr-2022 | Nikita Popov <[email protected]>
[MemCpyOpt] Work around PR54682
As discussed on https://github.com/llvm/llvm-project/issues/54682, MemorySSA currently has a bug when computing the clobber of calls that access loop-varying locations. I think a "proper" fix for this on the MemorySSA side might be non-trivial, but we can easily work around this in MemCpyOpt:
Currently, MemCpyOpt uses a location-less getClobberingMemoryAccess() call to find a clobber on either the src or dest location, and then refines it for the src and dest clobber. This was intended as an optimization, as the location-less API is cached, while the location-affected APIs are not.
However, I don't think this really makes a difference in practice, because I don't think anything will use the cached clobbers on those calls later anyway. On CTMark, this patch seems to be very mildly positive actually.
So I think this is a reasonable way to avoid the problem for now, though MemorySSA should also get a fix.
Differential Revision: https://reviews.llvm.org/D122911
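A hedged sketch of the pattern described above (not the literal D122911 diff; the function and variable names are illustrative): query the clobber for each location directly instead of refining a cached, location-less result.

```cpp
#include "llvm/Analysis/MemorySSA.h"
using namespace llvm;

void queryClobbersPerLocation(MemorySSA &MSSA, MemoryUseOrDef *CallAccess,
                              const MemoryLocation &DestLoc,
                              const MemoryLocation &SrcLoc) {
  MemorySSAWalker *Walker = MSSA.getWalker();
  // The old approach: one cached, location-less query, later refined:
  //   MemoryAccess *Clobber = Walker->getClobberingMemoryAccess(CallAccess);
  // The workaround: location-affected queries for each location, sidestepping
  // the loop-varying-location bug at the cost of losing the cache.
  MemoryAccess *DestClobber =
      Walker->getClobberingMemoryAccess(CallAccess, DestLoc);
  MemoryAccess *SrcClobber =
      Walker->getClobberingMemoryAccess(CallAccess, SrcLoc);
  (void)DestClobber;
  (void)SrcClobber;
}
```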
# 7c51669c | 29-Mar-2022 | Philip Reames <[email protected]>
[memcpyopt] Restructure store(load src, dest) form of callslotopt for compile time
The search for the clobbering call is fairly expensive if uses are not optimized at construction. Defer the clobber walk to the point in the implementation we need it; there are a bunch of bailouts before that point. (e.g. If the source pointer is not an alloca, we can't do callslotopt.)
On a test case which involves a bunch of copies from argument pointers, this switches memcpyopt from > 1/2 second to < 10ms.
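A hedged sketch of the check ordering this describes (illustrative names; not the actual MemCpyOpt implementation): run the cheap structural bailouts before paying for the MemorySSA clobber walk.

```cpp
#include "llvm/IR/Instructions.h"
using namespace llvm;

static bool tryStoreOfLoadCallSlot(LoadInst *Load, StoreInst *Store) {
  // Cheap structural bailouts first: call slot optimization can only apply
  // when the copy source is a local alloca, and only to simple accesses.
  auto *SrcAlloca = dyn_cast<AllocaInst>(Load->getPointerOperand());
  if (!SrcAlloca)
    return false;
  if (Load->isVolatile() || Store->isVolatile())
    return false;

  // Only after the cheap checks pass do we do the expensive part: walking
  // MemorySSA to find the clobbering call between the load and the store.
  // MemoryAccess *Clobber = walkForClobberingCall(Load, Store); // elided
  return false;
}
```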
# 33deaa13 | 29-Mar-2022 | Philip Reames <[email protected]>
[memcpyopt] Common code into performCallSlotOptzn [NFC]
We have the same code repeated in both callers; sink it into the callee.
The motivation here isn't just code style: we can also defer the relatively expensive aliasing checks until the cheap structural preconditions have been validated. (e.g. Don't bother with aliasing if src is not an alloca.) This helps compile time significantly.
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3 |
# 59630917 | 02-Mar-2022 | serge-sans-paille <[email protected]>
Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output lines: before: 1062981579 after: 1062494547
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
Revision tags: llvmorg-14.0.0-rc2 |
# 7662d168 | 21-Feb-2022 | Florian Hahn <[email protected]>
[MemCpyOpt] Check all accesses for MemoryUses in writtenBetween.
Currently writtenBetween can miss clobbers of Loc between End and Start, if End is a MemoryUse.
To guarantee we see all write clobbers of Loc between Start and End for MemoryUses, restrict to Start and End being in the same block and check all accesses between them.
This fixes 2 mis-compiles illustrated in llvm/test/Transforms/MemCpyOpt/memcpy-byval-forwarding-clobbers.ll
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D119929
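A hedged sketch of the same-block scan described above (illustrative, not the exact writtenBetween implementation; names are assumptions, and Start is assumed to precede End in the block).

```cpp
#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/MemorySSA.h"
using namespace llvm;

// Returns true if any MemoryDef strictly between Start and End may write Loc.
static bool mayBeWrittenBetween(MemorySSA &MSSA, AAResults &AA,
                                const MemoryLocation &Loc,
                                const MemoryAccess *Start,
                                const MemoryAccess *End) {
  const BasicBlock *BB = Start->getBlock();
  if (BB != End->getBlock())
    return true; // be conservative across blocks

  bool Inside = false;
  for (const MemoryAccess &MA : *MSSA.getBlockAccesses(BB)) {
    if (&MA == End)
      break;
    if (&MA == Start) {
      Inside = true;
      continue;
    }
    if (!Inside)
      continue;
    // Every access in between is checked, whether End is a use or a def.
    if (const auto *MD = dyn_cast<MemoryDef>(&MA))
      if (isModSet(AA.getModRefInfo(MD->getMemoryInst(), Loc)))
        return true;
  }
  return false;
}
```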
Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init |
# 6b69985d | 26-Jan-2022 | Nikita Popov <[email protected]>
[MemCpyOpt] Use helper for unwind check
This extends support to byval arguments. It could be further extended to handle the case of non-captured noalias returns.
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2 |
# 0d20407d | 05-Jan-2022 | Nikita Popov <[email protected]>
Reapply [MemCpyOpt] Look through pointer casts when checking capture
This is a recommit of the patch without changes. The reason for the revert has been addressed in D117679.
-----
The user scanning loop above looks through pointer casts, so we also need to strip pointer casts in the capture check. Previously the source was incorrectly considered not captured if a bitcast was passed to the call.
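A hedged sketch of the idea (illustrative names, not the exact MemCpyOpt code): strip pointer casts on both sides before deciding whether a call argument is the source.

```cpp
#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Value.h"
using namespace llvm;

static bool callPassesSource(const CallBase *Call, const Value *Src) {
  const Value *StrippedSrc = Src->stripPointerCasts();
  for (const Use &U : Call->args())
    // Look through bitcasts and no-op casts on the argument too, so a bitcast
    // of the source is still recognized as the source being passed.
    if (U.get()->stripPointerCasts() == StrippedSrc)
      return true;
  return false;
}
```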
# 655a7024 | 13-Dec-2021 | Nikita Popov <[email protected]>
Reapply [MemCpyOpt] Make capture check during call slot optimization more precise
This is a recommit of the patch without changes. The reason for the revert has been addressed in D117679.
-----
Call slot optimization is currently supposed to be prevented if the call can capture the source pointer. Due to an implementation bug, this check currently doesn't trigger if a bitcast of the source pointer is passed instead. I'm somewhat afraid of the fallout of fixing this bug (due to heavy reliance on call slot optimization in rust), so I'd like to strengthen the capture reasoning a bit first.
In particular, I believe that the capture is fine as long as a) the call itself cannot depend on the pointer identity, because neither dest has been captured before/at nor src before the call and b) there is no potential use of the captured pointer before the lifetime of the source alloca ends, either due to lifetime.end or a return from a function. At that point the potentially captured pointer becomes dangling.
Differential Revision: https://reviews.llvm.org/D115615
# d7bff2e9 | 19-Jan-2022 | Nikita Popov <[email protected]>
[MemCpyOpt] Fix metadata merging during call slot optimization
Call slot optimization currently merges the metadata between the call and the load. However, we also need to merge in the metadata of the store.
Part of the reason why we might have gotten away with this previously is that usually the load and the store are the same instruction (a memcpy); this can only happen if call slot optimization occurs on an actual load/store pair.
This addresses the issue reported in https://reviews.llvm.org/D115615#3251386.
Differential Revision: https://reviews.llvm.org/D117679
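A hedged sketch of the resulting rule (not the literal D117679 patch; the helper choice and names are illustrative): intersect metadata from both the load and the store into the call.

```cpp
#include "llvm/IR/Instruction.h"
#include "llvm/Transforms/Utils/Local.h"
using namespace llvm;

static void mergeCallSlotMetadata(Instruction *Call, Instruction *Load,
                                  Instruction *Store) {
  // Intersect the call's metadata with the load's AND the store's, so no
  // stale !tbaa/!alias.scope/!noalias facts survive from either side.
  combineMetadataForCSE(Call, Load, /*DoesKMove=*/true);
  combineMetadataForCSE(Call, Store, /*DoesKMove=*/true); // the added step
}
```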
# 4dc4815f | 19-Jan-2022 | Nikita Popov <[email protected]>
[MemCpyOpt] Add some debug output to call slot optimization (NFC)
# 53a51acc | 18-Jan-2022 | Hans Wennborg <[email protected]>
Revert "[MemCpyOpt] Make capture check during call slot optimization more precise"
This caused a miscompile due to call slot optimization replacing a call argument without considering the call's !noalias metadata; see discussion on the code review.
> Call slot optimization is currently supposed to be prevented if the call can capture the source pointer. Due to an implementation bug, this check currently doesn't trigger if a bitcast of the source pointer is passed instead. I'm somewhat afraid of the fallout of fixing this bug (due to heavy reliance on call slot optimization in rust), so I'd like to strengthen the capture reasoning a bit first.
>
> In particular, I believe that the capture is fine as long as a) the call itself cannot depend on the pointer identity, because neither dest has been captured before/at nor src before the call and b) there is no potential use of the captured pointer before the lifetime of the source alloca ends, either due to lifetime.end or a return from a function. At that point the potentially captured pointer becomes dangling.
>
> Differential Revision: https://reviews.llvm.org/D115615
Also reverting the dependent commit:
> [MemCpyOpt] Look through pointer casts when checking capture
>
> The user scanning loop above looks through pointer casts, so we also need to strip pointer casts in the capture check. Previously the source was incorrectly considered not captured if a bitcast was passed to the call.
This reverts commit 487a34ed9d7d24a7b1fb388c8856c784a459b22b and 00e6869463ae6023d0d48f30de8511d6d748b14f.
# 5bbcff61 | 06-Jan-2022 | Simon Pilgrim <[email protected]>
[MemCpyOptimizer] hasUndefContents - only look for underlying object if we've found an alloca
Provides an early-out if we fail to find an AllocaInst, and avoids a static analyzer warning about null dereferencing.
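A hedged sketch of that early-out shape (illustrative, not the exact hasUndefContents code): only dereference the alloca once the dyn_cast has succeeded.

```cpp
#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

static bool underlyingObjectIsAlloca(const Value *Ptr) {
  auto *Alloca = dyn_cast<AllocaInst>(getUnderlyingObject(Ptr));
  if (!Alloca)
    return false; // early-out: no alloca found, nothing to dereference
  // Only here is it safe to inspect Alloca (size, lifetime markers, ...).
  return true;
}
```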
# 8399fa67 | 06-Jan-2022 | Simon Pilgrim <[email protected]>
[MemCpyOptimizer] Use auto* for cast<> results (style). NFC.
# 00e68694 | 05-Jan-2022 | Nikita Popov <[email protected]>
[MemCpyOpt] Look through pointer casts when checking capture
The user scanning loop above looks through pointer casts, so we also need to strip pointer casts in the capture check. Previously the source was incorrectly considered not captured if a bitcast was passed to the call.
# 487a34ed | 13-Dec-2021 | Nikita Popov <[email protected]>
[MemCpyOpt] Make capture check during call slot optimization more precise
Call slot optimization is currently supposed to be prevented if the call can capture the source pointer. Due to an implementation bug, this check currently doesn't trigger if a bitcast of the source pointer is passed instead. I'm somewhat afraid of the fallout of fixing this bug (due to heavy reliance on call slot optimization in rust), so I'd like to strengthen the capture reasoning a bit first.
In particular, I believe that the capture is fine as long as a) the call itself cannot depend on the pointer identity, because neither dest has been captured before/at nor src before the call and b) there is no potential use of the captured pointer before the lifetime of the source alloca ends, either due to lifetime.end or a return from a function. At that point the potentially captured pointer becomes dangling.
Differential Revision: https://reviews.llvm.org/D115615
Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3 |
# 7fb66d40 | 06-Sep-2021 | Fraser Cormack <[email protected]>
[MemCpyOpt] Fix a variety of scalable-type crashes
This patch fixes a variety of crashes resulting from the `MemCpyOptPass` casting `TypeSize` to a constant integer, whether implicitly or explicitly.
Since the `MemsetRanges` requires a constant size to work, all but one of the fixes in this patch simply involve skipping the various optimizations for scalable types as cleanly as possible.
The optimization of `byval` parameters, however, has been updated to work on scalable types in theory. In practice, this optimization is only valid when the length of the `memcpy` is known to be larger than the scalable type size, which is currently never the case. This could perhaps be done in the future using the `vscale_range` attribute.
Some implicit casts have been left as they were, under the knowledge they are only called on aggregate types. These should never be scalably-sized.
Reviewed By: nikic, tra
Differential Revision: https://reviews.llvm.org/D109329
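A hedged sketch of the guard this kind of fix adds (illustrative helper; the accessor is `getFixedValue` in current LLVM and has varied across releases): never cast a `TypeSize` to a fixed integer without checking scalability first.

```cpp
#include <cstdint>
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Type.h"
using namespace llvm;

static bool getFixedStoreSize(const DataLayout &DL, Type *Ty,
                              uint64_t &SizeInBytes) {
  TypeSize TS = DL.getTypeStoreSize(Ty);
  if (TS.isScalable())
    return false; // scalable vector: size unknown at compile time, bail out
  SizeInBytes = TS.getFixedValue();
  return true;
}
```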
# 30dfd344 | 30-Aug-2021 | Artem Belevich <[email protected]>
[MemCpyOpt] Allow specifying --enable-memcpyopt-without-libcalls more than once
so we can override it via clang's CLI if necessary.
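A hedged sketch of what such a repeatable option can look like (only the flag name comes from the commit title; the variable name, default, and description are assumptions): `cl::ZeroOrMore` lets the flag appear multiple times, so a later occurrence appended by a driver overrides an earlier one instead of erroring.

```cpp
#include "llvm/Support/CommandLine.h"
using namespace llvm;

static cl::opt<bool> EnableMemCpyOptWithoutLibcalls(
    "enable-memcpyopt-without-libcalls", cl::init(false), cl::Hidden,
    cl::ZeroOrMore, // allow the flag to be specified more than once
    cl::desc("Enable memcpyopt even when libcalls are disabled"));
```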
Revision tags: llvmorg-13.0.0-rc2 |
# 17db125b | 07-Aug-2021 | Nikita Popov <[email protected]>
[MemCpyOpt] Optimize MemoryDef insertion
When converting a store into a memset, we currently insert the new MemoryDef after the store MemoryDef, which requires all uses to be renamed to the new def using a whole block scan. Instead, we can insert the new MemoryDef before the store and not rename uses, because we know that the location is immediately overwritten, so all uses should still refer to the old MemoryDef. Those uses will get renamed when the old MemoryDef is actually dropped, which is efficient.
I expect something similar can be done for some of the other MSSA updates in MemCpyOpt. This is an alternative to D107513, at least for this particular case.
Differential Revision: https://reviews.llvm.org/D107702
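A hedged sketch of the insertion-point idea using MemorySSAUpdater, with API names as of the LLVM 13/14 timeframe of this commit (signatures have shifted in later releases; variable names are illustrative, and the real update in MemCpyOpt has more bookkeeping).

```cpp
#include "llvm/Analysis/MemorySSAUpdater.h"
using namespace llvm;

static void addMemsetDefBeforeStore(MemorySSAUpdater &MSSAU,
                                    Instruction *NewMemSet,
                                    MemoryUseOrDef *StoreAccess) {
  // Insert the memset's MemoryDef *before* the store's access. Existing uses
  // keep pointing at the store's def (its location is immediately overwritten
  // anyway), so no whole-block renaming walk is needed; they are fixed up
  // later, when the store's def is actually removed.
  MemoryAccess *NewAccess = MSSAU.createMemoryAccessBefore(
      NewMemSet, /*Definition=*/nullptr, StoreAccess);
  MSSAU.insertDef(cast<MemoryDef>(NewAccess), /*RenameUses=*/false);
}
```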
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1 |
# 88003cea | 08-May-2021 | Nikita Popov <[email protected]>
[MemCpyOpt] Remove MemDepAnalysis-based implementation
The MemorySSA-based implementation has been enabled for a few months (since D94376). This patch drops the old MDA-based implementation entirely.
I've kept this to only the basic cleanup of dropping various conditions -- the code could be further cleaned up now that there is only one implementation.
Differential Revision: https://reviews.llvm.org/D102113
# 6a9cf21f | 20-Jul-2021 | Artem Belevich <[email protected]>
[CUDA, MemCpyOpt] Add a flag to force-enable memcpyopt and use it for CUDA.
Attempt to enable MemCpyOpt unconditionally in D104801 uncovered the fact that there are users that do not expect LLVM to materialize `memset` intrinsic.
While other passes can do that, too, MemCpyOpt triggers it more frequently and breaks sanitizers and some downstream users.
For now, introduce a flag to force-enable the pass and opt in only CUDA compilation with the NVPTX back-end.
Differential Revision: https://reviews.llvm.org/D106401