History log of /llvm-project-15.0.7/llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp (Results 1 – 25 of 364)
Revision Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6
# 3c126d5f 22-Jun-2022 Guillaume Chatelet <[email protected]>

[Alignment] Replace commonAlignment with std::min

`commonAlignment` is a shortcut to pick the smallest of two `Align`
objects. As-is it doesn't bring much value compared to `std::min`.

Differential Revision: https://reviews.llvm.org/D128345


# 57ffff6d 22-Jun-2022 Guillaume Chatelet <[email protected]>

Revert "[NFC] Remove dead code"

This reverts commit 8ba2cbff70f2c49a8926451c59cc260d67b706cf.


# 8ba2cbff 22-Jun-2022 Guillaume Chatelet <[email protected]>

[NFC] Remove dead code


# f9bb8c24 13-Jun-2022 Guillaume Chatelet <[email protected]>

[NFC][Alignment] Convert MemCpyOptimizer.cpp


Revision tags: llvmorg-14.0.5
# d86a206f 05-Jun-2022 Fangrui Song <[email protected]>

Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options


Revision tags: llvmorg-14.0.4, llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# a5c3b574 01-Apr-2022 Nikita Popov <[email protected]>

[MemCpyOpt] Work around PR54682

As discussed on https://github.com/llvm/llvm-project/issues/54682,
MemorySSA currently has a bug when computing the clobber of calls
that access loop-varying locations. I think a "proper" fix for this
on the MemorySSA side might be non-trivial, but we can easily work
around this in MemCpyOpt:

Currently, MemCpyOpt uses a location-less getClobberingMemoryAccess()
call to find a clobber on either the src or dest location, and then
refines it for the src and dest clobber. This was intended as an
optimization, as the location-less API is cached, while the
location-affected APIs are not.

However, I don't think this really makes a difference in practice,
because I don't think anything will use the cached clobbers on
those calls later anyway. On CTMark, this patch seems to be very
mildly positive actually.

So I think this is a reasonable way to avoid the problem for now,
though MemorySSA should also get a fix.

Differential Revision: https://reviews.llvm.org/D122911


# 7c51669c 29-Mar-2022 Philip Reames <[email protected]>

[memcpyopt] Restructure store(load src, dest) form of callslotopt for compile time

The search for the clobbering call is fairly expensive if uses are not optimized at construction. Defer the clobber walk to the point in the implementation where we need it; there are a bunch of bailouts before that point (e.g., if the source pointer is not an alloca, we can't do callslotopt).

On a test case which involves a bunch of copies from argument pointers, this switches memcpyopt from > 1/2 second to < 10ms.


# 33deaa13 29-Mar-2022 Philip Reames <[email protected]>

[memcpyopt] Common code into performCallSlotOptzn [NFC]

We have the same code repeated in both callers; sink it into the callee.

The motivation here isn't just code style: we can also defer the relatively expensive aliasing checks until the cheap structural preconditions have been validated (e.g., don't bother with aliasing checks if src is not an alloca). This helps compile time significantly.


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3
# 59630917 02-Mar-2022 serge-sans-paille <[email protected]>

Cleanup includes: Transform/Scalar

Estimated impact on preprocessor output line:
before: 1062981579
after: 1062494547

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817


Revision tags: llvmorg-14.0.0-rc2
# 7662d168 21-Feb-2022 Florian Hahn <[email protected]>

[MemCpyOpt] Check all access for MemoryUses in writtenBetween.

Currently writtenBetween can miss clobbers of Loc between End and Start,
if End is a MemoryUse.

To guarantee we see all write clobbers of Loc between Start and End
for MemoryUses, restrict to Start and End being in the same block
and check all accesses between them.

This fixes 2 mis-compiles illustrated in
llvm/test/Transforms/MemCpyOpt/memcpy-byval-forwarding-clobbers.ll

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D119929


Revision tags: llvmorg-14.0.0-rc1, llvmorg-15-init
# 6b69985d 26-Jan-2022 Nikita Popov <[email protected]>

[MemCpyOpt] Use helper for unwind check

This extends support to byval arguments. It would be further
extended to handle the case of non-captured noalias returns.


Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2
# 0d20407d 05-Jan-2022 Nikita Popov <[email protected]>

Reapply [MemCpyOpt] Look through pointer casts when checking capture

This is a recommit of the patch without changes. The reason for
the revert has been addressed in D117679.

-----

The user scanning loop above looks through pointer casts, so we
also need to strip pointer casts in the capture check. Previously
the source was incorrectly considered not captured if a bitcast
was passed to the call.


# 655a7024 13-Dec-2021 Nikita Popov <[email protected]>

Reapply [MemCpyOpt] Make capture check during call slot optimization more precise

This is a recommit of the patch without changes. The reason for
the revert has been addressed in D117679.

-----

Call slot optimization is currently supposed to be prevented if
the call can capture the source pointer. Due to an implementation
bug, this check currently doesn't trigger if a bitcast of the source
pointer is passed instead. I'm somewhat afraid of the fallout of
fixing this bug (due to heavy reliance on call slot optimization
in rust), so I'd like to strengthen the capture reasoning a bit first.

In particular, I believe that the capture is fine as long as a)
the call itself cannot depend on the pointer identity, because
neither dest has been captured before/at nor src before the
call and b) there is no potential use of the captured pointer
before the lifetime of the source alloca ends, either due to
lifetime.end or a return from a function. At that point the
potentially captured pointer becomes dangling.

Differential Revision: https://reviews.llvm.org/D115615


# d7bff2e9 19-Jan-2022 Nikita Popov <[email protected]>

[MemCpyOpt] Fix metadata merging during call slot optimization

Call slot optimization currently merges the metadata between the
call and the load. However, we also need to merge in the metadata
of the store.

Part of the reason why we might have gotten away with this
previously is that usually the load and the store are the same
instruction (a memcpy); the problem can only occur if call slot
optimization is applied to an actual load/store pair.

This addresses the issue reported in
https://reviews.llvm.org/D115615#3251386.

Differential Revision: https://reviews.llvm.org/D117679


# 4dc4815f 19-Jan-2022 Nikita Popov <[email protected]>

[MemCpyOpt] Add some debug output to call slot optimization (NFC)


# 53a51acc 18-Jan-2022 Hans Wennborg <[email protected]>

Revert "[MemCpyOpt] Make capture check during call slot optimization more precise"

This caused a miscompile due to call slot optimization replacing a call
argument without considering the call's !noalias metadata; see the discussion on
the code review.

> Call slot optimization is currently supposed to be prevented if
> the call can capture the source pointer. Due to an implementation
> bug, this check currently doesn't trigger if a bitcast of the source
> pointer is passed instead. I'm somewhat afraid of the fallout of
> fixing this bug (due to heavy reliance on call slot optimization
> in rust), so I'd like to strengthen the capture reasoning a bit first.
>
> In particular, I believe that the capture is fine as long as a)
> the call itself cannot depend on the pointer identity, because
> neither dest has been captured before/at nor src before the
> call and b) there is no potential use of the captured pointer
> before the lifetime of the source alloca ends, either due to
> lifetime.end or a return from a function. At that point the
> potentially captured pointer becomes dangling.
>
> Differential Revision: https://reviews.llvm.org/D115615

Also reverting the dependent commit:

> [MemCpyOpt] Look through pointer casts when checking capture
>
> The user scanning loop above looks through pointer casts, so we
> also need to strip pointer casts in the capture check. Previously
> the source was incorrectly considered not captured if a bitcast
> was passed to the call.

This reverts commit 487a34ed9d7d24a7b1fb388c8856c784a459b22b
and 00e6869463ae6023d0d48f30de8511d6d748b14f.


# 5bbcff61 06-Jan-2022 Simon Pilgrim <[email protected]>

[MemCpyOptimizer] hasUndefContents - only look for underlying object if we've found an alloca

Provides an early-out if we fail to find an AllocaInst, and avoids a static analyzer warning about null dereferencing.


# 8399fa67 06-Jan-2022 Simon Pilgrim <[email protected]>

[MemCpyOptimizer] Use auto* for cast<> results (style). NFC.


# 00e68694 05-Jan-2022 Nikita Popov <[email protected]>

[MemCpyOpt] Look through pointer casts when checking capture

The user scanning loop above looks through pointer casts, so we
also need to strip pointer casts in the capture check. Previously
the source was incorrectly considered not captured if a bitcast
was passed to the call.


# 487a34ed 13-Dec-2021 Nikita Popov <[email protected]>

[MemCpyOpt] Make capture check during call slot optimization more precise

Call slot optimization is currently supposed to be prevented if
the call can capture the source pointer. Due to an implementation
bug, this check currently doesn't trigger if a bitcast of the source
pointer is passed instead. I'm somewhat afraid of the fallout of
fixing this bug (due to heavy reliance on call slot optimization
in rust), so I'd like to strengthen the capture reasoning a bit first.

In particular, I believe that the capture is fine as long as a)
the call itself cannot depend on the pointer identity, because
neither dest has been captured before/at nor src before the
call and b) there is no potential use of the captured pointer
before the lifetime of the source alloca ends, either due to
lifetime.end or a return from a function. At that point the
potentially captured pointer becomes dangling.

Differential Revision: https://reviews.llvm.org/D115615


Revision tags: llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4, llvmorg-13.0.0-rc3
# 7fb66d40 06-Sep-2021 Fraser Cormack <[email protected]>

[MemCpyOpt] Fix a variety of scalable-type crashes

This patch fixes a variety of crashes resulting from the `MemCpyOptPass`
casting `TypeSize` to a constant integer, whether implicitly or
explicitly.

Since the `MemsetRanges` requires a constant size to work, all but one
of the fixes in this patch simply involve skipping the various
optimizations for scalable types as cleanly as possible.

The optimization of `byval` parameters, however, has been updated to
work on scalable types in theory. In practice, this optimization is only
valid when the length of the `memcpy` is known to be larger than the
scalable type size, which is currently never the case. This could
perhaps be done in the future using the `vscale_range` attribute.

Some implicit casts have been left as they were, under the knowledge
they are only called on aggregate types. These should never be
scalably-sized.

Reviewed By: nikic, tra

Differential Revision: https://reviews.llvm.org/D109329


# 30dfd344 30-Aug-2021 Artem Belevich <[email protected]>

[MemCpyOpt] Allow specifying --enable-memcpyopt-without-libcalls more than once

so we can override it via clang's CLI if necessary.


Revision tags: llvmorg-13.0.0-rc2
# 17db125b 07-Aug-2021 Nikita Popov <[email protected]>

[MemCpyOpt] Optimize MemoryDef insertion

When converting a store into a memset, we currently insert the new
MemoryDef after the store MemoryDef, which requires all uses to be
renamed to the new def using a whole block scan. Instead, we can
insert the new MemoryDef before the store and not rename uses,
because we know that the location is immediately overwritten, so
all uses should still refer to the old MemoryDef. Those uses will
get renamed when the old MemoryDef is actually dropped, which is
efficient.

I expect something similar can be done for some of the other MSSA
updates in MemCpyOpt. This is an alternative to D107513, at least
for this particular case.

Differential Revision: https://reviews.llvm.org/D107702


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init, llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# 88003cea 08-May-2021 Nikita Popov <[email protected]>

[MemCpyOpt] Remove MemDepAnalysis-based implementation

The MemorySSA-based implementation has been enabled for a few months
(since D94376). This patch drops the old MDA-based implementation
entirely.

I've kept this to only the basic cleanup of dropping various
conditions -- the code could be further cleaned up now that there
is only one implementation.

Differential Revision: https://reviews.llvm.org/D102113


# 6a9cf21f 20-Jul-2021 Artem Belevich <[email protected]>

[CUDA, MemCpyOpt] Add a flag to force-enable memcpyopt and use it for CUDA.

Attempt to enable MemCpyOpt unconditionally in D104801 uncovered the fact that
there are users that do not expect LLVM to materialize `memset` intrinsic.

While other passes can do that, too, MemCpyOpt triggers it more frequently and
breaks sanitizers and some downstream users.

For now, introduce a flag to force-enable the pass and opt in only CUDA
compilation with the NVPTX back-end.

Differential Revision: https://reviews.llvm.org/D106401

