History log of /llvm-project-15.0.7/llvm/test/Transforms/OpenMP/replace_globalization.ll (Results 1 – 25 of 35)
Revision Date Author Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6
# bf789b19 21-Jun-2022 Johannes Doerfert <[email protected]>

[Attributor] Replace AAValueSimplify with AAPotentialValues

For the longest time we used `AAValueSimplify` and
`genericValueTraversal` to determine "potential values". This was
problematic for many reasons:
- We recomputed the result a lot as there was no caching for the 9
locations calling `genericValueTraversal`.
- We added the idea of "intra" vs. "inter" procedural simplification
only as an afterthought. `genericValueTraversal` did offer an option
but `AAValueSimplify` did not. Thus, we might end up with "too much"
simplification in certain situations and then gave up on it.
- Because `genericValueTraversal` was not a real `AA` we ended up with
problems like the infinite recursion bug (#54981) as well as code
duplication.

This patch introduces `AAPotentialValues` and replaces the
`AAValueSimplify` uses with it. `genericValueTraversal` is folded into
`AAPotentialValues` as are the instruction simplifications performed in
`AAValueSimplify` before. We further distinguish "intra" and "inter"
procedural simplification now.

`AAValueSimplify` was not deleted as we haven't ported the
re-materialization of instructions yet. There are other differences over
the former handling, e.g., we may not fold trivially foldable
instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2`
but if an operand would be simplified to `i32 1` we would fold it still.

We are also even more aware of function/SCC boundaries in CGSCC passes,
which is good even if some tests look like they regress.

Fixes: https://github.com/llvm/llvm-project/issues/54981

Note: A previous version was flawed and consequently reverted in
6555558a80589d1c5a1154b92cc3af9495f8f86c.


# f6e0c05e 08-Jul-2022 Johannes Doerfert <[email protected]>

Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues"

This reverts commit f17639ea0cd30f52ac853ba2eb25518426cc3bb8 as three
AMDGPU tests haven't been updated. Will need to verify the changes are
not regressions we should avoid.


# f17639ea 21-Jun-2022 Johannes Doerfert <[email protected]>

[Attributor] Replace AAValueSimplify with AAPotentialValues

For the longest time we used `AAValueSimplify` and
`genericValueTraversal` to determine "potential values". This was
problematic for many reasons:
- We recomputed the result a lot as there was no caching for the 9
locations calling `genericValueTraversal`.
- We added the idea of "intra" vs. "inter" procedural simplification
only as an afterthought. `genericValueTraversal` did offer an option
but `AAValueSimplify` did not. Thus, we might end up with "too much"
simplification in certain situations and then gave up on it.
- Because `genericValueTraversal` was not a real `AA` we ended up with
problems like the infinite recursion bug (#54981) as well as code
duplication.

This patch introduces `AAPotentialValues` and replaces the
`AAValueSimplify` uses with it. `genericValueTraversal` is folded into
`AAPotentialValues` as are the instruction simplifications performed in
`AAValueSimplify` before. We further distinguish "intra" and "inter"
procedural simplification now.

`AAValueSimplify` was not deleted as we haven't ported the
re-materialization of instructions yet. There are other differences over
the former handling, e.g., we may not fold trivially foldable
instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2`
but if an operand would be simplified to `i32 1` we would fold it still.

We are also even more aware of function/SCC boundaries in CGSCC passes,
which is good even if some tests look like they regress.

Fixes: https://github.com/llvm/llvm-project/issues/54981

Note: A previous version was flawed and consequently reverted in
6555558a80589d1c5a1154b92cc3af9495f8f86c.


Revision tags: llvmorg-14.0.5
# 6555558a 09-Jun-2022 Johannes Doerfert <[email protected]>

Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues"

This reverts commit da50dab1ae111e9e6cb0248a47a038b17f798705.

Patch broke AMD GPU OpenMP offload buildbots.
https://lab.llvm.org/buildbot/#/builders/193/builds/13246


Revision tags: llvmorg-14.0.4
# da50dab1 10-May-2022 Johannes Doerfert <[email protected]>

[Attributor] Replace AAValueSimplify with AAPotentialValues

For the longest time we used `AAValueSimplify` and
`genericValueTraversal` to determine "potential values". This was
problematic for many reasons:
- We recomputed the result a lot as there was no caching for the 9
locations calling `genericValueTraversal`.
- We added the idea of "intra" vs. "inter" procedural simplification
only as an afterthought. `genericValueTraversal` did offer an option
but `AAValueSimplify` did not. Thus, we might end up with "too much"
simplification in certain situations and then gave up on it.
- Because `genericValueTraversal` was not a real `AA` we ended up with
problems like the infinite recursion bug (#54981) as well as code
duplication.

This patch introduces `AAPotentialValues` and replaces the
`AAValueSimplify` uses with it. `genericValueTraversal` is folded into
`AAPotentialValues` as are the instruction simplifications performed in
`AAValueSimplify` before. We further distinguish "intra" and "inter"
procedural simplification now.

`AAValueSimplify` was not deleted as we haven't ported the
re-materialization of instructions yet. There are other differences over
the former handling, e.g., we may not fold trivially foldable
instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2`
but if an operand would be simplified to `i32 1` we would fold it still.

We are also even more aware of function/SCC boundaries in CGSCC passes,
which is good.

Fixes: https://github.com/llvm/llvm-project/issues/54981


# dbffa407 18-May-2022 Joseph Huber <[email protected]>

[NVVM] Update intrinsic definitions to include the `nocallback` attribute

This patch adds the `nocallback` attribute to the NVVM intrinsics that
did not use the `DefaultAttrsIntrinsic` method that includes it already.
The `nocallback` attribute states that the intrinsic function cannot
enter back into the caller's translation-unit. This allows us to
determine that a function calling a `nocallback` function can have the
`norecurse` attribute. This should be safe for all the NVVM intrinsics
because they do not call other functions within the translation unit.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D125937


Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# e90bce8f 31-Mar-2022 Augie Fackler <[email protected]>

CallBase: fix getFnAttr so it also checks the function

Prior to this change, CallBase::hasFnAttr checked the called function to
see if it had an attribute if it wasn't set on the CallBase, but
getFnAttr didn't do the same delegation, which led to very confusing
behavior. This patch fixes the issue by making CallBase::getFnAttr also
check the function under the same circumstances.
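
The delegation described above can be pictured with a small toy model. This is a Python sketch, not LLVM's C++ API: the `Function` and `CallSite` classes and their method names are hypothetical stand-ins, and attributes are modelled as a plain dictionary.

```python
class Function:
    """Hypothetical stand-in for a callee carrying function attributes."""

    def __init__(self, name, attrs=None):
        self.name = name
        self.attrs = dict(attrs or {})


class CallSite:
    """Hypothetical stand-in for a call instruction."""

    def __init__(self, callee, attrs=None):
        self.callee = callee  # may be None for an indirect call
        self.attrs = dict(attrs or {})

    def has_fn_attr(self, kind):
        # Check the call site first, then fall back to the callee.
        if kind in self.attrs:
            return True
        return self.callee is not None and kind in self.callee.attrs

    def get_fn_attr(self, kind):
        # Same delegation as has_fn_attr, so the two queries agree.
        if kind in self.attrs:
            return self.attrs[kind]
        if self.callee is not None:
            return self.callee.attrs.get(kind)
        return None
```

In this model, an attribute set only on the callee is visible through both queries, which is why repeating it on the call site becomes redundant.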

Test changes look (to me) like they're cleaning up redundant attributes
which no longer get specified both on the callee and call. We also clean
up the one ad-hoc implementation of this getter over in InlineCost.cpp.

Differential Revision: https://reviews.llvm.org/D122821


# a81fff8a 24-Mar-2022 Johannes Doerfert <[email protected]>

Reapply "[Intrinsics] Add `nocallback` to the default intrinsic attributes"

This reverts commit c5f789050daab25aad6770790987e2b7c0395936 and
reapplies 7aea3ea8c3b33c9bb338d5d6c0e4832be1d09ac3 with additional test
changes.


# c5f78905 24-Mar-2022 Johannes Doerfert <[email protected]>

Revert "[Intrinsics] Add `nocallback` to the default intrinsic attributes"

This reverts commit 7aea3ea8c3b33c9bb338d5d6c0e4832be1d09ac3 as it
breaks the buildbots.

I didn't see these failures in the pre-merge checks, looking into it.


Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init
# 7aea3ea8 01-Feb-2022 Johannes Doerfert <[email protected]>

[Intrinsics] Add `nocallback` to the default intrinsic attributes

Most intrinsics, especially "default" ones, will not call back into the
IR module. `nocallback` encodes this nicely. As it was not used before,
this patch also makes use of `nocallback` in the Attributor which
results in many more `norecurse` deductions.

Tablegen part is mechanical, test updates by script.

Differential Revision: https://reviews.llvm.org/D118680


# 4166738c 18-Mar-2022 Johannes Doerfert <[email protected]>

[OpenMP][FIX] Do not crash when kernels are debug wrapper functions

With debug information enabled (-g) Clang will wrap the actual target
region into a new function which is called from the "kernel". The problem
is that the "kernel" is now basically a wrapper without all the things
we expect. More importantly, if we end up asking for an AAKernelInfo
for the "target region function" we might try to turn it into SPMD mode.
That used to cause an assertion as that function doesn't have an
appropriately named `_exec_mode` global. While the global is going away
soon we still need to make sure to properly handle this case, e.g.,
perform optimizations reliably.

Differential Revision: https://reviews.llvm.org/D122043


# e92891f8 11-Mar-2022 Johannes Doerfert <[email protected]>

[Attributor] Allow not to default initialize AAs for live internal functions

Outside users of the Attributor, e.g., OpenMP-opt, want to seed AAs
themselves. We should not seed all default AAs once an internal function
becomes live. That said, there should be a callback such that they can
do lazy seeding as well.

Differential Revision: https://reviews.llvm.org/D121489


# 192a34dd 25-Feb-2022 Johannes Doerfert <[email protected]>

[Attributor][OpenMPOpt][FIX] Register simplification callbacks

Heap-2-stack and heap-2-shared can replace an allocation call with
something else. To avoid us deriving information from the allocator
implementation we register a simplification callback now that will
force us to stop at the call site. We probably should create the
replacement memory eagerly and return that instead though.


# 0136a440 17-Feb-2022 Joseph Huber <[email protected]>

[OpenMP] Add an option to limit shared memory usage in OpenMPOpt

One of the optimizations performed in OpenMPOpt pushes globalized
variables to static shared memory. This is preferable to keeping the
runtime call in all cases; however, if too many variables are pushed to
shared memory the kernel will crash. Since this is an optimization and
not something the user specified explicitly, there should be an option
to limit this optimization in those cases. This patch introduces the
`-openmp-opt-shared-limit=` option to limit the number of bytes that
will be placed in shared memory from HeapToShared.
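
The effect of such a byte limit can be sketched with a toy model. This is a Python illustration under stated assumptions, not the pass's actual logic: the function name `select_for_shared` is hypothetical, and the policy (visit allocations in order, skip any that would exceed the budget, keep going) is an assumption made for the example.

```python
def select_for_shared(alloc_sizes, shared_limit):
    """Split allocation sizes (in bytes) into those moved to static
    shared memory and those left as runtime calls.

    Assumed policy: allocations are visited in order and one is moved
    only if it still fits within the remaining byte budget.
    """
    moved, kept, used = [], [], 0
    for size in alloc_sizes:
        if used + size <= shared_limit:
            moved.append(size)
            used += size
        else:
            kept.append(size)  # keep the runtime allocation call
    return moved, kept
```

With a budget of 64 bytes and allocations of 16, 64, and 32 bytes, this model moves the 16- and 32-byte allocations and leaves the 64-byte one as a runtime call.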

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D120079


Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4
# ac3ec22d 20-Sep-2021 Johannes Doerfert <[email protected]>

[Attributor] Use AAFunctionReachability to determine AANoRecurse

We missed out on AANoRecurse in the module pass because we had no call
graph. With AAFunctionReachability we can simply ask if the function may
reach itself.
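
The idea of asking whether a function may reach itself can be sketched on an explicit call graph. This is a toy Python model, not the Attributor's implementation: the call graph is assumed to be a dictionary from function name to callee names, and `may_reach` and `is_norecurse` are hypothetical helper names.

```python
def may_reach(call_graph, src, dst):
    """Return True if dst is reachable from src via one or more call edges."""
    seen = set()
    worklist = list(call_graph.get(src, ()))
    while worklist:
        fn = worklist.pop()
        if fn == dst:
            return True
        if fn in seen:
            continue
        seen.add(fn)
        worklist.extend(call_graph.get(fn, ()))
    return False


def is_norecurse(call_graph, fn):
    # A function is non-recursive in this model if it cannot reach itself.
    return not may_reach(call_graph, fn, fn)
```

Note that a function which merely calls into a cycle, without being part of it, still counts as non-recursive here.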

Differential Revision: https://reviews.llvm.org/D110099


# a1db0e52 01-Feb-2022 Johannes Doerfert <[email protected]>

[Attributor][FIX] Liveness handling in the isAssumedDead helpers

This fixes a conceptual problem with our AAIsDead usage which conflated
call site liveness with call site return value liveness. Without the
fix tests would obviously miscompile as we make genericValueTraversal
more powerful (in a follow up). The effects on the tests are mixed but
mostly marginal. The most prominent one is the lack of `noreturn` for
functions. The reason is that we make entire blocks live at the same
time (for time reasons). Now that we actually look at the block
liveness, which we need to do, the return instructions are live and
will survive. As an example, `noreturn_async.ll` has been modified
to retain the `noreturn` even with block granularity. We could address
this easily but there is little need in practice.


# 5eb49009 24-Jan-2022 Joseph Huber <[email protected]>

[OpenMP] Add more identifier to created shared globals

Currently we push some variables to a global constant containing shared
memory as an optimization. This generated constant had internal linkage
and should not have collided with any known identifiers in the
translation unit. However, there have been observed cases of this
optimization unintentionally colliding with undocumented PTX
identifiers. This patch adds a suffix to the created globals to
hopefully bypass this.

Depends on D118059

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D118068


# 6e220296 27-Dec-2021 Joseph Huber <[email protected]>

[OpenMP] Use alignment information in HeapToShared

This patch uses the return alignment attribute now present in the
`__kmpc_alloc_shared` runtime call to set the alignment of the shared
memory global created to replace it.

Depends on D115971

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D116319


# c690c1c9 17-Sep-2021 Johannes Doerfert <[email protected]>

[NVVM] Update intrinsic definitions to include more attributes

A lot of NVVM intrinsics can use the default intrinsic attributes (e.g.,
nosync, nofree, ...) as well as `speculatable`. The latter is important
if we want to recompute intrinsics results instead of communicating them
via memory.

I did use default attributes for almost all `readnone` attributes but
speculatable only where I had reasonable confidence they cannot
experience UB. That said, someone should double check.

TODO: There seem to be various intrinsics marked `Commutative` which
should not, e.g., fma and div.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D109987


Revision tags: llvmorg-13.0.0-rc3
# 57822c3f 08-Sep-2021 Johannes Doerfert <[email protected]>

[OpenMP][NFC] Repair test that contained nested kernels

The benchmark contained (partially) nested kernels, something we do not
generate nor support.


# 423d34f7 22-Sep-2021 Shilei Tian <[email protected]>

[OpenMP][Offloading] Change `bool IsSPMD` to `int8_t Mode` in `__kmpc_target_init` and `__kmpc_target_deinit`

This is a follow-up of D110029, which uses a bitset to indicate the execution mode. This patch makes the changes in the function call.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D110279


# fec2927e 17-Sep-2021 Joseph Huber <[email protected]>

[OpenMP] Add NoSync attributes to alloc / free shared RTL calls

This patch adds the `nosync` attribute to the `__kmpc_alloc_shared` and
`__kmpc_free_shared` runtime library calls. This allows code analysis to
know that these functions don't contain any barriers. This will help
optimizations reason about the CFG of blocks containing these calls.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D109995


Revision tags: llvmorg-13.0.0-rc2
# 29a3e3dd 04-Aug-2021 Giorgis Georgakoudis <[email protected]>

[OpenMPOpt] Expand SPMDization with guarding for target parallel regions

This patch expands SPMDization (converting generic execution mode to SPMD for target regions) by guarding code regions that should be executed only by the main thread. Specifically, it generates guarded regions, which only the main thread executes, and the synchronization with worker threads using simple barriers. For correctness, the patch aborts SPMDization for target regions if the same code executes in a parallel region, and thus must not be guarded. This check is implemented using the ParallelLevels AA.
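
The guarding scheme can be illustrated with a host-side simulation. This is a Python sketch using `threading`, not device code and not what the pass emits: `run_spmd` is a hypothetical name, thread 0 is assumed to play the role of the main thread, and a `threading.Barrier` stands in for the simple barrier.

```python
import threading


def run_spmd(num_threads):
    """Simulate one guarded region followed by an SPMD region."""
    shared = {"value": None}
    barrier = threading.Barrier(num_threads)
    results = [None] * num_threads

    def worker(tid):
        # Guarded region: the side effect must happen exactly once,
        # so only the main thread (tid 0 in this model) executes it.
        if tid == 0:
            shared["value"] = 42
        # Barrier: workers wait until the guarded region has finished.
        barrier.wait()
        # SPMD region: every thread runs the same code on shared state.
        results[tid] = shared["value"] + tid

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Without the barrier, a worker could read `shared["value"]` before the main thread has written it, which is the hazard the synchronization is there to prevent.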

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D106892


Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init
# 754eb1c2 21-Jul-2021 Joseph Huber <[email protected]>

[OpenMP] Change `__kmpc_free_shared` to include the paired allocation size

This patch changes `__kmpc_free_shared` to take an additional argument
corresponding to the associated allocation's size. This makes it easier to
implement the allocator in the runtime.
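
One way the extra size argument could simplify an allocator is sketched below. This is an assumption for illustration only: the commit message does not describe the runtime's design, and the `SharedStack` class and its stack (LIFO) discipline are hypothetical.

```python
class SharedStack:
    """Hypothetical bump allocator over a fixed-size buffer.

    Pointers are modelled as integer offsets into the buffer.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.top = 0

    def alloc_shared(self, size):
        if self.top + size > self.capacity:
            raise MemoryError("out of shared stack space")
        ptr = self.top
        self.top += size
        return ptr

    def free_shared(self, ptr, size):
        # Because the caller passes the size, no per-allocation header
        # is needed: freeing is just moving the top of the stack back.
        if ptr + size != self.top:
            raise ValueError("frees must mirror allocations (LIFO)")
        self.top -= size
```

In this model the allocator stores no metadata per allocation; the paired size is all it needs to undo the most recent allocation.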

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D106496


Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# d9659bf6 20-May-2021 Johannes Doerfert <[email protected]>

[OpenMP] Create custom state machines for generic target regions

In the spirit of TRegions [0], this patch creates a custom state
machine for a generic target region based on the potentially called
parallel regions.

The code analysis is done interprocedurally via an abstract attribute
(AAKernelInfo). All outermost parallel regions are collected and we
check if there might be unknown outermost parallel regions for which
we need an indirect call. Other AAKernelInfo extensions are expected.
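
The shape of such a specialized state machine can be sketched as a dispatch loop. This is a toy Python model, not the generated code: `make_state_machine` is a hypothetical name, parallel regions are modelled as plain callables, and `None` is assumed to be the termination signal.

```python
def make_state_machine(known_regions, may_have_unknown):
    """Build a worker loop specialized for the given parallel regions."""

    def run(work_items):
        log = []
        for fn in work_items:
            if fn is None:  # assumed termination signal
                break
            # Cascade of identity checks against the known regions,
            # each of which would be a direct call in generated code.
            for region in known_regions:
                if fn is region:
                    log.append(("direct", region()))
                    break
            else:
                # Fallback, needed only if unknown regions may exist.
                if not may_have_unknown:
                    raise RuntimeError("unexpected parallel region")
                log.append(("indirect", fn()))
        return log

    return run
```

When the analysis can prove that no unknown outermost parallel region exists, the fallback branch is never needed, which corresponds to `may_have_unknown=False` in this model.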

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

Differential Revision: https://reviews.llvm.org/D101977
