replace_globalization.ll - OpenGrok history log for /llvm-project-15.0.7/llvm/test/Transforms/OpenMP/replace

Revision (<<< Hide revision tags) (Show revision tags >>>)	Date	Author	Comments
Revision tags: llvmorg-20.1.0, llvmorg-20.1.0-rc3, llvmorg-20.1.0-rc2, llvmorg-20.1.0-rc1, llvmorg-21-init, llvmorg-19.1.7, llvmorg-19.1.6, llvmorg-19.1.5, llvmorg-19.1.4, llvmorg-19.1.3, llvmorg-19.1.2, llvmorg-19.1.1, llvmorg-19.1.0, llvmorg-19.1.0-rc4, llvmorg-19.1.0-rc3, llvmorg-19.1.0-rc2, llvmorg-19.1.0-rc1, llvmorg-20-init, llvmorg-18.1.8, llvmorg-18.1.7, llvmorg-18.1.6, llvmorg-18.1.5, llvmorg-18.1.4, llvmorg-18.1.3, llvmorg-18.1.2, llvmorg-18.1.1, llvmorg-18.1.0, llvmorg-18.1.0-rc4, llvmorg-18.1.0-rc3, llvmorg-18.1.0-rc2, llvmorg-18.1.0-rc1, llvmorg-19-init, llvmorg-17.0.6, llvmorg-17.0.5, llvmorg-17.0.4, llvmorg-17.0.3, llvmorg-17.0.2, llvmorg-17.0.1, llvmorg-17.0.0, llvmorg-17.0.0-rc4, llvmorg-17.0.0-rc3, llvmorg-17.0.0-rc2, llvmorg-17.0.0-rc1, llvmorg-18-init, llvmorg-16.0.6, llvmorg-16.0.5, llvmorg-16.0.4, llvmorg-16.0.3, llvmorg-16.0.2, llvmorg-16.0.1, llvmorg-16.0.0, llvmorg-16.0.0-rc4, llvmorg-16.0.0-rc3, llvmorg-16.0.0-rc2, llvmorg-16.0.0-rc1, llvmorg-17-init, llvmorg-15.0.7, llvmorg-15.0.6, llvmorg-15.0.5, llvmorg-15.0.4, llvmorg-15.0.3, llvmorg-15.0.2, llvmorg-15.0.1, llvmorg-15.0.0, llvmorg-15.0.0-rc3, llvmorg-15.0.0-rc2, llvmorg-15.0.0-rc1, llvmorg-16-init, llvmorg-14.0.6
# bf789b19	21-Jun-2022	Johannes Doerfert <[email protected]>	[Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many [Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many reasons: - We recomputed the result a lot as there was no caching for the 9 locations calling `genericValueTraversal`. - We added the idea of "intra" vs. "inter" procedural simplification only as an afterthought. `genericValueTraversal` did offer an option but `AAValueSimplify` did not. Thus, we might end up with "too much" simplification in certain situations and then gave up on it. - Because `genericValueTraversal` was not a real `AA` we ended up with problems like the infinite recursion bug (#54981) as well as code duplication. This patch introduces `AAPotentialValues` and replaces the `AAValueSimplify` uses with it. `genericValueTraversal` is folded into `AAPotentialValues` as are the instruction simplifications performed in `AAValueSimplify` before. We further distinguish "intra" and "inter" procedural simplification now. `AAValueSimplify` was not deleted as we haven't ported the re-materialization of instructions yet. There are other differences over the former handling, e.g., we may not fold trivially foldable instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2` but if an operand would be simplified to `i32 1` we would fold it still. We are also even more aware of function/SCC boundaries in CGSCC passes, which is good even if some tests look like they regress. Fixes: https://github.com/llvm/llvm-project/issues/54981 Note: A previous version was flawed and consequently reverted in 6555558a80589d1c5a1154b92cc3af9495f8f86c. show more ...
# f6e0c05e	08-Jul-2022	Johannes Doerfert <[email protected]>	Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues" This reverts commit f17639ea0cd30f52ac853ba2eb25518426cc3bb8 as three AMDGPU tests haven't been updated. Will need to verify the Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues" This reverts commit f17639ea0cd30f52ac853ba2eb25518426cc3bb8 as three AMDGPU tests haven't been updated. Will need to verify the changes are not regressions we should avoid. show more ...
# f17639ea	21-Jun-2022	Johannes Doerfert <[email protected]>	[Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many [Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many reasons: - We recomputed the result a lot as there was no caching for the 9 locations calling `genericValueTraversal`. - We added the idea of "intra" vs. "inter" procedural simplification only as an afterthought. `genericValueTraversal` did offer an option but `AAValueSimplify` did not. Thus, we might end up with "too much" simplification in certain situations and then gave up on it. - Because `genericValueTraversal` was not a real `AA` we ended up with problems like the infinite recursion bug (#54981) as well as code duplication. This patch introduces `AAPotentialValues` and replaces the `AAValueSimplify` uses with it. `genericValueTraversal` is folded into `AAPotentialValues` as are the instruction simplifications performed in `AAValueSimplify` before. We further distinguish "intra" and "inter" procedural simplification now. `AAValueSimplify` was not deleted as we haven't ported the re-materialization of instructions yet. There are other differences over the former handling, e.g., we may not fold trivially foldable instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2` but if an operand would be simplified to `i32 1` we would fold it still. We are also even more aware of function/SCC boundaries in CGSCC passes, which is good even if some tests look like they regress. Fixes: https://github.com/llvm/llvm-project/issues/54981 Note: A previous version was flawed and consequently reverted in 6555558a80589d1c5a1154b92cc3af9495f8f86c. show more ...
Revision tags: llvmorg-14.0.5
# 6555558a	09-Jun-2022	Johannes Doerfert <[email protected]>	Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues" This reverts commit da50dab1ae111e9e6cb0248a47a038b17f798705. Patch broke AMD GPU OpenMP offload buildbots. https://lab.llvm.org Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues" This reverts commit da50dab1ae111e9e6cb0248a47a038b17f798705. Patch broke AMD GPU OpenMP offload buildbots. https://lab.llvm.org/buildbot/#/builders/193/builds/13246 show more ...
Revision tags: llvmorg-14.0.4
# da50dab1	10-May-2022	Johannes Doerfert <[email protected]>	[Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many [Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many reasons: - We recomputed the result a lot as there was no caching for the 9 locations calling `genericValueTraversal`. - We added the idea of "intra" vs. "inter" procedural simplification only as an afterthought. `genericValueTraversal` did offer an option but `AAValueSimplify` did not. Thus, we might end up with "too much" simplification in certain situations and then gave up on it. - Because `genericValueTraversal` was not a real `AA` we ended up with problems like the infinite recursion bug (#54981) as well as code duplication. This patch introduces `AAPotentialValues` and replaces the `AAValueSimplify` uses with it. `genericValueTraversal` is folded into `AAPotentialValues` as are the instruction simplifications performed in `AAValueSimplify` before. We further distinguish "intra" and "inter" procedural simplification now. `AAValueSimplify` was not deleted as we haven't ported the re-materialization of instructions yet. There are other differences over the former handling, e.g., we may not fold trivially foldable instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2` but if an operand would be simplified to `i32 1` we would fold it still. We are also even more aware of function/SCC boundaries in CGSCC passes, which is good. Fixes: https://github.com/llvm/llvm-project/issues/54981 show more ...
# dbffa407	18-May-2022	Joseph Huber <[email protected]>	[NVVM] Update intrinsic defintions to include the `nocallback` attribute This patch adds the `nocallback` attribute to the NVVM intrinsics that did not use the `DefaultAttrsIntrinsic` method that in [NVVM] Update intrinsic defintions to include the `nocallback` attribute This patch adds the `nocallback` attribute to the NVVM intrinsics that did not use the `DefaultAttrsIntrinsic` method that includes it already. The `nocallback` attribute states that the intrinsic function cannot enter back into the caller's translation-unit. This allows as to determine that a function calling a `nocallback` function can have the `norecurse` attribute. This should be safe for all the NVVM intrinsics because they do not call other functions within the translation unit. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D125937 show more ...
Revision tags: llvmorg-14.0.3, llvmorg-14.0.2, llvmorg-14.0.1
# e90bce8f	31-Mar-2022	Augie Fackler <[email protected]>	CallBase: fix getFnAttr so it also checks the function Prior to this change, CallBase::hasFnAttr checked the called function to see if it had an attribute if it wasn't set on the CallBase, but getFn CallBase: fix getFnAttr so it also checks the function Prior to this change, CallBase::hasFnAttr checked the called function to see if it had an attribute if it wasn't set on the CallBase, but getFnAttr didn't do the same delegation, which led to very confusing behavior. This patch fixes the issue by making CallBase::getFnAttr also check the function under the same circumstances. Test changes look (to me) like they're cleaning up redundant attributes which no longer get specified both on the callee and call. We also clean up the one ad-hoc implementation of this getter over in InlineCost.cpp. Differential Revision: https://reviews.llvm.org/D122821 show more ...
# a81fff8a	24-Mar-2022	Johannes Doerfert <[email protected]>	Reapply "[Intrinsics] Add `nocallback` to the default intrinsic attributes" This reverts commit c5f789050daab25aad6770790987e2b7c0395936 and reapplies 7aea3ea8c3b33c9bb338d5d6c0e4832be1d09ac3 with a Reapply "[Intrinsics] Add `nocallback` to the default intrinsic attributes" This reverts commit c5f789050daab25aad6770790987e2b7c0395936 and reapplies 7aea3ea8c3b33c9bb338d5d6c0e4832be1d09ac3 with additional test changes. show more ...
# c5f78905	24-Mar-2022	Johannes Doerfert <[email protected]>	Revert "[Intrinsics] Add `nocallback` to the default intrinsic attributes" This reverts commit 7aea3ea8c3b33c9bb338d5d6c0e4832be1d09ac3 as it breaks the buildbots. I didn't see these failures in th Revert "[Intrinsics] Add `nocallback` to the default intrinsic attributes" This reverts commit 7aea3ea8c3b33c9bb338d5d6c0e4832be1d09ac3 as it breaks the buildbots. I didn't see these failures in the pre-merge checks, looking into it. show more ...
Revision tags: llvmorg-14.0.0, llvmorg-14.0.0-rc4, llvmorg-14.0.0-rc3, llvmorg-14.0.0-rc2, llvmorg-14.0.0-rc1, llvmorg-15-init
# 7aea3ea8	01-Feb-2022	Johannes Doerfert <[email protected]>	[Intrinsics] Add `nocallback` to the default intrinsic attributes Most intrinsics, especially "default" ones, will not call back into the IR module. `nocallback` encodes this nicely. As it was not u [Intrinsics] Add `nocallback` to the default intrinsic attributes Most intrinsics, especially "default" ones, will not call back into the IR module. `nocallback` encodes this nicely. As it was not used before, this patch also makes use of `nocallback` in the Attributor which results in many more `norecurse` deductions. Tablegen part is mechanical, test updates by script. Differential Revision: https://reviews.llvm.org/D118680 show more ...
# 4166738c	18-Mar-2022	Johannes Doerfert <[email protected]>	[OpenMP][FIX] Do not crash when kernels are debug wrapper functions With debug information enabled (-g) Clang will wrap the actual target region into a new function which is called from the "kernel" [OpenMP][FIX] Do not crash when kernels are debug wrapper functions With debug information enabled (-g) Clang will wrap the actual target region into a new function which is called from the "kernel". The problem is that the "kernel" is now basically a wrapper without all the things we expect. More importantly, if we end up asking for an AAKernelInfo for the "target region function" we might try to turn it into SPMD mode. That used to cause an assertion as that function doesn't have an appropriately named `_exec_mode` global. While the global is going away soon we still need to make sure to properly handle this case, e.g., perform optimizations reliably. Differential Revision: https://reviews.llvm.org/D122043 show more ...
# e92891f8	11-Mar-2022	Johannes Doerfert <[email protected]>	[Attributor] Allow not to default initialize AAs for live internal functions Outside users of the Attributor, e.g., OpenMP-opt, want to seed AAs themselves. We should not seed all default AAs one an [Attributor] Allow not to default initialize AAs for live internal functions Outside users of the Attributor, e.g., OpenMP-opt, want to seed AAs themselves. We should not seed all default AAs one an internal function becomes live. That said, there should be a callback such that they can do lazy seeding as well. Differential Revision: https://reviews.llvm.org/D121489 show more ...
# 192a34dd	25-Feb-2022	Johannes Doerfert <[email protected]>	[Attributor][OpenMPOpt][FIX] Register simplification callbacks Heap-2-stack and heap-2-shared can replace an allocation call with something else. To avoid us deriving information from the allocator [Attributor][OpenMPOpt][FIX] Register simplification callbacks Heap-2-stack and heap-2-shared can replace an allocation call with something else. To avoid us deriving information from the allocator implementation we register a simplification callback now that will force us to stop at the call site. We probably should create the replacement memory eagerly and return that instead though. show more ...
# 0136a440	17-Feb-2022	Joseph Huber <[email protected]>	[OpenMP] Add an option to limit shared memory usage in OpenMPOpt One of the optimizations performed in OpenMPOpt pushes globalized variables to static shared memory. This is preferable to keeping th [OpenMP] Add an option to limit shared memory usage in OpenMPOpt One of the optimizations performed in OpenMPOpt pushes globalized variables to static shared memory. This is preferable to keeping the runtime call in all cases, however if too many variables are pushed to hared memory the kernel will crash. Since this is an optimization and not something the user specified explicitly, there should be an option to limit this optimization in those cases. This path introduces the `-openmp-opt-shared-limit=` option to limit the amount of bytes that will be placed in shared memory from HeapToShared. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D120079 show more ...
Revision tags: llvmorg-13.0.1, llvmorg-13.0.1-rc3, llvmorg-13.0.1-rc2, llvmorg-13.0.1-rc1, llvmorg-13.0.0, llvmorg-13.0.0-rc4
# ac3ec22d	20-Sep-2021	Johannes Doerfert <[email protected]>	[Attributor] Use AAFunctionReachability to determine AANoRecurse We missed out on AANoRecurse in the module pass because we had no call graph. With AAFunctionReachability we can simply ask if the fu [Attributor] Use AAFunctionReachability to determine AANoRecurse We missed out on AANoRecurse in the module pass because we had no call graph. With AAFunctionReachability we can simply ask if the function may reach itself. Differential Revision: https://reviews.llvm.org/D110099 show more ...
# a1db0e52	01-Feb-2022	Johannes Doerfert <[email protected]>	[Attributor][FIX] Liveness handling in the isAssumedDead helpers This fixes a conceptual problem with our AAIsDead usage which conflated call site liveness with call site return value liveness. With [Attributor][FIX] Liveness handling in the isAssumedDead helpers This fixes a conceptual problem with our AAIsDead usage which conflated call site liveness with call site return value liveness. Without the fix tests would obviously miscompile as we make genericValueTraversal more powerful (in a follow up). The effects on the tests are mixed but mostly marginal. The most prominent one is the lack of `noreturn` for functions. The reason is that we make entire blocks live at the same time (for time reasons). Now that we actually look at the block liveness, which we need to do, the return instructions are live and will survive. As an example, `noreturn_async.ll` has been modified to retain the `noreturn` even with block granularity. We could address this easily but there is little need in practice. show more ...
# 5eb49009	24-Jan-2022	Joseph Huber <[email protected]>	[OpenMP] Add more identifier to created shared globals Currenly we push some variables to a global constant containing shared memory as an optimization. This generated constant had internal linkage [OpenMP] Add more identifier to created shared globals Currenly we push some variables to a global constant containing shared memory as an optimization. This generated constant had internal linkage and should not have collided with any known identifiers in the translation unit. However, there have been observed cases of this optimiztaion unintentionally colliding with undocumented PTX identifiers. This patch adds a suffix to the created globals to hopefully bypass this. Depends on D118059 Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D118068 show more ...
# 6e220296	27-Dec-2021	Joseph Huber <[email protected]>	[OpenMP] Use alignment information in HeapToShared This patch uses the return alignment attribute now present in the `__kmpc_alloc_shared` runtime call to set the alignment of the shared memory glob [OpenMP] Use alignment information in HeapToShared This patch uses the return alignment attribute now present in the `__kmpc_alloc_shared` runtime call to set the alignment of the shared memory global created to replace it. Depends on D115971 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D116319 show more ...
# c690c1c9	17-Sep-2021	Johannes Doerfert <[email protected]>	[NVVM] Update intrinsic definitions to include more attributes A lot of NVVM intrinsics can use the default intrinsic attributes (e.g., nosync, nofree, ...) as well as `speculatable`. The latter is [NVVM] Update intrinsic definitions to include more attributes A lot of NVVM intrinsics can use the default intrinsic attributes (e.g., nosync, nofree, ...) as well as `speculatable`. The latter is important if we want to recompute intrinsics results instead of communicating them via memory. I did use default attributes for almost all `readnone` attributes but speculatable only where I had reasonable confidence they cannot experience UB. That said, someone should double check. TODO: There seem to be various intrinsics marked `Commutative` which should not, e.g., fma and div. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D109987 show more ...
Revision tags: llvmorg-13.0.0-rc3
# 57822c3f	08-Sep-2021	Johannes Doerfert <[email protected]>	[OpenMP][NFC] Repair test that contained nested kernels The benchmark contained (partially) nested kernels, something we do not generate nor support.
# 423d34f7	22-Sep-2021	Shilei Tian <[email protected]>	[OpenMP][Offloading] Change `bool IsSPMD` to `int8_t Mode` in `__kmpc_target_init` and `__kmpc_target_deinit` This is a follow-up of D110029, which uses bitset to indicate execution mode. This patch [OpenMP][Offloading] Change `bool IsSPMD` to `int8_t Mode` in `__kmpc_target_init` and `__kmpc_target_deinit` This is a follow-up of D110029, which uses bitset to indicate execution mode. This patches makes the changes in the function call. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D110279 show more ...
# fec2927e	17-Sep-2021	Joseph Huber <[email protected]>	[OpenMP] Add NoSync attributes to alloc / free shared RTL calls This patch adds the `nosync` attribute to the `__kmpc_alloc_shared` and `__kmpc_free_shared` runtime library calls. This allows code a [OpenMP] Add NoSync attributes to alloc / free shared RTL calls This patch adds the `nosync` attribute to the `__kmpc_alloc_shared` and `__kmpc_free_shared` runtime library calls. This allows code analysis to know that these functins dont contain any barriers. This will help optimizations reason about the CFG of blocks containing these calls. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D109995 show more ...
Revision tags: llvmorg-13.0.0-rc2
# 29a3e3dd	04-Aug-2021	Giorgis Georgakoudis <[email protected]>	[OpenMPOpt] Expand SPMDization with guarding for target parallel regions This patch expands SPMDization (converting generic execution mode to SPMD for target regions) by guarding code regions that s [OpenMPOpt] Expand SPMDization with guarding for target parallel regions This patch expands SPMDization (converting generic execution mode to SPMD for target regions) by guarding code regions that should be executed only by the main thread. Specifically, it generates guarded regions, which only the main thread executes, and the synchronization with worker threads using simple barriers. For correctness, the patch aborts SPMDization for target regions if the same code executes in a parallel region, thus must be not be guarded. This check is implemented using the ParallelLevels AA. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D106892 show more ...
Revision tags: llvmorg-13.0.0-rc1, llvmorg-14-init
# 754eb1c2	21-Jul-2021	Joseph Huber <[email protected]>	[OpenMP] Change `__kmpc_free_shared` to include the paired allocation size This patch changes `__kmpc_free_shared` to take an additional argument corresponding to the associated allocation's size. T [OpenMP] Change `__kmpc_free_shared` to include the paired allocation size This patch changes `__kmpc_free_shared` to take an additional argument corresponding to the associated allocation's size. This makes it easier to implement the allocator in the runtime. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D106496 show more ...
Revision tags: llvmorg-12.0.1, llvmorg-12.0.1-rc4, llvmorg-12.0.1-rc3, llvmorg-12.0.1-rc2, llvmorg-12.0.1-rc1
# d9659bf6	20-May-2021	Johannes Doerfert <[email protected]>	[OpenMP] Create custom state machines for generic target regions In the spirit of TRegions [0], this patch creates a custom state machine for a generic target region based on the potentially called [OpenMP] Create custom state machines for generic target regions In the spirit of TRegions [0], this patch creates a custom state machine for a generic target region based on the potentially called parallel regions. The code analysis is done interprocedurally via an abstract attribute (AAKernelInfo). All outermost parallel regions are collected and we check if there might be unknown outermost parallel regions for which we need an indirect call. Other AAKernelInfo extensions are expected. [0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11 Differential Revision: https://reviews.llvm.org/D101977 show more ...
12