[mlir][NFC] Update textual references of `func` to `func.func` in Conversion/ tests

The special-case parsing of `func` operations is being removed.
[mlir][OpenMP] Add assembly format for omp.wsloop and remove parseClauses

This patch:
- adds an assembly format for the `omp.wsloop` operation
- removes `parseClauses` as it is no longer required

This is expected to be the final patch in a series of patches replacing parsers for clauses with `oilist`.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D121367
[MLIR][OpenMP] Place alloca scope within wsloop in scf.parallel to omp lowering

https://reviews.llvm.org/D120423 replaced the use of stacksave/restore with memref.alloca_scope, but kept the save/restore at the same location. This PR places the allocation scope within the wsloop, thus keeping the same allocation scope as the original scf.parallel (i.e., no longer over-allocating the stack).

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D120772
[SCF][MemRef] Enable SCF.Parallel Lowering to use Scope Op

As discussed in https://reviews.llvm.org/D119743, scf.parallel would continuously stack-allocate, since the alloca op was placed in the wsloop rather than the omp.parallel. This PR is the second stage of the fix for that problem. Specifically, we now introduce an alloca scope around the inlined body of the scf.parallel and enable a canonicalization that hoists the allocations to the surrounding allocation scope (e.g. omp.parallel).

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D120423
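The continuous stack growth this series of patches fixes can be pictured with a toy model (illustrative Python, not real MLIR semantics): stack allocations are only reclaimed when an enclosing allocation scope ends, so an alloca issued inside a loop with no per-iteration scope keeps accumulating.

```python
# Toy model of stacksave/restore (a.k.a. memref.alloca_scope) semantics.
# All names here are hypothetical; this is a sketch, not the pass itself.

class Stack:
    def __init__(self):
        self.depth = 0

    def alloca(self, size):
        # Stack allocations are not individually freed.
        self.depth += size

    def scope(self):
        # Models memref.alloca_scope: save the stack pointer on entry,
        # restore it on exit, reclaiming everything allocated inside.
        saved = self.depth
        outer = self

        class _Scope:
            def __enter__(_s):
                return None

            def __exit__(_s, *exc):
                outer.depth = saved
                return False

        return _Scope()


stack = Stack()
for _ in range(100):
    stack.alloca(8)            # no scope: every iteration's bytes stay live
unscoped_depth = stack.depth

stack = Stack()
for _ in range(100):
    with stack.scope():        # a scope around the body reclaims each time
        stack.alloca(8)
scoped_depth = stack.depth
```

Without a scope the loop leaves all 100 allocations live; with a scope around the body the stack returns to its starting depth after every iteration.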
[mlir] Move SelectOp from Standard to Arithmetic

This is part of splitting up the standard dialect. See https://llvm.discourse.group/t/standard-dialect-the-final-chapter/ for discussion.

Differential Revision: https://reviews.llvm.org/D118648
[MLIR] Replace std ops with arith dialect ops

Precursor: https://reviews.llvm.org/D110200

Removed redundant ops from the standard dialect that were moved to the `arith` or `math` dialects. Renamed all instances of the operations in the codebase and in tests.

Reviewed By: rriddle, jpienaar

Differential Revision: https://reviews.llvm.org/D110797
[mlir] support reductions in SCF to OpenMP conversion

OpenMP reductions need a neutral element, so we match some known reduction kinds (integer add/mul/or/and/xor, float add/mul, integer and float min/max) to define the neutral element, and the atomic version when it is possible to express using atomicrmw (everything except float mul). The SCF-to-OpenMP pass becomes a module pass because it now needs to introduce new symbols for reduction declarations in the module.

Reviewed By: chelini

Differential Revision: https://reviews.llvm.org/D107549
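The neutral elements the commit refers to can be sketched as follows (a minimal Python illustration; the names and table are hypothetical, not the pass's actual data structures — the idea is only that each recognized reduction kind has an identity value to initialize the accumulator with):

```python
import math

# Hypothetical table of neutral (identity) elements per reduction kind,
# mirroring the kinds listed in the commit message.  Integer min/max use
# the i64 extremes as illustrative stand-ins for the type's limits.
NEUTRAL_ELEMENTS = {
    "add_int":   0,            # x + 0 == x
    "mul_int":   1,            # x * 1 == x
    "or":        0,            # x | 0 == x
    "and":      -1,            # x & ~0 == x (all bits set)
    "xor":       0,            # x ^ 0 == x
    "add_float": 0.0,
    "mul_float": 1.0,
    "min_int":   2**63 - 1,    # min(x, INT64_MAX) == x
    "max_int":  -2**63,        # max(x, INT64_MIN) == x
    "min_float": math.inf,
    "max_float": -math.inf,
}


def reduce_with_neutral(kind, op, values):
    """Fold `values` with `op`, starting from the kind's neutral element,
    so an empty (or partial, per-thread) range yields the identity."""
    acc = NEUTRAL_ELEMENTS[kind]
    for v in values:
        acc = op(acc, v)
    return acc
```

Starting each thread's private accumulator at the neutral element is what makes combining the per-thread partial results correct regardless of how iterations were distributed.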
[MLIR][OMP] Ensure nested scf.parallel execute all iterations

Presently, the lowering of nested scf.parallel loops to OpenMP creates one omp.parallel region, with two (nested) OpenMP worksharing loops on the inside. When lowered to LLVM and executed, this produces incorrect results. The reason is as follows:

An OpenMP parallel region results in the code being run with whatever number of threads is available to OpenMP. Within a parallel region, a worksharing loop divides up the total number of requested iterations by the available number of threads, and distributes accordingly. For a single ws loop in a parallel region, this works as intended.

Now consider nested ws loops as follows:

  omp.parallel {
    A: omp.ws %i = 0...10 {
      B: omp.ws %j = 0...10 {
        code(%i, %j)
      }
    }
  }

Suppose we ran this on two threads. The first workshare loop would decide to execute iterations 0, 1, 2, 3, 4 on thread 0, and iterations 5, 6, 7, 8, 9 on thread 1. The second workshare loop would decide the same for its iterations. This means thread 0 would execute i in [0, 5) and j in [0, 5), while thread 1 would execute i in [5, 10) and j in [5, 10). The iterations i in [5, 10), j in [0, 5) and i in [0, 5), j in [5, 10) never get executed, which is clearly wrong.

This permits two options for a remedy:

1) Change the semantics of omp.wsloop to be distinct from that of the OpenMP runtime call or, equivalently, #pragma omp for. This could then allow some lowering transformation to remedy the aforementioned issue. I don't think this is desirable from an abstraction standpoint.

2) When lowering an scf.parallel, always surround the wsloop with a new parallel region (thereby causing the innermost wsloop to use the number of threads available only to it).

This PR implements the latter change.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D108426
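The missed-iterations argument above can be checked with a short simulation (plain Python, not MLIR; the chunking function is a simplified stand-in for a static worksharing schedule): when both nesting levels workshare over the same two-thread team, each thread only ever visits the "diagonal" blocks of the 10x10 iteration space.

```python
def ws_chunk(thread, num_threads, n):
    """Iterations a simplified static worksharing loop assigns to `thread`
    out of `n` total (assumes num_threads divides n evenly)."""
    per_thread = n // num_threads
    return range(thread * per_thread, (thread + 1) * per_thread)


def nested_ws_coverage(num_threads, n):
    """(i, j) pairs executed when both nested loops workshare over the
    same team, as in the broken lowering described above."""
    covered = set()
    for t in range(num_threads):
        for i in ws_chunk(t, num_threads, n):
            # The inner worksharing loop splits by the *same* thread id,
            # so thread t only sees its own chunk of j as well.
            for j in ws_chunk(t, num_threads, n):
                covered.add((i, j))
    return covered


covered = nested_ws_coverage(num_threads=2, n=10)
```

Only 50 of the 100 (i, j) pairs are covered; for example (0, 7) lands in thread 0's i-chunk but thread 1's j-chunk, so no thread ever executes it. Wrapping the inner wsloop in its own omp.parallel region gives it a fresh team, restoring full coverage.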
[MLIR][OpenMP] Pretty printer and parser for omp.wsloop

Co-authored-by: Kiran Chandramohan <[email protected]>

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D92327
[mlir] Add conversion from SCF parallel loops to OpenMP

Introduce a conversion pass from SCF parallel loops to OpenMP dialect constructs: a parallel region and a workshare loop. Loops with reductions are not supported because the OpenMP dialect cannot model them yet.

The conversion currently targets only one level of parallelism, i.e. only one top-level `omp.parallel` operation is produced even if there are nested `scf.parallel` operations that could be mapped to `omp.wsloop`. Nested parallelism support is left for future work.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D91982
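The "one level of parallelism" rule can be sketched as a simple tree walk (illustrative Python over a toy op representation, not the actual MLIR pass or its data structures): only the outermost `scf.parallel` is rewritten, and anything nested beneath a converted loop is left alone.

```python
# Toy op tree: {"op": <name>, "body": [child ops...]}.  Hypothetical
# representation used only to illustrate the conversion rule.

def convert_top_level(node, at_top_level=True):
    """Rewrite only top-level scf.parallel ops to an omp.parallel +
    omp.wsloop pair; nested scf.parallel ops are left unconverted."""
    converted = dict(node)
    if node["op"] == "scf.parallel" and at_top_level:
        converted["op"] = "omp.parallel+wsloop"
        at_top_level = False  # children are now nested parallelism
    converted["body"] = [convert_top_level(child, at_top_level)
                         for child in node.get("body", [])]
    return converted


nest = {"op": "scf.parallel",
        "body": [{"op": "scf.parallel", "body": []}]}
result = convert_top_level(nest)
```

The outer loop becomes the single `omp.parallel` region while the inner `scf.parallel` survives untouched, matching the commit's stated limitation.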