# Understanding the IR Structure

The MLIR Language Reference describes the
[High Level Structure](../LangRef/#high-level-structure). This document
illustrates that structure through examples and, along the way, introduces the
C++ APIs involved in manipulating it.

We will implement a [pass](../PassManagement/#operation-pass) that traverses any
MLIR input and prints the entities inside the IR. A pass (or, in general, almost
any piece of IR) is always rooted at an operation. Most of the time the
top-level operation is a `ModuleOp`; the MLIR `PassManager` is in fact limited
to operating on a top-level `ModuleOp`. A pass therefore starts with an
operation, and so will our traversal:

```c++
  void runOnOperation() override {
    Operation *op = getOperation();
    resetIndent();
    printOperation(op);
  }
```
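
The snippets in this document call three indentation helpers, `resetIndent()`,
`printIndent()`, and `pushIndent()`, whose definitions are not shown. The
following is a minimal, self-contained sketch of how such helpers could work. It
is an assumption made for illustration, not the code of the pass: the
`IndentPrinter` and `IndentScope` names are hypothetical, and the sketch writes
to a `std::ostream` rather than to `llvm::outs()`. The guard returned by
`pushIndent()` restores the previous depth when it goes out of scope, which is
why the traversal methods only need to keep it alive in a local variable.

```c++
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: tracks the current nesting depth and prefixes each
// printed line with two spaces per level.
class IndentPrinter {
public:
  // RAII guard: increments the depth on construction and decrements it again
  // on destruction, unless it has been moved from.
  struct IndentScope {
    int *depth;
    explicit IndentScope(int &d) : depth(&d) { ++d; }
    IndentScope(IndentScope &&other) noexcept : depth(other.depth) {
      other.depth = nullptr;
    }
    IndentScope(const IndentScope &) = delete;
    IndentScope &operator=(const IndentScope &) = delete;
    ~IndentScope() {
      if (depth)
        --*depth;
    }
  };

  explicit IndentPrinter(std::ostream &os) : os(os) {}

  void resetIndent() { depth = 0; }
  IndentScope pushIndent() { return IndentScope(depth); }
  std::ostream &printIndent() {
    for (int i = 0; i < depth; ++i)
      os << "  ";
    return os;
  }

private:
  std::ostream &os;
  int depth = 0;
};

// Usage example: prints one line per nesting level and returns the text.
std::string indentDemo() {
  std::ostringstream out;
  IndentPrinter printer(out);
  printer.resetIndent();
  printer.printIndent() << "level 0\n";
  {
    auto scope = printer.pushIndent();
    printer.printIndent() << "level 1\n";
    {
      auto inner = printer.pushIndent();
      printer.printIndent() << "level 2\n";
    }
    printer.printIndent() << "back to level 1\n";
  }
  printer.printIndent() << "back to level 0\n";
  return out.str();
}
```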

## Traversing the IR Nesting

The IR is recursively nested: an `Operation` can have zero or more nested
`Region`s, each of which is a list of `Block`s, each of which in turn wraps a
list of `Operation`s. Our traversal will follow this structure with three
methods: `printOperation()`, `printRegion()`, and `printBlock()`.

The first method inspects the properties of an operation, then iterates over
the nested regions and prints each of them:

```c++
  void printOperation(Operation *op) {
    // Print the operation itself and some of its properties
    printIndent() << "visiting op: '" << op->getName() << "' with "
                  << op->getNumOperands() << " operands and "
                  << op->getNumResults() << " results\n";
    // Print the operation attributes
    if (!op->getAttrs().empty()) {
      printIndent() << op->getAttrs().size() << " attributes:\n";
      for (NamedAttribute attr : op->getAttrs())
        printIndent() << " - '" << attr.first << "' : '" << attr.second
                      << "'\n";
    }

    // Recurse into each of the regions attached to the operation.
    printIndent() << " " << op->getNumRegions() << " nested regions:\n";
    auto indent = pushIndent();
    for (Region &region : op->getRegions())
      printRegion(region);
  }
```

A `Region` does not hold anything other than a list of `Block`s:

```c++
  void printRegion(Region &region) {
    // A region does not hold anything by itself other than a list of blocks.
    printIndent() << "Region with " << region.getBlocks().size()
                  << " blocks:\n";
    auto indent = pushIndent();
    for (Block &block : region.getBlocks())
      printBlock(block);
  }
```

Finally, a `Block` has a list of arguments and holds a list of `Operation`s:

```c++
  void printBlock(Block &block) {
    // Print the block's intrinsic properties (basically: argument list)
    printIndent()
        << "Block with " << block.getNumArguments() << " arguments, "
        << block.getNumSuccessors()
        << " successors, and "
        // Note, this `.size()` is traversing a linked-list and is O(n).
        << block.getOperations().size() << " operations\n";

    // A block's main role is to hold a list of Operations: let's recurse into
    // printing each operation.
    auto indent = pushIndent();
    for (Operation &op : block.getOperations())
      printOperation(&op);
  }
```
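
The mutual recursion between the three methods above can be tried without an
MLIR build by using a toy model of the nesting. The sketch below is only an
illustration under stated assumptions: `ToyOp`, `ToyRegion`, and `ToyBlock` are
hypothetical stand-ins that share nothing with the real classes beyond their
shape, and the traversal counts operations instead of printing them. In the
example built by `buildExample()`, the traversal finds four operations: the
module, its two children, and the one operation nested inside the second child.

```c++
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for Operation, Region and Block.
struct ToyRegion;

struct ToyOp {
  std::string name;
  std::vector<ToyRegion> regions; // an op owns zero or more regions
};
struct ToyBlock {
  std::vector<ToyOp> ops; // a block holds a list of ops
};
struct ToyRegion {
  std::vector<ToyBlock> blocks; // a region is a list of blocks
};

std::size_t countOps(const ToyRegion &region);

// Counts `op` itself plus every operation nested under it.
std::size_t countOps(const ToyOp &op) {
  std::size_t count = 1;
  for (const ToyRegion &region : op.regions)
    count += countOps(region);
  return count;
}

// Counts the operations held by `block`, including nested ones.
std::size_t countOps(const ToyBlock &block) {
  std::size_t count = 0;
  for (const ToyOp &op : block.ops)
    count += countOps(op);
  return count;
}

// Counts the operations held by all the blocks of `region`.
std::size_t countOps(const ToyRegion &region) {
  std::size_t count = 0;
  for (const ToyBlock &block : region.blocks)
    count += countOps(block);
  return count;
}

// Builds a module-like op with one region and one block holding two ops; the
// second op has a nested region of its own with a single op inside.
ToyOp buildExample() {
  ToyOp inner{"toy.inner", {}};
  ToyBlock innerBlock{{inner}};
  ToyRegion innerRegion{{innerBlock}};
  ToyOp parent{"toy.parent", {innerRegion}};
  ToyOp leaf{"toy.leaf", {}};
  ToyBlock body{{leaf, parent}};
  ToyRegion bodyRegion{{body}};
  return ToyOp{"toy.module", {bodyRegion}};
}
```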

The code for the pass is available
[here in the repo](https://github.com/llvm/llvm-project/blob/master/mlir/test/lib/IR/TestPrintNesting.cpp)
and can be exercised with `mlir-opt -test-print-nesting`.

### Example

The pass introduced in the previous section can be applied to the following IR
with `mlir-opt -test-print-nesting -allow-unregistered-dialect
llvm-project/mlir/test/IR/print-ir-nesting.mlir`:

```mlir
"module"() ( {
  %0:4 = "dialect.op1"() {"attribute name" = 42 : i32} : () -> (i1, i16, i32, i64)
  "dialect.op2"() ( {
    "dialect.innerop1"(%0#0, %0#1) : (i1, i16) -> ()
  },  {
    "dialect.innerop2"() : () -> ()
    "dialect.innerop3"(%0#0, %0#2, %0#3)[^bb1, ^bb2] : (i1, i32, i64) -> ()
  ^bb1(%1: i32):  // pred: ^bb0
    "dialect.innerop4"() : () -> ()
    "dialect.innerop5"() : () -> ()
  ^bb2(%2: i64):  // pred: ^bb0
    "dialect.innerop6"() : () -> ()
    "dialect.innerop7"() : () -> ()
  }) {"other attribute" = 42 : i64} : () -> ()
  "module_terminator"() : () -> ()
}) : () -> ()
```

This yields the following output:

```
visiting op: 'module' with 0 operands and 0 results
 1 nested regions:
  Region with 1 blocks:
    Block with 0 arguments, 0 successors, and 3 operations
      visiting op: 'dialect.op1' with 0 operands and 4 results
      1 attributes:
       - 'attribute name' : '42 : i32'
       0 nested regions:
      visiting op: 'dialect.op2' with 0 operands and 0 results
       2 nested regions:
        Region with 1 blocks:
          Block with 0 arguments, 0 successors, and 1 operations
            visiting op: 'dialect.innerop1' with 2 operands and 0 results
             0 nested regions:
        Region with 3 blocks:
          Block with 0 arguments, 2 successors, and 2 operations
            visiting op: 'dialect.innerop2' with 0 operands and 0 results
             0 nested regions:
            visiting op: 'dialect.innerop3' with 3 operands and 0 results
             0 nested regions:
          Block with 1 arguments, 0 successors, and 2 operations
            visiting op: 'dialect.innerop4' with 0 operands and 0 results
             0 nested regions:
            visiting op: 'dialect.innerop5' with 0 operands and 0 results
             0 nested regions:
          Block with 1 arguments, 0 successors, and 2 operations
            visiting op: 'dialect.innerop6' with 0 operands and 0 results
             0 nested regions:
            visiting op: 'dialect.innerop7' with 0 operands and 0 results
             0 nested regions:
      visiting op: 'module_terminator' with 0 operands and 0 results
       0 nested regions:
```

## Other IR Traversal Methods

In many cases, unwrapping the recursive structure of the IR by hand is
cumbersome, and you may prefer to use other helpers.

### Filtered iterator: `getOps<OpTy>()`

For example, the `Block` class exposes a convenient templated method
`getOps<OpTy>()` that provides a filtered iterator. Here is an example:

```c++
  auto varOps = entryBlock.getOps<spirv::GlobalVariableOp>();
  for (spirv::GlobalVariableOp gvOp : varOps) {
     // process each GlobalVariable Operation in the block.
     ...
  }
```

Similarly, the `Region` class exposes the same `getOps` method, which iterates
over all the blocks in the region.
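
To make the idea of filtering by operation type concrete, here is a
self-contained sketch that does not depend on MLIR. Everything in it is a
hypothetical stand-in: `ToyOperation`, `ToyAllocOp`, `ToyStoreOp`, and the free
function `getOps` are invented for illustration, and the sketch collects the
matches eagerly into a `std::vector` for brevity, whereas the method described
above provides a filtered iterator. Each typed view decides through a static
`classof` function whether it can wrap a given generic operation.

```c++
#include <string>
#include <vector>

// Hypothetical generic operation: only a name of the form "dialect.mnemonic".
struct ToyOperation {
  std::string name;
};

// Hypothetical typed views over a generic operation. Each view knows how to
// recognize the operations it is allowed to wrap.
struct ToyAllocOp {
  const ToyOperation *op;
  static bool classof(const ToyOperation &candidate) {
    return candidate.name == "toy.alloc";
  }
};
struct ToyStoreOp {
  const ToyOperation *op;
  static bool classof(const ToyOperation &candidate) {
    return candidate.name == "toy.store";
  }
};

// Eager stand-in for a filtered iterator: returns one typed view per
// operation of `block` that `OpTy` recognizes, preserving the block order.
template <typename OpTy>
std::vector<OpTy> getOps(const std::vector<ToyOperation> &block) {
  std::vector<OpTy> matches;
  for (const ToyOperation &op : block)
    if (OpTy::classof(op))
      matches.push_back(OpTy{&op});
  return matches;
}
```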

### Walkers

`getOps<OpTy>()` is useful for iterating over operations listed immediately
inside a single block (or a single region). However, it is often useful to
traverse the IR in a nested fashion. To this end, MLIR exposes the `walk()`
helper on `Operation`, `Block`, and `Region`. This helper takes a single
argument: a callback that is invoked for every operation recursively nested
under the provided entity.

```c++
  // Recursively traverse all the regions and blocks nested inside the function
  // and apply the callback on every single operation in post-order.
  getFunction().walk([&](mlir::Operation *op) {
    // process Operation `op`.
  });
```
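
The practical consequence of a post-order traversal is that nested operations
reach the callback before the operation that contains them. The sketch below
shows this on a toy structure; it is an illustration only, not MLIR's
implementation. `NestedOp` is a hypothetical type in which an operation owns
its nested operations directly (regions and blocks are flattened away), and
`walk` and `visitOrder` are free functions written for this example. For a
`func` containing a `loop` (which itself contains an `add`) followed by a
`return`, the visiting order is `add`, `loop`, `return`, `func`.

```c++
#include <functional>
#include <string>
#include <vector>

// Hypothetical operation that directly owns its nested operations.
struct NestedOp {
  std::string name;
  std::vector<NestedOp> nested;
};

// Post-order walk: every nested operation is visited before its parent.
void walk(const NestedOp &op,
          const std::function<void(const NestedOp &)> &callback) {
  for (const NestedOp &child : op.nested)
    walk(child, callback);
  callback(op);
}

// Returns the operation names in the order in which the walk visits them.
std::vector<std::string> visitOrder(const NestedOp &root) {
  std::vector<std::string> order;
  walk(root, [&](const NestedOp &op) { order.push_back(op.name); });
  return order;
}
```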

The provided callback can be specialized to filter on a particular type of
Operation. For example, the following applies the callback only to `LinalgOp`
operations nested inside the function:

```c++
  getFunction().walk([](LinalgOp linalgOp) {
    // process LinalgOp `linalgOp`.
  });
```

Finally, the callback can optionally stop the walk by returning a
`WalkResult::interrupt()` value. For example, the following walk visits the
`AllocOp`s nested inside the function and interrupts the traversal as soon as
one of them does not satisfy a criterion:

```c++
  WalkResult result = getFunction().walk([&](AllocOp allocOp) {
    if (!isValid(allocOp))
      return WalkResult::interrupt();
    return WalkResult::advance();
  });
  if (result.wasInterrupted())
    // One alloc wasn't matching.
    ...
```
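
The early exit can be observed on a toy structure by counting how many
operations the callback sees before the walk stops. The sketch below rests on
assumptions made for illustration: `NestedOp`, `ToyWalkResult`, `walk`, and
`allNamed` are hypothetical, and the two-state enum is only loosely modelled on
the advance and interrupt values used above. Once any callback invocation
returns `Interrupt`, the result propagates up through every enclosing call
without visiting further operations.

```c++
#include <functional>
#include <string>
#include <vector>

// Hypothetical operation that directly owns its nested operations.
struct NestedOp {
  std::string name;
  std::vector<NestedOp> nested;
};

// Hypothetical two-state result returned by the callback.
enum class ToyWalkResult { Advance, Interrupt };

// Post-order walk that stops as soon as the callback asks for an interrupt.
ToyWalkResult
walk(const NestedOp &op,
     const std::function<ToyWalkResult(const NestedOp &)> &callback) {
  for (const NestedOp &child : op.nested)
    if (walk(child, callback) == ToyWalkResult::Interrupt)
      return ToyWalkResult::Interrupt;
  return callback(op);
}

// Returns true if every operation under `root` has a non-empty name. The
// number of operations seen by the callback is stored in `visited`.
bool allNamed(const NestedOp &root, int &visited) {
  visited = 0;
  ToyWalkResult result = walk(root, [&](const NestedOp &op) {
    ++visited;
    return op.name.empty() ? ToyWalkResult::Interrupt
                           : ToyWalkResult::Advance;
  });
  return result != ToyWalkResult::Interrupt;
}
```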

## Traversing the Def-Use Chains

Another relationship in the IR is the one that links a `Value` with its users.
As defined in the
[language reference](https://mlir.llvm.org/docs/LangRef/#high-level-structure),
each `Value` is either a `BlockArgument` or the result of exactly one
`Operation` (an `Operation` can have multiple results, each of which is a
separate `Value`). The users of a `Value` are `Operation`s, through their
operands: each `Operation` operand references a single `Value`.

Here is a code sample that inspects the operands of an `Operation` and prints
some information about them:

```c++
  // Print information about the producer of each of the operands.
  for (Value operand : op->getOperands()) {
    if (Operation *producer = operand.getDefiningOp()) {
      llvm::outs() << "  - Operand produced by operation '"
                   << producer->getName() << "'\n";
    } else {
      // If there is no defining op, the Value is necessarily a Block
      // argument.
      auto blockArg = operand.cast<BlockArgument>();
      llvm::outs() << "  - Operand produced by Block argument, number "
                   << blockArg.getArgNumber() << "\n";
    }
  }
```

Similarly, the following code sample iterates over the result `Value`s
produced by an `Operation` and, for each result, iterates over its users and
prints information about them:

```c++
  // Print information about the users of each result.
  llvm::outs() << "Has " << op->getNumResults() << " results:\n";
  for (auto indexedResult : llvm::enumerate(op->getResults())) {
    Value result = indexedResult.value();
    llvm::outs() << "  - Result " << indexedResult.index();
    if (result.use_empty()) {
      llvm::outs() << " has no uses\n";
      continue;
    }
    if (result.hasOneUse()) {
      llvm::outs() << " has a single use: ";
    } else {
      llvm::outs() << " has "
                   << std::distance(result.getUses().begin(),
                                    result.getUses().end())
                   << " uses:\n";
    }
    for (Operation *userOp : result.getUsers()) {
      llvm::outs() << "    - " << userOp->getName() << "\n";
    }
  }
```

The code for this pass is available
[here in the repo](https://github.com/llvm/llvm-project/blob/master/mlir/test/lib/IR/TestPrintDefUse.cpp)
and can be exercised with `mlir-opt -test-print-defuse`.

The chaining of `Value`s and their uses can be viewed as follows:

![Index Map Example](/includes/img/DefUseChains.svg)

The uses of a `Value` (`OpOperand` or `BlockOperand`) are also chained in a
doubly linked list, which is particularly useful when replacing all uses of a
`Value` with a new one ("RAUW"):

![Index Map Example](/includes/img/Use-list.svg)
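
To see why a doubly linked list suits RAUW, consider the toy model below. It is
a sketch under stated assumptions and not MLIR's actual data structure:
`ToyValue` and `ToyUse` are hypothetical types, with each `ToyUse` standing for
one operand slot. Because every use knows both of its neighbours, it can be
unlinked from one value's list and pushed onto another's in constant time, so
replacing all uses only costs one step per use and never requires a search.

```c++
#include <cstddef>

struct ToyValue;

// Hypothetical operand slot: one use of a value, linked to the other uses of
// the same value through `next` and `prev`.
struct ToyUse {
  ToyValue *value = nullptr;
  ToyUse *next = nullptr;
  ToyUse *prev = nullptr;

  // Makes this use refer to `newValue`, updating both use lists.
  void set(ToyValue *newValue);
};

// Hypothetical value: owns the head of the list of its uses.
struct ToyValue {
  ToyUse *firstUse = nullptr;

  // Walks the use list; linear in the number of uses.
  std::size_t numUses() const {
    std::size_t count = 0;
    for (const ToyUse *use = firstUse; use; use = use->next)
      ++count;
    return count;
  }

  // "RAUW": redirects every use of this value to `replacement`.
  void replaceAllUsesWith(ToyValue &replacement) {
    if (&replacement == this)
      return;
    // Each call to set() removes the head use from this list.
    while (firstUse)
      firstUse->set(&replacement);
  }
};

inline void ToyUse::set(ToyValue *newValue) {
  // Unlink from the current value's list in constant time.
  if (value) {
    if (prev)
      prev->next = next;
    else
      value->firstUse = next;
    if (next)
      next->prev = prev;
  }
  value = newValue;
  prev = nullptr;
  next = nullptr;
  // Push onto the head of the new value's list.
  if (value) {
    next = value->firstUse;
    if (next)
      next->prev = this;
    value->firstUse = this;
  }
}
```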