1# Operation Definition Specification (ODS)
2
3In addition to specializing the `mlir::Op` C++ template, MLIR also supports
4defining operations and data types in a table-driven manner. This is achieved
5via [TableGen][TableGen], which is both a generic language and its tooling to
6maintain records of domain-specific information. Facts regarding an operation
7are specified concisely into a TableGen record, which will be expanded into an
8equivalent `mlir::Op` C++ template specialization at compiler build time.
9
10This manual explains in detail all the available mechanisms for defining
11operations in such a table-driven manner. It aims to be a specification instead
12of a tutorial. Please refer to
13[Quickstart tutorial to adding MLIR graph rewrite](Tutorials/QuickstartRewrites.md)
14for the latter.
15
16In addition to detailing each mechanism, this manual also tries to capture best
17practices. They are rendered as quoted bullet points.
18
19## Motivation
20
21MLIR allows pluggable dialects, and dialects contain, among others, a list of
22operations. This open and extensible ecosystem leads to the "stringly" type IR
23problem, e.g., repetitive string comparisons during optimization and analysis
24passes, unintuitive accessor methods (e.g., generic/error prone `getOperand(3)`
25vs self-documenting `getStride()`) with more generic return types, verbose and
26generic constructors without default arguments, verbose textual IR dumps, and so
27on. Furthermore, operation verification is:
28
291.  best case: a central string-to-verification-function map,
301.  middle case: duplication of verification across the code base, or
311.  worst case: no verification functions.
32
33The fix is to support defining ops in a table-driven manner. Then for each
34dialect, we can have a central place that contains everything you need to know
35about each op, including its constraints, custom assembly form, etc. This
36description is also used to generate helper functions and classes to allow
37building, verification, parsing, printing, analysis, and many more.
38
39## Benefits
40
41Compared to the C++ template, this table-driven approach has several benefits
42including but not limited to:
43
44*   **Single source of truth**: We strive to encode all facts regarding an
45    operation into the record, so that readers don't need to jump among code
46    snippets to fully understand an operation.
47*   **Removing boilerplate**: We can automatically generate
48    operand/attribute/result getter methods, operation build methods, operation
49    verify methods, and many more utilities from the record. This greatly
50    reduces the boilerplate needed for defining a new op.
*   **Facilitating auto-generation**: The use of these operation information
    records is by no means limited to the op definition itself. We can use them
    to drive the auto-generation of many other components, like computation
    graph serialization.
55
56## TableGen Syntax
57
58We use TableGen as the language for specifying operation information. TableGen
59itself just provides syntax for writing records; the syntax and constructs
60allowed in a TableGen file (typically with the filename suffix `.td`) can be found
61[here][TableGenProgRef].
62
63*   TableGen `class` is similar to C++ class; it can be templated and
64    subclassed.
65*   TableGen `def` is similar to C++ object; it can be declared by specializing
66    a TableGen `class` (e.g., `def MyDef : MyClass<...>;`) or completely
67    independently (e.g., `def MyDef;`). It cannot be further templated or
68    subclassed.
69*   TableGen `dag` is a dedicated type for directed acyclic graph of elements. A
70    `dag` has one operator and zero or more arguments. Its syntax is `(operator
71    arg0, arg1, argN)`. The operator can be any TableGen `def`; an argument can
72    be anything, including `dag` itself. We can have names attached to both the
73    operator and the arguments like `(MyOp:$op_name MyArg:$arg_name)`.
74
75Please see the [language reference][TableGenProgRef] to learn about all the
76types and expressions supported by TableGen.
77
78## Operation Definition
79
80MLIR defines several common constructs to help operation definition and provide
81their semantics via a special [TableGen backend][TableGenBackend]:
82[`OpDefinitionsGen`][OpDefinitionsGen]. These constructs are defined in
83[`OpBase.td`][OpBase]. The main ones are:
84
85*   The `Op` class: It is the main construct for defining operations. All facts
86    regarding the operation are specified when specializing this class, with the
87    help of the following constructs.
88*   The `Dialect` class: Operations belonging to one logical group are placed in
89    the same dialect. The `Dialect` class contains dialect-level information.
*   The `OpTrait` class hierarchy: They are used to specify special properties
    and constraints of the operation, including whether the operation has side
    effects or whether its output has the same shape as the input.
*   The `ins`/`outs` marker: These are two special markers built into the
    `OpDefinitionsGen` backend. They lead to the definitions of operands/attributes
    and results respectively.
96*   The `TypeConstraint` class hierarchy: They are used to specify the
97    constraints over operands or results. A notable subclass hierarchy is
98    `Type`, which stands for constraints for common C++ types.
99*   The `AttrConstraint` class hierarchy: They are used to specify the
100    constraints over attributes. A notable subclass hierarchy is `Attr`, which
101    stands for constraints for attributes whose values are of common types.
102
103An operation is defined by specializing the `Op` class with concrete contents
104for all the fields it requires. For example, `tf.AvgPool` is defined as
105
106```tablegen
107def TF_AvgPoolOp : TF_Op<"AvgPool", [NoSideEffect]> {
108  let summary = "Performs average pooling on the input.";
109
110  let description = [{
111Each entry in `output` is the mean of the corresponding size `ksize`
112window in `value`.
113  }];
114
115  let arguments = (ins
116    TF_FpTensor:$value,
117
118    Confined<I64ArrayAttr, [ArrayMinCount<4>]>:$ksize,
119    Confined<I64ArrayAttr, [ArrayMinCount<4>]>:$strides,
120    TF_AnyStrAttrOf<["SAME", "VALID"]>:$padding,
121    DefaultValuedAttr<TF_ConvertDataFormatAttr, "NHWC">:$data_format
122  );
123
124  let results = (outs
125    TF_FpTensor:$output
126  );
127
128  TF_DerivedOperandTypeAttr T = TF_DerivedOperandTypeAttr<0>;
129}
130```
131
132In the following we describe all the fields needed. Please see the definition of
133the `Op` class for the complete list of fields supported.
134
135### Operation name
136
The operation name is a unique identifier for the operation within MLIR, e.g.,
`tf.Add` for the addition operation in the TensorFlow dialect. This is the
equivalent of the mnemonic in assembly language. It is used for parsing and
printing in the textual format. It is also used for pattern matching in graph
rewrites.
142
143The full operation name is composed of the dialect name and the op name, with
144the former provided via the dialect and the latter provided as the second
145template parameter to the `Op` class.
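
As a rough illustration of this naming scheme, the standalone C++ sketch below
joins a dialect name and an op name with a dot, and splits a full name at the
first dot. It does not use any MLIR headers; `makeFullOpName` and `splitOpName`
are hypothetical helpers introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: joins a dialect name and an op name with a dot.
inline std::string makeFullOpName(const std::string &dialect,
                                  const std::string &opName) {
  assert(!dialect.empty() && !opName.empty());
  return dialect + "." + opName;
}

// Hypothetical helper: splits a full name at the first dot, so any later dots
// stay in the op-name part. Returns an empty dialect if no dot is present.
inline std::pair<std::string, std::string>
splitOpName(const std::string &fullName) {
  std::size_t dot = fullName.find('.');
  if (dot == std::string::npos)
    return {"", fullName};
  return {fullName.substr(0, dot), fullName.substr(dot + 1)};
}
```

For instance, `makeFullOpName("tf", "AvgPool")` yields `tf.AvgPool`, matching
the `tf.AvgPool` example above.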
146
147### Operation documentation
148
149This includes both a one-line `summary` and a longer human-readable
150`description`. They will be used to drive automatic generation of dialect
151documentation. They need to be provided in the operation's definition body:
152
153```tablegen
154let summary = "...";
155
156let description = [{
157...
158}];
159```
160
161`description` should be written in Markdown syntax.
162
163Placing the documentation at the beginning is recommended since it helps in
164understanding the operation.
165
166> *   Place documentation at the beginning of the operation definition
167> *   The summary should be short and concise. It should be a one-liner without
168>     trailing punctuation. Put expanded explanation in description.
169
170### Operation arguments
171
There are two kinds of arguments: operands and attributes. Operands are runtime
values produced by other ops, while attributes are compile-time known constant
values. Attributes fall into two categories:
175
1761.  Natural attributes: these attributes affect the behavior of the operations
177    (e.g., padding for convolution);
1.  Derived attributes: these attributes are not needed to define the operation
    but are instead derived from information of the operation, e.g., the output
    shape or type. This is mostly used for convenience interface generation or
    interaction with other frameworks/translation.

    All derived attributes should be materializable as an Attribute. That is,
    even though they are not materialized, it should be possible to store them
    as attributes.
186
187Both operands and attributes are specified inside the `dag`-typed `arguments`,
188led by `ins`:
189
190```tablegen
191let arguments = (ins
192  <type-constraint>:$<operand-name>,
193  ...
194  <attr-constraint>:$<attr-name>,
195  ...
196);
197```
198
199Here `<type-constraint>` is a TableGen `def` from the `TypeConstraint` class
200hierarchy. Similarly, `<attr-constraint>` is a TableGen `def` from the
201`AttrConstraint` class hierarchy. See [Constraints](#constraints) for more
202information.
203
There are no requirements on the relative order of operands and attributes; they
can mix freely. The relative order of the operands themselves matters. For each
named argument, a named getter is generated that returns the argument (for
attributes the return type is constructed from the storage type, while for
operands it is `Value`). Each attribute's raw value (as stored) can also be
accessed via the generated `<name>Attr` getter, for use in transformation
passes where the more user-friendly return type is less suitable.
213All the arguments should be named to:
214- provide documentation,
215- drive auto-generation of getter methods, and
216- provide a handle to reference for other places like constraints.
217
218#### Variadic operands
219
220To declare a variadic operand, wrap the `TypeConstraint` for the operand with
221`Variadic<...>`.
222
Normally operations have no variadic operands or just one variadic operand. For
the latter case, it is easy to deduce which dynamic operands are for the static
variadic operand definition. However, if an operation has more than one
variable-length operand (either optional or variadic), it would be impossible to
attribute dynamic operands to the corresponding static variadic operand
definitions without further information from the operation. Therefore, either
the `SameVariadicOperandSize` trait is needed to indicate that all
variable-length operands have the same number of dynamic values, or the
`AttrSizedOperandSegments` trait is needed to record the size of each operand
segment in an attribute.
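
To make the attribution problem concrete, here is a minimal standalone sketch
of how dynamic operands could be assigned to static definitions when every
variadic definition is assumed to have the same length. It is a toy model, not
the code that ODS generates; `OperandKind` and `segmentsWithSameVariadicSize`
are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy description of one static operand definition.
enum class OperandKind { Single, Variadic };

// Returns a (start, length) pair into the dynamic operand list for each static
// definition, assuming every variadic definition has the same length.
inline std::vector<std::pair<std::size_t, std::size_t>>
segmentsWithSameVariadicSize(const std::vector<OperandKind> &defs,
                             std::size_t numDynamicOperands) {
  std::size_t numVariadic = 0;
  for (OperandKind kind : defs)
    if (kind == OperandKind::Variadic)
      ++numVariadic;
  std::size_t numSingle = defs.size() - numVariadic;
  assert(numDynamicOperands >= numSingle && "too few dynamic operands");

  // Whatever is left after the single operands is shared evenly.
  std::size_t variadicSize = 0;
  if (numVariadic != 0) {
    assert((numDynamicOperands - numSingle) % numVariadic == 0 &&
           "variadic operands cannot all have the same size");
    variadicSize = (numDynamicOperands - numSingle) / numVariadic;
  }

  std::vector<std::pair<std::size_t, std::size_t>> segments;
  std::size_t start = 0;
  for (OperandKind kind : defs) {
    std::size_t length = (kind == OperandKind::Variadic) ? variadicSize : 1;
    segments.emplace_back(start, length);
    start += length;
  }
  return segments;
}
```

With definitions `{Variadic, Single, Variadic}` and seven dynamic operands,
each variadic definition receives three values. Without the equal-size
assumption the split would be ambiguous.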
232
233#### VariadicOfVariadic operands
234
235To declare a variadic operand that has a variadic number of sub-ranges, wrap the
236`TypeConstraint` for the operand with `VariadicOfVariadic<...,
237"<segment-attribute-name>">`.
238
239The second field of the `VariadicOfVariadic` is the name of an `I32ElementsAttr`
240argument that contains the sizes of the variadic sub-ranges. This attribute will
241be used when determining the size of sub-ranges, or when updating the size of
242sub-ranges.
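
The role of the segment attribute can be sketched in plain C++: given a flat
operand list and a list of sub-range sizes, the sub-ranges are recovered by
walking the flat list. This is a toy model under the assumption that the sizes
are available as a plain integer vector; `splitIntoSubRanges` is a hypothetical
helper, not an MLIR API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Splits a flat operand list into sub-ranges whose lengths are given by
// `segmentSizes`, a stand-in for the contents of the segment attribute.
template <typename T>
std::vector<std::vector<T>>
splitIntoSubRanges(const std::vector<T> &flatOperands,
                   const std::vector<std::int32_t> &segmentSizes) {
  std::size_t total = 0;
  for (std::int32_t size : segmentSizes) {
    assert(size >= 0 && "segment sizes must be non-negative");
    total += static_cast<std::size_t>(size);
  }
  assert(total == flatOperands.size() && "sizes must cover all operands");

  std::vector<std::vector<T>> subRanges;
  auto it = flatOperands.begin();
  for (std::int32_t size : segmentSizes) {
    subRanges.emplace_back(it, it + size);
    it += size;
  }
  return subRanges;
}
```

For example, five operands with sizes `{2, 0, 3}` produce three sub-ranges, the
second of which is empty.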
243
244#### Optional operands
245
246To declare an optional operand, wrap the `TypeConstraint` for the operand with
247`Optional<...>`.
248
Normally operations have no optional operands or just one optional operand. For
the latter case, it is easy to deduce which dynamic operands are for the static
operand definition. However, if an operation has more than one variable-length
operand (either optional or variadic), it would be impossible to attribute
dynamic operands to the corresponding static operand definitions without
further information from the operation. Therefore, either the
`SameVariadicOperandSize` trait is needed to indicate that all variable-length
operands have the same number of dynamic values, or the
`AttrSizedOperandSegments` trait is needed to record the size of each operand
segment in an attribute.
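
The following standalone sketch shows one way an optional operand could be
looked up when the size of every operand segment is recorded explicitly. It is
a toy model that assumes the sizes are stored as a plain integer vector;
`ToyOp`, `segmentStart`, and `getOptional` are hypothetical and do not mirror
the generated accessors.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy operation: a flat list of dynamic operand values plus one recorded size
// per static operand definition (a stand-in for a segment-size attribute).
struct ToyOp {
  std::vector<int> operands;
  std::vector<std::int32_t> segmentSizes;

  // Start index of the segment that belongs to static definition `index`.
  std::size_t segmentStart(std::size_t index) const {
    assert(index < segmentSizes.size());
    std::size_t start = 0;
    for (std::size_t i = 0; i < index; ++i)
      start += static_cast<std::size_t>(segmentSizes[i]);
    return start;
  }

  // Getter for an optional operand: its segment holds zero or one value.
  // Returns nullptr when the operand is absent.
  const int *getOptional(std::size_t index) const {
    assert(index < segmentSizes.size());
    std::int32_t size = segmentSizes[index];
    assert((size == 0 || size == 1) && "optional segment must be 0 or 1");
    return size == 0 ? nullptr : &operands[segmentStart(index)];
  }
};
```

Because each segment size is stored, an absent optional operand (size 0) does
not shift the meaning of the operands that follow it.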
258
259#### Optional attributes
260
261To declare an optional attribute, wrap the `AttrConstraint` for the attribute
262with `OptionalAttr<...>`.
263
264#### Attributes with default values
265
266To declare an attribute with a default value, wrap the `AttrConstraint` for the
267attribute with `DefaultValuedAttr<..., "...">`.
268
The second parameter to `DefaultValuedAttr` should be a string containing the
C++ default value. For example, a float default value should be specified
like `"0.5f"`, and an integer array default value should be specified like
`"{1, 2, 3}"`.
273
274#### Confining attributes
275
276`Confined` is provided as a general mechanism to help modelling further
277constraints on attributes beyond the ones brought by value types. You can use
278`Confined` to compose complex constraints out of more primitive ones. For
279example, a 32-bit integer attribute whose minimum value must be 10 can be
280expressed as `Confined<I32Attr, [IntMinValue<10>]>`.
281
282Right now, the following primitive constraints are supported:
283
284*   `IntMinValue<N>`: Specifying an integer attribute to be greater than or
285    equal to `N`
286*   `IntMaxValue<N>`: Specifying an integer attribute to be less than or equal
287    to `N`
288*   `ArrayMinCount<N>`: Specifying an array attribute to have at least `N`
289    elements
290*   `IntArrayNthElemEq<I, N>`: Specifying an integer array attribute's `I`-th
291    element to be equal to `N`
292*   `IntArrayNthElemMinValue<I, N>`: Specifying an integer array attribute's
293    `I`-th element to be greater than or equal to `N`
294
295TODO: Design and implement more primitive constraints
296
297### Operation regions
298
299The regions of an operation are specified inside of the `dag`-typed `regions`,
300led by `region`:
301
302```tablegen
303let regions = (region
304  <region-constraint>:$<region-name>,
305  ...
306);
307```
308
309#### Variadic regions
310
311Similar to the `Variadic` class used for variadic operands and results,
312`VariadicRegion<...>` can be used for regions. Variadic regions can currently
313only be specified as the last region in the regions list.
314
315### Operation results
316
317Similar to operands, results are specified inside the `dag`-typed `results`, led
318by `outs`:
319
320```tablegen
321let results = (outs
322  <type-constraint>:$<result-name>,
323  ...
324);
325```
326
327#### Variadic results
328
Similar to variadic operands, `Variadic<...>` can also be used for results.
Likewise, the `SameVariadicResultSize` trait is needed when an operation has
multiple variadic results.
332
333### Operation successors
334
335For terminator operations, the successors are specified inside of the
336`dag`-typed `successors`, led by `successor`:
337
338```tablegen
339let successors = (successor
340  <successor-constraint>:$<successor-name>,
341  ...
342);
343```
344
345#### Variadic successors
346
347Similar to the `Variadic` class used for variadic operands and results,
348`VariadicSuccessor<...>` can be used for successors. Variadic successors can
349currently only be specified as the last successor in the successor list.
350
351### Operation traits and constraints
352
353Traits are operation properties that affect syntax or semantics. MLIR C++ models
354various traits in the `mlir::OpTrait` namespace.
355
Operation traits, [interfaces](Interfaces.md/#utilizing-the-ods-framework),
and constraints involving multiple operands/attributes/results are all provided
as the third template parameter to the `Op` class. They should derive from
the `OpTrait` class. See [Constraints](#constraints) for more information.
360
361### Builder methods
362
For each operation, there are a few builders automatically generated based on
the arguments and return types. For example, given the following op definition:
365
366```tablegen
367def MyOp : ... {
368  let arguments = (ins
369    I32:$i32_operand,
370    F32:$f32_operand,
371    ...,
372
373    I32Attr:$i32_attr,
374    F32Attr:$f32_attr,
375    ...
376  );
377
378  let results = (outs
379    I32:$i32_result,
380    F32:$f32_result,
381    ...
382  );
383}
384```
385
386The following builders are generated:
387
388```c++
389// All result-types/operands/attributes have one aggregate parameter.
390static void build(OpBuilder &odsBuilder, OperationState &odsState,
391                  TypeRange resultTypes,
392                  ValueRange operands,
393                  ArrayRef<NamedAttribute> attributes);
394
395// Each result-type/operand/attribute has a separate parameter. The parameters
396// for attributes are of mlir::Attribute types.
397static void build(OpBuilder &odsBuilder, OperationState &odsState,
398                  Type i32_result, Type f32_result, ...,
399                  Value i32_operand, Value f32_operand, ...,
400                  IntegerAttr i32_attr, FloatAttr f32_attr, ...);
401
// Each result-type/operand/attribute has a separate parameter. The parameters
// for attributes are raw values unwrapped from mlir::Attribute instances.
// (Note that this builder will not always be generated. See the following
// explanation for more details.)
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  Type i32_result, Type f32_result, ...,
                  Value i32_operand, Value f32_operand, ...,
                  APInt i32_attr, APFloat f32_attr, ...);
410
411// Each operand/attribute has a separate parameter but result type is aggregate.
412static void build(OpBuilder &odsBuilder, OperationState &odsState,
413                  TypeRange resultTypes,
414                  Value i32_operand, Value f32_operand, ...,
415                  IntegerAttr i32_attr, FloatAttr f32_attr, ...);
416
417// All operands/attributes have aggregate parameters.
418// Generated if return type can be inferred.
419static void build(OpBuilder &odsBuilder, OperationState &odsState,
420                  ValueRange operands, ArrayRef<NamedAttribute> attributes);
421
422// (And manually specified builders depending on the specific op.)
423```
424
425The first form provides basic uniformity so that we can create ops using the
426same form regardless of the exact op. This is particularly useful for
427implementing declarative pattern rewrites.
428
The second and third forms are good for use in manually written code, given that
they provide better guarantees via signatures.

The third form will be generated if any of the op's attributes has a different
`Attr.returnType` from `Attr.storageType` and we know how to build an attribute
from an unwrapped value (i.e., `Attr.constBuilderCall` is defined).
Additionally, for the third form, if an attribute appearing later in the
`arguments` list has a default value, the default value will be supplied in the
declaration. This works for `BoolAttr`, `StrAttr`, `EnumAttr` for now and the
list can grow in the future. So if possible, default-valued attributes should be
placed at the end of the `arguments` list to leverage this feature. (This
behavior is essentially due to C++ function parameter default value placement
restrictions.) Otherwise, the builder of the third form will still be generated
but default values for the attributes not at the end of the `arguments` list
will not be supplied in the builder's signature.
444
ODS will generate a builder that doesn't require the return type to be
specified if either of the following holds:

*   The op implements the `InferTypeOpInterface` interface.
*   All return types are either buildable types or are the same as a given
    operand (e.g., `AllTypesMatch` constraint between operand and result).

Other builders may also exist depending on the specific op; please refer to the
[generated C++ file](#run-mlir-tblgen-to-see-the-generated-content) for the
complete list.
455
456#### Custom builder methods
457
If the generated builders above do not cover all needs, you can define
additional convenience build methods in the `builders` field as follows.
460
461```tablegen
462def MyOp : Op<"my_op", []> {
463  let arguments = (ins F32Attr:$attr);
464
465  let builders = [
466    OpBuilder<(ins "float":$val)>
467  ];
468}
469```
470
471The `builders` field is a list of custom builders that are added to the Op
472class. In this example, we provide a convenience builder that takes a floating
473point value instead of an attribute. The `ins` prefix is common to many function
474declarations in ODS, which use a TableGen [`dag`](#tablegen-syntax). What
475follows is a comma-separated list of types (quoted string) and names prefixed
476with the `$` sign. This will generate the declaration of a builder method that
477looks like:
478
479```c++
480class MyOp : /*...*/ {
481  /*...*/
482  static void build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
483                    float val);
484};
485```
486
487Note that the method has two additional leading arguments. These arguments are
488useful to construct the operation. In particular, the method must populate
489`state` with attributes, operands, regions and result types of the operation to
490be constructed. `builder` can be used to construct any IR objects that belong to
491the Op, such as types or nested operations. Since the type and name are
492generated as is in the C++ code, they should be valid C++ constructs for a type
493(in the namespace of the Op) and an identifier (e.g., `class` is not a valid
494identifier).
495
496Implementations of the builder can be provided directly in ODS, using TableGen
497code block as follows.
498
499```tablegen
500def MyOp : Op<"my_op", []> {
501  let arguments = (ins F32Attr:$attr);
502
503  let builders = [
504    OpBuilder<(ins "float":$val), [{
505      $_state.addAttribute("attr", $_builder.getF32FloatAttr(val));
506    }]>
507  ];
508}
509```
510
511The equivalents of `builder` and `state` arguments are available as `$_builder`
512and `$_state` special variables. The named arguments listed in the `ins` part
513are available directly, e.g. `val`. The body of the builder will be generated by
514substituting special variables and should otherwise be valid C++. While there is
515no limitation on the code size, we encourage one to define only short builders
516inline in ODS and put definitions of longer builders in C++ files.
517
518Finally, if some arguments need a default value, they can be defined using
519`CArg` to wrap the type and this value as follows.
520
521```tablegen
522def MyOp : Op<"my_op", []> {
523  let arguments = (ins F32Attr:$attr);
524
525  let builders = [
526    OpBuilder<(ins CArg<"float", "0.5f">:$val), [{
527      $_state.addAttribute("attr", $_builder.getF32FloatAttr(val));
528    }]>
529  ];
530}
531```
532
533The generated code will use default value in the declaration, but not in the
534definition, as required by C++.
535
536```c++
537/// Header file.
538class MyOp : /*...*/ {
539  /*...*/
540  static void build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
541                    float val = 0.5f);
542};
543
544/// Source file.
void MyOp::build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
                 float val) {
547  state.addAttribute("attr", builder.getF32FloatAttr(val));
548}
549```
550
551**Deprecated:** `OpBuilder` class allows one to specify the custom builder
552signature as a raw string, without separating parameters into different `dag`
553arguments. It also supports leading parameters of `OpBuilder &` and
554`OperationState &` types, which will be used instead of the autogenerated ones
555if present.
556
557### Custom parser and printer methods
558
559Functions to parse and print the operation's custom assembly form.
560
561### Custom verifier code
562
563Verification code will be automatically generated for
564[constraints](#constraints) specified on various entities of the op. To perform
565_additional_ verification, you can use
566
567```tablegen
568let hasVerifier = 1;
569let hasRegionVerifier = 1;
570```
571
This will generate `LogicalResult verify()`/`LogicalResult verifyRegions()`
method declarations on the op class that can be defined with any additional
verification constraints. For verification that needs to access the nested
operations, you should use `hasRegionVerifier` to ensure that it won't access
any ill-formed operation. All other verification can be implemented with
`hasVerifier`. Check the next section for the execution order of these
verification methods.
579
580#### Verification Ordering
581
The verification of an operation involves several steps:

1. `StructuralOpTrait`s are verified first; they can be run independently.
2. `verifyInvariants`, which is constructed by ODS, verifies the types,
   attributes, etc.
3. Other traits/interfaces that have marked their verifier as `verifyTrait` or
   `verifyWithRegions=0`.
4. The custom verifier, which is defined in the op and has been marked with
   `hasVerifier=1`.

If an operation has regions, then it may have a second phase:

1. Traits/interfaces that have marked their verifier as `verifyRegionTrait` or
   `verifyWithRegions=1`. This implies the verifier needs to access the
   operations in its regions.
2. The custom verifier, which is defined in the op and has been marked with
   `hasRegionVerifier=1`.
598
599Note that the second phase will be run after the operations in the region are
600verified. Verifiers further down the order can rely on certain invariants being
601verified by a previous verifier and do not need to re-verify them.
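
The ordering described above can be sketched as a small recursive walk. This
is a toy model that only records the order of the steps: it does not stop at
the first failure, it treats "has nested operations" as a stand-in for "has
regions", and `ToyOp` and `verifyInOrder` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy operation with nested operations standing in for region contents.
struct ToyOp {
  std::string name;
  std::vector<ToyOp> nested;
};

// Appends the verification steps for `op` to `log`: the first phase, then the
// nested operations, then the region phase.
inline void verifyInOrder(const ToyOp &op, std::vector<std::string> &log) {
  log.push_back(op.name + ": structural traits");
  log.push_back(op.name + ": verifyInvariants");
  log.push_back(op.name + ": other traits/interfaces");
  log.push_back(op.name + ": custom verify()");

  // Nested operations are fully verified before the second phase runs.
  for (const ToyOp &child : op.nested)
    verifyInOrder(child, log);

  // In this sketch the second phase only runs when there are nested ops.
  if (!op.nested.empty()) {
    log.push_back(op.name + ": region traits/interfaces");
    log.push_back(op.name + ": custom verifyRegions()");
  }
}
```

For an `outer` op containing a single `inner` op, all four first-phase steps of
`inner` appear in the log before the `verifyRegions()` step of `outer`, which
is why a region verifier can assume its nested operations are well-formed.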
602
603#### Emitting diagnostics in custom verifiers
604
605Custom verifiers should avoid printing operations using custom operation
606printers, because they require the printed operation (and sometimes its parent
607operation) to be verified first. In particular, when emitting diagnostics,
608custom verifiers should use the `Error` severity level, which prints operations
609in generic form by default, and avoid using lower severity levels (`Note`,
610`Remark`, `Warning`).
611
612### Declarative Assembly Format
613
The custom assembly form of the operation may be specified in a declarative
string that matches the operation's operands, attributes, etc., with the
ability to express additional information that needs to be parsed to build the
operation:
618
619```tablegen
620def CallOp : Std_Op<"call", ...> {
621  let arguments = (ins FlatSymbolRefAttr:$callee, Variadic<AnyType>:$args);
622  let results = (outs Variadic<AnyType>);
623
624  let assemblyFormat = [{
625    $callee `(` $args `)` attr-dict `:` functional-type($args, results)
626  }];
627}
628```
629
The format is composed of three components:
631
632#### Directives
633
634A directive is a type of builtin function, with an optional set of arguments.
635The available directives are as follows:
636
637*   `attr-dict`
638
639    -   Represents the attribute dictionary of the operation.
640
641*   `attr-dict-with-keyword`
642
643    -   Represents the attribute dictionary of the operation, but prefixes the
644        dictionary with an `attributes` keyword.
645
646*   `custom` < UserDirective > ( Params )
647
648    -   Represents a custom directive implemented by the user in C++.
649    -   See the [Custom Directives](#custom-directives) section below for more
650        details.
651
652*   `functional-type` ( inputs , results )
653
654    -   Formats the `inputs` and `results` arguments as a
655        [function type](Dialects/Builtin.md/#functiontype).
656    -   The constraints on `inputs` and `results` are the same as the `input` of
657        the `type` directive.
658
659*   `oilist` ( \`keyword\` elements | \`otherKeyword\` elements ...)
660
661    -   Represents an optional order-independent list of clauses. Each clause
662        has a keyword and corresponding assembly format.
663    -   Each clause can appear 0 or 1 time (in any order).
664    -   Only literals, types and variables can be used within an oilist element.
665    -   All the variables must be optional or variadic.
666
667*   `operands`
668
669    -   Represents all of the operands of an operation.
670
671*   `ref` ( input )
672
    -   Represents a reference to a variable or directive that must have
        already been resolved, and that may be used as a parameter to a
        `custom` directive.
676    -   Used to pass previously parsed entities to custom directives.
677    -   The input may be any directive or variable, aside from `functional-type`
678        and `custom`.
679
680*   `regions`
681
682    -   Represents all of the regions of an operation.
683
684*   `results`
685
686    -   Represents all of the results of an operation.
687
688*   `successors`
689
690    -   Represents all of the successors of an operation.
691
692*   `type` ( input )
693
694    -   Represents the type of the given input.
695    -   `input` must be either an operand or result [variable](#variables), the
696        `operands` directive, or the `results` directive.
697
698*   `qualified` ( type_or_attribute )
699
700    -   Wraps a `type` directive or an attribute parameter.
    -   Used to force printing the type or attribute prefixed with its dialect
        and mnemonic. For example, the `vector.multi_reduction` operation has a
        `kind` attribute; by default the declarative assembly will print
        `vector.multi_reduction <minf>, ...`, but using `qualified($kind)` in
        the declarative assembly format will print it instead as
        `vector.multi_reduction #vector.kind<minf>, ...`.
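
The "each clause can appear 0 or 1 time (in any order)" rule of `oilist` can be
sketched with a small standalone parser. This toy works on whitespace-separated
`keyword value` pairs rather than on the real assembly format, and
`parseToyOilist` is a hypothetical helper, not generated code.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Parses "keyword value" clauses in any order. Each allowed keyword may appear
// at most once. Returns false on an unknown or repeated keyword, or when a
// keyword has no value. Parsed clauses are stored in `result`.
inline bool parseToyOilist(const std::string &text,
                           const std::set<std::string> &allowedKeywords,
                           std::map<std::string, std::string> &result) {
  std::istringstream input(text);
  std::string keyword, value;
  while (input >> keyword) {
    if (allowedKeywords.count(keyword) == 0)
      return false; // unknown keyword
    if (result.count(keyword) != 0)
      return false; // clause repeated
    if (!(input >> value))
      return false; // missing clause body
    result[keyword] = value;
  }
  return true;
}
```

With the allowed keywords `shared` and `private`, the inputs
`private x shared y` and the empty string are both accepted, while
`shared x shared y` is rejected because a clause is repeated.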
707
708#### Literals
709
710A literal is either a keyword or punctuation surrounded by \`\`.
711
712The following are the set of valid punctuation:
713
714`:`, `,`, `=`, `<`, `>`, `(`, `)`, `{`, `}`, `[`, `]`, `->`, `?`, `+`, `*`
715
716The following are valid whitespace punctuation:
717
718`\n`, ` `
719
The `\n` literal emits a newline and indents to the start of the operation. An
example is shown below:
722
723```tablegen
724let assemblyFormat = [{
725  `{` `\n` ` ` ` ` `this_is_on_a_newline` `\n` `}` attr-dict
726}];
727```
728
729```mlir
730%results = my.operation {
731  this_is_on_a_newline
732}
733```
734
An empty literal \`\` may be used to remove a space that is inserted implicitly
after certain literal elements, such as `)`/`]`/etc. For example, "`]`" may
result in an output of `]` followed by a space if it is not the last element in
the format. "`]` \`\`" would trim the trailing space in this situation.
739
740#### Variables
741
A variable is an entity that has been registered on the operation itself, i.e.,
an argument (attribute or operand), region, result, successor, etc. In the
`CallOp` example above, the variables would be `$callee` and `$args`.
745
746Attribute variables are printed with their respective value type, unless that
747value type is buildable. In those cases, the type of the attribute is elided.
748
749#### Custom Directives
750
751The declarative assembly format specification allows for handling a large
752majority of the common cases when formatting an operation. For the operations
753that require or desire specifying parts of the operation in a form not supported
754by the declarative syntax, custom directives may be specified. A custom
755directive essentially allows for users to use C++ for printing and parsing
756subsections of an otherwise declaratively specified format. Looking at the
757specification of a custom directive above:
758
759```
760custom-directive ::= `custom` `<` UserDirective `>` `(` Params `)`
761```
762
763A custom directive has two main parts: The `UserDirective` and the `Params`. A
764custom directive is transformed into a call to a `print*` and a `parse*` method
765when generating the C++ code for the format. The `UserDirective` is an
766identifier used as a suffix to these two calls, i.e., `custom<MyDirective>(...)`
767would result in calls to `parseMyDirective` and `printMyDirective` within the
768parser and printer respectively. `Params` may be any combination of variables
769(i.e. Attribute, Operand, Successor, etc.), type directives, and `attr-dict`.
770The type directives must refer to a variable, but that variable need not also be
771a parameter to the custom directive.
772
The arguments to the `parse<UserDirective>` method are firstly a reference to
the `OpAsmParser` (`OpAsmParser &`), and secondly a set of output parameters
775corresponding to the parameters specified in the format. The mapping of
776declarative parameter to `parse` method argument is detailed below:
777
778*   Attribute Variables
779    -   Single: `<Attribute-Storage-Type>(e.g. Attribute) &`
780    -   Optional: `<Attribute-Storage-Type>(e.g. Attribute) &`
781*   Operand Variables
782    -   Single: `OpAsmParser::UnresolvedOperand &`
783    -   Optional: `Optional<OpAsmParser::UnresolvedOperand> &`
784    -   Variadic: `SmallVectorImpl<OpAsmParser::UnresolvedOperand> &`
785    -   VariadicOfVariadic:
786        `SmallVectorImpl<SmallVector<OpAsmParser::UnresolvedOperand>> &`
787*   Ref Directives
788    -   A reference directive is passed to the parser using the same mapping as
789        the input operand. For example, a single region would be passed as a
790        `Region &`.
791*   Region Variables
792    -   Single: `Region &`
793    -   Variadic: `SmallVectorImpl<std::unique_ptr<Region>> &`
794*   Successor Variables
795    -   Single: `Block *&`
796    -   Variadic: `SmallVectorImpl<Block *> &`
797*   Type Directives
798    -   Single: `Type &`
799    -   Optional: `Type &`
800    -   Variadic: `SmallVectorImpl<Type> &`
801    -   VariadicOfVariadic: `SmallVectorImpl<SmallVector<Type>> &`
802*   `attr-dict` Directive: `NamedAttrList &`
803
804When a variable is optional, the value should only be specified if the variable
805is present. Otherwise, the value should remain `None` or null.
806
The arguments to the `print<UserDirective>` method are firstly a reference to
the `OpAsmPrinter` (`OpAsmPrinter &`), secondly the op (e.g. `FooOp op`, which
can alternatively be `Operation *op`), and finally a set of parameters
corresponding to the parameters specified in the format. The mapping of
declarative parameter to `print` method argument is detailed below:
812
813*   Attribute Variables
814    -   Single: `<Attribute-Storage-Type>(e.g. Attribute)`
815    -   Optional: `<Attribute-Storage-Type>(e.g. Attribute)`
816*   Operand Variables
817    -   Single: `Value`
818    -   Optional: `Value`
819    -   Variadic: `OperandRange`
820    -   VariadicOfVariadic: `OperandRangeRange`
821*   Ref Directives
822    -   A reference directive is passed to the printer using the same mapping as
823        the input operand. For example, a single region would be passed as a
824        `Region &`.
825*   Region Variables
826    -   Single: `Region &`
827    -   Variadic: `MutableArrayRef<Region>`
828*   Successor Variables
829    -   Single: `Block *`
830    -   Variadic: `SuccessorRange`
831*   Type Directives
832    -   Single: `Type`
833    -   Optional: `Type`
834    -   Variadic: `TypeRange`
835    -   VariadicOfVariadic: `TypeRangeRange`
836*   `attr-dict` Directive: `DictionaryAttr`
837
838When a variable is optional, the provided value may be null.
839
840#### Optional Groups
841
842In certain situations operations may have "optional" information, e.g.
843attributes or an empty set of variadic operands. In these situations a section
844of the assembly format can be marked as `optional` based on the presence of this
845information. An optional group is defined as follows:
846
847```
848optional-group: `(` elements `)` (`:` `(` else-elements `)`)? `?`
849```
850
851The `elements` of an optional group have the following requirements:
852
*   The first element of the group must be either an attribute, literal,
    operand, or region.
855    -   This is because the first element must be optionally parsable.
856*   Exactly one argument variable or type directive within the group must be
857    marked as the anchor of the group.
858    -   The anchor is the element whose presence controls whether the group
859        should be printed/parsed.
860    -   An element is marked as the anchor by adding a trailing `^`.
861    -   The first element is *not* required to be the anchor of the group.
    -   When a non-variadic region anchors a group, the printer decides whether
        to emit the group by checking whether the region is empty.
864*   Literals, variables, custom directives, and type directives are the only
865    valid elements within the group.
866    -   Any attribute variable may be used, but only optional attributes can be
867        marked as the anchor.
    -   Only variadic or optional results and operand arguments can be used.
869    -   All region variables can be used. When a non-variable length region is
870        used, if the group is not present the region is empty.
871
872An example of an operation with an optional group is `func.return`, which has a
873variadic number of operands.
874
875```tablegen
876def ReturnOp : ... {
877  let arguments = (ins Variadic<AnyType>:$operands);
878
  // We only print the operands and types if there is a non-zero number
  // of operands.
881  let assemblyFormat = "attr-dict ($operands^ `:` type($operands))?";
882}
883```
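
To make the role of the anchor concrete, here is a minimal standalone sketch
that models the printing behavior of the `ReturnOp` format above. It is not the
code that `mlir-tblgen` generates: `printReturnLike` and `joinWithComma` are
hypothetical helpers, operands and types are plain strings standing in for MLIR
values and types, and the exact spacing is an approximation.

```c++
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Joins `items` with ", ", mirroring how a variadic list is printed.
static std::string joinWithComma(const std::vector<std::string> &items) {
  std::string out;
  for (std::size_t i = 0; i < items.size(); ++i) {
    if (i != 0)
      out += ", ";
    out += items[i];
  }
  return out;
}

// Hypothetical model of printing
//   attr-dict ($operands^ `:` type($operands))?
// for an op without extra attributes. The whole group is emitted only when
// the anchor `$operands` is non-empty.
std::string printReturnLike(const std::vector<std::string> &operands,
                            const std::vector<std::string> &types) {
  assert(operands.size() == types.size() && "expected one type per operand");
  std::string out = "func.return";
  if (!operands.empty()) // The anchor check.
    out += " " + joinWithComma(operands) + " : " + joinWithComma(types);
  return out;
}
```

With no operands the model returns `func.return`; with operands `%a, %b` of
types `i32, i64` it returns `func.return %a, %b : i32, i64`. The `:` literal and
the types never appear on their own because they sit inside the group.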
884
885##### Unit Attributes
886
887In MLIR, the [`unit` Attribute](Dialects/Builtin.md/#unitattr) is special in that it
888only has one possible value, i.e. it derives meaning from its existence. When a
889unit attribute is used to anchor an optional group and is not the first element
890of the group, the presence of the unit attribute can be directly correlated with
891the presence of the optional group itself. As such, in these situations the unit
892attribute will not be printed or present in the output and will be automatically
893inferred when parsing by the presence of the optional group itself.
894
895For example, the following operation:
896
897```tablegen
898def FooOp : ... {
899  let arguments = (ins UnitAttr:$is_read_only);
900
901  let assemblyFormat = "attr-dict (`is_read_only` $is_read_only^)?";
902}
903```
904
905would be formatted as such:
906
907```mlir
908// When the unit attribute is present:
909foo.op is_read_only
910
911// When the unit attribute is not present:
912foo.op
913```
914
915##### Optional "else" Group
916
917Optional groups also have support for an "else" group of elements. These are
918elements that are parsed/printed if the `anchor` element of the optional group
919is *not* present. Unlike the main element group, the "else" group has no
920restriction on the first element and none of the elements may act as the
921`anchor` for the optional. An example is shown below:
922
923```tablegen
924def FooOp : ... {
925  let arguments = (ins UnitAttr:$foo);
926
927  let assemblyFormat = "attr-dict (`foo_is_present` $foo^):(`foo_is_absent`)?";
928}
929```
930
931would be formatted as such:
932
933```mlir
934// When the `foo` attribute is present:
935foo.op foo_is_present
936
937// When the `foo` attribute is not present:
938foo.op foo_is_absent
939```
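
The behavior of the "else" group can be modeled with a small standalone sketch.
This is an illustration under assumptions, not generated code: `printFooOp`,
`parseFooOp`, and `roundTrips` are hypothetical names, and the op is reduced to
a single boolean recording whether the unit attribute `$foo` is present.

```c++
#include <string>

// Hypothetical printer model: the keyword depends on whether the anchor
// `$foo` is present. The unit attribute itself is never printed.
std::string printFooOp(bool hasFoo) {
  return std::string("foo.op ") +
         (hasFoo ? "foo_is_present" : "foo_is_absent");
}

// Hypothetical parser model: the presence of `$foo` is inferred from which
// keyword appears. Returns false if the text matches neither branch.
bool parseFooOp(const std::string &text, bool &hasFoo) {
  if (text == "foo.op foo_is_present") {
    hasFoo = true;
    return true;
  }
  if (text == "foo.op foo_is_absent") {
    hasFoo = false;
    return true;
  }
  return false;
}

// Checks that printing and then parsing recovers the original state.
bool roundTrips(bool hasFoo) {
  bool parsed = !hasFoo;
  return parseFooOp(printFooOp(hasFoo), parsed) && parsed == hasFoo;
}
```

In this model exactly one of the two keywords is always printed, which is what
allows the parser to recover the attribute without ever seeing it spelled out.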
940
941#### Requirements
942
943The format specification has a certain set of requirements that must be adhered
944to:
945
9461.  The output and operation name are never shown as they are fixed and cannot
947    be altered.
9481.  All operands within the operation must appear within the format, either
949    individually or with the `operands` directive.
9501.  All regions within the operation must appear within the format, either
951    individually or with the `regions` directive.
9521.  All successors within the operation must appear within the format, either
953    individually or with the `successors` directive.
9541.  All operand and result types must appear within the format using the various
955    `type` directives, either individually or with the `operands` or `results`
956    directives.
9571.  The `attr-dict` directive must always be present.
1.  Must not contain overlapping information; e.g. multiple instances of
    `attr-dict`, types, operands, etc.
960    -   Note that `attr-dict` does not overlap with individual attributes. These
961        attributes will simply be elided when printing the attribute dictionary.
962
963##### Type Inference
964
965One requirement of the format is that the types of operands and results must
966always be present. In certain instances, the type of a variable may be deduced
967via type constraints or other information available. In these cases, the type of
968that variable may be elided from the format.
969
970*   Buildable Types
971
972Some type constraints may only have one representation, allowing for them to be
973directly buildable; for example the `I32` or `Index` types. Types in `ODS` may
974mark themselves as buildable by setting the `builderCall` field or inheriting
975from the `BuildableType` class.
976
977*   Trait Equality Constraints
978
979There are many operations that have known type equality constraints registered
980as traits on the operation; for example the true, false, and result values of a
`select` operation often have the same type. The assembly format may inspect
these equality constraints to discern the types of missing variables. The
currently supported traits are: `AllTypesMatch`, `TypesMatchWith`,
`SameTypeOperands`, and `SameOperandsAndResultType`.
985
986*   InferTypeOpInterface
987
988Operations that implement `InferTypeOpInterface` can omit their result types in
989their assembly format since the result types can be inferred from the operands.
990
991### `hasCanonicalizer`
992
This boolean field indicates whether canonicalization patterns have been
defined for this operation. If it is `1`, then
`::getCanonicalizationPatterns()` should be defined.
996
997### `hasCanonicalizeMethod`
998
999When this boolean field is set to `true`, it indicates that the op implements a
1000`canonicalize` method for simple "matchAndRewrite" style canonicalization
patterns. If `hasCanonicalizer` is 0, then an implementation of
`::getCanonicalizationPatterns()` is generated that calls this method.
1003
1004### `hasFolder`
1005
This boolean field indicates whether general folding rules have been defined
for this operation. If it is `1`, then `::fold()` should be defined.
1008
1009### Extra declarations
1010
One of the goals of table-driven op definition is to auto-generate as much of
the logic and as many of the methods needed for each op as possible. With that
said, there will always be long-tail cases that won't be covered. For such
cases, you can use `extraClassDeclaration`. Code in `extraClassDeclaration` will
be copied literally to the generated C++ op class.
1016
1017Note that `extraClassDeclaration` is a mechanism intended for long-tail cases by
1018power users; for not-yet-implemented widely-applicable cases, improving the
1019infrastructure is preferable.
1020
1021### Extra definitions
1022
1023When defining base op classes in TableGen that are inherited many times by
1024different ops, users may want to provide common definitions of utility and
1025interface functions. However, many of these definitions may not be desirable or
possible in `extraClassDeclaration`, which appends them to the op's C++ class
1027declaration. In these cases, users can add an `extraClassDefinition` to define
1028code that is added to the generated source file inside the op's C++ namespace.
1029The substitution `$cppClass` is replaced by the op's C++ class name.
1030
1031### Generated C++ code
1032
1033[OpDefinitionsGen][OpDefinitionsGen] processes the op definition spec file and
1034generates two files containing the corresponding C++ code: one for declarations,
1035the other for definitions. The former is generated via the `-gen-op-decls`
1036command-line option, while the latter is via the `-gen-op-defs` option.
1037
1038The definition file contains all the op method definitions, which can be
1039included and enabled by defining `GET_OP_CLASSES`. For each operation,
1040OpDefinitionsGen generates an operation class and an
1041[operand adaptor](#operand-adaptors) class. Besides, it also contains a
1042comma-separated list of all defined ops, which can be included and enabled by
1043defining `GET_OP_LIST`.
1044
1045#### Class name and namespaces
1046
For each operation, its generated C++ class name is the symbol `def`ed in
TableGen with the dialect prefix removed. The first `_` serves as the delimiter.
For example, for `def TF_AddOp`, the C++ class name would be `AddOp`. We remove
the `TF` prefix because it is for scoping ops; other dialects may as well define
their own `AddOp`s.
1052
1053The namespaces of the generated C++ class will come from the dialect's
1054`cppNamespace` field. For example, if a dialect's `cppNamespace` is `A::B`, then
1055an op of that dialect will be placed in `namespace A { namespace B { ... } }`.
1056If a dialect does not specify a `cppNamespace`, we then use the dialect's name
1057as the namespace.
1058
1059This means the qualified name of the generated C++ class does not necessarily
1060match exactly with the operation name as explained in
1061[Operation name](#operation-name). This is to allow flexible naming to satisfy
1062coding style requirements.
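
The naming rules above can be sketched as two small string helpers. This is
only an illustration, not the logic inside `mlir-tblgen`: `cppClassName` and
`splitNamespaces` are hypothetical names, and returning a name that contains no
`_` unchanged is an assumption.

```c++
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Drops the dialect prefix: everything up to and including the first `_`.
// Assumption: a name without `_` is returned unchanged.
std::string cppClassName(const std::string &defName) {
  assert(!defName.empty() && "def name must not be empty");
  std::size_t pos = defName.find('_');
  if (pos == std::string::npos)
    return defName;
  return defName.substr(pos + 1);
}

// Splits a `cppNamespace` value such as "A::B" into the nested namespace
// names, outermost first. Empty components are skipped.
std::vector<std::string> splitNamespaces(const std::string &cppNamespace) {
  std::vector<std::string> parts;
  std::size_t start = 0;
  while (start <= cppNamespace.size()) {
    std::size_t end = cppNamespace.find("::", start);
    if (end == std::string::npos)
      end = cppNamespace.size();
    if (end > start)
      parts.push_back(cppNamespace.substr(start, end - start));
    start = end + 2;
  }
  return parts;
}
```

For `def TF_AddOp` in a dialect whose `cppNamespace` is `A::B`, the helpers give
the class name `AddOp` and the namespaces `A` and `B`, i.e. the qualified name
`A::B::AddOp`.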
1063
1064#### Operand adaptors
1065
1066For each operation, we automatically generate an _operand adaptor_. This class
1067solves the problem of accessing operands provided as a list of `Value`s without
1068using "magic" constants. The operand adaptor takes a reference to an array of
1069`Value` and provides methods with the same names as those in the operation class
1070to access them. For example, for a binary arithmetic operation, it may provide
1071`.lhs()` to access the first operand and `.rhs()` to access the second operand.
1072
1073The operand adaptor class lives in the same namespace as the operation class,
1074and has the name of the operation followed by `Adaptor` as well as an alias
1075`Adaptor` inside the op class.
1076
1077Operand adaptors can be used in function templates that also process operations:
1078
1079```c++
1080template <typename BinaryOpTy>
1081std::pair<Value, Value> zip(BinaryOpTy &&op) {
  return std::make_pair(op.lhs(), op.rhs());
1083}
1084
1085void process(AddOp op, ArrayRef<Value> newOperands) {
1086  zip(op);
  zip(AddOp::Adaptor(newOperands));
1088  /*...*/
1089}
1090```
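
The following standalone sketch shows why a single template can serve both the
op and its adaptor. Everything in it is a mock-up under assumptions: `Value` is
an `int` stand-in for `mlir::Value`, `AddOp` and `AddOpAdaptor` are hand-written
rather than generated, and, unlike a real adaptor, the mock copies its operand
list instead of referring to it.

```c++
#include <cassert>
#include <utility>
#include <vector>

// Stand-in for `mlir::Value`, used only to keep the sketch standalone.
using Value = int;

// Hypothetical hand-written equivalent of a generated adaptor: it gives
// names to the positions in an operand list instead of exposing indices.
class AddOpAdaptor {
public:
  explicit AddOpAdaptor(std::vector<Value> values)
      : operands(std::move(values)) {
    assert(operands.size() == 2 && "a binary op expects two operands");
  }
  Value lhs() const { return operands[0]; }
  Value rhs() const { return operands[1]; }

private:
  std::vector<Value> operands;
};

// Hypothetical op class exposing the same accessor names and the alias.
class AddOp {
public:
  using Adaptor = AddOpAdaptor;
  AddOp(Value lhs, Value rhs) : operandList{lhs, rhs} {}
  Value lhs() const { return operandList[0]; }
  Value rhs() const { return operandList[1]; }

private:
  std::vector<Value> operandList;
};

// Works on both the op and its adaptor because the accessor names match.
template <typename BinaryOpTy>
std::pair<Value, Value> zip(const BinaryOpTy &op) {
  return std::make_pair(op.lhs(), op.rhs());
}
```

Calling `zip(AddOp(1, 2))` and `zip(AddOp::Adaptor({3, 4}))` both compile
against the same template, which is the property that lets code be written once
for an op and for a list of replacement operands.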
1091
1092## Constraints
1093
1094Constraint is a core concept in table-driven operation definition: operation
1095verification and graph operation matching are all based on satisfying
1096constraints. So both the operation definition and rewrite rules specification
1097significantly involve writing constraints. We have the `Constraint` class in
1098[`OpBase.td`][OpBase] as the common base class for all constraints.
1099
An operation's constraint can cover different ranges; it may

*   Concern only a single attribute (e.g. being a 32-bit integer greater than
    5),
*   Concern multiple operands and results (e.g., the 1st result's shape must be
    the same as the 1st operand), or
*   Be intrinsic to the operation itself (e.g., having no side effect).

We call them single-entity constraints, multi-entity constraints, and traits,
respectively.
1110
1111### Single-entity constraint
1112
1113Constraints scoped to a single operand, attribute, or result are specified at
1114the entity's declaration place as described in
1115[Operation arguments](#operation-arguments) and
1116[Operation results](#operation-results).
1117
To help model constraints of common types, a set of `TypeConstraint`s is
provided; they form the `Type` subclass hierarchy. It includes `F32` for the
constraint of being a float, `TensorOf<[F32]>` for the constraint of being a
float tensor, and so on.

Similarly, a set of `AttrConstraint`s is provided to help model constraints of
common attribute kinds. They form the `Attr` subclass hierarchy. It includes
`F32Attr` for the constraint of being a float attribute, `F32ArrayAttr` for the
constraint of being a float array attribute, and so on.
1127
1128### Multi-entity constraint
1129
1130Constraints involving more than one operand/attribute/result are quite common on
1131operations, like the element type and shape relation between operands and
1132results. These constraints should be specified as the `Op` class template
1133parameter as described in
1134[Operation traits and constraints](#operation-traits-and-constraints).
1135
Multi-entity constraints are modeled as `PredOpTrait` (a subclass of `OpTrait`)
in [`OpBase.td`][OpBase]. A number of constraint primitives are provided to help
specification. See [`OpBase.td`][OpBase] for the complete list.
1139
1140### Trait
1141
Traits are intrinsic properties of the operation, like having side effects or
not, being commutative or not, whether it is a terminator, etc. These
constraints should be
1144specified as the `Op` class template parameter as described in
1145[Operation traits and constraints](#operation-traits-and-constraints).
1146
1147Traits are modeled as `NativeOpTrait` (a subclass of `OpTrait`) in
[`OpBase.td`][OpBase]. They are backed by, and will be translated into, the
corresponding C++ `mlir::OpTrait` classes.
1150
1151### How to specify new constraint
1152
1153To write a constraint, you need to provide its predicates and give it a
1154descriptive name. Predicates, modeled with the `Pred` class, are the workhorse
1155for composing constraints. The predicate for a constraint is typically built up
1156in a nested manner, using the two categories of predicates:
1157
11581.  `CPred`: the primitive leaf predicate.
11592.  Compound predicate: a predicate composed from child predicates using
1160    predicate combiners (conjunction: `And`, disjunction: `Or`, negation: `Neg`,
1161    substitution: `SubstLeaves`, concatenation: `Concat`).
1162
1163`CPred` is the basis for composing more complex predicates. It is the "atom"
1164predicate from the perspective of TableGen and the "interface" between TableGen
1165and C++. What is inside is already C++ code, which will be treated as opaque
1166strings with special placeholders to be substituted.
1167
1168You can put any C++ code that returns a boolean value inside a `CPred`,
1169including evaluating expressions, calling functions, calling class methods, and
1170so on.
1171
1172To help interaction with the C++ environment, there are a few special
1173placeholders provided to refer to entities in the context where this predicate
1174is used. They serve as "hooks" to the enclosing environment. This includes
1175`$_builder`, `$_op`, and `$_self`:
1176
1177*   `$_builder` will be replaced by a `mlir::Builder` instance so that you can
1178    access common build methods.
1179*   `$_op` will be replaced by the current operation so that you can access
1180    information of the current operation.
*   `$_self` will be replaced with the entity this predicate is attached to.
    E.g., `BoolAttr` is an attribute constraint that wraps a
    `CPred<"$_self.isa<BoolAttr>()">`. Then for `BoolAttr:$attr`, `$_self` will
    be replaced by `$attr`. Type constraints are a little bit special: since we
    want the constraints on each type definition to read naturally and we want
    to attach type constraints directly to an operand/result, `$_self` will be
    replaced by the operand/result's type. E.g., for `F32` in `F32:$operand`,
    its `$_self` will be expanded as `operand(...).getType()`.
1189
1190TODO: Reconsider the leading symbol for special placeholders. Eventually we want
1191to allow referencing operand/result `$-name`s; such `$-name`s can start with
1192underscore.
1193
For example, to write the constraint that an attribute `attr` is an
`IntegerAttr`, in C++ you can just call `attr.isa<IntegerAttr>()`. The code can
be wrapped in a `CPred` as `$_self.isa<IntegerAttr>()`, with `$_self` as the
special placeholder to be replaced by the current attribute `attr` at expansion
time.
1198
For more complicated predicates, you can wrap them in a single `CPred`, or you
can use predicate combiners to combine them. For example, to write the
constraint that an attribute `attr` is a 32-bit or 64-bit integer, you can write
it as
1202
1203```tablegen
1204And<[
1205  CPred<"$_self.isa<IntegerAttr>()">,
1206  Or<[
1207    CPred<"$_self.cast<IntegerAttr>().getType().isInteger(32)">,
1208    CPred<"$_self.cast<IntegerAttr>().getType().isInteger(64)">
1209  ]>
1210]>
1211```
1212
1213(Note that the above is just to show with a familiar example how you can use
1214`CPred` and predicate combiners to write complicated predicates. For integer
1215attributes specifically, [`OpBase.td`][OpBase] already defines `I32Attr` and
1216`I64Attr`. So you can actually reuse them to write it as `Or<[I32Attr.predicate,
1217I64Attr.predicate]>`.)
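
Since predicates are opaque C++ strings with placeholders, composing them is
essentially string building. The standalone sketch below models that idea for
the `And`/`Or` example above. It is an illustration under assumptions, not the
implementation in `mlir-tblgen`: `substituteSelf`, `combine`, and
`expandI32OrI64` are hypothetical names, and the exact parenthesization of real
generated code may differ.

```c++
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Replaces every occurrence of `$_self` in a predicate template.
std::string substituteSelf(std::string tmpl, const std::string &self) {
  const std::string placeholder = "$_self";
  std::size_t pos = 0;
  while ((pos = tmpl.find(placeholder, pos)) != std::string::npos) {
    tmpl.replace(pos, placeholder.size(), self);
    pos += self.size(); // Skip past the text that was just inserted.
  }
  return tmpl;
}

// Joins already-expanded child predicates with a boolean operator,
// parenthesizing each child so that precedence is preserved.
std::string combine(const std::vector<std::string> &children,
                    const std::string &op) {
  assert(!children.empty() && "a combiner needs at least one child");
  std::string out;
  for (std::size_t i = 0; i < children.size(); ++i) {
    if (i != 0)
      out += " " + op + " ";
    out += "(" + children[i] + ")";
  }
  return out;
}

// Hypothetical expansion of the `And<[..., Or<[...]>]>` constraint for an
// attribute expression `self`.
std::string expandI32OrI64(const std::string &self) {
  std::string isInt = substituteSelf("$_self.isa<IntegerAttr>()", self);
  std::string is32 = substituteSelf(
      "$_self.cast<IntegerAttr>().getType().isInteger(32)", self);
  std::string is64 = substituteSelf(
      "$_self.cast<IntegerAttr>().getType().isInteger(64)", self);
  return combine({isInt, combine({is32, is64}, "||")}, "&&");
}
```

In this model `And` maps to `&&` and `Or` maps to `||`, so the nested TableGen
structure is mirrored by nested parentheses in the resulting C++ expression.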
1218
1219TODO: Build up a library of reusable primitive constraints
1220
1221If the predicate is very complex to write with `CPred` together with predicate
1222combiners, you can also write it as a normal C++ function and use the `CPred` as
1223a way to "invoke" the function. For example, to verify an attribute `attr` has
1224some property, you can write a C++ function like
1225
1226```cpp
1227bool HasSomeProperty(Attribute attr) { ... }
1228```
1229
1230and then define the op as:
1231
1232```tablegen
1233def HasSomeProperty : AttrConstraint<CPred<"HasSomeProperty($_self)">,
1234                                     "has some property">;
1235
1236def MyOp : Op<...> {
1237  let arguments = (ins
1238    ...
1239    HasSomeProperty:$attr
1240  );
1241}
1242```
1243
1244As to whether we should define the predicate using a single `CPred` wrapping the
1245whole expression, multiple `CPred`s with predicate combiners, or a single
1246`CPred` "invoking" a function, there are no clear-cut criteria. Defining using
1247`CPred` and predicate combiners is preferable since it exposes more information
(instead of hiding all the logic behind a C++ function) into the op definition
spec so that it can potentially drive more auto-generation cases. But it will
require a nice library of common predicates as the building blocks to avoid the
duplication, which is being worked on right now.
1252
1253## Attribute Definition
1254
1255An attribute is a compile-time known constant of an operation.
1256
1257ODS provides attribute wrappers over C++ attribute classes. There are a few
1258common C++ [attribute classes][AttrClasses] defined in MLIR's core IR library
1259and one is free to define dialect-specific attribute classes. ODS allows one to
1260use these attributes in TableGen to define operations, potentially with more
1261fine-grained constraints. For example, `StrAttr` directly maps to `StringAttr`;
1262`F32Attr`/`F64Attr` requires the `FloatAttr` to additionally be of a certain
1263bitwidth.
1264
1265ODS attributes are defined as having a storage type (corresponding to a backing
1266`mlir::Attribute` that _stores_ the attribute), a return type (corresponding to
1267the C++ _return_ type of the generated helper getters) as well as a method
1268to convert between the internal storage and the helper method.
1269
1270### Attribute decorators
1271
1272There are a few important attribute adapters/decorators/modifiers that can be
1273applied to ODS attributes to specify common additional properties like
1274optionality, default values, etc.:
1275
1276*   `DefaultValuedAttr`: specifies the
1277    [default value](#attributes-with-default-values) for an attribute.
1278*   `OptionalAttr`: specifies an attribute as [optional](#optional-attributes).
1279*   `Confined`: adapts an attribute with
1280    [further constraints](#confining-attributes).
1281
1282### Enum attributes
1283
1284Some attributes can only take values from a predefined enum, e.g., the
1285comparison kind of a comparison op. To define such attributes, ODS provides
two mechanisms: `IntEnumAttr` and `BitEnumAttr`.

*   `IntEnumAttr`: each enum case is an integer, and the attribute is stored as
    an [`IntegerAttr`][IntegerAttr] in the op.
*   `BitEnumAttr`: each enum case is either the empty case, a single bit,
    or a group of single bits, and the attribute is stored as an
    [`IntegerAttr`][IntegerAttr] in the op.
1293
1294All these `*EnumAttr` attributes require fully specifying all of the allowed
1295cases via their corresponding `*EnumAttrCase`. With this, ODS is able to
1296generate additional verification to only accept allowed cases. To facilitate the
1297interaction between `*EnumAttr`s and their C++ consumers, the
1298[`EnumsGen`][EnumsGen] TableGen backend can generate a few common utilities: a
1299C++ enum class, `llvm::DenseMapInfo` for the enum class, conversion functions
1300from/to strings. This is controlled via the `-gen-enum-decls` and
1301`-gen-enum-defs` command-line options of `mlir-tblgen`.
1302
1303For example, given the following `EnumAttr`:
1304
1305```tablegen
1306def Case15: I32EnumAttrCase<"Case15", 15>;
1307def Case20: I32EnumAttrCase<"Case20", 20>;
1308
1309def MyIntEnum: I32EnumAttr<"MyIntEnum", "An example int enum",
1310                           [Case15, Case20]> {
1311  let cppNamespace = "Outer::Inner";
1312  let stringToSymbolFnName = "ConvertToEnum";
1313  let symbolToStringFnName = "ConvertToString";
1314}
1315```
1316
1317The following will be generated via `mlir-tblgen -gen-enum-decls`:
1318
1319```c++
1320namespace Outer {
1321namespace Inner {
1322// An example int enum
1323enum class MyIntEnum : uint32_t {
1324  Case15 = 15,
1325  Case20 = 20,
1326};
1327
1328llvm::Optional<MyIntEnum> symbolizeMyIntEnum(uint32_t);
1329llvm::StringRef ConvertToString(MyIntEnum);
1330llvm::Optional<MyIntEnum> ConvertToEnum(llvm::StringRef);
1331inline constexpr unsigned getMaxEnumValForMyIntEnum() {
1332  return 20;
1333}
1334
1335} // namespace Inner
1336} // namespace Outer
1337
1338namespace llvm {
1339template<> struct DenseMapInfo<Outer::Inner::MyIntEnum> {
1340  using StorageInfo = llvm::DenseMapInfo<uint32_t>;
1341
1342  static inline Outer::Inner::MyIntEnum getEmptyKey() {
1343    return static_cast<Outer::Inner::MyIntEnum>(StorageInfo::getEmptyKey());
1344  }
1345
1346  static inline Outer::Inner::MyIntEnum getTombstoneKey() {
1347    return static_cast<Outer::Inner::MyIntEnum>(StorageInfo::getTombstoneKey());
1348  }
1349
1350  static unsigned getHashValue(const Outer::Inner::MyIntEnum &val) {
1351    return StorageInfo::getHashValue(static_cast<uint32_t>(val));
1352  }
1353
1354  static bool isEqual(const Outer::Inner::MyIntEnum &lhs, const Outer::Inner::MyIntEnum &rhs) {
1355    return lhs == rhs;
1356  }
1357};
1358}
1359```
1360
1361The following will be generated via `mlir-tblgen -gen-enum-defs`:
1362
1363```c++
1364namespace Outer {
1365namespace Inner {
1366llvm::StringRef ConvertToString(MyIntEnum val) {
1367  switch (val) {
1368    case MyIntEnum::Case15: return "Case15";
1369    case MyIntEnum::Case20: return "Case20";
1370  }
1371  return "";
1372}
1373
1374llvm::Optional<MyIntEnum> ConvertToEnum(llvm::StringRef str) {
1375  return llvm::StringSwitch<llvm::Optional<MyIntEnum>>(str)
1376      .Case("Case15", MyIntEnum::Case15)
1377      .Case("Case20", MyIntEnum::Case20)
1378      .Default(llvm::None);
1379}
1380llvm::Optional<MyIntEnum> symbolizeMyIntEnum(uint32_t value) {
1381  switch (value) {
1382  case 15: return MyIntEnum::Case15;
1383  case 20: return MyIntEnum::Case20;
1384  default: return llvm::None;
1385  }
1386}
1387
1388} // namespace Inner
1389} // namespace Outer
1390```
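
As a usage illustration, the sketch below shows how a consumer might combine
the generated utilities. It is standalone and therefore re-declares simplified
stand-ins for them: `std::optional` and `std::string` replace `llvm::Optional`
and `llvm::StringRef`, the namespaces are omitted, and `describeRawValue` is a
hypothetical consumer function, none of which comes from `mlir-tblgen` itself.

```c++
#include <cstdint>
#include <optional>
#include <string>

// Simplified stand-ins mirroring the generated declarations above.
enum class MyIntEnum : uint32_t { Case15 = 15, Case20 = 20 };

std::string ConvertToString(MyIntEnum val) {
  switch (val) {
    case MyIntEnum::Case15: return "Case15";
    case MyIntEnum::Case20: return "Case20";
  }
  return "";
}

std::optional<MyIntEnum> ConvertToEnum(const std::string &str) {
  if (str == "Case15") return MyIntEnum::Case15;
  if (str == "Case20") return MyIntEnum::Case20;
  return std::nullopt;
}

std::optional<MyIntEnum> symbolizeMyIntEnum(uint32_t value) {
  switch (value) {
  case 15: return MyIntEnum::Case15;
  case 20: return MyIntEnum::Case20;
  default: return std::nullopt;
  }
}

// Hypothetical consumer: maps a raw integer to its case name, or to
// "<invalid>" when the value is not one of the declared cases.
std::string describeRawValue(uint32_t raw) {
  std::optional<MyIntEnum> symbol = symbolizeMyIntEnum(raw);
  return symbol ? ConvertToString(*symbol) : "<invalid>";
}
```

The point of the optional return types is that both conversions into the enum
can fail, so callers are expected to check the result before using it.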
1391
1392Similarly for the following `BitEnumAttr` definition:
1393
1394```tablegen
1395def None: BitEnumAttrCaseNone<"None">;
1396def Bit0: BitEnumAttrCaseBit<"Bit0", 0>;
1397def Bit1: BitEnumAttrCaseBit<"Bit1", 1>;
1398def Bit2: BitEnumAttrCaseBit<"Bit2", 2>;
1399def Bit3: BitEnumAttrCaseBit<"Bit3", 3>;
1400
1401def MyBitEnum: BitEnumAttr<"MyBitEnum", "An example bit enum",
1402                           [None, Bit0, Bit1, Bit2, Bit3]>;
1403```
1404
1405We can have:
1406
1407```c++
1408// An example bit enum
1409enum class MyBitEnum : uint32_t {
1410  None = 0,
1411  Bit0 = 1,
1412  Bit1 = 2,
1413  Bit2 = 4,
1414  Bit3 = 8,
1415};
1416
1417llvm::Optional<MyBitEnum> symbolizeMyBitEnum(uint32_t);
1418std::string stringifyMyBitEnum(MyBitEnum);
1419llvm::Optional<MyBitEnum> symbolizeMyBitEnum(llvm::StringRef);
1420inline MyBitEnum operator|(MyBitEnum lhs, MyBitEnum rhs) {
1421  return static_cast<MyBitEnum>(static_cast<uint32_t>(lhs) | static_cast<uint32_t>(rhs));
1422}
1423inline MyBitEnum operator&(MyBitEnum lhs, MyBitEnum rhs) {
1424  return static_cast<MyBitEnum>(static_cast<uint32_t>(lhs) & static_cast<uint32_t>(rhs));
1425}
1426inline bool bitEnumContains(MyBitEnum bits, MyBitEnum bit) {
1427  return (static_cast<uint32_t>(bits) & static_cast<uint32_t>(bit)) != 0;
1428}
1429
1430namespace llvm {
1431template<> struct DenseMapInfo<::MyBitEnum> {
1432  using StorageInfo = llvm::DenseMapInfo<uint32_t>;
1433
1434  static inline ::MyBitEnum getEmptyKey() {
1435    return static_cast<::MyBitEnum>(StorageInfo::getEmptyKey());
1436  }
1437
1438  static inline ::MyBitEnum getTombstoneKey() {
1439    return static_cast<::MyBitEnum>(StorageInfo::getTombstoneKey());
1440  }
1441
1442  static unsigned getHashValue(const ::MyBitEnum &val) {
1443    return StorageInfo::getHashValue(static_cast<uint32_t>(val));
1444  }
1445
1446  static bool isEqual(const ::MyBitEnum &lhs, const ::MyBitEnum &rhs) {
1447    return lhs == rhs;
1448  }
};
} // namespace llvm
1450```
1451
1452```c++
1453std::string stringifyMyBitEnum(MyBitEnum symbol) {
1454  auto val = static_cast<uint32_t>(symbol);
1455  assert(15u == (15u | val) && "invalid bits set in bit enum");
1456  // Special case for all bits unset.
1457  if (val == 0) return "None";
1458  llvm::SmallVector<llvm::StringRef, 2> strs;
1459  if (1u == (1u & val)) { strs.push_back("Bit0"); }
1460  if (2u == (2u & val)) { strs.push_back("Bit1"); }
1461  if (4u == (4u & val)) { strs.push_back("Bit2"); }
1462  if (8u == (8u & val)) { strs.push_back("Bit3"); }
1463
1464  return llvm::join(strs, "|");
1465}
1466
1467llvm::Optional<MyBitEnum> symbolizeMyBitEnum(llvm::StringRef str) {
1468  // Special case for all bits unset.
1469  if (str == "None") return MyBitEnum::None;
1470
1471  llvm::SmallVector<llvm::StringRef, 2> symbols;
1472  str.split(symbols, "|");
1473
1474  uint32_t val = 0;
1475  for (auto symbol : symbols) {
1476    auto bit = llvm::StringSwitch<llvm::Optional<uint32_t>>(symbol)
1477      .Case("Bit0", 1)
1478      .Case("Bit1", 2)
1479      .Case("Bit2", 4)
1480      .Case("Bit3", 8)
1481      .Default(llvm::None);
1482    if (bit) { val |= *bit; } else { return llvm::None; }
1483  }
1484  return static_cast<MyBitEnum>(val);
1485}
1486
1487llvm::Optional<MyBitEnum> symbolizeMyBitEnum(uint32_t value) {
1488  // Special case for all bits unset.
1489  if (value == 0) return MyBitEnum::None;
1490
1491  if (value & ~(1u | 2u | 4u | 8u)) return llvm::None;
1492  return static_cast<MyBitEnum>(value);
1493}
1494```
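
As a usage illustration, the standalone sketch below combines cases with
`operator|`, queries them with `bitEnumContains`, and prints the result. The
declarations are simplified stand-ins for the generated ones (`std::string`
replaces the LLVM string utilities, and the stringifier is rewritten as a loop),
so treat it as a sketch under those assumptions rather than generated output.

```c++
#include <cassert>
#include <cstdint>
#include <string>

// Simplified stand-ins mirroring the generated declarations above.
enum class MyBitEnum : uint32_t {
  None = 0,
  Bit0 = 1,
  Bit1 = 2,
  Bit2 = 4,
  Bit3 = 8,
};

inline MyBitEnum operator|(MyBitEnum lhs, MyBitEnum rhs) {
  return static_cast<MyBitEnum>(static_cast<uint32_t>(lhs) |
                                static_cast<uint32_t>(rhs));
}

inline bool bitEnumContains(MyBitEnum bits, MyBitEnum bit) {
  return (static_cast<uint32_t>(bits) & static_cast<uint32_t>(bit)) != 0;
}

// Joins the names of all set bits with "|", or returns "None" if no bit is
// set. Bit `i` corresponds to the case named `Bit<i>`.
std::string stringifyMyBitEnum(MyBitEnum symbol) {
  auto val = static_cast<uint32_t>(symbol);
  assert((val & ~15u) == 0 && "invalid bits set in bit enum");
  if (val == 0)
    return "None";
  static const char *const names[] = {"Bit0", "Bit1", "Bit2", "Bit3"};
  std::string out;
  for (uint32_t i = 0; i < 4; ++i) {
    if (val & (1u << i)) {
      if (!out.empty())
        out += "|";
      out += names[i];
    }
  }
  return out;
}
```

For instance, `MyBitEnum::Bit0 | MyBitEnum::Bit2` stringifies to `Bit0|Bit2`,
and `bitEnumContains` reports that the combination contains `Bit2` but not
`Bit1`.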

## Debugging Tips

### Run `mlir-tblgen` to see the generated content

TableGen syntax can be obscure; reading the generated content is often a helpful
way to understand and debug issues. To build `mlir-tblgen`, run
`cmake --build . --target mlir-tblgen` in your build directory and find the
`mlir-tblgen` binary in the `bin/` subdirectory. All the supported generators
can be listed via `mlir-tblgen --help`; examples include `--gen-op-decls` and
`--gen-op-defs`, as explained in [Generated C++ code](#generated-c-code).

To see the generated code, invoke `mlir-tblgen` with a specific generator and
provide include paths via `-I`. For example,

```sh
# To see op C++ class declaration
mlir-tblgen --gen-op-decls -I /path/to/mlir/include /path/to/input/td/file
# To see op C++ class definition
mlir-tblgen --gen-op-defs -I /path/to/mlir/include /path/to/input/td/file
# To see op documentation
mlir-tblgen --gen-dialect-doc -I /path/to/mlir/include /path/to/input/td/file

# To see op interface C++ class declaration
mlir-tblgen --gen-op-interface-decls -I /path/to/mlir/include /path/to/input/td/file
# To see op interface C++ class definition
mlir-tblgen --gen-op-interface-defs -I /path/to/mlir/include /path/to/input/td/file
# To see op interface documentation
mlir-tblgen --gen-op-interface-doc -I /path/to/mlir/include /path/to/input/td/file
```

## Appendix

### Reporting deprecation

Classes/defs can be marked as deprecated by using the `Deprecated` helper class,
e.g.,

```tablegen
def OpTraitA : NativeOpTrait<"OpTraitA">, Deprecated<"use `bar` instead">;
```

This marks `OpTraitA` as deprecated, and `mlir-tblgen` can emit a warning
(default) or an error (depending on the `-on-deprecated` flag) to make the
deprecated state known.

### Requirements and existing mechanisms analysis

The op descriptions should be as declarative as possible to allow a wide range
of tools to work with them and query methods generated from them. In particular,
this means specifying traits, constraints, and shape inference information in a
way that is easily analyzable (e.g., avoid opaque calls to C++ functions where
possible).

We considered the approaches of several contemporary systems and focused on
requirements that were desirable:

*   Ops registered using a registry separate from C++ code.
    *   Unknown ops are allowed in MLIR, so ops need not be registered. The
        ability of the compiler to optimize those ops or graphs containing those
        ops is constrained but correct.
    *   The current proposal does not include a runtime op description, but it
        does not preclude such a description; it can be added later.
    *   The op registry is essential for generating C++ classes that make
        manipulating ops, verifying correct construction, etc. in C++ easier by
        providing a typed representation and accessors.
*   The op registry will be defined in
    [TableGen](https://llvm.org/docs/TableGen/index.html) and be used to
    generate C++ classes and utility functions
    (builder/verifier/parser/printer).
    *   TableGen is a modelling specification language used by LLVM's backends
        and fits in well with trait-based modelling. This is an implementation
        decision and there are alternative ways of doing this. But the
        specification language is good for the requirements of modelling the
        traits (as seen from usage in LLVM processor backend modelling) and easy
        to extend, so it is a practical choice. If another good option comes
        up, we will consider it.
*   MLIR allows both defined and undefined ops.
    *   Defined ops should have fixed semantics and could have a corresponding
        reference implementation defined.
    *   Dialects are under full control of the dialect owner and normally live
        with the framework of the dialect.
*   The op's traits (e.g., commutative) are modelled along with the op in the
    registry.
*   The op's operand/return type constraints are modelled along with the op in
    the registry (see the [Shape inference](ShapeInference.md) discussion);
    this allows, e.g., optimized concise syntax in textual dumps.
*   Behavior of the op is documented along with the op with a summary and a
    description. The description is written in markdown and extracted for
    inclusion in the generated LangRef section of the dialect.
*   The generic assembly form of printing and parsing is available as normal,
    but a custom parser and printer can either be specified or automatically
    generated from an optional string representation showing the mapping of the
    "assembly" string to operands/type.
    *   Parser-level remappings (e.g., `eq` to enum) will be supported as part
        of the parser generation.
*   Matching patterns are specified separately from the op description.
    *   In contrast to LLVM, there is no "base" set of ops that every backend
        needs to be aware of. Instead there are many different dialects and the
        transformations/legalizations between these dialects form a graph of
        transformations.
*   A reference implementation may be provided along with the op definition.

    *   The reference implementation may be in terms of either standard ops or
        other reference implementations.

    TODO: document expectation if the dependent op's definition changes.

[TableGen]: https://llvm.org/docs/TableGen/index.html
[TableGenProgRef]: https://llvm.org/docs/TableGen/ProgRef.html
[TableGenBackend]: https://llvm.org/docs/TableGen/BackEnds.html#introduction
[OpBase]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/OpBase.td
[OpDefinitionsGen]: https://github.com/llvm/llvm-project/blob/main/mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
[EnumsGen]: https://github.com/llvm/llvm-project/blob/main/mlir/tools/mlir-tblgen/EnumsGen.cpp
[StringAttr]: Dialects/Builtin.md/#stringattr
[IntegerAttr]: Dialects/Builtin.md/#integertype
[AttrClasses]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/Attributes.h