# Operation Definition Specification (ODS)

In addition to specializing the `mlir::Op` C++ template, MLIR also supports
defining operations and data types in a table-driven manner. This is achieved
via [TableGen][TableGen], which is both a generic language and its tooling to
maintain records of domain-specific information. Facts regarding an operation
are specified concisely in a TableGen record, which will be expanded into an
equivalent `mlir::Op` C++ template specialization at compiler build time.

This manual explains in detail all the available mechanisms for defining
operations in such a table-driven manner. It aims to be a specification instead
of a tutorial. Please refer to
[Quickstart tutorial to adding MLIR graph rewrite](Tutorials/QuickstartRewrites.md)
for the latter.

In addition to detailing each mechanism, this manual also tries to capture best
practices. They are rendered as quoted bullet points.

## Motivation

MLIR allows pluggable dialects, and dialects contain, among others, a list of
operations. This open and extensible ecosystem leads to the "stringly" type IR
problem, e.g., repetitive string comparisons during optimization and analysis
passes, unintuitive accessor methods (e.g., generic/error prone `getOperand(3)`
vs self-documenting `getStride()`) with more generic return types, verbose and
generic constructors without default arguments, verbose textual IR dump, and so
on. Furthermore, operation verification is:

1.  best case: a central string-to-verification-function map,
1.  middle case: duplication of verification across the code base, or
1.  worst case: no verification functions.

The fix is to support defining ops in a table-driven manner. Then for each
dialect, we can have a central place that contains everything you need to know
about each op, including its constraints, custom assembly form, etc. This
description is also used to generate helper functions and classes to allow
building, verification, parsing, printing, analysis, and many more.

## Benefits

Compared to the C++ template, this table-driven approach has several benefits
including but not limited to:

*   **Single source of truth**: We strive to encode all facts regarding an
    operation into the record, so that readers don't need to jump among code
    snippets to fully understand an operation.
*   **Removing boilerplate**: We can automatically generate
    operand/attribute/result getter methods, operation build methods, operation
    verify methods, and many more utilities from the record. This greatly
    reduces the boilerplate needed for defining a new op.
*   **Facilitating auto-generation**: The usage of these operation information
    records is by no means limited to the op definition itself. We can use them
    to drive the auto-generation of many other components, like computation
    graph serialization.

## TableGen Syntax

We use TableGen as the language for specifying operation information. TableGen
itself just provides syntax for writing records; the syntax and constructs
allowed in a TableGen file (typically with filename suffix `.td`) can be found
[here][TableGenProgRef].

*   TableGen `class` is similar to C++ class; it can be templated and
    subclassed.
*   TableGen `def` is similar to C++ object; it can be declared by specializing
    a TableGen `class` (e.g., `def MyDef : MyClass<...>;`) or completely
    independently (e.g., `def MyDef;`). It cannot be further templated or
    subclassed.
*   TableGen `dag` is a dedicated type for directed acyclic graph of elements. A
    `dag` has one operator and zero or more arguments. Its syntax is `(operator
    arg0, arg1, argN)`. The operator can be any TableGen `def`; an argument can
    be anything, including `dag` itself. We can have names attached to both the
    operator and the arguments like `(MyOp:$op_name MyArg:$arg_name)`.

Please see the [language reference][TableGenProgRef] to learn about all the
types and expressions supported by TableGen.
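
As a rough illustration of the three constructs above, the following plain
TableGen sketch uses no MLIR-specific definitions. All record names (`Shape`,
`Quad`, `Square`, `shapes`, `Gallery`) are hypothetical and exist only for this
example.

```tablegen
// A templated `class`: records built from it carry these fields.
class Shape<string shapeLabel, int numSides> {
  string label = shapeLabel;
  int sides = numSides;
}

// A `class` can be subclassed, fixing some of the template arguments.
class Quad<string shapeLabel> : Shape<shapeLabel, 4>;

// A `def` is a concrete record created by specializing a class...
def Square : Quad<"square">;

// ...or declared completely independently, here to serve as a dag operator.
def shapes;

// A `dag` value with one operator (`shapes`) and two named arguments.
def Gallery {
  dag members = (shapes Square:$first, Square:$second);
}
```

ODS relies on the same mechanics: `ins` and `outs` play the role of the dag
operator, and each operand, attribute, or result is a named dag argument.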

## Operation Definition

MLIR defines several common constructs to help operation definition and provide
their semantics via a special [TableGen backend][TableGenBackend]:
[`OpDefinitionsGen`][OpDefinitionsGen]. These constructs are defined in
[`OpBase.td`][OpBase]. The main ones are:

*   The `Op` class: It is the main construct for defining operations. All facts
    regarding the operation are specified when specializing this class, with the
    help of the following constructs.
*   The `Dialect` class: Operations belonging to one logical group are placed in
    the same dialect. The `Dialect` class contains dialect-level information.
*   The `OpTrait` class hierarchy: They are used to specify special properties
    and constraints of the operation, including whether the operation has side
    effects or whether its output has the same shape as the input.
*   The `ins`/`outs` marker: These are two special markers built into the
    `OpDefinitionsGen` backend. They lead the definitions of operands/attributes
    and results respectively.
*   The `TypeConstraint` class hierarchy: They are used to specify the
    constraints over operands or results. A notable subclass hierarchy is
    `Type`, which stands for constraints for common C++ types.
*   The `AttrConstraint` class hierarchy: They are used to specify the
    constraints over attributes. A notable subclass hierarchy is `Attr`, which
    stands for constraints for attributes whose values are of common types.

An operation is defined by specializing the `Op` class with concrete contents
for all the fields it requires. For example, `tf.AvgPool` is defined as

```tablegen
def TF_AvgPoolOp : TF_Op<"AvgPool", [NoSideEffect]> {
  let summary = "Performs average pooling on the input.";

  let description = [{
Each entry in `output` is the mean of the corresponding size `ksize`
window in `value`.
  }];

  let arguments = (ins
    TF_FpTensor:$value,

    Confined<I64ArrayAttr, [ArrayMinCount<4>]>:$ksize,
    Confined<I64ArrayAttr, [ArrayMinCount<4>]>:$strides,
    TF_AnyStrAttrOf<["SAME", "VALID"]>:$padding,
    DefaultValuedAttr<TF_ConvertDataFormatAttr, "NHWC">:$data_format
  );

  let results = (outs
    TF_FpTensor:$output
  );

  TF_DerivedOperandTypeAttr T = TF_DerivedOperandTypeAttr<0>;
}
```

In the following we describe all the fields needed. Please see the definition of
the `Op` class for the complete list of fields supported.

### Operation name

The operation name is a unique identifier of the operation within MLIR, e.g.,
`tf.Add` for addition operation in the TensorFlow dialect. This is the
equivalent of the mnemonic in assembly language. It is used for parsing and
printing in the textual format. It is also used for pattern matching in graph
rewrites.

The full operation name is composed of the dialect name and the op name, with
the former provided via the dialect and the latter provided as the second
template parameter to the `Op` class.
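
A minimal sketch of how the two parts of the name come together, assuming a
hypothetical `demo` dialect; `Demo_Dialect` and `Demo_AddOp` are illustrative
names, not upstream definitions.

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect; its `name` supplies the dialect part of the op name.
def Demo_Dialect : Dialect {
  let name = "demo";
}

// The second template parameter supplies the op part, so the full operation
// name is `demo.add`.
def Demo_AddOp : Op<Demo_Dialect, "add"> {
  let arguments = (ins I32:$lhs, I32:$rhs);
  let results = (outs I32:$sum);
}
```

Dialects commonly wrap this in a helper class (like `TF_Op` in the `tf.AvgPool`
example) so that each op only has to spell out its own mnemonic.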

### Operation documentation

This includes both a one-line `summary` and a longer human-readable
`description`. They will be used to drive automatic generation of dialect
documentation. They need to be provided in the operation's definition body:

```tablegen
let summary = "...";

let description = [{
...
}];
```

`description` should be written in Markdown syntax.

Placing the documentation at the beginning is recommended since it helps in
understanding the operation.

> *   Place documentation at the beginning of the operation definition
> *   The summary should be short and concise. It should be a one-liner without
>     trailing punctuation. Put expanded explanation in description.

### Operation arguments

There are two kinds of arguments: operands and attributes. Operands are runtime
values produced by other ops, while attributes are compile-time known constant
values, including two categories:

1.  Natural attributes: these attributes affect the behavior of the operations
    (e.g., padding for convolution);
1.  Derived attributes: these attributes are not needed to define the operation
    but are instead derived from information of the operation, e.g., the shape
    of an output type. This is mostly used for convenience interface generation
    or interaction with other frameworks/translation.

    All derived attributes should be materializable as an Attribute. That is,
    even though they are not materialized, it should be possible to store them
    as an attribute.

Both operands and attributes are specified inside the `dag`-typed `arguments`,
led by `ins`:

```tablegen
let arguments = (ins
  <type-constraint>:$<operand-name>,
  ...
  <attr-constraint>:$<attr-name>,
  ...
);
```

Here `<type-constraint>` is a TableGen `def` from the `TypeConstraint` class
hierarchy. Similarly, `<attr-constraint>` is a TableGen `def` from the
`AttrConstraint` class hierarchy. See [Constraints](#constraints) for more
information.

There are no requirements on the relative order of operands and attributes; they
can mix freely. The relative order of operands themselves matters. From each
named argument a named getter will be generated that returns the argument with
the return type (in the case of attributes the return type will be constructed
from the storage type, while for operands it will be `Value`). Each attribute's
raw value (e.g., as stored) can also be accessed via generated `<name>Attr`
getters for use in transformation passes where the more user friendly return
type is less suitable.

All the arguments should be named to 1) provide documentation, 2) drive
auto-generation of getter methods, 3) provide a handle to reference for other
places like constraints.

#### Variadic operands

To declare a variadic operand, wrap the `TypeConstraint` for the operand with
`Variadic<...>`.

Normally operations have no variadic operands or just one variadic operand. For
the latter case, it is easy to deduce which dynamic operands are for the static
variadic operand definition. However, if an operation has more than one
variable-length operand (either optional or variadic), it would be impossible to
attribute dynamic operands to the corresponding static variadic operand
definitions without further information from the operation. Therefore, either
the `SameVariadicOperandSize` or the `AttrSizedOperandSegments` trait is needed:
the former indicates that all variable-length operands have the same number of
dynamic values, while the latter records the size of each operand segment in an
attribute.
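
The two traits can be contrasted with a short sketch. It assumes a hypothetical
`demo` dialect; the op names and operand names are made up for illustration.

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect used only for this sketch.
def Demo_Dialect : Dialect {
  let name = "demo";
}

// Both variadic operands always have the same length, so the dynamic operands
// can be split evenly between `$keys` and `$values`.
def Demo_ZipOp : Op<Demo_Dialect, "zip", [SameVariadicOperandSize]> {
  let arguments = (ins
    Variadic<AnyType>:$keys,
    Variadic<AnyType>:$values
  );
}

// The variadic operands may differ in length, so the size of each segment is
// stored on the operation in an attribute.
def Demo_ConcatOp : Op<Demo_Dialect, "concat", [AttrSizedOperandSegments]> {
  let arguments = (ins
    Variadic<AnyType>:$prefix,
    Variadic<AnyType>:$suffix
  );
}
```

`SameVariadicOperandSize` needs no extra storage but only fits ops whose
variable-length operands grow in lockstep; `AttrSizedOperandSegments` is the
more general choice.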

#### VariadicOfVariadic operands

To declare a variadic operand that has a variadic number of sub-ranges, wrap the
`TypeConstraint` for the operand with `VariadicOfVariadic<...,
"<segment-attribute-name>">`.

The second field of the `VariadicOfVariadic` is the name of an `I32ElementsAttr`
argument that contains the sizes of the variadic sub-ranges. This attribute will
be used when determining the size of sub-ranges, or when updating the size of
sub-ranges.
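
A minimal sketch of such a declaration, assuming a hypothetical `demo` dialect;
`$groups` and `$group_sizes` are illustrative names.

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect used only for this sketch.
def Demo_Dialect : Dialect {
  let name = "demo";
}

// `$groups` is a flat operand list that is viewed as a list of sub-ranges.
// The sizes of the sub-ranges live in the `group_sizes` attribute, which is
// referenced by name in the second field of `VariadicOfVariadic`.
def Demo_DispatchOp : Op<Demo_Dialect, "dispatch"> {
  let arguments = (ins
    VariadicOfVariadic<AnyType, "group_sizes">:$groups,
    I32ElementsAttr:$group_sizes
  );
}
```

For instance, five operands with sizes `[2, 0, 3]` would be read as three
sub-ranges holding two, zero, and three values.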

#### Optional operands

To declare an optional operand, wrap the `TypeConstraint` for the operand with
`Optional<...>`.

Normally operations have no optional operands or just one optional operand. For
the latter case, it is easy to deduce which dynamic operands are for the static
operand definition. However, if an operation has more than one variable-length
operand (either optional or variadic), it would be impossible to attribute
dynamic operands to the corresponding static variadic operand definitions
without further information from the operation. Therefore, either the
`SameVariadicOperandSize` or the `AttrSizedOperandSegments` trait is needed: the
former indicates that all variable-length operands have the same number of
dynamic values, while the latter records the size of each operand segment in an
attribute.
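
As a sketch of an optional operand next to a variadic one, assuming a
hypothetical `demo` dialect and made-up operand names:

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect used only for this sketch.
def Demo_Dialect : Dialect {
  let name = "demo";
}

// `$indices` is variadic and `$mask` is optional, so the op has two
// variable-length operands. Their lengths are unrelated, hence
// `AttrSizedOperandSegments` rather than `SameVariadicOperandSize`.
def Demo_LoadOp : Op<Demo_Dialect, "load", [AttrSizedOperandSegments]> {
  let arguments = (ins
    AnyType:$source,
    Variadic<Index>:$indices,
    Optional<I1>:$mask
  );
  let results = (outs AnyType:$result);
}
```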

#### Optional attributes

To declare an optional attribute, wrap the `AttrConstraint` for the attribute
with `OptionalAttr<...>`.

#### Attributes with default values

To declare an attribute with a default value, wrap the `AttrConstraint` for the
attribute with `DefaultValuedAttr<..., "...">`.

The second parameter to `DefaultValuedAttr` should be a string containing the
C++ default value. For example, a float default value should be specified as
`"0.5f"`, and an integer array default value should be specified as
`"{1, 2, 3}"`.
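
The following sketch shows both default values from the paragraph above in
context, with an optional attribute for contrast. The `demo` dialect and all
argument names are hypothetical.

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect used only for this sketch.
def Demo_Dialect : Dialect {
  let name = "demo";
}

def Demo_ScaleOp : Op<Demo_Dialect, "scale"> {
  let arguments = (ins
    F32:$input,
    // May be absent from the operation altogether.
    OptionalAttr<I32Attr>:$hint,
    // The default is written as a C++ expression inside a string.
    DefaultValuedAttr<F32Attr, "0.5f">:$factor,
    DefaultValuedAttr<I64ArrayAttr, "{1, 2, 3}">:$steps
  );
  let results = (outs F32:$output);
}
```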

#### Confining attributes

`Confined` is provided as a general mechanism to help modelling further
constraints on attributes beyond the ones brought by value types. You can use
`Confined` to compose complex constraints out of more primitive ones. For
example, a 32-bit integer attribute whose minimum value must be 10 can be
expressed as `Confined<I32Attr, [IntMinValue<10>]>`.

Right now, the following primitive constraints are supported:

*   `IntMinValue<N>`: Specifying an integer attribute to be greater than or
    equal to `N`
*   `IntMaxValue<N>`: Specifying an integer attribute to be less than or equal
    to `N`
*   `ArrayMinCount<N>`: Specifying an array attribute to have at least `N`
    elements
*   `IntArrayNthElemEq<I, N>`: Specifying an integer array attribute's `I`-th
    element to be equal to `N`
*   `IntArrayNthElemMinValue<I, N>`: Specifying an integer array attribute's
    `I`-th element to be greater than or equal to `N`

TODO: Design and implement more primitive constraints
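
Several primitive constraints can be listed together in one `Confined`. The
sketch below assumes a hypothetical `demo` dialect; the attribute names and
bounds are arbitrary.

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect used only for this sketch.
def Demo_Dialect : Dialect {
  let name = "demo";
}

def Demo_WindowOp : Op<Demo_Dialect, "window"> {
  let arguments = (ins
    // A 32-bit integer in the closed range [1, 8].
    Confined<I32Attr, [IntMinValue<1>, IntMaxValue<8>]>:$rank,
    // An array with at least two elements whose element 0 is >= 1.
    Confined<I64ArrayAttr,
             [ArrayMinCount<2>, IntArrayNthElemMinValue<0, 1>]>:$sizes
  );
}
```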

### Operation regions

The regions of an operation are specified inside of the `dag`-typed `regions`,
led by `region`:

```tablegen
let regions = (region
  <region-constraint>:$<region-name>,
  ...
);
```

#### Variadic regions

Similar to the `Variadic` class used for variadic operands and results,
`VariadicRegion<...>` can be used for regions. Variadic regions can currently
only be specified as the last region in the regions list.
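
A small sketch combining a fixed region with a trailing variadic one, assuming
a hypothetical `demo` dialect and made-up region names:

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect used only for this sketch.
def Demo_Dialect : Dialect {
  let name = "demo";
}

def Demo_SwitchOp : Op<Demo_Dialect, "switch"> {
  let arguments = (ins I32:$selector);
  let regions = (region
    // Exactly one region that must contain a single block.
    SizedRegion<1>:$default_region,
    // Zero or more regions; the variadic entry has to come last.
    VariadicRegion<AnyRegion>:$case_regions
  );
}
```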

### Operation results

Similar to operands, results are specified inside the `dag`-typed `results`, led
by `outs`:

```tablegen
let results = (outs
  <type-constraint>:$<result-name>,
  ...
);
```

#### Variadic results

Similar to variadic operands, `Variadic<...>` can also be used for results.
Likewise, `SameVariadicResultSize` can be used when an operation has multiple
variadic results.
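
For illustration, an op with two variadic results of equal length might be
declared as follows. The `demo` dialect and result names are hypothetical.

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect used only for this sketch.
def Demo_Dialect : Dialect {
  let name = "demo";
}

// `$evens` and `$odds` always have the same number of values, which lets the
// dynamic results be attributed to the two static result definitions.
def Demo_SplitOp : Op<Demo_Dialect, "split", [SameVariadicResultSize]> {
  let arguments = (ins AnyType:$input);
  let results = (outs
    Variadic<AnyType>:$evens,
    Variadic<AnyType>:$odds
  );
}
```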

### Operation successors

For terminator operations, the successors are specified inside of the
`dag`-typed `successors`, led by `successor`:

```tablegen
let successors = (successor
  <successor-constraint>:$<successor-name>,
  ...
);
```

#### Variadic successors

Similar to the `Variadic` class used for variadic operands and results,
`VariadicSuccessor<...>` can be used for successors. Variadic successors can
currently only be specified as the last successor in the successor list.
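
A sketch of a terminator with one fixed successor followed by a variadic list,
assuming a hypothetical `demo` dialect and made-up successor names:

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect used only for this sketch.
def Demo_Dialect : Dialect {
  let name = "demo";
}

def Demo_JumpOp : Op<Demo_Dialect, "jump", [Terminator]> {
  let arguments = (ins I32:$selector);
  let successors = (successor
    // Always present.
    AnySuccessor:$fallback,
    // Zero or more blocks; the variadic entry has to come last.
    VariadicSuccessor<AnySuccessor>:$targets
  );
}
```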

### Operation traits and constraints

Traits are operation properties that affect syntax or semantics. MLIR C++ models
various traits in the `mlir::OpTrait` namespace.

Operation traits, [interfaces](Interfaces.md/#utilizing-the-ods-framework),
and constraints involving multiple operands/attributes/results are all provided
as the third template parameter to the `Op` class. They should derive from the
`OpTrait` class. See [Constraints](#constraints) for more information.
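
The sketch below passes both kinds of entries in the third template parameter:
plain traits and a constraint spanning several entities. It assumes a
hypothetical `demo` dialect; the op names are illustrative.

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect used only for this sketch.
def Demo_Dialect : Dialect {
  let name = "demo";
}

// Traits describing properties of the whole operation.
def Demo_MaxOp : Op<Demo_Dialect, "max",
                    [Commutative, SameOperandsAndResultType]> {
  let arguments = (ins AnyType:$lhs, AnyType:$rhs);
  let results = (outs AnyType:$result);
}

// A constraint that ties together entities referenced by name.
def Demo_CopyOp : Op<Demo_Dialect, "copy",
                     [AllTypesMatch<["input", "output"]>]> {
  let arguments = (ins AnyType:$input);
  let results = (outs AnyType:$output);
}
```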

### Builder methods

For each operation, there are a few builders automatically generated based on
the arguments and return types. For example, given the following op definition:

```tablegen
def MyOp : ... {
  let arguments = (ins
    I32:$i32_operand,
    F32:$f32_operand,
    ...,

    I32Attr:$i32_attr,
    F32Attr:$f32_attr,
    ...
  );

  let results = (outs
    I32:$i32_result,
    F32:$f32_result,
    ...
  );
}
```

The following builders are generated:

```c++
// All result-types/operands/attributes have one aggregate parameter.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  ArrayRef<Type> resultTypes,
                  ValueRange operands,
                  ArrayRef<NamedAttribute> attributes);

// Each result-type/operand/attribute has a separate parameter. The parameters
// for attributes are of mlir::Attribute types.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  Type i32_result, Type f32_result, ...,
                  Value i32_operand, Value f32_operand, ...,
                  IntegerAttr i32_attr, FloatAttr f32_attr, ...);

// Each result-type/operand/attribute has a separate parameter. The parameters
// for attributes are raw values unwrapped from mlir::Attribute instances.
// (Note that this builder will not always be generated. See the following
// explanation for more details.)
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  Type i32_result, Type f32_result, ...,
                  Value i32_operand, Value f32_operand, ...,
                  APInt i32_attr, APFloat f32_attr, ...);

// Each operand/attribute has a separate parameter but result type is aggregate.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  ArrayRef<Type> resultTypes,
                  Value i32_operand, Value f32_operand, ...,
                  IntegerAttr i32_attr, FloatAttr f32_attr, ...);

// All operands/attributes have aggregate parameters.
// Generated if return type can be inferred.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  ValueRange operands, ArrayRef<NamedAttribute> attributes);

// (And manually specified builders depending on the specific op.)
```

The first form provides basic uniformity so that we can create ops using the
same form regardless of the exact op. This is particularly useful for
implementing declarative pattern rewrites.

The second and third forms are good for use in manually written code given that
they provide better guarantees via signatures.

The third form will be generated if any of the op's attributes has an
`Attr.returnType` different from its `Attr.storageType` and we know how to build
an attribute from an unwrapped value (i.e., `Attr.constBuilderCall` is defined).
Additionally, for the third form, if an attribute appearing later in the
`arguments` list has a default value, the default value will be supplied in the
declaration. This works for `BoolAttr`, `StrAttr`, `EnumAttr` for now and the
list can grow in the future. So if possible, default valued attributes should be
placed at the end of the `arguments` list to leverage this feature. (This
behavior is essentially due to C++ function parameter default value placement
restrictions.) Otherwise, the builder of the third form will still be generated
but default values for the attributes not at the end of the `arguments` list
will not be supplied in the builder's signature.

ODS will generate a builder that doesn't require the return type to be specified
if either of the following holds:

*   The op implements the `InferTypeOpInterface` interface;
*   All return types are either buildable types or are the same as a given
    operand (e.g., `AllTypesMatch` constraint between operand and result).

There may also exist other builders depending on the specific op; please refer
to the
[generated C++ file](#run-mlir-tblgen-to-see-the-generated-content) for the
complete list.

#### Custom builder methods

However, if the above cases cannot satisfy all needs, you can define additional
convenience build methods in the `builders` field as follows.

```tablegen
def MyOp : Op<"my_op", []> {
  let arguments = (ins F32Attr:$attr);

  let builders = [
    OpBuilder<(ins "float":$val)>
  ];
}
```

The `builders` field is a list of custom builders that are added to the Op
class. In this example, we provide a convenience builder that takes a floating
point value instead of an attribute. The `ins` prefix is common to many function
declarations in ODS, which use a TableGen [`dag`](#tablegen-syntax). What
follows is a comma-separated list of types (quoted string) and names prefixed
with the `$` sign. This will generate the declaration of a builder method that
looks like:

```c++
class MyOp : /*...*/ {
  /*...*/
  static void build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
                    float val);
};
```

Note that the method has two additional leading arguments. These arguments are
useful to construct the operation. In particular, the method must populate
`state` with attributes, operands, regions and result types of the operation to
be constructed. `builder` can be used to construct any IR objects that belong to
the Op, such as types or nested operations. Since the type and name are
generated as is in the C++ code, they should be valid C++ constructs for a type
(in the namespace of the Op) and an identifier (e.g., `class` is not a valid
identifier).

Implementations of the builder can be provided directly in ODS, using TableGen
code block as follows.

```tablegen
def MyOp : Op<"my_op", []> {
  let arguments = (ins F32Attr:$attr);

  let builders = [
    OpBuilder<(ins "float":$val), [{
      $_state.addAttribute("attr", $_builder.getF32FloatAttr(val));
    }]>
  ];
}
```

The equivalents of `builder` and `state` arguments are available as `$_builder`
and `$_state` special variables. The named arguments listed in the `ins` part
are available directly, e.g. `val`. The body of the builder will be generated by
substituting special variables and should otherwise be valid C++. While there is
no limitation on the code size, we encourage one to define only short builders
inline in ODS and put definitions of longer builders in C++ files.

Finally, if some arguments need a default value, they can be defined using
`CArg` to wrap the type and this value as follows.

```tablegen
def MyOp : Op<"my_op", []> {
  let arguments = (ins F32Attr:$attr);

  let builders = [
    OpBuilder<(ins CArg<"float", "0.5f">:$val), [{
      $_state.addAttribute("attr", $_builder.getF32FloatAttr(val));
    }]>
  ];
}
```

The generated code will use the default value in the declaration, but not in the
definition, as required by C++.

```c++
/// Header file.
class MyOp : /*...*/ {
  /*...*/
  static void build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
                    float val = 0.5f);
};

/// Source file.
void MyOp::build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
                 float val) {
  state.addAttribute("attr", builder.getF32FloatAttr(val));
}
```

**Deprecated:** `OpBuilder` class allows one to specify the custom builder
signature as a raw string, without separating parameters into different `dag`
arguments. It also supports leading parameters of `OpBuilder &` and
`OperationState &` types, which will be used instead of the autogenerated ones
if present.

### Custom parser and printer methods

Functions to parse and print the operation's custom assembly form.
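
A hedged sketch of what this can look like. It assumes that the `Op` class in
the MLIR revision at hand exposes `parser` and `printer` code fields and that
the generated methods name their parameters `parser`, `result`, and `p`; both
details have changed across revisions, so check the `Op` class in `OpBase.td`
before relying on them. `parseRawOp` and `printRawOp` are hypothetical
hand-written C++ functions, and `demo` is a hypothetical dialect.

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect used only for this sketch.
def Demo_Dialect : Dialect {
  let name = "demo";
}

def Demo_RawOp : Op<Demo_Dialect, "raw"> {
  let arguments = (ins Variadic<AnyType>:$inputs);

  // Both snippets forward to hand-written C++ helpers (assumed to exist).
  let parser = [{ return ::parseRawOp(parser, result); }];
  let printer = [{ return ::printRawOp(p, *this); }];
}
```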

### Custom verifier code

Verification code will be automatically generated for
[constraints](#constraints) specified on various entities of the op. To perform
_additional_ verification, you can use

```tablegen
let verifier = [{
  ...
}];
```

Code placed in `verifier` will be called after the auto-generated verification
code. The order of trait verification excluding those of `verifier` should not
be relied upon.
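
As a sketch of an inline verifier, assuming the snippet is emitted into a
member function of the generated op class that returns `LogicalResult`. The
`demo` dialect, the op, and the even-operand rule are invented for
illustration.

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect used only for this sketch.
def Demo_Dialect : Dialect {
  let name = "demo";
}

def Demo_PairsOp : Op<Demo_Dialect, "pairs"> {
  let arguments = (ins Variadic<AnyType>:$items);

  let verifier = [{
    // Runs after the generated checks for the declared constraints.
    if (getOperation()->getNumOperands() % 2 != 0)
      return emitOpError("expects an even number of operands");
    return ::mlir::success();
  }];
}
```

Longer checks are usually easier to maintain as a C++ function that the snippet
simply calls.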

### Declarative Assembly Format

The custom assembly form of the operation may be specified in a declarative
string that matches the operation's operands, attributes, etc., with the ability
to express additional information that needs to be parsed to build the
operation:

```tablegen
def CallOp : Std_Op<"call", ...> {
  let arguments = (ins FlatSymbolRefAttr:$callee, Variadic<AnyType>:$args);
  let results = (outs Variadic<AnyType>);

  let assemblyFormat = [{
    $callee `(` $args `)` attr-dict `:` functional-type($args, results)
  }];
}
```

The format comprises three components:

#### Directives

A directive is a type of builtin function, with an optional set of arguments.
The available directives are as follows:

*   `attr-dict`

    -   Represents the attribute dictionary of the operation.

*   `attr-dict-with-keyword`

    -   Represents the attribute dictionary of the operation, but prefixes the
        dictionary with an `attributes` keyword.

*   `custom` < UserDirective > ( Params )

    -   Represents a custom directive implemented by the user in C++.
    -   See the [Custom Directives](#custom-directives) section below for more
        details.

*   `functional-type` ( inputs , results )

    -   Formats the `inputs` and `results` arguments as a
        [function type](Dialects/Builtin.md/#functiontype).
    -   The constraints on `inputs` and `results` are the same as the `input` of
        the `type` directive.

*   `operands`

    -   Represents all of the operands of an operation.

*   `ref` ( input )

    -   Represents a reference to a variable or directive, which must have
        already been resolved, that may be used as a parameter to a `custom`
        directive.
    -   Used to pass previously parsed entities to custom directives.
    -   The input may be any directive or variable, aside from `functional-type`
        and `custom`.

*   `regions`

    -   Represents all of the regions of an operation.

*   `results`

    -   Represents all of the results of an operation.

*   `successors`

    -   Represents all of the successors of an operation.

*   `type` ( input )

    -   Represents the type of the given input.
    -   `input` must be either an operand or result [variable](#variables), the
        `operands` directive, or the `results` directive.

*   `qualified` ( type_or_attribute )

    -   Wraps a `type` directive or an attribute parameter.
    -   Used to force printing the type or attribute prefixed with its dialect
        and mnemonic. For example, the `vector.multi_reduction` operation has a
        `kind` attribute; by default the declarative assembly will print:
        `vector.multi_reduction <minf>, ...` but using `qualified($kind)` in the
        declarative assembly format will print it instead as:
        `vector.multi_reduction #vector.kind<minf>, ...`.
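
To make a few of these directives concrete, here is a sketch with two small
formats. It assumes a hypothetical `demo` dialect; the ops and the textual
forms in the comments are illustrative.

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect used only for this sketch.
def Demo_Dialect : Dialect {
  let name = "demo";
}

// `attr-dict` and `type`: all types are equal, so one type is spelled out.
// Textual form: `%0 = demo.add %a, %b : i32`
def Demo_AddOp : Op<Demo_Dialect, "add", [SameOperandsAndResultType]> {
  let arguments = (ins AnyType:$lhs, AnyType:$rhs);
  let results = (outs AnyType:$result);
  let assemblyFormat = "$lhs `,` $rhs attr-dict `:` type($result)";
}

// `functional-type`: operand and result types are printed as a function type.
// Textual form: `%1 = demo.cast %0 : (i32) -> i64`
def Demo_CastOp : Op<Demo_Dialect, "cast"> {
  let arguments = (ins AnyType:$input);
  let results = (outs AnyType:$output);
  let assemblyFormat = "$input attr-dict `:` functional-type($input, $output)";
}
```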

#### Literals

A literal is either a keyword or punctuation surrounded by \`\`.

The following are the set of valid punctuation:

`:`, `,`, `=`, `<`, `>`, `(`, `)`, `{`, `}`, `[`, `]`, `->`, `?`, `+`, `*`

The following are valid whitespace punctuation:

`\n`, ` `

The `\n` literal emits a newline and indents to the start of the operation. An
example is shown below:

```tablegen
let assemblyFormat = [{
  `{` `\n` ` ` ` ` `this_is_on_a_newline` `\n` `}` attr-dict
}];
```

```mlir
%results = my.operation {
  this_is_on_a_newline
}
```

An empty literal \`\` may be used to remove a space that is inserted implicitly
after certain literal elements, such as `)`/`]`/etc. For example, "`]`" may
result in an output of `] ` if it is not the last element in the format. "`]`
\`\`" would trim the trailing space in this situation.

#### Variables

A variable is an entity that has been registered on the operation itself, i.e.
an argument (attribute or operand), region, result, successor, etc. In the
`CallOp` example above, the variables would be `$callee` and `$args`.

Attribute variables are printed with their respective value type, unless that
value type is buildable. In those cases, the type of the attribute is elided.

#### Custom Directives

The declarative assembly format specification allows for handling a large
majority of the common cases when formatting an operation. For the operations
that require or desire specifying parts of the operation in a form not supported
by the declarative syntax, custom directives may be specified. A custom
directive essentially allows for users to use C++ for printing and parsing
subsections of an otherwise declaratively specified format. Looking at the
specification of a custom directive above:

```
custom-directive ::= `custom` `<` UserDirective `>` `(` Params `)`
```

A custom directive has two main parts: The `UserDirective` and the `Params`. A
custom directive is transformed into a call to a `print*` and a `parse*` method
when generating the C++ code for the format. The `UserDirective` is an
identifier used as a suffix to these two calls, i.e., `custom<MyDirective>(...)`
would result in calls to `parseMyDirective` and `printMyDirective` within the
parser and printer respectively. `Params` may be any combination of variables
(i.e. Attribute, Operand, Successor, etc.), type directives, and `attr-dict`.
The type directives must refer to a variable, but that variable need not also be
a parameter to the custom directive.

The arguments to the `parse<UserDirective>` method are firstly a reference to
the `OpAsmParser` (`OpAsmParser &`), and secondly a set of output parameters
corresponding to the parameters specified in the format. The mapping of
declarative parameter to `parse` method argument is detailed below:

*   Attribute Variables
    -   Single: `<Attribute-Storage-Type>(e.g. Attribute) &`
    -   Optional: `<Attribute-Storage-Type>(e.g. Attribute) &`
*   Operand Variables
    -   Single: `OpAsmParser::OperandType &`
    -   Optional: `Optional<OpAsmParser::OperandType> &`
    -   Variadic: `SmallVectorImpl<OpAsmParser::OperandType> &`
    -   VariadicOfVariadic:
        `SmallVectorImpl<SmallVector<OpAsmParser::OperandType>> &`
*   Ref Directives
    -   A reference directive is passed to the parser using the same mapping as
        the input operand. For example, a single region would be passed as a
        `Region &`.
*   Region Variables
    -   Single: `Region &`
    -   Variadic: `SmallVectorImpl<std::unique_ptr<Region>> &`
*   Successor Variables
    -   Single: `Block *&`
    -   Variadic: `SmallVectorImpl<Block *> &`
*   Type Directives
    -   Single: `Type &`
    -   Optional: `Type &`
    -   Variadic: `SmallVectorImpl<Type> &`
    -   VariadicOfVariadic: `SmallVectorImpl<SmallVector<Type>> &`
*   `attr-dict` Directive: `NamedAttrList &`

When a variable is optional, the value should only be specified if the variable
is present. Otherwise, the value should remain `None` or null.

The arguments to the `print<UserDirective>` method are firstly a reference to
the `OpAsmPrinter` (`OpAsmPrinter &`), secondly the op (e.g. `FooOp op`, which
can alternatively be `Operation *op`), and finally a set of parameters
corresponding to the parameters specified in the format. The mapping of
declarative parameter to `print` method argument is detailed below:

*   Attribute Variables
    -   Single: `<Attribute-Storage-Type>(e.g. Attribute)`
    -   Optional: `<Attribute-Storage-Type>(e.g. Attribute)`
*   Operand Variables
    -   Single: `Value`
    -   Optional: `Value`
    -   Variadic: `OperandRange`
    -   VariadicOfVariadic: `OperandRangeRange`
*   Ref Directives
    -   A reference directive is passed to the printer using the same mapping as
        the input operand. For example, a single region would be passed as a
        `Region &`.
*   Region Variables
    -   Single: `Region &`
    -   Variadic: `MutableArrayRef<Region>`
*   Successor Variables
    -   Single: `Block *`
    -   Variadic: `SuccessorRange`
*   Type Directives
    -   Single: `Type`
    -   Optional: `Type`
    -   Variadic: `TypeRange`
    -   VariadicOfVariadic: `TypeRangeRange`
*   `attr-dict` Directive: `DictionaryAttr`

When a variable is optional, the provided value may be null.
795
796#### Optional Groups
797
798In certain situations operations may have "optional" information, e.g.
799attributes or an empty set of variadic operands. In these situations a section
800of the assembly format can be marked as `optional` based on the presence of this
801information. An optional group is defined as follows:
802
803```
804optional-group: `(` elements `)` (`:` `(` else-elements `)`)? `?`
805```
806
807The `elements` of an optional group have the following requirements:
808
*   The first element of the group must either be an attribute, literal,
    operand, or region.
811    -   This is because the first element must be optionally parsable.
812*   Exactly one argument variable or type directive within the group must be
813    marked as the anchor of the group.
814    -   The anchor is the element whose presence controls whether the group
815        should be printed/parsed.
816    -   An element is marked as the anchor by adding a trailing `^`.
817    -   The first element is *not* required to be the anchor of the group.
    -   When a non-variadic region anchors a group, the group is printed only
        if the region is non-empty.
820*   Literals, variables, custom directives, and type directives are the only
821    valid elements within the group.
822    -   Any attribute variable may be used, but only optional attributes can be
823        marked as the anchor.
    -   Only variadic or optional results and operand arguments can be used.
825    -   All region variables can be used. When a non-variable length region is
826        used, if the group is not present the region is empty.
827
828An example of an operation with an optional group is `std.return`, which has a
829variadic number of operands.
830
831```tablegen
832def ReturnOp : ... {
833  let arguments = (ins Variadic<AnyType>:$operands);
834
835  // We only print the operands and types if there are a non-zero number
836  // of operands.
837  let assemblyFormat = "attr-dict ($operands^ `:` type($operands))?";
838}
839```
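
To make the anchor semantics concrete, the following sketch hand-writes roughly
what a printer for the `ReturnOp` format above has to do. It is only an
illustration: `printReturn` and its string-based operands are hypothetical
stand-ins, not the code ODS generates, and the exact spacing of the real
printer may differ.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Each operand is modelled as a (name, type) pair of strings.
using Operand = std::pair<std::string, std::string>;

// Hand-written analogue of "attr-dict ($operands^ `:` type($operands))?"
// for an op whose attribute dictionary is empty.
std::string printReturn(const std::vector<Operand> &operands) {
  std::string out = "return";
  // `$operands` is the anchor: an empty operand list skips the whole group.
  if (operands.empty())
    return out;

  std::string names, types;
  for (std::size_t i = 0; i < operands.size(); ++i) {
    if (i != 0) {
      names += ", ";
      types += ", ";
    }
    names += operands[i].first;
    types += operands[i].second;
  }
  return out + " " + names + " : " + types;
}
```

The literal `:` and the type list are emitted only together with the operands,
which is the behavior the anchor is meant to express.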
840
841##### Unit Attributes
842
843In MLIR, the [`unit` Attribute](Dialects/Builtin.md/#unitattr) is special in that it
844only has one possible value, i.e. it derives meaning from its existence. When a
845unit attribute is used to anchor an optional group and is not the first element
846of the group, the presence of the unit attribute can be directly correlated with
847the presence of the optional group itself. As such, in these situations the unit
848attribute will not be printed or present in the output and will be automatically
849inferred when parsing by the presence of the optional group itself.
850
851For example, the following operation:
852
853```tablegen
854def FooOp : ... {
855  let arguments = (ins UnitAttr:$is_read_only);
856
857  let assemblyFormat = "attr-dict (`is_read_only` $is_read_only^)?";
858}
859```
860
861would be formatted as such:
862
863```mlir
864// When the unit attribute is present:
865foo.op is_read_only
866
867// When the unit attribute is not present:
868foo.op
869```
870
871##### Optional "else" Group
872
873Optional groups also have support for an "else" group of elements. These are
874elements that are parsed/printed if the `anchor` element of the optional group
875is *not* present. Unlike the main element group, the "else" group has no
876restriction on the first element and none of the elements may act as the
877`anchor` for the optional. An example is shown below:
878
879```tablegen
880def FooOp : ... {
881  let arguments = (ins UnitAttr:$foo);
882
883  let assemblyFormat = "attr-dict (`foo_is_present` $foo^):(`foo_is_absent`)?";
884}
885```
886
887would be formatted as such:
888
889```mlir
890// When the `foo` attribute is present:
891foo.op foo_is_present
892
893// When the `foo` attribute is not present:
894foo.op foo_is_absent
895```
896
897#### Requirements
898
899The format specification has a certain set of requirements that must be adhered
900to:
901
9021.  The output and operation name are never shown as they are fixed and cannot
903    be altered.
9041.  All operands within the operation must appear within the format, either
905    individually or with the `operands` directive.
9061.  All regions within the operation must appear within the format, either
907    individually or with the `regions` directive.
9081.  All successors within the operation must appear within the format, either
909    individually or with the `successors` directive.
9101.  All operand and result types must appear within the format using the various
911    `type` directives, either individually or with the `operands` or `results`
912    directives.
9131.  The `attr-dict` directive must always be present.
1.  Must not contain overlapping information; e.g. multiple instances of
    `attr-dict`, types, operands, etc.
916    -   Note that `attr-dict` does not overlap with individual attributes. These
917        attributes will simply be elided when printing the attribute dictionary.
918
919##### Type Inference
920
921One requirement of the format is that the types of operands and results must
922always be present. In certain instances, the type of a variable may be deduced
923via type constraints or other information available. In these cases, the type of
924that variable may be elided from the format.
925
926*   Buildable Types
927
928Some type constraints may only have one representation, allowing for them to be
929directly buildable; for example the `I32` or `Index` types. Types in `ODS` may
930mark themselves as buildable by setting the `builderCall` field or inheriting
931from the `BuildableType` class.
932
933*   Trait Equality Constraints
934
There are many operations that have known type equality constraints registered
as traits on the operation; for example the true, false, and result values of a
`select` operation often have the same type. The assembly format may inspect
these equality constraints to discern the types of missing variables. The
currently supported traits are: `AllTypesMatch`, `TypesMatchWith`,
`SameTypeOperands`, and `SameOperandsAndResultType`.
941
942*   InferTypeOpInterface
943
944Operations that implement `InferTypeOpInterface` can omit their result types in
945their assembly format since the result types can be inferred from the operands.
946
947### `hasCanonicalizer`
948
This boolean field indicates whether canonicalization patterns have been defined
for this operation. If it is `1`, then `::getCanonicalizationPatterns()` should
be defined.
952
953### `hasCanonicalizeMethod`
954
When this boolean field is set to `true`, it indicates that the op implements a
`canonicalize` method for simple "matchAndRewrite" style canonicalization
patterns. If `hasCanonicalizer` is `0`, then an implementation of
`::getCanonicalizationPatterns()` is generated to call this function.
959
960### `hasFolder`
961
This boolean field indicates whether general folding rules have been defined for
this operation. If it is `1`, then `::fold()` should be defined.
964
965### Extra declarations
966
967One of the goals of table-driven op definition is to auto-generate as much logic
968and methods needed for each op as possible. With that said, there will always be
969long-tail cases that won't be covered. For such cases, you can use
970`extraClassDeclaration`. Code in `extraClassDeclaration` will be copied
971literally to the generated C++ op class.
972
973Note that `extraClassDeclaration` is a mechanism intended for long-tail cases by
974power users; for not-yet-implemented widely-applicable cases, improving the
975infrastructure is preferable.
976
977### Extra definitions
978
When defining base op classes in TableGen that are inherited many times by
different ops, users may want to provide common definitions of utility and
interface functions. However, many of these definitions may not be desirable or
possible in `extraClassDeclaration`, which appends them to the op's C++ class
declaration. In these cases, users can add an `extraClassDefinition` to define
code that is added to the generated source file inside the op's C++ namespace.
The substitution `$cppClass` is replaced by the op's C++ class name.
986
987### Generated C++ code
988
989[OpDefinitionsGen][OpDefinitionsGen] processes the op definition spec file and
990generates two files containing the corresponding C++ code: one for declarations,
991the other for definitions. The former is generated via the `-gen-op-decls`
992command-line option, while the latter is via the `-gen-op-defs` option.
993
994The definition file contains all the op method definitions, which can be
995included and enabled by defining `GET_OP_CLASSES`. For each operation,
996OpDefinitionsGen generates an operation class and an
997[operand adaptor](#operand-adaptors) class. Besides, it also contains a
998comma-separated list of all defined ops, which can be included and enabled by
999defining `GET_OP_LIST`.
1000
1001#### Class name and namespaces
1002
1003For each operation, its generated C++ class name is the symbol `def`ed with
1004TableGen with dialect prefix removed. The first `_` serves as the delimiter. For
1005example, for `def TF_AddOp`, the C++ class name would be `AddOp`. We remove the
1006`TF` prefix because it is for scoping ops; other dialects may as well define
1007their own `AddOp`s.
1008
1009The namespaces of the generated C++ class will come from the dialect's
1010`cppNamespace` field. For example, if a dialect's `cppNamespace` is `A::B`, then
1011an op of that dialect will be placed in `namespace A { namespace B { ... } }`.
1012If a dialect does not specify a `cppNamespace`, we then use the dialect's name
1013as the namespace.
1014
1015This means the qualified name of the generated C++ class does not necessarily
1016match exactly with the operation name as explained in
1017[Operation name](#operation-name). This is to allow flexible naming to satisfy
1018coding style requirements.
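
The naming rules above can be pictured with two small string helpers. This is a
sketch for illustration only, not the code used by `mlir-tblgen`; the function
names `cppClassNameFromDef` and `openNamespaces` are made up here, and the
handling of names without a `_` is an assumption of the sketch.

```cpp
#include <cstddef>
#include <string>

// Drops everything up to and including the first `_`,
// e.g. "TF_AddOp" -> "AddOp". Names without `_` are returned unchanged
// (an assumption of this sketch).
std::string cppClassNameFromDef(const std::string &defName) {
  const std::size_t pos = defName.find('_');
  if (pos == std::string::npos)
    return defName;
  return defName.substr(pos + 1);
}

// Renders a `cppNamespace` such as "A::B" as nested namespace openers,
// e.g. "namespace A { namespace B {".
std::string openNamespaces(const std::string &cppNamespace) {
  std::string out;
  std::size_t start = 0;
  while (start <= cppNamespace.size()) {
    std::size_t end = cppNamespace.find("::", start);
    if (end == std::string::npos)
      end = cppNamespace.size();
    if (end > start) { // skip empty components such as a leading "::"
      if (!out.empty())
        out += " ";
      out += "namespace " + cppNamespace.substr(start, end - start) + " {";
    }
    start = end + 2;
  }
  return out;
}
```

Only the first `_` acts as the delimiter, so a definition named `TF_My_Op`
would map to the class name `My_Op` in this sketch.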
1019
1020#### Operand adaptors
1021
1022For each operation, we automatically generate an _operand adaptor_. This class
1023solves the problem of accessing operands provided as a list of `Value`s without
1024using "magic" constants. The operand adaptor takes a reference to an array of
1025`Value` and provides methods with the same names as those in the operation class
1026to access them. For example, for a binary arithmetic operation, it may provide
1027`.lhs()` to access the first operand and `.rhs()` to access the second operand.
1028
1029The operand adaptor class lives in the same namespace as the operation class,
1030and has the name of the operation followed by `Adaptor` as well as an alias
1031`Adaptor` inside the op class.
1032
1033Operand adaptors can be used in function templates that also process operations:
1034
```c++
template <typename BinaryOpTy>
std::pair<Value, Value> zip(BinaryOpTy &&op) {
  return std::make_pair(op.lhs(), op.rhs());
}

void process(AddOp op, ArrayRef<Value> newOperands) {
  zip(op);
  // `AddOp::Adaptor` is the alias for the generated `AddOpAdaptor` class.
  zip(AddOp::Adaptor(newOperands));
  /*...*/
}
```
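
For readers who want to experiment with the idea outside of MLIR, here is a
self-contained analogue. All types live in a made-up `sketch` namespace:
`Value` is just a string, and `AddOp`/`AddOpAdaptor` are hand-written stand-ins
for what ODS would generate, so the details are assumptions rather than the
generated code.

```cpp
#include <string>
#include <utility>
#include <vector>

namespace sketch {
// Stand-in for mlir::Value.
using Value = std::string;

// Named accessors over a plain list of values, so callers never index with
// "magic" constants. The adaptor does not own the list it refers to.
class AddOpAdaptor {
public:
  explicit AddOpAdaptor(const std::vector<Value> &values) : operands(values) {}
  Value lhs() const { return operands[0]; }
  Value rhs() const { return operands[1]; }

private:
  const std::vector<Value> &operands;
};

// Stand-in for the op class: owns its operands and exposes the same names.
class AddOp {
public:
  using Adaptor = AddOpAdaptor;
  AddOp(Value lhs, Value rhs) : operands{std::move(lhs), std::move(rhs)} {}
  Value lhs() const { return operands[0]; }
  Value rhs() const { return operands[1]; }

private:
  std::vector<Value> operands;
};

// Works for both the op and its adaptor because the accessor names match.
template <typename BinaryOpTy>
std::pair<Value, Value> zip(const BinaryOpTy &op) {
  return std::make_pair(op.lhs(), op.rhs());
}
} // namespace sketch
```

Because the adaptor in this sketch only holds a reference, the operand list
passed to it has to outlive the adaptor.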
1047
1048## Constraints
1049
1050Constraint is a core concept in table-driven operation definition: operation
1051verification and graph operation matching are all based on satisfying
1052constraints. So both the operation definition and rewrite rules specification
1053significantly involve writing constraints. We have the `Constraint` class in
1054[`OpBase.td`][OpBase] as the common base class for all constraints.
1055
An operation's constraints can cover different ranges; a constraint may

*   Only concern a single attribute (e.g. being a 32-bit integer greater than
    5),
*   Involve multiple operands and results (e.g., the 1st result's shape must be
    the same as the 1st operand), or
*   Be intrinsic to the operation itself (e.g., having no side effect).

We call these single-entity constraints, multi-entity constraints, and traits,
respectively.
1066
1067### Single-entity constraint
1068
1069Constraints scoped to a single operand, attribute, or result are specified at
1070the entity's declaration place as described in
1071[Operation arguments](#operation-arguments) and
1072[Operation results](#operation-results).
1073
To help model constraints of common types, a set of `TypeConstraint`s is
provided; they form the `Type` subclass hierarchy. It includes `F32` for the
constraint of being a float, `TensorOf<[F32]>` for the constraint of being a
float tensor, and so on.

Similarly, a set of `AttrConstraint`s is provided to help model constraints of
common attribute kinds. They form the `Attr` subclass hierarchy. It includes
`F32Attr` for the constraint of being a float attribute, `F32ArrayAttr` for the
constraint of being a float array attribute, and so on.
1083
1084### Multi-entity constraint
1085
1086Constraints involving more than one operand/attribute/result are quite common on
1087operations, like the element type and shape relation between operands and
1088results. These constraints should be specified as the `Op` class template
1089parameter as described in
1090[Operation traits and constraints](#operation-traits-and-constraints).
1091
Multi-entity constraints are modeled as `PredOpTrait` (a subclass of `OpTrait`)
in [`OpBase.td`][OpBase]. A number of constraint primitives are provided to help
with specification. See [`OpBase.td`][OpBase] for the complete list.
1095
1096### Trait
1097
Traits are intrinsic properties of the operation, like having side effects or
not, being commutative or not, whether it is a terminator, etc. These
constraints should be specified as the `Op` class template parameter as
described in
[Operation traits and constraints](#operation-traits-and-constraints).
1102
Traits are modeled as `NativeOpTrait` (a subclass of `OpTrait`) in
[`OpBase.td`][OpBase]. They are backed by, and will be translated into, the
corresponding C++ `mlir::OpTrait` classes.
1106
1107### How to specify new constraint
1108
1109To write a constraint, you need to provide its predicates and give it a
1110descriptive name. Predicates, modeled with the `Pred` class, are the workhorse
1111for composing constraints. The predicate for a constraint is typically built up
1112in a nested manner, using the two categories of predicates:
1113
11141.  `CPred`: the primitive leaf predicate.
11152.  Compound predicate: a predicate composed from child predicates using
1116    predicate combiners (conjunction: `And`, disjunction: `Or`, negation: `Neg`,
1117    substitution: `SubstLeaves`, concatenation: `Concat`).
1118
1119`CPred` is the basis for composing more complex predicates. It is the "atom"
1120predicate from the perspective of TableGen and the "interface" between TableGen
1121and C++. What is inside is already C++ code, which will be treated as opaque
1122strings with special placeholders to be substituted.
1123
1124You can put any C++ code that returns a boolean value inside a `CPred`,
1125including evaluating expressions, calling functions, calling class methods, and
1126so on.
1127
1128To help interaction with the C++ environment, there are a few special
1129placeholders provided to refer to entities in the context where this predicate
1130is used. They serve as "hooks" to the enclosing environment. This includes
1131`$_builder`, `$_op`, and `$_self`:
1132
1133*   `$_builder` will be replaced by a `mlir::Builder` instance so that you can
1134    access common build methods.
1135*   `$_op` will be replaced by the current operation so that you can access
1136    information of the current operation.
*   `$_self` will be replaced with the entity this predicate is attached to.
    E.g., `BoolAttr` is an attribute constraint that wraps a
    `CPred<"$_self.isa<BoolAttr>()">`. Then for `BoolAttr:$attr`, `$_self` will
    be replaced by `$attr`. Type constraints are a little special: because we
    want the constraints on each type definition to read naturally and we want
    to attach type constraints directly to an operand/result, `$_self` will be
    replaced by the operand/result's type. E.g., for `F32` in `F32:$operand`,
    its `$_self` will be expanded as `operand(...).getType()`.
1145
1146TODO: Reconsider the leading symbol for special placeholders. Eventually we want
1147to allow referencing operand/result `$-name`s; such `$-name`s can start with
1148underscore.
1149
For example, to write that an attribute `attr` is an `IntegerAttr`, in C++ you
can just call `attr.isa<IntegerAttr>()`. The code can be wrapped in a `CPred` as
`$_self.isa<IntegerAttr>()`, with `$_self` as the special placeholder to be
replaced by the current attribute `attr` at expansion time.
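
Conceptually, this expansion is a textual substitution on the opaque predicate
string. The helper below is a minimal sketch of that idea; `substitute` is a
name made up for this example and is not the formatting utility that the
TableGen backend actually uses.

```cpp
#include <cstddef>
#include <string>

// Replaces every occurrence of `placeholder` in `pred` with `replacement`.
std::string substitute(std::string pred, const std::string &placeholder,
                       const std::string &replacement) {
  if (placeholder.empty())
    return pred; // nothing to look for
  std::size_t pos = 0;
  while ((pos = pred.find(placeholder, pos)) != std::string::npos) {
    pred.replace(pos, placeholder.size(), replacement);
    pos += replacement.size(); // continue after the inserted text
  }
  return pred;
}
```

For instance, substituting `attr` for `$_self` in `$_self.isa<IntegerAttr>()`
yields the C++ expression `attr.isa<IntegerAttr>()`.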
1154
For more complicated predicates, you can wrap them in a single `CPred`, or you
can use predicate combiners to combine them. For example, to write the
constraint that an attribute `attr` is a 32-bit or 64-bit integer, you can write
it as
1158
1159```tablegen
1160And<[
1161  CPred<"$_self.isa<IntegerAttr>()">,
1162  Or<[
1163    CPred<"$_self.cast<IntegerAttr>().getType().isInteger(32)">,
1164    CPred<"$_self.cast<IntegerAttr>().getType().isInteger(64)">
1165  ]>
1166]>
1167```
1168
1169(Note that the above is just to show with a familiar example how you can use
1170`CPred` and predicate combiners to write complicated predicates. For integer
1171attributes specifically, [`OpBase.td`][OpBase] already defines `I32Attr` and
1172`I64Attr`. So you can actually reuse them to write it as `Or<[I32Attr.predicate,
1173I64Attr.predicate]>`.)
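
One way to think about the combiners is as operations on C++ expression
strings: `And` joins its children with `&&` and `Or` joins them with `||`. The
sketch below follows that mental model. The helper names `combine`, `andPred`,
and `orPred` are hypothetical, and the parenthesization is a choice made for
this example, not necessarily what the generator emits.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Joins child predicate strings with `op`, parenthesizing each child so that
// nesting cannot change precedence.
std::string combine(const std::vector<std::string> &children,
                    const std::string &op) {
  std::string out;
  for (std::size_t i = 0; i < children.size(); ++i) {
    if (i != 0)
      out += " " + op + " ";
    out += "(" + children[i] + ")";
  }
  return out;
}

// Analogue of `And<[...]>`: every child must hold.
std::string andPred(const std::vector<std::string> &children) {
  return combine(children, "&&");
}

// Analogue of `Or<[...]>`: at least one child must hold.
std::string orPred(const std::vector<std::string> &children) {
  return combine(children, "||");
}
```

Nesting `orPred` inside `andPred` mirrors the TableGen example above: the
"is an `IntegerAttr`" check guards the two width checks.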
1174
1175TODO: Build up a library of reusable primitive constraints
1176
1177If the predicate is very complex to write with `CPred` together with predicate
1178combiners, you can also write it as a normal C++ function and use the `CPred` as
1179a way to "invoke" the function. For example, to verify an attribute `attr` has
1180some property, you can write a C++ function like
1181
1182```cpp
1183bool HasSomeProperty(Attribute attr) { ... }
1184```
1185
1186and then define the op as:
1187
1188```tablegen
1189def HasSomeProperty : AttrConstraint<CPred<"HasSomeProperty($_self)">,
1190                                     "has some property">;
1191
1192def MyOp : Op<...> {
1193  let arguments = (ins
1194    ...
1195    HasSomeProperty:$attr
1196  );
1197}
1198```
1199
As to whether we should define the predicate using a single `CPred` wrapping the
whole expression, multiple `CPred`s with predicate combiners, or a single
`CPred` "invoking" a function, there are no clear-cut criteria. Defining using
`CPred` and predicate combiners is preferable since it exposes more information
(instead of hiding all the logic behind a C++ function) into the op definition
spec so that it can potentially drive more auto-generation cases. But it will
require a nice library of common predicates as the building blocks to avoid
duplication, which is being worked on right now.
1208
1209## Attribute Definition
1210
1211An attribute is a compile-time known constant of an operation.
1212
ODS provides attribute wrappers over C++ attribute classes. There are a few
common C++ [attribute classes][AttrClasses] defined in MLIR's core IR library
and one is free to define dialect-specific attribute classes. ODS allows one to
use these attributes in TableGen to define operations, potentially with more
fine-grained constraints. For example, `StrAttr` directly maps to `StringAttr`;
`F32Attr`/`F64Attr` require the `FloatAttr` to additionally be of a certain
bitwidth.
1220
1221ODS attributes are defined as having a storage type (corresponding to a backing
1222`mlir::Attribute` that _stores_ the attribute), a return type (corresponding to
1223the C++ _return_ type of the generated helper getters) as well as a method
1224to convert between the internal storage and the helper method.
1225
1226### Attribute decorators
1227
1228There are a few important attribute adapters/decorators/modifiers that can be
1229applied to ODS attributes to specify common additional properties like
1230optionality, default values, etc.:
1231
1232*   `DefaultValuedAttr`: specifies the
1233    [default value](#attributes-with-default-values) for an attribute.
1234*   `OptionalAttr`: specifies an attribute as [optional](#optional-attributes).
1235*   `Confined`: adapts an attribute with
1236    [further constraints](#confining-attributes).
1237
1238### Enum attributes
1239
1240Some attributes can only take values from a predefined enum, e.g., the
1241comparison kind of a comparison op. To define such attributes, ODS provides
1242several mechanisms: `StrEnumAttr`, `IntEnumAttr`, and `BitEnumAttr`.
1243
1244*   `StrEnumAttr`: each enum case is a string, the attribute is stored as a
1245    [`StringAttr`][StringAttr] in the op.
1246*   `IntEnumAttr`: each enum case is an integer, the attribute is stored as a
1247    [`IntegerAttr`][IntegerAttr] in the op.
1248*   `BitEnumAttr`: each enum case is a bit, the attribute is stored as a
1249    [`IntegerAttr`][IntegerAttr] in the op.
1250
1251All these `*EnumAttr` attributes require fully specifying all of the allowed
1252cases via their corresponding `*EnumAttrCase`. With this, ODS is able to
1253generate additional verification to only accept allowed cases. To facilitate the
1254interaction between `*EnumAttr`s and their C++ consumers, the
1255[`EnumsGen`][EnumsGen] TableGen backend can generate a few common utilities: a
1256C++ enum class, `llvm::DenseMapInfo` for the enum class, conversion functions
1257from/to strings. This is controlled via the `-gen-enum-decls` and
1258`-gen-enum-defs` command-line options of `mlir-tblgen`.
1259
1260For example, given the following `EnumAttr`:
1261
1262```tablegen
1263def Case15: I32EnumAttrCase<"Case15", 15>;
1264def Case20: I32EnumAttrCase<"Case20", 20>;
1265
1266def MyIntEnum: I32EnumAttr<"MyIntEnum", "An example int enum",
1267                           [Case15, Case20]> {
1268  let cppNamespace = "Outer::Inner";
1269  let stringToSymbolFnName = "ConvertToEnum";
1270  let symbolToStringFnName = "ConvertToString";
1271}
1272```
1273
1274The following will be generated via `mlir-tblgen -gen-enum-decls`:
1275
1276```c++
1277namespace Outer {
1278namespace Inner {
1279// An example int enum
1280enum class MyIntEnum : uint32_t {
1281  Case15 = 15,
1282  Case20 = 20,
1283};
1284
1285llvm::Optional<MyIntEnum> symbolizeMyIntEnum(uint32_t);
1286llvm::StringRef ConvertToString(MyIntEnum);
1287llvm::Optional<MyIntEnum> ConvertToEnum(llvm::StringRef);
1288inline constexpr unsigned getMaxEnumValForMyIntEnum() {
1289  return 20;
1290}
1291
1292} // namespace Inner
1293} // namespace Outer
1294
1295namespace llvm {
1296template<> struct DenseMapInfo<Outer::Inner::MyIntEnum> {
1297  using StorageInfo = llvm::DenseMapInfo<uint32_t>;
1298
1299  static inline Outer::Inner::MyIntEnum getEmptyKey() {
1300    return static_cast<Outer::Inner::MyIntEnum>(StorageInfo::getEmptyKey());
1301  }
1302
1303  static inline Outer::Inner::MyIntEnum getTombstoneKey() {
1304    return static_cast<Outer::Inner::MyIntEnum>(StorageInfo::getTombstoneKey());
1305  }
1306
1307  static unsigned getHashValue(const Outer::Inner::MyIntEnum &val) {
1308    return StorageInfo::getHashValue(static_cast<uint32_t>(val));
1309  }
1310
1311  static bool isEqual(const Outer::Inner::MyIntEnum &lhs, const Outer::Inner::MyIntEnum &rhs) {
1312    return lhs == rhs;
1313  }
1314};
1315}
1316```
1317
1318The following will be generated via `mlir-tblgen -gen-enum-defs`:
1319
1320```c++
1321namespace Outer {
1322namespace Inner {
1323llvm::StringRef ConvertToString(MyIntEnum val) {
1324  switch (val) {
1325    case MyIntEnum::Case15: return "Case15";
1326    case MyIntEnum::Case20: return "Case20";
1327  }
1328  return "";
1329}
1330
1331llvm::Optional<MyIntEnum> ConvertToEnum(llvm::StringRef str) {
1332  return llvm::StringSwitch<llvm::Optional<MyIntEnum>>(str)
1333      .Case("Case15", MyIntEnum::Case15)
1334      .Case("Case20", MyIntEnum::Case20)
1335      .Default(llvm::None);
1336}
1337llvm::Optional<MyIntEnum> symbolizeMyIntEnum(uint32_t value) {
1338  switch (value) {
1339  case 15: return MyIntEnum::Case15;
1340  case 20: return MyIntEnum::Case20;
1341  default: return llvm::None;
1342  }
1343}
1344
1345} // namespace Inner
1346} // namespace Outer
1347```
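
A consumer typically chains these utilities, e.g. to turn a raw integer into a
printable case name. The sketch below re-types the generated functions by hand
so that it compiles without LLVM: `std::optional` stands in for
`llvm::Optional` and `std::string_view` for `llvm::StringRef`. The `describe`
helper and its `"<invalid>"` fallback are hypothetical additions for this
example.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

namespace Outer {
namespace Inner {
enum class MyIntEnum : std::uint32_t { Case15 = 15, Case20 = 20 };

std::string_view ConvertToString(MyIntEnum val) {
  switch (val) {
  case MyIntEnum::Case15: return "Case15";
  case MyIntEnum::Case20: return "Case20";
  }
  return "";
}

std::optional<MyIntEnum> ConvertToEnum(std::string_view str) {
  if (str == "Case15") return MyIntEnum::Case15;
  if (str == "Case20") return MyIntEnum::Case20;
  return std::nullopt;
}

std::optional<MyIntEnum> symbolizeMyIntEnum(std::uint32_t value) {
  switch (value) {
  case 15: return MyIntEnum::Case15;
  case 20: return MyIntEnum::Case20;
  default: return std::nullopt;
  }
}
} // namespace Inner
} // namespace Outer

// Hypothetical helper: maps a raw integer to its case name, or "<invalid>"
// when the integer does not correspond to any enum case.
std::string_view describe(std::uint32_t raw) {
  if (auto symbol = Outer::Inner::symbolizeMyIntEnum(raw))
    return Outer::Inner::ConvertToString(*symbol);
  return "<invalid>";
}
```

The key point is that both symbolization functions return an empty optional
for unknown input, so callers must check the result before using it.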
1348
1349Similarly for the following `BitEnumAttr` definition:
1350
1351```tablegen
1352def None: BitEnumAttrCase<"None", 0x0000>;
1353def Bit1: BitEnumAttrCase<"Bit1", 0x0001>;
1354def Bit2: BitEnumAttrCase<"Bit2", 0x0002>;
1355def Bit3: BitEnumAttrCase<"Bit3", 0x0004>;
1356
1357def MyBitEnum: BitEnumAttr<"MyBitEnum", "An example bit enum",
1358                           [None, Bit1, Bit2, Bit3]>;
1359```
1360
1361We can have:
1362
1363```c++
1364// An example bit enum
1365enum class MyBitEnum : uint32_t {
1366  None = 0,
1367  Bit1 = 1,
1368  Bit2 = 2,
1369  Bit3 = 4,
1370};
1371
1372llvm::Optional<MyBitEnum> symbolizeMyBitEnum(uint32_t);
1373std::string stringifyMyBitEnum(MyBitEnum);
1374llvm::Optional<MyBitEnum> symbolizeMyBitEnum(llvm::StringRef);
1375inline MyBitEnum operator|(MyBitEnum lhs, MyBitEnum rhs) {
1376  return static_cast<MyBitEnum>(static_cast<uint32_t>(lhs) | static_cast<uint32_t>(rhs));
1377}
1378inline MyBitEnum operator&(MyBitEnum lhs, MyBitEnum rhs) {
1379  return static_cast<MyBitEnum>(static_cast<uint32_t>(lhs) & static_cast<uint32_t>(rhs));
1380}
1381inline bool bitEnumContains(MyBitEnum bits, MyBitEnum bit) {
1382  return (static_cast<uint32_t>(bits) & static_cast<uint32_t>(bit)) != 0;
1383}
1384
1385namespace llvm {
1386template<> struct DenseMapInfo<::MyBitEnum> {
1387  using StorageInfo = llvm::DenseMapInfo<uint32_t>;
1388
1389  static inline ::MyBitEnum getEmptyKey() {
1390    return static_cast<::MyBitEnum>(StorageInfo::getEmptyKey());
1391  }
1392
1393  static inline ::MyBitEnum getTombstoneKey() {
1394    return static_cast<::MyBitEnum>(StorageInfo::getTombstoneKey());
1395  }
1396
1397  static unsigned getHashValue(const ::MyBitEnum &val) {
1398    return StorageInfo::getHashValue(static_cast<uint32_t>(val));
1399  }
1400
1401  static bool isEqual(const ::MyBitEnum &lhs, const ::MyBitEnum &rhs) {
1402    return lhs == rhs;
1403  }
};
} // namespace llvm
1405```
1406
1407```c++
1408std::string stringifyMyBitEnum(MyBitEnum symbol) {
1409  auto val = static_cast<uint32_t>(symbol);
1410  // Special case for all bits unset.
1411  if (val == 0) return "None";
1412
1413  llvm::SmallVector<llvm::StringRef, 2> strs;
1414  if (1u & val) { strs.push_back("Bit1"); val &= ~1u; }
1415  if (2u & val) { strs.push_back("Bit2"); val &= ~2u; }
1416  if (4u & val) { strs.push_back("Bit3"); val &= ~4u; }
1417
1418  if (val) return "";
1419  return llvm::join(strs, "|");
1420}
1421
1422llvm::Optional<MyBitEnum> symbolizeMyBitEnum(llvm::StringRef str) {
1423  // Special case for all bits unset.
1424  if (str == "None") return MyBitEnum::None;
1425
1426  llvm::SmallVector<llvm::StringRef, 2> symbols;
1427  str.split(symbols, "|");
1428
1429  uint32_t val = 0;
1430  for (auto symbol : symbols) {
1431    auto bit = llvm::StringSwitch<llvm::Optional<uint32_t>>(symbol)
1432      .Case("Bit1", 1)
1433      .Case("Bit2", 2)
1434      .Case("Bit3", 4)
1435      .Default(llvm::None);
1436    if (bit) { val |= *bit; } else { return llvm::None; }
1437  }
1438  return static_cast<MyBitEnum>(val);
1439}
1440
1441llvm::Optional<MyBitEnum> symbolizeMyBitEnum(uint32_t value) {
1442  // Special case for all bits unset.
1443  if (value == 0) return MyBitEnum::None;
1444
1445  if (value & ~(1u | 2u | 4u)) return llvm::None;
1446  return static_cast<MyBitEnum>(value);
1447}
1448```
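
The bit enum helpers are meant to be combined: cases are OR-ed together, tested
with `bitEnumContains`, and stringified with `|` as the separator. The
following sketch shows this with standard-library types only. The enum and the
two operators are copied from the declarations above, while `stringify` is a
hand-written analogue of `stringifyMyBitEnum` and is an assumption of this
example, not the generated definition.

```cpp
#include <cstdint>
#include <string>

enum class MyBitEnum : std::uint32_t { None = 0, Bit1 = 1, Bit2 = 2, Bit3 = 4 };

inline MyBitEnum operator|(MyBitEnum lhs, MyBitEnum rhs) {
  return static_cast<MyBitEnum>(static_cast<std::uint32_t>(lhs) |
                                static_cast<std::uint32_t>(rhs));
}

inline bool bitEnumContains(MyBitEnum bits, MyBitEnum bit) {
  return (static_cast<std::uint32_t>(bits) &
          static_cast<std::uint32_t>(bit)) != 0;
}

// Hand-written analogue of stringifyMyBitEnum using only the standard library.
std::string stringify(MyBitEnum symbol) {
  auto val = static_cast<std::uint32_t>(symbol);
  // Special case for all bits unset.
  if (val == 0)
    return "None";

  std::string out;
  auto append = [&](std::uint32_t bit, const char *name) {
    if (!(val & bit))
      return;
    if (!out.empty())
      out += "|";
    out += name;
    val &= ~bit; // clear the bit so leftovers can be detected below
  };
  append(1u, "Bit1");
  append(2u, "Bit2");
  append(4u, "Bit3");

  // Any remaining bit does not belong to a known case.
  return val ? std::string() : out;
}
```

Note that `bitEnumContains(bits, MyBitEnum::None)` is always false with this
definition, since `None` has no bits set.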
1449
1450## Type Definitions
1451
1452MLIR defines the `TypeDef` class hierarchy to enable generation of data types from
1453their specifications. A type is defined by specializing the `TypeDef` class with
1454concrete contents for all the fields it requires. For example, an integer type
1455could be defined as:
1456
1457```tablegen
1458// All of the types will extend this class.
1459class Test_Type<string name> : TypeDef<Test_Dialect, name> { }
1460
1461// An alternate int type.
1462def IntegerType : Test_Type<"TestInteger"> {
1463  let mnemonic = "int";
1464
1465  let summary = "An integer type with special semantics";
1466
1467  let description = [{
1468    An alternate integer type. This type differentiates itself from the
1469    standard integer type by not having a SignednessSemantics parameter, just
1470    a width.
1471  }];
1472
1473  let parameters = (ins "unsigned":$width);
1474
1475  // We define the printer inline.
1476  let printer = [{
1477    $_printer << "int<" << getImpl()->width << ">";
1478  }];
1479
1480  // The parser is defined here also.
1481  let parser = [{
1482    if ($_parser.parseLess())
1483      return Type();
1484    int width;
1485    if ($_parser.parseInteger(width))
1486      return Type();
1487    if ($_parser.parseGreater())
1488      return Type();
1489    return get($_ctxt, width);
1490  }];
1491}
1492```
1493
1494### Type name
1495
1496The name of the C++ class which gets generated defaults to
1497`<classParamName>Type` (e.g. `TestIntegerType` in the above example). This can
1498be overridden via the `cppClassName` field. The field `mnemonic` is to specify
1499the asm name for parsing. It is optional and not specifying it will imply that
1500no parser or printer methods are attached to this class.
1501
1502### Type documentation
1503
1504The `summary` and `description` fields exist and are to be used the same way as
1505in Operations. Namely, the summary should be a one-liner and `description`
1506should be a longer explanation.
1507
1508### Type parameters
1509
1510The `parameters` field is a list of the type's parameters. If no parameters are
1511specified (the default), this type is considered a singleton type. Parameters
1512are in the `"c++Type":$paramName` format. To use C++ types as parameters which
1513need allocation in the storage constructor, there are two options:
1514
1515-   Set `hasCustomStorageConstructor` to generate the TypeStorage class with a
1516    constructor which is just declared -- no definition -- so you can write it
1517    yourself.
1518-   Use the `TypeParameter` tablegen class instead of the "c++Type" string.
1519
1520### TypeParameter tablegen class
1521
This is used to further specify attributes about each of the type's parameters.
It includes documentation (`summary` and `syntax`), the C++ type to use, a
custom allocator to use in the storage constructor method, and a custom
comparator to decide if two instances of the parameter type are equal.
1526
1527```tablegen
1528// DO NOT DO THIS!
1529let parameters = (ins "ArrayRef<int>":$dims);
1530```

The default storage constructor blindly copies fields by value. It does not know
anything about the types. Because an `ArrayRef<int>` does not own the data it
points to, the example above requires allocation with
`dims = allocator.copyInto(dims)`.

You can specify the necessary constructor by specializing the `TypeParameter`
tablegen class:

```tablegen
class ArrayRefIntParam :
    TypeParameter<"::llvm::ArrayRef<int>", "Array of ints"> {
  let allocator = "$_dst = $_allocator.copyInto($_self);";
}

...

let parameters = (ins ArrayRefIntParam:$dims);
```

The `allocator` code block has the following substitutions:

-   `$_allocator` is the TypeStorageAllocator in which to allocate objects.
-   `$_dst` is the variable in which to place the allocated data.
-   `$_self` is the parameter value to be copied, as used in the example above.

The `comparator` code block has the following substitutions:

-   `$_lhs` is an instance of the parameter type.
-   `$_rhs` is an instance of the parameter type.

MLIR includes several specialized classes for common situations:

-   `StringRefParameter<descriptionOfParam>` for StringRefs.
-   `ArrayRefParameter<arrayOf, descriptionOfParam>` for ArrayRefs of value
    types.
-   `SelfAllocationParameter<descriptionOfParam>` for C++ classes which contain
    a method called `allocateInto(StorageAllocator &allocator)` to allocate
    itself into `allocator`.
-   `ArrayRefOfSelfAllocationParameter<arrayOf, descriptionOfParam>` for arrays
    of objects which self-allocate as per the last specialization.

Using one of these included specializations, the `dims` parameter from the
earlier example can be declared as:

```tablegen
let parameters = (ins
  ArrayRefParameter<"int", "The dimensions">:$dims
);
```
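To make the role of the `allocator` and `comparator` snippets in this section more concrete, here is a minimal sketch of what they guard against. It is plain C++ with stand-in types rather than MLIR classes: `IntArrayView` plays the part of `ArrayRef<int>`, `SketchAllocator` plays the part of the storage allocator, and `SketchTypeStorage` and `sameDims` are hypothetical names. It assumes only that `copyInto` returns a view of memory owned by the allocator.

```c++
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Stand-in for ArrayRef<int>: a non-owning pointer plus length.
struct IntArrayView {
  const int *data;
  std::size_t size;
};

// Stand-in for the storage allocator: owns every buffer copied into it.
class SketchAllocator {
public:
  IntArrayView copyInto(IntArrayView source) {
    buffers.emplace_back(source.data, source.data + source.size);
    const std::vector<int> &owned = buffers.back();
    return IntArrayView{owned.data(), owned.size()};
  }

private:
  // A deque never relocates existing elements when new ones are appended.
  std::deque<std::vector<int>> buffers;
};

// Roughly what a storage constructor does once an allocator snippet such as
// `$_dst = $_allocator.copyInto($_self);` has been applied to `dims`.
struct SketchTypeStorage {
  SketchTypeStorage(SketchAllocator &allocator, IntArrayView dims)
      : dims(allocator.copyInto(dims)) {}
  IntArrayView dims;
};

// A comparator in the spirit of `$_lhs` and `$_rhs`: element-wise equality.
inline bool sameDims(IntArrayView lhs, IntArrayView rhs) {
  if (lhs.size != rhs.size)
    return false;
  for (std::size_t i = 0; i < lhs.size; ++i)
    if (lhs.data[i] != rhs.data[i])
      return false;
  return true;
}
```

Because the storage holds a view into allocator-owned memory, later changes to the caller's buffer do not affect it. A by-value copy of the view alone would keep pointing at the caller's buffer.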

### Parsing and printing

If a mnemonic is specified, the `printer` and `parser` code fields are active.
The rules for both are:

-   If null, generate just the declaration.
-   If non-null and non-empty, use the code in the definition. The `$_printer`
    or `$_parser` substitutions are valid and should be used.
-   It is an error to have an empty code block.

For each dialect, two "dispatch" functions will be created: one for parsing and
one for printing. You should add calls to these in your `Dialect::printType` and
`Dialect::parseType` methods. They are static functions placed alongside the
type class definitions and have the following function signatures:

```c++
static Type generatedTypeParser(MLIRContext* ctxt, DialectAsmParser& parser, StringRef mnemonic);
static LogicalResult generatedTypePrinter(Type type, DialectAsmPrinter& printer);
```

The mnemonic, parser, and printer fields are optional. If they're not defined,
the generated code will not include any parsing or printing code and will omit
the type from the dispatch functions above. In this case, the dialect author is
responsible for parsing/printing the types in `Dialect::printType` and
`Dialect::parseType`.
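The division of labor between a dispatch function and the dialect's own hook can be sketched as follows. This is a stand-in written in plain C++, not the generated code: `SketchType`, `generatedTypeParserSketch`, `parseTypeSketch`, and the `handwritten` mnemonic are hypothetical, and the sketch assumes that the dispatch function yields a null type for a mnemonic it does not recognize.

```c++
#include <cassert>
#include <string>
#include <utility>

// Stand-in for mlir::Type; an empty class name models a null type.
struct SketchType {
  SketchType() {}
  explicit SketchType(std::string className)
      : className(std::move(className)) {}
  explicit operator bool() const { return !className.empty(); }
  std::string className;
};

// Stand-in for the dispatch function: it only knows types that declared a
// mnemonic together with a parser.
SketchType generatedTypeParserSketch(const std::string &mnemonic) {
  if (mnemonic == "int")
    return SketchType("TestIntegerType");
  return SketchType();
}

// Stand-in for a dialect's `parseType` hook: try the dispatch function first,
// then fall back to types that are parsed by hand.
SketchType parseTypeSketch(const std::string &mnemonic) {
  if (SketchType type = generatedTypeParserSketch(mnemonic))
    return type;
  if (mnemonic == "handwritten")
    return SketchType("HandwrittenType");
  return SketchType();
}
```

The printing side follows the same pattern: call the dispatch function first and handle the remaining types by hand.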

### Other fields

-   If the `genStorageClass` field is set to 1 (the default), a storage class is
    generated with member variables corresponding to each of the specified
    `parameters`.
-   If the `genAccessors` field is 1 (the default), accessor methods will be
    generated on the Type class (e.g. `int getWidth() const` in the example
    above).
-   If the `genVerifyDecl` field is set, a declaration for a method `static
    LogicalResult verify(emitErrorFn, parameters...)` is added to the class as
    well as a `getChecked(emitErrorFn, parameters...)` method which checks the
    result of `verify` before calling `get`.
-   The `storageClass` field can be used to set the name of the storage class.
-   The `storageNamespace` field is used to set the namespace where the storage
    class should sit. Defaults to "detail".
-   The `extraClassDeclaration` field is used to include extra code in the class
    declaration.
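The relationship between `verify`, `get`, and `getChecked` described for `genVerifyDecl` can be pictured with a small stand-in. The sketch below is plain C++ and not generated code: `CheckedTypeSketch` and `EmitErrorFn` are hypothetical names, the non-negative rule is an invented invariant, and a `bool` (true on success) replaces `LogicalResult`.

```c++
#include <cassert>
#include <functional>
#include <string>

// Callback standing in for the `emitErrorFn` argument.
using EmitErrorFn = std::function<void(const std::string &)>;

class CheckedTypeSketch {
public:
  // A null instance, analogous to a default-constructed type.
  CheckedTypeSketch() : valid(false), intParam(0) {}

  explicit operator bool() const { return valid; }
  int getIntParam() const { return intParam; }

  // Hand-written invariant check; returns true when the parameters are valid.
  static bool verify(const EmitErrorFn &emitError, int intParam) {
    if (intParam >= 0)
      return true;
    emitError("intParam must be non-negative");
    return false;
  }

  // Unchecked construction.
  static CheckedTypeSketch get(int intParam) {
    return CheckedTypeSketch(intParam);
  }

  // Checked construction: run `verify` first and call `get` only on success.
  static CheckedTypeSketch getChecked(const EmitErrorFn &emitError,
                                      int intParam) {
    if (!verify(emitError, intParam))
      return CheckedTypeSketch();
    return get(intParam);
  }

private:
  explicit CheckedTypeSketch(int intParam) : valid(true), intParam(intParam) {}

  bool valid;
  int intParam;
};
```

A caller that may receive invalid parameters uses `getChecked` and tests the result, while `get` remains available when the parameters are known to be valid.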

### Type builder methods

For each type, there are a few builders (`get`/`getChecked`) automatically
generated based on the parameters of the type. For example, given the following
type definition:

```tablegen
def MyType : ... {
  let parameters = (ins "int":$intParam);
}
```

The following builders are generated:

```c++
// Type builders are named `get`, and return a new instance of a type for a
// given set of parameters.
static MyType get(MLIRContext *context, int intParam);

// If `genVerifyDecl` is set to 1, the following method is also generated.
static MyType getChecked(function_ref<InFlightDiagnostic()> emitError,
                         MLIRContext *context, int intParam);
```

If these autogenerated methods are not desired, such as when they conflict with
a custom builder method, a type can set `skipDefaultBuilders` to 1 to signal
that they should not be generated.

#### Custom type builder methods

The default build methods may cover a majority of the simple cases related to
type construction, but when they cannot satisfy a type's needs, you can define
additional convenience `get` methods in the `builders` field as follows:

```tablegen
def MyType : ... {
  let parameters = (ins "int":$intParam);

  let builders = [
    TypeBuilder<(ins "int":$intParam)>,
    TypeBuilder<(ins CArg<"int", "0">:$intParam)>,
    TypeBuilder<(ins CArg<"int", "0">:$intParam), [{
      // Write the body of the `get` builder inline here.
      return Base::get($_ctxt, intParam);
    }]>,
    TypeBuilderWithInferredContext<(ins "Type":$typeParam), [{
      // This builder states that it can infer an MLIRContext instance from
      // its arguments.
      return Base::get(typeParam.getContext(), ...);
    }]>,
  ];
}
```

The `builders` field is a list of custom builders that are added to the type
class. In this example, we provide several different convenience builders that
are useful in different scenarios. The `ins` prefix is common to many function
declarations in ODS, which use a TableGen [`dag`](#tablegen-syntax). What
follows is a comma-separated list of types (quoted string or `CArg`) and names
prefixed with the `$` sign. The use of `CArg` allows for providing a default
value to that argument. Let's take a look at each of these builders
individually.

The first builder will generate the declaration of a builder method that looks
like:

```tablegen
  let builders = [
    TypeBuilder<(ins "int":$intParam)>,
  ];
```

```c++
class MyType : /*...*/ {
  /*...*/
  static MyType get(::mlir::MLIRContext *context, int intParam);
};
```

This builder has the same signature as the one that is automatically generated
for `MyType`. The `context` parameter is implicitly added by the generator, and
is used when building the Type instance (with `Base::get`). The distinction
here is that we can provide the implementation of this `get` method. With this
style of builder definition, only the declaration is generated; the implementor
of `MyType` will need to provide a definition of `MyType::get`.

The second builder will generate the declaration of a builder method that looks
like:

```tablegen
  let builders = [
    TypeBuilder<(ins CArg<"int", "0">:$intParam)>,
  ];
```

```c++
class MyType : /*...*/ {
  /*...*/
  static MyType get(::mlir::MLIRContext *context, int intParam = 0);
};
```

The constraints here are identical to the first builder example except for the
fact that `intParam` now has a default value attached.

The third builder will generate the declaration and definition of a builder
method that look like:

```tablegen
  let builders = [
    TypeBuilder<(ins CArg<"int", "0">:$intParam), [{
      // Write the body of the `get` builder inline here.
      return Base::get($_ctxt, intParam);
    }]>,
  ];
```

```c++
class MyType : /*...*/ {
  /*...*/
  static MyType get(::mlir::MLIRContext *context, int intParam = 0);
};

MyType MyType::get(::mlir::MLIRContext *context, int intParam) {
  // Write the body of the `get` builder inline here.
  return Base::get(context, intParam);
}
```

This is the same as the second builder example, except that a definition for
the builder method is now generated automatically using the provided code block
as the body. When specifying the body inline, `$_ctxt` may be used to access the
`MLIRContext *` parameter.

The fourth builder will generate the declaration and definition of a builder
method that look like:

```tablegen
  let builders = [
    TypeBuilderWithInferredContext<(ins "Type":$typeParam), [{
      // This builder states that it can infer an MLIRContext instance from
      // its arguments.
      return Base::get(typeParam.getContext(), ...);
    }]>,
  ];
```

```c++
class MyType : /*...*/ {
  /*...*/
  static MyType get(Type typeParam);
};

MyType MyType::get(Type typeParam) {
  // This builder states that it can infer an MLIRContext instance from its
  // arguments.
  return Base::get(typeParam.getContext(), ...);
}
```

In this builder example, the main difference from the third builder example is
that the `MLIRContext` parameter is no longer added. This is because the
builder class used, `TypeBuilderWithInferredContext`, implies that the context
parameter is not necessary, as it can be inferred from the arguments to the
builder.
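The two signature shapes discussed above, an explicit context with a defaulted argument and a context inferred from an argument, can be compared side by side in a small stand-in. This is a sketch in plain C++ and not generated code: `SketchContext`, `SketchElementType`, and `BuiltTypeSketch` are hypothetical substitutes for `MLIRContext`, `Type`, and `MyType`.

```c++
#include <cassert>

// Stand-in for MLIRContext.
struct SketchContext {};

// Stand-in for a `Type` argument that knows which context owns it.
struct SketchElementType {
  SketchContext *context;
  SketchContext *getContext() const { return context; }
};

class BuiltTypeSketch {
public:
  // Shape of `TypeBuilder<(ins CArg<"int", "0">:$intParam)>`: the context
  // parameter comes first, and `CArg` supplies the default value.
  static BuiltTypeSketch get(SketchContext *context, int intParam = 0) {
    return BuiltTypeSketch(context, intParam);
  }

  // Shape of `TypeBuilderWithInferredContext<(ins "Type":$typeParam)>`: no
  // context parameter, because it is recovered from the argument.
  static BuiltTypeSketch get(SketchElementType typeParam) {
    return get(typeParam.getContext());
  }

  SketchContext *getContext() const { return context; }
  int getIntParam() const { return intParam; }

private:
  BuiltTypeSketch(SketchContext *context, int intParam)
      : context(context), intParam(intParam) {}

  SketchContext *context;
  int intParam;
};
```

Both overloads end up with the same context, and the second spares callers from passing it explicitly when an argument already carries it.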

## Debugging Tips

### Run `mlir-tblgen` to see the generated content

TableGen syntax sometimes can be obscure; reading the generated content can be a
very helpful way to understand and debug issues. To build `mlir-tblgen`, run
`cmake --build . --target mlir-tblgen` in your build directory and find the
`mlir-tblgen` binary in the `bin/` subdirectory. All the supported generators
can be found via `mlir-tblgen --help`; examples include `--gen-op-decls` and
`--gen-op-defs`, as explained in [Generated C++ code](#generated-c-code).

To see the generated code, invoke `mlir-tblgen` with a specific generator,
providing include paths via `-I`. For example,

```sh
# To see op C++ class declaration
mlir-tblgen --gen-op-decls -I /path/to/mlir/include /path/to/input/td/file
# To see op C++ class definition
mlir-tblgen --gen-op-defs -I /path/to/mlir/include /path/to/input/td/file
# To see op documentation
mlir-tblgen --gen-dialect-doc -I /path/to/mlir/include /path/to/input/td/file

# To see op interface C++ class declaration
mlir-tblgen --gen-op-interface-decls -I /path/to/mlir/include /path/to/input/td/file
# To see op interface C++ class definition
mlir-tblgen --gen-op-interface-defs -I /path/to/mlir/include /path/to/input/td/file
# To see op interface documentation
mlir-tblgen --gen-op-interface-doc -I /path/to/mlir/include /path/to/input/td/file
```

## Appendix

### Requirements and existing mechanisms analysis

The op description should be as declarative as possible to allow a wide range of
tools to work with it and to query methods generated from it. In particular,
this means specifying traits, constraints and shape inference information in a
way that is easily analyzable (e.g., avoid opaque calls to C++ functions where
possible).

We considered the approaches of several contemporary systems and focused on
requirements that were desirable:

*   Ops registered using a registry separate from C++ code.
    *   Unknown ops are allowed in MLIR, so ops need not be registered. The
        ability of the compiler to optimize those ops or graphs containing those
        ops is constrained but correct.
    *   The current proposal does not include a runtime op description, but it
        does not preclude such a description; it can be added later.
    *   The op registry is essential for generating C++ classes that make
        manipulating ops, verifying correct construction etc. in C++ easier by
        providing a typed representation and accessors.
*   The op registry will be defined in
    [TableGen](https://llvm.org/docs/TableGen/index.html) and be used to
    generate C++ classes and utility functions
    (builder/verifier/parser/printer).
    *   TableGen is a modelling specification language used by LLVM's backends
        and fits in well with trait-based modelling. This is an implementation
        decision and there are alternative ways of doing this. But the
        specification language is good for the requirements of modelling the
        traits (as seen from usage in LLVM processor backend modelling) and easy
        to extend, so it is a practical choice. If another good option comes up,
        we will consider it.
*   MLIR allows both defined and undefined ops.
    *   Defined ops should have fixed semantics and could have a corresponding
        reference implementation defined.
    *   Dialects are under full control of the dialect owner and normally live
        with the framework of the dialect.
*   The op's traits (e.g., commutative) are modelled along with the op in the
    registry.
*   The op's operand/return type constraints are modelled along with the op in
    the registry (see [Shape inference](ShapeInference.md) discussion below);
    this allows (e.g.) optimized concise syntax in textual dumps.
*   Behavior of the op is documented along with the op with a summary and a
    description. The description is written in markdown and extracted for
    inclusion in the generated LangRef section of the dialect.
*   The generic assembly form of printing and parsing is available as normal,
    but a custom parser and printer can either be specified or automatically
    generated from an optional string representation showing the mapping of the
    "assembly" string to operands/type.
    *   Parser-level remappings (e.g., `eq` to enum) will be supported as part
        of the parser generation.
*   Matching patterns are specified separately from the op description.
    *   Contrasted with LLVM there is no "base" set of ops that every backend
        needs to be aware of. Instead there are many different dialects and the
        transformations/legalizations between these dialects form a graph of
        transformations.
*   Reference implementation may be provided along with the op definition.

    *   The reference implementation may be in terms of either standard ops or
        other reference implementations.

    TODO: document expectation if the dependent op's definition changes.

[TableGen]: https://llvm.org/docs/TableGen/index.html
[TableGenProgRef]: https://llvm.org/docs/TableGen/ProgRef.html
[TableGenBackend]: https://llvm.org/docs/TableGen/BackEnds.html#introduction
[OpBase]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/OpBase.td
[OpDefinitionsGen]: https://github.com/llvm/llvm-project/blob/main/mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
[EnumsGen]: https://github.com/llvm/llvm-project/blob/main/mlir/tools/mlir-tblgen/EnumsGen.cpp
[StringAttr]: Dialects/Builtin.md/#stringattr
[IntegerAttr]: Dialects/Builtin.md/#integertype
[AttrClasses]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/Attributes.h