# Operation Definition Specification (ODS)

In addition to specializing the `mlir::Op` C++ template, MLIR also supports
defining operations and data types in a table-driven manner. This is achieved
via [TableGen][TableGen], which is both a generic language and its tooling to
maintain records of domain-specific information. Facts regarding an operation
are specified concisely into a TableGen record, which will be expanded into an
equivalent `mlir::Op` C++ template specialization at compiler build time.

This manual explains in detail all the available mechanisms for defining
operations in such a table-driven manner. It aims to be a specification instead
of a tutorial. Please refer to
[Quickstart tutorial to adding MLIR graph rewrite](Tutorials/QuickstartRewrites.md)
for the latter.

In addition to detailing each mechanism, this manual also tries to capture best
practices. They are rendered as quoted bullet points.

## Motivation

MLIR allows pluggable dialects, and dialects contain, among others, a list of
operations. This open and extensible ecosystem leads to the "stringly" type IR
problem, e.g., repetitive string comparisons during optimization and analysis
passes, unintuitive accessor methods (e.g., generic/error prone `getOperand(3)`
vs self-documenting `getStride()`) with more generic return types, verbose and
generic constructors without default arguments, verbose textual IR dumps, and so
on. Furthermore, operation verification is:

1. best case: a central string-to-verification-function map,
1. middle case: duplication of verification across the code base, or
1. worst case: no verification functions.

The fix is to support defining ops in a table-driven manner. Then for each
dialect, we can have a central place that contains everything you need to know
about each op, including its constraints, custom assembly form, etc.
This description is also used to generate helper functions and classes to allow
building, verification, parsing, printing, analysis, and many more.

## Benefits

Compared to the C++ template, this table-driven approach has several benefits
including but not limited to:

*   **Single source of truth**: We strive to encode all facts regarding an
    operation into the record, so that readers don't need to jump among code
    snippets to fully understand an operation.
*   **Removing boilerplate**: We can automatically generate
    operand/attribute/result getter methods, operation build methods, operation
    verify methods, and many more utilities from the record. This greatly
    reduces the boilerplate needed for defining a new op.
*   **Facilitating auto-generation**: The usage of these operation information
    records is by no means limited to op definition itself. We can use them to
    drive the auto-generation of many other components, like computation graph
    serialization.

## TableGen Syntax

We use TableGen as the language for specifying operation information. TableGen
itself just provides syntax for writing records; the syntax and constructs
allowed in a TableGen file (typically with the filename suffix `.td`) can be found
[here][TableGenProgRef].

*   TableGen `class` is similar to a C++ class; it can be templated and
    subclassed.
*   TableGen `def` is similar to a C++ object; it can be declared by
    specializing a TableGen `class` (e.g., `def MyDef : MyClass<...>;`) or
    completely independently (e.g., `def MyDef;`). It cannot be further
    templated or subclassed.
*   TableGen `dag` is a dedicated type for a directed acyclic graph of
    elements. A `dag` has one operator and zero or more arguments. Its syntax is
    `(operator arg0, arg1, argN)`. The operator can be any TableGen `def`; an
    argument can be anything, including `dag` itself.
    We can have names attached to both the operator and the arguments like
    `(MyOp:$op_name MyArg:$arg_name)`.

Please see the [language reference][TableGenProgRef] to learn about all the
types and expressions supported by TableGen.

## Operation Definition

MLIR defines several common constructs to help operation definition and provides
their semantics via a special [TableGen backend][TableGenBackend]:
[`OpDefinitionsGen`][OpDefinitionsGen]. These constructs are defined in
[`OpBase.td`][OpBase]. The main ones are:

*   The `Op` class: It is the main construct for defining operations. All facts
    regarding the operation are specified when specializing this class, with the
    help of the following constructs.
*   The `Dialect` class: Operations belonging to one logical group are placed in
    the same dialect. The `Dialect` class contains dialect-level information.
*   The `OpTrait` class hierarchy: They are used to specify special properties
    and constraints of the operation, including whether the operation has side
    effects or whether its output has the same shape as the input.
*   The `ins`/`outs` marker: These are two special markers built into the
    `OpDefinitionsGen` backend. They lead to the definitions of
    operands/attributes and results respectively.
*   The `TypeConstraint` class hierarchy: They are used to specify the
    constraints over operands or results. A notable subclass hierarchy is
    `Type`, which stands for constraints for common C++ types.
*   The `AttrConstraint` class hierarchy: They are used to specify the
    constraints over attributes. A notable subclass hierarchy is `Attr`, which
    stands for constraints for attributes whose values are of common types.

An operation is defined by specializing the `Op` class with concrete contents
for all the fields it requires.
For example, `tf.AvgPool` is defined as

```tablegen
def TF_AvgPoolOp : TF_Op<"AvgPool", [NoSideEffect]> {
  let summary = "Performs average pooling on the input.";

  let description = [{
Each entry in `output` is the mean of the corresponding size `ksize`
window in `value`.
  }];

  let arguments = (ins
    TF_FpTensor:$value,

    Confined<I64ArrayAttr, [ArrayMinCount<4>]>:$ksize,
    Confined<I64ArrayAttr, [ArrayMinCount<4>]>:$strides,
    TF_AnyStrAttrOf<["SAME", "VALID"]>:$padding,
    DefaultValuedAttr<TF_ConvertDataFormatAttr, "NHWC">:$data_format
  );

  let results = (outs
    TF_FpTensor:$output
  );

  TF_DerivedOperandTypeAttr T = TF_DerivedOperandTypeAttr<0>;
}
```

In the following we describe all the fields needed. Please see the definition of
the `Op` class for the complete list of fields supported.

### Operation name

The operation name is a unique identifier for the operation within MLIR, e.g.,
`tf.Add` for the addition operation in the TensorFlow dialect. This is the
equivalent of the mnemonic in assembly language. It is used for parsing and
printing in the textual format. It is also used for pattern matching in graph
rewrites.

The full operation name is composed of the dialect name and the op name, with
the former provided via the dialect and the latter provided as the second
template parameter to the `Op` class.

### Operation documentation

This includes both a one-line `summary` and a longer human-readable
`description`. They will be used to drive automatic generation of dialect
documentation. They need to be provided in the operation's definition body:

```tablegen
let summary = "...";

let description = [{
...
}];
```

`description` should be written in Markdown syntax.

Placing the documentation at the beginning is recommended since it helps in
understanding the operation.

> *   Place documentation at the beginning of the operation definition
> *   The summary should be short and concise. It should be a one-liner without
>     trailing punctuation. Put expanded explanation in description.

### Operation arguments

There are two kinds of arguments: operands and attributes. Operands are runtime
values produced by other ops, while attributes are compile-time known constant
values, including two categories:

1.  Natural attributes: these attributes affect the behavior of the operations
    (e.g., padding for convolution);
1.  Derived attributes: these attributes are not needed to define the operation
    but are instead derived from information of the operation, e.g., the output
    shape of a type. This is mostly used for convenience interface generation or
    interaction with other frameworks/translation.

    All derived attributes should be materializable as an Attribute. That is,
    even though they are not materialized, it should be possible to store them
    as attributes.

Both operands and attributes are specified inside the `dag`-typed `arguments`,
led by `ins`:

```tablegen
let arguments = (ins
  <type-constraint>:$<operand-name>,
  ...
  <attr-constraint>:$<attr-name>,
  ...
);
```

Here `<type-constraint>` is a TableGen `def` from the `TypeConstraint` class
hierarchy. Similarly, `<attr-constraint>` is a TableGen `def` from the
`AttrConstraint` class hierarchy. See [Constraints](#constraints) for more
information.

There are no requirements on the relative order of operands and attributes; they
can mix freely. The relative order of operands themselves matters. From each
named argument a named getter will be generated that returns the argument with
the return type (in the case of attributes the return type will be constructed
from the storage type, while for operands it will be `Value`).
Each attribute's raw value (e.g., as stored) can also be accessed via generated
`<name>Attr` getters for use in transformation passes where the more
user-friendly return type is less suitable.

All the arguments should be named to:

- provide documentation,
- drive auto-generation of getter methods, and
- provide a handle to reference for other places like constraints.

#### Variadic operands

To declare a variadic operand, wrap the `TypeConstraint` for the operand with
`Variadic<...>`.

Normally operations have no variadic operands or just one variadic operand. For
the latter case, it is easy to deduce which dynamic operands are for the static
variadic operand definition. However, if an operation has more than one
variable-length operand (either optional or variadic), it would be impossible to
attribute dynamic operands to the corresponding static variadic operand
definitions without further information from the operation. Therefore, either
the `SameVariadicOperandSize` trait is needed to indicate that all
variable-length operands have the same number of dynamic values, or the
`AttrSizedOperandSegments` trait is needed to record the size of each operand
segment in an attribute.

#### VariadicOfVariadic operands

To declare a variadic operand that has a variadic number of sub-ranges, wrap the
`TypeConstraint` for the operand with `VariadicOfVariadic<...,
"<segment-attribute-name>">`.

The second field of the `VariadicOfVariadic` is the name of an `I32ElementsAttr`
argument that contains the sizes of the variadic sub-ranges. This attribute will
be used when determining the size of sub-ranges, or when updating the size of
sub-ranges.

#### Optional operands

To declare an optional operand, wrap the `TypeConstraint` for the operand with
`Optional<...>`.

Normally operations have no optional operands or just one optional operand.
For the latter case, it is easy to deduce which dynamic operands are for the
static operand definition. However, if an operation has more than one
variable-length operand (either optional or variadic), it would be impossible to
attribute dynamic operands to the corresponding static variadic operand
definitions without further information from the operation. Therefore, either
the `SameVariadicOperandSize` trait is needed to indicate that all
variable-length operands have the same number of dynamic values, or the
`AttrSizedOperandSegments` trait is needed to record the size of each operand
segment in an attribute.

#### Optional attributes

To declare an optional attribute, wrap the `AttrConstraint` for the attribute
with `OptionalAttr<...>`.

#### Attributes with default values

To declare an attribute with a default value, wrap the `AttrConstraint` for the
attribute with `DefaultValuedAttr<..., "...">`.

The second parameter to `DefaultValuedAttr` should be a string containing the
C++ default value. For example, a float default value should be specified like
`"0.5f"`, and an integer array default value should be specified like
`"{1, 2, 3}"`.

#### Confining attributes

`Confined` is provided as a general mechanism to help modelling further
constraints on attributes beyond the ones brought by value types. You can use
`Confined` to compose complex constraints out of more primitive ones. For
example, a 32-bit integer attribute whose minimum value must be 10 can be
expressed as `Confined<I32Attr, [IntMinValue<10>]>`.
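
As a minimal sketch of how such composed constraints might appear on an op, the
following hypothetical definition confines two attributes. The base class
`MyDialect_Op`, the op name, and the attribute names are assumptions made for
illustration and do not belong to any real dialect; the primitive constraints
are the ones described in this section.

```tablegen
// Hypothetical op for illustration only; `MyDialect_Op` is assumed to be a
// dialect-specific subclass of `Op`.
def MyDialect_PoolOp : MyDialect_Op<"pool", []> {
  let arguments = (ins
    // 32-bit integer attribute whose value must lie in [1, 8].
    Confined<I32Attr, [IntMinValue<1>, IntMaxValue<8>]>:$rank,
    // 64-bit integer array attribute with at least 2 elements.
    Confined<I64ArrayAttr, [ArrayMinCount<2>]>:$window
  );
}
```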

Right now, the following primitive constraints are supported:

*   `IntMinValue<N>`: Specifying an integer attribute to be greater than or
    equal to `N`
*   `IntMaxValue<N>`: Specifying an integer attribute to be less than or equal
    to `N`
*   `ArrayMinCount<N>`: Specifying an array attribute to have at least `N`
    elements
*   `IntArrayNthElemEq<I, N>`: Specifying an integer array attribute's `I`-th
    element to be equal to `N`
*   `IntArrayNthElemMinValue<I, N>`: Specifying an integer array attribute's
    `I`-th element to be greater than or equal to `N`

TODO: Design and implement more primitive constraints

### Operation regions

The regions of an operation are specified inside of the `dag`-typed `regions`,
led by `region`:

```tablegen
let regions = (region
  <region-constraint>:$<region-name>,
  ...
);
```

#### Variadic regions

Similar to the `Variadic` class used for variadic operands and results,
`VariadicRegion<...>` can be used for regions. Variadic regions can currently
only be specified as the last region in the regions list.

### Operation results

Similar to operands, results are specified inside the `dag`-typed `results`, led
by `outs`:

```tablegen
let results = (outs
  <type-constraint>:$<result-name>,
  ...
);
```

#### Variadic results

Similar to variadic operands, `Variadic<...>` can also be used for results.
Likewise, `SameVariadicResultSize` can be used for multiple variadic results in
the same operation.

### Operation successors

For terminator operations, the successors are specified inside of the
`dag`-typed `successors`, led by `successor`:

```tablegen
let successors = (successor
  <successor-constraint>:$<successor-name>,
  ...
);
```

#### Variadic successors

Similar to the `Variadic` class used for variadic operands and results,
`VariadicSuccessor<...>` can be used for successors. Variadic successors can
currently only be specified as the last successor in the successor list.

### Operation traits and constraints

Traits are operation properties that affect syntax or semantics. MLIR C++ models
various traits in the `mlir::OpTrait` namespace.

Operation traits, [interfaces](Interfaces.md/#utilizing-the-ods-framework),
and constraints involving multiple operands/attributes/results are all provided
as the third template parameter to the `Op` class. They should derive from
the `OpTrait` class. See [Constraints](#constraints) for more information.

### Builder methods

For each operation, there are a few builders automatically generated based on
the arguments and return types. For example, given the following op definition:

```tablegen
def MyOp : ... {
  let arguments = (ins
    I32:$i32_operand,
    F32:$f32_operand,
    ...,

    I32Attr:$i32_attr,
    F32Attr:$f32_attr,
    ...
  );

  let results = (outs
    I32:$i32_result,
    F32:$f32_result,
    ...
  );
}
```

The following builders are generated:

```c++
// All result-types/operands/attributes have one aggregate parameter.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  TypeRange resultTypes,
                  ValueRange operands,
                  ArrayRef<NamedAttribute> attributes);

// Each result-type/operand/attribute has a separate parameter. The parameters
// for attributes are of mlir::Attribute types.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  Type i32_result, Type f32_result, ...,
                  Value i32_operand, Value f32_operand, ...,
                  IntegerAttr i32_attr, FloatAttr f32_attr, ...);

// Each result-type/operand/attribute has a separate parameter.
// The parameters for attributes are raw values, not wrapped in mlir::Attribute
// instances. (Note that this builder will not always be generated. See the
// following explanation for more details.)
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  Type i32_result, Type f32_result, ...,
                  Value i32_operand, Value f32_operand, ...,
                  APInt i32_attr, APFloat f32_attr, ...);

// Each operand/attribute has a separate parameter but result type is aggregate.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  TypeRange resultTypes,
                  Value i32_operand, Value f32_operand, ...,
                  IntegerAttr i32_attr, FloatAttr f32_attr, ...);

// All operands/attributes have aggregate parameters.
// Generated if return type can be inferred.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  ValueRange operands, ArrayRef<NamedAttribute> attributes);

// (And manually specified builders depending on the specific op.)
```

The first form provides basic uniformity so that we can create ops using the
same form regardless of the exact op. This is particularly useful for
implementing declarative pattern rewrites.

The second and third forms are good for use in manually written code, given that
they provide better guarantees via signatures.

The third form will be generated if any of the op's attributes has a different
`Attr.returnType` from `Attr.storageType` and we know how to build an attribute
from an unwrapped value (i.e., `Attr.constBuilderCall` is defined).
Additionally, for the third form, if an attribute appearing later in the
`arguments` list has a default value, the default value will be supplied in the
declaration. This works for `BoolAttr`, `StrAttr`, `EnumAttr` for now and the
list can grow in the future. So if possible, the default-valued attribute should be
placed at the end of the `arguments` list to leverage this feature.
(This behavior is essentially due to C++ function parameter default value
placement restrictions.) Otherwise, the builder of the third form will still be
generated but default values for the attributes not at the end of the
`arguments` list will not be supplied in the builder's signature.

ODS will generate a builder that doesn't require the return type to be
specified if:

*   The op implements the `InferTypeOpInterface` interface; or
*   All return types are either buildable types or are the same as a given
    operand (e.g., `AllTypesMatch` constraint between operand and result).

There may also exist other builders depending on the specific op; please refer
to the
[generated C++ file](#run-mlir-tblgen-to-see-the-generated-content) for the
complete list.

#### Custom builder methods

However, if the above cases cannot satisfy all needs, you can define additional
convenience build methods in the `builders` field as follows.

```tablegen
def MyOp : Op<"my_op", []> {
  let arguments = (ins F32Attr:$attr);

  let builders = [
    OpBuilder<(ins "float":$val)>
  ];
}
```

The `builders` field is a list of custom builders that are added to the Op
class. In this example, we provide a convenience builder that takes a floating
point value instead of an attribute. The `ins` prefix is common to many function
declarations in ODS, which use a TableGen [`dag`](#tablegen-syntax). What
follows is a comma-separated list of types (quoted string) and names prefixed
with the `$` sign. This will generate the declaration of a builder method that
looks like:

```c++
class MyOp : /*...*/ {
  /*...*/
  static void build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
                    float val);
};
```

Note that the method has two additional leading arguments. These arguments are
useful to construct the operation.
In particular, the method must populate
`state` with attributes, operands, regions and result types of the operation to
be constructed. `builder` can be used to construct any IR objects that belong to
the Op, such as types or nested operations. Since the type and name are
generated as is in the C++ code, they should be valid C++ constructs for a type
(in the namespace of the Op) and an identifier (e.g., `class` is not a valid
identifier).

Implementations of the builder can be provided directly in ODS, using a TableGen
code block as follows.

```tablegen
def MyOp : Op<"my_op", []> {
  let arguments = (ins F32Attr:$attr);

  let builders = [
    OpBuilder<(ins "float":$val), [{
      $_state.addAttribute("attr", $_builder.getF32FloatAttr(val));
    }]>
  ];
}
```

The equivalents of `builder` and `state` arguments are available as `$_builder`
and `$_state` special variables. The named arguments listed in the `ins` part
are available directly, e.g. `val`. The body of the builder will be generated by
substituting special variables and should otherwise be valid C++. While there is
no limitation on the code size, we encourage one to define only short builders
inline in ODS and put definitions of longer builders in C++ files.

Finally, if some arguments need a default value, they can be defined using
`CArg` to wrap the type and this value as follows.

```tablegen
def MyOp : Op<"my_op", []> {
  let arguments = (ins F32Attr:$attr);

  let builders = [
    OpBuilder<(ins CArg<"float", "0.5f">:$val), [{
      $_state.addAttribute("attr", $_builder.getF32FloatAttr(val));
    }]>
  ];
}
```

The generated code will use the default value in the declaration, but not in the
definition, as required by C++.

```c++
/// Header file.
class MyOp : /*...*/ {
  /*...*/
  static void build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
                    float val = 0.5f);
};

/// Source file.
void MyOp::build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
                 float val) {
  state.addAttribute("attr", builder.getF32FloatAttr(val));
}
```

**Deprecated:** `OpBuilder` class allows one to specify the custom builder
signature as a raw string, without separating parameters into different `dag`
arguments. It also supports leading parameters of `OpBuilder &` and
`OperationState &` types, which will be used instead of the autogenerated ones
if present.

### Custom parser and printer methods

Functions to parse and print the operation's custom assembly form.

### Custom verifier code

Verification code will be automatically generated for
[constraints](#constraints) specified on various entities of the op. To perform
_additional_ verification, you can use

```tablegen
let hasVerifier = 1;
let hasRegionVerifier = 1;
```

This will generate `LogicalResult verify()`/`LogicalResult verifyRegions()`
method declarations on the op class that can be defined with any additional
verification constraints. For verification that needs to access the nested
operations, you should use `hasRegionVerifier` to ensure that it won't access
any ill-formed operation. All other verification can be implemented with
`hasVerifier`. Check the next section for the execution order of these
verification methods.

#### Verification Ordering

The verification of an operation involves several steps:

1.  `StructuralOpTrait`s are verified first; they can be run independently.
2.  `verifyInvariants`, which is constructed by ODS, verifies the types,
    attributes, etc.
3.  Other Traits/Interfaces that have marked their verifier as `verifyTrait` or
    `verifyWithRegions=0`.
4.
    Custom verifier that is defined in the op and has been marked
    `hasVerifier=1`.

If an operation has regions, then it may have a second phase:

1.  Traits/Interfaces that have marked their verifier as `verifyRegionTrait` or
    `verifyWithRegions=1`. This implies the verifier needs to access the
    operations in its regions.
2.  Custom verifier that is defined in the op and has been marked
    `hasRegionVerifier=1`.

Note that the second phase will be run after the operations in the region are
verified. Verifiers further down the order can rely on certain invariants being
verified by a previous verifier and do not need to re-verify them.

#### Emitting diagnostics in custom verifiers

Custom verifiers should avoid printing operations using custom operation
printers, because they require the printed operation (and sometimes its parent
operation) to be verified first. In particular, when emitting diagnostics,
custom verifiers should use the `Error` severity level, which prints operations
in generic form by default, and avoid using lower severity levels (`Note`,
`Remark`, `Warning`).

### Declarative Assembly Format

The custom assembly form of the operation may be specified in a declarative
string that matches the operation's operands, attributes, etc., with the ability
to express additional information that needs to be parsed to build the
operation:

```tablegen
def CallOp : Std_Op<"call", ...> {
  let arguments = (ins FlatSymbolRefAttr:$callee, Variadic<AnyType>:$args);
  let results = (outs Variadic<AnyType>);

  let assemblyFormat = [{
    $callee `(` $args `)` attr-dict `:` functional-type($args, results)
  }];
}
```

The format is comprised of three components:

#### Directives

A directive is a type of builtin function, with an optional set of arguments.
The available directives are as follows:

*   `attr-dict`

    -   Represents the attribute dictionary of the operation.

*   `attr-dict-with-keyword`

    -   Represents the attribute dictionary of the operation, but prefixes the
        dictionary with an `attributes` keyword.

*   `custom` < UserDirective > ( Params )

    -   Represents a custom directive implemented by the user in C++.
    -   See the [Custom Directives](#custom-directives) section below for more
        details.

*   `functional-type` ( inputs , results )

    -   Formats the `inputs` and `results` arguments as a
        [function type](Dialects/Builtin.md/#functiontype).
    -   The constraints on `inputs` and `results` are the same as the `input` of
        the `type` directive.

*   `oilist` ( \`keyword\` elements | \`otherKeyword\` elements ...)

    -   Represents an optional order-independent list of clauses. Each clause
        has a keyword and corresponding assembly format.
    -   Each clause can appear 0 or 1 time (in any order).
    -   Only literals, types and variables can be used within an oilist element.
    -   All the variables must be optional or variadic.

*   `operands`

    -   Represents all of the operands of an operation.

*   `ref` ( input )

    -   Represents a reference to a variable or directive that must have
        already been resolved and that may be used as a parameter to a `custom`
        directive.
    -   Used to pass previously parsed entities to custom directives.
    -   The input may be any directive or variable, aside from `functional-type`
        and `custom`.

*   `regions`

    -   Represents all of the regions of an operation.

*   `results`

    -   Represents all of the results of an operation.

*   `successors`

    -   Represents all of the successors of an operation.

*   `type` ( input )

    -   Represents the type of the given input.
    -   `input` must be either an operand or result [variable](#variables), the
        `operands` directive, or the `results` directive.

*   `qualified` ( type_or_attribute )

    -   Wraps a `type` directive or an attribute parameter.
    -   Used to force printing the type or attribute prefixed with its dialect
        and mnemonic. For example, the `vector.multi_reduction` operation has a
        `kind` attribute; by default the declarative assembly will print:
        `vector.multi_reduction <minf>, ...` but using `qualified($kind)` in the
        declarative assembly format will print it instead as:
        `vector.multi_reduction #vector.kind<minf>, ...`.

#### Literals

A literal is either a keyword or punctuation surrounded by \`\`.

The following are the set of valid punctuation:

`:`, `,`, `=`, `<`, `>`, `(`, `)`, `{`, `}`, `[`, `]`, `->`, `?`, `+`, `*`

The following are valid whitespace punctuation:

`\n`, ` `

The `\n` literal emits a newline and indents to the start of the operation. An
example is shown below:

```tablegen
let assemblyFormat = [{
  `{` `\n` ` ` ` ` `this_is_on_a_newline` `\n` `}` attr-dict
}];
```

```mlir
%results = my.operation {
  this_is_on_a_newline
}
```

An empty literal \`\` may be used to remove a space that is inserted implicitly
after certain literal elements, such as `)`/`]`/etc. For example, "`]`" may
result in `]` followed by a space if it is not the last element in the format.
"`]` \`\`" would trim the trailing space in this situation.

#### Variables

A variable is an entity that has been registered on the operation itself, i.e.
an argument (attribute or operand), region, result, successor, etc. In the
`CallOp` example above, the variables would be `$callee` and `$args`.

Attribute variables are printed with their respective value type, unless that
value type is buildable. In those cases, the type of the attribute is elided.
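
To make the interplay of directives, literals, and variables concrete, here is a
small sketch of a format for a hypothetical op. The base class `MyDialect_Op`,
the op name, and the `by` keyword are assumptions chosen for illustration and
are not taken from any real dialect.

```tablegen
// Hypothetical op for illustration only; `MyDialect_Op` is assumed to be a
// dialect-specific subclass of `Op`.
def MyDialect_ScaleOp : MyDialect_Op<"scale", []> {
  let arguments = (ins AnyType:$input, I64Attr:$factor);
  let results = (outs AnyType:$output);

  // `$input` and `$factor` are variables; `by`, `:` and `->` are literals;
  // `attr-dict` and `type(...)` are directives.
  let assemblyFormat = [{
    $input `by` $factor attr-dict `:` type($input) `->` type($output)
  }];
}
```

Under these assumptions, the type of `$factor` is expected to be elided from the
printed form, because `I64Attr` has a buildable value type.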

#### Custom Directives

The declarative assembly format specification allows for handling a large
majority of the common cases when formatting an operation. For the operations
that require or desire specifying parts of the operation in a form not supported
by the declarative syntax, custom directives may be specified. A custom
directive essentially allows for users to use C++ for printing and parsing
subsections of an otherwise declaratively specified format. Looking at the
specification of a custom directive above:

```
custom-directive ::= `custom` `<` UserDirective `>` `(` Params `)`
```

A custom directive has two main parts: The `UserDirective` and the `Params`. A
custom directive is transformed into a call to a `print*` and a `parse*` method
when generating the C++ code for the format. The `UserDirective` is an
identifier used as a suffix to these two calls, i.e., `custom<MyDirective>(...)`
would result in calls to `parseMyDirective` and `printMyDirective` within the
parser and printer respectively. `Params` may be any combination of variables
(i.e. Attribute, Operand, Successor, etc.), type directives, and `attr-dict`.
The type directives must refer to a variable, but that variable need not also be
a parameter to the custom directive.

The arguments to the `parse<UserDirective>` method are firstly a reference to
the `OpAsmParser` (`OpAsmParser &`), and secondly a set of output parameters
corresponding to the parameters specified in the format. The mapping of
declarative parameter to `parse` method argument is detailed below:

*   Attribute Variables
    -   Single: `<Attribute-Storage-Type>(e.g. Attribute) &`
    -   Optional: `<Attribute-Storage-Type>(e.g.
    Attribute) &`
* Operand Variables
  - Single: `OpAsmParser::UnresolvedOperand &`
  - Optional: `Optional<OpAsmParser::UnresolvedOperand> &`
  - Variadic: `SmallVectorImpl<OpAsmParser::UnresolvedOperand> &`
  - VariadicOfVariadic:
    `SmallVectorImpl<SmallVector<OpAsmParser::UnresolvedOperand>> &`
* Ref Directives
  - A reference directive is passed to the parser using the same mapping as
    the input operand. For example, a single region would be passed as a
    `Region &`.
* Region Variables
  - Single: `Region &`
  - Variadic: `SmallVectorImpl<std::unique_ptr<Region>> &`
* Successor Variables
  - Single: `Block *&`
  - Variadic: `SmallVectorImpl<Block *> &`
* Type Directives
  - Single: `Type &`
  - Optional: `Type &`
  - Variadic: `SmallVectorImpl<Type> &`
  - VariadicOfVariadic: `SmallVectorImpl<SmallVector<Type>> &`
* `attr-dict` Directive: `NamedAttrList &`

When a variable is optional, the value should only be specified if the variable
is present. Otherwise, the value should remain `None` or null.

The arguments to the `print<UserDirective>` method are firstly a reference to
the `OpAsmPrinter` (`OpAsmPrinter &`), secondly the op (e.g. `FooOp op`, which
can alternatively be `Operation *op`), and finally a set of parameters
corresponding to the parameters specified in the format. The mapping of
declarative parameter to `print` method argument is detailed below:

* Attribute Variables
  - Single: `<Attribute-Storage-Type>(e.g. Attribute)`
  - Optional: `<Attribute-Storage-Type>(e.g. Attribute)`
* Operand Variables
  - Single: `Value`
  - Optional: `Value`
  - Variadic: `OperandRange`
  - VariadicOfVariadic: `OperandRangeRange`
* Ref Directives
  - A reference directive is passed to the printer using the same mapping as
    the input operand. For example, a single region would be passed as a
    `Region &`.
* Region Variables
  - Single: `Region &`
  - Variadic: `MutableArrayRef<Region>`
* Successor Variables
  - Single: `Block *`
  - Variadic: `SuccessorRange`
* Type Directives
  - Single: `Type`
  - Optional: `Type`
  - Variadic: `TypeRange`
  - VariadicOfVariadic: `TypeRangeRange`
* `attr-dict` Directive: `DictionaryAttr`

When a variable is optional, the provided value may be null.

#### Optional Groups

In certain situations operations may have "optional" information, e.g.
attributes or an empty set of variadic operands. In these situations a section
of the assembly format can be marked as `optional` based on the presence of this
information. An optional group is defined as follows:

```
optional-group: `(` elements `)` (`:` `(` else-elements `)`)? `?`
```

The `elements` of an optional group have the following requirements:

* The first element of the group must be either an attribute, literal, operand,
  or region.
  - This is because the first element must be optionally parsable.
* Exactly one argument variable or type directive within the group must be
  marked as the anchor of the group.
  - The anchor is the element whose presence controls whether the group
    should be printed/parsed.
  - An element is marked as the anchor by adding a trailing `^`.
  - The first element is *not* required to be the anchor of the group.
  - When a non-variadic region anchors a group, the group is printed only if
    the region is non-empty.
* Literals, variables, custom directives, and type directives are the only
  valid elements within the group.
  - Any attribute variable may be used, but only optional attributes can be
    marked as the anchor.
  - Only variadic or optional result and operand arguments can be used.
  - All region variables can be used. When a non-variable length region is
    used, if the group is not present the region is empty.
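
As a rough illustration of the anchor rule, the sketch below models in plain C++
how a printer might treat a format such as
``attr-dict ($operands^ `:` type($operands))?``. It is not the code that ODS
generates: `ToyOperand` and `printToyOp` are hypothetical names, operands are
reduced to strings, and the attribute dictionary is assumed to be empty.

```c++
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical operand model: an SSA value name and the textual form of its
// type. Real code would work with mlir::Value and mlir::Type.
struct ToyOperand {
  std::string name;
  std::string type;
};

// Models printing for the format
//   attr-dict ($operands^ `:` type($operands))?
// assuming an empty attribute dictionary. `$operands` is the anchor.
std::string printToyOp(const std::string &opName,
                       const std::vector<ToyOperand> &operands) {
  std::string out = opName;
  // Anchor absent (no operands): the whole group, including the `:` literal
  // and the type list, is skipped.
  if (operands.empty())
    return out;

  std::string names, types;
  for (std::size_t i = 0; i < operands.size(); ++i) {
    if (i != 0) {
      names += ", ";
      types += ", ";
    }
    names += operands[i].name;
    types += operands[i].type;
  }
  return out + " " + names + " : " + types;
}
```

The point of the model is that the literal and the type directive inside the
group never appear on their own; they are emitted or skipped together with the
anchor.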

An example of an operation with an optional group is `func.return`, which has a
variadic number of operands.

```tablegen
def ReturnOp : ... {
  let arguments = (ins Variadic<AnyType>:$operands);

  // We only print the operands and types if there are a non-zero number
  // of operands.
  let assemblyFormat = "attr-dict ($operands^ `:` type($operands))?";
}
```

##### Unit Attributes

In MLIR, the [`unit` Attribute](Dialects/Builtin.md/#unitattr) is special in that it
only has one possible value, i.e. it derives meaning from its existence. When a
unit attribute is used to anchor an optional group and is not the first element
of the group, the presence of the unit attribute can be directly correlated with
the presence of the optional group itself. As such, in these situations the unit
attribute will not be printed or present in the output and will be automatically
inferred when parsing by the presence of the optional group itself.

For example, the following operation:

```tablegen
def FooOp : ... {
  let arguments = (ins UnitAttr:$is_read_only);

  let assemblyFormat = "attr-dict (`is_read_only` $is_read_only^)?";
}
```

would be formatted as such:

```mlir
// When the unit attribute is present:
foo.op is_read_only

// When the unit attribute is not present:
foo.op
```

##### Optional "else" Group

Optional groups also have support for an "else" group of elements. These are
elements that are parsed/printed if the `anchor` element of the optional group
is *not* present. Unlike the main element group, the "else" group has no
restriction on the first element and none of the elements may act as the
`anchor` for the optional. An example is shown below:

```tablegen
def FooOp : ... {
  let arguments = (ins UnitAttr:$foo);

  let assemblyFormat = "attr-dict (`foo_is_present` $foo^):(`foo_is_absent`)?";
}
```

would be formatted as such:

```mlir
// When the `foo` attribute is present:
foo.op foo_is_present

// When the `foo` attribute is not present:
foo.op foo_is_absent
```

#### Requirements

The format specification has a certain set of requirements that must be adhered
to:

1. The output and operation name are never shown as they are fixed and cannot
   be altered.
1. All operands within the operation must appear within the format, either
   individually or with the `operands` directive.
1. All regions within the operation must appear within the format, either
   individually or with the `regions` directive.
1. All successors within the operation must appear within the format, either
   individually or with the `successors` directive.
1. All operand and result types must appear within the format using the various
   `type` directives, either individually or with the `operands` or `results`
   directives.
1. The `attr-dict` directive must always be present.
1. Must not contain overlapping information; e.g. multiple instances of
   'attr-dict', types, operands, etc.
   - Note that `attr-dict` does not overlap with individual attributes. These
     attributes will simply be elided when printing the attribute dictionary.

##### Type Inference

One requirement of the format is that the types of operands and results must
always be present. In certain instances, the type of a variable may be deduced
via type constraints or other information available. In these cases, the type of
that variable may be elided from the format.

* Buildable Types

Some type constraints may only have one representation, allowing for them to be
directly buildable; for example the `I32` or `Index` types.
Types in `ODS` may mark themselves as buildable by setting the `builderCall`
field or inheriting from the `BuildableType` class.

* Trait Equality Constraints

There are many operations that have known type equality constraints registered
as traits on the operation; for example the true, false, and result values of a
`select` operation often have the same type. The assembly format may inspect
these equality constraints to discern the types of missing variables. The
currently supported traits are: `AllTypesMatch`, `TypesMatchWith`,
`SameTypeOperands`, and `SameOperandsAndResultType`.

* InferTypeOpInterface

Operations that implement `InferTypeOpInterface` can omit their result types in
their assembly format since the result types can be inferred from the operands.

### `hasCanonicalizer`

This boolean field indicates whether canonicalization patterns have been defined
for this operation. If it is `1`, then `::getCanonicalizationPatterns()` should
be defined.

### `hasCanonicalizeMethod`

When this boolean field is set to `true`, it indicates that the op implements a
`canonicalize` method for simple "matchAndRewrite" style canonicalization
patterns. If `hasCanonicalizer` is 0, then an implementation of
`::getCanonicalizationPatterns()` is generated to call this method.

### `hasFolder`

This boolean field indicates whether general folding rules have been defined for
this operation. If it is `1`, then `::fold()` should be defined.

### Extra declarations

One of the goals of table-driven op definition is to auto-generate as much logic
and methods needed for each op as possible. With that said, there will always be
long-tail cases that won't be covered. For such cases, you can use
`extraClassDeclaration`. Code in `extraClassDeclaration` will be copied
literally to the generated C++ op class.

Note that `extraClassDeclaration` is a mechanism intended for long-tail cases by
power users; for not-yet-implemented widely-applicable cases, improving the
infrastructure is preferable.

### Extra definitions

When defining base op classes in TableGen that are inherited many times by
different ops, users may want to provide common definitions of utility and
interface functions. However, many of these definitions may not be desirable or
possible in `extraClassDeclaration`, which appends them to the op's C++ class
declaration. In these cases, users can add an `extraClassDefinition` to define
code that is added to the generated source file inside the op's C++ namespace.
The substitution `$cppClass` is replaced by the op's C++ class name.

### Generated C++ code

[OpDefinitionsGen][OpDefinitionsGen] processes the op definition spec file and
generates two files containing the corresponding C++ code: one for declarations,
the other for definitions. The former is generated via the `-gen-op-decls`
command-line option, while the latter is via the `-gen-op-defs` option.

The definition file contains all the op method definitions, which can be
included and enabled by defining `GET_OP_CLASSES`. For each operation,
OpDefinitionsGen generates an operation class and an
[operand adaptor](#operand-adaptors) class. It also contains a
comma-separated list of all defined ops, which can be included and enabled by
defining `GET_OP_LIST`.

#### Class name and namespaces

For each operation, its generated C++ class name is the symbol `def`ed with
TableGen with dialect prefix removed. The first `_` serves as the delimiter. For
example, for `def TF_AddOp`, the C++ class name would be `AddOp`. We remove the
`TF` prefix because it is for scoping ops; other dialects may as well define
their own `AddOp`s.
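
The naming rule can be sketched as a small standalone C++ function. This is only
a model of the rule as stated, not the code used by `mlir-tblgen`;
`cppClassNameFromDef` is a hypothetical name, and returning a name without any
`_` unchanged is an assumption of this sketch.

```c++
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper that models the naming rule only. Everything up to and
// including the first '_' is treated as the dialect prefix and dropped.
// Assumption: a def name without '_' is returned unchanged.
std::string cppClassNameFromDef(std::string_view defName) {
  const std::size_t delimiter = defName.find('_');
  if (delimiter == std::string_view::npos)
    return std::string(defName);
  return std::string(defName.substr(delimiter + 1));
}
```

Because only the first `_` acts as the delimiter, a def name such as
`My_Fancy_Op` would map to `Fancy_Op` under this model.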

The namespaces of the generated C++ class will come from the dialect's
`cppNamespace` field. For example, if a dialect's `cppNamespace` is `A::B`, then
an op of that dialect will be placed in `namespace A { namespace B { ... } }`.
If a dialect does not specify a `cppNamespace`, we then use the dialect's name
as the namespace.

This means the qualified name of the generated C++ class does not necessarily
match exactly with the operation name as explained in
[Operation name](#operation-name). This is to allow flexible naming to satisfy
coding style requirements.

#### Operand adaptors

For each operation, we automatically generate an _operand adaptor_. This class
solves the problem of accessing operands provided as a list of `Value`s without
using "magic" constants. The operand adaptor takes a reference to an array of
`Value` and provides methods with the same names as those in the operation class
to access them. For example, for a binary arithmetic operation, it may provide
`.lhs()` to access the first operand and `.rhs()` to access the second operand.

The operand adaptor class lives in the same namespace as the operation class,
and has the name of the operation followed by `Adaptor` as well as an alias
`Adaptor` inside the op class.

Operand adaptors can be used in function templates that also process operations:

```c++
template <typename BinaryOpTy>
std::pair<Value, Value> zip(BinaryOpTy &&op) {
  return std::make_pair(op.lhs(), op.rhs());
}

void process(AddOp op, ArrayRef<Value> newOperands) {
  zip(op);
  // Use the `Adaptor` alias declared inside the op class.
  zip(AddOp::Adaptor(newOperands));
  /*...*/
}
```

## Constraints

Constraint is a core concept in table-driven operation definition: operation
verification and graph operation matching are all based on satisfying
constraints.
So both the operation definition and rewrite rules specification
significantly involve writing constraints. We have the `Constraint` class in
[`OpBase.td`][OpBase] as the common base class for all constraints.

An operation's constraint can cover different ranges; it may

* concern only a single attribute (e.g. being a 32-bit integer greater than
  5),
* concern multiple operands and results (e.g., the 1st result's shape must be
  the same as the 1st operand), or
* be intrinsic to the operation itself (e.g., having no side effect).

We call them single-entity constraints, multi-entity constraints, and traits,
respectively.

### Single-entity constraint

Constraints scoped to a single operand, attribute, or result are specified at
the entity's declaration place as described in
[Operation arguments](#operation-arguments) and
[Operation results](#operation-results).

To help modelling constraints of common types, a set of `TypeConstraint`s is
provided; they form the `Type` subclass hierarchy. It includes `F32` for the
constraints of being a float, `TensorOf<[F32]>` for the constraints of being a
float tensor, and so on.

Similarly, a set of `AttrConstraint`s is provided to help model constraints of
common attribute kinds. They form the `Attr` subclass hierarchy. It includes
`F32Attr` for the constraints of being a float attribute, `F32ArrayAttr` for the
constraints of being a float array attribute, and so on.

### Multi-entity constraint

Constraints involving more than one operand/attribute/result are quite common on
operations, like the element type and shape relation between operands and
results. These constraints should be specified as the `Op` class template
parameter as described in
[Operation traits and constraints](#operation-traits-and-constraints).

Multi-entity constraints are modeled as `PredOpTrait` (a subclass of `OpTrait`)
in [`OpBase.td`][OpBase]. A number of constraint primitives are provided to help
specification. See [`OpBase.td`][OpBase] for the complete list.

### Trait

Traits are intrinsic properties of the operation like having side effect or not,
commutative or not, whether it is a terminator, etc. These constraints should be
specified as the `Op` class template parameter as described in
[Operation traits and constraints](#operation-traits-and-constraints).

Traits are modeled as `NativeOpTrait` (a subclass of `OpTrait`) in
[`OpBase.td`][OpBase]. They are backed by, and will be translated into, the
corresponding C++ `mlir::OpTrait` classes.

### How to specify new constraint

To write a constraint, you need to provide its predicates and give it a
descriptive name. Predicates, modeled with the `Pred` class, are the workhorse
for composing constraints. The predicate for a constraint is typically built up
in a nested manner, using the two categories of predicates:

1. `CPred`: the primitive leaf predicate.
2. Compound predicate: a predicate composed from child predicates using
   predicate combiners (conjunction: `And`, disjunction: `Or`, negation: `Neg`,
   substitution: `SubstLeaves`, concatenation: `Concat`).

`CPred` is the basis for composing more complex predicates. It is the "atom"
predicate from the perspective of TableGen and the "interface" between TableGen
and C++. What is inside is already C++ code, which will be treated as opaque
strings with special placeholders to be substituted.

You can put any C++ code that returns a boolean value inside a `CPred`,
including evaluating expressions, calling functions, calling class methods, and
so on.
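
To make the leaf/compound split concrete, here is a minimal standalone C++ model
of how leaf predicates compose through conjunction, disjunction, and negation.
It is an analogy only: ODS combines predicates as TableGen records and emits C++
expressions from them, whereas this sketch evaluates closures directly. All
names (`ToyAttr`, `Pred`, `allOf`, `anyOf`, `negate`, `hasWidth`) are
hypothetical and not part of MLIR.

```c++
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for an attribute; real code would use mlir::Attribute.
struct ToyAttr {
  bool isInteger;
  unsigned bitWidth;
};

using Pred = std::function<bool(const ToyAttr &)>;

// Conjunction: holds only if every child predicate holds (the role of `And`).
Pred allOf(std::vector<Pred> children) {
  return [children = std::move(children)](const ToyAttr &attr) {
    for (const Pred &child : children)
      if (!child(attr))
        return false;
    return true;
  };
}

// Disjunction: holds if at least one child predicate holds (the role of `Or`).
Pred anyOf(std::vector<Pred> children) {
  return [children = std::move(children)](const ToyAttr &attr) {
    for (const Pred &child : children)
      if (child(attr))
        return true;
    return false;
  };
}

// Negation (the role of `Neg`).
Pred negate(Pred child) {
  return [child = std::move(child)](const ToyAttr &attr) {
    return !child(attr);
  };
}

// Leaf predicates play the role of `CPred`: opaque boolean C++ checks.
const Pred isIntegerAttr = [](const ToyAttr &attr) { return attr.isInteger; };

Pred hasWidth(unsigned width) {
  return [width](const ToyAttr &attr) { return attr.bitWidth == width; };
}

// Compound predicate: an integer attribute that is 8 or 16 bits wide.
const Pred isSmallInt =
    allOf({isIntegerAttr, anyOf({hasWidth(8), hasWidth(16)})});
```

Keeping the leaves small and composing them, rather than writing one large
check, is what lets the structure of a constraint stay visible to tooling.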

To help interaction with the C++ environment, there are a few special
placeholders provided to refer to entities in the context where this predicate
is used. They serve as "hooks" to the enclosing environment. This includes
`$_builder`, `$_op`, and `$_self`:

* `$_builder` will be replaced by a `mlir::Builder` instance so that you can
  access common build methods.
* `$_op` will be replaced by the current operation so that you can access
  information of the current operation.
* `$_self` will be replaced with the entity this predicate is attached to.
  E.g., `BoolAttr` is an attribute constraint that wraps a
  `CPred<"$_self.isa<BoolAttr>()">`. Then for `BoolAttr:$attr`, `$_self` will be
  replaced by `$attr`. Type constraints are a little special: since we want
  the constraints on each type definition to read naturally and we want to
  attach type constraints directly to an operand/result, `$_self` will be
  replaced by the operand/result's type. E.g., for `F32` in `F32:$operand`,
  its `$_self` will be expanded as `operand(...).getType()`.

TODO: Reconsider the leading symbol for special placeholders. Eventually we want
to allow referencing operand/result `$-name`s; such `$-name`s can start with
underscore.

For example, to express that an attribute `attr` is an `IntegerAttr`, in C++ you
can just call `attr.isa<IntegerAttr>()`. The code can be wrapped in a `CPred` as
`$_self.isa<IntegerAttr>()`, with `$_self` as the special placeholder to be
replaced by the current attribute `attr` at expansion time.

For more complicated predicates, you can wrap it in a single `CPred`, or you can
use predicate combiners to combine them.
For example, to write the constraint
that an attribute `attr` is a 32-bit or 64-bit integer, you can write it as

```tablegen
And<[
  CPred<"$_self.isa<IntegerAttr>()">,
  Or<[
    CPred<"$_self.cast<IntegerAttr>().getType().isInteger(32)">,
    CPred<"$_self.cast<IntegerAttr>().getType().isInteger(64)">
  ]>
]>
```

(Note that the above is just to show with a familiar example how you can use
`CPred` and predicate combiners to write complicated predicates. For integer
attributes specifically, [`OpBase.td`][OpBase] already defines `I32Attr` and
`I64Attr`. So you can actually reuse them to write it as `Or<[I32Attr.predicate,
I64Attr.predicate]>`.)

TODO: Build up a library of reusable primitive constraints

If the predicate is very complex to write with `CPred` together with predicate
combiners, you can also write it as a normal C++ function and use the `CPred` as
a way to "invoke" the function. For example, to verify an attribute `attr` has
some property, you can write a C++ function like

```cpp
bool HasSomeProperty(Attribute attr) { ... }
```

and then define the op as:

```tablegen
def HasSomeProperty : AttrConstraint<CPred<"HasSomeProperty($_self)">,
                                     "has some property">;

def MyOp : Op<...> {
  let arguments = (ins
    ...
    HasSomeProperty:$attr
  );
}
```

As to whether we should define the predicate using a single `CPred` wrapping the
whole expression, multiple `CPred`s with predicate combiners, or a single
`CPred` "invoking" a function, there are no clear-cut criteria. Defining using
`CPred` and predicate combiners is preferable since it exposes more information
(instead of hiding all the logic behind a C++ function) into the op definition
spec so that it can potentially drive more auto-generation cases.
But it will require
a nice library of common predicates as the building blocks to avoid the
duplication, which is being worked on right now.

## Attribute Definition

An attribute is a compile-time known constant of an operation.

ODS provides attribute wrappers over C++ attribute classes. There are a few
common C++ [attribute classes][AttrClasses] defined in MLIR's core IR library
and one is free to define dialect-specific attribute classes. ODS allows one to
use these attributes in TableGen to define operations, potentially with more
fine-grained constraints. For example, `StrAttr` directly maps to `StringAttr`;
`F32Attr`/`F64Attr` requires the `FloatAttr` to additionally be of a certain
bitwidth.

ODS attributes are defined as having a storage type (corresponding to a backing
`mlir::Attribute` that _stores_ the attribute), a return type (corresponding to
the C++ _return_ type of the generated helper getters) as well as a method
to convert between the internal storage and the helper method.

### Attribute decorators

There are a few important attribute adapters/decorators/modifiers that can be
applied to ODS attributes to specify common additional properties like
optionality, default values, etc.:

* `DefaultValuedAttr`: specifies the
  [default value](#attributes-with-default-values) for an attribute.
* `OptionalAttr`: specifies an attribute as [optional](#optional-attributes).
* `Confined`: adapts an attribute with
  [further constraints](#confining-attributes).

### Enum attributes

Some attributes can only take values from a predefined enum, e.g., the
comparison kind of a comparison op. To define such attributes, ODS provides
two mechanisms: `IntEnumAttr` and `BitEnumAttr`.

* `IntEnumAttr`: each enum case is an integer, and the attribute is stored as
  an [`IntegerAttr`][IntegerAttr] in the op.
* `BitEnumAttr`: each enum case is either the empty case, a single bit,
  or a group of single bits, and the attribute is stored as an
  [`IntegerAttr`][IntegerAttr] in the op.

All these `*EnumAttr` attributes require fully specifying all of the allowed
cases via their corresponding `*EnumAttrCase`. With this, ODS is able to
generate additional verification to only accept allowed cases. To facilitate the
interaction between `*EnumAttr`s and their C++ consumers, the
[`EnumsGen`][EnumsGen] TableGen backend can generate a few common utilities: a
C++ enum class, `llvm::DenseMapInfo` for the enum class, and conversion
functions from/to strings. This is controlled via the `-gen-enum-decls` and
`-gen-enum-defs` command-line options of `mlir-tblgen`.

For example, given the following `EnumAttr`:

```tablegen
def Case15: I32EnumAttrCase<"Case15", 15>;
def Case20: I32EnumAttrCase<"Case20", 20>;

def MyIntEnum: I32EnumAttr<"MyIntEnum", "An example int enum",
                           [Case15, Case20]> {
  let cppNamespace = "Outer::Inner";
  let stringToSymbolFnName = "ConvertToEnum";
  let symbolToStringFnName = "ConvertToString";
}
```

The following will be generated via `mlir-tblgen -gen-enum-decls`:

```c++
namespace Outer {
namespace Inner {
// An example int enum
enum class MyIntEnum : uint32_t {
  Case15 = 15,
  Case20 = 20,
};

llvm::Optional<MyIntEnum> symbolizeMyIntEnum(uint32_t);
llvm::StringRef ConvertToString(MyIntEnum);
llvm::Optional<MyIntEnum> ConvertToEnum(llvm::StringRef);
inline constexpr unsigned getMaxEnumValForMyIntEnum() {
  return 20;
}

} // namespace Inner
} // namespace Outer

namespace llvm {
template<> struct DenseMapInfo<Outer::Inner::MyIntEnum> {
  using StorageInfo = llvm::DenseMapInfo<uint32_t>;

  static inline Outer::Inner::MyIntEnum getEmptyKey() {
    return static_cast<Outer::Inner::MyIntEnum>(StorageInfo::getEmptyKey());
  }

  static inline Outer::Inner::MyIntEnum getTombstoneKey() {
    return static_cast<Outer::Inner::MyIntEnum>(StorageInfo::getTombstoneKey());
  }

  static unsigned getHashValue(const Outer::Inner::MyIntEnum &val) {
    return StorageInfo::getHashValue(static_cast<uint32_t>(val));
  }

  static bool isEqual(const Outer::Inner::MyIntEnum &lhs, const Outer::Inner::MyIntEnum &rhs) {
    return lhs == rhs;
  }
};
}
```

The following will be generated via `mlir-tblgen -gen-enum-defs`:

```c++
namespace Outer {
namespace Inner {
llvm::StringRef ConvertToString(MyIntEnum val) {
  switch (val) {
    case MyIntEnum::Case15: return "Case15";
    case MyIntEnum::Case20: return "Case20";
  }
  return "";
}

llvm::Optional<MyIntEnum> ConvertToEnum(llvm::StringRef str) {
  return llvm::StringSwitch<llvm::Optional<MyIntEnum>>(str)
      .Case("Case15", MyIntEnum::Case15)
      .Case("Case20", MyIntEnum::Case20)
      .Default(llvm::None);
}
llvm::Optional<MyIntEnum> symbolizeMyIntEnum(uint32_t value) {
  switch (value) {
  case 15: return MyIntEnum::Case15;
  case 20: return MyIntEnum::Case20;
  default: return llvm::None;
  }
}

} // namespace Inner
} // namespace Outer
```

Similarly for the following `BitEnumAttr` definition:

```tablegen
def None: BitEnumAttrCaseNone<"None">;
def Bit0: BitEnumAttrCaseBit<"Bit0", 0>;
def Bit1: BitEnumAttrCaseBit<"Bit1", 1>;
def Bit2: BitEnumAttrCaseBit<"Bit2", 2>;
def Bit3: BitEnumAttrCaseBit<"Bit3", 3>;

def MyBitEnum: BitEnumAttr<"MyBitEnum", "An example bit enum",
                           [None, Bit0, Bit1, Bit2, Bit3]>;
```

We can have:

```c++
// An example bit enum
enum class MyBitEnum : uint32_t {
  None = 0,
  Bit0 = 1,
  Bit1 = 2,
  Bit2 = 4,
  Bit3 = 8,
};

llvm::Optional<MyBitEnum> symbolizeMyBitEnum(uint32_t);
std::string stringifyMyBitEnum(MyBitEnum);
llvm::Optional<MyBitEnum> symbolizeMyBitEnum(llvm::StringRef);
inline MyBitEnum operator|(MyBitEnum lhs, MyBitEnum rhs) {
  return static_cast<MyBitEnum>(static_cast<uint32_t>(lhs) | static_cast<uint32_t>(rhs));
}
inline MyBitEnum operator&(MyBitEnum lhs, MyBitEnum rhs) {
  return static_cast<MyBitEnum>(static_cast<uint32_t>(lhs) & static_cast<uint32_t>(rhs));
}
inline bool bitEnumContains(MyBitEnum bits, MyBitEnum bit) {
  return (static_cast<uint32_t>(bits) & static_cast<uint32_t>(bit)) != 0;
}

namespace llvm {
template<> struct DenseMapInfo<::MyBitEnum> {
  using StorageInfo = llvm::DenseMapInfo<uint32_t>;

  static inline ::MyBitEnum getEmptyKey() {
    return static_cast<::MyBitEnum>(StorageInfo::getEmptyKey());
  }

  static inline ::MyBitEnum getTombstoneKey() {
    return static_cast<::MyBitEnum>(StorageInfo::getTombstoneKey());
  }

  static unsigned getHashValue(const ::MyBitEnum &val) {
    return StorageInfo::getHashValue(static_cast<uint32_t>(val));
  }

  static bool isEqual(const ::MyBitEnum &lhs, const ::MyBitEnum &rhs) {
    return lhs == rhs;
  }
};
} // namespace llvm
```

```c++
std::string stringifyMyBitEnum(MyBitEnum symbol) {
  auto val = static_cast<uint32_t>(symbol);
  assert(15u == (15u | val) && "invalid bits set in bit enum");
  // Special case for all bits unset.
  if (val == 0) return "None";
  llvm::SmallVector<llvm::StringRef, 2> strs;
  if (1u == (1u & val)) { strs.push_back("Bit0"); }
  if (2u == (2u & val)) { strs.push_back("Bit1"); }
  if (4u == (4u & val)) { strs.push_back("Bit2"); }
  if (8u == (8u & val)) { strs.push_back("Bit3"); }

  return llvm::join(strs, "|");
}

llvm::Optional<MyBitEnum> symbolizeMyBitEnum(llvm::StringRef str) {
  // Special case for all bits unset.
  if (str == "None") return MyBitEnum::None;

  llvm::SmallVector<llvm::StringRef, 2> symbols;
  str.split(symbols, "|");

  uint32_t val = 0;
  for (auto symbol : symbols) {
    auto bit = llvm::StringSwitch<llvm::Optional<uint32_t>>(symbol)
      .Case("Bit0", 1)
      .Case("Bit1", 2)
      .Case("Bit2", 4)
      .Case("Bit3", 8)
      .Default(llvm::None);
    if (bit) { val |= *bit; } else { return llvm::None; }
  }
  return static_cast<MyBitEnum>(val);
}

llvm::Optional<MyBitEnum> symbolizeMyBitEnum(uint32_t value) {
  // Special case for all bits unset.
  if (value == 0) return MyBitEnum::None;

  if (value & ~(1u | 2u | 4u | 8u)) return llvm::None;
  return static_cast<MyBitEnum>(value);
}
```

## Debugging Tips

### Run `mlir-tblgen` to see the generated content

TableGen syntax sometimes can be obscure; reading the generated content can be a
very helpful way to understand and debug issues. To build `mlir-tblgen`, run
`cmake --build . --target mlir-tblgen` in your build directory and find the
`mlir-tblgen` binary in the `bin/` subdirectory. All the supported generators
can be found via `mlir-tblgen --help`. For example, see `--gen-op-decls` and
`--gen-op-defs`, as explained in [Generated C++ code](#generated-c-code).

To see the generated code, invoke `mlir-tblgen` with a specific generator by
providing include paths via `-I`.
For example,

```sh
# To see op C++ class declaration
mlir-tblgen --gen-op-decls -I /path/to/mlir/include /path/to/input/td/file
# To see op C++ class definition
mlir-tblgen --gen-op-defs -I /path/to/mlir/include /path/to/input/td/file
# To see op documentation
mlir-tblgen --gen-dialect-doc -I /path/to/mlir/include /path/to/input/td/file

# To see op interface C++ class declaration
mlir-tblgen --gen-op-interface-decls -I /path/to/mlir/include /path/to/input/td/file
# To see op interface C++ class definition
mlir-tblgen --gen-op-interface-defs -I /path/to/mlir/include /path/to/input/td/file
# To see op interface documentation
mlir-tblgen --gen-op-interface-doc -I /path/to/mlir/include /path/to/input/td/file
```

## Appendix

### Reporting deprecation

Classes/defs can be marked as deprecated by using the `Deprecated` helper class,
e.g.,

```tablegen
def OpTraitA : NativeOpTrait<"OpTraitA">, Deprecated<"use `bar` instead">;
```

would result in marking `OpTraitA` as deprecated, and mlir-tblgen can emit a
warning (default) or error (depending on the `-on-deprecated` flag) to make the
deprecated state known.

### Requirements and existing mechanisms analysis

The op description should be as declarative as possible to allow a wide range of
tools to work with them and query methods generated from them. In particular
this means specifying traits, constraints and shape inference information in a
way that is easily analyzable (e.g., avoid opaque calls to C++ functions where
possible).

We considered the approaches of several contemporary systems and focused on
requirements that were desirable:

* Ops registered using a registry separate from C++ code.
  * Unknown ops are allowed in MLIR, so ops need not be registered.
        The ability of the compiler to optimize those ops or graphs containing
        those ops is constrained but correct.
    *   The current proposal does not include a runtime op description, but it
        does not preclude such a description; it can be added later.
    *   The op registry is essential for generating C++ classes that make
        manipulating ops, verifying correct construction, etc. in C++ easier by
        providing a typed representation and accessors.
*   The op registry will be defined in
    [TableGen](https://llvm.org/docs/TableGen/index.html) and be used to
    generate C++ classes and utility functions
    (builder/verifier/parser/printer).
    *   TableGen is a modelling specification language used by LLVM's backends
        and fits in well with trait-based modelling. This is an implementation
        decision and there are alternative ways of doing this. However, the
        specification language meets the requirements of modelling the traits
        (as seen from its usage in LLVM processor backend modelling) and is
        easy to extend, so it is a practical choice. If another good option
        comes up, we will consider it.
*   MLIR allows both defined and undefined ops.
    *   Defined ops should have fixed semantics and could have a corresponding
        reference implementation defined.
    *   Dialects are under full control of the dialect owner and normally live
        with the framework of the dialect.
*   The op's traits (e.g., commutative) are modelled along with the op in the
    registry.
*   The op's operand/return type constraints are modelled along with the op in
    the registry (see [Shape inference](ShapeInference.md) discussion below);
    this allows, e.g., an optimized concise syntax in textual dumps.
*   Behavior of the op is documented along with the op with a summary and a
    description. The description is written in markdown and extracted for
    inclusion in the generated LangRef section of the dialect.
*   The generic assembly form of printing and parsing is available as normal,
    but a custom parser and printer can either be specified or automatically
    generated from an optional string representation showing the mapping of the
    "assembly" string to operands/type.
    *   Parser-level remappings (e.g., `eq` to enum) will be supported as part
        of the parser generation.
*   Matching patterns are specified separately from the op description.
    *   In contrast to LLVM, there is no "base" set of ops that every backend
        needs to be aware of. Instead, there are many different dialects, and
        the transformations/legalizations between these dialects form a graph
        of transformations.
*   Reference implementation may be provided along with the op definition.

    *   The reference implementation may be in terms of either standard ops or
        other reference implementations.

    TODO: document expectation if the dependent op's definition changes.

[TableGen]: https://llvm.org/docs/TableGen/index.html
[TableGenProgRef]: https://llvm.org/docs/TableGen/ProgRef.html
[TableGenBackend]: https://llvm.org/docs/TableGen/BackEnds.html#introduction
[OpBase]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/OpBase.td
[OpDefinitionsGen]: https://github.com/llvm/llvm-project/blob/main/mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
[EnumsGen]: https://github.com/llvm/llvm-project/blob/main/mlir/tools/mlir-tblgen/EnumsGen.cpp
[StringAttr]: Dialects/Builtin.md/#stringattr
[IntegerAttr]: Dialects/Builtin.md/#integertype
[AttrClasses]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/Attributes.h