# Operation Definition Specification (ODS)

In addition to specializing the `mlir::Op` C++ template, MLIR also supports
defining operations and data types in a table-driven manner. This is achieved
via [TableGen][TableGen], which is both a generic language and its tooling to
maintain records of domain-specific information. Facts regarding an operation
are specified concisely in a TableGen record, which will be expanded into an
equivalent `mlir::Op` C++ template specialization at compiler build time.

This manual explains in detail all the available mechanisms for defining
operations in such a table-driven manner. It aims to be a specification instead
of a tutorial. Please refer to
[Quickstart tutorial to adding MLIR graph rewrite](Tutorials/QuickstartRewrites.md)
for the latter.

In addition to detailing each mechanism, this manual also tries to capture best
practices. They are rendered as quoted bullet points.

## Motivation

MLIR allows pluggable dialects, and dialects contain, among other things, a list
of operations. This open and extensible ecosystem leads to the "stringly" type
IR problem, e.g., repetitive string comparisons during optimization and analysis
passes, unintuitive accessor methods (e.g., generic/error prone `getOperand(3)`
vs self-documenting `getStride()`) with more generic return types, verbose and
generic constructors without default arguments, verbose textual IR dump, and so
on. Furthermore, operation verification is:

1. best case: a central string-to-verification-function map,
1. middle case: duplication of verification across the code base, or
1. worst case: no verification functions.

The fix is to support defining ops in a table-driven manner. Then for each
dialect, we can have a central place that contains everything you need to know
about each op, including its constraints, custom assembly form, etc.
This description is also used to generate helper functions and classes to allow
building, verification, parsing, printing, analysis, and many more.

## Benefits

Compared to the C++ template, this table-driven approach has several benefits
including but not limited to:

* **Single source of truth**: We strive to encode all facts regarding an
  operation into the record, so that readers don't need to jump among code
  snippets to fully understand an operation.
* **Removing boilerplate**: We can automatically generate
  operand/attribute/result getter methods, operation build methods, operation
  verify methods, and many more utilities from the record. This greatly
  reduces the boilerplate needed for defining a new op.
* **Facilitating auto-generation**: The usage of these operation information
  records is by no means limited to the op definition itself. We can use them
  to drive the auto-generation of many other components, like computation
  graph serialization.

## TableGen Syntax

We use TableGen as the language for specifying operation information. TableGen
itself just provides syntax for writing records; the syntax and constructs
allowed in a TableGen file (typically with filename suffix `.td`) can be found
[here][TableGenProgRef].

* TableGen `class` is similar to C++ class; it can be templated and
  subclassed.
* TableGen `def` is similar to C++ object; it can be declared by specializing
  a TableGen `class` (e.g., `def MyDef : MyClass<...>;`) or completely
  independently (e.g., `def MyDef;`). It cannot be further templated or
  subclassed.
* TableGen `dag` is a dedicated type for directed acyclic graph of elements. A
  `dag` has one operator and zero or more arguments. Its syntax is
  `(operator arg0, arg1, argN)`. The operator can be any TableGen `def`; an
  argument can be anything, including `dag` itself.
  We can have names attached to both the operator and the arguments like
  `(MyOp:$op_name MyArg:$arg_name)`.

Please see the [language reference][TableGenProgRef] to learn about all the
types and expressions supported by TableGen.

## Operation Definition

MLIR defines several common constructs to help operation definition and provide
their semantics via a special [TableGen backend][TableGenBackend]:
[`OpDefinitionsGen`][OpDefinitionsGen]. These constructs are defined in
[`OpBase.td`][OpBase]. The main ones are

* The `Op` class: It is the main construct for defining operations. All facts
  regarding the operation are specified when specializing this class, with the
  help of the following constructs.
* The `Dialect` class: Operations belonging to one logical group are placed in
  the same dialect. The `Dialect` class contains dialect-level information.
* The `OpTrait` class hierarchy: They are used to specify special properties
  and constraints of the operation, including whether the operation has side
  effects or whether its output has the same shape as the input.
* The `ins`/`outs` marker: These are two special markers built into the
  `OpDefinitionsGen` backend. They lead the definitions of operands/attributes
  and results respectively.
* The `TypeConstraint` class hierarchy: They are used to specify the
  constraints over operands or results. A notable subclass hierarchy is
  `Type`, which stands for constraints for common C++ types.
* The `AttrConstraint` class hierarchy: They are used to specify the
  constraints over attributes. A notable subclass hierarchy is `Attr`, which
  stands for constraints for attributes whose values are of common types.

An operation is defined by specializing the `Op` class with concrete contents
for all the fields it requires.
For example, `tf.AvgPool` is defined as

```tablegen
def TF_AvgPoolOp : TF_Op<"AvgPool", [NoSideEffect]> {
  let summary = "Performs average pooling on the input.";

  let description = [{
Each entry in `output` is the mean of the corresponding size `ksize`
window in `value`.
  }];

  let arguments = (ins
    TF_FpTensor:$value,

    Confined<I64ArrayAttr, [ArrayMinCount<4>]>:$ksize,
    Confined<I64ArrayAttr, [ArrayMinCount<4>]>:$strides,
    TF_AnyStrAttrOf<["SAME", "VALID"]>:$padding,
    DefaultValuedAttr<TF_ConvertDataFormatAttr, "NHWC">:$data_format
  );

  let results = (outs
    TF_FpTensor:$output
  );

  TF_DerivedOperandTypeAttr T = TF_DerivedOperandTypeAttr<0>;
}
```

In the following we describe all the fields needed. Please see the definition of
the `Op` class for the complete list of fields supported.

### Operation name

The operation name is a unique identifier of the operation within MLIR, e.g.,
`tf.Add` for addition operation in the TensorFlow dialect. This is the
equivalent of the mnemonic in assembly language. It is used for parsing and
printing in the textual format. It is also used for pattern matching in graph
rewrites.

The full operation name is composed of the dialect name and the op name, with
the former provided via the dialect and the latter provided as the second
template parameter to the `Op` class.

### Operation documentation

This includes both a one-line `summary` and a longer human-readable
`description`. They will be used to drive automatic generation of dialect
documentation. They need to be provided in the operation's definition body:

```tablegen
let summary = "...";

let description = [{
...
}];
```

`description` should be written in Markdown syntax.

Placing the documentation at the beginning is recommended since it helps in
understanding the operation.

> * Place documentation at the beginning of the operation definition
> * The summary should be short and concise. It should be a one-liner without
>   trailing punctuation. Put expanded explanation in description.

### Operation arguments

There are two kinds of arguments: operands and attributes. Operands are runtime
values produced by other ops, while attributes are compile-time known constant
values, including two categories:

1. Natural attributes: these attributes affect the behavior of the operations
   (e.g., padding for convolution);
1. Derived attributes: these attributes are not needed to define the operation
   but are instead derived from information of the operation, e.g., the output
   shape of a type. This is mostly used for convenience interface generation
   or interaction with other frameworks/translation.

   All derived attributes should be materializable as an Attribute. That is,
   even though they are not materialized, it should be possible to store them
   as an attribute.

Both operands and attributes are specified inside the `dag`-typed `arguments`,
led by `ins`:

```tablegen
let arguments = (ins
  <type-constraint>:$<operand-name>,
  ...
  <attr-constraint>:$<attr-name>,
  ...
);
```

Here `<type-constraint>` is a TableGen `def` from the `TypeConstraint` class
hierarchy. Similarly, `<attr-constraint>` is a TableGen `def` from the
`AttrConstraint` class hierarchy. See [Constraints](#constraints) for more
information.

There are no requirements on the relative order of operands and attributes; they
can mix freely. The relative order of operands themselves matters. From each
named argument a named getter will be generated that returns the argument with
the return type (in the case of attributes the return type will be constructed
from the storage type, while for operands it will be `Value`).
Each attribute's raw value (e.g., as stored) can also be accessed via generated
`<name>Attr` getters for use in transformation passes where the more
user-friendly return type is less suitable.

All the arguments should be named to 1) provide documentation, 2) drive
auto-generation of getter methods, 3) provide a handle to reference for other
places like constraints.

#### Variadic operands

To declare a variadic operand, wrap the `TypeConstraint` for the operand with
`Variadic<...>`.

Normally operations have no variadic operands or just one variadic operand. For
the latter case, it is easy to deduce which dynamic operands are for the static
variadic operand definition. However, if an operation has more than one
variable-length operand (either optional or variadic), it would be impossible
to attribute dynamic operands to the corresponding static variadic operand
definitions without further information from the operation. Therefore, either
the `SameVariadicOperandSize` trait, which indicates that all variable-length
operands have the same number of dynamic values, or the
`AttrSizedOperandSegments` trait, which records the size of each operand
segment in an attribute, is needed.

#### VariadicOfVariadic operands

To declare a variadic operand that has a variadic number of sub-ranges, wrap the
`TypeConstraint` for the operand with `VariadicOfVariadic<...,
"<segment-attribute-name>">`.

The second field of the `VariadicOfVariadic` is the name of an `I32ElementsAttr`
argument that contains the sizes of the variadic sub-ranges. This attribute will
be used when determining the size of sub-ranges, or when updating the size of
sub-ranges.

#### Optional operands

To declare an optional operand, wrap the `TypeConstraint` for the operand with
`Optional<...>`.

Normally operations have no optional operands or just one optional operand.
For the latter case, it is easy to deduce which dynamic operands are for the
static operand definition. However, if an operation has more than one
variable-length operand (either optional or variadic), it would be impossible
to attribute dynamic operands to the corresponding static variadic operand
definitions without further information from the operation. Therefore, either
the `SameVariadicOperandSize` trait, which indicates that all variable-length
operands have the same number of dynamic values, or the
`AttrSizedOperandSegments` trait, which records the size of each operand
segment in an attribute, is needed.

#### Optional attributes

To declare an optional attribute, wrap the `AttrConstraint` for the attribute
with `OptionalAttr<...>`.

#### Attributes with default values

To declare an attribute with a default value, wrap the `AttrConstraint` for the
attribute with `DefaultValuedAttr<..., "...">`.

The second parameter to `DefaultValuedAttr` should be a string containing the
C++ default value. For example, a float default value should be specified like
`"0.5f"`, and an integer array default value should be specified like
`"{1, 2, 3}"`.

#### Confining attributes

`Confined` is provided as a general mechanism to help model further
constraints on attributes beyond the ones brought by value types. You can use
`Confined` to compose complex constraints out of more primitive ones. For
example, a 32-bit integer attribute whose minimum value must be 10 can be
expressed as `Confined<I32Attr, [IntMinValue<10>]>`.
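
The operand and attribute wrappers from the preceding subsections can be
combined in a single definition. The following is a minimal sketch, not an op
from any real dialect: `Sketch_Dialect`, `Sketch_Op`, and `Sketch_AccumulateOp`
are hypothetical names introduced only for illustration, and the sketch assumes
that `OpBase.td` is on the include path.

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect and base class, defined only to keep the sketch
// self-contained.
def Sketch_Dialect : Dialect {
  let name = "sketch";
  let cppNamespace = "::sketch";
}
class Sketch_Op<string mnemonic, list<OpTrait> traits = []>
    : Op<Sketch_Dialect, mnemonic, traits>;

// The op has two variable-length operands (`$inputs` and `$bias`), so a trait
// is needed to tell them apart; `AttrSizedOperandSegments` is used here.
def Sketch_AccumulateOp : Sketch_Op<"accumulate", [AttrSizedOperandSegments]> {
  let arguments = (ins
    Variadic<F32Tensor>:$inputs,
    Optional<F32Tensor>:$bias,

    // May be absent from the op.
    OptionalAttr<StrAttr>:$label,
    // The default is given as a C++ expression in a string.
    DefaultValuedAttr<F32Attr, "0.5f">:$scale,
    // A 32-bit integer attribute whose value must be at least 10.
    Confined<I32Attr, [IntMinValue<10>]>:$window
  );

  let results = (outs F32Tensor:$output);
}
```

Operands and attributes are interleaved freely here; only the relative order of
`$inputs` and `$bias` matters.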

Right now, the following primitive constraints are supported:

* `IntMinValue<N>`: Specifying an integer attribute to be greater than or
  equal to `N`
* `IntMaxValue<N>`: Specifying an integer attribute to be less than or equal
  to `N`
* `ArrayMinCount<N>`: Specifying an array attribute to have at least `N`
  elements
* `IntArrayNthElemEq<I, N>`: Specifying an integer array attribute's `I`-th
  element to be equal to `N`
* `IntArrayNthElemMinValue<I, N>`: Specifying an integer array attribute's
  `I`-th element to be greater than or equal to `N`

TODO: Design and implement more primitive constraints

### Operation regions

The regions of an operation are specified inside of the `dag`-typed `regions`,
led by `region`:

```tablegen
let regions = (region
  <region-constraint>:$<region-name>,
  ...
);
```

#### Variadic regions

Similar to the `Variadic` class used for variadic operands and results,
`VariadicRegion<...>` can be used for regions. Variadic regions can currently
only be specified as the last region in the regions list.

### Operation results

Similar to operands, results are specified inside the `dag`-typed `results`, led
by `outs`:

```tablegen
let results = (outs
  <type-constraint>:$<result-name>,
  ...
);
```

#### Variadic results

Similar to variadic operands, `Variadic<...>` can also be used for results.
Similarly, `SameVariadicResultSize` can be used for multiple variadic results in
the same operation.

### Operation successors

For terminator operations, the successors are specified inside of the
`dag`-typed `successors`, led by `successor`:

```tablegen
let successors = (successor
  <successor-constraint>:$<successor-name>,
  ...
);
```

#### Variadic successors

Similar to the `Variadic` class used for variadic operands and results,
`VariadicSuccessor<...>` can be used for successors. Variadic successors can
currently only be specified as the last successor in the successor list.

### Operation traits and constraints

Traits are operation properties that affect syntax or semantics. MLIR C++ models
various traits in the `mlir::OpTrait` namespace.

Operation traits, [interfaces](Interfaces.md/#utilizing-the-ods-framework),
and constraints involving multiple operands/attributes/results are all provided
as the third template parameter to the `Op` class. They should derive from the
`OpTrait` class. See [Constraints](#constraints) for more information.

### Builder methods

For each operation, there are a few builders automatically generated based on
the arguments and return types. For example, given the following op definition:

```tablegen
def MyOp : ... {
  let arguments = (ins
    I32:$i32_operand,
    F32:$f32_operand,
    ...,

    I32Attr:$i32_attr,
    F32Attr:$f32_attr,
    ...
  );

  let results = (outs
    I32:$i32_result,
    F32:$f32_result,
    ...
  );
}
```

The following builders are generated:

```c++
// All result-types/operands/attributes have one aggregate parameter.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  ArrayRef<Type> resultTypes,
                  ValueRange operands,
                  ArrayRef<NamedAttribute> attributes);

// Each result-type/operand/attribute has a separate parameter. The parameters
// for attributes are of mlir::Attribute types.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  Type i32_result, Type f32_result, ...,
                  Value i32_operand, Value f32_operand, ...,
                  IntegerAttr i32_attr, FloatAttr f32_attr, ...);

// Each result-type/operand/attribute has a separate parameter. The parameters
// for attributes are raw values unwrapped from mlir::Attribute instances.
// (Note that this builder will not always be generated. See the following
// explanation for more details.)
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  Type i32_result, Type f32_result, ...,
                  Value i32_operand, Value f32_operand, ...,
                  APInt i32_attr, APFloat f32_attr, ...);

// Each operand/attribute has a separate parameter but result type is aggregate.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  ArrayRef<Type> resultTypes,
                  Value i32_operand, Value f32_operand, ...,
                  IntegerAttr i32_attr, FloatAttr f32_attr, ...);

// All operands/attributes have aggregate parameters.
// Generated if return type can be inferred.
static void build(OpBuilder &odsBuilder, OperationState &odsState,
                  ValueRange operands, ArrayRef<NamedAttribute> attributes);

// (And manually specified builders depending on the specific op.)
```

The first form provides basic uniformity so that we can create ops using the
same form regardless of the exact op. This is particularly useful for
implementing declarative pattern rewrites.

The second and third forms are good for use in manually written code given that
they provide better guarantees via signatures.

The third form will be generated if any of the op's attributes has a different
`Attr.returnType` from `Attr.storageType` and we know how to build an attribute
from an unwrapped value (i.e., `Attr.constBuilderCall` is defined.)
Additionally, for the third form, if an attribute appearing later in the
`arguments` list has a default value, the default value will be supplied in the
declaration. This works for `BoolAttr`, `StrAttr`, `EnumAttr` for now and the
list can grow in the future. So if possible, default-valued attributes should be
placed at the end of the `arguments` list to leverage this feature.
(This behavior is essentially due to C++ function parameter default value
placement restrictions.) Otherwise, the builder of the third form will still be
generated but default values for the attributes not at the end of the
`arguments` list will not be supplied in the builder's signature.

ODS will generate a builder that doesn't require the return type to be
specified if

* The op implements the `InferTypeOpInterface` interface;
* All return types are either buildable types or are the same as a given
  operand (e.g., `AllTypesMatch` constraint between operand and result);

There may also exist other builders depending on the specific op; please refer
to the
[generated C++ file](#run-mlir-tblgen-to-see-the-generated-content) for the
complete list.

#### Custom builder methods

If the above cases cannot satisfy all needs, you can define additional
convenience build methods in the `builders` field as follows.

```tablegen
def MyOp : Op<"my_op", []> {
  let arguments = (ins F32Attr:$attr);

  let builders = [
    OpBuilder<(ins "float":$val)>
  ];
}
```

The `builders` field is a list of custom builders that are added to the Op
class. In this example, we provide a convenience builder that takes a floating
point value instead of an attribute. The `ins` prefix is common to many function
declarations in ODS, which use a TableGen [`dag`](#tablegen-syntax). What
follows is a comma-separated list of types (quoted string) and names prefixed
with the `$` sign. This will generate the declaration of a builder method that
looks like:

```c++
class MyOp : /*...*/ {
  /*...*/
  static void build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
                    float val);
};
```

Note that the method has two additional leading arguments. These arguments are
useful to construct the operation.
In particular, the method must populate `state` with attributes, operands,
regions and result types of the operation to be constructed. `builder` can be
used to construct any IR objects that belong to the Op, such as types or nested
operations. Since the type and name are generated as is in the C++ code, they
should be valid C++ constructs for a type (in the namespace of the Op) and an
identifier (e.g., `class` is not a valid identifier).

Implementations of the builder can be provided directly in ODS, using a TableGen
code block as follows.

```tablegen
def MyOp : Op<"my_op", []> {
  let arguments = (ins F32Attr:$attr);

  let builders = [
    OpBuilder<(ins "float":$val), [{
      $_state.addAttribute("attr", $_builder.getF32FloatAttr(val));
    }]>
  ];
}
```

The equivalents of `builder` and `state` arguments are available as `$_builder`
and `$_state` special variables. The named arguments listed in the `ins` part
are available directly, e.g. `val`. The body of the builder will be generated by
substituting special variables and should otherwise be valid C++. While there is
no limitation on the code size, we encourage one to define only short builders
inline in ODS and put definitions of longer builders in C++ files.

Finally, if some arguments need a default value, they can be defined using
`CArg` to wrap the type and this value as follows.

```tablegen
def MyOp : Op<"my_op", []> {
  let arguments = (ins F32Attr:$attr);

  let builders = [
    OpBuilder<(ins CArg<"float", "0.5f">:$val), [{
      $_state.addAttribute("attr", $_builder.getF32FloatAttr(val));
    }]>
  ];
}
```

The generated code will use the default value in the declaration, but not in the
definition, as required by C++.

```c++
/// Header file.
class MyOp : /*...*/ {
  /*...*/
  static void build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
                    float val = 0.5f);
};

/// Source file.
void MyOp::build(::mlir::OpBuilder &builder, ::mlir::OperationState &state,
                 float val) {
  state.addAttribute("attr", builder.getF32FloatAttr(val));
}
```

**Deprecated:** `OpBuilder` class allows one to specify the custom builder
signature as a raw string, without separating parameters into different `dag`
arguments. It also supports leading parameters of `OpBuilder &` and
`OperationState &` types, which will be used instead of the autogenerated ones
if present.

### Custom parser and printer methods

Functions to parse and print the operation's custom assembly form.

### Custom verifier code

Verification code will be automatically generated for
[constraints](#constraints) specified on various entities of the op. To perform
_additional_ verification, you can use

```tablegen
let verifier = [{
  ...
}];
```

Code placed in `verifier` will be called after the auto-generated verification
code. The order of trait verification excluding those of `verifier` should not
be relied upon.

### Declarative Assembly Format

The custom assembly form of the operation may be specified in a declarative
string that matches the operation's operands, attributes, etc.
It also has the ability to express additional information that needs to be
parsed to build the operation:

```tablegen
def CallOp : Std_Op<"call", ...> {
  let arguments = (ins FlatSymbolRefAttr:$callee, Variadic<AnyType>:$args);
  let results = (outs Variadic<AnyType>);

  let assemblyFormat = [{
    $callee `(` $args `)` attr-dict `:` functional-type($args, results)
  }];
}
```

The format is composed of three components:

#### Directives

A directive is a type of builtin function, with an optional set of arguments.
The available directives are as follows:

* `attr-dict`

  - Represents the attribute dictionary of the operation.

* `attr-dict-with-keyword`

  - Represents the attribute dictionary of the operation, but prefixes the
    dictionary with an `attributes` keyword.

* `custom` < UserDirective > ( Params )

  - Represents a custom directive implemented by the user in C++.
  - See the [Custom Directives](#custom-directives) section below for more
    details.

* `functional-type` ( inputs , results )

  - Formats the `inputs` and `results` arguments as a
    [function type](Dialects/Builtin.md/#functiontype).
  - The constraints on `inputs` and `results` are the same as the `input` of
    the `type` directive.

* `operands`

  - Represents all of the operands of an operation.

* `ref` ( input )

  - Represents a reference to a variable or directive, that must have
    already been resolved, that may be used as a parameter to a `custom`
    directive.
  - Used to pass previously parsed entities to custom directives.
  - The input may be any directive or variable, aside from `functional-type`
    and `custom`.

* `regions`

  - Represents all of the regions of an operation.

* `results`

  - Represents all of the results of an operation.

* `successors`

  - Represents all of the successors of an operation.

* `type` ( input )

  - Represents the type of the given input.
  - `input` must be either an operand or result [variable](#variables), the
    `operands` directive, or the `results` directive.

* `qualified` ( type_or_attribute )

  - Wraps a `type` directive or an attribute parameter.
  - Used to force printing the type or attribute prefixed with its dialect
    and mnemonic. For example the `vector.multi_reduction` operation has a
    `kind` attribute; by default the declarative assembly will print:
    `vector.multi_reduction <minf>, ...` but using `qualified($kind)` in the
    declarative assembly format will print it instead as:
    `vector.multi_reduction #vector.kind<minf>, ...`.

#### Literals

A literal is either a keyword or punctuation surrounded by \`\`.

The following are the set of valid punctuation:

`:`, `,`, `=`, `<`, `>`, `(`, `)`, `{`, `}`, `[`, `]`, `->`, `?`, `+`, `*`

The following are valid whitespace punctuation:

`\n`, ` `

The `\n` literal emits a newline and indents to the start of the operation. An
example is shown below:

```tablegen
let assemblyFormat = [{
  `{` `\n` ` ` ` ` `this_is_on_a_newline` `\n` `}` attr-dict
}];
```

```mlir
%results = my.operation {
  this_is_on_a_newline
}
```

An empty literal \`\` may be used to remove a space that is inserted implicitly
after certain literal elements, such as `)`/`]`/etc. For example, "`]`" may
result in an output of `]` followed by a space if it is not the last element in
the format. "`]` \`\`" would trim the trailing space in this situation.

#### Variables

A variable is an entity that has been registered on the operation itself, i.e.
an argument (attribute or operand), region, result, successor, etc. In the
`CallOp` example above, the variables would be `$callee` and `$args`.

Attribute variables are printed with their respective value type, unless that
value type is buildable. In those cases, the type of the attribute is elided.

#### Custom Directives

The declarative assembly format specification allows for handling a large
majority of the common cases when formatting an operation. For the operations
that require or desire specifying parts of the operation in a form not supported
by the declarative syntax, custom directives may be specified. A custom
directive essentially allows for users to use C++ for printing and parsing
subsections of an otherwise declaratively specified format. Looking at the
specification of a custom directive above:

```
custom-directive ::= `custom` `<` UserDirective `>` `(` Params `)`
```

A custom directive has two main parts: The `UserDirective` and the `Params`. A
custom directive is transformed into a call to a `print*` and a `parse*` method
when generating the C++ code for the format. The `UserDirective` is an
identifier used as a suffix to these two calls, i.e., `custom<MyDirective>(...)`
would result in calls to `parseMyDirective` and `printMyDirective` within the
parser and printer respectively. `Params` may be any combination of variables
(i.e. Attribute, Operand, Successor, etc.), type directives, and `attr-dict`.
The type directives must refer to a variable, but that variable need not also be
a parameter to the custom directive.

The arguments to the `parse<UserDirective>` method are firstly a reference to
the `OpAsmParser` (`OpAsmParser &`), and secondly a set of output parameters
corresponding to the parameters specified in the format. The mapping of
declarative parameter to `parse` method argument is detailed below:

* Attribute Variables
  - Single: `<Attribute-Storage-Type>(e.g. Attribute) &`
  - Optional: `<Attribute-Storage-Type>(e.g. Attribute) &`
* Operand Variables
  - Single: `OpAsmParser::OperandType &`
  - Optional: `Optional<OpAsmParser::OperandType> &`
  - Variadic: `SmallVectorImpl<OpAsmParser::OperandType> &`
  - VariadicOfVariadic:
    `SmallVectorImpl<SmallVector<OpAsmParser::OperandType>> &`
* Ref Directives
  - A reference directive is passed to the parser using the same mapping as
    the input operand. For example, a single region would be passed as a
    `Region &`.
* Region Variables
  - Single: `Region &`
  - Variadic: `SmallVectorImpl<std::unique_ptr<Region>> &`
* Successor Variables
  - Single: `Block *&`
  - Variadic: `SmallVectorImpl<Block *> &`
* Type Directives
  - Single: `Type &`
  - Optional: `Type &`
  - Variadic: `SmallVectorImpl<Type> &`
  - VariadicOfVariadic: `SmallVectorImpl<SmallVector<Type>> &`
* `attr-dict` Directive: `NamedAttrList &`

When a variable is optional, the value should only be specified if the variable
is present. Otherwise, the value should remain `None` or null.

The arguments to the `print<UserDirective>` method are firstly a reference to
the `OpAsmPrinter` (`OpAsmPrinter &`), secondly the op (e.g. `FooOp op`, which
can alternatively be `Operation *op`), and finally a set of parameters
corresponding to the parameters specified in the format. The mapping of
declarative parameter to `print` method argument is detailed below:

* Attribute Variables
  - Single: `<Attribute-Storage-Type>(e.g. Attribute)`
  - Optional: `<Attribute-Storage-Type>(e.g. Attribute)`
* Operand Variables
  - Single: `Value`
  - Optional: `Value`
  - Variadic: `OperandRange`
  - VariadicOfVariadic: `OperandRangeRange`
* Ref Directives
  - A reference directive is passed to the printer using the same mapping as
    the input operand. For example, a single region would be passed as a
    `Region &`.
* Region Variables
  - Single: `Region &`
  - Variadic: `MutableArrayRef<Region>`
* Successor Variables
  - Single: `Block *`
  - Variadic: `SuccessorRange`
* Type Directives
  - Single: `Type`
  - Optional: `Type`
  - Variadic: `TypeRange`
  - VariadicOfVariadic: `TypeRangeRange`
* `attr-dict` Directive: `DictionaryAttr`

When a variable is optional, the provided value may be null.

#### Optional Groups

In certain situations operations may have "optional" information, e.g.
attributes or an empty set of variadic operands. In these situations a section
of the assembly format can be marked as `optional` based on the presence of this
information. An optional group is defined as follows:

```
optional-group: `(` elements `)` (`:` `(` else-elements `)`)? `?`
```

The `elements` of an optional group have the following requirements:

* The first element of the group must be either an attribute, literal,
  operand, or region.
  - This is because the first element must be optionally parsable.
* Exactly one argument variable or type directive within the group must be
  marked as the anchor of the group.
  - The anchor is the element whose presence controls whether the group
    should be printed/parsed.
  - An element is marked as the anchor by adding a trailing `^`.
  - The first element is *not* required to be the anchor of the group.
  - When a non-variadic region anchors a group, the group is printed only
    if the region is non-empty.
* Literals, variables, custom directives, and type directives are the only
  valid elements within the group.
  - Any attribute variable may be used, but only optional attributes can be
    marked as the anchor.
  - Only variadic or optional results and operands can be used.
  - All region variables can be used. When a non-variable length region is
    used, if the group is not present the region is empty.
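
To illustrate the rule that the anchor need not be the first element, here is a
minimal sketch of a group that starts with a literal and is anchored on an
optional operand. The dialect, base class, and `Sketch_StoreOp` are hypothetical
and exist only for this illustration; the sketch assumes `OpBase.td` is on the
include path.

```tablegen
include "mlir/IR/OpBase.td"

// Hypothetical dialect and base class, defined only to keep the sketch
// self-contained.
def Sketch_Dialect : Dialect {
  let name = "sketch";
  let cppNamespace = "::sketch";
}
class Sketch_Op<string mnemonic, list<OpTrait> traits = []>
    : Op<Sketch_Dialect, mnemonic, traits>;

def Sketch_StoreOp : Sketch_Op<"store", []> {
  let arguments = (ins AnyType:$value, Optional<I1>:$mask);

  // The group begins with the literal `masked`, which is optionally
  // parsable, and is anchored on the optional operand `$mask`.
  let assemblyFormat = [{
    $value (`masked` $mask^ `:` type($mask))? attr-dict `:` type($value)
  }];
}
```

With this format, the `masked` keyword, the mask operand, and its type are
expected to appear only when `$mask` is present.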

An example of an operation with an optional group is `std.return`, which has a
variadic number of operands.

```tablegen
def ReturnOp : ... {
  let arguments = (ins Variadic<AnyType>:$operands);

  // We only print the operands and types if there is a non-zero number
  // of operands.
  let assemblyFormat = "attr-dict ($operands^ `:` type($operands))?";
}
```

##### Unit Attributes

In MLIR, the [`unit` Attribute](Dialects/Builtin.md/#unitattr) is special in that it
only has one possible value, i.e. it derives meaning from its existence. When a
unit attribute is used to anchor an optional group and is not the first element
of the group, the presence of the unit attribute can be directly correlated with
the presence of the optional group itself. As such, in these situations the unit
attribute will not be printed or present in the output and will be automatically
inferred when parsing by the presence of the optional group itself.

For example, the following operation:

```tablegen
def FooOp : ... {
  let arguments = (ins UnitAttr:$is_read_only);

  let assemblyFormat = "attr-dict (`is_read_only` $is_read_only^)?";
}
```

would be formatted as such:

```mlir
// When the unit attribute is present:
foo.op is_read_only

// When the unit attribute is not present:
foo.op
```

##### Optional "else" Group

Optional groups also have support for an "else" group of elements. These are
elements that are parsed/printed if the `anchor` element of the optional group
is *not* present. Unlike the main element group, the "else" group has no
restriction on the first element and none of the elements may act as the
`anchor` for the optional. An example is shown below:

```tablegen
def FooOp : ...
{
  let arguments = (ins UnitAttr:$foo);

  let assemblyFormat = "attr-dict (`foo_is_present` $foo^):(`foo_is_absent`)?";
}
```

would be formatted as such:

```mlir
// When the `foo` attribute is present:
foo.op foo_is_present

// When the `foo` attribute is not present:
foo.op foo_is_absent
```

#### Requirements

The format specification has a certain set of requirements that must be adhered
to:

1. The output and operation name are never shown as they are fixed and cannot
   be altered.
1. All operands within the operation must appear within the format, either
   individually or with the `operands` directive.
1. All regions within the operation must appear within the format, either
   individually or with the `regions` directive.
1. All successors within the operation must appear within the format, either
   individually or with the `successors` directive.
1. All operand and result types must appear within the format using the various
   `type` directives, either individually or with the `operands` or `results`
   directives.
1. The `attr-dict` directive must always be present.
1. Must not contain overlapping information; e.g. multiple instances of
   'attr-dict', types, operands, etc.
   - Note that `attr-dict` does not overlap with individual attributes. These
     attributes will simply be elided when printing the attribute dictionary.

##### Type Inference

One requirement of the format is that the types of operands and results must
always be present. In certain instances, the type of a variable may be deduced
via type constraints or other information available. In these cases, the type of
that variable may be elided from the format.

* Buildable Types

Some type constraints may only have one representation, allowing for them to be
directly buildable; for example the `I32` or `Index` types.
Types in `ODS` may
mark themselves as buildable by setting the `builderCall` field or inheriting
from the `BuildableType` class.

* Trait Equality Constraints

There are many operations that have known type equality constraints registered
as traits on the operation; for example the true, false, and result values of a
`select` operation often have the same type. The assembly format may inspect
these equality constraints to discern the types of missing variables. The
currently supported traits are: `AllTypesMatch`, `TypesMatchWith`,
`SameTypeOperands`, and `SameOperandsAndResultType`.

* InferTypeOpInterface

Operations that implement `InferTypeOpInterface` can omit their result types in
their assembly format since the result types can be inferred from the operands.

### `hasCanonicalizer`

This boolean field indicates whether canonicalization patterns have been defined
for this operation. If it is `1`, then `::getCanonicalizationPatterns()` should
be defined.

### `hasCanonicalizeMethod`

When this boolean field is set to `true`, it indicates that the op implements a
`canonicalize` method for simple "matchAndRewrite" style canonicalization
patterns. If `hasCanonicalizer` is 0, then an implementation of
`::getCanonicalizationPatterns()` is generated that calls this function.

### `hasFolder`

This boolean field indicates whether general folding rules have been defined for
this operation. If it is `1`, then `::fold()` should be defined.

### Extra declarations

One of the goals of table-driven op definition is to auto-generate as much logic
and methods needed for each op as possible. With that said, there will always be
long-tail cases that won't be covered. For such cases, you can use
`extraClassDeclaration`. Code in `extraClassDeclaration` will be copied
literally to the generated C++ op class.

Note that `extraClassDeclaration` is a mechanism intended for long-tail cases by
power users; for not-yet-implemented widely-applicable cases, improving the
infrastructure is preferable.

### Extra definitions

When defining base op classes in TableGen that are inherited many times by
different ops, users may want to provide common definitions of utility and
interface functions. However, many of these definitions may not be desirable or
possible in `extraClassDeclaration`, which appends them to the op's C++ class
declaration. In these cases, users can add an `extraClassDefinition` to define
code that is added to the generated source file inside the op's C++ namespace.
The substitution `$cppClass` is replaced by the op's C++ class name.

### Generated C++ code

[OpDefinitionsGen][OpDefinitionsGen] processes the op definition spec file and
generates two files containing the corresponding C++ code: one for declarations,
the other for definitions. The former is generated via the `-gen-op-decls`
command-line option, while the latter is via the `-gen-op-defs` option.

The definition file contains all the op method definitions, which can be
included and enabled by defining `GET_OP_CLASSES`. For each operation,
OpDefinitionsGen generates an operation class and an
[operand adaptor](#operand-adaptors) class. Besides, it also contains a
comma-separated list of all defined ops, which can be included and enabled by
defining `GET_OP_LIST`.

#### Class name and namespaces

For each operation, its generated C++ class name is the symbol `def`ed with
TableGen with the dialect prefix removed. The first `_` serves as the delimiter.
For example, for `def TF_AddOp`, the C++ class name would be `AddOp`. We remove
the `TF` prefix because it is for scoping ops; other dialects may as well define
their own `AddOp`s.
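
The naming rule above can be sketched in a few lines of standalone C++. This is
a minimal illustration, not the implementation used by OpDefinitionsGen; the
helper name `cppClassNameFromDef` is hypothetical, and the handling of a name
without any underscore (returned unchanged) is an assumption.

```c++
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: derive the C++ class name from a TableGen def name by
// dropping everything up to and including the first underscore.
std::string cppClassNameFromDef(std::string_view defName) {
  const std::size_t pos = defName.find('_');
  // Assumption: a def name without an underscore is used unchanged.
  if (pos == std::string_view::npos)
    return std::string(defName);
  return std::string(defName.substr(pos + 1));
}
```

Only the first `_` acts as the delimiter, so in this sketch `TF_AddOp` maps to
`AddOp` while a name such as `TF_My_Op` would map to `My_Op`.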

The namespaces of the generated C++ class will come from the dialect's
`cppNamespace` field. For example, if a dialect's `cppNamespace` is `A::B`, then
an op of that dialect will be placed in `namespace A { namespace B { ... } }`.
If a dialect does not specify a `cppNamespace`, we then use the dialect's name
as the namespace.

This means the qualified name of the generated C++ class does not necessarily
match exactly with the operation name as explained in
[Operation name](#operation-name). This is to allow flexible naming to satisfy
coding style requirements.

#### Operand adaptors

For each operation, we automatically generate an _operand adaptor_. This class
solves the problem of accessing operands provided as a list of `Value`s without
using "magic" constants. The operand adaptor takes a reference to an array of
`Value` and provides methods with the same names as those in the operation class
to access them. For example, for a binary arithmetic operation, it may provide
`.lhs()` to access the first operand and `.rhs()` to access the second operand.

The operand adaptor class lives in the same namespace as the operation class,
and has the name of the operation followed by `Adaptor` as well as an alias
`Adaptor` inside the op class.

Operand adaptors can be used in function templates that also process operations:

```c++
template <typename BinaryOpTy>
std::pair<Value, Value> zip(BinaryOpTy &&op) {
  return std::make_pair(op.lhs(), op.rhs());
}

void process(AddOp op, ArrayRef<Value> newOperands) {
  zip(op);
  zip(AddOpAdaptor(newOperands));
  /*...*/
}
```

## Constraints

Constraint is a core concept in table-driven operation definition: operation
verification and graph operation matching are all based on satisfying
constraints.
So both the operation definition and rewrite rules specification
significantly involve writing constraints. We have the `Constraint` class in
[`OpBase.td`][OpBase] as the common base class for all constraints.

An operation's constraints can cover different scopes; a constraint may

* Concern only a single attribute (e.g. being a 32-bit integer greater than
  5),
* Concern multiple operands and results (e.g., the 1st result's shape must be
  the same as the 1st operand), or
* Be intrinsic to the operation itself (e.g., having no side effect).

We call them single-entity constraints, multi-entity constraints, and traits,
respectively.

### Single-entity constraint

Constraints scoped to a single operand, attribute, or result are specified at
the entity's declaration place as described in
[Operation arguments](#operation-arguments) and
[Operation results](#operation-results).

To help modelling constraints of common types, a set of `TypeConstraint`s are
created; they are the `Type` subclass hierarchy. It includes `F32` for the
constraints of being a float, `TensorOf<[F32]>` for the constraints of being a
float tensor, and so on.

Similarly, a set of `AttrConstraint`s are created for helping modelling
constraints of common attribute kinds. They are the `Attr` subclass hierarchy.
It includes `F32Attr` for the constraints of being a float attribute,
`F32ArrayAttr` for the constraints of being a float array attribute, and so on.

### Multi-entity constraint

Constraints involving more than one operand/attribute/result are quite common on
operations, like the element type and shape relation between operands and
results. These constraints should be specified as the `Op` class template
parameter as described in
[Operation traits and constraints](#operation-traits-and-constraints).

Multi-entity constraints are modeled as `PredOpTrait` (a subclass of `OpTrait`)
in [`OpBase.td`][OpBase]. A number of constraint primitives are provided to help
with specification. See [`OpBase.td`][OpBase] for the complete list.

### Trait

Traits are intrinsic properties of the operation, such as whether it has side
effects, whether it is commutative, whether it is a terminator, etc. These
constraints should be specified as the `Op` class template parameter as
described in
[Operation traits and constraints](#operation-traits-and-constraints).

Traits are modeled as `NativeOpTrait` (a subclass of `OpTrait`) in
[`OpBase.td`][OpBase]. They are backed by, and will be translated into, the
corresponding C++ `mlir::OpTrait` classes.

### How to specify new constraint

To write a constraint, you need to provide its predicates and give it a
descriptive name. Predicates, modeled with the `Pred` class, are the workhorse
for composing constraints. The predicate for a constraint is typically built up
in a nested manner, using the two categories of predicates:

1. `CPred`: the primitive leaf predicate.
2. Compound predicate: a predicate composed from child predicates using
   predicate combiners (conjunction: `And`, disjunction: `Or`, negation: `Neg`,
   substitution: `SubstLeaves`, concatenation: `Concat`).

`CPred` is the basis for composing more complex predicates. It is the "atom"
predicate from the perspective of TableGen and the "interface" between TableGen
and C++. What is inside is already C++ code, which will be treated as opaque
strings with special placeholders to be substituted.

You can put any C++ code that returns a boolean value inside a `CPred`,
including evaluating expressions, calling functions, calling class methods, and
so on.
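
The idea of "opaque strings with placeholders" can be illustrated with a toy
string-level model in plain C++. This is a sketch under stated assumptions, not
how `mlir-tblgen` is implemented: the helpers `substituteSelf` and `combine`
are hypothetical, and `$_self` is one of the placeholders described below.

```c++
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: replace every occurrence of the `$_self` placeholder
// in a predicate template with a concrete C++ expression.
std::string substituteSelf(std::string tmpl, const std::string &self) {
  const std::string placeholder = "$_self";
  std::size_t pos = 0;
  while ((pos = tmpl.find(placeholder, pos)) != std::string::npos) {
    tmpl.replace(pos, placeholder.size(), self);
    pos += self.size(); // Skip past the text that was just inserted.
  }
  return tmpl;
}

// Hypothetical helper: model a compound predicate by parenthesizing each
// child expression and joining them with a binary operator ("&&" or "||").
std::string combine(const std::vector<std::string> &children,
                    const std::string &op) {
  std::string result;
  for (std::size_t i = 0; i < children.size(); ++i) {
    if (i != 0)
      result += " " + op + " ";
    result += "(" + children[i] + ")";
  }
  return result;
}
```

In this model, a leaf such as `$_self.isa<IntegerAttr>()` becomes
`attr.isa<IntegerAttr>()` once `$_self` is bound to `attr`, and a conjunction is
simply the parenthesized children joined with `&&`.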

To help interaction with the C++ environment, there are a few special
placeholders provided to refer to entities in the context where this predicate
is used. They serve as "hooks" to the enclosing environment. This includes
`$_builder`, `$_op`, and `$_self`:

* `$_builder` will be replaced by a `mlir::Builder` instance so that you can
  access common build methods.
* `$_op` will be replaced by the current operation so that you can access
  information of the current operation.
* `$_self` will be replaced with the entity this predicate is attached to.
  E.g., `BoolAttr` is an attribute constraint that wraps a
  `CPred<"$_self.isa<BoolAttr>()">`. Then for `BoolAttr:$attr`, `$_self` will be
  replaced by `$attr`. Type constraints are slightly special: because we want
  the constraints on each type definition to read naturally and we want to
  attach type constraints directly to an operand/result, `$_self` will be
  replaced by the operand/result's type. E.g., for `F32` in `F32:$operand`,
  its `$_self` will be expanded as `operand(...).getType()`.

TODO: Reconsider the leading symbol for special placeholders. Eventually we want
to allow referencing operand/result `$-name`s; such `$-name`s can start with
underscore.

For example, to check that an attribute `attr` is an `IntegerAttr`, in C++ you
can just call `attr.isa<IntegerAttr>()`. The code can be wrapped in a `CPred` as
`$_self.isa<IntegerAttr>()`, with `$_self` as the special placeholder to be
replaced by the current attribute `attr` at expansion time.

For more complicated predicates, you can wrap them in a single `CPred`, or you
can use predicate combiners to combine them.
For example, to write the constraint
that an attribute `attr` is a 32-bit or 64-bit integer, you can write it as

```tablegen
And<[
  CPred<"$_self.isa<IntegerAttr>()">,
  Or<[
    CPred<"$_self.cast<IntegerAttr>().getType().isInteger(32)">,
    CPred<"$_self.cast<IntegerAttr>().getType().isInteger(64)">
  ]>
]>
```

(Note that the above is just to show with a familiar example how you can use
`CPred` and predicate combiners to write complicated predicates. For integer
attributes specifically, [`OpBase.td`][OpBase] already defines `I32Attr` and
`I64Attr`. So you can actually reuse them to write it as `Or<[I32Attr.predicate,
I64Attr.predicate]>`.)

TODO: Build up a library of reusable primitive constraints

If the predicate is very complex to write with `CPred` together with predicate
combiners, you can also write it as a normal C++ function and use the `CPred` as
a way to "invoke" the function. For example, to verify an attribute `attr` has
some property, you can write a C++ function like

```cpp
bool HasSomeProperty(Attribute attr) { ... }
```

and then define the op as:

```tablegen
def HasSomeProperty : AttrConstraint<CPred<"HasSomeProperty($_self)">,
                                     "has some property">;

def MyOp : Op<...> {
  let arguments = (ins
    ...
    HasSomeProperty:$attr
  );
}
```

As to whether we should define the predicate using a single `CPred` wrapping the
whole expression, multiple `CPred`s with predicate combiners, or a single
`CPred` "invoking" a function, there are no clear-cut criteria. Defining using
`CPred` and predicate combiners is preferable since it exposes more information
(instead of hiding all the logic behind a C++ function) into the op definition
spec so that it can potentially drive more auto-generation cases.
But it will require
a nice library of common predicates as the building blocks to avoid the
duplication, which is being worked on right now.

## Attribute Definition

An attribute is a compile-time known constant of an operation.

ODS provides attribute wrappers over C++ attribute classes. There are a few
common C++ [attribute classes][AttrClasses] defined in MLIR's core IR library
and one is free to define dialect-specific attribute classes. ODS allows one to
use these attributes in TableGen to define operations, potentially with more
fine-grained constraints. For example, `StrAttr` directly maps to `StringAttr`;
`F32Attr`/`F64Attr` require the `FloatAttr` to additionally be of a certain
bitwidth.

ODS attributes are defined as having a storage type (corresponding to a backing
`mlir::Attribute` that _stores_ the attribute), a return type (corresponding to
the C++ _return_ type of the generated helper getters) as well as a method
to convert between the internal storage and the helper method.

### Attribute decorators

There are a few important attribute adapters/decorators/modifiers that can be
applied to ODS attributes to specify common additional properties like
optionality, default values, etc.:

* `DefaultValuedAttr`: specifies the
  [default value](#attributes-with-default-values) for an attribute.
* `OptionalAttr`: specifies an attribute as [optional](#optional-attributes).
* `Confined`: adapts an attribute with
  [further constraints](#confining-attributes).

### Enum attributes

Some attributes can only take values from a predefined enum, e.g., the
comparison kind of a comparison op. To define such attributes, ODS provides
several mechanisms: `StrEnumAttr`, `IntEnumAttr`, and `BitEnumAttr`.

* `StrEnumAttr`: each enum case is a string, the attribute is stored as a
  [`StringAttr`][StringAttr] in the op.
* `IntEnumAttr`: each enum case is an integer, the attribute is stored as an
  [`IntegerAttr`][IntegerAttr] in the op.
* `BitEnumAttr`: each enum case is a bit, the attribute is stored as an
  [`IntegerAttr`][IntegerAttr] in the op.

All these `*EnumAttr` attributes require fully specifying all of the allowed
cases via their corresponding `*EnumAttrCase`. With this, ODS is able to
generate additional verification to only accept allowed cases. To facilitate the
interaction between `*EnumAttr`s and their C++ consumers, the
[`EnumsGen`][EnumsGen] TableGen backend can generate a few common utilities: a
C++ enum class, `llvm::DenseMapInfo` for the enum class, conversion functions
from/to strings. This is controlled via the `-gen-enum-decls` and
`-gen-enum-defs` command-line options of `mlir-tblgen`.

For example, given the following `EnumAttr`:

```tablegen
def Case15: I32EnumAttrCase<"Case15", 15>;
def Case20: I32EnumAttrCase<"Case20", 20>;

def MyIntEnum: I32EnumAttr<"MyIntEnum", "An example int enum",
                           [Case15, Case20]> {
  let cppNamespace = "Outer::Inner";
  let stringToSymbolFnName = "ConvertToEnum";
  let symbolToStringFnName = "ConvertToString";
}
```

The following will be generated via `mlir-tblgen -gen-enum-decls`:

```c++
namespace Outer {
namespace Inner {
// An example int enum
enum class MyIntEnum : uint32_t {
  Case15 = 15,
  Case20 = 20,
};

llvm::Optional<MyIntEnum> symbolizeMyIntEnum(uint32_t);
llvm::StringRef ConvertToString(MyIntEnum);
llvm::Optional<MyIntEnum> ConvertToEnum(llvm::StringRef);
inline constexpr unsigned getMaxEnumValForMyIntEnum() {
  return 20;
}

} // namespace Inner
} // namespace Outer

namespace llvm {
template<>
struct DenseMapInfo<Outer::Inner::MyIntEnum> {
  using StorageInfo = llvm::DenseMapInfo<uint32_t>;

  static inline Outer::Inner::MyIntEnum getEmptyKey() {
    return static_cast<Outer::Inner::MyIntEnum>(StorageInfo::getEmptyKey());
  }

  static inline Outer::Inner::MyIntEnum getTombstoneKey() {
    return static_cast<Outer::Inner::MyIntEnum>(StorageInfo::getTombstoneKey());
  }

  static unsigned getHashValue(const Outer::Inner::MyIntEnum &val) {
    return StorageInfo::getHashValue(static_cast<uint32_t>(val));
  }

  static bool isEqual(const Outer::Inner::MyIntEnum &lhs, const Outer::Inner::MyIntEnum &rhs) {
    return lhs == rhs;
  }
};
}
```

The following will be generated via `mlir-tblgen -gen-enum-defs`:

```c++
namespace Outer {
namespace Inner {
llvm::StringRef ConvertToString(MyIntEnum val) {
  switch (val) {
    case MyIntEnum::Case15: return "Case15";
    case MyIntEnum::Case20: return "Case20";
  }
  return "";
}

llvm::Optional<MyIntEnum> ConvertToEnum(llvm::StringRef str) {
  return llvm::StringSwitch<llvm::Optional<MyIntEnum>>(str)
      .Case("Case15", MyIntEnum::Case15)
      .Case("Case20", MyIntEnum::Case20)
      .Default(llvm::None);
}
llvm::Optional<MyIntEnum> symbolizeMyIntEnum(uint32_t value) {
  switch (value) {
  case 15: return MyIntEnum::Case15;
  case 20: return MyIntEnum::Case20;
  default: return llvm::None;
  }
}

} // namespace Inner
} // namespace Outer
```

Similarly for the following `BitEnumAttr` definition:

```tablegen
def None: BitEnumAttrCase<"None", 0x0000>;
def Bit1: BitEnumAttrCase<"Bit1", 0x0001>;
def Bit2: BitEnumAttrCase<"Bit2", 0x0002>;
def Bit3: BitEnumAttrCase<"Bit3", 0x0004>;

def MyBitEnum: BitEnumAttr<"MyBitEnum", "An example bit enum",
                           [None, Bit1, Bit2, Bit3]>;
```

We can have:

```c++
//
// An example bit enum
enum class MyBitEnum : uint32_t {
  None = 0,
  Bit1 = 1,
  Bit2 = 2,
  Bit3 = 4,
};

llvm::Optional<MyBitEnum> symbolizeMyBitEnum(uint32_t);
std::string stringifyMyBitEnum(MyBitEnum);
llvm::Optional<MyBitEnum> symbolizeMyBitEnum(llvm::StringRef);
inline MyBitEnum operator|(MyBitEnum lhs, MyBitEnum rhs) {
  return static_cast<MyBitEnum>(static_cast<uint32_t>(lhs) | static_cast<uint32_t>(rhs));
}
inline MyBitEnum operator&(MyBitEnum lhs, MyBitEnum rhs) {
  return static_cast<MyBitEnum>(static_cast<uint32_t>(lhs) & static_cast<uint32_t>(rhs));
}
inline bool bitEnumContains(MyBitEnum bits, MyBitEnum bit) {
  return (static_cast<uint32_t>(bits) & static_cast<uint32_t>(bit)) != 0;
}

namespace llvm {
template<> struct DenseMapInfo<::MyBitEnum> {
  using StorageInfo = llvm::DenseMapInfo<uint32_t>;

  static inline ::MyBitEnum getEmptyKey() {
    return static_cast<::MyBitEnum>(StorageInfo::getEmptyKey());
  }

  static inline ::MyBitEnum getTombstoneKey() {
    return static_cast<::MyBitEnum>(StorageInfo::getTombstoneKey());
  }

  static unsigned getHashValue(const ::MyBitEnum &val) {
    return StorageInfo::getHashValue(static_cast<uint32_t>(val));
  }

  static bool isEqual(const ::MyBitEnum &lhs, const ::MyBitEnum &rhs) {
    return lhs == rhs;
  }
};
} // namespace llvm
```

```c++
std::string stringifyMyBitEnum(MyBitEnum symbol) {
  auto val = static_cast<uint32_t>(symbol);
  // Special case for all bits unset.
  if (val == 0) return "None";

  llvm::SmallVector<llvm::StringRef, 2> strs;
  if (1u & val) { strs.push_back("Bit1"); val &= ~1u; }
  if (2u & val) { strs.push_back("Bit2"); val &= ~2u; }
  if (4u & val) { strs.push_back("Bit3"); val &= ~4u; }

  if (val) return "";
  return llvm::join(strs, "|");
}

llvm::Optional<MyBitEnum> symbolizeMyBitEnum(llvm::StringRef str) {
  // Special case for all bits unset.
  if (str == "None") return MyBitEnum::None;

  llvm::SmallVector<llvm::StringRef, 2> symbols;
  str.split(symbols, "|");

  uint32_t val = 0;
  for (auto symbol : symbols) {
    auto bit = llvm::StringSwitch<llvm::Optional<uint32_t>>(symbol)
      .Case("Bit1", 1)
      .Case("Bit2", 2)
      .Case("Bit3", 4)
      .Default(llvm::None);
    if (bit) { val |= *bit; } else { return llvm::None; }
  }
  return static_cast<MyBitEnum>(val);
}

llvm::Optional<MyBitEnum> symbolizeMyBitEnum(uint32_t value) {
  // Special case for all bits unset.
  if (value == 0) return MyBitEnum::None;

  if (value & ~(1u | 2u | 4u)) return llvm::None;
  return static_cast<MyBitEnum>(value);
}
```

## Type Definitions

MLIR defines the `TypeDef` class hierarchy to enable generation of data types from
their specifications. A type is defined by specializing the `TypeDef` class with
concrete contents for all the fields it requires. For example, an integer type
could be defined as:

```tablegen
// All of the types will extend this class.
class Test_Type<string name> : TypeDef<Test_Dialect, name> { }

// An alternate int type.
def IntegerType : Test_Type<"TestInteger"> {
  let mnemonic = "int";

  let summary = "An integer type with special semantics";

  let description = [{
    An alternate integer type.
    This type differentiates itself from the
    standard integer type by not having a SignednessSemantics parameter, just
    a width.
  }];

  let parameters = (ins "unsigned":$width);

  // We define the printer inline.
  let printer = [{
    $_printer << "int<" << getImpl()->width << ">";
  }];

  // The parser is defined here also.
  let parser = [{
    if ($_parser.parseLess())
      return Type();
    int width;
    if ($_parser.parseInteger(width))
      return Type();
    if ($_parser.parseGreater())
      return Type();
    return get($_ctxt, width);
  }];
}
```

### Type name

The name of the C++ class which gets generated defaults to
`<classParamName>Type` (e.g. `TestIntegerType` in the above example). This can
be overridden via the `cppClassName` field. The field `mnemonic` is to specify
the asm name for parsing. It is optional and not specifying it will imply that
no parser or printer methods are attached to this class.

### Type documentation

The `summary` and `description` fields exist and are to be used the same way as
in Operations. Namely, the summary should be a one-liner and `description`
should be a longer explanation.

### Type parameters

The `parameters` field is a list of the type's parameters. If no parameters are
specified (the default), this type is considered a singleton type. Parameters
are in the `"c++Type":$paramName` format. To use C++ types as parameters which
need allocation in the storage constructor, there are two options:

- Set `hasCustomStorageConstructor` to generate the TypeStorage class with a
  constructor which is just declared -- no definition -- so you can write it
  yourself.
- Use the `TypeParameter` tablegen class instead of the "c++Type" string.

### TypeParameter tablegen class

This is used to further specify attributes about each of the type's parameters.
It includes documentation (`summary` and `syntax`), the C++ type to use, a
custom allocator to use in the storage constructor method, and a custom
comparator to decide if two instances of the parameter type are equal.

```tablegen
// DO NOT DO THIS!
let parameters = (ins "ArrayRef<int>":$dims);
```

The default storage constructor blindly copies fields by value. It does not know
anything about the types. In this case, the `ArrayRef<int>` requires allocation
with `dims = allocator.copyInto(dims)`.

You can specify the necessary constructor by specializing the `TypeParameter`
tablegen class:

```tablegen
class ArrayRefIntParam :
    TypeParameter<"::llvm::ArrayRef<int>", "Array of ints"> {
  let allocator = "$_dst = $_allocator.copyInto($_self);";
}

...

let parameters = (ins ArrayRefIntParam:$dims);
```

The `allocator` code block has the following substitutions:

- `$_allocator` is the TypeStorageAllocator in which to allocate objects.
- `$_dst` is the variable in which to place the allocated data.

The `comparator` code block has the following substitutions:

- `$_lhs` is an instance of the parameter type.
- `$_rhs` is an instance of the parameter type.

MLIR includes several specialized classes for common situations:

- `StringRefParameter<descriptionOfParam>` for StringRefs.
- `ArrayRefParameter<arrayOf, descriptionOfParam>` for ArrayRefs of value
  types.
- `SelfAllocationParameter<descriptionOfParam>` for C++ classes which contain
  a method called `allocateInto(StorageAllocator &allocator)` to allocate
  itself into `allocator`.
- `ArrayRefOfSelfAllocationParameter<arrayOf, descriptionOfParam>` for arrays
  of objects which self-allocate as per the last specialization.

Using one of these included specializations, the `dims` parameter could be
declared as:

```tablegen
let parameters = (ins
  ArrayRefParameter<"int", "The dimensions">:$dims
);
```

### Parsing and printing

If a mnemonic is specified, the `printer` and `parser` code fields are active.
The rules for both are:

- If null, generate just the declaration.
- If non-null and non-empty, use the code in the definition. The `$_printer`
  or `$_parser` substitutions are valid and should be used.
- It is an error to have an empty code block.

For each dialect, two "dispatch" functions will be created: one for parsing and
one for printing. You should add calls to these in your `Dialect::printType` and
`Dialect::parseType` methods. They are static functions placed alongside the
type class definitions and have the following function signatures:

```c++
static Type generatedTypeParser(MLIRContext* ctxt, DialectAsmParser& parser, StringRef mnemonic);
LogicalResult generatedTypePrinter(Type type, DialectAsmPrinter& printer);
```

The mnemonic, parser, and printer fields are optional. If they're not defined,
the generated code will not include any parsing or printing code and omit the
type from the dispatch functions above. In this case, the dialect author is
responsible for parsing/printing the types in `Dialect::printType` and
`Dialect::parseType`.

### Other fields

- If the `genStorageClass` field is set to 1 (the default) a storage class is
  generated with member variables corresponding to each of the specified
  `parameters`.
- If the `genAccessors` field is 1 (the default) accessor methods will be
  generated on the Type class (e.g. `unsigned getWidth() const` in the example
  above).
- If the `genVerifyDecl` field is set, a declaration for a method `static
  LogicalResult verify(emitErrorFn, parameters...)` is added to the class as
  well as a `getChecked(emitErrorFn, parameters...)` method which checks the
  result of `verify` before calling `get`.
- The `storageClass` field can be used to set the name of the storage class.
- The `storageNamespace` field is used to set the namespace where the storage
  class should sit. Defaults to "detail".
- The `extraClassDeclaration` field is used to include extra code in the class
  declaration.

### Type builder methods

For each type, a few builders (`get`/`getChecked`) are automatically generated
based on the parameters of the type. For example, given the following type
definition:

```tablegen
def MyType : ... {
  let parameters = (ins "int":$intParam);
}
```

The following builders are generated:

```c++
// Type builders are named `get`, and return a new instance of a type for a
// given set of parameters.
static MyType get(MLIRContext *context, int intParam);

// If `genVerifyDecl` is set to 1, the following method is also generated.
static MyType getChecked(function_ref<InFlightDiagnostic()> emitError,
                         MLIRContext *context, int intParam);
```

If these autogenerated methods are not desired, such as when they conflict with
a custom builder method, a type can set `skipDefaultBuilders` to 1 to signal
that they should not be generated.

#### Custom type builder methods

The default build methods may cover a majority of the simple cases related to
type construction, but when they cannot satisfy a type's needs, you can define
additional convenience `get` methods in the `builders` field as follows:

```tablegen
def MyType : ... {
  let parameters = (ins "int":$intParam);

  let builders = [
    TypeBuilder<(ins "int":$intParam)>,
    TypeBuilder<(ins CArg<"int", "0">:$intParam)>,
    TypeBuilder<(ins CArg<"int", "0">:$intParam), [{
      // Write the body of the `get` builder inline here.
      return Base::get($_ctxt, intParam);
    }]>,
    TypeBuilderWithInferredContext<(ins "Type":$typeParam), [{
      // This builder states that it can infer an MLIRContext instance from
      // its arguments.
      return Base::get(typeParam.getContext(), ...);
    }]>,
  ];
}
```

The `builders` field is a list of custom builders that are added to the type
class. In this example, we provide several different convenience builders that
are useful in different scenarios. The `ins` prefix is common to many function
declarations in ODS, which use a TableGen [`dag`](#tablegen-syntax). What
follows is a comma-separated list of types (quoted string or `CArg`) and names
prefixed with the `$` sign. The use of `CArg` allows for providing a default
value to that argument. Let's take a look at each of these builders
individually.

The first builder will generate the declaration of a builder method that looks
like:

```tablegen
  let builders = [
    TypeBuilder<(ins "int":$intParam)>,
  ];
```

```c++
class MyType : /*...*/ {
  /*...*/
  static MyType get(::mlir::MLIRContext *context, int intParam);
};
```

This builder is identical to the one that will be automatically generated for
`MyType`. The `context` parameter is implicitly added by the generator, and is
used when building the Type instance (with `Base::get`). The distinction
here is that we can provide the implementation of this `get` method. With this
style of builder definition only the declaration is generated; the implementor
of `MyType` will need to provide a definition of `MyType::get`.
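
As a rough illustration of this split between a generated declaration and a
hand-written definition, the sketch below is self-contained and does not use
real MLIR headers. `MockContext` and `baseGet` are hypothetical stand-ins for
`::mlir::MLIRContext` and `Base::get`, and the uniquing shown is a
simplification assumed for the example, not the actual MLIR implementation.

```c++
#include <map>
#include <memory>

// Hypothetical stand-in for ::mlir::MLIRContext; it owns the uniqued storage.
struct MockContext {
  std::map<int, std::unique_ptr<int>> storage;
};

// Stand-in for the class that would be generated for `MyType`.
class MyType {
public:
  // Declaration only, as emitted for `TypeBuilder<(ins "int":$intParam)>`.
  static MyType get(MockContext *context, int intParam);

  int getIntParam() const { return *impl; }
  bool operator==(const MyType &other) const { return impl == other.impl; }

private:
  explicit MyType(const int *impl) : impl(impl) {}

  // Plays the role of `Base::get`: returns the unique storage for this key,
  // creating it on first use.
  static MyType baseGet(MockContext *context, int intParam) {
    std::unique_ptr<int> &slot = context->storage[intParam];
    if (!slot)
      slot = std::make_unique<int>(intParam);
    return MyType(slot.get());
  }

  const int *impl;
};

// The definition the dialect author supplies by hand, typically in a .cpp file.
MyType MyType::get(MockContext *context, int intParam) {
  return baseGet(context, intParam);
}
```

In a real dialect, the hand-written body would forward to
`Base::get(context, intParam)` rather than to a helper like `baseGet`.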

The second builder will generate the declaration of a builder method that looks
like:

```tablegen
  let builders = [
    TypeBuilder<(ins CArg<"int", "0">:$intParam)>,
  ];
```

```c++
class MyType : /*...*/ {
  /*...*/
  static MyType get(::mlir::MLIRContext *context, int intParam = 0);
};
```

The constraints here are identical to the first builder example except for the
fact that `intParam` now has a default value attached.

The third builder will generate the declaration of a builder method that looks
like:

```tablegen
  let builders = [
    TypeBuilder<(ins CArg<"int", "0">:$intParam), [{
      // Write the body of the `get` builder inline here.
      return Base::get($_ctxt, intParam);
    }]>,
  ];
```

```c++
class MyType : /*...*/ {
  /*...*/
  static MyType get(::mlir::MLIRContext *context, int intParam = 0);
};

MyType MyType::get(::mlir::MLIRContext *context, int intParam) {
  // Write the body of the `get` builder inline here.
  return Base::get(context, intParam);
}
```

This is identical to the second builder example. The difference is that now, a
definition for the builder method will be generated automatically using the
provided code block as the body. When specifying the body inline, `$_ctxt` may
be used to access the `MLIRContext *` parameter.

The fourth builder will generate the declaration of a builder method that looks
like:

```tablegen
  let builders = [
    TypeBuilderWithInferredContext<(ins "Type":$typeParam), [{
      // This builder states that it can infer an MLIRContext instance from
      // its arguments.
      return Base::get(typeParam.getContext(), ...);
    }]>,
  ];
```

```c++
class MyType : /*...*/ {
  /*...*/
  static MyType get(Type typeParam);
};

MyType MyType::get(Type typeParam) {
  // This builder states that it can infer an MLIRContext instance from its
  // arguments.
  return Base::get(typeParam.getContext(), ...);
}
```

In this builder example, the main difference from the third builder example is
that the `MLIRContext` parameter is no longer added. This is because using
`TypeBuilderWithInferredContext` implies that the context parameter is not
necessary, as it can be inferred from the arguments to the builder.

## Debugging Tips

### Run `mlir-tblgen` to see the generated content

TableGen syntax sometimes can be obscure; reading the generated content can be a
very helpful way to understand and debug issues. To build `mlir-tblgen`, run
`cmake --build . --target mlir-tblgen` in your build directory and find the
`mlir-tblgen` binary in the `bin/` subdirectory. All the supported generators
can be found via `mlir-tblgen --help`. Examples include `--gen-op-decls` and
`--gen-op-defs`, as explained in [Generated C++ code](#generated-c-code).

To see the generated code, invoke `mlir-tblgen` with a specific generator and
provide include paths via `-I`.
For example,

```sh
# To see op C++ class declaration
mlir-tblgen --gen-op-decls -I /path/to/mlir/include /path/to/input/td/file
# To see op C++ class definition
mlir-tblgen --gen-op-defs -I /path/to/mlir/include /path/to/input/td/file
# To see op documentation
mlir-tblgen --gen-dialect-doc -I /path/to/mlir/include /path/to/input/td/file

# To see op interface C++ class declaration
mlir-tblgen --gen-op-interface-decls -I /path/to/mlir/include /path/to/input/td/file
# To see op interface C++ class definition
mlir-tblgen --gen-op-interface-defs -I /path/to/mlir/include /path/to/input/td/file
# To see op interface documentation
mlir-tblgen --gen-op-interface-doc -I /path/to/mlir/include /path/to/input/td/file
```

## Appendix

### Requirements and existing mechanisms analysis

The op description should be as declarative as possible to allow a wide range of
tools to work with them and query methods generated from them. In particular
this means specifying traits, constraints and shape inference information in a
way that is easily analyzable (e.g., avoid opaque calls to C++ functions where
possible).

We considered the approaches of several contemporary systems and focused on
requirements that were desirable:

*   Ops registered using a registry separate from C++ code.
    *   Unknown ops are allowed in MLIR, so ops need not be registered. The
        ability of the compiler to optimize those ops or graphs containing those
        ops is constrained but correct.
    *   The current proposal does not include a runtime op description, but it
        does not preclude such a description; it can be added later.
    *   The op registry is essential for generating C++ classes that make
        manipulating ops, verifying correct construction etc. in C++ easier by
        providing a typed representation and accessors.
*   The op registry will be defined in
    [TableGen](https://llvm.org/docs/TableGen/index.html) and be used to
    generate C++ classes and utility functions
    (builder/verifier/parser/printer).
    *   TableGen is a modelling specification language used by LLVM's backends
        and fits in well with trait-based modelling. This is an implementation
        decision and there are alternative ways of doing this. But the
        specification language is good for the requirements of modelling the
        traits (as seen from usage in LLVM processor backend modelling) and easy
        to extend, so it is a practical choice. If another good option comes
        up, we will consider it.
*   MLIR allows both defined and undefined ops.
    *   Defined ops should have fixed semantics and could have a corresponding
        reference implementation defined.
    *   Dialects are under full control of the dialect owner and normally live
        with the framework of the dialect.
*   The op's traits (e.g., commutative) are modelled along with the op in the
    registry.
*   The op's operand/return type constraints are modelled along with the op in
    the registry (see [Shape inference](ShapeInference.md) discussion below);
    this allows, e.g., optimized concise syntax in textual dumps.
*   Behavior of the op is documented along with the op with a summary and a
    description. The description is written in markdown and extracted for
    inclusion in the generated LangRef section of the dialect.
*   The generic assembly form of printing and parsing is available as normal,
    but a custom parser and printer can either be specified or automatically
    generated from an optional string representation showing the mapping of the
    "assembly" string to operands/type.
    *   Parser-level remappings (e.g., `eq` to enum) will be supported as part
        of the parser generation.
*   Matching patterns are specified separately from the op description.
    *   Contrasted with LLVM there is no "base" set of ops that every backend
        needs to be aware of. Instead there are many different dialects and the
        transformations/legalizations between these dialects form a graph of
        transformations.
*   Reference implementation may be provided along with the op definition.

    *   The reference implementation may be in terms of either standard ops or
        other reference implementations.

        TODO: document expectation if the dependent op's definition changes.

[TableGen]: https://llvm.org/docs/TableGen/index.html
[TableGenProgRef]: https://llvm.org/docs/TableGen/ProgRef.html
[TableGenBackend]: https://llvm.org/docs/TableGen/BackEnds.html#introduction
[OpBase]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/OpBase.td
[OpDefinitionsGen]: https://github.com/llvm/llvm-project/blob/main/mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp
[EnumsGen]: https://github.com/llvm/llvm-project/blob/main/mlir/tools/mlir-tblgen/EnumsGen.cpp
[StringAttr]: Dialects/Builtin.md/#stringattr
[IntegerAttr]: Dialects/Builtin.md/#integertype
[AttrClasses]: https://github.com/llvm/llvm-project/blob/main/mlir/include/mlir/IR/Attributes.h