# MLIR Python Bindings

**Current status**: Under development and not enabled by default

[TOC]

## Building

### Pre-requisites

*   A relatively recent Python3 installation
*   Installation of Python dependencies as specified in
    `mlir/python/requirements.txt`

### CMake variables

*   **`MLIR_ENABLE_BINDINGS_PYTHON`**`:BOOL`

    Enables building the Python bindings. Defaults to `OFF`.

*   **`Python3_EXECUTABLE`**`:STRING`

    Specifies the `python` executable used for the LLVM build, including for
    determining header/link flags for the Python bindings. On systems with
    multiple Python implementations, setting this explicitly to the preferred
    `python3` executable is strongly recommended.

### Recommended development practices

It is recommended to use a Python virtual environment. Many ways exist for this,
but the following is the simplest:

```shell
# Make sure your 'python' is what you expect. Note that on multi-python
# systems, this may have a version suffix, and on many Linuxes and MacOS where
# python2 and python3 co-exist, you may also want to use `python3`.
which python
python -m venv ~/.venv/mlirdev
source ~/.venv/mlirdev/bin/activate

# Note that many LTS distros will bundle a version of pip itself that is too
# old to download all of the latest binaries for certain platforms.
# The pip version can be obtained with `python -m pip --version`, and for
# Linux specifically, this should be cross checked with minimum versions
# here: https://github.com/pypa/manylinux
# It is recommended to upgrade pip:
python -m pip install --upgrade pip

# Now the `python` command will resolve to your virtual environment and
# packages will be installed there.
python -m pip install -r mlir/python/requirements.txt

# Now run `cmake`, `ninja`, et al.
```

For interactive use, it is sufficient to add the
`tools/mlir/python_packages/mlir_core/` directory in your `build/` directory to
the `PYTHONPATH`. Typically:

```shell
export PYTHONPATH=$(cd build && pwd)/tools/mlir/python_packages/mlir_core
```

Note that if you have installed (e.g. via `ninja install`), then Python
packages for all enabled projects will be in your install tree under
`python_packages/` (e.g. `python_packages/mlir_core`). Official distributions
are built with a more specialized setup.

## Design

### Use cases

There are likely two primary use cases for the MLIR Python bindings:

1.  Support users who expect that an installed version of LLVM/MLIR will yield
    the ability to `import mlir` and use the API in a pure way out of the box.

1.  Downstream integrations will likely want to include parts of the API in
    their private namespace or specially built libraries, probably mixing it
    with other Python native bits.

### Composable modules

In order to support use case \#2, the Python bindings are organized into
composable modules that downstream integrators can include and re-export into
their own namespace if desired. This forces several design points:

*   Separate the construction/populating of a `py::module` from
    `PYBIND11_MODULE` global constructor.

*   Introduce headers for C++-only wrapper classes as other related C++ modules
    will need to interop with them.

*   Separate any initialization routines that depend on optional components into
    their own module/dependency (currently, things like `registerAllDialects`
    fall into this category).

There are a lot of co-related issues of shared library linkage, distribution
concerns, etc. that affect such things. Organizing the code into composable
modules (versus a monolithic `cpp` file) allows the flexibility to address many
of these as needed over time.
Also, compilation time for all of the template
meta-programming in pybind scales with the number of things you define in a
translation unit. Breaking into multiple translation units can significantly aid
compile times for APIs with a large surface area.

### Submodules

Generally, the C++ codebase namespaces most things into the `mlir` namespace.
However, in order to modularize and make the Python bindings easier to
understand, sub-packages are defined that map roughly to the directory structure
of functional units in MLIR.

Examples:

*   `mlir.ir`
*   `mlir.passes` (`pass` is a reserved word :( )
*   `mlir.dialect`
*   `mlir.execution_engine` (aside from namespacing, it is important that
    "bulky"/optional parts like this are isolated)

In addition, initialization functions that imply optional dependencies should be
in underscored (notionally private) modules such as `_init` and linked
separately. This allows downstream integrators to completely customize what is
included "in the box" and covers things like dialect registration, pass
registration, etc.

### Loader

LLVM/MLIR is a non-trivial python-native project that is likely to co-exist with
other non-trivial native extensions. As such, the native extension (i.e. the
`.so`/`.pyd`/`.dylib`) is exported as a notionally private top-level symbol
(`_mlir`), while a small set of Python code is provided in
`mlir/_cext_loader.py` and siblings which loads and re-exports it. This split
provides a place to stage code that needs to prepare the environment *before*
the shared library is loaded into the Python runtime, and also provides a place
that one-time initialization code can be invoked apart from module constructors.

It is recommended to avoid using `__init__.py` files to the extent possible,
until reaching a leaf package that represents a discrete component.
The rule to
keep in mind is that the presence of an `__init__.py` file prevents the ability
to split anything at that level or below in the namespace into different
directories, deployment packages, wheels, etc.

See the documentation for more information and advice:
https://packaging.python.org/guides/packaging-namespace-packages/

### Use the C-API

The Python APIs should seek to layer on top of the C-API to the degree possible.
Especially for the core, dialect-independent parts, such a binding enables
packaging decisions that would be difficult or impossible if spanning a C++ ABI
boundary. In addition, factoring in this way side-steps some very difficult
issues that arise when combining RTTI-based modules (which pybind derived things
are) with non-RTTI polymorphic C++ code (the default compilation mode of LLVM).

### Ownership in the Core IR

There are several top-level types in the core IR that are strongly owned by
their python-side reference:

*   `PyContext` (`mlir.ir.Context`)
*   `PyModule` (`mlir.ir.Module`)
*   `PyOperation` (`mlir.ir.Operation`) - but with caveats

All other objects are dependent. All objects maintain a back-reference
(keep-alive) to their closest containing top-level object. Further, dependent
objects fall into two categories: a) uniqued (which live for the life-time of
the context) and b) mutable. Mutable objects need additional machinery for
keeping track of when the C++ instance that backs their Python object is no
longer valid (typically due to some specific mutation of the IR, deletion, or
bulk operation).
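
The keep-alive relationship described above can be pictured with a small
plain-Python model. This is only a sketch: `ToyContext`, `ToyType` and
`make_type` are hypothetical stand-ins and do not reflect how the bindings are
implemented in C++. It shows that a dependent object holding a back-reference is
enough to keep its top-level owner alive.

```python
import gc
import weakref


class ToyContext:
    """Hypothetical stand-in for a top-level object such as a context."""


class ToyType:
    """Hypothetical stand-in for a dependent object owned by a context."""

    def __init__(self, context: ToyContext, name: str):
        # The back-reference keeps the context alive for as long as this
        # dependent object is reachable from Python.
        self._context = context
        self.name = name

    @property
    def context(self) -> ToyContext:
        return self._context


def make_type() -> ToyType:
    # The context is only referenced locally here; once the function returns,
    # the dependent object holds the sole remaining reference to it.
    ctx = ToyContext()
    return ToyType(ctx, "f32")


f32 = make_type()
probe = weakref.ref(f32.context)  # Observes the context without owning it.

gc.collect()
alive_while_referenced = probe() is not None

del f32  # Drop the last dependent object.
gc.collect()
alive_after_release = probe() is not None
```

In this model the context outlives the scope that created it and is released
only after the last dependent object goes away.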

### Optionality and argument ordering in the Core IR

The following types support being bound to the current thread as a context
manager:

*   `PyLocation` (`loc: mlir.ir.Location = None`)
*   `PyInsertionPoint` (`ip: mlir.ir.InsertionPoint = None`)
*   `PyMlirContext` (`context: mlir.ir.Context = None`)

In order to support composability of function arguments, when these types appear
as arguments, they should always be the last and appear in the above order and
with the given names (which is generally the order in which they are expected to
need to be expressed explicitly in special cases) as necessary. Each should
carry a default value of `py::none()` and use either a manual or automatic
conversion for resolving either with the explicit value or a value from the
thread context manager (i.e. `DefaultingPyMlirContext` or
`DefaultingPyLocation`).

The rationale for this is that in Python, trailing keyword arguments to the
*right* are the most composable, enabling a variety of strategies such as kwarg
passthrough, default values, etc. Keeping function signatures composable
increases the chances that interesting DSLs and higher level APIs can be
constructed without a lot of exotic boilerplate.

Used consistently, this enables a style of IR construction that rarely needs to
use explicit contexts, locations, or insertion points but is free to do so when
extra control is needed.

#### Operation hierarchy

As mentioned above, `PyOperation` is special because it can exist in either a
top-level or dependent state. The life-cycle is unidirectional: operations can
be created detached (top-level) and once added to another operation, they are
then dependent for the remainder of their lifetime.
The situation is more
complicated when considering construction scenarios where an operation is added
to a transitive parent that is still detached, necessitating further accounting
at such transition points (i.e. all such added children are initially added to
the IR with a parent of their outer-most detached operation, but then once it is
added to an attached operation, they need to be re-parented to the containing
module).

Due to the validity and parenting accounting needs, `PyOperation` is the owner
for regions and blocks and needs to be a top-level type that we can count on not
aliasing. This lets us do things like selectively invalidating instances when
mutations occur without worrying that there is some alias to the same operation
in the hierarchy. Operations are also the only entities that are allowed to be
in a detached state, and they are interned at the context level so that there is
never more than one Python `mlir.ir.Operation` object for a unique
`MlirOperation`, regardless of how it is obtained.

The C/C++ API allows for Region/Block to also be detached, but it simplifies the
ownership model a lot to eliminate that possibility in this API, allowing the
Region/Block to be completely dependent on its owning operation for accounting.
The aliasing of Python `Region`/`Block` instances to underlying
`MlirRegion`/`MlirBlock` is considered benign and these objects are not interned
in the context (unlike operations).

If we ever want to re-introduce detached regions/blocks, we could do so with a
new "DetachedRegion" class or similar and also avoid the complexity of
accounting. With the way it is now, we can avoid having a global live list for
regions and blocks. We may end up needing an op-local one at some point TBD,
depending on how hard it is to guarantee how mutations interact with their
Python peer objects. We can cross that bridge easily when we get there.
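
The interning of operations at the context level can be sketched with a toy
live table. All names here (`ToyContext`, `ToyOperation`, `operation_for`,
`invalidate`, and the integer handles) are assumptions made for illustration and
are not the bindings' API. The point is only that looking up the same native
handle twice yields the same Python object, which is what makes selective
invalidation safe.

```python
import weakref


class ToyOperation:
    """Hypothetical Python peer of one native operation."""

    def __init__(self, handle: int):
        self.handle = handle
        self.valid = True


class ToyContext:
    def __init__(self):
        # Live table: native handle -> the single Python peer object. Weak
        # values let a peer disappear once Python no longer references it.
        self._live_operations = weakref.WeakValueDictionary()

    def operation_for(self, handle: int) -> ToyOperation:
        op = self._live_operations.get(handle)
        if op is None:
            op = ToyOperation(handle)
            self._live_operations[handle] = op
        return op

    def invalidate(self, handle: int) -> None:
        # Because there is at most one peer per handle, marking it invalid
        # reaches every Python reference to that operation.
        op = self._live_operations.get(handle)
        if op is not None:
            op.valid = False


ctx = ToyContext()
first = ctx.operation_for(0x1000)   # e.g. reached by walking a block
second = ctx.operation_for(0x1000)  # e.g. reached through another path
other = ctx.operation_for(0x2000)

ctx.invalidate(0x1000)
```

Regions and blocks would not go through such a table in this model, which
mirrors the statement above that their aliasing is considered benign.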

Module, when used purely from the Python API, can't alias anyway, so we can use
it as a top-level ref type without a live-list for interning. If the API ever
changes such that this cannot be guaranteed (i.e. by letting you marshal a
native-defined Module in), then there would need to be a live table for it too.

## User-level API

### Context Management

The bindings rely on Python
[context managers](https://docs.python.org/3/reference/datamodel.html#context-managers)
(`with` statements) to simplify creation and handling of IR objects by omitting
repeated arguments such as MLIR contexts, operation insertion points and
locations. A context manager sets up the default object to be used by all
binding calls within the following context and in the same thread. This default
can be overridden by specific calls through the dedicated keyword arguments.

#### MLIR Context

An MLIR context is a top-level entity that owns attributes and types and is
referenced from virtually all IR constructs. Contexts also provide thread safety
at the C++ level. In Python bindings, the MLIR context is also a Python context
manager, so one can write:

```python
from mlir.ir import Context, Module

with Context() as ctx:
  # IR construction using `ctx` as context.

  # For example, parsing an MLIR module from string requires the context.
  Module.parse("builtin.module {}")
```

IR objects referencing a context usually provide access to it through the
`.context` property. Most IR-constructing functions expect the context to be
provided in some form. In case of attributes and types, the context may be
extracted from the contained attribute or type. In case of operations, the
context is systematically extracted from Locations (see below). When the context
cannot be extracted from any argument, the bindings API expects the (keyword)
argument `context`.
If it is not provided or set to `None` (default), it will be
looked up from an implicit stack of contexts maintained by the bindings in the
current thread and updated by context managers. If there is no surrounding
context, an error will be raised.

Note that it is possible to manually specify the MLIR context both inside and
outside of the `with` statement:

```python
from mlir.ir import Context, Module

standalone_ctx = Context()
with Context() as managed_ctx:
  # Parse a module in managed_ctx.
  Module.parse("...")

  # Parse a module in standalone_ctx (override the context manager).
  Module.parse("...", context=standalone_ctx)

# Parse a module without using context managers.
Module.parse("...", context=standalone_ctx)
```

The context object remains live as long as there are IR objects referencing it.

#### Insertion Points and Locations

When constructing an MLIR operation, two pieces of information are required:

-   an *insertion point* that indicates where the operation is to be created in
    the IR region/block/operation structure (usually before or after another
    operation, or at the end of some block); it may be missing, at which point
    the operation is created in the *detached* state;
-   a *location* that contains user-understandable information about the source
    of the operation (for example, file/line/column information), which must
    always be provided as it carries a reference to the MLIR context.

Both can be provided using context managers or explicitly as keyword arguments
in the operation constructor. They can also be provided as keyword arguments
`ip` and `loc` both within and outside of the context manager.
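
The lookup rule shared by contexts, insertion points and locations (an explicit
keyword argument wins, otherwise the innermost active context manager of the
current thread is used) can be modelled in plain Python. This is a toy sketch
under stated assumptions: `ToyContext`, `resolve_context` and `parse` are
hypothetical names, and raising `ValueError` is an assumption, since the text
above only says that an error is raised.

```python
import threading

_state = threading.local()


def _stack() -> list:
    # Each thread lazily gets its own stack of active contexts.
    if not hasattr(_state, "contexts"):
        _state.contexts = []
    return _state.contexts


class ToyContext:
    def __enter__(self):
        _stack().append(self)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        _stack().pop()
        return False  # Do not swallow exceptions.


def resolve_context(context=None) -> ToyContext:
    """Return the explicit context, else the innermost active one."""
    if context is not None:
        return context
    if not _stack():
        raise ValueError("no context given and none active in this thread")
    return _stack()[-1]


def parse(asm: str, context=None):
    # Hypothetical stand-in for a call that takes a trailing `context` kwarg.
    return (asm, resolve_context(context))


standalone = ToyContext()
with ToyContext() as managed:
    implicit = parse("...")[1]
    explicit = parse("...", context=standalone)[1]

try:
    parse("...")
    outside_error = None
except ValueError as error:
    outside_error = error
```

The bindings' own usage of this pattern for insertion points and locations
follows.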

```python
from mlir.ir import Context, InsertionPoint, Location, Module, Operation

with Context() as ctx:
  module = Module.create()

  # Prepare for inserting operations into the body of the module and indicate
  # that these operations originate in the "f.mlir" file at the given line and
  # column.
  with InsertionPoint(module.body), Location.file("f.mlir", line=42, col=1):
    # This operation will be inserted at the end of the module body and will
    # have the location set up by the context manager.
    Operation(<...>)

    # This operation will be inserted at the end of the module (and after the
    # previously constructed operation) and will have the location provided as
    # the keyword argument.
    Operation(<...>, loc=Location.file("g.mlir", line=1, col=10))

    # This operation will be inserted at the *beginning* of the block rather
    # than at its end.
    Operation(<...>, ip=InsertionPoint.at_block_begin(module.body))
```

Note that `Location` needs an MLIR context to be constructed. It can take the
context set up in the current thread by some surrounding context manager, or
accept it as an explicit argument:

```python
from mlir.ir import Context, Location

# Create a context and a location in this context in the same `with` statement.
with Context() as ctx, Location.file("f.mlir", line=42, col=1, context=ctx):
  pass
```

Locations are owned by the context and live as long as they are (transitively)
referenced from somewhere in Python code.

Unlike locations, the insertion point may be left unspecified (or, equivalently,
set to `None` or `False`) during operation construction. In this case, the
operation is created in the *detached* state, that is, it is not added into the
region of another operation and is owned by the caller. This is usually the case
for top-level operations that contain the IR, such as modules.
Regions, blocks
and values contained in an operation point back to it and keep it alive.

### Inspecting IR Objects

Inspecting the IR is one of the primary tasks the Python bindings are designed
for. One can traverse the IR operation/region/block structure and inspect their
aspects such as operation attributes and value types.

#### Operations, Regions and Blocks

Operations are represented as either:

-   the generic `Operation` class, useful in particular for generic processing
    of unregistered operations; or
-   a specific subclass of `OpView` that provides more semantically-loaded
    accessors to operation properties.

Given an `OpView` subclass, one can obtain an `Operation` using its `.operation`
property. Given an `Operation`, one can obtain the corresponding `OpView` using
its `.opview` property *as long as* the corresponding class has been set up.
This typically means that the Python module of its dialect has been loaded. By
default, the `OpView` version is produced when navigating the IR tree.

One can check if an operation has a specific type by means of Python's
`isinstance` function:

```python
operation = <...>
opview = <...>
if isinstance(operation.opview, mydialect.MyOp):
  pass
if isinstance(opview, mydialect.MyOp):
  pass
```

The components of an operation can be inspected using its properties.

-   `attributes` is a collection of operation attributes. It can be subscripted
    as both dictionary and sequence, e.g., both `operation.attributes["value"]`
    and `operation.attributes[0]` will work. There is no guarantee on the order
    in which the attributes are traversed when iterating over the `attributes`
    property as a sequence.
-   `operands` is a sequence collection of operation operands.
-   `results` is a sequence collection of operation results.
-   `regions` is a sequence collection of regions attached to the operation.
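
The dual subscripting of `attributes` can be pictured with a toy collection. The
class below is a hypothetical plain-Python stand-in, not the bindings'
implementation, and its positional order is simply dictionary insertion order,
whereas the real collection makes no ordering guarantee.

```python
class ToyAttributes:
    """Toy collection that can be subscripted by name or by position."""

    def __init__(self, named):
        self._named = dict(named)

    def __getitem__(self, key):
        if isinstance(key, str):
            # Dictionary-style lookup by attribute name.
            return self._named[key]
        # Sequence-style lookup by position.
        return list(self._named.values())[key]

    def __len__(self):
        return len(self._named)


attrs = ToyAttributes({"value": 42, "sym_name": "main"})
by_name = attrs["value"]
by_index = attrs[0]
```

Code that must be robust against the unspecified traversal order should prefer
lookup by name.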

The objects produced by `operands` and `results` have a `.types` property that
contains a sequence collection of types of the corresponding values.

```python
from mlir.ir import Operation

operation1 = <...>
operation2 = <...>
if operation1.results.types == operation2.operands.types:
  pass
```

`OpView` subclasses for specific operations may provide leaner accessors to
properties of an operation. For example, named attributes, operands and results
are usually accessible as properties of the `OpView` subclass with the same
name, such as `operation.const_value` instead of
`operation.attributes["const_value"]`. If this name is a reserved Python
keyword, it is suffixed with an underscore.

The operation itself is iterable, which provides access to the attached regions
in order:

```python
from mlir.ir import Operation

operation = <...>
for region in operation:
  do_something_with_region(region)
```

A region is conceptually a sequence of blocks. Objects of the `Region` class are
thus iterable, which provides access to the blocks. One can also use the
`.blocks` property.

```python
# Regions are directly iterable and give access to blocks.
for block1, block2 in zip(operation.regions[0], operation.regions[0].blocks):
  assert block1 == block2
```

A block contains a sequence of operations, and has several additional
properties. Objects of the `Block` class are iterable and provide access to the
operations contained in the block. So does the `.operations` property. Blocks
also have a list of arguments available as a sequence collection using the
`.arguments` property.

Blocks and regions belong to the parent operation in Python bindings and keep it
alive. This operation can be accessed using the `.owner` property.

#### Attributes and Types

Attributes and types are (mostly) immutable context-owned objects. They are
represented as either:

-   an opaque `Attribute` or `Type` object supporting printing and comparison;
    or
-   a concrete subclass thereof with access to properties of the attribute or
    type.

Given an `Attribute` or `Type` object, one can obtain a concrete subclass using
the constructor of the subclass. This may raise a `ValueError` if the attribute
or type is not of the expected subclass:

```python
from mlir.ir import Attribute, Type
from mlir.<dialect> import ConcreteAttr, ConcreteType

attribute = <...>
type = <...>
try:
  concrete_attr = ConcreteAttr(attribute)
  concrete_type = ConcreteType(type)
except ValueError as e:
  # Handle incorrect subclass.
  pass
```

In addition, concrete attribute and type classes provide a static `isinstance`
method to check whether an object of the opaque `Attribute` or `Type` type can
be downcast:

```python
from mlir.ir import Attribute, Type
from mlir.<dialect> import ConcreteAttr, ConcreteType

attribute = <...>
type = <...>

# No need to handle errors here.
if ConcreteAttr.isinstance(attribute):
  concrete_attr = ConcreteAttr(attribute)
if ConcreteType.isinstance(type):
  concrete_type = ConcreteType(type)
```

By default, and unlike operations, attributes and types are returned from IR
traversals using the opaque `Attribute` or `Type` that needs to be downcast.

Concrete attribute and type classes usually expose their properties as Python
readonly properties. For example, the elemental type of a tensor type can be
accessed using the `.element_type` property.

#### Values

MLIR has two kinds of values based on their defining object: block arguments and
operation results. Values are handled similarly to attributes and types. They
are represented as either:

-   a generic `Value` object; or
-   a concrete `BlockArgument` or `OpResult` object.

The former provides all the generic functionality such as comparison, type
access and printing. The latter provide access to the defining block or
operation and the position of the value within it. By default, the generic
`Value` objects are returned from IR traversals. Downcasting is implemented
through concrete subclass constructors, similarly to attributes and types:

```python
from mlir.ir import BlockArgument, OpResult, Value

value = ...

# Set `concrete` to the specific value subclass.
try:
  concrete = BlockArgument(value)
except ValueError:
  # This must not raise another ValueError as values are either block arguments
  # or op results.
  concrete = OpResult(value)
```

### Creating IR Objects

Python bindings also support IR creation and manipulation.

#### Operations, Regions and Blocks

Operations can be created given a `Location` and an optional `InsertionPoint`.
It is often easier to use context managers to specify locations and insertion
points for several operations created in a row as described above.

Concrete operations can be created by using constructors of the corresponding
`OpView` subclasses.
The generic, default form of the constructor accepts:

-   an optional sequence of types for operation results (`results`);
-   an optional sequence of values for operation operands, or another operation
    producing those values (`operands`);
-   an optional dictionary of operation attributes (`attributes`);
-   an optional sequence of successor blocks (`successors`);
-   the number of regions to attach to the operation (`regions`, default `0`);
-   the `loc` keyword argument containing the `Location` of this operation; if
    `None`, the location created by the closest context manager is used or an
    exception will be raised if there is no context manager;
-   the `ip` keyword argument indicating where the operation will be inserted in
    the IR; if `None`, the insertion point created by the closest context
    manager is used; if there is no surrounding context manager, the operation
    is created in the detached state.

Most operations will customize the constructor to accept a reduced list of
arguments that are relevant for the operation. For example, zero-result
operations may omit the `results` argument, and so can operations whose
result types can be derived from operand types unambiguously. As a concrete
example, built-in function operations can be constructed by providing a function
name as string and its argument and result types as a tuple of sequences:

```python
from mlir.ir import Context, InsertionPoint, Location, Module
from mlir.dialects import builtin

with Context():
  module = Module.create()
  with InsertionPoint(module.body), Location.unknown():
    func = builtin.FuncOp("main", ([], []))
```

Also see below for constructors generated from ODS.

Operations can also be constructed using the generic class and based on the
canonical string name of the operation using `Operation.create`.
It accepts the
operation name as string, which must exactly match the canonical name of the
operation in C++ or ODS, followed by the same argument list as the default
constructor for `OpView`. *This form is discouraged* from use and is intended
for generic operation processing.

```python
from mlir.ir import (Context, FunctionType, InsertionPoint, Location, Module,
                     Operation, TypeAttr)
from mlir.dialects import builtin

with Context():
  module = Module.create()
  with InsertionPoint(module.body), Location.unknown():
    # Operations can be created in a generic way.
    func = Operation.create(
        "builtin.func", results=[], operands=[],
        attributes={"type":TypeAttr.get(FunctionType.get([], []))},
        successors=None, regions=1)
    # The result will be downcasted to the concrete `OpView` subclass if
    # available.
    assert isinstance(func, builtin.FuncOp)
```

Regions are created for an operation when constructing it on the C++ side. They
are not constructible in Python and are not expected to exist outside of
operations (unlike in C++ that supports detached regions).

Blocks can be created within a given region and inserted before or after another
block of the same region using `create_before()`, `create_after()` methods of
the `Block` class, or the `create_at_start()` static method of the same class.
They are not expected to exist outside of regions (unlike in C++ that supports
detached blocks).

```python
from mlir.ir import Block, Context, Location, Operation

# A location must always be provided, here through the context manager.
with Context(), Location.unknown():
  op = Operation.create("generic.op", regions=1)

  # Create the first block in the region.
  entry_block = Block.create_at_start(op.regions[0])

  # Create further blocks.
  other_block = entry_block.create_after()
```

Blocks can be used to create `InsertionPoint`s, which can point to the beginning
or the end of the block, or just before its terminator.
It is common for
`OpView` subclasses to provide a `.body` property that can be used to construct
an `InsertionPoint`. For example, builtin `Module` and `FuncOp` provide a
`.body` and `.add_entry_block()`, respectively.

#### Attributes and Types

Attributes and types can be created given a `Context` or another attribute or
type object that already references the context. To indicate that they are owned
by the context, they are obtained by calling the static `get` method on the
concrete attribute or type class. These methods take as arguments the data
necessary to construct the attribute or type and the keyword `context`
argument when the context cannot be derived from other arguments.

```python
from mlir.ir import Context, F32Type, FloatAttr

# Attribute and types require access to an MLIR context, either directly or
# through another context-owned object.
ctx = Context()
f32 = F32Type.get(context=ctx)
pi = FloatAttr.get(f32, 3.14)

# They may use the context defined by the surrounding context manager.
with Context():
  f32 = F32Type.get()
  pi = FloatAttr.get(f32, 3.14)
```

Some attributes provide additional construction methods for clarity.

```python
from mlir.ir import Context, IntegerAttr, IntegerType

with Context():
  i8 = IntegerType.get_signless(8)
  IntegerAttr.get(i8, 42)
```

Builtin attributes can often be constructed from Python types with similar
structure.
For example, `ArrayAttr` can be constructed from a sequence
collection of attributes, and a `DictAttr` can be constructed from a dictionary:

```python
from mlir.ir import ArrayAttr, Context, DictAttr, UnitAttr

with Context():
  array = ArrayAttr.get([UnitAttr.get(), UnitAttr.get()])
  dictionary = DictAttr.get({"array": array, "unit": UnitAttr.get()})
```

## Style

In general, for the core parts of MLIR, the Python bindings should be largely
isomorphic with the underlying C++ structures. However, concessions are made
either for practicality or to give the resulting library an appropriately
"Pythonic" flavor.

### Properties vs get\*() methods

Generally favor converting trivial methods like `getContext()`, `getName()`,
`isEntryBlock()`, etc. to read-only Python properties (e.g. `context`). It is
primarily a matter of calling `def_property_readonly` vs `def` in binding code,
and makes things feel much nicer on the Python side.

For example, prefer:

```c++
m.def_property_readonly("context", ...)
```

Over:

```c++
m.def("getContext", ...)
```

### \_\_repr\_\_ methods

Things that have nice printed representations are really great :) If there is a
reasonable printed form, it can be a significant productivity boost to wire that
to the `__repr__` method (and verify it with a [doctest](#sample-doctest)).

### CamelCase vs snake\_case

Name functions/methods/properties in `snake_case` and classes in `CamelCase`. As
a mechanical concession to Python style, this can go a long way to making the
API feel like it fits in with its peers in the Python landscape.

If in doubt, choose names that will flow properly with other
[PEP 8 style names](https://pep8.org/#descriptive-naming-styles).
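
Seen from the Python side, the three style points above (read-only properties,
a useful `__repr__`, and `snake_case` names) add up to something like the
following. `ToyBlock` is a hypothetical plain-Python class used only for
illustration; real bindings would get the same surface through pybind calls
such as `def_property_readonly`.

```python
class ToyBlock:
    """Plain-Python stand-in used only to illustrate the naming guidance."""

    def __init__(self, name: str, is_entry: bool):
        self._name = name
        self._is_entry = is_entry

    @property
    def name(self) -> str:
        # Read-only property instead of a `getName()` method.
        return self._name

    @property
    def is_entry_block(self) -> bool:
        # snake_case spelling of a C++-style `isEntryBlock()`.
        return self._is_entry

    def __repr__(self) -> str:
        return f"ToyBlock(name={self._name!r}, is_entry_block={self._is_entry})"


block = ToyBlock("bb0", True)
shown = repr(block)

try:
    block.name = "bb1"  # No setter was defined, so this is rejected.
    assignment_rejected = False
except AttributeError:
    assignment_rejected = True
```

Because no setter is defined, attempts to assign to the property fail loudly
instead of silently diverging from the underlying object.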

### Prefer pseudo-containers

Many core IR constructs provide methods directly on the instance to query count
and begin/end iterators. Prefer hoisting these to dedicated pseudo containers.

For example, a direct mapping of blocks within regions could be done this way:

```python
region = ...

for block in region:
  pass
```

However, this way is preferred:

```python
region = ...

for block in region.blocks:
  pass

print(len(region.blocks))
print(region.blocks[0])
print(region.blocks[-1])
```

Instead of leaking STL-derived identifiers (`front`, `back`, etc), translate
them to appropriate `__dunder__` methods and iterator wrappers in the bindings.

Note that this can be taken too far, so use good judgment. For example, block
arguments may appear container-like but have defined methods for lookup and
mutation that would be hard to model properly without making semantics
complicated. If running into these, just mirror the C/C++ API.

### Provide one stop helpers for common things

One stop helpers that aggregate over multiple low level entities can be
incredibly helpful and are encouraged within reason. For example, making
`Context` have a `parse_asm` or equivalent that avoids needing to explicitly
construct a SourceMgr can be quite nice. One stop helpers do not have to be
mutually exclusive with a more complete mapping of the backing constructs.

## Testing

Tests should be added in the `test/Bindings/Python` directory and should
typically be `.py` files that have a lit run line.

We use `lit` and `FileCheck` based tests:

*   For generative tests (those that produce IR), define a Python module that
    constructs/prints the IR and pipe it through `FileCheck`.
*   Parsing should be kept self-contained within the module under test by use of
    raw constants and an appropriate `parse_asm` call.
* Any file I/O code should be staged through a tempfile vs relying on file
  artifacts/paths outside of the test module.
* For convenience, we also test non-generative API interactions with the same
  mechanisms, printing and `CHECK`ing as needed.

### Sample FileCheck test

```python
# RUN: %PYTHON %s | mlir-opt -split-input-file | FileCheck %s

import mlir.ir

# TODO: Move to a test utility class once any of this actually exists.
def print_module(f):
  m = f()
  print("// -----")
  print("// TEST_FUNCTION:", f.__name__)
  print(m.to_asm())
  return f

# CHECK-LABEL: TEST_FUNCTION: create_my_op
@print_module
def create_my_op():
  m = mlir.ir.Module()
  builder = m.new_op_builder()
  # CHECK: mydialect.my_operation ...
  builder.my_op()
  return m
```

## Integration with ODS

The MLIR Python bindings integrate with the tablegen-based ODS system for
providing user-friendly wrappers around MLIR dialects and operations. There are
multiple parts to this integration, outlined below. Most details have been
elided: refer to the build rules and python sources under `mlir.dialects` for
the canonical way to use this facility.

Users are responsible for providing a `{DIALECT_NAMESPACE}.py` (or an equivalent
directory with `__init__.py` file) as the entrypoint.

### Generating `_{DIALECT_NAMESPACE}_ops_gen.py` wrapper modules

Each dialect with a mapping to python requires that an appropriate
`_{DIALECT_NAMESPACE}_ops_gen.py` wrapper module is created. This is done by
invoking `mlir-tblgen` on a python-bindings specific tablegen wrapper that
includes the boilerplate and actual dialect specific `td` file.
An example, for the `StandardOps` (which is assigned the namespace `std` as a
special case):

```tablegen
#ifndef PYTHON_BINDINGS_STANDARD_OPS
#define PYTHON_BINDINGS_STANDARD_OPS

include "mlir/Bindings/Python/Attributes.td"
include "mlir/Dialect/StandardOps/IR/Ops.td"

#endif
```

In the main repository, building the wrapper is done via the CMake function
`add_mlir_dialect_python_bindings`, which invokes:

```
mlir-tblgen -gen-python-op-bindings -bind-dialect={DIALECT_NAMESPACE} \
    {PYTHON_BINDING_TD_FILE}
```

The generated op classes must be included in the `{DIALECT_NAMESPACE}.py` file
in a similar way that generated headers are included for C++ generated code:

```python
from ._my_dialect_ops_gen import *
```

### Extending the search path for wrapper modules

When the python bindings need to locate a wrapper module, they consult the
`dialect_search_path` and use it to find an appropriately named module. For the
main repository, this search path is hard-coded to include the `mlir.dialects`
module, which is where wrappers are emitted by the above build rule. Out of tree
dialects can add their modules to the search path by calling:

```python
mlir._cext.append_dialect_search_prefix("myproject.mlir.dialects")
```

### Wrapper module code organization

The wrapper module tablegen emitter outputs:

* A `_Dialect` class (extending `mlir.ir.Dialect`) with a `DIALECT_NAMESPACE`
  attribute.
* An `{OpName}` class for each operation (extending `mlir.ir.OpView`).
* Decorators for each of the above to register with the system.

Note: In order to avoid naming conflicts, all internal names used by the wrapper
module are prefixed by `_ods_`.

Each concrete `OpView` subclass further defines several public-intended
attributes:

* `OPERATION_NAME` attribute with the `str` fully qualified operation name
  (e.g. `math.abs`).
* An `__init__` method for the *default builder* if one is defined or inferred
  for the operation.
* `@property` getter for each operand or result (using an auto-generated name
  for each unnamed one).
* `@property` getter, setter and deleter for each declared attribute.

It further emits additional private-intended attributes meant for subclassing
and customization (default cases omit these attributes in favor of the defaults
on `OpView`):

* `_ODS_REGIONS`: A specification on the number and types of regions.
  Currently a tuple of `(min_region_count, has_no_variadic_regions)`. Note that
  the API does some light validation on this but the primary purpose is to
  capture sufficient information to perform other default building and region
  accessor generation.
* `_ODS_OPERAND_SEGMENTS` and `_ODS_RESULT_SEGMENTS`: Black-box value which
  indicates the structure of either the operand or results with respect to
  variadics. Used by `OpView._ods_build_default` to decode operand and result
  lists that contain lists.

#### Default Builder

Presently, only a single, default builder is mapped to the `__init__` method.
The intent is that this `__init__` method represents the *most specific* of the
builders typically generated for C++; however currently it is just the generic
form below.

* One argument for each declared result:
  * For single-valued results: Each will accept an `mlir.ir.Type`.
  * For variadic results: Each will accept a `List[mlir.ir.Type]`.
* One argument for each declared operand or attribute:
  * For single-valued operands: Each will accept an `mlir.ir.Value`.
  * For variadic operands: Each will accept a `List[mlir.ir.Value]`.
  * For attributes, it will accept an `mlir.ir.Attribute`.
* Trailing usage-specific, optional keyword arguments:
  * `loc`: An explicit `mlir.ir.Location` to use. Defaults to the location
    bound to the thread (i.e.
    `with Location.unknown():`) or an error if none is bound nor specified.
  * `ip`: An explicit `mlir.ir.InsertionPoint` to use. Defaults to the
    insertion point bound to the thread (i.e. `with InsertionPoint(...):`).

In addition, each `OpView` inherits a `build_generic` method which allows
construction via a (nested in the case of variadic) sequence of `results` and
`operands`. This can be used to get some default construction semantics for
operations that are otherwise unsupported in Python, at the expense of having a
very generic signature.

#### Extending Generated Op Classes

Note that this is a rather complex mechanism and this section errs on the side
of explicitness. Users are encouraged to find an example and duplicate it if
they don't feel the need to understand the subtlety. The `builtin` dialect
provides some relatively simple examples.

As mentioned above, the build system generates Python sources like
`_{DIALECT_NAMESPACE}_ops_gen.py` for each dialect with Python bindings. It is
often desirable to use these generated classes as a starting point for
further customization, so an extension mechanism is provided to make this easy
(you are always free to do ad-hoc patching in your `{DIALECT_NAMESPACE}.py` file
but we prefer a more standard mechanism that is applied uniformly).

To provide extensions, add a `_{DIALECT_NAMESPACE}_ops_ext.py` file to the
`dialects` module (i.e. adjacent to your `{DIALECT_NAMESPACE}.py` top-level and
the `*_ops_gen.py` file). Using the `builtin` dialect and `FuncOp` as an
example, the generated code will include an import like this:

```python
try:
  from . import _builtin_ops_ext as _ods_ext_module
except ImportError:
  _ods_ext_module = None
```

Then for each generated concrete `OpView` subclass, it will apply a decorator
like:

```python
@_ods_cext.register_operation(_Dialect)
@_ods_extend_opview_class(_ods_ext_module)
class FuncOp(_ods_ir.OpView):
  ...  # Generated class body elided.
```

See the `_ods_common.py` `extend_opview_class` function for details of the
mechanism. At a high level:

* If the extension module exists, locate an extension class for the op (in
  this example, `FuncOp`):
  * First by looking for an attribute with the exact name in the extension
    module.
  * Falling back to calling a `select_opview_mixin(parent_opview_cls)`
    function defined in the extension module.
* If a mixin class is found, a new subclass is dynamically created that
  multiply inherits from
  `({_builtin_ops_ext.FuncOp}, _builtin_ops_gen.FuncOp)`.

The mixin class should not inherit from anything (i.e. directly extends `object`
only). The facility is typically used to define custom `__init__` methods,
properties, instance methods and static methods. Due to the inheritance
ordering, the mixin class can act as though it extends the generated `OpView`
subclass in most contexts (i.e. `issubclass(_builtin_ops_ext.FuncOp, OpView)`
will return `False` but usage generally allows you to treat it as duck typed as
an `OpView`).

There are a couple of recommendations, given how the class hierarchy is defined:

* For static methods that need to instantiate the actual "leaf" op (which is
  dynamically generated and would result in circular dependencies to try to
  reference by name), prefer to use `@classmethod` and the concrete subclass
  will be provided as your first `cls` argument. See
  `_builtin_ops_ext.FuncOp.from_py_func` as an example.
* If seeking to replace the generated `__init__` method entirely, you may
  actually want to invoke the super-super-class `mlir.ir.OpView` constructor
  directly, as it takes an `mlir.ir.Operation`, which is likely what you are
  constructing (i.e. the generated `__init__` method likely adds more API
  constraints than you want to expose in a custom builder).

A pattern that comes up frequently is wanting to provide a sugared `__init__`
method which has optional arguments or type-polymorphic/implicit conversions
but otherwise invokes the default op building logic. For such cases, it is
recommended to use an idiom such as:

```python
  def __init__(self, sugar, spice, *, loc=None, ip=None):
    # ... massage into result_type, operands, attributes ...
    OpView.__init__(self, self.build_generic(
      results=[result_type],
      operands=operands,
      attributes=attributes,
      loc=loc,
      ip=ip))
```

Refer to the documentation for `build_generic` for more information.
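
The mixin mechanism described in this section can be sketched in plain Python
without MLIR installed. This is an illustration under stated assumptions, not
the code in `_ods_common.py`: `OpView`, `GeneratedFuncOp`, `FuncOpMixin`,
`from_name` and `extend_with_mixin` are hypothetical stand-ins, a `dict` plays
the role of the operation, and the `select_opview_mixin` fallback is omitted.

```python
class OpView:
  """Hypothetical stand-in for mlir.ir.OpView."""

  def __init__(self, operation):
    self.operation = operation


class GeneratedFuncOp(OpView):
  """Stand-in for a class emitted into a *_ops_gen.py module."""

  OPERATION_NAME = "builtin.func"


class FuncOpMixin:
  """Stand-in for an extension class; it extends `object` only."""

  @classmethod
  def from_name(cls, name):
    # `cls` is the dynamically created leaf class, so the mixin never has
    # to reference that class by name.
    return cls({"sym_name": name})

  @property
  def sym_name(self):
    return self.operation["sym_name"]


def extend_with_mixin(generated_cls, mixin_cls):
  # Without a mixin, the generated class is used unchanged.
  if mixin_cls is None:
    return generated_cls
  # Otherwise create a subclass that lists the mixin first in its bases.
  return type(generated_cls.__name__, (mixin_cls, generated_cls), {})


FuncOp = extend_with_mixin(GeneratedFuncOp, FuncOpMixin)
op = FuncOp.from_name("main")

print(type(op).__name__, op.sym_name)
print(isinstance(op, OpView), issubclass(FuncOpMixin, OpView))
```

In this sketch `isinstance(op, OpView)` holds while
`issubclass(FuncOpMixin, OpView)` does not, which matches the duck-typing note
above. Listing the mixin first in the bases is what lets its methods take
precedence over the generated ones.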