# 'llvm' Dialect

This dialect maps [LLVM IR](https://llvm.org/docs/LangRef.html) into MLIR by
defining the corresponding operations and types. LLVM IR metadata is usually
represented as MLIR attributes, which offer additional structure verification.

We use "LLVM IR" to designate the
[intermediate representation of LLVM](https://llvm.org/docs/LangRef.html) and
"LLVM _dialect_" or "LLVM IR _dialect_" to refer to this MLIR dialect.

Unless explicitly stated otherwise, the semantics of the LLVM dialect operations
must correspond to the semantics of LLVM IR instructions and any divergence is
considered a bug. The dialect also contains auxiliary operations that smooth
over the differences in the IR structure, e.g., MLIR does not have `phi`
operations and LLVM IR does not have a `constant` operation. These auxiliary
operations are systematically prefixed with `mlir`, e.g. `llvm.mlir.constant`
where `llvm.` is the dialect namespace prefix.

[TOC]

## Dependency on LLVM IR

LLVM dialect is not expected to depend on any object that requires an
`LLVMContext`, such as an LLVM IR instruction or type. Instead, MLIR provides
thread-safe alternatives compatible with the rest of the infrastructure. The
dialect is allowed to depend on the LLVM IR objects that don't require a
context, such as data layout and triple description.

## Module Structure

IR modules use the built-in MLIR `ModuleOp` and support all its features. In
particular, modules can be named, nested and are subject to symbol visibility.
Modules can contain any operations, including LLVM functions and globals.

### Data Layout and Triple

An IR module may have an optional data layout and triple information attached
using MLIR attributes `llvm.data_layout` and `llvm.target_triple`, respectively.
Both are string attributes with the
[same syntax](https://llvm.org/docs/LangRef.html#data-layout) as in LLVM IR and
are verified to be correct.
They can be defined as follows.

```mlir
module attributes {llvm.data_layout = "e",
                   llvm.target_triple = "aarch64-linux-android"} {
  // module contents
}
```

### Functions

LLVM functions are represented by a special operation, `llvm.func`, that has
syntax similar to that of the built-in function operation but supports
LLVM-related features such as linkage and variadic argument lists. See detailed
description in the operation list [below](#llvmfunc-mlirllvmllvmfuncop).

### PHI Nodes and Block Arguments

MLIR uses block arguments instead of PHI nodes to communicate values between
blocks. Therefore, the LLVM dialect has no operation directly equivalent to
`phi` in LLVM IR. Instead, all terminators can pass values as successor operands
as these values will be forwarded as block arguments when the control flow is
transferred.

For example:

```mlir
^bb1:
  %0 = llvm.add %arg0, %cst : i32
  llvm.br ^bb2(%0 : i32)

// If the control flow comes from ^bb1, %arg1 == %0.
^bb2(%arg1: i32):
  // ...
```

is equivalent to LLVM IR

```llvm
bb1:
  %0 = add i32 %arg0, %cst
  br label %bb2

bb2:
  %arg1 = phi i32 [ %0, %bb1 ] ; other incoming values elided
```

Since there is no need to use the block identifier to differentiate the source
of different values, the LLVM dialect supports terminators that transfer the
control flow to the same block with different arguments. For example:

```mlir
^bb1:
  llvm.cond_br %cond, ^bb2(%0 : i32), ^bb2(%1 : i32)

^bb2(%arg0: i32):
  // ...
```

### Context-Level Values

Some value kinds in LLVM IR, such as constants and undefs, are uniqued in
context and used directly in relevant operations. MLIR does not support such
values for thread-safety and concept parsimony reasons.
Instead, regular values
are produced by dedicated operations that have the corresponding semantics:
[`llvm.mlir.constant`](#llvmmlirconstant-mlirllvmconstantop),
[`llvm.mlir.undef`](#llvmmlirundef-mlirllvmundefop),
[`llvm.mlir.null`](#llvmmlirnull-mlirllvmnullop). Note how these operations are
prefixed with `mlir.` to indicate that they don't belong to LLVM IR but are only
necessary to model it in MLIR. The values produced by these operations are
usable just like any other value.

Examples:

```mlir
// Create an undefined value of structure type with a 32-bit integer followed
// by a float.
%0 = llvm.mlir.undef : !llvm.struct<(i32, f32)>

// Null pointer to i8.
%1 = llvm.mlir.null : !llvm.ptr<i8>

// Null pointer to a function with signature void().
%2 = llvm.mlir.null : !llvm.ptr<func<void ()>>

// Constant 42 as i32.
%3 = llvm.mlir.constant(42 : i32) : i32

// Splat dense vector constant.
%4 = llvm.mlir.constant(dense<1.0> : vector<4xf32>) : vector<4xf32>
```

Note that constants list the type twice. This is an artifact of the LLVM dialect
not using built-in types, which are used for typed MLIR attributes. The syntax
will be reevaluated after considering composite constants.

### Globals

Global variables are also defined using a special operation,
[`llvm.mlir.global`](#llvmmlirglobal-mlirllvmglobalop), located at the module
level. Globals are MLIR symbols and are identified by their name.

Since functions need to be isolated-from-above, i.e. values defined outside the
function cannot be directly used inside the function, an additional operation,
[`llvm.mlir.addressof`](#llvmmliraddressof-mlirllvmaddressofop), is provided to
locally define a value containing the _address_ of a global. The actual value
can then be loaded from that pointer, or a new value can be stored into it if
the global is not declared constant.
This is similar to LLVM IR where globals
are accessed by name and have a pointer type.

### Linkage

Module-level named objects in the LLVM dialect, namely functions and globals,
have an optional _linkage_ attribute derived from LLVM IR
[linkage types](https://llvm.org/docs/LangRef.html#linkage-types). Linkage is
specified by the same keyword as in LLVM IR and is located between the operation
name (`llvm.func` or `llvm.mlir.global`) and the symbol name. If no linkage
keyword is present, `external` linkage is assumed by default. Linkage is
_distinct_ from MLIR symbol visibility.

### Attribute Pass-Through

The LLVM dialect provides a mechanism to forward function-level attributes to
LLVM IR using the `passthrough` attribute. This is an array attribute containing
either string attributes or array attributes. In the former case, the value of
the string is interpreted as the name of an LLVM IR function attribute. In the
latter case, the array is expected to contain exactly two string attributes, the
first corresponding to the name of an LLVM IR function attribute, and the second
corresponding to its value. Note that even integer LLVM IR function attributes
have their value represented in the string form.

Example:

```mlir
llvm.func @func() attributes {
  passthrough = ["noinline",           // value-less attribute
                 ["alignstack", "4"],  // integer attribute with value
                 ["other", "attr"]]    // attribute unknown to LLVM
} {
  llvm.return
}
```

If the attribute is not known to LLVM IR, it will be attached as a string
attribute.

## Types

LLVM dialect uses built-in types whenever possible and defines a set of
complementary types, which correspond to the LLVM IR types that cannot be
directly represented with built-in types. Similarly to other MLIR context-owned
objects, the creation and manipulation of LLVM dialect types is thread-safe.
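
As a rough illustration of this split, the declaration below mixes both kinds
of types. It is only a sketch: the function name `@mixed_types` is invented for
this example, and the type syntax follows the forms described in this document.

```mlir
// Hypothetical declaration. The built-in types i32 and f32 are used as is,
// while the pointer and the structure have no built-in equivalent and
// therefore use LLVM dialect types.
llvm.func @mixed_types(i32, !llvm.ptr<f32>) -> !llvm.struct<(i32, f32)>
```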

MLIR does not support module-scoped named type declarations, e.g. `%s = type
{i32, i32}` in LLVM IR. Instead, types must be fully specified at each use,
except for recursive types where only the first reference to a named type needs
to be fully specified. MLIR [type aliases](../LangRef.md/#type-aliases) can be
used to achieve more compact syntax.

The general syntax of LLVM dialect types is `!llvm.`, followed by a type kind
identifier (e.g., `ptr` for pointer or `struct` for structure) and by an
optional list of type parameters in angle brackets. The dialect follows MLIR
style for types with nested angle brackets and keyword specifiers rather than
using different bracket styles to differentiate types. Types inside the angle
brackets may omit the `!llvm.` prefix for brevity: the parser first attempts to
find a type (starting with `!` or a built-in type) and falls back to accepting a
keyword. For example, `!llvm.ptr<!llvm.ptr<i32>>` and `!llvm.ptr<ptr<i32>>` are
equivalent, with the latter being the canonical form, and denote a pointer to a
pointer to a 32-bit integer.

### Built-in Type Compatibility

LLVM dialect accepts a subset of built-in types that are referred to as _LLVM
dialect-compatible types_. The following types are compatible:

- Signless integers - `iN` (`IntegerType`).
- Floating point types - `bf16`, `f16`, `f32`, `f64`, `f80`, `f128`
  (`FloatType`).
- 1D vectors of signless integers or floating point types - `vector<NxT>`
  (`VectorType`).

Note that only a subset of types that can be represented by a given class is
compatible. For example, signed and unsigned integers are not compatible. The
LLVM dialect provides a function, `bool LLVM::isCompatibleType(Type)`, that can
be used as a compatibility check.

Each LLVM IR type corresponds to *exactly one* MLIR type, either built-in or
LLVM dialect type.
For example, because `i32` is LLVM-compatible, there is no
`!llvm.i32` type. However, `!llvm.ptr<T>` is defined in the LLVM dialect as
there is no corresponding built-in type.

### Additional Simple Types

The following non-parametric types derived from the LLVM IR are available in the
LLVM dialect:

- `!llvm.x86_mmx` (`LLVMX86MMXType`) - value held in an MMX register on an x86
  machine.
- `!llvm.ppc_fp128` (`LLVMPPCFP128Type`) - 128-bit floating-point value (two
  64-bit values).
- `!llvm.token` (`LLVMTokenType`) - a non-inspectable value associated with an
  operation.
- `!llvm.metadata` (`LLVMMetadataType`) - LLVM IR metadata, to be used only if
  the metadata cannot be represented as structured MLIR attributes.
- `!llvm.void` (`LLVMVoidType`) - does not represent any value; can only
  appear in function results.

These types represent a single value (or an absence thereof in case of `void`)
and correspond to their LLVM IR counterparts.

### Additional Parametric Types

These types are parameterized by the types they contain, e.g., the pointee or
the element type, which can be either compatible built-in or LLVM dialect types.

#### Pointer Types

Pointer types specify an address in memory.

Both opaque and type-parameterized pointer types are supported.
[Opaque pointers](https://llvm.org/docs/OpaquePointers.html) do not indicate the
type of the data pointed to, and are intended to simplify LLVM IR by encoding
behavior relevant to the pointee type into operations rather than into types.
Non-opaque pointer types carry the pointee type as a type parameter. Both kinds
of pointers may be additionally parameterized by an address space. The address
space is an integer, but this choice may be reconsidered if MLIR implements
named address spaces. The syntax of pointer types is as follows:

```
  llvm-ptr-type ::= `!llvm.ptr` (`<` integer-literal `>`)?
                  | `!llvm.ptr<` type (`,` integer-literal)? `>`
```

where the former case is the opaque pointer type and the latter case is the
non-opaque pointer type; the optional group containing the integer literal
corresponds to the address space. All cases are represented by `LLVMPointerType`
internally.

#### Array Types

Array types represent sequences of elements in memory. Array elements can be
addressed with a value unknown at compile time. Array types can be nested, but
each array type is itself one-dimensional.

Array types are parameterized by the fixed size and the element type.
Syntactically, their representation is the following:

```
  llvm-array-type ::= `!llvm.array<` integer-literal `x` type `>`
```

and they are internally represented as `LLVMArrayType`.

#### Function Types

Function types represent the type of a function, i.e. its signature.

Function types are parameterized by the result type, the list of argument types
and by an optional "variadic" flag. Unlike built-in `FunctionType`, LLVM dialect
functions (`LLVMFunctionType`) always have a single result, which may be
`!llvm.void` if the function does not return anything. The syntax is as follows:

```
  llvm-func-type ::= `!llvm.func<` type `(` type-list (`,` `...`)? `)` `>`
```

For example,

```mlir
!llvm.func<void ()>           // a function with no arguments;
!llvm.func<i32 (f32, i32)>    // a function with two arguments and a result;
!llvm.func<void (i32, ...)>   // a variadic function with at least one argument.
```

In the LLVM dialect, functions are not first-class objects and one cannot have a
value of function type. Instead, one can take the address of a function and
operate on pointers to functions.

### Vector Types

Vector types represent sequences of elements, typically when multiple data
elements are processed by a single instruction (SIMD).
Vectors are thought of as
stored in registers and therefore vector elements can only be addressed through
constant indices.

Vector types are parameterized by the size, which may be either _fixed_ or a
multiple of some fixed size in case of _scalable_ vectors, and the element type.
Vectors cannot be nested and only 1D vectors are supported. Scalable vectors are
still considered 1D.

LLVM dialect uses built-in vector types for _fixed_-size vectors of built-in
types, and provides additional types for fixed-sized vectors of LLVM dialect
types (`LLVMFixedVectorType`) and scalable vectors of any types
(`LLVMScalableVectorType`). These two additional types share the following
syntax:

```
  llvm-vec-type ::= `!llvm.vec<` (`?` `x`)? integer-literal `x` type `>`
```

Note that the sets of element types supported by built-in and LLVM dialect
vector types are mutually exclusive, e.g., the built-in vector type does not
accept `!llvm.ptr<i32>` and the LLVM dialect fixed-width vector type does not
accept `i32`.

The following functions are provided to operate on any kind of the vector types
compatible with the LLVM dialect:

- `bool LLVM::isCompatibleVectorType(Type)` - checks whether a type is a
  vector type compatible with the LLVM dialect;
- `Type LLVM::getVectorElementType(Type)` - returns the element type of any
  vector type compatible with the LLVM dialect;
- `llvm::ElementCount LLVM::getVectorNumElements(Type)` - returns the number
  of elements in any vector type compatible with the LLVM dialect;
- `Type LLVM::getFixedVectorType(Type, unsigned)` - gets a fixed vector type
  with the given element type and size; the resulting type is either a
  built-in or an LLVM dialect vector type depending on which one supports the
  given element type.

#### Examples of Compatible Vector Types

```mlir
vector<42 x i32>                    // Vector of 42 32-bit integers.
!llvm.vec<42 x ptr<i32>>            // Vector of 42 pointers to 32-bit integers.
!llvm.vec<? x 4 x i32>              // Scalable vector of 32-bit integers with
                                    // size divisible by 4.
!llvm.array<2 x vector<2 x i32>>    // Array of 2 vectors of 2 32-bit integers.
!llvm.array<2 x vec<2 x ptr<i32>>>  // Array of 2 vectors of 2 pointers to 32-bit
                                    // integers.
```

### Structure Types

The structure type is used to represent a collection of data members together in
memory. The elements of a structure may be any type that has a size.

Structure types are represented in a single dedicated class
`mlir::LLVM::LLVMStructType`. Internally, the struct type stores a (potentially
empty) name, a (potentially empty) list of contained types and a bitmask
indicating whether the struct is named, opaque, packed or uninitialized.
Structure types that don't have a name are referred to as _literal_ structs.
Such structures are uniquely identified by their contents. _Identified_ structs
on the other hand are uniquely identified by the name.

#### Identified Structure Types

Identified structure types are uniqued using their name in a given context.
Attempting to construct an identified structure with the same name as a
structure that already exists in the context *will result in the existing
structure being returned*. **MLIR does not auto-rename identified structs in
case of name conflicts** because there is no naming scope equivalent to a module
in LLVM IR since MLIR modules can be arbitrarily nested.

Programmatically, identified structures can be constructed in an _uninitialized_
state. In this case, they are given a name but the body must be set up by a
later call, using MLIR's type mutation mechanism. Such uninitialized types can
be used in type construction, but must be eventually initialized for IR to be
valid.
This mechanism allows for constructing _recursive_ or mutually referring
structure types: an uninitialized type can be used in its own initialization.

Once the type is initialized, its body cannot be changed anymore. Any further
attempts to modify the body will fail and return failure to the caller _unless
the type is initialized with the exact same body_. Type initialization is
thread-safe; however, if a concurrent thread initializes the type before the
current thread, the initialization may return failure.

The syntax for identified structure types is as follows.

```
llvm-ident-struct-type ::= `!llvm.struct<` string-literal `,` `opaque` `>`
                         | `!llvm.struct<` string-literal `,` `packed`?
                            `(` type-or-ref-list `)` `>`
type-or-ref-list ::= <maybe empty comma-separated list of type-or-ref>
type-or-ref ::= <any compatible type with optional !llvm.>
              | `!llvm.`? `struct<` string-literal `>`
```

The body of the identified struct is printed in full unless it is
transitively contained in the same struct. In the latter case, only the
identifier is printed. For example, the structure containing the pointer to
itself is represented as `!llvm.struct<"A", (ptr<struct<"A">>)>`, and the
structure `A` containing two pointers to the structure `B` containing a pointer
to the structure `A` is represented as
`!llvm.struct<"A", (ptr<struct<"B", (ptr<struct<"A">>)>>, ptr<struct<"B", (ptr<struct<"A">>)>>)>`.
Note that the structure `B` is "unrolled" for both
elements. _A structure with the same name but different body is a syntax error._
**The user must ensure structure name uniqueness across all modules processed in
a given MLIR context.** Structure names are arbitrary string literals and may
include, e.g., spaces and keywords.

Identified structs may be _opaque_. In this case, the body is unknown but the
structure type is considered _initialized_ and is valid in the IR.
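
The following sketch shows how a recursive and an opaque identified struct
might appear in the IR. The struct names (`"node"`, `"handle"`) and function
names are hypothetical, and the type syntax is assumed to follow the grammar
given above.

```mlir
// Hypothetical declarations; all names are made up for illustration.

// "node" refers to itself through a pointer. The nested occurrence is written
// as a bare reference because the body of "node" is already being printed.
llvm.func @list_head() -> !llvm.ptr<struct<"node", (i32, ptr<struct<"node">>)>>

// "handle" is opaque: its body is unknown, yet the type is valid in the IR.
llvm.func @open_handle() -> !llvm.ptr<struct<"handle", opaque>>
```

Because identified structs are uniqued by name within a context, any other use
of `"node"` in the same context has to agree with the body shown here.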

#### Literal Structure Types

Literal structures are uniqued according to the list of elements they contain,
and can optionally be packed. The syntax for such structs is as follows.

```
llvm-literal-struct-type ::= `!llvm.struct<` `packed`? `(` type-list `)` `>`
type-list ::= <maybe empty comma-separated list of types with optional !llvm.>
```

Literal structs cannot be recursive, but can contain other structs. Therefore,
they must be constructed in a single step with the entire list of contained
elements provided.

#### Examples of Structure Types

```mlir
!llvm.struct<>                  // NOT allowed
!llvm.struct<()>                // empty, literal
!llvm.struct<(i32)>             // literal
!llvm.struct<(struct<(i32)>)>   // struct containing a struct
!llvm.struct<packed (i8, i32)>  // packed struct
!llvm.struct<"a">               // recursive reference, only allowed within
                                // another struct, NOT allowed at top level
!llvm.struct<"a", (ptr<struct<"a">>)>  // supported example of recursive reference
!llvm.struct<"a", ()>           // empty, named (necessary to differentiate from
                                // recursive reference)
!llvm.struct<"a", opaque>       // opaque, named
!llvm.struct<"a", (i32)>        // named
!llvm.struct<"a", packed (i8, i32)>  // named, packed
```

### Unsupported Types

LLVM IR `label` type does not have a counterpart in the LLVM dialect since, in
MLIR, blocks are not values and don't need a type.

## Operations

All operations in the LLVM IR dialect have a custom form in MLIR. The mnemonic
of an operation is that used in LLVM IR prefixed with "`llvm.`".

[include "Dialects/LLVMOps.md"]

## Operations for LLVM IR Intrinsics

The MLIR operation system is open, making it unnecessary to introduce a hard
boundary between "core" operations and "intrinsics". General LLVM IR intrinsics
are modeled as first-class operations in the LLVM dialect.
Target-specific LLVM IR
intrinsics, e.g., NVVM or ROCDL, are modeled as separate dialects.

[include "Dialects/LLVMIntrinsicOps.md"]
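
As a small illustration of the distinction drawn in this section, the sketch
below contrasts a general intrinsic with a target-specific one. It assumes that
`%x` is an `f32` value defined earlier; the value names are arbitrary, and the
first operation is written in the generic operation form to avoid relying on a
particular custom syntax.

```mlir
// A general LLVM IR intrinsic, modeled as an operation of the LLVM dialect.
%abs = "llvm.intr.fabs"(%x) : (f32) -> f32

// A target-specific intrinsic, modeled as an operation of the NVVM dialect.
%tid = nvvm.read.ptx.sreg.tid.x : i32
```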