1# How ISLE is Integrated with Cranelift
2
3This document contains an overview of and FAQ about how ISLE fits into
4Cranelift.
5
6## What is ISLE?
7
8ISLE is a domain-specific language for authoring instruction selection and
9rewrite rules. ISLE source text is [compiled down into Rust
10code](https://github.com/bytecodealliance/wasmtime/tree/main/cranelift/isle#implementation).
11
12Documentation on the ISLE language itself can be found
13[here](../isle/docs/language-reference.md).
14
15## How does ISLE integrate with the build system?
16
17The build integration is inside of `cranelift/codegen/build.rs`.
18
19The ISLE compiler is built as a build-dependency, and the build script then
20uses it to compile ISLE source to generated Rust code. In other words, the ISLE
21compiler behaves as an additional compile step, and ISLE source is rebuilt just
22like any Rust source would be. Nothing special needs to be done when editing
23ISLE.
24
25Sometimes, it's desirable to see what code is actually generated. By default,
26the generated code is placed in a Cargo-managed path in `target/`. If you want
27to see the source instead, invoke Cargo with the environment variable
28`ISLE_SOURCE_DIR` set to another directory, as follows:
29
30```shell
31$ ISLE_SOURCE_DIR=`pwd`/isle-sources cargo check -p cranelift-codegen
32```
33
34This will place the ISLE source in `./isle-sources`, where you can inspect it,
35debug by setting breakpoints in it, etc.
36
37If there are any errors during ISLE compilation (e.g., a type mismatch), you
38will see a basic error message with a file, line number, and one-line error. To
39see a more detailed output with context, `--features isle-errors` can be used.
40This will give pretty-printed errors with source context.
41
42Additionally, the `cranelift-codegen-meta` crate will automatically generate
43ISLE `extern` declarations and helpers for working with CLIF. The code that does
44this is defined inside `cranelift/codegen/meta/src/gen_inst.rs` and it creates
45several ISLE files in the `target/` output directory which are subsequently
46read by the ISLE compiler as part of its prologue.
47
48## Where are the relevant files?
49
50* `cranelift/isle`: The ISLE compiler's source code.
51
52* `cranelift/codegen/src/prelude.isle`: Common definitions and declarations for
53  ISLE. This gets included in every ISLE compilation.
54
55* `target/.../out/clif_lower.isle`: Auto-generated declarations and helpers
56  for working with CLIF for instruction lowering inside ISLE. Generated by
57  `cranelift/codegen/build.rs`, which builds it into every backend.
58
59* `target/.../out/clif_opt.isle`: Auto-generated declarations and helpers for
60  working with CLIF for mid-end optimizations. Generated by
61  `cranelift/codegen/build.rs`, which builds it into the mid-end optimizer.
62
63* `cranelift/codegen/src/machinst/isle.rs`: Common Rust code for gluing
64  ISLE-generated code into a target architecture's backend. Contains
65  implementations of ISA-agnostic `extern` helpers declared in ISLE.
66
67* `cranelift/codegen/src/isa/<arch>/inst.isle`: ISA-specific ISLE
68  helpers. Contains things like constructors for each instruction in the ISA, or
69  helpers to get a specific register. Helps bridge the gap between the raw,
70  non-SSA ISA and the pure, SSA view that the lowering rules have.
71
72* `cranelift/codegen/src/isa/<arch>/lower.isle`: Instruction selection lowering
73  rules for an ISA. These should be pure, SSA rewrite rules, that lend
74  themselves to eventual verification.
75
76* `cranelift/codegen/src/isa/<arch>/lower/isle.rs`: The Rust glue code for
77  integrating this ISA's ISLE-generate Rust code into the rest of the backend
78  for this ISA. Contains implementations of ISA-specific `extern` helpers
79  declared in ISLE.
80
81## Gluing ISLE's generated code into Cranelift
82
83Each ISA-specific, ISLE-generated file is generic over a `Context` trait that
84has a trait method for each `extern` helper defined in ISLE. There is one
85concrete implementation of each of these traits, defined in
86`cranelift/codegen/src/isa/<arch>/lower/isle.rs`. In general, the way that
87ISLE-generated code is glued into the rest of the system is with these trait
88implementations.
89
90There may also be a `lower` function defined in `isle.rs` that encapsulates
91creating the ISLE `Context` and calling into the generated code.
92
93## Lowering rules are always pure, use SSA
94
95The lowering rules themselves, defined in
96`cranelift/codegen/src/isa/<arch>/lower.isle`, must always be a pure mapping
97from a CLIF instruction to the target ISA's `MachInst`.
98
99Examples of things that the lowering rules themselves shouldn't deal with or
100talk about:
101
102* Registers that are modified (both read and written to, violating SSA)
103* Implicit uses of registers
104* Maintaining use counts for each CLIF value or virtual register
105
106Instead, these things should be handled by some combination of
107`cranelift/codegen/src/isa/<arch>/inst.isle` and general Rust code (either in
108`cranelift/codegen/src/isa/<arch>/lower/isle.rs` or elsewhere).
109
110When an instruction modifies a register, both reading from it and writing to it,
111we should build an SSA view of that instruction that gets legalized via "move
112mitosis" by splitting a move out from the register.
113
114For example, on x86 the `add` instruction reads and writes its first operand:
115
116    add a, b    ==    a = a + b
117
118So we present an SSA facade where `add` operates on three registers, instead of
119two, and defines one of them, while reading the other two and leaving them
120unmodified:
121
122    add a, b, c    ==    a = b + c
123
124Then, as an implementation detail of the facade, we emit moves as necessary:
125
126    add a, b, c    ==>    mov a, b; add b, c
127
128We call the process of emitting these moves "move mitosis". For ISAs with
129ubiquitous use of modified registers and instructions in two-operand form, like
130x86, we implement move mitosis with methods on the ISA's `MachInst`. For other
131ISAs that are RISCier and where modified registers are pretty rare, such as
132aarch64, we implement the handful of move mitosis special cases at the
133`inst.isle` layer. Either way, the important thing is that the lowering rules
134remain pure.
135
136Finally, note that these moves are generally cleaned up by the register
137allocator's move coalescing, and move mitosis will eventually go away completely
138once we switch over to `regalloc2`, which takes instructions in SSA form
139directly as input.
140
141Instructions that implicitly operate on specific registers, or which require
142that certain operands be in certain registers, are handled similarly: the
143lowering rules use a pure paradigm that ignores these constraints and has
144instructions that explicitly take implicit operands, and we ensure the
145constraints are fulfilled a layer below the lowering rules (in `inst.isle` or in
146Rust glue code).
147
148## When are lowering rules allowed to have side effects?
149
150Extractors (the matchers that appear on the left-hand sides of `rule`s) should
151**never** have side effects. When evaluating a rule's extractors, we haven't yet
152committed to evaluating that rule's right-hand side. If the extractors performed
153side effects, we could get deeply confusing action-at-a-distance bugs where
154rules we never fully match pull the rug out from under our feet.
155
156Anytime you are tempted to perform side effects in an extractor, you should
157instead just package up the things you would need in order to perform that side
158effect, and then have a separate constructor that takes that package and
159performs the side effect it describes. The constructor can only be called inside
160a rule's right-hand side, which is only evaluated after we've committed to this
161rule, which avoids the action-at-a-distance bugs described earlier.
162
163For example, loads have a side effect in CLIF: they might trap. Therefore, even
164if a loaded value is never used, we will emit code that implements that
165load. But if we are compiling for x86 we can sink loads into the operand
166for another operation depending on how the loaded value is used. If we sink that
167load into, say, an `add` then we need to tell the lowering context *not* to
168lower the CLIF `load` instruction anymore, because its effectively already
169lowered as part of lowering the `add` that uses the loaded value. Marking an
170instruction as "already lowered" is a side effect, and we might be tempted to
171perform that side effect in the extractor that matches sinkable loads. But we
172can't do that because although the load itself might be sinkable, there might be
173a reason why we ultimately don't perform this load-sinking rule, and if that
174happens we still need to lower the CLIF load.
175
176Therefore, we make the `sinkable_load` extractor create a `SinkableLoad` type
177that packages up everything we need to know about the load and how to tell the
178lowering context that we've sunk it and the lowering context doesn't need to
179lower it anymore, but *it doesn't actually tell that to the lowering context
180yet*.
181
182```lisp
183;; inst.isle
184
185;; A load that can be sunk into another operation.
186(type SinkableLoad extern (enum))
187
188;; Extract a `SinkableLoad` from a value if the value is defined by a compatible
189;; load.
190(decl sinkable_load (SinkableLoad) Value)
191(extern extractor sinkable_load sinkable_load)
192```
193
194Then, we pair that with a `sink_load` constructor that takes the `SinkableLoad`,
195performs the associated side effect of telling the lowering context not to lower
196the load anymore, and returns the x86 operand with the load sunken into it.
197
198```lisp
199;; inst.isle
200
201;; Sink a `SinkableLoad` into a `RegMemImm.Mem`.
202;;
203;; This is a side-effectful operation that notifies the context that the
204;; instruction that produced the `SinkableImm` has been sunk into another
205;; instruction, and no longer needs to be lowered.
206(decl sink_load (SinkableLoad) RegMemImm)
207(extern constructor sink_load sink_load)
208```
209
210Finally, we can use `sinkable_load` and `sink_load` inside lowering rules that
211create instructions where an operand is loaded directly from memory:
212
213```
214;; lower.isle
215
216(rule (lower (has_type (fits_in_64 ty)
217                       (iadd x (sinkable_load y))))
218      (value_reg (add ty
219                      (put_in_reg x)
220                      (sink_load y))))
221```
222
223See the `sinkable_load`, `SinkableLoad`, and `sink_load` declarations inside
224`cranelift/codegen/src/isa/x64/inst.isle` as well as their external
225implementations inside `cranelift/codegen/src/isa/x64/lower/isle.rs` for
226details.
227
228See also the "ISLE code should leverage types" section below.
229
230## ISLE code should leverage types
231
232ISLE is a typed language, and we should leverage that to prevent whole classes
233of bugs where possible. Use newtypes liberally.
234
235For example, use the `with_flags` family of helpers to pair flags-producing
236instructions with flags-consuming instructions, ensuring that no errant
237instructions are ever inserted between our flags-using instructions, clobbering
238their flags. See `with_flags`, `ProducesFlags`, and `ConsumesFlags` inside
239`cranelift/codegen/src/prelude.isle` for details.
240
241## Implicit type conversions
242
243ISLE supports implicit type conversions, and we will make use of these
244where possible to simplify the lowering rules. For example, we have
245`Value` and `ValueRegs`; the former denotes SSA values in CLIF, and
246the latter denotes the register or registers that hold that value in
247lowered code. Prior to introduction of implicit type conversions, we
248had many occurrences of expressions like `(value_regs r1 r2)` or
249`(value_reg r)`. Given that we have already defined a term
250`value_reg`, we can define a conversion such as
251
252```lisp
253    (convert Reg ValueRegs value_reg)
254```
255
256and the ISLE compiler will automatically insert it where the types
257imply that it is necessary. When properly defined, converters allow us
258to write rules like
259
260```lisp
261    (rule (lower (has_type (fits_in_64 ty)
262                           (iadd x y)))
263          (add ty x y))
264```
265
266### Implicit type conversions and side-effects
267
268While implicit conversions are very convenient, we take care to manage
269how they introduce invisible side-effects.
270
271Particularly important here is the fact that implicit conversions can
272occur more than once. For example, if we define `(convert Value Reg
273put_value_in_reg)`, and we emit an instruction `(add a a)` (say, to
274double the value of `a`) where `a` is a `Value`, then we will invoke
275`put_value_in_reg` twice.
276
277If this were an external constructor that performed some unique action
278per invocation, for example allocating a fresh register, then this
279would not be desirable. So we follow a convention when defining
280implicit type conversions: the conversion must be *idempotent*, i.e.,
281must return the same value if invoked more than
282once. `put_value_in_reg` in particular already has this property, so
283we are safe to use it as an implicit converter.
284
285Note that this condition is *not* the same as saying that the
286converter cannot have side-effects. It is perfectly fine for this to
287be the case, and can add significant convenience, even. The small loss
288in efficiency from invoking the converter twice is tolerable, and we
289could optimize it later if we needed to do so.
290
291Even given this, there may be times where it is still clearer to be
292explicit. For example, sinking a load is an explicit and important
293detail of some lowering patterns; while we could define an implicit
294conversion from `SinkableLoad` to `Reg`, it is probably better for now
295to retain the `(sink_load ...)` form for clarity.
296