1 //! In-memory representation of compiled machine code, with labels and fixups to
2 //! refer to those labels. Handles constant-pool island insertion and also
3 //! veneer insertion for out-of-range jumps.
4 //!
5 //! This code exists to solve three problems:
6 //!
7 //! - Branch targets for forward branches are not known until later, when we
8 //!   emit code in a single pass through the instruction structs.
9 //!
10 //! - On many architectures, address references or offsets have limited range.
11 //!   For example, on AArch64, conditional branches can only target code +/- 1MB
12 //!   from the branch itself.
13 //!
14 //! - The lowering of control flow from the CFG-with-edges produced by
15 //!   [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
16 //!   edge blocks when the register allocator does not need to insert any
17 //!   spills/reloads/moves in edge blocks, results in many suboptimal branch
18 //!   patterns. The lowering also pays no attention to block order, and so
19 //!   two-target conditional forms (cond-br followed by uncond-br) can often by
20 //!   avoided because one of the targets is the fallthrough. There are several
21 //!   cases here where we can simplify to use fewer branches.
22 //!
23 //! This "buffer" implements a single-pass code emission strategy (with a later
24 //! "fixup" pass, but only through recorded fixups, not all instructions). The
25 //! basic idea is:
26 //!
27 //! - Emit branches as they are, including two-target (cond/uncond) compound
28 //!   forms, but with zero offsets and optimistically assuming the target will be
29 //!   in range. Record the "fixup" for later. Targets are denoted instead by
30 //!   symbolic "labels" that are then bound to certain offsets in the buffer as
31 //!   we emit code. (Nominally, there is a label at the start of every basic
32 //!   block.)
33 //!
34 //! - As we do this, track the offset in the buffer at which the first label
35 //!   reference "goes out of range". We call this the "deadline". If we reach the
36 //!   deadline and we still have not bound the label to which an unresolved branch
37 //!   refers, we have a problem!
38 //!
39 //! - To solve this problem, we emit "islands" full of "veneers". An island is
40 //!   simply a chunk of code inserted in the middle of the code actually produced
41 //!   by the emitter (e.g., vcode iterating over instruction structs). The emitter
42 //!   has some awareness of this: it either asks for an island between blocks, so
43 //!   it is not accidentally executed, or else it emits a branch around the island
44 //!   when all other options fail (see `Inst::EmitIsland` meta-instruction).
45 //!
46 //! - A "veneer" is an instruction (or sequence of instructions) in an "island"
47 //!   that implements a longer-range reference to a label. The idea is that, for
48 //!   example, a branch with a limited range can branch to a "veneer" instead,
49 //!   which is simply a branch in a form that can use a longer-range reference. On
50 //!   AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
51 //!   can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
52 //!   conditional branch's label reference can be fixed up with a "veneer" to
53 //!   achieve a longer range.
54 //!
55 //! - To implement all of this, we require the backend to provide a `LabelUse`
56 //!   type that implements a trait. This is nominally an enum that records one of
57 //!   several kinds of references to an offset in code -- basically, a relocation
58 //!   type -- and will usually correspond to different instruction formats. The
59 //!   `LabelUse` implementation specifies the maximum range, how to patch in the
60 //!   actual label location when known, and how to generate a veneer to extend the
61 //!   range.
62 //!
63 //! That satisfies label references, but we still may have suboptimal branch
64 //! patterns. To clean up the branches, we do a simple "peephole"-style
65 //! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
66 //! informs the buffer of branches in the code and, in the case of conditionals,
67 //! the code that would have been emitted to invert this branch's condition. We
68 //! track the "latest branches": these are branches that are contiguous up to
69 //! the current offset. (If any code is emitted after a branch, that branch or
70 //! run of contiguous branches is no longer "latest".) The latest branches are
71 //! those that we can edit by simply truncating the buffer and doing something
72 //! else instead.
73 //!
74 //! To optimize branches, we implement several simple rules, and try to apply
75 //! them to the "latest branches" when possible:
76 //!
77 //! - A branch with a label target, when that label is bound to the ending
78 //!   offset of the branch (the fallthrough location), can be removed altogether,
79 //!   because the branch would have no effect).
80 //!
81 //! - An unconditional branch that starts at a label location, and branches to
82 //!   another label, results in a "label alias": all references to the label bound
83 //!   *to* this branch instruction are instead resolved to the *target* of the
84 //!   branch instruction. This effectively removes empty blocks that just
85 //!   unconditionally branch to the next block. We call this "branch threading".
86 //!
87 //! - A conditional followed by an unconditional, when the conditional branches
88 //!   to the unconditional's fallthrough, results in (i) the truncation of the
89 //!   unconditional, (ii) the inversion of the condition's condition, and (iii)
90 //!   replacement of the conditional's target (using the original target of the
91 //!   unconditional). This is a fancy way of saying "we can flip a two-target
92 //!   conditional branch's taken/not-taken targets if it works better with our
93 //!   fallthrough". To make this work, the emitter actually gives the buffer
94 //!   *both* forms of every conditional branch: the true form is emitted into the
95 //!   buffer, and the "inverted" machine-code bytes are provided as part of the
96 //!   branch-fixup metadata.
97 //!
98 //! - An unconditional B preceded by another unconditional P, when B's label(s) have
99 //!   been redirected to target(B), can be removed entirely. This is an extension
100 //!   of the branch-threading optimization, and is valid because if we know there
101 //!   will be no fallthrough into this branch instruction (the prior instruction
102 //!   is an unconditional jump), and if we know we have successfully redirected
103 //!   all labels, then this branch instruction is unreachable. Note that this
104 //!   works because the redirection happens before the label is ever resolved
105 //!   (fixups happen at island emission time, at which point latest-branches are
106 //!   cleared, or at the end of emission), so we are sure to catch and redirect
107 //!   all possible paths to this instruction.
108 //!
109 //! # Branch-optimization Correctness
110 //!
111 //! The branch-optimization mechanism depends on a few data structures with
112 //! invariants, which are always held outside the scope of top-level public
113 //! methods:
114 //!
115 //! - The latest-branches list. Each entry describes a span of the buffer
116 //!   (start/end offsets), the label target, the corresponding fixup-list entry
117 //!   index, and the bytes (must be the same length) for the inverted form, if
118 //!   conditional. The list of labels that are bound to the start-offset of this
119 //!   branch is *complete* (if any label has a resolved offset equal to `start`
120 //!   and is not an alias, it must appear in this list) and *precise* (no label
121 //!   in this list can be bound to another offset). No label in this list should
122 //!   be an alias.  No two branch ranges can overlap, and branches are in
123 //!   ascending-offset order.
124 //!
125 //! - The labels-at-tail list. This contains all MachLabels that have been bound
126 //!   to (whose resolved offsets are equal to) the tail offset of the buffer.
127 //!   No label in this list should be an alias.
128 //!
129 //! - The label_offsets array, containing the bound offset of a label or
130 //!   UNKNOWN. No label can be bound at an offset greater than the current
131 //!   buffer tail.
132 //!
133 //! - The label_aliases array, containing another label to which a label is
134 //!   bound or UNKNOWN. A label's resolved offset is the resolved offset
135 //!   of the label it is aliased to, if this is set.
136 //!
137 //! We argue below, at each method, how the invariants in these data structures
138 //! are maintained (grep for "Post-invariant").
139 //!
140 //! Given these invariants, we argue why each optimization preserves execution
141 //! semantics below (grep for "Preserves execution semantics").
142 //!
143 //! # Avoiding Quadratic Behavior
144 //!
145 //! There are two cases where we've had to take some care to avoid
146 //! quadratic worst-case behavior:
147 //!
148 //! - The "labels at this branch" list can grow unboundedly if the
149 //!   code generator binds many labels at one location. If the count
150 //!   gets too high (defined by the `LABEL_LIST_THRESHOLD` constant), we
151 //!   simply abort an optimization early in a way that is always correct
152 //!   but is conservative.
153 //!
154 //! - The fixup list can interact with island emission to create
155 //!   "quadratic island behavior". In a little more detail, one can hit
156 //!   this behavior by having some pending fixups (forward label
157 //!   references) with long-range label-use kinds, and some others
158 //!   with shorter-range references that nonetheless still are pending
159 //!   long enough to trigger island generation. In such a case, we
160 //!   process the fixup list, generate veneers to extend some forward
161 //!   references' ranges, but leave the other (longer-range) ones
162 //!   alone. The way this was implemented put them back on a list and
163 //!   resulted in quadratic behavior.
164 //!
165 //!   To avoid this fixups are split into two lists: one "pending" list and one
166 //!   final list. The pending list is kept around for handling fixups related to
167 //!   branches so it can be edited/truncated. When an island is reached, which
168 //!   starts processing fixups, all pending fixups are flushed into the final
169 //!   list. The final list is a `BinaryHeap` which enables fixup processing to
170 //!   only process those which are required during island emission, deferring
171 //!   all longer-range fixups to later.
172 
173 use crate::binemit::{Addend, CodeOffset, Reloc};
174 use crate::ir::function::FunctionParameters;
175 use crate::ir::{ExceptionTag, ExternalName, RelSourceLoc, SourceLoc, TrapCode};
176 use crate::isa::unwind::UnwindInst;
177 use crate::machinst::{
178     BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
179 };
180 use crate::trace;
181 use crate::{MachInstEmitState, ir};
182 use crate::{VCodeConstantData, timing};
183 use core::ops::Range;
184 use cranelift_control::ControlPlane;
185 use cranelift_entity::{PrimaryMap, entity_impl};
186 use smallvec::SmallVec;
187 use std::cmp::Ordering;
188 use std::collections::BinaryHeap;
189 use std::mem;
190 use std::string::String;
191 use std::vec::Vec;
192 
193 #[cfg(feature = "enable-serde")]
194 use serde::{Deserialize, Serialize};
195 
196 #[cfg(feature = "enable-serde")]
197 pub trait CompilePhase {
198     type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
199     type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
200 }
201 
202 #[cfg(not(feature = "enable-serde"))]
203 pub trait CompilePhase {
204     type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;
205     type SourceLocType: core::fmt::Debug + PartialEq + Clone;
206 }
207 
208 /// Status of a compiled artifact that needs patching before being used.
209 #[derive(Clone, Debug, PartialEq)]
210 #[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
211 pub struct Stencil;
212 
213 /// Status of a compiled artifact ready to use.
214 #[derive(Clone, Debug, PartialEq)]
215 pub struct Final;
216 
217 impl CompilePhase for Stencil {
218     type MachSrcLocType = MachSrcLoc<Stencil>;
219     type SourceLocType = RelSourceLoc;
220 }
221 
222 impl CompilePhase for Final {
223     type MachSrcLocType = MachSrcLoc<Final>;
224     type SourceLocType = SourceLoc;
225 }
226 
227 #[derive(Clone, Copy, Debug, PartialEq, Eq)]
228 enum ForceVeneers {
229     Yes,
230     No,
231 }
232 
233 /// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
234 /// in bulk.
235 ///
236 /// This struct uses `SmallVec`s to support small-ish function bodies without
237 /// any heap allocation. As such, it will be several kilobytes large. This is
238 /// likely fine as long as it is stack-allocated for function emission then
239 /// thrown away; but beware if many buffer objects are retained persistently.
240 pub struct MachBuffer<I: VCodeInst> {
241     /// The buffer contents, as raw bytes.
242     data: SmallVec<[u8; 1024]>,
243     /// The required alignment of this buffer.
244     min_alignment: u32,
245     /// Any relocations referring to this code. Note that only *external*
246     /// relocations are tracked here; references to labels within the buffer are
247     /// resolved before emission.
248     relocs: SmallVec<[MachReloc; 16]>,
249     /// Any trap records referring to this code.
250     traps: SmallVec<[MachTrap; 16]>,
251     /// Any call site records referring to this code.
252     call_sites: SmallVec<[MachCallSite; 16]>,
253     /// Any exception-handler records referred to at call sites.
254     exception_handlers: SmallVec<[MachExceptionHandler; 16]>,
255     /// Any source location mappings referring to this code.
256     srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,
257     /// Any user stack maps for this code.
258     ///
259     /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
260     /// by code offset, and each stack map covers `span` bytes on the stack.
261     user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
262     /// Any unwind info at a given location.
263     unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
264     /// The current source location in progress (after `start_srcloc()` and
265     /// before `end_srcloc()`).  This is a (start_offset, src_loc) tuple.
266     cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,
267     /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
268     label_offsets: SmallVec<[CodeOffset; 16]>,
269     /// Label aliases: when one label points to an unconditional jump, and that
270     /// jump points to another label, we can redirect references to the first
271     /// label immediately to the second.
272     ///
273     /// Invariant: we don't have label-alias cycles. We ensure this by,
274     /// before setting label A to alias label B, resolving B's alias
275     /// target (iteratively until a non-aliased label); if B is already
276     /// aliased to A, then we cannot alias A back to B.
277     label_aliases: SmallVec<[MachLabel; 16]>,
278     /// Constants that must be emitted at some point.
279     pending_constants: SmallVec<[VCodeConstant; 16]>,
280     /// Byte size of all constants in `pending_constants`.
281     pending_constants_size: CodeOffset,
282     /// Traps that must be emitted at some point.
283     pending_traps: SmallVec<[MachLabelTrap; 16]>,
284     /// Fixups that haven't yet been flushed into `fixup_records` below and may
285     /// be related to branches that are chomped. These all get added to
286     /// `fixup_records` during island emission.
287     pending_fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
288     /// The nearest upcoming deadline for entries in `pending_fixup_records`.
289     pending_fixup_deadline: CodeOffset,
290     /// Fixups that must be performed after all code is emitted.
291     fixup_records: BinaryHeap<MachLabelFixup<I>>,
292     /// Latest branches, to facilitate in-place editing for better fallthrough
293     /// behavior and empty-block removal.
294     latest_branches: SmallVec<[MachBranch; 4]>,
295     /// All labels at the current offset (emission tail). This is lazily
296     /// cleared: it is actually accurate as long as the current offset is
297     /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
298     /// be considered as empty.
299     ///
300     /// For correctness, this *must* be complete (i.e., the vector must contain
301     /// all labels whose offsets are resolved to the current tail), because we
302     /// rely on it to update labels when we truncate branches.
303     labels_at_tail: SmallVec<[MachLabel; 4]>,
304     /// The last offset at which `labels_at_tail` is valid. It is conceptually
305     /// always describing the tail of the buffer, but we do not clear
306     /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
307     /// when the offset has grown past this (`labels_at_tail_off`) point.
308     /// Always <= `cur_offset()`.
309     labels_at_tail_off: CodeOffset,
310     /// Metadata about all constants that this function has access to.
311     ///
312     /// This records the size/alignment of all constants (not the actual data)
313     /// along with the last available label generated for the constant. This map
314     /// is consulted when constants are referred to and the label assigned to a
315     /// constant may change over time as well.
316     constants: PrimaryMap<VCodeConstant, MachBufferConstant>,
317     /// All recorded usages of constants as pairs of the constant and where the
318     /// constant needs to be placed within `self.data`. Note that the same
319     /// constant may appear in this array multiple times if it was emitted
320     /// multiple times.
321     used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>,
322     /// Indicates when a patchable region is currently open, to guard that it's
323     /// not possible to nest patchable regions.
324     open_patchable: bool,
325 }
326 
327 impl MachBufferFinalized<Stencil> {
328     /// Get a finalized machine buffer by applying the function's base source location.
329     pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {
330         MachBufferFinalized {
331             data: self.data,
332             relocs: self.relocs,
333             traps: self.traps,
334             call_sites: self.call_sites,
335             exception_handlers: self.exception_handlers,
336             srclocs: self
337                 .srclocs
338                 .into_iter()
339                 .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))
340                 .collect(),
341             user_stack_maps: self.user_stack_maps,
342             unwind_info: self.unwind_info,
343             alignment: self.alignment,
344         }
345     }
346 }
347 
348 /// A `MachBuffer` once emission is completed: holds generated code and records,
349 /// without fixups. This allows the type to be independent of the backend.
350 #[derive(PartialEq, Debug, Clone)]
351 #[cfg_attr(
352     feature = "enable-serde",
353     derive(serde_derive::Serialize, serde_derive::Deserialize)
354 )]
355 pub struct MachBufferFinalized<T: CompilePhase> {
356     /// The buffer contents, as raw bytes.
357     pub(crate) data: SmallVec<[u8; 1024]>,
358     /// Any relocations referring to this code. Note that only *external*
359     /// relocations are tracked here; references to labels within the buffer are
360     /// resolved before emission.
361     pub(crate) relocs: SmallVec<[FinalizedMachReloc; 16]>,
362     /// Any trap records referring to this code.
363     pub(crate) traps: SmallVec<[MachTrap; 16]>,
364     /// Any call site records referring to this code.
365     pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,
366     /// Any exception-handler records referred to at call sites.
367     pub(crate) exception_handlers: SmallVec<[FinalizedMachExceptionHandler; 16]>,
368     /// Any source location mappings referring to this code.
369     pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,
370     /// Any user stack maps for this code.
371     ///
372     /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
373     /// by code offset, and each stack map covers `span` bytes on the stack.
374     pub(crate) user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
375     /// Any unwind info at a given location.
376     pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
377     /// The required alignment of this buffer.
378     pub alignment: u32,
379 }
380 
381 const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
382 const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);
383 
384 /// Threshold on max length of `labels_at_this_branch` list to avoid
385 /// unbounded quadratic behavior (see comment below at use-site).
386 const LABEL_LIST_THRESHOLD: usize = 100;
387 
388 /// A label refers to some offset in a `MachBuffer`. It may not be resolved at
389 /// the point at which it is used by emitted code; the buffer records "fixups"
390 /// for references to the label, and will come back and patch the code
391 /// appropriately when the label's location is eventually known.
392 #[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
393 pub struct MachLabel(u32);
394 entity_impl!(MachLabel);
395 
396 impl MachLabel {
397     /// Get a label for a block. (The first N MachLabels are always reserved for
398     /// the N blocks in the vcode.)
399     pub fn from_block(bindex: BlockIndex) -> MachLabel {
400         MachLabel(bindex.index() as u32)
401     }
402 
403     /// Creates a string representing this label, for convenience.
404     pub fn to_string(&self) -> String {
405         format!("label{}", self.0)
406     }
407 }
408 
409 impl Default for MachLabel {
410     fn default() -> Self {
411         UNKNOWN_LABEL
412     }
413 }
414 
415 /// Represents the beginning of an editable region in the [`MachBuffer`], while code emission is
416 /// still occurring. An [`OpenPatchRegion`] is closed by [`MachBuffer::end_patchable`], consuming
417 /// the [`OpenPatchRegion`] token in the process.
418 pub struct OpenPatchRegion(usize);
419 
420 /// A region in the [`MachBuffer`] code buffer that can be edited prior to finalization. An example
421 /// of where you might want to use this is for patching instructions that mention constants that
422 /// won't be known until later: [`MachBuffer::start_patchable`] can be used to begin the patchable
423 /// region, instructions can be emitted with placeholder constants, and the [`PatchRegion`] token
424 /// can be produced by [`MachBuffer::end_patchable`]. Once the values of those constants are known,
425 /// the [`PatchRegion::patch`] function can be used to get a mutable buffer to the instruction
426 /// bytes, and the constants uses can be updated directly.
427 pub struct PatchRegion {
428     range: Range<usize>,
429 }
430 
431 impl PatchRegion {
432     /// Consume the patch region to yield a mutable slice of the [`MachBuffer`] data buffer.
433     pub fn patch<I: VCodeInst>(self, buffer: &mut MachBuffer<I>) -> &mut [u8] {
434         &mut buffer.data[self.range]
435     }
436 }
437 
438 impl<I: VCodeInst> MachBuffer<I> {
439     /// Create a new section, known to start at `start_offset` and with a size limited to
440     /// `length_limit`.
441     pub fn new() -> MachBuffer<I> {
442         MachBuffer {
443             data: SmallVec::new(),
444             min_alignment: I::function_alignment().minimum,
445             relocs: SmallVec::new(),
446             traps: SmallVec::new(),
447             call_sites: SmallVec::new(),
448             exception_handlers: SmallVec::new(),
449             srclocs: SmallVec::new(),
450             user_stack_maps: SmallVec::new(),
451             unwind_info: SmallVec::new(),
452             cur_srcloc: None,
453             label_offsets: SmallVec::new(),
454             label_aliases: SmallVec::new(),
455             pending_constants: SmallVec::new(),
456             pending_constants_size: 0,
457             pending_traps: SmallVec::new(),
458             pending_fixup_records: SmallVec::new(),
459             pending_fixup_deadline: u32::MAX,
460             fixup_records: Default::default(),
461             latest_branches: SmallVec::new(),
462             labels_at_tail: SmallVec::new(),
463             labels_at_tail_off: 0,
464             constants: Default::default(),
465             used_constants: Default::default(),
466             open_patchable: false,
467         }
468     }
469 
470     /// Current offset from start of buffer.
471     pub fn cur_offset(&self) -> CodeOffset {
472         self.data.len() as CodeOffset
473     }
474 
475     /// Add a byte.
476     pub fn put1(&mut self, value: u8) {
477         self.data.push(value);
478 
479         // Post-invariant: conceptual-labels_at_tail contains a complete and
480         // precise list of labels bound at `cur_offset()`. We have advanced
481         // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
482         // before, it is not anymore (and it cannot become equal, because
483         // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
484         // conceptually empty (even though it is only lazily cleared). No labels
485         // can be bound at this new offset (by invariant on `label_offsets`).
486         // Hence the invariant holds.
487     }
488 
489     /// Add 2 bytes.
490     pub fn put2(&mut self, value: u16) {
491         let bytes = value.to_le_bytes();
492         self.data.extend_from_slice(&bytes[..]);
493 
494         // Post-invariant: as for `put1()`.
495     }
496 
497     /// Add 4 bytes.
498     pub fn put4(&mut self, value: u32) {
499         let bytes = value.to_le_bytes();
500         self.data.extend_from_slice(&bytes[..]);
501 
502         // Post-invariant: as for `put1()`.
503     }
504 
505     /// Add 8 bytes.
506     pub fn put8(&mut self, value: u64) {
507         let bytes = value.to_le_bytes();
508         self.data.extend_from_slice(&bytes[..]);
509 
510         // Post-invariant: as for `put1()`.
511     }
512 
513     /// Add a slice of bytes.
514     pub fn put_data(&mut self, data: &[u8]) {
515         self.data.extend_from_slice(data);
516 
517         // Post-invariant: as for `put1()`.
518     }
519 
520     /// Reserve appended space and return a mutable slice referring to it.
521     pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
522         let off = self.data.len();
523         let new_len = self.data.len() + len;
524         self.data.resize(new_len, 0);
525         &mut self.data[off..]
526 
527         // Post-invariant: as for `put1()`.
528     }
529 
530     /// Align up to the given alignment.
531     pub fn align_to(&mut self, align_to: CodeOffset) {
532         trace!("MachBuffer: align to {}", align_to);
533         assert!(
534             align_to.is_power_of_two(),
535             "{align_to} is not a power of two"
536         );
537         while self.cur_offset() & (align_to - 1) != 0 {
538             self.put1(0);
539         }
540 
541         // Post-invariant: as for `put1()`.
542     }
543 
544     /// Begin a region of patchable code. There is one requirement for the
545     /// code that is emitted: It must not introduce any instructions that
546     /// could be chomped (branches are an example of this). In other words,
547     /// you must not call [`MachBuffer::add_cond_branch`] or
548     /// [`MachBuffer::add_uncond_branch`] between calls to this method and
549     /// [`MachBuffer::end_patchable`].
550     pub fn start_patchable(&mut self) -> OpenPatchRegion {
551         assert!(!self.open_patchable, "Patchable regions may not be nested");
552         self.open_patchable = true;
553         OpenPatchRegion(usize::try_from(self.cur_offset()).unwrap())
554     }
555 
556     /// End a region of patchable code, yielding a [`PatchRegion`] value that
557     /// can be consumed later to produce a one-off mutable slice to the
558     /// associated region of the data buffer.
559     pub fn end_patchable(&mut self, open: OpenPatchRegion) -> PatchRegion {
560         // No need to assert the state of `open_patchable` here, as we take
561         // ownership of the only `OpenPatchable` value.
562         self.open_patchable = false;
563         let end = usize::try_from(self.cur_offset()).unwrap();
564         PatchRegion { range: open.0..end }
565     }
566 
567     /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
568     /// offset yet.
569     pub fn get_label(&mut self) -> MachLabel {
570         let l = self.label_offsets.len() as u32;
571         self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
572         self.label_aliases.push(UNKNOWN_LABEL);
573         trace!("MachBuffer: new label -> {:?}", MachLabel(l));
574         MachLabel(l)
575 
576         // Post-invariant: the only mutation is to add a new label; it has no
577         // bound offset yet, so it trivially satisfies all invariants.
578     }
579 
580     /// Reserve the first N MachLabels for blocks.
581     pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
582         trace!("MachBuffer: first {} labels are for blocks", blocks);
583         debug_assert!(self.label_offsets.is_empty());
584         self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
585         self.label_aliases.resize(blocks, UNKNOWN_LABEL);
586 
587         // Post-invariant: as for `get_label()`.
588     }
589 
590     /// Registers metadata in this `MachBuffer` about the `constants` provided.
591     ///
592     /// This will record the size/alignment of all constants which will prepare
593     /// them for emission later on.
594     pub fn register_constants(&mut self, constants: &VCodeConstants) {
595         for (c, val) in constants.iter() {
596             self.register_constant(&c, val);
597         }
598     }
599 
600     /// Similar to [`MachBuffer::register_constants`] but registers a
601     /// single constant metadata. This function is useful in
602     /// situations where not all constants are known at the time of
603     /// emission.
604     pub fn register_constant(&mut self, constant: &VCodeConstant, data: &VCodeConstantData) {
605         let c2 = self.constants.push(MachBufferConstant {
606             upcoming_label: None,
607             align: data.alignment(),
608             size: data.as_slice().len(),
609         });
610         assert_eq!(*constant, c2);
611     }
612 
613     /// Completes constant emission by iterating over `self.used_constants` and
614     /// filling in the "holes" with the constant values provided by `constants`.
615     ///
616     /// Returns the alignment required for this entire buffer. Alignment starts
617     /// at the ISA's minimum function alignment and can be increased due to
618     /// constant requirements.
619     fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 {
620         let mut alignment = self.min_alignment;
621         for (constant, offset) in mem::take(&mut self.used_constants) {
622             let constant = constants.get(constant);
623             let data = constant.as_slice();
624             self.data[offset as usize..][..data.len()].copy_from_slice(data);
625             alignment = constant.alignment().max(alignment);
626         }
627         alignment
628     }
629 
630     /// Returns a label that can be used to refer to the `constant` provided.
631     ///
632     /// This will automatically defer a new constant to be emitted for
633     /// `constant` if it has not been previously emitted. Note that this
634     /// function may return a different label for the same constant at
635     /// different points in time. The label is valid to use only from the
636     /// current location; the MachBuffer takes care to emit the same constant
637     /// multiple times if needed so the constant is always in range.
638     pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel {
639         let MachBufferConstant {
640             align,
641             size,
642             upcoming_label,
643         } = self.constants[constant];
644         if let Some(label) = upcoming_label {
645             return label;
646         }
647 
648         let label = self.get_label();
649         trace!(
650             "defer constant: eventually emit {size} bytes aligned \
651              to {align} at label {label:?}",
652         );
653         self.pending_constants.push(constant);
654         self.pending_constants_size += size as u32;
655         self.constants[constant].upcoming_label = Some(label);
656         label
657     }
658 
659     /// Bind a label to the current offset. A label can only be bound once.
660     pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) {
661         trace!(
662             "MachBuffer: bind label {:?} at offset {}",
663             label,
664             self.cur_offset()
665         );
666         debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
667         debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
668         let offset = self.cur_offset();
669         self.label_offsets[label.0 as usize] = offset;
670         self.lazily_clear_labels_at_tail();
671         self.labels_at_tail.push(label);
672 
673         // Invariants hold: bound offset of label is <= cur_offset (in fact it
674         // is equal). If the `labels_at_tail` list was complete and precise
675         // before, it is still, because we have bound this label to the current
676         // offset and added it to the list (which contains all labels at the
677         // current offset).
678 
679         self.optimize_branches(ctrl_plane);
680 
681         // Post-invariant: by `optimize_branches()` (see argument there).
682     }
683 
684     /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
685     /// offset that it applies to.
686     fn lazily_clear_labels_at_tail(&mut self) {
687         let offset = self.cur_offset();
688         if offset > self.labels_at_tail_off {
689             self.labels_at_tail_off = offset;
690             self.labels_at_tail.clear();
691         }
692 
693         // Post-invariant: either labels_at_tail_off was at cur_offset, and
694         // state is untouched, or was less than cur_offset, in which case the
695         // labels_at_tail list was conceptually empty, and is now actually
696         // empty.
697     }
698 
699     /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
700     pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
701         let mut iters = 0;
702         while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
703             label = self.label_aliases[label.0 as usize];
704             // To protect against an infinite loop (despite our assurances to
705             // ourselves that the invariants make this impossible), assert out
706             // after 1M iterations. The number of basic blocks is limited
707             // in most contexts anyway so this should be impossible to hit with
708             // a legitimate input.
709             iters += 1;
710             assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
711         }
712         self.label_offsets[label.0 as usize]
713 
714         // Post-invariant: no mutations.
715     }
716 
717     /// Emit a reference to the given label with the given reference type (i.e.,
718     /// branch-instruction format) at the current offset.  This is like a
719     /// relocation, but handled internally.
720     ///
721     /// This can be called before the branch is actually emitted; fixups will
722     /// not happen until an island is emitted or the buffer is finished.
723     pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
724         trace!(
725             "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
726             offset, label, kind
727         );
728 
729         // Add the fixup, and update the worst-case island size based on a
730         // veneer for this label use.
731         let fixup = MachLabelFixup {
732             label,
733             offset,
734             kind,
735         };
736         self.pending_fixup_deadline = self.pending_fixup_deadline.min(fixup.deadline());
737         self.pending_fixup_records.push(fixup);
738 
739         // Post-invariant: no mutations to branches/labels data structures.
740     }
741 
742     /// Inform the buffer of an unconditional branch at the given offset,
743     /// targeting the given label. May be used to optimize branches.
744     /// The last added label-use must correspond to this branch.
745     /// This must be called when the current offset is equal to `start`; i.e.,
746     /// before actually emitting the branch. This implies that for a branch that
747     /// uses a label and is eligible for optimizations by the MachBuffer, the
748     /// proper sequence is:
749     ///
750     /// - Call `use_label_at_offset()` to emit the fixup record.
751     /// - Call `add_uncond_branch()` to make note of the branch.
752     /// - Emit the bytes for the branch's machine code.
753     ///
754     /// Additional requirement: no labels may be bound between `start` and `end`
755     /// (exclusive on both ends).
756     pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
757         debug_assert!(
758             !self.open_patchable,
759             "Branch instruction inserted within a patchable region"
760         );
761         assert!(self.cur_offset() == start);
762         debug_assert!(end > start);
763         assert!(!self.pending_fixup_records.is_empty());
764         let fixup = self.pending_fixup_records.len() - 1;
765         self.lazily_clear_labels_at_tail();
766         self.latest_branches.push(MachBranch {
767             start,
768             end,
769             target,
770             fixup,
771             inverted: None,
772             labels_at_this_branch: self.labels_at_tail.clone(),
773         });
774 
775         // Post-invariant: we asserted branch start is current tail; the list of
776         // labels at branch is cloned from list of labels at current tail.
777     }
778 
779     /// Inform the buffer of a conditional branch at the given offset,
780     /// targeting the given label. May be used to optimize branches.
781     /// The last added label-use must correspond to this branch.
782     ///
783     /// Additional requirement: no labels may be bound between `start` and `end`
784     /// (exclusive on both ends).
785     pub fn add_cond_branch(
786         &mut self,
787         start: CodeOffset,
788         end: CodeOffset,
789         target: MachLabel,
790         inverted: &[u8],
791     ) {
792         debug_assert!(
793             !self.open_patchable,
794             "Branch instruction inserted within a patchable region"
795         );
796         assert!(self.cur_offset() == start);
797         debug_assert!(end > start);
798         assert!(!self.pending_fixup_records.is_empty());
799         debug_assert!(
800             inverted.len() == (end - start) as usize,
801             "branch length = {}, but inverted length = {}",
802             end - start,
803             inverted.len()
804         );
805         let fixup = self.pending_fixup_records.len() - 1;
806         let inverted = Some(SmallVec::from(inverted));
807         self.lazily_clear_labels_at_tail();
808         self.latest_branches.push(MachBranch {
809             start,
810             end,
811             target,
812             fixup,
813             inverted,
814             labels_at_this_branch: self.labels_at_tail.clone(),
815         });
816 
817         // Post-invariant: we asserted branch start is current tail; labels at
818         // branch list is cloned from list of labels at current tail.
819     }
820 
821     fn truncate_last_branch(&mut self) {
822         debug_assert!(
823             !self.open_patchable,
824             "Branch instruction truncated within a patchable region"
825         );
826 
827         self.lazily_clear_labels_at_tail();
828         // Invariants hold at this point.
829 
830         let b = self.latest_branches.pop().unwrap();
831         assert!(b.end == self.cur_offset());
832 
833         // State:
834         //    [PRE CODE]
835         //  Offset b.start, b.labels_at_this_branch:
836         //    [BRANCH CODE]
837         //  cur_off, self.labels_at_tail -->
838         //    (end of buffer)
839         self.data.truncate(b.start as usize);
840         self.pending_fixup_records.truncate(b.fixup);
841         while let Some(last_srcloc) = self.srclocs.last_mut() {
842             if last_srcloc.end <= b.start {
843                 break;
844             }
845             if last_srcloc.start < b.start {
846                 last_srcloc.end = b.start;
847                 break;
848             }
849             self.srclocs.pop();
850         }
851         // State:
852         //    [PRE CODE]
853         //  cur_off, Offset b.start, b.labels_at_this_branch:
854         //    (end of buffer)
855         //
856         //  self.labels_at_tail -->  (past end of buffer)
857         let cur_off = self.cur_offset();
858         self.labels_at_tail_off = cur_off;
859         // State:
860         //    [PRE CODE]
861         //  cur_off, Offset b.start, b.labels_at_this_branch,
862         //  self.labels_at_tail:
863         //    (end of buffer)
864         //
865         // resolve_label_offset(l) for l in labels_at_tail:
866         //    (past end of buffer)
867 
868         trace!(
869             "truncate_last_branch: truncated {:?}; off now {}",
870             b, cur_off
871         );
872 
873         // Fix up resolved label offsets for labels at tail.
874         for &l in &self.labels_at_tail {
875             self.label_offsets[l.0 as usize] = cur_off;
876         }
877         // Old labels_at_this_branch are now at cur_off.
878         self.labels_at_tail.extend(b.labels_at_this_branch);
879 
880         // Post-invariant: this operation is defined to truncate the buffer,
881         // which moves cur_off backward, and to move labels at the end of the
882         // buffer back to the start-of-branch offset.
883         //
884         // latest_branches satisfies all invariants:
885         // - it has no branches past the end of the buffer (branches are in
886         //   order, we removed the last one, and we truncated the buffer to just
887         //   before the start of that branch)
888         // - no labels were moved to lower offsets than the (new) cur_off, so
889         //   the labels_at_this_branch list for any other branch need not change.
890         //
891         // labels_at_tail satisfies all invariants:
892         // - all labels that were at the tail after the truncated branch are
893         //   moved backward to just before the branch, which becomes the new tail;
894         //   thus every element in the list should remain (ensured by `.extend()`
895         //   above).
896         // - all labels that refer to the new tail, which is the start-offset of
897         //   the truncated branch, must be present. The `labels_at_this_branch`
898         //   list in the truncated branch's record is a complete and precise list
899         //   of exactly these labels; we append these to labels_at_tail.
900         // - labels_at_tail_off is at cur_off after truncation occurs, so the
901         //   list is valid (not to be lazily cleared).
902         //
903         // The stated operation was performed:
904         // - For each label at the end of the buffer prior to this method, it
905         //   now resolves to the new (truncated) end of the buffer: it must have
906         //   been in `labels_at_tail` (this list is precise and complete, and
907         //   the tail was at the end of the truncated branch on entry), and we
908         //   iterate over this list and set `label_offsets` to the new tail.
909         //   None of these labels could have been an alias (by invariant), so
910         //   `label_offsets` is authoritative for each.
911         // - No other labels will be past the end of the buffer, because of the
912         //   requirement that no labels be bound to the middle of branch ranges
913         //   (see comments to `add_{cond,uncond}_branch()`).
914         // - The buffer is truncated to just before the last branch, and the
915         //   fixup record referring to that last branch is removed.
916     }
917 
918     /// Performs various optimizations on branches pointing at the current label.
919     pub fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) {
920         if ctrl_plane.get_decision() {
921             return;
922         }
923 
924         self.lazily_clear_labels_at_tail();
925         // Invariants valid at this point.
926 
927         trace!(
928             "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
929             self.latest_branches, self.labels_at_tail, self.pending_fixup_records
930         );
931 
932         // We continue to munch on branches at the tail of the buffer until no
933         // more rules apply. Note that the loop only continues if a branch is
934         // actually truncated (or if labels are redirected away from a branch),
935         // so this always makes progress.
936         while let Some(b) = self.latest_branches.last() {
937             let cur_off = self.cur_offset();
938             trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
939             // If there has been any code emission since the end of the last branch or
940             // label definition, then there's nothing we can edit (because we
941             // don't move code once placed, only back up and overwrite), so
942             // clear the records and finish.
943             if b.end < cur_off {
944                 break;
945             }
946 
947             // If the "labels at this branch" list on this branch is
948             // longer than a threshold, don't do any simplification,
949             // and let the branch remain to separate those labels from
950             // the current tail. This avoids quadratic behavior (see
951             // #3468): otherwise, if a long string of "goto next;
952             // next:" patterns are emitted, all of the labels will
953             // coalesce into a long list of aliases for the current
954             // buffer tail. We must track all aliases of the current
955             // tail for correctness, but we are also allowed to skip
956             // optimization (removal) of any branch, so we take the
957             // escape hatch here and let it stand. In effect this
958             // "spreads" the many thousands of labels in the
959             // pathological case among an actual (harmless but
960             // suboptimal) instruction once per N labels.
961             if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
962                 break;
963             }
964 
965             // Invariant: we are looking at a branch that ends at the tail of
966             // the buffer.
967 
968             // For any branch, conditional or unconditional:
969             // - If the target is a label at the current offset, then remove
970             //   the conditional branch, and reset all labels that targeted
971             //   the current offset (end of branch) to the truncated
972             //   end-of-code.
973             //
974             // Preserves execution semantics: a branch to its own fallthrough
975             // address is equivalent to a no-op; in both cases, nextPC is the
976             // fallthrough.
977             if self.resolve_label_offset(b.target) == cur_off {
978                 trace!("branch with target == cur off; truncating");
979                 self.truncate_last_branch();
980                 continue;
981             }
982 
983             // If latest is an unconditional branch:
984             //
985             // - If the branch's target is not its own start address, then for
986             //   each label at the start of branch, make the label an alias of the
987             //   branch target, and remove the label from the "labels at this
988             //   branch" list.
989             //
990             //   - Preserves execution semantics: an unconditional branch's
991             //     only effect is to set PC to a new PC; this change simply
992             //     collapses one step in the step-semantics.
993             //
994             //   - Post-invariant: the labels that were bound to the start of
995             //     this branch become aliases, so they must not be present in any
996             //     labels-at-this-branch list or the labels-at-tail list. The
997             //     labels are removed form the latest-branch record's
998             //     labels-at-this-branch list, and are never placed in the
999             //     labels-at-tail list. Furthermore, it is correct that they are
1000             //     not in either list, because they are now aliases, and labels
1001             //     that are aliases remain aliases forever.
1002             //
1003             // - If there is a prior unconditional branch that ends just before
1004             //   this one begins, and this branch has no labels bound to its
1005             //   start, then we can truncate this branch, because it is entirely
1006             //   unreachable (we have redirected all labels that make it
1007             //   reachable otherwise). Do so and continue around the loop.
1008             //
1009             //   - Preserves execution semantics: the branch is unreachable,
1010             //     because execution can only flow into an instruction from the
1011             //     prior instruction's fallthrough or from a branch bound to that
1012             //     instruction's start offset. Unconditional branches have no
1013             //     fallthrough, so if the prior instruction is an unconditional
1014             //     branch, no fallthrough entry can happen. The
1015             //     labels-at-this-branch list is complete (by invariant), so if it
1016             //     is empty, then the instruction is entirely unreachable. Thus,
1017             //     it can be removed.
1018             //
1019             //   - Post-invariant: ensured by truncate_last_branch().
1020             //
1021             // - If there is a prior conditional branch whose target label
1022             //   resolves to the current offset (branches around the
1023             //   unconditional branch), then remove the unconditional branch,
1024             //   and make the target of the unconditional the target of the
1025             //   conditional instead.
1026             //
1027             //   - Preserves execution semantics: previously we had:
1028             //
1029             //         L1:
1030             //            cond_br L2
1031             //            br L3
1032             //         L2:
1033             //            (end of buffer)
1034             //
1035             //     by removing the last branch, we have:
1036             //
1037             //         L1:
1038             //            cond_br L2
1039             //         L2:
1040             //            (end of buffer)
1041             //
1042             //     we then fix up the records for the conditional branch to
1043             //     have:
1044             //
1045             //         L1:
1046             //           cond_br.inverted L3
1047             //         L2:
1048             //
1049             //     In the original code, control flow reaches L2 when the
1050             //     conditional branch's predicate is true, and L3 otherwise. In
1051             //     the optimized code, the same is true.
1052             //
1053             //   - Post-invariant: all edits to latest_branches and
1054             //     labels_at_tail are performed by `truncate_last_branch()`,
1055             //     which maintains the invariants at each step.
1056 
1057             if b.is_uncond() {
1058                 // Set any label equal to current branch's start as an alias of
1059                 // the branch's target, if the target is not the branch itself
1060                 // (i.e., an infinite loop).
1061                 //
1062                 // We cannot perform this aliasing if the target of this branch
1063                 // ultimately aliases back here; if so, we need to keep this
1064                 // branch, so break out of this loop entirely (and clear the
1065                 // latest-branches list below).
1066                 //
1067                 // Note that this check is what prevents cycles from forming in
1068                 // `self.label_aliases`. To see why, consider an arbitrary start
1069                 // state:
1070                 //
1071                 // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
1072                 // Ln, which is not aliased.
1073                 //
1074                 // We would create a cycle if we assigned label_aliases[Ln]
1075                 // = L1.  Note that the below assignment is the only write
1076                 // to label_aliases.
1077                 //
1078                 // By our other invariants, we have that Ln (`l` below)
1079                 // resolves to the offset `b.start`, because it is in the
1080                 // set `b.labels_at_this_branch`.
1081                 //
1082                 // If L1 were already aliased, through some arbitrarily deep
1083                 // chain, to Ln, then it must also resolve to this offset
1084                 // `b.start`.
1085                 //
1086                 // By checking the resolution of `L1` against this offset,
1087                 // and aborting this branch-simplification if they are
1088                 // equal, we prevent the below assignment from ever creating
1089                 // a cycle.
1090                 if self.resolve_label_offset(b.target) != b.start {
1091                     let redirected = b.labels_at_this_branch.len();
1092                     for &l in &b.labels_at_this_branch {
1093                         trace!(
1094                             " -> label at start of branch {:?} redirected to target {:?}",
1095                             l, b.target
1096                         );
1097                         self.label_aliases[l.0 as usize] = b.target;
1098                         // NOTE: we continue to ensure the invariant that labels
1099                         // pointing to tail of buffer are in `labels_at_tail`
1100                         // because we already ensured above that the last branch
1101                         // cannot have a target of `cur_off`; so we never have
1102                         // to put the label into `labels_at_tail` when moving it
1103                         // here.
1104                     }
1105                     // Maintain invariant: all branches have been redirected
1106                     // and are no longer pointing at the start of this branch.
1107                     let mut_b = self.latest_branches.last_mut().unwrap();
1108                     mut_b.labels_at_this_branch.clear();
1109 
1110                     if redirected > 0 {
1111                         trace!(" -> after label redirects, restarting loop");
1112                         continue;
1113                     }
1114                 } else {
1115                     break;
1116                 }
1117 
1118                 let b = self.latest_branches.last().unwrap();
1119 
1120                 // Examine any immediately preceding branch.
1121                 if self.latest_branches.len() > 1 {
1122                     let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
1123                     trace!(" -> more than one branch; prev_b = {:?}", prev_b);
1124                     // This uncond is immediately after another uncond; we
1125                     // should have already redirected labels to this uncond away
1126                     // (but check to be sure); so we can truncate this uncond.
1127                     if prev_b.is_uncond()
1128                         && prev_b.end == b.start
1129                         && b.labels_at_this_branch.is_empty()
1130                     {
1131                         trace!(" -> uncond follows another uncond; truncating");
1132                         self.truncate_last_branch();
1133                         continue;
1134                     }
1135 
1136                     // This uncond is immediately after a conditional, and the
1137                     // conditional's target is the end of this uncond, and we've
1138                     // already redirected labels to this uncond away; so we can
1139                     // truncate this uncond, flip the sense of the conditional, and
1140                     // set the conditional's target (in `latest_branches` and in
1141                     // `fixup_records`) to the uncond's target.
1142                     if prev_b.is_cond()
1143                         && prev_b.end == b.start
1144                         && self.resolve_label_offset(prev_b.target) == cur_off
1145                     {
1146                         trace!(
1147                             " -> uncond follows a conditional, and conditional's target resolves to current offset"
1148                         );
1149                         // Save the target of the uncond (this becomes the
1150                         // target of the cond), and truncate the uncond.
1151                         let target = b.target;
1152                         let data = prev_b.inverted.clone().unwrap();
1153                         self.truncate_last_branch();
1154 
1155                         // Mutate the code and cond branch.
1156                         let off_before_edit = self.cur_offset();
1157                         let prev_b = self.latest_branches.last_mut().unwrap();
1158                         let not_inverted = SmallVec::from(
1159                             &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
1160                         );
1161 
1162                         // Low-level edit: replaces bytes of branch with
1163                         // inverted form. cur_off remains the same afterward, so
1164                         // we do not need to modify label data structures.
1165                         self.data.truncate(prev_b.start as usize);
1166                         self.data.extend_from_slice(&data[..]);
1167 
1168                         // Save the original code as the inversion of the
1169                         // inverted branch, in case we later edit this branch
1170                         // again.
1171                         prev_b.inverted = Some(not_inverted);
1172                         self.pending_fixup_records[prev_b.fixup].label = target;
1173                         trace!(" -> reassigning target of condbr to {:?}", target);
1174                         prev_b.target = target;
1175                         debug_assert_eq!(off_before_edit, self.cur_offset());
1176                         continue;
1177                     }
1178                 }
1179             }
1180 
1181             // If we couldn't do anything with the last branch, then break.
1182             break;
1183         }
1184 
1185         self.purge_latest_branches();
1186 
1187         trace!(
1188             "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
1189             self.latest_branches, self.labels_at_tail, self.pending_fixup_records
1190         );
1191     }
1192 
1193     fn purge_latest_branches(&mut self) {
1194         // All of our branch simplification rules work only if a branch ends at
1195         // the tail of the buffer, with no following code; and branches are in
1196         // order in latest_branches; so if the last entry ends prior to
1197         // cur_offset, then clear all entries.
1198         let cur_off = self.cur_offset();
1199         if let Some(l) = self.latest_branches.last() {
1200             if l.end < cur_off {
1201                 trace!("purge_latest_branches: removing branch {:?}", l);
1202                 self.latest_branches.clear();
1203             }
1204         }
1205 
1206         // Post-invariant: no invariant requires any branch to appear in
1207         // `latest_branches`; it is always optional. The list-clear above thus
1208         // preserves all semantics.
1209     }
1210 
1211     /// Emit a trap at some point in the future with the specified code and
1212     /// stack map.
1213     ///
1214     /// This function returns a [`MachLabel`] which will be the future address
1215     /// of the trap. Jumps should refer to this label, likely by using the
1216     /// [`MachBuffer::use_label_at_offset`] method, to get a relocation
1217     /// patched in once the address of the trap is known.
1218     ///
1219     /// This will batch all traps into the end of the function.
1220     pub fn defer_trap(&mut self, code: TrapCode) -> MachLabel {
1221         let label = self.get_label();
1222         self.pending_traps.push(MachLabelTrap {
1223             label,
1224             code,
1225             loc: self.cur_srcloc.map(|(_start, loc)| loc),
1226         });
1227         label
1228     }
1229 
1230     /// Is an island needed within the next N bytes?
1231     pub fn island_needed(&self, distance: CodeOffset) -> bool {
1232         let deadline = match self.fixup_records.peek() {
1233             Some(fixup) => fixup.deadline().min(self.pending_fixup_deadline),
1234             None => self.pending_fixup_deadline,
1235         };
1236         deadline < u32::MAX && self.worst_case_end_of_island(distance) > deadline
1237     }
1238 
1239     /// Returns the maximal offset that islands can reach if `distance` more
1240     /// bytes are appended.
1241     ///
1242     /// This is used to determine if veneers need insertions since jumps that
1243     /// can't reach past this point must get a veneer of some form.
1244     fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
1245         // Assume that all fixups will require veneers and that the veneers are
1246         // the worst-case size for each platform. This is an over-generalization
1247         // to avoid iterating over the `fixup_records` list or maintaining
1248         // information about it as we go along.
1249         let island_worst_case_size = ((self.fixup_records.len() + self.pending_fixup_records.len())
1250             as u32)
1251             * (I::LabelUse::worst_case_veneer_size())
1252             + self.pending_constants_size
1253             + (self.pending_traps.len() * I::TRAP_OPCODE.len()) as u32;
1254         self.cur_offset()
1255             .saturating_add(distance)
1256             .saturating_add(island_worst_case_size)
1257     }
1258 
1259     /// Emit all pending constants and required pending veneers.
1260     ///
1261     /// Should only be called if `island_needed()` returns true, i.e., if we
1262     /// actually reach a deadline. It's not necessarily a problem to do so
1263     /// otherwise but it may result in unnecessary work during emission.
1264     pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) {
1265         self.emit_island_maybe_forced(ForceVeneers::No, distance, ctrl_plane);
1266     }
1267 
1268     /// Same as `emit_island`, but an internal API with a `force_veneers`
1269     /// argument to force all veneers to always get emitted for debugging.
1270     fn emit_island_maybe_forced(
1271         &mut self,
1272         force_veneers: ForceVeneers,
1273         distance: CodeOffset,
1274         ctrl_plane: &mut ControlPlane,
1275     ) {
1276         // We're going to purge fixups, so no latest-branch editing can happen
1277         // anymore.
1278         self.latest_branches.clear();
1279 
1280         // End the current location tracking since anything emitted during this
1281         // function shouldn't be attributed to whatever the current source
1282         // location is.
1283         //
1284         // Note that the current source location, if it's set right now, will be
1285         // restored at the end of this island emission.
1286         let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);
1287         if cur_loc.is_some() {
1288             self.end_srcloc();
1289         }
1290 
1291         let forced_threshold = self.worst_case_end_of_island(distance);
1292 
1293         // First flush out all traps/constants so we have more labels in case
1294         // fixups are applied against these labels.
1295         //
1296         // Note that traps are placed first since this typically happens at the
1297         // end of the function and for disassemblers we try to keep all the code
1298         // contiguously together.
1299         for MachLabelTrap { label, code, loc } in mem::take(&mut self.pending_traps) {
1300             // If this trap has source information associated with it then
1301             // emit this information for the trap instruction going out now too.
1302             if let Some(loc) = loc {
1303                 self.start_srcloc(loc);
1304             }
1305             self.align_to(I::LabelUse::ALIGN);
1306             self.bind_label(label, ctrl_plane);
1307             self.add_trap(code);
1308             self.put_data(I::TRAP_OPCODE);
1309             if loc.is_some() {
1310                 self.end_srcloc();
1311             }
1312         }
1313 
1314         for constant in mem::take(&mut self.pending_constants) {
1315             let MachBufferConstant { align, size, .. } = self.constants[constant];
1316             let label = self.constants[constant].upcoming_label.take().unwrap();
1317             self.align_to(align);
1318             self.bind_label(label, ctrl_plane);
1319             self.used_constants.push((constant, self.cur_offset()));
1320             self.get_appended_space(size);
1321         }
1322 
1323         // Either handle all pending fixups because they're ready or move them
1324         // onto the `BinaryHeap` tracking all pending fixups if they aren't
1325         // ready.
1326         assert!(self.latest_branches.is_empty());
1327         for fixup in mem::take(&mut self.pending_fixup_records) {
1328             if self.should_apply_fixup(&fixup, forced_threshold) {
1329                 self.handle_fixup(fixup, force_veneers, forced_threshold);
1330             } else {
1331                 self.fixup_records.push(fixup);
1332             }
1333         }
1334         self.pending_fixup_deadline = u32::MAX;
1335         while let Some(fixup) = self.fixup_records.peek() {
1336             trace!("emit_island: fixup {:?}", fixup);
1337 
1338             // If this fixup shouldn't be applied, that means its label isn't
1339             // defined yet and there'll be remaining space to apply a veneer if
1340             // necessary in the future after this island. In that situation
1341             // because `fixup_records` is sorted by deadline this loop can
1342             // exit.
1343             if !self.should_apply_fixup(fixup, forced_threshold) {
1344                 break;
1345             }
1346 
1347             let fixup = self.fixup_records.pop().unwrap();
1348             self.handle_fixup(fixup, force_veneers, forced_threshold);
1349         }
1350 
1351         if let Some(loc) = cur_loc {
1352             self.start_srcloc(loc);
1353         }
1354     }
1355 
1356     fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool {
1357         let label_offset = self.resolve_label_offset(fixup.label);
1358         label_offset != UNKNOWN_LABEL_OFFSET || fixup.deadline() < forced_threshold
1359     }
1360 
1361     fn handle_fixup(
1362         &mut self,
1363         fixup: MachLabelFixup<I>,
1364         force_veneers: ForceVeneers,
1365         forced_threshold: CodeOffset,
1366     ) {
1367         let MachLabelFixup {
1368             label,
1369             offset,
1370             kind,
1371         } = fixup;
1372         let start = offset as usize;
1373         let end = (offset + kind.patch_size()) as usize;
1374         let label_offset = self.resolve_label_offset(label);
1375 
1376         if label_offset != UNKNOWN_LABEL_OFFSET {
1377             // If the offset of the label for this fixup is known then
1378             // we're going to do something here-and-now. We're either going
1379             // to patch the original offset because it's an in-bounds jump,
1380             // or we're going to generate a veneer, patch the fixup to jump
1381             // to the veneer, and then keep going.
1382             //
1383             // If the label comes after the original fixup, then we should
1384             // be guaranteed that the jump is in-bounds. Otherwise there's
1385             // a bug somewhere because this method wasn't called soon
1386             // enough. All forward-jumps are tracked and should get veneers
1387             // before their deadline comes and they're unable to jump
1388             // further.
1389             //
1390             // Otherwise if the label is before the fixup, then that's a
1391             // backwards jump. If it's past the maximum negative range
1392             // then we'll emit a veneer that to jump forward to which can
1393             // then jump backwards.
1394             let veneer_required = if label_offset >= offset {
1395                 assert!((label_offset - offset) <= kind.max_pos_range());
1396                 false
1397             } else {
1398                 (offset - label_offset) > kind.max_neg_range()
1399             };
1400             trace!(
1401                 " -> label_offset = {}, known, required = {} (pos {} neg {})",
1402                 label_offset,
1403                 veneer_required,
1404                 kind.max_pos_range(),
1405                 kind.max_neg_range()
1406             );
1407 
1408             if (force_veneers == ForceVeneers::Yes && kind.supports_veneer()) || veneer_required {
1409                 self.emit_veneer(label, offset, kind);
1410             } else {
1411                 let slice = &mut self.data[start..end];
1412                 trace!(
1413                     "patching in-range! slice = {slice:?}; offset = {offset:#x}; label_offset = {label_offset:#x}"
1414                 );
1415                 kind.patch(slice, offset, label_offset);
1416             }
1417         } else {
1418             // If the offset of this label is not known at this time then
1419             // that means that a veneer is required because after this
1420             // island the target can't be in range of the original target.
1421             assert!(forced_threshold - offset > kind.max_pos_range());
1422             self.emit_veneer(label, offset, kind);
1423         }
1424     }
1425 
1426     /// Emits a "veneer" the `kind` code at `offset` to jump to `label`.
1427     ///
1428     /// This will generate extra machine code, using `kind`, to get a
1429     /// larger-jump-kind than `kind` allows. The code at `offset` is then
1430     /// patched to jump to our new code, and then the new code is enqueued for
1431     /// a fixup to get processed at some later time.
1432     fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
1433         // If this `kind` doesn't support a veneer then that's a bug in the
1434         // backend because we need to implement support for such a veneer.
1435         assert!(
1436             kind.supports_veneer(),
1437             "jump beyond the range of {kind:?} but a veneer isn't supported",
1438         );
1439 
1440         // Allocate space for a veneer in the island.
1441         self.align_to(I::LabelUse::ALIGN);
1442         let veneer_offset = self.cur_offset();
1443         trace!("making a veneer at {}", veneer_offset);
1444         let start = offset as usize;
1445         let end = (offset + kind.patch_size()) as usize;
1446         let slice = &mut self.data[start..end];
1447         // Patch the original label use to refer to the veneer.
1448         trace!(
1449             "patching original at offset {} to veneer offset {}",
1450             offset, veneer_offset
1451         );
1452         kind.patch(slice, offset, veneer_offset);
1453         // Generate the veneer.
1454         let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
1455         let (veneer_fixup_off, veneer_label_use) =
1456             kind.generate_veneer(veneer_slice, veneer_offset);
1457         trace!(
1458             "generated veneer; fixup offset {}, label_use {:?}",
1459             veneer_fixup_off, veneer_label_use
1460         );
1461         // Register a new use of `label` with our new veneer fixup and
1462         // offset. This'll recalculate deadlines accordingly and
1463         // enqueue this fixup to get processed at some later
1464         // time.
1465         self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
1466     }
1467 
1468     fn finish_emission_maybe_forcing_veneers(
1469         &mut self,
1470         force_veneers: ForceVeneers,
1471         ctrl_plane: &mut ControlPlane,
1472     ) {
1473         while !self.pending_constants.is_empty()
1474             || !self.pending_traps.is_empty()
1475             || !self.fixup_records.is_empty()
1476             || !self.pending_fixup_records.is_empty()
1477         {
1478             // `emit_island()` will emit any pending veneers and constants, and
1479             // as a side-effect, will also take care of any fixups with resolved
1480             // labels eagerly.
1481             self.emit_island_maybe_forced(force_veneers, u32::MAX, ctrl_plane);
1482         }
1483 
1484         // Ensure that all labels have been fixed up after the last island is emitted. This is a
1485         // full (release-mode) assert because an unresolved label means the emitted code is
1486         // incorrect.
1487         assert!(self.fixup_records.is_empty());
1488         assert!(self.pending_fixup_records.is_empty());
1489     }
1490 
1491     /// Finish any deferred emissions and/or fixups.
1492     pub fn finish(
1493         mut self,
1494         constants: &VCodeConstants,
1495         ctrl_plane: &mut ControlPlane,
1496     ) -> MachBufferFinalized<Stencil> {
1497         let _tt = timing::vcode_emit_finish();
1498 
1499         self.finish_emission_maybe_forcing_veneers(ForceVeneers::No, ctrl_plane);
1500 
1501         let alignment = self.finish_constants(constants);
1502 
1503         // Resolve all labels to their offsets.
1504         let finalized_relocs = self
1505             .relocs
1506             .iter()
1507             .map(|reloc| FinalizedMachReloc {
1508                 offset: reloc.offset,
1509                 kind: reloc.kind,
1510                 addend: reloc.addend,
1511                 target: match &reloc.target {
1512                     RelocTarget::ExternalName(name) => {
1513                         FinalizedRelocTarget::ExternalName(name.clone())
1514                     }
1515                     RelocTarget::Label(label) => {
1516                         FinalizedRelocTarget::Func(self.resolve_label_offset(*label))
1517                     }
1518                 },
1519             })
1520             .collect();
1521 
1522         let finalized_exception_handlers = self
1523             .exception_handlers
1524             .iter()
1525             .map(|handler| handler.finalize(|label| self.resolve_label_offset(label)))
1526             .collect();
1527 
1528         let mut srclocs = self.srclocs;
1529         srclocs.sort_by_key(|entry| entry.start);
1530 
1531         MachBufferFinalized {
1532             data: self.data,
1533             relocs: finalized_relocs,
1534             traps: self.traps,
1535             call_sites: self.call_sites,
1536             exception_handlers: finalized_exception_handlers,
1537             srclocs,
1538             user_stack_maps: self.user_stack_maps,
1539             unwind_info: self.unwind_info,
1540             alignment,
1541         }
1542     }
1543 
1544     /// Add an external relocation at the given offset.
1545     pub fn add_reloc_at_offset<T: Into<RelocTarget> + Clone>(
1546         &mut self,
1547         offset: CodeOffset,
1548         kind: Reloc,
1549         target: &T,
1550         addend: Addend,
1551     ) {
1552         let target: RelocTarget = target.clone().into();
1553         // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally
1554         // generate a label-use statement to track whether an island is possibly
1555         // needed to escape this function to actually get to the external name.
1556         // This is most likely to come up on AArch64 where calls between
1557         // functions use a 26-bit signed offset which gives +/- 64MB. This means
1558         // that if a function is 128MB in size and there's a call in the middle
1559         // it's impossible to reach the actual target. Also, while it's
1560         // technically possible to jump to the start of a function and then jump
1561         // further, island insertion below always inserts islands after
1562         // previously appended code so for Cranelift's own implementation this
1563         // is also a problem for 64MB functions on AArch64 which start with a
1564         // call instruction, those won't be able to escape.
1565         //
1566         // Ideally what needs to happen here is that a `LabelUse` is
1567         // transparently generated (or call-sites of this function are audited
1568         // to generate a `LabelUse` instead) and tracked internally. The actual
1569         // relocation would then change over time if and when a veneer is
1570         // inserted, where the relocation here would be patched by this
1571         // `MachBuffer` to jump to the veneer. The problem, though, is that all
1572         // this still needs to end up, in the case of a singular function,
1573         // generating a final relocation pointing either to this particular
1574         // relocation or to the veneer inserted. Additionally
1575         // `MachBuffer` needs the concept of a label which will never be
1576         // resolved, so `emit_island` doesn't trip over not actually ever
1577         // knowning what some labels are. Currently the loop in
1578         // `finish_emission_maybe_forcing_veneers` would otherwise infinitely
1579         // loop.
1580         //
1581         // For now this means that because relocs aren't tracked at all that
1582         // AArch64 functions have a rough size limits of 64MB. For now that's
1583         // somewhat reasonable and the failure mode is a panic in `MachBuffer`
1584         // when a relocation can't otherwise be resolved later, so it shouldn't
1585         // actually result in any memory unsafety or anything like that.
1586         self.relocs.push(MachReloc {
1587             offset,
1588             kind,
1589             target,
1590             addend,
1591         });
1592     }
1593 
1594     /// Add an external relocation at the current offset.
1595     pub fn add_reloc<T: Into<RelocTarget> + Clone>(
1596         &mut self,
1597         kind: Reloc,
1598         target: &T,
1599         addend: Addend,
1600     ) {
1601         self.add_reloc_at_offset(self.data.len() as CodeOffset, kind, target, addend);
1602     }
1603 
1604     /// Add a trap record at the current offset.
1605     pub fn add_trap(&mut self, code: TrapCode) {
1606         self.traps.push(MachTrap {
1607             offset: self.data.len() as CodeOffset,
1608             code,
1609         });
1610     }
1611 
1612     /// Add a call-site record at the current offset.
1613     pub fn add_call_site(&mut self) {
1614         self.add_try_call_site(core::iter::empty());
1615     }
1616 
1617     /// Add a call-site record at the current offset with exception
1618     /// handlers.
1619     pub fn add_try_call_site(
1620         &mut self,
1621         exception_handlers: impl Iterator<Item = MachExceptionHandler>,
1622     ) {
1623         let start = u32::try_from(self.exception_handlers.len()).unwrap();
1624         self.exception_handlers.extend(exception_handlers);
1625         let end = u32::try_from(self.exception_handlers.len()).unwrap();
1626         let exception_handler_range = start..end;
1627 
1628         self.call_sites.push(MachCallSite {
1629             ret_addr: self.data.len() as CodeOffset,
1630             exception_handler_range,
1631         });
1632     }
1633 
1634     /// Add an unwind record at the current offset.
1635     pub fn add_unwind(&mut self, unwind: UnwindInst) {
1636         self.unwind_info.push((self.cur_offset(), unwind));
1637     }
1638 
1639     /// Set the `SourceLoc` for code from this offset until the offset at the
1640     /// next call to `end_srcloc()`.
1641     /// Returns the current [CodeOffset] and [RelSourceLoc].
1642     pub fn start_srcloc(&mut self, loc: RelSourceLoc) -> (CodeOffset, RelSourceLoc) {
1643         let cur = (self.cur_offset(), loc);
1644         self.cur_srcloc = Some(cur);
1645         cur
1646     }
1647 
1648     /// Mark the end of the `SourceLoc` segment started at the last
1649     /// `start_srcloc()` call.
1650     pub fn end_srcloc(&mut self) {
1651         let (start, loc) = self
1652             .cur_srcloc
1653             .take()
1654             .expect("end_srcloc() called without start_srcloc()");
1655         let end = self.cur_offset();
1656         // Skip zero-length extends.
1657         debug_assert!(end >= start);
1658         if end > start {
1659             self.srclocs.push(MachSrcLoc { start, end, loc });
1660         }
1661     }
1662 
1663     /// Push a user stack map onto this buffer.
1664     ///
1665     /// The stack map is associated with the given `return_addr` code
1666     /// offset. This must be the PC for the instruction just *after* this stack
1667     /// map's associated instruction. For example in the sequence `call $foo;
1668     /// add r8, rax`, the `return_addr` must be the offset of the start of the
1669     /// `add` instruction.
1670     ///
1671     /// Stack maps must be pushed in sorted `return_addr` order.
1672     pub fn push_user_stack_map(
1673         &mut self,
1674         emit_state: &I::State,
1675         return_addr: CodeOffset,
1676         mut stack_map: ir::UserStackMap,
1677     ) {
1678         let span = emit_state.frame_layout().active_size();
1679         trace!("Adding user stack map @ {return_addr:#x} spanning {span} bytes: {stack_map:?}");
1680 
1681         debug_assert!(
1682             self.user_stack_maps
1683                 .last()
1684                 .map_or(true, |(prev_addr, _, _)| *prev_addr < return_addr),
1685             "pushed stack maps out of order: {} is not less than {}",
1686             self.user_stack_maps.last().unwrap().0,
1687             return_addr,
1688         );
1689 
1690         stack_map.finalize(emit_state.frame_layout().sp_to_sized_stack_slots());
1691         self.user_stack_maps.push((return_addr, span, stack_map));
1692     }
1693 
1694     /// Increase the alignment of the buffer to the given alignment if bigger
1695     /// than the current alignment.
1696     pub fn set_log2_min_function_alignment(&mut self, align_to: u8) {
1697         self.min_alignment = self.min_alignment.max(
1698             1u32.checked_shl(u32::from(align_to))
1699                 .expect("log2_min_function_alignment too large"),
1700         );
1701     }
1702 }
1703 
1704 impl<I: VCodeInst> Extend<u8> for MachBuffer<I> {
1705     fn extend<T: IntoIterator<Item = u8>>(&mut self, iter: T) {
1706         for b in iter {
1707             self.put1(b);
1708         }
1709     }
1710 }
1711 
1712 impl<T: CompilePhase> MachBufferFinalized<T> {
1713     /// Get a list of source location mapping tuples in sorted-by-start-offset order.
1714     pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {
1715         &self.srclocs[..]
1716     }
1717 
1718     /// Get the total required size for the code.
1719     pub fn total_size(&self) -> CodeOffset {
1720         self.data.len() as CodeOffset
1721     }
1722 
1723     /// Return the code in this mach buffer as a hex string for testing purposes.
1724     pub fn stringify_code_bytes(&self) -> String {
1725         // This is pretty lame, but whatever ..
1726         use std::fmt::Write;
1727         let mut s = String::with_capacity(self.data.len() * 2);
1728         for b in &self.data {
1729             write!(&mut s, "{b:02X}").unwrap();
1730         }
1731         s
1732     }
1733 
1734     /// Get the code bytes.
1735     pub fn data(&self) -> &[u8] {
1736         // N.B.: we emit every section into the .text section as far as
1737         // the `CodeSink` is concerned; we do not bother to segregate
1738         // the contents into the actual program text, the jumptable and the
1739         // rodata (constant pool). This allows us to generate code assuming
1740         // that these will not be relocated relative to each other, and avoids
1741         // having to designate each section as belonging in one of the three
1742         // fixed categories defined by `CodeSink`. If this becomes a problem
1743         // later (e.g. because of memory permissions or similar), we can
1744         // add this designation and segregate the output; take care, however,
1745         // to add the appropriate relocations in this case.
1746 
1747         &self.data[..]
1748     }
1749 
1750     /// Get the list of external relocations for this code.
1751     pub fn relocs(&self) -> &[FinalizedMachReloc] {
1752         &self.relocs[..]
1753     }
1754 
1755     /// Get the list of trap records for this code.
1756     pub fn traps(&self) -> &[MachTrap] {
1757         &self.traps[..]
1758     }
1759 
1760     /// Get the user stack map metadata for this code.
1761     pub fn user_stack_maps(&self) -> &[(CodeOffset, u32, ir::UserStackMap)] {
1762         &self.user_stack_maps
1763     }
1764 
1765     /// Take this buffer's user strack map metadata.
1766     pub fn take_user_stack_maps(&mut self) -> SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]> {
1767         mem::take(&mut self.user_stack_maps)
1768     }
1769 
1770     /// Get the list of call sites for this code, along with
1771     /// associated exception handlers.
1772     ///
1773     /// Each item yielded by the returned iterator is a struct with:
1774     ///
1775     /// - The call site metadata record, with a `ret_addr` field
1776     ///   directly accessible and denoting the offset of the return
1777     ///   address into this buffer's code.
1778     /// - The slice of pairs of exception tags and code offsets
1779     ///   denoting exception-handler entry points associated with this
1780     ///   call site.
1781     pub fn call_sites(&self) -> impl Iterator<Item = FinalizedMachCallSite<'_>> + '_ {
1782         self.call_sites.iter().map(|call_site| {
1783             let handler_range = call_site.exception_handler_range.clone();
1784             let handler_range = usize::try_from(handler_range.start).unwrap()
1785                 ..usize::try_from(handler_range.end).unwrap();
1786             FinalizedMachCallSite {
1787                 ret_addr: call_site.ret_addr,
1788                 exception_handlers: &self.exception_handlers[handler_range],
1789             }
1790         })
1791     }
1792 }
1793 
1794 /// An item in the exception-handler list for a callsite, with label
1795 /// references.  Items are interpreted in left-to-right order and the
1796 /// first match wins.
1797 #[derive(Clone, Copy, Debug, PartialEq, Eq)]
1798 pub enum MachExceptionHandler {
1799     /// A specific tag (in the current dynamic context) should be
1800     /// handled by the code at the given offset.
1801     Tag(ExceptionTag, MachLabel),
1802     /// All exceptions should be handled by the code at the given
1803     /// offset.
1804     Default(MachLabel),
1805     /// The dynamic context for interpreting tags is updated to the
1806     /// value stored in the given machine location (in this frame's
1807     /// context).
1808     Context(ExceptionContextLoc),
1809 }
1810 
1811 impl MachExceptionHandler {
1812     fn finalize<F: Fn(MachLabel) -> CodeOffset>(self, f: F) -> FinalizedMachExceptionHandler {
1813         match self {
1814             Self::Tag(tag, label) => FinalizedMachExceptionHandler::Tag(tag, f(label)),
1815             Self::Default(label) => FinalizedMachExceptionHandler::Default(f(label)),
1816             Self::Context(loc) => FinalizedMachExceptionHandler::Context(loc),
1817         }
1818     }
1819 }
1820 
1821 /// An item in the exception-handler list for a callsite, with final
1822 /// (lowered) code offsets. Items are interpreted in left-to-right
1823 /// order and the first match wins.
1824 #[derive(Clone, Copy, Debug, PartialEq, Eq)]
1825 #[cfg_attr(
1826     feature = "enable-serde",
1827     derive(serde_derive::Serialize, serde_derive::Deserialize)
1828 )]
1829 pub enum FinalizedMachExceptionHandler {
1830     /// A specific tag (in the current dynamic context) should be
1831     /// handled by the code at the given offset.
1832     Tag(ExceptionTag, CodeOffset),
1833     /// All exceptions should be handled by the code at the given
1834     /// offset.
1835     Default(CodeOffset),
1836     /// The dynamic context for interpreting tags is updated to the
1837     /// value stored in the given machine location (in this frame's
1838     /// context).
1839     Context(ExceptionContextLoc),
1840 }
1841 
1842 /// A location for a dynamic exception context value.
1843 #[derive(Clone, Copy, Debug, PartialEq, Eq)]
1844 #[cfg_attr(
1845     feature = "enable-serde",
1846     derive(serde_derive::Serialize, serde_derive::Deserialize)
1847 )]
1848 pub enum ExceptionContextLoc {
1849     /// An offset from SP at the callsite.
1850     SPOffset(u32),
1851     /// A GPR at the callsite. The physical register number for the
1852     /// GPR register file on the target architecture is used.
1853     GPR(u8),
1854 }
1855 
1856 /// Metadata about a constant.
1857 struct MachBufferConstant {
1858     /// A label which has not yet been bound which can be used for this
1859     /// constant.
1860     ///
1861     /// This is lazily created when a label is requested for a constant and is
1862     /// cleared when a constant is emitted.
1863     upcoming_label: Option<MachLabel>,
1864     /// Required alignment.
1865     align: CodeOffset,
1866     /// The byte size of this constant.
1867     size: usize,
1868 }
1869 
1870 /// A trap that is deferred to the next time an island is emitted for either
1871 /// traps, constants, or fixups.
1872 struct MachLabelTrap {
1873     /// This label will refer to the trap's offset.
1874     label: MachLabel,
1875     /// The code associated with this trap.
1876     code: TrapCode,
1877     /// An optional source location to assign for this trap.
1878     loc: Option<RelSourceLoc>,
1879 }
1880 
1881 /// A fixup to perform on the buffer once code is emitted. Fixups always refer
1882 /// to labels and patch the code based on label offsets. Hence, they are like
1883 /// relocations, but internal to one buffer.
1884 #[derive(Debug)]
1885 struct MachLabelFixup<I: VCodeInst> {
1886     /// The label whose offset controls this fixup.
1887     label: MachLabel,
1888     /// The offset to fix up / patch to refer to this label.
1889     offset: CodeOffset,
1890     /// The kind of fixup. This is architecture-specific; each architecture may have,
1891     /// e.g., several types of branch instructions, each with differently-sized
1892     /// offset fields and different places within the instruction to place the
1893     /// bits.
1894     kind: I::LabelUse,
1895 }
1896 
1897 impl<I: VCodeInst> MachLabelFixup<I> {
1898     fn deadline(&self) -> CodeOffset {
1899         self.offset.saturating_add(self.kind.max_pos_range())
1900     }
1901 }
1902 
1903 impl<I: VCodeInst> PartialEq for MachLabelFixup<I> {
1904     fn eq(&self, other: &Self) -> bool {
1905         self.deadline() == other.deadline()
1906     }
1907 }
1908 
1909 impl<I: VCodeInst> Eq for MachLabelFixup<I> {}
1910 
1911 impl<I: VCodeInst> PartialOrd for MachLabelFixup<I> {
1912     fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
1913         Some(self.cmp(other))
1914     }
1915 }
1916 
1917 impl<I: VCodeInst> Ord for MachLabelFixup<I> {
1918     fn cmp(&self, other: &Self) -> Ordering {
1919         other.deadline().cmp(&self.deadline())
1920     }
1921 }
1922 
1923 /// A relocation resulting from a compilation.
1924 #[derive(Clone, Debug, PartialEq)]
1925 #[cfg_attr(
1926     feature = "enable-serde",
1927     derive(serde_derive::Serialize, serde_derive::Deserialize)
1928 )]
1929 pub struct MachRelocBase<T> {
1930     /// The offset at which the relocation applies, *relative to the
1931     /// containing section*.
1932     pub offset: CodeOffset,
1933     /// The kind of relocation.
1934     pub kind: Reloc,
1935     /// The external symbol / name to which this relocation refers.
1936     pub target: T,
1937     /// The addend to add to the symbol value.
1938     pub addend: i64,
1939 }
1940 
1941 type MachReloc = MachRelocBase<RelocTarget>;
1942 
1943 /// A relocation resulting from a compilation.
1944 pub type FinalizedMachReloc = MachRelocBase<FinalizedRelocTarget>;
1945 
1946 /// A Relocation target
1947 #[derive(Debug, Clone, PartialEq, Eq, Hash)]
1948 pub enum RelocTarget {
1949     /// Points to an [ExternalName] outside the current function.
1950     ExternalName(ExternalName),
1951     /// Points to a [MachLabel] inside this function.
1952     /// This is different from [MachLabelFixup] in that both the relocation and the
1953     /// label will be emitted and are only resolved at link time.
1954     ///
1955     /// There is no reason to prefer this over [MachLabelFixup] unless the ABI requires it.
1956     Label(MachLabel),
1957 }
1958 
1959 impl From<ExternalName> for RelocTarget {
1960     fn from(name: ExternalName) -> Self {
1961         Self::ExternalName(name)
1962     }
1963 }
1964 
1965 impl From<MachLabel> for RelocTarget {
1966     fn from(label: MachLabel) -> Self {
1967         Self::Label(label)
1968     }
1969 }
1970 
1971 /// A Relocation target
1972 #[derive(Debug, Clone, PartialEq, Eq, Hash)]
1973 #[cfg_attr(
1974     feature = "enable-serde",
1975     derive(serde_derive::Serialize, serde_derive::Deserialize)
1976 )]
1977 pub enum FinalizedRelocTarget {
1978     /// Points to an [ExternalName] outside the current function.
1979     ExternalName(ExternalName),
1980     /// Points to a [CodeOffset] from the start of the current function.
1981     Func(CodeOffset),
1982 }
1983 
1984 impl FinalizedRelocTarget {
1985     /// Returns a display for the current [FinalizedRelocTarget], with extra context to prettify the
1986     /// output.
1987     pub fn display<'a>(&'a self, params: Option<&'a FunctionParameters>) -> String {
1988         match self {
1989             FinalizedRelocTarget::ExternalName(name) => format!("{}", name.display(params)),
1990             FinalizedRelocTarget::Func(offset) => format!("func+{offset}"),
1991         }
1992     }
1993 }
1994 
1995 /// A trap record resulting from a compilation.
1996 #[derive(Clone, Debug, PartialEq)]
1997 #[cfg_attr(
1998     feature = "enable-serde",
1999     derive(serde_derive::Serialize, serde_derive::Deserialize)
2000 )]
2001 pub struct MachTrap {
2002     /// The offset at which the trap instruction occurs, *relative to the
2003     /// containing section*.
2004     pub offset: CodeOffset,
2005     /// The trap code.
2006     pub code: TrapCode,
2007 }
2008 
2009 /// A call site record resulting from a compilation.
2010 #[derive(Clone, Debug, PartialEq)]
2011 #[cfg_attr(
2012     feature = "enable-serde",
2013     derive(serde_derive::Serialize, serde_derive::Deserialize)
2014 )]
2015 pub struct MachCallSite {
2016     /// The offset of the call's return address, *relative to the
2017     /// start of the buffer*.
2018     pub ret_addr: CodeOffset,
2019 
2020     /// Range in `exception_handlers` corresponding to the exception
2021     /// handlers for this callsite.
2022     exception_handler_range: Range<u32>,
2023 }
2024 
2025 /// A call site record resulting from a compilation.
2026 #[derive(Clone, Debug, PartialEq)]
2027 pub struct FinalizedMachCallSite<'a> {
2028     /// The offset of the call's return address, *relative to the
2029     /// start of the buffer*.
2030     pub ret_addr: CodeOffset,
2031 
2032     /// Exception handlers at this callsite, with target offsets
2033     /// *relative to the start of the buffer*.
2034     pub exception_handlers: &'a [FinalizedMachExceptionHandler],
2035 }
2036 
2037 /// A source-location mapping resulting from a compilation.
2038 #[derive(PartialEq, Debug, Clone)]
2039 #[cfg_attr(
2040     feature = "enable-serde",
2041     derive(serde_derive::Serialize, serde_derive::Deserialize)
2042 )]
2043 pub struct MachSrcLoc<T: CompilePhase> {
2044     /// The start of the region of code corresponding to a source location.
2045     /// This is relative to the start of the function, not to the start of the
2046     /// section.
2047     pub start: CodeOffset,
2048     /// The end of the region of code corresponding to a source location.
2049     /// This is relative to the start of the function, not to the start of the
2050     /// section.
2051     pub end: CodeOffset,
2052     /// The source location.
2053     pub loc: T::SourceLocType,
2054 }
2055 
2056 impl MachSrcLoc<Stencil> {
2057     fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {
2058         MachSrcLoc {
2059             start: self.start,
2060             end: self.end,
2061             loc: self.loc.expand(base_srcloc),
2062         }
2063     }
2064 }
2065 
2066 /// Record of branch instruction in the buffer, to facilitate editing.
2067 #[derive(Clone, Debug)]
2068 struct MachBranch {
2069     start: CodeOffset,
2070     end: CodeOffset,
2071     target: MachLabel,
2072     fixup: usize,
2073     inverted: Option<SmallVec<[u8; 8]>>,
2074     /// All labels pointing to the start of this branch. For correctness, this
2075     /// *must* be complete (i.e., must contain all labels whose resolved offsets
2076     /// are at the start of this branch): we rely on being able to redirect all
2077     /// labels that could jump to this branch before removing it, if it is
2078     /// otherwise unreachable.
2079     labels_at_this_branch: SmallVec<[MachLabel; 4]>,
2080 }
2081 
2082 impl MachBranch {
2083     fn is_cond(&self) -> bool {
2084         self.inverted.is_some()
2085     }
2086     fn is_uncond(&self) -> bool {
2087         self.inverted.is_none()
2088     }
2089 }
2090 
2091 /// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
2092 ///
2093 /// Note that `MachBuffer` was primarily written for intra-function references
2094 /// of jumps between basic blocks, but it's also quite usable for entire text
2095 /// sections and resolving references between functions themselves. This
2096 /// builder interprets "blocks" as labeled functions for the purposes of
2097 /// resolving labels internally in the buffer.
2098 pub struct MachTextSectionBuilder<I: VCodeInst> {
2099     buf: MachBuffer<I>,
2100     next_func: usize,
2101     force_veneers: ForceVeneers,
2102 }
2103 
2104 impl<I: VCodeInst> MachTextSectionBuilder<I> {
2105     /// Creates a new text section builder which will have `num_funcs` functions
2106     /// pushed into it.
2107     pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {
2108         let mut buf = MachBuffer::new();
2109         buf.reserve_labels_for_blocks(num_funcs);
2110         MachTextSectionBuilder {
2111             buf,
2112             next_func: 0,
2113             force_veneers: ForceVeneers::No,
2114         }
2115     }
2116 }
2117 
2118 impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
2119     fn append(
2120         &mut self,
2121         labeled: bool,
2122         func: &[u8],
2123         align: u32,
2124         ctrl_plane: &mut ControlPlane,
2125     ) -> u64 {
2126         // Conditionally emit an island if it's necessary to resolve jumps
2127         // between functions which are too far away.
2128         let size = func.len() as u32;
2129         if self.force_veneers == ForceVeneers::Yes || self.buf.island_needed(size) {
2130             self.buf
2131                 .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane);
2132         }
2133 
2134         self.buf.align_to(align);
2135         let pos = self.buf.cur_offset();
2136         if labeled {
2137             self.buf.bind_label(
2138                 MachLabel::from_block(BlockIndex::new(self.next_func)),
2139                 ctrl_plane,
2140             );
2141             self.next_func += 1;
2142         }
2143         self.buf.put_data(func);
2144         u64::from(pos)
2145     }
2146 
2147     fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {
2148         crate::trace!(
2149             "Resolving relocation @ {offset:#x} + {addend:#x} to target {target} of kind {reloc:?}"
2150         );
2151         let label = MachLabel::from_block(BlockIndex::new(target));
2152         let offset = u32::try_from(offset).unwrap();
2153         match I::LabelUse::from_reloc(reloc, addend) {
2154             Some(label_use) => {
2155                 self.buf.use_label_at_offset(offset, label, label_use);
2156                 true
2157             }
2158             None => false,
2159         }
2160     }
2161 
2162     fn force_veneers(&mut self) {
2163         self.force_veneers = ForceVeneers::Yes;
2164     }
2165 
2166     fn write(&mut self, offset: u64, data: &[u8]) {
2167         self.buf.data[offset.try_into().unwrap()..][..data.len()].copy_from_slice(data);
2168     }
2169 
2170     fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> {
2171         // Double-check all functions were pushed.
2172         assert_eq!(self.next_func, self.buf.label_offsets.len());
2173 
2174         // Finish up any veneers, if necessary.
2175         self.buf
2176             .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane);
2177 
2178         // We don't need the data any more, so return it to the caller.
2179         mem::take(&mut self.buf.data).into_vec()
2180     }
2181 }
2182 
2183 // We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
2184 #[cfg(all(test, feature = "arm64"))]
2185 mod test {
2186     use cranelift_entity::EntityRef as _;
2187 
2188     use super::*;
2189     use crate::ir::UserExternalNameRef;
2190     use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
2191     use crate::isa::aarch64::inst::{OperandSize, xreg};
2192     use crate::machinst::{MachInstEmit, MachInstEmitState};
2193     use crate::settings;
2194 
2195     fn label(n: u32) -> MachLabel {
2196         MachLabel::from_block(BlockIndex::new(n as usize))
2197     }
2198     fn target(n: u32) -> BranchTarget {
2199         BranchTarget::Label(label(n))
2200     }
2201 
2202     #[test]
2203     fn test_elide_jump_to_next() {
2204         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2205         let mut buf = MachBuffer::new();
2206         let mut state = <Inst as MachInstEmit>::State::default();
2207         let constants = Default::default();
2208 
2209         buf.reserve_labels_for_blocks(2);
2210         buf.bind_label(label(0), state.ctrl_plane_mut());
2211         let inst = Inst::Jump { dest: target(1) };
2212         inst.emit(&mut buf, &info, &mut state);
2213         buf.bind_label(label(1), state.ctrl_plane_mut());
2214         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2215         assert_eq!(0, buf.total_size());
2216     }
2217 
2218     #[test]
2219     fn test_elide_trivial_jump_blocks() {
2220         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2221         let mut buf = MachBuffer::new();
2222         let mut state = <Inst as MachInstEmit>::State::default();
2223         let constants = Default::default();
2224 
2225         buf.reserve_labels_for_blocks(4);
2226 
2227         buf.bind_label(label(0), state.ctrl_plane_mut());
2228         let inst = Inst::CondBr {
2229             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2230             taken: target(1),
2231             not_taken: target(2),
2232         };
2233         inst.emit(&mut buf, &info, &mut state);
2234 
2235         buf.bind_label(label(1), state.ctrl_plane_mut());
2236         let inst = Inst::Jump { dest: target(3) };
2237         inst.emit(&mut buf, &info, &mut state);
2238 
2239         buf.bind_label(label(2), state.ctrl_plane_mut());
2240         let inst = Inst::Jump { dest: target(3) };
2241         inst.emit(&mut buf, &info, &mut state);
2242 
2243         buf.bind_label(label(3), state.ctrl_plane_mut());
2244 
2245         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2246         assert_eq!(0, buf.total_size());
2247     }
2248 
2249     #[test]
2250     fn test_flip_cond() {
2251         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2252         let mut buf = MachBuffer::new();
2253         let mut state = <Inst as MachInstEmit>::State::default();
2254         let constants = Default::default();
2255 
2256         buf.reserve_labels_for_blocks(4);
2257 
2258         buf.bind_label(label(0), state.ctrl_plane_mut());
2259         let inst = Inst::CondBr {
2260             kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2261             taken: target(1),
2262             not_taken: target(2),
2263         };
2264         inst.emit(&mut buf, &info, &mut state);
2265 
2266         buf.bind_label(label(1), state.ctrl_plane_mut());
2267         let inst = Inst::Nop4;
2268         inst.emit(&mut buf, &info, &mut state);
2269 
2270         buf.bind_label(label(2), state.ctrl_plane_mut());
2271         let inst = Inst::Udf {
2272             trap_code: TrapCode::STACK_OVERFLOW,
2273         };
2274         inst.emit(&mut buf, &info, &mut state);
2275 
2276         buf.bind_label(label(3), state.ctrl_plane_mut());
2277 
2278         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2279 
2280         let mut buf2 = MachBuffer::new();
2281         let mut state = Default::default();
2282         let inst = Inst::TrapIf {
2283             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2284             trap_code: TrapCode::STACK_OVERFLOW,
2285         };
2286         inst.emit(&mut buf2, &info, &mut state);
2287         let inst = Inst::Nop4;
2288         inst.emit(&mut buf2, &info, &mut state);
2289 
2290         let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2291 
2292         assert_eq!(buf.data, buf2.data);
2293     }
2294 
2295     #[test]
2296     fn test_island() {
2297         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2298         let mut buf = MachBuffer::new();
2299         let mut state = <Inst as MachInstEmit>::State::default();
2300         let constants = Default::default();
2301 
2302         buf.reserve_labels_for_blocks(4);
2303 
2304         buf.bind_label(label(0), state.ctrl_plane_mut());
2305         let inst = Inst::CondBr {
2306             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2307             taken: target(2),
2308             not_taken: target(3),
2309         };
2310         inst.emit(&mut buf, &info, &mut state);
2311 
2312         buf.bind_label(label(1), state.ctrl_plane_mut());
2313         while buf.cur_offset() < 2000000 {
2314             if buf.island_needed(0) {
2315                 buf.emit_island(0, state.ctrl_plane_mut());
2316             }
2317             let inst = Inst::Nop4;
2318             inst.emit(&mut buf, &info, &mut state);
2319         }
2320 
2321         buf.bind_label(label(2), state.ctrl_plane_mut());
2322         let inst = Inst::Nop4;
2323         inst.emit(&mut buf, &info, &mut state);
2324 
2325         buf.bind_label(label(3), state.ctrl_plane_mut());
2326         let inst = Inst::Nop4;
2327         inst.emit(&mut buf, &info, &mut state);
2328 
2329         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2330 
2331         assert_eq!(2000000 + 8, buf.total_size());
2332 
2333         let mut buf2 = MachBuffer::new();
2334         let mut state = Default::default();
2335         let inst = Inst::CondBr {
2336             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2337 
2338             // This conditionally taken branch has a 19-bit constant, shifted
2339             // to the left by two, giving us a 21-bit range in total. Half of
2340             // this range positive so the we should be around 1 << 20 bytes
2341             // away for our jump target.
2342             //
2343             // There are two pending fixups by the time we reach this point,
2344             // one for this 19-bit jump and one for the unconditional 26-bit
2345             // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
2346             // veneer is 20 bytes large, which means that pessimistically
2347             // assuming we'll need two veneers. Currently each veneer is
2348             // pessimistically assumed to be the maximal size which means we
2349             // need 40 bytes of extra space, meaning that the actual island
2350             // should come 40-bytes before the deadline.
2351             taken: BranchTarget::ResolvedOffset((1 << 20) - 20 - 20),
2352 
2353             // This branch is in-range so no veneers should be needed, it should
2354             // go directly to the target.
2355             not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
2356         };
2357         inst.emit(&mut buf2, &info, &mut state);
2358 
2359         let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2360 
2361         assert_eq!(&buf.data[0..8], &buf2.data[..]);
2362     }
2363 
2364     #[test]
2365     fn test_island_backward() {
2366         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2367         let mut buf = MachBuffer::new();
2368         let mut state = <Inst as MachInstEmit>::State::default();
2369         let constants = Default::default();
2370 
2371         buf.reserve_labels_for_blocks(4);
2372 
2373         buf.bind_label(label(0), state.ctrl_plane_mut());
2374         let inst = Inst::Nop4;
2375         inst.emit(&mut buf, &info, &mut state);
2376 
2377         buf.bind_label(label(1), state.ctrl_plane_mut());
2378         let inst = Inst::Nop4;
2379         inst.emit(&mut buf, &info, &mut state);
2380 
2381         buf.bind_label(label(2), state.ctrl_plane_mut());
2382         while buf.cur_offset() < 2000000 {
2383             let inst = Inst::Nop4;
2384             inst.emit(&mut buf, &info, &mut state);
2385         }
2386 
2387         buf.bind_label(label(3), state.ctrl_plane_mut());
2388         let inst = Inst::CondBr {
2389             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2390             taken: target(0),
2391             not_taken: target(1),
2392         };
2393         inst.emit(&mut buf, &info, &mut state);
2394 
2395         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2396 
2397         assert_eq!(2000000 + 12, buf.total_size());
2398 
2399         let mut buf2 = MachBuffer::new();
2400         let mut state = Default::default();
2401         let inst = Inst::CondBr {
2402             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2403             taken: BranchTarget::ResolvedOffset(8),
2404             not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
2405         };
2406         inst.emit(&mut buf2, &info, &mut state);
2407         let inst = Inst::Jump {
2408             dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
2409         };
2410         inst.emit(&mut buf2, &info, &mut state);
2411 
2412         let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2413 
2414         assert_eq!(&buf.data[2000000..], &buf2.data[..]);
2415     }
2416 
2417     #[test]
2418     fn test_multiple_redirect() {
2419         // label0:
2420         //   cbz x0, label1
2421         //   b label2
2422         // label1:
2423         //   b label3
2424         // label2:
2425         //   nop
2426         //   nop
2427         //   b label0
2428         // label3:
2429         //   b label4
2430         // label4:
2431         //   b label5
2432         // label5:
2433         //   b label7
2434         // label6:
2435         //   nop
2436         // label7:
2437         //   ret
2438         //
2439         // -- should become:
2440         //
2441         // label0:
2442         //   cbz x0, label7
2443         // label2:
2444         //   nop
2445         //   nop
2446         //   b label0
2447         // label6:
2448         //   nop
2449         // label7:
2450         //   ret
2451 
2452         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2453         let mut buf = MachBuffer::new();
2454         let mut state = <Inst as MachInstEmit>::State::default();
2455         let constants = Default::default();
2456 
2457         buf.reserve_labels_for_blocks(8);
2458 
2459         buf.bind_label(label(0), state.ctrl_plane_mut());
2460         let inst = Inst::CondBr {
2461             kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2462             taken: target(1),
2463             not_taken: target(2),
2464         };
2465         inst.emit(&mut buf, &info, &mut state);
2466 
2467         buf.bind_label(label(1), state.ctrl_plane_mut());
2468         let inst = Inst::Jump { dest: target(3) };
2469         inst.emit(&mut buf, &info, &mut state);
2470 
2471         buf.bind_label(label(2), state.ctrl_plane_mut());
2472         let inst = Inst::Nop4;
2473         inst.emit(&mut buf, &info, &mut state);
2474         inst.emit(&mut buf, &info, &mut state);
2475         let inst = Inst::Jump { dest: target(0) };
2476         inst.emit(&mut buf, &info, &mut state);
2477 
2478         buf.bind_label(label(3), state.ctrl_plane_mut());
2479         let inst = Inst::Jump { dest: target(4) };
2480         inst.emit(&mut buf, &info, &mut state);
2481 
2482         buf.bind_label(label(4), state.ctrl_plane_mut());
2483         let inst = Inst::Jump { dest: target(5) };
2484         inst.emit(&mut buf, &info, &mut state);
2485 
2486         buf.bind_label(label(5), state.ctrl_plane_mut());
2487         let inst = Inst::Jump { dest: target(7) };
2488         inst.emit(&mut buf, &info, &mut state);
2489 
2490         buf.bind_label(label(6), state.ctrl_plane_mut());
2491         let inst = Inst::Nop4;
2492         inst.emit(&mut buf, &info, &mut state);
2493 
2494         buf.bind_label(label(7), state.ctrl_plane_mut());
2495         let inst = Inst::Ret {};
2496         inst.emit(&mut buf, &info, &mut state);
2497 
2498         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2499 
2500         let golden_data = vec![
2501             0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
2502             0x1f, 0x20, 0x03, 0xd5, // nop
2503             0x1f, 0x20, 0x03, 0xd5, // nop
2504             0xfd, 0xff, 0xff, 0x17, // b 0
2505             0x1f, 0x20, 0x03, 0xd5, // nop
2506             0xc0, 0x03, 0x5f, 0xd6, // ret
2507         ];
2508 
2509         assert_eq!(&golden_data[..], &buf.data[..]);
2510     }
2511 
2512     #[test]
2513     fn test_handle_branch_cycle() {
2514         // label0:
2515         //   b label1
2516         // label1:
2517         //   b label2
2518         // label2:
2519         //   b label3
2520         // label3:
2521         //   b label4
2522         // label4:
2523         //   b label1  // note: not label0 (to make it interesting).
2524         //
2525         // -- should become:
2526         //
2527         // label0, label1, ..., label4:
2528         //   b label0
2529         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2530         let mut buf = MachBuffer::new();
2531         let mut state = <Inst as MachInstEmit>::State::default();
2532         let constants = Default::default();
2533 
2534         buf.reserve_labels_for_blocks(5);
2535 
2536         buf.bind_label(label(0), state.ctrl_plane_mut());
2537         let inst = Inst::Jump { dest: target(1) };
2538         inst.emit(&mut buf, &info, &mut state);
2539 
2540         buf.bind_label(label(1), state.ctrl_plane_mut());
2541         let inst = Inst::Jump { dest: target(2) };
2542         inst.emit(&mut buf, &info, &mut state);
2543 
2544         buf.bind_label(label(2), state.ctrl_plane_mut());
2545         let inst = Inst::Jump { dest: target(3) };
2546         inst.emit(&mut buf, &info, &mut state);
2547 
2548         buf.bind_label(label(3), state.ctrl_plane_mut());
2549         let inst = Inst::Jump { dest: target(4) };
2550         inst.emit(&mut buf, &info, &mut state);
2551 
2552         buf.bind_label(label(4), state.ctrl_plane_mut());
2553         let inst = Inst::Jump { dest: target(1) };
2554         inst.emit(&mut buf, &info, &mut state);
2555 
2556         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2557 
2558         let golden_data = vec![
2559             0x00, 0x00, 0x00, 0x14, // b 0
2560         ];
2561 
2562         assert_eq!(&golden_data[..], &buf.data[..]);
2563     }
2564 
2565     #[test]
2566     fn metadata_records() {
2567         let mut buf = MachBuffer::<Inst>::new();
2568         let ctrl_plane = &mut Default::default();
2569         let constants = Default::default();
2570 
2571         buf.reserve_labels_for_blocks(3);
2572 
2573         buf.bind_label(label(0), ctrl_plane);
2574         buf.put1(1);
2575         buf.add_trap(TrapCode::HEAP_OUT_OF_BOUNDS);
2576         buf.put1(2);
2577         buf.add_trap(TrapCode::INTEGER_OVERFLOW);
2578         buf.add_trap(TrapCode::INTEGER_DIVISION_BY_ZERO);
2579         buf.add_try_call_site(
2580             [
2581                 MachExceptionHandler::Tag(ExceptionTag::new(42), label(2)),
2582                 MachExceptionHandler::Default(label(1)),
2583             ]
2584             .into_iter(),
2585         );
2586         buf.add_reloc(
2587             Reloc::Abs4,
2588             &ExternalName::User(UserExternalNameRef::new(0)),
2589             0,
2590         );
2591         buf.put1(3);
2592         buf.add_reloc(
2593             Reloc::Abs8,
2594             &ExternalName::User(UserExternalNameRef::new(1)),
2595             1,
2596         );
2597         buf.put1(4);
2598         buf.bind_label(label(1), ctrl_plane);
2599         buf.put1(0xff);
2600         buf.bind_label(label(2), ctrl_plane);
2601         buf.put1(0xff);
2602 
2603         let buf = buf.finish(&constants, ctrl_plane);
2604 
2605         assert_eq!(buf.data(), &[1, 2, 3, 4, 0xff, 0xff]);
2606         assert_eq!(
2607             buf.traps()
2608                 .iter()
2609                 .map(|trap| (trap.offset, trap.code))
2610                 .collect::<Vec<_>>(),
2611             vec![
2612                 (1, TrapCode::HEAP_OUT_OF_BOUNDS),
2613                 (2, TrapCode::INTEGER_OVERFLOW),
2614                 (2, TrapCode::INTEGER_DIVISION_BY_ZERO)
2615             ]
2616         );
2617         let call_sites: Vec<_> = buf.call_sites().collect();
2618         assert_eq!(call_sites[0].ret_addr, 2);
2619         assert_eq!(
2620             call_sites[0].exception_handlers,
2621             &[
2622                 FinalizedMachExceptionHandler::Tag(ExceptionTag::new(42), 5),
2623                 FinalizedMachExceptionHandler::Default(4)
2624             ],
2625         );
2626         assert_eq!(
2627             buf.relocs()
2628                 .iter()
2629                 .map(|reloc| (reloc.offset, reloc.kind))
2630                 .collect::<Vec<_>>(),
2631             vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
2632         );
2633     }
2634 }
2635