1 //! In-memory representation of compiled machine code, with labels and fixups to
2 //! refer to those labels. Handles constant-pool island insertion and also
3 //! veneer insertion for out-of-range jumps.
4 //!
5 //! This code exists to solve three problems:
6 //!
7 //! - Branch targets for forward branches are not known until later, when we
8 //!   emit code in a single pass through the instruction structs.
9 //!
10 //! - On many architectures, address references or offsets have limited range.
11 //!   For example, on AArch64, conditional branches can only target code +/- 1MB
12 //!   from the branch itself.
13 //!
14 //! - The lowering of control flow from the CFG-with-edges produced by
15 //!   [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
16 //!   edge blocks when the register allocator does not need to insert any
17 //!   spills/reloads/moves in edge blocks, results in many suboptimal branch
18 //!   patterns. The lowering also pays no attention to block order, and so
19 //!   two-target conditional forms (cond-br followed by uncond-br) can often by
20 //!   avoided because one of the targets is the fallthrough. There are several
21 //!   cases here where we can simplify to use fewer branches.
22 //!
23 //! This "buffer" implements a single-pass code emission strategy (with a later
24 //! "fixup" pass, but only through recorded fixups, not all instructions). The
25 //! basic idea is:
26 //!
27 //! - Emit branches as they are, including two-target (cond/uncond) compound
28 //!   forms, but with zero offsets and optimistically assuming the target will be
29 //!   in range. Record the "fixup" for later. Targets are denoted instead by
30 //!   symbolic "labels" that are then bound to certain offsets in the buffer as
31 //!   we emit code. (Nominally, there is a label at the start of every basic
32 //!   block.)
33 //!
34 //! - As we do this, track the offset in the buffer at which the first label
35 //!   reference "goes out of range". We call this the "deadline". If we reach the
36 //!   deadline and we still have not bound the label to which an unresolved branch
37 //!   refers, we have a problem!
38 //!
39 //! - To solve this problem, we emit "islands" full of "veneers". An island is
40 //!   simply a chunk of code inserted in the middle of the code actually produced
41 //!   by the emitter (e.g., vcode iterating over instruction structs). The emitter
42 //!   has some awareness of this: it either asks for an island between blocks, so
43 //!   it is not accidentally executed, or else it emits a branch around the island
44 //!   when all other options fail (see `Inst::EmitIsland` meta-instruction).
45 //!
46 //! - A "veneer" is an instruction (or sequence of instructions) in an "island"
47 //!   that implements a longer-range reference to a label. The idea is that, for
48 //!   example, a branch with a limited range can branch to a "veneer" instead,
49 //!   which is simply a branch in a form that can use a longer-range reference. On
50 //!   AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
51 //!   can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
52 //!   conditional branch's label reference can be fixed up with a "veneer" to
53 //!   achieve a longer range.
54 //!
55 //! - To implement all of this, we require the backend to provide a `LabelUse`
56 //!   type that implements a trait. This is nominally an enum that records one of
57 //!   several kinds of references to an offset in code -- basically, a relocation
58 //!   type -- and will usually correspond to different instruction formats. The
59 //!   `LabelUse` implementation specifies the maximum range, how to patch in the
60 //!   actual label location when known, and how to generate a veneer to extend the
61 //!   range.
62 //!
63 //! That satisfies label references, but we still may have suboptimal branch
64 //! patterns. To clean up the branches, we do a simple "peephole"-style
65 //! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
66 //! informs the buffer of branches in the code and, in the case of conditionals,
67 //! the code that would have been emitted to invert this branch's condition. We
68 //! track the "latest branches": these are branches that are contiguous up to
69 //! the current offset. (If any code is emitted after a branch, that branch or
70 //! run of contiguous branches is no longer "latest".) The latest branches are
71 //! those that we can edit by simply truncating the buffer and doing something
72 //! else instead.
73 //!
74 //! To optimize branches, we implement several simple rules, and try to apply
75 //! them to the "latest branches" when possible:
76 //!
77 //! - A branch with a label target, when that label is bound to the ending
78 //!   offset of the branch (the fallthrough location), can be removed altogether,
79 //!   because the branch would have no effect).
80 //!
81 //! - An unconditional branch that starts at a label location, and branches to
82 //!   another label, results in a "label alias": all references to the label bound
83 //!   *to* this branch instruction are instead resolved to the *target* of the
84 //!   branch instruction. This effectively removes empty blocks that just
85 //!   unconditionally branch to the next block. We call this "branch threading".
86 //!
87 //! - A conditional followed by an unconditional, when the conditional branches
88 //!   to the unconditional's fallthrough, results in (i) the truncation of the
89 //!   unconditional, (ii) the inversion of the condition's condition, and (iii)
90 //!   replacement of the conditional's target (using the original target of the
91 //!   unconditional). This is a fancy way of saying "we can flip a two-target
92 //!   conditional branch's taken/not-taken targets if it works better with our
93 //!   fallthrough". To make this work, the emitter actually gives the buffer
94 //!   *both* forms of every conditional branch: the true form is emitted into the
95 //!   buffer, and the "inverted" machine-code bytes are provided as part of the
96 //!   branch-fixup metadata.
97 //!
98 //! - An unconditional B preceded by another unconditional P, when B's label(s) have
99 //!   been redirected to target(B), can be removed entirely. This is an extension
100 //!   of the branch-threading optimization, and is valid because if we know there
101 //!   will be no fallthrough into this branch instruction (the prior instruction
102 //!   is an unconditional jump), and if we know we have successfully redirected
103 //!   all labels, then this branch instruction is unreachable. Note that this
104 //!   works because the redirection happens before the label is ever resolved
105 //!   (fixups happen at island emission time, at which point latest-branches are
106 //!   cleared, or at the end of emission), so we are sure to catch and redirect
107 //!   all possible paths to this instruction.
108 //!
109 //! # Branch-optimization Correctness
110 //!
111 //! The branch-optimization mechanism depends on a few data structures with
112 //! invariants, which are always held outside the scope of top-level public
113 //! methods:
114 //!
115 //! - The latest-branches list. Each entry describes a span of the buffer
116 //!   (start/end offsets), the label target, the corresponding fixup-list entry
117 //!   index, and the bytes (must be the same length) for the inverted form, if
118 //!   conditional. The list of labels that are bound to the start-offset of this
119 //!   branch is *complete* (if any label has a resolved offset equal to `start`
120 //!   and is not an alias, it must appear in this list) and *precise* (no label
121 //!   in this list can be bound to another offset). No label in this list should
122 //!   be an alias.  No two branch ranges can overlap, and branches are in
123 //!   ascending-offset order.
124 //!
125 //! - The labels-at-tail list. This contains all MachLabels that have been bound
126 //!   to (whose resolved offsets are equal to) the tail offset of the buffer.
127 //!   No label in this list should be an alias.
128 //!
129 //! - The label_offsets array, containing the bound offset of a label or
130 //!   UNKNOWN. No label can be bound at an offset greater than the current
131 //!   buffer tail.
132 //!
133 //! - The label_aliases array, containing another label to which a label is
134 //!   bound or UNKNOWN. A label's resolved offset is the resolved offset
135 //!   of the label it is aliased to, if this is set.
136 //!
137 //! We argue below, at each method, how the invariants in these data structures
138 //! are maintained (grep for "Post-invariant").
139 //!
140 //! Given these invariants, we argue why each optimization preserves execution
141 //! semantics below (grep for "Preserves execution semantics").
142 //!
143 //! # Avoiding Quadratic Behavior
144 //!
145 //! There are two cases where we've had to take some care to avoid
146 //! quadratic worst-case behavior:
147 //!
148 //! - The "labels at this branch" list can grow unboundedly if the
149 //!   code generator binds many labels at one location. If the count
150 //!   gets too high (defined by the `LABEL_LIST_THRESHOLD` constant), we
151 //!   simply abort an optimization early in a way that is always correct
152 //!   but is conservative.
153 //!
154 //! - The fixup list can interact with island emission to create
155 //!   "quadratic island behavior". In a little more detail, one can hit
156 //!   this behavior by having some pending fixups (forward label
157 //!   references) with long-range label-use kinds, and some others
158 //!   with shorter-range references that nonetheless still are pending
159 //!   long enough to trigger island generation. In such a case, we
160 //!   process the fixup list, generate veneers to extend some forward
161 //!   references' ranges, but leave the other (longer-range) ones
162 //!   alone. The way this was implemented put them back on a list and
163 //!   resulted in quadratic behavior.
164 //!
165 //!   To avoid this fixups are split into two lists: one "pending" list and one
166 //!   final list. The pending list is kept around for handling fixups related to
167 //!   branches so it can be edited/truncated. When an island is reached, which
168 //!   starts processing fixups, all pending fixups are flushed into the final
169 //!   list. The final list is a `BinaryHeap` which enables fixup processing to
170 //!   only process those which are required during island emission, deferring
171 //!   all longer-range fixups to later.
172 
173 use crate::binemit::{Addend, CodeOffset, Reloc};
174 use crate::ir::function::FunctionParameters;
175 use crate::ir::{ExceptionTag, ExternalName, RelSourceLoc, SourceLoc, TrapCode};
176 use crate::isa::unwind::UnwindInst;
177 use crate::machinst::{
178     BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
179 };
180 use crate::trace;
181 use crate::{ir, MachInstEmitState};
182 use crate::{timing, VCodeConstantData};
183 use core::ops::Range;
184 use cranelift_control::ControlPlane;
185 use cranelift_entity::packed_option::PackedOption;
186 use cranelift_entity::{entity_impl, PrimaryMap};
187 use smallvec::SmallVec;
188 use std::cmp::Ordering;
189 use std::collections::BinaryHeap;
190 use std::mem;
191 use std::string::String;
192 use std::vec::Vec;
193 
194 #[cfg(feature = "enable-serde")]
195 use serde::{Deserialize, Serialize};
196 
197 #[cfg(feature = "enable-serde")]
198 pub trait CompilePhase {
199     type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
200     type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
201 }
202 
203 #[cfg(not(feature = "enable-serde"))]
204 pub trait CompilePhase {
205     type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;
206     type SourceLocType: core::fmt::Debug + PartialEq + Clone;
207 }
208 
209 /// Status of a compiled artifact that needs patching before being used.
210 #[derive(Clone, Debug, PartialEq)]
211 #[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
212 pub struct Stencil;
213 
214 /// Status of a compiled artifact ready to use.
215 #[derive(Clone, Debug, PartialEq)]
216 pub struct Final;
217 
218 impl CompilePhase for Stencil {
219     type MachSrcLocType = MachSrcLoc<Stencil>;
220     type SourceLocType = RelSourceLoc;
221 }
222 
223 impl CompilePhase for Final {
224     type MachSrcLocType = MachSrcLoc<Final>;
225     type SourceLocType = SourceLoc;
226 }
227 
228 #[derive(Clone, Copy, Debug, PartialEq, Eq)]
229 enum ForceVeneers {
230     Yes,
231     No,
232 }
233 
234 /// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
235 /// in bulk.
236 ///
237 /// This struct uses `SmallVec`s to support small-ish function bodies without
238 /// any heap allocation. As such, it will be several kilobytes large. This is
239 /// likely fine as long as it is stack-allocated for function emission then
240 /// thrown away; but beware if many buffer objects are retained persistently.
241 pub struct MachBuffer<I: VCodeInst> {
242     /// The buffer contents, as raw bytes.
243     data: SmallVec<[u8; 1024]>,
244     /// The required alignment of this buffer.
245     min_alignment: u32,
246     /// Any relocations referring to this code. Note that only *external*
247     /// relocations are tracked here; references to labels within the buffer are
248     /// resolved before emission.
249     relocs: SmallVec<[MachReloc; 16]>,
250     /// Any trap records referring to this code.
251     traps: SmallVec<[MachTrap; 16]>,
252     /// Any call site records referring to this code.
253     call_sites: SmallVec<[MachCallSite; 16]>,
254     /// Any exception-handler records referred to at call sites.
255     exception_handlers: SmallVec<[(PackedOption<ir::ExceptionTag>, MachLabel); 16]>,
256     /// Any source location mappings referring to this code.
257     srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,
258     /// Any user stack maps for this code.
259     ///
260     /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
261     /// by code offset, and each stack map covers `span` bytes on the stack.
262     user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
263     /// Any unwind info at a given location.
264     unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
265     /// The current source location in progress (after `start_srcloc()` and
266     /// before `end_srcloc()`).  This is a (start_offset, src_loc) tuple.
267     cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,
268     /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
269     label_offsets: SmallVec<[CodeOffset; 16]>,
270     /// Label aliases: when one label points to an unconditional jump, and that
271     /// jump points to another label, we can redirect references to the first
272     /// label immediately to the second.
273     ///
274     /// Invariant: we don't have label-alias cycles. We ensure this by,
275     /// before setting label A to alias label B, resolving B's alias
276     /// target (iteratively until a non-aliased label); if B is already
277     /// aliased to A, then we cannot alias A back to B.
278     label_aliases: SmallVec<[MachLabel; 16]>,
279     /// Constants that must be emitted at some point.
280     pending_constants: SmallVec<[VCodeConstant; 16]>,
281     /// Byte size of all constants in `pending_constants`.
282     pending_constants_size: CodeOffset,
283     /// Traps that must be emitted at some point.
284     pending_traps: SmallVec<[MachLabelTrap; 16]>,
285     /// Fixups that haven't yet been flushed into `fixup_records` below and may
286     /// be related to branches that are chomped. These all get added to
287     /// `fixup_records` during island emission.
288     pending_fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
289     /// The nearest upcoming deadline for entries in `pending_fixup_records`.
290     pending_fixup_deadline: CodeOffset,
291     /// Fixups that must be performed after all code is emitted.
292     fixup_records: BinaryHeap<MachLabelFixup<I>>,
293     /// Latest branches, to facilitate in-place editing for better fallthrough
294     /// behavior and empty-block removal.
295     latest_branches: SmallVec<[MachBranch; 4]>,
296     /// All labels at the current offset (emission tail). This is lazily
297     /// cleared: it is actually accurate as long as the current offset is
298     /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
299     /// be considered as empty.
300     ///
301     /// For correctness, this *must* be complete (i.e., the vector must contain
302     /// all labels whose offsets are resolved to the current tail), because we
303     /// rely on it to update labels when we truncate branches.
304     labels_at_tail: SmallVec<[MachLabel; 4]>,
305     /// The last offset at which `labels_at_tail` is valid. It is conceptually
306     /// always describing the tail of the buffer, but we do not clear
307     /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
308     /// when the offset has grown past this (`labels_at_tail_off`) point.
309     /// Always <= `cur_offset()`.
310     labels_at_tail_off: CodeOffset,
311     /// Metadata about all constants that this function has access to.
312     ///
313     /// This records the size/alignment of all constants (not the actual data)
314     /// along with the last available label generated for the constant. This map
315     /// is consulted when constants are referred to and the label assigned to a
316     /// constant may change over time as well.
317     constants: PrimaryMap<VCodeConstant, MachBufferConstant>,
318     /// All recorded usages of constants as pairs of the constant and where the
319     /// constant needs to be placed within `self.data`. Note that the same
320     /// constant may appear in this array multiple times if it was emitted
321     /// multiple times.
322     used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>,
323     /// Indicates when a patchable region is currently open, to guard that it's
324     /// not possible to nest patchable regions.
325     open_patchable: bool,
326 }
327 
328 impl MachBufferFinalized<Stencil> {
329     /// Get a finalized machine buffer by applying the function's base source location.
330     pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {
331         MachBufferFinalized {
332             data: self.data,
333             relocs: self.relocs,
334             traps: self.traps,
335             call_sites: self.call_sites,
336             exception_handlers: self.exception_handlers,
337             srclocs: self
338                 .srclocs
339                 .into_iter()
340                 .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))
341                 .collect(),
342             user_stack_maps: self.user_stack_maps,
343             unwind_info: self.unwind_info,
344             alignment: self.alignment,
345         }
346     }
347 }
348 
349 /// A `MachBuffer` once emission is completed: holds generated code and records,
350 /// without fixups. This allows the type to be independent of the backend.
351 #[derive(PartialEq, Debug, Clone)]
352 #[cfg_attr(
353     feature = "enable-serde",
354     derive(serde_derive::Serialize, serde_derive::Deserialize)
355 )]
356 pub struct MachBufferFinalized<T: CompilePhase> {
357     /// The buffer contents, as raw bytes.
358     pub(crate) data: SmallVec<[u8; 1024]>,
359     /// Any relocations referring to this code. Note that only *external*
360     /// relocations are tracked here; references to labels within the buffer are
361     /// resolved before emission.
362     pub(crate) relocs: SmallVec<[FinalizedMachReloc; 16]>,
363     /// Any trap records referring to this code.
364     pub(crate) traps: SmallVec<[MachTrap; 16]>,
365     /// Any call site records referring to this code.
366     pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,
367     /// Any exception-handler records referred to at call sites.
368     pub(crate) exception_handlers: SmallVec<[(PackedOption<ir::ExceptionTag>, CodeOffset); 16]>,
369     /// Any source location mappings referring to this code.
370     pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,
371     /// Any user stack maps for this code.
372     ///
373     /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
374     /// by code offset, and each stack map covers `span` bytes on the stack.
375     pub(crate) user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
376     /// Any unwind info at a given location.
377     pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
378     /// The required alignment of this buffer.
379     pub alignment: u32,
380 }
381 
382 const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
383 const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);
384 
385 /// Threshold on max length of `labels_at_this_branch` list to avoid
386 /// unbounded quadratic behavior (see comment below at use-site).
387 const LABEL_LIST_THRESHOLD: usize = 100;
388 
389 /// A label refers to some offset in a `MachBuffer`. It may not be resolved at
390 /// the point at which it is used by emitted code; the buffer records "fixups"
391 /// for references to the label, and will come back and patch the code
392 /// appropriately when the label's location is eventually known.
393 #[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
394 pub struct MachLabel(u32);
395 entity_impl!(MachLabel);
396 
397 impl MachLabel {
398     /// Get a label for a block. (The first N MachLabels are always reserved for
399     /// the N blocks in the vcode.)
400     pub fn from_block(bindex: BlockIndex) -> MachLabel {
401         MachLabel(bindex.index() as u32)
402     }
403 
404     /// Creates a string representing this label, for convenience.
405     pub fn to_string(&self) -> String {
406         format!("label{}", self.0)
407     }
408 }
409 
410 impl Default for MachLabel {
411     fn default() -> Self {
412         UNKNOWN_LABEL
413     }
414 }
415 
416 /// Represents the beginning of an editable region in the [`MachBuffer`], while code emission is
417 /// still occurring. An [`OpenPatchRegion`] is closed by [`MachBuffer::end_patchable`], consuming
418 /// the [`OpenPatchRegion`] token in the process.
419 pub struct OpenPatchRegion(usize);
420 
421 /// A region in the [`MachBuffer`] code buffer that can be edited prior to finalization. An example
422 /// of where you might want to use this is for patching instructions that mention constants that
423 /// won't be known until later: [`MachBuffer::start_patchable`] can be used to begin the patchable
424 /// region, instructions can be emitted with placeholder constants, and the [`PatchRegion`] token
425 /// can be produced by [`MachBuffer::end_patchable`]. Once the values of those constants are known,
426 /// the [`PatchRegion::patch`] function can be used to get a mutable buffer to the instruction
427 /// bytes, and the constants uses can be updated directly.
428 pub struct PatchRegion {
429     range: Range<usize>,
430 }
431 
432 impl PatchRegion {
433     /// Consume the patch region to yield a mutable slice of the [`MachBuffer`] data buffer.
434     pub fn patch<I: VCodeInst>(self, buffer: &mut MachBuffer<I>) -> &mut [u8] {
435         &mut buffer.data[self.range]
436     }
437 }
438 
439 impl<I: VCodeInst> MachBuffer<I> {
440     /// Create a new section, known to start at `start_offset` and with a size limited to
441     /// `length_limit`.
442     pub fn new() -> MachBuffer<I> {
443         MachBuffer {
444             data: SmallVec::new(),
445             min_alignment: I::function_alignment().minimum,
446             relocs: SmallVec::new(),
447             traps: SmallVec::new(),
448             call_sites: SmallVec::new(),
449             exception_handlers: SmallVec::new(),
450             srclocs: SmallVec::new(),
451             user_stack_maps: SmallVec::new(),
452             unwind_info: SmallVec::new(),
453             cur_srcloc: None,
454             label_offsets: SmallVec::new(),
455             label_aliases: SmallVec::new(),
456             pending_constants: SmallVec::new(),
457             pending_constants_size: 0,
458             pending_traps: SmallVec::new(),
459             pending_fixup_records: SmallVec::new(),
460             pending_fixup_deadline: u32::MAX,
461             fixup_records: Default::default(),
462             latest_branches: SmallVec::new(),
463             labels_at_tail: SmallVec::new(),
464             labels_at_tail_off: 0,
465             constants: Default::default(),
466             used_constants: Default::default(),
467             open_patchable: false,
468         }
469     }
470 
471     /// Current offset from start of buffer.
472     pub fn cur_offset(&self) -> CodeOffset {
473         self.data.len() as CodeOffset
474     }
475 
476     /// Add a byte.
477     pub fn put1(&mut self, value: u8) {
478         self.data.push(value);
479 
480         // Post-invariant: conceptual-labels_at_tail contains a complete and
481         // precise list of labels bound at `cur_offset()`. We have advanced
482         // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
483         // before, it is not anymore (and it cannot become equal, because
484         // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
485         // conceptually empty (even though it is only lazily cleared). No labels
486         // can be bound at this new offset (by invariant on `label_offsets`).
487         // Hence the invariant holds.
488     }
489 
490     /// Add 2 bytes.
491     pub fn put2(&mut self, value: u16) {
492         let bytes = value.to_le_bytes();
493         self.data.extend_from_slice(&bytes[..]);
494 
495         // Post-invariant: as for `put1()`.
496     }
497 
498     /// Add 4 bytes.
499     pub fn put4(&mut self, value: u32) {
500         let bytes = value.to_le_bytes();
501         self.data.extend_from_slice(&bytes[..]);
502 
503         // Post-invariant: as for `put1()`.
504     }
505 
506     /// Add 8 bytes.
507     pub fn put8(&mut self, value: u64) {
508         let bytes = value.to_le_bytes();
509         self.data.extend_from_slice(&bytes[..]);
510 
511         // Post-invariant: as for `put1()`.
512     }
513 
514     /// Add a slice of bytes.
515     pub fn put_data(&mut self, data: &[u8]) {
516         self.data.extend_from_slice(data);
517 
518         // Post-invariant: as for `put1()`.
519     }
520 
521     /// Reserve appended space and return a mutable slice referring to it.
522     pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
523         let off = self.data.len();
524         let new_len = self.data.len() + len;
525         self.data.resize(new_len, 0);
526         &mut self.data[off..]
527 
528         // Post-invariant: as for `put1()`.
529     }
530 
531     /// Align up to the given alignment.
532     pub fn align_to(&mut self, align_to: CodeOffset) {
533         trace!("MachBuffer: align to {}", align_to);
534         assert!(
535             align_to.is_power_of_two(),
536             "{align_to} is not a power of two"
537         );
538         while self.cur_offset() & (align_to - 1) != 0 {
539             self.put1(0);
540         }
541 
542         // Post-invariant: as for `put1()`.
543     }
544 
545     /// Begin a region of patchable code. There is one requirement for the
546     /// code that is emitted: It must not introduce any instructions that
547     /// could be chomped (branches are an example of this). In other words,
548     /// you must not call [`MachBuffer::add_cond_branch`] or
549     /// [`MachBuffer::add_uncond_branch`] between calls to this method and
550     /// [`MachBuffer::end_patchable`].
551     pub fn start_patchable(&mut self) -> OpenPatchRegion {
552         assert!(!self.open_patchable, "Patchable regions may not be nested");
553         self.open_patchable = true;
554         OpenPatchRegion(usize::try_from(self.cur_offset()).unwrap())
555     }
556 
557     /// End a region of patchable code, yielding a [`PatchRegion`] value that
558     /// can be consumed later to produce a one-off mutable slice to the
559     /// associated region of the data buffer.
560     pub fn end_patchable(&mut self, open: OpenPatchRegion) -> PatchRegion {
561         // No need to assert the state of `open_patchable` here, as we take
562         // ownership of the only `OpenPatchable` value.
563         self.open_patchable = false;
564         let end = usize::try_from(self.cur_offset()).unwrap();
565         PatchRegion { range: open.0..end }
566     }
567 
568     /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
569     /// offset yet.
570     pub fn get_label(&mut self) -> MachLabel {
571         let l = self.label_offsets.len() as u32;
572         self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
573         self.label_aliases.push(UNKNOWN_LABEL);
574         trace!("MachBuffer: new label -> {:?}", MachLabel(l));
575         MachLabel(l)
576 
577         // Post-invariant: the only mutation is to add a new label; it has no
578         // bound offset yet, so it trivially satisfies all invariants.
579     }
580 
581     /// Reserve the first N MachLabels for blocks.
582     pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
583         trace!("MachBuffer: first {} labels are for blocks", blocks);
584         debug_assert!(self.label_offsets.is_empty());
585         self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
586         self.label_aliases.resize(blocks, UNKNOWN_LABEL);
587 
588         // Post-invariant: as for `get_label()`.
589     }
590 
591     /// Registers metadata in this `MachBuffer` about the `constants` provided.
592     ///
593     /// This will record the size/alignment of all constants which will prepare
594     /// them for emission later on.
595     pub fn register_constants(&mut self, constants: &VCodeConstants) {
596         for (c, val) in constants.iter() {
597             self.register_constant(&c, val);
598         }
599     }
600 
601     /// Similar to [`MachBuffer::register_constants`] but registers a
602     /// single constant metadata. This function is useful in
603     /// situations where not all constants are known at the time of
604     /// emission.
605     pub fn register_constant(&mut self, constant: &VCodeConstant, data: &VCodeConstantData) {
606         let c2 = self.constants.push(MachBufferConstant {
607             upcoming_label: None,
608             align: data.alignment(),
609             size: data.as_slice().len(),
610         });
611         assert_eq!(*constant, c2);
612     }
613 
614     /// Completes constant emission by iterating over `self.used_constants` and
615     /// filling in the "holes" with the constant values provided by `constants`.
616     ///
617     /// Returns the alignment required for this entire buffer. Alignment starts
618     /// at the ISA's minimum function alignment and can be increased due to
619     /// constant requirements.
620     fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 {
621         let mut alignment = self.min_alignment;
622         for (constant, offset) in mem::take(&mut self.used_constants) {
623             let constant = constants.get(constant);
624             let data = constant.as_slice();
625             self.data[offset as usize..][..data.len()].copy_from_slice(data);
626             alignment = constant.alignment().max(alignment);
627         }
628         alignment
629     }
630 
631     /// Returns a label that can be used to refer to the `constant` provided.
632     ///
633     /// This will automatically defer a new constant to be emitted for
634     /// `constant` if it has not been previously emitted. Note that this
635     /// function may return a different label for the same constant at
636     /// different points in time. The label is valid to use only from the
637     /// current location; the MachBuffer takes care to emit the same constant
638     /// multiple times if needed so the constant is always in range.
639     pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel {
640         let MachBufferConstant {
641             align,
642             size,
643             upcoming_label,
644         } = self.constants[constant];
645         if let Some(label) = upcoming_label {
646             return label;
647         }
648 
649         let label = self.get_label();
650         trace!(
651             "defer constant: eventually emit {size} bytes aligned \
652              to {align} at label {label:?}",
653         );
654         self.pending_constants.push(constant);
655         self.pending_constants_size += size as u32;
656         self.constants[constant].upcoming_label = Some(label);
657         label
658     }
659 
660     /// Bind a label to the current offset. A label can only be bound once.
661     pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) {
662         trace!(
663             "MachBuffer: bind label {:?} at offset {}",
664             label,
665             self.cur_offset()
666         );
667         debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
668         debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
669         let offset = self.cur_offset();
670         self.label_offsets[label.0 as usize] = offset;
671         self.lazily_clear_labels_at_tail();
672         self.labels_at_tail.push(label);
673 
674         // Invariants hold: bound offset of label is <= cur_offset (in fact it
675         // is equal). If the `labels_at_tail` list was complete and precise
676         // before, it is still, because we have bound this label to the current
677         // offset and added it to the list (which contains all labels at the
678         // current offset).
679 
680         self.optimize_branches(ctrl_plane);
681 
682         // Post-invariant: by `optimize_branches()` (see argument there).
683     }
684 
685     /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
686     /// offset that it applies to.
687     fn lazily_clear_labels_at_tail(&mut self) {
688         let offset = self.cur_offset();
689         if offset > self.labels_at_tail_off {
690             self.labels_at_tail_off = offset;
691             self.labels_at_tail.clear();
692         }
693 
694         // Post-invariant: either labels_at_tail_off was at cur_offset, and
695         // state is untouched, or was less than cur_offset, in which case the
696         // labels_at_tail list was conceptually empty, and is now actually
697         // empty.
698     }
699 
700     /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
701     pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
702         let mut iters = 0;
703         while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
704             label = self.label_aliases[label.0 as usize];
705             // To protect against an infinite loop (despite our assurances to
706             // ourselves that the invariants make this impossible), assert out
707             // after 1M iterations. The number of basic blocks is limited
708             // in most contexts anyway so this should be impossible to hit with
709             // a legitimate input.
710             iters += 1;
711             assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
712         }
713         self.label_offsets[label.0 as usize]
714 
715         // Post-invariant: no mutations.
716     }
717 
718     /// Emit a reference to the given label with the given reference type (i.e.,
719     /// branch-instruction format) at the current offset.  This is like a
720     /// relocation, but handled internally.
721     ///
722     /// This can be called before the branch is actually emitted; fixups will
723     /// not happen until an island is emitted or the buffer is finished.
724     pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
725         trace!(
726             "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
727             offset,
728             label,
729             kind
730         );
731 
732         // Add the fixup, and update the worst-case island size based on a
733         // veneer for this label use.
734         let fixup = MachLabelFixup {
735             label,
736             offset,
737             kind,
738         };
739         self.pending_fixup_deadline = self.pending_fixup_deadline.min(fixup.deadline());
740         self.pending_fixup_records.push(fixup);
741 
742         // Post-invariant: no mutations to branches/labels data structures.
743     }
744 
745     /// Inform the buffer of an unconditional branch at the given offset,
746     /// targeting the given label. May be used to optimize branches.
747     /// The last added label-use must correspond to this branch.
748     /// This must be called when the current offset is equal to `start`; i.e.,
749     /// before actually emitting the branch. This implies that for a branch that
750     /// uses a label and is eligible for optimizations by the MachBuffer, the
751     /// proper sequence is:
752     ///
753     /// - Call `use_label_at_offset()` to emit the fixup record.
754     /// - Call `add_uncond_branch()` to make note of the branch.
755     /// - Emit the bytes for the branch's machine code.
756     ///
757     /// Additional requirement: no labels may be bound between `start` and `end`
758     /// (exclusive on both ends).
759     pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
760         debug_assert!(
761             !self.open_patchable,
762             "Branch instruction inserted within a patchable region"
763         );
764         assert!(self.cur_offset() == start);
765         debug_assert!(end > start);
766         assert!(!self.pending_fixup_records.is_empty());
767         let fixup = self.pending_fixup_records.len() - 1;
768         self.lazily_clear_labels_at_tail();
769         self.latest_branches.push(MachBranch {
770             start,
771             end,
772             target,
773             fixup,
774             inverted: None,
775             labels_at_this_branch: self.labels_at_tail.clone(),
776         });
777 
778         // Post-invariant: we asserted branch start is current tail; the list of
779         // labels at branch is cloned from list of labels at current tail.
780     }
781 
782     /// Inform the buffer of a conditional branch at the given offset,
783     /// targeting the given label. May be used to optimize branches.
784     /// The last added label-use must correspond to this branch.
785     ///
786     /// Additional requirement: no labels may be bound between `start` and `end`
787     /// (exclusive on both ends).
788     pub fn add_cond_branch(
789         &mut self,
790         start: CodeOffset,
791         end: CodeOffset,
792         target: MachLabel,
793         inverted: &[u8],
794     ) {
795         debug_assert!(
796             !self.open_patchable,
797             "Branch instruction inserted within a patchable region"
798         );
799         assert!(self.cur_offset() == start);
800         debug_assert!(end > start);
801         assert!(!self.pending_fixup_records.is_empty());
802         debug_assert!(
803             inverted.len() == (end - start) as usize,
804             "branch length = {}, but inverted length = {}",
805             end - start,
806             inverted.len()
807         );
808         let fixup = self.pending_fixup_records.len() - 1;
809         let inverted = Some(SmallVec::from(inverted));
810         self.lazily_clear_labels_at_tail();
811         self.latest_branches.push(MachBranch {
812             start,
813             end,
814             target,
815             fixup,
816             inverted,
817             labels_at_this_branch: self.labels_at_tail.clone(),
818         });
819 
820         // Post-invariant: we asserted branch start is current tail; labels at
821         // branch list is cloned from list of labels at current tail.
822     }
823 
824     fn truncate_last_branch(&mut self) {
825         debug_assert!(
826             !self.open_patchable,
827             "Branch instruction truncated within a patchable region"
828         );
829 
830         self.lazily_clear_labels_at_tail();
831         // Invariants hold at this point.
832 
833         let b = self.latest_branches.pop().unwrap();
834         assert!(b.end == self.cur_offset());
835 
836         // State:
837         //    [PRE CODE]
838         //  Offset b.start, b.labels_at_this_branch:
839         //    [BRANCH CODE]
840         //  cur_off, self.labels_at_tail -->
841         //    (end of buffer)
842         self.data.truncate(b.start as usize);
843         self.pending_fixup_records.truncate(b.fixup);
844         while let Some(last_srcloc) = self.srclocs.last_mut() {
845             if last_srcloc.end <= b.start {
846                 break;
847             }
848             if last_srcloc.start < b.start {
849                 last_srcloc.end = b.start;
850                 break;
851             }
852             self.srclocs.pop();
853         }
854         // State:
855         //    [PRE CODE]
856         //  cur_off, Offset b.start, b.labels_at_this_branch:
857         //    (end of buffer)
858         //
859         //  self.labels_at_tail -->  (past end of buffer)
860         let cur_off = self.cur_offset();
861         self.labels_at_tail_off = cur_off;
862         // State:
863         //    [PRE CODE]
864         //  cur_off, Offset b.start, b.labels_at_this_branch,
865         //  self.labels_at_tail:
866         //    (end of buffer)
867         //
868         // resolve_label_offset(l) for l in labels_at_tail:
869         //    (past end of buffer)
870 
871         trace!(
872             "truncate_last_branch: truncated {:?}; off now {}",
873             b,
874             cur_off
875         );
876 
877         // Fix up resolved label offsets for labels at tail.
878         for &l in &self.labels_at_tail {
879             self.label_offsets[l.0 as usize] = cur_off;
880         }
881         // Old labels_at_this_branch are now at cur_off.
882         self.labels_at_tail
883             .extend(b.labels_at_this_branch.into_iter());
884 
885         // Post-invariant: this operation is defined to truncate the buffer,
886         // which moves cur_off backward, and to move labels at the end of the
887         // buffer back to the start-of-branch offset.
888         //
889         // latest_branches satisfies all invariants:
890         // - it has no branches past the end of the buffer (branches are in
891         //   order, we removed the last one, and we truncated the buffer to just
892         //   before the start of that branch)
893         // - no labels were moved to lower offsets than the (new) cur_off, so
894         //   the labels_at_this_branch list for any other branch need not change.
895         //
896         // labels_at_tail satisfies all invariants:
897         // - all labels that were at the tail after the truncated branch are
898         //   moved backward to just before the branch, which becomes the new tail;
899         //   thus every element in the list should remain (ensured by `.extend()`
900         //   above).
901         // - all labels that refer to the new tail, which is the start-offset of
902         //   the truncated branch, must be present. The `labels_at_this_branch`
903         //   list in the truncated branch's record is a complete and precise list
904         //   of exactly these labels; we append these to labels_at_tail.
905         // - labels_at_tail_off is at cur_off after truncation occurs, so the
906         //   list is valid (not to be lazily cleared).
907         //
908         // The stated operation was performed:
909         // - For each label at the end of the buffer prior to this method, it
910         //   now resolves to the new (truncated) end of the buffer: it must have
911         //   been in `labels_at_tail` (this list is precise and complete, and
912         //   the tail was at the end of the truncated branch on entry), and we
913         //   iterate over this list and set `label_offsets` to the new tail.
914         //   None of these labels could have been an alias (by invariant), so
915         //   `label_offsets` is authoritative for each.
916         // - No other labels will be past the end of the buffer, because of the
917         //   requirement that no labels be bound to the middle of branch ranges
918         //   (see comments to `add_{cond,uncond}_branch()`).
919         // - The buffer is truncated to just before the last branch, and the
920         //   fixup record referring to that last branch is removed.
921     }
922 
923     /// Performs various optimizations on branches pointing at the current label.
924     pub fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) {
925         if ctrl_plane.get_decision() {
926             return;
927         }
928 
929         self.lazily_clear_labels_at_tail();
930         // Invariants valid at this point.
931 
932         trace!(
933             "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
934             self.latest_branches,
935             self.labels_at_tail,
936             self.pending_fixup_records
937         );
938 
939         // We continue to munch on branches at the tail of the buffer until no
940         // more rules apply. Note that the loop only continues if a branch is
941         // actually truncated (or if labels are redirected away from a branch),
942         // so this always makes progress.
943         while let Some(b) = self.latest_branches.last() {
944             let cur_off = self.cur_offset();
945             trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
946             // If there has been any code emission since the end of the last branch or
947             // label definition, then there's nothing we can edit (because we
948             // don't move code once placed, only back up and overwrite), so
949             // clear the records and finish.
950             if b.end < cur_off {
951                 break;
952             }
953 
954             // If the "labels at this branch" list on this branch is
955             // longer than a threshold, don't do any simplification,
956             // and let the branch remain to separate those labels from
957             // the current tail. This avoids quadratic behavior (see
958             // #3468): otherwise, if a long string of "goto next;
959             // next:" patterns are emitted, all of the labels will
960             // coalesce into a long list of aliases for the current
961             // buffer tail. We must track all aliases of the current
962             // tail for correctness, but we are also allowed to skip
963             // optimization (removal) of any branch, so we take the
964             // escape hatch here and let it stand. In effect this
965             // "spreads" the many thousands of labels in the
966             // pathological case among an actual (harmless but
967             // suboptimal) instruction once per N labels.
968             if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
969                 break;
970             }
971 
972             // Invariant: we are looking at a branch that ends at the tail of
973             // the buffer.
974 
975             // For any branch, conditional or unconditional:
976             // - If the target is a label at the current offset, then remove
977             //   the conditional branch, and reset all labels that targeted
978             //   the current offset (end of branch) to the truncated
979             //   end-of-code.
980             //
981             // Preserves execution semantics: a branch to its own fallthrough
982             // address is equivalent to a no-op; in both cases, nextPC is the
983             // fallthrough.
984             if self.resolve_label_offset(b.target) == cur_off {
985                 trace!("branch with target == cur off; truncating");
986                 self.truncate_last_branch();
987                 continue;
988             }
989 
990             // If latest is an unconditional branch:
991             //
992             // - If the branch's target is not its own start address, then for
993             //   each label at the start of branch, make the label an alias of the
994             //   branch target, and remove the label from the "labels at this
995             //   branch" list.
996             //
997             //   - Preserves execution semantics: an unconditional branch's
998             //     only effect is to set PC to a new PC; this change simply
999             //     collapses one step in the step-semantics.
1000             //
1001             //   - Post-invariant: the labels that were bound to the start of
1002             //     this branch become aliases, so they must not be present in any
1003             //     labels-at-this-branch list or the labels-at-tail list. The
1004             //     labels are removed form the latest-branch record's
1005             //     labels-at-this-branch list, and are never placed in the
1006             //     labels-at-tail list. Furthermore, it is correct that they are
1007             //     not in either list, because they are now aliases, and labels
1008             //     that are aliases remain aliases forever.
1009             //
1010             // - If there is a prior unconditional branch that ends just before
1011             //   this one begins, and this branch has no labels bound to its
1012             //   start, then we can truncate this branch, because it is entirely
1013             //   unreachable (we have redirected all labels that make it
1014             //   reachable otherwise). Do so and continue around the loop.
1015             //
1016             //   - Preserves execution semantics: the branch is unreachable,
1017             //     because execution can only flow into an instruction from the
1018             //     prior instruction's fallthrough or from a branch bound to that
1019             //     instruction's start offset. Unconditional branches have no
1020             //     fallthrough, so if the prior instruction is an unconditional
1021             //     branch, no fallthrough entry can happen. The
1022             //     labels-at-this-branch list is complete (by invariant), so if it
1023             //     is empty, then the instruction is entirely unreachable. Thus,
1024             //     it can be removed.
1025             //
1026             //   - Post-invariant: ensured by truncate_last_branch().
1027             //
1028             // - If there is a prior conditional branch whose target label
1029             //   resolves to the current offset (branches around the
1030             //   unconditional branch), then remove the unconditional branch,
1031             //   and make the target of the unconditional the target of the
1032             //   conditional instead.
1033             //
1034             //   - Preserves execution semantics: previously we had:
1035             //
1036             //         L1:
1037             //            cond_br L2
1038             //            br L3
1039             //         L2:
1040             //            (end of buffer)
1041             //
1042             //     by removing the last branch, we have:
1043             //
1044             //         L1:
1045             //            cond_br L2
1046             //         L2:
1047             //            (end of buffer)
1048             //
1049             //     we then fix up the records for the conditional branch to
1050             //     have:
1051             //
1052             //         L1:
1053             //           cond_br.inverted L3
1054             //         L2:
1055             //
1056             //     In the original code, control flow reaches L2 when the
1057             //     conditional branch's predicate is true, and L3 otherwise. In
1058             //     the optimized code, the same is true.
1059             //
1060             //   - Post-invariant: all edits to latest_branches and
1061             //     labels_at_tail are performed by `truncate_last_branch()`,
1062             //     which maintains the invariants at each step.
1063 
1064             if b.is_uncond() {
1065                 // Set any label equal to current branch's start as an alias of
1066                 // the branch's target, if the target is not the branch itself
1067                 // (i.e., an infinite loop).
1068                 //
1069                 // We cannot perform this aliasing if the target of this branch
1070                 // ultimately aliases back here; if so, we need to keep this
1071                 // branch, so break out of this loop entirely (and clear the
1072                 // latest-branches list below).
1073                 //
1074                 // Note that this check is what prevents cycles from forming in
1075                 // `self.label_aliases`. To see why, consider an arbitrary start
1076                 // state:
1077                 //
1078                 // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
1079                 // Ln, which is not aliased.
1080                 //
1081                 // We would create a cycle if we assigned label_aliases[Ln]
1082                 // = L1.  Note that the below assignment is the only write
1083                 // to label_aliases.
1084                 //
1085                 // By our other invariants, we have that Ln (`l` below)
1086                 // resolves to the offset `b.start`, because it is in the
1087                 // set `b.labels_at_this_branch`.
1088                 //
1089                 // If L1 were already aliased, through some arbitrarily deep
1090                 // chain, to Ln, then it must also resolve to this offset
1091                 // `b.start`.
1092                 //
1093                 // By checking the resolution of `L1` against this offset,
1094                 // and aborting this branch-simplification if they are
1095                 // equal, we prevent the below assignment from ever creating
1096                 // a cycle.
1097                 if self.resolve_label_offset(b.target) != b.start {
1098                     let redirected = b.labels_at_this_branch.len();
1099                     for &l in &b.labels_at_this_branch {
1100                         trace!(
1101                             " -> label at start of branch {:?} redirected to target {:?}",
1102                             l,
1103                             b.target
1104                         );
1105                         self.label_aliases[l.0 as usize] = b.target;
1106                         // NOTE: we continue to ensure the invariant that labels
1107                         // pointing to tail of buffer are in `labels_at_tail`
1108                         // because we already ensured above that the last branch
1109                         // cannot have a target of `cur_off`; so we never have
1110                         // to put the label into `labels_at_tail` when moving it
1111                         // here.
1112                     }
1113                     // Maintain invariant: all branches have been redirected
1114                     // and are no longer pointing at the start of this branch.
1115                     let mut_b = self.latest_branches.last_mut().unwrap();
1116                     mut_b.labels_at_this_branch.clear();
1117 
1118                     if redirected > 0 {
1119                         trace!(" -> after label redirects, restarting loop");
1120                         continue;
1121                     }
1122                 } else {
1123                     break;
1124                 }
1125 
1126                 let b = self.latest_branches.last().unwrap();
1127 
1128                 // Examine any immediately preceding branch.
1129                 if self.latest_branches.len() > 1 {
1130                     let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
1131                     trace!(" -> more than one branch; prev_b = {:?}", prev_b);
1132                     // This uncond is immediately after another uncond; we
1133                     // should have already redirected labels to this uncond away
1134                     // (but check to be sure); so we can truncate this uncond.
1135                     if prev_b.is_uncond()
1136                         && prev_b.end == b.start
1137                         && b.labels_at_this_branch.is_empty()
1138                     {
1139                         trace!(" -> uncond follows another uncond; truncating");
1140                         self.truncate_last_branch();
1141                         continue;
1142                     }
1143 
1144                     // This uncond is immediately after a conditional, and the
1145                     // conditional's target is the end of this uncond, and we've
1146                     // already redirected labels to this uncond away; so we can
1147                     // truncate this uncond, flip the sense of the conditional, and
1148                     // set the conditional's target (in `latest_branches` and in
1149                     // `fixup_records`) to the uncond's target.
1150                     if prev_b.is_cond()
1151                         && prev_b.end == b.start
1152                         && self.resolve_label_offset(prev_b.target) == cur_off
1153                     {
1154                         trace!(" -> uncond follows a conditional, and conditional's target resolves to current offset");
1155                         // Save the target of the uncond (this becomes the
1156                         // target of the cond), and truncate the uncond.
1157                         let target = b.target;
1158                         let data = prev_b.inverted.clone().unwrap();
1159                         self.truncate_last_branch();
1160 
1161                         // Mutate the code and cond branch.
1162                         let off_before_edit = self.cur_offset();
1163                         let prev_b = self.latest_branches.last_mut().unwrap();
1164                         let not_inverted = SmallVec::from(
1165                             &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
1166                         );
1167 
1168                         // Low-level edit: replaces bytes of branch with
1169                         // inverted form. cur_off remains the same afterward, so
1170                         // we do not need to modify label data structures.
1171                         self.data.truncate(prev_b.start as usize);
1172                         self.data.extend_from_slice(&data[..]);
1173 
1174                         // Save the original code as the inversion of the
1175                         // inverted branch, in case we later edit this branch
1176                         // again.
1177                         prev_b.inverted = Some(not_inverted);
1178                         self.pending_fixup_records[prev_b.fixup].label = target;
1179                         trace!(" -> reassigning target of condbr to {:?}", target);
1180                         prev_b.target = target;
1181                         debug_assert_eq!(off_before_edit, self.cur_offset());
1182                         continue;
1183                     }
1184                 }
1185             }
1186 
1187             // If we couldn't do anything with the last branch, then break.
1188             break;
1189         }
1190 
1191         self.purge_latest_branches();
1192 
1193         trace!(
1194             "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
1195             self.latest_branches,
1196             self.labels_at_tail,
1197             self.pending_fixup_records
1198         );
1199     }
1200 
1201     fn purge_latest_branches(&mut self) {
1202         // All of our branch simplification rules work only if a branch ends at
1203         // the tail of the buffer, with no following code; and branches are in
1204         // order in latest_branches; so if the last entry ends prior to
1205         // cur_offset, then clear all entries.
1206         let cur_off = self.cur_offset();
1207         if let Some(l) = self.latest_branches.last() {
1208             if l.end < cur_off {
1209                 trace!("purge_latest_branches: removing branch {:?}", l);
1210                 self.latest_branches.clear();
1211             }
1212         }
1213 
1214         // Post-invariant: no invariant requires any branch to appear in
1215         // `latest_branches`; it is always optional. The list-clear above thus
1216         // preserves all semantics.
1217     }
1218 
1219     /// Emit a trap at some point in the future with the specified code and
1220     /// stack map.
1221     ///
1222     /// This function returns a [`MachLabel`] which will be the future address
1223     /// of the trap. Jumps should refer to this label, likely by using the
1224     /// [`MachBuffer::use_label_at_offset`] method, to get a relocation
1225     /// patched in once the address of the trap is known.
1226     ///
1227     /// This will batch all traps into the end of the function.
1228     pub fn defer_trap(&mut self, code: TrapCode) -> MachLabel {
1229         let label = self.get_label();
1230         self.pending_traps.push(MachLabelTrap {
1231             label,
1232             code,
1233             loc: self.cur_srcloc.map(|(_start, loc)| loc),
1234         });
1235         label
1236     }
1237 
1238     /// Is an island needed within the next N bytes?
1239     pub fn island_needed(&self, distance: CodeOffset) -> bool {
1240         let deadline = match self.fixup_records.peek() {
1241             Some(fixup) => fixup.deadline().min(self.pending_fixup_deadline),
1242             None => self.pending_fixup_deadline,
1243         };
1244         deadline < u32::MAX && self.worst_case_end_of_island(distance) > deadline
1245     }
1246 
1247     /// Returns the maximal offset that islands can reach if `distance` more
1248     /// bytes are appended.
1249     ///
1250     /// This is used to determine if veneers need insertions since jumps that
1251     /// can't reach past this point must get a veneer of some form.
1252     fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
1253         // Assume that all fixups will require veneers and that the veneers are
1254         // the worst-case size for each platform. This is an over-generalization
1255         // to avoid iterating over the `fixup_records` list or maintaining
1256         // information about it as we go along.
1257         let island_worst_case_size = ((self.fixup_records.len() + self.pending_fixup_records.len())
1258             as u32)
1259             * (I::LabelUse::worst_case_veneer_size())
1260             + self.pending_constants_size
1261             + (self.pending_traps.len() * I::TRAP_OPCODE.len()) as u32;
1262         self.cur_offset()
1263             .saturating_add(distance)
1264             .saturating_add(island_worst_case_size)
1265     }
1266 
1267     /// Emit all pending constants and required pending veneers.
1268     ///
1269     /// Should only be called if `island_needed()` returns true, i.e., if we
1270     /// actually reach a deadline. It's not necessarily a problem to do so
1271     /// otherwise but it may result in unnecessary work during emission.
1272     pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) {
1273         self.emit_island_maybe_forced(ForceVeneers::No, distance, ctrl_plane);
1274     }
1275 
1276     /// Same as `emit_island`, but an internal API with a `force_veneers`
1277     /// argument to force all veneers to always get emitted for debugging.
1278     fn emit_island_maybe_forced(
1279         &mut self,
1280         force_veneers: ForceVeneers,
1281         distance: CodeOffset,
1282         ctrl_plane: &mut ControlPlane,
1283     ) {
1284         // We're going to purge fixups, so no latest-branch editing can happen
1285         // anymore.
1286         self.latest_branches.clear();
1287 
1288         // End the current location tracking since anything emitted during this
1289         // function shouldn't be attributed to whatever the current source
1290         // location is.
1291         //
1292         // Note that the current source location, if it's set right now, will be
1293         // restored at the end of this island emission.
1294         let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);
1295         if cur_loc.is_some() {
1296             self.end_srcloc();
1297         }
1298 
1299         let forced_threshold = self.worst_case_end_of_island(distance);
1300 
1301         // First flush out all traps/constants so we have more labels in case
1302         // fixups are applied against these labels.
1303         //
1304         // Note that traps are placed first since this typically happens at the
1305         // end of the function and for disassemblers we try to keep all the code
1306         // contiguously together.
1307         for MachLabelTrap { label, code, loc } in mem::take(&mut self.pending_traps) {
1308             // If this trap has source information associated with it then
1309             // emit this information for the trap instruction going out now too.
1310             if let Some(loc) = loc {
1311                 self.start_srcloc(loc);
1312             }
1313             self.align_to(I::LabelUse::ALIGN);
1314             self.bind_label(label, ctrl_plane);
1315             self.add_trap(code);
1316             self.put_data(I::TRAP_OPCODE);
1317             if loc.is_some() {
1318                 self.end_srcloc();
1319             }
1320         }
1321 
1322         for constant in mem::take(&mut self.pending_constants) {
1323             let MachBufferConstant { align, size, .. } = self.constants[constant];
1324             let label = self.constants[constant].upcoming_label.take().unwrap();
1325             self.align_to(align);
1326             self.bind_label(label, ctrl_plane);
1327             self.used_constants.push((constant, self.cur_offset()));
1328             self.get_appended_space(size);
1329         }
1330 
1331         // Either handle all pending fixups because they're ready or move them
1332         // onto the `BinaryHeap` tracking all pending fixups if they aren't
1333         // ready.
1334         assert!(self.latest_branches.is_empty());
1335         for fixup in mem::take(&mut self.pending_fixup_records) {
1336             if self.should_apply_fixup(&fixup, forced_threshold) {
1337                 self.handle_fixup(fixup, force_veneers, forced_threshold);
1338             } else {
1339                 self.fixup_records.push(fixup);
1340             }
1341         }
1342         self.pending_fixup_deadline = u32::MAX;
1343         while let Some(fixup) = self.fixup_records.peek() {
1344             trace!("emit_island: fixup {:?}", fixup);
1345 
1346             // If this fixup shouldn't be applied, that means its label isn't
1347             // defined yet and there'll be remaining space to apply a veneer if
1348             // necessary in the future after this island. In that situation
1349             // because `fixup_records` is sorted by deadline this loop can
1350             // exit.
1351             if !self.should_apply_fixup(fixup, forced_threshold) {
1352                 break;
1353             }
1354 
1355             let fixup = self.fixup_records.pop().unwrap();
1356             self.handle_fixup(fixup, force_veneers, forced_threshold);
1357         }
1358 
1359         if let Some(loc) = cur_loc {
1360             self.start_srcloc(loc);
1361         }
1362     }
1363 
1364     fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool {
1365         let label_offset = self.resolve_label_offset(fixup.label);
1366         label_offset != UNKNOWN_LABEL_OFFSET || fixup.deadline() < forced_threshold
1367     }
1368 
1369     fn handle_fixup(
1370         &mut self,
1371         fixup: MachLabelFixup<I>,
1372         force_veneers: ForceVeneers,
1373         forced_threshold: CodeOffset,
1374     ) {
1375         let MachLabelFixup {
1376             label,
1377             offset,
1378             kind,
1379         } = fixup;
1380         let start = offset as usize;
1381         let end = (offset + kind.patch_size()) as usize;
1382         let label_offset = self.resolve_label_offset(label);
1383 
1384         if label_offset != UNKNOWN_LABEL_OFFSET {
1385             // If the offset of the label for this fixup is known then
1386             // we're going to do something here-and-now. We're either going
1387             // to patch the original offset because it's an in-bounds jump,
1388             // or we're going to generate a veneer, patch the fixup to jump
1389             // to the veneer, and then keep going.
1390             //
1391             // If the label comes after the original fixup, then we should
1392             // be guaranteed that the jump is in-bounds. Otherwise there's
1393             // a bug somewhere because this method wasn't called soon
1394             // enough. All forward-jumps are tracked and should get veneers
1395             // before their deadline comes and they're unable to jump
1396             // further.
1397             //
1398             // Otherwise if the label is before the fixup, then that's a
1399             // backwards jump. If it's past the maximum negative range
1400             // then we'll emit a veneer that to jump forward to which can
1401             // then jump backwards.
1402             let veneer_required = if label_offset >= offset {
1403                 assert!((label_offset - offset) <= kind.max_pos_range());
1404                 false
1405             } else {
1406                 (offset - label_offset) > kind.max_neg_range()
1407             };
1408             trace!(
1409                 " -> label_offset = {}, known, required = {} (pos {} neg {})",
1410                 label_offset,
1411                 veneer_required,
1412                 kind.max_pos_range(),
1413                 kind.max_neg_range()
1414             );
1415 
1416             if (force_veneers == ForceVeneers::Yes && kind.supports_veneer()) || veneer_required {
1417                 self.emit_veneer(label, offset, kind);
1418             } else {
1419                 let slice = &mut self.data[start..end];
1420                 trace!("patching in-range! slice = {slice:?}; offset = {offset:#x}; label_offset = {label_offset:#x}");
1421                 kind.patch(slice, offset, label_offset);
1422             }
1423         } else {
1424             // If the offset of this label is not known at this time then
1425             // that means that a veneer is required because after this
1426             // island the target can't be in range of the original target.
1427             assert!(forced_threshold - offset > kind.max_pos_range());
1428             self.emit_veneer(label, offset, kind);
1429         }
1430     }
1431 
1432     /// Emits a "veneer" the `kind` code at `offset` to jump to `label`.
1433     ///
1434     /// This will generate extra machine code, using `kind`, to get a
1435     /// larger-jump-kind than `kind` allows. The code at `offset` is then
1436     /// patched to jump to our new code, and then the new code is enqueued for
1437     /// a fixup to get processed at some later time.
1438     fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
1439         // If this `kind` doesn't support a veneer then that's a bug in the
1440         // backend because we need to implement support for such a veneer.
1441         assert!(
1442             kind.supports_veneer(),
1443             "jump beyond the range of {kind:?} but a veneer isn't supported",
1444         );
1445 
1446         // Allocate space for a veneer in the island.
1447         self.align_to(I::LabelUse::ALIGN);
1448         let veneer_offset = self.cur_offset();
1449         trace!("making a veneer at {}", veneer_offset);
1450         let start = offset as usize;
1451         let end = (offset + kind.patch_size()) as usize;
1452         let slice = &mut self.data[start..end];
1453         // Patch the original label use to refer to the veneer.
1454         trace!(
1455             "patching original at offset {} to veneer offset {}",
1456             offset,
1457             veneer_offset
1458         );
1459         kind.patch(slice, offset, veneer_offset);
1460         // Generate the veneer.
1461         let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
1462         let (veneer_fixup_off, veneer_label_use) =
1463             kind.generate_veneer(veneer_slice, veneer_offset);
1464         trace!(
1465             "generated veneer; fixup offset {}, label_use {:?}",
1466             veneer_fixup_off,
1467             veneer_label_use
1468         );
1469         // Register a new use of `label` with our new veneer fixup and
1470         // offset. This'll recalculate deadlines accordingly and
1471         // enqueue this fixup to get processed at some later
1472         // time.
1473         self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
1474     }
1475 
1476     fn finish_emission_maybe_forcing_veneers(
1477         &mut self,
1478         force_veneers: ForceVeneers,
1479         ctrl_plane: &mut ControlPlane,
1480     ) {
1481         while !self.pending_constants.is_empty()
1482             || !self.pending_traps.is_empty()
1483             || !self.fixup_records.is_empty()
1484             || !self.pending_fixup_records.is_empty()
1485         {
1486             // `emit_island()` will emit any pending veneers and constants, and
1487             // as a side-effect, will also take care of any fixups with resolved
1488             // labels eagerly.
1489             self.emit_island_maybe_forced(force_veneers, u32::MAX, ctrl_plane);
1490         }
1491 
1492         // Ensure that all labels have been fixed up after the last island is emitted. This is a
1493         // full (release-mode) assert because an unresolved label means the emitted code is
1494         // incorrect.
1495         assert!(self.fixup_records.is_empty());
1496         assert!(self.pending_fixup_records.is_empty());
1497     }
1498 
1499     /// Finish any deferred emissions and/or fixups.
1500     pub fn finish(
1501         mut self,
1502         constants: &VCodeConstants,
1503         ctrl_plane: &mut ControlPlane,
1504     ) -> MachBufferFinalized<Stencil> {
1505         let _tt = timing::vcode_emit_finish();
1506 
1507         self.finish_emission_maybe_forcing_veneers(ForceVeneers::No, ctrl_plane);
1508 
1509         let alignment = self.finish_constants(constants);
1510 
1511         // Resolve all labels to their offsets.
1512         let finalized_relocs = self
1513             .relocs
1514             .iter()
1515             .map(|reloc| FinalizedMachReloc {
1516                 offset: reloc.offset,
1517                 kind: reloc.kind,
1518                 addend: reloc.addend,
1519                 target: match &reloc.target {
1520                     RelocTarget::ExternalName(name) => {
1521                         FinalizedRelocTarget::ExternalName(name.clone())
1522                     }
1523                     RelocTarget::Label(label) => {
1524                         FinalizedRelocTarget::Func(self.resolve_label_offset(*label))
1525                     }
1526                 },
1527             })
1528             .collect();
1529 
1530         let finalized_exception_handlers = self
1531             .exception_handlers
1532             .iter()
1533             .map(|(tag, label)| (*tag, self.resolve_label_offset(*label)))
1534             .collect();
1535 
1536         let mut srclocs = self.srclocs;
1537         srclocs.sort_by_key(|entry| entry.start);
1538 
1539         MachBufferFinalized {
1540             data: self.data,
1541             relocs: finalized_relocs,
1542             traps: self.traps,
1543             call_sites: self.call_sites,
1544             exception_handlers: finalized_exception_handlers,
1545             srclocs,
1546             user_stack_maps: self.user_stack_maps,
1547             unwind_info: self.unwind_info,
1548             alignment,
1549         }
1550     }
1551 
1552     /// Add an external relocation at the given offset.
1553     pub fn add_reloc_at_offset<T: Into<RelocTarget> + Clone>(
1554         &mut self,
1555         offset: CodeOffset,
1556         kind: Reloc,
1557         target: &T,
1558         addend: Addend,
1559     ) {
1560         let target: RelocTarget = target.clone().into();
1561         // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally
1562         // generate a label-use statement to track whether an island is possibly
1563         // needed to escape this function to actually get to the external name.
1564         // This is most likely to come up on AArch64 where calls between
1565         // functions use a 26-bit signed offset which gives +/- 64MB. This means
1566         // that if a function is 128MB in size and there's a call in the middle
1567         // it's impossible to reach the actual target. Also, while it's
1568         // technically possible to jump to the start of a function and then jump
1569         // further, island insertion below always inserts islands after
1570         // previously appended code so for Cranelift's own implementation this
1571         // is also a problem for 64MB functions on AArch64 which start with a
1572         // call instruction, those won't be able to escape.
1573         //
1574         // Ideally what needs to happen here is that a `LabelUse` is
1575         // transparently generated (or call-sites of this function are audited
1576         // to generate a `LabelUse` instead) and tracked internally. The actual
1577         // relocation would then change over time if and when a veneer is
1578         // inserted, where the relocation here would be patched by this
1579         // `MachBuffer` to jump to the veneer. The problem, though, is that all
1580         // this still needs to end up, in the case of a singular function,
1581         // generating a final relocation pointing either to this particular
1582         // relocation or to the veneer inserted. Additionally
1583         // `MachBuffer` needs the concept of a label which will never be
1584         // resolved, so `emit_island` doesn't trip over not actually ever
1585         // knowning what some labels are. Currently the loop in
1586         // `finish_emission_maybe_forcing_veneers` would otherwise infinitely
1587         // loop.
1588         //
1589         // For now this means that because relocs aren't tracked at all that
1590         // AArch64 functions have a rough size limits of 64MB. For now that's
1591         // somewhat reasonable and the failure mode is a panic in `MachBuffer`
1592         // when a relocation can't otherwise be resolved later, so it shouldn't
1593         // actually result in any memory unsafety or anything like that.
1594         self.relocs.push(MachReloc {
1595             offset,
1596             kind,
1597             target,
1598             addend,
1599         });
1600     }
1601 
1602     /// Add an external relocation at the current offset.
1603     pub fn add_reloc<T: Into<RelocTarget> + Clone>(
1604         &mut self,
1605         kind: Reloc,
1606         target: &T,
1607         addend: Addend,
1608     ) {
1609         self.add_reloc_at_offset(self.data.len() as CodeOffset, kind, target, addend);
1610     }
1611 
1612     /// Add a trap record at the current offset.
1613     pub fn add_trap(&mut self, code: TrapCode) {
1614         self.traps.push(MachTrap {
1615             offset: self.data.len() as CodeOffset,
1616             code,
1617         });
1618     }
1619 
1620     /// Add a call-site record at the current offset, optionally with exception handlers.
1621     pub fn add_call_site(
1622         &mut self,
1623         exception_handlers: &[(PackedOption<ExceptionTag>, MachLabel)],
1624     ) {
1625         let start = u32::try_from(self.exception_handlers.len()).unwrap();
1626         self.exception_handlers
1627             .extend(exception_handlers.into_iter().copied());
1628         let end = u32::try_from(self.exception_handlers.len()).unwrap();
1629         self.call_sites.push(MachCallSite {
1630             ret_addr: self.data.len() as CodeOffset,
1631             exception_handler_range: start..end,
1632         });
1633     }
1634 
1635     /// Add an unwind record at the current offset.
1636     pub fn add_unwind(&mut self, unwind: UnwindInst) {
1637         self.unwind_info.push((self.cur_offset(), unwind));
1638     }
1639 
1640     /// Set the `SourceLoc` for code from this offset until the offset at the
1641     /// next call to `end_srcloc()`.
1642     /// Returns the current [CodeOffset] and [RelSourceLoc].
1643     pub fn start_srcloc(&mut self, loc: RelSourceLoc) -> (CodeOffset, RelSourceLoc) {
1644         let cur = (self.cur_offset(), loc);
1645         self.cur_srcloc = Some(cur);
1646         cur
1647     }
1648 
1649     /// Mark the end of the `SourceLoc` segment started at the last
1650     /// `start_srcloc()` call.
1651     pub fn end_srcloc(&mut self) {
1652         let (start, loc) = self
1653             .cur_srcloc
1654             .take()
1655             .expect("end_srcloc() called without start_srcloc()");
1656         let end = self.cur_offset();
1657         // Skip zero-length extends.
1658         debug_assert!(end >= start);
1659         if end > start {
1660             self.srclocs.push(MachSrcLoc { start, end, loc });
1661         }
1662     }
1663 
1664     /// Push a user stack map onto this buffer.
1665     ///
1666     /// The stack map is associated with the given `return_addr` code
1667     /// offset. This must be the PC for the instruction just *after* this stack
1668     /// map's associated instruction. For example in the sequence `call $foo;
1669     /// add r8, rax`, the `return_addr` must be the offset of the start of the
1670     /// `add` instruction.
1671     ///
1672     /// Stack maps must be pushed in sorted `return_addr` order.
1673     pub fn push_user_stack_map(
1674         &mut self,
1675         emit_state: &I::State,
1676         return_addr: CodeOffset,
1677         mut stack_map: ir::UserStackMap,
1678     ) {
1679         let span = emit_state.frame_layout().active_size();
1680         trace!("Adding user stack map @ {return_addr:#x} spanning {span} bytes: {stack_map:?}");
1681 
1682         debug_assert!(
1683             self.user_stack_maps
1684                 .last()
1685                 .map_or(true, |(prev_addr, _, _)| *prev_addr < return_addr),
1686             "pushed stack maps out of order: {} is not less than {}",
1687             self.user_stack_maps.last().unwrap().0,
1688             return_addr,
1689         );
1690 
1691         stack_map.finalize(emit_state.frame_layout().sp_to_sized_stack_slots());
1692         self.user_stack_maps.push((return_addr, span, stack_map));
1693     }
1694 
1695     /// Increase the alignment of the buffer to the given alignment if bigger
1696     /// than the current alignment.
1697     pub fn set_log2_min_function_alignment(&mut self, align_to: u8) {
1698         self.min_alignment = self.min_alignment.max(
1699             1u32.checked_shl(u32::from(align_to))
1700                 .expect("log2_min_function_alignment too large"),
1701         );
1702     }
1703 }
1704 
1705 impl<I: VCodeInst> Extend<u8> for MachBuffer<I> {
1706     fn extend<T: IntoIterator<Item = u8>>(&mut self, iter: T) {
1707         for b in iter {
1708             self.put1(b);
1709         }
1710     }
1711 }
1712 
1713 impl<T: CompilePhase> MachBufferFinalized<T> {
1714     /// Get a list of source location mapping tuples in sorted-by-start-offset order.
1715     pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {
1716         &self.srclocs[..]
1717     }
1718 
1719     /// Get the total required size for the code.
1720     pub fn total_size(&self) -> CodeOffset {
1721         self.data.len() as CodeOffset
1722     }
1723 
1724     /// Return the code in this mach buffer as a hex string for testing purposes.
1725     pub fn stringify_code_bytes(&self) -> String {
1726         // This is pretty lame, but whatever ..
1727         use std::fmt::Write;
1728         let mut s = String::with_capacity(self.data.len() * 2);
1729         for b in &self.data {
1730             write!(&mut s, "{b:02X}").unwrap();
1731         }
1732         s
1733     }
1734 
1735     /// Get the code bytes.
1736     pub fn data(&self) -> &[u8] {
1737         // N.B.: we emit every section into the .text section as far as
1738         // the `CodeSink` is concerned; we do not bother to segregate
1739         // the contents into the actual program text, the jumptable and the
1740         // rodata (constant pool). This allows us to generate code assuming
1741         // that these will not be relocated relative to each other, and avoids
1742         // having to designate each section as belonging in one of the three
1743         // fixed categories defined by `CodeSink`. If this becomes a problem
1744         // later (e.g. because of memory permissions or similar), we can
1745         // add this designation and segregate the output; take care, however,
1746         // to add the appropriate relocations in this case.
1747 
1748         &self.data[..]
1749     }
1750 
1751     /// Get the list of external relocations for this code.
1752     pub fn relocs(&self) -> &[FinalizedMachReloc] {
1753         &self.relocs[..]
1754     }
1755 
1756     /// Get the list of trap records for this code.
1757     pub fn traps(&self) -> &[MachTrap] {
1758         &self.traps[..]
1759     }
1760 
1761     /// Get the user stack map metadata for this code.
1762     pub fn user_stack_maps(&self) -> &[(CodeOffset, u32, ir::UserStackMap)] {
1763         &self.user_stack_maps
1764     }
1765 
1766     /// Take this buffer's user strack map metadata.
1767     pub fn take_user_stack_maps(&mut self) -> SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]> {
1768         mem::take(&mut self.user_stack_maps)
1769     }
1770 
1771     /// Get the list of call sites for this code, along with
1772     /// associated exception handlers.
1773     ///
1774     /// Each item yielded by the returned iterator is a struct with:
1775     ///
1776     /// - The call site metadata record, with a `ret_addr` field
1777     ///   directly accessible and denoting the offset of the return
1778     ///   address into this buffer's code.
1779     /// - The slice of pairs of exception tags and code offsets
1780     ///   denoting exception-handler entry points associated with this
1781     ///   call site.
1782     pub fn call_sites(&self) -> impl Iterator<Item = FinalizedMachCallSite<'_>> + '_ {
1783         self.call_sites.iter().map(|call_site| {
1784             let range = call_site.exception_handler_range.clone();
1785             let range = usize::try_from(range.start).unwrap()..usize::try_from(range.end).unwrap();
1786             FinalizedMachCallSite {
1787                 ret_addr: call_site.ret_addr,
1788                 exception_handlers: &self.exception_handlers[range],
1789             }
1790         })
1791     }
1792 }
1793 
1794 /// Metadata about a constant.
1795 struct MachBufferConstant {
1796     /// A label which has not yet been bound which can be used for this
1797     /// constant.
1798     ///
1799     /// This is lazily created when a label is requested for a constant and is
1800     /// cleared when a constant is emitted.
1801     upcoming_label: Option<MachLabel>,
1802     /// Required alignment.
1803     align: CodeOffset,
1804     /// The byte size of this constant.
1805     size: usize,
1806 }
1807 
1808 /// A trap that is deferred to the next time an island is emitted for either
1809 /// traps, constants, or fixups.
1810 struct MachLabelTrap {
1811     /// This label will refer to the trap's offset.
1812     label: MachLabel,
1813     /// The code associated with this trap.
1814     code: TrapCode,
1815     /// An optional source location to assign for this trap.
1816     loc: Option<RelSourceLoc>,
1817 }
1818 
1819 /// A fixup to perform on the buffer once code is emitted. Fixups always refer
1820 /// to labels and patch the code based on label offsets. Hence, they are like
1821 /// relocations, but internal to one buffer.
1822 #[derive(Debug)]
1823 struct MachLabelFixup<I: VCodeInst> {
1824     /// The label whose offset controls this fixup.
1825     label: MachLabel,
1826     /// The offset to fix up / patch to refer to this label.
1827     offset: CodeOffset,
1828     /// The kind of fixup. This is architecture-specific; each architecture may have,
1829     /// e.g., several types of branch instructions, each with differently-sized
1830     /// offset fields and different places within the instruction to place the
1831     /// bits.
1832     kind: I::LabelUse,
1833 }
1834 
1835 impl<I: VCodeInst> MachLabelFixup<I> {
1836     fn deadline(&self) -> CodeOffset {
1837         self.offset.saturating_add(self.kind.max_pos_range())
1838     }
1839 }
1840 
1841 impl<I: VCodeInst> PartialEq for MachLabelFixup<I> {
1842     fn eq(&self, other: &Self) -> bool {
1843         self.deadline() == other.deadline()
1844     }
1845 }
1846 
1847 impl<I: VCodeInst> Eq for MachLabelFixup<I> {}
1848 
1849 impl<I: VCodeInst> PartialOrd for MachLabelFixup<I> {
1850     fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
1851         Some(self.cmp(other))
1852     }
1853 }
1854 
1855 impl<I: VCodeInst> Ord for MachLabelFixup<I> {
1856     fn cmp(&self, other: &Self) -> Ordering {
1857         other.deadline().cmp(&self.deadline())
1858     }
1859 }
1860 
1861 /// A relocation resulting from a compilation.
1862 #[derive(Clone, Debug, PartialEq)]
1863 #[cfg_attr(
1864     feature = "enable-serde",
1865     derive(serde_derive::Serialize, serde_derive::Deserialize)
1866 )]
1867 pub struct MachRelocBase<T> {
1868     /// The offset at which the relocation applies, *relative to the
1869     /// containing section*.
1870     pub offset: CodeOffset,
1871     /// The kind of relocation.
1872     pub kind: Reloc,
1873     /// The external symbol / name to which this relocation refers.
1874     pub target: T,
1875     /// The addend to add to the symbol value.
1876     pub addend: i64,
1877 }
1878 
1879 type MachReloc = MachRelocBase<RelocTarget>;
1880 
1881 /// A relocation resulting from a compilation.
1882 pub type FinalizedMachReloc = MachRelocBase<FinalizedRelocTarget>;
1883 
1884 /// A Relocation target
1885 #[derive(Debug, Clone, PartialEq, Eq, Hash)]
1886 pub enum RelocTarget {
1887     /// Points to an [ExternalName] outside the current function.
1888     ExternalName(ExternalName),
1889     /// Points to a [MachLabel] inside this function.
1890     /// This is different from [MachLabelFixup] in that both the relocation and the
1891     /// label will be emitted and are only resolved at link time.
1892     ///
1893     /// There is no reason to prefer this over [MachLabelFixup] unless the ABI requires it.
1894     Label(MachLabel),
1895 }
1896 
1897 impl From<ExternalName> for RelocTarget {
1898     fn from(name: ExternalName) -> Self {
1899         Self::ExternalName(name)
1900     }
1901 }
1902 
1903 impl From<MachLabel> for RelocTarget {
1904     fn from(label: MachLabel) -> Self {
1905         Self::Label(label)
1906     }
1907 }
1908 
1909 /// A Relocation target
1910 #[derive(Debug, Clone, PartialEq, Eq, Hash)]
1911 #[cfg_attr(
1912     feature = "enable-serde",
1913     derive(serde_derive::Serialize, serde_derive::Deserialize)
1914 )]
1915 pub enum FinalizedRelocTarget {
1916     /// Points to an [ExternalName] outside the current function.
1917     ExternalName(ExternalName),
1918     /// Points to a [CodeOffset] from the start of the current function.
1919     Func(CodeOffset),
1920 }
1921 
1922 impl FinalizedRelocTarget {
1923     /// Returns a display for the current [FinalizedRelocTarget], with extra context to prettify the
1924     /// output.
1925     pub fn display<'a>(&'a self, params: Option<&'a FunctionParameters>) -> String {
1926         match self {
1927             FinalizedRelocTarget::ExternalName(name) => format!("{}", name.display(params)),
1928             FinalizedRelocTarget::Func(offset) => format!("func+{offset}"),
1929         }
1930     }
1931 }
1932 
1933 /// A trap record resulting from a compilation.
1934 #[derive(Clone, Debug, PartialEq)]
1935 #[cfg_attr(
1936     feature = "enable-serde",
1937     derive(serde_derive::Serialize, serde_derive::Deserialize)
1938 )]
1939 pub struct MachTrap {
1940     /// The offset at which the trap instruction occurs, *relative to the
1941     /// containing section*.
1942     pub offset: CodeOffset,
1943     /// The trap code.
1944     pub code: TrapCode,
1945 }
1946 
1947 /// A call site record resulting from a compilation.
1948 #[derive(Clone, Debug, PartialEq)]
1949 #[cfg_attr(
1950     feature = "enable-serde",
1951     derive(serde_derive::Serialize, serde_derive::Deserialize)
1952 )]
1953 pub struct MachCallSite {
1954     /// The offset of the call's return address, *relative to the
1955     /// start of the buffer*.
1956     pub ret_addr: CodeOffset,
1957 
1958     /// Range in `exception_handlers` corresponding to the exception
1959     /// handlers for this callsite.
1960     exception_handler_range: Range<u32>,
1961 }
1962 
1963 /// A call site record resulting from a compilation.
1964 #[derive(Clone, Debug, PartialEq)]
1965 pub struct FinalizedMachCallSite<'a> {
1966     /// The offset of the call's return address, *relative to the
1967     /// start of the buffer*.
1968     pub ret_addr: CodeOffset,
1969 
1970     /// Exception handlers at this callsite, with target offsets
1971     /// *relative to the start of the buffer*.
1972     pub exception_handlers: &'a [(PackedOption<ir::ExceptionTag>, CodeOffset)],
1973 }
1974 
1975 /// A source-location mapping resulting from a compilation.
1976 #[derive(PartialEq, Debug, Clone)]
1977 #[cfg_attr(
1978     feature = "enable-serde",
1979     derive(serde_derive::Serialize, serde_derive::Deserialize)
1980 )]
1981 pub struct MachSrcLoc<T: CompilePhase> {
1982     /// The start of the region of code corresponding to a source location.
1983     /// This is relative to the start of the function, not to the start of the
1984     /// section.
1985     pub start: CodeOffset,
1986     /// The end of the region of code corresponding to a source location.
1987     /// This is relative to the start of the section, not to the start of the
1988     /// section.
1989     pub end: CodeOffset,
1990     /// The source location.
1991     pub loc: T::SourceLocType,
1992 }
1993 
1994 impl MachSrcLoc<Stencil> {
1995     fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {
1996         MachSrcLoc {
1997             start: self.start,
1998             end: self.end,
1999             loc: self.loc.expand(base_srcloc),
2000         }
2001     }
2002 }
2003 
2004 /// Record of branch instruction in the buffer, to facilitate editing.
2005 #[derive(Clone, Debug)]
2006 struct MachBranch {
2007     start: CodeOffset,
2008     end: CodeOffset,
2009     target: MachLabel,
2010     fixup: usize,
2011     inverted: Option<SmallVec<[u8; 8]>>,
2012     /// All labels pointing to the start of this branch. For correctness, this
2013     /// *must* be complete (i.e., must contain all labels whose resolved offsets
2014     /// are at the start of this branch): we rely on being able to redirect all
2015     /// labels that could jump to this branch before removing it, if it is
2016     /// otherwise unreachable.
2017     labels_at_this_branch: SmallVec<[MachLabel; 4]>,
2018 }
2019 
2020 impl MachBranch {
2021     fn is_cond(&self) -> bool {
2022         self.inverted.is_some()
2023     }
2024     fn is_uncond(&self) -> bool {
2025         self.inverted.is_none()
2026     }
2027 }
2028 
2029 /// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
2030 ///
2031 /// Note that `MachBuffer` was primarily written for intra-function references
2032 /// of jumps between basic blocks, but it's also quite usable for entire text
2033 /// sections and resolving references between functions themselves. This
2034 /// builder interprets "blocks" as labeled functions for the purposes of
2035 /// resolving labels internally in the buffer.
2036 pub struct MachTextSectionBuilder<I: VCodeInst> {
2037     buf: MachBuffer<I>,
2038     next_func: usize,
2039     force_veneers: ForceVeneers,
2040 }
2041 
2042 impl<I: VCodeInst> MachTextSectionBuilder<I> {
2043     /// Creates a new text section builder which will have `num_funcs` functions
2044     /// pushed into it.
2045     pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {
2046         let mut buf = MachBuffer::new();
2047         buf.reserve_labels_for_blocks(num_funcs);
2048         MachTextSectionBuilder {
2049             buf,
2050             next_func: 0,
2051             force_veneers: ForceVeneers::No,
2052         }
2053     }
2054 }
2055 
2056 impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
2057     fn append(
2058         &mut self,
2059         labeled: bool,
2060         func: &[u8],
2061         align: u32,
2062         ctrl_plane: &mut ControlPlane,
2063     ) -> u64 {
2064         // Conditionally emit an island if it's necessary to resolve jumps
2065         // between functions which are too far away.
2066         let size = func.len() as u32;
2067         if self.force_veneers == ForceVeneers::Yes || self.buf.island_needed(size) {
2068             self.buf
2069                 .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane);
2070         }
2071 
2072         self.buf.align_to(align);
2073         let pos = self.buf.cur_offset();
2074         if labeled {
2075             self.buf.bind_label(
2076                 MachLabel::from_block(BlockIndex::new(self.next_func)),
2077                 ctrl_plane,
2078             );
2079             self.next_func += 1;
2080         }
2081         self.buf.put_data(func);
2082         u64::from(pos)
2083     }
2084 
2085     fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {
2086         crate::trace!(
2087             "Resolving relocation @ {offset:#x} + {addend:#x} to target {target} of kind {reloc:?}"
2088         );
2089         let label = MachLabel::from_block(BlockIndex::new(target));
2090         let offset = u32::try_from(offset).unwrap();
2091         match I::LabelUse::from_reloc(reloc, addend) {
2092             Some(label_use) => {
2093                 self.buf.use_label_at_offset(offset, label, label_use);
2094                 true
2095             }
2096             None => false,
2097         }
2098     }
2099 
2100     fn force_veneers(&mut self) {
2101         self.force_veneers = ForceVeneers::Yes;
2102     }
2103 
2104     fn write(&mut self, offset: u64, data: &[u8]) {
2105         self.buf.data[offset.try_into().unwrap()..][..data.len()].copy_from_slice(data);
2106     }
2107 
2108     fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> {
2109         // Double-check all functions were pushed.
2110         assert_eq!(self.next_func, self.buf.label_offsets.len());
2111 
2112         // Finish up any veneers, if necessary.
2113         self.buf
2114             .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane);
2115 
2116         // We don't need the data any more, so return it to the caller.
2117         mem::take(&mut self.buf.data).into_vec()
2118     }
2119 }
2120 
2121 // We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
2122 #[cfg(all(test, feature = "arm64"))]
2123 mod test {
2124     use cranelift_entity::EntityRef as _;
2125 
2126     use super::*;
2127     use crate::ir::UserExternalNameRef;
2128     use crate::isa::aarch64::inst::{xreg, OperandSize};
2129     use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
2130     use crate::machinst::{MachInstEmit, MachInstEmitState};
2131     use crate::settings;
2132 
2133     fn label(n: u32) -> MachLabel {
2134         MachLabel::from_block(BlockIndex::new(n as usize))
2135     }
2136     fn target(n: u32) -> BranchTarget {
2137         BranchTarget::Label(label(n))
2138     }
2139 
2140     #[test]
2141     fn test_elide_jump_to_next() {
2142         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2143         let mut buf = MachBuffer::new();
2144         let mut state = <Inst as MachInstEmit>::State::default();
2145         let constants = Default::default();
2146 
2147         buf.reserve_labels_for_blocks(2);
2148         buf.bind_label(label(0), state.ctrl_plane_mut());
2149         let inst = Inst::Jump { dest: target(1) };
2150         inst.emit(&mut buf, &info, &mut state);
2151         buf.bind_label(label(1), state.ctrl_plane_mut());
2152         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2153         assert_eq!(0, buf.total_size());
2154     }
2155 
2156     #[test]
2157     fn test_elide_trivial_jump_blocks() {
2158         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2159         let mut buf = MachBuffer::new();
2160         let mut state = <Inst as MachInstEmit>::State::default();
2161         let constants = Default::default();
2162 
2163         buf.reserve_labels_for_blocks(4);
2164 
2165         buf.bind_label(label(0), state.ctrl_plane_mut());
2166         let inst = Inst::CondBr {
2167             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2168             taken: target(1),
2169             not_taken: target(2),
2170         };
2171         inst.emit(&mut buf, &info, &mut state);
2172 
2173         buf.bind_label(label(1), state.ctrl_plane_mut());
2174         let inst = Inst::Jump { dest: target(3) };
2175         inst.emit(&mut buf, &info, &mut state);
2176 
2177         buf.bind_label(label(2), state.ctrl_plane_mut());
2178         let inst = Inst::Jump { dest: target(3) };
2179         inst.emit(&mut buf, &info, &mut state);
2180 
2181         buf.bind_label(label(3), state.ctrl_plane_mut());
2182 
2183         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2184         assert_eq!(0, buf.total_size());
2185     }
2186 
2187     #[test]
2188     fn test_flip_cond() {
2189         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2190         let mut buf = MachBuffer::new();
2191         let mut state = <Inst as MachInstEmit>::State::default();
2192         let constants = Default::default();
2193 
2194         buf.reserve_labels_for_blocks(4);
2195 
2196         buf.bind_label(label(0), state.ctrl_plane_mut());
2197         let inst = Inst::CondBr {
2198             kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2199             taken: target(1),
2200             not_taken: target(2),
2201         };
2202         inst.emit(&mut buf, &info, &mut state);
2203 
2204         buf.bind_label(label(1), state.ctrl_plane_mut());
2205         let inst = Inst::Nop4;
2206         inst.emit(&mut buf, &info, &mut state);
2207 
2208         buf.bind_label(label(2), state.ctrl_plane_mut());
2209         let inst = Inst::Udf {
2210             trap_code: TrapCode::STACK_OVERFLOW,
2211         };
2212         inst.emit(&mut buf, &info, &mut state);
2213 
2214         buf.bind_label(label(3), state.ctrl_plane_mut());
2215 
2216         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2217 
2218         let mut buf2 = MachBuffer::new();
2219         let mut state = Default::default();
2220         let inst = Inst::TrapIf {
2221             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2222             trap_code: TrapCode::STACK_OVERFLOW,
2223         };
2224         inst.emit(&mut buf2, &info, &mut state);
2225         let inst = Inst::Nop4;
2226         inst.emit(&mut buf2, &info, &mut state);
2227 
2228         let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2229 
2230         assert_eq!(buf.data, buf2.data);
2231     }
2232 
2233     #[test]
2234     fn test_island() {
2235         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2236         let mut buf = MachBuffer::new();
2237         let mut state = <Inst as MachInstEmit>::State::default();
2238         let constants = Default::default();
2239 
2240         buf.reserve_labels_for_blocks(4);
2241 
2242         buf.bind_label(label(0), state.ctrl_plane_mut());
2243         let inst = Inst::CondBr {
2244             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2245             taken: target(2),
2246             not_taken: target(3),
2247         };
2248         inst.emit(&mut buf, &info, &mut state);
2249 
2250         buf.bind_label(label(1), state.ctrl_plane_mut());
2251         while buf.cur_offset() < 2000000 {
2252             if buf.island_needed(0) {
2253                 buf.emit_island(0, state.ctrl_plane_mut());
2254             }
2255             let inst = Inst::Nop4;
2256             inst.emit(&mut buf, &info, &mut state);
2257         }
2258 
2259         buf.bind_label(label(2), state.ctrl_plane_mut());
2260         let inst = Inst::Nop4;
2261         inst.emit(&mut buf, &info, &mut state);
2262 
2263         buf.bind_label(label(3), state.ctrl_plane_mut());
2264         let inst = Inst::Nop4;
2265         inst.emit(&mut buf, &info, &mut state);
2266 
2267         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2268 
2269         assert_eq!(2000000 + 8, buf.total_size());
2270 
2271         let mut buf2 = MachBuffer::new();
2272         let mut state = Default::default();
2273         let inst = Inst::CondBr {
2274             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2275 
2276             // This conditionally taken branch has a 19-bit constant, shifted
2277             // to the left by two, giving us a 21-bit range in total. Half of
2278             // this range positive so the we should be around 1 << 20 bytes
2279             // away for our jump target.
2280             //
2281             // There are two pending fixups by the time we reach this point,
2282             // one for this 19-bit jump and one for the unconditional 26-bit
2283             // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
2284             // veneer is 20 bytes large, which means that pessimistically
2285             // assuming we'll need two veneers. Currently each veneer is
2286             // pessimistically assumed to be the maximal size which means we
2287             // need 40 bytes of extra space, meaning that the actual island
2288             // should come 40-bytes before the deadline.
2289             taken: BranchTarget::ResolvedOffset((1 << 20) - 20 - 20),
2290 
2291             // This branch is in-range so no veneers should be needed, it should
2292             // go directly to the target.
2293             not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
2294         };
2295         inst.emit(&mut buf2, &info, &mut state);
2296 
2297         let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2298 
2299         assert_eq!(&buf.data[0..8], &buf2.data[..]);
2300     }
2301 
2302     #[test]
2303     fn test_island_backward() {
2304         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2305         let mut buf = MachBuffer::new();
2306         let mut state = <Inst as MachInstEmit>::State::default();
2307         let constants = Default::default();
2308 
2309         buf.reserve_labels_for_blocks(4);
2310 
2311         buf.bind_label(label(0), state.ctrl_plane_mut());
2312         let inst = Inst::Nop4;
2313         inst.emit(&mut buf, &info, &mut state);
2314 
2315         buf.bind_label(label(1), state.ctrl_plane_mut());
2316         let inst = Inst::Nop4;
2317         inst.emit(&mut buf, &info, &mut state);
2318 
2319         buf.bind_label(label(2), state.ctrl_plane_mut());
2320         while buf.cur_offset() < 2000000 {
2321             let inst = Inst::Nop4;
2322             inst.emit(&mut buf, &info, &mut state);
2323         }
2324 
2325         buf.bind_label(label(3), state.ctrl_plane_mut());
2326         let inst = Inst::CondBr {
2327             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2328             taken: target(0),
2329             not_taken: target(1),
2330         };
2331         inst.emit(&mut buf, &info, &mut state);
2332 
2333         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2334 
2335         assert_eq!(2000000 + 12, buf.total_size());
2336 
2337         let mut buf2 = MachBuffer::new();
2338         let mut state = Default::default();
2339         let inst = Inst::CondBr {
2340             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2341             taken: BranchTarget::ResolvedOffset(8),
2342             not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
2343         };
2344         inst.emit(&mut buf2, &info, &mut state);
2345         let inst = Inst::Jump {
2346             dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
2347         };
2348         inst.emit(&mut buf2, &info, &mut state);
2349 
2350         let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2351 
2352         assert_eq!(&buf.data[2000000..], &buf2.data[..]);
2353     }
2354 
2355     #[test]
2356     fn test_multiple_redirect() {
2357         // label0:
2358         //   cbz x0, label1
2359         //   b label2
2360         // label1:
2361         //   b label3
2362         // label2:
2363         //   nop
2364         //   nop
2365         //   b label0
2366         // label3:
2367         //   b label4
2368         // label4:
2369         //   b label5
2370         // label5:
2371         //   b label7
2372         // label6:
2373         //   nop
2374         // label7:
2375         //   ret
2376         //
2377         // -- should become:
2378         //
2379         // label0:
2380         //   cbz x0, label7
2381         // label2:
2382         //   nop
2383         //   nop
2384         //   b label0
2385         // label6:
2386         //   nop
2387         // label7:
2388         //   ret
2389 
2390         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2391         let mut buf = MachBuffer::new();
2392         let mut state = <Inst as MachInstEmit>::State::default();
2393         let constants = Default::default();
2394 
2395         buf.reserve_labels_for_blocks(8);
2396 
2397         buf.bind_label(label(0), state.ctrl_plane_mut());
2398         let inst = Inst::CondBr {
2399             kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2400             taken: target(1),
2401             not_taken: target(2),
2402         };
2403         inst.emit(&mut buf, &info, &mut state);
2404 
2405         buf.bind_label(label(1), state.ctrl_plane_mut());
2406         let inst = Inst::Jump { dest: target(3) };
2407         inst.emit(&mut buf, &info, &mut state);
2408 
2409         buf.bind_label(label(2), state.ctrl_plane_mut());
2410         let inst = Inst::Nop4;
2411         inst.emit(&mut buf, &info, &mut state);
2412         inst.emit(&mut buf, &info, &mut state);
2413         let inst = Inst::Jump { dest: target(0) };
2414         inst.emit(&mut buf, &info, &mut state);
2415 
2416         buf.bind_label(label(3), state.ctrl_plane_mut());
2417         let inst = Inst::Jump { dest: target(4) };
2418         inst.emit(&mut buf, &info, &mut state);
2419 
2420         buf.bind_label(label(4), state.ctrl_plane_mut());
2421         let inst = Inst::Jump { dest: target(5) };
2422         inst.emit(&mut buf, &info, &mut state);
2423 
2424         buf.bind_label(label(5), state.ctrl_plane_mut());
2425         let inst = Inst::Jump { dest: target(7) };
2426         inst.emit(&mut buf, &info, &mut state);
2427 
2428         buf.bind_label(label(6), state.ctrl_plane_mut());
2429         let inst = Inst::Nop4;
2430         inst.emit(&mut buf, &info, &mut state);
2431 
2432         buf.bind_label(label(7), state.ctrl_plane_mut());
2433         let inst = Inst::Ret {};
2434         inst.emit(&mut buf, &info, &mut state);
2435 
2436         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2437 
2438         let golden_data = vec![
2439             0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
2440             0x1f, 0x20, 0x03, 0xd5, // nop
2441             0x1f, 0x20, 0x03, 0xd5, // nop
2442             0xfd, 0xff, 0xff, 0x17, // b 0
2443             0x1f, 0x20, 0x03, 0xd5, // nop
2444             0xc0, 0x03, 0x5f, 0xd6, // ret
2445         ];
2446 
2447         assert_eq!(&golden_data[..], &buf.data[..]);
2448     }
2449 
2450     #[test]
2451     fn test_handle_branch_cycle() {
2452         // label0:
2453         //   b label1
2454         // label1:
2455         //   b label2
2456         // label2:
2457         //   b label3
2458         // label3:
2459         //   b label4
2460         // label4:
2461         //   b label1  // note: not label0 (to make it interesting).
2462         //
2463         // -- should become:
2464         //
2465         // label0, label1, ..., label4:
2466         //   b label0
2467         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2468         let mut buf = MachBuffer::new();
2469         let mut state = <Inst as MachInstEmit>::State::default();
2470         let constants = Default::default();
2471 
2472         buf.reserve_labels_for_blocks(5);
2473 
2474         buf.bind_label(label(0), state.ctrl_plane_mut());
2475         let inst = Inst::Jump { dest: target(1) };
2476         inst.emit(&mut buf, &info, &mut state);
2477 
2478         buf.bind_label(label(1), state.ctrl_plane_mut());
2479         let inst = Inst::Jump { dest: target(2) };
2480         inst.emit(&mut buf, &info, &mut state);
2481 
2482         buf.bind_label(label(2), state.ctrl_plane_mut());
2483         let inst = Inst::Jump { dest: target(3) };
2484         inst.emit(&mut buf, &info, &mut state);
2485 
2486         buf.bind_label(label(3), state.ctrl_plane_mut());
2487         let inst = Inst::Jump { dest: target(4) };
2488         inst.emit(&mut buf, &info, &mut state);
2489 
2490         buf.bind_label(label(4), state.ctrl_plane_mut());
2491         let inst = Inst::Jump { dest: target(1) };
2492         inst.emit(&mut buf, &info, &mut state);
2493 
2494         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2495 
2496         let golden_data = vec![
2497             0x00, 0x00, 0x00, 0x14, // b 0
2498         ];
2499 
2500         assert_eq!(&golden_data[..], &buf.data[..]);
2501     }
2502 
2503     #[test]
2504     fn metadata_records() {
2505         let mut buf = MachBuffer::<Inst>::new();
2506         let ctrl_plane = &mut Default::default();
2507         let constants = Default::default();
2508 
2509         buf.reserve_labels_for_blocks(3);
2510 
2511         buf.bind_label(label(0), ctrl_plane);
2512         buf.put1(1);
2513         buf.add_trap(TrapCode::HEAP_OUT_OF_BOUNDS);
2514         buf.put1(2);
2515         buf.add_trap(TrapCode::INTEGER_OVERFLOW);
2516         buf.add_trap(TrapCode::INTEGER_DIVISION_BY_ZERO);
2517         buf.add_call_site(&[
2518             (None.into(), label(1)),
2519             (Some(ExceptionTag::new(42)).into(), label(2)),
2520         ]);
2521         buf.add_reloc(
2522             Reloc::Abs4,
2523             &ExternalName::User(UserExternalNameRef::new(0)),
2524             0,
2525         );
2526         buf.put1(3);
2527         buf.add_reloc(
2528             Reloc::Abs8,
2529             &ExternalName::User(UserExternalNameRef::new(1)),
2530             1,
2531         );
2532         buf.put1(4);
2533         buf.bind_label(label(1), ctrl_plane);
2534         buf.put1(0xff);
2535         buf.bind_label(label(2), ctrl_plane);
2536         buf.put1(0xff);
2537 
2538         let buf = buf.finish(&constants, ctrl_plane);
2539 
2540         assert_eq!(buf.data(), &[1, 2, 3, 4, 0xff, 0xff]);
2541         assert_eq!(
2542             buf.traps()
2543                 .iter()
2544                 .map(|trap| (trap.offset, trap.code))
2545                 .collect::<Vec<_>>(),
2546             vec![
2547                 (1, TrapCode::HEAP_OUT_OF_BOUNDS),
2548                 (2, TrapCode::INTEGER_OVERFLOW),
2549                 (2, TrapCode::INTEGER_DIVISION_BY_ZERO)
2550             ]
2551         );
2552         let call_sites: Vec<_> = buf.call_sites().collect();
2553         assert_eq!(call_sites[0].ret_addr, 2);
2554         assert_eq!(
2555             call_sites[0].exception_handlers,
2556             &[(None.into(), 4), (Some(ExceptionTag::new(42)).into(), 5)]
2557         );
2558         assert_eq!(
2559             buf.relocs()
2560                 .iter()
2561                 .map(|reloc| (reloc.offset, reloc.kind))
2562                 .collect::<Vec<_>>(),
2563             vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
2564         );
2565     }
2566 }
2567