1 //! In-memory representation of compiled machine code, with labels and fixups to
2 //! refer to those labels. Handles constant-pool island insertion and also
3 //! veneer insertion for out-of-range jumps.
4 //!
5 //! This code exists to solve three problems:
6 //!
7 //! - Branch targets for forward branches are not known until later, when we
8 //!   emit code in a single pass through the instruction structs.
9 //!
10 //! - On many architectures, address references or offsets have limited range.
11 //!   For example, on AArch64, conditional branches can only target code +/- 1MB
12 //!   from the branch itself.
13 //!
14 //! - The lowering of control flow from the CFG-with-edges produced by
15 //!   [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
16 //!   edge blocks when the register allocator does not need to insert any
17 //!   spills/reloads/moves in edge blocks, results in many suboptimal branch
18 //!   patterns. The lowering also pays no attention to block order, and so
19 //!   two-target conditional forms (cond-br followed by uncond-br) can often by
20 //!   avoided because one of the targets is the fallthrough. There are several
21 //!   cases here where we can simplify to use fewer branches.
22 //!
23 //! This "buffer" implements a single-pass code emission strategy (with a later
24 //! "fixup" pass, but only through recorded fixups, not all instructions). The
25 //! basic idea is:
26 //!
27 //! - Emit branches as they are, including two-target (cond/uncond) compound
28 //!   forms, but with zero offsets and optimistically assuming the target will be
29 //!   in range. Record the "fixup" for later. Targets are denoted instead by
30 //!   symbolic "labels" that are then bound to certain offsets in the buffer as
31 //!   we emit code. (Nominally, there is a label at the start of every basic
32 //!   block.)
33 //!
34 //! - As we do this, track the offset in the buffer at which the first label
35 //!   reference "goes out of range". We call this the "deadline". If we reach the
36 //!   deadline and we still have not bound the label to which an unresolved branch
37 //!   refers, we have a problem!
38 //!
39 //! - To solve this problem, we emit "islands" full of "veneers". An island is
40 //!   simply a chunk of code inserted in the middle of the code actually produced
41 //!   by the emitter (e.g., vcode iterating over instruction structs). The emitter
42 //!   has some awareness of this: it either asks for an island between blocks, so
43 //!   it is not accidentally executed, or else it emits a branch around the island
44 //!   when all other options fail (see `Inst::EmitIsland` meta-instruction).
45 //!
46 //! - A "veneer" is an instruction (or sequence of instructions) in an "island"
47 //!   that implements a longer-range reference to a label. The idea is that, for
48 //!   example, a branch with a limited range can branch to a "veneer" instead,
49 //!   which is simply a branch in a form that can use a longer-range reference. On
50 //!   AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
51 //!   can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
52 //!   conditional branch's label reference can be fixed up with a "veneer" to
53 //!   achieve a longer range.
54 //!
55 //! - To implement all of this, we require the backend to provide a `LabelUse`
56 //!   type that implements a trait. This is nominally an enum that records one of
57 //!   several kinds of references to an offset in code -- basically, a relocation
58 //!   type -- and will usually correspond to different instruction formats. The
59 //!   `LabelUse` implementation specifies the maximum range, how to patch in the
60 //!   actual label location when known, and how to generate a veneer to extend the
61 //!   range.
62 //!
63 //! That satisfies label references, but we still may have suboptimal branch
64 //! patterns. To clean up the branches, we do a simple "peephole"-style
65 //! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
66 //! informs the buffer of branches in the code and, in the case of conditionals,
67 //! the code that would have been emitted to invert this branch's condition. We
68 //! track the "latest branches": these are branches that are contiguous up to
69 //! the current offset. (If any code is emitted after a branch, that branch or
70 //! run of contiguous branches is no longer "latest".) The latest branches are
71 //! those that we can edit by simply truncating the buffer and doing something
72 //! else instead.
73 //!
74 //! To optimize branches, we implement several simple rules, and try to apply
75 //! them to the "latest branches" when possible:
76 //!
77 //! - A branch with a label target, when that label is bound to the ending
78 //!   offset of the branch (the fallthrough location), can be removed altogether,
79 //!   because the branch would have no effect).
80 //!
81 //! - An unconditional branch that starts at a label location, and branches to
82 //!   another label, results in a "label alias": all references to the label bound
83 //!   *to* this branch instruction are instead resolved to the *target* of the
84 //!   branch instruction. This effectively removes empty blocks that just
85 //!   unconditionally branch to the next block. We call this "branch threading".
86 //!
87 //! - A conditional followed by an unconditional, when the conditional branches
88 //!   to the unconditional's fallthrough, results in (i) the truncation of the
89 //!   unconditional, (ii) the inversion of the condition's condition, and (iii)
90 //!   replacement of the conditional's target (using the original target of the
91 //!   unconditional). This is a fancy way of saying "we can flip a two-target
92 //!   conditional branch's taken/not-taken targets if it works better with our
93 //!   fallthrough". To make this work, the emitter actually gives the buffer
94 //!   *both* forms of every conditional branch: the true form is emitted into the
95 //!   buffer, and the "inverted" machine-code bytes are provided as part of the
96 //!   branch-fixup metadata.
97 //!
98 //! - An unconditional B preceded by another unconditional P, when B's label(s) have
99 //!   been redirected to target(B), can be removed entirely. This is an extension
100 //!   of the branch-threading optimization, and is valid because if we know there
101 //!   will be no fallthrough into this branch instruction (the prior instruction
102 //!   is an unconditional jump), and if we know we have successfully redirected
103 //!   all labels, then this branch instruction is unreachable. Note that this
104 //!   works because the redirection happens before the label is ever resolved
105 //!   (fixups happen at island emission time, at which point latest-branches are
106 //!   cleared, or at the end of emission), so we are sure to catch and redirect
107 //!   all possible paths to this instruction.
108 //!
109 //! # Branch-optimization Correctness
110 //!
111 //! The branch-optimization mechanism depends on a few data structures with
112 //! invariants, which are always held outside the scope of top-level public
113 //! methods:
114 //!
115 //! - The latest-branches list. Each entry describes a span of the buffer
116 //!   (start/end offsets), the label target, the corresponding fixup-list entry
117 //!   index, and the bytes (must be the same length) for the inverted form, if
118 //!   conditional. The list of labels that are bound to the start-offset of this
119 //!   branch is *complete* (if any label has a resolved offset equal to `start`
120 //!   and is not an alias, it must appear in this list) and *precise* (no label
121 //!   in this list can be bound to another offset). No label in this list should
122 //!   be an alias.  No two branch ranges can overlap, and branches are in
123 //!   ascending-offset order.
124 //!
125 //! - The labels-at-tail list. This contains all MachLabels that have been bound
126 //!   to (whose resolved offsets are equal to) the tail offset of the buffer.
127 //!   No label in this list should be an alias.
128 //!
129 //! - The label_offsets array, containing the bound offset of a label or
130 //!   UNKNOWN. No label can be bound at an offset greater than the current
131 //!   buffer tail.
132 //!
133 //! - The label_aliases array, containing another label to which a label is
134 //!   bound or UNKNOWN. A label's resolved offset is the resolved offset
135 //!   of the label it is aliased to, if this is set.
136 //!
137 //! We argue below, at each method, how the invariants in these data structures
138 //! are maintained (grep for "Post-invariant").
139 //!
140 //! Given these invariants, we argue why each optimization preserves execution
141 //! semantics below (grep for "Preserves execution semantics").
142 //!
143 //! # Avoiding Quadratic Behavior
144 //!
145 //! There are two cases where we've had to take some care to avoid
146 //! quadratic worst-case behavior:
147 //!
148 //! - The "labels at this branch" list can grow unboundedly if the
149 //!   code generator binds many labels at one location. If the count
150 //!   gets too high (defined by the `LABEL_LIST_THRESHOLD` constant), we
151 //!   simply abort an optimization early in a way that is always correct
152 //!   but is conservative.
153 //!
154 //! - The fixup list can interact with island emission to create
155 //!   "quadratic island behavior". In a little more detail, one can hit
156 //!   this behavior by having some pending fixups (forward label
157 //!   references) with long-range label-use kinds, and some others
158 //!   with shorter-range references that nonetheless still are pending
159 //!   long enough to trigger island generation. In such a case, we
160 //!   process the fixup list, generate veneers to extend some forward
161 //!   references' ranges, but leave the other (longer-range) ones
162 //!   alone. The way this was implemented put them back on a list and
163 //!   resulted in quadratic behavior.
164 //!
165 //!   To avoid this fixups are split into two lists: one "pending" list and one
166 //!   final list. The pending list is kept around for handling fixups related to
167 //!   branches so it can be edited/truncated. When an island is reached, which
168 //!   starts processing fixups, all pending fixups are flushed into the final
169 //!   list. The final list is a `BinaryHeap` which enables fixup processing to
170 //!   only process those which are required during island emission, deferring
171 //!   all longer-range fixups to later.
172 
173 use crate::binemit::{Addend, CodeOffset, Reloc};
174 use crate::ir::function::FunctionParameters;
175 use crate::ir::{ExceptionTag, ExternalName, RelSourceLoc, SourceLoc, TrapCode};
176 use crate::isa::unwind::UnwindInst;
177 use crate::machinst::{
178     BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
179 };
180 use crate::trace;
181 use crate::{MachInstEmitState, ir};
182 use crate::{VCodeConstantData, timing};
183 use core::ops::Range;
184 use cranelift_control::ControlPlane;
185 use cranelift_entity::packed_option::PackedOption;
186 use cranelift_entity::{PrimaryMap, entity_impl};
187 use smallvec::SmallVec;
188 use std::cmp::Ordering;
189 use std::collections::BinaryHeap;
190 use std::mem;
191 use std::string::String;
192 use std::vec::Vec;
193 
194 #[cfg(feature = "enable-serde")]
195 use serde::{Deserialize, Serialize};
196 
197 #[cfg(feature = "enable-serde")]
198 pub trait CompilePhase {
199     type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
200     type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
201 }
202 
203 #[cfg(not(feature = "enable-serde"))]
204 pub trait CompilePhase {
205     type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;
206     type SourceLocType: core::fmt::Debug + PartialEq + Clone;
207 }
208 
209 /// Status of a compiled artifact that needs patching before being used.
210 #[derive(Clone, Debug, PartialEq)]
211 #[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
212 pub struct Stencil;
213 
214 /// Status of a compiled artifact ready to use.
215 #[derive(Clone, Debug, PartialEq)]
216 pub struct Final;
217 
218 impl CompilePhase for Stencil {
219     type MachSrcLocType = MachSrcLoc<Stencil>;
220     type SourceLocType = RelSourceLoc;
221 }
222 
223 impl CompilePhase for Final {
224     type MachSrcLocType = MachSrcLoc<Final>;
225     type SourceLocType = SourceLoc;
226 }
227 
228 #[derive(Clone, Copy, Debug, PartialEq, Eq)]
229 enum ForceVeneers {
230     Yes,
231     No,
232 }
233 
234 /// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
235 /// in bulk.
236 ///
237 /// This struct uses `SmallVec`s to support small-ish function bodies without
238 /// any heap allocation. As such, it will be several kilobytes large. This is
239 /// likely fine as long as it is stack-allocated for function emission then
240 /// thrown away; but beware if many buffer objects are retained persistently.
241 pub struct MachBuffer<I: VCodeInst> {
242     /// The buffer contents, as raw bytes.
243     data: SmallVec<[u8; 1024]>,
244     /// The required alignment of this buffer.
245     min_alignment: u32,
246     /// Any relocations referring to this code. Note that only *external*
247     /// relocations are tracked here; references to labels within the buffer are
248     /// resolved before emission.
249     relocs: SmallVec<[MachReloc; 16]>,
250     /// Any trap records referring to this code.
251     traps: SmallVec<[MachTrap; 16]>,
252     /// Any call site records referring to this code.
253     call_sites: SmallVec<[MachCallSite; 16]>,
254     /// Any exception-handler records referred to at call sites.
255     exception_handlers: SmallVec<[(PackedOption<ir::ExceptionTag>, MachLabel); 16]>,
256     /// Any source location mappings referring to this code.
257     srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,
258     /// Any user stack maps for this code.
259     ///
260     /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
261     /// by code offset, and each stack map covers `span` bytes on the stack.
262     user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
263     /// Any unwind info at a given location.
264     unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
265     /// The current source location in progress (after `start_srcloc()` and
266     /// before `end_srcloc()`).  This is a (start_offset, src_loc) tuple.
267     cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,
268     /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
269     label_offsets: SmallVec<[CodeOffset; 16]>,
270     /// Label aliases: when one label points to an unconditional jump, and that
271     /// jump points to another label, we can redirect references to the first
272     /// label immediately to the second.
273     ///
274     /// Invariant: we don't have label-alias cycles. We ensure this by,
275     /// before setting label A to alias label B, resolving B's alias
276     /// target (iteratively until a non-aliased label); if B is already
277     /// aliased to A, then we cannot alias A back to B.
278     label_aliases: SmallVec<[MachLabel; 16]>,
279     /// Constants that must be emitted at some point.
280     pending_constants: SmallVec<[VCodeConstant; 16]>,
281     /// Byte size of all constants in `pending_constants`.
282     pending_constants_size: CodeOffset,
283     /// Traps that must be emitted at some point.
284     pending_traps: SmallVec<[MachLabelTrap; 16]>,
285     /// Fixups that haven't yet been flushed into `fixup_records` below and may
286     /// be related to branches that are chomped. These all get added to
287     /// `fixup_records` during island emission.
288     pending_fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
289     /// The nearest upcoming deadline for entries in `pending_fixup_records`.
290     pending_fixup_deadline: CodeOffset,
291     /// Fixups that must be performed after all code is emitted.
292     fixup_records: BinaryHeap<MachLabelFixup<I>>,
293     /// Latest branches, to facilitate in-place editing for better fallthrough
294     /// behavior and empty-block removal.
295     latest_branches: SmallVec<[MachBranch; 4]>,
296     /// All labels at the current offset (emission tail). This is lazily
297     /// cleared: it is actually accurate as long as the current offset is
298     /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
299     /// be considered as empty.
300     ///
301     /// For correctness, this *must* be complete (i.e., the vector must contain
302     /// all labels whose offsets are resolved to the current tail), because we
303     /// rely on it to update labels when we truncate branches.
304     labels_at_tail: SmallVec<[MachLabel; 4]>,
305     /// The last offset at which `labels_at_tail` is valid. It is conceptually
306     /// always describing the tail of the buffer, but we do not clear
307     /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
308     /// when the offset has grown past this (`labels_at_tail_off`) point.
309     /// Always <= `cur_offset()`.
310     labels_at_tail_off: CodeOffset,
311     /// Metadata about all constants that this function has access to.
312     ///
313     /// This records the size/alignment of all constants (not the actual data)
314     /// along with the last available label generated for the constant. This map
315     /// is consulted when constants are referred to and the label assigned to a
316     /// constant may change over time as well.
317     constants: PrimaryMap<VCodeConstant, MachBufferConstant>,
318     /// All recorded usages of constants as pairs of the constant and where the
319     /// constant needs to be placed within `self.data`. Note that the same
320     /// constant may appear in this array multiple times if it was emitted
321     /// multiple times.
322     used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>,
323     /// Indicates when a patchable region is currently open, to guard that it's
324     /// not possible to nest patchable regions.
325     open_patchable: bool,
326 }
327 
328 impl MachBufferFinalized<Stencil> {
329     /// Get a finalized machine buffer by applying the function's base source location.
330     pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {
331         MachBufferFinalized {
332             data: self.data,
333             relocs: self.relocs,
334             traps: self.traps,
335             call_sites: self.call_sites,
336             exception_handlers: self.exception_handlers,
337             srclocs: self
338                 .srclocs
339                 .into_iter()
340                 .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))
341                 .collect(),
342             user_stack_maps: self.user_stack_maps,
343             unwind_info: self.unwind_info,
344             alignment: self.alignment,
345         }
346     }
347 }
348 
349 /// A `MachBuffer` once emission is completed: holds generated code and records,
350 /// without fixups. This allows the type to be independent of the backend.
351 #[derive(PartialEq, Debug, Clone)]
352 #[cfg_attr(
353     feature = "enable-serde",
354     derive(serde_derive::Serialize, serde_derive::Deserialize)
355 )]
356 pub struct MachBufferFinalized<T: CompilePhase> {
357     /// The buffer contents, as raw bytes.
358     pub(crate) data: SmallVec<[u8; 1024]>,
359     /// Any relocations referring to this code. Note that only *external*
360     /// relocations are tracked here; references to labels within the buffer are
361     /// resolved before emission.
362     pub(crate) relocs: SmallVec<[FinalizedMachReloc; 16]>,
363     /// Any trap records referring to this code.
364     pub(crate) traps: SmallVec<[MachTrap; 16]>,
365     /// Any call site records referring to this code.
366     pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,
367     /// Any exception-handler records referred to at call sites.
368     pub(crate) exception_handlers: SmallVec<[(PackedOption<ir::ExceptionTag>, CodeOffset); 16]>,
369     /// Any source location mappings referring to this code.
370     pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,
371     /// Any user stack maps for this code.
372     ///
373     /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted
374     /// by code offset, and each stack map covers `span` bytes on the stack.
375     pub(crate) user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>,
376     /// Any unwind info at a given location.
377     pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
378     /// The required alignment of this buffer.
379     pub alignment: u32,
380 }
381 
382 const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
383 const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);
384 
385 /// Threshold on max length of `labels_at_this_branch` list to avoid
386 /// unbounded quadratic behavior (see comment below at use-site).
387 const LABEL_LIST_THRESHOLD: usize = 100;
388 
389 /// A label refers to some offset in a `MachBuffer`. It may not be resolved at
390 /// the point at which it is used by emitted code; the buffer records "fixups"
391 /// for references to the label, and will come back and patch the code
392 /// appropriately when the label's location is eventually known.
393 #[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
394 pub struct MachLabel(u32);
395 entity_impl!(MachLabel);
396 
397 impl MachLabel {
398     /// Get a label for a block. (The first N MachLabels are always reserved for
399     /// the N blocks in the vcode.)
400     pub fn from_block(bindex: BlockIndex) -> MachLabel {
401         MachLabel(bindex.index() as u32)
402     }
403 
404     /// Creates a string representing this label, for convenience.
405     pub fn to_string(&self) -> String {
406         format!("label{}", self.0)
407     }
408 }
409 
410 impl Default for MachLabel {
411     fn default() -> Self {
412         UNKNOWN_LABEL
413     }
414 }
415 
416 /// Represents the beginning of an editable region in the [`MachBuffer`], while code emission is
417 /// still occurring. An [`OpenPatchRegion`] is closed by [`MachBuffer::end_patchable`], consuming
418 /// the [`OpenPatchRegion`] token in the process.
419 pub struct OpenPatchRegion(usize);
420 
421 /// A region in the [`MachBuffer`] code buffer that can be edited prior to finalization. An example
422 /// of where you might want to use this is for patching instructions that mention constants that
423 /// won't be known until later: [`MachBuffer::start_patchable`] can be used to begin the patchable
424 /// region, instructions can be emitted with placeholder constants, and the [`PatchRegion`] token
425 /// can be produced by [`MachBuffer::end_patchable`]. Once the values of those constants are known,
426 /// the [`PatchRegion::patch`] function can be used to get a mutable buffer to the instruction
427 /// bytes, and the constants uses can be updated directly.
428 pub struct PatchRegion {
429     range: Range<usize>,
430 }
431 
432 impl PatchRegion {
433     /// Consume the patch region to yield a mutable slice of the [`MachBuffer`] data buffer.
434     pub fn patch<I: VCodeInst>(self, buffer: &mut MachBuffer<I>) -> &mut [u8] {
435         &mut buffer.data[self.range]
436     }
437 }
438 
439 impl<I: VCodeInst> MachBuffer<I> {
440     /// Create a new section, known to start at `start_offset` and with a size limited to
441     /// `length_limit`.
442     pub fn new() -> MachBuffer<I> {
443         MachBuffer {
444             data: SmallVec::new(),
445             min_alignment: I::function_alignment().minimum,
446             relocs: SmallVec::new(),
447             traps: SmallVec::new(),
448             call_sites: SmallVec::new(),
449             exception_handlers: SmallVec::new(),
450             srclocs: SmallVec::new(),
451             user_stack_maps: SmallVec::new(),
452             unwind_info: SmallVec::new(),
453             cur_srcloc: None,
454             label_offsets: SmallVec::new(),
455             label_aliases: SmallVec::new(),
456             pending_constants: SmallVec::new(),
457             pending_constants_size: 0,
458             pending_traps: SmallVec::new(),
459             pending_fixup_records: SmallVec::new(),
460             pending_fixup_deadline: u32::MAX,
461             fixup_records: Default::default(),
462             latest_branches: SmallVec::new(),
463             labels_at_tail: SmallVec::new(),
464             labels_at_tail_off: 0,
465             constants: Default::default(),
466             used_constants: Default::default(),
467             open_patchable: false,
468         }
469     }
470 
471     /// Current offset from start of buffer.
472     pub fn cur_offset(&self) -> CodeOffset {
473         self.data.len() as CodeOffset
474     }
475 
476     /// Add a byte.
477     pub fn put1(&mut self, value: u8) {
478         self.data.push(value);
479 
480         // Post-invariant: conceptual-labels_at_tail contains a complete and
481         // precise list of labels bound at `cur_offset()`. We have advanced
482         // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
483         // before, it is not anymore (and it cannot become equal, because
484         // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
485         // conceptually empty (even though it is only lazily cleared). No labels
486         // can be bound at this new offset (by invariant on `label_offsets`).
487         // Hence the invariant holds.
488     }
489 
490     /// Add 2 bytes.
491     pub fn put2(&mut self, value: u16) {
492         let bytes = value.to_le_bytes();
493         self.data.extend_from_slice(&bytes[..]);
494 
495         // Post-invariant: as for `put1()`.
496     }
497 
498     /// Add 4 bytes.
499     pub fn put4(&mut self, value: u32) {
500         let bytes = value.to_le_bytes();
501         self.data.extend_from_slice(&bytes[..]);
502 
503         // Post-invariant: as for `put1()`.
504     }
505 
506     /// Add 8 bytes.
507     pub fn put8(&mut self, value: u64) {
508         let bytes = value.to_le_bytes();
509         self.data.extend_from_slice(&bytes[..]);
510 
511         // Post-invariant: as for `put1()`.
512     }
513 
514     /// Add a slice of bytes.
515     pub fn put_data(&mut self, data: &[u8]) {
516         self.data.extend_from_slice(data);
517 
518         // Post-invariant: as for `put1()`.
519     }
520 
521     /// Reserve appended space and return a mutable slice referring to it.
522     pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
523         let off = self.data.len();
524         let new_len = self.data.len() + len;
525         self.data.resize(new_len, 0);
526         &mut self.data[off..]
527 
528         // Post-invariant: as for `put1()`.
529     }
530 
531     /// Align up to the given alignment.
532     pub fn align_to(&mut self, align_to: CodeOffset) {
533         trace!("MachBuffer: align to {}", align_to);
534         assert!(
535             align_to.is_power_of_two(),
536             "{align_to} is not a power of two"
537         );
538         while self.cur_offset() & (align_to - 1) != 0 {
539             self.put1(0);
540         }
541 
542         // Post-invariant: as for `put1()`.
543     }
544 
545     /// Begin a region of patchable code. There is one requirement for the
546     /// code that is emitted: It must not introduce any instructions that
547     /// could be chomped (branches are an example of this). In other words,
548     /// you must not call [`MachBuffer::add_cond_branch`] or
549     /// [`MachBuffer::add_uncond_branch`] between calls to this method and
550     /// [`MachBuffer::end_patchable`].
551     pub fn start_patchable(&mut self) -> OpenPatchRegion {
552         assert!(!self.open_patchable, "Patchable regions may not be nested");
553         self.open_patchable = true;
554         OpenPatchRegion(usize::try_from(self.cur_offset()).unwrap())
555     }
556 
557     /// End a region of patchable code, yielding a [`PatchRegion`] value that
558     /// can be consumed later to produce a one-off mutable slice to the
559     /// associated region of the data buffer.
560     pub fn end_patchable(&mut self, open: OpenPatchRegion) -> PatchRegion {
561         // No need to assert the state of `open_patchable` here, as we take
562         // ownership of the only `OpenPatchable` value.
563         self.open_patchable = false;
564         let end = usize::try_from(self.cur_offset()).unwrap();
565         PatchRegion { range: open.0..end }
566     }
567 
568     /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
569     /// offset yet.
570     pub fn get_label(&mut self) -> MachLabel {
571         let l = self.label_offsets.len() as u32;
572         self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
573         self.label_aliases.push(UNKNOWN_LABEL);
574         trace!("MachBuffer: new label -> {:?}", MachLabel(l));
575         MachLabel(l)
576 
577         // Post-invariant: the only mutation is to add a new label; it has no
578         // bound offset yet, so it trivially satisfies all invariants.
579     }
580 
581     /// Reserve the first N MachLabels for blocks.
582     pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
583         trace!("MachBuffer: first {} labels are for blocks", blocks);
584         debug_assert!(self.label_offsets.is_empty());
585         self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
586         self.label_aliases.resize(blocks, UNKNOWN_LABEL);
587 
588         // Post-invariant: as for `get_label()`.
589     }
590 
591     /// Registers metadata in this `MachBuffer` about the `constants` provided.
592     ///
593     /// This will record the size/alignment of all constants which will prepare
594     /// them for emission later on.
595     pub fn register_constants(&mut self, constants: &VCodeConstants) {
596         for (c, val) in constants.iter() {
597             self.register_constant(&c, val);
598         }
599     }
600 
601     /// Similar to [`MachBuffer::register_constants`] but registers a
602     /// single constant metadata. This function is useful in
603     /// situations where not all constants are known at the time of
604     /// emission.
605     pub fn register_constant(&mut self, constant: &VCodeConstant, data: &VCodeConstantData) {
606         let c2 = self.constants.push(MachBufferConstant {
607             upcoming_label: None,
608             align: data.alignment(),
609             size: data.as_slice().len(),
610         });
611         assert_eq!(*constant, c2);
612     }
613 
614     /// Completes constant emission by iterating over `self.used_constants` and
615     /// filling in the "holes" with the constant values provided by `constants`.
616     ///
617     /// Returns the alignment required for this entire buffer. Alignment starts
618     /// at the ISA's minimum function alignment and can be increased due to
619     /// constant requirements.
620     fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 {
621         let mut alignment = self.min_alignment;
622         for (constant, offset) in mem::take(&mut self.used_constants) {
623             let constant = constants.get(constant);
624             let data = constant.as_slice();
625             self.data[offset as usize..][..data.len()].copy_from_slice(data);
626             alignment = constant.alignment().max(alignment);
627         }
628         alignment
629     }
630 
631     /// Returns a label that can be used to refer to the `constant` provided.
632     ///
633     /// This will automatically defer a new constant to be emitted for
634     /// `constant` if it has not been previously emitted. Note that this
635     /// function may return a different label for the same constant at
636     /// different points in time. The label is valid to use only from the
637     /// current location; the MachBuffer takes care to emit the same constant
638     /// multiple times if needed so the constant is always in range.
639     pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel {
640         let MachBufferConstant {
641             align,
642             size,
643             upcoming_label,
644         } = self.constants[constant];
645         if let Some(label) = upcoming_label {
646             return label;
647         }
648 
649         let label = self.get_label();
650         trace!(
651             "defer constant: eventually emit {size} bytes aligned \
652              to {align} at label {label:?}",
653         );
654         self.pending_constants.push(constant);
655         self.pending_constants_size += size as u32;
656         self.constants[constant].upcoming_label = Some(label);
657         label
658     }
659 
660     /// Bind a label to the current offset. A label can only be bound once.
661     pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) {
662         trace!(
663             "MachBuffer: bind label {:?} at offset {}",
664             label,
665             self.cur_offset()
666         );
667         debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
668         debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
669         let offset = self.cur_offset();
670         self.label_offsets[label.0 as usize] = offset;
671         self.lazily_clear_labels_at_tail();
672         self.labels_at_tail.push(label);
673 
674         // Invariants hold: bound offset of label is <= cur_offset (in fact it
675         // is equal). If the `labels_at_tail` list was complete and precise
676         // before, it is still, because we have bound this label to the current
677         // offset and added it to the list (which contains all labels at the
678         // current offset).
679 
680         self.optimize_branches(ctrl_plane);
681 
682         // Post-invariant: by `optimize_branches()` (see argument there).
683     }
684 
685     /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
686     /// offset that it applies to.
687     fn lazily_clear_labels_at_tail(&mut self) {
688         let offset = self.cur_offset();
689         if offset > self.labels_at_tail_off {
690             self.labels_at_tail_off = offset;
691             self.labels_at_tail.clear();
692         }
693 
694         // Post-invariant: either labels_at_tail_off was at cur_offset, and
695         // state is untouched, or was less than cur_offset, in which case the
696         // labels_at_tail list was conceptually empty, and is now actually
697         // empty.
698     }
699 
700     /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
701     pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
702         let mut iters = 0;
703         while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
704             label = self.label_aliases[label.0 as usize];
705             // To protect against an infinite loop (despite our assurances to
706             // ourselves that the invariants make this impossible), assert out
707             // after 1M iterations. The number of basic blocks is limited
708             // in most contexts anyway so this should be impossible to hit with
709             // a legitimate input.
710             iters += 1;
711             assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
712         }
713         self.label_offsets[label.0 as usize]
714 
715         // Post-invariant: no mutations.
716     }
717 
718     /// Emit a reference to the given label with the given reference type (i.e.,
719     /// branch-instruction format) at the current offset.  This is like a
720     /// relocation, but handled internally.
721     ///
722     /// This can be called before the branch is actually emitted; fixups will
723     /// not happen until an island is emitted or the buffer is finished.
724     pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
725         trace!(
726             "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
727             offset, label, kind
728         );
729 
730         // Add the fixup, and update the worst-case island size based on a
731         // veneer for this label use.
732         let fixup = MachLabelFixup {
733             label,
734             offset,
735             kind,
736         };
737         self.pending_fixup_deadline = self.pending_fixup_deadline.min(fixup.deadline());
738         self.pending_fixup_records.push(fixup);
739 
740         // Post-invariant: no mutations to branches/labels data structures.
741     }
742 
743     /// Inform the buffer of an unconditional branch at the given offset,
744     /// targeting the given label. May be used to optimize branches.
745     /// The last added label-use must correspond to this branch.
746     /// This must be called when the current offset is equal to `start`; i.e.,
747     /// before actually emitting the branch. This implies that for a branch that
748     /// uses a label and is eligible for optimizations by the MachBuffer, the
749     /// proper sequence is:
750     ///
751     /// - Call `use_label_at_offset()` to emit the fixup record.
752     /// - Call `add_uncond_branch()` to make note of the branch.
753     /// - Emit the bytes for the branch's machine code.
754     ///
755     /// Additional requirement: no labels may be bound between `start` and `end`
756     /// (exclusive on both ends).
757     pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
758         debug_assert!(
759             !self.open_patchable,
760             "Branch instruction inserted within a patchable region"
761         );
762         assert!(self.cur_offset() == start);
763         debug_assert!(end > start);
764         assert!(!self.pending_fixup_records.is_empty());
765         let fixup = self.pending_fixup_records.len() - 1;
766         self.lazily_clear_labels_at_tail();
767         self.latest_branches.push(MachBranch {
768             start,
769             end,
770             target,
771             fixup,
772             inverted: None,
773             labels_at_this_branch: self.labels_at_tail.clone(),
774         });
775 
776         // Post-invariant: we asserted branch start is current tail; the list of
777         // labels at branch is cloned from list of labels at current tail.
778     }
779 
780     /// Inform the buffer of a conditional branch at the given offset,
781     /// targeting the given label. May be used to optimize branches.
782     /// The last added label-use must correspond to this branch.
783     ///
784     /// Additional requirement: no labels may be bound between `start` and `end`
785     /// (exclusive on both ends).
786     pub fn add_cond_branch(
787         &mut self,
788         start: CodeOffset,
789         end: CodeOffset,
790         target: MachLabel,
791         inverted: &[u8],
792     ) {
793         debug_assert!(
794             !self.open_patchable,
795             "Branch instruction inserted within a patchable region"
796         );
797         assert!(self.cur_offset() == start);
798         debug_assert!(end > start);
799         assert!(!self.pending_fixup_records.is_empty());
800         debug_assert!(
801             inverted.len() == (end - start) as usize,
802             "branch length = {}, but inverted length = {}",
803             end - start,
804             inverted.len()
805         );
806         let fixup = self.pending_fixup_records.len() - 1;
807         let inverted = Some(SmallVec::from(inverted));
808         self.lazily_clear_labels_at_tail();
809         self.latest_branches.push(MachBranch {
810             start,
811             end,
812             target,
813             fixup,
814             inverted,
815             labels_at_this_branch: self.labels_at_tail.clone(),
816         });
817 
818         // Post-invariant: we asserted branch start is current tail; labels at
819         // branch list is cloned from list of labels at current tail.
820     }
821 
822     fn truncate_last_branch(&mut self) {
823         debug_assert!(
824             !self.open_patchable,
825             "Branch instruction truncated within a patchable region"
826         );
827 
828         self.lazily_clear_labels_at_tail();
829         // Invariants hold at this point.
830 
831         let b = self.latest_branches.pop().unwrap();
832         assert!(b.end == self.cur_offset());
833 
834         // State:
835         //    [PRE CODE]
836         //  Offset b.start, b.labels_at_this_branch:
837         //    [BRANCH CODE]
838         //  cur_off, self.labels_at_tail -->
839         //    (end of buffer)
840         self.data.truncate(b.start as usize);
841         self.pending_fixup_records.truncate(b.fixup);
842         while let Some(last_srcloc) = self.srclocs.last_mut() {
843             if last_srcloc.end <= b.start {
844                 break;
845             }
846             if last_srcloc.start < b.start {
847                 last_srcloc.end = b.start;
848                 break;
849             }
850             self.srclocs.pop();
851         }
852         // State:
853         //    [PRE CODE]
854         //  cur_off, Offset b.start, b.labels_at_this_branch:
855         //    (end of buffer)
856         //
857         //  self.labels_at_tail -->  (past end of buffer)
858         let cur_off = self.cur_offset();
859         self.labels_at_tail_off = cur_off;
860         // State:
861         //    [PRE CODE]
862         //  cur_off, Offset b.start, b.labels_at_this_branch,
863         //  self.labels_at_tail:
864         //    (end of buffer)
865         //
866         // resolve_label_offset(l) for l in labels_at_tail:
867         //    (past end of buffer)
868 
869         trace!(
870             "truncate_last_branch: truncated {:?}; off now {}",
871             b, cur_off
872         );
873 
874         // Fix up resolved label offsets for labels at tail.
875         for &l in &self.labels_at_tail {
876             self.label_offsets[l.0 as usize] = cur_off;
877         }
878         // Old labels_at_this_branch are now at cur_off.
879         self.labels_at_tail.extend(b.labels_at_this_branch);
880 
881         // Post-invariant: this operation is defined to truncate the buffer,
882         // which moves cur_off backward, and to move labels at the end of the
883         // buffer back to the start-of-branch offset.
884         //
885         // latest_branches satisfies all invariants:
886         // - it has no branches past the end of the buffer (branches are in
887         //   order, we removed the last one, and we truncated the buffer to just
888         //   before the start of that branch)
889         // - no labels were moved to lower offsets than the (new) cur_off, so
890         //   the labels_at_this_branch list for any other branch need not change.
891         //
892         // labels_at_tail satisfies all invariants:
893         // - all labels that were at the tail after the truncated branch are
894         //   moved backward to just before the branch, which becomes the new tail;
895         //   thus every element in the list should remain (ensured by `.extend()`
896         //   above).
897         // - all labels that refer to the new tail, which is the start-offset of
898         //   the truncated branch, must be present. The `labels_at_this_branch`
899         //   list in the truncated branch's record is a complete and precise list
900         //   of exactly these labels; we append these to labels_at_tail.
901         // - labels_at_tail_off is at cur_off after truncation occurs, so the
902         //   list is valid (not to be lazily cleared).
903         //
904         // The stated operation was performed:
905         // - For each label at the end of the buffer prior to this method, it
906         //   now resolves to the new (truncated) end of the buffer: it must have
907         //   been in `labels_at_tail` (this list is precise and complete, and
908         //   the tail was at the end of the truncated branch on entry), and we
909         //   iterate over this list and set `label_offsets` to the new tail.
910         //   None of these labels could have been an alias (by invariant), so
911         //   `label_offsets` is authoritative for each.
912         // - No other labels will be past the end of the buffer, because of the
913         //   requirement that no labels be bound to the middle of branch ranges
914         //   (see comments to `add_{cond,uncond}_branch()`).
915         // - The buffer is truncated to just before the last branch, and the
916         //   fixup record referring to that last branch is removed.
917     }
918 
919     /// Performs various optimizations on branches pointing at the current label.
920     pub fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) {
921         if ctrl_plane.get_decision() {
922             return;
923         }
924 
925         self.lazily_clear_labels_at_tail();
926         // Invariants valid at this point.
927 
928         trace!(
929             "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
930             self.latest_branches, self.labels_at_tail, self.pending_fixup_records
931         );
932 
933         // We continue to munch on branches at the tail of the buffer until no
934         // more rules apply. Note that the loop only continues if a branch is
935         // actually truncated (or if labels are redirected away from a branch),
936         // so this always makes progress.
937         while let Some(b) = self.latest_branches.last() {
938             let cur_off = self.cur_offset();
939             trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
940             // If there has been any code emission since the end of the last branch or
941             // label definition, then there's nothing we can edit (because we
942             // don't move code once placed, only back up and overwrite), so
943             // clear the records and finish.
944             if b.end < cur_off {
945                 break;
946             }
947 
948             // If the "labels at this branch" list on this branch is
949             // longer than a threshold, don't do any simplification,
950             // and let the branch remain to separate those labels from
951             // the current tail. This avoids quadratic behavior (see
952             // #3468): otherwise, if a long string of "goto next;
953             // next:" patterns are emitted, all of the labels will
954             // coalesce into a long list of aliases for the current
955             // buffer tail. We must track all aliases of the current
956             // tail for correctness, but we are also allowed to skip
957             // optimization (removal) of any branch, so we take the
958             // escape hatch here and let it stand. In effect this
959             // "spreads" the many thousands of labels in the
960             // pathological case among an actual (harmless but
961             // suboptimal) instruction once per N labels.
962             if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
963                 break;
964             }
965 
966             // Invariant: we are looking at a branch that ends at the tail of
967             // the buffer.
968 
969             // For any branch, conditional or unconditional:
970             // - If the target is a label at the current offset, then remove
971             //   the conditional branch, and reset all labels that targeted
972             //   the current offset (end of branch) to the truncated
973             //   end-of-code.
974             //
975             // Preserves execution semantics: a branch to its own fallthrough
976             // address is equivalent to a no-op; in both cases, nextPC is the
977             // fallthrough.
978             if self.resolve_label_offset(b.target) == cur_off {
979                 trace!("branch with target == cur off; truncating");
980                 self.truncate_last_branch();
981                 continue;
982             }
983 
984             // If latest is an unconditional branch:
985             //
986             // - If the branch's target is not its own start address, then for
987             //   each label at the start of branch, make the label an alias of the
988             //   branch target, and remove the label from the "labels at this
989             //   branch" list.
990             //
991             //   - Preserves execution semantics: an unconditional branch's
992             //     only effect is to set PC to a new PC; this change simply
993             //     collapses one step in the step-semantics.
994             //
995             //   - Post-invariant: the labels that were bound to the start of
996             //     this branch become aliases, so they must not be present in any
997             //     labels-at-this-branch list or the labels-at-tail list. The
998             //     labels are removed form the latest-branch record's
999             //     labels-at-this-branch list, and are never placed in the
1000             //     labels-at-tail list. Furthermore, it is correct that they are
1001             //     not in either list, because they are now aliases, and labels
1002             //     that are aliases remain aliases forever.
1003             //
1004             // - If there is a prior unconditional branch that ends just before
1005             //   this one begins, and this branch has no labels bound to its
1006             //   start, then we can truncate this branch, because it is entirely
1007             //   unreachable (we have redirected all labels that make it
1008             //   reachable otherwise). Do so and continue around the loop.
1009             //
1010             //   - Preserves execution semantics: the branch is unreachable,
1011             //     because execution can only flow into an instruction from the
1012             //     prior instruction's fallthrough or from a branch bound to that
1013             //     instruction's start offset. Unconditional branches have no
1014             //     fallthrough, so if the prior instruction is an unconditional
1015             //     branch, no fallthrough entry can happen. The
1016             //     labels-at-this-branch list is complete (by invariant), so if it
1017             //     is empty, then the instruction is entirely unreachable. Thus,
1018             //     it can be removed.
1019             //
1020             //   - Post-invariant: ensured by truncate_last_branch().
1021             //
1022             // - If there is a prior conditional branch whose target label
1023             //   resolves to the current offset (branches around the
1024             //   unconditional branch), then remove the unconditional branch,
1025             //   and make the target of the unconditional the target of the
1026             //   conditional instead.
1027             //
1028             //   - Preserves execution semantics: previously we had:
1029             //
1030             //         L1:
1031             //            cond_br L2
1032             //            br L3
1033             //         L2:
1034             //            (end of buffer)
1035             //
1036             //     by removing the last branch, we have:
1037             //
1038             //         L1:
1039             //            cond_br L2
1040             //         L2:
1041             //            (end of buffer)
1042             //
1043             //     we then fix up the records for the conditional branch to
1044             //     have:
1045             //
1046             //         L1:
1047             //           cond_br.inverted L3
1048             //         L2:
1049             //
1050             //     In the original code, control flow reaches L2 when the
1051             //     conditional branch's predicate is true, and L3 otherwise. In
1052             //     the optimized code, the same is true.
1053             //
1054             //   - Post-invariant: all edits to latest_branches and
1055             //     labels_at_tail are performed by `truncate_last_branch()`,
1056             //     which maintains the invariants at each step.
1057 
1058             if b.is_uncond() {
1059                 // Set any label equal to current branch's start as an alias of
1060                 // the branch's target, if the target is not the branch itself
1061                 // (i.e., an infinite loop).
1062                 //
1063                 // We cannot perform this aliasing if the target of this branch
1064                 // ultimately aliases back here; if so, we need to keep this
1065                 // branch, so break out of this loop entirely (and clear the
1066                 // latest-branches list below).
1067                 //
1068                 // Note that this check is what prevents cycles from forming in
1069                 // `self.label_aliases`. To see why, consider an arbitrary start
1070                 // state:
1071                 //
1072                 // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
1073                 // Ln, which is not aliased.
1074                 //
1075                 // We would create a cycle if we assigned label_aliases[Ln]
1076                 // = L1.  Note that the below assignment is the only write
1077                 // to label_aliases.
1078                 //
1079                 // By our other invariants, we have that Ln (`l` below)
1080                 // resolves to the offset `b.start`, because it is in the
1081                 // set `b.labels_at_this_branch`.
1082                 //
1083                 // If L1 were already aliased, through some arbitrarily deep
1084                 // chain, to Ln, then it must also resolve to this offset
1085                 // `b.start`.
1086                 //
1087                 // By checking the resolution of `L1` against this offset,
1088                 // and aborting this branch-simplification if they are
1089                 // equal, we prevent the below assignment from ever creating
1090                 // a cycle.
1091                 if self.resolve_label_offset(b.target) != b.start {
1092                     let redirected = b.labels_at_this_branch.len();
1093                     for &l in &b.labels_at_this_branch {
1094                         trace!(
1095                             " -> label at start of branch {:?} redirected to target {:?}",
1096                             l, b.target
1097                         );
1098                         self.label_aliases[l.0 as usize] = b.target;
1099                         // NOTE: we continue to ensure the invariant that labels
1100                         // pointing to tail of buffer are in `labels_at_tail`
1101                         // because we already ensured above that the last branch
1102                         // cannot have a target of `cur_off`; so we never have
1103                         // to put the label into `labels_at_tail` when moving it
1104                         // here.
1105                     }
1106                     // Maintain invariant: all branches have been redirected
1107                     // and are no longer pointing at the start of this branch.
1108                     let mut_b = self.latest_branches.last_mut().unwrap();
1109                     mut_b.labels_at_this_branch.clear();
1110 
1111                     if redirected > 0 {
1112                         trace!(" -> after label redirects, restarting loop");
1113                         continue;
1114                     }
1115                 } else {
1116                     break;
1117                 }
1118 
1119                 let b = self.latest_branches.last().unwrap();
1120 
1121                 // Examine any immediately preceding branch.
1122                 if self.latest_branches.len() > 1 {
1123                     let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
1124                     trace!(" -> more than one branch; prev_b = {:?}", prev_b);
1125                     // This uncond is immediately after another uncond; we
1126                     // should have already redirected labels to this uncond away
1127                     // (but check to be sure); so we can truncate this uncond.
1128                     if prev_b.is_uncond()
1129                         && prev_b.end == b.start
1130                         && b.labels_at_this_branch.is_empty()
1131                     {
1132                         trace!(" -> uncond follows another uncond; truncating");
1133                         self.truncate_last_branch();
1134                         continue;
1135                     }
1136 
1137                     // This uncond is immediately after a conditional, and the
1138                     // conditional's target is the end of this uncond, and we've
1139                     // already redirected labels to this uncond away; so we can
1140                     // truncate this uncond, flip the sense of the conditional, and
1141                     // set the conditional's target (in `latest_branches` and in
1142                     // `fixup_records`) to the uncond's target.
1143                     if prev_b.is_cond()
1144                         && prev_b.end == b.start
1145                         && self.resolve_label_offset(prev_b.target) == cur_off
1146                     {
1147                         trace!(
1148                             " -> uncond follows a conditional, and conditional's target resolves to current offset"
1149                         );
1150                         // Save the target of the uncond (this becomes the
1151                         // target of the cond), and truncate the uncond.
1152                         let target = b.target;
1153                         let data = prev_b.inverted.clone().unwrap();
1154                         self.truncate_last_branch();
1155 
1156                         // Mutate the code and cond branch.
1157                         let off_before_edit = self.cur_offset();
1158                         let prev_b = self.latest_branches.last_mut().unwrap();
1159                         let not_inverted = SmallVec::from(
1160                             &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
1161                         );
1162 
1163                         // Low-level edit: replaces bytes of branch with
1164                         // inverted form. cur_off remains the same afterward, so
1165                         // we do not need to modify label data structures.
1166                         self.data.truncate(prev_b.start as usize);
1167                         self.data.extend_from_slice(&data[..]);
1168 
1169                         // Save the original code as the inversion of the
1170                         // inverted branch, in case we later edit this branch
1171                         // again.
1172                         prev_b.inverted = Some(not_inverted);
1173                         self.pending_fixup_records[prev_b.fixup].label = target;
1174                         trace!(" -> reassigning target of condbr to {:?}", target);
1175                         prev_b.target = target;
1176                         debug_assert_eq!(off_before_edit, self.cur_offset());
1177                         continue;
1178                     }
1179                 }
1180             }
1181 
1182             // If we couldn't do anything with the last branch, then break.
1183             break;
1184         }
1185 
1186         self.purge_latest_branches();
1187 
1188         trace!(
1189             "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
1190             self.latest_branches, self.labels_at_tail, self.pending_fixup_records
1191         );
1192     }
1193 
1194     fn purge_latest_branches(&mut self) {
1195         // All of our branch simplification rules work only if a branch ends at
1196         // the tail of the buffer, with no following code; and branches are in
1197         // order in latest_branches; so if the last entry ends prior to
1198         // cur_offset, then clear all entries.
1199         let cur_off = self.cur_offset();
1200         if let Some(l) = self.latest_branches.last() {
1201             if l.end < cur_off {
1202                 trace!("purge_latest_branches: removing branch {:?}", l);
1203                 self.latest_branches.clear();
1204             }
1205         }
1206 
1207         // Post-invariant: no invariant requires any branch to appear in
1208         // `latest_branches`; it is always optional. The list-clear above thus
1209         // preserves all semantics.
1210     }
1211 
1212     /// Emit a trap at some point in the future with the specified code and
1213     /// stack map.
1214     ///
1215     /// This function returns a [`MachLabel`] which will be the future address
1216     /// of the trap. Jumps should refer to this label, likely by using the
1217     /// [`MachBuffer::use_label_at_offset`] method, to get a relocation
1218     /// patched in once the address of the trap is known.
1219     ///
1220     /// This will batch all traps into the end of the function.
1221     pub fn defer_trap(&mut self, code: TrapCode) -> MachLabel {
1222         let label = self.get_label();
1223         self.pending_traps.push(MachLabelTrap {
1224             label,
1225             code,
1226             loc: self.cur_srcloc.map(|(_start, loc)| loc),
1227         });
1228         label
1229     }
1230 
1231     /// Is an island needed within the next N bytes?
1232     pub fn island_needed(&self, distance: CodeOffset) -> bool {
1233         let deadline = match self.fixup_records.peek() {
1234             Some(fixup) => fixup.deadline().min(self.pending_fixup_deadline),
1235             None => self.pending_fixup_deadline,
1236         };
1237         deadline < u32::MAX && self.worst_case_end_of_island(distance) > deadline
1238     }
1239 
1240     /// Returns the maximal offset that islands can reach if `distance` more
1241     /// bytes are appended.
1242     ///
1243     /// This is used to determine if veneers need insertions since jumps that
1244     /// can't reach past this point must get a veneer of some form.
1245     fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
1246         // Assume that all fixups will require veneers and that the veneers are
1247         // the worst-case size for each platform. This is an over-generalization
1248         // to avoid iterating over the `fixup_records` list or maintaining
1249         // information about it as we go along.
1250         let island_worst_case_size = ((self.fixup_records.len() + self.pending_fixup_records.len())
1251             as u32)
1252             * (I::LabelUse::worst_case_veneer_size())
1253             + self.pending_constants_size
1254             + (self.pending_traps.len() * I::TRAP_OPCODE.len()) as u32;
1255         self.cur_offset()
1256             .saturating_add(distance)
1257             .saturating_add(island_worst_case_size)
1258     }
1259 
1260     /// Emit all pending constants and required pending veneers.
1261     ///
1262     /// Should only be called if `island_needed()` returns true, i.e., if we
1263     /// actually reach a deadline. It's not necessarily a problem to do so
1264     /// otherwise but it may result in unnecessary work during emission.
1265     pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) {
1266         self.emit_island_maybe_forced(ForceVeneers::No, distance, ctrl_plane);
1267     }
1268 
1269     /// Same as `emit_island`, but an internal API with a `force_veneers`
1270     /// argument to force all veneers to always get emitted for debugging.
1271     fn emit_island_maybe_forced(
1272         &mut self,
1273         force_veneers: ForceVeneers,
1274         distance: CodeOffset,
1275         ctrl_plane: &mut ControlPlane,
1276     ) {
1277         // We're going to purge fixups, so no latest-branch editing can happen
1278         // anymore.
1279         self.latest_branches.clear();
1280 
1281         // End the current location tracking since anything emitted during this
1282         // function shouldn't be attributed to whatever the current source
1283         // location is.
1284         //
1285         // Note that the current source location, if it's set right now, will be
1286         // restored at the end of this island emission.
1287         let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);
1288         if cur_loc.is_some() {
1289             self.end_srcloc();
1290         }
1291 
1292         let forced_threshold = self.worst_case_end_of_island(distance);
1293 
1294         // First flush out all traps/constants so we have more labels in case
1295         // fixups are applied against these labels.
1296         //
1297         // Note that traps are placed first since this typically happens at the
1298         // end of the function and for disassemblers we try to keep all the code
1299         // contiguously together.
1300         for MachLabelTrap { label, code, loc } in mem::take(&mut self.pending_traps) {
1301             // If this trap has source information associated with it then
1302             // emit this information for the trap instruction going out now too.
1303             if let Some(loc) = loc {
1304                 self.start_srcloc(loc);
1305             }
1306             self.align_to(I::LabelUse::ALIGN);
1307             self.bind_label(label, ctrl_plane);
1308             self.add_trap(code);
1309             self.put_data(I::TRAP_OPCODE);
1310             if loc.is_some() {
1311                 self.end_srcloc();
1312             }
1313         }
1314 
1315         for constant in mem::take(&mut self.pending_constants) {
1316             let MachBufferConstant { align, size, .. } = self.constants[constant];
1317             let label = self.constants[constant].upcoming_label.take().unwrap();
1318             self.align_to(align);
1319             self.bind_label(label, ctrl_plane);
1320             self.used_constants.push((constant, self.cur_offset()));
1321             self.get_appended_space(size);
1322         }
1323 
1324         // Either handle all pending fixups because they're ready or move them
1325         // onto the `BinaryHeap` tracking all pending fixups if they aren't
1326         // ready.
1327         assert!(self.latest_branches.is_empty());
1328         for fixup in mem::take(&mut self.pending_fixup_records) {
1329             if self.should_apply_fixup(&fixup, forced_threshold) {
1330                 self.handle_fixup(fixup, force_veneers, forced_threshold);
1331             } else {
1332                 self.fixup_records.push(fixup);
1333             }
1334         }
1335         self.pending_fixup_deadline = u32::MAX;
1336         while let Some(fixup) = self.fixup_records.peek() {
1337             trace!("emit_island: fixup {:?}", fixup);
1338 
1339             // If this fixup shouldn't be applied, that means its label isn't
1340             // defined yet and there'll be remaining space to apply a veneer if
1341             // necessary in the future after this island. In that situation
1342             // because `fixup_records` is sorted by deadline this loop can
1343             // exit.
1344             if !self.should_apply_fixup(fixup, forced_threshold) {
1345                 break;
1346             }
1347 
1348             let fixup = self.fixup_records.pop().unwrap();
1349             self.handle_fixup(fixup, force_veneers, forced_threshold);
1350         }
1351 
1352         if let Some(loc) = cur_loc {
1353             self.start_srcloc(loc);
1354         }
1355     }
1356 
1357     fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool {
1358         let label_offset = self.resolve_label_offset(fixup.label);
1359         label_offset != UNKNOWN_LABEL_OFFSET || fixup.deadline() < forced_threshold
1360     }
1361 
1362     fn handle_fixup(
1363         &mut self,
1364         fixup: MachLabelFixup<I>,
1365         force_veneers: ForceVeneers,
1366         forced_threshold: CodeOffset,
1367     ) {
1368         let MachLabelFixup {
1369             label,
1370             offset,
1371             kind,
1372         } = fixup;
1373         let start = offset as usize;
1374         let end = (offset + kind.patch_size()) as usize;
1375         let label_offset = self.resolve_label_offset(label);
1376 
1377         if label_offset != UNKNOWN_LABEL_OFFSET {
1378             // If the offset of the label for this fixup is known then
1379             // we're going to do something here-and-now. We're either going
1380             // to patch the original offset because it's an in-bounds jump,
1381             // or we're going to generate a veneer, patch the fixup to jump
1382             // to the veneer, and then keep going.
1383             //
1384             // If the label comes after the original fixup, then we should
1385             // be guaranteed that the jump is in-bounds. Otherwise there's
1386             // a bug somewhere because this method wasn't called soon
1387             // enough. All forward-jumps are tracked and should get veneers
1388             // before their deadline comes and they're unable to jump
1389             // further.
1390             //
1391             // Otherwise if the label is before the fixup, then that's a
1392             // backwards jump. If it's past the maximum negative range
1393             // then we'll emit a veneer that to jump forward to which can
1394             // then jump backwards.
1395             let veneer_required = if label_offset >= offset {
1396                 assert!((label_offset - offset) <= kind.max_pos_range());
1397                 false
1398             } else {
1399                 (offset - label_offset) > kind.max_neg_range()
1400             };
1401             trace!(
1402                 " -> label_offset = {}, known, required = {} (pos {} neg {})",
1403                 label_offset,
1404                 veneer_required,
1405                 kind.max_pos_range(),
1406                 kind.max_neg_range()
1407             );
1408 
1409             if (force_veneers == ForceVeneers::Yes && kind.supports_veneer()) || veneer_required {
1410                 self.emit_veneer(label, offset, kind);
1411             } else {
1412                 let slice = &mut self.data[start..end];
1413                 trace!(
1414                     "patching in-range! slice = {slice:?}; offset = {offset:#x}; label_offset = {label_offset:#x}"
1415                 );
1416                 kind.patch(slice, offset, label_offset);
1417             }
1418         } else {
1419             // If the offset of this label is not known at this time then
1420             // that means that a veneer is required because after this
1421             // island the target can't be in range of the original target.
1422             assert!(forced_threshold - offset > kind.max_pos_range());
1423             self.emit_veneer(label, offset, kind);
1424         }
1425     }
1426 
1427     /// Emits a "veneer" the `kind` code at `offset` to jump to `label`.
1428     ///
1429     /// This will generate extra machine code, using `kind`, to get a
1430     /// larger-jump-kind than `kind` allows. The code at `offset` is then
1431     /// patched to jump to our new code, and then the new code is enqueued for
1432     /// a fixup to get processed at some later time.
1433     fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
1434         // If this `kind` doesn't support a veneer then that's a bug in the
1435         // backend because we need to implement support for such a veneer.
1436         assert!(
1437             kind.supports_veneer(),
1438             "jump beyond the range of {kind:?} but a veneer isn't supported",
1439         );
1440 
1441         // Allocate space for a veneer in the island.
1442         self.align_to(I::LabelUse::ALIGN);
1443         let veneer_offset = self.cur_offset();
1444         trace!("making a veneer at {}", veneer_offset);
1445         let start = offset as usize;
1446         let end = (offset + kind.patch_size()) as usize;
1447         let slice = &mut self.data[start..end];
1448         // Patch the original label use to refer to the veneer.
1449         trace!(
1450             "patching original at offset {} to veneer offset {}",
1451             offset, veneer_offset
1452         );
1453         kind.patch(slice, offset, veneer_offset);
1454         // Generate the veneer.
1455         let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
1456         let (veneer_fixup_off, veneer_label_use) =
1457             kind.generate_veneer(veneer_slice, veneer_offset);
1458         trace!(
1459             "generated veneer; fixup offset {}, label_use {:?}",
1460             veneer_fixup_off, veneer_label_use
1461         );
1462         // Register a new use of `label` with our new veneer fixup and
1463         // offset. This'll recalculate deadlines accordingly and
1464         // enqueue this fixup to get processed at some later
1465         // time.
1466         self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
1467     }
1468 
1469     fn finish_emission_maybe_forcing_veneers(
1470         &mut self,
1471         force_veneers: ForceVeneers,
1472         ctrl_plane: &mut ControlPlane,
1473     ) {
1474         while !self.pending_constants.is_empty()
1475             || !self.pending_traps.is_empty()
1476             || !self.fixup_records.is_empty()
1477             || !self.pending_fixup_records.is_empty()
1478         {
1479             // `emit_island()` will emit any pending veneers and constants, and
1480             // as a side-effect, will also take care of any fixups with resolved
1481             // labels eagerly.
1482             self.emit_island_maybe_forced(force_veneers, u32::MAX, ctrl_plane);
1483         }
1484 
1485         // Ensure that all labels have been fixed up after the last island is emitted. This is a
1486         // full (release-mode) assert because an unresolved label means the emitted code is
1487         // incorrect.
1488         assert!(self.fixup_records.is_empty());
1489         assert!(self.pending_fixup_records.is_empty());
1490     }
1491 
1492     /// Finish any deferred emissions and/or fixups.
1493     pub fn finish(
1494         mut self,
1495         constants: &VCodeConstants,
1496         ctrl_plane: &mut ControlPlane,
1497     ) -> MachBufferFinalized<Stencil> {
1498         let _tt = timing::vcode_emit_finish();
1499 
1500         self.finish_emission_maybe_forcing_veneers(ForceVeneers::No, ctrl_plane);
1501 
1502         let alignment = self.finish_constants(constants);
1503 
1504         // Resolve all labels to their offsets.
1505         let finalized_relocs = self
1506             .relocs
1507             .iter()
1508             .map(|reloc| FinalizedMachReloc {
1509                 offset: reloc.offset,
1510                 kind: reloc.kind,
1511                 addend: reloc.addend,
1512                 target: match &reloc.target {
1513                     RelocTarget::ExternalName(name) => {
1514                         FinalizedRelocTarget::ExternalName(name.clone())
1515                     }
1516                     RelocTarget::Label(label) => {
1517                         FinalizedRelocTarget::Func(self.resolve_label_offset(*label))
1518                     }
1519                 },
1520             })
1521             .collect();
1522 
1523         let finalized_exception_handlers = self
1524             .exception_handlers
1525             .iter()
1526             .map(|(tag, label)| (*tag, self.resolve_label_offset(*label)))
1527             .collect();
1528 
1529         let mut srclocs = self.srclocs;
1530         srclocs.sort_by_key(|entry| entry.start);
1531 
1532         MachBufferFinalized {
1533             data: self.data,
1534             relocs: finalized_relocs,
1535             traps: self.traps,
1536             call_sites: self.call_sites,
1537             exception_handlers: finalized_exception_handlers,
1538             srclocs,
1539             user_stack_maps: self.user_stack_maps,
1540             unwind_info: self.unwind_info,
1541             alignment,
1542         }
1543     }
1544 
1545     /// Add an external relocation at the given offset.
1546     pub fn add_reloc_at_offset<T: Into<RelocTarget> + Clone>(
1547         &mut self,
1548         offset: CodeOffset,
1549         kind: Reloc,
1550         target: &T,
1551         addend: Addend,
1552     ) {
1553         let target: RelocTarget = target.clone().into();
1554         // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally
1555         // generate a label-use statement to track whether an island is possibly
1556         // needed to escape this function to actually get to the external name.
1557         // This is most likely to come up on AArch64 where calls between
1558         // functions use a 26-bit signed offset which gives +/- 64MB. This means
1559         // that if a function is 128MB in size and there's a call in the middle
1560         // it's impossible to reach the actual target. Also, while it's
1561         // technically possible to jump to the start of a function and then jump
1562         // further, island insertion below always inserts islands after
1563         // previously appended code so for Cranelift's own implementation this
1564         // is also a problem for 64MB functions on AArch64 which start with a
1565         // call instruction, those won't be able to escape.
1566         //
1567         // Ideally what needs to happen here is that a `LabelUse` is
1568         // transparently generated (or call-sites of this function are audited
1569         // to generate a `LabelUse` instead) and tracked internally. The actual
1570         // relocation would then change over time if and when a veneer is
1571         // inserted, where the relocation here would be patched by this
1572         // `MachBuffer` to jump to the veneer. The problem, though, is that all
1573         // this still needs to end up, in the case of a singular function,
1574         // generating a final relocation pointing either to this particular
1575         // relocation or to the veneer inserted. Additionally
1576         // `MachBuffer` needs the concept of a label which will never be
1577         // resolved, so `emit_island` doesn't trip over not actually ever
1578         // knowning what some labels are. Currently the loop in
1579         // `finish_emission_maybe_forcing_veneers` would otherwise infinitely
1580         // loop.
1581         //
1582         // For now this means that because relocs aren't tracked at all that
1583         // AArch64 functions have a rough size limits of 64MB. For now that's
1584         // somewhat reasonable and the failure mode is a panic in `MachBuffer`
1585         // when a relocation can't otherwise be resolved later, so it shouldn't
1586         // actually result in any memory unsafety or anything like that.
1587         self.relocs.push(MachReloc {
1588             offset,
1589             kind,
1590             target,
1591             addend,
1592         });
1593     }
1594 
1595     /// Add an external relocation at the current offset.
1596     pub fn add_reloc<T: Into<RelocTarget> + Clone>(
1597         &mut self,
1598         kind: Reloc,
1599         target: &T,
1600         addend: Addend,
1601     ) {
1602         self.add_reloc_at_offset(self.data.len() as CodeOffset, kind, target, addend);
1603     }
1604 
1605     /// Add a trap record at the current offset.
1606     pub fn add_trap(&mut self, code: TrapCode) {
1607         self.traps.push(MachTrap {
1608             offset: self.data.len() as CodeOffset,
1609             code,
1610         });
1611     }
1612 
1613     /// Add a call-site record at the current offset, optionally with exception handlers.
1614     pub fn add_call_site(
1615         &mut self,
1616         exception_handlers: &[(PackedOption<ExceptionTag>, MachLabel)],
1617     ) {
1618         let start = u32::try_from(self.exception_handlers.len()).unwrap();
1619         self.exception_handlers
1620             .extend(exception_handlers.into_iter().copied());
1621         let end = u32::try_from(self.exception_handlers.len()).unwrap();
1622         self.call_sites.push(MachCallSite {
1623             ret_addr: self.data.len() as CodeOffset,
1624             exception_handler_range: start..end,
1625         });
1626     }
1627 
1628     /// Add an unwind record at the current offset.
1629     pub fn add_unwind(&mut self, unwind: UnwindInst) {
1630         self.unwind_info.push((self.cur_offset(), unwind));
1631     }
1632 
1633     /// Set the `SourceLoc` for code from this offset until the offset at the
1634     /// next call to `end_srcloc()`.
1635     /// Returns the current [CodeOffset] and [RelSourceLoc].
1636     pub fn start_srcloc(&mut self, loc: RelSourceLoc) -> (CodeOffset, RelSourceLoc) {
1637         let cur = (self.cur_offset(), loc);
1638         self.cur_srcloc = Some(cur);
1639         cur
1640     }
1641 
1642     /// Mark the end of the `SourceLoc` segment started at the last
1643     /// `start_srcloc()` call.
1644     pub fn end_srcloc(&mut self) {
1645         let (start, loc) = self
1646             .cur_srcloc
1647             .take()
1648             .expect("end_srcloc() called without start_srcloc()");
1649         let end = self.cur_offset();
1650         // Skip zero-length extends.
1651         debug_assert!(end >= start);
1652         if end > start {
1653             self.srclocs.push(MachSrcLoc { start, end, loc });
1654         }
1655     }
1656 
1657     /// Push a user stack map onto this buffer.
1658     ///
1659     /// The stack map is associated with the given `return_addr` code
1660     /// offset. This must be the PC for the instruction just *after* this stack
1661     /// map's associated instruction. For example in the sequence `call $foo;
1662     /// add r8, rax`, the `return_addr` must be the offset of the start of the
1663     /// `add` instruction.
1664     ///
1665     /// Stack maps must be pushed in sorted `return_addr` order.
1666     pub fn push_user_stack_map(
1667         &mut self,
1668         emit_state: &I::State,
1669         return_addr: CodeOffset,
1670         mut stack_map: ir::UserStackMap,
1671     ) {
1672         let span = emit_state.frame_layout().active_size();
1673         trace!("Adding user stack map @ {return_addr:#x} spanning {span} bytes: {stack_map:?}");
1674 
1675         debug_assert!(
1676             self.user_stack_maps
1677                 .last()
1678                 .map_or(true, |(prev_addr, _, _)| *prev_addr < return_addr),
1679             "pushed stack maps out of order: {} is not less than {}",
1680             self.user_stack_maps.last().unwrap().0,
1681             return_addr,
1682         );
1683 
1684         stack_map.finalize(emit_state.frame_layout().sp_to_sized_stack_slots());
1685         self.user_stack_maps.push((return_addr, span, stack_map));
1686     }
1687 
1688     /// Increase the alignment of the buffer to the given alignment if bigger
1689     /// than the current alignment.
1690     pub fn set_log2_min_function_alignment(&mut self, align_to: u8) {
1691         self.min_alignment = self.min_alignment.max(
1692             1u32.checked_shl(u32::from(align_to))
1693                 .expect("log2_min_function_alignment too large"),
1694         );
1695     }
1696 }
1697 
1698 impl<I: VCodeInst> Extend<u8> for MachBuffer<I> {
1699     fn extend<T: IntoIterator<Item = u8>>(&mut self, iter: T) {
1700         for b in iter {
1701             self.put1(b);
1702         }
1703     }
1704 }
1705 
1706 impl<T: CompilePhase> MachBufferFinalized<T> {
1707     /// Get a list of source location mapping tuples in sorted-by-start-offset order.
1708     pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {
1709         &self.srclocs[..]
1710     }
1711 
1712     /// Get the total required size for the code.
1713     pub fn total_size(&self) -> CodeOffset {
1714         self.data.len() as CodeOffset
1715     }
1716 
1717     /// Return the code in this mach buffer as a hex string for testing purposes.
1718     pub fn stringify_code_bytes(&self) -> String {
1719         // This is pretty lame, but whatever ..
1720         use std::fmt::Write;
1721         let mut s = String::with_capacity(self.data.len() * 2);
1722         for b in &self.data {
1723             write!(&mut s, "{b:02X}").unwrap();
1724         }
1725         s
1726     }
1727 
1728     /// Get the code bytes.
1729     pub fn data(&self) -> &[u8] {
1730         // N.B.: we emit every section into the .text section as far as
1731         // the `CodeSink` is concerned; we do not bother to segregate
1732         // the contents into the actual program text, the jumptable and the
1733         // rodata (constant pool). This allows us to generate code assuming
1734         // that these will not be relocated relative to each other, and avoids
1735         // having to designate each section as belonging in one of the three
1736         // fixed categories defined by `CodeSink`. If this becomes a problem
1737         // later (e.g. because of memory permissions or similar), we can
1738         // add this designation and segregate the output; take care, however,
1739         // to add the appropriate relocations in this case.
1740 
1741         &self.data[..]
1742     }
1743 
1744     /// Get the list of external relocations for this code.
1745     pub fn relocs(&self) -> &[FinalizedMachReloc] {
1746         &self.relocs[..]
1747     }
1748 
1749     /// Get the list of trap records for this code.
1750     pub fn traps(&self) -> &[MachTrap] {
1751         &self.traps[..]
1752     }
1753 
1754     /// Get the user stack map metadata for this code.
1755     pub fn user_stack_maps(&self) -> &[(CodeOffset, u32, ir::UserStackMap)] {
1756         &self.user_stack_maps
1757     }
1758 
1759     /// Take this buffer's user strack map metadata.
1760     pub fn take_user_stack_maps(&mut self) -> SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]> {
1761         mem::take(&mut self.user_stack_maps)
1762     }
1763 
1764     /// Get the list of call sites for this code, along with
1765     /// associated exception handlers.
1766     ///
1767     /// Each item yielded by the returned iterator is a struct with:
1768     ///
1769     /// - The call site metadata record, with a `ret_addr` field
1770     ///   directly accessible and denoting the offset of the return
1771     ///   address into this buffer's code.
1772     /// - The slice of pairs of exception tags and code offsets
1773     ///   denoting exception-handler entry points associated with this
1774     ///   call site.
1775     pub fn call_sites(&self) -> impl Iterator<Item = FinalizedMachCallSite<'_>> + '_ {
1776         self.call_sites.iter().map(|call_site| {
1777             let range = call_site.exception_handler_range.clone();
1778             let range = usize::try_from(range.start).unwrap()..usize::try_from(range.end).unwrap();
1779             FinalizedMachCallSite {
1780                 ret_addr: call_site.ret_addr,
1781                 exception_handlers: &self.exception_handlers[range],
1782             }
1783         })
1784     }
1785 }
1786 
1787 /// Metadata about a constant.
1788 struct MachBufferConstant {
1789     /// A label which has not yet been bound which can be used for this
1790     /// constant.
1791     ///
1792     /// This is lazily created when a label is requested for a constant and is
1793     /// cleared when a constant is emitted.
1794     upcoming_label: Option<MachLabel>,
1795     /// Required alignment.
1796     align: CodeOffset,
1797     /// The byte size of this constant.
1798     size: usize,
1799 }
1800 
1801 /// A trap that is deferred to the next time an island is emitted for either
1802 /// traps, constants, or fixups.
1803 struct MachLabelTrap {
1804     /// This label will refer to the trap's offset.
1805     label: MachLabel,
1806     /// The code associated with this trap.
1807     code: TrapCode,
1808     /// An optional source location to assign for this trap.
1809     loc: Option<RelSourceLoc>,
1810 }
1811 
1812 /// A fixup to perform on the buffer once code is emitted. Fixups always refer
1813 /// to labels and patch the code based on label offsets. Hence, they are like
1814 /// relocations, but internal to one buffer.
1815 #[derive(Debug)]
1816 struct MachLabelFixup<I: VCodeInst> {
1817     /// The label whose offset controls this fixup.
1818     label: MachLabel,
1819     /// The offset to fix up / patch to refer to this label.
1820     offset: CodeOffset,
1821     /// The kind of fixup. This is architecture-specific; each architecture may have,
1822     /// e.g., several types of branch instructions, each with differently-sized
1823     /// offset fields and different places within the instruction to place the
1824     /// bits.
1825     kind: I::LabelUse,
1826 }
1827 
1828 impl<I: VCodeInst> MachLabelFixup<I> {
1829     fn deadline(&self) -> CodeOffset {
1830         self.offset.saturating_add(self.kind.max_pos_range())
1831     }
1832 }
1833 
1834 impl<I: VCodeInst> PartialEq for MachLabelFixup<I> {
1835     fn eq(&self, other: &Self) -> bool {
1836         self.deadline() == other.deadline()
1837     }
1838 }
1839 
1840 impl<I: VCodeInst> Eq for MachLabelFixup<I> {}
1841 
1842 impl<I: VCodeInst> PartialOrd for MachLabelFixup<I> {
1843     fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
1844         Some(self.cmp(other))
1845     }
1846 }
1847 
1848 impl<I: VCodeInst> Ord for MachLabelFixup<I> {
1849     fn cmp(&self, other: &Self) -> Ordering {
1850         other.deadline().cmp(&self.deadline())
1851     }
1852 }
1853 
1854 /// A relocation resulting from a compilation.
1855 #[derive(Clone, Debug, PartialEq)]
1856 #[cfg_attr(
1857     feature = "enable-serde",
1858     derive(serde_derive::Serialize, serde_derive::Deserialize)
1859 )]
1860 pub struct MachRelocBase<T> {
1861     /// The offset at which the relocation applies, *relative to the
1862     /// containing section*.
1863     pub offset: CodeOffset,
1864     /// The kind of relocation.
1865     pub kind: Reloc,
1866     /// The external symbol / name to which this relocation refers.
1867     pub target: T,
1868     /// The addend to add to the symbol value.
1869     pub addend: i64,
1870 }
1871 
1872 type MachReloc = MachRelocBase<RelocTarget>;
1873 
1874 /// A relocation resulting from a compilation.
1875 pub type FinalizedMachReloc = MachRelocBase<FinalizedRelocTarget>;
1876 
1877 /// A Relocation target
1878 #[derive(Debug, Clone, PartialEq, Eq, Hash)]
1879 pub enum RelocTarget {
1880     /// Points to an [ExternalName] outside the current function.
1881     ExternalName(ExternalName),
1882     /// Points to a [MachLabel] inside this function.
1883     /// This is different from [MachLabelFixup] in that both the relocation and the
1884     /// label will be emitted and are only resolved at link time.
1885     ///
1886     /// There is no reason to prefer this over [MachLabelFixup] unless the ABI requires it.
1887     Label(MachLabel),
1888 }
1889 
1890 impl From<ExternalName> for RelocTarget {
1891     fn from(name: ExternalName) -> Self {
1892         Self::ExternalName(name)
1893     }
1894 }
1895 
1896 impl From<MachLabel> for RelocTarget {
1897     fn from(label: MachLabel) -> Self {
1898         Self::Label(label)
1899     }
1900 }
1901 
1902 /// A Relocation target
1903 #[derive(Debug, Clone, PartialEq, Eq, Hash)]
1904 #[cfg_attr(
1905     feature = "enable-serde",
1906     derive(serde_derive::Serialize, serde_derive::Deserialize)
1907 )]
1908 pub enum FinalizedRelocTarget {
1909     /// Points to an [ExternalName] outside the current function.
1910     ExternalName(ExternalName),
1911     /// Points to a [CodeOffset] from the start of the current function.
1912     Func(CodeOffset),
1913 }
1914 
1915 impl FinalizedRelocTarget {
1916     /// Returns a display for the current [FinalizedRelocTarget], with extra context to prettify the
1917     /// output.
1918     pub fn display<'a>(&'a self, params: Option<&'a FunctionParameters>) -> String {
1919         match self {
1920             FinalizedRelocTarget::ExternalName(name) => format!("{}", name.display(params)),
1921             FinalizedRelocTarget::Func(offset) => format!("func+{offset}"),
1922         }
1923     }
1924 }
1925 
1926 /// A trap record resulting from a compilation.
1927 #[derive(Clone, Debug, PartialEq)]
1928 #[cfg_attr(
1929     feature = "enable-serde",
1930     derive(serde_derive::Serialize, serde_derive::Deserialize)
1931 )]
1932 pub struct MachTrap {
1933     /// The offset at which the trap instruction occurs, *relative to the
1934     /// containing section*.
1935     pub offset: CodeOffset,
1936     /// The trap code.
1937     pub code: TrapCode,
1938 }
1939 
1940 /// A call site record resulting from a compilation.
1941 #[derive(Clone, Debug, PartialEq)]
1942 #[cfg_attr(
1943     feature = "enable-serde",
1944     derive(serde_derive::Serialize, serde_derive::Deserialize)
1945 )]
1946 pub struct MachCallSite {
1947     /// The offset of the call's return address, *relative to the
1948     /// start of the buffer*.
1949     pub ret_addr: CodeOffset,
1950 
1951     /// Range in `exception_handlers` corresponding to the exception
1952     /// handlers for this callsite.
1953     exception_handler_range: Range<u32>,
1954 }
1955 
1956 /// A call site record resulting from a compilation.
1957 #[derive(Clone, Debug, PartialEq)]
1958 pub struct FinalizedMachCallSite<'a> {
1959     /// The offset of the call's return address, *relative to the
1960     /// start of the buffer*.
1961     pub ret_addr: CodeOffset,
1962 
1963     /// Exception handlers at this callsite, with target offsets
1964     /// *relative to the start of the buffer*.
1965     pub exception_handlers: &'a [(PackedOption<ir::ExceptionTag>, CodeOffset)],
1966 }
1967 
1968 /// A source-location mapping resulting from a compilation.
1969 #[derive(PartialEq, Debug, Clone)]
1970 #[cfg_attr(
1971     feature = "enable-serde",
1972     derive(serde_derive::Serialize, serde_derive::Deserialize)
1973 )]
1974 pub struct MachSrcLoc<T: CompilePhase> {
1975     /// The start of the region of code corresponding to a source location.
1976     /// This is relative to the start of the function, not to the start of the
1977     /// section.
1978     pub start: CodeOffset,
1979     /// The end of the region of code corresponding to a source location.
1980     /// This is relative to the start of the section, not to the start of the
1981     /// section.
1982     pub end: CodeOffset,
1983     /// The source location.
1984     pub loc: T::SourceLocType,
1985 }
1986 
1987 impl MachSrcLoc<Stencil> {
1988     fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {
1989         MachSrcLoc {
1990             start: self.start,
1991             end: self.end,
1992             loc: self.loc.expand(base_srcloc),
1993         }
1994     }
1995 }
1996 
1997 /// Record of branch instruction in the buffer, to facilitate editing.
1998 #[derive(Clone, Debug)]
1999 struct MachBranch {
2000     start: CodeOffset,
2001     end: CodeOffset,
2002     target: MachLabel,
2003     fixup: usize,
2004     inverted: Option<SmallVec<[u8; 8]>>,
2005     /// All labels pointing to the start of this branch. For correctness, this
2006     /// *must* be complete (i.e., must contain all labels whose resolved offsets
2007     /// are at the start of this branch): we rely on being able to redirect all
2008     /// labels that could jump to this branch before removing it, if it is
2009     /// otherwise unreachable.
2010     labels_at_this_branch: SmallVec<[MachLabel; 4]>,
2011 }
2012 
2013 impl MachBranch {
2014     fn is_cond(&self) -> bool {
2015         self.inverted.is_some()
2016     }
2017     fn is_uncond(&self) -> bool {
2018         self.inverted.is_none()
2019     }
2020 }
2021 
2022 /// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
2023 ///
2024 /// Note that `MachBuffer` was primarily written for intra-function references
2025 /// of jumps between basic blocks, but it's also quite usable for entire text
2026 /// sections and resolving references between functions themselves. This
2027 /// builder interprets "blocks" as labeled functions for the purposes of
2028 /// resolving labels internally in the buffer.
2029 pub struct MachTextSectionBuilder<I: VCodeInst> {
2030     buf: MachBuffer<I>,
2031     next_func: usize,
2032     force_veneers: ForceVeneers,
2033 }
2034 
2035 impl<I: VCodeInst> MachTextSectionBuilder<I> {
2036     /// Creates a new text section builder which will have `num_funcs` functions
2037     /// pushed into it.
2038     pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {
2039         let mut buf = MachBuffer::new();
2040         buf.reserve_labels_for_blocks(num_funcs);
2041         MachTextSectionBuilder {
2042             buf,
2043             next_func: 0,
2044             force_veneers: ForceVeneers::No,
2045         }
2046     }
2047 }
2048 
2049 impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
2050     fn append(
2051         &mut self,
2052         labeled: bool,
2053         func: &[u8],
2054         align: u32,
2055         ctrl_plane: &mut ControlPlane,
2056     ) -> u64 {
2057         // Conditionally emit an island if it's necessary to resolve jumps
2058         // between functions which are too far away.
2059         let size = func.len() as u32;
2060         if self.force_veneers == ForceVeneers::Yes || self.buf.island_needed(size) {
2061             self.buf
2062                 .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane);
2063         }
2064 
2065         self.buf.align_to(align);
2066         let pos = self.buf.cur_offset();
2067         if labeled {
2068             self.buf.bind_label(
2069                 MachLabel::from_block(BlockIndex::new(self.next_func)),
2070                 ctrl_plane,
2071             );
2072             self.next_func += 1;
2073         }
2074         self.buf.put_data(func);
2075         u64::from(pos)
2076     }
2077 
2078     fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {
2079         crate::trace!(
2080             "Resolving relocation @ {offset:#x} + {addend:#x} to target {target} of kind {reloc:?}"
2081         );
2082         let label = MachLabel::from_block(BlockIndex::new(target));
2083         let offset = u32::try_from(offset).unwrap();
2084         match I::LabelUse::from_reloc(reloc, addend) {
2085             Some(label_use) => {
2086                 self.buf.use_label_at_offset(offset, label, label_use);
2087                 true
2088             }
2089             None => false,
2090         }
2091     }
2092 
2093     fn force_veneers(&mut self) {
2094         self.force_veneers = ForceVeneers::Yes;
2095     }
2096 
2097     fn write(&mut self, offset: u64, data: &[u8]) {
2098         self.buf.data[offset.try_into().unwrap()..][..data.len()].copy_from_slice(data);
2099     }
2100 
2101     fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> {
2102         // Double-check all functions were pushed.
2103         assert_eq!(self.next_func, self.buf.label_offsets.len());
2104 
2105         // Finish up any veneers, if necessary.
2106         self.buf
2107             .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane);
2108 
2109         // We don't need the data any more, so return it to the caller.
2110         mem::take(&mut self.buf.data).into_vec()
2111     }
2112 }
2113 
2114 // We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
2115 #[cfg(all(test, feature = "arm64"))]
2116 mod test {
2117     use cranelift_entity::EntityRef as _;
2118 
2119     use super::*;
2120     use crate::ir::UserExternalNameRef;
2121     use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
2122     use crate::isa::aarch64::inst::{OperandSize, xreg};
2123     use crate::machinst::{MachInstEmit, MachInstEmitState};
2124     use crate::settings;
2125 
2126     fn label(n: u32) -> MachLabel {
2127         MachLabel::from_block(BlockIndex::new(n as usize))
2128     }
2129     fn target(n: u32) -> BranchTarget {
2130         BranchTarget::Label(label(n))
2131     }
2132 
2133     #[test]
2134     fn test_elide_jump_to_next() {
2135         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2136         let mut buf = MachBuffer::new();
2137         let mut state = <Inst as MachInstEmit>::State::default();
2138         let constants = Default::default();
2139 
2140         buf.reserve_labels_for_blocks(2);
2141         buf.bind_label(label(0), state.ctrl_plane_mut());
2142         let inst = Inst::Jump { dest: target(1) };
2143         inst.emit(&mut buf, &info, &mut state);
2144         buf.bind_label(label(1), state.ctrl_plane_mut());
2145         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2146         assert_eq!(0, buf.total_size());
2147     }
2148 
2149     #[test]
2150     fn test_elide_trivial_jump_blocks() {
2151         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2152         let mut buf = MachBuffer::new();
2153         let mut state = <Inst as MachInstEmit>::State::default();
2154         let constants = Default::default();
2155 
2156         buf.reserve_labels_for_blocks(4);
2157 
2158         buf.bind_label(label(0), state.ctrl_plane_mut());
2159         let inst = Inst::CondBr {
2160             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2161             taken: target(1),
2162             not_taken: target(2),
2163         };
2164         inst.emit(&mut buf, &info, &mut state);
2165 
2166         buf.bind_label(label(1), state.ctrl_plane_mut());
2167         let inst = Inst::Jump { dest: target(3) };
2168         inst.emit(&mut buf, &info, &mut state);
2169 
2170         buf.bind_label(label(2), state.ctrl_plane_mut());
2171         let inst = Inst::Jump { dest: target(3) };
2172         inst.emit(&mut buf, &info, &mut state);
2173 
2174         buf.bind_label(label(3), state.ctrl_plane_mut());
2175 
2176         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2177         assert_eq!(0, buf.total_size());
2178     }
2179 
2180     #[test]
2181     fn test_flip_cond() {
2182         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2183         let mut buf = MachBuffer::new();
2184         let mut state = <Inst as MachInstEmit>::State::default();
2185         let constants = Default::default();
2186 
2187         buf.reserve_labels_for_blocks(4);
2188 
2189         buf.bind_label(label(0), state.ctrl_plane_mut());
2190         let inst = Inst::CondBr {
2191             kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2192             taken: target(1),
2193             not_taken: target(2),
2194         };
2195         inst.emit(&mut buf, &info, &mut state);
2196 
2197         buf.bind_label(label(1), state.ctrl_plane_mut());
2198         let inst = Inst::Nop4;
2199         inst.emit(&mut buf, &info, &mut state);
2200 
2201         buf.bind_label(label(2), state.ctrl_plane_mut());
2202         let inst = Inst::Udf {
2203             trap_code: TrapCode::STACK_OVERFLOW,
2204         };
2205         inst.emit(&mut buf, &info, &mut state);
2206 
2207         buf.bind_label(label(3), state.ctrl_plane_mut());
2208 
2209         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2210 
2211         let mut buf2 = MachBuffer::new();
2212         let mut state = Default::default();
2213         let inst = Inst::TrapIf {
2214             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2215             trap_code: TrapCode::STACK_OVERFLOW,
2216         };
2217         inst.emit(&mut buf2, &info, &mut state);
2218         let inst = Inst::Nop4;
2219         inst.emit(&mut buf2, &info, &mut state);
2220 
2221         let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2222 
2223         assert_eq!(buf.data, buf2.data);
2224     }
2225 
2226     #[test]
2227     fn test_island() {
2228         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2229         let mut buf = MachBuffer::new();
2230         let mut state = <Inst as MachInstEmit>::State::default();
2231         let constants = Default::default();
2232 
2233         buf.reserve_labels_for_blocks(4);
2234 
2235         buf.bind_label(label(0), state.ctrl_plane_mut());
2236         let inst = Inst::CondBr {
2237             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2238             taken: target(2),
2239             not_taken: target(3),
2240         };
2241         inst.emit(&mut buf, &info, &mut state);
2242 
2243         buf.bind_label(label(1), state.ctrl_plane_mut());
2244         while buf.cur_offset() < 2000000 {
2245             if buf.island_needed(0) {
2246                 buf.emit_island(0, state.ctrl_plane_mut());
2247             }
2248             let inst = Inst::Nop4;
2249             inst.emit(&mut buf, &info, &mut state);
2250         }
2251 
2252         buf.bind_label(label(2), state.ctrl_plane_mut());
2253         let inst = Inst::Nop4;
2254         inst.emit(&mut buf, &info, &mut state);
2255 
2256         buf.bind_label(label(3), state.ctrl_plane_mut());
2257         let inst = Inst::Nop4;
2258         inst.emit(&mut buf, &info, &mut state);
2259 
2260         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2261 
2262         assert_eq!(2000000 + 8, buf.total_size());
2263 
2264         let mut buf2 = MachBuffer::new();
2265         let mut state = Default::default();
2266         let inst = Inst::CondBr {
2267             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2268 
2269             // This conditionally taken branch has a 19-bit constant, shifted
2270             // to the left by two, giving us a 21-bit range in total. Half of
2271             // this range positive so the we should be around 1 << 20 bytes
2272             // away for our jump target.
2273             //
2274             // There are two pending fixups by the time we reach this point,
2275             // one for this 19-bit jump and one for the unconditional 26-bit
2276             // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
2277             // veneer is 20 bytes large, which means that pessimistically
2278             // assuming we'll need two veneers. Currently each veneer is
2279             // pessimistically assumed to be the maximal size which means we
2280             // need 40 bytes of extra space, meaning that the actual island
2281             // should come 40-bytes before the deadline.
2282             taken: BranchTarget::ResolvedOffset((1 << 20) - 20 - 20),
2283 
2284             // This branch is in-range so no veneers should be needed, it should
2285             // go directly to the target.
2286             not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
2287         };
2288         inst.emit(&mut buf2, &info, &mut state);
2289 
2290         let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2291 
2292         assert_eq!(&buf.data[0..8], &buf2.data[..]);
2293     }
2294 
2295     #[test]
2296     fn test_island_backward() {
2297         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2298         let mut buf = MachBuffer::new();
2299         let mut state = <Inst as MachInstEmit>::State::default();
2300         let constants = Default::default();
2301 
2302         buf.reserve_labels_for_blocks(4);
2303 
2304         buf.bind_label(label(0), state.ctrl_plane_mut());
2305         let inst = Inst::Nop4;
2306         inst.emit(&mut buf, &info, &mut state);
2307 
2308         buf.bind_label(label(1), state.ctrl_plane_mut());
2309         let inst = Inst::Nop4;
2310         inst.emit(&mut buf, &info, &mut state);
2311 
2312         buf.bind_label(label(2), state.ctrl_plane_mut());
2313         while buf.cur_offset() < 2000000 {
2314             let inst = Inst::Nop4;
2315             inst.emit(&mut buf, &info, &mut state);
2316         }
2317 
2318         buf.bind_label(label(3), state.ctrl_plane_mut());
2319         let inst = Inst::CondBr {
2320             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2321             taken: target(0),
2322             not_taken: target(1),
2323         };
2324         inst.emit(&mut buf, &info, &mut state);
2325 
2326         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2327 
2328         assert_eq!(2000000 + 12, buf.total_size());
2329 
2330         let mut buf2 = MachBuffer::new();
2331         let mut state = Default::default();
2332         let inst = Inst::CondBr {
2333             kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64),
2334             taken: BranchTarget::ResolvedOffset(8),
2335             not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
2336         };
2337         inst.emit(&mut buf2, &info, &mut state);
2338         let inst = Inst::Jump {
2339             dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
2340         };
2341         inst.emit(&mut buf2, &info, &mut state);
2342 
2343         let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2344 
2345         assert_eq!(&buf.data[2000000..], &buf2.data[..]);
2346     }
2347 
2348     #[test]
2349     fn test_multiple_redirect() {
2350         // label0:
2351         //   cbz x0, label1
2352         //   b label2
2353         // label1:
2354         //   b label3
2355         // label2:
2356         //   nop
2357         //   nop
2358         //   b label0
2359         // label3:
2360         //   b label4
2361         // label4:
2362         //   b label5
2363         // label5:
2364         //   b label7
2365         // label6:
2366         //   nop
2367         // label7:
2368         //   ret
2369         //
2370         // -- should become:
2371         //
2372         // label0:
2373         //   cbz x0, label7
2374         // label2:
2375         //   nop
2376         //   nop
2377         //   b label0
2378         // label6:
2379         //   nop
2380         // label7:
2381         //   ret
2382 
2383         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2384         let mut buf = MachBuffer::new();
2385         let mut state = <Inst as MachInstEmit>::State::default();
2386         let constants = Default::default();
2387 
2388         buf.reserve_labels_for_blocks(8);
2389 
2390         buf.bind_label(label(0), state.ctrl_plane_mut());
2391         let inst = Inst::CondBr {
2392             kind: CondBrKind::Zero(xreg(0), OperandSize::Size64),
2393             taken: target(1),
2394             not_taken: target(2),
2395         };
2396         inst.emit(&mut buf, &info, &mut state);
2397 
2398         buf.bind_label(label(1), state.ctrl_plane_mut());
2399         let inst = Inst::Jump { dest: target(3) };
2400         inst.emit(&mut buf, &info, &mut state);
2401 
2402         buf.bind_label(label(2), state.ctrl_plane_mut());
2403         let inst = Inst::Nop4;
2404         inst.emit(&mut buf, &info, &mut state);
2405         inst.emit(&mut buf, &info, &mut state);
2406         let inst = Inst::Jump { dest: target(0) };
2407         inst.emit(&mut buf, &info, &mut state);
2408 
2409         buf.bind_label(label(3), state.ctrl_plane_mut());
2410         let inst = Inst::Jump { dest: target(4) };
2411         inst.emit(&mut buf, &info, &mut state);
2412 
2413         buf.bind_label(label(4), state.ctrl_plane_mut());
2414         let inst = Inst::Jump { dest: target(5) };
2415         inst.emit(&mut buf, &info, &mut state);
2416 
2417         buf.bind_label(label(5), state.ctrl_plane_mut());
2418         let inst = Inst::Jump { dest: target(7) };
2419         inst.emit(&mut buf, &info, &mut state);
2420 
2421         buf.bind_label(label(6), state.ctrl_plane_mut());
2422         let inst = Inst::Nop4;
2423         inst.emit(&mut buf, &info, &mut state);
2424 
2425         buf.bind_label(label(7), state.ctrl_plane_mut());
2426         let inst = Inst::Ret {};
2427         inst.emit(&mut buf, &info, &mut state);
2428 
2429         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2430 
2431         let golden_data = vec![
2432             0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
2433             0x1f, 0x20, 0x03, 0xd5, // nop
2434             0x1f, 0x20, 0x03, 0xd5, // nop
2435             0xfd, 0xff, 0xff, 0x17, // b 0
2436             0x1f, 0x20, 0x03, 0xd5, // nop
2437             0xc0, 0x03, 0x5f, 0xd6, // ret
2438         ];
2439 
2440         assert_eq!(&golden_data[..], &buf.data[..]);
2441     }
2442 
2443     #[test]
2444     fn test_handle_branch_cycle() {
2445         // label0:
2446         //   b label1
2447         // label1:
2448         //   b label2
2449         // label2:
2450         //   b label3
2451         // label3:
2452         //   b label4
2453         // label4:
2454         //   b label1  // note: not label0 (to make it interesting).
2455         //
2456         // -- should become:
2457         //
2458         // label0, label1, ..., label4:
2459         //   b label0
2460         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2461         let mut buf = MachBuffer::new();
2462         let mut state = <Inst as MachInstEmit>::State::default();
2463         let constants = Default::default();
2464 
2465         buf.reserve_labels_for_blocks(5);
2466 
2467         buf.bind_label(label(0), state.ctrl_plane_mut());
2468         let inst = Inst::Jump { dest: target(1) };
2469         inst.emit(&mut buf, &info, &mut state);
2470 
2471         buf.bind_label(label(1), state.ctrl_plane_mut());
2472         let inst = Inst::Jump { dest: target(2) };
2473         inst.emit(&mut buf, &info, &mut state);
2474 
2475         buf.bind_label(label(2), state.ctrl_plane_mut());
2476         let inst = Inst::Jump { dest: target(3) };
2477         inst.emit(&mut buf, &info, &mut state);
2478 
2479         buf.bind_label(label(3), state.ctrl_plane_mut());
2480         let inst = Inst::Jump { dest: target(4) };
2481         inst.emit(&mut buf, &info, &mut state);
2482 
2483         buf.bind_label(label(4), state.ctrl_plane_mut());
2484         let inst = Inst::Jump { dest: target(1) };
2485         inst.emit(&mut buf, &info, &mut state);
2486 
2487         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2488 
2489         let golden_data = vec![
2490             0x00, 0x00, 0x00, 0x14, // b 0
2491         ];
2492 
2493         assert_eq!(&golden_data[..], &buf.data[..]);
2494     }
2495 
2496     #[test]
2497     fn metadata_records() {
2498         let mut buf = MachBuffer::<Inst>::new();
2499         let ctrl_plane = &mut Default::default();
2500         let constants = Default::default();
2501 
2502         buf.reserve_labels_for_blocks(3);
2503 
2504         buf.bind_label(label(0), ctrl_plane);
2505         buf.put1(1);
2506         buf.add_trap(TrapCode::HEAP_OUT_OF_BOUNDS);
2507         buf.put1(2);
2508         buf.add_trap(TrapCode::INTEGER_OVERFLOW);
2509         buf.add_trap(TrapCode::INTEGER_DIVISION_BY_ZERO);
2510         buf.add_call_site(&[
2511             (None.into(), label(1)),
2512             (Some(ExceptionTag::new(42)).into(), label(2)),
2513         ]);
2514         buf.add_reloc(
2515             Reloc::Abs4,
2516             &ExternalName::User(UserExternalNameRef::new(0)),
2517             0,
2518         );
2519         buf.put1(3);
2520         buf.add_reloc(
2521             Reloc::Abs8,
2522             &ExternalName::User(UserExternalNameRef::new(1)),
2523             1,
2524         );
2525         buf.put1(4);
2526         buf.bind_label(label(1), ctrl_plane);
2527         buf.put1(0xff);
2528         buf.bind_label(label(2), ctrl_plane);
2529         buf.put1(0xff);
2530 
2531         let buf = buf.finish(&constants, ctrl_plane);
2532 
2533         assert_eq!(buf.data(), &[1, 2, 3, 4, 0xff, 0xff]);
2534         assert_eq!(
2535             buf.traps()
2536                 .iter()
2537                 .map(|trap| (trap.offset, trap.code))
2538                 .collect::<Vec<_>>(),
2539             vec![
2540                 (1, TrapCode::HEAP_OUT_OF_BOUNDS),
2541                 (2, TrapCode::INTEGER_OVERFLOW),
2542                 (2, TrapCode::INTEGER_DIVISION_BY_ZERO)
2543             ]
2544         );
2545         let call_sites: Vec<_> = buf.call_sites().collect();
2546         assert_eq!(call_sites[0].ret_addr, 2);
2547         assert_eq!(
2548             call_sites[0].exception_handlers,
2549             &[(None.into(), 4), (Some(ExceptionTag::new(42)).into(), 5)]
2550         );
2551         assert_eq!(
2552             buf.relocs()
2553                 .iter()
2554                 .map(|reloc| (reloc.offset, reloc.kind))
2555                 .collect::<Vec<_>>(),
2556             vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
2557         );
2558     }
2559 }
2560