1 //! In-memory representation of compiled machine code, with labels and fixups to
2 //! refer to those labels. Handles constant-pool island insertion and also
3 //! veneer insertion for out-of-range jumps.
4 //!
5 //! This code exists to solve three problems:
6 //!
7 //! - Branch targets for forward branches are not known until later, when we
8 //!   emit code in a single pass through the instruction structs.
9 //!
10 //! - On many architectures, address references or offsets have limited range.
11 //!   For example, on AArch64, conditional branches can only target code +/- 1MB
12 //!   from the branch itself.
13 //!
14 //! - The lowering of control flow from the CFG-with-edges produced by
15 //!   [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
16 //!   edge blocks when the register allocator does not need to insert any
17 //!   spills/reloads/moves in edge blocks, results in many suboptimal branch
18 //!   patterns. The lowering also pays no attention to block order, and so
19 //!   two-target conditional forms (cond-br followed by uncond-br) can often by
20 //!   avoided because one of the targets is the fallthrough. There are several
21 //!   cases here where we can simplify to use fewer branches.
22 //!
23 //! This "buffer" implements a single-pass code emission strategy (with a later
24 //! "fixup" pass, but only through recorded fixups, not all instructions). The
25 //! basic idea is:
26 //!
27 //! - Emit branches as they are, including two-target (cond/uncond) compound
28 //!   forms, but with zero offsets and optimistically assuming the target will be
29 //!   in range. Record the "fixup" for later. Targets are denoted instead by
30 //!   symbolic "labels" that are then bound to certain offsets in the buffer as
31 //!   we emit code. (Nominally, there is a label at the start of every basic
32 //!   block.)
33 //!
34 //! - As we do this, track the offset in the buffer at which the first label
35 //!   reference "goes out of range". We call this the "deadline". If we reach the
36 //!   deadline and we still have not bound the label to which an unresolved branch
37 //!   refers, we have a problem!
38 //!
39 //! - To solve this problem, we emit "islands" full of "veneers". An island is
40 //!   simply a chunk of code inserted in the middle of the code actually produced
41 //!   by the emitter (e.g., vcode iterating over instruction structs). The emitter
42 //!   has some awareness of this: it either asks for an island between blocks, so
43 //!   it is not accidentally executed, or else it emits a branch around the island
44 //!   when all other options fail (see `Inst::EmitIsland` meta-instruction).
45 //!
46 //! - A "veneer" is an instruction (or sequence of instructions) in an "island"
47 //!   that implements a longer-range reference to a label. The idea is that, for
48 //!   example, a branch with a limited range can branch to a "veneer" instead,
49 //!   which is simply a branch in a form that can use a longer-range reference. On
50 //!   AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
51 //!   can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
52 //!   conditional branch's label reference can be fixed up with a "veneer" to
53 //!   achieve a longer range.
54 //!
55 //! - To implement all of this, we require the backend to provide a `LabelUse`
56 //!   type that implements a trait. This is nominally an enum that records one of
57 //!   several kinds of references to an offset in code -- basically, a relocation
58 //!   type -- and will usually correspond to different instruction formats. The
59 //!   `LabelUse` implementation specifies the maximum range, how to patch in the
60 //!   actual label location when known, and how to generate a veneer to extend the
61 //!   range.
62 //!
63 //! That satisfies label references, but we still may have suboptimal branch
64 //! patterns. To clean up the branches, we do a simple "peephole"-style
65 //! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
66 //! informs the buffer of branches in the code and, in the case of conditionals,
67 //! the code that would have been emitted to invert this branch's condition. We
68 //! track the "latest branches": these are branches that are contiguous up to
69 //! the current offset. (If any code is emitted after a branch, that branch or
70 //! run of contiguous branches is no longer "latest".) The latest branches are
71 //! those that we can edit by simply truncating the buffer and doing something
72 //! else instead.
73 //!
74 //! To optimize branches, we implement several simple rules, and try to apply
75 //! them to the "latest branches" when possible:
76 //!
77 //! - A branch with a label target, when that label is bound to the ending
78 //!   offset of the branch (the fallthrough location), can be removed altogether,
79 //!   because the branch would have no effect).
80 //!
81 //! - An unconditional branch that starts at a label location, and branches to
82 //!   another label, results in a "label alias": all references to the label bound
83 //!   *to* this branch instruction are instead resolved to the *target* of the
84 //!   branch instruction. This effectively removes empty blocks that just
85 //!   unconditionally branch to the next block. We call this "branch threading".
86 //!
87 //! - A conditional followed by an unconditional, when the conditional branches
88 //!   to the unconditional's fallthrough, results in (i) the truncation of the
89 //!   unconditional, (ii) the inversion of the condition's condition, and (iii)
90 //!   replacement of the conditional's target (using the original target of the
91 //!   unconditional). This is a fancy way of saying "we can flip a two-target
92 //!   conditional branch's taken/not-taken targets if it works better with our
93 //!   fallthrough". To make this work, the emitter actually gives the buffer
94 //!   *both* forms of every conditional branch: the true form is emitted into the
95 //!   buffer, and the "inverted" machine-code bytes are provided as part of the
96 //!   branch-fixup metadata.
97 //!
98 //! - An unconditional B preceded by another unconditional P, when B's label(s) have
99 //!   been redirected to target(B), can be removed entirely. This is an extension
100 //!   of the branch-threading optimization, and is valid because if we know there
101 //!   will be no fallthrough into this branch instruction (the prior instruction
102 //!   is an unconditional jump), and if we know we have successfully redirected
103 //!   all labels, then this branch instruction is unreachable. Note that this
104 //!   works because the redirection happens before the label is ever resolved
105 //!   (fixups happen at island emission time, at which point latest-branches are
106 //!   cleared, or at the end of emission), so we are sure to catch and redirect
107 //!   all possible paths to this instruction.
108 //!
109 //! # Branch-optimization Correctness
110 //!
111 //! The branch-optimization mechanism depends on a few data structures with
112 //! invariants, which are always held outside the scope of top-level public
113 //! methods:
114 //!
115 //! - The latest-branches list. Each entry describes a span of the buffer
116 //!   (start/end offsets), the label target, the corresponding fixup-list entry
117 //!   index, and the bytes (must be the same length) for the inverted form, if
118 //!   conditional. The list of labels that are bound to the start-offset of this
119 //!   branch is *complete* (if any label has a resolved offset equal to `start`
120 //!   and is not an alias, it must appear in this list) and *precise* (no label
121 //!   in this list can be bound to another offset). No label in this list should
122 //!   be an alias.  No two branch ranges can overlap, and branches are in
123 //!   ascending-offset order.
124 //!
125 //! - The labels-at-tail list. This contains all MachLabels that have been bound
126 //!   to (whose resolved offsets are equal to) the tail offset of the buffer.
127 //!   No label in this list should be an alias.
128 //!
129 //! - The label_offsets array, containing the bound offset of a label or
130 //!   UNKNOWN. No label can be bound at an offset greater than the current
131 //!   buffer tail.
132 //!
133 //! - The label_aliases array, containing another label to which a label is
134 //!   bound or UNKNOWN. A label's resolved offset is the resolved offset
135 //!   of the label it is aliased to, if this is set.
136 //!
137 //! We argue below, at each method, how the invariants in these data structures
138 //! are maintained (grep for "Post-invariant").
139 //!
140 //! Given these invariants, we argue why each optimization preserves execution
141 //! semantics below (grep for "Preserves execution semantics").
142 
143 use crate::binemit::{Addend, CodeOffset, Reloc, StackMap};
144 use crate::ir::{ExternalName, Opcode, RelSourceLoc, SourceLoc, TrapCode};
145 use crate::isa::unwind::UnwindInst;
146 use crate::machinst::{
147     BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
148 };
149 use crate::timing;
150 use crate::trace;
151 use cranelift_control::ControlPlane;
152 use cranelift_entity::{entity_impl, PrimaryMap};
153 use smallvec::SmallVec;
154 use std::convert::TryFrom;
155 use std::mem;
156 use std::string::String;
157 use std::vec::Vec;
158 
159 #[cfg(feature = "enable-serde")]
160 use serde::{Deserialize, Serialize};
161 
162 #[cfg(feature = "enable-serde")]
163 pub trait CompilePhase {
164     type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
165     type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone;
166 }
167 
168 #[cfg(not(feature = "enable-serde"))]
169 pub trait CompilePhase {
170     type MachSrcLocType: core::fmt::Debug + PartialEq + Clone;
171     type SourceLocType: core::fmt::Debug + PartialEq + Clone;
172 }
173 
174 /// Status of a compiled artifact that needs patching before being used.
175 #[derive(Clone, Debug, PartialEq)]
176 #[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))]
177 pub struct Stencil;
178 
179 /// Status of a compiled artifact ready to use.
180 #[derive(Clone, Debug, PartialEq)]
181 pub struct Final;
182 
183 impl CompilePhase for Stencil {
184     type MachSrcLocType = MachSrcLoc<Stencil>;
185     type SourceLocType = RelSourceLoc;
186 }
187 
188 impl CompilePhase for Final {
189     type MachSrcLocType = MachSrcLoc<Final>;
190     type SourceLocType = SourceLoc;
191 }
192 
193 /// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
194 /// in bulk.
195 ///
196 /// This struct uses `SmallVec`s to support small-ish function bodies without
197 /// any heap allocation. As such, it will be several kilobytes large. This is
198 /// likely fine as long as it is stack-allocated for function emission then
199 /// thrown away; but beware if many buffer objects are retained persistently.
200 pub struct MachBuffer<I: VCodeInst> {
201     /// The buffer contents, as raw bytes.
202     data: SmallVec<[u8; 1024]>,
203     /// Any relocations referring to this code. Note that only *external*
204     /// relocations are tracked here; references to labels within the buffer are
205     /// resolved before emission.
206     relocs: SmallVec<[MachReloc; 16]>,
207     /// Any trap records referring to this code.
208     traps: SmallVec<[MachTrap; 16]>,
209     /// Any call site records referring to this code.
210     call_sites: SmallVec<[MachCallSite; 16]>,
211     /// Any source location mappings referring to this code.
212     srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>,
213     /// Any stack maps referring to this code.
214     stack_maps: SmallVec<[MachStackMap; 8]>,
215     /// Any unwind info at a given location.
216     unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
217     /// The current source location in progress (after `start_srcloc()` and
218     /// before `end_srcloc()`).  This is a (start_offset, src_loc) tuple.
219     cur_srcloc: Option<(CodeOffset, RelSourceLoc)>,
220     /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
221     label_offsets: SmallVec<[CodeOffset; 16]>,
222     /// Label aliases: when one label points to an unconditional jump, and that
223     /// jump points to another label, we can redirect references to the first
224     /// label immediately to the second.
225     ///
226     /// Invariant: we don't have label-alias cycles. We ensure this by,
227     /// before setting label A to alias label B, resolving B's alias
228     /// target (iteratively until a non-aliased label); if B is already
229     /// aliased to A, then we cannot alias A back to B.
230     label_aliases: SmallVec<[MachLabel; 16]>,
231     /// Constants that must be emitted at some point.
232     pending_constants: SmallVec<[VCodeConstant; 16]>,
233     /// Traps that must be emitted at some point.
234     pending_traps: SmallVec<[MachLabelTrap; 16]>,
235     /// Fixups that must be performed after all code is emitted.
236     fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
237     /// Current deadline at which all constants are flushed and all code labels
238     /// are extended by emitting long-range jumps in an island. This flush
239     /// should be rare (e.g., on AArch64, the shortest-range PC-rel references
240     /// are +/- 1MB for conditional jumps and load-literal instructions), so
241     /// it's acceptable to track a minimum and flush-all rather than doing more
242     /// detailed "current minimum" / sort-by-deadline trickery.
243     island_deadline: CodeOffset,
244     /// How many bytes are needed in the worst case for an island, given all
245     /// pending constants and fixups.
246     island_worst_case_size: CodeOffset,
247     /// Latest branches, to facilitate in-place editing for better fallthrough
248     /// behavior and empty-block removal.
249     latest_branches: SmallVec<[MachBranch; 4]>,
250     /// All labels at the current offset (emission tail). This is lazily
251     /// cleared: it is actually accurate as long as the current offset is
252     /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
253     /// be considered as empty.
254     ///
255     /// For correctness, this *must* be complete (i.e., the vector must contain
256     /// all labels whose offsets are resolved to the current tail), because we
257     /// rely on it to update labels when we truncate branches.
258     labels_at_tail: SmallVec<[MachLabel; 4]>,
259     /// The last offset at which `labels_at_tail` is valid. It is conceptually
260     /// always describing the tail of the buffer, but we do not clear
261     /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
262     /// when the offset has grown past this (`labels_at_tail_off`) point.
263     /// Always <= `cur_offset()`.
264     labels_at_tail_off: CodeOffset,
265     /// Metadata about all constants that this function has access to.
266     ///
267     /// This records the size/alignment of all constants (not the actual data)
268     /// along with the last available label generated for the constant. This map
269     /// is consulted when constants are referred to and the label assigned to a
270     /// constant may change over time as well.
271     constants: PrimaryMap<VCodeConstant, MachBufferConstant>,
272     /// All recorded usages of constants as pairs of the constant and where the
273     /// constant needs to be placed within `self.data`. Note that the same
274     /// constant may appear in this array multiple times if it was emitted
275     /// multiple times.
276     used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>,
277 }
278 
279 impl MachBufferFinalized<Stencil> {
280     /// Get a finalized machine buffer by applying the function's base source location.
281     pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> {
282         MachBufferFinalized {
283             data: self.data,
284             relocs: self.relocs,
285             traps: self.traps,
286             call_sites: self.call_sites,
287             srclocs: self
288                 .srclocs
289                 .into_iter()
290                 .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc))
291                 .collect(),
292             stack_maps: self.stack_maps,
293             unwind_info: self.unwind_info,
294             alignment: self.alignment,
295         }
296     }
297 }
298 
299 /// A `MachBuffer` once emission is completed: holds generated code and records,
300 /// without fixups. This allows the type to be independent of the backend.
301 #[derive(PartialEq, Debug, Clone)]
302 #[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
303 pub struct MachBufferFinalized<T: CompilePhase> {
304     /// The buffer contents, as raw bytes.
305     pub(crate) data: SmallVec<[u8; 1024]>,
306     /// Any relocations referring to this code. Note that only *external*
307     /// relocations are tracked here; references to labels within the buffer are
308     /// resolved before emission.
309     pub(crate) relocs: SmallVec<[MachReloc; 16]>,
310     /// Any trap records referring to this code.
311     pub(crate) traps: SmallVec<[MachTrap; 16]>,
312     /// Any call site records referring to this code.
313     pub(crate) call_sites: SmallVec<[MachCallSite; 16]>,
314     /// Any source location mappings referring to this code.
315     pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>,
316     /// Any stack maps referring to this code.
317     pub(crate) stack_maps: SmallVec<[MachStackMap; 8]>,
318     /// Any unwind info at a given location.
319     pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
320     /// The requireed alignment of this buffer
321     pub alignment: u32,
322 }
323 
324 const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
325 const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);
326 
327 /// Threshold on max length of `labels_at_this_branch` list to avoid
328 /// unbounded quadratic behavior (see comment below at use-site).
329 const LABEL_LIST_THRESHOLD: usize = 100;
330 
331 /// A label refers to some offset in a `MachBuffer`. It may not be resolved at
332 /// the point at which it is used by emitted code; the buffer records "fixups"
333 /// for references to the label, and will come back and patch the code
334 /// appropriately when the label's location is eventually known.
335 #[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
336 pub struct MachLabel(u32);
337 entity_impl!(MachLabel);
338 
339 impl MachLabel {
340     /// Get a label for a block. (The first N MachLabels are always reseved for
341     /// the N blocks in the vcode.)
342     pub fn from_block(bindex: BlockIndex) -> MachLabel {
343         MachLabel(bindex.index() as u32)
344     }
345 
346     /// Get the numeric label index.
347     pub fn get(self) -> u32 {
348         self.0
349     }
350 
351     /// Creates a string representing this label, for convenience.
352     pub fn to_string(&self) -> String {
353         format!("label{}", self.0)
354     }
355 }
356 
357 impl Default for MachLabel {
358     fn default() -> Self {
359         UNKNOWN_LABEL
360     }
361 }
362 
363 /// A stack map extent, when creating a stack map.
364 pub enum StackMapExtent {
365     /// The stack map starts at this instruction, and ends after the number of upcoming bytes
366     /// (note: this is a code offset diff).
367     UpcomingBytes(CodeOffset),
368 
369     /// The stack map started at the given offset and ends at the current one. This helps
370     /// architectures where the instruction size has not a fixed length.
371     StartedAtOffset(CodeOffset),
372 }
373 
374 impl<I: VCodeInst> MachBuffer<I> {
375     /// Create a new section, known to start at `start_offset` and with a size limited to
376     /// `length_limit`.
377     pub fn new() -> MachBuffer<I> {
378         MachBuffer {
379             data: SmallVec::new(),
380             relocs: SmallVec::new(),
381             traps: SmallVec::new(),
382             call_sites: SmallVec::new(),
383             srclocs: SmallVec::new(),
384             stack_maps: SmallVec::new(),
385             unwind_info: SmallVec::new(),
386             cur_srcloc: None,
387             label_offsets: SmallVec::new(),
388             label_aliases: SmallVec::new(),
389             pending_constants: SmallVec::new(),
390             pending_traps: SmallVec::new(),
391             fixup_records: SmallVec::new(),
392             island_deadline: UNKNOWN_LABEL_OFFSET,
393             island_worst_case_size: 0,
394             latest_branches: SmallVec::new(),
395             labels_at_tail: SmallVec::new(),
396             labels_at_tail_off: 0,
397             constants: Default::default(),
398             used_constants: Default::default(),
399         }
400     }
401 
402     /// Current offset from start of buffer.
403     pub fn cur_offset(&self) -> CodeOffset {
404         self.data.len() as CodeOffset
405     }
406 
407     /// Add a byte.
408     pub fn put1(&mut self, value: u8) {
409         trace!("MachBuffer: put byte @ {}: {:x}", self.cur_offset(), value);
410         self.data.push(value);
411 
412         // Post-invariant: conceptual-labels_at_tail contains a complete and
413         // precise list of labels bound at `cur_offset()`. We have advanced
414         // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
415         // before, it is not anymore (and it cannot become equal, because
416         // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
417         // conceptually empty (even though it is only lazily cleared). No labels
418         // can be bound at this new offset (by invariant on `label_offsets`).
419         // Hence the invariant holds.
420     }
421 
422     /// Add 2 bytes.
423     pub fn put2(&mut self, value: u16) {
424         trace!(
425             "MachBuffer: put 16-bit word @ {}: {:x}",
426             self.cur_offset(),
427             value
428         );
429         let bytes = value.to_le_bytes();
430         self.data.extend_from_slice(&bytes[..]);
431 
432         // Post-invariant: as for `put1()`.
433     }
434 
435     /// Add 4 bytes.
436     pub fn put4(&mut self, value: u32) {
437         trace!(
438             "MachBuffer: put 32-bit word @ {}: {:x}",
439             self.cur_offset(),
440             value
441         );
442         let bytes = value.to_le_bytes();
443         self.data.extend_from_slice(&bytes[..]);
444 
445         // Post-invariant: as for `put1()`.
446     }
447 
448     /// Add 8 bytes.
449     pub fn put8(&mut self, value: u64) {
450         trace!(
451             "MachBuffer: put 64-bit word @ {}: {:x}",
452             self.cur_offset(),
453             value
454         );
455         let bytes = value.to_le_bytes();
456         self.data.extend_from_slice(&bytes[..]);
457 
458         // Post-invariant: as for `put1()`.
459     }
460 
461     /// Add a slice of bytes.
462     pub fn put_data(&mut self, data: &[u8]) {
463         trace!(
464             "MachBuffer: put data @ {}: len {}",
465             self.cur_offset(),
466             data.len()
467         );
468         self.data.extend_from_slice(data);
469 
470         // Post-invariant: as for `put1()`.
471     }
472 
473     /// Reserve appended space and return a mutable slice referring to it.
474     pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
475         trace!("MachBuffer: put data @ {}: len {}", self.cur_offset(), len);
476         let off = self.data.len();
477         let new_len = self.data.len() + len;
478         self.data.resize(new_len, 0);
479         &mut self.data[off..]
480 
481         // Post-invariant: as for `put1()`.
482     }
483 
484     /// Align up to the given alignment.
485     pub fn align_to(&mut self, align_to: CodeOffset) {
486         trace!("MachBuffer: align to {}", align_to);
487         assert!(
488             align_to.is_power_of_two(),
489             "{} is not a power of two",
490             align_to
491         );
492         while self.cur_offset() & (align_to - 1) != 0 {
493             self.put1(0);
494         }
495 
496         // Post-invariant: as for `put1()`.
497     }
498 
499     /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
500     /// offset yet.
501     pub fn get_label(&mut self) -> MachLabel {
502         let l = self.label_offsets.len() as u32;
503         self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
504         self.label_aliases.push(UNKNOWN_LABEL);
505         trace!("MachBuffer: new label -> {:?}", MachLabel(l));
506         MachLabel(l)
507 
508         // Post-invariant: the only mutation is to add a new label; it has no
509         // bound offset yet, so it trivially satisfies all invariants.
510     }
511 
512     /// Reserve the first N MachLabels for blocks.
513     pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
514         trace!("MachBuffer: first {} labels are for blocks", blocks);
515         debug_assert!(self.label_offsets.is_empty());
516         self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
517         self.label_aliases.resize(blocks, UNKNOWN_LABEL);
518 
519         // Post-invariant: as for `get_label()`.
520     }
521 
522     /// Registers metadata in this `MachBuffer` about the `constants` provided.
523     ///
524     /// This will record the size/alignment of all constants which will prepare
525     /// them for emission later on.
526     pub fn register_constants(&mut self, constants: &VCodeConstants) {
527         for (c, val) in constants.iter() {
528             let c2 = self.constants.push(MachBufferConstant {
529                 upcoming_label: None,
530                 align: val.alignment(),
531                 size: val.as_slice().len(),
532             });
533             assert_eq!(c, c2);
534         }
535     }
536 
537     /// Completes constant emission by iterating over `self.used_constants` and
538     /// filling in the "holes" with the constant values provided by `constants`.
539     ///
540     /// Returns the alignment required for this entire buffer. Alignment starts
541     /// at the ISA's minimum function alignment and can be increased due to
542     /// constant requirements.
543     fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 {
544         let mut alignment = I::function_alignment().minimum;
545         for (constant, offset) in mem::take(&mut self.used_constants) {
546             let constant = constants.get(constant);
547             let data = constant.as_slice();
548             self.data[offset as usize..][..data.len()].copy_from_slice(data);
549             alignment = constant.alignment().max(alignment);
550         }
551         alignment
552     }
553 
554     /// Returns a label that can be used to refer to the `constant` provided.
555     ///
556     /// This will automatically defer a new constant to be emitted for
557     /// `constant` if it has not been previously emitted. Note that this
558     /// function may return a different label for the same constant at
559     /// different points in time. The label is valid to use only from the
560     /// current location; the MachBuffer takes care to emit the same constant
561     /// multiple times if needed so the constant is always in range.
562     pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel {
563         let MachBufferConstant {
564             align,
565             size,
566             upcoming_label,
567         } = self.constants[constant];
568         if let Some(label) = upcoming_label {
569             return label;
570         }
571 
572         let label = self.get_label();
573         trace!(
574             "defer constant: eventually emit {size} bytes aligned \
575              to {align} at label {label:?}",
576         );
577         self.update_deadline(size, u32::MAX);
578         self.pending_constants.push(constant);
579         self.constants[constant].upcoming_label = Some(label);
580         label
581     }
582 
583     /// Bind a label to the current offset. A label can only be bound once.
584     pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) {
585         trace!(
586             "MachBuffer: bind label {:?} at offset {}",
587             label,
588             self.cur_offset()
589         );
590         debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
591         debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
592         let offset = self.cur_offset();
593         self.label_offsets[label.0 as usize] = offset;
594         self.lazily_clear_labels_at_tail();
595         self.labels_at_tail.push(label);
596 
597         // Invariants hold: bound offset of label is <= cur_offset (in fact it
598         // is equal). If the `labels_at_tail` list was complete and precise
599         // before, it is still, because we have bound this label to the current
600         // offset and added it to the list (which contains all labels at the
601         // current offset).
602 
603         self.optimize_branches(ctrl_plane);
604 
605         // Post-invariant: by `optimize_branches()` (see argument there).
606     }
607 
608     /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
609     /// offset that it applies to.
610     fn lazily_clear_labels_at_tail(&mut self) {
611         let offset = self.cur_offset();
612         if offset > self.labels_at_tail_off {
613             self.labels_at_tail_off = offset;
614             self.labels_at_tail.clear();
615         }
616 
617         // Post-invariant: either labels_at_tail_off was at cur_offset, and
618         // state is untouched, or was less than cur_offset, in which case the
619         // labels_at_tail list was conceptually empty, and is now actually
620         // empty.
621     }
622 
623     /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
624     pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
625         let mut iters = 0;
626         while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
627             label = self.label_aliases[label.0 as usize];
628             // To protect against an infinite loop (despite our assurances to
629             // ourselves that the invariants make this impossible), assert out
630             // after 1M iterations. The number of basic blocks is limited
631             // in most contexts anyway so this should be impossible to hit with
632             // a legitimate input.
633             iters += 1;
634             assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
635         }
636         self.label_offsets[label.0 as usize]
637 
638         // Post-invariant: no mutations.
639     }
640 
641     /// Emit a reference to the given label with the given reference type (i.e.,
642     /// branch-instruction format) at the current offset.  This is like a
643     /// relocation, but handled internally.
644     ///
645     /// This can be called before the branch is actually emitted; fixups will
646     /// not happen until an island is emitted or the buffer is finished.
647     pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
648         trace!(
649             "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
650             offset,
651             label,
652             kind
653         );
654 
655         // Add the fixup, and update the worst-case island size based on a
656         // veneer for this label use.
657         self.fixup_records.push(MachLabelFixup {
658             label,
659             offset,
660             kind,
661         });
662         if kind.supports_veneer() {
663             self.island_worst_case_size += kind.veneer_size();
664             self.island_worst_case_size &= !(I::LabelUse::ALIGN - 1);
665         }
666         let deadline = offset.saturating_add(kind.max_pos_range());
667         if deadline < self.island_deadline {
668             self.island_deadline = deadline;
669         }
670 
671         // Post-invariant: no mutations to branches/labels data structures.
672     }
673 
674     /// Inform the buffer of an unconditional branch at the given offset,
675     /// targetting the given label. May be used to optimize branches.
676     /// The last added label-use must correspond to this branch.
677     /// This must be called when the current offset is equal to `start`; i.e.,
678     /// before actually emitting the branch. This implies that for a branch that
679     /// uses a label and is eligible for optimizations by the MachBuffer, the
680     /// proper sequence is:
681     ///
682     /// - Call `use_label_at_offset()` to emit the fixup record.
683     /// - Call `add_uncond_branch()` to make note of the branch.
684     /// - Emit the bytes for the branch's machine code.
685     ///
686     /// Additional requirement: no labels may be bound between `start` and `end`
687     /// (exclusive on both ends).
688     pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
689         assert!(self.cur_offset() == start);
690         debug_assert!(end > start);
691         assert!(!self.fixup_records.is_empty());
692         let fixup = self.fixup_records.len() - 1;
693         self.lazily_clear_labels_at_tail();
694         self.latest_branches.push(MachBranch {
695             start,
696             end,
697             target,
698             fixup,
699             inverted: None,
700             labels_at_this_branch: self.labels_at_tail.clone(),
701         });
702 
703         // Post-invariant: we asserted branch start is current tail; the list of
704         // labels at branch is cloned from list of labels at current tail.
705     }
706 
707     /// Inform the buffer of a conditional branch at the given offset,
708     /// targetting the given label. May be used to optimize branches.
709     /// The last added label-use must correspond to this branch.
710     ///
711     /// Additional requirement: no labels may be bound between `start` and `end`
712     /// (exclusive on both ends).
713     pub fn add_cond_branch(
714         &mut self,
715         start: CodeOffset,
716         end: CodeOffset,
717         target: MachLabel,
718         inverted: &[u8],
719     ) {
720         assert!(self.cur_offset() == start);
721         debug_assert!(end > start);
722         assert!(!self.fixup_records.is_empty());
723         debug_assert!(inverted.len() == (end - start) as usize);
724         let fixup = self.fixup_records.len() - 1;
725         let inverted = Some(SmallVec::from(inverted));
726         self.lazily_clear_labels_at_tail();
727         self.latest_branches.push(MachBranch {
728             start,
729             end,
730             target,
731             fixup,
732             inverted,
733             labels_at_this_branch: self.labels_at_tail.clone(),
734         });
735 
736         // Post-invariant: we asserted branch start is current tail; labels at
737         // branch list is cloned from list of labels at current tail.
738     }
739 
740     fn truncate_last_branch(&mut self) {
741         self.lazily_clear_labels_at_tail();
742         // Invariants hold at this point.
743 
744         let b = self.latest_branches.pop().unwrap();
745         assert!(b.end == self.cur_offset());
746 
747         // State:
748         //    [PRE CODE]
749         //  Offset b.start, b.labels_at_this_branch:
750         //    [BRANCH CODE]
751         //  cur_off, self.labels_at_tail -->
752         //    (end of buffer)
753         self.data.truncate(b.start as usize);
754         self.fixup_records.truncate(b.fixup);
755         while let Some(last_srcloc) = self.srclocs.last_mut() {
756             if last_srcloc.end <= b.start {
757                 break;
758             }
759             if last_srcloc.start < b.start {
760                 last_srcloc.end = b.start;
761                 break;
762             }
763             self.srclocs.pop();
764         }
765         // State:
766         //    [PRE CODE]
767         //  cur_off, Offset b.start, b.labels_at_this_branch:
768         //    (end of buffer)
769         //
770         //  self.labels_at_tail -->  (past end of buffer)
771         let cur_off = self.cur_offset();
772         self.labels_at_tail_off = cur_off;
773         // State:
774         //    [PRE CODE]
775         //  cur_off, Offset b.start, b.labels_at_this_branch,
776         //  self.labels_at_tail:
777         //    (end of buffer)
778         //
779         // resolve_label_offset(l) for l in labels_at_tail:
780         //    (past end of buffer)
781 
782         trace!(
783             "truncate_last_branch: truncated {:?}; off now {}",
784             b,
785             cur_off
786         );
787 
788         // Fix up resolved label offsets for labels at tail.
789         for &l in &self.labels_at_tail {
790             self.label_offsets[l.0 as usize] = cur_off;
791         }
792         // Old labels_at_this_branch are now at cur_off.
793         self.labels_at_tail
794             .extend(b.labels_at_this_branch.into_iter());
795 
796         // Post-invariant: this operation is defined to truncate the buffer,
797         // which moves cur_off backward, and to move labels at the end of the
798         // buffer back to the start-of-branch offset.
799         //
800         // latest_branches satisfies all invariants:
801         // - it has no branches past the end of the buffer (branches are in
802         //   order, we removed the last one, and we truncated the buffer to just
803         //   before the start of that branch)
804         // - no labels were moved to lower offsets than the (new) cur_off, so
805         //   the labels_at_this_branch list for any other branch need not change.
806         //
807         // labels_at_tail satisfies all invariants:
808         // - all labels that were at the tail after the truncated branch are
809         //   moved backward to just before the branch, which becomes the new tail;
810         //   thus every element in the list should remain (ensured by `.extend()`
811         //   above).
812         // - all labels that refer to the new tail, which is the start-offset of
813         //   the truncated branch, must be present. The `labels_at_this_branch`
814         //   list in the truncated branch's record is a complete and precise list
815         //   of exactly these labels; we append these to labels_at_tail.
816         // - labels_at_tail_off is at cur_off after truncation occurs, so the
817         //   list is valid (not to be lazily cleared).
818         //
819         // The stated operation was performed:
820         // - For each label at the end of the buffer prior to this method, it
821         //   now resolves to the new (truncated) end of the buffer: it must have
822         //   been in `labels_at_tail` (this list is precise and complete, and
823         //   the tail was at the end of the truncated branch on entry), and we
824         //   iterate over this list and set `label_offsets` to the new tail.
825         //   None of these labels could have been an alias (by invariant), so
826         //   `label_offsets` is authoritative for each.
827         // - No other labels will be past the end of the buffer, because of the
828         //   requirement that no labels be bound to the middle of branch ranges
829         //   (see comments to `add_{cond,uncond}_branch()`).
830         // - The buffer is truncated to just before the last branch, and the
831         //   fixup record referring to that last branch is removed.
832     }
833 
834     fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) {
835         if ctrl_plane.get_decision() {
836             return;
837         }
838 
839         self.lazily_clear_labels_at_tail();
840         // Invariants valid at this point.
841 
842         trace!(
843             "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
844             self.latest_branches,
845             self.labels_at_tail,
846             self.fixup_records
847         );
848 
849         // We continue to munch on branches at the tail of the buffer until no
850         // more rules apply. Note that the loop only continues if a branch is
851         // actually truncated (or if labels are redirected away from a branch),
852         // so this always makes progress.
853         while let Some(b) = self.latest_branches.last() {
854             let cur_off = self.cur_offset();
855             trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
856             // If there has been any code emission since the end of the last branch or
857             // label definition, then there's nothing we can edit (because we
858             // don't move code once placed, only back up and overwrite), so
859             // clear the records and finish.
860             if b.end < cur_off {
861                 break;
862             }
863 
864             // If the "labels at this branch" list on this branch is
865             // longer than a threshold, don't do any simplification,
866             // and let the branch remain to separate those labels from
867             // the current tail. This avoids quadratic behavior (see
868             // #3468): otherwise, if a long string of "goto next;
869             // next:" patterns are emitted, all of the labels will
870             // coalesce into a long list of aliases for the current
871             // buffer tail. We must track all aliases of the current
872             // tail for correctness, but we are also allowed to skip
873             // optimization (removal) of any branch, so we take the
874             // escape hatch here and let it stand. In effect this
875             // "spreads" the many thousands of labels in the
876             // pathological case among an actual (harmless but
877             // suboptimal) instruction once per N labels.
878             if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
879                 break;
880             }
881 
882             // Invariant: we are looking at a branch that ends at the tail of
883             // the buffer.
884 
885             // For any branch, conditional or unconditional:
886             // - If the target is a label at the current offset, then remove
887             //   the conditional branch, and reset all labels that targetted
888             //   the current offset (end of branch) to the truncated
889             //   end-of-code.
890             //
891             // Preserves execution semantics: a branch to its own fallthrough
892             // address is equivalent to a no-op; in both cases, nextPC is the
893             // fallthrough.
894             if self.resolve_label_offset(b.target) == cur_off {
895                 trace!("branch with target == cur off; truncating");
896                 self.truncate_last_branch();
897                 continue;
898             }
899 
900             // If latest is an unconditional branch:
901             //
902             // - If the branch's target is not its own start address, then for
903             //   each label at the start of branch, make the label an alias of the
904             //   branch target, and remove the label from the "labels at this
905             //   branch" list.
906             //
907             //   - Preserves execution semantics: an unconditional branch's
908             //     only effect is to set PC to a new PC; this change simply
909             //     collapses one step in the step-semantics.
910             //
911             //   - Post-invariant: the labels that were bound to the start of
912             //     this branch become aliases, so they must not be present in any
913             //     labels-at-this-branch list or the labels-at-tail list. The
914             //     labels are removed form the latest-branch record's
915             //     labels-at-this-branch list, and are never placed in the
916             //     labels-at-tail list. Furthermore, it is correct that they are
917             //     not in either list, because they are now aliases, and labels
918             //     that are aliases remain aliases forever.
919             //
920             // - If there is a prior unconditional branch that ends just before
921             //   this one begins, and this branch has no labels bound to its
922             //   start, then we can truncate this branch, because it is entirely
923             //   unreachable (we have redirected all labels that make it
924             //   reachable otherwise). Do so and continue around the loop.
925             //
926             //   - Preserves execution semantics: the branch is unreachable,
927             //     because execution can only flow into an instruction from the
928             //     prior instruction's fallthrough or from a branch bound to that
929             //     instruction's start offset. Unconditional branches have no
930             //     fallthrough, so if the prior instruction is an unconditional
931             //     branch, no fallthrough entry can happen. The
932             //     labels-at-this-branch list is complete (by invariant), so if it
933             //     is empty, then the instruction is entirely unreachable. Thus,
934             //     it can be removed.
935             //
936             //   - Post-invariant: ensured by truncate_last_branch().
937             //
938             // - If there is a prior conditional branch whose target label
939             //   resolves to the current offset (branches around the
940             //   unconditional branch), then remove the unconditional branch,
941             //   and make the target of the unconditional the target of the
942             //   conditional instead.
943             //
944             //   - Preserves execution semantics: previously we had:
945             //
946             //         L1:
947             //            cond_br L2
948             //            br L3
949             //         L2:
950             //            (end of buffer)
951             //
952             //     by removing the last branch, we have:
953             //
954             //         L1:
955             //            cond_br L2
956             //         L2:
957             //            (end of buffer)
958             //
959             //     we then fix up the records for the conditional branch to
960             //     have:
961             //
962             //         L1:
963             //           cond_br.inverted L3
964             //         L2:
965             //
966             //     In the original code, control flow reaches L2 when the
967             //     conditional branch's predicate is true, and L3 otherwise. In
968             //     the optimized code, the same is true.
969             //
970             //   - Post-invariant: all edits to latest_branches and
971             //     labels_at_tail are performed by `truncate_last_branch()`,
972             //     which maintains the invariants at each step.
973 
974             if b.is_uncond() {
975                 // Set any label equal to current branch's start as an alias of
976                 // the branch's target, if the target is not the branch itself
977                 // (i.e., an infinite loop).
978                 //
979                 // We cannot perform this aliasing if the target of this branch
980                 // ultimately aliases back here; if so, we need to keep this
981                 // branch, so break out of this loop entirely (and clear the
982                 // latest-branches list below).
983                 //
984                 // Note that this check is what prevents cycles from forming in
985                 // `self.label_aliases`. To see why, consider an arbitrary start
986                 // state:
987                 //
988                 // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
989                 // Ln, which is not aliased.
990                 //
991                 // We would create a cycle if we assigned label_aliases[Ln]
992                 // = L1.  Note that the below assignment is the only write
993                 // to label_aliases.
994                 //
995                 // By our other invariants, we have that Ln (`l` below)
996                 // resolves to the offset `b.start`, because it is in the
997                 // set `b.labels_at_this_branch`.
998                 //
999                 // If L1 were already aliased, through some arbitrarily deep
1000                 // chain, to Ln, then it must also resolve to this offset
1001                 // `b.start`.
1002                 //
1003                 // By checking the resolution of `L1` against this offset,
1004                 // and aborting this branch-simplification if they are
1005                 // equal, we prevent the below assignment from ever creating
1006                 // a cycle.
1007                 if self.resolve_label_offset(b.target) != b.start {
1008                     let redirected = b.labels_at_this_branch.len();
1009                     for &l in &b.labels_at_this_branch {
1010                         trace!(
1011                             " -> label at start of branch {:?} redirected to target {:?}",
1012                             l,
1013                             b.target
1014                         );
1015                         self.label_aliases[l.0 as usize] = b.target;
1016                         // NOTE: we continue to ensure the invariant that labels
1017                         // pointing to tail of buffer are in `labels_at_tail`
1018                         // because we already ensured above that the last branch
1019                         // cannot have a target of `cur_off`; so we never have
1020                         // to put the label into `labels_at_tail` when moving it
1021                         // here.
1022                     }
1023                     // Maintain invariant: all branches have been redirected
1024                     // and are no longer pointing at the start of this branch.
1025                     let mut_b = self.latest_branches.last_mut().unwrap();
1026                     mut_b.labels_at_this_branch.clear();
1027 
1028                     if redirected > 0 {
1029                         trace!(" -> after label redirects, restarting loop");
1030                         continue;
1031                     }
1032                 } else {
1033                     break;
1034                 }
1035 
1036                 let b = self.latest_branches.last().unwrap();
1037 
1038                 // Examine any immediately preceding branch.
1039                 if self.latest_branches.len() > 1 {
1040                     let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
1041                     trace!(" -> more than one branch; prev_b = {:?}", prev_b);
1042                     // This uncond is immediately after another uncond; we
1043                     // should have already redirected labels to this uncond away
1044                     // (but check to be sure); so we can truncate this uncond.
1045                     if prev_b.is_uncond()
1046                         && prev_b.end == b.start
1047                         && b.labels_at_this_branch.is_empty()
1048                     {
1049                         trace!(" -> uncond follows another uncond; truncating");
1050                         self.truncate_last_branch();
1051                         continue;
1052                     }
1053 
1054                     // This uncond is immediately after a conditional, and the
1055                     // conditional's target is the end of this uncond, and we've
1056                     // already redirected labels to this uncond away; so we can
1057                     // truncate this uncond, flip the sense of the conditional, and
1058                     // set the conditional's target (in `latest_branches` and in
1059                     // `fixup_records`) to the uncond's target.
1060                     if prev_b.is_cond()
1061                         && prev_b.end == b.start
1062                         && self.resolve_label_offset(prev_b.target) == cur_off
1063                     {
1064                         trace!(" -> uncond follows a conditional, and conditional's target resolves to current offset");
1065                         // Save the target of the uncond (this becomes the
1066                         // target of the cond), and truncate the uncond.
1067                         let target = b.target;
1068                         let data = prev_b.inverted.clone().unwrap();
1069                         self.truncate_last_branch();
1070 
1071                         // Mutate the code and cond branch.
1072                         let off_before_edit = self.cur_offset();
1073                         let prev_b = self.latest_branches.last_mut().unwrap();
1074                         let not_inverted = SmallVec::from(
1075                             &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
1076                         );
1077 
1078                         // Low-level edit: replaces bytes of branch with
1079                         // inverted form. cur_off remains the same afterward, so
1080                         // we do not need to modify label data structures.
1081                         self.data.truncate(prev_b.start as usize);
1082                         self.data.extend_from_slice(&data[..]);
1083 
1084                         // Save the original code as the inversion of the
1085                         // inverted branch, in case we later edit this branch
1086                         // again.
1087                         prev_b.inverted = Some(not_inverted);
1088                         self.fixup_records[prev_b.fixup].label = target;
1089                         trace!(" -> reassigning target of condbr to {:?}", target);
1090                         prev_b.target = target;
1091                         debug_assert_eq!(off_before_edit, self.cur_offset());
1092                         continue;
1093                     }
1094                 }
1095             }
1096 
1097             // If we couldn't do anything with the last branch, then break.
1098             break;
1099         }
1100 
1101         self.purge_latest_branches();
1102 
1103         trace!(
1104             "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
1105             self.latest_branches,
1106             self.labels_at_tail,
1107             self.fixup_records
1108         );
1109     }
1110 
1111     fn purge_latest_branches(&mut self) {
1112         // All of our branch simplification rules work only if a branch ends at
1113         // the tail of the buffer, with no following code; and branches are in
1114         // order in latest_branches; so if the last entry ends prior to
1115         // cur_offset, then clear all entries.
1116         let cur_off = self.cur_offset();
1117         if let Some(l) = self.latest_branches.last() {
1118             if l.end < cur_off {
1119                 trace!("purge_latest_branches: removing branch {:?}", l);
1120                 self.latest_branches.clear();
1121             }
1122         }
1123 
1124         // Post-invariant: no invariant requires any branch to appear in
1125         // `latest_branches`; it is always optional. The list-clear above thus
1126         // preserves all semantics.
1127     }
1128 
1129     /// Emit a trap at some point in the future with the specified code and
1130     /// stack map.
1131     ///
1132     /// This function returns a [`MachLabel`] which will be the future address
1133     /// of the trap. Jumps should refer to this label, likely by using the
1134     /// [`MachBuffer::use_label_at_offset`] method, to get a relocation
1135     /// patched in once the address of the trap is known.
1136     ///
1137     /// This will batch all traps into the end of the function.
1138     pub fn defer_trap(&mut self, code: TrapCode, stack_map: Option<StackMap>) -> MachLabel {
1139         let label = self.get_label();
1140         self.update_deadline(I::TRAP_OPCODE.len(), u32::MAX);
1141         self.pending_traps.push(MachLabelTrap {
1142             label,
1143             code,
1144             stack_map,
1145             loc: self.cur_srcloc.map(|(_start, loc)| loc),
1146         });
1147         label
1148     }
1149 
1150     fn update_deadline(&mut self, len: usize, max_distance: CodeOffset) {
1151         trace!("defer: eventually emit {} bytes", len);
1152         let deadline = self.cur_offset().saturating_add(max_distance);
1153         self.island_worst_case_size += len as CodeOffset;
1154         self.island_worst_case_size =
1155             (self.island_worst_case_size + I::LabelUse::ALIGN - 1) & !(I::LabelUse::ALIGN - 1);
1156         if deadline < self.island_deadline {
1157             self.island_deadline = deadline;
1158         }
1159     }
1160 
1161     /// Is an island needed within the next N bytes?
1162     pub fn island_needed(&self, distance: CodeOffset) -> bool {
1163         self.worst_case_end_of_island(distance) > self.island_deadline
1164     }
1165 
1166     /// Returns the maximal offset that islands can reach if `distance` more
1167     /// bytes are appended.
1168     ///
1169     /// This is used to determine if veneers need insertions since jumps that
1170     /// can't reach past this point must get a veneer of some form.
1171     fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
1172         self.cur_offset()
1173             .saturating_add(distance)
1174             .saturating_add(self.island_worst_case_size)
1175     }
1176 
1177     /// Emit all pending constants and required pending veneers.
1178     ///
1179     /// Should only be called if `island_needed()` returns true, i.e., if we
1180     /// actually reach a deadline. It's not necessarily a problem to do so
1181     /// otherwise but it may result in unnecessary work during emission.
1182     pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) {
1183         self.emit_island_maybe_forced(false, distance, ctrl_plane);
1184     }
1185 
1186     /// Same as `emit_island`, but an internal API with a `force_veneers`
1187     /// argument to force all veneers to always get emitted for debugging.
1188     fn emit_island_maybe_forced(
1189         &mut self,
1190         force_veneers: bool,
1191         distance: CodeOffset,
1192         ctrl_plane: &mut ControlPlane,
1193     ) {
1194         // We're going to purge fixups, so no latest-branch editing can happen
1195         // anymore.
1196         self.latest_branches.clear();
1197 
1198         // Reset internal calculations about islands since we're going to
1199         // change the calculus as we apply fixups. The `forced_threshold` is
1200         // used here to determine whether jumps to unknown labels will require
1201         // a veneer or not.
1202         let forced_threshold = self.worst_case_end_of_island(distance);
1203         self.island_deadline = UNKNOWN_LABEL_OFFSET;
1204         self.island_worst_case_size = 0;
1205 
1206         // End the current location tracking since anything emitted during this
1207         // function shouldn't be attributed to whatever the current source
1208         // location is.
1209         //
1210         // Note that the current source location, if it's set right now, will be
1211         // restored at the end of this island emission.
1212         let cur_loc = self.cur_srcloc.map(|(_, loc)| loc);
1213         if cur_loc.is_some() {
1214             self.end_srcloc();
1215         }
1216 
1217         // First flush out all traps/constants so we have more labels in case
1218         // fixups are applied against these labels.
1219         //
1220         // Note that traps are placed first since this typically happens at the
1221         // end of the function and for disassemblers we try to keep all the code
1222         // contiguously together.
1223         for MachLabelTrap {
1224             label,
1225             code,
1226             stack_map,
1227             loc,
1228         } in mem::take(&mut self.pending_traps)
1229         {
1230             // If this trap has source information associated with it then
1231             // emit this information for the trap instruction going out now too.
1232             if let Some(loc) = loc {
1233                 self.start_srcloc(loc);
1234             }
1235             self.align_to(I::LabelUse::ALIGN);
1236             self.bind_label(label, ctrl_plane);
1237             self.add_trap(code);
1238             if let Some(map) = stack_map {
1239                 let extent = StackMapExtent::UpcomingBytes(I::TRAP_OPCODE.len() as u32);
1240                 self.add_stack_map(extent, map);
1241             }
1242             self.put_data(I::TRAP_OPCODE);
1243             if loc.is_some() {
1244                 self.end_srcloc();
1245             }
1246         }
1247 
1248         for constant in mem::take(&mut self.pending_constants) {
1249             let MachBufferConstant { align, size, .. } = self.constants[constant];
1250             let label = self.constants[constant].upcoming_label.take().unwrap();
1251             self.align_to(align);
1252             self.bind_label(label, ctrl_plane);
1253             self.used_constants.push((constant, self.cur_offset()));
1254             self.get_appended_space(size);
1255         }
1256 
1257         for fixup in mem::take(&mut self.fixup_records) {
1258             trace!("emit_island: fixup {:?}", fixup);
1259             let MachLabelFixup {
1260                 label,
1261                 offset,
1262                 kind,
1263             } = fixup;
1264             let label_offset = self.resolve_label_offset(label);
1265             let start = offset as usize;
1266             let end = (offset + kind.patch_size()) as usize;
1267 
1268             if label_offset != UNKNOWN_LABEL_OFFSET {
1269                 // If the offset of the label for this fixup is known then
1270                 // we're going to do something here-and-now. We're either going
1271                 // to patch the original offset because it's an in-bounds jump,
1272                 // or we're going to generate a veneer, patch the fixup to jump
1273                 // to the veneer, and then keep going.
1274                 //
1275                 // If the label comes after the original fixup, then we should
1276                 // be guaranteed that the jump is in-bounds. Otherwise there's
1277                 // a bug somewhere because this method wasn't called soon
1278                 // enough. All forward-jumps are tracked and should get veneers
1279                 // before their deadline comes and they're unable to jump
1280                 // further.
1281                 //
1282                 // Otherwise if the label is before the fixup, then that's a
1283                 // backwards jump. If it's past the maximum negative range
1284                 // then we'll emit a veneer that to jump forward to which can
1285                 // then jump backwards.
1286                 let veneer_required = if label_offset >= offset {
1287                     assert!((label_offset - offset) <= kind.max_pos_range());
1288                     false
1289                 } else {
1290                     (offset - label_offset) > kind.max_neg_range()
1291                 };
1292                 trace!(
1293                     " -> label_offset = {}, known, required = {} (pos {} neg {})",
1294                     label_offset,
1295                     veneer_required,
1296                     kind.max_pos_range(),
1297                     kind.max_neg_range()
1298                 );
1299 
1300                 if (force_veneers && kind.supports_veneer()) || veneer_required {
1301                     self.emit_veneer(label, offset, kind);
1302                 } else {
1303                     let slice = &mut self.data[start..end];
1304                     trace!("patching in-range!");
1305                     kind.patch(slice, offset, label_offset);
1306                 }
1307             } else {
1308                 // If the offset of this label is not known at this time then
1309                 // there's one of two possibilities:
1310                 //
1311                 // * First we may be about to exceed the maximum jump range of
1312                 //   this fixup. In that case a veneer is inserted to buy some
1313                 //   more budget for the forward-jump. It's guaranteed that the
1314                 //   label will eventually come after where we're at, so we know
1315                 //   that the forward jump is necessary.
1316                 //
1317                 // * Otherwise we're still within range of the forward jump but
1318                 //   the precise target isn't known yet. In that case we
1319                 //   enqueue the fixup to get processed later.
1320                 if forced_threshold - offset > kind.max_pos_range() {
1321                     self.emit_veneer(label, offset, kind);
1322                 } else {
1323                     self.use_label_at_offset(offset, label, kind);
1324                 }
1325             }
1326         }
1327 
1328         if let Some(loc) = cur_loc {
1329             self.start_srcloc(loc);
1330         }
1331     }
1332 
1333     /// Emits a "veneer" the `kind` code at `offset` to jump to `label`.
1334     ///
1335     /// This will generate extra machine code, using `kind`, to get a
1336     /// larger-jump-kind than `kind` allows. The code at `offset` is then
1337     /// patched to jump to our new code, and then the new code is enqueued for
1338     /// a fixup to get processed at some later time.
1339     fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
1340         // If this `kind` doesn't support a veneer then that's a bug in the
1341         // backend because we need to implement support for such a veneer.
1342         assert!(
1343             kind.supports_veneer(),
1344             "jump beyond the range of {:?} but a veneer isn't supported",
1345             kind,
1346         );
1347 
1348         // Allocate space for a veneer in the island.
1349         self.align_to(I::LabelUse::ALIGN);
1350         let veneer_offset = self.cur_offset();
1351         trace!("making a veneer at {}", veneer_offset);
1352         let start = offset as usize;
1353         let end = (offset + kind.patch_size()) as usize;
1354         let slice = &mut self.data[start..end];
1355         // Patch the original label use to refer to the veneer.
1356         trace!(
1357             "patching original at offset {} to veneer offset {}",
1358             offset,
1359             veneer_offset
1360         );
1361         kind.patch(slice, offset, veneer_offset);
1362         // Generate the veneer.
1363         let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
1364         let (veneer_fixup_off, veneer_label_use) =
1365             kind.generate_veneer(veneer_slice, veneer_offset);
1366         trace!(
1367             "generated veneer; fixup offset {}, label_use {:?}",
1368             veneer_fixup_off,
1369             veneer_label_use
1370         );
1371         // Register a new use of `label` with our new veneer fixup and offset.
1372         // This'll recalculate deadlines accordingly and enqueue this fixup to
1373         // get processed at some later time.
1374         self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
1375     }
1376 
1377     fn finish_emission_maybe_forcing_veneers(
1378         &mut self,
1379         force_veneers: bool,
1380         ctrl_plane: &mut ControlPlane,
1381     ) {
1382         while !self.pending_constants.is_empty()
1383             || !self.pending_traps.is_empty()
1384             || !self.fixup_records.is_empty()
1385         {
1386             // `emit_island()` will emit any pending veneers and constants, and
1387             // as a side-effect, will also take care of any fixups with resolved
1388             // labels eagerly.
1389             self.emit_island_maybe_forced(force_veneers, u32::MAX, ctrl_plane);
1390         }
1391 
1392         // Ensure that all labels have been fixed up after the last island is emitted. This is a
1393         // full (release-mode) assert because an unresolved label means the emitted code is
1394         // incorrect.
1395         assert!(self.fixup_records.is_empty());
1396     }
1397 
1398     /// Finish any deferred emissions and/or fixups.
1399     pub fn finish(
1400         mut self,
1401         constants: &VCodeConstants,
1402         ctrl_plane: &mut ControlPlane,
1403     ) -> MachBufferFinalized<Stencil> {
1404         let _tt = timing::vcode_emit_finish();
1405 
1406         // Do any optimizations on branches at tail of buffer, as if we
1407         // had bound one last label.
1408         self.optimize_branches(ctrl_plane);
1409 
1410         self.finish_emission_maybe_forcing_veneers(false, ctrl_plane);
1411 
1412         let alignment = self.finish_constants(constants);
1413 
1414         let mut srclocs = self.srclocs;
1415         srclocs.sort_by_key(|entry| entry.start);
1416 
1417         MachBufferFinalized {
1418             data: self.data,
1419             relocs: self.relocs,
1420             traps: self.traps,
1421             call_sites: self.call_sites,
1422             srclocs,
1423             stack_maps: self.stack_maps,
1424             unwind_info: self.unwind_info,
1425             alignment,
1426         }
1427     }
1428 
1429     /// Add an external relocation at the current offset.
1430     pub fn add_reloc(&mut self, kind: Reloc, name: &ExternalName, addend: Addend) {
1431         let name = name.clone();
1432         // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally
1433         // generate a label-use statement to track whether an island is possibly
1434         // needed to escape this function to actually get to the external name.
1435         // This is most likely to come up on AArch64 where calls between
1436         // functions use a 26-bit signed offset which gives +/- 64MB. This means
1437         // that if a function is 128MB in size and there's a call in the middle
1438         // it's impossible to reach the actual target. Also, while it's
1439         // technically possible to jump to the start of a function and then jump
1440         // further, island insertion below always inserts islands after
1441         // previously appended code so for Cranelift's own implementation this
1442         // is also a problem for 64MB functions on AArch64 which start with a
1443         // call instruction, those won't be able to escape.
1444         //
1445         // Ideally what needs to happen here is that a `LabelUse` is
1446         // transparently generated (or call-sites of this function are audited
1447         // to generate a `LabelUse` instead) and tracked internally. The actual
1448         // relocation would then change over time if and when a veneer is
1449         // inserted, where the relocation here would be patched by this
1450         // `MachBuffer` to jump to the veneer. The problem, though, is that all
1451         // this still needs to end up, in the case of a singular function,
1452         // generating a final relocation pointing either to this particular
1453         // relocation or to the veneer inserted. Additionally
1454         // `MachBuffer` needs the concept of a label which will never be
1455         // resolved, so `emit_island` doesn't trip over not actually ever
1456         // knowning what some labels are. Currently the loop in
1457         // `finish_emission_maybe_forcing_veneers` would otherwise infinitely
1458         // loop.
1459         //
1460         // For now this means that because relocs aren't tracked at all that
1461         // AArch64 functions have a rough size limits of 64MB. For now that's
1462         // somewhat reasonable and the failure mode is a panic in `MachBuffer`
1463         // when a relocation can't otherwise be resolved later, so it shouldn't
1464         // actually result in any memory unsafety or anything like that.
1465         self.relocs.push(MachReloc {
1466             offset: self.data.len() as CodeOffset,
1467             kind,
1468             name,
1469             addend,
1470         });
1471     }
1472 
1473     /// Add a trap record at the current offset.
1474     pub fn add_trap(&mut self, code: TrapCode) {
1475         self.traps.push(MachTrap {
1476             offset: self.data.len() as CodeOffset,
1477             code,
1478         });
1479     }
1480 
1481     /// Add a call-site record at the current offset.
1482     pub fn add_call_site(&mut self, opcode: Opcode) {
1483         debug_assert!(
1484             opcode.is_call(),
1485             "adding call site info for a non-call instruction."
1486         );
1487         self.call_sites.push(MachCallSite {
1488             ret_addr: self.data.len() as CodeOffset,
1489             opcode,
1490         });
1491     }
1492 
1493     /// Add an unwind record at the current offset.
1494     pub fn add_unwind(&mut self, unwind: UnwindInst) {
1495         self.unwind_info.push((self.cur_offset(), unwind));
1496     }
1497 
1498     /// Set the `SourceLoc` for code from this offset until the offset at the
1499     /// next call to `end_srcloc()`.
1500     pub fn start_srcloc(&mut self, loc: RelSourceLoc) {
1501         self.cur_srcloc = Some((self.cur_offset(), loc));
1502     }
1503 
1504     /// Mark the end of the `SourceLoc` segment started at the last
1505     /// `start_srcloc()` call.
1506     pub fn end_srcloc(&mut self) {
1507         let (start, loc) = self
1508             .cur_srcloc
1509             .take()
1510             .expect("end_srcloc() called without start_srcloc()");
1511         let end = self.cur_offset();
1512         // Skip zero-length extends.
1513         debug_assert!(end >= start);
1514         if end > start {
1515             self.srclocs.push(MachSrcLoc { start, end, loc });
1516         }
1517     }
1518 
1519     /// Add stack map metadata for this program point: a set of stack offsets
1520     /// (from SP upward) that contain live references.
1521     ///
1522     /// The `offset_to_fp` value is the offset from the nominal SP (at which the `stack_offsets`
1523     /// are based) and the FP value. By subtracting `offset_to_fp` from each `stack_offsets`
1524     /// element, one can obtain live-reference offsets from FP instead.
1525     pub fn add_stack_map(&mut self, extent: StackMapExtent, stack_map: StackMap) {
1526         let (start, end) = match extent {
1527             StackMapExtent::UpcomingBytes(insn_len) => {
1528                 let start_offset = self.cur_offset();
1529                 (start_offset, start_offset + insn_len)
1530             }
1531             StackMapExtent::StartedAtOffset(start_offset) => {
1532                 let end_offset = self.cur_offset();
1533                 debug_assert!(end_offset >= start_offset);
1534                 (start_offset, end_offset)
1535             }
1536         };
1537         trace!("Adding stack map for offsets {start:#x}..{end:#x}");
1538         self.stack_maps.push(MachStackMap {
1539             offset: start,
1540             offset_end: end,
1541             stack_map,
1542         });
1543     }
1544 }
1545 
1546 impl<T: CompilePhase> MachBufferFinalized<T> {
1547     /// Get a list of source location mapping tuples in sorted-by-start-offset order.
1548     pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] {
1549         &self.srclocs[..]
1550     }
1551 
1552     /// Get the total required size for the code.
1553     pub fn total_size(&self) -> CodeOffset {
1554         self.data.len() as CodeOffset
1555     }
1556 
1557     /// Return the code in this mach buffer as a hex string for testing purposes.
1558     pub fn stringify_code_bytes(&self) -> String {
1559         // This is pretty lame, but whatever ..
1560         use std::fmt::Write;
1561         let mut s = String::with_capacity(self.data.len() * 2);
1562         for b in &self.data {
1563             write!(&mut s, "{:02X}", b).unwrap();
1564         }
1565         s
1566     }
1567 
1568     /// Get the code bytes.
1569     pub fn data(&self) -> &[u8] {
1570         // N.B.: we emit every section into the .text section as far as
1571         // the `CodeSink` is concerned; we do not bother to segregate
1572         // the contents into the actual program text, the jumptable and the
1573         // rodata (constant pool). This allows us to generate code assuming
1574         // that these will not be relocated relative to each other, and avoids
1575         // having to designate each section as belonging in one of the three
1576         // fixed categories defined by `CodeSink`. If this becomes a problem
1577         // later (e.g. because of memory permissions or similar), we can
1578         // add this designation and segregate the output; take care, however,
1579         // to add the appropriate relocations in this case.
1580 
1581         &self.data[..]
1582     }
1583 
1584     /// Get the list of external relocations for this code.
1585     pub fn relocs(&self) -> &[MachReloc] {
1586         &self.relocs[..]
1587     }
1588 
1589     /// Get the list of trap records for this code.
1590     pub fn traps(&self) -> &[MachTrap] {
1591         &self.traps[..]
1592     }
1593 
1594     /// Get the stack map metadata for this code.
1595     pub fn stack_maps(&self) -> &[MachStackMap] {
1596         &self.stack_maps[..]
1597     }
1598 
1599     /// Get the list of call sites for this code.
1600     pub fn call_sites(&self) -> &[MachCallSite] {
1601         &self.call_sites[..]
1602     }
1603 }
1604 
1605 /// Metadata about a constant.
1606 struct MachBufferConstant {
1607     /// A label which has not yet been bound which can be used for this
1608     /// constant.
1609     ///
1610     /// This is lazily created when a label is requested for a constant and is
1611     /// cleared when a constant is emitted.
1612     upcoming_label: Option<MachLabel>,
1613     /// Required alignment.
1614     align: CodeOffset,
1615     /// The byte size of this constant.
1616     size: usize,
1617 }
1618 
1619 /// A trap that is deferred to the next time an island is emitted for either
1620 /// traps, constants, or fixups.
1621 struct MachLabelTrap {
1622     /// This label will refer to the trap's offset.
1623     label: MachLabel,
1624     /// The code associated with this trap.
1625     code: TrapCode,
1626     /// An optional stack map to associate with this trap.
1627     stack_map: Option<StackMap>,
1628     /// An optional source location to assign for this trap.
1629     loc: Option<RelSourceLoc>,
1630 }
1631 
1632 /// A fixup to perform on the buffer once code is emitted. Fixups always refer
1633 /// to labels and patch the code based on label offsets. Hence, they are like
1634 /// relocations, but internal to one buffer.
1635 #[derive(Debug)]
1636 struct MachLabelFixup<I: VCodeInst> {
1637     /// The label whose offset controls this fixup.
1638     label: MachLabel,
1639     /// The offset to fix up / patch to refer to this label.
1640     offset: CodeOffset,
1641     /// The kind of fixup. This is architecture-specific; each architecture may have,
1642     /// e.g., several types of branch instructions, each with differently-sized
1643     /// offset fields and different places within the instruction to place the
1644     /// bits.
1645     kind: I::LabelUse,
1646 }
1647 
1648 /// A relocation resulting from a compilation.
1649 #[derive(Clone, Debug, PartialEq)]
1650 #[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
1651 pub struct MachReloc {
1652     /// The offset at which the relocation applies, *relative to the
1653     /// containing section*.
1654     pub offset: CodeOffset,
1655     /// The kind of relocation.
1656     pub kind: Reloc,
1657     /// The external symbol / name to which this relocation refers.
1658     pub name: ExternalName,
1659     /// The addend to add to the symbol value.
1660     pub addend: i64,
1661 }
1662 
1663 /// A trap record resulting from a compilation.
1664 #[derive(Clone, Debug, PartialEq)]
1665 #[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
1666 pub struct MachTrap {
1667     /// The offset at which the trap instruction occurs, *relative to the
1668     /// containing section*.
1669     pub offset: CodeOffset,
1670     /// The trap code.
1671     pub code: TrapCode,
1672 }
1673 
1674 /// A call site record resulting from a compilation.
1675 #[derive(Clone, Debug, PartialEq)]
1676 #[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
1677 pub struct MachCallSite {
1678     /// The offset of the call's return address, *relative to the containing section*.
1679     pub ret_addr: CodeOffset,
1680     /// The call's opcode.
1681     pub opcode: Opcode,
1682 }
1683 
1684 /// A source-location mapping resulting from a compilation.
1685 #[derive(PartialEq, Debug, Clone)]
1686 #[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
1687 pub struct MachSrcLoc<T: CompilePhase> {
1688     /// The start of the region of code corresponding to a source location.
1689     /// This is relative to the start of the function, not to the start of the
1690     /// section.
1691     pub start: CodeOffset,
1692     /// The end of the region of code corresponding to a source location.
1693     /// This is relative to the start of the section, not to the start of the
1694     /// section.
1695     pub end: CodeOffset,
1696     /// The source location.
1697     pub loc: T::SourceLocType,
1698 }
1699 
1700 impl MachSrcLoc<Stencil> {
1701     fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> {
1702         MachSrcLoc {
1703             start: self.start,
1704             end: self.end,
1705             loc: self.loc.expand(base_srcloc),
1706         }
1707     }
1708 }
1709 
1710 /// Record of stack map metadata: stack offsets containing references.
1711 #[derive(Clone, Debug, PartialEq)]
1712 #[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))]
1713 pub struct MachStackMap {
1714     /// The code offset at which this stack map applies.
1715     pub offset: CodeOffset,
1716     /// The code offset just past the "end" of the instruction: that is, the
1717     /// offset of the first byte of the following instruction, or equivalently,
1718     /// the start offset plus the instruction length.
1719     pub offset_end: CodeOffset,
1720     /// The stack map itself.
1721     pub stack_map: StackMap,
1722 }
1723 
1724 /// Record of branch instruction in the buffer, to facilitate editing.
1725 #[derive(Clone, Debug)]
1726 struct MachBranch {
1727     start: CodeOffset,
1728     end: CodeOffset,
1729     target: MachLabel,
1730     fixup: usize,
1731     inverted: Option<SmallVec<[u8; 8]>>,
1732     /// All labels pointing to the start of this branch. For correctness, this
1733     /// *must* be complete (i.e., must contain all labels whose resolved offsets
1734     /// are at the start of this branch): we rely on being able to redirect all
1735     /// labels that could jump to this branch before removing it, if it is
1736     /// otherwise unreachable.
1737     labels_at_this_branch: SmallVec<[MachLabel; 4]>,
1738 }
1739 
1740 impl MachBranch {
1741     fn is_cond(&self) -> bool {
1742         self.inverted.is_some()
1743     }
1744     fn is_uncond(&self) -> bool {
1745         self.inverted.is_none()
1746     }
1747 }
1748 
1749 /// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
1750 ///
1751 /// Note that `MachBuffer` was primarily written for intra-function references
1752 /// of jumps between basic blocks, but it's also quite usable for entire text
1753 /// sections and resolving references between functions themselves. This
1754 /// builder interprets "blocks" as labeled functions for the purposes of
1755 /// resolving labels internally in the buffer.
1756 pub struct MachTextSectionBuilder<I: VCodeInst> {
1757     buf: MachBuffer<I>,
1758     next_func: usize,
1759     force_veneers: bool,
1760 }
1761 
1762 impl<I: VCodeInst> MachTextSectionBuilder<I> {
1763     /// Creates a new text section builder which will have `num_funcs` functions
1764     /// pushed into it.
1765     pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> {
1766         let mut buf = MachBuffer::new();
1767         buf.reserve_labels_for_blocks(num_funcs);
1768         MachTextSectionBuilder {
1769             buf,
1770             next_func: 0,
1771             force_veneers: false,
1772         }
1773     }
1774 }
1775 
1776 impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
1777     fn append(
1778         &mut self,
1779         labeled: bool,
1780         func: &[u8],
1781         align: u32,
1782         ctrl_plane: &mut ControlPlane,
1783     ) -> u64 {
1784         // Conditionally emit an island if it's necessary to resolve jumps
1785         // between functions which are too far away.
1786         let size = func.len() as u32;
1787         if self.force_veneers || self.buf.island_needed(size) {
1788             self.buf
1789                 .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane);
1790         }
1791 
1792         self.buf.align_to(align);
1793         let pos = self.buf.cur_offset();
1794         if labeled {
1795             self.buf.bind_label(
1796                 MachLabel::from_block(BlockIndex::new(self.next_func)),
1797                 ctrl_plane,
1798             );
1799             self.next_func += 1;
1800         }
1801         self.buf.put_data(func);
1802         u64::from(pos)
1803     }
1804 
1805     fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool {
1806         let label = MachLabel::from_block(BlockIndex::new(target));
1807         let offset = u32::try_from(offset).unwrap();
1808         match I::LabelUse::from_reloc(reloc, addend) {
1809             Some(label_use) => {
1810                 self.buf.use_label_at_offset(offset, label, label_use);
1811                 true
1812             }
1813             None => false,
1814         }
1815     }
1816 
1817     fn force_veneers(&mut self) {
1818         self.force_veneers = true;
1819     }
1820 
1821     fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> {
1822         // Double-check all functions were pushed.
1823         assert_eq!(self.next_func, self.buf.label_offsets.len());
1824 
1825         // Finish up any veneers, if necessary.
1826         self.buf
1827             .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane);
1828 
1829         // We don't need the data any more, so return it to the caller.
1830         mem::take(&mut self.buf.data).into_vec()
1831     }
1832 }
1833 
1834 // We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
1835 #[cfg(all(test, feature = "arm64"))]
1836 mod test {
1837     use cranelift_entity::EntityRef as _;
1838 
1839     use super::*;
1840     use crate::ir::UserExternalNameRef;
1841     use crate::isa::aarch64::inst::xreg;
1842     use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
1843     use crate::machinst::{MachInstEmit, MachInstEmitState};
1844     use crate::settings;
1845     use std::default::Default;
1846     use std::vec::Vec;
1847 
1848     fn label(n: u32) -> MachLabel {
1849         MachLabel::from_block(BlockIndex::new(n as usize))
1850     }
1851     fn target(n: u32) -> BranchTarget {
1852         BranchTarget::Label(label(n))
1853     }
1854 
1855     #[test]
1856     fn test_elide_jump_to_next() {
1857         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1858         let mut buf = MachBuffer::new();
1859         let mut state = <Inst as MachInstEmit>::State::default();
1860         let constants = Default::default();
1861 
1862         buf.reserve_labels_for_blocks(2);
1863         buf.bind_label(label(0), state.ctrl_plane_mut());
1864         let inst = Inst::Jump { dest: target(1) };
1865         inst.emit(&[], &mut buf, &info, &mut state);
1866         buf.bind_label(label(1), state.ctrl_plane_mut());
1867         let buf = buf.finish(&constants, state.ctrl_plane_mut());
1868         assert_eq!(0, buf.total_size());
1869     }
1870 
1871     #[test]
1872     fn test_elide_trivial_jump_blocks() {
1873         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1874         let mut buf = MachBuffer::new();
1875         let mut state = <Inst as MachInstEmit>::State::default();
1876         let constants = Default::default();
1877 
1878         buf.reserve_labels_for_blocks(4);
1879 
1880         buf.bind_label(label(0), state.ctrl_plane_mut());
1881         let inst = Inst::CondBr {
1882             kind: CondBrKind::NotZero(xreg(0)),
1883             taken: target(1),
1884             not_taken: target(2),
1885         };
1886         inst.emit(&[], &mut buf, &info, &mut state);
1887 
1888         buf.bind_label(label(1), state.ctrl_plane_mut());
1889         let inst = Inst::Jump { dest: target(3) };
1890         inst.emit(&[], &mut buf, &info, &mut state);
1891 
1892         buf.bind_label(label(2), state.ctrl_plane_mut());
1893         let inst = Inst::Jump { dest: target(3) };
1894         inst.emit(&[], &mut buf, &info, &mut state);
1895 
1896         buf.bind_label(label(3), state.ctrl_plane_mut());
1897 
1898         let buf = buf.finish(&constants, state.ctrl_plane_mut());
1899         assert_eq!(0, buf.total_size());
1900     }
1901 
1902     #[test]
1903     fn test_flip_cond() {
1904         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1905         let mut buf = MachBuffer::new();
1906         let mut state = <Inst as MachInstEmit>::State::default();
1907         let constants = Default::default();
1908 
1909         buf.reserve_labels_for_blocks(4);
1910 
1911         buf.bind_label(label(0), state.ctrl_plane_mut());
1912         let inst = Inst::CondBr {
1913             kind: CondBrKind::Zero(xreg(0)),
1914             taken: target(1),
1915             not_taken: target(2),
1916         };
1917         inst.emit(&[], &mut buf, &info, &mut state);
1918 
1919         buf.bind_label(label(1), state.ctrl_plane_mut());
1920         let inst = Inst::Nop4;
1921         inst.emit(&[], &mut buf, &info, &mut state);
1922 
1923         buf.bind_label(label(2), state.ctrl_plane_mut());
1924         let inst = Inst::Udf {
1925             trap_code: TrapCode::Interrupt,
1926         };
1927         inst.emit(&[], &mut buf, &info, &mut state);
1928 
1929         buf.bind_label(label(3), state.ctrl_plane_mut());
1930 
1931         let buf = buf.finish(&constants, state.ctrl_plane_mut());
1932 
1933         let mut buf2 = MachBuffer::new();
1934         let mut state = Default::default();
1935         let inst = Inst::TrapIf {
1936             kind: CondBrKind::NotZero(xreg(0)),
1937             trap_code: TrapCode::Interrupt,
1938         };
1939         inst.emit(&[], &mut buf2, &info, &mut state);
1940         let inst = Inst::Nop4;
1941         inst.emit(&[], &mut buf2, &info, &mut state);
1942 
1943         let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
1944 
1945         assert_eq!(buf.data, buf2.data);
1946     }
1947 
1948     #[test]
1949     fn test_island() {
1950         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1951         let mut buf = MachBuffer::new();
1952         let mut state = <Inst as MachInstEmit>::State::default();
1953         let constants = Default::default();
1954 
1955         buf.reserve_labels_for_blocks(4);
1956 
1957         buf.bind_label(label(0), state.ctrl_plane_mut());
1958         let inst = Inst::CondBr {
1959             kind: CondBrKind::NotZero(xreg(0)),
1960             taken: target(2),
1961             not_taken: target(3),
1962         };
1963         inst.emit(&[], &mut buf, &info, &mut state);
1964 
1965         buf.bind_label(label(1), state.ctrl_plane_mut());
1966         while buf.cur_offset() < 2000000 {
1967             if buf.island_needed(0) {
1968                 buf.emit_island(0, state.ctrl_plane_mut());
1969             }
1970             let inst = Inst::Nop4;
1971             inst.emit(&[], &mut buf, &info, &mut state);
1972         }
1973 
1974         buf.bind_label(label(2), state.ctrl_plane_mut());
1975         let inst = Inst::Nop4;
1976         inst.emit(&[], &mut buf, &info, &mut state);
1977 
1978         buf.bind_label(label(3), state.ctrl_plane_mut());
1979         let inst = Inst::Nop4;
1980         inst.emit(&[], &mut buf, &info, &mut state);
1981 
1982         let buf = buf.finish(&constants, state.ctrl_plane_mut());
1983 
1984         assert_eq!(2000000 + 8, buf.total_size());
1985 
1986         let mut buf2 = MachBuffer::new();
1987         let mut state = Default::default();
1988         let inst = Inst::CondBr {
1989             kind: CondBrKind::NotZero(xreg(0)),
1990 
1991             // This conditionally taken branch has a 19-bit constant, shifted
1992             // to the left by two, giving us a 21-bit range in total. Half of
1993             // this range positive so the we should be around 1 << 20 bytes
1994             // away for our jump target.
1995             //
1996             // There are two pending fixups by the time we reach this point,
1997             // one for this 19-bit jump and one for the unconditional 26-bit
1998             // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
1999             // veneer is 20 bytes large, which means that pessimistically
2000             // assuming we'll need two veneers we need 24 bytes of extra
2001             // space, meaning that the actual island should come 24-bytes
2002             // before the deadline.
2003             taken: BranchTarget::ResolvedOffset((1 << 20) - 4 - 20),
2004 
2005             // This branch is in-range so no veneers should be needed, it should
2006             // go directly to the target.
2007             not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
2008         };
2009         inst.emit(&[], &mut buf2, &info, &mut state);
2010 
2011         let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2012 
2013         assert_eq!(&buf.data[0..8], &buf2.data[..]);
2014     }
2015 
2016     #[test]
2017     fn test_island_backward() {
2018         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2019         let mut buf = MachBuffer::new();
2020         let mut state = <Inst as MachInstEmit>::State::default();
2021         let constants = Default::default();
2022 
2023         buf.reserve_labels_for_blocks(4);
2024 
2025         buf.bind_label(label(0), state.ctrl_plane_mut());
2026         let inst = Inst::Nop4;
2027         inst.emit(&[], &mut buf, &info, &mut state);
2028 
2029         buf.bind_label(label(1), state.ctrl_plane_mut());
2030         let inst = Inst::Nop4;
2031         inst.emit(&[], &mut buf, &info, &mut state);
2032 
2033         buf.bind_label(label(2), state.ctrl_plane_mut());
2034         while buf.cur_offset() < 2000000 {
2035             let inst = Inst::Nop4;
2036             inst.emit(&[], &mut buf, &info, &mut state);
2037         }
2038 
2039         buf.bind_label(label(3), state.ctrl_plane_mut());
2040         let inst = Inst::CondBr {
2041             kind: CondBrKind::NotZero(xreg(0)),
2042             taken: target(0),
2043             not_taken: target(1),
2044         };
2045         inst.emit(&[], &mut buf, &info, &mut state);
2046 
2047         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2048 
2049         assert_eq!(2000000 + 12, buf.total_size());
2050 
2051         let mut buf2 = MachBuffer::new();
2052         let mut state = Default::default();
2053         let inst = Inst::CondBr {
2054             kind: CondBrKind::NotZero(xreg(0)),
2055             taken: BranchTarget::ResolvedOffset(8),
2056             not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
2057         };
2058         inst.emit(&[], &mut buf2, &info, &mut state);
2059         let inst = Inst::Jump {
2060             dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
2061         };
2062         inst.emit(&[], &mut buf2, &info, &mut state);
2063 
2064         let buf2 = buf2.finish(&constants, state.ctrl_plane_mut());
2065 
2066         assert_eq!(&buf.data[2000000..], &buf2.data[..]);
2067     }
2068 
2069     #[test]
2070     fn test_multiple_redirect() {
2071         // label0:
2072         //   cbz x0, label1
2073         //   b label2
2074         // label1:
2075         //   b label3
2076         // label2:
2077         //   nop
2078         //   nop
2079         //   b label0
2080         // label3:
2081         //   b label4
2082         // label4:
2083         //   b label5
2084         // label5:
2085         //   b label7
2086         // label6:
2087         //   nop
2088         // label7:
2089         //   ret
2090         //
2091         // -- should become:
2092         //
2093         // label0:
2094         //   cbz x0, label7
2095         // label2:
2096         //   nop
2097         //   nop
2098         //   b label0
2099         // label6:
2100         //   nop
2101         // label7:
2102         //   ret
2103 
2104         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2105         let mut buf = MachBuffer::new();
2106         let mut state = <Inst as MachInstEmit>::State::default();
2107         let constants = Default::default();
2108 
2109         buf.reserve_labels_for_blocks(8);
2110 
2111         buf.bind_label(label(0), state.ctrl_plane_mut());
2112         let inst = Inst::CondBr {
2113             kind: CondBrKind::Zero(xreg(0)),
2114             taken: target(1),
2115             not_taken: target(2),
2116         };
2117         inst.emit(&[], &mut buf, &info, &mut state);
2118 
2119         buf.bind_label(label(1), state.ctrl_plane_mut());
2120         let inst = Inst::Jump { dest: target(3) };
2121         inst.emit(&[], &mut buf, &info, &mut state);
2122 
2123         buf.bind_label(label(2), state.ctrl_plane_mut());
2124         let inst = Inst::Nop4;
2125         inst.emit(&[], &mut buf, &info, &mut state);
2126         inst.emit(&[], &mut buf, &info, &mut state);
2127         let inst = Inst::Jump { dest: target(0) };
2128         inst.emit(&[], &mut buf, &info, &mut state);
2129 
2130         buf.bind_label(label(3), state.ctrl_plane_mut());
2131         let inst = Inst::Jump { dest: target(4) };
2132         inst.emit(&[], &mut buf, &info, &mut state);
2133 
2134         buf.bind_label(label(4), state.ctrl_plane_mut());
2135         let inst = Inst::Jump { dest: target(5) };
2136         inst.emit(&[], &mut buf, &info, &mut state);
2137 
2138         buf.bind_label(label(5), state.ctrl_plane_mut());
2139         let inst = Inst::Jump { dest: target(7) };
2140         inst.emit(&[], &mut buf, &info, &mut state);
2141 
2142         buf.bind_label(label(6), state.ctrl_plane_mut());
2143         let inst = Inst::Nop4;
2144         inst.emit(&[], &mut buf, &info, &mut state);
2145 
2146         buf.bind_label(label(7), state.ctrl_plane_mut());
2147         let inst = Inst::Ret {
2148             rets: vec![],
2149             stack_bytes_to_pop: 0,
2150         };
2151         inst.emit(&[], &mut buf, &info, &mut state);
2152 
2153         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2154 
2155         let golden_data = vec![
2156             0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
2157             0x1f, 0x20, 0x03, 0xd5, // nop
2158             0x1f, 0x20, 0x03, 0xd5, // nop
2159             0xfd, 0xff, 0xff, 0x17, // b 0
2160             0x1f, 0x20, 0x03, 0xd5, // nop
2161             0xc0, 0x03, 0x5f, 0xd6, // ret
2162         ];
2163 
2164         assert_eq!(&golden_data[..], &buf.data[..]);
2165     }
2166 
2167     #[test]
2168     fn test_handle_branch_cycle() {
2169         // label0:
2170         //   b label1
2171         // label1:
2172         //   b label2
2173         // label2:
2174         //   b label3
2175         // label3:
2176         //   b label4
2177         // label4:
2178         //   b label1  // note: not label0 (to make it interesting).
2179         //
2180         // -- should become:
2181         //
2182         // label0, label1, ..., label4:
2183         //   b label0
2184         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
2185         let mut buf = MachBuffer::new();
2186         let mut state = <Inst as MachInstEmit>::State::default();
2187         let constants = Default::default();
2188 
2189         buf.reserve_labels_for_blocks(5);
2190 
2191         buf.bind_label(label(0), state.ctrl_plane_mut());
2192         let inst = Inst::Jump { dest: target(1) };
2193         inst.emit(&[], &mut buf, &info, &mut state);
2194 
2195         buf.bind_label(label(1), state.ctrl_plane_mut());
2196         let inst = Inst::Jump { dest: target(2) };
2197         inst.emit(&[], &mut buf, &info, &mut state);
2198 
2199         buf.bind_label(label(2), state.ctrl_plane_mut());
2200         let inst = Inst::Jump { dest: target(3) };
2201         inst.emit(&[], &mut buf, &info, &mut state);
2202 
2203         buf.bind_label(label(3), state.ctrl_plane_mut());
2204         let inst = Inst::Jump { dest: target(4) };
2205         inst.emit(&[], &mut buf, &info, &mut state);
2206 
2207         buf.bind_label(label(4), state.ctrl_plane_mut());
2208         let inst = Inst::Jump { dest: target(1) };
2209         inst.emit(&[], &mut buf, &info, &mut state);
2210 
2211         let buf = buf.finish(&constants, state.ctrl_plane_mut());
2212 
2213         let golden_data = vec![
2214             0x00, 0x00, 0x00, 0x14, // b 0
2215         ];
2216 
2217         assert_eq!(&golden_data[..], &buf.data[..]);
2218     }
2219 
2220     #[test]
2221     fn metadata_records() {
2222         let mut buf = MachBuffer::<Inst>::new();
2223         let ctrl_plane = &mut Default::default();
2224         let constants = Default::default();
2225 
2226         buf.reserve_labels_for_blocks(1);
2227 
2228         buf.bind_label(label(0), ctrl_plane);
2229         buf.put1(1);
2230         buf.add_trap(TrapCode::HeapOutOfBounds);
2231         buf.put1(2);
2232         buf.add_trap(TrapCode::IntegerOverflow);
2233         buf.add_trap(TrapCode::IntegerDivisionByZero);
2234         buf.add_call_site(Opcode::Call);
2235         buf.add_reloc(
2236             Reloc::Abs4,
2237             &ExternalName::User(UserExternalNameRef::new(0)),
2238             0,
2239         );
2240         buf.put1(3);
2241         buf.add_reloc(
2242             Reloc::Abs8,
2243             &ExternalName::User(UserExternalNameRef::new(1)),
2244             1,
2245         );
2246         buf.put1(4);
2247 
2248         let buf = buf.finish(&constants, ctrl_plane);
2249 
2250         assert_eq!(buf.data(), &[1, 2, 3, 4]);
2251         assert_eq!(
2252             buf.traps()
2253                 .iter()
2254                 .map(|trap| (trap.offset, trap.code))
2255                 .collect::<Vec<_>>(),
2256             vec![
2257                 (1, TrapCode::HeapOutOfBounds),
2258                 (2, TrapCode::IntegerOverflow),
2259                 (2, TrapCode::IntegerDivisionByZero)
2260             ]
2261         );
2262         assert_eq!(
2263             buf.call_sites()
2264                 .iter()
2265                 .map(|call_site| (call_site.ret_addr, call_site.opcode))
2266                 .collect::<Vec<_>>(),
2267             vec![(2, Opcode::Call)]
2268         );
2269         assert_eq!(
2270             buf.relocs()
2271                 .iter()
2272                 .map(|reloc| (reloc.offset, reloc.kind))
2273                 .collect::<Vec<_>>(),
2274             vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
2275         );
2276     }
2277 }
2278