1 //! In-memory representation of compiled machine code, with labels and fixups to
2 //! refer to those labels. Handles constant-pool island insertion and also
3 //! veneer insertion for out-of-range jumps.
4 //!
5 //! This code exists to solve three problems:
6 //!
7 //! - Branch targets for forward branches are not known until later, when we
8 //!   emit code in a single pass through the instruction structs.
9 //!
10 //! - On many architectures, address references or offsets have limited range.
11 //!   For example, on AArch64, conditional branches can only target code +/- 1MB
12 //!   from the branch itself.
13 //!
14 //! - The lowering of control flow from the CFG-with-edges produced by
15 //!   [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
16 //!   edge blocks when the register allocator does not need to insert any
17 //!   spills/reloads/moves in edge blocks, results in many suboptimal branch
18 //!   patterns. The lowering also pays no attention to block order, and so
19 //!   two-target conditional forms (cond-br followed by uncond-br) can often by
20 //!   avoided because one of the targets is the fallthrough. There are several
21 //!   cases here where we can simplify to use fewer branches.
22 //!
23 //! This "buffer" implements a single-pass code emission strategy (with a later
24 //! "fixup" pass, but only through recorded fixups, not all instructions). The
25 //! basic idea is:
26 //!
27 //! - Emit branches as they are, including two-target (cond/uncond) compound
28 //!   forms, but with zero offsets and optimistically assuming the target will be
29 //!   in range. Record the "fixup" for later. Targets are denoted instead by
30 //!   symbolic "labels" that are then bound to certain offsets in the buffer as
31 //!   we emit code. (Nominally, there is a label at the start of every basic
32 //!   block.)
33 //!
34 //! - As we do this, track the offset in the buffer at which the first label
35 //!   reference "goes out of range". We call this the "deadline". If we reach the
36 //!   deadline and we still have not bound the label to which an unresolved branch
37 //!   refers, we have a problem!
38 //!
39 //! - To solve this problem, we emit "islands" full of "veneers". An island is
40 //!   simply a chunk of code inserted in the middle of the code actually produced
41 //!   by the emitter (e.g., vcode iterating over instruction structs). The emitter
42 //!   has some awareness of this: it either asks for an island between blocks, so
43 //!   it is not accidentally executed, or else it emits a branch around the island
44 //!   when all other options fail (see `Inst::EmitIsland` meta-instruction).
45 //!
46 //! - A "veneer" is an instruction (or sequence of instructions) in an "island"
47 //!   that implements a longer-range reference to a label. The idea is that, for
48 //!   example, a branch with a limited range can branch to a "veneer" instead,
49 //!   which is simply a branch in a form that can use a longer-range reference. On
50 //!   AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
51 //!   can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
52 //!   conditional branch's label reference can be fixed up with a "veneer" to
53 //!   achieve a longer range.
54 //!
55 //! - To implement all of this, we require the backend to provide a `LabelUse`
56 //!   type that implements a trait. This is nominally an enum that records one of
57 //!   several kinds of references to an offset in code -- basically, a relocation
58 //!   type -- and will usually correspond to different instruction formats. The
59 //!   `LabelUse` implementation specifies the maximum range, how to patch in the
60 //!   actual label location when known, and how to generate a veneer to extend the
61 //!   range.
62 //!
63 //! That satisfies label references, but we still may have suboptimal branch
64 //! patterns. To clean up the branches, we do a simple "peephole"-style
65 //! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
66 //! informs the buffer of branches in the code and, in the case of conditionals,
67 //! the code that would have been emitted to invert this branch's condition. We
68 //! track the "latest branches": these are branches that are contiguous up to
69 //! the current offset. (If any code is emitted after a branch, that branch or
70 //! run of contiguous branches is no longer "latest".) The latest branches are
71 //! those that we can edit by simply truncating the buffer and doing something
72 //! else instead.
73 //!
74 //! To optimize branches, we implement several simple rules, and try to apply
75 //! them to the "latest branches" when possible:
76 //!
77 //! - A branch with a label target, when that label is bound to the ending
78 //!   offset of the branch (the fallthrough location), can be removed altogether,
79 //!   because the branch would have no effect).
80 //!
81 //! - An unconditional branch that starts at a label location, and branches to
82 //!   another label, results in a "label alias": all references to the label bound
83 //!   *to* this branch instruction are instead resolved to the *target* of the
84 //!   branch instruction. This effectively removes empty blocks that just
85 //!   unconditionally branch to the next block. We call this "branch threading".
86 //!
87 //! - A conditional followed by an unconditional, when the conditional branches
88 //!   to the unconditional's fallthrough, results in (i) the truncation of the
89 //!   unconditional, (ii) the inversion of the condition's condition, and (iii)
90 //!   replacement of the conditional's target (using the original target of the
91 //!   unconditional). This is a fancy way of saying "we can flip a two-target
92 //!   conditional branch's taken/not-taken targets if it works better with our
93 //!   fallthrough". To make this work, the emitter actually gives the buffer
94 //!   *both* forms of every conditional branch: the true form is emitted into the
95 //!   buffer, and the "inverted" machine-code bytes are provided as part of the
96 //!   branch-fixup metadata.
97 //!
98 //! - An unconditional B preceded by another unconditional P, when B's label(s) have
99 //!   been redirected to target(B), can be removed entirely. This is an extension
100 //!   of the branch-threading optimization, and is valid because if we know there
101 //!   will be no fallthrough into this branch instruction (the prior instruction
102 //!   is an unconditional jump), and if we know we have successfully redirected
103 //!   all labels, then this branch instruction is unreachable. Note that this
104 //!   works because the redirection happens before the label is ever resolved
105 //!   (fixups happen at island emission time, at which point latest-branches are
106 //!   cleared, or at the end of emission), so we are sure to catch and redirect
107 //!   all possible paths to this instruction.
108 //!
109 //! # Branch-optimization Correctness
110 //!
111 //! The branch-optimization mechanism depends on a few data structures with
112 //! invariants, which are always held outside the scope of top-level public
113 //! methods:
114 //!
115 //! - The latest-branches list. Each entry describes a span of the buffer
116 //!   (start/end offsets), the label target, the corresponding fixup-list entry
117 //!   index, and the bytes (must be the same length) for the inverted form, if
118 //!   conditional. The list of labels that are bound to the start-offset of this
119 //!   branch is *complete* (if any label has a resolved offset equal to `start`
120 //!   and is not an alias, it must appear in this list) and *precise* (no label
121 //!   in this list can be bound to another offset). No label in this list should
122 //!   be an alias.  No two branch ranges can overlap, and branches are in
123 //!   ascending-offset order.
124 //!
125 //! - The labels-at-tail list. This contains all MachLabels that have been bound
126 //!   to (whose resolved offsets are equal to) the tail offset of the buffer.
127 //!   No label in this list should be an alias.
128 //!
129 //! - The label_offsets array, containing the bound offset of a label or
130 //!   UNKNOWN. No label can be bound at an offset greater than the current
131 //!   buffer tail.
132 //!
133 //! - The label_aliases array, containing another label to which a label is
134 //!   bound or UNKNOWN. A label's resolved offset is the resolved offset
135 //!   of the label it is aliased to, if this is set.
136 //!
137 //! We argue below, at each method, how the invariants in these data structures
138 //! are maintained (grep for "Post-invariant").
139 //!
140 //! Given these invariants, we argue why each optimization preserves execution
141 //! semantics below (grep for "Preserves execution semantics").
142 
143 use crate::binemit::{Addend, CodeOffset, Reloc, StackMap};
144 use crate::ir::{ExternalName, Opcode, SourceLoc, TrapCode};
145 use crate::isa::unwind::UnwindInst;
146 use crate::machinst::{
147     BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst,
148 };
149 use crate::timing;
150 use cranelift_entity::{entity_impl, SecondaryMap};
151 use log::trace;
152 use smallvec::SmallVec;
153 use std::convert::TryFrom;
154 use std::mem;
155 use std::string::String;
156 use std::vec::Vec;
157 
158 /// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
159 /// in bulk.
160 ///
161 /// This struct uses `SmallVec`s to support small-ish function bodies without
162 /// any heap allocation. As such, it will be several kilobytes large. This is
163 /// likely fine as long as it is stack-allocated for function emission then
164 /// thrown away; but beware if many buffer objects are retained persistently.
165 pub struct MachBuffer<I: VCodeInst> {
166     /// The buffer contents, as raw bytes.
167     data: SmallVec<[u8; 1024]>,
168     /// Any relocations referring to this code. Note that only *external*
169     /// relocations are tracked here; references to labels within the buffer are
170     /// resolved before emission.
171     relocs: SmallVec<[MachReloc; 16]>,
172     /// Any trap records referring to this code.
173     traps: SmallVec<[MachTrap; 16]>,
174     /// Any call site records referring to this code.
175     call_sites: SmallVec<[MachCallSite; 16]>,
176     /// Any source location mappings referring to this code.
177     srclocs: SmallVec<[MachSrcLoc; 64]>,
178     /// Any stack maps referring to this code.
179     stack_maps: SmallVec<[MachStackMap; 8]>,
180     /// Any unwind info at a given location.
181     unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
182     /// The current source location in progress (after `start_srcloc()` and
183     /// before `end_srcloc()`).  This is a (start_offset, src_loc) tuple.
184     cur_srcloc: Option<(CodeOffset, SourceLoc)>,
185     /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
186     label_offsets: SmallVec<[CodeOffset; 16]>,
187     /// Label aliases: when one label points to an unconditional jump, and that
188     /// jump points to another label, we can redirect references to the first
189     /// label immediately to the second.
190     ///
191     /// Invariant: we don't have label-alias cycles. We ensure this by,
192     /// before setting label A to alias label B, resolving B's alias
193     /// target (iteratively until a non-aliased label); if B is already
194     /// aliased to A, then we cannot alias A back to B.
195     label_aliases: SmallVec<[MachLabel; 16]>,
196     /// Constants that must be emitted at some point.
197     pending_constants: SmallVec<[MachLabelConstant; 16]>,
198     /// Fixups that must be performed after all code is emitted.
199     fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
200     /// Current deadline at which all constants are flushed and all code labels
201     /// are extended by emitting long-range jumps in an island. This flush
202     /// should be rare (e.g., on AArch64, the shortest-range PC-rel references
203     /// are +/- 1MB for conditional jumps and load-literal instructions), so
204     /// it's acceptable to track a minimum and flush-all rather than doing more
205     /// detailed "current minimum" / sort-by-deadline trickery.
206     island_deadline: CodeOffset,
207     /// How many bytes are needed in the worst case for an island, given all
208     /// pending constants and fixups.
209     island_worst_case_size: CodeOffset,
210     /// Latest branches, to facilitate in-place editing for better fallthrough
211     /// behavior and empty-block removal.
212     latest_branches: SmallVec<[MachBranch; 4]>,
213     /// All labels at the current offset (emission tail). This is lazily
214     /// cleared: it is actually accurate as long as the current offset is
215     /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
216     /// be considered as empty.
217     ///
218     /// For correctness, this *must* be complete (i.e., the vector must contain
219     /// all labels whose offsets are resolved to the current tail), because we
220     /// rely on it to update labels when we truncate branches.
221     labels_at_tail: SmallVec<[MachLabel; 4]>,
222     /// The last offset at which `labels_at_tail` is valid. It is conceptually
223     /// always describing the tail of the buffer, but we do not clear
224     /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
225     /// when the offset has grown past this (`labels_at_tail_off`) point.
226     /// Always <= `cur_offset()`.
227     labels_at_tail_off: CodeOffset,
228     /// Map used constants to their [MachLabel].
229     constant_labels: SecondaryMap<VCodeConstant, MachLabel>,
230 }
231 
232 /// A `MachBuffer` once emission is completed: holds generated code and records,
233 /// without fixups. This allows the type to be independent of the backend.
234 pub struct MachBufferFinalized {
235     /// The buffer contents, as raw bytes.
236     data: SmallVec<[u8; 1024]>,
237     /// Any relocations referring to this code. Note that only *external*
238     /// relocations are tracked here; references to labels within the buffer are
239     /// resolved before emission.
240     relocs: SmallVec<[MachReloc; 16]>,
241     /// Any trap records referring to this code.
242     traps: SmallVec<[MachTrap; 16]>,
243     /// Any call site records referring to this code.
244     call_sites: SmallVec<[MachCallSite; 16]>,
245     /// Any source location mappings referring to this code.
246     srclocs: SmallVec<[MachSrcLoc; 64]>,
247     /// Any stack maps referring to this code.
248     stack_maps: SmallVec<[MachStackMap; 8]>,
249     /// Any unwind info at a given location.
250     pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>,
251 }
252 
253 const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
254 const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);
255 
256 /// Threshold on max length of `labels_at_this_branch` list to avoid
257 /// unbounded quadratic behavior (see comment below at use-site).
258 const LABEL_LIST_THRESHOLD: usize = 100;
259 
260 /// A label refers to some offset in a `MachBuffer`. It may not be resolved at
261 /// the point at which it is used by emitted code; the buffer records "fixups"
262 /// for references to the label, and will come back and patch the code
263 /// appropriately when the label's location is eventually known.
264 #[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
265 pub struct MachLabel(u32);
266 entity_impl!(MachLabel);
267 
268 impl MachLabel {
269     /// Get a label for a block. (The first N MachLabels are always reseved for
270     /// the N blocks in the vcode.)
271     pub fn from_block(bindex: BlockIndex) -> MachLabel {
272         MachLabel(bindex.index() as u32)
273     }
274 
275     /// Get the numeric label index.
276     pub fn get(self) -> u32 {
277         self.0
278     }
279 
280     /// Creates a string representing this label, for convenience.
281     pub fn to_string(&self) -> String {
282         format!("label{}", self.0)
283     }
284 }
285 
286 impl Default for MachLabel {
287     fn default() -> Self {
288         UNKNOWN_LABEL
289     }
290 }
291 
292 /// A stack map extent, when creating a stack map.
293 pub enum StackMapExtent {
294     /// The stack map starts at this instruction, and ends after the number of upcoming bytes
295     /// (note: this is a code offset diff).
296     UpcomingBytes(CodeOffset),
297 
298     /// The stack map started at the given offset and ends at the current one. This helps
299     /// architectures where the instruction size has not a fixed length.
300     StartedAtOffset(CodeOffset),
301 }
302 
303 impl<I: VCodeInst> MachBuffer<I> {
304     /// Create a new section, known to start at `start_offset` and with a size limited to
305     /// `length_limit`.
306     pub fn new() -> MachBuffer<I> {
307         MachBuffer {
308             data: SmallVec::new(),
309             relocs: SmallVec::new(),
310             traps: SmallVec::new(),
311             call_sites: SmallVec::new(),
312             srclocs: SmallVec::new(),
313             stack_maps: SmallVec::new(),
314             unwind_info: SmallVec::new(),
315             cur_srcloc: None,
316             label_offsets: SmallVec::new(),
317             label_aliases: SmallVec::new(),
318             pending_constants: SmallVec::new(),
319             fixup_records: SmallVec::new(),
320             island_deadline: UNKNOWN_LABEL_OFFSET,
321             island_worst_case_size: 0,
322             latest_branches: SmallVec::new(),
323             labels_at_tail: SmallVec::new(),
324             labels_at_tail_off: 0,
325             constant_labels: SecondaryMap::new(),
326         }
327     }
328 
329     /// Current offset from start of buffer.
330     pub fn cur_offset(&self) -> CodeOffset {
331         self.data.len() as CodeOffset
332     }
333 
334     /// Add a byte.
335     pub fn put1(&mut self, value: u8) {
336         trace!("MachBuffer: put byte @ {}: {:x}", self.cur_offset(), value);
337         self.data.push(value);
338 
339         // Post-invariant: conceptual-labels_at_tail contains a complete and
340         // precise list of labels bound at `cur_offset()`. We have advanced
341         // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
342         // before, it is not anymore (and it cannot become equal, because
343         // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
344         // conceptually empty (even though it is only lazily cleared). No labels
345         // can be bound at this new offset (by invariant on `label_offsets`).
346         // Hence the invariant holds.
347     }
348 
349     /// Add 2 bytes.
350     pub fn put2(&mut self, value: u16) {
351         trace!(
352             "MachBuffer: put 16-bit word @ {}: {:x}",
353             self.cur_offset(),
354             value
355         );
356         let bytes = value.to_le_bytes();
357         self.data.extend_from_slice(&bytes[..]);
358 
359         // Post-invariant: as for `put1()`.
360     }
361 
362     /// Add 4 bytes.
363     pub fn put4(&mut self, value: u32) {
364         trace!(
365             "MachBuffer: put 32-bit word @ {}: {:x}",
366             self.cur_offset(),
367             value
368         );
369         let bytes = value.to_le_bytes();
370         self.data.extend_from_slice(&bytes[..]);
371 
372         // Post-invariant: as for `put1()`.
373     }
374 
375     /// Add 8 bytes.
376     pub fn put8(&mut self, value: u64) {
377         trace!(
378             "MachBuffer: put 64-bit word @ {}: {:x}",
379             self.cur_offset(),
380             value
381         );
382         let bytes = value.to_le_bytes();
383         self.data.extend_from_slice(&bytes[..]);
384 
385         // Post-invariant: as for `put1()`.
386     }
387 
388     /// Add a slice of bytes.
389     pub fn put_data(&mut self, data: &[u8]) {
390         trace!(
391             "MachBuffer: put data @ {}: len {}",
392             self.cur_offset(),
393             data.len()
394         );
395         self.data.extend_from_slice(data);
396 
397         // Post-invariant: as for `put1()`.
398     }
399 
400     /// Reserve appended space and return a mutable slice referring to it.
401     pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
402         trace!("MachBuffer: put data @ {}: len {}", self.cur_offset(), len);
403         let off = self.data.len();
404         let new_len = self.data.len() + len;
405         self.data.resize(new_len, 0);
406         &mut self.data[off..]
407 
408         // Post-invariant: as for `put1()`.
409     }
410 
411     /// Align up to the given alignment.
412     pub fn align_to(&mut self, align_to: CodeOffset) {
413         trace!("MachBuffer: align to {}", align_to);
414         assert!(align_to.is_power_of_two());
415         while self.cur_offset() & (align_to - 1) != 0 {
416             self.put1(0);
417         }
418 
419         // Post-invariant: as for `put1()`.
420     }
421 
422     /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
423     /// offset yet.
424     pub fn get_label(&mut self) -> MachLabel {
425         let l = self.label_offsets.len() as u32;
426         self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
427         self.label_aliases.push(UNKNOWN_LABEL);
428         trace!("MachBuffer: new label -> {:?}", MachLabel(l));
429         MachLabel(l)
430 
431         // Post-invariant: the only mutation is to add a new label; it has no
432         // bound offset yet, so it trivially satisfies all invariants.
433     }
434 
435     /// Reserve the first N MachLabels for blocks.
436     pub fn reserve_labels_for_blocks(&mut self, blocks: usize) {
437         trace!("MachBuffer: first {} labels are for blocks", blocks);
438         debug_assert!(self.label_offsets.is_empty());
439         self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET);
440         self.label_aliases.resize(blocks, UNKNOWN_LABEL);
441 
442         // Post-invariant: as for `get_label()`.
443     }
444 
445     /// Reserve the next N MachLabels for constants.
446     pub fn reserve_labels_for_constants(&mut self, constants: &VCodeConstants) {
447         trace!(
448             "MachBuffer: next {} labels are for constants",
449             constants.len()
450         );
451         for c in constants.keys() {
452             self.constant_labels[c] = self.get_label();
453         }
454 
455         // Post-invariant: as for `get_label()`.
456     }
457 
458     /// Retrieve the reserved label for a constant.
459     pub fn get_label_for_constant(&self, constant: VCodeConstant) -> MachLabel {
460         self.constant_labels[constant]
461     }
462 
463     /// Bind a label to the current offset. A label can only be bound once.
464     pub fn bind_label(&mut self, label: MachLabel) {
465         trace!(
466             "MachBuffer: bind label {:?} at offset {}",
467             label,
468             self.cur_offset()
469         );
470         debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
471         debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
472         let offset = self.cur_offset();
473         self.label_offsets[label.0 as usize] = offset;
474         self.lazily_clear_labels_at_tail();
475         self.labels_at_tail.push(label);
476 
477         // Invariants hold: bound offset of label is <= cur_offset (in fact it
478         // is equal). If the `labels_at_tail` list was complete and precise
479         // before, it is still, because we have bound this label to the current
480         // offset and added it to the list (which contains all labels at the
481         // current offset).
482 
483         self.optimize_branches();
484 
485         // Post-invariant: by `optimize_branches()` (see argument there).
486     }
487 
488     /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
489     /// offset that it applies to.
490     fn lazily_clear_labels_at_tail(&mut self) {
491         let offset = self.cur_offset();
492         if offset > self.labels_at_tail_off {
493             self.labels_at_tail_off = offset;
494             self.labels_at_tail.clear();
495         }
496 
497         // Post-invariant: either labels_at_tail_off was at cur_offset, and
498         // state is untouched, or was less than cur_offset, in which case the
499         // labels_at_tail list was conceptually empty, and is now actually
500         // empty.
501     }
502 
503     /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
504     pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
505         let mut iters = 0;
506         while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
507             label = self.label_aliases[label.0 as usize];
508             // To protect against an infinite loop (despite our assurances to
509             // ourselves that the invariants make this impossible), assert out
510             // after 1M iterations. The number of basic blocks is limited
511             // in most contexts anyway so this should be impossible to hit with
512             // a legitimate input.
513             iters += 1;
514             assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
515         }
516         self.label_offsets[label.0 as usize]
517 
518         // Post-invariant: no mutations.
519     }
520 
521     /// Emit a reference to the given label with the given reference type (i.e.,
522     /// branch-instruction format) at the current offset.  This is like a
523     /// relocation, but handled internally.
524     ///
525     /// This can be called before the branch is actually emitted; fixups will
526     /// not happen until an island is emitted or the buffer is finished.
527     pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
528         trace!(
529             "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
530             offset,
531             label,
532             kind
533         );
534 
535         // Add the fixup, and update the worst-case island size based on a
536         // veneer for this label use.
537         self.fixup_records.push(MachLabelFixup {
538             label,
539             offset,
540             kind,
541         });
542         if kind.supports_veneer() {
543             self.island_worst_case_size += kind.veneer_size();
544             self.island_worst_case_size &= !(I::LabelUse::ALIGN - 1);
545         }
546         let deadline = offset.saturating_add(kind.max_pos_range());
547         if deadline < self.island_deadline {
548             self.island_deadline = deadline;
549         }
550 
551         // Post-invariant: no mutations to branches/labels data structures.
552     }
553 
554     /// Inform the buffer of an unconditional branch at the given offset,
555     /// targetting the given label. May be used to optimize branches.
556     /// The last added label-use must correspond to this branch.
557     /// This must be called when the current offset is equal to `start`; i.e.,
558     /// before actually emitting the branch. This implies that for a branch that
559     /// uses a label and is eligible for optimizations by the MachBuffer, the
560     /// proper sequence is:
561     ///
562     /// - Call `use_label_at_offset()` to emit the fixup record.
563     /// - Call `add_uncond_branch()` to make note of the branch.
564     /// - Emit the bytes for the branch's machine code.
565     ///
566     /// Additional requirement: no labels may be bound between `start` and `end`
567     /// (exclusive on both ends).
568     pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
569         assert!(self.cur_offset() == start);
570         debug_assert!(end > start);
571         assert!(!self.fixup_records.is_empty());
572         let fixup = self.fixup_records.len() - 1;
573         self.lazily_clear_labels_at_tail();
574         self.latest_branches.push(MachBranch {
575             start,
576             end,
577             target,
578             fixup,
579             inverted: None,
580             labels_at_this_branch: self.labels_at_tail.clone(),
581         });
582 
583         // Post-invariant: we asserted branch start is current tail; the list of
584         // labels at branch is cloned from list of labels at current tail.
585     }
586 
587     /// Inform the buffer of a conditional branch at the given offset,
588     /// targetting the given label. May be used to optimize branches.
589     /// The last added label-use must correspond to this branch.
590     ///
591     /// Additional requirement: no labels may be bound between `start` and `end`
592     /// (exclusive on both ends).
593     pub fn add_cond_branch(
594         &mut self,
595         start: CodeOffset,
596         end: CodeOffset,
597         target: MachLabel,
598         inverted: &[u8],
599     ) {
600         assert!(self.cur_offset() == start);
601         debug_assert!(end > start);
602         assert!(!self.fixup_records.is_empty());
603         debug_assert!(inverted.len() == (end - start) as usize);
604         let fixup = self.fixup_records.len() - 1;
605         let inverted = Some(SmallVec::from(inverted));
606         self.lazily_clear_labels_at_tail();
607         self.latest_branches.push(MachBranch {
608             start,
609             end,
610             target,
611             fixup,
612             inverted,
613             labels_at_this_branch: self.labels_at_tail.clone(),
614         });
615 
616         // Post-invariant: we asserted branch start is current tail; labels at
617         // branch list is cloned from list of labels at current tail.
618     }
619 
620     fn truncate_last_branch(&mut self) {
621         self.lazily_clear_labels_at_tail();
622         // Invariants hold at this point.
623 
624         let b = self.latest_branches.pop().unwrap();
625         assert!(b.end == self.cur_offset());
626 
627         // State:
628         //    [PRE CODE]
629         //  Offset b.start, b.labels_at_this_branch:
630         //    [BRANCH CODE]
631         //  cur_off, self.labels_at_tail -->
632         //    (end of buffer)
633         self.data.truncate(b.start as usize);
634         self.fixup_records.truncate(b.fixup);
635         while let Some(mut last_srcloc) = self.srclocs.last_mut() {
636             if last_srcloc.end <= b.start {
637                 break;
638             }
639             if last_srcloc.start < b.start {
640                 last_srcloc.end = b.start;
641                 break;
642             }
643             self.srclocs.pop();
644         }
645         // State:
646         //    [PRE CODE]
647         //  cur_off, Offset b.start, b.labels_at_this_branch:
648         //    (end of buffer)
649         //
650         //  self.labels_at_tail -->  (past end of buffer)
651         let cur_off = self.cur_offset();
652         self.labels_at_tail_off = cur_off;
653         // State:
654         //    [PRE CODE]
655         //  cur_off, Offset b.start, b.labels_at_this_branch,
656         //  self.labels_at_tail:
657         //    (end of buffer)
658         //
659         // resolve_label_offset(l) for l in labels_at_tail:
660         //    (past end of buffer)
661 
662         trace!(
663             "truncate_last_branch: truncated {:?}; off now {}",
664             b,
665             cur_off
666         );
667 
668         // Fix up resolved label offsets for labels at tail.
669         for &l in &self.labels_at_tail {
670             self.label_offsets[l.0 as usize] = cur_off;
671         }
672         // Old labels_at_this_branch are now at cur_off.
673         self.labels_at_tail
674             .extend(b.labels_at_this_branch.into_iter());
675 
676         // Post-invariant: this operation is defined to truncate the buffer,
677         // which moves cur_off backward, and to move labels at the end of the
678         // buffer back to the start-of-branch offset.
679         //
680         // latest_branches satisfies all invariants:
681         // - it has no branches past the end of the buffer (branches are in
682         //   order, we removed the last one, and we truncated the buffer to just
683         //   before the start of that branch)
684         // - no labels were moved to lower offsets than the (new) cur_off, so
685         //   the labels_at_this_branch list for any other branch need not change.
686         //
687         // labels_at_tail satisfies all invariants:
688         // - all labels that were at the tail after the truncated branch are
689         //   moved backward to just before the branch, which becomes the new tail;
690         //   thus every element in the list should remain (ensured by `.extend()`
691         //   above).
692         // - all labels that refer to the new tail, which is the start-offset of
693         //   the truncated branch, must be present. The `labels_at_this_branch`
694         //   list in the truncated branch's record is a complete and precise list
695         //   of exactly these labels; we append these to labels_at_tail.
696         // - labels_at_tail_off is at cur_off after truncation occurs, so the
697         //   list is valid (not to be lazily cleared).
698         //
699         // The stated operation was performed:
700         // - For each label at the end of the buffer prior to this method, it
701         //   now resolves to the new (truncated) end of the buffer: it must have
702         //   been in `labels_at_tail` (this list is precise and complete, and
703         //   the tail was at the end of the truncated branch on entry), and we
704         //   iterate over this list and set `label_offsets` to the new tail.
705         //   None of these labels could have been an alias (by invariant), so
706         //   `label_offsets` is authoritative for each.
707         // - No other labels will be past the end of the buffer, because of the
708         //   requirement that no labels be bound to the middle of branch ranges
709         //   (see comments to `add_{cond,uncond}_branch()`).
710         // - The buffer is truncated to just before the last branch, and the
711         //   fixup record referring to that last branch is removed.
712     }
713 
714     fn optimize_branches(&mut self) {
715         self.lazily_clear_labels_at_tail();
716         // Invariants valid at this point.
717 
718         trace!(
719             "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
720             self.latest_branches,
721             self.labels_at_tail,
722             self.fixup_records
723         );
724 
725         // We continue to munch on branches at the tail of the buffer until no
726         // more rules apply. Note that the loop only continues if a branch is
727         // actually truncated (or if labels are redirected away from a branch),
728         // so this always makes progress.
729         while let Some(b) = self.latest_branches.last() {
730             let cur_off = self.cur_offset();
731             trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
732             // If there has been any code emission since the end of the last branch or
733             // label definition, then there's nothing we can edit (because we
734             // don't move code once placed, only back up and overwrite), so
735             // clear the records and finish.
736             if b.end < cur_off {
737                 break;
738             }
739 
740             // If the "labels at this branch" list on this branch is
741             // longer than a threshold, don't do any simplification,
742             // and let the branch remain to separate those labels from
743             // the current tail. This avoids quadratic behavior (see
744             // #3468): otherwise, if a long string of "goto next;
745             // next:" patterns are emitted, all of the labels will
746             // coalesce into a long list of aliases for the current
747             // buffer tail. We must track all aliases of the current
748             // tail for correctness, but we are also allowed to skip
749             // optimization (removal) of any branch, so we take the
750             // escape hatch here and let it stand. In effect this
751             // "spreads" the many thousands of labels in the
752             // pathological case among an actual (harmless but
753             // suboptimal) instruction once per N labels.
754             if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD {
755                 break;
756             }
757 
758             // Invariant: we are looking at a branch that ends at the tail of
759             // the buffer.
760 
761             // For any branch, conditional or unconditional:
762             // - If the target is a label at the current offset, then remove
763             //   the conditional branch, and reset all labels that targetted
764             //   the current offset (end of branch) to the truncated
765             //   end-of-code.
766             //
767             // Preserves execution semantics: a branch to its own fallthrough
768             // address is equivalent to a no-op; in both cases, nextPC is the
769             // fallthrough.
770             if self.resolve_label_offset(b.target) == cur_off {
771                 trace!("branch with target == cur off; truncating");
772                 self.truncate_last_branch();
773                 continue;
774             }
775 
776             // If latest is an unconditional branch:
777             //
778             // - If the branch's target is not its own start address, then for
779             //   each label at the start of branch, make the label an alias of the
780             //   branch target, and remove the label from the "labels at this
781             //   branch" list.
782             //
783             //   - Preserves execution semantics: an unconditional branch's
784             //     only effect is to set PC to a new PC; this change simply
785             //     collapses one step in the step-semantics.
786             //
787             //   - Post-invariant: the labels that were bound to the start of
788             //     this branch become aliases, so they must not be present in any
789             //     labels-at-this-branch list or the labels-at-tail list. The
790             //     labels are removed form the latest-branch record's
791             //     labels-at-this-branch list, and are never placed in the
792             //     labels-at-tail list. Furthermore, it is correct that they are
793             //     not in either list, because they are now aliases, and labels
794             //     that are aliases remain aliases forever.
795             //
796             // - If there is a prior unconditional branch that ends just before
797             //   this one begins, and this branch has no labels bound to its
798             //   start, then we can truncate this branch, because it is entirely
799             //   unreachable (we have redirected all labels that make it
800             //   reachable otherwise). Do so and continue around the loop.
801             //
802             //   - Preserves execution semantics: the branch is unreachable,
803             //     because execution can only flow into an instruction from the
804             //     prior instruction's fallthrough or from a branch bound to that
805             //     instruction's start offset. Unconditional branches have no
806             //     fallthrough, so if the prior instruction is an unconditional
807             //     branch, no fallthrough entry can happen. The
808             //     labels-at-this-branch list is complete (by invariant), so if it
809             //     is empty, then the instruction is entirely unreachable. Thus,
810             //     it can be removed.
811             //
812             //   - Post-invariant: ensured by truncate_last_branch().
813             //
814             // - If there is a prior conditional branch whose target label
815             //   resolves to the current offset (branches around the
816             //   unconditional branch), then remove the unconditional branch,
817             //   and make the target of the unconditional the target of the
818             //   conditional instead.
819             //
820             //   - Preserves execution semantics: previously we had:
821             //
822             //         L1:
823             //            cond_br L2
824             //            br L3
825             //         L2:
826             //            (end of buffer)
827             //
828             //     by removing the last branch, we have:
829             //
830             //         L1:
831             //            cond_br L2
832             //         L2:
833             //            (end of buffer)
834             //
835             //     we then fix up the records for the conditional branch to
836             //     have:
837             //
838             //         L1:
839             //           cond_br.inverted L3
840             //         L2:
841             //
842             //     In the original code, control flow reaches L2 when the
843             //     conditional branch's predicate is true, and L3 otherwise. In
844             //     the optimized code, the same is true.
845             //
846             //   - Post-invariant: all edits to latest_branches and
847             //     labels_at_tail are performed by `truncate_last_branch()`,
848             //     which maintains the invariants at each step.
849 
850             if b.is_uncond() {
851                 // Set any label equal to current branch's start as an alias of
852                 // the branch's target, if the target is not the branch itself
853                 // (i.e., an infinite loop).
854                 //
855                 // We cannot perform this aliasing if the target of this branch
856                 // ultimately aliases back here; if so, we need to keep this
857                 // branch, so break out of this loop entirely (and clear the
858                 // latest-branches list below).
859                 //
860                 // Note that this check is what prevents cycles from forming in
861                 // `self.label_aliases`. To see why, consider an arbitrary start
862                 // state:
863                 //
864                 // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
865                 // Ln, which is not aliased.
866                 //
867                 // We would create a cycle if we assigned label_aliases[Ln]
868                 // = L1.  Note that the below assignment is the only write
869                 // to label_aliases.
870                 //
871                 // By our other invariants, we have that Ln (`l` below)
872                 // resolves to the offset `b.start`, because it is in the
873                 // set `b.labels_at_this_branch`.
874                 //
875                 // If L1 were already aliased, through some arbitrarily deep
876                 // chain, to Ln, then it must also resolve to this offset
877                 // `b.start`.
878                 //
879                 // By checking the resolution of `L1` against this offset,
880                 // and aborting this branch-simplification if they are
881                 // equal, we prevent the below assignment from ever creating
882                 // a cycle.
883                 if self.resolve_label_offset(b.target) != b.start {
884                     let redirected = b.labels_at_this_branch.len();
885                     for &l in &b.labels_at_this_branch {
886                         trace!(
887                             " -> label at start of branch {:?} redirected to target {:?}",
888                             l,
889                             b.target
890                         );
891                         self.label_aliases[l.0 as usize] = b.target;
892                         // NOTE: we continue to ensure the invariant that labels
893                         // pointing to tail of buffer are in `labels_at_tail`
894                         // because we already ensured above that the last branch
895                         // cannot have a target of `cur_off`; so we never have
896                         // to put the label into `labels_at_tail` when moving it
897                         // here.
898                     }
899                     // Maintain invariant: all branches have been redirected
900                     // and are no longer pointing at the start of this branch.
901                     let mut_b = self.latest_branches.last_mut().unwrap();
902                     mut_b.labels_at_this_branch.clear();
903 
904                     if redirected > 0 {
905                         trace!(" -> after label redirects, restarting loop");
906                         continue;
907                     }
908                 } else {
909                     break;
910                 }
911 
912                 let b = self.latest_branches.last().unwrap();
913 
914                 // Examine any immediately preceding branch.
915                 if self.latest_branches.len() > 1 {
916                     let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
917                     trace!(" -> more than one branch; prev_b = {:?}", prev_b);
918                     // This uncond is immediately after another uncond; we
919                     // should have already redirected labels to this uncond away
920                     // (but check to be sure); so we can truncate this uncond.
921                     if prev_b.is_uncond()
922                         && prev_b.end == b.start
923                         && b.labels_at_this_branch.is_empty()
924                     {
925                         trace!(" -> uncond follows another uncond; truncating");
926                         self.truncate_last_branch();
927                         continue;
928                     }
929 
930                     // This uncond is immediately after a conditional, and the
931                     // conditional's target is the end of this uncond, and we've
932                     // already redirected labels to this uncond away; so we can
933                     // truncate this uncond, flip the sense of the conditional, and
934                     // set the conditional's target (in `latest_branches` and in
935                     // `fixup_records`) to the uncond's target.
936                     if prev_b.is_cond()
937                         && prev_b.end == b.start
938                         && self.resolve_label_offset(prev_b.target) == cur_off
939                     {
940                         trace!(" -> uncond follows a conditional, and conditional's target resolves to current offset");
941                         // Save the target of the uncond (this becomes the
942                         // target of the cond), and truncate the uncond.
943                         let target = b.target;
944                         let data = prev_b.inverted.clone().unwrap();
945                         self.truncate_last_branch();
946 
947                         // Mutate the code and cond branch.
948                         let off_before_edit = self.cur_offset();
949                         let prev_b = self.latest_branches.last_mut().unwrap();
950                         let not_inverted = SmallVec::from(
951                             &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
952                         );
953 
954                         // Low-level edit: replaces bytes of branch with
955                         // inverted form. cur_off remains the same afterward, so
956                         // we do not need to modify label data structures.
957                         self.data.truncate(prev_b.start as usize);
958                         self.data.extend_from_slice(&data[..]);
959 
960                         // Save the original code as the inversion of the
961                         // inverted branch, in case we later edit this branch
962                         // again.
963                         prev_b.inverted = Some(not_inverted);
964                         self.fixup_records[prev_b.fixup].label = target;
965                         trace!(" -> reassigning target of condbr to {:?}", target);
966                         prev_b.target = target;
967                         debug_assert_eq!(off_before_edit, self.cur_offset());
968                         continue;
969                     }
970                 }
971             }
972 
973             // If we couldn't do anything with the last branch, then break.
974             break;
975         }
976 
977         self.purge_latest_branches();
978 
979         trace!(
980             "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
981             self.latest_branches,
982             self.labels_at_tail,
983             self.fixup_records
984         );
985     }
986 
987     fn purge_latest_branches(&mut self) {
988         // All of our branch simplification rules work only if a branch ends at
989         // the tail of the buffer, with no following code; and branches are in
990         // order in latest_branches; so if the last entry ends prior to
991         // cur_offset, then clear all entries.
992         let cur_off = self.cur_offset();
993         if let Some(l) = self.latest_branches.last() {
994             if l.end < cur_off {
995                 trace!("purge_latest_branches: removing branch {:?}", l);
996                 self.latest_branches.clear();
997             }
998         }
999 
1000         // Post-invariant: no invariant requires any branch to appear in
1001         // `latest_branches`; it is always optional. The list-clear above thus
1002         // preserves all semantics.
1003     }
1004 
1005     /// Emit a constant at some point in the future, binding the given label to
1006     /// its offset. The constant will be placed at most `max_distance` from the
1007     /// current offset.
1008     pub fn defer_constant(
1009         &mut self,
1010         label: MachLabel,
1011         align: CodeOffset,
1012         data: &[u8],
1013         max_distance: CodeOffset,
1014     ) {
1015         trace!(
1016             "defer_constant: eventually emit {} bytes aligned to {} at label {:?}",
1017             data.len(),
1018             align,
1019             label
1020         );
1021         let deadline = self.cur_offset().saturating_add(max_distance);
1022         self.island_worst_case_size += data.len() as CodeOffset;
1023         self.island_worst_case_size =
1024             (self.island_worst_case_size + I::LabelUse::ALIGN - 1) & !(I::LabelUse::ALIGN - 1);
1025         self.pending_constants.push(MachLabelConstant {
1026             label,
1027             align,
1028             data: SmallVec::from(data),
1029         });
1030         if deadline < self.island_deadline {
1031             self.island_deadline = deadline;
1032         }
1033     }
1034 
1035     /// Is an island needed within the next N bytes?
1036     pub fn island_needed(&self, distance: CodeOffset) -> bool {
1037         self.worst_case_end_of_island(distance) > self.island_deadline
1038     }
1039 
1040     /// Returns the maximal offset that islands can reach if `distance` more
1041     /// bytes are appended.
1042     ///
1043     /// This is used to determine if veneers need insertions since jumps that
1044     /// can't reach past this point must get a veneer of some form.
1045     fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset {
1046         self.cur_offset()
1047             .saturating_add(distance)
1048             .saturating_add(self.island_worst_case_size)
1049     }
1050 
1051     /// Emit all pending constants and required pending veneers.
1052     ///
1053     /// Should only be called if `island_needed()` returns true, i.e., if we
1054     /// actually reach a deadline. It's not necessarily a problem to do so
1055     /// otherwise but it may result in unnecessary work during emission.
1056     pub fn emit_island(&mut self, distance: CodeOffset) {
1057         self.emit_island_maybe_forced(false, distance);
1058     }
1059 
1060     /// Same as `emit_island`, but an internal API with a `force_veneers`
1061     /// argument to force all veneers to always get emitted for debugging.
1062     fn emit_island_maybe_forced(&mut self, force_veneers: bool, distance: CodeOffset) {
1063         // We're going to purge fixups, so no latest-branch editing can happen
1064         // anymore.
1065         self.latest_branches.clear();
1066 
1067         // Reset internal calculations about islands since we're going to
1068         // change the calculus as we apply fixups. The `forced_threshold` is
1069         // used here to determine whether jumps to unknown labels will require
1070         // a veneer or not.
1071         let forced_threshold = self.worst_case_end_of_island(distance);
1072         self.island_deadline = UNKNOWN_LABEL_OFFSET;
1073         self.island_worst_case_size = 0;
1074 
1075         // First flush out all constants so we have more labels in case fixups
1076         // are applied against these labels.
1077         for MachLabelConstant { label, align, data } in mem::take(&mut self.pending_constants) {
1078             self.align_to(align);
1079             self.bind_label(label);
1080             self.put_data(&data[..]);
1081         }
1082 
1083         for fixup in mem::take(&mut self.fixup_records) {
1084             trace!("emit_island: fixup {:?}", fixup);
1085             let MachLabelFixup {
1086                 label,
1087                 offset,
1088                 kind,
1089             } = fixup;
1090             let label_offset = self.resolve_label_offset(label);
1091             let start = offset as usize;
1092             let end = (offset + kind.patch_size()) as usize;
1093 
1094             if label_offset != UNKNOWN_LABEL_OFFSET {
1095                 // If the offset of the label for this fixup is known then
1096                 // we're going to do something here-and-now. We're either going
1097                 // to patch the original offset because it's an in-bounds jump,
1098                 // or we're going to generate a veneer, patch the fixup to jump
1099                 // to the veneer, and then keep going.
1100                 //
1101                 // If the label comes after the original fixup, then we should
1102                 // be guaranteed that the jump is in-bounds. Otherwise there's
1103                 // a bug somewhere because this method wasn't called soon
1104                 // enough. All forward-jumps are tracked and should get veneers
1105                 // before their deadline comes and they're unable to jump
1106                 // further.
1107                 //
1108                 // Otherwise if the label is before the fixup, then that's a
1109                 // backwards jump. If it's past the maximum negative range
1110                 // then we'll emit a veneer that to jump forward to which can
1111                 // then jump backwards.
1112                 let veneer_required = if label_offset >= offset {
1113                     assert!((label_offset - offset) <= kind.max_pos_range());
1114                     false
1115                 } else {
1116                     (offset - label_offset) > kind.max_neg_range()
1117                 };
1118                 trace!(
1119                     " -> label_offset = {}, known, required = {} (pos {} neg {})",
1120                     label_offset,
1121                     veneer_required,
1122                     kind.max_pos_range(),
1123                     kind.max_neg_range()
1124                 );
1125 
1126                 if (force_veneers && kind.supports_veneer()) || veneer_required {
1127                     self.emit_veneer(label, offset, kind);
1128                 } else {
1129                     let slice = &mut self.data[start..end];
1130                     trace!("patching in-range!");
1131                     kind.patch(slice, offset, label_offset);
1132                 }
1133             } else {
1134                 // If the offset of this label is not known at this time then
1135                 // there's one of two possibilities:
1136                 //
1137                 // * First we may be about to exceed the maximum jump range of
1138                 //   this fixup. In that case a veneer is inserted to buy some
1139                 //   more budget for the forward-jump. It's guaranteed that the
1140                 //   label will eventually come after where we're at, so we know
1141                 //   that the forward jump is necessary.
1142                 //
1143                 // * Otherwise we're still within range of the forward jump but
1144                 //   the precise target isn't known yet. In that case we
1145                 //   enqueue the fixup to get processed later.
1146                 if forced_threshold - offset > kind.max_pos_range() {
1147                     self.emit_veneer(label, offset, kind);
1148                 } else {
1149                     self.use_label_at_offset(offset, label, kind);
1150                 }
1151             }
1152         }
1153     }
1154 
1155     /// Emits a "veneer" the `kind` code at `offset` to jump to `label`.
1156     ///
1157     /// This will generate extra machine code, using `kind`, to get a
1158     /// larger-jump-kind than `kind` allows. The code at `offset` is then
1159     /// patched to jump to our new code, and then the new code is enqueued for
1160     /// a fixup to get processed at some later time.
1161     fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) {
1162         // If this `kind` doesn't support a veneer then that's a bug in the
1163         // backend because we need to implement support for such a veneer.
1164         assert!(
1165             kind.supports_veneer(),
1166             "jump beyond the range of {:?} but a veneer isn't supported",
1167             kind,
1168         );
1169 
1170         // Allocate space for a veneer in the island.
1171         self.align_to(I::LabelUse::ALIGN);
1172         let veneer_offset = self.cur_offset();
1173         trace!("making a veneer at {}", veneer_offset);
1174         let start = offset as usize;
1175         let end = (offset + kind.patch_size()) as usize;
1176         let slice = &mut self.data[start..end];
1177         // Patch the original label use to refer to the veneer.
1178         trace!(
1179             "patching original at offset {} to veneer offset {}",
1180             offset,
1181             veneer_offset
1182         );
1183         kind.patch(slice, offset, veneer_offset);
1184         // Generate the veneer.
1185         let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
1186         let (veneer_fixup_off, veneer_label_use) =
1187             kind.generate_veneer(veneer_slice, veneer_offset);
1188         trace!(
1189             "generated veneer; fixup offset {}, label_use {:?}",
1190             veneer_fixup_off,
1191             veneer_label_use
1192         );
1193         // Register a new use of `label` with our new veneer fixup and offset.
1194         // This'll recalculate deadlines accordingly and enqueue this fixup to
1195         // get processed at some later time.
1196         self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use);
1197     }
1198 
1199     fn finish_emission_maybe_forcing_veneers(&mut self, force_veneers: bool) {
1200         while !self.pending_constants.is_empty() || !self.fixup_records.is_empty() {
1201             // `emit_island()` will emit any pending veneers and constants, and
1202             // as a side-effect, will also take care of any fixups with resolved
1203             // labels eagerly.
1204             self.emit_island_maybe_forced(force_veneers, u32::MAX);
1205         }
1206 
1207         // Ensure that all labels have been fixed up after the last island is emitted. This is a
1208         // full (release-mode) assert because an unresolved label means the emitted code is
1209         // incorrect.
1210         assert!(self.fixup_records.is_empty());
1211     }
1212 
1213     /// Finish any deferred emissions and/or fixups.
1214     pub fn finish(mut self) -> MachBufferFinalized {
1215         let _tt = timing::vcode_emit_finish();
1216 
1217         self.finish_emission_maybe_forcing_veneers(false);
1218 
1219         let mut srclocs = self.srclocs;
1220         srclocs.sort_by_key(|entry| entry.start);
1221 
1222         MachBufferFinalized {
1223             data: self.data,
1224             relocs: self.relocs,
1225             traps: self.traps,
1226             call_sites: self.call_sites,
1227             srclocs,
1228             stack_maps: self.stack_maps,
1229             unwind_info: self.unwind_info,
1230         }
1231     }
1232 
1233     /// Add an external relocation at the current offset.
1234     pub fn add_reloc(&mut self, kind: Reloc, name: &ExternalName, addend: Addend) {
1235         let name = name.clone();
1236         // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally
1237         // generate a label-use statement to track whether an island is possibly
1238         // needed to escape this function to actually get to the external name.
1239         // This is most likely to come up on AArch64 where calls between
1240         // functions use a 26-bit signed offset which gives +/- 64MB. This means
1241         // that if a function is 128MB in size and there's a call in the middle
1242         // it's impossible to reach the actual target. Also, while it's
1243         // technically possible to jump to the start of a function and then jump
1244         // further, island insertion below always inserts islands after
1245         // previously appended code so for Cranelift's own implementation this
1246         // is also a problem for 64MB functions on AArch64 which start with a
1247         // call instruction, those won't be able to escape.
1248         //
1249         // Ideally what needs to happen here is that a `LabelUse` is
1250         // transparently generated (or call-sites of this function are audited
1251         // to generate a `LabelUse` instead) and tracked internally. The actual
1252         // relocation would then change over time if and when a veneer is
1253         // inserted, where the relocation here would be patched by this
1254         // `MachBuffer` to jump to the veneer. The problem, though, is that all
1255         // this still needs to end up, in the case of a singular function,
1256         // generating a final relocation pointing either to this particular
1257         // relocation or to the veneer inserted. Additionally
1258         // `MachBuffer` needs the concept of a label which will never be
1259         // resolved, so `emit_island` doesn't trip over not actually ever
1260         // knowning what some labels are. Currently the loop in
1261         // `finish_emission_maybe_forcing_veneers` would otherwise infinitely
1262         // loop.
1263         //
1264         // For now this means that because relocs aren't tracked at all that
1265         // AArch64 functions have a rough size limits of 64MB. For now that's
1266         // somewhat reasonable and the failure mode is a panic in `MachBuffer`
1267         // when a relocation can't otherwise be resolved later, so it shouldn't
1268         // actually result in any memory unsafety or anything like that.
1269         self.relocs.push(MachReloc {
1270             offset: self.data.len() as CodeOffset,
1271             kind,
1272             name,
1273             addend,
1274         });
1275     }
1276 
1277     /// Add a trap record at the current offset.
1278     pub fn add_trap(&mut self, code: TrapCode) {
1279         self.traps.push(MachTrap {
1280             offset: self.data.len() as CodeOffset,
1281             code,
1282         });
1283     }
1284 
1285     /// Add a call-site record at the current offset.
1286     pub fn add_call_site(&mut self, opcode: Opcode) {
1287         debug_assert!(
1288             opcode.is_call(),
1289             "adding call site info for a non-call instruction."
1290         );
1291         self.call_sites.push(MachCallSite {
1292             ret_addr: self.data.len() as CodeOffset,
1293             opcode,
1294         });
1295     }
1296 
1297     /// Add an unwind record at the current offset.
1298     pub fn add_unwind(&mut self, unwind: UnwindInst) {
1299         self.unwind_info.push((self.cur_offset(), unwind));
1300     }
1301 
1302     /// Set the `SourceLoc` for code from this offset until the offset at the
1303     /// next call to `end_srcloc()`.
1304     pub fn start_srcloc(&mut self, loc: SourceLoc) {
1305         self.cur_srcloc = Some((self.cur_offset(), loc));
1306     }
1307 
1308     /// Mark the end of the `SourceLoc` segment started at the last
1309     /// `start_srcloc()` call.
1310     pub fn end_srcloc(&mut self) {
1311         let (start, loc) = self
1312             .cur_srcloc
1313             .take()
1314             .expect("end_srcloc() called without start_srcloc()");
1315         let end = self.cur_offset();
1316         // Skip zero-length extends.
1317         debug_assert!(end >= start);
1318         if end > start {
1319             self.srclocs.push(MachSrcLoc { start, end, loc });
1320         }
1321     }
1322 
1323     /// Add stack map metadata for this program point: a set of stack offsets
1324     /// (from SP upward) that contain live references.
1325     ///
1326     /// The `offset_to_fp` value is the offset from the nominal SP (at which the `stack_offsets`
1327     /// are based) and the FP value. By subtracting `offset_to_fp` from each `stack_offsets`
1328     /// element, one can obtain live-reference offsets from FP instead.
1329     pub fn add_stack_map(&mut self, extent: StackMapExtent, stack_map: StackMap) {
1330         let (start, end) = match extent {
1331             StackMapExtent::UpcomingBytes(insn_len) => {
1332                 let start_offset = self.cur_offset();
1333                 (start_offset, start_offset + insn_len)
1334             }
1335             StackMapExtent::StartedAtOffset(start_offset) => {
1336                 let end_offset = self.cur_offset();
1337                 debug_assert!(end_offset >= start_offset);
1338                 (start_offset, end_offset)
1339             }
1340         };
1341         self.stack_maps.push(MachStackMap {
1342             offset: start,
1343             offset_end: end,
1344             stack_map,
1345         });
1346     }
1347 }
1348 
1349 impl MachBufferFinalized {
1350     /// Get a list of source location mapping tuples in sorted-by-start-offset order.
1351     pub fn get_srclocs_sorted(&self) -> &[MachSrcLoc] {
1352         &self.srclocs[..]
1353     }
1354 
1355     /// Get the total required size for the code.
1356     pub fn total_size(&self) -> CodeOffset {
1357         self.data.len() as CodeOffset
1358     }
1359 
1360     /// Return the code in this mach buffer as a hex string for testing purposes.
1361     pub fn stringify_code_bytes(&self) -> String {
1362         // This is pretty lame, but whatever ..
1363         use std::fmt::Write;
1364         let mut s = String::with_capacity(self.data.len() * 2);
1365         for b in &self.data {
1366             write!(&mut s, "{:02X}", b).unwrap();
1367         }
1368         s
1369     }
1370 
1371     /// Get the code bytes.
1372     pub fn data(&self) -> &[u8] {
1373         // N.B.: we emit every section into the .text section as far as
1374         // the `CodeSink` is concerned; we do not bother to segregate
1375         // the contents into the actual program text, the jumptable and the
1376         // rodata (constant pool). This allows us to generate code assuming
1377         // that these will not be relocated relative to each other, and avoids
1378         // having to designate each section as belonging in one of the three
1379         // fixed categories defined by `CodeSink`. If this becomes a problem
1380         // later (e.g. because of memory permissions or similar), we can
1381         // add this designation and segregate the output; take care, however,
1382         // to add the appropriate relocations in this case.
1383 
1384         &self.data[..]
1385     }
1386 
1387     /// Get the list of external relocations for this code.
1388     pub fn relocs(&self) -> &[MachReloc] {
1389         &self.relocs[..]
1390     }
1391 
1392     /// Get the list of trap records for this code.
1393     pub fn traps(&self) -> &[MachTrap] {
1394         &self.traps[..]
1395     }
1396 
1397     /// Get the stack map metadata for this code.
1398     pub fn stack_maps(&self) -> &[MachStackMap] {
1399         &self.stack_maps[..]
1400     }
1401 
1402     /// Get the list of call sites for this code.
1403     pub fn call_sites(&self) -> &[MachCallSite] {
1404         &self.call_sites[..]
1405     }
1406 }
1407 
1408 /// A constant that is deferred to the next constant-pool opportunity.
1409 struct MachLabelConstant {
1410     /// This label will refer to the constant's offset.
1411     label: MachLabel,
1412     /// Required alignment.
1413     align: CodeOffset,
1414     /// This data will be emitted when able.
1415     data: SmallVec<[u8; 16]>,
1416 }
1417 
1418 /// A fixup to perform on the buffer once code is emitted. Fixups always refer
1419 /// to labels and patch the code based on label offsets. Hence, they are like
1420 /// relocations, but internal to one buffer.
1421 #[derive(Debug)]
1422 struct MachLabelFixup<I: VCodeInst> {
1423     /// The label whose offset controls this fixup.
1424     label: MachLabel,
1425     /// The offset to fix up / patch to refer to this label.
1426     offset: CodeOffset,
1427     /// The kind of fixup. This is architecture-specific; each architecture may have,
1428     /// e.g., several types of branch instructions, each with differently-sized
1429     /// offset fields and different places within the instruction to place the
1430     /// bits.
1431     kind: I::LabelUse,
1432 }
1433 
1434 /// A relocation resulting from a compilation.
1435 #[derive(Clone, Debug)]
1436 pub struct MachReloc {
1437     /// The offset at which the relocation applies, *relative to the
1438     /// containing section*.
1439     pub offset: CodeOffset,
1440     /// The kind of relocation.
1441     pub kind: Reloc,
1442     /// The external symbol / name to which this relocation refers.
1443     pub name: ExternalName,
1444     /// The addend to add to the symbol value.
1445     pub addend: i64,
1446 }
1447 
1448 /// A trap record resulting from a compilation.
1449 #[derive(Clone, Debug)]
1450 pub struct MachTrap {
1451     /// The offset at which the trap instruction occurs, *relative to the
1452     /// containing section*.
1453     pub offset: CodeOffset,
1454     /// The trap code.
1455     pub code: TrapCode,
1456 }
1457 
1458 /// A call site record resulting from a compilation.
1459 #[derive(Clone, Debug)]
1460 pub struct MachCallSite {
1461     /// The offset of the call's return address, *relative to the containing section*.
1462     pub ret_addr: CodeOffset,
1463     /// The call's opcode.
1464     pub opcode: Opcode,
1465 }
1466 
1467 /// A source-location mapping resulting from a compilation.
1468 #[derive(Clone, Debug)]
1469 pub struct MachSrcLoc {
1470     /// The start of the region of code corresponding to a source location.
1471     /// This is relative to the start of the function, not to the start of the
1472     /// section.
1473     pub start: CodeOffset,
1474     /// The end of the region of code corresponding to a source location.
1475     /// This is relative to the start of the section, not to the start of the
1476     /// section.
1477     pub end: CodeOffset,
1478     /// The source location.
1479     pub loc: SourceLoc,
1480 }
1481 
1482 /// Record of stack map metadata: stack offsets containing references.
1483 #[derive(Clone, Debug)]
1484 pub struct MachStackMap {
1485     /// The code offset at which this stack map applies.
1486     pub offset: CodeOffset,
1487     /// The code offset just past the "end" of the instruction: that is, the
1488     /// offset of the first byte of the following instruction, or equivalently,
1489     /// the start offset plus the instruction length.
1490     pub offset_end: CodeOffset,
1491     /// The stack map itself.
1492     pub stack_map: StackMap,
1493 }
1494 
1495 /// Record of branch instruction in the buffer, to facilitate editing.
1496 #[derive(Clone, Debug)]
1497 struct MachBranch {
1498     start: CodeOffset,
1499     end: CodeOffset,
1500     target: MachLabel,
1501     fixup: usize,
1502     inverted: Option<SmallVec<[u8; 8]>>,
1503     /// All labels pointing to the start of this branch. For correctness, this
1504     /// *must* be complete (i.e., must contain all labels whose resolved offsets
1505     /// are at the start of this branch): we rely on being able to redirect all
1506     /// labels that could jump to this branch before removing it, if it is
1507     /// otherwise unreachable.
1508     labels_at_this_branch: SmallVec<[MachLabel; 4]>,
1509 }
1510 
1511 impl MachBranch {
1512     fn is_cond(&self) -> bool {
1513         self.inverted.is_some()
1514     }
1515     fn is_uncond(&self) -> bool {
1516         self.inverted.is_none()
1517     }
1518 }
1519 
1520 /// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`.
1521 ///
1522 /// Note that `MachBuffer` was primarily written for intra-function references
1523 /// of jumps between basic blocks, but it's also quite usable for entire text
1524 /// sections and resolving references between functions themselves. This
1525 /// builder interprets "blocks" as labeled functions for the purposes of
1526 /// resolving labels internally in the buffer.
1527 pub struct MachTextSectionBuilder<I: VCodeInst> {
1528     buf: MachBuffer<I>,
1529     next_func: usize,
1530     force_veneers: bool,
1531 }
1532 
1533 impl<I: VCodeInst> MachTextSectionBuilder<I> {
1534     pub fn new(num_funcs: u32) -> MachTextSectionBuilder<I> {
1535         let mut buf = MachBuffer::new();
1536         buf.reserve_labels_for_blocks(num_funcs as usize);
1537         MachTextSectionBuilder {
1538             buf,
1539             next_func: 0,
1540             force_veneers: false,
1541         }
1542     }
1543 }
1544 
1545 impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> {
1546     fn append(&mut self, named: bool, func: &[u8], align: Option<u32>) -> u64 {
1547         // Conditionally emit an island if it's necessary to resolve jumps
1548         // between functions which are too far away.
1549         let size = func.len() as u32;
1550         if self.force_veneers || self.buf.island_needed(size) {
1551             self.buf.emit_island_maybe_forced(self.force_veneers, size);
1552         }
1553 
1554         self.buf.align_to(align.unwrap_or(I::LabelUse::ALIGN));
1555         let pos = self.buf.cur_offset();
1556         if named {
1557             self.buf
1558                 .bind_label(MachLabel::from_block(BlockIndex::new(self.next_func)));
1559             self.next_func += 1;
1560         }
1561         self.buf.put_data(func);
1562         u64::from(pos)
1563     }
1564 
1565     fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: u32) -> bool {
1566         let label = MachLabel::from_block(BlockIndex::new(target as usize));
1567         let offset = u32::try_from(offset).unwrap();
1568         match I::LabelUse::from_reloc(reloc, addend) {
1569             Some(label_use) => {
1570                 self.buf.use_label_at_offset(offset, label, label_use);
1571                 true
1572             }
1573             None => false,
1574         }
1575     }
1576 
1577     fn force_veneers(&mut self) {
1578         self.force_veneers = true;
1579     }
1580 
1581     fn finish(&mut self) -> Vec<u8> {
1582         // Double-check all functions were pushed.
1583         assert_eq!(self.next_func, self.buf.label_offsets.len());
1584 
1585         // Finish up any veneers, if necessary.
1586         self.buf
1587             .finish_emission_maybe_forcing_veneers(self.force_veneers);
1588 
1589         // We don't need the data any more, so return it to the caller.
1590         mem::take(&mut self.buf.data).into_vec()
1591     }
1592 }
1593 
1594 // We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
1595 #[cfg(all(test, feature = "arm64"))]
1596 mod test {
1597     use super::*;
1598     use crate::isa::aarch64::inst::xreg;
1599     use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
1600     use crate::machinst::MachInstEmit;
1601     use crate::settings;
1602     use std::default::Default;
1603     use std::vec::Vec;
1604 
1605     fn label(n: u32) -> MachLabel {
1606         MachLabel::from_block(BlockIndex::new(n as usize))
1607     }
1608     fn target(n: u32) -> BranchTarget {
1609         BranchTarget::Label(label(n))
1610     }
1611 
1612     #[test]
1613     fn test_elide_jump_to_next() {
1614         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1615         let mut buf = MachBuffer::new();
1616         let mut state = Default::default();
1617 
1618         buf.reserve_labels_for_blocks(2);
1619         buf.bind_label(label(0));
1620         let inst = Inst::Jump { dest: target(1) };
1621         inst.emit(&[], &mut buf, &info, &mut state);
1622         buf.bind_label(label(1));
1623         let buf = buf.finish();
1624         assert_eq!(0, buf.total_size());
1625     }
1626 
1627     #[test]
1628     fn test_elide_trivial_jump_blocks() {
1629         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1630         let mut buf = MachBuffer::new();
1631         let mut state = Default::default();
1632 
1633         buf.reserve_labels_for_blocks(4);
1634 
1635         buf.bind_label(label(0));
1636         let inst = Inst::CondBr {
1637             kind: CondBrKind::NotZero(xreg(0)),
1638             taken: target(1),
1639             not_taken: target(2),
1640         };
1641         inst.emit(&[], &mut buf, &info, &mut state);
1642 
1643         buf.bind_label(label(1));
1644         let inst = Inst::Jump { dest: target(3) };
1645         inst.emit(&[], &mut buf, &info, &mut state);
1646 
1647         buf.bind_label(label(2));
1648         let inst = Inst::Jump { dest: target(3) };
1649         inst.emit(&[], &mut buf, &info, &mut state);
1650 
1651         buf.bind_label(label(3));
1652 
1653         let buf = buf.finish();
1654         assert_eq!(0, buf.total_size());
1655     }
1656 
1657     #[test]
1658     fn test_flip_cond() {
1659         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1660         let mut buf = MachBuffer::new();
1661         let mut state = Default::default();
1662 
1663         buf.reserve_labels_for_blocks(4);
1664 
1665         buf.bind_label(label(0));
1666         let inst = Inst::CondBr {
1667             kind: CondBrKind::NotZero(xreg(0)),
1668             taken: target(1),
1669             not_taken: target(2),
1670         };
1671         inst.emit(&[], &mut buf, &info, &mut state);
1672 
1673         buf.bind_label(label(1));
1674         let inst = Inst::Udf {
1675             trap_code: TrapCode::Interrupt,
1676         };
1677         inst.emit(&[], &mut buf, &info, &mut state);
1678 
1679         buf.bind_label(label(2));
1680         let inst = Inst::Nop4;
1681         inst.emit(&[], &mut buf, &info, &mut state);
1682 
1683         buf.bind_label(label(3));
1684 
1685         let buf = buf.finish();
1686 
1687         let mut buf2 = MachBuffer::new();
1688         let mut state = Default::default();
1689         let inst = Inst::TrapIf {
1690             kind: CondBrKind::NotZero(xreg(0)),
1691             trap_code: TrapCode::Interrupt,
1692         };
1693         inst.emit(&[], &mut buf2, &info, &mut state);
1694         let inst = Inst::Nop4;
1695         inst.emit(&[], &mut buf2, &info, &mut state);
1696 
1697         let buf2 = buf2.finish();
1698 
1699         assert_eq!(buf.data, buf2.data);
1700     }
1701 
1702     #[test]
1703     fn test_island() {
1704         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1705         let mut buf = MachBuffer::new();
1706         let mut state = Default::default();
1707 
1708         buf.reserve_labels_for_blocks(4);
1709 
1710         buf.bind_label(label(0));
1711         let inst = Inst::CondBr {
1712             kind: CondBrKind::NotZero(xreg(0)),
1713             taken: target(2),
1714             not_taken: target(3),
1715         };
1716         inst.emit(&[], &mut buf, &info, &mut state);
1717 
1718         buf.bind_label(label(1));
1719         while buf.cur_offset() < 2000000 {
1720             if buf.island_needed(0) {
1721                 buf.emit_island(0);
1722             }
1723             let inst = Inst::Nop4;
1724             inst.emit(&[], &mut buf, &info, &mut state);
1725         }
1726 
1727         buf.bind_label(label(2));
1728         let inst = Inst::Nop4;
1729         inst.emit(&[], &mut buf, &info, &mut state);
1730 
1731         buf.bind_label(label(3));
1732         let inst = Inst::Nop4;
1733         inst.emit(&[], &mut buf, &info, &mut state);
1734 
1735         let buf = buf.finish();
1736 
1737         assert_eq!(2000000 + 8, buf.total_size());
1738 
1739         let mut buf2 = MachBuffer::new();
1740         let mut state = Default::default();
1741         let inst = Inst::CondBr {
1742             kind: CondBrKind::NotZero(xreg(0)),
1743 
1744             // This conditionally taken branch has a 19-bit constant, shifted
1745             // to the left by two, giving us a 21-bit range in total. Half of
1746             // this range positive so the we should be around 1 << 20 bytes
1747             // away for our jump target.
1748             //
1749             // There are two pending fixups by the time we reach this point,
1750             // one for this 19-bit jump and one for the unconditional 26-bit
1751             // jump below. A 19-bit veneer is 4 bytes large and the 26-bit
1752             // veneer is 20 bytes large, which means that pessimistically
1753             // assuming we'll need two veneers we need 24 bytes of extra
1754             // space, meaning that the actual island should come 24-bytes
1755             // before the deadline.
1756             taken: BranchTarget::ResolvedOffset((1 << 20) - 4 - 20),
1757 
1758             // This branch is in-range so no veneers should be needed, it should
1759             // go directly to the target.
1760             not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
1761         };
1762         inst.emit(&[], &mut buf2, &info, &mut state);
1763 
1764         let buf2 = buf2.finish();
1765 
1766         assert_eq!(&buf.data[0..8], &buf2.data[..]);
1767     }
1768 
1769     #[test]
1770     fn test_island_backward() {
1771         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1772         let mut buf = MachBuffer::new();
1773         let mut state = Default::default();
1774 
1775         buf.reserve_labels_for_blocks(4);
1776 
1777         buf.bind_label(label(0));
1778         let inst = Inst::Nop4;
1779         inst.emit(&[], &mut buf, &info, &mut state);
1780 
1781         buf.bind_label(label(1));
1782         let inst = Inst::Nop4;
1783         inst.emit(&[], &mut buf, &info, &mut state);
1784 
1785         buf.bind_label(label(2));
1786         while buf.cur_offset() < 2000000 {
1787             let inst = Inst::Nop4;
1788             inst.emit(&[], &mut buf, &info, &mut state);
1789         }
1790 
1791         buf.bind_label(label(3));
1792         let inst = Inst::CondBr {
1793             kind: CondBrKind::NotZero(xreg(0)),
1794             taken: target(0),
1795             not_taken: target(1),
1796         };
1797         inst.emit(&[], &mut buf, &info, &mut state);
1798 
1799         let buf = buf.finish();
1800 
1801         assert_eq!(2000000 + 12, buf.total_size());
1802 
1803         let mut buf2 = MachBuffer::new();
1804         let mut state = Default::default();
1805         let inst = Inst::CondBr {
1806             kind: CondBrKind::NotZero(xreg(0)),
1807             taken: BranchTarget::ResolvedOffset(8),
1808             not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
1809         };
1810         inst.emit(&[], &mut buf2, &info, &mut state);
1811         let inst = Inst::Jump {
1812             dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
1813         };
1814         inst.emit(&[], &mut buf2, &info, &mut state);
1815 
1816         let buf2 = buf2.finish();
1817 
1818         assert_eq!(&buf.data[2000000..], &buf2.data[..]);
1819     }
1820 
1821     #[test]
1822     fn test_multiple_redirect() {
1823         // label0:
1824         //   cbz x0, label1
1825         //   b label2
1826         // label1:
1827         //   b label3
1828         // label2:
1829         //   nop
1830         //   nop
1831         //   b label0
1832         // label3:
1833         //   b label4
1834         // label4:
1835         //   b label5
1836         // label5:
1837         //   b label7
1838         // label6:
1839         //   nop
1840         // label7:
1841         //   ret
1842         //
1843         // -- should become:
1844         //
1845         // label0:
1846         //   cbz x0, label7
1847         // label2:
1848         //   nop
1849         //   nop
1850         //   b label0
1851         // label6:
1852         //   nop
1853         // label7:
1854         //   ret
1855 
1856         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1857         let mut buf = MachBuffer::new();
1858         let mut state = Default::default();
1859 
1860         buf.reserve_labels_for_blocks(8);
1861 
1862         buf.bind_label(label(0));
1863         let inst = Inst::CondBr {
1864             kind: CondBrKind::Zero(xreg(0)),
1865             taken: target(1),
1866             not_taken: target(2),
1867         };
1868         inst.emit(&[], &mut buf, &info, &mut state);
1869 
1870         buf.bind_label(label(1));
1871         let inst = Inst::Jump { dest: target(3) };
1872         inst.emit(&[], &mut buf, &info, &mut state);
1873 
1874         buf.bind_label(label(2));
1875         let inst = Inst::Nop4;
1876         inst.emit(&[], &mut buf, &info, &mut state);
1877         inst.emit(&[], &mut buf, &info, &mut state);
1878         let inst = Inst::Jump { dest: target(0) };
1879         inst.emit(&[], &mut buf, &info, &mut state);
1880 
1881         buf.bind_label(label(3));
1882         let inst = Inst::Jump { dest: target(4) };
1883         inst.emit(&[], &mut buf, &info, &mut state);
1884 
1885         buf.bind_label(label(4));
1886         let inst = Inst::Jump { dest: target(5) };
1887         inst.emit(&[], &mut buf, &info, &mut state);
1888 
1889         buf.bind_label(label(5));
1890         let inst = Inst::Jump { dest: target(7) };
1891         inst.emit(&[], &mut buf, &info, &mut state);
1892 
1893         buf.bind_label(label(6));
1894         let inst = Inst::Nop4;
1895         inst.emit(&[], &mut buf, &info, &mut state);
1896 
1897         buf.bind_label(label(7));
1898         let inst = Inst::Ret { rets: vec![] };
1899         inst.emit(&[], &mut buf, &info, &mut state);
1900 
1901         let buf = buf.finish();
1902 
1903         let golden_data = vec![
1904             0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
1905             0x1f, 0x20, 0x03, 0xd5, // nop
1906             0x1f, 0x20, 0x03, 0xd5, // nop
1907             0xfd, 0xff, 0xff, 0x17, // b 0
1908             0x1f, 0x20, 0x03, 0xd5, // nop
1909             0xc0, 0x03, 0x5f, 0xd6, // ret
1910         ];
1911 
1912         assert_eq!(&golden_data[..], &buf.data[..]);
1913     }
1914 
1915     #[test]
1916     fn test_handle_branch_cycle() {
1917         // label0:
1918         //   b label1
1919         // label1:
1920         //   b label2
1921         // label2:
1922         //   b label3
1923         // label3:
1924         //   b label4
1925         // label4:
1926         //   b label1  // note: not label0 (to make it interesting).
1927         //
1928         // -- should become:
1929         //
1930         // label0, label1, ..., label4:
1931         //   b label0
1932         let info = EmitInfo::new(settings::Flags::new(settings::builder()));
1933         let mut buf = MachBuffer::new();
1934         let mut state = Default::default();
1935 
1936         buf.reserve_labels_for_blocks(5);
1937 
1938         buf.bind_label(label(0));
1939         let inst = Inst::Jump { dest: target(1) };
1940         inst.emit(&[], &mut buf, &info, &mut state);
1941 
1942         buf.bind_label(label(1));
1943         let inst = Inst::Jump { dest: target(2) };
1944         inst.emit(&[], &mut buf, &info, &mut state);
1945 
1946         buf.bind_label(label(2));
1947         let inst = Inst::Jump { dest: target(3) };
1948         inst.emit(&[], &mut buf, &info, &mut state);
1949 
1950         buf.bind_label(label(3));
1951         let inst = Inst::Jump { dest: target(4) };
1952         inst.emit(&[], &mut buf, &info, &mut state);
1953 
1954         buf.bind_label(label(4));
1955         let inst = Inst::Jump { dest: target(1) };
1956         inst.emit(&[], &mut buf, &info, &mut state);
1957 
1958         let buf = buf.finish();
1959 
1960         let golden_data = vec![
1961             0x00, 0x00, 0x00, 0x14, // b 0
1962         ];
1963 
1964         assert_eq!(&golden_data[..], &buf.data[..]);
1965     }
1966 
1967     #[test]
1968     fn metadata_records() {
1969         let mut buf = MachBuffer::<Inst>::new();
1970 
1971         buf.reserve_labels_for_blocks(1);
1972 
1973         buf.bind_label(label(0));
1974         buf.put1(1);
1975         buf.add_trap(TrapCode::HeapOutOfBounds);
1976         buf.put1(2);
1977         buf.add_trap(TrapCode::IntegerOverflow);
1978         buf.add_trap(TrapCode::IntegerDivisionByZero);
1979         buf.add_call_site(Opcode::Call);
1980         buf.add_reloc(Reloc::Abs4, &ExternalName::user(0, 0), 0);
1981         buf.put1(3);
1982         buf.add_reloc(Reloc::Abs8, &ExternalName::user(1, 1), 1);
1983         buf.put1(4);
1984 
1985         let buf = buf.finish();
1986 
1987         assert_eq!(buf.data(), &[1, 2, 3, 4]);
1988         assert_eq!(
1989             buf.traps()
1990                 .iter()
1991                 .map(|trap| (trap.offset, trap.code))
1992                 .collect::<Vec<_>>(),
1993             vec![
1994                 (1, TrapCode::HeapOutOfBounds),
1995                 (2, TrapCode::IntegerOverflow),
1996                 (2, TrapCode::IntegerDivisionByZero)
1997             ]
1998         );
1999         assert_eq!(
2000             buf.call_sites()
2001                 .iter()
2002                 .map(|call_site| (call_site.ret_addr, call_site.opcode))
2003                 .collect::<Vec<_>>(),
2004             vec![(2, Opcode::Call)]
2005         );
2006         assert_eq!(
2007             buf.relocs()
2008                 .iter()
2009                 .map(|reloc| (reloc.offset, reloc.kind))
2010                 .collect::<Vec<_>>(),
2011             vec![(2, Reloc::Abs4), (3, Reloc::Abs8)]
2012         );
2013     }
2014 }
2015