1 //! In-memory representation of compiled machine code, with labels and fixups to 2 //! refer to those labels. Handles constant-pool island insertion and also 3 //! veneer insertion for out-of-range jumps. 4 //! 5 //! This code exists to solve three problems: 6 //! 7 //! - Branch targets for forward branches are not known until later, when we 8 //! emit code in a single pass through the instruction structs. 9 //! 10 //! - On many architectures, address references or offsets have limited range. 11 //! For example, on AArch64, conditional branches can only target code +/- 1MB 12 //! from the branch itself. 13 //! 14 //! - The lowering of control flow from the CFG-with-edges produced by 15 //! [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty 16 //! edge blocks when the register allocator does not need to insert any 17 //! spills/reloads/moves in edge blocks, results in many suboptimal branch 18 //! patterns. The lowering also pays no attention to block order, and so 19 //! two-target conditional forms (cond-br followed by uncond-br) can often by 20 //! avoided because one of the targets is the fallthrough. There are several 21 //! cases here where we can simplify to use fewer branches. 22 //! 23 //! This "buffer" implements a single-pass code emission strategy (with a later 24 //! "fixup" pass, but only through recorded fixups, not all instructions). The 25 //! basic idea is: 26 //! 27 //! - Emit branches as they are, including two-target (cond/uncond) compound 28 //! forms, but with zero offsets and optimistically assuming the target will be 29 //! in range. Record the "fixup" for later. Targets are denoted instead by 30 //! symbolic "labels" that are then bound to certain offsets in the buffer as 31 //! we emit code. (Nominally, there is a label at the start of every basic 32 //! block.) 33 //! 34 //! - As we do this, track the offset in the buffer at which the first label 35 //! reference "goes out of range". We call this the "deadline". If we reach the 36 //! deadline and we still have not bound the label to which an unresolved branch 37 //! refers, we have a problem! 38 //! 39 //! - To solve this problem, we emit "islands" full of "veneers". An island is 40 //! simply a chunk of code inserted in the middle of the code actually produced 41 //! by the emitter (e.g., vcode iterating over instruction structs). The emitter 42 //! has some awareness of this: it either asks for an island between blocks, so 43 //! it is not accidentally executed, or else it emits a branch around the island 44 //! when all other options fail (see `Inst::EmitIsland` meta-instruction). 45 //! 46 //! - A "veneer" is an instruction (or sequence of instructions) in an "island" 47 //! that implements a longer-range reference to a label. The idea is that, for 48 //! example, a branch with a limited range can branch to a "veneer" instead, 49 //! which is simply a branch in a form that can use a longer-range reference. On 50 //! AArch64, for example, conditionals have a +/- 1 MB range, but a conditional 51 //! can branch to an unconditional branch which has a +/- 128 MB range. Hence, a 52 //! conditional branch's label reference can be fixed up with a "veneer" to 53 //! achieve a longer range. 54 //! 55 //! - To implement all of this, we require the backend to provide a `LabelUse` 56 //! type that implements a trait. This is nominally an enum that records one of 57 //! several kinds of references to an offset in code -- basically, a relocation 58 //! type -- and will usually correspond to different instruction formats. The 59 //! `LabelUse` implementation specifies the maximum range, how to patch in the 60 //! actual label location when known, and how to generate a veneer to extend the 61 //! range. 62 //! 63 //! That satisfies label references, but we still may have suboptimal branch 64 //! patterns. To clean up the branches, we do a simple "peephole"-style 65 //! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`) 66 //! informs the buffer of branches in the code and, in the case of conditionals, 67 //! the code that would have been emitted to invert this branch's condition. We 68 //! track the "latest branches": these are branches that are contiguous up to 69 //! the current offset. (If any code is emitted after a branch, that branch or 70 //! run of contiguous branches is no longer "latest".) The latest branches are 71 //! those that we can edit by simply truncating the buffer and doing something 72 //! else instead. 73 //! 74 //! To optimize branches, we implement several simple rules, and try to apply 75 //! them to the "latest branches" when possible: 76 //! 77 //! - A branch with a label target, when that label is bound to the ending 78 //! offset of the branch (the fallthrough location), can be removed altogether, 79 //! because the branch would have no effect). 80 //! 81 //! - An unconditional branch that starts at a label location, and branches to 82 //! another label, results in a "label alias": all references to the label bound 83 //! *to* this branch instruction are instead resolved to the *target* of the 84 //! branch instruction. This effectively removes empty blocks that just 85 //! unconditionally branch to the next block. We call this "branch threading". 86 //! 87 //! - A conditional followed by an unconditional, when the conditional branches 88 //! to the unconditional's fallthrough, results in (i) the truncation of the 89 //! unconditional, (ii) the inversion of the condition's condition, and (iii) 90 //! replacement of the conditional's target (using the original target of the 91 //! unconditional). This is a fancy way of saying "we can flip a two-target 92 //! conditional branch's taken/not-taken targets if it works better with our 93 //! fallthrough". To make this work, the emitter actually gives the buffer 94 //! *both* forms of every conditional branch: the true form is emitted into the 95 //! buffer, and the "inverted" machine-code bytes are provided as part of the 96 //! branch-fixup metadata. 97 //! 98 //! - An unconditional B preceded by another unconditional P, when B's label(s) have 99 //! been redirected to target(B), can be removed entirely. This is an extension 100 //! of the branch-threading optimization, and is valid because if we know there 101 //! will be no fallthrough into this branch instruction (the prior instruction 102 //! is an unconditional jump), and if we know we have successfully redirected 103 //! all labels, then this branch instruction is unreachable. Note that this 104 //! works because the redirection happens before the label is ever resolved 105 //! (fixups happen at island emission time, at which point latest-branches are 106 //! cleared, or at the end of emission), so we are sure to catch and redirect 107 //! all possible paths to this instruction. 108 //! 109 //! # Branch-optimization Correctness 110 //! 111 //! The branch-optimization mechanism depends on a few data structures with 112 //! invariants, which are always held outside the scope of top-level public 113 //! methods: 114 //! 115 //! - The latest-branches list. Each entry describes a span of the buffer 116 //! (start/end offsets), the label target, the corresponding fixup-list entry 117 //! index, and the bytes (must be the same length) for the inverted form, if 118 //! conditional. The list of labels that are bound to the start-offset of this 119 //! branch is *complete* (if any label has a resolved offset equal to `start` 120 //! and is not an alias, it must appear in this list) and *precise* (no label 121 //! in this list can be bound to another offset). No label in this list should 122 //! be an alias. No two branch ranges can overlap, and branches are in 123 //! ascending-offset order. 124 //! 125 //! - The labels-at-tail list. This contains all MachLabels that have been bound 126 //! to (whose resolved offsets are equal to) the tail offset of the buffer. 127 //! No label in this list should be an alias. 128 //! 129 //! - The label_offsets array, containing the bound offset of a label or 130 //! UNKNOWN. No label can be bound at an offset greater than the current 131 //! buffer tail. 132 //! 133 //! - The label_aliases array, containing another label to which a label is 134 //! bound or UNKNOWN. A label's resolved offset is the resolved offset 135 //! of the label it is aliased to, if this is set. 136 //! 137 //! We argue below, at each method, how the invariants in these data structures 138 //! are maintained (grep for "Post-invariant"). 139 //! 140 //! Given these invariants, we argue why each optimization preserves execution 141 //! semantics below (grep for "Preserves execution semantics"). 142 //! 143 //! # Avoiding Quadratic Behavior 144 //! 145 //! There are two cases where we've had to take some care to avoid 146 //! quadratic worst-case behavior: 147 //! 148 //! - The "labels at this branch" list can grow unboundedly if the 149 //! code generator binds many labels at one location. If the count 150 //! gets too high (defined by the `LABEL_LIST_THRESHOLD` constant), we 151 //! simply abort an optimization early in a way that is always correct 152 //! but is conservative. 153 //! 154 //! - The fixup list can interact with island emission to create 155 //! "quadratic island behvior". In a little more detail, one can hit 156 //! this behavior by having some pending fixups (forward label 157 //! references) with long-range label-use kinds, and some others 158 //! with shorter-range references that nonetheless still are pending 159 //! long enough to trigger island generation. In such a case, we 160 //! process the fixup list, generate veneers to extend some forward 161 //! references' ranges, but leave the other (longer-range) ones 162 //! alone. The way this was implemented put them back on a list and 163 //! resulted in quadratic behavior. 164 //! 165 //! To avoid this fixups are split into two lists: one "pending" list and one 166 //! final list. The pending list is kept around for handling fixups related to 167 //! branches so it can be edited/truncated. When an island is reached, which 168 //! starts processing fixups, all pending fixups are flushed into the final 169 //! list. The final list is a `BinaryHeap` which enables fixup processing to 170 //! only process those which are required during island emission, deferring 171 //! all longer-range fixups to later. 172 173 use crate::binemit::{Addend, CodeOffset, Reloc, StackMap}; 174 use crate::ir::function::FunctionParameters; 175 use crate::ir::{ExternalName, RelSourceLoc, SourceLoc, TrapCode}; 176 use crate::isa::unwind::UnwindInst; 177 use crate::machinst::{ 178 BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst, 179 }; 180 use crate::trace; 181 use crate::{ir, MachInstEmitState}; 182 use crate::{timing, VCodeConstantData}; 183 use cranelift_control::ControlPlane; 184 use cranelift_entity::{entity_impl, PrimaryMap}; 185 use smallvec::SmallVec; 186 use std::cmp::Ordering; 187 use std::collections::BinaryHeap; 188 use std::mem; 189 use std::string::String; 190 use std::vec::Vec; 191 192 #[cfg(feature = "enable-serde")] 193 use serde::{Deserialize, Serialize}; 194 195 #[cfg(feature = "enable-serde")] 196 pub trait CompilePhase { 197 type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone; 198 type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone; 199 } 200 201 #[cfg(not(feature = "enable-serde"))] 202 pub trait CompilePhase { 203 type MachSrcLocType: core::fmt::Debug + PartialEq + Clone; 204 type SourceLocType: core::fmt::Debug + PartialEq + Clone; 205 } 206 207 /// Status of a compiled artifact that needs patching before being used. 208 #[derive(Clone, Debug, PartialEq)] 209 #[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))] 210 pub struct Stencil; 211 212 /// Status of a compiled artifact ready to use. 213 #[derive(Clone, Debug, PartialEq)] 214 pub struct Final; 215 216 impl CompilePhase for Stencil { 217 type MachSrcLocType = MachSrcLoc<Stencil>; 218 type SourceLocType = RelSourceLoc; 219 } 220 221 impl CompilePhase for Final { 222 type MachSrcLocType = MachSrcLoc<Final>; 223 type SourceLocType = SourceLoc; 224 } 225 226 #[derive(Clone, Copy, Debug, PartialEq, Eq)] 227 enum ForceVeneers { 228 Yes, 229 No, 230 } 231 232 /// A buffer of output to be produced, fixed up, and then emitted to a CodeSink 233 /// in bulk. 234 /// 235 /// This struct uses `SmallVec`s to support small-ish function bodies without 236 /// any heap allocation. As such, it will be several kilobytes large. This is 237 /// likely fine as long as it is stack-allocated for function emission then 238 /// thrown away; but beware if many buffer objects are retained persistently. 239 pub struct MachBuffer<I: VCodeInst> { 240 /// The buffer contents, as raw bytes. 241 data: SmallVec<[u8; 1024]>, 242 /// Any relocations referring to this code. Note that only *external* 243 /// relocations are tracked here; references to labels within the buffer are 244 /// resolved before emission. 245 relocs: SmallVec<[MachReloc; 16]>, 246 /// Any trap records referring to this code. 247 traps: SmallVec<[MachTrap; 16]>, 248 /// Any call site records referring to this code. 249 call_sites: SmallVec<[MachCallSite; 16]>, 250 /// Any source location mappings referring to this code. 251 srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>, 252 /// Any stack maps referring to this code. 253 stack_maps: SmallVec<[MachStackMap; 8]>, 254 /// Any user stack maps for this code. 255 /// 256 /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted 257 /// by code offset, and each stack map covers `span` bytes on the stack. 258 user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>, 259 /// Any unwind info at a given location. 260 unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>, 261 /// The current source location in progress (after `start_srcloc()` and 262 /// before `end_srcloc()`). This is a (start_offset, src_loc) tuple. 263 cur_srcloc: Option<(CodeOffset, RelSourceLoc)>, 264 /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown. 265 label_offsets: SmallVec<[CodeOffset; 16]>, 266 /// Label aliases: when one label points to an unconditional jump, and that 267 /// jump points to another label, we can redirect references to the first 268 /// label immediately to the second. 269 /// 270 /// Invariant: we don't have label-alias cycles. We ensure this by, 271 /// before setting label A to alias label B, resolving B's alias 272 /// target (iteratively until a non-aliased label); if B is already 273 /// aliased to A, then we cannot alias A back to B. 274 label_aliases: SmallVec<[MachLabel; 16]>, 275 /// Constants that must be emitted at some point. 276 pending_constants: SmallVec<[VCodeConstant; 16]>, 277 /// Byte size of all constants in `pending_constants`. 278 pending_constants_size: CodeOffset, 279 /// Traps that must be emitted at some point. 280 pending_traps: SmallVec<[MachLabelTrap; 16]>, 281 /// Fixups that haven't yet been flushed into `fixup_records` below and may 282 /// be related to branches that are chomped. These all get added to 283 /// `fixup_records` during island emission. 284 pending_fixup_records: SmallVec<[MachLabelFixup<I>; 16]>, 285 /// The nearest upcoming deadline for entries in `pending_fixup_records`. 286 pending_fixup_deadline: CodeOffset, 287 /// Fixups that must be performed after all code is emitted. 288 fixup_records: BinaryHeap<MachLabelFixup<I>>, 289 /// Latest branches, to facilitate in-place editing for better fallthrough 290 /// behavior and empty-block removal. 291 latest_branches: SmallVec<[MachBranch; 4]>, 292 /// All labels at the current offset (emission tail). This is lazily 293 /// cleared: it is actually accurate as long as the current offset is 294 /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should 295 /// be considered as empty. 296 /// 297 /// For correctness, this *must* be complete (i.e., the vector must contain 298 /// all labels whose offsets are resolved to the current tail), because we 299 /// rely on it to update labels when we truncate branches. 300 labels_at_tail: SmallVec<[MachLabel; 4]>, 301 /// The last offset at which `labels_at_tail` is valid. It is conceptually 302 /// always describing the tail of the buffer, but we do not clear 303 /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it 304 /// when the offset has grown past this (`labels_at_tail_off`) point. 305 /// Always <= `cur_offset()`. 306 labels_at_tail_off: CodeOffset, 307 /// Metadata about all constants that this function has access to. 308 /// 309 /// This records the size/alignment of all constants (not the actual data) 310 /// along with the last available label generated for the constant. This map 311 /// is consulted when constants are referred to and the label assigned to a 312 /// constant may change over time as well. 313 constants: PrimaryMap<VCodeConstant, MachBufferConstant>, 314 /// All recorded usages of constants as pairs of the constant and where the 315 /// constant needs to be placed within `self.data`. Note that the same 316 /// constant may appear in this array multiple times if it was emitted 317 /// multiple times. 318 used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>, 319 /// Indicates when a patchable region is currently open, to guard that it's 320 /// not possible to nest patchable regions. 321 open_patchable: bool, 322 } 323 324 impl MachBufferFinalized<Stencil> { 325 /// Get a finalized machine buffer by applying the function's base source location. 326 pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> { 327 MachBufferFinalized { 328 data: self.data, 329 relocs: self.relocs, 330 traps: self.traps, 331 call_sites: self.call_sites, 332 srclocs: self 333 .srclocs 334 .into_iter() 335 .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc)) 336 .collect(), 337 stack_maps: self.stack_maps, 338 user_stack_maps: self.user_stack_maps, 339 unwind_info: self.unwind_info, 340 alignment: self.alignment, 341 } 342 } 343 } 344 345 /// A `MachBuffer` once emission is completed: holds generated code and records, 346 /// without fixups. This allows the type to be independent of the backend. 347 #[derive(PartialEq, Debug, Clone)] 348 #[cfg_attr( 349 feature = "enable-serde", 350 derive(serde_derive::Serialize, serde_derive::Deserialize) 351 )] 352 pub struct MachBufferFinalized<T: CompilePhase> { 353 /// The buffer contents, as raw bytes. 354 pub(crate) data: SmallVec<[u8; 1024]>, 355 /// Any relocations referring to this code. Note that only *external* 356 /// relocations are tracked here; references to labels within the buffer are 357 /// resolved before emission. 358 pub(crate) relocs: SmallVec<[FinalizedMachReloc; 16]>, 359 /// Any trap records referring to this code. 360 pub(crate) traps: SmallVec<[MachTrap; 16]>, 361 /// Any call site records referring to this code. 362 pub(crate) call_sites: SmallVec<[MachCallSite; 16]>, 363 /// Any source location mappings referring to this code. 364 pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>, 365 /// Any stack maps referring to this code. 366 pub(crate) stack_maps: SmallVec<[MachStackMap; 8]>, 367 /// Any user stack maps for this code. 368 /// 369 /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted 370 /// by code offset, and each stack map covers `span` bytes on the stack. 371 pub(crate) user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>, 372 /// Any unwind info at a given location. 373 pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>, 374 /// The required alignment of this buffer. 375 pub alignment: u32, 376 } 377 378 const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff; 379 const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff); 380 381 /// Threshold on max length of `labels_at_this_branch` list to avoid 382 /// unbounded quadratic behavior (see comment below at use-site). 383 const LABEL_LIST_THRESHOLD: usize = 100; 384 385 /// A label refers to some offset in a `MachBuffer`. It may not be resolved at 386 /// the point at which it is used by emitted code; the buffer records "fixups" 387 /// for references to the label, and will come back and patch the code 388 /// appropriately when the label's location is eventually known. 389 #[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)] 390 pub struct MachLabel(u32); 391 entity_impl!(MachLabel); 392 393 impl MachLabel { 394 /// Get a label for a block. (The first N MachLabels are always reserved for 395 /// the N blocks in the vcode.) 396 pub fn from_block(bindex: BlockIndex) -> MachLabel { 397 MachLabel(bindex.index() as u32) 398 } 399 400 /// Get the numeric label index. 401 pub fn get(self) -> u32 { 402 self.0 403 } 404 405 /// Creates a string representing this label, for convenience. 406 pub fn to_string(&self) -> String { 407 format!("label{}", self.0) 408 } 409 } 410 411 impl Default for MachLabel { 412 fn default() -> Self { 413 UNKNOWN_LABEL 414 } 415 } 416 417 /// A stack map extent, when creating a stack map. 418 pub enum StackMapExtent { 419 /// The stack map starts at this instruction, and ends after the number of upcoming bytes 420 /// (note: this is a code offset diff). 421 UpcomingBytes(CodeOffset), 422 423 /// The stack map started at the given offset and ends at the current one. This helps 424 /// architectures where the instruction size has not a fixed length. 425 StartedAtOffset(CodeOffset), 426 } 427 428 /// Represents the beginning of an editable region in the [`MachBuffer`], while code emission is 429 /// still occurring. An [`OpenPatchRegion`] is closed by [`MachBuffer::end_patchable`], consuming 430 /// the [`OpenPatchRegion`] token in the process. 431 pub struct OpenPatchRegion(usize); 432 433 /// A region in the [`MachBuffer`] code buffer that can be edited prior to finalization. An example 434 /// of where you might want to use this is for patching instructions that mention constants that 435 /// won't be known until later: [`MachBuffer::start_patchable`] can be used to begin the patchable 436 /// region, instructions can be emitted with placeholder constants, and the [`PatchRegion`] token 437 /// can be produced by [`MachBuffer::end_patchable`]. Once the values of those constants are known, 438 /// the [`PatchRegion::patch`] function can be used to get a mutable buffer to the instruction 439 /// bytes, and the constants uses can be updated directly. 440 pub struct PatchRegion { 441 range: std::ops::Range<usize>, 442 } 443 444 impl PatchRegion { 445 /// Consume the patch region to yield a mutable slice of the [`MachBuffer`] data buffer. 446 pub fn patch<I: VCodeInst>(self, buffer: &mut MachBuffer<I>) -> &mut [u8] { 447 &mut buffer.data[self.range] 448 } 449 } 450 451 impl<I: VCodeInst> MachBuffer<I> { 452 /// Create a new section, known to start at `start_offset` and with a size limited to 453 /// `length_limit`. 454 pub fn new() -> MachBuffer<I> { 455 MachBuffer { 456 data: SmallVec::new(), 457 relocs: SmallVec::new(), 458 traps: SmallVec::new(), 459 call_sites: SmallVec::new(), 460 srclocs: SmallVec::new(), 461 stack_maps: SmallVec::new(), 462 user_stack_maps: SmallVec::new(), 463 unwind_info: SmallVec::new(), 464 cur_srcloc: None, 465 label_offsets: SmallVec::new(), 466 label_aliases: SmallVec::new(), 467 pending_constants: SmallVec::new(), 468 pending_constants_size: 0, 469 pending_traps: SmallVec::new(), 470 pending_fixup_records: SmallVec::new(), 471 pending_fixup_deadline: u32::MAX, 472 fixup_records: Default::default(), 473 latest_branches: SmallVec::new(), 474 labels_at_tail: SmallVec::new(), 475 labels_at_tail_off: 0, 476 constants: Default::default(), 477 used_constants: Default::default(), 478 open_patchable: false, 479 } 480 } 481 482 /// Current offset from start of buffer. 483 pub fn cur_offset(&self) -> CodeOffset { 484 self.data.len() as CodeOffset 485 } 486 487 /// Add a byte. 488 pub fn put1(&mut self, value: u8) { 489 self.data.push(value); 490 491 // Post-invariant: conceptual-labels_at_tail contains a complete and 492 // precise list of labels bound at `cur_offset()`. We have advanced 493 // `cur_offset()`, hence if it had been equal to `labels_at_tail_off` 494 // before, it is not anymore (and it cannot become equal, because 495 // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is 496 // conceptually empty (even though it is only lazily cleared). No labels 497 // can be bound at this new offset (by invariant on `label_offsets`). 498 // Hence the invariant holds. 499 } 500 501 /// Add 2 bytes. 502 pub fn put2(&mut self, value: u16) { 503 let bytes = value.to_le_bytes(); 504 self.data.extend_from_slice(&bytes[..]); 505 506 // Post-invariant: as for `put1()`. 507 } 508 509 /// Add 4 bytes. 510 pub fn put4(&mut self, value: u32) { 511 let bytes = value.to_le_bytes(); 512 self.data.extend_from_slice(&bytes[..]); 513 514 // Post-invariant: as for `put1()`. 515 } 516 517 /// Add 8 bytes. 518 pub fn put8(&mut self, value: u64) { 519 let bytes = value.to_le_bytes(); 520 self.data.extend_from_slice(&bytes[..]); 521 522 // Post-invariant: as for `put1()`. 523 } 524 525 /// Add a slice of bytes. 526 pub fn put_data(&mut self, data: &[u8]) { 527 self.data.extend_from_slice(data); 528 529 // Post-invariant: as for `put1()`. 530 } 531 532 /// Reserve appended space and return a mutable slice referring to it. 533 pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] { 534 let off = self.data.len(); 535 let new_len = self.data.len() + len; 536 self.data.resize(new_len, 0); 537 &mut self.data[off..] 538 539 // Post-invariant: as for `put1()`. 540 } 541 542 /// Align up to the given alignment. 543 pub fn align_to(&mut self, align_to: CodeOffset) { 544 trace!("MachBuffer: align to {}", align_to); 545 assert!( 546 align_to.is_power_of_two(), 547 "{align_to} is not a power of two" 548 ); 549 while self.cur_offset() & (align_to - 1) != 0 { 550 self.put1(0); 551 } 552 553 // Post-invariant: as for `put1()`. 554 } 555 556 /// Begin a region of patchable code. There is one requirement for the 557 /// code that is emitted: It must not introduce any instructions that 558 /// could be chomped (branches are an example of this). In other words, 559 /// you must not call [`MachBuffer::add_cond_branch`] or 560 /// [`MachBuffer::add_uncond_branch`] between calls to this method and 561 /// [`MachBuffer::end_patchable`]. 562 pub fn start_patchable(&mut self) -> OpenPatchRegion { 563 assert!(!self.open_patchable, "Patchable regions may not be nested"); 564 self.open_patchable = true; 565 OpenPatchRegion(usize::try_from(self.cur_offset()).unwrap()) 566 } 567 568 /// End a region of patchable code, yielding a [`PatchRegion`] value that 569 /// can be consumed later to produce a one-off mutable slice to the 570 /// associated region of the data buffer. 571 pub fn end_patchable(&mut self, open: OpenPatchRegion) -> PatchRegion { 572 // No need to assert the state of `open_patchable` here, as we take 573 // ownership of the only `OpenPatchable` value. 574 self.open_patchable = false; 575 let end = usize::try_from(self.cur_offset()).unwrap(); 576 PatchRegion { range: open.0..end } 577 } 578 579 /// Allocate a `Label` to refer to some offset. May not be bound to a fixed 580 /// offset yet. 581 pub fn get_label(&mut self) -> MachLabel { 582 let l = self.label_offsets.len() as u32; 583 self.label_offsets.push(UNKNOWN_LABEL_OFFSET); 584 self.label_aliases.push(UNKNOWN_LABEL); 585 trace!("MachBuffer: new label -> {:?}", MachLabel(l)); 586 MachLabel(l) 587 588 // Post-invariant: the only mutation is to add a new label; it has no 589 // bound offset yet, so it trivially satisfies all invariants. 590 } 591 592 /// Reserve the first N MachLabels for blocks. 593 pub fn reserve_labels_for_blocks(&mut self, blocks: usize) { 594 trace!("MachBuffer: first {} labels are for blocks", blocks); 595 debug_assert!(self.label_offsets.is_empty()); 596 self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET); 597 self.label_aliases.resize(blocks, UNKNOWN_LABEL); 598 599 // Post-invariant: as for `get_label()`. 600 } 601 602 /// Registers metadata in this `MachBuffer` about the `constants` provided. 603 /// 604 /// This will record the size/alignment of all constants which will prepare 605 /// them for emission later on. 606 pub fn register_constants(&mut self, constants: &VCodeConstants) { 607 for (c, val) in constants.iter() { 608 self.register_constant(&c, val); 609 } 610 } 611 612 /// Similar to [`MachBuffer::register_constants`] but registers a 613 /// single constant metadata. This function is useful in 614 /// situations where not all constants are known at the time of 615 /// emission. 616 pub fn register_constant(&mut self, constant: &VCodeConstant, data: &VCodeConstantData) { 617 let c2 = self.constants.push(MachBufferConstant { 618 upcoming_label: None, 619 align: data.alignment(), 620 size: data.as_slice().len(), 621 }); 622 assert_eq!(*constant, c2); 623 } 624 625 /// Completes constant emission by iterating over `self.used_constants` and 626 /// filling in the "holes" with the constant values provided by `constants`. 627 /// 628 /// Returns the alignment required for this entire buffer. Alignment starts 629 /// at the ISA's minimum function alignment and can be increased due to 630 /// constant requirements. 631 fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 { 632 let mut alignment = I::function_alignment().minimum; 633 for (constant, offset) in mem::take(&mut self.used_constants) { 634 let constant = constants.get(constant); 635 let data = constant.as_slice(); 636 self.data[offset as usize..][..data.len()].copy_from_slice(data); 637 alignment = constant.alignment().max(alignment); 638 } 639 alignment 640 } 641 642 /// Returns a label that can be used to refer to the `constant` provided. 643 /// 644 /// This will automatically defer a new constant to be emitted for 645 /// `constant` if it has not been previously emitted. Note that this 646 /// function may return a different label for the same constant at 647 /// different points in time. The label is valid to use only from the 648 /// current location; the MachBuffer takes care to emit the same constant 649 /// multiple times if needed so the constant is always in range. 650 pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel { 651 let MachBufferConstant { 652 align, 653 size, 654 upcoming_label, 655 } = self.constants[constant]; 656 if let Some(label) = upcoming_label { 657 return label; 658 } 659 660 let label = self.get_label(); 661 trace!( 662 "defer constant: eventually emit {size} bytes aligned \ 663 to {align} at label {label:?}", 664 ); 665 self.pending_constants.push(constant); 666 self.pending_constants_size += size as u32; 667 self.constants[constant].upcoming_label = Some(label); 668 label 669 } 670 671 /// Bind a label to the current offset. A label can only be bound once. 672 pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) { 673 trace!( 674 "MachBuffer: bind label {:?} at offset {}", 675 label, 676 self.cur_offset() 677 ); 678 debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET); 679 debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL); 680 let offset = self.cur_offset(); 681 self.label_offsets[label.0 as usize] = offset; 682 self.lazily_clear_labels_at_tail(); 683 self.labels_at_tail.push(label); 684 685 // Invariants hold: bound offset of label is <= cur_offset (in fact it 686 // is equal). If the `labels_at_tail` list was complete and precise 687 // before, it is still, because we have bound this label to the current 688 // offset and added it to the list (which contains all labels at the 689 // current offset). 690 691 self.optimize_branches(ctrl_plane); 692 693 // Post-invariant: by `optimize_branches()` (see argument there). 694 } 695 696 /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the 697 /// offset that it applies to. 698 fn lazily_clear_labels_at_tail(&mut self) { 699 let offset = self.cur_offset(); 700 if offset > self.labels_at_tail_off { 701 self.labels_at_tail_off = offset; 702 self.labels_at_tail.clear(); 703 } 704 705 // Post-invariant: either labels_at_tail_off was at cur_offset, and 706 // state is untouched, or was less than cur_offset, in which case the 707 // labels_at_tail list was conceptually empty, and is now actually 708 // empty. 709 } 710 711 /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`. 712 pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset { 713 let mut iters = 0; 714 while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL { 715 label = self.label_aliases[label.0 as usize]; 716 // To protect against an infinite loop (despite our assurances to 717 // ourselves that the invariants make this impossible), assert out 718 // after 1M iterations. The number of basic blocks is limited 719 // in most contexts anyway so this should be impossible to hit with 720 // a legitimate input. 721 iters += 1; 722 assert!(iters < 1_000_000, "Unexpected cycle in label aliases"); 723 } 724 self.label_offsets[label.0 as usize] 725 726 // Post-invariant: no mutations. 727 } 728 729 /// Emit a reference to the given label with the given reference type (i.e., 730 /// branch-instruction format) at the current offset. This is like a 731 /// relocation, but handled internally. 732 /// 733 /// This can be called before the branch is actually emitted; fixups will 734 /// not happen until an island is emitted or the buffer is finished. 735 pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) { 736 trace!( 737 "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}", 738 offset, 739 label, 740 kind 741 ); 742 743 // Add the fixup, and update the worst-case island size based on a 744 // veneer for this label use. 745 let fixup = MachLabelFixup { 746 label, 747 offset, 748 kind, 749 }; 750 self.pending_fixup_deadline = self.pending_fixup_deadline.min(fixup.deadline()); 751 self.pending_fixup_records.push(fixup); 752 753 // Post-invariant: no mutations to branches/labels data structures. 754 } 755 756 /// Inform the buffer of an unconditional branch at the given offset, 757 /// targeting the given label. May be used to optimize branches. 758 /// The last added label-use must correspond to this branch. 759 /// This must be called when the current offset is equal to `start`; i.e., 760 /// before actually emitting the branch. This implies that for a branch that 761 /// uses a label and is eligible for optimizations by the MachBuffer, the 762 /// proper sequence is: 763 /// 764 /// - Call `use_label_at_offset()` to emit the fixup record. 765 /// - Call `add_uncond_branch()` to make note of the branch. 766 /// - Emit the bytes for the branch's machine code. 767 /// 768 /// Additional requirement: no labels may be bound between `start` and `end` 769 /// (exclusive on both ends). 770 pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) { 771 debug_assert!( 772 !self.open_patchable, 773 "Branch instruction inserted within a patchable region" 774 ); 775 assert!(self.cur_offset() == start); 776 debug_assert!(end > start); 777 assert!(!self.pending_fixup_records.is_empty()); 778 let fixup = self.pending_fixup_records.len() - 1; 779 self.lazily_clear_labels_at_tail(); 780 self.latest_branches.push(MachBranch { 781 start, 782 end, 783 target, 784 fixup, 785 inverted: None, 786 labels_at_this_branch: self.labels_at_tail.clone(), 787 }); 788 789 // Post-invariant: we asserted branch start is current tail; the list of 790 // labels at branch is cloned from list of labels at current tail. 791 } 792 793 /// Inform the buffer of a conditional branch at the given offset, 794 /// targeting the given label. May be used to optimize branches. 795 /// The last added label-use must correspond to this branch. 796 /// 797 /// Additional requirement: no labels may be bound between `start` and `end` 798 /// (exclusive on both ends). 799 pub fn add_cond_branch( 800 &mut self, 801 start: CodeOffset, 802 end: CodeOffset, 803 target: MachLabel, 804 inverted: &[u8], 805 ) { 806 debug_assert!( 807 !self.open_patchable, 808 "Branch instruction inserted within a patchable region" 809 ); 810 assert!(self.cur_offset() == start); 811 debug_assert!(end > start); 812 assert!(!self.pending_fixup_records.is_empty()); 813 debug_assert!(inverted.len() == (end - start) as usize); 814 let fixup = self.pending_fixup_records.len() - 1; 815 let inverted = Some(SmallVec::from(inverted)); 816 self.lazily_clear_labels_at_tail(); 817 self.latest_branches.push(MachBranch { 818 start, 819 end, 820 target, 821 fixup, 822 inverted, 823 labels_at_this_branch: self.labels_at_tail.clone(), 824 }); 825 826 // Post-invariant: we asserted branch start is current tail; labels at 827 // branch list is cloned from list of labels at current tail. 828 } 829 830 fn truncate_last_branch(&mut self) { 831 debug_assert!( 832 !self.open_patchable, 833 "Branch instruction truncated within a patchable region" 834 ); 835 836 self.lazily_clear_labels_at_tail(); 837 // Invariants hold at this point. 838 839 let b = self.latest_branches.pop().unwrap(); 840 assert!(b.end == self.cur_offset()); 841 842 // State: 843 // [PRE CODE] 844 // Offset b.start, b.labels_at_this_branch: 845 // [BRANCH CODE] 846 // cur_off, self.labels_at_tail --> 847 // (end of buffer) 848 self.data.truncate(b.start as usize); 849 self.pending_fixup_records.truncate(b.fixup); 850 while let Some(last_srcloc) = self.srclocs.last_mut() { 851 if last_srcloc.end <= b.start { 852 break; 853 } 854 if last_srcloc.start < b.start { 855 last_srcloc.end = b.start; 856 break; 857 } 858 self.srclocs.pop(); 859 } 860 // State: 861 // [PRE CODE] 862 // cur_off, Offset b.start, b.labels_at_this_branch: 863 // (end of buffer) 864 // 865 // self.labels_at_tail --> (past end of buffer) 866 let cur_off = self.cur_offset(); 867 self.labels_at_tail_off = cur_off; 868 // State: 869 // [PRE CODE] 870 // cur_off, Offset b.start, b.labels_at_this_branch, 871 // self.labels_at_tail: 872 // (end of buffer) 873 // 874 // resolve_label_offset(l) for l in labels_at_tail: 875 // (past end of buffer) 876 877 trace!( 878 "truncate_last_branch: truncated {:?}; off now {}", 879 b, 880 cur_off 881 ); 882 883 // Fix up resolved label offsets for labels at tail. 884 for &l in &self.labels_at_tail { 885 self.label_offsets[l.0 as usize] = cur_off; 886 } 887 // Old labels_at_this_branch are now at cur_off. 888 self.labels_at_tail 889 .extend(b.labels_at_this_branch.into_iter()); 890 891 // Post-invariant: this operation is defined to truncate the buffer, 892 // which moves cur_off backward, and to move labels at the end of the 893 // buffer back to the start-of-branch offset. 894 // 895 // latest_branches satisfies all invariants: 896 // - it has no branches past the end of the buffer (branches are in 897 // order, we removed the last one, and we truncated the buffer to just 898 // before the start of that branch) 899 // - no labels were moved to lower offsets than the (new) cur_off, so 900 // the labels_at_this_branch list for any other branch need not change. 901 // 902 // labels_at_tail satisfies all invariants: 903 // - all labels that were at the tail after the truncated branch are 904 // moved backward to just before the branch, which becomes the new tail; 905 // thus every element in the list should remain (ensured by `.extend()` 906 // above). 907 // - all labels that refer to the new tail, which is the start-offset of 908 // the truncated branch, must be present. The `labels_at_this_branch` 909 // list in the truncated branch's record is a complete and precise list 910 // of exactly these labels; we append these to labels_at_tail. 911 // - labels_at_tail_off is at cur_off after truncation occurs, so the 912 // list is valid (not to be lazily cleared). 913 // 914 // The stated operation was performed: 915 // - For each label at the end of the buffer prior to this method, it 916 // now resolves to the new (truncated) end of the buffer: it must have 917 // been in `labels_at_tail` (this list is precise and complete, and 918 // the tail was at the end of the truncated branch on entry), and we 919 // iterate over this list and set `label_offsets` to the new tail. 920 // None of these labels could have been an alias (by invariant), so 921 // `label_offsets` is authoritative for each. 922 // - No other labels will be past the end of the buffer, because of the 923 // requirement that no labels be bound to the middle of branch ranges 924 // (see comments to `add_{cond,uncond}_branch()`). 925 // - The buffer is truncated to just before the last branch, and the 926 // fixup record referring to that last branch is removed. 927 } 928 929 /// Performs various optimizations on branches pointing at the current label. 930 pub fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) { 931 if ctrl_plane.get_decision() { 932 return; 933 } 934 935 self.lazily_clear_labels_at_tail(); 936 // Invariants valid at this point. 937 938 trace!( 939 "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}", 940 self.latest_branches, 941 self.labels_at_tail, 942 self.pending_fixup_records 943 ); 944 945 // We continue to munch on branches at the tail of the buffer until no 946 // more rules apply. Note that the loop only continues if a branch is 947 // actually truncated (or if labels are redirected away from a branch), 948 // so this always makes progress. 949 while let Some(b) = self.latest_branches.last() { 950 let cur_off = self.cur_offset(); 951 trace!("optimize_branches: last branch {:?} at off {}", b, cur_off); 952 // If there has been any code emission since the end of the last branch or 953 // label definition, then there's nothing we can edit (because we 954 // don't move code once placed, only back up and overwrite), so 955 // clear the records and finish. 956 if b.end < cur_off { 957 break; 958 } 959 960 // If the "labels at this branch" list on this branch is 961 // longer than a threshold, don't do any simplification, 962 // and let the branch remain to separate those labels from 963 // the current tail. This avoids quadratic behavior (see 964 // #3468): otherwise, if a long string of "goto next; 965 // next:" patterns are emitted, all of the labels will 966 // coalesce into a long list of aliases for the current 967 // buffer tail. We must track all aliases of the current 968 // tail for correctness, but we are also allowed to skip 969 // optimization (removal) of any branch, so we take the 970 // escape hatch here and let it stand. In effect this 971 // "spreads" the many thousands of labels in the 972 // pathological case among an actual (harmless but 973 // suboptimal) instruction once per N labels. 974 if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD { 975 break; 976 } 977 978 // Invariant: we are looking at a branch that ends at the tail of 979 // the buffer. 980 981 // For any branch, conditional or unconditional: 982 // - If the target is a label at the current offset, then remove 983 // the conditional branch, and reset all labels that targeted 984 // the current offset (end of branch) to the truncated 985 // end-of-code. 986 // 987 // Preserves execution semantics: a branch to its own fallthrough 988 // address is equivalent to a no-op; in both cases, nextPC is the 989 // fallthrough. 990 if self.resolve_label_offset(b.target) == cur_off { 991 trace!("branch with target == cur off; truncating"); 992 self.truncate_last_branch(); 993 continue; 994 } 995 996 // If latest is an unconditional branch: 997 // 998 // - If the branch's target is not its own start address, then for 999 // each label at the start of branch, make the label an alias of the 1000 // branch target, and remove the label from the "labels at this 1001 // branch" list. 1002 // 1003 // - Preserves execution semantics: an unconditional branch's 1004 // only effect is to set PC to a new PC; this change simply 1005 // collapses one step in the step-semantics. 1006 // 1007 // - Post-invariant: the labels that were bound to the start of 1008 // this branch become aliases, so they must not be present in any 1009 // labels-at-this-branch list or the labels-at-tail list. The 1010 // labels are removed form the latest-branch record's 1011 // labels-at-this-branch list, and are never placed in the 1012 // labels-at-tail list. Furthermore, it is correct that they are 1013 // not in either list, because they are now aliases, and labels 1014 // that are aliases remain aliases forever. 1015 // 1016 // - If there is a prior unconditional branch that ends just before 1017 // this one begins, and this branch has no labels bound to its 1018 // start, then we can truncate this branch, because it is entirely 1019 // unreachable (we have redirected all labels that make it 1020 // reachable otherwise). Do so and continue around the loop. 1021 // 1022 // - Preserves execution semantics: the branch is unreachable, 1023 // because execution can only flow into an instruction from the 1024 // prior instruction's fallthrough or from a branch bound to that 1025 // instruction's start offset. Unconditional branches have no 1026 // fallthrough, so if the prior instruction is an unconditional 1027 // branch, no fallthrough entry can happen. The 1028 // labels-at-this-branch list is complete (by invariant), so if it 1029 // is empty, then the instruction is entirely unreachable. Thus, 1030 // it can be removed. 1031 // 1032 // - Post-invariant: ensured by truncate_last_branch(). 1033 // 1034 // - If there is a prior conditional branch whose target label 1035 // resolves to the current offset (branches around the 1036 // unconditional branch), then remove the unconditional branch, 1037 // and make the target of the unconditional the target of the 1038 // conditional instead. 1039 // 1040 // - Preserves execution semantics: previously we had: 1041 // 1042 // L1: 1043 // cond_br L2 1044 // br L3 1045 // L2: 1046 // (end of buffer) 1047 // 1048 // by removing the last branch, we have: 1049 // 1050 // L1: 1051 // cond_br L2 1052 // L2: 1053 // (end of buffer) 1054 // 1055 // we then fix up the records for the conditional branch to 1056 // have: 1057 // 1058 // L1: 1059 // cond_br.inverted L3 1060 // L2: 1061 // 1062 // In the original code, control flow reaches L2 when the 1063 // conditional branch's predicate is true, and L3 otherwise. In 1064 // the optimized code, the same is true. 1065 // 1066 // - Post-invariant: all edits to latest_branches and 1067 // labels_at_tail are performed by `truncate_last_branch()`, 1068 // which maintains the invariants at each step. 1069 1070 if b.is_uncond() { 1071 // Set any label equal to current branch's start as an alias of 1072 // the branch's target, if the target is not the branch itself 1073 // (i.e., an infinite loop). 1074 // 1075 // We cannot perform this aliasing if the target of this branch 1076 // ultimately aliases back here; if so, we need to keep this 1077 // branch, so break out of this loop entirely (and clear the 1078 // latest-branches list below). 1079 // 1080 // Note that this check is what prevents cycles from forming in 1081 // `self.label_aliases`. To see why, consider an arbitrary start 1082 // state: 1083 // 1084 // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to 1085 // Ln, which is not aliased. 1086 // 1087 // We would create a cycle if we assigned label_aliases[Ln] 1088 // = L1. Note that the below assignment is the only write 1089 // to label_aliases. 1090 // 1091 // By our other invariants, we have that Ln (`l` below) 1092 // resolves to the offset `b.start`, because it is in the 1093 // set `b.labels_at_this_branch`. 1094 // 1095 // If L1 were already aliased, through some arbitrarily deep 1096 // chain, to Ln, then it must also resolve to this offset 1097 // `b.start`. 1098 // 1099 // By checking the resolution of `L1` against this offset, 1100 // and aborting this branch-simplification if they are 1101 // equal, we prevent the below assignment from ever creating 1102 // a cycle. 1103 if self.resolve_label_offset(b.target) != b.start { 1104 let redirected = b.labels_at_this_branch.len(); 1105 for &l in &b.labels_at_this_branch { 1106 trace!( 1107 " -> label at start of branch {:?} redirected to target {:?}", 1108 l, 1109 b.target 1110 ); 1111 self.label_aliases[l.0 as usize] = b.target; 1112 // NOTE: we continue to ensure the invariant that labels 1113 // pointing to tail of buffer are in `labels_at_tail` 1114 // because we already ensured above that the last branch 1115 // cannot have a target of `cur_off`; so we never have 1116 // to put the label into `labels_at_tail` when moving it 1117 // here. 1118 } 1119 // Maintain invariant: all branches have been redirected 1120 // and are no longer pointing at the start of this branch. 1121 let mut_b = self.latest_branches.last_mut().unwrap(); 1122 mut_b.labels_at_this_branch.clear(); 1123 1124 if redirected > 0 { 1125 trace!(" -> after label redirects, restarting loop"); 1126 continue; 1127 } 1128 } else { 1129 break; 1130 } 1131 1132 let b = self.latest_branches.last().unwrap(); 1133 1134 // Examine any immediately preceding branch. 1135 if self.latest_branches.len() > 1 { 1136 let prev_b = &self.latest_branches[self.latest_branches.len() - 2]; 1137 trace!(" -> more than one branch; prev_b = {:?}", prev_b); 1138 // This uncond is immediately after another uncond; we 1139 // should have already redirected labels to this uncond away 1140 // (but check to be sure); so we can truncate this uncond. 1141 if prev_b.is_uncond() 1142 && prev_b.end == b.start 1143 && b.labels_at_this_branch.is_empty() 1144 { 1145 trace!(" -> uncond follows another uncond; truncating"); 1146 self.truncate_last_branch(); 1147 continue; 1148 } 1149 1150 // This uncond is immediately after a conditional, and the 1151 // conditional's target is the end of this uncond, and we've 1152 // already redirected labels to this uncond away; so we can 1153 // truncate this uncond, flip the sense of the conditional, and 1154 // set the conditional's target (in `latest_branches` and in 1155 // `fixup_records`) to the uncond's target. 1156 if prev_b.is_cond() 1157 && prev_b.end == b.start 1158 && self.resolve_label_offset(prev_b.target) == cur_off 1159 { 1160 trace!(" -> uncond follows a conditional, and conditional's target resolves to current offset"); 1161 // Save the target of the uncond (this becomes the 1162 // target of the cond), and truncate the uncond. 1163 let target = b.target; 1164 let data = prev_b.inverted.clone().unwrap(); 1165 self.truncate_last_branch(); 1166 1167 // Mutate the code and cond branch. 1168 let off_before_edit = self.cur_offset(); 1169 let prev_b = self.latest_branches.last_mut().unwrap(); 1170 let not_inverted = SmallVec::from( 1171 &self.data[(prev_b.start as usize)..(prev_b.end as usize)], 1172 ); 1173 1174 // Low-level edit: replaces bytes of branch with 1175 // inverted form. cur_off remains the same afterward, so 1176 // we do not need to modify label data structures. 1177 self.data.truncate(prev_b.start as usize); 1178 self.data.extend_from_slice(&data[..]); 1179 1180 // Save the original code as the inversion of the 1181 // inverted branch, in case we later edit this branch 1182 // again. 1183 prev_b.inverted = Some(not_inverted); 1184 self.pending_fixup_records[prev_b.fixup].label = target; 1185 trace!(" -> reassigning target of condbr to {:?}", target); 1186 prev_b.target = target; 1187 debug_assert_eq!(off_before_edit, self.cur_offset()); 1188 continue; 1189 } 1190 } 1191 } 1192 1193 // If we couldn't do anything with the last branch, then break. 1194 break; 1195 } 1196 1197 self.purge_latest_branches(); 1198 1199 trace!( 1200 "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}", 1201 self.latest_branches, 1202 self.labels_at_tail, 1203 self.pending_fixup_records 1204 ); 1205 } 1206 1207 fn purge_latest_branches(&mut self) { 1208 // All of our branch simplification rules work only if a branch ends at 1209 // the tail of the buffer, with no following code; and branches are in 1210 // order in latest_branches; so if the last entry ends prior to 1211 // cur_offset, then clear all entries. 1212 let cur_off = self.cur_offset(); 1213 if let Some(l) = self.latest_branches.last() { 1214 if l.end < cur_off { 1215 trace!("purge_latest_branches: removing branch {:?}", l); 1216 self.latest_branches.clear(); 1217 } 1218 } 1219 1220 // Post-invariant: no invariant requires any branch to appear in 1221 // `latest_branches`; it is always optional. The list-clear above thus 1222 // preserves all semantics. 1223 } 1224 1225 /// Emit a trap at some point in the future with the specified code and 1226 /// stack map. 1227 /// 1228 /// This function returns a [`MachLabel`] which will be the future address 1229 /// of the trap. Jumps should refer to this label, likely by using the 1230 /// [`MachBuffer::use_label_at_offset`] method, to get a relocation 1231 /// patched in once the address of the trap is known. 1232 /// 1233 /// This will batch all traps into the end of the function. 1234 pub fn defer_trap(&mut self, code: TrapCode) -> MachLabel { 1235 let label = self.get_label(); 1236 self.pending_traps.push(MachLabelTrap { 1237 label, 1238 code, 1239 loc: self.cur_srcloc.map(|(_start, loc)| loc), 1240 }); 1241 label 1242 } 1243 1244 /// Is an island needed within the next N bytes? 1245 pub fn island_needed(&self, distance: CodeOffset) -> bool { 1246 let deadline = match self.fixup_records.peek() { 1247 Some(fixup) => fixup.deadline().min(self.pending_fixup_deadline), 1248 None => self.pending_fixup_deadline, 1249 }; 1250 deadline < u32::MAX && self.worst_case_end_of_island(distance) > deadline 1251 } 1252 1253 /// Returns the maximal offset that islands can reach if `distance` more 1254 /// bytes are appended. 1255 /// 1256 /// This is used to determine if veneers need insertions since jumps that 1257 /// can't reach past this point must get a veneer of some form. 1258 fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset { 1259 // Assume that all fixups will require veneers and that the veneers are 1260 // the worst-case size for each platform. This is an over-generalization 1261 // to avoid iterating over the `fixup_records` list or maintaining 1262 // information about it as we go along. 1263 let island_worst_case_size = ((self.fixup_records.len() + self.pending_fixup_records.len()) 1264 as u32) 1265 * (I::LabelUse::worst_case_veneer_size()) 1266 + self.pending_constants_size 1267 + (self.pending_traps.len() * I::TRAP_OPCODE.len()) as u32; 1268 self.cur_offset() 1269 .saturating_add(distance) 1270 .saturating_add(island_worst_case_size) 1271 } 1272 1273 /// Emit all pending constants and required pending veneers. 1274 /// 1275 /// Should only be called if `island_needed()` returns true, i.e., if we 1276 /// actually reach a deadline. It's not necessarily a problem to do so 1277 /// otherwise but it may result in unnecessary work during emission. 1278 pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) { 1279 self.emit_island_maybe_forced(ForceVeneers::No, distance, ctrl_plane); 1280 } 1281 1282 /// Same as `emit_island`, but an internal API with a `force_veneers` 1283 /// argument to force all veneers to always get emitted for debugging. 1284 fn emit_island_maybe_forced( 1285 &mut self, 1286 force_veneers: ForceVeneers, 1287 distance: CodeOffset, 1288 ctrl_plane: &mut ControlPlane, 1289 ) { 1290 // We're going to purge fixups, so no latest-branch editing can happen 1291 // anymore. 1292 self.latest_branches.clear(); 1293 1294 // End the current location tracking since anything emitted during this 1295 // function shouldn't be attributed to whatever the current source 1296 // location is. 1297 // 1298 // Note that the current source location, if it's set right now, will be 1299 // restored at the end of this island emission. 1300 let cur_loc = self.cur_srcloc.map(|(_, loc)| loc); 1301 if cur_loc.is_some() { 1302 self.end_srcloc(); 1303 } 1304 1305 let forced_threshold = self.worst_case_end_of_island(distance); 1306 1307 // First flush out all traps/constants so we have more labels in case 1308 // fixups are applied against these labels. 1309 // 1310 // Note that traps are placed first since this typically happens at the 1311 // end of the function and for disassemblers we try to keep all the code 1312 // contiguously together. 1313 for MachLabelTrap { label, code, loc } in mem::take(&mut self.pending_traps) { 1314 // If this trap has source information associated with it then 1315 // emit this information for the trap instruction going out now too. 1316 if let Some(loc) = loc { 1317 self.start_srcloc(loc); 1318 } 1319 self.align_to(I::LabelUse::ALIGN); 1320 self.bind_label(label, ctrl_plane); 1321 self.add_trap(code); 1322 self.put_data(I::TRAP_OPCODE); 1323 if loc.is_some() { 1324 self.end_srcloc(); 1325 } 1326 } 1327 1328 for constant in mem::take(&mut self.pending_constants) { 1329 let MachBufferConstant { align, size, .. } = self.constants[constant]; 1330 let label = self.constants[constant].upcoming_label.take().unwrap(); 1331 self.align_to(align); 1332 self.bind_label(label, ctrl_plane); 1333 self.used_constants.push((constant, self.cur_offset())); 1334 self.get_appended_space(size); 1335 } 1336 1337 // Either handle all pending fixups because they're ready or move them 1338 // onto the `BinaryHeap` tracking all pending fixups if they aren't 1339 // ready. 1340 assert!(self.latest_branches.is_empty()); 1341 for fixup in mem::take(&mut self.pending_fixup_records) { 1342 if self.should_apply_fixup(&fixup, forced_threshold) { 1343 self.handle_fixup(fixup, force_veneers, forced_threshold); 1344 } else { 1345 self.fixup_records.push(fixup); 1346 } 1347 } 1348 self.pending_fixup_deadline = u32::MAX; 1349 while let Some(fixup) = self.fixup_records.peek() { 1350 trace!("emit_island: fixup {:?}", fixup); 1351 1352 // If this fixup shouldn't be applied, that means its label isn't 1353 // defined yet and there'll be remaining space to apply a veneer if 1354 // necessary in the future after this island. In that situation 1355 // because `fixup_records` is sorted by deadline this loop can 1356 // exit. 1357 if !self.should_apply_fixup(fixup, forced_threshold) { 1358 break; 1359 } 1360 1361 let fixup = self.fixup_records.pop().unwrap(); 1362 self.handle_fixup(fixup, force_veneers, forced_threshold); 1363 } 1364 1365 if let Some(loc) = cur_loc { 1366 self.start_srcloc(loc); 1367 } 1368 } 1369 1370 fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool { 1371 let label_offset = self.resolve_label_offset(fixup.label); 1372 label_offset != UNKNOWN_LABEL_OFFSET || fixup.deadline() < forced_threshold 1373 } 1374 1375 fn handle_fixup( 1376 &mut self, 1377 fixup: MachLabelFixup<I>, 1378 force_veneers: ForceVeneers, 1379 forced_threshold: CodeOffset, 1380 ) { 1381 let MachLabelFixup { 1382 label, 1383 offset, 1384 kind, 1385 } = fixup; 1386 let start = offset as usize; 1387 let end = (offset + kind.patch_size()) as usize; 1388 let label_offset = self.resolve_label_offset(label); 1389 1390 if label_offset != UNKNOWN_LABEL_OFFSET { 1391 // If the offset of the label for this fixup is known then 1392 // we're going to do something here-and-now. We're either going 1393 // to patch the original offset because it's an in-bounds jump, 1394 // or we're going to generate a veneer, patch the fixup to jump 1395 // to the veneer, and then keep going. 1396 // 1397 // If the label comes after the original fixup, then we should 1398 // be guaranteed that the jump is in-bounds. Otherwise there's 1399 // a bug somewhere because this method wasn't called soon 1400 // enough. All forward-jumps are tracked and should get veneers 1401 // before their deadline comes and they're unable to jump 1402 // further. 1403 // 1404 // Otherwise if the label is before the fixup, then that's a 1405 // backwards jump. If it's past the maximum negative range 1406 // then we'll emit a veneer that to jump forward to which can 1407 // then jump backwards. 1408 let veneer_required = if label_offset >= offset { 1409 assert!((label_offset - offset) <= kind.max_pos_range()); 1410 false 1411 } else { 1412 (offset - label_offset) > kind.max_neg_range() 1413 }; 1414 trace!( 1415 " -> label_offset = {}, known, required = {} (pos {} neg {})", 1416 label_offset, 1417 veneer_required, 1418 kind.max_pos_range(), 1419 kind.max_neg_range() 1420 ); 1421 1422 if (force_veneers == ForceVeneers::Yes && kind.supports_veneer()) || veneer_required { 1423 self.emit_veneer(label, offset, kind); 1424 } else { 1425 let slice = &mut self.data[start..end]; 1426 trace!("patching in-range!"); 1427 kind.patch(slice, offset, label_offset); 1428 } 1429 } else { 1430 // If the offset of this label is not known at this time then 1431 // that means that a veneer is required because after this 1432 // island the target can't be in range of the original target. 1433 assert!(forced_threshold - offset > kind.max_pos_range()); 1434 self.emit_veneer(label, offset, kind); 1435 } 1436 } 1437 1438 /// Emits a "veneer" the `kind` code at `offset` to jump to `label`. 1439 /// 1440 /// This will generate extra machine code, using `kind`, to get a 1441 /// larger-jump-kind than `kind` allows. The code at `offset` is then 1442 /// patched to jump to our new code, and then the new code is enqueued for 1443 /// a fixup to get processed at some later time. 1444 fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) { 1445 // If this `kind` doesn't support a veneer then that's a bug in the 1446 // backend because we need to implement support for such a veneer. 1447 assert!( 1448 kind.supports_veneer(), 1449 "jump beyond the range of {kind:?} but a veneer isn't supported", 1450 ); 1451 1452 // Allocate space for a veneer in the island. 1453 self.align_to(I::LabelUse::ALIGN); 1454 let veneer_offset = self.cur_offset(); 1455 trace!("making a veneer at {}", veneer_offset); 1456 let start = offset as usize; 1457 let end = (offset + kind.patch_size()) as usize; 1458 let slice = &mut self.data[start..end]; 1459 // Patch the original label use to refer to the veneer. 1460 trace!( 1461 "patching original at offset {} to veneer offset {}", 1462 offset, 1463 veneer_offset 1464 ); 1465 kind.patch(slice, offset, veneer_offset); 1466 // Generate the veneer. 1467 let veneer_slice = self.get_appended_space(kind.veneer_size() as usize); 1468 let (veneer_fixup_off, veneer_label_use) = 1469 kind.generate_veneer(veneer_slice, veneer_offset); 1470 trace!( 1471 "generated veneer; fixup offset {}, label_use {:?}", 1472 veneer_fixup_off, 1473 veneer_label_use 1474 ); 1475 // Register a new use of `label` with our new veneer fixup and 1476 // offset. This'll recalculate deadlines accordingly and 1477 // enqueue this fixup to get processed at some later 1478 // time. 1479 self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use); 1480 } 1481 1482 fn finish_emission_maybe_forcing_veneers( 1483 &mut self, 1484 force_veneers: ForceVeneers, 1485 ctrl_plane: &mut ControlPlane, 1486 ) { 1487 while !self.pending_constants.is_empty() 1488 || !self.pending_traps.is_empty() 1489 || !self.fixup_records.is_empty() 1490 || !self.pending_fixup_records.is_empty() 1491 { 1492 // `emit_island()` will emit any pending veneers and constants, and 1493 // as a side-effect, will also take care of any fixups with resolved 1494 // labels eagerly. 1495 self.emit_island_maybe_forced(force_veneers, u32::MAX, ctrl_plane); 1496 } 1497 1498 // Ensure that all labels have been fixed up after the last island is emitted. This is a 1499 // full (release-mode) assert because an unresolved label means the emitted code is 1500 // incorrect. 1501 assert!(self.fixup_records.is_empty()); 1502 assert!(self.pending_fixup_records.is_empty()); 1503 } 1504 1505 /// Finish any deferred emissions and/or fixups. 1506 pub fn finish( 1507 mut self, 1508 constants: &VCodeConstants, 1509 ctrl_plane: &mut ControlPlane, 1510 ) -> MachBufferFinalized<Stencil> { 1511 let _tt = timing::vcode_emit_finish(); 1512 1513 self.finish_emission_maybe_forcing_veneers(ForceVeneers::No, ctrl_plane); 1514 1515 let alignment = self.finish_constants(constants); 1516 1517 // Resolve all labels to their offsets. 1518 let finalized_relocs = self 1519 .relocs 1520 .iter() 1521 .map(|reloc| FinalizedMachReloc { 1522 offset: reloc.offset, 1523 kind: reloc.kind, 1524 addend: reloc.addend, 1525 target: match &reloc.target { 1526 RelocTarget::ExternalName(name) => { 1527 FinalizedRelocTarget::ExternalName(name.clone()) 1528 } 1529 RelocTarget::Label(label) => { 1530 FinalizedRelocTarget::Func(self.resolve_label_offset(*label)) 1531 } 1532 }, 1533 }) 1534 .collect(); 1535 1536 let mut srclocs = self.srclocs; 1537 srclocs.sort_by_key(|entry| entry.start); 1538 1539 MachBufferFinalized { 1540 data: self.data, 1541 relocs: finalized_relocs, 1542 traps: self.traps, 1543 call_sites: self.call_sites, 1544 srclocs, 1545 stack_maps: self.stack_maps, 1546 user_stack_maps: self.user_stack_maps, 1547 unwind_info: self.unwind_info, 1548 alignment, 1549 } 1550 } 1551 1552 /// Add an external relocation at the given offset from current offset. 1553 pub fn add_reloc_at_offset<T: Into<RelocTarget> + Clone>( 1554 &mut self, 1555 offset: CodeOffset, 1556 kind: Reloc, 1557 target: &T, 1558 addend: Addend, 1559 ) { 1560 let target: RelocTarget = target.clone().into(); 1561 // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally 1562 // generate a label-use statement to track whether an island is possibly 1563 // needed to escape this function to actually get to the external name. 1564 // This is most likely to come up on AArch64 where calls between 1565 // functions use a 26-bit signed offset which gives +/- 64MB. This means 1566 // that if a function is 128MB in size and there's a call in the middle 1567 // it's impossible to reach the actual target. Also, while it's 1568 // technically possible to jump to the start of a function and then jump 1569 // further, island insertion below always inserts islands after 1570 // previously appended code so for Cranelift's own implementation this 1571 // is also a problem for 64MB functions on AArch64 which start with a 1572 // call instruction, those won't be able to escape. 1573 // 1574 // Ideally what needs to happen here is that a `LabelUse` is 1575 // transparently generated (or call-sites of this function are audited 1576 // to generate a `LabelUse` instead) and tracked internally. The actual 1577 // relocation would then change over time if and when a veneer is 1578 // inserted, where the relocation here would be patched by this 1579 // `MachBuffer` to jump to the veneer. The problem, though, is that all 1580 // this still needs to end up, in the case of a singular function, 1581 // generating a final relocation pointing either to this particular 1582 // relocation or to the veneer inserted. Additionally 1583 // `MachBuffer` needs the concept of a label which will never be 1584 // resolved, so `emit_island` doesn't trip over not actually ever 1585 // knowning what some labels are. Currently the loop in 1586 // `finish_emission_maybe_forcing_veneers` would otherwise infinitely 1587 // loop. 1588 // 1589 // For now this means that because relocs aren't tracked at all that 1590 // AArch64 functions have a rough size limits of 64MB. For now that's 1591 // somewhat reasonable and the failure mode is a panic in `MachBuffer` 1592 // when a relocation can't otherwise be resolved later, so it shouldn't 1593 // actually result in any memory unsafety or anything like that. 1594 self.relocs.push(MachReloc { 1595 offset: self.data.len() as CodeOffset + offset, 1596 kind, 1597 target, 1598 addend, 1599 }); 1600 } 1601 1602 /// Add an external relocation at the current offset. 1603 pub fn add_reloc<T: Into<RelocTarget> + Clone>( 1604 &mut self, 1605 kind: Reloc, 1606 target: &T, 1607 addend: Addend, 1608 ) { 1609 self.add_reloc_at_offset(0, kind, target, addend); 1610 } 1611 1612 /// Add a trap record at the current offset. 1613 pub fn add_trap(&mut self, code: TrapCode) { 1614 self.traps.push(MachTrap { 1615 offset: self.data.len() as CodeOffset, 1616 code, 1617 }); 1618 } 1619 1620 /// Add a call-site record at the current offset. 1621 pub fn add_call_site(&mut self) { 1622 self.call_sites.push(MachCallSite { 1623 ret_addr: self.data.len() as CodeOffset, 1624 }); 1625 } 1626 1627 /// Add an unwind record at the current offset. 1628 pub fn add_unwind(&mut self, unwind: UnwindInst) { 1629 self.unwind_info.push((self.cur_offset(), unwind)); 1630 } 1631 1632 /// Set the `SourceLoc` for code from this offset until the offset at the 1633 /// next call to `end_srcloc()`. 1634 /// Returns the current [CodeOffset] and [RelSourceLoc]. 1635 pub fn start_srcloc(&mut self, loc: RelSourceLoc) -> (CodeOffset, RelSourceLoc) { 1636 let cur = (self.cur_offset(), loc); 1637 self.cur_srcloc = Some(cur); 1638 cur 1639 } 1640 1641 /// Mark the end of the `SourceLoc` segment started at the last 1642 /// `start_srcloc()` call. 1643 pub fn end_srcloc(&mut self) { 1644 let (start, loc) = self 1645 .cur_srcloc 1646 .take() 1647 .expect("end_srcloc() called without start_srcloc()"); 1648 let end = self.cur_offset(); 1649 // Skip zero-length extends. 1650 debug_assert!(end >= start); 1651 if end > start { 1652 self.srclocs.push(MachSrcLoc { start, end, loc }); 1653 } 1654 } 1655 1656 /// Add stack map metadata for this program point: a set of stack offsets 1657 /// (from SP upward) that contain live references. 1658 pub fn add_stack_map(&mut self, extent: StackMapExtent, stack_map: StackMap) { 1659 let (start, end) = match extent { 1660 StackMapExtent::UpcomingBytes(insn_len) => { 1661 let start_offset = self.cur_offset(); 1662 (start_offset, start_offset + insn_len) 1663 } 1664 StackMapExtent::StartedAtOffset(start_offset) => { 1665 let end_offset = self.cur_offset(); 1666 debug_assert!(end_offset >= start_offset); 1667 (start_offset, end_offset) 1668 } 1669 }; 1670 trace!("Adding stack map for offsets {start:#x}..{end:#x}: {stack_map:?}"); 1671 self.stack_maps.push(MachStackMap { 1672 offset: start, 1673 offset_end: end, 1674 stack_map, 1675 }); 1676 } 1677 1678 /// Push a user stack map onto this buffer. 1679 /// 1680 /// The stack map is associated with the given `return_addr` code 1681 /// offset. This must be the PC for the instruction just *after* this stack 1682 /// map's associated instruction. For example in the sequence `call $foo; 1683 /// add r8, rax`, the `return_addr` must be the offset of the start of the 1684 /// `add` instruction. 1685 /// 1686 /// Stack maps must be pushed in sorted `return_addr` order. 1687 pub fn push_user_stack_map( 1688 &mut self, 1689 emit_state: &I::State, 1690 return_addr: CodeOffset, 1691 stack_map: ir::UserStackMap, 1692 ) { 1693 let span = emit_state.frame_layout().active_size(); 1694 trace!("Adding user stack map @ {return_addr:#x} spanning {span} bytes: {stack_map:?}"); 1695 1696 debug_assert!( 1697 self.user_stack_maps 1698 .last() 1699 .map_or(true, |(prev_addr, _, _)| *prev_addr < return_addr), 1700 "pushed stack maps out of order: {} is not less than {}", 1701 self.user_stack_maps.last().unwrap().0, 1702 return_addr, 1703 ); 1704 1705 self.user_stack_maps.push((return_addr, span, stack_map)); 1706 } 1707 } 1708 1709 impl<T: CompilePhase> MachBufferFinalized<T> { 1710 /// Get a list of source location mapping tuples in sorted-by-start-offset order. 1711 pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] { 1712 &self.srclocs[..] 1713 } 1714 1715 /// Get the total required size for the code. 1716 pub fn total_size(&self) -> CodeOffset { 1717 self.data.len() as CodeOffset 1718 } 1719 1720 /// Return the code in this mach buffer as a hex string for testing purposes. 1721 pub fn stringify_code_bytes(&self) -> String { 1722 // This is pretty lame, but whatever .. 1723 use std::fmt::Write; 1724 let mut s = String::with_capacity(self.data.len() * 2); 1725 for b in &self.data { 1726 write!(&mut s, "{b:02X}").unwrap(); 1727 } 1728 s 1729 } 1730 1731 /// Get the code bytes. 1732 pub fn data(&self) -> &[u8] { 1733 // N.B.: we emit every section into the .text section as far as 1734 // the `CodeSink` is concerned; we do not bother to segregate 1735 // the contents into the actual program text, the jumptable and the 1736 // rodata (constant pool). This allows us to generate code assuming 1737 // that these will not be relocated relative to each other, and avoids 1738 // having to designate each section as belonging in one of the three 1739 // fixed categories defined by `CodeSink`. If this becomes a problem 1740 // later (e.g. because of memory permissions or similar), we can 1741 // add this designation and segregate the output; take care, however, 1742 // to add the appropriate relocations in this case. 1743 1744 &self.data[..] 1745 } 1746 1747 /// Get the list of external relocations for this code. 1748 pub fn relocs(&self) -> &[FinalizedMachReloc] { 1749 &self.relocs[..] 1750 } 1751 1752 /// Get the list of trap records for this code. 1753 pub fn traps(&self) -> &[MachTrap] { 1754 &self.traps[..] 1755 } 1756 1757 /// Get the stack map metadata for this code. 1758 pub fn stack_maps(&self) -> &[MachStackMap] { 1759 &self.stack_maps[..] 1760 } 1761 1762 /// Take this buffer's stack map metadata. 1763 pub fn take_stack_maps(&mut self) -> SmallVec<[MachStackMap; 8]> { 1764 mem::take(&mut self.stack_maps) 1765 } 1766 1767 /// Get the list of call sites for this code. 1768 pub fn call_sites(&self) -> &[MachCallSite] { 1769 &self.call_sites[..] 1770 } 1771 } 1772 1773 /// Metadata about a constant. 1774 struct MachBufferConstant { 1775 /// A label which has not yet been bound which can be used for this 1776 /// constant. 1777 /// 1778 /// This is lazily created when a label is requested for a constant and is 1779 /// cleared when a constant is emitted. 1780 upcoming_label: Option<MachLabel>, 1781 /// Required alignment. 1782 align: CodeOffset, 1783 /// The byte size of this constant. 1784 size: usize, 1785 } 1786 1787 /// A trap that is deferred to the next time an island is emitted for either 1788 /// traps, constants, or fixups. 1789 struct MachLabelTrap { 1790 /// This label will refer to the trap's offset. 1791 label: MachLabel, 1792 /// The code associated with this trap. 1793 code: TrapCode, 1794 /// An optional source location to assign for this trap. 1795 loc: Option<RelSourceLoc>, 1796 } 1797 1798 /// A fixup to perform on the buffer once code is emitted. Fixups always refer 1799 /// to labels and patch the code based on label offsets. Hence, they are like 1800 /// relocations, but internal to one buffer. 1801 #[derive(Debug)] 1802 struct MachLabelFixup<I: VCodeInst> { 1803 /// The label whose offset controls this fixup. 1804 label: MachLabel, 1805 /// The offset to fix up / patch to refer to this label. 1806 offset: CodeOffset, 1807 /// The kind of fixup. This is architecture-specific; each architecture may have, 1808 /// e.g., several types of branch instructions, each with differently-sized 1809 /// offset fields and different places within the instruction to place the 1810 /// bits. 1811 kind: I::LabelUse, 1812 } 1813 1814 impl<I: VCodeInst> MachLabelFixup<I> { 1815 fn deadline(&self) -> CodeOffset { 1816 self.offset.saturating_add(self.kind.max_pos_range()) 1817 } 1818 } 1819 1820 impl<I: VCodeInst> PartialEq for MachLabelFixup<I> { 1821 fn eq(&self, other: &Self) -> bool { 1822 self.deadline() == other.deadline() 1823 } 1824 } 1825 1826 impl<I: VCodeInst> Eq for MachLabelFixup<I> {} 1827 1828 impl<I: VCodeInst> PartialOrd for MachLabelFixup<I> { 1829 fn partial_cmp(&self, other: &Self) -> Option<Ordering> { 1830 Some(self.cmp(other)) 1831 } 1832 } 1833 1834 impl<I: VCodeInst> Ord for MachLabelFixup<I> { 1835 fn cmp(&self, other: &Self) -> Ordering { 1836 other.deadline().cmp(&self.deadline()) 1837 } 1838 } 1839 1840 /// A relocation resulting from a compilation. 1841 #[derive(Clone, Debug, PartialEq)] 1842 #[cfg_attr( 1843 feature = "enable-serde", 1844 derive(serde_derive::Serialize, serde_derive::Deserialize) 1845 )] 1846 pub struct MachRelocBase<T> { 1847 /// The offset at which the relocation applies, *relative to the 1848 /// containing section*. 1849 pub offset: CodeOffset, 1850 /// The kind of relocation. 1851 pub kind: Reloc, 1852 /// The external symbol / name to which this relocation refers. 1853 pub target: T, 1854 /// The addend to add to the symbol value. 1855 pub addend: i64, 1856 } 1857 1858 type MachReloc = MachRelocBase<RelocTarget>; 1859 1860 /// A relocation resulting from a compilation. 1861 pub type FinalizedMachReloc = MachRelocBase<FinalizedRelocTarget>; 1862 1863 /// A Relocation target 1864 #[derive(Debug, Clone, PartialEq, Eq, Hash)] 1865 pub enum RelocTarget { 1866 /// Points to an [ExternalName] outside the current function. 1867 ExternalName(ExternalName), 1868 /// Points to a [MachLabel] inside this function. 1869 /// This is different from [MachLabelFixup] in that both the relocation and the 1870 /// label will be emitted and are only resolved at link time. 1871 /// 1872 /// There is no reason to prefer this over [MachLabelFixup] unless the ABI requires it. 1873 Label(MachLabel), 1874 } 1875 1876 impl From<ExternalName> for RelocTarget { 1877 fn from(name: ExternalName) -> Self { 1878 Self::ExternalName(name) 1879 } 1880 } 1881 1882 impl From<MachLabel> for RelocTarget { 1883 fn from(label: MachLabel) -> Self { 1884 Self::Label(label) 1885 } 1886 } 1887 1888 /// A Relocation target 1889 #[derive(Debug, Clone, PartialEq, Eq, Hash)] 1890 #[cfg_attr( 1891 feature = "enable-serde", 1892 derive(serde_derive::Serialize, serde_derive::Deserialize) 1893 )] 1894 pub enum FinalizedRelocTarget { 1895 /// Points to an [ExternalName] outside the current function. 1896 ExternalName(ExternalName), 1897 /// Points to a [CodeOffset] from the start of the current function. 1898 Func(CodeOffset), 1899 } 1900 1901 impl FinalizedRelocTarget { 1902 /// Returns a display for the current [FinalizedRelocTarget], with extra context to prettify the 1903 /// output. 1904 pub fn display<'a>(&'a self, params: Option<&'a FunctionParameters>) -> String { 1905 match self { 1906 FinalizedRelocTarget::ExternalName(name) => format!("{}", name.display(params)), 1907 FinalizedRelocTarget::Func(offset) => format!("func+{offset}"), 1908 } 1909 } 1910 } 1911 1912 /// A trap record resulting from a compilation. 1913 #[derive(Clone, Debug, PartialEq)] 1914 #[cfg_attr( 1915 feature = "enable-serde", 1916 derive(serde_derive::Serialize, serde_derive::Deserialize) 1917 )] 1918 pub struct MachTrap { 1919 /// The offset at which the trap instruction occurs, *relative to the 1920 /// containing section*. 1921 pub offset: CodeOffset, 1922 /// The trap code. 1923 pub code: TrapCode, 1924 } 1925 1926 /// A call site record resulting from a compilation. 1927 #[derive(Clone, Debug, PartialEq)] 1928 #[cfg_attr( 1929 feature = "enable-serde", 1930 derive(serde_derive::Serialize, serde_derive::Deserialize) 1931 )] 1932 pub struct MachCallSite { 1933 /// The offset of the call's return address, *relative to the containing section*. 1934 pub ret_addr: CodeOffset, 1935 } 1936 1937 /// A source-location mapping resulting from a compilation. 1938 #[derive(PartialEq, Debug, Clone)] 1939 #[cfg_attr( 1940 feature = "enable-serde", 1941 derive(serde_derive::Serialize, serde_derive::Deserialize) 1942 )] 1943 pub struct MachSrcLoc<T: CompilePhase> { 1944 /// The start of the region of code corresponding to a source location. 1945 /// This is relative to the start of the function, not to the start of the 1946 /// section. 1947 pub start: CodeOffset, 1948 /// The end of the region of code corresponding to a source location. 1949 /// This is relative to the start of the section, not to the start of the 1950 /// section. 1951 pub end: CodeOffset, 1952 /// The source location. 1953 pub loc: T::SourceLocType, 1954 } 1955 1956 impl MachSrcLoc<Stencil> { 1957 fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> { 1958 MachSrcLoc { 1959 start: self.start, 1960 end: self.end, 1961 loc: self.loc.expand(base_srcloc), 1962 } 1963 } 1964 } 1965 1966 /// Record of stack map metadata: stack offsets containing references. 1967 #[derive(Clone, Debug, PartialEq)] 1968 #[cfg_attr( 1969 feature = "enable-serde", 1970 derive(serde_derive::Serialize, serde_derive::Deserialize) 1971 )] 1972 pub struct MachStackMap { 1973 /// The code offset at which this stack map applies. 1974 pub offset: CodeOffset, 1975 /// The code offset just past the "end" of the instruction: that is, the 1976 /// offset of the first byte of the following instruction, or equivalently, 1977 /// the start offset plus the instruction length. 1978 pub offset_end: CodeOffset, 1979 /// The stack map itself. 1980 pub stack_map: StackMap, 1981 } 1982 1983 /// Record of branch instruction in the buffer, to facilitate editing. 1984 #[derive(Clone, Debug)] 1985 struct MachBranch { 1986 start: CodeOffset, 1987 end: CodeOffset, 1988 target: MachLabel, 1989 fixup: usize, 1990 inverted: Option<SmallVec<[u8; 8]>>, 1991 /// All labels pointing to the start of this branch. For correctness, this 1992 /// *must* be complete (i.e., must contain all labels whose resolved offsets 1993 /// are at the start of this branch): we rely on being able to redirect all 1994 /// labels that could jump to this branch before removing it, if it is 1995 /// otherwise unreachable. 1996 labels_at_this_branch: SmallVec<[MachLabel; 4]>, 1997 } 1998 1999 impl MachBranch { 2000 fn is_cond(&self) -> bool { 2001 self.inverted.is_some() 2002 } 2003 fn is_uncond(&self) -> bool { 2004 self.inverted.is_none() 2005 } 2006 } 2007 2008 /// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`. 2009 /// 2010 /// Note that `MachBuffer` was primarily written for intra-function references 2011 /// of jumps between basic blocks, but it's also quite usable for entire text 2012 /// sections and resolving references between functions themselves. This 2013 /// builder interprets "blocks" as labeled functions for the purposes of 2014 /// resolving labels internally in the buffer. 2015 pub struct MachTextSectionBuilder<I: VCodeInst> { 2016 buf: MachBuffer<I>, 2017 next_func: usize, 2018 force_veneers: ForceVeneers, 2019 } 2020 2021 impl<I: VCodeInst> MachTextSectionBuilder<I> { 2022 /// Creates a new text section builder which will have `num_funcs` functions 2023 /// pushed into it. 2024 pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> { 2025 let mut buf = MachBuffer::new(); 2026 buf.reserve_labels_for_blocks(num_funcs); 2027 MachTextSectionBuilder { 2028 buf, 2029 next_func: 0, 2030 force_veneers: ForceVeneers::No, 2031 } 2032 } 2033 } 2034 2035 impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> { 2036 fn append( 2037 &mut self, 2038 labeled: bool, 2039 func: &[u8], 2040 align: u32, 2041 ctrl_plane: &mut ControlPlane, 2042 ) -> u64 { 2043 // Conditionally emit an island if it's necessary to resolve jumps 2044 // between functions which are too far away. 2045 let size = func.len() as u32; 2046 if self.force_veneers == ForceVeneers::Yes || self.buf.island_needed(size) { 2047 self.buf 2048 .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane); 2049 } 2050 2051 self.buf.align_to(align); 2052 let pos = self.buf.cur_offset(); 2053 if labeled { 2054 self.buf.bind_label( 2055 MachLabel::from_block(BlockIndex::new(self.next_func)), 2056 ctrl_plane, 2057 ); 2058 self.next_func += 1; 2059 } 2060 self.buf.put_data(func); 2061 u64::from(pos) 2062 } 2063 2064 fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool { 2065 crate::trace!( 2066 "Resolving relocation @ {offset:#x} + {addend:#x} to target {target} of kind {reloc:?}" 2067 ); 2068 let label = MachLabel::from_block(BlockIndex::new(target)); 2069 let offset = u32::try_from(offset).unwrap(); 2070 match I::LabelUse::from_reloc(reloc, addend) { 2071 Some(label_use) => { 2072 self.buf.use_label_at_offset(offset, label, label_use); 2073 true 2074 } 2075 None => false, 2076 } 2077 } 2078 2079 fn force_veneers(&mut self) { 2080 self.force_veneers = ForceVeneers::Yes; 2081 } 2082 2083 fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> { 2084 // Double-check all functions were pushed. 2085 assert_eq!(self.next_func, self.buf.label_offsets.len()); 2086 2087 // Finish up any veneers, if necessary. 2088 self.buf 2089 .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane); 2090 2091 // We don't need the data any more, so return it to the caller. 2092 mem::take(&mut self.buf.data).into_vec() 2093 } 2094 } 2095 2096 // We use an actual instruction definition to do tests, so we depend on the `arm64` feature here. 2097 #[cfg(all(test, feature = "arm64"))] 2098 mod test { 2099 use cranelift_entity::EntityRef as _; 2100 2101 use super::*; 2102 use crate::ir::UserExternalNameRef; 2103 use crate::isa::aarch64::inst::xreg; 2104 use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst}; 2105 use crate::machinst::{MachInstEmit, MachInstEmitState}; 2106 use crate::settings; 2107 2108 fn label(n: u32) -> MachLabel { 2109 MachLabel::from_block(BlockIndex::new(n as usize)) 2110 } 2111 fn target(n: u32) -> BranchTarget { 2112 BranchTarget::Label(label(n)) 2113 } 2114 2115 #[test] 2116 fn test_elide_jump_to_next() { 2117 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2118 let mut buf = MachBuffer::new(); 2119 let mut state = <Inst as MachInstEmit>::State::default(); 2120 let constants = Default::default(); 2121 2122 buf.reserve_labels_for_blocks(2); 2123 buf.bind_label(label(0), state.ctrl_plane_mut()); 2124 let inst = Inst::Jump { dest: target(1) }; 2125 inst.emit(&mut buf, &info, &mut state); 2126 buf.bind_label(label(1), state.ctrl_plane_mut()); 2127 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2128 assert_eq!(0, buf.total_size()); 2129 } 2130 2131 #[test] 2132 fn test_elide_trivial_jump_blocks() { 2133 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2134 let mut buf = MachBuffer::new(); 2135 let mut state = <Inst as MachInstEmit>::State::default(); 2136 let constants = Default::default(); 2137 2138 buf.reserve_labels_for_blocks(4); 2139 2140 buf.bind_label(label(0), state.ctrl_plane_mut()); 2141 let inst = Inst::CondBr { 2142 kind: CondBrKind::NotZero(xreg(0)), 2143 taken: target(1), 2144 not_taken: target(2), 2145 }; 2146 inst.emit(&mut buf, &info, &mut state); 2147 2148 buf.bind_label(label(1), state.ctrl_plane_mut()); 2149 let inst = Inst::Jump { dest: target(3) }; 2150 inst.emit(&mut buf, &info, &mut state); 2151 2152 buf.bind_label(label(2), state.ctrl_plane_mut()); 2153 let inst = Inst::Jump { dest: target(3) }; 2154 inst.emit(&mut buf, &info, &mut state); 2155 2156 buf.bind_label(label(3), state.ctrl_plane_mut()); 2157 2158 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2159 assert_eq!(0, buf.total_size()); 2160 } 2161 2162 #[test] 2163 fn test_flip_cond() { 2164 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2165 let mut buf = MachBuffer::new(); 2166 let mut state = <Inst as MachInstEmit>::State::default(); 2167 let constants = Default::default(); 2168 2169 buf.reserve_labels_for_blocks(4); 2170 2171 buf.bind_label(label(0), state.ctrl_plane_mut()); 2172 let inst = Inst::CondBr { 2173 kind: CondBrKind::Zero(xreg(0)), 2174 taken: target(1), 2175 not_taken: target(2), 2176 }; 2177 inst.emit(&mut buf, &info, &mut state); 2178 2179 buf.bind_label(label(1), state.ctrl_plane_mut()); 2180 let inst = Inst::Nop4; 2181 inst.emit(&mut buf, &info, &mut state); 2182 2183 buf.bind_label(label(2), state.ctrl_plane_mut()); 2184 let inst = Inst::Udf { 2185 trap_code: TrapCode::Interrupt, 2186 }; 2187 inst.emit(&mut buf, &info, &mut state); 2188 2189 buf.bind_label(label(3), state.ctrl_plane_mut()); 2190 2191 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2192 2193 let mut buf2 = MachBuffer::new(); 2194 let mut state = Default::default(); 2195 let inst = Inst::TrapIf { 2196 kind: CondBrKind::NotZero(xreg(0)), 2197 trap_code: TrapCode::Interrupt, 2198 }; 2199 inst.emit(&mut buf2, &info, &mut state); 2200 let inst = Inst::Nop4; 2201 inst.emit(&mut buf2, &info, &mut state); 2202 2203 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut()); 2204 2205 assert_eq!(buf.data, buf2.data); 2206 } 2207 2208 #[test] 2209 fn test_island() { 2210 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2211 let mut buf = MachBuffer::new(); 2212 let mut state = <Inst as MachInstEmit>::State::default(); 2213 let constants = Default::default(); 2214 2215 buf.reserve_labels_for_blocks(4); 2216 2217 buf.bind_label(label(0), state.ctrl_plane_mut()); 2218 let inst = Inst::CondBr { 2219 kind: CondBrKind::NotZero(xreg(0)), 2220 taken: target(2), 2221 not_taken: target(3), 2222 }; 2223 inst.emit(&mut buf, &info, &mut state); 2224 2225 buf.bind_label(label(1), state.ctrl_plane_mut()); 2226 while buf.cur_offset() < 2000000 { 2227 if buf.island_needed(0) { 2228 buf.emit_island(0, state.ctrl_plane_mut()); 2229 } 2230 let inst = Inst::Nop4; 2231 inst.emit(&mut buf, &info, &mut state); 2232 } 2233 2234 buf.bind_label(label(2), state.ctrl_plane_mut()); 2235 let inst = Inst::Nop4; 2236 inst.emit(&mut buf, &info, &mut state); 2237 2238 buf.bind_label(label(3), state.ctrl_plane_mut()); 2239 let inst = Inst::Nop4; 2240 inst.emit(&mut buf, &info, &mut state); 2241 2242 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2243 2244 assert_eq!(2000000 + 8, buf.total_size()); 2245 2246 let mut buf2 = MachBuffer::new(); 2247 let mut state = Default::default(); 2248 let inst = Inst::CondBr { 2249 kind: CondBrKind::NotZero(xreg(0)), 2250 2251 // This conditionally taken branch has a 19-bit constant, shifted 2252 // to the left by two, giving us a 21-bit range in total. Half of 2253 // this range positive so the we should be around 1 << 20 bytes 2254 // away for our jump target. 2255 // 2256 // There are two pending fixups by the time we reach this point, 2257 // one for this 19-bit jump and one for the unconditional 26-bit 2258 // jump below. A 19-bit veneer is 4 bytes large and the 26-bit 2259 // veneer is 20 bytes large, which means that pessimistically 2260 // assuming we'll need two veneers. Currently each veneer is 2261 // pessimistically assumed to be the maximal size which means we 2262 // need 40 bytes of extra space, meaning that the actual island 2263 // should come 40-bytes before the deadline. 2264 taken: BranchTarget::ResolvedOffset((1 << 20) - 20 - 20), 2265 2266 // This branch is in-range so no veneers should be needed, it should 2267 // go directly to the target. 2268 not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4), 2269 }; 2270 inst.emit(&mut buf2, &info, &mut state); 2271 2272 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut()); 2273 2274 assert_eq!(&buf.data[0..8], &buf2.data[..]); 2275 } 2276 2277 #[test] 2278 fn test_island_backward() { 2279 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2280 let mut buf = MachBuffer::new(); 2281 let mut state = <Inst as MachInstEmit>::State::default(); 2282 let constants = Default::default(); 2283 2284 buf.reserve_labels_for_blocks(4); 2285 2286 buf.bind_label(label(0), state.ctrl_plane_mut()); 2287 let inst = Inst::Nop4; 2288 inst.emit(&mut buf, &info, &mut state); 2289 2290 buf.bind_label(label(1), state.ctrl_plane_mut()); 2291 let inst = Inst::Nop4; 2292 inst.emit(&mut buf, &info, &mut state); 2293 2294 buf.bind_label(label(2), state.ctrl_plane_mut()); 2295 while buf.cur_offset() < 2000000 { 2296 let inst = Inst::Nop4; 2297 inst.emit(&mut buf, &info, &mut state); 2298 } 2299 2300 buf.bind_label(label(3), state.ctrl_plane_mut()); 2301 let inst = Inst::CondBr { 2302 kind: CondBrKind::NotZero(xreg(0)), 2303 taken: target(0), 2304 not_taken: target(1), 2305 }; 2306 inst.emit(&mut buf, &info, &mut state); 2307 2308 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2309 2310 assert_eq!(2000000 + 12, buf.total_size()); 2311 2312 let mut buf2 = MachBuffer::new(); 2313 let mut state = Default::default(); 2314 let inst = Inst::CondBr { 2315 kind: CondBrKind::NotZero(xreg(0)), 2316 taken: BranchTarget::ResolvedOffset(8), 2317 not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)), 2318 }; 2319 inst.emit(&mut buf2, &info, &mut state); 2320 let inst = Inst::Jump { 2321 dest: BranchTarget::ResolvedOffset(-(2000000 + 8)), 2322 }; 2323 inst.emit(&mut buf2, &info, &mut state); 2324 2325 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut()); 2326 2327 assert_eq!(&buf.data[2000000..], &buf2.data[..]); 2328 } 2329 2330 #[test] 2331 fn test_multiple_redirect() { 2332 // label0: 2333 // cbz x0, label1 2334 // b label2 2335 // label1: 2336 // b label3 2337 // label2: 2338 // nop 2339 // nop 2340 // b label0 2341 // label3: 2342 // b label4 2343 // label4: 2344 // b label5 2345 // label5: 2346 // b label7 2347 // label6: 2348 // nop 2349 // label7: 2350 // ret 2351 // 2352 // -- should become: 2353 // 2354 // label0: 2355 // cbz x0, label7 2356 // label2: 2357 // nop 2358 // nop 2359 // b label0 2360 // label6: 2361 // nop 2362 // label7: 2363 // ret 2364 2365 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2366 let mut buf = MachBuffer::new(); 2367 let mut state = <Inst as MachInstEmit>::State::default(); 2368 let constants = Default::default(); 2369 2370 buf.reserve_labels_for_blocks(8); 2371 2372 buf.bind_label(label(0), state.ctrl_plane_mut()); 2373 let inst = Inst::CondBr { 2374 kind: CondBrKind::Zero(xreg(0)), 2375 taken: target(1), 2376 not_taken: target(2), 2377 }; 2378 inst.emit(&mut buf, &info, &mut state); 2379 2380 buf.bind_label(label(1), state.ctrl_plane_mut()); 2381 let inst = Inst::Jump { dest: target(3) }; 2382 inst.emit(&mut buf, &info, &mut state); 2383 2384 buf.bind_label(label(2), state.ctrl_plane_mut()); 2385 let inst = Inst::Nop4; 2386 inst.emit(&mut buf, &info, &mut state); 2387 inst.emit(&mut buf, &info, &mut state); 2388 let inst = Inst::Jump { dest: target(0) }; 2389 inst.emit(&mut buf, &info, &mut state); 2390 2391 buf.bind_label(label(3), state.ctrl_plane_mut()); 2392 let inst = Inst::Jump { dest: target(4) }; 2393 inst.emit(&mut buf, &info, &mut state); 2394 2395 buf.bind_label(label(4), state.ctrl_plane_mut()); 2396 let inst = Inst::Jump { dest: target(5) }; 2397 inst.emit(&mut buf, &info, &mut state); 2398 2399 buf.bind_label(label(5), state.ctrl_plane_mut()); 2400 let inst = Inst::Jump { dest: target(7) }; 2401 inst.emit(&mut buf, &info, &mut state); 2402 2403 buf.bind_label(label(6), state.ctrl_plane_mut()); 2404 let inst = Inst::Nop4; 2405 inst.emit(&mut buf, &info, &mut state); 2406 2407 buf.bind_label(label(7), state.ctrl_plane_mut()); 2408 let inst = Inst::Ret {}; 2409 inst.emit(&mut buf, &info, &mut state); 2410 2411 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2412 2413 let golden_data = vec![ 2414 0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14 2415 0x1f, 0x20, 0x03, 0xd5, // nop 2416 0x1f, 0x20, 0x03, 0xd5, // nop 2417 0xfd, 0xff, 0xff, 0x17, // b 0 2418 0x1f, 0x20, 0x03, 0xd5, // nop 2419 0xc0, 0x03, 0x5f, 0xd6, // ret 2420 ]; 2421 2422 assert_eq!(&golden_data[..], &buf.data[..]); 2423 } 2424 2425 #[test] 2426 fn test_handle_branch_cycle() { 2427 // label0: 2428 // b label1 2429 // label1: 2430 // b label2 2431 // label2: 2432 // b label3 2433 // label3: 2434 // b label4 2435 // label4: 2436 // b label1 // note: not label0 (to make it interesting). 2437 // 2438 // -- should become: 2439 // 2440 // label0, label1, ..., label4: 2441 // b label0 2442 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2443 let mut buf = MachBuffer::new(); 2444 let mut state = <Inst as MachInstEmit>::State::default(); 2445 let constants = Default::default(); 2446 2447 buf.reserve_labels_for_blocks(5); 2448 2449 buf.bind_label(label(0), state.ctrl_plane_mut()); 2450 let inst = Inst::Jump { dest: target(1) }; 2451 inst.emit(&mut buf, &info, &mut state); 2452 2453 buf.bind_label(label(1), state.ctrl_plane_mut()); 2454 let inst = Inst::Jump { dest: target(2) }; 2455 inst.emit(&mut buf, &info, &mut state); 2456 2457 buf.bind_label(label(2), state.ctrl_plane_mut()); 2458 let inst = Inst::Jump { dest: target(3) }; 2459 inst.emit(&mut buf, &info, &mut state); 2460 2461 buf.bind_label(label(3), state.ctrl_plane_mut()); 2462 let inst = Inst::Jump { dest: target(4) }; 2463 inst.emit(&mut buf, &info, &mut state); 2464 2465 buf.bind_label(label(4), state.ctrl_plane_mut()); 2466 let inst = Inst::Jump { dest: target(1) }; 2467 inst.emit(&mut buf, &info, &mut state); 2468 2469 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2470 2471 let golden_data = vec![ 2472 0x00, 0x00, 0x00, 0x14, // b 0 2473 ]; 2474 2475 assert_eq!(&golden_data[..], &buf.data[..]); 2476 } 2477 2478 #[test] 2479 fn metadata_records() { 2480 let mut buf = MachBuffer::<Inst>::new(); 2481 let ctrl_plane = &mut Default::default(); 2482 let constants = Default::default(); 2483 2484 buf.reserve_labels_for_blocks(1); 2485 2486 buf.bind_label(label(0), ctrl_plane); 2487 buf.put1(1); 2488 buf.add_trap(TrapCode::HeapOutOfBounds); 2489 buf.put1(2); 2490 buf.add_trap(TrapCode::IntegerOverflow); 2491 buf.add_trap(TrapCode::IntegerDivisionByZero); 2492 buf.add_call_site(); 2493 buf.add_reloc( 2494 Reloc::Abs4, 2495 &ExternalName::User(UserExternalNameRef::new(0)), 2496 0, 2497 ); 2498 buf.put1(3); 2499 buf.add_reloc( 2500 Reloc::Abs8, 2501 &ExternalName::User(UserExternalNameRef::new(1)), 2502 1, 2503 ); 2504 buf.put1(4); 2505 2506 let buf = buf.finish(&constants, ctrl_plane); 2507 2508 assert_eq!(buf.data(), &[1, 2, 3, 4]); 2509 assert_eq!( 2510 buf.traps() 2511 .iter() 2512 .map(|trap| (trap.offset, trap.code)) 2513 .collect::<Vec<_>>(), 2514 vec![ 2515 (1, TrapCode::HeapOutOfBounds), 2516 (2, TrapCode::IntegerOverflow), 2517 (2, TrapCode::IntegerDivisionByZero) 2518 ] 2519 ); 2520 assert_eq!( 2521 buf.call_sites() 2522 .iter() 2523 .map(|call_site| call_site.ret_addr) 2524 .collect::<Vec<_>>(), 2525 vec![2] 2526 ); 2527 assert_eq!( 2528 buf.relocs() 2529 .iter() 2530 .map(|reloc| (reloc.offset, reloc.kind)) 2531 .collect::<Vec<_>>(), 2532 vec![(2, Reloc::Abs4), (3, Reloc::Abs8)] 2533 ); 2534 } 2535 } 2536