1 //! In-memory representation of compiled machine code, with labels and fixups to 2 //! refer to those labels. Handles constant-pool island insertion and also 3 //! veneer insertion for out-of-range jumps. 4 //! 5 //! This code exists to solve three problems: 6 //! 7 //! - Branch targets for forward branches are not known until later, when we 8 //! emit code in a single pass through the instruction structs. 9 //! 10 //! - On many architectures, address references or offsets have limited range. 11 //! For example, on AArch64, conditional branches can only target code +/- 1MB 12 //! from the branch itself. 13 //! 14 //! - The lowering of control flow from the CFG-with-edges produced by 15 //! [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty 16 //! edge blocks when the register allocator does not need to insert any 17 //! spills/reloads/moves in edge blocks, results in many suboptimal branch 18 //! patterns. The lowering also pays no attention to block order, and so 19 //! two-target conditional forms (cond-br followed by uncond-br) can often by 20 //! avoided because one of the targets is the fallthrough. There are several 21 //! cases here where we can simplify to use fewer branches. 22 //! 23 //! This "buffer" implements a single-pass code emission strategy (with a later 24 //! "fixup" pass, but only through recorded fixups, not all instructions). The 25 //! basic idea is: 26 //! 27 //! - Emit branches as they are, including two-target (cond/uncond) compound 28 //! forms, but with zero offsets and optimistically assuming the target will be 29 //! in range. Record the "fixup" for later. Targets are denoted instead by 30 //! symbolic "labels" that are then bound to certain offsets in the buffer as 31 //! we emit code. (Nominally, there is a label at the start of every basic 32 //! block.) 33 //! 34 //! - As we do this, track the offset in the buffer at which the first label 35 //! reference "goes out of range". We call this the "deadline". If we reach the 36 //! deadline and we still have not bound the label to which an unresolved branch 37 //! refers, we have a problem! 38 //! 39 //! - To solve this problem, we emit "islands" full of "veneers". An island is 40 //! simply a chunk of code inserted in the middle of the code actually produced 41 //! by the emitter (e.g., vcode iterating over instruction structs). The emitter 42 //! has some awareness of this: it either asks for an island between blocks, so 43 //! it is not accidentally executed, or else it emits a branch around the island 44 //! when all other options fail (see `Inst::EmitIsland` meta-instruction). 45 //! 46 //! - A "veneer" is an instruction (or sequence of instructions) in an "island" 47 //! that implements a longer-range reference to a label. The idea is that, for 48 //! example, a branch with a limited range can branch to a "veneer" instead, 49 //! which is simply a branch in a form that can use a longer-range reference. On 50 //! AArch64, for example, conditionals have a +/- 1 MB range, but a conditional 51 //! can branch to an unconditional branch which has a +/- 128 MB range. Hence, a 52 //! conditional branch's label reference can be fixed up with a "veneer" to 53 //! achieve a longer range. 54 //! 55 //! - To implement all of this, we require the backend to provide a `LabelUse` 56 //! type that implements a trait. This is nominally an enum that records one of 57 //! several kinds of references to an offset in code -- basically, a relocation 58 //! type -- and will usually correspond to different instruction formats. The 59 //! `LabelUse` implementation specifies the maximum range, how to patch in the 60 //! actual label location when known, and how to generate a veneer to extend the 61 //! range. 62 //! 63 //! That satisfies label references, but we still may have suboptimal branch 64 //! patterns. To clean up the branches, we do a simple "peephole"-style 65 //! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`) 66 //! informs the buffer of branches in the code and, in the case of conditionals, 67 //! the code that would have been emitted to invert this branch's condition. We 68 //! track the "latest branches": these are branches that are contiguous up to 69 //! the current offset. (If any code is emitted after a branch, that branch or 70 //! run of contiguous branches is no longer "latest".) The latest branches are 71 //! those that we can edit by simply truncating the buffer and doing something 72 //! else instead. 73 //! 74 //! To optimize branches, we implement several simple rules, and try to apply 75 //! them to the "latest branches" when possible: 76 //! 77 //! - A branch with a label target, when that label is bound to the ending 78 //! offset of the branch (the fallthrough location), can be removed altogether, 79 //! because the branch would have no effect). 80 //! 81 //! - An unconditional branch that starts at a label location, and branches to 82 //! another label, results in a "label alias": all references to the label bound 83 //! *to* this branch instruction are instead resolved to the *target* of the 84 //! branch instruction. This effectively removes empty blocks that just 85 //! unconditionally branch to the next block. We call this "branch threading". 86 //! 87 //! - A conditional followed by an unconditional, when the conditional branches 88 //! to the unconditional's fallthrough, results in (i) the truncation of the 89 //! unconditional, (ii) the inversion of the condition's condition, and (iii) 90 //! replacement of the conditional's target (using the original target of the 91 //! unconditional). This is a fancy way of saying "we can flip a two-target 92 //! conditional branch's taken/not-taken targets if it works better with our 93 //! fallthrough". To make this work, the emitter actually gives the buffer 94 //! *both* forms of every conditional branch: the true form is emitted into the 95 //! buffer, and the "inverted" machine-code bytes are provided as part of the 96 //! branch-fixup metadata. 97 //! 98 //! - An unconditional B preceded by another unconditional P, when B's label(s) have 99 //! been redirected to target(B), can be removed entirely. This is an extension 100 //! of the branch-threading optimization, and is valid because if we know there 101 //! will be no fallthrough into this branch instruction (the prior instruction 102 //! is an unconditional jump), and if we know we have successfully redirected 103 //! all labels, then this branch instruction is unreachable. Note that this 104 //! works because the redirection happens before the label is ever resolved 105 //! (fixups happen at island emission time, at which point latest-branches are 106 //! cleared, or at the end of emission), so we are sure to catch and redirect 107 //! all possible paths to this instruction. 108 //! 109 //! # Branch-optimization Correctness 110 //! 111 //! The branch-optimization mechanism depends on a few data structures with 112 //! invariants, which are always held outside the scope of top-level public 113 //! methods: 114 //! 115 //! - The latest-branches list. Each entry describes a span of the buffer 116 //! (start/end offsets), the label target, the corresponding fixup-list entry 117 //! index, and the bytes (must be the same length) for the inverted form, if 118 //! conditional. The list of labels that are bound to the start-offset of this 119 //! branch is *complete* (if any label has a resolved offset equal to `start` 120 //! and is not an alias, it must appear in this list) and *precise* (no label 121 //! in this list can be bound to another offset). No label in this list should 122 //! be an alias. No two branch ranges can overlap, and branches are in 123 //! ascending-offset order. 124 //! 125 //! - The labels-at-tail list. This contains all MachLabels that have been bound 126 //! to (whose resolved offsets are equal to) the tail offset of the buffer. 127 //! No label in this list should be an alias. 128 //! 129 //! - The label_offsets array, containing the bound offset of a label or 130 //! UNKNOWN. No label can be bound at an offset greater than the current 131 //! buffer tail. 132 //! 133 //! - The label_aliases array, containing another label to which a label is 134 //! bound or UNKNOWN. A label's resolved offset is the resolved offset 135 //! of the label it is aliased to, if this is set. 136 //! 137 //! We argue below, at each method, how the invariants in these data structures 138 //! are maintained (grep for "Post-invariant"). 139 //! 140 //! Given these invariants, we argue why each optimization preserves execution 141 //! semantics below (grep for "Preserves execution semantics"). 142 //! 143 //! # Avoiding Quadratic Behavior 144 //! 145 //! There are two cases where we've had to take some care to avoid 146 //! quadratic worst-case behavior: 147 //! 148 //! - The "labels at this branch" list can grow unboundedly if the 149 //! code generator binds many labels at one location. If the count 150 //! gets too high (defined by the `LABEL_LIST_THRESHOLD` constant), we 151 //! simply abort an optimization early in a way that is always correct 152 //! but is conservative. 153 //! 154 //! - The fixup list can interact with island emission to create 155 //! "quadratic island behavior". In a little more detail, one can hit 156 //! this behavior by having some pending fixups (forward label 157 //! references) with long-range label-use kinds, and some others 158 //! with shorter-range references that nonetheless still are pending 159 //! long enough to trigger island generation. In such a case, we 160 //! process the fixup list, generate veneers to extend some forward 161 //! references' ranges, but leave the other (longer-range) ones 162 //! alone. The way this was implemented put them back on a list and 163 //! resulted in quadratic behavior. 164 //! 165 //! To avoid this fixups are split into two lists: one "pending" list and one 166 //! final list. The pending list is kept around for handling fixups related to 167 //! branches so it can be edited/truncated. When an island is reached, which 168 //! starts processing fixups, all pending fixups are flushed into the final 169 //! list. The final list is a `BinaryHeap` which enables fixup processing to 170 //! only process those which are required during island emission, deferring 171 //! all longer-range fixups to later. 172 173 use crate::binemit::{Addend, CodeOffset, Reloc}; 174 use crate::ir::function::FunctionParameters; 175 use crate::ir::{ExceptionTag, ExternalName, RelSourceLoc, SourceLoc, TrapCode}; 176 use crate::isa::unwind::UnwindInst; 177 use crate::machinst::{ 178 BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst, 179 }; 180 use crate::trace; 181 use crate::{MachInstEmitState, ir}; 182 use crate::{VCodeConstantData, timing}; 183 use core::ops::Range; 184 use cranelift_control::ControlPlane; 185 use cranelift_entity::{PrimaryMap, entity_impl}; 186 use smallvec::SmallVec; 187 use std::cmp::Ordering; 188 use std::collections::BinaryHeap; 189 use std::mem; 190 use std::string::String; 191 use std::vec::Vec; 192 193 #[cfg(feature = "enable-serde")] 194 use serde::{Deserialize, Serialize}; 195 196 #[cfg(feature = "enable-serde")] 197 pub trait CompilePhase { 198 type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone; 199 type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone; 200 } 201 202 #[cfg(not(feature = "enable-serde"))] 203 pub trait CompilePhase { 204 type MachSrcLocType: core::fmt::Debug + PartialEq + Clone; 205 type SourceLocType: core::fmt::Debug + PartialEq + Clone; 206 } 207 208 /// Status of a compiled artifact that needs patching before being used. 209 #[derive(Clone, Debug, PartialEq)] 210 #[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))] 211 pub struct Stencil; 212 213 /// Status of a compiled artifact ready to use. 214 #[derive(Clone, Debug, PartialEq)] 215 pub struct Final; 216 217 impl CompilePhase for Stencil { 218 type MachSrcLocType = MachSrcLoc<Stencil>; 219 type SourceLocType = RelSourceLoc; 220 } 221 222 impl CompilePhase for Final { 223 type MachSrcLocType = MachSrcLoc<Final>; 224 type SourceLocType = SourceLoc; 225 } 226 227 #[derive(Clone, Copy, Debug, PartialEq, Eq)] 228 enum ForceVeneers { 229 Yes, 230 No, 231 } 232 233 /// A buffer of output to be produced, fixed up, and then emitted to a CodeSink 234 /// in bulk. 235 /// 236 /// This struct uses `SmallVec`s to support small-ish function bodies without 237 /// any heap allocation. As such, it will be several kilobytes large. This is 238 /// likely fine as long as it is stack-allocated for function emission then 239 /// thrown away; but beware if many buffer objects are retained persistently. 240 pub struct MachBuffer<I: VCodeInst> { 241 /// The buffer contents, as raw bytes. 242 data: SmallVec<[u8; 1024]>, 243 /// The required alignment of this buffer. 244 min_alignment: u32, 245 /// Any relocations referring to this code. Note that only *external* 246 /// relocations are tracked here; references to labels within the buffer are 247 /// resolved before emission. 248 relocs: SmallVec<[MachReloc; 16]>, 249 /// Any trap records referring to this code. 250 traps: SmallVec<[MachTrap; 16]>, 251 /// Any call site records referring to this code. 252 call_sites: SmallVec<[MachCallSite; 16]>, 253 /// Any exception-handler records referred to at call sites. 254 exception_handlers: SmallVec<[MachExceptionHandler; 16]>, 255 /// Any source location mappings referring to this code. 256 srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>, 257 /// Any user stack maps for this code. 258 /// 259 /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted 260 /// by code offset, and each stack map covers `span` bytes on the stack. 261 user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>, 262 /// Any unwind info at a given location. 263 unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>, 264 /// The current source location in progress (after `start_srcloc()` and 265 /// before `end_srcloc()`). This is a (start_offset, src_loc) tuple. 266 cur_srcloc: Option<(CodeOffset, RelSourceLoc)>, 267 /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown. 268 label_offsets: SmallVec<[CodeOffset; 16]>, 269 /// Label aliases: when one label points to an unconditional jump, and that 270 /// jump points to another label, we can redirect references to the first 271 /// label immediately to the second. 272 /// 273 /// Invariant: we don't have label-alias cycles. We ensure this by, 274 /// before setting label A to alias label B, resolving B's alias 275 /// target (iteratively until a non-aliased label); if B is already 276 /// aliased to A, then we cannot alias A back to B. 277 label_aliases: SmallVec<[MachLabel; 16]>, 278 /// Constants that must be emitted at some point. 279 pending_constants: SmallVec<[VCodeConstant; 16]>, 280 /// Byte size of all constants in `pending_constants`. 281 pending_constants_size: CodeOffset, 282 /// Traps that must be emitted at some point. 283 pending_traps: SmallVec<[MachLabelTrap; 16]>, 284 /// Fixups that haven't yet been flushed into `fixup_records` below and may 285 /// be related to branches that are chomped. These all get added to 286 /// `fixup_records` during island emission. 287 pending_fixup_records: SmallVec<[MachLabelFixup<I>; 16]>, 288 /// The nearest upcoming deadline for entries in `pending_fixup_records`. 289 pending_fixup_deadline: CodeOffset, 290 /// Fixups that must be performed after all code is emitted. 291 fixup_records: BinaryHeap<MachLabelFixup<I>>, 292 /// Latest branches, to facilitate in-place editing for better fallthrough 293 /// behavior and empty-block removal. 294 latest_branches: SmallVec<[MachBranch; 4]>, 295 /// All labels at the current offset (emission tail). This is lazily 296 /// cleared: it is actually accurate as long as the current offset is 297 /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should 298 /// be considered as empty. 299 /// 300 /// For correctness, this *must* be complete (i.e., the vector must contain 301 /// all labels whose offsets are resolved to the current tail), because we 302 /// rely on it to update labels when we truncate branches. 303 labels_at_tail: SmallVec<[MachLabel; 4]>, 304 /// The last offset at which `labels_at_tail` is valid. It is conceptually 305 /// always describing the tail of the buffer, but we do not clear 306 /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it 307 /// when the offset has grown past this (`labels_at_tail_off`) point. 308 /// Always <= `cur_offset()`. 309 labels_at_tail_off: CodeOffset, 310 /// Metadata about all constants that this function has access to. 311 /// 312 /// This records the size/alignment of all constants (not the actual data) 313 /// along with the last available label generated for the constant. This map 314 /// is consulted when constants are referred to and the label assigned to a 315 /// constant may change over time as well. 316 constants: PrimaryMap<VCodeConstant, MachBufferConstant>, 317 /// All recorded usages of constants as pairs of the constant and where the 318 /// constant needs to be placed within `self.data`. Note that the same 319 /// constant may appear in this array multiple times if it was emitted 320 /// multiple times. 321 used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>, 322 /// Indicates when a patchable region is currently open, to guard that it's 323 /// not possible to nest patchable regions. 324 open_patchable: bool, 325 } 326 327 impl MachBufferFinalized<Stencil> { 328 /// Get a finalized machine buffer by applying the function's base source location. 329 pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> { 330 MachBufferFinalized { 331 data: self.data, 332 relocs: self.relocs, 333 traps: self.traps, 334 call_sites: self.call_sites, 335 exception_handlers: self.exception_handlers, 336 srclocs: self 337 .srclocs 338 .into_iter() 339 .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc)) 340 .collect(), 341 user_stack_maps: self.user_stack_maps, 342 unwind_info: self.unwind_info, 343 alignment: self.alignment, 344 } 345 } 346 } 347 348 /// A `MachBuffer` once emission is completed: holds generated code and records, 349 /// without fixups. This allows the type to be independent of the backend. 350 #[derive(PartialEq, Debug, Clone)] 351 #[cfg_attr( 352 feature = "enable-serde", 353 derive(serde_derive::Serialize, serde_derive::Deserialize) 354 )] 355 pub struct MachBufferFinalized<T: CompilePhase> { 356 /// The buffer contents, as raw bytes. 357 pub(crate) data: SmallVec<[u8; 1024]>, 358 /// Any relocations referring to this code. Note that only *external* 359 /// relocations are tracked here; references to labels within the buffer are 360 /// resolved before emission. 361 pub(crate) relocs: SmallVec<[FinalizedMachReloc; 16]>, 362 /// Any trap records referring to this code. 363 pub(crate) traps: SmallVec<[MachTrap; 16]>, 364 /// Any call site records referring to this code. 365 pub(crate) call_sites: SmallVec<[MachCallSite; 16]>, 366 /// Any exception-handler records referred to at call sites. 367 pub(crate) exception_handlers: SmallVec<[FinalizedMachExceptionHandler; 16]>, 368 /// Any source location mappings referring to this code. 369 pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>, 370 /// Any user stack maps for this code. 371 /// 372 /// Each entry is an `(offset, span, stack_map)` triple. Entries are sorted 373 /// by code offset, and each stack map covers `span` bytes on the stack. 374 pub(crate) user_stack_maps: SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]>, 375 /// Any unwind info at a given location. 376 pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>, 377 /// The required alignment of this buffer. 378 pub alignment: u32, 379 } 380 381 const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff; 382 const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff); 383 384 /// Threshold on max length of `labels_at_this_branch` list to avoid 385 /// unbounded quadratic behavior (see comment below at use-site). 386 const LABEL_LIST_THRESHOLD: usize = 100; 387 388 /// A label refers to some offset in a `MachBuffer`. It may not be resolved at 389 /// the point at which it is used by emitted code; the buffer records "fixups" 390 /// for references to the label, and will come back and patch the code 391 /// appropriately when the label's location is eventually known. 392 #[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)] 393 pub struct MachLabel(u32); 394 entity_impl!(MachLabel); 395 396 impl MachLabel { 397 /// Get a label for a block. (The first N MachLabels are always reserved for 398 /// the N blocks in the vcode.) 399 pub fn from_block(bindex: BlockIndex) -> MachLabel { 400 MachLabel(bindex.index() as u32) 401 } 402 403 /// Creates a string representing this label, for convenience. 404 pub fn to_string(&self) -> String { 405 format!("label{}", self.0) 406 } 407 } 408 409 impl Default for MachLabel { 410 fn default() -> Self { 411 UNKNOWN_LABEL 412 } 413 } 414 415 /// Represents the beginning of an editable region in the [`MachBuffer`], while code emission is 416 /// still occurring. An [`OpenPatchRegion`] is closed by [`MachBuffer::end_patchable`], consuming 417 /// the [`OpenPatchRegion`] token in the process. 418 pub struct OpenPatchRegion(usize); 419 420 /// A region in the [`MachBuffer`] code buffer that can be edited prior to finalization. An example 421 /// of where you might want to use this is for patching instructions that mention constants that 422 /// won't be known until later: [`MachBuffer::start_patchable`] can be used to begin the patchable 423 /// region, instructions can be emitted with placeholder constants, and the [`PatchRegion`] token 424 /// can be produced by [`MachBuffer::end_patchable`]. Once the values of those constants are known, 425 /// the [`PatchRegion::patch`] function can be used to get a mutable buffer to the instruction 426 /// bytes, and the constants uses can be updated directly. 427 pub struct PatchRegion { 428 range: Range<usize>, 429 } 430 431 impl PatchRegion { 432 /// Consume the patch region to yield a mutable slice of the [`MachBuffer`] data buffer. 433 pub fn patch<I: VCodeInst>(self, buffer: &mut MachBuffer<I>) -> &mut [u8] { 434 &mut buffer.data[self.range] 435 } 436 } 437 438 impl<I: VCodeInst> MachBuffer<I> { 439 /// Create a new section, known to start at `start_offset` and with a size limited to 440 /// `length_limit`. 441 pub fn new() -> MachBuffer<I> { 442 MachBuffer { 443 data: SmallVec::new(), 444 min_alignment: I::function_alignment().minimum, 445 relocs: SmallVec::new(), 446 traps: SmallVec::new(), 447 call_sites: SmallVec::new(), 448 exception_handlers: SmallVec::new(), 449 srclocs: SmallVec::new(), 450 user_stack_maps: SmallVec::new(), 451 unwind_info: SmallVec::new(), 452 cur_srcloc: None, 453 label_offsets: SmallVec::new(), 454 label_aliases: SmallVec::new(), 455 pending_constants: SmallVec::new(), 456 pending_constants_size: 0, 457 pending_traps: SmallVec::new(), 458 pending_fixup_records: SmallVec::new(), 459 pending_fixup_deadline: u32::MAX, 460 fixup_records: Default::default(), 461 latest_branches: SmallVec::new(), 462 labels_at_tail: SmallVec::new(), 463 labels_at_tail_off: 0, 464 constants: Default::default(), 465 used_constants: Default::default(), 466 open_patchable: false, 467 } 468 } 469 470 /// Current offset from start of buffer. 471 pub fn cur_offset(&self) -> CodeOffset { 472 self.data.len() as CodeOffset 473 } 474 475 /// Add a byte. 476 pub fn put1(&mut self, value: u8) { 477 self.data.push(value); 478 479 // Post-invariant: conceptual-labels_at_tail contains a complete and 480 // precise list of labels bound at `cur_offset()`. We have advanced 481 // `cur_offset()`, hence if it had been equal to `labels_at_tail_off` 482 // before, it is not anymore (and it cannot become equal, because 483 // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is 484 // conceptually empty (even though it is only lazily cleared). No labels 485 // can be bound at this new offset (by invariant on `label_offsets`). 486 // Hence the invariant holds. 487 } 488 489 /// Add 2 bytes. 490 pub fn put2(&mut self, value: u16) { 491 let bytes = value.to_le_bytes(); 492 self.data.extend_from_slice(&bytes[..]); 493 494 // Post-invariant: as for `put1()`. 495 } 496 497 /// Add 4 bytes. 498 pub fn put4(&mut self, value: u32) { 499 let bytes = value.to_le_bytes(); 500 self.data.extend_from_slice(&bytes[..]); 501 502 // Post-invariant: as for `put1()`. 503 } 504 505 /// Add 8 bytes. 506 pub fn put8(&mut self, value: u64) { 507 let bytes = value.to_le_bytes(); 508 self.data.extend_from_slice(&bytes[..]); 509 510 // Post-invariant: as for `put1()`. 511 } 512 513 /// Add a slice of bytes. 514 pub fn put_data(&mut self, data: &[u8]) { 515 self.data.extend_from_slice(data); 516 517 // Post-invariant: as for `put1()`. 518 } 519 520 /// Reserve appended space and return a mutable slice referring to it. 521 pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] { 522 let off = self.data.len(); 523 let new_len = self.data.len() + len; 524 self.data.resize(new_len, 0); 525 &mut self.data[off..] 526 527 // Post-invariant: as for `put1()`. 528 } 529 530 /// Align up to the given alignment. 531 pub fn align_to(&mut self, align_to: CodeOffset) { 532 trace!("MachBuffer: align to {}", align_to); 533 assert!( 534 align_to.is_power_of_two(), 535 "{align_to} is not a power of two" 536 ); 537 while self.cur_offset() & (align_to - 1) != 0 { 538 self.put1(0); 539 } 540 541 // Post-invariant: as for `put1()`. 542 } 543 544 /// Begin a region of patchable code. There is one requirement for the 545 /// code that is emitted: It must not introduce any instructions that 546 /// could be chomped (branches are an example of this). In other words, 547 /// you must not call [`MachBuffer::add_cond_branch`] or 548 /// [`MachBuffer::add_uncond_branch`] between calls to this method and 549 /// [`MachBuffer::end_patchable`]. 550 pub fn start_patchable(&mut self) -> OpenPatchRegion { 551 assert!(!self.open_patchable, "Patchable regions may not be nested"); 552 self.open_patchable = true; 553 OpenPatchRegion(usize::try_from(self.cur_offset()).unwrap()) 554 } 555 556 /// End a region of patchable code, yielding a [`PatchRegion`] value that 557 /// can be consumed later to produce a one-off mutable slice to the 558 /// associated region of the data buffer. 559 pub fn end_patchable(&mut self, open: OpenPatchRegion) -> PatchRegion { 560 // No need to assert the state of `open_patchable` here, as we take 561 // ownership of the only `OpenPatchable` value. 562 self.open_patchable = false; 563 let end = usize::try_from(self.cur_offset()).unwrap(); 564 PatchRegion { range: open.0..end } 565 } 566 567 /// Allocate a `Label` to refer to some offset. May not be bound to a fixed 568 /// offset yet. 569 pub fn get_label(&mut self) -> MachLabel { 570 let l = self.label_offsets.len() as u32; 571 self.label_offsets.push(UNKNOWN_LABEL_OFFSET); 572 self.label_aliases.push(UNKNOWN_LABEL); 573 trace!("MachBuffer: new label -> {:?}", MachLabel(l)); 574 MachLabel(l) 575 576 // Post-invariant: the only mutation is to add a new label; it has no 577 // bound offset yet, so it trivially satisfies all invariants. 578 } 579 580 /// Reserve the first N MachLabels for blocks. 581 pub fn reserve_labels_for_blocks(&mut self, blocks: usize) { 582 trace!("MachBuffer: first {} labels are for blocks", blocks); 583 debug_assert!(self.label_offsets.is_empty()); 584 self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET); 585 self.label_aliases.resize(blocks, UNKNOWN_LABEL); 586 587 // Post-invariant: as for `get_label()`. 588 } 589 590 /// Registers metadata in this `MachBuffer` about the `constants` provided. 591 /// 592 /// This will record the size/alignment of all constants which will prepare 593 /// them for emission later on. 594 pub fn register_constants(&mut self, constants: &VCodeConstants) { 595 for (c, val) in constants.iter() { 596 self.register_constant(&c, val); 597 } 598 } 599 600 /// Similar to [`MachBuffer::register_constants`] but registers a 601 /// single constant metadata. This function is useful in 602 /// situations where not all constants are known at the time of 603 /// emission. 604 pub fn register_constant(&mut self, constant: &VCodeConstant, data: &VCodeConstantData) { 605 let c2 = self.constants.push(MachBufferConstant { 606 upcoming_label: None, 607 align: data.alignment(), 608 size: data.as_slice().len(), 609 }); 610 assert_eq!(*constant, c2); 611 } 612 613 /// Completes constant emission by iterating over `self.used_constants` and 614 /// filling in the "holes" with the constant values provided by `constants`. 615 /// 616 /// Returns the alignment required for this entire buffer. Alignment starts 617 /// at the ISA's minimum function alignment and can be increased due to 618 /// constant requirements. 619 fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 { 620 let mut alignment = self.min_alignment; 621 for (constant, offset) in mem::take(&mut self.used_constants) { 622 let constant = constants.get(constant); 623 let data = constant.as_slice(); 624 self.data[offset as usize..][..data.len()].copy_from_slice(data); 625 alignment = constant.alignment().max(alignment); 626 } 627 alignment 628 } 629 630 /// Returns a label that can be used to refer to the `constant` provided. 631 /// 632 /// This will automatically defer a new constant to be emitted for 633 /// `constant` if it has not been previously emitted. Note that this 634 /// function may return a different label for the same constant at 635 /// different points in time. The label is valid to use only from the 636 /// current location; the MachBuffer takes care to emit the same constant 637 /// multiple times if needed so the constant is always in range. 638 pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel { 639 let MachBufferConstant { 640 align, 641 size, 642 upcoming_label, 643 } = self.constants[constant]; 644 if let Some(label) = upcoming_label { 645 return label; 646 } 647 648 let label = self.get_label(); 649 trace!( 650 "defer constant: eventually emit {size} bytes aligned \ 651 to {align} at label {label:?}", 652 ); 653 self.pending_constants.push(constant); 654 self.pending_constants_size += size as u32; 655 self.constants[constant].upcoming_label = Some(label); 656 label 657 } 658 659 /// Bind a label to the current offset. A label can only be bound once. 660 pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) { 661 trace!( 662 "MachBuffer: bind label {:?} at offset {}", 663 label, 664 self.cur_offset() 665 ); 666 debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET); 667 debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL); 668 let offset = self.cur_offset(); 669 self.label_offsets[label.0 as usize] = offset; 670 self.lazily_clear_labels_at_tail(); 671 self.labels_at_tail.push(label); 672 673 // Invariants hold: bound offset of label is <= cur_offset (in fact it 674 // is equal). If the `labels_at_tail` list was complete and precise 675 // before, it is still, because we have bound this label to the current 676 // offset and added it to the list (which contains all labels at the 677 // current offset). 678 679 self.optimize_branches(ctrl_plane); 680 681 // Post-invariant: by `optimize_branches()` (see argument there). 682 } 683 684 /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the 685 /// offset that it applies to. 686 fn lazily_clear_labels_at_tail(&mut self) { 687 let offset = self.cur_offset(); 688 if offset > self.labels_at_tail_off { 689 self.labels_at_tail_off = offset; 690 self.labels_at_tail.clear(); 691 } 692 693 // Post-invariant: either labels_at_tail_off was at cur_offset, and 694 // state is untouched, or was less than cur_offset, in which case the 695 // labels_at_tail list was conceptually empty, and is now actually 696 // empty. 697 } 698 699 /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`. 700 pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset { 701 let mut iters = 0; 702 while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL { 703 label = self.label_aliases[label.0 as usize]; 704 // To protect against an infinite loop (despite our assurances to 705 // ourselves that the invariants make this impossible), assert out 706 // after 1M iterations. The number of basic blocks is limited 707 // in most contexts anyway so this should be impossible to hit with 708 // a legitimate input. 709 iters += 1; 710 assert!(iters < 1_000_000, "Unexpected cycle in label aliases"); 711 } 712 self.label_offsets[label.0 as usize] 713 714 // Post-invariant: no mutations. 715 } 716 717 /// Emit a reference to the given label with the given reference type (i.e., 718 /// branch-instruction format) at the current offset. This is like a 719 /// relocation, but handled internally. 720 /// 721 /// This can be called before the branch is actually emitted; fixups will 722 /// not happen until an island is emitted or the buffer is finished. 723 pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) { 724 trace!( 725 "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}", 726 offset, label, kind 727 ); 728 729 // Add the fixup, and update the worst-case island size based on a 730 // veneer for this label use. 731 let fixup = MachLabelFixup { 732 label, 733 offset, 734 kind, 735 }; 736 self.pending_fixup_deadline = self.pending_fixup_deadline.min(fixup.deadline()); 737 self.pending_fixup_records.push(fixup); 738 739 // Post-invariant: no mutations to branches/labels data structures. 740 } 741 742 /// Inform the buffer of an unconditional branch at the given offset, 743 /// targeting the given label. May be used to optimize branches. 744 /// The last added label-use must correspond to this branch. 745 /// This must be called when the current offset is equal to `start`; i.e., 746 /// before actually emitting the branch. This implies that for a branch that 747 /// uses a label and is eligible for optimizations by the MachBuffer, the 748 /// proper sequence is: 749 /// 750 /// - Call `use_label_at_offset()` to emit the fixup record. 751 /// - Call `add_uncond_branch()` to make note of the branch. 752 /// - Emit the bytes for the branch's machine code. 753 /// 754 /// Additional requirement: no labels may be bound between `start` and `end` 755 /// (exclusive on both ends). 756 pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) { 757 debug_assert!( 758 !self.open_patchable, 759 "Branch instruction inserted within a patchable region" 760 ); 761 assert!(self.cur_offset() == start); 762 debug_assert!(end > start); 763 assert!(!self.pending_fixup_records.is_empty()); 764 let fixup = self.pending_fixup_records.len() - 1; 765 self.lazily_clear_labels_at_tail(); 766 self.latest_branches.push(MachBranch { 767 start, 768 end, 769 target, 770 fixup, 771 inverted: None, 772 labels_at_this_branch: self.labels_at_tail.clone(), 773 }); 774 775 // Post-invariant: we asserted branch start is current tail; the list of 776 // labels at branch is cloned from list of labels at current tail. 777 } 778 779 /// Inform the buffer of a conditional branch at the given offset, 780 /// targeting the given label. May be used to optimize branches. 781 /// The last added label-use must correspond to this branch. 782 /// 783 /// Additional requirement: no labels may be bound between `start` and `end` 784 /// (exclusive on both ends). 785 pub fn add_cond_branch( 786 &mut self, 787 start: CodeOffset, 788 end: CodeOffset, 789 target: MachLabel, 790 inverted: &[u8], 791 ) { 792 debug_assert!( 793 !self.open_patchable, 794 "Branch instruction inserted within a patchable region" 795 ); 796 assert!(self.cur_offset() == start); 797 debug_assert!(end > start); 798 assert!(!self.pending_fixup_records.is_empty()); 799 debug_assert!( 800 inverted.len() == (end - start) as usize, 801 "branch length = {}, but inverted length = {}", 802 end - start, 803 inverted.len() 804 ); 805 let fixup = self.pending_fixup_records.len() - 1; 806 let inverted = Some(SmallVec::from(inverted)); 807 self.lazily_clear_labels_at_tail(); 808 self.latest_branches.push(MachBranch { 809 start, 810 end, 811 target, 812 fixup, 813 inverted, 814 labels_at_this_branch: self.labels_at_tail.clone(), 815 }); 816 817 // Post-invariant: we asserted branch start is current tail; labels at 818 // branch list is cloned from list of labels at current tail. 819 } 820 821 fn truncate_last_branch(&mut self) { 822 debug_assert!( 823 !self.open_patchable, 824 "Branch instruction truncated within a patchable region" 825 ); 826 827 self.lazily_clear_labels_at_tail(); 828 // Invariants hold at this point. 829 830 let b = self.latest_branches.pop().unwrap(); 831 assert!(b.end == self.cur_offset()); 832 833 // State: 834 // [PRE CODE] 835 // Offset b.start, b.labels_at_this_branch: 836 // [BRANCH CODE] 837 // cur_off, self.labels_at_tail --> 838 // (end of buffer) 839 self.data.truncate(b.start as usize); 840 self.pending_fixup_records.truncate(b.fixup); 841 while let Some(last_srcloc) = self.srclocs.last_mut() { 842 if last_srcloc.end <= b.start { 843 break; 844 } 845 if last_srcloc.start < b.start { 846 last_srcloc.end = b.start; 847 break; 848 } 849 self.srclocs.pop(); 850 } 851 // State: 852 // [PRE CODE] 853 // cur_off, Offset b.start, b.labels_at_this_branch: 854 // (end of buffer) 855 // 856 // self.labels_at_tail --> (past end of buffer) 857 let cur_off = self.cur_offset(); 858 self.labels_at_tail_off = cur_off; 859 // State: 860 // [PRE CODE] 861 // cur_off, Offset b.start, b.labels_at_this_branch, 862 // self.labels_at_tail: 863 // (end of buffer) 864 // 865 // resolve_label_offset(l) for l in labels_at_tail: 866 // (past end of buffer) 867 868 trace!( 869 "truncate_last_branch: truncated {:?}; off now {}", 870 b, cur_off 871 ); 872 873 // Fix up resolved label offsets for labels at tail. 874 for &l in &self.labels_at_tail { 875 self.label_offsets[l.0 as usize] = cur_off; 876 } 877 // Old labels_at_this_branch are now at cur_off. 878 self.labels_at_tail.extend(b.labels_at_this_branch); 879 880 // Post-invariant: this operation is defined to truncate the buffer, 881 // which moves cur_off backward, and to move labels at the end of the 882 // buffer back to the start-of-branch offset. 883 // 884 // latest_branches satisfies all invariants: 885 // - it has no branches past the end of the buffer (branches are in 886 // order, we removed the last one, and we truncated the buffer to just 887 // before the start of that branch) 888 // - no labels were moved to lower offsets than the (new) cur_off, so 889 // the labels_at_this_branch list for any other branch need not change. 890 // 891 // labels_at_tail satisfies all invariants: 892 // - all labels that were at the tail after the truncated branch are 893 // moved backward to just before the branch, which becomes the new tail; 894 // thus every element in the list should remain (ensured by `.extend()` 895 // above). 896 // - all labels that refer to the new tail, which is the start-offset of 897 // the truncated branch, must be present. The `labels_at_this_branch` 898 // list in the truncated branch's record is a complete and precise list 899 // of exactly these labels; we append these to labels_at_tail. 900 // - labels_at_tail_off is at cur_off after truncation occurs, so the 901 // list is valid (not to be lazily cleared). 902 // 903 // The stated operation was performed: 904 // - For each label at the end of the buffer prior to this method, it 905 // now resolves to the new (truncated) end of the buffer: it must have 906 // been in `labels_at_tail` (this list is precise and complete, and 907 // the tail was at the end of the truncated branch on entry), and we 908 // iterate over this list and set `label_offsets` to the new tail. 909 // None of these labels could have been an alias (by invariant), so 910 // `label_offsets` is authoritative for each. 911 // - No other labels will be past the end of the buffer, because of the 912 // requirement that no labels be bound to the middle of branch ranges 913 // (see comments to `add_{cond,uncond}_branch()`). 914 // - The buffer is truncated to just before the last branch, and the 915 // fixup record referring to that last branch is removed. 916 } 917 918 /// Performs various optimizations on branches pointing at the current label. 919 pub fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) { 920 if ctrl_plane.get_decision() { 921 return; 922 } 923 924 self.lazily_clear_labels_at_tail(); 925 // Invariants valid at this point. 926 927 trace!( 928 "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}", 929 self.latest_branches, self.labels_at_tail, self.pending_fixup_records 930 ); 931 932 // We continue to munch on branches at the tail of the buffer until no 933 // more rules apply. Note that the loop only continues if a branch is 934 // actually truncated (or if labels are redirected away from a branch), 935 // so this always makes progress. 936 while let Some(b) = self.latest_branches.last() { 937 let cur_off = self.cur_offset(); 938 trace!("optimize_branches: last branch {:?} at off {}", b, cur_off); 939 // If there has been any code emission since the end of the last branch or 940 // label definition, then there's nothing we can edit (because we 941 // don't move code once placed, only back up and overwrite), so 942 // clear the records and finish. 943 if b.end < cur_off { 944 break; 945 } 946 947 // If the "labels at this branch" list on this branch is 948 // longer than a threshold, don't do any simplification, 949 // and let the branch remain to separate those labels from 950 // the current tail. This avoids quadratic behavior (see 951 // #3468): otherwise, if a long string of "goto next; 952 // next:" patterns are emitted, all of the labels will 953 // coalesce into a long list of aliases for the current 954 // buffer tail. We must track all aliases of the current 955 // tail for correctness, but we are also allowed to skip 956 // optimization (removal) of any branch, so we take the 957 // escape hatch here and let it stand. In effect this 958 // "spreads" the many thousands of labels in the 959 // pathological case among an actual (harmless but 960 // suboptimal) instruction once per N labels. 961 if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD { 962 break; 963 } 964 965 // Invariant: we are looking at a branch that ends at the tail of 966 // the buffer. 967 968 // For any branch, conditional or unconditional: 969 // - If the target is a label at the current offset, then remove 970 // the conditional branch, and reset all labels that targeted 971 // the current offset (end of branch) to the truncated 972 // end-of-code. 973 // 974 // Preserves execution semantics: a branch to its own fallthrough 975 // address is equivalent to a no-op; in both cases, nextPC is the 976 // fallthrough. 977 if self.resolve_label_offset(b.target) == cur_off { 978 trace!("branch with target == cur off; truncating"); 979 self.truncate_last_branch(); 980 continue; 981 } 982 983 // If latest is an unconditional branch: 984 // 985 // - If the branch's target is not its own start address, then for 986 // each label at the start of branch, make the label an alias of the 987 // branch target, and remove the label from the "labels at this 988 // branch" list. 989 // 990 // - Preserves execution semantics: an unconditional branch's 991 // only effect is to set PC to a new PC; this change simply 992 // collapses one step in the step-semantics. 993 // 994 // - Post-invariant: the labels that were bound to the start of 995 // this branch become aliases, so they must not be present in any 996 // labels-at-this-branch list or the labels-at-tail list. The 997 // labels are removed form the latest-branch record's 998 // labels-at-this-branch list, and are never placed in the 999 // labels-at-tail list. Furthermore, it is correct that they are 1000 // not in either list, because they are now aliases, and labels 1001 // that are aliases remain aliases forever. 1002 // 1003 // - If there is a prior unconditional branch that ends just before 1004 // this one begins, and this branch has no labels bound to its 1005 // start, then we can truncate this branch, because it is entirely 1006 // unreachable (we have redirected all labels that make it 1007 // reachable otherwise). Do so and continue around the loop. 1008 // 1009 // - Preserves execution semantics: the branch is unreachable, 1010 // because execution can only flow into an instruction from the 1011 // prior instruction's fallthrough or from a branch bound to that 1012 // instruction's start offset. Unconditional branches have no 1013 // fallthrough, so if the prior instruction is an unconditional 1014 // branch, no fallthrough entry can happen. The 1015 // labels-at-this-branch list is complete (by invariant), so if it 1016 // is empty, then the instruction is entirely unreachable. Thus, 1017 // it can be removed. 1018 // 1019 // - Post-invariant: ensured by truncate_last_branch(). 1020 // 1021 // - If there is a prior conditional branch whose target label 1022 // resolves to the current offset (branches around the 1023 // unconditional branch), then remove the unconditional branch, 1024 // and make the target of the unconditional the target of the 1025 // conditional instead. 1026 // 1027 // - Preserves execution semantics: previously we had: 1028 // 1029 // L1: 1030 // cond_br L2 1031 // br L3 1032 // L2: 1033 // (end of buffer) 1034 // 1035 // by removing the last branch, we have: 1036 // 1037 // L1: 1038 // cond_br L2 1039 // L2: 1040 // (end of buffer) 1041 // 1042 // we then fix up the records for the conditional branch to 1043 // have: 1044 // 1045 // L1: 1046 // cond_br.inverted L3 1047 // L2: 1048 // 1049 // In the original code, control flow reaches L2 when the 1050 // conditional branch's predicate is true, and L3 otherwise. In 1051 // the optimized code, the same is true. 1052 // 1053 // - Post-invariant: all edits to latest_branches and 1054 // labels_at_tail are performed by `truncate_last_branch()`, 1055 // which maintains the invariants at each step. 1056 1057 if b.is_uncond() { 1058 // Set any label equal to current branch's start as an alias of 1059 // the branch's target, if the target is not the branch itself 1060 // (i.e., an infinite loop). 1061 // 1062 // We cannot perform this aliasing if the target of this branch 1063 // ultimately aliases back here; if so, we need to keep this 1064 // branch, so break out of this loop entirely (and clear the 1065 // latest-branches list below). 1066 // 1067 // Note that this check is what prevents cycles from forming in 1068 // `self.label_aliases`. To see why, consider an arbitrary start 1069 // state: 1070 // 1071 // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to 1072 // Ln, which is not aliased. 1073 // 1074 // We would create a cycle if we assigned label_aliases[Ln] 1075 // = L1. Note that the below assignment is the only write 1076 // to label_aliases. 1077 // 1078 // By our other invariants, we have that Ln (`l` below) 1079 // resolves to the offset `b.start`, because it is in the 1080 // set `b.labels_at_this_branch`. 1081 // 1082 // If L1 were already aliased, through some arbitrarily deep 1083 // chain, to Ln, then it must also resolve to this offset 1084 // `b.start`. 1085 // 1086 // By checking the resolution of `L1` against this offset, 1087 // and aborting this branch-simplification if they are 1088 // equal, we prevent the below assignment from ever creating 1089 // a cycle. 1090 if self.resolve_label_offset(b.target) != b.start { 1091 let redirected = b.labels_at_this_branch.len(); 1092 for &l in &b.labels_at_this_branch { 1093 trace!( 1094 " -> label at start of branch {:?} redirected to target {:?}", 1095 l, b.target 1096 ); 1097 self.label_aliases[l.0 as usize] = b.target; 1098 // NOTE: we continue to ensure the invariant that labels 1099 // pointing to tail of buffer are in `labels_at_tail` 1100 // because we already ensured above that the last branch 1101 // cannot have a target of `cur_off`; so we never have 1102 // to put the label into `labels_at_tail` when moving it 1103 // here. 1104 } 1105 // Maintain invariant: all branches have been redirected 1106 // and are no longer pointing at the start of this branch. 1107 let mut_b = self.latest_branches.last_mut().unwrap(); 1108 mut_b.labels_at_this_branch.clear(); 1109 1110 if redirected > 0 { 1111 trace!(" -> after label redirects, restarting loop"); 1112 continue; 1113 } 1114 } else { 1115 break; 1116 } 1117 1118 let b = self.latest_branches.last().unwrap(); 1119 1120 // Examine any immediately preceding branch. 1121 if self.latest_branches.len() > 1 { 1122 let prev_b = &self.latest_branches[self.latest_branches.len() - 2]; 1123 trace!(" -> more than one branch; prev_b = {:?}", prev_b); 1124 // This uncond is immediately after another uncond; we 1125 // should have already redirected labels to this uncond away 1126 // (but check to be sure); so we can truncate this uncond. 1127 if prev_b.is_uncond() 1128 && prev_b.end == b.start 1129 && b.labels_at_this_branch.is_empty() 1130 { 1131 trace!(" -> uncond follows another uncond; truncating"); 1132 self.truncate_last_branch(); 1133 continue; 1134 } 1135 1136 // This uncond is immediately after a conditional, and the 1137 // conditional's target is the end of this uncond, and we've 1138 // already redirected labels to this uncond away; so we can 1139 // truncate this uncond, flip the sense of the conditional, and 1140 // set the conditional's target (in `latest_branches` and in 1141 // `fixup_records`) to the uncond's target. 1142 if prev_b.is_cond() 1143 && prev_b.end == b.start 1144 && self.resolve_label_offset(prev_b.target) == cur_off 1145 { 1146 trace!( 1147 " -> uncond follows a conditional, and conditional's target resolves to current offset" 1148 ); 1149 // Save the target of the uncond (this becomes the 1150 // target of the cond), and truncate the uncond. 1151 let target = b.target; 1152 let data = prev_b.inverted.clone().unwrap(); 1153 self.truncate_last_branch(); 1154 1155 // Mutate the code and cond branch. 1156 let off_before_edit = self.cur_offset(); 1157 let prev_b = self.latest_branches.last_mut().unwrap(); 1158 let not_inverted = SmallVec::from( 1159 &self.data[(prev_b.start as usize)..(prev_b.end as usize)], 1160 ); 1161 1162 // Low-level edit: replaces bytes of branch with 1163 // inverted form. cur_off remains the same afterward, so 1164 // we do not need to modify label data structures. 1165 self.data.truncate(prev_b.start as usize); 1166 self.data.extend_from_slice(&data[..]); 1167 1168 // Save the original code as the inversion of the 1169 // inverted branch, in case we later edit this branch 1170 // again. 1171 prev_b.inverted = Some(not_inverted); 1172 self.pending_fixup_records[prev_b.fixup].label = target; 1173 trace!(" -> reassigning target of condbr to {:?}", target); 1174 prev_b.target = target; 1175 debug_assert_eq!(off_before_edit, self.cur_offset()); 1176 continue; 1177 } 1178 } 1179 } 1180 1181 // If we couldn't do anything with the last branch, then break. 1182 break; 1183 } 1184 1185 self.purge_latest_branches(); 1186 1187 trace!( 1188 "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}", 1189 self.latest_branches, self.labels_at_tail, self.pending_fixup_records 1190 ); 1191 } 1192 1193 fn purge_latest_branches(&mut self) { 1194 // All of our branch simplification rules work only if a branch ends at 1195 // the tail of the buffer, with no following code; and branches are in 1196 // order in latest_branches; so if the last entry ends prior to 1197 // cur_offset, then clear all entries. 1198 let cur_off = self.cur_offset(); 1199 if let Some(l) = self.latest_branches.last() { 1200 if l.end < cur_off { 1201 trace!("purge_latest_branches: removing branch {:?}", l); 1202 self.latest_branches.clear(); 1203 } 1204 } 1205 1206 // Post-invariant: no invariant requires any branch to appear in 1207 // `latest_branches`; it is always optional. The list-clear above thus 1208 // preserves all semantics. 1209 } 1210 1211 /// Emit a trap at some point in the future with the specified code and 1212 /// stack map. 1213 /// 1214 /// This function returns a [`MachLabel`] which will be the future address 1215 /// of the trap. Jumps should refer to this label, likely by using the 1216 /// [`MachBuffer::use_label_at_offset`] method, to get a relocation 1217 /// patched in once the address of the trap is known. 1218 /// 1219 /// This will batch all traps into the end of the function. 1220 pub fn defer_trap(&mut self, code: TrapCode) -> MachLabel { 1221 let label = self.get_label(); 1222 self.pending_traps.push(MachLabelTrap { 1223 label, 1224 code, 1225 loc: self.cur_srcloc.map(|(_start, loc)| loc), 1226 }); 1227 label 1228 } 1229 1230 /// Is an island needed within the next N bytes? 1231 pub fn island_needed(&self, distance: CodeOffset) -> bool { 1232 let deadline = match self.fixup_records.peek() { 1233 Some(fixup) => fixup.deadline().min(self.pending_fixup_deadline), 1234 None => self.pending_fixup_deadline, 1235 }; 1236 deadline < u32::MAX && self.worst_case_end_of_island(distance) > deadline 1237 } 1238 1239 /// Returns the maximal offset that islands can reach if `distance` more 1240 /// bytes are appended. 1241 /// 1242 /// This is used to determine if veneers need insertions since jumps that 1243 /// can't reach past this point must get a veneer of some form. 1244 fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset { 1245 // Assume that all fixups will require veneers and that the veneers are 1246 // the worst-case size for each platform. This is an over-generalization 1247 // to avoid iterating over the `fixup_records` list or maintaining 1248 // information about it as we go along. 1249 let island_worst_case_size = ((self.fixup_records.len() + self.pending_fixup_records.len()) 1250 as u32) 1251 * (I::LabelUse::worst_case_veneer_size()) 1252 + self.pending_constants_size 1253 + (self.pending_traps.len() * I::TRAP_OPCODE.len()) as u32; 1254 self.cur_offset() 1255 .saturating_add(distance) 1256 .saturating_add(island_worst_case_size) 1257 } 1258 1259 /// Emit all pending constants and required pending veneers. 1260 /// 1261 /// Should only be called if `island_needed()` returns true, i.e., if we 1262 /// actually reach a deadline. It's not necessarily a problem to do so 1263 /// otherwise but it may result in unnecessary work during emission. 1264 pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) { 1265 self.emit_island_maybe_forced(ForceVeneers::No, distance, ctrl_plane); 1266 } 1267 1268 /// Same as `emit_island`, but an internal API with a `force_veneers` 1269 /// argument to force all veneers to always get emitted for debugging. 1270 fn emit_island_maybe_forced( 1271 &mut self, 1272 force_veneers: ForceVeneers, 1273 distance: CodeOffset, 1274 ctrl_plane: &mut ControlPlane, 1275 ) { 1276 // We're going to purge fixups, so no latest-branch editing can happen 1277 // anymore. 1278 self.latest_branches.clear(); 1279 1280 // End the current location tracking since anything emitted during this 1281 // function shouldn't be attributed to whatever the current source 1282 // location is. 1283 // 1284 // Note that the current source location, if it's set right now, will be 1285 // restored at the end of this island emission. 1286 let cur_loc = self.cur_srcloc.map(|(_, loc)| loc); 1287 if cur_loc.is_some() { 1288 self.end_srcloc(); 1289 } 1290 1291 let forced_threshold = self.worst_case_end_of_island(distance); 1292 1293 // First flush out all traps/constants so we have more labels in case 1294 // fixups are applied against these labels. 1295 // 1296 // Note that traps are placed first since this typically happens at the 1297 // end of the function and for disassemblers we try to keep all the code 1298 // contiguously together. 1299 for MachLabelTrap { label, code, loc } in mem::take(&mut self.pending_traps) { 1300 // If this trap has source information associated with it then 1301 // emit this information for the trap instruction going out now too. 1302 if let Some(loc) = loc { 1303 self.start_srcloc(loc); 1304 } 1305 self.align_to(I::LabelUse::ALIGN); 1306 self.bind_label(label, ctrl_plane); 1307 self.add_trap(code); 1308 self.put_data(I::TRAP_OPCODE); 1309 if loc.is_some() { 1310 self.end_srcloc(); 1311 } 1312 } 1313 1314 for constant in mem::take(&mut self.pending_constants) { 1315 let MachBufferConstant { align, size, .. } = self.constants[constant]; 1316 let label = self.constants[constant].upcoming_label.take().unwrap(); 1317 self.align_to(align); 1318 self.bind_label(label, ctrl_plane); 1319 self.used_constants.push((constant, self.cur_offset())); 1320 self.get_appended_space(size); 1321 } 1322 1323 // Either handle all pending fixups because they're ready or move them 1324 // onto the `BinaryHeap` tracking all pending fixups if they aren't 1325 // ready. 1326 assert!(self.latest_branches.is_empty()); 1327 for fixup in mem::take(&mut self.pending_fixup_records) { 1328 if self.should_apply_fixup(&fixup, forced_threshold) { 1329 self.handle_fixup(fixup, force_veneers, forced_threshold); 1330 } else { 1331 self.fixup_records.push(fixup); 1332 } 1333 } 1334 self.pending_fixup_deadline = u32::MAX; 1335 while let Some(fixup) = self.fixup_records.peek() { 1336 trace!("emit_island: fixup {:?}", fixup); 1337 1338 // If this fixup shouldn't be applied, that means its label isn't 1339 // defined yet and there'll be remaining space to apply a veneer if 1340 // necessary in the future after this island. In that situation 1341 // because `fixup_records` is sorted by deadline this loop can 1342 // exit. 1343 if !self.should_apply_fixup(fixup, forced_threshold) { 1344 break; 1345 } 1346 1347 let fixup = self.fixup_records.pop().unwrap(); 1348 self.handle_fixup(fixup, force_veneers, forced_threshold); 1349 } 1350 1351 if let Some(loc) = cur_loc { 1352 self.start_srcloc(loc); 1353 } 1354 } 1355 1356 fn should_apply_fixup(&self, fixup: &MachLabelFixup<I>, forced_threshold: CodeOffset) -> bool { 1357 let label_offset = self.resolve_label_offset(fixup.label); 1358 label_offset != UNKNOWN_LABEL_OFFSET || fixup.deadline() < forced_threshold 1359 } 1360 1361 fn handle_fixup( 1362 &mut self, 1363 fixup: MachLabelFixup<I>, 1364 force_veneers: ForceVeneers, 1365 forced_threshold: CodeOffset, 1366 ) { 1367 let MachLabelFixup { 1368 label, 1369 offset, 1370 kind, 1371 } = fixup; 1372 let start = offset as usize; 1373 let end = (offset + kind.patch_size()) as usize; 1374 let label_offset = self.resolve_label_offset(label); 1375 1376 if label_offset != UNKNOWN_LABEL_OFFSET { 1377 // If the offset of the label for this fixup is known then 1378 // we're going to do something here-and-now. We're either going 1379 // to patch the original offset because it's an in-bounds jump, 1380 // or we're going to generate a veneer, patch the fixup to jump 1381 // to the veneer, and then keep going. 1382 // 1383 // If the label comes after the original fixup, then we should 1384 // be guaranteed that the jump is in-bounds. Otherwise there's 1385 // a bug somewhere because this method wasn't called soon 1386 // enough. All forward-jumps are tracked and should get veneers 1387 // before their deadline comes and they're unable to jump 1388 // further. 1389 // 1390 // Otherwise if the label is before the fixup, then that's a 1391 // backwards jump. If it's past the maximum negative range 1392 // then we'll emit a veneer that to jump forward to which can 1393 // then jump backwards. 1394 let veneer_required = if label_offset >= offset { 1395 assert!((label_offset - offset) <= kind.max_pos_range()); 1396 false 1397 } else { 1398 (offset - label_offset) > kind.max_neg_range() 1399 }; 1400 trace!( 1401 " -> label_offset = {}, known, required = {} (pos {} neg {})", 1402 label_offset, 1403 veneer_required, 1404 kind.max_pos_range(), 1405 kind.max_neg_range() 1406 ); 1407 1408 if (force_veneers == ForceVeneers::Yes && kind.supports_veneer()) || veneer_required { 1409 self.emit_veneer(label, offset, kind); 1410 } else { 1411 let slice = &mut self.data[start..end]; 1412 trace!( 1413 "patching in-range! slice = {slice:?}; offset = {offset:#x}; label_offset = {label_offset:#x}" 1414 ); 1415 kind.patch(slice, offset, label_offset); 1416 } 1417 } else { 1418 // If the offset of this label is not known at this time then 1419 // that means that a veneer is required because after this 1420 // island the target can't be in range of the original target. 1421 assert!(forced_threshold - offset > kind.max_pos_range()); 1422 self.emit_veneer(label, offset, kind); 1423 } 1424 } 1425 1426 /// Emits a "veneer" the `kind` code at `offset` to jump to `label`. 1427 /// 1428 /// This will generate extra machine code, using `kind`, to get a 1429 /// larger-jump-kind than `kind` allows. The code at `offset` is then 1430 /// patched to jump to our new code, and then the new code is enqueued for 1431 /// a fixup to get processed at some later time. 1432 fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) { 1433 // If this `kind` doesn't support a veneer then that's a bug in the 1434 // backend because we need to implement support for such a veneer. 1435 assert!( 1436 kind.supports_veneer(), 1437 "jump beyond the range of {kind:?} but a veneer isn't supported", 1438 ); 1439 1440 // Allocate space for a veneer in the island. 1441 self.align_to(I::LabelUse::ALIGN); 1442 let veneer_offset = self.cur_offset(); 1443 trace!("making a veneer at {}", veneer_offset); 1444 let start = offset as usize; 1445 let end = (offset + kind.patch_size()) as usize; 1446 let slice = &mut self.data[start..end]; 1447 // Patch the original label use to refer to the veneer. 1448 trace!( 1449 "patching original at offset {} to veneer offset {}", 1450 offset, veneer_offset 1451 ); 1452 kind.patch(slice, offset, veneer_offset); 1453 // Generate the veneer. 1454 let veneer_slice = self.get_appended_space(kind.veneer_size() as usize); 1455 let (veneer_fixup_off, veneer_label_use) = 1456 kind.generate_veneer(veneer_slice, veneer_offset); 1457 trace!( 1458 "generated veneer; fixup offset {}, label_use {:?}", 1459 veneer_fixup_off, veneer_label_use 1460 ); 1461 // Register a new use of `label` with our new veneer fixup and 1462 // offset. This'll recalculate deadlines accordingly and 1463 // enqueue this fixup to get processed at some later 1464 // time. 1465 self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use); 1466 } 1467 1468 fn finish_emission_maybe_forcing_veneers( 1469 &mut self, 1470 force_veneers: ForceVeneers, 1471 ctrl_plane: &mut ControlPlane, 1472 ) { 1473 while !self.pending_constants.is_empty() 1474 || !self.pending_traps.is_empty() 1475 || !self.fixup_records.is_empty() 1476 || !self.pending_fixup_records.is_empty() 1477 { 1478 // `emit_island()` will emit any pending veneers and constants, and 1479 // as a side-effect, will also take care of any fixups with resolved 1480 // labels eagerly. 1481 self.emit_island_maybe_forced(force_veneers, u32::MAX, ctrl_plane); 1482 } 1483 1484 // Ensure that all labels have been fixed up after the last island is emitted. This is a 1485 // full (release-mode) assert because an unresolved label means the emitted code is 1486 // incorrect. 1487 assert!(self.fixup_records.is_empty()); 1488 assert!(self.pending_fixup_records.is_empty()); 1489 } 1490 1491 /// Finish any deferred emissions and/or fixups. 1492 pub fn finish( 1493 mut self, 1494 constants: &VCodeConstants, 1495 ctrl_plane: &mut ControlPlane, 1496 ) -> MachBufferFinalized<Stencil> { 1497 let _tt = timing::vcode_emit_finish(); 1498 1499 self.finish_emission_maybe_forcing_veneers(ForceVeneers::No, ctrl_plane); 1500 1501 let alignment = self.finish_constants(constants); 1502 1503 // Resolve all labels to their offsets. 1504 let finalized_relocs = self 1505 .relocs 1506 .iter() 1507 .map(|reloc| FinalizedMachReloc { 1508 offset: reloc.offset, 1509 kind: reloc.kind, 1510 addend: reloc.addend, 1511 target: match &reloc.target { 1512 RelocTarget::ExternalName(name) => { 1513 FinalizedRelocTarget::ExternalName(name.clone()) 1514 } 1515 RelocTarget::Label(label) => { 1516 FinalizedRelocTarget::Func(self.resolve_label_offset(*label)) 1517 } 1518 }, 1519 }) 1520 .collect(); 1521 1522 let finalized_exception_handlers = self 1523 .exception_handlers 1524 .iter() 1525 .map(|handler| handler.finalize(|label| self.resolve_label_offset(label))) 1526 .collect(); 1527 1528 let mut srclocs = self.srclocs; 1529 srclocs.sort_by_key(|entry| entry.start); 1530 1531 MachBufferFinalized { 1532 data: self.data, 1533 relocs: finalized_relocs, 1534 traps: self.traps, 1535 call_sites: self.call_sites, 1536 exception_handlers: finalized_exception_handlers, 1537 srclocs, 1538 user_stack_maps: self.user_stack_maps, 1539 unwind_info: self.unwind_info, 1540 alignment, 1541 } 1542 } 1543 1544 /// Add an external relocation at the given offset. 1545 pub fn add_reloc_at_offset<T: Into<RelocTarget> + Clone>( 1546 &mut self, 1547 offset: CodeOffset, 1548 kind: Reloc, 1549 target: &T, 1550 addend: Addend, 1551 ) { 1552 let target: RelocTarget = target.clone().into(); 1553 // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally 1554 // generate a label-use statement to track whether an island is possibly 1555 // needed to escape this function to actually get to the external name. 1556 // This is most likely to come up on AArch64 where calls between 1557 // functions use a 26-bit signed offset which gives +/- 64MB. This means 1558 // that if a function is 128MB in size and there's a call in the middle 1559 // it's impossible to reach the actual target. Also, while it's 1560 // technically possible to jump to the start of a function and then jump 1561 // further, island insertion below always inserts islands after 1562 // previously appended code so for Cranelift's own implementation this 1563 // is also a problem for 64MB functions on AArch64 which start with a 1564 // call instruction, those won't be able to escape. 1565 // 1566 // Ideally what needs to happen here is that a `LabelUse` is 1567 // transparently generated (or call-sites of this function are audited 1568 // to generate a `LabelUse` instead) and tracked internally. The actual 1569 // relocation would then change over time if and when a veneer is 1570 // inserted, where the relocation here would be patched by this 1571 // `MachBuffer` to jump to the veneer. The problem, though, is that all 1572 // this still needs to end up, in the case of a singular function, 1573 // generating a final relocation pointing either to this particular 1574 // relocation or to the veneer inserted. Additionally 1575 // `MachBuffer` needs the concept of a label which will never be 1576 // resolved, so `emit_island` doesn't trip over not actually ever 1577 // knowning what some labels are. Currently the loop in 1578 // `finish_emission_maybe_forcing_veneers` would otherwise infinitely 1579 // loop. 1580 // 1581 // For now this means that because relocs aren't tracked at all that 1582 // AArch64 functions have a rough size limits of 64MB. For now that's 1583 // somewhat reasonable and the failure mode is a panic in `MachBuffer` 1584 // when a relocation can't otherwise be resolved later, so it shouldn't 1585 // actually result in any memory unsafety or anything like that. 1586 self.relocs.push(MachReloc { 1587 offset, 1588 kind, 1589 target, 1590 addend, 1591 }); 1592 } 1593 1594 /// Add an external relocation at the current offset. 1595 pub fn add_reloc<T: Into<RelocTarget> + Clone>( 1596 &mut self, 1597 kind: Reloc, 1598 target: &T, 1599 addend: Addend, 1600 ) { 1601 self.add_reloc_at_offset(self.data.len() as CodeOffset, kind, target, addend); 1602 } 1603 1604 /// Add a trap record at the current offset. 1605 pub fn add_trap(&mut self, code: TrapCode) { 1606 self.traps.push(MachTrap { 1607 offset: self.data.len() as CodeOffset, 1608 code, 1609 }); 1610 } 1611 1612 /// Add a call-site record at the current offset. 1613 pub fn add_call_site(&mut self) { 1614 self.add_try_call_site(core::iter::empty()); 1615 } 1616 1617 /// Add a call-site record at the current offset with exception 1618 /// handlers. 1619 pub fn add_try_call_site( 1620 &mut self, 1621 exception_handlers: impl Iterator<Item = MachExceptionHandler>, 1622 ) { 1623 let start = u32::try_from(self.exception_handlers.len()).unwrap(); 1624 self.exception_handlers.extend(exception_handlers); 1625 let end = u32::try_from(self.exception_handlers.len()).unwrap(); 1626 let exception_handler_range = start..end; 1627 1628 self.call_sites.push(MachCallSite { 1629 ret_addr: self.data.len() as CodeOffset, 1630 exception_handler_range, 1631 }); 1632 } 1633 1634 /// Add an unwind record at the current offset. 1635 pub fn add_unwind(&mut self, unwind: UnwindInst) { 1636 self.unwind_info.push((self.cur_offset(), unwind)); 1637 } 1638 1639 /// Set the `SourceLoc` for code from this offset until the offset at the 1640 /// next call to `end_srcloc()`. 1641 /// Returns the current [CodeOffset] and [RelSourceLoc]. 1642 pub fn start_srcloc(&mut self, loc: RelSourceLoc) -> (CodeOffset, RelSourceLoc) { 1643 let cur = (self.cur_offset(), loc); 1644 self.cur_srcloc = Some(cur); 1645 cur 1646 } 1647 1648 /// Mark the end of the `SourceLoc` segment started at the last 1649 /// `start_srcloc()` call. 1650 pub fn end_srcloc(&mut self) { 1651 let (start, loc) = self 1652 .cur_srcloc 1653 .take() 1654 .expect("end_srcloc() called without start_srcloc()"); 1655 let end = self.cur_offset(); 1656 // Skip zero-length extends. 1657 debug_assert!(end >= start); 1658 if end > start { 1659 self.srclocs.push(MachSrcLoc { start, end, loc }); 1660 } 1661 } 1662 1663 /// Push a user stack map onto this buffer. 1664 /// 1665 /// The stack map is associated with the given `return_addr` code 1666 /// offset. This must be the PC for the instruction just *after* this stack 1667 /// map's associated instruction. For example in the sequence `call $foo; 1668 /// add r8, rax`, the `return_addr` must be the offset of the start of the 1669 /// `add` instruction. 1670 /// 1671 /// Stack maps must be pushed in sorted `return_addr` order. 1672 pub fn push_user_stack_map( 1673 &mut self, 1674 emit_state: &I::State, 1675 return_addr: CodeOffset, 1676 mut stack_map: ir::UserStackMap, 1677 ) { 1678 let span = emit_state.frame_layout().active_size(); 1679 trace!("Adding user stack map @ {return_addr:#x} spanning {span} bytes: {stack_map:?}"); 1680 1681 debug_assert!( 1682 self.user_stack_maps 1683 .last() 1684 .map_or(true, |(prev_addr, _, _)| *prev_addr < return_addr), 1685 "pushed stack maps out of order: {} is not less than {}", 1686 self.user_stack_maps.last().unwrap().0, 1687 return_addr, 1688 ); 1689 1690 stack_map.finalize(emit_state.frame_layout().sp_to_sized_stack_slots()); 1691 self.user_stack_maps.push((return_addr, span, stack_map)); 1692 } 1693 1694 /// Increase the alignment of the buffer to the given alignment if bigger 1695 /// than the current alignment. 1696 pub fn set_log2_min_function_alignment(&mut self, align_to: u8) { 1697 self.min_alignment = self.min_alignment.max( 1698 1u32.checked_shl(u32::from(align_to)) 1699 .expect("log2_min_function_alignment too large"), 1700 ); 1701 } 1702 } 1703 1704 impl<I: VCodeInst> Extend<u8> for MachBuffer<I> { 1705 fn extend<T: IntoIterator<Item = u8>>(&mut self, iter: T) { 1706 for b in iter { 1707 self.put1(b); 1708 } 1709 } 1710 } 1711 1712 impl<T: CompilePhase> MachBufferFinalized<T> { 1713 /// Get a list of source location mapping tuples in sorted-by-start-offset order. 1714 pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] { 1715 &self.srclocs[..] 1716 } 1717 1718 /// Get the total required size for the code. 1719 pub fn total_size(&self) -> CodeOffset { 1720 self.data.len() as CodeOffset 1721 } 1722 1723 /// Return the code in this mach buffer as a hex string for testing purposes. 1724 pub fn stringify_code_bytes(&self) -> String { 1725 // This is pretty lame, but whatever .. 1726 use std::fmt::Write; 1727 let mut s = String::with_capacity(self.data.len() * 2); 1728 for b in &self.data { 1729 write!(&mut s, "{b:02X}").unwrap(); 1730 } 1731 s 1732 } 1733 1734 /// Get the code bytes. 1735 pub fn data(&self) -> &[u8] { 1736 // N.B.: we emit every section into the .text section as far as 1737 // the `CodeSink` is concerned; we do not bother to segregate 1738 // the contents into the actual program text, the jumptable and the 1739 // rodata (constant pool). This allows us to generate code assuming 1740 // that these will not be relocated relative to each other, and avoids 1741 // having to designate each section as belonging in one of the three 1742 // fixed categories defined by `CodeSink`. If this becomes a problem 1743 // later (e.g. because of memory permissions or similar), we can 1744 // add this designation and segregate the output; take care, however, 1745 // to add the appropriate relocations in this case. 1746 1747 &self.data[..] 1748 } 1749 1750 /// Get the list of external relocations for this code. 1751 pub fn relocs(&self) -> &[FinalizedMachReloc] { 1752 &self.relocs[..] 1753 } 1754 1755 /// Get the list of trap records for this code. 1756 pub fn traps(&self) -> &[MachTrap] { 1757 &self.traps[..] 1758 } 1759 1760 /// Get the user stack map metadata for this code. 1761 pub fn user_stack_maps(&self) -> &[(CodeOffset, u32, ir::UserStackMap)] { 1762 &self.user_stack_maps 1763 } 1764 1765 /// Take this buffer's user strack map metadata. 1766 pub fn take_user_stack_maps(&mut self) -> SmallVec<[(CodeOffset, u32, ir::UserStackMap); 8]> { 1767 mem::take(&mut self.user_stack_maps) 1768 } 1769 1770 /// Get the list of call sites for this code, along with 1771 /// associated exception handlers. 1772 /// 1773 /// Each item yielded by the returned iterator is a struct with: 1774 /// 1775 /// - The call site metadata record, with a `ret_addr` field 1776 /// directly accessible and denoting the offset of the return 1777 /// address into this buffer's code. 1778 /// - The slice of pairs of exception tags and code offsets 1779 /// denoting exception-handler entry points associated with this 1780 /// call site. 1781 pub fn call_sites(&self) -> impl Iterator<Item = FinalizedMachCallSite<'_>> + '_ { 1782 self.call_sites.iter().map(|call_site| { 1783 let handler_range = call_site.exception_handler_range.clone(); 1784 let handler_range = usize::try_from(handler_range.start).unwrap() 1785 ..usize::try_from(handler_range.end).unwrap(); 1786 FinalizedMachCallSite { 1787 ret_addr: call_site.ret_addr, 1788 exception_handlers: &self.exception_handlers[handler_range], 1789 } 1790 }) 1791 } 1792 } 1793 1794 /// An item in the exception-handler list for a callsite, with label 1795 /// references. Items are interpreted in left-to-right order and the 1796 /// first match wins. 1797 #[derive(Clone, Copy, Debug, PartialEq, Eq)] 1798 pub enum MachExceptionHandler { 1799 /// A specific tag (in the current dynamic context) should be 1800 /// handled by the code at the given offset. 1801 Tag(ExceptionTag, MachLabel), 1802 /// All exceptions should be handled by the code at the given 1803 /// offset. 1804 Default(MachLabel), 1805 /// The dynamic context for interpreting tags is updated to the 1806 /// value stored in the given machine location (in this frame's 1807 /// context). 1808 Context(ExceptionContextLoc), 1809 } 1810 1811 impl MachExceptionHandler { 1812 fn finalize<F: Fn(MachLabel) -> CodeOffset>(self, f: F) -> FinalizedMachExceptionHandler { 1813 match self { 1814 Self::Tag(tag, label) => FinalizedMachExceptionHandler::Tag(tag, f(label)), 1815 Self::Default(label) => FinalizedMachExceptionHandler::Default(f(label)), 1816 Self::Context(loc) => FinalizedMachExceptionHandler::Context(loc), 1817 } 1818 } 1819 } 1820 1821 /// An item in the exception-handler list for a callsite, with final 1822 /// (lowered) code offsets. Items are interpreted in left-to-right 1823 /// order and the first match wins. 1824 #[derive(Clone, Copy, Debug, PartialEq, Eq)] 1825 #[cfg_attr( 1826 feature = "enable-serde", 1827 derive(serde_derive::Serialize, serde_derive::Deserialize) 1828 )] 1829 pub enum FinalizedMachExceptionHandler { 1830 /// A specific tag (in the current dynamic context) should be 1831 /// handled by the code at the given offset. 1832 Tag(ExceptionTag, CodeOffset), 1833 /// All exceptions should be handled by the code at the given 1834 /// offset. 1835 Default(CodeOffset), 1836 /// The dynamic context for interpreting tags is updated to the 1837 /// value stored in the given machine location (in this frame's 1838 /// context). 1839 Context(ExceptionContextLoc), 1840 } 1841 1842 /// A location for a dynamic exception context value. 1843 #[derive(Clone, Copy, Debug, PartialEq, Eq)] 1844 #[cfg_attr( 1845 feature = "enable-serde", 1846 derive(serde_derive::Serialize, serde_derive::Deserialize) 1847 )] 1848 pub enum ExceptionContextLoc { 1849 /// An offset from SP at the callsite. 1850 SPOffset(u32), 1851 /// A GPR at the callsite. The physical register number for the 1852 /// GPR register file on the target architecture is used. 1853 GPR(u8), 1854 } 1855 1856 /// Metadata about a constant. 1857 struct MachBufferConstant { 1858 /// A label which has not yet been bound which can be used for this 1859 /// constant. 1860 /// 1861 /// This is lazily created when a label is requested for a constant and is 1862 /// cleared when a constant is emitted. 1863 upcoming_label: Option<MachLabel>, 1864 /// Required alignment. 1865 align: CodeOffset, 1866 /// The byte size of this constant. 1867 size: usize, 1868 } 1869 1870 /// A trap that is deferred to the next time an island is emitted for either 1871 /// traps, constants, or fixups. 1872 struct MachLabelTrap { 1873 /// This label will refer to the trap's offset. 1874 label: MachLabel, 1875 /// The code associated with this trap. 1876 code: TrapCode, 1877 /// An optional source location to assign for this trap. 1878 loc: Option<RelSourceLoc>, 1879 } 1880 1881 /// A fixup to perform on the buffer once code is emitted. Fixups always refer 1882 /// to labels and patch the code based on label offsets. Hence, they are like 1883 /// relocations, but internal to one buffer. 1884 #[derive(Debug)] 1885 struct MachLabelFixup<I: VCodeInst> { 1886 /// The label whose offset controls this fixup. 1887 label: MachLabel, 1888 /// The offset to fix up / patch to refer to this label. 1889 offset: CodeOffset, 1890 /// The kind of fixup. This is architecture-specific; each architecture may have, 1891 /// e.g., several types of branch instructions, each with differently-sized 1892 /// offset fields and different places within the instruction to place the 1893 /// bits. 1894 kind: I::LabelUse, 1895 } 1896 1897 impl<I: VCodeInst> MachLabelFixup<I> { 1898 fn deadline(&self) -> CodeOffset { 1899 self.offset.saturating_add(self.kind.max_pos_range()) 1900 } 1901 } 1902 1903 impl<I: VCodeInst> PartialEq for MachLabelFixup<I> { 1904 fn eq(&self, other: &Self) -> bool { 1905 self.deadline() == other.deadline() 1906 } 1907 } 1908 1909 impl<I: VCodeInst> Eq for MachLabelFixup<I> {} 1910 1911 impl<I: VCodeInst> PartialOrd for MachLabelFixup<I> { 1912 fn partial_cmp(&self, other: &Self) -> Option<Ordering> { 1913 Some(self.cmp(other)) 1914 } 1915 } 1916 1917 impl<I: VCodeInst> Ord for MachLabelFixup<I> { 1918 fn cmp(&self, other: &Self) -> Ordering { 1919 other.deadline().cmp(&self.deadline()) 1920 } 1921 } 1922 1923 /// A relocation resulting from a compilation. 1924 #[derive(Clone, Debug, PartialEq)] 1925 #[cfg_attr( 1926 feature = "enable-serde", 1927 derive(serde_derive::Serialize, serde_derive::Deserialize) 1928 )] 1929 pub struct MachRelocBase<T> { 1930 /// The offset at which the relocation applies, *relative to the 1931 /// containing section*. 1932 pub offset: CodeOffset, 1933 /// The kind of relocation. 1934 pub kind: Reloc, 1935 /// The external symbol / name to which this relocation refers. 1936 pub target: T, 1937 /// The addend to add to the symbol value. 1938 pub addend: i64, 1939 } 1940 1941 type MachReloc = MachRelocBase<RelocTarget>; 1942 1943 /// A relocation resulting from a compilation. 1944 pub type FinalizedMachReloc = MachRelocBase<FinalizedRelocTarget>; 1945 1946 /// A Relocation target 1947 #[derive(Debug, Clone, PartialEq, Eq, Hash)] 1948 pub enum RelocTarget { 1949 /// Points to an [ExternalName] outside the current function. 1950 ExternalName(ExternalName), 1951 /// Points to a [MachLabel] inside this function. 1952 /// This is different from [MachLabelFixup] in that both the relocation and the 1953 /// label will be emitted and are only resolved at link time. 1954 /// 1955 /// There is no reason to prefer this over [MachLabelFixup] unless the ABI requires it. 1956 Label(MachLabel), 1957 } 1958 1959 impl From<ExternalName> for RelocTarget { 1960 fn from(name: ExternalName) -> Self { 1961 Self::ExternalName(name) 1962 } 1963 } 1964 1965 impl From<MachLabel> for RelocTarget { 1966 fn from(label: MachLabel) -> Self { 1967 Self::Label(label) 1968 } 1969 } 1970 1971 /// A Relocation target 1972 #[derive(Debug, Clone, PartialEq, Eq, Hash)] 1973 #[cfg_attr( 1974 feature = "enable-serde", 1975 derive(serde_derive::Serialize, serde_derive::Deserialize) 1976 )] 1977 pub enum FinalizedRelocTarget { 1978 /// Points to an [ExternalName] outside the current function. 1979 ExternalName(ExternalName), 1980 /// Points to a [CodeOffset] from the start of the current function. 1981 Func(CodeOffset), 1982 } 1983 1984 impl FinalizedRelocTarget { 1985 /// Returns a display for the current [FinalizedRelocTarget], with extra context to prettify the 1986 /// output. 1987 pub fn display<'a>(&'a self, params: Option<&'a FunctionParameters>) -> String { 1988 match self { 1989 FinalizedRelocTarget::ExternalName(name) => format!("{}", name.display(params)), 1990 FinalizedRelocTarget::Func(offset) => format!("func+{offset}"), 1991 } 1992 } 1993 } 1994 1995 /// A trap record resulting from a compilation. 1996 #[derive(Clone, Debug, PartialEq)] 1997 #[cfg_attr( 1998 feature = "enable-serde", 1999 derive(serde_derive::Serialize, serde_derive::Deserialize) 2000 )] 2001 pub struct MachTrap { 2002 /// The offset at which the trap instruction occurs, *relative to the 2003 /// containing section*. 2004 pub offset: CodeOffset, 2005 /// The trap code. 2006 pub code: TrapCode, 2007 } 2008 2009 /// A call site record resulting from a compilation. 2010 #[derive(Clone, Debug, PartialEq)] 2011 #[cfg_attr( 2012 feature = "enable-serde", 2013 derive(serde_derive::Serialize, serde_derive::Deserialize) 2014 )] 2015 pub struct MachCallSite { 2016 /// The offset of the call's return address, *relative to the 2017 /// start of the buffer*. 2018 pub ret_addr: CodeOffset, 2019 2020 /// Range in `exception_handlers` corresponding to the exception 2021 /// handlers for this callsite. 2022 exception_handler_range: Range<u32>, 2023 } 2024 2025 /// A call site record resulting from a compilation. 2026 #[derive(Clone, Debug, PartialEq)] 2027 pub struct FinalizedMachCallSite<'a> { 2028 /// The offset of the call's return address, *relative to the 2029 /// start of the buffer*. 2030 pub ret_addr: CodeOffset, 2031 2032 /// Exception handlers at this callsite, with target offsets 2033 /// *relative to the start of the buffer*. 2034 pub exception_handlers: &'a [FinalizedMachExceptionHandler], 2035 } 2036 2037 /// A source-location mapping resulting from a compilation. 2038 #[derive(PartialEq, Debug, Clone)] 2039 #[cfg_attr( 2040 feature = "enable-serde", 2041 derive(serde_derive::Serialize, serde_derive::Deserialize) 2042 )] 2043 pub struct MachSrcLoc<T: CompilePhase> { 2044 /// The start of the region of code corresponding to a source location. 2045 /// This is relative to the start of the function, not to the start of the 2046 /// section. 2047 pub start: CodeOffset, 2048 /// The end of the region of code corresponding to a source location. 2049 /// This is relative to the start of the function, not to the start of the 2050 /// section. 2051 pub end: CodeOffset, 2052 /// The source location. 2053 pub loc: T::SourceLocType, 2054 } 2055 2056 impl MachSrcLoc<Stencil> { 2057 fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> { 2058 MachSrcLoc { 2059 start: self.start, 2060 end: self.end, 2061 loc: self.loc.expand(base_srcloc), 2062 } 2063 } 2064 } 2065 2066 /// Record of branch instruction in the buffer, to facilitate editing. 2067 #[derive(Clone, Debug)] 2068 struct MachBranch { 2069 start: CodeOffset, 2070 end: CodeOffset, 2071 target: MachLabel, 2072 fixup: usize, 2073 inverted: Option<SmallVec<[u8; 8]>>, 2074 /// All labels pointing to the start of this branch. For correctness, this 2075 /// *must* be complete (i.e., must contain all labels whose resolved offsets 2076 /// are at the start of this branch): we rely on being able to redirect all 2077 /// labels that could jump to this branch before removing it, if it is 2078 /// otherwise unreachable. 2079 labels_at_this_branch: SmallVec<[MachLabel; 4]>, 2080 } 2081 2082 impl MachBranch { 2083 fn is_cond(&self) -> bool { 2084 self.inverted.is_some() 2085 } 2086 fn is_uncond(&self) -> bool { 2087 self.inverted.is_none() 2088 } 2089 } 2090 2091 /// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`. 2092 /// 2093 /// Note that `MachBuffer` was primarily written for intra-function references 2094 /// of jumps between basic blocks, but it's also quite usable for entire text 2095 /// sections and resolving references between functions themselves. This 2096 /// builder interprets "blocks" as labeled functions for the purposes of 2097 /// resolving labels internally in the buffer. 2098 pub struct MachTextSectionBuilder<I: VCodeInst> { 2099 buf: MachBuffer<I>, 2100 next_func: usize, 2101 force_veneers: ForceVeneers, 2102 } 2103 2104 impl<I: VCodeInst> MachTextSectionBuilder<I> { 2105 /// Creates a new text section builder which will have `num_funcs` functions 2106 /// pushed into it. 2107 pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> { 2108 let mut buf = MachBuffer::new(); 2109 buf.reserve_labels_for_blocks(num_funcs); 2110 MachTextSectionBuilder { 2111 buf, 2112 next_func: 0, 2113 force_veneers: ForceVeneers::No, 2114 } 2115 } 2116 } 2117 2118 impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> { 2119 fn append( 2120 &mut self, 2121 labeled: bool, 2122 func: &[u8], 2123 align: u32, 2124 ctrl_plane: &mut ControlPlane, 2125 ) -> u64 { 2126 // Conditionally emit an island if it's necessary to resolve jumps 2127 // between functions which are too far away. 2128 let size = func.len() as u32; 2129 if self.force_veneers == ForceVeneers::Yes || self.buf.island_needed(size) { 2130 self.buf 2131 .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane); 2132 } 2133 2134 self.buf.align_to(align); 2135 let pos = self.buf.cur_offset(); 2136 if labeled { 2137 self.buf.bind_label( 2138 MachLabel::from_block(BlockIndex::new(self.next_func)), 2139 ctrl_plane, 2140 ); 2141 self.next_func += 1; 2142 } 2143 self.buf.put_data(func); 2144 u64::from(pos) 2145 } 2146 2147 fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool { 2148 crate::trace!( 2149 "Resolving relocation @ {offset:#x} + {addend:#x} to target {target} of kind {reloc:?}" 2150 ); 2151 let label = MachLabel::from_block(BlockIndex::new(target)); 2152 let offset = u32::try_from(offset).unwrap(); 2153 match I::LabelUse::from_reloc(reloc, addend) { 2154 Some(label_use) => { 2155 self.buf.use_label_at_offset(offset, label, label_use); 2156 true 2157 } 2158 None => false, 2159 } 2160 } 2161 2162 fn force_veneers(&mut self) { 2163 self.force_veneers = ForceVeneers::Yes; 2164 } 2165 2166 fn write(&mut self, offset: u64, data: &[u8]) { 2167 self.buf.data[offset.try_into().unwrap()..][..data.len()].copy_from_slice(data); 2168 } 2169 2170 fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> { 2171 // Double-check all functions were pushed. 2172 assert_eq!(self.next_func, self.buf.label_offsets.len()); 2173 2174 // Finish up any veneers, if necessary. 2175 self.buf 2176 .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane); 2177 2178 // We don't need the data any more, so return it to the caller. 2179 mem::take(&mut self.buf.data).into_vec() 2180 } 2181 } 2182 2183 // We use an actual instruction definition to do tests, so we depend on the `arm64` feature here. 2184 #[cfg(all(test, feature = "arm64"))] 2185 mod test { 2186 use cranelift_entity::EntityRef as _; 2187 2188 use super::*; 2189 use crate::ir::UserExternalNameRef; 2190 use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst}; 2191 use crate::isa::aarch64::inst::{OperandSize, xreg}; 2192 use crate::machinst::{MachInstEmit, MachInstEmitState}; 2193 use crate::settings; 2194 2195 fn label(n: u32) -> MachLabel { 2196 MachLabel::from_block(BlockIndex::new(n as usize)) 2197 } 2198 fn target(n: u32) -> BranchTarget { 2199 BranchTarget::Label(label(n)) 2200 } 2201 2202 #[test] 2203 fn test_elide_jump_to_next() { 2204 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2205 let mut buf = MachBuffer::new(); 2206 let mut state = <Inst as MachInstEmit>::State::default(); 2207 let constants = Default::default(); 2208 2209 buf.reserve_labels_for_blocks(2); 2210 buf.bind_label(label(0), state.ctrl_plane_mut()); 2211 let inst = Inst::Jump { dest: target(1) }; 2212 inst.emit(&mut buf, &info, &mut state); 2213 buf.bind_label(label(1), state.ctrl_plane_mut()); 2214 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2215 assert_eq!(0, buf.total_size()); 2216 } 2217 2218 #[test] 2219 fn test_elide_trivial_jump_blocks() { 2220 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2221 let mut buf = MachBuffer::new(); 2222 let mut state = <Inst as MachInstEmit>::State::default(); 2223 let constants = Default::default(); 2224 2225 buf.reserve_labels_for_blocks(4); 2226 2227 buf.bind_label(label(0), state.ctrl_plane_mut()); 2228 let inst = Inst::CondBr { 2229 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64), 2230 taken: target(1), 2231 not_taken: target(2), 2232 }; 2233 inst.emit(&mut buf, &info, &mut state); 2234 2235 buf.bind_label(label(1), state.ctrl_plane_mut()); 2236 let inst = Inst::Jump { dest: target(3) }; 2237 inst.emit(&mut buf, &info, &mut state); 2238 2239 buf.bind_label(label(2), state.ctrl_plane_mut()); 2240 let inst = Inst::Jump { dest: target(3) }; 2241 inst.emit(&mut buf, &info, &mut state); 2242 2243 buf.bind_label(label(3), state.ctrl_plane_mut()); 2244 2245 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2246 assert_eq!(0, buf.total_size()); 2247 } 2248 2249 #[test] 2250 fn test_flip_cond() { 2251 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2252 let mut buf = MachBuffer::new(); 2253 let mut state = <Inst as MachInstEmit>::State::default(); 2254 let constants = Default::default(); 2255 2256 buf.reserve_labels_for_blocks(4); 2257 2258 buf.bind_label(label(0), state.ctrl_plane_mut()); 2259 let inst = Inst::CondBr { 2260 kind: CondBrKind::Zero(xreg(0), OperandSize::Size64), 2261 taken: target(1), 2262 not_taken: target(2), 2263 }; 2264 inst.emit(&mut buf, &info, &mut state); 2265 2266 buf.bind_label(label(1), state.ctrl_plane_mut()); 2267 let inst = Inst::Nop4; 2268 inst.emit(&mut buf, &info, &mut state); 2269 2270 buf.bind_label(label(2), state.ctrl_plane_mut()); 2271 let inst = Inst::Udf { 2272 trap_code: TrapCode::STACK_OVERFLOW, 2273 }; 2274 inst.emit(&mut buf, &info, &mut state); 2275 2276 buf.bind_label(label(3), state.ctrl_plane_mut()); 2277 2278 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2279 2280 let mut buf2 = MachBuffer::new(); 2281 let mut state = Default::default(); 2282 let inst = Inst::TrapIf { 2283 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64), 2284 trap_code: TrapCode::STACK_OVERFLOW, 2285 }; 2286 inst.emit(&mut buf2, &info, &mut state); 2287 let inst = Inst::Nop4; 2288 inst.emit(&mut buf2, &info, &mut state); 2289 2290 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut()); 2291 2292 assert_eq!(buf.data, buf2.data); 2293 } 2294 2295 #[test] 2296 fn test_island() { 2297 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2298 let mut buf = MachBuffer::new(); 2299 let mut state = <Inst as MachInstEmit>::State::default(); 2300 let constants = Default::default(); 2301 2302 buf.reserve_labels_for_blocks(4); 2303 2304 buf.bind_label(label(0), state.ctrl_plane_mut()); 2305 let inst = Inst::CondBr { 2306 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64), 2307 taken: target(2), 2308 not_taken: target(3), 2309 }; 2310 inst.emit(&mut buf, &info, &mut state); 2311 2312 buf.bind_label(label(1), state.ctrl_plane_mut()); 2313 while buf.cur_offset() < 2000000 { 2314 if buf.island_needed(0) { 2315 buf.emit_island(0, state.ctrl_plane_mut()); 2316 } 2317 let inst = Inst::Nop4; 2318 inst.emit(&mut buf, &info, &mut state); 2319 } 2320 2321 buf.bind_label(label(2), state.ctrl_plane_mut()); 2322 let inst = Inst::Nop4; 2323 inst.emit(&mut buf, &info, &mut state); 2324 2325 buf.bind_label(label(3), state.ctrl_plane_mut()); 2326 let inst = Inst::Nop4; 2327 inst.emit(&mut buf, &info, &mut state); 2328 2329 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2330 2331 assert_eq!(2000000 + 8, buf.total_size()); 2332 2333 let mut buf2 = MachBuffer::new(); 2334 let mut state = Default::default(); 2335 let inst = Inst::CondBr { 2336 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64), 2337 2338 // This conditionally taken branch has a 19-bit constant, shifted 2339 // to the left by two, giving us a 21-bit range in total. Half of 2340 // this range positive so the we should be around 1 << 20 bytes 2341 // away for our jump target. 2342 // 2343 // There are two pending fixups by the time we reach this point, 2344 // one for this 19-bit jump and one for the unconditional 26-bit 2345 // jump below. A 19-bit veneer is 4 bytes large and the 26-bit 2346 // veneer is 20 bytes large, which means that pessimistically 2347 // assuming we'll need two veneers. Currently each veneer is 2348 // pessimistically assumed to be the maximal size which means we 2349 // need 40 bytes of extra space, meaning that the actual island 2350 // should come 40-bytes before the deadline. 2351 taken: BranchTarget::ResolvedOffset((1 << 20) - 20 - 20), 2352 2353 // This branch is in-range so no veneers should be needed, it should 2354 // go directly to the target. 2355 not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4), 2356 }; 2357 inst.emit(&mut buf2, &info, &mut state); 2358 2359 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut()); 2360 2361 assert_eq!(&buf.data[0..8], &buf2.data[..]); 2362 } 2363 2364 #[test] 2365 fn test_island_backward() { 2366 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2367 let mut buf = MachBuffer::new(); 2368 let mut state = <Inst as MachInstEmit>::State::default(); 2369 let constants = Default::default(); 2370 2371 buf.reserve_labels_for_blocks(4); 2372 2373 buf.bind_label(label(0), state.ctrl_plane_mut()); 2374 let inst = Inst::Nop4; 2375 inst.emit(&mut buf, &info, &mut state); 2376 2377 buf.bind_label(label(1), state.ctrl_plane_mut()); 2378 let inst = Inst::Nop4; 2379 inst.emit(&mut buf, &info, &mut state); 2380 2381 buf.bind_label(label(2), state.ctrl_plane_mut()); 2382 while buf.cur_offset() < 2000000 { 2383 let inst = Inst::Nop4; 2384 inst.emit(&mut buf, &info, &mut state); 2385 } 2386 2387 buf.bind_label(label(3), state.ctrl_plane_mut()); 2388 let inst = Inst::CondBr { 2389 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64), 2390 taken: target(0), 2391 not_taken: target(1), 2392 }; 2393 inst.emit(&mut buf, &info, &mut state); 2394 2395 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2396 2397 assert_eq!(2000000 + 12, buf.total_size()); 2398 2399 let mut buf2 = MachBuffer::new(); 2400 let mut state = Default::default(); 2401 let inst = Inst::CondBr { 2402 kind: CondBrKind::NotZero(xreg(0), OperandSize::Size64), 2403 taken: BranchTarget::ResolvedOffset(8), 2404 not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)), 2405 }; 2406 inst.emit(&mut buf2, &info, &mut state); 2407 let inst = Inst::Jump { 2408 dest: BranchTarget::ResolvedOffset(-(2000000 + 8)), 2409 }; 2410 inst.emit(&mut buf2, &info, &mut state); 2411 2412 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut()); 2413 2414 assert_eq!(&buf.data[2000000..], &buf2.data[..]); 2415 } 2416 2417 #[test] 2418 fn test_multiple_redirect() { 2419 // label0: 2420 // cbz x0, label1 2421 // b label2 2422 // label1: 2423 // b label3 2424 // label2: 2425 // nop 2426 // nop 2427 // b label0 2428 // label3: 2429 // b label4 2430 // label4: 2431 // b label5 2432 // label5: 2433 // b label7 2434 // label6: 2435 // nop 2436 // label7: 2437 // ret 2438 // 2439 // -- should become: 2440 // 2441 // label0: 2442 // cbz x0, label7 2443 // label2: 2444 // nop 2445 // nop 2446 // b label0 2447 // label6: 2448 // nop 2449 // label7: 2450 // ret 2451 2452 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2453 let mut buf = MachBuffer::new(); 2454 let mut state = <Inst as MachInstEmit>::State::default(); 2455 let constants = Default::default(); 2456 2457 buf.reserve_labels_for_blocks(8); 2458 2459 buf.bind_label(label(0), state.ctrl_plane_mut()); 2460 let inst = Inst::CondBr { 2461 kind: CondBrKind::Zero(xreg(0), OperandSize::Size64), 2462 taken: target(1), 2463 not_taken: target(2), 2464 }; 2465 inst.emit(&mut buf, &info, &mut state); 2466 2467 buf.bind_label(label(1), state.ctrl_plane_mut()); 2468 let inst = Inst::Jump { dest: target(3) }; 2469 inst.emit(&mut buf, &info, &mut state); 2470 2471 buf.bind_label(label(2), state.ctrl_plane_mut()); 2472 let inst = Inst::Nop4; 2473 inst.emit(&mut buf, &info, &mut state); 2474 inst.emit(&mut buf, &info, &mut state); 2475 let inst = Inst::Jump { dest: target(0) }; 2476 inst.emit(&mut buf, &info, &mut state); 2477 2478 buf.bind_label(label(3), state.ctrl_plane_mut()); 2479 let inst = Inst::Jump { dest: target(4) }; 2480 inst.emit(&mut buf, &info, &mut state); 2481 2482 buf.bind_label(label(4), state.ctrl_plane_mut()); 2483 let inst = Inst::Jump { dest: target(5) }; 2484 inst.emit(&mut buf, &info, &mut state); 2485 2486 buf.bind_label(label(5), state.ctrl_plane_mut()); 2487 let inst = Inst::Jump { dest: target(7) }; 2488 inst.emit(&mut buf, &info, &mut state); 2489 2490 buf.bind_label(label(6), state.ctrl_plane_mut()); 2491 let inst = Inst::Nop4; 2492 inst.emit(&mut buf, &info, &mut state); 2493 2494 buf.bind_label(label(7), state.ctrl_plane_mut()); 2495 let inst = Inst::Ret {}; 2496 inst.emit(&mut buf, &info, &mut state); 2497 2498 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2499 2500 let golden_data = vec![ 2501 0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14 2502 0x1f, 0x20, 0x03, 0xd5, // nop 2503 0x1f, 0x20, 0x03, 0xd5, // nop 2504 0xfd, 0xff, 0xff, 0x17, // b 0 2505 0x1f, 0x20, 0x03, 0xd5, // nop 2506 0xc0, 0x03, 0x5f, 0xd6, // ret 2507 ]; 2508 2509 assert_eq!(&golden_data[..], &buf.data[..]); 2510 } 2511 2512 #[test] 2513 fn test_handle_branch_cycle() { 2514 // label0: 2515 // b label1 2516 // label1: 2517 // b label2 2518 // label2: 2519 // b label3 2520 // label3: 2521 // b label4 2522 // label4: 2523 // b label1 // note: not label0 (to make it interesting). 2524 // 2525 // -- should become: 2526 // 2527 // label0, label1, ..., label4: 2528 // b label0 2529 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2530 let mut buf = MachBuffer::new(); 2531 let mut state = <Inst as MachInstEmit>::State::default(); 2532 let constants = Default::default(); 2533 2534 buf.reserve_labels_for_blocks(5); 2535 2536 buf.bind_label(label(0), state.ctrl_plane_mut()); 2537 let inst = Inst::Jump { dest: target(1) }; 2538 inst.emit(&mut buf, &info, &mut state); 2539 2540 buf.bind_label(label(1), state.ctrl_plane_mut()); 2541 let inst = Inst::Jump { dest: target(2) }; 2542 inst.emit(&mut buf, &info, &mut state); 2543 2544 buf.bind_label(label(2), state.ctrl_plane_mut()); 2545 let inst = Inst::Jump { dest: target(3) }; 2546 inst.emit(&mut buf, &info, &mut state); 2547 2548 buf.bind_label(label(3), state.ctrl_plane_mut()); 2549 let inst = Inst::Jump { dest: target(4) }; 2550 inst.emit(&mut buf, &info, &mut state); 2551 2552 buf.bind_label(label(4), state.ctrl_plane_mut()); 2553 let inst = Inst::Jump { dest: target(1) }; 2554 inst.emit(&mut buf, &info, &mut state); 2555 2556 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2557 2558 let golden_data = vec![ 2559 0x00, 0x00, 0x00, 0x14, // b 0 2560 ]; 2561 2562 assert_eq!(&golden_data[..], &buf.data[..]); 2563 } 2564 2565 #[test] 2566 fn metadata_records() { 2567 let mut buf = MachBuffer::<Inst>::new(); 2568 let ctrl_plane = &mut Default::default(); 2569 let constants = Default::default(); 2570 2571 buf.reserve_labels_for_blocks(3); 2572 2573 buf.bind_label(label(0), ctrl_plane); 2574 buf.put1(1); 2575 buf.add_trap(TrapCode::HEAP_OUT_OF_BOUNDS); 2576 buf.put1(2); 2577 buf.add_trap(TrapCode::INTEGER_OVERFLOW); 2578 buf.add_trap(TrapCode::INTEGER_DIVISION_BY_ZERO); 2579 buf.add_try_call_site( 2580 [ 2581 MachExceptionHandler::Tag(ExceptionTag::new(42), label(2)), 2582 MachExceptionHandler::Default(label(1)), 2583 ] 2584 .into_iter(), 2585 ); 2586 buf.add_reloc( 2587 Reloc::Abs4, 2588 &ExternalName::User(UserExternalNameRef::new(0)), 2589 0, 2590 ); 2591 buf.put1(3); 2592 buf.add_reloc( 2593 Reloc::Abs8, 2594 &ExternalName::User(UserExternalNameRef::new(1)), 2595 1, 2596 ); 2597 buf.put1(4); 2598 buf.bind_label(label(1), ctrl_plane); 2599 buf.put1(0xff); 2600 buf.bind_label(label(2), ctrl_plane); 2601 buf.put1(0xff); 2602 2603 let buf = buf.finish(&constants, ctrl_plane); 2604 2605 assert_eq!(buf.data(), &[1, 2, 3, 4, 0xff, 0xff]); 2606 assert_eq!( 2607 buf.traps() 2608 .iter() 2609 .map(|trap| (trap.offset, trap.code)) 2610 .collect::<Vec<_>>(), 2611 vec![ 2612 (1, TrapCode::HEAP_OUT_OF_BOUNDS), 2613 (2, TrapCode::INTEGER_OVERFLOW), 2614 (2, TrapCode::INTEGER_DIVISION_BY_ZERO) 2615 ] 2616 ); 2617 let call_sites: Vec<_> = buf.call_sites().collect(); 2618 assert_eq!(call_sites[0].ret_addr, 2); 2619 assert_eq!( 2620 call_sites[0].exception_handlers, 2621 &[ 2622 FinalizedMachExceptionHandler::Tag(ExceptionTag::new(42), 5), 2623 FinalizedMachExceptionHandler::Default(4) 2624 ], 2625 ); 2626 assert_eq!( 2627 buf.relocs() 2628 .iter() 2629 .map(|reloc| (reloc.offset, reloc.kind)) 2630 .collect::<Vec<_>>(), 2631 vec![(2, Reloc::Abs4), (3, Reloc::Abs8)] 2632 ); 2633 } 2634 } 2635