1 //! In-memory representation of compiled machine code, with labels and fixups to 2 //! refer to those labels. Handles constant-pool island insertion and also 3 //! veneer insertion for out-of-range jumps. 4 //! 5 //! This code exists to solve three problems: 6 //! 7 //! - Branch targets for forward branches are not known until later, when we 8 //! emit code in a single pass through the instruction structs. 9 //! 10 //! - On many architectures, address references or offsets have limited range. 11 //! For example, on AArch64, conditional branches can only target code +/- 1MB 12 //! from the branch itself. 13 //! 14 //! - The lowering of control flow from the CFG-with-edges produced by 15 //! [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty 16 //! edge blocks when the register allocator does not need to insert any 17 //! spills/reloads/moves in edge blocks, results in many suboptimal branch 18 //! patterns. The lowering also pays no attention to block order, and so 19 //! two-target conditional forms (cond-br followed by uncond-br) can often by 20 //! avoided because one of the targets is the fallthrough. There are several 21 //! cases here where we can simplify to use fewer branches. 22 //! 23 //! This "buffer" implements a single-pass code emission strategy (with a later 24 //! "fixup" pass, but only through recorded fixups, not all instructions). The 25 //! basic idea is: 26 //! 27 //! - Emit branches as they are, including two-target (cond/uncond) compound 28 //! forms, but with zero offsets and optimistically assuming the target will be 29 //! in range. Record the "fixup" for later. Targets are denoted instead by 30 //! symbolic "labels" that are then bound to certain offsets in the buffer as 31 //! we emit code. (Nominally, there is a label at the start of every basic 32 //! block.) 33 //! 34 //! - As we do this, track the offset in the buffer at which the first label 35 //! reference "goes out of range". We call this the "deadline". If we reach the 36 //! deadline and we still have not bound the label to which an unresolved branch 37 //! refers, we have a problem! 38 //! 39 //! - To solve this problem, we emit "islands" full of "veneers". An island is 40 //! simply a chunk of code inserted in the middle of the code actually produced 41 //! by the emitter (e.g., vcode iterating over instruction structs). The emitter 42 //! has some awareness of this: it either asks for an island between blocks, so 43 //! it is not accidentally executed, or else it emits a branch around the island 44 //! when all other options fail (see `Inst::EmitIsland` meta-instruction). 45 //! 46 //! - A "veneer" is an instruction (or sequence of instructions) in an "island" 47 //! that implements a longer-range reference to a label. The idea is that, for 48 //! example, a branch with a limited range can branch to a "veneer" instead, 49 //! which is simply a branch in a form that can use a longer-range reference. On 50 //! AArch64, for example, conditionals have a +/- 1 MB range, but a conditional 51 //! can branch to an unconditional branch which has a +/- 128 MB range. Hence, a 52 //! conditional branch's label reference can be fixed up with a "veneer" to 53 //! achieve a longer range. 54 //! 55 //! - To implement all of this, we require the backend to provide a `LabelUse` 56 //! type that implements a trait. This is nominally an enum that records one of 57 //! several kinds of references to an offset in code -- basically, a relocation 58 //! type -- and will usually correspond to different instruction formats. The 59 //! `LabelUse` implementation specifies the maximum range, how to patch in the 60 //! actual label location when known, and how to generate a veneer to extend the 61 //! range. 62 //! 63 //! That satisfies label references, but we still may have suboptimal branch 64 //! patterns. To clean up the branches, we do a simple "peephole"-style 65 //! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`) 66 //! informs the buffer of branches in the code and, in the case of conditionals, 67 //! the code that would have been emitted to invert this branch's condition. We 68 //! track the "latest branches": these are branches that are contiguous up to 69 //! the current offset. (If any code is emitted after a branch, that branch or 70 //! run of contiguous branches is no longer "latest".) The latest branches are 71 //! those that we can edit by simply truncating the buffer and doing something 72 //! else instead. 73 //! 74 //! To optimize branches, we implement several simple rules, and try to apply 75 //! them to the "latest branches" when possible: 76 //! 77 //! - A branch with a label target, when that label is bound to the ending 78 //! offset of the branch (the fallthrough location), can be removed altogether, 79 //! because the branch would have no effect). 80 //! 81 //! - An unconditional branch that starts at a label location, and branches to 82 //! another label, results in a "label alias": all references to the label bound 83 //! *to* this branch instruction are instead resolved to the *target* of the 84 //! branch instruction. This effectively removes empty blocks that just 85 //! unconditionally branch to the next block. We call this "branch threading". 86 //! 87 //! - A conditional followed by an unconditional, when the conditional branches 88 //! to the unconditional's fallthrough, results in (i) the truncation of the 89 //! unconditional, (ii) the inversion of the condition's condition, and (iii) 90 //! replacement of the conditional's target (using the original target of the 91 //! unconditional). This is a fancy way of saying "we can flip a two-target 92 //! conditional branch's taken/not-taken targets if it works better with our 93 //! fallthrough". To make this work, the emitter actually gives the buffer 94 //! *both* forms of every conditional branch: the true form is emitted into the 95 //! buffer, and the "inverted" machine-code bytes are provided as part of the 96 //! branch-fixup metadata. 97 //! 98 //! - An unconditional B preceded by another unconditional P, when B's label(s) have 99 //! been redirected to target(B), can be removed entirely. This is an extension 100 //! of the branch-threading optimization, and is valid because if we know there 101 //! will be no fallthrough into this branch instruction (the prior instruction 102 //! is an unconditional jump), and if we know we have successfully redirected 103 //! all labels, then this branch instruction is unreachable. Note that this 104 //! works because the redirection happens before the label is ever resolved 105 //! (fixups happen at island emission time, at which point latest-branches are 106 //! cleared, or at the end of emission), so we are sure to catch and redirect 107 //! all possible paths to this instruction. 108 //! 109 //! # Branch-optimization Correctness 110 //! 111 //! The branch-optimization mechanism depends on a few data structures with 112 //! invariants, which are always held outside the scope of top-level public 113 //! methods: 114 //! 115 //! - The latest-branches list. Each entry describes a span of the buffer 116 //! (start/end offsets), the label target, the corresponding fixup-list entry 117 //! index, and the bytes (must be the same length) for the inverted form, if 118 //! conditional. The list of labels that are bound to the start-offset of this 119 //! branch is *complete* (if any label has a resolved offset equal to `start` 120 //! and is not an alias, it must appear in this list) and *precise* (no label 121 //! in this list can be bound to another offset). No label in this list should 122 //! be an alias. No two branch ranges can overlap, and branches are in 123 //! ascending-offset order. 124 //! 125 //! - The labels-at-tail list. This contains all MachLabels that have been bound 126 //! to (whose resolved offsets are equal to) the tail offset of the buffer. 127 //! No label in this list should be an alias. 128 //! 129 //! - The label_offsets array, containing the bound offset of a label or 130 //! UNKNOWN. No label can be bound at an offset greater than the current 131 //! buffer tail. 132 //! 133 //! - The label_aliases array, containing another label to which a label is 134 //! bound or UNKNOWN. A label's resolved offset is the resolved offset 135 //! of the label it is aliased to, if this is set. 136 //! 137 //! We argue below, at each method, how the invariants in these data structures 138 //! are maintained (grep for "Post-invariant"). 139 //! 140 //! Given these invariants, we argue why each optimization preserves execution 141 //! semantics below (grep for "Preserves execution semantics"). 142 143 use crate::binemit::{Addend, CodeOffset, Reloc, StackMap}; 144 use crate::ir::{ExternalName, Opcode, RelSourceLoc, SourceLoc, TrapCode}; 145 use crate::isa::unwind::UnwindInst; 146 use crate::machinst::{ 147 BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst, 148 }; 149 use crate::timing; 150 use crate::trace; 151 use cranelift_control::ControlPlane; 152 use cranelift_entity::{entity_impl, PrimaryMap}; 153 use smallvec::SmallVec; 154 use std::convert::TryFrom; 155 use std::mem; 156 use std::string::String; 157 use std::vec::Vec; 158 159 #[cfg(feature = "enable-serde")] 160 use serde::{Deserialize, Serialize}; 161 162 #[cfg(feature = "enable-serde")] 163 pub trait CompilePhase { 164 type MachSrcLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone; 165 type SourceLocType: for<'a> Deserialize<'a> + Serialize + core::fmt::Debug + PartialEq + Clone; 166 } 167 168 #[cfg(not(feature = "enable-serde"))] 169 pub trait CompilePhase { 170 type MachSrcLocType: core::fmt::Debug + PartialEq + Clone; 171 type SourceLocType: core::fmt::Debug + PartialEq + Clone; 172 } 173 174 /// Status of a compiled artifact that needs patching before being used. 175 #[derive(Clone, Debug, PartialEq)] 176 #[cfg_attr(feature = "enable-serde", derive(Serialize, Deserialize))] 177 pub struct Stencil; 178 179 /// Status of a compiled artifact ready to use. 180 #[derive(Clone, Debug, PartialEq)] 181 pub struct Final; 182 183 impl CompilePhase for Stencil { 184 type MachSrcLocType = MachSrcLoc<Stencil>; 185 type SourceLocType = RelSourceLoc; 186 } 187 188 impl CompilePhase for Final { 189 type MachSrcLocType = MachSrcLoc<Final>; 190 type SourceLocType = SourceLoc; 191 } 192 193 /// A buffer of output to be produced, fixed up, and then emitted to a CodeSink 194 /// in bulk. 195 /// 196 /// This struct uses `SmallVec`s to support small-ish function bodies without 197 /// any heap allocation. As such, it will be several kilobytes large. This is 198 /// likely fine as long as it is stack-allocated for function emission then 199 /// thrown away; but beware if many buffer objects are retained persistently. 200 pub struct MachBuffer<I: VCodeInst> { 201 /// The buffer contents, as raw bytes. 202 data: SmallVec<[u8; 1024]>, 203 /// Any relocations referring to this code. Note that only *external* 204 /// relocations are tracked here; references to labels within the buffer are 205 /// resolved before emission. 206 relocs: SmallVec<[MachReloc; 16]>, 207 /// Any trap records referring to this code. 208 traps: SmallVec<[MachTrap; 16]>, 209 /// Any call site records referring to this code. 210 call_sites: SmallVec<[MachCallSite; 16]>, 211 /// Any source location mappings referring to this code. 212 srclocs: SmallVec<[MachSrcLoc<Stencil>; 64]>, 213 /// Any stack maps referring to this code. 214 stack_maps: SmallVec<[MachStackMap; 8]>, 215 /// Any unwind info at a given location. 216 unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>, 217 /// The current source location in progress (after `start_srcloc()` and 218 /// before `end_srcloc()`). This is a (start_offset, src_loc) tuple. 219 cur_srcloc: Option<(CodeOffset, RelSourceLoc)>, 220 /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown. 221 label_offsets: SmallVec<[CodeOffset; 16]>, 222 /// Label aliases: when one label points to an unconditional jump, and that 223 /// jump points to another label, we can redirect references to the first 224 /// label immediately to the second. 225 /// 226 /// Invariant: we don't have label-alias cycles. We ensure this by, 227 /// before setting label A to alias label B, resolving B's alias 228 /// target (iteratively until a non-aliased label); if B is already 229 /// aliased to A, then we cannot alias A back to B. 230 label_aliases: SmallVec<[MachLabel; 16]>, 231 /// Constants that must be emitted at some point. 232 pending_constants: SmallVec<[VCodeConstant; 16]>, 233 /// Traps that must be emitted at some point. 234 pending_traps: SmallVec<[MachLabelTrap; 16]>, 235 /// Fixups that must be performed after all code is emitted. 236 fixup_records: SmallVec<[MachLabelFixup<I>; 16]>, 237 /// Current deadline at which all constants are flushed and all code labels 238 /// are extended by emitting long-range jumps in an island. This flush 239 /// should be rare (e.g., on AArch64, the shortest-range PC-rel references 240 /// are +/- 1MB for conditional jumps and load-literal instructions), so 241 /// it's acceptable to track a minimum and flush-all rather than doing more 242 /// detailed "current minimum" / sort-by-deadline trickery. 243 island_deadline: CodeOffset, 244 /// How many bytes are needed in the worst case for an island, given all 245 /// pending constants and fixups. 246 island_worst_case_size: CodeOffset, 247 /// Latest branches, to facilitate in-place editing for better fallthrough 248 /// behavior and empty-block removal. 249 latest_branches: SmallVec<[MachBranch; 4]>, 250 /// All labels at the current offset (emission tail). This is lazily 251 /// cleared: it is actually accurate as long as the current offset is 252 /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should 253 /// be considered as empty. 254 /// 255 /// For correctness, this *must* be complete (i.e., the vector must contain 256 /// all labels whose offsets are resolved to the current tail), because we 257 /// rely on it to update labels when we truncate branches. 258 labels_at_tail: SmallVec<[MachLabel; 4]>, 259 /// The last offset at which `labels_at_tail` is valid. It is conceptually 260 /// always describing the tail of the buffer, but we do not clear 261 /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it 262 /// when the offset has grown past this (`labels_at_tail_off`) point. 263 /// Always <= `cur_offset()`. 264 labels_at_tail_off: CodeOffset, 265 /// Metadata about all constants that this function has access to. 266 /// 267 /// This records the size/alignment of all constants (not the actual data) 268 /// along with the last available label generated for the constant. This map 269 /// is consulted when constants are referred to and the label assigned to a 270 /// constant may change over time as well. 271 constants: PrimaryMap<VCodeConstant, MachBufferConstant>, 272 /// All recorded usages of constants as pairs of the constant and where the 273 /// constant needs to be placed within `self.data`. Note that the same 274 /// constant may appear in this array multiple times if it was emitted 275 /// multiple times. 276 used_constants: SmallVec<[(VCodeConstant, CodeOffset); 4]>, 277 } 278 279 impl MachBufferFinalized<Stencil> { 280 /// Get a finalized machine buffer by applying the function's base source location. 281 pub fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachBufferFinalized<Final> { 282 MachBufferFinalized { 283 data: self.data, 284 relocs: self.relocs, 285 traps: self.traps, 286 call_sites: self.call_sites, 287 srclocs: self 288 .srclocs 289 .into_iter() 290 .map(|srcloc| srcloc.apply_base_srcloc(base_srcloc)) 291 .collect(), 292 stack_maps: self.stack_maps, 293 unwind_info: self.unwind_info, 294 alignment: self.alignment, 295 } 296 } 297 } 298 299 /// A `MachBuffer` once emission is completed: holds generated code and records, 300 /// without fixups. This allows the type to be independent of the backend. 301 #[derive(PartialEq, Debug, Clone)] 302 #[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))] 303 pub struct MachBufferFinalized<T: CompilePhase> { 304 /// The buffer contents, as raw bytes. 305 pub(crate) data: SmallVec<[u8; 1024]>, 306 /// Any relocations referring to this code. Note that only *external* 307 /// relocations are tracked here; references to labels within the buffer are 308 /// resolved before emission. 309 pub(crate) relocs: SmallVec<[MachReloc; 16]>, 310 /// Any trap records referring to this code. 311 pub(crate) traps: SmallVec<[MachTrap; 16]>, 312 /// Any call site records referring to this code. 313 pub(crate) call_sites: SmallVec<[MachCallSite; 16]>, 314 /// Any source location mappings referring to this code. 315 pub(crate) srclocs: SmallVec<[T::MachSrcLocType; 64]>, 316 /// Any stack maps referring to this code. 317 pub(crate) stack_maps: SmallVec<[MachStackMap; 8]>, 318 /// Any unwind info at a given location. 319 pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>, 320 /// The requireed alignment of this buffer 321 pub alignment: u32, 322 } 323 324 const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff; 325 const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff); 326 327 /// Threshold on max length of `labels_at_this_branch` list to avoid 328 /// unbounded quadratic behavior (see comment below at use-site). 329 const LABEL_LIST_THRESHOLD: usize = 100; 330 331 /// A label refers to some offset in a `MachBuffer`. It may not be resolved at 332 /// the point at which it is used by emitted code; the buffer records "fixups" 333 /// for references to the label, and will come back and patch the code 334 /// appropriately when the label's location is eventually known. 335 #[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)] 336 pub struct MachLabel(u32); 337 entity_impl!(MachLabel); 338 339 impl MachLabel { 340 /// Get a label for a block. (The first N MachLabels are always reseved for 341 /// the N blocks in the vcode.) 342 pub fn from_block(bindex: BlockIndex) -> MachLabel { 343 MachLabel(bindex.index() as u32) 344 } 345 346 /// Get the numeric label index. 347 pub fn get(self) -> u32 { 348 self.0 349 } 350 351 /// Creates a string representing this label, for convenience. 352 pub fn to_string(&self) -> String { 353 format!("label{}", self.0) 354 } 355 } 356 357 impl Default for MachLabel { 358 fn default() -> Self { 359 UNKNOWN_LABEL 360 } 361 } 362 363 /// A stack map extent, when creating a stack map. 364 pub enum StackMapExtent { 365 /// The stack map starts at this instruction, and ends after the number of upcoming bytes 366 /// (note: this is a code offset diff). 367 UpcomingBytes(CodeOffset), 368 369 /// The stack map started at the given offset and ends at the current one. This helps 370 /// architectures where the instruction size has not a fixed length. 371 StartedAtOffset(CodeOffset), 372 } 373 374 impl<I: VCodeInst> MachBuffer<I> { 375 /// Create a new section, known to start at `start_offset` and with a size limited to 376 /// `length_limit`. 377 pub fn new() -> MachBuffer<I> { 378 MachBuffer { 379 data: SmallVec::new(), 380 relocs: SmallVec::new(), 381 traps: SmallVec::new(), 382 call_sites: SmallVec::new(), 383 srclocs: SmallVec::new(), 384 stack_maps: SmallVec::new(), 385 unwind_info: SmallVec::new(), 386 cur_srcloc: None, 387 label_offsets: SmallVec::new(), 388 label_aliases: SmallVec::new(), 389 pending_constants: SmallVec::new(), 390 pending_traps: SmallVec::new(), 391 fixup_records: SmallVec::new(), 392 island_deadline: UNKNOWN_LABEL_OFFSET, 393 island_worst_case_size: 0, 394 latest_branches: SmallVec::new(), 395 labels_at_tail: SmallVec::new(), 396 labels_at_tail_off: 0, 397 constants: Default::default(), 398 used_constants: Default::default(), 399 } 400 } 401 402 /// Current offset from start of buffer. 403 pub fn cur_offset(&self) -> CodeOffset { 404 self.data.len() as CodeOffset 405 } 406 407 /// Add a byte. 408 pub fn put1(&mut self, value: u8) { 409 trace!("MachBuffer: put byte @ {}: {:x}", self.cur_offset(), value); 410 self.data.push(value); 411 412 // Post-invariant: conceptual-labels_at_tail contains a complete and 413 // precise list of labels bound at `cur_offset()`. We have advanced 414 // `cur_offset()`, hence if it had been equal to `labels_at_tail_off` 415 // before, it is not anymore (and it cannot become equal, because 416 // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is 417 // conceptually empty (even though it is only lazily cleared). No labels 418 // can be bound at this new offset (by invariant on `label_offsets`). 419 // Hence the invariant holds. 420 } 421 422 /// Add 2 bytes. 423 pub fn put2(&mut self, value: u16) { 424 trace!( 425 "MachBuffer: put 16-bit word @ {}: {:x}", 426 self.cur_offset(), 427 value 428 ); 429 let bytes = value.to_le_bytes(); 430 self.data.extend_from_slice(&bytes[..]); 431 432 // Post-invariant: as for `put1()`. 433 } 434 435 /// Add 4 bytes. 436 pub fn put4(&mut self, value: u32) { 437 trace!( 438 "MachBuffer: put 32-bit word @ {}: {:x}", 439 self.cur_offset(), 440 value 441 ); 442 let bytes = value.to_le_bytes(); 443 self.data.extend_from_slice(&bytes[..]); 444 445 // Post-invariant: as for `put1()`. 446 } 447 448 /// Add 8 bytes. 449 pub fn put8(&mut self, value: u64) { 450 trace!( 451 "MachBuffer: put 64-bit word @ {}: {:x}", 452 self.cur_offset(), 453 value 454 ); 455 let bytes = value.to_le_bytes(); 456 self.data.extend_from_slice(&bytes[..]); 457 458 // Post-invariant: as for `put1()`. 459 } 460 461 /// Add a slice of bytes. 462 pub fn put_data(&mut self, data: &[u8]) { 463 trace!( 464 "MachBuffer: put data @ {}: len {}", 465 self.cur_offset(), 466 data.len() 467 ); 468 self.data.extend_from_slice(data); 469 470 // Post-invariant: as for `put1()`. 471 } 472 473 /// Reserve appended space and return a mutable slice referring to it. 474 pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] { 475 trace!("MachBuffer: put data @ {}: len {}", self.cur_offset(), len); 476 let off = self.data.len(); 477 let new_len = self.data.len() + len; 478 self.data.resize(new_len, 0); 479 &mut self.data[off..] 480 481 // Post-invariant: as for `put1()`. 482 } 483 484 /// Align up to the given alignment. 485 pub fn align_to(&mut self, align_to: CodeOffset) { 486 trace!("MachBuffer: align to {}", align_to); 487 assert!( 488 align_to.is_power_of_two(), 489 "{} is not a power of two", 490 align_to 491 ); 492 while self.cur_offset() & (align_to - 1) != 0 { 493 self.put1(0); 494 } 495 496 // Post-invariant: as for `put1()`. 497 } 498 499 /// Allocate a `Label` to refer to some offset. May not be bound to a fixed 500 /// offset yet. 501 pub fn get_label(&mut self) -> MachLabel { 502 let l = self.label_offsets.len() as u32; 503 self.label_offsets.push(UNKNOWN_LABEL_OFFSET); 504 self.label_aliases.push(UNKNOWN_LABEL); 505 trace!("MachBuffer: new label -> {:?}", MachLabel(l)); 506 MachLabel(l) 507 508 // Post-invariant: the only mutation is to add a new label; it has no 509 // bound offset yet, so it trivially satisfies all invariants. 510 } 511 512 /// Reserve the first N MachLabels for blocks. 513 pub fn reserve_labels_for_blocks(&mut self, blocks: usize) { 514 trace!("MachBuffer: first {} labels are for blocks", blocks); 515 debug_assert!(self.label_offsets.is_empty()); 516 self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET); 517 self.label_aliases.resize(blocks, UNKNOWN_LABEL); 518 519 // Post-invariant: as for `get_label()`. 520 } 521 522 /// Registers metadata in this `MachBuffer` about the `constants` provided. 523 /// 524 /// This will record the size/alignment of all constants which will prepare 525 /// them for emission later on. 526 pub fn register_constants(&mut self, constants: &VCodeConstants) { 527 for (c, val) in constants.iter() { 528 let c2 = self.constants.push(MachBufferConstant { 529 upcoming_label: None, 530 align: val.alignment(), 531 size: val.as_slice().len(), 532 }); 533 assert_eq!(c, c2); 534 } 535 } 536 537 /// Completes constant emission by iterating over `self.used_constants` and 538 /// filling in the "holes" with the constant values provided by `constants`. 539 /// 540 /// Returns the alignment required for this entire buffer. Alignment starts 541 /// at the ISA's minimum function alignment and can be increased due to 542 /// constant requirements. 543 fn finish_constants(&mut self, constants: &VCodeConstants) -> u32 { 544 let mut alignment = I::function_alignment().minimum; 545 for (constant, offset) in mem::take(&mut self.used_constants) { 546 let constant = constants.get(constant); 547 let data = constant.as_slice(); 548 self.data[offset as usize..][..data.len()].copy_from_slice(data); 549 alignment = constant.alignment().max(alignment); 550 } 551 alignment 552 } 553 554 /// Returns a label that can be used to refer to the `constant` provided. 555 /// 556 /// This will automatically defer a new constant to be emitted for 557 /// `constant` if it has not been previously emitted. Note that this 558 /// function may return a different label for the same constant at 559 /// different points in time. The label is valid to use only from the 560 /// current location; the MachBuffer takes care to emit the same constant 561 /// multiple times if needed so the constant is always in range. 562 pub fn get_label_for_constant(&mut self, constant: VCodeConstant) -> MachLabel { 563 let MachBufferConstant { 564 align, 565 size, 566 upcoming_label, 567 } = self.constants[constant]; 568 if let Some(label) = upcoming_label { 569 return label; 570 } 571 572 let label = self.get_label(); 573 trace!( 574 "defer constant: eventually emit {size} bytes aligned \ 575 to {align} at label {label:?}", 576 ); 577 self.update_deadline(size, u32::MAX); 578 self.pending_constants.push(constant); 579 self.constants[constant].upcoming_label = Some(label); 580 label 581 } 582 583 /// Bind a label to the current offset. A label can only be bound once. 584 pub fn bind_label(&mut self, label: MachLabel, ctrl_plane: &mut ControlPlane) { 585 trace!( 586 "MachBuffer: bind label {:?} at offset {}", 587 label, 588 self.cur_offset() 589 ); 590 debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET); 591 debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL); 592 let offset = self.cur_offset(); 593 self.label_offsets[label.0 as usize] = offset; 594 self.lazily_clear_labels_at_tail(); 595 self.labels_at_tail.push(label); 596 597 // Invariants hold: bound offset of label is <= cur_offset (in fact it 598 // is equal). If the `labels_at_tail` list was complete and precise 599 // before, it is still, because we have bound this label to the current 600 // offset and added it to the list (which contains all labels at the 601 // current offset). 602 603 self.optimize_branches(ctrl_plane); 604 605 // Post-invariant: by `optimize_branches()` (see argument there). 606 } 607 608 /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the 609 /// offset that it applies to. 610 fn lazily_clear_labels_at_tail(&mut self) { 611 let offset = self.cur_offset(); 612 if offset > self.labels_at_tail_off { 613 self.labels_at_tail_off = offset; 614 self.labels_at_tail.clear(); 615 } 616 617 // Post-invariant: either labels_at_tail_off was at cur_offset, and 618 // state is untouched, or was less than cur_offset, in which case the 619 // labels_at_tail list was conceptually empty, and is now actually 620 // empty. 621 } 622 623 /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`. 624 pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset { 625 let mut iters = 0; 626 while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL { 627 label = self.label_aliases[label.0 as usize]; 628 // To protect against an infinite loop (despite our assurances to 629 // ourselves that the invariants make this impossible), assert out 630 // after 1M iterations. The number of basic blocks is limited 631 // in most contexts anyway so this should be impossible to hit with 632 // a legitimate input. 633 iters += 1; 634 assert!(iters < 1_000_000, "Unexpected cycle in label aliases"); 635 } 636 self.label_offsets[label.0 as usize] 637 638 // Post-invariant: no mutations. 639 } 640 641 /// Emit a reference to the given label with the given reference type (i.e., 642 /// branch-instruction format) at the current offset. This is like a 643 /// relocation, but handled internally. 644 /// 645 /// This can be called before the branch is actually emitted; fixups will 646 /// not happen until an island is emitted or the buffer is finished. 647 pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) { 648 trace!( 649 "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}", 650 offset, 651 label, 652 kind 653 ); 654 655 // Add the fixup, and update the worst-case island size based on a 656 // veneer for this label use. 657 self.fixup_records.push(MachLabelFixup { 658 label, 659 offset, 660 kind, 661 }); 662 if kind.supports_veneer() { 663 self.island_worst_case_size += kind.veneer_size(); 664 self.island_worst_case_size &= !(I::LabelUse::ALIGN - 1); 665 } 666 let deadline = offset.saturating_add(kind.max_pos_range()); 667 if deadline < self.island_deadline { 668 self.island_deadline = deadline; 669 } 670 671 // Post-invariant: no mutations to branches/labels data structures. 672 } 673 674 /// Inform the buffer of an unconditional branch at the given offset, 675 /// targetting the given label. May be used to optimize branches. 676 /// The last added label-use must correspond to this branch. 677 /// This must be called when the current offset is equal to `start`; i.e., 678 /// before actually emitting the branch. This implies that for a branch that 679 /// uses a label and is eligible for optimizations by the MachBuffer, the 680 /// proper sequence is: 681 /// 682 /// - Call `use_label_at_offset()` to emit the fixup record. 683 /// - Call `add_uncond_branch()` to make note of the branch. 684 /// - Emit the bytes for the branch's machine code. 685 /// 686 /// Additional requirement: no labels may be bound between `start` and `end` 687 /// (exclusive on both ends). 688 pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) { 689 assert!(self.cur_offset() == start); 690 debug_assert!(end > start); 691 assert!(!self.fixup_records.is_empty()); 692 let fixup = self.fixup_records.len() - 1; 693 self.lazily_clear_labels_at_tail(); 694 self.latest_branches.push(MachBranch { 695 start, 696 end, 697 target, 698 fixup, 699 inverted: None, 700 labels_at_this_branch: self.labels_at_tail.clone(), 701 }); 702 703 // Post-invariant: we asserted branch start is current tail; the list of 704 // labels at branch is cloned from list of labels at current tail. 705 } 706 707 /// Inform the buffer of a conditional branch at the given offset, 708 /// targetting the given label. May be used to optimize branches. 709 /// The last added label-use must correspond to this branch. 710 /// 711 /// Additional requirement: no labels may be bound between `start` and `end` 712 /// (exclusive on both ends). 713 pub fn add_cond_branch( 714 &mut self, 715 start: CodeOffset, 716 end: CodeOffset, 717 target: MachLabel, 718 inverted: &[u8], 719 ) { 720 assert!(self.cur_offset() == start); 721 debug_assert!(end > start); 722 assert!(!self.fixup_records.is_empty()); 723 debug_assert!(inverted.len() == (end - start) as usize); 724 let fixup = self.fixup_records.len() - 1; 725 let inverted = Some(SmallVec::from(inverted)); 726 self.lazily_clear_labels_at_tail(); 727 self.latest_branches.push(MachBranch { 728 start, 729 end, 730 target, 731 fixup, 732 inverted, 733 labels_at_this_branch: self.labels_at_tail.clone(), 734 }); 735 736 // Post-invariant: we asserted branch start is current tail; labels at 737 // branch list is cloned from list of labels at current tail. 738 } 739 740 fn truncate_last_branch(&mut self) { 741 self.lazily_clear_labels_at_tail(); 742 // Invariants hold at this point. 743 744 let b = self.latest_branches.pop().unwrap(); 745 assert!(b.end == self.cur_offset()); 746 747 // State: 748 // [PRE CODE] 749 // Offset b.start, b.labels_at_this_branch: 750 // [BRANCH CODE] 751 // cur_off, self.labels_at_tail --> 752 // (end of buffer) 753 self.data.truncate(b.start as usize); 754 self.fixup_records.truncate(b.fixup); 755 while let Some(last_srcloc) = self.srclocs.last_mut() { 756 if last_srcloc.end <= b.start { 757 break; 758 } 759 if last_srcloc.start < b.start { 760 last_srcloc.end = b.start; 761 break; 762 } 763 self.srclocs.pop(); 764 } 765 // State: 766 // [PRE CODE] 767 // cur_off, Offset b.start, b.labels_at_this_branch: 768 // (end of buffer) 769 // 770 // self.labels_at_tail --> (past end of buffer) 771 let cur_off = self.cur_offset(); 772 self.labels_at_tail_off = cur_off; 773 // State: 774 // [PRE CODE] 775 // cur_off, Offset b.start, b.labels_at_this_branch, 776 // self.labels_at_tail: 777 // (end of buffer) 778 // 779 // resolve_label_offset(l) for l in labels_at_tail: 780 // (past end of buffer) 781 782 trace!( 783 "truncate_last_branch: truncated {:?}; off now {}", 784 b, 785 cur_off 786 ); 787 788 // Fix up resolved label offsets for labels at tail. 789 for &l in &self.labels_at_tail { 790 self.label_offsets[l.0 as usize] = cur_off; 791 } 792 // Old labels_at_this_branch are now at cur_off. 793 self.labels_at_tail 794 .extend(b.labels_at_this_branch.into_iter()); 795 796 // Post-invariant: this operation is defined to truncate the buffer, 797 // which moves cur_off backward, and to move labels at the end of the 798 // buffer back to the start-of-branch offset. 799 // 800 // latest_branches satisfies all invariants: 801 // - it has no branches past the end of the buffer (branches are in 802 // order, we removed the last one, and we truncated the buffer to just 803 // before the start of that branch) 804 // - no labels were moved to lower offsets than the (new) cur_off, so 805 // the labels_at_this_branch list for any other branch need not change. 806 // 807 // labels_at_tail satisfies all invariants: 808 // - all labels that were at the tail after the truncated branch are 809 // moved backward to just before the branch, which becomes the new tail; 810 // thus every element in the list should remain (ensured by `.extend()` 811 // above). 812 // - all labels that refer to the new tail, which is the start-offset of 813 // the truncated branch, must be present. The `labels_at_this_branch` 814 // list in the truncated branch's record is a complete and precise list 815 // of exactly these labels; we append these to labels_at_tail. 816 // - labels_at_tail_off is at cur_off after truncation occurs, so the 817 // list is valid (not to be lazily cleared). 818 // 819 // The stated operation was performed: 820 // - For each label at the end of the buffer prior to this method, it 821 // now resolves to the new (truncated) end of the buffer: it must have 822 // been in `labels_at_tail` (this list is precise and complete, and 823 // the tail was at the end of the truncated branch on entry), and we 824 // iterate over this list and set `label_offsets` to the new tail. 825 // None of these labels could have been an alias (by invariant), so 826 // `label_offsets` is authoritative for each. 827 // - No other labels will be past the end of the buffer, because of the 828 // requirement that no labels be bound to the middle of branch ranges 829 // (see comments to `add_{cond,uncond}_branch()`). 830 // - The buffer is truncated to just before the last branch, and the 831 // fixup record referring to that last branch is removed. 832 } 833 834 fn optimize_branches(&mut self, ctrl_plane: &mut ControlPlane) { 835 if ctrl_plane.get_decision() { 836 return; 837 } 838 839 self.lazily_clear_labels_at_tail(); 840 // Invariants valid at this point. 841 842 trace!( 843 "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}", 844 self.latest_branches, 845 self.labels_at_tail, 846 self.fixup_records 847 ); 848 849 // We continue to munch on branches at the tail of the buffer until no 850 // more rules apply. Note that the loop only continues if a branch is 851 // actually truncated (or if labels are redirected away from a branch), 852 // so this always makes progress. 853 while let Some(b) = self.latest_branches.last() { 854 let cur_off = self.cur_offset(); 855 trace!("optimize_branches: last branch {:?} at off {}", b, cur_off); 856 // If there has been any code emission since the end of the last branch or 857 // label definition, then there's nothing we can edit (because we 858 // don't move code once placed, only back up and overwrite), so 859 // clear the records and finish. 860 if b.end < cur_off { 861 break; 862 } 863 864 // If the "labels at this branch" list on this branch is 865 // longer than a threshold, don't do any simplification, 866 // and let the branch remain to separate those labels from 867 // the current tail. This avoids quadratic behavior (see 868 // #3468): otherwise, if a long string of "goto next; 869 // next:" patterns are emitted, all of the labels will 870 // coalesce into a long list of aliases for the current 871 // buffer tail. We must track all aliases of the current 872 // tail for correctness, but we are also allowed to skip 873 // optimization (removal) of any branch, so we take the 874 // escape hatch here and let it stand. In effect this 875 // "spreads" the many thousands of labels in the 876 // pathological case among an actual (harmless but 877 // suboptimal) instruction once per N labels. 878 if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD { 879 break; 880 } 881 882 // Invariant: we are looking at a branch that ends at the tail of 883 // the buffer. 884 885 // For any branch, conditional or unconditional: 886 // - If the target is a label at the current offset, then remove 887 // the conditional branch, and reset all labels that targetted 888 // the current offset (end of branch) to the truncated 889 // end-of-code. 890 // 891 // Preserves execution semantics: a branch to its own fallthrough 892 // address is equivalent to a no-op; in both cases, nextPC is the 893 // fallthrough. 894 if self.resolve_label_offset(b.target) == cur_off { 895 trace!("branch with target == cur off; truncating"); 896 self.truncate_last_branch(); 897 continue; 898 } 899 900 // If latest is an unconditional branch: 901 // 902 // - If the branch's target is not its own start address, then for 903 // each label at the start of branch, make the label an alias of the 904 // branch target, and remove the label from the "labels at this 905 // branch" list. 906 // 907 // - Preserves execution semantics: an unconditional branch's 908 // only effect is to set PC to a new PC; this change simply 909 // collapses one step in the step-semantics. 910 // 911 // - Post-invariant: the labels that were bound to the start of 912 // this branch become aliases, so they must not be present in any 913 // labels-at-this-branch list or the labels-at-tail list. The 914 // labels are removed form the latest-branch record's 915 // labels-at-this-branch list, and are never placed in the 916 // labels-at-tail list. Furthermore, it is correct that they are 917 // not in either list, because they are now aliases, and labels 918 // that are aliases remain aliases forever. 919 // 920 // - If there is a prior unconditional branch that ends just before 921 // this one begins, and this branch has no labels bound to its 922 // start, then we can truncate this branch, because it is entirely 923 // unreachable (we have redirected all labels that make it 924 // reachable otherwise). Do so and continue around the loop. 925 // 926 // - Preserves execution semantics: the branch is unreachable, 927 // because execution can only flow into an instruction from the 928 // prior instruction's fallthrough or from a branch bound to that 929 // instruction's start offset. Unconditional branches have no 930 // fallthrough, so if the prior instruction is an unconditional 931 // branch, no fallthrough entry can happen. The 932 // labels-at-this-branch list is complete (by invariant), so if it 933 // is empty, then the instruction is entirely unreachable. Thus, 934 // it can be removed. 935 // 936 // - Post-invariant: ensured by truncate_last_branch(). 937 // 938 // - If there is a prior conditional branch whose target label 939 // resolves to the current offset (branches around the 940 // unconditional branch), then remove the unconditional branch, 941 // and make the target of the unconditional the target of the 942 // conditional instead. 943 // 944 // - Preserves execution semantics: previously we had: 945 // 946 // L1: 947 // cond_br L2 948 // br L3 949 // L2: 950 // (end of buffer) 951 // 952 // by removing the last branch, we have: 953 // 954 // L1: 955 // cond_br L2 956 // L2: 957 // (end of buffer) 958 // 959 // we then fix up the records for the conditional branch to 960 // have: 961 // 962 // L1: 963 // cond_br.inverted L3 964 // L2: 965 // 966 // In the original code, control flow reaches L2 when the 967 // conditional branch's predicate is true, and L3 otherwise. In 968 // the optimized code, the same is true. 969 // 970 // - Post-invariant: all edits to latest_branches and 971 // labels_at_tail are performed by `truncate_last_branch()`, 972 // which maintains the invariants at each step. 973 974 if b.is_uncond() { 975 // Set any label equal to current branch's start as an alias of 976 // the branch's target, if the target is not the branch itself 977 // (i.e., an infinite loop). 978 // 979 // We cannot perform this aliasing if the target of this branch 980 // ultimately aliases back here; if so, we need to keep this 981 // branch, so break out of this loop entirely (and clear the 982 // latest-branches list below). 983 // 984 // Note that this check is what prevents cycles from forming in 985 // `self.label_aliases`. To see why, consider an arbitrary start 986 // state: 987 // 988 // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to 989 // Ln, which is not aliased. 990 // 991 // We would create a cycle if we assigned label_aliases[Ln] 992 // = L1. Note that the below assignment is the only write 993 // to label_aliases. 994 // 995 // By our other invariants, we have that Ln (`l` below) 996 // resolves to the offset `b.start`, because it is in the 997 // set `b.labels_at_this_branch`. 998 // 999 // If L1 were already aliased, through some arbitrarily deep 1000 // chain, to Ln, then it must also resolve to this offset 1001 // `b.start`. 1002 // 1003 // By checking the resolution of `L1` against this offset, 1004 // and aborting this branch-simplification if they are 1005 // equal, we prevent the below assignment from ever creating 1006 // a cycle. 1007 if self.resolve_label_offset(b.target) != b.start { 1008 let redirected = b.labels_at_this_branch.len(); 1009 for &l in &b.labels_at_this_branch { 1010 trace!( 1011 " -> label at start of branch {:?} redirected to target {:?}", 1012 l, 1013 b.target 1014 ); 1015 self.label_aliases[l.0 as usize] = b.target; 1016 // NOTE: we continue to ensure the invariant that labels 1017 // pointing to tail of buffer are in `labels_at_tail` 1018 // because we already ensured above that the last branch 1019 // cannot have a target of `cur_off`; so we never have 1020 // to put the label into `labels_at_tail` when moving it 1021 // here. 1022 } 1023 // Maintain invariant: all branches have been redirected 1024 // and are no longer pointing at the start of this branch. 1025 let mut_b = self.latest_branches.last_mut().unwrap(); 1026 mut_b.labels_at_this_branch.clear(); 1027 1028 if redirected > 0 { 1029 trace!(" -> after label redirects, restarting loop"); 1030 continue; 1031 } 1032 } else { 1033 break; 1034 } 1035 1036 let b = self.latest_branches.last().unwrap(); 1037 1038 // Examine any immediately preceding branch. 1039 if self.latest_branches.len() > 1 { 1040 let prev_b = &self.latest_branches[self.latest_branches.len() - 2]; 1041 trace!(" -> more than one branch; prev_b = {:?}", prev_b); 1042 // This uncond is immediately after another uncond; we 1043 // should have already redirected labels to this uncond away 1044 // (but check to be sure); so we can truncate this uncond. 1045 if prev_b.is_uncond() 1046 && prev_b.end == b.start 1047 && b.labels_at_this_branch.is_empty() 1048 { 1049 trace!(" -> uncond follows another uncond; truncating"); 1050 self.truncate_last_branch(); 1051 continue; 1052 } 1053 1054 // This uncond is immediately after a conditional, and the 1055 // conditional's target is the end of this uncond, and we've 1056 // already redirected labels to this uncond away; so we can 1057 // truncate this uncond, flip the sense of the conditional, and 1058 // set the conditional's target (in `latest_branches` and in 1059 // `fixup_records`) to the uncond's target. 1060 if prev_b.is_cond() 1061 && prev_b.end == b.start 1062 && self.resolve_label_offset(prev_b.target) == cur_off 1063 { 1064 trace!(" -> uncond follows a conditional, and conditional's target resolves to current offset"); 1065 // Save the target of the uncond (this becomes the 1066 // target of the cond), and truncate the uncond. 1067 let target = b.target; 1068 let data = prev_b.inverted.clone().unwrap(); 1069 self.truncate_last_branch(); 1070 1071 // Mutate the code and cond branch. 1072 let off_before_edit = self.cur_offset(); 1073 let prev_b = self.latest_branches.last_mut().unwrap(); 1074 let not_inverted = SmallVec::from( 1075 &self.data[(prev_b.start as usize)..(prev_b.end as usize)], 1076 ); 1077 1078 // Low-level edit: replaces bytes of branch with 1079 // inverted form. cur_off remains the same afterward, so 1080 // we do not need to modify label data structures. 1081 self.data.truncate(prev_b.start as usize); 1082 self.data.extend_from_slice(&data[..]); 1083 1084 // Save the original code as the inversion of the 1085 // inverted branch, in case we later edit this branch 1086 // again. 1087 prev_b.inverted = Some(not_inverted); 1088 self.fixup_records[prev_b.fixup].label = target; 1089 trace!(" -> reassigning target of condbr to {:?}", target); 1090 prev_b.target = target; 1091 debug_assert_eq!(off_before_edit, self.cur_offset()); 1092 continue; 1093 } 1094 } 1095 } 1096 1097 // If we couldn't do anything with the last branch, then break. 1098 break; 1099 } 1100 1101 self.purge_latest_branches(); 1102 1103 trace!( 1104 "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}", 1105 self.latest_branches, 1106 self.labels_at_tail, 1107 self.fixup_records 1108 ); 1109 } 1110 1111 fn purge_latest_branches(&mut self) { 1112 // All of our branch simplification rules work only if a branch ends at 1113 // the tail of the buffer, with no following code; and branches are in 1114 // order in latest_branches; so if the last entry ends prior to 1115 // cur_offset, then clear all entries. 1116 let cur_off = self.cur_offset(); 1117 if let Some(l) = self.latest_branches.last() { 1118 if l.end < cur_off { 1119 trace!("purge_latest_branches: removing branch {:?}", l); 1120 self.latest_branches.clear(); 1121 } 1122 } 1123 1124 // Post-invariant: no invariant requires any branch to appear in 1125 // `latest_branches`; it is always optional. The list-clear above thus 1126 // preserves all semantics. 1127 } 1128 1129 /// Emit a trap at some point in the future with the specified code and 1130 /// stack map. 1131 /// 1132 /// This function returns a [`MachLabel`] which will be the future address 1133 /// of the trap. Jumps should refer to this label, likely by using the 1134 /// [`MachBuffer::use_label_at_offset`] method, to get a relocation 1135 /// patched in once the address of the trap is known. 1136 /// 1137 /// This will batch all traps into the end of the function. 1138 pub fn defer_trap(&mut self, code: TrapCode, stack_map: Option<StackMap>) -> MachLabel { 1139 let label = self.get_label(); 1140 self.update_deadline(I::TRAP_OPCODE.len(), u32::MAX); 1141 self.pending_traps.push(MachLabelTrap { 1142 label, 1143 code, 1144 stack_map, 1145 loc: self.cur_srcloc.map(|(_start, loc)| loc), 1146 }); 1147 label 1148 } 1149 1150 fn update_deadline(&mut self, len: usize, max_distance: CodeOffset) { 1151 trace!("defer: eventually emit {} bytes", len); 1152 let deadline = self.cur_offset().saturating_add(max_distance); 1153 self.island_worst_case_size += len as CodeOffset; 1154 self.island_worst_case_size = 1155 (self.island_worst_case_size + I::LabelUse::ALIGN - 1) & !(I::LabelUse::ALIGN - 1); 1156 if deadline < self.island_deadline { 1157 self.island_deadline = deadline; 1158 } 1159 } 1160 1161 /// Is an island needed within the next N bytes? 1162 pub fn island_needed(&self, distance: CodeOffset) -> bool { 1163 self.worst_case_end_of_island(distance) > self.island_deadline 1164 } 1165 1166 /// Returns the maximal offset that islands can reach if `distance` more 1167 /// bytes are appended. 1168 /// 1169 /// This is used to determine if veneers need insertions since jumps that 1170 /// can't reach past this point must get a veneer of some form. 1171 fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset { 1172 self.cur_offset() 1173 .saturating_add(distance) 1174 .saturating_add(self.island_worst_case_size) 1175 } 1176 1177 /// Emit all pending constants and required pending veneers. 1178 /// 1179 /// Should only be called if `island_needed()` returns true, i.e., if we 1180 /// actually reach a deadline. It's not necessarily a problem to do so 1181 /// otherwise but it may result in unnecessary work during emission. 1182 pub fn emit_island(&mut self, distance: CodeOffset, ctrl_plane: &mut ControlPlane) { 1183 self.emit_island_maybe_forced(false, distance, ctrl_plane); 1184 } 1185 1186 /// Same as `emit_island`, but an internal API with a `force_veneers` 1187 /// argument to force all veneers to always get emitted for debugging. 1188 fn emit_island_maybe_forced( 1189 &mut self, 1190 force_veneers: bool, 1191 distance: CodeOffset, 1192 ctrl_plane: &mut ControlPlane, 1193 ) { 1194 // We're going to purge fixups, so no latest-branch editing can happen 1195 // anymore. 1196 self.latest_branches.clear(); 1197 1198 // Reset internal calculations about islands since we're going to 1199 // change the calculus as we apply fixups. The `forced_threshold` is 1200 // used here to determine whether jumps to unknown labels will require 1201 // a veneer or not. 1202 let forced_threshold = self.worst_case_end_of_island(distance); 1203 self.island_deadline = UNKNOWN_LABEL_OFFSET; 1204 self.island_worst_case_size = 0; 1205 1206 // End the current location tracking since anything emitted during this 1207 // function shouldn't be attributed to whatever the current source 1208 // location is. 1209 // 1210 // Note that the current source location, if it's set right now, will be 1211 // restored at the end of this island emission. 1212 let cur_loc = self.cur_srcloc.map(|(_, loc)| loc); 1213 if cur_loc.is_some() { 1214 self.end_srcloc(); 1215 } 1216 1217 // First flush out all traps/constants so we have more labels in case 1218 // fixups are applied against these labels. 1219 // 1220 // Note that traps are placed first since this typically happens at the 1221 // end of the function and for disassemblers we try to keep all the code 1222 // contiguously together. 1223 for MachLabelTrap { 1224 label, 1225 code, 1226 stack_map, 1227 loc, 1228 } in mem::take(&mut self.pending_traps) 1229 { 1230 // If this trap has source information associated with it then 1231 // emit this information for the trap instruction going out now too. 1232 if let Some(loc) = loc { 1233 self.start_srcloc(loc); 1234 } 1235 self.align_to(I::LabelUse::ALIGN); 1236 self.bind_label(label, ctrl_plane); 1237 self.add_trap(code); 1238 if let Some(map) = stack_map { 1239 let extent = StackMapExtent::UpcomingBytes(I::TRAP_OPCODE.len() as u32); 1240 self.add_stack_map(extent, map); 1241 } 1242 self.put_data(I::TRAP_OPCODE); 1243 if loc.is_some() { 1244 self.end_srcloc(); 1245 } 1246 } 1247 1248 for constant in mem::take(&mut self.pending_constants) { 1249 let MachBufferConstant { align, size, .. } = self.constants[constant]; 1250 let label = self.constants[constant].upcoming_label.take().unwrap(); 1251 self.align_to(align); 1252 self.bind_label(label, ctrl_plane); 1253 self.used_constants.push((constant, self.cur_offset())); 1254 self.get_appended_space(size); 1255 } 1256 1257 for fixup in mem::take(&mut self.fixup_records) { 1258 trace!("emit_island: fixup {:?}", fixup); 1259 let MachLabelFixup { 1260 label, 1261 offset, 1262 kind, 1263 } = fixup; 1264 let label_offset = self.resolve_label_offset(label); 1265 let start = offset as usize; 1266 let end = (offset + kind.patch_size()) as usize; 1267 1268 if label_offset != UNKNOWN_LABEL_OFFSET { 1269 // If the offset of the label for this fixup is known then 1270 // we're going to do something here-and-now. We're either going 1271 // to patch the original offset because it's an in-bounds jump, 1272 // or we're going to generate a veneer, patch the fixup to jump 1273 // to the veneer, and then keep going. 1274 // 1275 // If the label comes after the original fixup, then we should 1276 // be guaranteed that the jump is in-bounds. Otherwise there's 1277 // a bug somewhere because this method wasn't called soon 1278 // enough. All forward-jumps are tracked and should get veneers 1279 // before their deadline comes and they're unable to jump 1280 // further. 1281 // 1282 // Otherwise if the label is before the fixup, then that's a 1283 // backwards jump. If it's past the maximum negative range 1284 // then we'll emit a veneer that to jump forward to which can 1285 // then jump backwards. 1286 let veneer_required = if label_offset >= offset { 1287 assert!((label_offset - offset) <= kind.max_pos_range()); 1288 false 1289 } else { 1290 (offset - label_offset) > kind.max_neg_range() 1291 }; 1292 trace!( 1293 " -> label_offset = {}, known, required = {} (pos {} neg {})", 1294 label_offset, 1295 veneer_required, 1296 kind.max_pos_range(), 1297 kind.max_neg_range() 1298 ); 1299 1300 if (force_veneers && kind.supports_veneer()) || veneer_required { 1301 self.emit_veneer(label, offset, kind); 1302 } else { 1303 let slice = &mut self.data[start..end]; 1304 trace!("patching in-range!"); 1305 kind.patch(slice, offset, label_offset); 1306 } 1307 } else { 1308 // If the offset of this label is not known at this time then 1309 // there's one of two possibilities: 1310 // 1311 // * First we may be about to exceed the maximum jump range of 1312 // this fixup. In that case a veneer is inserted to buy some 1313 // more budget for the forward-jump. It's guaranteed that the 1314 // label will eventually come after where we're at, so we know 1315 // that the forward jump is necessary. 1316 // 1317 // * Otherwise we're still within range of the forward jump but 1318 // the precise target isn't known yet. In that case we 1319 // enqueue the fixup to get processed later. 1320 if forced_threshold - offset > kind.max_pos_range() { 1321 self.emit_veneer(label, offset, kind); 1322 } else { 1323 self.use_label_at_offset(offset, label, kind); 1324 } 1325 } 1326 } 1327 1328 if let Some(loc) = cur_loc { 1329 self.start_srcloc(loc); 1330 } 1331 } 1332 1333 /// Emits a "veneer" the `kind` code at `offset` to jump to `label`. 1334 /// 1335 /// This will generate extra machine code, using `kind`, to get a 1336 /// larger-jump-kind than `kind` allows. The code at `offset` is then 1337 /// patched to jump to our new code, and then the new code is enqueued for 1338 /// a fixup to get processed at some later time. 1339 fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) { 1340 // If this `kind` doesn't support a veneer then that's a bug in the 1341 // backend because we need to implement support for such a veneer. 1342 assert!( 1343 kind.supports_veneer(), 1344 "jump beyond the range of {:?} but a veneer isn't supported", 1345 kind, 1346 ); 1347 1348 // Allocate space for a veneer in the island. 1349 self.align_to(I::LabelUse::ALIGN); 1350 let veneer_offset = self.cur_offset(); 1351 trace!("making a veneer at {}", veneer_offset); 1352 let start = offset as usize; 1353 let end = (offset + kind.patch_size()) as usize; 1354 let slice = &mut self.data[start..end]; 1355 // Patch the original label use to refer to the veneer. 1356 trace!( 1357 "patching original at offset {} to veneer offset {}", 1358 offset, 1359 veneer_offset 1360 ); 1361 kind.patch(slice, offset, veneer_offset); 1362 // Generate the veneer. 1363 let veneer_slice = self.get_appended_space(kind.veneer_size() as usize); 1364 let (veneer_fixup_off, veneer_label_use) = 1365 kind.generate_veneer(veneer_slice, veneer_offset); 1366 trace!( 1367 "generated veneer; fixup offset {}, label_use {:?}", 1368 veneer_fixup_off, 1369 veneer_label_use 1370 ); 1371 // Register a new use of `label` with our new veneer fixup and offset. 1372 // This'll recalculate deadlines accordingly and enqueue this fixup to 1373 // get processed at some later time. 1374 self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use); 1375 } 1376 1377 fn finish_emission_maybe_forcing_veneers( 1378 &mut self, 1379 force_veneers: bool, 1380 ctrl_plane: &mut ControlPlane, 1381 ) { 1382 while !self.pending_constants.is_empty() 1383 || !self.pending_traps.is_empty() 1384 || !self.fixup_records.is_empty() 1385 { 1386 // `emit_island()` will emit any pending veneers and constants, and 1387 // as a side-effect, will also take care of any fixups with resolved 1388 // labels eagerly. 1389 self.emit_island_maybe_forced(force_veneers, u32::MAX, ctrl_plane); 1390 } 1391 1392 // Ensure that all labels have been fixed up after the last island is emitted. This is a 1393 // full (release-mode) assert because an unresolved label means the emitted code is 1394 // incorrect. 1395 assert!(self.fixup_records.is_empty()); 1396 } 1397 1398 /// Finish any deferred emissions and/or fixups. 1399 pub fn finish( 1400 mut self, 1401 constants: &VCodeConstants, 1402 ctrl_plane: &mut ControlPlane, 1403 ) -> MachBufferFinalized<Stencil> { 1404 let _tt = timing::vcode_emit_finish(); 1405 1406 // Do any optimizations on branches at tail of buffer, as if we 1407 // had bound one last label. 1408 self.optimize_branches(ctrl_plane); 1409 1410 self.finish_emission_maybe_forcing_veneers(false, ctrl_plane); 1411 1412 let alignment = self.finish_constants(constants); 1413 1414 let mut srclocs = self.srclocs; 1415 srclocs.sort_by_key(|entry| entry.start); 1416 1417 MachBufferFinalized { 1418 data: self.data, 1419 relocs: self.relocs, 1420 traps: self.traps, 1421 call_sites: self.call_sites, 1422 srclocs, 1423 stack_maps: self.stack_maps, 1424 unwind_info: self.unwind_info, 1425 alignment, 1426 } 1427 } 1428 1429 /// Add an external relocation at the current offset. 1430 pub fn add_reloc(&mut self, kind: Reloc, name: &ExternalName, addend: Addend) { 1431 let name = name.clone(); 1432 // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally 1433 // generate a label-use statement to track whether an island is possibly 1434 // needed to escape this function to actually get to the external name. 1435 // This is most likely to come up on AArch64 where calls between 1436 // functions use a 26-bit signed offset which gives +/- 64MB. This means 1437 // that if a function is 128MB in size and there's a call in the middle 1438 // it's impossible to reach the actual target. Also, while it's 1439 // technically possible to jump to the start of a function and then jump 1440 // further, island insertion below always inserts islands after 1441 // previously appended code so for Cranelift's own implementation this 1442 // is also a problem for 64MB functions on AArch64 which start with a 1443 // call instruction, those won't be able to escape. 1444 // 1445 // Ideally what needs to happen here is that a `LabelUse` is 1446 // transparently generated (or call-sites of this function are audited 1447 // to generate a `LabelUse` instead) and tracked internally. The actual 1448 // relocation would then change over time if and when a veneer is 1449 // inserted, where the relocation here would be patched by this 1450 // `MachBuffer` to jump to the veneer. The problem, though, is that all 1451 // this still needs to end up, in the case of a singular function, 1452 // generating a final relocation pointing either to this particular 1453 // relocation or to the veneer inserted. Additionally 1454 // `MachBuffer` needs the concept of a label which will never be 1455 // resolved, so `emit_island` doesn't trip over not actually ever 1456 // knowning what some labels are. Currently the loop in 1457 // `finish_emission_maybe_forcing_veneers` would otherwise infinitely 1458 // loop. 1459 // 1460 // For now this means that because relocs aren't tracked at all that 1461 // AArch64 functions have a rough size limits of 64MB. For now that's 1462 // somewhat reasonable and the failure mode is a panic in `MachBuffer` 1463 // when a relocation can't otherwise be resolved later, so it shouldn't 1464 // actually result in any memory unsafety or anything like that. 1465 self.relocs.push(MachReloc { 1466 offset: self.data.len() as CodeOffset, 1467 kind, 1468 name, 1469 addend, 1470 }); 1471 } 1472 1473 /// Add a trap record at the current offset. 1474 pub fn add_trap(&mut self, code: TrapCode) { 1475 self.traps.push(MachTrap { 1476 offset: self.data.len() as CodeOffset, 1477 code, 1478 }); 1479 } 1480 1481 /// Add a call-site record at the current offset. 1482 pub fn add_call_site(&mut self, opcode: Opcode) { 1483 debug_assert!( 1484 opcode.is_call(), 1485 "adding call site info for a non-call instruction." 1486 ); 1487 self.call_sites.push(MachCallSite { 1488 ret_addr: self.data.len() as CodeOffset, 1489 opcode, 1490 }); 1491 } 1492 1493 /// Add an unwind record at the current offset. 1494 pub fn add_unwind(&mut self, unwind: UnwindInst) { 1495 self.unwind_info.push((self.cur_offset(), unwind)); 1496 } 1497 1498 /// Set the `SourceLoc` for code from this offset until the offset at the 1499 /// next call to `end_srcloc()`. 1500 pub fn start_srcloc(&mut self, loc: RelSourceLoc) { 1501 self.cur_srcloc = Some((self.cur_offset(), loc)); 1502 } 1503 1504 /// Mark the end of the `SourceLoc` segment started at the last 1505 /// `start_srcloc()` call. 1506 pub fn end_srcloc(&mut self) { 1507 let (start, loc) = self 1508 .cur_srcloc 1509 .take() 1510 .expect("end_srcloc() called without start_srcloc()"); 1511 let end = self.cur_offset(); 1512 // Skip zero-length extends. 1513 debug_assert!(end >= start); 1514 if end > start { 1515 self.srclocs.push(MachSrcLoc { start, end, loc }); 1516 } 1517 } 1518 1519 /// Add stack map metadata for this program point: a set of stack offsets 1520 /// (from SP upward) that contain live references. 1521 /// 1522 /// The `offset_to_fp` value is the offset from the nominal SP (at which the `stack_offsets` 1523 /// are based) and the FP value. By subtracting `offset_to_fp` from each `stack_offsets` 1524 /// element, one can obtain live-reference offsets from FP instead. 1525 pub fn add_stack_map(&mut self, extent: StackMapExtent, stack_map: StackMap) { 1526 let (start, end) = match extent { 1527 StackMapExtent::UpcomingBytes(insn_len) => { 1528 let start_offset = self.cur_offset(); 1529 (start_offset, start_offset + insn_len) 1530 } 1531 StackMapExtent::StartedAtOffset(start_offset) => { 1532 let end_offset = self.cur_offset(); 1533 debug_assert!(end_offset >= start_offset); 1534 (start_offset, end_offset) 1535 } 1536 }; 1537 trace!("Adding stack map for offsets {start:#x}..{end:#x}"); 1538 self.stack_maps.push(MachStackMap { 1539 offset: start, 1540 offset_end: end, 1541 stack_map, 1542 }); 1543 } 1544 } 1545 1546 impl<T: CompilePhase> MachBufferFinalized<T> { 1547 /// Get a list of source location mapping tuples in sorted-by-start-offset order. 1548 pub fn get_srclocs_sorted(&self) -> &[T::MachSrcLocType] { 1549 &self.srclocs[..] 1550 } 1551 1552 /// Get the total required size for the code. 1553 pub fn total_size(&self) -> CodeOffset { 1554 self.data.len() as CodeOffset 1555 } 1556 1557 /// Return the code in this mach buffer as a hex string for testing purposes. 1558 pub fn stringify_code_bytes(&self) -> String { 1559 // This is pretty lame, but whatever .. 1560 use std::fmt::Write; 1561 let mut s = String::with_capacity(self.data.len() * 2); 1562 for b in &self.data { 1563 write!(&mut s, "{:02X}", b).unwrap(); 1564 } 1565 s 1566 } 1567 1568 /// Get the code bytes. 1569 pub fn data(&self) -> &[u8] { 1570 // N.B.: we emit every section into the .text section as far as 1571 // the `CodeSink` is concerned; we do not bother to segregate 1572 // the contents into the actual program text, the jumptable and the 1573 // rodata (constant pool). This allows us to generate code assuming 1574 // that these will not be relocated relative to each other, and avoids 1575 // having to designate each section as belonging in one of the three 1576 // fixed categories defined by `CodeSink`. If this becomes a problem 1577 // later (e.g. because of memory permissions or similar), we can 1578 // add this designation and segregate the output; take care, however, 1579 // to add the appropriate relocations in this case. 1580 1581 &self.data[..] 1582 } 1583 1584 /// Get the list of external relocations for this code. 1585 pub fn relocs(&self) -> &[MachReloc] { 1586 &self.relocs[..] 1587 } 1588 1589 /// Get the list of trap records for this code. 1590 pub fn traps(&self) -> &[MachTrap] { 1591 &self.traps[..] 1592 } 1593 1594 /// Get the stack map metadata for this code. 1595 pub fn stack_maps(&self) -> &[MachStackMap] { 1596 &self.stack_maps[..] 1597 } 1598 1599 /// Get the list of call sites for this code. 1600 pub fn call_sites(&self) -> &[MachCallSite] { 1601 &self.call_sites[..] 1602 } 1603 } 1604 1605 /// Metadata about a constant. 1606 struct MachBufferConstant { 1607 /// A label which has not yet been bound which can be used for this 1608 /// constant. 1609 /// 1610 /// This is lazily created when a label is requested for a constant and is 1611 /// cleared when a constant is emitted. 1612 upcoming_label: Option<MachLabel>, 1613 /// Required alignment. 1614 align: CodeOffset, 1615 /// The byte size of this constant. 1616 size: usize, 1617 } 1618 1619 /// A trap that is deferred to the next time an island is emitted for either 1620 /// traps, constants, or fixups. 1621 struct MachLabelTrap { 1622 /// This label will refer to the trap's offset. 1623 label: MachLabel, 1624 /// The code associated with this trap. 1625 code: TrapCode, 1626 /// An optional stack map to associate with this trap. 1627 stack_map: Option<StackMap>, 1628 /// An optional source location to assign for this trap. 1629 loc: Option<RelSourceLoc>, 1630 } 1631 1632 /// A fixup to perform on the buffer once code is emitted. Fixups always refer 1633 /// to labels and patch the code based on label offsets. Hence, they are like 1634 /// relocations, but internal to one buffer. 1635 #[derive(Debug)] 1636 struct MachLabelFixup<I: VCodeInst> { 1637 /// The label whose offset controls this fixup. 1638 label: MachLabel, 1639 /// The offset to fix up / patch to refer to this label. 1640 offset: CodeOffset, 1641 /// The kind of fixup. This is architecture-specific; each architecture may have, 1642 /// e.g., several types of branch instructions, each with differently-sized 1643 /// offset fields and different places within the instruction to place the 1644 /// bits. 1645 kind: I::LabelUse, 1646 } 1647 1648 /// A relocation resulting from a compilation. 1649 #[derive(Clone, Debug, PartialEq)] 1650 #[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))] 1651 pub struct MachReloc { 1652 /// The offset at which the relocation applies, *relative to the 1653 /// containing section*. 1654 pub offset: CodeOffset, 1655 /// The kind of relocation. 1656 pub kind: Reloc, 1657 /// The external symbol / name to which this relocation refers. 1658 pub name: ExternalName, 1659 /// The addend to add to the symbol value. 1660 pub addend: i64, 1661 } 1662 1663 /// A trap record resulting from a compilation. 1664 #[derive(Clone, Debug, PartialEq)] 1665 #[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))] 1666 pub struct MachTrap { 1667 /// The offset at which the trap instruction occurs, *relative to the 1668 /// containing section*. 1669 pub offset: CodeOffset, 1670 /// The trap code. 1671 pub code: TrapCode, 1672 } 1673 1674 /// A call site record resulting from a compilation. 1675 #[derive(Clone, Debug, PartialEq)] 1676 #[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))] 1677 pub struct MachCallSite { 1678 /// The offset of the call's return address, *relative to the containing section*. 1679 pub ret_addr: CodeOffset, 1680 /// The call's opcode. 1681 pub opcode: Opcode, 1682 } 1683 1684 /// A source-location mapping resulting from a compilation. 1685 #[derive(PartialEq, Debug, Clone)] 1686 #[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))] 1687 pub struct MachSrcLoc<T: CompilePhase> { 1688 /// The start of the region of code corresponding to a source location. 1689 /// This is relative to the start of the function, not to the start of the 1690 /// section. 1691 pub start: CodeOffset, 1692 /// The end of the region of code corresponding to a source location. 1693 /// This is relative to the start of the section, not to the start of the 1694 /// section. 1695 pub end: CodeOffset, 1696 /// The source location. 1697 pub loc: T::SourceLocType, 1698 } 1699 1700 impl MachSrcLoc<Stencil> { 1701 fn apply_base_srcloc(self, base_srcloc: SourceLoc) -> MachSrcLoc<Final> { 1702 MachSrcLoc { 1703 start: self.start, 1704 end: self.end, 1705 loc: self.loc.expand(base_srcloc), 1706 } 1707 } 1708 } 1709 1710 /// Record of stack map metadata: stack offsets containing references. 1711 #[derive(Clone, Debug, PartialEq)] 1712 #[cfg_attr(feature = "enable-serde", derive(serde::Serialize, serde::Deserialize))] 1713 pub struct MachStackMap { 1714 /// The code offset at which this stack map applies. 1715 pub offset: CodeOffset, 1716 /// The code offset just past the "end" of the instruction: that is, the 1717 /// offset of the first byte of the following instruction, or equivalently, 1718 /// the start offset plus the instruction length. 1719 pub offset_end: CodeOffset, 1720 /// The stack map itself. 1721 pub stack_map: StackMap, 1722 } 1723 1724 /// Record of branch instruction in the buffer, to facilitate editing. 1725 #[derive(Clone, Debug)] 1726 struct MachBranch { 1727 start: CodeOffset, 1728 end: CodeOffset, 1729 target: MachLabel, 1730 fixup: usize, 1731 inverted: Option<SmallVec<[u8; 8]>>, 1732 /// All labels pointing to the start of this branch. For correctness, this 1733 /// *must* be complete (i.e., must contain all labels whose resolved offsets 1734 /// are at the start of this branch): we rely on being able to redirect all 1735 /// labels that could jump to this branch before removing it, if it is 1736 /// otherwise unreachable. 1737 labels_at_this_branch: SmallVec<[MachLabel; 4]>, 1738 } 1739 1740 impl MachBranch { 1741 fn is_cond(&self) -> bool { 1742 self.inverted.is_some() 1743 } 1744 fn is_uncond(&self) -> bool { 1745 self.inverted.is_none() 1746 } 1747 } 1748 1749 /// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`. 1750 /// 1751 /// Note that `MachBuffer` was primarily written for intra-function references 1752 /// of jumps between basic blocks, but it's also quite usable for entire text 1753 /// sections and resolving references between functions themselves. This 1754 /// builder interprets "blocks" as labeled functions for the purposes of 1755 /// resolving labels internally in the buffer. 1756 pub struct MachTextSectionBuilder<I: VCodeInst> { 1757 buf: MachBuffer<I>, 1758 next_func: usize, 1759 force_veneers: bool, 1760 } 1761 1762 impl<I: VCodeInst> MachTextSectionBuilder<I> { 1763 /// Creates a new text section builder which will have `num_funcs` functions 1764 /// pushed into it. 1765 pub fn new(num_funcs: usize) -> MachTextSectionBuilder<I> { 1766 let mut buf = MachBuffer::new(); 1767 buf.reserve_labels_for_blocks(num_funcs); 1768 MachTextSectionBuilder { 1769 buf, 1770 next_func: 0, 1771 force_veneers: false, 1772 } 1773 } 1774 } 1775 1776 impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> { 1777 fn append( 1778 &mut self, 1779 labeled: bool, 1780 func: &[u8], 1781 align: u32, 1782 ctrl_plane: &mut ControlPlane, 1783 ) -> u64 { 1784 // Conditionally emit an island if it's necessary to resolve jumps 1785 // between functions which are too far away. 1786 let size = func.len() as u32; 1787 if self.force_veneers || self.buf.island_needed(size) { 1788 self.buf 1789 .emit_island_maybe_forced(self.force_veneers, size, ctrl_plane); 1790 } 1791 1792 self.buf.align_to(align); 1793 let pos = self.buf.cur_offset(); 1794 if labeled { 1795 self.buf.bind_label( 1796 MachLabel::from_block(BlockIndex::new(self.next_func)), 1797 ctrl_plane, 1798 ); 1799 self.next_func += 1; 1800 } 1801 self.buf.put_data(func); 1802 u64::from(pos) 1803 } 1804 1805 fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: usize) -> bool { 1806 let label = MachLabel::from_block(BlockIndex::new(target)); 1807 let offset = u32::try_from(offset).unwrap(); 1808 match I::LabelUse::from_reloc(reloc, addend) { 1809 Some(label_use) => { 1810 self.buf.use_label_at_offset(offset, label, label_use); 1811 true 1812 } 1813 None => false, 1814 } 1815 } 1816 1817 fn force_veneers(&mut self) { 1818 self.force_veneers = true; 1819 } 1820 1821 fn finish(&mut self, ctrl_plane: &mut ControlPlane) -> Vec<u8> { 1822 // Double-check all functions were pushed. 1823 assert_eq!(self.next_func, self.buf.label_offsets.len()); 1824 1825 // Finish up any veneers, if necessary. 1826 self.buf 1827 .finish_emission_maybe_forcing_veneers(self.force_veneers, ctrl_plane); 1828 1829 // We don't need the data any more, so return it to the caller. 1830 mem::take(&mut self.buf.data).into_vec() 1831 } 1832 } 1833 1834 // We use an actual instruction definition to do tests, so we depend on the `arm64` feature here. 1835 #[cfg(all(test, feature = "arm64"))] 1836 mod test { 1837 use cranelift_entity::EntityRef as _; 1838 1839 use super::*; 1840 use crate::ir::UserExternalNameRef; 1841 use crate::isa::aarch64::inst::xreg; 1842 use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst}; 1843 use crate::machinst::{MachInstEmit, MachInstEmitState}; 1844 use crate::settings; 1845 use std::default::Default; 1846 use std::vec::Vec; 1847 1848 fn label(n: u32) -> MachLabel { 1849 MachLabel::from_block(BlockIndex::new(n as usize)) 1850 } 1851 fn target(n: u32) -> BranchTarget { 1852 BranchTarget::Label(label(n)) 1853 } 1854 1855 #[test] 1856 fn test_elide_jump_to_next() { 1857 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 1858 let mut buf = MachBuffer::new(); 1859 let mut state = <Inst as MachInstEmit>::State::default(); 1860 let constants = Default::default(); 1861 1862 buf.reserve_labels_for_blocks(2); 1863 buf.bind_label(label(0), state.ctrl_plane_mut()); 1864 let inst = Inst::Jump { dest: target(1) }; 1865 inst.emit(&[], &mut buf, &info, &mut state); 1866 buf.bind_label(label(1), state.ctrl_plane_mut()); 1867 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 1868 assert_eq!(0, buf.total_size()); 1869 } 1870 1871 #[test] 1872 fn test_elide_trivial_jump_blocks() { 1873 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 1874 let mut buf = MachBuffer::new(); 1875 let mut state = <Inst as MachInstEmit>::State::default(); 1876 let constants = Default::default(); 1877 1878 buf.reserve_labels_for_blocks(4); 1879 1880 buf.bind_label(label(0), state.ctrl_plane_mut()); 1881 let inst = Inst::CondBr { 1882 kind: CondBrKind::NotZero(xreg(0)), 1883 taken: target(1), 1884 not_taken: target(2), 1885 }; 1886 inst.emit(&[], &mut buf, &info, &mut state); 1887 1888 buf.bind_label(label(1), state.ctrl_plane_mut()); 1889 let inst = Inst::Jump { dest: target(3) }; 1890 inst.emit(&[], &mut buf, &info, &mut state); 1891 1892 buf.bind_label(label(2), state.ctrl_plane_mut()); 1893 let inst = Inst::Jump { dest: target(3) }; 1894 inst.emit(&[], &mut buf, &info, &mut state); 1895 1896 buf.bind_label(label(3), state.ctrl_plane_mut()); 1897 1898 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 1899 assert_eq!(0, buf.total_size()); 1900 } 1901 1902 #[test] 1903 fn test_flip_cond() { 1904 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 1905 let mut buf = MachBuffer::new(); 1906 let mut state = <Inst as MachInstEmit>::State::default(); 1907 let constants = Default::default(); 1908 1909 buf.reserve_labels_for_blocks(4); 1910 1911 buf.bind_label(label(0), state.ctrl_plane_mut()); 1912 let inst = Inst::CondBr { 1913 kind: CondBrKind::Zero(xreg(0)), 1914 taken: target(1), 1915 not_taken: target(2), 1916 }; 1917 inst.emit(&[], &mut buf, &info, &mut state); 1918 1919 buf.bind_label(label(1), state.ctrl_plane_mut()); 1920 let inst = Inst::Nop4; 1921 inst.emit(&[], &mut buf, &info, &mut state); 1922 1923 buf.bind_label(label(2), state.ctrl_plane_mut()); 1924 let inst = Inst::Udf { 1925 trap_code: TrapCode::Interrupt, 1926 }; 1927 inst.emit(&[], &mut buf, &info, &mut state); 1928 1929 buf.bind_label(label(3), state.ctrl_plane_mut()); 1930 1931 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 1932 1933 let mut buf2 = MachBuffer::new(); 1934 let mut state = Default::default(); 1935 let inst = Inst::TrapIf { 1936 kind: CondBrKind::NotZero(xreg(0)), 1937 trap_code: TrapCode::Interrupt, 1938 }; 1939 inst.emit(&[], &mut buf2, &info, &mut state); 1940 let inst = Inst::Nop4; 1941 inst.emit(&[], &mut buf2, &info, &mut state); 1942 1943 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut()); 1944 1945 assert_eq!(buf.data, buf2.data); 1946 } 1947 1948 #[test] 1949 fn test_island() { 1950 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 1951 let mut buf = MachBuffer::new(); 1952 let mut state = <Inst as MachInstEmit>::State::default(); 1953 let constants = Default::default(); 1954 1955 buf.reserve_labels_for_blocks(4); 1956 1957 buf.bind_label(label(0), state.ctrl_plane_mut()); 1958 let inst = Inst::CondBr { 1959 kind: CondBrKind::NotZero(xreg(0)), 1960 taken: target(2), 1961 not_taken: target(3), 1962 }; 1963 inst.emit(&[], &mut buf, &info, &mut state); 1964 1965 buf.bind_label(label(1), state.ctrl_plane_mut()); 1966 while buf.cur_offset() < 2000000 { 1967 if buf.island_needed(0) { 1968 buf.emit_island(0, state.ctrl_plane_mut()); 1969 } 1970 let inst = Inst::Nop4; 1971 inst.emit(&[], &mut buf, &info, &mut state); 1972 } 1973 1974 buf.bind_label(label(2), state.ctrl_plane_mut()); 1975 let inst = Inst::Nop4; 1976 inst.emit(&[], &mut buf, &info, &mut state); 1977 1978 buf.bind_label(label(3), state.ctrl_plane_mut()); 1979 let inst = Inst::Nop4; 1980 inst.emit(&[], &mut buf, &info, &mut state); 1981 1982 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 1983 1984 assert_eq!(2000000 + 8, buf.total_size()); 1985 1986 let mut buf2 = MachBuffer::new(); 1987 let mut state = Default::default(); 1988 let inst = Inst::CondBr { 1989 kind: CondBrKind::NotZero(xreg(0)), 1990 1991 // This conditionally taken branch has a 19-bit constant, shifted 1992 // to the left by two, giving us a 21-bit range in total. Half of 1993 // this range positive so the we should be around 1 << 20 bytes 1994 // away for our jump target. 1995 // 1996 // There are two pending fixups by the time we reach this point, 1997 // one for this 19-bit jump and one for the unconditional 26-bit 1998 // jump below. A 19-bit veneer is 4 bytes large and the 26-bit 1999 // veneer is 20 bytes large, which means that pessimistically 2000 // assuming we'll need two veneers we need 24 bytes of extra 2001 // space, meaning that the actual island should come 24-bytes 2002 // before the deadline. 2003 taken: BranchTarget::ResolvedOffset((1 << 20) - 4 - 20), 2004 2005 // This branch is in-range so no veneers should be needed, it should 2006 // go directly to the target. 2007 not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4), 2008 }; 2009 inst.emit(&[], &mut buf2, &info, &mut state); 2010 2011 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut()); 2012 2013 assert_eq!(&buf.data[0..8], &buf2.data[..]); 2014 } 2015 2016 #[test] 2017 fn test_island_backward() { 2018 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2019 let mut buf = MachBuffer::new(); 2020 let mut state = <Inst as MachInstEmit>::State::default(); 2021 let constants = Default::default(); 2022 2023 buf.reserve_labels_for_blocks(4); 2024 2025 buf.bind_label(label(0), state.ctrl_plane_mut()); 2026 let inst = Inst::Nop4; 2027 inst.emit(&[], &mut buf, &info, &mut state); 2028 2029 buf.bind_label(label(1), state.ctrl_plane_mut()); 2030 let inst = Inst::Nop4; 2031 inst.emit(&[], &mut buf, &info, &mut state); 2032 2033 buf.bind_label(label(2), state.ctrl_plane_mut()); 2034 while buf.cur_offset() < 2000000 { 2035 let inst = Inst::Nop4; 2036 inst.emit(&[], &mut buf, &info, &mut state); 2037 } 2038 2039 buf.bind_label(label(3), state.ctrl_plane_mut()); 2040 let inst = Inst::CondBr { 2041 kind: CondBrKind::NotZero(xreg(0)), 2042 taken: target(0), 2043 not_taken: target(1), 2044 }; 2045 inst.emit(&[], &mut buf, &info, &mut state); 2046 2047 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2048 2049 assert_eq!(2000000 + 12, buf.total_size()); 2050 2051 let mut buf2 = MachBuffer::new(); 2052 let mut state = Default::default(); 2053 let inst = Inst::CondBr { 2054 kind: CondBrKind::NotZero(xreg(0)), 2055 taken: BranchTarget::ResolvedOffset(8), 2056 not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)), 2057 }; 2058 inst.emit(&[], &mut buf2, &info, &mut state); 2059 let inst = Inst::Jump { 2060 dest: BranchTarget::ResolvedOffset(-(2000000 + 8)), 2061 }; 2062 inst.emit(&[], &mut buf2, &info, &mut state); 2063 2064 let buf2 = buf2.finish(&constants, state.ctrl_plane_mut()); 2065 2066 assert_eq!(&buf.data[2000000..], &buf2.data[..]); 2067 } 2068 2069 #[test] 2070 fn test_multiple_redirect() { 2071 // label0: 2072 // cbz x0, label1 2073 // b label2 2074 // label1: 2075 // b label3 2076 // label2: 2077 // nop 2078 // nop 2079 // b label0 2080 // label3: 2081 // b label4 2082 // label4: 2083 // b label5 2084 // label5: 2085 // b label7 2086 // label6: 2087 // nop 2088 // label7: 2089 // ret 2090 // 2091 // -- should become: 2092 // 2093 // label0: 2094 // cbz x0, label7 2095 // label2: 2096 // nop 2097 // nop 2098 // b label0 2099 // label6: 2100 // nop 2101 // label7: 2102 // ret 2103 2104 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2105 let mut buf = MachBuffer::new(); 2106 let mut state = <Inst as MachInstEmit>::State::default(); 2107 let constants = Default::default(); 2108 2109 buf.reserve_labels_for_blocks(8); 2110 2111 buf.bind_label(label(0), state.ctrl_plane_mut()); 2112 let inst = Inst::CondBr { 2113 kind: CondBrKind::Zero(xreg(0)), 2114 taken: target(1), 2115 not_taken: target(2), 2116 }; 2117 inst.emit(&[], &mut buf, &info, &mut state); 2118 2119 buf.bind_label(label(1), state.ctrl_plane_mut()); 2120 let inst = Inst::Jump { dest: target(3) }; 2121 inst.emit(&[], &mut buf, &info, &mut state); 2122 2123 buf.bind_label(label(2), state.ctrl_plane_mut()); 2124 let inst = Inst::Nop4; 2125 inst.emit(&[], &mut buf, &info, &mut state); 2126 inst.emit(&[], &mut buf, &info, &mut state); 2127 let inst = Inst::Jump { dest: target(0) }; 2128 inst.emit(&[], &mut buf, &info, &mut state); 2129 2130 buf.bind_label(label(3), state.ctrl_plane_mut()); 2131 let inst = Inst::Jump { dest: target(4) }; 2132 inst.emit(&[], &mut buf, &info, &mut state); 2133 2134 buf.bind_label(label(4), state.ctrl_plane_mut()); 2135 let inst = Inst::Jump { dest: target(5) }; 2136 inst.emit(&[], &mut buf, &info, &mut state); 2137 2138 buf.bind_label(label(5), state.ctrl_plane_mut()); 2139 let inst = Inst::Jump { dest: target(7) }; 2140 inst.emit(&[], &mut buf, &info, &mut state); 2141 2142 buf.bind_label(label(6), state.ctrl_plane_mut()); 2143 let inst = Inst::Nop4; 2144 inst.emit(&[], &mut buf, &info, &mut state); 2145 2146 buf.bind_label(label(7), state.ctrl_plane_mut()); 2147 let inst = Inst::Ret { 2148 rets: vec![], 2149 stack_bytes_to_pop: 0, 2150 }; 2151 inst.emit(&[], &mut buf, &info, &mut state); 2152 2153 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2154 2155 let golden_data = vec![ 2156 0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14 2157 0x1f, 0x20, 0x03, 0xd5, // nop 2158 0x1f, 0x20, 0x03, 0xd5, // nop 2159 0xfd, 0xff, 0xff, 0x17, // b 0 2160 0x1f, 0x20, 0x03, 0xd5, // nop 2161 0xc0, 0x03, 0x5f, 0xd6, // ret 2162 ]; 2163 2164 assert_eq!(&golden_data[..], &buf.data[..]); 2165 } 2166 2167 #[test] 2168 fn test_handle_branch_cycle() { 2169 // label0: 2170 // b label1 2171 // label1: 2172 // b label2 2173 // label2: 2174 // b label3 2175 // label3: 2176 // b label4 2177 // label4: 2178 // b label1 // note: not label0 (to make it interesting). 2179 // 2180 // -- should become: 2181 // 2182 // label0, label1, ..., label4: 2183 // b label0 2184 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 2185 let mut buf = MachBuffer::new(); 2186 let mut state = <Inst as MachInstEmit>::State::default(); 2187 let constants = Default::default(); 2188 2189 buf.reserve_labels_for_blocks(5); 2190 2191 buf.bind_label(label(0), state.ctrl_plane_mut()); 2192 let inst = Inst::Jump { dest: target(1) }; 2193 inst.emit(&[], &mut buf, &info, &mut state); 2194 2195 buf.bind_label(label(1), state.ctrl_plane_mut()); 2196 let inst = Inst::Jump { dest: target(2) }; 2197 inst.emit(&[], &mut buf, &info, &mut state); 2198 2199 buf.bind_label(label(2), state.ctrl_plane_mut()); 2200 let inst = Inst::Jump { dest: target(3) }; 2201 inst.emit(&[], &mut buf, &info, &mut state); 2202 2203 buf.bind_label(label(3), state.ctrl_plane_mut()); 2204 let inst = Inst::Jump { dest: target(4) }; 2205 inst.emit(&[], &mut buf, &info, &mut state); 2206 2207 buf.bind_label(label(4), state.ctrl_plane_mut()); 2208 let inst = Inst::Jump { dest: target(1) }; 2209 inst.emit(&[], &mut buf, &info, &mut state); 2210 2211 let buf = buf.finish(&constants, state.ctrl_plane_mut()); 2212 2213 let golden_data = vec![ 2214 0x00, 0x00, 0x00, 0x14, // b 0 2215 ]; 2216 2217 assert_eq!(&golden_data[..], &buf.data[..]); 2218 } 2219 2220 #[test] 2221 fn metadata_records() { 2222 let mut buf = MachBuffer::<Inst>::new(); 2223 let ctrl_plane = &mut Default::default(); 2224 let constants = Default::default(); 2225 2226 buf.reserve_labels_for_blocks(1); 2227 2228 buf.bind_label(label(0), ctrl_plane); 2229 buf.put1(1); 2230 buf.add_trap(TrapCode::HeapOutOfBounds); 2231 buf.put1(2); 2232 buf.add_trap(TrapCode::IntegerOverflow); 2233 buf.add_trap(TrapCode::IntegerDivisionByZero); 2234 buf.add_call_site(Opcode::Call); 2235 buf.add_reloc( 2236 Reloc::Abs4, 2237 &ExternalName::User(UserExternalNameRef::new(0)), 2238 0, 2239 ); 2240 buf.put1(3); 2241 buf.add_reloc( 2242 Reloc::Abs8, 2243 &ExternalName::User(UserExternalNameRef::new(1)), 2244 1, 2245 ); 2246 buf.put1(4); 2247 2248 let buf = buf.finish(&constants, ctrl_plane); 2249 2250 assert_eq!(buf.data(), &[1, 2, 3, 4]); 2251 assert_eq!( 2252 buf.traps() 2253 .iter() 2254 .map(|trap| (trap.offset, trap.code)) 2255 .collect::<Vec<_>>(), 2256 vec![ 2257 (1, TrapCode::HeapOutOfBounds), 2258 (2, TrapCode::IntegerOverflow), 2259 (2, TrapCode::IntegerDivisionByZero) 2260 ] 2261 ); 2262 assert_eq!( 2263 buf.call_sites() 2264 .iter() 2265 .map(|call_site| (call_site.ret_addr, call_site.opcode)) 2266 .collect::<Vec<_>>(), 2267 vec![(2, Opcode::Call)] 2268 ); 2269 assert_eq!( 2270 buf.relocs() 2271 .iter() 2272 .map(|reloc| (reloc.offset, reloc.kind)) 2273 .collect::<Vec<_>>(), 2274 vec![(2, Reloc::Abs4), (3, Reloc::Abs8)] 2275 ); 2276 } 2277 } 2278