1 //! In-memory representation of compiled machine code, with labels and fixups to 2 //! refer to those labels. Handles constant-pool island insertion and also 3 //! veneer insertion for out-of-range jumps. 4 //! 5 //! This code exists to solve three problems: 6 //! 7 //! - Branch targets for forward branches are not known until later, when we 8 //! emit code in a single pass through the instruction structs. 9 //! 10 //! - On many architectures, address references or offsets have limited range. 11 //! For example, on AArch64, conditional branches can only target code +/- 1MB 12 //! from the branch itself. 13 //! 14 //! - The lowering of control flow from the CFG-with-edges produced by 15 //! [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty 16 //! edge blocks when the register allocator does not need to insert any 17 //! spills/reloads/moves in edge blocks, results in many suboptimal branch 18 //! patterns. The lowering also pays no attention to block order, and so 19 //! two-target conditional forms (cond-br followed by uncond-br) can often by 20 //! avoided because one of the targets is the fallthrough. There are several 21 //! cases here where we can simplify to use fewer branches. 22 //! 23 //! This "buffer" implements a single-pass code emission strategy (with a later 24 //! "fixup" pass, but only through recorded fixups, not all instructions). The 25 //! basic idea is: 26 //! 27 //! - Emit branches as they are, including two-target (cond/uncond) compound 28 //! forms, but with zero offsets and optimistically assuming the target will be 29 //! in range. Record the "fixup" for later. Targets are denoted instead by 30 //! symbolic "labels" that are then bound to certain offsets in the buffer as 31 //! we emit code. (Nominally, there is a label at the start of every basic 32 //! block.) 33 //! 34 //! - As we do this, track the offset in the buffer at which the first label 35 //! reference "goes out of range". We call this the "deadline". If we reach the 36 //! deadline and we still have not bound the label to which an unresolved branch 37 //! refers, we have a problem! 38 //! 39 //! - To solve this problem, we emit "islands" full of "veneers". An island is 40 //! simply a chunk of code inserted in the middle of the code actually produced 41 //! by the emitter (e.g., vcode iterating over instruction structs). The emitter 42 //! has some awareness of this: it either asks for an island between blocks, so 43 //! it is not accidentally executed, or else it emits a branch around the island 44 //! when all other options fail (see `Inst::EmitIsland` meta-instruction). 45 //! 46 //! - A "veneer" is an instruction (or sequence of instructions) in an "island" 47 //! that implements a longer-range reference to a label. The idea is that, for 48 //! example, a branch with a limited range can branch to a "veneer" instead, 49 //! which is simply a branch in a form that can use a longer-range reference. On 50 //! AArch64, for example, conditionals have a +/- 1 MB range, but a conditional 51 //! can branch to an unconditional branch which has a +/- 128 MB range. Hence, a 52 //! conditional branch's label reference can be fixed up with a "veneer" to 53 //! achieve a longer range. 54 //! 55 //! - To implement all of this, we require the backend to provide a `LabelUse` 56 //! type that implements a trait. This is nominally an enum that records one of 57 //! several kinds of references to an offset in code -- basically, a relocation 58 //! type -- and will usually correspond to different instruction formats. The 59 //! `LabelUse` implementation specifies the maximum range, how to patch in the 60 //! actual label location when known, and how to generate a veneer to extend the 61 //! range. 62 //! 63 //! That satisfies label references, but we still may have suboptimal branch 64 //! patterns. To clean up the branches, we do a simple "peephole"-style 65 //! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`) 66 //! informs the buffer of branches in the code and, in the case of conditionals, 67 //! the code that would have been emitted to invert this branch's condition. We 68 //! track the "latest branches": these are branches that are contiguous up to 69 //! the current offset. (If any code is emitted after a branch, that branch or 70 //! run of contiguous branches is no longer "latest".) The latest branches are 71 //! those that we can edit by simply truncating the buffer and doing something 72 //! else instead. 73 //! 74 //! To optimize branches, we implement several simple rules, and try to apply 75 //! them to the "latest branches" when possible: 76 //! 77 //! - A branch with a label target, when that label is bound to the ending 78 //! offset of the branch (the fallthrough location), can be removed altogether, 79 //! because the branch would have no effect). 80 //! 81 //! - An unconditional branch that starts at a label location, and branches to 82 //! another label, results in a "label alias": all references to the label bound 83 //! *to* this branch instruction are instead resolved to the *target* of the 84 //! branch instruction. This effectively removes empty blocks that just 85 //! unconditionally branch to the next block. We call this "branch threading". 86 //! 87 //! - A conditional followed by an unconditional, when the conditional branches 88 //! to the unconditional's fallthrough, results in (i) the truncation of the 89 //! unconditional, (ii) the inversion of the condition's condition, and (iii) 90 //! replacement of the conditional's target (using the original target of the 91 //! unconditional). This is a fancy way of saying "we can flip a two-target 92 //! conditional branch's taken/not-taken targets if it works better with our 93 //! fallthrough". To make this work, the emitter actually gives the buffer 94 //! *both* forms of every conditional branch: the true form is emitted into the 95 //! buffer, and the "inverted" machine-code bytes are provided as part of the 96 //! branch-fixup metadata. 97 //! 98 //! - An unconditional B preceded by another unconditional P, when B's label(s) have 99 //! been redirected to target(B), can be removed entirely. This is an extension 100 //! of the branch-threading optimization, and is valid because if we know there 101 //! will be no fallthrough into this branch instruction (the prior instruction 102 //! is an unconditional jump), and if we know we have successfully redirected 103 //! all labels, then this branch instruction is unreachable. Note that this 104 //! works because the redirection happens before the label is ever resolved 105 //! (fixups happen at island emission time, at which point latest-branches are 106 //! cleared, or at the end of emission), so we are sure to catch and redirect 107 //! all possible paths to this instruction. 108 //! 109 //! # Branch-optimization Correctness 110 //! 111 //! The branch-optimization mechanism depends on a few data structures with 112 //! invariants, which are always held outside the scope of top-level public 113 //! methods: 114 //! 115 //! - The latest-branches list. Each entry describes a span of the buffer 116 //! (start/end offsets), the label target, the corresponding fixup-list entry 117 //! index, and the bytes (must be the same length) for the inverted form, if 118 //! conditional. The list of labels that are bound to the start-offset of this 119 //! branch is *complete* (if any label has a resolved offset equal to `start` 120 //! and is not an alias, it must appear in this list) and *precise* (no label 121 //! in this list can be bound to another offset). No label in this list should 122 //! be an alias. No two branch ranges can overlap, and branches are in 123 //! ascending-offset order. 124 //! 125 //! - The labels-at-tail list. This contains all MachLabels that have been bound 126 //! to (whose resolved offsets are equal to) the tail offset of the buffer. 127 //! No label in this list should be an alias. 128 //! 129 //! - The label_offsets array, containing the bound offset of a label or 130 //! UNKNOWN. No label can be bound at an offset greater than the current 131 //! buffer tail. 132 //! 133 //! - The label_aliases array, containing another label to which a label is 134 //! bound or UNKNOWN. A label's resolved offset is the resolved offset 135 //! of the label it is aliased to, if this is set. 136 //! 137 //! We argue below, at each method, how the invariants in these data structures 138 //! are maintained (grep for "Post-invariant"). 139 //! 140 //! Given these invariants, we argue why each optimization preserves execution 141 //! semantics below (grep for "Preserves execution semantics"). 142 143 use crate::binemit::{Addend, CodeOffset, Reloc, StackMap}; 144 use crate::ir::{ExternalName, Opcode, SourceLoc, TrapCode}; 145 use crate::isa::unwind::UnwindInst; 146 use crate::machinst::{ 147 BlockIndex, MachInstLabelUse, TextSectionBuilder, VCodeConstant, VCodeConstants, VCodeInst, 148 }; 149 use crate::timing; 150 use cranelift_entity::{entity_impl, SecondaryMap}; 151 use log::trace; 152 use smallvec::SmallVec; 153 use std::convert::TryFrom; 154 use std::mem; 155 use std::string::String; 156 use std::vec::Vec; 157 158 /// A buffer of output to be produced, fixed up, and then emitted to a CodeSink 159 /// in bulk. 160 /// 161 /// This struct uses `SmallVec`s to support small-ish function bodies without 162 /// any heap allocation. As such, it will be several kilobytes large. This is 163 /// likely fine as long as it is stack-allocated for function emission then 164 /// thrown away; but beware if many buffer objects are retained persistently. 165 pub struct MachBuffer<I: VCodeInst> { 166 /// The buffer contents, as raw bytes. 167 data: SmallVec<[u8; 1024]>, 168 /// Any relocations referring to this code. Note that only *external* 169 /// relocations are tracked here; references to labels within the buffer are 170 /// resolved before emission. 171 relocs: SmallVec<[MachReloc; 16]>, 172 /// Any trap records referring to this code. 173 traps: SmallVec<[MachTrap; 16]>, 174 /// Any call site records referring to this code. 175 call_sites: SmallVec<[MachCallSite; 16]>, 176 /// Any source location mappings referring to this code. 177 srclocs: SmallVec<[MachSrcLoc; 64]>, 178 /// Any stack maps referring to this code. 179 stack_maps: SmallVec<[MachStackMap; 8]>, 180 /// Any unwind info at a given location. 181 unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>, 182 /// The current source location in progress (after `start_srcloc()` and 183 /// before `end_srcloc()`). This is a (start_offset, src_loc) tuple. 184 cur_srcloc: Option<(CodeOffset, SourceLoc)>, 185 /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown. 186 label_offsets: SmallVec<[CodeOffset; 16]>, 187 /// Label aliases: when one label points to an unconditional jump, and that 188 /// jump points to another label, we can redirect references to the first 189 /// label immediately to the second. 190 /// 191 /// Invariant: we don't have label-alias cycles. We ensure this by, 192 /// before setting label A to alias label B, resolving B's alias 193 /// target (iteratively until a non-aliased label); if B is already 194 /// aliased to A, then we cannot alias A back to B. 195 label_aliases: SmallVec<[MachLabel; 16]>, 196 /// Constants that must be emitted at some point. 197 pending_constants: SmallVec<[MachLabelConstant; 16]>, 198 /// Fixups that must be performed after all code is emitted. 199 fixup_records: SmallVec<[MachLabelFixup<I>; 16]>, 200 /// Current deadline at which all constants are flushed and all code labels 201 /// are extended by emitting long-range jumps in an island. This flush 202 /// should be rare (e.g., on AArch64, the shortest-range PC-rel references 203 /// are +/- 1MB for conditional jumps and load-literal instructions), so 204 /// it's acceptable to track a minimum and flush-all rather than doing more 205 /// detailed "current minimum" / sort-by-deadline trickery. 206 island_deadline: CodeOffset, 207 /// How many bytes are needed in the worst case for an island, given all 208 /// pending constants and fixups. 209 island_worst_case_size: CodeOffset, 210 /// Latest branches, to facilitate in-place editing for better fallthrough 211 /// behavior and empty-block removal. 212 latest_branches: SmallVec<[MachBranch; 4]>, 213 /// All labels at the current offset (emission tail). This is lazily 214 /// cleared: it is actually accurate as long as the current offset is 215 /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should 216 /// be considered as empty. 217 /// 218 /// For correctness, this *must* be complete (i.e., the vector must contain 219 /// all labels whose offsets are resolved to the current tail), because we 220 /// rely on it to update labels when we truncate branches. 221 labels_at_tail: SmallVec<[MachLabel; 4]>, 222 /// The last offset at which `labels_at_tail` is valid. It is conceptually 223 /// always describing the tail of the buffer, but we do not clear 224 /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it 225 /// when the offset has grown past this (`labels_at_tail_off`) point. 226 /// Always <= `cur_offset()`. 227 labels_at_tail_off: CodeOffset, 228 /// Map used constants to their [MachLabel]. 229 constant_labels: SecondaryMap<VCodeConstant, MachLabel>, 230 } 231 232 /// A `MachBuffer` once emission is completed: holds generated code and records, 233 /// without fixups. This allows the type to be independent of the backend. 234 pub struct MachBufferFinalized { 235 /// The buffer contents, as raw bytes. 236 data: SmallVec<[u8; 1024]>, 237 /// Any relocations referring to this code. Note that only *external* 238 /// relocations are tracked here; references to labels within the buffer are 239 /// resolved before emission. 240 relocs: SmallVec<[MachReloc; 16]>, 241 /// Any trap records referring to this code. 242 traps: SmallVec<[MachTrap; 16]>, 243 /// Any call site records referring to this code. 244 call_sites: SmallVec<[MachCallSite; 16]>, 245 /// Any source location mappings referring to this code. 246 srclocs: SmallVec<[MachSrcLoc; 64]>, 247 /// Any stack maps referring to this code. 248 stack_maps: SmallVec<[MachStackMap; 8]>, 249 /// Any unwind info at a given location. 250 pub unwind_info: SmallVec<[(CodeOffset, UnwindInst); 8]>, 251 } 252 253 const UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff; 254 const UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff); 255 256 /// Threshold on max length of `labels_at_this_branch` list to avoid 257 /// unbounded quadratic behavior (see comment below at use-site). 258 const LABEL_LIST_THRESHOLD: usize = 100; 259 260 /// A label refers to some offset in a `MachBuffer`. It may not be resolved at 261 /// the point at which it is used by emitted code; the buffer records "fixups" 262 /// for references to the label, and will come back and patch the code 263 /// appropriately when the label's location is eventually known. 264 #[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)] 265 pub struct MachLabel(u32); 266 entity_impl!(MachLabel); 267 268 impl MachLabel { 269 /// Get a label for a block. (The first N MachLabels are always reseved for 270 /// the N blocks in the vcode.) 271 pub fn from_block(bindex: BlockIndex) -> MachLabel { 272 MachLabel(bindex.index() as u32) 273 } 274 275 /// Get the numeric label index. 276 pub fn get(self) -> u32 { 277 self.0 278 } 279 280 /// Creates a string representing this label, for convenience. 281 pub fn to_string(&self) -> String { 282 format!("label{}", self.0) 283 } 284 } 285 286 impl Default for MachLabel { 287 fn default() -> Self { 288 UNKNOWN_LABEL 289 } 290 } 291 292 /// A stack map extent, when creating a stack map. 293 pub enum StackMapExtent { 294 /// The stack map starts at this instruction, and ends after the number of upcoming bytes 295 /// (note: this is a code offset diff). 296 UpcomingBytes(CodeOffset), 297 298 /// The stack map started at the given offset and ends at the current one. This helps 299 /// architectures where the instruction size has not a fixed length. 300 StartedAtOffset(CodeOffset), 301 } 302 303 impl<I: VCodeInst> MachBuffer<I> { 304 /// Create a new section, known to start at `start_offset` and with a size limited to 305 /// `length_limit`. 306 pub fn new() -> MachBuffer<I> { 307 MachBuffer { 308 data: SmallVec::new(), 309 relocs: SmallVec::new(), 310 traps: SmallVec::new(), 311 call_sites: SmallVec::new(), 312 srclocs: SmallVec::new(), 313 stack_maps: SmallVec::new(), 314 unwind_info: SmallVec::new(), 315 cur_srcloc: None, 316 label_offsets: SmallVec::new(), 317 label_aliases: SmallVec::new(), 318 pending_constants: SmallVec::new(), 319 fixup_records: SmallVec::new(), 320 island_deadline: UNKNOWN_LABEL_OFFSET, 321 island_worst_case_size: 0, 322 latest_branches: SmallVec::new(), 323 labels_at_tail: SmallVec::new(), 324 labels_at_tail_off: 0, 325 constant_labels: SecondaryMap::new(), 326 } 327 } 328 329 /// Current offset from start of buffer. 330 pub fn cur_offset(&self) -> CodeOffset { 331 self.data.len() as CodeOffset 332 } 333 334 /// Add a byte. 335 pub fn put1(&mut self, value: u8) { 336 trace!("MachBuffer: put byte @ {}: {:x}", self.cur_offset(), value); 337 self.data.push(value); 338 339 // Post-invariant: conceptual-labels_at_tail contains a complete and 340 // precise list of labels bound at `cur_offset()`. We have advanced 341 // `cur_offset()`, hence if it had been equal to `labels_at_tail_off` 342 // before, it is not anymore (and it cannot become equal, because 343 // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is 344 // conceptually empty (even though it is only lazily cleared). No labels 345 // can be bound at this new offset (by invariant on `label_offsets`). 346 // Hence the invariant holds. 347 } 348 349 /// Add 2 bytes. 350 pub fn put2(&mut self, value: u16) { 351 trace!( 352 "MachBuffer: put 16-bit word @ {}: {:x}", 353 self.cur_offset(), 354 value 355 ); 356 let bytes = value.to_le_bytes(); 357 self.data.extend_from_slice(&bytes[..]); 358 359 // Post-invariant: as for `put1()`. 360 } 361 362 /// Add 4 bytes. 363 pub fn put4(&mut self, value: u32) { 364 trace!( 365 "MachBuffer: put 32-bit word @ {}: {:x}", 366 self.cur_offset(), 367 value 368 ); 369 let bytes = value.to_le_bytes(); 370 self.data.extend_from_slice(&bytes[..]); 371 372 // Post-invariant: as for `put1()`. 373 } 374 375 /// Add 8 bytes. 376 pub fn put8(&mut self, value: u64) { 377 trace!( 378 "MachBuffer: put 64-bit word @ {}: {:x}", 379 self.cur_offset(), 380 value 381 ); 382 let bytes = value.to_le_bytes(); 383 self.data.extend_from_slice(&bytes[..]); 384 385 // Post-invariant: as for `put1()`. 386 } 387 388 /// Add a slice of bytes. 389 pub fn put_data(&mut self, data: &[u8]) { 390 trace!( 391 "MachBuffer: put data @ {}: len {}", 392 self.cur_offset(), 393 data.len() 394 ); 395 self.data.extend_from_slice(data); 396 397 // Post-invariant: as for `put1()`. 398 } 399 400 /// Reserve appended space and return a mutable slice referring to it. 401 pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] { 402 trace!("MachBuffer: put data @ {}: len {}", self.cur_offset(), len); 403 let off = self.data.len(); 404 let new_len = self.data.len() + len; 405 self.data.resize(new_len, 0); 406 &mut self.data[off..] 407 408 // Post-invariant: as for `put1()`. 409 } 410 411 /// Align up to the given alignment. 412 pub fn align_to(&mut self, align_to: CodeOffset) { 413 trace!("MachBuffer: align to {}", align_to); 414 assert!(align_to.is_power_of_two()); 415 while self.cur_offset() & (align_to - 1) != 0 { 416 self.put1(0); 417 } 418 419 // Post-invariant: as for `put1()`. 420 } 421 422 /// Allocate a `Label` to refer to some offset. May not be bound to a fixed 423 /// offset yet. 424 pub fn get_label(&mut self) -> MachLabel { 425 let l = self.label_offsets.len() as u32; 426 self.label_offsets.push(UNKNOWN_LABEL_OFFSET); 427 self.label_aliases.push(UNKNOWN_LABEL); 428 trace!("MachBuffer: new label -> {:?}", MachLabel(l)); 429 MachLabel(l) 430 431 // Post-invariant: the only mutation is to add a new label; it has no 432 // bound offset yet, so it trivially satisfies all invariants. 433 } 434 435 /// Reserve the first N MachLabels for blocks. 436 pub fn reserve_labels_for_blocks(&mut self, blocks: usize) { 437 trace!("MachBuffer: first {} labels are for blocks", blocks); 438 debug_assert!(self.label_offsets.is_empty()); 439 self.label_offsets.resize(blocks, UNKNOWN_LABEL_OFFSET); 440 self.label_aliases.resize(blocks, UNKNOWN_LABEL); 441 442 // Post-invariant: as for `get_label()`. 443 } 444 445 /// Reserve the next N MachLabels for constants. 446 pub fn reserve_labels_for_constants(&mut self, constants: &VCodeConstants) { 447 trace!( 448 "MachBuffer: next {} labels are for constants", 449 constants.len() 450 ); 451 for c in constants.keys() { 452 self.constant_labels[c] = self.get_label(); 453 } 454 455 // Post-invariant: as for `get_label()`. 456 } 457 458 /// Retrieve the reserved label for a constant. 459 pub fn get_label_for_constant(&self, constant: VCodeConstant) -> MachLabel { 460 self.constant_labels[constant] 461 } 462 463 /// Bind a label to the current offset. A label can only be bound once. 464 pub fn bind_label(&mut self, label: MachLabel) { 465 trace!( 466 "MachBuffer: bind label {:?} at offset {}", 467 label, 468 self.cur_offset() 469 ); 470 debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET); 471 debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL); 472 let offset = self.cur_offset(); 473 self.label_offsets[label.0 as usize] = offset; 474 self.lazily_clear_labels_at_tail(); 475 self.labels_at_tail.push(label); 476 477 // Invariants hold: bound offset of label is <= cur_offset (in fact it 478 // is equal). If the `labels_at_tail` list was complete and precise 479 // before, it is still, because we have bound this label to the current 480 // offset and added it to the list (which contains all labels at the 481 // current offset). 482 483 self.optimize_branches(); 484 485 // Post-invariant: by `optimize_branches()` (see argument there). 486 } 487 488 /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the 489 /// offset that it applies to. 490 fn lazily_clear_labels_at_tail(&mut self) { 491 let offset = self.cur_offset(); 492 if offset > self.labels_at_tail_off { 493 self.labels_at_tail_off = offset; 494 self.labels_at_tail.clear(); 495 } 496 497 // Post-invariant: either labels_at_tail_off was at cur_offset, and 498 // state is untouched, or was less than cur_offset, in which case the 499 // labels_at_tail list was conceptually empty, and is now actually 500 // empty. 501 } 502 503 /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`. 504 pub(crate) fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset { 505 let mut iters = 0; 506 while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL { 507 label = self.label_aliases[label.0 as usize]; 508 // To protect against an infinite loop (despite our assurances to 509 // ourselves that the invariants make this impossible), assert out 510 // after 1M iterations. The number of basic blocks is limited 511 // in most contexts anyway so this should be impossible to hit with 512 // a legitimate input. 513 iters += 1; 514 assert!(iters < 1_000_000, "Unexpected cycle in label aliases"); 515 } 516 self.label_offsets[label.0 as usize] 517 518 // Post-invariant: no mutations. 519 } 520 521 /// Emit a reference to the given label with the given reference type (i.e., 522 /// branch-instruction format) at the current offset. This is like a 523 /// relocation, but handled internally. 524 /// 525 /// This can be called before the branch is actually emitted; fixups will 526 /// not happen until an island is emitted or the buffer is finished. 527 pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) { 528 trace!( 529 "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}", 530 offset, 531 label, 532 kind 533 ); 534 535 // Add the fixup, and update the worst-case island size based on a 536 // veneer for this label use. 537 self.fixup_records.push(MachLabelFixup { 538 label, 539 offset, 540 kind, 541 }); 542 if kind.supports_veneer() { 543 self.island_worst_case_size += kind.veneer_size(); 544 self.island_worst_case_size &= !(I::LabelUse::ALIGN - 1); 545 } 546 let deadline = offset.saturating_add(kind.max_pos_range()); 547 if deadline < self.island_deadline { 548 self.island_deadline = deadline; 549 } 550 551 // Post-invariant: no mutations to branches/labels data structures. 552 } 553 554 /// Inform the buffer of an unconditional branch at the given offset, 555 /// targetting the given label. May be used to optimize branches. 556 /// The last added label-use must correspond to this branch. 557 /// This must be called when the current offset is equal to `start`; i.e., 558 /// before actually emitting the branch. This implies that for a branch that 559 /// uses a label and is eligible for optimizations by the MachBuffer, the 560 /// proper sequence is: 561 /// 562 /// - Call `use_label_at_offset()` to emit the fixup record. 563 /// - Call `add_uncond_branch()` to make note of the branch. 564 /// - Emit the bytes for the branch's machine code. 565 /// 566 /// Additional requirement: no labels may be bound between `start` and `end` 567 /// (exclusive on both ends). 568 pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) { 569 assert!(self.cur_offset() == start); 570 debug_assert!(end > start); 571 assert!(!self.fixup_records.is_empty()); 572 let fixup = self.fixup_records.len() - 1; 573 self.lazily_clear_labels_at_tail(); 574 self.latest_branches.push(MachBranch { 575 start, 576 end, 577 target, 578 fixup, 579 inverted: None, 580 labels_at_this_branch: self.labels_at_tail.clone(), 581 }); 582 583 // Post-invariant: we asserted branch start is current tail; the list of 584 // labels at branch is cloned from list of labels at current tail. 585 } 586 587 /// Inform the buffer of a conditional branch at the given offset, 588 /// targetting the given label. May be used to optimize branches. 589 /// The last added label-use must correspond to this branch. 590 /// 591 /// Additional requirement: no labels may be bound between `start` and `end` 592 /// (exclusive on both ends). 593 pub fn add_cond_branch( 594 &mut self, 595 start: CodeOffset, 596 end: CodeOffset, 597 target: MachLabel, 598 inverted: &[u8], 599 ) { 600 assert!(self.cur_offset() == start); 601 debug_assert!(end > start); 602 assert!(!self.fixup_records.is_empty()); 603 debug_assert!(inverted.len() == (end - start) as usize); 604 let fixup = self.fixup_records.len() - 1; 605 let inverted = Some(SmallVec::from(inverted)); 606 self.lazily_clear_labels_at_tail(); 607 self.latest_branches.push(MachBranch { 608 start, 609 end, 610 target, 611 fixup, 612 inverted, 613 labels_at_this_branch: self.labels_at_tail.clone(), 614 }); 615 616 // Post-invariant: we asserted branch start is current tail; labels at 617 // branch list is cloned from list of labels at current tail. 618 } 619 620 fn truncate_last_branch(&mut self) { 621 self.lazily_clear_labels_at_tail(); 622 // Invariants hold at this point. 623 624 let b = self.latest_branches.pop().unwrap(); 625 assert!(b.end == self.cur_offset()); 626 627 // State: 628 // [PRE CODE] 629 // Offset b.start, b.labels_at_this_branch: 630 // [BRANCH CODE] 631 // cur_off, self.labels_at_tail --> 632 // (end of buffer) 633 self.data.truncate(b.start as usize); 634 self.fixup_records.truncate(b.fixup); 635 while let Some(mut last_srcloc) = self.srclocs.last_mut() { 636 if last_srcloc.end <= b.start { 637 break; 638 } 639 if last_srcloc.start < b.start { 640 last_srcloc.end = b.start; 641 break; 642 } 643 self.srclocs.pop(); 644 } 645 // State: 646 // [PRE CODE] 647 // cur_off, Offset b.start, b.labels_at_this_branch: 648 // (end of buffer) 649 // 650 // self.labels_at_tail --> (past end of buffer) 651 let cur_off = self.cur_offset(); 652 self.labels_at_tail_off = cur_off; 653 // State: 654 // [PRE CODE] 655 // cur_off, Offset b.start, b.labels_at_this_branch, 656 // self.labels_at_tail: 657 // (end of buffer) 658 // 659 // resolve_label_offset(l) for l in labels_at_tail: 660 // (past end of buffer) 661 662 trace!( 663 "truncate_last_branch: truncated {:?}; off now {}", 664 b, 665 cur_off 666 ); 667 668 // Fix up resolved label offsets for labels at tail. 669 for &l in &self.labels_at_tail { 670 self.label_offsets[l.0 as usize] = cur_off; 671 } 672 // Old labels_at_this_branch are now at cur_off. 673 self.labels_at_tail 674 .extend(b.labels_at_this_branch.into_iter()); 675 676 // Post-invariant: this operation is defined to truncate the buffer, 677 // which moves cur_off backward, and to move labels at the end of the 678 // buffer back to the start-of-branch offset. 679 // 680 // latest_branches satisfies all invariants: 681 // - it has no branches past the end of the buffer (branches are in 682 // order, we removed the last one, and we truncated the buffer to just 683 // before the start of that branch) 684 // - no labels were moved to lower offsets than the (new) cur_off, so 685 // the labels_at_this_branch list for any other branch need not change. 686 // 687 // labels_at_tail satisfies all invariants: 688 // - all labels that were at the tail after the truncated branch are 689 // moved backward to just before the branch, which becomes the new tail; 690 // thus every element in the list should remain (ensured by `.extend()` 691 // above). 692 // - all labels that refer to the new tail, which is the start-offset of 693 // the truncated branch, must be present. The `labels_at_this_branch` 694 // list in the truncated branch's record is a complete and precise list 695 // of exactly these labels; we append these to labels_at_tail. 696 // - labels_at_tail_off is at cur_off after truncation occurs, so the 697 // list is valid (not to be lazily cleared). 698 // 699 // The stated operation was performed: 700 // - For each label at the end of the buffer prior to this method, it 701 // now resolves to the new (truncated) end of the buffer: it must have 702 // been in `labels_at_tail` (this list is precise and complete, and 703 // the tail was at the end of the truncated branch on entry), and we 704 // iterate over this list and set `label_offsets` to the new tail. 705 // None of these labels could have been an alias (by invariant), so 706 // `label_offsets` is authoritative for each. 707 // - No other labels will be past the end of the buffer, because of the 708 // requirement that no labels be bound to the middle of branch ranges 709 // (see comments to `add_{cond,uncond}_branch()`). 710 // - The buffer is truncated to just before the last branch, and the 711 // fixup record referring to that last branch is removed. 712 } 713 714 fn optimize_branches(&mut self) { 715 self.lazily_clear_labels_at_tail(); 716 // Invariants valid at this point. 717 718 trace!( 719 "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}", 720 self.latest_branches, 721 self.labels_at_tail, 722 self.fixup_records 723 ); 724 725 // We continue to munch on branches at the tail of the buffer until no 726 // more rules apply. Note that the loop only continues if a branch is 727 // actually truncated (or if labels are redirected away from a branch), 728 // so this always makes progress. 729 while let Some(b) = self.latest_branches.last() { 730 let cur_off = self.cur_offset(); 731 trace!("optimize_branches: last branch {:?} at off {}", b, cur_off); 732 // If there has been any code emission since the end of the last branch or 733 // label definition, then there's nothing we can edit (because we 734 // don't move code once placed, only back up and overwrite), so 735 // clear the records and finish. 736 if b.end < cur_off { 737 break; 738 } 739 740 // If the "labels at this branch" list on this branch is 741 // longer than a threshold, don't do any simplification, 742 // and let the branch remain to separate those labels from 743 // the current tail. This avoids quadratic behavior (see 744 // #3468): otherwise, if a long string of "goto next; 745 // next:" patterns are emitted, all of the labels will 746 // coalesce into a long list of aliases for the current 747 // buffer tail. We must track all aliases of the current 748 // tail for correctness, but we are also allowed to skip 749 // optimization (removal) of any branch, so we take the 750 // escape hatch here and let it stand. In effect this 751 // "spreads" the many thousands of labels in the 752 // pathological case among an actual (harmless but 753 // suboptimal) instruction once per N labels. 754 if b.labels_at_this_branch.len() > LABEL_LIST_THRESHOLD { 755 break; 756 } 757 758 // Invariant: we are looking at a branch that ends at the tail of 759 // the buffer. 760 761 // For any branch, conditional or unconditional: 762 // - If the target is a label at the current offset, then remove 763 // the conditional branch, and reset all labels that targetted 764 // the current offset (end of branch) to the truncated 765 // end-of-code. 766 // 767 // Preserves execution semantics: a branch to its own fallthrough 768 // address is equivalent to a no-op; in both cases, nextPC is the 769 // fallthrough. 770 if self.resolve_label_offset(b.target) == cur_off { 771 trace!("branch with target == cur off; truncating"); 772 self.truncate_last_branch(); 773 continue; 774 } 775 776 // If latest is an unconditional branch: 777 // 778 // - If the branch's target is not its own start address, then for 779 // each label at the start of branch, make the label an alias of the 780 // branch target, and remove the label from the "labels at this 781 // branch" list. 782 // 783 // - Preserves execution semantics: an unconditional branch's 784 // only effect is to set PC to a new PC; this change simply 785 // collapses one step in the step-semantics. 786 // 787 // - Post-invariant: the labels that were bound to the start of 788 // this branch become aliases, so they must not be present in any 789 // labels-at-this-branch list or the labels-at-tail list. The 790 // labels are removed form the latest-branch record's 791 // labels-at-this-branch list, and are never placed in the 792 // labels-at-tail list. Furthermore, it is correct that they are 793 // not in either list, because they are now aliases, and labels 794 // that are aliases remain aliases forever. 795 // 796 // - If there is a prior unconditional branch that ends just before 797 // this one begins, and this branch has no labels bound to its 798 // start, then we can truncate this branch, because it is entirely 799 // unreachable (we have redirected all labels that make it 800 // reachable otherwise). Do so and continue around the loop. 801 // 802 // - Preserves execution semantics: the branch is unreachable, 803 // because execution can only flow into an instruction from the 804 // prior instruction's fallthrough or from a branch bound to that 805 // instruction's start offset. Unconditional branches have no 806 // fallthrough, so if the prior instruction is an unconditional 807 // branch, no fallthrough entry can happen. The 808 // labels-at-this-branch list is complete (by invariant), so if it 809 // is empty, then the instruction is entirely unreachable. Thus, 810 // it can be removed. 811 // 812 // - Post-invariant: ensured by truncate_last_branch(). 813 // 814 // - If there is a prior conditional branch whose target label 815 // resolves to the current offset (branches around the 816 // unconditional branch), then remove the unconditional branch, 817 // and make the target of the unconditional the target of the 818 // conditional instead. 819 // 820 // - Preserves execution semantics: previously we had: 821 // 822 // L1: 823 // cond_br L2 824 // br L3 825 // L2: 826 // (end of buffer) 827 // 828 // by removing the last branch, we have: 829 // 830 // L1: 831 // cond_br L2 832 // L2: 833 // (end of buffer) 834 // 835 // we then fix up the records for the conditional branch to 836 // have: 837 // 838 // L1: 839 // cond_br.inverted L3 840 // L2: 841 // 842 // In the original code, control flow reaches L2 when the 843 // conditional branch's predicate is true, and L3 otherwise. In 844 // the optimized code, the same is true. 845 // 846 // - Post-invariant: all edits to latest_branches and 847 // labels_at_tail are performed by `truncate_last_branch()`, 848 // which maintains the invariants at each step. 849 850 if b.is_uncond() { 851 // Set any label equal to current branch's start as an alias of 852 // the branch's target, if the target is not the branch itself 853 // (i.e., an infinite loop). 854 // 855 // We cannot perform this aliasing if the target of this branch 856 // ultimately aliases back here; if so, we need to keep this 857 // branch, so break out of this loop entirely (and clear the 858 // latest-branches list below). 859 // 860 // Note that this check is what prevents cycles from forming in 861 // `self.label_aliases`. To see why, consider an arbitrary start 862 // state: 863 // 864 // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to 865 // Ln, which is not aliased. 866 // 867 // We would create a cycle if we assigned label_aliases[Ln] 868 // = L1. Note that the below assignment is the only write 869 // to label_aliases. 870 // 871 // By our other invariants, we have that Ln (`l` below) 872 // resolves to the offset `b.start`, because it is in the 873 // set `b.labels_at_this_branch`. 874 // 875 // If L1 were already aliased, through some arbitrarily deep 876 // chain, to Ln, then it must also resolve to this offset 877 // `b.start`. 878 // 879 // By checking the resolution of `L1` against this offset, 880 // and aborting this branch-simplification if they are 881 // equal, we prevent the below assignment from ever creating 882 // a cycle. 883 if self.resolve_label_offset(b.target) != b.start { 884 let redirected = b.labels_at_this_branch.len(); 885 for &l in &b.labels_at_this_branch { 886 trace!( 887 " -> label at start of branch {:?} redirected to target {:?}", 888 l, 889 b.target 890 ); 891 self.label_aliases[l.0 as usize] = b.target; 892 // NOTE: we continue to ensure the invariant that labels 893 // pointing to tail of buffer are in `labels_at_tail` 894 // because we already ensured above that the last branch 895 // cannot have a target of `cur_off`; so we never have 896 // to put the label into `labels_at_tail` when moving it 897 // here. 898 } 899 // Maintain invariant: all branches have been redirected 900 // and are no longer pointing at the start of this branch. 901 let mut_b = self.latest_branches.last_mut().unwrap(); 902 mut_b.labels_at_this_branch.clear(); 903 904 if redirected > 0 { 905 trace!(" -> after label redirects, restarting loop"); 906 continue; 907 } 908 } else { 909 break; 910 } 911 912 let b = self.latest_branches.last().unwrap(); 913 914 // Examine any immediately preceding branch. 915 if self.latest_branches.len() > 1 { 916 let prev_b = &self.latest_branches[self.latest_branches.len() - 2]; 917 trace!(" -> more than one branch; prev_b = {:?}", prev_b); 918 // This uncond is immediately after another uncond; we 919 // should have already redirected labels to this uncond away 920 // (but check to be sure); so we can truncate this uncond. 921 if prev_b.is_uncond() 922 && prev_b.end == b.start 923 && b.labels_at_this_branch.is_empty() 924 { 925 trace!(" -> uncond follows another uncond; truncating"); 926 self.truncate_last_branch(); 927 continue; 928 } 929 930 // This uncond is immediately after a conditional, and the 931 // conditional's target is the end of this uncond, and we've 932 // already redirected labels to this uncond away; so we can 933 // truncate this uncond, flip the sense of the conditional, and 934 // set the conditional's target (in `latest_branches` and in 935 // `fixup_records`) to the uncond's target. 936 if prev_b.is_cond() 937 && prev_b.end == b.start 938 && self.resolve_label_offset(prev_b.target) == cur_off 939 { 940 trace!(" -> uncond follows a conditional, and conditional's target resolves to current offset"); 941 // Save the target of the uncond (this becomes the 942 // target of the cond), and truncate the uncond. 943 let target = b.target; 944 let data = prev_b.inverted.clone().unwrap(); 945 self.truncate_last_branch(); 946 947 // Mutate the code and cond branch. 948 let off_before_edit = self.cur_offset(); 949 let prev_b = self.latest_branches.last_mut().unwrap(); 950 let not_inverted = SmallVec::from( 951 &self.data[(prev_b.start as usize)..(prev_b.end as usize)], 952 ); 953 954 // Low-level edit: replaces bytes of branch with 955 // inverted form. cur_off remains the same afterward, so 956 // we do not need to modify label data structures. 957 self.data.truncate(prev_b.start as usize); 958 self.data.extend_from_slice(&data[..]); 959 960 // Save the original code as the inversion of the 961 // inverted branch, in case we later edit this branch 962 // again. 963 prev_b.inverted = Some(not_inverted); 964 self.fixup_records[prev_b.fixup].label = target; 965 trace!(" -> reassigning target of condbr to {:?}", target); 966 prev_b.target = target; 967 debug_assert_eq!(off_before_edit, self.cur_offset()); 968 continue; 969 } 970 } 971 } 972 973 // If we couldn't do anything with the last branch, then break. 974 break; 975 } 976 977 self.purge_latest_branches(); 978 979 trace!( 980 "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}", 981 self.latest_branches, 982 self.labels_at_tail, 983 self.fixup_records 984 ); 985 } 986 987 fn purge_latest_branches(&mut self) { 988 // All of our branch simplification rules work only if a branch ends at 989 // the tail of the buffer, with no following code; and branches are in 990 // order in latest_branches; so if the last entry ends prior to 991 // cur_offset, then clear all entries. 992 let cur_off = self.cur_offset(); 993 if let Some(l) = self.latest_branches.last() { 994 if l.end < cur_off { 995 trace!("purge_latest_branches: removing branch {:?}", l); 996 self.latest_branches.clear(); 997 } 998 } 999 1000 // Post-invariant: no invariant requires any branch to appear in 1001 // `latest_branches`; it is always optional. The list-clear above thus 1002 // preserves all semantics. 1003 } 1004 1005 /// Emit a constant at some point in the future, binding the given label to 1006 /// its offset. The constant will be placed at most `max_distance` from the 1007 /// current offset. 1008 pub fn defer_constant( 1009 &mut self, 1010 label: MachLabel, 1011 align: CodeOffset, 1012 data: &[u8], 1013 max_distance: CodeOffset, 1014 ) { 1015 trace!( 1016 "defer_constant: eventually emit {} bytes aligned to {} at label {:?}", 1017 data.len(), 1018 align, 1019 label 1020 ); 1021 let deadline = self.cur_offset().saturating_add(max_distance); 1022 self.island_worst_case_size += data.len() as CodeOffset; 1023 self.island_worst_case_size = 1024 (self.island_worst_case_size + I::LabelUse::ALIGN - 1) & !(I::LabelUse::ALIGN - 1); 1025 self.pending_constants.push(MachLabelConstant { 1026 label, 1027 align, 1028 data: SmallVec::from(data), 1029 }); 1030 if deadline < self.island_deadline { 1031 self.island_deadline = deadline; 1032 } 1033 } 1034 1035 /// Is an island needed within the next N bytes? 1036 pub fn island_needed(&self, distance: CodeOffset) -> bool { 1037 self.worst_case_end_of_island(distance) > self.island_deadline 1038 } 1039 1040 /// Returns the maximal offset that islands can reach if `distance` more 1041 /// bytes are appended. 1042 /// 1043 /// This is used to determine if veneers need insertions since jumps that 1044 /// can't reach past this point must get a veneer of some form. 1045 fn worst_case_end_of_island(&self, distance: CodeOffset) -> CodeOffset { 1046 self.cur_offset() 1047 .saturating_add(distance) 1048 .saturating_add(self.island_worst_case_size) 1049 } 1050 1051 /// Emit all pending constants and required pending veneers. 1052 /// 1053 /// Should only be called if `island_needed()` returns true, i.e., if we 1054 /// actually reach a deadline. It's not necessarily a problem to do so 1055 /// otherwise but it may result in unnecessary work during emission. 1056 pub fn emit_island(&mut self, distance: CodeOffset) { 1057 self.emit_island_maybe_forced(false, distance); 1058 } 1059 1060 /// Same as `emit_island`, but an internal API with a `force_veneers` 1061 /// argument to force all veneers to always get emitted for debugging. 1062 fn emit_island_maybe_forced(&mut self, force_veneers: bool, distance: CodeOffset) { 1063 // We're going to purge fixups, so no latest-branch editing can happen 1064 // anymore. 1065 self.latest_branches.clear(); 1066 1067 // Reset internal calculations about islands since we're going to 1068 // change the calculus as we apply fixups. The `forced_threshold` is 1069 // used here to determine whether jumps to unknown labels will require 1070 // a veneer or not. 1071 let forced_threshold = self.worst_case_end_of_island(distance); 1072 self.island_deadline = UNKNOWN_LABEL_OFFSET; 1073 self.island_worst_case_size = 0; 1074 1075 // First flush out all constants so we have more labels in case fixups 1076 // are applied against these labels. 1077 for MachLabelConstant { label, align, data } in mem::take(&mut self.pending_constants) { 1078 self.align_to(align); 1079 self.bind_label(label); 1080 self.put_data(&data[..]); 1081 } 1082 1083 for fixup in mem::take(&mut self.fixup_records) { 1084 trace!("emit_island: fixup {:?}", fixup); 1085 let MachLabelFixup { 1086 label, 1087 offset, 1088 kind, 1089 } = fixup; 1090 let label_offset = self.resolve_label_offset(label); 1091 let start = offset as usize; 1092 let end = (offset + kind.patch_size()) as usize; 1093 1094 if label_offset != UNKNOWN_LABEL_OFFSET { 1095 // If the offset of the label for this fixup is known then 1096 // we're going to do something here-and-now. We're either going 1097 // to patch the original offset because it's an in-bounds jump, 1098 // or we're going to generate a veneer, patch the fixup to jump 1099 // to the veneer, and then keep going. 1100 // 1101 // If the label comes after the original fixup, then we should 1102 // be guaranteed that the jump is in-bounds. Otherwise there's 1103 // a bug somewhere because this method wasn't called soon 1104 // enough. All forward-jumps are tracked and should get veneers 1105 // before their deadline comes and they're unable to jump 1106 // further. 1107 // 1108 // Otherwise if the label is before the fixup, then that's a 1109 // backwards jump. If it's past the maximum negative range 1110 // then we'll emit a veneer that to jump forward to which can 1111 // then jump backwards. 1112 let veneer_required = if label_offset >= offset { 1113 assert!((label_offset - offset) <= kind.max_pos_range()); 1114 false 1115 } else { 1116 (offset - label_offset) > kind.max_neg_range() 1117 }; 1118 trace!( 1119 " -> label_offset = {}, known, required = {} (pos {} neg {})", 1120 label_offset, 1121 veneer_required, 1122 kind.max_pos_range(), 1123 kind.max_neg_range() 1124 ); 1125 1126 if (force_veneers && kind.supports_veneer()) || veneer_required { 1127 self.emit_veneer(label, offset, kind); 1128 } else { 1129 let slice = &mut self.data[start..end]; 1130 trace!("patching in-range!"); 1131 kind.patch(slice, offset, label_offset); 1132 } 1133 } else { 1134 // If the offset of this label is not known at this time then 1135 // there's one of two possibilities: 1136 // 1137 // * First we may be about to exceed the maximum jump range of 1138 // this fixup. In that case a veneer is inserted to buy some 1139 // more budget for the forward-jump. It's guaranteed that the 1140 // label will eventually come after where we're at, so we know 1141 // that the forward jump is necessary. 1142 // 1143 // * Otherwise we're still within range of the forward jump but 1144 // the precise target isn't known yet. In that case we 1145 // enqueue the fixup to get processed later. 1146 if forced_threshold - offset > kind.max_pos_range() { 1147 self.emit_veneer(label, offset, kind); 1148 } else { 1149 self.use_label_at_offset(offset, label, kind); 1150 } 1151 } 1152 } 1153 } 1154 1155 /// Emits a "veneer" the `kind` code at `offset` to jump to `label`. 1156 /// 1157 /// This will generate extra machine code, using `kind`, to get a 1158 /// larger-jump-kind than `kind` allows. The code at `offset` is then 1159 /// patched to jump to our new code, and then the new code is enqueued for 1160 /// a fixup to get processed at some later time. 1161 fn emit_veneer(&mut self, label: MachLabel, offset: CodeOffset, kind: I::LabelUse) { 1162 // If this `kind` doesn't support a veneer then that's a bug in the 1163 // backend because we need to implement support for such a veneer. 1164 assert!( 1165 kind.supports_veneer(), 1166 "jump beyond the range of {:?} but a veneer isn't supported", 1167 kind, 1168 ); 1169 1170 // Allocate space for a veneer in the island. 1171 self.align_to(I::LabelUse::ALIGN); 1172 let veneer_offset = self.cur_offset(); 1173 trace!("making a veneer at {}", veneer_offset); 1174 let start = offset as usize; 1175 let end = (offset + kind.patch_size()) as usize; 1176 let slice = &mut self.data[start..end]; 1177 // Patch the original label use to refer to the veneer. 1178 trace!( 1179 "patching original at offset {} to veneer offset {}", 1180 offset, 1181 veneer_offset 1182 ); 1183 kind.patch(slice, offset, veneer_offset); 1184 // Generate the veneer. 1185 let veneer_slice = self.get_appended_space(kind.veneer_size() as usize); 1186 let (veneer_fixup_off, veneer_label_use) = 1187 kind.generate_veneer(veneer_slice, veneer_offset); 1188 trace!( 1189 "generated veneer; fixup offset {}, label_use {:?}", 1190 veneer_fixup_off, 1191 veneer_label_use 1192 ); 1193 // Register a new use of `label` with our new veneer fixup and offset. 1194 // This'll recalculate deadlines accordingly and enqueue this fixup to 1195 // get processed at some later time. 1196 self.use_label_at_offset(veneer_fixup_off, label, veneer_label_use); 1197 } 1198 1199 fn finish_emission_maybe_forcing_veneers(&mut self, force_veneers: bool) { 1200 while !self.pending_constants.is_empty() || !self.fixup_records.is_empty() { 1201 // `emit_island()` will emit any pending veneers and constants, and 1202 // as a side-effect, will also take care of any fixups with resolved 1203 // labels eagerly. 1204 self.emit_island_maybe_forced(force_veneers, u32::MAX); 1205 } 1206 1207 // Ensure that all labels have been fixed up after the last island is emitted. This is a 1208 // full (release-mode) assert because an unresolved label means the emitted code is 1209 // incorrect. 1210 assert!(self.fixup_records.is_empty()); 1211 } 1212 1213 /// Finish any deferred emissions and/or fixups. 1214 pub fn finish(mut self) -> MachBufferFinalized { 1215 let _tt = timing::vcode_emit_finish(); 1216 1217 self.finish_emission_maybe_forcing_veneers(false); 1218 1219 let mut srclocs = self.srclocs; 1220 srclocs.sort_by_key(|entry| entry.start); 1221 1222 MachBufferFinalized { 1223 data: self.data, 1224 relocs: self.relocs, 1225 traps: self.traps, 1226 call_sites: self.call_sites, 1227 srclocs, 1228 stack_maps: self.stack_maps, 1229 unwind_info: self.unwind_info, 1230 } 1231 } 1232 1233 /// Add an external relocation at the current offset. 1234 pub fn add_reloc(&mut self, kind: Reloc, name: &ExternalName, addend: Addend) { 1235 let name = name.clone(); 1236 // FIXME(#3277): This should use `I::LabelUse::from_reloc` to optionally 1237 // generate a label-use statement to track whether an island is possibly 1238 // needed to escape this function to actually get to the external name. 1239 // This is most likely to come up on AArch64 where calls between 1240 // functions use a 26-bit signed offset which gives +/- 64MB. This means 1241 // that if a function is 128MB in size and there's a call in the middle 1242 // it's impossible to reach the actual target. Also, while it's 1243 // technically possible to jump to the start of a function and then jump 1244 // further, island insertion below always inserts islands after 1245 // previously appended code so for Cranelift's own implementation this 1246 // is also a problem for 64MB functions on AArch64 which start with a 1247 // call instruction, those won't be able to escape. 1248 // 1249 // Ideally what needs to happen here is that a `LabelUse` is 1250 // transparently generated (or call-sites of this function are audited 1251 // to generate a `LabelUse` instead) and tracked internally. The actual 1252 // relocation would then change over time if and when a veneer is 1253 // inserted, where the relocation here would be patched by this 1254 // `MachBuffer` to jump to the veneer. The problem, though, is that all 1255 // this still needs to end up, in the case of a singular function, 1256 // generating a final relocation pointing either to this particular 1257 // relocation or to the veneer inserted. Additionally 1258 // `MachBuffer` needs the concept of a label which will never be 1259 // resolved, so `emit_island` doesn't trip over not actually ever 1260 // knowning what some labels are. Currently the loop in 1261 // `finish_emission_maybe_forcing_veneers` would otherwise infinitely 1262 // loop. 1263 // 1264 // For now this means that because relocs aren't tracked at all that 1265 // AArch64 functions have a rough size limits of 64MB. For now that's 1266 // somewhat reasonable and the failure mode is a panic in `MachBuffer` 1267 // when a relocation can't otherwise be resolved later, so it shouldn't 1268 // actually result in any memory unsafety or anything like that. 1269 self.relocs.push(MachReloc { 1270 offset: self.data.len() as CodeOffset, 1271 kind, 1272 name, 1273 addend, 1274 }); 1275 } 1276 1277 /// Add a trap record at the current offset. 1278 pub fn add_trap(&mut self, code: TrapCode) { 1279 self.traps.push(MachTrap { 1280 offset: self.data.len() as CodeOffset, 1281 code, 1282 }); 1283 } 1284 1285 /// Add a call-site record at the current offset. 1286 pub fn add_call_site(&mut self, opcode: Opcode) { 1287 debug_assert!( 1288 opcode.is_call(), 1289 "adding call site info for a non-call instruction." 1290 ); 1291 self.call_sites.push(MachCallSite { 1292 ret_addr: self.data.len() as CodeOffset, 1293 opcode, 1294 }); 1295 } 1296 1297 /// Add an unwind record at the current offset. 1298 pub fn add_unwind(&mut self, unwind: UnwindInst) { 1299 self.unwind_info.push((self.cur_offset(), unwind)); 1300 } 1301 1302 /// Set the `SourceLoc` for code from this offset until the offset at the 1303 /// next call to `end_srcloc()`. 1304 pub fn start_srcloc(&mut self, loc: SourceLoc) { 1305 self.cur_srcloc = Some((self.cur_offset(), loc)); 1306 } 1307 1308 /// Mark the end of the `SourceLoc` segment started at the last 1309 /// `start_srcloc()` call. 1310 pub fn end_srcloc(&mut self) { 1311 let (start, loc) = self 1312 .cur_srcloc 1313 .take() 1314 .expect("end_srcloc() called without start_srcloc()"); 1315 let end = self.cur_offset(); 1316 // Skip zero-length extends. 1317 debug_assert!(end >= start); 1318 if end > start { 1319 self.srclocs.push(MachSrcLoc { start, end, loc }); 1320 } 1321 } 1322 1323 /// Add stack map metadata for this program point: a set of stack offsets 1324 /// (from SP upward) that contain live references. 1325 /// 1326 /// The `offset_to_fp` value is the offset from the nominal SP (at which the `stack_offsets` 1327 /// are based) and the FP value. By subtracting `offset_to_fp` from each `stack_offsets` 1328 /// element, one can obtain live-reference offsets from FP instead. 1329 pub fn add_stack_map(&mut self, extent: StackMapExtent, stack_map: StackMap) { 1330 let (start, end) = match extent { 1331 StackMapExtent::UpcomingBytes(insn_len) => { 1332 let start_offset = self.cur_offset(); 1333 (start_offset, start_offset + insn_len) 1334 } 1335 StackMapExtent::StartedAtOffset(start_offset) => { 1336 let end_offset = self.cur_offset(); 1337 debug_assert!(end_offset >= start_offset); 1338 (start_offset, end_offset) 1339 } 1340 }; 1341 self.stack_maps.push(MachStackMap { 1342 offset: start, 1343 offset_end: end, 1344 stack_map, 1345 }); 1346 } 1347 } 1348 1349 impl MachBufferFinalized { 1350 /// Get a list of source location mapping tuples in sorted-by-start-offset order. 1351 pub fn get_srclocs_sorted(&self) -> &[MachSrcLoc] { 1352 &self.srclocs[..] 1353 } 1354 1355 /// Get the total required size for the code. 1356 pub fn total_size(&self) -> CodeOffset { 1357 self.data.len() as CodeOffset 1358 } 1359 1360 /// Return the code in this mach buffer as a hex string for testing purposes. 1361 pub fn stringify_code_bytes(&self) -> String { 1362 // This is pretty lame, but whatever .. 1363 use std::fmt::Write; 1364 let mut s = String::with_capacity(self.data.len() * 2); 1365 for b in &self.data { 1366 write!(&mut s, "{:02X}", b).unwrap(); 1367 } 1368 s 1369 } 1370 1371 /// Get the code bytes. 1372 pub fn data(&self) -> &[u8] { 1373 // N.B.: we emit every section into the .text section as far as 1374 // the `CodeSink` is concerned; we do not bother to segregate 1375 // the contents into the actual program text, the jumptable and the 1376 // rodata (constant pool). This allows us to generate code assuming 1377 // that these will not be relocated relative to each other, and avoids 1378 // having to designate each section as belonging in one of the three 1379 // fixed categories defined by `CodeSink`. If this becomes a problem 1380 // later (e.g. because of memory permissions or similar), we can 1381 // add this designation and segregate the output; take care, however, 1382 // to add the appropriate relocations in this case. 1383 1384 &self.data[..] 1385 } 1386 1387 /// Get the list of external relocations for this code. 1388 pub fn relocs(&self) -> &[MachReloc] { 1389 &self.relocs[..] 1390 } 1391 1392 /// Get the list of trap records for this code. 1393 pub fn traps(&self) -> &[MachTrap] { 1394 &self.traps[..] 1395 } 1396 1397 /// Get the stack map metadata for this code. 1398 pub fn stack_maps(&self) -> &[MachStackMap] { 1399 &self.stack_maps[..] 1400 } 1401 1402 /// Get the list of call sites for this code. 1403 pub fn call_sites(&self) -> &[MachCallSite] { 1404 &self.call_sites[..] 1405 } 1406 } 1407 1408 /// A constant that is deferred to the next constant-pool opportunity. 1409 struct MachLabelConstant { 1410 /// This label will refer to the constant's offset. 1411 label: MachLabel, 1412 /// Required alignment. 1413 align: CodeOffset, 1414 /// This data will be emitted when able. 1415 data: SmallVec<[u8; 16]>, 1416 } 1417 1418 /// A fixup to perform on the buffer once code is emitted. Fixups always refer 1419 /// to labels and patch the code based on label offsets. Hence, they are like 1420 /// relocations, but internal to one buffer. 1421 #[derive(Debug)] 1422 struct MachLabelFixup<I: VCodeInst> { 1423 /// The label whose offset controls this fixup. 1424 label: MachLabel, 1425 /// The offset to fix up / patch to refer to this label. 1426 offset: CodeOffset, 1427 /// The kind of fixup. This is architecture-specific; each architecture may have, 1428 /// e.g., several types of branch instructions, each with differently-sized 1429 /// offset fields and different places within the instruction to place the 1430 /// bits. 1431 kind: I::LabelUse, 1432 } 1433 1434 /// A relocation resulting from a compilation. 1435 #[derive(Clone, Debug)] 1436 pub struct MachReloc { 1437 /// The offset at which the relocation applies, *relative to the 1438 /// containing section*. 1439 pub offset: CodeOffset, 1440 /// The kind of relocation. 1441 pub kind: Reloc, 1442 /// The external symbol / name to which this relocation refers. 1443 pub name: ExternalName, 1444 /// The addend to add to the symbol value. 1445 pub addend: i64, 1446 } 1447 1448 /// A trap record resulting from a compilation. 1449 #[derive(Clone, Debug)] 1450 pub struct MachTrap { 1451 /// The offset at which the trap instruction occurs, *relative to the 1452 /// containing section*. 1453 pub offset: CodeOffset, 1454 /// The trap code. 1455 pub code: TrapCode, 1456 } 1457 1458 /// A call site record resulting from a compilation. 1459 #[derive(Clone, Debug)] 1460 pub struct MachCallSite { 1461 /// The offset of the call's return address, *relative to the containing section*. 1462 pub ret_addr: CodeOffset, 1463 /// The call's opcode. 1464 pub opcode: Opcode, 1465 } 1466 1467 /// A source-location mapping resulting from a compilation. 1468 #[derive(Clone, Debug)] 1469 pub struct MachSrcLoc { 1470 /// The start of the region of code corresponding to a source location. 1471 /// This is relative to the start of the function, not to the start of the 1472 /// section. 1473 pub start: CodeOffset, 1474 /// The end of the region of code corresponding to a source location. 1475 /// This is relative to the start of the section, not to the start of the 1476 /// section. 1477 pub end: CodeOffset, 1478 /// The source location. 1479 pub loc: SourceLoc, 1480 } 1481 1482 /// Record of stack map metadata: stack offsets containing references. 1483 #[derive(Clone, Debug)] 1484 pub struct MachStackMap { 1485 /// The code offset at which this stack map applies. 1486 pub offset: CodeOffset, 1487 /// The code offset just past the "end" of the instruction: that is, the 1488 /// offset of the first byte of the following instruction, or equivalently, 1489 /// the start offset plus the instruction length. 1490 pub offset_end: CodeOffset, 1491 /// The stack map itself. 1492 pub stack_map: StackMap, 1493 } 1494 1495 /// Record of branch instruction in the buffer, to facilitate editing. 1496 #[derive(Clone, Debug)] 1497 struct MachBranch { 1498 start: CodeOffset, 1499 end: CodeOffset, 1500 target: MachLabel, 1501 fixup: usize, 1502 inverted: Option<SmallVec<[u8; 8]>>, 1503 /// All labels pointing to the start of this branch. For correctness, this 1504 /// *must* be complete (i.e., must contain all labels whose resolved offsets 1505 /// are at the start of this branch): we rely on being able to redirect all 1506 /// labels that could jump to this branch before removing it, if it is 1507 /// otherwise unreachable. 1508 labels_at_this_branch: SmallVec<[MachLabel; 4]>, 1509 } 1510 1511 impl MachBranch { 1512 fn is_cond(&self) -> bool { 1513 self.inverted.is_some() 1514 } 1515 fn is_uncond(&self) -> bool { 1516 self.inverted.is_none() 1517 } 1518 } 1519 1520 /// Implementation of the `TextSectionBuilder` trait backed by `MachBuffer`. 1521 /// 1522 /// Note that `MachBuffer` was primarily written for intra-function references 1523 /// of jumps between basic blocks, but it's also quite usable for entire text 1524 /// sections and resolving references between functions themselves. This 1525 /// builder interprets "blocks" as labeled functions for the purposes of 1526 /// resolving labels internally in the buffer. 1527 pub struct MachTextSectionBuilder<I: VCodeInst> { 1528 buf: MachBuffer<I>, 1529 next_func: usize, 1530 force_veneers: bool, 1531 } 1532 1533 impl<I: VCodeInst> MachTextSectionBuilder<I> { 1534 pub fn new(num_funcs: u32) -> MachTextSectionBuilder<I> { 1535 let mut buf = MachBuffer::new(); 1536 buf.reserve_labels_for_blocks(num_funcs as usize); 1537 MachTextSectionBuilder { 1538 buf, 1539 next_func: 0, 1540 force_veneers: false, 1541 } 1542 } 1543 } 1544 1545 impl<I: VCodeInst> TextSectionBuilder for MachTextSectionBuilder<I> { 1546 fn append(&mut self, named: bool, func: &[u8], align: Option<u32>) -> u64 { 1547 // Conditionally emit an island if it's necessary to resolve jumps 1548 // between functions which are too far away. 1549 let size = func.len() as u32; 1550 if self.force_veneers || self.buf.island_needed(size) { 1551 self.buf.emit_island_maybe_forced(self.force_veneers, size); 1552 } 1553 1554 self.buf.align_to(align.unwrap_or(I::LabelUse::ALIGN)); 1555 let pos = self.buf.cur_offset(); 1556 if named { 1557 self.buf 1558 .bind_label(MachLabel::from_block(BlockIndex::new(self.next_func))); 1559 self.next_func += 1; 1560 } 1561 self.buf.put_data(func); 1562 u64::from(pos) 1563 } 1564 1565 fn resolve_reloc(&mut self, offset: u64, reloc: Reloc, addend: Addend, target: u32) -> bool { 1566 let label = MachLabel::from_block(BlockIndex::new(target as usize)); 1567 let offset = u32::try_from(offset).unwrap(); 1568 match I::LabelUse::from_reloc(reloc, addend) { 1569 Some(label_use) => { 1570 self.buf.use_label_at_offset(offset, label, label_use); 1571 true 1572 } 1573 None => false, 1574 } 1575 } 1576 1577 fn force_veneers(&mut self) { 1578 self.force_veneers = true; 1579 } 1580 1581 fn finish(&mut self) -> Vec<u8> { 1582 // Double-check all functions were pushed. 1583 assert_eq!(self.next_func, self.buf.label_offsets.len()); 1584 1585 // Finish up any veneers, if necessary. 1586 self.buf 1587 .finish_emission_maybe_forcing_veneers(self.force_veneers); 1588 1589 // We don't need the data any more, so return it to the caller. 1590 mem::take(&mut self.buf.data).into_vec() 1591 } 1592 } 1593 1594 // We use an actual instruction definition to do tests, so we depend on the `arm64` feature here. 1595 #[cfg(all(test, feature = "arm64"))] 1596 mod test { 1597 use super::*; 1598 use crate::isa::aarch64::inst::xreg; 1599 use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst}; 1600 use crate::machinst::MachInstEmit; 1601 use crate::settings; 1602 use std::default::Default; 1603 use std::vec::Vec; 1604 1605 fn label(n: u32) -> MachLabel { 1606 MachLabel::from_block(BlockIndex::new(n as usize)) 1607 } 1608 fn target(n: u32) -> BranchTarget { 1609 BranchTarget::Label(label(n)) 1610 } 1611 1612 #[test] 1613 fn test_elide_jump_to_next() { 1614 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 1615 let mut buf = MachBuffer::new(); 1616 let mut state = Default::default(); 1617 1618 buf.reserve_labels_for_blocks(2); 1619 buf.bind_label(label(0)); 1620 let inst = Inst::Jump { dest: target(1) }; 1621 inst.emit(&[], &mut buf, &info, &mut state); 1622 buf.bind_label(label(1)); 1623 let buf = buf.finish(); 1624 assert_eq!(0, buf.total_size()); 1625 } 1626 1627 #[test] 1628 fn test_elide_trivial_jump_blocks() { 1629 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 1630 let mut buf = MachBuffer::new(); 1631 let mut state = Default::default(); 1632 1633 buf.reserve_labels_for_blocks(4); 1634 1635 buf.bind_label(label(0)); 1636 let inst = Inst::CondBr { 1637 kind: CondBrKind::NotZero(xreg(0)), 1638 taken: target(1), 1639 not_taken: target(2), 1640 }; 1641 inst.emit(&[], &mut buf, &info, &mut state); 1642 1643 buf.bind_label(label(1)); 1644 let inst = Inst::Jump { dest: target(3) }; 1645 inst.emit(&[], &mut buf, &info, &mut state); 1646 1647 buf.bind_label(label(2)); 1648 let inst = Inst::Jump { dest: target(3) }; 1649 inst.emit(&[], &mut buf, &info, &mut state); 1650 1651 buf.bind_label(label(3)); 1652 1653 let buf = buf.finish(); 1654 assert_eq!(0, buf.total_size()); 1655 } 1656 1657 #[test] 1658 fn test_flip_cond() { 1659 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 1660 let mut buf = MachBuffer::new(); 1661 let mut state = Default::default(); 1662 1663 buf.reserve_labels_for_blocks(4); 1664 1665 buf.bind_label(label(0)); 1666 let inst = Inst::CondBr { 1667 kind: CondBrKind::NotZero(xreg(0)), 1668 taken: target(1), 1669 not_taken: target(2), 1670 }; 1671 inst.emit(&[], &mut buf, &info, &mut state); 1672 1673 buf.bind_label(label(1)); 1674 let inst = Inst::Udf { 1675 trap_code: TrapCode::Interrupt, 1676 }; 1677 inst.emit(&[], &mut buf, &info, &mut state); 1678 1679 buf.bind_label(label(2)); 1680 let inst = Inst::Nop4; 1681 inst.emit(&[], &mut buf, &info, &mut state); 1682 1683 buf.bind_label(label(3)); 1684 1685 let buf = buf.finish(); 1686 1687 let mut buf2 = MachBuffer::new(); 1688 let mut state = Default::default(); 1689 let inst = Inst::TrapIf { 1690 kind: CondBrKind::NotZero(xreg(0)), 1691 trap_code: TrapCode::Interrupt, 1692 }; 1693 inst.emit(&[], &mut buf2, &info, &mut state); 1694 let inst = Inst::Nop4; 1695 inst.emit(&[], &mut buf2, &info, &mut state); 1696 1697 let buf2 = buf2.finish(); 1698 1699 assert_eq!(buf.data, buf2.data); 1700 } 1701 1702 #[test] 1703 fn test_island() { 1704 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 1705 let mut buf = MachBuffer::new(); 1706 let mut state = Default::default(); 1707 1708 buf.reserve_labels_for_blocks(4); 1709 1710 buf.bind_label(label(0)); 1711 let inst = Inst::CondBr { 1712 kind: CondBrKind::NotZero(xreg(0)), 1713 taken: target(2), 1714 not_taken: target(3), 1715 }; 1716 inst.emit(&[], &mut buf, &info, &mut state); 1717 1718 buf.bind_label(label(1)); 1719 while buf.cur_offset() < 2000000 { 1720 if buf.island_needed(0) { 1721 buf.emit_island(0); 1722 } 1723 let inst = Inst::Nop4; 1724 inst.emit(&[], &mut buf, &info, &mut state); 1725 } 1726 1727 buf.bind_label(label(2)); 1728 let inst = Inst::Nop4; 1729 inst.emit(&[], &mut buf, &info, &mut state); 1730 1731 buf.bind_label(label(3)); 1732 let inst = Inst::Nop4; 1733 inst.emit(&[], &mut buf, &info, &mut state); 1734 1735 let buf = buf.finish(); 1736 1737 assert_eq!(2000000 + 8, buf.total_size()); 1738 1739 let mut buf2 = MachBuffer::new(); 1740 let mut state = Default::default(); 1741 let inst = Inst::CondBr { 1742 kind: CondBrKind::NotZero(xreg(0)), 1743 1744 // This conditionally taken branch has a 19-bit constant, shifted 1745 // to the left by two, giving us a 21-bit range in total. Half of 1746 // this range positive so the we should be around 1 << 20 bytes 1747 // away for our jump target. 1748 // 1749 // There are two pending fixups by the time we reach this point, 1750 // one for this 19-bit jump and one for the unconditional 26-bit 1751 // jump below. A 19-bit veneer is 4 bytes large and the 26-bit 1752 // veneer is 20 bytes large, which means that pessimistically 1753 // assuming we'll need two veneers we need 24 bytes of extra 1754 // space, meaning that the actual island should come 24-bytes 1755 // before the deadline. 1756 taken: BranchTarget::ResolvedOffset((1 << 20) - 4 - 20), 1757 1758 // This branch is in-range so no veneers should be needed, it should 1759 // go directly to the target. 1760 not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4), 1761 }; 1762 inst.emit(&[], &mut buf2, &info, &mut state); 1763 1764 let buf2 = buf2.finish(); 1765 1766 assert_eq!(&buf.data[0..8], &buf2.data[..]); 1767 } 1768 1769 #[test] 1770 fn test_island_backward() { 1771 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 1772 let mut buf = MachBuffer::new(); 1773 let mut state = Default::default(); 1774 1775 buf.reserve_labels_for_blocks(4); 1776 1777 buf.bind_label(label(0)); 1778 let inst = Inst::Nop4; 1779 inst.emit(&[], &mut buf, &info, &mut state); 1780 1781 buf.bind_label(label(1)); 1782 let inst = Inst::Nop4; 1783 inst.emit(&[], &mut buf, &info, &mut state); 1784 1785 buf.bind_label(label(2)); 1786 while buf.cur_offset() < 2000000 { 1787 let inst = Inst::Nop4; 1788 inst.emit(&[], &mut buf, &info, &mut state); 1789 } 1790 1791 buf.bind_label(label(3)); 1792 let inst = Inst::CondBr { 1793 kind: CondBrKind::NotZero(xreg(0)), 1794 taken: target(0), 1795 not_taken: target(1), 1796 }; 1797 inst.emit(&[], &mut buf, &info, &mut state); 1798 1799 let buf = buf.finish(); 1800 1801 assert_eq!(2000000 + 12, buf.total_size()); 1802 1803 let mut buf2 = MachBuffer::new(); 1804 let mut state = Default::default(); 1805 let inst = Inst::CondBr { 1806 kind: CondBrKind::NotZero(xreg(0)), 1807 taken: BranchTarget::ResolvedOffset(8), 1808 not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)), 1809 }; 1810 inst.emit(&[], &mut buf2, &info, &mut state); 1811 let inst = Inst::Jump { 1812 dest: BranchTarget::ResolvedOffset(-(2000000 + 8)), 1813 }; 1814 inst.emit(&[], &mut buf2, &info, &mut state); 1815 1816 let buf2 = buf2.finish(); 1817 1818 assert_eq!(&buf.data[2000000..], &buf2.data[..]); 1819 } 1820 1821 #[test] 1822 fn test_multiple_redirect() { 1823 // label0: 1824 // cbz x0, label1 1825 // b label2 1826 // label1: 1827 // b label3 1828 // label2: 1829 // nop 1830 // nop 1831 // b label0 1832 // label3: 1833 // b label4 1834 // label4: 1835 // b label5 1836 // label5: 1837 // b label7 1838 // label6: 1839 // nop 1840 // label7: 1841 // ret 1842 // 1843 // -- should become: 1844 // 1845 // label0: 1846 // cbz x0, label7 1847 // label2: 1848 // nop 1849 // nop 1850 // b label0 1851 // label6: 1852 // nop 1853 // label7: 1854 // ret 1855 1856 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 1857 let mut buf = MachBuffer::new(); 1858 let mut state = Default::default(); 1859 1860 buf.reserve_labels_for_blocks(8); 1861 1862 buf.bind_label(label(0)); 1863 let inst = Inst::CondBr { 1864 kind: CondBrKind::Zero(xreg(0)), 1865 taken: target(1), 1866 not_taken: target(2), 1867 }; 1868 inst.emit(&[], &mut buf, &info, &mut state); 1869 1870 buf.bind_label(label(1)); 1871 let inst = Inst::Jump { dest: target(3) }; 1872 inst.emit(&[], &mut buf, &info, &mut state); 1873 1874 buf.bind_label(label(2)); 1875 let inst = Inst::Nop4; 1876 inst.emit(&[], &mut buf, &info, &mut state); 1877 inst.emit(&[], &mut buf, &info, &mut state); 1878 let inst = Inst::Jump { dest: target(0) }; 1879 inst.emit(&[], &mut buf, &info, &mut state); 1880 1881 buf.bind_label(label(3)); 1882 let inst = Inst::Jump { dest: target(4) }; 1883 inst.emit(&[], &mut buf, &info, &mut state); 1884 1885 buf.bind_label(label(4)); 1886 let inst = Inst::Jump { dest: target(5) }; 1887 inst.emit(&[], &mut buf, &info, &mut state); 1888 1889 buf.bind_label(label(5)); 1890 let inst = Inst::Jump { dest: target(7) }; 1891 inst.emit(&[], &mut buf, &info, &mut state); 1892 1893 buf.bind_label(label(6)); 1894 let inst = Inst::Nop4; 1895 inst.emit(&[], &mut buf, &info, &mut state); 1896 1897 buf.bind_label(label(7)); 1898 let inst = Inst::Ret { rets: vec![] }; 1899 inst.emit(&[], &mut buf, &info, &mut state); 1900 1901 let buf = buf.finish(); 1902 1903 let golden_data = vec![ 1904 0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14 1905 0x1f, 0x20, 0x03, 0xd5, // nop 1906 0x1f, 0x20, 0x03, 0xd5, // nop 1907 0xfd, 0xff, 0xff, 0x17, // b 0 1908 0x1f, 0x20, 0x03, 0xd5, // nop 1909 0xc0, 0x03, 0x5f, 0xd6, // ret 1910 ]; 1911 1912 assert_eq!(&golden_data[..], &buf.data[..]); 1913 } 1914 1915 #[test] 1916 fn test_handle_branch_cycle() { 1917 // label0: 1918 // b label1 1919 // label1: 1920 // b label2 1921 // label2: 1922 // b label3 1923 // label3: 1924 // b label4 1925 // label4: 1926 // b label1 // note: not label0 (to make it interesting). 1927 // 1928 // -- should become: 1929 // 1930 // label0, label1, ..., label4: 1931 // b label0 1932 let info = EmitInfo::new(settings::Flags::new(settings::builder())); 1933 let mut buf = MachBuffer::new(); 1934 let mut state = Default::default(); 1935 1936 buf.reserve_labels_for_blocks(5); 1937 1938 buf.bind_label(label(0)); 1939 let inst = Inst::Jump { dest: target(1) }; 1940 inst.emit(&[], &mut buf, &info, &mut state); 1941 1942 buf.bind_label(label(1)); 1943 let inst = Inst::Jump { dest: target(2) }; 1944 inst.emit(&[], &mut buf, &info, &mut state); 1945 1946 buf.bind_label(label(2)); 1947 let inst = Inst::Jump { dest: target(3) }; 1948 inst.emit(&[], &mut buf, &info, &mut state); 1949 1950 buf.bind_label(label(3)); 1951 let inst = Inst::Jump { dest: target(4) }; 1952 inst.emit(&[], &mut buf, &info, &mut state); 1953 1954 buf.bind_label(label(4)); 1955 let inst = Inst::Jump { dest: target(1) }; 1956 inst.emit(&[], &mut buf, &info, &mut state); 1957 1958 let buf = buf.finish(); 1959 1960 let golden_data = vec![ 1961 0x00, 0x00, 0x00, 0x14, // b 0 1962 ]; 1963 1964 assert_eq!(&golden_data[..], &buf.data[..]); 1965 } 1966 1967 #[test] 1968 fn metadata_records() { 1969 let mut buf = MachBuffer::<Inst>::new(); 1970 1971 buf.reserve_labels_for_blocks(1); 1972 1973 buf.bind_label(label(0)); 1974 buf.put1(1); 1975 buf.add_trap(TrapCode::HeapOutOfBounds); 1976 buf.put1(2); 1977 buf.add_trap(TrapCode::IntegerOverflow); 1978 buf.add_trap(TrapCode::IntegerDivisionByZero); 1979 buf.add_call_site(Opcode::Call); 1980 buf.add_reloc(Reloc::Abs4, &ExternalName::user(0, 0), 0); 1981 buf.put1(3); 1982 buf.add_reloc(Reloc::Abs8, &ExternalName::user(1, 1), 1); 1983 buf.put1(4); 1984 1985 let buf = buf.finish(); 1986 1987 assert_eq!(buf.data(), &[1, 2, 3, 4]); 1988 assert_eq!( 1989 buf.traps() 1990 .iter() 1991 .map(|trap| (trap.offset, trap.code)) 1992 .collect::<Vec<_>>(), 1993 vec![ 1994 (1, TrapCode::HeapOutOfBounds), 1995 (2, TrapCode::IntegerOverflow), 1996 (2, TrapCode::IntegerDivisionByZero) 1997 ] 1998 ); 1999 assert_eq!( 2000 buf.call_sites() 2001 .iter() 2002 .map(|call_site| (call_site.ret_addr, call_site.opcode)) 2003 .collect::<Vec<_>>(), 2004 vec![(2, Opcode::Call)] 2005 ); 2006 assert_eq!( 2007 buf.relocs() 2008 .iter() 2009 .map(|reloc| (reloc.offset, reloc.kind)) 2010 .collect::<Vec<_>>(), 2011 vec![(2, Reloc::Abs4), (3, Reloc::Abs8)] 2012 ); 2013 } 2014 } 2015