1 //! Implementation of a vanilla ABI, shared between several machines. The
2 //! implementation here assumes that arguments will be passed in registers
3 //! first, then additional args on the stack; that the stack grows downward,
4 //! contains a standard frame (return address and frame pointer), and the
5 //! compiler is otherwise free to allocate space below that with its choice of
6 //! layout; and that the machine has some notion of caller- and callee-save
7 //! registers. Most modern machines, e.g. x86-64 and AArch64, should fit this
8 //! mold and thus both of these backends use this shared implementation.
9 //!
10 //! See the documentation in specific machine backends for the "instantiation"
11 //! of this generic ABI, i.e., which registers are caller/callee-save, arguments
12 //! and return values, and any other special requirements.
13 //!
14 //! For now the implementation here assumes a 64-bit machine, but we intend to
15 //! make this 32/64-bit-generic shortly.
16 //!
17 //! # Vanilla ABI
18 //!
19 //! First, arguments and return values are passed in registers up to a certain
20 //! fixed count, after which they overflow onto the stack. Multiple return
21 //! values either fit in registers, or are returned in a separate return-value
22 //! area on the stack, given by a hidden extra parameter.
23 //!
24 //! Note that the exact stack layout is up to us. We settled on the
25 //! below design based on several requirements. In particular, we need
26 //! to be able to generate instructions (or instruction sequences) to
27 //! access arguments, stack slots, and spill slots before we know how
28 //! many spill slots or clobber-saves there will be, because of our
29 //! pass structure. We also prefer positive offsets to negative
30 //! offsets because of an asymmetry in some machines' addressing modes
31 //! (e.g., on AArch64, positive offsets have a larger possible range
32 //! without a long-form sequence to synthesize an arbitrary
33 //! offset). We also need clobber-save registers to be "near" the
34 //! frame pointer: Windows unwind information requires it to be within
35 //! 240 bytes of RBP. Finally, it is not allowed to access memory
36 //! below the current SP value.
37 //!
38 //! We assume that a prologue first pushes the frame pointer (and
39 //! return address above that, if the machine does not do that in
40 //! hardware). We set FP to point to this two-word frame record. We
41 //! store all other frame slots below this two-word frame record, as
42 //! well as enough space for arguments to the largest possible
43 //! function call. The stack pointer then remains at this position
44 //! for the duration of the function, allowing us to address all
45 //! frame storage at positive offsets from SP.
46 //!
47 //! Note that if we ever support dynamic stack-space allocation (for
48 //! `alloca`), we will need a way to reference spill slots and stack
49 //! slots relative to a dynamic SP, because we will no longer be able
50 //! to know a static offset from SP to the slots at any particular
51 //! program point. Probably the best solution at that point will be to
52 //! revert to using the frame pointer as the reference for all slots,
53 //! to allow generating spill/reload and stackslot accesses before we
54 //! know how large the clobber-saves will be.
55 //!
56 //! # Stack Layout
57 //!
58 //! The stack looks like:
59 //!
60 //! ```plain
61 //!   (high address)
62 //!                              |          ...              |
63 //!                              | caller frames             |
64 //!                              |          ...              |
65 //!                              +===========================+
66 //!                              |          ...              |
67 //!                              | stack args                |
68 //! Canonical Frame Address -->  | (accessed via FP)         |
69 //!                              +---------------------------+
70 //! SP at function entry ----->  | return address            |
71 //!                              +---------------------------+
72 //! FP after prologue -------->  | FP (pushed by prologue)   |
73 //!                              +---------------------------+           -----
74 //!                              |          ...              |             |
75 //!                              | clobbered callee-saves    |             |
76 //! unwind-frame base -------->  | (pushed by prologue)      |             |
77 //!                              +---------------------------+   -----     |
78 //!                              |          ...              |     |       |
79 //!                              | spill slots               |     |       |
80 //!                              | (accessed via SP)         |   fixed   active
81 //!                              |          ...              |   frame    size
82 //!                              | stack slots               |  storage    |
83 //!                              | (accessed via SP)         |    size     |
84 //!                              | (alloc'd by prologue)     |     |       |
85 //!                              +---------------------------+   -----     |
86 //!                              | [alignment as needed]     |             |
87 //!                              |          ...              |             |
88 //!                              | args for largest call     |             |
89 //! SP ----------------------->  | (alloc'd by prologue)     |             |
90 //!                              +===========================+           -----
91 //!
92 //!   (low address)
93 //! ```
94 //!
95 //! # Multi-value Returns
96 //!
97 //! We support multi-value returns by using multiple return-value
98 //! registers. In some cases this is an extension of the base system
99 //! ABI. See each platform's `abi.rs` implementation for details.
100 
101 use crate::CodegenError;
102 use crate::FxHashMap;
103 use crate::HashMap;
104 use crate::entity::SecondaryMap;
105 use crate::ir::{ArgumentExtension, ArgumentPurpose, ExceptionTag, Signature};
106 use crate::ir::{StackSlotKey, types::*};
107 use crate::isa::TargetIsa;
108 use crate::settings::ProbestackStrategy;
109 use crate::{ir, isa};
110 use crate::{machinst::*, trace};
111 use alloc::boxed::Box;
112 use core::marker::PhantomData;
113 use regalloc2::{MachineEnv, PReg, PRegSet};
114 use smallvec::smallvec;
115 
116 /// A small vector of instructions (with some reasonable size); appropriate for
117 /// a small fixed sequence implementing one operation.
118 pub type SmallInstVec<I> = SmallVec<[I; 4]>;
119 
120 /// A type used by backends to track argument-binding info in the "args"
121 /// pseudoinst. The pseudoinst holds a vec of `ArgPair` structs.
122 #[derive(Clone, Debug)]
123 pub struct ArgPair {
124     /// The vreg that is defined by this args pseudoinst.
125     pub vreg: Writable<Reg>,
126     /// The preg that the arg arrives in; this constrains the vreg's
127     /// placement at the pseudoinst.
128     pub preg: Reg,
129 }
130 
131 /// A type used by backends to track return register binding info in the "ret"
132 /// pseudoinst. The pseudoinst holds a vec of `RetPair` structs.
133 #[derive(Clone, Debug)]
134 pub struct RetPair {
135     /// The vreg that is returned by this pseudionst.
136     pub vreg: Reg,
137     /// The preg that the arg is returned through; this constrains the vreg's
138     /// placement at the pseudoinst.
139     pub preg: Reg,
140 }
141 
142 /// A location for (part of) an argument or return value. These "storage slots"
143 /// are specified for each register-sized part of an argument.
144 #[derive(Clone, Copy, Debug, PartialEq, Eq)]
145 pub enum ABIArgSlot {
146     /// In a real register.
147     Reg {
148         /// Register that holds this arg.
149         reg: RealReg,
150         /// Value type of this arg.
151         ty: ir::Type,
152         /// Should this arg be zero- or sign-extended?
153         extension: ir::ArgumentExtension,
154     },
155     /// Arguments only: on stack, at given offset from SP at entry.
156     Stack {
157         /// Offset of this arg relative to the base of stack args.
158         offset: i64,
159         /// Value type of this arg.
160         ty: ir::Type,
161         /// Should this arg be zero- or sign-extended?
162         extension: ir::ArgumentExtension,
163     },
164 }
165 
166 impl ABIArgSlot {
167     /// The type of the value that will be stored in this slot.
get_type(&self) -> ir::Type168     pub fn get_type(&self) -> ir::Type {
169         match self {
170             ABIArgSlot::Reg { ty, .. } => *ty,
171             ABIArgSlot::Stack { ty, .. } => *ty,
172         }
173     }
174 }
175 
176 /// A vector of `ABIArgSlot`s. Inline capacity for one element because basically
177 /// 100% of values use one slot. Only `i128`s need multiple slots, and they are
178 /// super rare (and never happen with Wasm).
179 pub type ABIArgSlotVec = SmallVec<[ABIArgSlot; 1]>;
180 
181 /// An ABIArg is composed of one or more parts. This allows for a CLIF-level
182 /// Value to be passed with its parts in more than one location at the ABI
183 /// level. For example, a 128-bit integer may be passed in two 64-bit registers,
184 /// or even a 64-bit register and a 64-bit stack slot, on a 64-bit machine. The
185 /// number of "parts" should correspond to the number of registers used to store
186 /// this type according to the machine backend.
187 ///
188 /// As an invariant, the `purpose` for every part must match. As a further
189 /// invariant, a `StructArg` part cannot appear with any other part.
190 #[derive(Clone, Debug)]
191 pub enum ABIArg {
192     /// Storage slots (registers or stack locations) for each part of the
193     /// argument value. The number of slots must equal the number of register
194     /// parts used to store a value of this type.
195     Slots {
196         /// Slots, one per register part.
197         slots: ABIArgSlotVec,
198         /// Purpose of this arg.
199         purpose: ir::ArgumentPurpose,
200     },
201     /// Structure argument. We reserve stack space for it, but the CLIF-level
202     /// semantics are a little weird: the value passed to the call instruction,
203     /// and received in the corresponding block param, is a *pointer*. On the
204     /// caller side, we memcpy the data from the passed-in pointer to the stack
205     /// area; on the callee side, we compute a pointer to this stack area and
206     /// provide that as the argument's value.
207     StructArg {
208         /// Offset of this arg relative to base of stack args.
209         offset: i64,
210         /// Size of this arg on the stack.
211         size: u64,
212         /// Purpose of this arg.
213         purpose: ir::ArgumentPurpose,
214     },
215     /// Implicit argument. Similar to a StructArg, except that we have the
216     /// target type, not a pointer type, at the CLIF-level. This argument is
217     /// still being passed via reference implicitly.
218     ImplicitPtrArg {
219         /// Register or stack slot holding a pointer to the buffer.
220         pointer: ABIArgSlot,
221         /// Offset of the argument buffer.
222         offset: i64,
223         /// Type of the implicit argument.
224         ty: Type,
225         /// Purpose of this arg.
226         purpose: ir::ArgumentPurpose,
227     },
228 }
229 
230 impl ABIArg {
231     /// Create an ABIArg from one register.
reg( reg: RealReg, ty: ir::Type, extension: ir::ArgumentExtension, purpose: ir::ArgumentPurpose, ) -> ABIArg232     pub fn reg(
233         reg: RealReg,
234         ty: ir::Type,
235         extension: ir::ArgumentExtension,
236         purpose: ir::ArgumentPurpose,
237     ) -> ABIArg {
238         ABIArg::Slots {
239             slots: smallvec![ABIArgSlot::Reg { reg, ty, extension }],
240             purpose,
241         }
242     }
243 
244     /// Create an ABIArg from one stack slot.
stack( offset: i64, ty: ir::Type, extension: ir::ArgumentExtension, purpose: ir::ArgumentPurpose, ) -> ABIArg245     pub fn stack(
246         offset: i64,
247         ty: ir::Type,
248         extension: ir::ArgumentExtension,
249         purpose: ir::ArgumentPurpose,
250     ) -> ABIArg {
251         ABIArg::Slots {
252             slots: smallvec![ABIArgSlot::Stack {
253                 offset,
254                 ty,
255                 extension,
256             }],
257             purpose,
258         }
259     }
260 }
261 
262 /// Are we computing information about arguments or return values? Much of the
263 /// handling is factored out into common routines; this enum allows us to
264 /// distinguish which case we're handling.
265 #[derive(Clone, Copy, Debug, PartialEq, Eq)]
266 pub enum ArgsOrRets {
267     /// Arguments.
268     Args,
269     /// Return values.
270     Rets,
271 }
272 
273 /// Abstract location for a machine-specific ABI impl to translate into the
274 /// appropriate addressing mode.
275 #[derive(Clone, Copy, Debug, PartialEq, Eq)]
276 pub enum StackAMode {
277     /// Offset into the current frame's argument area.
278     IncomingArg(i64, u32),
279     /// Offset within the stack slots in the current frame.
280     Slot(i64),
281     /// Offset into the callee frame's argument area.
282     OutgoingArg(i64),
283 }
284 
285 impl StackAMode {
offset_by(&self, offset: u32) -> Self286     fn offset_by(&self, offset: u32) -> Self {
287         match self {
288             StackAMode::IncomingArg(off, size) => {
289                 StackAMode::IncomingArg(off.checked_add(i64::from(offset)).unwrap(), *size)
290             }
291             StackAMode::Slot(off) => StackAMode::Slot(off.checked_add(i64::from(offset)).unwrap()),
292             StackAMode::OutgoingArg(off) => {
293                 StackAMode::OutgoingArg(off.checked_add(i64::from(offset)).unwrap())
294             }
295         }
296     }
297 }
298 
299 /// Trait implemented by machine-specific backend to represent ISA flags.
300 pub trait IsaFlags: Clone {
301     /// Get a flag indicating whether forward-edge CFI is enabled.
is_forward_edge_cfi_enabled(&self) -> bool302     fn is_forward_edge_cfi_enabled(&self) -> bool {
303         false
304     }
305 }
306 
307 /// Used as an out-parameter to accumulate a sequence of `ABIArg`s in
308 /// `ABIMachineSpec::compute_arg_locs`. Wraps the shared allocation for all
309 /// `ABIArg`s in `SigSet` and exposes just the args for the current
310 /// `compute_arg_locs` call.
311 pub struct ArgsAccumulator<'a> {
312     sig_set_abi_args: &'a mut Vec<ABIArg>,
313     start: usize,
314     non_formal_flag: bool,
315 }
316 
317 impl<'a> ArgsAccumulator<'a> {
new(sig_set_abi_args: &'a mut Vec<ABIArg>) -> Self318     fn new(sig_set_abi_args: &'a mut Vec<ABIArg>) -> Self {
319         let start = sig_set_abi_args.len();
320         ArgsAccumulator {
321             sig_set_abi_args,
322             start,
323             non_formal_flag: false,
324         }
325     }
326 
327     #[inline]
push(&mut self, arg: ABIArg)328     pub fn push(&mut self, arg: ABIArg) {
329         debug_assert!(!self.non_formal_flag);
330         self.sig_set_abi_args.push(arg)
331     }
332 
333     #[inline]
push_non_formal(&mut self, arg: ABIArg)334     pub fn push_non_formal(&mut self, arg: ABIArg) {
335         self.non_formal_flag = true;
336         self.sig_set_abi_args.push(arg)
337     }
338 
339     #[inline]
args(&self) -> &[ABIArg]340     pub fn args(&self) -> &[ABIArg] {
341         &self.sig_set_abi_args[self.start..]
342     }
343 
344     #[inline]
args_mut(&mut self) -> &mut [ABIArg]345     pub fn args_mut(&mut self) -> &mut [ABIArg] {
346         &mut self.sig_set_abi_args[self.start..]
347     }
348 }
349 
350 /// Trait implemented by machine-specific backend to provide information about
351 /// register assignments and to allow generating the specific instructions for
352 /// stack loads/saves, prologues/epilogues, etc.
353 pub trait ABIMachineSpec {
354     /// The instruction type.
355     type I: VCodeInst;
356 
357     /// The ISA flags type.
358     type F: IsaFlags;
359 
360     /// This is the limit for the size of argument and return-value areas on the
361     /// stack. We place a reasonable limit here to avoid integer overflow issues
362     /// with 32-bit arithmetic.
363     const STACK_ARG_RET_SIZE_LIMIT: u32;
364 
365     /// Returns the number of bits in a word, that is 32/64 for 32/64-bit architecture.
word_bits() -> u32366     fn word_bits() -> u32;
367 
368     /// Returns the number of bytes in a word.
word_bytes() -> u32369     fn word_bytes() -> u32 {
370         return Self::word_bits() / 8;
371     }
372 
373     /// Returns word-size integer type.
word_type() -> Type374     fn word_type() -> Type {
375         match Self::word_bits() {
376             32 => I32,
377             64 => I64,
378             _ => unreachable!(),
379         }
380     }
381 
382     /// Returns word register class.
word_reg_class() -> RegClass383     fn word_reg_class() -> RegClass {
384         RegClass::Int
385     }
386 
387     /// Returns required stack alignment in bytes.
stack_align(call_conv: isa::CallConv) -> u32388     fn stack_align(call_conv: isa::CallConv) -> u32;
389 
390     /// Process a list of parameters or return values and allocate them to registers
391     /// and stack slots.
392     ///
393     /// The argument locations should be pushed onto the given `ArgsAccumulator`
394     /// in order. Any extra arguments added (such as return area pointers)
395     /// should come at the end of the list so that the first N lowered
396     /// parameters align with the N clif parameters.
397     ///
398     /// Returns the stack-space used (rounded up to as alignment requires), and
399     /// if `add_ret_area_ptr` was passed, the index of the extra synthetic arg
400     /// that was added.
compute_arg_locs( call_conv: isa::CallConv, flags: &settings::Flags, params: &[ir::AbiParam], args_or_rets: ArgsOrRets, add_ret_area_ptr: bool, args: ArgsAccumulator, ) -> CodegenResult<(u32, Option<usize>)>401     fn compute_arg_locs(
402         call_conv: isa::CallConv,
403         flags: &settings::Flags,
404         params: &[ir::AbiParam],
405         args_or_rets: ArgsOrRets,
406         add_ret_area_ptr: bool,
407         args: ArgsAccumulator,
408     ) -> CodegenResult<(u32, Option<usize>)>;
409 
410     /// Generate a load from the stack.
gen_load_stack(mem: StackAMode, into_reg: Writable<Reg>, ty: Type) -> Self::I411     fn gen_load_stack(mem: StackAMode, into_reg: Writable<Reg>, ty: Type) -> Self::I;
412 
413     /// Generate a store to the stack.
gen_store_stack(mem: StackAMode, from_reg: Reg, ty: Type) -> Self::I414     fn gen_store_stack(mem: StackAMode, from_reg: Reg, ty: Type) -> Self::I;
415 
416     /// Generate a move.
gen_move(to_reg: Writable<Reg>, from_reg: Reg, ty: Type) -> Self::I417     fn gen_move(to_reg: Writable<Reg>, from_reg: Reg, ty: Type) -> Self::I;
418 
419     /// Generate an integer-extend operation.
gen_extend( to_reg: Writable<Reg>, from_reg: Reg, is_signed: bool, from_bits: u8, to_bits: u8, ) -> Self::I420     fn gen_extend(
421         to_reg: Writable<Reg>,
422         from_reg: Reg,
423         is_signed: bool,
424         from_bits: u8,
425         to_bits: u8,
426     ) -> Self::I;
427 
428     /// Generate an "args" pseudo-instruction to capture input args in
429     /// registers.
gen_args(args: Vec<ArgPair>) -> Self::I430     fn gen_args(args: Vec<ArgPair>) -> Self::I;
431 
432     /// Generate a "rets" pseudo-instruction that moves vregs to return
433     /// registers.
gen_rets(rets: Vec<RetPair>) -> Self::I434     fn gen_rets(rets: Vec<RetPair>) -> Self::I;
435 
436     /// Generate an add-with-immediate. Note that even if this uses a scratch
437     /// register, it must satisfy two requirements:
438     ///
439     /// - The add-imm sequence must only clobber caller-save registers that are
440     ///   not used for arguments, because it will be placed in the prologue
441     ///   before the clobbered callee-save registers are saved.
442     ///
443     /// - The add-imm sequence must work correctly when `from_reg` and/or
444     ///   `into_reg` are the register returned by `get_stacklimit_reg()`.
gen_add_imm( call_conv: isa::CallConv, into_reg: Writable<Reg>, from_reg: Reg, imm: u32, ) -> SmallInstVec<Self::I>445     fn gen_add_imm(
446         call_conv: isa::CallConv,
447         into_reg: Writable<Reg>,
448         from_reg: Reg,
449         imm: u32,
450     ) -> SmallInstVec<Self::I>;
451 
452     /// Generate a sequence that traps with a `TrapCode::StackOverflow` code if
453     /// the stack pointer is less than the given limit register (assuming the
454     /// stack grows downward).
gen_stack_lower_bound_trap(limit_reg: Reg) -> SmallInstVec<Self::I>455     fn gen_stack_lower_bound_trap(limit_reg: Reg) -> SmallInstVec<Self::I>;
456 
457     /// Generate an instruction to compute an address of a stack slot (FP- or
458     /// SP-based offset).
gen_get_stack_addr(mem: StackAMode, into_reg: Writable<Reg>) -> Self::I459     fn gen_get_stack_addr(mem: StackAMode, into_reg: Writable<Reg>) -> Self::I;
460 
461     /// Get a fixed register to use to compute a stack limit. This is needed for
462     /// certain sequences generated after the register allocator has already
463     /// run. This must satisfy two requirements:
464     ///
465     /// - It must be a caller-save register that is not used for arguments,
466     ///   because it will be clobbered in the prologue before the clobbered
467     ///   callee-save registers are saved.
468     ///
469     /// - It must be safe to pass as an argument and/or destination to
470     ///   `gen_add_imm()`. This is relevant when an addition with a large
471     ///   immediate needs its own temporary; it cannot use the same fixed
472     ///   temporary as this one.
get_stacklimit_reg(call_conv: isa::CallConv) -> Reg473     fn get_stacklimit_reg(call_conv: isa::CallConv) -> Reg;
474 
475     /// Generate a load to the given [base+offset] address.
gen_load_base_offset(into_reg: Writable<Reg>, base: Reg, offset: i32, ty: Type) -> Self::I476     fn gen_load_base_offset(into_reg: Writable<Reg>, base: Reg, offset: i32, ty: Type) -> Self::I;
477 
478     /// Generate a store from the given [base+offset] address.
gen_store_base_offset(base: Reg, offset: i32, from_reg: Reg, ty: Type) -> Self::I479     fn gen_store_base_offset(base: Reg, offset: i32, from_reg: Reg, ty: Type) -> Self::I;
480 
481     /// Adjust the stack pointer up or down.
gen_sp_reg_adjust(amount: i32) -> SmallInstVec<Self::I>482     fn gen_sp_reg_adjust(amount: i32) -> SmallInstVec<Self::I>;
483 
484     /// Compute a FrameLayout structure containing a sorted list of all clobbered
485     /// registers that are callee-saved according to the ABI, as well as the sizes
486     /// of all parts of the stack frame.  The result is used to emit the prologue
487     /// and epilogue routines.
compute_frame_layout( call_conv: isa::CallConv, flags: &settings::Flags, sig: &Signature, regs: &[Writable<RealReg>], function_calls: FunctionCalls, incoming_args_size: u32, tail_args_size: u32, stackslots_size: u32, fixed_frame_storage_size: u32, outgoing_args_size: u32, ) -> FrameLayout488     fn compute_frame_layout(
489         call_conv: isa::CallConv,
490         flags: &settings::Flags,
491         sig: &Signature,
492         regs: &[Writable<RealReg>],
493         function_calls: FunctionCalls,
494         incoming_args_size: u32,
495         tail_args_size: u32,
496         stackslots_size: u32,
497         fixed_frame_storage_size: u32,
498         outgoing_args_size: u32,
499     ) -> FrameLayout;
500 
501     /// Generate the usual frame-setup sequence for this architecture: e.g.,
502     /// `push rbp / mov rbp, rsp` on x86-64, or `stp fp, lr, [sp, #-16]!` on
503     /// AArch64.
gen_prologue_frame_setup( call_conv: isa::CallConv, flags: &settings::Flags, isa_flags: &Self::F, frame_layout: &FrameLayout, ) -> SmallInstVec<Self::I>504     fn gen_prologue_frame_setup(
505         call_conv: isa::CallConv,
506         flags: &settings::Flags,
507         isa_flags: &Self::F,
508         frame_layout: &FrameLayout,
509     ) -> SmallInstVec<Self::I>;
510 
511     /// Generate the usual frame-restore sequence for this architecture.
gen_epilogue_frame_restore( call_conv: isa::CallConv, flags: &settings::Flags, isa_flags: &Self::F, frame_layout: &FrameLayout, ) -> SmallInstVec<Self::I>512     fn gen_epilogue_frame_restore(
513         call_conv: isa::CallConv,
514         flags: &settings::Flags,
515         isa_flags: &Self::F,
516         frame_layout: &FrameLayout,
517     ) -> SmallInstVec<Self::I>;
518 
519     /// Generate a return instruction.
gen_return( call_conv: isa::CallConv, isa_flags: &Self::F, frame_layout: &FrameLayout, ) -> SmallInstVec<Self::I>520     fn gen_return(
521         call_conv: isa::CallConv,
522         isa_flags: &Self::F,
523         frame_layout: &FrameLayout,
524     ) -> SmallInstVec<Self::I>;
525 
526     /// Generate a probestack call.
gen_probestack(insts: &mut SmallInstVec<Self::I>, frame_size: u32)527     fn gen_probestack(insts: &mut SmallInstVec<Self::I>, frame_size: u32);
528 
529     /// Generate a inline stack probe.
gen_inline_probestack( insts: &mut SmallInstVec<Self::I>, call_conv: isa::CallConv, frame_size: u32, guard_size: u32, )530     fn gen_inline_probestack(
531         insts: &mut SmallInstVec<Self::I>,
532         call_conv: isa::CallConv,
533         frame_size: u32,
534         guard_size: u32,
535     );
536 
537     /// Generate a clobber-save sequence. The implementation here should return
538     /// a sequence of instructions that "push" or otherwise save to the stack all
539     /// registers written/modified by the function body that are callee-saved.
540     /// The sequence of instructions should adjust the stack pointer downward,
541     /// and should align as necessary according to ABI requirements.
gen_clobber_save( call_conv: isa::CallConv, flags: &settings::Flags, frame_layout: &FrameLayout, ) -> SmallVec<[Self::I; 16]>542     fn gen_clobber_save(
543         call_conv: isa::CallConv,
544         flags: &settings::Flags,
545         frame_layout: &FrameLayout,
546     ) -> SmallVec<[Self::I; 16]>;
547 
548     /// Generate a clobber-restore sequence. This sequence should perform the
549     /// opposite of the clobber-save sequence generated above, assuming that SP
550     /// going into the sequence is at the same point that it was left when the
551     /// clobber-save sequence finished.
gen_clobber_restore( call_conv: isa::CallConv, flags: &settings::Flags, frame_layout: &FrameLayout, ) -> SmallVec<[Self::I; 16]>552     fn gen_clobber_restore(
553         call_conv: isa::CallConv,
554         flags: &settings::Flags,
555         frame_layout: &FrameLayout,
556     ) -> SmallVec<[Self::I; 16]>;
557 
558     /// Generate a memcpy invocation. Used to set up struct
559     /// args. Takes `src`, `dst` as read-only inputs and passes a temporary
560     /// allocator.
gen_memcpy<F: FnMut(Type) -> Writable<Reg>>( call_conv: isa::CallConv, dst: Reg, src: Reg, size: usize, alloc_tmp: F, ) -> SmallVec<[Self::I; 8]>561     fn gen_memcpy<F: FnMut(Type) -> Writable<Reg>>(
562         call_conv: isa::CallConv,
563         dst: Reg,
564         src: Reg,
565         size: usize,
566         alloc_tmp: F,
567     ) -> SmallVec<[Self::I; 8]>;
568 
569     /// Get the number of spillslots required for the given register-class.
get_number_of_spillslots_for_value( rc: RegClass, target_vector_bytes: u32, isa_flags: &Self::F, ) -> u32570     fn get_number_of_spillslots_for_value(
571         rc: RegClass,
572         target_vector_bytes: u32,
573         isa_flags: &Self::F,
574     ) -> u32;
575 
576     /// Get the ABI-dependent MachineEnv for managing register allocation.
get_machine_env(flags: &settings::Flags, call_conv: isa::CallConv) -> &MachineEnv577     fn get_machine_env(flags: &settings::Flags, call_conv: isa::CallConv) -> &MachineEnv;
578 
579     /// Get all caller-save registers, that is, registers that we expect
580     /// not to be saved across a call to a callee with the given ABI.
get_regs_clobbered_by_call( call_conv_of_callee: isa::CallConv, is_exception: bool, ) -> PRegSet581     fn get_regs_clobbered_by_call(
582         call_conv_of_callee: isa::CallConv,
583         is_exception: bool,
584     ) -> PRegSet;
585 
586     /// Get the needed extension mode, given the mode attached to the argument
587     /// in the signature and the calling convention. The input (the attribute in
588     /// the signature) specifies what extension type should be done *if* the ABI
589     /// requires extension to the full register; this method's return value
590     /// indicates whether the extension actually *will* be done.
get_ext_mode( call_conv: isa::CallConv, specified: ir::ArgumentExtension, ) -> ir::ArgumentExtension591     fn get_ext_mode(
592         call_conv: isa::CallConv,
593         specified: ir::ArgumentExtension,
594     ) -> ir::ArgumentExtension;
595 
596     /// Get a temporary register that is available to use after a call
597     /// completes and that does not interfere with register-carried
598     /// return values. This is used to move stack-carried return
599     /// values directly into spillslots if needed.
retval_temp_reg(call_conv_of_callee: isa::CallConv) -> Writable<Reg>600     fn retval_temp_reg(call_conv_of_callee: isa::CallConv) -> Writable<Reg>;
601 
602     /// Get the exception payload registers, if any, for a calling
603     /// convention.
604     ///
605     /// Note that the argument here is the calling convention of the *callee*.
606     /// This might differ from the caller but the exceptional payloads that are
607     /// available are defined by the callee, not the caller.
exception_payload_regs(callee_conv: isa::CallConv) -> &'static [Reg]608     fn exception_payload_regs(callee_conv: isa::CallConv) -> &'static [Reg] {
609         let _ = callee_conv;
610         &[]
611     }
612 }
613 
614 /// Out-of-line data for calls, to keep the size of `Inst` down.
615 #[derive(Clone, Debug)]
616 pub struct CallInfo<T> {
617     /// Receiver of this call
618     pub dest: T,
619     /// Register uses of this call.
620     pub uses: CallArgList,
621     /// Register defs of this call.
622     pub defs: CallRetList,
623     /// Registers clobbered by this call, as per its calling convention.
624     pub clobbers: PRegSet,
625     /// The calling convention of the callee.
626     pub callee_conv: isa::CallConv,
627     /// The calling convention of the caller.
628     pub caller_conv: isa::CallConv,
629     /// The number of bytes that the callee will pop from the stack for the
630     /// caller, if any. (Used for popping stack arguments with the `tail`
631     /// calling convention.)
632     pub callee_pop_size: u32,
633     /// Information for a try-call, if this is one. We combine
634     /// handling of calls and try-calls as much as possible to share
635     /// argument/return logic; they mostly differ in the metadata that
636     /// they emit, which this information feeds into.
637     pub try_call_info: Option<TryCallInfo>,
638     /// Whether this call is patchable.
639     pub patchable: bool,
640 }
641 
642 /// Out-of-line information present on `try_call` instructions only:
643 /// information that is used to generate exception-handling tables and
644 /// link up to destination blocks properly.
645 #[derive(Clone, Debug)]
646 pub struct TryCallInfo {
647     /// The target to jump to on a normal returhn.
648     pub continuation: MachLabel,
649     /// Exception tags to catch and corresponding destination labels.
650     pub exception_handlers: Box<[TryCallHandler]>,
651 }
652 
653 /// Information about an individual handler at a try-call site.
654 #[derive(Clone, Debug)]
655 pub enum TryCallHandler {
656     /// If the tag matches (given the current context), recover at the
657     /// label.
658     Tag(ExceptionTag, MachLabel),
659     /// Recover at the label unconditionally.
660     Default(MachLabel),
661     /// Set the dynamic context for interpreting tags at this point in
662     /// the handler list.
663     Context(Reg),
664 }
665 
666 impl<T> CallInfo<T> {
667     /// Creates an empty set of info with no clobbers/uses/etc with the
668     /// specified ABI
empty(dest: T, call_conv: isa::CallConv) -> CallInfo<T>669     pub fn empty(dest: T, call_conv: isa::CallConv) -> CallInfo<T> {
670         CallInfo {
671             dest,
672             uses: smallvec![],
673             defs: smallvec![],
674             clobbers: PRegSet::empty(),
675             caller_conv: call_conv,
676             callee_conv: call_conv,
677             callee_pop_size: 0,
678             try_call_info: None,
679             patchable: false,
680         }
681     }
682 }
683 
684 /// The id of an ABI signature within the `SigSet`.
685 #[derive(Copy, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
686 pub struct Sig(u32);
687 cranelift_entity::entity_impl!(Sig);
688 
689 impl Sig {
prev(self) -> Option<Sig>690     fn prev(self) -> Option<Sig> {
691         self.0.checked_sub(1).map(Sig)
692     }
693 }
694 
695 /// ABI information shared between body (callee) and caller.
696 #[derive(Clone, Debug)]
697 pub struct SigData {
698     /// Currently both return values and arguments are stored in a continuous space vector
699     /// in `SigSet::abi_args`.
700     ///
701     /// ```plain
702     ///                  +----------------------------------------------+
703     ///                  | return values                                |
704     ///                  | ...                                          |
705     ///   rets_end   --> +----------------------------------------------+
706     ///                  | arguments                                    |
707     ///                  | ...                                          |
708     ///   args_end   --> +----------------------------------------------+
709     ///
710     /// ```
711     ///
712     /// Note we only store two offsets as rets_end == args_start, and rets_start == prev.args_end.
713     ///
714     /// Argument location ending offset (regs or stack slots). Stack offsets are relative to
715     /// SP on entry to function.
716     ///
717     /// This is a index into the `SigSet::abi_args`.
718     args_end: u32,
719 
720     /// Return-value location ending offset. Stack offsets are relative to the return-area
721     /// pointer.
722     ///
723     /// This is a index into the `SigSet::abi_args`.
724     rets_end: u32,
725 
726     /// Space on stack used to store arguments. We're storing the size in u32 to
727     /// reduce the size of the struct.
728     sized_stack_arg_space: u32,
729 
730     /// Space on stack used to store return values. We're storing the size in u32 to
731     /// reduce the size of the struct.
732     sized_stack_ret_space: u32,
733 
734     /// Index in `args` of the stack-return-value-area argument.
735     stack_ret_arg: Option<u16>,
736 
737     /// Calling convention used.
738     call_conv: isa::CallConv,
739 }
740 
741 impl SigData {
742     /// Get total stack space required for arguments.
sized_stack_arg_space(&self) -> u32743     pub fn sized_stack_arg_space(&self) -> u32 {
744         self.sized_stack_arg_space
745     }
746 
747     /// Get total stack space required for return values.
sized_stack_ret_space(&self) -> u32748     pub fn sized_stack_ret_space(&self) -> u32 {
749         self.sized_stack_ret_space
750     }
751 
752     /// Get calling convention used.
call_conv(&self) -> isa::CallConv753     pub fn call_conv(&self) -> isa::CallConv {
754         self.call_conv
755     }
756 
757     /// The index of the stack-return-value-area argument, if any.
stack_ret_arg(&self) -> Option<u16>758     pub fn stack_ret_arg(&self) -> Option<u16> {
759         self.stack_ret_arg
760     }
761 }
762 
763 /// A (mostly) deduplicated set of ABI signatures.
764 ///
765 /// We say "mostly" because we do not dedupe between signatures interned via
766 /// `ir::SigRef` (direct and indirect calls; the vast majority of signatures in
767 /// this set) vs via `ir::Signature` (the callee itself and libcalls). Doing
768 /// this final bit of deduplication would require filling out the
769 /// `ir_signature_to_abi_sig`, which is a bunch of allocations (not just the
770 /// hash map itself but params and returns vecs in each signature) that we want
771 /// to avoid.
772 ///
773 /// In general, prefer using the `ir::SigRef`-taking methods to the
774 /// `ir::Signature`-taking methods when you can get away with it, as they don't
775 /// require cloning non-copy types that will trigger heap allocations.
776 ///
777 /// This type can be indexed by `Sig` to access its associated `SigData`.
778 pub struct SigSet {
779     /// Interned `ir::Signature`s that we already have an ABI signature for.
780     ir_signature_to_abi_sig: FxHashMap<ir::Signature, Sig>,
781 
782     /// Interned `ir::SigRef`s that we already have an ABI signature for.
783     ir_sig_ref_to_abi_sig: SecondaryMap<ir::SigRef, Option<Sig>>,
784 
785     /// A single, shared allocation for all `ABIArg`s used by all
786     /// `SigData`s. Each `SigData` references its args/rets via indices into
787     /// this allocation.
788     abi_args: Vec<ABIArg>,
789 
790     /// The actual ABI signatures, keyed by `Sig`.
791     sigs: PrimaryMap<Sig, SigData>,
792 }
793 
794 impl SigSet {
795     /// Construct a new `SigSet`, interning all of the signatures used by the
796     /// given function.
new<M>(func: &ir::Function, flags: &settings::Flags) -> CodegenResult<Self> where M: ABIMachineSpec,797     pub fn new<M>(func: &ir::Function, flags: &settings::Flags) -> CodegenResult<Self>
798     where
799         M: ABIMachineSpec,
800     {
801         let arg_estimate = func.dfg.signatures.len() * 6;
802 
803         let mut sigs = SigSet {
804             ir_signature_to_abi_sig: FxHashMap::default(),
805             ir_sig_ref_to_abi_sig: SecondaryMap::with_capacity(func.dfg.signatures.len()),
806             abi_args: Vec::with_capacity(arg_estimate),
807             sigs: PrimaryMap::with_capacity(1 + func.dfg.signatures.len()),
808         };
809 
810         sigs.make_abi_sig_from_ir_signature::<M>(func.signature.clone(), flags)?;
811         for sig_ref in func.dfg.signatures.keys() {
812             sigs.make_abi_sig_from_ir_sig_ref::<M>(sig_ref, &func.dfg, flags)?;
813         }
814 
815         Ok(sigs)
816     }
817 
818     /// Have we already interned an ABI signature for the given `ir::Signature`?
have_abi_sig_for_signature(&self, signature: &ir::Signature) -> bool819     pub fn have_abi_sig_for_signature(&self, signature: &ir::Signature) -> bool {
820         self.ir_signature_to_abi_sig.contains_key(signature)
821     }
822 
823     /// Construct and intern an ABI signature for the given `ir::Signature`.
make_abi_sig_from_ir_signature<M>( &mut self, signature: ir::Signature, flags: &settings::Flags, ) -> CodegenResult<Sig> where M: ABIMachineSpec,824     pub fn make_abi_sig_from_ir_signature<M>(
825         &mut self,
826         signature: ir::Signature,
827         flags: &settings::Flags,
828     ) -> CodegenResult<Sig>
829     where
830         M: ABIMachineSpec,
831     {
832         // Because the `HashMap` entry API requires taking ownership of the
833         // lookup key -- and we want to avoid unnecessary clones of
834         // `ir::Signature`s, even at the cost of duplicate lookups -- we can't
835         // have a single, get-or-create-style method for interning
836         // `ir::Signature`s into ABI signatures. So at least (debug) assert that
837         // we aren't creating duplicate ABI signatures for the same
838         // `ir::Signature`.
839         debug_assert!(!self.have_abi_sig_for_signature(&signature));
840 
841         let sig_data = self.from_func_sig::<M>(&signature, flags)?;
842         let sig = self.sigs.push(sig_data);
843         self.ir_signature_to_abi_sig.insert(signature, sig);
844         Ok(sig)
845     }
846 
make_abi_sig_from_ir_sig_ref<M>( &mut self, sig_ref: ir::SigRef, dfg: &ir::DataFlowGraph, flags: &settings::Flags, ) -> CodegenResult<Sig> where M: ABIMachineSpec,847     fn make_abi_sig_from_ir_sig_ref<M>(
848         &mut self,
849         sig_ref: ir::SigRef,
850         dfg: &ir::DataFlowGraph,
851         flags: &settings::Flags,
852     ) -> CodegenResult<Sig>
853     where
854         M: ABIMachineSpec,
855     {
856         if let Some(sig) = self.ir_sig_ref_to_abi_sig[sig_ref] {
857             return Ok(sig);
858         }
859         let signature = &dfg.signatures[sig_ref];
860         let sig_data = self.from_func_sig::<M>(signature, flags)?;
861         let sig = self.sigs.push(sig_data);
862         self.ir_sig_ref_to_abi_sig[sig_ref] = Some(sig);
863         Ok(sig)
864     }
865 
866     /// Get the already-interned ABI signature id for the given `ir::SigRef`.
abi_sig_for_sig_ref(&self, sig_ref: ir::SigRef) -> Sig867     pub fn abi_sig_for_sig_ref(&self, sig_ref: ir::SigRef) -> Sig {
868         self.ir_sig_ref_to_abi_sig[sig_ref]
869             .expect("must call `make_abi_sig_from_ir_sig_ref` before `get_abi_sig_for_sig_ref`")
870     }
871 
872     /// Get the already-interned ABI signature id for the given `ir::Signature`.
abi_sig_for_signature(&self, signature: &ir::Signature) -> Sig873     pub fn abi_sig_for_signature(&self, signature: &ir::Signature) -> Sig {
874         self.ir_signature_to_abi_sig
875             .get(signature)
876             .copied()
877             .expect("must call `make_abi_sig_from_ir_signature` before `get_abi_sig_for_signature`")
878     }
879 
from_func_sig<M: ABIMachineSpec>( &mut self, sig: &ir::Signature, flags: &settings::Flags, ) -> CodegenResult<SigData>880     pub fn from_func_sig<M: ABIMachineSpec>(
881         &mut self,
882         sig: &ir::Signature,
883         flags: &settings::Flags,
884     ) -> CodegenResult<SigData> {
885         // Keep in sync with ensure_struct_return_ptr_is_returned
886         if sig.uses_special_return(ArgumentPurpose::StructReturn) {
887             panic!("Explicit StructReturn return value not allowed: {sig:?}")
888         }
889         let tmp;
890         let returns = if let Some(struct_ret_index) =
891             sig.special_param_index(ArgumentPurpose::StructReturn)
892         {
893             if !sig.returns.is_empty() {
894                 panic!("No return values are allowed when using StructReturn: {sig:?}");
895             }
896             tmp = [sig.params[struct_ret_index]];
897             &tmp
898         } else {
899             sig.returns.as_slice()
900         };
901 
902         // Compute args and retvals from signature. Handle retvals first,
903         // because we may need to add a return-area arg to the args.
904 
905         // NOTE: We rely on the order of the args (rets -> args) inserted to compute the offsets in
906         // `SigSet::args()` and `SigSet::rets()`. Therefore, we cannot change the two
907         // compute_arg_locs order.
908         let (sized_stack_ret_space, _) = M::compute_arg_locs(
909             sig.call_conv,
910             flags,
911             &returns,
912             ArgsOrRets::Rets,
913             /* extra ret-area ptr = */ false,
914             ArgsAccumulator::new(&mut self.abi_args),
915         )?;
916         if !flags.enable_multi_ret_implicit_sret() {
917             assert_eq!(sized_stack_ret_space, 0);
918         }
919         let rets_end = u32::try_from(self.abi_args.len()).unwrap();
920 
921         // To avoid overflow issues, limit the return size to something reasonable.
922         if sized_stack_ret_space > M::STACK_ARG_RET_SIZE_LIMIT {
923             return Err(CodegenError::ImplLimitExceeded);
924         }
925 
926         let need_stack_return_area = sized_stack_ret_space > 0;
927         if need_stack_return_area {
928             assert!(!sig.uses_special_param(ir::ArgumentPurpose::StructReturn));
929         }
930 
931         let (sized_stack_arg_space, stack_ret_arg) = M::compute_arg_locs(
932             sig.call_conv,
933             flags,
934             &sig.params,
935             ArgsOrRets::Args,
936             need_stack_return_area,
937             ArgsAccumulator::new(&mut self.abi_args),
938         )?;
939         let args_end = u32::try_from(self.abi_args.len()).unwrap();
940 
941         // To avoid overflow issues, limit the arg size to something reasonable.
942         if sized_stack_arg_space > M::STACK_ARG_RET_SIZE_LIMIT {
943             return Err(CodegenError::ImplLimitExceeded);
944         }
945 
946         trace!(
947             "ABISig: sig {:?} => args end = {} rets end = {}
948              arg stack = {} ret stack = {} stack_ret_arg = {:?}",
949             sig,
950             args_end,
951             rets_end,
952             sized_stack_arg_space,
953             sized_stack_ret_space,
954             need_stack_return_area,
955         );
956 
957         let stack_ret_arg = stack_ret_arg.map(|s| u16::try_from(s).unwrap());
958         Ok(SigData {
959             args_end,
960             rets_end,
961             sized_stack_arg_space,
962             sized_stack_ret_space,
963             stack_ret_arg,
964             call_conv: sig.call_conv,
965         })
966     }
967 
968     /// Get this signature's ABI arguments.
args(&self, sig: Sig) -> &[ABIArg]969     pub fn args(&self, sig: Sig) -> &[ABIArg] {
970         let sig_data = &self.sigs[sig];
971         // Please see comments in `SigSet::from_func_sig` of how we store the offsets.
972         let start = usize::try_from(sig_data.rets_end).unwrap();
973         let end = usize::try_from(sig_data.args_end).unwrap();
974         &self.abi_args[start..end]
975     }
976 
977     /// Get information specifying how to pass the implicit pointer
978     /// to the return-value area on the stack, if required.
get_ret_arg(&self, sig: Sig) -> Option<ABIArg>979     pub fn get_ret_arg(&self, sig: Sig) -> Option<ABIArg> {
980         let sig_data = &self.sigs[sig];
981         if let Some(i) = sig_data.stack_ret_arg {
982             Some(self.args(sig)[usize::from(i)].clone())
983         } else {
984             None
985         }
986     }
987 
988     /// Get information specifying how to pass one argument.
get_arg(&self, sig: Sig, idx: usize) -> ABIArg989     pub fn get_arg(&self, sig: Sig, idx: usize) -> ABIArg {
990         self.args(sig)[idx].clone()
991     }
992 
993     /// Get this signature's ABI returns.
rets(&self, sig: Sig) -> &[ABIArg]994     pub fn rets(&self, sig: Sig) -> &[ABIArg] {
995         let sig_data = &self.sigs[sig];
996         // Please see comments in `SigSet::from_func_sig` of how we store the offsets.
997         let start = usize::try_from(sig.prev().map_or(0, |prev| self.sigs[prev].args_end)).unwrap();
998         let end = usize::try_from(sig_data.rets_end).unwrap();
999         &self.abi_args[start..end]
1000     }
1001 
1002     /// Get information specifying how to pass one return value.
get_ret(&self, sig: Sig, idx: usize) -> ABIArg1003     pub fn get_ret(&self, sig: Sig, idx: usize) -> ABIArg {
1004         self.rets(sig)[idx].clone()
1005     }
1006 
1007     /// Get the number of arguments expected.
num_args(&self, sig: Sig) -> usize1008     pub fn num_args(&self, sig: Sig) -> usize {
1009         let len = self.args(sig).len();
1010         if self.sigs[sig].stack_ret_arg.is_some() {
1011             len - 1
1012         } else {
1013             len
1014         }
1015     }
1016 
1017     /// Get the number of return values expected.
num_rets(&self, sig: Sig) -> usize1018     pub fn num_rets(&self, sig: Sig) -> usize {
1019         self.rets(sig).len()
1020     }
1021 }
1022 
1023 // NB: we do _not_ implement `IndexMut` because these signatures are
1024 // deduplicated and shared!
1025 impl core::ops::Index<Sig> for SigSet {
1026     type Output = SigData;
1027 
index(&self, sig: Sig) -> &Self::Output1028     fn index(&self, sig: Sig) -> &Self::Output {
1029         &self.sigs[sig]
1030     }
1031 }
1032 
1033 /// Structure describing the layout of a function's stack frame.
1034 #[derive(Clone, Debug, Default)]
1035 pub struct FrameLayout {
1036     /// Word size in bytes, so this struct can be
1037     /// monomorphic/independent of `ABIMachineSpec`.
1038     pub word_bytes: u32,
1039 
1040     /// N.B. The areas whose sizes are given in this structure fully
1041     /// cover the current function's stack frame, from high to low
1042     /// stack addresses in the sequence below.  Each size contains
1043     /// any alignment padding that may be required by the ABI.
1044 
1045     /// Size of incoming arguments on the stack.  This is not technically
1046     /// part of this function's frame, but code in the function will still
1047     /// need to access it.  Depending on the ABI, we may need to set up a
1048     /// frame pointer to do so; we also may need to pop this area from the
1049     /// stack upon return.
1050     pub incoming_args_size: u32,
1051 
1052     /// The size of the incoming argument area, taking into account any
1053     /// potential increase in size required for tail calls present in the
1054     /// function. In the case that no tail calls are present, this value
1055     /// will be the same as [`Self::incoming_args_size`].
1056     pub tail_args_size: u32,
1057 
1058     /// Size of the "setup area", typically holding the return address
1059     /// and/or the saved frame pointer.  This may be written either during
1060     /// the call itself (e.g. a pushed return address) or by code emitted
1061     /// from gen_prologue_frame_setup.  In any case, after that code has
1062     /// completed execution, the stack pointer is expected to point to the
1063     /// bottom of this area.  The same holds at the start of code emitted
1064     /// by gen_epilogue_frame_restore.
1065     pub setup_area_size: u32,
1066 
1067     /// Size of the area used to save callee-saved clobbered registers.
1068     /// This area is accessed by code emitted from gen_clobber_save and
1069     /// gen_clobber_restore.
1070     pub clobber_size: u32,
1071 
1072     /// Storage allocated for the fixed part of the stack frame.
1073     /// This contains stack slots and spill slots.
1074     pub fixed_frame_storage_size: u32,
1075 
1076     /// The size of all stackslots.
1077     pub stackslots_size: u32,
1078 
1079     /// Stack size to be reserved for outgoing arguments, if used by
1080     /// the current ABI, or 0 otherwise.  After gen_clobber_save and
1081     /// before gen_clobber_restore, the stack pointer points to the
1082     /// bottom of this area.
1083     pub outgoing_args_size: u32,
1084 
1085     /// Sorted list of callee-saved registers that are clobbered
1086     /// according to the ABI.  These registers will be saved and
1087     /// restored by gen_clobber_save and gen_clobber_restore.
1088     pub clobbered_callee_saves: Vec<Writable<RealReg>>,
1089 
1090     /// The function's call pattern classification.
1091     pub function_calls: FunctionCalls,
1092 }
1093 
1094 impl FrameLayout {
1095     /// Split the clobbered callee-save registers into integer-class and
1096     /// float-class groups.
1097     ///
1098     /// This method does not currently support vector-class callee-save
1099     /// registers because no current backend has them.
clobbered_callee_saves_by_class(&self) -> (&[Writable<RealReg>], &[Writable<RealReg>])1100     pub fn clobbered_callee_saves_by_class(&self) -> (&[Writable<RealReg>], &[Writable<RealReg>]) {
1101         let (ints, floats) = self.clobbered_callee_saves.split_at(
1102             self.clobbered_callee_saves
1103                 .partition_point(|r| r.to_reg().class() == RegClass::Int),
1104         );
1105         debug_assert!(floats.iter().all(|r| r.to_reg().class() == RegClass::Float));
1106         (ints, floats)
1107     }
1108 
1109     /// The size of FP to SP while the frame is active (not during prologue
1110     /// setup or epilogue tear down).
active_size(&self) -> u321111     pub fn active_size(&self) -> u32 {
1112         self.outgoing_args_size + self.fixed_frame_storage_size + self.clobber_size
1113     }
1114 
1115     /// Get the offset from the SP to the sized stack slots area.
sp_to_sized_stack_slots(&self) -> u321116     pub fn sp_to_sized_stack_slots(&self) -> u32 {
1117         self.outgoing_args_size
1118     }
1119 
1120     /// Get the offset of a spill slot from SP.
spillslot_offset(&self, spillslot: SpillSlot) -> i641121     pub fn spillslot_offset(&self, spillslot: SpillSlot) -> i64 {
1122         // Offset from beginning of spillslot area.
1123         let islot = spillslot.index() as i64;
1124         let spill_off = islot * self.word_bytes as i64;
1125         let sp_off = self.stackslots_size as i64 + spill_off;
1126 
1127         sp_off
1128     }
1129 
1130     /// Get the offset from SP up to FP.
sp_to_fp(&self) -> u321131     pub fn sp_to_fp(&self) -> u32 {
1132         self.outgoing_args_size + self.fixed_frame_storage_size + self.clobber_size
1133     }
1134 }
1135 
1136 /// ABI object for a function body.
1137 pub struct Callee<M: ABIMachineSpec> {
1138     /// CLIF-level signature, possibly normalized.
1139     ir_sig: ir::Signature,
1140     /// Signature: arg and retval regs.
1141     sig: Sig,
1142     /// Defined dynamic types.
1143     dynamic_type_sizes: HashMap<Type, u32>,
1144     /// Offsets to each dynamic stackslot.
1145     dynamic_stackslots: PrimaryMap<DynamicStackSlot, u32>,
1146     /// Offsets to each sized stackslot.
1147     sized_stackslots: PrimaryMap<StackSlot, u32>,
1148     /// Descriptors for sized stackslots.
1149     sized_stackslot_keys: SecondaryMap<StackSlot, Option<StackSlotKey>>,
1150     /// Total stack size of all stackslots
1151     stackslots_size: u32,
1152     /// Stack size to be reserved for outgoing arguments.
1153     outgoing_args_size: u32,
1154     /// Initially the number of bytes originating in the callers frame where stack arguments will
1155     /// live. After lowering this number may be larger than the size expected by the function being
1156     /// compiled, as tail calls potentially require more space for stack arguments.
1157     tail_args_size: u32,
1158     /// Register-argument defs, to be provided to the `args`
1159     /// pseudo-inst, and pregs to constrain them to.
1160     reg_args: Vec<ArgPair>,
1161     /// Finalized frame layout for this function.
1162     frame_layout: Option<FrameLayout>,
1163     /// The register holding the return-area pointer, if needed.
1164     ret_area_ptr: Option<Reg>,
1165     /// Calling convention this function expects.
1166     call_conv: isa::CallConv,
1167     /// The settings controlling this function's compilation.
1168     flags: settings::Flags,
1169     /// The ISA-specific flag values controlling this function's compilation.
1170     isa_flags: M::F,
1171     /// If this function has a stack limit specified, then `Reg` is where the
1172     /// stack limit will be located after the instructions specified have been
1173     /// executed.
1174     ///
1175     /// Note that this is intended for insertion into the prologue, if
1176     /// present. Also note that because the instructions here execute in the
1177     /// prologue this happens after legalization/register allocation/etc so we
1178     /// need to be extremely careful with each instruction. The instructions are
1179     /// manually register-allocated and carefully only use caller-saved
1180     /// registers and keep nothing live after this sequence of instructions.
1181     stack_limit: Option<(Reg, SmallInstVec<M::I>)>,
1182 
1183     _mach: PhantomData<M>,
1184 }
1185 
get_special_purpose_param_register( f: &ir::Function, sigs: &SigSet, sig: Sig, purpose: ir::ArgumentPurpose, ) -> Option<Reg>1186 fn get_special_purpose_param_register(
1187     f: &ir::Function,
1188     sigs: &SigSet,
1189     sig: Sig,
1190     purpose: ir::ArgumentPurpose,
1191 ) -> Option<Reg> {
1192     let idx = f.signature.special_param_index(purpose)?;
1193     match &sigs.args(sig)[idx] {
1194         &ABIArg::Slots { ref slots, .. } => match &slots[0] {
1195             &ABIArgSlot::Reg { reg, .. } => Some(reg.into()),
1196             _ => None,
1197         },
1198         _ => None,
1199     }
1200 }
1201 
checked_round_up(val: u32, mask: u32) -> Option<u32>1202 fn checked_round_up(val: u32, mask: u32) -> Option<u32> {
1203     Some(val.checked_add(mask)? & !mask)
1204 }
1205 
1206 impl<M: ABIMachineSpec> Callee<M> {
1207     /// Create a new body ABI instance.
new( f: &ir::Function, isa: &dyn TargetIsa, isa_flags: &M::F, sigs: &SigSet, ) -> CodegenResult<Self>1208     pub fn new(
1209         f: &ir::Function,
1210         isa: &dyn TargetIsa,
1211         isa_flags: &M::F,
1212         sigs: &SigSet,
1213     ) -> CodegenResult<Self> {
1214         trace!("ABI: func signature {:?}", f.signature);
1215 
1216         let flags = isa.flags().clone();
1217         let sig = sigs.abi_sig_for_signature(&f.signature);
1218 
1219         let call_conv = f.signature.call_conv;
1220         // Only these calling conventions are supported.
1221         debug_assert!(
1222             call_conv == isa::CallConv::SystemV
1223                 || call_conv == isa::CallConv::Tail
1224                 || call_conv == isa::CallConv::Fast
1225                 || call_conv == isa::CallConv::WindowsFastcall
1226                 || call_conv == isa::CallConv::AppleAarch64
1227                 || call_conv == isa::CallConv::Winch
1228                 || call_conv == isa::CallConv::PreserveAll,
1229             "Unsupported calling convention: {call_conv:?}"
1230         );
1231 
1232         // Compute sized stackslot locations and total stackslot size.
1233         let mut end_offset: u32 = 0;
1234         let mut sized_stackslots = PrimaryMap::new();
1235         let mut sized_stackslot_keys = SecondaryMap::new();
1236 
1237         for (stackslot, data) in f.sized_stack_slots.iter() {
1238             // We start our computation possibly unaligned where the previous
1239             // stackslot left off.
1240             let unaligned_start_offset = end_offset;
1241 
1242             // The start of the stackslot must be aligned.
1243             //
1244             // We always at least machine-word-align slots, but also
1245             // satisfy the user's requested alignment.
1246             debug_assert!(data.align_shift < 32);
1247             let align = core::cmp::max(M::word_bytes(), 1u32 << data.align_shift);
1248             let mask = align - 1;
1249             let start_offset = checked_round_up(unaligned_start_offset, mask)
1250                 .ok_or(CodegenError::ImplLimitExceeded)?;
1251 
1252             // The end offset is the start offset increased by the size
1253             end_offset = start_offset
1254                 .checked_add(data.size)
1255                 .ok_or(CodegenError::ImplLimitExceeded)?;
1256 
1257             debug_assert_eq!(stackslot.as_u32() as usize, sized_stackslots.len());
1258             sized_stackslots.push(start_offset);
1259             sized_stackslot_keys[stackslot] = data.key;
1260         }
1261 
1262         // Compute dynamic stackslot locations and total stackslot size.
1263         let mut dynamic_stackslots = PrimaryMap::new();
1264         for (stackslot, data) in f.dynamic_stack_slots.iter() {
1265             debug_assert_eq!(stackslot.as_u32() as usize, dynamic_stackslots.len());
1266 
1267             // This computation is similar to the stackslots above
1268             let unaligned_start_offset = end_offset;
1269 
1270             let mask = M::word_bytes() - 1;
1271             let start_offset = checked_round_up(unaligned_start_offset, mask)
1272                 .ok_or(CodegenError::ImplLimitExceeded)?;
1273 
1274             let ty = f.get_concrete_dynamic_ty(data.dyn_ty).ok_or_else(|| {
1275                 CodegenError::Unsupported(format!("invalid dynamic vector type: {}", data.dyn_ty))
1276             })?;
1277 
1278             end_offset = start_offset
1279                 .checked_add(isa.dynamic_vector_bytes(ty))
1280                 .ok_or(CodegenError::ImplLimitExceeded)?;
1281 
1282             dynamic_stackslots.push(start_offset);
1283         }
1284 
1285         // The size of the stackslots needs to be word aligned
1286         let stackslots_size = checked_round_up(end_offset, M::word_bytes() - 1)
1287             .ok_or(CodegenError::ImplLimitExceeded)?;
1288 
1289         let mut dynamic_type_sizes = HashMap::with_capacity(f.dfg.dynamic_types.len());
1290         for (dyn_ty, _data) in f.dfg.dynamic_types.iter() {
1291             let ty = f
1292                 .get_concrete_dynamic_ty(dyn_ty)
1293                 .unwrap_or_else(|| panic!("invalid dynamic vector type: {dyn_ty}"));
1294             let size = isa.dynamic_vector_bytes(ty);
1295             dynamic_type_sizes.insert(ty, size);
1296         }
1297 
1298         // Figure out what instructions, if any, will be needed to check the
1299         // stack limit. This can either be specified as a special-purpose
1300         // argument or as a global value which often calculates the stack limit
1301         // from the arguments.
1302         let stack_limit = f
1303             .stack_limit
1304             .map(|gv| gen_stack_limit::<M>(f, sigs, sig, gv));
1305 
1306         let tail_args_size = sigs[sig].sized_stack_arg_space;
1307 
1308         Ok(Self {
1309             ir_sig: ensure_struct_return_ptr_is_returned(&f.signature),
1310             sig,
1311             dynamic_stackslots,
1312             dynamic_type_sizes,
1313             sized_stackslots,
1314             sized_stackslot_keys,
1315             stackslots_size,
1316             outgoing_args_size: 0,
1317             tail_args_size,
1318             reg_args: vec![],
1319             frame_layout: None,
1320             ret_area_ptr: None,
1321             call_conv,
1322             flags,
1323             isa_flags: isa_flags.clone(),
1324             stack_limit,
1325             _mach: PhantomData,
1326         })
1327     }
1328 
1329     /// Inserts instructions necessary for checking the stack limit into the
1330     /// prologue.
1331     ///
1332     /// This function will generate instructions necessary for perform a stack
1333     /// check at the header of a function. The stack check is intended to trap
1334     /// if the stack pointer goes below a particular threshold, preventing stack
1335     /// overflow in wasm or other code. The `stack_limit` argument here is the
1336     /// register which holds the threshold below which we're supposed to trap.
1337     /// This function is known to allocate `stack_size` bytes and we'll push
1338     /// instructions onto `insts`.
1339     ///
1340     /// Note that the instructions generated here are special because this is
1341     /// happening so late in the pipeline (e.g. after register allocation). This
1342     /// means that we need to do manual register allocation here and also be
1343     /// careful to not clobber any callee-saved or argument registers. For now
1344     /// this routine makes do with the `spilltmp_reg` as one temporary
1345     /// register, and a second register of `tmp2` which is caller-saved. This
1346     /// should be fine for us since no spills should happen in this sequence of
1347     /// instructions, so our register won't get accidentally clobbered.
1348     ///
1349     /// No values can be live after the prologue, but in this case that's ok
1350     /// because we just need to perform a stack check before progressing with
1351     /// the rest of the function.
insert_stack_check( &self, stack_limit: Reg, stack_size: u32, insts: &mut SmallInstVec<M::I>, )1352     fn insert_stack_check(
1353         &self,
1354         stack_limit: Reg,
1355         stack_size: u32,
1356         insts: &mut SmallInstVec<M::I>,
1357     ) {
1358         // With no explicit stack allocated we can just emit the simple check of
1359         // the stack registers against the stack limit register, and trap if
1360         // it's out of bounds.
1361         if stack_size == 0 {
1362             insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
1363             return;
1364         }
1365 
1366         // Note that the 32k stack size here is pretty special. See the
1367         // documentation in x86/abi.rs for why this is here. The general idea is
1368         // that we're protecting against overflow in the addition that happens
1369         // below.
1370         if stack_size >= 32 * 1024 {
1371             insts.extend(M::gen_stack_lower_bound_trap(stack_limit));
1372         }
1373 
1374         // Add the `stack_size` to `stack_limit`, placing the result in
1375         // `scratch`.
1376         //
1377         // Note though that `stack_limit`'s register may be the same as
1378         // `scratch`. If our stack size doesn't fit into an immediate this
1379         // means we need a second scratch register for loading the stack size
1380         // into a register.
1381         let scratch = Writable::from_reg(M::get_stacklimit_reg(self.call_conv));
1382         insts.extend(M::gen_add_imm(
1383             self.call_conv,
1384             scratch,
1385             stack_limit,
1386             stack_size,
1387         ));
1388         insts.extend(M::gen_stack_lower_bound_trap(scratch.to_reg()));
1389     }
1390 }
1391 
1392 /// Generates the instructions necessary for the `gv` to be materialized into a
1393 /// register.
1394 ///
1395 /// This function will return a register that will contain the result of
1396 /// evaluating `gv`. It will also return any instructions necessary to calculate
1397 /// the value of the register.
1398 ///
1399 /// Note that global values are typically lowered to instructions via the
1400 /// standard legalization pass. Unfortunately though prologue generation happens
1401 /// so late in the pipeline that we can't use these legalization passes to
1402 /// generate the instructions for `gv`. As a result we duplicate some lowering
1403 /// of `gv` here and support only some global values. This is similar to what
1404 /// the x86 backend does for now, and hopefully this can be somewhat cleaned up
1405 /// in the future too!
1406 ///
1407 /// Also note that this function will make use of `writable_spilltmp_reg()` as a
1408 /// temporary register to store values in if necessary. Currently after we write
1409 /// to this register there's guaranteed to be no spilled values between where
1410 /// it's used, because we're not participating in register allocation anyway!
gen_stack_limit<M: ABIMachineSpec>( f: &ir::Function, sigs: &SigSet, sig: Sig, gv: ir::GlobalValue, ) -> (Reg, SmallInstVec<M::I>)1411 fn gen_stack_limit<M: ABIMachineSpec>(
1412     f: &ir::Function,
1413     sigs: &SigSet,
1414     sig: Sig,
1415     gv: ir::GlobalValue,
1416 ) -> (Reg, SmallInstVec<M::I>) {
1417     let mut insts = smallvec![];
1418     let reg = generate_gv::<M>(f, sigs, sig, gv, &mut insts);
1419     return (reg, insts);
1420 }
1421 
generate_gv<M: ABIMachineSpec>( f: &ir::Function, sigs: &SigSet, sig: Sig, gv: ir::GlobalValue, insts: &mut SmallInstVec<M::I>, ) -> Reg1422 fn generate_gv<M: ABIMachineSpec>(
1423     f: &ir::Function,
1424     sigs: &SigSet,
1425     sig: Sig,
1426     gv: ir::GlobalValue,
1427     insts: &mut SmallInstVec<M::I>,
1428 ) -> Reg {
1429     match f.global_values[gv] {
1430         // Return the direct register the vmcontext is in
1431         ir::GlobalValueData::VMContext => {
1432             get_special_purpose_param_register(f, sigs, sig, ir::ArgumentPurpose::VMContext)
1433                 .expect("no vmcontext parameter found")
1434         }
1435         // Load our base value into a register, then load from that register
1436         // in to a temporary register.
1437         ir::GlobalValueData::Load {
1438             base,
1439             offset,
1440             global_type: _,
1441             flags: _,
1442         } => {
1443             let base = generate_gv::<M>(f, sigs, sig, base, insts);
1444             let into_reg = Writable::from_reg(M::get_stacklimit_reg(f.stencil.signature.call_conv));
1445             insts.push(M::gen_load_base_offset(
1446                 into_reg,
1447                 base,
1448                 offset.into(),
1449                 M::word_type(),
1450             ));
1451             return into_reg.to_reg();
1452         }
1453         ref other => panic!("global value for stack limit not supported: {other}"),
1454     }
1455 }
1456 
1457 /// Returns true if the signature needs to be legalized.
missing_struct_return(sig: &ir::Signature) -> bool1458 fn missing_struct_return(sig: &ir::Signature) -> bool {
1459     sig.uses_special_param(ArgumentPurpose::StructReturn)
1460         && !sig.uses_special_return(ArgumentPurpose::StructReturn)
1461 }
1462 
ensure_struct_return_ptr_is_returned(sig: &ir::Signature) -> ir::Signature1463 fn ensure_struct_return_ptr_is_returned(sig: &ir::Signature) -> ir::Signature {
1464     // Keep in sync with Callee::new
1465     let mut sig = sig.clone();
1466     if sig.uses_special_return(ArgumentPurpose::StructReturn) {
1467         panic!("Explicit StructReturn return value not allowed: {sig:?}")
1468     }
1469     if let Some(struct_ret_index) = sig.special_param_index(ArgumentPurpose::StructReturn) {
1470         if !sig.returns.is_empty() {
1471             panic!("No return values are allowed when using StructReturn: {sig:?}");
1472         }
1473         sig.returns.insert(0, sig.params[struct_ret_index]);
1474     }
1475     sig
1476 }
1477 
1478 /// ### Pre-Regalloc Functions
1479 ///
1480 /// These methods of `Callee` may only be called before regalloc.
1481 impl<M: ABIMachineSpec> Callee<M> {
1482     /// Access the (possibly legalized) signature.
signature(&self) -> &ir::Signature1483     pub fn signature(&self) -> &ir::Signature {
1484         debug_assert!(
1485             !missing_struct_return(&self.ir_sig),
1486             "`Callee::ir_sig` is always legalized"
1487         );
1488         &self.ir_sig
1489     }
1490 
1491     /// Initialize. This is called after the Callee is constructed because it
1492     /// may allocate a temp vreg, which can only be allocated once the lowering
1493     /// context exists.
init_retval_area( &mut self, sigs: &SigSet, vregs: &mut VRegAllocator<M::I>, ) -> CodegenResult<()>1494     pub fn init_retval_area(
1495         &mut self,
1496         sigs: &SigSet,
1497         vregs: &mut VRegAllocator<M::I>,
1498     ) -> CodegenResult<()> {
1499         if sigs[self.sig].stack_ret_arg.is_some() {
1500             let ret_area_ptr = vregs.alloc(M::word_type())?;
1501             self.ret_area_ptr = Some(ret_area_ptr.only_reg().unwrap());
1502         }
1503         Ok(())
1504     }
1505 
1506     /// Get the return area pointer register, if any.
ret_area_ptr(&self) -> Option<Reg>1507     pub fn ret_area_ptr(&self) -> Option<Reg> {
1508         self.ret_area_ptr
1509     }
1510 
1511     /// Accumulate outgoing arguments.
1512     ///
1513     /// This ensures that at least `size` bytes are allocated in the prologue to
1514     /// be available for use in function calls to hold arguments and/or return
1515     /// values. If this function is called multiple times, the maximum of all
1516     /// `size` values will be available.
accumulate_outgoing_args_size(&mut self, size: u32)1517     pub fn accumulate_outgoing_args_size(&mut self, size: u32) {
1518         if size > self.outgoing_args_size {
1519             self.outgoing_args_size = size;
1520         }
1521     }
1522 
1523     /// Accumulate the incoming argument area size requirements for a tail call,
1524     /// as it could be larger than the incoming arguments of the function
1525     /// currently being compiled.
accumulate_tail_args_size(&mut self, size: u32)1526     pub fn accumulate_tail_args_size(&mut self, size: u32) {
1527         if size > self.tail_args_size {
1528             self.tail_args_size = size;
1529         }
1530     }
1531 
is_forward_edge_cfi_enabled(&self) -> bool1532     pub fn is_forward_edge_cfi_enabled(&self) -> bool {
1533         self.isa_flags.is_forward_edge_cfi_enabled()
1534     }
1535 
1536     /// Get the calling convention implemented by this ABI object.
call_conv(&self) -> isa::CallConv1537     pub fn call_conv(&self) -> isa::CallConv {
1538         self.call_conv
1539     }
1540 
1541     /// Get the ABI-dependent MachineEnv for managing register allocation.
machine_env(&self) -> &MachineEnv1542     pub fn machine_env(&self) -> &MachineEnv {
1543         M::get_machine_env(&self.flags, self.call_conv)
1544     }
1545 
1546     /// The offsets of all sized stack slots (not spill slots) for debuginfo purposes.
sized_stackslot_offsets(&self) -> &PrimaryMap<StackSlot, u32>1547     pub fn sized_stackslot_offsets(&self) -> &PrimaryMap<StackSlot, u32> {
1548         &self.sized_stackslots
1549     }
1550 
1551     /// The offsets of all dynamic stack slots (not spill slots) for debuginfo purposes.
dynamic_stackslot_offsets(&self) -> &PrimaryMap<DynamicStackSlot, u32>1552     pub fn dynamic_stackslot_offsets(&self) -> &PrimaryMap<DynamicStackSlot, u32> {
1553         &self.dynamic_stackslots
1554     }
1555 
1556     /// Generate an instruction which copies an argument to a destination
1557     /// register.
gen_copy_arg_to_regs( &mut self, sigs: &SigSet, idx: usize, into_regs: ValueRegs<Writable<Reg>>, vregs: &mut VRegAllocator<M::I>, ) -> SmallInstVec<M::I>1558     pub fn gen_copy_arg_to_regs(
1559         &mut self,
1560         sigs: &SigSet,
1561         idx: usize,
1562         into_regs: ValueRegs<Writable<Reg>>,
1563         vregs: &mut VRegAllocator<M::I>,
1564     ) -> SmallInstVec<M::I> {
1565         let mut insts = smallvec![];
1566         let mut copy_arg_slot_to_reg = |slot: &ABIArgSlot, into_reg: &Writable<Reg>| {
1567             match slot {
1568                 &ABIArgSlot::Reg { reg, .. } => {
1569                     // Add a preg -> def pair to the eventual `args`
1570                     // instruction.  Extension mode doesn't matter
1571                     // (we're copying out, not in; we ignore high bits
1572                     // by convention).
1573                     let arg = ArgPair {
1574                         vreg: *into_reg,
1575                         preg: reg.into(),
1576                     };
1577                     self.reg_args.push(arg);
1578                 }
1579                 &ABIArgSlot::Stack {
1580                     offset,
1581                     ty,
1582                     extension,
1583                     ..
1584                 } => {
1585                     // However, we have to respect the extension mode for stack
1586                     // slots, or else we grab the wrong bytes on big-endian.
1587                     let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
1588                     let ty =
1589                         if ext != ArgumentExtension::None && M::word_bits() > ty_bits(ty) as u32 {
1590                             M::word_type()
1591                         } else {
1592                             ty
1593                         };
1594                     insts.push(M::gen_load_stack(
1595                         StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
1596                         *into_reg,
1597                         ty,
1598                     ));
1599                 }
1600             }
1601         };
1602 
1603         match &sigs.args(self.sig)[idx] {
1604             &ABIArg::Slots { ref slots, .. } => {
1605                 assert_eq!(into_regs.len(), slots.len());
1606                 for (slot, into_reg) in slots.iter().zip(into_regs.regs().iter()) {
1607                     copy_arg_slot_to_reg(&slot, &into_reg);
1608                 }
1609             }
1610             &ABIArg::StructArg { offset, .. } => {
1611                 let into_reg = into_regs.only_reg().unwrap();
1612                 // Buffer address is implicitly defined by the ABI.
1613                 insts.push(M::gen_get_stack_addr(
1614                     StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
1615                     into_reg,
1616                 ));
1617             }
1618             &ABIArg::ImplicitPtrArg { pointer, ty, .. } => {
1619                 let into_reg = into_regs.only_reg().unwrap();
1620                 // We need to dereference the pointer.
1621                 let base = match &pointer {
1622                     &ABIArgSlot::Reg { reg, ty, .. } => {
1623                         let tmp = vregs.alloc_with_deferred_error(ty).only_reg().unwrap();
1624                         self.reg_args.push(ArgPair {
1625                             vreg: Writable::from_reg(tmp),
1626                             preg: reg.into(),
1627                         });
1628                         tmp
1629                     }
1630                     &ABIArgSlot::Stack { offset, ty, .. } => {
1631                         let addr_reg = writable_value_regs(vregs.alloc_with_deferred_error(ty))
1632                             .only_reg()
1633                             .unwrap();
1634                         insts.push(M::gen_load_stack(
1635                             StackAMode::IncomingArg(offset, sigs[self.sig].sized_stack_arg_space),
1636                             addr_reg,
1637                             ty,
1638                         ));
1639                         addr_reg.to_reg()
1640                     }
1641                 };
1642                 insts.push(M::gen_load_base_offset(into_reg, base, 0, ty));
1643             }
1644         }
1645         insts
1646     }
1647 
1648     /// Generate an instruction which copies a source register to a return value slot.
gen_copy_regs_to_retval( &self, sigs: &SigSet, idx: usize, from_regs: ValueRegs<Reg>, vregs: &mut VRegAllocator<M::I>, ) -> (SmallVec<[RetPair; 2]>, SmallInstVec<M::I>)1649     pub fn gen_copy_regs_to_retval(
1650         &self,
1651         sigs: &SigSet,
1652         idx: usize,
1653         from_regs: ValueRegs<Reg>,
1654         vregs: &mut VRegAllocator<M::I>,
1655     ) -> (SmallVec<[RetPair; 2]>, SmallInstVec<M::I>) {
1656         let mut reg_pairs = smallvec![];
1657         let mut ret = smallvec![];
1658         let word_bits = M::word_bits() as u8;
1659         match &sigs.rets(self.sig)[idx] {
1660             &ABIArg::Slots { ref slots, .. } => {
1661                 assert_eq!(from_regs.len(), slots.len());
1662                 for (slot, &from_reg) in slots.iter().zip(from_regs.regs().iter()) {
1663                     match slot {
1664                         &ABIArgSlot::Reg {
1665                             reg, ty, extension, ..
1666                         } => {
1667                             let from_bits = ty_bits(ty) as u8;
1668                             let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
1669                             let vreg = match (ext, from_bits) {
1670                                 (ir::ArgumentExtension::Uext, n)
1671                                 | (ir::ArgumentExtension::Sext, n)
1672                                     if n < word_bits =>
1673                                 {
1674                                     let signed = ext == ir::ArgumentExtension::Sext;
1675                                     let dst =
1676                                         writable_value_regs(vregs.alloc_with_deferred_error(ty))
1677                                             .only_reg()
1678                                             .unwrap();
1679                                     ret.push(M::gen_extend(
1680                                         dst, from_reg, signed, from_bits,
1681                                         /* to_bits = */ word_bits,
1682                                     ));
1683                                     dst.to_reg()
1684                                 }
1685                                 _ => {
1686                                     // No move needed, regalloc2 will emit it using the constraint
1687                                     // added by the RetPair.
1688                                     from_reg
1689                                 }
1690                             };
1691                             reg_pairs.push(RetPair {
1692                                 vreg,
1693                                 preg: Reg::from(reg),
1694                             });
1695                         }
1696                         &ABIArgSlot::Stack {
1697                             offset,
1698                             ty,
1699                             extension,
1700                             ..
1701                         } => {
1702                             let mut ty = ty;
1703                             let from_bits = ty_bits(ty) as u8;
1704                             // A machine ABI implementation should ensure that stack frames
1705                             // have "reasonable" size. All current ABIs for machinst
1706                             // backends (aarch64 and x64) enforce a 128MB limit.
1707                             let off = i32::try_from(offset).expect(
1708                                 "Argument stack offset greater than 2GB; should hit impl limit first",
1709                                 );
1710                             let ext = M::get_ext_mode(sigs[self.sig].call_conv, extension);
1711                             // Trash the from_reg; it should be its last use.
1712                             match (ext, from_bits) {
1713                                 (ir::ArgumentExtension::Uext, n)
1714                                 | (ir::ArgumentExtension::Sext, n)
1715                                     if n < word_bits =>
1716                                 {
1717                                     assert_eq!(M::word_reg_class(), from_reg.class());
1718                                     let signed = ext == ir::ArgumentExtension::Sext;
1719                                     let dst =
1720                                         writable_value_regs(vregs.alloc_with_deferred_error(ty))
1721                                             .only_reg()
1722                                             .unwrap();
1723                                     ret.push(M::gen_extend(
1724                                         dst, from_reg, signed, from_bits,
1725                                         /* to_bits = */ word_bits,
1726                                     ));
1727                                     // Store the extended version.
1728                                     ty = M::word_type();
1729                                 }
1730                                 _ => {}
1731                             };
1732                             ret.push(M::gen_store_base_offset(
1733                                 self.ret_area_ptr.unwrap(),
1734                                 off,
1735                                 from_reg,
1736                                 ty,
1737                             ));
1738                         }
1739                     }
1740                 }
1741             }
1742             ABIArg::StructArg { .. } => {
1743                 panic!("StructArg in return position is unsupported");
1744             }
1745             ABIArg::ImplicitPtrArg { .. } => {
1746                 panic!("ImplicitPtrArg in return position is unsupported");
1747             }
1748         }
1749         (reg_pairs, ret)
1750     }
1751 
1752     /// Generate any setup instruction needed to save values to the
1753     /// return-value area. This is usually used when were are multiple return
1754     /// values or an otherwise large return value that must be passed on the
1755     /// stack; typically the ABI specifies an extra hidden argument that is a
1756     /// pointer to that memory.
gen_retval_area_setup( &mut self, sigs: &SigSet, vregs: &mut VRegAllocator<M::I>, ) -> Option<M::I>1757     pub fn gen_retval_area_setup(
1758         &mut self,
1759         sigs: &SigSet,
1760         vregs: &mut VRegAllocator<M::I>,
1761     ) -> Option<M::I> {
1762         if let Some(i) = sigs[self.sig].stack_ret_arg {
1763             let ret_area_ptr = Writable::from_reg(self.ret_area_ptr.unwrap());
1764             let insts =
1765                 self.gen_copy_arg_to_regs(sigs, i.into(), ValueRegs::one(ret_area_ptr), vregs);
1766             insts.into_iter().next().map(|inst| {
1767                 trace!(
1768                     "gen_retval_area_setup: inst {:?}; ptr reg is {:?}",
1769                     inst,
1770                     ret_area_ptr.to_reg()
1771                 );
1772                 inst
1773             })
1774         } else {
1775             trace!("gen_retval_area_setup: not needed");
1776             None
1777         }
1778     }
1779 
1780     /// Generate a return instruction.
gen_rets(&self, rets: Vec<RetPair>) -> M::I1781     pub fn gen_rets(&self, rets: Vec<RetPair>) -> M::I {
1782         M::gen_rets(rets)
1783     }
1784 
1785     /// Set up arguments values `args` for a call with signature `sig`.
1786     /// This will return a series of instructions to be emitted to set
1787     /// up all arguments, as well as a `CallArgList` list representing
1788     /// the arguments passed in registers.  The latter need to be added
1789     /// as constraints to the actual call instruction.
gen_call_args( &self, sigs: &SigSet, sig: Sig, args: &[ValueRegs<Reg>], is_tail_call: bool, flags: &settings::Flags, vregs: &mut VRegAllocator<M::I>, ) -> (CallArgList, SmallInstVec<M::I>)1790     pub fn gen_call_args(
1791         &self,
1792         sigs: &SigSet,
1793         sig: Sig,
1794         args: &[ValueRegs<Reg>],
1795         is_tail_call: bool,
1796         flags: &settings::Flags,
1797         vregs: &mut VRegAllocator<M::I>,
1798     ) -> (CallArgList, SmallInstVec<M::I>) {
1799         let mut uses: CallArgList = smallvec![];
1800         let mut insts = smallvec![];
1801 
1802         assert_eq!(args.len(), sigs.num_args(sig));
1803 
1804         let call_conv = sigs[sig].call_conv;
1805         let stack_arg_space = sigs[sig].sized_stack_arg_space;
1806         let stack_arg = |offset| {
1807             if is_tail_call {
1808                 StackAMode::IncomingArg(offset, stack_arg_space)
1809             } else {
1810                 StackAMode::OutgoingArg(offset)
1811             }
1812         };
1813 
1814         let word_ty = M::word_type();
1815         let word_rc = M::word_reg_class();
1816         let word_bits = M::word_bits() as usize;
1817 
1818         if is_tail_call {
1819             debug_assert_eq!(
1820                 self.call_conv,
1821                 isa::CallConv::Tail,
1822                 "Can only do `return_call`s from within a `tail` calling convention function"
1823             );
1824         }
1825 
1826         // Helper to process a single argument slot (register or stack slot).
1827         // This will either add the register to the `uses` list or write the
1828         // value to the stack slot in the outgoing argument area (or for tail
1829         // calls, the incoming argument area).
1830         let mut process_arg_slot = |insts: &mut SmallInstVec<M::I>, slot, vreg, ty| {
1831             match &slot {
1832                 &ABIArgSlot::Reg { reg, .. } => {
1833                     uses.push(CallArgPair {
1834                         vreg,
1835                         preg: reg.into(),
1836                     });
1837                 }
1838                 &ABIArgSlot::Stack { offset, .. } => {
1839                     insts.push(M::gen_store_stack(stack_arg(offset), vreg, ty));
1840                 }
1841             };
1842         };
1843 
1844         // First pass: Handle `StructArg` arguments.  These need to be copied
1845         // into their associated stack buffers.  This should happen before any
1846         // of the other arguments are processed, as the `memcpy` call might
1847         // clobber registers used by other arguments.
1848         for (idx, from_regs) in args.iter().enumerate() {
1849             match &sigs.args(sig)[idx] {
1850                 &ABIArg::Slots { .. } | &ABIArg::ImplicitPtrArg { .. } => {}
1851                 &ABIArg::StructArg { offset, size, .. } => {
1852                     let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
1853                     insts.push(M::gen_get_stack_addr(
1854                         stack_arg(offset),
1855                         Writable::from_reg(tmp),
1856                     ));
1857                     insts.extend(M::gen_memcpy(
1858                         isa::CallConv::for_libcall(flags, call_conv),
1859                         tmp,
1860                         from_regs.only_reg().unwrap(),
1861                         size as usize,
1862                         |ty| {
1863                             Writable::from_reg(
1864                                 vregs.alloc_with_deferred_error(ty).only_reg().unwrap(),
1865                             )
1866                         },
1867                     ));
1868                 }
1869             }
1870         }
1871 
1872         // Second pass: Handle everything except `StructArg` arguments.
1873         for (idx, from_regs) in args.iter().enumerate() {
1874             match sigs.args(sig)[idx] {
1875                 ABIArg::Slots { ref slots, .. } => {
1876                     assert_eq!(from_regs.len(), slots.len());
1877                     for (slot, from_reg) in slots.iter().zip(from_regs.regs().iter()) {
1878                         // Load argument slot value from `from_reg`, and perform any zero-
1879                         // or sign-extension that is required by the ABI.
1880                         let (ty, extension) = match *slot {
1881                             ABIArgSlot::Reg { ty, extension, .. } => (ty, extension),
1882                             ABIArgSlot::Stack { ty, extension, .. } => (ty, extension),
1883                         };
1884                         let ext = M::get_ext_mode(call_conv, extension);
1885                         let (vreg, ty) = if ext != ir::ArgumentExtension::None
1886                             && ty_bits(ty) < word_bits
1887                         {
1888                             assert_eq!(word_rc, from_reg.class());
1889                             let signed = match ext {
1890                                 ir::ArgumentExtension::Uext => false,
1891                                 ir::ArgumentExtension::Sext => true,
1892                                 _ => unreachable!(),
1893                             };
1894                             let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
1895                             insts.push(M::gen_extend(
1896                                 Writable::from_reg(tmp),
1897                                 *from_reg,
1898                                 signed,
1899                                 ty_bits(ty) as u8,
1900                                 word_bits as u8,
1901                             ));
1902                             (tmp, word_ty)
1903                         } else {
1904                             (*from_reg, ty)
1905                         };
1906                         process_arg_slot(&mut insts, *slot, vreg, ty);
1907                     }
1908                 }
1909                 ABIArg::ImplicitPtrArg {
1910                     offset,
1911                     pointer,
1912                     ty,
1913                     ..
1914                 } => {
1915                     let vreg = from_regs.only_reg().unwrap();
1916                     let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
1917                     insts.push(M::gen_get_stack_addr(
1918                         stack_arg(offset),
1919                         Writable::from_reg(tmp),
1920                     ));
1921                     insts.push(M::gen_store_base_offset(tmp, 0, vreg, ty));
1922                     process_arg_slot(&mut insts, pointer, tmp, word_ty);
1923                 }
1924                 ABIArg::StructArg { .. } => {}
1925             }
1926         }
1927 
1928         // Finally, set the stack-return pointer to the return argument area.
1929         // For tail calls, this means forwarding the incoming stack-return pointer.
1930         if let Some(ret_arg) = sigs.get_ret_arg(sig) {
1931             let ret_area = if is_tail_call {
1932                 self.ret_area_ptr.expect(
1933                     "if the tail callee has a return pointer, then the tail caller must as well",
1934                 )
1935             } else {
1936                 let tmp = vregs.alloc_with_deferred_error(word_ty).only_reg().unwrap();
1937                 let amode = StackAMode::OutgoingArg(stack_arg_space.into());
1938                 insts.push(M::gen_get_stack_addr(amode, Writable::from_reg(tmp)));
1939                 tmp
1940             };
1941             match ret_arg {
1942                 // The return pointer must occupy a single slot.
1943                 ABIArg::Slots { slots, .. } => {
1944                     assert_eq!(slots.len(), 1);
1945                     process_arg_slot(&mut insts, slots[0], ret_area, word_ty);
1946                 }
1947                 _ => unreachable!(),
1948             }
1949         }
1950 
1951         (uses, insts)
1952     }
1953 
1954     /// Set up return values `outputs` for a call with signature `sig`.
1955     /// This does not emit (or return) any instructions, but returns a
1956     /// `CallRetList` representing the return value constraints.  This
1957     /// needs to be added to the actual call instruction.
1958     ///
1959     /// If `try_call_payloads` is non-zero, it is expected to hold
1960     /// exception payload registers for try_call instructions.  These
1961     /// will be added as needed to the `CallRetList` as well.
gen_call_rets( &self, sigs: &SigSet, sig: Sig, outputs: &[ValueRegs<Reg>], try_call_payloads: Option<&[Writable<Reg>]>, vregs: &mut VRegAllocator<M::I>, ) -> CallRetList1962     pub fn gen_call_rets(
1963         &self,
1964         sigs: &SigSet,
1965         sig: Sig,
1966         outputs: &[ValueRegs<Reg>],
1967         try_call_payloads: Option<&[Writable<Reg>]>,
1968         vregs: &mut VRegAllocator<M::I>,
1969     ) -> CallRetList {
1970         let callee_conv = sigs[sig].call_conv;
1971         let stack_arg_space = sigs[sig].sized_stack_arg_space;
1972 
1973         let word_ty = M::word_type();
1974         let word_bits = M::word_bits() as usize;
1975 
1976         let mut defs: CallRetList = smallvec![];
1977         let mut outputs = outputs.into_iter();
1978         let num_rets = sigs.num_rets(sig);
1979         for idx in 0..num_rets {
1980             let ret = sigs.rets(sig)[idx].clone();
1981             match ret {
1982                 ABIArg::Slots {
1983                     ref slots, purpose, ..
1984                 } => {
1985                     // We do not use the returned copy of the return buffer pointer,
1986                     // so skip any StructReturn returns that may be present.
1987                     if purpose == ArgumentPurpose::StructReturn {
1988                         continue;
1989                     }
1990                     let retval_regs = outputs.next().unwrap();
1991                     assert_eq!(retval_regs.len(), slots.len());
1992                     for (slot, retval_reg) in slots.iter().zip(retval_regs.regs().iter()) {
1993                         // We do not perform any extension because we're copying out, not in,
1994                         // and we ignore high bits in our own registers by convention.  However,
1995                         // we still need to use the proper extended type to access stack slots
1996                         // (this is critical on big-endian systems).
1997                         let (ty, extension) = match *slot {
1998                             ABIArgSlot::Reg { ty, extension, .. } => (ty, extension),
1999                             ABIArgSlot::Stack { ty, extension, .. } => (ty, extension),
2000                         };
2001                         let ext = M::get_ext_mode(callee_conv, extension);
2002                         let ty = if ext != ir::ArgumentExtension::None && ty_bits(ty) < word_bits {
2003                             word_ty
2004                         } else {
2005                             ty
2006                         };
2007 
2008                         match slot {
2009                             &ABIArgSlot::Reg { reg, .. } => {
2010                                 defs.push(CallRetPair {
2011                                     vreg: Writable::from_reg(*retval_reg),
2012                                     location: RetLocation::Reg(reg.into(), ty),
2013                                 });
2014                             }
2015                             &ABIArgSlot::Stack { offset, .. } => {
2016                                 let amode =
2017                                     StackAMode::OutgoingArg(offset + i64::from(stack_arg_space));
2018                                 defs.push(CallRetPair {
2019                                     vreg: Writable::from_reg(*retval_reg),
2020                                     location: RetLocation::Stack(amode, ty),
2021                                 });
2022                             }
2023                         }
2024                     }
2025                 }
2026                 ABIArg::StructArg { .. } => {
2027                     panic!("StructArg not supported in return position");
2028                 }
2029                 ABIArg::ImplicitPtrArg { .. } => {
2030                     panic!("ImplicitPtrArg not supported in return position");
2031                 }
2032             }
2033         }
2034         assert!(outputs.next().is_none());
2035 
2036         if let Some(try_call_payloads) = try_call_payloads {
2037             // Let `M` say where the payload values are going to end up and then
2038             // double-check it's the same size as the calling convention's
2039             // reported number of exception types.
2040             let pregs = M::exception_payload_regs(callee_conv);
2041             assert_eq!(
2042                 callee_conv.exception_payload_types(M::word_type()).len(),
2043                 pregs.len()
2044             );
2045 
2046             // We need to update `defs` to contain the exception
2047             // payload regs as well. We have two sources of info that
2048             // we join:
2049             //
2050             // - The machine-specific ABI implementation `M`, which
2051             //   tells us the particular registers that payload values
2052             //   must be in
2053             // - The passed-in lowering context, which gives us the
2054             //   vregs we must define.
2055             //
2056             // Note that payload values may need to end up in the same
2057             // physical registers as ordinary return values; this is
2058             // not a conflict, because we either get one or the
2059             // other. For regalloc's purposes, we define both starting
2060             // here at the callsite, but we can share one def in the
2061             // `defs` list and alias one vreg to another. Thus we
2062             // handle the two cases below for each payload register:
2063             // overlaps a return value (and we alias to it) or not
2064             // (and we add a def).
2065             for (i, &preg) in pregs.iter().enumerate() {
2066                 let vreg = try_call_payloads[i];
2067                 if let Some(existing) = defs.iter().find(|def| match def.location {
2068                     RetLocation::Reg(r, _) => r == preg,
2069                     _ => false,
2070                 }) {
2071                     vregs.set_vreg_alias(vreg.to_reg(), existing.vreg.to_reg());
2072                 } else {
2073                     defs.push(CallRetPair {
2074                         vreg,
2075                         location: RetLocation::Reg(preg, word_ty),
2076                     });
2077                 }
2078             }
2079         }
2080 
2081         defs
2082     }
2083 
2084     /// Populate a `CallInfo` for a call with signature `sig`.
2085     ///
2086     /// `dest` is the target-specific call destination value
2087     /// `uses` is the `CallArgList` describing argument constraints
2088     /// `defs` is the `CallRetList` describing return constraints
2089     /// `try_call_info` describes exception targets for try_call instructions
2090     /// `patchable` describes whether this callsite should emit metadata
2091     /// for patching to enable/disable it.
2092     ///
2093     /// The clobber list is computed here from the above data.
gen_call_info<T>( &self, sigs: &SigSet, sig: Sig, dest: T, uses: CallArgList, defs: CallRetList, try_call_info: Option<TryCallInfo>, patchable: bool, ) -> CallInfo<T>2094     pub fn gen_call_info<T>(
2095         &self,
2096         sigs: &SigSet,
2097         sig: Sig,
2098         dest: T,
2099         uses: CallArgList,
2100         defs: CallRetList,
2101         try_call_info: Option<TryCallInfo>,
2102         patchable: bool,
2103     ) -> CallInfo<T> {
2104         let caller_conv = self.call_conv;
2105         let callee_conv = sigs[sig].call_conv;
2106         let stack_arg_space = sigs[sig].sized_stack_arg_space;
2107 
2108         let clobbers = {
2109             // Get clobbers: all caller-saves. These may include return value
2110             // regs, which we will remove from the clobber set below.
2111             let mut clobbers =
2112                 <M>::get_regs_clobbered_by_call(callee_conv, try_call_info.is_some());
2113 
2114             // Remove retval regs from clobbers.
2115             for def in &defs {
2116                 if let RetLocation::Reg(preg, _) = def.location {
2117                     clobbers.remove(PReg::from(preg.to_real_reg().unwrap()));
2118                 }
2119             }
2120 
2121             clobbers
2122         };
2123 
2124         // Any adjustment to SP to account for required outgoing arguments/stack return values must
2125         // be done inside of the call pseudo-op, to ensure that SP is always in a consistent
2126         // state for all other instructions. For example, if a tail-call abi function is called
2127         // here, the reclamation of the outgoing argument area must be done inside of the call
2128         // pseudo-op's emission to ensure that SP is consistent at all other points in the lowered
2129         // function. (Except the prologue and epilogue, but those are fairly special parts of the
2130         // function that establish the SP invariants that are relied on elsewhere and are generated
2131         // after the register allocator has run and thus cannot have register allocator-inserted
2132         // references to SP offsets.)
2133 
2134         let callee_pop_size = if callee_conv == isa::CallConv::Tail {
2135             // The tail calling convention has callees pop stack arguments.
2136             stack_arg_space
2137         } else {
2138             0
2139         };
2140 
2141         CallInfo {
2142             dest,
2143             uses,
2144             defs,
2145             clobbers,
2146             callee_conv,
2147             caller_conv,
2148             callee_pop_size,
2149             try_call_info,
2150             patchable,
2151         }
2152     }
2153 
2154     /// Get the raw offset of a sized stackslot in the slot region.
sized_stackslot_offset(&self, slot: StackSlot) -> u322155     pub fn sized_stackslot_offset(&self, slot: StackSlot) -> u32 {
2156         self.sized_stackslots[slot]
2157     }
2158 
2159     /// Produce an instruction that computes a sized stackslot address.
sized_stackslot_addr( &self, slot: StackSlot, offset: u32, into_reg: Writable<Reg>, ) -> M::I2160     pub fn sized_stackslot_addr(
2161         &self,
2162         slot: StackSlot,
2163         offset: u32,
2164         into_reg: Writable<Reg>,
2165     ) -> M::I {
2166         // Offset from beginning of stackslot area.
2167         let stack_off = self.sized_stackslots[slot] as i64;
2168         let sp_off: i64 = stack_off + (offset as i64);
2169         M::gen_get_stack_addr(StackAMode::Slot(sp_off), into_reg)
2170     }
2171 
2172     /// Produce an instruction that computes a dynamic stackslot address.
dynamic_stackslot_addr(&self, slot: DynamicStackSlot, into_reg: Writable<Reg>) -> M::I2173     pub fn dynamic_stackslot_addr(&self, slot: DynamicStackSlot, into_reg: Writable<Reg>) -> M::I {
2174         let stack_off = self.dynamic_stackslots[slot] as i64;
2175         M::gen_get_stack_addr(StackAMode::Slot(stack_off), into_reg)
2176     }
2177 
2178     /// Get an `args` pseudo-inst, if any, that should appear at the
2179     /// very top of the function body prior to regalloc.
take_args(&mut self) -> Option<M::I>2180     pub fn take_args(&mut self) -> Option<M::I> {
2181         if self.reg_args.len() > 0 {
2182             // Very first instruction is an `args` pseudo-inst that
2183             // establishes live-ranges for in-register arguments and
2184             // constrains them at the start of the function to the
2185             // locations defined by the ABI.
2186             Some(M::gen_args(core::mem::take(&mut self.reg_args)))
2187         } else {
2188             None
2189         }
2190     }
2191 }
2192 
2193 /// ### Post-Regalloc Functions
2194 ///
2195 /// These methods of `Callee` may only be called after
2196 /// regalloc.
2197 impl<M: ABIMachineSpec> Callee<M> {
2198     /// Compute the final frame layout, post-regalloc.
2199     ///
2200     /// This must be called before gen_prologue or gen_epilogue.
compute_frame_layout( &mut self, sigs: &SigSet, spillslots: usize, clobbered: Vec<Writable<RealReg>>, function_calls: FunctionCalls, )2201     pub fn compute_frame_layout(
2202         &mut self,
2203         sigs: &SigSet,
2204         spillslots: usize,
2205         clobbered: Vec<Writable<RealReg>>,
2206         function_calls: FunctionCalls,
2207     ) {
2208         let bytes = M::word_bytes();
2209         let total_stacksize = self.stackslots_size + bytes * spillslots as u32;
2210         let mask = M::stack_align(self.call_conv) - 1;
2211         let total_stacksize = (total_stacksize + mask) & !mask; // 16-align the stack.
2212         self.frame_layout = Some(M::compute_frame_layout(
2213             self.call_conv,
2214             &self.flags,
2215             self.signature(),
2216             &clobbered,
2217             function_calls,
2218             self.stack_args_size(sigs),
2219             self.tail_args_size,
2220             self.stackslots_size,
2221             total_stacksize,
2222             self.outgoing_args_size,
2223         ));
2224     }
2225 
2226     /// Generate a prologue, post-regalloc.
2227     ///
2228     /// This should include any stack frame or other setup necessary to use the
2229     /// other methods (`load_arg`, `store_retval`, and spillslot accesses.)
gen_prologue(&self) -> SmallInstVec<M::I>2230     pub fn gen_prologue(&self) -> SmallInstVec<M::I> {
2231         let frame_layout = self.frame_layout();
2232         let mut insts = smallvec![];
2233 
2234         // Set up frame.
2235         insts.extend(M::gen_prologue_frame_setup(
2236             self.call_conv,
2237             &self.flags,
2238             &self.isa_flags,
2239             &frame_layout,
2240         ));
2241 
2242         // The stack limit check needs to cover all the stack adjustments we
2243         // might make, up to the next stack limit check in any function we
2244         // call. Since this happens after frame setup, the current function's
2245         // setup area needs to be accounted for in the caller's stack limit
2246         // check, but we need to account for any setup area that our callees
2247         // might need. Note that s390x may also use the outgoing args area for
2248         // backtrace support even in leaf functions, so that should be accounted
2249         // for unconditionally.
2250         let total_stacksize = (frame_layout.tail_args_size - frame_layout.incoming_args_size)
2251             + frame_layout.clobber_size
2252             + frame_layout.fixed_frame_storage_size
2253             + frame_layout.outgoing_args_size
2254             + if frame_layout.function_calls == FunctionCalls::None {
2255                 0
2256             } else {
2257                 frame_layout.setup_area_size
2258             };
2259 
2260         // Leaf functions with zero stack don't need a stack check if one's
2261         // specified, otherwise always insert the stack check.
2262         if total_stacksize > 0 || frame_layout.function_calls != FunctionCalls::None {
2263             if let Some((reg, stack_limit_load)) = &self.stack_limit {
2264                 insts.extend(stack_limit_load.clone());
2265                 self.insert_stack_check(*reg, total_stacksize, &mut insts);
2266             }
2267 
2268             if self.flags.enable_probestack() {
2269                 let guard_size = 1 << self.flags.probestack_size_log2();
2270                 match self.flags.probestack_strategy() {
2271                     ProbestackStrategy::Inline => M::gen_inline_probestack(
2272                         &mut insts,
2273                         self.call_conv,
2274                         total_stacksize,
2275                         guard_size,
2276                     ),
2277                     ProbestackStrategy::Outline => {
2278                         if total_stacksize >= guard_size {
2279                             M::gen_probestack(&mut insts, total_stacksize);
2280                         }
2281                     }
2282                 }
2283             }
2284         }
2285 
2286         // Save clobbered registers.
2287         insts.extend(M::gen_clobber_save(
2288             self.call_conv,
2289             &self.flags,
2290             &frame_layout,
2291         ));
2292 
2293         insts
2294     }
2295 
2296     /// Generate an epilogue, post-regalloc.
2297     ///
2298     /// Note that this must generate the actual return instruction (rather than
2299     /// emitting this in the lowering logic), because the epilogue code comes
2300     /// before the return and the two are likely closely related.
gen_epilogue(&self) -> SmallInstVec<M::I>2301     pub fn gen_epilogue(&self) -> SmallInstVec<M::I> {
2302         let frame_layout = self.frame_layout();
2303         let mut insts = smallvec![];
2304 
2305         // Restore clobbered registers.
2306         insts.extend(M::gen_clobber_restore(
2307             self.call_conv,
2308             &self.flags,
2309             &frame_layout,
2310         ));
2311 
2312         // Tear down frame.
2313         insts.extend(M::gen_epilogue_frame_restore(
2314             self.call_conv,
2315             &self.flags,
2316             &self.isa_flags,
2317             &frame_layout,
2318         ));
2319 
2320         // And return.
2321         insts.extend(M::gen_return(
2322             self.call_conv,
2323             &self.isa_flags,
2324             &frame_layout,
2325         ));
2326 
2327         trace!("Epilogue: {:?}", insts);
2328         insts
2329     }
2330 
2331     /// Return a reference to the computed frame layout information. This
2332     /// function will panic if it's called before [`Self::compute_frame_layout`].
frame_layout(&self) -> &FrameLayout2333     pub fn frame_layout(&self) -> &FrameLayout {
2334         self.frame_layout
2335             .as_ref()
2336             .expect("frame layout not computed before prologue generation")
2337     }
2338 
2339     /// Returns the offset from SP to FP for the given function, after
2340     /// the prologue has set up the frame. This comprises the spill
2341     /// slots and stack-storage slots as well as storage for clobbered
2342     /// callee-save registers and outgoing arguments at callsites
2343     /// (space for which is reserved during frame setup).
sp_to_fp_offset(&self) -> u322344     pub fn sp_to_fp_offset(&self) -> u32 {
2345         let frame_layout = self.frame_layout();
2346         frame_layout.clobber_size
2347             + frame_layout.fixed_frame_storage_size
2348             + frame_layout.outgoing_args_size
2349     }
2350 
2351     /// Returns offset from the slot base in the current frame to the caller's SP.
slot_base_to_caller_sp_offset(&self) -> u322352     pub fn slot_base_to_caller_sp_offset(&self) -> u32 {
2353         // Note: this looks very similar to `frame_size()` above, but
2354         // it differs in both endpoints: it measures from the bottom
2355         // of stackslots, excluding outgoing args; and it includes the
2356         // setup area (FP/LR) size and any extra tail-args space.
2357         let frame_layout = self.frame_layout();
2358         frame_layout.clobber_size
2359             + frame_layout.fixed_frame_storage_size
2360             + frame_layout.setup_area_size
2361             + (frame_layout.tail_args_size - frame_layout.incoming_args_size)
2362     }
2363 
2364     /// Returns the size of arguments expected on the stack.
stack_args_size(&self, sigs: &SigSet) -> u322365     pub fn stack_args_size(&self, sigs: &SigSet) -> u32 {
2366         sigs[self.sig].sized_stack_arg_space
2367     }
2368 
2369     /// Get the spill-slot size.
get_spillslot_size(&self, rc: RegClass) -> u322370     pub fn get_spillslot_size(&self, rc: RegClass) -> u32 {
2371         let max = if self.dynamic_type_sizes.len() == 0 {
2372             16
2373         } else {
2374             *self
2375                 .dynamic_type_sizes
2376                 .iter()
2377                 .max_by(|x, y| x.1.cmp(&y.1))
2378                 .map(|(_k, v)| v)
2379                 .unwrap()
2380         };
2381         M::get_number_of_spillslots_for_value(rc, max, &self.isa_flags)
2382     }
2383 
2384     /// Get the spill slot offset relative to the fixed allocation area start.
get_spillslot_offset(&self, slot: SpillSlot) -> i642385     pub fn get_spillslot_offset(&self, slot: SpillSlot) -> i64 {
2386         self.frame_layout().spillslot_offset(slot)
2387     }
2388 
2389     /// Generate a spill.
gen_spill(&self, to_slot: SpillSlot, from_reg: RealReg) -> M::I2390     pub fn gen_spill(&self, to_slot: SpillSlot, from_reg: RealReg) -> M::I {
2391         let ty = M::I::canonical_type_for_rc(from_reg.class());
2392         debug_assert_eq!(<M>::I::rc_for_type(ty).unwrap().1, &[ty]);
2393 
2394         let sp_off = self.get_spillslot_offset(to_slot);
2395         trace!("gen_spill: {from_reg:?} into slot {to_slot:?} at offset {sp_off}");
2396 
2397         let from = StackAMode::Slot(sp_off);
2398         <M>::gen_store_stack(from, Reg::from(from_reg), ty)
2399     }
2400 
2401     /// Generate a reload (fill).
gen_reload(&self, to_reg: Writable<RealReg>, from_slot: SpillSlot) -> M::I2402     pub fn gen_reload(&self, to_reg: Writable<RealReg>, from_slot: SpillSlot) -> M::I {
2403         let ty = M::I::canonical_type_for_rc(to_reg.to_reg().class());
2404         debug_assert_eq!(<M>::I::rc_for_type(ty).unwrap().1, &[ty]);
2405 
2406         let sp_off = self.get_spillslot_offset(from_slot);
2407         trace!("gen_reload: {to_reg:?} from slot {from_slot:?} at offset {sp_off}");
2408 
2409         let from = StackAMode::Slot(sp_off);
2410         <M>::gen_load_stack(from, to_reg.map(Reg::from), ty)
2411     }
2412 
2413     /// Provide metadata to be emitted alongside machine code.
2414     ///
2415     /// This metadata describes the frame layout sufficiently to find
2416     /// stack slots, so that runtimes and unwinders can observe state
2417     /// set up by compiled code in stackslots allocated for that
2418     /// purpose.
frame_slot_metadata(&self) -> MachBufferFrameLayout2419     pub fn frame_slot_metadata(&self) -> MachBufferFrameLayout {
2420         let frame_to_fp_offset = self.sp_to_fp_offset();
2421         let mut stackslots = SecondaryMap::with_capacity(self.sized_stackslots.len());
2422         let storage_area_base = self.frame_layout().outgoing_args_size;
2423         for (slot, storage_area_offset) in &self.sized_stackslots {
2424             stackslots[slot] = MachBufferStackSlot {
2425                 offset: storage_area_base.checked_add(*storage_area_offset).unwrap(),
2426                 key: self.sized_stackslot_keys[slot],
2427             };
2428         }
2429         MachBufferFrameLayout {
2430             frame_to_fp_offset,
2431             stackslots,
2432         }
2433     }
2434 }
2435 
2436 /// An input argument to a call instruction: the vreg that is used,
2437 /// and the preg it is constrained to (per the ABI).
2438 #[derive(Clone, Debug)]
2439 pub struct CallArgPair {
2440     /// The virtual register to use for the argument.
2441     pub vreg: Reg,
2442     /// The real register into which the arg goes.
2443     pub preg: Reg,
2444 }
2445 
2446 /// An output return value from a call instruction: the vreg that is
2447 /// defined, and the preg or stack location it is constrained to (per
2448 /// the ABI).
2449 #[derive(Clone, Debug)]
2450 pub struct CallRetPair {
2451     /// The virtual register to define from this return value.
2452     pub vreg: Writable<Reg>,
2453     /// The real register from which the return value is read.
2454     pub location: RetLocation,
2455 }
2456 
2457 /// A location to load a return-value from after a call completes.
2458 #[derive(Clone, Debug, PartialEq, Eq)]
2459 pub enum RetLocation {
2460     /// A physical register.
2461     Reg(Reg, Type),
2462     /// A stack location, identified by a `StackAMode`.
2463     Stack(StackAMode, Type),
2464 }
2465 
2466 pub type CallArgList = SmallVec<[CallArgPair; 8]>;
2467 pub type CallRetList = SmallVec<[CallRetPair; 8]>;
2468 
2469 impl<T> CallInfo<T> {
2470     /// Emit loads for any stack-carried return values using the call
2471     /// info and allocations.
emit_retval_loads< M: ABIMachineSpec, EmitFn: FnMut(M::I), IslandFn: Fn(u32) -> Option<M::I>, >( &self, stackslots_size: u32, mut emit: EmitFn, emit_island: IslandFn, )2472     pub fn emit_retval_loads<
2473         M: ABIMachineSpec,
2474         EmitFn: FnMut(M::I),
2475         IslandFn: Fn(u32) -> Option<M::I>,
2476     >(
2477         &self,
2478         stackslots_size: u32,
2479         mut emit: EmitFn,
2480         emit_island: IslandFn,
2481     ) {
2482         // Count stack-ret locations and emit an island to account for
2483         // this space usage.
2484         let mut space_needed = 0;
2485         for CallRetPair { location, .. } in &self.defs {
2486             if let RetLocation::Stack(..) = location {
2487                 // Assume up to ten instructions, semi-arbitrarily:
2488                 // load from stack, store to spillslot, codegen of
2489                 // large offsets on RISC ISAs.
2490                 space_needed += 10 * M::I::worst_case_size();
2491             }
2492         }
2493         if space_needed > 0 {
2494             if let Some(island_inst) = emit_island(space_needed) {
2495                 emit(island_inst);
2496             }
2497         }
2498 
2499         let temp = M::retval_temp_reg(self.callee_conv);
2500         // The temporary must be noted as clobbered unless there are
2501         // no returns (hence it isn't needed). The latter can only be
2502         // the case statically for an ABI when the ABI doesn't allow
2503         // any returns at all (e.g., preserve-all ABI).
2504         debug_assert!(
2505             self.defs.is_empty()
2506                 || M::get_regs_clobbered_by_call(self.callee_conv, self.try_call_info.is_some())
2507                     .contains(PReg::from(temp.to_reg().to_real_reg().unwrap()))
2508         );
2509 
2510         for CallRetPair { vreg, location } in &self.defs {
2511             match location {
2512                 RetLocation::Reg(preg, ..) => {
2513                     // The temporary must not also be an actual return
2514                     // value register.
2515                     debug_assert!(*preg != temp.to_reg());
2516                 }
2517                 RetLocation::Stack(amode, ty) => {
2518                     if let Some(spillslot) = vreg.to_reg().to_spillslot() {
2519                         // `temp` is an integer register of machine word
2520                         // width, but `ty` may be floating-point/vector,
2521                         // which (i) may not be loadable directly into an
2522                         // int reg, and (ii) may be wider than a machine
2523                         // word. For simplicity, and because there are not
2524                         // always easy choices for volatile float/vec regs
2525                         // (see e.g. x86-64, where fastcall clobbers only
2526                         // xmm0-xmm5, but tail uses xmm0-xmm7 for
2527                         // returns), we use the integer temp register in
2528                         // steps.
2529                         let parts = (ty.bytes() + M::word_bytes() - 1) / M::word_bytes();
2530                         let one_part_load_ty =
2531                             Type::int_with_byte_size(M::word_bytes().min(ty.bytes()) as u16)
2532                                 .unwrap();
2533                         for part in 0..parts {
2534                             emit(M::gen_load_stack(
2535                                 amode.offset_by(part * M::word_bytes()),
2536                                 temp,
2537                                 one_part_load_ty,
2538                             ));
2539                             emit(M::gen_store_stack(
2540                                 StackAMode::Slot(
2541                                     i64::from(stackslots_size)
2542                                         + i64::from(M::word_bytes())
2543                                             * ((spillslot.index() as i64) + (part as i64)),
2544                                 ),
2545                                 temp.to_reg(),
2546                                 M::word_type(),
2547                             ));
2548                         }
2549                     } else {
2550                         assert_ne!(*vreg, temp);
2551                         emit(M::gen_load_stack(*amode, *vreg, *ty));
2552                     }
2553                 }
2554             }
2555         }
2556     }
2557 }
2558 
2559 impl TryCallInfo {
exception_handlers( &self, layout: &FrameLayout, ) -> impl Iterator<Item = MachExceptionHandler>2560     pub(crate) fn exception_handlers(
2561         &self,
2562         layout: &FrameLayout,
2563     ) -> impl Iterator<Item = MachExceptionHandler> {
2564         self.exception_handlers.iter().map(|handler| match handler {
2565             TryCallHandler::Tag(tag, label) => MachExceptionHandler::Tag(*tag, *label),
2566             TryCallHandler::Default(label) => MachExceptionHandler::Default(*label),
2567             TryCallHandler::Context(reg) => {
2568                 let loc = if let Some(spillslot) = reg.to_spillslot() {
2569                     // The spillslot offset is relative to the "fixed
2570                     // storage area", which comes after outgoing args.
2571                     let offset = layout.spillslot_offset(spillslot) + i64::from(layout.outgoing_args_size);
2572                     ExceptionContextLoc::SPOffset(u32::try_from(offset).expect("SP offset cannot be negative or larger than 4GiB"))
2573                 } else if let Some(realreg) = reg.to_real_reg() {
2574                     ExceptionContextLoc::GPR(realreg.hw_enc())
2575                 } else {
2576                     panic!("Virtual register present in try-call handler clause after register allocation");
2577                 };
2578                 MachExceptionHandler::Context(loc)
2579             }
2580         })
2581     }
2582 
pretty_print_dests(&self) -> String2583     pub(crate) fn pretty_print_dests(&self) -> String {
2584         self.exception_handlers
2585             .iter()
2586             .map(|handler| match handler {
2587                 TryCallHandler::Tag(tag, label) => format!("{tag:?}: {label:?}"),
2588                 TryCallHandler::Default(label) => format!("default: {label:?}"),
2589                 TryCallHandler::Context(loc) => format!("context {loc:?}"),
2590             })
2591             .collect::<Vec<_>>()
2592             .join(", ")
2593     }
2594 
collect_operands(&mut self, collector: &mut impl OperandVisitor)2595     pub(crate) fn collect_operands(&mut self, collector: &mut impl OperandVisitor) {
2596         for handler in &mut self.exception_handlers {
2597             match handler {
2598                 TryCallHandler::Context(ctx) => {
2599                     collector.any_late_use(ctx);
2600                 }
2601                 TryCallHandler::Tag(_, _) | TryCallHandler::Default(_) => {}
2602             }
2603         }
2604     }
2605 }
2606 
2607 #[cfg(test)]
2608 mod tests {
2609     use super::SigData;
2610 
2611     #[test]
sig_data_size()2612     fn sig_data_size() {
2613         // The size of `SigData` is performance sensitive, so make sure
2614         // we don't regress it unintentionally.
2615         assert_eq!(core::mem::size_of::<SigData>(), 24);
2616     }
2617 }
2618