==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 4

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IR's", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language. There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.
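
For contrast, here is a minimal sketch of a well formed way to express a
value that feeds its own recurrence: the cycle goes through a ``phi``
node, whose incoming values only need to dominate the end of the
corresponding predecessor block. The function name ``@count_to`` and all
value names are illustrative assumptions, not taken from this manual.

.. code-block:: llvm

    define i32 @count_to(i32 %n) {
    entry:
      br label %loop

    loop:
      ; %x.next is used here before its textual definition, but only as the
      ; value arriving over the back edge from %loop, so dominance holds.
      %x = phi i32 [ 0, %entry ], [ %x.next, %loop ]
      %x.next = add i32 %x, 1
      %done = icmp eq i32 %x.next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %x.next
    }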

.. _identifiers:

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a name value, even quotes themselves. The ``"\01"`` prefix
   can be used on global values to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.
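
As a rough illustration of the named and unnamed formats, the sketch
below mixes plain, quoted, escaped and unnamed identifiers in one small
module. Apart from ``@DivisionByZero``, ``%foo`` and
``%a.really.long.identifier``, which are the names used above, every name
here is a hypothetical one picked for the example.

.. code-block:: llvm

    @DivisionByZero = global i32 0            ; plain named global
    @"name with spaces" = global i32 1        ; quoted: uses characters outside the regex
    @"quote\22inside" = global i32 2          ; \22 is the hexadecimal escape for '"'
    @"\01raw_symbol" = global i32 3           ; \01 prefix suppresses mangling
    @0 = private global i32 4                 ; unnamed global

    define i32 @identifier_forms(i32 %foo) {
      %a.really.long.identifier = load i32, ptr @"name with spaces"
      %"local with spaces" = load i32, ptr @0
      %sum = add i32 %foo, %a.really.long.identifier
      %total = add i32 %sum, %"local with spaces"
      ret i32 %total
    }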

LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
'``i32``', etc...), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           ; yields i32:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially (using a per-function
   incrementing counter, starting with 0). Note that basic blocks and unnamed
   function parameters are included in this numbering. For example, if the
   entry basic block is not given a label name and all function parameters are
   named, then it will get number 0.

It also shows a convention that we follow in this document. When
demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of the value produced.
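
The numbering rule for unlabeled blocks can be seen in a complete
function. The following is a small sketch under the assumption of a
hypothetical function ``@select_sum``; because its entry block has no
label and all parameters are named, the entry block takes number 0 and
the first unnamed temporary is ``%1``.

.. code-block:: llvm

    define i32 @select_sum(i1 %flag, i32 %a, i32 %b) {
      ; Unlabeled entry block: implicitly numbered %0.
      br i1 %flag, label %add, label %done

    add:
      %1 = add i32 %a, %b                    ; yields i32:%1
      br label %done

    done:
      ; The entry block is referred to by its number, %0.
      %2 = phi i32 [ %1, %add ], [ 0, %0 ]   ; yields i32:%2
      ret i32 %2
    }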

High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(ptr nocapture) nounwind

    ; Definition of main function
    define i32 @main() {
      ; Call puts function to write out the string to stdout.
      call i32 @puts(ptr @.str)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. This
    doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted into
    the object file corresponding to the LLVM module. From the linker's
    perspective, an ``available_externally`` global is equivalent to
    an external declaration. They exist to allow inlining and other
    optimizations to take place given knowledge of the definition of the
    global, which is known to be somewhere outside the module. Globals
    with ``available_externally`` linkage are allowed to be discarded at
    will, and allow inlining and other optimizations. This linkage type is
    only allowed on definitions, not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
    that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but it
    is used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which LLVM
    interprets specially.

``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule" --- "ODR"). Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

It is illegal for a global variable or function *declaration* to have any
linkage type other than ``external`` or ``extern_weak``.
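
To make the keyword placement concrete, the sketch below attaches several
of these linkage types to globals and functions. All symbol names are
hypothetical and were chosen only for illustration; the comments restate
the rules given above.

.. code-block:: llvm

    @.msg = private unnamed_addr constant [3 x i8] c"hi\00"   ; no symbol table entry
    @counter = internal global i32 0        ; local symbol, like C 'static'
    @tentative = common global i32 0        ; C tentative definition; zero initializer required
    @override_me = weak global i32 7        ; not discarded even if unreferenced
    @maybe_absent = extern_weak global i32  ; declaration; becomes null if never linked
    @visible = global i32 1                 ; no keyword: external linkage

    ; Only merged with equivalent definitions, which enables inlining.
    define linkonce_odr i32 @inline_helper(i32 %x) {
      %r = add i32 %x, 1
      ret i32 %r
    }

    ; The body is visible to the optimizer but is never emitted into the object file.
    define available_externally i32 @defined_elsewhere() {
      ret i32 42
    }

    ; Declarations may only be external (the default) or extern_weak.
    declare extern_weak void @optional_hook()
    declare void @required_function()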

.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers). This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the tailcc, the GHC or the HiPE convention is
    used. <CodeGenerator.html#tail-call-optimization>`_ This calling
    convention does not support varargs and requires the prototype of all
    callees to exactly match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such function
    calls for inlining.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    -  On *X86-32* it only supports up to 4 bit type parameters. No
       floating-point types are supported.
    -  On *X86-64* it only supports up to 10 bit type parameters and 6
       floating-point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#tail-call-optimization>`_ but requires
    that both the caller and the callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of the `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#tail-call-optimization>`_ but requires
    that both the caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``webkit_jscc``" - WebKit's JavaScript calling convention
    This calling convention has been implemented for `WebKit FTL JIT
    <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
    stack right to left (as cdecl does), and returns a value in the
    platform's customary return register.
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    llvm.experimental.patchpoint because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as
    unintrusive as possible. This convention behaves identically to the `C`
    calling convention on how arguments and return values are passed, but it
    uses a different set of caller/callee-saved registers. This alleviates the
    burden of saving and recovering a large register set before and after the
    call in the caller. If the arguments are passed in callee-saved registers,
    then they will be preserved by the callee across the call. This doesn't
    apply for values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Floating-point registers
      (XMMs/YMMs) are not preserved and need to be saved by the caller.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call out to
    another function and therefore only needs to preserve the caller-saved
    registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the Objective-C
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the Objective-C runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too. The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the Objective-C runtime and should be considered
    experimental at this time.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time. The entry and exit blocks can access
    a few TLS IR variables, and each access will be lowered to a
    platform-specific sequence.

    This calling convention aims to minimize overhead in the caller by
    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).

    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.

    Given that each platform has its own lowering sequence, hence its own set
    of preserved registers, we can't use the existing `PreserveMost`.

    - On X86-64 the callee preserves all general purpose registers, except for
      RDI and RAX.
"``tailcc``" - Tail callable calling convention
    This calling convention ensures that calls in tail position will always be
    tail call optimized. This calling convention is equivalent to fastcc,
    except for an additional guarantee that tail calls will be produced
    whenever possible. `Tail calls can only be optimized when this, the fastcc,
    the GHC or the HiPE convention is used. <CodeGenerator.html#tail-call-optimization>`_
    This calling convention does not support varargs and requires the prototype of
    all callees to exactly match the prototype of the function definition.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, we use the AAPCS-VFP calling convention.
"``swifttailcc``"
    This calling convention is like ``swiftcc`` in most respects, but also the
    callee pops the argument area of the stack so that mandatory tail calls are
    possible as in ``tailcc``.
"``cfguard_checkcc``" - Windows Control Flow Guard (Check mechanism)
    This calling convention is used for the Control Flow Guard check function,
    calls to which can be inserted before indirect calls to check that the call
    target is a valid function address. The check function has no return value,
    but it will trigger an OS-level error if the address is not a valid target.
    The set of registers preserved by the check function, and the register
    containing the target address are architecture-specific.

    - On X86 the target address is passed in ECX.
    - On ARM the target address is passed in R0.
    - On AArch64 the target address is passed in X15.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.
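
As a brief sketch of how the convention keyword is written on definitions,
declarations and call sites, consider the module below. The function names
are hypothetical. Note that each call repeats the convention of its
callee, since a mismatch between caller and callee is undefined behavior.

.. code-block:: llvm

    ; fastcc: the prototype must match exactly and varargs are not supported.
    define fastcc i32 @fast_add(i32 %a, i32 %b) {
      %sum = add i32 %a, %b
      ret i32 %sum
    }

    ; coldcc: the call is assumed to be rarely executed.
    declare coldcc void @report_failure(i32)

    ; A convention given by number; 10 is the GHC convention listed above.
    declare cc 10 void @ghc_style_entry(i64)

    ; tailcc: the call in tail position is tail call optimized.
    define tailcc i32 @count_down(i32 %n) {
    entry:
      %is_zero = icmp eq i32 %n, 0
      br i1 %is_zero, label %done, label %recurse

    recurse:
      %next = sub i32 %n, 1
      %r = tail call tailcc i32 @count_down(i32 %next)
      ret i32 %r

    done:
      ret i32 0
    }

    ; No keyword on the definition means ccc.
    define i32 @caller(i32 %x) {
    entry:
      %sum = call fastcc i32 @fast_add(i32 %x, i32 1)
      %neg = icmp slt i32 %sum, 0
      br i1 %neg, label %fail, label %ok

    fail:
      call coldcc void @report_failure(i32 %sum)
      br label %ok

    ok:
      ret i32 %sum
    }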

.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. On XCOFF, default visibility means no explicit
    visibility bit will be set and whether the symbol is visible
    (i.e. "exported") to other modules depends primarily on export lists
    provided to the linker. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.
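
The visibility keyword follows the linkage keyword (if any) on a global or
function. The sketch below uses hypothetical names to show each style.

.. code-block:: llvm

    @exported_counter = global i32 0       ; default visibility
    @impl_detail = hidden global i32 0     ; usually kept out of the dynamic symbol table
    @stable_table = protected constant [2 x i32] [i32 1, i32 2]
    @file_local = internal global i32 0    ; internal linkage: default visibility only

    define hidden i32 @helper() {
      %v = load i32, ptr @impl_detail
      ret i32 %v
    }

    ; References from within the defining module bind to this definition.
    define protected i32 @api_entry() {
      %v = call i32 @helper()
      ret i32 %v
    }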

.. _dllstorageclass:

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    On Microsoft Windows targets, "``dllexport``" causes the compiler to provide
    a global pointer to a pointer in a DLL, so that it can be referenced with the
    ``dllimport`` attribute. The pointer name is formed by combining ``__imp_``
    and the function or variable name. On XCOFF targets, ``dllexport`` indicates
    that the symbol will be made visible to other modules using "exported"
    visibility and thus placed by the linker in the loader section symbol table.
    Since this storage class exists for defining a dll interface, the compiler,
    assembler and linker know it is externally referenced and must refrain from
    deleting the symbol.
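
A minimal sketch of both storage classes follows. The symbol names are
hypothetical, and no target triple is given, so this only illustrates the
syntax; the effects described above depend on the target.

.. code-block:: llvm

    ; Referenced through the import pointer set up by the exporting DLL.
    @imported_value = external dllimport global i32
    declare dllimport i32 @imported_function(i32)

    ; Part of the DLL interface, so it must not be deleted.
    @exported_value = dllexport global i32 5

    define dllexport i32 @exported_function(i32 %x) {
      %v = load i32, ptr @imported_value
      %r = call i32 @imported_function(i32 %v)
      %sum = add i32 %r, %x
      ret i32 %sum
    }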

.. _tls_model:

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on under which circumstances the different models may
be used. The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect on the aliasee.

For platforms without linker support for the ELF TLS model, the
``-femulated-tls`` flag can be used to generate GCC compatible emulated
TLS code.
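
The model is written in parentheses after ``thread_local``. The following
sketch, with hypothetical variable and function names, declares one
variable per model and updates one of them.

.. code-block:: llvm

    @tls_general = thread_local global i32 0                      ; general dynamic (default)
    @tls_local_dynamic = thread_local(localdynamic) global i32 0
    @tls_initial_exec = thread_local(initialexec) global i32 0
    @tls_local_exec = thread_local(localexec) global i32 0

    ; Each thread increments its own copy of @tls_local_exec.
    define i32 @bump_per_thread_counter() {
      %old = load i32, ptr @tls_local_exec
      %new = add i32 %old, 1
      store i32 %new, ptr @tls_local_exec
      ret i32 %new
    }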

.. _runtime_preemption_model:

Runtime Preemption Specifiers
-----------------------------

Global variables, functions and aliases may have an optional runtime preemption
specifier. If a preemption specifier isn't given explicitly, then a
symbol is assumed to be ``dso_preemptable``.

``dso_preemptable``
    Indicates that the function or variable may be replaced by a symbol from
    outside the linkage unit at runtime.

``dso_local``
    The compiler may assume that a function or variable marked as ``dso_local``
    will resolve to a symbol within the same linkage unit. Direct access will
    be generated even if the definition is not within this compilation unit.
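
As a short sketch of the syntax, the specifier is written after the
linkage keyword (if any). All names below are hypothetical.

.. code-block:: llvm

    @interposable_flag = dso_preemptable global i32 0   ; same as writing no specifier
    @local_flag = dso_local global i32 0

    ; Declared here and assumed to be defined elsewhere in the same linkage unit.
    declare dso_local i32 @same_unit_helper(i32)

    define dso_local i32 @read_flags() {
      %a = load i32, ptr @interposable_flag
      %b = load i32, ptr @local_flag
      %sum = add i32 %a, %b
      %r = call i32 @same_unit_helper(i32 %sum)
      ret i32 %r
    }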

.. _namedtypes:

Structure Types
---------------

LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
types <t_struct>`. Literal types are uniqued structurally, but identified types
are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
to forward declare a type that is not yet available.

An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
literal types are uniqued in recent versions of LLVM.
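
The three kinds of structure type can be placed side by side, as in the
sketch below. The type and symbol names are hypothetical, and the example
assumes the ``ptr`` spelling used by the "hello world" module earlier in
this document rather than a typed pointer.

.. code-block:: llvm

    %list_node = type { i32, ptr }    ; identified type
    %handle = type opaque             ; forward declaration; body not yet available

    @pair = global { i32, float } { i32 1, float 2.0 }   ; literal type
    @head = global %list_node { i32 0, ptr null }

    ; Returns a pointer that callers treat as referring to a %handle.
    declare ptr @open_handle()

    define i32 @first_value() {
      %field = getelementptr %list_node, ptr @head, i32 0, i32 0
      %v = load i32, ptr %field
      ret i32 %v
    }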

.. _nointptrtype:

Non-Integral Pointer Type
-------------------------

Note: non-integral pointer types are a work in progress, and they should be
considered experimental at this time.

LLVM IR optionally allows the frontend to denote pointers in certain address
spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
Non-integral pointer types represent pointers that have an *unspecified* bitwise
representation; that is, the integral representation may be target dependent or
unstable (not backed by a fixed integer).

``inttoptr`` and ``ptrtoint`` instructions have the same semantics as for
integral (i.e. normal) pointers in that they convert integers to and from
corresponding pointer types, but there are additional implications to be
aware of.  Because the bit-representation of a non-integral pointer may
not be stable, two identical casts of the same operand may or may not
return the same value.  Said differently, the conversion to or from the
non-integral type depends on environmental state in an implementation
defined manner.

If the frontend wishes to observe a *particular* value following a cast, the
generated IR must fence with the underlying environment in an implementation
defined manner. (In practice, this tends to require ``noinline`` routines for
such operations.)

From the perspective of the optimizer, ``inttoptr`` and ``ptrtoint`` for
non-integral types are analogous to ones on integral types with one
key exception: the optimizer may not, in general, insert new dynamic
occurrences of such casts.  If a new cast is inserted, the optimizer would
need to either ensure that a) all possible values are valid, or b)
appropriate fencing is inserted.  Since the appropriate fencing is
implementation defined, the optimizer can't do the latter.  The former is
challenging as many commonly expected properties, such as
``ptrtoint(v)-ptrtoint(v) == 0``, don't hold for non-integral types.
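
The sketch below illustrates the point about repeated casts. It assumes a
module in which address space 1 has been marked non-integral through the
``ni`` component of the datalayout string; the address space number and
the function name are arbitrary choices for the example.

.. code-block:: llvm

    ; Assumption: address space 1 is the non-integral one in this module.
    target datalayout = "ni:1"

    define i1 @compare_casts(ptr addrspace(1) %v) {
      %a = ptrtoint ptr addrspace(1) %v to i64
      %b = ptrtoint ptr addrspace(1) %v to i64
      ; For a non-integral pointer the two casts may or may not yield the
      ; same integer, so this comparison is not guaranteed to be true.
      %same = icmp eq i64 %a, %b
      ret i1 %same
    }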
630
631.. _globalvars:
632
633Global Variables
634----------------
635
636Global variables define regions of memory allocated at compilation time
637instead of run-time.
638
639Global variable definitions must be initialized.
640
641Global variables in other translation units can also be declared, in which
642case they don't have an initializer.
643
644Global variables can optionally specify a :ref:`linkage type <linkage>`.
645
646Either global variable definitions or declarations may have an explicit section
647to be placed in and may have an optional explicit alignment specified. If there
648is a mismatch between the explicit or inferred section information for the
649variable declaration and its definition the resulting behavior is undefined.
650
651A variable may be defined as a global ``constant``, which indicates that
652the contents of the variable will **never** be modified (enabling better
653optimization, allowing the global data to be placed in the read-only
654section of an executable, etc). Note that variables that need runtime
655initialization cannot be marked ``constant`` as there is a store to the
656variable.
657
658LLVM explicitly allows *declarations* of global variables to be marked
659constant, even if the final definition of the global is not. This
660capability can be used to enable slightly better optimization of the
661program, but requires the language definition to guarantee that
662optimizations based on the 'constantness' are valid for the translation
663units that do not include the definition.
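
A small sketch of such a declaration, assuming a hypothetical global ``@limit``
whose definition lives in another translation unit and assuming opaque
pointers:

.. code-block:: llvm

    ; Only a declaration: the defining module may or may not mark it constant.
    @limit = external constant i32

    define i32 @twice_limit() {
      ; Because the declaration is marked constant, the optimizer may assume
      ; that both loads observe the same value.
      %a = load i32, ptr @limit
      %b = load i32, ptr @limit
      %sum = add i32 %a, %b
      ret i32 %sum
    }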
664
665As SSA values, global variables define pointer values that are in scope
666(i.e. they dominate) all basic blocks in the program. Global variables
667always define a pointer to their "content" type because they describe a
668region of memory, and all memory objects in LLVM are accessed through
669pointers.
670
671Global variables can be marked with ``unnamed_addr`` which indicates
672that the address is not significant, only the content. Constants marked
673like this can be merged with other constants if they have the same
674initializer. Note that a constant with significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
676whose address is significant.
677
678If the ``local_unnamed_addr`` attribute is given, the address is known to
679not be significant within the module.
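
For illustration, a sketch with hypothetical global names showing both
attributes on constants:

.. code-block:: llvm

    ; Same initializer and no significant address: these two constants are
    ; candidates for merging into a single object.
    @.str.a = private unnamed_addr constant [3 x i8] c"hi\00"
    @.str.b = private unnamed_addr constant [3 x i8] c"hi\00"

    ; The address is not significant within this module only.
    @tag = local_unnamed_addr constant i32 42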
680
681A global variable may be declared to reside in a target-specific
682numbered address space. For targets that support them, address spaces
683may affect how optimizations are performed and/or what target
684instructions are used to access the variable. The default address space
685is zero. The address space qualifier must precede any other attributes.
686
687LLVM allows an explicit section to be specified for globals. If the
688target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the
necessary support.
691
692External declarations may have an explicit section specified. Section
693information is retained in LLVM IR for targets that make use of this
694information. Attaching section information to an external declaration is an
695assertion that its definition is located in the specified section. If the
696definition is located in a different section, the behavior is undefined.
697
698By default, global initializers are optimized by assuming that global
699variables defined within the module are not modified from their
700initial values before the start of the global initializer. This is
701true even for variables potentially accessible from outside the
702module, including those with external linkage or appearing in
703``@llvm.used`` or dllexported variables. This assumption may be suppressed
704by marking the variable with ``externally_initialized``.
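
A minimal sketch, assuming a hypothetical variable ``@device_flag`` that some
external agent writes before the program's own initializers run:

.. code-block:: llvm

    ; The initializer is 0, but LLVM must not assume the variable still holds
    ; 0 when global initialization starts.
    @device_flag = externally_initialized global i32 0

    define i32 @read_flag() {
      %v = load i32, ptr @device_flag
      ret i32 %v
    }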
705
706An explicit alignment may be specified for a global, which must be a
707power of 2. If not present, or if the alignment is set to zero, the
708alignment of the global is set by the target to whatever it feels
709convenient. If an explicit alignment is specified, the global is forced
710to have exactly that alignment. Targets and optimizers are not allowed
711to over-align the global if the global has an assigned section. In this
712case, the extra alignment could be observable: for example, code could
713assume that the globals are densely packed in their section and try to
iterate over them as an array; alignment padding would break this
iteration. The maximum alignment is ``1 << 32``.
716
For global variable declarations, as well as definitions that may be
718replaced at link time (``linkonce``, ``weak``, ``extern_weak`` and ``common``
719linkage types), LLVM makes no assumptions about the allocation size of the
720variables, except that they may not overlap. The alignment of a global variable
721declaration or replaceable definition must not be greater than the alignment of
722the definition it resolves to.
723
724Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
725an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
optional :ref:`global attributes <glattrs>` and
727an optional list of attached :ref:`metadata <metadata>`.
728
729Variables and aliases can have a
730:ref:`Thread Local Storage Model <tls_model>`.
731
732:ref:`Scalable vectors <t_vector>` cannot be global variables or members of
733arrays because their size is unknown at compile time. They are allowed in
734structs to facilitate intrinsics returning multiple values. Structs containing
735scalable vectors cannot be used in loads, stores, allocas, or GEPs.
736
737Syntax::
738
739      @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
740                         [DLLStorageClass] [ThreadLocal]
741                         [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
742                         [ExternallyInitialized]
743                         <global | constant> <Type> [<InitializerConstant>]
744                         [, section "name"] [, partition "name"]
745                         [, comdat [($name)]] [, align <Alignment>]
746                         [, no_sanitize_address] [, no_sanitize_hwaddress]
747                         [, sanitize_address_dyninit] [, sanitize_memtag]
748                         (, !name !N)*
749
750For example, the following defines a global in a numbered address space
751with an initializer, section, and alignment:
752
753.. code-block:: llvm
754
755    @G = addrspace(5) constant float 1.0, section "foo", align 4
756
The following example just declares a global variable:
758
759.. code-block:: llvm
760
761   @G = external global i32
762
763The following example defines a thread-local global with the
764``initialexec`` TLS model:
765
766.. code-block:: llvm
767
768    @G = thread_local(initialexec) global i32 0, align 4
769
770.. _functionstructure:
771
772Functions
773---------
774
775LLVM function definitions consist of the "``define``" keyword, an
776optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
777specifier <runtime_preemption_model>`,  an optional :ref:`visibility
778style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
779an optional :ref:`calling convention <callingconv>`,
780an optional ``unnamed_addr`` attribute, a return type, an optional
781:ref:`parameter attribute <paramattrs>` for the return type, a function
782name, a (possibly empty) argument list (each with optional :ref:`parameter
783attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
784an optional address space, an optional section, an optional partition,
785an optional alignment, an optional :ref:`comdat <langref_comdats>`,
786an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
787an optional :ref:`prologue <prologuedata>`,
788an optional :ref:`personality <personalityfn>`,
789an optional list of attached :ref:`metadata <metadata>`,
790an opening curly brace, a list of basic blocks, and a closing curly brace.
791
792Syntax::
793
794    define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
795           [cconv] [ret attrs]
796           <ResultType> @<FunctionName> ([argument list])
797           [(unnamed_addr|local_unnamed_addr)] [AddrSpace] [fn Attrs]
798           [section "name"] [partition "name"] [comdat [($name)]] [align N]
799           [gc] [prefix Constant] [prologue Constant] [personality Constant]
800           (!name !N)* { ... }
801
802The argument list is a comma separated sequence of arguments where each
803argument is of the following form:
804
805Syntax::
806
807   <type> [parameter Attrs] [name]
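
As a sketch of how these pieces fit together, the following definition uses
several of the optional components in the order given by the syntax above. The
function name ``@clamp`` and the section name are hypothetical:

.. code-block:: llvm

    ; linkage, calling convention, return attribute, return type, name,
    ; argument list, unnamed_addr, function attribute, section, alignment
    define internal fastcc zeroext i8 @clamp(i32 %x) unnamed_addr nounwind section ".text.hot" align 16 {
    entry:
      %big = icmp ugt i32 %x, 255
      %t = trunc i32 %x to i8
      ; i8 -1 is the all-ones byte, that is 255 when read as unsigned.
      %r = select i1 %big, i8 -1, i8 %t
      ret i8 %r
    }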
808
809LLVM function declarations consist of the "``declare``" keyword, an
810optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
811<visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
812optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
813or ``local_unnamed_addr`` attribute, an optional address space, a return type,
an optional :ref:`parameter attribute <paramattrs>` for the return type, a
function name, a possibly empty list of arguments, an optional alignment, an
optional :ref:`garbage collector name <gc>`, an optional
:ref:`prefix <prefixdata>`, and an optional :ref:`prologue <prologuedata>`.
818
819Syntax::
820
821    declare [linkage] [visibility] [DLLStorageClass]
822            [cconv] [ret attrs]
823            <ResultType> @<FunctionName> ([argument list])
824            [(unnamed_addr|local_unnamed_addr)] [align N] [gc]
825            [prefix Constant] [prologue Constant]
826
827A function definition contains a list of basic blocks, forming the CFG (Control
828Flow Graph) for the function. Each basic block may optionally start with a label
829(giving the basic block a symbol table entry), contains a list of instructions,
830and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
831function return). If an explicit label name is not provided, a block is assigned
832an implicit numbered label, using the next value from the same counter as used
833for unnamed temporaries (:ref:`see above<identifiers>`). For example, if a
834function entry block does not have an explicit label, it will be assigned label
835"%0", then the first unnamed temporary in that block will be "%1", etc. If a
836numeric label is explicitly specified, it must match the numeric label that
837would be used implicitly.
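
The numbering rule can be sketched with a hypothetical function ``@f`` whose
blocks and temporaries are all unnamed except for the argument:

.. code-block:: llvm

    define i32 @f(i32 %x) {
      ; The entry block has no explicit label, so it is implicitly %0.
      %1 = add i32 %x, 1
      br label %2

    2:                       ; explicit numeric label, matches the implicit one
      %3 = mul i32 %1, 2
      ret i32 %3
    }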
838
839The first basic block in a function is special in two ways: it is
840immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
842the entry block of a function). Because the block can have no
843predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
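
In practice this means that a loop header is always a separate block from the
entry block. A minimal sketch, using a hypothetical function ``@count``:

.. code-block:: llvm

    define i32 @count(i32 %n) {
    entry:                   ; no predecessors, so no PHI nodes here
      br label %loop

    loop:                    ; predecessors: %entry and %loop
      %i = phi i32 [ 0, %entry ], [ %next, %loop ]
      %next = add i32 %i, 1
      %done = icmp eq i32 %next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %next
    }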
844
845LLVM allows an explicit section to be specified for functions. If the
846target supports it, it will emit functions to the section specified.
847Additionally, the function can be placed in a COMDAT.
848
849An explicit alignment may be specified for a function. If not present,
850or if the alignment is set to zero, the alignment of the function is set
851by the target to whatever it feels convenient. If an explicit alignment
852is specified, the function is forced to have at least that much
853alignment. All alignments must be a power of 2.
854
855If the ``unnamed_addr`` attribute is given, the address is known to not
856be significant and two identical functions can be merged.
857
858If the ``local_unnamed_addr`` attribute is given, the address is known to
859not be significant within the module.
860
861If an explicit address space is not given, it will default to the program
862address space from the :ref:`datalayout string<langref_datalayout>`.
863
864.. _langref_aliases:
865
866Aliases
867-------
868
Aliases, unlike functions or variables, don't create any new data. They
are just a new symbol and metadata for an existing position.
871
872Aliases have a name and an aliasee that is either a global value or a
873constant expression.
874
875Aliases may have an optional :ref:`linkage type <linkage>`, an optional
876:ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
877:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
878<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.
879
880Syntax::
881
882    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>
883              [, partition "name"]
884
885The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
886``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
887might not correctly handle dropping a weak symbol that is aliased.
888
889Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
890the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
891to the same content.
892
893If the ``local_unnamed_addr`` attribute is given, the address is known to
894not be significant within the module.
895
896Since aliases are only a second name, some restrictions apply, of which
897some can only be checked when producing an object file:
898
899* The expression defining the aliasee must be computable at assembly
900  time. Since it is just a name, no relocations can be used.
901
902* No alias in the expression can be weak as the possibility of the
903  intermediate alias being overridden cannot be represented in an
904  object file.
905
906* No global value in the expression can be a declaration, since that
907  would require a relocation, which is not possible.
908
909* If either the alias or the aliasee may be replaced by a symbol outside the
910  module at link time or runtime, any optimization cannot replace the alias with
911  the aliasee, since the behavior may be different. The alias may be used as a
912  name guaranteed to point to the content in the current module.
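
The following sketch shows a few aliases of one global. All names are
hypothetical, and the example assumes opaque pointers, so the aliasee is
written with the ``ptr`` type:

.. code-block:: llvm

    @impl = global [2 x i32] [i32 1, i32 2]

    ; A second name for the whole object, with the same address as @impl.
    @whole = alias [2 x i32], ptr @impl

    ; The aliasee may be a constant expression, here the second element.
    @second = alias i32, getelementptr inbounds ([2 x i32], ptr @impl, i64 0, i64 1)

    ; Only guaranteed to point to the same content, not the same address.
    @content_only = weak unnamed_addr alias [2 x i32], ptr @impl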
913
914.. _langref_ifunc:
915
916IFuncs
917-------
918
IFuncs, like aliases, don't create any new data or functions. They are just a
new symbol that the dynamic linker resolves at runtime by calling a resolver
function.

IFuncs have a name and a resolver, which is a function called by the dynamic
linker that returns the address of another function associated with the name.

IFuncs may have an optional :ref:`linkage type <linkage>` and an optional
:ref:`visibility style <visibility>`.
927
928Syntax::
929
930    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>
931              [, partition "name"]
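
A minimal sketch of an IFunc and its resolver, assuming opaque pointers. The
names ``@compute``, ``@compute_generic`` and ``@resolve_compute`` are
hypothetical, and a real resolver would typically choose between several
implementations:

.. code-block:: llvm

    ; Calls to @compute are bound to whatever address the resolver returns.
    @compute = ifunc i32 (i32), ptr @resolve_compute

    define internal i32 @compute_generic(i32 %x) {
      %r = add i32 %x, 1
      ret i32 %r
    }

    ; Called by the dynamic linker; returns the implementation to use.
    define internal ptr @resolve_compute() {
      ret ptr @compute_generic
    }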
932
933
934.. _langref_comdats:
935
936Comdats
937-------
938
939Comdat IR provides access to object file COMDAT/section group functionality
940which represents interrelated sections.
941
942Comdats have a name which represents the COMDAT key and a selection kind to
943provide input on how the linker deduplicates comdats with the same key in two
944different object files. A comdat must be included or omitted as a unit.
945Discarding the whole comdat is allowed but discarding a subset is not.
946
947A global object may be a member of at most one comdat. Aliases are placed in the
948same COMDAT that their aliasee computes to, if any.
949
950Syntax::
951
952    $<Name> = comdat SelectionKind
953
954For selection kinds other than ``nodeduplicate``, only one of the duplicate
955comdats may be retained by the linker and the members of the remaining comdats
956must be discarded. The following selection kinds are supported:
957
958``any``
959    The linker may choose any COMDAT key, the choice is arbitrary.
960``exactmatch``
961    The linker may choose any COMDAT key but the sections must contain the
962    same data.
963``largest``
964    The linker will choose the section containing the largest COMDAT key.
965``nodeduplicate``
966    No deduplication is performed.
967``samesize``
968    The linker may choose any COMDAT key but the sections must contain the
969    same amount of data.
970
971- XCOFF and Mach-O don't support COMDATs.
972- COFF supports all selection kinds. Non-``nodeduplicate`` selection kinds need
973  a non-local linkage COMDAT symbol.
974- ELF supports ``any`` and ``nodeduplicate``.
975- WebAssembly only supports ``any``.
976
977Here is an example of a COFF COMDAT where a function will only be selected if
978the COMDAT key's section is the largest:
979
980.. code-block:: text
981
982   $foo = comdat largest
983   @foo = global i32 2, comdat($foo)
984
985   define void @bar() comdat($foo) {
986     ret void
987   }
988
989In a COFF object file, this will create a COMDAT section with selection kind
990``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
991and another COMDAT section with selection kind
992``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
993section and contains the contents of the ``@bar`` symbol.
994
995As a syntactic sugar the ``$name`` can be omitted if the name is the same as
996the global name:
997
998.. code-block:: llvm
999
1000  $foo = comdat any
1001  @foo = global i32 2, comdat
1002  @bar = global i32 3, comdat($foo)
1003
1004There are some restrictions on the properties of the global object.
1005It, or an alias to it, must have the same name as the COMDAT group when
1006targeting COFF.
1007The contents and size of this object may be used during link-time to determine
1008which COMDAT groups get selected depending on the selection kind.
1009Because the name of the object must match the name of the COMDAT group, the
1010linkage of the global object must not be local; local symbols can get renamed
1011if a collision occurs in the symbol table.
1012
The combined use of COMDATs and section attributes may yield surprising results.
1014For example:
1015
1016.. code-block:: llvm
1017
1018   $foo = comdat any
1019   $bar = comdat any
1020   @g1 = global i32 42, section "sec", comdat($foo)
1021   @g2 = global i32 42, section "sec", comdat($bar)
1022
1023From the object file perspective, this requires the creation of two sections
1024with the same name. This is necessary because both globals belong to different
1025COMDAT groups and COMDATs, at the object file level, are represented by
1026sections.
1027
1028Note that certain IR constructs like global variables and functions may
1029create COMDATs in the object file in addition to any which are specified using
1030COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).
1033
1034.. _namedmetadatastructure:
1035
1036Named Metadata
1037--------------
1038
1039Named metadata is a collection of metadata. :ref:`Metadata
1040nodes <metadata>` (but not metadata strings) are the only valid
1041operands for a named metadata.
1042
1043#. Named metadata are represented as a string of characters with the
1044   metadata prefix. The rules for metadata names are the same as for
1045   identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
1046   are still valid, which allows any character to be part of a name.
1047
1048Syntax::
1049
1050    ; Some unnamed metadata nodes, which are referenced by the named metadata.
1051    !0 = !{!"zero"}
1052    !1 = !{!"one"}
1053    !2 = !{!"two"}
1054    ; A named metadata.
1055    !name = !{!0, !1, !2}
1056
1057.. _paramattrs:
1058
1059Parameter Attributes
1060--------------------
1061
1062The return type and each parameter of a function type may have a set of
1063*parameter attributes* associated with them. Parameter attributes are
1064used to communicate additional information about the result or
1065parameters of a function. Parameter attributes are considered to be part
1066of the function, not of the function type, so functions with different
1067parameter attributes can have the same function type.
1068
1069Parameter attributes are simple keywords that follow the type specified.
1070If multiple parameter attributes are needed, they are space separated.
1071For example:
1072
1073.. code-block:: llvm
1074
1075    declare i32 @printf(ptr noalias nocapture, ...)
1076    declare i32 @atoi(i8 zeroext)
1077    declare signext i8 @returns_signed_char()
1078
Note that attributes that apply to the function itself rather than to a
parameter or result (such as ``nounwind`` and ``readonly``) come immediately
after the argument list.
1081
1082Currently, only the following parameter attributes are defined:
1083
1084``zeroext``
1085    This indicates to the code generator that the parameter or return
1086    value should be zero-extended to the extent required by the target's
1087    ABI by the caller (for a parameter) or the callee (for a return value).
1088``signext``
1089    This indicates to the code generator that the parameter or return
1090    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
1092    the callee (for a return value).
1093``inreg``
1094    This indicates that this parameter or return value should be treated
1095    in a special target-dependent fashion while emitting code for
1096    a function call or return (usually, by putting it in a register as
1097    opposed to memory, though some targets use it to distinguish between
1098    two different kinds of registers). Use of this attribute is
1099    target-specific.
1100``byval(<ty>)``
1101    This indicates that the pointer parameter should really be passed by
1102    value to the function. The attribute implies that a hidden copy of
1103    the pointee is made between the caller and the callee, so the callee
1104    is unable to modify the value in the caller. This attribute is only
1105    valid on LLVM pointer arguments. It is generally used to pass
1106    structs and arrays by value, but is also valid on pointers to
1107    scalars. The copy is considered to belong to the caller not the
1108    callee (for example, ``readonly`` functions should not write to
1109    ``byval`` parameters). This is not a valid attribute for return
1110    values.
1111
1112    The byval type argument indicates the in-memory value type, and
1113    must be the same as the pointee type of the argument.
1114
1115    The byval attribute also supports specifying an alignment with the
1116    align attribute. It indicates the alignment of the stack slot to
1117    form and the known alignment of the pointer specified to the call
1118    site. If the alignment is not specified, then the code generator
1119    makes a target-specific assumption.
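
    A minimal sketch of a ``byval`` call, assuming opaque pointers and using a
    hypothetical struct type ``%struct.Pair`` and callee ``@consume``:

    .. code-block:: llvm

        %struct.Pair = type { i32, i32 }

        declare void @consume(ptr byval(%struct.Pair) align 4)

        define void @caller() {
          %p = alloca %struct.Pair, align 4
          %q = getelementptr inbounds %struct.Pair, ptr %p, i32 0, i32 1
          store i32 1, ptr %p, align 4
          store i32 2, ptr %q, align 4
          ; The callee sees a copy of the pointee, so it cannot change the
          ; caller's value at %p.
          call void @consume(ptr byval(%struct.Pair) align 4 %p)
          ret void
        }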
1120
1121.. _attr_byref:
1122
1123``byref(<ty>)``
1124
1125    The ``byref`` argument attribute allows specifying the pointee
1126    memory type of an argument. This is similar to ``byval``, but does
1127    not imply a copy is made anywhere, or that the argument is passed
1128    on the stack. This implies the pointer is dereferenceable up to
1129    the storage size of the type.
1130
    It is not generally permissible to introduce a write to a
1132    ``byref`` pointer. The pointer may have any address space and may
1133    be read only.
1134
1135    This is not a valid attribute for return values.
1136
    The alignment for a ``byref`` parameter can be explicitly
1138    specified by combining it with the ``align`` attribute, similar to
1139    ``byval``. If the alignment is not specified, then the code generator
1140    makes a target-specific assumption.
1141
1142    This is intended for representing ABI constraints, and is not
1143    intended to be inferred for optimization use.
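
    A short sketch, assuming opaque pointers and a hypothetical function
    ``@read_arg``:

    .. code-block:: llvm

        ; %p points to an i32 in memory. No copy is implied, and the callee
        ; only reads through the pointer.
        define i32 @read_arg(ptr byref(i32) align 4 %p) {
          %v = load i32, ptr %p, align 4
          ret i32 %v
        }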
1144
1145.. _attr_preallocated:
1146
1147``preallocated(<ty>)``
1148    This indicates that the pointer parameter should really be passed by
1149    value to the function, and that the pointer parameter's pointee has
1150    already been initialized before the call instruction. This attribute
1151    is only valid on LLVM pointer arguments. The argument must be the value
1152    returned by the appropriate
1153    :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` on non
1154    ``musttail`` calls, or the corresponding caller parameter in ``musttail``
1155    calls, although it is ignored during codegen.
1156
1157    A non ``musttail`` function call with a ``preallocated`` attribute in
1158    any parameter must have a ``"preallocated"`` operand bundle. A ``musttail``
1159    function call cannot have a ``"preallocated"`` operand bundle.
1160
1161    The preallocated attribute requires a type argument, which must be
1162    the same as the pointee type of the argument.
1163
1164    The preallocated attribute also supports specifying an alignment with the
1165    align attribute. It indicates the alignment of the stack slot to
1166    form and the known alignment of the pointer specified to the call
1167    site. If the alignment is not specified, then the code generator
1168    makes a target-specific assumption.
1169
1170.. _attr_inalloca:
1171
1172``inalloca(<ty>)``
1173
1174    The ``inalloca`` argument attribute allows the caller to take the
1175    address of outgoing stack arguments. An ``inalloca`` argument must
1176    be a pointer to stack memory produced by an ``alloca`` instruction.
1177    The alloca, or argument allocation, must also be tagged with the
1178    inalloca keyword. Only the last argument may have the ``inalloca``
1179    attribute, and that argument is guaranteed to be passed in memory.
1180
1181    An argument allocation may be used by a call at most once because
1182    the call may deallocate it. The ``inalloca`` attribute cannot be
1183    used in conjunction with other attributes that affect argument
1184    storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
1185    ``inalloca`` attribute also disables LLVM's implicit lowering of
1186    large aggregate return values, which means that frontend authors
1187    must lower them with ``sret`` pointers.
1188
1189    When the call site is reached, the argument allocation must have
1190    been the most recent stack allocation that is still live, or the
1191    behavior is undefined. It is possible to allocate additional stack
1192    space after an argument allocation and before its call site, but it
1193    must be cleared off with :ref:`llvm.stackrestore
1194    <int_stackrestore>`.
1195
1196    The inalloca attribute requires a type argument, which must be the
1197    same as the pointee type of the argument.
1198
1199    See :doc:`InAlloca` for more information on how to use this
1200    attribute.
1201
1202``sret(<ty>)``
1203    This indicates that the pointer parameter specifies the address of a
1204    structure that is the return value of the function in the source
1205    program. This pointer must be guaranteed by the caller to be valid:
1206    loads and stores to the structure may be assumed by the callee not
1207    to trap and to be properly aligned. This is not a valid attribute
1208    for return values.
1209
1210    The sret type argument specifies the in memory type, which must be
1211    the same as the pointee type of the argument.
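
    As a sketch, a source-level function returning a large struct might be
    lowered as follows. The type ``%struct.Big`` and the function names are
    hypothetical, and opaque pointers are assumed:

    .. code-block:: llvm

        %struct.Big = type { [4 x i64] }

        ; The result is written through the sret pointer instead of being
        ; returned directly.
        define void @make(ptr sret(%struct.Big) align 8 %out) {
          store i64 0, ptr %out, align 8
          ret void
        }

        define i64 @use() {
          ; The caller provides valid, suitably aligned storage.
          %tmp = alloca %struct.Big, align 8
          call void @make(ptr sret(%struct.Big) align 8 %tmp)
          %v = load i64, ptr %tmp, align 8
          ret i64 %v
        }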
1212
1213.. _attr_elementtype:
1214
1215``elementtype(<ty>)``
1216
1217    The ``elementtype`` argument attribute can be used to specify a pointer
1218    element type in a way that is compatible with `opaque pointers
1219    <OpaquePointers.html>`__.
1220
1221    The ``elementtype`` attribute by itself does not carry any specific
1222    semantics. However, certain intrinsics may require this attribute to be
1223    present and assign it particular semantics. This will be documented on
1224    individual intrinsics.
1225
1226    The attribute may only be applied to pointer typed arguments of intrinsic
1227    calls. It cannot be applied to non-intrinsic calls, and cannot be applied
1228    to parameters on function declarations. For non-opaque pointers, the type
1229    passed to ``elementtype`` must match the pointer element type.
1230
1231.. _attr_align:
1232
1233``align <n>`` or ``align(<n>)``
1234    This indicates that the pointer value or vector of pointers has the
1235    specified alignment. If applied to a vector of pointers, *all* pointers
1236    (elements) have the specified alignment. If the pointer value does not have
    the specified alignment, a :ref:`poison value <poisonvalues>` is returned or
1238    passed instead.  The ``align`` attribute should be combined with the
1239    ``noundef`` attribute to ensure a pointer is aligned, or otherwise the
1240    behavior is undefined. Note that ``align 1`` has no effect on non-byval,
1241    non-preallocated arguments.
1242
1243    Note that this attribute has additional semantics when combined with the
1244    ``byval`` or ``preallocated`` attribute, which are documented there.
1245
1246.. _noalias:
1247
1248``noalias``
1249    This indicates that memory locations accessed via pointer values
1250    :ref:`based <pointeraliasing>` on the argument or return value are not also
1251    accessed, during the execution of the function, via pointer values not
1252    *based* on the argument or return value. This guarantee only holds for
1253    memory locations that are *modified*, by any means, during the execution of
1254    the function. The attribute on a return value also has additional semantics
1255    described below. The caller shares the responsibility with the callee for
1256    ensuring that these requirements are met.  For further details, please see
1257    the discussion of the NoAlias response in :ref:`alias analysis <Must, May,
1258    or No>`.
1259
1260    Note that this definition of ``noalias`` is intentionally similar
1261    to the definition of ``restrict`` in C99 for function arguments.
1262
1263    For function return values, C99's ``restrict`` is not meaningful,
1264    while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
1265    attribute on return values are stronger than the semantics of the attribute
1266    when used on function arguments. On function return values, the ``noalias``
1267    attribute indicates that the function acts like a system memory allocation
1268    function, returning a pointer to allocated storage disjoint from the
1269    storage for any other object accessible to the caller.
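
    A sketch of both uses, with hypothetical function names and assuming
    opaque pointers:

    .. code-block:: llvm

        ; On arguments: memory written through %dst is not also accessed
        ; through %src (or any other pointer not based on %dst) during the
        ; call.
        define void @scale(ptr noalias %dst, ptr noalias %src) {
          %v = load i32, ptr %src, align 4
          %w = mul i32 %v, 2
          store i32 %w, ptr %dst, align 4
          ret void
        }

        ; On a return value: the function behaves like an allocator and
        ; returns storage disjoint from anything else the caller can access.
        declare noalias ptr @make_buffer(i64)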
1270
1271.. _nocapture:
1272
1273``nocapture``
1274    This indicates that the callee does not :ref:`capture <pointercapture>` the
1275    pointer. This is not a valid attribute for return values.
1276    This attribute applies only to the particular copy of the pointer passed in
1277    this argument. A caller could pass two copies of the same pointer with one
    being annotated ``nocapture`` and the other not, and the callee could validly
    capture through the non-annotated parameter.
1280
1281.. code-block:: llvm
1282
1283    define void @f(ptr nocapture %a, ptr %b) {
1284      ; (capture %b)
1285    }
1286
1287    call void @f(ptr @glb, ptr @glb) ; well-defined
1288
1289``nofree``
    This indicates that the callee does not free the pointer argument. This is not
1291    a valid attribute for return values.
1292
1293.. _nest:
1294
1295``nest``
1296    This indicates that the pointer parameter can be excised using the
1297    :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
1298    attribute for return values and can only be applied to one parameter.
1299
1300``returned``
1301    This indicates that the function always returns the argument as its return
1302    value. This is a hint to the optimizer and code generator used when
1303    generating the caller, allowing value propagation, tail call optimization,
1304    and omission of register saves and restores in some cases; it is not
1305    checked or enforced when generating the callee. The parameter and the
1306    function return type must be valid operands for the
1307    :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
1308    return values and can only be applied to one parameter.
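
    As a minimal sketch, a function that hands back its own argument could be
    annotated as follows (``@id`` is a hypothetical name used only for
    illustration):

    .. code-block:: llvm

        ; @id is hypothetical; `returned` tells callers that the result of
        ; the call is always the value passed as %p.
        define ptr @id(ptr returned %p) {
          ret ptr %p
        }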
1309
1310``nonnull``
1311    This indicates that the parameter or return pointer is not null. This
1312    attribute may only be applied to pointer typed parameters. This is not
1313    checked or enforced by LLVM; if the parameter or return pointer is null,
    a :ref:`poison value <poisonvalues>` is returned or passed instead.
    To make passing or returning a null pointer undefined behavior rather than
    poison, combine the ``nonnull`` attribute with the ``noundef`` attribute.
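
    The difference between the two forms can be sketched with two
    declarations (``@use_poison`` and ``@use_ub`` are hypothetical names):

    .. code-block:: llvm

        ; Hypothetical declarations for illustration only.
        ; A null argument here becomes a poison value inside the callee.
        declare void @use_poison(ptr nonnull)

        ; A null argument here is immediate undefined behavior.
        declare void @use_ub(ptr nonnull noundef)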
1317
1318``dereferenceable(<n>)``
1319    This indicates that the parameter or return pointer is dereferenceable. This
1320    attribute may only be applied to pointer typed parameters. A pointer that
1321    is dereferenceable can be loaded from speculatively without a risk of
1322    trapping. The number of bytes known to be dereferenceable must be provided
1323    in parentheses. It is legal for the number of bytes to be less than the
1324    size of the pointee type. The ``nonnull`` attribute does not imply
1325    dereferenceability (consider a pointer to one element past the end of an
1326    array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
1327    ``addrspace(0)`` (which is the default address space), except if the
1328    ``null_pointer_is_valid`` function attribute is present.
1329    ``n`` should be a positive number. The pointer should be well defined,
1330    otherwise it is undefined behavior. This means ``dereferenceable(<n>)``
1331    implies ``noundef``.
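
    For example, a hypothetical function ``@load4`` might promise that at least
    four bytes behind its argument can be read, which would allow the load to
    be hoisted or speculated:

    .. code-block:: llvm

        ; @load4 is a hypothetical name. The caller guarantees that the 4
        ; bytes starting at %p are dereferenceable.
        define i32 @load4(ptr dereferenceable(4) %p) {
          %v = load i32, ptr %p
          ret i32 %v
        }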
1332
1333``dereferenceable_or_null(<n>)``
1334    This indicates that the parameter or return value isn't both
1335    non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
1336    time. All non-null pointers tagged with
1337    ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
1338    For address space 0 ``dereferenceable_or_null(<n>)`` implies that
1339    a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
1340    and in other address spaces ``dereferenceable_or_null(<n>)``
1341    implies that a pointer is at least one of ``dereferenceable(<n>)``
1342    or ``null`` (i.e. it may be both ``null`` and
1343    ``dereferenceable(<n>)``). This attribute may only be applied to
1344    pointer typed parameters.
1345
1346``swiftself``
1347    This indicates that the parameter is the self/context parameter. This is not
1348    a valid attribute for return values and can only be applied to one
1349    parameter.
1350
1351``swiftasync``
1352    This indicates that the parameter is the asynchronous context parameter and
1353    triggers the creation of a target-specific extended frame record to store
1354    this pointer. This is not a valid attribute for return values and can only
1355    be applied to one parameter.
1356
1357``swifterror``
    This attribute exists to model and optimize Swift error handling. It
    can be applied to a parameter with pointer to pointer type or a
    pointer-sized alloca. At the call site, the actual argument that corresponds
    to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or
    the ``swifterror`` parameter of the caller. A ``swifterror`` value (either
    the parameter or the alloca) can only be loaded from, stored to, or used as
    a ``swifterror`` argument. This is not a valid attribute for return values
    and can only be applied to one parameter.
1366
1367    These constraints allow the calling convention to optimize access to
1368    ``swifterror`` variables by associating them with a specific register at
1369    call boundaries rather than placing them in memory. Since this does change
1370    the calling convention, a function which uses the ``swifterror`` attribute
1371    on a parameter is not ABI-compatible with one which does not.
1372
1373    These constraints also allow LLVM to assume that a ``swifterror`` argument
1374    does not alias any other memory visible within a function and that a
1375    ``swifterror`` alloca passed as an argument does not escape.
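
    The usage rules above can be sketched as follows. The names ``@may_fail``
    and ``@caller`` are hypothetical, and a real frontend would typically also
    select a Swift calling convention:

    .. code-block:: llvm

        ; Hypothetical callee taking a swifterror parameter.
        declare void @may_fail(ptr swifterror)

        define void @caller() {
          ; The error slot is a swifterror alloca of pointer type.
          %err = alloca swifterror ptr
          store ptr null, ptr %err
          ; The argument comes directly from the swifterror alloca.
          call void @may_fail(ptr swifterror %err)
          ; Other than being passed as swifterror, the slot is only loaded
          ; from and stored to.
          %e = load ptr, ptr %err
          ret void
        }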
1376
1377``immarg``
1378    This indicates the parameter is required to be an immediate
1379    value. This must be a trivial immediate integer or floating-point
1380    constant. Undef or constant expressions are not valid. This is
1381    only valid on intrinsic declarations and cannot be applied to a
1382    call site or arbitrary function.
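
    As an illustration, the final ``isvolatile`` operand of the
    ``llvm.memcpy`` intrinsic is an ``immarg``, so a call has to pass a literal
    constant there (``@copy16`` is a hypothetical wrapper):

    .. code-block:: llvm

        declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1 immarg)

        ; @copy16 is a hypothetical function for illustration.
        define void @copy16(ptr %dst, ptr %src) {
          ; The last operand must be a literal i1 constant such as `false`.
          call void @llvm.memcpy.p0.p0.i64(ptr %dst, ptr %src, i64 16, i1 false)
          ret void
        }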
1383
1384``noundef``
1385    This attribute applies to parameters and return values. If the value
1386    representation contains any undefined or poison bits, the behavior is
1387    undefined. Note that this does not refer to padding introduced by the
1388    type's storage representation.
1389
1390``alignstack(<n>)``
1391    This indicates the alignment that should be considered by the backend when
1392    assigning this parameter to a stack slot during calling convention
1393    lowering. The enforcement of the specified alignment is target-dependent,
1394    as target-specific calling convention rules may override this value. This
1395    attribute serves the purpose of carrying language specific alignment
1396    information that is not mapped to base types in the backend (for example,
1397    over-alignment specification through language attributes).
1398
1399``allocalign``
    The function parameter marked with this attribute is the alignment in bytes of the
    newly allocated block returned by this function. The returned value must either have
    the specified alignment or be the null pointer. The return value may be more aligned
    than the requested alignment, but not less aligned. Invalid (e.g. non-power-of-2)
    alignments are permitted for the ``allocalign`` parameter, so long as the returned
    pointer is null. This attribute may only be applied to integer parameters.
1406
1407``allocptr``
1408    The function parameter marked with this attribute is the pointer
1409    that will be manipulated by the allocator. For a realloc-like
    function the pointer will be invalidated upon success (but the
    same address may be returned); for a free-like function the
    pointer will always be invalidated.
1413
1414.. _gc:
1415
1416Garbage Collector Strategy Names
1417--------------------------------
1418
1419Each function may specify a garbage collector strategy name, which is simply a
1420string:
1421
1422.. code-block:: llvm
1423
1424    define void @f() gc "name" { ... }
1425
The supported values of *name* include those :ref:`built in to LLVM
<builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
strategy will cause the compiler to alter its output in order to support the
named garbage collection algorithm. Note that LLVM itself does not contain a
garbage collector; this functionality is restricted to generating machine code
which can interoperate with a collector provided externally.
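
As a concrete sketch, a function could select one of the built-in strategies by
name. The example below assumes the ``statepoint-example`` strategy is
available in the build being used:

.. code-block:: llvm

    ; Assumes the built-in "statepoint-example" strategy is available.
    define void @f() gc "statepoint-example" {
      ret void
    }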
1432
1433.. _prefixdata:
1434
1435Prefix Data
1436-----------
1437
1438Prefix data is data associated with a function which the code
1439generator will emit immediately before the function's entrypoint.
1440The purpose of this feature is to allow frontends to associate
1441language-specific runtime metadata with specific functions and make it
1442available through the function pointer while still allowing the
1443function pointer to be called.
1444
1445To access the data for a given function, a program may bitcast the
1446function pointer to a pointer to the constant's type and dereference
1447index -1. This implies that the IR symbol points just past the end of
1448the prefix data. For instance, take the example of a function annotated
1449with a single ``i32``,
1450
1451.. code-block:: llvm
1452
1453    define void @f() prefix i32 123 { ... }
1454
1455The prefix data can be referenced as,
1456
1457.. code-block:: llvm
1458
1459    %a = getelementptr inbounds i32, ptr @f, i32 -1
1460    %b = load i32, ptr %a
1461
1462Prefix data is laid out as if it were an initializer for a global variable
1463of the prefix data's type. The function will be placed such that the
1464beginning of the prefix data is aligned. This means that if the size
1465of the prefix data is not a multiple of the alignment size, the
1466function's entrypoint will not be aligned. If alignment of the
1467function's entrypoint is desired, padding must be added to the prefix
1468data.
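
One possible way to add such padding, sketched here under the assumption that
a 16-byte aligned entrypoint is wanted, is to wrap the ``i32`` in a packed
struct whose total size is 16 bytes. The padding is placed first so that the
``i32`` still sits immediately before the entrypoint:

.. code-block:: llvm

    ; 12 bytes of padding + 4 bytes of data = 16 bytes of prefix data, so the
    ; entrypoint keeps the 16-byte alignment requested for the function.
    define void @f() align 16 prefix <{ [12 x i8], i32 }> <{ [12 x i8] zeroinitializer, i32 123 }> {
      ret void
    }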
1469
1470A function may have prefix data but no body. This has similar semantics
1471to the ``available_externally`` linkage in that the data may be used by the
1472optimizers but will not be emitted in the object file.
1473
1474.. _prologuedata:
1475
1476Prologue Data
1477-------------
1478
1479The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
1480be inserted prior to the function body. This can be used for enabling
1481function hot-patching and instrumentation.
1482
1483To maintain the semantics of ordinary function calls, the prologue data must
1484have a particular format. Specifically, it must begin with a sequence of
1485bytes which decode to a sequence of machine instructions, valid for the
1486module's target, which transfer control to the point immediately succeeding
1487the prologue data, without performing any other visible action. This allows
1488the inliner and other passes to reason about the semantics of the function
1489definition without needing to reason about the prologue data. Obviously this
1490makes the format of the prologue data highly target dependent.
1491
1492A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
1493which encodes the ``nop`` instruction:
1494
1495.. code-block:: text
1496
1497    define void @f() prologue i8 144 { ... }
1498
1499Generally prologue data can be formed by encoding a relative branch instruction
1500which skips the metadata, as in this example of valid prologue data for the
1501x86_64 architecture, where the first two bytes encode ``jmp .+10``:
1502
1503.. code-block:: text
1504
1505    %0 = type <{ i8, i8, ptr }>
1506
1507    define void @f() prologue %0 <{ i8 235, i8 8, ptr @md}> { ... }
1508
1509A function may have prologue data but no body. This has similar semantics
1510to the ``available_externally`` linkage in that the data may be used by the
1511optimizers but will not be emitted in the object file.
1512
1513.. _personalityfn:
1514
1515Personality Function
1516--------------------
1517
1518The ``personality`` attribute permits functions to specify what function
1519to use for exception handling.
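
A minimal sketch of a function that names a personality function is shown
below. It assumes the Itanium C++ personality routine
``__gxx_personality_v0``; ``@may_throw`` is a hypothetical callee:

.. code-block:: llvm

    declare i32 @__gxx_personality_v0(...)
    declare void @may_throw()   ; hypothetical function that may unwind

    define void @f() personality ptr @__gxx_personality_v0 {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %lpad

    cont:
      ret void

    lpad:
      ; Run cleanups (none here) and continue unwinding.
      %lp = landingpad { ptr, i32 }
              cleanup
      resume { ptr, i32 } %lp
    }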
1520
1521.. _attrgrp:
1522
1523Attribute Groups
1524----------------
1525
1526Attribute groups are groups of attributes that are referenced by objects within
1527the IR. They are important for keeping ``.ll`` files readable, because a lot of
functions will use the same set of attributes. In the degenerate case of a
1529``.ll`` file that corresponds to a single ``.c`` file, the single attribute
1530group will capture the important command line flags used to build that file.
1531
1532An attribute group is a module-level object. To use an attribute group, an
1533object references the attribute group's ID (e.g. ``#37``). An object may refer
1534to more than one attribute group. In that situation, the attributes from the
1535different groups are merged.
1536
1537Here is an example of attribute groups for a function that should always be
1538inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:
1539
1540.. code-block:: llvm
1541
1542   ; Target-independent attributes:
1543   attributes #0 = { alwaysinline alignstack=4 }
1544
1545   ; Target-dependent attributes:
1546   attributes #1 = { "no-sse" }
1547
1548   ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
1549   define void @f() #0 #1 { ... }
1550
1551.. _fnattrs:
1552
1553Function Attributes
1554-------------------
1555
1556Function attributes are set to communicate additional information about
1557a function. Function attributes are considered to be part of the
1558function, not of the function type, so functions with different function
1559attributes can have the same function type.
1560
1561Function attributes are simple keywords that follow the type specified.
1562If multiple attributes are needed, they are space separated. For
1563example:
1564
1565.. code-block:: llvm
1566
1567    define void @f() noinline { ... }
1568    define void @f() alwaysinline { ... }
1569    define void @f() alwaysinline optsize { ... }
1570    define void @f() optsize { ... }
1571
1572``alignstack(<n>)``
1573    This attribute indicates that, when emitting the prologue and
1574    epilogue, the backend should forcibly align the stack pointer.
1575    Specify the desired alignment, which must be a power of two, in
1576    parentheses.
1577``"alloc-family"="FAMILY"``
    This indicates which "family" an allocator function is part of. To avoid
    collisions, the family name should match the mangled name of the primary
    allocator function, that is "malloc" for malloc/calloc/realloc/free,
    "_Znwm" for ``::operator new`` and ``::operator delete``, and
    "_ZnwmSt11align_val_t" for aligned ``::operator new`` and
    ``::operator delete``. Matching malloc/realloc/free calls within a family
    can be optimized, but mismatched ones will be left alone.
1585``allockind("KIND")``
    Describes the behavior of an allocation function. The KIND string contains
    comma-separated entries from the following options:
1588
1589    * "alloc": the function returns a new block of memory or null.
1590    * "realloc": the function returns a new block of memory or null. If the
1591      result is non-null the memory contents from the start of the block up to
1592      the smaller of the original allocation size and the new allocation size
1593      will match that of the ``allocptr`` argument and the ``allocptr``
1594      argument is invalidated, even if the function returns the same address.
1595    * "free": the function frees the block of memory specified by ``allocptr``.
1596      Functions marked as "free" ``allockind`` must return void.
    * "uninitialized": Any newly-allocated memory (either a new block from
      an "alloc" function or the enlarged capacity from a "realloc" function)
      will be uninitialized.
    * "zeroed": Any newly-allocated memory (either a new block from an "alloc"
      function or the enlarged capacity from a "realloc" function) will be
      zeroed.
1603    * "aligned": the function returns memory aligned according to the
1604      ``allocalign`` parameter.
1605
1606    The first three options are mutually exclusive, and the remaining options
1607    describe more details of how the function behaves. The remaining options
1608    are invalid for "free"-type functions.
1609``allocsize(<EltSizeParam>[, <NumEltsParam>])``
1610    This attribute indicates that the annotated function will always return at
1611    least a given number of bytes (or null). Its arguments are zero-indexed
1612    parameter numbers; if one argument is provided, then it's assumed that at
1613    least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
1614    returned pointer. If two are provided, then it's assumed that
1615    ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
1616    available. The referenced parameters must be integer types. No assumptions
1617    are made about the contents of the returned block of memory.
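
    To show how ``"alloc-family"``, ``allockind``, ``allocsize``, and the
    ``allocptr`` and ``allocalign`` parameter attributes fit together, here is
    a sketch of a custom allocator family. All ``@my_*`` names are
    hypothetical:

    .. code-block:: llvm

        ; Hypothetical allocator family named after its primary allocator.
        declare noalias ptr @my_malloc(i64) allockind("alloc,uninitialized") allocsize(0) "alloc-family"="my_malloc"

        ; Size is the product of parameters 0 and 1; memory is zeroed.
        declare noalias ptr @my_calloc(i64, i64) allockind("alloc,zeroed") allocsize(0, 1) "alloc-family"="my_malloc"

        ; Parameter 0 is the block being resized, parameter 1 the new size.
        declare noalias ptr @my_realloc(ptr allocptr, i64) allockind("realloc") allocsize(1) "alloc-family"="my_malloc"

        ; Parameter 0 is the requested alignment, parameter 1 the size.
        declare noalias ptr @my_aligned_alloc(i64 allocalign, i64) allockind("alloc,uninitialized,aligned") allocsize(1) "alloc-family"="my_malloc"

        ; "free"-kind functions return void and take the block as allocptr.
        declare void @my_free(ptr allocptr) allockind("free") "alloc-family"="my_malloc"
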
1618``alwaysinline``
1619    This attribute indicates that the inliner should attempt to inline
1620    this function into callers whenever possible, ignoring any active
1621    inlining size threshold for this caller.
1622``builtin``
1623    This indicates that the callee function at a call site should be
1624    recognized as a built-in function, even though the function's declaration
1625    uses the ``nobuiltin`` attribute. This is only valid at call sites for
1626    direct calls to functions that are declared with the ``nobuiltin``
1627    attribute.
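
    A sketch of the interaction with ``nobuiltin`` is shown below, using the
    mangled name of ``::operator new`` as the callee (``@g`` is a hypothetical
    caller):

    .. code-block:: llvm

        ; The declaration opts out of built-in treatment by default.
        declare ptr @_Znwm(i64) nobuiltin

        ; @g is hypothetical.
        define ptr @g() {
          ; This particular direct call opts back in.
          %p = call ptr @_Znwm(i64 8) builtin
          ret ptr %p
        }
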
1628``cold``
    This attribute indicates that this function is rarely called. When
    computing edge weights, basic blocks post-dominated by a cold
    function call are also considered to be cold and are therefore given
    low weight.
1633``convergent``
1634    In some parallel execution models, there exist operations that cannot be
1635    made control-dependent on any additional values.  We call such operations
1636    ``convergent``, and mark them with this attribute.
1637
1638    The ``convergent`` attribute may appear on functions or call/invoke
1639    instructions.  When it appears on a function, it indicates that calls to
1640    this function should not be made control-dependent on additional values.
1641    For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so
1642    calls to this intrinsic cannot be made control-dependent on additional
1643    values.
1644
1645    When it appears on a call/invoke, the ``convergent`` attribute indicates
1646    that we should treat the call as though we're calling a convergent
1647    function.  This is particularly useful on indirect calls; without this we
1648    may treat such calls as though the target is non-convergent.
1649
1650    The optimizer may remove the ``convergent`` attribute on functions when it
1651    can prove that the function does not execute any convergent operations.
1652    Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
1653    can prove that the call/invoke cannot call a convergent function.
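
    Both placements can be sketched as follows. ``@my_barrier`` and
    ``@kernel`` are hypothetical names:

    .. code-block:: llvm

        ; On a declaration: every call to @my_barrier is convergent.
        declare void @my_barrier() convergent

        ; @kernel is hypothetical.
        define void @kernel(ptr %fp) {
          call void @my_barrier()
          ; On an indirect call: treat the unknown target as convergent.
          call void %fp() convergent
          ret void
        }
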
1654``disable_sanitizer_instrumentation``
1655    When instrumenting code with sanitizers, it can be important to skip certain
1656    functions to ensure no instrumentation is applied to them.
1657
    This attribute is not always equivalent to the absence of ``sanitize_<name>``
    attributes: depending on the specific sanitizer, code can be inserted into
    functions regardless of the ``sanitize_<name>`` attribute to prevent false
    positive reports.
1662
1663    ``disable_sanitizer_instrumentation`` disables all kinds of instrumentation,
1664    taking precedence over the ``sanitize_<name>`` attributes and other compiler
1665    flags.
1666``"dontcall-error"``
1667    This attribute denotes that an error diagnostic should be emitted when a
1668    call of a function with this attribute is not eliminated via optimization.
1669    Front ends can provide optional ``srcloc`` metadata nodes on call sites of
1670    such callees to attach information about where in the source language such a
1671    call came from. A string value can be provided as a note.
1672``"dontcall-warn"``
1673    This attribute denotes that a warning diagnostic should be emitted when a
1674    call of a function with this attribute is not eliminated via optimization.
1675    Front ends can provide optional ``srcloc`` metadata nodes on call sites of
1676    such callees to attach information about where in the source language such a
1677    call came from. A string value can be provided as a note.
1678``fn_ret_thunk_extern``
1679    This attribute tells the code generator that returns from functions should
1680    be replaced with jumps to externally-defined architecture-specific symbols.
1681    For X86, this symbol's identifier is ``__x86_return_thunk``.
1682``"frame-pointer"``
1683    This attribute tells the code generator whether the function
1684    should keep the frame pointer. The code generator may emit the frame pointer
1685    even if this attribute says the frame pointer can be eliminated.
1686    The allowed string values are:
1687
1688     * ``"none"`` (default) - the frame pointer can be eliminated.
1689     * ``"non-leaf"`` - the frame pointer should be kept if the function calls
1690       other functions.
1691     * ``"all"`` - the frame pointer should be kept.
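
    Like other string attributes, this one is written as a key/value pair. A
    minimal sketch with a hypothetical function ``@leafy``:

    .. code-block:: llvm

        ; @leafy is hypothetical. Keep the frame pointer only if the
        ; function calls other functions.
        define void @leafy() "frame-pointer"="non-leaf" {
          ret void
        }
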
1692``hot``
1693    This attribute indicates that this function is a hot spot of the program
1694    execution. The function will be optimized more aggressively and will be
    placed into a special subsection of the text section to improve locality.

    When profile feedback is enabled, this attribute takes precedence over
1698    the profile information. By marking a function ``hot``, users can work
1699    around the cases where the training input does not have good coverage
1700    on all the hot functions.
1701``inaccessiblememonly``
1702    This attribute indicates that the function may only access memory that
1703    is not accessible by the module being compiled before return from the
1704    function. This is a weaker form of ``readnone``. If the function reads
1705    or writes other memory, the behavior is undefined.
1706
1707    For clarity, note that such functions are allowed to return new memory
1708    which is ``noalias`` with respect to memory already accessible from
1709    the module.  That is, a function can be both ``inaccessiblememonly`` and
1710    have a ``noalias`` return which introduces a new, potentially initialized,
1711    allocation.
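
    For instance, a hypothetical allocator ``@my_alloc`` whose bookkeeping
    state is not visible to the module could be declared like this:

    .. code-block:: llvm

        ; @my_alloc is hypothetical. It only touches allocator-internal
        ; state and returns a fresh block that aliases nothing the module
        ; can already see.
        declare noalias ptr @my_alloc(i64) inaccessiblememonly
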
1712``inaccessiblemem_or_argmemonly``
1713    This attribute indicates that the function may only access memory that is
1714    either not accessible by the module being compiled, or is pointed to
    by its pointer arguments. This is a weaker form of ``argmemonly``. If the
1716    function reads or writes other memory, the behavior is undefined.
1717``inlinehint``
1718    This attribute indicates that the source code contained a hint that
1719    inlining this function is desirable (such as the "inline" keyword in
1720    C/C++). It is just a hint; it imposes no requirements on the
1721    inliner.
1722``jumptable``
1723    This attribute indicates that the function should be added to a
1724    jump-instruction table at code-generation time, and that all address-taken
1725    references to this function should be replaced with a reference to the
1726    appropriate jump-instruction-table function pointer. Note that this creates
1727    a new pointer for the original function, which means that code that depends
1728    on function-pointer identity can break. So, any function annotated with
1729    ``jumptable`` must also be ``unnamed_addr``.
1730``minsize``
1731    This attribute suggests that optimization passes and code generator
1732    passes make choices that keep the code size of this function as small
1733    as possible and perform optimizations that may sacrifice runtime
1734    performance in order to minimize the size of the generated code.
1735``naked``
1736    This attribute disables prologue / epilogue emission for the
1737    function. This can have very system-specific consequences.
1738``"no-inline-line-tables"``
1739    When this attribute is set to true, the inliner discards source locations
1740    when inlining code and instead uses the source location of the call site.
1741    Breakpoints set on code that was inlined into the current function will
1742    not fire during the execution of the inlined call sites. If the debugger
1743    stops inside an inlined call site, it will appear to be stopped at the
1744    outermost inlined call site.
``"no-jump-tables"``
    When this attribute is set to true, the jump tables and lookup tables that
    can be generated when lowering a switch are disabled.
1748``nobuiltin``
1749    This indicates that the callee function at a call site is not recognized as
1750    a built-in function. LLVM will retain the original call and not replace it
1751    with equivalent code based on the semantics of the built-in function, unless
1752    the call site uses the ``builtin`` attribute. This is valid at call sites
1753    and on function declarations and definitions.
1754``noduplicate``
1755    This attribute indicates that calls to the function cannot be
1756    duplicated. A call to a ``noduplicate`` function may be moved
1757    within its parent function, but may not be duplicated within
1758    its parent function.
1759
1760    A function containing a ``noduplicate`` call may still
1761    be an inlining candidate, provided that the call is not
1762    duplicated by inlining. That implies that the function has
1763    internal linkage and only has one call site, so the original
1764    call is dead after inlining.
1765``nofree``
1766    This function attribute indicates that the function does not, directly or
1767    transitively, call a memory-deallocation function (``free``, for example)
1768    on a memory allocation which existed before the call.
1769
1770    As a result, uncaptured pointers that are known to be dereferenceable
1771    prior to a call to a function with the ``nofree`` attribute are still
1772    known to be dereferenceable after the call. The capturing condition is
1773    necessary in environments where the function might communicate the
1774    pointer to another thread which then deallocates the memory.  Alternatively,
1775    ``nosync`` would ensure such communication cannot happen and even captured
1776    pointers cannot be freed by the function.
1777
    A ``nofree`` function is explicitly allowed to free memory which it
    allocated or (if not ``nosync``) arrange for another thread to free
    memory on its behalf.  As a result, perhaps surprisingly, a ``nofree``
    function can return a pointer to a previously deallocated memory object.
1782``noimplicitfloat``
1783    Disallows implicit floating-point code. This inhibits optimizations that
1784    use floating-point code and floating-point/SIMD/vector registers for
1785    operations that are not nominally floating-point. LLVM instructions that
1786    perform floating-point operations or require access to floating-point
1787    registers may still cause floating-point code to be generated.
1788``noinline``
1789    This attribute indicates that the inliner should never inline this
1790    function in any situation. This attribute may not be used together
1791    with the ``alwaysinline`` attribute.
1792``nomerge``
1793    This attribute indicates that calls to this function should never be merged
1794    during optimization. For example, it will prevent tail merging otherwise
1795    identical code sequences that raise an exception or terminate the program.
1796    Tail merging normally reduces the precision of source location information,
1797    making stack traces less useful for debugging. This attribute gives the
1798    user control over the tradeoff between code size and debug information
1799    precision.
1800``nonlazybind``
1801    This attribute suppresses lazy symbol binding for the function. This
1802    may make calls to the function faster, at the cost of extra program
1803    startup time if the function is not called during program startup.
1804``noprofile``
1805    This function attribute prevents instrumentation based profiling, used for
1806    coverage or profile based optimization, from being added to a function,
1807    even when inlined.
1808``noredzone``
1809    This attribute indicates that the code generator should not use a
1810    red zone, even if the target-specific ABI normally permits it.
``"indirect-tls-seg-refs"``
1812    This attribute indicates that the code generator should not use
1813    direct TLS access through segment registers, even if the
1814    target-specific ABI normally permits it.
1815``noreturn``
    This function attribute indicates that the function never returns
    normally, that is, through a return instruction. This produces undefined
    behavior at runtime if the function ever does dynamically return. Annotated
    functions may still raise an exception, i.e., ``nounwind`` is not implied.
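
    A small sketch of a ``noreturn`` wrapper around the C library ``abort``
    function (``@die`` is a hypothetical name):

    .. code-block:: llvm

        declare void @abort() noreturn nounwind

        ; @die is hypothetical. Control never reaches a `ret`, so the block
        ; ends in `unreachable` after the call.
        define void @die() noreturn {
          call void @abort()
          unreachable
        }
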
1820``norecurse``
1821    This function attribute indicates that the function does not call itself
1822    either directly or indirectly down any possible call path. This produces
1823    undefined behavior at runtime if the function ever does recurse.
1824
1825.. _langref_willreturn:
1826
1827``willreturn``
    This function attribute indicates that a call of this function will
    either exhibit undefined behavior or come back and continue execution
    at a point in the existing call stack that includes the current invocation.
    Annotated functions may still raise an exception, i.e., ``nounwind`` is not implied.
    If an invocation of an annotated function does not return control back
    to a point in the call stack, the behavior is undefined.
1834``nosync``
    This function attribute indicates that the function does not communicate
    (synchronize) with another thread through memory or other well-defined means.
    Synchronization is considered possible in the presence of ``atomic`` accesses
    that enforce an order (that is, accesses that are neither "unordered" nor
    "monotonic"), ``volatile`` accesses, as well as ``convergent`` function calls.
    Note that ``convergent`` function calls make non-memory communication, e.g.,
    cross-lane operations, possible, and such communication is also considered
    synchronization. However, ``convergent`` does not contradict ``nosync``.
    If an annotated function does ever synchronize with another thread,
    the behavior is undefined.
1844``nounwind``
1845    This function attribute indicates that the function never raises an
1846    exception. If the function does raise an exception, its runtime
    behavior is undefined. However, functions marked ``nounwind`` may still
1848    trap or generate asynchronous exceptions. Exception handling schemes
1849    that are recognized by LLVM to handle asynchronous exceptions, such
1850    as SEH, will still provide their implementation defined semantics.
1851``nosanitize_bounds``
1852    This attribute indicates that bounds checking sanitizer instrumentation
1853    is disabled for this function.
1854``nosanitize_coverage``
1855    This attribute indicates that SanitizerCoverage instrumentation is disabled
1856    for this function.
1857``null_pointer_is_valid``
1858   If ``null_pointer_is_valid`` is set, then the ``null`` address
1859   in address-space 0 is considered to be a valid address for memory loads and
1860   stores. Any analysis or optimization should not treat dereferencing a
1861   pointer to ``null`` as undefined behavior in this function.
1862   Note: Comparing address of a global variable to ``null`` may still
1863   evaluate to false because of a limitation in querying this attribute inside
1864   constant expressions.
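
   As a sketch, a hypothetical function ``@read_zero`` that legitimately reads
   address zero (for example on a bare-metal target) could be written as:

   .. code-block:: llvm

       ; @read_zero is hypothetical. With the attribute present, this load
       ; is not treated as undefined behavior.
       define i32 @read_zero() null_pointer_is_valid {
         %v = load i32, ptr null
         ret i32 %v
       }
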
1865``optforfuzzing``
1866    This attribute indicates that this function should be optimized
1867    for maximum fuzzing signal.
1868``optnone``
1869    This function attribute indicates that most optimization passes will skip
1870    this function, with the exception of interprocedural optimization passes.
1871    Code generation defaults to the "fast" instruction selector.
1872    This attribute cannot be used together with the ``alwaysinline``
1873    attribute; this attribute is also incompatible
1874    with the ``minsize`` attribute and the ``optsize`` attribute.
1875
1876    This attribute requires the ``noinline`` attribute to be specified on
1877    the function as well, so the function is never inlined into any caller.
1878    Only functions with the ``alwaysinline`` attribute are valid
1879    candidates for inlining into the body of this function.
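
    Because of this requirement, the two attributes always appear together.
    A minimal sketch with a hypothetical function ``@debug_me``:

    .. code-block:: llvm

        ; @debug_me is hypothetical. `optnone` must be paired with
        ; `noinline`.
        define void @debug_me() noinline optnone {
          ret void
        }
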
1880``optsize``
1881    This attribute suggests that optimization passes and code generator
1882    passes make choices that keep the code size of this function low,
1883    and otherwise do optimizations specifically to reduce code size as
1884    long as they do not significantly impact runtime performance.
1885``"patchable-function"``
1886    This attribute tells the code generator that the code
1887    generated for this function needs to follow certain conventions that
1888    make it possible for a runtime function to patch over it later.
1889    The exact effect of this attribute depends on its string value,
1890    for which there currently is one legal possibility:
1891
1892     * ``"prologue-short-redirect"`` - This style of patchable
1893       function is intended to support patching a function prologue to
1894       redirect control away from the function in a thread safe
1895       manner.  It guarantees that the first instruction of the
1896       function will be large enough to accommodate a short jump
1897       instruction, and will be sufficiently aligned to allow being
1898       fully changed via an atomic compare-and-swap instruction.
       While the first requirement can be satisfied by inserting a large
       enough NOP, LLVM can and will try to re-purpose an existing
       instruction (i.e. one that would have to be emitted anyway) that is
       larger than a short jump as the patchable instruction.
1903
1904       ``"prologue-short-redirect"`` is currently only supported on
1905       x86-64.
1906
    This attribute by itself does not imply restrictions on
    inter-procedural optimizations.  All of the semantic effects the
    patching may have must be conveyed separately via the linkage type.
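
    For illustration only, the attribute might be attached as in the
    following sketch (the function name ``@patch_me`` is hypothetical):

    .. code-block:: llvm

        ; Ask the code generator to keep the prologue patchable with a
        ; short redirect.
        define void @patch_me() "patchable-function"="prologue-short-redirect" {
          ret void
        }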
1910``"probe-stack"``
    This attribute indicates that the function will trigger a guard region
    at the end of the stack. It ensures that each access to the stack is
    no further than the size of the guard region from a previous access
    to the stack. It takes one required string value, the name of
    the stack probing function that will be called.
1916
1917    If a function that has a ``"probe-stack"`` attribute is inlined into
1918    a function with another ``"probe-stack"`` attribute, the resulting
1919    function has the ``"probe-stack"`` attribute of the caller. If a
1920    function that has a ``"probe-stack"`` attribute is inlined into a
1921    function that has no ``"probe-stack"`` attribute at all, the resulting
1922    function has the ``"probe-stack"`` attribute of the callee.
1923``readnone``
1924    On a function, this attribute indicates that the function computes its
1925    result (or decides to unwind an exception) based strictly on its arguments,
1926    without dereferencing any pointer arguments or otherwise accessing
1927    any mutable state (e.g. memory, control registers, etc) visible outside the
1928    ``readnone`` function. It does not write through any pointer arguments
1929    (including ``byval`` arguments) and never changes any state visible to
1930    callers. This means while it cannot unwind exceptions by calling the ``C++``
1931    exception throwing methods (since they write to memory), there may be
1932    non-``C++`` mechanisms that throw exceptions without writing to LLVM visible
1933    memory.
1934
1935    On an argument, this attribute indicates that the function does not
1936    dereference that pointer argument, even though it may read or write the
1937    memory that the pointer points to if accessed through other pointers.
1938
1939    If a readnone function reads or writes memory visible outside the function,
1940    or has other side-effects, the behavior is undefined. If a
1941    function reads from or writes to a readnone pointer argument, the behavior
1942    is undefined.
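
    A minimal sketch of both uses of ``readnone`` (the function names are
    hypothetical):

    .. code-block:: llvm

        ; Function-level: the result depends only on %x and no memory is
        ; accessed.
        define i32 @square(i32 %x) readnone {
          %r = mul i32 %x, %x
          ret i32 %r
        }

        ; Argument-level: %p is compared but never dereferenced.
        define i1 @is_null(ptr readnone %p) {
          %c = icmp eq ptr %p, null
          ret i1 %c
        }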
1943``readonly``
1944    On a function, this attribute indicates that the function does not write
1945    through any pointer arguments (including ``byval`` arguments) or otherwise
1946    modify any state (e.g. memory, control registers, etc) visible outside the
1947    ``readonly`` function. It may dereference pointer arguments and read
1948    state that may be set in the caller. A readonly function always
1949    returns the same value (or unwinds an exception identically) when
1950    called with the same set of arguments and global state.  This means while it
1951    cannot unwind exceptions by calling the ``C++`` exception throwing methods
1952    (since they write to memory), there may be non-``C++`` mechanisms that throw
1953    exceptions without writing to LLVM visible memory.
1954
1955    On an argument, this attribute indicates that the function does not write
1956    through this pointer argument, even though it may write to the memory that
1957    the pointer points to.
1958
1959    If a readonly function writes memory visible outside the function, or has
1960    other side-effects, the behavior is undefined. If a function writes to a
1961    readonly pointer argument, the behavior is undefined.
1962``"stack-probe-size"``
1963    This attribute controls the behavior of stack probes: either
1964    the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
    It defines the size of the guard region. It ensures that if the function
    may use more stack space than the size of the guard region, a stack probing
    sequence will be emitted. It takes one required integer value, which
    is 4096 by default.
1969
1970    If a function that has a ``"stack-probe-size"`` attribute is inlined into
1971    a function with another ``"stack-probe-size"`` attribute, the resulting
1972    function has the ``"stack-probe-size"`` attribute that has the lower
1973    numeric value. If a function that has a ``"stack-probe-size"`` attribute is
1974    inlined into a function that has no ``"stack-probe-size"`` attribute
1975    at all, the resulting function has the ``"stack-probe-size"`` attribute
1976    of the callee.
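
    The sketch below combines this attribute with ``"probe-stack"``. The
    probing function name ``__probestack`` and the other function names are
    assumptions made for illustration, not names required by LLVM:

    .. code-block:: llvm

        declare void @use(ptr)

        ; The 8192-byte buffer is larger than the 4096-byte guard region, so
        ; a stack probing sequence is expected for this function.
        define void @big_frame() "probe-stack"="__probestack" "stack-probe-size"="4096" {
          %buf = alloca [8192 x i8]
          call void @use(ptr %buf)
          ret void
        }

    Under the inlining rule above, inlining a callee marked
    ``"stack-probe-size"="8192"`` into ``@big_frame`` would leave the result
    with the lower value, ``"4096"``.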
1977``"no-stack-arg-probe"``
1978    This attribute disables ABI-required stack probes, if any.
1979``writeonly``
1980    On a function, this attribute indicates that the function may write to but
1981    does not read from memory visible outside the ``writeonly`` function.
1982
1983    On an argument, this attribute indicates that the function may write to but
1984    does not read through this pointer argument (even though it may read from
1985    the memory that the pointer points to).
1986
1987    If a writeonly function reads memory visible outside the function or has
1988    other side-effects, the behavior is undefined. If a function reads
1989    from a writeonly pointer argument, the behavior is undefined.
1990``argmemonly``
    This attribute indicates that the only memory accesses inside the function
    are loads and stores from objects pointed to by its pointer-typed
    arguments, with arbitrary offsets. In other words, all memory operations
    in the function can refer to memory only using pointers based on its
    function arguments.

    Note that ``argmemonly`` can be used together with the ``readonly``
    attribute to specify that the function reads only from its arguments.
1999
2000    If an argmemonly function reads or writes memory other than the pointer
2001    arguments, or has other side-effects, the behavior is undefined.
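
    As a sketch of how ``argmemonly`` combines with ``readonly`` and
    ``writeonly`` (the function names are hypothetical):

    .. code-block:: llvm

        ; Reads memory, and only through its pointer argument.
        define i32 @load_first(ptr readonly %src) readonly argmemonly {
          %v = load i32, ptr %src
          ret i32 %v
        }

        ; Writes memory, and only through its pointer argument.
        define void @store_zero(ptr writeonly %dst) writeonly argmemonly {
          store i32 0, ptr %dst
          ret void
        }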
2002``returns_twice``
2003    This attribute indicates that this function can return twice. The C
2004    ``setjmp`` is an example of such a function. The compiler disables
2005    some optimizations (like tail calls) in the caller of these
2006    functions.
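
    A minimal sketch using the C ``setjmp`` mentioned above (the caller
    ``@try_it`` is hypothetical, and the ``setjmp`` prototype is simplified):

    .. code-block:: llvm

        declare i32 @setjmp(ptr) returns_twice

        define i32 @try_it(ptr %env) {
          ; This call may return a second time, so the compiler is
          ; conservative about optimizations in @try_it.
          %r = call i32 @setjmp(ptr %env) returns_twice
          ret i32 %r
        }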
2007``safestack``
2008    This attribute indicates that
2009    `SafeStack <https://clang.llvm.org/docs/SafeStack.html>`_
2010    protection is enabled for this function.
2011
2012    If a function that has a ``safestack`` attribute is inlined into a
2013    function that doesn't have a ``safestack`` attribute or which has an
2014    ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
2015    function will have a ``safestack`` attribute.
2016``sanitize_address``
2017    This attribute indicates that AddressSanitizer checks
2018    (dynamic address safety analysis) are enabled for this function.
2019``sanitize_memory``
2020    This attribute indicates that MemorySanitizer checks (dynamic detection
2021    of accesses to uninitialized memory) are enabled for this function.
2022``sanitize_thread``
2023    This attribute indicates that ThreadSanitizer checks
2024    (dynamic thread safety analysis) are enabled for this function.
2025``sanitize_hwaddress``
2026    This attribute indicates that HWAddressSanitizer checks
2027    (dynamic address safety analysis based on tagged pointers) are enabled for
2028    this function.
2029``sanitize_memtag``
2030    This attribute indicates that MemTagSanitizer checks
2031    (dynamic address safety analysis based on Armv8 MTE) are enabled for
2032    this function.
2033``speculative_load_hardening``
2034    This attribute indicates that
2035    `Speculative Load Hardening <https://llvm.org/docs/SpeculativeLoadHardening.html>`_
2036    should be enabled for the function body.
2037
    Speculative Load Hardening is a best-effort mitigation against
    information leak attacks that make use of control flow
    mis-speculation - specifically mis-speculation of whether a branch
    is taken or not. Typically vulnerabilities enabling such attacks are
    classified as "Spectre variant #1". Notably, this does not attempt to
    mitigate against mis-speculation of the branch target, classified as
    "Spectre variant #2" vulnerabilities.
2045
2046    When inlining, the attribute is sticky. Inlining a function that carries
2047    this attribute will cause the caller to gain the attribute. This is intended
2048    to provide a maximally conservative model where the code in a function
2049    annotated with this attribute will always (even after inlining) end up
2050    hardened.
2051``speculatable``
2052    This function attribute indicates that the function does not have any
2053    effects besides calculating its result and does not have undefined behavior.
2054    Note that ``speculatable`` is not enough to conclude that along any
2055    particular execution path the number of calls to this function will not be
2056    externally observable. This attribute is only valid on functions
2057    and declarations, not on individual call sites. If a function is
2058    incorrectly marked as speculatable and really does exhibit
2059    undefined behavior, the undefined behavior may be observed even
2060    if the call site is dead code.
2061
2062``ssp``
2063    This attribute indicates that the function should emit a stack
2064    smashing protector. It is in the form of a "canary" --- a random value
2065    placed on the stack before the local variables that's checked upon
2066    return from the function to see if it has been overwritten. A
2067    heuristic is used to determine if a function needs stack protectors
2068    or not. The heuristic used will enable protectors for functions with:
2069
2070    - Character arrays larger than ``ssp-buffer-size`` (default 8).
2071    - Aggregates containing character arrays larger than ``ssp-buffer-size``.
2072    - Calls to alloca() with variable sizes or constant sizes greater than
2073      ``ssp-buffer-size``.
2074
2075    Variables that are identified as requiring a protector will be arranged
2076    on the stack such that they are adjacent to the stack protector guard.
2077
2078    If a function with an ``ssp`` attribute is inlined into a calling function,
2079    the attribute is not carried over to the calling function.
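
    As a sketch of the heuristic above (the function names are hypothetical),
    a function with a 16-byte character array exceeds the default threshold
    of 8 and would therefore be given a protector:

    .. code-block:: llvm

        declare void @fill(ptr)

        define void @copy_input() ssp {
          ; Character array larger than the default ssp-buffer-size.
          %buf = alloca [16 x i8]
          call void @fill(ptr %buf)
          ret void
        }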
2080
2081``sspstrong``
2082    This attribute indicates that the function should emit a stack smashing
2083    protector. This attribute causes a strong heuristic to be used when
2084    determining if a function needs stack protectors. The strong heuristic
2085    will enable protectors for functions with:
2086
    - Arrays of any size and type.
2088    - Aggregates containing an array of any size and type.
2089    - Calls to alloca().
2090    - Local variables that have had their address taken.
2091
2092    Variables that are identified as requiring a protector will be arranged
2093    on the stack such that they are adjacent to the stack protector guard.
2094    The specific layout rules are:
2095
2096    #. Large arrays and structures containing large arrays
2097       (``>= ssp-buffer-size``) are closest to the stack protector.
2098    #. Small arrays and structures containing small arrays
2099       (``< ssp-buffer-size``) are 2nd closest to the protector.
2100    #. Variables that have had their address taken are 3rd closest to the
2101       protector.
2102
2103    This overrides the ``ssp`` function attribute.
2104
2105    If a function with an ``sspstrong`` attribute is inlined into a calling
2106    function which has an ``ssp`` attribute, the calling function's attribute
2107    will be upgraded to ``sspstrong``.
2108
2109``sspreq``
2110    This attribute indicates that the function should *always* emit a stack
2111    smashing protector. This overrides the ``ssp`` and ``sspstrong`` function
2112    attributes.
2113
2114    Variables that are identified as requiring a protector will be arranged
2115    on the stack such that they are adjacent to the stack protector guard.
2116    The specific layout rules are:
2117
2118    #. Large arrays and structures containing large arrays
2119       (``>= ssp-buffer-size``) are closest to the stack protector.
2120    #. Small arrays and structures containing small arrays
2121       (``< ssp-buffer-size``) are 2nd closest to the protector.
2122    #. Variables that have had their address taken are 3rd closest to the
2123       protector.
2124
2125    If a function with an ``sspreq`` attribute is inlined into a calling
2126    function which has an ``ssp`` or ``sspstrong`` attribute, the calling
2127    function's attribute will be upgraded to ``sspreq``.
2128
2129``strictfp``
2130    This attribute indicates that the function was called from a scope that
2131    requires strict floating-point semantics.  LLVM will not attempt any
2132    optimizations that require assumptions about the floating-point rounding
2133    mode or that might alter the state of floating-point status flags that
2134    might otherwise be set or cleared by calling this function. LLVM will
2135    not introduce any new floating-point instructions that may trap.
2136
2137``"denormal-fp-math"``
2138    This indicates the denormal (subnormal) handling that may be
2139    assumed for the default floating-point environment. This is a
2140    comma separated pair. The elements may be one of ``"ieee"``,
2141    ``"preserve-sign"``, or ``"positive-zero"``. The first entry
2142    indicates the flushing mode for the result of floating point
2143    operations. The second indicates the handling of denormal inputs
2144    to floating point instructions. For compatibility with older
2145    bitcode, if the second value is omitted, both input and output
2146    modes will assume the same mode.
2147
    If this attribute is not specified, the default is
    ``"ieee,ieee"``.
2150
2151    If the output mode is ``"preserve-sign"``, or ``"positive-zero"``,
2152    denormal outputs may be flushed to zero by standard floating-point
2153    operations. It is not mandated that flushing to zero occurs, but if
2154    a denormal output is flushed to zero, it must respect the sign
2155    mode. Not all targets support all modes. While this indicates the
2156    expected floating point mode the function will be executed with,
2157    this does not make any attempt to ensure the mode is
2158    consistent. User or platform code is expected to set the floating
2159    point mode appropriately before function entry.
2160
    If the input mode is ``"preserve-sign"``, or ``"positive-zero"``, a
    floating-point operation must treat any input denormal value as
    zero. In some situations, if an instruction does not respect this
    mode, the input may need to be converted to 0 as if by
    ``@llvm.canonicalize`` during lowering for correctness.
2166
2167``"denormal-fp-math-f32"``
    Same as ``"denormal-fp-math"``, but only controls the behavior of
    the 32-bit float type (or vectors of 32-bit floats). If both
    are present, this overrides ``"denormal-fp-math"``. Not all targets
2171    support separately setting the denormal mode per type, and no
2172    attempt is made to diagnose unsupported uses. Currently this
2173    attribute is respected by the AMDGPU and NVPTX backends.
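
    A sketch of the two attributes used together (the function name
    ``@scale`` is hypothetical). Here operations on ``float`` follow the
    ``-f32`` attribute, while other floating-point types follow
    ``"denormal-fp-math"``:

    .. code-block:: llvm

        ; float: denormal outputs may be flushed to a sign-preserving zero,
        ; and denormal inputs are treated as zero.
        ; Other floating-point types: IEEE handling for outputs and inputs.
        define float @scale(float %x) "denormal-fp-math"="ieee,ieee" "denormal-fp-math-f32"="preserve-sign,preserve-sign" {
          %r = fmul float %x, 5.000000e-01
          ret float %r
        }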
2174
2175``"thunk"``
2176    This attribute indicates that the function will delegate to some other
2177    function with a tail call. The prototype of a thunk should not be used for
2178    optimization purposes. The caller is expected to cast the thunk prototype to
2179    match the thunk target prototype.
2180
2181``"tls-load-hoist"``
    This attribute indicates that the function will try to reduce redundant
    TLS address calculations by hoisting TLS variables.
2184
2185``uwtable[(sync|async)]``
2186    This attribute indicates that the ABI being targeted requires that
    an unwind table entry be produced for this function even if we can
    show that no exception passes through it. This is normally the case for
    the ELF x86-64 ABI, but it can be disabled for some compilation
2190    units. The optional parameter describes what kind of unwind tables
2191    to generate: ``sync`` for normal unwind tables, ``async`` for asynchronous
2192    (instruction precise) unwind tables. Without the parameter, the attribute
2193    ``uwtable`` is equivalent to ``uwtable(async)``.
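
    A short sketch of the parameter forms described above (the function
    names are hypothetical):

    .. code-block:: llvm

        ; Normal (synchronous) unwind tables.
        define void @sync_tables() uwtable(sync) {
          ret void
        }

        ; No parameter: equivalent to uwtable(async).
        define void @default_tables() uwtable {
          ret void
        }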
2194``nocf_check``
    This attribute indicates that no control-flow check will be performed on
    the attributed entity. It disables ``-fcf-protection=<>`` for a specific
    entity, giving fine-grained control over the hardware control-flow
    protection mechanism. The attribute is target independent and currently
    applies to a function or function pointer.
2200``shadowcallstack``
2201    This attribute indicates that the ShadowCallStack checks are enabled for
2202    the function. The instrumentation checks that the return address for the
2203    function has not changed between the function prolog and epilog. It is
2204    currently x86_64-specific.
2205
2206.. _langref_mustprogress:
2207
2208``mustprogress``
2209    This attribute indicates that the function is required to return, unwind,
2210    or interact with the environment in an observable way e.g. via a volatile
2211    memory access, I/O, or other synchronization.  The ``mustprogress``
2212    attribute is intended to model the requirements of the first section of
2213    [intro.progress] of the C++ Standard. As a consequence, a loop in a
    function with the ``mustprogress`` attribute can be assumed to terminate if
    it does not interact with the environment in an observable way, and
    terminating loops without side-effects can be removed. If a ``mustprogress``
    function does not satisfy this contract, the behavior is undefined.  This
    attribute does not apply transitively to callees, but does apply to call
    sites within the function. Note that ``willreturn`` implies ``mustprogress``.
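
    As a sketch of the consequence described above (the function name
    ``@spin`` is hypothetical), the loop below has no observable side effects,
    so in a ``mustprogress`` function the optimizer may assume that it
    terminates:

    .. code-block:: llvm

        define void @spin(i1 %keep_going) mustprogress {
        entry:
          br label %loop

        loop:
          ; No volatile access, I/O, or synchronization happens here. If
          ; %keep_going were true, the loop would never exit, which would
          ; violate the mustprogress contract (undefined behavior).
          br i1 %keep_going, label %loop, label %exit

        exit:
          ret void
        }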
2220``"warn-stack-size"="<threshold>"``
2221    This attribute sets a threshold to emit diagnostics once the frame size is
2222    known should the frame size exceed the specified value.  It takes one
    required integer value, which should be a non-negative integer, and less
    than ``UINT_MAX``.  It's unspecified which threshold will be used when
2225    duplicate definitions are linked together with differing values.
2226``vscale_range(<min>[, <max>])``
2227    This attribute indicates the minimum and maximum vscale value for the given
2228    function. The min must be greater than 0. A maximum value of 0 means
2229    unbounded. If the optional max value is omitted then max is set to the
2230    value of min. If the attribute is not present, no assumptions are made
2231    about the range of vscale.
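
    The three forms described above might be written as in this sketch
    (the function names are hypothetical):

    .. code-block:: llvm

        ; vscale is at least 1 and at most 16.
        define void @bounded() vscale_range(1,16) {
          ret void
        }

        ; max omitted: max is set to min, so vscale is exactly 2.
        define void @exact() vscale_range(2) {
          ret void
        }

        ; max of 0: vscale is at least 2, with no upper bound.
        define void @unbounded() vscale_range(2,0) {
          ret void
        }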
2232``"min-legal-vector-width"="<size>"``
    This attribute indicates the minimum legal vector width required by the
    calling convention. It is the maximum width of the vector arguments and
    return values of the function and of the functions it calls, because all
    of these vectors are expected to be of a legal type for compatibility.
    Backends are free to ignore the attribute if they don't need to support
    different maximum legal vector types or such information can be inferred by
    other attributes.
2240
2241Call Site Attributes
2242----------------------
2243
2244In addition to function attributes the following call site only
2245attributes are supported:
2246
2247``vector-function-abi-variant``
    This attribute can be attached to a :ref:`call <i_call>` to list
    the vector functions associated with the function. Notice that the
    attribute cannot be attached to an :ref:`invoke <i_invoke>` or a
2251    :ref:`callbr <i_callbr>` instruction. The attribute consists of a
2252    comma separated list of mangled names. The order of the list does
2253    not imply preference (it is logically a set). The compiler is free
2254    to pick any listed vector function of its choosing.
2255
    The syntax for the mangled names is as follows::
2257
2258        _ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]
2259
2260    When present, the attribute informs the compiler that the function
2261    ``<scalar_name>`` has a corresponding vector variant that can be
2262    used to perform the concurrent invocation of ``<scalar_name>`` on
2263    vectors. The shape of the vector function is described by the
2264    tokens between the prefix ``_ZGV`` and the ``<scalar_name>``
2265    token. The standard name of the vector function is
2266    ``_ZGV<isa><mask><vlen><parameters>_<scalar_name>``. When present,
2267    the optional token ``(<vector_redirection>)`` informs the compiler
2268    that a custom name is provided in addition to the standard one
2269    (custom names can be provided for example via the use of ``declare
2270    variant`` in OpenMP 5.0). The declaration of the variant must be
2271    present in the IR Module. The signature of the vector variant is
2272    determined by the rules of the Vector Function ABI (VFABI)
2273    specifications of the target. For Arm and X86, the VFABI can be
2274    found at https://github.com/ARM-software/abi-aa and
2275    https://software.intel.com/content/www/us/en/develop/download/vector-simd-function-abi.html,
2276    respectively.
2277
2278    For X86 and Arm targets, the values of the tokens in the standard
2279    name are those that are defined in the VFABI. LLVM has an internal
2280    ``<isa>`` token that can be used to create scalar-to-vector
    mappings for functions that are not directly associated with any of
    the target ISAs (for example, some of the mappings stored in the
    TargetLibraryInfo). Valid values for the ``<isa>`` token are::
2284
2285        <isa>:= b | c | d | e  -> X86 SSE, AVX, AVX2, AVX512
2286              | n | s          -> Armv8 Advanced SIMD, SVE
2287              | __LLVM__       -> Internal LLVM Vector ISA
2288
    For all targets currently supported (X86, Arm and Internal LLVM),
    the remaining tokens can have the following values::
2291
2292        <mask>:= M | N         -> mask | no mask
2293
2294        <vlen>:= number        -> number of lanes
2295               | x             -> VLA (Vector Length Agnostic)
2296
2297        <parameters>:= v              -> vector
2298                     | l | l <number> -> linear
2299                     | R | R <number> -> linear with ref modifier
2300                     | L | L <number> -> linear with val modifier
2301                     | U | U <number> -> linear with uval modifier
2302                     | ls <pos>       -> runtime linear
2303                     | Rs <pos>       -> runtime linear with ref modifier
2304                     | Ls <pos>       -> runtime linear with val modifier
2305                     | Us <pos>       -> runtime linear with uval modifier
2306                     | u              -> uniform
2307
2308        <scalar_name>:= name of the scalar function
2309
2310        <vector_redirection>:= optional, custom name of the vector function
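
    As a hedged illustration of the mangling rules above, the name
    ``_ZGVnN2v_sin`` decodes as ``<isa>`` = ``n`` (Armv8 Advanced SIMD),
    ``<mask>`` = ``N`` (no mask), ``<vlen>`` = ``2`` lanes, one ``v``
    (vector) parameter, and the scalar name ``sin``. A call site listing
    this variant might look like the following sketch (the caller ``@f`` is
    hypothetical):

    .. code-block:: llvm

        declare double @sin(double)

        ; The declaration of the variant must be present in the module.
        declare <2 x double> @_ZGVnN2v_sin(<2 x double>)

        define double @f(double %x) {
          %r = call double @sin(double %x) #0
          ret double %r
        }

        ; No (<vector_redirection>) is given, so only the standard name
        ; is provided.
        attributes #0 = { "vector-function-abi-variant"="_ZGVnN2v_sin" }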
2311
2312``preallocated(<ty>)``
2313    This attribute is required on calls to ``llvm.call.preallocated.arg``
2314    and cannot be used on any other call. See
2315    :ref:`llvm.call.preallocated.arg<int_call_preallocated_arg>` for more
2316    details.
2317
2318.. _glattrs:
2319
2320Global Attributes
2321-----------------
2322
2323Attributes may be set to communicate additional information about a global variable.
2324Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
2325are grouped into a single :ref:`attribute group <attrgrp>`.
2326
2327``no_sanitize_address``
2328    This attribute indicates that the global variable should not have
2329    AddressSanitizer instrumentation applied to it, because it was annotated
    with ``__attribute__((no_sanitize("address")))``,
    ``__attribute__((disable_sanitizer_instrumentation))``, or included in the
    ``-fsanitize-ignorelist`` file.
2333``no_sanitize_hwaddress``
2334    This attribute indicates that the global variable should not have
2335    HWAddressSanitizer instrumentation applied to it, because it was annotated
    with ``__attribute__((no_sanitize("hwaddress")))``,
    ``__attribute__((disable_sanitizer_instrumentation))``, or included in the
    ``-fsanitize-ignorelist`` file.
2339``sanitize_memtag``
2340    This attribute indicates that the global variable should have AArch64 memory
2341    tags (MTE) instrumentation applied to it. This attribute causes the
2342    suppression of certain optimisations, like GlobalMerge, as well as ensuring
2343    extra directives are emitted in the assembly and extra bits of metadata are
2344    placed in the object file so that the linker can ensure the accesses are
2345    protected by MTE. This attribute is added by clang when
    ``-fsanitize=memtag-globals`` is provided, as long as the global is not marked
    with ``__attribute__((no_sanitize("memtag")))``,
    ``__attribute__((disable_sanitizer_instrumentation))``, or included in the
    ``-fsanitize-ignorelist`` file. The AArch64 Globals Tagging pass may remove
2350    this attribute when it's not possible to tag the global (e.g. it's a TLS
2351    variable).
2352``sanitize_address_dyninit``
2353    This attribute indicates that the global variable, when instrumented with
2354    AddressSanitizer, should be checked for ODR violations. This attribute is
2355    applied to global variables that are dynamically initialized according to
2356    C++ rules.
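
As a sketch (the global names are hypothetical), these attributes are written
after the initializer in a global variable definition:

.. code-block:: llvm

    ; Excluded from AddressSanitizer instrumentation.
    @raw_buffer = global [16 x i8] zeroinitializer, no_sanitize_address

    ; Requests MTE tagging for this global.
    @tagged_counter = global i32 0, sanitize_memtag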
2357
2358.. _opbundles:
2359
2360Operand Bundles
2361---------------
2362
2363Operand bundles are tagged sets of SSA values that can be associated
with certain LLVM instructions (currently only ``call``\ s and
``invoke``\ s).  In a way they are like metadata, but dropping them is
2366incorrect and will change program semantics.
2367
2368Syntax::
2369
2370    operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
2371    operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
2372    bundle operand ::= SSA value
2373    tag ::= string constant
2374
2375Operand bundles are **not** part of a function's signature, and a
2376given function may be called from multiple places with different kinds
2377of operand bundles.  This reflects the fact that the operand bundles
2378are conceptually a part of the ``call`` (or ``invoke``), not the
2379callee being dispatched to.
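
To illustrate, the sketch below calls one function from several call sites
with different operand bundles. The tags ``"tag-a"`` and ``"tag-b"`` are
hypothetical and stand for bundle kinds unknown to LLVM:

.. code-block:: llvm

    declare void @callee(i32)

    define void @caller(i32 %a, ptr %p) {
      ; One bundle with a single operand.
      call void @callee(i32 %a) [ "tag-a"(i32 %a) ]
      ; Two bundles; the first carries two operands.
      call void @callee(i32 %a) [ "tag-a"(i32 %a, ptr %p), "tag-b"(i32 0) ]
      ; No bundle at all; the callee's signature is the same in every case.
      call void @callee(i32 %a)
      ret void
    }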
2380
2381Operand bundles are a generic mechanism intended to support
2382runtime-introspection-like functionality for managed languages.  While
2383the exact semantics of an operand bundle depend on the bundle tag,
2384there are certain limitations to how much the presence of an operand
2385bundle can influence the semantics of a program.  These restrictions
2386are described as the semantics of an "unknown" operand bundle.  As
2387long as the behavior of an operand bundle is describable within these
2388restrictions, LLVM does not need to have special knowledge of the
2389operand bundle to not miscompile programs containing it.
2390
2391- The bundle operands for an unknown operand bundle escape in unknown
2392  ways before control is transferred to the callee or invokee.
2393- Calls and invokes with operand bundles have unknown read / write
2394  effect on the heap on entry and exit (even if the call target is
2395  ``readnone`` or ``readonly``), unless they're overridden with
2396  callsite specific attributes.
2397- An operand bundle at a call site cannot change the implementation
2398  of the called function.  Inter-procedural optimizations work as
2399  usual as long as they take into account the first two properties.
2400
2401More specific types of operand bundles are described below.
2402
2403.. _deopt_opbundles:
2404
2405Deoptimization Operand Bundles
2406^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2407
2408Deoptimization operand bundles are characterized by the ``"deopt"``
2409operand bundle tag.  These operand bundles represent an alternate
2410"safe" continuation for the call site they're attached to, and can be
2411used by a suitable runtime to deoptimize the compiled frame at the
2412specified call site.  There can be at most one ``"deopt"`` operand
bundle attached to a call site.  The exact details of deoptimization are
2414out of scope for the language reference, but it usually involves
2415rewriting a compiled frame into a set of interpreted frames.
2416
2417From the compiler's perspective, deoptimization operand bundles make
2418the call sites they're attached to at least ``readonly``.  They read
2419through all of their pointer typed operands (even if they're not
2420otherwise escaped) and the entire visible heap.  Deoptimization
2421operand bundles do not capture their operands except during
2422deoptimization, in which case control will not be returned to the
2423compiled frame.
2424
2425The inliner knows how to inline through calls that have deoptimization
2426operand bundles.  Just like inlining through a normal call site
2427involves composing the normal and exceptional continuations, inlining
2428through a call site with a deoptimization operand bundle needs to
2429appropriately compose the "safe" deoptimization continuation.  The
2430inliner does this by prepending the parent's deoptimization
2431continuation to every deoptimization continuation in the inlined body.
2432E.g. inlining ``@f`` into ``@g`` in the following example
2433
2434.. code-block:: llvm
2435
2436    define void @f() {
2437      call void @x()  ;; no deopt state
2438      call void @y() [ "deopt"(i32 10) ]
2439      call void @y() [ "deopt"(i32 10), "unknown"(ptr null) ]
2440      ret void
2441    }
2442
2443    define void @g() {
2444      call void @f() [ "deopt"(i32 20) ]
2445      ret void
2446    }
2447
2448will result in
2449
2450.. code-block:: llvm
2451
2452    define void @g() {
2453      call void @x()  ;; still no deopt state
2454      call void @y() [ "deopt"(i32 20, i32 10) ]
2455      call void @y() [ "deopt"(i32 20, i32 10), "unknown"(ptr null) ]
2456      ret void
2457    }
2458
2459It is the frontend's responsibility to structure or encode the
2460deoptimization state in a way that syntactically prepending the
2461caller's deoptimization state to the callee's deoptimization state is
2462semantically equivalent to composing the caller's deoptimization
2463continuation after the callee's deoptimization continuation.
2464
2465.. _ob_funclet:
2466
2467Funclet Operand Bundles
2468^^^^^^^^^^^^^^^^^^^^^^^
2469
2470Funclet operand bundles are characterized by the ``"funclet"``
2471operand bundle tag.  These operand bundles indicate that a call site
2472is within a particular funclet.  There can be at most one
2473``"funclet"`` operand bundle attached to a call site and it must have
2474exactly one bundle operand.
2475
2476If any funclet EH pads have been "entered" but not "exited" (per the
2477`description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
2478it is undefined behavior to execute a ``call`` or ``invoke`` which:
2479
2480* does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
2481  intrinsic, or
2482* has a ``"funclet"`` bundle whose operand is not the most-recently-entered
2483  not-yet-exited funclet EH pad.
2484
2485Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
2486executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.
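
As a minimal sketch of the rules above, a call made from inside a cleanup
funclet carries a ``"funclet"`` bundle naming the enclosing pad. The function
names ``@may_throw`` and ``@cleanup`` are hypothetical, and the MSVC C++
personality is assumed purely for illustration:

.. code-block:: llvm

    declare void @may_throw()
    declare void @cleanup()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality ptr @__CxxFrameHandler3 {
    entry:
      ; No funclet EH pad has been entered here, so no "funclet" bundle.
      invoke void @may_throw()
              to label %done unwind label %pad
    pad:
      %cp = cleanuppad within none []
      ; %cp is the most-recently-entered, not-yet-exited funclet EH pad.
      call void @cleanup() [ "funclet"(token %cp) ]
      cleanupret from %cp unwind to caller
    done:
      ret void
    }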
2487
2488GC Transition Operand Bundles
2489^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2490
2491GC transition operand bundles are characterized by the
2492``"gc-transition"`` operand bundle tag. These operand bundles mark a
2493call as a transition between a function with one GC strategy to a
2494function with a different GC strategy. If coordinating the transition
2495between GC strategies requires additional code generation at the call
2496site, these bundles may contain any values that are needed by the
2497generated code.  For more details, see :ref:`GC Transitions
2498<gc_transition_args>`.
2499
The bundle contains an arbitrary list of Values which need to be passed
2501to GC transition code. They will be lowered and passed as operands to
2502the appropriate GC_TRANSITION nodes in the selection DAG. It is assumed
2503that these arguments must be available before and after (but not
2504necessarily during) the execution of the callee.
2505
2506.. _assume_opbundles:
2507
2508Assume Operand Bundles
2509^^^^^^^^^^^^^^^^^^^^^^
2510
Operand bundles on an :ref:`llvm.assume <int_assume>` allow representing
2512assumptions that a :ref:`parameter attribute <paramattrs>` or a
2513:ref:`function attribute <fnattrs>` holds for a certain value at a certain
2514location. Operand bundles enable assumptions that are either hard or impossible
2515to represent as a boolean argument of an :ref:`llvm.assume <int_assume>`.
2516
2517An assume operand bundle has the form:
2518
::

      "<tag>"([ <holds for value> [, <attribute argument>] ])
2522
* The tag of the operand bundle is usually the name of the attribute that can be
  assumed to hold. It can also be ``ignore``; this tag doesn't contain any
  information and should be ignored.
* The first argument, if present, is the value for which the attribute holds.
* The second argument, if present, is an argument of the attribute.
2528
2529If there are no arguments the attribute is a property of the call location.
2530
2531For example:
2532
.. code-block:: llvm

      call void @llvm.assume(i1 true) ["align"(ptr %val, i32 8)]
2536
allows the optimizer to assume that, at the location of the call to
:ref:`llvm.assume <int_assume>`, ``%val`` has an alignment of at least 8.
2539
.. code-block:: llvm

      call void @llvm.assume(i1 %cond) ["cold"(), "nonnull"(ptr %val)]
2543
2544allows the optimizer to assume that the :ref:`llvm.assume <int_assume>`
2545call location is cold and that ``%val`` may not be null.
2546
2547Just like for the argument of :ref:`llvm.assume <int_assume>`, if any of the
2548provided guarantees are violated at runtime the behavior is undefined.
2549
2550While attributes expect constant arguments, assume operand bundles may be
2551provided a dynamic value, for example:
2552
.. code-block:: llvm

      call void @llvm.assume(i1 true) ["align"(ptr %val, i32 %align)]
2556
2557If the operand bundle value violates any requirements on the attribute value,
2558the behavior is undefined, unless one of the following exceptions applies:
2559
* ``"align"`` operand bundles may specify a non-power-of-two alignment
2561  (including a zero alignment). If this is the case, then the pointer value
2562  must be a null pointer, otherwise the behavior is undefined.
2563
2564Even if the assumed property can be encoded as a boolean value, like
2565``nonnull``, using operand bundles to express the property can still have
2566benefits:
2567
2568* Attributes that can be expressed via operand bundles are directly the
2569  property that the optimizer uses and cares about. Encoding attributes as
2570  operand bundles removes the need for an instruction sequence that represents
  the property (e.g., ``icmp ne ptr %p, null`` for ``nonnull``) and for the
2572  optimizer to deduce the property from that instruction sequence.
2573* Expressing the property using operand bundles makes it easy to identify the
2574  use of the value as a use in an :ref:`llvm.assume <int_assume>`. This then
  simplifies and improves heuristics, e.g., for "use-sensitive"
2576  optimizations.
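
To make the comparison concrete, here is a small sketch of the same
``nonnull`` assumption in both encodings. The function names are hypothetical
and serve only to keep the two forms apart:

.. code-block:: llvm

    declare void @llvm.assume(i1)

    define void @boolean_form(ptr %p) {
      ; The property is encoded as an instruction sequence feeding the
      ; boolean argument; the optimizer has to recognize the pattern.
      %c = icmp ne ptr %p, null
      call void @llvm.assume(i1 %c)
      ret void
    }

    define void @bundle_form(ptr %p) {
      ; The property is stated directly, and the use of %p is easy to
      ; identify as a use in an llvm.assume.
      call void @llvm.assume(i1 true) ["nonnull"(ptr %p)]
      ret void
    }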
2577
2578.. _ob_preallocated:
2579
2580Preallocated Operand Bundles
2581^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2582
2583Preallocated operand bundles are characterized by the ``"preallocated"``
2584operand bundle tag.  These operand bundles allow separation of the allocation
2585of the call argument memory from the call site.  This is necessary to pass
2586non-trivially copyable objects by value in a way that is compatible with MSVC
2587on some targets.  There can be at most one ``"preallocated"`` operand bundle
2588attached to a call site and it must have exactly one bundle operand, which is
2589a token generated by ``@llvm.call.preallocated.setup``.  A call with this
2590operand bundle should not adjust the stack before entering the function, as
2591that will have been done by one of the ``@llvm.call.preallocated.*`` intrinsics.
2592
.. code-block:: llvm

      %foo = type { i64, i32 }

      ...

      %t = call token @llvm.call.preallocated.setup(i32 1)
      %a = call ptr @llvm.call.preallocated.arg(token %t, i32 0) preallocated(%foo)
      ; initialize %a
      call void @bar(i32 42, ptr preallocated(%foo) %a) ["preallocated"(token %t)]
2603
2604.. _ob_gc_live:
2605
2606GC Live Operand Bundles
2607^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2608
2609A "gc-live" operand bundle is only valid on a :ref:`gc.statepoint <gc_statepoint>`
2610intrinsic. The operand bundle must contain every pointer to a garbage collected
2611object which potentially needs to be updated by the garbage collector.
2612
2613When lowered, any relocated value will be recorded in the corresponding
2614:ref:`stackmap entry <statepoint-stackmap-format>`.  See the intrinsic description
2615for further details.
2616
2617ObjC ARC Attached Call Operand Bundles
2618^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2619
2620A ``"clang.arc.attachedcall"`` operand bundle on a call indicates the call is
2621implicitly followed by a marker instruction and a call to an ObjC runtime
2622function that uses the result of the call. The operand bundle takes a mandatory
2623pointer to the runtime function (``@objc_retainAutoreleasedReturnValue`` or
2624``@objc_unsafeClaimAutoreleasedReturnValue``).
2625The return value of a call with this bundle is used by a call to
2626``@llvm.objc.clang.arc.noop.use`` unless the called function's return type is
2627void, in which case the operand bundle is ignored.
2628
.. code-block:: llvm

   ; The marker instruction and a runtime function call are inserted after the call
   ; to @foo.
   call ptr @foo() [ "clang.arc.attachedcall"(ptr @objc_retainAutoreleasedReturnValue) ]
   call ptr @foo() [ "clang.arc.attachedcall"(ptr @objc_unsafeClaimAutoreleasedReturnValue) ]
2635
2636The operand bundle is needed to ensure the call is immediately followed by the
2637marker instruction and the ObjC runtime call in the final output.
2638
2639.. _ob_ptrauth:
2640
2641Pointer Authentication Operand Bundles
2642^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2643
2644Pointer Authentication operand bundles are characterized by the
2645``"ptrauth"`` operand bundle tag.  They are described in the
2646`Pointer Authentication <PointerAuth.html#operand-bundle>`__ document.
2647
2648.. _moduleasm:
2649
2650Module-Level Inline Assembly
2651----------------------------
2652
Modules may contain "module-level inline asm" blocks, which correspond
2654to the GCC "file scope inline asm" blocks. These blocks are internally
2655concatenated by LLVM and treated as a single unit, but may be separated
2656in the ``.ll`` file if desired. The syntax is very simple:
2657
.. code-block:: llvm

    module asm "inline asm code goes here"
    module asm "more can go here"
2662
2663The strings can contain any character by escaping non-printable
2664characters. The escape sequence used is simply "\\xx" where "xx" is the
2665two digit hex code for the number.
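
As a short sketch of the escape syntax, the string below embeds a tab and a
newline. The symbol names are hypothetical, and the example assumes a target
whose assembler accepts the ``.globl`` directive:

.. code-block:: llvm

    ; "\09" is the two-digit hex escape for a tab character and "\0A" for a
    ; newline, so this single string holds two lines of assembly.
    module asm "\09.globl my_symbol\0A\09.globl other_symbol"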
2666
2667Note that the assembly string *must* be parseable by LLVM's integrated assembler
2668(unless it is disabled), even when emitting a ``.s`` file.
2669
2670.. _langref_datalayout:
2671
2672Data Layout
2673-----------
2674
2675A module may specify a target specific data layout string that specifies
2676how data is to be laid out in memory. The syntax for the data layout is
2677simply:
2678
.. code-block:: llvm

    target datalayout = "layout specification"
2682
2683The *layout specification* consists of a list of specifications
2684separated by the minus sign character ('-'). Each specification starts
2685with a letter and may include other information after the letter to
2686define some aspect of the data layout. The specifications accepted are
2687as follows:
2688
2689``E``
2690    Specifies that the target lays out data in big-endian form. That is,
2691    the bits with the most significance have the lowest address
2692    location.
2693``e``
2694    Specifies that the target lays out data in little-endian form. That
2695    is, the bits with the least significance have the lowest address
2696    location.
2697``S<size>``
2698    Specifies the natural alignment of the stack in bits. Alignment
2699    promotion of stack variables is limited to the natural stack
2700    alignment to avoid dynamic stack realignment. The stack alignment
2701    must be a multiple of 8-bits. If omitted, the natural stack
2702    alignment defaults to "unspecified", which does not prevent any
2703    alignment promotions.
2704``P<address space>``
2705    Specifies the address space that corresponds to program memory.
2706    Harvard architectures can use this to specify what space LLVM
2707    should place things such as functions into. If omitted, the
2708    program memory space defaults to the default address space of 0,
2709    which corresponds to a Von Neumann architecture that has code
2710    and data in the same space.
2711``G<address space>``
2712    Specifies the address space to be used by default when creating global
2713    variables. If omitted, the globals address space defaults to the default
2714    address space 0.
2715    Note: variable declarations without an address space are always created in
    address space 0; this property only affects the default value to be used
2717    when creating globals without additional contextual information (e.g. in
2718    LLVM passes).
2719``A<address space>``
2720    Specifies the address space of objects created by '``alloca``'.
2721    Defaults to the default address space of 0.
2722``p[n]:<size>:<abi>[:<pref>][:<idx>]``
2723    This specifies the *size* of a pointer and its ``<abi>`` and
2724    ``<pref>``\erred alignments for address space ``n``. ``<pref>`` is optional
2725    and defaults to ``<abi>``. The fourth parameter ``<idx>`` is the size of the
    index that is used for address calculation. If not
2727    specified, the default index size is equal to the pointer size. All sizes
2728    are in bits. The address space, ``n``, is optional, and if not specified,
2729    denotes the default address space 0. The value of ``n`` must be
2730    in the range [1,2^23).
2731``i<size>:<abi>[:<pref>]``
2732    This specifies the alignment for an integer type of a given bit
2733    ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
2734    ``<pref>`` is optional and defaults to ``<abi>``.
2735``v<size>:<abi>[:<pref>]``
2736    This specifies the alignment for a vector type of a given bit
2737    ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
2738    ``<pref>`` is optional and defaults to ``<abi>``.
2739``f<size>:<abi>[:<pref>]``
2740    This specifies the alignment for a floating-point type of a given bit
2741    ``<size>``. Only values of ``<size>`` that are supported by the target
2742    will work. 32 (float) and 64 (double) are supported on all targets; 80
2743    or 128 (different flavors of long double) are also supported on some
2744    targets. The value of ``<size>`` must be in the range [1,2^23).
2745    ``<pref>`` is optional and defaults to ``<abi>``.
2746``a:<abi>[:<pref>]``
2747    This specifies the alignment for an object of aggregate type.
2748    ``<pref>`` is optional and defaults to ``<abi>``.
2749``F<type><abi>``
2750    This specifies the alignment for function pointers.
2751    The options for ``<type>`` are:
2752
2753    * ``i``: The alignment of function pointers is independent of the alignment
2754      of functions, and is a multiple of ``<abi>``.
2755    * ``n``: The alignment of function pointers is a multiple of the explicit
2756      alignment specified on the function, and is a multiple of ``<abi>``.
2757``m:<mangling>``
2758    If present, specifies that llvm names are mangled in the output. Symbols
2759    prefixed with the mangling escape character ``\01`` are passed through
2760    directly to the assembler without the escape character. The mangling style
2761    options are
2762
2763    * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
2764    * ``l``: GOFF mangling: Private symbols get a ``@`` prefix.
2765    * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
2766    * ``o``: Mach-O mangling: Private symbols get ``L`` prefix. Other
2767      symbols get a ``_`` prefix.
2768    * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
2769      Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
2770      ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
2771      ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
2772      starting with ``?`` are not mangled in any way.
2773    * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
2774      symbols do not receive a ``_`` prefix.
2775    * ``a``: XCOFF mangling: Private symbols get a ``L..`` prefix.
2776``n<size1>:<size2>:<size3>...``
2777    This specifies a set of native integer widths for the target CPU in
2778    bits. For example, it might contain ``n32`` for 32-bit PowerPC,
2779    ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
2780    this set are considered to support most general arithmetic operations
2781    efficiently.
2782``ni:<address space0>:<address space1>:<address space2>...``
2783    This specifies pointer types with the specified address spaces
2784    as :ref:`Non-Integral Pointer Type <nointptrtype>` s.  The ``0``
2785    address space cannot be specified as non-integral.
2786
2787On every specification that takes a ``<abi>:<pref>``, specifying the
2788``<pref>`` alignment is optional. If omitted, the preceding ``:``
2789should be omitted too and ``<pref>`` will be equal to ``<abi>``.
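
As an illustration of how the specifications combine, the string below is
believed to be the layout that LLVM 15 uses for x86-64 Linux. Treat the exact
string as an assumption and check it against the target in use; the comments
decode each component using the rules above:

.. code-block:: llvm

    ; e             little-endian
    ; m:e           ELF mangling
    ; p270:32:32    pointers in address space 270 are 32 bits wide, 32-bit aligned
    ; p271:32:32    likewise for address space 271
    ; p272:64:64    pointers in address space 272 are 64 bits wide, 64-bit aligned
    ; i64:64        i64 has 64-bit ABI alignment; <pref> defaults to <abi>
    ; f80:128       the 80-bit floating-point type is 128-bit aligned
    ; n8:16:32:64   native integer widths
    ; S128          natural stack alignment is 128 bits
    target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

Pointers in the default address space are not listed in this string.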
2790
2791When constructing the data layout for a given target, LLVM starts with a
2792default set of specifications which are then (possibly) overridden by
2793the specifications in the ``datalayout`` keyword. The default
2794specifications are given in this list:
2795
2796-  ``e`` - little endian
2797-  ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
2798-  ``p[n]:64:64:64`` - Other address spaces are assumed to be the
2799   same as the default address space.
2800-  ``S0`` - natural stack alignment is unspecified
2801-  ``i1:8:8`` - i1 is 8-bit (byte) aligned
2802-  ``i8:8:8`` - i8 is 8-bit (byte) aligned
2803-  ``i16:16:16`` - i16 is 16-bit aligned
2804-  ``i32:32:32`` - i32 is 32-bit aligned
2805-  ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
2806   alignment of 64-bits
2807-  ``f16:16:16`` - half is 16-bit aligned
2808-  ``f32:32:32`` - float is 32-bit aligned
2809-  ``f64:64:64`` - double is 64-bit aligned
2810-  ``f128:128:128`` - quad is 128-bit aligned
2811-  ``v64:64:64`` - 64-bit vector is 64-bit aligned
2812-  ``v128:128:128`` - 128-bit vector is 128-bit aligned
2813-  ``a:0:64`` - aggregates are 64-bit aligned
2814
2815When LLVM is determining the alignment for a given type, it uses the
2816following rules:
2817
2818#. If the type sought is an exact match for one of the specifications,
2819   that specification is used.
2820#. If no match is found, and the type sought is an integer type, then
2821   the smallest integer type that is larger than the bitwidth of the
2822   sought type is used. If none of the specifications are larger than
2823   the bitwidth then the largest integer type is used. For example,
2824   given the default specifications above, the i7 type will use the
2825   alignment of i8 (next largest) while both i65 and i256 will use the
2826   alignment of i64 (largest specified).
2827
2828The function of the data layout string may not be what you expect.
2829Notably, this is not a specification from the frontend of what alignment
2830the code generator should use.
2831
2832Instead, if specified, the target data layout is required to match what
2833the ultimate *code generator* expects. This string is used by the
2834mid-level optimizers to improve code, and this only works if it matches
2835what the ultimate code generator uses. There is no way to generate IR
2836that does not embed this target-specific detail into the IR. If you
2837don't specify the string, the default specifications will be used to
2838generate a Data Layout and the optimization phases will operate
2839accordingly and introduce target specificity into the IR with respect to
2840these default specifications.
2841
2842.. _langref_triple:
2843
2844Target Triple
2845-------------
2846
2847A module may specify a target triple string that describes the target
2848host. The syntax for the target triple is simply:
2849
.. code-block:: llvm

    target triple = "x86_64-apple-macosx10.7.0"
2853
2854The *target triple* string consists of a series of identifiers delimited
2855by the minus sign character ('-'). The canonical forms are:
2856
::

    ARCHITECTURE-VENDOR-OPERATING_SYSTEM
    ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
2861
2862This information is passed along to the backend so that it generates
2863code for the proper architecture. It's possible to override this on the
2864command line with the ``-mtriple`` command line option.
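
For instance, a triple in the four-component form might look like the
following. The specific triple is only an example and is not taken from the
text above:

.. code-block:: llvm

    ; ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
    target triple = "x86_64-unknown-linux-gnu"

A tool such as ``llc`` could then be pointed at a different target with, for
example, ``-mtriple=aarch64-unknown-linux-gnu``.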
2865
2866.. _objectlifetime:
2867
2868Object Lifetime
2869----------------------
2870
2871A memory object, or simply object, is a region of a memory space that is
2872reserved by a memory allocation such as :ref:`alloca <i_alloca>`, heap
2873allocation calls, and global variable definitions.
2874Once it is allocated, the bytes stored in the region can only be read or written
2875through a pointer that is :ref:`based on <pointeraliasing>` the allocation
2876value.
2877If a pointer that is not based on the object tries to read or write to the
2878object, it is undefined behavior.
2879
The lifetime of a memory object is a property that determines its accessibility.
Unless stated otherwise, a memory object is alive from its allocation until its
deallocation, and dead afterwards.
2883It is undefined behavior to access a memory object that isn't alive, but
2884operations that don't dereference it such as
2885:ref:`getelementptr <i_getelementptr>`, :ref:`ptrtoint <i_ptrtoint>` and
2886:ref:`icmp <i_icmp>` return a valid result.
2887This explains code motion of these instructions across operations that
2888impact the object's lifetime.
2889A stack object's lifetime can be explicitly specified using
2890:ref:`llvm.lifetime.start <int_lifestart>` and
2891:ref:`llvm.lifetime.end <int_lifeend>` intrinsic function calls.
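
A minimal sketch of an explicitly delimited stack object lifetime follows. The
function name is hypothetical, and the intrinsic declarations assume the
opaque-pointer name mangling (``.p0``):

.. code-block:: llvm

    declare void @llvm.lifetime.start.p0(i64 immarg, ptr nocapture)
    declare void @llvm.lifetime.end.p0(i64 immarg, ptr nocapture)

    define i1 @f() {
      %x = alloca i32
      call void @llvm.lifetime.start.p0(i64 4, ptr %x)
      store i32 1, ptr %x          ; %x is alive: the access is fine
      call void @llvm.lifetime.end.p0(i64 4, ptr %x)
      ; %x is dead from here on, so loading from or storing to it would be
      ; undefined behavior. Comparing the pointer does not dereference it
      ; and still returns a valid result.
      %c = icmp eq ptr %x, null
      ret i1 %c
    }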
2892
2893.. _pointeraliasing:
2894
2895Pointer Aliasing Rules
2896----------------------
2897
2898Any memory access must be done through a pointer value associated with
2899an address range of the memory access, otherwise the behavior is
2900undefined. Pointer values are associated with address ranges according
2901to the following rules:
2902
2903-  A pointer value is associated with the addresses associated with any
2904   value it is *based* on.
2905-  An address of a global variable is associated with the address range
2906   of the variable's storage.
2907-  The result value of an allocation instruction is associated with the
2908   address range of the allocated storage.
2909-  A null pointer in the default address-space is associated with no
2910   address.
2911-  An :ref:`undef value <undefvalues>` in *any* address-space is
2912   associated with no address.
2913-  An integer constant other than zero or a pointer value returned from
2914   a function not defined within LLVM may be associated with address
2915   ranges allocated through mechanisms other than those provided by
2916   LLVM. Such ranges shall not overlap with any ranges of addresses
2917   allocated by mechanisms provided by LLVM.
2918
2919A pointer value is *based* on another pointer value according to the
2920following rules:
2921
2922-  A pointer value formed from a scalar ``getelementptr`` operation is *based* on
2923   the pointer-typed operand of the ``getelementptr``.
2924-  The pointer in lane *l* of the result of a vector ``getelementptr`` operation
2925   is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
2926   of the ``getelementptr``.
2927-  The result value of a ``bitcast`` is *based* on the operand of the
2928   ``bitcast``.
2929-  A pointer value formed by an ``inttoptr`` is *based* on all pointer
2930   values that contribute (directly or indirectly) to the computation of
2931   the pointer's value.
2932-  The "*based* on" relationship is transitive.
2933
2934Note that this definition of *"based"* is intentionally similar to the
2935definition of *"based"* in C99, though it is slightly weaker.
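
The following sketch traces the *based* on relationship through a
``getelementptr`` and a ``ptrtoint``/``inttoptr`` round trip. The function and
value names are illustrative only:

.. code-block:: llvm

    define i32 @f() {
      %a = alloca [4 x i32]
      ; %p is based on %a (scalar getelementptr).
      %p = getelementptr inbounds [4 x i32], ptr %a, i64 0, i64 1
      ; %q is based on %p, and transitively on %a, because %p contributes
      ; to the integer from which %q is formed.
      %i = ptrtoint ptr %p to i64
      %q = inttoptr i64 %i to ptr
      store i32 7, ptr %p
      ; %q is associated with the address range of %a's storage, so this
      ; access is permitted.
      %v = load i32, ptr %q
      ret i32 %v
    }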
2936
2937LLVM IR does not associate types with memory. The result type of a
2938``load`` merely indicates the size and alignment of the memory from
2939which to load, as well as the interpretation of the value. The first
2940operand type of a ``store`` similarly only indicates the size and
2941alignment of the store.
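
As a small sketch of what this means in practice, the same bytes can be stored
as one type and loaded as another. The function name is hypothetical:

.. code-block:: llvm

    define i32 @float_bits(float %f) {
      %slot = alloca float
      store float %f, ptr %slot   ; a 4-byte store
      %i = load i32, ptr %slot    ; a 4-byte load of the same bytes, read as i32
      ret i32 %i
    }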
2942
2943Consequently, type-based alias analysis, aka TBAA, aka
2944``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
2945:ref:`Metadata <metadata>` may be used to encode additional information
2946which specialized optimization passes may use to implement type-based
2947alias analysis.
2948
2949.. _pointercapture:
2950
2951Pointer Capture
2952---------------
2953
2954Given a function call and a pointer that is passed as an argument or stored in
2955the memory before the call, a pointer is *captured* by the call if it makes a
2956copy of any part of the pointer that outlives the call.
2957To be precise, a pointer is captured if one or more of the following conditions
2958hold:
2959
29601. The call stores any bit of the pointer carrying information into a place,
2961   and the stored bits can be read from the place by the caller after this call
2962   exits.
2963
.. code-block:: llvm

    @glb  = global ptr null
    @glb2 = global ptr null
    @glb3 = global ptr null
    @glbi = global i32 0

    define ptr @f(ptr %a, ptr %b, ptr %c, ptr %d, ptr %e) {
      store ptr %a, ptr @glb ; %a is captured by this call

      store ptr %b,   ptr @glb2 ; %b isn't captured because the stored value is overwritten by the store below
      store ptr null, ptr @glb2

      store ptr %c,   ptr @glb3
      call void @g() ; If @g makes a copy of %c that outlives this call (@f), %c is captured
      store ptr null, ptr @glb3

      %i = ptrtoint ptr %d to i64
      %j = trunc i64 %i to i32
      store i32 %j, ptr @glbi ; %d is captured

      ret ptr %e ; %e is captured
    }
2987
29882. The call stores any bit of the pointer carrying information into a place,
2989   and the stored bits can be safely read from the place by another thread via
2990   synchronization.
2991
.. code-block:: llvm

    @lock = global i1 true

    define void @f(ptr %a) {
      store ptr %a, ptr @glb
      store atomic i1 false, ptr @lock release ; %a is captured because another thread can safely read @glb
      store ptr null, ptr @glb
      ret void
    }
3002
30033. The call's behavior depends on any bit of the pointer carrying information.
3004
.. code-block:: llvm

    @glb = global i8 0

    define void @f(ptr %a) {
      %c = icmp eq ptr %a, @glb
      br i1 %c, label %BB_EXIT, label %BB_CONTINUE ; %a is captured
    BB_EXIT:
      call void @exit()
      unreachable
    BB_CONTINUE:
      ret void
    }
3018
30194. The pointer is used in a volatile access as its address.
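
A minimal sketch of this last condition, with an arbitrary function name:

.. code-block:: llvm

    define void @f(ptr %a) {
      %v = load volatile i8, ptr %a ; %a is captured: it is the address of a volatile access
      ret void
    }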
3020
3021
3022.. _volatile:
3023
3024Volatile Memory Accesses
3025------------------------
3026
3027Certain memory accesses, such as :ref:`load <i_load>`'s,
3028:ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
3029marked ``volatile``. The optimizers must not change the number of
3030volatile operations or change their order of execution relative to other
3031volatile operations. The optimizers *may* change the order of volatile
3032operations relative to non-volatile operations. This is not Java's
3033"volatile" and has no cross-thread synchronization behavior.
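
The ordering rules can be sketched as follows. The function name and the idea
that ``%reg`` points at a device register are assumptions made for the
example:

.. code-block:: llvm

    define i32 @read_twice(ptr %reg) {
      ; Both loads must be kept, in this order, even though a non-volatile
      ; pair of loads from the same address could be merged into one.
      %a = load volatile i32, ptr %reg
      %b = load volatile i32, ptr %reg
      %sum = add i32 %a, %b
      ; The volatile store may not be removed or reordered with the
      ; volatile loads above.
      store volatile i32 %sum, ptr %reg
      ret i32 %sum
    }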
3034
3035A volatile load or store may have additional target-specific semantics.
3036Any volatile operation can have side effects, and any volatile operation
3037can read and/or modify state which is not accessible via a regular load
3038or store in this module. Volatile operations may use addresses which do
3039not point to memory (like MMIO registers). This means the compiler may
3040not use a volatile operation to prove a non-volatile access to that
3041address has defined behavior.
3042
3043The allowed side-effects for volatile accesses are limited.  If a
3044non-volatile store to a given address would be legal, a volatile
3045operation may modify the memory at that address. A volatile operation
3046may not modify any other memory accessible by the module being compiled.
3047A volatile operation may not call any code in the current module.
3048
3049The compiler may assume execution will continue after a volatile operation,
3050so operations which modify memory or may have undefined behavior can be
3051hoisted past a volatile operation.
3052
3053As an exception to the preceding rule, the compiler may not assume execution
3054will continue after a volatile store operation. This restriction is necessary
3055to support the somewhat common pattern in C of intentionally storing to an
3056invalid pointer to crash the program. In the future, it might make sense to
3057allow frontends to control this behavior.
3058
3059IR-level volatile loads and stores cannot safely be optimized into llvm.memcpy
3060or llvm.memmove intrinsics even when those intrinsics are flagged volatile.
3061Likewise, the backend should never split or merge target-legal volatile
3062load/store instructions. Similarly, IR-level volatile loads and stores cannot
3063change from integer to floating-point or vice versa.
3064
3065.. admonition:: Rationale
3066
3067 Platforms may rely on volatile loads and stores of natively supported
3068 data width to be executed as single instruction. For example, in C
3069 this holds for an l-value of volatile primitive type with native
3070 hardware support, but not necessarily for aggregate types. The
3071 frontend upholds these expectations, which are intentionally
3072 unspecified in the IR. The rules above ensure that IR transformations
3073 do not violate the frontend's contract with the language.
3074
3075.. _memmodel:
3076
3077Memory Model for Concurrent Operations
3078--------------------------------------
3079
3080The LLVM IR does not define any way to start parallel threads of
3081execution or to register signal handlers. Nonetheless, there are
3082platform-specific ways to create them, and we define LLVM IR's behavior
3083in their presence. This model is inspired by the C++0x memory model.
3084
3085For a more informal introduction to this model, see the :doc:`Atomics`.
3086
3087We define a *happens-before* partial order as the least partial order
3088that
3089
3090-  Is a superset of single-thread program order, and
3091-  When a *synchronizes-with* ``b``, includes an edge from ``a`` to
3092   ``b``. *Synchronizes-with* pairs are introduced by platform-specific
3093   techniques, like pthread locks, thread creation, thread joining,
3094   etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
3095   Constraints <ordering>`).
3096
3097Note that program order does not introduce *happens-before* edges
3098between a thread and signals executing inside that thread.
3099
3100Every (defined) read operation (load instructions, memcpy, atomic
3101loads/read-modify-writes, etc.) R reads a series of bytes written by
3102(defined) write operations (store instructions, atomic
3103stores/read-modify-writes, memcpy, etc.). For the purposes of this
3104section, initialized globals are considered to have a write of the
3105initializer which is atomic and happens before any other read or write
3106of the memory in question. For each byte of a read R, R\ :sub:`byte`
3107may see any write to the same byte, except:
3108
3109-  If write\ :sub:`1`  happens before write\ :sub:`2`, and
3110   write\ :sub:`2` happens before R\ :sub:`byte`, then
3111   R\ :sub:`byte` does not see write\ :sub:`1`.
3112-  If R\ :sub:`byte` happens before write\ :sub:`3`, then
3113   R\ :sub:`byte` does not see write\ :sub:`3`.
3114
3115Given that definition, R\ :sub:`byte` is defined as follows:
3116
3117-  If R is volatile, the result is target-dependent. (Volatile is
3118   supposed to give guarantees which can support ``sig_atomic_t`` in
3119   C/C++, and may be used for accesses to addresses that do not behave
3120   like normal memory. It does not generally provide cross-thread
3121   synchronization.)
3122-  Otherwise, if there is no write to the same byte that happens before
3123   R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
3124-  Otherwise, if R\ :sub:`byte` may see exactly one write,
3125   R\ :sub:`byte` returns the value written by that write.
3126-  Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
3127   see are atomic, it chooses one of the values written. See the :ref:`Atomic
3128   Memory Ordering Constraints <ordering>` section for additional
3129   constraints on how the choice is made.
3130-  Otherwise R\ :sub:`byte` returns ``undef``.
3131
3132R returns the value composed of the series of bytes it read. This
3133implies that some bytes within the value may be ``undef`` **without**
3134the entire value being ``undef``. Note that this only defines the
3135semantics of the operation; it doesn't mean that targets will emit more
3136than one instruction to read the series of bytes.
3137
3138Note that in cases where none of the atomic intrinsics are used, this
3139model places only one restriction on IR transformations on top of what
3140is required for single-threaded execution: introducing a store to a byte
3141which might not otherwise be stored is not allowed in general.
3142(Specifically, in the case where another thread might write to and read
3143from an address, introducing a store can change a load that may see
3144exactly one write into a load that may see multiple writes.)
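
As an illustrative sketch of this restriction (the global ``@g`` and the
function name are hypothetical, chosen only for this example), consider a
store that executes on only one path:

.. code-block:: llvm

    @g = global i32 0

    define void @maybe_store(i1 %cond) {
    entry:
      br i1 %cond, label %then, label %exit
    then:
      ; The only store to @g; it runs only when %cond is true.
      store i32 1, ptr @g
      br label %exit
    exit:
      ret void
    }

Rewriting ``@maybe_store`` so that it unconditionally loads ``@g``, selects
between the loaded value and ``1``, and stores the result back would preserve
single-threaded behavior, but it introduces a store on the path where
``%cond`` is false. If another thread writes ``@g`` concurrently, a load in
that thread that could previously see exactly one write may now see two, so
the rewrite is not allowed in general.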
3145
3146.. _ordering:
3147
3148Atomic Memory Ordering Constraints
3149----------------------------------
3150
3151Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
3152:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
3153:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
3154ordering parameters that determine which other atomic instructions on
3155the same address they *synchronize with*. These semantics are borrowed
3156from Java and C++0x, but are somewhat more colloquial. If these
3157descriptions aren't precise enough, check those specs (see spec
3158references in the :doc:`atomics guide <Atomics>`).
3159:ref:`fence <i_fence>` instructions treat these orderings somewhat
3160differently since they don't take an address. See that instruction's
3161documentation for details.
3162
3163For a simpler introduction to the ordering constraints, see the
3164:doc:`Atomics`.
3165
3166``unordered``
3167    The set of values that can be read is governed by the happens-before
3168    partial order. A value cannot be read unless some operation wrote
3169    it. This is intended to provide a guarantee strong enough to model
3170    Java's non-volatile shared variables. This ordering cannot be
3171    specified for read-modify-write operations; it is not strong enough
3172    to make them atomic in any interesting way.
3173``monotonic``
3174    In addition to the guarantees of ``unordered``, there is a single
3175    total order for modifications by ``monotonic`` operations on each
3176    address. All modification orders must be compatible with the
3177    happens-before order. There is no guarantee that the modification
3178    orders can be combined to a global total order for the whole program
3179    (and this often will not be possible). The read in an atomic
3180    read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
3181    :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
3182    order immediately before the value it writes. If one atomic read
3183    happens before another atomic read of the same address, the later
3184    read must see the same value or a later value in the address's
3185    modification order. This disallows reordering of ``monotonic`` (or
3186    stronger) operations on the same address. If an address is written
3187    ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
3188    read that address repeatedly, the other threads must eventually see
3189    the write. This corresponds to the C++0x/C1x
3190    ``memory_order_relaxed``.
3191``acquire``
3192    In addition to the guarantees of ``monotonic``, a
3193    *synchronizes-with* edge may be formed with a ``release`` operation.
3194    This is intended to model C++'s ``memory_order_acquire``.
3195``release``
3196    In addition to the guarantees of ``monotonic``, if this operation
3197    writes a value which is subsequently read by an ``acquire``
3198    operation, it *synchronizes-with* that operation. (This isn't a
3199    complete description; see the C++0x definition of a release
3200    sequence.) This corresponds to the C++0x/C1x
3201    ``memory_order_release``.
3202``acq_rel`` (acquire+release)
3203    Acts as both an ``acquire`` and ``release`` operation on its
3204    address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
3205``seq_cst`` (sequentially consistent)
3206    In addition to the guarantees of ``acq_rel`` (``acquire`` for an
3207    operation that only reads, ``release`` for an operation that only
3208    writes), there is a global total order on all
3209    sequentially-consistent operations on all addresses, which is
3210    consistent with the *happens-before* partial order and with the
3211    modification orders of all the affected addresses. Each
3212    sequentially-consistent read sees the last preceding write to the
3213    same address in this global order. This corresponds to the C++0x/C1x
3214    ``memory_order_seq_cst`` and Java volatile.
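
The following sketch shows the common ``release``/``acquire`` pairing. The
globals ``@data`` and ``@ready`` and both function names are assumptions made
for illustration only.

.. code-block:: llvm

    @data = global i32 0
    @ready = global i32 0

    define void @producer() {
      ; Ordinary (non-atomic) store of the payload.
      store i32 42, ptr @data, align 4
      ; Publish the payload with a release store to the flag.
      store atomic i32 1, ptr @ready release, align 4
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin
    spin:
      ; Acquire load of the flag; loop until the release store is observed.
      %r = load atomic i32, ptr @ready acquire, align 4
      %set = icmp eq i32 %r, 1
      br i1 %set, label %done, label %spin
    done:
      ; Ordinary load of the payload.
      %v = load i32, ptr @data, align 4
      ret i32 %v
    }

If the ``acquire`` load in ``@consumer`` reads the value written by the
``release`` store in ``@producer``, the store *synchronizes-with* the load.
The non-atomic store to ``@data`` then happens before the final load, which
therefore returns 42.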
3215
3216.. _syncscope:
3217
3218If an atomic operation is marked ``syncscope("singlethread")``, it only
3219*synchronizes with* and only participates in the seq\_cst total orderings of
3220other operations running in the same thread (for example, in signal handlers).
3221
If an atomic operation is marked ``syncscope("<target-scope>")``, where
``<target-scope>`` is a target-specific synchronization scope, then it is
target-dependent whether it *synchronizes with* and participates in the
seq\_cst total orderings of other operations.
3226
3227Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
3228or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
3229seq\_cst total orderings of other operations that are not marked
3230``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.
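
As a minimal sketch of the syntax (the global ``@flag`` and the function
names are hypothetical), the same store can be written with and without a
synchronization scope:

.. code-block:: llvm

    @flag = global i32 0

    define void @publish_to_signal_handler() {
      ; Synchronizes only with operations running in the same thread,
      ; for example in a signal handler.
      store atomic i32 1, ptr @flag syncscope("singlethread") release, align 4
      fence syncscope("singlethread") seq_cst
      ret void
    }

    define void @publish_to_other_threads() {
      ; No syncscope marker: the default, cross-thread behavior.
      store atomic i32 1, ptr @flag release, align 4
      ret void
    }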
3231
3232.. _floatenv:
3233
3234Floating-Point Environment
3235--------------------------
3236
3237The default LLVM floating-point environment assumes that floating-point
3238instructions do not have side effects. Results assume the round-to-nearest
3239rounding mode. No floating-point exception state is maintained in this
3240environment. Therefore, there is no attempt to create or preserve invalid
3241operation (SNaN) or division-by-zero exceptions.
3242
3243The benefit of this exception-free assumption is that floating-point
3244operations may be speculated freely without any other fast-math relaxations
3245to the floating-point model.
3246
3247Code that requires different behavior than this should use the
3248:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
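
For comparison, here is a sketch of an addition in the default environment
next to one that uses a constrained intrinsic. The function names are
hypothetical, and the rounding and exception arguments shown are just one
possible choice.

.. code-block:: llvm

    declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

    define double @add_default(double %a, double %b) {
      ; Default environment: round-to-nearest, no exception state tracked.
      %sum = fadd double %a, %b
      ret double %sum
    }

    define double @add_strict(double %a, double %b) strictfp {
      ; Dynamic rounding mode, with floating-point exceptions preserved.
      %sum = call double @llvm.experimental.constrained.fadd.f64(
                 double %a, double %b,
                 metadata !"round.dynamic",
                 metadata !"fpexcept.strict") strictfp
      ret double %sum
    }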
3249
3250.. _fastmath:
3251
3252Fast-Math Flags
3253---------------
3254
3255LLVM IR floating-point operations (:ref:`fneg <i_fneg>`, :ref:`fadd <i_fadd>`,
3256:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
3257:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`), :ref:`phi <i_phi>`,
3258:ref:`select <i_select>` and :ref:`call <i_call>`
3259may use the following flags to enable otherwise unsafe
3260floating-point transformations.
3261
3262``nnan``
   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. If an argument is a NaN, or the result would be a NaN, it produces
   a :ref:`poison value <poisonvalues>` instead.
3266
3267``ninf``
3268   No Infs - Allow optimizations to assume the arguments and result are not
3269   +/-Inf. If an argument is +/-Inf, or the result would be +/-Inf, it
3270   produces a :ref:`poison value <poisonvalues>` instead.
3271
3272``nsz``
3273   No Signed Zeros - Allow optimizations to treat the sign of a zero
3274   argument or result as insignificant. This does not imply that -0.0
3275   is poison and/or guaranteed to not exist in the operation.
3276
3277``arcp``
3278   Allow Reciprocal - Allow optimizations to use the reciprocal of an
3279   argument rather than perform division.
3280
3281``contract``
   Allow floating-point contraction (e.g. fusing a multiply followed by an
   addition into a fused multiply-and-add). This does not enable reassociating
   to form arbitrary contractions. For example, ``(a*b) + (c*d) + e`` cannot
   be transformed into ``(a*b) + ((c*d) + e)`` to create two fma operations.
3286
3287``afn``
3288   Approximate functions - Allow substitution of approximate calculations for
3289   functions (sin, log, sqrt, etc). See floating-point intrinsic definitions
3290   for places where this can apply to LLVM's intrinsic math functions.
3291
3292``reassoc``
3293   Allow reassociation transformations for floating-point instructions.
3294   This may dramatically change results in floating-point.
3295
3296``fast``
3297   This flag implies all of the others.
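
A short sketch of how these flags are written on individual operations (the
function name and the particular mix of flags are illustrative assumptions):

.. code-block:: llvm

    declare float @llvm.sqrt.f32(float)

    define float @fmf_demo(float %a, float %b, float %c) {
      ; contract on both operations permits fusing them into one fma.
      %prod = fmul contract float %a, %b
      %sum = fadd contract float %prod, %c
      ; arcp permits computing the quotient as %sum * (1.0 / %b).
      %q = fdiv arcp float %sum, %b
      ; afn permits an approximate square root.
      %s = call afn float @llvm.sqrt.f32(float %q)
      ; fast implies all of the other flags.
      %r = fadd fast float %s, %a
      ret float %r
    }

Flags are attached per instruction, so an optimization may only rely on a
relaxation for the operations that actually carry the corresponding flag.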
3298
3299.. _uselistorder:
3300
3301Use-list Order Directives
3302-------------------------
3303
3304Use-list directives encode the in-memory order of each use-list, allowing the
3305order to be recreated. ``<order-indexes>`` is a comma-separated list of
3306indexes that are assigned to the referenced value's uses. The referenced
3307value's use-list is immediately sorted by these indexes.
3308
3309Use-list directives may appear at function scope or global scope. They are not
3310instructions, and have no effect on the semantics of the IR. When they're at
3311function scope, they must appear after the terminator of the final basic block.
3312
3313If basic blocks have their address taken via ``blockaddress()`` expressions,
3314``uselistorder_bb`` can be used to reorder their use-lists from outside their
3315function's scope.
3316
3317:Syntax:
3318
3319::
3320
    uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }
3323
3324:Examples:
3325
3326::
3327
    define void @foo(i32 %arg1, i32 %arg2) {
    entry:
      ; ... instructions ...
    bb:
      ; ... instructions ...

      ; At function scope.
      uselistorder i32 %arg1, { 1, 0, 2 }
      uselistorder label %bb, { 1, 0 }
    }

    ; At global scope.
    uselistorder ptr @global, { 1, 2, 0 }
    uselistorder i32 7, { 1, 0 }
    uselistorder i32 (i32) @bar, { 1, 0 }
    uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }
3344
3345.. _source_filename:
3346
3347Source Filename
3348---------------
3349
3350The *source filename* string is set to the original module identifier,
3351which will be the name of the compiled source file when compiling from
3352source through the clang front end, for example. It is then preserved through
3353the IR and bitcode.
3354
3355This is currently necessary to generate a consistent unique global
3356identifier for local functions used in profile data, which prepends the
3357source file name to the local function name.
3358
3359The syntax for the source file name is simply:
3360
3361.. code-block:: text
3362
3363    source_filename = "/path/to/source.c"
3364
3365.. _typesystem:
3366
3367Type System
3368===========
3369
3370The LLVM type system is one of the most important features of the
3371intermediate representation. Being typed enables a number of
3372optimizations to be performed on the intermediate representation
3373directly, without having to do extra analyses on the side before the
3374transformation. A strong type system makes it easier to read the
3375generated code and enables novel analyses and transformations that are
3376not feasible to perform on normal three address code representations.
3377
3378.. _t_void:
3379
3380Void Type
3381---------
3382
3383:Overview:
3384
3385
3386The void type does not represent any value and has no size.
3387
3388:Syntax:
3389
3390
3391::
3392
3393      void
3394
3395
3396.. _t_function:
3397
3398Function Type
3399-------------
3400
3401:Overview:
3402
3403
3404The function type can be thought of as a function signature. It consists of a
3405return type and a list of formal parameter types. The return type of a function
3406type is a void type or first class type --- except for :ref:`label <t_label>`
3407and :ref:`metadata <t_metadata>` types.
3408
3409:Syntax:
3410
3411::
3412
3413      <returntype> (<parameter list>)
3414
3415...where '``<parameter list>``' is a comma-separated list of type
3416specifiers. Optionally, the parameter list may include a type ``...``, which
3417indicates that the function takes a variable number of arguments. Variable
3418argument functions can access their arguments with the :ref:`variable argument
3419handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
3420except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.
3421
3422:Examples:
3423
3424+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3425| ``i32 (i32)``                   | function taking an ``i32``, returning an ``i32``                                                                                                                    |
3426+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3427| ``i32 (ptr, ...)``              | A vararg function that takes at least one :ref:`pointer <t_pointer>` argument and returns an integer. This is the signature for ``printf`` in LLVM.                 |
3428+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3429| ``{i32, i32} (i32)``            | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values                                                                 |
3430+---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+
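
As a sketch of where a function type appears in practice, the vararg
signature from the table can be spelled out at a call site. The global
``@fmt`` and the function ``@print_int`` are hypothetical names used only
for this example.

.. code-block:: llvm

    @fmt = private constant [4 x i8] c"%d\0A\00"

    ; A declaration whose function type is i32 (ptr, ...).
    declare i32 @printf(ptr, ...)

    define i32 @print_int(i32 %x) {
      ; Calls to vararg functions spell out the full function type.
      %n = call i32 (ptr, ...) @printf(ptr @fmt, i32 %x)
      ret i32 %n
    }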
3431
3432.. _t_firstclass:
3433
3434First Class Types
3435-----------------
3436
3437The :ref:`first class <t_firstclass>` types are perhaps the most important.
3438Values of these types are the only ones which can be produced by
3439instructions.
3440
3441.. _t_single_value:
3442
3443Single Value Types
3444^^^^^^^^^^^^^^^^^^
3445
3446These are the types that are valid in registers from CodeGen's perspective.
3447
3448.. _t_integer:
3449
3450Integer Type
3451""""""""""""
3452
3453:Overview:
3454
3455The integer type is a very simple type that simply specifies an
3456arbitrary bit width for the integer type desired. Any bit width from 1
3457bit to 2\ :sup:`23`\ (about 8 million) can be specified.
3458
3459:Syntax:
3460
3461::
3462
3463      iN
3464
3465The number of bits the integer will occupy is specified by the ``N``
3466value.
3467
:Examples:
3470
3471+----------------+------------------------------------------------+
3472| ``i1``         | a single-bit integer.                          |
3473+----------------+------------------------------------------------+
3474| ``i32``        | a 32-bit integer.                              |
3475+----------------+------------------------------------------------+
3476| ``i1942652``   | a really big integer of over 1 million bits.   |
3477+----------------+------------------------------------------------+
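
Bit widths need not be powers of two or match a machine register size. A
minimal sketch, with hypothetical function names:

.. code-block:: llvm

    define i7 @low_seven_bits(i32 %x) {
      ; Keep only the low 7 bits of %x.
      %t = trunc i32 %x to i7
      ret i7 %t
    }

    define i128 @widen_and_square(i64 %x) {
      ; Zero-extend to 128 bits so the product cannot wrap.
      %w = zext i64 %x to i128
      %sq = mul i128 %w, %w
      ret i128 %sq
    }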
3478
3479.. _t_floating:
3480
3481Floating-Point Types
3482""""""""""""""""""""
3483
3484.. list-table::
3485   :header-rows: 1
3486
3487   * - Type
3488     - Description
3489
3490   * - ``half``
3491     - 16-bit floating-point value
3492
3493   * - ``bfloat``
3494     - 16-bit "brain" floating-point value (7-bit significand).  Provides the
3495       same number of exponent bits as ``float``, so that it matches its dynamic
3496       range, but with greatly reduced precision.  Used in Intel's AVX-512 BF16
3497       extensions and Arm's ARMv8.6-A extensions, among others.
3498
3499   * - ``float``
3500     - 32-bit floating-point value
3501
3502   * - ``double``
3503     - 64-bit floating-point value
3504
3505   * - ``fp128``
3506     - 128-bit floating-point value (113-bit significand)
3507
3508   * - ``x86_fp80``
3509     -  80-bit floating-point value (X87)
3510
3511   * - ``ppc_fp128``
     - 128-bit floating-point value (two 64-bit doubles)
3513
The binary formats of half, float, double, and fp128 correspond to the
IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128
respectively.
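
Values move between these types with explicit conversion instructions. The
sketch below uses hypothetical function names:

.. code-block:: llvm

    define double @widen(half %h) {
      ; Each extension is exact: every half is a float, every float a double.
      %f = fpext half %h to float
      %d = fpext float %f to double
      ret double %d
    }

    define half @narrow(double %d) {
      ; Truncation may lose precision or overflow the smaller format.
      %h = fptrunc double %d to half
      ret half %h
    }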
3517
3518X86_amx Type
3519""""""""""""
3520
3521:Overview:
3522
The x86_amx type represents a value held in an AMX tile register on an x86
machine. The operations allowed on it are quite limited. Only a few intrinsics
are allowed: stride load and store, zero, and dot product. No instruction is
allowed for this type. There are no arguments, arrays, pointers, vectors
or constants of this type.
3528
3529:Syntax:
3530
3531::
3532
3533      x86_amx
3534
3535
3536X86_mmx Type
3537""""""""""""
3538
3539:Overview:
3540
3541The x86_mmx type represents a value held in an MMX register on an x86
3542machine. The operations allowed on it are quite limited: parameters and
3543return values, load and store, and bitcast. User-specified MMX
3544instructions are represented as intrinsic or asm calls with arguments
3545and/or results of this type. There are no arrays, vectors or constants
3546of this type.
3547
3548:Syntax:
3549
3550::
3551
3552      x86_mmx
3553
3554
3555.. _t_pointer:
3556
3557Pointer Type
3558""""""""""""
3559
3560:Overview:
3561
3562The pointer type ``ptr`` is used to specify memory locations. Pointers are
3563commonly used to reference objects in memory.
3564
3565Pointer types may have an optional address space attribute defining the
3566numbered address space where the pointed-to object resides. The default
3567address space is number zero. The semantics of non-zero address spaces
3568are target-specific. For example, ``ptr addrspace(5)`` is a pointer
3569to address space 5.
3570
3571Prior to LLVM 15, pointer types also specified a pointee type, such as
3572``i8*``, ``[4 x i32]*`` or ``i32 (i32*)*``. In LLVM 15, such "typed
3573pointers" are still supported under non-default options. See the
3574`opaque pointers document <OpaquePointers.html>`__ for more information.
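
Because ``ptr`` carries no pointee type, the type being accessed is stated by
each instruction that uses the pointer. A minimal sketch (the function name
is hypothetical, and the meaning of address space 5 is target-specific):

.. code-block:: llvm

    define i32 @read_both(ptr %p, ptr addrspace(5) %q) {
      ; The load, not the pointer, says that an i32 is read.
      %a = load i32, ptr %p, align 4
      ; Same operation through a pointer in address space 5.
      %b = load i32, ptr addrspace(5) %q, align 4
      %sum = add i32 %a, %b
      ret i32 %sum
    }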
3575
3576.. _t_vector:
3577
3578Vector Type
3579"""""""""""
3580
3581:Overview:
3582
A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data are
operated on in parallel using a single instruction (SIMD). A vector type
requires a size (number of elements), an underlying primitive data type,
and a scalable property to represent vectors where the exact hardware
vector length is unknown at compile time. Vector types are considered
:ref:`first class <t_firstclass>`.
3590
3591:Memory Layout:
3592
In general, vector elements are laid out in memory in the same way as
:ref:`array types <t_array>`. Such an analogy works fine as long as the vector
elements are byte sized. However, when the elements of the vector aren't byte
sized, it gets a bit more complicated. One way to describe the layout is by
describing what happens when a vector such as ``<N x iM>`` is bitcast to an
integer type with N*M bits, and then following the rules for storing such an
integer to memory.

A bitcast from a vector type to a scalar integer type will see the elements
being packed together (without padding). The order in which elements are
inserted in the integer depends on endianness. For little endian, element zero
is put in the least significant bits of the integer, and for big endian,
element zero is put in the most significant bits.
3606
3607Using a vector such as ``<i4 1, i4 2, i4 3, i4 5>`` as an example, together
3608with the analogy that we can replace a vector store by a bitcast followed by
3609an integer store, we get this for big endian:
3610
3611.. code-block:: llvm
3612
      %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16

      ; Bitcasting from a vector to an integral type can be seen as
      ; concatenating the values:
      ;   %val now has the hexadecimal value 0x1235.

      store i16 %val, ptr %ptr

      ; In memory the content will be (8-bit addressing):
      ;
      ;    [%ptr + 0]: 00010010  (0x12)
      ;    [%ptr + 1]: 00110101  (0x35)
3625
3626The same example for little endian:
3627
3628.. code-block:: llvm
3629
      %val = bitcast <4 x i4> <i4 1, i4 2, i4 3, i4 5> to i16

      ; Bitcasting from a vector to an integral type can be seen as
      ; concatenating the values:
      ;   %val now has the hexadecimal value 0x5321.

      store i16 %val, ptr %ptr

      ; In memory the content will be (8-bit addressing); a little-endian
      ; store places the least significant byte at the lowest address:
      ;
      ;    [%ptr + 0]: 00100001  (0x21)
      ;    [%ptr + 1]: 01010011  (0x53)
3642
3643When ``<N*M>`` isn't evenly divisible by the byte size the exact memory layout
3644is unspecified (just like it is for an integral type of the same size). This
3645is because different targets could put the padding at different positions when
3646the type size is smaller than the type's store size.
3647
3648:Syntax:
3649
3650::
3651
3652      < <# elements> x <elementtype> >          ; Fixed-length vector
3653      < vscale x <# elements> x <elementtype> > ; Scalable vector
3654
3655The number of elements is a constant integer value larger than 0;
3656elementtype may be any integer, floating-point or pointer type. Vectors
3657of size zero are not allowed. For scalable vectors, the total number of
3658elements is a constant multiple (called vscale) of the specified number
3659of elements; vscale is a positive integer that is unknown at compile time
3660and the same hardware-dependent constant for all scalable vectors at run
3661time. The size of a specific scalable vector type is thus constant within
3662IR, even if the exact size in bytes cannot be determined until run time.
3663
3664:Examples:
3665
3666+------------------------+----------------------------------------------------+
3667| ``<4 x i32>``          | Vector of 4 32-bit integer values.                 |
3668+------------------------+----------------------------------------------------+
3669| ``<8 x float>``        | Vector of 8 32-bit floating-point values.          |
3670+------------------------+----------------------------------------------------+
3671| ``<2 x i64>``          | Vector of 2 64-bit integer values.                 |
3672+------------------------+----------------------------------------------------+
3673| ``<4 x ptr>``          | Vector of 4 pointers                               |
3674+------------------------+----------------------------------------------------+
3675| ``<vscale x 4 x i32>`` | Vector with a multiple of 4 32-bit integer values. |
3676+------------------------+----------------------------------------------------+
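
Most arithmetic instructions accept vector operands and operate element-wise.
The following sketch (function names are hypothetical) uses one fixed-length
and one scalable vector type from the table:

.. code-block:: llvm

    define i32 @sum_lane0(<4 x i32> %a, <4 x i32> %b) {
      ; Element-wise addition of two fixed-length vectors.
      %v = add <4 x i32> %a, %b
      ; Read element 0 of the result.
      %e = extractelement <4 x i32> %v, i32 0
      ret i32 %e
    }

    define <vscale x 4 x i32> @scalable_add(<vscale x 4 x i32> %a,
                                            <vscale x 4 x i32> %b) {
      ; The same operation on vectors whose length is vscale * 4 elements.
      %v = add <vscale x 4 x i32> %a, %b
      ret <vscale x 4 x i32> %v
    }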
3677
3678.. _t_label:
3679
3680Label Type
3681^^^^^^^^^^
3682
3683:Overview:
3684
3685The label type represents code labels.
3686
3687:Syntax:
3688
3689::
3690
3691      label
3692
3693.. _t_token:
3694
3695Token Type
3696^^^^^^^^^^
3697
3698:Overview:
3699
3700The token type is used when a value is associated with an instruction
3701but all uses of the value must not attempt to introspect or obscure it.
3702As such, it is not appropriate to have a :ref:`phi <i_phi>` or
3703:ref:`select <i_select>` of type token.
3704
3705:Syntax:
3706
3707::
3708
3709      token
3710
3711
3712
3713.. _t_metadata:
3714
3715Metadata Type
3716^^^^^^^^^^^^^
3717
3718:Overview:
3719
3720The metadata type represents embedded metadata. No derived types may be
3721created from metadata except for :ref:`function <t_function>` arguments.
3722
3723:Syntax:
3724
3725::
3726
3727      metadata
3728
3729.. _t_aggregate:
3730
3731Aggregate Types
3732^^^^^^^^^^^^^^^
3733
3734Aggregate Types are a subset of derived types that can contain multiple
3735member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
3736aggregate types. :ref:`Vectors <t_vector>` are not considered to be
3737aggregate types.
3738
3739.. _t_array:
3740
3741Array Type
3742""""""""""
3743
3744:Overview:
3745
3746The array type is a very simple derived type that arranges elements
3747sequentially in memory. The array type requires a size (number of
3748elements) and an underlying data type.
3749
3750:Syntax:
3751
3752::
3753
3754      [<# elements> x <elementtype>]
3755
3756The number of elements is a constant integer value; ``elementtype`` may
3757be any type with a size.
3758
3759:Examples:
3760
3761+------------------+--------------------------------------+
3762| ``[40 x i32]``   | Array of 40 32-bit integer values.   |
3763+------------------+--------------------------------------+
3764| ``[41 x i32]``   | Array of 41 32-bit integer values.   |
3765+------------------+--------------------------------------+
3766| ``[4 x i8]``     | Array of 4 8-bit integer values.     |
3767+------------------+--------------------------------------+
3768
3769Here are some examples of multidimensional arrays:
3770
3771+-----------------------------+----------------------------------------------------------+
3772| ``[3 x [4 x i32]]``         | 3x4 array of 32-bit integer values.                      |
3773+-----------------------------+----------------------------------------------------------+
3774| ``[12 x [10 x float]]``     | 12x10 array of single precision floating-point values.   |
3775+-----------------------------+----------------------------------------------------------+
3776| ``[2 x [3 x [4 x i16]]]``   | 2x3x4 array of 16-bit integer values.                    |
3777+-----------------------------+----------------------------------------------------------+
3778
3779There is no restriction on indexing beyond the end of the array implied
3780by a static type (though there are restrictions on indexing beyond the
3781bounds of an allocated object in some cases). This means that
3782single-dimension 'variable sized array' addressing can be implemented in
3783LLVM with a zero length array type. An implementation of 'pascal style
3784arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
3785example.
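
A sketch of how such a 'pascal style array' could be indexed. The type name
``%pascal`` and the function name are hypothetical, and the code assumes the
caller allocated enough storage after the length field.

.. code-block:: llvm

    %pascal = type { i32, [0 x float] }

    define float @get_element(ptr %arr, i64 %i) {
      ; Field 1 is the zero-length array; index %i steps past its static end.
      %elt = getelementptr %pascal, ptr %arr, i64 0, i32 1, i64 %i
      %v = load float, ptr %elt, align 4
      ret float %v
    }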
3786
3787.. _t_struct:
3788
3789Structure Type
3790""""""""""""""
3791
3792:Overview:
3793
3794The structure type is used to represent a collection of data members
3795together in memory. The elements of a structure may be any type that has
3796a size.
3797
3798Structures in memory are accessed using '``load``' and '``store``' by
3799getting a pointer to a field with the '``getelementptr``' instruction.
3800Structures in registers are accessed using the '``extractvalue``' and
3801'``insertvalue``' instructions.
3802
3803Structures may optionally be "packed" structures, which indicate that
3804the alignment of the struct is one byte, and that there is no padding
3805between the elements. In non-packed structs, padding between field types
3806is inserted as defined by the DataLayout string in the module, which is
3807required to match what the underlying code generator expects.
3808
Structures can either be "literal" or "identified". A literal structure
is defined inline with other types (e.g. ``[2 x {i32, i32}]``) whereas
identified types are always defined at the top level with a name.
Literal types are uniqued by their contents and can never be recursive
or opaque since there is no way to write one. Identified types can be
recursive, can be opaque, and are never uniqued.
3815
3816:Syntax:
3817
3818::
3819
3820      %T1 = type { <type list> }     ; Identified normal struct type
3821      %T2 = type <{ <type list> }>   ; Identified packed struct type
3822
3823:Examples:
3824
3825+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3826| ``{ i32, i32, i32 }``        | A triple of three ``i32`` values                                                                                                                                                      |
3827+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3828| ``{ float, ptr }``           | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>`.                                                                                |
3829+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3830| ``<{ i8, i32 }>``            | A packed struct known to be 5 bytes in size.                                                                                                                                          |
3831+------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
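
The two access styles described in the overview can be sketched as follows;
the type name ``%pair`` and the function names are hypothetical.

.. code-block:: llvm

    %pair = type { float, ptr }

    define float @first_in_memory(ptr %p) {
      ; In memory: compute the address of field 0, then load it.
      %f = getelementptr %pair, ptr %p, i32 0, i32 0
      %v = load float, ptr %f, align 4
      ret float %v
    }

    define %pair @set_first(%pair %agg, float %x) {
      ; In registers: produce a new struct value with field 0 replaced.
      %new = insertvalue %pair %agg, float %x, 0
      ret %pair %new
    }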
3832
3833.. _t_opaque:
3834
3835Opaque Structure Types
3836""""""""""""""""""""""
3837
3838:Overview:
3839
3840Opaque structure types are used to represent structure types that
3841do not have a body specified. This corresponds (for example) to the C
3842notion of a forward declared structure. They can be named (``%X``) or
3843unnamed (``%52``).
3844
3845:Syntax:
3846
3847::
3848
3849      %X = type opaque
3850      %52 = type opaque
3851
3852:Examples:
3853
3854+--------------+-------------------+
3855| ``opaque``   | An opaque type.   |
3856+--------------+-------------------+
3857
3858.. _constants:
3859
3860Constants
3861=========
3862
3863LLVM has several different basic types of constants. This section
3864describes them all and their syntax.
3865
3866Simple Constants
3867----------------
3868
3869**Boolean constants**
3870    The two strings '``true``' and '``false``' are both valid constants
3871    of the ``i1`` type.
3872**Integer constants**
3873    Standard integers (such as '4') are constants of the
3874    :ref:`integer <t_integer>` type. Negative numbers may be used with
3875    integer types.
3876**Floating-point constants**
3877    Floating-point constants use standard decimal notation (e.g.
3878    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
3879    hexadecimal notation (see below). The assembler requires the exact
3880    decimal value of a floating-point constant. For example, the
3881    assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
3882    decimal in binary. Floating-point constants must have a
3883    :ref:`floating-point <t_floating>` type.
3884**Null pointer constants**
3885    The identifier '``null``' is recognized as a null pointer constant
3886    and must be of :ref:`pointer type <t_pointer>`.
3887**Token constants**
3888    The identifier '``none``' is recognized as an empty token constant
3889    and must be of :ref:`token type <t_token>`.
3890
3891The one non-intuitive notation for constants is the hexadecimal form of
3892floating-point constants. For example, the form
3893'``double    0x432ff973cafa8000``' is equivalent to (but harder to read
3894than) '``double 4.5e+15``'. The only time hexadecimal floating-point
3895constants are required (and the only time that they are generated by the
3896disassembler) is when a floating-point constant must be emitted but it
3897cannot be represented as a decimal floating-point number in a reasonable
3898number of digits. For example, NaN's, infinities, and other special
3899values are represented in their IEEE hexadecimal format so that assembly
3900and disassembly do not cause any bits to change in the constants.
3901
3902When using the hexadecimal form, constants of types bfloat, half, float, and
3903double are represented using the 16-digit form shown above (which matches the
3904IEEE754 representation for double); bfloat, half and float values must, however,
3905be exactly representable as bfloat, IEEE 754 half, and IEEE 754 single
3906precision respectively. Hexadecimal format is always used for long double, and
3907there are three forms of long double. The 80-bit format used by x86 is
3908represented as ``0xK`` followed by 20 hexadecimal digits. The 128-bit format
3909used by PowerPC (two adjacent doubles) is represented by ``0xM`` followed by 32
3910hexadecimal digits. The IEEE 128-bit format is represented by ``0xL`` followed
3911by 32 hexadecimal digits. Long doubles will only work if they match the long
3912double format on your target.  The IEEE 16-bit format (half precision) is
3913represented by ``0xH`` followed by 4 hexadecimal digits. The bfloat 16-bit
3914format is represented by ``0xR`` followed by 4 hexadecimal digits. All
3915hexadecimal formats are big-endian (sign bit at the left).
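
As an illustration of the forms described above, the following sketch spells
the value 1.0 in several floating-point types and also shows an infinity and a
quiet NaN. The global names are hypothetical, and the ``x86_fp80`` line assumes
a target whose long double uses the x86 80-bit format:

.. code-block:: llvm

    @d_one  = global double 0x3FF0000000000000         ; 1.0 (IEEE 754 double)
    @f_one  = global float 0x3FF0000000000000          ; 1.0; 16-digit form, exact as a float
    @h_one  = global half 0xH3C00                      ; 1.0 (IEEE 754 half)
    @bf_one = global bfloat 0xR3F80                    ; 1.0 (bfloat)
    @x_one  = global x86_fp80 0xK3FFF8000000000000000  ; 1.0 (x86 80-bit)
    @d_inf  = global double 0x7FF0000000000000         ; +infinity
    @d_nan  = global double 0x7FF8000000000000         ; a quiet NaN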
3916
There are no constants of type ``x86_mmx`` or ``x86_amx``.
3918
3919.. _complexconstants:
3920
3921Complex Constants
3922-----------------
3923
3924Complex constants are a (potentially recursive) combination of simple
3925constants and smaller complex constants.
3926
3927**Structure constants**
3928    Structure constants are represented with notation similar to
3929    structure type definitions (a comma separated list of elements,
3930    surrounded by braces (``{}``)). For example:
3931    "``{ i32 4, float 17.0, ptr @G }``", where "``@G``" is declared as
3932    "``@G = external global i32``". Structure constants must have
3933    :ref:`structure type <t_struct>`, and the number and types of elements
3934    must match those specified by the type.
3935**Array constants**
3936    Array constants are represented with notation similar to array type
3937    definitions (a comma separated list of elements, surrounded by
3938    square brackets (``[]``)). For example:
3939    "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
3940    :ref:`array type <t_array>`, and the number and types of elements must
3941    match those specified by the type. As a special case, character array
3942    constants may also be represented as a double-quoted string using the ``c``
3943    prefix. For example: "``c"Hello World\0A\00"``".
3944**Vector constants**
3945    Vector constants are represented with notation similar to vector
3946    type definitions (a comma separated list of elements, surrounded by
3947    less-than/greater-than's (``<>``)). For example:
3948    "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
3949    must have :ref:`vector type <t_vector>`, and the number and types of
3950    elements must match those specified by the type.
3951**Zero initialization**
    The string '``zeroinitializer``' can be used to zero initialize a
    value of *any* type, including scalar and
3954    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
3955    having to print large zero initializers (e.g. for large arrays) and
3956    is always exactly equivalent to using explicit zero initializers.
3957**Metadata node**
3958    A metadata node is a constant tuple without types. For example:
3959    "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
3960    for example: "``!{!0, i32 0, ptr @global, ptr @function, !"str"}``".
3961    Unlike other typed constants that are meant to be interpreted as part of
3962    the instruction stream, metadata is a place to attach additional
3963    information such as debug info.
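
The following sketch collects the constant forms above as global initializers.
All global names are hypothetical and chosen only for illustration:

.. code-block:: llvm

    @G   = external global i32

    ; Structure constant: element count and types match the struct type.
    @S   = global { i32, float, ptr } { i32 4, float 17.0, ptr @G }

    ; Array constant, and the equivalent string form for an i8 array
    ; (11 characters, a newline, and a terminating NUL: 13 bytes).
    @A   = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    @Str = constant [13 x i8] c"Hello World\0A\00"

    ; Vector constant.
    @V   = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >

    ; Zero initialization of a large aggregate without listing every element.
    @Z   = global [1024 x i32] zeroinitializer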
3964
3965Global Variable and Function Addresses
3966--------------------------------------
3967
3968The addresses of :ref:`global variables <globalvars>` and
3969:ref:`functions <functionstructure>` are always implicitly valid
3970(link-time) constants. These constants are explicitly referenced when
3971the :ref:`identifier for the global <identifiers>` is used and always have
3972:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
3973file:
3974
3975.. code-block:: llvm
3976
3977    @X = global i32 17
3978    @Y = global i32 42
3979    @Z = global [2 x ptr] [ ptr @X, ptr @Y ]
3980
3981.. _undefvalues:
3982
3983Undefined Values
3984----------------
3985
3986The string '``undef``' can be used anywhere a constant is expected, and
3987indicates that the user of the value may receive an unspecified
3988bit-pattern. Undefined values may be of any type (other than '``label``'
3989or '``void``') and be used anywhere a constant is permitted.
3990
3991.. note::
3992
  A '``poison``' value (described in the next section) should be used instead of
3994  '``undef``' whenever possible. Poison values are stronger than undef, and
3995  enable more optimizations. Just the existence of '``undef``' blocks certain
3996  optimizations (see the examples below).
3997
3998Undefined values are useful because they indicate to the compiler that
3999the program is well defined no matter what value is used. This gives the
4000compiler more freedom to optimize. Here are some examples of
4001(potentially surprising) transformations that are valid (in pseudo IR):
4002
4003.. code-block:: llvm
4004
4005      %A = add %X, undef
4006      %B = sub %X, undef
4007      %C = xor %X, undef
4008    Safe:
4009      %A = undef
4010      %B = undef
4011      %C = undef
4012
4013This is safe because all of the output bits are affected by the undef
4014bits. Any output bit can have a zero or one depending on the input bits.
4015
4016.. code-block:: llvm
4017
4018      %A = or %X, undef
4019      %B = and %X, undef
4020    Safe:
4021      %A = -1
4022      %B = 0
4023    Safe:
4024      %A = %X  ;; By choosing undef as 0
4025      %B = %X  ;; By choosing undef as -1
4026    Unsafe:
4027      %A = undef
4028      %B = undef
4029
4030These logical operations have bits that are not always affected by the
4031input. For example, if ``%X`` has a zero bit, then the output of the
4032'``and``' operation will always be a zero for that bit, no matter what
4033the corresponding bit from the '``undef``' is. As such, it is unsafe to
4034optimize or assume that the result of the '``and``' is '``undef``'.
4035However, it is safe to assume that all bits of the '``undef``' could be
40360, and optimize the '``and``' to 0. Likewise, it is safe to assume that
4037all the bits of the '``undef``' operand to the '``or``' could be set,
4038allowing the '``or``' to be folded to -1.
4039
4040.. code-block:: llvm
4041
4042      %A = select undef, %X, %Y
4043      %B = select undef, 42, %Y
4044      %C = select %X, %Y, undef
4045    Safe:
4046      %A = %X     (or %Y)
4047      %B = 42     (or %Y)
4048      %C = %Y     (if %Y is provably not poison; unsafe otherwise)
4049    Unsafe:
4050      %A = undef
4051      %B = undef
4052      %C = undef
4053
4054This set of examples shows that undefined '``select``' (and conditional
4055branch) conditions can go *either way*, but they have to come from one
4056of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
4057both known to have a clear low bit, then ``%A`` would have to have a
4058cleared low bit. However, in the ``%C`` example, the optimizer is
4059allowed to assume that the '``undef``' operand could be the same as
4060``%Y`` if ``%Y`` is provably not '``poison``', allowing the whole '``select``'
4061to be eliminated. This is because '``poison``' is stronger than '``undef``'.
4062
.. code-block:: llvm

      %A = xor undef, undef

      %B = undef
      %C = xor %B, %B

      %D = undef
      %E = icmp slt %D, 4
      %F = icmp sge %D, 4

    Safe:
      %A = undef
      %B = undef
      %C = undef
      %D = undef
      %E = undef
      %F = undef
4080      %F = undef
4081
This example points out that two '``undef``' operands are not
necessarily the same. This can be surprising to people who assume that
"``X^X``" is always zero, even if ``X`` is undefined (the behavior shown
here also matches C semantics). This isn't true for a number of reasons, but the
4086short answer is that an '``undef``' "variable" can arbitrarily change
4087its value over its "live range". This is true because the variable
4088doesn't actually *have a live range*. Instead, the value is logically
4089read from arbitrary registers that happen to be around when needed, so
4090the value is not necessarily consistent over time. In fact, ``%A`` and
4091``%C`` need to have the same semantics or the core LLVM "replace all
4092uses with" concept would not hold.
4093
4094To ensure all uses of a given register observe the same value (even if
4095'``undef``'), the :ref:`freeze instruction <i_freeze>` can be used.
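
As a minimal sketch of this use of ``freeze`` (the function name is
hypothetical), freezing the '``undef``' once gives both operands of the
'``xor``' the same value, so the result is zero rather than '``undef``':

.. code-block:: llvm

    define i32 @xor_frozen() {
      %f = freeze i32 undef   ; some arbitrary but fixed i32
      %r = xor i32 %f, %f     ; both uses of %f observe the same value: 0
      ret i32 %r
    }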
4096
4097.. code-block:: llvm
4098
4099      %A = sdiv undef, %X
4100      %B = sdiv %X, undef
4101    Safe:
4102      %A = 0
4103    b: unreachable
4104
4105These examples show the crucial difference between an *undefined value*
4106and *undefined behavior*. An undefined value (like '``undef``') is
4107allowed to have an arbitrary bit-pattern. This means that the ``%A``
4108operation can be constant folded to '``0``', because the '``undef``'
4109could be zero, and zero divided by any value is zero.
4110However, in the second example, we can make a more aggressive
4111assumption: because the ``undef`` is allowed to be an arbitrary value,
4112we are allowed to assume that it could be zero. Since a divide by zero
4113has *undefined behavior*, we are allowed to assume that the operation
4114does not execute at all. This allows us to delete the divide and all
4115code after it. Because the undefined operation "can't happen", the
4116optimizer can assume that it occurs in dead code.
4117
4118.. code-block:: text
4119
4120    a:  store undef -> %X
4121    b:  store %X -> undef
4122    Safe:
4123    a: <deleted>     (if the stored value in %X is provably not poison)
4124    b: unreachable
4125
4126A store *of* an undefined value can be assumed to not have any effect;
4127we can assume that the value is overwritten with bits that happen to
4128match what was already there. This argument is only valid if the stored value
4129is provably not ``poison``. However, a store *to* an undefined
4130location could clobber arbitrary memory, therefore, it has undefined
4131behavior.
4132
4133Branching on an undefined value is undefined behavior.
4134This explains optimizations that depend on branch conditions to construct
4135predicates, such as Correlated Value Propagation and Global Value Numbering.
In the case of a switch instruction, the condition should likewise be frozen;
otherwise the behavior is undefined.
4138
4139.. code-block:: llvm
4140
4141    Unsafe:
4142      br undef, BB1, BB2 ; UB
4143
4144      %X = and i32 undef, 255
4145      switch %X, label %ret [ .. ] ; UB
4146
4147      store undef, ptr %ptr
4148      %X = load ptr %ptr ; %X is undef
4149      switch i8 %X, label %ret [ .. ] ; UB
4150
4151    Safe:
4152      %X = or i8 undef, 255 ; always 255
4153      switch i8 %X, label %ret [ .. ] ; Well-defined
4154
4155      %X = freeze i1 undef
4156      br %X, BB1, BB2 ; Well-defined (non-deterministic jump)
4157
4158
4159
4160.. _poisonvalues:
4161
4162Poison Values
4163-------------
4164
4165A poison value is a result of an erroneous operation.
4166In order to facilitate speculative execution, many instructions do not
4167invoke immediate undefined behavior when provided with illegal operands,
4168and return a poison value instead.
4169The string '``poison``' can be used anywhere a constant is expected, and
4170operations such as :ref:`add <i_add>` with the ``nsw`` flag can produce
4171a poison value.
4172
4173Most instructions return '``poison``' when one of their arguments is
4174'``poison``'. A notable exception is the :ref:`select instruction <i_select>`.
4175Propagation of poison can be stopped with the
4176:ref:`freeze instruction <i_freeze>`.
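
The following sketch (with a hypothetical function name) illustrates both
points: a '``select``' only propagates poison from its condition and from the
operand it actually selects, and '``freeze``' turns a possibly poison value
into an arbitrary but fixed one:

.. code-block:: llvm

    define i32 @stop_poison(i32 %x, i1 %c) {
      %p = add nsw i32 %x, 1            ; poison if %x is the largest signed i32
      %s = select i1 %c, i32 %p, i32 0  ; 0 when %c is false, even if %p is poison
      %f = freeze i32 %p                ; never poison: an arbitrary but fixed i32
      %r = xor i32 %s, %f               ; poison only if %s is poison
      ret i32 %r
    }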
4177
4178It is correct to replace a poison value with an
4179:ref:`undef value <undefvalues>` or any value of the type.
4180
This means that immediate undefined behavior occurs if a poison value is
used as an instruction operand for which some values trigger undefined
behavior. Notably this includes (but is not limited to):
4184
4185-  The pointer operand of a :ref:`load <i_load>`, :ref:`store <i_store>` or
4186   any other pointer dereferencing instruction (independent of address
4187   space).
4188-  The divisor operand of a ``udiv``, ``sdiv``, ``urem`` or ``srem``
4189   instruction.
4190-  The condition operand of a :ref:`br <i_br>` instruction.
4191-  The callee operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4192   instruction.
4193-  The parameter operand of a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
4194   instruction, when the function or invoking call site has a ``noundef``
4195   attribute in the corresponding position.
4196-  The operand of a :ref:`ret <i_ret>` instruction if the function or invoking
   call site has a ``noundef`` attribute in the return value position.
4198
4199Here are some examples:
4200
4201.. code-block:: llvm
4202
4203    entry:
4204      %poison = sub nuw i32 0, 1           ; Results in a poison value.
4205      %poison2 = sub i32 poison, 1         ; Also results in a poison value.
4206      %still_poison = and i32 %poison, 0   ; 0, but also poison.
4207      %poison_yet_again = getelementptr i32, ptr @h, i32 %still_poison
4208      store i32 0, ptr %poison_yet_again   ; Undefined behavior due to
4209                                           ; store to poison.
4210
4211      store i32 %poison, ptr @g            ; Poison value stored to memory.
4212      %poison3 = load i32, ptr @g          ; Poison value loaded back from memory.
4213
4214      %poison4 = load i16, ptr @g          ; Returns a poison value.
4215      %poison5 = load i64, ptr @g          ; Returns a poison value.
4216
4217      %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
4218      br i1 %cmp, label %end, label %end   ; undefined behavior
4219
4220    end:
4221
4222.. _welldefinedvalues:
4223
4224Well-Defined Values
4225-------------------
4226
4227Given a program execution, a value is *well defined* if the value does not
4228have an undef bit and is not poison in the execution.
4229An aggregate value or vector is well defined if its elements are well defined.
4230The padding of an aggregate isn't considered, since it isn't visible
4231without storing it into memory and loading it with a different type.
4232
4233A constant of a :ref:`single value <t_single_value>`, non-vector type is well
defined if it is neither an '``undef``' constant nor a '``poison``' constant.
The result of a :ref:`freeze instruction <i_freeze>` is well defined regardless
4236of its operand.
4237
4238.. _blockaddress:
4239
4240Addresses of Basic Blocks
4241-------------------------
4242
4243``blockaddress(@function, %block)``
4244
4245The '``blockaddress``' constant computes the address of the specified
4246basic block in the specified function.
4247
It always has a ``ptr addrspace(P)`` type, where ``P`` is the address space
4249of the function containing ``%block`` (usually ``addrspace(0)``).
4250
4251Taking the address of the entry block is illegal.
4252
4253This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' or ':ref:`callbr <i_callbr>`' instruction, or
for comparisons against null. Pointer equality tests between label addresses
result in undefined behavior, though, again, comparison against null is ok,
4257and no label is equal to the null pointer. This may be passed around as an
4258opaque pointer sized value as long as the bits are not inspected. This
4259allows ``ptrtoint`` and arithmetic to be performed on these values so
4260long as the original value is reconstituted before the ``indirectbr`` or
4261``callbr`` instruction.
4262
4263Finally, some targets may provide defined semantics when using the value
4264as the operand to an inline assembly, but that is target specific.
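
A typical use is a jump table consumed by '``indirectbr``'. The sketch below
uses hypothetical names (``@targets``, ``@dispatch``) and assumes the caller
only passes ``0`` or ``1`` for ``%i``:

.. code-block:: llvm

    ; Table of block addresses; neither entry refers to the entry block.
    @targets = private constant [2 x ptr] [ ptr blockaddress(@dispatch, %bb0),
                                            ptr blockaddress(@dispatch, %bb1) ]

    define i32 @dispatch(i64 %i) {
    entry:
      %slot = getelementptr inbounds [2 x ptr], ptr @targets, i64 0, i64 %i
      %dest = load ptr, ptr %slot
      ; Every possible destination must be listed.
      indirectbr ptr %dest, [ label %bb0, label %bb1 ]
    bb0:
      ret i32 0
    bb1:
      ret i32 1
    }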
4265
4266.. _dso_local_equivalent:
4267
4268DSO Local Equivalent
4269--------------------
4270
4271``dso_local_equivalent @func``
4272
4273A '``dso_local_equivalent``' constant represents a function which is
4274functionally equivalent to a given function, but is always defined in the
4275current linkage unit. The resulting pointer has the same type as the underlying
4276function. The resulting pointer is permitted, but not required, to be different
4277from a pointer to the function, and it may have different values in different
4278translation units.
4279
4280The target function may not have ``extern_weak`` linkage.
4281
4282``dso_local_equivalent`` can be implemented as such:
4283
4284- If the function has local linkage, hidden visibility, or is
4285  ``dso_local``, ``dso_local_equivalent`` can be implemented as simply a pointer
4286  to the function.
4287- ``dso_local_equivalent`` can be implemented with a stub that tail-calls the
4288  function. Many targets support relocations that resolve at link time to either
  a function or a stub for it, depending on whether the function is defined within the
4290  linkage unit; LLVM will use this when available. (This is commonly called a
4291  "PLT stub".) On other targets, the stub may need to be emitted explicitly.
4292
4293This can be used wherever a ``dso_local`` instance of a function is needed without
4294needing to explicitly make the original function ``dso_local``. An instance where
4295this can be used is for static offset calculations between a function and some other
4296``dso_local`` symbol. This is especially useful for the Relative VTables C++ ABI,
4297where dynamic relocations for function pointers in VTables can be replaced with
4298static relocations for offsets between the VTable and virtual functions which
4299may not be ``dso_local``.
4300
4301This is currently only supported for ELF binary formats.
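
The relative-offset use case described above might look roughly like the
following sketch. The names ``@virt`` and ``@vtable`` are hypothetical, and the
layout is simplified to a single 32-bit entry:

.. code-block:: llvm

    ; A function that is not necessarily dso_local.
    declare void @virt()

    ; The entry stores the 32-bit offset from @vtable to a dso_local
    ; equivalent of @virt, so no dynamic relocation against @virt is needed.
    @vtable = dso_local unnamed_addr constant [1 x i32] [
      i32 trunc (i64 sub (i64 ptrtoint (ptr dso_local_equivalent @virt to i64),
                          i64 ptrtoint (ptr @vtable to i64)) to i32)
    ]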
4302
4303.. _no_cfi:
4304
4305No CFI
4306------
4307
4308``no_cfi @func``
4309
4310With `Control-Flow Integrity (CFI)
4311<https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_, a '``no_cfi``'
4312constant represents a function reference that does not get replaced with a
4313reference to the CFI jump table in the ``LowerTypeTests`` pass. These constants
4314may be useful in low-level programs, such as operating system kernels, which
4315need to refer to the actual function body.
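
As a small sketch with hypothetical names, a global that must hold the address
of the function body itself could be written as:

.. code-block:: llvm

    declare void @handler()

    ; Refers to @handler directly, even if other references to @handler are
    ; redirected to a CFI jump table.
    @raw_entry = global ptr no_cfi @handler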
4316
4317.. _constantexprs:
4318
4319Constant Expressions
4320--------------------
4321
4322Constant expressions are used to allow expressions involving other
4323constants to be used as constants. Constant expressions may be of any
4324:ref:`first class <t_firstclass>` type and may involve any LLVM operation
4325that does not have side effects (e.g. load and call are not supported).
4326The following is the syntax for constant expressions:
4327
4328``trunc (CST to TYPE)``
4329    Perform the :ref:`trunc operation <i_trunc>` on constants.
4330``zext (CST to TYPE)``
4331    Perform the :ref:`zext operation <i_zext>` on constants.
4332``sext (CST to TYPE)``
4333    Perform the :ref:`sext operation <i_sext>` on constants.
4334``fptrunc (CST to TYPE)``
4335    Truncate a floating-point constant to another floating-point type.
4336    The size of CST must be larger than the size of TYPE. Both types
4337    must be floating-point.
4338``fpext (CST to TYPE)``
4339    Floating-point extend a constant to another type. The size of CST
    must be smaller than or equal to the size of TYPE. Both types must be
4341    floating-point.
4342``fptoui (CST to TYPE)``
4343    Convert a floating-point constant to the corresponding unsigned
4344    integer constant. TYPE must be a scalar or vector integer type. CST
4345    must be of scalar or vector floating-point type. Both CST and TYPE
4346    must be scalars, or vectors of the same number of elements. If the
4347    value won't fit in the integer type, the result is a
4348    :ref:`poison value <poisonvalues>`.
4349``fptosi (CST to TYPE)``
4350    Convert a floating-point constant to the corresponding signed
4351    integer constant. TYPE must be a scalar or vector integer type. CST
4352    must be of scalar or vector floating-point type. Both CST and TYPE
4353    must be scalars, or vectors of the same number of elements. If the
4354    value won't fit in the integer type, the result is a
4355    :ref:`poison value <poisonvalues>`.
4356``uitofp (CST to TYPE)``
4357    Convert an unsigned integer constant to the corresponding
4358    floating-point constant. TYPE must be a scalar or vector floating-point
4359    type.  CST must be of scalar or vector integer type. Both CST and TYPE must
4360    be scalars, or vectors of the same number of elements.
4361``sitofp (CST to TYPE)``
4362    Convert a signed integer constant to the corresponding floating-point
4363    constant. TYPE must be a scalar or vector floating-point type.
4364    CST must be of scalar or vector integer type. Both CST and TYPE must
4365    be scalars, or vectors of the same number of elements.
4366``ptrtoint (CST to TYPE)``
4367    Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
4368``inttoptr (CST to TYPE)``
4369    Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
4370    This one is *really* dangerous!
4371``bitcast (CST to TYPE)``
4372    Convert a constant, CST, to another TYPE.
4373    The constraints of the operands are the same as those for the
4374    :ref:`bitcast instruction <i_bitcast>`.
4375``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
4377    TYPE in a different address space. The constraints of the operands are the
4378    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
4379``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
4380    Perform the :ref:`getelementptr operation <i_getelementptr>` on
4381    constants. As with the :ref:`getelementptr <i_getelementptr>`
4382    instruction, the index list may have one or more indexes, which are
4383    required to make sense for the type of "pointer to TY".
4384``select (COND, VAL1, VAL2)``
4385    Perform the :ref:`select operation <i_select>` on constants.
4386``icmp COND (VAL1, VAL2)``
4387    Perform the :ref:`icmp operation <i_icmp>` on constants.
4388``fcmp COND (VAL1, VAL2)``
4389    Perform the :ref:`fcmp operation <i_fcmp>` on constants.
4390``extractelement (VAL, IDX)``
4391    Perform the :ref:`extractelement operation <i_extractelement>` on
4392    constants.
4393``insertelement (VAL, ELT, IDX)``
4394    Perform the :ref:`insertelement operation <i_insertelement>` on
4395    constants.
4396``shufflevector (VEC1, VEC2, IDXMASK)``
4397    Perform the :ref:`shufflevector operation <i_shufflevector>` on
4398    constants.
4399``extractvalue (VAL, IDX0, IDX1, ...)``
4400    Perform the :ref:`extractvalue operation <i_extractvalue>` on
4401    constants. The index list is interpreted in a similar manner as
4402    indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At
4403    least one index value must be specified.
4404``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
4405    Perform the :ref:`insertvalue operation <i_insertvalue>` on constants.
4406    The index list is interpreted in a similar manner as indices in a
4407    ':ref:`getelementptr <i_getelementptr>`' operation. At least one index
4408    value must be specified.
4409``OPCODE (LHS, RHS)``
    Perform the specified operation on the LHS and RHS constants. OPCODE
4411    may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
4412    binary <bitwiseops>` operations. The constraints on operands are
4413    the same as those for the corresponding instruction (e.g. no bitwise
4414    operations on floating-point values are allowed).
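
The following sketch shows a few of these forms as global initializers. The
global names are hypothetical:

.. code-block:: llvm

    @arr     = global [4 x i32] [ i32 1, i32 2, i32 3, i32 4 ]

    ; Address of the third element of @arr.
    @third   = global ptr getelementptr inbounds ([4 x i32], ptr @arr, i64 0, i64 2)

    ; Integer value of the address of @arr, plus 8.
    @addr    = global i64 add (i64 ptrtoint (ptr @arr to i64), i64 8)

    ; Comparison of two constant pointers.
    @is_null = global i1 icmp eq (ptr @arr, ptr null)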
4415
4416Other Values
4417============
4418
4419.. _inlineasmexprs:
4420
4421Inline Assembler Expressions
4422----------------------------
4423
4424LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
4425Inline Assembly <moduleasm>`) through the use of a special value. This value
4426represents the inline assembler as a template string (containing the
4427instructions to emit), a list of operand constraints (stored as a string), a
4428flag that indicates whether or not the inline asm expression has side effects,
4429and a flag indicating whether the function containing the asm needs to align its
4430stack conservatively.
4431
4432The template string supports argument substitution of the operands using "``$``"
4433followed by a number, to indicate substitution of the given register/memory
4434location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
4435be used, where ``MODIFIER`` is a target-specific annotation for how to print the
4436operand (See :ref:`inline-asm-modifiers`).
4437
4438A literal "``$``" may be included by using "``$$``" in the template. To include
4439other special characters into the output, the usual "``\XX``" escapes may be
4440used, just as in other strings. Note that after template substitution, the
4441resulting assembly string is parsed by LLVM's integrated assembler unless it is
4442disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
4443syntax known to LLVM.
4444
4445LLVM also supports a few more substitutions useful for writing inline assembly:
4446
4447- ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
4448  This substitution is useful when declaring a local label. Many standard
4449  compiler optimizations, such as inlining, may duplicate an inline asm blob.
4450  Adding a blob-unique identifier ensures that the two labels will not conflict
4451  during assembly. This is used to implement `GCC's %= special format
4452  string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
4453- ``${:comment}``: Expands to the comment character of the current target's
4454  assembly dialect. This is usually ``#``, but many targets use other strings,
4455  such as ``;``, ``//``, or ``!``.
4456- ``${:private}``: Expands to the assembler private label prefix. Labels with
4457  this prefix will not appear in the symbol table of the assembled object.
4458  Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
4459  relatively popular.
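
As a sketch of how these substitutions combine, the following template declares
a private local label that stays unique even if the asm blob is duplicated. It
assumes an x86 target using the default ATT dialect, and the function and label
names are hypothetical:

.. code-block:: llvm

    define void @skip_nop() {
      ; Jumps over the nop to a private label made unique by ${:uid}.
      call void asm sideeffect "jmp ${:private}skip${:uid}\0A\09nop ${:comment} never executed\0A${:private}skip${:uid}:", ""()
      ret void
    }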
4460
4461LLVM's support for inline asm is modeled closely on the requirements of Clang's
4462GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
4463modifier codes listed here are similar or identical to those in GCC's inline asm
4464support. However, to be clear, the syntax of the template and constraint strings
4465described here is *not* the same as the syntax accepted by GCC and Clang, and,
4466while most constraint letters are passed through as-is by Clang, some get
4467translated to other codes when converting from the C source to the LLVM
4468assembly.
4469
4470An example inline assembler expression is:
4471
4472.. code-block:: llvm
4473
4474    i32 (i32) asm "bswap $0", "=r,r"
4475
4476Inline assembler expressions may **only** be used as the callee operand
4477of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
4478Thus, typically we have:
4479
4480.. code-block:: llvm
4481
4482    %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)
4483
4484Inline asms with side effects not visible in the constraint list must be
4485marked as having side effects. This is done through the use of the
4486'``sideeffect``' keyword, like so:
4487
4488.. code-block:: llvm
4489
4490    call void asm sideeffect "eieio", ""()
4491
4492In some cases inline asms will contain code that will not work unless
4493the stack is aligned in some way, such as calls or SSE instructions on
4494x86, yet will not contain code that does that alignment within the asm.
4495The compiler should make conservative assumptions about what the asm
4496might contain and should generate its usual stack alignment code in the
4497prologue if the '``alignstack``' keyword is present:
4498
4499.. code-block:: llvm
4500
4501    call void asm alignstack "eieio", ""()
4502
4503Inline asms also support using non-standard assembly dialects. The
4504assumed dialect is ATT. When the '``inteldialect``' keyword is present,
4505the inline asm is using the Intel dialect. Currently, ATT and Intel are
4506the only supported dialects. An example is:
4507
4508.. code-block:: llvm
4509
4510    call void asm inteldialect "eieio", ""()
4511
4512In the case that the inline asm might unwind the stack,
4513the '``unwind``' keyword must be used, so that the compiler emits
4514unwinding information:
4515
4516.. code-block:: llvm
4517
4518    call void asm unwind "call func", ""()
4519
4520If the inline asm unwinds the stack and isn't marked with
4521the '``unwind``' keyword, the behavior is undefined.
4522
4523If multiple keywords appear, the '``sideeffect``' keyword must come
4524first, the '``alignstack``' keyword second, the '``inteldialect``' keyword
4525third and the '``unwind``' keyword last.
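
For instance, an inline asm that uses all four keywords would be written in
this order. This is a sketch assuming an x86 target, and ``callee`` is a
hypothetical symbol:

.. code-block:: llvm

    call void asm sideeffect alignstack inteldialect unwind "call callee", ""()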
4526
4527Inline Asm Constraint String
4528^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4529
4530The constraint list is a comma-separated string, each element containing one or
4531more constraint codes.
4532
4533For each element in the constraint list an appropriate register or memory
4534operand will be chosen, and it will be made available to assembly template
4535string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
4536second, etc.
4537
4538There are three different types of constraints, which are distinguished by a
4539prefix symbol in front of the constraint code: Output, Input, and Clobber. The
4540constraints must always be given in that order: outputs first, then inputs, then
4541clobbers. They cannot be intermingled.
4542
4543There are also three different categories of constraint codes:
4544
4545- Register constraint. This is either a register class, or a fixed physical
4546  register. This kind of constraint will allocate a register, and if necessary,
4547  bitcast the argument or result to the appropriate type.
4548- Memory constraint. This kind of constraint is for use with an instruction
4549  taking a memory operand. Different constraints allow for different addressing
4550  modes used by the target.
4551- Immediate value constraint. This kind of constraint is for an integer or other
4552  immediate value which can be rendered directly into an instruction. The
4553  various target-specific constraints allow the selection of a value in the
4554  proper range for the instruction you wish to use it with.
4555
4556Output constraints
4557""""""""""""""""""
4558
4559Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
4560indicates that the assembly will write to this operand, and the operand will
4561then be made available as a return value of the ``asm`` expression. Output
4562constraints do not consume an argument from the call instruction. (Except, see
4563below about indirect outputs).
4564
4565Normally, it is expected that no output locations are written to by the assembly
4566expression until *all* of the inputs have been read. As such, LLVM may assign
4567the same register to an output and an input. If this is not safe (e.g. if the
4568assembly contains two instructions, where the first writes to one output, and
4569the second reads an input and writes to a second output), then the "``&``"
4570modifier must be used (e.g. "``=&r``") to specify that the output is an
4571"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
4572will not use the same register for any inputs (other than an input tied to this
4573output).
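
A minimal sketch of an early-clobber output, assuming an x86 target with the
ATT dialect (the function name is hypothetical): the first instruction writes
``$0`` before the second one reads ``$2``, so ``$0`` must not share a register
with that input.

.. code-block:: llvm

    define i32 @add_via_asm(i32 %a, i32 %b) {
      ; $0 is the early-clobber output, $1 and $2 are the inputs.
      %sum = call i32 asm "movl $1, $0\0A\09addl $2, $0", "=&r,r,r,~{flags}"(i32 %a, i32 %b)
      ret i32 %sum
    }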
4574
4575Input constraints
4576"""""""""""""""""
4577
4578Input constraints do not have a prefix -- just the constraint codes. Each input
4579constraint will consume one argument from the call instruction. It is not
4580permitted for the asm to write to any input register or memory location (unless
4581that input is tied to an output). Note also that multiple inputs may all be
4582assigned to the same register, if LLVM can determine that they necessarily all
4583contain the same value.
4584
4585Instead of providing a Constraint Code, input constraints may also "tie"
4586themselves to an output constraint, by providing an integer as the constraint
4587string. Tied inputs still consume an argument from the call instruction, and
4588take up a position in the asm template numbering as is usual -- they will simply
4589be constrained to always use the same register as the output they've been tied
4590to. For example, a constraint string of "``=r,0``" says to assign a register for
4591output, and use that register as an input as well (it being the 0'th
4592constraint).
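
A tied input might look like the following sketch (again assuming x86-64 and
AT&T syntax; ``@increment`` is a made-up name). The input ``%x`` is placed in
the same register as the result, so the instruction updates it in place:

.. code-block:: llvm

    define i32 @increment(i32 %x) {
      ; "=r" is constraint 0. The input constraint "0" ties %x to it, so the
      ; template operands $0 and $1 name the same register.
      %r = call i32 asm "incl $0", "=r,0,~{flags}"(i32 %x)
      ret i32 %r
    }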
4593
4594It is permitted to tie an input to an "early-clobber" output. In that case, no
4595*other* input may share the same register as the input tied to the early-clobber
4596(even when the other input has the same value).
4597
4598You may only tie an input to an output which has a register constraint, not a
4599memory constraint. Only a single input may be tied to an output.
4600
4601There is also an "interesting" feature which deserves a bit of explanation: if a
4602register class constraint allocates a register which is too small for the value
4603type operand provided as input, the input value will be split into multiple
4604registers, and all of them passed to the inline asm.
4605
4606However, this feature is often not as useful as you might think.
4607
Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (e.g. the 32-bit
SparcV8 has a 64-bit load whose instruction names a single 32-bit register; the
hardware then loads into both the named register and the next register. This
feature of inline asm would not be useful to support that.)
4614
4615A few of the targets provide a template string modifier allowing explicit access
4616to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
4617``D``). On such an architecture, you can actually access the second allocated
4618register (yet, still, not any subsequent ones). But, in that case, you're still
4619probably better off simply splitting the value into two separate operands, for
4620clarity. (e.g. see the description of the ``A`` constraint on X86, which,
4621despite existing only for use with this feature, is not really a good idea to
use.)
4623
4624Indirect inputs and outputs
4625"""""""""""""""""""""""""""
4626
4627Indirect output or input constraints can be specified by the "``*``" modifier
4628(which goes after the "``=``" in case of an output). This indicates that the asm
4629will write to or read from the contents of an *address* provided as an input
4630argument. (Note that in this way, indirect outputs act more like an *input* than
4631an output: just like an input, they consume an argument of the call expression,
4632rather than producing a return value. An indirect output constraint is an
4633"output" only in that the asm is expected to write to the contents of the input
4634memory location, instead of just read from it).
4635
This is most typically used for a memory constraint, e.g. "``=*m``", to pass the
address of a variable as a value.
4638
4639It is also possible to use an indirect *register* constraint, but only on output
4640(e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
4641value normally, and then, separately emit a store to the address provided as
input, after the provided inline asm. (It's not clear what value this
functionality provides, compared to writing the store explicitly after the asm
statement, and it can only produce worse code, since it bypasses many
optimization passes. Its use is not recommended.)
4646
4647Call arguments for indirect constraints must have pointer type and must specify
4648the :ref:`elementtype <attr_elementtype>` attribute to indicate the pointer
4649element type.
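
The sketch below shows one indirect output and one indirect input, assuming an
x86-64 target, AT&T syntax, and opaque pointers; the function name is
hypothetical:

.. code-block:: llvm

    define i32 @indirect(ptr %p) {
      ; Indirect output: consumes %p as an argument and the asm stores through
      ; it, so the call itself returns void. "$$" prints a literal "$".
      call void asm "movl $$7, $0", "=*m"(ptr elementtype(i32) %p)
      ; Indirect input: the asm reads the i32 that %p points to.
      %v = call i32 asm "movl $1, $0", "=r,*m"(ptr elementtype(i32) %p)
      ret i32 %v
    }

In both calls the ``elementtype(i32)`` attribute tells LLVM the type of the
memory being accessed, since ``ptr`` alone does not carry it.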
4650
4651Clobber constraints
4652"""""""""""""""""""
4653
4654A clobber constraint is indicated by a "``~``" prefix. A clobber does not
4655consume an input operand, nor generate an output. Clobbers cannot use any of the
4656general constraint code letters -- they may use only explicit register
4657constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
4658"``~{memory}``" indicates that the assembly writes to arbitrary undeclared
4659memory locations -- not only the memory pointed to by a declared indirect
4660output.
4661
4662Note that clobbering named registers that are also present in output
4663constraints is not legal.
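
For illustration only (x86-64 and AT&T syntax are assumed, and ``@clobbers`` is
a made-up name), a register clobber and a memory clobber could be written as:

.. code-block:: llvm

    define void @clobbers() {
      ; The asm overwrites eax without declaring it as an output, so eax and
      ; the flags are listed as clobbers.
      call void asm sideeffect "xorl %eax, %eax", "~{eax},~{flags}"()
      ; An empty template with a memory clobber, a pattern often used as a
      ; compiler-level barrier.
      call void asm sideeffect "", "~{memory}"()
      ret void
    }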
4664
4665Label constraints
4666"""""""""""""""""
4667
4668A label constraint is indicated by a "``!``" prefix and typically used in the
4669form ``"!i"``. Instead of consuming call arguments, label constraints consume
4670indirect destination labels of ``callbr`` instructions.
4671
4672Label constraints can only be used in conjunction with ``callbr`` and the
4673number of label constraints must match the number of indirect destination
4674labels in the ``callbr`` instruction.
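
A possible use, sketched for x86-64 with AT&T syntax (the function and block
names are illustrative), has one register input and one label. The label is
template operand 1 and is printed with the ``l`` modifier:

.. code-block:: llvm

    define i32 @is_nonzero(i32 %x) {
    entry:
      ; "r" consumes %x; "!i" consumes the indirect destination %nonzero.
      callbr void asm sideeffect "testl $0, $0\0Ajne ${1:l}", "r,!i,~{flags}"(i32 %x)
          to label %fallthrough [label %nonzero]

    fallthrough:
      ret i32 0

    nonzero:
      ret i32 1
    }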
4675
4676
Constraint Codes
""""""""""""""""

After a potential prefix comes the constraint code, or codes.
4680
4681A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
4682followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
4683(e.g. "``{eax}``").
4684
4685The one and two letter constraint codes are typically chosen to be the same as
4686GCC's constraint codes.
4687
A single constraint may include one or more constraint codes, leaving
it up to LLVM to choose which one to use. This is included mainly for
compatibility with the translation of GCC inline asm coming from clang.
4691
4692There are two ways to specify alternatives, and either or both may be used in an
4693inline asm constraint list:
4694
46951) Append the codes to each other, making a constraint code set. E.g. "``im``"
4696   or "``{eax}m``". This means "choose any of the options in the set". The
4697   choice of constraint is made independently for each constraint in the
4698   constraint list.
4699
47002) Use "``|``" between constraint code sets, creating alternatives. Every
4701   constraint in the constraint list must have the same number of alternative
4702   sets. With this syntax, the same alternative in *all* of the items in the
4703   constraint list will be chosen together.
4704
4705Putting those together, you might have a two operand constraint string like
4706``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
4707operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
may be one of ``r`` or ``m``. But, operand 0 and 1 cannot both be of type ``m``.
4709
4710However, the use of either of the alternatives features is *NOT* recommended, as
4711LLVM is not able to make an intelligent choice about which one to use. (At the
4712point it currently needs to choose, not enough information is available to do so
in a smart way.) Thus, it simply tries to make a choice that's most likely to
compile, not one that will give optimal performance (e.g., given "``rm``", it'll
always choose to use memory, not registers). And, if given multiple registers,
4716or multiple register classes, it will simply choose the first one. (In fact, it
4717doesn't currently even ensure explicitly specified physical registers are
4718unique, so specifying multiple physical registers as alternatives, like
4719``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was
4720intended.)
4721
4722Supported Constraint Code List
4723""""""""""""""""""""""""""""""
4724
4725The constraint codes are, in general, expected to behave the same way they do in
4726GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
4727inline asm code which was supported by GCC. A mismatch in behavior between LLVM
4728and GCC likely indicates a bug in LLVM.
4729
4730Some constraint codes are typically supported by all targets:
4731
4732- ``r``: A register in the target's general purpose register class.
4733- ``m``: A memory address operand. It is target-specific what addressing modes
4734  are supported, typical examples are register, or register + register offset,
4735  or register + immediate offset (of some target-specific size).
4736- ``p``: An address operand. Similar to ``m``, but used by "load address"
4737  type instructions without touching memory.
4738- ``i``: An integer constant (of target-specific width). Allows either a simple
4739  immediate, or a relocatable value.
4740- ``n``: An integer constant -- *not* including relocatable values.
4741- ``s``: An integer constant, but allowing *only* relocatable values.
4742- ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
4743  useful to pass a label for an asm branch or call.
4744
4745  .. FIXME: but that surely isn't actually okay to jump out of an asm
4746     block without telling llvm about the control transfer???)
4747
4748- ``{register-name}``: Requires exactly the named physical register.
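
As a rough sketch of how some of these codes combine (x86-64 and AT&T syntax
are assumed; the function name and the constant 42 are arbitrary), the call
below pins its output and first input to named registers and requires its
second argument to be a constant:

.. code-block:: llvm

    define i32 @add_const(i32 %x) {
      ; "={eax}" pins the result to eax, "{ecx}" pins %x to ecx, and "i"
      ; requires the second argument to be an integer constant.
      %r = call i32 asm "movl $1, $0\0Aaddl $2, $0", "={eax},{ecx},i,~{flags}"(i32 %x, i32 42)
      ret i32 %r
    }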
4749
4750Other constraints are target-specific:
4751
4752AArch64:
4753
4754- ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
4755- ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
4756  i.e. 0 to 4095 with optional shift by 12.
4757- ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
4758  ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
4759- ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
4760  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
4761- ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
4762  logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
4763- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
4764  32-bit register. This is a superset of ``K``: in addition to the bitmask
4765  immediate, also allows immediate integers which can be loaded with a single
  ``MOVZ`` or ``MOVN`` instruction.
4767- ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
4768  64-bit register. This is a superset of ``L``.
4769- ``Q``: Memory address operand must be in a single register (no
4770  offsets). (However, LLVM currently does this for the ``m`` constraint as
4771  well.)
4772- ``r``: A 32 or 64-bit integer register (W* or X*).
4773- ``w``: A 32, 64, or 128-bit floating-point, SIMD or SVE vector register.
4774- ``x``: Like w, but restricted to registers 0 to 15 inclusive.
4775- ``y``: Like w, but restricted to SVE vector registers Z0 to Z7 inclusive.
4776- ``Upl``: One of the low eight SVE predicate registers (P0 to P7)
4777- ``Upa``: Any of the SVE predicate registers (P0 to P15)
4778
4779AMDGPU:
4780
4781- ``r``: A 32 or 64-bit integer register.
4782- ``[0-9]v``: The 32-bit VGPR register, number 0-9.
4783- ``[0-9]s``: The 32-bit SGPR register, number 0-9.
4784- ``[0-9]a``: The 32-bit AGPR register, number 0-9.
4785- ``I``: An integer inline constant in the range from -16 to 64.
4786- ``J``: A 16-bit signed integer constant.
4787- ``A``: An integer or a floating-point inline constant.
4788- ``B``: A 32-bit signed integer constant.
- ``C``: A 32-bit unsigned integer constant or an integer inline constant in the
  range from -16 to 64.
4790- ``DA``: A 64-bit constant that can be split into two "A" constants.
4791- ``DB``: A 64-bit constant that can be split into two "B" constants.
4792
4793All ARM modes:
4794
4795- ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
4796  operand. Treated the same as operand ``m``, at the moment.
4797- ``Te``: An even general-purpose 32-bit integer register: ``r0,r2,...,r12,r14``
4798- ``To``: An odd general-purpose 32-bit integer register: ``r1,r3,...,r11``
4799
4800ARM and ARM's Thumb2 mode:
4801
4802- ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
4803- ``I``: An immediate integer valid for a data-processing instruction.
4804- ``J``: An immediate integer between -4095 and 4095.
4805- ``K``: An immediate integer whose bitwise inverse is valid for a
4806  data-processing instruction. (Can be used with template modifier "``B``" to
4807  print the inverted value).
4808- ``L``: An immediate integer whose negation is valid for a data-processing
4809  instruction. (Can be used with template modifier "``n``" to print the negated
4810  value).
4811- ``M``: A power of two or an integer between 0 and 32.
4812- ``N``: Invalid immediate constraint.
4813- ``O``: Invalid immediate constraint.
4814- ``r``: A general-purpose 32-bit integer register (``r0-r15``).
4815- ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
4816  as ``r``.
4817- ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
4818  invalid.
4819- ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4820  ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
4821- ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4822  ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
4823- ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4824  ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
4825
4826ARM's Thumb1 mode:
4827
4828- ``I``: An immediate integer between 0 and 255.
4829- ``J``: An immediate integer between -255 and -1.
4830- ``K``: An immediate integer between 0 and 255, with optional left-shift by
4831  some amount.
4832- ``L``: An immediate integer between -7 and 7.
4833- ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
4834- ``N``: An immediate integer between 0 and 31.
4835- ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
4836- ``r``: A low 32-bit GPR register (``r0-r7``).
4837- ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
4839- ``w``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4840  ``s0-s31``, ``d0-d31``, or ``q0-q15``, respectively.
4841- ``t``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4842  ``s0-s31``, ``d0-d15``, or ``q0-q7``, respectively.
4843- ``x``: A 32, 64, or 128-bit floating-point/SIMD register in the ranges
4844  ``s0-s15``, ``d0-d7``, or ``q0-q3``, respectively.
4845
4846
4847Hexagon:
4848
4849- ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
4850  at the moment.
4851- ``r``: A 32 or 64-bit register.
4852
4853MSP430:
4854
4855- ``r``: An 8 or 16-bit register.
4856
4857MIPS:
4858
4859- ``I``: An immediate signed 16-bit integer.
4860- ``J``: An immediate integer zero.
4861- ``K``: An immediate unsigned 16-bit integer.
4862- ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
4863- ``N``: An immediate integer between -65535 and -1.
4864- ``O``: An immediate signed 15-bit integer.
4865- ``P``: An immediate integer between 1 and 65535.
4866- ``m``: A memory address operand. In MIPS-SE mode, allows a base address
4867  register plus 16-bit immediate offset. In MIPS mode, just a base register.
4868- ``R``: A memory address operand. In MIPS-SE mode, allows a base address
4869  register plus a 9-bit signed offset. In MIPS mode, the same as constraint
4870  ``m``.
4871- ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
4872  ``sc`` instruction on the given subtarget (details vary).
4873- ``r``, ``d``,  ``y``: A 32 or 64-bit GPR register.
4874- ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
4875  (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
4876  argument modifier for compatibility with GCC.
4877- ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
4878  ``25``).
4879- ``l``: The ``lo`` register, 32 or 64-bit.
4880- ``x``: Invalid.
4881
4882NVPTX:
4883
4884- ``b``: A 1-bit integer register.
4885- ``c`` or ``h``: A 16-bit integer register.
4886- ``r``: A 32-bit integer register.
4887- ``l`` or ``N``: A 64-bit integer register.
4888- ``f``: A 32-bit float register.
4889- ``d``: A 64-bit float register.
4890
4891
4892PowerPC:
4893
4894- ``I``: An immediate signed 16-bit integer.
4895- ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
4896- ``K``: An immediate unsigned 16-bit integer.
4897- ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
4898- ``M``: An immediate integer greater than 31.
4899- ``N``: An immediate integer that is an exact power of 2.
4900- ``O``: The immediate integer constant 0.
4901- ``P``: An immediate integer constant whose negation is a signed 16-bit
4902  constant.
4903- ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
4904  treated the same as ``m``.
4905- ``r``: A 32 or 64-bit integer register.
4906- ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
4907  ``R1-R31``).
- ``f``: A 32 or 64-bit float register (``F0-F31``).
- ``v``: For ``4 x f32`` or ``4 x f64`` types, a 128-bit altivec vector
  register (``V0-V31``).
4912- ``y``: Condition register (``CR0-CR7``).
4913- ``wc``: An individual CR bit in a CR register.
4914- ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
4915  register set (overlapping both the floating-point and vector register files).
4916- ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
4917  set.
4918
4919RISC-V:
4920
4921- ``A``: An address operand (using a general-purpose register, without an
4922  offset).
4923- ``I``: A 12-bit signed integer immediate operand.
4924- ``J``: A zero integer immediate operand.
4925- ``K``: A 5-bit unsigned integer immediate operand.
4926- ``f``: A 32- or 64-bit floating-point register (requires F or D extension).
4927- ``r``: A 32- or 64-bit general-purpose register (depending on the platform
4928  ``XLEN``).
4929- ``vr``: A vector register. (requires V extension).
4930- ``vm``: A vector register for masking operand. (requires V extension).
4931
4932Sparc:
4933
4934- ``I``: An immediate 13-bit signed integer.
4935- ``r``: A 32-bit integer register.
4936- ``f``: Any floating-point register on SparcV8, or a floating-point
4937  register in the "low" half of the registers on SparcV9.
4938- ``e``: Any floating-point register. (Same as ``f`` on SparcV8.)
4939
4940SystemZ:
4941
4942- ``I``: An immediate unsigned 8-bit integer.
4943- ``J``: An immediate unsigned 12-bit integer.
4944- ``K``: An immediate signed 16-bit integer.
4945- ``L``: An immediate signed 20-bit integer.
4946- ``M``: An immediate integer 0x7fffffff.
4947- ``Q``: A memory address operand with a base address and a 12-bit immediate
4948  unsigned displacement.
4949- ``R``: A memory address operand with a base address, a 12-bit immediate
4950  unsigned displacement, and an index register.
4951- ``S``: A memory address operand with a base address and a 20-bit immediate
4952  signed displacement.
4953- ``T``: A memory address operand with a base address, a 20-bit immediate
4954  signed displacement, and an index register.
4955- ``r`` or ``d``: A 32, 64, or 128-bit integer register.
4956- ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
4957  address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
4959  (LLVM-specific)
4960- ``f``: A 32, 64, or 128-bit floating-point register.
4961
4962X86:
4963
4964- ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 63.
4966- ``K``: An immediate signed 8-bit integer.
4967- ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
4968  0xffffffff.
4969- ``M``: An immediate integer between 0 and 3.
4970- ``N``: An immediate unsigned 8-bit integer.
4971- ``O``: An immediate integer between 0 and 127.
4972- ``e``: An immediate 32-bit signed integer.
4973- ``Z``: An immediate 32-bit unsigned integer.
4974- ``o``, ``v``: Treated the same as ``m``, at the moment.
4975- ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
4976  ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
4977  registers, and on X86-64, it is all of the integer registers.
4978- ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
4979  ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
4980- ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
4981- ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
4982  existed since i386, and can be accessed without the REX prefix.
4983- ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
4984- ``y``: A 64-bit MMX register, if MMX is enabled.
4985- ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
4986  operand in a SSE register. If AVX is also enabled, can also be a 256-bit
4987  vector operand in an AVX register. If AVX-512 is also enabled, can also be a
  512-bit vector operand in an AVX512 register. Otherwise, an error.
4989- ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
4990- ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
4991  32-bit mode, a 64-bit integer operand will get split into two registers). It
4992  is not recommended to use this constraint, as in 64-bit mode, the 64-bit
4993  operand will get allocated only to RAX -- if two 32-bit operands are needed,
4994  you're better off splitting it yourself, before passing it to the asm
4995  statement.
4996
4997XCore:
4998
4999- ``r``: A 32-bit integer register.
5000
5001
5002.. _inline-asm-modifiers:
5003
5004Asm template argument modifiers
5005^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5006
5007In the asm template string, modifiers can be used on the operand reference, like
5008"``${0:n}``".
5009
5010The modifiers are, in general, expected to behave the same way they do in
5011GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
5012inline asm code which was supported by GCC. A mismatch in behavior between LLVM
5013and GCC likely indicates a bug in LLVM.
5014
5015Target-independent:
5016
5017- ``c``: Print an immediate integer constant unadorned, without
5018  the target-specific immediate punctuation (e.g. no ``$`` prefix).
5019- ``n``: Negate and print immediate integer constant unadorned, without the
5020  target-specific immediate punctuation (e.g. no ``$`` prefix).
5021- ``l``: Print as an unadorned label, without the target-specific label
5022  punctuation (e.g. no ``$`` prefix).
5023
5024AArch64:
5025
5026- ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
5027  instead of ``x30``, print ``w30``.
5028- ``x``: Print a GPR register with a ``x*`` name. (this is the default, anyhow).
5029- ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
5030  ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
5031  ``v*``.
5032
5033AMDGPU:
5034
5035- ``r``: No effect.
5036
5037ARM:
5038
5039- ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
5040  register).
5041- ``P``: No effect.
5042- ``q``: No effect.
5043- ``y``: Print a VFP single-precision register as an indexed double (e.g. print
5044  as ``d4[1]`` instead of ``s9``)
5045- ``B``: Bitwise invert and print an immediate integer constant without ``#``
5046  prefix.
5047- ``L``: Print the low 16-bits of an immediate integer constant.
5048- ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
5049  register operands subsequent to the specified one (!), so use carefully.
5050- ``Q``: Print the low-order register of a register-pair, or the low-order
5051  register of a two-register operand.
5052- ``R``: Print the high-order register of a register-pair, or the high-order
5053  register of a two-register operand.
- ``H``: Print the second register of a register-pair. (On a big-endian system,
  ``H`` is equivalent to ``Q``, and on a little-endian system, ``H`` is
  equivalent to ``R``.)
5057
5058  .. FIXME: H doesn't currently support printing the second register
5059     of a two-register operand.
5060
5061- ``e``: Print the low doubleword register of a NEON quad register.
5062- ``f``: Print the high doubleword register of a NEON quad register.
5063- ``m``: Print the base register of a memory operand without the ``[`` and ``]``
5064  adornment.
5065
5066Hexagon:
5067
5068- ``L``: Print the second register of a two-register operand. Requires that it
5069  has been allocated consecutively to the first.
5070
5071  .. FIXME: why is it restricted to consecutive ones? And there's
5072     nothing that ensures that happens, is there?
5073
5074- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
5075  nothing. Used to print 'addi' vs 'add' instructions.
5076
5077MSP430:
5078
5079No additional modifiers.
5080
5081MIPS:
5082
5083- ``X``: Print an immediate integer as hexadecimal
5084- ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
5085- ``d``: Print an immediate integer as decimal.
5086- ``m``: Subtract one and print an immediate integer as decimal.
5087- ``z``: Print $0 if an immediate zero, otherwise print normally.
- ``L``: Print the low-order register of a two-register operand, or print the
  address of the low-order word of a double-word memory operand.
5090
5091  .. FIXME: L seems to be missing memory operand support.
5092
- ``M``: Print the high-order register of a two-register operand, or print the
  address of the high-order word of a double-word memory operand.
5095
5096  .. FIXME: M seems to be missing memory operand support.
5097
- ``D``: Print the second register of a two-register operand, or print the
  second word of a double-word memory operand. (On a big-endian system, ``D`` is
  equivalent to ``L``, and on a little-endian system, ``D`` is equivalent to
  ``M``.)
5102- ``w``: No effect. Provided for compatibility with GCC which requires this
5103  modifier in order to print MSA registers (``W0-W31``) with the ``f``
5104  constraint.
5105
5106NVPTX:
5107
5108- ``r``: No effect.
5109
5110PowerPC:
5111
5112- ``L``: Print the second register of a two-register operand. Requires that it
5113  has been allocated consecutively to the first.
5114
5115  .. FIXME: why is it restricted to consecutive ones? And there's
5116     nothing that ensures that happens, is there?
5117
5118- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
5119  nothing. Used to print 'addi' vs 'add' instructions.
5120- ``y``: For a memory operand, prints formatter for a two-register X-form
5121  instruction. (Currently always prints ``r0,OPERAND``).
5122- ``U``: Prints 'u' if the memory operand is an update form, and nothing
5123  otherwise. (NOTE: LLVM does not support update form, so this will currently
5124  always print nothing)
5125- ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
5126  not support indexed form, so this will currently always print nothing)
5127
5128RISC-V:
5129
5130- ``i``: Print the letter 'i' if the operand is not a register, otherwise print
5131  nothing. Used to print 'addi' vs 'add' instructions, etc.
5132- ``z``: Print the register ``zero`` if an immediate zero, otherwise print
5133  normally.
5134
5135Sparc:
5136
5137- ``r``: No effect.
5138
5139SystemZ:
5140
5141SystemZ implements only ``n``, and does *not* support any of the other
5142target-independent modifiers.
5143
5144X86:
5145
5146- ``c``: Print an unadorned integer or symbol name. (The latter is
5147  target-specific behavior for this typically target-independent modifier).
5148- ``A``: Print a register name with a '``*``' before it.
5149- ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
5150  operand.
5151- ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
5152  memory operand.
5153- ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
5154  operand.
5155- ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
5156  operand.
5157- ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
5158  available, otherwise the 32-bit register name; do nothing on a memory operand.
5159- ``n``: Negate and print an unadorned integer, or, for operands other than an
5160  immediate integer (e.g. a relocatable symbol expression), print a '-' before
5161  the operand. (The behavior for relocatable symbol expressions is a
5162  target-specific behavior for this typically target-independent modifier)
5163- ``H``: Print a memory reference with additional offset +8.
- ``P``: Print a memory reference used as the argument of a call instruction or
  used with an explicit base register and index register as its offset, so it
  cannot use additional registers to present the memory reference. (E.g. omit
  ``(rip)``, even
5167  though it's PC-relative.)
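
To illustrate the X86 register-size modifiers, here is a small sketch (assuming
x86-64 and AT&T syntax; ``@low_byte`` is a hypothetical name) that uses ``b``
to refer to the low byte of whichever register is allocated for the input:

.. code-block:: llvm

    define i32 @low_byte(i32 %x) {
      ; ${1:b} prints the 8-bit name of the register holding operand 1, for
      ; example "cl" if ecx was chosen. The "q" constraint ensures that such
      ; a name exists.
      %r = call i32 asm "movzbl ${1:b}, $0", "=r,q"(i32 %x)
      ret i32 %r
    }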
5168
5169XCore:
5170
5171No additional modifiers.
5172
5173
5174Inline Asm Metadata
5175^^^^^^^^^^^^^^^^^^^
5176
The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
5181error reporting mechanisms. This allows a front-end to correlate backend
5182errors that occur with inline asm back to the source code that produced
5183it. For example:
5184
5185.. code-block:: llvm
5186
5187    call void asm sideeffect "something bad", ""(), !srcloc !42
5188    ...
5189    !42 = !{ i32 1234567 }
5190
5191It is up to the front-end to make sense of the magic numbers it places
5192in the IR. If the MDNode contains multiple constants, the code generator
5193will use the one that corresponds to the line of the asm that the error
5194occurs on.
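
A sketch of the multi-constant case follows. The cookie values 1001 and 1002
are arbitrary placeholders, and the function name is illustrative. The
template has two lines, so the node carries one cookie per line:

.. code-block:: llvm

    define void @two_lines() {
      ; "\0A" separates the two lines of the template. An error on the second
      ; line would be reported with the second cookie.
      call void asm sideeffect "nop\0Anop", ""(), !srcloc !0
      ret void
    }

    !0 = !{i32 1001, i32 1002}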
5195
5196.. _metadata:
5197
5198Metadata
5199========
5200
5201LLVM IR allows metadata to be attached to instructions and global objects in the
5202program that can convey extra information about the code to the optimizers and
5203code generator. One example application of metadata is source-level
5204debug information. There are two metadata primitives: strings and nodes.
5205
5206Metadata does not have a type, and is not a value. If referenced from a
5207``call`` instruction, it uses the ``metadata`` type.
5208
5209All metadata are identified in syntax by an exclamation point ('``!``').
5210
5211.. _metadata-string:
5212
5213Metadata Nodes and Metadata Strings
5214-----------------------------------
5215
5216A metadata string is a string surrounded by double quotes. It can
5217contain any character by escaping non-printable characters with
5218"``\xx``" where "``xx``" is the two digit hex code. For example:
5219"``!"test\00"``".
5220
5221Metadata nodes are represented with notation similar to structure
5222constants (a comma separated list of elements, surrounded by braces and
preceded by an exclamation point). Metadata nodes can have any values as
their operands. For example:
5225
5226.. code-block:: llvm
5227
5228    !{ !"test\00", i32 10}
5229
5230Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:
5231
5232.. code-block:: text
5233
5234    !0 = distinct !{!"test\00", i32 10}
5235
5236``distinct`` nodes are useful when nodes shouldn't be merged based on their
5237content. They can also occur when transformations cause uniquing collisions
5238when metadata operands change.
5239
5240A :ref:`named metadata <namedmetadatastructure>` is a collection of
5241metadata nodes, which can be looked up in the module symbol table. For
5242example:
5243
5244.. code-block:: llvm
5245
5246    !foo = !{!4, !3}
5247
5248Metadata can be used as function arguments. Here the ``llvm.dbg.value``
5249intrinsic is using three metadata arguments:
5250
5251.. code-block:: llvm
5252
5253    call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)
5254
5255Metadata can be attached to an instruction. Here metadata ``!21`` is attached
5256to the ``add`` instruction using the ``!dbg`` identifier:
5257
5258.. code-block:: llvm
5259
5260    %indvar.next = add i64 %indvar, 1, !dbg !21
5261
5262Instructions may not have multiple metadata attachments with the same
5263identifier.
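
A sketch of one instruction carrying two attachments with different
identifiers. The function name and the choice of ``!range`` and
``!invariant.load`` are assumptions made only for illustration:

.. code-block:: llvm

    define i32 @load_small(ptr %p) {
      ; One !range and one !invariant.load attachment: the identifiers differ,
      ; so both may appear on the same instruction.
      %v = load i32, ptr %p, align 4, !range !0, !invariant.load !1
      ret i32 %v
    }

    !0 = !{i32 0, i32 256}
    !1 = !{}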
5264
5265Metadata can also be attached to a function or a global variable. Here metadata
5266``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
5267and ``g2`` using the ``!dbg`` identifier:
5268
5269.. code-block:: llvm
5270
5271    declare !dbg !22 void @f1()
5272    define void @f2() !dbg !22 {
5273      ret void
5274    }
5275
5276    @g1 = global i32 0, !dbg !22
5277    @g2 = external global i32, !dbg !22
5278
5279Unlike instructions, global objects (functions and global variables) may have
5280multiple metadata attachments with the same identifier.
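
For example, a global variable might carry several ``!type`` attachments at
once. This is a sketch; the global's name and the type identifier strings are
hypothetical:

.. code-block:: llvm

    ; Two attachments with the same identifier on one global object.
    @vtable = constant [2 x ptr] zeroinitializer, !type !0, !type !1

    !0 = !{i64 0, !"typeid.A"}
    !1 = !{i64 8, !"typeid.B"}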
5281
A transformation is required to drop any metadata attachment that it does not
know or knows it can't preserve. Currently there is an exception for metadata
attachments to globals for ``!func_sanitize``, ``!type`` and
``!absolute_symbol``, which can't be unconditionally dropped unless the global
is itself deleted.
5286
5287Metadata attached to a module using named metadata may not be dropped, with
5288the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).
5289
5290More information about specific metadata nodes recognized by the
5291optimizers and code generator is found below.
5292
5293.. _specialized-metadata:
5294
5295Specialized Metadata Nodes
5296^^^^^^^^^^^^^^^^^^^^^^^^^^
5297
5298Specialized metadata nodes are custom data structures in metadata (as opposed
5299to generic tuples). Their fields are labelled, and can be specified in any
5300order.
5301
5302These aren't inherently debug info centric, but currently all the specialized
5303metadata nodes are related to debug info.
5304
5305.. _DICompileUnit:
5306
5307DICompileUnit
5308"""""""""""""
5309
5310``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
5311``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
5312containing the debug info to be emitted along with the compile unit, regardless
5313of code optimizations (some nodes are only emitted if there are references to
5314them from instructions). The ``debugInfoForProfiling:`` field is a boolean
5315indicating whether or not line-table discriminators are updated to provide
5316more-accurate debug info for profiling results.
5317
.. code-block:: text

    !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1,
                                 producer: "clang", isOptimized: true,
                                 flags: "-O2", runtimeVersion: 2,
                                 splitDebugFilename: "abc.debug",
                                 emissionKind: FullDebug, enums: !2,
                                 retainedTypes: !3, globals: !4, imports: !5,
                                 macros: !6, dwoId: 0x0abcd)
5325
5326Compile unit descriptors provide the root scope for objects declared in a
5327specific compilation unit. File descriptors are defined using this scope.  These
5328descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
5329track of global variables, type information, and imported entities (declarations
5330and namespaces).
5331
5332.. _DIFile:
5333
5334DIFile
5335""""""
5336
5337``DIFile`` nodes represent files. The ``filename:`` can include slashes.
5338
5339.. code-block:: none
5340
5341    !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
5342                 checksumkind: CSK_MD5,
5343                 checksum: "000102030405060708090a0b0c0d0e0f")
5344
Files are sometimes used in ``scope:`` fields, and are the only valid target
for ``file:`` fields.

Valid values for the ``checksumkind:`` field are ``CSK_None``, ``CSK_MD5``,
``CSK_SHA1`` and ``CSK_SHA256``.
5348
5349.. _DIBasicType:
5350
5351DIBasicType
5352"""""""""""
5353
5354``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
5355``float``. ``tag:`` defaults to ``DW_TAG_base_type``.
5356
5357.. code-block:: text
5358
5359    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
5360                      encoding: DW_ATE_unsigned_char)
5361    !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")
5362
5363The ``encoding:`` describes the details of the type. Usually it's one of the
5364following:
5365
5366.. code-block:: text
5367
5368  DW_ATE_address       = 1
5369  DW_ATE_boolean       = 2
5370  DW_ATE_float         = 4
5371  DW_ATE_signed        = 5
5372  DW_ATE_signed_char   = 6
5373  DW_ATE_unsigned      = 7
5374  DW_ATE_unsigned_char = 8
5375
5376.. _DISubroutineType:
5377
5378DISubroutineType
5379""""""""""""""""
5380
5381``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
5382refers to a tuple; the first operand is the return type, while the rest are the
5383types of the formal arguments in order. If the first operand is ``null``, that
5384represents a function with no return value (such as ``void foo() {}`` in C++).
5385
.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8,
                      encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)
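
A further sketch contrasting a subroutine type that returns a value with one
that takes no arguments and returns nothing (node numbers are arbitrary):

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{!0, !1}) ; int (char)
    !3 = !DISubroutineType(types: !{null})   ; void ()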
5391
5392.. _DIDerivedType:
5393
5394DIDerivedType
5395"""""""""""""
5396
5397``DIDerivedType`` nodes represent types derived from other types, such as
5398qualified types.
5399
5400.. code-block:: text
5401
5402    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
5403                      encoding: DW_ATE_unsigned_char)
5404    !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
5405                        align: 32)
5406
5407The following ``tag:`` values are valid:
5408
5409.. code-block:: text
5410
5411  DW_TAG_member             = 13
5412  DW_TAG_pointer_type       = 15
5413  DW_TAG_reference_type     = 16
5414  DW_TAG_typedef            = 22
5415  DW_TAG_inheritance        = 28
5416  DW_TAG_ptr_to_member_type = 31
5417  DW_TAG_const_type         = 38
5418  DW_TAG_friend             = 42
5419  DW_TAG_volatile_type      = 53
5420  DW_TAG_restrict_type      = 55
5421  DW_TAG_atomic_type        = 71
5422  DW_TAG_immutable_type     = 75
5423
5424.. _DIDerivedTypeMember:
5425
``DW_TAG_member`` is used to define a member of a :ref:`composite type
<DICompositeType>`. The type of the member is the ``baseType:``. The
``offset:`` is the member's bit offset.  If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
uniqued based only on its ``name:`` and ``scope:``.
5431
5432``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
5433field of :ref:`composite types <DICompositeType>` to describe parents and
5434friends.
5435
5436``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.
5437
5438``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
5439``DW_TAG_volatile_type``, ``DW_TAG_restrict_type``, ``DW_TAG_atomic_type`` and
5440``DW_TAG_immutable_type`` are used to qualify the ``baseType:``.
5441
5442Note that the ``void *`` type is expressed as a type derived from NULL.
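
The qualifier and pointer tags compose by chaining ``baseType:`` references.
The following sketch (the typedef name, file name and node numbers are
hypothetical) describes ``const int *``, ``void *`` and a typedef of the
latter:

.. code-block:: text

    !0 = !DIFile(filename: "types.c", directory: "/path/to/dir")
    !1 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !2 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !1)  ; const int
    !3 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !2,
                        size: 64)                              ; const int *
    !4 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null,
                        size: 64)                              ; void *
    !5 = !DIDerivedType(tag: DW_TAG_typedef, name: "handle_t", file: !0,
                        line: 3, baseType: !4)                 ; typedef void *handle_t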
5443
5444.. _DICompositeType:
5445
5446DICompositeType
5447"""""""""""""""
5448
5449``DICompositeType`` nodes represent types composed of other types, like
5450structures and unions. ``elements:`` points to a tuple of the composed types.
5451
5452If the source language supports ODR, the ``identifier:`` field gives the unique
5453identifier used for type merging between modules.  When specified,
5454:ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
5455derived types <DIDerivedTypeMember>` that reference the ODR-type in their
5456``scope:`` change uniquing rules.
5457
For a given ``identifier:``, there should only be a single composite type that
does not have ``flags: DIFlagFwdDecl`` set.  LLVM tools that link modules
together will unique such definitions at parse time via the ``identifier:``
field, even if the nodes are ``distinct``.
5462
.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
    !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
                          line: 2, size: 32, align: 32, identifier: "_M4Enum",
                          elements: !{!0, !1, !2})
5471
5472The following ``tag:`` values are valid:
5473
5474.. code-block:: text
5475
5476  DW_TAG_array_type       = 1
5477  DW_TAG_class_type       = 2
5478  DW_TAG_enumeration_type = 4
5479  DW_TAG_structure_type   = 19
5480  DW_TAG_union_type       = 23
5481
For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
array type is a native packed vector.

The optional ``dataLocation`` is a DIExpression that describes how to get from
an object's address to the actual raw data, if they aren't equivalent. This is
only supported for array types, particularly to describe Fortran arrays, which
have an array descriptor in addition to the array data. Alternatively, it can
be a DIVariable that holds the address of the actual raw data.

Fortran supports pointer arrays that can be attached to actual arrays; this
attachment between pointer and pointee is called association. The optional
``associated`` is a DIExpression that describes whether the pointer array is
currently associated. The optional ``allocated`` is a DIExpression that
describes whether the allocatable array is currently allocated. The optional
``rank`` is a DIExpression that describes the rank (number of dimensions) of a
Fortran assumed-rank array, whose rank is known only at run time.
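
For instance, a C declaration such as ``int a[3][4]`` could be described along
these lines. This is only a sketch with arbitrary node numbers; each
``DISubrange`` covers one level of indexing:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DICompositeType(tag: DW_TAG_array_type, baseType: !0, size: 384,
                          elements: !2)  ; 3 * 4 * 32 bits
    !2 = !{!3, !4}
    !3 = !DISubrange(count: 3) ; outer dimension
    !4 = !DISubrange(count: 4) ; inner dimension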
5498
5499For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
5500descriptors <DIEnumerator>`, each representing the definition of an enumeration
5501value for the set. All enumeration type descriptors are collected in the
5502``enums:`` field of the :ref:`compile unit <DICompileUnit>`.
5503
5504For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
5505``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
5506<DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
5507``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
5508``isDefinition: false``.
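
As a sketch of how these pieces fit together, a C struct such as
``struct Point { int x; int y; };`` might be described as follows. The file
name, line numbers and node numbers are made up for illustration:

.. code-block:: text

    !0 = !DIFile(filename: "point.c", directory: "/path/to/dir")
    !1 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !2 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Point",
                                   file: !0, line: 1, size: 64, elements: !3)
    !3 = !{!4, !5}
    ; Each member names the struct as its scope and gives its bit offset.
    !4 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !2, file: !0,
                        line: 1, baseType: !1, size: 32)
    !5 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !2, file: !0,
                        line: 1, baseType: !1, size: 32, offset: 32)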
5509
5510.. _DISubrange:
5511
5512DISubrange
5513""""""""""
5514
5515``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
5516:ref:`DICompositeType`.
5517
5518- ``count: -1`` indicates an empty array.
5519- ``count: !10`` describes the count with a :ref:`DILocalVariable`.
5520- ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.
5521
5522.. code-block:: text
5523
5524    !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
5525    !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
5526    !2 = !DISubrange(count: -1) ; empty array.
5527
5528    ; Scopes used in rest of example
5529    !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
5530    !7 = distinct !DICompileUnit(language: DW_LANG_C99, file: !6)
5531    !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5)
5532
5533    ; Use of local variable as count value
5534    !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
5535    !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
5536    !11 = !DISubrange(count: !10, lowerBound: 0)
5537
5538    ; Use of global variable as count value
5539    !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
5540    !13 = !DISubrange(count: !12, lowerBound: 0)
5541
5542.. _DIEnumerator:
5543
5544DIEnumerator
5545""""""""""""
5546
5547``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
5548variants of :ref:`DICompositeType`.
5549
.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
5555
5556DITemplateTypeParameter
5557"""""""""""""""""""""""
5558
5559``DITemplateTypeParameter`` nodes represent type parameters to generic source
5560language constructs. They are used (optionally) in :ref:`DICompositeType` and
5561:ref:`DISubprogram` ``templateParams:`` fields.
5562
5563.. code-block:: text
5564
5565    !0 = !DITemplateTypeParameter(name: "Ty", type: !1)
5566
5567DITemplateValueParameter
5568""""""""""""""""""""""""
5569
5570``DITemplateValueParameter`` nodes represent value parameters to generic source
5571language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
5572but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
5573``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
5574:ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.
5575
5576.. code-block:: text
5577
5578    !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)
5579
5580DINamespace
5581"""""""""""
5582
5583``DINamespace`` nodes represent namespaces in the source language.
5584
.. code-block:: text

    !0 = !DINamespace(name: "myawesomeproject", scope: !1)
5588
5589.. _DIGlobalVariable:
5590
5591DIGlobalVariable
5592""""""""""""""""
5593
5594``DIGlobalVariable`` nodes represent global variables in the source language.
5595
.. code-block:: text

    @foo = global i32 0, !dbg !0
    !0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
    !1 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !2,
                           file: !3, line: 7, type: !4, isLocal: true,
                           isDefinition: false, declaration: !5)
5603
5604
5605DIGlobalVariableExpression
5606""""""""""""""""""""""""""
5607
5608``DIGlobalVariableExpression`` nodes tie a :ref:`DIGlobalVariable` together
5609with a :ref:`DIExpression`.
5610
.. code-block:: text

    @lower = global i32 0, !dbg !0
    @upper = global i32 0, !dbg !1
    !0 = !DIGlobalVariableExpression(
             var: !2,
             expr: !DIExpression(DW_OP_LLVM_fragment, 0, 32)
             )
    !1 = !DIGlobalVariableExpression(
             var: !2,
             expr: !DIExpression(DW_OP_LLVM_fragment, 32, 32)
             )
    !2 = !DIGlobalVariable(name: "split64", linkageName: "split64", scope: !3,
                           file: !4, line: 8, type: !5, declaration: !6)
5625
All global variable expressions should be referenced by the ``globals:`` field
of a :ref:`compile unit <DICompileUnit>`.
5628
5629.. _DISubprogram:
5630
5631DISubprogram
5632""""""""""""
5633
5634``DISubprogram`` nodes represent functions from the source language. A distinct
5635``DISubprogram`` may be attached to a function definition using ``!dbg``
5636metadata. A unique ``DISubprogram`` may be attached to a function declaration
5637used for call site debug info. The ``retainedNodes:`` field is a list of
5638:ref:`variables <DILocalVariable>` and :ref:`labels <DILabel>` that must be
5639retained, even if their IR counterparts are optimized out of the IR. The
``type:`` field must point at a :ref:`DISubroutineType`.
5641
5642.. _DISubprogramDeclaration:
5643
When ``isDefinition: false``, subprograms describe a declaration in the type
tree as opposed to a definition of a function.  If the scope is a composite
type with an ODR ``identifier:`` that does not set ``flags: DIFlagFwdDecl``,
then the subprogram declaration is uniqued based only on its ``linkageName:``
and ``scope:``.
5649
.. code-block:: text

    define void @_Z3foov() !dbg !0 {
      ...
    }

    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
                                file: !2, line: 7, type: !3, isLocal: true,
                                isDefinition: true, scopeLine: 8,
                                containingType: !4,
                                virtuality: DW_VIRTUALITY_pure_virtual,
                                virtualIndex: 10, flags: DIFlagPrototyped,
                                isOptimized: true, unit: !5, templateParams: !6,
                                declaration: !7, retainedNodes: !8,
                                thrownTypes: !9)
5665
5666.. _DILexicalBlock:
5667
5668DILexicalBlock
5669""""""""""""""
5670
``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line number and column number are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
fields.
5675
5676.. code-block:: text
5677
5678    !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)
5679
5680Usually lexical blocks are ``distinct`` to prevent node merging based on
5681operands.
5682
5683.. _DILexicalBlockFile:
5684
5685DILexicalBlockFile
5686""""""""""""""""""
5687
5688``DILexicalBlockFile`` nodes are used to discriminate between sections of a
5689:ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
5690indicate textual inclusion, or the ``discriminator:`` field can be used to
5691discriminate between control flow within a single block in the source language.
5692
5693.. code-block:: text
5694
5695    !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
5696    !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
5697    !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)
5698
5699.. _DILocation:
5700
5701DILocation
5702""""""""""
5703
``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
5707
5708.. code-block:: text
5709
5710    !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)
5711
5712.. _DILocalVariable:
5713
5714DILocalVariable
5715"""""""""""""""
5716
5717``DILocalVariable`` nodes represent local variables in the source language. If
5718the ``arg:`` field is set to non-zero, then this variable is a subprogram
5719parameter, and it will be included in the ``retainedNodes:`` field of its
5720:ref:`DISubprogram`.
5721
5722.. code-block:: text
5723
5724    !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
5725                          type: !3, flags: DIFlagArtificial)
5726    !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
5727                          type: !3)
5728    !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)
5729
5730.. _DIExpression:
5731
5732DIExpression
5733""""""""""""
5734
5735``DIExpression`` nodes represent expressions that are inspired by the DWARF
5736expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
5737(such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
5738referenced LLVM variable relates to the source language variable. Debug
5739intrinsics are interpreted left-to-right: start by pushing the value/address
5740operand of the intrinsic onto a stack, then repeatedly push and evaluate
5741opcodes from the DIExpression until the final variable description is produced.
5742
The currently supported opcode vocabulary is limited:
5744
5745- ``DW_OP_deref`` dereferences the top of the expression stack.
5746- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
5747  them together and appends the result to the expression stack.
5748- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
5749  the last entry from the second last entry and appends the result to the
5750  expression stack.
5751- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
5752- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
5753  here, respectively) of the variable fragment from the working expression. Note
5754  that contrary to DW_OP_bit_piece, the offset is describing the location
5755  within the described source variable.
5756- ``DW_OP_LLVM_convert, 16, DW_ATE_signed`` specifies a bit size and encoding
5757  (``16`` and ``DW_ATE_signed`` here, respectively) to which the top of the
5758  expression stack is to be converted. Maps into a ``DW_OP_convert`` operation
5759  that references a base type constructed from the supplied values.
5760- ``DW_OP_LLVM_tag_offset, tag_offset`` specifies that a memory tag should be
5761  optionally applied to the pointer. The memory tag is derived from the
5762  given tag offset in an implementation-defined manner.
5763- ``DW_OP_swap`` swaps top two stack entries.
5764- ``DW_OP_xderef`` provides extended dereference mechanism. The entry at the top
5765  of the stack is treated as an address. The second stack entry is treated as an
5766  address space identifier.
5767- ``DW_OP_stack_value`` marks a constant value.
5768- ``DW_OP_LLVM_entry_value, N`` may only appear in MIR and at the
5769  beginning of a ``DIExpression``. In DWARF a ``DBG_VALUE``
5770  instruction binding a ``DIExpression(DW_OP_LLVM_entry_value`` to a
5771  register is lowered to a ``DW_OP_entry_value [reg]``, pushing the
5772  value the register had upon function entry onto the stack.  The next
5773  ``(N - 1)`` operations will be part of the ``DW_OP_entry_value``
5774  block argument. For example, ``!DIExpression(DW_OP_LLVM_entry_value,
5775  1, DW_OP_plus_uconst, 123, DW_OP_stack_value)`` specifies an
5776  expression where the entry value of the debug value instruction's
5777  value/address operand is pushed to the stack, and is added
5778  with 123. Due to framework limitations ``N`` can currently only
5779  be 1.
5780
5781  The operation is introduced by the ``LiveDebugValues`` pass, which
5782  applies it only to function parameters that are unmodified
5783  throughout the function. Support is limited to simple register
5784  location descriptions, or as indirect locations (e.g., when a struct
5785  is passed-by-value to a callee via a pointer to a temporary copy
5786  made in the caller). The entry value op is also introduced by the
5787  ``AsmPrinter`` pass when a call site parameter value
5788  (``DW_AT_call_site_parameter_value``) is represented as entry value
5789  of the parameter.
- ``DW_OP_LLVM_arg, N`` is used in debug intrinsics that refer to more than one
  value, such as one that calculates the sum of two registers. This is always
  used in combination with an ordered list of values, such that
  ``DW_OP_LLVM_arg, N`` refers to the ``N``\ th element in that list. For
  example, ``!DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
  DW_OP_stack_value)`` used with the list ``(%reg1, %reg2)`` would evaluate to
  ``%reg1 - %reg2``. This list of values should be provided by the containing
  intrinsic/instruction.
- ``DW_OP_breg`` (or ``DW_OP_bregx``) represents the content at the provided
  signed offset from the specified register. The opcode is only generated by
  the ``AsmPrinter`` pass to describe a call site parameter value that requires
  an expression over two registers.
- ``DW_OP_push_object_address`` pushes the address of the object, which can then
  serve as a descriptor in subsequent calculations. This opcode can be used to
  calculate the bounds of a Fortran allocatable array, which has an array
  descriptor.
- ``DW_OP_over`` duplicates the entry currently second in the stack at the top
  of the stack. This opcode can be used to calculate the bounds of a Fortran
  assumed-rank array, whose rank is known only at run time; the current
  dimension number is implicitly the first element of the stack.
- ``DW_OP_LLVM_implicit_pointer`` specifies the dereferenced value. It can
  be used to represent pointer variables that are optimized out but whose
  pointee value is known. This operator is required because it differs from the
  DWARF operator ``DW_OP_implicit_pointer`` in representation and specification
  (number and types of operands), and the latter cannot be used for multiple
  levels of indirection.
5814
.. code-block:: text

    IR for "*ptr = 4;"
    --------------
    call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !20)
    !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
                           type: !18)
    !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
    !19 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !20 = !DIExpression(DW_OP_LLVM_implicit_pointer)

    IR for "**ptr = 4;"
    --------------
    call void @llvm.dbg.value(metadata i32 4, metadata !17, metadata !21)
    !17 = !DILocalVariable(name: "ptr1", scope: !12, file: !3, line: 5,
                           type: !18)
    !18 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !19, size: 64)
    !19 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !20, size: 64)
    !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !21 = !DIExpression(DW_OP_LLVM_implicit_pointer,
                        DW_OP_LLVM_implicit_pointer)
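
The stack-based evaluation of the opcodes listed above can be traced by hand.
The following is an informal trace written as comments, not output from any
tool; ``%p``, ``%reg1`` and ``%reg2`` are placeholder operands:

.. code-block:: text

    ; !DIExpression(DW_OP_plus_uconst, 8, DW_OP_deref) with first operand %p
    ;   start                   stack: [ %p ]
    ;   DW_OP_plus_uconst, 8    stack: [ %p + 8 ]
    ;   DW_OP_deref             stack: [ *(%p + 8) ]
    ;
    ; !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_minus,
    ;               DW_OP_stack_value) with the value list (%reg1, %reg2)
    ;   DW_OP_LLVM_arg, 0       stack: [ %reg1 ]
    ;   DW_OP_LLVM_arg, 1       stack: [ %reg1, %reg2 ]
    ;   DW_OP_minus             stack: [ %reg1 - %reg2 ]
    ;   DW_OP_stack_value       the top of the stack is the value itself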
5836
DWARF specifies three kinds of simple location descriptions: register, memory,
and implicit location descriptions.  Note that a location description is
defined over certain ranges of a program, i.e., the location of a variable may
change over the course of the program. Register and memory location
5841descriptions describe the *concrete location* of a source variable (in the
5842sense that a debugger might modify its value), whereas *implicit locations*
5843describe merely the actual *value* of a source variable which might not exist
5844in registers or in memory (see ``DW_OP_stack_value``).
5845
An ``llvm.dbg.addr`` or ``llvm.dbg.declare`` intrinsic describes an indirect
5847value (the address) of a source variable. The first operand of the intrinsic
5848must be an address of some kind. A DIExpression attached to the intrinsic
5849refines this address to produce a concrete location for the source variable.
5850
An ``llvm.dbg.value`` intrinsic describes the direct value of a source variable.
5852The first operand of the intrinsic may be a direct or indirect value. A
5853DIExpression attached to the intrinsic refines the first operand to produce a
5854direct value. For example, if the first operand is an indirect value, it may be
5855necessary to insert ``DW_OP_deref`` into the DIExpression in order to produce a
5856valid debug intrinsic.
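
A minimal module sketching this case, in which the first operand of
``llvm.dbg.value`` is a pointer and ``DW_OP_deref`` turns it into the
variable's value. It assumes a C function like
``int get(int *p) { int x = *p; return x; }``; the file name, producer string
and node numbers are invented for the example:

.. code-block:: text

    define i32 @get(ptr %p) !dbg !4 {
      ; %p is an indirect value: "x" lives in the memory that %p points to,
      ; so the expression dereferences the operand.
      call void @llvm.dbg.value(metadata ptr %p, metadata !9,
                                metadata !DIExpression(DW_OP_deref)), !dbg !10
      %v = load i32, ptr %p, align 4, !dbg !10
      ret i32 %v, !dbg !10
    }

    declare void @llvm.dbg.value(metadata, metadata, metadata)

    !llvm.dbg.cu = !{!0}
    !llvm.module.flags = !{!2, !3}

    !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1,
                                 producer: "example", isOptimized: true,
                                 runtimeVersion: 0, emissionKind: FullDebug)
    !1 = !DIFile(filename: "get.c", directory: "/path/to/dir")
    !2 = !{i32 7, !"Dwarf Version", i32 4}
    !3 = !{i32 2, !"Debug Info Version", i32 3}
    !4 = distinct !DISubprogram(name: "get", scope: !1, file: !1, line: 1,
                                type: !5, scopeLine: 1,
                                spFlags: DISPFlagDefinition | DISPFlagOptimized,
                                unit: !0, retainedNodes: !8)
    !5 = !DISubroutineType(types: !6)
    !6 = !{!7, !11}
    !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !8 = !{!9}
    !9 = !DILocalVariable(name: "x", scope: !4, file: !1, line: 1, type: !7)
    !10 = !DILocation(line: 1, column: 22, scope: !4)
    !11 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !7, size: 64)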
5857
5858.. note::
5859
5860   A DIExpression is interpreted in the same way regardless of which kind of
5861   debug intrinsic it's attached to.
5862
5863.. code-block:: text
5864
5865    !0 = !DIExpression(DW_OP_deref)
5866    !1 = !DIExpression(DW_OP_plus_uconst, 3)
5867    !1 = !DIExpression(DW_OP_constu, 3, DW_OP_plus)
5868    !2 = !DIExpression(DW_OP_bit_piece, 3, 7)
5869    !3 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
5870    !4 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
5871    !5 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)
5872
5873DIArgList
5874""""""""""""
5875
5876``DIArgList`` nodes hold a list of constant or SSA value references. These are
5877used in :ref:`debug intrinsics<dbg_intrinsics>` (currently only in
5878``llvm.dbg.value``) in combination with a ``DIExpression`` that uses the
5879``DW_OP_LLVM_arg`` operator. Because a DIArgList may refer to local values
5880within a function, it must only be used as a function argument, must always be
5881inlined, and cannot appear in named metadata.
5882
.. code-block:: text

    call void @llvm.dbg.value(metadata !DIArgList(i32 %a, i32 %b),
                              metadata !16,
                              metadata !DIExpression(DW_OP_LLVM_arg, 0, DW_OP_LLVM_arg, 1, DW_OP_plus))
5888
5889DIFlags
5890"""""""""""""""
5891
5892These flags encode various properties of DINodes.
5893
The ``ExportSymbols`` flag marks a class, struct or union whose members
may be referenced as if they were defined in the containing class or
union. This flag is used to decide whether ``DW_AT_export_symbols`` can
be used for the structure type.
5898
5899DIObjCProperty
5900""""""""""""""
5901
5902``DIObjCProperty`` nodes represent Objective-C property nodes.
5903
5904.. code-block:: text
5905
5906    !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
5907                         getter: "getFoo", attributes: 7, type: !2)
5908
5909DIImportedEntity
5910""""""""""""""""
5911
5912``DIImportedEntity`` nodes represent entities (such as modules) imported into a
5913compile unit. The ``elements`` field is a list of renamed entities (such as
5914variables and subprograms) in the imported entity (such as module).
5915
5916.. code-block:: text
5917
5918   !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
5919                          entity: !1, line: 7, elements: !3)
5920   !3 = !{!4}
5921   !4 = !DIImportedEntity(tag: DW_TAG_imported_declaration, name: "bar", scope: !0,
5922                          entity: !5, line: 7)
5923
5924DIMacro
5925"""""""
5926
``DIMacro`` nodes represent the definition or undefinition of a macro
identifier. The ``name:`` field is the macro identifier, followed by macro
parameters when defining a function-like macro, and the ``value:`` field is the
token-string used to expand the macro identifier.
5931
.. code-block:: text

   !2 = !DIMacro(type: DW_MACINFO_define, line: 7, name: "foo(x)",
                 value: "((x) + 1)")
   !3 = !DIMacro(type: DW_MACINFO_undef, line: 30, name: "foo")
5937
5938DIMacroFile
5939"""""""""""
5940
5941``DIMacroFile`` nodes represent inclusion of source files.
5942The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
5943appear in the included source file.
5944
.. code-block:: text

   !2 = !DIMacroFile(type: DW_MACINFO_start_file, line: 7, file: !1,
                     nodes: !3)
5949
5950.. _DILabel:
5951
5952DILabel
5953"""""""
5954
5955``DILabel`` nodes represent labels within a :ref:`DISubprogram`. All fields of
5956a ``DILabel`` are mandatory. The ``scope:`` field must be one of either a
5957:ref:`DILexicalBlockFile`, a :ref:`DILexicalBlock`, or a :ref:`DISubprogram`.
5958The ``name:`` field is the label identifier. The ``file:`` field is the
5959:ref:`DIFile` the label is present in. The ``line:`` field is the source line
5960within the file where the label is declared.
5961
5962.. code-block:: text
5963
5964  !2 = !DILabel(scope: !0, name: "foo", file: !1, line: 7)
5965
5966'``tbaa``' Metadata
5967^^^^^^^^^^^^^^^^^^^
5968
5969In LLVM IR, memory does not have types, so LLVM's own type system is not
5970suitable for doing type based alias analysis (TBAA). Instead, metadata is
5971added to the IR to describe a type system of a higher level language. This
5972can be used to implement C/C++ strict type aliasing rules, but it can also
5973be used to implement custom alias analysis behavior for other languages.
5974
5975This description of LLVM's TBAA system is broken into two parts:
5976:ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
5977:ref:`Representation<tbaa_node_representation>` talks about the metadata
5978encoding of various entities.
5979
5980It is always possible to trace any TBAA node to a "root" TBAA node (details
5981in the :ref:`Representation<tbaa_node_representation>` section).  TBAA
5982nodes with different roots have an unknown aliasing relationship, and LLVM
5983conservatively infers ``MayAlias`` between them.  The rules mentioned in
5984this section only pertain to TBAA nodes living under the same root.
5985
5986.. _tbaa_node_semantics:
5987
5988Semantics
5989"""""""""
5990
5991The TBAA metadata system, referred to as "struct path TBAA" (not to be
5992confused with ``tbaa.struct``), consists of the following high level
5993concepts: *Type Descriptors*, further subdivided into scalar type
5994descriptors and struct type descriptors; and *Access Tags*.
5995
5996**Type descriptors** describe the type system of the higher level language
5997being compiled.  **Scalar type descriptors** describe types that do not
5998contain other types.  Each scalar type has a parent type, which must also
5999be a scalar type or the TBAA root.  Via this parent relation, scalar types
6000within a TBAA root form a tree.  **Struct type descriptors** denote types
6001that contain a sequence of other type descriptors, at known offsets.  These
6002contained type descriptors can either be struct type descriptors themselves
6003or scalar type descriptors.
6004
6005**Access tags** are metadata nodes attached to load and store instructions.
6006Access tags use type descriptors to describe the *location* being accessed
6007in terms of the type system of the higher level language.  Access tags are
6008tuples consisting of a base type, an access type and an offset.  The base
6009type is a scalar type descriptor or a struct type descriptor, the access
6010type is a scalar type descriptor, and the offset is a constant integer.
6011
6012The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
6013things:
6014
6015 * If ``BaseTy`` is a struct type, the tag describes a memory access (load
6016   or store) of a value of type ``AccessTy`` contained in the struct type
6017   ``BaseTy`` at offset ``Offset``.
6018
6019 * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
6020   ``AccessTy`` must be the same; and the access tag describes a scalar
6021   access with scalar type ``AccessTy``.
6022
6023We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
6024tuples this way:
6025
6026 * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
6027   ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
6028   described in the TBAA metadata.  ``ImmediateParent(BaseTy, Offset)`` is
6029   undefined if ``Offset`` is non-zero.
6030
6031 * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
6032   is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
6033   ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
6034   to be relative within that inner type.
6035
A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
Offset2)`` via the ``ImmediateParent`` relation or vice versa.
6040
6041As a concrete example, the type descriptor graph for the following program
6042
6043.. code-block:: c
6044
6045    struct Inner {
6046      int i;    // offset 0
6047      float f;  // offset 4
6048    };
6049
6050    struct Outer {
6051      float f;  // offset 0
6052      double d; // offset 4
6053      struct Inner inner_a;  // offset 12
6054    };
6055
6056    void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
6057      outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
6058      outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
6059      outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
6060      *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
6061    }
6062
6063is (note that in C and C++, ``char`` can be used to access any arbitrary
6064type):
6065
6066.. code-block:: text
6067
6068    Root = "TBAA Root"
6069    CharScalarTy = ("char", Root, 0)
6070    FloatScalarTy = ("float", CharScalarTy, 0)
6071    DoubleScalarTy = ("double", CharScalarTy, 0)
6072    IntScalarTy = ("int", CharScalarTy, 0)
    InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
6074    OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
6075                     (InnerStructTy, 12)}
6076
6077
6078with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
60790)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
6080``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.
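
To illustrate the aliasing rule on the example above, the ``ImmediateParent``
chains of the four access tags can be written out by hand. The following is
only a worked sketch derived from the type descriptors shown; it is not output
produced by LLVM, and each chain is cut off at ``CharScalarTy``.

.. code-block:: text

    tag0: (OuterStructTy, 0)  -> (FloatScalarTy, 0) -> (CharScalarTy, 0)
    tag1: (OuterStructTy, 12) -> (InnerStructTy, 0) -> (IntScalarTy, 0)
                              -> (CharScalarTy, 0)
    tag2: (OuterStructTy, 16) -> (InnerStructTy, 4) -> (FloatScalarTy, 0)
                              -> (CharScalarTy, 0)
    tag3: (FloatScalarTy, 0)  -> (CharScalarTy, 0)

    tag0 vs tag3: (FloatScalarTy, 0) is reachable from (OuterStructTy, 0)   => alias
    tag2 vs tag3: (FloatScalarTy, 0) is reachable from (OuterStructTy, 16)  => alias
    tag1 vs tag3: neither tuple is reachable from the other                 => no alias
    tag0 vs tag2: neither tuple is reachable from the other                 => no alias

An access through ``char*`` would carry the tag ``(CharScalarTy, CharScalarTy,
0)``. Since ``(CharScalarTy, 0)`` is reachable from every chain above, such an
access aliases all four.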
6081
6082.. _tbaa_node_representation:
6083
6084Representation
6085""""""""""""""
6086
6087The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
6088with exactly one ``MDString`` operand.
6089
Scalar type descriptors are represented as ``MDNode`` s with two
operands.  The first operand is an ``MDString`` denoting the name of the
scalar type.  LLVM does not assign meaning to the value of this operand; it
only cares about it being an ``MDString``.  The second operand is an
``MDNode`` which points to the parent for said scalar type descriptor,
which is either another scalar type descriptor or the TBAA root.  Scalar
type descriptors can have an optional third operand, but that must be the
constant integer zero.
6098
6099Struct type descriptors are represented as ``MDNode`` s with an odd number
6100of operands greater than 1.  The first operand is an ``MDString`` denoting
6101the name of the struct type.  Like in scalar type descriptors the actual
6102value of this name operand is irrelevant to LLVM.  After the name operand,
6103the struct type descriptors have a sequence of alternating ``MDNode`` and
``ConstantInt`` operands.  Counting the name as operand 0 and with N starting
from 1, operand 2N - 1, an ``MDNode``, denotes a contained field, and operand
2N, a ``ConstantInt``, is the offset of that field.  The offsets
must be in non-decreasing order.
6108
6109Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
6110The first operand is an ``MDNode`` pointing to the node representing the
6111base type.  The second operand is an ``MDNode`` pointing to the node
6112representing the access type.  The third operand is a ``ConstantInt`` that
6113states the offset of the access.  If a fourth field is present, it must be
6114a ``ConstantInt`` valued at 0 or 1.  If it is 1 then the access tag states
6115that the location being accessed is "constant" (meaning
6116``pointsToConstantMemory`` should return true; see `other useful
6117AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_).  The TBAA root of
6118the access type and the base type of an access tag must be the same, and
6119that is the TBAA root of the access tag.
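
As a sketch, the type descriptors and access tags of the ``Inner``/``Outer``
example from the :ref:`Semantics<tbaa_node_semantics>` section could be
encoded as shown below. The node numbering, the value names ``%outer`` and
``%f``, and the use of ``i64`` for offsets are choices made for this
illustration; a frontend is free to use a different numbering and different
names.

.. code-block:: llvm

    store float 0.000000e+00, ptr %outer, align 4, !tbaa !7  ; outer->f (tag0)
    store float 0.000000e+00, ptr %f, align 4, !tbaa !10     ; *f       (tag3)

    ...
    ; Root and scalar type descriptors:
    !0 = !{!"TBAA Root"}
    !1 = !{!"char", !0, i64 0}
    !2 = !{!"float", !1, i64 0}
    !3 = !{!"double", !1, i64 0}
    !4 = !{!"int", !1, i64 0}

    ; Struct type descriptors (name, then field/offset pairs):
    !5 = !{!"Inner", !4, i64 0, !2, i64 4}
    !6 = !{!"Outer", !2, i64 0, !3, i64 4, !5, i64 12}

    ; Access tags (base type, access type, offset):
    !7 = !{!6, !2, i64 0}    ; tag0
    !8 = !{!6, !4, i64 12}   ; tag1
    !9 = !{!6, !2, i64 16}   ; tag2
    !10 = !{!2, !2, i64 0}   ; tag3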
6120
6121'``tbaa.struct``' Metadata
6122^^^^^^^^^^^^^^^^^^^^^^^^^^
6123
The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages. However, it
is defined to copy a contiguous region of memory, which is more than
strictly necessary for aggregate types which contain holes due to
padding. Also, it doesn't contain any TBAA information about the fields
of the aggregate.
6130
6131``!tbaa.struct`` metadata can describe which memory subregions in a
6132memcpy are padding and what the TBAA tags of the struct are.
6133
The current metadata format is very simple. ``!tbaa.struct`` metadata
nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the offset of a
field in bytes, the second gives its size in bytes, and the third gives
its TBAA tag. For example:
6139
6140.. code-block:: llvm
6141
6142    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }
6143
This describes a struct with two fields. The first is at offset 0 bytes,
has size 4 bytes, and has TBAA tag ``!1``. The second is at offset 8 bytes,
has size 4 bytes, and has TBAA tag ``!2``.
6147
6148Note that the fields need not be contiguous. In this example, there is a
61494 byte gap between the two fields. This gap represents padding which
6150does not carry useful data and need not be preserved.
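
A minimal sketch of how such a node might be attached to a ``memcpy`` call is
shown below. The pointer names ``%dst`` and ``%src``, the 12-byte copy size,
and the type descriptors ``!5`` to ``!8`` are assumptions made for this
illustration.

.. code-block:: llvm

    ; Copy a 12-byte aggregate: an int at offset 0, 4 bytes of padding,
    ; and a float at offset 8.
    call void @llvm.memcpy.p0.p0.i64(ptr align 4 %dst, ptr align 4 %src,
                                     i64 12, i1 false), !tbaa.struct !4

    ...
    !1 = !{!5, !5, i64 0}    ; access tag for the int field
    !2 = !{!6, !6, i64 0}    ; access tag for the float field
    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }
    !5 = !{!"int", !7, i64 0}
    !6 = !{!"float", !7, i64 0}
    !7 = !{!"char", !8, i64 0}
    !8 = !{!"TBAA Root"}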
6151
6152'``noalias``' and '``alias.scope``' Metadata
6153^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6154
6155``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
6156noalias memory-access sets. This means that some collection of memory access
6157instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be explicitly specified not to alias with some other
6159collection of memory access instructions that carry ``alias.scope`` metadata.
6160Each type of metadata specifies a list of scopes where each scope has an id and
6161a domain.
6162
6163When evaluating an aliasing query, if for some domain, the set
6164of scopes with that domain in one instruction's ``alias.scope`` list is a
6165subset of (or equal to) the set of scopes for that domain in another
6166instruction's ``noalias`` list, then the two memory accesses are assumed not to
6167alias.
6168
6169Because scopes in one domain don't affect scopes in other domains, separate
6170domains can be used to compose multiple independent noalias sets.  This is
6171used for example during inlining.  As the noalias function parameters are
6172turned into noalias scope metadata, a new domain is used every time the
6173function is inlined.
6174
6175The metadata identifying each domain is itself a list containing one or two
6176entries. The first entry is the name of the domain. Note that if the name is a
6177string then it can be combined across functions and translation units. A
6178self-reference can be used to create globally unique domain names. A
6179descriptive string may optionally be provided as a second list entry.
6180
6181The metadata identifying each scope is also itself a list containing two or
6182three entries. The first entry is the name of the scope. Note that if the name
6183is a string then it can be combined across functions and translation units. A
6184self-reference can be used to create globally unique scope names. A metadata
6185reference to the scope's domain is the second entry. A descriptive string may
6186optionally be provided as a third list entry.
6187
6188For example,
6189
6190.. code-block:: llvm
6191
6192    ; Two scope domains:
6193    !0 = !{!0}
6194    !1 = !{!1}
6195
6196    ; Some scopes in these domains:
6197    !2 = !{!2, !0}
6198    !3 = !{!3, !0}
6199    !4 = !{!4, !1}
6200
6201    ; Some scope lists:
6202    !5 = !{!4} ; A list containing only scope !4
6203    !6 = !{!4, !3, !2}
6204    !7 = !{!3}
6205
6206    ; These two instructions don't alias:
6207    %0 = load float, ptr %c, align 4, !alias.scope !5
6208    store float %0, ptr %arrayidx.i, align 4, !noalias !5
6209
6210    ; These two instructions also don't alias (for domain !1, the set of scopes
6211    ; in the !alias.scope equals that in the !noalias list):
6212    %2 = load float, ptr %c, align 4, !alias.scope !5
6213    store float %2, ptr %arrayidx.i2, align 4, !noalias !6
6214
6215    ; These two instructions may alias (for domain !0, the set of scopes in
6216    ; the !noalias list is not a superset of, or equal to, the scopes in the
6217    ; !alias.scope list):
    %3 = load float, ptr %c, align 4, !alias.scope !6
    store float %3, ptr %arrayidx.i, align 4, !noalias !7
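
The optional descriptive strings can make scope lists easier to read. The
following sketch uses made-up names (``example_domain``, ``%a``, ``%b``) for
one domain and two scopes within it:

.. code-block:: llvm

    ; A domain with a descriptive string:
    !0 = !{!0, !"example_domain"}

    ; Two scopes in that domain, each with a descriptive string:
    !1 = !{!1, !0, !"example_domain: %a"}
    !2 = !{!2, !0, !"example_domain: %b"}

    ; Scope lists:
    !3 = !{!1}
    !4 = !{!2}

    ; The load is in scope !1 and the store lists scope !1 as noalias, so
    ; these two instructions are assumed not to alias:
    %va = load float, ptr %a, align 4, !alias.scope !3, !noalias !4
    store float %va, ptr %b, align 4, !alias.scope !4, !noalias !3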
6220
6221'``fpmath``' Metadata
6222^^^^^^^^^^^^^^^^^^^^^
6223
6224``fpmath`` metadata may be attached to any instruction of floating-point
6225type. It can be used to express the maximum acceptable error in the
6226result of that instruction, in ULPs, thus potentially allowing the
6227compiler to use a more efficient but less accurate method of computing
6228it. ULP is defined as follows:
6229
6230    If ``x`` is a real number that lies between two finite consecutive
6231    floating-point numbers ``a`` and ``b``, without being equal to one
6232    of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
6233    distance between the two non-equal finite floating-point numbers
6234    nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.
6235
6236The metadata node shall consist of a single positive float type number
6237representing the maximum relative error, for example:
6238
6239.. code-block:: llvm
6240
6241    !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs
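
As an illustration, the node can be attached to a floating-point instruction
as follows. The value names ``%q``, ``%x`` and ``%y`` are placeholders chosen
for this sketch.

.. code-block:: llvm

    ; The division result may be off by up to 2.5 ULPs.
    %q = fdiv float %x, %y, !fpmath !0

    ...
    !0 = !{ float 2.5 }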
6242
6243.. _range-metadata:
6244
6245'``range``' Metadata
6246^^^^^^^^^^^^^^^^^^^^
6247
6248``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
6249integer types. It expresses the possible ranges the loaded value or the value
6250returned by the called function at this call site is in. If the loaded or
6251returned value is not in the specified range, the behavior is undefined. The
6252ranges are represented with a flattened list of integers. The loaded value or
6253the value returned is known to be in the union of the ranges defined by each
6254consecutive pair. Each pair has the following properties:
6255
-  The type must match the type loaded or returned by the instruction.
6257-  The pair ``a,b`` represents the range ``[a,b)``.
6258-  Both ``a`` and ``b`` are constants.
6259-  The range is allowed to wrap.
6260-  The range should not represent the full or empty set. That is,
6261   ``a!=b``.
6262
6263In addition, the pairs must be in signed order of the lower bound and
6264they must be non-contiguous.
6265
6266Examples:
6267
6268.. code-block:: llvm
6269
6270      %a = load i8, ptr %x, align 1, !range !0 ; Can only be 0 or 1
6271      %b = load i8, ptr %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
6272      %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
6273      %d = invoke i8 @bar() to label %cont
6274             unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
6275    ...
6276    !0 = !{ i8 0, i8 2 }
6277    !1 = !{ i8 255, i8 2 }
6278    !2 = !{ i8 0, i8 2, i8 3, i8 6 }
6279    !3 = !{ i8 -2, i8 0, i8 3, i8 6 }
6280
6281'``absolute_symbol``' Metadata
6282^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6283
6284``absolute_symbol`` metadata may be attached to a global variable
6285declaration. It marks the declaration as a reference to an absolute symbol,
6286which causes the backend to use absolute relocations for the symbol even
6287in position independent code, and expresses the possible ranges that the
6288global variable's *address* (not its value) is in, in the same format as
6289``range`` metadata, with the extension that the pair ``all-ones,all-ones``
6290may be used to represent the full set.
6291
6292Example (assuming 64-bit pointers):
6293
6294.. code-block:: llvm
6295
6296      @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
6297      @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)
6298
6299    ...
6300    !0 = !{ i64 0, i64 256 }
6301    !1 = !{ i64 -1, i64 -1 }
6302
6303'``callees``' Metadata
6304^^^^^^^^^^^^^^^^^^^^^^
6305
6306``callees`` metadata may be attached to indirect call sites. If ``callees``
6307metadata is attached to a call site, and any callee is not among the set of
6308functions provided by the metadata, the behavior is undefined. The intent of
6309this metadata is to facilitate optimizations such as indirect-call promotion.
6310For example, in the code below, the call instruction may only target the
6311``add`` or ``sub`` functions:
6312
6313.. code-block:: llvm
6314
6315    %result = call i64 %binop(i64 %x, i64 %y), !callees !0
6316
6317    ...
6318    !0 = !{ptr @add, ptr @sub}
6319
6320'``callback``' Metadata
6321^^^^^^^^^^^^^^^^^^^^^^^
6322
``callback`` metadata may be attached to a function declaration or definition.
(Call sites are excluded only due to the lack of a use case.) For ease of
exposition, we'll refer to the function annotated with metadata as a broker
6326function. The metadata describes how the arguments of a call to the broker are
6327in turn passed to the callback function specified by the metadata. Thus, the
6328``callback`` metadata provides a partial description of a call site inside the
6329broker function with regards to the arguments of a call to the broker. The only
6330semantic restriction on the broker function itself is that it is not allowed to
6331inspect or modify arguments referenced in the ``callback`` metadata as
6332pass-through to the callback function.
6333
6334The broker is not required to actually invoke the callback function at runtime.
6335However, the assumptions about not inspecting or modifying arguments that would
6336be passed to the specified callback function still hold, even if the callback
6337function is not dynamically invoked. The broker is allowed to invoke the
6338callback function more than once per invocation of the broker. The broker is
6339also allowed to invoke (directly or indirectly) the function passed as a
6340callback through another use. Finally, the broker is also allowed to relay the
6341callback callee invocation to a different thread.
6342
6343The metadata is structured as follows: At the outer level, ``callback``
6344metadata is a list of ``callback`` encodings. Each encoding starts with a
6345constant ``i64`` which describes the argument position of the callback function
6346in the call to the broker. The following elements, except the last, describe
6347what arguments are passed to the callback function. Each element is again an
6348``i64`` constant identifying the argument of the broker that is passed through,
6349or ``i64 -1`` to indicate an unknown or inspected argument. The order in which
6350they are listed has to be the same in which they are passed to the callback
6351callee. The last element of the encoding is a boolean which specifies how
6352variadic arguments of the broker are handled. If it is true, all variadic
6353arguments of the broker are passed through to the callback function *after* the
6354arguments encoded explicitly before.
6355
In the code below, the ``pthread_create`` function is marked as a broker
through the ``!callback !1`` metadata. In the example, there is only one
callback encoding, namely ``!2``, associated with the broker. Argument
positions are zero-based, so this encoding identifies the callback function
as the third argument of the broker (``i64 2``) and the sole argument of the
callback function as the fourth argument of the broker (``i64 3``).
6362
6363.. FIXME why does the llvm-sphinx-docs builder give a highlighting
6364   error if the below is set to highlight as 'llvm', despite that we
6365   have misc.highlighting_failure set?
6366
6367.. code-block:: text
6368
6369    declare !callback !1 dso_local i32 @pthread_create(ptr, ptr, ptr, ptr)
6370
6371    ...
6372    !2 = !{i64 2, i64 3, i1 false}
6373    !1 = !{!2}
6374
Another example is shown below. The callback callee is the third argument of
the ``__kmpc_fork_call`` function (``i64 2``, counting from zero). The callee is
given two unknown values (each identified by a ``i64 -1``) and afterwards all
variadic arguments that are passed to the ``__kmpc_fork_call`` call (due to the
final ``i1 true``).
6380
6381.. FIXME why does the llvm-sphinx-docs builder give a highlighting
6382   error if the below is set to highlight as 'llvm', despite that we
6383   have misc.highlighting_failure set?
6384
6385.. code-block:: text
6386
6387    declare !callback !0 dso_local void @__kmpc_fork_call(ptr, i32, ptr, ...)
6388
6389    ...
6390    !1 = !{i64 2, i64 -1, i64 -1, i1 true}
6391    !0 = !{!1}
6392
6393'``exclude``' Metadata
6394^^^^^^^^^^^^^^^^^^^^^^
6395
6396``exclude`` metadata may be attached to a global variable to signify that its
6397section should not be included in the final executable or shared library. This
6398option is only valid for global variables with an explicit section targeting ELF
6399or COFF. This is done using the ``SHF_EXCLUDE`` flag on ELF targets and the
6400``IMAGE_SCN_LNK_REMOVE`` and ``IMAGE_SCN_MEM_DISCARDABLE`` flags for COFF
6401targets. Additionally, this metadata is only used as a flag, so the associated
6402node must be empty. The explicit section should not conflict with any other
6403sections that the user does not want removed after linking.
6404
6405.. code-block:: text
6406
  @object = private constant [1 x i8] c"\00", section ".foo", !exclude !0
6408
6409  ...
6410  !0 = !{}
6411
6412'``unpredictable``' Metadata
6413^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6414
6415``unpredictable`` metadata may be attached to any branch or switch
6416instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
6418optimizations related to compare and branch instructions. The metadata
6419is treated as a boolean value; if it exists, it signals that the branch
6420or switch that it is attached to is completely unpredictable.
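
Because only the presence of the metadata matters, an empty node is
sufficient. A minimal sketch, with placeholder value and label names:

.. code-block:: llvm

    br i1 %cond, label %if.then, label %if.else, !unpredictable !0

    ...
    !0 = !{}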
6421
6422.. _md_dereferenceable:
6423
6424'``dereferenceable``' Metadata
6425^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6426
6427The existence of the ``!dereferenceable`` metadata on the instruction
6428tells the optimizer that the value loaded is known to be dereferenceable.
6429The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
6431attribute on parameters and return values.
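
For example, the following sketch states that the loaded pointer ``%p`` can be
dereferenced for at least 16 bytes. The value names and the byte count are
chosen for illustration only.

.. code-block:: llvm

    %p = load ptr, ptr %pp, align 8, !dereferenceable !0

    ...
    !0 = !{i64 16}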
6432
6433.. _md_dereferenceable_or_null:
6434
6435'``dereferenceable_or_null``' Metadata
6436^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6437
6438The existence of the ``!dereferenceable_or_null`` metadata on the
6439instruction tells the optimizer that the value loaded is known to be either
6440dereferenceable or null.
6441The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable_or_null``
6443attribute on parameters and return values.
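
For example, the following sketch states that the loaded pointer ``%q`` is
either null or can be dereferenced for at least 8 bytes. The value names and
the byte count are chosen for illustration only.

.. code-block:: llvm

    %q = load ptr, ptr %qq, align 8, !dereferenceable_or_null !0

    ...
    !0 = !{i64 8}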
6444
6445.. _llvm.loop:
6446
6447'``llvm.loop``'
6448^^^^^^^^^^^^^^^
6449
6450It is sometimes useful to attach information to loop constructs. Currently,
6451loop metadata is implemented as metadata attached to the branch instruction
6452in the loop latch block. The loop metadata node is a list of
6453other metadata nodes, each representing a property of the loop. Usually,
6454the first item of the property node is a string. For example, the
6455``llvm.loop.unroll.count`` suggests an unroll factor to the loop
6456unroller:
6457
6458.. code-block:: llvm
6459
6460      br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
6461    ...
6462    !0 = !{!0, !1, !2}
6463    !1 = !{!"llvm.loop.unroll.enable"}
6464    !2 = !{!"llvm.loop.unroll.count", i32 4}
6465
6466For legacy reasons, the first item of a loop metadata node must be a
6467reference to itself. Before the advent of the 'distinct' keyword, this
6468forced the preservation of otherwise identical metadata nodes. Since
6469the loop-metadata node can be attached to multiple nodes, the 'distinct'
6470keyword has become unnecessary.
6471
6472Prior to the property nodes, one or two ``DILocation`` (debug location)
6473nodes can be present in the list. The first, if present, identifies the
6474source-code location where the loop begins. The second, if present,
6475identifies the source-code location where the loop ends.
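
A sketch of a loop metadata node that carries both locations ahead of a
property node is shown below. The line and column numbers are invented, and
``!4`` is assumed to be a :ref:`DISubprogram` defined elsewhere in the module.

.. code-block:: llvm

    !0 = !{!0, !1, !2, !3}
    !1 = !DILocation(line: 10, column: 3, scope: !4)  ; where the loop begins
    !2 = !DILocation(line: 14, column: 3, scope: !4)  ; where the loop ends
    !3 = !{!"llvm.loop.unroll.count", i32 4}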
6476
6477Loop metadata nodes cannot be used as unique identifiers. They are
6478neither persistent for the same loop through transformations nor
6479necessarily unique to just one loop.
6480
6481'``llvm.loop.disable_nonforced``'
6482^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6483
6484This metadata disables all optional loop transformations unless
6485explicitly instructed using other transformation metadata such as
6486``llvm.loop.unroll.enable``. That is, no heuristic will try to determine
whether a transformation is profitable. The purpose is to prevent the loop
from being transformed into a different loop before an explicitly requested
6489(forced) transformation is applied. For instance, loop fusion can make
6490other transformations impossible. Mandatory loop canonicalizations such
6491as loop rotation are still applied.
6492
It is recommended to use this metadata in addition to any ``llvm.loop.*``
6494transformation directive. Also, any loop should have at most one
6495directive applied to it (and a sequence of transformations built using
6496followup-attributes). Otherwise, which transformation will be applied
6497depends on implementation details such as the pass pipeline order.
6498
6499See :ref:`transformation-metadata` for details.
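
As a sketch, a loop that should be unrolled by a factor of 4 and otherwise
left alone by heuristic-driven transformations could be annotated as follows.
The node numbering and the choice of unroll count are assumptions made for
this illustration.

.. code-block:: llvm

    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.disable_nonforced"}
    !2 = !{!"llvm.loop.unroll.count", i32 4}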
6500
6501'``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
6502^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6503
6504Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
6505used to control per-loop vectorization and interleaving parameters such as
6506vectorization width and interleave count. These metadata should be used in
6507conjunction with ``llvm.loop`` loop identification metadata. The
6508``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
6509optimization hints and the optimizer will only interleave and vectorize loops if
6510it believes it is safe to do so. The ``llvm.loop.parallel_accesses`` metadata
6511which contains information about loop-carried memory dependencies can be helpful
6512in determining the safety of these transformations.
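
The sketch below combines several of these hints on one loop; the individual
properties are described in the following sections. The label names and the
chosen width and interleave count are placeholders for this illustration.

.. code-block:: llvm

      br i1 %exitcond, label %exit, label %loop, !llvm.loop !0
    ...
    !0 = !{!0, !1, !2, !3}
    !1 = !{!"llvm.loop.vectorize.enable", i1 1}
    !2 = !{!"llvm.loop.vectorize.width", i32 4}
    !3 = !{!"llvm.loop.interleave.count", i32 2}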
6513
6514'``llvm.loop.interleave.count``' Metadata
6515^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6516
6517This metadata suggests an interleave count to the loop interleaver.
6518The first operand is the string ``llvm.loop.interleave.count`` and the
6519second operand is an integer specifying the interleave count. For
6520example:
6521
6522.. code-block:: llvm
6523
6524   !0 = !{!"llvm.loop.interleave.count", i32 4}
6525
6526Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
6527multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
6528then the interleave count will be determined automatically.
6529
6530'``llvm.loop.vectorize.enable``' Metadata
6531^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6532
6533This metadata selectively enables or disables vectorization for the loop. The
6534first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
is a bit. If the bit operand value is 1, vectorization is enabled. A value of
65360 disables vectorization:
6537
6538.. code-block:: llvm
6539
6540   !0 = !{!"llvm.loop.vectorize.enable", i1 0}
6541   !1 = !{!"llvm.loop.vectorize.enable", i1 1}
6542
6543'``llvm.loop.vectorize.predicate.enable``' Metadata
6544^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6545
6546This metadata selectively enables or disables creating predicated instructions
6547for the loop, which can enable folding of the scalar epilogue loop into the
6548main loop. The first operand is the string
``llvm.loop.vectorize.predicate.enable`` and the second operand is a bit. If
the bit operand value is 1, predication is enabled. A value of 0 disables
predication:
6552
6553.. code-block:: llvm
6554
6555   !0 = !{!"llvm.loop.vectorize.predicate.enable", i1 0}
6556   !1 = !{!"llvm.loop.vectorize.predicate.enable", i1 1}
6557
6558'``llvm.loop.vectorize.scalable.enable``' Metadata
6559^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6560
6561This metadata selectively enables or disables scalable vectorization for the
6562loop, and only has any effect if vectorization for the loop is already enabled.
6563The first operand is the string ``llvm.loop.vectorize.scalable.enable``
and the second operand is a bit. If the bit operand value is 1, scalable
vectorization is enabled, whereas a value of 0 reverts to the default
fixed-width vectorization:
6567
6568.. code-block:: llvm
6569
6570   !0 = !{!"llvm.loop.vectorize.scalable.enable", i1 0}
6571   !1 = !{!"llvm.loop.vectorize.scalable.enable", i1 1}
6572
6573'``llvm.loop.vectorize.width``' Metadata
6574^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6575
6576This metadata sets the target width of the vectorizer. The first
6577operand is the string ``llvm.loop.vectorize.width`` and the second
6578operand is an integer specifying the width. For example:
6579
6580.. code-block:: llvm
6581
6582   !0 = !{!"llvm.loop.vectorize.width", i32 4}
6583
6584Note that setting ``llvm.loop.vectorize.width`` to 1 disables
6585vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
0, or if the loop does not have this metadata, the width will be
6587determined automatically.
6588
6589'``llvm.loop.vectorize.followup_vectorized``' Metadata
6590^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6591
6592This metadata defines which loop attributes the vectorized loop will
6593have. See :ref:`transformation-metadata` for details.
6594
6595'``llvm.loop.vectorize.followup_epilogue``' Metadata
6596^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6597
6598This metadata defines which loop attributes the epilogue will have. The
6599epilogue is not vectorized and is executed when either the vectorized
6600loop is not known to preserve semantics (because e.g., it processes two
6601arrays that are found to alias by a runtime check) or for the last
6602iterations that do not fill a complete set of vector lanes. See
6603:ref:`Transformation Metadata <transformation-metadata>` for details.
6604
6605'``llvm.loop.vectorize.followup_all``' Metadata
6606^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6607
6608Attributes in the metadata will be added to both the vectorized and
6609epilogue loop.
6610See :ref:`Transformation Metadata <transformation-metadata>` for details.
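
The following sketch shows one way a follow-up attribute might be written,
assuming the operand layout described in the Transformation Metadata
document: the attribute name followed by the property nodes that the resulting
loop should carry. Here the vectorized loop is asked not to be unrolled
afterwards; the node numbering is arbitrary.

.. code-block:: llvm

    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.vectorize.enable", i1 1}
    !2 = !{!"llvm.loop.vectorize.followup_vectorized", !3}
    !3 = !{!"llvm.loop.unroll.disable"}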
6611
6612'``llvm.loop.unroll``'
6613^^^^^^^^^^^^^^^^^^^^^^
6614
6615Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
6616optimization hints such as the unroll factor. ``llvm.loop.unroll``
6617metadata should be used in conjunction with ``llvm.loop`` loop
6618identification metadata. The ``llvm.loop.unroll`` metadata are only
6619optimization hints and the unrolling will only be performed if the
6620optimizer believes it is safe to do so.
6621
6622'``llvm.loop.unroll.count``' Metadata
6623^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6624
6625This metadata suggests an unroll factor to the loop unroller. The
6626first operand is the string ``llvm.loop.unroll.count`` and the second
6627operand is a positive integer specifying the unroll factor. For
6628example:
6629
6630.. code-block:: llvm
6631
6632   !0 = !{!"llvm.loop.unroll.count", i32 4}
6633
6634If the trip count of the loop is less than the unroll count the loop
6635will be partially unrolled.
6636
6637'``llvm.loop.unroll.disable``' Metadata
6638^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6639
6640This metadata disables loop unrolling. The metadata has a single operand
6641which is the string ``llvm.loop.unroll.disable``. For example:
6642
6643.. code-block:: llvm
6644
6645   !0 = !{!"llvm.loop.unroll.disable"}
6646
6647'``llvm.loop.unroll.runtime.disable``' Metadata
6648^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6649
6650This metadata disables runtime loop unrolling. The metadata has a single
6651operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:
6652
6653.. code-block:: llvm
6654
6655   !0 = !{!"llvm.loop.unroll.runtime.disable"}
6656
6657'``llvm.loop.unroll.enable``' Metadata
6658^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6659
6660This metadata suggests that the loop should be fully unrolled if the trip count
6661is known at compile time and partially unrolled if the trip count is not known
6662at compile time. The metadata has a single operand which is the string
6663``llvm.loop.unroll.enable``.  For example:
6664
6665.. code-block:: llvm
6666
6667   !0 = !{!"llvm.loop.unroll.enable"}
6668
6669'``llvm.loop.unroll.full``' Metadata
6670^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6671
6672This metadata suggests that the loop should be unrolled fully. The
6673metadata has a single operand which is the string ``llvm.loop.unroll.full``.
6674For example:
6675
6676.. code-block:: llvm
6677
6678   !0 = !{!"llvm.loop.unroll.full"}
6679
6680'``llvm.loop.unroll.followup``' Metadata
6681^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6682
6683This metadata defines which loop attributes the unrolled loop will have.
6684See :ref:`Transformation Metadata <transformation-metadata>` for details.
6685
6686'``llvm.loop.unroll.followup_remainder``' Metadata
6687^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6688
6689This metadata defines which loop attributes the remainder loop after
6690partial/runtime unrolling will have. See
6691:ref:`Transformation Metadata <transformation-metadata>` for details.
6692
6693'``llvm.loop.unroll_and_jam``'
6694^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6695
This metadata is treated very similarly to the ``llvm.loop.unroll`` metadata
above, but affects the unroll and jam pass. In addition, any loop with
``llvm.loop.unroll`` metadata but no ``llvm.loop.unroll_and_jam`` metadata will
disable unroll and jam (so ``llvm.loop.unroll`` metadata will be left to the
unroller, and ``llvm.loop.unroll.disable`` metadata will disable unroll and jam
too).
6702
Otherwise, the metadata for unroll and jam is the same as for ``unroll``.
``llvm.loop.unroll_and_jam.enable``, ``llvm.loop.unroll_and_jam.disable`` and
``llvm.loop.unroll_and_jam.count`` behave the same as their
``llvm.loop.unroll`` counterparts. ``llvm.loop.unroll_and_jam.full`` is not
supported. Again, these are only hints and the normal safety checks will still
be performed.
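
The interaction between the two metadata families described above can be
sketched with two loop IDs. The node numbering and the factors are chosen for
illustration only:

.. code-block:: llvm

   ; Loop ID of a loop for which unroll and jam by a factor of 4 is suggested.
   !0 = distinct !{!0, !1}
   !1 = !{!"llvm.loop.unroll_and_jam.count", i32 4}

   ; Loop ID of another loop that only carries llvm.loop.unroll metadata.
   ; Unroll and jam is disabled for it, and the hint is left to the unroller.
   !2 = distinct !{!2, !3}
   !3 = !{!"llvm.loop.unroll.count", i32 2}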
6708
6709'``llvm.loop.unroll_and_jam.count``' Metadata
6710^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6711
6712This metadata suggests an unroll and jam factor to use, similarly to
6713``llvm.loop.unroll.count``. The first operand is the string
6714``llvm.loop.unroll_and_jam.count`` and the second operand is a positive integer
6715specifying the unroll factor. For example:
6716
6717.. code-block:: llvm
6718
6719   !0 = !{!"llvm.loop.unroll_and_jam.count", i32 4}
6720
If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled and jammed.
6723
6724'``llvm.loop.unroll_and_jam.disable``' Metadata
6725^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6726
6727This metadata disables loop unroll and jamming. The metadata has a single
6728operand which is the string ``llvm.loop.unroll_and_jam.disable``. For example:
6729
6730.. code-block:: llvm
6731
6732   !0 = !{!"llvm.loop.unroll_and_jam.disable"}
6733
6734'``llvm.loop.unroll_and_jam.enable``' Metadata
6735^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6736
This metadata suggests that the loop should be fully unrolled and jammed if the
6738trip count is known at compile time and partially unrolled if the trip count is
6739not known at compile time. The metadata has a single operand which is the
6740string ``llvm.loop.unroll_and_jam.enable``.  For example:
6741
6742.. code-block:: llvm
6743
6744   !0 = !{!"llvm.loop.unroll_and_jam.enable"}
6745
6746'``llvm.loop.unroll_and_jam.followup_outer``' Metadata
6747^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6748
6749This metadata defines which loop attributes the outer unrolled loop will
6750have. See :ref:`Transformation Metadata <transformation-metadata>` for
6751details.
6752
6753'``llvm.loop.unroll_and_jam.followup_inner``' Metadata
6754^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6755
6756This metadata defines which loop attributes the inner jammed loop will
6757have. See :ref:`Transformation Metadata <transformation-metadata>` for
6758details.
6759
6760'``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
6761^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6762
6763This metadata defines which attributes the epilogue of the outer loop
6764will have. This loop is usually unrolled, meaning there is no such
6765loop. This attribute will be ignored in this case. See
6766:ref:`Transformation Metadata <transformation-metadata>` for details.
6767
6768'``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
6769^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6770
6771This metadata defines which attributes the inner loop of the epilogue
6772will have. The outer epilogue will usually be unrolled, meaning there
6773can be multiple inner remainder loops. See
6774:ref:`Transformation Metadata <transformation-metadata>` for details.
6775
6776'``llvm.loop.unroll_and_jam.followup_all``' Metadata
6777^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6778
Attributes specified in the metadata are added to all
6780``llvm.loop.unroll_and_jam.*`` loops. See
6781:ref:`Transformation Metadata <transformation-metadata>` for details.
6782
6783'``llvm.loop.licm_versioning.disable``' Metadata
6784^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6785
6786This metadata indicates that the loop should not be versioned for the purpose
6787of enabling loop-invariant code motion (LICM). The metadata has a single operand
6788which is the string ``llvm.loop.licm_versioning.disable``. For example:
6789
6790.. code-block:: llvm
6791
6792   !0 = !{!"llvm.loop.licm_versioning.disable"}
6793
6794'``llvm.loop.distribute.enable``' Metadata
6795^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6796
6797Loop distribution allows splitting a loop into multiple loops.  Currently,
6798this is only performed if the entire loop cannot be vectorized due to unsafe
6799memory dependencies.  The transformation will attempt to isolate the unsafe
6800dependencies into their own loop.
6801
6802This metadata can be used to selectively enable or disable distribution of the
6803loop.  The first operand is the string ``llvm.loop.distribute.enable`` and the
second operand is a bit. If the bit operand value is 1, distribution is
enabled. A value of 0 disables distribution:
6806
6807.. code-block:: llvm
6808
6809   !0 = !{!"llvm.loop.distribute.enable", i1 0}
6810   !1 = !{!"llvm.loop.distribute.enable", i1 1}
6811
6812This metadata should be used in conjunction with ``llvm.loop`` loop
6813identification metadata.
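
A sketch of that combination, with hypothetical block and value names, could
look like this:

.. code-block:: llvm

   for.body:
     ...
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

   for.end:
   ...
   ; The loop ID !0 lists the distribution hint as one of its operands.
   !0 = distinct !{!0, !1}
   !1 = !{!"llvm.loop.distribute.enable", i1 1}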
6814
6815'``llvm.loop.distribute.followup_coincident``' Metadata
6816^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6817
6818This metadata defines which attributes extracted loops with no cyclic
6819dependencies will have (i.e. can be vectorized). See
6820:ref:`Transformation Metadata <transformation-metadata>` for details.
6821
6822'``llvm.loop.distribute.followup_sequential``' Metadata
6823^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6824
6825This metadata defines which attributes the isolated loops with unsafe
6826memory dependencies will have. See
6827:ref:`Transformation Metadata <transformation-metadata>` for details.
6828
6829'``llvm.loop.distribute.followup_fallback``' Metadata
6830^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6831
If loop versioning is necessary, this metadata defines the attributes
6833the non-distributed fallback version will have. See
6834:ref:`Transformation Metadata <transformation-metadata>` for details.
6835
6836'``llvm.loop.distribute.followup_all``' Metadata
6837^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6838
The attributes in this metadata are added to all followup loops of the
6840loop distribution pass. See
6841:ref:`Transformation Metadata <transformation-metadata>` for details.
6842
6843'``llvm.licm.disable``' Metadata
6844^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6845
6846This metadata indicates that loop-invariant code motion (LICM) should not be
6847performed on this loop. The metadata has a single operand which is the string
6848``llvm.licm.disable``. For example:
6849
6850.. code-block:: llvm
6851
6852   !0 = !{!"llvm.licm.disable"}
6853
Note that although it operates per loop, it isn't given the ``llvm.loop`` prefix
as it is not affected by the ``llvm.loop.disable_nonforced`` metadata.
6856
6857'``llvm.access.group``' Metadata
6858^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6859
``llvm.access.group`` metadata can be attached to any instruction that
potentially accesses memory. It can point to a single distinct metadata
node, which we call an access group. This node represents all memory access
instructions referring to it via ``llvm.access.group``. When an
instruction belongs to multiple access groups, it can also point to a
list of access groups, as illustrated by the following example.
6866
6867.. code-block:: llvm
6868
6869   %val = load i32, ptr %arrayidx, !llvm.access.group !0
6870   ...
6871   !0 = !{!1, !2}
6872   !1 = distinct !{}
6873   !2 = distinct !{}
6874
6875It is illegal for the list node to be empty since it might be confused
6876with an access group.
6877
The access group metadata node must be 'distinct' to avoid collapsing
multiple access groups by content. An access group metadata node must
always be empty, which can be used to distinguish an access group
metadata node from a list of access groups. Being empty also avoids the
situation where the content must be updated, which, because metadata is
immutable by design, would require finding and updating all references
to the access group node.
6885
6886The access group can be used to refer to a memory access instruction
6887without pointing to it directly (which is not possible in global
6888metadata). Currently, the only metadata making use of it is
6889``llvm.loop.parallel_accesses``.
6890
6891'``llvm.loop.parallel_accesses``' Metadata
6892^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6893
6894The ``llvm.loop.parallel_accesses`` metadata refers to one or more
6895access group metadata nodes (see ``llvm.access.group``). It denotes that
no loop-carried memory dependence exists between it and other instructions
6897in the loop with this metadata.
6898
Let ``m1`` and ``m2`` be two instructions whose ``llvm.access.group``
metadata refers to the access groups ``g1`` and ``g2``, respectively
(which might be identical). If a loop contains both access groups
in its ``llvm.loop.parallel_accesses`` metadata, then the compiler can
assume that there is no dependency between ``m1`` and ``m2`` carried by
this loop. Instructions that belong to multiple access groups are
considered to have this property if at least one of the access groups
matches the ``llvm.loop.parallel_accesses`` list.
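
The multiple-group rule can be sketched as follows. The value names and node
numbering are hypothetical. The load belongs to two access groups, only one of
which is listed by the loop, and that is sufficient for it to be covered:

.. code-block:: llvm

   for.body:
     ...
     %v = load i32, ptr %p, !llvm.access.group !3  ; member of both groups
     ...
     store i32 %v, ptr %q, !llvm.access.group !1   ; member of the first group
     ...
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

   for.end:
   ...
   !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
   !1 = distinct !{} ; first access group, listed by the loop
   !2 = distinct !{} ; second access group, not listed by the loop
   !3 = !{!1, !2}    ; list of access groups used by the load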
6907
6908If all memory-accessing instructions in a loop have
6909``llvm.access.group`` metadata that each refer to one of the access
6910groups of a loop's ``llvm.loop.parallel_accesses`` metadata, then the
loop has no loop-carried memory dependences and is considered to be a
6912parallel loop.
6913
6914Note that if not all memory access instructions belong to an access
6915group referred to by ``llvm.loop.parallel_accesses``, then the loop must
6916not be considered trivially parallel. Additional
memory dependence analysis is required to make that determination. As a
fail-safe mechanism, this causes loops that were originally parallel to be considered
6919sequential (if optimization passes that are unaware of the parallel semantics
6920insert new memory instructions into the loop body).
6921
The following loop is considered parallel due to its correct use of
both ``llvm.access.group`` and ``llvm.loop.parallel_accesses``
metadata:
6925
6926.. code-block:: llvm
6927
6928   for.body:
6929     ...
6930     %val0 = load i32, ptr %arrayidx, !llvm.access.group !1
6931     ...
6932     store i32 %val0, ptr %arrayidx1, !llvm.access.group !1
6933     ...
6934     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
6935
6936   for.end:
6937   ...
6938   !0 = distinct !{!0, !{!"llvm.loop.parallel_accesses", !1}}
6939   !1 = distinct !{}
6940
6941It is also possible to have nested parallel loops:
6942
6943.. code-block:: llvm
6944
6945   outer.for.body:
6946     ...
6947     %val1 = load i32, ptr %arrayidx3, !llvm.access.group !4
6948     ...
6949     br label %inner.for.body
6950
6951   inner.for.body:
6952     ...
6953     %val0 = load i32, ptr %arrayidx1, !llvm.access.group !3
6954     ...
6955     store i32 %val0, ptr %arrayidx2, !llvm.access.group !3
6956     ...
6957     br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1
6958
6959   inner.for.end:
6960     ...
6961     store i32 %val1, ptr %arrayidx4, !llvm.access.group !4
6962     ...
6963     br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2
6964
6965   outer.for.end:                                          ; preds = %for.body
6966   ...
6967   !1 = distinct !{!1, !{!"llvm.loop.parallel_accesses", !3}}     ; metadata for the inner loop
6968   !2 = distinct !{!2, !{!"llvm.loop.parallel_accesses", !3, !4}} ; metadata for the outer loop
6969   !3 = distinct !{} ; access group for instructions in the inner loop (which are implicitly contained in outer loop as well)
6970   !4 = distinct !{} ; access group for instructions in the outer, but not the inner loop
6971
6972.. _langref_llvm_loop_mustprogress:
6973
6974'``llvm.loop.mustprogress``' Metadata
6975^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
6976
The ``llvm.loop.mustprogress`` metadata indicates that this loop is required to
terminate, unwind, or interact with the environment in an observable way, e.g.
via a volatile memory access, I/O, or other synchronization. If such a loop is
6980not found to interact with the environment in an observable way, the loop may
6981be removed. This corresponds to the ``mustprogress`` function attribute.
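
A minimal sketch, using a hypothetical function ``@spin``: the loop below
never terminates and does nothing observable, so with this metadata attached
the optimizer may remove it.

.. code-block:: llvm

   define void @spin() {
   entry:
     br label %loop

   loop:
     ; No memory access, call, or exit: the loop makes no observable progress.
     br label %loop, !llvm.loop !0
   }

   !0 = distinct !{!0, !1}
   !1 = !{!"llvm.loop.mustprogress"}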
6982
6983'``irr_loop``' Metadata
6984^^^^^^^^^^^^^^^^^^^^^^^
6985
``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that's an irreducible loop header (note that an irreducible loop has more
than one header basic block). If ``irr_loop`` metadata is attached to the
6989terminator instruction of a basic block that is not really an irreducible loop
6990header, the behavior is undefined. The intent of this metadata is to improve the
6991accuracy of the block frequency propagation. For example, in the code below, the
6992block ``header0`` may have a loop header weight (relative to the other headers of
6993the irreducible loop) of 100:
6994
6995.. code-block:: llvm
6996
6997    header0:
6998    ...
6999    br i1 %cmp, label %t1, label %t2, !irr_loop !0
7000
7001    ...
    !0 = !{!"loop_header_weight", i64 100}
7003
7004Irreducible loop header weights are typically based on profile data.
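
For a fuller picture, the following sketch shows an irreducible loop with two
headers, each carrying its own weight. The function name, value names, and
weights are hypothetical. The cycle between ``header0`` and ``header1`` can be
entered at either block, which is what makes it irreducible:

.. code-block:: llvm

   define void @f(i1 %c, i1 %d) {
   entry:
     ; The cycle can be entered through either header.
     br i1 %c, label %header0, label %header1

   header0:
     br i1 %d, label %header1, label %exit, !irr_loop !0

   header1:
     br i1 %d, label %header0, label %exit, !irr_loop !1

   exit:
     ret void
   }

   ; Relative header weights: header1 is weighted three times as heavily.
   !0 = !{!"loop_header_weight", i64 100}
   !1 = !{!"loop_header_weight", i64 300}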
7005
7006.. _md_invariant.group:
7007
7008'``invariant.group``' Metadata
7009^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7010
7011The experimental ``invariant.group`` metadata may be attached to
7012``load``/``store`` instructions referencing a single metadata with no entries.
7013The existence of the ``invariant.group`` metadata on the instruction tells
7014the optimizer that every ``load`` and ``store`` to the same pointer operand
7015can be assumed to load or store the same
7016value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
7017when two pointers are considered the same). Pointers returned by bitcast or
7018getelementptr with only zero indices are considered the same.
7019
7020Examples:
7021
7022.. code-block:: llvm
7023
7024   @unknownPtr = external global i8
7025   ...
7026   %ptr = alloca i8
7027   store i8 42, ptr %ptr, !invariant.group !0
7028   call void @foo(ptr %ptr)
7029
7030   %a = load i8, ptr %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
7031   call void @foo(ptr %ptr)
7032
7033   %newPtr = call ptr @getPointer(ptr %ptr)
7034   %c = load i8, ptr %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr
7035
7036   %unknownValue = load i8, ptr @unknownPtr
7037   store i8 %unknownValue, ptr %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42
7038
7039   call void @foo(ptr %ptr)
7040   %newPtr2 = call ptr @llvm.launder.invariant.group.p0(ptr %ptr)
7041   %d = load i8, ptr %newPtr2, !invariant.group !0  ; Can't step through launder.invariant.group to get value of %ptr
7042
7043   ...
7044   declare void @foo(ptr)
7045   declare ptr @getPointer(ptr)
7046   declare ptr @llvm.launder.invariant.group.p0(ptr)
7047
7048   !0 = !{}
7049
7050The invariant.group metadata must be dropped when replacing one pointer by
7051another based on aliasing information. This is because invariant.group is tied
7052to the SSA value of the pointer operand.
7053
7054.. code-block:: llvm
7055
7056  %v = load i8, ptr %x, !invariant.group !0
7057  ; if %x mustalias %y then we can replace the above instruction with
7058  %v = load i8, ptr %y
7059
7060Note that this is an experimental feature, which means that its semantics might
7061change in the future.
7062
7063'``type``' Metadata
7064^^^^^^^^^^^^^^^^^^^
7065
7066See :doc:`TypeMetadata`.
7067
7068'``associated``' Metadata
7069^^^^^^^^^^^^^^^^^^^^^^^^^
7070
7071The ``associated`` metadata may be attached to a global variable definition with
7072a single argument that references a global object (optionally through an alias).
7073
7074This metadata lowers to the ELF section flag ``SHF_LINK_ORDER`` which prevents
7075discarding of the global variable in linker GC unless the referenced object is
7076also discarded. The linker support for this feature is spotty. For best
7077compatibility, globals carrying this metadata should:
7078
7079- Be in ``@llvm.compiler.used``.
7080- If the referenced global variable is in a comdat, be in the same comdat.
7081
``!associated`` cannot express a many-to-one relationship. A global variable with
7083the metadata should generally not be referenced by a function: the function may
7084be inlined into other functions, leading to more references to the metadata.
7085Ideally we would want to keep metadata alive as long as any inline location is
7086alive, but this many-to-one relationship is not representable. Moreover, if the
7087metadata is retained while the function is discarded, the linker will report an
7088error of a relocation referencing a discarded section.
7089
7090The metadata is often used with an explicit section consisting of valid C
7091identifiers so that the runtime can find the metadata section with
7092linker-defined encapsulation symbols ``__start_<section_name>`` and
7093``__stop_<section_name>``.
7094
7095It does not have any effect on non-ELF targets.
7096
7097Example:
7098
7099.. code-block:: text
7100
7101    $a = comdat any
7102    @a = global i32 1, comdat $a
7103    @b = internal global i32 2, comdat $a, section "abc", !associated !0
7104    !0 = !{ptr @a}
7105
7106
7107'``prof``' Metadata
7108^^^^^^^^^^^^^^^^^^^
7109
7110The ``prof`` metadata is used to record profile data in the IR.
7111The first operand of the metadata node indicates the profile metadata
7112type. There are currently 3 types:
7113:ref:`branch_weights<prof_node_branch_weights>`,
7114:ref:`function_entry_count<prof_node_function_entry_count>`, and
7115:ref:`VP<prof_node_VP>`.
7116
7117.. _prof_node_branch_weights:
7118
7119branch_weights
7120""""""""""""""
7121
7122Branch weight metadata attached to a branch, select, switch or call instruction
7123represents the likeliness of the associated branch being taken.
7124For more information, see :doc:`BranchWeightMetadata`.
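
As a brief sketch with hypothetical labels and weights, a conditional branch
whose true successor is expected to be taken far more often than its false
successor could be annotated like this:

.. code-block:: llvm

   br i1 %cond, label %likely, label %unlikely, !prof !0
   ...
   ; The first weight belongs to the true successor, the second to the false one.
   !0 = !{!"branch_weights", i32 64, i32 4}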
7125
7126.. _prof_node_function_entry_count:
7127
7128function_entry_count
7129""""""""""""""""""""
7130
7131Function entry count metadata can be attached to function definitions
7132to record the number of times the function is called. Used with BFI
7133information, it is also used to derive the basic block profile count.
7134For more information, see :doc:`BranchWeightMetadata`.
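
A minimal sketch, assuming a hypothetical function ``@hot`` that the profile
recorded as entered 1000 times:

.. code-block:: llvm

   define void @hot() !prof !0 {
     ret void
   }

   !0 = !{!"function_entry_count", i64 1000}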
7135
7136.. _prof_node_VP:
7137
7138VP
7139""
7140
7141VP (value profile) metadata can be attached to instructions that have
7142value profile information. Currently this is indirect calls (where it
7143records the hottest callees) and calls to memory intrinsics such as memcpy,
7144memmove, and memset (where it records the hottest byte lengths).
7145
Each VP metadata node contains the string "VP", then a uint32_t value for the
value profiling kind, a uint64_t value for the total number of times the
instruction is executed, followed by pairs of a uint64_t value and its
execution count.
7149The value profiling kind is 0 for indirect call targets and 1 for memory
7150operations. For indirect call targets, each profile value is a hash
7151of the callee function name, and for memory operations each value is the
7152byte length.
7153
7154Note that the value counts do not need to add up to the total count
7155listed in the third operand (in practice only the top hottest values
7156are tracked and reported).
7157
7158Indirect call example:
7159
7160.. code-block:: llvm
7161
7162    call void %f(), !prof !1
7163    !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}
7164
Note that the VP type is 0 (the second operand), which indicates that this is
indirect call value profile data. The third operand indicates that the
indirect call executed 1600 times. The 4th and 6th operands give the
hashes of the 2 hottest target functions' names (this is the same hash used
to represent function names in the profile database), and the 5th and 7th
operands give the number of times each of the respective preceding target
functions was called.
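
Memory operation example, as a sketch with hypothetical operands and counts:

.. code-block:: llvm

    call void @llvm.memcpy.p0.p0.i64(ptr %dst, ptr %src, i64 %len, i1 false), !prof !0
    ...
    !0 = !{!"VP", i32 1, i64 1000, i64 8, i64 700, i64 64, i64 250}

Here the VP type is 1, indicating memory operation value profile data. The
call executed 1000 times in total. The two hottest byte lengths are 8 (seen
700 times) and 64 (seen 250 times). The counts need not add up to the total.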
7172
7173.. _md_annotation:
7174
7175'``annotation``' Metadata
7176^^^^^^^^^^^^^^^^^^^^^^^^^
7177
7178The ``annotation`` metadata can be used to attach a tuple of annotation strings
7179to any instruction. This metadata does not impact the semantics of the program
7180and may only be used to provide additional insight about the program and
7181transformations to users.
7182
7183Example:
7184
7185.. code-block:: text
7186
7187    %a.addr = alloca ptr, align 8, !annotation !0
7188    !0 = !{!"auto-init"}
7189
7190'``func_sanitize``' Metadata
7191^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7192
7193The ``func_sanitize`` metadata is used to attach two values for the function
7194sanitizer instrumentation. The first value is the ubsan function signature.
7195The second value is the address of the proxy variable which stores the address
7196of the RTTI descriptor. If :ref:`prologue <prologuedata>` and '``func_sanitize``'
7197are used at the same time, :ref:`prologue <prologuedata>` is emitted before
7198'``func_sanitize``' in the output.
7199
7200Example:
7201
7202.. code-block:: text
7203
7204    @__llvm_rtti_proxy = private unnamed_addr constant ptr @_ZTIFvvE
7205    define void @_Z3funv() !func_sanitize !0 {
      ret void
7207    }
7208    !0 = !{i32 846595819, ptr @__llvm_rtti_proxy}
7209
7210Module Flags Metadata
7211=====================
7212
7213Information about the module as a whole is difficult to convey to LLVM's
7214subsystems. The LLVM IR isn't sufficient to transmit this information.
7215The ``llvm.module.flags`` named metadata exists in order to facilitate
this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
look it up.
7219
7220The ``llvm.module.flags`` metadata contains a list of metadata triplets.
7221Each triplet has the following form:
7222
7223-  The first element is a *behavior* flag, which specifies the behavior
   when two (or more) modules are merged together and the merge encounters
   two (or more) metadata with the same ID. The supported behaviors are
7226   described below.
7227-  The second element is a metadata string that is a unique ID for the
7228   metadata. Each module may only have one flag entry for each unique ID (not
7229   including entries with the **Require** behavior).
7230-  The third element is the value of the flag.
7231
7232When two (or more) modules are merged together, the resulting
7233``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
7234each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
7236be determined by the merge behavior flag, as described below. The only exception
7237is that entries with the *Require* behavior are always preserved.
7238
7239The following behaviors are supported:
7240
7241.. list-table::
7242   :header-rows: 1
7243   :widths: 10 90
7244
7245   * - Value
7246     - Behavior
7247
7248   * - 1
7249     - **Error**
7250           Emits an error if two values disagree, otherwise the resulting value
7251           is that of the operands.
7252
7253   * - 2
7254     - **Warning**
7255           Emits a warning if two values disagree. The result value will be the
7256           operand for the flag from the first module being linked, or the max
7257           if the other module uses **Max** (in which case the resulting flag
7258           will be **Max**).
7259
7260   * - 3
7261     - **Require**
7262           Adds a requirement that another module flag be present and have a
7263           specified value after linking is performed. The value must be a
7264           metadata pair, where the first element of the pair is the ID of the
7265           module flag to be restricted, and the second element of the pair is
7266           the value the module flag should be restricted to. This behavior can
7267           be used to restrict the allowable results (via triggering of an
7268           error) of linking IDs with the **Override** behavior.
7269
7270   * - 4
7271     - **Override**
7272           Uses the specified value, regardless of the behavior or value of the
7273           other module. If both modules specify **Override**, but the values
7274           differ, an error will be emitted.
7275
7276   * - 5
7277     - **Append**
7278           Appends the two values, which are required to be metadata nodes.
7279
7280   * - 6
7281     - **AppendUnique**
7282           Appends the two values, which are required to be metadata
7283           nodes. However, duplicate entries in the second list are dropped
7284           during the append operation.
7285
7286   * - 7
7287     - **Max**
7288           Takes the max of the two values, which are required to be integers.
7289
7290   * - 8
7291     - **Min**
7292           Takes the min of the two values, which are required to be non-negative integers.
7293           An absent module flag is treated as having the value 0.
7294
7295It is an error for a particular unique flag ID to have multiple behaviors,
7296except in the case of **Require** (which adds restrictions on another metadata
7297value) or **Override**.
7298
7299An example of module flags:
7300
7301.. code-block:: llvm
7302
7303    !0 = !{ i32 1, !"foo", i32 1 }
7304    !1 = !{ i32 4, !"bar", i32 37 }
7305    !2 = !{ i32 2, !"qux", i32 42 }
7306    !3 = !{ i32 3, !"qux",
7307      !{
7308        !"foo", i32 1
7309      }
7310    }
7311    !llvm.module.flags = !{ !0, !1, !2, !3 }
7312
7313-  Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
7314   if two or more ``!"foo"`` flags are seen is to emit an error if their
7315   values are not equal.
7316
7317-  Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
7318   behavior if two or more ``!"bar"`` flags are seen is to use the value
7319   '37'.
7320
7321-  Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
7322   behavior if two or more ``!"qux"`` flags are seen is to emit a
7323   warning if their values are not equal.
7324
7325-  Metadata ``!3`` has the ID ``!"qux"`` and the value:
7326
7327   ::
7328
7329       !{ !"foo", i32 1 }
7330
7331   The behavior is to emit an error if the ``llvm.module.flags`` does not
7332   contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
7333   performed.
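
To illustrate how a merge behavior determines the resulting value, the sketch
below shows two modules that both define a flag with the **Max** behavior,
followed by the flag expected in the linked module. The flag ID
``"example_level"`` is hypothetical, and the three fragments belong to three
separate modules:

.. code-block:: llvm

    ; Module A
    !llvm.module.flags = !{!0}
    !0 = !{ i32 7, !"example_level", i32 1 }

    ; Module B
    !llvm.module.flags = !{!0}
    !0 = !{ i32 7, !"example_level", i32 3 }

    ; Module produced by linking A and B: Max keeps the larger value.
    !llvm.module.flags = !{!0}
    !0 = !{ i32 7, !"example_level", i32 3 }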
7334
7335Synthesized Functions Module Flags Metadata
7336-------------------------------------------
7337
7338These metadata specify the default attributes synthesized functions should have.
7339These metadata are currently respected by a few instrumentation passes, such as
7340sanitizers.
7341
7342These metadata correspond to a few function attributes with significant code
7343generation behaviors. Function attributes with just optimization purposes
7344should not be listed because the performance impact of these synthesized
7345functions is small.
7346
7347- "frame-pointer": **Max**. The value can be 0, 1, or 2. A synthesized function
7348  will get the "frame-pointer" function attribute, with value being "none",
7349  "non-leaf", or "all", respectively.
7350- "function_return_thunk_extern": The synthesized function will get the
7351  ``fn_return_thunk_extern`` function attribute.
- "uwtable": **Max**. The value can be 0, 1, or 2. If the value is 1, a synthesized
  function will get the ``uwtable(sync)`` function attribute; if the value is 2,
  a synthesized function will get the ``uwtable(async)`` function attribute.
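
As a sketch, a module requesting that synthesized functions keep all frame
pointers and use asynchronous unwind tables might carry the following flags
(the node numbering is arbitrary):

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1}
    !0 = !{ i32 7, !"frame-pointer", i32 2 }  ; "frame-pointer"="all"
    !1 = !{ i32 7, !"uwtable", i32 2 }        ; uwtable(async)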
7355
7356Objective-C Garbage Collection Module Flags Metadata
7357----------------------------------------------------
7358
7359On the Mach-O platform, Objective-C stores metadata about garbage
7360collection in a special section called "image info". The metadata
7361consists of a version number and a bitmask specifying what types of
7362garbage collection are supported (if any) by the file. If two or more
7363modules are linked together their garbage collection metadata needs to
7364be merged rather than appended together.
7365
7366The Objective-C garbage collection module flags metadata consists of the
7367following key-value pairs:
7368
7369.. list-table::
7370   :header-rows: 1
7371   :widths: 30 70
7372
7373   * - Key
7374     - Value
7375
7376   * - ``Objective-C Version``
7377     - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.
7378
7379   * - ``Objective-C Image Info Version``
7380     - **[Required]** --- The version of the image info section. Currently
7381       always 0.
7382
7383   * - ``Objective-C Image Info Section``
7384     - **[Required]** --- The section to place the metadata. Valid values are
7385       ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
7386       ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
7387       Objective-C ABI version 2.
7388
7389   * - ``Objective-C Garbage Collection``
7390     - **[Required]** --- Specifies whether garbage collection is supported or
7391       not. Valid values are 0, for no garbage collection, and 2, for garbage
7392       collection supported.
7393
7394   * - ``Objective-C GC Only``
7395     - **[Optional]** --- Specifies that only garbage collection is supported.
7396       If present, its value must be 6. This flag requires that the
7397       ``Objective-C Garbage Collection`` flag have the value 2.
7398
7399Some important flag interactions:
7400
7401-  If a module with ``Objective-C Garbage Collection`` set to 0 is
7402   merged with a module with ``Objective-C Garbage Collection`` set to
7403   2, then the resulting module has the
7404   ``Objective-C Garbage Collection`` flag set to 0.
7405-  A module with ``Objective-C Garbage Collection`` set to 0 cannot be
7406   merged with a module with ``Objective-C GC Only`` set to 6.
7407
7408C type width Module Flags Metadata
7409----------------------------------
7410
7411The ARM backend emits a section into each generated object file describing the
7412options that it was compiled with (in a compiler-independent way) to prevent
7413linking incompatible objects, and to allow automatic library selection. Some
7414of these options are not visible at the IR level, namely wchar_t width and enum
7415width.
7416
7417To pass this information to the backend, these options are encoded in module
7418flags metadata, using the following key-value pairs:
7419
7420.. list-table::
7421   :header-rows: 1
7422   :widths: 30 70
7423
7424   * - Key
7425     - Value
7426
7427   * - short_wchar
7428     - * 0 --- sizeof(wchar_t) == 4
7429       * 1 --- sizeof(wchar_t) == 2
7430
7431   * - short_enum
7432     - * 0 --- Enums are at least as large as an ``int``.
7433       * 1 --- Enums are stored in the smallest integer type which can
7434         represent all of its values.
7435
7436For example, the following metadata section specifies that the module was
7437compiled with a ``wchar_t`` width of 4 bytes, and the underlying type of an
7438enum is the smallest type which can represent all of its values::
7439
7440    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"short_wchar", i32 0}
    !1 = !{i32 1, !"short_enum", i32 1}
7443
7444LTO Post-Link Module Flags Metadata
7445-----------------------------------
7446
Some optimisations can only be performed when the entire LTO unit is present in
the current module. This is represented by the ``LTOPostLink`` module flags
metadata, which will be created with a value of ``1`` when LTO linking occurs.
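
As a minimal sketch, a module produced by an LTO link might carry the flag as
shown below. The behavior value ``1`` (report an error on conflicting values)
is an assumption made for illustration; only the key and the value ``1`` are
stated above::

    !llvm.module.flags = !{!0}
    !0 = !{i32 1, !"LTOPostLink", i32 1}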
7450
7451Embedded Objects Names Metadata
7452===============================
7453
7454Offloading compilations need to embed device code into the host section table to
7455create a fat binary. This metadata node references each global that will be
7456embedded in the module. The primary use for this is to make referencing these
7457globals more efficient in the IR. The metadata references nodes containing
7458pointers to the global to be embedded followed by the section name it will be
7459stored at::
7460
7461    !llvm.embedded.objects = !{!0}
7462    !0 = !{ptr @object, !".section"}
7463
7464Automatic Linker Flags Named Metadata
7465=====================================
7466
7467Some targets support embedding of flags to the linker inside individual object
7468files. Typically this is used in conjunction with language extensions which
7469allow source files to contain linker command line options, and have these
7470automatically be transmitted to the linker via object files.
7471
7472These flags are encoded in the IR using named metadata with the name
7473``!llvm.linker.options``. Each operand is expected to be a metadata node
7474which should be a list of other metadata nodes, each of which should be a
7475list of metadata strings defining linker options.
7476
7477For example, the following metadata section specifies two separate sets of
7478linker options, presumably to link against ``libz`` and the ``Cocoa``
7479framework::
7480
7481    !0 = !{ !"-lz" }
7482    !1 = !{ !"-framework", !"Cocoa" }
7483    !llvm.linker.options = !{ !0, !1 }
7484
7485The metadata encoding as lists of lists of options, as opposed to a collapsed
7486list of options, is chosen so that the IR encoding can use multiple option
7487strings to specify e.g., a single library, while still having that specifier be
7488preserved as an atomic element that can be recognized by a target specific
7489assembly writer or object file emitter.
7490
7491Each individual option is required to be either a valid option for the target's
7492linker, or an option that is reserved by the target specific assembly writer or
7493object file emitter. No other aspect of these options is defined by the IR.
7494
7495Dependent Libs Named Metadata
7496=============================
7497
7498Some targets support embedding of strings into object files to indicate
7499a set of libraries to add to the link. Typically this is used in conjunction
7500with language extensions which allow source files to explicitly declare the
7501libraries they depend on, and have these automatically be transmitted to the
7502linker via object files.
7503
7504The list is encoded in the IR using named metadata with the name
7505``!llvm.dependent-libraries``. Each operand is expected to be a metadata node
7506which should contain a single string operand.
7507
7508For example, the following metadata section contains two library specifiers::
7509
7510    !0 = !{!"a library specifier"}
7511    !1 = !{!"another library specifier"}
7512    !llvm.dependent-libraries = !{ !0, !1 }
7513
7514Each library specifier will be handled independently by the consuming linker.
The effect of the library specifiers is defined by the consuming linker.
7516
7517.. _summary:
7518
7519ThinLTO Summary
7520===============
7521
7522Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
7523causes the building of a compact summary of the module that is emitted into
7524the bitcode. The summary is emitted into the LLVM assembly and identified
7525in syntax by a caret ('``^``').
7526
7527The summary is parsed into a bitcode output, along with the Module
7528IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
7529of optimization (e.g. "``clang -x ir``" and "``opt``"), will ignore the
7530summary entries (just as they currently ignore summary entries in a bitcode
7531input file).
7532
Eventually, the summary will be parsed into a ModuleSummaryIndex object under
the same conditions where a summary index is currently built from bitcode:
specifically, by tools that test the Thin Link portion of a ThinLTO compile
(i.e. llvm-lto and llvm-lto2), or when parsing a combined index
for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag.
This part is not yet implemented; for now, use llvm-as to create a bitcode
object before feeding it into thin link tools.
7540
7541There are currently 3 types of summary entries in the LLVM assembly:
7542:ref:`module paths<module_path_summary>`,
7543:ref:`global values<gv_summary>`, and
7544:ref:`type identifiers<typeid_summary>`.
7545
7546.. _module_path_summary:
7547
7548Module Path Summary Entry
7549-------------------------
7550
7551Each module path summary entry lists a module containing global values included
7552in the summary. For a single IR module there will be one such entry, but
7553in a combined summary index produced during the thin link, there will be
7554one module path entry per linked module with summary.
7555
7556Example:
7557
7558.. code-block:: text
7559
7560    ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))
7561
7562The ``path`` field is a string path to the bitcode file, and the ``hash``
7563field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
7564incremental builds and caching.
7565
7566.. _gv_summary:
7567
7568Global Value Summary Entry
7569--------------------------
7570
7571Each global value summary entry corresponds to a global value defined or
7572referenced by a summarized module.
7573
7574Example:
7575
7576.. code-block:: text
7577
7578    ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831
7579
7580For declarations, there will not be a summary list. For definitions, a
7581global value will contain a list of summaries, one per module containing
7582a definition. There can be multiple entries in a combined summary index
7583for symbols with weak linkage.
7584
7585Each ``Summary`` format will depend on whether the global value is a
7586:ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
7587:ref:`alias<alias_summary>`.
7588
7589.. _function_summary:
7590
7591Function Summary
7592^^^^^^^^^^^^^^^^
7593
7594If the global value is a function, the ``Summary`` entry will look like:
7595
7596.. code-block:: text
7597
    function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Params]?[, Refs]?)
7599
7600The ``module`` field includes the summary entry id for the module containing
7601this definition, and the ``flags`` field contains information such as
7602the linkage type, a flag indicating whether it is legal to import the
7603definition, whether it is globally live and whether the linker resolved it
7604to a local definition (the latter two are populated during the thin link).
7605The ``insts`` field contains the number of IR instructions in the function.
7606Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
7607:ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`,
7608:ref:`Params<params_summary>`, :ref:`Refs<refs_summary>`.
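
As a concrete illustration, the following hypothetical entries describe a
module ``a.o`` that defines a function ``f`` with two instructions and a single
call to a function ``g`` that is only declared. The path, the names, and the
zeroed hash are placeholders chosen for this sketch, not output from a real
compile:

.. code-block:: text

    ^0 = module: (path: "a.o", hash: (0, 0, 0, 0, 0))
    ^1 = gv: (name: "f", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2, calls: ((callee: ^2)))))
    ^2 = gv: (name: "g")

Here ``^2`` has no summary list because ``g`` is a declaration, while ``^1``
carries one function summary that refers back to its module entry ``^0``.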
7609
7610.. _variable_summary:
7611
7612Global Variable Summary
7613^^^^^^^^^^^^^^^^^^^^^^^
7614
7615If the global value is a variable, the ``Summary`` entry will look like:
7616
7617.. code-block:: text
7618
    variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)
7620
7621The variable entry contains a subset of the fields in a
7622:ref:`function summary <function_summary>`, see the descriptions there.
7623
7624.. _alias_summary:
7625
7626Alias Summary
7627^^^^^^^^^^^^^
7628
7629If the global value is an alias, the ``Summary`` entry will look like:
7630
7631.. code-block:: text
7632
7633    alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)
7634
7635The ``module`` and ``flags`` fields are as described for a
7636:ref:`function summary <function_summary>`. The ``aliasee`` field
7637contains a reference to the global value summary entry of the aliasee.
7638
7639.. _funcflags_summary:
7640
7641Function Flags
7642^^^^^^^^^^^^^^
7643
7644The optional ``FuncFlags`` field looks like:
7645
7646.. code-block:: text
7647
7648    funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0)
7649
7650If unspecified, flags are assumed to hold the conservative ``false`` value of
7651``0``.
7652
7653.. _calls_summary:
7654
7655Calls
7656^^^^^
7657
7658The optional ``Calls`` field looks like:
7659
7660.. code-block:: text
7661
7662    calls: ((Callee)[, (Callee)]*)
7663
7664where each ``Callee`` looks like:
7665
7666.. code-block:: text
7667
7668    callee: ^1[, hotness: None]?[, relbf: 0]?
7669
7670The ``callee`` refers to the summary entry id of the callee. At most one
7671of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
7672``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
7673branch frequency relative to the entry frequency, scaled down by 2^8)
7674may be specified. The defaults are ``Unknown`` and ``0``, respectively.
7675
7676.. _params_summary:
7677
7678Params
7679^^^^^^
7680
7681The optional ``Params`` is used by ``StackSafety`` and looks like:
7682
7683.. code-block:: text
7684
    params: ((Param)[, (Param)]*)
7686
7687where each ``Param`` describes pointer parameter access inside of the
7688function and looks like:
7689
7690.. code-block:: text
7691
7692    param: 4, offset: [0, 5][, calls: ((Callee)[, (Callee)]*)]?
7693
where ``param`` is the number of the parameter it describes and ``offset`` is
the inclusive range of offsets from the pointer parameter to bytes which can be
accessed by the function. This range does not include accesses by function
calls from the ``calls`` list.

Each ``Callee`` in that list describes how the parameter is forwarded into
other functions and looks like:
7701
7702.. code-block:: text
7703
7704    callee: ^3, param: 5, offset: [-3, 3]
7705
The ``callee`` refers to the summary entry id of the callee, and ``param`` is
the number of the callee parameter which points into the caller's parameter
with an offset known to be inside of the ``offset`` range. ``calls`` will be
consumed and removed by the thin link stage to update ``Param::offset`` so it
covers all accesses possible by ``calls``.

A pointer parameter without a corresponding ``Param`` is considered unsafe, and
we assume that access with any offset is possible.
7714
7715Example:
7716
7717If we have the following function:
7718
7719.. code-block:: text
7720
7721    define i64 @foo(ptr %0, ptr %1, ptr %2, i8 %3) {
7722      store ptr %1, ptr @x
7723      %5 = getelementptr inbounds i8, ptr %2, i64 5
7724      %6 = load i8, ptr %5
7725      %7 = getelementptr inbounds i8, ptr %2, i8 %3
7726      tail call void @bar(i8 %3, ptr %7)
7727      %8 = load i64, ptr %0
7728      ret i64 %8
7729    }
7730
We can expect a record like this:
7732
7733.. code-block:: text
7734
7735    params: ((param: 0, offset: [0, 7]),(param: 2, offset: [5, 5], calls: ((callee: ^3, param: 1, offset: [-128, 127]))))
7736
The function may access just 8 bytes of the parameter %0. ``calls`` is empty,
so the parameter is either not used for function calls or ``offset`` already
covers all accesses from nested function calls.
Parameter %1 escapes, so access is unknown.
The function itself can access just a single byte of the parameter %2. Additional
access is possible inside of ``@bar``, i.e. ``^3``. The function adds a signed
offset to the pointer and passes the result as the argument %1 into ``^3``.
This record itself does not tell us how ``^3`` will access the parameter.
Parameter %3 is not a pointer.
7746
7747.. _refs_summary:
7748
7749Refs
7750^^^^
7751
7752The optional ``Refs`` field looks like:
7753
7754.. code-block:: text
7755
7756    refs: ((Ref)[, (Ref)]*)
7757
7758where each ``Ref`` contains a reference to the summary id of the referenced
7759value (e.g. ``^1``).
7760
7761.. _typeidinfo_summary:
7762
7763TypeIdInfo
7764^^^^^^^^^^
7765
7766The optional ``TypeIdInfo`` field, used for
7767`Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
7768looks like:
7769
7770.. code-block:: text
7771
7772    typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?
7773
7774These optional fields have the following forms:
7775
7776TypeTests
7777"""""""""
7778
7779.. code-block:: text
7780
7781    typeTests: (TypeIdRef[, TypeIdRef]*)
7782
7783Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
7784by summary id or ``GUID``.
7785
7786TypeTestAssumeVCalls
7787""""""""""""""""""""
7788
7789.. code-block:: text
7790
7791    typeTestAssumeVCalls: (VFuncId[, VFuncId]*)
7792
7793Where each VFuncId has the format:
7794
7795.. code-block:: text
7796
7797    vFuncId: (TypeIdRef, offset: 16)
7798
7799Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
7800by summary id or ``GUID`` preceded by a ``guid:`` tag.
7801
7802TypeCheckedLoadVCalls
7803"""""""""""""""""""""
7804
7805.. code-block:: text
7806
7807    typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)
7808
7809Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.
7810
7811TypeTestAssumeConstVCalls
7812"""""""""""""""""""""""""
7813
7814.. code-block:: text
7815
7816    typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)
7817
7818Where each ConstVCall has the format:
7819
7820.. code-block:: text
7821
7822    (VFuncId, args: (Arg[, Arg]*))
7823
7824and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
7825and each Arg is an integer argument number.
7826
7827TypeCheckedLoadConstVCalls
7828""""""""""""""""""""""""""
7829
7830.. code-block:: text
7831
7832    typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)
7833
7834Where each ConstVCall has the format described for
7835``TypeTestAssumeConstVCalls``.
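
To make the nesting easier to follow, here is a hypothetical ``TypeIdInfo``
field that combines three of the forms above. The summary ids ``^5`` and
``^6``, the GUID ``123``, and the offsets and argument value are placeholders,
and the sketch assumes the field body is wrapped in parentheses like the other
summary fields:

.. code-block:: text

    typeIdInfo: (typeTests: (^5), typeTestAssumeVCalls: (vFuncId: (^6, offset: 16)), typeCheckedLoadConstVCalls: ((vFuncId: (guid: 123, offset: 8), args: (42))))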
7836
7837.. _typeid_summary:
7838
7839Type ID Summary Entry
7840---------------------
7841
7842Each type id summary entry corresponds to a type identifier resolution
7843which is generated during the LTO link portion of the compile when building
7844with `Control Flow Integrity <https://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
7845so these are only present in a combined summary index.
7846
7847Example:
7848
7849.. code-block:: text
7850
7851    ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778
7852
7853The ``typeTestRes`` gives the type test resolution ``kind`` (which may
7854be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
7855the ``size-1`` bit width. It is followed by optional flags, which default to 0,
7856and an optional WpdResolutions (whole program devirtualization resolution)
7857field that looks like:
7858
7859.. code-block:: text
7860
    wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)
7862
7863where each entry is a mapping from the given byte offset to the whole-program
7864devirtualization resolution WpdRes, that has one of the following formats:
7865
7866.. code-block:: text
7867
7868    wpdRes: (kind: branchFunnel)
7869    wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
7870    wpdRes: (kind: indir)
7871
7872Additionally, each wpdRes has an optional ``resByArg`` field, which
7873describes the resolutions for calls with all constant integer arguments:
7874
7875.. code-block:: text
7876
7877    resByArg: (ResByArg[, ResByArg]*)
7878
7879where ResByArg is:
7880
7881.. code-block:: text
7882
7883    args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0][, byte: 0][, bit: 0])
7884
7885Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
7886or ``VirtualConstProp``. The ``info`` field is only used if the kind
7887is ``UniformRetVal`` (indicates the uniform return value), or
7888``UniqueRetVal`` (holds the return value associated with the unique vtable
7889(0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
7890not support the use of absolute symbols to store constants.
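
Putting the ``typeTestRes`` and ``wpdResolutions`` pieces together, a complete
entry might look like the following sketch. The summary id, the byte offsets,
and the choice of resolutions are illustrative assumptions rather than output
from a real link:

.. code-block:: text

    ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7), wpdResolutions: ((offset: 0, wpdRes: (kind: branchFunnel)), (offset: 8, wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")))))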
7891
7892.. _intrinsicglobalvariables:
7893
7894Intrinsic Global Variables
7895==========================
7896
7897LLVM has a number of "magic" global variables that contain data that
7898affect code generation or other IR semantics. These are documented here.
7899All globals of this sort should have a section specified as
7900"``llvm.metadata``". This section and all globals that start with
7901"``llvm.``" are reserved for use by LLVM.
7902
7903.. _gv_llvmused:
7904
7905The '``llvm.used``' Global Variable
7906-----------------------------------
7907
7908The ``@llvm.used`` global is an array which has
7909:ref:`appending linkage <linkage_appending>`. This array contains a list of
7910pointers to named global variables, functions and aliases which may optionally
7911have a pointer cast formed of bitcast or getelementptr. For example, a legal
7912use of it is:
7913
7914.. code-block:: llvm
7915
7916    @X = global i8 4
7917    @Y = global i32 123
7918
7919    @llvm.used = appending global [2 x ptr] [
7920       ptr @X,
7921       ptr @Y
7922    ], section "llvm.metadata"
7923
7924If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
7925and linker are required to treat the symbol as if there is a reference to the
7926symbol that it cannot see (which is why they have to be named). For example, if
7927a variable has internal linkage and no references other than that from the
7928``@llvm.used`` list, it cannot be deleted. This is commonly used to represent
7929references from inline asms and other things the compiler cannot "see", and
corresponds to "``__attribute__((used))``" in GNU C.
7931
7932On some targets, the code generator must emit a directive to the
7933assembler or object file to prevent the assembler and linker from
7934removing the symbol.
7935
7936.. _gv_llvmcompilerused:
7937
7938The '``llvm.compiler.used``' Global Variable
7939--------------------------------------------
7940
7941The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
7942directive, except that it only prevents the compiler from touching the
7943symbol. On targets that support it, this allows an intelligent linker to
7944optimize references to the symbol without being impeded as it would be
7945by ``@llvm.used``.
7946
This construct should only be used in rare circumstances,
and should not be exposed to source languages.
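
By analogy with the ``@llvm.used`` example, a use of this variable could be
sketched as follows. The global ``@Z`` is a hypothetical name introduced only
for illustration:

.. code-block:: llvm

    ; @Z has internal linkage and no other references, but the compiler
    ; must keep it because it is listed in @llvm.compiler.used.
    @Z = internal global i32 0

    @llvm.compiler.used = appending global [1 x ptr] [
       ptr @Z
    ], section "llvm.metadata"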
7949
7950.. _gv_llvmglobalctors:
7951
7952The '``llvm.global_ctors``' Global Variable
7953-------------------------------------------
7954
7955.. code-block:: llvm
7956
7957    %0 = type { i32, ptr, ptr }
7958    @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, ptr @ctor, ptr @data }]
7959
7960The ``@llvm.global_ctors`` array contains a list of constructor
7961functions, priorities, and an associated global or function.
7962The functions referenced by this array will be called in ascending order
7963of priority (i.e. lowest first) when the module is loaded. The order of
7964functions with the same priority is not defined.
7965
7966If the third field is non-null, and points to a global variable
7967or function, the initializer function will only run if the associated
7968data from the current module is not discarded.
7969On ELF the referenced global variable or function must be in a comdat.
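
The priority ordering can be illustrated with a small sketch that registers
two constructors and leaves the third field null. The function names
``@init_first`` and ``@init_second`` and the priority ``101`` are hypothetical
choices for this example:

.. code-block:: llvm

    %0 = type { i32, ptr, ptr }

    define internal void @init_first() {
      ret void
    }

    define internal void @init_second() {
      ret void
    }

    ; Priority 101 is lower than 65535, so @init_first is called before
    ; @init_second when the module is loaded.
    @llvm.global_ctors = appending global [2 x %0] [
      %0 { i32 101, ptr @init_first, ptr null },
      %0 { i32 65535, ptr @init_second, ptr null }
    ]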
7970
7971.. _llvmglobaldtors:
7972
7973The '``llvm.global_dtors``' Global Variable
7974-------------------------------------------
7975
7976.. code-block:: llvm
7977
7978    %0 = type { i32, ptr, ptr }
7979    @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, ptr @dtor, ptr @data }]
7980
7981The ``@llvm.global_dtors`` array contains a list of destructor
7982functions, priorities, and an associated global or function.
7983The functions referenced by this array will be called in descending
7984order of priority (i.e. highest first) when the module is unloaded. The
7985order of functions with the same priority is not defined.
7986
7987If the third field is non-null, and points to a global variable
7988or function, the destructor function will only run if the associated
7989data from the current module is not discarded.
7990On ELF the referenced global variable or function must be in a comdat.
7991
7992Instruction Reference
7993=====================
7994
7995The LLVM instruction set consists of several different classifications
7996of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
7997instructions <binaryops>`, :ref:`bitwise binary
7998instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
7999:ref:`other instructions <otherops>`.
8000
8001.. _terminators:
8002
8003Terminator Instructions
8004-----------------------
8005
8006As mentioned :ref:`previously <functionstructure>`, every basic block in a
8007program ends with a "Terminator" instruction, which indicates which
8008block should be executed after the current block is finished. These
8009terminator instructions typically yield a '``void``' value: they produce
8010control flow, not values (the one exception being the
8011':ref:`invoke <i_invoke>`' instruction).
8012
8013The terminator instructions are: ':ref:`ret <i_ret>`',
8014':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
8015':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`callbr <i_callbr>`',
8017':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
8018':ref:`catchret <i_catchret>`',
8019':ref:`cleanupret <i_cleanupret>`',
8020and ':ref:`unreachable <i_unreachable>`'.
8021
8022.. _i_ret:
8023
8024'``ret``' Instruction
8025^^^^^^^^^^^^^^^^^^^^^
8026
8027Syntax:
8028"""""""
8029
8030::
8031
8032      ret <type> <value>       ; Return a value from a non-void function
8033      ret void                 ; Return from void function
8034
8035Overview:
8036"""""""""
8037
8038The '``ret``' instruction is used to return control flow (and optionally
8039a value) from a function back to the caller.
8040
There are two forms of the '``ret``' instruction: one that returns a
value and then transfers control, and one that only transfers
control.
8044
8045Arguments:
8046""""""""""
8047
8048The '``ret``' instruction optionally accepts a single argument, the
8049return value. The type of the return value must be a ':ref:`first
8050class <t_firstclass>`' type.
8051
8052A function is not :ref:`well formed <wellformed>` if it has a non-void
8053return type and contains a '``ret``' instruction with no return value or
8054a return value with a type that does not match its type, or if it has a
8055void return type and contains a '``ret``' instruction with a return
8056value.
8057
8058Semantics:
8059""""""""""
8060
8061When the '``ret``' instruction is executed, control flow returns back to
8062the calling function's context. If the caller is a
8063":ref:`call <i_call>`" instruction, execution continues at the
instruction after the call. If the caller is an
8065":ref:`invoke <i_invoke>`" instruction, execution continues at the
8066beginning of the "normal" destination block. If the instruction returns
8067a value, that value shall set the call or invoke instruction's return
8068value.
8069
8070Example:
8071""""""""
8072
8073.. code-block:: llvm
8074
8075      ret i32 5                       ; Return an integer value of 5
8076      ret void                        ; Return from a void function
8077      ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
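
The fragments above can also be read in the context of complete functions. In
the following minimal sketch, the names ``@five`` and ``@caller`` are
hypothetical; it shows the value of a '``ret``' becoming the result of the
'``call``' that invoked the function:

.. code-block:: llvm

    define i32 @five() {
      ret i32 5                ; becomes the value of %r in @caller
    }

    define i32 @caller() {
      %r = call i32 @five()    ; execution resumes here after the ret
      %s = add i32 %r, 1
      ret i32 %s
    }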
8078
8079.. _i_br:
8080
8081'``br``' Instruction
8082^^^^^^^^^^^^^^^^^^^^
8083
8084Syntax:
8085"""""""
8086
8087::
8088
8089      br i1 <cond>, label <iftrue>, label <iffalse>
8090      br label <dest>          ; Unconditional branch
8091
8092Overview:
8093"""""""""
8094
8095The '``br``' instruction is used to cause control flow to transfer to a
8096different basic block in the current function. There are two forms of
8097this instruction, corresponding to a conditional branch and an
8098unconditional branch.
8099
8100Arguments:
8101""""""""""
8102
8103The conditional branch form of the '``br``' instruction takes a single
8104'``i1``' value and two '``label``' values. The unconditional form of the
8105'``br``' instruction takes a single '``label``' value as a target.
8106
8107Semantics:
8108""""""""""
8109
Upon execution of a conditional '``br``' instruction, the '``i1``'
argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If the value is ``false``, control flows
to the '``iffalse``' ``label`` argument.
8114If '``cond``' is ``poison`` or ``undef``, this instruction has undefined
8115behavior.
8116
8117Example:
8118""""""""
8119
8120.. code-block:: llvm
8121
8122    Test:
8123      %cond = icmp eq i32 %a, %b
8124      br i1 %cond, label %IfEqual, label %IfUnequal
8125    IfEqual:
8126      ret i32 1
8127    IfUnequal:
8128      ret i32 0
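
Because branching on ``poison`` or ``undef`` is undefined behavior, a
condition that might be ``poison`` can be passed through '``freeze``' before it
is used. The following is only a sketch of that pattern; the function and
label names are hypothetical:

.. code-block:: llvm

    define i32 @branch_on_sum(i32 %a, i32 %b) {
    entry:
      %sum = add nsw i32 %a, %b      ; poison on signed overflow
      %cond = icmp sgt i32 %sum, 0   ; poison if %sum is poison
      %safe = freeze i1 %cond        ; an arbitrary but fixed i1 value
      br i1 %safe, label %positive, label %other
    positive:
      ret i32 1
    other:
      ret i32 0
    }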
8129
8130.. _i_switch:
8131
8132'``switch``' Instruction
8133^^^^^^^^^^^^^^^^^^^^^^^^
8134
8135Syntax:
8136"""""""
8137
8138::
8139
8140      switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]
8141
8142Overview:
8143"""""""""
8144
8145The '``switch``' instruction is used to transfer control flow to one of
8146several different places. It is a generalization of the '``br``'
8147instruction, allowing a branch to occur to one of many possible
8148destinations.
8149
8150Arguments:
8151""""""""""
8152
8153The '``switch``' instruction uses three parameters: an integer
8154comparison value '``value``', a default '``label``' destination, and an
8155array of pairs of comparison value constants and '``label``'s. The table
8156is not allowed to contain duplicate constant entries.
8157
8158Semantics:
8159""""""""""
8160
8161The ``switch`` instruction specifies a table of values and destinations.
8162When the '``switch``' instruction is executed, this table is searched
8163for the given value. If the value is found, control flow is transferred
8164to the corresponding destination; otherwise, control flow is transferred
8165to the default destination.
8166If '``value``' is ``poison`` or ``undef``, this instruction has undefined
8167behavior.
8168
8169Implementation:
8170"""""""""""""""
8171
8172Depending on properties of the target machine and the particular
8173``switch`` instruction, this instruction may be code generated in
8174different ways. For example, it could be generated as a series of
8175chained conditional branches or with a lookup table.
8176
8177Example:
8178""""""""
8179
8180.. code-block:: llvm
8181
8182     ; Emulate a conditional br instruction
8183     %Val = zext i1 %value to i32
8184     switch i32 %Val, label %truedest [ i32 0, label %falsedest ]
8185
8186     ; Emulate an unconditional br instruction
8187     switch i32 0, label %dest [ ]
8188
8189     ; Implement a jump table:
8190     switch i32 %val, label %otherwise [ i32 0, label %onzero
8191                                         i32 1, label %onone
8192                                         i32 2, label %ontwo ]
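
For reference, the jump-table fragment can be embedded in a complete function.
This is a minimal sketch in which the function name ``@classify`` and the
returned constants are arbitrary choices:

.. code-block:: llvm

    define i32 @classify(i32 %val) {
    entry:
      switch i32 %val, label %otherwise [ i32 0, label %onzero
                                          i32 1, label %onone ]
    onzero:
      ret i32 10
    onone:
      ret i32 11
    otherwise:                       ; taken for every other value of %val
      ret i32 -1
    }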
8193
8194.. _i_indirectbr:
8195
8196'``indirectbr``' Instruction
8197^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8198
8199Syntax:
8200"""""""
8201
8202::
8203
8204      indirectbr ptr <address>, [ label <dest1>, label <dest2>, ... ]
8205
8206Overview:
8207"""""""""
8208
8209The '``indirectbr``' instruction implements an indirect branch to a
8210label within the current function, whose address is specified by
8211"``address``". Address must be derived from a
8212:ref:`blockaddress <blockaddress>` constant.
8213
8214Arguments:
8215""""""""""
8216
8217The '``address``' argument is the address of the label to jump to. The
8218rest of the arguments indicate the full set of possible destinations
8219that the address may point to. Blocks are allowed to occur multiple
8220times in the destination list, though this isn't particularly useful.
8221
8222This destination list is required so that dataflow analysis has an
8223accurate understanding of the CFG.
8224
8225Semantics:
8226""""""""""
8227
8228Control transfers to the block specified in the address argument. All
8229possible destination blocks must be listed in the label list, otherwise
8230this instruction has undefined behavior. This implies that jumps to
8231labels defined in other functions have undefined behavior as well.
8232If '``address``' is ``poison`` or ``undef``, this instruction has undefined
8233behavior.
8234
8235Implementation:
8236"""""""""""""""
8237
8238This is typically implemented with a jump through a register.
8239
8240Example:
8241""""""""
8242
8243.. code-block:: llvm
8244
8245     indirectbr ptr %Addr, [ label %bb1, label %bb2, label %bb3 ]
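
The fragment above does not show where ``%Addr`` comes from. As a sketch, the
following hypothetical function ``@dispatch`` derives the address from
``blockaddress`` constants and lists every possible destination:

.. code-block:: llvm

    define i32 @dispatch(i1 %flag) {
    entry:
      ; Both candidate addresses come from blockaddress constants that
      ; refer to blocks of this function.
      %addr = select i1 %flag, ptr blockaddress(@dispatch, %bb1),
                               ptr blockaddress(@dispatch, %bb2)
      indirectbr ptr %addr, [ label %bb1, label %bb2 ]
    bb1:
      ret i32 1
    bb2:
      ret i32 2
    }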
8246
8247.. _i_invoke:
8248
8249'``invoke``' Instruction
8250^^^^^^^^^^^^^^^^^^^^^^^^
8251
8252Syntax:
8253"""""""
8254
8255::
8256
8257      <result> = invoke [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
8258                    [operand bundles] to label <normal label> unwind label <exception label>
8259
8260Overview:
8261"""""""""
8262
8263The '``invoke``' instruction causes control to transfer to a specified
8264function, with the possibility of control flow transfer to either the
8265'``normal``' label or the '``exception``' label. If the callee function
8266returns with the "``ret``" instruction, control flow will return to the
8267"normal" label. If the callee (or any indirect callees) returns via the
8268":ref:`resume <i_resume>`" instruction or other exception handling
8269mechanism, control is interrupted and continued at the dynamically
8270nearest "exception" label.
8271
8272The '``exception``' label is a `landing
8273pad <ExceptionHandling.html#overview>`_ for the exception. As such,
8274'``exception``' label is required to have the
8275":ref:`landingpad <i_landingpad>`" instruction, which contains the
8276information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
8279instruction, so that the important information contained within the
8280"``landingpad``" instruction can't be lost through normal code motion.
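
The relationship between the two labels and the landing pad can be sketched
with a small function. The callee ``@may_throw`` and the function name
``@try_call`` are hypothetical, and the Itanium C++ personality routine is
assumed purely for illustration:

.. code-block:: llvm

    declare void @may_throw()
    declare i32 @__gxx_personality_v0(...)

    define i32 @try_call() personality ptr @__gxx_personality_v0 {
    entry:
      invoke void @may_throw()
              to label %normal unwind label %lpad
    normal:                          ; reached if @may_throw returns via ret
      ret i32 0
    lpad:                            ; reached if @may_throw unwinds
      %exn = landingpad { ptr, i32 }
              cleanup
      resume { ptr, i32 } %exn       ; continue unwinding into the caller
    }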
8281
8282Arguments:
8283""""""""""
8284
8285This instruction requires several arguments:
8286
8287#. The optional "cconv" marker indicates which :ref:`calling
8288   convention <callingconv>` the call should use. If none is
8289   specified, the call defaults to using C calling conventions.
8290#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
8291   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
8292   are valid here.
8293#. The optional addrspace attribute can be used to indicate the address space
8294   of the called function. If it is not specified, the program address space
8295   from the :ref:`datalayout string<langref_datalayout>` will be used.
8296#. '``ty``': the type of the call instruction itself which is also the
8297   type of the return value. Functions that return no value are marked
8298   ``void``.
8299#. '``fnty``': shall be the signature of the function being invoked. The
8300   argument types must match the types implied by this signature. This
8301   type can be omitted if the function is not varargs.
8302#. '``fnptrval``': An LLVM value containing a pointer to a function to
8303   be invoked. In most cases, this is a direct function invocation, but
8304   indirect ``invoke``'s are just as possible, calling an arbitrary pointer
8305   to function value.
8306#. '``function args``': argument list whose types match the function
8307   signature argument types and parameter attributes. All arguments must
8308   be of :ref:`first class <t_firstclass>` type. If the function signature
8309   indicates the function accepts a variable number of arguments, the
8310   extra arguments can be specified.
8311#. '``normal label``': the label reached when the called function
8312   executes a '``ret``' instruction.
8313#. '``exception label``': the label reached when a callee returns via
8314   the :ref:`resume <i_resume>` instruction or other exception handling
8315   mechanism.
8316#. The optional :ref:`function attributes <fnattrs>` list.
8317#. The optional :ref:`operand bundles <opbundles>` list.
8318
8319Semantics:
8320""""""""""
8321
8322This instruction is designed to operate as a standard '``call``'
8323instruction in most regards. The primary difference is that it
8324establishes an association with a label, which is used by the runtime
8325library to unwind the stack.
8326
8327This instruction is used in languages with destructors to ensure that
8328proper cleanup is performed in the case of either a ``longjmp`` or a
8329thrown exception. Additionally, this is important for implementation of
8330'``catch``' clauses in high-level languages that support them.
8331
8332For the purposes of the SSA form, the definition of the value returned
8333by the '``invoke``' instruction is deemed to occur on the edge from the
8334current block to the "normal" label. If the callee unwinds then no
8335return value is available.
8336
8337Example:
8338""""""""
8339
8340.. code-block:: llvm
8341
8342      %retval = invoke i32 @Test(i32 15) to label %Continue
8343                  unwind label %TestCleanup              ; i32:retval set
8344      %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
8345                  unwind label %TestCleanup              ; i32:retval set
8346
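As a minimal sketch of how an ``invoke`` fits into a complete function, the
module below pairs it with a ``landingpad`` and a ``resume``. It assumes the
Itanium C++ personality routine ``@__gxx_personality_v0``; the functions
``@may_throw`` and ``@release`` are hypothetical names used only for
illustration.

.. code-block:: llvm

      declare i32 @may_throw(i32)
      declare void @release(ptr)
      declare i32 @__gxx_personality_v0(...)

      define i32 @caller(ptr %obj) personality ptr @__gxx_personality_v0 {
      entry:
        %r = invoke i32 @may_throw(i32 15)
                to label %cont unwind label %lpad

      cont:                               ; normal return: %r is available here
        call void @release(ptr %obj)
        ret i32 %r

      lpad:                               ; unwind path: %r is not available here
        %exn = landingpad { ptr, i32 }
                cleanup
        call void @release(ptr %obj)      ; run the cleanup
        resume { ptr, i32 } %exn          ; continue unwinding into the caller
      }

Note that ``%r`` is used only in ``%cont``, which matches the rule that the
result is defined on the edge to the "normal" label.
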
8347.. _i_callbr:
8348
8349'``callbr``' Instruction
8350^^^^^^^^^^^^^^^^^^^^^^^^
8351
8352Syntax:
8353"""""""
8354
8355::
8356
8357      <result> = callbr [cconv] [ret attrs] [addrspace(<num>)] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
8358                    [operand bundles] to label <fallthrough label> [indirect labels]
8359
8360Overview:
8361"""""""""
8362
8363The '``callbr``' instruction causes control to transfer to a specified
8364function, with the possibility of control flow transfer to either the
8365'``fallthrough``' label or one of the '``indirect``' labels.
8366
8367This instruction should only be used to implement the "goto" feature of gcc
8368style inline assembly. Any other usage is an error in the IR verifier.
8369
8370Arguments:
8371""""""""""
8372
8373This instruction requires several arguments:
8374
8375#. The optional "cconv" marker indicates which :ref:`calling
8376   convention <callingconv>` the call should use. If none is
8377   specified, the call defaults to using C calling conventions.
8378#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
8379   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
8380   are valid here.
8381#. The optional addrspace attribute can be used to indicate the address space
8382   of the called function. If it is not specified, the program address space
8383   from the :ref:`datalayout string<langref_datalayout>` will be used.
8384#. '``ty``': the type of the call instruction itself which is also the
8385   type of the return value. Functions that return no value are marked
8386   ``void``.
8387#. '``fnty``': shall be the signature of the function being called. The
8388   argument types must match the types implied by this signature. This
8389   type can be omitted if the function is not varargs.
8390#. '``fnptrval``': An LLVM value containing a pointer to a function to
8391   be called. In most cases, this is a direct function call, but
8392   other ``callbr``'s are just as possible, calling an arbitrary pointer
8393   to function value.
8394#. '``function args``': argument list whose types match the function
8395   signature argument types and parameter attributes. All arguments must
8396   be of :ref:`first class <t_firstclass>` type. If the function signature
8397   indicates the function accepts a variable number of arguments, the
8398   extra arguments can be specified.
8399#. '``fallthrough label``': the label reached when the inline assembly's
8400   execution exits the bottom.
8401#. '``indirect labels``': the labels reached when a callee transfers control
8402   to a location other than the '``fallthrough label``'. Label constraints
8403   refer to these destinations.
8404#. The optional :ref:`function attributes <fnattrs>` list.
8405#. The optional :ref:`operand bundles <opbundles>` list.
8406
8407Semantics:
8408""""""""""
8409
8410This instruction is designed to operate as a standard '``call``'
8411instruction in most regards. The primary difference is that it
8412establishes an association with additional labels to define where control
8413flow goes after the call.
8414
The output values of a '``callbr``' instruction are available only to
the '``fallthrough``' block, not to any '``indirect``' block(s).
8417
8418The only use of this today is to implement the "goto" feature of gcc inline
8419assembly where additional labels can be provided as locations for the inline
8420assembly to jump to.
8421
8422Example:
8423""""""""
8424
8425.. code-block:: llvm
8426
8427      ; "asm goto" without output constraints.
8428      callbr void asm "", "r,!i"(i32 %x)
8429                  to label %fallthrough [label %indirect]
8430
8431      ; "asm goto" with output constraints.
8432      <result> = callbr i32 asm "", "=r,r,!i"(i32 %x)
8433                  to label %fallthrough [label %indirect]
8434
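For context, a ``callbr`` might sit inside a complete function as sketched
below. The assembly string is an assumption: it uses x86 AT&T syntax and
jumps to the indirect label when ``%x`` is nonzero. The function name
``@is_nonzero`` is hypothetical.

.. code-block:: llvm

      define i32 @is_nonzero(i32 %x) {
      entry:
        ; $0 is the "r" input operand, ${1:l} is the "!i" label operand.
        callbr void asm sideeffect "testl $0, $0; jne ${1:l}", "r,!i,~{dirflag},~{fpsr},~{flags}"(i32 %x)
                to label %fallthrough [label %indirect]

      fallthrough:                        ; the assembly ran off its end
        ret i32 0

      indirect:                           ; the assembly jumped to the label
        ret i32 1
      }
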
8435.. _i_resume:
8436
8437'``resume``' Instruction
8438^^^^^^^^^^^^^^^^^^^^^^^^
8439
8440Syntax:
8441"""""""
8442
8443::
8444
8445      resume <type> <value>
8446
8447Overview:
8448"""""""""
8449
8450The '``resume``' instruction is a terminator instruction that has no
8451successors.
8452
8453Arguments:
8454""""""""""
8455
8456The '``resume``' instruction requires one argument, which must have the
8457same type as the result of any '``landingpad``' instruction in the same
8458function.
8459
8460Semantics:
8461""""""""""
8462
8463The '``resume``' instruction resumes propagation of an existing
8464(in-flight) exception whose unwinding was interrupted with a
8465:ref:`landingpad <i_landingpad>` instruction.
8466
8467Example:
8468""""""""
8469
8470.. code-block:: llvm
8471
8472      resume { ptr, i32 } %exn
8473
8474.. _i_catchswitch:
8475
8476'``catchswitch``' Instruction
8477^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8478
8479Syntax:
8480"""""""
8481
8482::
8483
8484      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
8485      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>
8486
8487Overview:
8488"""""""""
8489
8490The '``catchswitch``' instruction is used by `LLVM's exception handling system
8491<ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
8492that may be executed by the :ref:`EH personality routine <personalityfn>`.
8493
8494Arguments:
8495""""""""""
8496
8497The ``parent`` argument is the token of the funclet that contains the
8498``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
8499this operand may be the token ``none``.
8500
8501The ``default`` argument is the label of another basic block beginning with
8502either a ``cleanuppad`` or ``catchswitch`` instruction.  This unwind destination
8503must be a legal target with respect to the ``parent`` links, as described in
8504the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
8505
8506The ``handlers`` are a nonempty list of successor blocks that each begin with a
8507:ref:`catchpad <i_catchpad>` instruction.
8508
8509Semantics:
8510""""""""""
8511
8512Executing this instruction transfers control to one of the successors in
8513``handlers``, if appropriate, or continues to unwind via the unwind label if
8514present.
8515
8516The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
8517it must be both the first non-phi instruction and last instruction in the basic
8518block. Therefore, it must be the only non-phi instruction in the block.
8519
8520Example:
8521""""""""
8522
8523.. code-block:: text
8524
8525    dispatch1:
8526      %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
8527    dispatch2:
8528      %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup
8529
8530.. _i_catchret:
8531
8532'``catchret``' Instruction
8533^^^^^^^^^^^^^^^^^^^^^^^^^^
8534
8535Syntax:
8536"""""""
8537
8538::
8539
8540      catchret from <token> to label <normal>
8541
8542Overview:
8543"""""""""
8544
8545The '``catchret``' instruction is a terminator instruction that has a
8546single successor.
8547
8548
8549Arguments:
8550""""""""""
8551
8552The first argument to a '``catchret``' indicates which ``catchpad`` it
8553exits.  It must be a :ref:`catchpad <i_catchpad>`.
8554The second argument to a '``catchret``' specifies where control will
8555transfer to next.
8556
8557Semantics:
8558""""""""""
8559
8560The '``catchret``' instruction ends an existing (in-flight) exception whose
8561unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction.  The
8562:ref:`personality function <personalityfn>` gets a chance to execute arbitrary
8563code to, for example, destroy the active exception.  Control then transfers to
8564``normal``.
8565
8566The ``token`` argument must be a token produced by a ``catchpad`` instruction.
8567If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
8568funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
8569the ``catchret``'s behavior is undefined.
8570
8571Example:
8572""""""""
8573
8574.. code-block:: text
8575
8576      catchret from %catch to label %continue
8577
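The following sketch shows how ``catchswitch``, ``catchpad``, and
``catchret`` might cooperate in one function. It assumes the MSVC C++
personality routine ``@__CxxFrameHandler3`` and a catch-all handler;
``@may_throw`` is a hypothetical external function.

.. code-block:: llvm

      declare void @may_throw()
      declare i32 @__CxxFrameHandler3(...)

      define i32 @try_catch() personality ptr @__CxxFrameHandler3 {
      entry:
        invoke void @may_throw()
                to label %done unwind label %dispatch

      dispatch:                           ; only non-phi instruction in its block
        %cs = catchswitch within none [label %handler] unwind to caller

      handler:
        %cp = catchpad within %cs [ptr null, i32 64, ptr null]
        catchret from %cp to label %caught

      caught:                             ; ordinary code after the handler
        ret i32 1

      done:
        ret i32 0
      }
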
8578.. _i_cleanupret:
8579
8580'``cleanupret``' Instruction
8581^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8582
8583Syntax:
8584"""""""
8585
8586::
8587
8588      cleanupret from <value> unwind label <continue>
8589      cleanupret from <value> unwind to caller
8590
8591Overview:
8592"""""""""
8593
8594The '``cleanupret``' instruction is a terminator instruction that has
8595an optional successor.
8596
8597
8598Arguments:
8599""""""""""
8600
8601The '``cleanupret``' instruction requires one argument, which indicates
8602which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
8603If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
8604funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
8605the ``cleanupret``'s behavior is undefined.
8606
8607The '``cleanupret``' instruction also has an optional successor, ``continue``,
8608which must be the label of another basic block beginning with either a
8609``cleanuppad`` or ``catchswitch`` instruction.  This unwind destination must
8610be a legal target with respect to the ``parent`` links, as described in the
8611`exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
8612
8613Semantics:
8614""""""""""
8615
8616The '``cleanupret``' instruction indicates to the
8617:ref:`personality function <personalityfn>` that one
8618:ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
8619It transfers control to ``continue`` or unwinds out of the function.
8620
8621Example:
8622""""""""
8623
8624.. code-block:: text
8625
8626      cleanupret from %cleanup unwind to caller
8627      cleanupret from %cleanup unwind label %continue
8628
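A cleanup funclet that ends in ``cleanupret`` can be sketched as follows. The
example assumes the MSVC C++ personality routine ``@__CxxFrameHandler3``;
``@may_throw`` and ``@destroy`` are hypothetical external functions.

.. code-block:: llvm

      declare void @may_throw()
      declare void @destroy(ptr)
      declare i32 @__CxxFrameHandler3(...)

      define void @with_cleanup(ptr %obj) personality ptr @__CxxFrameHandler3 {
      entry:
        invoke void @may_throw()
                to label %done unwind label %cleanup

      cleanup:
        %cp = cleanuppad within none []
        call void @destroy(ptr %obj) [ "funclet"(token %cp) ]
        cleanupret from %cp unwind to caller   ; leave the pad, keep unwinding

      done:
        call void @destroy(ptr %obj)
        ret void
      }
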
8629.. _i_unreachable:
8630
8631'``unreachable``' Instruction
8632^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8633
8634Syntax:
8635"""""""
8636
8637::
8638
8639      unreachable
8640
8641Overview:
8642"""""""""
8643
8644The '``unreachable``' instruction has no defined semantics. This
8645instruction is used to inform the optimizer that a particular portion of
8646the code is not reachable. This can be used to indicate that the code
8647after a no-return function cannot be reached, and other facts.
8648
8649Semantics:
8650""""""""""
8651
8652The '``unreachable``' instruction has no defined semantics.
8653
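Example:
""""""""

A typical use is sketched below: the block that calls a ``noreturn`` function
is terminated with ``unreachable``. The function ``@checked`` is a
hypothetical name, and ``@abort`` is assumed to be the C library routine.

.. code-block:: llvm

      declare void @abort() noreturn

      define i32 @checked(i32 %x) {
      entry:
        %neg = icmp slt i32 %x, 0
        br i1 %neg, label %fail, label %ok

      fail:
        call void @abort()
        unreachable                       ; control never continues past abort

      ok:
        ret i32 %x
      }
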
8654.. _unaryops:
8655
8656Unary Operations
8657-----------------
8658
8659Unary operators require a single operand, execute an operation on
8660it, and produce a single value. The operand might represent multiple
8661data, as is the case with the :ref:`vector <t_vector>` data type. The
8662result value has the same type as its operand.
8663
8664.. _i_fneg:
8665
8666'``fneg``' Instruction
8667^^^^^^^^^^^^^^^^^^^^^^
8668
8669Syntax:
8670"""""""
8671
8672::
8673
8674      <result> = fneg [fast-math flags]* <ty> <op1>   ; yields ty:result
8675
8676Overview:
8677"""""""""
8678
8679The '``fneg``' instruction returns the negation of its operand.
8680
8681Arguments:
8682""""""""""
8683
8684The argument to the '``fneg``' instruction must be a
8685:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8686floating-point values.
8687
8688Semantics:
8689""""""""""
8690
8691The value produced is a copy of the operand with its sign bit flipped.
8692This instruction can also take any number of :ref:`fast-math
8693flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8695
8696Example:
8697""""""""
8698
8699.. code-block:: text
8700
      <result> = fneg float %val          ; yields float:result = -%val
8702
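As a small self-contained sketch, the two hypothetical functions below negate
a scalar and a vector. The ``nnan`` flag on the second is only an
illustration of where fast-math flags are written.

.. code-block:: llvm

      define float @negate(float %x) {
        %r = fneg float %x                ; flips the sign bit of %x
        ret float %r
      }

      define <4 x float> @negate_v4(<4 x float> %v) {
        %r = fneg nnan <4 x float> %v     ; element-wise, assuming no NaN inputs
        ret <4 x float> %r
      }
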
8703.. _binaryops:
8704
8705Binary Operations
8706-----------------
8707
8708Binary operators are used to do most of the computation in a program.
8709They require two operands of the same type, execute an operation on
8710them, and produce a single value. The operands might represent multiple
8711data, as is the case with the :ref:`vector <t_vector>` data type. The
8712result value has the same type as its operands.
8713
8714There are several different binary operators:
8715
8716.. _i_add:
8717
8718'``add``' Instruction
8719^^^^^^^^^^^^^^^^^^^^^
8720
8721Syntax:
8722"""""""
8723
8724::
8725
8726      <result> = add <ty> <op1>, <op2>          ; yields ty:result
8727      <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
8728      <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
8729      <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8730
8731Overview:
8732"""""""""
8733
8734The '``add``' instruction returns the sum of its two operands.
8735
8736Arguments:
8737""""""""""
8738
8739The two arguments to the '``add``' instruction must be
8740:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8741arguments must have identical types.
8742
8743Semantics:
8744""""""""""
8745
8746The value produced is the integer sum of the two operands.
8747
8748If the sum has unsigned overflow, the result returned is the
8749mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
8750the result.
8751
8752Because LLVM integers use a two's complement representation, this
8753instruction is appropriate for both signed and unsigned integers.
8754
8755``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8756respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8757result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
8758unsigned and/or signed overflow, respectively, occurs.
8759
8760Example:
8761""""""""
8762
8763.. code-block:: text
8764
8765      <result> = add i32 4, %var          ; yields i32:result = 4 + %var
8766
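The effect of the wrap flags can be sketched with three hypothetical
functions on ``i8``. The values in the comments are sample inputs chosen for
illustration.

.. code-block:: llvm

      define i8 @add_wrapping(i8 %a, i8 %b) {
        %sum = add i8 %a, %b              ; 200 + 100 wraps: 300 mod 2^8 = 44
        ret i8 %sum
      }

      define i8 @add_nuw(i8 %a, i8 %b) {
        %sum = add nuw i8 %a, %b          ; poison for 200 + 100 (unsigned wrap)
        ret i8 %sum
      }

      define i8 @add_nsw(i8 %a, i8 %b) {
        %sum = add nsw i8 %a, %b          ; poison for 100 + 100 (exceeds 127)
        ret i8 %sum
      }
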
8767.. _i_fadd:
8768
8769'``fadd``' Instruction
8770^^^^^^^^^^^^^^^^^^^^^^
8771
8772Syntax:
8773"""""""
8774
8775::
8776
8777      <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8778
8779Overview:
8780"""""""""
8781
8782The '``fadd``' instruction returns the sum of its two operands.
8783
8784Arguments:
8785""""""""""
8786
8787The two arguments to the '``fadd``' instruction must be
8788:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8789floating-point values. Both arguments must have identical types.
8790
8791Semantics:
8792""""""""""
8793
8794The value produced is the floating-point sum of the two operands.
8795This instruction is assumed to execute in the default :ref:`floating-point
8796environment <floatenv>`.
8797This instruction can also take any number of :ref:`fast-math
8798flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8800
8801Example:
8802""""""""
8803
8804.. code-block:: text
8805
8806      <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var
8807
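To illustrate where fast-math flags appear, the sketch below sums three
values twice. The function names are hypothetical. In the second version the
``reassoc`` and ``nsz`` flags permit, but do not require, the optimizer to
regroup the additions.

.. code-block:: llvm

      define float @sum3(float %a, float %b, float %c) {
        %ab = fadd float %a, %b           ; evaluated strictly as (a + b) + c
        %abc = fadd float %ab, %c
        ret float %abc
      }

      define float @sum3_reassoc(float %a, float %b, float %c) {
        %ab = fadd reassoc nsz float %a, %b
        %abc = fadd reassoc nsz float %ab, %c   ; may become a + (b + c)
        ret float %abc
      }
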
8808.. _i_sub:
8809
8810'``sub``' Instruction
8811^^^^^^^^^^^^^^^^^^^^^
8812
8813Syntax:
8814"""""""
8815
8816::
8817
8818      <result> = sub <ty> <op1>, <op2>          ; yields ty:result
8819      <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
8820      <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
8821      <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8822
8823Overview:
8824"""""""""
8825
8826The '``sub``' instruction returns the difference of its two operands.
8827
8828Note that the '``sub``' instruction is used to represent the '``neg``'
8829instruction present in most other intermediate representations.
8830
8831Arguments:
8832""""""""""
8833
8834The two arguments to the '``sub``' instruction must be
8835:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8836arguments must have identical types.
8837
8838Semantics:
8839""""""""""
8840
8841The value produced is the integer difference of the two operands.
8842
8843If the difference has unsigned overflow, the result returned is the
8844mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
8845the result.
8846
8847Because LLVM integers use a two's complement representation, this
8848instruction is appropriate for both signed and unsigned integers.
8849
8850``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8851respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8852result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
8853unsigned and/or signed overflow, respectively, occurs.
8854
8855Example:
8856""""""""
8857
8858.. code-block:: text
8859
      <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
      <result> = sub i32 0, %val          ; yields i32:result = -%val
8862
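The negation idiom can be sketched as two hypothetical functions. The second
adds ``nsw``, which makes the one overflowing input produce poison.

.. code-block:: llvm

      define i32 @negate(i32 %x) {
        %neg = sub i32 0, %x              ; wraps: negating -2147483648 gives itself
        ret i32 %neg
      }

      define i32 @negate_nsw(i32 %x) {
        %neg = sub nsw i32 0, %x          ; poison when %x is -2147483648
        ret i32 %neg
      }
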
8863.. _i_fsub:
8864
8865'``fsub``' Instruction
8866^^^^^^^^^^^^^^^^^^^^^^
8867
8868Syntax:
8869"""""""
8870
8871::
8872
8873      <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8874
8875Overview:
8876"""""""""
8877
8878The '``fsub``' instruction returns the difference of its two operands.
8879
8880Arguments:
8881""""""""""
8882
8883The two arguments to the '``fsub``' instruction must be
8884:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8885floating-point values. Both arguments must have identical types.
8886
8887Semantics:
8888""""""""""
8889
8890The value produced is the floating-point difference of the two operands.
8891This instruction is assumed to execute in the default :ref:`floating-point
8892environment <floatenv>`.
8893This instruction can also take any number of :ref:`fast-math
8894flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8896
8897Example:
8898""""""""
8899
8900.. code-block:: text
8901
      <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
      <result> = fsub float -0.0, %val          ; yields float:result = -%val
8904
8905.. _i_mul:
8906
8907'``mul``' Instruction
8908^^^^^^^^^^^^^^^^^^^^^
8909
8910Syntax:
8911"""""""
8912
8913::
8914
8915      <result> = mul <ty> <op1>, <op2>          ; yields ty:result
8916      <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
8917      <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
8918      <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result
8919
8920Overview:
8921"""""""""
8922
8923The '``mul``' instruction returns the product of its two operands.
8924
8925Arguments:
8926""""""""""
8927
8928The two arguments to the '``mul``' instruction must be
8929:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
8930arguments must have identical types.
8931
8932Semantics:
8933""""""""""
8934
8935The value produced is the integer product of the two operands.
8936
8937If the result of the multiplication has unsigned overflow, the result
8938returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
8939bit width of the result.
8940
8941Because LLVM integers use a two's complement representation, and the
8942result is the same width as the operands, this instruction returns the
8943correct result for both signed and unsigned integers. If a full product
8944(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
8945sign-extended or zero-extended as appropriate to the width of the full
8946product.
8947
8948``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
8949respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
8950result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
8951unsigned and/or signed overflow, respectively, occurs.
8952
8953Example:
8954""""""""
8955
8956.. code-block:: text
8957
8958      <result> = mul i32 4, %var          ; yields i32:result = 4 * %var
8959
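The full-product advice above can be sketched as follows, with hypothetical
function names. After extension to ``i64`` the product of two 32-bit values
always fits, so the ``nuw`` and ``nsw`` flags are safe here.

.. code-block:: llvm

      define i64 @umul_full(i32 %a, i32 %b) {
        %a.wide = zext i32 %a to i64
        %b.wide = zext i32 %b to i64
        %prod = mul nuw i64 %a.wide, %b.wide   ; unsigned 32 x 32 -> 64
        ret i64 %prod
      }

      define i64 @smul_full(i32 %a, i32 %b) {
        %a.wide = sext i32 %a to i64
        %b.wide = sext i32 %b to i64
        %prod = mul nsw i64 %a.wide, %b.wide   ; signed 32 x 32 -> 64
        ret i64 %prod
      }
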
8960.. _i_fmul:
8961
8962'``fmul``' Instruction
8963^^^^^^^^^^^^^^^^^^^^^^
8964
8965Syntax:
8966"""""""
8967
8968::
8969
8970      <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
8971
8972Overview:
8973"""""""""
8974
8975The '``fmul``' instruction returns the product of its two operands.
8976
8977Arguments:
8978""""""""""
8979
8980The two arguments to the '``fmul``' instruction must be
8981:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
8982floating-point values. Both arguments must have identical types.
8983
8984Semantics:
8985""""""""""
8986
8987The value produced is the floating-point product of the two operands.
8988This instruction is assumed to execute in the default :ref:`floating-point
8989environment <floatenv>`.
8990This instruction can also take any number of :ref:`fast-math
8991flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
8993
8994Example:
8995""""""""
8996
8997.. code-block:: text
8998
8999      <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var
9000
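As an illustration of a fast-math flag on ``fmul``, the hypothetical function
below marks a multiply and the following add with ``contract``. This permits,
but does not require, fusing the pair into a single multiply-add.

.. code-block:: llvm

      define float @mul_add(float %a, float %b, float %c) {
        %m = fmul contract float %a, %b
        %r = fadd contract float %m, %c   ; may be fused with the fmul above
        ret float %r
      }
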
9001.. _i_udiv:
9002
9003'``udiv``' Instruction
9004^^^^^^^^^^^^^^^^^^^^^^
9005
9006Syntax:
9007"""""""
9008
9009::
9010
9011      <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
9012      <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result
9013
9014Overview:
9015"""""""""
9016
9017The '``udiv``' instruction returns the quotient of its two operands.
9018
9019Arguments:
9020""""""""""
9021
9022The two arguments to the '``udiv``' instruction must be
9023:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9024arguments must have identical types.
9025
9026Semantics:
9027""""""""""
9028
9029The value produced is the unsigned integer quotient of the two operands.
9030
9031Note that unsigned integer division and signed integer division are
9032distinct operations; for signed integer division, use '``sdiv``'.
9033
9034Division by zero is undefined behavior. For vectors, if any element
9035of the divisor is zero, the operation has undefined behavior.
9036
9037
If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2`` (as
such, "((a udiv exact b) mul b) == a").
9041
9042Example:
9043""""""""
9044
9045.. code-block:: text
9046
9047      <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var
9048
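Because division by zero is undefined behavior, a frontend that cannot prove
the divisor is nonzero usually guards the division. The sketch below uses
hypothetical function names, and returning 0 for a zero divisor is an
arbitrary choice made for the example.

.. code-block:: llvm

      define i32 @guarded_udiv(i32 %n, i32 %d) {
      entry:
        %is.zero = icmp eq i32 %d, 0
        br i1 %is.zero, label %zero, label %divide

      divide:                             ; %d is known to be nonzero here
        %q = udiv i32 %n, %d
        ret i32 %q

      zero:
        ret i32 0
      }

      define i32 @bytes_to_words(i32 %bytes) {
        %words = udiv exact i32 %bytes, 4 ; poison unless %bytes is a multiple of 4
        ret i32 %words
      }
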
9049.. _i_sdiv:
9050
9051'``sdiv``' Instruction
9052^^^^^^^^^^^^^^^^^^^^^^
9053
9054Syntax:
9055"""""""
9056
9057::
9058
9059      <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
9060      <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result
9061
9062Overview:
9063"""""""""
9064
9065The '``sdiv``' instruction returns the quotient of its two operands.
9066
9067Arguments:
9068""""""""""
9069
9070The two arguments to the '``sdiv``' instruction must be
9071:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9072arguments must have identical types.
9073
9074Semantics:
9075""""""""""
9076
9077The value produced is the signed integer quotient of the two operands
9078rounded towards zero.
9079
9080Note that signed integer division and unsigned integer division are
9081distinct operations; for unsigned integer division, use '``udiv``'.
9082
9083Division by zero is undefined behavior. For vectors, if any element
9084of the divisor is zero, the operation has undefined behavior.
9085Overflow also leads to undefined behavior; this is a rare case, but can
9086occur, for example, by doing a 32-bit division of -2147483648 by -1.
9087
9088If the ``exact`` keyword is present, the result value of the ``sdiv`` is
9089a :ref:`poison value <poisonvalues>` if the result would be rounded.
9090
9091Example:
9092""""""""
9093
9094.. code-block:: text
9095
9096      <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var
9097
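Both undefined cases described above can be excluded with a guard, as in the
following sketch. The function name and the fallback value 0 are assumptions
made for the example.

.. code-block:: llvm

      define i32 @guarded_sdiv(i32 %n, i32 %d) {
      entry:
        %d.zero = icmp eq i32 %d, 0
        %n.min = icmp eq i32 %n, -2147483648
        %d.m1 = icmp eq i32 %d, -1
        %ovf = and i1 %n.min, %d.m1       ; -2147483648 / -1 would overflow
        %bad = or i1 %d.zero, %ovf
        br i1 %bad, label %fallback, label %divide

      divide:
        %q = sdiv i32 %n, %d              ; rounds toward zero: -7 / 2 = -3
        ret i32 %q

      fallback:
        ret i32 0
      }
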
9098.. _i_fdiv:
9099
9100'``fdiv``' Instruction
9101^^^^^^^^^^^^^^^^^^^^^^
9102
9103Syntax:
9104"""""""
9105
9106::
9107
9108      <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
9109
9110Overview:
9111"""""""""
9112
9113The '``fdiv``' instruction returns the quotient of its two operands.
9114
9115Arguments:
9116""""""""""
9117
9118The two arguments to the '``fdiv``' instruction must be
9119:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9120floating-point values. Both arguments must have identical types.
9121
9122Semantics:
9123""""""""""
9124
9125The value produced is the floating-point quotient of the two operands.
9126This instruction is assumed to execute in the default :ref:`floating-point
9127environment <floatenv>`.
9128This instruction can also take any number of :ref:`fast-math
9129flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.
9131
9132Example:
9133""""""""
9134
9135.. code-block:: text
9136
9137      <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var
9138
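As a sketch of a fast-math flag on ``fdiv``, the hypothetical function below
divides two values by the same divisor. The ``arcp`` flag permits, but does
not require, the optimizer to compute one reciprocal and multiply by it.

.. code-block:: llvm

      define float @div_both(float %a, float %b, float %d) {
        %qa = fdiv arcp float %a, %d      ; may become %a * (1.0 / %d)
        %qb = fdiv arcp float %b, %d      ; may reuse the same reciprocal
        %s = fadd float %qa, %qb
        ret float %s
      }
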
9139.. _i_urem:
9140
9141'``urem``' Instruction
9142^^^^^^^^^^^^^^^^^^^^^^
9143
9144Syntax:
9145"""""""
9146
9147::
9148
9149      <result> = urem <ty> <op1>, <op2>   ; yields ty:result
9150
9151Overview:
9152"""""""""
9153
9154The '``urem``' instruction returns the remainder from the unsigned
9155division of its two arguments.
9156
9157Arguments:
9158""""""""""
9159
9160The two arguments to the '``urem``' instruction must be
9161:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9162arguments must have identical types.
9163
9164Semantics:
9165""""""""""
9166
9167This instruction returns the unsigned integer *remainder* of a division.
9168This instruction always performs an unsigned division to get the
9169remainder.
9170
9171Note that unsigned integer remainder and signed integer remainder are
9172distinct operations; for signed integer remainder, use '``srem``'.
9173
9174Taking the remainder of a division by zero is undefined behavior.
9175For vectors, if any element of the divisor is zero, the operation has
9176undefined behavior.
9177
9178Example:
9179""""""""
9180
9181.. code-block:: text
9182
9183      <result> = urem i32 4, %var          ; yields i32:result = 4 % %var
9184
9185.. _i_srem:
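For a power-of-two divisor, the unsigned remainder equals a bit mask. The two
hypothetical functions below compute the same value for every ``%x``.

.. code-block:: llvm

      define i32 @urem_by_8(i32 %x) {
        %r = urem i32 %x, 8               ; 29 urem 8 = 5
        ret i32 %r
      }

      define i32 @mask_low_3(i32 %x) {
        %r = and i32 %x, 7                ; 29 and 7 = 5
        ret i32 %r
      }
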
9186
9187'``srem``' Instruction
9188^^^^^^^^^^^^^^^^^^^^^^
9189
9190Syntax:
9191"""""""
9192
9193::
9194
9195      <result> = srem <ty> <op1>, <op2>   ; yields ty:result
9196
9197Overview:
9198"""""""""
9199
9200The '``srem``' instruction returns the remainder from the signed
9201division of its two operands. This instruction can also take
9202:ref:`vector <t_vector>` versions of the values in which case the elements
9203must be integers.
9204
9205Arguments:
9206""""""""""
9207
9208The two arguments to the '``srem``' instruction must be
9209:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
9210arguments must have identical types.
9211
9212Semantics:
9213""""""""""
9214
9215This instruction returns the *remainder* of a division (where the result
9216is either zero or has the same sign as the dividend, ``op1``), not the
9217*modulo* operator (where the result is either zero or has the same sign
9218as the divisor, ``op2``) of a value. For more information about the
9219difference, see `The Math
9220Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
9221table of how this is implemented in various languages, please see
9222`Wikipedia: modulo
9223operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.
9224
9225Note that signed integer remainder and unsigned integer remainder are
9226distinct operations; for unsigned integer remainder, use '``urem``'.
9227
9228Taking the remainder of a division by zero is undefined behavior.
9229For vectors, if any element of the divisor is zero, the operation has
9230undefined behavior.
9231Overflow also leads to undefined behavior; this is a rare case, but can
9232occur, for example, by taking the remainder of a 32-bit division of
9233-2147483648 by -1. (The remainder doesn't actually overflow, but this
9234rule lets srem be implemented using instructions that return both the
9235result of the division and the remainder.)
9236
9237Example:
9238""""""""
9239
9240.. code-block:: text
9241
9242      <result> = srem i32 4, %var          ; yields i32:result = 4 % %var
9243
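To make the remainder/modulo distinction concrete, the sketch below derives a
floored modulo (sign follows the divisor) from ``srem``. The function name is
hypothetical, and the caller is assumed to have excluded a zero divisor and
the -2147483648 by -1 case.

.. code-block:: llvm

      define i32 @floor_mod(i32 %a, i32 %b) {
        %rem = srem i32 %a, %b            ; -7 srem 3 = -1 (sign of the dividend)
        %rem.nz = icmp ne i32 %rem, 0
        %xor = xor i32 %rem, %b
        %signs.differ = icmp slt i32 %xor, 0
        %adjust = and i1 %rem.nz, %signs.differ
        %rem.plus.b = add i32 %rem, %b    ; cannot overflow: the signs differ
        %mod = select i1 %adjust, i32 %rem.plus.b, i32 %rem
        ret i32 %mod                      ; -7 mod 3 = 2 (sign of the divisor)
      }
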
9244.. _i_frem:
9245
9246'``frem``' Instruction
9247^^^^^^^^^^^^^^^^^^^^^^
9248
9249Syntax:
9250"""""""
9251
9252::
9253
9254      <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
9255
9256Overview:
9257"""""""""
9258
9259The '``frem``' instruction returns the remainder from the division of
9260its two operands.
9261
9262Arguments:
9263""""""""""
9264
9265The two arguments to the '``frem``' instruction must be
9266:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
9267floating-point values. Both arguments must have identical types.
9268
9269Semantics:
9270""""""""""
9271
9272The value produced is the floating-point remainder of the two operands.
9273This is the same output as a libm '``fmod``' function, but without any
9274possibility of setting ``errno``. The remainder has the same sign as the
9275dividend.
9276This instruction is assumed to execute in the default :ref:`floating-point
9277environment <floatenv>`.
9278This instruction can also take any number of :ref:`fast-math
9279flags <fastmath>`, which are optimization hints to enable otherwise
9280unsafe floating-point optimizations:
9281
9282Example:
9283""""""""
9284
9285.. code-block:: text
9286
9287      <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var
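
As a small sketch of the ``fmod``-like behavior, the constant cases below show
that the sign of the result follows the dividend. The function name
``@frem_examples`` is illustrative, and the choice of the ``fast`` flag on the
last instruction is only an example of the syntax.

.. code-block:: llvm

      define float @frem_examples(float %x, float %y) {
        %a = frem float 5.5, 2.0        ; 1.5
        %b = frem float -5.5, 2.0       ; -1.5, the sign follows the dividend
        %c = frem float 5.5, -2.0       ; 1.5
        %r = frem fast float %x, %y     ; same operation with fast-math flags
        ret float %r
      }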

.. _bitwiseops:

Bitwise Binary Operations
-------------------------

Bitwise binary operators are used to do various forms of bit-twiddling
in a program. They are generally very efficient instructions and can
commonly be strength reduced from other instructions. They require two
operands of the same type, execute an operation on them, and produce a
single value. The resulting value is the same type as its operands.

.. _i_shl:

'``shl``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shl <ty> <op1>, <op2>           ; yields ty:result
      <result> = shl nuw <ty> <op1>, <op2>       ; yields ty:result
      <result> = shl nsw <ty> <op1>, <op2>       ; yields ty:result
      <result> = shl nuw nsw <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``shl``' instruction returns the first operand shifted to the left
a specified number of bits.

Arguments:
""""""""""

Both arguments to the '``shl``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
where ``n`` is the width of the result. If ``op2`` is (statically or
dynamically) equal to or larger than the number of bits in
``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
If the arguments are vectors, each vector element of ``op1`` is shifted
by the corresponding shift amount in ``op2``.

If the ``nuw`` keyword is present, then the shift produces a poison
value if it shifts out any non-zero bits.
If the ``nsw`` keyword is present, then the shift produces a poison
value if it shifts out any bits that disagree with the resultant sign bit.

Example:
""""""""

.. code-block:: text

      <result> = shl i32 4, %var   ; yields i32: 4 << %var
      <result> = shl i32 4, 2      ; yields i32: 16
      <result> = shl i32 1, 10     ; yields i32: 1024
      <result> = shl i32 1, 32     ; yields poison
      <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>
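
The effect of the ``nuw`` and ``nsw`` keywords can be sketched with a shift by
one on ``i8``. The function name ``@shl_flags`` and the two sample inputs in the
comments are chosen only for illustration.

.. code-block:: llvm

      define i8 @shl_flags(i8 %x) {
        ; %x = 64 (0x40): result bits are 0x80, the shifted-out bit is 0
        ; %x = -64 (0xC0): result bits are 0x80, the shifted-out bit is 1
        %plain = shl i8 %x, 1        ; -128 for both inputs
        %nuw   = shl nuw i8 %x, 1    ; -128 for 64; poison for -64 (a 1 bit is lost)
        %nsw   = shl nsw i8 %x, 1    ; poison for 64 (lost 0 differs from sign bit 1);
                                     ; -128 for -64 (lost 1 matches sign bit 1)
        ret i8 %plain
      }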

.. _i_lshr:

'``lshr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
      <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``lshr``' instruction (logical shift right) returns the first
operand shifted to the right a specified number of bits with zero fill.

Arguments:
""""""""""

Both arguments to the '``lshr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs a logical shift right operation. The
most significant bits of the result will be filled with zero bits after
the shift. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``lshr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = lshr i32 4, 1   ; yields i32:result = 2
      <result> = lshr i32 4, 2   ; yields i32:result = 1
      <result> = lshr i8  4, 3   ; yields i8:result = 0
      <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
      <result> = lshr i32 1, 32  ; yields poison
      <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
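
A brief sketch of the ``exact`` keyword, using constants whose low bits are
easy to inspect. The function name ``@lshr_exact`` is illustrative.

.. code-block:: llvm

      define i32 @lshr_exact(i32 %x) {
        %a = lshr exact i32 8, 2     ; 2, both shifted-out bits are zero
        %b = lshr exact i32 9, 2     ; poison, the set low bit of 9 is shifted out
        %c = lshr i32 9, 2           ; 2, without exact the low bits are dropped
        %q = lshr i32 %x, 3          ; same value as udiv i32 %x, 8
        ret i32 %q
      }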

.. _i_ashr:

'``ashr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
      <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``ashr``' instruction (arithmetic shift right) returns the first
operand shifted to the right a specified number of bits with sign
extension.

Arguments:
""""""""""

Both arguments to the '``ashr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs an arithmetic shift right operation.
The most significant bits of the result will be filled with the sign bit
of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``ashr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = ashr i32 4, 1   ; yields i32:result = 2
      <result> = ashr i32 4, 2   ; yields i32:result = 1
      <result> = ashr i8  4, 3   ; yields i8:result = 0
      <result> = ashr i8 -2, 1   ; yields i8:result = -1
      <result> = ashr i32 1, 32  ; yields poison
      <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>
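
Because the sign bit is replicated, an arithmetic shift right of a negative
value rounds toward negative infinity, which differs from '``sdiv``' by the
matching power of two. The sketch below contrasts the two, along with
'``lshr``'; the function name ``@ashr_vs_sdiv`` is illustrative.

.. code-block:: llvm

      define i8 @ashr_vs_sdiv() {
        %a = ashr i8 -7, 1          ; -4 (0xF9 becomes 0xFC), rounds toward negative infinity
        %d = sdiv i8 -7, 2          ; -3, rounds toward zero
        %l = lshr i8 -7, 1          ; 124 (0x7C), zero fill instead of sign fill
        %e = ashr exact i8 -8, 1    ; -4, no set bits are shifted out
        ret i8 %a
      }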

.. _i_and:

'``and``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = and <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``and``' instruction returns the bitwise logical and of its two
operands.

Arguments:
""""""""""

The two arguments to the '``and``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``and``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   0 |
+-----+-----+-----+
|   1 |   0 |   0 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = and i32 4, %var         ; yields i32:result = 4 & %var
      <result> = and i32 15, 40          ; yields i32:result = 8
      <result> = and i32 4, 8            ; yields i32:result = 0
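
A common use of '``and``' is masking. The sketch below tests an address for
16-byte alignment and masks each lane of a vector; the function names
``@is_aligned_16`` and ``@low_bytes`` are illustrative.

.. code-block:: llvm

      define i1 @is_aligned_16(i64 %addr) {
        %low = and i64 %addr, 15        ; keep only the four low bits
        %ok  = icmp eq i64 %low, 0      ; true when %addr is a multiple of 16
        ret i1 %ok
      }

      define <4 x i32> @low_bytes(<4 x i32> %v) {
        ; The mask is applied to each element independently.
        %m = and <4 x i32> %v, <i32 255, i32 255, i32 255, i32 255>
        ret <4 x i32> %m
      }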

.. _i_or:

'``or``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = or <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``or``' instruction returns the bitwise logical inclusive or of its
two operands.

Arguments:
""""""""""

The two arguments to the '``or``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``or``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = or i32 4, %var         ; yields i32:result = 4 | %var
      <result> = or i32 15, 40          ; yields i32:result = 47
      <result> = or i32 4, 8            ; yields i32:result = 12
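
As a sketch of a typical use, '``or``' can combine fields that occupy disjoint
bit ranges. The function name ``@pack_bytes`` is illustrative.

.. code-block:: llvm

      define i16 @pack_bytes(i8 %hi, i8 %lo) {
        %hi16    = zext i8 %hi to i16
        %lo16    = zext i8 %lo to i16
        %shifted = shl nuw i16 %hi16, 8      ; move the high byte into bits 8..15
        %packed  = or i16 %shifted, %lo16    ; the bit ranges do not overlap
        ret i16 %packed                      ; e.g. %hi = 0x12, %lo = 0x34 gives 0x1234
      }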

.. _i_xor:

'``xor``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = xor <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``xor``' instruction returns the bitwise logical exclusive or of
its two operands. The ``xor`` is used to implement the "one's
complement" operation, which is the "~" operator in C.

Arguments:
""""""""""

The two arguments to the '``xor``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``xor``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   0 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
      <result> = xor i32 15, 40          ; yields i32:result = 39
      <result> = xor i32 4, 8            ; yields i32:result = 12
      <result> = xor i32 %V, -1          ; yields i32:result = ~%V
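
A few idioms built on '``xor``' are sketched below; the function name
``@xor_idioms`` is illustrative.

.. code-block:: llvm

      define i32 @xor_idioms(i32 %v, i1 %flag) {
        %not     = xor i32 %v, -1       ; bitwise complement, ~%v
        %toggled = xor i32 %v, 16       ; flips bit 4 and leaves the other bits alone
        %inv     = xor i1 %flag, true   ; logical negation of an i1
        %neg     = add i32 %not, 1      ; two's complement negation, same as sub i32 0, %v
        ret i32 %neg
      }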

Vector Operations
-----------------

LLVM supports several instructions to represent vector operations in a
target-independent manner. These instructions cover the element-access
and vector-specific operations needed to process vectors effectively.
While LLVM does directly support these vector operations, many
sophisticated algorithms will want to use target-specific intrinsics to
take full advantage of a specific target.

.. _i_extractelement:

'``extractelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>
      <result> = extractelement <vscale x n x <ty>> <val>, <ty2> <idx> ; yields <ty>

Overview:
"""""""""

The '``extractelement``' instruction extracts a single scalar element
from a vector at a specified index.

Arguments:
""""""""""

The first operand of an '``extractelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is an index indicating
the position from which to extract the element. The index may be a
variable of any integer type.

Semantics:
""""""""""

The result is a scalar of the same type as the element type of ``val``.
Its value is the value at position ``idx`` of ``val``. If ``idx``
exceeds the length of ``val`` for a fixed-length vector, the result is a
:ref:`poison value <poisonvalues>`. For a scalable vector, if the value
of ``idx`` exceeds the runtime length of the vector, the result is a
:ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32
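
The sketch below shows a constant index, a run-time index of a different
integer type, and an out-of-range index. The function names
``@extract_examples`` and ``@first_lane`` are illustrative.

.. code-block:: llvm

      define i32 @extract_examples(<4 x i32> %vec, i64 %i) {
        %c = extractelement <4 x i32> <i32 10, i32 20, i32 30, i32 40>, i32 2   ; 30
        %d = extractelement <4 x i32> %vec, i64 %i    ; the index is a run-time i64 value
        %p = extractelement <4 x i32> %vec, i32 7     ; poison, the index is out of range
        ret i32 %d
      }

      define float @first_lane(<vscale x 4 x float> %sv) {
        %f = extractelement <vscale x 4 x float> %sv, i32 0   ; element 0 of a scalable vector
        ret float %f
      }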

.. _i_insertelement:

'``insertelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>
      <result> = insertelement <vscale x n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <vscale x n x <ty>>

Overview:
"""""""""

The '``insertelement``' instruction inserts a scalar element into a
vector at a specified index.

Arguments:
""""""""""

The first operand of an '``insertelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is a scalar value whose
type must equal the element type of the first operand. The third operand
is an index indicating the position at which to insert the value. The
index may be a variable of any integer type.

Semantics:
""""""""""

The result is a vector of the same type as ``val``. Its element values
are those of ``val`` except at position ``idx``, where it gets the value
``elt``. If ``idx`` exceeds the length of ``val`` for a fixed-length vector,
the result is a :ref:`poison value <poisonvalues>`. For a scalable vector,
if the value of ``idx`` exceeds the runtime length of the vector, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>
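
Since each '``insertelement``' yields a new vector value, a vector is commonly
assembled from scalars with a chain of inserts. A minimal sketch, with the
illustrative function name ``@make_vec``:

.. code-block:: llvm

      define <4 x float> @make_vec(float %a, float %b, float %c, float %d) {
        ; Start from undef; every element is overwritten below.
        %v0 = insertelement <4 x float> undef, float %a, i32 0
        %v1 = insertelement <4 x float> %v0, float %b, i32 1
        %v2 = insertelement <4 x float> %v1, float %c, i32 2
        %v3 = insertelement <4 x float> %v2, float %d, i32 3
        ret <4 x float> %v3       ; <%a, %b, %c, %d>
      }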

.. _i_shufflevector:

'``shufflevector``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>
      <result> = shufflevector <vscale x n x <ty>> <v1>, <vscale x n x <ty>> <v2>, <vscale x m x i32> <mask>  ; yields <vscale x m x <ty>>

Overview:
"""""""""

The '``shufflevector``' instruction constructs a permutation of elements
from two input vectors, returning a vector with the same element type as
the input and length that is the same as the shuffle mask.

Arguments:
""""""""""

The first two operands of a '``shufflevector``' instruction are vectors
with the same type. The third argument is a shuffle mask vector constant
whose element type is ``i32``. The mask vector elements must be constant
integers or ``undef`` values. The result of the instruction is a vector
whose length is the same as the shuffle mask and whose element type is the
same as the element type of the first two operands.

Semantics:
""""""""""

The elements of the two input vectors are numbered from left to right
across both of the vectors. For each element of the result vector, the
shuffle mask selects an element from one of the input vectors to copy
to the result. Non-negative elements in the mask represent an index
into the concatenated pair of input vectors.

If the shuffle mask is undefined, the result vector is undefined. If
the shuffle mask selects an undefined element from one of the input
vectors, the resulting element is undefined. An undefined element
in the mask vector specifies that the resulting element is undefined.
An undefined element in the mask vector prevents a poisoned vector
element from propagating.

For scalable vectors, the only valid mask values at present are
``zeroinitializer`` and ``undef``, since we cannot write all indices as
literals for a vector with a length unknown at compile time.

Example:
""""""""

.. code-block:: text

      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
      <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>
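
Two frequent patterns are sketched below: broadcasting ("splatting") a scalar
with a ``zeroinitializer`` mask, which also works for scalable vectors, and
reversing a fixed-length vector. The function names are illustrative.

.. code-block:: llvm

      define <4 x i32> @splat(i32 %x) {
        %ins   = insertelement <4 x i32> undef, i32 %x, i32 0
        ; Every mask element is 0, so every result element copies element 0 of %ins.
        %splat = shufflevector <4 x i32> %ins, <4 x i32> undef,
                               <4 x i32> zeroinitializer
        ret <4 x i32> %splat      ; <%x, %x, %x, %x>
      }

      define <4 x i32> @reverse(<4 x i32> %v) {
        %rev = shufflevector <4 x i32> %v, <4 x i32> undef,
                             <4 x i32> <i32 3, i32 2, i32 1, i32 0>
        ret <4 x i32> %rev
      }

      define <vscale x 4 x i32> @splat_scalable(i32 %x) {
        %ins   = insertelement <vscale x 4 x i32> undef, i32 %x, i32 0
        %splat = shufflevector <vscale x 4 x i32> %ins, <vscale x 4 x i32> undef,
                               <vscale x 4 x i32> zeroinitializer
        ret <vscale x 4 x i32> %splat
      }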

Aggregate Operations
--------------------

LLVM supports several instructions for working with
:ref:`aggregate <t_aggregate>` values.

.. _i_extractvalue:

'``extractvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*

Overview:
"""""""""

The '``extractvalue``' instruction extracts the value of a member field
from an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``extractvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
constant indices to specify which value to extract in a similar manner
as indices in a '``getelementptr``' instruction.

The major differences to ``getelementptr`` indexing are:

-  Since the value being indexed is not a pointer, the first index is
   omitted and assumed to be zero.
-  At least one index must be specified.
-  Not only struct indices but also array indices must be in bounds.

Semantics:
""""""""""

The result is the value at the position in the aggregate specified by
the index operands.

Example:
""""""""

.. code-block:: text

      <result> = extractvalue {i32, float} %agg, 0    ; yields i32
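
A typical use is unpacking the struct returned by an intrinsic such as
``llvm.sadd.with.overflow``; multiple indices reach into nested aggregates.
In this sketch the function names ``@checked_add`` and ``@nested`` are
illustrative, and returning 0 on overflow is an arbitrary policy.

.. code-block:: llvm

      declare { i32, i1 } @llvm.sadd.with.overflow.i32(i32, i32)

      define i32 @checked_add(i32 %a, i32 %b) {
        %pair = call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %sum  = extractvalue { i32, i1 } %pair, 0     ; the i32 sum
        %ovf  = extractvalue { i32, i1 } %pair, 1     ; the i1 overflow bit
        %res  = select i1 %ovf, i32 0, i32 %sum
        ret i32 %res
      }

      define float @nested({ i32, [2 x float] } %agg) {
        ; Index 1 selects the array field, index 0 its first element.
        %f = extractvalue { i32, [2 x float] } %agg, 1, 0
        ret float %f
      }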

.. _i_insertvalue:

'``insertvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>

Overview:
"""""""""

The '``insertvalue``' instruction inserts a value into a member field in
an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``insertvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
a first-class value to insert. The following operands are constant
indices indicating the position at which to insert the value in a
similar manner as indices in an '``extractvalue``' instruction. The value
to insert must have the same type as the value identified by the
indices.

Semantics:
""""""""""

The result is an aggregate of the same type as ``val``. Its value is
that of ``val`` except that the value at the position specified by the
indices is that of ``elt``.

Example:
""""""""

.. code-block:: llvm

      %agg1 = insertvalue {i32, float} undef, i32 1, 0              ; yields {i32 1, float undef}
      %agg2 = insertvalue {i32, float} %agg1, float %val, 1         ; yields {i32 1, float %val}
      %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0    ; yields {i32 undef, {float %val}}
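
In complete functions, '``insertvalue``' is often used to assemble a struct
return value or to update one element of a nested array. A minimal sketch with
the illustrative names ``@make_result`` and ``@set_corner``:

.. code-block:: llvm

      define { i32, i1 } @make_result(i32 %value, i1 %ok) {
        %r0 = insertvalue { i32, i1 } undef, i32 %value, 0
        %r1 = insertvalue { i32, i1 } %r0, i1 %ok, 1
        ret { i32, i1 } %r1
      }

      define [2 x [2 x i32]] @set_corner([2 x [2 x i32]] %m, i32 %x) {
        ; Replaces element [1][1]; %m itself is unchanged.
        %m1 = insertvalue [2 x [2 x i32]] %m, i32 %x, 1, 1
        ret [2 x [2 x i32]] %m1
      }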

.. _memoryops:

Memory Access and Addressing Operations
---------------------------------------

A key design point of an SSA-based representation is how it represents
memory. In LLVM, no memory locations are in SSA form, which makes things
very simple. This section describes how to read, write, and allocate
memory in LLVM.

.. _i_alloca:

'``alloca``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)]     ; yields type addrspace(num)*:result

Overview:
"""""""""

The '``alloca``' instruction allocates memory on the stack frame of the
currently executing function, to be automatically released when this
function returns to its caller.  If the address space is not explicitly
specified, the object is allocated in the alloca address space from the
:ref:`datalayout string<langref_datalayout>`.

Arguments:
""""""""""

The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
bytes of memory on the runtime stack, returning a pointer of the
appropriate type to the program. If "NumElements" is specified, it is
the number of elements allocated; otherwise "NumElements" defaults
to one. If a constant alignment is specified, the result of the
allocation is guaranteed to be aligned to at least that boundary. The
alignment may not be greater than ``1 << 32``. If not specified, or if
zero, the target can choose to align the allocation on any convenient
boundary compatible with the type.

'``type``' may be any sized type.

Semantics:
""""""""""

Memory is allocated; a pointer is returned. The allocated memory is
uninitialized, and loading from uninitialized memory produces an undefined
value. The operation itself is undefined if there is insufficient stack
space for the allocation. '``alloca``'d memory is automatically released
when the function returns. The '``alloca``' instruction is commonly used
to represent automatic variables that must have an address available. When
the function returns (either with the ``ret`` or ``resume`` instructions),
the memory is reclaimed. Allocating zero bytes is legal, but the returned
pointer may not be unique. The order in which memory is allocated (i.e.,
which way the stack grows) is not specified.

Note that '``alloca``' outside of the alloca address space from the
:ref:`datalayout string<langref_datalayout>` is meaningful only if the
target has assigned it a semantics.

If the returned pointer is used by :ref:`llvm.lifetime.start <int_lifestart>`,
the returned object is initially dead.
See :ref:`llvm.lifetime.start <int_lifestart>` and
:ref:`llvm.lifetime.end <int_lifeend>` for the precise semantics of
lifetime-manipulating intrinsics.

Example:
""""""""

.. code-block:: llvm

      %ptr = alloca i32                             ; yields ptr
      %ptr = alloca i32, i32 4                      ; yields ptr
      %ptr = alloca i32, i32 4, align 1024          ; yields ptr
      %ptr = alloca i32, align 1024                 ; yields ptr
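
As a sketch of how an automatic variable is typically represented, the function
below allocates a single slot and a run-time sized buffer, and initializes the
slot before reading it. The function name ``@locals`` is illustrative.

.. code-block:: llvm

      define i32 @locals(i32 %n) {
      entry:
        %slot = alloca i32, align 4            ; one i32 with an address
        %buf  = alloca i32, i32 %n, align 4    ; %n elements, sized at run time
        store i32 42, ptr %slot, align 4       ; initialize before the first load
        %v = load i32, ptr %slot, align 4      ; 42
        ret i32 %v                             ; both allocations are released here
      }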

.. _i_load:

'``load``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = load [volatile] <ty>, ptr <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.load !<empty_node>][, !invariant.group !<empty_node>][, !nonnull !<empty_node>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>][, !noundef !<empty_node>]
      <result> = load atomic [volatile] <ty>, ptr <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>]
      !<nontemp_node> = !{ i32 1 }
      !<empty_node> = !{}
      !<deref_bytes_node> = !{ i64 <dereferenceable_bytes> }
      !<align_node> = !{ i64 <value_alignment> }

Overview:
"""""""""

The '``load``' instruction is used to read from memory.

Arguments:
""""""""""

The argument to the ``load`` instruction specifies the memory address from which
to load. The type specified must be a :ref:`first class <t_firstclass>` type of
known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
modify the number or order of execution of this ``load`` with other
:ref:`volatile operations <volatile>`.

If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
Atomic loads produce :ref:`defined <memmodel>` results when they may see
multiple atomic stores. The type of the pointee must be an integer, pointer, or
floating-point type whose bit width is a power of two greater than or equal to
eight and less than or equal to a target-specific size limit.  ``align`` must be
explicitly specified on atomic loads, and the load has undefined behavior if the
alignment is not set to a value which is at least the size in bytes of the
pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the alignment
may produce less efficient code. An alignment of 1 is always safe. The
maximum possible alignment is ``1 << 32``. An alignment value higher
than the size of the loaded type implies memory up to the alignment
value bytes can be safely loaded without trapping in the default
address space. Access of the high bytes can interfere with debugging
tools, so should not be accessed if the function has the
``sanitize_thread`` or ``sanitize_address`` attributes.

The optional ``!nontemporal`` metadata must reference a single
metadata name ``<nontemp_node>`` corresponding to a metadata node with one
``i32`` entry of value 1. The existence of the ``!nontemporal``
metadata on the instruction tells the optimizer and code generator
that this load is not expected to be reused in the cache. The code
generator may select special instructions to save cache bandwidth, such
as the ``MOVNT`` instruction on x86.

The optional ``!invariant.load`` metadata must reference a single
metadata name ``<empty_node>`` corresponding to a metadata node with no
entries. If a load instruction tagged with the ``!invariant.load``
metadata is executed, the memory location referenced by the load has
to contain the same value at all points in the program where the
memory location is dereferenceable; otherwise, the behavior is
undefined.

The optional ``!invariant.group`` metadata must reference a single metadata name
``<empty_node>`` corresponding to a metadata node with no entries.
See ``invariant.group`` metadata :ref:`invariant.group <md_invariant.group>`.

The optional ``!nonnull`` metadata must reference a single
metadata name ``<empty_node>`` corresponding to a metadata node with no
entries. The existence of the ``!nonnull`` metadata on the
instruction tells the optimizer that the value loaded is known to
never be null. If the value is null at runtime, the behavior is undefined.
This is analogous to the ``nonnull`` attribute on parameters and return
values. This metadata can only be applied to loads of a pointer type.

The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry.
See ``dereferenceable`` metadata :ref:`dereferenceable <md_dereferenceable>`.

The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry.
See ``dereferenceable_or_null`` metadata :ref:`dereferenceable_or_null
<md_dereferenceable_or_null>`.

The optional ``!align`` metadata must reference a single metadata name
``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
The existence of the ``!align`` metadata on the instruction tells the
optimizer that the value loaded is known to be aligned to a boundary specified
by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ``align`` attribute on parameters and return values.
This metadata can only be applied to loads of a pointer type. If the returned
value is not appropriately aligned at runtime, the behavior is undefined.

The optional ``!noundef`` metadata must reference a single metadata name
``<empty_node>`` corresponding to a node with no entries. The existence of
``!noundef`` metadata on the instruction tells the optimizer that the value
loaded is known to be :ref:`well defined <welldefinedvalues>`.
If the value isn't well defined, the behavior is undefined.

Semantics:
""""""""""

The location of memory pointed to is loaded. If the value being loaded
is of scalar type then the number of bytes read does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, loading an ``i24`` reads at most three bytes. When loading a
value of a type like ``i20`` with a size that is not an integral number
of bytes, the result is undefined if the value was not originally
written using a store of the same type.
If the value being loaded is of aggregate type, the bytes that correspond to
padding may be accessed but are ignored, because it is impossible to observe
padding from the loaded aggregate value.
If ``<pointer>`` is not a well-defined value, the behavior is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %ptr = alloca i32                               ; yields ptr
      store i32 3, ptr %ptr                           ; yields void
      %val = load i32, ptr %ptr                       ; yields i32:val = i32 3
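
The optional forms described above can be sketched in one function. The name
``@load_variants``, the metadata numbering, and the specific alignment values
are assumptions made for illustration; in particular, ``align 8`` on the
pointer load assumes a target where that address is 8-byte aligned.

.. code-block:: llvm

      define i32 @load_variants(ptr %p, ptr %pp) {
        %a = load i32, ptr %p, align 4                  ; plain load
        %b = load volatile i32, ptr %p, align 4         ; kept in order with other volatile operations
        %c = load atomic i32, ptr %p acquire, align 4   ; atomic loads must specify align
        ; The loaded pointer is asserted to be non-null and 16-byte aligned.
        %q = load ptr, ptr %pp, align 8, !nonnull !0, !align !1
        %d = load i32, ptr %q, align 4, !nontemporal !2 ; not expected to be reused in the cache
        %s0 = add i32 %a, %b
        %s1 = add i32 %s0, %c
        %s2 = add i32 %s1, %d
        ret i32 %s2
      }

      !0 = !{}
      !1 = !{ i64 16 }
      !2 = !{ i32 1 }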
10087
10088.. _i_store:
10089
10090'``store``' Instruction
10091^^^^^^^^^^^^^^^^^^^^^^^
10092
10093Syntax:
10094"""""""
10095
10096::
10097
10098      store [volatile] <ty> <value>, ptr <pointer>[, align <alignment>][, !nontemporal !<nontemp_node>][, !invariant.group !<empty_node>]        ; yields void
10099      store atomic [volatile] <ty> <value>, ptr <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<empty_node>] ; yields void
10100      !<nontemp_node> = !{ i32 1 }
10101      !<empty_node> = !{}
10102
10103Overview:
10104"""""""""
10105
10106The '``store``' instruction is used to write to memory.
10107
10108Arguments:
10109""""""""""
10110
10111There are two arguments to the ``store`` instruction: a value to store and an
10112address at which to store it. The type of the ``<pointer>`` operand must be a
10113pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
10114operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
10115allowed to modify the number or order of execution of this ``store`` with other
10116:ref:`volatile operations <volatile>`.  Only values of :ref:`first class
10117<t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
10118structural type <t_opaque>`) can be stored.
10119
10120If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
10121<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
10122``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
10123Atomic loads produce :ref:`defined <memmodel>` results when they may see
10124multiple atomic stores. The type of the pointee must be an integer, pointer, or
10125floating-point type whose bit width is a power of two greater than or equal to
10126eight and less than or equal to a target-specific size limit.  ``align`` must be
10127explicitly specified on atomic stores, and the store has undefined behavior if
10128the alignment is not set to a value which is at least the size in bytes of the
10129pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.
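
As a minimal sketch of the atomic form, the stores below assume hypothetical
values ``%flag``, ``%slot`` and ``%v`` that are defined elsewhere. Each store
spells out ``align``, and the alignment is at least the size in bytes of the
stored type:

.. code-block:: llvm

      ; %flag, %slot and %v are hypothetical values, not taken from this manual
      store atomic i32 1, ptr %flag release, align 4                  ; yields void
      store atomic volatile i64 %v, ptr %slot syncscope("singlethread") seq_cst, align 8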
10130
10131The optional constant ``align`` argument specifies the alignment of the
10132operation (that is, the alignment of the memory address). A value of 0
10133or an omitted ``align`` argument means that the operation has the ABI
10134alignment for the target. It is the responsibility of the code emitter
10135to ensure that the alignment information is correct. Overestimating the
10136alignment results in undefined behavior. Underestimating the
10137alignment may produce less efficient code. An alignment of 1 is always
10138safe. The maximum possible alignment is ``1 << 32``. An alignment
10139value higher than the size of the stored type implies memory up to the
10140alignment value bytes can be stored to without trapping in the default
10141address space. Storing to the higher bytes however may result in data
10142races if another thread can access the same address. Introducing a
10143data race is not allowed. Storing to the extra bytes is not allowed
10144even in situations where a data race is known to not exist if the
10145function has the ``sanitize_address`` attribute.
10146
10147The optional ``!nontemporal`` metadata must reference a single metadata
10148name ``<nontemp_node>`` corresponding to a metadata node with one ``i32`` entry
10149of value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
10151be reused in the cache. The code generator may select special
10152instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
10153x86.
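
A minimal sketch of a non-temporal store follows. The names ``%v``, ``%dst``
and the metadata number ``!0`` are assumptions chosen for illustration; the
metadata node itself is defined at module level, outside of any function:

.. code-block:: llvm

      ; inside a function (%v and %dst are hypothetical values)
      store i32 %v, ptr %dst, align 4, !nontemporal !0    ; yields void

      ; at module level
      !0 = !{ i32 1 }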
10154
10155The optional ``!invariant.group`` metadata must reference a
10156single metadata name ``<empty_node>``. See ``invariant.group`` metadata.
10157
10158Semantics:
10159""""""""""
10160
10161The contents of memory are updated to contain ``<value>`` at the
10162location specified by the ``<pointer>`` operand. If ``<value>`` is
10163of scalar type then the number of bytes written does not exceed the
10164minimum number of bytes needed to hold all bits of the type. For
10165example, storing an ``i24`` writes at most three bytes. When writing a
10166value of a type like ``i20`` with a size that is not an integral number
10167of bytes, it is unspecified what happens to the extra bits that do not
10168belong to the type, but they will typically be overwritten.
10169If ``<value>`` is of aggregate type, padding is filled with
10170:ref:`undef <undefvalues>`.
10171If ``<pointer>`` is not a well-defined value, the behavior is undefined.
10172
10173Example:
10174""""""""
10175
10176.. code-block:: llvm
10177
10178      %ptr = alloca i32                               ; yields ptr
10179      store i32 3, ptr %ptr                           ; yields void
10180      %val = load i32, ptr %ptr                       ; yields i32:val = i32 3
10181
10182.. _i_fence:
10183
10184'``fence``' Instruction
10185^^^^^^^^^^^^^^^^^^^^^^^
10186
10187Syntax:
10188"""""""
10189
10190::
10191
10192      fence [syncscope("<target-scope>")] <ordering>  ; yields void
10193
10194Overview:
10195"""""""""
10196
10197The '``fence``' instruction is used to introduce happens-before edges
10198between operations.
10199
10200Arguments:
10201""""""""""
10202
10203'``fence``' instructions take an :ref:`ordering <ordering>` argument which
10204defines what *synchronizes-with* edges they add. They can only be given
10205``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.
10206
10207Semantics:
10208""""""""""
10209
10210A fence A which has (at least) ``release`` ordering semantics
10211*synchronizes with* a fence B with (at least) ``acquire`` ordering
10212semantics if and only if there exist atomic operations X and Y, both
10213operating on some atomic object M, such that A is sequenced before X, X
10214modifies M (either directly or through some side effect of a sequence
10215headed by X), Y is sequenced before B, and Y observes M. This provides a
10216*happens-before* dependency between A and B. Rather than an explicit
10217``fence``, one (but not both) of the atomic operations X or Y might
10218provide a ``release`` or ``acquire`` (resp.) ordering constraint and
10219still *synchronize-with* the explicit ``fence`` and establish the
10220*happens-before* edge.
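
The rule above can be sketched as a message-passing pattern. The example
assumes two threads, hypothetical pointers ``%data`` and ``%flag`` that both
threads share, and no other writers to either location. The comments name the
fences and operations after the letters used in the text:

.. code-block:: llvm

      ; Thread 1
      store i32 42, ptr %data                              ; non-atomic payload
      fence release                                        ; fence A
      store atomic i32 1, ptr %flag monotonic, align 4     ; X, modifies M

      ; Thread 2
      %f = load atomic i32, ptr %flag monotonic, align 4   ; Y, observes M
      fence acquire                                        ; fence B
      %d = load i32, ptr %data                             ; 42 whenever %f is 1

If ``%f`` is 0, Y did not observe the value written by X, so no
*happens-before* edge is established and the value of ``%d`` should not be
relied upon.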
10221
10222A ``fence`` which has ``seq_cst`` ordering, in addition to having both
10223``acquire`` and ``release`` semantics specified above, participates in
10224the global program order of other ``seq_cst`` operations and/or fences.
10225
10226A ``fence`` instruction can also take an optional
10227":ref:`syncscope <syncscope>`" argument.
10228
10229Example:
10230""""""""
10231
10232.. code-block:: text
10233
10234      fence acquire                                        ; yields void
10235      fence syncscope("singlethread") seq_cst              ; yields void
10236      fence syncscope("agent") seq_cst                     ; yields void
10237
10238.. _i_cmpxchg:
10239
10240'``cmpxchg``' Instruction
10241^^^^^^^^^^^^^^^^^^^^^^^^^
10242
10243Syntax:
10244"""""""
10245
10246::
10247
10248      cmpxchg [weak] [volatile] ptr <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering>[, align <alignment>] ; yields  { ty, i1 }
10249
10250Overview:
10251"""""""""
10252
10253The '``cmpxchg``' instruction is used to atomically modify memory. It
10254loads a value in memory and compares it to a given value. If they are
10255equal, it tries to store a new value into the memory.
10256
10257Arguments:
10258""""""""""
10259
10260There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
10262address, and a new value to place at that address if the compared values
10263are equal. The type of '<cmp>' must be an integer or pointer type whose
10264bit width is a power of two greater than or equal to eight and less
10265than or equal to a target-specific size limit. '<cmp>' and '<new>' must
10266have the same type, and the type of '<pointer>' must be a pointer to
10267that type. If the ``cmpxchg`` is marked as ``volatile``, then the
10268optimizer is not allowed to modify the number or order of execution of
10269this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.
10270
10271The success and failure :ref:`ordering <ordering>` arguments specify how this
10272``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``; the failure ordering cannot be either
10274``release`` or ``acq_rel``.
10275
10276A ``cmpxchg`` instruction can also take an optional
10277":ref:`syncscope <syncscope>`" argument.
10278
10279The instruction can take an optional ``align`` attribute.
The alignment must be a power of two greater than or equal to the size of the
'<cmp>' type. If unspecified, the alignment is assumed to be equal to the
size of the '<cmp>' type. Note that this default alignment assumption is
different from the alignment used for the load/store instructions when
``align`` isn't specified.
10285
10286The pointer passed into cmpxchg must have alignment greater than or
10287equal to the size in memory of the operand.
10288
10289Semantics:
10290""""""""""
10291
10292The contents of memory at the location specified by the '``<pointer>``' operand
10293is read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
10294written to the location. The original value at the location is returned,
10295together with a flag indicating success (true) or failure (false).
10296
10297If the cmpxchg operation is marked as ``weak`` then a spurious failure is
10298permitted: the operation may not write ``<new>`` even if the comparison
10299matched.
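
A minimal sketch of a weak ``cmpxchg`` follows, assuming hypothetical values
``%ptr``, ``%expected`` and ``%desired`` defined elsewhere. Because the
operation is weak, the success flag has to be checked even when the loaded
value equals ``%expected``:

.. code-block:: llvm

      ; %ptr, %expected and %desired are hypothetical values
      %pair = cmpxchg weak ptr %ptr, i64 %expected, i64 %desired seq_cst acquire, align 8
      %seen = extractvalue { i64, i1 } %pair, 0   ; value loaded from memory
      %ok   = extractvalue { i64, i1 } %pair, 1   ; may be false even if %seen equals %expected

Code that must make progress typically retries in a loop until ``%ok`` is
true.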
10300
10301If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
10302if the value loaded equals ``cmp``.
10303
10304A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
10305identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.
10307
10308Example:
10309""""""""
10310
10311.. code-block:: llvm
10312
10313    entry:
10314      %orig = load atomic i32, ptr %ptr unordered, align 4                      ; yields i32
10315      br label %loop
10316
10317    loop:
10318      %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
10319      %squared = mul i32 %cmp, %cmp
10320      %val_success = cmpxchg ptr %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
10321      %value_loaded = extractvalue { i32, i1 } %val_success, 0
10322      %success = extractvalue { i32, i1 } %val_success, 1
10323      br i1 %success, label %done, label %loop
10324
10325    done:
10326      ...
10327
10328.. _i_atomicrmw:
10329
10330'``atomicrmw``' Instruction
10331^^^^^^^^^^^^^^^^^^^^^^^^^^^
10332
10333Syntax:
10334"""""""
10335
10336::
10337
10338      atomicrmw [volatile] <operation> ptr <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>[, align <alignment>]  ; yields ty
10339
10340Overview:
10341"""""""""
10342
10343The '``atomicrmw``' instruction is used to atomically modify memory.
10344
10345Arguments:
10346""""""""""
10347
10348There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to the
10350operation. The operation must be one of the following keywords:
10351
10352-  xchg
10353-  add
10354-  sub
10355-  and
10356-  nand
10357-  or
10358-  xor
10359-  max
10360-  min
10361-  umax
10362-  umin
10363-  fadd
10364-  fsub
10365-  fmax
10366-  fmin
10367
10368For most of these operations, the type of '<value>' must be an integer
10369type whose bit width is a power of two greater than or equal to eight
10370and less than or equal to a target-specific size limit. For xchg, this
10371may also be a floating point or a pointer type with the same size constraints
10372as integers.  For fadd/fsub/fmax/fmin, this must be a floating point type.  The
10373type of the '``<pointer>``' operand must be a pointer to that type. If
10374the ``atomicrmw`` is marked as ``volatile``, then the optimizer is not
10375allowed to modify the number or order of execution of this
10376``atomicrmw`` with other :ref:`volatile operations <volatile>`.
10377
10378The instruction can take an optional ``align`` attribute.
The alignment must be a power of two greater than or equal to the size of the
'<value>' type. If unspecified, the alignment is assumed to be equal to the
10381size of the '<value>' type. Note that this default alignment assumption is
10382different from the alignment used for the load/store instructions when align
10383isn't specified.
10384
An ``atomicrmw`` instruction can also take an optional
10386":ref:`syncscope <syncscope>`" argument.
10387
10388Semantics:
10389""""""""""
10390
10391The contents of memory at the location specified by the '``<pointer>``'
10392operand are atomically read, modified, and written back. The original
10393value at the location is returned. The modification is specified by the
10394operation argument:
10395
10396-  xchg: ``*ptr = val``
10397-  add: ``*ptr = *ptr + val``
10398-  sub: ``*ptr = *ptr - val``
10399-  and: ``*ptr = *ptr & val``
10400-  nand: ``*ptr = ~(*ptr & val)``
10401-  or: ``*ptr = *ptr | val``
10402-  xor: ``*ptr = *ptr ^ val``
10403-  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
10404-  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
10405-  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned comparison)
10406-  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned comparison)
-  fadd: ``*ptr = *ptr + val`` (using floating point arithmetic)
-  fsub: ``*ptr = *ptr - val`` (using floating point arithmetic)
-  fmax: ``*ptr = maxnum(*ptr, val)`` (matches the ``llvm.maxnum.*`` intrinsic)
-  fmin: ``*ptr = minnum(*ptr, val)`` (matches the ``llvm.minnum.*`` intrinsic)
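
A few of the operations listed above are sketched below. The pointers ``%p``
(to an ``i32``) and ``%q`` (to a ``float``) and the value ``%v`` are
hypothetical and assumed to be defined elsewhere; each instruction yields the
value that was in memory before the update:

.. code-block:: llvm

      ; %p, %q and %v are hypothetical values
      %old.x = atomicrmw xchg ptr %p, i32 %v seq_cst               ; *%p = %v
      %old.n = atomicrmw nand ptr %p, i32 255 monotonic, align 4   ; *%p = ~(*%p & 255)
      %old.u = atomicrmw volatile umax ptr %p, i32 %v syncscope("singlethread") acq_rel
      %old.f = atomicrmw fadd ptr %q, float 1.0 release            ; *%q = *%q + 1.0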
10411
10412Example:
10413""""""""
10414
10415.. code-block:: llvm
10416
10417      %old = atomicrmw add ptr %ptr, i32 1 acquire                        ; yields i32
10418
10419.. _i_getelementptr:
10420
10421'``getelementptr``' Instruction
10422^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10423
10424Syntax:
10425"""""""
10426
10427::
10428
10429      <result> = getelementptr <ty>, ptr <ptrval>{, [inrange] <ty> <idx>}*
      <result> = getelementptr inbounds <ty>, ptr <ptrval>{, [inrange] <ty> <idx>}*
10431      <result> = getelementptr <ty>, <N x ptr> <ptrval>, [inrange] <vector index type> <idx>
10432
10433Overview:
10434"""""""""
10435
10436The '``getelementptr``' instruction is used to get the address of a
10437subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
10438address calculation only and does not access memory. The instruction can also
10439be used to calculate a vector of such addresses.
10440
10441Arguments:
10442""""""""""
10443
10444The first argument is always a type used as the basis for the calculations.
10445The second argument is always a pointer or a vector of pointers, and is the
10446base address to start from. The remaining arguments are indices
10447that indicate which of the elements of the aggregate object are indexed.
10448The interpretation of each index is dependent on the type being indexed
10449into. The first index always indexes the pointer value given as the
10450second argument, the second index indexes a value of the type pointed to
10451(not necessarily the value directly pointed to, since the first index
10452can be non-zero), etc. The first type indexed into must be a pointer
10453value, subsequent types can be arrays, vectors, and structs. Note that
10454subsequent types being indexed into can never be pointers, since that
10455would require loading the pointer before continuing calculation.
10456
10457The type of each index argument depends on the type it is indexing into.
When indexing into an (optionally packed) structure, only ``i32`` integer
10459**constants** are allowed (when using a vector of indices they must all
10460be the **same** ``i32`` integer constant). When indexing into an array,
10461pointer or vector, integers of any width are allowed, and they are not
10462required to be constant. These integers are treated as signed values
10463where relevant.
10464
10465For example, let's consider a C code fragment and how it gets compiled
10466to LLVM:
10467
10468.. code-block:: c
10469
10470    struct RT {
10471      char A;
10472      int B[10][20];
10473      char C;
10474    };
10475    struct ST {
10476      int X;
10477      double Y;
10478      struct RT Z;
10479    };
10480
10481    int *foo(struct ST *s) {
10482      return &s[1].Z.B[5][13];
10483    }
10484
10485The LLVM code generated by Clang is:
10486
10487.. code-block:: llvm
10488
10489    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
10490    %struct.ST = type { i32, double, %struct.RT }
10491
10492    define ptr @foo(ptr %s) nounwind uwtable readnone optsize ssp {
10493    entry:
10494      %arrayidx = getelementptr inbounds %struct.ST, ptr %s, i64 1, i32 2, i32 1, i64 5, i64 13
10495      ret ptr %arrayidx
10496    }
10497
10498Semantics:
10499""""""""""
10500
10501In the example above, the first index is indexing into the
10502'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
10503= '``{ i32, double, %struct.RT }``' type, a structure. The second index
10504indexes into the third element of the structure, yielding a
10505'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
10506structure. The third index indexes into the second element of the
10507structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
10508dimensions of the array are subscripted into, yielding an '``i32``'
10509type. The '``getelementptr``' instruction returns a pointer to this
10510element.
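
The same walk can be worked through as a byte offset. The sketch below assumes
a data layout in which ``i32`` is 4-byte aligned and ``double`` is 8-byte
aligned (as on typical x86-64 targets); under other layouts the numbers
differ:

.. code-block:: llvm

      ; Assumed layout: i32 is 4-byte aligned, double is 8-byte aligned.
      ;   sizeof(%struct.RT)   = 4 + 800 + 4 = 808   (A padded to 4, C padded to 4)
      ;   sizeof(%struct.ST)   = 16 + 808    = 824   (Z starts at offset 16)
      ;   offset of Z in ST    = 16,  offset of B in RT = 4
      ;   sizeof([20 x i32])   = 80,  sizeof(i32)       = 4
      ;   1*824 + 16 + 4 + 5*80 + 13*4 = 1296
      %arrayidx = getelementptr inbounds i8, ptr %s, i64 1296

Under that assumption, this single ``i8``-based ``getelementptr`` computes the
same address as the five-index form in the example.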
10511
10512Note that it is perfectly legal to index partially through a structure,
10513returning a pointer to an inner element. Because of this, the LLVM code
10514for the given testcase is equivalent to:
10515
10516.. code-block:: llvm
10517
10518    define ptr @foo(ptr %s) {
10519      %t1 = getelementptr %struct.ST, ptr %s, i32 1
10520      %t2 = getelementptr %struct.ST, ptr %t1, i32 0, i32 2
10521      %t3 = getelementptr %struct.RT, ptr %t2, i32 0, i32 1
10522      %t4 = getelementptr [10 x [20 x i32]], ptr %t3, i32 0, i32 5
10523      %t5 = getelementptr [20 x i32], ptr %t4, i32 0, i32 13
10524      ret ptr %t5
10525    }
10526
10527If the ``inbounds`` keyword is present, the result value of the
10528``getelementptr`` is a :ref:`poison value <poisonvalues>` if one of the
10529following rules is violated:
10530
10531*  The base pointer has an *in bounds* address of an allocated object, which
10532   means that it points into an allocated object, or to its end. The only
10533   *in bounds* address for a null pointer in the default address-space is the
10534   null pointer itself.
10535*  If the type of an index is larger than the pointer index type, the
10536   truncation to the pointer index type preserves the signed value.
10537*  The multiplication of an index by the type size does not wrap the pointer
10538   index type in a signed sense (``nsw``).
10539*  The successive addition of offsets (without adding the base address) does
10540   not wrap the pointer index type in a signed sense (``nsw``).
10541*  The successive addition of the current address, interpreted as an unsigned
10542   number, and an offset, interpreted as a signed number, does not wrap the
10543   unsigned address space and remains *in bounds* of the allocated object.
10544   As a corollary, if the added offset is non-negative, the addition does not
10545   wrap in an unsigned sense (``nuw``).
10546*  In cases where the base is a vector of pointers, the ``inbounds`` keyword
10547   applies to each of the computations element-wise.
10548
10549These rules are based on the assumption that no allocated object may cross
10550the unsigned address space boundary, and no allocated object may be larger
10551than half the pointer index type space.
10552
10553If the ``inbounds`` keyword is not present, the offsets are added to the
10554base address with silently-wrapping two's complement arithmetic. If the
10555offsets have a different width from the pointer, they are sign-extended
10556or truncated to the width of the pointer. The result value of the
10557``getelementptr`` may be outside the object pointed to by the base
10558pointer. The result value may not necessarily be used to access memory
10559though, even if it happens to point into allocated storage. See the
10560:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
10561information.
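
The difference between the two forms can be sketched with a small stack
object. The name ``%buf`` is chosen for illustration only:

.. code-block:: llvm

      %buf = alloca [4 x i32]                                           ; a 16-byte object
      %end = getelementptr inbounds [4 x i32], ptr %buf, i64 0, i64 4   ; one past the end, still in bounds
      %oob = getelementptr inbounds [4 x i32], ptr %buf, i64 0, i64 5   ; yields poison
      %raw = getelementptr [4 x i32], ptr %buf, i64 0, i64 5            ; not poison, but outside the object

``%end`` may be compared against other pointers into ``%buf`` but not
dereferenced. ``%raw`` is a well-defined address computation, yet it still may
not be used to access memory.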
10562
10563If the ``inrange`` keyword is present before any index, loading from or
10564storing to any pointer derived from the ``getelementptr`` has undefined
10565behavior if the load or store would access memory outside of the bounds of
10566the element selected by the index marked as ``inrange``. The result of a
10567pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
10568involving memory) involving a pointer derived from a ``getelementptr`` with
10569the ``inrange`` keyword is undefined, with the exception of comparisons
10570in the case where both operands are in the range of the element selected
10571by the ``inrange`` keyword, inclusive of the address one past the end of
10572that element. Note that the ``inrange`` keyword is currently only allowed
10573in constant ``getelementptr`` expressions.
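
A minimal sketch of ``inrange`` in a constant expression follows. The globals
``@tables`` and ``@slot`` are hypothetical; the ``inrange`` index selects the
first array, so accesses through pointers derived from ``@slot``'s initializer
must stay within that ``[3 x ptr]`` element:

.. code-block:: llvm

      ; @tables and @slot are hypothetical globals
      @tables = constant { [3 x ptr], [2 x ptr] } zeroinitializer
      @slot = global ptr getelementptr inbounds ({ [3 x ptr], [2 x ptr] }, ptr @tables, i32 0, inrange i32 0, i32 2)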
10574
10575The getelementptr instruction is often confusing. For some more insight
10576into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.
10577
10578Example:
10579""""""""
10580
10581.. code-block:: llvm
10582
10583        %aptr = getelementptr {i32, [12 x i8]}, ptr %saptr, i64 0, i32 1
10584        %vptr = getelementptr {i32, <2 x i8>}, ptr %svptr, i64 0, i32 1, i32 1
10585        %eptr = getelementptr [12 x i8], ptr %aptr, i64 0, i32 1
10586        %iptr = getelementptr [10 x i32], ptr @arr, i16 0, i16 0
10587
10588Vector of pointers:
10589"""""""""""""""""""
10590
10591The ``getelementptr`` returns a vector of pointers, instead of a single address,
10592when one or more of its arguments is a vector. In such cases, all vector
10593arguments should have the same number of elements, and every scalar argument
10594will be effectively broadcast into a vector during address calculation.
10595
10596.. code-block:: llvm
10597
10598     ; All arguments are vectors:
10599     ;   A[i] = ptrs[i] + offsets[i]*sizeof(i8)
     %A = getelementptr i8, <4 x ptr> %ptrs, <4 x i64> %offsets
10601
10602     ; Add the same scalar offset to each pointer of a vector:
10603     ;   A[i] = ptrs[i] + offset*sizeof(i8)
10604     %A = getelementptr i8, <4 x ptr> %ptrs, i64 %offset
10605
10606     ; Add distinct offsets to the same pointer:
10607     ;   A[i] = ptr + offsets[i]*sizeof(i8)
10608     %A = getelementptr i8, ptr %ptr, <4 x i64> %offsets
10609
10610     ; In all cases described above the type of the result is <4 x ptr>
10611
10612The two following instructions are equivalent:
10613
10614.. code-block:: llvm
10615
10616     getelementptr  %struct.ST, <4 x ptr> %s, <4 x i64> %ind1,
10617       <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
10618       <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
10619       <4 x i32> %ind4,
10620       <4 x i64> <i64 13, i64 13, i64 13, i64 13>
10621
10622     getelementptr  %struct.ST, <4 x ptr> %s, <4 x i64> %ind1,
10623       i32 2, i32 1, <4 x i32> %ind4, i64 13
10624
10625Let's look at the C code, where the vector version of ``getelementptr``
10626makes sense:
10627
10628.. code-block:: c
10629
10630    // Let's assume that we vectorize the following loop:
10631    double *A, *B; int *C;
10632    for (int i = 0; i < size; ++i) {
10633      A[i] = B[C[i]];
10634    }
10635
10636.. code-block:: llvm
10637
10638    ; get pointers for 8 elements from array B
10639    %ptrs = getelementptr double, ptr %B, <8 x i32> %C
10640    ; load 8 elements from array B into A
    %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0(<8 x ptr> %ptrs,
10642         i32 8, <8 x i1> %mask, <8 x double> %passthru)
10643
10644Conversion Operations
10645---------------------
10646
10647The instructions in this category are the conversion instructions
10648(casting) which all take a single operand and a type. They perform
10649various bit conversions on the operand.
10650
10651.. _i_trunc:
10652
10653'``trunc .. to``' Instruction
10654^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10655
10656Syntax:
10657"""""""
10658
10659::
10660
10661      <result> = trunc <ty> <value> to <ty2>             ; yields ty2
10662
10663Overview:
10664"""""""""
10665
10666The '``trunc``' instruction truncates its operand to the type ``ty2``.
10667
10668Arguments:
10669""""""""""
10670
The '``trunc``' instruction takes a value to truncate, and a type to truncate
10672it to. Both types must be of :ref:`integer <t_integer>` types, or vectors
10673of the same number of integers. The bit size of the ``value`` must be
10674larger than the bit size of the destination type, ``ty2``. Equal sized
10675types are not allowed.
10676
10677Semantics:
10678""""""""""
10679
10680The '``trunc``' instruction truncates the high order bits in ``value``
10681and converts the remaining bits to ``ty2``. Since the source size must
10682be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
10683It will always truncate bits.
10684
10685Example:
10686""""""""
10687
10688.. code-block:: llvm
10689
10690      %X = trunc i32 257 to i8                        ; yields i8:1
10691      %Y = trunc i32 123 to i1                        ; yields i1:true
10692      %Z = trunc i32 122 to i1                        ; yields i1:false
10693      %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>
10694
10695.. _i_zext:
10696
10697'``zext .. to``' Instruction
10698^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10699
10700Syntax:
10701"""""""
10702
10703::
10704
10705      <result> = zext <ty> <value> to <ty2>             ; yields ty2
10706
10707Overview:
10708"""""""""
10709
10710The '``zext``' instruction zero extends its operand to type ``ty2``.
10711
10712Arguments:
10713""""""""""
10714
10715The '``zext``' instruction takes a value to cast, and a type to cast it
10716to. Both types must be of :ref:`integer <t_integer>` types, or vectors of
10717the same number of integers. The bit size of the ``value`` must be
10718smaller than the bit size of the destination type, ``ty2``.
10719
10720Semantics:
10721""""""""""
10722
10723The ``zext`` fills the high order bits of the ``value`` with zero bits
10724until it reaches the size of the destination type, ``ty2``.
10725
10726When zero extending from i1, the result will always be either 0 or 1.
10727
10728Example:
10729""""""""
10730
10731.. code-block:: llvm
10732
10733      %X = zext i32 257 to i64              ; yields i64:257
10734      %Y = zext i1 true to i32              ; yields i32:1
10735      %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
10736
10737.. _i_sext:
10738
10739'``sext .. to``' Instruction
10740^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10741
10742Syntax:
10743"""""""
10744
10745::
10746
10747      <result> = sext <ty> <value> to <ty2>             ; yields ty2
10748
10749Overview:
10750"""""""""
10751
10752The '``sext``' sign extends ``value`` to the type ``ty2``.
10753
10754Arguments:
10755""""""""""
10756
10757The '``sext``' instruction takes a value to cast, and a type to cast it
10758to. Both types must be of :ref:`integer <t_integer>` types, or vectors of
10759the same number of integers. The bit size of the ``value`` must be
10760smaller than the bit size of the destination type, ``ty2``.
10761
10762Semantics:
10763""""""""""
10764
10765The '``sext``' instruction performs a sign extension by copying the sign
10766bit (highest order bit) of the ``value`` until it reaches the bit size
10767of the type ``ty2``.
10768
10769When sign extending from i1, the extension always results in -1 or 0.
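
The contrast with ``zext`` can be sketched on a value whose sign bit is set.
The value names are chosen for illustration only:

.. code-block:: llvm

      %z = zext i8 -128 to i16     ; yields i16:128   (0x80 becomes 0x0080)
      %s = sext i8 -128 to i16     ; yields i16:-128  (0x80 becomes 0xFF80)
      %b = sext i1 true to i8      ; yields i8:-1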
10770
10771Example:
10772""""""""
10773
10774.. code-block:: llvm
10775
      %X = sext i8  -1 to i16              ; yields i16:-1 (bit pattern 65535)
10777      %Y = sext i1 true to i32             ; yields i32:-1
10778      %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
10779
10780'``fptrunc .. to``' Instruction
10781^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10782
10783Syntax:
10784"""""""
10785
10786::
10787
10788      <result> = fptrunc <ty> <value> to <ty2>             ; yields ty2
10789
10790Overview:
10791"""""""""
10792
10793The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.
10794
10795Arguments:
10796""""""""""
10797
10798The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
10799value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
10800The size of ``value`` must be larger than the size of ``ty2``. This
10801implies that ``fptrunc`` cannot be used to make a *no-op cast*.
10802
10803Semantics:
10804""""""""""
10805
10806The '``fptrunc``' instruction casts a ``value`` from a larger
10807:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
10808<t_floating>` type.
10809This instruction is assumed to execute in the default :ref:`floating-point
10810environment <floatenv>`.
10811
10812Example:
10813""""""""
10814
10815.. code-block:: llvm
10816
10817      %X = fptrunc double 16777217.0 to float    ; yields float:16777216.0
10818      %Y = fptrunc double 1.0E+300 to half       ; yields half:+infinity
10819
10820'``fpext .. to``' Instruction
10821^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10822
10823Syntax:
10824"""""""
10825
10826::
10827
10828      <result> = fpext <ty> <value> to <ty2>             ; yields ty2
10829
10830Overview:
10831"""""""""
10832
10833The '``fpext``' extends a floating-point ``value`` to a larger floating-point
10834value.
10835
10836Arguments:
10837""""""""""
10838
10839The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
10840``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
10841to. The source type must be smaller than the destination type.
10842
10843Semantics:
10844""""""""""
10845
10846The '``fpext``' instruction extends the ``value`` from a smaller
10847:ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
10848<t_floating>` type. The ``fpext`` cannot be used to make a
10849*no-op cast* because it always changes bits. Use ``bitcast`` to make a
10850*no-op cast* for a floating-point cast.
10851
10852Example:
10853""""""""
10854
10855.. code-block:: llvm
10856
10857      %X = fpext float 3.125 to double         ; yields double:3.125000e+00
10858      %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000
10859
10860'``fptoui .. to``' Instruction
10861^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10862
10863Syntax:
10864"""""""
10865
10866::
10867
10868      <result> = fptoui <ty> <value> to <ty2>             ; yields ty2
10869
10870Overview:
10871"""""""""
10872
10873The '``fptoui``' converts a floating-point ``value`` to its unsigned
10874integer equivalent of type ``ty2``.
10875
10876Arguments:
10877""""""""""
10878
10879The '``fptoui``' instruction takes a value to cast, which must be a
10880scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
10884
10885Semantics:
10886""""""""""
10887
10888The '``fptoui``' instruction converts its :ref:`floating-point
10889<t_floating>` operand into the nearest (rounding towards zero)
10890unsigned integer value. If the value cannot fit in ``ty2``, the result
10891is a :ref:`poison value <poisonvalues>`.
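
A short sketch of the rounding and range rule, using constants that are
exactly representable as ``float`` (the value names are illustrative):

.. code-block:: llvm

      %a = fptoui float 3.75 to i32      ; yields i32:3, the fraction is discarded
      %b = fptoui float 255.5 to i8      ; yields i8:255, fits after rounding towards zero
      %c = fptoui float 256.0 to i8      ; yields poison, 256 does not fit in i8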
10892
10893Example:
10894""""""""
10895
10896.. code-block:: llvm
10897
10898      %X = fptoui double 123.0 to i32      ; yields i32:123
      %Y = fptoui double 1.0E+300 to i1    ; yields poison: the value does not fit in i1
      %Z = fptoui double 1.04E+17 to i8    ; yields poison: the value does not fit in i8
10901
10902'``fptosi .. to``' Instruction
10903^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10904
10905Syntax:
10906"""""""
10907
10908::
10909
10910      <result> = fptosi <ty> <value> to <ty2>             ; yields ty2
10911
10912Overview:
10913"""""""""
10914
10915The '``fptosi``' instruction converts :ref:`floating-point <t_floating>`
10916``value`` to type ``ty2``.
10917
10918Arguments:
10919""""""""""
10920
10921The '``fptosi``' instruction takes a value to cast, which must be a
10922scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
10926
10927Semantics:
10928""""""""""
10929
10930The '``fptosi``' instruction converts its :ref:`floating-point
10931<t_floating>` operand into the nearest (rounding towards zero)
10932signed integer value. If the value cannot fit in ``ty2``, the result
10933is a :ref:`poison value <poisonvalues>`.
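
A short sketch of the rounding and range rule, using constants that are
exactly representable as ``float`` (the value names are illustrative):

.. code-block:: llvm

      %a = fptosi float -3.75 to i32     ; yields i32:-3, rounds towards zero
      %b = fptosi float -128.0 to i8     ; yields i8:-128, the smallest i8 value
      %c = fptosi float 128.0 to i8      ; yields poison, 128 does not fit in i8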
10934
10935Example:
10936""""""""
10937
10938.. code-block:: llvm
10939
10940      %X = fptosi double -123.0 to i32      ; yields i32:-123
      %Y = fptosi double 1.0E-247 to i1     ; yields i1:0 (rounds towards zero)
      %Z = fptosi double 1.04E+17 to i8     ; yields poison (value does not fit in i8)
10943
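The rounding direction matters for negative inputs. The following sketch (value
names are illustrative) shows that the conversion rounds towards zero rather
than towards negative infinity, and that a value just outside the signed range
of the destination type yields poison.

.. code-block:: llvm

      %A = fptosi float -1.75 to i32                                    ; yields i32:-1, not -2
      %B = fptosi <2 x double> <double 2.9, double -2.9> to <2 x i16>   ; yields <2 x i16>:<i16 2, i16 -2>
      %C = fptosi float 128.0 to i8                                     ; yields poison (i8 holds -128..127)
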
10944'``uitofp .. to``' Instruction
10945^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10946
10947Syntax:
10948"""""""
10949
10950::
10951
10952      <result> = uitofp <ty> <value> to <ty2>             ; yields ty2
10953
10954Overview:
10955"""""""""
10956
10957The '``uitofp``' instruction regards ``value`` as an unsigned integer
10958and converts that value to the ``ty2`` type.
10959
10960Arguments:
10961""""""""""
10962
The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
10968
10969Semantics:
10970""""""""""
10971
10972The '``uitofp``' instruction interprets its operand as an unsigned
10973integer quantity and converts it to the corresponding floating-point
10974value. If the value cannot be exactly represented, it is rounded using
10975the default rounding mode.
10976
10977
10978Example:
10979""""""""
10980
10981.. code-block:: llvm
10982
10983      %X = uitofp i32 257 to float         ; yields float:257.0
10984      %Y = uitofp i8 -1 to double          ; yields double:255.0
10985
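The rounding case can be sketched as follows, assuming the default rounding
mode is round-to-nearest with ties to even. A ``float`` has a 24-bit
significand, so 2\ :sup:`24` + 1 cannot be represented exactly. The value names
are illustrative.

.. code-block:: llvm

      %A = uitofp i32 16777217 to float                     ; yields float:16777216.0 (tie rounds to even)
      %B = uitofp <2 x i8> <i8 1, i8 -1> to <2 x float>     ; yields <2 x float>:<float 1.0, float 255.0>
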
10986'``sitofp .. to``' Instruction
10987^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10988
10989Syntax:
10990"""""""
10991
10992::
10993
10994      <result> = sitofp <ty> <value> to <ty2>             ; yields ty2
10995
10996Overview:
10997"""""""""
10998
10999The '``sitofp``' instruction regards ``value`` as a signed integer and
11000converts that value to the ``ty2`` type.
11001
11002Arguments:
11003""""""""""
11004
The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.
11010
11011Semantics:
11012""""""""""
11013
11014The '``sitofp``' instruction interprets its operand as a signed integer
11015quantity and converts it to the corresponding floating-point value. If the
11016value cannot be exactly represented, it is rounded using the default rounding
11017mode.
11018
11019Example:
11020""""""""
11021
11022.. code-block:: llvm
11023
11024      %X = sitofp i32 257 to float         ; yields float:257.0
11025      %Y = sitofp i8 -1 to double          ; yields double:-1.0
11026
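To contrast the signed and unsigned conversions, the sketch below applies both
to the same bit pattern. The value names are illustrative.

.. code-block:: llvm

      %S = sitofp i8 -128 to float                              ; yields float:-128.0
      %U = uitofp i8 -128 to float                              ; yields float:128.0 (same bits, read as unsigned)
      %V = sitofp <2 x i16> <i16 -3, i16 7> to <2 x double>     ; yields <2 x double>:<double -3.0, double 7.0>
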
11027.. _i_ptrtoint:
11028
11029'``ptrtoint .. to``' Instruction
11030^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11031
11032Syntax:
11033"""""""
11034
11035::
11036
11037      <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2
11038
11039Overview:
11040"""""""""
11041
11042The '``ptrtoint``' instruction converts the pointer or a vector of
11043pointers ``value`` to the integer (or vector of integers) type ``ty2``.
11044
11045Arguments:
11046""""""""""
11047
The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
type to cast it to, ``ty2``, which must be an :ref:`integer <t_integer>`
type or a vector of integers type.
11052
11053Semantics:
11054""""""""""
11055
11056The '``ptrtoint``' instruction converts ``value`` to integer type
11057``ty2`` by interpreting the pointer value as an integer and either
11058truncating or zero extending that value to the size of the integer type.
11059If ``value`` is smaller than ``ty2`` then a zero extension is done. If
11060``value`` is larger than ``ty2`` then a truncation is done. If they are
11061the same size, then nothing is done (*no-op cast*) other than a type
11062change.
11063
11064Example:
11065""""""""
11066
11067.. code-block:: llvm
11068
11069      %X = ptrtoint ptr %P to i8                         ; yields truncation on 32-bit architecture
11070      %Y = ptrtoint ptr %P to i64                        ; yields zero extension on 32-bit architecture
      %Z = ptrtoint <4 x ptr> %P to <4 x i64>            ; yields vector zero extension for a vector of addresses on 32-bit architecture
11072
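A typical use is inspecting the low bits of an address. The following is a
minimal sketch; the function name ``@is_aligned16`` is hypothetical, and the
choice of ``i64`` assumes pointers are at most 64 bits wide so that no address
bits are truncated.

.. code-block:: llvm

      define i1 @is_aligned16(ptr %p) {
        %addr = ptrtoint ptr %p to i64      ; zero-extended if pointers are narrower than 64 bits
        %low = and i64 %addr, 15            ; keep the low four bits of the address
        %ok = icmp eq i64 %low, 0           ; true if the address is a multiple of 16
        ret i1 %ok
      }
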
11073.. _i_inttoptr:
11074
11075'``inttoptr .. to``' Instruction
11076^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11077
11078Syntax:
11079"""""""
11080
11081::
11082
11083      <result> = inttoptr <ty> <value> to <ty2>[, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>]             ; yields ty2
11084
11085Overview:
11086"""""""""
11087
11088The '``inttoptr``' instruction converts an integer ``value`` to a
11089pointer type, ``ty2``.
11090
11091Arguments:
11092""""""""""
11093
11094The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
11095cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
11096type.
11097
11098The optional ``!dereferenceable`` metadata must reference a single metadata
11099name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
11100entry.
11101See ``dereferenceable`` metadata.
11102
11103The optional ``!dereferenceable_or_null`` metadata must reference a single
11104metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
11105``i64`` entry.
11106See ``dereferenceable_or_null`` metadata.
11107
11108Semantics:
11109""""""""""
11110
11111The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
11112applying either a zero extension or a truncation depending on the size
11113of the integer ``value``. If ``value`` is larger than the size of a
11114pointer then a truncation is done. If ``value`` is smaller than the size
11115of a pointer then a zero extension is done. If they are the same size,
11116nothing is done (*no-op cast*).
11117
11118Example:
11119""""""""
11120
11121.. code-block:: llvm
11122
11123      %X = inttoptr i32 255 to ptr           ; yields zero extension on 64-bit architecture
11124      %Y = inttoptr i32 255 to ptr           ; yields no-op on 32-bit architecture
11125      %Z = inttoptr i64 0 to ptr             ; yields truncation on 32-bit architecture
      %W = inttoptr <4 x i32> %G to <4 x ptr>; converts each element of vector G to a pointer
11127
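The optional metadata can be attached as in the sketch below. The function name
``@load_at`` and the metadata node number are illustrative; the example assumes
the caller guarantees that ``%addr`` really is the address of at least 8
readable bytes.

.. code-block:: llvm

      define i64 @load_at(i64 %addr) {
        ; The attachment asserts that the resulting pointer is dereferenceable for 8 bytes.
        %p = inttoptr i64 %addr to ptr, !dereferenceable !0
        %v = load i64, ptr %p
        ret i64 %v
      }

      !0 = !{i64 8}
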
11128.. _i_bitcast:
11129
11130'``bitcast .. to``' Instruction
11131^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11132
11133Syntax:
11134"""""""
11135
11136::
11137
11138      <result> = bitcast <ty> <value> to <ty2>             ; yields ty2
11139
11140Overview:
11141"""""""""
11142
11143The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
11144changing any bits.
11145
11146Arguments:
11147""""""""""
11148
11149The '``bitcast``' instruction takes a value to cast, which must be a
11150non-aggregate first class value, and a type to cast it to, which must
11151also be a non-aggregate :ref:`first class <t_firstclass>` type. The
11152bit sizes of ``value`` and the destination type, ``ty2``, must be
11153identical. If the source type is a pointer, the destination type must
11154also be a pointer of the same size. This instruction supports bitwise
11155conversion of vectors to integers and to vectors of other types (as
11156long as they have the same size).
11157
11158Semantics:
11159""""""""""
11160
11161The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
11162is always a *no-op cast* because no bits change with this
11163conversion. The conversion is done as if the ``value`` had been stored
11164to memory and read back as type ``ty2``. Pointer (or vector of
11165pointers) types may only be converted to other pointer (or vector of
11166pointers) types with the same address space through this instruction.
11167To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
11168or :ref:`ptrtoint <i_ptrtoint>` instructions first.
11169
There is a caveat for bitcasts involving vector types in relation to
endianness. For example, ``bitcast <2 x i8> <value> to i16`` puts element zero
of the vector in the least significant bits of the ``i16`` for little-endian,
while element zero ends up in the most significant bits for big-endian.
11174
11175Example:
11176""""""""
11177
11178.. code-block:: text
11179
11180      %X = bitcast i8 255 to i8          ; yields i8 :-1
11181      %Y = bitcast i32* %x to i16*       ; yields i16*:%x
      %Z = bitcast <2 x i32> %V to i64   ; yields i64: %V (depends on endianness)
      %W = bitcast <2 x i32*> %V to <2 x i64*> ; yields <2 x i64*>
11184
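The "store and reload" reading and the endianness caveat can be made concrete
with constant operands. In the sketch below the value names and the operand
``%v`` (assumed to be a ``<4 x i16>``) are illustrative.

.. code-block:: llvm

      %F = bitcast float 1.0 to i32                   ; yields i32:1065353216 (0x3F800000)
      %H = bitcast <2 x i8> <i8 1, i8 2> to i16       ; yields i16:513 (0x0201) on little-endian,
                                                      ; i16:258 (0x0102) on big-endian
      %L = bitcast <4 x i16> %v to <2 x i32>          ; same 64 bits, viewed as two i32 lanes
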
11185.. _i_addrspacecast:
11186
11187'``addrspacecast .. to``' Instruction
11188^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11189
11190Syntax:
11191"""""""
11192
11193::
11194
11195      <result> = addrspacecast <pty> <ptrval> to <pty2>       ; yields pty2
11196
11197Overview:
11198"""""""""
11199
11200The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
11201address space ``n`` to type ``pty2`` in address space ``m``.
11202
11203Arguments:
11204""""""""""
11205
The '``addrspacecast``' instruction takes a pointer or vector of pointers
value to cast and a pointer type to cast it to, which must have a different
address space.
11209
11210Semantics:
11211""""""""""
11212
11213The '``addrspacecast``' instruction converts the pointer value
11214``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
11215value modification, depending on the target and the address space
11216pair. Pointer conversions within the same address space must be
11217performed with the ``bitcast`` instruction. Note that if the address space
11218conversion is legal then both result and operand refer to the same memory
11219location.
11220
11221Example:
11222""""""""
11223
11224.. code-block:: llvm
11225
11226      %X = addrspacecast ptr %x to ptr addrspace(1)
11227      %Y = addrspacecast ptr addrspace(1) %y to ptr addrspace(2)
11228      %Z = addrspacecast <4 x ptr> %z to <4 x ptr addrspace(3)>
11229
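In context, the cast usually precedes a memory access in the destination
address space. The sketch below is illustrative only: the function name
``@load_as1`` is hypothetical, and whether address space 1 exists and whether
the cast is legal for a given pointer depend on the target.

.. code-block:: llvm

      define i32 @load_as1(ptr %p) {
        %g = addrspacecast ptr %p to ptr addrspace(1)
        ; If the conversion is legal, %g and %p refer to the same memory location.
        %v = load i32, ptr addrspace(1) %g
        ret i32 %v
      }
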
11230.. _otherops:
11231
11232Other Operations
11233----------------
11234
11235The instructions in this category are the "miscellaneous" instructions,
11236which defy better classification.
11237
11238.. _i_icmp:
11239
11240'``icmp``' Instruction
11241^^^^^^^^^^^^^^^^^^^^^^
11242
11243Syntax:
11244"""""""
11245
11246::
11247
11248      <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result
11249
11250Overview:
11251"""""""""
11252
11253The '``icmp``' instruction returns a boolean value or a vector of
11254boolean values based on comparison of its two integer, integer vector,
11255pointer, or pointer vector operands.
11256
11257Arguments:
11258""""""""""
11259
11260The '``icmp``' instruction takes three operands. The first operand is
11261the condition code indicating the kind of comparison to perform. It is
11262not a value, just a keyword. The possible condition codes are:
11263
11264.. _icmp_md_cc:
11265
11266#. ``eq``: equal
11267#. ``ne``: not equal
11268#. ``ugt``: unsigned greater than
11269#. ``uge``: unsigned greater or equal
11270#. ``ult``: unsigned less than
11271#. ``ule``: unsigned less or equal
11272#. ``sgt``: signed greater than
11273#. ``sge``: signed greater or equal
11274#. ``slt``: signed less than
11275#. ``sle``: signed less or equal
11276
The remaining two arguments must be of :ref:`integer <t_integer>` or
:ref:`pointer <t_pointer>` type, or a :ref:`vector <t_vector>` of integers or
pointers. They must also have identical types.
11280
11281Semantics:
11282""""""""""
11283
11284The '``icmp``' compares ``op1`` and ``op2`` according to the condition
11285code given as ``cond``. The comparison performed always yields either an
11286:ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:
11287
11288.. _icmp_md_cc_sem:
11289
11290#. ``eq``: yields ``true`` if the operands are equal, ``false``
11291   otherwise. No sign interpretation is necessary or performed.
11292#. ``ne``: yields ``true`` if the operands are unequal, ``false``
11293   otherwise. No sign interpretation is necessary or performed.
11294#. ``ugt``: interprets the operands as unsigned values and yields
11295   ``true`` if ``op1`` is greater than ``op2``.
11296#. ``uge``: interprets the operands as unsigned values and yields
11297   ``true`` if ``op1`` is greater than or equal to ``op2``.
11298#. ``ult``: interprets the operands as unsigned values and yields
11299   ``true`` if ``op1`` is less than ``op2``.
11300#. ``ule``: interprets the operands as unsigned values and yields
11301   ``true`` if ``op1`` is less than or equal to ``op2``.
11302#. ``sgt``: interprets the operands as signed values and yields ``true``
11303   if ``op1`` is greater than ``op2``.
11304#. ``sge``: interprets the operands as signed values and yields ``true``
11305   if ``op1`` is greater than or equal to ``op2``.
11306#. ``slt``: interprets the operands as signed values and yields ``true``
11307   if ``op1`` is less than ``op2``.
11308#. ``sle``: interprets the operands as signed values and yields ``true``
11309   if ``op1`` is less than or equal to ``op2``.
11310
11311If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
11312are compared as if they were integers.
11313
11314If the operands are integer vectors, then they are compared element by
11315element. The result is an ``i1`` vector with the same number of elements
11316as the values being compared. Otherwise, the result is an ``i1``.
11317
11318Example:
11319""""""""
11320
11321.. code-block:: text
11322
11323      <result> = icmp eq i32 4, 5          ; yields: result=false
11324      <result> = icmp ne ptr %X, %X        ; yields: result=false
11325      <result> = icmp ult i16  4, 5        ; yields: result=true
11326      <result> = icmp sgt i16  4, 5        ; yields: result=false
11327      <result> = icmp ule i16 -4, 5        ; yields: result=false
11328      <result> = icmp sge i16  4, 5        ; yields: result=false
11329
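The difference between the signed and unsigned predicates, and the vector and
pointer forms, can be sketched as follows. The value names and the pointer
operand ``%p`` are illustrative.

.. code-block:: llvm

      %A = icmp ult i8 -1, 1                                      ; yields i1:false (-1 is 255 when unsigned)
      %B = icmp slt i8 -1, 1                                      ; yields i1:true
      %C = icmp sgt <2 x i32> <i32 1, i32 5>, <i32 3, i32 2>      ; yields <2 x i1>:<i1 false, i1 true>
      %D = icmp eq ptr %p, null                                   ; yields i1:true if %p is the null pointer
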
11330.. _i_fcmp:
11331
11332'``fcmp``' Instruction
11333^^^^^^^^^^^^^^^^^^^^^^
11334
11335Syntax:
11336"""""""
11337
11338::
11339
11340      <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result
11341
11342Overview:
11343"""""""""
11344
11345The '``fcmp``' instruction returns a boolean value or vector of boolean
11346values based on comparison of its operands.
11347
11348If the operands are floating-point scalars, then the result type is a
11349boolean (:ref:`i1 <t_integer>`).
11350
11351If the operands are floating-point vectors, then the result type is a
11352vector of boolean with the same number of elements as the operands being
11353compared.
11354
11355Arguments:
11356""""""""""
11357
11358The '``fcmp``' instruction takes three operands. The first operand is
11359the condition code indicating the kind of comparison to perform. It is
11360not a value, just a keyword. The possible condition codes are:
11361
11362#. ``false``: no comparison, always returns false
11363#. ``oeq``: ordered and equal
11364#. ``ogt``: ordered and greater than
11365#. ``oge``: ordered and greater than or equal
11366#. ``olt``: ordered and less than
11367#. ``ole``: ordered and less than or equal
11368#. ``one``: ordered and not equal
11369#. ``ord``: ordered (no nans)
11370#. ``ueq``: unordered or equal
11371#. ``ugt``: unordered or greater than
11372#. ``uge``: unordered or greater than or equal
11373#. ``ult``: unordered or less than
11374#. ``ule``: unordered or less than or equal
11375#. ``une``: unordered or not equal
11376#. ``uno``: unordered (either nans)
11377#. ``true``: no comparison, always returns true
11378
11379*Ordered* means that neither operand is a QNAN while *unordered* means
11380that either operand may be a QNAN.
11381
Each of the ``op1`` and ``op2`` arguments must be of either a
:ref:`floating-point <t_floating>` type or a :ref:`vector <t_vector>` of
floating-point type. They must have identical types.
11385
11386Semantics:
11387""""""""""
11388
11389The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
11390condition code given as ``cond``. If the operands are vectors, then the
11391vectors are compared element by element. Each comparison performed
11392always yields an :ref:`i1 <t_integer>` result, as follows:
11393
11394#. ``false``: always yields ``false``, regardless of operands.
11395#. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
11396   is equal to ``op2``.
11397#. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
11398   is greater than ``op2``.
11399#. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
11400   is greater than or equal to ``op2``.
11401#. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
11402   is less than ``op2``.
11403#. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
11404   is less than or equal to ``op2``.
11405#. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
11406   is not equal to ``op2``.
11407#. ``ord``: yields ``true`` if both operands are not a QNAN.
11408#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
11409   equal to ``op2``.
11410#. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
11411   greater than ``op2``.
11412#. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
11413   greater than or equal to ``op2``.
11414#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
11415   less than ``op2``.
11416#. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
11417   less than or equal to ``op2``.
11418#. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
11419   not equal to ``op2``.
11420#. ``uno``: yields ``true`` if either operand is a QNAN.
11421#. ``true``: always yields ``true``, regardless of operands.
11422
11423The ``fcmp`` instruction can also optionally take any number of
11424:ref:`fast-math flags <fastmath>`, which are optimization hints to enable
11425otherwise unsafe floating-point optimizations.
11426
Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
only flags that have any effect on its semantics are those that allow
assumptions to be made about the values of input arguments; namely
``nnan``, ``ninf``, and ``reassoc``. See :ref:`fastmath` for more information.
11431
11432Example:
11433""""""""
11434
11435.. code-block:: text
11436
11437      <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
11438      <result> = fcmp one float 4.0, 5.0    ; yields: result=true
11439      <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
11440      <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false
11441
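The ordered and unordered predicates differ only when a NaN is involved. The
sketch below uses the hexadecimal constant ``0x7FF8000000000000``, which is a
quiet NaN of type ``double``. The value names and the operands ``%x`` and
``%y`` (assumed to be ``<2 x float>``) are illustrative.

.. code-block:: llvm

      %A = fcmp oeq double 0x7FF8000000000000, 0x7FF8000000000000   ; yields i1:false (a NaN is never ordered)
      %B = fcmp une double 0x7FF8000000000000, 1.0                  ; yields i1:true
      %C = fcmp ord double 1.0, 2.0                                 ; yields i1:true (neither operand is a NaN)
      %D = fcmp uno double 0x7FF8000000000000, 1.0                  ; yields i1:true
      %E = fcmp nnan olt <2 x float> %x, %y                         ; yields <2 x i1>, compared element by element
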
11442.. _i_phi:
11443
11444'``phi``' Instruction
11445^^^^^^^^^^^^^^^^^^^^^
11446
11447Syntax:
11448"""""""
11449
11450::
11451
11452      <result> = phi [fast-math-flags] <ty> [ <val0>, <label0>], ...
11453
11454Overview:
11455"""""""""
11456
11457The '``phi``' instruction is used to implement the φ node in the SSA
11458graph representing the function.
11459
11460Arguments:
11461""""""""""
11462
11463The type of the incoming values is specified with the first type field.
11464After this, the '``phi``' instruction takes a list of pairs as
11465arguments, with one pair for each predecessor basic block of the current
11466block. Only values of :ref:`first class <t_firstclass>` type may be used as
11467the value arguments to the PHI node. Only labels may be used as the
11468label arguments.
11469
11470There must be no non-phi instructions between the start of a basic block
11471and the PHI instructions: i.e. PHI instructions must be first in a basic
11472block.
11473
11474For the purposes of the SSA form, the use of each incoming value is
11475deemed to occur on the edge from the corresponding predecessor block to
11476the current block (but after any definition of an '``invoke``'
11477instruction's return value on the same edge).
11478
11479The optional ``fast-math-flags`` marker indicates that the phi has one
11480or more :ref:`fast-math-flags <fastmath>`. These are optimization hints
11481to enable otherwise unsafe floating-point optimizations. Fast-math-flags
11482are only valid for phis that return a floating-point scalar or vector
11483type, or an array (nested to any depth) of floating-point scalar or vector
11484types.
11485
11486Semantics:
11487""""""""""
11488
11489At runtime, the '``phi``' instruction logically takes on the value
11490specified by the pair corresponding to the predecessor basic block that
11491executed just prior to the current block.
11492
11493Example:
11494""""""""
11495
11496.. code-block:: llvm
11497
11498    Loop:       ; Infinite loop that counts from 0 on up...
11499      %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
11500      %nextindvar = add i32 %indvar, 1
11501      br label %Loop
11502
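For a loop that terminates, the following sketch sums the integers from 0 up to
but not including ``%n``. The function and block names are illustrative. Each
``phi`` has one incoming pair per predecessor of ``%loop``: the initial value
from ``%entry`` and the updated value from ``%body``.

.. code-block:: llvm

      define i32 @sum_below(i32 %n) {
      entry:
        br label %loop

      loop:
        %i = phi i32 [ 0, %entry ], [ %i.next, %body ]
        %acc = phi i32 [ 0, %entry ], [ %acc.next, %body ]
        %cont = icmp slt i32 %i, %n
        br i1 %cont, label %body, label %exit

      body:
        %acc.next = add i32 %acc, %i
        %i.next = add i32 %i, 1
        br label %loop

      exit:
        ret i32 %acc
      }

Called with ``%n`` equal to 4, this function returns 6.
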
11503.. _i_select:
11504
11505'``select``' Instruction
11506^^^^^^^^^^^^^^^^^^^^^^^^
11507
11508Syntax:
11509"""""""
11510
11511::
11512
11513      <result> = select [fast-math flags] selty <cond>, <ty> <val1>, <ty> <val2>             ; yields ty
11514
11515      selty is either i1 or {<N x i1>}
11516
11517Overview:
11518"""""""""
11519
11520The '``select``' instruction is used to choose one value based on a
11521condition, without IR-level branching.
11522
11523Arguments:
11524""""""""""
11525
11526The '``select``' instruction requires an 'i1' value or a vector of 'i1'
11527values indicating the condition, and two values of the same :ref:`first
11528class <t_firstclass>` type.
11529
The optional ``fast-math flags`` marker indicates that the select has one or more
:ref:`fast-math flags <fastmath>`. These are optimization hints to enable
otherwise unsafe floating-point optimizations. Fast-math flags are only valid
for selects that return a floating-point scalar or vector type, or an array
(nested to any depth) of floating-point scalar or vector types.
11535
11536Semantics:
11537""""""""""
11538
11539If the condition is an i1 and it evaluates to 1, the instruction returns
11540the first value argument; otherwise, it returns the second value
11541argument.
11542
11543If the condition is a vector of i1, then the value arguments must be
11544vectors of the same size, and the selection is done element by element.
11545
11546If the condition is an i1 and the value arguments are vectors of the
11547same size, then an entire vector is selected.
11548
11549Example:
11550""""""""
11551
11552.. code-block:: llvm
11553
11554      %X = select i1 true, i8 17, i8 42          ; yields i8:17
11555
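The three cases described in the semantics can be sketched as follows. The
value names and the operands ``%a`` and ``%b`` (assumed to be ``i32``) are
illustrative.

.. code-block:: llvm

      ; Scalar condition, scalar values: a branch-free signed maximum.
      %cmp = icmp sgt i32 %a, %b
      %max = select i1 %cmp, i32 %a, i32 %b
      ; Vector condition: the selection is done element by element.
      %V = select <2 x i1> <i1 true, i1 false>, <2 x i8> <i8 1, i8 2>, <2 x i8> <i8 3, i8 4>   ; yields <2 x i8>:<i8 1, i8 4>
      ; Scalar condition, vector values: an entire vector is selected.
      %W = select i1 false, <2 x i8> <i8 1, i8 2>, <2 x i8> <i8 3, i8 4>                       ; yields <2 x i8>:<i8 3, i8 4>
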
11556
11557.. _i_freeze:
11558
11559'``freeze``' Instruction
11560^^^^^^^^^^^^^^^^^^^^^^^^
11561
11562Syntax:
11563"""""""
11564
11565::
11566
11567      <result> = freeze ty <val>    ; yields ty:result
11568
11569Overview:
11570"""""""""
11571
11572The '``freeze``' instruction is used to stop propagation of
11573:ref:`undef <undefvalues>` and :ref:`poison <poisonvalues>` values.
11574
11575Arguments:
11576""""""""""
11577
11578The '``freeze``' instruction takes a single argument.
11579
11580Semantics:
11581""""""""""
11582
11583If the argument is ``undef`` or ``poison``, '``freeze``' returns an
11584arbitrary, but fixed, value of type '``ty``'.
11585Otherwise, this instruction is a no-op and returns the input argument.
11586All uses of a value returned by the same '``freeze``' instruction are
11587guaranteed to always observe the same value, while different '``freeze``'
11588instructions may yield different values.
11589
11590While ``undef`` and ``poison`` pointers can be frozen, the result is a
11591non-dereferenceable pointer. See the
11592:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more information.
11593If an aggregate value or vector is frozen, the operand is frozen element-wise.
11594The padding of an aggregate isn't considered, since it isn't visible
11595without storing it into memory and loading it with a different type.
11596
11597
11598Example:
11599""""""""
11600
11601.. code-block:: text
11602
11603      %w = i32 undef
11604      %x = freeze i32 %w
11605      %y = add i32 %w, %w         ; undef
11606      %z = add i32 %x, %x         ; even number because all uses of %x observe
11607                                  ; the same value
11608      %x2 = freeze i32 %w
11609      %cmp = icmp eq i32 %x, %x2  ; can be true or false
11610
11611      ; example with vectors
11612      %v = <2 x i32> <i32 undef, i32 poison>
11613      %a = extractelement <2 x i32> %v, i32 0    ; undef
11614      %b = extractelement <2 x i32> %v, i32 1    ; poison
11615      %add = add i32 %a, %a                      ; undef
11616
11617      %v.fr = freeze <2 x i32> %v                ; element-wise freeze
11618      %d = extractelement <2 x i32> %v.fr, i32 0 ; not undef
11619      %add.f = add i32 %d, %d                    ; even number
11620
11621      ; branching on frozen value
11622      %poison = add nsw i1 %k, undef   ; poison
11623      %c = freeze i1 %poison
11624      br i1 %c, label %foo, label %bar ; non-deterministic branch to %foo or %bar
11625
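As a complete function, the branching case might look like the sketch below.
The function name ``@pick`` is hypothetical. Branching directly on ``%c`` would
be undefined behavior if ``%c`` were ``undef`` or ``poison``; freezing it first
makes the branch merely non-deterministic in that case.

.. code-block:: llvm

      define i32 @pick(i1 %c, i32 %a, i32 %b) {
      entry:
        %c.fr = freeze i1 %c          ; an undef or poison %c becomes a fixed true or false
        br i1 %c.fr, label %then, label %else

      then:
        ret i32 %a

      else:
        ret i32 %b
      }
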
11626
11627.. _i_call:
11628
11629'``call``' Instruction
11630^^^^^^^^^^^^^^^^^^^^^^
11631
11632Syntax:
11633"""""""
11634
11635::
11636
11637      <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] [addrspace(<num>)]
11638                 <ty>|<fnty> <fnptrval>(<function args>) [fn attrs] [ operand bundles ]
11639
11640Overview:
11641"""""""""
11642
11643The '``call``' instruction represents a simple function call.
11644
11645Arguments:
11646""""""""""
11647
11648This instruction requires several arguments:
11649
11650#. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
11651   should perform tail call optimization. The ``tail`` marker is a hint that
11652   `can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker
11653   means that the call must be tail call optimized in order for the program to
11654   be correct. The ``musttail`` marker provides these guarantees:
11655
11656   #. The call will not cause unbounded stack growth if it is part of a
11657      recursive cycle in the call graph.
11658   #. Arguments with the :ref:`inalloca <attr_inalloca>` or
11659      :ref:`preallocated <attr_preallocated>` attribute are forwarded in place.
   #. If the musttail call appears in a function with the ``"thunk"`` attribute
      and the caller and callee both have varargs, then any unprototyped
11662      arguments in register or memory are forwarded to the callee. Similarly,
11663      the return value of the callee is returned to the caller's caller, even
11664      if a void return type is in use.
11665
11666   Both markers imply that the callee does not access allocas from the caller.
11667   The ``tail`` marker additionally implies that the callee does not access
   varargs from the caller. Calls marked ``musttail`` must obey the following
   additional rules:
11670
11671   - The call must immediately precede a :ref:`ret <i_ret>` instruction,
11672     or a pointer bitcast followed by a ret instruction.
11673   - The ret instruction must return the (possibly bitcasted) value
11674     produced by the call, undef, or void.
11675   - The calling conventions of the caller and callee must match.
11676   - The callee must be varargs iff the caller is varargs. Bitcasting a
11677     non-varargs function to the appropriate varargs type is legal so
11678     long as the non-varargs prefixes obey the other rules.
   - The return type must not undergo automatic conversion to an ``sret`` pointer.
11680
   In addition, if the calling convention is not ``swifttailcc`` or ``tailcc``:
11682
11683   - All ABI-impacting function attributes, such as sret, byval, inreg,
11684     returned, and inalloca, must match.
11685   - The caller and callee prototypes must match. Pointer types of parameters
11686     or return types may differ in pointee type, but not in address space.
11687
   On the other hand, if the calling convention is ``swifttailcc`` or ``tailcc``:
11689
   - Only these ABI-impacting attributes are allowed: sret, byval,
     swiftself, and swiftasync.
11692   - Prototypes are not required to match.
11693
11694   Tail call optimization for calls marked ``tail`` is guaranteed to occur if
11695   the following conditions are met:
11696
11697   -  Caller and callee both have the calling convention ``fastcc`` or ``tailcc``.
11698   -  The call is in tail position (ret immediately follows call and ret
11699      uses value of call or is void).
11700   -  Option ``-tailcallopt`` is enabled,
11701      ``llvm::GuaranteedTailCallOpt`` is ``true``, or the calling convention
11702      is ``tailcc``
11703   -  `Platform-specific constraints are
11704      met. <CodeGenerator.html#tailcallopt>`_
11705
11706#. The optional ``notail`` marker indicates that the optimizers should not add
11707   ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
11708   call optimization from being performed on the call.
11709
11710#. The optional ``fast-math flags`` marker indicates that the call has one or more
11711   :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
11712   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
11713   for calls that return a floating-point scalar or vector type, or an array
11714   (nested to any depth) of floating-point scalar or vector types.
11715
11716#. The optional "cconv" marker indicates which :ref:`calling
11717   convention <callingconv>` the call should use. If none is
11718   specified, the call defaults to using C calling conventions. The
11719   calling convention of the call must match the calling convention of
11720   the target function, or else the behavior is undefined.
11721#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
11722   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
11723   are valid here.
11724#. The optional addrspace attribute can be used to indicate the address space
11725   of the called function. If it is not specified, the program address space
11726   from the :ref:`datalayout string<langref_datalayout>` will be used.
11727#. '``ty``': the type of the call instruction itself which is also the
11728   type of the return value. Functions that return no value are marked
11729   ``void``.
11730#. '``fnty``': shall be the signature of the function being called. The
11731   argument types must match the types implied by this signature. This
11732   type can be omitted if the function is not varargs.
11733#. '``fnptrval``': An LLVM value containing a pointer to a function to
11734   be called. In most cases, this is a direct function call, but
11735   indirect ``call``'s are just as possible, calling an arbitrary pointer
11736   to function value.
11737#. '``function args``': argument list whose types match the function
11738   signature argument types and parameter attributes. All arguments must
11739   be of :ref:`first class <t_firstclass>` type. If the function signature
11740   indicates the function accepts a variable number of arguments, the
11741   extra arguments can be specified.
11742#. The optional :ref:`function attributes <fnattrs>` list.
11743#. The optional :ref:`operand bundles <opbundles>` list.
11744
11745Semantics:
11746""""""""""
11747
11748The '``call``' instruction is used to cause control flow to transfer to
11749a specified function, with its incoming arguments bound to the specified
11750values. Upon a '``ret``' instruction in the called function, control
11751flow continues with the instruction after the function call, and the
11752return value of the function is bound to the result argument.
11753
11754Example:
11755""""""""
11756
11757.. code-block:: llvm
11758
11759      %retval = call i32 @test(i32 %argc)
11760      call i32 (ptr, ...) @printf(ptr %msg, i32 12, i8 42)        ; yields i32
11761      %X = tail call i32 @foo()                                    ; yields i32
11762      %Y = tail call fastcc i32 @foo()  ; yields i32
11763      call void %foo(i8 signext 97)
11764
11765      %struct.A = type { i32, i8 }
11766      %r = call %struct.A @foo()                        ; yields { i32, i8 }
11767      %gr = extractvalue %struct.A %r, 0                ; yields i32
11768      %gr1 = extractvalue %struct.A %r, 1               ; yields i8
      call void @foo() noreturn                         ; indicates that @foo never returns normally
      %ZZ = call zeroext i32 @bar()                     ; return value is zero extended
11771
LLVM treats calls to some functions with names and arguments that match
11773the standard C99 library as being the C99 library functions, and may
11774perform optimizations or generate code for them under that assumption.
11775This is something we'd like to change in the future to provide better
11776support for freestanding environments and non-C-based languages.
11777
11778.. _i_va_arg:
11779
11780'``va_arg``' Instruction
11781^^^^^^^^^^^^^^^^^^^^^^^^
11782
11783Syntax:
11784"""""""
11785
11786::
11787
11788      <resultval> = va_arg <va_list*> <arglist>, <argty>
11789
11790Overview:
11791"""""""""
11792
11793The '``va_arg``' instruction is used to access arguments passed through
11794the "variable argument" area of a function call. It is used to implement
11795the ``va_arg`` macro in C.
11796
11797Arguments:
11798""""""""""
11799
11800This instruction takes a ``va_list*`` value and the type of the
11801argument. It returns a value of the specified argument type and
11802increments the ``va_list`` to point to the next argument. The actual
11803type of ``va_list`` is target specific.
11804
11805Semantics:
11806""""""""""
11807
11808The '``va_arg``' instruction loads an argument of the specified type
11809from the specified ``va_list`` and causes the ``va_list`` to point to
11810the next argument. For more information, see the variable argument
11811handling :ref:`Intrinsic Functions <int_varargs>`.
11812
11813It is legal for this instruction to be called in a function which does
11814not take a variable number of arguments, for example, the ``vfprintf``
11815function.
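
As a minimal sketch of this case, the helper below is not itself variadic but
reads two ``i32`` values from a ``va_list`` that a variadic caller is assumed
to have initialized with ``llvm.va_start``. The name ``@sum_two`` is
hypothetical and used only for illustration.

.. code-block:: llvm

      ; Hypothetical helper: takes a pointer to an already-initialized va_list.
      define i32 @sum_two(ptr %ap) {
        ; Each va_arg reads one i32 and advances the va_list.
        %a = va_arg ptr %ap, i32
        %b = va_arg ptr %ap, i32
        %sum = add i32 %a, %b
        ret i32 %sum
      }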
11816
11817``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
11818function <intrinsics>` because it takes a type as an argument.
11819
11820Example:
11821""""""""
11822
11823See the :ref:`variable argument processing <int_varargs>` section.
11824
11825Note that the code generator does not yet fully support va\_arg on many
11826targets. Also, it does not currently support va\_arg with aggregate
11827types on any target.
11828
11829.. _i_landingpad:
11830
11831'``landingpad``' Instruction
11832^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11833
11834Syntax:
11835"""""""
11836
11837::
11838
11839      <resultval> = landingpad <resultty> <clause>+
11840      <resultval> = landingpad <resultty> cleanup <clause>*
11841
11842      <clause> := catch <type> <value>
11843      <clause> := filter <array constant type> <array constant>
11844
11845Overview:
11846"""""""""
11847
11848The '``landingpad``' instruction is used by `LLVM's exception handling
11849system <ExceptionHandling.html#overview>`_ to specify that a basic block
11850is a landing pad --- one where the exception lands, and corresponds to the
11851code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
11852defines values supplied by the :ref:`personality function <personalityfn>` upon
11853re-entry to the function. The ``resultval`` has the type ``resultty``.
11854
11855Arguments:
11856""""""""""
11857
11858The optional
11859``cleanup`` flag indicates that the landing pad block is a cleanup.
11860
11861A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
11862contains the global variable representing the "type" that may be caught
11863or filtered respectively. Unlike the ``catch`` clause, the ``filter``
11864clause takes an array constant as its argument. Use
11865"``[0 x ptr] undef``" for a filter which cannot throw. The
11866'``landingpad``' instruction must contain *at least* one ``clause`` or
11867the ``cleanup`` flag.
11868
11869Semantics:
11870""""""""""
11871
11872The '``landingpad``' instruction defines the values which are set by the
11873:ref:`personality function <personalityfn>` upon re-entry to the function, and
11874therefore the "result type" of the ``landingpad`` instruction. As with
11875calling conventions, how the personality function results are
11876represented in LLVM IR is target specific.
11877
11878The clauses are applied in order from top to bottom. If two
11879``landingpad`` instructions are merged together through inlining, the
11880clauses from the calling function are appended to the list of clauses.
11881When the call stack is being unwound due to an exception being thrown,
11882the exception is compared against each ``clause`` in turn. If it doesn't
11883match any of the clauses, and the ``cleanup`` flag is not set, then
11884unwinding continues further up the call stack.
11885
11886The ``landingpad`` instruction has several restrictions:
11887
11888-  A landing pad block is a basic block which is the unwind destination
11889   of an '``invoke``' instruction.
11890-  A landing pad block must have a '``landingpad``' instruction as its
11891   first non-PHI instruction.
11892-  There can be only one '``landingpad``' instruction within the landing
11893   pad block.
11894-  A basic block that is not a landing pad block may not include a
11895   '``landingpad``' instruction.
11896
11897Example:
11898""""""""
11899
11900.. code-block:: llvm
11901
11902      ;; A landing pad which can catch an integer.
11903      %res = landingpad { ptr, i32 }
11904               catch ptr @_ZTIi
11905      ;; A landing pad that is a cleanup.
11906      %res = landingpad { ptr, i32 }
11907               cleanup
11908      ;; A landing pad which can catch an integer and can only throw a double.
11909      %res = landingpad { ptr, i32 }
11910               catch ptr @_ZTIi
11911               filter [1 x ptr] [ptr @_ZTId]
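
The fragments above show the instruction in isolation. As a sketch of how it
fits into a whole function, the following assumes the Itanium C++ personality
routine ``__gxx_personality_v0`` and uses illustrative names (``@may_throw``,
``@caller``). A real handler would normally go on to call into the language
runtime or execute ``resume``; that part is omitted here.

.. code-block:: llvm

      @_ZTIi = external constant ptr
      declare void @may_throw()
      declare i32 @__gxx_personality_v0(...)

      define i32 @caller() personality ptr @__gxx_personality_v0 {
      entry:
        ; %lpad is the unwind destination, which makes it a landing pad block.
        invoke void @may_throw()
                to label %cont unwind label %lpad

      cont:
        ret i32 0

      lpad:
        ; Must be the first non-PHI instruction of the landing pad block.
        %res = landingpad { ptr, i32 }
                 catch ptr @_ZTIi
        ; Field 1 holds the selector value set by the personality function.
        %sel = extractvalue { ptr, i32 } %res, 1
        ret i32 %sel
      }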
11912
11913.. _i_catchpad:
11914
11915'``catchpad``' Instruction
11916^^^^^^^^^^^^^^^^^^^^^^^^^^
11917
11918Syntax:
11919"""""""
11920
11921::
11922
11923      <resultval> = catchpad within <catchswitch> [<args>*]
11924
11925Overview:
11926"""""""""
11927
11928The '``catchpad``' instruction is used by `LLVM's exception handling
11929system <ExceptionHandling.html#overview>`_ to specify that a basic block
11930begins a catch handler --- one where a personality routine attempts to transfer
11931control to catch an exception.
11932
11933Arguments:
11934""""""""""
11935
11936The ``catchswitch`` operand must always be a token produced by a
11937:ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
11938ensures that each ``catchpad`` has exactly one predecessor block, and it always
11939terminates in a ``catchswitch``.
11940
11941The ``args`` correspond to whatever information the personality routine
11942requires to know if this is an appropriate handler for the exception. Control
11943will transfer to the ``catchpad`` if this is the first appropriate handler for
11944the exception.
11945
11946The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
11947``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
11948pads.
11949
11950Semantics:
11951""""""""""
11952
11953When the call stack is being unwound due to an exception being thrown, the
11954exception is compared against the ``args``. If it doesn't match, control will
11955not reach the ``catchpad`` instruction.  The representation of ``args`` is
11956entirely target and personality function-specific.
11957
11958Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
11959instruction must be the first non-phi of its parent basic block.
11960
11961The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
11962instructions is described in the
11963`Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.
11964
11965When a ``catchpad`` has been "entered" but not yet "exited" (as
11966described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
11967it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
11968that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
11969
11970Example:
11971""""""""
11972
11973.. code-block:: text
11974
11975    dispatch:
11976      %cs = catchswitch within none [label %handler0] unwind to caller
11977      ;; A catch block which can catch an integer.
11978    handler0:
11979      %tok = catchpad within %cs [ptr @_ZTIi]
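
A slightly fuller sketch follows, showing the handler calling a function with
a "funclet" bundle and leaving through ``catchret``. It assumes the MSVC C++
personality ``__CxxFrameHandler3``; the ``catchpad`` argument list shown is the
form commonly used for a catch-all handler under that personality and is an
assumption of this example, as are the names ``@may_throw`` and ``@handle``.

.. code-block:: text

    declare void @may_throw()
    declare void @handle()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %done unwind label %dispatch

    dispatch:
      %cs = catchswitch within none [label %handler0] unwind to caller

    handler0:
      ;; Arguments are personality-specific (assumed catch-all encoding).
      %tok = catchpad within %cs [ptr null, i32 64, ptr null]
      ;; Calls inside the entered catchpad carry a "funclet" bundle.
      call void @handle() [ "funclet"(token %tok) ]
      catchret from %tok to label %done

    done:
      ret void
    }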
11980
11981.. _i_cleanuppad:
11982
11983'``cleanuppad``' Instruction
11984^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11985
11986Syntax:
11987"""""""
11988
11989::
11990
11991      <resultval> = cleanuppad within <parent> [<args>*]
11992
11993Overview:
11994"""""""""
11995
11996The '``cleanuppad``' instruction is used by `LLVM's exception handling
11997system <ExceptionHandling.html#overview>`_ to specify that a basic block
11998is a cleanup block --- one where a personality routine attempts to
11999transfer control to run cleanup actions.
12000The ``args`` correspond to whatever additional
12001information the :ref:`personality function <personalityfn>` requires to
12002execute the cleanup.
12003The ``resultval`` has the type :ref:`token <t_token>` and is used to
12004match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
12005The ``parent`` argument is the token of the funclet that contains the
12006``cleanuppad`` instruction. If the ``cleanuppad`` is not inside a funclet,
12007this operand may be the token ``none``.
12008
12009Arguments:
12010""""""""""
12011
12012The instruction takes a list of arbitrary values which are interpreted
12013by the :ref:`personality function <personalityfn>`.
12014
12015Semantics:
12016""""""""""
12017
12018When the call stack is being unwound due to an exception being thrown,
12019the :ref:`personality function <personalityfn>` transfers control to the
12020``cleanuppad`` with the aid of the personality-specific arguments.
12021As with calling conventions, how the personality function results are
12022represented in LLVM IR is target specific.
12023
12024The ``cleanuppad`` instruction has several restrictions:
12025
12026-  A cleanup block is a basic block which is the unwind destination of
12027   an exceptional instruction.
12028-  A cleanup block must have a '``cleanuppad``' instruction as its
12029   first non-PHI instruction.
12030-  There can be only one '``cleanuppad``' instruction within the
12031   cleanup block.
12032-  A basic block that is not a cleanup block may not include a
12033   '``cleanuppad``' instruction.
12034
12035When a ``cleanuppad`` has been "entered" but not yet "exited" (as
12036described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
12037it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
12038that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
12039
12040Example:
12041""""""""
12042
12043.. code-block:: text
12044
12045      %tok = cleanuppad within %cs []
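
In context, a cleanup block might look like the sketch below. It assumes the
MSVC C++ personality ``__CxxFrameHandler3`` and uses hypothetical functions
``@may_throw`` and ``@release``. The cleanup is not nested in another funclet,
so its parent is ``none``.

.. code-block:: text

    declare void @may_throw()
    declare void @release()
    declare i32 @__CxxFrameHandler3(...)

    define void @g() personality ptr @__CxxFrameHandler3 {
    entry:
      invoke void @may_throw()
              to label %done unwind label %cleanup

    cleanup:
      %tok = cleanuppad within none []
      ;; Calls inside the entered cleanuppad carry a "funclet" bundle.
      call void @release() [ "funclet"(token %tok) ]
      ;; Leave the cleanup and continue unwinding into the caller.
      cleanupret from %tok unwind to caller

    done:
      ret void
    }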
12046
12047.. _intrinsics:
12048
12049Intrinsic Functions
12050===================
12051
12052LLVM supports the notion of an "intrinsic function". These functions
12053have well known names and semantics and are required to follow certain
12054restrictions. Overall, these intrinsics represent an extension mechanism
12055for the LLVM language that does not require changing all of the
12056transformations in LLVM when adding to the language (or the bitcode
12057reader/writer, the parser, etc...).
12058
12059Intrinsic function names must all start with an "``llvm.``" prefix. This
12060prefix is reserved in LLVM for intrinsic names; thus, function names may
12061not begin with this prefix. Intrinsic functions must always be external
12062functions: you cannot define the body of intrinsic functions. Intrinsic
12063functions may only be used in call or invoke instructions: it is illegal
to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, any intrinsic that is
added must be documented here.
12067
12068Some intrinsic functions can be overloaded, i.e., the intrinsic
12069represents a family of functions that perform the same operation but on
12070different data types. Because LLVM can represent over 8 million
12071different integer types, overloading is used commonly to allow an
12072intrinsic function to operate on any integer type. One or more of the
12073argument types or the result type can be overloaded to accept any
12074integer type. Argument types may also be defined as exactly matching a
12075previous argument's type or the result type. This allows an intrinsic
12076function which accepts multiple arguments, but needs all of them to be
12077of the same type, to only be overloaded with respect to a single
12078argument or the result.
12079
An overloaded intrinsic has the names of its overloaded argument
types encoded into its function name, each preceded by a period. Only
12082those types which are overloaded result in a name suffix. Arguments
12083whose type is matched against another type do not. For example, the
12084``llvm.ctpop`` function can take an integer of any width and returns an
12085integer of exactly the same integer width. This leads to a family of
12086functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
12087``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
12088overloaded, and only one type suffix is required. Because the argument's
12089type is matched against the return type, it does not require its own
12090name suffix.
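
As an illustration of the naming scheme, the sketch below declares three
members of the ``llvm.ctpop`` family and calls one of them. The wrapper
function ``@popcount8`` is a hypothetical name used only for this example.

.. code-block:: llvm

    ; One suffix per overloaded type; the argument type matches the result
    ; type, so it contributes no suffix of its own.
    declare i8 @llvm.ctpop.i8(i8)
    declare i29 @llvm.ctpop.i29(i29)
    declare <4 x i32> @llvm.ctpop.v4i32(<4 x i32>)

    define i8 @popcount8(i8 %val) {
      %n = call i8 @llvm.ctpop.i8(i8 %val)
      ret i8 %n
    }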
12091
:ref:`Unnamed types <t_opaque>` are encoded as ``s_s``. Overloaded intrinsics
that depend on an unnamed type in one of their overloaded argument types get an
additional ``.<number>`` suffix. This allows differentiating intrinsics with
different unnamed types as arguments (for example,
``llvm.ssa.copy.p0s_s.2(%42*)``). The number is tracked in the LLVM module and
12097it ensures unique names in the module. While linking together two modules, it is
12098still possible to get a name clash. In that case one of the names will be
12099changed by getting a new number.
12100
12101For target developers who are defining intrinsics for back-end code
generation, any intrinsic overloads based solely on the distinction between
12103integer or floating point types should not be relied upon for correct
12104code generation. In such cases, the recommended approach for target
12105maintainers when defining intrinsics is to create separate integer and
12106FP intrinsics rather than rely on overloading. For example, if different
12107codegen is required for ``llvm.target.foo(<4 x i32>)`` and
12108``llvm.target.foo(<4 x float>)`` then these should be split into
12109different intrinsics.
12110
12111To learn how to add an intrinsic function, please see the `Extending
12112LLVM Guide <ExtendingLLVM.html>`_.
12113
12114.. _int_varargs:
12115
12116Variable Argument Handling Intrinsics
12117-------------------------------------
12118
12119Variable argument support is defined in LLVM with the
12120:ref:`va_arg <i_va_arg>` instruction and these three intrinsic
12121functions. These functions are related to the similarly named macros
12122defined in the ``<stdarg.h>`` header file.
12123
12124All of these functions operate on arguments that use a target-specific
12125value type "``va_list``". The LLVM assembly language reference manual
12126does not define what this type is, so all transformations should be
12127prepared to handle these functions regardless of the type used.
12128
12129This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
12130variable argument handling intrinsic functions are used.
12131
12132.. code-block:: llvm
12133
12134    ; This struct is different for every platform. For most platforms,
12135    ; it is merely a ptr.
12136    %struct.va_list = type { ptr }
12137
12138    ; For Unix x86_64 platforms, va_list is the following struct:
12139    ; %struct.va_list = type { i32, i32, ptr, ptr }
12140
12141    define i32 @test(i32 %X, ...) {
12142      ; Initialize variable argument processing
12143      %ap = alloca %struct.va_list
12144      call void @llvm.va_start(ptr %ap)
12145
12146      ; Read a single integer argument
12147      %tmp = va_arg ptr %ap, i32
12148
12149      ; Demonstrate usage of llvm.va_copy and llvm.va_end
12150      %aq = alloca ptr
12151      call void @llvm.va_copy(ptr %aq, ptr %ap)
12152      call void @llvm.va_end(ptr %aq)
12153
12154      ; Stop processing of arguments.
12155      call void @llvm.va_end(ptr %ap)
12156      ret i32 %tmp
12157    }
12158
12159    declare void @llvm.va_start(ptr)
12160    declare void @llvm.va_copy(ptr, ptr)
12161    declare void @llvm.va_end(ptr)
12162
12163.. _int_va_start:
12164
12165'``llvm.va_start``' Intrinsic
12166^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12167
12168Syntax:
12169"""""""
12170
12171::
12172
12173      declare void @llvm.va_start(ptr <arglist>)
12174
12175Overview:
12176"""""""""
12177
12178The '``llvm.va_start``' intrinsic initializes ``<arglist>`` for
12179subsequent use by ``va_arg``.
12180
12181Arguments:
12182""""""""""
12183
12184The argument is a pointer to a ``va_list`` element to initialize.
12185
12186Semantics:
12187""""""""""
12188
12189The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
12190available in C. In a target-dependent way, it initializes the
12191``va_list`` element to which the argument points, so that the next call
12192to ``va_arg`` will produce the first variable argument passed to the
12193function. Unlike the C ``va_start`` macro, this intrinsic does not need
12194to know the last argument of the function as the compiler can figure
12195that out.
12196
12197'``llvm.va_end``' Intrinsic
12198^^^^^^^^^^^^^^^^^^^^^^^^^^^
12199
12200Syntax:
12201"""""""
12202
12203::
12204
12205      declare void @llvm.va_end(ptr <arglist>)
12206
12207Overview:
12208"""""""""
12209
12210The '``llvm.va_end``' intrinsic destroys ``<arglist>``, which has been
12211initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.
12212
12213Arguments:
12214""""""""""
12215
12216The argument is a pointer to a ``va_list`` to destroy.
12217
12218Semantics:
12219""""""""""
12220
12221The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
12222available in C. In a target-dependent way, it destroys the ``va_list``
12223element to which the argument points. Calls to
12224:ref:`llvm.va_start <int_va_start>` and
12225:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
12226``llvm.va_end``.
12227
12228.. _int_va_copy:
12229
12230'``llvm.va_copy``' Intrinsic
12231^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12232
12233Syntax:
12234"""""""
12235
12236::
12237
12238      declare void @llvm.va_copy(ptr <destarglist>, ptr <srcarglist>)
12239
12240Overview:
12241"""""""""
12242
12243The '``llvm.va_copy``' intrinsic copies the current argument position
12244from the source argument list to the destination argument list.
12245
12246Arguments:
12247""""""""""
12248
12249The first argument is a pointer to a ``va_list`` element to initialize.
12250The second argument is a pointer to a ``va_list`` element to copy from.
12251
12252Semantics:
12253""""""""""
12254
12255The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
12256available in C. In a target-dependent way, it copies the source
12257``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
12259arbitrarily complex and require, for example, memory allocation.
12260
12261Accurate Garbage Collection Intrinsics
12262--------------------------------------
12263
12264LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
12265(GC) requires the frontend to generate code containing appropriate intrinsic
12266calls and select an appropriate GC strategy which knows how to lower these
12267intrinsics in a manner which is appropriate for the target collector.
12268
12269These intrinsics allow identification of :ref:`GC roots on the
12270stack <int_gcroot>`, as well as garbage collector implementations that
12271require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
12272Frontends for type-safe garbage collected languages should generate
12273these intrinsics to make use of the LLVM garbage collectors. For more
12274details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.
12275
LLVM provides a second experimental set of intrinsics for describing garbage
12277collection safepoints in compiled code. These intrinsics are an alternative
12278to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
12279:ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
12280differences in approach are covered in the `Garbage Collection with LLVM
12281<GarbageCollection.html>`_ documentation. The intrinsics themselves are
12282described in :doc:`Statepoints`.
12283
12284.. _int_gcroot:
12285
12286'``llvm.gcroot``' Intrinsic
12287^^^^^^^^^^^^^^^^^^^^^^^^^^^
12288
12289Syntax:
12290"""""""
12291
12292::
12293
12294      declare void @llvm.gcroot(ptr %ptrloc, ptr %metadata)
12295
12296Overview:
12297"""""""""
12298
12299The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
12300the code generator, and allows some metadata to be associated with it.
12301
12302Arguments:
12303""""""""""
12304
12305The first argument specifies the address of a stack object that contains
12306the root pointer. The second pointer (which must be either a constant or
12307a global value address) contains the meta-data to be associated with the
12308root.
12309
12310Semantics:
12311""""""""""
12312
12313At runtime, a call to this intrinsic stores a null pointer into the
12314"ptrloc" location. At compile-time, the code generator generates
12315information to allow the runtime to find the pointer at GC safe points.
12316The '``llvm.gcroot``' intrinsic may only be used in a function which
12317:ref:`specifies a GC algorithm <gc>`.
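
A minimal sketch of registering a root is shown below. It assumes the built-in
``shadow-stack`` GC strategy, passes no metadata (``null``), and calls a
hypothetical runtime allocator ``@allocate_object``.

.. code-block:: llvm

    declare void @llvm.gcroot(ptr %ptrloc, ptr %metadata)
    declare ptr @allocate_object()   ; hypothetical runtime allocator

    define void @with_root() gc "shadow-stack" {
    entry:
      ; The root is a stack slot; the gcroot call stores null into it.
      %root = alloca ptr
      call void @llvm.gcroot(ptr %root, ptr null)

      ; Storing the reference into the root keeps the object visible to the
      ; collector at safe points.
      %obj = call ptr @allocate_object()
      store ptr %obj, ptr %root
      ret void
    }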
12318
12319.. _int_gcread:
12320
12321'``llvm.gcread``' Intrinsic
12322^^^^^^^^^^^^^^^^^^^^^^^^^^^
12323
12324Syntax:
12325"""""""
12326
12327::
12328
12329      declare ptr @llvm.gcread(ptr %ObjPtr, ptr %Ptr)
12330
12331Overview:
12332"""""""""
12333
12334The '``llvm.gcread``' intrinsic identifies reads of references from heap
12335locations, allowing garbage collector implementations that require read
12336barriers.
12337
12338Arguments:
12339""""""""""
12340
12341The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
12343pointer to the start of the referenced object, if needed by the language
12344runtime (otherwise null).
12345
12346Semantics:
12347""""""""""
12348
12349The '``llvm.gcread``' intrinsic has the same semantics as a load
12350instruction, but may be replaced with substantially more complex code by
12351the garbage collector runtime, as needed. The '``llvm.gcread``'
12352intrinsic may only be used in a function which :ref:`specifies a GC
12353algorithm <gc>`.
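
For illustration, the sketch below reads a reference field through the read
barrier. The object layout ``{ i64, ptr }``, the function name ``@load_field``,
and the choice of the ``shadow-stack`` strategy are assumptions made for this
example.

.. code-block:: llvm

    declare ptr @llvm.gcread(ptr %ObjPtr, ptr %Ptr)

    define ptr @load_field(ptr %obj) gc "shadow-stack" {
    entry:
      ; Address of the reference field inside the (assumed) object layout.
      %field = getelementptr inbounds { i64, ptr }, ptr %obj, i32 0, i32 1
      ; First operand: start of the object; second operand: address to read.
      %ref = call ptr @llvm.gcread(ptr %obj, ptr %field)
      ret ptr %ref
    }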
12354
12355.. _int_gcwrite:
12356
12357'``llvm.gcwrite``' Intrinsic
12358^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12359
12360Syntax:
12361"""""""
12362
12363::
12364
12365      declare void @llvm.gcwrite(ptr %P1, ptr %Obj, ptr %P2)
12366
12367Overview:
12368"""""""""
12369
12370The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
12371locations, allowing garbage collector implementations that require write
12372barriers (such as generational or reference counting collectors).
12373
12374Arguments:
12375""""""""""
12376
12377The first argument is the reference to store, the second is the start of
12378the object to store it to, and the third is the address of the field of
12379Obj to store to. If the runtime does not require a pointer to the
12380object, Obj may be null.
12381
12382Semantics:
12383""""""""""
12384
12385The '``llvm.gcwrite``' intrinsic has the same semantics as a store
12386instruction, but may be replaced with substantially more complex code by
12387the garbage collector runtime, as needed. The '``llvm.gcwrite``'
12388intrinsic may only be used in a function which :ref:`specifies a GC
12389algorithm <gc>`.
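
For illustration, the sketch below stores a reference into an object field
through the write barrier. The object layout ``{ i64, ptr }``, the function
name ``@store_field``, and the choice of the ``shadow-stack`` strategy are
assumptions made for this example.

.. code-block:: llvm

    declare void @llvm.gcwrite(ptr %P1, ptr %Obj, ptr %P2)

    define void @store_field(ptr %obj, ptr %newref) gc "shadow-stack" {
    entry:
      ; Address of the reference field inside the (assumed) object layout.
      %field = getelementptr inbounds { i64, ptr }, ptr %obj, i32 0, i32 1
      ; Operands: reference to store, start of the object, field address.
      call void @llvm.gcwrite(ptr %newref, ptr %obj, ptr %field)
      ret void
    }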
12390
12391
12392.. _gc_statepoint:
12393
12394'llvm.experimental.gc.statepoint' Intrinsic
12395^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12396
12397Syntax:
12398"""""""
12399
12400::
12401
12402      declare token
12403        @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
12404                       ptr elementtype(func_type) <target>,
12405                       i64 <#call args>, i64 <flags>,
12406                       ... (call parameters),
12407                       i64 0, i64 0)
12408
12409Overview:
12410"""""""""
12411
12412The statepoint intrinsic represents a call which is parse-able by the
12413runtime.
12414
12415Operands:
12416"""""""""
12417
12418The 'id' operand is a constant integer that is reported as the ID
12419field in the generated stackmap.  LLVM does not interpret this
12420parameter in any way and its meaning is up to the statepoint user to
12421decide.  Note that LLVM is free to duplicate code containing
12422statepoint calls, and this may transform IR that had a unique 'id' per
12423lexical call to statepoint to IR that does not.
12424
12425If 'num patch bytes' is non-zero then the call instruction
12426corresponding to the statepoint is not emitted and LLVM emits 'num
12427patch bytes' bytes of nops in its place.  LLVM will emit code to
12428prepare the function arguments and retrieve the function return value
in accordance with the calling convention; the former before the nop
12430sequence and the latter after the nop sequence.  It is expected that
12431the user will patch over the 'num patch bytes' bytes of nops with a
12432calling sequence specific to their runtime before executing the
12433generated machine code.  There are no guarantees with respect to the
alignment of the nop sequence.  Unlike :doc:`StackMaps`, statepoints do
12435not have a concept of shadow bytes.  Note that semantically the
12436statepoint still represents a call or invoke to 'target', and the nop
12437sequence after patching is expected to represent an operation
12438equivalent to a call or invoke to 'target'.
12439
12440The 'target' operand is the function actually being called. The operand
12441must have an :ref:`elementtype <attr_elementtype>` attribute specifying
12442the function type of the target. The target can be specified as either
12443a symbolic LLVM function, or as an arbitrary Value of pointer type. Note
12444that the function type must match the signature of the callee and the
12445types of the 'call parameters' arguments.
12446
12447The '#call args' operand is the number of arguments to the actual
12448call.  It must exactly match the number of arguments passed in the
12449'call parameters' variable length section.
12450
12451The 'flags' operand is used to specify extra information about the
12452statepoint. This is currently only used to mark certain statepoints
12453as GC transitions. This operand is a 64-bit integer with the following
12454layout, where bit 0 is the least significant bit:
12455
12456  +-------+---------------------------------------------------+
12457  | Bit # | Usage                                             |
12458  +=======+===================================================+
12459  |     0 | Set if the statepoint is a GC transition, cleared |
12460  |       | otherwise.                                        |
12461  +-------+---------------------------------------------------+
12462  |  1-63 | Reserved for future use; must be cleared.         |
12463  +-------+---------------------------------------------------+
12464
12465The 'call parameters' arguments are simply the arguments which need to
12466be passed to the call target.  They will be lowered according to the
12467specified calling convention and otherwise handled like a normal call
12468instruction.  The number of arguments must exactly match what is
specified in '#call args'.  The types must match the signature of
12470'target'.
12471
The 'call parameters' arguments must be followed by two 'i64 0' constants.
These were originally the length prefixes for 'gc transition parameter' and
'deopt parameter' arguments, but the role of these parameter sets has been
entirely replaced with the corresponding operand bundles.  In a future
12476revision, these now redundant arguments will be removed.
12477
12478Semantics:
12479""""""""""
12480
12481A statepoint is assumed to read and write all memory.  As a result,
memory operations cannot be reordered past a statepoint.  It is
12483illegal to mark a statepoint as being either 'readonly' or 'readnone'.
12484
Note that legal IR cannot perform any memory operation on a 'gc
12486pointer' argument of the statepoint in a location statically reachable
12487from the statepoint.  Instead, the explicitly relocated value (from a
12488``gc.relocate``) must be used.
12489
12490'llvm.experimental.gc.result' Intrinsic
12491^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12492
12493Syntax:
12494"""""""
12495
12496::
12497
12498      declare type
12499        @llvm.experimental.gc.result(token %statepoint_token)
12500
12501Overview:
12502"""""""""
12503
12504``gc.result`` extracts the result of the original call instruction
12505which was replaced by the ``gc.statepoint``.  The ``gc.result``
12506intrinsic is actually a family of three intrinsics due to an
12507implementation limitation.  Other than the type of the return value,
12508the semantics are the same.
12509
12510Operands:
12511"""""""""
12512
12513The first and only argument is the ``gc.statepoint`` which starts
12514the safepoint sequence of which this ``gc.result`` is a part.
12515Despite the typing of this as a generic token, *only* the value defined
12516by a ``gc.statepoint`` is legal here.
12517
12518Semantics:
12519""""""""""
12520
12521The ``gc.result`` represents the return value of the call target of
12522the ``statepoint``.  The type of the ``gc.result`` must exactly match
the return type of the target.  If the call target returns void, there will
12524be no ``gc.result``.
12525
12526A ``gc.result`` is modeled as a 'readnone' pure function.  It has no
12527side effects since it is just a projection of the return value of the
12528previous call represented by the ``gc.statepoint``.
12529
12530'llvm.experimental.gc.relocate' Intrinsic
12531^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12532
12533Syntax:
12534"""""""
12535
12536::
12537
12538      declare <pointer type>
12539        @llvm.experimental.gc.relocate(token %statepoint_token,
12540                                       i32 %base_offset,
12541                                       i32 %pointer_offset)
12542
12543Overview:
12544"""""""""
12545
12546A ``gc.relocate`` returns the potentially relocated value of a pointer
12547at the safepoint.
12548
12549Operands:
12550"""""""""
12551
12552The first argument is the ``gc.statepoint`` which starts the
safepoint sequence of which this ``gc.relocate`` is a part.
12554Despite the typing of this as a generic token, *only* the value defined
12555by a ``gc.statepoint`` is legal here.
12556
12557The second and third arguments are both indices into operands of the
12558corresponding statepoint's :ref:`gc-live <ob_gc_live>` operand bundle.
12559
12560The second argument is an index which specifies the allocation for the pointer
12561being relocated. The associated value must be within the object with which the
12562pointer being relocated is associated. The optimizer is free to change *which*
12563interior derived pointer is reported, provided that it does not replace an
12564actual base pointer with another interior derived pointer. Collectors are
12565allowed to rely on the base pointer operand remaining an actual base pointer if
12566so constructed.
12567
The third argument is an index which specifies the (potentially) derived pointer
12569being relocated.  It is legal for this index to be the same as the second
12570argument if-and-only-if a base pointer is being relocated.
12571
12572Semantics:
12573""""""""""
12574
12575The return value of ``gc.relocate`` is the potentially relocated value
12576of the pointer specified by its arguments.  It is unspecified how the
12577value of the returned pointer relates to the argument to the
12578``gc.statepoint`` other than that a) it points to the same source
12579language object with the same offset, and b) the 'based-on'
12580relationship of the newly relocated pointers is a projection of the
12581unrelocated pointers.  In particular, the integer value of the pointer
12582returned is unspecified.
12583
12584A ``gc.relocate`` is modeled as a ``readnone`` pure function.  It has no
12585side effects since it is just a way to extract information about work
12586done during the actual call modeled by the ``gc.statepoint``.
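
As an illustration, the following sketch relocates a base pointer and a
pointer derived from it across one safepoint. It assumes the opaque-pointer
spelling of the intrinsic names (the ``.p0`` and ``.p1`` suffixes); the callee
``@foo``, the function name ``@relocate_example``, and the 8-byte offset are
hypothetical and chosen only for the example.

.. code-block:: llvm

      declare void @foo()   ; hypothetical call target
      declare token @llvm.experimental.gc.statepoint.p0(i64, i32, ptr, i32, i32, ...)
      declare ptr addrspace(1) @llvm.experimental.gc.relocate.p1(token, i32, i32)

      define ptr addrspace(1) @relocate_example(ptr addrspace(1) %obj)
          gc "statepoint-example" {
        ; An interior pointer derived from the base pointer %obj.
        %derived = getelementptr i8, ptr addrspace(1) %obj, i64 8
        ; gc-live index 0 is the base, index 1 is the derived pointer.
        %tok = call token (i64, i32, ptr, i32, i32, ...)
            @llvm.experimental.gc.statepoint.p0(i64 0, i32 0,
                ptr elementtype(void ()) @foo, i32 0, i32 0, i32 0, i32 0)
            ["gc-live"(ptr addrspace(1) %obj, ptr addrspace(1) %derived)]
        ; Relocating the base itself: both indices are the same.
        %obj.relocated = call ptr addrspace(1)
            @llvm.experimental.gc.relocate.p1(token %tok, i32 0, i32 0)
        ; Relocating the derived pointer: base index 0, derived index 1.
        %derived.relocated = call ptr addrspace(1)
            @llvm.experimental.gc.relocate.p1(token %tok, i32 0, i32 1)
        ret ptr addrspace(1) %derived.relocated
      }

Only the relocated values should be used after the safepoint; the integer
values of ``%obj`` and ``%obj.relocated`` are not required to be equal.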
12587
12588.. _gc.get.pointer.base:
12589
12590'llvm.experimental.gc.get.pointer.base' Intrinsic
12591^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12592
12593Syntax:
12594"""""""
12595
12596::
12597
12598      declare <pointer type>
12599        @llvm.experimental.gc.get.pointer.base(
12600          <pointer type> readnone nocapture %derived_ptr)
12601          nounwind readnone willreturn
12602
12603Overview:
12604"""""""""
12605
12606``gc.get.pointer.base`` for a derived pointer returns its base pointer.
12607
12608Operands:
12609"""""""""
12610
12611The only argument is a pointer which is based on some object with
12612an unknown offset from the base of said object.
12613
12614Semantics:
12615""""""""""
12616
12617This intrinsic is used in the abstract machine model for GC to represent
12618the base pointer for an arbitrary derived pointer.
12619
This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
replacing all uses of this callsite with the base pointer value of the
derived pointer. The replacement is done as part of the lowering to the
explicit statepoint model.
12624
12625The return pointer type must be the same as the type of the parameter.
12626
12627
12628'llvm.experimental.gc.get.pointer.offset' Intrinsic
12629^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12630
12631Syntax:
12632"""""""
12633
12634::
12635
12636      declare i64
12637        @llvm.experimental.gc.get.pointer.offset(
12638          <pointer type> readnone nocapture %derived_ptr)
12639          nounwind readnone willreturn
12640
12641Overview:
12642"""""""""
12643
12644``gc.get.pointer.offset`` for a derived pointer returns the offset from its
12645base pointer.
12646
12647Operands:
12648"""""""""
12649
12650The only argument is a pointer which is based on some object with
12651an unknown offset from the base of said object.
12652
12653Semantics:
12654""""""""""
12655
12656This intrinsic is used in the abstract machine model for GC to represent
12657the offset of an arbitrary derived pointer from its base pointer.
12658
12659This intrinsic is inlined by the :ref:`RewriteStatepointsForGC` pass by
12660replacing all uses of this callsite with the offset of a derived pointer from
12661its base pointer value. The replacement is done as part of the lowering to the
12662explicit statepoint model.
12663
In effect, this call computes the difference between the derived pointer and
its base pointer (see :ref:`gc.get.pointer.base`), with both converted to
integers as if by ``ptrtoint``. Performing that conversion outside the
:ref:`RewriteStatepointsForGC` pass, however, could lose track of the pointers
that are needed for further lowering from the abstract model to the explicit
physical one.
12669
12670Code Generator Intrinsics
12671-------------------------
12672
12673These intrinsics are provided by LLVM to expose special features that
12674may only be implemented with code generator support.
12675
12676'``llvm.returnaddress``' Intrinsic
12677^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12678
12679Syntax:
12680"""""""
12681
12682::
12683
12684      declare ptr @llvm.returnaddress(i32 <level>)
12685
12686Overview:
12687"""""""""
12688
12689The '``llvm.returnaddress``' intrinsic attempts to compute a
12690target-specific value indicating the return address of the current
12691function or one of its callers.
12692
12693Arguments:
12694""""""""""
12695
12696The argument to this intrinsic indicates which function to return the
12697address for. Zero indicates the calling function, one indicates its
12698caller, etc. The argument is **required** to be a constant integer
12699value.
12700
12701Semantics:
12702""""""""""
12703
12704The '``llvm.returnaddress``' intrinsic either returns a pointer
12705indicating the return address of the specified call frame, or zero if it
12706cannot be identified. The value returned by this intrinsic is likely to
12707be incorrect or 0 for arguments other than zero, so it should only be
12708used for debugging purposes.
12709
12710Note that calling this intrinsic does not prevent function inlining or
12711other aggressive transformations, so the value returned may not be that
12712of the obvious source-language caller.
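
A minimal usage sketch follows. The function name ``@current_return_address``
is hypothetical; the level argument is the constant zero, which is the only
value the text above describes as likely to be reliable.

.. code-block:: llvm

      declare ptr @llvm.returnaddress(i32)

      ; Hypothetical helper: returns the address this function will return to.
      define ptr @current_return_address() {
        ; The level must be a constant integer; 0 selects the current function.
        %ra = call ptr @llvm.returnaddress(i32 0)
        ret ptr %ra
      }

If this helper is inlined, the value observed is that of the function it was
inlined into, which is one reason to treat the result as debugging
information only.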
12713
12714'``llvm.addressofreturnaddress``' Intrinsic
12715^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12716
12717Syntax:
12718"""""""
12719
12720::
12721
12722      declare ptr @llvm.addressofreturnaddress()
12723
12724Overview:
12725"""""""""
12726
12727The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
12728pointer to the place in the stack frame where the return address of the
12729current function is stored.
12730
12731Semantics:
12732""""""""""
12733
12734Note that calling this intrinsic does not prevent function inlining or
12735other aggressive transformations, so the value returned may not be that
12736of the obvious source-language caller.
12737
This intrinsic is only implemented for x86 and AArch64.
12739
12740'``llvm.sponentry``' Intrinsic
12741^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12742
12743Syntax:
12744"""""""
12745
12746::
12747
12748      declare ptr @llvm.sponentry()
12749
12750Overview:
12751"""""""""
12752
12753The '``llvm.sponentry``' intrinsic returns the stack pointer value at
12754the entry of the current function calling this intrinsic.
12755
12756Semantics:
12757""""""""""
12758
12759Note this intrinsic is only verified on AArch64 and ARM.
12760
12761'``llvm.frameaddress``' Intrinsic
12762^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12763
12764Syntax:
12765"""""""
12766
12767::
12768
12769      declare ptr @llvm.frameaddress(i32 <level>)
12770
12771Overview:
12772"""""""""
12773
12774The '``llvm.frameaddress``' intrinsic attempts to return the
12775target-specific frame pointer value for the specified stack frame.
12776
12777Arguments:
12778""""""""""
12779
12780The argument to this intrinsic indicates which function to return the
12781frame pointer for. Zero indicates the calling function, one indicates
12782its caller, etc. The argument is **required** to be a constant integer
12783value.
12784
12785Semantics:
12786""""""""""
12787
12788The '``llvm.frameaddress``' intrinsic either returns a pointer
12789indicating the frame address of the specified call frame, or zero if it
12790cannot be identified. The value returned by this intrinsic is likely to
12791be incorrect or 0 for arguments other than zero, so it should only be
12792used for debugging purposes.
12793
12794Note that calling this intrinsic does not prevent function inlining or
12795other aggressive transformations, so the value returned may not be that
12796of the obvious source-language caller.
12797
12798'``llvm.swift.async.context.addr``' Intrinsic
12799^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12800
12801Syntax:
12802"""""""
12803
12804::
12805
12806      declare ptr @llvm.swift.async.context.addr()
12807
12808Overview:
12809"""""""""
12810
12811The '``llvm.swift.async.context.addr``' intrinsic returns a pointer to
12812the part of the extended frame record containing the asynchronous
12813context of a Swift execution.
12814
12815Semantics:
12816""""""""""
12817
12818If the caller has a ``swiftasync`` parameter, that argument will initially
12819be stored at the returned address. If not, it will be initialized to null.
12820
12821'``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
12822^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12823
12824Syntax:
12825"""""""
12826
12827::
12828
12829      declare void @llvm.localescape(...)
12830      declare ptr @llvm.localrecover(ptr %func, ptr %fp, i32 %idx)
12831
12832Overview:
12833"""""""""
12834
12835The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
12836allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
12837live frame pointer to recover the address of the allocation. The offset is
12838computed during frame layout of the caller of ``llvm.localescape``.
12839
12840Arguments:
12841""""""""""
12842
12843All arguments to '``llvm.localescape``' must be pointers to static allocas or
12844casts of static allocas. Each function can only call '``llvm.localescape``'
12845once, and it can only do so from the entry block.
12846
12847The ``func`` argument to '``llvm.localrecover``' must be a constant
12848bitcasted pointer to a function defined in the current module. The code
12849generator cannot determine the frame allocation offset of functions defined in
12850other modules.
12851
12852The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
12853call frame that is currently live. The return value of '``llvm.localaddress``'
12854is one way to produce such a value, but various runtimes also expose a suitable
12855pointer in platform-specific ways.
12856
12857The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
12858'``llvm.localescape``' to recover. It is zero-indexed.
12859
12860Semantics:
12861""""""""""
12862
12863These intrinsics allow a group of functions to share access to a set of local
stack allocations of one parent function. The parent function may call the
12865'``llvm.localescape``' intrinsic once from the function entry block, and the
12866child functions can use '``llvm.localrecover``' to access the escaped allocas.
12867The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
12868the escaped allocas are allocated, which would break attempts to use
12869'``llvm.localrecover``'.
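
The parent/child relationship can be sketched as follows. The function names
``@parent`` and ``@child`` are hypothetical, and the frame pointer is obtained
with ``llvm.localaddress``, which is only one of the ways mentioned above to
produce a suitable value.

.. code-block:: llvm

      declare void @llvm.localescape(...)
      declare ptr @llvm.localrecover(ptr, ptr, i32)
      declare ptr @llvm.localaddress()

      define void @parent() {
      entry:
        %x = alloca i32
        ; Escape the static alloca; this must happen once, in the entry block.
        call void (...) @llvm.localescape(ptr %x)
        store i32 0, ptr %x
        ; Hand the live frame pointer of @parent to the child.
        %fp = call ptr @llvm.localaddress()
        call void @child(ptr %fp)
        ret void
      }

      define internal void @child(ptr %fp) {
        ; Index 0 selects the first alloca passed to llvm.localescape.
        %x.addr = call ptr @llvm.localrecover(ptr @parent, ptr %fp, i32 0)
        store i32 42, ptr %x.addr
        ret void
      }

The child is only valid to call while the frame of ``@parent`` identified by
``%fp`` is still live.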
12870
12871'``llvm.seh.try.begin``' and '``llvm.seh.try.end``' Intrinsics
12872^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12873
12874Syntax:
12875"""""""
12876
12877::
12878
12879      declare void @llvm.seh.try.begin()
12880      declare void @llvm.seh.try.end()
12881
12882Overview:
12883"""""""""
12884
12885The '``llvm.seh.try.begin``' and '``llvm.seh.try.end``' intrinsics mark
the boundary of a _try region for Windows SEH Asynchronous Exception Handling.
12887
12888Semantics:
12889""""""""""
12890
When a C function is compiled with the Windows SEH Asynchronous Exception
option, -feh_asynch (aka MSVC -EHa), these two intrinsics are injected to mark
the _try boundary and to prevent potential exceptions from being moved across
that boundary.
12894Any set of operations can then be confined to the region by reading their leaf
12895inputs via volatile loads and writing their root outputs via volatile stores.
12896
12897'``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' Intrinsics
12898^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12899
12900Syntax:
12901"""""""
12902
12903::
12904
12905      declare void @llvm.seh.scope.begin()
12906      declare void @llvm.seh.scope.end()
12907
12908Overview:
12909"""""""""
12910
12911The '``llvm.seh.scope.begin``' and '``llvm.seh.scope.end``' intrinsics mark
the boundary of a C++ object lifetime for Windows SEH Asynchronous Exception
12913Handling (MSVC option -EHa).
12914
12915Semantics:
12916""""""""""
12917
LLVM's ordinary exception-handling representation associates EH cleanups and
handlers only with ``invoke`` instructions, which normally correspond only to
call sites.  To support arbitrary faulting instructions, it must be possible
to recover the current
12921EH scope for any instruction.  Turning every operation in LLVM that could fault
12922into an ``invoke`` of a new, potentially-throwing intrinsic would require adding a
12923large number of intrinsics, impede optimization of those operations, and make
12924compilation slower by introducing many extra basic blocks.  These intrinsics can
12925be used instead to mark the region protected by a cleanup, such as for a local
12926C++ object with a non-trivial destructor.  ``llvm.seh.scope.begin`` is used to mark
12927the start of the region; it is always called with ``invoke``, with the unwind block
12928being the desired unwind destination for any potentially-throwing instructions
within the region.  ``llvm.seh.scope.end`` is used to mark when the scope ends
12930and the EH cleanup is no longer required (e.g. because the destructor is being
12931called).
12932
12933.. _int_read_register:
12934.. _int_read_volatile_register:
12935.. _int_write_register:
12936
12937'``llvm.read_register``', '``llvm.read_volatile_register``', and '``llvm.write_register``' Intrinsics
12938^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12939
12940Syntax:
12941"""""""
12942
12943::
12944
12945      declare i32 @llvm.read_register.i32(metadata)
12946      declare i64 @llvm.read_register.i64(metadata)
12947      declare i32 @llvm.read_volatile_register.i32(metadata)
12948      declare i64 @llvm.read_volatile_register.i64(metadata)
      declare void @llvm.write_register.i32(metadata, i32 %value)
      declare void @llvm.write_register.i64(metadata, i64 %value)
12951      !0 = !{!"sp\00"}
12952
12953Overview:
12954"""""""""
12955
12956The '``llvm.read_register``', '``llvm.read_volatile_register``', and
12957'``llvm.write_register``' intrinsics provide access to the named register.
12958The register must be valid on the architecture being compiled to. The type
12959needs to be compatible with the register being read.
12960
12961Semantics:
12962""""""""""
12963
12964The '``llvm.read_register``' and '``llvm.read_volatile_register``' intrinsics
12965return the current value of the register, where possible. The
12966'``llvm.write_register``' intrinsic sets the current value of the register,
12967where possible.
12968
12969A call to '``llvm.read_volatile_register``' is assumed to have side-effects
12970and possibly return a different value each time (e.g. for a timer register).
12971
12972This is useful to implement named register global variables that need
12973to always be mapped to a specific register, as is common practice on
12974bare-metal programs including OS kernels.
12975
12976The compiler doesn't check for register availability or use of the used
12977register in surrounding code, including inline assembly. Because of that,
12978allocatable registers are not supported.
12979
Warning: So far it only works with the stack pointer on selected
architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
work is needed to support other registers and, even more so, allocatable
registers.
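
A short sketch of reading the stack pointer is shown below. It reuses the
``!{!"sp\00"}`` metadata form from the Syntax section and assumes a target on
which the stack pointer register is named ``sp`` (for example AArch64); the
function name ``@get_stack_pointer`` is hypothetical.

.. code-block:: llvm

      declare i64 @llvm.read_register.i64(metadata)

      ; Hypothetical helper: returns the current stack pointer as an integer.
      define i64 @get_stack_pointer() {
        ; The metadata operand names the register to read.
        %sp = call i64 @llvm.read_register.i64(metadata !0)
        ret i64 %sp
      }

      ; Register name; assumed to be valid for the target being compiled to.
      !0 = !{!"sp\00"}

On other targets the register name in the metadata string would need to
change accordingly.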
12984
12985.. _int_stacksave:
12986
12987'``llvm.stacksave``' Intrinsic
12988^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12989
12990Syntax:
12991"""""""
12992
12993::
12994
12995      declare ptr @llvm.stacksave()
12996
12997Overview:
12998"""""""""
12999
13000The '``llvm.stacksave``' intrinsic is used to remember the current state
13001of the function stack, for use with
13002:ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
implementing language features like scoped automatic variable-sized
arrays in C99.
13005
13006Semantics:
13007""""""""""
13008
13009This intrinsic returns an opaque pointer value that can be passed to
13010:ref:`llvm.stackrestore <int_stackrestore>`. When an
13011``llvm.stackrestore`` intrinsic is executed with a value saved from
13012``llvm.stacksave``, it effectively restores the state of the stack to
13013the state it was in when the ``llvm.stacksave`` intrinsic executed. In
13014practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that
13015were allocated after the ``llvm.stacksave`` was executed.
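
The pairing with ``llvm.stackrestore`` can be sketched as follows for a scoped
variable-sized array. The function ``@scoped_vla`` and the external function
``@use`` are hypothetical names introduced for the example.

.. code-block:: llvm

      declare ptr @llvm.stacksave()
      declare void @llvm.stackrestore(ptr)
      declare void @use(ptr)   ; hypothetical consumer of the array

      define void @scoped_vla(i64 %n) {
      entry:
        ; Remember the stack state at the start of the scope.
        %saved = call ptr @llvm.stacksave()
        ; Dynamic alloca: %n elements of i32.
        %vla = alloca i32, i64 %n
        call void @use(ptr %vla)
        ; Leaving the scope pops %vla off the stack again.
        call void @llvm.stackrestore(ptr %saved)
        ret void
      }

After the restore, ``%vla`` must not be accessed, since its storage has been
popped.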
13016
13017.. _int_stackrestore:
13018
13019'``llvm.stackrestore``' Intrinsic
13020^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13021
13022Syntax:
13023"""""""
13024
13025::
13026
13027      declare void @llvm.stackrestore(ptr %ptr)
13028
13029Overview:
13030"""""""""
13031
13032The '``llvm.stackrestore``' intrinsic is used to restore the state of
13033the function stack to the state it was in when the corresponding
13034:ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
useful for implementing language features like scoped automatic
variable-sized arrays in C99.
13037
13038Semantics:
13039""""""""""
13040
13041See the description for :ref:`llvm.stacksave <int_stacksave>`.
13042
13043.. _int_get_dynamic_area_offset:
13044
13045'``llvm.get.dynamic.area.offset``' Intrinsic
13046^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13047
13048Syntax:
13049"""""""
13050
13051::
13052
13053      declare i32 @llvm.get.dynamic.area.offset.i32()
13054      declare i64 @llvm.get.dynamic.area.offset.i64()
13055
13056Overview:
13057"""""""""
13058
13059      The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
13060      get the offset from native stack pointer to the address of the most
13061      recent dynamic alloca on the caller's stack. These intrinsics are
13062      intended for use in combination with
13063      :ref:`llvm.stacksave <int_stacksave>` to get a
13064      pointer to the most recent dynamic alloca. This is useful, for example,
13065      for AddressSanitizer's stack unpoisoning routines.
13066
13067Semantics:
13068""""""""""
13069
13070      These intrinsics return a non-negative integer value that can be used to
13071      get the address of the most recent dynamic alloca, allocated by :ref:`alloca <i_alloca>`
13072      on the caller's stack. In particular, for targets where stack grows downwards,
13073      adding this offset to the native stack pointer would get the address of the most
13074      recent dynamic alloca. For targets where stack grows upwards, the situation is a bit more
13075      complicated, because subtracting this value from stack pointer would get the address
13076      one past the end of the most recent dynamic alloca.
13077
      Although for most targets :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
13079      returns just a zero, for others, such as PowerPC and PowerPC64, it returns a
13080      compile-time-known constant value.
13081
13082      The return value type of :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
13083      must match the target's default address space's (address space 0) pointer type.
13084
13085'``llvm.prefetch``' Intrinsic
13086^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13087
13088Syntax:
13089"""""""
13090
13091::
13092
13093      declare void @llvm.prefetch(ptr <address>, i32 <rw>, i32 <locality>, i32 <cache type>)
13094
13095Overview:
13096"""""""""
13097
13098The '``llvm.prefetch``' intrinsic is a hint to the code generator to
13099insert a prefetch instruction if supported; otherwise, it is a noop.
13100Prefetches have no effect on the behavior of the program but can change
13101its performance characteristics.
13102
13103Arguments:
13104""""""""""
13105
13106``address`` is the address to be prefetched, ``rw`` is the specifier
13107determining if the fetch should be for a read (0) or write (1), and
13108``locality`` is a temporal locality specifier ranging from (0) - no
13109locality, to (3) - extremely local keep in cache. The ``cache type``
13110specifies whether the prefetch is performed on the data (1) or
13111instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
13112arguments must be constant integers.
13113
13114Semantics:
13115""""""""""
13116
13117This intrinsic does not modify the behavior of the program. In
13118particular, prefetches cannot trap and do not produce a value. On
13119targets that support this intrinsic, the prefetch can provide hints to
13120the processor cache for better performance.
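
As a small illustration of the argument encoding, the sketch below requests a
read prefetch into the data cache with the highest locality. It uses the
declaration form shown in the Syntax section; the function name
``@warm_cache`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.prefetch(ptr, i32, i32, i32)

      ; Hypothetical helper: hint that %p will be read soon.
      define void @warm_cache(ptr %p) {
        ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data)
        call void @llvm.prefetch(ptr %p, i32 0, i32 3, i32 1)
        ret void
      }

Removing the call would not change the result of the program, only
(possibly) its performance.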
13121
13122'``llvm.pcmarker``' Intrinsic
13123^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13124
13125Syntax:
13126"""""""
13127
13128::
13129
13130      declare void @llvm.pcmarker(i32 <id>)
13131
13132Overview:
13133"""""""""
13134
13135The '``llvm.pcmarker``' intrinsic is a method to export a Program
13136Counter (PC) in a region of code to simulators and other tools. The
13137method is target specific, but it is expected that the marker will use
13138exported symbols to transmit the PC of the marker. The marker makes no
13139guarantees that it will remain with any specific instruction after
13140optimizations. It is possible that the presence of a marker will inhibit
13141optimizations. The intended use is to be inserted after optimizations to
13142allow correlations of simulation runs.
13143
13144Arguments:
13145""""""""""
13146
13147``id`` is a numerical id identifying the marker.
13148
13149Semantics:
13150""""""""""
13151
13152This intrinsic does not modify the behavior of the program. Backends
13153that do not support this intrinsic may ignore it.
13154
13155'``llvm.readcyclecounter``' Intrinsic
13156^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13157
13158Syntax:
13159"""""""
13160
13161::
13162
13163      declare i64 @llvm.readcyclecounter()
13164
13165Overview:
13166"""""""""
13167
13168The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
13169counter register (or similar low latency, high accuracy clocks) on those
13170targets that support it. On X86, it should map to RDTSC. On Alpha, it
13171should map to RPCC. As the backing counters overflow quickly (on the
order of 9 seconds on Alpha), this should only be used for small
13173timings.
13174
13175Semantics:
13176""""""""""
13177
13178When directly supported, reading the cycle counter should not modify any
13179memory. Implementations are allowed to either return an application
13180specific value or a system wide value. On backends without support, this
13181is lowered to a constant 0.
13182
Note that runtime support may be conditional on the privilege level the code
is running at and on the host platform.
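
A typical short timing measurement might look like the following sketch. The
functions ``@time_work`` and ``@work`` are hypothetical names introduced for
the example.

.. code-block:: llvm

      declare i64 @llvm.readcyclecounter()
      declare void @work()   ; hypothetical code under measurement

      define i64 @time_work() {
        %start = call i64 @llvm.readcyclecounter()
        call void @work()
        %end = call i64 @llvm.readcyclecounter()
        ; Elapsed count; wraps modulo 2^64 if the counter overflows.
        %elapsed = sub i64 %end, %start
        ret i64 %elapsed
      }

On backends without support both reads are lowered to 0, so the difference is
always 0 there.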
13185
13186'``llvm.clear_cache``' Intrinsic
13187^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13188
13189Syntax:
13190"""""""
13191
13192::
13193
13194      declare void @llvm.clear_cache(ptr, ptr)
13195
13196Overview:
13197"""""""""
13198
13199The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
13200in the specified range to the execution unit of the processor. On
13201targets with non-unified instruction and data cache, the implementation
13202flushes the instruction cache.
13203
13204Semantics:
13205""""""""""
13206
13207On platforms with coherent instruction and data caches (e.g. x86), this
13208intrinsic is a nop. On platforms with non-coherent instruction and data
13209cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
13210instructions or a system call, if cache flushing requires special
13211privileges.
13212
13213The default behavior is to emit a call to ``__clear_cache`` from the run
13214time library.
13215
13216This intrinsic does *not* empty the instruction pipeline. Modifications
13217of the current function are outside the scope of the intrinsic.
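
For instance, a JIT that has just written machine code into a buffer could
make it visible before executing it roughly as sketched below. The function
name ``@publish_code`` and the parameter names are hypothetical; the two
pointers are assumed to delimit the modified range.

.. code-block:: llvm

      declare void @llvm.clear_cache(ptr, ptr)

      ; Hypothetical helper: %begin and %end delimit freshly written code.
      define void @publish_code(ptr %begin, ptr %end) {
        call void @llvm.clear_cache(ptr %begin, ptr %end)
        ret void
      }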
13218
13219'``llvm.instrprof.increment``' Intrinsic
13220^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13221
13222Syntax:
13223"""""""
13224
13225::
13226
13227      declare void @llvm.instrprof.increment(ptr <name>, i64 <hash>,
13228                                             i32 <num-counters>, i32 <index>)
13229
13230Overview:
13231"""""""""
13232
13233The '``llvm.instrprof.increment``' intrinsic can be emitted by a
13234frontend for use with instrumentation based profiling. These will be
13235lowered by the ``-instrprof`` pass to generate execution counts of a
13236program at runtime.
13237
13238Arguments:
13239""""""""""
13240
13241The first argument is a pointer to a global variable containing the
13242name of the entity being instrumented. This should generally be the
13243(mangled) function name for a set of counters.
13244
13245The second argument is a hash value that can be used by the consumer
13246of the profile data to detect changes to the instrumented source, and
13247the third is the number of counters associated with ``name``. It is an
13248error if ``hash`` or ``num-counters`` differ between two instances of
13249``instrprof.increment`` that refer to the same name.
13250
13251The last argument refers to which of the counters for ``name`` should
be incremented. It should be at least 0 and less than ``num-counters``.
13253
13254Semantics:
13255""""""""""
13256
13257This intrinsic represents an increment of a profiling counter. It will
13258cause the ``-instrprof`` pass to generate the appropriate data
13259structures and the code to increment the appropriate value, in a
13260format that can be written out by a compiler runtime and consumed via
13261the ``llvm-profdata`` tool.
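
A frontend-emitted increment might look like the sketch below, shown before
the ``-instrprof`` pass runs. The name variable ``@__profn_foo``, the hash
value 0, and the function ``@foo`` are illustrative assumptions.

.. code-block:: llvm

      ; Name of the instrumented entity (here, a function called "foo").
      @__profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.increment(ptr, i64, i32, i32)

      define void @foo() {
        ; hash = 0, one counter in total, increment the counter at index 0.
        call void @llvm.instrprof.increment(ptr @__profn_foo, i64 0,
                                            i32 1, i32 0)
        ret void
      }

Every other increment that refers to ``@__profn_foo`` would have to use the
same ``hash`` and ``num-counters`` values.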
13262
13263'``llvm.instrprof.increment.step``' Intrinsic
13264^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13265
13266Syntax:
13267"""""""
13268
13269::
13270
13271      declare void @llvm.instrprof.increment.step(ptr <name>, i64 <hash>,
13272                                                  i32 <num-counters>,
13273                                                  i32 <index>, i64 <step>)
13274
13275Overview:
13276"""""""""
13277
13278The '``llvm.instrprof.increment.step``' intrinsic is an extension to
13279the '``llvm.instrprof.increment``' intrinsic with an additional fifth
13280argument to specify the step of the increment.
13281
Arguments:
""""""""""

The first four arguments are the same as those of the
'``llvm.instrprof.increment``' intrinsic.

The last argument specifies the value of the increment of the counter variable.

Semantics:
""""""""""

See the description of the '``llvm.instrprof.increment``' intrinsic.
13292
13293'``llvm.instrprof.cover``' Intrinsic
13294^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13295
13296Syntax:
13297"""""""
13298
13299::
13300
13301      declare void @llvm.instrprof.cover(ptr <name>, i64 <hash>,
13302                                         i32 <num-counters>, i32 <index>)
13303
13304Overview:
13305"""""""""
13306
13307The '``llvm.instrprof.cover``' intrinsic is used to implement coverage
13308instrumentation.
13309
Arguments:
""""""""""

The arguments are the same as the first four arguments of
'``llvm.instrprof.increment``'.

Semantics:
""""""""""

Similar to the '``llvm.instrprof.increment``' intrinsic, but it stores zero to
the profiling variable to signify that the function has been covered. We store
zero because this is more efficient on some targets.
13320
13321'``llvm.instrprof.value.profile``' Intrinsic
13322^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13323
13324Syntax:
13325"""""""
13326
13327::
13328
13329      declare void @llvm.instrprof.value.profile(ptr <name>, i64 <hash>,
13330                                                 i64 <value>, i32 <value_kind>,
13331                                                 i32 <index>)
13332
13333Overview:
13334"""""""""
13335
13336The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
13337frontend for use with instrumentation based profiling. This will be
lowered by the ``-instrprof`` pass to find out the target values that
instrumented expressions take in a program at runtime.
13340
13341Arguments:
13342""""""""""
13343
13344The first argument is a pointer to a global variable containing the
13345name of the entity being instrumented. ``name`` should generally be the
13346(mangled) function name for a set of counters.
13347
13348The second argument is a hash value that can be used by the consumer
13349of the profile data to detect changes to the instrumented source. It
13350is an error if ``hash`` differs between two instances of
13351``llvm.instrprof.*`` that refer to the same name.
13352
13353The third argument is the value of the expression being profiled. The profiled
13354expression's value should be representable as an unsigned 64-bit value. The
13355fourth argument represents the kind of value profiling that is being done. The
13356supported value profiling kinds are enumerated through the
13357``InstrProfValueKind`` type declared in the
13358``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
13359index of the instrumented expression within ``name``. It should be >= 0.
13360
13361Semantics:
13362""""""""""
13363
This intrinsic represents the point where a call to a runtime routine
should be inserted for value profiling of target expressions. The
``-instrprof`` pass will generate the appropriate data structures and
replace the ``llvm.instrprof.value.profile`` intrinsic with a call to the
profile runtime library with the proper arguments.
13369
13370'``llvm.thread.pointer``' Intrinsic
13371^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13372
13373Syntax:
13374"""""""
13375
13376::
13377
13378      declare ptr @llvm.thread.pointer()
13379
13380Overview:
13381"""""""""
13382
13383The '``llvm.thread.pointer``' intrinsic returns the value of the thread
13384pointer.
13385
13386Semantics:
13387""""""""""
13388
13389The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
13390for the current thread.  The exact semantics of this value are target
13391specific: it may point to the start of TLS area, to the end, or somewhere
13392in the middle.  Depending on the target, this intrinsic may read a register,
13393call a helper function, read from an alternate memory space, or perform
13394other operations necessary to locate the TLS area.  Not all targets support
13395this intrinsic.
13396
13397'``llvm.call.preallocated.setup``' Intrinsic
13398^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13399
13400Syntax:
13401"""""""
13402
13403::
13404
13405      declare token @llvm.call.preallocated.setup(i32 %num_args)
13406
13407Overview:
13408"""""""""
13409
13410The '``llvm.call.preallocated.setup``' intrinsic returns a token which can
13411be used with a call's ``"preallocated"`` operand bundle to indicate that
13412certain arguments are allocated and initialized before the call.
13413
13414Semantics:
13415""""""""""
13416
13417The '``llvm.call.preallocated.setup``' intrinsic returns a token which is
13418associated with at most one call. The token can be passed to
'``llvm.call.preallocated.arg``' to get a pointer to the
corresponding argument. The token must be the parameter to a
13421``"preallocated"`` operand bundle for the corresponding call.
13422
Nested calls to '``llvm.call.preallocated.setup``' are allowed, but must
be properly nested. For example,

.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t2)]
      call void @foo() ["preallocated"(token %t1)]
13432
13433is allowed, but not
13434
.. code-block:: llvm

      %t1 = call token @llvm.call.preallocated.setup(i32 0)
      %t2 = call token @llvm.call.preallocated.setup(i32 0)
      call void @foo() ["preallocated"(token %t1)]
      call void @foo() ["preallocated"(token %t2)]
13441
13442.. _int_call_preallocated_arg:
13443
13444'``llvm.call.preallocated.arg``' Intrinsic
13445^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13446
13447Syntax:
13448"""""""
13449
13450::
13451
13452      declare ptr @llvm.call.preallocated.arg(token %setup_token, i32 %arg_index)
13453
13454Overview:
13455"""""""""
13456
13457The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
13458corresponding preallocated argument for the preallocated call.
13459
13460Semantics:
13461""""""""""
13462
The '``llvm.call.preallocated.arg``' intrinsic returns a pointer to the
``%arg_index``\ th argument with the ``preallocated`` attribute for
the call associated with the ``%setup_token``, which must be from
'``llvm.call.preallocated.setup``'.
13467
13468A call to '``llvm.call.preallocated.arg``' must have a call site
13469``preallocated`` attribute. The type of the ``preallocated`` attribute must
13470match the type used by the ``preallocated`` attribute of the corresponding
13471argument at the preallocated call. The type is used in the case that an
13472``llvm.call.preallocated.setup`` does not have a corresponding call (e.g. due
13473to DCE), where otherwise we cannot know how large the arguments are.
13474
It is undefined behavior to call this intrinsic with a token from an
'``llvm.call.preallocated.setup``' if another
'``llvm.call.preallocated.setup``' has already been called, or if the
preallocated call corresponding to that '``llvm.call.preallocated.setup``'
has already been called.
13480
13481.. _int_call_preallocated_teardown:
13482
13483'``llvm.call.preallocated.teardown``' Intrinsic
13484^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13485
13486Syntax:
13487"""""""
13488
13489::
13490
      declare void @llvm.call.preallocated.teardown(token %setup_token)
13492
13493Overview:
13494"""""""""
13495
13496The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
13497created by a '``llvm.call.preallocated.setup``'.
13498
13499Semantics:
13500""""""""""
13501
The token argument must be a token returned by
'``llvm.call.preallocated.setup``'.
13503
The '``llvm.call.preallocated.teardown``' intrinsic cleans up the stack
allocated by the corresponding '``llvm.call.preallocated.setup``'. Exactly
one of this intrinsic or the preallocated call must be called for a given
'``llvm.call.preallocated.setup``' to prevent stack leaks. It is undefined
behavior to call both a '``llvm.call.preallocated.teardown``' and the
preallocated call for a given '``llvm.call.preallocated.setup``'.
13509
13510For example, if the stack is allocated for a preallocated call by a
13511'``llvm.call.preallocated.setup``', then an initializer function called on an
13512allocated argument throws an exception, there should be a
13513'``llvm.call.preallocated.teardown``' in the exception handler to prevent
13514stack leaks.
13515
13516Following the nesting rules in '``llvm.call.preallocated.setup``', nested
13517calls to '``llvm.call.preallocated.setup``' and
13518'``llvm.call.preallocated.teardown``' are allowed but must be properly
13519nested.
13520
13521Example:
13522""""""""
13523
13524.. code-block:: llvm
13525
13526        %cs = call token @llvm.call.preallocated.setup(i32 1)
13527        %x = call ptr @llvm.call.preallocated.arg(token %cs, i32 0) preallocated(i32)
13528        invoke void @constructor(ptr %x) to label %conta unwind label %contb
13529    conta:
13530        call void @foo1(ptr preallocated(i32) %x) ["preallocated"(token %cs)]
13531        ret void
13532    contb:
13533        %s = catchswitch within none [label %catch] unwind to caller
13534    catch:
13535        %p = catchpad within %s []
13536        call void @llvm.call.preallocated.teardown(token %cs)
13537        ret void
13538
13539Standard C/C++ Library Intrinsics
13540---------------------------------
13541
13542LLVM provides intrinsics for a few important standard C/C++ library
13543functions. These intrinsics allow source-language front-ends to pass
13544information about the alignment of the pointer arguments to the code
13545generator, providing opportunity for more efficient code generation.
13546
13547
13548'``llvm.abs.*``' Intrinsic
13549^^^^^^^^^^^^^^^^^^^^^^^^^^
13550
13551Syntax:
13552"""""""
13553
13554This is an overloaded intrinsic. You can use ``llvm.abs`` on any
13555integer bit width or any vector of integer elements.
13556
13557::
13558
13559      declare i32 @llvm.abs.i32(i32 <src>, i1 <is_int_min_poison>)
13560      declare <4 x i32> @llvm.abs.v4i32(<4 x i32> <src>, i1 <is_int_min_poison>)
13561
13562Overview:
13563"""""""""
13564
13565The '``llvm.abs``' family of intrinsic functions returns the absolute value
13566of an argument.
13567
13568Arguments:
13569""""""""""
13570
13571The first argument is the value for which the absolute value is to be returned.
13572This argument may be of any integer type or a vector with integer element type.
13573The return type must match the first argument type.
13574
13575The second argument must be a constant and is a flag to indicate whether the
13576result value of the '``llvm.abs``' intrinsic is a
13577:ref:`poison value <poisonvalues>` if the argument is statically or dynamically
13578an ``INT_MIN`` value.
13579
13580Semantics:
13581""""""""""
13582
The '``llvm.abs``' intrinsic returns the magnitude of the argument or of
each element of a vector argument. The result is non-negative unless the
argument is ``INT_MIN``. If the argument is ``INT_MIN``, then the result is
also ``INT_MIN`` if ``is_int_min_poison == 0`` and ``poison`` otherwise.
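
Example:
""""""""

The following sketch shows a call with ``is_int_min_poison`` set to
``false`` next to a hand-written ``select`` expansion that should compute
the same result, including the wrapping ``INT_MIN`` case. The function names
``@abs_call`` and ``@abs_expanded`` are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.abs.i32(i32, i1)

      define i32 @abs_call(i32 %x) {
        ; INT_MIN maps to INT_MIN because is_int_min_poison is false.
        %r = call i32 @llvm.abs.i32(i32 %x, i1 false)
        ret i32 %r
      }

      define i32 @abs_expanded(i32 %x) {
        ; The subtraction wraps for INT_MIN, yielding INT_MIN again.
        %neg = sub i32 0, %x
        %isneg = icmp slt i32 %x, 0
        %r = select i1 %isneg, i32 %neg, i32 %x
        ret i32 %r
      }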
13587
13588
13589'``llvm.smax.*``' Intrinsic
13590^^^^^^^^^^^^^^^^^^^^^^^^^^^
13591
13592Syntax:
13593"""""""
13594
13595This is an overloaded intrinsic. You can use ``@llvm.smax`` on any
13596integer bit width or any vector of integer elements.
13597
13598::
13599
13600      declare i32 @llvm.smax.i32(i32 %a, i32 %b)
13601      declare <4 x i32> @llvm.smax.v4i32(<4 x i32> %a, <4 x i32> %b)
13602
13603Overview:
13604"""""""""
13605
13606Return the larger of ``%a`` and ``%b`` comparing the values as signed integers.
13607Vector intrinsics operate on a per-element basis. The larger element of ``%a``
13608and ``%b`` at a given index is returned for that index.
13609
13610Arguments:
13611""""""""""
13612
13613The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13614integer element type. The argument types must match each other, and the return
13615type must match the argument type.
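
Example:
""""""""

As an illustrative sketch, a scalar call is shown next to an
``icmp``/``select`` sequence that should produce the same value. The
function names are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.smax.i32(i32, i32)

      define i32 @smax_call(i32 %a, i32 %b) {
        %r = call i32 @llvm.smax.i32(i32 %a, i32 %b)
        ret i32 %r
      }

      define i32 @smax_expanded(i32 %a, i32 %b) {
        ; Signed comparison selects the larger operand.
        %cmp = icmp sgt i32 %a, %b
        %r = select i1 %cmp, i32 %a, i32 %b
        ret i32 %r
      }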
13616
13617
13618'``llvm.smin.*``' Intrinsic
13619^^^^^^^^^^^^^^^^^^^^^^^^^^^
13620
13621Syntax:
13622"""""""
13623
13624This is an overloaded intrinsic. You can use ``@llvm.smin`` on any
13625integer bit width or any vector of integer elements.
13626
13627::
13628
13629      declare i32 @llvm.smin.i32(i32 %a, i32 %b)
13630      declare <4 x i32> @llvm.smin.v4i32(<4 x i32> %a, <4 x i32> %b)
13631
13632Overview:
13633"""""""""
13634
13635Return the smaller of ``%a`` and ``%b`` comparing the values as signed integers.
13636Vector intrinsics operate on a per-element basis. The smaller element of ``%a``
13637and ``%b`` at a given index is returned for that index.
13638
13639Arguments:
13640""""""""""
13641
13642The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13643integer element type. The argument types must match each other, and the return
13644type must match the argument type.
13645
13646
13647'``llvm.umax.*``' Intrinsic
13648^^^^^^^^^^^^^^^^^^^^^^^^^^^
13649
13650Syntax:
13651"""""""
13652
13653This is an overloaded intrinsic. You can use ``@llvm.umax`` on any
13654integer bit width or any vector of integer elements.
13655
13656::
13657
13658      declare i32 @llvm.umax.i32(i32 %a, i32 %b)
13659      declare <4 x i32> @llvm.umax.v4i32(<4 x i32> %a, <4 x i32> %b)
13660
13661Overview:
13662"""""""""
13663
13664Return the larger of ``%a`` and ``%b`` comparing the values as unsigned
13665integers. Vector intrinsics operate on a per-element basis. The larger element
13666of ``%a`` and ``%b`` at a given index is returned for that index.
13667
13668Arguments:
13669""""""""""
13670
13671The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13672integer element type. The argument types must match each other, and the return
13673type must match the argument type.
13674
13675
13676'``llvm.umin.*``' Intrinsic
13677^^^^^^^^^^^^^^^^^^^^^^^^^^^
13678
13679Syntax:
13680"""""""
13681
13682This is an overloaded intrinsic. You can use ``@llvm.umin`` on any
13683integer bit width or any vector of integer elements.
13684
13685::
13686
13687      declare i32 @llvm.umin.i32(i32 %a, i32 %b)
13688      declare <4 x i32> @llvm.umin.v4i32(<4 x i32> %a, <4 x i32> %b)
13689
13690Overview:
13691"""""""""
13692
13693Return the smaller of ``%a`` and ``%b`` comparing the values as unsigned
13694integers. Vector intrinsics operate on a per-element basis. The smaller element
13695of ``%a`` and ``%b`` at a given index is returned for that index.
13696
13697Arguments:
13698""""""""""
13699
13700The arguments (``%a`` and ``%b``) may be of any integer type or a vector with
13701integer element type. The argument types must match each other, and the return
13702type must match the argument type.
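
Example:
""""""""

The sketch below contrasts the unsigned and signed minimum on the same
``i8`` operands, which differ only in how the bit pattern of ``-1`` is
interpreted. The function names are hypothetical.

.. code-block:: llvm

      declare i8 @llvm.umin.i8(i8, i8)
      declare i8 @llvm.smin.i8(i8, i8)

      define i8 @umin_example() {
        ; As unsigned values, -1 is 255, so the smaller operand is 1.
        %r = call i8 @llvm.umin.i8(i8 -1, i8 1)
        ret i8 %r
      }

      define i8 @smin_example() {
        ; As signed values, the smaller operand is -1.
        %r = call i8 @llvm.smin.i8(i8 -1, i8 1)
        ret i8 %r
      }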
13703
13704
13705.. _int_memcpy:
13706
13707'``llvm.memcpy``' Intrinsic
13708^^^^^^^^^^^^^^^^^^^^^^^^^^^
13709
13710Syntax:
13711"""""""
13712
13713This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
13714integer bit width and for different address spaces. Not all targets
13715support all bit widths however.
13716
13717::
13718
13719      declare void @llvm.memcpy.p0.p0.i32(ptr <dest>, ptr <src>,
13720                                          i32 <len>, i1 <isvolatile>)
13721      declare void @llvm.memcpy.p0.p0.i64(ptr <dest>, ptr <src>,
13722                                          i64 <len>, i1 <isvolatile>)
13723
13724Overview:
13725"""""""""
13726
13727The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
13728source location to the destination location.
13729
13730Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
13733
13734Arguments:
13735""""""""""
13736
13737The first argument is a pointer to the destination, the second is a
13738pointer to the source. The third argument is an integer argument
13739specifying the number of bytes to copy, and the fourth is a
13740boolean indicating a volatile access.
13741
13742The :ref:`align <attr_align>` parameter attribute can be provided
13743for the first and second arguments.
13744
13745If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
13746a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13747very cleanly specified and it is unwise to depend on it.
13748
13749Semantics:
13750""""""""""
13751
13752The '``llvm.memcpy.*``' intrinsics copy a block of memory from the source
13753location to the destination location, which must either be equal or
13754non-overlapping. It copies "len" bytes of memory over. If the argument is known
13755to be aligned to some boundary, this can be specified as an attribute on the
13756argument.
13757
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
13759the arguments.
13760If ``<len>`` is not a well-defined value, the behavior is undefined.
13761If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
13762otherwise the behavior is undefined.
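
Example:
""""""""

A minimal sketch of a non-volatile 16-byte copy, with alignment communicated
through the ``align`` parameter attribute at the call site. The function
name ``@copy16`` and the chosen alignments are assumptions made for
illustration.

.. code-block:: llvm

      declare void @llvm.memcpy.p0.p0.i64(ptr, ptr, i64, i1)

      define void @copy16(ptr %dst, ptr %src) {
        ; Copy 16 bytes; %dst is 8-byte aligned and %src is 4-byte aligned.
        call void @llvm.memcpy.p0.p0.i64(ptr align 8 %dst, ptr align 4 %src,
                                         i64 16, i1 false)
        ret void
      }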
13763
13764.. _int_memcpy_inline:
13765
13766'``llvm.memcpy.inline``' Intrinsic
13767^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13768
13769Syntax:
13770"""""""
13771
13772This is an overloaded intrinsic. You can use ``llvm.memcpy.inline`` on any
13773integer bit width and for different address spaces. Not all targets
13774support all bit widths however.
13775
13776::
13777
13778      declare void @llvm.memcpy.inline.p0.p0.i32(ptr <dest>, ptr <src>,
13779                                                 i32 <len>, i1 <isvolatile>)
13780      declare void @llvm.memcpy.inline.p0.p0.i64(ptr <dest>, ptr <src>,
13781                                                 i64 <len>, i1 <isvolatile>)
13782
13783Overview:
13784"""""""""
13785
13786The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
source location to the destination location and guarantee that no external
13788functions are called.
13789
13790Note that, unlike the standard libc function, the ``llvm.memcpy.inline.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
13793
13794Arguments:
13795""""""""""
13796
13797The first argument is a pointer to the destination, the second is a
13798pointer to the source. The third argument is a constant integer argument
13799specifying the number of bytes to copy, and the fourth is a
13800boolean indicating a volatile access.
13801
13802The :ref:`align <attr_align>` parameter attribute can be provided
13803for the first and second arguments.
13804
13805If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy.inline`` call is
13806a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13807very cleanly specified and it is unwise to depend on it.
13808
13809Semantics:
13810""""""""""
13811
13812The '``llvm.memcpy.inline.*``' intrinsics copy a block of memory from the
13813source location to the destination location, which are not allowed to
13814overlap. It copies "len" bytes of memory over. If the argument is known
13815to be aligned to some boundary, this can be specified as an attribute on
13816the argument.
13817The behavior of '``llvm.memcpy.inline.*``' is equivalent to the behavior of
13818'``llvm.memcpy.*``', but the generated code is guaranteed not to call any
13819external functions.
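
Example:
""""""""

A minimal sketch, assuming a fixed 32-byte copy. The length is written as a
constant because the third argument must be a constant integer. The function
name ``@copy32_inline`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.memcpy.inline.p0.p0.i64(ptr, ptr, i64, i1)

      define void @copy32_inline(ptr %dst, ptr %src) {
        ; The copy is expanded in place rather than lowered to a library call.
        call void @llvm.memcpy.inline.p0.p0.i64(ptr %dst, ptr %src,
                                                i64 32, i1 false)
        ret void
      }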
13820
13821.. _int_memmove:
13822
13823'``llvm.memmove``' Intrinsic
13824^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13825
13826Syntax:
13827"""""""
13828
This is an overloaded intrinsic. You can use ``llvm.memmove`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths however.
13832
13833::
13834
13835      declare void @llvm.memmove.p0.p0.i32(ptr <dest>, ptr <src>,
13836                                           i32 <len>, i1 <isvolatile>)
13837      declare void @llvm.memmove.p0.p0.i64(ptr <dest>, ptr <src>,
13838                                           i64 <len>, i1 <isvolatile>)
13839
13840Overview:
13841"""""""""
13842
13843The '``llvm.memmove.*``' intrinsics move a block of memory from the
13844source location to the destination location. It is similar to the
13845'``llvm.memcpy``' intrinsic but allows the two memory locations to
13846overlap.
13847
13848Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.
13851
13852Arguments:
13853""""""""""
13854
13855The first argument is a pointer to the destination, the second is a
13856pointer to the source. The third argument is an integer argument
13857specifying the number of bytes to copy, and the fourth is a
13858boolean indicating a volatile access.
13859
13860The :ref:`align <attr_align>` parameter attribute can be provided
13861for the first and second arguments.
13862
13863If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
13864is a :ref:`volatile operation <volatile>`. The detailed access behavior is
13865not very cleanly specified and it is unwise to depend on it.
13866
13867Semantics:
13868""""""""""
13869
13870The '``llvm.memmove.*``' intrinsics copy a block of memory from the
13871source location to the destination location, which may overlap. It
13872copies "len" bytes of memory over. If the argument is known to be
13873aligned to some boundary, this can be specified as an attribute on
13874the argument.
13875
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
13877the arguments.
13878If ``<len>`` is not a well-defined value, the behavior is undefined.
13879If ``<len>`` is not zero, both ``<dest>`` and ``<src>`` should be well-defined,
13880otherwise the behavior is undefined.
13881
13882.. _int_memset:
13883
13884'``llvm.memset.*``' Intrinsics
13885^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13886
13887Syntax:
13888"""""""
13889
This is an overloaded intrinsic. You can use ``llvm.memset`` on any
integer bit width and for different address spaces. However, not all
targets support all bit widths.
13893
13894::
13895
13896      declare void @llvm.memset.p0.i32(ptr <dest>, i8 <val>,
13897                                       i32 <len>, i1 <isvolatile>)
13898      declare void @llvm.memset.p0.i64(ptr <dest>, i8 <val>,
13899                                       i64 <len>, i1 <isvolatile>)
13900
13901Overview:
13902"""""""""
13903
13904The '``llvm.memset.*``' intrinsics fill a block of memory with a
13905particular byte value.
13906
13907Note that, unlike the standard libc function, the ``llvm.memset``
13908intrinsic does not return a value and takes an extra volatile
13909argument. Also, the destination can be in an arbitrary address space.
13910
13911Arguments:
13912""""""""""
13913
13914The first argument is a pointer to the destination to fill, the second
13915is the byte value with which to fill it, the third argument is an
13916integer argument specifying the number of bytes to fill, and the fourth
13917is a boolean indicating a volatile access.
13918
13919The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.
13921
13922If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
13923a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13924very cleanly specified and it is unwise to depend on it.
13925
13926Semantics:
13927""""""""""
13928
13929The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
13930at the destination location. If the argument is known to be
13931aligned to some boundary, this can be specified as an attribute on
13932the argument.
13933
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
13935the arguments.
13936If ``<len>`` is not a well-defined value, the behavior is undefined.
13937If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
13938behavior is undefined.
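
Example:
""""""""

A minimal sketch that zero-fills a buffer whose length is only known at run
time. The function name ``@zero_fill`` and the 16-byte alignment are
assumptions for illustration.

.. code-block:: llvm

      declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)

      define void @zero_fill(ptr %buf, i64 %len) {
        ; Store %len zero bytes starting at %buf (non-volatile).
        call void @llvm.memset.p0.i64(ptr align 16 %buf, i8 0, i64 %len,
                                      i1 false)
        ret void
      }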
13939
13940.. _int_memset_inline:
13941
13942'``llvm.memset.inline``' Intrinsic
13943^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13944
13945Syntax:
13946"""""""
13947
13948This is an overloaded intrinsic. You can use ``llvm.memset.inline`` on any
13949integer bit width and for different address spaces. Not all targets
13950support all bit widths however.
13951
13952::
13953
      declare void @llvm.memset.inline.p0.i32(ptr <dest>, i8 <val>,
                                              i32 <len>, i1 <isvolatile>)
      declare void @llvm.memset.inline.p0.i64(ptr <dest>, i8 <val>,
                                              i64 <len>, i1 <isvolatile>)
13958
13959Overview:
13960"""""""""
13961
13962The '``llvm.memset.inline.*``' intrinsics fill a block of memory with a
particular byte value and guarantee that no external functions are called.
13964
13965Note that, unlike the standard libc function, the ``llvm.memset.inline.*``
13966intrinsics do not return a value, take an extra isvolatile argument and the
13967pointer can be in specified address spaces.
13968
13969Arguments:
13970""""""""""
13971
13972The first argument is a pointer to the destination to fill, the second
13973is the byte value with which to fill it, the third argument is a constant
13974integer argument specifying the number of bytes to fill, and the fourth
13975is a boolean indicating a volatile access.
13976
13977The :ref:`align <attr_align>` parameter attribute can be provided
13978for the first argument.
13979
13980If the ``isvolatile`` parameter is ``true``, the ``llvm.memset.inline`` call is
13981a :ref:`volatile operation <volatile>`. The detailed access behavior is not
13982very cleanly specified and it is unwise to depend on it.
13983
13984Semantics:
13985""""""""""
13986
13987The '``llvm.memset.inline.*``' intrinsics fill "len" bytes of memory starting
13988at the destination location. If the argument is known to be
13989aligned to some boundary, this can be specified as an attribute on
13990the argument.
13991
13992``len`` must be a constant expression.
If ``<len>`` is 0, it is a no-op modulo the behavior of attributes attached to
13994the arguments.
13995If ``<len>`` is not a well-defined value, the behavior is undefined.
13996If ``<len>`` is not zero, ``<dest>`` should be well-defined, otherwise the
13997behavior is undefined.
13998
13999The behavior of '``llvm.memset.inline.*``' is equivalent to the behavior of
14000'``llvm.memset.*``', but the generated code is guaranteed not to call any
14001external functions.
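
Example:
""""""""

A minimal sketch, assuming a fixed 64-byte fill with the byte value ``0xFF``
(written as ``i8 -1``). The function name ``@fill64_inline`` is hypothetical,
and the intrinsic name assumes the mangling used by the other memory
intrinsics in this section (pointer address space, then length type).

.. code-block:: llvm

      declare void @llvm.memset.inline.p0.i64(ptr, i8, i64, i1)

      define void @fill64_inline(ptr %buf) {
        ; The length must be a constant; no external function is called.
        call void @llvm.memset.inline.p0.i64(ptr %buf, i8 -1, i64 64, i1 false)
        ret void
      }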
14002
14003'``llvm.sqrt.*``' Intrinsic
14004^^^^^^^^^^^^^^^^^^^^^^^^^^^
14005
14006Syntax:
14007"""""""
14008
14009This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
14010floating-point or vector of floating-point type. Not all targets support
14011all types however.
14012
14013::
14014
14015      declare float     @llvm.sqrt.f32(float %Val)
14016      declare double    @llvm.sqrt.f64(double %Val)
14017      declare x86_fp80  @llvm.sqrt.f80(x86_fp80 %Val)
14018      declare fp128     @llvm.sqrt.f128(fp128 %Val)
14019      declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)
14020
14021Overview:
14022"""""""""
14023
14024The '``llvm.sqrt``' intrinsics return the square root of the specified value.
14025
14026Arguments:
14027""""""""""
14028
14029The argument and return value are floating-point numbers of the same type.
14030
14031Semantics:
14032""""""""""
14033
14034Return the same value as a corresponding libm '``sqrt``' function but without
14035trapping or setting ``errno``. For types specified by IEEE-754, the result
14036matches a conforming libm implementation.
14037
14038When specified with the fast-math-flag 'afn', the result may be approximated
14039using a less accurate calculation.
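
Example:
""""""""

The sketch below computes a Euclidean length with a precise square root, and
separately shows a call carrying the ``afn`` flag, which permits an
approximate result. Both function names are hypothetical.

.. code-block:: llvm

      declare float @llvm.sqrt.f32(float)

      define float @hypot_sketch(float %x, float %y) {
        %xx = fmul float %x, %x
        %yy = fmul float %y, %y
        %sum = fadd float %xx, %yy
        %r = call float @llvm.sqrt.f32(float %sum)
        ret float %r
      }

      define float @sqrt_approx(float %x) {
        ; 'afn' allows a less accurate approximation to be substituted.
        %r = call afn float @llvm.sqrt.f32(float %x)
        ret float %r
      }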
14040
14041'``llvm.powi.*``' Intrinsic
14042^^^^^^^^^^^^^^^^^^^^^^^^^^^
14043
14044Syntax:
14045"""""""
14046
14047This is an overloaded intrinsic. You can use ``llvm.powi`` on any
14048floating-point or vector of floating-point type. Not all targets support
14049all types however.
14050
14051Generally, the only supported type for the exponent is the one matching
14052with the C type ``int``.
14053
14054::
14055
14056      declare float     @llvm.powi.f32.i32(float  %Val, i32 %power)
14057      declare double    @llvm.powi.f64.i16(double %Val, i16 %power)
14058      declare x86_fp80  @llvm.powi.f80.i32(x86_fp80  %Val, i32 %power)
14059      declare fp128     @llvm.powi.f128.i32(fp128 %Val, i32 %power)
14060      declare ppc_fp128 @llvm.powi.ppcf128.i32(ppc_fp128  %Val, i32 %power)
14061
14062Overview:
14063"""""""""
14064
14065The '``llvm.powi.*``' intrinsics return the first operand raised to the
14066specified (positive or negative) power. The order of evaluation of
14067multiplications is not defined. When a vector of floating-point type is
14068used, the second argument remains a scalar integer value.
14069
14070Arguments:
14071""""""""""
14072
14073The second argument is an integer power, and the first is a value to
14074raise to that power.
14075
14076Semantics:
14077""""""""""
14078
14079This function returns the first value raised to the second power with an
14080unspecified sequence of rounding operations.
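
Example:
""""""""

As a sketch, the first function cubes a scalar and the second raises every
element of a vector to the same scalar power. The function names are
hypothetical, and the vector variant assumes the usual overloaded name
mangling (``v4f32`` for ``<4 x float>``).

.. code-block:: llvm

      declare float @llvm.powi.f32.i32(float, i32)
      declare <4 x float> @llvm.powi.v4f32.i32(<4 x float>, i32)

      define float @cube(float %x) {
        %r = call float @llvm.powi.f32.i32(float %x, i32 3)
        ret float %r
      }

      define <4 x float> @vec_powi(<4 x float> %v, i32 %n) {
        ; The exponent stays a scalar even for a vector base.
        %r = call <4 x float> @llvm.powi.v4f32.i32(<4 x float> %v, i32 %n)
        ret <4 x float> %r
      }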
14081
14082'``llvm.sin.*``' Intrinsic
14083^^^^^^^^^^^^^^^^^^^^^^^^^^
14084
14085Syntax:
14086"""""""
14087
14088This is an overloaded intrinsic. You can use ``llvm.sin`` on any
14089floating-point or vector of floating-point type. Not all targets support
14090all types however.
14091
14092::
14093
14094      declare float     @llvm.sin.f32(float  %Val)
14095      declare double    @llvm.sin.f64(double %Val)
14096      declare x86_fp80  @llvm.sin.f80(x86_fp80  %Val)
14097      declare fp128     @llvm.sin.f128(fp128 %Val)
14098      declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128  %Val)
14099
14100Overview:
14101"""""""""
14102
14103The '``llvm.sin.*``' intrinsics return the sine of the operand.
14104
14105Arguments:
14106""""""""""
14107
14108The argument and return value are floating-point numbers of the same type.
14109
14110Semantics:
14111""""""""""
14112
14113Return the same value as a corresponding libm '``sin``' function but without
14114trapping or setting ``errno``.
14115
14116When specified with the fast-math-flag 'afn', the result may be approximated
14117using a less accurate calculation.
14118
14119'``llvm.cos.*``' Intrinsic
14120^^^^^^^^^^^^^^^^^^^^^^^^^^
14121
14122Syntax:
14123"""""""
14124
14125This is an overloaded intrinsic. You can use ``llvm.cos`` on any
14126floating-point or vector of floating-point type. Not all targets support
14127all types however.
14128
14129::
14130
14131      declare float     @llvm.cos.f32(float  %Val)
14132      declare double    @llvm.cos.f64(double %Val)
14133      declare x86_fp80  @llvm.cos.f80(x86_fp80  %Val)
14134      declare fp128     @llvm.cos.f128(fp128 %Val)
14135      declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128  %Val)
14136
14137Overview:
14138"""""""""
14139
14140The '``llvm.cos.*``' intrinsics return the cosine of the operand.
14141
14142Arguments:
14143""""""""""
14144
14145The argument and return value are floating-point numbers of the same type.
14146
14147Semantics:
14148""""""""""
14149
14150Return the same value as a corresponding libm '``cos``' function but without
14151trapping or setting ``errno``.
14152
14153When specified with the fast-math-flag 'afn', the result may be approximated
14154using a less accurate calculation.
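
Example:
""""""""

The sketch below uses '``llvm.cos``' together with '``llvm.sin``' to convert
polar coordinates to Cartesian ones. The function name
``@polar_to_cartesian`` and its out-pointer parameters are assumptions made
for illustration.

.. code-block:: llvm

      declare float @llvm.sin.f32(float)
      declare float @llvm.cos.f32(float)

      define void @polar_to_cartesian(float %r, float %theta, ptr %x, ptr %y) {
        %c = call float @llvm.cos.f32(float %theta)
        %s = call float @llvm.sin.f32(float %theta)
        ; x = r * cos(theta), y = r * sin(theta)
        %xv = fmul float %r, %c
        %yv = fmul float %r, %s
        store float %xv, ptr %x
        store float %yv, ptr %y
        ret void
      }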
14155
14156'``llvm.pow.*``' Intrinsic
14157^^^^^^^^^^^^^^^^^^^^^^^^^^
14158
14159Syntax:
14160"""""""
14161
14162This is an overloaded intrinsic. You can use ``llvm.pow`` on any
14163floating-point or vector of floating-point type. Not all targets support
14164all types however.
14165
14166::
14167
14168      declare float     @llvm.pow.f32(float  %Val, float %Power)
14169      declare double    @llvm.pow.f64(double %Val, double %Power)
14170      declare x86_fp80  @llvm.pow.f80(x86_fp80  %Val, x86_fp80 %Power)
14171      declare fp128     @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128  %Val, ppc_fp128 %Power)
14173
14174Overview:
14175"""""""""
14176
14177The '``llvm.pow.*``' intrinsics return the first operand raised to the
14178specified (positive or negative) power.
14179
14180Arguments:
14181""""""""""
14182
14183The arguments and return value are floating-point numbers of the same type.
14184
14185Semantics:
14186""""""""""
14187
14188Return the same value as a corresponding libm '``pow``' function but without
14189trapping or setting ``errno``.
14190
14191When specified with the fast-math-flag 'afn', the result may be approximated
14192using a less accurate calculation.
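
Example:
""""""""

A minimal sketch that raises a value to a non-integer power, which
'``llvm.powi``' cannot express. The function name ``@pow_three_halves`` is
hypothetical.

.. code-block:: llvm

      declare double @llvm.pow.f64(double, double)

      define double @pow_three_halves(double %x) {
        ; Computes %x raised to the power 1.5.
        %r = call double @llvm.pow.f64(double %x, double 1.5)
        ret double %r
      }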
14193
14194'``llvm.exp.*``' Intrinsic
14195^^^^^^^^^^^^^^^^^^^^^^^^^^
14196
14197Syntax:
14198"""""""
14199
14200This is an overloaded intrinsic. You can use ``llvm.exp`` on any
14201floating-point or vector of floating-point type. Not all targets support
14202all types however.
14203
14204::
14205
14206      declare float     @llvm.exp.f32(float  %Val)
14207      declare double    @llvm.exp.f64(double %Val)
14208      declare x86_fp80  @llvm.exp.f80(x86_fp80  %Val)
14209      declare fp128     @llvm.exp.f128(fp128 %Val)
14210      declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128  %Val)
14211
14212Overview:
14213"""""""""
14214
14215The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
14216value.
14217
14218Arguments:
14219""""""""""
14220
14221The argument and return value are floating-point numbers of the same type.
14222
14223Semantics:
14224""""""""""
14225
14226Return the same value as a corresponding libm '``exp``' function but without
14227trapping or setting ``errno``.
14228
14229When specified with the fast-math-flag 'afn', the result may be approximated
14230using a less accurate calculation.
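
Example:
""""""""

As an illustrative sketch, the logistic function ``1 / (1 + e^-x)`` can be
built from '``llvm.exp``' and ordinary floating-point instructions. The
function name ``@sigmoid`` is hypothetical.

.. code-block:: llvm

      declare double @llvm.exp.f64(double)

      define double @sigmoid(double %x) {
        %negx = fneg double %x
        %e = call double @llvm.exp.f64(double %negx)
        %den = fadd double 1.0, %e
        %r = fdiv double 1.0, %den
        ret double %r
      }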
14231
14232'``llvm.exp2.*``' Intrinsic
14233^^^^^^^^^^^^^^^^^^^^^^^^^^^
14234
14235Syntax:
14236"""""""
14237
14238This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
14239floating-point or vector of floating-point type. Not all targets support
14240all types however.
14241
14242::
14243
14244      declare float     @llvm.exp2.f32(float  %Val)
14245      declare double    @llvm.exp2.f64(double %Val)
14246      declare x86_fp80  @llvm.exp2.f80(x86_fp80  %Val)
14247      declare fp128     @llvm.exp2.f128(fp128 %Val)
14248      declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128  %Val)
14249
14250Overview:
14251"""""""""
14252
14253The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
14254specified value.
14255
14256Arguments:
14257""""""""""
14258
14259The argument and return value are floating-point numbers of the same type.
14260
14261Semantics:
14262""""""""""
14263
14264Return the same value as a corresponding libm '``exp2``' function but without
14265trapping or setting ``errno``.
14266
14267When specified with the fast-math-flag 'afn', the result may be approximated
14268using a less accurate calculation.
14269
14270'``llvm.log.*``' Intrinsic
14271^^^^^^^^^^^^^^^^^^^^^^^^^^
14272
14273Syntax:
14274"""""""
14275
14276This is an overloaded intrinsic. You can use ``llvm.log`` on any
14277floating-point or vector of floating-point type. Not all targets support
14278all types however.
14279
14280::
14281
14282      declare float     @llvm.log.f32(float  %Val)
14283      declare double    @llvm.log.f64(double %Val)
14284      declare x86_fp80  @llvm.log.f80(x86_fp80  %Val)
14285      declare fp128     @llvm.log.f128(fp128 %Val)
14286      declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128  %Val)
14287
14288Overview:
14289"""""""""
14290
14291The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
14292value.
14293
14294Arguments:
14295""""""""""
14296
14297The argument and return value are floating-point numbers of the same type.
14298
14299Semantics:
14300""""""""""
14301
14302Return the same value as a corresponding libm '``log``' function but without
14303trapping or setting ``errno``.
14304
14305When specified with the fast-math-flag 'afn', the result may be approximated
14306using a less accurate calculation.
14307
14308'``llvm.log10.*``' Intrinsic
14309^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14310
14311Syntax:
14312"""""""
14313
14314This is an overloaded intrinsic. You can use ``llvm.log10`` on any
14315floating-point or vector of floating-point type. Not all targets support
14316all types however.
14317
14318::
14319
14320      declare float     @llvm.log10.f32(float  %Val)
14321      declare double    @llvm.log10.f64(double %Val)
14322      declare x86_fp80  @llvm.log10.f80(x86_fp80  %Val)
14323      declare fp128     @llvm.log10.f128(fp128 %Val)
14324      declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128  %Val)
14325
14326Overview:
14327"""""""""
14328
14329The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
14330specified value.
14331
14332Arguments:
14333""""""""""
14334
14335The argument and return value are floating-point numbers of the same type.
14336
14337Semantics:
14338""""""""""
14339
14340Return the same value as a corresponding libm '``log10``' function but without
14341trapping or setting ``errno``.
14342
14343When specified with the fast-math-flag 'afn', the result may be approximated
14344using a less accurate calculation.
14345
14346'``llvm.log2.*``' Intrinsic
14347^^^^^^^^^^^^^^^^^^^^^^^^^^^
14348
14349Syntax:
14350"""""""
14351
14352This is an overloaded intrinsic. You can use ``llvm.log2`` on any
14353floating-point or vector of floating-point type. Not all targets support
14354all types however.
14355
14356::
14357
14358      declare float     @llvm.log2.f32(float  %Val)
14359      declare double    @llvm.log2.f64(double %Val)
14360      declare x86_fp80  @llvm.log2.f80(x86_fp80  %Val)
14361      declare fp128     @llvm.log2.f128(fp128 %Val)
14362      declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128  %Val)
14363
14364Overview:
14365"""""""""
14366
14367The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
14368value.
14369
14370Arguments:
14371""""""""""
14372
14373The argument and return value are floating-point numbers of the same type.
14374
14375Semantics:
14376""""""""""
14377
14378Return the same value as a corresponding libm '``log2``' function but without
14379trapping or setting ``errno``.
14380
14381When specified with the fast-math-flag 'afn', the result may be approximated
14382using a less accurate calculation.
14383
14384.. _int_fma:
14385
14386'``llvm.fma.*``' Intrinsic
14387^^^^^^^^^^^^^^^^^^^^^^^^^^
14388
14389Syntax:
14390"""""""
14391
14392This is an overloaded intrinsic. You can use ``llvm.fma`` on any
14393floating-point or vector of floating-point type. Not all targets support
14394all types however.
14395
14396::
14397
14398      declare float     @llvm.fma.f32(float  %a, float  %b, float  %c)
14399      declare double    @llvm.fma.f64(double %a, double %b, double %c)
14400      declare x86_fp80  @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
14401      declare fp128     @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
14402      declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)
14403
14404Overview:
14405"""""""""
14406
14407The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.
14408
14409Arguments:
14410""""""""""
14411
14412The arguments and return value are floating-point numbers of the same type.
14413
14414Semantics:
14415""""""""""
14416
14417Return the same value as a corresponding libm '``fma``' function but without
14418trapping or setting ``errno``.
14419
14420When specified with the fast-math-flag 'afn', the result may be approximated
14421using a less accurate calculation.
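
Example:
""""""""

A small sketch with constant operands, chosen so the result is exact. The
function name is hypothetical.

.. code-block:: llvm

      declare double @llvm.fma.f64(double, double, double)

      define double @fma_example() {
        ; (2.0 * 3.0) + 1.0 computed as one fused operation: 7.0
        %r = call double @llvm.fma.f64(double 2.0, double 3.0, double 1.0)
        ret double %r
      }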
14422
14423'``llvm.fabs.*``' Intrinsic
14424^^^^^^^^^^^^^^^^^^^^^^^^^^^
14425
14426Syntax:
14427"""""""
14428
14429This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
14430floating-point or vector of floating-point type. Not all targets support
14431all types however.
14432
14433::
14434
14435      declare float     @llvm.fabs.f32(float  %Val)
14436      declare double    @llvm.fabs.f64(double %Val)
14437      declare x86_fp80  @llvm.fabs.f80(x86_fp80 %Val)
14438      declare fp128     @llvm.fabs.f128(fp128 %Val)
14439      declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)
14440
14441Overview:
14442"""""""""
14443
14444The '``llvm.fabs.*``' intrinsics return the absolute value of the
14445operand.
14446
14447Arguments:
14448""""""""""
14449
14450The argument and return value are floating-point numbers of the same
14451type.
14452
14453Semantics:
14454""""""""""
14455
14456This function returns the same values as the libm ``fabs`` functions
14457would, and handles error conditions in the same way.
14458
14459'``llvm.minnum.*``' Intrinsic
14460^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14461
14462Syntax:
14463"""""""
14464
14465This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
14466floating-point or vector of floating-point type. Not all targets support
14467all types however.
14468
14469::
14470
14471      declare float     @llvm.minnum.f32(float %Val0, float %Val1)
14472      declare double    @llvm.minnum.f64(double %Val0, double %Val1)
14473      declare x86_fp80  @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14474      declare fp128     @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
14475      declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14476
14477Overview:
14478"""""""""
14479
14480The '``llvm.minnum.*``' intrinsics return the minimum of the two
14481arguments.
14482
14483
14484Arguments:
14485""""""""""
14486
14487The arguments and return value are floating-point numbers of the same
14488type.
14489
14490Semantics:
14491""""""""""
14492
14493Follows the IEEE-754 semantics for minNum, except for handling of
signaling NaNs. This matches the behavior of libm's fmin.
14495
14496If either operand is a NaN, returns the other non-NaN operand. Returns
14497NaN only if both operands are NaN. The returned NaN is always
14498quiet. If the operands compare equal, returns a value that compares
14499equal to both operands. This means that fmin(+/-0.0, +/-0.0) could
14500return either -0.0 or 0.0.
14501
14502Unlike the IEEE-754 2008 behavior, this does not distinguish between
14503signaling and quiet NaN inputs. If a target's implementation follows
14504the standard and returns a quiet NaN if either input is a signaling
14505NaN, the intrinsic lowering is responsible for quieting the inputs to
14506correctly return the non-NaN input (e.g. by using the equivalent of
14507``llvm.canonicalize``).
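
Example:
""""""""

A minimal sketch of the NaN handling described above, assuming
``0x7FF8000000000000`` as the quiet NaN constant. The function name is
illustrative.

.. code-block:: llvm

      declare float @llvm.minnum.f32(float, float)

      define float @minnum_example() {
        ; One operand is a quiet NaN, so the non-NaN operand 2.0 is returned.
        %r = call float @llvm.minnum.f32(float 0x7FF8000000000000, float 2.0)
        ret float %r
      }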
14508
14509
14510'``llvm.maxnum.*``' Intrinsic
14511^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14512
14513Syntax:
14514"""""""
14515
14516This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
14517floating-point or vector of floating-point type. Not all targets support
14518all types however.
14519
14520::
14521
14522      declare float     @llvm.maxnum.f32(float  %Val0, float  %Val1)
14523      declare double    @llvm.maxnum.f64(double %Val0, double %Val1)
14524      declare x86_fp80  @llvm.maxnum.f80(x86_fp80  %Val0, x86_fp80  %Val1)
14525      declare fp128     @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
14526      declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128  %Val0, ppc_fp128  %Val1)
14527
14528Overview:
14529"""""""""
14530
14531The '``llvm.maxnum.*``' intrinsics return the maximum of the two
14532arguments.
14533
14534
14535Arguments:
14536""""""""""
14537
14538The arguments and return value are floating-point numbers of the same
14539type.
14540
14541Semantics:
""""""""""

Follows the IEEE-754 semantics for maxNum except for the handling of
14544signaling NaNs. This matches the behavior of libm's fmax.
14545
14546If either operand is a NaN, returns the other non-NaN operand. Returns
14547NaN only if both operands are NaN. The returned NaN is always
14548quiet. If the operands compare equal, returns a value that compares
14549equal to both operands. This means that fmax(+/-0.0, +/-0.0) could
14550return either -0.0 or 0.0.
14551
14552Unlike the IEEE-754 2008 behavior, this does not distinguish between
14553signaling and quiet NaN inputs. If a target's implementation follows
14554the standard and returns a quiet NaN if either input is a signaling
14555NaN, the intrinsic lowering is responsible for quieting the inputs to
14556correctly return the non-NaN input (e.g. by using the equivalent of
14557``llvm.canonicalize``).
14558
14559'``llvm.minimum.*``' Intrinsic
14560^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14561
14562Syntax:
14563"""""""
14564
14565This is an overloaded intrinsic. You can use ``llvm.minimum`` on any
14566floating-point or vector of floating-point type. Not all targets support
14567all types however.
14568
14569::
14570
14571      declare float     @llvm.minimum.f32(float %Val0, float %Val1)
14572      declare double    @llvm.minimum.f64(double %Val0, double %Val1)
14573      declare x86_fp80  @llvm.minimum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14574      declare fp128     @llvm.minimum.f128(fp128 %Val0, fp128 %Val1)
14575      declare ppc_fp128 @llvm.minimum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14576
14577Overview:
14578"""""""""
14579
14580The '``llvm.minimum.*``' intrinsics return the minimum of the two
14581arguments, propagating NaNs and treating -0.0 as less than +0.0.
14582
14583
14584Arguments:
14585""""""""""
14586
14587The arguments and return value are floating-point numbers of the same
14588type.
14589
14590Semantics:
""""""""""

If either operand is a NaN, returns NaN. Otherwise returns the lesser
14593of the two arguments. -0.0 is considered to be less than +0.0 for this
14594intrinsic. Note that these are the semantics specified in the draft of
14595IEEE 754-2018.
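
Example:
""""""""

A short sketch of the two cases where this intrinsic differs from
``llvm.minnum``: NaN propagation and signed zeros. The function names are
hypothetical, and ``0x7FF8000000000000`` is assumed as the quiet NaN
constant.

.. code-block:: llvm

      declare float @llvm.minimum.f32(float, float)

      define float @minimum_nan_example() {
        ; A NaN operand propagates: the result is NaN, not 2.0.
        %r = call float @llvm.minimum.f32(float 0x7FF8000000000000, float 2.0)
        ret float %r
      }

      define float @minimum_zero_example() {
        ; -0.0 is treated as less than +0.0: the result is -0.0.
        %r = call float @llvm.minimum.f32(float -0.0, float 0.0)
        ret float %r
      }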
14596
14597'``llvm.maximum.*``' Intrinsic
14598^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14599
14600Syntax:
14601"""""""
14602
14603This is an overloaded intrinsic. You can use ``llvm.maximum`` on any
14604floating-point or vector of floating-point type. Not all targets support
14605all types however.
14606
14607::
14608
14609      declare float     @llvm.maximum.f32(float %Val0, float %Val1)
14610      declare double    @llvm.maximum.f64(double %Val0, double %Val1)
14611      declare x86_fp80  @llvm.maximum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
14612      declare fp128     @llvm.maximum.f128(fp128 %Val0, fp128 %Val1)
14613      declare ppc_fp128 @llvm.maximum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)
14614
14615Overview:
14616"""""""""
14617
14618The '``llvm.maximum.*``' intrinsics return the maximum of the two
14619arguments, propagating NaNs and treating -0.0 as less than +0.0.
14620
14621
14622Arguments:
14623""""""""""
14624
14625The arguments and return value are floating-point numbers of the same
14626type.
14627
14628Semantics:
""""""""""

If either operand is a NaN, returns NaN. Otherwise returns the greater
14631of the two arguments. -0.0 is considered to be less than +0.0 for this
14632intrinsic. Note that these are the semantics specified in the draft of
14633IEEE 754-2018.
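
Example:
""""""""

A minimal sketch of the signed-zero ordering; the function name is
illustrative only.

.. code-block:: llvm

      declare double @llvm.maximum.f64(double, double)

      define double @maximum_zero_example() {
        ; -0.0 is treated as less than +0.0: the result is +0.0.
        %r = call double @llvm.maximum.f64(double -0.0, double 0.0)
        ret double %r
      }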
14634
14635'``llvm.copysign.*``' Intrinsic
14636^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14637
14638Syntax:
14639"""""""
14640
14641This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
14642floating-point or vector of floating-point type. Not all targets support
14643all types however.
14644
14645::
14646
14647      declare float     @llvm.copysign.f32(float  %Mag, float  %Sgn)
14648      declare double    @llvm.copysign.f64(double %Mag, double %Sgn)
14649      declare x86_fp80  @llvm.copysign.f80(x86_fp80  %Mag, x86_fp80  %Sgn)
14650      declare fp128     @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
14651      declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128  %Mag, ppc_fp128  %Sgn)
14652
14653Overview:
14654"""""""""
14655
14656The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
14657first operand and the sign of the second operand.
14658
14659Arguments:
14660""""""""""
14661
14662The arguments and return value are floating-point numbers of the same
14663type.
14664
14665Semantics:
14666""""""""""
14667
14668This function returns the same values as the libm ``copysign``
14669functions would, and handles error conditions in the same way.
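
Example:
""""""""

A small sketch showing that only the sign bit of the second operand is
used, even when that operand is a zero. The function name is hypothetical.

.. code-block:: llvm

      declare double @llvm.copysign.f64(double, double)

      define double @copysign_example() {
        ; Magnitude 3.0 with the sign of -0.0: the result is -3.0.
        %r = call double @llvm.copysign.f64(double 3.0, double -0.0)
        ret double %r
      }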
14670
14671'``llvm.floor.*``' Intrinsic
14672^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14673
14674Syntax:
14675"""""""
14676
14677This is an overloaded intrinsic. You can use ``llvm.floor`` on any
14678floating-point or vector of floating-point type. Not all targets support
14679all types however.
14680
14681::
14682
14683      declare float     @llvm.floor.f32(float  %Val)
14684      declare double    @llvm.floor.f64(double %Val)
14685      declare x86_fp80  @llvm.floor.f80(x86_fp80  %Val)
14686      declare fp128     @llvm.floor.f128(fp128 %Val)
14687      declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128  %Val)
14688
14689Overview:
14690"""""""""
14691
14692The '``llvm.floor.*``' intrinsics return the floor of the operand.
14693
14694Arguments:
14695""""""""""
14696
14697The argument and return value are floating-point numbers of the same
14698type.
14699
14700Semantics:
14701""""""""""
14702
14703This function returns the same values as the libm ``floor`` functions
14704would, and handles error conditions in the same way.
14705
14706'``llvm.ceil.*``' Intrinsic
14707^^^^^^^^^^^^^^^^^^^^^^^^^^^
14708
14709Syntax:
14710"""""""
14711
14712This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
14713floating-point or vector of floating-point type. Not all targets support
14714all types however.
14715
14716::
14717
14718      declare float     @llvm.ceil.f32(float  %Val)
14719      declare double    @llvm.ceil.f64(double %Val)
14720      declare x86_fp80  @llvm.ceil.f80(x86_fp80  %Val)
14721      declare fp128     @llvm.ceil.f128(fp128 %Val)
14722      declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128  %Val)
14723
14724Overview:
14725"""""""""
14726
14727The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.
14728
14729Arguments:
14730""""""""""
14731
14732The argument and return value are floating-point numbers of the same
14733type.
14734
14735Semantics:
14736""""""""""
14737
14738This function returns the same values as the libm ``ceil`` functions
14739would, and handles error conditions in the same way.
14740
14741'``llvm.trunc.*``' Intrinsic
14742^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14743
14744Syntax:
14745"""""""
14746
14747This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
14748floating-point or vector of floating-point type. Not all targets support
14749all types however.
14750
14751::
14752
14753      declare float     @llvm.trunc.f32(float  %Val)
14754      declare double    @llvm.trunc.f64(double %Val)
14755      declare x86_fp80  @llvm.trunc.f80(x86_fp80  %Val)
14756      declare fp128     @llvm.trunc.f128(fp128 %Val)
14757      declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128  %Val)
14758
14759Overview:
14760"""""""""
14761
The '``llvm.trunc.*``' intrinsics return the operand rounded to the
14763nearest integer not larger in magnitude than the operand.
14764
14765Arguments:
14766""""""""""
14767
14768The argument and return value are floating-point numbers of the same
14769type.
14770
14771Semantics:
14772""""""""""
14773
14774This function returns the same values as the libm ``trunc`` functions
14775would, and handles error conditions in the same way.
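
Example:
""""""""

A minimal sketch contrasting ``llvm.trunc`` with ``llvm.floor`` on a
negative input, where the two differ. The function and value names are
illustrative.

.. code-block:: llvm

      declare float @llvm.trunc.f32(float)
      declare float @llvm.floor.f32(float)

      define float @trunc_example() {
        %t = call float @llvm.trunc.f32(float -1.5)  ; -1.0 (toward zero)
        %f = call float @llvm.floor.f32(float -1.5)  ; -2.0 (toward -infinity)
        %diff = fsub float %t, %f                    ; 1.0
        ret float %diff
      }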
14776
14777'``llvm.rint.*``' Intrinsic
14778^^^^^^^^^^^^^^^^^^^^^^^^^^^
14779
14780Syntax:
14781"""""""
14782
14783This is an overloaded intrinsic. You can use ``llvm.rint`` on any
14784floating-point or vector of floating-point type. Not all targets support
14785all types however.
14786
14787::
14788
14789      declare float     @llvm.rint.f32(float  %Val)
14790      declare double    @llvm.rint.f64(double %Val)
14791      declare x86_fp80  @llvm.rint.f80(x86_fp80  %Val)
14792      declare fp128     @llvm.rint.f128(fp128 %Val)
14793      declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128  %Val)
14794
14795Overview:
14796"""""""""
14797
The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. They may raise an inexact floating-point exception if the
operand is not an integer.
14801
14802Arguments:
14803""""""""""
14804
14805The argument and return value are floating-point numbers of the same
14806type.
14807
14808Semantics:
14809""""""""""
14810
14811This function returns the same values as the libm ``rint`` functions
14812would, and handles error conditions in the same way.
14813
14814'``llvm.nearbyint.*``' Intrinsic
14815^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14816
14817Syntax:
14818"""""""
14819
14820This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
14821floating-point or vector of floating-point type. Not all targets support
14822all types however.
14823
14824::
14825
14826      declare float     @llvm.nearbyint.f32(float  %Val)
14827      declare double    @llvm.nearbyint.f64(double %Val)
14828      declare x86_fp80  @llvm.nearbyint.f80(x86_fp80  %Val)
14829      declare fp128     @llvm.nearbyint.f128(fp128 %Val)
14830      declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128  %Val)
14831
14832Overview:
14833"""""""""
14834
The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
14836nearest integer.
14837
14838Arguments:
14839""""""""""
14840
14841The argument and return value are floating-point numbers of the same
14842type.
14843
14844Semantics:
14845""""""""""
14846
14847This function returns the same values as the libm ``nearbyint``
14848functions would, and handles error conditions in the same way.
14849
14850'``llvm.round.*``' Intrinsic
14851^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14852
14853Syntax:
14854"""""""
14855
14856This is an overloaded intrinsic. You can use ``llvm.round`` on any
14857floating-point or vector of floating-point type. Not all targets support
14858all types however.
14859
14860::
14861
14862      declare float     @llvm.round.f32(float  %Val)
14863      declare double    @llvm.round.f64(double %Val)
14864      declare x86_fp80  @llvm.round.f80(x86_fp80  %Val)
14865      declare fp128     @llvm.round.f128(fp128 %Val)
14866      declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128  %Val)
14867
14868Overview:
14869"""""""""
14870
The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer, with halfway cases rounded away from zero.
14873
14874Arguments:
14875""""""""""
14876
14877The argument and return value are floating-point numbers of the same
14878type.
14879
14880Semantics:
14881""""""""""
14882
14883This function returns the same values as the libm ``round``
14884functions would, and handles error conditions in the same way.
14885
14886'``llvm.roundeven.*``' Intrinsic
14887^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14888
14889Syntax:
14890"""""""
14891
14892This is an overloaded intrinsic. You can use ``llvm.roundeven`` on any
14893floating-point or vector of floating-point type. Not all targets support
14894all types however.
14895
14896::
14897
14898      declare float     @llvm.roundeven.f32(float  %Val)
14899      declare double    @llvm.roundeven.f64(double %Val)
14900      declare x86_fp80  @llvm.roundeven.f80(x86_fp80  %Val)
14901      declare fp128     @llvm.roundeven.f128(fp128 %Val)
14902      declare ppc_fp128 @llvm.roundeven.ppcf128(ppc_fp128  %Val)
14903
14904Overview:
14905"""""""""
14906
The '``llvm.roundeven.*``' intrinsics return the operand rounded to the nearest
14908integer in floating-point format rounding halfway cases to even (that is, to the
14909nearest value that is an even integer).
14910
14911Arguments:
14912""""""""""
14913
14914The argument and return value are floating-point numbers of the same type.
14915
14916Semantics:
14917""""""""""
14918
14919This function implements IEEE-754 operation ``roundToIntegralTiesToEven``. It
14920also behaves in the same way as C standard function ``roundeven``, except that
14921it does not raise floating point exceptions.
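
Example:
""""""""

A short sketch contrasting ``llvm.roundeven`` with ``llvm.round`` on a
halfway case. The function and value names are hypothetical.

.. code-block:: llvm

      declare double @llvm.round.f64(double)
      declare double @llvm.roundeven.f64(double)

      define double @roundeven_example() {
        ; 2.5 is a halfway case: roundeven picks the even neighbour, 2.0.
        %even = call double @llvm.roundeven.f64(double 2.5)
        ; libm round breaks the tie away from zero and gives 3.0.
        %away = call double @llvm.round.f64(double 2.5)
        %diff = fsub double %away, %even   ; 1.0
        ret double %diff
      }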
14922
14923
14924'``llvm.lround.*``' Intrinsic
14925^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14926
14927Syntax:
14928"""""""
14929
14930This is an overloaded intrinsic. You can use ``llvm.lround`` on any
14931floating-point type. Not all targets support all types however.
14932
14933::
14934
      declare i32 @llvm.lround.i32.f32(float %Val)
      declare i32 @llvm.lround.i32.f64(double %Val)
      declare i32 @llvm.lround.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lround.i32.f128(fp128 %Val)
      declare i32 @llvm.lround.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lround.i64.f32(float %Val)
      declare i64 @llvm.lround.i64.f64(double %Val)
      declare i64 @llvm.lround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lround.i64.f128(fp128 %Val)
      declare i64 @llvm.lround.i64.ppcf128(ppc_fp128 %Val)
14946
14947Overview:
14948"""""""""
14949
14950The '``llvm.lround.*``' intrinsics return the operand rounded to the nearest
14951integer with ties away from zero.
14952
14953
14954Arguments:
14955""""""""""
14956
14957The argument is a floating-point number and the return value is an integer
14958type.
14959
14960Semantics:
14961""""""""""
14962
14963This function returns the same values as the libm ``lround``
14964functions would, but without setting errno.
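
Example:
""""""""

A minimal sketch of the tie-breaking rule with a constant operand. The
function name is illustrative only.

.. code-block:: llvm

      declare i32 @llvm.lround.i32.f32(float)

      define i32 @lround_example() {
        ; Ties round away from zero: 2.5 becomes 3 (and -2.5 would become -3).
        %r = call i32 @llvm.lround.i32.f32(float 2.5)
        ret i32 %r
      }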
14965
14966'``llvm.llround.*``' Intrinsic
14967^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14968
14969Syntax:
14970"""""""
14971
14972This is an overloaded intrinsic. You can use ``llvm.llround`` on any
14973floating-point type. Not all targets support all types however.
14974
14975::
14976
      declare i64 @llvm.llround.i64.f32(float %Val)
      declare i64 @llvm.llround.i64.f64(double %Val)
      declare i64 @llvm.llround.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llround.i64.f128(fp128 %Val)
      declare i64 @llvm.llround.i64.ppcf128(ppc_fp128 %Val)
14982
14983Overview:
14984"""""""""
14985
14986The '``llvm.llround.*``' intrinsics return the operand rounded to the nearest
14987integer with ties away from zero.
14988
14989Arguments:
14990""""""""""
14991
14992The argument is a floating-point number and the return value is an integer
14993type.
14994
14995Semantics:
14996""""""""""
14997
14998This function returns the same values as the libm ``llround``
14999functions would, but without setting errno.
15000
15001'``llvm.lrint.*``' Intrinsic
15002^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15003
15004Syntax:
15005"""""""
15006
15007This is an overloaded intrinsic. You can use ``llvm.lrint`` on any
15008floating-point type. Not all targets support all types however.
15009
15010::
15011
      declare i32 @llvm.lrint.i32.f32(float %Val)
      declare i32 @llvm.lrint.i32.f64(double %Val)
      declare i32 @llvm.lrint.i32.f80(x86_fp80 %Val)
      declare i32 @llvm.lrint.i32.f128(fp128 %Val)
      declare i32 @llvm.lrint.i32.ppcf128(ppc_fp128 %Val)

      declare i64 @llvm.lrint.i64.f32(float %Val)
      declare i64 @llvm.lrint.i64.f64(double %Val)
      declare i64 @llvm.lrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.lrint.i64.f128(fp128 %Val)
      declare i64 @llvm.lrint.i64.ppcf128(ppc_fp128 %Val)
15023
15024Overview:
15025"""""""""
15026
15027The '``llvm.lrint.*``' intrinsics return the operand rounded to the nearest
15028integer.
15029
15030
15031Arguments:
15032""""""""""
15033
15034The argument is a floating-point number and the return value is an integer
15035type.
15036
15037Semantics:
15038""""""""""
15039
15040This function returns the same values as the libm ``lrint``
15041functions would, but without setting errno.
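
Example:
""""""""

A small sketch, assuming the default round-to-nearest-even rounding mode;
under a different rounding mode the libm ``lrint`` result would differ. The
function name is hypothetical.

.. code-block:: llvm

      declare i64 @llvm.lrint.i64.f64(double)

      define i64 @lrint_example() {
        ; Under round-to-nearest-even, the halfway case 2.5 rounds to 2.
        %r = call i64 @llvm.lrint.i64.f64(double 2.5)
        ret i64 %r
      }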
15042
15043'``llvm.llrint.*``' Intrinsic
15044^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15045
15046Syntax:
15047"""""""
15048
15049This is an overloaded intrinsic. You can use ``llvm.llrint`` on any
15050floating-point type. Not all targets support all types however.
15051
15052::
15053
      declare i64 @llvm.llrint.i64.f32(float %Val)
      declare i64 @llvm.llrint.i64.f64(double %Val)
      declare i64 @llvm.llrint.i64.f80(x86_fp80 %Val)
      declare i64 @llvm.llrint.i64.f128(fp128 %Val)
      declare i64 @llvm.llrint.i64.ppcf128(ppc_fp128 %Val)
15059
15060Overview:
15061"""""""""
15062
15063The '``llvm.llrint.*``' intrinsics return the operand rounded to the nearest
15064integer.
15065
15066Arguments:
15067""""""""""
15068
15069The argument is a floating-point number and the return value is an integer
15070type.
15071
15072Semantics:
15073""""""""""
15074
15075This function returns the same values as the libm ``llrint``
15076functions would, but without setting errno.
15077
15078Bit Manipulation Intrinsics
15079---------------------------
15080
15081LLVM provides intrinsics for a few important bit manipulation
15082operations. These allow efficient code generation for some algorithms.
15083
15084'``llvm.bitreverse.*``' Intrinsics
15085^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15086
15087Syntax:
15088"""""""
15089
15090This is an overloaded intrinsic function. You can use bitreverse on any
15091integer type.
15092
15093::
15094
15095      declare i16 @llvm.bitreverse.i16(i16 <id>)
15096      declare i32 @llvm.bitreverse.i32(i32 <id>)
15097      declare i64 @llvm.bitreverse.i64(i64 <id>)
15098      declare <4 x i32> @llvm.bitreverse.v4i32(<4 x i32> <id>)
15099
15100Overview:
15101"""""""""
15102
15103The '``llvm.bitreverse``' family of intrinsics is used to reverse the
15104bitpattern of an integer value or vector of integer values; for example
15105``0b10110110`` becomes ``0b01101101``.
15106
15107Semantics:
15108""""""""""
15109
15110The ``llvm.bitreverse.iN`` intrinsic returns an iN value that has bit
15111``M`` in the input moved to bit ``N-M-1`` in the output. The vector
15112intrinsics, such as ``llvm.bitreverse.v4i32``, operate on a per-element
15113basis and the element order is not affected.
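
Example:
""""""""

A minimal sketch that reproduces the bit pattern from the overview using an
``i8`` operand. The function name is illustrative only.

.. code-block:: llvm

      declare i8 @llvm.bitreverse.i8(i8)

      define i8 @bitreverse_example() {
        ; 182 is 0b10110110; the reversed value is 109 (0b01101101).
        %r = call i8 @llvm.bitreverse.i8(i8 182)
        ret i8 %r
      }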
15114
15115'``llvm.bswap.*``' Intrinsics
15116^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15117
15118Syntax:
15119"""""""
15120
15121This is an overloaded intrinsic function. You can use bswap on any
15122integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).
15123
15124::
15125
15126      declare i16 @llvm.bswap.i16(i16 <id>)
15127      declare i32 @llvm.bswap.i32(i32 <id>)
15128      declare i64 @llvm.bswap.i64(i64 <id>)
15129      declare <4 x i32> @llvm.bswap.v4i32(<4 x i32> <id>)
15130
15131Overview:
15132"""""""""
15133
15134The '``llvm.bswap``' family of intrinsics is used to byte swap an integer
15135value or vector of integer values with an even number of bytes (positive
15136multiple of 16 bits).
15137
15138Semantics:
15139""""""""""
15140
15141The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
15142and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
15143intrinsic returns an i32 value that has the four bytes of the input i32
15144swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
15145returned i32 will have its bytes in 3, 2, 1, 0 order. The
15146``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
15147concept to additional even-byte lengths (6 bytes, 8 bytes and more,
15148respectively). The vector intrinsics, such as ``llvm.bswap.v4i32``,
15149operate on a per-element basis and the element order is not affected.
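
Example:
""""""""

A short sketch of the two-byte case with a constant operand. The function
name is hypothetical.

.. code-block:: llvm

      declare i16 @llvm.bswap.i16(i16)

      define i16 @bswap_example() {
        ; 258 is 0x0102; swapping the two bytes gives 513 (0x0201).
        %r = call i16 @llvm.bswap.i16(i16 258)
        ret i16 %r
      }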
15150
15151'``llvm.ctpop.*``' Intrinsic
15152^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15153
15154Syntax:
15155"""""""
15156
This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
15158bit width, or on any vector with integer elements. Not all targets
15159support all bit widths or vector types, however.
15160
15161::
15162
15163      declare i8 @llvm.ctpop.i8(i8  <src>)
15164      declare i16 @llvm.ctpop.i16(i16 <src>)
15165      declare i32 @llvm.ctpop.i32(i32 <src>)
15166      declare i64 @llvm.ctpop.i64(i64 <src>)
15167      declare i256 @llvm.ctpop.i256(i256 <src>)
15168      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)
15169
15170Overview:
15171"""""""""
15172
15173The '``llvm.ctpop``' family of intrinsics counts the number of bits set
15174in a value.
15175
15176Arguments:
15177""""""""""
15178
15179The only argument is the value to be counted. The argument may be of any
15180integer type, or a vector with integer elements. The return type must
15181match the argument type.
15182
15183Semantics:
15184""""""""""
15185
15186The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
15187each element of a vector.
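
Example:
""""""""

A minimal sketch with a constant operand; the function name is illustrative
only.

.. code-block:: llvm

      declare i32 @llvm.ctpop.i32(i32)

      define i32 @ctpop_example() {
        ; 255 is 0b11111111, so eight bits are set: the result is 8.
        %r = call i32 @llvm.ctpop.i32(i32 255)
        ret i32 %r
      }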
15188
15189'``llvm.ctlz.*``' Intrinsic
15190^^^^^^^^^^^^^^^^^^^^^^^^^^^
15191
15192Syntax:
15193"""""""
15194
15195This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
15196integer bit width, or any vector whose elements are integers. Not all
15197targets support all bit widths or vector types, however.
15198
15199::
15200
15201      declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_poison>)
15202      declare <2 x i37> @llvm.ctlz.v2i37(<2 x i37> <src>, i1 <is_zero_poison>)
15203
15204Overview:
15205"""""""""
15206
15207The '``llvm.ctlz``' family of intrinsic functions counts the number of
15208leading zeros in a variable.
15209
15210Arguments:
15211""""""""""
15212
15213The first argument is the value to be counted. This argument may be of
15214any integer type, or a vector with integer element type. The return
15215type must match the first argument type.
15216
15217The second argument is a constant flag that indicates whether the intrinsic
15218returns a valid result if the first argument is zero. If the first
15219argument is zero and the second argument is true, the result is poison.
15220Historically some architectures did not provide a defined result for zero
15221values as efficiently, and many algorithms are now predicated on avoiding
15222zero-value inputs.
15223
15224Semantics:
15225""""""""""
15226
15227The '``llvm.ctlz``' intrinsic counts the leading (most significant)
15228zeros in a variable, or within each element of the vector. If
15229``src == 0`` then the result is the size in bits of the type of ``src``
15230if ``is_zero_poison == 0`` and ``poison`` otherwise. For example,
15231``llvm.ctlz(i32 2) = 30``.
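
Example:
""""""""

A small sketch of the effect of the ``is_zero_poison`` flag. The function
names are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.ctlz.i32(i32, i1)

      define i32 @ctlz_defined_example(i32 %x) {
        ; is_zero_poison = false: a zero input yields 32, the bit width.
        %r = call i32 @llvm.ctlz.i32(i32 %x, i1 false)
        ret i32 %r
      }

      define i32 @ctlz_nonzero_example(i32 %x) {
        ; is_zero_poison = true: the result is poison when %x is zero.
        %r = call i32 @llvm.ctlz.i32(i32 %x, i1 true)
        ret i32 %r
      }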
15232
15233'``llvm.cttz.*``' Intrinsic
15234^^^^^^^^^^^^^^^^^^^^^^^^^^^
15235
15236Syntax:
15237"""""""
15238
15239This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
15240integer bit width, or any vector of integer elements. Not all targets
15241support all bit widths or vector types, however.
15242
15243::
15244
15245      declare i42   @llvm.cttz.i42  (i42   <src>, i1 <is_zero_poison>)
15246      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_poison>)
15247
15248Overview:
15249"""""""""
15250
15251The '``llvm.cttz``' family of intrinsic functions counts the number of
15252trailing zeros.
15253
15254Arguments:
15255""""""""""
15256
15257The first argument is the value to be counted. This argument may be of
15258any integer type, or a vector with integer element type. The return
15259type must match the first argument type.
15260
15261The second argument is a constant flag that indicates whether the intrinsic
15262returns a valid result if the first argument is zero. If the first
15263argument is zero and the second argument is true, the result is poison.
15264Historically some architectures did not provide a defined result for zero
15265values as efficiently, and many algorithms are now predicated on avoiding
15266zero-value inputs.
15267
15268Semantics:
15269""""""""""
15270
15271The '``llvm.cttz``' intrinsic counts the trailing (least significant)
15272zeros in a variable, or within each element of a vector. If ``src == 0``
15273then the result is the size in bits of the type of ``src`` if
15274``is_zero_poison == 0`` and ``poison`` otherwise. For example,
15275``llvm.cttz(2) = 1``.
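
Example:
""""""""

A minimal sketch with constant operands and ``is_zero_poison`` set to
false. The function names are illustrative only.

.. code-block:: llvm

      declare i32 @llvm.cttz.i32(i32, i1)

      define i32 @cttz_example() {
        ; 8 is 0b1000, so there are three trailing zeros: the result is 3.
        %r = call i32 @llvm.cttz.i32(i32 8, i1 false)
        ret i32 %r
      }

      define i32 @cttz_zero_example() {
        ; A zero input with is_zero_poison = false yields 32, the bit width.
        %r = call i32 @llvm.cttz.i32(i32 0, i1 false)
        ret i32 %r
      }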
15276
15277.. _int_overflow:
15278
15279'``llvm.fshl.*``' Intrinsic
15280^^^^^^^^^^^^^^^^^^^^^^^^^^^
15281
15282Syntax:
15283"""""""
15284
15285This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
15286integer bit width or any vector of integer elements. Not all targets
15287support all bit widths or vector types, however.
15288
15289::
15290
15291      declare i8  @llvm.fshl.i8 (i8 %a, i8 %b, i8 %c)
15292      declare i67 @llvm.fshl.i67(i67 %a, i67 %b, i67 %c)
15293      declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
15294
15295Overview:
15296"""""""""
15297
15298The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
15299the first two values are concatenated as { %a : %b } (%a is the most significant
15300bits of the wide value), the combined value is shifted left, and the most
15301significant bits are extracted to produce a result that is the same size as the
15302original arguments. If the first 2 arguments are identical, this is equivalent
15303to a rotate left operation. For vector types, the operation occurs for each
15304element of the vector. The shift argument is treated as an unsigned amount
15305modulo the element size of the arguments.
15306
15307Arguments:
15308""""""""""
15309
15310The first two arguments are the values to be concatenated. The third
15311argument is the shift amount. The arguments may be any integer type or a
15312vector with integer element type. All arguments and the return value must
15313have the same type.
15314
15315Example:
15316""""""""
15317
15318.. code-block:: text
15319
15320      %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
15321      %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15)  ; %r = i8: 128 (0b10000000)
15322      %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11)  ; %r = i8: 120 (0b01111000)
15323      %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8)   ; %r = i8: 0   (0b00000000)
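
The following sketch shows two common uses of the funnel shift left. The
function names ``@rotl32`` and ``@shl64_hi`` are hypothetical and chosen only
for illustration.

.. code-block:: llvm

      declare i32 @llvm.fshl.i32(i32, i32, i32)

      ; Rotate left: both data operands are the same value.
      define i32 @rotl32(i32 %x, i32 %n) {
        %r = call i32 @llvm.fshl.i32(i32 %x, i32 %x, i32 %n)
        ret i32 %r
      }

      ; High word of the 64-bit value (%hi:%lo) shifted left by %n.
      ; This assumes %n is in the range [0, 31].
      define i32 @shl64_hi(i32 %hi, i32 %lo, i32 %n) {
        %r = call i32 @llvm.fshl.i32(i32 %hi, i32 %lo, i32 %n)
        ret i32 %r
      }

Because the shift amount is taken modulo the element size, a shift amount of
32 in ``@shl64_hi`` would return ``%hi`` unchanged, which is why the sketch
restricts ``%n`` to less than 32.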
15324
15325'``llvm.fshr.*``' Intrinsic
15326^^^^^^^^^^^^^^^^^^^^^^^^^^^
15327
15328Syntax:
15329"""""""
15330
15331This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
15332integer bit width or any vector of integer elements. Not all targets
15333support all bit widths or vector types, however.
15334
15335::
15336
15337      declare i8  @llvm.fshr.i8 (i8 %a, i8 %b, i8 %c)
15338      declare i67 @llvm.fshr.i67(i67 %a, i67 %b, i67 %c)
15339      declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)
15340
15341Overview:
15342"""""""""
15343
15344The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
15345the first two values are concatenated as { %a : %b } (%a is the most significant
15346bits of the wide value), the combined value is shifted right, and the least
15347significant bits are extracted to produce a result that is the same size as the
15348original arguments. If the first 2 arguments are identical, this is equivalent
15349to a rotate right operation. For vector types, the operation occurs for each
15350element of the vector. The shift argument is treated as an unsigned amount
15351modulo the element size of the arguments.
15352
15353Arguments:
15354""""""""""
15355
15356The first two arguments are the values to be concatenated. The third
15357argument is the shift amount. The arguments may be any integer type or a
15358vector with integer element type. All arguments and the return value must
15359have the same type.
15360
15361Example:
15362""""""""
15363
15364.. code-block:: text
15365
15366      %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
15367      %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15)  ; %r = i8: 254 (0b11111110)
15368      %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11)  ; %r = i8: 225 (0b11100001)
15369      %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8)   ; %r = i8: 255 (0b11111111)
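
One possible expansion of the ``i8`` funnel shift right into plain shifts is
sketched below. It is an illustration of the semantics, not necessarily the
lowering any target uses, and the function name ``@fshr8_expanded`` is
hypothetical.

.. code-block:: llvm

      define i8 @fshr8_expanded(i8 %a, i8 %b, i8 %c) {
        ; Reduce the shift amount modulo the element size (8).
        %amt = and i8 %c, 7
        ; Low part: bits of %b that remain after shifting right.
        %lo = lshr i8 %b, %amt
        ; High part: %a shifted left by (8 - %amt), done as 1 + (7 - %amt)
        ; so that no single shift amount ever reaches the bit width.
        %a1 = shl i8 %a, 1
        %inv = xor i8 %amt, 7
        %hi = shl i8 %a1, %inv
        %r = or i8 %hi, %lo
        ret i8 %r
      }

With ``%amt == 0`` the high part is shifted out entirely and the result is
``%b``, matching the last example above.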
15370
15371Arithmetic with Overflow Intrinsics
15372-----------------------------------
15373
15374LLVM provides intrinsics for fast arithmetic overflow checking.
15375
15376Each of these intrinsics returns a two-element struct. The first
15377element of this struct contains the result of the corresponding
15378arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
15379the result. Therefore, for example, the first element of the struct
15380returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
15381result of a 32-bit ``add`` instruction with the same operands, where
15382the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.
15383
15384The second element of the result is an ``i1`` that is 1 if the
15385arithmetic operation overflowed and 0 otherwise. An operation
15386overflows if, for any values of its operands ``A`` and ``B`` and for
15387any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
15388not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
15389``sext`` for signed overflow and ``zext`` for unsigned overflow, and
15390``op`` is the underlying arithmetic operation.
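
As a sketch of this definition for signed addition on ``i8``, choosing
``N = 16`` as one value larger than the operand width, the following function
(with the hypothetical name ``@sadd_overflows_i8``) computes both sides of the
comparison and reports whether they differ:

.. code-block:: llvm

      define i1 @sadd_overflows_i8(i8 %a, i8 %b) {
        ; sext(A op B) to i16
        %narrow = add i8 %a, %b
        %lhs = sext i8 %narrow to i16
        ; (sext(A) to i16) op (sext(B) to i16)
        %a.wide = sext i8 %a to i16
        %b.wide = sext i8 %b to i16
        %rhs = add i16 %a.wide, %b.wide
        ; The operation overflows exactly when the two results disagree.
        %ov = icmp ne i16 %lhs, %rhs
        ret i1 %ov
      }

For example, with ``%a = 127`` and ``%b = 1`` the narrow sum wraps to -128
while the wide sum is 128, so the function returns ``true``.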
15391
15392The behavior of these intrinsics is well-defined for all argument
15393values.
15394
15395'``llvm.sadd.with.overflow.*``' Intrinsics
15396^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15397
15398Syntax:
15399"""""""
15400
15401This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
15402on any integer bit width or vectors of integers.
15403
15404::
15405
15406      declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
15407      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
15408      declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
15409      declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15410
15411Overview:
15412"""""""""
15413
15414The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
15415a signed addition of the two arguments, and indicate whether an overflow
15416occurred during the signed summation.
15417
15418Arguments:
15419""""""""""
15420
15421The arguments (%a and %b) and the first element of the result structure
15422may be of integer types of any bit width, but they must have the same
15423bit width. The second element of the result structure must be of type
15424``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
15425addition.
15426
15427Semantics:
15428""""""""""
15429
The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
a signed addition of the two arguments. The intrinsics return a structure
whose first element is the signed sum and whose second element is a bit
specifying whether the signed summation resulted in an overflow.
15435
15436Examples:
15437"""""""""
15438
15439.. code-block:: llvm
15440
15441      %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
15442      %sum = extractvalue {i32, i1} %res, 0
15443      %obit = extractvalue {i32, i1} %res, 1
15444      br i1 %obit, label %overflow, label %normal
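
For the vector form, the second element of the result is a vector of ``i1``
with one overflow bit per lane. The sketch below, using the hypothetical
function name ``@any_lane_overflows``, reduces those bits to a single flag:

.. code-block:: llvm

      declare {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32>, <4 x i32>)
      declare i1 @llvm.vector.reduce.or.v4i1(<4 x i1>)

      define i1 @any_lane_overflows(<4 x i32> %a, <4 x i32> %b) {
        %res = call {<4 x i32>, <4 x i1>} @llvm.sadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
        ; One overflow bit per vector lane.
        %obits = extractvalue {<4 x i32>, <4 x i1>} %res, 1
        %any = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> %obits)
        ret i1 %any
      }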
15445
15446'``llvm.uadd.with.overflow.*``' Intrinsics
15447^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15448
15449Syntax:
15450"""""""
15451
15452This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
15453on any integer bit width or vectors of integers.
15454
15455::
15456
15457      declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
15458      declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
15459      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
15460      declare {<4 x i32>, <4 x i1>} @llvm.uadd.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15461
15462Overview:
15463"""""""""
15464
15465The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
15466an unsigned addition of the two arguments, and indicate whether a carry
15467occurred during the unsigned summation.
15468
15469Arguments:
15470""""""""""
15471
15472The arguments (%a and %b) and the first element of the result structure
15473may be of integer types of any bit width, but they must have the same
15474bit width. The second element of the result structure must be of type
15475``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
15476addition.
15477
15478Semantics:
15479""""""""""
15480
The '``llvm.uadd.with.overflow``' family of intrinsic functions performs
an unsigned addition of the two arguments. The intrinsics return a structure
whose first element is the sum and whose second element is a bit specifying
whether the unsigned summation resulted in a carry.
15485
15486Examples:
15487"""""""""
15488
15489.. code-block:: llvm
15490
15491      %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
15492      %sum = extractvalue {i32, i1} %res, 0
15493      %obit = extractvalue {i32, i1} %res, 1
15494      br i1 %obit, label %carry, label %normal
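
The carry bit can also be consumed as data rather than as a branch condition.
The following sketch adds two 128-bit values that are each split into 64-bit
halves; the function name ``@add128`` and the choice of representation are
assumptions made for illustration.

.. code-block:: llvm

      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64, i64)

      define {i64, i64} @add128(i64 %a.lo, i64 %a.hi, i64 %b.lo, i64 %b.hi) {
        ; Add the low halves and capture the carry out.
        %lo.res = call {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a.lo, i64 %b.lo)
        %lo = extractvalue {i64, i1} %lo.res, 0
        %carry = extractvalue {i64, i1} %lo.res, 1
        ; Propagate the carry into the high halves.
        %carry.ext = zext i1 %carry to i64
        %hi.tmp = add i64 %a.hi, %b.hi
        %hi = add i64 %hi.tmp, %carry.ext
        ; Pack the result as {low, high}.
        %r0 = insertvalue {i64, i64} poison, i64 %lo, 0
        %r1 = insertvalue {i64, i64} %r0, i64 %hi, 1
        ret {i64, i64} %r1
      }

Any carry out of the high half is discarded in this sketch, so the result is
the sum modulo 2\ :sup:`128`.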
15495
15496'``llvm.ssub.with.overflow.*``' Intrinsics
15497^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15498
15499Syntax:
15500"""""""
15501
15502This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
15503on any integer bit width or vectors of integers.
15504
15505::
15506
15507      declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
15508      declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
15509      declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
15510      declare {<4 x i32>, <4 x i1>} @llvm.ssub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15511
15512Overview:
15513"""""""""
15514
15515The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
15516a signed subtraction of the two arguments, and indicate whether an
15517overflow occurred during the signed subtraction.
15518
15519Arguments:
15520""""""""""
15521
15522The arguments (%a and %b) and the first element of the result structure
15523may be of integer types of any bit width, but they must have the same
15524bit width. The second element of the result structure must be of type
15525``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
15526subtraction.
15527
15528Semantics:
15529""""""""""
15530
The '``llvm.ssub.with.overflow``' family of intrinsic functions performs
a signed subtraction of the two arguments. The intrinsics return a structure
whose first element is the signed difference and whose second element is a
bit specifying whether the signed subtraction resulted in an overflow.
15536
15537Examples:
15538"""""""""
15539
15540.. code-block:: llvm
15541
15542      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
15544      %obit = extractvalue {i32, i1} %res, 1
15545      br i1 %obit, label %overflow, label %normal
15546
15547'``llvm.usub.with.overflow.*``' Intrinsics
15548^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15549
15550Syntax:
15551"""""""
15552
15553This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
15554on any integer bit width or vectors of integers.
15555
15556::
15557
15558      declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
15559      declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
15560      declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
15561      declare {<4 x i32>, <4 x i1>} @llvm.usub.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15562
15563Overview:
15564"""""""""
15565
15566The '``llvm.usub.with.overflow``' family of intrinsic functions perform
15567an unsigned subtraction of the two arguments, and indicate whether an
15568overflow occurred during the unsigned subtraction.
15569
15570Arguments:
15571""""""""""
15572
15573The arguments (%a and %b) and the first element of the result structure
15574may be of integer types of any bit width, but they must have the same
15575bit width. The second element of the result structure must be of type
15576``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
15577subtraction.
15578
15579Semantics:
15580""""""""""
15581
The '``llvm.usub.with.overflow``' family of intrinsic functions performs
an unsigned subtraction of the two arguments. The intrinsics return a
structure whose first element is the difference and whose second element is
a bit specifying whether the unsigned subtraction resulted in an overflow.
15587
15588Examples:
15589"""""""""
15590
15591.. code-block:: llvm
15592
15593      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
15595      %obit = extractvalue {i32, i1} %res, 1
15596      br i1 %obit, label %overflow, label %normal
15597
15598'``llvm.smul.with.overflow.*``' Intrinsics
15599^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15600
15601Syntax:
15602"""""""
15603
15604This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
15605on any integer bit width or vectors of integers.
15606
15607::
15608
15609      declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
15610      declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
15611      declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
15612      declare {<4 x i32>, <4 x i1>} @llvm.smul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15613
15614Overview:
15615"""""""""
15616
15617The '``llvm.smul.with.overflow``' family of intrinsic functions perform
15618a signed multiplication of the two arguments, and indicate whether an
15619overflow occurred during the signed multiplication.
15620
15621Arguments:
15622""""""""""
15623
15624The arguments (%a and %b) and the first element of the result structure
15625may be of integer types of any bit width, but they must have the same
15626bit width. The second element of the result structure must be of type
15627``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
15628multiplication.
15629
15630Semantics:
15631""""""""""
15632
The '``llvm.smul.with.overflow``' family of intrinsic functions performs
a signed multiplication of the two arguments. The intrinsics return a
structure whose first element is the signed product and whose second element
is a bit specifying whether the signed multiplication resulted in an
overflow.
15638
15639Examples:
15640"""""""""
15641
15642.. code-block:: llvm
15643
15644      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
15646      %obit = extractvalue {i32, i1} %res, 1
15647      br i1 %obit, label %overflow, label %normal
15648
15649'``llvm.umul.with.overflow.*``' Intrinsics
15650^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15651
15652Syntax:
15653"""""""
15654
15655This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
15656on any integer bit width or vectors of integers.
15657
15658::
15659
15660      declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
15661      declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
15662      declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
15663      declare {<4 x i32>, <4 x i1>} @llvm.umul.with.overflow.v4i32(<4 x i32> %a, <4 x i32> %b)
15664
15665Overview:
15666"""""""""
15667
The '``llvm.umul.with.overflow``' family of intrinsic functions performs
an unsigned multiplication of the two arguments, and indicates whether an
overflow occurred during the unsigned multiplication.
15671
15672Arguments:
15673""""""""""
15674
15675The arguments (%a and %b) and the first element of the result structure
15676may be of integer types of any bit width, but they must have the same
15677bit width. The second element of the result structure must be of type
15678``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
15679multiplication.
15680
15681Semantics:
15682""""""""""
15683
The '``llvm.umul.with.overflow``' family of intrinsic functions performs
an unsigned multiplication of the two arguments. The intrinsics return a
structure whose first element is the product and whose second element is a
bit specifying whether the unsigned multiplication resulted in an overflow.
15689
15690Examples:
15691"""""""""
15692
15693.. code-block:: llvm
15694
15695      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
15697      %obit = extractvalue {i32, i1} %res, 1
15698      br i1 %obit, label %overflow, label %normal
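
A typical use is guarding a size computation. In the sketch below, the function
name ``@checked_array_bytes`` and the convention of returning all ones when the
product does not fit are assumptions made for illustration only.

.. code-block:: llvm

      declare {i64, i1} @llvm.umul.with.overflow.i64(i64, i64)

      define i64 @checked_array_bytes(i64 %count, i64 %elt_size) {
        %res = call {i64, i1} @llvm.umul.with.overflow.i64(i64 %count, i64 %elt_size)
        %bytes = extractvalue {i64, i1} %res, 0
        %obit = extractvalue {i64, i1} %res, 1
        ; On overflow, return the sentinel -1 (all ones) instead of the
        ; wrapped product.
        %r = select i1 %obit, i64 -1, i64 %bytes
        ret i64 %r
      }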
15699
15700Saturation Arithmetic Intrinsics
15701---------------------------------
15702
15703Saturation arithmetic is a version of arithmetic in which operations are
15704limited to a fixed range between a minimum and maximum value. If the result of
15705an operation is greater than the maximum value, the result is set (or
15706"clamped") to this maximum. If it is below the minimum, it is clamped to this
15707minimum.
15708
15709
15710'``llvm.sadd.sat.*``' Intrinsics
15711^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15712
15713Syntax
15714"""""""
15715
15716This is an overloaded intrinsic. You can use ``llvm.sadd.sat``
15717on any integer bit width or vectors of integers.
15718
15719::
15720
15721      declare i16 @llvm.sadd.sat.i16(i16 %a, i16 %b)
15722      declare i32 @llvm.sadd.sat.i32(i32 %a, i32 %b)
15723      declare i64 @llvm.sadd.sat.i64(i64 %a, i64 %b)
15724      declare <4 x i32> @llvm.sadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15725
15726Overview
15727"""""""""
15728
15729The '``llvm.sadd.sat``' family of intrinsic functions perform signed
15730saturating addition on the 2 arguments.
15731
15732Arguments
15733""""""""""
15734
15735The arguments (%a and %b) and the result may be of integer types of any bit
15736width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15737values that will undergo signed addition.
15738
15739Semantics:
15740""""""""""
15741
15742The maximum value this operation can clamp to is the largest signed value
15743representable by the bit width of the arguments. The minimum value is the
15744smallest signed value representable by this bit width.
15745
15746
15747Examples
15748"""""""""
15749
15750.. code-block:: llvm
15751
15752      %res = call i4 @llvm.sadd.sat.i4(i4 1, i4 2)  ; %res = 3
15753      %res = call i4 @llvm.sadd.sat.i4(i4 5, i4 6)  ; %res = 7
15754      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 2)  ; %res = -2
15755      %res = call i4 @llvm.sadd.sat.i4(i4 -4, i4 -5)  ; %res = -8
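
The clamping can be sketched in terms of ``llvm.sadd.with.overflow``. The
following ``i8`` expansion is one possible formulation, not necessarily the
one any target uses, and the function name ``@sadd_sat_i8`` is hypothetical.

.. code-block:: llvm

      declare {i8, i1} @llvm.sadd.with.overflow.i8(i8, i8)

      define i8 @sadd_sat_i8(i8 %a, i8 %b) {
        %res = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 %a, i8 %b)
        %sum = extractvalue {i8, i1} %res, 0
        %ov = extractvalue {i8, i1} %res, 1
        ; Signed addition can only overflow when both operands have the same
        ; sign, so the sign of %a selects the bound to clamp to.
        %a.neg = icmp slt i8 %a, 0
        %bound = select i1 %a.neg, i8 -128, i8 127
        %r = select i1 %ov, i8 %bound, i8 %sum
        ret i8 %r
      }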
15756
15757
15758'``llvm.uadd.sat.*``' Intrinsics
15759^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15760
15761Syntax
15762"""""""
15763
15764This is an overloaded intrinsic. You can use ``llvm.uadd.sat``
15765on any integer bit width or vectors of integers.
15766
15767::
15768
15769      declare i16 @llvm.uadd.sat.i16(i16 %a, i16 %b)
15770      declare i32 @llvm.uadd.sat.i32(i32 %a, i32 %b)
15771      declare i64 @llvm.uadd.sat.i64(i64 %a, i64 %b)
15772      declare <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15773
15774Overview
15775"""""""""
15776
15777The '``llvm.uadd.sat``' family of intrinsic functions perform unsigned
15778saturating addition on the 2 arguments.
15779
15780Arguments
15781""""""""""
15782
15783The arguments (%a and %b) and the result may be of integer types of any bit
15784width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15785values that will undergo unsigned addition.
15786
15787Semantics:
15788""""""""""
15789
15790The maximum value this operation can clamp to is the largest unsigned value
15791representable by the bit width of the arguments. Because this is an unsigned
15792operation, the result will never saturate towards zero.
15793
15794
15795Examples
15796"""""""""
15797
15798.. code-block:: llvm
15799
15800      %res = call i4 @llvm.uadd.sat.i4(i4 1, i4 2)  ; %res = 3
15801      %res = call i4 @llvm.uadd.sat.i4(i4 5, i4 6)  ; %res = 11
15802      %res = call i4 @llvm.uadd.sat.i4(i4 8, i4 8)  ; %res = 15
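
For comparison, an unsigned saturating addition can be sketched with
``llvm.uadd.with.overflow`` and a ``select``. The function name
``@uadd_sat_i8`` is hypothetical.

.. code-block:: llvm

      declare {i8, i1} @llvm.uadd.with.overflow.i8(i8, i8)

      define i8 @uadd_sat_i8(i8 %a, i8 %b) {
        %res = call {i8, i1} @llvm.uadd.with.overflow.i8(i8 %a, i8 %b)
        %sum = extractvalue {i8, i1} %res, 0
        %ov = extractvalue {i8, i1} %res, 1
        ; i8 -1 is the all-ones bit pattern, i.e. 255 when read as unsigned.
        %r = select i1 %ov, i8 -1, i8 %sum
        ret i8 %r
      }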
15803
15804
15805'``llvm.ssub.sat.*``' Intrinsics
15806^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15807
15808Syntax
15809"""""""
15810
15811This is an overloaded intrinsic. You can use ``llvm.ssub.sat``
15812on any integer bit width or vectors of integers.
15813
15814::
15815
15816      declare i16 @llvm.ssub.sat.i16(i16 %a, i16 %b)
15817      declare i32 @llvm.ssub.sat.i32(i32 %a, i32 %b)
15818      declare i64 @llvm.ssub.sat.i64(i64 %a, i64 %b)
15819      declare <4 x i32> @llvm.ssub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15820
15821Overview
15822"""""""""
15823
15824The '``llvm.ssub.sat``' family of intrinsic functions perform signed
15825saturating subtraction on the 2 arguments.
15826
15827Arguments
15828""""""""""
15829
15830The arguments (%a and %b) and the result may be of integer types of any bit
15831width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15832values that will undergo signed subtraction.
15833
15834Semantics:
15835""""""""""
15836
15837The maximum value this operation can clamp to is the largest signed value
15838representable by the bit width of the arguments. The minimum value is the
15839smallest signed value representable by this bit width.
15840
15841
15842Examples
15843"""""""""
15844
15845.. code-block:: llvm
15846
15847      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 1)  ; %res = 1
15848      %res = call i4 @llvm.ssub.sat.i4(i4 2, i4 6)  ; %res = -4
15849      %res = call i4 @llvm.ssub.sat.i4(i4 -4, i4 5)  ; %res = -8
15850      %res = call i4 @llvm.ssub.sat.i4(i4 4, i4 -5)  ; %res = 7
15851
15852
15853'``llvm.usub.sat.*``' Intrinsics
15854^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15855
15856Syntax
15857"""""""
15858
15859This is an overloaded intrinsic. You can use ``llvm.usub.sat``
15860on any integer bit width or vectors of integers.
15861
15862::
15863
15864      declare i16 @llvm.usub.sat.i16(i16 %a, i16 %b)
15865      declare i32 @llvm.usub.sat.i32(i32 %a, i32 %b)
15866      declare i64 @llvm.usub.sat.i64(i64 %a, i64 %b)
15867      declare <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15868
15869Overview
15870"""""""""
15871
15872The '``llvm.usub.sat``' family of intrinsic functions perform unsigned
15873saturating subtraction on the 2 arguments.
15874
15875Arguments
15876""""""""""
15877
15878The arguments (%a and %b) and the result may be of integer types of any bit
15879width, but they must have the same bit width. ``%a`` and ``%b`` are the two
15880values that will undergo unsigned subtraction.
15881
15882Semantics:
15883""""""""""
15884
15885The minimum value this operation can clamp to is 0, which is the smallest
15886unsigned value representable by the bit width of the unsigned arguments.
15887Because this is an unsigned operation, the result will never saturate towards
15888the largest possible value representable by this bit width.
15889
15890
15891Examples
15892"""""""""
15893
15894.. code-block:: llvm
15895
15896      %res = call i4 @llvm.usub.sat.i4(i4 2, i4 1)  ; %res = 1
15897      %res = call i4 @llvm.usub.sat.i4(i4 2, i4 6)  ; %res = 0
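
An equivalent formulation using a comparison and a ``select`` is sketched
below for ``i8``. The function name ``@usub_sat_i8`` is hypothetical.

.. code-block:: llvm

      define i8 @usub_sat_i8(i8 %a, i8 %b) {
        ; The subtraction would wrap below zero only when %a < %b.
        %no.wrap = icmp ugt i8 %a, %b
        %diff = sub i8 %a, %b
        %r = select i1 %no.wrap, i8 %diff, i8 0
        ret i8 %r
      }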
15898
15899
15900'``llvm.sshl.sat.*``' Intrinsics
15901^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15902
15903Syntax
15904"""""""
15905
15906This is an overloaded intrinsic. You can use ``llvm.sshl.sat``
15907on integers or vectors of integers of any bit width.
15908
15909::
15910
15911      declare i16 @llvm.sshl.sat.i16(i16 %a, i16 %b)
15912      declare i32 @llvm.sshl.sat.i32(i32 %a, i32 %b)
15913      declare i64 @llvm.sshl.sat.i64(i64 %a, i64 %b)
15914      declare <4 x i32> @llvm.sshl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15915
15916Overview
15917"""""""""
15918
15919The '``llvm.sshl.sat``' family of intrinsic functions perform signed
15920saturating left shift on the first argument.
15921
15922Arguments
15923""""""""""
15924
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
15932
15933
15934Semantics:
15935""""""""""
15936
15937The maximum value this operation can clamp to is the largest signed value
15938representable by the bit width of the arguments. The minimum value is the
15939smallest signed value representable by this bit width.
15940
15941
15942Examples
15943"""""""""
15944
15945.. code-block:: llvm
15946
15947      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 1)  ; %res = 4
15948      %res = call i4 @llvm.sshl.sat.i4(i4 2, i4 2)  ; %res = 7
15949      %res = call i4 @llvm.sshl.sat.i4(i4 -5, i4 1)  ; %res = -8
15950      %res = call i4 @llvm.sshl.sat.i4(i4 -1, i4 1)  ; %res = -2
15951
15952
15953'``llvm.ushl.sat.*``' Intrinsics
15954^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
15955
15956Syntax
15957"""""""
15958
15959This is an overloaded intrinsic. You can use ``llvm.ushl.sat``
15960on integers or vectors of integers of any bit width.
15961
15962::
15963
15964      declare i16 @llvm.ushl.sat.i16(i16 %a, i16 %b)
15965      declare i32 @llvm.ushl.sat.i32(i32 %a, i32 %b)
15966      declare i64 @llvm.ushl.sat.i64(i64 %a, i64 %b)
15967      declare <4 x i32> @llvm.ushl.sat.v4i32(<4 x i32> %a, <4 x i32> %b)
15968
15969Overview
15970"""""""""
15971
15972The '``llvm.ushl.sat``' family of intrinsic functions perform unsigned
15973saturating left shift on the first argument.
15974
15975Arguments
15976""""""""""
15977
The arguments (``%a`` and ``%b``) and the result may be of integer types of any
bit width, but they must have the same bit width. ``%a`` is the value to be
shifted, and ``%b`` is the amount to shift by. If ``%b`` is (statically or
dynamically) equal to or larger than the integer bit width of the arguments,
the result is a :ref:`poison value <poisonvalues>`. If the arguments are
vectors, each vector element of ``%a`` is shifted by the corresponding shift
amount in ``%b``.
15985
15986Semantics:
15987""""""""""
15988
15989The maximum value this operation can clamp to is the largest unsigned value
15990representable by the bit width of the arguments.
15991
15992
15993Examples
15994"""""""""
15995
15996.. code-block:: llvm
15997
15998      %res = call i4 @llvm.ushl.sat.i4(i4 2, i4 1)  ; %res = 4
15999      %res = call i4 @llvm.ushl.sat.i4(i4 3, i4 3)  ; %res = 15
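
One way to sketch the saturation check for ``i8`` is to shift left, shift back,
and compare against the original value. This is an illustration that assumes a
shift amount smaller than the bit width (larger amounts are poison either way);
the function name ``@ushl_sat_i8`` is hypothetical.

.. code-block:: llvm

      define i8 @ushl_sat_i8(i8 %a, i8 %b) {
        %shl = shl i8 %a, %b
        ; If shifting back does not restore %a, set bits were shifted out.
        %back = lshr i8 %shl, %b
        %lost = icmp ne i8 %back, %a
        ; Clamp to the largest unsigned value (all ones) when bits were lost.
        %r = select i1 %lost, i8 -1, i8 %shl
        ret i8 %r
      }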
16000
16001
16002Fixed Point Arithmetic Intrinsics
16003---------------------------------
16004
A fixed point number represents a real number using a fixed number of digits
after the radix point (the equivalent of the decimal point '.'). The number of
digits after the radix point is referred to as the ``scale``. Fixed point
numbers are useful for representing fractional values to a specific precision.
The following intrinsics perform fixed point arithmetic operations on 2
operands of the same scale, specified as the third argument.
16011
16012The ``llvm.*mul.fix`` family of intrinsic functions represents a multiplication
16013of fixed point numbers through scaled integers. Therefore, fixed point
16014multiplication can be represented as
16015
16016.. code-block:: llvm
16017
16018        %result = call i4 @llvm.smul.fix.i4(i4 %a, i4 %b, i32 %scale)
16019
16020        ; Expands to
16021        %a2 = sext i4 %a to i8
16022        %b2 = sext i4 %b to i8
        %mul = mul nsw i8 %a2, %b2
        %scale2 = trunc i32 %scale to i8
        %r = ashr i8 %mul, %scale2  ; this is for a target rounding down towards negative infinity
16026        %result = trunc i8 %r to i4
16027
16028The ``llvm.*div.fix`` family of intrinsic functions represents a division of
16029fixed point numbers through scaled integers. Fixed point division can be
16030represented as:
16031
16032.. code-block:: llvm
16033
        %result = call i4 @llvm.sdiv.fix.i4(i4 %a, i4 %b, i32 %scale)
16035
16036        ; Expands to
16037        %a2 = sext i4 %a to i8
16038        %b2 = sext i4 %b to i8
16039        %scale2 = trunc i32 %scale to i8
16040        %a3 = shl i8 %a2, %scale2
16041        %r = sdiv i8 %a3, %b2 ; this is for a target rounding towards zero
16042        %result = trunc i8 %r to i4
16043
For each of these functions, if the result cannot be represented exactly with
the provided scale, the result is rounded. The rounding direction is left
unspecified here because the preferred rounding may vary between targets; it
is instead selected through a target hook. Different pipelines should legalize
or optimize these operations using the rounding specified by this hook if it
is provided. Operations like constant folding, instruction combining,
KnownBits, and ValueTracking should also use this hook, if provided, and not
assume the direction of rounding. A rounded result must always be within one
unit of precision from the true result. That is, the error between the
returned result and the true result must be less than 1/2^(scale).
16054
16055
16056'``llvm.smul.fix.*``' Intrinsics
16057^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16058
16059Syntax
16060"""""""
16061
16062This is an overloaded intrinsic. You can use ``llvm.smul.fix``
16063on any integer bit width or vectors of integers.
16064
16065::
16066
16067      declare i16 @llvm.smul.fix.i16(i16 %a, i16 %b, i32 %scale)
16068      declare i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 %scale)
16069      declare i64 @llvm.smul.fix.i64(i64 %a, i64 %b, i32 %scale)
16070      declare <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16071
16072Overview
16073"""""""""
16074
16075The '``llvm.smul.fix``' family of intrinsic functions perform signed
16076fixed point multiplication on 2 arguments of the same scale.
16077
16078Arguments
16079""""""""""
16080
The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
vectors of integers, provided they have the same length and element width.
``%a`` and ``%b`` are the two values that will undergo signed fixed point
multiplication. The argument ``%scale`` represents the scale of both operands,
and must be a constant integer.
16087
16088Semantics:
16089""""""""""
16090
16091This operation performs fixed point multiplication on the 2 arguments of a
16092specified scale. The result will also be returned in the same scale specified
16093in the third argument.
16094
16095If the result value cannot be precisely represented in the given scale, the
16096value is rounded up or down to the closest representable value. The rounding
16097direction is unspecified.
16098
16099It is undefined behavior if the result value does not fit within the range of
16100the fixed point type.
16101
16102
16103Examples
16104"""""""""
16105
16106.. code-block:: llvm
16107
16108      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
16109      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
16110      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)
16111
16112      ; The result in the following could be rounded up to -2 or down to -2.5
16113      %res = call i4 @llvm.smul.fix.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
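
At a more practical width, the sketch below multiplies two signed fixed point
values held in ``i32`` with a scale of 16. The function name ``@q16_mul`` and
the choice of scale are assumptions made for illustration.

.. code-block:: llvm

      declare i32 @llvm.smul.fix.i32(i32, i32, i32)

      define i32 @q16_mul(i32 %a, i32 %b) {
        ; Both operands and the result use a scale of 16. The scale must be
        ; a constant integer.
        %r = call i32 @llvm.smul.fix.i32(i32 %a, i32 %b, i32 16)
        ret i32 %r
      }

With this scale, 1.5 is encoded as 98304 and 2.0 as 131072. Their product,
3.0, is encoded as 196608 and is exactly representable, so no rounding is
involved for these inputs.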
16114
16115
16116'``llvm.umul.fix.*``' Intrinsics
16117^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16118
16119Syntax
16120"""""""
16121
16122This is an overloaded intrinsic. You can use ``llvm.umul.fix``
16123on any integer bit width or vectors of integers.
16124
16125::
16126
16127      declare i16 @llvm.umul.fix.i16(i16 %a, i16 %b, i32 %scale)
16128      declare i32 @llvm.umul.fix.i32(i32 %a, i32 %b, i32 %scale)
16129      declare i64 @llvm.umul.fix.i64(i64 %a, i64 %b, i32 %scale)
16130      declare <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16131
16132Overview
16133"""""""""
16134
16135The '``llvm.umul.fix``' family of intrinsic functions perform unsigned
16136fixed point multiplication on 2 arguments of the same scale.
16137
16138Arguments
16139""""""""""
16140
The arguments (%a and %b) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be
vectors of integers, provided they have the same length and element width.
``%a`` and ``%b`` are the two values that will undergo unsigned fixed point
multiplication. The argument ``%scale`` represents the scale of both operands,
and must be a constant integer.
16147
16148Semantics:
16149""""""""""
16150
16151This operation performs unsigned fixed point multiplication on the 2 arguments of a
16152specified scale. The result will also be returned in the same scale specified
16153in the third argument.
16154
16155If the result value cannot be precisely represented in the given scale, the
16156value is rounded up or down to the closest representable value. The rounding
16157direction is unspecified.
16158
16159It is undefined behavior if the result value does not fit within the range of
16160the fixed point type.
16161
16162
16163Examples
16164"""""""""
16165
16166.. code-block:: llvm
16167
16168      %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
16169      %res = call i4 @llvm.umul.fix.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
16170
16171      ; The result in the following could be rounded down to 3.5 or up to 4
16172      %res = call i4 @llvm.umul.fix.i4(i4 15, i4 1, i32 1)  ; %res = 7 (or 8) (7.5 x 0.5 = 3.75)
16173
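As a rough illustration of the semantics above, the following sketch expands an
``i16`` multiplication with a scale of 4 into ordinary instructions. The
function name ``@umul_fix_i16_scale4`` is hypothetical, and shifting the wide
product right (rounding down) is only one of the rounding choices the intrinsic
permits; an actual lowering may differ.

.. code-block:: llvm

      ; Hypothetical helper: one possible expansion of
      ; @llvm.umul.fix.i16(i16 %a, i16 %b, i32 4).
      define i16 @umul_fix_i16_scale4(i16 %a, i16 %b) {
        ; Widen so that the full product is representable.
        %a.wide = zext i16 %a to i32
        %b.wide = zext i16 %b to i32
        %prod = mul nuw i32 %a.wide, %b.wide
        ; Drop the extra fractional bits; this rounds down.
        %shifted = lshr i32 %prod, 4
        ; If the value does not fit in i16 the intrinsic is undefined,
        ; so wrapping in the truncation is acceptable here.
        %res = trunc i32 %shifted to i16
        ret i16 %res
      }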
16174
16175'``llvm.smul.fix.sat.*``' Intrinsics
16176^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16177
16178Syntax
16179"""""""
16180
16181This is an overloaded intrinsic. You can use ``llvm.smul.fix.sat``
16182on any integer bit width or vectors of integers.
16183
16184::
16185
16186      declare i16 @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16187      declare i32 @llvm.smul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16188      declare i64 @llvm.smul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16189      declare <4 x i32> @llvm.smul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16190
16191Overview
16192"""""""""
16193
16194The '``llvm.smul.fix.sat``' family of intrinsic functions perform signed
16195fixed point saturating multiplication on 2 arguments of the same scale.
16196
16197Arguments
16198""""""""""
16199
16200The arguments (%a and %b) and the result may be of integer types of any bit
16201width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16202values that will undergo signed fixed point multiplication. The argument
16203``%scale`` represents the scale of both operands, and must be a constant
16204integer.
16205
16206Semantics:
16207""""""""""
16208
16209This operation performs fixed point multiplication on the 2 arguments of a
16210specified scale. The result will also be returned in the same scale specified
16211in the third argument.
16212
16213If the result value cannot be precisely represented in the given scale, the
16214value is rounded up or down to the closest representable value. The rounding
16215direction is unspecified.
16216
16217The maximum value this operation can clamp to is the largest signed value
16218representable by the bit width of the first 2 arguments. The minimum value is the
16219smallest signed value representable by this bit width.
16220
16221
16222Examples
16223"""""""""
16224
16225.. code-block:: llvm
16226
16227      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
16228      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
16229      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -2, i32 1)  ; %res = -3 (1.5 x -1 = -1.5)
16230
16231      ; The result in the following could be rounded up to -2 or down to -2.5
16232      %res = call i4 @llvm.smul.fix.sat.i4(i4 3, i4 -3, i32 1)  ; %res = -5 (or -4) (1.5 x -1.5 = -2.25)
16233
16234      ; Saturation
16235      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 2, i32 0)  ; %res = 7
16236      %res = call i4 @llvm.smul.fix.sat.i4(i4 7, i4 4, i32 2)  ; %res = 7
16237      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 5, i32 2)  ; %res = -8
16238      %res = call i4 @llvm.smul.fix.sat.i4(i4 -8, i4 -2, i32 1)  ; %res = 7
16239
16240      ; Scale can affect the saturation result
16241      %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 0)  ; %res = 7 (2 x 4 -> clamped to 7)
16242      %res = call i4 @llvm.smul.fix.sat.i4(i4 2, i4 4, i32 1)  ; %res = 4 (1 x 2 = 2)
16243
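The clamping behaviour can be sketched by expanding an ``i16`` multiplication
with a scale of 4 by hand. The function name ``@smul_fix_sat_i16_scale4`` is
hypothetical, and the arithmetic shift rounds toward negative infinity, which
is only one of the permitted rounding directions; this is an illustration
rather than the lowering a backend is required to use.

.. code-block:: llvm

      declare i32 @llvm.smax.i32(i32, i32)
      declare i32 @llvm.smin.i32(i32, i32)

      ; Hypothetical helper: one possible expansion of
      ; @llvm.smul.fix.sat.i16(i16 %a, i16 %b, i32 4).
      define i16 @smul_fix_sat_i16_scale4(i16 %a, i16 %b) {
        ; Widen so that the full signed product is representable.
        %a.wide = sext i16 %a to i32
        %b.wide = sext i16 %b to i32
        %prod = mul nsw i32 %a.wide, %b.wide
        ; Drop the extra fractional bits; this rounds toward -infinity.
        %shifted = ashr i32 %prod, 4
        ; Clamp to the signed i16 range [-32768, 32767].
        %lo = call i32 @llvm.smax.i32(i32 %shifted, i32 -32768)
        %clamped = call i32 @llvm.smin.i32(i32 %lo, i32 32767)
        %res = trunc i32 %clamped to i16
        ret i16 %res
      }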
16244
16245'``llvm.umul.fix.sat.*``' Intrinsics
16246^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16247
16248Syntax
16249"""""""
16250
16251This is an overloaded intrinsic. You can use ``llvm.umul.fix.sat``
16252on any integer bit width or vectors of integers.
16253
16254::
16255
16256      declare i16 @llvm.umul.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16257      declare i32 @llvm.umul.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16258      declare i64 @llvm.umul.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16259      declare <4 x i32> @llvm.umul.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16260
16261Overview
16262"""""""""
16263
16264The '``llvm.umul.fix.sat``' family of intrinsic functions perform unsigned
16265fixed point saturating multiplication on 2 arguments of the same scale.
16266
16267Arguments
16268""""""""""
16269
16270The arguments (%a and %b) and the result may be of integer types of any bit
16271width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16272values that will undergo unsigned fixed point multiplication. The argument
16273``%scale`` represents the scale of both operands, and must be a constant
16274integer.
16275
16276Semantics:
16277""""""""""
16278
16279This operation performs fixed point multiplication on the 2 arguments of a
16280specified scale. The result will also be returned in the same scale specified
16281in the third argument.
16282
16283If the result value cannot be precisely represented in the given scale, the
16284value is rounded up or down to the closest representable value. The rounding
16285direction is unspecified.
16286
16287The maximum value this operation can clamp to is the largest unsigned value
16288representable by the bit width of the first 2 arguments. The minimum value is the
16289smallest unsigned value representable by this bit width (zero).
16290
16291
16292Examples
16293"""""""""
16294
16295.. code-block:: llvm
16296
16297      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 0)  ; %res = 6 (2 x 3 = 6)
16298      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 2, i32 1)  ; %res = 3 (1.5 x 1 = 1.5)
16299
16300      ; The result in the following could be rounded down to 2 or up to 2.5
16301      %res = call i4 @llvm.umul.fix.sat.i4(i4 3, i4 3, i32 1)  ; %res = 4 (or 5) (1.5 x 1.5 = 2.25)
16302
16303      ; Saturation
16304      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 2, i32 0)  ; %res = 15 (8 x 2 -> clamped to 15)
16305      %res = call i4 @llvm.umul.fix.sat.i4(i4 8, i4 8, i32 2)  ; %res = 15 (2 x 2 -> clamped to 3.75)
16306
16307      ; Scale can affect the saturation result
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 0)  ; %res = 15 (4 x 4 -> clamped to 15)
      %res = call i4 @llvm.umul.fix.sat.i4(i4 4, i4 4, i32 1)  ; %res = 8 (2 x 2 = 4)
16310
16311
16312'``llvm.sdiv.fix.*``' Intrinsics
16313^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16314
16315Syntax
16316"""""""
16317
16318This is an overloaded intrinsic. You can use ``llvm.sdiv.fix``
16319on any integer bit width or vectors of integers.
16320
16321::
16322
16323      declare i16 @llvm.sdiv.fix.i16(i16 %a, i16 %b, i32 %scale)
16324      declare i32 @llvm.sdiv.fix.i32(i32 %a, i32 %b, i32 %scale)
16325      declare i64 @llvm.sdiv.fix.i64(i64 %a, i64 %b, i32 %scale)
16326      declare <4 x i32> @llvm.sdiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16327
16328Overview
16329"""""""""
16330
16331The '``llvm.sdiv.fix``' family of intrinsic functions perform signed
16332fixed point division on 2 arguments of the same scale.
16333
16334Arguments
16335""""""""""
16336
The arguments (``%a`` and ``%b``) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be integer
vectors of the same length and element bit width. ``%a`` and ``%b`` are the two
values that will undergo signed fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
16343
16344Semantics:
16345""""""""""
16346
16347This operation performs fixed point division on the 2 arguments of a
16348specified scale. The result will also be returned in the same scale specified
16349in the third argument.
16350
16351If the result value cannot be precisely represented in the given scale, the
16352value is rounded up or down to the closest representable value. The rounding
16353direction is unspecified.
16354
16355It is undefined behavior if the result value does not fit within the range of
16356the fixed point type, or if the second argument is zero.
16357
16358
16359Examples
16360"""""""""
16361
16362.. code-block:: llvm
16363
16364      %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16365      %res = call i4 @llvm.sdiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16366      %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
16367
16368      ; The result in the following could be rounded up to 1 or down to 0.5
16369      %res = call i4 @llvm.sdiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16370
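To make the scaling step concrete, the following sketch expands an ``i16``
division with a scale of 4 by hand. The function name
``@sdiv_fix_i16_scale4`` is hypothetical. The dividend is pre-shifted by the
scale so that the quotient comes out in the same scale as the operands, and
``sdiv`` rounds toward zero, which is one of the permitted rounding choices.

.. code-block:: llvm

      ; Hypothetical helper: one possible expansion of
      ; @llvm.sdiv.fix.i16(i16 %a, i16 %b, i32 4).
      define i16 @sdiv_fix_i16_scale4(i16 %a, i16 %b) {
        ; Widen so that the pre-shifted dividend is representable.
        %a.wide = sext i16 %a to i32
        %b.wide = sext i16 %b to i32
        ; (a * 2^4) / b yields the quotient in scale 4.
        %a.scaled = shl nsw i32 %a.wide, 4
        ; Division by zero is undefined here, as it is for the intrinsic.
        %quot = sdiv i32 %a.scaled, %b.wide
        ; If the quotient does not fit in i16 the intrinsic is undefined,
        ; so wrapping in the truncation is acceptable here.
        %res = trunc i32 %quot to i16
        ret i16 %res
      }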
16371
16372'``llvm.udiv.fix.*``' Intrinsics
16373^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16374
16375Syntax
16376"""""""
16377
16378This is an overloaded intrinsic. You can use ``llvm.udiv.fix``
16379on any integer bit width or vectors of integers.
16380
16381::
16382
16383      declare i16 @llvm.udiv.fix.i16(i16 %a, i16 %b, i32 %scale)
16384      declare i32 @llvm.udiv.fix.i32(i32 %a, i32 %b, i32 %scale)
16385      declare i64 @llvm.udiv.fix.i64(i64 %a, i64 %b, i32 %scale)
16386      declare <4 x i32> @llvm.udiv.fix.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16387
16388Overview
16389"""""""""
16390
16391The '``llvm.udiv.fix``' family of intrinsic functions perform unsigned
16392fixed point division on 2 arguments of the same scale.
16393
16394Arguments
16395""""""""""
16396
The arguments (``%a`` and ``%b``) and the result may be of integer types of any bit
width, but they must have the same bit width. The arguments may also be integer
vectors of the same length and element bit width. ``%a`` and ``%b`` are the two
values that will undergo unsigned fixed point division. The argument
``%scale`` represents the scale of both operands, and must be a constant
integer.
16403
16404Semantics:
16405""""""""""
16406
16407This operation performs fixed point division on the 2 arguments of a
16408specified scale. The result will also be returned in the same scale specified
16409in the third argument.
16410
16411If the result value cannot be precisely represented in the given scale, the
16412value is rounded up or down to the closest representable value. The rounding
16413direction is unspecified.
16414
16415It is undefined behavior if the result value does not fit within the range of
16416the fixed point type, or if the second argument is zero.
16417
16418
16419Examples
16420"""""""""
16421
16422.. code-block:: llvm
16423
16424      %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16425      %res = call i4 @llvm.udiv.fix.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16426      %res = call i4 @llvm.udiv.fix.i4(i4 1, i4 -8, i32 4) ; %res = 2 (0.0625 / 0.5 = 0.125)
16427
16428      ; The result in the following could be rounded up to 1 or down to 0.5
16429      %res = call i4 @llvm.udiv.fix.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16430
16431
16432'``llvm.sdiv.fix.sat.*``' Intrinsics
16433^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16434
16435Syntax
16436"""""""
16437
16438This is an overloaded intrinsic. You can use ``llvm.sdiv.fix.sat``
16439on any integer bit width or vectors of integers.
16440
16441::
16442
16443      declare i16 @llvm.sdiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16444      declare i32 @llvm.sdiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16445      declare i64 @llvm.sdiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16446      declare <4 x i32> @llvm.sdiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16447
16448Overview
16449"""""""""
16450
16451The '``llvm.sdiv.fix.sat``' family of intrinsic functions perform signed
16452fixed point saturating division on 2 arguments of the same scale.
16453
16454Arguments
16455""""""""""
16456
16457The arguments (%a and %b) and the result may be of integer types of any bit
16458width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16459values that will undergo signed fixed point division. The argument
16460``%scale`` represents the scale of both operands, and must be a constant
16461integer.
16462
16463Semantics:
16464""""""""""
16465
16466This operation performs fixed point division on the 2 arguments of a
16467specified scale. The result will also be returned in the same scale specified
16468in the third argument.
16469
16470If the result value cannot be precisely represented in the given scale, the
16471value is rounded up or down to the closest representable value. The rounding
16472direction is unspecified.
16473
16474The maximum value this operation can clamp to is the largest signed value
16475representable by the bit width of the first 2 arguments. The minimum value is the
16476smallest signed value representable by this bit width.
16477
16478It is undefined behavior if the second argument is zero.
16479
16480
16481Examples
16482"""""""""
16483
16484.. code-block:: llvm
16485
16486      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16487      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16488      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 -2, i32 1) ; %res = -3 (1.5 / -1 = -1.5)
16489
16490      ; The result in the following could be rounded up to 1 or down to 0.5
16491      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 2 (or 1) (1.5 / 2 = 0.75)
16492
16493      ; Saturation
16494      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -8, i4 -1, i32 0)  ; %res = 7 (-8 / -1 = 8 => 7)
16495      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 4, i4 2, i32 2)  ; %res = 7 (1 / 0.5 = 2 => 1.75)
16496      %res = call i4 @llvm.sdiv.fix.sat.i4(i4 -4, i4 1, i32 2)  ; %res = -8 (-1 / 0.25 = -4 => -2)
16497
16498
16499'``llvm.udiv.fix.sat.*``' Intrinsics
16500^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16501
16502Syntax
16503"""""""
16504
16505This is an overloaded intrinsic. You can use ``llvm.udiv.fix.sat``
16506on any integer bit width or vectors of integers.
16507
16508::
16509
16510      declare i16 @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 %scale)
16511      declare i32 @llvm.udiv.fix.sat.i32(i32 %a, i32 %b, i32 %scale)
16512      declare i64 @llvm.udiv.fix.sat.i64(i64 %a, i64 %b, i32 %scale)
16513      declare <4 x i32> @llvm.udiv.fix.sat.v4i32(<4 x i32> %a, <4 x i32> %b, i32 %scale)
16514
16515Overview
16516"""""""""
16517
16518The '``llvm.udiv.fix.sat``' family of intrinsic functions perform unsigned
16519fixed point saturating division on 2 arguments of the same scale.
16520
16521Arguments
16522""""""""""
16523
16524The arguments (%a and %b) and the result may be of integer types of any bit
16525width, but they must have the same bit width. ``%a`` and ``%b`` are the two
16526values that will undergo unsigned fixed point division. The argument
16527``%scale`` represents the scale of both operands, and must be a constant
16528integer.
16529
16530Semantics:
16531""""""""""
16532
16533This operation performs fixed point division on the 2 arguments of a
16534specified scale. The result will also be returned in the same scale specified
16535in the third argument.
16536
16537If the result value cannot be precisely represented in the given scale, the
16538value is rounded up or down to the closest representable value. The rounding
16539direction is unspecified.
16540
16541The maximum value this operation can clamp to is the largest unsigned value
16542representable by the bit width of the first 2 arguments. The minimum value is the
16543smallest unsigned value representable by this bit width (zero).
16544
16545It is undefined behavior if the second argument is zero.
16546
16547Examples
16548"""""""""
16549
16550.. code-block:: llvm
16551
16552      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 2, i32 0)  ; %res = 3 (6 / 2 = 3)
16553      %res = call i4 @llvm.udiv.fix.sat.i4(i4 6, i4 4, i32 1)  ; %res = 3 (3 / 2 = 1.5)
16554
16555      ; The result in the following could be rounded down to 0.5 or up to 1
16556      %res = call i4 @llvm.udiv.fix.sat.i4(i4 3, i4 4, i32 1)  ; %res = 1 (or 2) (1.5 / 2 = 0.75)
16557
16558      ; Saturation
16559      %res = call i4 @llvm.udiv.fix.sat.i4(i4 8, i4 2, i32 2)  ; %res = 15 (2 / 0.5 = 4 => 3.75)
16560
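The saturating division can be sketched in the same spirit for ``i16`` with a
scale of 4. The function name ``@udiv_fix_sat_i16_scale4`` is hypothetical, and
``udiv`` rounds down, which is only one of the permitted rounding choices; a
real lowering may look different.

.. code-block:: llvm

      declare i32 @llvm.umin.i32(i32, i32)

      ; Hypothetical helper: one possible expansion of
      ; @llvm.udiv.fix.sat.i16(i16 %a, i16 %b, i32 4).
      define i16 @udiv_fix_sat_i16_scale4(i16 %a, i16 %b) {
        ; Widen so that the pre-shifted dividend is representable.
        %a.wide = zext i16 %a to i32
        %b.wide = zext i16 %b to i32
        %a.scaled = shl nuw i32 %a.wide, 4
        ; Division by zero is undefined here, as it is for the intrinsic.
        %quot = udiv i32 %a.scaled, %b.wide
        ; Clamp to the largest unsigned i16 value.
        %clamped = call i32 @llvm.umin.i32(i32 %quot, i32 65535)
        %res = trunc i32 %clamped to i16
        ret i16 %res
      }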
16561
16562Specialised Arithmetic Intrinsics
16563---------------------------------
16564
16565.. _i_intr_llvm_canonicalize:
16566
16567'``llvm.canonicalize.*``' Intrinsic
16568^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16569
16570Syntax:
16571"""""""
16572
16573::
16574
16575      declare float @llvm.canonicalize.f32(float %a)
16576      declare double @llvm.canonicalize.f64(double %b)
16577
16578Overview:
16579"""""""""
16580
16581The '``llvm.canonicalize.*``' intrinsic returns the platform specific canonical
16582encoding of a floating-point number. This canonicalization is useful for
16583implementing certain numeric primitives such as frexp. The canonical encoding is
16584defined by IEEE-754-2008 to be:
16585
16586::
16587
16588      2.1.8 canonical encoding: The preferred encoding of a floating-point
16589      representation in a format. Applied to declets, significands of finite
16590      numbers, infinities, and NaNs, especially in decimal formats.
16591
16592This operation can also be considered equivalent to the IEEE-754-2008
16593conversion of a floating-point value to the same format. NaNs are handled
16594according to section 6.2.
16595
16596Examples of non-canonical encodings:
16597
16598- x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
16599  converted to a canonical representation per hardware-specific protocol.
16600- Many normal decimal floating-point numbers have non-canonical alternative
16601  encodings.
16602- Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
16603  These are treated as non-canonical encodings of zero and will be flushed to
16604  a zero of the same sign by this operation.
16605
16606Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
16607default exception handling must signal an invalid exception, and produce a
16608quiet NaN result.
16609
16610This function should always be implementable as multiplication by 1.0, provided
16611that the compiler does not constant fold the operation. Likewise, division by
166121.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
16613-0.0 is also sufficient provided that the rounding mode is not -Infinity.
16614
16615``@llvm.canonicalize`` must preserve the equality relation. That is:
16616
16617- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
  ``(x == y)``
16620
16621Additionally, the sign of zero must be conserved:
16622``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``
16623
16624The payload bits of a NaN must be conserved, with two exceptions.
16625First, environments which use only a single canonical representation of NaN
16626must perform said canonicalization. Second, SNaNs must be quieted per the
16627usual methods.
16628
16629The canonicalization operation may be optimized away if:
16630
16631- The input is known to be canonical. For example, it was produced by a
16632  floating-point operation that is required by the standard to be canonical.
16633- The result is consumed only by (or fused with) other floating-point
16634  operations. That is, the bits of the floating-point value are not examined.
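
A minimal usage sketch follows; the function name ``@canonical_bits`` is
hypothetical. Because the result is bitcast to an integer, the bits of the
value are examined, so the second condition above does not by itself allow the
call to be removed.

.. code-block:: llvm

      declare float @llvm.canonicalize.f32(float)

      ; Hypothetical helper: return the bit pattern of the canonical
      ; encoding of %x.
      define i32 @canonical_bits(float %x) {
        %c = call float @llvm.canonicalize.f32(float %x)
        %bits = bitcast float %c to i32
        ret i32 %bits
      }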
16635
16636'``llvm.fmuladd.*``' Intrinsic
16637^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16638
16639Syntax:
16640"""""""
16641
16642::
16643
16644      declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
16645      declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
16646
16647Overview:
16648"""""""""
16649
16650The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
16651expressions that can be fused if the code generator determines that (a) the
16652target instruction set has support for a fused operation, and (b) that the
16653fused operation is more efficient than the equivalent, separate pair of mul
16654and add instructions.
16655
16656Arguments:
16657""""""""""
16658
16659The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
16660multiplicands, a and b, and an addend c.
16661
16662Semantics:
16663""""""""""
16664
16665The expression:
16666
16667::
16668
      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
16670
16671is equivalent to the expression a \* b + c, except that it is unspecified
16672whether rounding will be performed between the multiplication and addition
16673steps. Fusion is not guaranteed, even if the target platform supports it.
16674If a fused multiply-add is required, the corresponding
16675:ref:`llvm.fma <int_fma>` intrinsic function should be used instead.
Like '``llvm.fma.*``', this intrinsic never sets ``errno``.
16677
16678Examples:
16679"""""""""
16680
16681.. code-block:: llvm
16682
16683      %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c
16684
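The two evaluation strategies permitted by the semantics can be written out
explicitly. In this sketch the hypothetical functions ``@unfused`` and
``@fused`` compute the two results that a call to ``llvm.fmuladd.f32`` may
produce; which one is chosen depends on the target and the code generator.

.. code-block:: llvm

      declare float @llvm.fma.f32(float, float, float)

      ; The product is rounded before the addition.
      define float @unfused(float %a, float %b, float %c) {
        %mul = fmul float %a, %b
        %add = fadd float %mul, %c
        ret float %add
      }

      ; A single rounding is applied after the fused multiply-add.
      define float @fused(float %a, float %b, float %c) {
        %fma = call float @llvm.fma.f32(float %a, float %b, float %c)
        ret float %fma
      }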
16685
16686Hardware-Loop Intrinsics
16687------------------------
16688
LLVM supports several intrinsics to mark a loop as a hardware-loop. They are
hints to the backend, which is required to lower these intrinsics further to
target-specific instructions, or to revert the hardware-loop to a normal loop
if target-specific restrictions are not met and a hardware-loop can't be
generated.
16693
16694These intrinsics may be modified in the future and are not intended to be used
16695outside the backend. Thus, front-end and mid-level optimizations should not be
16696generating these intrinsics.
16697
16698
16699'``llvm.set.loop.iterations.*``' Intrinsic
16700^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16701
16702Syntax:
16703"""""""
16704
16705This is an overloaded intrinsic.
16706
16707::
16708
16709      declare void @llvm.set.loop.iterations.i32(i32)
16710      declare void @llvm.set.loop.iterations.i64(i64)
16711
16712Overview:
16713"""""""""
16714
16715The '``llvm.set.loop.iterations.*``' intrinsics are used to specify the
16716hardware-loop trip count. They are placed in the loop preheader basic block and
16717are marked as ``IntrNoDuplicate`` to avoid optimizers duplicating these
16718instructions.
16719
16720Arguments:
16721""""""""""
16722
16723The integer operand is the loop trip count of the hardware-loop, and thus
16724not e.g. the loop back-edge taken count.
16725
16726Semantics:
16727""""""""""
16728
The '``llvm.set.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
16733
16734
16735'``llvm.start.loop.iterations.*``' Intrinsic
16736^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16737
16738Syntax:
16739"""""""
16740
16741This is an overloaded intrinsic.
16742
16743::
16744
16745      declare i32 @llvm.start.loop.iterations.i32(i32)
16746      declare i64 @llvm.start.loop.iterations.i64(i64)
16747
16748Overview:
16749"""""""""
16750
The '``llvm.start.loop.iterations.*``' intrinsics are similar to the
'``llvm.set.loop.iterations.*``' intrinsics: they specify the hardware-loop
trip count, but also produce a value identical to the input that can be used
as the input to the loop. They are placed in the loop preheader basic block,
and the output is expected to be the input to the phi for the induction
variable of the loop, which is decremented by the
'``llvm.loop.decrement.reg.*``' intrinsic.
16758
16759Arguments:
16760""""""""""
16761
16762The integer operand is the loop trip count of the hardware-loop, and thus
16763not e.g. the loop back-edge taken count.
16764
16765Semantics:
16766""""""""""
16767
The '``llvm.start.loop.iterations.*``' intrinsics do not perform any arithmetic
on their operand. They are a hint to the backend, which can use the operand to
set up the hardware-loop count with a target-specific instruction, usually a
move of this value to a special register or a hardware-loop instruction.
16772
16773'``llvm.test.set.loop.iterations.*``' Intrinsic
16774^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16775
16776Syntax:
16777"""""""
16778
16779This is an overloaded intrinsic.
16780
16781::
16782
16783      declare i1 @llvm.test.set.loop.iterations.i32(i32)
16784      declare i1 @llvm.test.set.loop.iterations.i64(i64)
16785
16786Overview:
16787"""""""""
16788
The '``llvm.test.set.loop.iterations.*``' intrinsics are used to specify the
loop trip count, and also test that the given count is not zero, allowing
them to control entry to a while-loop. They are placed in the loop preheader's
predecessor basic block, and are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.
16794
16795Arguments:
16796""""""""""
16797
16798The integer operand is the loop trip count of the hardware-loop, and thus
16799not e.g. the loop back-edge taken count.
16800
16801Semantics:
16802""""""""""
16803
The '``llvm.test.set.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target-specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is a condition value that is true if the given count is
not zero.
16809
16810
16811'``llvm.test.start.loop.iterations.*``' Intrinsic
16812^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16813
16814Syntax:
16815"""""""
16816
16817This is an overloaded intrinsic.
16818
16819::
16820
16821      declare {i32, i1} @llvm.test.start.loop.iterations.i32(i32)
16822      declare {i64, i1} @llvm.test.start.loop.iterations.i64(i64)
16823
16824Overview:
16825"""""""""
16826
16827The '``llvm.test.start.loop.iterations.*``' intrinsics are similar to the
16828'``llvm.test.set.loop.iterations.*``' and '``llvm.start.loop.iterations.*``'
16829intrinsics, used to specify the hardware-loop trip count, but also produce a
16830value identical to the input that can be used as the input to the loop. The
16831second i1 output controls entry to a while-loop.
16832
16833Arguments:
16834""""""""""
16835
16836The integer operand is the loop trip count of the hardware-loop, and thus
16837not e.g. the loop back-edge taken count.
16838
16839Semantics:
16840""""""""""
16841
The '``llvm.test.start.loop.iterations.*``' intrinsics do not perform any
arithmetic on their operand. They are a hint to the backend, which can use the
operand to set up the hardware-loop count with a target-specific instruction,
usually a move of this value to a special register or a hardware-loop
instruction. The result is a pair of the input and a condition value that is
true if the given count is not zero.
16848
16849
16850'``llvm.loop.decrement.reg.*``' Intrinsic
16851^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16852
16853Syntax:
16854"""""""
16855
16856This is an overloaded intrinsic.
16857
16858::
16859
16860      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)
16861      declare i64 @llvm.loop.decrement.reg.i64(i64, i64)
16862
16863Overview:
16864"""""""""
16865
16866The '``llvm.loop.decrement.reg.*``' intrinsics are used to lower the loop
16867iteration counter and return an updated value that will be used in the next
16868loop test check.
16869
16870Arguments:
16871""""""""""
16872
16873Both arguments must have identical integer types. The first operand is the
16874loop iteration counter. The second operand is the maximum number of elements
16875processed in an iteration.
16876
16877Semantics:
16878""""""""""
16879
The '``llvm.loop.decrement.reg.*``' intrinsics do an integer ``SUB`` of their
two operands, which is not allowed to wrap. They return the remaining number of
iterations still to be executed, and can be used together with a ``PHI``,
``ICMP`` and ``BR`` to control the number of loop iterations executed. Any
optimisations are allowed to treat it as a ``SUB``, and it is supported by
SCEV, so it's the backend's responsibility to handle cases where it may be
optimised. These intrinsics are marked as ``IntrNoDuplicate`` to avoid
optimizers duplicating these instructions.
16888
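The following sketch shows how the counter value might flow through a guarded
loop using '``llvm.test.start.loop.iterations.*``' together with
'``llvm.loop.decrement.reg.*``'. The function and block names are hypothetical
and the loop body is elided. As noted above, these intrinsics are not intended
to be generated outside the backend, so this is only meant to make the data
flow visible.

.. code-block:: llvm

      declare { i32, i1 } @llvm.test.start.loop.iterations.i32(i32)
      declare i32 @llvm.loop.decrement.reg.i32(i32, i32)

      define void @while_hwloop_sketch(i32 %n) {
      entry:
        ; Set up the trip count and test whether the loop is entered at all.
        %pair = call { i32, i1 } @llvm.test.start.loop.iterations.i32(i32 %n)
        %start = extractvalue { i32, i1 } %pair, 0
        %enter = extractvalue { i32, i1 } %pair, 1
        br i1 %enter, label %preheader, label %exit

      preheader:
        br label %loop

      loop:
        ; The counter starts at the trip count and is decremented each iteration.
        %count = phi i32 [ %start, %preheader ], [ %remaining, %loop ]
        ; ... loop body would go here ...
        %remaining = call i32 @llvm.loop.decrement.reg.i32(i32 %count, i32 1)
        %continue = icmp ne i32 %remaining, 0
        br i1 %continue, label %loop, label %exit

      exit:
        ret void
      }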
16889
16890'``llvm.loop.decrement.*``' Intrinsic
16891^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16892
16893Syntax:
16894"""""""
16895
16896This is an overloaded intrinsic.
16897
16898::
16899
16900      declare i1 @llvm.loop.decrement.i32(i32)
16901      declare i1 @llvm.loop.decrement.i64(i64)
16902
16903Overview:
16904"""""""""
16905
16906The HardwareLoops pass allows the loop decrement value to be specified with an
16907option. It defaults to a loop decrement value of 1, but it can be an unsigned
16908integer value provided by this option.  The '``llvm.loop.decrement.*``'
16909intrinsics decrement the loop iteration counter with this value, and return a
16910false predicate if the loop should exit, and true otherwise.
16911This is emitted if the loop counter is not updated via a ``PHI`` node, which
16912can also be controlled with an option.
16913
16914Arguments:
16915""""""""""
16916
16917The integer argument is the loop decrement value used to decrement the loop
16918iteration counter.
16919
16920Semantics:
16921""""""""""
16922
The '``llvm.loop.decrement.*``' intrinsics subtract the given loop decrement
value from the loop iteration counter, and return false if the loop should
exit. This ``SUB`` is not allowed to wrap. The result is a condition
that is used by the conditional branch controlling the loop.
16927
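A sketch of the form without a ``PHI`` node follows, pairing
'``llvm.set.loop.iterations.*``' with '``llvm.loop.decrement.*``'. The function
and block names are hypothetical, the loop body is elided, and the trip count
``%n`` is assumed to be nonzero. The counter itself is implicit, so only the
decrement value of 1 appears in the IR.

.. code-block:: llvm

      declare void @llvm.set.loop.iterations.i32(i32)
      declare i1 @llvm.loop.decrement.i32(i32)

      define void @hwloop_sketch(i32 %n) {
      entry:
        ; Record the trip count in the preheader.
        call void @llvm.set.loop.iterations.i32(i32 %n)
        br label %loop

      loop:
        ; ... loop body would go here ...
        ; Decrement the implicit counter by 1; false means the loop exits.
        %continue = call i1 @llvm.loop.decrement.i32(i32 1)
        br i1 %continue, label %loop, label %exit

      exit:
        ret void
      }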
16928
16929Vector Reduction Intrinsics
16930---------------------------
16931
16932Horizontal reductions of vectors can be expressed using the following
16933intrinsics. Each one takes a vector operand as an input and applies its
16934respective operation across all elements of the vector, returning a single
16935scalar result of the same element type.
16936
16937.. _int_vector_reduce_add:
16938
16939'``llvm.vector.reduce.add.*``' Intrinsic
16940^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16941
16942Syntax:
16943"""""""
16944
16945::
16946
16947      declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %a)
16948      declare i64 @llvm.vector.reduce.add.v2i64(<2 x i64> %a)
16949
16950Overview:
16951"""""""""
16952
16953The '``llvm.vector.reduce.add.*``' intrinsics do an integer ``ADD``
16954reduction of a vector, returning the result as a scalar. The return type matches
16955the element-type of the vector input.
16956
16957Arguments:
16958""""""""""
16959The argument to this intrinsic must be a vector of integer values.
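
As an illustration, the sketch below pairs a call to the intrinsic with a
hand-written scalar expansion that computes the same value. The function names
``@sum4`` and ``@sum4_scalarized`` are hypothetical. Since integer addition is
associative, the order in which the elements are combined does not affect the
result.

.. code-block:: llvm

      declare i32 @llvm.vector.reduce.add.v4i32(<4 x i32>)

      define i32 @sum4(<4 x i32> %v) {
        %sum = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %v)
        ret i32 %sum
      }

      ; Scalar expansion producing the same result as @sum4.
      define i32 @sum4_scalarized(<4 x i32> %v) {
        %e0 = extractelement <4 x i32> %v, i32 0
        %e1 = extractelement <4 x i32> %v, i32 1
        %e2 = extractelement <4 x i32> %v, i32 2
        %e3 = extractelement <4 x i32> %v, i32 3
        %s01 = add i32 %e0, %e1
        %s012 = add i32 %s01, %e2
        %sum = add i32 %s012, %e3
        ret i32 %sum
      }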
16960
16961.. _int_vector_reduce_fadd:
16962
16963'``llvm.vector.reduce.fadd.*``' Intrinsic
16964^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
16965
16966Syntax:
16967"""""""
16968
16969::
16970
16971      declare float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %a)
16972      declare double @llvm.vector.reduce.fadd.v2f64(double %start_value, <2 x double> %a)
16973
16974Overview:
16975"""""""""
16976
16977The '``llvm.vector.reduce.fadd.*``' intrinsics do a floating-point
16978``ADD`` reduction of a vector, returning the result as a scalar. The return type
16979matches the element-type of the vector input.
16980
16981If the intrinsic call has the 'reassoc' flag set, then the reduction will not
16982preserve the associativity of an equivalent scalarized counterpart. Otherwise
16983the reduction will be *sequential*, thus implying that the operation respects
16984the associativity of a scalarized reduction. That is, the reduction begins with
16985the start value and performs an fadd operation with consecutively increasing
16986vector element indices. See the following pseudocode:
16987
16988::
16989
16990    float sequential_fadd(start_value, input_vector)
16991      result = start_value
      for i = 0 to length(input_vector) - 1
16993        result = result + input_vector[i]
16994      return result
16995
16996
16997Arguments:
16998""""""""""
16999The first argument to this intrinsic is a scalar start value for the reduction.
17000The type of the start value matches the element-type of the vector input.
17001The second argument must be a vector of floating-point values.
17002
17003To ignore the start value, negative zero (``-0.0``) can be used, as it is
17004the neutral value of floating point addition.
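
The sequential pseudocode and the choice of ``-0.0`` as the neutral value can
be checked with a small Python sketch. Python floats are IEEE-754 doubles, so
this models the ``double`` variant only; the function is a direct transcription
of the pseudocode, not LLVM's implementation.

.. code-block:: python

    import math

    # Illustrative model only; transcribed from the pseudocode above.
    def sequential_fadd(start_value, input_vector):
        """Ordered reduction: ((start + v[0]) + v[1]) + ..."""
        result = start_value
        for element in input_vector:
            result = result + element
        return result

    # -0.0 is neutral even for a vector of negative zeros; +0.0 is not.
    with_neg_zero = sequential_fadd(-0.0, [-0.0])
    with_pos_zero = sequential_fadd(0.0, [-0.0])
    print(math.copysign(1.0, with_neg_zero))  # -1.0, the sign is preserved
    print(math.copysign(1.0, with_pos_zero))  # 1.0, the sign is lost

    # Reassociation can change the result: here the vector is summed first.
    ordered = sequential_fadd(1e16, [1.0, 1.0])
    reassociated = 1e16 + sequential_fadd(-0.0, [1.0, 1.0])
    print(ordered == reassociated)            # False

In the last comparison the ordered sum absorbs each ``1.0`` separately, while
the reassociated sum adds ``2.0`` in one step, which is large enough to change
the result.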
17005
17006Examples:
17007"""""""""
17008
17009::
17010
17011      %unord = call reassoc float @llvm.vector.reduce.fadd.v4f32(float -0.0, <4 x float> %input) ; relaxed reduction
17012      %ord = call float @llvm.vector.reduce.fadd.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
17013
17014
17015.. _int_vector_reduce_mul:
17016
17017'``llvm.vector.reduce.mul.*``' Intrinsic
17018^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17019
17020Syntax:
17021"""""""
17022
17023::
17024
17025      declare i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %a)
17026      declare i64 @llvm.vector.reduce.mul.v2i64(<2 x i64> %a)
17027
17028Overview:
17029"""""""""
17030
17031The '``llvm.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
17032reduction of a vector, returning the result as a scalar. The return type matches
17033the element-type of the vector input.
17034
17035Arguments:
17036""""""""""
17037The argument to this intrinsic must be a vector of integer values.
17038
17039.. _int_vector_reduce_fmul:
17040
17041'``llvm.vector.reduce.fmul.*``' Intrinsic
17042^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17043
17044Syntax:
17045"""""""
17046
17047::
17048
17049      declare float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %a)
17050      declare double @llvm.vector.reduce.fmul.v2f64(double %start_value, <2 x double> %a)
17051
17052Overview:
17053"""""""""
17054
17055The '``llvm.vector.reduce.fmul.*``' intrinsics do a floating-point
17056``MUL`` reduction of a vector, returning the result as a scalar. The return type
17057matches the element-type of the vector input.
17058
17059If the intrinsic call has the 'reassoc' flag set, then the reduction will not
17060preserve the associativity of an equivalent scalarized counterpart. Otherwise
17061the reduction will be *sequential*, thus implying that the operation respects
17062the associativity of a scalarized reduction. That is, the reduction begins with
17063the start value and performs an fmul operation with consecutively increasing
17064vector element indices. See the following pseudocode:
17065
17066::
17067
17068    float sequential_fmul(start_value, input_vector)
17069      result = start_value
      for i = 0 to length(input_vector) - 1
17071        result = result * input_vector[i]
17072      return result
17073
17074
17075Arguments:
17076""""""""""
17077The first argument to this intrinsic is a scalar start value for the reduction.
17078The type of the start value matches the element-type of the vector input.
17079The second argument must be a vector of floating-point values.
17080
17081To ignore the start value, one (``1.0``) can be used, as it is the neutral
17082value of floating point multiplication.
17083
17084Examples:
17085"""""""""
17086
17087::
17088
17089      %unord = call reassoc float @llvm.vector.reduce.fmul.v4f32(float 1.0, <4 x float> %input) ; relaxed reduction
17090      %ord = call float @llvm.vector.reduce.fmul.v4f32(float %start_value, <4 x float> %input) ; sequential reduction
17091
17092.. _int_vector_reduce_and:
17093
17094'``llvm.vector.reduce.and.*``' Intrinsic
17095^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17096
17097Syntax:
17098"""""""
17099
17100::
17101
17102      declare i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %a)
17103
17104Overview:
17105"""""""""
17106
17107The '``llvm.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
17108reduction of a vector, returning the result as a scalar. The return type matches
17109the element-type of the vector input.
17110
17111Arguments:
17112""""""""""
17113The argument to this intrinsic must be a vector of integer values.
17114
17115.. _int_vector_reduce_or:
17116
17117'``llvm.vector.reduce.or.*``' Intrinsic
17118^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17119
17120Syntax:
17121"""""""
17122
17123::
17124
17125      declare i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %a)
17126
17127Overview:
17128"""""""""
17129
17130The '``llvm.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
17131of a vector, returning the result as a scalar. The return type matches the
17132element-type of the vector input.
17133
17134Arguments:
17135""""""""""
17136The argument to this intrinsic must be a vector of integer values.
17137
17138.. _int_vector_reduce_xor:
17139
17140'``llvm.vector.reduce.xor.*``' Intrinsic
17141^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17142
17143Syntax:
17144"""""""
17145
17146::
17147
17148      declare i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %a)
17149
17150Overview:
17151"""""""""
17152
17153The '``llvm.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
17154reduction of a vector, returning the result as a scalar. The return type matches
17155the element-type of the vector input.
17156
17157Arguments:
17158""""""""""
17159The argument to this intrinsic must be a vector of integer values.
17160
17161.. _int_vector_reduce_smax:
17162
17163'``llvm.vector.reduce.smax.*``' Intrinsic
17164^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17165
17166Syntax:
17167"""""""
17168
17169::
17170
17171      declare i32 @llvm.vector.reduce.smax.v4i32(<4 x i32> %a)
17172
17173Overview:
17174"""""""""
17175
17176The '``llvm.vector.reduce.smax.*``' intrinsics do a signed integer
17177``MAX`` reduction of a vector, returning the result as a scalar. The return type
17178matches the element-type of the vector input.
17179
17180Arguments:
17181""""""""""
17182The argument to this intrinsic must be a vector of integer values.
17183
17184.. _int_vector_reduce_smin:
17185
17186'``llvm.vector.reduce.smin.*``' Intrinsic
17187^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17188
17189Syntax:
17190"""""""
17191
17192::
17193
17194      declare i32 @llvm.vector.reduce.smin.v4i32(<4 x i32> %a)
17195
17196Overview:
17197"""""""""
17198
17199The '``llvm.vector.reduce.smin.*``' intrinsics do a signed integer
17200``MIN`` reduction of a vector, returning the result as a scalar. The return type
17201matches the element-type of the vector input.
17202
17203Arguments:
17204""""""""""
17205The argument to this intrinsic must be a vector of integer values.
17206
17207.. _int_vector_reduce_umax:
17208
17209'``llvm.vector.reduce.umax.*``' Intrinsic
17210^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17211
17212Syntax:
17213"""""""
17214
17215::
17216
17217      declare i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %a)
17218
17219Overview:
17220"""""""""
17221
17222The '``llvm.vector.reduce.umax.*``' intrinsics do an unsigned
17223integer ``MAX`` reduction of a vector, returning the result as a scalar. The
17224return type matches the element-type of the vector input.
17225
17226Arguments:
17227""""""""""
17228The argument to this intrinsic must be a vector of integer values.
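
The signed and unsigned ``MAX`` reductions differ only in how the element bits
are interpreted. The following Python sketch illustrates this on raw bit
patterns; ``to_signed``, ``reduce_umax`` and ``reduce_smax`` are hypothetical
helper names used for illustration.

.. code-block:: python

    # Illustrative model only; all helper names are hypothetical.
    def to_signed(value, bits):
        """Reinterpret an unsigned `bits`-wide pattern as two's complement."""
        return value - (1 << bits) if value >> (bits - 1) else value

    def reduce_umax(elements):
        """Unsigned MAX reduction over raw bit patterns."""
        return max(elements)

    def reduce_smax(elements, bits):
        """Signed MAX reduction; the result is returned as a bit pattern."""
        return max(elements, key=lambda e: to_signed(e, bits))

    lanes = [0x01, 0x7F, 0x80, 0xFF]   # a <4 x i8> given as bit patterns
    print(hex(reduce_umax(lanes)))     # 0xff, the largest unsigned value
    print(hex(reduce_smax(lanes, 8)))  # 0x7f, since 0x80 and 0xff are negative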
17229
17230.. _int_vector_reduce_umin:
17231
17232'``llvm.vector.reduce.umin.*``' Intrinsic
17233^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17234
17235Syntax:
17236"""""""
17237
17238::
17239
17240      declare i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %a)
17241
17242Overview:
17243"""""""""
17244
17245The '``llvm.vector.reduce.umin.*``' intrinsics do an unsigned
17246integer ``MIN`` reduction of a vector, returning the result as a scalar. The
17247return type matches the element-type of the vector input.
17248
17249Arguments:
17250""""""""""
17251The argument to this intrinsic must be a vector of integer values.
17252
17253.. _int_vector_reduce_fmax:
17254
17255'``llvm.vector.reduce.fmax.*``' Intrinsic
17256^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17257
17258Syntax:
17259"""""""
17260
17261::
17262
17263      declare float @llvm.vector.reduce.fmax.v4f32(<4 x float> %a)
17264      declare double @llvm.vector.reduce.fmax.v2f64(<2 x double> %a)
17265
17266Overview:
17267"""""""""
17268
17269The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
17270``MAX`` reduction of a vector, returning the result as a scalar. The return type
17271matches the element-type of the vector input.
17272
This intrinsic has the same comparison semantics as the '``llvm.maxnum.*``'
17274intrinsic. That is, the result will always be a number unless all elements of
17275the vector are NaN. For a vector with maximum element magnitude 0.0 and
17276containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
17277
17278If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
17279assume that NaNs are not present in the input vector.
17280
17281Arguments:
17282""""""""""
17283The argument to this intrinsic must be a vector of floating-point values.
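
The NaN handling described above can be modeled in Python as follows. This is
a sketch under simplifying assumptions: ``maxnum`` and ``reduce_fmax`` are
hypothetical helpers, signaling NaNs are not distinguished from quiet NaNs, and
the sign chosen for mixed zeros is just one of the permitted outcomes.

.. code-block:: python

    import math

    # Illustrative model only; maxnum and reduce_fmax are hypothetical helpers.
    def maxnum(a, b):
        """Return the larger operand, preferring a number over a NaN."""
        if math.isnan(a):
            return b
        if math.isnan(b):
            return a
        return max(a, b)

    def reduce_fmax(elements):
        """Fold maxnum over a non-empty list of floats."""
        result = elements[0]
        for element in elements[1:]:
            result = maxnum(result, element)
        return result

    nan = float("nan")
    print(reduce_fmax([nan, 1.0, 3.0]))  # 3.0, NaN lanes are ignored
    print(reduce_fmax([nan, nan]))       # nan, every lane is NaN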
17284
17285.. _int_vector_reduce_fmin:
17286
17287'``llvm.vector.reduce.fmin.*``' Intrinsic
17288^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17289
17290Syntax:
17291"""""""
17292This is an overloaded intrinsic.
17293
17294::
17295
17296      declare float @llvm.vector.reduce.fmin.v4f32(<4 x float> %a)
17297      declare double @llvm.vector.reduce.fmin.v2f64(<2 x double> %a)
17298
17299Overview:
17300"""""""""
17301
17302The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
17303``MIN`` reduction of a vector, returning the result as a scalar. The return type
17304matches the element-type of the vector input.
17305
This intrinsic has the same comparison semantics as the '``llvm.minnum.*``'
17307intrinsic. That is, the result will always be a number unless all elements of
17308the vector are NaN. For a vector with minimum element magnitude 0.0 and
17309containing both +0.0 and -0.0 elements, the sign of the result is unspecified.
17310
17311If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
17312assume that NaNs are not present in the input vector.
17313
17314Arguments:
17315""""""""""
17316The argument to this intrinsic must be a vector of floating-point values.
17317
17318'``llvm.vector.insert``' Intrinsic
17319^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17320
17321Syntax:
17322"""""""
17323This is an overloaded intrinsic.
17324
17325::
17326
17327      ; Insert fixed type into scalable type
17328      declare <vscale x 4 x float> @llvm.vector.insert.nxv4f32.v4f32(<vscale x 4 x float> %vec, <4 x float> %subvec, i64 <idx>)
17329      declare <vscale x 2 x double> @llvm.vector.insert.nxv2f64.v2f64(<vscale x 2 x double> %vec, <2 x double> %subvec, i64 <idx>)
17330
17331      ; Insert scalable type into scalable type
      declare <vscale x 4 x float> @llvm.vector.insert.nxv4f32.nxv2f32(<vscale x 4 x float> %vec, <vscale x 2 x float> %subvec, i64 <idx>)
17333
17334      ; Insert fixed type into fixed type
17335      declare <4 x double> @llvm.vector.insert.v4f64.v2f64(<4 x double> %vec, <2 x double> %subvec, i64 <idx>)
17336
17337Overview:
17338"""""""""
17339
17340The '``llvm.vector.insert.*``' intrinsics insert a vector into another vector
17341starting from a given index. The return type matches the type of the vector we
17342insert into. Conceptually, this can be used to build a scalable vector out of
non-scalable vectors; however, this intrinsic can also be used on purely fixed
17344types.
17345
17346Scalable vectors can only be inserted into other scalable vectors.
17347
17348Arguments:
17349""""""""""
17350
17351The ``vec`` is the vector which ``subvec`` will be inserted into.
17352The ``subvec`` is the vector that will be inserted.
17353
17354``idx`` represents the starting element number at which ``subvec`` will be
17355inserted. ``idx`` must be a constant multiple of ``subvec``'s known minimum
17356vector length. If ``subvec`` is a scalable vector, ``idx`` is first scaled by
17357the runtime scaling factor of ``subvec``. The elements of ``vec`` starting at
17358``idx`` are overwritten with ``subvec``. Elements ``idx`` through (``idx`` +
17359num_elements(``subvec``) - 1) must be valid ``vec`` indices. If this condition
17360cannot be determined statically but is false at runtime, then the result vector
17361is a :ref:`poison value <poisonvalues>`.
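
The index rules can be sketched with a Python model that operates on runtime
element lists. Everything here is illustrative: ``vector_insert``, its
parameters and the ``POISON`` sentinel are hypothetical names, and ``vscale``
stands for the runtime scaling factor.

.. code-block:: python

    # Illustrative model only; vector_insert and POISON are hypothetical.
    POISON = "poison"

    def vector_insert(vec, subvec, idx, subvec_min_len,
                      subvec_scalable=False, vscale=1):
        """Model llvm.vector.insert on runtime element lists."""
        if idx % subvec_min_len != 0:
            raise ValueError("idx must be a multiple of subvec's minimum length")
        # For a scalable subvec, idx is first scaled by the runtime factor.
        start = idx * vscale if subvec_scalable else idx
        end = start + len(subvec)
        if end > len(vec):
            return POISON  # some element index is not a valid vec index
        return vec[:start] + subvec + vec[end:]

    # Fixed into fixed: <2 x double> into <4 x double> at idx 2.
    print(vector_insert([0.0, 1.0, 2.0, 3.0], [8.0, 9.0], 2, subvec_min_len=2))
    # [0.0, 1.0, 8.0, 9.0]

    # Fixed <2 x i32> into <vscale x 2 x i32> at idx 2 while vscale is 1:
    # the indices are only valid for larger vscale, so the result is poison.
    print(vector_insert([0, 1], [8, 9], 2, subvec_min_len=2))  # poison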
17362
17363
17364'``llvm.vector.extract``' Intrinsic
17365^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17366
17367Syntax:
17368"""""""
17369This is an overloaded intrinsic.
17370
17371::
17372
17373      ; Extract fixed type from scalable type
17374      declare <4 x float> @llvm.vector.extract.v4f32.nxv4f32(<vscale x 4 x float> %vec, i64 <idx>)
17375      declare <2 x double> @llvm.vector.extract.v2f64.nxv2f64(<vscale x 2 x double> %vec, i64 <idx>)
17376
17377      ; Extract scalable type from scalable type
17378      declare <vscale x 2 x float> @llvm.vector.extract.nxv2f32.nxv4f32(<vscale x 4 x float> %vec, i64 <idx>)
17379
17380      ; Extract fixed type from fixed type
17381      declare <2 x double> @llvm.vector.extract.v2f64.v4f64(<4 x double> %vec, i64 <idx>)
17382
17383Overview:
17384"""""""""
17385
17386The '``llvm.vector.extract.*``' intrinsics extract a vector from within another
17387vector starting from a given index. The return type must be explicitly
17388specified. Conceptually, this can be used to decompose a scalable vector into
non-scalable parts; however, this intrinsic can also be used on purely fixed
17390types.
17391
17392Scalable vectors can only be extracted from other scalable vectors.
17393
17394Arguments:
17395""""""""""
17396
17397The ``vec`` is the vector from which we will extract a subvector.
17398
17399The ``idx`` specifies the starting element number within ``vec`` from which a
17400subvector is extracted. ``idx`` must be a constant multiple of the known-minimum
17401vector length of the result type. If the result type is a scalable vector,
17402``idx`` is first scaled by the result type's runtime scaling factor. Elements
17403``idx`` through (``idx`` + num_elements(result_type) - 1) must be valid vector
17404indices. If this condition cannot be determined statically but is false at
17405runtime, then the result vector is a :ref:`poison value <poisonvalues>`. The
17406``idx`` parameter must be a vector index constant type (for most targets this
17407will be an integer pointer type).
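
A matching Python sketch for extraction, again operating on runtime element
lists. The names ``vector_extract`` and ``POISON`` and all parameters are
hypothetical; ``vscale`` stands for the runtime scaling factor of a scalable
result type.

.. code-block:: python

    # Illustrative model only; vector_extract and POISON are hypothetical.
    POISON = "poison"

    def vector_extract(vec, idx, result_min_len,
                       result_scalable=False, vscale=1):
        """Model llvm.vector.extract on a runtime element list."""
        if idx % result_min_len != 0:
            raise ValueError("idx must be a multiple of the result's minimum length")
        # A scalable result scales both its length and idx by vscale.
        scale = vscale if result_scalable else 1
        start, length = idx * scale, result_min_len * scale
        if start + length > len(vec):
            return POISON  # some element index is not a valid vec index
        return vec[start:start + length]

    # <2 x double> from <4 x double> at idx 2: the upper half.
    print(vector_extract([0.0, 1.0, 2.0, 3.0], 2, result_min_len=2))
    # [2.0, 3.0]

    # <vscale x 2 x float> from <vscale x 4 x float> at idx 2, vscale = 2.
    print(vector_extract(list(range(8)), 2, result_min_len=2,
                         result_scalable=True, vscale=2))
    # [4, 5, 6, 7]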
17408
17409'``llvm.experimental.vector.reverse``' Intrinsic
17410^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17411
17412Syntax:
17413"""""""
17414This is an overloaded intrinsic.
17415
17416::
17417
17418      declare <2 x i8> @llvm.experimental.vector.reverse.v2i8(<2 x i8> %a)
17419      declare <vscale x 4 x i32> @llvm.experimental.vector.reverse.nxv4i32(<vscale x 4 x i32> %a)
17420
17421Overview:
17422"""""""""
17423
17424The '``llvm.experimental.vector.reverse.*``' intrinsics reverse a vector.
17425The intrinsic takes a single vector and returns a vector of matching type but
17426with the original lane order reversed. These intrinsics work for both fixed
and scalable vectors. While this intrinsic is marked as experimental, the
17428recommended way to express reverse operations for fixed-width vectors is still
17429to use a shufflevector, as that may allow for more optimization opportunities.
17430
17431Arguments:
17432""""""""""
17433
17434The argument to this intrinsic must be a vector.
17435
17436'``llvm.experimental.vector.splice``' Intrinsic
17437^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17438
17439Syntax:
17440"""""""
17441This is an overloaded intrinsic.
17442
17443::
17444
17445      declare <2 x double> @llvm.experimental.vector.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm)
17446      declare <vscale x 4 x i32> @llvm.experimental.vector.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm)
17447
17448Overview:
17449"""""""""
17450
17451The '``llvm.experimental.vector.splice.*``' intrinsics construct a vector by
17452concatenating elements from the first input vector with elements of the second
17453input vector, returning a vector of the same type as the input vectors. The
17454signed immediate, modulo the number of elements in the vector, is the index
17455into the first vector from which to extract the result value. This means
17456conceptually that for a positive immediate, a vector is extracted from
17457``concat(%vec1, %vec2)`` starting at index ``imm``, whereas for a negative
17458immediate, it extracts ``-imm`` trailing elements from the first vector, and
17459the remaining elements from ``%vec2``.
17460
17461These intrinsics work for both fixed and scalable vectors. While this intrinsic
17462is marked as experimental, the recommended way to express this operation for
17463fixed-width vectors is still to use a shufflevector, as that may allow for more
17464optimization opportunities.
17465
17466For example:
17467
17468.. code-block:: text
17469
17470 llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, 1)  ==> <B, C, D, E> ; index
17471 llvm.experimental.vector.splice(<A,B,C,D>, <E,F,G,H>, -3) ==> <B, C, D, E> ; trailing elements
17472
17473
17474Arguments:
17475""""""""""
17476
17477The first two operands are vectors with the same type. The start index is imm
17478modulo the runtime number of elements in the source vector. For a fixed-width
17479vector <N x eltty>, imm is a signed integer constant in the range
17480-N <= imm < N. For a scalable vector <vscale x N x eltty>, imm is a signed
17481integer constant in the range -X <= imm < X where X=vscale_range_min * N.
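
For fixed-width vectors, the selection rule can be modeled in a few lines of
Python. The helper ``vector_splice`` is a hypothetical name; the first two
calls reproduce the examples above.

.. code-block:: python

    # Illustrative model only; vector_splice is a hypothetical helper.
    def vector_splice(vec1, vec2, imm):
        """Model the splice of two equally sized fixed-width vectors."""
        n = len(vec1)
        if len(vec2) != n or not -n <= imm < n:
            raise ValueError("operands must match and -N <= imm < N must hold")
        concat = vec1 + vec2
        # A negative imm starts at the last -imm elements of vec1.
        start = imm if imm >= 0 else n + imm
        return concat[start:start + n]

    a = ["A", "B", "C", "D"]
    b = ["E", "F", "G", "H"]
    print(vector_splice(a, b, 1))   # ['B', 'C', 'D', 'E']
    print(vector_splice(a, b, -3))  # ['B', 'C', 'D', 'E']
    print(vector_splice(a, b, -1))  # ['D', 'E', 'F', 'G']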
17482
17483'``llvm.experimental.stepvector``' Intrinsic
17484^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17485
17486This is an overloaded intrinsic. You can use ``llvm.experimental.stepvector``
17487to generate a vector whose lane values comprise the linear sequence
17488<0, 1, 2, ...>. It is primarily intended for scalable vectors.
17489
17490::
17491
17492      declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
17493      declare <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()
17494
17495The '``llvm.experimental.stepvector``' intrinsics are used to create vectors
17496of integers whose elements contain a linear sequence of values starting from 0
17497with a step of 1.  This experimental intrinsic can only be used for vectors
17498with integer elements that are at least 8 bits in size. If the sequence value
17499exceeds the allowed limit for the element type then the result for that lane is
17500undefined.
17501
17502These intrinsics work for both fixed and scalable vectors. While this intrinsic
17503is marked as experimental, the recommended way to express this operation for
17504fixed-width vectors is still to generate a constant vector instead.
17505
17506
17507Arguments:
17508""""""""""
17509
17510None.
17511
17512
17513Matrix Intrinsics
17514-----------------
17515
Operations on matrices requiring shape information (like number of rows/columns
or the memory layout) can be expressed using the matrix intrinsics. These
intrinsics require matrix dimensions to be passed as immediate arguments, and
matrices are passed and returned as vectors. This means that for an ``R`` x
``C`` matrix, element ``i`` of column ``j`` is at index ``j * R + i`` in the
corresponding vector, with indices starting at 0. Currently, column-major layout
is assumed.  The intrinsics support both integer and floating-point matrices.
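
The index formula can be illustrated with a short Python sketch. The helpers
``to_column_major`` and ``element`` are hypothetical names, and the matrix is
written as a list of rows purely for readability.

.. code-block:: python

    # Illustrative model only; both helpers are hypothetical.
    def to_column_major(matrix):
        """Flatten a list of rows into the column-major vector layout."""
        rows, cols = len(matrix), len(matrix[0])
        return [matrix[i][j] for j in range(cols) for i in range(rows)]

    def element(vector, rows, i, j):
        """Element i of column j of a matrix with `rows` rows."""
        return vector[j * rows + i]

    m = [[1, 2, 3],
         [4, 5, 6]]              # a 2 x 3 matrix, written row by row
    v = to_column_major(m)
    print(v)                     # [1, 4, 2, 5, 3, 6]
    print(element(v, 2, 1, 2))   # 6, element 1 of column 2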
17523
17524
17525'``llvm.matrix.transpose.*``' Intrinsic
17526^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17527
17528Syntax:
17529"""""""
17530This is an overloaded intrinsic.
17531
17532::
17533
17534      declare vectorty @llvm.matrix.transpose.*(vectorty %In, i32 <Rows>, i32 <Cols>)
17535
17536Overview:
17537"""""""""
17538
17539The '``llvm.matrix.transpose.*``' intrinsics treat ``%In`` as a ``<Rows> x
17540<Cols>`` matrix and return the transposed matrix in the result vector.
17541
17542Arguments:
17543""""""""""
17544
17545The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
17546<Cols>`` matrix. Thus, arguments ``<Rows>`` and ``<Cols>`` correspond to the
17547number of rows and columns, respectively, and must be positive, constant
17548integers. The returned vector must have ``<Rows> * <Cols>`` elements, and have
17549the same float or integer element type as ``%In``.
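
As an informal model, transposition of a column-major vector can be sketched in
Python. The helper ``matrix_transpose`` is a hypothetical name, and the sketch
assumes that the result is a ``<Cols> x <Rows>`` matrix stored in the same
column-major layout.

.. code-block:: python

    # Illustrative model only; matrix_transpose is a hypothetical helper.
    def matrix_transpose(vec, rows, cols):
        """Transpose a rows x cols matrix stored as a column-major list."""
        if len(vec) != rows * cols:
            raise ValueError("vector length must equal rows * cols")
        # Column i of the result is row i of the input.
        return [vec[j * rows + i] for i in range(rows) for j in range(cols)]

    # The 2 x 3 matrix [[1, 2, 3], [4, 5, 6]] in column-major order.
    v = [1, 4, 2, 5, 3, 6]
    print(matrix_transpose(v, 2, 3))  # [1, 2, 3, 4, 5, 6]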
17550
17551'``llvm.matrix.multiply.*``' Intrinsic
17552^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17553
17554Syntax:
17555"""""""
17556This is an overloaded intrinsic.
17557
17558::
17559
17560      declare vectorty @llvm.matrix.multiply.*(vectorty %A, vectorty %B, i32 <OuterRows>, i32 <Inner>, i32 <OuterColumns>)
17561
17562Overview:
17563"""""""""
17564
The '``llvm.matrix.multiply.*``' intrinsics treat ``%A`` as a ``<OuterRows> x
<Inner>`` matrix and ``%B`` as a ``<Inner> x <OuterColumns>`` matrix, and
multiply them. The result matrix is returned in the result vector.
17568
17569Arguments:
17570""""""""""
17571
17572The first vector argument ``%A`` corresponds to a matrix with ``<OuterRows> *
17573<Inner>`` elements, and the second argument ``%B`` to a matrix with
17574``<Inner> * <OuterColumns>`` elements. Arguments ``<OuterRows>``,
17575``<Inner>`` and ``<OuterColumns>`` must be positive, constant integers. The
17576returned vector must have ``<OuterRows> * <OuterColumns>`` elements.
17577Vectors ``%A``, ``%B``, and the returned vector all have the same float or
17578integer element type.
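
The shapes and indexing involved can be sketched in Python for integer
elements. ``matrix_multiply`` is a hypothetical helper; the order in which the
products are summed is a choice of this sketch and matters only for
floating-point elements.

.. code-block:: python

    # Illustrative model only; matrix_multiply is a hypothetical helper.
    def matrix_multiply(a, b, outer_rows, inner, outer_cols):
        """Multiply two column-major matrices given as flat lists."""
        result = []
        for j in range(outer_cols):        # column of the result
            for i in range(outer_rows):    # row of the result
                result.append(sum(a[k * outer_rows + i] * b[j * inner + k]
                                  for k in range(inner)))
        return result

    a = [1, 3, 2, 4]   # [[1, 2], [3, 4]] in column-major order
    b = [5, 7, 6, 8]   # [[5, 6], [7, 8]] in column-major order
    print(matrix_multiply(a, b, 2, 2, 2))
    # [19, 43, 22, 50], i.e. [[19, 22], [43, 50]]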
17579
17580
17581'``llvm.matrix.column.major.load.*``' Intrinsic
17582^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17583
17584Syntax:
17585"""""""
17586This is an overloaded intrinsic.
17587
17588::
17589
17590      declare vectorty @llvm.matrix.column.major.load.*(
17591          ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
17592
17593Overview:
17594"""""""""
17595
17596The '``llvm.matrix.column.major.load.*``' intrinsics load a ``<Rows> x <Cols>``
17597matrix using a stride of ``%Stride`` to compute the start address of the
17598different columns.  The offset is computed using ``%Stride``'s bitwidth. This
allows for convenient loading of submatrices. If ``<IsVolatile>`` is true, the
17600intrinsic is considered a :ref:`volatile memory access <volatile>`. The result
17601matrix is returned in the result vector. If the ``%Ptr`` argument is known to
17602be aligned to some boundary, this can be specified as an attribute on the
17603argument.
17604
17605Arguments:
17606""""""""""
17607
17608The first argument ``%Ptr`` is a pointer type to the returned vector type, and
17609corresponds to the start address to load from. The second argument ``%Stride``
17610is a positive, constant integer with ``%Stride >= <Rows>``. ``%Stride`` is used
to compute the column memory addresses. I.e., for a column ``C``, its start
memory address is calculated with ``%Ptr + C * %Stride``. The third argument
17613``<IsVolatile>`` is a boolean value.  The fourth and fifth arguments,
17614``<Rows>`` and ``<Cols>``, correspond to the number of rows and columns,
17615respectively, and must be positive, constant integers. The returned vector must
17616have ``<Rows> * <Cols>`` elements.
17617
The :ref:`align <attr_align>` parameter attribute can be provided for the
``%Ptr`` argument.
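
The role of the stride can be illustrated with a Python sketch in which memory
is a flat list and both ``ptr`` and ``stride`` are counted in elements. These
units and the helper name ``column_major_load`` are assumptions of the sketch.

.. code-block:: python

    # Illustrative model only; column_major_load is a hypothetical helper.
    def column_major_load(memory, ptr, stride, rows, cols):
        """Load a rows x cols matrix whose columns start `stride` apart."""
        if stride < rows:
            raise ValueError("stride must be at least the number of rows")
        result = []
        for c in range(cols):
            start = ptr + c * stride   # start address of column c
            result.extend(memory[start:start + rows])
        return result

    # A 4 x 3 matrix stored column-major in 12 consecutive elements.
    memory = list(range(12))
    # Load the 2 x 2 submatrix whose first element is row 1 of column 1.
    print(column_major_load(memory, ptr=1 * 4 + 1, stride=4, rows=2, cols=2))
    # [5, 6, 9, 10]

Using the row count of the enclosing matrix as the stride is what lets a
smaller matrix be read out of a larger one.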
17620
17621
17622'``llvm.matrix.column.major.store.*``' Intrinsic
17623^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17624
17625Syntax:
17626"""""""
17627
17628::
17629
17630      declare void @llvm.matrix.column.major.store.*(
17631          vectorty %In, ptrty %Ptr, i64 %Stride, i1 <IsVolatile>, i32 <Rows>, i32 <Cols>)
17632
17633Overview:
17634"""""""""
17635
17636The '``llvm.matrix.column.major.store.*``' intrinsics store the ``<Rows> x
17637<Cols>`` matrix in ``%In`` to memory using a stride of ``%Stride`` between
17638columns. The offset is computed using ``%Stride``'s bitwidth. If
17639``<IsVolatile>`` is true, the intrinsic is considered a
17640:ref:`volatile memory access <volatile>`.
17641
17642If the ``%Ptr`` argument is known to be aligned to some boundary, this can be
17643specified as an attribute on the argument.
17644
17645Arguments:
17646""""""""""
17647
17648The first argument ``%In`` is a vector that corresponds to a ``<Rows> x
17649<Cols>`` matrix to be stored to memory. The second argument ``%Ptr`` is a
17650pointer to the vector type of ``%In``, and is the start address of the matrix
17651in memory. The third argument ``%Stride`` is a positive, constant integer with
17652``%Stride >= <Rows>``.  ``%Stride`` is used to compute the column memory
addresses. I.e., for a column ``C``, its start memory address is calculated
17654with ``%Ptr + C * %Stride``.  The fourth argument ``<IsVolatile>`` is a boolean
17655value. The arguments ``<Rows>`` and ``<Cols>`` correspond to the number of rows
17656and columns, respectively, and must be positive, constant integers.
17657
The :ref:`align <attr_align>` parameter attribute can be provided
for the ``%Ptr`` argument.
17660
17661
17662Half Precision Floating-Point Intrinsics
17663----------------------------------------
17664
17665For most target platforms, half precision floating-point is a
17666storage-only format. This means that it is a dense encoding (in memory)
17667but does not support computation in the format.
17668
17669This means that code must first load the half-precision floating-point
17670value as an i16, then convert it to float with
17671:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
17672then be performed on the float value (including extending to double
etc). To store the value back to memory, it is first converted to float
if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and finally stored as an
i16 value.
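
The load, convert, compute, convert, store sequence can be mimicked in Python
with the standard ``struct`` module, whose ``e`` format is IEEE-754 half
precision. This is only a sketch: the helper names are hypothetical, Python
computes in double rather than float, and ``struct`` raises ``OverflowError``
for out-of-range values, so only in-range values are shown.

.. code-block:: python

    import struct

    # Illustrative model only; both helpers are hypothetical.
    def convert_to_fp16(value):
        """Round a Python float to half precision; return the 16-bit pattern."""
        return struct.unpack("<H", struct.pack("<e", value))[0]

    def convert_from_fp16(bits):
        """Widen a 16-bit half-precision pattern to a Python float."""
        return struct.unpack("<e", struct.pack("<H", bits))[0]

    stored = convert_to_fp16(0.1)         # what would sit in memory as an i16
    print(hex(stored))                    # 0x2e66
    loaded = convert_from_fp16(stored)    # widen before computing
    print(loaded)                         # 0.0999755859375
    result = loaded * 2.0                 # computation in the wider format
    print(hex(convert_to_fp16(result)))   # 0x3266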
17677
17678.. _int_convert_to_fp16:
17679
17680'``llvm.convert.to.fp16``' Intrinsic
17681^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17682
17683Syntax:
17684"""""""
17685
17686::
17687
17688      declare i16 @llvm.convert.to.fp16.f32(float %a)
17689      declare i16 @llvm.convert.to.fp16.f64(double %a)
17690
17691Overview:
17692"""""""""
17693
17694The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
17695conventional floating-point type to half precision floating-point format.
17696
17697Arguments:
17698""""""""""
17699
The intrinsic function takes a single argument: the value to be
converted.
17702
17703Semantics:
17704""""""""""
17705
17706The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
17707conventional floating-point format to half precision floating-point format. The
17708return value is an ``i16`` which contains the converted number.
17709
17710Examples:
17711"""""""""
17712
17713.. code-block:: llvm
17714
17715      %res = call i16 @llvm.convert.to.fp16.f32(float %a)
      store i16 %res, ptr @x, align 2
17717
17718.. _int_convert_from_fp16:
17719
17720'``llvm.convert.from.fp16``' Intrinsic
17721^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17722
17723Syntax:
17724"""""""
17725
17726::
17727
17728      declare float @llvm.convert.from.fp16.f32(i16 %a)
17729      declare double @llvm.convert.from.fp16.f64(i16 %a)
17730
17731Overview:
17732"""""""""
17733
17734The '``llvm.convert.from.fp16``' intrinsic function performs a
17735conversion from half precision floating-point format to single precision
17736floating-point format.
17737
17738Arguments:
17739""""""""""
17740
The intrinsic function takes a single argument: the value to be
converted.
17743
17744Semantics:
17745""""""""""
17746
The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to single
precision floating-point format. The input half-float value is
represented by an ``i16`` value.
17751
17752Examples:
17753"""""""""
17754
17755.. code-block:: llvm
17756
17757      %a = load i16, ptr @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)
17759
17760Saturating floating-point to integer conversions
17761------------------------------------------------
17762
17763The ``fptoui`` and ``fptosi`` instructions return a
17764:ref:`poison value <poisonvalues>` if the rounded-towards-zero value is not
17765representable by the result type. These intrinsics provide an alternative
17766conversion, which will saturate towards the smallest and largest representable
17767integer values instead.
17768
17769'``llvm.fptoui.sat.*``' Intrinsic
17770^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17771
17772Syntax:
17773"""""""
17774
17775This is an overloaded intrinsic. You can use ``llvm.fptoui.sat`` on any
17776floating-point argument type and any integer result type, or vectors thereof.
17777Not all targets may support all types, however.
17778
17779::
17780
17781      declare i32 @llvm.fptoui.sat.i32.f32(float %f)
17782      declare i19 @llvm.fptoui.sat.i19.f64(double %f)
17783      declare <4 x i100> @llvm.fptoui.sat.v4i100.v4f128(<4 x fp128> %f)
17784
17785Overview:
17786"""""""""
17787
17788This intrinsic converts the argument into an unsigned integer using saturating
17789semantics.
17790
17791Arguments:
17792""""""""""
17793
17794The argument may be any floating-point or vector of floating-point type. The
17795return value may be any integer or vector of integer type. The number of vector
17796elements in argument and return must be the same.
17797
17798Semantics:
17799""""""""""
17800
17801The conversion to integer is performed subject to the following rules:
17802
17803- If the argument is any NaN, zero is returned.
17804- If the argument is smaller than zero (this includes negative infinity),
17805  zero is returned.
17806- If the argument is larger than the largest representable unsigned integer of
17807  the result type (this includes positive infinity), the largest representable
17808  unsigned integer is returned.
17809- Otherwise, the result of rounding the argument towards zero is returned.
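
The rules above can be written as a small Python model. The helper name
``fptoui_sat`` and its ``bits`` parameter (the width of the result type) are
hypothetical.

.. code-block:: python

    import math

    # Illustrative model only; fptoui_sat is a hypothetical helper.
    def fptoui_sat(value, bits):
        """Saturating float to unsigned conversion for a `bits`-wide result."""
        if math.isnan(value):
            return 0
        largest = (1 << bits) - 1
        if value <= 0:
            return 0                  # includes negative infinity
        if value >= largest:
            return largest            # includes positive infinity
        return math.trunc(value)      # round towards zero

    print(fptoui_sat(123.9, 8))          # 123
    print(fptoui_sat(-5.7, 8))           # 0
    print(fptoui_sat(377.0, 8))          # 255
    print(fptoui_sat(float("nan"), 8))   # 0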
17810
17811Example:
17812""""""""
17813
17814.. code-block:: text
17815
17816      %a = call i8 @llvm.fptoui.sat.i8.f32(float 123.9)              ; yields i8: 123
17817      %b = call i8 @llvm.fptoui.sat.i8.f32(float -5.7)               ; yields i8:   0
17818      %c = call i8 @llvm.fptoui.sat.i8.f32(float 377.0)              ; yields i8: 255
17819      %d = call i8 @llvm.fptoui.sat.i8.f32(float 0xFFF8000000000000) ; yields i8:   0
17820
17821'``llvm.fptosi.sat.*``' Intrinsic
17822^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17823
17824Syntax:
17825"""""""
17826
17827This is an overloaded intrinsic. You can use ``llvm.fptosi.sat`` on any
17828floating-point argument type and any integer result type, or vectors thereof.
17829Not all targets may support all types, however.
17830
17831::
17832
17833      declare i32 @llvm.fptosi.sat.i32.f32(float %f)
17834      declare i19 @llvm.fptosi.sat.i19.f64(double %f)
17835      declare <4 x i100> @llvm.fptosi.sat.v4i100.v4f128(<4 x fp128> %f)
17836
17837Overview:
17838"""""""""
17839
17840This intrinsic converts the argument into a signed integer using saturating
17841semantics.
17842
17843Arguments:
17844""""""""""
17845
17846The argument may be any floating-point or vector of floating-point type. The
17847return value may be any integer or vector of integer type. The number of vector
17848elements in argument and return must be the same.
17849
17850Semantics:
17851""""""""""
17852
17853The conversion to integer is performed subject to the following rules:
17854
17855- If the argument is any NaN, zero is returned.
17856- If the argument is smaller than the smallest representable signed integer of
17857  the result type (this includes negative infinity), the smallest
17858  representable signed integer is returned.
17859- If the argument is larger than the largest representable signed integer of
17860  the result type (this includes positive infinity), the largest representable
17861  signed integer is returned.
17862- Otherwise, the result of rounding the argument towards zero is returned.
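
The rules above can be written as a small Python model. The helper name
``fptosi_sat`` and its ``bits`` parameter (the width of the result type) are
hypothetical.

.. code-block:: python

    import math

    # Illustrative model only; fptosi_sat is a hypothetical helper.
    def fptosi_sat(value, bits):
        """Saturating float to signed conversion for a `bits`-wide result."""
        if math.isnan(value):
            return 0
        smallest = -(1 << (bits - 1))
        largest = (1 << (bits - 1)) - 1
        if value <= smallest:
            return smallest           # includes negative infinity
        if value >= largest:
            return largest            # includes positive infinity
        return math.trunc(value)      # round towards zero

    print(fptosi_sat(23.9, 8))           # 23
    print(fptosi_sat(-130.8, 8))         # -128
    print(fptosi_sat(999.0, 8))          # 127
    print(fptosi_sat(float("nan"), 8))   # 0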
17863
17864Example:
17865""""""""
17866
17867.. code-block:: text
17868
17869      %a = call i8 @llvm.fptosi.sat.i8.f32(float 23.9)               ; yields i8:   23
17870      %b = call i8 @llvm.fptosi.sat.i8.f32(float -130.8)             ; yields i8: -128
17871      %c = call i8 @llvm.fptosi.sat.i8.f32(float 999.0)              ; yields i8:  127
17872      %d = call i8 @llvm.fptosi.sat.i8.f32(float 0xFFF8000000000000) ; yields i8:    0
17873
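The saturation rules above can also be spelled out with ordinary instructions.
The following is a minimal sketch for the ``i8``/``float`` case only; the
function name ``@fptosi_sat_i8_expanded`` is hypothetical, and targets are free
to lower the intrinsic differently:

.. code-block:: llvm

      ; Hypothetical helper, shown for illustration only.
      define i8 @fptosi_sat_i8_expanded(float %f) {
        %is.nan   = fcmp uno float %f, %f        ; true only for NaN
        %too.low  = fcmp olt float %f, -128.0    ; below the smallest i8
        %too.high = fcmp ogt float %f, 127.0     ; above the largest i8
        ; fptosi yields poison for out-of-range or NaN inputs, but in those
        ; cases %conv is never the value chosen by the selects below.
        %conv = fptosi float %f to i8
        %r0 = select i1 %too.low, i8 -128, i8 %conv
        %r1 = select i1 %too.high, i8 127, i8 %r0
        %r  = select i1 %is.nan, i8 0, i8 %r1
        ret i8 %r
      }

The bounds ``-128.0`` and ``127.0`` are exactly representable as ``float``. For
wider result types the largest integer may not be exactly representable in the
argument type, so the comparison constants need more care than this sketch
shows.
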
17874.. _dbg_intrinsics:
17875
17876Debugger Intrinsics
17877-------------------
17878
The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.
17883
17884Exception Handling Intrinsics
17885-----------------------------
17886
The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.
17890
17891Pointer Authentication Intrinsics
17892---------------------------------
17893
The LLVM pointer authentication intrinsics (which all start with the
``llvm.ptrauth.`` prefix) are described in the `Pointer Authentication
<PointerAuth.html#intrinsics>`_ document.
17897
17898.. _int_trampoline:
17899
17900Trampoline Intrinsics
17901---------------------
17902
17903These intrinsics make it possible to excise one parameter, marked with
17904the :ref:`nest <nest>` attribute, from a function. The result is a
17905callable function pointer lacking the nest parameter - the caller does
17906not need to provide a value for it. Instead, the value to use is stored
17907in advance in a "trampoline", a block of memory usually allocated on the
17908stack, which also contains code to splice the nest value into the
17909argument list. This is used to implement the GCC nested function address
17910extension.
17911
17912For example, if the function is ``i32 f(ptr nest %c, i32 %x, i32 %y)``
17913then the resulting function pointer has signature ``i32 (i32, i32)``.
17914It can be created as follows:
17915
17916.. code-block:: llvm
17917
17918      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %nval)
17920      %fp = call ptr @llvm.adjust.trampoline(ptr %tramp)
17921
The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(ptr %nval, i32 %x, i32 %y)``.
17924
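Put together as a self-contained module, the sequence above might look as
follows. This is only a sketch: the body of ``@f``, the function ``@caller``
and the constant arguments are illustrative, and the trampoline size and
alignment are the X86 values used in the fragment above.

.. code-block:: llvm

      declare void @llvm.init.trampoline(ptr, ptr, ptr)
      declare ptr @llvm.adjust.trampoline(ptr)

      ; The nested function: %c is the nest parameter supplied by the trampoline.
      define i32 @f(ptr nest %c, i32 %x, i32 %y) {
        %base = load i32, ptr %c
        %t = add i32 %base, %x
        %sum = add i32 %t, %y
        ret i32 %sum
      }

      ; Illustrative caller: builds the trampoline and calls through it.
      define i32 @caller(ptr %nval) {
        %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
        call void @llvm.init.trampoline(ptr %tramp, ptr @f, ptr %nval)
        %fp = call ptr @llvm.adjust.trampoline(ptr %tramp)
        ; Behaves like: call i32 @f(ptr %nval, i32 1, i32 2)
        %val = call i32 %fp(i32 1, i32 2)
        ret i32 %val
      }
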
17925.. _int_it:
17926
17927'``llvm.init.trampoline``' Intrinsic
17928^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17929
17930Syntax:
17931"""""""
17932
17933::
17934
17935      declare void @llvm.init.trampoline(ptr <tramp>, ptr <func>, ptr <nval>)
17936
17937Overview:
17938"""""""""
17939
17940This fills the memory pointed to by ``tramp`` with executable code,
17941turning it into a trampoline.
17942
17943Arguments:
17944""""""""""
17945
17946The ``llvm.init.trampoline`` intrinsic takes three arguments, all
17947pointers. The ``tramp`` argument must point to a sufficiently large and
17948sufficiently aligned block of memory; this memory is written to by the
17949intrinsic. Note that the size and the alignment are target-specific -
17950LLVM currently provides no portable way of determining them, so a
17951front-end that generates this intrinsic needs to have some
17952target-specific knowledge. The ``func`` argument must hold a function.
17953
17954Semantics:
17955""""""""""
17956
The block of memory pointed to by ``tramp`` is filled with
target-dependent code, turning it into a function. Then ``tramp`` needs to be
17959passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
17960be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
17961function's signature is the same as that of ``func`` with any arguments
17962marked with the ``nest`` attribute removed. At most one such ``nest``
17963argument is allowed, and it must be of pointer type. Calling the new
17964function is equivalent to calling ``func`` with the same argument list,
17965but with ``nval`` used for the missing ``nest`` argument. If, after
17966calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
17967modified, then the effect of any later call to the returned function
17968pointer is undefined.
17969
17970.. _int_at:
17971
17972'``llvm.adjust.trampoline``' Intrinsic
17973^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
17974
17975Syntax:
17976"""""""
17977
17978::
17979
17980      declare ptr @llvm.adjust.trampoline(ptr <tramp>)
17981
17982Overview:
17983"""""""""
17984
17985This performs any required machine-specific adjustment to the address of
17986a trampoline (passed as ``tramp``).
17987
17988Arguments:
17989""""""""""
17990
17991``tramp`` must point to a block of memory which already has trampoline
17992code filled in by a previous call to
17993:ref:`llvm.init.trampoline <int_it>`.
17994
17995Semantics:
17996""""""""""
17997
On some architectures the address of the code to be executed needs to be
different from the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine-specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.
18003
18004
18005.. _int_vp:
18006
18007Vector Predication Intrinsics
18008-----------------------------
18009VP intrinsics are intended for predicated SIMD/vector code.  A typical VP
18010operation takes a vector mask and an explicit vector length parameter as in:
18011
18012::
18013
18014      <W x T> llvm.vp.<opcode>.*(<W x T> %x, <W x T> %y, <W x i1> %mask, i32 %evl)
18015
The vector mask parameter (``%mask``) is always a vector of ``i1`` type, for
example ``<32 x i1>``.  The explicit vector length parameter (``%evl``) always
has the type ``i32`` and is an unsigned integer value.  It is in the range:
18020
18021::
18022
18023      0 <= %evl <= W,  where W is the number of vector elements
18024
18025Note that for :ref:`scalable vector types <t_vector>` ``W`` is the runtime
18026length of the vector.
18027
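For example, for ``<vscale x 4 x i32>`` the element count can be computed with
``llvm.vscale``, and passing it as ``%evl`` leaves lane selection entirely to
the mask. A minimal sketch (the function name ``@add_all_lanes`` is
illustrative):

.. code-block:: llvm

      declare i32 @llvm.vscale.i32()
      declare <vscale x 4 x i32> @llvm.vp.add.nxv4i32(<vscale x 4 x i32>, <vscale x 4 x i32>, <vscale x 4 x i1>, i32)

      define <vscale x 4 x i32> @add_all_lanes(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i1> %mask) {
        %vscale = call i32 @llvm.vscale.i32()
        %w = mul i32 %vscale, 4    ; W = vscale * 4 for <vscale x 4 x T>
        %r = call <vscale x 4 x i32> @llvm.vp.add.nxv4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i1> %mask, i32 %w)
        ret <vscale x 4 x i32> %r
      }
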
The VP intrinsic has undefined behavior if ``%evl > W``.  The explicit vector
length (``%evl``) creates a mask, ``%EVLmask``, with all elements
``0 <= i < %evl`` set to True, and all other lanes ``%evl <= i < W`` to False.
A new mask ``M`` is calculated with an element-wise AND from ``%mask`` and
``%EVLmask``:
18032
18033::
18034
18035      M = %mask AND %EVLmask
18036
18037A vector operation ``<opcode>`` on vectors ``A`` and ``B`` calculates:
18038
18039::
18040
       (A <opcode> B)[i] = {  A[i] <opcode> B[i]   if M[i] = True, and
                           {  undef                otherwise
18043
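As an illustration of these rules, a fixed-width ``llvm.vp.add`` on
``<4 x i32>`` could be written without VP intrinsics roughly as follows. This
is a sketch that assumes ``%evl <= 4``; the function name ``@vp_add_expanded``
is hypothetical:

.. code-block:: llvm

      define <4 x i32> @vp_add_expanded(<4 x i32> %x, <4 x i32> %y, <4 x i1> %mask, i32 %evl) {
        ; %EVLmask: lane i is true iff i < %evl
        %evl.ins = insertelement <4 x i32> poison, i32 %evl, i32 0
        %evl.splat = shufflevector <4 x i32> %evl.ins, <4 x i32> poison, <4 x i32> zeroinitializer
        %evlmask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %evl.splat
        ; M = %mask AND %EVLmask
        %m = and <4 x i1> %mask, %evlmask
        ; Enabled lanes take the sum; disabled lanes are undefined.
        %t = add <4 x i32> %x, %y
        %r = select <4 x i1> %m, <4 x i32> %t, <4 x i32> undef
        ret <4 x i32> %r
      }

A direct rewrite like this is only appropriate for operations that are safe to
execute on disabled lanes. Operations that may have undefined behavior, such as
division, need the operands of disabled lanes replaced with safe values first.
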
18044Optimization Hint
18045^^^^^^^^^^^^^^^^^
18046
18047Some targets, such as AVX512, do not support the %evl parameter in hardware.
18048The use of an effective %evl is discouraged for those targets.  The function
18049``TargetTransformInfo::hasActiveVectorLength()`` returns true when the target
18050has native support for %evl.
18051
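One way to avoid an effective ``%evl`` on such targets is to pass the number of
vector elements as ``%evl`` and to express any length limit through the mask
instead. A sketch, assuming the front-end has already folded the length limit
into ``%mask``:

.. code-block:: llvm

      ;; %evl equals the vector length (4), so only %mask disables lanes.
      %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 4)
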
18052.. _int_vp_select:
18053
18054'``llvm.vp.select.*``' Intrinsics
18055^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18056
18057Syntax:
18058"""""""
18059This is an overloaded intrinsic.
18060
18061::
18062
18063      declare <16 x i32>  @llvm.vp.select.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <evl>)
18064      declare <vscale x 4 x i64>  @llvm.vp.select.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <evl>)
18065
18066Overview:
18067"""""""""
18068
18069The '``llvm.vp.select``' intrinsic is used to choose one value based on a
18070condition vector, without IR-level branching.
18071
18072Arguments:
18073""""""""""
18074
18075The first operand is a vector of ``i1`` and indicates the condition.  The
18076second operand is the value that is selected where the condition vector is
18077true.  The third operand is the value that is selected where the condition
18078vector is false.  The vectors must be of the same size.  The fourth operand is
18079the explicit vector length.
18080
18081#. The optional ``fast-math flags`` marker indicates that the select has one or
18082   more :ref:`fast-math flags <fastmath>`. These are optimization hints to
18083   enable otherwise unsafe floating-point optimizations. Fast-math flags are
18084   only valid for selects that return a floating-point scalar or vector type,
18085   or an array (nested to any depth) of floating-point scalar or vector types.
18086
18087Semantics:
18088""""""""""
18089
18090The intrinsic selects lanes from the second and third operand depending on a
18091condition vector.
18092
All result lanes at positions greater than or equal to ``%evl`` are undefined.
For all lanes below ``%evl`` where the condition vector is true, the lane is
taken from the second operand.  Otherwise, the lane is taken from the third
operand.
18097
18098Example:
18099""""""""
18100
18101.. code-block:: llvm
18102
18103      %r = call <4 x i32> @llvm.vp.select.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %evl)
18104
18105      ;;; Expansion.
18106      ;; Any result is legal on lanes at and above %evl.
18107      %also.r = select <4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false
18108
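With constant operands the effect of ``%evl`` becomes concrete. In this
illustrative call (the operand values are chosen arbitrarily) only the first
two lanes have a defined result:

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.select.v4i32(<4 x i1> <i1 true, i1 false, i1 true, i1 false>,
                                                <4 x i32> <i32 10, i32 11, i32 12, i32 13>,
                                                <4 x i32> <i32 20, i32 21, i32 22, i32 23>,
                                                i32 2)
      ;; lane 0: 10 (condition true, taken from the second operand)
      ;; lane 1: 21 (condition false, taken from the third operand)
      ;; lanes 2 and 3: undefined (at or above %evl)
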
18109
18110.. _int_vp_merge:
18111
18112'``llvm.vp.merge.*``' Intrinsics
18113^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18114
18115Syntax:
18116"""""""
18117This is an overloaded intrinsic.
18118
18119::
18120
18121      declare <16 x i32>  @llvm.vp.merge.v16i32 (<16 x i1> <condition>, <16 x i32> <on_true>, <16 x i32> <on_false>, i32 <pivot>)
18122      declare <vscale x 4 x i64>  @llvm.vp.merge.nxv4i64 (<vscale x 4 x i1> <condition>, <vscale x 4 x i64> <on_true>, <vscale x 4 x i64> <on_false>, i32 <pivot>)
18123
18124Overview:
18125"""""""""
18126
18127The '``llvm.vp.merge``' intrinsic is used to choose one value based on a
18128condition vector and an index operand, without IR-level branching.
18129
18130Arguments:
18131""""""""""
18132
The first operand is a vector of ``i1`` and indicates the condition.  The
second operand is the value that is merged where the condition vector is true.
The third operand is the value that is selected where the condition vector is
false or the lane position is greater than or equal to the pivot. The fourth
operand is the pivot.
18138
18139#. The optional ``fast-math flags`` marker indicates that the merge has one or
18140   more :ref:`fast-math flags <fastmath>`. These are optimization hints to
18141   enable otherwise unsafe floating-point optimizations. Fast-math flags are
18142   only valid for merges that return a floating-point scalar or vector type,
18143   or an array (nested to any depth) of floating-point scalar or vector types.
18144
18145Semantics:
18146""""""""""
18147
18148The intrinsic selects lanes from the second and third operand depending on a
18149condition vector and pivot value.
18150
18151For all lanes where the condition vector is true and the lane position is less
18152than ``%pivot`` the lane is taken from the second operand.  Otherwise, the lane
18153is taken from the third operand.
18154
18155Example:
18156""""""""
18157
18158.. code-block:: llvm
18159
18160      %r = call <4 x i32> @llvm.vp.merge.v4i32(<4 x i1> %cond, <4 x i32> %on_true, <4 x i32> %on_false, i32 %pivot)
18161
18162      ;;; Expansion.
18163      ;; Lanes at and above %pivot are taken from %on_false
18164      %atfirst = insertelement <4 x i32> undef, i32 %pivot, i32 0
18165      %splat = shufflevector <4 x i32> %atfirst, <4 x i32> poison, <4 x i32> zeroinitializer
18166      %pivotmask = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, <4 x i32> %splat
18167      %mergemask = and <4 x i1> %cond, <4 x i1> %pivotmask
18168      %also.r = select <4 x i1> %mergemask, <4 x i32> %on_true, <4 x i32> %on_false
18169
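With constant operands (values chosen arbitrarily) and a pivot of 2, every lane
of the result is defined. This differs from ``llvm.vp.select``, where lanes at
and above ``%evl`` are undefined:

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.merge.v4i32(<4 x i1> <i1 true, i1 false, i1 true, i1 false>,
                                               <4 x i32> <i32 10, i32 11, i32 12, i32 13>,
                                               <4 x i32> <i32 20, i32 21, i32 22, i32 23>,
                                               i32 2)
      ;; lane 0: 10 (condition true and below the pivot)
      ;; lane 1: 21 (condition false)
      ;; lane 2: 22 (condition true, but at the pivot, so taken from the third operand)
      ;; lane 3: 23 (above the pivot)
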
18170
18171
18172.. _int_vp_add:
18173
18174'``llvm.vp.add.*``' Intrinsics
18175^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18176
18177Syntax:
18178"""""""
18179This is an overloaded intrinsic.
18180
18181::
18182
18183      declare <16 x i32>  @llvm.vp.add.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18184      declare <vscale x 4 x i32>  @llvm.vp.add.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18185      declare <256 x i64>  @llvm.vp.add.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18186
18187Overview:
18188"""""""""
18189
18190Predicated integer addition of two vectors of integers.
18191
18192
18193Arguments:
18194""""""""""
18195
18196The first two operands and the result have the same vector of integer type. The
18197third operand is the vector mask and has the same number of elements as the
18198result vector type. The fourth operand is the explicit vector length of the
18199operation.
18200
18201Semantics:
18202""""""""""
18203
18204The '``llvm.vp.add``' intrinsic performs integer addition (:ref:`add <i_add>`)
18205of the first and second vector operand on each enabled lane.  The result on
18206disabled lanes is undefined.
18207
18208Examples:
18209"""""""""
18210
18211.. code-block:: llvm
18212
18213      %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18214      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18215
18216      %t = add <4 x i32> %a, %b
18217      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18218
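The interaction of the mask and the explicit vector length can be seen with
constant operands. The values below are chosen arbitrarily for illustration:

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> <i32 1, i32 2, i32 3, i32 4>,
                                             <4 x i32> <i32 10, i32 20, i32 30, i32 40>,
                                             <4 x i1> <i1 true, i1 false, i1 true, i1 true>,
                                             i32 3)
      ;; lane 0: 11        (enabled)
      ;; lane 1: undefined (mask bit is false)
      ;; lane 2: 33        (enabled)
      ;; lane 3: undefined (at or above %evl, although the mask bit is true)
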
18219.. _int_vp_sub:
18220
18221'``llvm.vp.sub.*``' Intrinsics
18222^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18223
18224Syntax:
18225"""""""
18226This is an overloaded intrinsic.
18227
18228::
18229
18230      declare <16 x i32>  @llvm.vp.sub.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18231      declare <vscale x 4 x i32>  @llvm.vp.sub.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18232      declare <256 x i64>  @llvm.vp.sub.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18233
18234Overview:
18235"""""""""
18236
18237Predicated integer subtraction of two vectors of integers.
18238
18239
18240Arguments:
18241""""""""""
18242
18243The first two operands and the result have the same vector of integer type. The
18244third operand is the vector mask and has the same number of elements as the
18245result vector type. The fourth operand is the explicit vector length of the
18246operation.
18247
18248Semantics:
18249""""""""""
18250
18251The '``llvm.vp.sub``' intrinsic performs integer subtraction
18252(:ref:`sub <i_sub>`)  of the first and second vector operand on each enabled
18253lane. The result on disabled lanes is undefined.
18254
18255Examples:
18256"""""""""
18257
18258.. code-block:: llvm
18259
18260      %r = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18261      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18262
18263      %t = sub <4 x i32> %a, %b
18264      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18265
18266
18267
18268.. _int_vp_mul:
18269
18270'``llvm.vp.mul.*``' Intrinsics
18271^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18272
18273Syntax:
18274"""""""
18275This is an overloaded intrinsic.
18276
18277::
18278
18279      declare <16 x i32>  @llvm.vp.mul.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32>  @llvm.vp.mul.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18281      declare <256 x i64>  @llvm.vp.mul.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18282
18283Overview:
18284"""""""""
18285
18286Predicated integer multiplication of two vectors of integers.
18287
18288
18289Arguments:
18290""""""""""
18291
18292The first two operands and the result have the same vector of integer type. The
18293third operand is the vector mask and has the same number of elements as the
18294result vector type. The fourth operand is the explicit vector length of the
18295operation.
18296
18297Semantics:
18298""""""""""
18299The '``llvm.vp.mul``' intrinsic performs integer multiplication
18300(:ref:`mul <i_mul>`) of the first and second vector operand on each enabled
18301lane. The result on disabled lanes is undefined.
18302
18303Examples:
18304"""""""""
18305
18306.. code-block:: llvm
18307
18308      %r = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18309      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18310
18311      %t = mul <4 x i32> %a, %b
18312      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18313
18314
18315.. _int_vp_sdiv:
18316
18317'``llvm.vp.sdiv.*``' Intrinsics
18318^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18319
18320Syntax:
18321"""""""
18322This is an overloaded intrinsic.
18323
18324::
18325
18326      declare <16 x i32>  @llvm.vp.sdiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18327      declare <vscale x 4 x i32>  @llvm.vp.sdiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18328      declare <256 x i64>  @llvm.vp.sdiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18329
18330Overview:
18331"""""""""
18332
18333Predicated, signed division of two vectors of integers.
18334
18335
18336Arguments:
18337""""""""""
18338
18339The first two operands and the result have the same vector of integer type. The
18340third operand is the vector mask and has the same number of elements as the
18341result vector type. The fourth operand is the explicit vector length of the
18342operation.
18343
18344Semantics:
18345""""""""""
18346
18347The '``llvm.vp.sdiv``' intrinsic performs signed division (:ref:`sdiv <i_sdiv>`)
18348of the first and second vector operand on each enabled lane.  The result on
18349disabled lanes is undefined.
18350
18351Examples:
18352"""""""""
18353
18354.. code-block:: llvm
18355
18356      %r = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18357      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18358
18359      %t = sdiv <4 x i32> %a, %b
18360      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18361
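Note that the unpredicated ``sdiv`` in the example above also executes on
disabled lanes. Where that matters, for instance because the divisor may be
zero on a disabled lane, an expansion can first substitute a harmless divisor.
The following is a sketch of one such approach, not a description of what any
particular pass emits; ``%m`` is assumed to already combine ``%mask`` with the
mask derived from ``%evl``:

.. code-block:: llvm

      ;; Use a divisor of 1 on disabled lanes so that the plain sdiv cannot
      ;; divide by zero or overflow there.
      %safe.b = select <4 x i1> %m, <4 x i32> %b, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
      %t = sdiv <4 x i32> %a, %safe.b
      %also.r = select <4 x i1> %m, <4 x i32> %t, <4 x i32> undef
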
18362
18363.. _int_vp_udiv:
18364
18365'``llvm.vp.udiv.*``' Intrinsics
18366^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18367
18368Syntax:
18369"""""""
18370This is an overloaded intrinsic.
18371
18372::
18373
18374      declare <16 x i32>  @llvm.vp.udiv.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18375      declare <vscale x 4 x i32>  @llvm.vp.udiv.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18376      declare <256 x i64>  @llvm.vp.udiv.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18377
18378Overview:
18379"""""""""
18380
18381Predicated, unsigned division of two vectors of integers.
18382
18383
18384Arguments:
18385""""""""""
18386
The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.
18388
18389Semantics:
18390""""""""""
18391
18392The '``llvm.vp.udiv``' intrinsic performs unsigned division
18393(:ref:`udiv <i_udiv>`) of the first and second vector operand on each enabled
18394lane. The result on disabled lanes is undefined.
18395
18396Examples:
18397"""""""""
18398
18399.. code-block:: llvm
18400
18401      %r = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18402      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18403
18404      %t = udiv <4 x i32> %a, %b
18405      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18406
18407
18408
18409.. _int_vp_srem:
18410
18411'``llvm.vp.srem.*``' Intrinsics
18412^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18413
18414Syntax:
18415"""""""
18416This is an overloaded intrinsic.
18417
18418::
18419
18420      declare <16 x i32>  @llvm.vp.srem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18421      declare <vscale x 4 x i32>  @llvm.vp.srem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18422      declare <256 x i64>  @llvm.vp.srem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18423
18424Overview:
18425"""""""""
18426
Predicated computation of the signed remainder of two integer vectors.
18428
18429
18430Arguments:
18431""""""""""
18432
18433The first two operands and the result have the same vector of integer type. The
18434third operand is the vector mask and has the same number of elements as the
18435result vector type. The fourth operand is the explicit vector length of the
18436operation.
18437
18438Semantics:
18439""""""""""
18440
18441The '``llvm.vp.srem``' intrinsic computes the remainder of the signed division
18442(:ref:`srem <i_srem>`) of the first and second vector operand on each enabled
18443lane.  The result on disabled lanes is undefined.
18444
18445Examples:
18446"""""""""
18447
18448.. code-block:: llvm
18449
18450      %r = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18451      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18452
18453      %t = srem <4 x i32> %a, %b
18454      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18455
18456
18457
18458.. _int_vp_urem:
18459
18460'``llvm.vp.urem.*``' Intrinsics
18461^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18462
18463Syntax:
18464"""""""
18465This is an overloaded intrinsic.
18466
18467::
18468
18469      declare <16 x i32>  @llvm.vp.urem.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18470      declare <vscale x 4 x i32>  @llvm.vp.urem.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18471      declare <256 x i64>  @llvm.vp.urem.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18472
18473Overview:
18474"""""""""
18475
18476Predicated computation of the unsigned remainder of two integer vectors.
18477
18478
18479Arguments:
18480""""""""""
18481
18482The first two operands and the result have the same vector of integer type. The
18483third operand is the vector mask and has the same number of elements as the
18484result vector type. The fourth operand is the explicit vector length of the
18485operation.
18486
18487Semantics:
18488""""""""""
18489
18490The '``llvm.vp.urem``' intrinsic computes the remainder of the unsigned division
18491(:ref:`urem <i_urem>`) of the first and second vector operand on each enabled
18492lane.  The result on disabled lanes is undefined.
18493
18494Examples:
18495"""""""""
18496
18497.. code-block:: llvm
18498
18499      %r = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18500      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18501
18502      %t = urem <4 x i32> %a, %b
18503      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18504
18505
18506.. _int_vp_ashr:
18507
18508'``llvm.vp.ashr.*``' Intrinsics
18509^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18510
18511Syntax:
18512"""""""
18513This is an overloaded intrinsic.
18514
18515::
18516
18517      declare <16 x i32>  @llvm.vp.ashr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18518      declare <vscale x 4 x i32>  @llvm.vp.ashr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18519      declare <256 x i64>  @llvm.vp.ashr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18520
18521Overview:
18522"""""""""
18523
18524Vector-predicated arithmetic right-shift.
18525
18526
18527Arguments:
18528""""""""""
18529
18530The first two operands and the result have the same vector of integer type. The
18531third operand is the vector mask and has the same number of elements as the
18532result vector type. The fourth operand is the explicit vector length of the
18533operation.
18534
18535Semantics:
18536""""""""""
18537
18538The '``llvm.vp.ashr``' intrinsic computes the arithmetic right shift
18539(:ref:`ashr <i_ashr>`) of the first operand by the second operand on each
18540enabled lane. The result on disabled lanes is undefined.
18541
18542Examples:
18543"""""""""
18544
18545.. code-block:: llvm
18546
18547      %r = call <4 x i32> @llvm.vp.ashr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18548      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18549
18550      %t = ashr <4 x i32> %a, %b
18551      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18552
18553
18554.. _int_vp_lshr:
18555
18556
18557'``llvm.vp.lshr.*``' Intrinsics
18558^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18559
18560Syntax:
18561"""""""
18562This is an overloaded intrinsic.
18563
18564::
18565
18566      declare <16 x i32>  @llvm.vp.lshr.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18567      declare <vscale x 4 x i32>  @llvm.vp.lshr.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18568      declare <256 x i64>  @llvm.vp.lshr.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18569
18570Overview:
18571"""""""""
18572
18573Vector-predicated logical right-shift.
18574
18575
18576Arguments:
18577""""""""""
18578
18579The first two operands and the result have the same vector of integer type. The
18580third operand is the vector mask and has the same number of elements as the
18581result vector type. The fourth operand is the explicit vector length of the
18582operation.
18583
18584Semantics:
18585""""""""""
18586
18587The '``llvm.vp.lshr``' intrinsic computes the logical right shift
18588(:ref:`lshr <i_lshr>`) of the first operand by the second operand on each
18589enabled lane. The result on disabled lanes is undefined.
18590
18591Examples:
18592"""""""""
18593
18594.. code-block:: llvm
18595
18596      %r = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18597      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18598
18599      %t = lshr <4 x i32> %a, %b
18600      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18601
18602
18603.. _int_vp_shl:
18604
18605'``llvm.vp.shl.*``' Intrinsics
18606^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18607
18608Syntax:
18609"""""""
18610This is an overloaded intrinsic.
18611
18612::
18613
18614      declare <16 x i32>  @llvm.vp.shl.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18615      declare <vscale x 4 x i32>  @llvm.vp.shl.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18616      declare <256 x i64>  @llvm.vp.shl.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18617
18618Overview:
18619"""""""""
18620
18621Vector-predicated left shift.
18622
18623
18624Arguments:
18625""""""""""
18626
18627The first two operands and the result have the same vector of integer type. The
18628third operand is the vector mask and has the same number of elements as the
18629result vector type. The fourth operand is the explicit vector length of the
18630operation.
18631
18632Semantics:
18633""""""""""
18634
18635The '``llvm.vp.shl``' intrinsic computes the left shift (:ref:`shl <i_shl>`) of
18636the first operand by the second operand on each enabled lane.  The result on
18637disabled lanes is undefined.
18638
18639Examples:
18640"""""""""
18641
18642.. code-block:: llvm
18643
18644      %r = call <4 x i32> @llvm.vp.shl.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18645      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18646
18647      %t = shl <4 x i32> %a, %b
18648      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18649
18650
18651.. _int_vp_or:
18652
18653'``llvm.vp.or.*``' Intrinsics
18654^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18655
18656Syntax:
18657"""""""
18658This is an overloaded intrinsic.
18659
18660::
18661
18662      declare <16 x i32>  @llvm.vp.or.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18663      declare <vscale x 4 x i32>  @llvm.vp.or.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18664      declare <256 x i64>  @llvm.vp.or.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18665
18666Overview:
18667"""""""""
18668
18669Vector-predicated or.
18670
18671
18672Arguments:
18673""""""""""
18674
18675The first two operands and the result have the same vector of integer type. The
18676third operand is the vector mask and has the same number of elements as the
18677result vector type. The fourth operand is the explicit vector length of the
18678operation.
18679
18680Semantics:
18681""""""""""
18682
18683The '``llvm.vp.or``' intrinsic performs a bitwise or (:ref:`or <i_or>`) of the
18684first two operands on each enabled lane.  The result on disabled lanes is
18685undefined.
18686
18687Examples:
18688"""""""""
18689
18690.. code-block:: llvm
18691
18692      %r = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18693      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18694
18695      %t = or <4 x i32> %a, %b
18696      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18697
18698
18699.. _int_vp_and:
18700
18701'``llvm.vp.and.*``' Intrinsics
18702^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
18703
18704Syntax:
18705"""""""
18706This is an overloaded intrinsic.
18707
18708::
18709
18710      declare <16 x i32>  @llvm.vp.and.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
18711      declare <vscale x 4 x i32>  @llvm.vp.and.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
18712      declare <256 x i64>  @llvm.vp.and.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)
18713
18714Overview:
18715"""""""""
18716
18717Vector-predicated and.
18718
18719
18720Arguments:
18721""""""""""
18722
18723The first two operands and the result have the same vector of integer type. The
18724third operand is the vector mask and has the same number of elements as the
18725result vector type. The fourth operand is the explicit vector length of the
18726operation.
18727
18728Semantics:
18729""""""""""
18730
The '``llvm.vp.and``' intrinsic performs a bitwise and (:ref:`and <i_and>`) of
18732the first two operands on each enabled lane.  The result on disabled lanes is
18733undefined.
18734
18735Examples:
18736"""""""""
18737
18738.. code-block:: llvm
18739
18740      %r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
18741      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
18742
18743      %t = and <4 x i32> %a, %b
18744      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
18745
18746
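As a further illustration, the sketch below enables lanes through the explicit
vector length alone by passing an all-true mask. The function name
``@and_first_n`` and the parameter ``%n`` are hypothetical, and ``%n`` is
assumed not to exceed the vector length of 4.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.and.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      ; Hypothetical helper: lanes 0 .. %n-1 are enabled, the remaining lanes are disabled.
      ; Assumes %n <= 4.
      define <4 x i32> @and_first_n(<4 x i32> %a, <4 x i32> %b, i32 %n) {
        %r = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 %n)
        ret <4 x i32> %r
      }
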
.. _int_vp_xor:

'``llvm.vp.xor.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x i32>  @llvm.vp.xor.v16i32 (<16 x i32> <left_op>, <16 x i32> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x i32>  @llvm.vp.xor.nxv4i32 (<vscale x 4 x i32> <left_op>, <vscale x 4 x i32> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64>  @llvm.vp.xor.v256i64 (<256 x i64> <left_op>, <256 x i64> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Vector-predicated, bitwise xor.


Arguments:
""""""""""

The first two operands and the result have the same vector of integer type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.xor``' intrinsic performs a bitwise xor (:ref:`xor <i_xor>`) of
the first two operands on each enabled lane.
The result on disabled lanes is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = xor <4 x i32> %a, %b
      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef


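Predication can also be driven by the mask alone. The sketch below assumes a
fixed-width ``<4 x i32>`` vector and passes its full length, 4, as the explicit
vector length, so that lane enablement depends only on ``%mask``. The function
name ``@xor_masked`` is hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.xor.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

      ; Hypothetical helper: the explicit vector length covers all four lanes,
      ; so a lane is enabled exactly when its %mask bit is true.
      define <4 x i32> @xor_masked(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask) {
        %r = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 4)
        ret <4 x i32> %r
      }
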
.. _int_vp_fadd:

'``llvm.vp.fadd.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float>  @llvm.vp.fadd.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float>  @llvm.vp.fadd.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double>  @llvm.vp.fadd.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point addition of two vectors of floating-point values.


Arguments:
""""""""""

The first two operands and the result have the same vector of floating-point type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.fadd``' intrinsic performs floating-point addition (:ref:`fadd <i_fadd>`)
of the first and second vector operand on each enabled lane.  The result on
disabled lanes is undefined.  The operation is performed in the default
floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fadd <4 x float> %a, %b
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef


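Because the result on disabled lanes is undefined, code that needs a defined
value there has to supply one itself. A minimal sketch, assuming the full
vector length is passed as the explicit vector length so that the disabled
lanes are exactly those where ``%mask`` is false; ``@fadd_or_passthru`` and
``%passthru`` are hypothetical names.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.fadd.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

      ; Hypothetical helper: enabled lanes receive %a + %b, disabled lanes receive %passthru.
      define <4 x float> @fadd_or_passthru(<4 x float> %a, <4 x float> %b, <4 x float> %passthru, <4 x i1> %mask) {
        %sum = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 4)
        ; The select picks %sum only on lanes where it is defined.
        %r = select <4 x i1> %mask, <4 x float> %sum, <4 x float> %passthru
        ret <4 x float> %r
      }
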
.. _int_vp_fsub:

'``llvm.vp.fsub.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float>  @llvm.vp.fsub.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float>  @llvm.vp.fsub.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double>  @llvm.vp.fsub.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point subtraction of two vectors of floating-point values.


Arguments:
""""""""""

The first two operands and the result have the same vector of floating-point type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.fsub``' intrinsic performs floating-point subtraction (:ref:`fsub <i_fsub>`)
of the first and second vector operand on each enabled lane.  The result on
disabled lanes is undefined.  The operation is performed in the default
floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fsub <4 x float> %a, %b
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef


.. _int_vp_fmul:

'``llvm.vp.fmul.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float>  @llvm.vp.fmul.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float>  @llvm.vp.fmul.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double>  @llvm.vp.fmul.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point multiplication of two vectors of floating-point values.


Arguments:
""""""""""

The first two operands and the result have the same vector of floating-point type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.fmul``' intrinsic performs floating-point multiplication (:ref:`fmul <i_fmul>`)
of the first and second vector operand on each enabled lane.  The result on
disabled lanes is undefined.  The operation is performed in the default
floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fmul <4 x float> %a, %b
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef


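The overload suffix follows the vector type of the operands. As an illustrative
sketch (the function name ``@fmul_v2f64`` is hypothetical), a ``<2 x double>``
multiplication uses the ``v2f64`` suffix together with a ``<2 x i1>`` mask:

.. code-block:: llvm

      declare <2 x double> @llvm.vp.fmul.v2f64(<2 x double>, <2 x double>, <2 x i1>, i32)

      ; Hypothetical wrapper around the <2 x double> overload.
      define <2 x double> @fmul_v2f64(<2 x double> %a, <2 x double> %b, <2 x i1> %mask, i32 %evl) {
        %r = call <2 x double> @llvm.vp.fmul.v2f64(<2 x double> %a, <2 x double> %b, <2 x i1> %mask, i32 %evl)
        ret <2 x double> %r
      }
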
.. _int_vp_fdiv:

'``llvm.vp.fdiv.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float>  @llvm.vp.fdiv.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float>  @llvm.vp.fdiv.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double>  @llvm.vp.fdiv.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point division of two vectors of floating-point values.


Arguments:
""""""""""

The first two operands and the result have the same vector of floating-point type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.fdiv``' intrinsic performs floating-point division (:ref:`fdiv <i_fdiv>`)
of the first and second vector operand on each enabled lane.  The result on
disabled lanes is undefined.  The operation is performed in the default
floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.fdiv.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fdiv <4 x float> %a, %b
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef


.. _int_vp_frem:

'``llvm.vp.frem.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float>  @llvm.vp.frem.v16f32 (<16 x float> <left_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float>  @llvm.vp.frem.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double>  @llvm.vp.frem.v256f64 (<256 x double> <left_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point remainder of two vectors of floating-point values.


Arguments:
""""""""""

The first two operands and the result have the same vector of floating-point type. The
third operand is the vector mask and has the same number of elements as the
result vector type. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.frem``' intrinsic performs floating-point remainder (:ref:`frem <i_frem>`)
of the first and second vector operand on each enabled lane.  The result on
disabled lanes is undefined.  The operation is performed in the default
floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.frem.v4f32(<4 x float> %a, <4 x float> %b, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = frem <4 x float> %a, %b
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef


.. _int_vp_fneg:

'``llvm.vp.fneg.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float>  @llvm.vp.fneg.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float>  @llvm.vp.fneg.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double>  @llvm.vp.fneg.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point negation of a vector of floating-point values.


Arguments:
""""""""""

The first operand and the result have the same vector of floating-point type.
The second operand is the vector mask and has the same number of elements as the
result vector type. The third operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.fneg``' intrinsic performs floating-point negation (:ref:`fneg <i_fneg>`)
of the first vector operand on each enabled lane.  The result on disabled lanes
is undefined.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.fneg.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = fneg <4 x float> %a
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef


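The same call shape applies to scalable vectors. The sketch below uses the
``nxv4f32`` overload from the syntax listing; the wrapper name
``@fneg_scalable`` is hypothetical, and ``%evl`` is assumed not to exceed the
runtime number of lanes.

.. code-block:: llvm

      declare <vscale x 4 x float> @llvm.vp.fneg.nxv4f32(<vscale x 4 x float>, <vscale x 4 x i1>, i32)

      ; Hypothetical wrapper: negates the enabled lanes of a scalable vector.
      define <vscale x 4 x float> @fneg_scalable(<vscale x 4 x float> %a, <vscale x 4 x i1> %mask, i32 %evl) {
        %r = call <vscale x 4 x float> @llvm.vp.fneg.nxv4f32(<vscale x 4 x float> %a, <vscale x 4 x i1> %mask, i32 %evl)
        ret <vscale x 4 x float> %r
      }
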
.. _int_vp_fma:

'``llvm.vp.fma.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare <16 x float>  @llvm.vp.fma.v16f32 (<16 x float> <left_op>, <16 x float> <middle_op>, <16 x float> <right_op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float>  @llvm.vp.fma.nxv4f32 (<vscale x 4 x float> <left_op>, <vscale x 4 x float> <middle_op>, <vscale x 4 x float> <right_op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x double>  @llvm.vp.fma.v256f64 (<256 x double> <left_op>, <256 x double> <middle_op>, <256 x double> <right_op>, <256 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point fused multiply-add of three vectors of floating-point values.


Arguments:
""""""""""

The first three operands and the result have the same vector of floating-point type. The
fourth operand is the vector mask and has the same number of elements as the
result vector type. The fifth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.fma``' intrinsic performs floating-point fused multiply-add (:ref:`llvm.fma <int_fma>`)
of the first, second, and third vector operand on each enabled lane.  The result on
disabled lanes is undefined.  The operation is performed in the default
floating-point environment.

Examples:
"""""""""

.. code-block:: llvm

      %r = call <4 x float> @llvm.vp.fma.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c, <4 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r

      %t = call <4 x float> @llvm.fma.v4f32(<4 x float> %a, <4 x float> %b, <4 x float> %c)
      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef


.. _int_vp_reduce_add:

'``llvm.vp.reduce.add.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.add.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.add.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``ADD`` reduction of a vector and a scalar starting value,
returning the result as a scalar.

Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second operand is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third operand is the vector mask and
is a vector of boolean values with the same number of elements as the vector
operand. The fourth operand is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.add``' intrinsic performs the integer ``ADD`` reduction
(:ref:`llvm.vector.reduce.add <int_vector_reduce_add>`) of the vector operand
``val`` on each enabled lane, adding it to the scalar ``start_value``. Disabled
lanes are treated as containing the neutral value ``0`` (i.e. having no effect
on the reduction operation). If the vector length is zero, the result is equal
to ``start_value``.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.add.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> zeroinitializer
      %reduction = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %masked.a)
      %also.r = add i32 %reduction, %start


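For instance, passing the neutral value ``0`` as the start value yields the sum
of the enabled lanes only. A minimal sketch; the function name
``@sum_enabled`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.add.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      ; Hypothetical helper: the start value 0 does not contribute to the sum.
      define i32 @sum_enabled(<4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        %r = call i32 @llvm.vp.reduce.add.v4i32(i32 0, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
        ret i32 %r
      }
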
.. _int_vp_reduce_fadd:

'``llvm.vp.reduce.fadd.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vp.reduce.fadd.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare double @llvm.vp.reduce.fadd.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point ``ADD`` reduction of a vector and a scalar starting
value, returning the result as a scalar.

Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
floating-point type equal to the result type. The second operand is the vector
on which the reduction is performed and must be a vector of floating-point
values whose element type is the result/start type. The third operand is the
vector mask and is a vector of boolean values with the same number of elements
as the vector operand. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.fadd``' intrinsic performs the floating-point ``ADD``
reduction (:ref:`llvm.vector.reduce.fadd <int_vector_reduce_fadd>`) of the
vector operand ``val`` on each enabled lane, adding it to the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
``-0.0`` (i.e. having no effect on the reduction operation). If no lanes are
enabled, the resulting value will be equal to ``start_value``.

To ignore the start value, the neutral value can be used.

See the unpredicated version (:ref:`llvm.vector.reduce.fadd
<int_vector_reduce_fadd>`) for more detail on the semantics of the reduction.

Examples:
"""""""""

.. code-block:: llvm

      %r = call float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float -0.0, float -0.0, float -0.0, float -0.0>
      %also.r = call float @llvm.vector.reduce.fadd.v4f32(float %start, <4 x float> %masked.a)


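As a sketch of ignoring the start value, the call below passes the neutral
value ``-0.0``, so the result depends only on the enabled lanes of ``%a``. The
function name ``@fsum_enabled`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fadd.v4f32(float, <4 x float>, <4 x i1>, i32)

      ; Hypothetical helper: -0.0 is the neutral start value for the fadd reduction.
      define float @fsum_enabled(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        %r = call float @llvm.vp.reduce.fadd.v4f32(float -0.0, <4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret float %r
      }
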
.. _int_vp_reduce_mul:

'``llvm.vp.reduce.mul.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.mul.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.mul.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``MUL`` reduction of a vector and a scalar starting value,
returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second operand is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third operand is the vector mask and
is a vector of boolean values with the same number of elements as the vector
operand. The fourth operand is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.mul``' intrinsic performs the integer ``MUL`` reduction
(:ref:`llvm.vector.reduce.mul <int_vector_reduce_mul>`) of the vector operand ``val``
on each enabled lane, multiplying it by the scalar ``start_value``. Disabled
lanes are treated as containing the neutral value ``1`` (i.e. having no effect
on the reduction operation). If the vector length is zero, the result is the
start value.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.mul.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
      %reduction = call i32 @llvm.vector.reduce.mul.v4i32(<4 x i32> %masked.a)
      %also.r = mul i32 %reduction, %start

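The start value can also carry a partial result from one call into the next.
The sketch below multiplies the enabled lanes of two vectors into a single
product: the first call is seeded with the neutral value ``1`` and its result
becomes the start value of the second call. The function name
``@product_of_two`` and its parameter names are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.mul.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      ; Hypothetical helper: chains two reductions through the start value.
      define i32 @product_of_two(<4 x i32> %a, <4 x i32> %b, <4 x i1> %mask, i32 %evl) {
        %p0 = call i32 @llvm.vp.reduce.mul.v4i32(i32 1, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
        %p1 = call i32 @llvm.vp.reduce.mul.v4i32(i32 %p0, <4 x i32> %b, <4 x i1> %mask, i32 %evl)
        ret i32 %p1
      }
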
.. _int_vp_reduce_fmul:

'``llvm.vp.reduce.fmul.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare float @llvm.vp.reduce.fmul.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare double @llvm.vp.reduce.fmul.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated floating-point ``MUL`` reduction of a vector and a scalar starting
value, returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
floating-point type equal to the result type. The second operand is the vector
on which the reduction is performed and must be a vector of floating-point
values whose element type is the result/start type. The third operand is the
vector mask and is a vector of boolean values with the same number of elements
as the vector operand. The fourth operand is the explicit vector length of the
operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.fmul``' intrinsic performs the floating-point ``MUL``
reduction (:ref:`llvm.vector.reduce.fmul <int_vector_reduce_fmul>`) of the
vector operand ``val`` on each enabled lane, multiplying it by the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
``1.0`` (i.e. having no effect on the reduction operation). If no lanes are
enabled, the resulting value will be equal to the starting value.

To ignore the start value, the neutral value can be used.

See the unpredicated version (:ref:`llvm.vector.reduce.fmul
<int_vector_reduce_fmul>`) for more detail on the semantics.

Examples:
"""""""""

.. code-block:: llvm

      %r = call float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float 1.0, float 1.0, float 1.0, float 1.0>
      %also.r = call float @llvm.vector.reduce.fmul.v4f32(float %start, <4 x float> %masked.a)


.. _int_vp_reduce_and:

'``llvm.vp.reduce.and.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.and.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.and.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``AND`` reduction of a vector and a scalar starting value,
returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second operand is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third operand is the vector mask and
is a vector of boolean values with the same number of elements as the vector
operand. The fourth operand is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.and``' intrinsic performs the integer ``AND`` reduction
(:ref:`llvm.vector.reduce.and <int_vector_reduce_and>`) of the vector operand
``val`` on each enabled lane, performing an '``and``' of that with the
scalar ``start_value``. Disabled lanes are treated as containing the neutral
value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
operation). If the vector length is zero, the result is the start value.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.and.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
      %reduction = call i32 @llvm.vector.reduce.and.v4i32(<4 x i32> %masked.a)
      %also.r = and i32 %reduction, %start


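One possible use, sketched here with the neutral start value ``-1``, is testing
whether a given bit is set in every enabled lane. The function name
``@all_enabled_odd`` is hypothetical. When no lane is enabled the reduction
returns the start value, so the test is vacuously true.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.and.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      ; Hypothetical helper: returns true if bit 0 is set in every enabled lane of %a.
      define i1 @all_enabled_odd(<4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        %r = call i32 @llvm.vp.reduce.and.v4i32(i32 -1, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
        %low = and i32 %r, 1
        %odd = icmp ne i32 %low, 0
        ret i1 %odd
      }
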
.. _int_vp_reduce_or:

'``llvm.vp.reduce.or.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.or.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.or.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``OR`` reduction of a vector and a scalar starting value,
returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second operand is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third operand is the vector mask and
is a vector of boolean values with the same number of elements as the vector
operand. The fourth operand is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.or``' intrinsic performs the integer ``OR`` reduction
(:ref:`llvm.vector.reduce.or <int_vector_reduce_or>`) of the vector operand
``val`` on each enabled lane, performing an '``or``' of that with the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
``0`` (i.e. having no effect on the reduction operation). If the vector length
is zero, the result is the start value.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.or.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
      %reduction = call i32 @llvm.vector.reduce.or.v4i32(<4 x i32> %masked.a)
      %also.r = or i32 %reduction, %start

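As an illustrative sketch, an ``OR`` reduction seeded with the neutral value
``0`` can answer whether any enabled lane is nonzero. The function name
``@any_enabled_nonzero`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.or.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      ; Hypothetical helper: returns true if at least one enabled lane of %a is nonzero.
      define i1 @any_enabled_nonzero(<4 x i32> %a, <4 x i1> %mask, i32 %evl) {
        %r = call i32 @llvm.vp.reduce.or.v4i32(i32 0, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
        %any = icmp ne i32 %r, 0
        ret i1 %any
      }
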
.. _int_vp_reduce_xor:

'``llvm.vp.reduce.xor.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic.

::

      declare i32 @llvm.vp.reduce.xor.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
      declare i16 @llvm.vp.reduce.xor.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)

Overview:
"""""""""

Predicated integer ``XOR`` reduction of a vector and a scalar starting value,
returning the result as a scalar.


Arguments:
""""""""""

The first operand is the start value of the reduction, which must be a scalar
integer type equal to the result type. The second operand is the vector on
which the reduction is performed and must be a vector of integer values whose
element type is the result/start type. The third operand is the vector mask and
is a vector of boolean values with the same number of elements as the vector
operand. The fourth operand is the explicit vector length of the operation.

Semantics:
""""""""""

The '``llvm.vp.reduce.xor``' intrinsic performs the integer ``XOR`` reduction
(:ref:`llvm.vector.reduce.xor <int_vector_reduce_xor>`) of the vector operand
``val`` on each enabled lane, performing an '``xor``' of that with the scalar
``start_value``. Disabled lanes are treated as containing the neutral value
``0`` (i.e. having no effect on the reduction operation). If the vector length
is zero, the result is the start value.

To ignore the start value, the neutral value can be used.

Examples:
"""""""""

.. code-block:: llvm

      %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
      ; are treated as though %mask were false for those lanes.

      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
      %reduction = call i32 @llvm.vector.reduce.xor.v4i32(<4 x i32> %masked.a)
      %also.r = xor i32 %reduction, %start


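A sketch of folding a partially filled final chunk into a running value: the
mask is all-true, the explicit vector length limits the reduction to the first
``%remaining`` lanes, and the running value is passed as the start value. The
names ``@xor_fold_tail``, ``%acc``, ``%chunk`` and ``%remaining`` are
hypothetical, and ``%remaining`` is assumed not to exceed 4.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.xor.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      ; Hypothetical helper: xors lanes 0 .. %remaining-1 of %chunk into %acc.
      ; Assumes %remaining <= 4.
      define i32 @xor_fold_tail(i32 %acc, <4 x i32> %chunk, i32 %remaining) {
        %r = call i32 @llvm.vp.reduce.xor.v4i32(i32 %acc, <4 x i32> %chunk, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 %remaining)
        ret i32 %r
      }
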
19538.. _int_vp_reduce_smax:
19539
19540'``llvm.vp.reduce.smax.*``' Intrinsics
19541^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19542
19543Syntax:
19544"""""""
19545This is an overloaded intrinsic.
19546
19547::
19548
19549      declare i32 @llvm.vp.reduce.smax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19550      declare i16 @llvm.vp.reduce.smax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19551
19552Overview:
19553"""""""""
19554
19555Predicated signed-integer ``MAX`` reduction of a vector and a scalar starting
19556value, returning the result as a scalar.
19557
19558
19559Arguments:
19560""""""""""
19561
19562The first operand is the start value of the reduction, which must be a scalar
19563integer type equal to the result type. The second operand is the vector on
19564which the reduction is performed and must be a vector of integer values whose
19565element type is the result/start type. The third operand is the vector mask and
19566is a vector of boolean values with the same number of elements as the vector
19567operand. The fourth operand is the explicit vector length of the operation.
19568
19569Semantics:
19570""""""""""
19571
19572The '``llvm.vp.reduce.smax``' intrinsic performs the signed-integer ``MAX``
19573reduction (:ref:`llvm.vector.reduce.smax <int_vector_reduce_smax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and
19575the scalar ``start_value``. Disabled lanes are treated as containing the
19576neutral value ``INT_MIN`` (i.e. having no effect on the reduction operation).
19577If the vector length is zero, the result is the start value.
19578
19579To ignore the start value, the neutral value can be used.
19580
19581Examples:
19582"""""""""
19583
19584.. code-block:: llvm
19585
19586      %r = call i8 @llvm.vp.reduce.smax.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
19587      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19588      ; are treated as though %mask were false for those lanes.
19589
19590      %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 -128, i8 -128, i8 -128, i8 -128>
19591      %reduction = call i8 @llvm.vector.reduce.smax.v4i8(<4 x i8> %masked.a)
19592      %also.r = call i8 @llvm.smax.i8(i8 %reduction, i8 %start)
19593
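One possible use of the start operand is to chain reductions over several
vectors. The sketch below assumes two ``<4 x i32>`` chunks with their own masks
and vector lengths; the function and value names are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.smax.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @smax_two_chunks(<4 x i32> %lo, <4 x i32> %hi,
                                  <4 x i1> %mask.lo, <4 x i1> %mask.hi,
                                  i32 %evl.lo, i32 %evl.hi) {
        ; INT_MIN for i32 is the neutral start value for the first chunk.
        %partial = call i32 @llvm.vp.reduce.smax.v4i32(i32 -2147483648, <4 x i32> %lo, <4 x i1> %mask.lo, i32 %evl.lo)
        ; The partial result becomes the start value of the second chunk.
        %r = call i32 @llvm.vp.reduce.smax.v4i32(i32 %partial, <4 x i32> %hi, <4 x i1> %mask.hi, i32 %evl.hi)
        ret i32 %r
      }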
19594
19595.. _int_vp_reduce_smin:
19596
19597'``llvm.vp.reduce.smin.*``' Intrinsics
19598^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19599
19600Syntax:
19601"""""""
19602This is an overloaded intrinsic.
19603
19604::
19605
19606      declare i32 @llvm.vp.reduce.smin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19607      declare i16 @llvm.vp.reduce.smin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19608
19609Overview:
19610"""""""""
19611
19612Predicated signed-integer ``MIN`` reduction of a vector and a scalar starting
19613value, returning the result as a scalar.
19614
19615
19616Arguments:
19617""""""""""
19618
19619The first operand is the start value of the reduction, which must be a scalar
19620integer type equal to the result type. The second operand is the vector on
19621which the reduction is performed and must be a vector of integer values whose
19622element type is the result/start type. The third operand is the vector mask and
19623is a vector of boolean values with the same number of elements as the vector
19624operand. The fourth operand is the explicit vector length of the operation.
19625
19626Semantics:
19627""""""""""
19628
19629The '``llvm.vp.reduce.smin``' intrinsic performs the signed-integer ``MIN``
19630reduction (:ref:`llvm.vector.reduce.smin <int_vector_reduce_smin>`) of the
vector operand ``val`` on each enabled lane, taking the minimum of that and
19632the scalar ``start_value``. Disabled lanes are treated as containing the
19633neutral value ``INT_MAX`` (i.e. having no effect on the reduction operation).
19634If the vector length is zero, the result is the start value.
19635
19636To ignore the start value, the neutral value can be used.
19637
19638Examples:
19639"""""""""
19640
19641.. code-block:: llvm
19642
19643      %r = call i8 @llvm.vp.reduce.smin.v4i8(i8 %start, <4 x i8> %a, <4 x i1> %mask, i32 %evl)
19644      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19645      ; are treated as though %mask were false for those lanes.
19646
19647      %masked.a = select <4 x i1> %mask, <4 x i8> %a, <4 x i8> <i8 127, i8 127, i8 127, i8 127>
19648      %reduction = call i8 @llvm.vector.reduce.smin.v4i8(<4 x i8> %masked.a)
19649      %also.r = call i8 @llvm.smin.i8(i8 %reduction, i8 %start)
19650
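The following sketch shows a reduction that is predicated by the explicit
vector length alone, using the scalable ``nxv8i16`` overload from the syntax
block above. It assumes that an all-true mask is built by splatting ``true``;
``32767`` is ``INT_MAX`` for ``i16``. The function name is hypothetical.

.. code-block:: llvm

      declare i16 @llvm.vp.reduce.smin.nxv8i16(i16, <vscale x 8 x i16>, <vscale x 8 x i1>, i32)

      define i16 @smin_first_evl(<vscale x 8 x i16> %a, i32 %evl) {
        ; Splat 'true' into every lane so that only %evl disables lanes.
        %head = insertelement <vscale x 8 x i1> poison, i1 true, i32 0
        %alltrue = shufflevector <vscale x 8 x i1> %head, <vscale x 8 x i1> poison, <vscale x 8 x i32> zeroinitializer
        ; The neutral start value leaves the minimum of the first %evl lanes.
        %r = call i16 @llvm.vp.reduce.smin.nxv8i16(i16 32767, <vscale x 8 x i16> %a, <vscale x 8 x i1> %alltrue, i32 %evl)
        ret i16 %r
      }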
19651
19652.. _int_vp_reduce_umax:
19653
19654'``llvm.vp.reduce.umax.*``' Intrinsics
19655^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19656
19657Syntax:
19658"""""""
19659This is an overloaded intrinsic.
19660
19661::
19662
19663      declare i32 @llvm.vp.reduce.umax.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19664      declare i16 @llvm.vp.reduce.umax.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19665
19666Overview:
19667"""""""""
19668
19669Predicated unsigned-integer ``MAX`` reduction of a vector and a scalar starting
19670value, returning the result as a scalar.
19671
19672
19673Arguments:
19674""""""""""
19675
19676The first operand is the start value of the reduction, which must be a scalar
19677integer type equal to the result type. The second operand is the vector on
19678which the reduction is performed and must be a vector of integer values whose
19679element type is the result/start type. The third operand is the vector mask and
19680is a vector of boolean values with the same number of elements as the vector
19681operand. The fourth operand is the explicit vector length of the operation.
19682
19683Semantics:
19684""""""""""
19685
19686The '``llvm.vp.reduce.umax``' intrinsic performs the unsigned-integer ``MAX``
19687reduction (:ref:`llvm.vector.reduce.umax <int_vector_reduce_umax>`) of the
vector operand ``val`` on each enabled lane, taking the maximum of that and
19689the scalar ``start_value``. Disabled lanes are treated as containing the
19690neutral value ``0`` (i.e. having no effect on the reduction operation). If the
19691vector length is zero, the result is the start value.
19692
19693To ignore the start value, the neutral value can be used.
19694
19695Examples:
19696"""""""""
19697
19698.. code-block:: llvm
19699
19700      %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19701      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19702      ; are treated as though %mask were false for those lanes.
19703
19704      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
19705      %reduction = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> %masked.a)
19706      %also.r = call i32 @llvm.umax.i32(i32 %reduction, i32 %start)
19707
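As a small sketch of the zero-length case described in the semantics, the call
below uses an explicit vector length of ``0``, so no lane is enabled and the
result is the start value. The function name ``@umax_empty`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.umax.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @umax_empty(i32 %start, <4 x i32> %a, <4 x i1> %mask) {
        ; With a vector length of 0 the contents of %a and %mask do not matter.
        %r = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 0)
        ret i32 %r
      }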
19708
19709.. _int_vp_reduce_umin:
19710
19711'``llvm.vp.reduce.umin.*``' Intrinsics
19712^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19713
19714Syntax:
19715"""""""
19716This is an overloaded intrinsic.
19717
19718::
19719
19720      declare i32 @llvm.vp.reduce.umin.v4i32(i32 <start_value>, <4 x i32> <val>, <4 x i1> <mask>, i32 <vector_length>)
19721      declare i16 @llvm.vp.reduce.umin.nxv8i16(i16 <start_value>, <vscale x 8 x i16> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19722
19723Overview:
19724"""""""""
19725
19726Predicated unsigned-integer ``MIN`` reduction of a vector and a scalar starting
19727value, returning the result as a scalar.
19728
19729
19730Arguments:
19731""""""""""
19732
19733The first operand is the start value of the reduction, which must be a scalar
19734integer type equal to the result type. The second operand is the vector on
19735which the reduction is performed and must be a vector of integer values whose
19736element type is the result/start type. The third operand is the vector mask and
19737is a vector of boolean values with the same number of elements as the vector
19738operand. The fourth operand is the explicit vector length of the operation.
19739
19740Semantics:
19741""""""""""
19742
19743The '``llvm.vp.reduce.umin``' intrinsic performs the unsigned-integer ``MIN``
19744reduction (:ref:`llvm.vector.reduce.umin <int_vector_reduce_umin>`) of the
19745vector operand ``val`` on each enabled lane, taking the minimum of that and the
19746scalar ``start_value``. Disabled lanes are treated as containing the neutral
19747value ``UINT_MAX``, or ``-1`` (i.e. having no effect on the reduction
19748operation). If the vector length is zero, the result is the start value.
19749
19750To ignore the start value, the neutral value can be used.
19751
19752Examples:
19753"""""""""
19754
19755.. code-block:: llvm
19756
19757      %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 %start, <4 x i32> %a, <4 x i1> %mask, i32 %evl)
19758      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19759      ; are treated as though %mask were false for those lanes.
19760
19761      %masked.a = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
19762      %reduction = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> %masked.a)
19763      %also.r = call i32 @llvm.umin.i32(i32 %reduction, i32 %start)
19764
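The mask does not have to come from a loop predicate. The sketch below, with a
hypothetical function name, computes the unsigned minimum of the nonzero
elements among the first ``%evl`` lanes by deriving the mask from a comparison
and passing the neutral value ``-1`` as the start operand.

.. code-block:: llvm

      declare i32 @llvm.vp.reduce.umin.v4i32(i32, <4 x i32>, <4 x i1>, i32)

      define i32 @umin_nonzero(<4 x i32> %a, i32 %evl) {
        ; Enable only the lanes that hold a nonzero element.
        %nz = icmp ne <4 x i32> %a, zeroinitializer
        ; -1 (UINT_MAX) is neutral; it is also the result if no lane is enabled.
        %r = call i32 @llvm.vp.reduce.umin.v4i32(i32 -1, <4 x i32> %a, <4 x i1> %nz, i32 %evl)
        ret i32 %r
      }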
19765
19766.. _int_vp_reduce_fmax:
19767
19768'``llvm.vp.reduce.fmax.*``' Intrinsics
19769^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19770
19771Syntax:
19772"""""""
19773This is an overloaded intrinsic.
19774
19775::
19776
      declare float @llvm.vp.reduce.fmax.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
19778      declare double @llvm.vp.reduce.fmax.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19779
19780Overview:
19781"""""""""
19782
19783Predicated floating-point ``MAX`` reduction of a vector and a scalar starting
19784value, returning the result as a scalar.
19785
19786
19787Arguments:
19788""""""""""
19789
19790The first operand is the start value of the reduction, which must be a scalar
19791floating-point type equal to the result type. The second operand is the vector
19792on which the reduction is performed and must be a vector of floating-point
19793values whose element type is the result/start type. The third operand is the
19794vector mask and is a vector of boolean values with the same number of elements
19795as the vector operand. The fourth operand is the explicit vector length of the
19796operation.
19797
19798Semantics:
19799""""""""""
19800
19801The '``llvm.vp.reduce.fmax``' intrinsic performs the floating-point ``MAX``
19802reduction (:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>`) of the
19803vector operand ``val`` on each enabled lane, taking the maximum of that and the
19804scalar ``start_value``. Disabled lanes are treated as containing the neutral
19805value (i.e. having no effect on the reduction operation). If the vector length
19806is zero, the result is the start value.
19807
19808The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
19809flags are set, the neutral value is ``-QNAN``. If ``nnan``  and ``ninf`` are
19810both set, then the neutral value is the smallest floating-point value for the
19811result type. If only ``nnan`` is set then the neutral value is ``-Infinity``.
19812
19813This instruction has the same comparison semantics as the
19814:ref:`llvm.vector.reduce.fmax <int_vector_reduce_fmax>` intrinsic (and thus the
19815'``llvm.maxnum.*``' intrinsic). That is, the result will always be a number
19816unless all elements of the vector and the starting value are ``NaN``. For a
19817vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
19818``-0.0`` elements, the sign of the result is unspecified.
19819
19820To ignore the start value, the neutral value can be used.
19821
19822Examples:
19823"""""""""
19824
19825.. code-block:: llvm
19826
      %r = call float @llvm.vp.reduce.fmax.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
19828      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19829      ; are treated as though %mask were false for those lanes.
19830
19831      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
19832      %reduction = call float @llvm.vector.reduce.fmax.v4f32(<4 x float> %masked.a)
19833      %also.r = call float @llvm.maxnum.f32(float %reduction, float %start)
19834
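Because the neutral value depends on the fast-math flags, the start operand
used to ignore the start value changes with them. The sketch below assumes that
only ``nnan`` is set on the call, in which case the neutral value is
``-Infinity`` (written as the hexadecimal constant ``0xFFF0000000000000``). The
function name is hypothetical.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fmax.v4f32(float, <4 x float>, <4 x i1>, i32)

      define float @fmax_nnan(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        ; With only nnan set, -Infinity is the neutral start value.
        %r = call nnan float @llvm.vp.reduce.fmax.v4f32(float 0xFFF0000000000000, <4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret float %r
      }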
19835
19836.. _int_vp_reduce_fmin:
19837
19838'``llvm.vp.reduce.fmin.*``' Intrinsics
19839^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19840
19841Syntax:
19842"""""""
19843This is an overloaded intrinsic.
19844
19845::
19846
      declare float @llvm.vp.reduce.fmin.v4f32(float <start_value>, <4 x float> <val>, <4 x i1> <mask>, i32 <vector_length>)
19848      declare double @llvm.vp.reduce.fmin.nxv8f64(double <start_value>, <vscale x 8 x double> <val>, <vscale x 8 x i1> <mask>, i32 <vector_length>)
19849
19850Overview:
19851"""""""""
19852
19853Predicated floating-point ``MIN`` reduction of a vector and a scalar starting
19854value, returning the result as a scalar.
19855
19856
19857Arguments:
19858""""""""""
19859
19860The first operand is the start value of the reduction, which must be a scalar
19861floating-point type equal to the result type. The second operand is the vector
19862on which the reduction is performed and must be a vector of floating-point
19863values whose element type is the result/start type. The third operand is the
19864vector mask and is a vector of boolean values with the same number of elements
19865as the vector operand. The fourth operand is the explicit vector length of the
19866operation.
19867
19868Semantics:
19869""""""""""
19870
19871The '``llvm.vp.reduce.fmin``' intrinsic performs the floating-point ``MIN``
19872reduction (:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>`) of the
19873vector operand ``val`` on each enabled lane, taking the minimum of that and the
19874scalar ``start_value``. Disabled lanes are treated as containing the neutral
19875value (i.e. having no effect on the reduction operation). If the vector length
19876is zero, the result is the start value.
19877
19878The neutral value is dependent on the :ref:`fast-math flags <fastmath>`. If no
19879flags are set, the neutral value is ``+QNAN``. If ``nnan``  and ``ninf`` are
19880both set, then the neutral value is the largest floating-point value for the
19881result type. If only ``nnan`` is set then the neutral value is ``+Infinity``.
19882
19883This instruction has the same comparison semantics as the
19884:ref:`llvm.vector.reduce.fmin <int_vector_reduce_fmin>` intrinsic (and thus the
19885'``llvm.minnum.*``' intrinsic). That is, the result will always be a number
19886unless all elements of the vector and the starting value are ``NaN``. For a
19887vector with maximum element magnitude ``0.0`` and containing both ``+0.0`` and
19888``-0.0`` elements, the sign of the result is unspecified.
19889
19890To ignore the start value, the neutral value can be used.
19891
19892Examples:
19893"""""""""
19894
19895.. code-block:: llvm
19896
19897      %r = call float @llvm.vp.reduce.fmin.v4f32(float %start, <4 x float> %a, <4 x i1> %mask, i32 %evl)
19898      ; %r is equivalent to %also.r, where lanes greater than or equal to %evl
19899      ; are treated as though %mask were false for those lanes.
19900
19901      %masked.a = select <4 x i1> %mask, <4 x float> %a, <4 x float> <float QNAN, float QNAN, float QNAN, float QNAN>
19902      %reduction = call float @llvm.vector.reduce.fmin.v4f32(<4 x float> %masked.a)
19903      %also.r = call float @llvm.minnum.f32(float %reduction, float %start)
19904
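The following sketch ignores the start value when no fast-math flags are set.
In that case the neutral value is ``+QNAN``, written here as the hexadecimal
constant ``0x7FF8000000000000``. The function name is hypothetical.

.. code-block:: llvm

      declare float @llvm.vp.reduce.fmin.v4f32(float, <4 x float>, <4 x i1>, i32)

      define float @fmin_enabled_lanes(<4 x float> %a, <4 x i1> %mask, i32 %evl) {
        ; No fast-math flags: a positive quiet NaN is the neutral start value.
        %r = call float @llvm.vp.reduce.fmin.v4f32(float 0x7FF8000000000000, <4 x float> %a, <4 x i1> %mask, i32 %evl)
        ret float %r
      }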
19905
19906.. _int_get_active_lane_mask:
19907
19908'``llvm.get.active.lane.mask.*``' Intrinsics
19909^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19910
19911Syntax:
19912"""""""
19913This is an overloaded intrinsic.
19914
19915::
19916
19917      declare <4 x i1> @llvm.get.active.lane.mask.v4i1.i32(i32 %base, i32 %n)
19918      declare <8 x i1> @llvm.get.active.lane.mask.v8i1.i64(i64 %base, i64 %n)
19919      declare <16 x i1> @llvm.get.active.lane.mask.v16i1.i64(i64 %base, i64 %n)
19920      declare <vscale x 16 x i1> @llvm.get.active.lane.mask.nxv16i1.i64(i64 %base, i64 %n)
19921
19922
19923Overview:
19924"""""""""
19925
19926Create a mask representing active and inactive vector lanes.
19927
19928
19929Arguments:
19930""""""""""
19931
19932Both operands have the same scalar integer type. The result is a vector with
19933the i1 element type.
19934
19935Semantics:
19936""""""""""
19937
19938The '``llvm.get.active.lane.mask.*``' intrinsics are semantically equivalent
19939to:
19940
19941::
19942
19943      %m[i] = icmp ult (%base + i), %n
19944
where ``%m`` is a vector (mask) of active/inactive lanes with its elements
indexed by ``i``, and ``%base``, ``%n`` are the two arguments to
``llvm.get.active.lane.mask.*``, ``icmp`` is an integer compare and ``ult``
the unsigned less-than comparison operator. Overflow cannot occur in
``(%base + i)`` and its comparison against ``%n`` as it is performed in integer
numbers and not in machine numbers. If ``%n`` is ``0``, then the result is a
poison value. The above is equivalent to:
19952
19953::
19954
19955      %m = @llvm.get.active.lane.mask(%base, %n)
19956
19957This can, for example, be emitted by the loop vectorizer in which case
19958``%base`` is the first element of the vector induction variable (VIV) and
19959``%n`` is the loop tripcount. Thus, these intrinsics perform an element-wise
19960less than comparison of VIV with the loop tripcount, producing a mask of
19961true/false values representing active/inactive vector lanes, except if the VIV
19962overflows in which case they return false in the lanes where the VIV overflows.
The arguments are scalar types to accommodate scalable vector types, for which
the type of a step vector that enumerates the lanes without overflow is not
known.
19966
19967This mask ``%m`` can e.g. be used in masked load/store instructions. These
19968intrinsics provide a hint to the backend. I.e., for a vector loop, the
19969back-edge taken count of the original scalar loop is explicit as the second
19970argument.
19971
19972
19973Examples:
19974"""""""""
19975
19976.. code-block:: llvm
19977
19978      %active.lane.mask = call <4 x i1> @llvm.get.active.lane.mask.v4i1.i64(i64 %elem0, i64 429)
19979      %wide.masked.load = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(<4 x i32>* %3, i32 4, <4 x i1> %active.lane.mask, <4 x i32> undef)
19980
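For a fixed-width vector, the per-lane comparison from the semantics can be
sketched with ordinary vector instructions. This expansion is only an
illustration: it uses machine arithmetic, so it matches the intrinsic only
under the assumptions that ``%base + 3`` does not wrap in ``i64`` and that
``%n`` is not ``0``. The function name is hypothetical.

.. code-block:: llvm

      define <4 x i1> @lane_mask_expanded(i64 %base, i64 %n) {
        ; Broadcast %base and %n to all four lanes.
        %base.ins = insertelement <4 x i64> poison, i64 %base, i32 0
        %base.splat = shufflevector <4 x i64> %base.ins, <4 x i64> poison, <4 x i32> zeroinitializer
        %n.ins = insertelement <4 x i64> poison, i64 %n, i32 0
        %n.splat = shufflevector <4 x i64> %n.ins, <4 x i64> poison, <4 x i32> zeroinitializer
        ; %idx[i] = %base + i
        %idx = add <4 x i64> %base.splat, <i64 0, i64 1, i64 2, i64 3>
        ; %m[i] = icmp ult (%base + i), %n
        %m = icmp ult <4 x i64> %idx, %n.splat
        ret <4 x i1> %m
      }

With ``%base = 428`` and ``%n = 429``, only lane 0 satisfies the comparison, so
only the first lane of the mask is active.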
19981
19982.. _int_experimental_vp_splice:
19983
19984'``llvm.experimental.vp.splice``' Intrinsic
19985^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
19986
19987Syntax:
19988"""""""
19989This is an overloaded intrinsic.
19990
19991::
19992
19993      declare <2 x double> @llvm.experimental.vp.splice.v2f64(<2 x double> %vec1, <2 x double> %vec2, i32 %imm, <2 x i1> %mask, i32 %evl1, i32 %evl2)
19994      declare <vscale x 4 x i32> @llvm.experimental.vp.splice.nxv4i32(<vscale x 4 x i32> %vec1, <vscale x 4 x i32> %vec2, i32 %imm, <vscale x 4 x i1> %mask, i32 %evl1, i32 %evl2)
19995
19996Overview:
19997"""""""""
19998
19999The '``llvm.experimental.vp.splice.*``' intrinsic is the vector length
20000predicated version of the '``llvm.experimental.vector.splice.*``' intrinsic.
20001
20002Arguments:
20003""""""""""
20004
20005The result and the first two arguments ``vec1`` and ``vec2`` are vectors with
20006the same type.  The third argument ``imm`` is an immediate signed integer that
20007indicates the offset index.  The fourth argument ``mask`` is a vector mask and
20008has the same number of elements as the result.  The last two arguments ``evl1``
20009and ``evl2`` are unsigned integers indicating the explicit vector lengths of
20010``vec1`` and ``vec2`` respectively.  ``imm``, ``evl1`` and ``evl2`` should
20011respect the following constraints: ``-evl1 <= imm < evl1``, ``0 <= evl1 <= VL``
20012and ``0 <= evl2 <= VL``, where ``VL`` is the runtime vector factor. If these
20013constraints are not satisfied the intrinsic has undefined behaviour.
20014
20015Semantics:
20016""""""""""
20017
20018Effectively, this intrinsic concatenates ``vec1[0..evl1-1]`` and
20019``vec2[0..evl2-1]`` and creates the result vector by selecting the elements in a
20020window of size ``evl2``, starting at index ``imm`` (for a positive immediate) of
20021the concatenated vector. Elements in the result vector beyond ``evl2`` are
20022``undef``.  If ``imm`` is negative the starting index is ``evl1 + imm``.  The result
20023vector of active vector length ``evl2`` contains ``evl1 - imm`` (``-imm`` for
20024negative ``imm``) elements from indices ``[imm..evl1 - 1]``
20025(``[evl1 + imm..evl1 -1]`` for negative ``imm``) of ``vec1`` followed by the
20026first ``evl2 - (evl1 - imm)`` (``evl2 + imm`` for negative ``imm``) elements of
20027``vec2``. If ``evl1 - imm`` (``-imm``) >= ``evl2``, only the first ``evl2``
20028elements are considered and the remaining are ``undef``.  The lanes in the result
20029vector disabled by ``mask`` are ``undef``.
20030
20031Examples:
20032"""""""""
20033
20034.. code-block:: text
20035
20036 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, 1, 2, 3)  ==> <B, E, F, undef> ; index
20037 llvm.experimental.vp.splice(<A,B,C,D>, <E,F,G,H>, -2, 3, 2) ==> <B, C, undef, undef> ; trailing elements
20038
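The examples above omit the mask operand. As a sketch, the first of them could
be written as a complete call with an all-true mask as follows; the ``v4i32``
overload and the function name are chosen only for illustration.

.. code-block:: llvm

      declare <4 x i32> @llvm.experimental.vp.splice.v4i32(<4 x i32>, <4 x i32>, i32, <4 x i1>, i32, i32)

      define <4 x i32> @splice_example(<4 x i32> %vec1, <4 x i32> %vec2) {
        ; imm = 1, evl1 = 2, evl2 = 3, every lane enabled by the mask.
        %r = call <4 x i32> @llvm.experimental.vp.splice.v4i32(<4 x i32> %vec1, <4 x i32> %vec2, i32 1, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 2, i32 3)
        ret <4 x i32> %r
      }

With ``%vec1 = <A, B, C, D>`` and ``%vec2 = <E, F, G, H>`` this corresponds to
``<B, E, F, undef>``.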
20039
20040.. _int_vp_load:
20041
20042'``llvm.vp.load``' Intrinsic
20043^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20044
20045Syntax:
20046"""""""
20047This is an overloaded intrinsic.
20048
20049::
20050
20051    declare <4 x float> @llvm.vp.load.v4f32.p0(ptr %ptr, <4 x i1> %mask, i32 %evl)
20052    declare <vscale x 2 x i16> @llvm.vp.load.nxv2i16.p0(ptr %ptr, <vscale x 2 x i1> %mask, i32 %evl)
20053    declare <8 x float> @llvm.vp.load.v8f32.p1(ptr addrspace(1) %ptr, <8 x i1> %mask, i32 %evl)
20054    declare <vscale x 1 x i64> @llvm.vp.load.nxv1i64.p6(ptr addrspace(6) %ptr, <vscale x 1 x i1> %mask, i32 %evl)
20055
20056Overview:
20057"""""""""
20058
20059The '``llvm.vp.load.*``' intrinsic is the vector length predicated version of
20060the :ref:`llvm.masked.load <int_mload>` intrinsic.
20061
20062Arguments:
20063""""""""""
20064
20065The first operand is the base pointer for the load. The second operand is a
20066vector of boolean values with the same number of elements as the return type.
20067The third is the explicit vector length of the operation. The return type and
20068underlying type of the base pointer are the same vector types.
20069
20070The :ref:`align <attr_align>` parameter attribute can be provided for the first
20071operand.
20072
20073Semantics:
20074""""""""""
20075
20076The '``llvm.vp.load``' intrinsic reads a vector from memory in the same way as
20077the '``llvm.masked.load``' intrinsic, where the mask is taken from the
20078combination of the '``mask``' and '``evl``' operands in the usual VP way.
20079Certain '``llvm.masked.load``' operands do not have corresponding operands in
20080'``llvm.vp.load``': the '``passthru``' operand is implicitly ``undef``; the
20081'``alignment``' operand is taken as the ``align`` parameter attribute, if
20082provided. The default alignment is taken as the ABI alignment of the return
20083type as specified by the :ref:`datalayout string<langref_datalayout>`.
20084
20085Examples:
20086"""""""""
20087
20088.. code-block:: text
20089
20090     %r = call <8 x i8> @llvm.vp.load.v8i8.p0(ptr align 2 %ptr, <8 x i1> %mask, i32 %evl)
20091     ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20092
20093     %also.r = call <8 x i8> @llvm.masked.load.v8i8.p0(ptr %ptr, i32 2, <8 x i1> %mask, <8 x i8> undef)
20094
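Since the '``passthru``' operand is implicitly ``undef``, a pass-through value
has to be reintroduced separately if it is needed. One possible sketch,
assuming the '``llvm.vp.merge``' intrinsic is available and ``%evl`` does not
exceed 8, is shown below. The function name is hypothetical.

.. code-block:: llvm

      declare <8 x i8> @llvm.vp.load.v8i8.p0(ptr, <8 x i1>, i32)
      declare <8 x i8> @llvm.vp.merge.v8i8(<8 x i1>, <8 x i8>, <8 x i8>, i32)

      define <8 x i8> @load_with_passthru(ptr %ptr, <8 x i1> %mask, <8 x i8> %passthru, i32 %evl) {
        %loaded = call <8 x i8> @llvm.vp.load.v8i8.p0(ptr align 2 %ptr, <8 x i1> %mask, i32 %evl)
        ; Lanes that are masked off or at positions >= %evl come from %passthru.
        %r = call <8 x i8> @llvm.vp.merge.v8i8(<8 x i1> %mask, <8 x i8> %loaded, <8 x i8> %passthru, i32 %evl)
        ret <8 x i8> %r
      }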
20095
20096.. _int_vp_store:
20097
20098'``llvm.vp.store``' Intrinsic
20099^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20100
20101Syntax:
20102"""""""
20103This is an overloaded intrinsic.
20104
20105::
20106
20107    declare void @llvm.vp.store.v4f32.p0(<4 x float> %val, ptr %ptr, <4 x i1> %mask, i32 %evl)
20108    declare void @llvm.vp.store.nxv2i16.p0(<vscale x 2 x i16> %val, ptr %ptr, <vscale x 2 x i1> %mask, i32 %evl)
20109    declare void @llvm.vp.store.v8f32.p1(<8 x float> %val, ptr addrspace(1) %ptr, <8 x i1> %mask, i32 %evl)
20110    declare void @llvm.vp.store.nxv1i64.p6(<vscale x 1 x i64> %val, ptr addrspace(6) %ptr, <vscale x 1 x i1> %mask, i32 %evl)
20111
20112Overview:
20113"""""""""
20114
20115The '``llvm.vp.store.*``' intrinsic is the vector length predicated version of
20116the :ref:`llvm.masked.store <int_mstore>` intrinsic.
20117
20118Arguments:
20119""""""""""
20120
20121The first operand is the vector value to be written to memory. The second
20122operand is the base pointer for the store. It has the same underlying type as
20123the value operand. The third operand is a vector of boolean values with the
same number of elements as the value operand. The fourth is the explicit vector
20125length of the operation.
20126
20127The :ref:`align <attr_align>` parameter attribute can be provided for the
20128second operand.
20129
20130Semantics:
20131""""""""""
20132
The '``llvm.vp.store``' intrinsic writes a vector to memory in the same way as
20134the '``llvm.masked.store``' intrinsic, where the mask is taken from the
20135combination of the '``mask``' and '``evl``' operands in the usual VP way. The
20136alignment of the operation (corresponding to the '``alignment``' operand of
20137'``llvm.masked.store``') is specified by the ``align`` parameter attribute (see
20138above). If it is not provided then the ABI alignment of the type of the
20139'``value``' operand as specified by the :ref:`datalayout
20140string<langref_datalayout>` is used instead.
20141
20142Examples:
20143"""""""""
20144
20145.. code-block:: text
20146
20147     call void @llvm.vp.store.v8i8.p0(<8 x i8> %val, ptr align 4 %ptr, <8 x i1> %mask, i32 %evl)
20148     ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.
20149
20150     call void @llvm.masked.store.v8i8.p0(<8 x i8> %val, ptr %ptr, i32 4, <8 x i1> %mask)
20151
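As a further sketch, '``llvm.vp.load``' and '``llvm.vp.store``' can be paired
to copy only the first ``%evl`` elements of a vector. The example assumes that
``%evl`` does not exceed 4 and that both pointers are 4-byte aligned; the
function name is hypothetical.

.. code-block:: llvm

      declare <4 x float> @llvm.vp.load.v4f32.p0(ptr, <4 x i1>, i32)
      declare void @llvm.vp.store.v4f32.p0(<4 x float>, ptr, <4 x i1>, i32)

      define void @copy_first_lanes(ptr %dst, ptr %src, i32 %evl) {
        ; All-true mask: only the explicit vector length disables lanes.
        %v = call <4 x float> @llvm.vp.load.v4f32.p0(ptr align 4 %src, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 %evl)
        ; Lanes at positions >= %evl are undef in %v and are not written.
        call void @llvm.vp.store.v4f32.p0(<4 x float> %v, ptr align 4 %dst, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 %evl)
        ret void
      }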
20152
20153.. _int_experimental_vp_strided_load:
20154
20155'``llvm.experimental.vp.strided.load``' Intrinsic
20156^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20157
20158Syntax:
20159"""""""
20160This is an overloaded intrinsic.
20161
20162::
20163
20164    declare <4 x float> @llvm.experimental.vp.strided.load.v4f32.i64(ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl)
20165    declare <vscale x 2 x i16> @llvm.experimental.vp.strided.load.nxv2i16.i64(ptr %ptr, i64 %stride, <vscale x 2 x i1> %mask, i32 %evl)
20166
20167Overview:
20168"""""""""
20169
20170The '``llvm.experimental.vp.strided.load``' intrinsic loads, into a vector, scalar values from
20171memory locations evenly spaced apart by '``stride``' number of bytes, starting from '``ptr``'.
20172
20173Arguments:
20174""""""""""
20175
20176The first operand is the base pointer for the load. The second operand is the stride
20177value expressed in bytes. The third operand is a vector of boolean values
20178with the same number of elements as the return type. The fourth is the explicit
vector length of the operation. The underlying type of the base pointer matches the type of the scalar
elements of the return type.
20181
20182The :ref:`align <attr_align>` parameter attribute can be provided for the first
20183operand.
20184
20185Semantics:
20186""""""""""
20187
20188The '``llvm.experimental.vp.strided.load``' intrinsic loads, into a vector, multiple scalar
20189values from memory in the same way as the :ref:`llvm.vp.gather <int_vp_gather>` intrinsic,
20190where the vector of pointers is in the form:
20191
20192   ``%ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ... >``,
20193
with '``ptr``' previously cast to a pointer to '``i8``', '``stride``' always interpreted as a signed
integer and all arithmetic occurring in the pointer type.
20196
20197Examples:
20198"""""""""
20199
20200.. code-block:: text
20201
     %r = call <8 x i64> @llvm.experimental.vp.strided.load.v8i64.i64(i64* %ptr, i64 %stride, <8 x i1> %mask, i32 %evl)
     ;; The operation can also be expressed like this:

     %addr = bitcast i64* %ptr to i8*
     ;; Create a vector of pointers %addrs in the form:
     ;; %addrs = <%addr, %addr + %stride, %addr + 2 * %stride, ...>
     %ptrs = bitcast <8 x i8* > %addrs to <8 x i64* >
     %also.r = call <8 x i64> @llvm.vp.gather.v8i64.v8p0i64(<8 x i64* > %ptrs, <8 x i1> %mask, i32 %evl)
20210
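The address computation that the example above leaves as a comment could be
written out as follows when opaque pointers are used. This is a sketch for a
``<4 x i32>`` result and an ``i64`` stride, not a statement of how the
intrinsic is lowered; the function name is hypothetical.

.. code-block:: llvm

      declare <4 x i32> @llvm.vp.gather.v4i32.v4p0(<4 x ptr>, <4 x i1>, i32)

      define <4 x i32> @strided_load_as_gather(ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl) {
        ; Byte offsets <0, %stride, 2 * %stride, 3 * %stride>.
        %stride.ins = insertelement <4 x i64> poison, i64 %stride, i32 0
        %stride.splat = shufflevector <4 x i64> %stride.ins, <4 x i64> poison, <4 x i32> zeroinitializer
        %offsets = mul <4 x i64> %stride.splat, <i64 0, i64 1, i64 2, i64 3>
        ; An i8 GEP with a scalar base and vector indices yields a vector of
        ; pointers computed with byte-wise, signed offsets.
        %ptrs = getelementptr i8, ptr %ptr, <4 x i64> %offsets
        %r = call <4 x i32> @llvm.vp.gather.v4i32.v4p0(<4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
        ret <4 x i32> %r
      }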
20211
20212.. _int_experimental_vp_strided_store:
20213
20214'``llvm.experimental.vp.strided.store``' Intrinsic
20215^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20216
20217Syntax:
20218"""""""
20219This is an overloaded intrinsic.
20220
20221::
20222
20223    declare void @llvm.experimental.vp.strided.store.v4f32.i64(<4 x float> %val, ptr %ptr, i64 %stride, <4 x i1> %mask, i32 %evl)
20224    declare void @llvm.experimental.vp.strided.store.nxv2i16.i64(<vscale x 2 x i16> %val, ptr %ptr, i64 %stride, <vscale x 2 x i1> %mask, i32 %evl)
20225
20226Overview:
20227"""""""""
20228
The '``llvm.experimental.vp.strided.store``' intrinsic stores the elements of
20230'``val``' into memory locations evenly spaced apart by '``stride``' number of
20231bytes, starting from '``ptr``'.
20232
20233Arguments:
20234""""""""""
20235
20236The first operand is the vector value to be written to memory. The second
20237operand is the base pointer for the store. Its underlying type matches the
20238scalar element type of the value operand. The third operand is the stride value
20239expressed in bytes. The fourth operand is a vector of boolean values with the
same number of elements as the value operand. The fifth is the explicit vector
20241length of the operation.
20242
20243The :ref:`align <attr_align>` parameter attribute can be provided for the
20244second operand.
20245
20246Semantics:
20247""""""""""
20248
20249The '``llvm.experimental.vp.strided.store``' intrinsic stores the elements of
20250'``val``' in the same way as the :ref:`llvm.vp.scatter <int_vp_scatter>` intrinsic,
20251where the vector of pointers is in the form:
20252
   ``%ptrs = <%ptr, %ptr + %stride, %ptr + 2 * %stride, ... >``,

with '``ptr``' previously cast to a pointer to '``i8``', '``stride``' always interpreted as a signed
integer and all arithmetic occurring in the pointer type.
20257
20258Examples:
20259"""""""""
20260
20261.. code-block:: text
20262
20263	 call void @llvm.experimental.vp.strided.store.v8i64.i64(<8 x i64> %val, i64* %ptr, i64 %stride, <8 x i1> %mask, i32 %evl)
20264	 ;; The operation can also be expressed like this:
20265
20266	 %addr = bitcast i64* %ptr to i8*
20267	 ;; Create a vector of pointers %addrs in the form:
20268	 ;; %addrs = <%addr, %addr + %stride, %addr + 2 * %stride, ...>
20269	 %ptrs = bitcast <8 x i8* > %addrs to <8 x i64* >
20270	 call void @llvm.vp.scatter.v8i64.v8p0i64(<8 x i64> %val, <8 x i64*> %ptrs, <8 x i1> %mask, i32 %evl)
20271
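A typical use of a byte stride is storing one column of a row-major matrix. The
sketch below reuses the ``v4f32`` declaration from the syntax block above and
assumes that ``%elt0`` points to the first element of the column, that
``%row.bytes`` is the size of one matrix row in bytes, and that ``%evl`` does
not exceed 4. The function and value names are hypothetical.

.. code-block:: llvm

      declare void @llvm.experimental.vp.strided.store.v4f32.i64(<4 x float>, ptr, i64, <4 x i1>, i32)

      define void @store_column(<4 x float> %col, ptr %elt0, i64 %row.bytes, i32 %evl) {
        ; Lane i is written to the address %elt0 + i * %row.bytes (in bytes).
        call void @llvm.experimental.vp.strided.store.v4f32.i64(<4 x float> %col, ptr align 4 %elt0, i64 %row.bytes, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, i32 %evl)
        ret void
      }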
20272
20273.. _int_vp_gather:
20274
20275'``llvm.vp.gather``' Intrinsic
20276^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20277
20278Syntax:
20279"""""""
20280This is an overloaded intrinsic.
20281
20282::
20283
20284    declare <4 x double> @llvm.vp.gather.v4f64.v4p0(<4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
20285    declare <vscale x 2 x i8> @llvm.vp.gather.nxv2i8.nxv2p0(<vscale x 2 x ptr> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
20286    declare <2 x float> @llvm.vp.gather.v2f32.v2p2(<2 x ptr addrspace(2)> %ptrs, <2 x i1> %mask, i32 %evl)
20287    declare <vscale x 4 x i32> @llvm.vp.gather.nxv4i32.nxv4p4(<vscale x 4 x ptr addrspace(4)> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)
20288
20289Overview:
20290"""""""""
20291
20292The '``llvm.vp.gather.*``' intrinsic is the vector length predicated version of
20293the :ref:`llvm.masked.gather <int_mgather>` intrinsic.
20294
20295Arguments:
20296""""""""""
20297
20298The first operand is a vector of pointers which holds all memory addresses to
20299read. The second operand is a vector of boolean values with the same number of
20300elements as the return type. The third is the explicit vector length of the
20301operation. The return type and underlying type of the vector of pointers are
20302the same vector types.
20303
20304The :ref:`align <attr_align>` parameter attribute can be provided for the first
20305operand.
20306
20307Semantics:
20308""""""""""
20309
20310The '``llvm.vp.gather``' intrinsic reads multiple scalar values from memory in
20311the same way as the '``llvm.masked.gather``' intrinsic, where the mask is taken
20312from the combination of the '``mask``' and '``evl``' operands in the usual VP
20313way. Certain '``llvm.masked.gather``' operands do not have corresponding
20314operands in '``llvm.vp.gather``': the '``passthru``' operand is implicitly
20315``undef``; the '``alignment``' operand is taken as the ``align`` parameter, if
20316provided. The default alignment is taken as the ABI alignment of the source
20317addresses as specified by the :ref:`datalayout string<langref_datalayout>`.
20318
20319Examples:
20320"""""""""
20321
20322.. code-block:: text
20323
     %r = call <8 x i8> @llvm.vp.gather.v8i8.v8p0(<8 x ptr> align 8 %ptrs, <8 x i1> %mask, i32 %evl)
20325     ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20326
20327     %also.r = call <8 x i8> @llvm.masked.gather.v8i8.v8p0(<8 x ptr> %ptrs, i32 8, <8 x i1> %mask, <8 x i8> undef)
20328
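The combination of the '``mask``' and '``evl``' operands can also be written
out explicitly. The following is a minimal sketch for a fixed-width vector. The
names ``%evl.ins``, ``%evl.splat``, ``%evl.mask`` and ``%eff.mask`` are
illustrative only, and the expansion chosen by a particular pass or target may
differ:

.. code-block:: llvm

      ;; Build a lane mask that is true for every lane index below %evl.
      %evl.ins   = insertelement <8 x i32> poison, i32 %evl, i32 0
      %evl.splat = shufflevector <8 x i32> %evl.ins, <8 x i32> poison, <8 x i32> zeroinitializer
      %evl.mask  = icmp ult <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>, %evl.splat
      ;; A lane is enabled only if it is below %evl and its %mask bit is set.
      %eff.mask  = and <8 x i1> %mask, %evl.mask
      %also.r    = call <8 x i8> @llvm.masked.gather.v8i8.v8p0(<8 x ptr> %ptrs, i32 8, <8 x i1> %eff.mask, <8 x i8> undef)
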
20329
20330.. _int_vp_scatter:
20331
20332'``llvm.vp.scatter``' Intrinsic
20333^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20334
20335Syntax:
20336"""""""
20337This is an overloaded intrinsic.
20338
20339::
20340
20341    declare void @llvm.vp.scatter.v4f64.v4p0(<4 x double> %val, <4 x ptr> %ptrs, <4 x i1> %mask, i32 %evl)
20342    declare void @llvm.vp.scatter.nxv2i8.nxv2p0(<vscale x 2 x i8> %val, <vscale x 2 x ptr> %ptrs, <vscale x 2 x i1> %mask, i32 %evl)
20343    declare void @llvm.vp.scatter.v2f32.v2p2(<2 x float> %val, <2 x ptr addrspace(2)> %ptrs, <2 x i1> %mask, i32 %evl)
20344    declare void @llvm.vp.scatter.nxv4i32.nxv4p4(<vscale x 4 x i32> %val, <vscale x 4 x ptr addrspace(4)> %ptrs, <vscale x 4 x i1> %mask, i32 %evl)
20345
20346Overview:
20347"""""""""
20348
20349The '``llvm.vp.scatter.*``' intrinsic is the vector length predicated version of
20350the :ref:`llvm.masked.scatter <int_mscatter>` intrinsic.
20351
20352Arguments:
20353""""""""""
20354
The first operand is a vector value to be written to memory. The second operand
is a vector of pointers, pointing to where the value elements should be stored.
The third operand is a vector of boolean values with the same number of
elements as the first operand. The fourth is the explicit vector length of the
operation.
20360
20361The :ref:`align <attr_align>` parameter attribute can be provided for the
20362second operand.
20363
20364Semantics:
20365""""""""""
20366
20367The '``llvm.vp.scatter``' intrinsic writes multiple scalar values to memory in
20368the same way as the '``llvm.masked.scatter``' intrinsic, where the mask is
20369taken from the combination of the '``mask``' and '``evl``' operands in the
20370usual VP way. The '``alignment``' operand of the '``llvm.masked.scatter``' does
20371not have a corresponding operand in '``llvm.vp.scatter``': it is instead
20372provided via the optional ``align`` parameter attribute on the
20373vector-of-pointers operand. Otherwise it is taken as the ABI alignment of the
20374destination addresses as specified by the :ref:`datalayout
20375string<langref_datalayout>`.
20376
20377Examples:
20378"""""""""
20379
20380.. code-block:: text
20381
20382     call void @llvm.vp.scatter.v8i8.v8p0(<8 x i8> %val, <8 x ptr> align 1 %ptrs, <8 x i1> %mask, i32 %evl)
20383     ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.
20384
20385     call void @llvm.masked.scatter.v8i8.v8p0(<8 x i8> %val, <8 x ptr> %ptrs, i32 1, <8 x i1> %mask)
20386
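As a further sketch, the same correspondence holds for pointers in a
non-default address space. The address space ``2`` and the alignment of ``4``
are arbitrary values chosen for illustration:

.. code-block:: llvm

      ;; Scatter two floats through address space 2 pointers, with the alignment given by the align attribute.
      call void @llvm.vp.scatter.v2f32.v2p2(<2 x float> %val, <2 x ptr addrspace(2)> align 4 %ptrs, <2 x i1> %mask, i32 %evl)
      ;; For all lanes below %evl, the call above is lane-wise equivalent to the call below.

      call void @llvm.masked.scatter.v2f32.v2p2(<2 x float> %val, <2 x ptr addrspace(2)> %ptrs, i32 4, <2 x i1> %mask)
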
20387
20388.. _int_vp_trunc:
20389
20390'``llvm.vp.trunc.*``' Intrinsics
20391^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20392
20393Syntax:
20394"""""""
20395This is an overloaded intrinsic.
20396
20397::
20398
20399      declare <16 x i16>  @llvm.vp.trunc.v16i16.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
20400      declare <vscale x 4 x i16>  @llvm.vp.trunc.nxv4i16.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20401
20402Overview:
20403"""""""""
20404
20405The '``llvm.vp.trunc``' intrinsic truncates its first operand to the return
20406type. The operation has a mask and an explicit vector length parameter.
20407
20408
20409Arguments:
20410""""""""""
20411
The '``llvm.vp.trunc``' intrinsic takes a value to cast as its first operand.
The return type is the type to cast the value to. Both types must be vectors of
:ref:`integer <t_integer>` type. The bit size of the value must be larger than
the bit size of the return type. The second operand is the vector mask. The
return type, the value to cast, and the vector mask have the same number of
elements.  The third operand is the explicit vector length of the operation.
20418
20419Semantics:
20420""""""""""
20421
The '``llvm.vp.trunc``' intrinsic truncates the high-order bits of the value and
converts the remaining bits to the return type. Since the source size must be
larger than the destination size, '``llvm.vp.trunc``' cannot be a *no-op cast*.
It will always truncate bits. The conversion is performed on lane positions
below the explicit vector length and where the vector mask is true.  Masked-off
lanes are undefined.
20428
20429Examples:
20430"""""""""
20431
20432.. code-block:: llvm
20433
20434      %r = call <4 x i16> @llvm.vp.trunc.v4i16.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
20435      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20436
20437      %t = trunc <4 x i32> %a to <4 x i16>
20438      %also.r = select <4 x i1> %mask, <4 x i16> %t, <4 x i16> undef
20439
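A worked sketch with constant operands may help to show which lanes are
converted. The operand values, mask, and vector length below are chosen purely
for illustration:

.. code-block:: llvm

      %r = call <4 x i16> @llvm.vp.trunc.v4i16.v4i32(<4 x i32> <i32 65537, i32 -1, i32 256, i32 7>, <4 x i1> <i1 true, i1 true, i1 false, i1 true>, i32 3)
      ;; lane 0: enabled, 65537 (0x00010001) truncates to 1 (0x0001)
      ;; lane 1: enabled, -1 (0xFFFFFFFF) truncates to -1 (0xFFFF)
      ;; lane 2: mask bit is false, so the result is undefined
      ;; lane 3: not below the explicit vector length of 3, so the result is undefined
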
20440
20441.. _int_vp_zext:
20442
20443'``llvm.vp.zext.*``' Intrinsics
20444^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20445
20446Syntax:
20447"""""""
20448This is an overloaded intrinsic.
20449
20450::
20451
20452      declare <16 x i32>  @llvm.vp.zext.v16i32.v16i16 (<16 x i16> <op>, <16 x i1> <mask>, i32 <vector_length>)
20453      declare <vscale x 4 x i32>  @llvm.vp.zext.nxv4i32.nxv4i16 (<vscale x 4 x i16> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20454
20455Overview:
20456"""""""""
20457
20458The '``llvm.vp.zext``' intrinsic zero extends its first operand to the return
20459type. The operation has a mask and an explicit vector length parameter.
20460
20461
20462Arguments:
20463""""""""""
20464
20465The '``llvm.vp.zext``' intrinsic takes a value to cast as its first operand.
20466The return type is the type to cast the value to. Both types must be vectors of
20467:ref:`integer <t_integer>` type. The bit size of the value must be smaller than
20468the bit size of the return type. The second operand is the vector mask. The
20469return type, the value to cast, and the vector mask have the same number of
20470elements.  The third operand is the explicit vector length of the operation.
20471
20472Semantics:
20473""""""""""
20474
The '``llvm.vp.zext``' intrinsic fills the high-order bits of the value with zero
bits until it reaches the size of the return type. When zero extending from i1,
the result will always be either 0 or 1. The conversion is performed on lane
positions below the explicit vector length and where the vector mask is true.
Masked-off lanes are undefined.
20480
20481Examples:
20482"""""""""
20483
20484.. code-block:: llvm
20485
20486      %r = call <4 x i32> @llvm.vp.zext.v4i32.v4i16(<4 x i16> %a, <4 x i1> %mask, i32 %evl)
20487      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20488
20489      %t = zext <4 x i16> %a to <4 x i32>
20490      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
20491
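The following sketch uses constant operands, chosen only for illustration, to
show the results of zero extension on individual lanes:

.. code-block:: llvm

      %b = call <4 x i8> @llvm.vp.zext.v4i8.v4i1(<4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x i1> <i1 true, i1 true, i1 false, i1 true>, i32 4)
      ;; lane 0: 1, lane 1: 0, lane 2: masked off (undefined), lane 3: 1

      %w = call <2 x i32> @llvm.vp.zext.v2i32.v2i16(<2 x i16> <i16 -1, i16 1>, <2 x i1> <i1 true, i1 true>, i32 2)
      ;; lane 0: i16 -1 (0xFFFF) zero extends to 65535, lane 1: 1
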
20492
20493.. _int_vp_sext:
20494
20495'``llvm.vp.sext.*``' Intrinsics
20496^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20497
20498Syntax:
20499"""""""
20500This is an overloaded intrinsic.
20501
20502::
20503
20504      declare <16 x i32>  @llvm.vp.sext.v16i32.v16i16 (<16 x i16> <op>, <16 x i1> <mask>, i32 <vector_length>)
20505      declare <vscale x 4 x i32>  @llvm.vp.sext.nxv4i32.nxv4i16 (<vscale x 4 x i16> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20506
20507Overview:
20508"""""""""
20509
20510The '``llvm.vp.sext``' intrinsic sign extends its first operand to the return
20511type. The operation has a mask and an explicit vector length parameter.
20512
20513
20514Arguments:
20515""""""""""
20516
20517The '``llvm.vp.sext``' intrinsic takes a value to cast as its first operand.
20518The return type is the type to cast the value to. Both types must be vectors of
20519:ref:`integer <t_integer>` type. The bit size of the value must be smaller than
20520the bit size of the return type. The second operand is the vector mask. The
20521return type, the value to cast, and the vector mask have the same number of
20522elements.  The third operand is the explicit vector length of the operation.
20523
20524Semantics:
20525""""""""""
20526
The '``llvm.vp.sext``' intrinsic performs a sign extension by copying the sign
bit (highest order bit) of the value until it reaches the size of the return
type. When sign extending from i1, the result will always be either -1 or 0.
The conversion is performed on lane positions below the explicit vector length
and where the vector mask is true. Masked-off lanes are undefined.
20532
20533Examples:
20534"""""""""
20535
20536.. code-block:: llvm
20537
20538      %r = call <4 x i32> @llvm.vp.sext.v4i32.v4i16(<4 x i16> %a, <4 x i1> %mask, i32 %evl)
20539      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20540
20541      %t = sext <4 x i16> %a to <4 x i32>
20542      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
20543
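The following sketch uses constant operands, chosen only for illustration, to
show the results of sign extension on individual lanes:

.. code-block:: llvm

      %b = call <4 x i8> @llvm.vp.sext.v4i8.v4i1(<4 x i1> <i1 true, i1 false, i1 true, i1 true>, <4 x i1> <i1 true, i1 true, i1 false, i1 true>, i32 4)
      ;; lane 0: -1, lane 1: 0, lane 2: masked off (undefined), lane 3: -1

      %w = call <2 x i32> @llvm.vp.sext.v2i32.v2i16(<2 x i16> <i16 -1, i16 1>, <2 x i1> <i1 true, i1 true>, i32 2)
      ;; lane 0: i16 -1 (0xFFFF) sign extends to i32 -1 (0xFFFFFFFF), lane 1: 1
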
20544
20545.. _int_vp_fptrunc:
20546
20547'``llvm.vp.fptrunc.*``' Intrinsics
20548^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20549
20550Syntax:
20551"""""""
20552This is an overloaded intrinsic.
20553
20554::
20555
20556      declare <16 x float>  @llvm.vp.fptrunc.v16f32.v16f64 (<16 x double> <op>, <16 x i1> <mask>, i32 <vector_length>)
      declare <vscale x 4 x float>  @llvm.vp.fptrunc.nxv4f32.nxv4f64 (<vscale x 4 x double> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20558
20559Overview:
20560"""""""""
20561
20562The '``llvm.vp.fptrunc``' intrinsic truncates its first operand to the return
20563type. The operation has a mask and an explicit vector length parameter.
20564
20565
20566Arguments:
20567""""""""""
20568
The '``llvm.vp.fptrunc``' intrinsic takes a value to cast as its first operand.
The return type is the type to cast the value to. Both types must be vectors of
:ref:`floating-point <t_floating>` type. The bit size of the value must be
larger than the bit size of the return type. This implies that
'``llvm.vp.fptrunc``' cannot be used to make a *no-op cast*. The second operand
is the vector mask. The return type, the value to cast, and the vector mask have
the same number of elements.  The third operand is the explicit vector length of
the operation.
20577
20578Semantics:
20579""""""""""
20580
The '``llvm.vp.fptrunc``' intrinsic casts a ``value`` from a larger
:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
<t_floating>` type.
This intrinsic is assumed to execute in the default :ref:`floating-point
environment <floatenv>`. The conversion is performed on lane positions below the
explicit vector length and where the vector mask is true.  Masked-off lanes are
undefined.
20588
20589Examples:
20590"""""""""
20591
20592.. code-block:: llvm
20593
20594      %r = call <4 x float> @llvm.vp.fptrunc.v4f32.v4f64(<4 x double> %a, <4 x i1> %mask, i32 %evl)
20595      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20596
20597      %t = fptrunc <4 x double> %a to <4 x float>
20598      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
20599
20600
20601.. _int_vp_fpext:
20602
20603'``llvm.vp.fpext.*``' Intrinsics
20604^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20605
20606Syntax:
20607"""""""
20608This is an overloaded intrinsic.
20609
20610::
20611
20612      declare <16 x double>  @llvm.vp.fpext.v16f64.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
20613      declare <vscale x 4 x double>  @llvm.vp.fpext.nxv4f64.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20614
20615Overview:
20616"""""""""
20617
20618The '``llvm.vp.fpext``' intrinsic extends its first operand to the return
20619type. The operation has a mask and an explicit vector length parameter.
20620
20621
20622Arguments:
20623""""""""""
20624
The '``llvm.vp.fpext``' intrinsic takes a value to cast as its first operand.
The return type is the type to cast the value to. Both types must be vectors of
:ref:`floating-point <t_floating>` type. The bit size of the value must be
smaller than the bit size of the return type. This implies that
'``llvm.vp.fpext``' cannot be used to make a *no-op cast*. The second operand
is the vector mask. The return type, the value to cast, and the vector mask have
the same number of elements.  The third operand is the explicit vector length of
the operation.
20633
20634Semantics:
20635""""""""""
20636
20637The '``llvm.vp.fpext``' intrinsic extends the ``value`` from a smaller
20638:ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
20639<t_floating>` type. The '``llvm.vp.fpext``' cannot be used to make a
20640*no-op cast* because it always changes bits. Use ``bitcast`` to make a
20641*no-op cast* for a floating-point cast.
20642The conversion is performed on lane positions below the explicit vector length
20643and where the vector mask is true.  Masked-off lanes are undefined.
20644
20645Examples:
20646"""""""""
20647
20648.. code-block:: llvm
20649
20650      %r = call <4 x double> @llvm.vp.fpext.v4f64.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
20651      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20652
20653      %t = fpext <4 x float> %a to <4 x double>
20654      %also.r = select <4 x i1> %mask, <4 x double> %t, <4 x double> undef
20655
20656
20657.. _int_vp_fptoui:
20658
20659'``llvm.vp.fptoui.*``' Intrinsics
20660^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20661
20662Syntax:
20663"""""""
20664This is an overloaded intrinsic.
20665
20666::
20667
20668      declare <16 x i32>  @llvm.vp.fptoui.v16i32.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
20669      declare <vscale x 4 x i32>  @llvm.vp.fptoui.nxv4i32.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20670      declare <256 x i64>  @llvm.vp.fptoui.v256i64.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
20671
20672Overview:
20673"""""""""
20674
20675The '``llvm.vp.fptoui``' intrinsic converts the :ref:`floating-point
20676<t_floating>` operand to the unsigned integer return type.
20677The operation has a mask and an explicit vector length parameter.
20678
20679
20680Arguments:
20681""""""""""
20682
The '``llvm.vp.fptoui``' intrinsic takes a value to cast as its first operand.
The value to cast must be a vector of :ref:`floating-point <t_floating>` type.
The return type is the type to cast the value to. The return type must be a
vector of :ref:`integer <t_integer>` type.  The second operand is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements.  The third operand is the explicit vector length of the
operation.
20690
20691Semantics:
20692""""""""""
20693
20694The '``llvm.vp.fptoui``' intrinsic converts its :ref:`floating-point
20695<t_floating>` operand into the nearest (rounding towards zero) unsigned integer
20696value where the lane position is below the explicit vector length and the
20697vector mask is true.  Masked-off lanes are undefined. On enabled lanes where
20698conversion takes place and the value cannot fit in the return type, the result
20699on that lane is a :ref:`poison value <poisonvalues>`.
20700
20701Examples:
20702"""""""""
20703
20704.. code-block:: llvm
20705
20706      %r = call <4 x i32> @llvm.vp.fptoui.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
20707      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20708
20709      %t = fptoui <4 x float> %a to <4 x i32>
20710      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
20711
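The following sketch uses constant operands, chosen only for illustration, to
show rounding towards zero and the out-of-range case on individual lanes:

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.fptoui.v4i32.v4f32(<4 x float> <float 3.5, float 0.0, float 4294967296.0, float 7.0>, <4 x i1> <i1 true, i1 true, i1 true, i1 false>, i32 4)
      ;; lane 0: 3.5 rounds towards zero to 3
      ;; lane 1: 0.0 converts to 0
      ;; lane 2: 4294967296.0 (2^32) cannot fit in i32, so the result is poison
      ;; lane 3: masked off, so the result is undefined
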
20712
20713.. _int_vp_fptosi:
20714
20715'``llvm.vp.fptosi.*``' Intrinsics
20716^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20717
20718Syntax:
20719"""""""
20720This is an overloaded intrinsic.
20721
20722::
20723
20724      declare <16 x i32>  @llvm.vp.fptosi.v16i32.v16f32 (<16 x float> <op>, <16 x i1> <mask>, i32 <vector_length>)
20725      declare <vscale x 4 x i32>  @llvm.vp.fptosi.nxv4i32.nxv4f32 (<vscale x 4 x float> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20726      declare <256 x i64>  @llvm.vp.fptosi.v256i64.v256f64 (<256 x double> <op>, <256 x i1> <mask>, i32 <vector_length>)
20727
20728Overview:
20729"""""""""
20730
20731The '``llvm.vp.fptosi``' intrinsic converts the :ref:`floating-point
20732<t_floating>` operand to the signed integer return type.
20733The operation has a mask and an explicit vector length parameter.
20734
20735
20736Arguments:
20737""""""""""
20738
The '``llvm.vp.fptosi``' intrinsic takes a value to cast as its first operand.
The value to cast must be a vector of :ref:`floating-point <t_floating>` type.
The return type is the type to cast the value to. The return type must be a
vector of :ref:`integer <t_integer>` type.  The second operand is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements.  The third operand is the explicit vector length of the
operation.
20746
20747Semantics:
20748""""""""""
20749
20750The '``llvm.vp.fptosi``' intrinsic converts its :ref:`floating-point
20751<t_floating>` operand into the nearest (rounding towards zero) signed integer
20752value where the lane position is below the explicit vector length and the
20753vector mask is true.  Masked-off lanes are undefined. On enabled lanes where
20754conversion takes place and the value cannot fit in the return type, the result
20755on that lane is a :ref:`poison value <poisonvalues>`.
20756
20757Examples:
20758"""""""""
20759
20760.. code-block:: llvm
20761
20762      %r = call <4 x i32> @llvm.vp.fptosi.v4i32.v4f32(<4 x float> %a, <4 x i1> %mask, i32 %evl)
20763      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20764
20765      %t = fptosi <4 x float> %a to <4 x i32>
20766      %also.r = select <4 x i1> %mask, <4 x i32> %t, <4 x i32> undef
20767
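The following sketch uses constant operands, chosen only for illustration, to
show rounding towards zero for both signs and the out-of-range case:

.. code-block:: llvm

      %r = call <4 x i32> @llvm.vp.fptosi.v4i32.v4f32(<4 x float> <float -3.5, float 3.5, float 2147483648.0, float 7.0>, <4 x i1> <i1 true, i1 true, i1 true, i1 false>, i32 4)
      ;; lane 0: -3.5 rounds towards zero to -3
      ;; lane 1: 3.5 rounds towards zero to 3
      ;; lane 2: 2147483648.0 (2^31) cannot fit in a signed i32, so the result is poison
      ;; lane 3: masked off, so the result is undefined
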
20768
20769.. _int_vp_uitofp:
20770
20771'``llvm.vp.uitofp.*``' Intrinsics
20772^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20773
20774Syntax:
20775"""""""
20776This is an overloaded intrinsic.
20777
20778::
20779
20780      declare <16 x float>  @llvm.vp.uitofp.v16f32.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
20781      declare <vscale x 4 x float>  @llvm.vp.uitofp.nxv4f32.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20782      declare <256 x double>  @llvm.vp.uitofp.v256f64.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
20783
20784Overview:
20785"""""""""
20786
20787The '``llvm.vp.uitofp``' intrinsic converts its unsigned integer operand to the
20788:ref:`floating-point <t_floating>` return type.  The operation has a mask and
20789an explicit vector length parameter.
20790
20791
20792Arguments:
20793""""""""""
20794
The '``llvm.vp.uitofp``' intrinsic takes a value to cast as its first operand.
The value to cast must be a vector of :ref:`integer <t_integer>` type.  The
return type is the type to cast the value to.  The return type must be a vector
of :ref:`floating-point <t_floating>` type.  The second operand is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements.  The third operand is the explicit vector length of the
operation.
20802
20803Semantics:
20804""""""""""
20805
20806The '``llvm.vp.uitofp``' intrinsic interprets its first operand as an unsigned
20807integer quantity and converts it to the corresponding floating-point value. If
20808the value cannot be exactly represented, it is rounded using the default
20809rounding mode.  The conversion is performed on lane positions below the
20810explicit vector length and where the vector mask is true.  Masked-off lanes are
20811undefined.
20812
20813Examples:
20814"""""""""
20815
20816.. code-block:: llvm
20817
20818      %r = call <4 x float> @llvm.vp.uitofp.v4f32.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
20819      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20820
20821      %t = uitofp <4 x i32> %a to <4 x float>
20822      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
20823
20824
20825.. _int_vp_sitofp:
20826
20827'``llvm.vp.sitofp.*``' Intrinsics
20828^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20829
20830Syntax:
20831"""""""
20832This is an overloaded intrinsic.
20833
20834::
20835
20836      declare <16 x float>  @llvm.vp.sitofp.v16f32.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
20837      declare <vscale x 4 x float>  @llvm.vp.sitofp.nxv4f32.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20838      declare <256 x double>  @llvm.vp.sitofp.v256f64.v256i64 (<256 x i64> <op>, <256 x i1> <mask>, i32 <vector_length>)
20839
20840Overview:
20841"""""""""
20842
20843The '``llvm.vp.sitofp``' intrinsic converts its signed integer operand to the
20844:ref:`floating-point <t_floating>` return type.  The operation has a mask and
20845an explicit vector length parameter.
20846
20847
20848Arguments:
20849""""""""""
20850
The '``llvm.vp.sitofp``' intrinsic takes a value to cast as its first operand.
The value to cast must be a vector of :ref:`integer <t_integer>` type.  The
return type is the type to cast the value to.  The return type must be a vector
of :ref:`floating-point <t_floating>` type.  The second operand is the vector
mask. The return type, the value to cast, and the vector mask have the same
number of elements.  The third operand is the explicit vector length of the
operation.
20858
20859Semantics:
20860""""""""""
20861
20862The '``llvm.vp.sitofp``' intrinsic interprets its first operand as a signed
20863integer quantity and converts it to the corresponding floating-point value. If
20864the value cannot be exactly represented, it is rounded using the default
20865rounding mode.  The conversion is performed on lane positions below the
20866explicit vector length and where the vector mask is true.  Masked-off lanes are
20867undefined.
20868
20869Examples:
20870"""""""""
20871
20872.. code-block:: llvm
20873
20874      %r = call <4 x float> @llvm.vp.sitofp.v4f32.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
20875      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20876
20877      %t = sitofp <4 x i32> %a to <4 x float>
20878      %also.r = select <4 x i1> %mask, <4 x float> %t, <4 x float> undef
20879
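The signed interpretation matters when the high-order bit of an element is set.
The following sketch, with constant operands chosen only for illustration,
contrasts '``llvm.vp.sitofp``' with '``llvm.vp.uitofp``' on the same bit
patterns:

.. code-block:: llvm

      %s = call <2 x float> @llvm.vp.sitofp.v2f32.v2i32(<2 x i32> <i32 -1, i32 1>, <2 x i1> <i1 true, i1 true>, i32 2)
      ;; lane 0: 0xFFFFFFFF read as signed is -1, giving -1.0
      ;; lane 1: 1.0

      %u = call <2 x float> @llvm.vp.uitofp.v2f32.v2i32(<2 x i32> <i32 -1, i32 1>, <2 x i1> <i1 true, i1 true>, i32 2)
      ;; lane 0: 0xFFFFFFFF read as unsigned is 4294967295, which is not exactly
      ;;         representable in float and rounds to 4294967296.0
      ;; lane 1: 1.0
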
20880
20881.. _int_vp_ptrtoint:
20882
20883'``llvm.vp.ptrtoint.*``' Intrinsics
20884^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20885
20886Syntax:
20887"""""""
20888This is an overloaded intrinsic.
20889
20890::
20891
20892      declare <16 x i8>  @llvm.vp.ptrtoint.v16i8.v16p0(<16 x ptr> <op>, <16 x i1> <mask>, i32 <vector_length>)
20893      declare <vscale x 4 x i8>  @llvm.vp.ptrtoint.nxv4i8.nxv4p0(<vscale x 4 x ptr> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
      declare <256 x i64>  @llvm.vp.ptrtoint.v256i64.v256p0(<256 x ptr> <op>, <256 x i1> <mask>, i32 <vector_length>)
20895
20896Overview:
20897"""""""""
20898
20899The '``llvm.vp.ptrtoint``' intrinsic converts its pointer to the integer return
20900type.  The operation has a mask and an explicit vector length parameter.
20901
20902
20903Arguments:
20904""""""""""
20905
The '``llvm.vp.ptrtoint``' intrinsic takes a value to cast as its first operand,
which must be a vector of pointers. The return type is the type to cast the
value to, which must be a vector of :ref:`integer <t_integer>` type.
The second operand is the vector mask. The return type, the value to cast, and
the vector mask have the same number of elements.
The third operand is the explicit vector length of the operation.
20912
20913Semantics:
20914""""""""""
20915
The '``llvm.vp.ptrtoint``' intrinsic converts ``value`` to the return type by
interpreting the pointer value as an integer and either truncating or zero
extending that value to the size of the integer type.
If ``value`` is smaller than the return type, then a zero extension is done. If
``value`` is larger than the return type, then a truncation is done. If they are
the same size, then nothing is done (*no-op cast*) other than a type
change.
The conversion is performed on lane positions below the explicit vector length
and where the vector mask is true.  Masked-off lanes are undefined.
20925
20926Examples:
20927"""""""""
20928
20929.. code-block:: llvm
20930
      %r = call <4 x i8> @llvm.vp.ptrtoint.v4i8.v4p0(<4 x ptr> %a, <4 x i1> %mask, i32 %evl)
20932      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20933
20934      %t = ptrtoint <4 x ptr> %a to <4 x i8>
20935      %also.r = select <4 x i1> %mask, <4 x i8> %t, <4 x i8> undef
20936
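Which of the three cases applies depends on the pointer size given by the
datalayout. The following sketch assumes, purely for illustration, that
pointers in address space 0 are 64 bits wide:

.. code-block:: llvm

      ;; Assumption: address space 0 pointers are 64 bits wide.
      %same   = call <4 x i64> @llvm.vp.ptrtoint.v4i64.v4p0(<4 x ptr> %a, <4 x i1> %mask, i32 %evl)
      ;; same size: no-op cast other than the type change
      %narrow = call <4 x i32> @llvm.vp.ptrtoint.v4i32.v4p0(<4 x ptr> %a, <4 x i1> %mask, i32 %evl)
      ;; integer type is smaller than the pointer: truncation
      %wide   = call <4 x i128> @llvm.vp.ptrtoint.v4i128.v4p0(<4 x ptr> %a, <4 x i1> %mask, i32 %evl)
      ;; integer type is larger than the pointer: zero extension
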
20937
20938.. _int_vp_inttoptr:
20939
20940'``llvm.vp.inttoptr.*``' Intrinsics
20941^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20942
20943Syntax:
20944"""""""
20945This is an overloaded intrinsic.
20946
20947::
20948
20949      declare <16 x ptr>  @llvm.vp.inttoptr.v16p0.v16i32 (<16 x i32> <op>, <16 x i1> <mask>, i32 <vector_length>)
20950      declare <vscale x 4 x ptr>  @llvm.vp.inttoptr.nxv4p0.nxv4i32 (<vscale x 4 x i32> <op>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
20951      declare <256 x ptr>  @llvm.vp.inttoptr.v256p0.v256i32 (<256 x i32> <op>, <256 x i1> <mask>, i32 <vector_length>)
20952
20953Overview:
20954"""""""""
20955
The '``llvm.vp.inttoptr``' intrinsic converts its integer value to the pointer
return type. The operation has a mask and an explicit vector length parameter.
20958
20959
20960Arguments:
20961""""""""""
20962
The '``llvm.vp.inttoptr``' intrinsic takes a value to cast as its first operand,
which must be a vector of :ref:`integer <t_integer>` type. The return type is
the type to cast the value to, which must be a vector of pointers.
The second operand is the vector mask. The return type, the value to cast, and
the vector mask have the same number of elements.
The third operand is the explicit vector length of the operation.
20969
20970Semantics:
20971""""""""""
20972
The '``llvm.vp.inttoptr``' intrinsic converts ``value`` to the return type by
applying either a zero extension or a truncation depending on the size of the
integer ``value``. If ``value`` is larger than the size of a pointer, then a
truncation is done. If ``value`` is smaller than the size of a pointer, then a
zero extension is done. If they are the same size, nothing is done (*no-op cast*).
The conversion is performed on lane positions below the explicit vector length
and where the vector mask is true.  Masked-off lanes are undefined.
20980
20981Examples:
20982"""""""""
20983
20984.. code-block:: llvm
20985
      %r = call <4 x ptr> @llvm.vp.inttoptr.v4p0.v4i32(<4 x i32> %a, <4 x i1> %mask, i32 %evl)
20987      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
20988
20989      %t = inttoptr <4 x i32> %a to <4 x ptr>
20990      %also.r = select <4 x i1> %mask, <4 x ptr> %t, <4 x ptr> undef
20991
20992
20993.. _int_vp_fcmp:
20994
20995'``llvm.vp.fcmp.*``' Intrinsics
20996^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
20997
20998Syntax:
20999"""""""
21000This is an overloaded intrinsic.
21001
21002::
21003
21004      declare <16 x i1> @llvm.vp.fcmp.v16f32(<16 x float> <left_op>, <16 x float> <right_op>, metadata <condition code>, <16 x i1> <mask>, i32 <vector_length>)
21005      declare <vscale x 4 x i1> @llvm.vp.fcmp.nxv4f32(<vscale x 4 x float> <left_op>, <vscale x 4 x float> <right_op>, metadata <condition code>, <vscale x 4 x i1> <mask>, i32 <vector_length>)
21006      declare <256 x i1> @llvm.vp.fcmp.v256f64(<256 x double> <left_op>, <256 x double> <right_op>, metadata <condition code>, <256 x i1> <mask>, i32 <vector_length>)
21007
21008Overview:
21009"""""""""
21010
21011The '``llvm.vp.fcmp``' intrinsic returns a vector of boolean values based on
21012the comparison of its operands. The operation has a mask and an explicit vector
21013length parameter.
21014
21015
21016Arguments:
21017""""""""""
21018
The '``llvm.vp.fcmp``' intrinsic takes the two values to compare as its first
and second operands. These two values must be vectors of :ref:`floating-point
<t_floating>` types. The third operand is the condition code indicating the kind
of comparison to perform. It must be a metadata string with :ref:`one of the
supported floating-point condition code values <fcmp_md_cc>`. The fourth operand
is the vector mask. The fifth operand is the explicit vector length of the
operation.
The return type is the result of the comparison. The return type must be a
vector of :ref:`i1 <t_integer>` type. The return type, the values to compare,
and the vector mask have the same number of elements.
21029
21030Semantics:
21031""""""""""
21032
The '``llvm.vp.fcmp``' compares its first two operands according to the
condition code given as the third operand. The operands are compared element by
element on each enabled lane, where the semantics of the comparison are
defined :ref:`according to the condition code <fcmp_md_cc_sem>`. Masked-off
lanes are undefined.
21038
21039Examples:
21040"""""""""
21041
21042.. code-block:: llvm
21043
21044      %r = call <4 x i1> @llvm.vp.fcmp.v4f32(<4 x float> %a, <4 x float> %b, metadata !"oeq", <4 x i1> %mask, i32 %evl)
21045      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21046
21047      %t = fcmp oeq <4 x float> %a, %b
21048      %also.r = select <4 x i1> %mask, <4 x i1> %t, <4 x i1> undef
21049
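The choice of condition code matters when an operand is a NaN. The following
sketch, with constant operands chosen only for illustration, contrasts the
ordered and unordered equality codes:

.. code-block:: llvm

      ;; 0x7FF8000000000000 is a quiet NaN.
      %o = call <2 x i1> @llvm.vp.fcmp.v2f32(<2 x float> <float 1.0, float 0x7FF8000000000000>, <2 x float> <float 1.0, float 1.0>, metadata !"oeq", <2 x i1> <i1 true, i1 true>, i32 2)
      ;; lane 0: true (1.0 == 1.0), lane 1: false (an operand is a NaN)

      %u = call <2 x i1> @llvm.vp.fcmp.v2f32(<2 x float> <float 1.0, float 0x7FF8000000000000>, <2 x float> <float 1.0, float 1.0>, metadata !"ueq", <2 x i1> <i1 true, i1 true>, i32 2)
      ;; lane 0: true (1.0 == 1.0), lane 1: true (an operand is a NaN)
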
21050
21051.. _int_vp_icmp:
21052
21053'``llvm.vp.icmp.*``' Intrinsics
21054^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21055
21056Syntax:
21057"""""""
21058This is an overloaded intrinsic.
21059
21060::
21061
21062      declare <32 x i1> @llvm.vp.icmp.v32i32(<32 x i32> <left_op>, <32 x i32> <right_op>, metadata <condition code>, <32 x i1> <mask>, i32 <vector_length>)
21063      declare <vscale x 2 x i1> @llvm.vp.icmp.nxv2i32(<vscale x 2 x i32> <left_op>, <vscale x 2 x i32> <right_op>, metadata <condition code>, <vscale x 2 x i1> <mask>, i32 <vector_length>)
21064      declare <128 x i1> @llvm.vp.icmp.v128i8(<128 x i8> <left_op>, <128 x i8> <right_op>, metadata <condition code>, <128 x i1> <mask>, i32 <vector_length>)
21065
21066Overview:
21067"""""""""
21068
21069The '``llvm.vp.icmp``' intrinsic returns a vector of boolean values based on
21070the comparison of its operands. The operation has a mask and an explicit vector
21071length parameter.
21072
21073
21074Arguments:
21075""""""""""
21076
The '``llvm.vp.icmp``' intrinsic takes the two values to compare as its first
and second operands. These two values must be vectors of :ref:`integer
<t_integer>` types. The third operand is the condition code indicating the kind
of comparison to perform. It must be a metadata string with :ref:`one of the
supported integer condition code values <icmp_md_cc>`. The fourth operand is the
vector mask. The fifth operand is the explicit vector length of the operation.
The return type is the result of the comparison. The return type must be a
vector of :ref:`i1 <t_integer>` type. The return type, the values to compare,
and the vector mask have the same number of elements.
21087
21088Semantics:
21089""""""""""
21090
21091The '``llvm.vp.icmp``' compares its first two operands according to the
21092condition code given as the third operand. The operands are compared element by
element on each enabled lane, where the semantics of the comparison are
21094defined :ref:`according to the condition code <icmp_md_cc_sem>`. Masked-off
21095lanes are undefined.
21096
21097Examples:
21098"""""""""
21099
21100.. code-block:: llvm
21101
21102      %r = call <4 x i1> @llvm.vp.icmp.v4i32(<4 x i32> %a, <4 x i32> %b, metadata !"ne", <4 x i1> %mask, i32 %evl)
21103      ;; For all lanes below %evl, %r is lane-wise equivalent to %also.r
21104
21105      %t = icmp ne <4 x i32> %a, %b
21106      %also.r = select <4 x i1> %mask, <4 x i1> %t, <4 x i1> undef
21107
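As an informal illustration (not part of the specification), the effect of the
mask and the explicit vector length can be modeled in scalar C. The function
name ``vp_icmp_ne_i32`` and its signature are hypothetical and exist only for
this sketch, which fixes the condition code to ``ne``:

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical scalar model of llvm.vp.icmp with the "ne" condition code.
       Only enabled lanes (mask bit set and index below evl) are written.
       The remaining lanes of r are left untouched in this model, whereas
       the intrinsic leaves them undefined. */
    void vp_icmp_ne_i32(const int32_t *a, const int32_t *b, const bool *mask,
                        uint32_t evl, bool *r, size_t n) {
      for (size_t i = 0; i < n; ++i) {
        if (i < evl && mask[i])
          r[i] = (a[i] != b[i]);
      }
    }

Code must not rely on the contents of disabled lanes; the model leaves them
untouched only because C has no direct way to express an undefined value.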
21108
21109.. _int_mload_mstore:
21110
21111Masked Vector Load and Store Intrinsics
21112---------------------------------------
21113
21114LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.
21115
21116.. _int_mload:
21117
21118'``llvm.masked.load.*``' Intrinsics
21119^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21120
21121Syntax:
21122"""""""
21123This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.
21124
21125::
21126
21127      declare <16 x float>  @llvm.masked.load.v16f32.p0(ptr <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
21128      declare <2 x double>  @llvm.masked.load.v2f64.p0(ptr <ptr>, i32 <alignment>, <2 x i1>  <mask>, <2 x double> <passthru>)
21129      ;; The data is a vector of pointers
21130      declare <8 x ptr> @llvm.masked.load.v8p0.p0(ptr <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x ptr> <passthru>)
21131
21132Overview:
21133"""""""""
21134
21135Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
21136
21137
21138Arguments:
21139""""""""""
21140
21141The first operand is the base pointer for the load. The second operand is the alignment of the source location. It must be a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' operand are the same vector types.
21142
21143Semantics:
21144""""""""""
21145
21146The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
21147The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.
21148
21149
21150::
21151
       %res = call <16 x float> @llvm.masked.load.v16f32.p0(ptr %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)

       ;; %res is identical to %also.res below, aside from potential memory access exceptions
       %loadval = load <16 x float>, ptr %ptr, align 4
       %also.res = select <16 x i1> %mask, <16 x float> %loadval, <16 x float> %passthru
21157
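The difference from the load-plus-select sequence can be seen in a scalar
model. The following is a minimal sketch in C, not a description of any actual
lowering; the function name ``masked_load_f32`` and its parameters are
hypothetical:

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.load for float lanes.
       ptr[i] is read only when lane i is enabled, so a masked-off lane
       never touches memory. */
    void masked_load_f32(const float *ptr, const bool *mask,
                         const float *passthru, float *res, size_t n) {
      for (size_t i = 0; i < n; ++i)
        res[i] = mask[i] ? ptr[i] : passthru[i];
    }

Because the conditional operator evaluates only the selected operand, the
model never dereferences ``ptr`` for a masked-off lane, which mirrors the
guarantee that such lanes cannot cause memory access exceptions.
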
21158.. _int_mstore:
21159
21160'``llvm.masked.store.*``' Intrinsics
21161^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21162
21163Syntax:
21164"""""""
21165This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.
21166
21167::
21168
21169       declare void @llvm.masked.store.v8i32.p0 (<8  x i32>   <value>, ptr <ptr>, i32 <alignment>, <8  x i1> <mask>)
21170       declare void @llvm.masked.store.v16f32.p0(<16 x float> <value>, ptr <ptr>, i32 <alignment>, <16 x i1> <mask>)
21171       ;; The data is a vector of pointers
21172       declare void @llvm.masked.store.v8p0.p0  (<8 x ptr>    <value>, ptr <ptr>, i32 <alignment>, <8 x i1> <mask>)
21173
21174Overview:
21175"""""""""
21176
21177Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
21178
21179Arguments:
21180""""""""""
21181
The first operand is the vector value to be written to memory. The second operand is the base pointer for the store; it has the same underlying type as the value operand. The third operand is the alignment of the destination location. It must be a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.
21183
21184
21185Semantics:
21186""""""""""
21187
The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked stores and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
21189The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.
21190
21191::
21192
21193       call void @llvm.masked.store.v16f32.p0(<16 x float> %value, ptr %ptr, i32 4,  <16 x i1> %mask)
21194
21195       ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
21196       %oldval = load <16 x float>, ptr %ptr, align 4
21197       %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
21198       store <16 x float> %res, ptr %ptr, align 4
21199
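A scalar model makes the difference from the load-modify-store sequence
explicit. This is a minimal sketch in C under the assumption of float lanes;
the function name ``masked_store_f32`` is hypothetical:

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.store for float lanes.
       Memory for a masked-off lane is neither read nor written. */
    void masked_store_f32(const float *value, float *ptr,
                          const bool *mask, size_t n) {
      for (size_t i = 0; i < n; ++i) {
        if (mask[i])
          ptr[i] = value[i];
      }
    }

Unlike the load-modify-store sequence, the model never writes the old value
back to a masked-off location, so it cannot race with another thread that
updates that location.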
21200
21201Masked Vector Gather and Scatter Intrinsics
21202-------------------------------------------
21203
21204LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.
21205
21206.. _int_mgather:
21207
21208'``llvm.masked.gather.*``' Intrinsics
21209^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21210
21211Syntax:
21212"""""""
21213This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.
21214
21215::
21216
21217      declare <16 x float> @llvm.masked.gather.v16f32.v16p0(<16 x ptr> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
21218      declare <2 x double> @llvm.masked.gather.v2f64.v2p1(<2 x ptr addrspace(1)> <ptrs>, i32 <alignment>, <2 x i1>  <mask>, <2 x double> <passthru>)
21219      declare <8 x ptr> @llvm.masked.gather.v8p0.v8p0(<8 x ptr> <ptrs>, i32 <alignment>, <8 x i1>  <mask>, <8 x ptr> <passthru>)
21220
21221Overview:
21222"""""""""
21223
21224Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
21225
21226
21227Arguments:
21228""""""""""
21229
21230The first operand is a vector of pointers which holds all memory addresses to read. The second operand is an alignment of the source addresses. It must be 0 or a power of two constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the vector of pointers and the type of the '``passthru``' operand are the same vector types.
21231
21232Semantics:
21233""""""""""
21234
21235The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads followed by gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.
21237
21238
21239::
21240
21241       %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0(<4 x ptr> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef)
21242
21243       ;; The gather with all-true mask is equivalent to the following instruction sequence
21244       %ptr0 = extractelement <4 x ptr> %ptrs, i32 0
21245       %ptr1 = extractelement <4 x ptr> %ptrs, i32 1
21246       %ptr2 = extractelement <4 x ptr> %ptrs, i32 2
21247       %ptr3 = extractelement <4 x ptr> %ptrs, i32 3
21248
21249       %val0 = load double, ptr %ptr0, align 8
21250       %val1 = load double, ptr %ptr1, align 8
21251       %val2 = load double, ptr %ptr2, align 8
21252       %val3 = load double, ptr %ptr3, align 8
21253
       %vec0    = insertelement <4 x double> undef, double %val0, i32 0
       %vec01   = insertelement <4 x double> %vec0, double %val1, i32 1
       %vec012  = insertelement <4 x double> %vec01, double %val2, i32 2
       %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3
21258
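For a mask that is not all-true, the conditional scalar loads can be sketched
in C as follows. This is an informal model only; the function name
``masked_gather_f64`` and its signature are hypothetical:

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.gather for double lanes.
       ptrs[i] is dereferenced only when lane i is enabled; masked-off
       lanes take their value from passthru. */
    void masked_gather_f64(const double *const *ptrs, const bool *mask,
                           const double *passthru, double *res, size_t n) {
      for (size_t i = 0; i < n; ++i)
        res[i] = mask[i] ? *ptrs[i] : passthru[i];
    }

In this model a masked-off lane may hold any pointer value, including a null
pointer, because it is never dereferenced.
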
21259.. _int_mscatter:
21260
21261'``llvm.masked.scatter.*``' Intrinsics
21262^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21263
21264Syntax:
21265"""""""
21266This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.
21267
21268::
21269
21270       declare void @llvm.masked.scatter.v8i32.v8p0  (<8 x i32>    <value>, <8 x ptr>               <ptrs>, i32 <alignment>, <8 x i1>  <mask>)
21271       declare void @llvm.masked.scatter.v16f32.v16p1(<16 x float> <value>, <16 x ptr addrspace(1)> <ptrs>, i32 <alignment>, <16 x i1> <mask>)
21272       declare void @llvm.masked.scatter.v4p0.v4p0   (<4 x ptr>    <value>, <4 x ptr>               <ptrs>, i32 <alignment>, <4 x i1>  <mask>)
21273
21274Overview:
21275"""""""""
21276
21277Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.
21278
21279Arguments:
21280""""""""""
21281
21282The first operand is a vector value to be written to memory. The second operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The third operand is an alignment of the destination addresses. It must be 0 or a power of two constant integer value. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.
21283
21284Semantics:
21285""""""""""
21286
The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation. The operation may be conditional, when not all bits in the mask are switched on. It is useful for targets that support vector masked scatter and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
21288
21289::
21290
21291       ;; This instruction unconditionally stores data vector in multiple addresses
       call void @llvm.masked.scatter.v8i32.v8p0(<8 x i32> %value, <8 x ptr> %ptrs, i32 4,  <8 x i1>  <true, true, .. true>)
21293
21294       ;; It is equivalent to a list of scalar stores
21295       %val0 = extractelement <8 x i32> %value, i32 0
21296       %val1 = extractelement <8 x i32> %value, i32 1
21297       ..
21298       %val7 = extractelement <8 x i32> %value, i32 7
21299       %ptr0 = extractelement <8 x ptr> %ptrs, i32 0
21300       %ptr1 = extractelement <8 x ptr> %ptrs, i32 1
21301       ..
21302       %ptr7 = extractelement <8 x ptr> %ptrs, i32 7
21303       ;; Note: the order of the following stores is important when they overlap:
21304       store i32 %val0, ptr %ptr0, align 4
21305       store i32 %val1, ptr %ptr1, align 4
21306       ..
21307       store i32 %val7, ptr %ptr7, align 4
21308
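The ordering guarantee for overlapping addresses can be illustrated with a
scalar model that also honors the mask. This is a minimal sketch in C; the
function name ``masked_scatter_i32`` is hypothetical:

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical scalar model of llvm.masked.scatter for i32 lanes.
       Lanes are stored from the lowest index to the highest, so when two
       enabled lanes share an address the higher lane's value remains. */
    void masked_scatter_i32(const int32_t *value, int32_t *const *ptrs,
                            const bool *mask, size_t n) {
      for (size_t i = 0; i < n; ++i) {
        if (mask[i])
          *ptrs[i] = value[i];
      }
    }

For example, if lanes 0 and 2 are both enabled and point to the same location,
that location holds the value of lane 2 after the call.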
21309
21310Masked Vector Expanding Load and Compressing Store Intrinsics
21311-------------------------------------------------------------
21312
LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.
21314
21315.. _int_expandload:
21316
21317'``llvm.masked.expandload.*``' Intrinsics
21318^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21319
21320Syntax:
21321"""""""
21322This is an overloaded intrinsic. Several values of integer, floating point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.
21323
21324::
21325
21326      declare <16 x float>  @llvm.masked.expandload.v16f32 (ptr <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
21327      declare <2 x i64>     @llvm.masked.expandload.v2i64 (ptr <ptr>, <2 x i1>  <mask>, <2 x i64> <passthru>)
21328
21329Overview:
21330"""""""""
21331
Reads a number of scalar values sequentially from the memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7, respectively. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand.
21333
21334
21335Arguments:
21336""""""""""
21337
21338The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand have the same vector type.
21339
21340Semantics:
21341""""""""""
21342
The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependencies like in the following example:
21344
21345.. code-block:: c
21346
21347    // In this loop we load from B and spread the elements into array A.
    double *A, *B; int *C;
    int j = 0;
21349    for (int i = 0; i < size; ++i) {
21350      if (C[i] != 0)
21351        A[i] = B[j++];
21352    }
21353
21354
21355.. code-block:: llvm
21356
21357    ; Load several elements from array B and expand them in a vector.
21358    ; The number of loaded elements is equal to the number of '1' elements in the Mask.
21359    %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(ptr %Bptr, <8 x i1> %Mask, <8 x double> undef)
21360    ; Store the result in A
21361    call void @llvm.masked.store.v8f64.p0(<8 x double> %Tmp, ptr %Aptr, i32 8, <8 x i1> %Mask)
21362
21363    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
21364    %MaskI = bitcast <8 x i1> %Mask to i8
21365    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
21366    %MaskI64 = zext i8 %MaskIPopcnt to i64
21367    %BNextInd = add i64 %BInd, %MaskI64
21368
21369
21370Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
21371If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.
21372
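The lane placement can also be expressed as a scalar model. The following is
a minimal sketch in C, assuming double lanes; the function name
``expandload_f64`` is hypothetical, and the returned element count is a
convenience of the model rather than a result of the intrinsic:

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.expandload for double lanes.
       Consecutive elements of ptr are placed into the enabled lanes, and
       masked-off lanes are filled from passthru. Returns the number of
       elements read from memory. */
    size_t expandload_f64(const double *ptr, const bool *mask,
                          const double *passthru, double *res, size_t n) {
      size_t j = 0;
      for (size_t i = 0; i < n; ++i)
        res[i] = mask[i] ? ptr[j++] : passthru[i];
      return j;
    }

With the mask '10010001', the model reads three elements and places them in
lanes 0, 3 and 7. The returned count corresponds to the population count that
is used to advance the base pointer between iterations.
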
21373.. _int_compressstore:
21374
21375'``llvm.masked.compressstore.*``' Intrinsics
21376^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21377
21378Syntax:
21379"""""""
21380This is an overloaded intrinsic. A number of scalar values of integer, floating point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.
21381
21382::
21383
21384      declare void @llvm.masked.compressstore.v8i32  (<8  x i32>   <value>, ptr <ptr>, <8  x i1> <mask>)
21385      declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, ptr <ptr>, <16 x i1> <mask>)
21386
21387Overview:
21388"""""""""
21389
Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask.
21391
21392Arguments:
21393""""""""""
21394
The first operand is the input vector, from which elements are collected and written to memory. The second operand is the base pointer for the store; it has the same underlying type as the element of the input vector operand. The third operand is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.
21396
21397
21398Semantics:
21399""""""""""
21400
The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows collecting elements from possibly non-adjacent lanes of a vector and storing them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependencies like in the following example:
21402
21403.. code-block:: c
21404
21405    // In this loop we load elements from A and store them consecutively in B
    double *A, *B; int *C;
    int j = 0;
    for (int i = 0; i < size; ++i) {
      if (C[i] != 0)
        B[j++] = A[i];
21410    }
21411
21412
21413.. code-block:: llvm
21414
21415    ; Load elements from A.
21416    %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0(ptr %Aptr, i32 8, <8 x i1> %Mask, <8 x double> undef)
21417    ; Store all selected elements consecutively in array B
    call void @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, ptr %Bptr, <8 x i1> %Mask)
21419
21420    ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
21421    %MaskI = bitcast <8 x i1> %Mask to i8
21422    %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
21423    %MaskI64 = zext i8 %MaskIPopcnt to i64
21424    %BNextInd = add i64 %BInd, %MaskI64
21425
21426
21427Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.
21428
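The branch-guarded scalar form can be sketched in C as follows. This is an
informal model assuming double lanes; the function name ``compressstore_f64``
is hypothetical, and the returned element count is a convenience of the model
rather than a result of the intrinsic:

.. code-block:: c

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical scalar model of llvm.masked.compressstore for double
       lanes. Enabled lanes are written to consecutive addresses starting
       at ptr, from lower to higher lanes. Returns the number of elements
       stored. */
    size_t compressstore_f64(const double *value, double *ptr,
                             const bool *mask, size_t n) {
      size_t j = 0;
      for (size_t i = 0; i < n; ++i) {
        if (mask[i])
          ptr[j++] = value[i];
      }
      return j;
    }

Memory beyond the stored elements is left unmodified, so a mask with three
active bits writes exactly three consecutive elements.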
21429
21430Memory Use Markers
21431------------------
21432
21433This class of intrinsics provides information about the
21434:ref:`lifetime of memory objects <objectlifetime>` and ranges where variables
21435are immutable.
21436
21437.. _int_lifestart:
21438
21439'``llvm.lifetime.start``' Intrinsic
21440^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21441
21442Syntax:
21443"""""""
21444
21445::
21446
21447      declare void @llvm.lifetime.start(i64 <size>, ptr nocapture <ptr>)
21448
21449Overview:
21450"""""""""
21451
21452The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
21453object's lifetime.
21454
21455Arguments:
21456""""""""""
21457
21458The first argument is a constant integer representing the size of the
21459object, or -1 if it is variable sized. The second argument is a pointer
21460to the object.
21461
21462Semantics:
21463""""""""""
21464
21465If ``ptr`` is a stack-allocated object and it points to the first byte of
21466the object, the object is initially marked as dead.
21467``ptr`` is conservatively considered as a non-stack-allocated object if
21468the stack coloring algorithm that is used in the optimization pipeline cannot
21469conclude that ``ptr`` is a stack-allocated object.
21470
After '``llvm.lifetime.start``', the stack object that ``ptr`` points to is
marked as alive and has an uninitialized value.
The stack object is marked as dead when either an
:ref:`llvm.lifetime.end <int_lifeend>` on the alloca is executed or the
function returns.
21476
21477After :ref:`llvm.lifetime.end <int_lifeend>` is called,
21478'``llvm.lifetime.start``' on the stack object can be called again.
21479The second '``llvm.lifetime.start``' call marks the object as alive, but it
21480does not change the address of the object.
21481
If ``ptr`` is a non-stack-allocated object, does not point to the first
byte of the object, or is a stack object that is already alive,
'``llvm.lifetime.start``' simply fills all bytes of the object with ``poison``.
21485
21486
21487.. _int_lifeend:
21488
21489'``llvm.lifetime.end``' Intrinsic
21490^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21491
21492Syntax:
21493"""""""
21494
21495::
21496
21497      declare void @llvm.lifetime.end(i64 <size>, ptr nocapture <ptr>)
21498
21499Overview:
21500"""""""""
21501
21502The '``llvm.lifetime.end``' intrinsic specifies the end of a memory object's
21503lifetime.
21504
21505Arguments:
21506""""""""""
21507
21508The first argument is a constant integer representing the size of the
21509object, or -1 if it is variable sized. The second argument is a pointer
21510to the object.
21511
21512Semantics:
21513""""""""""
21514
21515If ``ptr`` is a stack-allocated object and it points to the first byte of the
21516object, the object is dead.
21517``ptr`` is conservatively considered as a non-stack-allocated object if
21518the stack coloring algorithm that is used in the optimization pipeline cannot
21519conclude that ``ptr`` is a stack-allocated object.
21520
Calling ``llvm.lifetime.end`` on an already dead alloca is a no-op.
21522
If ``ptr`` is a non-stack-allocated object or does not point to the first
byte of the object, '``llvm.lifetime.end``' is equivalent to simply filling
all bytes of the object with ``poison``.
21526
21527
21528'``llvm.invariant.start``' Intrinsic
21529^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21530
21531Syntax:
21532"""""""
21533This is an overloaded intrinsic. The memory object can belong to any address space.
21534
21535::
21536
21537      declare ptr @llvm.invariant.start.p0(i64 <size>, ptr nocapture <ptr>)
21538
21539Overview:
21540"""""""""
21541
21542The '``llvm.invariant.start``' intrinsic specifies that the contents of
21543a memory object will not change.
21544
21545Arguments:
21546""""""""""
21547
21548The first argument is a constant integer representing the size of the
21549object, or -1 if it is variable sized. The second argument is a pointer
21550to the object.
21551
21552Semantics:
21553""""""""""
21554
21555This intrinsic indicates that until an ``llvm.invariant.end`` that uses
21556the return value, the referenced memory location is constant and
21557unchanging.
21558
21559'``llvm.invariant.end``' Intrinsic
21560^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21561
21562Syntax:
21563"""""""
21564This is an overloaded intrinsic. The memory object can belong to any address space.
21565
21566::
21567
21568      declare void @llvm.invariant.end.p0(ptr <start>, i64 <size>, ptr nocapture <ptr>)
21569
21570Overview:
21571"""""""""
21572
21573The '``llvm.invariant.end``' intrinsic specifies that the contents of a
21574memory object are mutable.
21575
21576Arguments:
21577""""""""""
21578
The first argument is the value returned by the matching
``llvm.invariant.start`` call. The second argument is a constant integer
representing the size of the object, or -1 if it is variable sized. The
third argument is a pointer to the object.
21583
21584Semantics:
21585""""""""""
21586
21587This intrinsic indicates that the memory is mutable again.
21588
21589'``llvm.launder.invariant.group``' Intrinsic
21590^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21591
21592Syntax:
21593"""""""
21594This is an overloaded intrinsic. The memory object can belong to any address
21595space. The returned pointer must belong to the same address space as the
21596argument.
21597
21598::
21599
21600      declare ptr @llvm.launder.invariant.group.p0(ptr <ptr>)
21601
21602Overview:
21603"""""""""
21604
21605The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
21606established by ``invariant.group`` metadata no longer holds, to obtain a new
21607pointer value that carries fresh invariant group information. It is an
21608experimental intrinsic, which means that its semantics might change in the
21609future.
21610
21611
21612Arguments:
21613""""""""""
21614
21615The ``llvm.launder.invariant.group`` takes only one argument, which is a pointer
21616to the memory.
21617
21618Semantics:
21619""""""""""
21620
21621Returns another pointer that aliases its argument but which is considered different
21622for the purposes of ``load``/``store`` ``invariant.group`` metadata.
21623It does not read any accessible memory and the execution can be speculated.
21624
21625'``llvm.strip.invariant.group``' Intrinsic
21626^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21627
21628Syntax:
21629"""""""
21630This is an overloaded intrinsic. The memory object can belong to any address
21631space. The returned pointer must belong to the same address space as the
21632argument.
21633
21634::
21635
21636      declare ptr @llvm.strip.invariant.group.p0(ptr <ptr>)
21637
21638Overview:
21639"""""""""
21640
21641The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
21642established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
21643value that does not carry the invariant information. It is an experimental
21644intrinsic, which means that its semantics might change in the future.
21645
21646
21647Arguments:
21648""""""""""
21649
21650The ``llvm.strip.invariant.group`` takes only one argument, which is a pointer
21651to the memory.
21652
21653Semantics:
21654""""""""""
21655
21656Returns another pointer that aliases its argument but which has no associated
21657``invariant.group`` metadata.
21658It does not read any memory and can be speculated.
21659
21660
21661
21662.. _constrainedfp:
21663
21664Constrained Floating-Point Intrinsics
21665-------------------------------------
21666
21667These intrinsics are used to provide special handling of floating-point
21668operations when specific rounding mode or floating-point exception behavior is
21669required.  By default, LLVM optimization passes assume that the rounding mode is
21670round-to-nearest and that floating-point exceptions will not be monitored.
21671Constrained FP intrinsics are used to support non-default rounding modes and
21672accurately preserve exception behavior without compromising LLVM's ability to
21673optimize FP code when the default behavior is used.
21674
21675If any FP operation in a function is constrained then they all must be
constrained. This is required for correct LLVM IR. Optimizations that
move code around can cause miscompiles if constrained and normal
operations are mixed. The correct way to mix constrained and less constrained
21679operations is to use the rounding mode and exception handling metadata to
21680mark constrained intrinsics as having LLVM's default behavior.
21681
21682Each of these intrinsics corresponds to a normal floating-point operation. The
21683data arguments and the return value are the same as the corresponding FP
21684operation.
21685
21686The rounding mode argument is a metadata string specifying what
21687assumptions, if any, the optimizer can make when transforming constant
21688values. Some constrained FP intrinsics omit this argument. If required
21689by the intrinsic, this argument must be one of the following strings:
21690
21691::
21692
21693      "round.dynamic"
21694      "round.tonearest"
21695      "round.downward"
21696      "round.upward"
21697      "round.towardzero"
21698      "round.tonearestaway"
21699
21700If this argument is "round.dynamic" optimization passes must assume that the
21701rounding mode is unknown and may change at runtime.  No transformations that
21702depend on rounding mode may be performed in this case.
21703
21704The other possible values for the rounding mode argument correspond to the
21705similarly named IEEE rounding modes.  If the argument is any of these values
21706optimization passes may perform transformations as long as they are consistent
21707with the specified rounding mode.
21708
21709For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
21710"round.downward" or "round.dynamic" because if the value of 'x' is +0 then
21711'x-0' should evaluate to '-0' when rounding downward.  However, this
21712transformation is legal for all other rounding modes.
21713
21714For values other than "round.dynamic" optimization passes may assume that the
21715actual runtime rounding mode (as defined in a target-specific manner) matches
21716the specified rounding mode, but this is not guaranteed.  Using a specific
21717non-dynamic rounding mode which does not match the actual rounding mode at
21718runtime results in undefined behavior.
21719
The exception behavior argument is a metadata string describing the
floating-point exception semantics that are required for the intrinsic. This
argument must be one of the following strings:
21723
21724::
21725
21726      "fpexcept.ignore"
21727      "fpexcept.maytrap"
21728      "fpexcept.strict"
21729
If this argument is "fpexcept.ignore", optimization passes may assume that the
exception status flags will not be read and that floating-point exceptions will
be masked.  This allows transformations to be performed that may change the
exception semantics of the original code.  For example, FP operations may be
speculatively executed in this case, whereas they must not be for either of the
other possible values of this argument.
21736
If the exception behavior argument is "fpexcept.maytrap", optimization passes
must avoid transformations that may raise exceptions that would not have been
raised by the original code (such as speculatively executing FP operations), but
passes are not required to preserve all exceptions that are implied by the
original code.  For example, exceptions may be hidden by constant folding.
21743
If the exception behavior argument is "fpexcept.strict", all transformations
must strictly preserve the floating-point exception semantics of the original
code.  Any FP exception that would have been raised by the original code must be
raised by the transformed code, and the transformed code must not raise any FP
exceptions that would not have been raised by the original code.  This is the
exception behavior argument that will be used if the code being compiled reads
the FP exception status flags, but this mode can also be used with code that
unmasks FP exceptions.
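
The restriction on speculative execution can be illustrated with a guarded
division. This is a minimal sketch; the function ``@guarded_div`` and its
block names are hypothetical.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fdiv.f64(double, double, metadata, metadata)

      define double @guarded_div(double %a, double %b, i1 %do_div) #0 {
      entry:
        br i1 %do_div, label %div, label %done

      div:
        ; Under fpexcept.strict (and likewise fpexcept.maytrap) this call must
        ; not be hoisted into the entry block: doing so could raise an
        ; exception on a path where the original code raised none.
        %q = call double @llvm.experimental.constrained.fdiv.f64(
                             double %a, double %b,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict") #0
        br label %done

      done:
        %r = phi double [ %q, %div ], [ 0.0, %entry ]
        ret double %r
      }

      attributes #0 = { strictfp }

With "fpexcept.ignore" in place of "fpexcept.strict", the same call could be
executed speculatively.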
21752
The number and order of floating-point exceptions are NOT guaranteed.  For
example, a series of FP operations that each may raise exceptions may be
vectorized into a single instruction that raises each unique exception a single
time.
21757
21758Proper :ref:`function attributes <fnattrs>` usage is required for the
21759constrained intrinsics to function correctly.
21760
21761All function *calls* done in a function that uses constrained floating
21762point intrinsics must have the ``strictfp`` attribute.
21763
21764All function *definitions* that use constrained floating point intrinsics
21765must have the ``strictfp`` attribute.
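
Both requirements can be seen together in a small module. This is a minimal
sketch; ``@helper`` and ``@add_then_call`` are hypothetical names, and the
attribute group ``#0`` is just one way to attach ``strictfp``.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)
      declare double @helper(double)

      ; The definition carries strictfp because it uses a constrained intrinsic.
      define double @add_then_call(double %a, double %b) #0 {
        %sum = call double @llvm.experimental.constrained.fadd.f64(
                               double %a, double %b,
                               metadata !"round.dynamic",
                               metadata !"fpexcept.strict") #0
        ; Every call in the function carries strictfp, including calls to
        ; ordinary functions.
        %r = call double @helper(double %sum) #0
        ret double %r
      }

      attributes #0 = { strictfp }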
21766
21767'``llvm.experimental.constrained.fadd``' Intrinsic
21768^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21769
21770Syntax:
21771"""""""
21772
21773::
21774
21775      declare <type>
21776      @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
21777                                          metadata <rounding mode>,
21778                                          metadata <exception behavior>)
21779
21780Overview:
21781"""""""""
21782
21783The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
21784two operands.
21785
21786
21787Arguments:
21788""""""""""
21789
21790The first two arguments to the '``llvm.experimental.constrained.fadd``'
21791intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
21792of floating-point values. Both arguments must have identical types.
21793
21794The third and fourth arguments specify the rounding mode and exception
21795behavior as described above.
21796
21797Semantics:
21798""""""""""
21799
21800The value produced is the floating-point sum of the two value operands and has
21801the same type as the operands.
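
As an illustration, the intrinsic is overloaded on its operand type, so scalar
and vector forms are declared under different suffixes. This is a minimal
sketch; the function name ``@vadd`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.experimental.constrained.fadd.f32(float, float, metadata, metadata)
      declare <4 x float> @llvm.experimental.constrained.fadd.v4f32(<4 x float>, <4 x float>, metadata, metadata)

      ; Element-wise sum of two vectors, assuming round-to-nearest.
      define <4 x float> @vadd(<4 x float> %a, <4 x float> %b) #0 {
        %sum = call <4 x float> @llvm.experimental.constrained.fadd.v4f32(
                                    <4 x float> %a, <4 x float> %b,
                                    metadata !"round.tonearest",
                                    metadata !"fpexcept.maytrap") #0
        ret <4 x float> %sum
      }

      attributes #0 = { strictfp }

The ``fsub``, ``fmul``, ``fdiv`` and ``frem`` intrinsics take the same argument
list.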
21802
21803
21804'``llvm.experimental.constrained.fsub``' Intrinsic
21805^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21806
21807Syntax:
21808"""""""
21809
21810::
21811
21812      declare <type>
21813      @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
21814                                          metadata <rounding mode>,
21815                                          metadata <exception behavior>)
21816
21817Overview:
21818"""""""""
21819
21820The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
21821of its two operands.
21822
21823
21824Arguments:
21825""""""""""
21826
21827The first two arguments to the '``llvm.experimental.constrained.fsub``'
21828intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
21829of floating-point values. Both arguments must have identical types.
21830
21831The third and fourth arguments specify the rounding mode and exception
21832behavior as described above.
21833
21834Semantics:
21835""""""""""
21836
21837The value produced is the floating-point difference of the two value operands
21838and has the same type as the operands.
21839
21840
21841'``llvm.experimental.constrained.fmul``' Intrinsic
21842^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21843
21844Syntax:
21845"""""""
21846
21847::
21848
21849      declare <type>
21850      @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
21851                                          metadata <rounding mode>,
21852                                          metadata <exception behavior>)
21853
21854Overview:
21855"""""""""
21856
21857The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
21858its two operands.
21859
21860
21861Arguments:
21862""""""""""
21863
21864The first two arguments to the '``llvm.experimental.constrained.fmul``'
21865intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
21866of floating-point values. Both arguments must have identical types.
21867
21868The third and fourth arguments specify the rounding mode and exception
21869behavior as described above.
21870
21871Semantics:
21872""""""""""
21873
21874The value produced is the floating-point product of the two value operands and
21875has the same type as the operands.
21876
21877
21878'``llvm.experimental.constrained.fdiv``' Intrinsic
21879^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21880
21881Syntax:
21882"""""""
21883
21884::
21885
21886      declare <type>
21887      @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
21888                                          metadata <rounding mode>,
21889                                          metadata <exception behavior>)
21890
21891Overview:
21892"""""""""
21893
21894The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
21895its two operands.
21896
21897
21898Arguments:
21899""""""""""
21900
21901The first two arguments to the '``llvm.experimental.constrained.fdiv``'
21902intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
21903of floating-point values. Both arguments must have identical types.
21904
21905The third and fourth arguments specify the rounding mode and exception
21906behavior as described above.
21907
21908Semantics:
21909""""""""""
21910
21911The value produced is the floating-point quotient of the two value operands and
21912has the same type as the operands.
21913
21914
21915'``llvm.experimental.constrained.frem``' Intrinsic
21916^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21917
21918Syntax:
21919"""""""
21920
21921::
21922
21923      declare <type>
21924      @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
21925                                          metadata <rounding mode>,
21926                                          metadata <exception behavior>)
21927
21928Overview:
21929"""""""""
21930
21931The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
21932from the division of its two operands.
21933
21934
21935Arguments:
21936""""""""""
21937
21938The first two arguments to the '``llvm.experimental.constrained.frem``'
21939intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
21940of floating-point values. Both arguments must have identical types.
21941
21942The third and fourth arguments specify the rounding mode and exception
21943behavior as described above.  The rounding mode argument has no effect, since
21944the result of frem is never rounded, but the argument is included for
21945consistency with the other constrained floating-point intrinsics.
21946
21947Semantics:
21948""""""""""
21949
21950The value produced is the floating-point remainder from the division of the two
21951value operands and has the same type as the operands.  The remainder has the
21952same sign as the dividend.
21953
21954'``llvm.experimental.constrained.fma``' Intrinsic
21955^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21956
21957Syntax:
21958"""""""
21959
21960::
21961
      declare <type>
      @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)
21966
21967Overview:
21968"""""""""
21969
21970The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
21971fused-multiply-add operation on its operands.
21972
21973Arguments:
21974""""""""""
21975
21976The first three arguments to the '``llvm.experimental.constrained.fma``'
21977intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
21978<t_vector>` of floating-point values. All arguments must have identical types.
21979
21980The fourth and fifth arguments specify the rounding mode and exception behavior
21981as described above.
21982
21983Semantics:
21984""""""""""
21985
21986The result produced is the product of the first two operands added to the third
21987operand computed with infinite precision, and then rounded to the target
21988precision.
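
A minimal sketch of a call on ``double`` operands follows; the function name
``@fused`` is hypothetical.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fma.f64(double, double, double, metadata, metadata)

      define double @fused(double %a, double %b, double %c) #0 {
        ; Computes (%a * %b) + %c with a single rounding of the final result.
        %r = call double @llvm.experimental.constrained.fma.f64(
                             double %a, double %b, double %c,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict") #0
        ret double %r
      }

      attributes #0 = { strictfp }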
21989
21990'``llvm.experimental.constrained.fptoui``' Intrinsic
21991^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
21992
21993Syntax:
21994"""""""
21995
21996::
21997
      declare <ty2>
      @llvm.experimental.constrained.fptoui(<type> <value>,
                                            metadata <exception behavior>)
22001
22002Overview:
22003"""""""""
22004
22005The '``llvm.experimental.constrained.fptoui``' intrinsic converts a
22006floating-point ``value`` to its unsigned integer equivalent of type ``ty2``.
22007
22008Arguments:
22009""""""""""
22010
22011The first argument to the '``llvm.experimental.constrained.fptoui``'
22012intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
22013<t_vector>` of floating point values.
22014
22015The second argument specifies the exception behavior as described above.
22016
22017Semantics:
22018""""""""""
22019
22020The result produced is an unsigned integer converted from the floating
22021point operand. The value is truncated, so it is rounded towards zero.
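
Because the result and operand types differ, the overloaded name carries both,
result type first. This is a minimal sketch converting ``double`` to ``i32``;
the function name ``@to_unsigned`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.experimental.constrained.fptoui.i32.f64(double, metadata)

      define i32 @to_unsigned(double %x) #0 {
        ; No rounding mode argument: the conversion always truncates toward zero.
        %r = call i32 @llvm.experimental.constrained.fptoui.i32.f64(
                          double %x,
                          metadata !"fpexcept.strict") #0
        ret i32 %r
      }

      attributes #0 = { strictfp }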
22022
22023'``llvm.experimental.constrained.fptosi``' Intrinsic
22024^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22025
22026Syntax:
22027"""""""
22028
22029::
22030
      declare <ty2>
      @llvm.experimental.constrained.fptosi(<type> <value>,
                                            metadata <exception behavior>)
22034
22035Overview:
22036"""""""""
22037
The '``llvm.experimental.constrained.fptosi``' intrinsic converts a
floating-point ``value`` to its signed integer equivalent of type ``ty2``.
22040
22041Arguments:
22042""""""""""
22043
22044The first argument to the '``llvm.experimental.constrained.fptosi``'
22045intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
22046<t_vector>` of floating point values.
22047
22048The second argument specifies the exception behavior as described above.
22049
22050Semantics:
22051""""""""""
22052
22053The result produced is a signed integer converted from the floating
22054point operand. The value is truncated, so it is rounded towards zero.
22055
22056'``llvm.experimental.constrained.uitofp``' Intrinsic
22057^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22058
22059Syntax:
22060"""""""
22061
22062::
22063
      declare <ty2>
      @llvm.experimental.constrained.uitofp(<type> <value>,
                                            metadata <rounding mode>,
                                            metadata <exception behavior>)
22068
22069Overview:
22070"""""""""
22071
22072The '``llvm.experimental.constrained.uitofp``' intrinsic converts an
22073unsigned integer ``value`` to a floating-point of type ``ty2``.
22074
22075Arguments:
22076""""""""""
22077
22078The first argument to the '``llvm.experimental.constrained.uitofp``'
22079intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
22080<t_vector>` of integer values.
22081
22082The second and third arguments specify the rounding mode and exception
22083behavior as described above.
22084
22085Semantics:
22086""""""""""
22087
22088An inexact floating-point exception will be raised if rounding is required.
22089Any result produced is a floating point value converted from the input
22090integer operand.
22091
22092'``llvm.experimental.constrained.sitofp``' Intrinsic
22093^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22094
22095Syntax:
22096"""""""
22097
22098::
22099
      declare <ty2>
      @llvm.experimental.constrained.sitofp(<type> <value>,
                                            metadata <rounding mode>,
                                            metadata <exception behavior>)
22104
22105Overview:
22106"""""""""
22107
22108The '``llvm.experimental.constrained.sitofp``' intrinsic converts a
22109signed integer ``value`` to a floating-point of type ``ty2``.
22110
22111Arguments:
22112""""""""""
22113
22114The first argument to the '``llvm.experimental.constrained.sitofp``'
22115intrinsic must be an :ref:`integer <t_integer>` or :ref:`vector
22116<t_vector>` of integer values.
22117
22118The second and third arguments specify the rounding mode and exception
22119behavior as described above.
22120
22121Semantics:
22122""""""""""
22123
22124An inexact floating-point exception will be raised if rounding is required.
22125Any result produced is a floating point value converted from the input
22126integer operand.
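
A minimal sketch converting ``i64`` to ``float`` follows; the function name
``@to_float`` is hypothetical. The same shape applies to ``uitofp``.

.. code-block:: llvm

      declare float @llvm.experimental.constrained.sitofp.f32.i64(i64, metadata, metadata)

      define float @to_float(i64 %n) #0 {
        ; float has a 24-bit significand, so an i64 input that needs more
        ; significant bits is rounded and the inexact exception is raised.
        %r = call float @llvm.experimental.constrained.sitofp.f32.i64(
                            i64 %n,
                            metadata !"round.dynamic",
                            metadata !"fpexcept.strict") #0
        ret float %r
      }

      attributes #0 = { strictfp }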
22127
22128'``llvm.experimental.constrained.fptrunc``' Intrinsic
22129^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22130
22131Syntax:
22132"""""""
22133
22134::
22135
      declare <ty2>
      @llvm.experimental.constrained.fptrunc(<type> <value>,
                                             metadata <rounding mode>,
                                             metadata <exception behavior>)
22140
22141Overview:
22142"""""""""
22143
22144The '``llvm.experimental.constrained.fptrunc``' intrinsic truncates ``value``
22145to type ``ty2``.
22146
22147Arguments:
22148""""""""""
22149
22150The first argument to the '``llvm.experimental.constrained.fptrunc``'
22151intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
22152<t_vector>` of floating point values. This argument must be larger in size
22153than the result.
22154
22155The second and third arguments specify the rounding mode and exception
22156behavior as described above.
22157
22158Semantics:
22159""""""""""
22160
22161The result produced is a floating point value truncated to be smaller in size
22162than the operand.
22163
22164'``llvm.experimental.constrained.fpext``' Intrinsic
22165^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22166
22167Syntax:
22168"""""""
22169
22170::
22171
      declare <ty2>
      @llvm.experimental.constrained.fpext(<type> <value>,
                                           metadata <exception behavior>)
22175
22176Overview:
22177"""""""""
22178
22179The '``llvm.experimental.constrained.fpext``' intrinsic extends a
22180floating-point ``value`` to a larger floating-point value.
22181
22182Arguments:
22183""""""""""
22184
22185The first argument to the '``llvm.experimental.constrained.fpext``'
22186intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
22187<t_vector>` of floating point values. This argument must be smaller in size
22188than the result.
22189
22190The second argument specifies the exception behavior as described above.
22191
22192Semantics:
22193""""""""""
22194
22195The result produced is a floating point value extended to be larger in size
22196than the operand. All restrictions that apply to the fpext instruction also
22197apply to this intrinsic.
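
The difference in argument lists between this intrinsic and
'``llvm.experimental.constrained.fptrunc``' can be seen side by side. This is a
minimal sketch; the function name ``@round_through_float`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.experimental.constrained.fptrunc.f32.f64(double, metadata, metadata)
      declare double @llvm.experimental.constrained.fpext.f64.f32(float, metadata)

      define double @round_through_float(double %x) #0 {
        ; Narrowing may round, so fptrunc takes a rounding mode argument.
        %narrow = call float @llvm.experimental.constrained.fptrunc.f32.f64(
                                 double %x,
                                 metadata !"round.dynamic",
                                 metadata !"fpexcept.strict") #0
        ; fpext takes only the exception behavior argument.
        %wide = call double @llvm.experimental.constrained.fpext.f64.f32(
                                float %narrow,
                                metadata !"fpexcept.strict") #0
        ret double %wide
      }

      attributes #0 = { strictfp }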
22198
22199'``llvm.experimental.constrained.fcmp``' and '``llvm.experimental.constrained.fcmps``' Intrinsics
22200^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22201
22202Syntax:
22203"""""""
22204
22205::
22206
22207      declare <ty2>
22208      @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
22209                                          metadata <condition code>,
22210                                          metadata <exception behavior>)
22211      declare <ty2>
22212      @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
22213                                           metadata <condition code>,
22214                                           metadata <exception behavior>)
22215
22216Overview:
22217"""""""""
22218
The '``llvm.experimental.constrained.fcmp``' and
'``llvm.experimental.constrained.fcmps``' intrinsics return a boolean
value or vector of boolean values based on comparison of their operands.
22222
22223If the operands are floating-point scalars, then the result type is a
22224boolean (:ref:`i1 <t_integer>`).
22225
22226If the operands are floating-point vectors, then the result type is a
22227vector of boolean with the same number of elements as the operands being
22228compared.
22229
22230The '``llvm.experimental.constrained.fcmp``' intrinsic performs a quiet
22231comparison operation while the '``llvm.experimental.constrained.fcmps``'
22232intrinsic performs a signaling comparison operation.
22233
22234Arguments:
22235""""""""""
22236
22237The first two arguments to the '``llvm.experimental.constrained.fcmp``'
22238and '``llvm.experimental.constrained.fcmps``' intrinsics must be
22239:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
22240of floating-point values. Both arguments must have identical types.
22241
22242The third argument is the condition code indicating the kind of comparison
22243to perform. It must be a metadata string with one of the following values:
22244
22245.. _fcmp_md_cc:
22246
22247- "``oeq``": ordered and equal
22248- "``ogt``": ordered and greater than
22249- "``oge``": ordered and greater than or equal
22250- "``olt``": ordered and less than
22251- "``ole``": ordered and less than or equal
22252- "``one``": ordered and not equal
22253- "``ord``": ordered (no nans)
22254- "``ueq``": unordered or equal
22255- "``ugt``": unordered or greater than
22256- "``uge``": unordered or greater than or equal
22257- "``ult``": unordered or less than
22258- "``ule``": unordered or less than or equal
22259- "``une``": unordered or not equal
22260- "``uno``": unordered (either nans)
22261
22262*Ordered* means that neither operand is a NAN while *unordered* means
22263that either operand may be a NAN.
22264
22265The fourth argument specifies the exception behavior as described above.
22266
22267Semantics:
22268""""""""""
22269
22270``op1`` and ``op2`` are compared according to the condition code given
22271as the third argument. If the operands are vectors, then the
22272vectors are compared element by element. Each comparison performed
22273always yields an :ref:`i1 <t_integer>` result, as follows:
22274
22275.. _fcmp_md_cc_sem:
22276
22277- "``oeq``": yields ``true`` if both operands are not a NAN and ``op1``
22278  is equal to ``op2``.
22279- "``ogt``": yields ``true`` if both operands are not a NAN and ``op1``
22280  is greater than ``op2``.
22281- "``oge``": yields ``true`` if both operands are not a NAN and ``op1``
22282  is greater than or equal to ``op2``.
22283- "``olt``": yields ``true`` if both operands are not a NAN and ``op1``
22284  is less than ``op2``.
22285- "``ole``": yields ``true`` if both operands are not a NAN and ``op1``
22286  is less than or equal to ``op2``.
22287- "``one``": yields ``true`` if both operands are not a NAN and ``op1``
22288  is not equal to ``op2``.
22289- "``ord``": yields ``true`` if both operands are not a NAN.
22290- "``ueq``": yields ``true`` if either operand is a NAN or ``op1`` is
22291  equal to ``op2``.
22292- "``ugt``": yields ``true`` if either operand is a NAN or ``op1`` is
22293  greater than ``op2``.
22294- "``uge``": yields ``true`` if either operand is a NAN or ``op1`` is
22295  greater than or equal to ``op2``.
22296- "``ult``": yields ``true`` if either operand is a NAN or ``op1`` is
22297  less than ``op2``.
22298- "``ule``": yields ``true`` if either operand is a NAN or ``op1`` is
22299  less than or equal to ``op2``.
22300- "``une``": yields ``true`` if either operand is a NAN or ``op1`` is
22301  not equal to ``op2``.
22302- "``uno``": yields ``true`` if either operand is a NAN.
22303
22304The quiet comparison operation performed by
22305'``llvm.experimental.constrained.fcmp``' will only raise an exception
22306if either operand is a SNAN.  The signaling comparison operation
22307performed by '``llvm.experimental.constrained.fcmps``' will raise an
22308exception if either operand is a NAN (QNAN or SNAN). Such an exception
22309does not preclude a result being produced (e.g. exception might only
22310set a flag), therefore the distinction between ordered and unordered
22311comparisons is also relevant for the
22312'``llvm.experimental.constrained.fcmps``' intrinsic.
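
The two intrinsics can be compared directly. This is a minimal sketch using
the "``olt``" condition code on ``double`` operands; the function name
``@ordered_less`` is hypothetical.

.. code-block:: llvm

      declare i1 @llvm.experimental.constrained.fcmp.f64(double, double, metadata, metadata)
      declare i1 @llvm.experimental.constrained.fcmps.f64(double, double, metadata, metadata)

      define i1 @ordered_less(double %a, double %b) #0 {
        ; Quiet comparison: raises an exception only if an operand is a SNAN.
        %q = call i1 @llvm.experimental.constrained.fcmp.f64(
                         double %a, double %b,
                         metadata !"olt",
                         metadata !"fpexcept.strict") #0
        ; Signaling comparison: raises an exception if an operand is any NAN.
        %s = call i1 @llvm.experimental.constrained.fcmps.f64(
                         double %a, double %b,
                         metadata !"olt",
                         metadata !"fpexcept.strict") #0
        ; "olt" is an ordered predicate, so both results are false when
        ; either operand is a NAN.
        %r = and i1 %q, %s
        ret i1 %r
      }

      attributes #0 = { strictfp }

The two calls produce the same ``i1`` value; they differ only in which
operands cause an exception to be raised.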
22313
22314'``llvm.experimental.constrained.fmuladd``' Intrinsic
22315^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22316
22317Syntax:
22318"""""""
22319
22320::
22321
22322      declare <type>
22323      @llvm.experimental.constrained.fmuladd(<type> <op1>, <type> <op2>,
22324                                             <type> <op3>,
22325                                             metadata <rounding mode>,
22326                                             metadata <exception behavior>)
22327
22328Overview:
22329"""""""""
22330
22331The '``llvm.experimental.constrained.fmuladd``' intrinsic represents
22332multiply-add expressions that can be fused if the code generator determines
22333that (a) the target instruction set has support for a fused operation,
22334and (b) that the fused operation is more efficient than the equivalent,
22335separate pair of mul and add instructions.
22336
22337Arguments:
22338""""""""""
22339
22340The first three arguments to the '``llvm.experimental.constrained.fmuladd``'
22341intrinsic must be floating-point or vector of floating-point values.
22342All three arguments must have identical types.
22343
22344The fourth and fifth arguments specify the rounding mode and exception behavior
22345as described above.
22346
22347Semantics:
22348""""""""""
22349
22350The expression:
22351
22352::
22353
22354      %0 = call float @llvm.experimental.constrained.fmuladd.f32(%a, %b, %c,
22355                                                                 metadata <rounding mode>,
22356                                                                 metadata <exception behavior>)
22357
22358is equivalent to the expression:
22359
22360::
22361
22362      %0 = call float @llvm.experimental.constrained.fmul.f32(%a, %b,
22363                                                              metadata <rounding mode>,
22364                                                              metadata <exception behavior>)
22365      %1 = call float @llvm.experimental.constrained.fadd.f32(%0, %c,
22366                                                              metadata <rounding mode>,
22367                                                              metadata <exception behavior>)
22368
22369except that it is unspecified whether rounding will be performed between the
22370multiplication and addition steps. Fusion is not guaranteed, even if the target
22371platform supports it.
22372If a fused multiply-add is required, the corresponding
22373:ref:`llvm.experimental.constrained.fma <int_fma>` intrinsic function should be
22374used instead.
Like '``llvm.experimental.constrained.fma.*``', this intrinsic never sets
``errno``.
22376
22377Constrained libm-equivalent Intrinsics
22378--------------------------------------
22379
22380In addition to the basic floating-point operations for which constrained
22381intrinsics are described above, there are constrained versions of various
22382operations which provide equivalent behavior to a corresponding libm function.
22383These intrinsics allow the precise behavior of these operations with respect to
22384rounding mode and exception behavior to be controlled.
22385
22386As with the basic constrained floating-point intrinsics, the rounding mode
22387and exception behavior arguments only control the behavior of the optimizer.
22388They do not change the runtime floating-point environment.
22389
22390
22391'``llvm.experimental.constrained.sqrt``' Intrinsic
22392^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22393
22394Syntax:
22395"""""""
22396
22397::
22398
22399      declare <type>
22400      @llvm.experimental.constrained.sqrt(<type> <op1>,
22401                                          metadata <rounding mode>,
22402                                          metadata <exception behavior>)
22403
22404Overview:
22405"""""""""
22406
22407The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
22408of the specified value, returning the same value as the libm '``sqrt``'
22409functions would, but without setting ``errno``.
22410
22411Arguments:
22412""""""""""
22413
22414The first argument and the return type are floating-point numbers of the same
22415type.
22416
22417The second and third arguments specify the rounding mode and exception
22418behavior as described above.
22419
22420Semantics:
22421""""""""""
22422
22423This function returns the nonnegative square root of the specified value.
22424If the value is less than negative zero, a floating-point exception occurs
22425and the return value is architecture specific.
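
A minimal sketch of a call on a ``double`` operand follows; the function name
``@root`` is hypothetical.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.sqrt.f64(double, metadata, metadata)

      define double @root(double %x) #0 {
        ; For %x less than negative zero an FP exception occurs; errno is not
        ; set.
        %r = call double @llvm.experimental.constrained.sqrt.f64(
                             double %x,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict") #0
        ret double %r
      }

      attributes #0 = { strictfp }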
22426
22427
22428'``llvm.experimental.constrained.pow``' Intrinsic
22429^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22430
22431Syntax:
22432"""""""
22433
22434::
22435
22436      declare <type>
22437      @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
22438                                         metadata <rounding mode>,
22439                                         metadata <exception behavior>)
22440
22441Overview:
22442"""""""""
22443
22444The '``llvm.experimental.constrained.pow``' intrinsic returns the first operand
22445raised to the (positive or negative) power specified by the second operand.
22446
22447Arguments:
22448""""""""""
22449
22450The first two arguments and the return value are floating-point numbers of the
22451same type.  The second argument specifies the power to which the first argument
22452should be raised.
22453
22454The third and fourth arguments specify the rounding mode and exception
22455behavior as described above.
22456
22457Semantics:
22458""""""""""
22459
This function returns the first operand raised to the power given by the
second operand, returning the same values as the libm ``pow`` functions
would, and handles error conditions in the same way.
22463
22464
22465'``llvm.experimental.constrained.powi``' Intrinsic
22466^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22467
22468Syntax:
22469"""""""
22470
22471::
22472
22473      declare <type>
22474      @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
22475                                          metadata <rounding mode>,
22476                                          metadata <exception behavior>)
22477
22478Overview:
22479"""""""""
22480
22481The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand
22482raised to the (positive or negative) power specified by the second operand. The
22483order of evaluation of multiplications is not defined. When a vector of
22484floating-point type is used, the second argument remains a scalar integer value.
22485
22486
22487Arguments:
22488""""""""""
22489
22490The first argument and the return value are floating-point numbers of the same
22491type.  The second argument is a 32-bit signed integer specifying the power to
22492which the first argument should be raised.
22493
22494The third and fourth arguments specify the rounding mode and exception
22495behavior as described above.
22496
22497Semantics:
22498""""""""""
22499
This function returns the first operand raised to the power given by the
second operand, with an unspecified sequence of rounding operations.
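
When the first operand is a vector, the exponent is still a single ``i32``
applied to every element. This is a minimal sketch; the function name
``@cube`` is hypothetical.

.. code-block:: llvm

      declare <4 x float> @llvm.experimental.constrained.powi.v4f32(<4 x float>, i32, metadata, metadata)

      define <4 x float> @cube(<4 x float> %v) #0 {
        ; Raises each element of %v to the third power.
        %r = call <4 x float> @llvm.experimental.constrained.powi.v4f32(
                                  <4 x float> %v, i32 3,
                                  metadata !"round.dynamic",
                                  metadata !"fpexcept.strict") #0
        ret <4 x float> %r
      }

      attributes #0 = { strictfp }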
22502
22503
22504'``llvm.experimental.constrained.sin``' Intrinsic
22505^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22506
22507Syntax:
22508"""""""
22509
22510::
22511
22512      declare <type>
22513      @llvm.experimental.constrained.sin(<type> <op1>,
22514                                         metadata <rounding mode>,
22515                                         metadata <exception behavior>)
22516
22517Overview:
22518"""""""""
22519
22520The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
22521first operand.
22522
22523Arguments:
22524""""""""""
22525
22526The first argument and the return type are floating-point numbers of the same
22527type.
22528
22529The second and third arguments specify the rounding mode and exception
22530behavior as described above.
22531
22532Semantics:
22533""""""""""
22534
22535This function returns the sine of the specified operand, returning the
22536same values as the libm ``sin`` functions would, and handles error
22537conditions in the same way.
22538
22539
22540'``llvm.experimental.constrained.cos``' Intrinsic
22541^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22542
22543Syntax:
22544"""""""
22545
22546::
22547
22548      declare <type>
22549      @llvm.experimental.constrained.cos(<type> <op1>,
22550                                         metadata <rounding mode>,
22551                                         metadata <exception behavior>)
22552
22553Overview:
22554"""""""""
22555
22556The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
22557first operand.
22558
22559Arguments:
22560""""""""""
22561
22562The first argument and the return type are floating-point numbers of the same
22563type.
22564
22565The second and third arguments specify the rounding mode and exception
22566behavior as described above.
22567
22568Semantics:
22569""""""""""
22570
22571This function returns the cosine of the specified operand, returning the
22572same values as the libm ``cos`` functions would, and handles error
22573conditions in the same way.
22574
22575
22576'``llvm.experimental.constrained.exp``' Intrinsic
22577^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22578
22579Syntax:
22580"""""""
22581
22582::
22583
22584      declare <type>
22585      @llvm.experimental.constrained.exp(<type> <op1>,
22586                                         metadata <rounding mode>,
22587                                         metadata <exception behavior>)
22588
22589Overview:
22590"""""""""
22591
22592The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
22593exponential of the specified value.
22594
22595Arguments:
22596""""""""""
22597
22598The first argument and the return value are floating-point numbers of the same
22599type.
22600
22601The second and third arguments specify the rounding mode and exception
22602behavior as described above.
22603
22604Semantics:
22605""""""""""
22606
22607This function returns the same values as the libm ``exp`` functions
22608would, and handles error conditions in the same way.
22609
22610
22611'``llvm.experimental.constrained.exp2``' Intrinsic
22612^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22613
22614Syntax:
22615"""""""
22616
22617::
22618
22619      declare <type>
22620      @llvm.experimental.constrained.exp2(<type> <op1>,
22621                                          metadata <rounding mode>,
22622                                          metadata <exception behavior>)
22623
22624Overview:
22625"""""""""
22626
22627The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
22628exponential of the specified value.
22629
22630
22631Arguments:
22632""""""""""
22633
22634The first argument and the return value are floating-point numbers of the same
22635type.
22636
22637The second and third arguments specify the rounding mode and exception
22638behavior as described above.
22639
22640Semantics:
22641""""""""""
22642
22643This function returns the same values as the libm ``exp2`` functions
22644would, and handles error conditions in the same way.
22645
22646
22647'``llvm.experimental.constrained.log``' Intrinsic
22648^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22649
22650Syntax:
22651"""""""
22652
22653::
22654
22655      declare <type>
22656      @llvm.experimental.constrained.log(<type> <op1>,
22657                                         metadata <rounding mode>,
22658                                         metadata <exception behavior>)
22659
22660Overview:
22661"""""""""
22662
22663The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
22664logarithm of the specified value.
22665
22666Arguments:
22667""""""""""
22668
22669The first argument and the return value are floating-point numbers of the same
22670type.
22671
22672The second and third arguments specify the rounding mode and exception
22673behavior as described above.
22674
22675
22676Semantics:
22677""""""""""
22678
22679This function returns the same values as the libm ``log`` functions
22680would, and handles error conditions in the same way.
22681
22682
22683'``llvm.experimental.constrained.log10``' Intrinsic
22684^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22685
22686Syntax:
22687"""""""
22688
22689::
22690
22691      declare <type>
22692      @llvm.experimental.constrained.log10(<type> <op1>,
22693                                           metadata <rounding mode>,
22694                                           metadata <exception behavior>)
22695
22696Overview:
22697"""""""""
22698
22699The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
22700logarithm of the specified value.
22701
22702Arguments:
22703""""""""""
22704
22705The first argument and the return value are floating-point numbers of the same
22706type.
22707
22708The second and third arguments specify the rounding mode and exception
22709behavior as described above.
22710
22711Semantics:
22712""""""""""
22713
22714This function returns the same values as the libm ``log10`` functions
22715would, and handles error conditions in the same way.
22716
22717
22718'``llvm.experimental.constrained.log2``' Intrinsic
22719^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22720
22721Syntax:
22722"""""""
22723
22724::
22725
22726      declare <type>
22727      @llvm.experimental.constrained.log2(<type> <op1>,
22728                                          metadata <rounding mode>,
22729                                          metadata <exception behavior>)
22730
22731Overview:
22732"""""""""
22733
22734The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
22735logarithm of the specified value.
22736
22737Arguments:
22738""""""""""
22739
22740The first argument and the return value are floating-point numbers of the same
22741type.
22742
22743The second and third arguments specify the rounding mode and exception
22744behavior as described above.
22745
22746Semantics:
22747""""""""""
22748
22749This function returns the same values as the libm ``log2`` functions
22750would, and handles error conditions in the same way.
22751
22752
22753'``llvm.experimental.constrained.rint``' Intrinsic
22754^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22755
22756Syntax:
22757"""""""
22758
22759::
22760
22761      declare <type>
22762      @llvm.experimental.constrained.rint(<type> <op1>,
22763                                          metadata <rounding mode>,
22764                                          metadata <exception behavior>)
22765
22766Overview:
22767"""""""""
22768
22769The '``llvm.experimental.constrained.rint``' intrinsic returns the first
22770operand rounded to the nearest integer. It may raise an inexact floating-point
22771exception if the operand is not an integer.
22772
22773Arguments:
22774""""""""""
22775
22776The first argument and the return value are floating-point numbers of the same
22777type.
22778
22779The second and third arguments specify the rounding mode and exception
22780behavior as described above.
22781
22782Semantics:
22783""""""""""
22784
22785This function returns the same values as the libm ``rint`` functions
22786would, and handles error conditions in the same way.  The rounding mode is
22787described, not determined, by the rounding mode argument.  The actual rounding
22788mode is determined by the runtime floating-point environment.  The rounding
22789mode argument is only intended as information to the compiler.
22790
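A minimal sketch of a call, assuming the ``float`` overload; the function name
``@rint_dynamic`` is hypothetical. The rounding mode metadata only informs the
optimizer and does not change the mode used at run time.

.. code-block:: llvm

  declare float @llvm.experimental.constrained.rint.f32(float, metadata, metadata)

  define float @rint_dynamic(float %x) strictfp {
    ; "round.dynamic" states that the rounding mode is unknown at compile
    ; time; the result follows whatever mode is active when the call executes.
    %r = call float @llvm.experimental.constrained.rint.f32(
                        float %x,
                        metadata !"round.dynamic",
                        metadata !"fpexcept.strict") strictfp
    ret float %r
  }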
22791
22792'``llvm.experimental.constrained.lrint``' Intrinsic
22793^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22794
22795Syntax:
22796"""""""
22797
22798::
22799
22800      declare <inttype>
22801      @llvm.experimental.constrained.lrint(<fptype> <op1>,
22802                                           metadata <rounding mode>,
22803                                           metadata <exception behavior>)
22804
22805Overview:
22806"""""""""
22807
22808The '``llvm.experimental.constrained.lrint``' intrinsic returns the first
22809operand rounded to the nearest integer. An inexact floating-point exception
22810will be raised if the operand is not an integer. An invalid exception is
22811raised if the result is too large to fit into a supported integer type,
22812and in this case the result is undefined.
22813
22814Arguments:
22815""""""""""
22816
22817The first argument is a floating-point number. The return value is an
22818integer type. Not all types are supported on all targets. The supported
22819types are the same as the ``llvm.lrint`` intrinsic and the ``lrint``
22820libm functions.
22821
22822The second and third arguments specify the rounding mode and exception
22823behavior as described above.
22824
22825Semantics:
22826""""""""""
22827
22828This function returns the same values as the libm ``lrint`` functions
22829would, and handles error conditions in the same way.
22830
22831The rounding mode is described, not determined, by the rounding mode
22832argument.  The actual rounding mode is determined by the runtime floating-point
22833environment.  The rounding mode argument is only intended as information
22834to the compiler.
22835
If the runtime floating-point environment is using the default rounding mode,
then the results will be the same as the ``llvm.lrint`` intrinsic.
22838
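The sketch below shows one possible call. It assumes that the ``i64`` result
and ``double`` operand combination is supported on the target; the function
name ``@lrint_strict`` is hypothetical.

.. code-block:: llvm

  declare i64 @llvm.experimental.constrained.lrint.i64.f64(double, metadata, metadata)

  ; The overload suffix lists the integer result type first, then the
  ; floating-point operand type.
  define i64 @lrint_strict(double %x) strictfp {
    %r = call i64 @llvm.experimental.constrained.lrint.i64.f64(
                      double %x,
                      metadata !"round.dynamic",
                      metadata !"fpexcept.strict") strictfp
    ret i64 %r
  }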
22839
22840'``llvm.experimental.constrained.llrint``' Intrinsic
22841^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22842
22843Syntax:
22844"""""""
22845
22846::
22847
22848      declare <inttype>
22849      @llvm.experimental.constrained.llrint(<fptype> <op1>,
22850                                            metadata <rounding mode>,
22851                                            metadata <exception behavior>)
22852
22853Overview:
22854"""""""""
22855
22856The '``llvm.experimental.constrained.llrint``' intrinsic returns the first
22857operand rounded to the nearest integer. An inexact floating-point exception
22858will be raised if the operand is not an integer. An invalid exception is
22859raised if the result is too large to fit into a supported integer type,
22860and in this case the result is undefined.
22861
22862Arguments:
22863""""""""""
22864
22865The first argument is a floating-point number. The return value is an
22866integer type. Not all types are supported on all targets. The supported
22867types are the same as the ``llvm.llrint`` intrinsic and the ``llrint``
22868libm functions.
22869
22870The second and third arguments specify the rounding mode and exception
22871behavior as described above.
22872
22873Semantics:
22874""""""""""
22875
22876This function returns the same values as the libm ``llrint`` functions
22877would, and handles error conditions in the same way.
22878
22879The rounding mode is described, not determined, by the rounding mode
22880argument.  The actual rounding mode is determined by the runtime floating-point
22881environment.  The rounding mode argument is only intended as information
22882to the compiler.
22883
If the runtime floating-point environment is using the default rounding mode,
then the results will be the same as the ``llvm.llrint`` intrinsic.
22886
22887
22888'``llvm.experimental.constrained.nearbyint``' Intrinsic
22889^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22890
22891Syntax:
22892"""""""
22893
22894::
22895
22896      declare <type>
22897      @llvm.experimental.constrained.nearbyint(<type> <op1>,
22898                                               metadata <rounding mode>,
22899                                               metadata <exception behavior>)
22900
22901Overview:
22902"""""""""
22903
22904The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
22905operand rounded to the nearest integer. It will not raise an inexact
22906floating-point exception if the operand is not an integer.
22907
22908
22909Arguments:
22910""""""""""
22911
22912The first argument and the return value are floating-point numbers of the same
22913type.
22914
22915The second and third arguments specify the rounding mode and exception
22916behavior as described above.
22917
22918Semantics:
22919""""""""""
22920
22921This function returns the same values as the libm ``nearbyint`` functions
22922would, and handles error conditions in the same way.  The rounding mode is
22923described, not determined, by the rounding mode argument.  The actual rounding
22924mode is determined by the runtime floating-point environment.  The rounding
22925mode argument is only intended as information to the compiler.
22926
22927
22928'``llvm.experimental.constrained.maxnum``' Intrinsic
22929^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22930
22931Syntax:
22932"""""""
22933
22934::
22935
22936      declare <type>
      @llvm.experimental.constrained.maxnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)
22939
22940Overview:
22941"""""""""
22942
22943The '``llvm.experimental.constrained.maxnum``' intrinsic returns the maximum
22944of the two arguments.
22945
22946Arguments:
22947""""""""""
22948
22949The first two arguments and the return value are floating-point numbers
22950of the same type.
22951
22952The third argument specifies the exception behavior as described above.
22953
22954Semantics:
22955""""""""""
22956
22957This function follows the IEEE-754 semantics for maxNum.
22958
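As a sketch, a call to the ``float`` overload might look as follows. Unlike the
intrinsics above, this one takes no rounding mode argument. The function name
``@max_strict`` is hypothetical.

.. code-block:: llvm

  declare float @llvm.experimental.constrained.maxnum.f32(float, float, metadata)

  define float @max_strict(float %a, float %b) strictfp {
    ; Only the exception behavior is passed as metadata.
    %r = call float @llvm.experimental.constrained.maxnum.f32(
                        float %a, float %b,
                        metadata !"fpexcept.strict") strictfp
    ret float %r
  }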
22959
22960'``llvm.experimental.constrained.minnum``' Intrinsic
22961^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22962
22963Syntax:
22964"""""""
22965
22966::
22967
22968      declare <type>
      @llvm.experimental.constrained.minnum(<type> <op1>, <type> <op2>,
                                            metadata <exception behavior>)
22971
22972Overview:
22973"""""""""
22974
22975The '``llvm.experimental.constrained.minnum``' intrinsic returns the minimum
22976of the two arguments.
22977
22978Arguments:
22979""""""""""
22980
22981The first two arguments and the return value are floating-point numbers
22982of the same type.
22983
22984The third argument specifies the exception behavior as described above.
22985
22986Semantics:
22987""""""""""
22988
22989This function follows the IEEE-754 semantics for minNum.
22990
22991
22992'``llvm.experimental.constrained.maximum``' Intrinsic
22993^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
22994
22995Syntax:
22996"""""""
22997
22998::
22999
23000      declare <type>
      @llvm.experimental.constrained.maximum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)
23003
23004Overview:
23005"""""""""
23006
23007The '``llvm.experimental.constrained.maximum``' intrinsic returns the maximum
23008of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
23009
23010Arguments:
23011""""""""""
23012
23013The first two arguments and the return value are floating-point numbers
23014of the same type.
23015
23016The third argument specifies the exception behavior as described above.
23017
23018Semantics:
23019""""""""""
23020
23021This function follows semantics specified in the draft of IEEE 754-2018.
23022
23023
23024'``llvm.experimental.constrained.minimum``' Intrinsic
23025^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23026
23027Syntax:
23028"""""""
23029
23030::
23031
23032      declare <type>
      @llvm.experimental.constrained.minimum(<type> <op1>, <type> <op2>,
                                             metadata <exception behavior>)
23035
23036Overview:
23037"""""""""
23038
23039The '``llvm.experimental.constrained.minimum``' intrinsic returns the minimum
23040of the two arguments, propagating NaNs and treating -0.0 as less than +0.0.
23041
23042Arguments:
23043""""""""""
23044
23045The first two arguments and the return value are floating-point numbers
23046of the same type.
23047
23048The third argument specifies the exception behavior as described above.
23049
23050Semantics:
23051""""""""""
23052
23053This function follows semantics specified in the draft of IEEE 754-2018.
23054
23055
23056'``llvm.experimental.constrained.ceil``' Intrinsic
23057^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23058
23059Syntax:
23060"""""""
23061
23062::
23063
23064      declare <type>
23065      @llvm.experimental.constrained.ceil(<type> <op1>,
23066                                          metadata <exception behavior>)
23067
23068Overview:
23069"""""""""
23070
23071The '``llvm.experimental.constrained.ceil``' intrinsic returns the ceiling of the
23072first operand.
23073
23074Arguments:
23075""""""""""
23076
23077The first argument and the return value are floating-point numbers of the same
23078type.
23079
23080The second argument specifies the exception behavior as described above.
23081
23082Semantics:
23083""""""""""
23084
23085This function returns the same values as the libm ``ceil`` functions
23086would and handles error conditions in the same way.
23087
23088
23089'``llvm.experimental.constrained.floor``' Intrinsic
23090^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23091
23092Syntax:
23093"""""""
23094
23095::
23096
23097      declare <type>
23098      @llvm.experimental.constrained.floor(<type> <op1>,
23099                                           metadata <exception behavior>)
23100
23101Overview:
23102"""""""""
23103
23104The '``llvm.experimental.constrained.floor``' intrinsic returns the floor of the
23105first operand.
23106
23107Arguments:
23108""""""""""
23109
23110The first argument and the return value are floating-point numbers of the same
23111type.
23112
23113The second argument specifies the exception behavior as described above.
23114
23115Semantics:
23116""""""""""
23117
23118This function returns the same values as the libm ``floor`` functions
23119would and handles error conditions in the same way.
23120
23121
23122'``llvm.experimental.constrained.round``' Intrinsic
23123^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23124
23125Syntax:
23126"""""""
23127
23128::
23129
23130      declare <type>
23131      @llvm.experimental.constrained.round(<type> <op1>,
23132                                           metadata <exception behavior>)
23133
23134Overview:
23135"""""""""
23136
23137The '``llvm.experimental.constrained.round``' intrinsic returns the first
23138operand rounded to the nearest integer.
23139
23140Arguments:
23141""""""""""
23142
23143The first argument and the return value are floating-point numbers of the same
23144type.
23145
23146The second argument specifies the exception behavior as described above.
23147
23148Semantics:
23149""""""""""
23150
23151This function returns the same values as the libm ``round`` functions
23152would and handles error conditions in the same way.
23153
23154
23155'``llvm.experimental.constrained.roundeven``' Intrinsic
23156^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23157
23158Syntax:
23159"""""""
23160
23161::
23162
23163      declare <type>
23164      @llvm.experimental.constrained.roundeven(<type> <op1>,
23165                                               metadata <exception behavior>)
23166
23167Overview:
23168"""""""""
23169
23170The '``llvm.experimental.constrained.roundeven``' intrinsic returns the first
23171operand rounded to the nearest integer in floating-point format, rounding
23172halfway cases to even (that is, to the nearest value that is an even integer),
23173regardless of the current rounding direction.
23174
23175Arguments:
23176""""""""""
23177
23178The first argument and the return value are floating-point numbers of the same
23179type.
23180
23181The second argument specifies the exception behavior as described above.
23182
23183Semantics:
23184""""""""""
23185
This function implements the IEEE-754 operation ``roundToIntegralTiesToEven``.
It also behaves in the same way as the C standard function ``roundeven`` and can
signal the invalid operation exception for a signaling NaN operand.
23189
23190
23191'``llvm.experimental.constrained.lround``' Intrinsic
23192^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23193
23194Syntax:
23195"""""""
23196
23197::
23198
23199      declare <inttype>
23200      @llvm.experimental.constrained.lround(<fptype> <op1>,
23201                                            metadata <exception behavior>)
23202
23203Overview:
23204"""""""""
23205
23206The '``llvm.experimental.constrained.lround``' intrinsic returns the first
23207operand rounded to the nearest integer with ties away from zero.  It will
23208raise an inexact floating-point exception if the operand is not an integer.
23209An invalid exception is raised if the result is too large to fit into a
23210supported integer type, and in this case the result is undefined.
23211
23212Arguments:
23213""""""""""
23214
23215The first argument is a floating-point number. The return value is an
23216integer type. Not all types are supported on all targets. The supported
23217types are the same as the ``llvm.lround`` intrinsic and the ``lround``
23218libm functions.
23219
23220The second argument specifies the exception behavior as described above.
23221
23222Semantics:
23223""""""""""
23224
23225This function returns the same values as the libm ``lround`` functions
23226would and handles error conditions in the same way.
23227
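A minimal sketch of a call, assuming the ``i64`` result and ``double`` operand
combination is supported on the target; the function name ``@lround_strict`` is
hypothetical. Because ties always round away from zero, only the exception
behavior is passed.

.. code-block:: llvm

  declare i64 @llvm.experimental.constrained.lround.i64.f64(double, metadata)

  define i64 @lround_strict(double %x) strictfp {
    %r = call i64 @llvm.experimental.constrained.lround.i64.f64(
                      double %x,
                      metadata !"fpexcept.strict") strictfp
    ret i64 %r
  }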
23228
23229'``llvm.experimental.constrained.llround``' Intrinsic
23230^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23231
23232Syntax:
23233"""""""
23234
23235::
23236
23237      declare <inttype>
23238      @llvm.experimental.constrained.llround(<fptype> <op1>,
23239                                             metadata <exception behavior>)
23240
23241Overview:
23242"""""""""
23243
23244The '``llvm.experimental.constrained.llround``' intrinsic returns the first
23245operand rounded to the nearest integer with ties away from zero. It will
23246raise an inexact floating-point exception if the operand is not an integer.
23247An invalid exception is raised if the result is too large to fit into a
23248supported integer type, and in this case the result is undefined.
23249
23250Arguments:
23251""""""""""
23252
23253The first argument is a floating-point number. The return value is an
23254integer type. Not all types are supported on all targets. The supported
23255types are the same as the ``llvm.llround`` intrinsic and the ``llround``
23256libm functions.
23257
23258The second argument specifies the exception behavior as described above.
23259
23260Semantics:
23261""""""""""
23262
23263This function returns the same values as the libm ``llround`` functions
23264would and handles error conditions in the same way.
23265
23266
23267'``llvm.experimental.constrained.trunc``' Intrinsic
23268^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23269
23270Syntax:
23271"""""""
23272
23273::
23274
23275      declare <type>
23276      @llvm.experimental.constrained.trunc(<type> <op1>,
23277                                           metadata <exception behavior>)
23278
23279Overview:
23280"""""""""
23281
23282The '``llvm.experimental.constrained.trunc``' intrinsic returns the first
23283operand rounded to the nearest integer not larger in magnitude than the
23284operand.
23285
23286Arguments:
23287""""""""""
23288
23289The first argument and the return value are floating-point numbers of the same
23290type.
23291
23292The second argument specifies the exception behavior as described above.
23293
23294Semantics:
23295""""""""""
23296
23297This function returns the same values as the libm ``trunc`` functions
23298would and handles error conditions in the same way.
23299
23300.. _int_experimental_noalias_scope_decl:
23301
23302'``llvm.experimental.noalias.scope.decl``' Intrinsic
23303^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23304
23305Syntax:
23306"""""""
23307
23308
23309::
23310
23311      declare void @llvm.experimental.noalias.scope.decl(metadata !id.scope.list)
23312
23313Overview:
23314"""""""""
23315
23316The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
23317noalias scope is declared. When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
23319the scope might need to be duplicated as well.
23320
23321
23322Arguments:
23323""""""""""
23324
23325The ``!id.scope.list`` argument is metadata that is a list of ``noalias``
23326metadata references. The format is identical to that required for ``noalias``
23327metadata. This list must have exactly one element.
23328
23329Semantics:
23330""""""""""
23331
23332The ``llvm.experimental.noalias.scope.decl`` intrinsic identifies where a
23333noalias scope is declared. When the intrinsic is duplicated, a decision must
also be made about the scope: depending on the reason for the duplication,
23335the scope might need to be duplicated as well.
23336
23337For example, when the intrinsic is used inside a loop body, and that loop is
23338unrolled, the associated noalias scope must also be duplicated. Otherwise, the
23339noalias property it signifies would spill across loop iterations, whereas it
23340was only valid within a single iteration.
23341
23342.. code-block:: llvm
23343
  ; This example shows two possible positions for noalias.decl and how they impact the semantics:
  ; If it is outside the loop (Version 1), then %a and %b are noalias across *all* iterations.
  ; If it is inside the loop (Version 2), then %a and %b are noalias only within *one* iteration.
  declare i1 @cond()

  define void @decl_in_loop(ptr %a.base, ptr %b.base) {
23348  entry:
23349    ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 1: noalias decl outside loop
23350    br label %loop
23351
23352  loop:
23353    %a = phi ptr [ %a.base, %entry ], [ %a.inc, %loop ]
23354    %b = phi ptr [ %b.base, %entry ], [ %b.inc, %loop ]
23355    ; call void @llvm.experimental.noalias.scope.decl(metadata !2) ; Version 2: noalias decl inside loop
23356    %val = load i8, ptr %a, !alias.scope !2
23357    store i8 %val, ptr %b, !noalias !2
23358    %a.inc = getelementptr inbounds i8, ptr %a, i64 1
23359    %b.inc = getelementptr inbounds i8, ptr %b, i64 1
23360    %cond = call i1 @cond()
23361    br i1 %cond, label %loop, label %exit
23362
23363  exit:
23364    ret void
23365  }
23366
23367  !0 = !{!0} ; domain
23368  !1 = !{!1, !0} ; scope
23369  !2 = !{!1} ; scope list
23370
Multiple calls to ``@llvm.experimental.noalias.scope.decl`` for the same scope
are possible, but one should never dominate another. Violations are pointed out
by the verifier as they indicate a problem in either a transformation pass or
the input.
23375
23376
23377Floating Point Environment Manipulation intrinsics
23378--------------------------------------------------
23379
These functions read or write the floating-point environment, such as the
rounding mode or the state of floating-point exceptions. Altering the
floating-point environment requires special care. See
:ref:`Floating Point Environment <floatenv>`.
23383
23384'``llvm.flt.rounds``' Intrinsic
23385^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23386
23387Syntax:
23388"""""""
23389
23390::
23391
23392      declare i32 @llvm.flt.rounds()
23393
23394Overview:
23395"""""""""
23396
23397The '``llvm.flt.rounds``' intrinsic reads the current rounding mode.
23398
23399Semantics:
23400""""""""""
23401
The '``llvm.flt.rounds``' intrinsic returns the current rounding mode.
The encoding of the returned values is the same as the result of
``FLT_ROUNDS``, specified by the C standard:
23405
23406::
23407
23408    0  - toward zero
23409    1  - to nearest, ties to even
23410    2  - toward positive infinity
23411    3  - toward negative infinity
23412    4  - to nearest, ties away from zero
23413
Other values may be used to represent additional rounding modes supported by a
target. These values are target-specific.
23416
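For illustration, the returned value can be compared against the encoding
listed above. The function name ``@is_round_to_nearest_even`` is hypothetical.

.. code-block:: llvm

  declare i32 @llvm.flt.rounds()

  ; Returns true when the current mode is "to nearest, ties to even" (value 1).
  define i1 @is_round_to_nearest_even() {
    %mode = call i32 @llvm.flt.rounds()
    %is.rne = icmp eq i32 %mode, 1
    ret i1 %is.rne
  }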
23417
23418'``llvm.set.rounding``' Intrinsic
23419^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23420
23421Syntax:
23422"""""""
23423
23424::
23425
23426      declare void @llvm.set.rounding(i32 <val>)
23427
23428Overview:
23429"""""""""
23430
The '``llvm.set.rounding``' intrinsic sets the current rounding mode.
23432
23433Arguments:
23434""""""""""
23435
The argument is the required rounding mode. The encoding of the rounding mode
is the same as that used by '``llvm.flt.rounds``'.
23438
23439Semantics:
23440""""""""""
23441
The '``llvm.set.rounding``' intrinsic sets the current rounding mode. It is
similar to the C library function ``fesetround``; however, this intrinsic does
not return any value and uses a platform-independent representation of IEEE
rounding modes.
23446
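One possible usage pattern, sketched under the assumption that the value read by
'``llvm.flt.rounds``' can be passed back unchanged, is to save the current
mode, switch modes for one operation, and then restore it. The function name
``@add_toward_zero`` is hypothetical.

.. code-block:: llvm

  declare i32 @llvm.flt.rounds()
  declare void @llvm.set.rounding(i32)
  declare double @llvm.experimental.constrained.fadd.f64(double, double, metadata, metadata)

  define double @add_toward_zero(double %a, double %b) strictfp {
    ; Save the caller's rounding mode.
    %saved = call i32 @llvm.flt.rounds() strictfp
    ; 0 selects "toward zero" in the encoding shared with llvm.flt.rounds.
    call void @llvm.set.rounding(i32 0) strictfp
    %sum = call double @llvm.experimental.constrained.fadd.f64(
                           double %a, double %b,
                           metadata !"round.dynamic",
                           metadata !"fpexcept.strict") strictfp
    ; Restore the saved mode before returning.
    call void @llvm.set.rounding(i32 %saved) strictfp
    ret double %sum
  }

The addition uses a constrained intrinsic with ``round.dynamic`` so that the
optimizer does not assume the default rounding mode for it.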
23447
23448Floating-Point Test Intrinsics
23449------------------------------
23450
23451These functions get properties of floating-point values.
23452
23453
23454'``llvm.is.fpclass``' Intrinsic
23455^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23456
23457Syntax:
23458"""""""
23459
23460::
23461
23462      declare i1 @llvm.is.fpclass(<fptype> <op>, i32 <test>)
23463      declare <N x i1> @llvm.is.fpclass(<vector-fptype> <op>, i32 <test>)
23464
23465Overview:
23466"""""""""
23467
23468The '``llvm.is.fpclass``' intrinsic returns a boolean value or vector of boolean
23469values depending on whether the first argument satisfies the test specified by
23470the second argument.
23471
23472If the first argument is a floating-point scalar, then the result type is a
23473boolean (:ref:`i1 <t_integer>`).
23474
If the first argument is a floating-point vector, then the result type is a
vector of booleans with the same number of elements as the first argument.
23477
23478Arguments:
23479""""""""""
23480
The first argument to the '``llvm.is.fpclass``' intrinsic must be a
:ref:`floating-point <t_floating>` value or a :ref:`vector <t_vector>`
of floating-point values.

The second argument specifies which tests to perform. It must be a compile-time
integer constant in which each bit selects a floating-point class:
23487
23488+-------+----------------------+
23489| Bit # | floating-point class |
23490+=======+======================+
23491| 0     | Signaling NaN        |
23492+-------+----------------------+
23493| 1     | Quiet NaN            |
23494+-------+----------------------+
23495| 2     | Negative infinity    |
23496+-------+----------------------+
23497| 3     | Negative normal      |
23498+-------+----------------------+
23499| 4     | Negative subnormal   |
23500+-------+----------------------+
23501| 5     | Negative zero        |
23502+-------+----------------------+
23503| 6     | Positive zero        |
23504+-------+----------------------+
23505| 7     | Positive subnormal   |
23506+-------+----------------------+
23507| 8     | Positive normal      |
23508+-------+----------------------+
23509| 9     | Positive infinity    |
23510+-------+----------------------+
23511
23512Semantics:
23513""""""""""
23514
The function checks whether ``op`` belongs to any of the floating-point classes
specified by ``test``. If ``op`` is a vector, the check is made element by
element. Each check yields an :ref:`i1 <t_integer>` result, which is ``true``
if the element value satisfies the specified test. The argument ``test`` is a
bit mask in which each bit selects a floating-point class to test. For example,
the value 0x108 tests for a normal value: bits 3 and 8 are set, so the function
returns ``true`` if ``op`` is a positive or negative normal value. The function
never raises floating-point exceptions.
23523
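The bit mask can be illustrated with the following sketch, which uses the
``double`` and ``<4 x float>`` overloads. The function names ``@is_nan`` and
``@is_normal_v4`` are hypothetical. The masks are written in decimal because
the ``0x`` prefix denotes floating-point constants in LLVM assembly.

.. code-block:: llvm

  declare i1 @llvm.is.fpclass.f64(double, i32)
  declare <4 x i1> @llvm.is.fpclass.v4f32(<4 x float>, i32)

  ; Bits 0 and 1 (mask 3): true if %x is a signaling or quiet NaN.
  define i1 @is_nan(double %x) {
    %r = call i1 @llvm.is.fpclass.f64(double %x, i32 3)
    ret i1 %r
  }

  ; Bits 3 and 8 (mask 0x108 = 264): element-wise test for normal values.
  define <4 x i1> @is_normal_v4(<4 x float> %v) {
    %r = call <4 x i1> @llvm.is.fpclass.v4f32(<4 x float> %v, i32 264)
    ret <4 x i1> %r
  }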
23524
23525General Intrinsics
23526------------------
23527
23528This class of intrinsics is designed to be generic and has no specific
23529purpose.
23530
23531'``llvm.var.annotation``' Intrinsic
23532^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23533
23534Syntax:
23535"""""""
23536
23537::
23538
23539      declare void @llvm.var.annotation(ptr <val>, ptr <str>, ptr <str>, i32  <int>)
23540
23541Overview:
23542"""""""""
23543
The '``llvm.var.annotation``' intrinsic annotates a local variable with an
arbitrary string.
23545
23546Arguments:
23547""""""""""
23548
23549The first argument is a pointer to a value, the second is a pointer to a
23550global string, the third is a pointer to a global string which is the
23551source file name, and the last argument is the line number.
23552
23553Semantics:
23554""""""""""
23555
23556This intrinsic allows annotation of local variables with arbitrary
23557strings. This can be useful for special purpose optimizations that want
23558to look for these annotations. These have no other defined use; they are
23559ignored by code generation and optimization.
23560
23561'``llvm.ptr.annotation.*``' Intrinsic
23562^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23563
23564Syntax:
23565"""""""
23566
23567This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
23568pointer to an integer of any width. *NOTE* you must specify an address space for
23569the pointer. The identifier for the default address space is the integer
23570'``0``'.
23571
23572::
23573
23574      declare ptr @llvm.ptr.annotation.p0(ptr <val>, ptr <str>, ptr <str>, i32 <int>)
23575      declare ptr @llvm.ptr.annotation.p1(ptr addrspace(1) <val>, ptr <str>, ptr <str>, i32 <int>)
23576
23577Overview:
23578"""""""""
23579
The '``llvm.ptr.annotation``' intrinsic annotates a pointer to an integer with
an arbitrary string.
23581
23582Arguments:
23583""""""""""
23584
23585The first argument is a pointer to an integer value of arbitrary bitwidth
23586(result of some expression), the second is a pointer to a global string, the
23587third is a pointer to a global string which is the source file name, and the
23588last argument is the line number. It returns the value of the first argument.
23589
23590Semantics:
23591""""""""""
23592
23593This intrinsic allows annotation of a pointer to an integer with arbitrary
23594strings. This can be useful for special purpose optimizations that want to look
23595for these annotations. These have no other defined use; transformations preserve
23596annotations on a best-effort basis but are allowed to replace the intrinsic with
23597its first argument without breaking semantics and the intrinsic is completely
23598dropped during instruction selection.
23599
23600'``llvm.annotation.*``' Intrinsic
23601^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23602
23603Syntax:
23604"""""""
23605
23606This is an overloaded intrinsic. You can use '``llvm.annotation``' on
23607any integer bit width.
23608
23609::
23610
      declare i8 @llvm.annotation.i8(i8 <val>, ptr <str>, ptr <str>, i32 <int>)
      declare i16 @llvm.annotation.i16(i16 <val>, ptr <str>, ptr <str>, i32 <int>)
      declare i32 @llvm.annotation.i32(i32 <val>, ptr <str>, ptr <str>, i32 <int>)
      declare i64 @llvm.annotation.i64(i64 <val>, ptr <str>, ptr <str>, i32 <int>)
      declare i256 @llvm.annotation.i256(i256 <val>, ptr <str>, ptr <str>, i32 <int>)
23616
23617Overview:
23618"""""""""
23619
23620The '``llvm.annotation``' intrinsic.
23621
23622Arguments:
23623""""""""""
23624
23625The first argument is an integer value (result of some expression), the
23626second is a pointer to a global string, the third is a pointer to a
23627global string which is the source file name, and the last argument is
23628the line number. It returns the value of the first argument.
23629
23630Semantics:
23631""""""""""
23632
This intrinsic allows annotations to be put on arbitrary expressions with
arbitrary strings. This can be useful for special purpose optimizations that
want to look for these annotations. These have no other defined use.
Transformations preserve annotations on a best-effort basis, but are allowed to
replace the intrinsic with its first argument without breaking semantics. The
intrinsic is completely dropped during instruction selection.
23639
23640'``llvm.codeview.annotation``' Intrinsic
23641^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23642
23643Syntax:
23644"""""""
23645
23646This annotation emits a label at its program point and an associated
23647``S_ANNOTATION`` codeview record with some additional string metadata. This is
23648used to implement MSVC's ``__annotation`` intrinsic. It is marked
23649``noduplicate``, so calls to this intrinsic prevent inlining and should be
23650considered expensive.
23651
23652::
23653
23654      declare void @llvm.codeview.annotation(metadata)
23655
23656Arguments:
23657""""""""""
23658
23659The argument should be an MDTuple containing any number of MDStrings.
23660
23661'``llvm.trap``' Intrinsic
23662^^^^^^^^^^^^^^^^^^^^^^^^^
23663
23664Syntax:
23665"""""""
23666
23667::
23668
23669      declare void @llvm.trap() cold noreturn nounwind
23670
23671Overview:
23672"""""""""
23673
23674The '``llvm.trap``' intrinsic.
23675
23676Arguments:
23677""""""""""
23678
23679None.
23680
23681Semantics:
23682""""""""""
23683
23684This intrinsic is lowered to the target dependent trap instruction. If
23685the target does not have a trap instruction, this intrinsic will be
23686lowered to a call of the ``abort()`` function.
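
As a minimal sketch of typical usage, the hypothetical function below traps
when a divisor is zero. The function and value names are illustrative only;
the ``unreachable`` after the call reflects the ``noreturn`` attribute in the
declaration above.

.. code-block:: llvm

      declare void @llvm.trap() cold noreturn nounwind

      ; Hypothetical helper: divide %a by %b, trapping if %b is zero.
      define i32 @checked_udiv(i32 %a, i32 %b) {
      entry:
        %is_zero = icmp eq i32 %b, 0
        br i1 %is_zero, label %fail, label %ok

      fail:
        ; Control never returns from the trap.
        call void @llvm.trap()
        unreachable

      ok:
        %q = udiv i32 %a, %b
        ret i32 %q
      }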
23687
23688'``llvm.debugtrap``' Intrinsic
23689^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23690
23691Syntax:
23692"""""""
23693
23694::
23695
23696      declare void @llvm.debugtrap() nounwind
23697
23698Overview:
23699"""""""""
23700
23701The '``llvm.debugtrap``' intrinsic.
23702
23703Arguments:
23704""""""""""
23705
23706None.
23707
23708Semantics:
23709""""""""""
23710
23711This intrinsic is lowered to code which is intended to cause an
23712execution trap with the intention of requesting the attention of a
23713debugger.
23714
23715'``llvm.ubsantrap``' Intrinsic
23716^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23717
23718Syntax:
23719"""""""
23720
23721::
23722
23723      declare void @llvm.ubsantrap(i8 immarg) cold noreturn nounwind
23724
23725Overview:
23726"""""""""
23727
23728The '``llvm.ubsantrap``' intrinsic.
23729
23730Arguments:
23731""""""""""
23732
23733An integer describing the kind of failure detected.
23734
23735Semantics:
23736""""""""""
23737
This intrinsic is lowered to code which is intended to cause an execution trap,
embedding the argument into the encoding of that trap where possible, so that
crashes can be discriminated.
23741
23742Equivalent to ``@llvm.trap`` for targets that do not support this behaviour.
23743
23744'``llvm.stackprotector``' Intrinsic
23745^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23746
23747Syntax:
23748"""""""
23749
23750::
23751
23752      declare void @llvm.stackprotector(ptr <guard>, ptr <slot>)
23753
23754Overview:
23755"""""""""
23756
23757The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
23758onto the stack at ``slot``. The stack slot is adjusted to ensure that it
23759is placed on the stack before local variables.
23760
23761Arguments:
23762""""""""""
23763
The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.
23768
23769Semantics:
23770""""""""""
23771
23772This intrinsic causes the prologue/epilogue inserter to force the position of
23773the ``AllocaInst`` stack slot to be before local variables on the stack. This is
23774to ensure that if a local variable on the stack is overwritten, it will destroy
23775the value of the guard. When the function exits, the guard on the stack is
23776checked against the original guard by ``llvm.stackprotectorcheck``. If they are
23777different, then ``llvm.stackprotectorcheck`` causes the program to abort by
23778calling the ``__stack_chk_fail()`` function.
23779
23780'``llvm.stackguard``' Intrinsic
23781^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23782
23783Syntax:
23784"""""""
23785
23786::
23787
23788      declare ptr @llvm.stackguard()
23789
23790Overview:
23791"""""""""
23792
23793The ``llvm.stackguard`` intrinsic returns the system stack guard value.
23794
It should not be generated by frontends, since it is only for internal usage.
This intrinsic exists because FastISel still supports the IR form of the stack
protector.
23798
23799Arguments:
23800""""""""""
23801
23802None.
23803
23804Semantics:
23805""""""""""
23806
23807On some platforms, the value returned by this intrinsic remains unchanged
23808between loads in the same thread. On other platforms, it returns the same
23809global variable value, if any, e.g. ``@__stack_chk_guard``.
23810
Currently some platforms have IR-level customized stack guard loading (e.g.
X86 Linux) that is not handled by ``llvm.stackguard()``, although it should be
in the future.
23814
23815'``llvm.objectsize``' Intrinsic
23816^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23817
23818Syntax:
23819"""""""
23820
23821::
23822
23823      declare i32 @llvm.objectsize.i32(ptr <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
23824      declare i64 @llvm.objectsize.i64(ptr <object>, i1 <min>, i1 <nullunknown>, i1 <dynamic>)
23825
23826Overview:
23827"""""""""
23828
The ``llvm.objectsize`` intrinsic is designed to provide information to the
optimizer to determine whether a) an operation (like memcpy) will overflow a
buffer that corresponds to an object, or b) a runtime check for overflow
isn't necessary. An object in this context means an allocation of a specific
class, structure, array, or other object.
23834
23835Arguments:
23836""""""""""
23837
23838The ``llvm.objectsize`` intrinsic takes four arguments. The first argument is a
23839pointer to or into the ``object``. The second argument determines whether
23840``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size is
23841unknown. The third argument controls how ``llvm.objectsize`` acts when ``null``
23842in address space 0 is used as its pointer argument. If it's ``false``,
23843``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
23844the ``null`` is in a non-zero address space or if ``true`` is given for the
23845third argument of ``llvm.objectsize``, we assume its size is unknown. The fourth
23846argument to ``llvm.objectsize`` determines if the value should be evaluated at
23847runtime.
23848
23849The second, third, and fourth arguments only accept constants.
23850
23851Semantics:
23852""""""""""
23853
23854The ``llvm.objectsize`` intrinsic is lowered to a value representing the size of
23855the object concerned. If the size cannot be determined, ``llvm.objectsize``
23856returns ``i32/i64 -1 or 0`` (depending on the ``min`` argument).
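
For illustration, the sketch below queries the size of a local buffer. The
function name and the choice of flag values are assumptions made for this
example. Because the ``alloca`` is visible, an optimizer would typically be
able to fold the call to the constant ``16``.

.. code-block:: llvm

      declare i64 @llvm.objectsize.i64(ptr, i1, i1, i1)

      ; Hypothetical example: ask for the size of a 16-byte stack object.
      define i64 @size_of_local() {
        %buf = alloca [16 x i8]
        ; min = false:        report -1 if the size is unknown
        ; nullunknown = true: treat a null pointer as "size unknown"
        ; dynamic = false:    do not evaluate the size at runtime
        %size = call i64 @llvm.objectsize.i64(ptr %buf, i1 false, i1 true, i1 false)
        ret i64 %size
      }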
23857
23858'``llvm.expect``' Intrinsic
23859^^^^^^^^^^^^^^^^^^^^^^^^^^^
23860
23861Syntax:
23862"""""""
23863
23864This is an overloaded intrinsic. You can use ``llvm.expect`` on any
23865integer bit width.
23866
23867::
23868
23869      declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
23870      declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
23871      declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)
23872
23873Overview:
23874"""""""""
23875
The ``llvm.expect`` intrinsic provides information about the expected (most
probable) value of ``val``, which can be used by optimizers.
23878
23879Arguments:
23880""""""""""
23881
23882The ``llvm.expect`` intrinsic takes two arguments. The first argument is
23883a value. The second argument is an expected value.
23884
23885Semantics:
23886""""""""""
23887
23888This intrinsic is lowered to the ``val``.
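
A minimal sketch of how a frontend might use this intrinsic to mark an error
branch as unlikely is shown below. The function and label names are
hypothetical.

.. code-block:: llvm

      declare i1 @llvm.expect.i1(i1, i1)

      ; Hypothetical example: negative inputs are the rare error case.
      define i32 @handle(i32 %x) {
      entry:
        %is_err = icmp slt i32 %x, 0
        ; Hint that %is_err is expected to be false.
        %hinted = call i1 @llvm.expect.i1(i1 %is_err, i1 false)
        br i1 %hinted, label %error, label %normal

      error:
        ret i32 -1

      normal:
        ret i32 %x
      }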
23889
23890'``llvm.expect.with.probability``' Intrinsic
23891^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23892
23893Syntax:
23894"""""""
23895
23896This intrinsic is similar to ``llvm.expect``. This is an overloaded intrinsic.
23897You can use ``llvm.expect.with.probability`` on any integer bit width.
23898
23899::
23900
23901      declare i1 @llvm.expect.with.probability.i1(i1 <val>, i1 <expected_val>, double <prob>)
23902      declare i32 @llvm.expect.with.probability.i32(i32 <val>, i32 <expected_val>, double <prob>)
23903      declare i64 @llvm.expect.with.probability.i64(i64 <val>, i64 <expected_val>, double <prob>)
23904
23905Overview:
23906"""""""""
23907
The ``llvm.expect.with.probability`` intrinsic provides information about the
expected value of ``val`` with probability (or confidence) ``prob``, which can
be used by optimizers.
23911
23912Arguments:
23913""""""""""
23914
23915The ``llvm.expect.with.probability`` intrinsic takes three arguments. The first
23916argument is a value. The second argument is an expected value. The third
23917argument is a probability.
23918
23919Semantics:
23920""""""""""
23921
23922This intrinsic is lowered to the ``val``.
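
As a sketch, the same kind of hint as with ``llvm.expect`` can be given with
an explicit confidence. The value ``0.9`` and all names below are chosen
purely for illustration.

.. code-block:: llvm

      declare i1 @llvm.expect.with.probability.i1(i1, i1, double)

      ; Hypothetical example: %is_err is expected to be false about 90% of the time.
      define i32 @handle_with_prob(i32 %x) {
      entry:
        %is_err = icmp slt i32 %x, 0
        %hinted = call i1 @llvm.expect.with.probability.i1(i1 %is_err, i1 false, double 9.000000e-01)
        br i1 %hinted, label %error, label %normal

      error:
        ret i32 -1

      normal:
        ret i32 %x
      }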
23923
23924.. _int_assume:
23925
23926'``llvm.assume``' Intrinsic
23927^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23928
23929Syntax:
23930"""""""
23931
23932::
23933
23934      declare void @llvm.assume(i1 %cond)
23935
23936Overview:
23937"""""""""
23938
The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
condition is true. This information can then be used in simplifying other parts
of the code.
23942
23943More complex assumptions can be encoded as
23944:ref:`assume operand bundles <assume_opbundles>`.
23945
23946Arguments:
23947""""""""""
23948
23949The argument of the call is the condition which the optimizer may assume is
23950always true.
23951
23952Semantics:
23953""""""""""
23954
23955The intrinsic allows the optimizer to assume that the provided condition is
23956always true whenever the control flow reaches the intrinsic call. No code is
23957generated for this intrinsic, and instructions that contribute only to the
23958provided condition are not used for code generation. If the condition is
23959violated during execution, the behavior is undefined.
23960
23961Note that the optimizer might limit the transformations performed on values
23962used by the ``llvm.assume`` intrinsic in order to preserve the instructions
23963only used to form the intrinsic's input argument. This might prove undesirable
23964if the extra information provided by the ``llvm.assume`` intrinsic does not cause
23965sufficient overall improvement in code quality. For this reason,
23966``llvm.assume`` should not be used to document basic mathematical invariants
23967that the optimizer can otherwise deduce or facts that are of little use to the
23968optimizer.
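
The following sketch shows one way the intrinsic might be used. The function
is hypothetical, and the assumption that ``%n`` is positive stands in for a
fact the frontend knows but the optimizer cannot otherwise deduce.

.. code-block:: llvm

      declare void @llvm.assume(i1)

      ; Hypothetical example: the frontend knows %n is always positive here.
      define i32 @div_by_positive(i32 %a, i32 %n) {
        %positive = icmp sgt i32 %n, 0
        ; Behavior is undefined if %positive is false when this call is reached.
        call void @llvm.assume(i1 %positive)
        %q = sdiv i32 %a, %n
        ret i32 %q
      }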
23969
23970.. _int_ssa_copy:
23971
23972'``llvm.ssa.copy``' Intrinsic
23973^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
23974
23975Syntax:
23976"""""""
23977
23978::
23979
23980      declare type @llvm.ssa.copy(type %operand) returned(1) readnone
23981
23982Arguments:
23983""""""""""
23984
23985The first argument is an operand which is used as the returned value.
23986
23987Overview:
23988""""""""""
23989
23990The ``llvm.ssa.copy`` intrinsic can be used to attach information to
23991operations by copying them and giving them new names.  For example,
23992the PredicateInfo utility uses it to build Extended SSA form, and
23993attach various forms of information to operands that dominate specific
23994uses.  It is not meant for general use, only for building temporary
23995renaming forms that require value splits at certain points.
23996
23997.. _type.test:
23998
23999'``llvm.type.test``' Intrinsic
24000^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24001
24002Syntax:
24003"""""""
24004
24005::
24006
24007      declare i1 @llvm.type.test(ptr %ptr, metadata %type) nounwind readnone
24008
24009
24010Arguments:
24011""""""""""
24012
24013The first argument is a pointer to be tested. The second argument is a
24014metadata object representing a :doc:`type identifier <TypeMetadata>`.
24015
24016Overview:
24017"""""""""
24018
24019The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
24020with the given type identifier.
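
A minimal sketch of a checked indirect call is shown below. The type
identifier string ``"_ZTSFvvE"`` and the function name are assumptions made
for this example; the identifiers actually in use depend on the frontend and
on the type metadata attached to globals.

.. code-block:: llvm

      declare i1 @llvm.type.test(ptr, metadata)
      declare void @llvm.trap() cold noreturn nounwind

      ; Hypothetical example: only call %fn if it is associated with the
      ; (assumed) type identifier "_ZTSFvvE".
      define void @call_checked(ptr %fn) {
      entry:
        %ok = call i1 @llvm.type.test(ptr %fn, metadata !"_ZTSFvvE")
        br i1 %ok, label %cont, label %fail

      cont:
        call void %fn()
        ret void

      fail:
        call void @llvm.trap()
        unreachable
      }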
24021
24022.. _type.checked.load:
24023
24024'``llvm.type.checked.load``' Intrinsic
24025^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24026
24027Syntax:
24028"""""""
24029
24030::
24031
24032      declare {ptr, i1} @llvm.type.checked.load(ptr %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly
24033
24034
24035Arguments:
24036""""""""""
24037
24038The first argument is a pointer from which to load a function pointer. The
24039second argument is the byte offset from which to load the function pointer. The
24040third argument is a metadata object representing a :doc:`type identifier
24041<TypeMetadata>`.
24042
24043Overview:
24044"""""""""
24045
24046The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
24047virtual table pointer using type metadata. This intrinsic is used to implement
24048control flow integrity in conjunction with virtual call optimization. The
24049virtual call optimization pass will optimize away ``llvm.type.checked.load``
24050intrinsics associated with devirtualized calls, thereby removing the type
24051check in cases where it is not needed to enforce the control flow integrity
24052constraint.
24053
24054If the given pointer is associated with a type metadata identifier, this
24055function returns true as the second element of its return value. (Note that
24056the function may also return true if the given pointer is not associated
24057with a type metadata identifier.) If the function's return value's second
24058element is true, the following rules apply to the first element:
24059
24060- If the given pointer is associated with the given type metadata identifier,
24061  it is the function pointer loaded from the given byte offset from the given
24062  pointer.
24063
24064- If the given pointer is not associated with the given type metadata
24065  identifier, it is one of the following (the choice of which is unspecified):
24066
24067  1. The function pointer that would have been loaded from an arbitrarily chosen
24068     (through an unspecified mechanism) pointer associated with the type
24069     metadata.
24070
24071  2. If the function has a non-void return type, a pointer to a function that
24072     returns an unspecified value without causing side effects.
24073
24074If the function's return value's second element is false, the value of the
24075first element is undefined.
24076
24077
24078'``llvm.arithmetic.fence``' Intrinsic
24079^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24080
24081Syntax:
24082"""""""
24083
24084::
24085
      declare <type> @llvm.arithmetic.fence(<type> <op>)
24088
24089Overview:
24090"""""""""
24091
The purpose of the ``llvm.arithmetic.fence`` intrinsic is to prevent the
optimizer from performing fast-math optimizations, particularly reassociation,
between the argument and the expression that contains the argument. It can be
used to preserve the parentheses in the source language.
24097
24098Arguments:
24099""""""""""
24100
24101The ``llvm.arithmetic.fence`` intrinsic takes only one argument.
24102The argument and the return value are floating-point numbers,
24103or vector floating-point numbers, of the same type.
24104
24105Semantics:
24106""""""""""
24107
24108This intrinsic returns the value of its operand. The optimizer can optimize
24109the argument, but the optimizer cannot hoist any component of the operand
24110to the containing context, and the optimizer cannot move the calculation of
24111any expression in the containing context into the operand.
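
As an illustrative sketch, the function below computes ``(a + b) + c`` with
reassociation enabled on both additions, using the fence to keep the inner
sum intact. The ``.f32`` overload suffix and the function name are
assumptions made for this example.

.. code-block:: llvm

      declare float @llvm.arithmetic.fence.f32(float)

      ; Hypothetical example: preserve the parentheses in (a + b) + c.
      define float @keep_parens(float %a, float %b, float %c) {
        %ab = fadd reassoc float %a, %b
        ; The fence keeps %ab from being reassociated with the outer fadd.
        %fenced = call float @llvm.arithmetic.fence.f32(float %ab)
        %sum = fadd reassoc float %fenced, %c
        ret float %sum
      }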
24112
24113
24114'``llvm.donothing``' Intrinsic
24115^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24116
24117Syntax:
24118"""""""
24119
24120::
24121
24122      declare void @llvm.donothing() nounwind readnone
24123
24124Overview:
24125"""""""""
24126
24127The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
24128three intrinsics (besides ``llvm.experimental.patchpoint`` and
24129``llvm.experimental.gc.statepoint``) that can be called with an invoke
24130instruction.
24131
24132Arguments:
24133""""""""""
24134
24135None.
24136
24137Semantics:
24138""""""""""
24139
24140This intrinsic does nothing, and it's removed by optimizers and ignored
24141by codegen.
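
The sketch below shows the intrinsic used as the callee of an ``invoke``. The
personality function and the function name are placeholders chosen for this
example.

.. code-block:: llvm

      declare void @llvm.donothing() nounwind readnone
      declare i32 @__gxx_personality_v0(...)

      ; Hypothetical example: an invoke whose callee does nothing.
      define void @invoke_nothing() personality ptr @__gxx_personality_v0 {
      entry:
        invoke void @llvm.donothing()
                to label %cont unwind label %lpad

      cont:
        ret void

      lpad:
        %lp = landingpad { ptr, i32 }
                cleanup
        ret void
      }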
24142
24143'``llvm.experimental.deoptimize``' Intrinsic
24144^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24145
24146Syntax:
24147"""""""
24148
24149::
24150
24151      declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]
24152
24153Overview:
24154"""""""""
24155
This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express transfer of control and
frame-local state from the currently executing (typically more specialized,
hence faster) version of a function into another (typically more generic, hence
slower) version.
24161
24162In languages with a fully integrated managed runtime like Java and JavaScript
24163this intrinsic can be used to implement "uncommon trap" or "side exit" like
24164functionality.  In unmanaged languages like C and C++, this intrinsic can be
24165used to represent the slow paths of specialized functions.
24166
24167
24168Arguments:
24169""""""""""
24170
24171The intrinsic takes an arbitrary number of arguments, whose meaning is
24172decided by the :ref:`lowering strategy<deoptimize_lowering>`.
24173
24174Semantics:
24175""""""""""
24176
24177The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
24178deoptimization continuation (denoted using a :ref:`deoptimization
24179operand bundle <deopt_opbundles>`) and returns the value returned by
24180the deoptimization continuation.  Defining the semantic properties of
24181the continuation itself is out of scope of the language reference --
24182as far as LLVM is concerned, the deoptimization continuation can
24183invoke arbitrary side effects, including reading from and writing to
24184the entire heap.
24185
24186Deoptimization continuations expressed using ``"deopt"`` operand bundles always
24187continue execution to the end of the physical frame containing them, so all
24188calls to ``@llvm.experimental.deoptimize`` must be in "tail position":
24189
24190   - ``@llvm.experimental.deoptimize`` cannot be invoked.
24191   - The call must immediately precede a :ref:`ret <i_ret>` instruction.
24192   - The ``ret`` instruction must return the value produced by the
24193     ``@llvm.experimental.deoptimize`` call if there is one, or void.
24194
24195Note that the above restrictions imply that the return type for a call to
24196``@llvm.experimental.deoptimize`` will match the return type of its immediate
24197caller.
24198
24199The inliner composes the ``"deopt"`` continuations of the caller into the
24200``"deopt"`` continuations present in the inlinee, and also updates calls to this
24201intrinsic to return directly from the frame of the function it inlined into.
24202
24203All declarations of ``@llvm.experimental.deoptimize`` must share the
24204same calling convention.
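
As a sketch of the "tail position" rules above, the hypothetical function
below handles small inputs itself and deoptimizes otherwise. The ``.i32``
suffix, the function name, and the contents of the ``"deopt"`` bundle are
assumptions made for illustration.

.. code-block:: llvm

      declare i32 @llvm.experimental.deoptimize.i32(...)

      ; Hypothetical example: a specialized function with a deoptimizing slow path.
      define i32 @specialized(i32 %x) {
      entry:
        %fast = icmp ult i32 %x, 100
        br i1 %fast, label %fast_path, label %slow_path

      fast_path:
        %r = add i32 %x, 1
        ret i32 %r

      slow_path:
        ; The call immediately precedes a ret that returns its result.
        %v = call i32 (...) @llvm.experimental.deoptimize.i32(i32 %x) [ "deopt"(i32 %x) ]
        ret i32 %v
      }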
24205
24206.. _deoptimize_lowering:
24207
24208Lowering:
24209"""""""""
24210
24211Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
24212symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
24213ensure that this symbol is defined).  The call arguments to
24214``@llvm.experimental.deoptimize`` are lowered as if they were formal
24215arguments of the specified types, and not as varargs.
24216
24217
24218'``llvm.experimental.guard``' Intrinsic
24219^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24220
24221Syntax:
24222"""""""
24223
24224::
24225
24226      declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]
24227
24228Overview:
24229"""""""""
24230
24231This intrinsic, together with :ref:`deoptimization operand bundles
24232<deopt_opbundles>`, allows frontends to express guards or checks on
24233optimistic assumptions made during compilation.  The semantics of
24234``@llvm.experimental.guard`` is defined in terms of
24235``@llvm.experimental.deoptimize`` -- its body is defined to be
24236equivalent to:
24237
24238.. code-block:: text
24239
24240  define void @llvm.experimental.guard(i1 %pred, <args...>) {
24241    %realPred = and i1 %pred, undef
24242    br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]
24243
24244  leave:
24245    call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
24246    ret void
24247
24248  continue:
24249    ret void
24250  }
24251
24252
24253with the optional ``[, !make.implicit !{}]`` present if and only if it
24254is present on the call site.  For more details on ``!make.implicit``,
24255see :doc:`FaultMaps`.
24256
24257In words, ``@llvm.experimental.guard`` executes the attached
24258``"deopt"`` continuation if (but **not** only if) its first argument
24259is ``false``.  Since the optimizer is allowed to replace the ``undef``
24260with an arbitrary value, it can optimize guard to fail "spuriously",
24261i.e. without the original condition being false (hence the "not only
24262if"); and this allows for "check widening" type optimizations.
24263
24264``@llvm.experimental.guard`` cannot be invoked.
24265
24266After ``@llvm.experimental.guard`` was first added, a more general
24267formulation was found in ``@llvm.experimental.widenable.condition``.
24268Support for ``@llvm.experimental.guard`` is slowly being rephrased in
24269terms of this alternate.
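
For illustration, a call site guarding an array access might look like the
sketch below. The function name and the state recorded in the ``"deopt"``
bundle are hypothetical.

.. code-block:: llvm

      declare void @llvm.experimental.guard(i1, ...)

      ; Hypothetical example: deoptimize unless %i is within bounds.
      define i32 @load_checked(ptr %arr, i32 %len, i32 %i) {
        %in_bounds = icmp ult i32 %i, %len
        call void (i1, ...) @llvm.experimental.guard(i1 %in_bounds) [ "deopt"(i32 %i) ]
        ; Instructions below run only if the guard did not deoptimize.
        %idx = zext i32 %i to i64
        %addr = getelementptr inbounds i32, ptr %arr, i64 %idx
        %v = load i32, ptr %addr
        ret i32 %v
      }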
24270
24271'``llvm.experimental.widenable.condition``' Intrinsic
24272^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24273
24274Syntax:
24275"""""""
24276
24277::
24278
24279      declare i1 @llvm.experimental.widenable.condition()
24280
24281Overview:
24282"""""""""
24283
This intrinsic represents a "widenable condition", which is a
boolean expression with the following property: whether this
expression is `true` or `false`, the program is correct and
well-defined.
24288
24289Together with :ref:`deoptimization operand bundles <deopt_opbundles>`,
24290``@llvm.experimental.widenable.condition`` allows frontends to
24291express guards or checks on optimistic assumptions made during
24292compilation and represent them as branch instructions on special
24293conditions.
24294
24295While this may appear similar in semantics to `undef`, it is very
24296different in that an invocation produces a particular, singular
24297value. It is also intended to be lowered late, and remain available
24298for specific optimizations and transforms that can benefit from its
24299special properties.
24300
24301Arguments:
24302""""""""""
24303
24304None.
24305
24306Semantics:
24307""""""""""
24308
24309The intrinsic ``@llvm.experimental.widenable.condition()``
24310returns either `true` or `false`. For each evaluation of a call
24311to this intrinsic, the program must be valid and correct both if
24312it returns `true` and if it returns `false`. This allows
24313transformation passes to replace evaluations of this intrinsic
24314with either value whenever one is beneficial.
24315
When used in a branch condition, it allows us to choose between
two alternative correct solutions for the same problem, as
in the example below:

.. code-block:: text

    %cond = call i1 @llvm.experimental.widenable.condition()
    br i1 %cond, label %fast_path, label %slow_path

  fast_path:
    ; Apply memory-consuming but fast solution for a task.

  slow_path:
    ; Cheap in memory but slow solution.
24330
Whether the result of the intrinsic call is `true` or `false`,
it should be correct to pick either solution. We can switch
between them by replacing the result of
``@llvm.experimental.widenable.condition`` with different
`i1` expressions.
24336
24337This is how it can be used to represent guards as widenable branches:
24338
24339.. code-block:: text
24340
24341  block:
24342    ; Unguarded instructions
24343    call void @llvm.experimental.guard(i1 %cond, <args...>) ["deopt"(<deopt_args...>)]
24344    ; Guarded instructions
24345
The guard above can be expressed in an equivalent form as an explicit branch
using ``@llvm.experimental.widenable.condition``:
24348
24349.. code-block:: text
24350
24351  block:
24352    ; Unguarded instructions
24353    %widenable_condition = call i1 @llvm.experimental.widenable.condition()
24354    %guard_condition = and i1 %cond, %widenable_condition
24355    br i1 %guard_condition, label %guarded, label %deopt
24356
24357  guarded:
24358    ; Guarded instructions
24359
24360  deopt:
24361    call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ]
24362
24363So the block `guarded` is only reachable when `%cond` is `true`,
24364and it should be valid to go to the block `deopt` whenever `%cond`
24365is `true` or `false`.
24366
24367``@llvm.experimental.widenable.condition`` will never throw, thus
24368it cannot be invoked.
24369
24370Guard widening:
24371"""""""""""""""
24372
When ``@llvm.experimental.widenable.condition()`` is used in the
condition of a guard represented as an explicit branch, it is
legal to widen the guard's condition with any additional
conditions.
24377
24378Guard widening looks like replacement of
24379
24380.. code-block:: text
24381
24382  %widenable_cond = call i1 @llvm.experimental.widenable.condition()
24383  %guard_cond = and i1 %cond, %widenable_cond
24384  br i1 %guard_cond, label %guarded, label %deopt
24385
24386with
24387
24388.. code-block:: text
24389
24390  %widenable_cond = call i1 @llvm.experimental.widenable.condition()
24391  %new_cond = and i1 %any_other_cond, %widenable_cond
24392  %new_guard_cond = and i1 %cond, %new_cond
24393  br i1 %new_guard_cond, label %guarded, label %deopt
24394
for this branch. Here `%any_other_cond` is an arbitrarily chosen
well-defined `i1` value. By widening the guard, we may
impose stricter conditions on the `guarded` block and bail out to the
`deopt` block when the new condition is not met.
24399
24400Lowering:
24401"""""""""
24402
The default lowering strategy is to replace the result of a
call to ``@llvm.experimental.widenable.condition`` with the
constant `true`. However, it is always correct to replace
it with any other `i1` value. Any pass can
freely do so if it can benefit from non-default lowering.
24408
24409
24410'``llvm.load.relative``' Intrinsic
24411^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24412
24413Syntax:
24414"""""""
24415
24416::
24417
24418      declare ptr @llvm.load.relative.iN(ptr %ptr, iN %offset) argmemonly nounwind readonly
24419
24420Overview:
24421"""""""""
24422
24423This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
24424adds ``%ptr`` to that value and returns it. The constant folder specifically
24425recognizes the form of this intrinsic and the constant initializers it may
24426load from; if a loaded constant initializer is known to have the form
24427``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.
24428
24429LLVM provides that the calculation of such a constant initializer will
24430not overflow at link time under the medium code model if ``x`` is an
24431``unnamed_addr`` function. However, it does not provide this guarantee for
24432a constant initializer folded into a function body. This intrinsic can be
24433used to avoid the possibility of overflows when loading from such a constant.
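
A minimal sketch of a call is given below, using the ``i32`` offset overload.
The table layout (32-bit relative entries, with the wanted entry at byte
offset 4) and the function name are assumptions made for this example.

.. code-block:: llvm

      declare ptr @llvm.load.relative.i32(ptr, i32)

      ; Hypothetical example: fetch the second entry of a relative table.
      define ptr @get_entry(ptr %table) {
        ; Loads the 32-bit value at %table + 4 and returns %table plus that value.
        %fn = call ptr @llvm.load.relative.i32(ptr %table, i32 4)
        ret ptr %fn
      }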
24434
24435.. _llvm_sideeffect:
24436
24437'``llvm.sideeffect``' Intrinsic
24438^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24439
24440Syntax:
24441"""""""
24442
24443::
24444
24445      declare void @llvm.sideeffect() inaccessiblememonly nounwind willreturn
24446
24447Overview:
24448"""""""""
24449
24450The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
24451treat it as having side effects, so it can be inserted into a loop to
24452indicate that the loop shouldn't be assumed to terminate (which could
24453potentially lead to the loop being optimized away entirely), even if it's
24454an infinite loop with no other side effects.
24455
24456Arguments:
24457""""""""""
24458
24459None.
24460
24461Semantics:
24462""""""""""
24463
24464This intrinsic actually does nothing, but optimizers must assume that it
24465has externally observable side effects.
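
As a sketch, a frontend for a language in which infinite loops are
well-defined might emit the following. The function name is hypothetical.

.. code-block:: llvm

      declare void @llvm.sideeffect()

      ; Hypothetical example: an intentionally infinite loop.
      define void @spin_forever() {
      entry:
        br label %loop

      loop:
        ; Marks the loop body as having an observable side effect.
        call void @llvm.sideeffect()
        br label %loop
      }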
24466
24467'``llvm.is.constant.*``' Intrinsic
24468^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24469
24470Syntax:
24471"""""""
24472
This is an overloaded intrinsic. You can use ``llvm.is.constant`` with any
argument type.
24474
24475::
24476
24477      declare i1 @llvm.is.constant.i32(i32 %operand) nounwind readnone
24478      declare i1 @llvm.is.constant.f32(float %operand) nounwind readnone
24479      declare i1 @llvm.is.constant.TYPENAME(TYPE %operand) nounwind readnone
24480
24481Overview:
24482"""""""""
24483
24484The '``llvm.is.constant``' intrinsic will return true if the argument
24485is known to be a manifest compile-time constant. It is guaranteed to
24486fold to either true or false before generating machine code.
24487
24488Semantics:
24489""""""""""
24490
24491This intrinsic generates no code. If its argument is known to be a
24492manifest compile-time constant value, then the intrinsic will be
24493converted to a constant true value. Otherwise, it will be converted to
24494a constant false value.
24495
In particular, note that if the argument is a constant expression
which refers to a global (the address of which *is* a constant, but
not manifest during the compile), then the intrinsic evaluates to
false.
24500
24501The result also intentionally depends on the result of optimization
24502passes -- e.g., the result can change depending on whether a
24503function gets inlined or not. A function's parameters are
24504obviously not constant. However, a call like
24505``llvm.is.constant.i32(i32 %param)`` *can* return true after the
24506function is inlined, if the value passed to the function parameter was
24507a constant.
24508
24509On the other hand, if constant folding is not run, it will never
24510evaluate to true, even in simple cases.
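
The sketch below illustrates the inlining behavior described above. The
function is hypothetical: as written, the call folds to false, but after
inlining into a caller that passes a constant it may fold to true and select
the other path.

.. code-block:: llvm

      declare i1 @llvm.is.constant.i32(i32)

      ; Hypothetical example: pick a path depending on whether %n is a
      ; manifest constant once optimization has run.
      define i32 @scale(i32 %n) {
      entry:
        %known = call i1 @llvm.is.constant.i32(i32 %n)
        br i1 %known, label %const_path, label %general_path

      const_path:
        %a = shl i32 %n, 3
        ret i32 %a

      general_path:
        %b = mul i32 %n, 8
        ret i32 %b
      }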
24511
24512.. _int_ptrmask:
24513
24514'``llvm.ptrmask``' Intrinsic
24515^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24516
24517Syntax:
24518"""""""
24519
24520::
24521
      declare ptrty @llvm.ptrmask(ptrty %ptr, intty %mask) readnone speculatable
24523
24524Arguments:
24525""""""""""
24526
24527The first argument is a pointer. The second argument is an integer.
24528
24529Overview:
24530""""""""""
24531
24532The ``llvm.ptrmask`` intrinsic masks out bits of the pointer according to a mask.
24533This allows stripping data from tagged pointers without converting them to an
24534integer (ptrtoint/inttoptr). As a consequence, we can preserve more information
24535to facilitate alias analysis and underlying-object detection.
24536
24537Semantics:
24538""""""""""
24539
24540The result of ``ptrmask(ptr, mask)`` is equivalent to
24541``getelementptr ptr, (ptrtoint(ptr) & mask) - ptrtoint(ptr)``. Both the returned
24542pointer and the first argument are based on the same underlying object (for more
24543information on the *based on* terminology see
24544:ref:`the pointer aliasing rules <pointeraliasing>`). If the bitwidth of the
24545mask argument does not match the pointer size of the target, the mask is
24546zero-extended or truncated accordingly.
24547
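The following sketch shows one possible use, assuming a target with 64-bit
pointers and a tagging scheme that stores data in the three low bits of a
pointer. Both the scheme and the function name ``@strip_tag`` are assumptions
made for illustration, and the ``.p0.i64`` suffix is the overload mangling
assumed for an address-space-0 pointer with an ``i64`` mask:

.. code-block:: llvm

      declare ptr @llvm.ptrmask.p0.i64(ptr, i64)

      ; Hypothetical helper: clear the three low tag bits of %tagged.
      define ptr @strip_tag(ptr %tagged) {
        ; -8 is 0xFFFFFFFFFFFFFFF8, so every bit except the low three is kept.
        %clean = call ptr @llvm.ptrmask.p0.i64(ptr %tagged, i64 -8)
        ret ptr %clean
      }

The result is still based on the same underlying object as ``%tagged``, which
a ``ptrtoint``/``and``/``inttoptr`` sequence would not make explicit.
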
24548.. _int_vscale:
24549
24550'``llvm.vscale``' Intrinsic
24551^^^^^^^^^^^^^^^^^^^^^^^^^^^
24552
24553Syntax:
24554"""""""
24555
24556::
24557
      declare i32 @llvm.vscale.i32()
      declare i64 @llvm.vscale.i64()
24560
24561Overview:
24562"""""""""
24563
24564The ``llvm.vscale`` intrinsic returns the value for ``vscale`` in scalable
24565vectors such as ``<vscale x 16 x i8>``.
24566
24567Semantics:
24568""""""""""
24569
24570``vscale`` is a positive value that is constant throughout program
24571execution, but is unknown at compile time.
24572If the result value does not fit in the result type, then the result is
24573a :ref:`poison value <poisonvalues>`.
24574
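As an illustrative sketch (the function name ``@num_elements`` is
hypothetical), the runtime element count of a ``<vscale x 4 x i32>`` vector
could be computed by scaling the intrinsic's result by the vector's minimum
element count:

.. code-block:: llvm

      declare i64 @llvm.vscale.i64()

      ; Hypothetical helper: a <vscale x 4 x i32> vector holds vscale * 4 elements.
      define i64 @num_elements() {
        %vscale = call i64 @llvm.vscale.i64()
        %n = mul i64 %vscale, 4
        ret i64 %n
      }
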
24575
24576Stack Map Intrinsics
24577--------------------
24578
24579LLVM provides experimental intrinsics to support runtime patching
24580mechanisms commonly desired in dynamic language JITs. These intrinsics
24581are described in :doc:`StackMaps`.
24582
24583Element Wise Atomic Memory Intrinsics
24584-------------------------------------
24585
24586These intrinsics are similar to the standard library memory intrinsics except
24587that they perform memory transfer as a sequence of atomic memory accesses.
24588
24589.. _int_memcpy_element_unordered_atomic:
24590
24591'``llvm.memcpy.element.unordered.atomic``' Intrinsic
24592^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24593
24594Syntax:
24595"""""""
24596
24597This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
24598any integer bit width and for different address spaces. Not all targets
24599support all bit widths however.
24600
24601::
24602
24603      declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i32(ptr <dest>,
24604                                                                   ptr <src>,
24605                                                                   i32 <len>,
24606                                                                   i32 <element_size>)
24607      declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i64(ptr <dest>,
24608                                                                   ptr <src>,
24609                                                                   i64 <len>,
24610                                                                   i32 <element_size>)
24611
24612Overview:
24613"""""""""
24614
24615The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
24616'``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
24617as arrays with elements that are exactly ``element_size`` bytes, and the copy between
24618buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
24619that are a positive integer multiple of the ``element_size`` in size.
24620
24621Arguments:
24622""""""""""
24623
24624The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
24625intrinsic, with the added constraint that ``len`` is required to be a positive integer
24626multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
24627``element_size``, then the behaviour of the intrinsic is undefined.
24628
``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

For each of the input pointers, the ``align`` parameter attribute must be specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
both the source and destination pointers are aligned to that boundary.
24635
24636Semantics:
24637""""""""""
24638
24639The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
24640memory from the source location to the destination location. These locations are not
24641allowed to overlap. The memory copy is performed as a sequence of load/store operations
24642where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
24643aligned at an ``element_size`` boundary.
24644
24645The order of the copy is unspecified. The same value may be read from the source
24646buffer many times, but only one write is issued to the destination buffer per
24647element. It is well defined to have concurrent reads and writes to both source and
24648destination provided those reads and writes are unordered atomic when specified.
24649
24650This intrinsic does not provide any additional ordering guarantees over those
24651provided by a set of unordered loads from the source location and stores to the
24652destination.
24653
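A call that satisfies the constraints above might look like the following
sketch. The function name ``@copy_words`` and the sizes are assumptions chosen
for illustration, and a 4-byte element is assumed to be within the target's
atomic access size limit:

.. code-block:: llvm

      declare void @llvm.memcpy.element.unordered.atomic.p0.p0.i32(ptr, ptr, i32, i32)

      ; Hypothetical helper: copy 64 bytes as sixteen 4-byte elements.
      define void @copy_words(ptr %dst, ptr %src) {
        ; Both pointers carry the required align attribute (>= element_size),
        ; and len (64) is a positive integer multiple of element_size (4).
        call void @llvm.memcpy.element.unordered.atomic.p0.p0.i32(
            ptr align 4 %dst, ptr align 4 %src, i32 64, i32 4)
        ret void
      }
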
24654Lowering:
24655"""""""""
24656
In the most general case, a call to '``llvm.memcpy.element.unordered.atomic.*``' is
lowered to a call to the symbol ``__llvm_memcpy_element_unordered_atomic_*``, where '*'
is replaced with the actual element size. See :ref:`RewriteStatepointsForGC intrinsic
lowering <RewriteStatepointsForGC_intrinsic_lowering>` for details on GC-specific
lowering.

The optimizer is allowed to inline the memory copy when it is profitable to do so.
24664
24665'``llvm.memmove.element.unordered.atomic``' Intrinsic
24666^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24667
24668Syntax:
24669"""""""
24670
24671This is an overloaded intrinsic. You can use
24672``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
24673different address spaces. Not all targets support all bit widths however.
24674
24675::
24676
24677      declare void @llvm.memmove.element.unordered.atomic.p0.p0.i32(ptr <dest>,
24678                                                                    ptr <src>,
24679                                                                    i32 <len>,
24680                                                                    i32 <element_size>)
24681      declare void @llvm.memmove.element.unordered.atomic.p0.p0.i64(ptr <dest>,
24682                                                                    ptr <src>,
24683                                                                    i64 <len>,
24684                                                                    i32 <element_size>)
24685
24686Overview:
24687"""""""""
24688
24689The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
24690of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
24691``src`` are treated as arrays with elements that are exactly ``element_size``
24692bytes, and the copy between buffers uses a sequence of
24693:ref:`unordered atomic <ordering>` load/store operations that are a positive
24694integer multiple of the ``element_size`` in size.
24695
24696Arguments:
24697""""""""""
24698
24699The first three arguments are the same as they are in the
24700:ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
24701``len`` is required to be a positive integer multiple of the ``element_size``.
24702If ``len`` is not a positive integer multiple of ``element_size``, then the
24703behaviour of the intrinsic is undefined.
24704
24705``element_size`` must be a compile-time constant positive power of two no
24706greater than a target-specific atomic access size limit.
24707
For each of the input pointers, the ``align`` parameter attribute must be
specified. It must be a power of two no less than the ``element_size``. The
caller guarantees that both the source and destination pointers are aligned to
that boundary.
24712
24713Semantics:
24714""""""""""
24715
24716The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
24717of memory from the source location to the destination location. These locations
24718are allowed to overlap. The memory copy is performed as a sequence of load/store
24719operations where each access is guaranteed to be a multiple of ``element_size``
24720bytes wide and aligned at an ``element_size`` boundary.
24721
24722The order of the copy is unspecified. The same value may be read from the source
24723buffer many times, but only one write is issued to the destination buffer per
24724element. It is well defined to have concurrent reads and writes to both source
24725and destination provided those reads and writes are unordered atomic when
24726specified.
24727
24728This intrinsic does not provide any additional ordering guarantees over those
24729provided by a set of unordered loads from the source location and stores to the
24730destination.
24731
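Because overlap is permitted, the intrinsic can shift data within a single
buffer. The sketch below assumes a buffer that the caller guarantees to be
8-byte aligned and a target that supports 8-byte atomic accesses; the function
name ``@shift_down`` is hypothetical:

.. code-block:: llvm

      declare void @llvm.memmove.element.unordered.atomic.p0.p0.i64(ptr, ptr, i64, i32)

      ; Hypothetical helper: move 32 bytes from %buf + 8 down to %buf.
      define void @shift_down(ptr %buf) {
        ; The source and destination ranges overlap by 24 bytes.
        %src = getelementptr inbounds i8, ptr %buf, i64 8
        call void @llvm.memmove.element.unordered.atomic.p0.p0.i64(
            ptr align 8 %buf, ptr align 8 %src, i64 32, i32 8)
        ret void
      }
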
24732Lowering:
24733"""""""""
24734
In the most general case, a call to
'``llvm.memmove.element.unordered.atomic.*``' is lowered to a call to the symbol
``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced with the
actual element size. See :ref:`RewriteStatepointsForGC intrinsic lowering
<RewriteStatepointsForGC_intrinsic_lowering>` for details on GC-specific
lowering.

The optimizer is allowed to inline the memory copy when it is profitable to do so.
24743
24744.. _int_memset_element_unordered_atomic:
24745
24746'``llvm.memset.element.unordered.atomic``' Intrinsic
24747^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24748
24749Syntax:
24750"""""""
24751
24752This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
24753any integer bit width and for different address spaces. Not all targets
24754support all bit widths however.
24755
24756::
24757
24758      declare void @llvm.memset.element.unordered.atomic.p0.i32(ptr <dest>,
24759                                                                i8 <value>,
24760                                                                i32 <len>,
24761                                                                i32 <element_size>)
24762      declare void @llvm.memset.element.unordered.atomic.p0.i64(ptr <dest>,
24763                                                                i8 <value>,
24764                                                                i64 <len>,
24765                                                                i32 <element_size>)
24766
24767Overview:
24768"""""""""
24769
The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
with elements that are exactly ``element_size`` bytes, and the assignment to that array
uses a sequence of :ref:`unordered atomic <ordering>` store operations
that are a positive integer multiple of the ``element_size`` in size.
24775
24776Arguments:
24777""""""""""
24778
24779The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
24780intrinsic, with the added constraint that ``len`` is required to be a positive integer
24781multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
24782``element_size``, then the behaviour of the intrinsic is undefined.
24783
``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
the destination pointer is aligned to that boundary.
24790
24791Semantics:
24792""""""""""
24793
24794The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
24795memory starting at the destination location to the given ``value``. The memory is
24796set with a sequence of store operations where each access is guaranteed to be a
24797multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.
24798
24799The order of the assignment is unspecified. Only one write is issued to the
24800destination buffer per element. It is well defined to have concurrent reads and
24801writes to the destination provided those reads and writes are unordered atomic
24802when specified.
24803
24804This intrinsic does not provide any additional ordering guarantees over those
24805provided by a set of unordered stores to the destination.
24806
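For illustration, a call that zeroes sixteen 4-byte elements could be written
as in the sketch below. The function name ``@zero_words`` and the sizes are
assumptions, and a 4-byte element is assumed to be within the target's atomic
access size limit:

.. code-block:: llvm

      declare void @llvm.memset.element.unordered.atomic.p0.i32(ptr, i8, i32, i32)

      ; Hypothetical helper: set 64 bytes at %dst to zero, 4 bytes per store.
      define void @zero_words(ptr %dst) {
        ; len (64) is a positive integer multiple of element_size (4).
        call void @llvm.memset.element.unordered.atomic.p0.i32(
            ptr align 4 %dst, i8 0, i32 64, i32 4)
        ret void
      }
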
24807Lowering:
24808"""""""""
24809
In the most general case, a call to '``llvm.memset.element.unordered.atomic.*``' is
lowered to a call to the symbol ``__llvm_memset_element_unordered_atomic_*``, where '*'
is replaced with the actual element size.

The optimizer is allowed to inline the memory assignment when it is profitable to do so.
24815
24816Objective-C ARC Runtime Intrinsics
24817----------------------------------
24818
24819LLVM provides intrinsics that lower to Objective-C ARC runtime entry points.
24820LLVM is aware of the semantics of these functions, and optimizes based on that
24821knowledge. You can read more about the details of Objective-C ARC `here
24822<https://clang.llvm.org/docs/AutomaticReferenceCounting.html>`_.
24823
24824'``llvm.objc.autorelease``' Intrinsic
24825^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24826
24827Syntax:
24828"""""""
24829::
24830
24831      declare ptr @llvm.objc.autorelease(ptr)
24832
24833Lowering:
24834"""""""""
24835
24836Lowers to a call to `objc_autorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autorelease>`_.
24837
24838'``llvm.objc.autoreleasePoolPop``' Intrinsic
24839^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24840
24841Syntax:
24842"""""""
24843::
24844
24845      declare void @llvm.objc.autoreleasePoolPop(ptr)
24846
24847Lowering:
24848"""""""""
24849
24850Lowers to a call to `objc_autoreleasePoolPop <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpop-void-pool>`_.
24851
24852'``llvm.objc.autoreleasePoolPush``' Intrinsic
24853^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24854
24855Syntax:
24856"""""""
24857::
24858
24859      declare ptr @llvm.objc.autoreleasePoolPush()
24860
24861Lowering:
24862"""""""""
24863
24864Lowers to a call to `objc_autoreleasePoolPush <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-autoreleasepoolpush-void>`_.
24865
24866'``llvm.objc.autoreleaseReturnValue``' Intrinsic
24867^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24868
24869Syntax:
24870"""""""
24871::
24872
24873      declare ptr @llvm.objc.autoreleaseReturnValue(ptr)
24874
24875Lowering:
24876"""""""""
24877
24878Lowers to a call to `objc_autoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue>`_.
24879
24880'``llvm.objc.copyWeak``' Intrinsic
24881^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24882
24883Syntax:
24884"""""""
24885::
24886
24887      declare void @llvm.objc.copyWeak(ptr, ptr)
24888
24889Lowering:
24890"""""""""
24891
24892Lowers to a call to `objc_copyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-copyweak-id-dest-id-src>`_.
24893
24894'``llvm.objc.destroyWeak``' Intrinsic
24895^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24896
24897Syntax:
24898"""""""
24899::
24900
24901      declare void @llvm.objc.destroyWeak(ptr)
24902
24903Lowering:
24904"""""""""
24905
24906Lowers to a call to `objc_destroyWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-destroyweak-id-object>`_.
24907
24908'``llvm.objc.initWeak``' Intrinsic
24909^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24910
24911Syntax:
24912"""""""
24913::
24914
24915      declare ptr @llvm.objc.initWeak(ptr, ptr)
24916
24917Lowering:
24918"""""""""
24919
24920Lowers to a call to `objc_initWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-initweak>`_.
24921
24922'``llvm.objc.loadWeak``' Intrinsic
24923^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24924
24925Syntax:
24926"""""""
24927::
24928
24929      declare ptr @llvm.objc.loadWeak(ptr)
24930
24931Lowering:
24932"""""""""
24933
24934Lowers to a call to `objc_loadWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweak>`_.
24935
24936'``llvm.objc.loadWeakRetained``' Intrinsic
24937^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24938
24939Syntax:
24940"""""""
24941::
24942
24943      declare ptr @llvm.objc.loadWeakRetained(ptr)
24944
24945Lowering:
24946"""""""""
24947
24948Lowers to a call to `objc_loadWeakRetained <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-loadweakretained>`_.
24949
24950'``llvm.objc.moveWeak``' Intrinsic
24951^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24952
24953Syntax:
24954"""""""
24955::
24956
24957      declare void @llvm.objc.moveWeak(ptr, ptr)
24958
24959Lowering:
24960"""""""""
24961
24962Lowers to a call to `objc_moveWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-moveweak-id-dest-id-src>`_.
24963
24964'``llvm.objc.release``' Intrinsic
24965^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24966
24967Syntax:
24968"""""""
24969::
24970
24971      declare void @llvm.objc.release(ptr)
24972
24973Lowering:
24974"""""""""
24975
24976Lowers to a call to `objc_release <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-release-id-value>`_.
24977
24978'``llvm.objc.retain``' Intrinsic
24979^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24980
24981Syntax:
24982"""""""
24983::
24984
24985      declare ptr @llvm.objc.retain(ptr)
24986
24987Lowering:
24988"""""""""
24989
24990Lowers to a call to `objc_retain <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retain>`_.
24991
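As a small sketch of how these intrinsics appear in IR, the retain and release
intrinsics can bracket a use of an object. The functions ``@hold`` and
``@use`` are hypothetical and exist only for this example:

.. code-block:: llvm

      declare ptr @llvm.objc.retain(ptr)
      declare void @llvm.objc.release(ptr)
      declare void @use(ptr)            ; hypothetical consumer of the object

      ; Hypothetical helper: keep %obj alive across the call to @use.
      define void @hold(ptr %obj) {
        %retained = call ptr @llvm.objc.retain(ptr %obj)    ; lowers to objc_retain
        call void @use(ptr %retained)
        call void @llvm.objc.release(ptr %retained)         ; lowers to objc_release
        ret void
      }
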
24992'``llvm.objc.retainAutorelease``' Intrinsic
24993^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
24994
24995Syntax:
24996"""""""
24997::
24998
24999      declare ptr @llvm.objc.retainAutorelease(ptr)
25000
25001Lowering:
25002"""""""""
25003
25004Lowers to a call to `objc_retainAutorelease <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautorelease>`_.
25005
25006'``llvm.objc.retainAutoreleaseReturnValue``' Intrinsic
25007^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25008
25009Syntax:
25010"""""""
25011::
25012
25013      declare ptr @llvm.objc.retainAutoreleaseReturnValue(ptr)
25014
25015Lowering:
25016"""""""""
25017
25018Lowers to a call to `objc_retainAutoreleaseReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasereturnvalue>`_.
25019
25020'``llvm.objc.retainAutoreleasedReturnValue``' Intrinsic
25021^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25022
25023Syntax:
25024"""""""
25025::
25026
25027      declare ptr @llvm.objc.retainAutoreleasedReturnValue(ptr)
25028
25029Lowering:
25030"""""""""
25031
25032Lowers to a call to `objc_retainAutoreleasedReturnValue <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainautoreleasedreturnvalue>`_.
25033
25034'``llvm.objc.retainBlock``' Intrinsic
25035^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25036
25037Syntax:
25038"""""""
25039::
25040
25041      declare ptr @llvm.objc.retainBlock(ptr)
25042
25043Lowering:
25044"""""""""
25045
25046Lowers to a call to `objc_retainBlock <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-retainblock>`_.
25047
25048'``llvm.objc.storeStrong``' Intrinsic
25049^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25050
25051Syntax:
25052"""""""
25053::
25054
25055      declare void @llvm.objc.storeStrong(ptr, ptr)
25056
25057Lowering:
25058"""""""""
25059
25060Lowers to a call to `objc_storeStrong <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#void-objc-storestrong-id-object-id-value>`_.
25061
25062'``llvm.objc.storeWeak``' Intrinsic
25063^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25064
25065Syntax:
25066"""""""
25067::
25068
25069      declare ptr @llvm.objc.storeWeak(ptr, ptr)
25070
25071Lowering:
25072"""""""""
25073
25074Lowers to a call to `objc_storeWeak <https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-storeweak>`_.
25075
25076Preserving Debug Information Intrinsics
25077^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25078
These intrinsics are used to carry certain debuginfo together with
IR-level operations. For example, it may be desirable to
know the structure/union name and the original user-level field
indices. Such information is lost in the IR GetElementPtr instruction,
since the IR types are different from debuginfo types and unions
are converted to structs in IR.
25085
25086'``llvm.preserve.array.access.index``' Intrinsic
25087^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25088
25089Syntax:
25090"""""""
25091::
25092
25093      declare <ret_type>
25094      @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(<type> base,
25095                                                                           i32 dim,
25096                                                                           i32 index)
25097
25098Overview:
25099"""""""""
25100
The '``llvm.preserve.array.access.index``' intrinsic returns the getelementptr address
based on the array base ``base``, the array dimension ``dim``, and the last access index
``index`` into the array. The return type ``ret_type`` is a pointer type to the array
element. The array ``dim`` and ``index`` are preserved, which is more robust than a
getelementptr instruction, which may be subject to compiler transformation.
``llvm.preserve.access.index`` metadata is attached to this call instruction
to provide the array or pointer debuginfo type.
The metadata is a ``DICompositeType`` or ``DIDerivedType`` representing the
debuginfo version of ``type``.
25110
25111Arguments:
25112""""""""""
25113
25114The ``base`` is the array base address.  The ``dim`` is the array dimension.
25115The ``base`` is a pointer if ``dim`` equals 0.
25116The ``index`` is the last access index into the array or pointer.
25117
25118The ``base`` argument must be annotated with an :ref:`elementtype
25119<attr_elementtype>` attribute at the call-site. This attribute specifies the
25120getelementptr element type.
25121
25122Semantics:
25123""""""""""
25124
The '``llvm.preserve.array.access.index``' intrinsic produces the same result
as a getelementptr with base ``base`` and access operands ``{dim's 0's, index}``,
that is, ``dim`` zero indices followed by ``index``.
25127
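The sketch below illustrates the equivalence for ``dim = 1`` and
``index = 3`` on a ``[10 x i32]`` array. The function name ``@element_3`` is
hypothetical, the ``.p0.p0`` suffix is the overload mangling assumed for
opaque pointers in address space 0, and the ``!llvm.preserve.access.index``
metadata attachment is left out to keep the example short:

.. code-block:: llvm

      declare ptr @llvm.preserve.array.access.index.p0.p0(ptr, i32, i32)

      ; Hypothetical helper: address of element 3 of a [10 x i32] array.
      define ptr @element_3(ptr %base) {
        ; dim = 1, index = 3 yields the same address as
        ;   getelementptr [10 x i32], ptr %base, i32 0, i32 3
        %addr = call ptr @llvm.preserve.array.access.index.p0.p0(
                    ptr elementtype([10 x i32]) %base, i32 1, i32 3)
        ret ptr %addr
      }
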
25128'``llvm.preserve.union.access.index``' Intrinsic
25129^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25130
25131Syntax:
25132"""""""
25133::
25134
25135      declare <type>
25136      @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(<type> base,
25137                                                                        i32 di_index)
25138
25139Overview:
25140"""""""""
25141
25142The '``llvm.preserve.union.access.index``' intrinsic carries the debuginfo field index
25143``di_index`` and returns the ``base`` address.
25144The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
25145to provide union debuginfo type.
25146The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
25147The return type ``type`` is the same as the ``base`` type.
25148
25149Arguments:
25150""""""""""
25151
25152The ``base`` is the union base address. The ``di_index`` is the field index in debuginfo.
25153
25154Semantics:
25155""""""""""
25156
25157The '``llvm.preserve.union.access.index``' intrinsic returns the ``base`` address.
25158
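A minimal sketch of a call follows. The function name ``@union_member`` and
the field index are assumptions, the ``.p0.p0`` suffix is the overload
mangling assumed for opaque pointers in address space 0, and the
``!llvm.preserve.access.index`` metadata attachment is omitted for brevity:

.. code-block:: llvm

      declare ptr @llvm.preserve.union.access.index.p0.p0(ptr, i32)

      ; Hypothetical helper: record an access to debuginfo field 1 of a union.
      define ptr @union_member(ptr %base) {
        ; di_index = 1 only carries debuginfo; the returned address equals %base.
        %addr = call ptr @llvm.preserve.union.access.index.p0.p0(ptr %base, i32 1)
        ret ptr %addr
      }
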
25159'``llvm.preserve.struct.access.index``' Intrinsic
25160^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25161
25162Syntax:
25163"""""""
25164::
25165
25166      declare <ret_type>
25167      @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(<type> base,
25168                                                                 i32 gep_index,
25169                                                                 i32 di_index)
25170
25171Overview:
25172"""""""""
25173
25174The '``llvm.preserve.struct.access.index``' intrinsic returns the getelementptr address
25175based on struct base ``base`` and IR struct member index ``gep_index``.
25176The ``llvm.preserve.access.index`` type of metadata is attached to this call instruction
25177to provide struct debuginfo type.
25178The metadata is a ``DICompositeType`` representing the debuginfo version of ``type``.
25179The return type ``ret_type`` is a pointer type to the structure member.
25180
25181Arguments:
25182""""""""""
25183
25184The ``base`` is the structure base address. The ``gep_index`` is the struct member index
25185based on IR structures. The ``di_index`` is the struct member index based on debuginfo.
25186
25187The ``base`` argument must be annotated with an :ref:`elementtype
25188<attr_elementtype>` attribute at the call-site. This attribute specifies the
25189getelementptr element type.
25190
25191Semantics:
25192""""""""""
25193
25194The '``llvm.preserve.struct.access.index``' intrinsic produces the same result
25195as a getelementptr with base ``base`` and access operands ``{0, gep_index}``.
25196
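The sketch below illustrates the equivalence for the second member of a
two-field struct. The type ``%struct.s`` and the function name
``@second_field`` are hypothetical, the ``.p0.p0`` suffix is the overload
mangling assumed for opaque pointers in address space 0, and the
``!llvm.preserve.access.index`` metadata attachment is omitted for brevity:

.. code-block:: llvm

      %struct.s = type { i32, i64 }     ; hypothetical struct type

      declare ptr @llvm.preserve.struct.access.index.p0.p0(ptr, i32, i32)

      ; Hypothetical helper: address of the second member of %struct.s.
      define ptr @second_field(ptr %base) {
        ; gep_index = 1, di_index = 1 yields the same address as
        ;   getelementptr %struct.s, ptr %base, i32 0, i32 1
        %addr = call ptr @llvm.preserve.struct.access.index.p0.p0(
                    ptr elementtype(%struct.s) %base, i32 1, i32 1)
        ret ptr %addr
      }
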
25197'``llvm.fptrunc.round``' Intrinsic
25198^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25199
25200Syntax:
25201"""""""
25202
25203::
25204
25205      declare <ty2>
25206      @llvm.fptrunc.round(<type> <value>, metadata <rounding mode>)
25207
25208Overview:
25209"""""""""
25210
25211The '``llvm.fptrunc.round``' intrinsic truncates
25212:ref:`floating-point <t_floating>` ``value`` to type ``ty2``
25213with a specified rounding mode.
25214
25215Arguments:
25216""""""""""
25217
The '``llvm.fptrunc.round``' intrinsic takes a :ref:`floating-point
<t_floating>` value to cast and a :ref:`floating-point <t_floating>` type
to cast it to. The type of the value must be larger in size than the result type.
25221
25222The second argument specifies the rounding mode as described in the constrained
25223intrinsics section.
25224For this intrinsic, the "round.dynamic" mode is not supported.
25225
25226Semantics:
25227""""""""""
25228
25229The '``llvm.fptrunc.round``' intrinsic casts a ``value`` from a larger
25230:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
25231<t_floating>` type.
25232This intrinsic is assumed to execute in the default :ref:`floating-point
25233environment <floatenv>` *except* for the rounding mode.
25234This intrinsic is not supported on all targets. Some targets may not support
25235all rounding modes.
25236
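As a hedged sketch, truncating a ``float`` to ``half`` with rounding toward
zero might be written as follows. The function name ``@trunc_toward_zero`` is
hypothetical, the ``.f16.f32`` suffix is the assumed overload mangling, and
the rounding mode string is taken from the constrained intrinsics section.
Whether a given target can select this call is not guaranteed:

.. code-block:: llvm

      declare half @llvm.fptrunc.round.f16.f32(float, metadata)

      ; Hypothetical helper: truncate %x to half, rounding toward zero.
      define half @trunc_toward_zero(float %x) {
        %r = call half @llvm.fptrunc.round.f16.f32(float %x,
                                                   metadata !"round.towardzero")
        ret half %r
      }
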