1======================================
2Syntax of AMDGPU Instruction Modifiers
3======================================
4
5.. contents::
6   :local:
7
8Conventions
9===========
10
11The following notation is used throughout this document:
12
    =================== =============================================================
    Notation            Description
    =================== =============================================================
    {0..N}              Any integer value in the range from 0 to N (inclusive).
    <x>                 Syntax and meaning of *x* are explained elsewhere.
    =================== =============================================================
19
20.. _amdgpu_syn_modifiers:
21
22Modifiers
23=========
24
25DS Modifiers
26------------
27
28.. _amdgpu_synid_ds_offset80:
29
30offset0
31~~~~~~~
32
Specifies the first 8-bit offset, in bytes. The default value is 0.
34
35Used with DS instructions that expect two addresses.
36
37    =================== ====================================================================
38    Syntax              Description
39    =================== ====================================================================
40    offset0:{0..0xFF}   Specifies an unsigned 8-bit offset as a positive
41                        :ref:`integer number <amdgpu_synid_integer_number>`
42                        or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
43    =================== ====================================================================
44
45Examples:
46
47.. parsed-literal::
48
49  offset0:0xff
50  offset0:2-x
51  offset0:-x-y
52
53.. _amdgpu_synid_ds_offset81:
54
55offset1
56~~~~~~~
57
Specifies the second 8-bit offset, in bytes. The default value is 0.
59
60Used with DS instructions that expect two addresses.
61
62    =================== ====================================================================
63    Syntax              Description
64    =================== ====================================================================
65    offset1:{0..0xFF}   Specifies an unsigned 8-bit offset as a positive
66                        :ref:`integer number <amdgpu_synid_integer_number>`
67                        or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
68    =================== ====================================================================
69
70Examples:
71
72.. parsed-literal::
73
74  offset1:0xff
75  offset1:2-x
76  offset1:-x-y
77
78.. _amdgpu_synid_ds_offset16:
79
80offset
81~~~~~~
82
83Specifies a 16-bit offset, in bytes. The default value is 0.
84
85Used with DS instructions that expect a single address.
86
87    ==================== ====================================================================
88    Syntax               Description
89    ==================== ====================================================================
90    offset:{0..0xFFFF}   Specifies an unsigned 16-bit offset as a positive
91                         :ref:`integer number <amdgpu_synid_integer_number>`
92                         or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
93    ==================== ====================================================================
94
95Examples:
96
97.. parsed-literal::
98
99  offset:65535
100  offset:0xffff
101  offset:-x-y
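
The ranges of the three DS offset modifiers can be sketched in Python. This is a minimal illustration, not the assembler's implementation. It assumes that *offset0* occupies the low byte and *offset1* the high byte of the same 16-bit field that single-address instructions use for *offset*; that layout and the helper names ``check_range``, ``pack_ds_offsets`` and ``check_ds_offset`` are assumptions made for this example and are not stated in this document.

```python
def check_range(name, value, max_value):
    """Reject values outside the unsigned range {0..max_value}."""
    if not 0 <= value <= max_value:
        raise ValueError(f"{name}:{value} is outside 0..{max_value:#x}")
    return value


def pack_ds_offsets(offset0=0, offset1=0):
    """Pack two 8-bit offsets into one 16-bit value (assumed layout)."""
    low = check_range("offset0", offset0, 0xFF)
    high = check_range("offset1", offset1, 0xFF)
    return (high << 8) | low


def check_ds_offset(offset=0):
    """Validate the 16-bit offset of a single-address DS instruction."""
    return check_range("offset", offset, 0xFFFF)
```

Both modifiers default to 0, so ``pack_ds_offsets()`` yields 0, and a value such as ``offset0:256`` is rejected because it does not fit in 8 bits.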
102
103.. _amdgpu_synid_sw_offset16:
104
105swizzle pattern
106~~~~~~~~~~~~~~~
107
This is a special modifier which may be used with the *ds_swizzle_b32* instruction only.
It specifies a swizzle pattern in numeric or symbolic form. The default value is 0.
110
111See AMD documentation for more information.
112
    ======================================================= ===========================================================
    Syntax                                                  Description
    ======================================================= ===========================================================
    offset:{0..0xFFFF}                                      Specifies a 16-bit swizzle pattern.
    offset:swizzle(QUAD_PERM,{0..3},{0..3},{0..3},{0..3})   Specifies a quad permute mode pattern.

                                                            Each number is a lane *id*.
    offset:swizzle(BITMASK_PERM, "<mask>")                  Specifies a bitmask permute mode pattern.

                                                            The pattern converts a 5-bit lane *id* to another
                                                            lane *id* with which the lane interacts.

                                                            *mask* is a 5 character sequence which
                                                            specifies how to transform the bits of the
                                                            lane *id*.

                                                            The following characters are allowed:

                                                            * "0" - set bit to 0.

                                                            * "1" - set bit to 1.

                                                            * "p" - preserve bit.

                                                            * "i" - inverse bit.

    offset:swizzle(BROADCAST,{2..32},{0..N})                Specifies a broadcast mode.

                                                            Broadcasts the value of any particular lane to
                                                            all lanes in its group.

                                                            The first numeric parameter is a group
                                                            size and must be equal to 2, 4, 8, 16 or 32.

                                                            The second numeric parameter is an index of the
                                                            lane being broadcast.

                                                            The index must be less than the group size.
    offset:swizzle(SWAP,{1..16})                            Specifies a swap mode.

                                                            Swaps the neighboring groups of
                                                            1, 2, 4, 8 or 16 lanes.
    offset:swizzle(REVERSE,{2..32})                         Specifies a reverse mode.

                                                            Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes.
    ======================================================= ===========================================================
159
160Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
161:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
162
163Examples:
164
165.. parsed-literal::
166
167  offset:255
168  offset:0xffff
169  offset:swizzle(QUAD_PERM, 0, 1, 2, 3)
170  offset:swizzle(BITMASK_PERM, "01pi0")
171  offset:swizzle(BROADCAST, 2, 0)
172  offset:swizzle(SWAP, 8)
173  offset:swizzle(REVERSE, 30 + 2)
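
The lane mapping described by the symbolic swizzle forms can be sketched in Python. This is a hedged illustration of the descriptions in the table above, not the hardware or assembler implementation. It assumes 5-bit lane ids (0 to 31), that the first character of *mask* applies to the most significant bit of the lane id, and that a broadcast index lies in the range 0 to group size minus 1. All function names are hypothetical.

```python
def quad_perm(lane, ids):
    """Select a lane within the same group of 4 lanes; ids holds 4 lane ids."""
    if len(ids) != 4 or any(not 0 <= i <= 3 for i in ids):
        raise ValueError("QUAD_PERM expects four lane ids in 0..3")
    return (lane & ~3) | ids[lane & 3]


def bitmask_perm(lane, mask):
    """Transform a 5-bit lane id with a 5-character mask.

    Assumption: the first character applies to the most significant bit.
    """
    if len(mask) != 5 or any(c not in "01pi" for c in mask):
        raise ValueError("mask must be 5 characters from '0', '1', 'p', 'i'")
    result = 0
    for position, action in enumerate(mask):
        bit = 4 - position
        old = (lane >> bit) & 1
        if action == "0":
            new = 0          # set bit to 0
        elif action == "1":
            new = 1          # set bit to 1
        elif action == "p":
            new = old        # preserve bit
        else:
            new = old ^ 1    # "i": inverse bit
        result |= new << bit
    return result


def broadcast(lane, group_size, index):
    """Every lane of a group reads from lane `index` of that group."""
    if group_size not in (2, 4, 8, 16, 32) or not 0 <= index < group_size:
        raise ValueError("invalid BROADCAST parameters")
    return (lane & ~(group_size - 1)) | index


def swap(lane, group_size):
    """Neighboring groups of `group_size` lanes exchange places."""
    if group_size not in (1, 2, 4, 8, 16):
        raise ValueError("invalid SWAP group size")
    return lane ^ group_size


def reverse(lane, group_size):
    """Lane order is reversed within each group of `group_size` lanes."""
    if group_size not in (2, 4, 8, 16, 32):
        raise ValueError("invalid REVERSE group size")
    return lane ^ (group_size - 1)
```

Under these assumptions, ``swizzle(QUAD_PERM, 0, 1, 2, 3)`` maps every lane to itself, and ``swizzle(BITMASK_PERM, "01pi0")`` maps lane ``0b10110`` to lane ``0b01100``.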
174
175.. _amdgpu_synid_gds:
176
177gds
178~~~
179
180Specifies whether to use GDS or LDS memory (LDS is the default).
181
182    ======================================== ================================================
183    Syntax                                   Description
184    ======================================== ================================================
185    gds                                      Use GDS memory.
186    ======================================== ================================================
187
188
189EXP Modifiers
190-------------
191
192.. _amdgpu_synid_done:
193
194done
195~~~~
196
Specifies if this is the last export from the shader to the target. By default,
the *exp* instruction does not finish an export sequence.
199
200    ======================================== ================================================
201    Syntax                                   Description
202    ======================================== ================================================
203    done                                     Indicates the last export operation.
204    ======================================== ================================================
205
206.. _amdgpu_synid_compr:
207
208compr
209~~~~~
210
211Indicates if the data are compressed (data are not compressed by default).
212
213    ======================================== ================================================
214    Syntax                                   Description
215    ======================================== ================================================
216    compr                                    Data are compressed.
217    ======================================== ================================================
218
219.. _amdgpu_synid_vm:
220
221vm
222~~
223
224Specifies valid mask flag state (off by default).
225
226    ======================================== ================================================
227    Syntax                                   Description
228    ======================================== ================================================
229    vm                                       Set valid mask flag.
230    ======================================== ================================================
231
232FLAT Modifiers
233--------------
234
235.. _amdgpu_synid_flat_offset12:
236
237offset12
238~~~~~~~~
239
240Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
241
242Cannot be used with *global/scratch* opcodes. GFX9 only.
243
244    ================= ====================================================================
245    Syntax            Description
246    ================= ====================================================================
247    offset:{0..4095}  Specifies a 12-bit unsigned offset as a positive
248                      :ref:`integer number <amdgpu_synid_integer_number>`
249                      or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
250    ================= ====================================================================
251
252Examples:
253
254.. parsed-literal::
255
256  offset:4095
257  offset:x-0xff
258
259.. _amdgpu_synid_flat_offset13s:
260
261offset13s
262~~~~~~~~~
263
264Specifies an immediate signed 13-bit offset, in bytes. The default value is 0.
265
266Can be used with *global/scratch* opcodes only. GFX9 only.
267
268    ===================== ====================================================================
269    Syntax                Description
270    ===================== ====================================================================
271    offset:{-4096..4095}  Specifies a 13-bit signed offset as an
272                          :ref:`integer number <amdgpu_synid_integer_number>`
273                          or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
274    ===================== ====================================================================
275
276Examples:
277
278.. parsed-literal::
279
280  offset:-4000
281  offset:0x10
282  offset:-x
283
284.. _amdgpu_synid_flat_offset12s:
285
286offset12s
287~~~~~~~~~
288
289Specifies an immediate signed 12-bit offset, in bytes. The default value is 0.
290
291Can be used with *global/scratch* opcodes only.
292
293GFX10 only.
294
295    ===================== ====================================================================
296    Syntax                Description
297    ===================== ====================================================================
298    offset:{-2048..2047}  Specifies a 12-bit signed offset as an
299                          :ref:`integer number <amdgpu_synid_integer_number>`
300                          or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
301    ===================== ====================================================================
302
303Examples:
304
305.. parsed-literal::
306
307  offset:-2000
308  offset:0x10
309  offset:-x+y
310
311.. _amdgpu_synid_flat_offset11:
312
313offset11
314~~~~~~~~
315
316Specifies an immediate unsigned 11-bit offset, in bytes. The default value is 0.
317
318Cannot be used with *global/scratch* opcodes.
319
320GFX10 only.
321
322    ================= ====================================================================
323    Syntax            Description
324    ================= ====================================================================
325    offset:{0..2047}  Specifies an 11-bit unsigned offset as a positive
326                      :ref:`integer number <amdgpu_synid_integer_number>`
327                      or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
328    ================= ====================================================================
329
330Examples:
331
332.. parsed-literal::
333
334  offset:2047
335  offset:x+0xff
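
The four FLAT offset variants differ only in target and in whether the opcode is a *global/scratch* one. The following Python sketch collects the ranges from the tables above into one lookup. The function names and the ``"GFX9"``/``"GFX10"`` strings are hypothetical and chosen for illustration only.

```python
# Ranges taken from the offset12, offset13s, offset12s and offset11 tables.
# Key: (target, opcode is global/scratch).
FLAT_OFFSET_RANGES = {
    ("GFX9", False): (0, 4095),       # offset12: unsigned 12-bit
    ("GFX9", True): (-4096, 4095),    # offset13s: signed 13-bit
    ("GFX10", True): (-2048, 2047),   # offset12s: signed 12-bit
    ("GFX10", False): (0, 2047),      # offset11: unsigned 11-bit
}


def flat_offset_range(target, global_or_scratch):
    """Return the inclusive (min, max) offset range for a FLAT opcode."""
    try:
        return FLAT_OFFSET_RANGES[(target, global_or_scratch)]
    except KeyError:
        raise ValueError(f"no FLAT offset modifier listed for {target}")


def is_valid_flat_offset(target, global_or_scratch, offset=0):
    """Check an already evaluated offset value against the range."""
    low, high = flat_offset_range(target, global_or_scratch)
    return low <= offset <= high
```

For example, ``offset:-4000`` is accepted for a GFX9 *global* opcode but not for a GFX10 one, where the lower bound is -2048.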
336
337dlc
338~~~
339
340See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
341
342glc
343~~~
344
345See a description :ref:`here<amdgpu_synid_glc>`.
346
347lds
348~~~
349
350See a description :ref:`here<amdgpu_synid_lds>`. GFX10 only.
351
352slc
353~~~
354
355See a description :ref:`here<amdgpu_synid_slc>`.
356
357tfe
358~~~
359
360See a description :ref:`here<amdgpu_synid_tfe>`.
361
362nv
363~~
364
365See a description :ref:`here<amdgpu_synid_nv>`.
366
367sc0
368~~~
369
370See a description :ref:`here<amdgpu_synid_sc0>`.
371
372sc1
373~~~
374
375See a description :ref:`here<amdgpu_synid_sc1>`.
376
377nt
378~~
379
380See a description :ref:`here<amdgpu_synid_nt>`.
381
382MIMG Modifiers
383--------------
384
385.. _amdgpu_synid_dmask:
386
387dmask
388~~~~~
389
390Specifies which channels (image components) are used by the operation. By default, no channels
391are used.
392
    =============== ====================================================================
    Syntax          Description
    =============== ====================================================================
    dmask:{0..15}   Specifies image channels as a positive
                    :ref:`integer number <amdgpu_synid_integer_number>`
                    or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.

                    Each bit corresponds to one of 4 image components (RGBA).

                    A bit value of 0 means that the component is not used;
                    a bit value of 1 means that the component is used.
    =============== ====================================================================
405
406This modifier has some limitations depending on instruction kind:
407
408    =================================================== ========================
409    Instruction Kind                                    Valid dmask Values
410    =================================================== ========================
411    32-bit atomic *cmpswap*                             0x3
412    32-bit atomic instructions except for *cmpswap*     0x1
413    64-bit atomic *cmpswap*                             0xF
414    64-bit atomic instructions except for *cmpswap*     0x3
415    *gather4*                                           0x1, 0x2, 0x4, 0x8
416    Other instructions                                  any value
417    =================================================== ========================
418
419Examples:
420
421.. parsed-literal::
422
423  dmask:0xf
424  dmask:0b1111
425  dmask:x|y|z
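
The *dmask* rules can be sketched in Python as a decoder plus a check against the limitations table. This is an illustration only. It assumes that bit 0 selects R and bit 3 selects A, which this document does not state explicitly, and the instruction kind keys are hypothetical labels for the rows of the table.

```python
CHANNELS = "RGBA"  # assumption: bit 0 selects R, bit 3 selects A

# Valid dmask values per instruction kind, from the limitations table.
# Kinds that are not listed accept any value in 0..15.
VALID_DMASK = {
    "atomic32_cmpswap": {0x3},
    "atomic32": {0x1},
    "atomic64_cmpswap": {0xF},
    "atomic64": {0x3},
    "gather4": {0x1, 0x2, 0x4, 0x8},
}


def dmask_channels(dmask):
    """List the image components enabled by a dmask value."""
    if not 0 <= dmask <= 15:
        raise ValueError("dmask must be in the range 0..15")
    return [name for bit, name in enumerate(CHANNELS) if dmask & (1 << bit)]


def is_valid_dmask(kind, dmask):
    """Check a dmask value against the limitations of an instruction kind."""
    if not 0 <= dmask <= 15:
        return False
    allowed = VALID_DMASK.get(kind)
    return allowed is None or dmask in allowed
```

With this sketch, ``dmask:0xf`` enables all four components, while a *gather4* instruction accepts only a value with exactly one bit set.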
426
427.. _amdgpu_synid_unorm:
428
429unorm
430~~~~~
431
432Specifies whether the address is normalized or not (the address is normalized by default).
433
434    ======================== ========================================
435    Syntax                   Description
436    ======================== ========================================
437    unorm                    Force the address to be unnormalized.
438    ======================== ========================================
439
440glc
441~~~
442
443See a description :ref:`here<amdgpu_synid_glc>`.
444
445slc
446~~~
447
448See a description :ref:`here<amdgpu_synid_slc>`.
449
450.. _amdgpu_synid_r128:
451
452r128
453~~~~
454
455Specifies texture resource size. The default size is 256 bits.
456
457GFX7, GFX8 and GFX10 only.
458
    =================== ================================================
    Syntax              Description
    =================== ================================================
    r128                Specifies a 128-bit texture resource size.
    =================== ================================================
464
.. WARNING:: Using this modifier should decrease the *rsrc* operand size from 8 to 4 dwords, but the assembler does not currently support this feature.
466
467tfe
468~~~
469
470See a description :ref:`here<amdgpu_synid_tfe>`.
471
472.. _amdgpu_synid_lwe:
473
474lwe
475~~~
476
477Specifies LOD warning status (LOD warning is disabled by default).
478
479    ======================================== ================================================
480    Syntax                                   Description
481    ======================================== ================================================
482    lwe                                      Enables LOD warning.
483    ======================================== ================================================
484
485.. _amdgpu_synid_da:
486
487da
488~~
489
Specifies if an array index must be sent to TA. By default, the array index is not sent.
491
492    ======================================== ================================================
493    Syntax                                   Description
494    ======================================== ================================================
495    da                                       Send an array-index to TA.
496    ======================================== ================================================
497
498.. _amdgpu_synid_d16:
499
500d16
501~~~
502
503Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.
504
    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    d16                                      Enables 16-bit data mode.

                                             On loads, convert data in memory to 16-bit
                                             format before storing it in VGPRs.

                                             For stores, convert 16-bit data in VGPRs to
                                             32 bits before going to memory.

                                             Note that GFX8.0 does not support data packing.
                                             Each 16-bit data element occupies 1 VGPR.

                                             GFX8.1, GFX9 and GFX10 support data packing.
                                             Each pair of 16-bit data elements
                                             occupies 1 VGPR.
    ======================================== ================================================
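
The effect of data packing on register usage can be sketched in Python. The packing support per target is taken from the table above. Rounding an odd element count up to a whole VGPR is an assumption, and the names ``SUPPORTS_PACKING`` and ``d16_vgpr_count`` are hypothetical.

```python
# Data packing support in d16 mode, as listed in the table above.
SUPPORTS_PACKING = {
    "GFX8.0": False,
    "GFX8.1": True,
    "GFX9": True,
    "GFX10": True,
}


def d16_vgpr_count(target, elements):
    """Number of VGPRs occupied by `elements` 16-bit data elements."""
    if elements < 0:
        raise ValueError("element count must not be negative")
    if SUPPORTS_PACKING[target]:
        # Two elements share one VGPR; an odd element is assumed
        # to occupy a VGPR of its own.
        return (elements + 1) // 2
    return elements
```

Four 16-bit elements therefore occupy 4 VGPRs on GFX8.0 and 2 VGPRs on targets that support packing.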
523
524.. _amdgpu_synid_a16:
525
526a16
527~~~
528
529Specifies size of image address components: 16 or 32 bits (32 bits by default).
530GFX9 and GFX10 only.
531
    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    a16                                      Enables 16-bit image address components.
    ======================================== ================================================
537
538.. _amdgpu_synid_dim:
539
540dim
541~~~
542
543Specifies surface dimension. This is a mandatory modifier. There is no default value.
544
545GFX10 only.
546
    =============================== =========================================================
    Syntax                          Description
    =============================== =========================================================
    dim:1D                          One-dimensional image.
    dim:2D                          Two-dimensional image.
    dim:3D                          Three-dimensional image.
    dim:CUBE                        Cubemap array.
    dim:1D_ARRAY                    One-dimensional image array.
    dim:2D_ARRAY                    Two-dimensional image array.
    dim:2D_MSAA                     Two-dimensional multi-sample anti-aliasing image.
    dim:2D_MSAA_ARRAY               Two-dimensional multi-sample anti-aliasing image array.
    =============================== =========================================================
559
560The following table defines an alternative syntax which is supported
561for compatibility with SP3 assembler:
562
    =============================== =========================================================
    Syntax                          Description
    =============================== =========================================================
    dim:SQ_RSRC_IMG_1D              One-dimensional image.
    dim:SQ_RSRC_IMG_2D              Two-dimensional image.
    dim:SQ_RSRC_IMG_3D              Three-dimensional image.
    dim:SQ_RSRC_IMG_CUBE            Cubemap array.
    dim:SQ_RSRC_IMG_1D_ARRAY        One-dimensional image array.
    dim:SQ_RSRC_IMG_2D_ARRAY        Two-dimensional image array.
    dim:SQ_RSRC_IMG_2D_MSAA         Two-dimensional multi-sample anti-aliasing image.
    dim:SQ_RSRC_IMG_2D_MSAA_ARRAY   Two-dimensional multi-sample anti-aliasing image array.
    =============================== =========================================================
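
The two *dim* syntaxes differ only by the ``SQ_RSRC_IMG_`` prefix, so one can be reduced to the other. The Python sketch below shows this relationship. The function name ``normalize_dim`` is hypothetical and the sketch does not claim to reflect how the assembler parses the modifier.

```python
# Dimension names from the first dim table.
DIM_NAMES = (
    "1D", "2D", "3D", "CUBE",
    "1D_ARRAY", "2D_ARRAY", "2D_MSAA", "2D_MSAA_ARRAY",
)
SP3_PREFIX = "SQ_RSRC_IMG_"


def normalize_dim(modifier):
    """Convert either dim syntax to the short form, e.g. 'dim:2D'."""
    if not modifier.startswith("dim:"):
        raise ValueError("expected a modifier of the form dim:<name>")
    name = modifier[len("dim:"):]
    if name.startswith(SP3_PREFIX):
        name = name[len(SP3_PREFIX):]
    if name not in DIM_NAMES:
        raise ValueError(f"unknown surface dimension: {name}")
    return "dim:" + name
```

For example, ``dim:SQ_RSRC_IMG_2D_MSAA`` and ``dim:2D_MSAA`` both normalize to ``dim:2D_MSAA``.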
575
576dlc
577~~~
578
579See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
580
581Miscellaneous Modifiers
582-----------------------
583
584.. _amdgpu_synid_dlc:
585
586dlc
587~~~
588
Controls the device level cache policy for memory operations. Used for synchronization.
When specified, forces the operation to bypass the device level cache, making the operation
device level coherent. By default, instructions use the device level cache.
592
593GFX10 only.
594
595    ======================================== ================================================
596    Syntax                                   Description
597    ======================================== ================================================
598    dlc                                      Bypass device level cache.
599    ======================================== ================================================
600
601.. _amdgpu_synid_glc:
602
603glc
604~~~
605
This modifier has a different meaning for loads, stores, and atomic operations.
607The default value is off (0).
608
609See AMD documentation for details.
610
611    ======================================== ================================================
612    Syntax                                   Description
613    ======================================== ================================================
614    glc                                      Set glc bit to 1.
615    ======================================== ================================================
616
617.. _amdgpu_synid_lds:
618
619lds
620~~~
621
622Specifies where to store the result: VGPRs or LDS (VGPRs by default).
623
624    ======================================== ===========================
625    Syntax                                   Description
626    ======================================== ===========================
627    lds                                      Store result in LDS.
628    ======================================== ===========================
629
630.. _amdgpu_synid_nv:
631
632nv
633~~
634
Specifies whether the instruction operates on non-volatile memory. By default, memory is volatile.
636
637GFX9 only.
638
639    ======================================== ================================================
640    Syntax                                   Description
641    ======================================== ================================================
642    nv                                       Indicates that instruction operates on
643                                             non-volatile memory.
644    ======================================== ================================================
645
646.. _amdgpu_synid_slc:
647
648slc
649~~~
650
651Specifies cache policy. The default value is off (0).
652
653See AMD documentation for details.
654
655    ======================================== ================================================
656    Syntax                                   Description
657    ======================================== ================================================
658    slc                                      Set slc bit to 1.
659    ======================================== ================================================
660
661.. _amdgpu_synid_tfe:
662
663tfe
664~~~
665
666Controls access to partially resident textures. The default value is off (0).
667
668See AMD documentation for details.
669
670    ======================================== ================================================
671    Syntax                                   Description
672    ======================================== ================================================
673    tfe                                      Set tfe bit to 1.
674    ======================================== ================================================
675
676.. _amdgpu_synid_sc0:
677
678sc0
679~~~
680
For atomics, sc0 indicates that the atomic operation returns a value.
For other opcodes, it is used together with :ref:`sc1<amdgpu_synid_sc1>` to specify cache
policy. See AMD documentation for details.
684
685    ======================================== ================================================
686    Syntax                                   Description
687    ======================================== ================================================
688    sc0                                      Set sc0 bit to 1.
689    ======================================== ================================================
690
691.. _amdgpu_synid_sc1:
692
693sc1
694~~~
695
696This modifier is used together with :ref:`sc0<amdgpu_synid_sc0>` to specify cache
697policy.
698
699    ======================================== ================================================
700    Syntax                                   Description
701    ======================================== ================================================
702    sc1                                      Set sc1 bit to 1.
703    ======================================== ================================================
704
705.. _amdgpu_synid_nt:
706
707nt
708~~
709
710Indicates an operation with non-temporal data.
711
712    ======================================== ================================================
713    Syntax                                   Description
714    ======================================== ================================================
715    nt                                       Set nt bit to 1.
716    ======================================== ================================================
717
718MUBUF/MTBUF Modifiers
719---------------------
720
721.. _amdgpu_synid_idxen:
722
723idxen
724~~~~~
725
726Specifies whether address components include an index. By default, no components are used.
727
728Can be used together with :ref:`offen<amdgpu_synid_offen>`.
729
730Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
731
732    ======================================== ================================================
733    Syntax                                   Description
734    ======================================== ================================================
735    idxen                                    Address components include an index.
736    ======================================== ================================================
737
738.. _amdgpu_synid_offen:
739
740offen
741~~~~~
742
743Specifies whether address components include an offset. By default, no components are used.
744
745Can be used together with :ref:`idxen<amdgpu_synid_idxen>`.
746
747Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
748
749    ======================================== ================================================
750    Syntax                                   Description
751    ======================================== ================================================
752    offen                                    Address components include an offset.
753    ======================================== ================================================
754
755.. _amdgpu_synid_addr64:
756
757addr64
758~~~~~~
759
760Specifies whether a 64-bit address is used. By default, no address is used.
761
GFX7 only. Cannot be used with the :ref:`offen<amdgpu_synid_offen>` and
:ref:`idxen<amdgpu_synid_idxen>` modifiers.
764
765    ======================================== ================================================
766    Syntax                                   Description
767    ======================================== ================================================
768    addr64                                   A 64-bit address is used.
769    ======================================== ================================================
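
The compatibility rules for *idxen*, *offen* and *addr64* can be summarized in a short Python check. This is a sketch of the constraints stated above. The function name and the target strings are hypothetical.

```python
def check_mubuf_addressing(modifiers, target):
    """Validate a combination of idxen, offen and addr64.

    `modifiers` is an iterable of modifier names and `target` is a
    string such as "GFX7". Raises ValueError on an invalid combination.
    """
    mods = set(modifiers)
    if "addr64" in mods:
        if target != "GFX7":
            raise ValueError("addr64 is supported on GFX7 only")
        clash = sorted(mods & {"idxen", "offen"})
        if clash:
            raise ValueError("addr64 cannot be used with " + ", ".join(clash))
    # idxen and offen may be used separately or together.
    return True
```

Under these rules ``idxen offen`` is accepted on any target, while ``addr64 offen`` is rejected even on GFX7.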
770
771.. _amdgpu_synid_buf_offset12:
772
773offset12
774~~~~~~~~
775
776Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
777
778    ================== ====================================================================
779    Syntax             Description
780    ================== ====================================================================
781    offset:{0..0xFFF}  Specifies a 12-bit unsigned offset as a positive
782                       :ref:`integer number <amdgpu_synid_integer_number>`
783                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
784    ================== ====================================================================
785
786Examples:
787
788.. parsed-literal::
789
790  offset:x+y
791  offset:0x10
792
793glc
794~~~
795
796See a description :ref:`here<amdgpu_synid_glc>`.
797
798slc
799~~~
800
801See a description :ref:`here<amdgpu_synid_slc>`.
802
803lds
804~~~
805
806See a description :ref:`here<amdgpu_synid_lds>`.
807
808dlc
809~~~
810
811See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
812
813tfe
814~~~
815
816See a description :ref:`here<amdgpu_synid_tfe>`.
817
818.. _amdgpu_synid_fmt:
819
820fmt
821~~~
822
823Specifies data and numeric formats used by the operation.
824The default numeric format is BUF_NUM_FORMAT_UNORM.
825The default data format is BUF_DATA_FORMAT_8.
826
827    ========================================= ===============================================================
828    Syntax                                    Description
829    ========================================= ===============================================================
830    format:{0..127}                           Use format specified as either an
831                                              :ref:`integer number<amdgpu_synid_integer_number>` or an
832                                              :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
833    format:[<data format>]                    Use the specified data format and
834                                              default numeric format.
835    format:[<numeric format>]                 Use the specified numeric format and
836                                              default data format.
837    format:[<data format>, <numeric format>]  Use the specified data and numeric formats.
838    format:[<numeric format>, <data format>]  Use the specified data and numeric formats.
839    ========================================= ===============================================================
840
841.. _amdgpu_synid_format_data:
842
843Supported data formats are defined in the following table:
844
845    ========================================= ===============================
846    Syntax                                    Note
847    ========================================= ===============================
848    BUF_DATA_FORMAT_INVALID
849    BUF_DATA_FORMAT_8                         Default value.
850    BUF_DATA_FORMAT_16
851    BUF_DATA_FORMAT_8_8
852    BUF_DATA_FORMAT_32
853    BUF_DATA_FORMAT_16_16
854    BUF_DATA_FORMAT_10_11_11
855    BUF_DATA_FORMAT_11_11_10
856    BUF_DATA_FORMAT_10_10_10_2
857    BUF_DATA_FORMAT_2_10_10_10
858    BUF_DATA_FORMAT_8_8_8_8
859    BUF_DATA_FORMAT_32_32
860    BUF_DATA_FORMAT_16_16_16_16
861    BUF_DATA_FORMAT_32_32_32
862    BUF_DATA_FORMAT_32_32_32_32
863    BUF_DATA_FORMAT_RESERVED_15
864    ========================================= ===============================
865
866.. _amdgpu_synid_format_num:
867
868Supported numeric formats are defined below:
869
870    ========================================= ===============================
871    Syntax                                    Note
872    ========================================= ===============================
873    BUF_NUM_FORMAT_UNORM                      Default value.
874    BUF_NUM_FORMAT_SNORM
875    BUF_NUM_FORMAT_USCALED
876    BUF_NUM_FORMAT_SSCALED
877    BUF_NUM_FORMAT_UINT
878    BUF_NUM_FORMAT_SINT
879    BUF_NUM_FORMAT_SNORM_OGL                  GFX7 only.
880    BUF_NUM_FORMAT_RESERVED_6                 GFX8 and GFX9 only.
881    BUF_NUM_FORMAT_FLOAT
882    ========================================= ===============================
883
884Examples:
885
886.. parsed-literal::
887
888  format:0
889  format:127
890  format:[BUF_DATA_FORMAT_16]
891  format:[BUF_DATA_FORMAT_16,BUF_NUM_FORMAT_SSCALED]
892  format:[BUF_NUM_FORMAT_FLOAT]
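
The bracketed forms above can be resolved to a pair of data and numeric formats
by filling in the defaults. The following Python sketch illustrates the idea.
The helper name ``resolve_format`` is hypothetical; it classifies names by
prefix only, does not check them against the tables above, and does not handle
the numeric ``format:{0..127}`` form.

.. code-block:: python

  DEFAULT_DATA = "BUF_DATA_FORMAT_8"
  DEFAULT_NUM = "BUF_NUM_FORMAT_UNORM"

  def resolve_format(modifier):
      """Return (data format, numeric format) for a bracketed format modifier."""
      prefix = "format:["
      if not (modifier.startswith(prefix) and modifier.endswith("]")):
          raise ValueError("expected format:[...]")
      names = [name.strip() for name in modifier[len(prefix):-1].split(",")]
      if len(names) > 2:
          raise ValueError("at most two formats may be specified")
      data, num = None, None
      for name in names:
          # The two kinds of format may appear in either order.
          if name.startswith("BUF_DATA_FORMAT_") and data is None:
              data = name
          elif name.startswith("BUF_NUM_FORMAT_") and num is None:
              num = name
          else:
              raise ValueError("unexpected or repeated format: %r" % name)
      # A missing format falls back to the documented default.
      return (data or DEFAULT_DATA, num or DEFAULT_NUM)

With this sketch, ``resolve_format("format:[BUF_NUM_FORMAT_FLOAT]")`` yields the
default data format paired with the requested numeric format.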
893
894.. _amdgpu_synid_ufmt:
895
896ufmt
897~~~~
898
899Specifies a unified format used by the operation.
900The default format is BUF_FMT_8_UNORM.
901GFX10 only.
902
903    ========================================= ===============================================================
904    Syntax                                    Description
905    ========================================= ===============================================================
906    format:{0..127}                           Use unified format specified as either an
907                                              :ref:`integer number<amdgpu_synid_integer_number>` or an
908                                              :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
909                                              Note that unified format numbers are not compatible with
910                                              format numbers used for pre-GFX10 ISA.
911    format:[<unified format>]                 Use the specified unified format.
912    ========================================= ===============================================================
913
914Unified format is a replacement for :ref:`data<amdgpu_synid_format_data>`
915and :ref:`numeric<amdgpu_synid_format_num>` formats. For compatibility with older ISA,
916:ref:`syntax with data and numeric formats<amdgpu_synid_fmt>` is still accepted
917provided that the combination of formats can be mapped to a unified format.
918
919Supported unified formats and equivalent combinations of data and numeric formats
920are defined below:
921
922    ============================== ============================== =============================
923    Syntax                         Equivalent Data Format         Equivalent Numeric Format
924    ============================== ============================== =============================
925    BUF_FMT_INVALID                BUF_DATA_FORMAT_INVALID        BUF_NUM_FORMAT_UNORM
926
927    BUF_FMT_8_UNORM                BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_UNORM
928    BUF_FMT_8_SNORM                BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SNORM
929    BUF_FMT_8_USCALED              BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_USCALED
930    BUF_FMT_8_SSCALED              BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SSCALED
931    BUF_FMT_8_UINT                 BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_UINT
932    BUF_FMT_8_SINT                 BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SINT
933
934    BUF_FMT_16_UNORM               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_UNORM
935    BUF_FMT_16_SNORM               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SNORM
936    BUF_FMT_16_USCALED             BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_USCALED
937    BUF_FMT_16_SSCALED             BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SSCALED
938    BUF_FMT_16_UINT                BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_UINT
939    BUF_FMT_16_SINT                BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SINT
940    BUF_FMT_16_FLOAT               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_FLOAT
941
942    BUF_FMT_8_8_UNORM              BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_UNORM
943    BUF_FMT_8_8_SNORM              BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SNORM
944    BUF_FMT_8_8_USCALED            BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_USCALED
945    BUF_FMT_8_8_SSCALED            BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SSCALED
946    BUF_FMT_8_8_UINT               BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_UINT
947    BUF_FMT_8_8_SINT               BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SINT
948
949    BUF_FMT_32_UINT                BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_UINT
950    BUF_FMT_32_SINT                BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_SINT
951    BUF_FMT_32_FLOAT               BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_FLOAT
952
953    BUF_FMT_16_16_UNORM            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_UNORM
954    BUF_FMT_16_16_SNORM            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SNORM
955    BUF_FMT_16_16_USCALED          BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_USCALED
956    BUF_FMT_16_16_SSCALED          BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SSCALED
957    BUF_FMT_16_16_UINT             BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_UINT
958    BUF_FMT_16_16_SINT             BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SINT
959    BUF_FMT_16_16_FLOAT            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_FLOAT
960
961    BUF_FMT_10_11_11_UNORM         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_UNORM
962    BUF_FMT_10_11_11_SNORM         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SNORM
963    BUF_FMT_10_11_11_USCALED       BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_USCALED
964    BUF_FMT_10_11_11_SSCALED       BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SSCALED
965    BUF_FMT_10_11_11_UINT          BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_UINT
966    BUF_FMT_10_11_11_SINT          BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SINT
967    BUF_FMT_10_11_11_FLOAT         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_FLOAT
968
969    BUF_FMT_11_11_10_UNORM         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_UNORM
970    BUF_FMT_11_11_10_SNORM         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SNORM
971    BUF_FMT_11_11_10_USCALED       BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_USCALED
972    BUF_FMT_11_11_10_SSCALED       BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SSCALED
973    BUF_FMT_11_11_10_UINT          BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_UINT
974    BUF_FMT_11_11_10_SINT          BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SINT
975    BUF_FMT_11_11_10_FLOAT         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_FLOAT
976
977    BUF_FMT_10_10_10_2_UNORM       BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_UNORM
978    BUF_FMT_10_10_10_2_SNORM       BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SNORM
979    BUF_FMT_10_10_10_2_USCALED     BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_USCALED
980    BUF_FMT_10_10_10_2_SSCALED     BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SSCALED
981    BUF_FMT_10_10_10_2_UINT        BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_UINT
982    BUF_FMT_10_10_10_2_SINT        BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SINT
983
984    BUF_FMT_2_10_10_10_UNORM       BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_UNORM
985    BUF_FMT_2_10_10_10_SNORM       BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SNORM
986    BUF_FMT_2_10_10_10_USCALED     BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_USCALED
987    BUF_FMT_2_10_10_10_SSCALED     BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SSCALED
988    BUF_FMT_2_10_10_10_UINT        BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_UINT
989    BUF_FMT_2_10_10_10_SINT        BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SINT
990
991    BUF_FMT_8_8_8_8_UNORM          BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_UNORM
992    BUF_FMT_8_8_8_8_SNORM          BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SNORM
993    BUF_FMT_8_8_8_8_USCALED        BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_USCALED
994    BUF_FMT_8_8_8_8_SSCALED        BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SSCALED
995    BUF_FMT_8_8_8_8_UINT           BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_UINT
996    BUF_FMT_8_8_8_8_SINT           BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SINT
997
998    BUF_FMT_32_32_UINT             BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_UINT
999    BUF_FMT_32_32_SINT             BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_SINT
1000    BUF_FMT_32_32_FLOAT            BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_FLOAT
1001
1002    BUF_FMT_16_16_16_16_UNORM      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_UNORM
1003    BUF_FMT_16_16_16_16_SNORM      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SNORM
1004    BUF_FMT_16_16_16_16_USCALED    BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_USCALED
1005    BUF_FMT_16_16_16_16_SSCALED    BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SSCALED
1006    BUF_FMT_16_16_16_16_UINT       BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_UINT
1007    BUF_FMT_16_16_16_16_SINT       BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SINT
1008    BUF_FMT_16_16_16_16_FLOAT      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_FLOAT
1009
1010    BUF_FMT_32_32_32_UINT          BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_UINT
1011    BUF_FMT_32_32_32_SINT          BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_SINT
1012    BUF_FMT_32_32_32_FLOAT         BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_FLOAT
1013    BUF_FMT_32_32_32_32_UINT       BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_UINT
1014    BUF_FMT_32_32_32_32_SINT       BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_SINT
1015    BUF_FMT_32_32_32_32_FLOAT      BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_FLOAT
1016    ============================== ============================== =============================
1017
1018Examples:
1019
1020.. parsed-literal::
1021
1022  format:0
1023  format:[BUF_FMT_32_UINT]
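
Whether a combination of data and numeric formats maps to a unified format can
be read off the table above. The following Python sketch encodes that table
compactly. The names ``SUPPORTED`` and ``to_unified`` are hypothetical and are
used for illustration only; this is not how the assembler itself is implemented.

.. code-block:: python

  _BASIC = ("UNORM", "SNORM", "USCALED", "SSCALED", "UINT", "SINT")
  _WITH_FLOAT = _BASIC + ("FLOAT",)
  _INT_FLOAT = ("UINT", "SINT", "FLOAT")

  # Data format suffix -> numeric format suffixes listed in the table above.
  SUPPORTED = {
      "8": _BASIC,
      "16": _WITH_FLOAT,
      "8_8": _BASIC,
      "32": _INT_FLOAT,
      "16_16": _WITH_FLOAT,
      "10_11_11": _WITH_FLOAT,
      "11_11_10": _WITH_FLOAT,
      "10_10_10_2": _BASIC,
      "2_10_10_10": _BASIC,
      "8_8_8_8": _BASIC,
      "32_32": _INT_FLOAT,
      "16_16_16_16": _WITH_FLOAT,
      "32_32_32": _INT_FLOAT,
      "32_32_32_32": _INT_FLOAT,
  }

  def to_unified(data_format, numeric_format):
      """Return the equivalent BUF_FMT_* name, or None if there is none."""
      d_prefix, n_prefix = "BUF_DATA_FORMAT_", "BUF_NUM_FORMAT_"
      if not (data_format.startswith(d_prefix)
              and numeric_format.startswith(n_prefix)):
          return None
      data = data_format[len(d_prefix):]
      num = numeric_format[len(n_prefix):]
      if data == "INVALID" and num == "UNORM":
          return "BUF_FMT_INVALID"
      if num in SUPPORTED.get(data, ()):
          return "BUF_FMT_%s_%s" % (data, num)
      return None

For example, ``BUF_DATA_FORMAT_32`` with ``BUF_NUM_FORMAT_UINT`` maps to
``BUF_FMT_32_UINT``, while ``BUF_DATA_FORMAT_32`` with ``BUF_NUM_FORMAT_UNORM``
has no unified equivalent in the table.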
1024
1025SMRD/SMEM Modifiers
1026-------------------
1027
1028glc
1029~~~
1030
1031See a description :ref:`here<amdgpu_synid_glc>`.
1032
1033nv
1034~~
1035
1036See a description :ref:`here<amdgpu_synid_nv>`. GFX9 only.
1037
1038dlc
1039~~~
1040
1041See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
1042
1043VINTRP Modifiers
1044----------------
1045
1046.. _amdgpu_synid_high:
1047
1048high
1049~~~~
1050
Specifies which half of the LDS word to use. The low half of the LDS word is used by default.
1052GFX9 and GFX10 only.
1053
1054    ======================================== ================================
1055    Syntax                                   Description
1056    ======================================== ================================
1057    high                                     Use high half of LDS word.
1058    ======================================== ================================
1059
1060DPP8 Modifiers
1061--------------
1062
1063GFX10 only.
1064
1065.. _amdgpu_synid_dpp8_sel:
1066
1067dpp8_sel
1068~~~~~~~~
1069
1070Selects which lanes to pull data from, within a group of 8 lanes. This is a mandatory modifier.
1071There is no default value.
1072
1073GFX10 only.
1074
1075The *dpp8_sel* modifier must specify exactly 8 values.
The first value selects which lane to read from to supply data for lane 0.
The second value controls lane 1, and so on.
1078
1079Each value may be specified as either
1080an :ref:`integer number<amdgpu_synid_integer_number>` or
1081an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1082
1083    =============================================================== ===========================
1084    Syntax                                                          Description
1085    =============================================================== ===========================
1086    dpp8:[{0..7},{0..7},{0..7},{0..7},{0..7},{0..7},{0..7},{0..7}]  Select lanes to read from.
1087    =============================================================== ===========================
1088
1089Examples:
1090
1091.. parsed-literal::
1092
1093  dpp8:[7,6,5,4,3,2,1,0]
1094  dpp8:[0,1,0,1,0,1,0,1]
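
The lane selection described above can be modeled in a few lines of Python.
This is a sketch only: the function name ``apply_dpp8`` is hypothetical, lane
data is represented as a plain list, and the :ref:`exec<amdgpu_synid_exec>` mask
and the *fi* modifier are ignored.

.. code-block:: python

  def apply_dpp8(selectors, lanes):
      """Return the value each lane receives under a dpp8 selector list."""
      if len(selectors) != 8 or any(not 0 <= s <= 7 for s in selectors):
          raise ValueError("dpp8 needs exactly 8 values in the range 0..7")
      if len(lanes) % 8 != 0:
          raise ValueError("lane count must be a multiple of 8")
      result = []
      for lane in range(len(lanes)):
          group_base = lane - lane % 8   # first lane of this group of 8
          source = group_base + selectors[lane % 8]
          result.append(lanes[source])
      return result

Under this model, ``dpp8:[7,6,5,4,3,2,1,0]`` reverses the lane order within
every group of 8 lanes.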
1095
1096.. _amdgpu_synid_fi8:
1097
1098fi
1099~~
1100
1101Controls interaction with inactive lanes for *dpp8* instructions. The default value is zero.
1102
1103Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
1104
1105GFX10 only.
1106
1107    ==================================== =====================================================
1108    Syntax                               Description
1109    ==================================== =====================================================
1110    fi:0                                 Fetch zero when accessing data from inactive lanes.
    fi:1                                 Fetch pre-existing values from inactive lanes.
1112    ==================================== =====================================================
1113
1114Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
1115:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1116
1117DPP Modifiers
1118-------------
1119
1120GFX8, GFX9 and GFX10 only.
1121
1122.. _amdgpu_synid_dpp_ctrl:
1123
1124dpp_ctrl
1125~~~~~~~~
1126
1127Specifies how data are shared between threads. This is a mandatory modifier.
1128There is no default value.
1129
1130GFX8 and GFX9 only. Use :ref:`dpp16_ctrl<amdgpu_synid_dpp16_ctrl>` for GFX10.
1131
1132Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1133
1134    ======================================== ================================================
1135    Syntax                                   Description
1136    ======================================== ================================================
1137    quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
1138    row_mirror                               Mirror threads within row.
1139    row_half_mirror                          Mirror threads within 1/2 row (8 threads).
1140    row_bcast:15                             Broadcast 15th thread of each row to next row.
1141    row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
1142    wave_shl:1                               Wavefront left shift by 1 thread.
1143    wave_rol:1                               Wavefront left rotate by 1 thread.
1144    wave_shr:1                               Wavefront right shift by 1 thread.
1145    wave_ror:1                               Wavefront right rotate by 1 thread.
1146    row_shl:{1..15}                          Row shift left by 1-15 threads.
1147    row_shr:{1..15}                          Row shift right by 1-15 threads.
1148    row_ror:{1..15}                          Row rotate right by 1-15 threads.
1149    ======================================== ================================================
1150
1151Note: numeric values may be specified as either
1152:ref:`integer numbers<amdgpu_synid_integer_number>` or
1153:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1154
1155Examples:
1156
1157.. parsed-literal::
1158
1159  quad_perm:[0, 1, 2, 3]
1160  row_shl:3
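
The permute and mirror controls can be modeled on a list of lane values. The
Python sketch below is illustrative only. It assumes that a row holds 16 lanes
(the table gives half a row as 8 threads) and that each *quad_perm* value names
the source lane for the corresponding lane of a group of 4; the function names
are hypothetical, and masks and invalid lanes are not modeled.

.. code-block:: python

  ROW_SIZE = 16  # assumption: 16 lanes per row

  def quad_perm(selectors, lanes):
      """Each lane reads from lane selectors[lane % 4] of its group of 4."""
      if len(selectors) != 4 or any(not 0 <= s <= 3 for s in selectors):
          raise ValueError("quad_perm needs exactly 4 values in the range 0..3")
      return [lanes[i - i % 4 + selectors[i % 4]] for i in range(len(lanes))]

  def row_mirror(lanes, width=ROW_SIZE):
      """Reverse lane order within each block; width=8 models row_half_mirror."""
      if len(lanes) % width != 0:
          raise ValueError("lane count must be a multiple of the block width")
      return [lanes[i - i % width + (width - 1 - i % width)]
              for i in range(len(lanes))]

In this model ``quad_perm:[0, 1, 2, 3]`` leaves the data unchanged.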
1161
1162.. _amdgpu_synid_dpp16_ctrl:
1163
1164dpp16_ctrl
1165~~~~~~~~~~
1166
1167Specifies how data are shared between threads. This is a mandatory modifier.
1168There is no default value.
1169
1170GFX10 only. Use :ref:`dpp_ctrl<amdgpu_synid_dpp_ctrl>` for GFX8 and GFX9.
1171
1172Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1173(There are only two rows in *wave32* mode.)
1174
1175    ======================================== ====================================================
1176    Syntax                                   Description
1177    ======================================== ====================================================
1178    quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
1179    row_mirror                               Mirror threads within row.
1180    row_half_mirror                          Mirror threads within 1/2 row (8 threads).
1181    row_share:{0..15}                        Share the value from the specified lane with other
1182                                             lanes in the row.
1183    row_xmask:{0..15}                        Fetch from XOR(current lane id, specified lane id).
1184    row_shl:{1..15}                          Row shift left by 1-15 threads.
1185    row_shr:{1..15}                          Row shift right by 1-15 threads.
1186    row_ror:{1..15}                          Row rotate right by 1-15 threads.
1187    ======================================== ====================================================
1188
1189Note: numeric values may be specified as either
1190:ref:`integer numbers<amdgpu_synid_integer_number>` or
1191:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1192
1193Examples:
1194
1195.. parsed-literal::
1196
1197  quad_perm:[0, 1, 2, 3]
1198  row_shl:3
1199
1200.. _amdgpu_synid_dpp32_ctrl:
1201
1202dpp32_ctrl
1203~~~~~~~~~~
1204
1205Specifies how data are shared between threads. This is a mandatory modifier.
1206There is no default value.
1207
1208May be used only with GFX90A 32-bit instructions.
1209
1210Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1211
1212    ======================================== ==================================================
1213    Syntax                                   Description
1214    ======================================== ==================================================
1215    quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
1216    row_mirror                               Mirror threads within row.
1217    row_half_mirror                          Mirror threads within 1/2 row (8 threads).
1218    row_bcast:15                             Broadcast 15th thread of each row to next row.
1219    row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
1220    wave_shl:1                               Wavefront left shift by 1 thread.
1221    wave_rol:1                               Wavefront left rotate by 1 thread.
1222    wave_shr:1                               Wavefront right shift by 1 thread.
1223    wave_ror:1                               Wavefront right rotate by 1 thread.
1224    row_shl:{1..15}                          Row shift left by 1-15 threads.
1225    row_shr:{1..15}                          Row shift right by 1-15 threads.
1226    row_ror:{1..15}                          Row rotate right by 1-15 threads.
1227    row_newbcast:{1..15}                     Broadcast a thread within a row to the whole row.
1228    ======================================== ==================================================
1229
1230Note: numeric values may be specified as either
1231:ref:`integer numbers<amdgpu_synid_integer_number>` or
1232:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1233
1234Examples:
1235
1236.. parsed-literal::
1237
1238  quad_perm:[0, 1, 2, 3]
1239  row_shl:3
1240
1241
1242.. _amdgpu_synid_dpp64_ctrl:
1243
1244dpp64_ctrl
1245~~~~~~~~~~
1246
1247Specifies how data are shared between threads. This is a mandatory modifier.
1248There is no default value.
1249
1250May be used only with GFX90A 64-bit instructions.
1251
1252Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1253
1254    ======================================== ==================================================
1255    Syntax                                   Description
1256    ======================================== ==================================================
1257    row_newbcast:{1..15}                     Broadcast a thread within a row to the whole row.
1258    ======================================== ==================================================
1259
1260Note: numeric values may be specified as either
1261:ref:`integer numbers<amdgpu_synid_integer_number>` or
1262:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1263
1264Examples:
1265
1266.. parsed-literal::
1267
1268  row_newbcast:3
1269
1270
1271.. _amdgpu_synid_row_mask:
1272
1273row_mask
1274~~~~~~~~
1275
1276Controls which rows are enabled for data sharing. By default, all rows are enabled.
1277
1278Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1279(There are only two rows in *wave32* mode.)
1280
1281    ================= ====================================================================
1282    Syntax            Description
1283    ================= ====================================================================
1284    row_mask:{0..15}  Specifies a *row mask* as a positive
1285                      :ref:`integer number <amdgpu_synid_integer_number>`
1286                      or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1287
1288                      Each of 4 bits in the mask controls one row
1289                      (0 - disabled, 1 - enabled).
1290
1291                      In *wave32* mode the values should be limited to 0..7.
1292    ================= ====================================================================
1293
1294Examples:
1295
1296.. parsed-literal::
1297
1298  row_mask:0xf
1299  row_mask:0b1010
1300  row_mask:x|y
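
A mask value can be decoded into the list of enabled rows as sketched below in
Python. The mapping of bit *N* to row *N* is an assumption made for this
example, and the helper name ``enabled_rows`` is hypothetical.

.. code-block:: python

  def enabled_rows(row_mask, num_rows=4):
      """Return the indices of the rows enabled by a 4-bit row mask."""
      if not 0 <= row_mask <= 15:
          raise ValueError("row_mask must be in the range 0..15")
      # Assumption: bit N of the mask controls row N (1 = enabled).
      return [row for row in range(num_rows) if row_mask & (1 << row)]

Under that assumption, ``row_mask:0b1010`` enables rows 1 and 3. The same
decoding would apply to a *bank mask*.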
1301
1302.. _amdgpu_synid_bank_mask:
1303
1304bank_mask
1305~~~~~~~~~
1306
1307Controls which banks are enabled for data sharing. By default, all banks are enabled.
1308
1309Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1310(There are only two rows in *wave32* mode.)
1311
1312    ================== ====================================================================
1313    Syntax             Description
1314    ================== ====================================================================
1315    bank_mask:{0..15}  Specifies a *bank mask* as a positive
1316                       :ref:`integer number <amdgpu_synid_integer_number>`
1317                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1318
1319                       Each of 4 bits in the mask controls one bank
1320                       (0 - disabled, 1 - enabled).
1321    ================== ====================================================================
1322
1323Examples:
1324
1325.. parsed-literal::
1326
1327  bank_mask:0x3
1328  bank_mask:0b0011
1329  bank_mask:x&y
1330
1331.. _amdgpu_synid_bound_ctrl:
1332
1333bound_ctrl
1334~~~~~~~~~~
1335
1336Controls data sharing when accessing an invalid lane. By default, data sharing with
1337invalid lanes is disabled.
1338
1339    ======================================== ================================================
1340    Syntax                                   Description
1341    ======================================== ================================================
1342    bound_ctrl:1                             Enables data sharing with invalid lanes.
1343
1344                                             Accessing data from an invalid lane will
1345                                             return zero.
1346    ======================================== ================================================
1347
1348.. _amdgpu_synid_fi16:
1349
1350fi
1351~~
1352
1353Controls interaction with *inactive* lanes for *dpp16* instructions. The default value is zero.
1354
1355Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
1356
1357GFX10 only.
1358
1359    ======================================== ==================================================
1360    Syntax                                   Description
1361    ======================================== ==================================================
1362    fi:0                                     Interaction with inactive lanes is controlled by
1363                                             :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
1364
    fi:1                                     Fetch pre-existing values from inactive lanes.
1366    ======================================== ==================================================
1367
1368Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
1369:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1370
1371SDWA Modifiers
1372--------------
1373
1374GFX8, GFX9 and GFX10 only.
1375
1376clamp
1377~~~~~
1378
1379See a description :ref:`here<amdgpu_synid_clamp>`.
1380
1381omod
1382~~~~
1383
1384See a description :ref:`here<amdgpu_synid_omod>`.
1385
1386GFX9 and GFX10 only.
1387
1388.. _amdgpu_synid_dst_sel:
1389
1390dst_sel
1391~~~~~~~
1392
1393Selects which bits in the destination are affected. By default, all bits are affected.
1394
1395    ======================================== ================================================
1396    Syntax                                   Description
1397    ======================================== ================================================
1398    dst_sel:DWORD                            Use bits 31:0.
1399    dst_sel:BYTE_0                           Use bits 7:0.
1400    dst_sel:BYTE_1                           Use bits 15:8.
1401    dst_sel:BYTE_2                           Use bits 23:16.
1402    dst_sel:BYTE_3                           Use bits 31:24.
1403    dst_sel:WORD_0                           Use bits 15:0.
1404    dst_sel:WORD_1                           Use bits 31:16.
1405    ======================================== ================================================
1406
1407.. _amdgpu_synid_dst_unused:
1408
1409dst_unused
1410~~~~~~~~~~
1411
1412Controls what to do with the bits in the destination which are not selected
1413by :ref:`dst_sel<amdgpu_synid_dst_sel>`.
1414By default, unused bits are preserved.
1415
1416    ======================================== ================================================
1417    Syntax                                   Description
1418    ======================================== ================================================
1419    dst_unused:UNUSED_PAD                    Pad with zeros.
1420    dst_unused:UNUSED_SEXT                   Sign-extend upper bits, zero lower bits.
1421    dst_unused:UNUSED_PRESERVE               Preserve bits.
1422    ======================================== ================================================
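
The interaction of *dst_sel* and *dst_unused* can be modeled on 32-bit integers.
The Python sketch below is illustrative only; the names ``FIELDS`` and
``write_dst`` are hypothetical, and it assumes that the low bits of the result
are placed into the selected field.

.. code-block:: python

  # Selector -> (lowest bit, width), following the dst_sel table.
  FIELDS = {
      "DWORD": (0, 32),
      "BYTE_0": (0, 8), "BYTE_1": (8, 8), "BYTE_2": (16, 8), "BYTE_3": (24, 8),
      "WORD_0": (0, 16), "WORD_1": (16, 16),
  }

  def write_dst(old, result, dst_sel="DWORD", dst_unused="UNUSED_PRESERVE"):
      """Return the new 32-bit destination value."""
      low, width = FIELDS[dst_sel]
      field_mask = ((1 << width) - 1) << low
      field = (result << low) & field_mask
      if dst_unused == "UNUSED_PRESERVE":
          rest = old & ~field_mask            # keep the unselected bits
      elif dst_unused == "UNUSED_PAD":
          rest = 0                            # pad with zeros
      elif dst_unused == "UNUSED_SEXT":
          sign = (field >> (low + width - 1)) & 1
          upper = 0xFFFFFFFF & ~((1 << (low + width)) - 1)
          rest = upper if sign else 0         # upper bits copy the sign, lower bits are zero
      else:
          raise ValueError("unknown dst_unused value: %r" % dst_unused)
      return (field | rest) & 0xFFFFFFFF

For instance, writing ``0x80`` with ``dst_sel:BYTE_1`` into a register holding
``0xAABBCCDD`` gives ``0xAABB80DD`` when unused bits are preserved.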
1423
1424.. _amdgpu_synid_src0_sel:
1425
1426src0_sel
1427~~~~~~~~
1428
1429Controls which bits in the src0 are used. By default, all bits are used.
1430
1431    ======================================== ================================================
1432    Syntax                                   Description
1433    ======================================== ================================================
1434    src0_sel:DWORD                           Use bits 31:0.
1435    src0_sel:BYTE_0                          Use bits 7:0.
1436    src0_sel:BYTE_1                          Use bits 15:8.
1437    src0_sel:BYTE_2                          Use bits 23:16.
1438    src0_sel:BYTE_3                          Use bits 31:24.
1439    src0_sel:WORD_0                          Use bits 15:0.
1440    src0_sel:WORD_1                          Use bits 31:16.
1441    ======================================== ================================================
1442
1443.. _amdgpu_synid_src1_sel:
1444
1445src1_sel
1446~~~~~~~~
1447
1448Controls which bits in the src1 are used. By default, all bits are used.
1449
1450    ======================================== ================================================
1451    Syntax                                   Description
1452    ======================================== ================================================
1453    src1_sel:DWORD                           Use bits 31:0.
1454    src1_sel:BYTE_0                          Use bits 7:0.
1455    src1_sel:BYTE_1                          Use bits 15:8.
1456    src1_sel:BYTE_2                          Use bits 23:16.
1457    src1_sel:BYTE_3                          Use bits 31:24.
1458    src1_sel:WORD_0                          Use bits 15:0.
1459    src1_sel:WORD_1                          Use bits 31:16.
1460    ======================================== ================================================
1461
1462.. _amdgpu_synid_sdwa_operand_modifiers:
1463
1464SDWA Operand Modifiers
1465----------------------
1466
1467Operand modifiers are not used separately. They are applied to source operands.
1468
1469GFX8, GFX9 and GFX10 only.
1470
1471abs
1472~~~
1473
1474See a description :ref:`here<amdgpu_synid_abs>`.
1475
1476neg
1477~~~
1478
1479See a description :ref:`here<amdgpu_synid_neg>`.
1480
1481.. _amdgpu_synid_sext:
1482
1483sext
1484~~~~
1485
Sign-extends the value of a (sub-dword) operand to fill all 32 bits.
Has no effect on 32-bit operands.
1488
1489Valid for integer operands only.
1490
1491    ======================================== ================================================
1492    Syntax                                   Description
1493    ======================================== ================================================
1494    sext(<operand>)                          Sign-extend operand value.
1495    ======================================== ================================================
1496
1497Examples:
1498
1499.. parsed-literal::
1500
1501  sext(v4)
1502  sext(v255)
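
Sign extension of a sub-dword value can be modeled as follows. This Python
sketch is illustrative only; the function name and the explicit ``width``
parameter (8 for a byte, 16 for a word) are assumptions of the example and not
part of the assembler syntax.

.. code-block:: python

  def sext(value, width):
      """Sign-extend the low `width` bits of value to a 32-bit result."""
      if not 1 <= width <= 32:
          raise ValueError("width must be in the range 1..32")
      value &= (1 << width) - 1
      sign_bit = 1 << (width - 1)
      # Flipping and then subtracting the sign bit yields the signed value.
      return ((value ^ sign_bit) - sign_bit) & 0xFFFFFFFF

A 32-bit input is returned unchanged, which matches the statement that the
modifier has no effect on 32-bit operands.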
1503
1504VOP3 Modifiers
1505--------------
1506
1507.. _amdgpu_synid_vop3_op_sel:
1508
1509op_sel
1510~~~~~~
1511
1512Selects the low [15:0] or high [31:16] operand bits for source and destination operands.
1513By default, low bits are used for all operands.
1514
The number of values specified with the op_sel modifier must match the number of instruction
operands (both source and destination). The first value controls src0, the second value controls
src1, and so on; the last value controls the destination.
1518The value 0 selects the low bits, while 1 selects the high bits.
1519
Note: the op_sel modifier affects 16-bit operands only. For 32-bit operands, the value specified
by op_sel must be 0.
1522
1523GFX9 and GFX10 only.
1524
1525    ======================================== ============================================================
1526    Syntax                                   Description
1527    ======================================== ============================================================
1528    op_sel:[{0..1},{0..1}]                   Select operand bits for instructions with 1 source operand.
1529    op_sel:[{0..1},{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
1530    op_sel:[{0..1},{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
1531    ======================================== ============================================================
1532
1533Note: numeric values may be specified as either
1534:ref:`integer numbers<amdgpu_synid_integer_number>` or
1535:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1536
1537Examples:
1538
1539.. parsed-literal::
1540
1541  op_sel:[0,0]
1542  op_sel:[0,1]
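
The ordering of *op_sel* values (sources first, destination last) can be illustrated with a
small Python model of a 16-bit add with two source operands. This is a sketch under stated
assumptions: the function names are hypothetical, the add wraps at 16 bits, and the unselected
half of the destination is assumed to be preserved, which this document does not specify.

.. code-block:: python

  def select_half(reg: int, sel: int) -> int:
      """Return bits 15:0 (sel == 0) or bits 31:16 (sel == 1) of a 32-bit register."""
      return (reg >> 16) & 0xFFFF if sel else reg & 0xFFFF


  def write_half(dst: int, result: int, sel: int) -> int:
      """Write a 16-bit result to the low or high half of dst.

      Assumption: the other half of dst is left unchanged.
      """
      if sel:
          return (dst & 0x0000FFFF) | ((result & 0xFFFF) << 16)
      return (dst & 0xFFFF0000) | (result & 0xFFFF)


  def add_u16(dst: int, src0: int, src1: int, op_sel: list) -> int:
      """Model a 16-bit add controlled by op_sel:[src0, src1, dst]."""
      if len(op_sel) != 3:
          raise ValueError("op_sel needs one value per operand (2 sources + destination)")
      a = select_half(src0, op_sel[0])   # first value controls src0
      b = select_half(src1, op_sel[1])   # second value controls src1
      return write_half(dst, a + b, op_sel[2])  # last value controls the destination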
1543
1544.. _amdgpu_synid_dpp_op_sel:
1545
1546dpp_op_sel
1547~~~~~~~~~~
1548
Special version of *op_sel* used for *permlane* opcodes to specify
DPP-like mode bits: :ref:`fi<amdgpu_synid_fi16>` and
:ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
1552
1553GFX10 only.
1554
1555    ======================================== ============================================================
1556    Syntax                                   Description
1557    ======================================== ============================================================
1558    op_sel:[{0..1},{0..1}]                   First bit specifies :ref:`fi<amdgpu_synid_fi16>`, second
1559                                             bit specifies :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
1560    ======================================== ============================================================
1561
1562Note: numeric values may be specified as either
1563:ref:`integer numbers<amdgpu_synid_integer_number>` or
1564:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1565
1566Examples:
1567
1568.. parsed-literal::
1569
1570  op_sel:[0,0]
1571
1572.. _amdgpu_synid_clamp:
1573
1574clamp
1575~~~~~
1576
The meaning of the clamp modifier depends on the instruction.

For *v_cmp* instructions, the clamp modifier indicates that the compare signals
if a floating-point exception occurs. By default, signaling is disabled.
Not supported by GFX7.

For integer operations, the clamp modifier indicates that the result must be clamped
to the range between the smallest and largest representable values. By default, there is no clamping.
Integer clamping is not supported by GFX7.

For floating-point operations, the clamp modifier indicates that the result must be clamped
to the range [0.0, 1.0]. By default, there is no clamping.

Note: the clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
1591
1592    ======================================== ================================================
1593    Syntax                                   Description
1594    ======================================== ================================================
1595    clamp                                    Enables clamping (or signaling).
1596    ======================================== ================================================
1597
1598.. _amdgpu_synid_omod:
1599
1600omod
1601~~~~
1602
Specifies whether an output modifier must be applied to the result.
By default, no output modifiers are applied.
1605
1606Note: output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).
1607
1608Output modifiers are valid for f32 and f64 floating point results only.
1609They must not be used with f16.
1610
1611Note: *v_cvt_f16_f32* is an exception. This instruction produces f16 result
1612but accepts output modifiers.
1613
1614    ======================================== ================================================
1615    Syntax                                   Description
1616    ======================================== ================================================
1617    mul:2                                    Multiply the result by 2.
1618    mul:4                                    Multiply the result by 4.
1619    div:2                                    Multiply the result by 0.5.
1620    ======================================== ================================================
1621
1622Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
1623:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1624
1625Examples:
1626
1627.. parsed-literal::
1628
1629  mul:2
1630  mul:x      // x must be equal to 2 or 4
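
The order in which output modifiers and floating-point clamping are applied matters, as the
following Python sketch shows. It is an illustrative model only: the function name and the
string encoding of the modifier are assumptions, and hardware details such as NaN and
denormal handling are ignored.

.. code-block:: python

  def apply_omod_and_clamp(result: float, omod: str = "", clamp: bool = False) -> float:
      """Apply an output modifier first, then clamp to [0.0, 1.0]."""
      scale = {"": 1.0, "mul:2": 2.0, "mul:4": 4.0, "div:2": 0.5}
      if omod not in scale:
          raise ValueError(f"unsupported output modifier: {omod!r}")
      result *= scale[omod]                    # output modifier is applied first
      if clamp:
          result = min(max(result, 0.0), 1.0)  # clamping is applied afterwards
      return result

For example, a result of 0.75 with ``mul:2`` and ``clamp`` becomes 1.0 in this model; clamping
before multiplying would have produced 1.5 instead.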
1631
1632.. _amdgpu_synid_vop3_operand_modifiers:
1633
1634VOP3 Operand Modifiers
1635----------------------
1636
1637Operand modifiers are not used separately. They are applied to source operands.
1638
1639.. _amdgpu_synid_abs:
1640
1641abs
1642~~~
1643
1644Computes the absolute value of its operand. Must be applied before :ref:`neg<amdgpu_synid_neg>`
1645(if any). Valid for floating point operands only.
1646
1647    ======================================== ====================================================
1648    Syntax                                   Description
1649    ======================================== ====================================================
1650    abs(<operand>)                           Get the absolute value of a floating-point operand.
1651    \|<operand>|                             The same as above (an SP3 syntax).
1652    ======================================== ====================================================
1653
Note: avoid using SP3 syntax with operands specified as expressions because the trailing '|'
may be misinterpreted. Such operands should be enclosed in additional parentheses, as shown
in the examples below.
1657
1658Examples:
1659
1660.. parsed-literal::
1661
1662  abs(v36)
1663  \|v36|
1664  abs(x|y)     // ok
1665  \|(x|y)|      // additional parentheses are required
1666
1667.. _amdgpu_synid_neg:
1668
1669neg
1670~~~
1671
1672Computes the negative value of its operand. Must be applied after :ref:`abs<amdgpu_synid_abs>`
1673(if any). Valid for floating point operands only.
1674
1675    ================== ====================================================
1676    Syntax             Description
1677    ================== ====================================================
1678    neg(<operand>)     Get the negative value of a floating-point operand.
1679                       The operand may include an optional
1680                       :ref:`abs<amdgpu_synid_abs>` modifier.
1681    -<operand>         The same as above (an SP3 syntax).
1682    ================== ====================================================
1683
1684Note: SP3 syntax is supported with limitations because of a potential ambiguity.
1685Currently it is allowed in the following cases:
1686
1687* Before a register.
1688* Before an :ref:`abs<amdgpu_synid_abs>` modifier.
1689* Before an SP3 :ref:`abs<amdgpu_synid_abs>` modifier.
1690
1691In all other cases "-" is handled as a part of an expression that follows the sign.
1692
1693Examples:
1694
1695.. parsed-literal::
1696
1697  // Operands with negate modifiers
1698  neg(v[0])
1699  neg(1.0)
1700  neg(abs(v0))
1701  -v5
1702  -abs(v5)
1703  -\|v5|
1704
1705  // Operands without negate modifiers
1706  -1
1707  -x+y
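
The required ordering of the two operand modifiers (*abs* first, then *neg*) can be sketched
in Python on 32-bit floating-point bit patterns. This is an illustrative model: the helper
names are hypothetical, and treating the modifiers as operations on the sign bit is an
assumption based on the IEEE 754 encoding rather than something stated in this document.

.. code-block:: python

  import struct


  def f32_bits(x: float) -> int:
      """Return the 32-bit IEEE 754 encoding of x."""
      return struct.unpack("<I", struct.pack("<f", x))[0]


  def bits_f32(bits: int) -> float:
      """Decode a 32-bit IEEE 754 bit pattern."""
      return struct.unpack("<f", struct.pack("<I", bits))[0]


  def apply_modifiers(x: float, abs_: bool = False, neg: bool = False) -> float:
      """Model neg(abs(x)): abs is applied before neg."""
      bits = f32_bits(x)
      if abs_:
          bits &= 0x7FFFFFFF   # abs: clear the sign bit
      if neg:
          bits ^= 0x80000000   # neg: flip the sign bit
      return bits_f32(bits)

In this model ``apply_modifiers(x, abs_=True, neg=True)`` corresponds to ``neg(abs(v0))`` or
``-|v0|`` and always yields a non-positive value.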
1708
1709VOP3P Modifiers
1710---------------
1711
1712This section describes modifiers of *regular* VOP3P instructions.
1713
1714*v_mad_mix\** and *v_fma_mix\**
1715instructions use these modifiers :ref:`in a special manner<amdgpu_synid_mad_mix>`.
1716
1717GFX9 and GFX10 only.
1718
1719.. _amdgpu_synid_op_sel:
1720
1721op_sel
1722~~~~~~
1723
1724Selects the low [15:0] or high [31:16] operand bits as input to the operation
1725which results in the lower-half of the destination.
1726By default, low bits are used for all operands.
1727
1728The number of values specified by the *op_sel* modifier must match the number of source
1729operands. First value controls src0, second value controls src1 and so on.
1730
1731The value 0 selects the low bits, while 1 selects the high bits.
1732
1733    ================================= =============================================================
1734    Syntax                            Description
1735    ================================= =============================================================
1736    op_sel:[{0..1}]                   Select operand bits for instructions with 1 source operand.
1737    op_sel:[{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
1738    op_sel:[{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
1739    ================================= =============================================================
1740
1741Note: numeric values may be specified as either
1742:ref:`integer numbers<amdgpu_synid_integer_number>` or
1743:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1744
1745Examples:
1746
1747.. parsed-literal::
1748
1749  op_sel:[0,0]
1750  op_sel:[0,1,0]
1751
1752.. _amdgpu_synid_op_sel_hi:
1753
1754op_sel_hi
1755~~~~~~~~~
1756
1757Selects the low [15:0] or high [31:16] operand bits as input to the operation
1758which results in the upper-half of the destination.
1759By default, high bits are used for all operands.
1760
1761The number of values specified by the *op_sel_hi* modifier must match the number of source
1762operands. First value controls src0, second value controls src1 and so on.
1763
1764The value 0 selects the low bits, while 1 selects the high bits.
1765
1766    =================================== =============================================================
1767    Syntax                              Description
1768    =================================== =============================================================
1769    op_sel_hi:[{0..1}]                  Select operand bits for instructions with 1 source operand.
1770    op_sel_hi:[{0..1},{0..1}]           Select operand bits for instructions with 2 source operands.
1771    op_sel_hi:[{0..1},{0..1},{0..1}]    Select operand bits for instructions with 3 source operands.
1772    =================================== =============================================================
1773
1774Note: numeric values may be specified as either
1775:ref:`integer numbers<amdgpu_synid_integer_number>` or
1776:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1777
1778Examples:
1779
1780.. parsed-literal::
1781
1782  op_sel_hi:[0,0]
1783  op_sel_hi:[0,0,1]
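
The interplay of *op_sel* and *op_sel_hi* can be sketched with a Python model of a packed
16-bit add with two source operands. The function names are hypothetical and the add is
assumed to wrap at 16 bits; the defaults mirror the documented ones (low bits for *op_sel*,
high bits for *op_sel_hi*).

.. code-block:: python

  def half(reg: int, sel: int) -> int:
      """Return bits 15:0 (sel == 0) or bits 31:16 (sel == 1) of a 32-bit register."""
      return (reg >> 16) & 0xFFFF if sel else reg & 0xFFFF


  def pk_add_u16(src0: int, src1: int, op_sel=(0, 0), op_sel_hi=(1, 1)) -> int:
      """Model a packed add: op_sel feeds the lower half of the result,
      op_sel_hi feeds the upper half."""
      lo = (half(src0, op_sel[0]) + half(src1, op_sel[1])) & 0xFFFF
      hi = (half(src0, op_sel_hi[0]) + half(src1, op_sel_hi[1])) & 0xFFFF
      return (hi << 16) | lo

With the default values each half of the destination is computed from the matching halves
of the sources; swapping the two modifiers swaps which source halves feed each result half.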
1784
1785.. _amdgpu_synid_neg_lo:
1786
1787neg_lo
1788~~~~~~
1789
Specifies whether to change sign of operand values selected by
:ref:`op_sel<amdgpu_synid_op_sel>`. These values are then used
as input to the operation which results in the lower-half of the destination.
1793
1794The number of values specified by this modifier must match the number of source
1795operands. First value controls src0, second value controls src1 and so on.
1796
The value 0 indicates that the corresponding operand value is used unmodified;
the value 1 indicates that the negated value of the operand must be used.
1799
1800By default, operand values are used unmodified.
1801
1802This modifier is valid for floating point operands only.
1803
1804    ================================ ==================================================================
1805    Syntax                           Description
1806    ================================ ==================================================================
1807    neg_lo:[{0..1}]                  Select affected operands for instructions with 1 source operand.
1808    neg_lo:[{0..1},{0..1}]           Select affected operands for instructions with 2 source operands.
1809    neg_lo:[{0..1},{0..1},{0..1}]    Select affected operands for instructions with 3 source operands.
1810    ================================ ==================================================================
1811
1812Note: numeric values may be specified as either
1813:ref:`integer numbers<amdgpu_synid_integer_number>` or
1814:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1815
1816Examples:
1817
1818.. parsed-literal::
1819
1820  neg_lo:[0]
1821  neg_lo:[0,1]
1822
1823.. _amdgpu_synid_neg_hi:
1824
1825neg_hi
1826~~~~~~
1827
1828Specifies whether to change sign of operand values selected by
1829:ref:`op_sel_hi<amdgpu_synid_op_sel_hi>`. These values are then used
1830as input to the operation which results in the upper-half of the destination.
1831
1832The number of values specified by this modifier must match the number of source
1833operands. First value controls src0, second value controls src1 and so on.
1834
The value 0 indicates that the corresponding operand value is used unmodified;
the value 1 indicates that the negated value of the operand must be used.
1837
1838By default, operand values are used unmodified.
1839
1840This modifier is valid for floating point operands only.
1841
1842    =============================== ==================================================================
1843    Syntax                          Description
1844    =============================== ==================================================================
1845    neg_hi:[{0..1}]                 Select affected operands for instructions with 1 source operand.
1846    neg_hi:[{0..1},{0..1}]          Select affected operands for instructions with 2 source operands.
1847    neg_hi:[{0..1},{0..1},{0..1}]   Select affected operands for instructions with 3 source operands.
1848    =============================== ==================================================================
1849
1850Note: numeric values may be specified as either
1851:ref:`integer numbers<amdgpu_synid_integer_number>` or
1852:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1853
1854Examples:
1855
1856.. parsed-literal::
1857
1858  neg_hi:[1,0]
1859  neg_hi:[0,1,1]
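
How *neg_lo* and *neg_hi* combine with the selection modifiers can be sketched with a Python
model of a packed multiply. This is an illustration under stated assumptions: the function
names are hypothetical, packed operands are represented as ``(low, high)`` pairs of Python
floats, and *neg_lo* is assumed to act on the inputs of the lower-half result while *neg_hi*
acts on the inputs of the upper-half result.

.. code-block:: python

  def _inputs(srcs, sel, neg):
      """Pick one half of each packed source, then negate where requested."""
      picked = []
      for src, s, n in zip(srcs, sel, neg):
          value = src[s]                      # 0 selects the low half, 1 the high half
          picked.append(-value if n else value)
      return picked


  def pk_mul(src0, src1, op_sel=(0, 0), op_sel_hi=(1, 1),
             neg_lo=(0, 0), neg_hi=(0, 0)):
      """Model a packed multiply; returns the (low, high) halves of the result."""
      srcs = (src0, src1)
      a, b = _inputs(srcs, op_sel, neg_lo)      # inputs for the lower half
      lo = a * b
      a, b = _inputs(srcs, op_sel_hi, neg_hi)   # inputs for the upper half
      hi = a * b
      return (lo, hi)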
1860
1861clamp
1862~~~~~
1863
1864See a description :ref:`here<amdgpu_synid_clamp>`.
1865
1866.. _amdgpu_synid_mad_mix:
1867
1868VOP3P MAD_MIX/FMA_MIX Modifiers
1869-------------------------------
1870
1871*v_mad_mix\** and *v_fma_mix\**
1872instructions use *op_sel* and *op_sel_hi* modifiers
1873in a manner different from *regular* VOP3P instructions.
1874
1875See a description below.
1876
1877GFX9 and GFX10 only.
1878
1879.. _amdgpu_synid_mad_mix_op_sel:
1880
1881m_op_sel
1882~~~~~~~~
1883
This modifier has meaning only for 16-bit source operands, as indicated by
:ref:`m_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>`.
It selects either the low [15:0] or high [31:16] operand bits
as input to the operation.

The number of values specified by the *op_sel* modifier must match the number of source
operands. First value controls src0, second value controls src1 and so on.

The value 0 selects the low 16 bits, while 1 selects the high 16 bits.

By default, low bits are used for all operands.
1895
1896    =============================== ================================================
1897    Syntax                          Description
1898    =============================== ================================================
1899    op_sel:[{0..1},{0..1},{0..1}]   Select location of each 16-bit source operand.
1900    =============================== ================================================
1901
1902Note: numeric values may be specified as either
1903:ref:`integer numbers<amdgpu_synid_integer_number>` or
1904:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1905
1906Examples:
1907
1908.. parsed-literal::
1909
  op_sel:[0,1,0]
1911
1912.. _amdgpu_synid_mad_mix_op_sel_hi:
1913
1914m_op_sel_hi
1915~~~~~~~~~~~
1916
1917Selects the size of source operands: either 32 bits or 16 bits.
1918By default, 32 bits are used for all source operands.
1919
1920The number of values specified by the *op_sel_hi* modifier must match the number of source
1921operands. First value controls src0, second value controls src1 and so on.
1922
The value 0 indicates 32 bits, while 1 indicates 16 bits.
1924
1925The location of 16 bits in the operand may be specified by
1926:ref:`m_op_sel<amdgpu_synid_mad_mix_op_sel>`.
1927
1928    ======================================== ====================================
1929    Syntax                                   Description
1930    ======================================== ====================================
1931    op_sel_hi:[{0..1},{0..1},{0..1}]         Select size of each source operand.
1932    ======================================== ====================================
1933
1934Note: numeric values may be specified as either
1935:ref:`integer numbers<amdgpu_synid_integer_number>` or
1936:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1937
1938Examples:
1939
1940.. parsed-literal::
1941
1942  op_sel_hi:[1,1,1]
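
For *v_mad_mix\** and *v_fma_mix\** instructions, the two modifiers together determine how each
source operand is read. The following Python sketch models this for a single operand. The
function names are hypothetical, and decoding the selected bits as IEEE 754 f32 or f16
values is an assumption made for illustration.

.. code-block:: python

  import struct


  def f16_to_float(bits: int) -> float:
      """Decode a 16-bit IEEE 754 half-precision bit pattern."""
      return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]


  def f32_to_float(bits: int) -> float:
      """Decode a 32-bit IEEE 754 single-precision bit pattern."""
      return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


  def mix_operand(reg: int, op_sel: int, op_sel_hi: int) -> float:
      """Read one source operand of a mix instruction.

      op_sel_hi selects the operand size (0: 32 bits, 1: 16 bits);
      op_sel selects the 16-bit half and only matters for 16-bit operands.
      """
      if op_sel_hi == 0:
          return f32_to_float(reg)            # 32-bit operand, op_sel has no meaning
      half = (reg >> 16) if op_sel else reg   # 0: bits 15:0, 1: bits 31:16
      return f16_to_float(half)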
1943
1944abs
1945~~~
1946
1947See a description :ref:`here<amdgpu_synid_abs>`.
1948
1949neg
1950~~~
1951
1952See a description :ref:`here<amdgpu_synid_neg>`.
1953
1954clamp
1955~~~~~
1956
1957See a description :ref:`here<amdgpu_synid_clamp>`.
1958
1959VOP3P MFMA Modifiers
1960--------------------
1961
1962These modifiers may only be used with GFX908 and GFX90A.
1963
1964.. _amdgpu_synid_cbsz:
1965
1966cbsz
1967~~~~
1968
1969Specifies a broadcast mode.
1970
1971    =============================== ==================================================================
1972    Syntax                          Description
1973    =============================== ==================================================================
1974    cbsz:[{0..7}]                   A broadcast mode.
1975    =============================== ==================================================================
1976
1977Note: numeric value may be specified as either
1978an :ref:`integer number<amdgpu_synid_integer_number>` or
1979an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1980
1981.. _amdgpu_synid_abid:
1982
1983abid
1984~~~~
1985
1986Specifies matrix A group select.
1987
1988    =============================== ==================================================================
1989    Syntax                          Description
1990    =============================== ==================================================================
1991    abid:[{0..15}]                  Matrix A group select id.
1992    =============================== ==================================================================
1993
1994Note: numeric value may be specified as either
1995an :ref:`integer number<amdgpu_synid_integer_number>` or
1996an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1997
1998.. _amdgpu_synid_blgp:
1999
2000blgp
2001~~~~
2002
2003Specifies matrix B lane group pattern.
2004
2005    =============================== ==================================================================
2006    Syntax                          Description
2007    =============================== ==================================================================
2008    blgp:[{0..7}]                   Matrix B lane group pattern.
2009    =============================== ==================================================================
2010
2011Note: numeric value may be specified as either
2012an :ref:`integer number<amdgpu_synid_integer_number>` or
2013an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
2014
2015.. _amdgpu_synid_mfma_neg:
2016
2017neg
2018~~~
2019
2020Indicates operands that must be negated before the operation.
2021The number of values specified by this modifier must match the number of source
2022operands. First value controls src0, second value controls src1 and so on.
2023
The value 0 indicates that the corresponding operand value is used unmodified;
the value 1 indicates that the operand value must be negated before the operation.
2026
2027By default, operand values are used unmodified.
2028
2029This modifier is valid for floating point operands only.
2030
2031    =============================== ==================================================================
2032    Syntax                          Description
2033    =============================== ==================================================================
2034    neg:[{0..1},{0..1},{0..1}]      Select operands which must be negated before the operation.
2035    =============================== ==================================================================
2036
2037Note: numeric values may be specified as either
2038:ref:`integer numbers<amdgpu_synid_integer_number>` or
2039:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
2040
2041Examples:
2042
2043.. parsed-literal::
2044
2045  neg:[0,1,1]
2046