1======================================
2Syntax of AMDGPU Instruction Modifiers
3======================================
4
5.. contents::
6   :local:
7
8Conventions
9===========
10
11The following notation is used throughout this document:
12
13    =================== =============================================================
14    Notation            Description
15    =================== =============================================================
16    {0..N}              Any integer value in the range from 0 to N (inclusive).
17    <x>                 Syntax and meaning of *x* is explained elsewhere.
18    =================== =============================================================
19
20.. _amdgpu_syn_modifiers:
21
22Modifiers
23=========
24
25DS Modifiers
26------------
27
28.. _amdgpu_synid_ds_offset80:
29
30offset0
31~~~~~~~
32
Specifies the first 8-bit offset, in bytes. The default value is 0.

Used with DS instructions that expect two addresses.

    =================== ====================================================================
    Syntax              Description
    =================== ====================================================================
    offset0:{0..0xFF}   Specifies an unsigned 8-bit offset as a non-negative
                        :ref:`integer number <amdgpu_synid_integer_number>`
                        or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    =================== ====================================================================
44
45Examples:
46
47.. parsed-literal::
48
49  offset0:0xff
50  offset0:2-x
51  offset0:-x-y
52
53.. _amdgpu_synid_ds_offset81:
54
55offset1
56~~~~~~~
57
Specifies the second 8-bit offset, in bytes. The default value is 0.

Used with DS instructions that expect two addresses.

    =================== ====================================================================
    Syntax              Description
    =================== ====================================================================
    offset1:{0..0xFF}   Specifies an unsigned 8-bit offset as a non-negative
                        :ref:`integer number <amdgpu_synid_integer_number>`
                        or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    =================== ====================================================================
69
70Examples:
71
72.. parsed-literal::
73
74  offset1:0xff
75  offset1:2-x
76  offset1:-x-y
77
78.. _amdgpu_synid_ds_offset16:
79
80offset
81~~~~~~
82
83Specifies a 16-bit offset, in bytes. The default value is 0.
84
85Used with DS instructions that expect a single address.
86
    ==================== ====================================================================
    Syntax               Description
    ==================== ====================================================================
    offset:{0..0xFFFF}   Specifies an unsigned 16-bit offset as a non-negative
                         :ref:`integer number <amdgpu_synid_integer_number>`
                         or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    ==================== ====================================================================
94
95Examples:
96
97.. parsed-literal::
98
99  offset:65535
100  offset:0xffff
101  offset:-x-y
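
The range limits of the DS offsets also apply to absolute expressions: the
expression must evaluate to a value that fits the field. The following Python
sketch is illustrative only; ``fits_unsigned`` is a hypothetical helper rather
than part of any assembler, and the values chosen for the symbols *x* and *y*
are made up.

.. code-block:: python

  def fits_unsigned(value, bits):
      """Return True if value fits in an unsigned field of the given width."""
      return 0 <= value < (1 << bits)

  # Hypothetical symbol values; in real code they come from the assembly source.
  x, y = -3, -2

  print(fits_unsigned(-x - y, 8))    # offset0:-x-y evaluates to 5
  print(fits_unsigned(0xFFFF, 16))   # offset:0xffff
  print(fits_unsigned(0x100, 8))     # 256 does not fit in 8 bits

With these values the first two checks succeed and the third fails.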
102
103.. _amdgpu_synid_sw_offset16:
104
105swizzle pattern
106~~~~~~~~~~~~~~~
107
This is a special modifier which may be used only with the *ds_swizzle_b32* instruction.
It specifies a swizzle pattern in numeric or symbolic form. The default value is 0.
110
111See AMD documentation for more information.
112
    ======================================================= ===========================================================
    Syntax                                                  Description
    ======================================================= ===========================================================
    offset:{0..0xFFFF}                                      Specifies a 16-bit swizzle pattern.
    offset:swizzle(QUAD_PERM,{0..3},{0..3},{0..3},{0..3})   Specifies a quad permute mode pattern.

                                                            Each number is a lane *id*.
    offset:swizzle(BITMASK_PERM, "<mask>")                  Specifies a bitmask permute mode pattern.

                                                            The pattern converts a 5-bit lane *id* to another
                                                            lane *id* with which the lane interacts.

                                                            *mask* is a 5-character sequence which
                                                            specifies how to transform the bits of the
                                                            lane *id*.

                                                            The following characters are allowed:

                                                            * "0" - set bit to 0.

                                                            * "1" - set bit to 1.

                                                            * "p" - preserve bit.

                                                            * "i" - invert bit.

    offset:swizzle(BROADCAST,{2..32},{0..N})                Specifies a broadcast mode.

                                                            Broadcasts the value of any particular lane to
                                                            all lanes in its group.

                                                            The first numeric parameter is a group
                                                            size and must be equal to 2, 4, 8, 16 or 32.

                                                            The second numeric parameter is the index of the
                                                            lane being broadcast.

                                                            The index must be less than the group size.
    offset:swizzle(SWAP,{1..16})                            Specifies a swap mode.

                                                            Swaps the neighboring groups of
                                                            1, 2, 4, 8 or 16 lanes.
    offset:swizzle(REVERSE,{2..32})                         Specifies a reverse mode.

                                                            Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes.
    ======================================================= ===========================================================
159
160Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
161:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
162
163Examples:
164
165.. parsed-literal::
166
167  offset:255
168  offset:0xffff
169  offset:swizzle(QUAD_PERM, 0, 1, 2, 3)
170  offset:swizzle(BITMASK_PERM, "01pi0")
171  offset:swizzle(BROADCAST, 2, 0)
172  offset:swizzle(SWAP, 8)
173  offset:swizzle(REVERSE, 30 + 2)
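
The symbolic swizzle modes can be modeled as functions that map a lane *id* to
the lane it interacts with. The Python sketch below is one interpretation of
the descriptions in the table, not a definitive specification, and all function
names are hypothetical. It assumes that the first character of a BITMASK_PERM
mask describes the most significant bit of the lane *id*, and that every mode
selects a lane within the same group as the original lane.

.. code-block:: python

  def bitmask_perm(lane, mask):
      """Apply a 5-character BITMASK_PERM mask to a 5-bit lane id."""
      if len(mask) != 5 or not 0 <= lane < 32:
          raise ValueError("expected a 5-character mask and a 5-bit lane id")
      result = 0
      for pos, ch in enumerate(mask):
          bit = 1 << (4 - pos)          # assumed: first character is the MSB
          if ch == "1":
              result |= bit             # set bit to 1
          elif ch == "p":
              result |= lane & bit      # preserve bit
          elif ch == "i":
              result |= ~lane & bit     # invert bit
          elif ch != "0":               # "0" leaves the bit cleared
              raise ValueError("unexpected mask character: " + ch)
      return result

  def quad_perm(lane, perm):
      """Select a lane inside the group of 4 lanes that contains lane."""
      return (lane & ~3) | perm[lane & 3]

  def broadcast(lane, group_size, index):
      """Every lane of a group selects the lane at position index."""
      return (lane & ~(group_size - 1)) | index

  def swap(lane, group_size):
      """Neighboring groups of group_size lanes exchange places."""
      return lane ^ group_size

  def reverse(lane, group_size):
      """Lane order is reversed inside each group of group_size lanes."""
      return lane ^ (group_size - 1)

  print(bitmask_perm(0, "01pi0"))      # 10
  print(quad_perm(5, (0, 1, 2, 3)))    # 5 (identity permutation)
  print(broadcast(7, 4, 0))            # 4
  print(swap(3, 8))                    # 11
  print(reverse(0, 32))                # 31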
174
175.. _amdgpu_synid_gds:
176
177gds
178~~~
179
180Specifies whether to use GDS or LDS memory (LDS is the default).
181
182    ======================================== ================================================
183    Syntax                                   Description
184    ======================================== ================================================
185    gds                                      Use GDS memory.
186    ======================================== ================================================
187
188
189EXP Modifiers
190-------------
191
192.. _amdgpu_synid_done:
193
194done
195~~~~
196
Specifies whether this is the last export from the shader to the target. By default,
the *exp* instruction does not finish an export sequence.
199
200    ======================================== ================================================
201    Syntax                                   Description
202    ======================================== ================================================
203    done                                     Indicates the last export operation.
204    ======================================== ================================================
205
206.. _amdgpu_synid_compr:
207
208compr
209~~~~~
210
211Indicates if the data are compressed (data are not compressed by default).
212
213    ======================================== ================================================
214    Syntax                                   Description
215    ======================================== ================================================
216    compr                                    Data are compressed.
217    ======================================== ================================================
218
219.. _amdgpu_synid_vm:
220
221vm
222~~
223
224Specifies valid mask flag state (off by default).
225
226    ======================================== ================================================
227    Syntax                                   Description
228    ======================================== ================================================
229    vm                                       Set valid mask flag.
230    ======================================== ================================================
231
232FLAT Modifiers
233--------------
234
235.. _amdgpu_synid_flat_offset12:
236
237offset12
238~~~~~~~~
239
240Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
241
242Cannot be used with *global/scratch* opcodes. GFX9 only.
243
    ================= ====================================================================
    Syntax            Description
    ================= ====================================================================
    offset:{0..4095}  Specifies a 12-bit unsigned offset as a non-negative
                      :ref:`integer number <amdgpu_synid_integer_number>`
                      or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    ================= ====================================================================
251
252Examples:
253
254.. parsed-literal::
255
256  offset:4095
257  offset:x-0xff
258
259.. _amdgpu_synid_flat_offset13s:
260
261offset13s
262~~~~~~~~~
263
264Specifies an immediate signed 13-bit offset, in bytes. The default value is 0.
265
266Can be used with *global/scratch* opcodes only. GFX9 only.
267
268    ===================== ====================================================================
269    Syntax                Description
270    ===================== ====================================================================
271    offset:{-4096..4095}  Specifies a 13-bit signed offset as an
272                          :ref:`integer number <amdgpu_synid_integer_number>`
273                          or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
274    ===================== ====================================================================
275
276Examples:
277
278.. parsed-literal::
279
280  offset:-4000
281  offset:0x10
282  offset:-x
283
284.. _amdgpu_synid_flat_offset12s:
285
286offset12s
287~~~~~~~~~
288
289Specifies an immediate signed 12-bit offset, in bytes. The default value is 0.
290
291Can be used with *global/scratch* opcodes only.
292
293GFX10 only.
294
295    ===================== ====================================================================
296    Syntax                Description
297    ===================== ====================================================================
298    offset:{-2048..2047}  Specifies a 12-bit signed offset as an
299                          :ref:`integer number <amdgpu_synid_integer_number>`
300                          or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
301    ===================== ====================================================================
302
303Examples:
304
305.. parsed-literal::
306
307  offset:-2000
308  offset:0x10
309  offset:-x+y
310
311.. _amdgpu_synid_flat_offset11:
312
313offset11
314~~~~~~~~
315
316Specifies an immediate unsigned 11-bit offset, in bytes. The default value is 0.
317
318Cannot be used with *global/scratch* opcodes.
319
320GFX10 only.
321
    ================= ====================================================================
    Syntax            Description
    ================= ====================================================================
    offset:{0..2047}  Specifies an 11-bit unsigned offset as a non-negative
                      :ref:`integer number <amdgpu_synid_integer_number>`
                      or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    ================= ====================================================================
329
330Examples:
331
332.. parsed-literal::
333
334  offset:2047
335  offset:x+0xff
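
The ranges of the four FLAT offset modifiers follow directly from their bit
width and signedness. The Python sketch below derives them; ``offset_range``
and ``FLAT_OFFSETS`` are hypothetical names used only for illustration.

.. code-block:: python

  def offset_range(bits, signed):
      """Return the inclusive (min, max) range of an N-bit immediate offset."""
      if signed:
          return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
      return 0, (1 << bits) - 1

  # (bit width, signed) for each FLAT offset modifier described above.
  FLAT_OFFSETS = {
      "offset12": (12, False),   # GFX9, not for global/scratch opcodes
      "offset13s": (13, True),   # GFX9, global/scratch opcodes only
      "offset12s": (12, True),   # GFX10, global/scratch opcodes only
      "offset11": (11, False),   # GFX10, not for global/scratch opcodes
  }

  for name, (bits, signed) in FLAT_OFFSETS.items():
      print(name, offset_range(bits, signed))

The printed ranges match the syntax tables: 0..4095, -4096..4095, -2048..2047
and 0..2047.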
336
337dlc
338~~~
339
340See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
341
342glc
343~~~
344
345See a description :ref:`here<amdgpu_synid_glc>`.
346
347lds
348~~~
349
350See a description :ref:`here<amdgpu_synid_lds>`. GFX10 only.
351
352slc
353~~~
354
355See a description :ref:`here<amdgpu_synid_slc>`.
356
357tfe
358~~~
359
360See a description :ref:`here<amdgpu_synid_tfe>`.
361
362nv
363~~
364
365See a description :ref:`here<amdgpu_synid_nv>`.
366
367MIMG Modifiers
368--------------
369
370.. _amdgpu_synid_dmask:
371
372dmask
373~~~~~
374
375Specifies which channels (image components) are used by the operation. By default, no channels
376are used.
377
    =============== ====================================================================
    Syntax          Description
    =============== ====================================================================
    dmask:{0..15}   Specifies image channels as a non-negative
                    :ref:`integer number <amdgpu_synid_integer_number>`
                    or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.

                    Each bit corresponds to one of 4 image components (RGBA).

                    A bit value of 0 means that the component is not used;
                    a bit value of 1 means that the component is used.
    =============== ====================================================================
390
391This modifier has some limitations depending on instruction kind:
392
393    =================================================== ========================
394    Instruction Kind                                    Valid dmask Values
395    =================================================== ========================
396    32-bit atomic *cmpswap*                             0x3
397    32-bit atomic instructions except for *cmpswap*     0x1
398    64-bit atomic *cmpswap*                             0xF
399    64-bit atomic instructions except for *cmpswap*     0x3
400    *gather4*                                           0x1, 0x2, 0x4, 0x8
401    Other instructions                                  any value
402    =================================================== ========================
403
404Examples:
405
406.. parsed-literal::
407
408  dmask:0xf
409  dmask:0b1111
410  dmask:x|y|z
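
A *dmask* value can be assembled from the channels an operation needs. The
Python sketch below is illustrative; the helper names are hypothetical, and the
bit assignment (bit 0 for R up to bit 3 for A) is an assumption that this
document does not state explicitly.

.. code-block:: python

  # Assumed bit assignment: bit 0 = R, bit 1 = G, bit 2 = B, bit 3 = A.
  CHANNEL_BITS = {"r": 0x1, "g": 0x2, "b": 0x4, "a": 0x8}

  def make_dmask(channels):
      """Build a dmask value from a string of channel letters, e.g. "rgb"."""
      mask = 0
      for ch in channels.lower():
          mask |= CHANNEL_BITS[ch]
      return mask

  def is_valid_gather4_dmask(mask):
      """gather4 accepts exactly one enabled channel: 0x1, 0x2, 0x4 or 0x8."""
      return mask in (0x1, 0x2, 0x4, 0x8)

  print(hex(make_dmask("rgba")))                  # 0xf
  print(is_valid_gather4_dmask(make_dmask("g")))  # True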
411
412.. _amdgpu_synid_unorm:
413
414unorm
415~~~~~
416
417Specifies whether the address is normalized or not (the address is normalized by default).
418
419    ======================== ========================================
420    Syntax                   Description
421    ======================== ========================================
422    unorm                    Force the address to be unnormalized.
423    ======================== ========================================
424
425glc
426~~~
427
428See a description :ref:`here<amdgpu_synid_glc>`.
429
430slc
431~~~
432
433See a description :ref:`here<amdgpu_synid_slc>`.
434
435.. _amdgpu_synid_r128:
436
437r128
438~~~~
439
440Specifies texture resource size. The default size is 256 bits.
441
442GFX7, GFX8 and GFX10 only.
443
    =================== ================================================
    Syntax              Description
    =================== ================================================
    r128                Specifies a 128-bit texture resource size.
    =================== ================================================

.. WARNING:: Using this modifier should decrease the *rsrc* operand size from 8 to 4 dwords, but the assembler does not currently support this feature.
451
452tfe
453~~~
454
455See a description :ref:`here<amdgpu_synid_tfe>`.
456
457.. _amdgpu_synid_lwe:
458
459lwe
460~~~
461
462Specifies LOD warning status (LOD warning is disabled by default).
463
464    ======================================== ================================================
465    Syntax                                   Description
466    ======================================== ================================================
467    lwe                                      Enables LOD warning.
468    ======================================== ================================================
469
470.. _amdgpu_synid_da:
471
472da
473~~
474
Specifies whether an array index must be sent to TA. By default, the array index is not sent.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    da                                       Send an array index to TA.
    ======================================== ================================================
482
483.. _amdgpu_synid_d16:
484
485d16
486~~~
487
488Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.
489
    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    d16                                      Enables 16-bit data mode.

                                             On loads, convert data in memory to 16-bit
                                             format before storing it in VGPRs.

                                             On stores, convert 16-bit data in VGPRs to
                                             32 bits before going to memory.

                                             Note that GFX8.0 does not support data packing.
                                             Each 16-bit data element occupies 1 VGPR.

                                             GFX8.1, GFX9 and GFX10 support data packing.
                                             Each pair of 16-bit data elements
                                             occupies 1 VGPR.
    ======================================== ================================================
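
The effect of data packing on register usage can be illustrated with a short
Python sketch. The function name is hypothetical, and rounding up for an odd
number of elements in packed mode is an assumption rather than something this
document states.

.. code-block:: python

  def d16_vgpr_count(num_elements, packed):
      """Return the number of VGPRs occupied by 16-bit elements in d16 mode.

      packed=False models GFX8.0 (one element per VGPR); packed=True models
      GFX8.1, GFX9 and GFX10 (two elements per VGPR).
      """
      if packed:
          return (num_elements + 1) // 2   # assumed: odd counts round up
      return num_elements

  print(d16_vgpr_count(4, packed=False))   # 4
  print(d16_vgpr_count(4, packed=True))    # 2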
508
509.. _amdgpu_synid_a16:
510
511a16
512~~~
513
Specifies the size of image address components: 16 or 32 bits (32 bits by default).
GFX9 and GFX10 only.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    a16                                      Enables 16-bit image address components.
    ======================================== ================================================
522
523.. _amdgpu_synid_dim:
524
525dim
526~~~
527
528Specifies surface dimension. This is a mandatory modifier. There is no default value.
529
530GFX10 only.
531
    =============================== =========================================================
    Syntax                          Description
    =============================== =========================================================
    dim:1D                          One-dimensional image.
    dim:2D                          Two-dimensional image.
    dim:3D                          Three-dimensional image.
    dim:CUBE                        Cubemap array.
    dim:1D_ARRAY                    One-dimensional image array.
    dim:2D_ARRAY                    Two-dimensional image array.
    dim:2D_MSAA                     Two-dimensional multi-sample anti-aliasing image.
    dim:2D_MSAA_ARRAY               Two-dimensional multi-sample anti-aliasing image array.
    =============================== =========================================================

The following table defines an alternative syntax which is supported
for compatibility with the SP3 assembler:

    =============================== =========================================================
    Syntax                          Description
    =============================== =========================================================
    dim:SQ_RSRC_IMG_1D              One-dimensional image.
    dim:SQ_RSRC_IMG_2D              Two-dimensional image.
    dim:SQ_RSRC_IMG_3D              Three-dimensional image.
    dim:SQ_RSRC_IMG_CUBE            Cubemap array.
    dim:SQ_RSRC_IMG_1D_ARRAY        One-dimensional image array.
    dim:SQ_RSRC_IMG_2D_ARRAY        Two-dimensional image array.
    dim:SQ_RSRC_IMG_2D_MSAA         Two-dimensional multi-sample anti-aliasing image.
    dim:SQ_RSRC_IMG_2D_MSAA_ARRAY   Two-dimensional multi-sample anti-aliasing image array.
    =============================== =========================================================
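
The two *dim* syntaxes differ only by the ``SQ_RSRC_IMG_`` prefix, so a tool
that processes assembly text could normalize one form to the other. The Python
sketch below shows one way to do this; ``normalize_dim`` is a hypothetical
helper and not part of any existing tool.

.. code-block:: python

  SHORT_DIMS = {
      "1D", "2D", "3D", "CUBE",
      "1D_ARRAY", "2D_ARRAY", "2D_MSAA", "2D_MSAA_ARRAY",
  }
  SP3_PREFIX = "SQ_RSRC_IMG_"

  def normalize_dim(name):
      """Map either dim syntax to the short form, e.g. "2D_MSAA"."""
      short = name[len(SP3_PREFIX):] if name.startswith(SP3_PREFIX) else name
      if short not in SHORT_DIMS:
          raise ValueError("unknown dim value: " + name)
      return short

  print(normalize_dim("SQ_RSRC_IMG_2D_MSAA"))   # 2D_MSAA
  print(normalize_dim("CUBE"))                  # CUBE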
560
561dlc
562~~~
563
564See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
565
566Miscellaneous Modifiers
567-----------------------
568
569.. _amdgpu_synid_dlc:
570
571dlc
572~~~
573
Controls the device-level cache policy for memory operations. Used for synchronization.
When specified, forces the operation to bypass the device-level cache, making the operation
device-level coherent. By default, instructions use the device-level cache.
577
578GFX10 only.
579
580    ======================================== ================================================
581    Syntax                                   Description
582    ======================================== ================================================
583    dlc                                      Bypass device level cache.
584    ======================================== ================================================
585
586.. _amdgpu_synid_glc:
587
588glc
589~~~
590
This modifier has a different meaning for loads, stores, and atomic operations.
The default value is off (0).
593
594See AMD documentation for details.
595
596    ======================================== ================================================
597    Syntax                                   Description
598    ======================================== ================================================
599    glc                                      Set glc bit to 1.
600    ======================================== ================================================
601
602.. _amdgpu_synid_lds:
603
604lds
605~~~
606
607Specifies where to store the result: VGPRs or LDS (VGPRs by default).
608
609    ======================================== ===========================
610    Syntax                                   Description
611    ======================================== ===========================
612    lds                                      Store result in LDS.
613    ======================================== ===========================
614
615.. _amdgpu_synid_nv:
616
617nv
618~~
619
Specifies whether the instruction operates on non-volatile memory. By default, memory is volatile.

GFX9 only.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    nv                                       Indicates that the instruction operates on
                                             non-volatile memory.
    ======================================== ================================================
630
631.. _amdgpu_synid_slc:
632
633slc
634~~~
635
636Specifies cache policy. The default value is off (0).
637
638See AMD documentation for details.
639
640    ======================================== ================================================
641    Syntax                                   Description
642    ======================================== ================================================
643    slc                                      Set slc bit to 1.
644    ======================================== ================================================
645
646.. _amdgpu_synid_tfe:
647
648tfe
649~~~
650
651Controls access to partially resident textures. The default value is off (0).
652
653See AMD documentation for details.
654
655    ======================================== ================================================
656    Syntax                                   Description
657    ======================================== ================================================
658    tfe                                      Set tfe bit to 1.
659    ======================================== ================================================
660
661MUBUF/MTBUF Modifiers
662---------------------
663
664.. _amdgpu_synid_idxen:
665
666idxen
667~~~~~
668
669Specifies whether address components include an index. By default, no components are used.
670
671Can be used together with :ref:`offen<amdgpu_synid_offen>`.
672
673Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
674
675    ======================================== ================================================
676    Syntax                                   Description
677    ======================================== ================================================
678    idxen                                    Address components include an index.
679    ======================================== ================================================
680
681.. _amdgpu_synid_offen:
682
683offen
684~~~~~
685
686Specifies whether address components include an offset. By default, no components are used.
687
688Can be used together with :ref:`idxen<amdgpu_synid_idxen>`.
689
690Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.
691
692    ======================================== ================================================
693    Syntax                                   Description
694    ======================================== ================================================
695    offen                                    Address components include an offset.
696    ======================================== ================================================
697
698.. _amdgpu_synid_addr64:
699
700addr64
701~~~~~~
702
703Specifies whether a 64-bit address is used. By default, no address is used.
704
GFX7 only. Cannot be used with the :ref:`offen<amdgpu_synid_offen>` or
:ref:`idxen<amdgpu_synid_idxen>` modifiers.
707
708    ======================================== ================================================
709    Syntax                                   Description
710    ======================================== ================================================
711    addr64                                   A 64-bit address is used.
712    ======================================== ================================================
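
The compatibility rules for *idxen*, *offen* and *addr64* can be summarized as
a simple check. The Python sketch below is illustrative only; the function name
is hypothetical, and it covers only the rules stated in this section.

.. code-block:: python

  def addressing_modifiers_compatible(modifiers):
      """Return True if the given addressing modifiers may be combined.

      idxen and offen may be used together; addr64 excludes both of them.
      """
      mods = set(modifiers)
      if "addr64" in mods and mods & {"idxen", "offen"}:
          return False
      return True

  print(addressing_modifiers_compatible(["idxen", "offen"]))    # True
  print(addressing_modifiers_compatible(["addr64", "offen"]))   # False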
713
714.. _amdgpu_synid_buf_offset12:
715
716offset12
717~~~~~~~~
718
719Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.
720
    ================== ====================================================================
    Syntax             Description
    ================== ====================================================================
    offset:{0..0xFFF}  Specifies a 12-bit unsigned offset as a non-negative
                       :ref:`integer number <amdgpu_synid_integer_number>`
                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    ================== ====================================================================
728
729Examples:
730
731.. parsed-literal::
732
733  offset:x+y
734  offset:0x10
735
736glc
737~~~
738
739See a description :ref:`here<amdgpu_synid_glc>`.
740
741slc
742~~~
743
744See a description :ref:`here<amdgpu_synid_slc>`.
745
746lds
747~~~
748
749See a description :ref:`here<amdgpu_synid_lds>`.
750
751dlc
752~~~
753
754See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
755
756tfe
757~~~
758
759See a description :ref:`here<amdgpu_synid_tfe>`.
760
761.. _amdgpu_synid_fmt:
762
763fmt
764~~~
765
766Specifies data and numeric formats used by the operation.
767The default numeric format is BUF_NUM_FORMAT_UNORM.
768The default data format is BUF_DATA_FORMAT_8.
769
770    ========================================= ===============================================================
771    Syntax                                    Description
772    ========================================= ===============================================================
773    format:{0..127}                           Use format specified as either an
774                                              :ref:`integer number<amdgpu_synid_integer_number>` or an
775                                              :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
776    format:[<data format>]                    Use the specified data format and
777                                              default numeric format.
778    format:[<numeric format>]                 Use the specified numeric format and
779                                              default data format.
780    format:[<data format>, <numeric format>]  Use the specified data and numeric formats.
781    format:[<numeric format>, <data format>]  Use the specified data and numeric formats.
782    ========================================= ===============================================================
783
784.. _amdgpu_synid_format_data:
785
786Supported data formats are defined in the following table:
787
788    ========================================= ===============================
789    Syntax                                    Note
790    ========================================= ===============================
791    BUF_DATA_FORMAT_INVALID
792    BUF_DATA_FORMAT_8                         Default value.
793    BUF_DATA_FORMAT_16
794    BUF_DATA_FORMAT_8_8
795    BUF_DATA_FORMAT_32
796    BUF_DATA_FORMAT_16_16
797    BUF_DATA_FORMAT_10_11_11
798    BUF_DATA_FORMAT_11_11_10
799    BUF_DATA_FORMAT_10_10_10_2
800    BUF_DATA_FORMAT_2_10_10_10
801    BUF_DATA_FORMAT_8_8_8_8
802    BUF_DATA_FORMAT_32_32
803    BUF_DATA_FORMAT_16_16_16_16
804    BUF_DATA_FORMAT_32_32_32
805    BUF_DATA_FORMAT_32_32_32_32
806    BUF_DATA_FORMAT_RESERVED_15
807    ========================================= ===============================
808
809.. _amdgpu_synid_format_num:
810
811Supported numeric formats are defined below:
812
813    ========================================= ===============================
814    Syntax                                    Note
815    ========================================= ===============================
816    BUF_NUM_FORMAT_UNORM                      Default value.
817    BUF_NUM_FORMAT_SNORM
818    BUF_NUM_FORMAT_USCALED
819    BUF_NUM_FORMAT_SSCALED
820    BUF_NUM_FORMAT_UINT
821    BUF_NUM_FORMAT_SINT
822    BUF_NUM_FORMAT_SNORM_OGL                  GFX7 only.
823    BUF_NUM_FORMAT_RESERVED_6                 GFX8 and GFX9 only.
824    BUF_NUM_FORMAT_FLOAT
825    ========================================= ===============================
826
827Examples:
828
829.. parsed-literal::
830
831  format:0
832  format:127
833  format:[BUF_DATA_FORMAT_16]
834  format:[BUF_DATA_FORMAT_16,BUF_NUM_FORMAT_SSCALED]
835  format:[BUF_NUM_FORMAT_FLOAT]
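
The numeric and symbolic forms select the same two fields. The following Python
sketch shows one way the combined number could be derived from the names. It assumes
that the value of each name is its position in the tables above, and that the data
format occupies bits 3:0 and the numeric format bits 6:4 of the combined value;
neither assumption is stated in this document. The helper ``encode_format`` is
hypothetical.

.. code-block:: python

  # Assumed values: each name is numbered by its position in the tables above.
  DATA_FORMATS = [
      "INVALID", "8", "16", "8_8", "32", "16_16", "10_11_11", "11_11_10",
      "10_10_10_2", "2_10_10_10", "8_8_8_8", "32_32", "16_16_16_16",
      "32_32_32", "32_32_32_32", "RESERVED_15",
  ]
  NUM_FORMATS = [
      "UNORM", "SNORM", "USCALED", "SSCALED", "UINT", "SINT",
      "SNORM_OGL",  # the same slot is listed as RESERVED_6 for GFX8 and GFX9
      "FLOAT",
  ]

  def encode_format(data="8", num="UNORM"):
      """Combine a data format and a numeric format into a 7-bit number.

      The defaults follow the tables above (BUF_DATA_FORMAT_8 and
      BUF_NUM_FORMAT_UNORM). Assumption: data format in bits 3:0,
      numeric format in bits 6:4.
      """
      return DATA_FORMATS.index(data) | (NUM_FORMATS.index(num) << 4)

Under these assumptions, ``format:[BUF_DATA_FORMAT_16,BUF_NUM_FORMAT_SSCALED]`` and
``format:50`` would describe the same encoding.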
836
837.. _amdgpu_synid_ufmt:
838
839ufmt
840~~~~
841
842Specifies a unified format used by the operation.
843The default format is BUF_FMT_8_UNORM.
844GFX10 only.
845
846    ========================================= ===============================================================
847    Syntax                                    Description
848    ========================================= ===============================================================
849    format:{0..127}                           Use unified format specified as either an
850                                              :ref:`integer number<amdgpu_synid_integer_number>` or an
851                                              :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
852                                              Note that unified format numbers are not compatible with
853                                              format numbers used for pre-GFX10 ISA.
854    format:[<unified format>]                 Use the specified unified format.
855    ========================================= ===============================================================
856
857Unified format is a replacement for :ref:`data<amdgpu_synid_format_data>`
858and :ref:`numeric<amdgpu_synid_format_num>` formats. For compatibility with older ISA,
859:ref:`syntax with data and numeric formats<amdgpu_synid_fmt>` is still accepted
860provided that the combination of formats can be mapped to a unified format.
861
862Supported unified formats and equivalent combinations of data and numeric formats
863are defined below:
864
865    ============================== ============================== =============================
866    Syntax                         Equivalent Data Format         Equivalent Numeric Format
867    ============================== ============================== =============================
868    BUF_FMT_INVALID                BUF_DATA_FORMAT_INVALID        BUF_NUM_FORMAT_UNORM
869
870    BUF_FMT_8_UNORM                BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_UNORM
871    BUF_FMT_8_SNORM                BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SNORM
872    BUF_FMT_8_USCALED              BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_USCALED
873    BUF_FMT_8_SSCALED              BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SSCALED
874    BUF_FMT_8_UINT                 BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_UINT
875    BUF_FMT_8_SINT                 BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SINT
876
877    BUF_FMT_16_UNORM               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_UNORM
878    BUF_FMT_16_SNORM               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SNORM
879    BUF_FMT_16_USCALED             BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_USCALED
880    BUF_FMT_16_SSCALED             BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SSCALED
881    BUF_FMT_16_UINT                BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_UINT
882    BUF_FMT_16_SINT                BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SINT
883    BUF_FMT_16_FLOAT               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_FLOAT
884
885    BUF_FMT_8_8_UNORM              BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_UNORM
886    BUF_FMT_8_8_SNORM              BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SNORM
887    BUF_FMT_8_8_USCALED            BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_USCALED
888    BUF_FMT_8_8_SSCALED            BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SSCALED
889    BUF_FMT_8_8_UINT               BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_UINT
890    BUF_FMT_8_8_SINT               BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SINT
891
892    BUF_FMT_32_UINT                BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_UINT
893    BUF_FMT_32_SINT                BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_SINT
894    BUF_FMT_32_FLOAT               BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_FLOAT
895
896    BUF_FMT_16_16_UNORM            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_UNORM
897    BUF_FMT_16_16_SNORM            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SNORM
898    BUF_FMT_16_16_USCALED          BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_USCALED
899    BUF_FMT_16_16_SSCALED          BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SSCALED
900    BUF_FMT_16_16_UINT             BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_UINT
901    BUF_FMT_16_16_SINT             BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SINT
902    BUF_FMT_16_16_FLOAT            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_FLOAT
903
904    BUF_FMT_10_11_11_UNORM         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_UNORM
905    BUF_FMT_10_11_11_SNORM         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SNORM
906    BUF_FMT_10_11_11_USCALED       BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_USCALED
907    BUF_FMT_10_11_11_SSCALED       BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SSCALED
908    BUF_FMT_10_11_11_UINT          BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_UINT
909    BUF_FMT_10_11_11_SINT          BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SINT
910    BUF_FMT_10_11_11_FLOAT         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_FLOAT
911
912    BUF_FMT_11_11_10_UNORM         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_UNORM
913    BUF_FMT_11_11_10_SNORM         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SNORM
914    BUF_FMT_11_11_10_USCALED       BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_USCALED
915    BUF_FMT_11_11_10_SSCALED       BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SSCALED
916    BUF_FMT_11_11_10_UINT          BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_UINT
917    BUF_FMT_11_11_10_SINT          BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SINT
918    BUF_FMT_11_11_10_FLOAT         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_FLOAT
919
920    BUF_FMT_10_10_10_2_UNORM       BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_UNORM
921    BUF_FMT_10_10_10_2_SNORM       BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SNORM
922    BUF_FMT_10_10_10_2_USCALED     BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_USCALED
923    BUF_FMT_10_10_10_2_SSCALED     BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SSCALED
924    BUF_FMT_10_10_10_2_UINT        BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_UINT
925    BUF_FMT_10_10_10_2_SINT        BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SINT
926
927    BUF_FMT_2_10_10_10_UNORM       BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_UNORM
928    BUF_FMT_2_10_10_10_SNORM       BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SNORM
929    BUF_FMT_2_10_10_10_USCALED     BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_USCALED
930    BUF_FMT_2_10_10_10_SSCALED     BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SSCALED
931    BUF_FMT_2_10_10_10_UINT        BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_UINT
932    BUF_FMT_2_10_10_10_SINT        BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SINT
933
934    BUF_FMT_8_8_8_8_UNORM          BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_UNORM
935    BUF_FMT_8_8_8_8_SNORM          BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SNORM
936    BUF_FMT_8_8_8_8_USCALED        BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_USCALED
937    BUF_FMT_8_8_8_8_SSCALED        BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SSCALED
938    BUF_FMT_8_8_8_8_UINT           BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_UINT
939    BUF_FMT_8_8_8_8_SINT           BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SINT
940
941    BUF_FMT_32_32_UINT             BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_UINT
942    BUF_FMT_32_32_SINT             BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_SINT
943    BUF_FMT_32_32_FLOAT            BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_FLOAT
944
945    BUF_FMT_16_16_16_16_UNORM      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_UNORM
946    BUF_FMT_16_16_16_16_SNORM      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SNORM
947    BUF_FMT_16_16_16_16_USCALED    BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_USCALED
948    BUF_FMT_16_16_16_16_SSCALED    BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SSCALED
949    BUF_FMT_16_16_16_16_UINT       BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_UINT
950    BUF_FMT_16_16_16_16_SINT       BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SINT
951    BUF_FMT_16_16_16_16_FLOAT      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_FLOAT
952
953    BUF_FMT_32_32_32_UINT          BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_UINT
954    BUF_FMT_32_32_32_SINT          BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_SINT
    BUF_FMT_32_32_32_FLOAT         BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_FLOAT

    BUF_FMT_32_32_32_32_UINT       BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_UINT
957    BUF_FMT_32_32_32_32_SINT       BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_SINT
958    BUF_FMT_32_32_32_32_FLOAT      BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_FLOAT
959    ============================== ============================== =============================
960
961Examples:
962
963.. parsed-literal::
964
965  format:0
966  format:[BUF_FMT_32_UINT]
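
The naming pattern in the table above can be used to check whether a combination
of data and numeric formats has a unified equivalent. The following Python sketch
transcribes the supported combinations from the table; the helper ``to_unified``
and the lookup structure are hypothetical and only illustrate the mapping.

.. code-block:: python

  # Numeric formats available for each data format, transcribed from the table.
  _INT_NORM = ["UNORM", "SNORM", "USCALED", "SSCALED", "UINT", "SINT"]
  _WITH_FLOAT = _INT_NORM + ["FLOAT"]
  _INT_FLOAT = ["UINT", "SINT", "FLOAT"]

  SUPPORTED = {
      "8": _INT_NORM, "16": _WITH_FLOAT, "8_8": _INT_NORM, "32": _INT_FLOAT,
      "16_16": _WITH_FLOAT, "10_11_11": _WITH_FLOAT, "11_11_10": _WITH_FLOAT,
      "10_10_10_2": _INT_NORM, "2_10_10_10": _INT_NORM, "8_8_8_8": _INT_NORM,
      "32_32": _INT_FLOAT, "16_16_16_16": _WITH_FLOAT,
      "32_32_32": _INT_FLOAT, "32_32_32_32": _INT_FLOAT,
  }

  def to_unified(data_format, num_format):
      """Return the BUF_FMT_* name for a data/numeric pair, or None."""
      data = data_format.replace("BUF_DATA_FORMAT_", "", 1)
      num = num_format.replace("BUF_NUM_FORMAT_", "", 1)
      if data == "INVALID" and num == "UNORM":
          return "BUF_FMT_INVALID"
      if num in SUPPORTED.get(data, []):
          return "BUF_FMT_" + data + "_" + num
      return None  # no unified format corresponds to this combination

For example, ``BUF_DATA_FORMAT_32`` with ``BUF_NUM_FORMAT_UINT`` maps to
``BUF_FMT_32_UINT``, while ``BUF_DATA_FORMAT_32`` with ``BUF_NUM_FORMAT_UNORM``
has no entry in the table and yields ``None``.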
967
968SMRD/SMEM Modifiers
969-------------------
970
971glc
972~~~
973
974See a description :ref:`here<amdgpu_synid_glc>`.
975
976nv
977~~
978
979See a description :ref:`here<amdgpu_synid_nv>`. GFX9 only.
980
981dlc
982~~~
983
984See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.
985
986VINTRP Modifiers
987----------------
988
989.. _amdgpu_synid_high:
990
991high
992~~~~
993
Specifies which half of the LDS word to use. The low half of the LDS word is used by default.
995GFX9 and GFX10 only.
996
997    ======================================== ================================
998    Syntax                                   Description
999    ======================================== ================================
1000    high                                     Use high half of LDS word.
1001    ======================================== ================================
1002
1003DPP8 Modifiers
1004--------------
1005
1006GFX10 only.
1007
1008.. _amdgpu_synid_dpp8_sel:
1009
1010dpp8_sel
1011~~~~~~~~
1012
1013Selects which lanes to pull data from, within a group of 8 lanes. This is a mandatory modifier.
1014There is no default value.
1015
1016GFX10 only.
1017
1018The *dpp8_sel* modifier must specify exactly 8 values.
The first value selects the lane to read from to supply data to lane 0,
the second value does the same for lane 1, and so on.
1021
1022Each value may be specified as either
1023an :ref:`integer number<amdgpu_synid_integer_number>` or
1024an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1025
1026    =============================================================== ===========================
1027    Syntax                                                          Description
1028    =============================================================== ===========================
1029    dpp8:[{0..7},{0..7},{0..7},{0..7},{0..7},{0..7},{0..7},{0..7}]  Select lanes to read from.
1030    =============================================================== ===========================
1031
1032Examples:
1033
1034.. parsed-literal::
1035
1036  dpp8:[7,6,5,4,3,2,1,0]
1037  dpp8:[0,1,0,1,0,1,0,1]
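
The selection rule can be modeled in a few lines of Python. This is a minimal
sketch of the lane mapping described above, not a model of the hardware: it
ignores the :ref:`exec<amdgpu_synid_exec>` mask, and the function name ``dpp8``
and the list-based lane representation are assumptions made for illustration.

.. code-block:: python

  def dpp8(src, sel):
      """Apply a dpp8 selection to a list holding one value per lane.

      sel must hold exactly 8 integers in the range 0..7; sel[i] names the
      lane, within the same group of 8, that supplies data to lane i.
      """
      if len(sel) != 8 or any(not 0 <= s <= 7 for s in sel):
          raise ValueError("dpp8 needs exactly 8 values in the range 0..7")
      # (lane & ~7) is the first lane of the group; sel picks a lane inside it.
      return [src[(lane & ~7) | sel[lane & 7]] for lane in range(len(src))]

With this model, ``dpp8:[7,6,5,4,3,2,1,0]`` reverses the lane order within every
group of 8 lanes.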
1038
1039.. _amdgpu_synid_fi8:
1040
1041fi
1042~~
1043
1044Controls interaction with inactive lanes for *dpp8* instructions. The default value is zero.
1045
1046Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
1047
1048GFX10 only.
1049
1050    ==================================== =====================================================
1051    Syntax                               Description
1052    ==================================== =====================================================
1053    fi:0                                 Fetch zero when accessing data from inactive lanes.
    fi:1                                 Fetch pre-existing values from inactive lanes.
1055    ==================================== =====================================================
1056
1057Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
1058:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
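
The two settings in the table can be sketched in Python as follows. The function
``dpp8_fetch`` and the boolean list used for the exec mask are hypothetical and
serve only to illustrate the table.

.. code-block:: python

  def dpp8_fetch(src, exec_mask, src_lane, fi=0):
      """Return the value fetched from src_lane.

      exec_mask[i] is True when lane i is active.
      """
      if exec_mask[src_lane] or fi == 1:
          # Active lane, or fi:1: use the value that is already in the lane.
          return src[src_lane]
      # fi:0 and the source lane is inactive: fetch zero.
      return 0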
1059
1060DPP Modifiers
1061-------------
1062
1063GFX8, GFX9 and GFX10 only.
1064
1065.. _amdgpu_synid_dpp_ctrl:
1066
1067dpp_ctrl
1068~~~~~~~~
1069
1070Specifies how data are shared between threads. This is a mandatory modifier.
1071There is no default value.
1072
1073GFX8 and GFX9 only. Use :ref:`dpp16_ctrl<amdgpu_synid_dpp16_ctrl>` for GFX10.
1074
1075Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1076
1077    ======================================== ================================================
1078    Syntax                                   Description
1079    ======================================== ================================================
1080    quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
1081    row_mirror                               Mirror threads within row.
1082    row_half_mirror                          Mirror threads within 1/2 row (8 threads).
1083    row_bcast:15                             Broadcast 15th thread of each row to next row.
1084    row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
1085    wave_shl:1                               Wavefront left shift by 1 thread.
1086    wave_rol:1                               Wavefront left rotate by 1 thread.
1087    wave_shr:1                               Wavefront right shift by 1 thread.
1088    wave_ror:1                               Wavefront right rotate by 1 thread.
1089    row_shl:{1..15}                          Row shift left by 1-15 threads.
1090    row_shr:{1..15}                          Row shift right by 1-15 threads.
1091    row_ror:{1..15}                          Row rotate right by 1-15 threads.
1092    ======================================== ================================================
1093
1094Note: numeric values may be specified as either
1095:ref:`integer numbers<amdgpu_synid_integer_number>` or
1096:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1097
1098Examples:
1099
1100.. parsed-literal::
1101
1102  quad_perm:[0, 1, 2, 3]
1103  row_shl:3
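
Two of the controls above can be modeled as index mappings over a list of per-lane
values. This is a rough Python sketch that ignores masks and inactive lanes. It
assumes 16 lanes per row, consistent with a half row being 8 threads; the function
names are hypothetical.

.. code-block:: python

  ROW = 16  # lanes per row (assumed)

  def quad_perm(src, perm):
      """Lane i of each group of 4 reads from lane perm[i] of the same group."""
      return [src[(lane & ~3) | perm[lane & 3]] for lane in range(len(src))]

  def row_mirror(src):
      """Reverse the lane order within each row.

      len(src) is expected to be a multiple of ROW.
      """
      return [src[(lane // ROW) * ROW + (ROW - 1 - lane % ROW)]
              for lane in range(len(src))]

In this model ``quad_perm:[0, 1, 2, 3]`` is the identity permutation, and
``quad_perm:[3, 2, 1, 0]`` reverses every group of 4 threads.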
1104
1105.. _amdgpu_synid_dpp16_ctrl:
1106
1107dpp16_ctrl
1108~~~~~~~~~~
1109
1110Specifies how data are shared between threads. This is a mandatory modifier.
1111There is no default value.
1112
1113GFX10 only. Use :ref:`dpp_ctrl<amdgpu_synid_dpp_ctrl>` for GFX8 and GFX9.
1114
1115Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1116(There are only two rows in *wave32* mode.)
1117
1118    ======================================== ====================================================
1119    Syntax                                   Description
1120    ======================================== ====================================================
1121    quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
1122    row_mirror                               Mirror threads within row.
1123    row_half_mirror                          Mirror threads within 1/2 row (8 threads).
1124    row_share:{0..15}                        Share the value from the specified lane with other
1125                                             lanes in the row.
1126    row_xmask:{0..15}                        Fetch from XOR(current lane id, specified lane id).
1127    row_shl:{1..15}                          Row shift left by 1-15 threads.
1128    row_shr:{1..15}                          Row shift right by 1-15 threads.
1129    row_ror:{1..15}                          Row rotate right by 1-15 threads.
1130    ======================================== ====================================================
1131
1132Note: numeric values may be specified as either
1133:ref:`integer numbers<amdgpu_synid_integer_number>` or
1134:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1135
1136Examples:
1137
1138.. parsed-literal::
1139
1140  quad_perm:[0, 1, 2, 3]
1141  row_shl:3
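
The two controls that are specific to this table, *row_share* and *row_xmask*, can
be sketched in Python as index mappings. The sketch assumes 16 lanes per row and
ignores masks and inactive lanes; the function names are hypothetical.

.. code-block:: python

  ROW = 16  # lanes per row (assumed)

  def row_share(src, n):
      """Every lane reads from lane n of its own row (n in 0..15)."""
      return [src[(lane // ROW) * ROW + n] for lane in range(len(src))]

  def row_xmask(src, n):
      """Each lane reads from the lane whose id is (own id XOR n).

      With n in 0..15 only the low 4 bits change, so the source lane
      stays in the same row. len(src) is expected to be a multiple of ROW.
      """
      return [src[lane ^ n] for lane in range(len(src))]

For instance, ``row_xmask:1`` swaps each pair of neighboring lanes in this model.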
1142
1143.. _amdgpu_synid_dpp32_ctrl:
1144
1145dpp32_ctrl
1146~~~~~~~~~~
1147
1148Specifies how data are shared between threads. This is a mandatory modifier.
1149There is no default value.
1150
1151May be used only with GFX90A 32-bit instructions.
1152
1153Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1154
1155    ======================================== ==================================================
1156    Syntax                                   Description
1157    ======================================== ==================================================
1158    quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
1159    row_mirror                               Mirror threads within row.
1160    row_half_mirror                          Mirror threads within 1/2 row (8 threads).
1161    row_bcast:15                             Broadcast 15th thread of each row to next row.
1162    row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
1163    wave_shl:1                               Wavefront left shift by 1 thread.
1164    wave_rol:1                               Wavefront left rotate by 1 thread.
1165    wave_shr:1                               Wavefront right shift by 1 thread.
1166    wave_ror:1                               Wavefront right rotate by 1 thread.
1167    row_shl:{1..15}                          Row shift left by 1-15 threads.
1168    row_shr:{1..15}                          Row shift right by 1-15 threads.
1169    row_ror:{1..15}                          Row rotate right by 1-15 threads.
1170    row_newbcast:{1..15}                     Broadcast a thread within a row to the whole row.
1171    ======================================== ==================================================
1172
1173Note: numeric values may be specified as either
1174:ref:`integer numbers<amdgpu_synid_integer_number>` or
1175:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1176
1177Examples:
1178
1179.. parsed-literal::
1180
1181  quad_perm:[0, 1, 2, 3]
1182  row_shl:3
1183
1184
1185.. _amdgpu_synid_dpp64_ctrl:
1186
1187dpp64_ctrl
1188~~~~~~~~~~
1189
1190Specifies how data are shared between threads. This is a mandatory modifier.
1191There is no default value.
1192
1193May be used only with GFX90A 64-bit instructions.
1194
1195Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1196
1197    ======================================== ==================================================
1198    Syntax                                   Description
1199    ======================================== ==================================================
1200    row_newbcast:{1..15}                     Broadcast a thread within a row to the whole row.
1201    ======================================== ==================================================
1202
1203Note: numeric values may be specified as either
1204:ref:`integer numbers<amdgpu_synid_integer_number>` or
1205:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1206
1207Examples:
1208
1209.. parsed-literal::
1210
1211  row_newbcast:3
1212
1213
1214.. _amdgpu_synid_row_mask:
1215
1216row_mask
1217~~~~~~~~
1218
1219Controls which rows are enabled for data sharing. By default, all rows are enabled.
1220
1221Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1222(There are only two rows in *wave32* mode.)
1223
1224    ================= ====================================================================
1225    Syntax            Description
1226    ================= ====================================================================
1227    row_mask:{0..15}  Specifies a *row mask* as a positive
1228                      :ref:`integer number <amdgpu_synid_integer_number>`
1229                      or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1230
1231                      Each of 4 bits in the mask controls one row
1232                      (0 - disabled, 1 - enabled).
1233
1234                      In *wave32* mode the values should be limited to 0..7.
1235    ================= ====================================================================
1236
1237Examples:
1238
1239.. parsed-literal::
1240
1241  row_mask:0xf
1242  row_mask:0b1010
1243  row_mask:x|y
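
A row mask can be decoded into the list of enabled rows. The Python sketch below
assumes that bit *i* of the mask controls row *i*, which this document does not
state explicitly; the helper ``enabled_rows`` is hypothetical.

.. code-block:: python

  def enabled_rows(row_mask):
      """Return the indices of the rows whose bit is set in a 4-bit row mask."""
      if not 0 <= row_mask <= 15:
          raise ValueError("row_mask must be in the range 0..15")
      # Assumption: bit i of the mask corresponds to row i.
      return [row for row in range(4) if row_mask & (1 << row)]

Under that assumption, ``row_mask:0b1010`` enables rows 1 and 3.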
1244
1245.. _amdgpu_synid_bank_mask:
1246
1247bank_mask
1248~~~~~~~~~
1249
1250Controls which banks are enabled for data sharing. By default, all banks are enabled.
1251
1252Note: the lanes of a wavefront are organized in four *rows* and four *banks*.
1253(There are only two rows in *wave32* mode.)
1254
1255    ================== ====================================================================
1256    Syntax             Description
1257    ================== ====================================================================
1258    bank_mask:{0..15}  Specifies a *bank mask* as a positive
1259                       :ref:`integer number <amdgpu_synid_integer_number>`
1260                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1261
1262                       Each of 4 bits in the mask controls one bank
1263                       (0 - disabled, 1 - enabled).
1264    ================== ====================================================================
1265
1266Examples:
1267
1268.. parsed-literal::
1269
1270  bank_mask:0x3
1271  bank_mask:0b0011
1272  bank_mask:x&y
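
The effect of a bank mask on the lanes of one row can be sketched in Python. The
sketch assumes that a row has 16 lanes, that each bank covers 4 consecutive lanes
of a row, and that bit *i* of the mask controls bank *i*. None of these details is
spelled out in this document, and the helper name is hypothetical.

.. code-block:: python

  def enabled_lanes_in_row(bank_mask):
      """Return the lanes of a 16-lane row that a 4-bit bank mask enables."""
      if not 0 <= bank_mask <= 15:
          raise ValueError("bank_mask must be in the range 0..15")
      # Assumption: bank i covers lanes 4*i .. 4*i + 3 of the row.
      return [lane for lane in range(16) if bank_mask & (1 << (lane // 4))]

Under these assumptions, ``bank_mask:0x3`` enables lanes 0 to 7 of every row.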
1273
1274.. _amdgpu_synid_bound_ctrl:
1275
1276bound_ctrl
1277~~~~~~~~~~
1278
1279Controls data sharing when accessing an invalid lane. By default, data sharing with
1280invalid lanes is disabled.
1281
1282    ======================================== ================================================
1283    Syntax                                   Description
1284    ======================================== ================================================
1285    bound_ctrl:1                             Enables data sharing with invalid lanes.
1286
1287                                             Accessing data from an invalid lane will
1288                                             return zero.
1289    ======================================== ================================================
1290
1291.. _amdgpu_synid_fi16:
1292
1293fi
1294~~
1295
1296Controls interaction with *inactive* lanes for *dpp16* instructions. The default value is zero.
1297
1298Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.
1299
1300GFX10 only.
1301
1302    ======================================== ==================================================
1303    Syntax                                   Description
1304    ======================================== ==================================================
1305    fi:0                                     Interaction with inactive lanes is controlled by
1306                                             :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
1307
    fi:1                                     Fetch pre-existing values from inactive lanes.
1309    ======================================== ==================================================
1310
1311Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
1312:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1313
1314SDWA Modifiers
1315--------------
1316
1317GFX8, GFX9 and GFX10 only.
1318
1319clamp
1320~~~~~
1321
1322See a description :ref:`here<amdgpu_synid_clamp>`.
1323
1324omod
1325~~~~
1326
1327See a description :ref:`here<amdgpu_synid_omod>`.
1328
1329GFX9 and GFX10 only.
1330
1331.. _amdgpu_synid_dst_sel:
1332
1333dst_sel
1334~~~~~~~
1335
1336Selects which bits in the destination are affected. By default, all bits are affected.
1337
1338    ======================================== ================================================
1339    Syntax                                   Description
1340    ======================================== ================================================
1341    dst_sel:DWORD                            Use bits 31:0.
1342    dst_sel:BYTE_0                           Use bits 7:0.
1343    dst_sel:BYTE_1                           Use bits 15:8.
1344    dst_sel:BYTE_2                           Use bits 23:16.
1345    dst_sel:BYTE_3                           Use bits 31:24.
1346    dst_sel:WORD_0                           Use bits 15:0.
1347    dst_sel:WORD_1                           Use bits 31:16.
1348    ======================================== ================================================
1349
1350.. _amdgpu_synid_dst_unused:
1351
1352dst_unused
1353~~~~~~~~~~
1354
1355Controls what to do with the bits in the destination which are not selected
1356by :ref:`dst_sel<amdgpu_synid_dst_sel>`.
1357By default, unused bits are preserved.
1358
1359    ======================================== ================================================
1360    Syntax                                   Description
1361    ======================================== ================================================
1362    dst_unused:UNUSED_PAD                    Pad with zeros.
1363    dst_unused:UNUSED_SEXT                   Sign-extend upper bits, zero lower bits.
1364    dst_unused:UNUSED_PRESERVE               Preserve bits.
1365    ======================================== ================================================
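
The interaction of :ref:`dst_sel<amdgpu_synid_dst_sel>` and *dst_unused* can be
sketched in Python on 32-bit integers. This is an illustration of the two tables
above under the assumption that the low bits of the result are placed into the
selected field; the function ``sdwa_write`` and the ``SEL`` table are hypothetical.

.. code-block:: python

  # (bit offset, width) of each selection, taken from the dst_sel table.
  SEL = {
      "DWORD": (0, 32),
      "BYTE_0": (0, 8), "BYTE_1": (8, 8), "BYTE_2": (16, 8), "BYTE_3": (24, 8),
      "WORD_0": (0, 16), "WORD_1": (16, 16),
  }

  def sdwa_write(old_dst, result, dst_sel="DWORD", dst_unused="UNUSED_PRESERVE"):
      """Merge result into the selected bits of a 32-bit destination value."""
      offset, width = SEL[dst_sel]
      field_mask = ((1 << width) - 1) << offset
      field = (result << offset) & field_mask
      if dst_unused == "UNUSED_PRESERVE":
          rest = old_dst & ~field_mask          # keep the unselected bits
      elif dst_unused == "UNUSED_PAD":
          rest = 0                              # pad with zeros
      elif dst_unused == "UNUSED_SEXT":
          # Copy the sign bit of the field into all bits above it;
          # bits below the field are zeroed.
          sign = (field >> (offset + width - 1)) & 1
          upper_mask = 0xFFFFFFFF & ~((1 << (offset + width)) - 1)
          rest = upper_mask if sign else 0
      else:
          raise ValueError("unknown dst_unused value: " + dst_unused)
      return (field | rest) & 0xFFFFFFFF

For example, writing ``0x11`` with ``dst_sel:BYTE_1`` into a register that holds
``0xAABBCCDD`` gives ``0xAABB11DD`` with ``UNUSED_PRESERVE`` and ``0x00001100``
with ``UNUSED_PAD`` in this model.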
1366
1367.. _amdgpu_synid_src0_sel:
1368
1369src0_sel
1370~~~~~~~~
1371
Controls which bits of src0 are used. By default, all bits are used.
1373
1374    ======================================== ================================================
1375    Syntax                                   Description
1376    ======================================== ================================================
1377    src0_sel:DWORD                           Use bits 31:0.
1378    src0_sel:BYTE_0                          Use bits 7:0.
1379    src0_sel:BYTE_1                          Use bits 15:8.
1380    src0_sel:BYTE_2                          Use bits 23:16.
1381    src0_sel:BYTE_3                          Use bits 31:24.
1382    src0_sel:WORD_0                          Use bits 15:0.
1383    src0_sel:WORD_1                          Use bits 31:16.
1384    ======================================== ================================================
1385
1386.. _amdgpu_synid_src1_sel:
1387
1388src1_sel
1389~~~~~~~~
1390
Controls which bits of src1 are used. By default, all bits are used.
1392
1393    ======================================== ================================================
1394    Syntax                                   Description
1395    ======================================== ================================================
1396    src1_sel:DWORD                           Use bits 31:0.
1397    src1_sel:BYTE_0                          Use bits 7:0.
1398    src1_sel:BYTE_1                          Use bits 15:8.
1399    src1_sel:BYTE_2                          Use bits 23:16.
1400    src1_sel:BYTE_3                          Use bits 31:24.
1401    src1_sel:WORD_0                          Use bits 15:0.
1402    src1_sel:WORD_1                          Use bits 31:16.
1403    ======================================== ================================================
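
Source selection amounts to extracting a bit field from a 32-bit value. The
following Python sketch applies to both *src0_sel* and *src1_sel*; the helper
``sdwa_select`` is hypothetical, and the extracted field is zero-extended here.

.. code-block:: python

  # (bit offset, width) of each selection, taken from the tables above.
  SEL = {
      "DWORD": (0, 32),
      "BYTE_0": (0, 8), "BYTE_1": (8, 8), "BYTE_2": (16, 8), "BYTE_3": (24, 8),
      "WORD_0": (0, 16), "WORD_1": (16, 16),
  }

  def sdwa_select(value, sel="DWORD"):
      """Return the bits of a 32-bit value chosen by a src0_sel/src1_sel name."""
      offset, width = SEL[sel]
      return (value >> offset) & ((1 << width) - 1)

For the value ``0xAABBCCDD``, ``BYTE_2`` selects ``0xBB`` and ``WORD_1`` selects
``0xAABB``.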
1404
1405.. _amdgpu_synid_sdwa_operand_modifiers:
1406
1407SDWA Operand Modifiers
1408----------------------
1409
1410Operand modifiers are not used separately. They are applied to source operands.
1411
1412GFX8, GFX9 and GFX10 only.
1413
1414abs
1415~~~
1416
1417See a description :ref:`here<amdgpu_synid_abs>`.
1418
1419neg
1420~~~
1421
1422See a description :ref:`here<amdgpu_synid_neg>`.
1423
1424.. _amdgpu_synid_sext:
1425
1426sext
1427~~~~
1428
Sign-extends the value of a sub-dword operand to fill all 32 bits.
Has no effect on 32-bit operands.
1431
1432Valid for integer operands only.
1433
1434    ======================================== ================================================
1435    Syntax                                   Description
1436    ======================================== ================================================
1437    sext(<operand>)                          Sign-extend operand value.
1438    ======================================== ================================================
1439
1440Examples:
1441
1442.. parsed-literal::
1443
1444  sext(v4)
1445  sext(v255)
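
Sign extension of a sub-dword value can be sketched in Python as follows. The
function below is a generic illustration that takes the operand width as a
parameter; its name and signature are assumptions made for this example.

.. code-block:: python

  def sext(value, width):
      """Sign-extend the low `width` bits of value to a 32-bit result."""
      value &= (1 << width) - 1
      if width < 32 and value & (1 << (width - 1)):
          # The sign bit is set: fill all bits above the operand with ones.
          value |= 0xFFFFFFFF & ~((1 << width) - 1)
      return value

An 8-bit operand ``0x80`` becomes ``0xFFFFFF80``, while ``0x7F`` is unchanged.
A 32-bit operand is returned as is.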
1446
1447VOP3 Modifiers
1448--------------
1449
1450.. _amdgpu_synid_vop3_op_sel:
1451
1452op_sel
1453~~~~~~
1454
1455Selects the low [15:0] or high [31:16] operand bits for source and destination operands.
1456By default, low bits are used for all operands.
1457
The number of values specified with the op_sel modifier must match the number of instruction
operands (both source and destination). The first value controls src0, the second value controls src1,
and so on, except that the last value controls the destination.
The value 0 selects the low bits, while 1 selects the high bits.

Note: the op_sel modifier affects 16-bit operands only. For 32-bit operands, the value specified
by op_sel must be 0.
1465
1466GFX9 and GFX10 only.
1467
1468    ======================================== ============================================================
1469    Syntax                                   Description
1470    ======================================== ============================================================
1471    op_sel:[{0..1},{0..1}]                   Select operand bits for instructions with 1 source operand.
1472    op_sel:[{0..1},{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
1473    op_sel:[{0..1},{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
1474    ======================================== ============================================================
1475
1476Note: numeric values may be specified as either
1477:ref:`integer numbers<amdgpu_synid_integer_number>` or
1478:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1479
1480Examples:
1481
1482.. parsed-literal::
1483
1484  op_sel:[0,0]
1485  op_sel:[0,1]
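
The ordering rule (sources first, destination last) can be sketched in Python.
The helper ``describe_op_sel`` is hypothetical and only maps each value to the
operand it controls.

.. code-block:: python

  def describe_op_sel(op_sel):
      """Map op_sel values to operands: sources first, destination last."""
      if not 2 <= len(op_sel) <= 4 or any(v not in (0, 1) for v in op_sel):
          raise ValueError("op_sel takes 2 to 4 values, each 0 or 1")
      names = ["src%d" % i for i in range(len(op_sel) - 1)] + ["dst"]
      return {name: "high" if v else "low" for name, v in zip(names, op_sel)}

For an instruction with one source operand, ``op_sel:[0,1]`` reads the low bits
of src0 and selects the high bits of the destination.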
1486
1487.. _amdgpu_synid_dpp_op_sel:
1488
1489dpp_op_sel
1490~~~~~~~~~~
1491
Special version of *op_sel* used for *permlane* opcodes to specify
DPP-like mode bits: :ref:`fi<amdgpu_synid_fi16>` and
:ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
1495
1496GFX10 only.
1497
1498    ======================================== ============================================================
1499    Syntax                                   Description
1500    ======================================== ============================================================
1501    op_sel:[{0..1},{0..1}]                   First bit specifies :ref:`fi<amdgpu_synid_fi16>`, second
1502                                             bit specifies :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.
1503    ======================================== ============================================================
1504
1505Note: numeric values may be specified as either
1506:ref:`integer numbers<amdgpu_synid_integer_number>` or
1507:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1508
1509Examples:
1510
1511.. parsed-literal::
1512
1513  op_sel:[0,0]
1514
1515.. _amdgpu_synid_clamp:
1516
1517clamp
1518~~~~~
1519
The meaning of the *clamp* modifier depends on the instruction.

For *v_cmp* instructions, the clamp modifier indicates that the compare signals
if a floating-point exception occurs. By default, signaling is disabled.
Not supported by GFX7.

For integer operations, the clamp modifier indicates that the result must be clamped
to the range between the smallest and largest representable values. By default,
there is no clamping. Integer clamping is not supported by GFX7.

For floating-point operations, the clamp modifier indicates that the result must be clamped
to the range [0.0, 1.0]. By default, there is no clamping.

Note: the clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
1534
1535    ======================================== ================================================
1536    Syntax                                   Description
1537    ======================================== ================================================
1538    clamp                                    Enables clamping (or signaling).
1539    ======================================== ================================================
1540
1541.. _amdgpu_synid_omod:
1542
1543omod
1544~~~~
1545
Specifies whether an output modifier must be applied to the result.
By default, no output modifiers are applied.

Note: output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).

Output modifiers are valid for f32 and f64 floating-point results only.
They must not be used with f16.

Note: *v_cvt_f16_f32* is an exception. This instruction produces an f16 result
but accepts output modifiers.
1556
1557    ======================================== ================================================
1558    Syntax                                   Description
1559    ======================================== ================================================
1560    mul:2                                    Multiply the result by 2.
1561    mul:4                                    Multiply the result by 4.
1562    div:2                                    Multiply the result by 0.5.
1563    ======================================== ================================================
1564
1565Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
1566:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1567
1568Examples:
1569
1570.. parsed-literal::
1571
1572  mul:2
1573  mul:x      // x must be equal to 2 or 4
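
The ordering of output modifiers and clamping can be illustrated with a minimal
Python sketch. It models the behavior described above for floating-point results
only and is not the hardware implementation; the function name
``apply_output_modifiers`` and the string keys are hypothetical, and special
values such as NaN are ignored.

.. code-block:: python

  def apply_output_modifiers(result, omod=None, clamp=False):
      """Model a floating-point result after omod and clamp (illustrative only)."""
      # Scale factors taken from the omod syntax table.
      scale = {None: 1.0, "mul:2": 2.0, "mul:4": 4.0, "div:2": 0.5}[omod]
      value = result * scale                 # the output modifier is applied first
      if clamp:
          value = min(max(value, 0.0), 1.0)  # clamping to [0.0, 1.0] comes second
      return value

  # 0.75 * 2 = 1.5, which is then clamped to 1.0.
  print(apply_output_modifiers(0.75, omod="mul:2", clamp=True))  # prints 1.0

If clamping were applied first, the same inputs would produce 1.5, so the order matters.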
1574
1575.. _amdgpu_synid_vop3_operand_modifiers:
1576
1577VOP3 Operand Modifiers
1578----------------------
1579
Operand modifiers cannot be used on their own; they are applied to source operands.
1581
1582.. _amdgpu_synid_abs:
1583
1584abs
1585~~~
1586
1587Computes the absolute value of its operand. Must be applied before :ref:`neg<amdgpu_synid_neg>`
1588(if any). Valid for floating point operands only.
1589
1590    ======================================== ====================================================
1591    Syntax                                   Description
1592    ======================================== ====================================================
1593    abs(<operand>)                           Get the absolute value of a floating-point operand.
1594    \|<operand>|                             The same as above (an SP3 syntax).
1595    ======================================== ====================================================
1596
Note: avoid using the SP3 syntax with operands specified as expressions because the trailing '|'
may be misinterpreted. Such operands should be enclosed in additional parentheses, as shown
in the examples below.
1600
1601Examples:
1602
1603.. parsed-literal::
1604
1605  abs(v36)
1606  \|v36|
1607  abs(x|y)     // ok
1608  \|(x|y)|      // additional parentheses are required
1609
1610.. _amdgpu_synid_neg:
1611
1612neg
1613~~~
1614
1615Computes the negative value of its operand. Must be applied after :ref:`abs<amdgpu_synid_abs>`
1616(if any). Valid for floating point operands only.
1617
1618    ================== ====================================================
1619    Syntax             Description
1620    ================== ====================================================
1621    neg(<operand>)     Get the negative value of a floating-point operand.
1622                       The operand may include an optional
1623                       :ref:`abs<amdgpu_synid_abs>` modifier.
1624    -<operand>         The same as above (an SP3 syntax).
1625    ================== ====================================================
1626
1627Note: SP3 syntax is supported with limitations because of a potential ambiguity.
1628Currently it is allowed in the following cases:
1629
1630* Before a register.
1631* Before an :ref:`abs<amdgpu_synid_abs>` modifier.
1632* Before an SP3 :ref:`abs<amdgpu_synid_abs>` modifier.
1633
1634In all other cases "-" is handled as a part of an expression that follows the sign.
1635
1636Examples:
1637
1638.. parsed-literal::
1639
1640  // Operands with negate modifiers
1641  neg(v[0])
1642  neg(1.0)
1643  neg(abs(v0))
1644  -v5
1645  -abs(v5)
1646  -\|v5|
1647
1648  // Operands without negate modifiers
1649  -1
1650  -x+y
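
The required order of *abs* and *neg* can be modeled with a short Python sketch.
The function ``apply_source_modifiers`` and its parameters are hypothetical names
chosen for illustration; the sketch only shows that the absolute value is taken
before the sign is changed.

.. code-block:: python

  def apply_source_modifiers(value, use_abs=False, use_neg=False):
      """Model a floating-point source operand with abs/neg modifiers."""
      if use_abs:
          value = abs(value)   # abs is applied first
      if use_neg:
          value = -value       # neg is applied to the result of abs
      return value

  # neg(abs(v0)) always yields a non-positive value.
  print(apply_source_modifiers(-2.5, use_abs=True, use_neg=True))  # prints -2.5
  print(apply_source_modifiers(2.5, use_abs=True, use_neg=True))   # prints -2.5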
1651
1652VOP3P Modifiers
1653---------------
1654
1655This section describes modifiers of *regular* VOP3P instructions.
1656
1657*v_mad_mix\** and *v_fma_mix\**
1658instructions use these modifiers :ref:`in a special manner<amdgpu_synid_mad_mix>`.
1659
1660GFX9 and GFX10 only.
1661
1662.. _amdgpu_synid_op_sel:
1663
1664op_sel
1665~~~~~~
1666
1667Selects the low [15:0] or high [31:16] operand bits as input to the operation
1668which results in the lower-half of the destination.
1669By default, low bits are used for all operands.
1670
1671The number of values specified by the *op_sel* modifier must match the number of source
1672operands. First value controls src0, second value controls src1 and so on.
1673
1674The value 0 selects the low bits, while 1 selects the high bits.
1675
1676    ================================= =============================================================
1677    Syntax                            Description
1678    ================================= =============================================================
1679    op_sel:[{0..1}]                   Select operand bits for instructions with 1 source operand.
1680    op_sel:[{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
1681    op_sel:[{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
1682    ================================= =============================================================
1683
1684Note: numeric values may be specified as either
1685:ref:`integer numbers<amdgpu_synid_integer_number>` or
1686:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1687
1688Examples:
1689
1690.. parsed-literal::
1691
1692  op_sel:[0,0]
1693  op_sel:[0,1,0]
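
The rule that the number of values must match the number of source operands can
be checked with a small sketch. The helper ``parse_bit_list`` is hypothetical and
is not part of any assembler; it accepts integer literals only, whereas the
assembler also accepts absolute expressions.

.. code-block:: python

  import re

  def parse_bit_list(text, name, num_sources):
      """Parse a modifier such as 'op_sel:[0,1,0]' into a list of bits."""
      match = re.fullmatch(rf"{re.escape(name)}:\[([^\]]*)\]", text.strip())
      if match is None:
          raise ValueError(f"not a valid {name} modifier: {text!r}")
      # Integer literals only; int() raises ValueError for anything else.
      bits = [int(token.strip(), 0) for token in match.group(1).split(",")]
      if len(bits) != num_sources:
          raise ValueError(f"{name} needs {num_sources} values, got {len(bits)}")
      if any(bit not in (0, 1) for bit in bits):
          raise ValueError(f"{name} values must be 0 or 1")
      return bits

  print(parse_bit_list("op_sel:[0,1,0]", "op_sel", 3))  # prints [0, 1, 0]

The same helper works for the other bit-list modifiers in this section when the modifier name is passed as ``name``.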
1694
1695.. _amdgpu_synid_op_sel_hi:
1696
1697op_sel_hi
1698~~~~~~~~~
1699
1700Selects the low [15:0] or high [31:16] operand bits as input to the operation
1701which results in the upper-half of the destination.
1702By default, high bits are used for all operands.
1703
1704The number of values specified by the *op_sel_hi* modifier must match the number of source
1705operands. First value controls src0, second value controls src1 and so on.
1706
1707The value 0 selects the low bits, while 1 selects the high bits.
1708
1709    =================================== =============================================================
1710    Syntax                              Description
1711    =================================== =============================================================
1712    op_sel_hi:[{0..1}]                  Select operand bits for instructions with 1 source operand.
1713    op_sel_hi:[{0..1},{0..1}]           Select operand bits for instructions with 2 source operands.
1714    op_sel_hi:[{0..1},{0..1},{0..1}]    Select operand bits for instructions with 3 source operands.
1715    =================================== =============================================================
1716
1717Note: numeric values may be specified as either
1718:ref:`integer numbers<amdgpu_synid_integer_number>` or
1719:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1720
1721Examples:
1722
1723.. parsed-literal::
1724
1725  op_sel_hi:[0,0]
1726  op_sel_hi:[0,0,1]
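
The interaction of *op_sel* and *op_sel_hi* can be sketched for a hypothetical
packed 16-bit unsigned add with two source operands. The names ``select_half``
and ``packed_add_u16`` are illustrative, and wraparound on overflow is an
assumption of this model rather than a statement about any particular opcode.

.. code-block:: python

  def select_half(value, high):
      """Return bits [31:16] if high is 1, otherwise bits [15:0]."""
      return (value >> 16) & 0xFFFF if high else value & 0xFFFF

  def packed_add_u16(src0, src1, op_sel=(0, 0), op_sel_hi=(1, 1)):
      """Model a packed add: op_sel feeds the lower half, op_sel_hi the upper half."""
      sources = (src0, src1)
      lo = sum(select_half(s, bit) for s, bit in zip(sources, op_sel)) & 0xFFFF
      hi = sum(select_half(s, bit) for s, bit in zip(sources, op_sel_hi)) & 0xFFFF
      return (hi << 16) | lo

  a, b = 0x00030001, 0x00400010
  # Defaults: lower half = 0x0001 + 0x0010, upper half = 0x0003 + 0x0040.
  print(hex(packed_add_u16(a, b)))                                    # prints 0x430011
  # Swap the halves of src0 only: lower = 0x0003 + 0x0010, upper = 0x0001 + 0x0040.
  print(hex(packed_add_u16(a, b, op_sel=(1, 0), op_sel_hi=(0, 1))))   # prints 0x410013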
1727
1728.. _amdgpu_synid_neg_lo:
1729
1730neg_lo
1731~~~~~~
1732
Specifies whether to change sign of operand values selected by
:ref:`op_sel<amdgpu_synid_op_sel>`. These values are then used
as input to the operation which results in the lower-half of the destination.
1736
1737The number of values specified by this modifier must match the number of source
1738operands. First value controls src0, second value controls src1 and so on.
1739
1740The value 0 indicates that the corresponding operand value is used unmodified,
1741the value 1 indicates that negative value of the operand must be used.
1742
1743By default, operand values are used unmodified.
1744
1745This modifier is valid for floating point operands only.
1746
1747    ================================ ==================================================================
1748    Syntax                           Description
1749    ================================ ==================================================================
1750    neg_lo:[{0..1}]                  Select affected operands for instructions with 1 source operand.
1751    neg_lo:[{0..1},{0..1}]           Select affected operands for instructions with 2 source operands.
1752    neg_lo:[{0..1},{0..1},{0..1}]    Select affected operands for instructions with 3 source operands.
1753    ================================ ==================================================================
1754
1755Note: numeric values may be specified as either
1756:ref:`integer numbers<amdgpu_synid_integer_number>` or
1757:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1758
1759Examples:
1760
1761.. parsed-literal::
1762
1763  neg_lo:[0]
1764  neg_lo:[0,1]
1765
1766.. _amdgpu_synid_neg_hi:
1767
1768neg_hi
1769~~~~~~
1770
1771Specifies whether to change sign of operand values selected by
1772:ref:`op_sel_hi<amdgpu_synid_op_sel_hi>`. These values are then used
1773as input to the operation which results in the upper-half of the destination.
1774
1775The number of values specified by this modifier must match the number of source
1776operands. First value controls src0, second value controls src1 and so on.
1777
1778The value 0 indicates that the corresponding operand value is used unmodified,
1779the value 1 indicates that negative value of the operand must be used.
1780
1781By default, operand values are used unmodified.
1782
1783This modifier is valid for floating point operands only.
1784
1785    =============================== ==================================================================
1786    Syntax                          Description
1787    =============================== ==================================================================
1788    neg_hi:[{0..1}]                 Select affected operands for instructions with 1 source operand.
1789    neg_hi:[{0..1},{0..1}]          Select affected operands for instructions with 2 source operands.
1790    neg_hi:[{0..1},{0..1},{0..1}]   Select affected operands for instructions with 3 source operands.
1791    =============================== ==================================================================
1792
1793Note: numeric values may be specified as either
1794:ref:`integer numbers<amdgpu_synid_integer_number>` or
1795:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1796
1797Examples:
1798
1799.. parsed-literal::
1800
1801  neg_hi:[1,0]
1802  neg_hi:[0,1,1]
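
A sketch of how *neg_lo* and *neg_hi* combine with the operand selection,
modeled on a hypothetical packed f16 add with two source operands. It assumes
that *neg_lo* applies to the inputs selected by *op_sel* (lower half of the
destination) and *neg_hi* to the inputs selected by *op_sel_hi* (upper half).
All function names are illustrative, and rounding is whatever Python's
``struct`` half-precision conversion provides.

.. code-block:: python

  import struct

  def f16_from_bits(bits):
      """Decode a 16-bit pattern as an IEEE half-precision value."""
      return struct.unpack("<e", struct.pack("<H", bits))[0]

  def f16_to_bits(value):
      """Encode a Python float as a 16-bit half-precision pattern."""
      return struct.unpack("<H", struct.pack("<e", value))[0]

  def pack_f16(lo, hi):
      """Pack two half-precision values into one 32-bit operand."""
      return (f16_to_bits(hi) << 16) | f16_to_bits(lo)

  def half_input(src, select_high, negate):
      """Select one half of src and optionally negate it."""
      bits = (src >> 16) & 0xFFFF if select_high else src & 0xFFFF
      value = f16_from_bits(bits)
      return -value if negate else value

  def packed_add_f16(src0, src1, op_sel=(0, 0), op_sel_hi=(1, 1),
                     neg_lo=(0, 0), neg_hi=(0, 0)):
      srcs = (src0, src1)
      lo = sum(half_input(s, sel, neg) for s, sel, neg in zip(srcs, op_sel, neg_lo))
      hi = sum(half_input(s, sel, neg) for s, sel, neg in zip(srcs, op_sel_hi, neg_hi))
      return pack_f16(lo, hi)

  a = pack_f16(1.0, 2.0)
  b = pack_f16(0.5, 4.0)
  # Lower half: 1.0 + (-0.5) = 0.5; upper half: (-2.0) + 4.0 = 2.0.
  result = packed_add_f16(a, b, neg_lo=(0, 1), neg_hi=(1, 0))
  print(result == pack_f16(0.5, 2.0))  # prints True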
1803
1804clamp
1805~~~~~
1806
1807See a description :ref:`here<amdgpu_synid_clamp>`.
1808
1809.. _amdgpu_synid_mad_mix:
1810
1811VOP3P MAD_MIX/FMA_MIX Modifiers
1812-------------------------------
1813
1814*v_mad_mix\** and *v_fma_mix\**
1815instructions use *op_sel* and *op_sel_hi* modifiers
1816in a manner different from *regular* VOP3P instructions.
1817
1818See a description below.
1819
1820GFX9 and GFX10 only.
1821
1822.. _amdgpu_synid_mad_mix_op_sel:
1823
1824m_op_sel
1825~~~~~~~~
1826
This modifier is meaningful only for 16-bit source operands, as indicated by
:ref:`m_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>`.
It selects either the low [15:0] or high [31:16] operand bits
as input to the operation.

The number of values specified by the *op_sel* modifier must match the number of source
operands. First value controls src0, second value controls src1 and so on.

The value 0 selects the low 16 bits, while 1 selects the high 16 bits.
1836
1837By default, low bits are used for all operands.
1838
1839    =============================== ================================================
1840    Syntax                          Description
1841    =============================== ================================================
1842    op_sel:[{0..1},{0..1},{0..1}]   Select location of each 16-bit source operand.
1843    =============================== ================================================
1844
1845Note: numeric values may be specified as either
1846:ref:`integer numbers<amdgpu_synid_integer_number>` or
1847:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1848
1849Examples:
1850
1851.. parsed-literal::
1852
  op_sel:[0,1,0]
1854
1855.. _amdgpu_synid_mad_mix_op_sel_hi:
1856
1857m_op_sel_hi
1858~~~~~~~~~~~
1859
1860Selects the size of source operands: either 32 bits or 16 bits.
1861By default, 32 bits are used for all source operands.
1862
1863The number of values specified by the *op_sel_hi* modifier must match the number of source
1864operands. First value controls src0, second value controls src1 and so on.
1865
1866The value 0 indicates 32 bits, the value 1 indicates 16 bits.
1867
1868The location of 16 bits in the operand may be specified by
1869:ref:`m_op_sel<amdgpu_synid_mad_mix_op_sel>`.
1870
1871    ======================================== ====================================
1872    Syntax                                   Description
1873    ======================================== ====================================
1874    op_sel_hi:[{0..1},{0..1},{0..1}]         Select size of each source operand.
1875    ======================================== ====================================
1876
1877Note: numeric values may be specified as either
1878:ref:`integer numbers<amdgpu_synid_integer_number>` or
1879:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
1880
1881Examples:
1882
1883.. parsed-literal::
1884
1885  op_sel_hi:[1,1,1]
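
How *op_sel* and *op_sel_hi* together determine each source of a *mix*
instruction can be sketched as follows. The names ``mix_source`` and ``fma_mix``
are hypothetical. The arithmetic is done in Python floats and is not fused, and
rounding of the result to the destination type is ignored, so this is only a
model of the operand decoding described above.

.. code-block:: python

  import struct

  def mix_source(src, op_sel_bit, op_sel_hi_bit):
      """Decode one 32-bit source register value as f32 or as one of its f16 halves."""
      if op_sel_hi_bit == 0:
          # op_sel_hi = 0: the operand is a 32-bit float; op_sel is ignored.
          return struct.unpack("<f", struct.pack("<I", src & 0xFFFFFFFF))[0]
      # op_sel_hi = 1: the operand is 16 bits wide; op_sel picks its location.
      half = (src >> 16) & 0xFFFF if op_sel_bit else src & 0xFFFF
      return struct.unpack("<e", struct.pack("<H", half))[0]

  def fma_mix(src0, src1, src2, op_sel=(0, 0, 0), op_sel_hi=(0, 0, 0)):
      a, b, c = (mix_source(s, sel, size)
                 for s, sel, size in zip((src0, src1, src2), op_sel, op_sel_hi))
      return a * b + c

  # src0: f32 2.0; src1: f16 3.0 in the high half; src2: f16 0.5 in the low half.
  print(fma_mix(0x40000000, 0x42003C00, 0x00003800,
                op_sel=(0, 1, 0), op_sel_hi=(0, 1, 1)))  # prints 6.5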
1886
1887abs
1888~~~
1889
1890See a description :ref:`here<amdgpu_synid_abs>`.
1891
1892neg
1893~~~
1894
1895See a description :ref:`here<amdgpu_synid_neg>`.
1896
1897clamp
1898~~~~~
1899
1900See a description :ref:`here<amdgpu_synid_clamp>`.
1901
1902VOP3P MFMA Modifiers
1903--------------------
1904
1905These modifiers may only be used with GFX908 and GFX90A.
1906
1907.. _amdgpu_synid_cbsz:
1908
1909cbsz
1910~~~~
1911
1912Specifies a broadcast mode.
1913
1914    =============================== ==================================================================
1915    Syntax                          Description
1916    =============================== ==================================================================
    cbsz:{0..7}                     A broadcast mode.
1918    =============================== ==================================================================
1919
1920Note: numeric value may be specified as either
1921an :ref:`integer number<amdgpu_synid_integer_number>` or
1922an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1923
1924.. _amdgpu_synid_abid:
1925
1926abid
1927~~~~
1928
1929Specifies matrix A group select.
1930
1931    =============================== ==================================================================
1932    Syntax                          Description
1933    =============================== ==================================================================
    abid:{0..15}                    Matrix A group select id.
1935    =============================== ==================================================================
1936
1937Note: numeric value may be specified as either
1938an :ref:`integer number<amdgpu_synid_integer_number>` or
1939an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1940
1941.. _amdgpu_synid_blgp:
1942
1943blgp
1944~~~~
1945
1946Specifies matrix B lane group pattern.
1947
1948    =============================== ==================================================================
1949    Syntax                          Description
1950    =============================== ==================================================================
    blgp:{0..7}                     Matrix B lane group pattern.
1952    =============================== ==================================================================
1953
1954Note: numeric value may be specified as either
1955an :ref:`integer number<amdgpu_synid_integer_number>` or
1956an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
1957
1958