=====================
Clang Offload Wrapper
=====================

.. contents::
   :local:

.. _clang-offload-wrapper:

Introduction
============

This tool is used in the OpenMP offloading toolchain to embed device code
objects (usually ELF) into a wrapper host LLVM IR (bitcode) file. The wrapper
host IR is then assembled and linked with host code objects to generate the
executable binary. See :ref:`image-binary-embedding-execution` for more
details.

Usage
=====

This tool can be used as follows:

.. code-block:: console

  $ clang-offload-wrapper -help
  OVERVIEW: A tool to create a wrapper bitcode for offload target binaries.
  Takes offload target binaries as input and produces bitcode file containing
  target binaries packaged as data and initialization code which registers
  target binaries in offload runtime.
  USAGE: clang-offload-wrapper [options] <input files>
  OPTIONS:
  Generic Options:
    --help                             - Display available options (--help-hidden for more)
    --help-list                        - Display list of available options (--help-list-hidden for more)
    --version                          - Display the version of this program
  clang-offload-wrapper options:
    -o <filename>                      - Output filename
    --target=<triple>                  - Target triple for the output module

Example
=======

.. code-block:: console

  clang-offload-wrapper -target host-triple -o host-wrapper.bc gfx90a-binary.out

.. _openmp-device-binary_embedding:

OpenMP Device Binary Embedding
==============================

Various structures and functions used in the wrapper host IR form the interface
between the executable binary and the OpenMP runtime.

Enum Types
----------

:ref:`table-offloading-declare-target-flags` lists the flags for offloading
entries.

  .. table:: Offloading Declare Target Flags Enum
    :name: table-offloading-declare-target-flags

    +-------------------------+-------+------------------------------------------------------------------+
    |          Name           | Value | Description                                                      |
    +=========================+=======+==================================================================+
    | OMP_DECLARE_TARGET_LINK | 0x01  | Mark the entry as having a 'link' attribute (w.r.t. link clause) |
    +-------------------------+-------+------------------------------------------------------------------+
    | OMP_DECLARE_TARGET_CTOR | 0x02  | Mark the entry as being a global constructor                     |
    +-------------------------+-------+------------------------------------------------------------------+
    | OMP_DECLARE_TARGET_DTOR | 0x04  | Mark the entry as being a global destructor                      |
    +-------------------------+-------+------------------------------------------------------------------+

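The flag values are distinct powers of two, which suggests they are meant to
be combined and tested as a bit mask. That reading is an assumption, not
something the table states. A minimal C sketch under that assumption follows;
the enum name and the ``entry_has_flag`` helper are hypothetical, and only the
three constants come from the table.

.. code-block:: c

  #include <assert.h>
  #include <stdbool.h>
  #include <stdint.h>

  /* Constants copied from the table above; the enum name is hypothetical. */
  enum offload_entry_flags {
    OMP_DECLARE_TARGET_LINK = 0x01, /* entry has a 'link' attribute */
    OMP_DECLARE_TARGET_CTOR = 0x02, /* entry is a global constructor */
    OMP_DECLARE_TARGET_DTOR = 0x04  /* entry is a global destructor */
  };

  /* Hypothetical helper: returns true if every bit of `flag` is set in the
     `flags` field of an offloading entry. */
  static bool entry_has_flag(int32_t flags, int32_t flag) {
    assert(flag != 0); /* an empty mask would match every entry */
    return (flags & flag) == flag;
  }

For example, an entry whose ``flags`` field is
``OMP_DECLARE_TARGET_LINK | OMP_DECLARE_TARGET_DTOR`` passes the test for
``OMP_DECLARE_TARGET_DTOR`` and fails it for ``OMP_DECLARE_TARGET_CTOR``.
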
Structure Types
---------------

:ref:`table-tgt_offload_entry`, :ref:`table-tgt_device_image`, and
:ref:`table-tgt_bin_desc` are the structures used in the wrapper host IR.

  .. table:: __tgt_offload_entry structure
    :name: table-tgt_offload_entry

    +---------+------------+------------------------------------------------------------------------------------+
    |   Type  | Identifier | Description                                                                        |
    +=========+============+====================================================================================+
    |  void*  |    addr    | Address of global symbol within device image (function or global)                  |
    +---------+------------+------------------------------------------------------------------------------------+
    |  char*  |    name    | Name of the symbol                                                                 |
    +---------+------------+------------------------------------------------------------------------------------+
    |  size_t |    size    | Size of the entry info (0 if it is a function)                                     |
    +---------+------------+------------------------------------------------------------------------------------+
    | int32_t |    flags   | Flags associated with the entry (see :ref:`table-offloading-declare-target-flags`) |
    +---------+------------+------------------------------------------------------------------------------------+
    | int32_t |  reserved  | Reserved, to be used by the runtime library.                                       |
    +---------+------------+------------------------------------------------------------------------------------+

  .. table:: __tgt_device_image structure
    :name: table-tgt_device_image

    +----------------------+--------------+----------------------------------------+
    |         Type         |  Identifier  | Description                            |
    +======================+==============+========================================+
    |         void*        |  ImageStart  | Pointer to the target code start       |
    +----------------------+--------------+----------------------------------------+
    |         void*        |   ImageEnd   | Pointer to the target code end         |
    +----------------------+--------------+----------------------------------------+
    | __tgt_offload_entry* | EntriesBegin | Begin of table with all target entries |
    +----------------------+--------------+----------------------------------------+
    | __tgt_offload_entry* |  EntriesEnd  | End of table (non inclusive)           |
    +----------------------+--------------+----------------------------------------+

  .. table:: __tgt_bin_desc structure
    :name: table-tgt_bin_desc

    +----------------------+------------------+------------------------------------------+
    |         Type         |    Identifier    | Description                              |
    +======================+==================+==========================================+
    |        int32_t       |  NumDeviceImages | Number of device types supported         |
    +----------------------+------------------+------------------------------------------+
    |  __tgt_device_image* |   DeviceImages   | Array of device images (1 per dev. type) |
    +----------------------+------------------+------------------------------------------+
    | __tgt_offload_entry* | HostEntriesBegin | Begin of table with all host entries     |
    +----------------------+------------------+------------------------------------------+
    | __tgt_offload_entry* |  HostEntriesEnd  | End of table (non inclusive)             |
    +----------------------+------------------+------------------------------------------+

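The three tables above can be transcribed into C declarations roughly as
follows. This is a sketch based only on the field types and order listed in
the tables; the declarations used by the OpenMP offloading runtime remain
authoritative and may differ. The ``find_entry`` helper is hypothetical and is
included only to illustrate that an entry table is a half-open range whose end
pointer is not inclusive.

.. code-block:: c

  #include <assert.h>
  #include <stddef.h>
  #include <stdint.h>
  #include <string.h>

  typedef struct {
    void *addr;       /* address of the symbol within the device image */
    char *name;       /* name of the symbol */
    size_t size;      /* size of the entry info, 0 for a function */
    int32_t flags;    /* declare target flags */
    int32_t reserved; /* reserved for the runtime library */
  } __tgt_offload_entry;

  typedef struct {
    void *ImageStart;                  /* start of the target code */
    void *ImageEnd;                    /* end of the target code */
    __tgt_offload_entry *EntriesBegin; /* first target entry */
    __tgt_offload_entry *EntriesEnd;   /* one past the last target entry */
  } __tgt_device_image;

  typedef struct {
    int32_t NumDeviceImages;               /* number of device types */
    __tgt_device_image *DeviceImages;      /* one image per device type */
    __tgt_offload_entry *HostEntriesBegin; /* first host entry */
    __tgt_offload_entry *HostEntriesEnd;   /* one past the last host entry */
  } __tgt_bin_desc;

  /* Hypothetical helper: linear search of the half-open range [begin, end)
     for an entry with the given name. Returns NULL if there is none. */
  static const __tgt_offload_entry *
  find_entry(const __tgt_offload_entry *begin, const __tgt_offload_entry *end,
             const char *name) {
    assert(begin <= end);
    for (const __tgt_offload_entry *e = begin; e != end; ++e)
      if (strcmp(e->name, name) == 0)
        return e;
    return NULL;
  }
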
Global Variables
----------------

:ref:`table-global-variables` lists various global variables, along with their
type and their explicit ELF sections, which are used to store device images and
related symbols.

  .. table:: Global Variables
    :name: table-global-variables

    +--------------------------------+---------------------+-------------------------+---------------------------------------------------+
    |            Variable            |         Type        |       ELF Section       |                    Description                    |
    +================================+=====================+=========================+===================================================+
    | __start_omp_offloading_entries | __tgt_offload_entry | .omp_offloading_entries | Begin symbol for the offload entries table.       |
    +--------------------------------+---------------------+-------------------------+---------------------------------------------------+
    | __stop_omp_offloading_entries  | __tgt_offload_entry | .omp_offloading_entries | End symbol for the offload entries table.         |
    +--------------------------------+---------------------+-------------------------+---------------------------------------------------+
    | __dummy.omp_offloading.entry   | __tgt_offload_entry | .omp_offloading_entries | Dummy zero-sized object in the offload entries    |
    |                                |                     |                         | section to force linker to define begin/end       |
    |                                |                     |                         | symbols defined above.                            |
    +--------------------------------+---------------------+-------------------------+---------------------------------------------------+
    | .omp_offloading.device_image   |  __tgt_device_image | .omp_offloading_entries | ELF device code object of the first image.        |
    +--------------------------------+---------------------+-------------------------+---------------------------------------------------+
    | .omp_offloading.device_image.N |  __tgt_device_image | .omp_offloading_entries | ELF device code object of the (N+1)th image.      |
    +--------------------------------+---------------------+-------------------------+---------------------------------------------------+
    | .omp_offloading.device_images  |  __tgt_device_image | .omp_offloading_entries | Array of images.                                  |
    +--------------------------------+---------------------+-------------------------+---------------------------------------------------+
    | .omp_offloading.descriptor     | __tgt_bin_desc      | .omp_offloading_entries | Binary descriptor object (see details below).     |
    +--------------------------------+---------------------+-------------------------+---------------------------------------------------+

Binary Descriptor for Device Images
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This object is passed to the offloading runtime at program startup and
describes all device images available in the executable or shared library. It
is defined as follows:

.. code-block:: c

  __attribute__((visibility("hidden")))
  extern __tgt_offload_entry *__start_omp_offloading_entries;
  __attribute__((visibility("hidden")))
  extern __tgt_offload_entry *__stop_omp_offloading_entries;
  static const char Image0[] = { <Bufs.front() contents> };
  ...
  static const char ImageN[] = { <Bufs.back() contents> };
  static const __tgt_device_image Images[] = {
    {
      Image0,                            /*ImageStart*/
      Image0 + sizeof(Image0),           /*ImageEnd*/
      __start_omp_offloading_entries,    /*EntriesBegin*/
      __stop_omp_offloading_entries      /*EntriesEnd*/
    },
    ...
    {
      ImageN,                            /*ImageStart*/
      ImageN + sizeof(ImageN),           /*ImageEnd*/
      __start_omp_offloading_entries,    /*EntriesBegin*/
      __stop_omp_offloading_entries      /*EntriesEnd*/
    }
  };
  static const __tgt_bin_desc BinDesc = {
    sizeof(Images) / sizeof(Images[0]),  /*NumDeviceImages*/
    Images,                              /*DeviceImages*/
    __start_omp_offloading_entries,      /*HostEntriesBegin*/
    __stop_omp_offloading_entries        /*HostEntriesEnd*/
  };

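To illustrate how a consumer might read such a descriptor, the sketch below
walks ``DeviceImages`` and sums the size of every embedded image, computed as
``ImageEnd - ImageStart``. It is not taken from the offloading runtime: the
``total_image_bytes`` helper is hypothetical, the struct declarations are
transcribed from the structure tables, and the entry type is left opaque
because its layout is not needed here.

.. code-block:: c

  #include <assert.h>
  #include <stddef.h>
  #include <stdint.h>

  /* Opaque here: only pointers to entries are stored in the descriptor. */
  typedef struct __tgt_offload_entry __tgt_offload_entry;

  typedef struct {
    void *ImageStart;
    void *ImageEnd;
    __tgt_offload_entry *EntriesBegin;
    __tgt_offload_entry *EntriesEnd;
  } __tgt_device_image;

  typedef struct {
    int32_t NumDeviceImages;
    __tgt_device_image *DeviceImages;
    __tgt_offload_entry *HostEntriesBegin;
    __tgt_offload_entry *HostEntriesEnd;
  } __tgt_bin_desc;

  /* Hypothetical helper: total number of bytes of device code that the
     descriptor refers to, summed over all of its images. */
  static size_t total_image_bytes(const __tgt_bin_desc *desc) {
    size_t total = 0;
    for (int32_t i = 0; i < desc->NumDeviceImages; ++i) {
      const __tgt_device_image *img = &desc->DeviceImages[i];
      const char *start = img->ImageStart;
      const char *end = img->ImageEnd;
      assert(start <= end); /* ImageEnd points one past the last byte */
      total += (size_t)(end - start);
    }
    return total;
  }
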
Global Constructor and Destructor
---------------------------------

The global constructor (``.omp_offloading.descriptor_reg()``) registers the
library of images with the runtime by calling the ``__tgt_register_lib()``
function. The constructor is explicitly defined in the ``.text.startup``
section. Similarly, the global destructor
(``.omp_offloading.descriptor_unreg()``) calls ``__tgt_unregister_lib()`` to
unregister the library and is also defined in the ``.text.startup`` section.

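The registration pattern can be sketched in C with the GCC/Clang
``constructor`` and ``destructor`` function attributes. This is an
illustration only, not the code the wrapper emits: ``fake_register_lib`` and
``fake_unregister_lib`` are hypothetical stand-ins for the runtime's
``__tgt_register_lib()`` and ``__tgt_unregister_lib()``, the descriptor type
is reduced to a placeholder, and the explicit ``.text.startup`` section
placement is omitted.

.. code-block:: c

  #include <assert.h>
  #include <stddef.h>

  /* Placeholder for the binary descriptor; only its address matters here. */
  typedef struct { int NumDeviceImages; } __tgt_bin_desc;
  static __tgt_bin_desc BinDesc = {0};

  /* Hypothetical stand-ins for the runtime entry points. They only record
     which descriptor is currently registered. */
  static const __tgt_bin_desc *registered_desc = NULL;

  static void fake_register_lib(const __tgt_bin_desc *desc) {
    assert(registered_desc == NULL); /* nothing registered yet */
    registered_desc = desc;
  }

  static void fake_unregister_lib(const __tgt_bin_desc *desc) {
    assert(registered_desc == desc); /* must match the registration */
    registered_desc = NULL;
  }

  /* Runs before main(), playing the role of the global constructor. */
  __attribute__((constructor)) static void descriptor_reg(void) {
    fake_register_lib(&BinDesc);
  }

  /* Runs at normal program exit, playing the role of the global destructor. */
  __attribute__((destructor)) static void descriptor_unreg(void) {
    fake_unregister_lib(&BinDesc);
  }

With this arrangement, ``registered_desc`` already points at ``BinDesc`` when
``main()`` starts, which mirrors the requirement that device images be
registered before any host code can launch a target region.
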
.. _image-binary-embedding-execution:

Image Binary Embedding and Execution for OpenMP
===============================================

For each offloading target, device ELF code objects are generated by the
``clang``, ``opt``, ``llc``, and ``lld`` pipeline. These code objects are
passed to ``clang-offload-wrapper``.

  * At compile time, the ``clang-offload-wrapper`` tool takes the following
    actions:

    * It embeds the ELF code objects for the device into the host code (see
      :ref:`openmp-device-binary_embedding`).

  * At execution time:

    * The global constructor runs and registers the device image.