=======================================================
How to Update Debug Info: A Guide for LLVM Pass Authors
=======================================================

.. contents::
   :local:

Introduction
============

Certain kinds of code transformations can inadvertently result in a loss of
debug info, or worse, make debug info misrepresent the state of a program.

This document specifies how to correctly update debug info in various kinds of
code transformations, and offers suggestions for how to create targeted debug
info tests for arbitrary transformations.

For more on the philosophy behind LLVM debugging information, see
:doc:`SourceLevelDebugging`.

IR-level transformations
========================

Deleting an Instruction
-----------------------

When an ``Instruction`` is deleted, its debug uses change to ``undef``. This is
a loss of debug info: the value of one or more source variables becomes
unavailable, starting with the ``llvm.dbg.value(undef, ...)``. When there is no
way to reconstitute the value of the lost instruction, this is the best
possible outcome. However, it's often possible to do better:

* If the dying instruction can be RAUW'd, do so. The
  ``Value::replaceAllUsesWith`` API transparently updates debug uses of the
  dying instruction to point to the replacement value.

* If the dying instruction cannot be RAUW'd, call
  ``llvm::salvageDebugInfoOrMarkUndef`` on it. This makes a best-effort attempt
  to rewrite debug uses of the dying instruction by describing its effect as a
  ``DIExpression``.

* If one of the **operands** of a dying instruction would become trivially
  dead, use ``llvm::replaceAllDbgUsesWith`` to rewrite the debug uses of that
  operand. Consider the following example function:

.. code-block:: llvm

  define i16 @foo(i16 %a) {
    %b = sext i16 %a to i32
    %c = and i32 %b, 15
    call void @llvm.dbg.value(metadata i32 %c, ...)
    %d = trunc i32 %c to i16
    ret i16 %d
  }

Now, here's what happens after the unnecessary truncation instruction ``%d`` is
replaced with a simplified instruction:

.. code-block:: llvm

  define i16 @foo(i16 %a) {
    call void @llvm.dbg.value(metadata i32 undef, ...)
    %simplified = and i16 %a, 15
    ret i16 %simplified
  }

Note that after deleting ``%d``, all uses of its operand ``%c`` become
trivially dead. The debug use which used to point to ``%c`` is now ``undef``,
and debug info is needlessly lost.

To solve this problem, do:

.. code-block:: cpp

  llvm::replaceAllDbgUsesWith(%c, theSimplifiedAndInstruction, ...)

This results in better debug info because the debug use of ``%c`` is preserved:

.. code-block:: llvm

  define i16 @foo(i16 %a) {
    %simplified = and i16 %a, 15
    call void @llvm.dbg.value(metadata i16 %simplified, ...)
    ret i16 %simplified
  }

You may have noticed that ``%simplified`` is narrower than ``%c``: this is not
a problem, because ``llvm::replaceAllDbgUsesWith`` takes care of inserting the
necessary conversion operations into the DIExpressions of updated debug uses.
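
The difference between the two outcomes in this section can be illustrated
with a small, self-contained toy model. This is only a sketch: ``ToyValue``,
``ToyDbgValue``, ``ToyFunction``, and ``describe`` are hypothetical names
invented for this example, not LLVM classes or functions, and the model
ignores the ``DIExpression`` conversions that ``llvm::replaceAllDbgUsesWith``
takes care of.

.. code-block:: cpp

  #include <string>
  #include <vector>

  // Toy stand-ins for IR values and debug-value records. These are NOT LLVM
  // classes; they only model "a debug use points at a value, or at undef".
  struct ToyValue {
    std::string Name;
  };

  struct ToyDbgValue {
    std::string Variable;     // Source variable being described.
    const ToyValue *Location; // nullptr models undef.
  };

  struct ToyFunction {
    std::vector<ToyDbgValue> DbgValues;

    // Deleting a value with no debug info update: each debug use of the
    // value degrades to undef.
    void eraseValue(const ToyValue &Dead) {
      for (ToyDbgValue &DV : DbgValues)
        if (DV.Location == &Dead)
          DV.Location = nullptr;
    }

    // Redirect debug uses of From to To before From is deleted. Returns the
    // number of debug uses that were rewritten.
    int replaceAllDbgUses(const ToyValue &From, const ToyValue &To) {
      int NumRewritten = 0;
      for (ToyDbgValue &DV : DbgValues) {
        if (DV.Location == &From) {
          DV.Location = &To;
          ++NumRewritten;
        }
      }
      return NumRewritten;
    }
  };

  // Render the location of a debug use the way the IR examples print it.
  std::string describe(const ToyDbgValue &DV) {
    return DV.Location ? DV.Location->Name : "undef";
  }

In this model, erasing ``%c`` directly leaves ``describe`` returning
``undef`` for the variable, whereas calling ``replaceAllDbgUses`` first keeps
the variable pointing at ``%simplified`` after ``%c`` is erased.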

Hoisting an Instruction
-----------------------

TODO

Sinking an Instruction
----------------------

TODO

Cloning an Instruction
----------------------

TODO

Merging two Instructions
------------------------

TODO

Creating an artificial Instruction
----------------------------------

TODO

Mutation testing for IR-level transformations
---------------------------------------------

An IR test case for a transformation can, in many cases, be automatically
mutated to test debug info handling within that transformation. This is a
simple way to test for proper debug info handling.

The ``debugify`` utility
^^^^^^^^^^^^^^^^^^^^^^^^

The ``debugify`` testing utility is just a pair of passes: ``debugify`` and
``check-debugify``.

The first applies synthetic debug information to every instruction of the
module, and the second checks that this DI is still available after an
optimization has occurred, reporting any errors/warnings while doing so.

The instructions are assigned sequentially increasing line locations, and are
immediately used by debug value intrinsics everywhere possible.

For example, here is a module before:

.. code-block:: llvm

   define void @f(i32* %x) {
   entry:
     %x.addr = alloca i32*, align 8
     store i32* %x, i32** %x.addr, align 8
     %0 = load i32*, i32** %x.addr, align 8
     store i32 10, i32* %0, align 4
     ret void
   }

and after running ``opt -debugify``:

.. code-block:: llvm

   define void @f(i32* %x) !dbg !6 {
   entry:
     %x.addr = alloca i32*, align 8, !dbg !12
     call void @llvm.dbg.value(metadata i32** %x.addr, metadata !9, metadata !DIExpression()), !dbg !12
     store i32* %x, i32** %x.addr, align 8, !dbg !13
     %0 = load i32*, i32** %x.addr, align 8, !dbg !14
     call void @llvm.dbg.value(metadata i32* %0, metadata !11, metadata !DIExpression()), !dbg !14
     store i32 10, i32* %0, align 4, !dbg !15
     ret void, !dbg !16
   }

   !llvm.dbg.cu = !{!0}
   !llvm.debugify = !{!3, !4}
   !llvm.module.flags = !{!5}

   !0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2)
   !1 = !DIFile(filename: "debugify-sample.ll", directory: "/")
   !2 = !{}
   !3 = !{i32 5}
   !4 = !{i32 2}
   !5 = !{i32 2, !"Debug Info Version", i32 3}
   !6 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !1, line: 1, type: !7, isLocal: false, isDefinition: true, scopeLine: 1, isOptimized: true, unit: !0, retainedNodes: !8)
   !7 = !DISubroutineType(types: !2)
   !8 = !{!9, !11}
   !9 = !DILocalVariable(name: "1", scope: !6, file: !1, line: 1, type: !10)
   !10 = !DIBasicType(name: "ty64", size: 64, encoding: DW_ATE_unsigned)
   !11 = !DILocalVariable(name: "2", scope: !6, file: !1, line: 3, type: !10)
   !12 = !DILocation(line: 1, column: 1, scope: !6)
   !13 = !DILocation(line: 2, column: 1, scope: !6)
   !14 = !DILocation(line: 3, column: 1, scope: !6)
   !15 = !DILocation(line: 4, column: 1, scope: !6)
   !16 = !DILocation(line: 5, column: 1, scope: !6)
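
The line bookkeeping performed by the two passes can be approximated with a
small toy model. This is a sketch only: ``ToyInst``, ``toyDebugify``, and
``toyCheckDebugify`` are hypothetical names invented for this example, a line
of ``0`` is assumed to stand for a dropped location, and only line locations
are modeled (the synthetic variables are left out).

.. code-block:: cpp

  #include <string>
  #include <vector>

  // Toy stand-in for an instruction; NOT an LLVM class.
  struct ToyInst {
    std::string Text;
    unsigned Line; // 0 models a missing debug location (an assumption).
  };

  // Models debugify: assign sequentially increasing line numbers, starting
  // at 1, and return how many lines were handed out.
  unsigned toyDebugify(std::vector<ToyInst> &Insts) {
    unsigned NextLine = 1;
    for (ToyInst &I : Insts)
      I.Line = NextLine++;
    return NextLine - 1;
  }

  // Models check-debugify: return every synthetic line in [1, NumLines]
  // that no surviving instruction carries any more.
  std::vector<unsigned> toyCheckDebugify(const std::vector<ToyInst> &Insts,
                                         unsigned NumLines) {
    std::vector<bool> Seen(NumLines + 1, false);
    for (const ToyInst &I : Insts)
      if (I.Line != 0 && I.Line <= NumLines)
        Seen[I.Line] = true;

    std::vector<unsigned> Missing;
    for (unsigned Line = 1; Line <= NumLines; ++Line)
      if (!Seen[Line])
        Missing.push_back(Line);
    return Missing;
  }

For a five-instruction function like the one in this section, ``toyDebugify``
hands out lines 1 through 5. If a transformation then drops the location of
the ``load`` and deletes the second ``store``, ``toyCheckDebugify`` reports
lines 3 and 4 as missing.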

Using ``debugify``
^^^^^^^^^^^^^^^^^^

A simple way to use ``debugify`` is as follows:

.. code-block:: bash

  $ opt -debugify -pass-to-test -check-debugify sample.ll

This will inject synthetic DI into ``sample.ll``, run ``pass-to-test``, and
then check for missing DI. The ``-check-debugify`` step can of course be
omitted in favor of more customizable FileCheck directives.

Some other ways to run debugify are available:

.. code-block:: bash

   # Same as the above example.
   $ opt -enable-debugify -pass-to-test sample.ll

   # Suppresses verbose debugify output.
   $ opt -enable-debugify -debugify-quiet -pass-to-test sample.ll

   # Prepend -debugify before and append -check-debugify -strip after
   # each pass in the pipeline (similar to -verify-each).
   $ opt -debugify-each -O2 sample.ll

In order for ``check-debugify`` to work, the DI must be coming from
``debugify``. Thus, modules with existing DI will be skipped.

``debugify`` can be used to test a backend, e.g.:

.. code-block:: bash

   $ opt -debugify < sample.ll | llc -o -

There is also a MIR-level debugify pass that can be run before each backend
pass; see
:ref:`Mutation testing for MIR-level transformations`.

``debugify`` in regression tests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The output of the ``debugify`` pass must be stable enough to use in regression
tests. Changes to this pass are not allowed to break existing tests.

.. note::

   Regression tests must be robust. Avoid hardcoding line/variable numbers in
   check lines. In cases where this can't be avoided (say, if a test wouldn't
   be precise enough), moving the test to its own file is preferred.

MIR-level transformations
=========================

Deleting a MachineInstr
-----------------------

TODO

Hoisting a MachineInstr
-----------------------

TODO

Sinking a MachineInstr
----------------------

TODO

Cloning a MachineInstr
----------------------

TODO

Creating an artificial MachineInstr
-----------------------------------

TODO

Mutation testing for MIR-level transformations
----------------------------------------------

A variant of the ``debugify`` utility described in :ref:`Mutation testing for
IR-level transformations` can be used for MIR-level transformations as well:
much like the IR-level pass, ``mir-debugify`` assigns sequentially increasing
line locations to each ``MachineInstr`` in a ``Module`` (although there is no
equivalent MIR-level ``check-debugify`` pass).

For example, here is a snippet before:

.. code-block:: llvm

  name:            test
  body:             |
    bb.1 (%ir-block.0):
      %0:_(s32) = IMPLICIT_DEF
      %1:_(s32) = IMPLICIT_DEF
      %2:_(s32) = G_CONSTANT i32 2
      %3:_(s32) = G_ADD %0, %2
      %4:_(s32) = G_SUB %3, %1

and after running ``llc -run-pass=mir-debugify``:

.. code-block:: llvm

  name:            test
  body:             |
    bb.0 (%ir-block.0):
      %0:_(s32) = IMPLICIT_DEF debug-location !12
      DBG_VALUE %0(s32), $noreg, !9, !DIExpression(), debug-location !12
      %1:_(s32) = IMPLICIT_DEF debug-location !13
      DBG_VALUE %1(s32), $noreg, !11, !DIExpression(), debug-location !13
      %2:_(s32) = G_CONSTANT i32 2, debug-location !14
      DBG_VALUE %2(s32), $noreg, !9, !DIExpression(), debug-location !14
      %3:_(s32) = G_ADD %0, %2, debug-location !DILocation(line: 4, column: 1, scope: !6)
      DBG_VALUE %3(s32), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 4, column: 1, scope: !6)
      %4:_(s32) = G_SUB %3, %1, debug-location !DILocation(line: 5, column: 1, scope: !6)
      DBG_VALUE %4(s32), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 5, column: 1, scope: !6)

By default, ``mir-debugify`` inserts ``DBG_VALUE`` instructions **everywhere**
it is legal to do so.  In particular, every (non-PHI) machine instruction that
defines a register should be followed by a ``DBG_VALUE`` use of that def.  If
an instruction does not define a register, but can be followed by a debug inst,
MIRDebugify inserts a ``DBG_VALUE`` that references a constant.  Insertion of
``DBG_VALUE`` instructions can be disabled by setting ``-debugify-level=locations``.
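
The insertion rule described in the preceding paragraph can be sketched as a
small toy model. The types and names used here (``ToyMachineInstr``,
``ToyLevel``, ``toyMirDebugify``) are hypothetical and are not part of LLVM.
The sketch also assumes that PHIs receive a line location but no
``DBG_VALUE``, which the description does not spell out.

.. code-block:: cpp

  #include <string>
  #include <vector>

  // Toy stand-in for a machine instruction; NOT an LLVM class.
  struct ToyMachineInstr {
    std::string Text;
    bool IsPHI;
    bool DefinesReg;
    bool CanBeFollowedByDebugInst;
  };

  // Hypothetical mirror of the two behaviours described in the text.
  enum class ToyLevel { LocationsOnly, LocationsAndVariables };

  // Returns the instruction stream as text, with a line location appended to
  // each instruction and DBG_VALUEs inserted according to the stated rule.
  std::vector<std::string>
  toyMirDebugify(const std::vector<ToyMachineInstr> &MIs, ToyLevel Level) {
    std::vector<std::string> Out;
    unsigned Line = 0;
    for (const ToyMachineInstr &MI : MIs) {
      ++Line;
      Out.push_back(MI.Text + " ; line " + std::to_string(Line));
      if (Level == ToyLevel::LocationsOnly || MI.IsPHI)
        continue;
      if (MI.DefinesReg)
        Out.push_back("DBG_VALUE <def> ; line " + std::to_string(Line));
      else if (MI.CanBeFollowedByDebugInst)
        Out.push_back("DBG_VALUE <constant> ; line " + std::to_string(Line));
    }
    return Out;
  }

With ``ToyLevel::LocationsOnly``, the output has exactly one entry per input
instruction, mirroring the effect of ``-debugify-level=locations``.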

To run MIRDebugify once, simply insert ``mir-debugify`` into your ``llc``
invocation, like:

.. code-block:: bash

  # Before some other pass.
  $ llc -run-pass=mir-debugify,other-pass ...

  # After some other pass.
  $ llc -run-pass=other-pass,mir-debugify ...

To run MIRDebugify before each pass in a pipeline, use
``-debugify-and-strip-all-safe``. This can be combined with ``-start-before``
and ``-start-after``. For example:

.. code-block:: bash

  $ llc -debugify-and-strip-all-safe -run-pass=... <other llc args>
  $ llc -debugify-and-strip-all-safe -O1 <other llc args>

To strip out all debug info from a test, use ``mir-strip-debug``, like:

.. code-block:: bash

  $ llc -run-pass=mir-debugify,other-pass,mir-strip-debug

It can be useful to combine ``mir-debugify`` and ``mir-strip-debug`` to
identify backend transformations which break in the presence of debug info.
For example, to run the AArch64 backend tests with all normal passes
"sandwiched" in between MIRDebugify and MIRStripDebugify mutation passes, run:

.. code-block:: bash

  $ llvm-lit test/CodeGen/AArch64 -Dllc="llc -debugify-and-strip-all-safe"

Using the LostDebugLocObserver
------------------------------

TODO
