=======================================================
How to Update Debug Info: A Guide for LLVM Pass Authors
=======================================================

.. contents::
   :local:

Introduction
============

Certain kinds of code transformations can inadvertently result in a loss of
debug info, or worse, make debug info misrepresent the state of a program.

This document specifies how to correctly update debug info in various kinds of
code transformations, and offers suggestions for how to create targeted debug
info tests for arbitrary transformations.

For more on the philosophy behind LLVM debugging information, see
:doc:`SourceLevelDebugging`.

IR-level transformations
========================

Deleting an Instruction
-----------------------

When an ``Instruction`` is deleted, its debug uses change to ``undef``. This is
a loss of debug info: the value of one or more source variables becomes
unavailable, starting with the ``llvm.dbg.value(undef, ...)``. When there is no
way to reconstitute the value of the lost instruction, this is the best
possible outcome. However, it's often possible to do better:

* If the dying instruction can be RAUW'd, do so. The
  ``Value::replaceAllUsesWith`` API transparently updates debug uses of the
  dying instruction to point to the replacement value.

* If the dying instruction cannot be RAUW'd, call ``llvm::salvageDebugInfo`` on
  it. This makes a best-effort attempt to rewrite debug uses of the dying
  instruction by describing its effect as a ``DIExpression``.

* If one of the **operands** of a dying instruction would become trivially
  dead, use ``llvm::replaceAllDbgUsesWith`` to rewrite the debug uses of that
  operand. Consider the following example function:

.. code-block:: llvm

  define i16 @foo(i16 %a) {
    %b = sext i16 %a to i32
    %c = and i32 %b, 15
    call void @llvm.dbg.value(metadata i32 %c, ...)
    %d = trunc i32 %c to i16
    ret i16 %d
  }

Now, here's what happens after the unnecessary truncation instruction ``%d`` is
replaced with a simplified instruction:

.. code-block:: llvm

  define i16 @foo(i16 %a) {
    call void @llvm.dbg.value(metadata i32 undef, ...)
    %simplified = and i16 %a, 15
    ret i16 %simplified
  }

Note that after deleting ``%d``, all uses of its operand ``%c`` become
trivially dead. The debug use which used to point to ``%c`` is now ``undef``,
and debug info is needlessly lost.

To solve this problem, do:

.. code-block:: cpp

  llvm::replaceAllDbgUsesWith(%c, theSimplifiedAndInstruction, ...)

This results in better debug info because the debug use of ``%c`` is preserved:

.. code-block:: llvm

  define i16 @foo(i16 %a) {
    %simplified = and i16 %a, 15
    call void @llvm.dbg.value(metadata i16 %simplified, ...)
    ret i16 %simplified
  }

You may have noticed that ``%simplified`` is narrower than ``%c``: this is not
a problem, because ``llvm::replaceAllDbgUsesWith`` takes care of inserting the
necessary conversion operations into the DIExpressions of updated debug uses.
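
The difference between losing and preserving a debug use can be sketched with
a small toy model. This is not LLVM's API: ``DebugUse``,
``eraseWithoutUpdate``, and ``replaceDebugUses`` are hypothetical names used
only for illustration, and the ``ext`` entry merely stands in for the
conversion operations that ``llvm::replaceAllDbgUsesWith`` adds to a
``DIExpression``.

.. code-block:: cpp

  #include <cassert>
  #include <string>
  #include <vector>

  // Toy stand-in for a debug use: the value it describes, plus a list of
  // expression operations (loosely analogous to a DIExpression).
  struct DebugUse {
    std::string Value;
    std::vector<std::string> Expr;
  };

  // Deleting a value without updating its debug uses leaves them "undef".
  inline void eraseWithoutUpdate(std::vector<DebugUse> &Uses,
                                 const std::string &Dead) {
    for (DebugUse &U : Uses)
      if (U.Value == Dead)
        U.Value = "undef";
  }

  // Point debug uses of From at To. If To is narrower than From, record an
  // extension so the variable is still described at its original width.
  inline void replaceDebugUses(std::vector<DebugUse> &Uses,
                               const std::string &From, unsigned FromBits,
                               const std::string &To, unsigned ToBits) {
    assert(FromBits > 0 && ToBits > 0 && "widths must be non-zero");
    for (DebugUse &U : Uses) {
      if (U.Value != From)
        continue;
      U.Value = To;
      if (ToBits < FromBits)
        U.Expr.push_back("ext " + std::to_string(ToBits) + "->" +
                         std::to_string(FromBits));
    }
  }

In this model, calling ``replaceDebugUses(Uses, "%c", 32, "%simplified", 16)``
on a use of ``%c`` leaves it pointing at ``%simplified`` with a single ``ext``
entry, which mirrors the ``i32`` to ``i16`` example above.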

Deleting a MIR-level MachineInstr
---------------------------------

TODO

How to automatically convert tests into debug info tests
========================================================

.. _IRDebugify:

Mutation testing for IR-level transformations
---------------------------------------------

An IR test case for a transformation can, in many cases, be automatically
mutated to test debug info handling within that transformation. This is a
simple way to test for proper debug info handling.

The ``debugify`` utility
^^^^^^^^^^^^^^^^^^^^^^^^

The ``debugify`` testing utility is just a pair of passes: ``debugify`` and
``check-debugify``.

The first applies synthetic debug information to every instruction of the
module, and the second checks that this DI is still available after an
optimization has occurred, reporting any errors/warnings while doing so.

The instructions are assigned sequentially increasing line locations, and are
immediately used by debug value intrinsics everywhere possible.

For example, here is a module before:

.. code-block:: llvm

   define void @f(i32* %x) {
   entry:
     %x.addr = alloca i32*, align 8
     store i32* %x, i32** %x.addr, align 8
     %0 = load i32*, i32** %x.addr, align 8
     store i32 10, i32* %0, align 4
     ret void
   }

and after running ``opt -debugify``:

.. code-block:: llvm

   define void @f(i32* %x) !dbg !6 {
   entry:
     %x.addr = alloca i32*, align 8, !dbg !12
     call void @llvm.dbg.value(metadata i32** %x.addr, metadata !9, metadata !DIExpression()), !dbg !12
     store i32* %x, i32** %x.addr, align 8, !dbg !13
     %0 = load i32*, i32** %x.addr, align 8, !dbg !14
     call void @llvm.dbg.value(metadata i32* %0, metadata !11, metadata !DIExpression()), !dbg !14
     store i32 10, i32* %0, align 4, !dbg !15
     ret void, !dbg !16
   }

   !llvm.dbg.cu = !{!0}
   !llvm.debugify = !{!3, !4}
   !llvm.module.flags = !{!5}

   !0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "debugify", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, enums: !2)
   !1 = !DIFile(filename: "debugify-sample.ll", directory: "/")
   !2 = !{}
   !3 = !{i32 5}
   !4 = !{i32 2}
   !5 = !{i32 2, !"Debug Info Version", i32 3}
   !6 = distinct !DISubprogram(name: "f", linkageName: "f", scope: null, file: !1, line: 1, type: !7, isLocal: false, isDefinition: true, scopeLine: 1, isOptimized: true, unit: !0, retainedNodes: !8)
   !7 = !DISubroutineType(types: !2)
   !8 = !{!9, !11}
   !9 = !DILocalVariable(name: "1", scope: !6, file: !1, line: 1, type: !10)
   !10 = !DIBasicType(name: "ty64", size: 64, encoding: DW_ATE_unsigned)
   !11 = !DILocalVariable(name: "2", scope: !6, file: !1, line: 3, type: !10)
   !12 = !DILocation(line: 1, column: 1, scope: !6)
   !13 = !DILocation(line: 2, column: 1, scope: !6)
   !14 = !DILocation(line: 3, column: 1, scope: !6)
   !15 = !DILocation(line: 4, column: 1, scope: !6)
   !16 = !DILocation(line: 5, column: 1, scope: !6)
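
The numbering scheme in this example can be approximated by a short toy model.
The sketch below is not the pass itself: ``ToyInst``, ``ToyCounts``, and
``toyDebugify`` are hypothetical names, and the model assumes that only
instructions producing a value receive a synthetic variable, as the example
output suggests.

.. code-block:: cpp

  #include <cassert>
  #include <string>
  #include <vector>

  // Toy instruction: only records whether it produces a (non-void) value.
  struct ToyInst {
    std::string Text;
    bool ProducesValue;
    unsigned Line;     // 0 means "no location assigned yet".
    unsigned Variable; // 0 means "no synthetic variable attached".
  };

  struct ToyCounts {
    unsigned NumLines;
    unsigned NumVariables;
  };

  // Give each instruction the next line number, and each value-producing
  // instruction the next synthetic variable number.
  inline ToyCounts toyDebugify(std::vector<ToyInst> &Insts) {
    ToyCounts Counts = {0, 0};
    for (ToyInst &I : Insts) {
      assert(I.Line == 0 && "instruction already has a location");
      I.Line = ++Counts.NumLines;
      if (I.ProducesValue)
        I.Variable = ++Counts.NumVariables;
    }
    return Counts;
  }

Applied to the five instructions of ``@f`` (an ``alloca``, a ``store``, a
``load``, a ``store``, and a ``ret``), the model assigns lines 1 through 5 and
two variables, with variable 2 attached at line 3. That is consistent with the
``i32 5`` and ``i32 2`` operands under ``!llvm.debugify`` in the output above.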

Using ``debugify``
^^^^^^^^^^^^^^^^^^

A simple way to use ``debugify`` is as follows:

.. code-block:: bash

  $ opt -debugify -pass-to-test -check-debugify sample.ll

This will inject synthetic DI into ``sample.ll``, run the ``pass-to-test``,
and then check for missing DI. The ``-check-debugify`` step can of course be
omitted in favor of more customizable FileCheck directives.

Some other ways to run debugify are available:

.. code-block:: bash

   # Same as the above example.
   $ opt -enable-debugify -pass-to-test sample.ll

   # Suppresses verbose debugify output.
   $ opt -enable-debugify -debugify-quiet -pass-to-test sample.ll

   # Prepend -debugify before and append -check-debugify -strip after
   # each pass in the pipeline (similar to -verify-each).
   $ opt -debugify-each -O2 sample.ll

In order for ``check-debugify`` to work, the DI must be coming from
``debugify``. Thus, modules with existing DI will be skipped.
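
Conceptually, the check for missing DI resembles a set difference between the
line numbers that ``debugify`` assigned and the ones that survive the pass
under test. The following is a minimal sketch of that idea only;
``toyMissingLines`` is a hypothetical helper, and the real pass reports more
than missing lines.

.. code-block:: cpp

  #include <cassert>
  #include <set>
  #include <vector>

  // Return every line in [1, OriginalNumLines] that no surviving instruction
  // still carries. A line value of 0 models an instruction with no location.
  inline std::vector<unsigned>
  toyMissingLines(unsigned OriginalNumLines,
                  const std::vector<unsigned> &SurvivingLines) {
    std::set<unsigned> Seen;
    for (unsigned Line : SurvivingLines) {
      assert(Line <= OriginalNumLines && "line was not assigned by debugify");
      if (Line != 0)
        Seen.insert(Line);
    }
    std::vector<unsigned> Missing;
    for (unsigned Line = 1; Line <= OriginalNumLines; ++Line)
      if (!Seen.count(Line))
        Missing.push_back(Line);
    return Missing;
  }

This also illustrates why the DI has to come from ``debugify``: the comparison
only makes sense when the original line numbers are known to be the sequence
1, 2, 3, and so on.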

``debugify`` can be used to test a backend, e.g.:

.. code-block:: bash

   $ opt -debugify < sample.ll | llc -o -

There is also a MIR-level debugify pass that can be run before each backend
pass; see
:ref:`Mutation testing for MIR-level transformations<MIRDebugify>`.

``debugify`` in regression tests
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The output of the ``debugify`` pass must be stable enough to use in regression
tests. Changes to this pass are not allowed to break existing tests.

.. note::

   Regression tests must be robust. Avoid hardcoding line/variable numbers in
   check lines. In cases where this can't be avoided (say, if a test wouldn't
   be precise enough), moving the test to its own file is preferred.

.. _MIRDebugify:

Mutation testing for MIR-level transformations
----------------------------------------------

A variant of the ``debugify`` utility described in
:ref:`Mutation testing for IR-level transformations<IRDebugify>` can be used
for MIR-level transformations as well: much like the IR-level pass,
``mir-debugify`` inserts sequentially increasing line locations to each
``MachineInstr`` in a ``Module`` (although there is no equivalent MIR-level
``check-debugify`` pass).

For example, here is a snippet before:

.. code-block:: llvm

  name:            test
  body:             |
    bb.1 (%ir-block.0):
      %0:_(s32) = IMPLICIT_DEF
      %1:_(s32) = IMPLICIT_DEF
      %2:_(s32) = G_CONSTANT i32 2
      %3:_(s32) = G_ADD %0, %2
      %4:_(s32) = G_SUB %3, %1

and after running ``llc -run-pass=mir-debugify``:

.. code-block:: llvm

  name:            test
  body:             |
    bb.0 (%ir-block.0):
      %0:_(s32) = IMPLICIT_DEF debug-location !12
      DBG_VALUE %0(s32), $noreg, !9, !DIExpression(), debug-location !12
      %1:_(s32) = IMPLICIT_DEF debug-location !13
      DBG_VALUE %1(s32), $noreg, !11, !DIExpression(), debug-location !13
      %2:_(s32) = G_CONSTANT i32 2, debug-location !14
      DBG_VALUE %2(s32), $noreg, !9, !DIExpression(), debug-location !14
      %3:_(s32) = G_ADD %0, %2, debug-location !DILocation(line: 4, column: 1, scope: !6)
      DBG_VALUE %3(s32), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 4, column: 1, scope: !6)
      %4:_(s32) = G_SUB %3, %1, debug-location !DILocation(line: 5, column: 1, scope: !6)
      DBG_VALUE %4(s32), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 5, column: 1, scope: !6)

By default, ``mir-debugify`` inserts ``DBG_VALUE`` instructions **everywhere**
it is legal to do so.  In particular, every (non-PHI) machine instruction that
defines a register must be followed by a ``DBG_VALUE`` use of that def.  If
an instruction does not define a register, but can be followed by a debug
instruction, MIRDebugify inserts a ``DBG_VALUE`` that references a constant.
Insertion of ``DBG_VALUE`` instructions can be disabled by setting
``-debugify-level=locations``.
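
The insertion rule described above can be modeled roughly as follows. This is
a toy sketch rather than the pass's implementation: ``ToyMachineInst``,
``ToyLevel``, and ``toyMirDebugify`` are hypothetical names, the
``CanPrecedeDebugInst`` flag is an assumed simplification of the legality
check, and the sketch simply skips PHIs instead of modeling how the pass
treats them.

.. code-block:: cpp

  #include <cassert>
  #include <string>
  #include <vector>

  enum class ToyLevel { Locations, LocationsAndVariables };

  struct ToyMachineInst {
    std::string Text;
    std::string DefReg;       // Empty if no register is defined.
    bool IsPHI;
    bool CanPrecedeDebugInst; // Hypothetical: may a debug inst follow?
  };

  // Assign increasing line numbers to every instruction and, unless only
  // locations were requested, follow each eligible one with a DBG_VALUE.
  inline std::vector<std::string>
  toyMirDebugify(const std::vector<ToyMachineInst> &Insts, ToyLevel Level) {
    std::vector<std::string> Out;
    unsigned Line = 0;
    for (const ToyMachineInst &MI : Insts) {
      assert(!(MI.IsPHI && MI.DefReg.empty()) && "a PHI defines a register");
      Out.push_back(MI.Text + " @line " + std::to_string(++Line));
      if (Level == ToyLevel::Locations || MI.IsPHI || !MI.CanPrecedeDebugInst)
        continue;
      // Use the defined register if there is one, otherwise a constant.
      std::string Operand = MI.DefReg;
      if (Operand.empty())
        Operand = "<constant>";
      Out.push_back("DBG_VALUE " + Operand);
    }
    return Out;
  }

Here ``ToyLevel::Locations`` plays the role of ``-debugify-level=locations``:
every instruction still receives a line, but no ``DBG_VALUE`` is emitted.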

To run MIRDebugify once, simply insert ``mir-debugify`` into your ``llc``
invocation, like:

.. code-block:: bash

  # Before some other pass.
  $ llc -run-pass=mir-debugify,other-pass ...

  # After some other pass.
  $ llc -run-pass=other-pass,mir-debugify ...

To run MIRDebugify before each pass in a pipeline, use
``-debugify-and-strip-all-safe``. This can be combined with ``-start-before``
and ``-start-after``. For example:

.. code-block:: bash

  $ llc -debugify-and-strip-all-safe -run-pass=... <other llc args>
  $ llc -debugify-and-strip-all-safe -O1 <other llc args>

To strip out all debug info from a test, use ``mir-strip-debug``, like:

.. code-block:: bash

  $ llc -run-pass=mir-debugify,other-pass,mir-strip-debug

It can be useful to combine ``mir-debugify`` and ``mir-strip-debug`` to
identify backend transformations which break in the presence of debug info.
For example, to run the AArch64 backend tests with all normal passes
"sandwiched" in between MIRDebugify and MIRStripDebugify mutation passes, run:

.. code-block:: bash

  $ llvm-lit test/CodeGen/AArch64 -Dllc="llc -debugify-and-strip-all-safe"

Using LostDebugLocObserver
--------------------------

TODO
