======================================
Kaleidoscope: Adding Debug Information
======================================

.. contents::
   :local:

Chapter 9 Introduction
======================

Welcome to Chapter 9 of the "`Implementing a language with
LLVM <index.html>`_" tutorial. In chapters 1 through 8, we've built a
decent little programming language with functions and variables.
What happens if something goes wrong, though? How do you debug your
program?

Source level debugging uses formatted data that helps a debugger
translate from binary and the state of the machine back to the
source that the programmer wrote. In LLVM we generally use a format
called `DWARF <http://dwarfstd.org>`_. DWARF is a compact encoding
that represents types, source locations, and variable locations.

The short summary of this chapter is that we'll go through the
various things you have to add to a programming language to
support debug info, and how you translate that into DWARF.

Caveat: For now we can't debug via the JIT, so we'll need to compile
our program down to something small and standalone. As part of this
we'll make a few modifications to the running of the language and
how programs are compiled. This means that we'll have a source file
with a simple program written in Kaleidoscope rather than the
interactive JIT. To reduce the number of changes necessary, we also
accept a limitation: a program can have only one "top level" command
at a time.

Here's the sample program we'll be compiling:

.. code-block:: python

   def fib(x)
     if x < 3 then
       1
     else
       fib(x-1)+fib(x-2);

   fib(10)


Why is this a hard problem?
===========================

Debug information is a hard problem for a few different reasons, mostly
centered around optimized code. First, optimization makes keeping source
locations more difficult. In LLVM IR we keep the original source location
for each IR level instruction on the instruction. Optimization passes
should keep the source locations for newly created instructions, but merged
instructions only get to keep a single location, which can cause jumping
around when stepping through optimized programs. Second, optimization
can transform variables so that they are optimized out, share memory
with other variables, or become difficult to track. For the purposes of this
tutorial we're going to avoid optimization (as you'll see with one of the
next sets of patches).

Ahead-of-Time Compilation Mode
==============================

To highlight only the aspects of adding debug information to a source
language without needing to worry about the complexities of JIT debugging,
we're going to make a few changes to Kaleidoscope to support compiling
the IR emitted by the front end into a simple standalone program that
you can execute, debug, and see results.

First we make our anonymous function that contains our top level
statement be our "main":

.. code-block:: udiff

  -    auto Proto = std::make_unique<PrototypeAST>("", std::vector<std::string>());
  +    auto Proto = std::make_unique<PrototypeAST>("main", std::vector<std::string>());

just with the simple change of giving it a name.

Then we're going to remove the command line prompt code wherever it exists:

.. code-block:: udiff

  @@ -1129,7 +1129,6 @@ static void HandleTopLevelExpression() {
   /// top ::= definition | external | expression | ';'
   static void MainLoop() {
     while (1) {
  -    fprintf(stderr, "ready> ");
       switch (CurTok) {
       case tok_eof:
         return;
  @@ -1184,7 +1183,6 @@ int main() {
     BinopPrecedence['*'] = 40; // highest.

     // Prime the first token.
  -  fprintf(stderr, "ready> ");
     getNextToken();

Lastly we're going to disable all of the optimization passes and the JIT so
that the only thing that happens after we're done parsing and generating
code is that the LLVM IR goes to standard error:

.. code-block:: udiff

  @@ -1108,17 +1108,8 @@ static void HandleExtern() {
   static void HandleTopLevelExpression() {
     // Evaluate a top-level expression into an anonymous function.
     if (auto FnAST = ParseTopLevelExpr()) {
  -    if (auto *FnIR = FnAST->codegen()) {
  -      // We're just doing this to make sure it executes.
  -      TheExecutionEngine->finalizeObject();
  -      // JIT the function, returning a function pointer.
  -      void *FPtr = TheExecutionEngine->getPointerToFunction(FnIR);
  -
  -      // Cast it to the right type (takes no arguments, returns a double) so we
  -      // can call it as a native function.
  -      double (*FP)() = (double (*)())(intptr_t)FPtr;
  -      // Ignore the return value for this.
  -      (void)FP;
  +    if (!FnAST->codegen()) {
  +      fprintf(stderr, "Error generating code for top level expr");
       }
     } else {
       // Skip token for error recovery.
  @@ -1439,11 +1459,11 @@ int main() {
     // target lays out data structures.
     TheModule->setDataLayout(TheExecutionEngine->getDataLayout());
     OurFPM.add(new DataLayoutPass());
  +#if 0
     OurFPM.add(createBasicAliasAnalysisPass());
     // Promote allocas to registers.
     OurFPM.add(createPromoteMemoryToRegisterPass());
  @@ -1218,7 +1210,7 @@ int main() {
     OurFPM.add(createGVNPass());
     // Simplify the control flow graph (deleting unreachable blocks, etc).
     OurFPM.add(createCFGSimplificationPass());
  -
  +  #endif
     OurFPM.doInitialization();

     // Set the global so the code gen can use this.

This relatively small set of changes gets us to the point that we can compile
our piece of Kaleidoscope language down to an executable program via this
command line:

.. code-block:: bash

  Kaleidoscope-Ch9 < fib.ks |& clang -x ir -

which gives an a.out/a.exe in the current working directory.

Compile Unit
============

The top level container for a section of code in DWARF is a compile unit.
This contains the type and function data for an individual translation unit
(read: one file of source code). So the first thing we need to do is
construct one for our fib.ks file.

DWARF Emission Setup
====================

Similar to the ``IRBuilder`` class we have a
`DIBuilder <https://llvm.org/doxygen/classllvm_1_1DIBuilder.html>`_ class
that helps in constructing debug metadata for an LLVM IR file. It
corresponds 1:1 to debug metadata in the same way that ``IRBuilder``
corresponds to LLVM IR, but with nicer names.
Using it does require that you be more familiar with DWARF terminology than
you needed to be with ``IRBuilder`` and ``Instruction`` names, but if you
read through the general documentation on the
`Metadata Format <https://llvm.org/docs/SourceLevelDebugging.html>`_ it
should be a little clearer. We'll be using this class to construct all
of our IR level descriptions. Its constructor takes a module, so we
need to construct it shortly after we construct our module. We've left it
as a global static variable to make it a bit easier to use.

Next we're going to create a small container to cache some of our frequent
data. The first will be our compile unit, but we'll also write a bit of
code for our one type, since Kaleidoscope only has ``double`` and we won't
have to worry about expressions of multiple types:

.. code-block:: c++

  static DIBuilder *DBuilder;

  struct DebugInfo {
    DICompileUnit *TheCU;
    DIType *DblTy;

    DIType *getDoubleTy();
  } KSDbgInfo;

  DIType *DebugInfo::getDoubleTy() {
    if (DblTy)
      return DblTy;

    DblTy = DBuilder->createBasicType("double", 64, dwarf::DW_ATE_float);
    return DblTy;
  }

And then later on in ``main`` when we're constructing our module:

.. code-block:: c++

  DBuilder = new DIBuilder(*TheModule);

  KSDbgInfo.TheCU = DBuilder->createCompileUnit(
      dwarf::DW_LANG_C, DBuilder->createFile("fib.ks", "."),
      "Kaleidoscope Compiler", false, "", 0);

There are a couple of things to note here. First, while we're producing a
compile unit for a language called Kaleidoscope we used the language
constant for C. This is because a debugger wouldn't necessarily understand
the calling conventions or default ABI for a language it doesn't recognize,
and we follow the C ABI in our LLVM code generation, so it's the closest
thing to accurate. This ensures we can actually call functions from the
debugger and have them execute. Second, you'll see the "fib.ks" in the
call to ``createCompileUnit``. This is a default hard coded value since
we're using shell redirection to put our source into the Kaleidoscope
compiler. In a usual front end you'd have an input file name and it would
go there.
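In a front end that takes its input file as an argument, the name and
directory passed to ``createFile`` could be derived from that path. The
following is a minimal sketch using only the C++17 standard library;
``SourceFileInfo`` and ``splitSourcePath`` are hypothetical names that are not
part of the tutorial code, and falling back to ``"."`` for a bare filename is
an assumption chosen to match the hard coded value above.

.. code-block:: c++

   #include <cassert>
   #include <filesystem>
   #include <string>

   // Hypothetical helper type (not part of the tutorial code): the two
   // strings that DIBuilder::createFile expects.
   struct SourceFileInfo {
     std::string Filename;
     std::string Directory;
   };

   // Split an input path into a filename and a directory.
   static SourceFileInfo splitSourcePath(const std::string &Path) {
     namespace fs = std::filesystem;
     fs::path P(Path);
     assert(P.has_filename() && "expected a path that names a file");
     std::string Dir = P.parent_path().string();
     // Assumption: a bare filename means "the current directory".
     if (Dir.empty())
       Dir = ".";
     return {P.filename().string(), Dir};
   }

With such a helper, the ``createFile`` call could take ``Info.Filename`` and
``Info.Directory`` instead of the two string literals.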

One last thing as part of emitting debug information via DIBuilder is that
we need to "finalize" the debug information. The reasons are part of the
underlying API for DIBuilder, but make sure you do this near the end of
main:

.. code-block:: c++

  DBuilder->finalize();

before you dump out the module.

Functions
=========

Now that we have our ``Compile Unit``, we can add
function definitions to the debug info. So in ``PrototypeAST::codegen()`` we
add a few lines of code to describe a context for our subprogram, in this
case the "File", and the actual definition of the function itself.

So the context:

.. code-block:: c++

  DIFile *Unit = DBuilder->createFile(KSDbgInfo.TheCU->getFilename(),
                                      KSDbgInfo.TheCU->getDirectory());

giving us a DIFile and asking the ``Compile Unit`` we created above for the
directory and filename where we are currently. Then, for now, we use some
source locations of 0 (since our AST doesn't currently have source location
information) and construct our function definition:

.. code-block:: c++

  DIScope *FContext = Unit;
  unsigned LineNo = 0;
  unsigned ScopeLine = 0;
  DISubprogram *SP = DBuilder->createFunction(
      FContext, P.getName(), StringRef(), Unit, LineNo,
      CreateFunctionType(TheFunction->arg_size()),
      ScopeLine,
      DINode::FlagPrototyped,
      DISubprogram::SPFlagDefinition);
  TheFunction->setSubprogram(SP);

and we now have a DISubprogram that contains a reference to all of our
metadata for the function.

Source Locations
================

The most important thing for debug information is accurate source location:
this makes it possible to map the compiled code back to your source code.
We have a problem, though: Kaleidoscope really doesn't have any source
location information in the lexer or parser, so we'll need to add it.

.. code-block:: c++

   struct SourceLocation {
     int Line;
     int Col;
   };
   static SourceLocation CurLoc;
   static SourceLocation LexLoc = {1, 0};

   static int advance() {
     int LastChar = getchar();

     if (LastChar == '\n' || LastChar == '\r') {
       LexLoc.Line++;
       LexLoc.Col = 0;
     } else
       LexLoc.Col++;
     return LastChar;
   }

In this set of code we've added some functionality to keep track of the
line and column of the "source file". As we lex every token we set our
current "lexical location" to the line and column for the beginning
of the token. We do this by replacing all of the previous calls to
``getchar()`` with our new ``advance()`` that keeps track of the information,
and then we have added a source location to all of our AST classes:

.. code-block:: c++

   class ExprAST {
     SourceLocation Loc;

     public:
       ExprAST(SourceLocation Loc = CurLoc) : Loc(Loc) {}
       virtual ~ExprAST() {}
       virtual Value* codegen() = 0;
       int getLine() const { return Loc.Line; }
       int getCol() const { return Loc.Col; }
       virtual raw_ostream &dump(raw_ostream &out, int ind) {
         return out << ':' << getLine() << ':' << getCol() << '\n';
       }
   };

that we pass down through when we create a new expression:

.. code-block:: c++

   LHS = std::make_unique<BinaryExprAST>(BinLoc, BinOp, std::move(LHS),
                                          std::move(RHS));

giving us locations for each of our expressions and variables.
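To see the bookkeeping in isolation, here is a self-contained sketch that runs
the same ``advance()`` logic over an in-memory string instead of ``getchar()``.
The ``Lexer`` struct, its ``Input`` buffer, and the whitespace-only ``nextWord``
tokenizer are assumptions made for illustration and do not appear in the
tutorial code.

.. code-block:: c++

   #include <cassert>
   #include <cctype>
   #include <cstddef>
   #include <cstdio>
   #include <string>

   struct SourceLocation {
     int Line;
     int Col;
   };

   // Hypothetical stand-in for the tutorial's global lexer state.
   struct Lexer {
     std::string Input;               // replaces standard input
     std::size_t Pos = 0;
     SourceLocation LexLoc = {1, 0};
     int LastChar = ' ';

     // Same bookkeeping as advance() above, reading from Input instead.
     int advance() {
       assert(Pos <= Input.size() && "read past the end of Input");
       int C = Pos < Input.size() ? static_cast<unsigned char>(Input[Pos++])
                                  : EOF;
       if (C == '\n' || C == '\r') {
         LexLoc.Line++;
         LexLoc.Col = 0;
       } else
         LexLoc.Col++;
       return C;
     }

     // Skips whitespace, records where the next word starts, and consumes
     // it. Returns false at end of input.
     bool nextWord(SourceLocation &Loc, std::string &Word) {
       while (LastChar != EOF && std::isspace(LastChar))
         LastChar = advance();
       if (LastChar == EOF)
         return false;
       Loc = LexLoc; // location of the word's first character
       Word.clear();
       while (LastChar != EOF && !std::isspace(LastChar)) {
         Word += static_cast<char>(LastChar);
         LastChar = advance();
       }
       return true;
     }
   };

Lexing ``"def fib(x)\n  x"`` with this sketch reports ``def`` at 1:1,
``fib(x)`` at 1:5, and ``x`` at 2:3. Columns come out 1-based because ``Col``
is incremented as each character is read.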

To make sure that every instruction gets proper source location information,
we have to tell ``Builder`` whenever we're at a new source location.
We use a small helper function for this:

.. code-block:: c++

  void DebugInfo::emitLocation(ExprAST *AST) {
    // A null AST clears the current location (used for the function prologue).
    if (!AST)
      return Builder.SetCurrentDebugLocation(DebugLoc());
    DIScope *Scope;
    if (LexicalBlocks.empty())
      Scope = TheCU;
    else
      Scope = LexicalBlocks.back();
    Builder.SetCurrentDebugLocation(
        DILocation::get(Scope->getContext(), AST->getLine(), AST->getCol(), Scope));
  }

This tells the main ``IRBuilder`` both where we are and what scope
we're in. The scope can either be at the compile-unit level or be the nearest
enclosing lexical block, such as the current function.
To represent this we create a stack of scopes:

.. code-block:: c++

   std::vector<DIScope *> LexicalBlocks;

and push the scope (function) to the top of the stack when we start
generating the code for each function:

.. code-block:: c++

  KSDbgInfo.LexicalBlocks.push_back(SP);

We must also remember to pop the scope back off of the scope stack at the
end of the code generation for the function:

.. code-block:: c++

  // Pop off the lexical block for the function since we added it
  // unconditionally.
  KSDbgInfo.LexicalBlocks.pop_back();
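Because the push and the pop must stay balanced on every path out of a
function's code generation, one option is to tie them to an object's lifetime.
The sketch below shows that pattern with strings standing in for ``DIScope *``
so that it needs no LLVM headers. ``ScopeStack``, ``ScopeGuard``, and
``currentScope`` are hypothetical names; the tutorial code pushes and pops by
hand instead.

.. code-block:: c++

   #include <cassert>
   #include <string>
   #include <utility>
   #include <vector>

   // Strings stand in for DIScope pointers (an assumption for this sketch).
   using Scope = std::string;

   struct ScopeStack {
     Scope CompileUnit;                // fallback when no block is open
     std::vector<Scope> LexicalBlocks;

     // Mirrors the scope selection in emitLocation().
     const Scope &currentScope() const {
       return LexicalBlocks.empty() ? CompileUnit : LexicalBlocks.back();
     }
   };

   // Pushes a scope on construction and pops it on destruction.
   class ScopeGuard {
     ScopeStack &Stack;

   public:
     ScopeGuard(ScopeStack &S, Scope NewScope) : Stack(S) {
       Stack.LexicalBlocks.push_back(std::move(NewScope));
     }
     ~ScopeGuard() {
       assert(!Stack.LexicalBlocks.empty() && "unbalanced scope stack");
       Stack.LexicalBlocks.pop_back();
     }
     ScopeGuard(const ScopeGuard &) = delete;
     ScopeGuard &operator=(const ScopeGuard &) = delete;
   };

A guard declared at the top of a block keeps that scope current until the block
ends, after which ``currentScope()`` falls back to the enclosing scope or to
the compile unit.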

Then we make sure to emit the location every time we start to generate code
for a new AST object:

.. code-block:: c++

   KSDbgInfo.emitLocation(this);

Variables
=========

Now that we have functions, we need to be able to print out the variables
we have in scope. Let's get our function arguments set up so we can get
decent backtraces and see how our functions are being called. It isn't
a lot of code, and we generally handle it when we're creating the
argument allocas in ``FunctionAST::codegen``.

.. code-block:: c++

    // Record the function arguments in the NamedValues map.
    NamedValues.clear();
    unsigned ArgIdx = 0;
    for (auto &Arg : TheFunction->args()) {
      // Create an alloca for this variable.
      AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, Arg.getName());

      // Create a debug descriptor for the variable.
      DILocalVariable *D = DBuilder->createParameterVariable(
          SP, Arg.getName(), ++ArgIdx, Unit, LineNo, KSDbgInfo.getDoubleTy(),
          true);

      DBuilder->insertDeclare(Alloca, D, DBuilder->createExpression(),
                              DILocation::get(SP->getContext(), LineNo, 0, SP),
                              Builder.GetInsertBlock());

      // Store the initial value into the alloca.
      Builder.CreateStore(&Arg, Alloca);

      // Add arguments to variable symbol table.
      NamedValues[Arg.getName()] = Alloca;
    }

Here we're first creating the variable, giving it the scope (``SP``),
the name, source location, type, and since it's an argument, the argument
index. Next, we create an ``llvm.dbg.declare`` call to indicate at the IR
level that we've got a variable in an alloca (and it gives a starting
location for the variable), and set a source location for the
beginning of the scope on the declare.

One interesting thing to note at this point is that various debuggers have
assumptions based on how code and debug information was generated for them
in the past. In this case we need to do a little bit of a hack to avoid
generating line information for the function prologue so that the debugger
knows to skip over those instructions when setting a breakpoint. So in
``FunctionAST::codegen`` we add some more lines:

.. code-block:: c++

  // Unset the location for the prologue emission (leading instructions with no
  // location in a function are considered part of the prologue and the debugger
  // will run past them when breaking on a function)
  KSDbgInfo.emitLocation(nullptr);

and then emit a new location when we actually start generating code for the
body of the function:

.. code-block:: c++

  KSDbgInfo.emitLocation(Body.get());
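The effect of leaving the prologue without a location can be pictured with a
small model. This is only an illustration of the rule stated in the comment
above, not a description of how any particular debugger is implemented;
``Instr`` and ``firstBreakableIndex`` are hypothetical names.

.. code-block:: c++

   #include <cassert>
   #include <cstddef>
   #include <optional>
   #include <vector>

   // A toy instruction: just an optional source line (nullopt = no location).
   struct Instr {
     std::optional<unsigned> Line;
   };

   // Returns the index of the first instruction that carries a location,
   // i.e. the first one after the location-less prologue. Returns Fn.size()
   // if no instruction has a location.
   static std::size_t firstBreakableIndex(const std::vector<Instr> &Fn) {
     std::size_t I = 0;
     while (I < Fn.size() && !Fn[I].Line)
       ++I;
     assert(I <= Fn.size());
     return I;
   }

In this model, two leading instructions without a location (such as the
argument alloca and store) followed by instructions for the body yield index 2,
which is where a breakpoint on the function would land.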

With this we have enough debug information to set breakpoints in functions,
print out argument variables, and call functions. Not too bad for just a
few simple lines of code!

Full Code Listing
=================

Here is the complete code listing for our running example, enhanced with
debug information. To build this example, use:

.. code-block:: bash

    # Compile
    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
    # Run
    ./toy

Here is the code:

.. literalinclude:: ../../../examples/Kaleidoscope/Chapter9/toy.cpp
   :language: c++

`Next: Conclusion and other useful LLVM tidbits <LangImpl10.html>`_