1:orphan: 2 3=========================================== 4Kaleidoscope: Implementing a Parser and AST 5=========================================== 6 7.. contents:: 8 :local: 9 10Chapter 2 Introduction 11====================== 12 13Welcome to Chapter 2 of the "`Implementing a language with 14LLVM <index.html>`_" tutorial. This chapter shows you how to use the 15lexer, built in `Chapter 1 <LangImpl01.html>`_, to build a full 16`parser <http://en.wikipedia.org/wiki/Parsing>`_ for our Kaleidoscope 17language. Once we have a parser, we'll define and build an `Abstract 18Syntax Tree <http://en.wikipedia.org/wiki/Abstract_syntax_tree>`_ (AST). 19 20The parser we will build uses a combination of `Recursive Descent 21Parsing <http://en.wikipedia.org/wiki/Recursive_descent_parser>`_ and 22`Operator-Precedence 23Parsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_ to 24parse the Kaleidoscope language (the latter for binary expressions and 25the former for everything else). Before we get to parsing though, let's 26talk about the output of the parser: the Abstract Syntax Tree. 27 28The Abstract Syntax Tree (AST) 29============================== 30 31The AST for a program captures its behavior in such a way that it is 32easy for later stages of the compiler (e.g. code generation) to 33interpret. We basically want one object for each construct in the 34language, and the AST should closely model the language. In 35Kaleidoscope, we have expressions, a prototype, and a function object. 36We'll start with expressions first: 37 38.. code-block:: c++ 39 40 /// ExprAST - Base class for all expression nodes. 41 class ExprAST { 42 public: 43 virtual ~ExprAST() {} 44 }; 45 46 /// NumberExprAST - Expression class for numeric literals like "1.0". 47 class NumberExprAST : public ExprAST { 48 double Val; 49 50 public: 51 NumberExprAST(double Val) : Val(Val) {} 52 }; 53 54The code above shows the definition of the base ExprAST class and one 55subclass which we use for numeric literals. The important thing to note 56about this code is that the NumberExprAST class captures the numeric 57value of the literal as an instance variable. This allows later phases 58of the compiler to know what the stored numeric value is. 59 60Right now we only create the AST, so there are no useful accessor 61methods on them. It would be very easy to add a virtual method to pretty 62print the code, for example. Here are the other expression AST node 63definitions that we'll use in the basic form of the Kaleidoscope 64language: 65 66.. code-block:: c++ 67 68 /// VariableExprAST - Expression class for referencing a variable, like "a". 69 class VariableExprAST : public ExprAST { 70 std::string Name; 71 72 public: 73 VariableExprAST(const std::string &Name) : Name(Name) {} 74 }; 75 76 /// BinaryExprAST - Expression class for a binary operator. 77 class BinaryExprAST : public ExprAST { 78 char Op; 79 std::unique_ptr<ExprAST> LHS, RHS; 80 81 public: 82 BinaryExprAST(char op, std::unique_ptr<ExprAST> LHS, 83 std::unique_ptr<ExprAST> RHS) 84 : Op(op), LHS(std::move(LHS)), RHS(std::move(RHS)) {} 85 }; 86 87 /// CallExprAST - Expression class for function calls. 88 class CallExprAST : public ExprAST { 89 std::string Callee; 90 std::vector<std::unique_ptr<ExprAST>> Args; 91 92 public: 93 CallExprAST(const std::string &Callee, 94 std::vector<std::unique_ptr<ExprAST>> Args) 95 : Callee(Callee), Args(std::move(Args)) {} 96 }; 97 98This is all (intentionally) rather straight-forward: variables capture 99the variable name, binary operators capture their opcode (e.g. '+'), and 100calls capture a function name as well as a list of any argument 101expressions. One thing that is nice about our AST is that it captures 102the language features without talking about the syntax of the language. 103Note that there is no discussion about precedence of binary operators, 104lexical structure, etc. 105 106For our basic language, these are all of the expression nodes we'll 107define. Because it doesn't have conditional control flow, it isn't 108Turing-complete; we'll fix that in a later installment. The two things 109we need next are a way to talk about the interface to a function, and a 110way to talk about functions themselves: 111 112.. code-block:: c++ 113 114 /// PrototypeAST - This class represents the "prototype" for a function, 115 /// which captures its name, and its argument names (thus implicitly the number 116 /// of arguments the function takes). 117 class PrototypeAST { 118 std::string Name; 119 std::vector<std::string> Args; 120 121 public: 122 PrototypeAST(const std::string &name, std::vector<std::string> Args) 123 : Name(name), Args(std::move(Args)) {} 124 125 const std::string &getName() const { return Name; } 126 }; 127 128 /// FunctionAST - This class represents a function definition itself. 129 class FunctionAST { 130 std::unique_ptr<PrototypeAST> Proto; 131 std::unique_ptr<ExprAST> Body; 132 133 public: 134 FunctionAST(std::unique_ptr<PrototypeAST> Proto, 135 std::unique_ptr<ExprAST> Body) 136 : Proto(std::move(Proto)), Body(std::move(Body)) {} 137 }; 138 139In Kaleidoscope, functions are typed with just a count of their 140arguments. Since all values are double precision floating point, the 141type of each argument doesn't need to be stored anywhere. In a more 142aggressive and realistic language, the "ExprAST" class would probably 143have a type field. 144 145With this scaffolding, we can now talk about parsing expressions and 146function bodies in Kaleidoscope. 147 148Parser Basics 149============= 150 151Now that we have an AST to build, we need to define the parser code to 152build it. The idea here is that we want to parse something like "x+y" 153(which is returned as three tokens by the lexer) into an AST that could 154be generated with calls like this: 155 156.. code-block:: c++ 157 158 auto LHS = std::make_unique<VariableExprAST>("x"); 159 auto RHS = std::make_unique<VariableExprAST>("y"); 160 auto Result = std::make_unique<BinaryExprAST>('+', std::move(LHS), 161 std::move(RHS)); 162 163In order to do this, we'll start by defining some basic helper routines: 164 165.. code-block:: c++ 166 167 /// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current 168 /// token the parser is looking at. getNextToken reads another token from the 169 /// lexer and updates CurTok with its results. 170 static int CurTok; 171 static int getNextToken() { 172 return CurTok = gettok(); 173 } 174 175This implements a simple token buffer around the lexer. This allows us 176to look one token ahead at what the lexer is returning. Every function 177in our parser will assume that CurTok is the current token that needs to 178be parsed. 179 180.. code-block:: c++ 181 182 183 /// LogError* - These are little helper functions for error handling. 184 std::unique_ptr<ExprAST> LogError(const char *Str) { 185 fprintf(stderr, "LogError: %s\n", Str); 186 return nullptr; 187 } 188 std::unique_ptr<PrototypeAST> LogErrorP(const char *Str) { 189 LogError(Str); 190 return nullptr; 191 } 192 193The ``LogError`` routines are simple helper routines that our parser will 194use to handle errors. The error recovery in our parser will not be the 195best and is not particular user-friendly, but it will be enough for our 196tutorial. These routines make it easier to handle errors in routines 197that have various return types: they always return null. 198 199With these basic helper functions, we can implement the first piece of 200our grammar: numeric literals. 201 202Basic Expression Parsing 203======================== 204 205We start with numeric literals, because they are the simplest to 206process. For each production in our grammar, we'll define a function 207which parses that production. For numeric literals, we have: 208 209.. code-block:: c++ 210 211 /// numberexpr ::= number 212 static std::unique_ptr<ExprAST> ParseNumberExpr() { 213 auto Result = std::make_unique<NumberExprAST>(NumVal); 214 getNextToken(); // consume the number 215 return std::move(Result); 216 } 217 218This routine is very simple: it expects to be called when the current 219token is a ``tok_number`` token. It takes the current number value, 220creates a ``NumberExprAST`` node, advances the lexer to the next token, 221and finally returns. 222 223There are some interesting aspects to this. The most important one is 224that this routine eats all of the tokens that correspond to the 225production and returns the lexer buffer with the next token (which is 226not part of the grammar production) ready to go. This is a fairly 227standard way to go for recursive descent parsers. For a better example, 228the parenthesis operator is defined like this: 229 230.. code-block:: c++ 231 232 /// parenexpr ::= '(' expression ')' 233 static std::unique_ptr<ExprAST> ParseParenExpr() { 234 getNextToken(); // eat (. 235 auto V = ParseExpression(); 236 if (!V) 237 return nullptr; 238 239 if (CurTok != ')') 240 return LogError("expected ')'"); 241 getNextToken(); // eat ). 242 return V; 243 } 244 245This function illustrates a number of interesting things about the 246parser: 247 2481) It shows how we use the LogError routines. When called, this function 249expects that the current token is a '(' token, but after parsing the 250subexpression, it is possible that there is no ')' waiting. For example, 251if the user types in "(4 x" instead of "(4)", the parser should emit an 252error. Because errors can occur, the parser needs a way to indicate that 253they happened: in our parser, we return null on an error. 254 2552) Another interesting aspect of this function is that it uses recursion 256by calling ``ParseExpression`` (we will soon see that 257``ParseExpression`` can call ``ParseParenExpr``). This is powerful 258because it allows us to handle recursive grammars, and keeps each 259production very simple. Note that parentheses do not cause construction 260of AST nodes themselves. While we could do it this way, the most 261important role of parentheses are to guide the parser and provide 262grouping. Once the parser constructs the AST, parentheses are not 263needed. 264 265The next simple production is for handling variable references and 266function calls: 267 268.. code-block:: c++ 269 270 /// identifierexpr 271 /// ::= identifier 272 /// ::= identifier '(' expression* ')' 273 static std::unique_ptr<ExprAST> ParseIdentifierExpr() { 274 std::string IdName = IdentifierStr; 275 276 getNextToken(); // eat identifier. 277 278 if (CurTok != '(') // Simple variable ref. 279 return std::make_unique<VariableExprAST>(IdName); 280 281 // Call. 282 getNextToken(); // eat ( 283 std::vector<std::unique_ptr<ExprAST>> Args; 284 if (CurTok != ')') { 285 while (1) { 286 if (auto Arg = ParseExpression()) 287 Args.push_back(std::move(Arg)); 288 else 289 return nullptr; 290 291 if (CurTok == ')') 292 break; 293 294 if (CurTok != ',') 295 return LogError("Expected ')' or ',' in argument list"); 296 getNextToken(); 297 } 298 } 299 300 // Eat the ')'. 301 getNextToken(); 302 303 return std::make_unique<CallExprAST>(IdName, std::move(Args)); 304 } 305 306This routine follows the same style as the other routines. (It expects 307to be called if the current token is a ``tok_identifier`` token). It 308also has recursion and error handling. One interesting aspect of this is 309that it uses *look-ahead* to determine if the current identifier is a 310stand alone variable reference or if it is a function call expression. 311It handles this by checking to see if the token after the identifier is 312a '(' token, constructing either a ``VariableExprAST`` or 313``CallExprAST`` node as appropriate. 314 315Now that we have all of our simple expression-parsing logic in place, we 316can define a helper function to wrap it together into one entry point. 317We call this class of expressions "primary" expressions, for reasons 318that will become more clear `later in the 319tutorial <LangImpl6.html#user-defined-unary-operators>`_. In order to parse an arbitrary 320primary expression, we need to determine what sort of expression it is: 321 322.. code-block:: c++ 323 324 /// primary 325 /// ::= identifierexpr 326 /// ::= numberexpr 327 /// ::= parenexpr 328 static std::unique_ptr<ExprAST> ParsePrimary() { 329 switch (CurTok) { 330 default: 331 return LogError("unknown token when expecting an expression"); 332 case tok_identifier: 333 return ParseIdentifierExpr(); 334 case tok_number: 335 return ParseNumberExpr(); 336 case '(': 337 return ParseParenExpr(); 338 } 339 } 340 341Now that you see the definition of this function, it is more obvious why 342we can assume the state of CurTok in the various functions. This uses 343look-ahead to determine which sort of expression is being inspected, and 344then parses it with a function call. 345 346Now that basic expressions are handled, we need to handle binary 347expressions. They are a bit more complex. 348 349Binary Expression Parsing 350========================= 351 352Binary expressions are significantly harder to parse because they are 353often ambiguous. For example, when given the string "x+y\*z", the parser 354can choose to parse it as either "(x+y)\*z" or "x+(y\*z)". With common 355definitions from mathematics, we expect the later parse, because "\*" 356(multiplication) has higher *precedence* than "+" (addition). 357 358There are many ways to handle this, but an elegant and efficient way is 359to use `Operator-Precedence 360Parsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_. 361This parsing technique uses the precedence of binary operators to guide 362recursion. To start with, we need a table of precedences: 363 364.. code-block:: c++ 365 366 /// BinopPrecedence - This holds the precedence for each binary operator that is 367 /// defined. 368 static std::map<char, int> BinopPrecedence; 369 370 /// GetTokPrecedence - Get the precedence of the pending binary operator token. 371 static int GetTokPrecedence() { 372 if (!isascii(CurTok)) 373 return -1; 374 375 // Make sure it's a declared binop. 376 int TokPrec = BinopPrecedence[CurTok]; 377 if (TokPrec <= 0) return -1; 378 return TokPrec; 379 } 380 381 int main() { 382 // Install standard binary operators. 383 // 1 is lowest precedence. 384 BinopPrecedence['<'] = 10; 385 BinopPrecedence['+'] = 20; 386 BinopPrecedence['-'] = 20; 387 BinopPrecedence['*'] = 40; // highest. 388 ... 389 } 390 391For the basic form of Kaleidoscope, we will only support 4 binary 392operators (this can obviously be extended by you, our brave and intrepid 393reader). The ``GetTokPrecedence`` function returns the precedence for 394the current token, or -1 if the token is not a binary operator. Having a 395map makes it easy to add new operators and makes it clear that the 396algorithm doesn't depend on the specific operators involved, but it 397would be easy enough to eliminate the map and do the comparisons in the 398``GetTokPrecedence`` function. (Or just use a fixed-size array). 399 400With the helper above defined, we can now start parsing binary 401expressions. The basic idea of operator precedence parsing is to break 402down an expression with potentially ambiguous binary operators into 403pieces. Consider, for example, the expression "a+b+(c+d)\*e\*f+g". 404Operator precedence parsing considers this as a stream of primary 405expressions separated by binary operators. As such, it will first parse 406the leading primary expression "a", then it will see the pairs [+, b] 407[+, (c+d)] [\*, e] [\*, f] and [+, g]. Note that because parentheses are 408primary expressions, the binary expression parser doesn't need to worry 409about nested subexpressions like (c+d) at all. 410 411To start, an expression is a primary expression potentially followed by 412a sequence of [binop,primaryexpr] pairs: 413 414.. code-block:: c++ 415 416 /// expression 417 /// ::= primary binoprhs 418 /// 419 static std::unique_ptr<ExprAST> ParseExpression() { 420 auto LHS = ParsePrimary(); 421 if (!LHS) 422 return nullptr; 423 424 return ParseBinOpRHS(0, std::move(LHS)); 425 } 426 427``ParseBinOpRHS`` is the function that parses the sequence of pairs for 428us. It takes a precedence and a pointer to an expression for the part 429that has been parsed so far. Note that "x" is a perfectly valid 430expression: As such, "binoprhs" is allowed to be empty, in which case it 431returns the expression that is passed into it. In our example above, the 432code passes the expression for "a" into ``ParseBinOpRHS`` and the 433current token is "+". 434 435The precedence value passed into ``ParseBinOpRHS`` indicates the 436*minimal operator precedence* that the function is allowed to eat. For 437example, if the current pair stream is [+, x] and ``ParseBinOpRHS`` is 438passed in a precedence of 40, it will not consume any tokens (because 439the precedence of '+' is only 20). With this in mind, ``ParseBinOpRHS`` 440starts with: 441 442.. code-block:: c++ 443 444 /// binoprhs 445 /// ::= ('+' primary)* 446 static std::unique_ptr<ExprAST> ParseBinOpRHS(int ExprPrec, 447 std::unique_ptr<ExprAST> LHS) { 448 // If this is a binop, find its precedence. 449 while (1) { 450 int TokPrec = GetTokPrecedence(); 451 452 // If this is a binop that binds at least as tightly as the current binop, 453 // consume it, otherwise we are done. 454 if (TokPrec < ExprPrec) 455 return LHS; 456 457This code gets the precedence of the current token and checks to see if 458if is too low. Because we defined invalid tokens to have a precedence of 459-1, this check implicitly knows that the pair-stream ends when the token 460stream runs out of binary operators. If this check succeeds, we know 461that the token is a binary operator and that it will be included in this 462expression: 463 464.. code-block:: c++ 465 466 // Okay, we know this is a binop. 467 int BinOp = CurTok; 468 getNextToken(); // eat binop 469 470 // Parse the primary expression after the binary operator. 471 auto RHS = ParsePrimary(); 472 if (!RHS) 473 return nullptr; 474 475As such, this code eats (and remembers) the binary operator and then 476parses the primary expression that follows. This builds up the whole 477pair, the first of which is [+, b] for the running example. 478 479Now that we parsed the left-hand side of an expression and one pair of 480the RHS sequence, we have to decide which way the expression associates. 481In particular, we could have "(a+b) binop unparsed" or "a + (b binop 482unparsed)". To determine this, we look ahead at "binop" to determine its 483precedence and compare it to BinOp's precedence (which is '+' in this 484case): 485 486.. code-block:: c++ 487 488 // If BinOp binds less tightly with RHS than the operator after RHS, let 489 // the pending operator take RHS as its LHS. 490 int NextPrec = GetTokPrecedence(); 491 if (TokPrec < NextPrec) { 492 493If the precedence of the binop to the right of "RHS" is lower or equal 494to the precedence of our current operator, then we know that the 495parentheses associate as "(a+b) binop ...". In our example, the current 496operator is "+" and the next operator is "+", we know that they have the 497same precedence. In this case we'll create the AST node for "a+b", and 498then continue parsing: 499 500.. code-block:: c++ 501 502 ... if body omitted ... 503 } 504 505 // Merge LHS/RHS. 506 LHS = std::make_unique<BinaryExprAST>(BinOp, std::move(LHS), 507 std::move(RHS)); 508 } // loop around to the top of the while loop. 509 } 510 511In our example above, this will turn "a+b+" into "(a+b)" and execute the 512next iteration of the loop, with "+" as the current token. The code 513above will eat, remember, and parse "(c+d)" as the primary expression, 514which makes the current pair equal to [+, (c+d)]. It will then evaluate 515the 'if' conditional above with "\*" as the binop to the right of the 516primary. In this case, the precedence of "\*" is higher than the 517precedence of "+" so the if condition will be entered. 518 519The critical question left here is "how can the if condition parse the 520right hand side in full"? In particular, to build the AST correctly for 521our example, it needs to get all of "(c+d)\*e\*f" as the RHS expression 522variable. The code to do this is surprisingly simple (code from the 523above two blocks duplicated for context): 524 525.. code-block:: c++ 526 527 // If BinOp binds less tightly with RHS than the operator after RHS, let 528 // the pending operator take RHS as its LHS. 529 int NextPrec = GetTokPrecedence(); 530 if (TokPrec < NextPrec) { 531 RHS = ParseBinOpRHS(TokPrec+1, std::move(RHS)); 532 if (!RHS) 533 return nullptr; 534 } 535 // Merge LHS/RHS. 536 LHS = std::make_unique<BinaryExprAST>(BinOp, std::move(LHS), 537 std::move(RHS)); 538 } // loop around to the top of the while loop. 539 } 540 541At this point, we know that the binary operator to the RHS of our 542primary has higher precedence than the binop we are currently parsing. 543As such, we know that any sequence of pairs whose operators are all 544higher precedence than "+" should be parsed together and returned as 545"RHS". To do this, we recursively invoke the ``ParseBinOpRHS`` function 546specifying "TokPrec+1" as the minimum precedence required for it to 547continue. In our example above, this will cause it to return the AST 548node for "(c+d)\*e\*f" as RHS, which is then set as the RHS of the '+' 549expression. 550 551Finally, on the next iteration of the while loop, the "+g" piece is 552parsed and added to the AST. With this little bit of code (14 553non-trivial lines), we correctly handle fully general binary expression 554parsing in a very elegant way. This was a whirlwind tour of this code, 555and it is somewhat subtle. I recommend running through it with a few 556tough examples to see how it works. 557 558This wraps up handling of expressions. At this point, we can point the 559parser at an arbitrary token stream and build an expression from it, 560stopping at the first token that is not part of the expression. Next up 561we need to handle function definitions, etc. 562 563Parsing the Rest 564================ 565 566The next thing missing is handling of function prototypes. In 567Kaleidoscope, these are used both for 'extern' function declarations as 568well as function body definitions. The code to do this is 569straight-forward and not very interesting (once you've survived 570expressions): 571 572.. code-block:: c++ 573 574 /// prototype 575 /// ::= id '(' id* ')' 576 static std::unique_ptr<PrototypeAST> ParsePrototype() { 577 if (CurTok != tok_identifier) 578 return LogErrorP("Expected function name in prototype"); 579 580 std::string FnName = IdentifierStr; 581 getNextToken(); 582 583 if (CurTok != '(') 584 return LogErrorP("Expected '(' in prototype"); 585 586 // Read the list of argument names. 587 std::vector<std::string> ArgNames; 588 while (getNextToken() == tok_identifier) 589 ArgNames.push_back(IdentifierStr); 590 if (CurTok != ')') 591 return LogErrorP("Expected ')' in prototype"); 592 593 // success. 594 getNextToken(); // eat ')'. 595 596 return std::make_unique<PrototypeAST>(FnName, std::move(ArgNames)); 597 } 598 599Given this, a function definition is very simple, just a prototype plus 600an expression to implement the body: 601 602.. code-block:: c++ 603 604 /// definition ::= 'def' prototype expression 605 static std::unique_ptr<FunctionAST> ParseDefinition() { 606 getNextToken(); // eat def. 607 auto Proto = ParsePrototype(); 608 if (!Proto) return nullptr; 609 610 if (auto E = ParseExpression()) 611 return std::make_unique<FunctionAST>(std::move(Proto), std::move(E)); 612 return nullptr; 613 } 614 615In addition, we support 'extern' to declare functions like 'sin' and 616'cos' as well as to support forward declaration of user functions. These 617'extern's are just prototypes with no body: 618 619.. code-block:: c++ 620 621 /// external ::= 'extern' prototype 622 static std::unique_ptr<PrototypeAST> ParseExtern() { 623 getNextToken(); // eat extern. 624 return ParsePrototype(); 625 } 626 627Finally, we'll also let the user type in arbitrary top-level expressions 628and evaluate them on the fly. We will handle this by defining anonymous 629nullary (zero argument) functions for them: 630 631.. code-block:: c++ 632 633 /// toplevelexpr ::= expression 634 static std::unique_ptr<FunctionAST> ParseTopLevelExpr() { 635 if (auto E = ParseExpression()) { 636 // Make an anonymous proto. 637 auto Proto = std::make_unique<PrototypeAST>("", std::vector<std::string>()); 638 return std::make_unique<FunctionAST>(std::move(Proto), std::move(E)); 639 } 640 return nullptr; 641 } 642 643Now that we have all the pieces, let's build a little driver that will 644let us actually *execute* this code we've built! 645 646The Driver 647========== 648 649The driver for this simply invokes all of the parsing pieces with a 650top-level dispatch loop. There isn't much interesting here, so I'll just 651include the top-level loop. See `below <#full-code-listing>`_ for full code in the 652"Top-Level Parsing" section. 653 654.. code-block:: c++ 655 656 /// top ::= definition | external | expression | ';' 657 static void MainLoop() { 658 while (1) { 659 fprintf(stderr, "ready> "); 660 switch (CurTok) { 661 case tok_eof: 662 return; 663 case ';': // ignore top-level semicolons. 664 getNextToken(); 665 break; 666 case tok_def: 667 HandleDefinition(); 668 break; 669 case tok_extern: 670 HandleExtern(); 671 break; 672 default: 673 HandleTopLevelExpression(); 674 break; 675 } 676 } 677 } 678 679The most interesting part of this is that we ignore top-level 680semicolons. Why is this, you ask? The basic reason is that if you type 681"4 + 5" at the command line, the parser doesn't know whether that is the 682end of what you will type or not. For example, on the next line you 683could type "def foo..." in which case 4+5 is the end of a top-level 684expression. Alternatively you could type "\* 6", which would continue 685the expression. Having top-level semicolons allows you to type "4+5;", 686and the parser will know you are done. 687 688Conclusions 689=========== 690 691With just under 400 lines of commented code (240 lines of non-comment, 692non-blank code), we fully defined our minimal language, including a 693lexer, parser, and AST builder. With this done, the executable will 694validate Kaleidoscope code and tell us if it is grammatically invalid. 695For example, here is a sample interaction: 696 697.. code-block:: bash 698 699 $ ./a.out 700 ready> def foo(x y) x+foo(y, 4.0); 701 Parsed a function definition. 702 ready> def foo(x y) x+y y; 703 Parsed a function definition. 704 Parsed a top-level expr 705 ready> def foo(x y) x+y ); 706 Parsed a function definition. 707 Error: unknown token when expecting an expression 708 ready> extern sin(a); 709 ready> Parsed an extern 710 ready> ^D 711 $ 712 713There is a lot of room for extension here. You can define new AST nodes, 714extend the language in many ways, etc. In the `next 715installment <LangImpl03.html>`_, we will describe how to generate LLVM 716Intermediate Representation (IR) from the AST. 717 718Full Code Listing 719================= 720 721Here is the complete code listing for our running example. Because this 722uses the LLVM libraries, we need to link them in. To do this, we use the 723`llvm-config <http://llvm.org/cmds/llvm-config.html>`_ tool to inform 724our makefile/command line about which options to use: 725 726.. code-block:: bash 727 728 # Compile 729 clang++ -g -O3 toy.cpp `llvm-config --cxxflags` 730 # Run 731 ./a.out 732 733Here is the code: 734 735.. literalinclude:: ../../../examples/Kaleidoscope/Chapter2/toy.cpp 736 :language: c++ 737 738`Next: Implementing Code Generation to LLVM IR <LangImpl03.html>`_ 739 740