==============================================
Kaleidoscope: Adding JIT and Optimizer Support
==============================================

.. contents::
   :local:

Chapter 4 Introduction
======================

Welcome to Chapter 4 of the "`Implementing a language with
LLVM <index.html>`_" tutorial. Chapters 1-3 described the implementation
of a simple language and added support for generating LLVM IR. This
chapter describes two new techniques: adding optimizer support to your
language, and adding JIT compiler support. These additions will
demonstrate how to get nice, efficient code for the Kaleidoscope
language.

Trivial Constant Folding
========================

Our demonstration for Chapter 3 is elegant and easy to extend.
Unfortunately, it does not produce wonderful code. The IRBuilder,
however, does give us obvious optimizations when compiling simple code:

::

    ready> def test(x) 1+2+x;
    Read function definition:
    define double @test(double %x) {
    entry:
            %addtmp = fadd double 3.000000e+00, %x
            ret double %addtmp
    }

This code is not a literal transcription of the AST built by parsing the
input.
That would be:

::

    ready> def test(x) 1+2+x;
    Read function definition:
    define double @test(double %x) {
    entry:
            %addtmp = fadd double 2.000000e+00, 1.000000e+00
            %addtmp1 = fadd double %addtmp, %x
            ret double %addtmp1
    }

Constant folding, as seen above, is a very common and very important
optimization: so much so that many language implementors implement
constant folding support in their AST representation.

With LLVM, you don't need this support in the AST. Since all calls to
build LLVM IR go through the LLVM IR builder, the builder itself checks
to see if there is a constant folding opportunity when you call it. If
so, it just does the constant fold and returns the constant instead of
creating an instruction.

Well, that was easy :). In practice, we recommend always using
``IRBuilder`` when generating code like this. It has no "syntactic
overhead" for its use (you don't have to uglify your compiler with
constant checks everywhere) and it can dramatically reduce the amount of
LLVM IR that is generated in some cases (particularly for languages with
a macro preprocessor or that use a lot of constants).
On the other hand, the ``IRBuilder`` is limited by the fact that it does
all of its analysis inline with the code as it is built. If you take a
slightly more complex example:

::

    ready> def test(x) (1+2+x)*(x+(1+2));
    ready> Read function definition:
    define double @test(double %x) {
    entry:
            %addtmp = fadd double 3.000000e+00, %x
            %addtmp1 = fadd double %x, 3.000000e+00
            %multmp = fmul double %addtmp, %addtmp1
            ret double %multmp
    }

In this case, the LHS and RHS of the multiplication are the same value.
We'd really like to see this generate "``tmp = x+3; result = tmp*tmp;``"
instead of computing "``x+3``" twice.

Unfortunately, no amount of local analysis will be able to detect and
correct this. This requires two transformations: reassociation of
expressions (to make the adds lexically identical) and Common
Subexpression Elimination (CSE) to delete the redundant add instruction.
Fortunately, LLVM provides a broad range of optimizations that you can
use, in the form of "passes".

LLVM Optimization Passes
========================

.. warning::

   Due to the transition to the new PassManager infrastructure this tutorial
   is based on ``llvm::legacy::FunctionPassManager`` which can be found in
   `LegacyPassManager.h <https://llvm.org/doxygen/classllvm_1_1legacy_1_1FunctionPassManager.html>`_.
   For the purpose of this tutorial the above should be used until
   the pass manager transition is complete.

LLVM provides many optimization passes, which do many different sorts of
things and have different tradeoffs. Unlike other systems, LLVM doesn't
hold to the mistaken notion that one set of optimizations is right for
all languages and for all situations. LLVM allows a compiler implementor
to make complete decisions about what optimizations to use, in which
order, and in what situation.

As a concrete example, LLVM supports "whole module" passes, which
look across as large a body of code as they can (often a whole file,
but if run at link time, this can be a substantial portion of the whole
program). It also supports and includes "per-function" passes which just
operate on a single function at a time, without looking at other
functions. For more information on passes and how they are run, see the
`How to Write a Pass <../../WritingAnLLVMPass.html>`_ document and the
`List of LLVM Passes <../../Passes.html>`_.
For Kaleidoscope, we are currently generating functions on the fly, one
at a time, as the user types them in. We aren't shooting for the
ultimate optimization experience in this setting, but we also want to
catch the easy and quick stuff where possible. As such, we will choose
to run a few per-function optimizations as the user types the function
in. If we wanted to make a "static Kaleidoscope compiler", we would use
exactly the code we have now, except that we would defer running the
optimizer until the entire file has been parsed.

In order to get per-function optimizations going, we need to set up a
`FunctionPassManager <../../WritingAnLLVMPass.html#what-passmanager-does>`_ to hold
and organize the LLVM optimizations that we want to run. Once we have
that, we can add a set of optimizations to run. We'll need a new
FunctionPassManager for each module that we want to optimize, so we'll
write a function to create and initialize both the module and pass manager
for us:

.. code-block:: c++

  void InitializeModuleAndPassManager(void) {
    // Open a new module.
    TheModule = std::make_unique<Module>("my cool jit", TheContext);

    // Create a new pass manager attached to it.
    TheFPM = std::make_unique<legacy::FunctionPassManager>(TheModule.get());

    // Do simple "peephole" optimizations and bit-twiddling optzns.
    TheFPM->add(createInstructionCombiningPass());
    // Reassociate expressions.
    TheFPM->add(createReassociatePass());
    // Eliminate Common SubExpressions.
    TheFPM->add(createGVNPass());
    // Simplify the control flow graph (deleting unreachable blocks, etc).
    TheFPM->add(createCFGSimplificationPass());

    TheFPM->doInitialization();
  }

This code initializes the global module ``TheModule``, and the function pass
manager ``TheFPM``, which is attached to ``TheModule``. Once the pass manager is
set up, we use a series of "add" calls to add a bunch of LLVM passes.

In this case, we choose to add four optimization passes.
The passes we choose here are a pretty standard set
of "cleanup" optimizations that are useful for a wide variety of code. I won't
delve into what they do but, believe me, they are a good starting place :).

Once the PassManager is set up, we need to make use of it. We do this by
running it after our newly created function is constructed (in
``FunctionAST::codegen()``), but before it is returned to the client:

.. code-block:: c++

  if (Value *RetVal = Body->codegen()) {
    // Finish off the function.
    Builder.CreateRet(RetVal);

    // Validate the generated code, checking for consistency.
    verifyFunction(*TheFunction);

    // Optimize the function.
    TheFPM->run(*TheFunction);

    return TheFunction;
  }

As you can see, this is pretty straightforward. The
``FunctionPassManager`` optimizes and updates the LLVM Function\* in
place, improving (hopefully) its body. With this in place, we can try
our test above again:

::

    ready> def test(x) (1+2+x)*(x+(1+2));
    ready> Read function definition:
    define double @test(double %x) {
    entry:
            %addtmp = fadd double %x, 3.000000e+00
            %multmp = fmul double %addtmp, %addtmp
            ret double %multmp
    }

As expected, we now get our nicely optimized code, saving a floating
point add instruction from every execution of this function.

LLVM provides a wide variety of optimizations that can be used in
certain circumstances. Some `documentation about the various
passes <../../Passes.html>`_ is available, but it isn't very complete.
Another good source of ideas can come from looking at the passes that
``Clang`` runs to get started. The "``opt``" tool allows you to
experiment with passes from the command line, so you can see if they do
anything.

Now that we have reasonable code coming out of our front-end, let's talk
about executing it!

Adding a JIT Compiler
=====================

Code that is available in LLVM IR can have a wide variety of tools
applied to it. For example, you can run optimizations on it (as we did
above), you can dump it out in textual or binary forms, you can compile
the code to an assembly file (.s) for some target, or you can JIT
compile it. The nice thing about the LLVM IR representation is that it
is the "common currency" between many different parts of the compiler.

In this section, we'll add JIT compiler support to our interpreter. The
basic idea that we want for Kaleidoscope is to have the user enter
function bodies as they do now, but immediately evaluate the top-level
expressions they type in. For example, if they type in "1 + 2;", we
should evaluate and print out 3. If they define a function, they should
be able to call it from the command line.

In order to do this, we first prepare the environment to create code for
the current native target and declare and initialize the JIT.
This is done by calling some ``InitializeNativeTarget*`` functions,
adding a global variable ``TheJIT``, and initializing it in
``main``:

.. code-block:: c++

  static std::unique_ptr<KaleidoscopeJIT> TheJIT;
  ...
  int main() {
    InitializeNativeTarget();
    InitializeNativeTargetAsmPrinter();
    InitializeNativeTargetAsmParser();

    // Install standard binary operators.
    // 1 is lowest precedence.
    BinopPrecedence['<'] = 10;
    BinopPrecedence['+'] = 20;
    BinopPrecedence['-'] = 20;
    BinopPrecedence['*'] = 40; // highest.

    // Prime the first token.
    fprintf(stderr, "ready> ");
    getNextToken();

    TheJIT = std::make_unique<KaleidoscopeJIT>();

    // Run the main "interpreter loop" now.
    MainLoop();

    return 0;
  }

We also need to set up the data layout for the JIT:

.. code-block:: c++

  void InitializeModuleAndPassManager(void) {
    // Open a new module.
    TheModule = std::make_unique<Module>("my cool jit", TheContext);
    TheModule->setDataLayout(TheJIT->getTargetMachine().createDataLayout());

    // Create a new pass manager attached to it.
    TheFPM = std::make_unique<legacy::FunctionPassManager>(TheModule.get());
    ...

The KaleidoscopeJIT class is a simple JIT built specifically for these
tutorials, available inside the LLVM source code
at llvm-src/examples/Kaleidoscope/include/KaleidoscopeJIT.h.
In later chapters we will look at how it works and extend it with
new features, but for now we will take it as given. Its API is very simple:
``addModule`` adds an LLVM IR module to the JIT, making its functions
available for execution; ``removeModule`` removes a module, freeing any
memory associated with the code in that module; and ``findSymbol`` allows us
to look up pointers to the compiled code.

We can take this simple API and change our code that parses top-level expressions to
look like this:

.. code-block:: c++

  static void HandleTopLevelExpression() {
    // Evaluate a top-level expression into an anonymous function.
    if (auto FnAST = ParseTopLevelExpr()) {
      if (FnAST->codegen()) {

        // JIT the module containing the anonymous expression, keeping a handle so
        // we can free it later.
        auto H = TheJIT->addModule(std::move(TheModule));
        InitializeModuleAndPassManager();

        // Search the JIT for the __anon_expr symbol.
        auto ExprSymbol = TheJIT->findSymbol("__anon_expr");
        assert(ExprSymbol && "Function not found");

        // Get the symbol's address and cast it to the right type (takes no
        // arguments, returns a double) so we can call it as a native function.
        double (*FP)() = (double (*)())(intptr_t)ExprSymbol.getAddress();
        fprintf(stderr, "Evaluated to %f\n", FP());

        // Delete the anonymous expression module from the JIT.
        TheJIT->removeModule(H);
      }

If parsing and codegen succeed, the next step is to add the module containing
the top-level expression to the JIT. We do this by calling addModule, which
triggers code generation for all the functions in the module, and returns a
handle that can be used to remove the module from the JIT later. Once the module
has been added to the JIT it can no longer be modified, so we also open a new
module to hold subsequent code by calling ``InitializeModuleAndPassManager()``.

Once we've added the module to the JIT we need to get a pointer to the final
generated code. We do this by calling the JIT's findSymbol method, and passing
the name of the top-level expression function: ``__anon_expr``.
Since we just
added this function, we assert that findSymbol returned a result.

Next, we get the in-memory address of the ``__anon_expr`` function by calling
``getAddress()`` on the symbol. Recall that we compile top-level expressions
into a self-contained LLVM function that takes no arguments and returns the
computed double. Because the LLVM JIT compiler matches the native platform ABI,
this means that you can just cast the result pointer to a function pointer of
that type and call it directly. This means there is no difference between JIT
compiled code and native machine code that is statically linked into your
application.

Finally, since we don't support re-evaluation of top-level expressions, we
remove the module from the JIT when we're done to free the associated memory.
Recall, however, that the module we created a few lines earlier (via
``InitializeModuleAndPassManager``) is still open and waiting for new code to be
added.

With just these two changes, let's see how Kaleidoscope works now!

::

    ready> 4+5;
    Read top-level expression:
    define double @0() {
    entry:
            ret double 9.000000e+00
    }

    Evaluated to 9.000000

Well this looks like it is basically working.
The dump of the function
shows the "no argument function that always returns double" that we
synthesize for each top-level expression that is typed in. This
demonstrates very basic functionality, but can we do more?

::

    ready> def testfunc(x y) x + y*2;
    Read function definition:
    define double @testfunc(double %x, double %y) {
    entry:
            %multmp = fmul double %y, 2.000000e+00
            %addtmp = fadd double %multmp, %x
            ret double %addtmp
    }

    ready> testfunc(4, 10);
    Read top-level expression:
    define double @1() {
    entry:
            %calltmp = call double @testfunc(double 4.000000e+00, double 1.000000e+01)
            ret double %calltmp
    }

    Evaluated to 24.000000

    ready> testfunc(5, 10);
    ready> LLVM ERROR: Program used external function 'testfunc' which could not be resolved!


Function definitions and calls also work, but something went very wrong on that
last line. The call looks valid, so what happened? As you may have guessed from
the API, a Module is a unit of allocation for the JIT, and testfunc was part
of the same module that contained the anonymous expression.
When we removed that
module from the JIT to free the memory for the anonymous expression, we deleted
the definition of ``testfunc`` along with it. Then, when we tried to call
testfunc a second time, the JIT could no longer find it.

The easiest way to fix this is to put the anonymous expression in a separate
module from the rest of the function definitions. The JIT will happily resolve
function calls across module boundaries, as long as each of the functions called
has a prototype, and is added to the JIT before it is called. By putting the
anonymous expression in a different module we can delete it without affecting
the rest of the functions.

In fact, we're going to go a step further and put every function in its own
module. Doing so allows us to exploit a useful property of the KaleidoscopeJIT
that will make our environment more REPL-like: Functions can be added to the
JIT more than once (unlike a module where every function must have a unique
definition).
When you look up a symbol in KaleidoscopeJIT it will always return
the most recent definition:

::

    ready> def foo(x) x + 1;
    Read function definition:
    define double @foo(double %x) {
    entry:
            %addtmp = fadd double %x, 1.000000e+00
            ret double %addtmp
    }

    ready> foo(2);
    Evaluated to 3.000000

    ready> def foo(x) x + 2;
    define double @foo(double %x) {
    entry:
            %addtmp = fadd double %x, 2.000000e+00
            ret double %addtmp
    }

    ready> foo(2);
    Evaluated to 4.000000


To allow each function to live in its own module we'll need a way to
re-generate previous function declarations into each new module we open:

.. code-block:: c++

  static std::unique_ptr<KaleidoscopeJIT> TheJIT;

  ...

  Function *getFunction(std::string Name) {
    // First, see if the function has already been added to the current module.
    if (auto *F = TheModule->getFunction(Name))
      return F;

    // If not, check whether we can codegen the declaration from some existing
    // prototype.
    auto FI = FunctionProtos.find(Name);
    if (FI != FunctionProtos.end())
      return FI->second->codegen();

    // If no existing prototype exists, return null.
    return nullptr;
  }

  ...

  Value *CallExprAST::codegen() {
    // Look up the name in the global module table.
    Function *CalleeF = getFunction(Callee);

    ...

  Function *FunctionAST::codegen() {
    // Transfer ownership of the prototype to the FunctionProtos map, but keep a
    // reference to it for use below.
    auto &P = *Proto;
    FunctionProtos[Proto->getName()] = std::move(Proto);
    Function *TheFunction = getFunction(P.getName());
    if (!TheFunction)
      return nullptr;


To enable this, we'll start by adding a new global, ``FunctionProtos``, that
holds the most recent prototype for each function. We'll also add a convenience
method, ``getFunction()``, to replace calls to ``TheModule->getFunction()``.
Our convenience method searches ``TheModule`` for an existing function
declaration, falling back to generating a new declaration from FunctionProtos if
it doesn't find one. In ``CallExprAST::codegen()`` we just need to replace the
call to ``TheModule->getFunction()``. In ``FunctionAST::codegen()`` we need to
update the FunctionProtos map first, then call ``getFunction()``. With this
done, we can always obtain a function declaration in the current module for any
previously declared function.

We also need to update HandleDefinition and HandleExtern:

.. code-block:: c++

  static void HandleDefinition() {
    if (auto FnAST = ParseDefinition()) {
      if (auto *FnIR = FnAST->codegen()) {
        fprintf(stderr, "Read function definition:");
        FnIR->print(errs());
        fprintf(stderr, "\n");
        TheJIT->addModule(std::move(TheModule));
        InitializeModuleAndPassManager();
      }
    } else {
      // Skip token for error recovery.
      getNextToken();
    }
  }

  static void HandleExtern() {
    if (auto ProtoAST = ParseExtern()) {
      if (auto *FnIR = ProtoAST->codegen()) {
        fprintf(stderr, "Read extern: ");
        FnIR->print(errs());
        fprintf(stderr, "\n");
        FunctionProtos[ProtoAST->getName()] = std::move(ProtoAST);
      }
    } else {
      // Skip token for error recovery.
      getNextToken();
    }
  }

In HandleDefinition, we add two lines to transfer the newly defined function to
the JIT and open a new module. In HandleExtern, we just need to add one line to
add the prototype to FunctionProtos.

With these changes made, let's try our REPL again (I removed the dump of the
anonymous functions this time; you should get the idea by now :) :

::

    ready> def foo(x) x + 1;
    ready> foo(2);
    Evaluated to 3.000000

    ready> def foo(x) x + 2;
    ready> foo(2);
    Evaluated to 4.000000

It works!

Even with this simple code, we get some surprisingly powerful capabilities -
check this out:

::

    ready> extern sin(x);
    Read extern:
    declare double @sin(double)

    ready> extern cos(x);
    Read extern:
    declare double @cos(double)

    ready> sin(1.0);
    Read top-level expression:
    define double @2() {
    entry:
      ret double 0x3FEAED548F090CEE
    }

    Evaluated to 0.841471

    ready> def foo(x) sin(x)*sin(x) + cos(x)*cos(x);
    Read function definition:
    define double @foo(double %x) {
    entry:
      %calltmp = call double @sin(double %x)
      %multmp = fmul double %calltmp, %calltmp
      %calltmp2 = call double @cos(double %x)
      %multmp4 = fmul double %calltmp2, %calltmp2
      %addtmp = fadd double %multmp, %multmp4
      ret double %addtmp
    }

    ready> foo(4.0);
    Read top-level expression:
    define double @3() {
    entry:
      %calltmp = call double @foo(double 4.000000e+00)
      ret double %calltmp
    }

    Evaluated to 1.000000

Whoa, how does the JIT know about sin and cos? The answer is surprisingly
simple: The KaleidoscopeJIT has a straightforward symbol resolution rule that
it uses to find symbols that aren't available in any given module: First
it searches all the modules that have already been added to the JIT, from the
most recent to the oldest, to find the newest definition. If no definition is
found inside the JIT, it falls back to calling "``dlsym("sin")``" on the
Kaleidoscope process itself. Since "``sin``" is defined within the JIT's
address space, it simply patches up calls in the module to call the libm
version of ``sin`` directly. But in some cases this even goes further:
as sin and cos are names of standard math functions, the constant folder
will directly evaluate the function calls to the correct result when called
with constants like in the "``sin(1.0)``" above.

In the future we'll see how tweaking this symbol resolution rule can be used to
enable all sorts of useful features, from security (restricting the set of
symbols available to JIT'd code), to dynamic code generation based on symbol
names, and even lazy compilation.

One immediate benefit of the symbol resolution rule is that we can now extend
the language by writing arbitrary C++ code to implement operations. For example,
if we add:

.. code-block:: c++

    #ifdef _WIN32
    #define DLLEXPORT __declspec(dllexport)
    #else
    #define DLLEXPORT
    #endif

    /// putchard - putchar that takes a double and returns 0.
    extern "C" DLLEXPORT double putchard(double X) {
      fputc((char)X, stderr);
      return 0;
    }

Note that for Windows we need to actually export the functions because
the dynamic symbol loader will use GetProcAddress to find the symbols.

Now we can produce simple output to the console by using things like:
"``extern putchard(x); putchard(120);``", which prints a lowercase 'x'
on the console (120 is the ASCII code for 'x'). Similar code could be
used to implement file I/O, console input, and many other capabilities
in Kaleidoscope.

This completes the JIT and optimizer chapter of the Kaleidoscope
tutorial. At this point, we can compile a non-Turing-complete
programming language, optimize and JIT compile it in a user-driven way.
Next up we'll look into `extending the language with control flow
constructs <LangImpl05.html>`_, tackling some interesting LLVM IR issues
along the way.
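As a final aside, the ``dlsym`` fallback described earlier is ordinary
dynamic-loader behavior and can be observed entirely outside the JIT. The
standalone sketch below (not part of the tutorial code) resolves ``sin`` from
the running process's own address space; on older glibc you may need to link
with ``-ldl``.

.. code-block:: c++

    #include <cmath>
    #include <cstdio>
    #include <dlfcn.h>

    int main() {
      // Calling sin normally guarantees libm is present in the process image.
      double Direct = std::sin(1.0);

      // Ask the dynamic loader for "sin" in our own address space, the same
      // fallback the JIT uses for symbols missing from every added module.
      void *Sym = dlsym(RTLD_DEFAULT, "sin");
      if (!Sym) {
        std::fprintf(stderr, "sin not visible to the dynamic loader\n");
        return 1;
      }
      auto *SinFn = reinterpret_cast<double (*)(double)>(Sym);

      std::printf("direct:   %f\n", Direct);
      std::printf("resolved: %f\n", SinFn(1.0));
      return 0;
    }

Both lines print 0.841471. Resolving symbols that live in the main executable
itself (such as ``putchard``) this way is what makes the ``-rdynamic`` flag
necessary in the build instructions.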

Full Code Listing
=================

Here is the complete code listing for our running example, enhanced with
the LLVM JIT and optimizer. To build this example, use:

.. code-block:: bash

    # Compile
    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
    # Run
    ./toy

If you are compiling this on Linux, make sure to add the "-rdynamic"
option as well. This makes sure that the external functions are resolved
properly at runtime.

Here is the code:

.. literalinclude:: ../../../examples/Kaleidoscope/Chapter4/toy.cpp
   :language: c++

`Next: Extending the language: control flow <LangImpl05.html>`_