1.. _loop-terminology: 2 3=========================================== 4LLVM Loop Terminology (and Canonical Forms) 5=========================================== 6 7.. contents:: 8 :local: 9 10Introduction 11============ 12 13Loops are a core concept in any optimizer. This page spells out some 14of the common terminology used within LLVM code to describe loop 15structures. 16 17First, let's start with the basics. In LLVM, a Loop is a maximal set of basic 18blocks that form a strongly connected component (SCC) in the Control 19Flow Graph (CFG) where there exists a dedicated entry/header block that 20dominates all other blocks within the loop. Thus, without leaving the 21loop, one can reach every block in the loop from the header block and 22the header block from every block in the loop. 23 24Note that there are some important implications of this definition: 25 26* Not all SCCs are loops. There exist SCCs that do not meet the 27 dominance requirement and such are not considered loops. 28 29* Loops can contain non-loop SCCs and non-loop SCCs may contain 30 loops. Loops may also contain sub-loops. 31 32* A header block is uniquely associated with one loop. There can be 33 multiple SCC within that loop, but the strongly connected component 34 (SCC) formed from their union must always be unique. 35 36* Given the use of dominance in the definition, all loops are 37 statically reachable from the entry of the function. 38 39* Every loop must have a header block, and some set of predecessors 40 outside the loop. A loop is allowed to be statically infinite, so 41 there need not be any exiting edges. 42 43* Any two loops are either fully disjoint (no intersecting blocks), or 44 one must be a sub-loop of the other. 45 46A loop may have an arbitrary number of exits, both explicit (via 47control flow) and implicit (via throwing calls which transfer control 48out of the containing function). There is no special requirement on 49the form or structure of exit blocks (the block outside the loop which 50is branched to). They may have multiple predecessors, phis, etc... 51 52Key Terminology 53=============== 54 55Header Block - The basic block which dominates all other blocks 56contained within the loop. As such, it is the first one executed if 57the loop executes at all. Note that a block can be the header of 58two separate loops at the same time, but only if one is a sub-loop 59of the other. 60 61Exiting Block - A basic block contained within a given loop which has 62at least one successor outside of the loop and one successor inside the 63loop. (The latter is a consequence of the block being contained within 64an SCC which is part of the loop.) That is, it has a successor which 65is an Exit Block. 66 67Exit Block - A basic block outside of the associated loop which has a 68predecessor inside the loop. That is, it has a predecessor which is 69an Exiting Block. 70 71Latch Block - A basic block within the loop whose successors include 72the header block of the loop. Thus, a latch is a source of backedge. 73A loop may have multiple latch blocks. A latch block may be either 74conditional or unconditional. 75 76Backedge(s) - The edge(s) in the CFG from latch blocks to the header 77block. Note that there can be multiple such edges, and even multiple 78such edges leaving a single latch block. 79 80Loop Predecessor - The predecessor blocks of the loop header which 81are not contained by the loop itself. These are the only blocks 82through which execution can enter the loop. When used in the 83singular form implies that there is only one such unique block. 84 85Preheader Block - A preheader is a (singular) loop predecessor which 86ends in an unconditional transfer of control to the loop header. Note 87that not all loops have such blocks. 88 89Backedge Taken Count - The number of times the backedge will execute 90before some interesting event happens. Commonly used without 91qualification of the event as a shorthand for when some exiting block 92branches to some exit block. May be zero, or not statically computable. 93 94Iteration Count - The number of times the header will execute before 95some interesting event happens. Commonly used without qualification to 96refer to the iteration count at which the loop exits. Will always be 97one greater than the backedge taken count. *Warning*: Preceding 98statement is true in the *integer domain*; if you're dealing with fixed 99width integers (such as LLVM Values or SCEVs), you need to be cautious 100of overflow when converting one to the other. 101 102It's important to note that the same basic block can play multiple 103roles in the same loop, or in different loops at once. For example, a 104single block can be the header for two nested loops at once, while 105also being an exiting block for the inner one only, and an exit block 106for a sibling loop. Example: 107 108.. code-block:: C 109 110 while (..) { 111 for (..) {} 112 do { 113 do { 114 // <-- block of interest 115 if (exit) break; 116 } while (..); 117 } while (..) 118 } 119 120LoopInfo 121======== 122 123LoopInfo is the core analysis for obtaining information about loops. 124There are few key implications of the definitions given above which 125are important for working successfully with this interface. 126 127* LoopInfo does not contain information about non-loop cycles. As a 128 result, it is not suitable for any algorithm which requires complete 129 cycle detection for correctness. 130 131* LoopInfo provides an interface for enumerating all top level loops 132 (e.g. those not contained in any other loop). From there, you may 133 walk the tree of sub-loops rooted in that top level loop. 134 135* Loops which become statically unreachable during optimization *must* 136 be removed from LoopInfo. If this can not be done for some reason, 137 then the optimization is *required* to preserve the static 138 reachability of the loop. 139 140 141Loop Simplify Form 142================== 143 144TBD 145 146 147Loop Closed SSA (LCSSA) 148======================= 149 150TBD 151 152"More Canonical" Loops 153====================== 154 155TBD 156