1<!--===- docs/ControlFlowGraph.md 2 3 Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. 4 See https://llvm.org/LICENSE.txt for license information. 5 SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception 6 7--> 8 9## Concept 10After a Fortran subprogram has been parsed, its names resolved, and all its 11semantic constraints successfully checked, the parse tree of its 12executable part is translated into another abstract representation, 13namely the _control flow graph_ described in this note. 14 15This second representation of the subprogram's executable part is 16suitable for analysis and incremental modification as the subprogram 17is readied for code generation. 18Many high-level Fortran features are implemented by rewriting portions 19of a subprogram's control flow graph in place. 20 21### Control Flow Graph 22A _control flow graph_ is a collection of simple (_i.e.,_ "non-extended") 23basic _blocks_ that comprise straight-line sequences of _actions_ with a 24single entry point and a single exit point, and a collection of 25directed flow _edges_ (or _arcs_) denoting all possible transitions of 26control flow that may take place during execution from the end of 27one basic block to the beginning of another (or itself). 28 29A block that has multiple distinct successors in the flow of control 30must end with an action that selects its successor. 31 32The sequence of actions that constitutes a basic block may 33include references to user and library procedures. 34Subprogram calls with implicit control flow afterwards, namely 35alternate returns and `END=`/`ERR=` labels on input/output, 36will be lowered in translation to a representation that materializes 37that control flow into something similar to a computed `GO TO` or 38C language `switch` statement. 39 40For convenience in optimization and to simplify the implementation of 41data flow confluence functions, we may choose to maintain the 42property that each flow arc is the sole outbound arc emanating from 43its originating block, the sole inbound arc arriving at its destination, 44or both. 45Empty blocks would inserted to "split" arcs when necessary to maintain this 46invariant property. 47 48Fortran subprograms (other than internal subprograms) can have multiple 49entry points by using the obsolescent `ENTRY` statement. 50We will implement such subprograms by constructing a union 51of their dummy argument lists and using it as part of the definition 52of a new subroutine or function that can be called by each of 53the entry points, which are then all converted into wrapper routines that 54pass a selector value as an additional argument to drive a `switch` on entry 55to the new subprogram. 56 57This transformation ensures that every subprogram's control 58flow graph has a well-defined `START` node. 59 60Statement labels can be used in Fortran on any statement, but only 61the labels that decorate legal destinations of `GO TO` statements 62need to be implemented in the control flow graph. 63Specifically, non-executable statements like `DATA`, `NAMELIST`, and 64`FORMAT` statements will be extracted into data initialization 65records before or during the construction of the control flow 66graph, and will survive only as synonyms for `CONTINUE`. 67 68Nests of multiple labeled `DO` loops that terminate on the same 69label will be have that label rewritten so that `GO TO` within 70the loop nest will arrive at the copy that most closely nests 71the context. 72The Fortran standard does not require us to do this, but XLF 73(at least) works this way. 74 75### Expressions and Statements (Operations and Actions) 76Expressions are trees, not DAGs, of intrinsic operations, 77resolved function references, constant literals, and 78data designators. 79 80Expression nodes are represented in the compiler in a type-safe manner. 81There is a distinct class or class template for every category of 82intrinsic type, templatized over its supported kind type parameter values. 83 84Operands are storage-owning indirections to other instances 85of `Expression`, instances of constant values, and to representations 86of data and function references. 87These indirections are not nullable apart from the situation in which 88the operands of an expression are being removed for use elsewhere before 89the expression is destructed. 90 91The ranks and the extents of the shapes of the results of expressions 92are explicit for constant arrays and recoverable by analysis otherwise. 93 94Parenthesized subexpressions are scrupulously preserved in accordance with 95the Fortran standard. 96 97The expression tree is meant to be a representation that is 98as equally well suited for use in the symbol table (e.g., for 99a bound of an explicit shape array) as it is for an action 100in a basic block of the control flow graph (e.g., the right 101hand side of an assignment statement). 102 103Each basic block comprises a linear sequence of _actions_. 104These are represented as a doubly-linked list so that insertion 105and deletion can be done in constant time. 106 107Only the last action in a basic block can represent a change 108to the flow of control. 109 110### Scope Transitions 111Some of the various scopes of the symbol table are visible in the control flow 112graph as `SCOPE ENTRY` and `SCOPE EXIT` actions. 113`SCOPE ENTRY` actions are unique for their corresponding scopes, 114while `SCOPE EXIT` actions need not be so. 115It must be the case that 116any flow of control within the subprogram will enter only scopes that are 117not yet active, and exit only the most recently entered scope that has not 118yet been deactivated; i.e., when modeled by a push-down stack that is 119pushed by each traversal of a `SCOPE ENTRY` action, 120the entries of the stack are always distinct, only the scope at 121the top of the stack is ever popped by `SCOPE EXIT`, and the stack is empty 122when the subprogram terminates. 123Further, any references to resolved symbols must be to symbols whose scopes 124are active. 125 126The `DEALLOCATE` actions and calls to `FINAL` procedures implied by scoped 127lifetimes will be explicit in the sequence of actions in the control flow 128graph. 129 130Parallel regions might be partially represented by scopes, or by explicit 131operations similar to the scope entry and exit operations. 132 133### Data Flow Representation 134The subprogram text will be in static single assignment form by the time the 135subprogram arrives at the bridge to the LLVM IR builder. 136Merge points are actions at the heads of basic blocks whose operands 137are definition points; definition points are actions at the ends of 138basic blocks whose operands are expression trees (which may refer to 139merge points). 140 141### Rewriting Transformations 142 143#### I/O 144#### Dynamic allocation 145#### Array constructors 146 147#### Derived type initialization, deallocation, and finalization 148The machinery behind the complicated semantics of Fortran's derived types 149and `ALLOCATABLE` objects will be implemented in large part by the run time 150support library. 151 152#### Actual argument temporaries 153#### Array assignments, `WHERE`, and `FORALL` 154 155Array operations have shape. 156 157`WHERE` masks have shape. 158Their effects on array operations are by means of explicit `MASK` operands that 159are part of array assignment operations. 160 161#### Intrinsic function and subroutine calls 162