<!--===- docs/FortranForCProgrammers.md

   Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
   See https://llvm.org/LICENSE.txt for license information.
   SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

-->

Fortran For C Programmers
=========================

This note is limited to essential information about Fortran so that
a C or C++ programmer can get started more quickly with the language,
at least as a reader, and avoid some common pitfalls when starting
to write or modify Fortran code.
Please see other sources to learn about Fortran's rich history,
current applications, and modern best practices in new code.

Know This At Least
------------------
* There have been many implementations of Fortran, often from competing
  vendors, and the standard language has been defined by U.S. and
  international standards organizations.  The various editions of
  the standard are known as the '66, '77, '90, '95, 2003, 2008, and
  (now) 2018 standards.
* Backward compatibility is important.  Fortran has outlasted many
  generations of computer systems hardware and software.  Standard
  compliance notwithstanding, Fortran programmers generally expect that
  code that has compiled successfully in the past will continue to
  compile and work indefinitely.  The standards sometimes designate
  features as being deprecated, obsolescent, or even deleted, but that
  can be read only as discouraging their use in new code -- they'll
  probably always work in any serious implementation.
* Fortran has two source forms, which are typically distinguished by
  filename suffixes.  `foo.f` is old-style "fixed-form" source, and
  `foo.f90` is new-style "free-form" source.  All language features
  are available in both source forms.  Neither form has reserved words
  in the sense that C does.  Spaces are not required between tokens
  in fixed form, and case is not significant in either form.
* Variable declarations are optional by default.  Variables whose
  names begin with the letters `I` through `N` are implicitly
  `INTEGER`, and others are implicitly `REAL`.  These implicit typing
  rules can be changed in the source.
* Fortran uses parentheses in both array references and function calls.
  All arrays must be declared as such; other names followed by parenthesized
  expressions are assumed to be function calls.
* Fortran has a _lot_ of built-in "intrinsic" functions.  They are always
  available without a need to declare or import them.  Their names reflect
  the implicit typing rules, so you will encounter names that have been
  modified so that they have the right type (e.g., `AIMAG` has a leading `A`
  so that it's `REAL` rather than `INTEGER`).
* The modern language has means for declaring types, data, and subprogram
  interfaces in compiled "modules", as well as legacy mechanisms for
  sharing data and interconnecting subprograms.
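
The implicit typing rules and case insensitivity described above can be seen in a small free-form sketch. The program and variable names here are made up for illustration; `IMPLICIT NONE` is the usual way to change the rules so that every name must be declared.

```fortran
program implicit_demo
  ! No IMPLICIT statement here, so the default typing rules apply.
  k = 7 / 2          ! K begins with a letter in I..N, so it is INTEGER: k is 3
  x = 7 / 2.0        ! X is implicitly REAL: x is 3.5
  print *, K, X      ! case is not significant: K and k are the same variable
  call strict
end program implicit_demo

subroutine strict
  implicit none      ! disables implicit typing in this subprogram
  integer :: total   ! every name must now be declared before use
  total = 3
  print *, total
end subroutine strict
```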

A Rosetta Stone
---------------
Fortran's language standard and other documentation use some terminology
in particular ways that might be unfamiliar.

| Fortran | English |
| ------- | ------- |
| Association | Making a name refer to something else |
| Assumed | Some attribute of an argument or interface that is not known until a call is made |
| Companion processor | A C compiler |
| Component | Class member |
| Deferred | Some attribute of a variable that is not known until an allocation or assignment |
| Derived type | C++ class |
| Dummy argument | C++ reference argument |
| Final procedure | C++ destructor |
| Generic | Overloaded function, resolved by actual arguments |
| Host procedure | The subprogram that contains a nested one |
| Implied DO | There's a loop inside a statement |
| Interface | Prototype |
| Internal I/O | `sscanf` and `snprintf` |
| Intrinsic | Built-in type or function |
| Polymorphic | Dynamically typed |
| Processor | Fortran compiler |
| Rank | Number of dimensions that an array has |
| `SAVE` attribute | Statically allocated |
| Type-bound procedure | Kind of a C++ member function but not really |
| Unformatted | Raw binary |

Data Types
----------
There are five built-in ("intrinsic") types: `INTEGER`, `REAL`, `COMPLEX`,
`LOGICAL`, and `CHARACTER`.
They are parameterized with "kind" values, which should be treated as
non-portable integer codes, although in practice today these are the
byte sizes of the data.
(For `COMPLEX`, the kind type parameter value is the byte size of one of the
two `REAL` components, or half of the total size.)
The legacy `DOUBLE PRECISION` intrinsic type is an alias for a kind of `REAL`
that should be more precise, and bigger, than the default `REAL`.
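
A minimal sketch of kind parameters in use follows. The names `sp` and `dp` are a common convention rather than anything built in, and the values printed for the kinds depend on the compiler.

```fortran
program kinds_demo
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer, parameter :: sp = kind(1.0)      ! kind of default REAL
  integer, parameter :: dp = kind(1.0d0)    ! kind of DOUBLE PRECISION
  integer(int64)   :: big
  real(dp)         :: d
  double precision :: legacy                ! same type and kind as real(dp)
  complex(dp)      :: z                     ! two real(dp) parts

  big = huge(big)
  d = 1.0_dp / 3.0_dp
  legacy = d
  z = cmplx(d, -d, kind=dp)
  print *, 'kind values (not portable):', sp, dp, kind(legacy), kind(z)
  print *, 'bytes:', storage_size(d) / 8, storage_size(z) / 8
  print *, big, d, z
end program kinds_demo
```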

`COMPLEX` is a simple structure that comprises two `REAL` components.

`CHARACTER` data also have length, which may or may not be known at compilation
time.
`CHARACTER` variables are fixed-length strings and they get padded out
with space characters when not completely assigned.
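
The padding behavior is easy to observe. In this sketch (variable name chosen for illustration), the brackets make the trailing blanks visible, and an over-long value is silently truncated.

```fortran
program char_demo
  implicit none
  character(len=8) :: name

  name = 'abc'                      ! stored as 'abc' plus five blanks
  print *, '[' // name // ']'
  print *, len(name), len_trim(name)   ! declared length 8, trimmed length 3

  name = 'abcdefghijkl'             ! too long: only 'abcdefgh' is kept
  print *, '[' // trim(name) // ']'
end program char_demo
```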

User-defined ("derived") data types can be synthesized from the intrinsic
types and from previously-defined user types, much like a C `struct`.
Derived types can be parameterized with integer values that either have
to be constant at compilation time ("kind" parameters) or deferred to
execution ("len" parameters).

Derived types can inherit ("extend") from at most one other derived type.
They can have user-defined destructors (`FINAL` procedures).
They can specify default initial values for their components.
With some work, one can also specify a general constructor function,
since Fortran allows a generic interface to have the same name as that
of a derived type.
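
Here is a sketch of type extension, default component values, and a constructor supplied through a generic interface that shares the type's name. The module, types, and the `make_labeled` function are hypothetical names invented for this example.

```fortran
module shapes
  implicit none
  type :: point
    real :: x = 0.0, y = 0.0               ! default initial values
  end type point

  type, extends(point) :: labeled_point    ! extends exactly one parent
    character(len=8) :: label = 'none'
  end type labeled_point

  ! A generic interface named after the type serves as a constructor.
  interface labeled_point
    module procedure make_labeled
  end interface labeled_point
contains
  function make_labeled(label) result(p)
    character(len=*), intent(in) :: label
    type(labeled_point) :: p               ! x and y start at their defaults
    p%label = label
  end function make_labeled
end module shapes

program types_demo
  use shapes
  implicit none
  type(labeled_point) :: p
  p = labeled_point('origin')              ! resolves to make_labeled
  print *, p%x, p%y, p%label               ! inherited components by name
  print *, p%point%x                       ! the parent is also a component
end program types_demo
```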

Last, there are "typeless" binary constants that can be used in a few
situations, like static data initialization or immediate conversion,
where type is not necessary.

Arrays
------
Arrays are not types in Fortran.
Being an array is a property of an object or function, not of a type.
Unlike C, one cannot have an array of arrays or an array of pointers,
although one can have an array of a derived type that has arrays or
pointers as components.
Arrays are multidimensional, and the number of dimensions is called
the _rank_ of the array.
In storage, arrays are stored such that the last subscript has the
largest stride in memory, e.g. A(1,1) is followed by A(2,1), not A(1,2).
And yes, the default lower bound on each dimension is 1, not 0.

Expressions can manipulate arrays as multidimensional values, and
the compiler will create the necessary loops.
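
The storage order, bounds, and whole-array expressions can be sketched as follows. The array names are illustrative only.

```fortran
program array_demo
  implicit none
  integer :: a(2,3)        ! rank 2, bounds 1:2 and 1:3
  integer :: b(0:2)        ! a lower bound other than 1 must be requested
  integer :: i

  ! RESHAPE fills in storage order, so the first subscript varies fastest.
  a = reshape([1, 2, 3, 4, 5, 6], shape(a))
  print *, a(1,1), a(2,1), a(1,2)    ! these hold 1, 2, and 3

  b = [(10*i, i = 0, 2)]   ! an implied DO inside an array constructor
  a = 2*a + 1              ! whole-array expression, no explicit loops
  print *, sum(a), maxval(a(:,3))    ! a(:,3) is the last column
  print *, lbound(a, 1), lbound(b, 1), size(a), size(shape(a))
end program array_demo
```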

Allocatables
------------
Modern Fortran programs use `ALLOCATABLE` data extensively.
Such variables and derived type components are allocated dynamically.
They are automatically deallocated when they go out of scope, much
like C++'s `std::vector<>` class template instances are.
The array bounds, derived type `LEN` parameters, and even the
type of an allocatable can all be deferred to run time.
(If you really want to learn all about modern Fortran, I suggest
that you study everything that can be done with `ALLOCATABLE` data,
and follow up all the references that are made in the documentation
from the description of `ALLOCATABLE` to other topics; it's a feature
that interacts with much of the rest of the language.)
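
A short sketch of deferred bounds, deferred character length, and automatic deallocation is shown here. The `work` subroutine is a made-up example, and the reallocation on assignment assumes a compiler that follows the Fortran 2003 rules by default.

```fortran
program alloc_demo
  implicit none
  real, allocatable :: v(:)               ! bounds deferred to run time
  character(len=:), allocatable :: s      ! length deferred to run time

  allocate(v(3))
  v = [1.0, 2.0, 3.0]
  v = [v, 4.0]             ! assignment reallocates v with four elements
  s = 'hello'              ! assignment allocates s with length 5
  print *, allocated(v), size(v), len(s)
  call work(100)
contains
  subroutine work(n)
    integer, intent(in) :: n
    real, allocatable :: tmp(:)
    allocate(tmp(n))
    tmp = 0.0
    print *, size(tmp)
  end subroutine work      ! tmp is deallocated automatically on return
end program alloc_demo
```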

I/O
---
Fortran's input/output features are built into the syntax of the language,
rather than being defined by library interfaces as in C and C++.
There are means for raw binary I/O and for "formatted" transfers to
character representations.
There are means for random-access I/O using fixed-size records as well as for
sequential I/O.
One can scan data from or format data into `CHARACTER` variables via
"internal" formatted I/O.
I/O from and to files uses a scheme of integer "unit" numbers that is
similar to the open file descriptors of UNIX; i.e., one opens a file
and assigns it a unit number, then uses that unit number in subsequent
`READ` and `WRITE` statements.

Formatted I/O relies on format specifications to map values to fields of
characters, similar to the format strings used with C's `printf` family
of standard library functions.
These format specifications can appear in `FORMAT` statements and
be referenced by their labels, in character literals directly in I/O
statements, or in character variables.

One can also use compiler-generated formatting in "list-directed" I/O,
in which the compiler derives reasonable default formats based on
data types.
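
The main I/O styles can be compared side by side in one sketch. The scratch file and the variable names are assumptions made for the example; `NEWUNIT=` asks the runtime to pick an unused unit number.

```fortran
program io_demo
  implicit none
  character(len=32) :: buffer
  integer :: n, u
  real :: x

  print *, 'list-directed:', 42, 1.5          ! formats chosen by the compiler
  write (*, '(A, I5, F8.3)') 'literal format:', 42, 1.5
  write (*, 100) 42, 1.5                      ! format found via its label
100 format ('FORMAT statement:', I5, F8.3)

  ! Internal I/O: a CHARACTER variable stands in for the unit.
  write (buffer, '(I4, 1X, F6.2)') 17, 2.5    ! compare snprintf
  read (buffer, *) n, x                       ! compare sscanf

  ! Unformatted (raw binary) I/O on a scratch file, via a unit number.
  open (newunit=u, status='scratch', form='unformatted')
  write (u) n, x
  rewind (u)
  read (u) n, x
  close (u)
  print *, n, x
end program io_demo
```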

Subprograms
-----------
Fortran has both `FUNCTION` and `SUBROUTINE` subprograms.
They share the same name space, but functions cannot be called as
subroutines or vice versa.
Subroutines are called with the `CALL` statement, while functions are
invoked with function references in expressions.

There is one level of subprogram nesting.
A function, subroutine, or main program can have functions and subroutines
nested within it, but these "internal" procedures cannot themselves have
their own internal procedures.
As is the case with C++ lambda expressions, internal procedures can
reference names from their host subprograms.
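
As a small sketch with invented names, the following main program hosts one internal subroutine and one internal function. The subroutine updates a variable that belongs to its host.

```fortran
program subprogram_demo
  implicit none
  real :: total
  total = 0.0
  call accumulate(3.0)            ! subroutines are invoked with CALL
  call accumulate(4.0)
  print *, total, square(total)   ! functions are referenced in expressions
contains
  subroutine accumulate(amount)   ! internal procedure of the main program
    real, intent(in) :: amount
    total = total + amount        ! total is the host's variable
  end subroutine accumulate

  function square(v) result(r)
    real, intent(in) :: v
    real :: r
    r = v * v
  end function square
end program subprogram_demo
```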

Modules
-------
Modern Fortran has good support for separate compilation and namespace
management.
The *module* is the basic unit of compilation, although independent
subprograms still exist, of course, as well as the main program.
Modules define types, constants, interfaces, and nested
subprograms.

Objects from a module are made available for use in other compilation
units via the `USE` statement, which has options for limiting the objects
that are made available as well as for renaming them.
All references to objects in modules are done with direct names or
aliases that have been added to the local scope, as Fortran has no means
of qualifying references with module names.
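
The `ONLY` and renaming options of `USE` look like this in practice. The module and its contents are hypothetical; note that the program refers to `area` and `pi` directly, with no module-name qualifier.

```fortran
module constants
  implicit none
  private                                ! hide everything not listed below
  public :: pi, circle_area
  real, parameter :: pi = 3.14159265
contains
  function circle_area(r) result(a)
    real, intent(in) :: r
    real :: a
    a = pi * r**2
  end function circle_area
end module constants

program module_demo
  ! Import two names only, renaming circle_area to area locally.
  use constants, only: area => circle_area, pi
  implicit none
  print *, area(2.0), pi
end program module_demo
```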

Arguments
---------
Functions and subroutines have "dummy" arguments that are dynamically
associated with actual arguments during calls.
Essentially, all argument passing in Fortran is by reference, not value.
One may restrict access to argument data by declaring that dummy
arguments have `INTENT(IN)`, but that corresponds to the use of
a `const` reference in C++ and does not imply that the data are
copied; use `VALUE` for that.

When it is not possible to pass a reference to an object, or a sparse
regular array section of an object, as an actual argument, Fortran
compilers must allocate temporary space to hold the actual argument
across the call.
This is guaranteed to happen when an actual argument is enclosed
in parentheses.
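
The difference between reference passing, `VALUE`, and `INTENT(IN)` can be sketched with three internal subroutines whose names are invented for this example.

```fortran
program args_demo
  implicit none
  integer :: n
  n = 1
  call by_reference(n)
  print *, n              ! 2: the callee updated the caller's variable
  call by_value(n)
  print *, n              ! still 2: VALUE gave the callee its own copy
  call read_only((n))     ! (n) is an expression, so a temporary is passed
contains
  subroutine by_reference(k)
    integer, intent(inout) :: k
    k = k + 1
  end subroutine by_reference

  subroutine by_value(k)
    integer, value :: k
    k = k + 1             ! changes only the local copy
  end subroutine by_value

  subroutine read_only(k)
    integer, intent(in) :: k   ! assigning to k here would be rejected
    print *, k
  end subroutine read_only
end program args_demo
```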

The compiler is free to assume that any aliasing between dummy arguments
and other data is safe.
In other words, if some object can be written to under one name, it's
never going to be read or written using some other name in that same
scope.
```
  SUBROUTINE FOO(X,Y,Z)
  X = 3.14159
  Y = 2.1828
  Z = 2 * X ! CAN BE FOLDED AT COMPILE TIME
  END
```
This is the opposite of the assumptions under which a C or C++ compiler must
labor when trying to optimize code with pointers.

Overloading
-----------
Fortran supports a form of overloading via its interface feature.
By default, an interface is a means for specifying prototypes for a
set of subroutines and functions.
But when an interface is named, that name becomes a *generic* name
for its specific subprograms, and calls via the generic name are
mapped at compile time to one of the specific subprograms based
on the types, kinds, and ranks of the actual arguments.
A similar feature can be used for generic type-bound procedures.

This feature can be used to overload the built-in operators and some
I/O statements, too.
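
A sketch of a generic name with three specific procedures, plus an overloaded `+` operator, might look like this. The `twice` generic and the `vec2` type are made up for illustration.

```fortran
module overload_mod
  implicit none
  type :: vec2
    real :: x, y
  end type vec2

  interface twice                     ! generic name for three specifics
    module procedure twice_int, twice_real, twice_real_array
  end interface twice

  interface operator(+)               ! extends the built-in operator to vec2
    module procedure add_vec2
  end interface
contains
  integer function twice_int(i)
    integer, intent(in) :: i
    twice_int = 2 * i
  end function twice_int

  real function twice_real(x)
    real, intent(in) :: x
    twice_real = 2.0 * x
  end function twice_real

  function twice_real_array(x) result(r)
    real, intent(in) :: x(:)
    real :: r(size(x))
    r = 2.0 * x
  end function twice_real_array

  function add_vec2(a, b) result(c)
    type(vec2), intent(in) :: a, b
    type(vec2) :: c
    c = vec2(a%x + b%x, a%y + b%y)
  end function add_vec2
end module overload_mod

program overload_demo
  use overload_mod
  implicit none
  type(vec2) :: v
  print *, twice(3)             ! chosen by type: twice_int
  print *, twice(1.5)           ! chosen by type: twice_real
  print *, twice([1.0, 2.0])    ! chosen by rank: twice_real_array
  v = vec2(1.0, 2.0) + vec2(3.0, 4.0)
  print *, v%x, v%y
end program overload_demo
```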

Polymorphism
------------
Fortran code can be written to accept data of some derived type or
any extension thereof using `CLASS`, deferring the actual type to
execution, rather than the usual `TYPE` syntax.
This is somewhat similar to the use of `virtual` functions in C++.

Fortran's `SELECT TYPE` construct is used to distinguish between
possible specific types dynamically, when necessary.  It's a
little like C++17's `std::visit()` on a discriminated union.
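
The following sketch, with hypothetical `animal` and `dog` types, shows a `CLASS` dummy argument, a polymorphic allocatable, and `SELECT TYPE` recovering the dynamic type.

```fortran
module animals
  implicit none
  type :: animal
    character(len=8) :: name = ''
  end type animal

  type, extends(animal) :: dog
    integer :: tricks = 0
  end type dog
contains
  subroutine describe(a)
    class(animal), intent(in) :: a      ! accepts animal or any extension
    select type (a)
    type is (dog)                       ! components of dog are visible here
      print *, trim(a%name), ' is a dog that knows', a%tricks, 'tricks'
    class default
      print *, trim(a%name), ' is some animal'
    end select
  end subroutine describe
end module animals

program poly_demo
  use animals
  implicit none
  class(animal), allocatable :: pet     ! dynamic type chosen at run time
  call describe(animal('Generic'))
  allocate(pet, source=dog('Rex', 3))
  call describe(pet)
end program poly_demo
```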

Pointers
--------
Pointers are objects in Fortran, not data types.
Pointers can point to data, arrays, and subprograms.
A pointer can only point to data that has the `TARGET` attribute.
Outside of the pointer assignment statement (`P=>X`) and some intrinsic
functions and cases with pointer dummy arguments, pointers are implicitly
dereferenced, and the use of their name is a reference to the data to which
they point instead.

Unlike C, a pointer cannot point to a pointer *per se*, nor can it be
used to implement a level of indirection to the management structure of
an allocatable.
If you assign to a Fortran pointer to make it point at another pointer,
you are making the pointer point to the data (if any) to which the other
pointer points.
Similarly, if you assign to a Fortran pointer to make it point to an allocatable,
you are making the pointer point to the current content of the allocatable,
not to the metadata that manages the allocatable.

Unlike allocatables, pointers do not deallocate their data when they go
out of scope.
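
These rules can be sketched with two array pointers (names chosen for illustration). Note the difference between pointer assignment with `=>` and ordinary assignment with `=`.

```fortran
program pointer_demo
  implicit none
  real, target  :: x(4)
  real, pointer :: p(:), q(:)

  x = [1.0, 2.0, 3.0, 4.0]
  p => x(2:3)          ! pointer assignment: p now refers to part of x
  p = 0.0              ! ordinary assignment: implicit dereference, sets x(2:3)
  q => p               ! q points at p's target, x(2:3), not at p itself
  nullify(p)           ! q is unaffected
  print *, x
  print *, associated(p), associated(q)

  allocate(p(10))      ! a pointer can also be given anonymous storage...
  p = 1.0
  deallocate(p)        ! ...which must be released explicitly
end program pointer_demo
```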

A legacy feature, "Cray pointers", implements dynamic base addressing of
one variable using an address stored in another.

Preprocessing
-------------
There is no standard preprocessing feature, but every real Fortran implementation
has some support for passing Fortran source code through a variant of
the standard C source preprocessor.
Since Fortran is very different from C at the lexical level (e.g., line
continuations, Hollerith literals, no reserved words, fixed form), using
a stock modern C preprocessor on Fortran source can be difficult.
Preprocessing behavior varies across implementations and one should not depend on
much portability.
Preprocessing is typically requested by the use of a capitalized filename
suffix (e.g., "foo.F90") or a compiler command line option.
(Since the F18 compiler always runs its built-in preprocessing stage,
no special option or filename suffix is required.)

"Object Oriented" Programming
-----------------------------
Fortran doesn't have member functions (or subroutines) in the sense
that C++ does, in which a function has immediate access to the members
of a specific instance of a derived type.
But Fortran does have an analog to C++'s `this` via *type-bound
procedures*.
This is a means of binding a particular subprogram name to a derived
type, possibly with aliasing, in such a way that the subprogram can
be called as if it were a component of the type (e.g., `X%F(Y)`)
and receive the object to the left of the `%` as an additional actual argument,
exactly as if the call had been written `F(X,Y)`.
The object is passed as the first argument by default, but that can be
changed; indeed, the same specific subprogram can be used for multiple
type-bound procedures by choosing different dummy arguments to serve as
the passed object.
The equivalent of a `static` member function is also available by saying
that no argument is to be associated with the object via `NOPASS`.

There's a lot more that can be said about type-bound procedures (e.g., how they
support overloading) but this should be enough to get you started with
the most common usage.
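
The most common usage can be sketched with a hypothetical `counter` type. The name `self` for the passed-object dummy argument is only a convention; any name works.

```fortran
module counter_mod
  implicit none
  type :: counter
    integer :: n = 0
  contains
    procedure :: add                   ! object is passed as the first argument
    procedure, nopass :: version       ! no passed object, like a static member
  end type counter
contains
  subroutine add(self, amount)
    class(counter), intent(inout) :: self   ! passed-object dummy uses CLASS
    integer, intent(in) :: amount
    self%n = self%n + amount
  end subroutine add

  integer function version()
    version = 1
  end function version
end module counter_mod

program tbp_demo
  use counter_mod
  implicit none
  type(counter) :: c
  call c%add(5)          ! the object c becomes the first actual argument
  call add(c, 2)         ! the same procedure, called directly
  print *, c%n, c%version()
end program tbp_demo
```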

Pitfalls
--------
Variable initializers, e.g. `INTEGER :: J=123`, are _static_ initializers!
They imply that the variable is stored in static storage, not on the stack,
and the initialized value lasts only until the variable is assigned.
One must use an assignment statement to implement a dynamic initializer
that will apply to every fresh instance of the variable.
Be especially careful when using initializers in the newish `BLOCK` construct,
which perpetuates the interpretation as static data.
(Derived type component initializers, however, do work as expected.)
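
This pitfall is easiest to see with two nearly identical functions (names invented for this sketch). The first one keeps counting across calls, which a C programmer would not expect from what looks like a local variable initializer.

```fortran
program save_pitfall
  implicit none
  integer :: first, second

  first = counter_static()
  second = counter_static()
  print *, first, second      ! 1 2: j kept its value between the calls

  first = counter_dynamic()
  second = counter_dynamic()
  print *, first, second      ! 1 1: j was reset on each call
contains
  integer function counter_static()
    integer :: j = 0          ! static initializer: applied once, implies SAVE
    j = j + 1
    counter_static = j
  end function counter_static

  integer function counter_dynamic()
    integer :: j
    j = 0                     ! assignment: executed on every call
    j = j + 1
    counter_dynamic = j
  end function counter_dynamic
end program save_pitfall
```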

If you see an assignment to an array that's never been declared as such,
it's probably a definition of a *statement function*, which is like
a parameterized macro definition, e.g. `A(X)=SQRT(X)**3`.
In the original Fortran language, this was the only means for user
function definitions.
Today, of course, one should use an external or internal function instead.

Fortran expressions don't bind exactly like C's do.
Watch out for exponentiation with `**`, which of course C lacks; it
binds more tightly than negation does (e.g., `-2**2` is -4),
and it binds to the right, unlike what any other Fortran and most
C operators do; e.g., `2**2**3` is 256, not 64.
Logical values must be compared with special logical equivalence
relations (`.EQV.` and `.NEQV.`) rather than the usual equality
operators.
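
The binding rules for `**` and the logical equivalence operators can be checked with a few lines:

```fortran
program operator_demo
  implicit none
  logical :: a, b

  print *, -2**2          ! -4: parsed as -(2**2)
  print *, (-2)**2        ! 4
  print *, 2**2**3        ! 256: parsed as 2**(2**3)
  print *, (2**2)**3      ! 64

  a = .true.
  b = .false.
  print *, a .eqv. b      ! F: the standard way to test logical equality
  print *, a .neqv. b     ! T: logical inequality (exclusive or)
end program operator_demo
```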

A Fortran compiler is allowed to short-circuit expression evaluation,
but not required to do so.
If one needs to protect a use of an `OPTIONAL` argument or possibly
disassociated pointer, use an `IF` statement, not a logical `.AND.`
operation.
In fact, Fortran can remove function calls from expressions if their
values are not required to determine the value of the expression's
result; e.g., if there is a `PRINT` statement in function `F`, it
may or may not be executed by the assignment statement `X=0*F()`.
(Well, it probably will be, in practice, but compilers always reserve
the right to optimize better.)
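
A sketch of the safe pattern for an `OPTIONAL` argument follows; the `report` subroutine is a made-up example. The unsafe form is shown only in a comment.

```fortran
program optional_demo
  implicit none
  call report()
  call report(3)
contains
  subroutine report(n)
    integer, intent(in), optional :: n
    ! Not safe: both operands of .AND. may be evaluated even when n is
    ! absent, so this could reference a missing argument:
    !   if (present(n) .and. n > 0) print *, 'n =', n
    if (present(n)) then
      if (n > 0) print *, 'n =', n     ! n is touched only when present
    else
      print *, 'n is absent'
    end if
  end subroutine report
end program optional_demo
```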

Unless they have an explicit suffix (`1.0_8`, `2.0_8`) or a `D`
exponent (`3.0D0`), real literal constants in Fortran have the
default `REAL` type -- *not* `double` as is the case in C and C++.
If you're not careful, you can lose precision at compilation time
from your constant values and never know it.
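
The loss of precision is visible when the same decimal constant is written three ways. The `dp` parameter is a conventional name, not a built-in one.

```fortran
program literal_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: a, b, c

  a = 0.1          ! default REAL constant, rounded before being widened
  b = 0.1d0        ! DOUBLE PRECISION constant
  c = 0.1_dp       ! kind suffix: the same value as b
  print *, a
  print *, b
  print *, b == c, a == b    ! T F: a carries only single-precision accuracy
end program literal_demo
```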