1<!--===- docs/Extensions.md
2
3   Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4   See https://llvm.org/LICENSE.txt for license information.
5   SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6
7-->
8
9# Fortran Extensions supported by Flang
10
11```eval_rst
12.. contents::
13   :local:
14```
15
16As a general principle, this compiler will accept by default and
17without complaint many legacy features, extensions to the standard
18language, and features that have been deleted from the standard,
19so long as the recognition of those features would not cause a
20standard-conforming program to be rejected or misinterpreted.
21
22Other non-standard features, which do conflict with the current
23standard specification of the Fortran programming language, are
24accepted if enabled by command-line options.
25
26## Intentional violations of the standard
27
28* Scalar `INTEGER` actual argument expressions (not variables!)
29  are converted to the kinds of scalar `INTEGER` dummy arguments
30  when the interface is explicit and the kinds differ.
31  This conversion allows the results of the intrinsics like
32  `SIZE` that (as mentioned below) may return non-default
33  `INTEGER` results by default to be passed.  A warning is
34  emitted when truncation is possible.  These conversions
35  are not applied in calls to non-intrinsic generic procedures.
36* We are not strict on the contents of `BLOCK DATA` subprograms
37  so long as they contain no executable code, no internal subprograms,
38  and allocate no storage outside a named `COMMON` block.  (C1415)
39* Delimited list-directed (and NAMELIST) character output is required
40  to emit contiguous doubled instances of the delimiter character
41  when it appears in the output value.  When fixed-size records
42  are being emitted, as is the case with internal output, this
43  is not possible when the problematic character falls on the last
44  position of a record.  No two other Fortran compilers do the same
45  thing in this situation so there is no good precedent to follow.
46  Because it seems least wrong, we emit one copy of the delimiter as
47  the last character of the current record and another as the first
48  character of the next record.  (The second-least-wrong alternative
49  might be to flag a runtime error, but that seems harsh since it's
50  not an explicit error in the standard, and the output may not have
51  to be usable later as input anyway.)
52  Consequently, the output is not suitable for use as list-directed or
53  NAMELIST input.  If a later standard were to clarify this case, this
54  behavior will change as needed to conform.
55```
56character(11) :: buffer(3)
57character(10) :: quotes = '""""""""""'
58write(buffer,*,delim="QUOTE") quotes
59print "('>',a10,'<')", buffer
60end
61```
62* The name of the control variable in an implied DO loop in an array
63  constructor or DATA statement has a scope over the value-list only,
64  not the bounds of the implied DO loop.  It is not advisable to use
65  an object of the same name as the index variable in a bounds
66  expression, but it will work, instead of being needlessly undefined.
67* If both the `COUNT=` and the `COUNT_MAX=` optional arguments are
68  present on the same call to the intrinsic subroutine `SYSTEM_CLOCK`,
69  we require that their types have the same integer kind, since the
70  kind of these arguments is used to select the clock rate.  In common
71  with some other compilers, the clock rate varies from tenths of a
72  second to nanoseconds depending on argument kind and platform support.
73* If a dimension of a descriptor has zero extent in a call to
74  `CFI_section`, `CFI_setpointer` or `CFI_allocate`, the lower
75  bound on that dimension will be set to 1 for consistency with
76  the `LBOUND()` intrinsic function.
77* `-2147483648_4` is, strictly speaking, a non-conforming literal
78  constant on a machine with 32-bit two's-complement integers as
79  kind 4, because the grammar of Fortran expressions parses it as a
80  negation of a literal constant, not a negative literal constant.
81  This compiler accepts it with a portability warning.
82
83## Extensions, deletions, and legacy features supported by default
84
85* Tabs in source
86* `<>` as synonym for `.NE.` and `/=`
87* `$` and `@` as legal characters in names
88* Initialization in type declaration statements using `/values/`
89* Kind specification with `*`, e.g. `REAL*4`
90* `DOUBLE COMPLEX` as a synonym for `COMPLEX(KIND(0.D0))` --
91  but not when spelled `TYPE(DOUBLECOMPLEX)`.
92* Signed complex literal constants
93* DEC `STRUCTURE`, `RECORD`, with '%FILL'; but `UNION`, and `MAP`
94  are not yet supported throughout compilation, and elicit a
95  "not yet implemented" message.
96* Structure field access with `.field`
97* `BYTE` as synonym for `INTEGER(KIND=1)`; but not when spelled `TYPE(BYTE)`.
98* When kind-param is used for REAL literals, allow a matching exponent letter
99* Quad precision REAL literals with `Q`
100* `X` prefix/suffix as synonym for `Z` on hexadecimal literals
101* `B`, `O`, `Z`, and `X` accepted as suffixes as well as prefixes
102* Triplets allowed in array constructors
103* `%LOC`, `%VAL`, and `%REF`
104* Leading comma allowed before I/O item list
105* Empty parentheses allowed in `PROGRAM P()`
106* Missing parentheses allowed in `FUNCTION F`
107* Cray based `POINTER(p,x)` and `LOC()` intrinsic (with `%LOC()` as
108  an alias)
109* Arithmetic `IF`.  (Which branch should NaN take? Fall through?)
110* `ASSIGN` statement, assigned `GO TO`, and assigned format
111* `PAUSE` statement
112* Hollerith literals and edit descriptors
113* `NAMELIST` allowed in the execution part
114* Omitted colons on type declaration statements with attributes
115* COMPLEX constructor expression, e.g. `(x+y,z)`
116* `+` and `-` before all primary expressions, e.g. `x*-y`
117* `.NOT. .NOT.` accepted
118* `NAME=` as synonym for `FILE=`
119* Data edit descriptors without width or other details
120* `D` lines in fixed form as comments or debug code
121* `CARRIAGECONTROL=` on the OPEN and INQUIRE statements
122* `CONVERT=` on the OPEN and INQUIRE statements
123* `DISPOSE=` on the OPEN and INQUIRE statements
124* Leading semicolons are ignored before any statement that
125  could have a label
126* The character `&` in column 1 in fixed form source is a variant form
127  of continuation line.
128* Character literals as elements of an array constructor without an explicit
129  type specifier need not have the same length; the longest literal determines
130  the length parameter of the implicit type, not the first.
131* Outside a character literal, a comment after a continuation marker (&)
132  need not begin with a comment marker (!).
133* Classic C-style /*comments*/ are skipped, so multi-language header
134  files are easier to write and use.
135* $ and \ edit descriptors are supported in FORMAT to suppress newline
136  output on user prompts.
137* Tabs in format strings (not `FORMAT` statements) are allowed on output.
138* REAL and DOUBLE PRECISION variable and bounds in DO loops
139* Integer literals without explicit kind specifiers that are out of range
140  for the default kind of INTEGER are assumed to have the least larger kind
141  that can hold them, if one exists.
142* BOZ literals can be used as INTEGER values in contexts where the type is
143  unambiguous: the right hand sides of assigments and initializations
144  of INTEGER entities, as actual arguments to a few intrinsic functions
145  (ACHAR, BTEST, CHAR), and as actual arguments of references to
146  procedures with explicit interfaces whose corresponding dummy
147  argument has a numeric type to which the BOZ literal may be
148  converted.  BOZ literals are interpreted as default INTEGER only
149  when they appear as the first items of array constructors with no
150  explicit type.  Otherwise, they generally cannot be used if the type would
151  not be known (e.g., `IAND(X'1',X'2')`).
152* BOZ literals can also be used as REAL values in some contexts where the
153  type is unambiguous, such as initializations of REAL parameters.
154* EQUIVALENCE of numeric and character sequences (a ubiquitous extension),
155  as well as of sequences of non-default kinds of numeric types
156  with each other.
157* Values for whole anonymous parent components in structure constructors
158  (e.g., `EXTENDEDTYPE(PARENTTYPE(1,2,3))` rather than `EXTENDEDTYPE(1,2,3)`
159   or `EXTENDEDTYPE(PARENTTYPE=PARENTTYPE(1,2,3))`).
160* Some intrinsic functions are specified in the standard as requiring the
161  same type and kind for their arguments (viz., ATAN with two arguments,
162  ATAN2, DIM, HYPOT, MAX, MIN, MOD, and MODULO);
163  we allow distinct types to be used, promoting
164  the arguments as if they were operands to an intrinsic `+` operator,
165  and defining the result type accordingly.
166* DOUBLE COMPLEX intrinsics DREAL, DCMPLX, DCONJG, and DIMAG.
167* The DFLOAT intrinsic function.
168* INT_PTR_KIND intrinsic returns the kind of c_intptr_t.
169* Restricted specific conversion intrinsics FLOAT, SNGL, IDINT, IFIX, DREAL,
170  and DCMPLX accept arguments of any kind instead of only the default kind or
171  double precision kind. Their result kinds remain as specified.
172* Specific intrinsics AMAX0, AMAX1, AMIN0, AMIN1, DMAX1, DMIN1, MAX0, MAX1,
173  MIN0, and MIN1 accept more argument types than specified. They are replaced by
174  the related generics followed by conversions to the specified result types.
175* When a scalar CHARACTER actual argument of the same kind is known to
176  have a length shorter than the associated dummy argument, it is extended
177  on the right with blanks, similar to assignment.
178* When a dummy argument is `POINTER` or `ALLOCATABLE` and is `INTENT(IN)`, we
179  relax enforcement of some requirements on actual arguments that must otherwise
180  hold true for definable arguments.
181* Assignment of `LOGICAL` to `INTEGER` and vice versa (but not other types) is
182  allowed.  The values are normalized.
183* Static initialization of `LOGICAL` with `INTEGER` is allowed in `DATA` statements
184  and object initializers.
185  The results are *not* normalized to canonical `.TRUE.`/`.FALSE.`.
186  Static initialization of `INTEGER` with `LOGICAL` is also permitted.
187* An effectively empty source file (no program unit) is accepted and
188  produces an empty relocatable output file.
189* A `RETURN` statement may appear in a main program.
190* DATA statement initialization is allowed for procedure pointers outside
191  structure constructors.
192* Nonstandard intrinsic functions: ISNAN, SIZEOF
193* A forward reference to a default INTEGER scalar dummy argument is
194  permitted to appear in a specification expression, such as an array
195  bound, in a scope with IMPLICIT NONE(TYPE) if the name
196  of the dummy argument would have caused it to be implicitly typed
197  as default INTEGER if IMPLICIT NONE(TYPE) were absent.
198* OPEN(ACCESS='APPEND') is interpreted as OPEN(POSITION='APPEND')
199  to ease porting from Sun Fortran.
200* Intrinsic subroutines EXIT([status]) and ABORT()
201* The definition of simple contiguity in 9.5.4 applies only to arrays;
202  we also treat scalars as being trivially contiguous, so that they
203  can be used in contexts like data targets in pointer assignments
204  with bounds remapping.
205* We support some combinations of specific procedures in generic
206  interfaces that a strict reading of the standard would preclude
207  when their calls must nonetheless be distinguishable.
208  Specifically, `ALLOCATABLE` dummy arguments are distinguishing
209  if an actual argument acceptable to one could not be passed to
210  the other & vice versa because exactly one is polymorphic or
211  exactly one is unlimited polymorphic).
212* External unit 0 is predefined and connected to the standard error output,
213  and defined as `ERROR_UNIT` in the intrinsic `ISO_FORTRAN_ENV` module.
214* Objects in blank COMMON may be initialized.
215* Initialization of COMMON blocks outside of BLOCK DATA subprograms.
216* Multiple specifications of the SAVE attribute on the same object
217  are allowed, with a warning.
218* Specific intrinsic functions BABS, IIABS, JIABS, KIABS, ZABS, and CDABS.
219* A `POINTER` component's type need not be a sequence type when
220  the component appears in a derived type with `SEQUENCE`.
221  (This case should probably be an exception to constraint C740 in
222  the standard.)
223* Format expressions that have type but are not character and not
224  integer scalars are accepted so long as they are simply contiguous.
225  This legacy extension supports pre-Fortran'77 usage in which
226  variables initialized in DATA statements with Hollerith literals
227  as modifiable formats.
228* At runtime, `NAMELIST` input will skip over `NAMELIST` groups
229  with other names, and will treat text before and between groups
230  as if they were comment lines, even if not begun with `!`.
231* Commas are required in FORMAT statements and character variables
232  only when they prevent ambiguity.
233* Legacy names `AND`, `OR`, and `XOR` are accepted as aliases for
234  the standard intrinsic functions `IAND`, `IOR`, and `IEOR`
235  respectively.
236* A digit count of d=0 is accepted in Ew.0, Dw.0, and Gw.0 output
237  editing if no nonzero scale factor (kP) is in effect.
238
239### Extensions supported when enabled by options
240
241* C-style backslash escape sequences in quoted CHARACTER literals
242  (but not Hollerith) [-fbackslash]
243* Logical abbreviations `.T.`, `.F.`, `.N.`, `.A.`, `.O.`, and `.X.`
244  [-flogical-abbreviations]
245* `.XOR.` as a synonym for `.NEQV.` [-fxor-operator]
246* The default `INTEGER` type is required by the standard to occupy
247  the same amount of storage as the default `REAL` type.  Default
248  `REAL` is of course 32-bit IEEE-754 floating-point today.  This legacy
249  rule imposes an artificially small constraint in some cases
250  where Fortran mandates that something have the default `INTEGER`
251  type: specifically, the results of references to the intrinsic functions
252  `SIZE`, `STORAGE_SIZE`,`LBOUND`, `UBOUND`, `SHAPE`, and the location reductions
253  `FINDLOC`, `MAXLOC`, and `MINLOC` in the absence of an explicit
254  `KIND=` actual argument.  We return `INTEGER(KIND=8)` by default in
255  these cases when the `-flarge-sizes` option is enabled.
256  `SIZEOF` and `C_SIZEOF` always return `INTEGER(KIND=8)`.
257* Treat each specification-part like is has `IMPLICIT NONE`
258  [-fimplicit-none-type-always]
259* Ignore occurrences of `IMPLICIT NONE` and `IMPLICIT NONE(TYPE)`
260  [-fimplicit-none-type-never]
261* Old-style `PARAMETER pi=3.14` statement without parentheses
262  [-falternative-parameter-statement]
263
264### Extensions and legacy features deliberately not supported
265
266* `.LG.` as synonym for `.NE.`
267* `REDIMENSION`
268* Allocatable `COMMON`
269* Expressions in formats
270* `ACCEPT` as synonym for `READ *`
271* `TYPE` as synonym for `PRINT`
272* `ARRAY` as synonym for `DIMENSION`
273* `VIRTUAL` as synonym for `DIMENSION`
274* `ENCODE` and `DECODE` as synonyms for internal I/O
275* `IMPLICIT AUTOMATIC`, `IMPLICIT STATIC`
276* Default exponent of zero, e.g. `3.14159E`
277* Characters in defined operators that are neither letters nor digits
278* `B` suffix on unquoted octal constants
279* `Z` prefix on unquoted hexadecimal constants (dangerous)
280* `T` and `F` as abbreviations for `.TRUE.` and `.FALSE.` in DATA (PGI/XLF)
281* Use of host FORMAT labels in internal subprograms (PGI-only feature)
282* ALLOCATE(TYPE(derived)::...) as variant of correct ALLOCATE(derived::...) (PGI only)
283* Defining an explicit interface for a subprogram within itself (PGI only)
284* USE association of a procedure interface within that same procedure's definition
285* NULL() as a structure constructor expression for an ALLOCATABLE component (PGI).
286* Conversion of LOGICAL to INTEGER in expressions.
287* Use of INTEGER data with the intrinsic logical operators `.NOT.`, `.AND.`, `.OR.`,
288  and `.XOR.`.
289* IF (integer expression) THEN ... END IF  (PGI/Intel)
290* Comparsion of LOGICAL with ==/.EQ. rather than .EQV. (also .NEQV.) (PGI/Intel)
291* Procedure pointers in COMMON blocks (PGI/Intel)
292* Underindexing multi-dimensional arrays (e.g., A(1) rather than A(1,1)) (PGI only)
293* Legacy PGI `NCHARACTER` type and `NC` Kanji character literals
294* Using non-integer expressions for array bounds (e.g., REAL A(3.14159)) (PGI/Intel)
295* Mixing INTEGER types as operands to bit intrinsics (e.g., IAND); only two
296  compilers support it, and they disagree on sign extension.
297* Module & program names that conflict with an object inside the unit (PGI only).
298* When the same name is brought into scope via USE association from
299  multiple modules, the name must refer to a generic interface; PGI
300  allows a name to be a procedure from one module and a generic interface
301  from another.
302* Type parameter declarations must come first in a derived type definition;
303  some compilers allow them to follow `PRIVATE`, or be intermixed with the
304  component declarations.
305* Wrong argument types in calls to specific intrinsics that have different names than the
306  related generics. Some accepted exceptions are listed above in the allowed extensions.
307  PGI, Intel, and XLF support this in ways that are not numerically equivalent.
308  PGI converts the arguments while Intel and XLF replace the specific by the related generic.
309* VMS listing control directives (`%LIST`, `%NOLIST`, `%EJECT`)
310* Continuation lines on `INCLUDE` lines
311* `NULL()` actual argument corresponding to an `ALLOCATABLE` dummy data object
312
313## Preprocessing behavior
314
315* The preprocessor is always run, whatever the filename extension may be.
316* We respect Fortran comments in macro actual arguments (like GNU, Intel, NAG;
317  unlike PGI and XLF) on the principle that macro calls should be treated
318  like function references.  Fortran's line continuation methods also work.
319
320## Standard features not silently accepted
321
322* Fortran explicitly ignores type declaration statements when they
323  attempt to type the name of a generic intrinsic function (8.2 p3).
324  One can declare `CHARACTER::COS` and still get a real result
325  from `COS(3.14159)`, for example.  f18 will complain when a
326  generic intrinsic function's inferred result type does not
327  match an explicit declaration.  This message is a warning.
328
329## Standard features that might as well not be
330
331* f18 supports designators with constant expressions, properly
332  constrained, as initial data targets for data pointers in
333  initializers of variable and component declarations and in
334  `DATA` statements; e.g., `REAL, POINTER :: P => T(1:10:2)`.
335  This Fortran 2008 feature might as well be viewed like an
336  extension; no other compiler that we've tested can handle
337  it yet.
338
339## Behavior in cases where the standard is ambiguous or indefinite
340
341* When an inner procedure of a subprogram uses the value or an attribute
342  of an undeclared name in a specification expression and that name does
343  not appear in the host, it is not clear in the standard whether that
344  name is an implicitly typed local variable of the inner procedure or a
345  host association with an implicitly typed local variable of the host.
346  For example:
347```
348module module
349 contains
350  subroutine host(j)
351    ! Although "m" never appears in the specification or executable
352    ! parts of this subroutine, both of its contained subroutines
353    ! might be accessing it via host association.
354    integer, intent(in out) :: j
355    call inner1(j)
356    call inner2(j)
357   contains
358    subroutine inner1(n)
359      integer(kind(m)), intent(in) :: n
360      m = n + 1
361    end subroutine
362    subroutine inner2(n)
363      integer(kind(m)), intent(out) :: n
364      n = m + 2
365    end subroutine
366  end subroutine
367end module
368
369program demo
370  use module
371  integer :: k
372  k = 0
373  call host(k)
374  print *, k, " should be 3"
375end
376
377```
378
379  Other Fortran compilers disagree in their interpretations of this example;
380  some seem to treat the references to `m` as if they were host associations
381  to an implicitly typed variable (and print `3`), while others seem to
382  treat them as references to implicitly typed local variabless, and
383  load uninitialized values.
384
385  In f18, we chose to emit an error message for this case since the standard
386  is unclear, the usage is not portable, and the issue can be easily resolved
387  by adding a declaration.
388
389* In subclause 7.5.6.2 of Fortran 2018 the standard defines a partial ordering
390  of the final subroutine calls for finalizable objects, their non-parent
391  components, and then their parent components.
392  (The object is finalized, then the non-parent components of each element,
393  and then the parent component.)
394  Some have argued that the standard permits an implementation
395  to finalize the parent component before finalizing an allocatable component in
396  the context of deallocation, and the next revision of the language may codify
397  this option.
398  In the interest of avoiding needless confusion, this compiler implements what
399  we believe to be the least surprising order of finalization.
400  Specifically: all non-parent components are finalized before
401  the parent, allocatable or not;
402  all finalization takes place before any deallocation;
403  and no object or subobject will be finalized more than once.
404
405* When `RECL=` is set via the `OPEN` statement for a sequential formatted input
406  file, it functions as an effective maximum record length.
407  Longer records, if any, will appear as if they had been truncated to
408  the value of `RECL=`.
409  (Other compilers ignore `RECL=`, signal an error, or apply effective truncation
410  to some forms of input in this situation.)
411  For sequential formatted output, RECL= serves as a limit on record lengths
412  that raises an error when it is exceeded.
413