<!--===- docs/ModFiles.md

   Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
   See https://llvm.org/LICENSE.txt for license information.
   SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

-->

# Module Files

Module files hold information from a module that is necessary to compile
program units that depend on the module.

## Name

Module files must be searchable by module name. They are typically named
`<modulename>.mod`. The advantage of using `.mod` is that it is consistent with
other compilers, so users will recognize the files. Also, makefiles and scripts
often use `rm *.mod` to clean up.

The disadvantage of using the same name as other compilers is that it is not
clear which compiler created a `.mod` file, and files from multiple compilers
cannot be in the same directory. This could be solved by adding something
between the module name and extension, e.g. `<modulename>-f18.mod`.

## Format

Module files will be Fortran source.
Declarations of all visible entities will be included, along with private
entities that they depend on.
Entity declarations that span multiple statements will be collapsed into
a single *type-declaration-statement*.
Executable statements will be omitted.

### Header

There will be a header containing extra information that cannot be expressed
in Fortran. This will take the form of a comment or directive
at the beginning of the file.

If it's a comment, the module file reader would have to strip it out and
perform *ad hoc* parsing on it. If it's a directive, the compiler could
parse it like other directives as part of the grammar.
Processing the header before parsing might result in better error messages
when the `.mod` file is invalid.

Regardless of whether the header is a comment or a directive, we can use the
same string to introduce it: `!mod$`.

Information in the header:
- Magic string to confirm it is an f18 `.mod` file
- Version information: the version of the file format, in case it changes,
  and the version of the compiler that wrote the file, for diagnostics.
- Checksum of the body of the current file
- The modules this module depends on, each with the checksum its module file
  had when the current module file was created
- The source file that produced the `.mod` file? This could be used in error messages.

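As a rough illustration of the fields listed above, the sketch below renders them as a single `!mod$` line. Only the `!mod$` introducer comes from this document. The field layout (`v1`, `compiler:`, `sum:`, `dep:`, `src:`), the `ModHeader` class, and the use of a truncated SHA-256 digest are all assumptions made for the example, not the format the compiler uses.

```python
import hashlib
from dataclasses import dataclass, field


def checksum(text: str) -> str:
    """Short hex digest of some text (hash choice and length are assumptions)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


@dataclass
class ModHeader:
    """Hypothetical container for the header fields listed above."""
    format_version: int
    compiler_version: str
    body_checksum: str
    dependencies: dict = field(default_factory=dict)  # module name -> checksum
    source_file: str = ""

    def render(self) -> str:
        """Render the header as one line introduced by `!mod$`.

        Fields are space-separated, so this sketch does not handle
        values that themselves contain spaces.
        """
        parts = [
            "!mod$",
            f"v{self.format_version}",
            f"compiler:{self.compiler_version}",
            f"sum:{self.body_checksum}",
        ]
        parts += [f"dep:{name}={digest}" for name, digest in self.dependencies.items()]
        if self.source_file:
            parts.append(f"src:{self.source_file}")
        return " ".join(parts)


# Example: build a header for a tiny module body.
body = "module m\ninteger::n\nend\n"
header = ModHeader(1, "f18-0.1", checksum(body), source_file="m.f90")
print(header.render())
```

Because the line starts with `!`, a Fortran parser that does not know about the header would treat it as an ordinary comment.
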
### Body

The body will consist of minimal Fortran source for the required declarations.
Declarations will appear in the order in which they first appeared in the source.

Some normalization will take place:
- extraneous spaces will be removed
- implicit types will be made explicit
- attributes will be written in a consistent order
- the statements that declare an entity will be combined into a single declaration
- function return types specified in a *prefix-spec* will be replaced by
  an entity declaration
- etc.

#### Symbols included

All public symbols from the module need to be included.

In addition, some private symbols are needed:
- private types that appear in the public API
- private components of non-private derived types
- private parameters used in non-private declarations (initial values, kind parameters)
- others?

It might be possible to anonymize private names if users don't want them exposed
in the `.mod` file. (Currently they are readable in PGI `.mod` files.)

#### USE association

A module that contains `USE` statements needs them represented in the
`.mod` file.
Each use-associated symbol will be written as a separate *use-only* statement,
possibly with renaming.

Alternatives:
- Emit a single `USE` for each module, listing all of the symbols that were
  use-associated in the *only-list*.
- Detect when all of the symbols from a module are imported (either by a *use-stmt*
  without an *only-list* or because all of the public symbols of the module
  have been listed in *only-list*s). In that case collapse them into a single *use-stmt*.
- Emit the *use-stmt*s that appeared in the original source.

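To make the difference between the chosen scheme and the first alternative concrete, here is a small sketch that generates both forms from the same list of use-associated symbols. The `UseAssoc` record and both function names are hypothetical, and the exact spacing of the emitted statements is an assumption.

```python
from collections import defaultdict
from typing import NamedTuple


class UseAssoc(NamedTuple):
    """Hypothetical record of one use-associated symbol."""
    module: str      # module the symbol comes from
    local_name: str  # name visible in the using module
    use_name: str    # name of the symbol in the module it comes from


def only_item(assoc: UseAssoc) -> str:
    """Format one only-list entry, using `local=>use` when the symbol is renamed."""
    if assoc.local_name == assoc.use_name:
        return assoc.local_name
    return f"{assoc.local_name}=>{assoc.use_name}"


def one_use_per_symbol(assocs):
    """Chosen scheme: a separate use-only statement for every symbol."""
    return [f"use {a.module},only:{only_item(a)}" for a in assocs]


def one_use_per_module(assocs):
    """First alternative: one USE per module with all symbols in its only-list."""
    by_module = defaultdict(list)
    for a in assocs:
        by_module[a.module].append(only_item(a))
    # Dicts keep insertion order, so modules appear in first-use order.
    return [f"use {m},only:{','.join(items)}" for m, items in by_module.items()]


assocs = [UseAssoc("m1", "a", "a"), UseAssoc("m2", "x", "y"), UseAssoc("m1", "b", "c")]
print("\n".join(one_use_per_symbol(assocs)))
print("\n".join(one_use_per_module(assocs)))
```

The per-symbol form is longer but needs no grouping step and keeps each symbol in its original position in the output.
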
## Reading and writing module files

### Options

The compiler will have command-line options to specify where to search
for module files and where to write them. By default it will be the current
directory for both.

For PGI, `-I` specifies directories to search for include files and module
files. `-module` specifies a directory in which to write module files and also
to search for them. gfortran is similar except it uses `-J` instead of `-module`.

The search order for module files is:
1. The `-module` directory (Note: for gfortran the `-J` directory is not searched).
2. The current directory
3. The `-I` directories in the order they appear on the command line

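The search order above can be sketched as a short lookup routine. The function name `find_module_file` and its parameters are hypothetical, and the sketch assumes the plain `<modulename>.mod` naming scheme.

```python
from pathlib import Path
from typing import Optional, Sequence


def find_module_file(
    name: str,
    module_dir: Optional[str],
    include_dirs: Sequence[str],
    cwd: str = ".",
) -> Optional[Path]:
    """Return the first `<name>.mod` found in the documented search order."""
    candidates = []
    if module_dir is not None:
        candidates.append(Path(module_dir))           # 1. the -module directory
    candidates.append(Path(cwd))                      # 2. the current directory
    candidates.extend(Path(d) for d in include_dirs)  # 3. -I directories, in order
    for directory in candidates:
        path = directory / f"{name}.mod"
        if path.is_file():
            return path
    return None  # not found anywhere on the search path
```

For example, `find_module_file("m", "build/mods", ["inc1", "inc2"])` would return the copy in `build/mods` if one exists there, even when `inc1/m.mod` also exists.
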
### Writing module files

When writing a module file, if the existing one matches what would be written,
the timestamp is not updated.

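One simple way to get this behavior is to compare the new contents with the file on disk and skip the write when they are identical. The sketch below assumes that approach; the helper name `write_if_changed` is hypothetical. Presumably the point is to keep timestamp-based build tools from rebuilding dependents unnecessarily, although this document does not state the motivation.

```python
from pathlib import Path


def write_if_changed(path: Path, contents: str) -> bool:
    """Write `contents` to `path` unless the file already holds exactly that text.

    Returns True if the file was written, False if it was left untouched.
    """
    if path.is_file() and path.read_text(encoding="utf-8") == contents:
        return False  # identical: leave the file, and so its timestamp, alone
    path.write_text(contents, encoding="utf-8")
    return True
```
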
Module files will be written after semantics, i.e. after the compiler has
determined the module is valid Fortran.<br>
**NOTE:** PGI sometimes creates `.mod` files even when the module has a
compilation error.

Question: If the compiler can get far enough to determine it is compiling a module
but then encounters an error, should it delete the existing `.mod` file?
PGI does not, gfortran does.

### Reading module files

When the compiler finds a `.mod` file it needs to read, it first checks the first
line and verifies that it is a valid module file. It can also verify the checksums
of the modules it depends on and report if they are out of date.

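A minimal sketch of these checks follows. It assumes a hypothetical header layout in which the first line starts with `!mod$` and carries exactly one `sum:<hex>` field for the body checksum; the function names and the truncated SHA-256 digest are also assumptions.

```python
import hashlib

MAGIC = "!mod$"  # introducer string named in this document


def body_checksum(body: str) -> str:
    """Short hex digest of the module file body (hash choice is an assumption)."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()[:16]


def check_module_file(text: str) -> str:
    """Validate the first line and the body checksum; return the body on success."""
    header, _, body = text.partition("\n")
    fields = header.split()
    if not fields or fields[0] != MAGIC:
        raise ValueError("not a valid module file: missing !mod$ header")
    sums = [f[len("sum:"):] for f in fields[1:] if f.startswith("sum:")]
    if len(sums) != 1 or sums[0] != body_checksum(body):
        raise ValueError("module file is corrupt: checksum mismatch")
    return body


def stale_dependencies(recorded: dict, current: dict) -> list:
    """Names of modules whose current checksum differs from the recorded one.

    `recorded` maps module name -> checksum stored in this file's header;
    `current` maps module name -> checksum of that module's file on disk.
    """
    return [name for name, digest in recorded.items() if current.get(name) != digest]
```
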
If the header is valid, the module file will be run through the parser and name
resolution to recreate the symbols from the module. Once the symbol table is
populated, the parse tree can be discarded.

When processing `.mod` files we know they are valid Fortran with these properties:
1. The input (without the header) is already in the "cooked input" format.
2. No preprocessing is necessary.
3. No errors can occur.

## Error messages referring to modules

With this design, diagnostics can refer to names in modules and can emit a
normalized declaration of an entity, but they cannot point to its location in
the source.

If the header includes the source file it came from, that could be included in
a diagnostic, but we still wouldn't have line numbers.

To provide line numbers and character positions, or source lines as the user
wrote them, we would have to save some amount of provenance information in the
module file as well.
