1====================
2Standard C++ Modules
3====================
4
5.. contents::
6   :local:
7
8Introduction
9============
10
11The term ``modules`` has a lot of meanings. For the users of Clang, modules may
12refer to ``Objective-C Modules``, ``Clang C++ Modules`` (or ``Clang Header Modules``,
13etc.) or ``Standard C++ Modules``. The implementation of all these kinds of modules in Clang
14has a lot of shared code, but from the perspective of users, their semantics and
15command line interfaces are very different. This document focuses on
16an introduction of how to use standard C++ modules in Clang.
17
18There is already a detailed document about `Clang modules <Modules.html>`_, it
19should be helpful to read `Clang modules <Modules.html>`_ if you want to know
20more about the general idea of modules. Since standard C++ modules have different semantics
21(and work flows) from `Clang modules`, this page describes the background and use of
22Clang with standard C++ modules.
23
24Modules exist in two forms in the C++ Language Specification. They can refer to
25either "Named Modules" or to "Header Units". This document covers both forms.
26
27Standard C++ Named modules
28==========================
29
30This document was intended to be a manual first and foremost, however, we consider it helpful to
31introduce some language background here for readers who are not familiar with
32the new language feature. This document is not intended to be a language
33tutorial; it will only introduce necessary concepts about the
34structure and building of the project.
35
36Background and terminology
37--------------------------
38
39Modules
40~~~~~~~
41
42In this document, the term ``Modules``/``modules`` refers to standard C++ modules
43feature if it is not decorated by ``Clang``.
44
45Clang Modules
46~~~~~~~~~~~~~
47
48In this document, the term ``Clang Modules``/``Clang modules`` refer to Clang
49c++ modules extension. These are also known as ``Clang header modules``,
50``Clang module map modules`` or ``Clang c++ modules``.
51
52Module and module unit
53~~~~~~~~~~~~~~~~~~~~~~
54
55A module consists of one or more module units. A module unit is a special
56translation unit. Every module unit must have a module declaration. The syntax
57of the module declaration is:
58
59.. code-block:: c++
60
61  [export] module module_name[:partition_name];
62
63Terms enclosed in ``[]`` are optional. The syntax of ``module_name`` and ``partition_name``
64in regex form corresponds to ``[a-zA-Z_][a-zA-Z_0-9\.]*``. In particular, a literal dot ``.``
65in the name has no semantic meaning (e.g. implying a hierarchy).
66
67In this document, module units are classified into:
68
69* Primary module interface unit.
70
71* Module implementation unit.
72
73* Module interface partition unit.
74
75* Internal module partition unit.
76
77A primary module interface unit is a module unit whose module declaration is
78``export module module_name;``. The ``module_name`` here denotes the name of the
79module. A module should have one and only one primary module interface unit.
80
81A module implementation unit is a module unit whose module declaration is
82``module module_name;``. A module could have multiple module implementation
83units with the same declaration.
84
85A module interface partition unit is a module unit whose module declaration is
86``export module module_name:partition_name;``. The ``partition_name`` should be
87unique within any given module.
88
89An internal module partition unit is a module unit whose module declaration
90is ``module module_name:partition_name;``. The ``partition_name`` should be
91unique within any given module.
92
93In this document, we use the following umbrella terms:
94
95* A ``module interface unit`` refers to either a ``primary module interface unit``
96  or a ``module interface partition unit``.
97
98* An ``importable module unit`` refers to either a ``module interface unit``
99  or a ``internal module partition unit``.
100
101* A ``module partition unit`` refers to either a ``module interface partition unit``
102  or a ``internal module partition unit``.
103
104Built Module Interface file
105~~~~~~~~~~~~~~~~~~~~~~~~~~~
106
107A ``Built Module Interface file`` stands for the precompiled result of an importable module unit.
108It is also called the acronym ``BMI`` genrally.
109
110Global module fragment
111~~~~~~~~~~~~~~~~~~~~~~
112
113In a module unit, the section from ``module;`` to the module declaration is called the global module fragment.
114
115
116How to build projects using modules
117-----------------------------------
118
119Quick Start
120~~~~~~~~~~~
121
122Let's see a "hello world" example that uses modules.
123
124.. code-block:: c++
125
126  // Hello.cppm
127  module;
128  #include <iostream>
129  export module Hello;
130  export void hello() {
131    std::cout << "Hello World!\n";
132  }
133
134  // use.cpp
135  import Hello;
136  int main() {
137    hello();
138    return 0;
139  }
140
141Then we type:
142
143.. code-block:: console
144
145  $ clang++ -std=c++20 Hello.cppm --precompile -o Hello.pcm
146  $ clang++ -std=c++20 use.cpp -fprebuilt-module-path=. Hello.pcm -o Hello.out
147  $ ./Hello.out
148  Hello World!
149
150In this example, we make and use a simple module ``Hello`` which contains only a
151primary module interface unit ``Hello.cppm``.
152
153Then let's see a little bit more complex "hello world" example which uses the 4 kinds of module units.
154
155.. code-block:: c++
156
157  // M.cppm
158  export module M;
159  export import :interface_part;
160  import :impl_part;
161  export void Hello();
162
163  // interface_part.cppm
164  export module M:interface_part;
165  export void World();
166
167  // impl_part.cppm
168  module;
169  #include <iostream>
170  #include <string>
171  module M:impl_part;
172  import :interface_part;
173
174  std::string W = "World.";
175  void World() {
176    std::cout << W << std::endl;
177  }
178
179  // Impl.cpp
180  module;
181  #include <iostream>
182  module M;
183  void Hello() {
184    std::cout << "Hello ";
185  }
186
187  // User.cpp
188  import M;
189  int main() {
190    Hello();
191    World();
192    return 0;
193  }
194
195Then we are able to compile the example by the following command:
196
197.. code-block:: console
198
199  # Precompiling the module
200  $ clang++ -std=c++20 interface_part.cppm --precompile -o M-interface_part.pcm
201  $ clang++ -std=c++20 impl_part.cppm --precompile -fprebuilt-module-path=. -o M-impl_part.pcm
202  $ clang++ -std=c++20 M.cppm --precompile -fprebuilt-module-path=. -o M.pcm
203  $ clang++ -std=c++20 Impl.cpp -fmodule-file=M.pcm -c -o Impl.o
204
205  # Compiling the user
206  $ clang++ -std=c++20 User.cpp -fprebuilt-module-path=. -c -o User.o
207
208  # Compiling the module and linking it together
209  $ clang++ -std=c++20 M-interface_part.pcm -c -o M-interface_part.o
210  $ clang++ -std=c++20 M-impl_part.pcm -c -o M-impl_part.o
211  $ clang++ -std=c++20 M.pcm -c -o M.o
212  $ clang++ User.o M-interface_part.o  M-impl_part.o M.o Impl.o -o a.out
213
214We explain the options in the following sections.
215
216How to enable standard C++ modules
217~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
218
219Currently, standard C++ modules are enabled automatically
220if the language standard is ``-std=c++20`` or newer.
221The ``-fmodules-ts`` option is deprecated and is planned to be removed.
222
223How to produce a BMI
224~~~~~~~~~~~~~~~~~~~~
225
226It is possible to generate a BMI for an importable module unit by specifying the ``--precompile`` option.
227
228File name requirement
229~~~~~~~~~~~~~~~~~~~~~
230
231The file name of an ``importable module unit`` should end with ``.cppm``
232(or ``.ccm``, ``.cxxm``, ``.c++m``). The file name of a ``module implementation unit``
233should end with ``.cpp`` (or ``.cc``, ``.cxx``, ``.c++``).
234
235The file name of BMIs should end with ``.pcm``.
236The file name of the BMI of a ``primary module interface unit`` should be ``module_name.pcm``.
237The file name of BMIs of ``module partition unit`` should be ``module_name-partition_name.pcm``.
238
239If the file names use different extensions, Clang may fail to build the module.
240For example, if the filename of an ``importable module unit`` ends with ``.cpp`` instead of ``.cppm``,
241then we can't generate a BMI for the ``importable module unit`` by ``--precompile`` option
242since ``--precompile`` option now would only run preprocessor, which is equal to `-E` now.
243If we want the filename of an ``importable module unit`` ends with other suffixes instead of ``.cppm``,
244we could put ``-x c++-module`` in front of the file. For example,
245
246.. code-block:: c++
247
248  // Hello.cpp
249  module;
250  #include <iostream>
251  export module Hello;
252  export void hello() {
253    std::cout << "Hello World!\n";
254  }
255
256  // use.cpp
257  import Hello;
258  int main() {
259    hello();
260    return 0;
261  }
262
263Now the filename of the ``module interface`` ends with ``.cpp`` instead of ``.cppm``,
264we can't compile them by the original command lines. But we are still able to do it by:
265
266.. code-block:: console
267
268  $ clang++ -std=c++20 -x c++-module Hello.cpp --precompile -o Hello.pcm
269  $ clang++ -std=c++20 use.cpp -fprebuilt-module-path=. Hello.pcm -o Hello.out
270  $ ./Hello.out
271  Hello World!
272
273How to specify the dependent BMIs
274~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
275
276The option ``-fprebuilt-module-path`` tells the compiler the path where to search for dependent BMIs.
277It may be used multiple times just like ``-I`` for specifying paths for header files. The look up rule here is:
278
279* (1) When we import module M. The compiler would look up M.pcm in the directories specified
280  by ``-fprebuilt-module-path``.
281* (2) When we import partition module unit M:P. The compiler would look up M-P.pcm in the
282  directories specified by ``-fprebuilt-module-path``.
283
284Another way to specify the dependent BMIs is to use ``-fmodule-file``. The main difference
285is that ``-fprebuilt-module-path`` takes a directory, whereas ``-fmodule-file`` requires a
286specific file. In case both the ``-fprebuilt-module-path`` and ``-fmodule-file`` exist, the
287``-fmodule-file`` option takes higher precedence. In another word, if the compiler finds the wanted
288BMI specified by ``-fmodule-file``, the compiler wouldn't look up again in the directories specified
289by ``-fprebuilt-module-path``.
290
291When we compile a ``module implementation unit``, we must pass the BMI of the corresponding
292``primary module interface unit`` by ``-fmodule-file``
293since the language specification says a module implementation unit implicitly imports
294the primary module interface unit.
295
296  [module.unit]p8
297
298  A module-declaration that contains neither an export-keyword nor a module-partition implicitly
299  imports the primary module interface unit of the module as if by a module-import-declaration.
300
301Again, the option ``-fmodule-file`` may occur multiple times.
302For example, the command line to compile ``M.cppm`` in
303the above example could be rewritten into:
304
305.. code-block:: console
306
307  $ clang++ -std=c++20 M.cppm --precompile -fmodule-file=M-interface_part.pcm -fmodule-file=M-impl_part.pcm -o M.pcm
308
309``-fprebuilt-module-path`` is more convenient and ``-fmodule-file`` is faster since
310it saves time for file lookup.
311
312Remember that module units still have an object counterpart to the BMI
313~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
314
315It is easy to forget to compile BMIs at first since we may envision module interfaces like headers.
316However, this is not true.
317Module units are translation units. We need to compile them to object files
318and link the object files like the example shows.
319
320For example, the traditional compilation processes for headers are like:
321
322.. code-block:: text
323
324  src1.cpp -+> clang++ src1.cpp --> src1.o ---,
325  hdr1.h  --'                                 +-> clang++ src1.o src2.o ->  executable
326  hdr2.h  --,                                 |
327  src2.cpp -+> clang++ src2.cpp --> src2.o ---'
328
329And the compilation process for module units are like:
330
331.. code-block:: text
332
333                src1.cpp ----------------------------------------+> clang++ src1.cpp -------> src1.o -,
334  (header unit) hdr1.h    -> clang++ hdr1.h ...    -> hdr1.pcm --'                                    +-> clang++ src1.o mod1.o src2.o ->  executable
335                mod1.cppm -> clang++ mod1.cppm ... -> mod1.pcm --,--> clang++ mod1.pcm ... -> mod1.o -+
336                src2.cpp ----------------------------------------+> clang++ src2.cpp -------> src2.o -'
337
338As the diagrams show, we need to compile the BMI from module units to object files and link the object files.
339(But we can't do this for the BMI from header units. See the later section for the definition of header units)
340
341If we want to create a module library, we can't just ship the BMIs in an archive.
342We must compile these BMIs(``*.pcm``) into object files(``*.o``) and add those object files to the archive instead.
343
344Consistency Requirement
345~~~~~~~~~~~~~~~~~~~~~~~
346
347If we envision modules as a cache to speed up compilation, then - as with other caching techniques -
348it is important to keep cache consistency.
349So **currently** Clang will do very strict check for consistency.
350
351Options consistency
352^^^^^^^^^^^^^^^^^^^
353
354The language option of module units and their non-module-unit users should be consistent.
355The following example is not allowed:
356
357.. code-block:: c++
358
359  // M.cppm
360  export module M;
361
362  // Use.cpp
363  import M;
364
365.. code-block:: console
366
367  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
368  $ clang++ -std=c++2b Use.cpp -fprebuilt-module-path=.
369
370The compiler would reject the example due to the inconsistent language options.
371Not all options are language options.
372For example, the following example is allowed:
373
374.. code-block:: console
375
376  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
377  # Inconsistent optimization level.
378  $ clang++ -std=c++20 -O3 Use.cpp -fprebuilt-module-path=.
379  # Inconsistent debugging level.
380  $ clang++ -std=c++20 -g Use.cpp -fprebuilt-module-path=.
381
382Although the two examples have inconsistent optimization and debugging level, both of them are accepted.
383
384Note that **currently** the compiler doesn't consider inconsistent macro definition a problem. For example:
385
386.. code-block:: console
387
388  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
389  # Inconsistent optimization level.
390  $ clang++ -std=c++20 -O3 -DNDEBUG Use.cpp -fprebuilt-module-path=.
391
392Currently Clang would accept the above example. But it may produce surprising results if the
393debugging code depends on consistent use of ``NDEBUG`` also in other translation units.
394
395Source content consistency
396^^^^^^^^^^^^^^^^^^^^^^^^^^
397
398When the compiler reads a BMI, the compiler will check the consistency of the corresponding
399source files. For example:
400
401.. code-block:: c++
402
403  // M.cppm
404  export module M;
405  export template <class T>
406  T foo(T t) {
407    return t;
408  }
409
410  // Use.cpp
411  import M;
412  void bar() {
413    foo(5);
414  }
415
416.. code-block:: console
417
418  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
419  $ rm M.cppm
420  $ clang++ -std=c++20 Use.cpp -fmodule-file=M.pcm
421
422The compiler would reject the example since the compiler failed to find the source file to check the consistency.
423So the following example would be rejected too.
424
425.. code-block:: console
426
427  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
428  $ echo "int i=0;" >> M.cppm
429  $ clang++ -std=c++20 Use.cpp -fmodule-file=M.pcm
430
431The compiler would reject it too since the compiler detected the file was changed.
432
433But it is OK to move the BMI as long as the source files remain:
434
435.. code-block:: console
436
437  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
438  $ mkdir -p tmp
439  $ mv M.pcm tmp/M.pcm
440  $ clang++ -std=c++20 Use.cpp -fmodule-file=tmp/M.pcm
441
442The above example would be accepted.
443
444If the user doesn't want to follow the consistency requirement due to some reasons (e.g., distributing BMI),
445the user could try to use ``-Xclang -fmodules-embed-all-files`` when producing BMI. For example:
446
447.. code-block:: console
448
449  $ clang++ -std=c++20 M.cppm --precompile -Xclang -fmodules-embed-all-files -o M.pcm
450  $ rm M.cppm
451  $ clang++ -std=c++20 Use.cpp -fmodule-file=M.pcm
452
453Now the compiler would accept the above example.
454Important note: Xclang options are intended to be used by compiler internally and its semantics
455are not guaranteed to be preserved in future versions.
456
457Also the compiler will record the path to the header files included in the global module fragment and compare the
458headers when imported. For example,
459
460.. code-block:: c++
461
462  // foo.h
463  #include <iostream>
464  void Hello() {
465    std::cout << "Hello World.\n";
466  }
467
468  // foo.cppm
469  module;
470  #include "foo.h"
471  export module foo;
472  export using ::Hello;
473
474  // Use.cpp
475  import foo;
476  int main() {
477    Hello();
478  }
479
480Then it is problematic if we remove ``foo.h`` before import `foo` module.
481
482.. code-block:: console
483
484  clang++ -std=c++20 foo.cppm --precompile  -o foo.pcm
485	mv foo.h foo.orig.h
486  # The following one is rejected
487	clang++ -std=c++20 Use.cpp -fmodule-file=foo.pcm -c
488
489The above case will rejected. And we're still able to workaround it by ``-Xclang -fmodules-embed-all-files`` option:
490
491.. code-block:: console
492
493  clang++ -std=c++20 foo.cppm --precompile  -Xclang -fmodules-embed-all-files -o foo.pcm
494	mv foo.h foo.orig.h
495	clang++ -std=c++20 Use.cpp -fmodule-file=foo.pcm -c -o Use.o
496	clang++ Use.o foo.pcm
497
498ABI Impacts
499-----------
500
501The declarations in a module unit which are not in the global module fragment have new linkage names.
502
503For example,
504
505.. code-block:: c++
506
507  export module M;
508  namespace NS {
509    export int foo();
510  }
511
512The linkage name of ``NS::foo()`` would be ``_ZN2NSW1M3fooEv``.
513This couldn't be demangled by previous versions of the debugger or demangler.
514As of LLVM 15.x, users can utilize ``llvm-cxxfilt`` to demangle this:
515
516.. code-block:: console
517
518  $ llvm-cxxfilt _ZN2NSW1M3fooEv
519
520The result would be ``NS::foo@M()``, which reads as ``NS::foo()`` in module ``M``.
521
522The ABI implies that we can't declare something in a module unit and define it in a non-module unit (or vice-versa),
523as this would result in linking errors.
524
525Known Problems
526--------------
527
528The following describes issues in the current implementation of modules.
529Please see https://github.com/llvm/llvm-project/labels/clang%3Amodules for more issues
530or file a new issue if you don't find an existing one.
531If you're going to create a new issue for standard C++ modules,
532please start the title with ``[C++20] [Modules]`` (or ``[C++2b] [Modules]``, etc)
533and add the label ``clang:modules`` (if you have permissions for that).
534
535For higher level support for proposals, you could visit https://clang.llvm.org/cxx_status.html.
536
537Support for clang-scan-deps
538~~~~~~~~~~~~~~~~~~~~~~~~~~~
539
540The support for clang-scan-deps may be the most urgent problem for modules now.
541Without the support for clang-scan-deps, it's hard to involve build systems.
542This means that users could only play with modules through makefiles or by writing a parser by hand.
543It blocks more uses for modules, which will block more defect reports or requirements.
544
545This is tracked in: https://github.com/llvm/llvm-project/issues/51792.
546
547Ambiguous deduction guide
548~~~~~~~~~~~~~~~~~~~~~~~~~
549
550Currently, when we call deduction guides in global module fragment,
551we may get incorrect diagnosing message like: `ambiguous deduction`.
552
553So if we're using deduction guide from global module fragment, we probably need to write:
554
555.. code-block:: c++
556
557  std::lock_guard<std::mutex> lk(mutex);
558
559instead of
560
561.. code-block:: c++
562
563  std::lock_guard lk(mutex);
564
565This is tracked in: https://github.com/llvm/llvm-project/issues/56916
566
567Ignored PreferredName Attribute
568~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
569
570Due to a tricky problem, when Clang writes BMIs, Clang will ignore the ``preferred_name`` attribute, if any.
571This implies that the ``preferred_name`` wouldn't show in debugger or dumping.
572
573This is tracked in: https://github.com/llvm/llvm-project/issues/56490
574
575Don't emit macros about module declaration
576~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
577
578This is covered by P1857R3. We mention it again here since users may abuse it before we implement it.
579
580Someone may want to write code which could be compiled both by modules or non-modules.
581A direct idea would be use macros like:
582
583.. code-block:: c++
584
585  MODULE
586  IMPORT header_name
587  EXPORT_MODULE MODULE_NAME;
588  IMPORT header_name
589  EXPORT ...
590
591So this file could be triggered like a module unit or a non-module unit depending on the definition
592of some macros.
593However, this kind of usage is forbidden by P1857R3 but we haven't implemented P1857R3 yet.
594This means that is possible to write illegal modules code now, and obviously this will stop working
595once P1857R3 is implemented.
596A simple suggestion would be "Don't play macro tricks with module declarations".
597
598This is tracked in: https://github.com/llvm/llvm-project/issues/56917
599
600In consistent filename suffix requirement for importable module units
601~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
602
603Currently, clang requires the file name of an ``importable module unit`` should end with ``.cppm``
604(or ``.ccm``, ``.cxxm``, ``.c++m``). However, the behavior is inconsistent with other compilers.
605
606This is tracked in: https://github.com/llvm/llvm-project/issues/57416
607
608Header Units
609============
610
611How to build projects using header unit
612---------------------------------------
613
614Quick Start
615~~~~~~~~~~~
616
617For the following example,
618
619.. code-block:: c++
620
621  import <iostream>;
622  int main() {
623    std::cout << "Hello World.\n";
624  }
625
626we could compile it as
627
628.. code-block:: console
629
630  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
631  $ clang++ -std=c++20 -fmodule-file=iostream.pcm main.cpp
632
633How to produce BMIs
634~~~~~~~~~~~~~~~~~~~
635
636Similar to named modules, we could use ``--precompile`` to produce the BMI.
637But we need to specify that the input file is a header by ``-xc++-system-header`` or ``-xc++-user-header``.
638
639Also we could use `-fmodule-header={user,system}` option to produce the BMI for header units
640which has suffix like `.h` or `.hh`.
641The value of `-fmodule-header` means the user search path or the system search path.
642The default value for `-fmodule-header` is `user`.
643For example,
644
645.. code-block:: c++
646
647  // foo.h
648  #include <iostream>
649  void Hello() {
650    std::cout << "Hello World.\n";
651  }
652
653  // use.cpp
654  import "foo.h";
655  int main() {
656    Hello();
657  }
658
659We could compile it as:
660
661.. code-block:: console
662
663  $ clang++ -std=c++20 -fmodule-header foo.h -o foo.pcm
664  $ clang++ -std=c++20 -fmodule-file=foo.pcm use.cpp
665
666For headers which don't have a suffix, we need to pass ``-xc++-header``
667(or ``-xc++-system-header`` or ``-xc++-user-header``) to mark it as a header.
668For example,
669
670.. code-block:: c++
671
672  // use.cpp
673  import "foo.h";
674  int main() {
675    Hello();
676  }
677
678.. code-block:: console
679
680  $ clang++ -std=c++20 -fmodule-header=system -xc++-header iostream -o iostream.pcm
681  $ clang++ -std=c++20 -fmodule-file=iostream.pcm use.cpp
682
683How to specify the dependent BMIs
684~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
685
686We could use ``-fmodule-file`` to specify the BMIs, and this option may occur multiple times as well.
687
688With the existing implementation ``-fprebuilt-module-path`` cannot be used for header units
689(since they are nominally anonymous).
690For header units, use  ``-fmodule-file`` to include the relevant PCM file for each header unit.
691
692This is expect to be solved in future editions of the compiler either by the tooling finding and specifying
693the -fmodule-file or by the use of a module-mapper that understands how to map the header name to their PCMs.
694
695Don't compile the BMI
696~~~~~~~~~~~~~~~~~~~~~
697
698Another difference with modules is that we can't compile the BMI from a header unit.
699For example:
700
701.. code-block:: console
702
703  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
704  # This is not allowed!
705  $ clang++ iostream.pcm -c -o iostream.o
706
707It makes sense due to the semantics of header units, which are just like headers.
708
709Include translation
710~~~~~~~~~~~~~~~~~~~
711
712The C++ spec allows the vendors to convert ``#include header-name`` to ``import header-name;`` when possible.
713Currently, Clang would do this translation for the ``#include`` in the global module fragment.
714
715For example, the following two examples are the same:
716
717.. code-block:: c++
718
719  module;
720  import <iostream>;
721  export module M;
722  export void Hello() {
723    std::cout << "Hello.\n";
724  }
725
726with the following one:
727
728.. code-block:: c++
729
730  module;
731  #include <iostream>
732  export module M;
733  export void Hello() {
734      std::cout << "Hello.\n";
735  }
736
737.. code-block:: console
738
739  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
740  $ clang++ -std=c++20 -fmodule-file=iostream.pcm --precompile M.cppm -o M.cpp
741
742In the latter example, the Clang could find the BMI for the ``<iostream>``
743so it would try to replace the ``#include <iostream>`` to ``import <iostream>;`` automatically.
744
745
746Relationships between Clang modules
747-----------------------------------
748
749Header units have pretty similar semantics with Clang modules.
750The semantics of both of them are like headers.
751
752In fact, we could even "mimic" the sytle of header units by Clang modules:
753
754.. code-block:: c++
755
756  module "iostream" {
757    export *
758    header "/path/to/libstdcxx/iostream"
759  }
760
761.. code-block:: console
762
763  $ clang++ -std=c++20 -fimplicit-modules -fmodule-map-file=.modulemap main.cpp
764
765It would be simpler if we are using libcxx:
766
767.. code-block:: console
768
769  $ clang++ -std=c++20 main.cpp -fimplicit-modules -fimplicit-module-maps
770
771Since there is already one
772`module map <https://github.com/llvm/llvm-project/blob/main/libcxx/include/module.modulemap.in>`_
773in the source of libcxx.
774
775Then immediately leads to the question: why don't we implement header units through Clang header modules?
776
777The main reason for this is that Clang modules have more semantics like hierarchy or
778wrapping multiple headers together as a big module.
779However, these things are not part of Standard C++ Header units,
780and we want to avoid the impression that these additional semantics get interpreted as Standard C++ behavior.
781
782Another reason is that there are proposals to introduce module mappers to the C++ standard
783(for example, https://wg21.link/p1184r2).
784If we decide to reuse Clang's modulemap, we may get in trouble once we need to introduce another module mapper.
785
786So the final answer for why we don't reuse the interface of Clang modules for header units is that
787there are some differences between header units and Clang modules and that ignoring those
788differences now would likely become a problem in the future.
789
790Possible Questions
791==================
792
793How modules speed up compilation
794--------------------------------
795
796A classic theory for the reason why modules speed up the compilation is:
797if there are ``n`` headers and ``m`` source files and each header is included by each source file,
798then the complexity of the compilation is ``O(n*m)``;
799But if there are ``n`` module interfaces and ``m`` source files, the complexity of the compilation is
800``O(n+m)``. So, using modules would be a big win when scaling.
801In a simpler word, we could get rid of many redundant compilations by using modules.
802
803Roughly, this theory is correct. But the problem is that it is too rough.
804The behavior depends on the optimization level, as we will illustrate below.
805
806First is ``O0``. The compilation process is described in the following graph.
807
808.. code-block:: none
809
810  ├-------------frontend----------┼-------------middle end----------------┼----backend----┤
811  │                               │                                       │               │
812  └---parsing----sema----codegen--┴----- transformations ---- codegen ----┴---- codegen --┘
813
814  ┌---------------------------------------------------------------------------------------┐
815  |                                                                                       │
816  |                                     source file                                       │
817  |                                                                                       │
818  └---------------------------------------------------------------------------------------┘
819
820              ┌--------┐
821              │        │
822              │imported│
823              │        │
824              │  code  │
825              │        │
826              └--------┘
827
828Here we can see that the source file (could be a non-module unit or a module unit) would get processed by the
829whole pipeline.
830But the imported code would only get involved in semantic analysis, which is mainly about name lookup,
831overload resolution and template instantiation.
832All of these processes are fast relative to the whole compilation process.
833More importantly, the imported code only needs to be processed once in frontend code generation,
834as well as the whole middle end and backend.
835So we could get a big win for the compilation time in O0.
836
837But with optimizations, things are different:
838
839(we omit ``code generation`` part for each end due to the limited space)
840
841.. code-block:: none
842
843  ├-------- frontend ---------┼--------------- middle end --------------------┼------ backend ----┤
844  │                           │                                               │                   │
845  └--- parsing ---- sema -----┴--- optimizations --- IPO ---- optimizations---┴--- optimizations -┘
846
847  ┌-----------------------------------------------------------------------------------------------┐
848  │                                                                                               │
849  │                                         source file                                           │
850  │                                                                                               │
851  └-----------------------------------------------------------------------------------------------┘
852                ┌---------------------------------------┐
853                │                                       │
854                │                                       │
855                │            imported code              │
856                │                                       │
857                │                                       │
858                └---------------------------------------┘
859
860It would be very unfortunate if we end up with worse performance after using modules.
861The main concern is that when we compile a source file, the compiler needs to see the function body
862of imported module units so that it can perform IPO (InterProcedural Optimization, primarily inlining
863in practice) to optimize functions in current source file with the help of the information provided by
864the imported module units.
865In other words, the imported code would be processed again and again in importee units
866by optimizations (including IPO itself).
867The optimizations before IPO and the IPO itself are the most time-consuming part in whole compilation process.
868So from this perspective, we might not be able to get the improvements described in the theory.
869But we could still save the time for optimizations after IPO and the whole backend.
870
871Overall, at ``O0`` the implementations of functions defined in a module will not impact module users,
872but at higher optimization levels the definitions of such functions are provided to user compilations for the
873purposes of optimization (but definitions of these functions are still not included in the use's object file)-
874this means the build speedup at higher optimization levels may be lower than expected given ``O0`` experience,
875but does provide by more optimization opportunities.
876
877