====================
Standard C++ Modules
====================

.. contents::
   :local:

Introduction
============

The term ``modules`` has a lot of meanings. For the users of Clang, modules may
refer to ``Objective-C Modules``, ``Clang C++ Modules`` (or ``Clang Header Modules``,
etc.) or ``Standard C++ Modules``. The implementation of all these kinds of modules in Clang
has a lot of shared code, but from the perspective of users, their semantics and
command line interfaces are very different. This document focuses on
how to use standard C++ modules in Clang.

There is already a detailed document about `Clang modules <Modules.html>`_;
reading it is helpful if you want to know
more about the general idea of modules. Since standard C++ modules have different semantics
(and workflows) from ``Clang modules``, this page describes the background and use of
Clang with standard C++ modules.

Modules exist in two forms in the C++ Language Specification. They can refer to
either "Named Modules" or to "Header Units". This document covers both forms.

Standard C++ Named modules
==========================

This document is intended to be a manual first and foremost; however, we consider it helpful to
introduce some language background here for readers who are not familiar with
the new language feature. This document is not intended to be a language
tutorial; it only introduces the concepts needed to understand the
structure and building of a project.

Background and terminology
--------------------------

Modules
~~~~~~~

In this document, the term ``Modules``/``modules`` refers to the standard C++ modules
feature unless it is qualified by ``Clang``.

Clang Modules
~~~~~~~~~~~~~

In this document, the term ``Clang Modules``/``Clang modules`` refers to the Clang
C++ modules extension.
These are also known as ``Clang header modules``,
``Clang module map modules`` or ``Clang c++ modules``.

Module and module unit
~~~~~~~~~~~~~~~~~~~~~~

A module consists of one or more module units. A module unit is a special
translation unit. Every module unit must have a module declaration. The syntax
of the module declaration is:

.. code-block:: c++

  [export] module module_name[:partition_name];

Terms enclosed in ``[]`` are optional. The syntax of ``module_name`` and ``partition_name``
in regex form corresponds to ``[a-zA-Z_][a-zA-Z_0-9\.]*``. In particular, a literal dot ``.``
in the name has no semantic meaning (for example, it does not imply a hierarchy).

In this document, module units are classified into:

* Primary module interface unit.

* Module implementation unit.

* Module interface partition unit.

* Internal module partition unit.

A primary module interface unit is a module unit whose module declaration is
``export module module_name;``. The ``module_name`` here denotes the name of the
module. A module must have exactly one primary module interface unit.

A module implementation unit is a module unit whose module declaration is
``module module_name;``. A module may have multiple module implementation
units with the same declaration.

A module interface partition unit is a module unit whose module declaration is
``export module module_name:partition_name;``. The ``partition_name`` must be
unique within any given module.

An internal module partition unit is a module unit whose module declaration
is ``module module_name:partition_name;``. The ``partition_name`` must be
unique within any given module.

In this document, we use the following umbrella terms:

* A ``module interface unit`` refers to either a ``primary module interface unit``
  or a ``module interface partition unit``.

* An ``importable module unit`` refers to either a ``module interface unit``
  or an ``internal module partition unit``.

* A ``module partition unit`` refers to either a ``module interface partition unit``
  or an ``internal module partition unit``.

Built Module Interface file
~~~~~~~~~~~~~~~~~~~~~~~~~~~

A ``Built Module Interface file`` is the precompiled result of an importable module unit.
It is commonly abbreviated as ``BMI``.

Global module fragment
~~~~~~~~~~~~~~~~~~~~~~

In a module unit, the section from ``module;`` to the module declaration is called the global module fragment.


How to build projects using modules
-----------------------------------

Quick Start
~~~~~~~~~~~

Let's see a "hello world" example that uses modules.

.. code-block:: c++

  // Hello.cppm
  module;
  #include <iostream>
  export module Hello;
  export void hello() {
    std::cout << "Hello World!\n";
  }

  // use.cpp
  import Hello;
  int main() {
    hello();
    return 0;
  }

Then we type:

.. code-block:: console

  $ clang++ -std=c++20 Hello.cppm --precompile -o Hello.pcm
  $ clang++ -std=c++20 use.cpp -fprebuilt-module-path=. Hello.pcm -o Hello.out
  $ ./Hello.out
  Hello World!

In this example, we make and use a simple module ``Hello`` which contains only a
primary module interface unit ``Hello.cppm``.

Next, let's look at a slightly more complex "hello world" example that uses all four kinds of module units.

.. code-block:: c++

  // M.cppm
  export module M;
  export import :interface_part;
  import :impl_part;
  export void Hello();

  // interface_part.cppm
  export module M:interface_part;
  export void World();

  // impl_part.cppm
  module;
  #include <iostream>
  #include <string>
  module M:impl_part;
  import :interface_part;

  std::string W = "World.";
  void World() {
    std::cout << W << std::endl;
  }

  // Impl.cpp
  module;
  #include <iostream>
  module M;
  void Hello() {
    std::cout << "Hello ";
  }

  // User.cpp
  import M;
  int main() {
    Hello();
    World();
    return 0;
  }

Then we can compile the example with the following commands:

.. code-block:: console

  # Precompiling the module
  $ clang++ -std=c++20 interface_part.cppm --precompile -o M-interface_part.pcm
  $ clang++ -std=c++20 impl_part.cppm --precompile -fprebuilt-module-path=. -o M-impl_part.pcm
  $ clang++ -std=c++20 M.cppm --precompile -fprebuilt-module-path=. -o M.pcm
  $ clang++ -std=c++20 Impl.cpp -fmodule-file=M.pcm -c -o Impl.o

  # Compiling the user
  $ clang++ -std=c++20 User.cpp -fprebuilt-module-path=. -c -o User.o

  # Compiling the module and linking it together
  $ clang++ -std=c++20 M-interface_part.pcm -c -o M-interface_part.o
  $ clang++ -std=c++20 M-impl_part.pcm -c -o M-impl_part.o
  $ clang++ -std=c++20 M.pcm -c -o M.o
  $ clang++ User.o M-interface_part.o M-impl_part.o M.o Impl.o -o a.out

We explain the options in the following sections.

How to enable standard C++ modules
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently, standard C++ modules are enabled automatically
if the language standard is ``-std=c++20`` or newer.
The ``-fmodules-ts`` option is deprecated and is planned to be removed.
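
As a rough illustration of the rule above, the language mode selected by ``-std=`` is visible
to source code through the standard ``__cplusplus`` macro, whose value for C++20 is ``202002L``.
The following is only a sketch: the helper name ``language_mode_enables_modules`` is hypothetical
and is not part of Clang. It merely restates the "C++20 or newer" condition as a compile-time check.

.. code-block:: c++

  // Hypothetical helper (not a Clang API): mirrors the rule that standard C++
  // modules are available when the language standard is C++20 or newer.
  // Assumption: the caller passes the value of the __cplusplus macro.
  constexpr bool language_mode_enables_modules(long cplusplus_value) {
    // 202002L is the value of __cplusplus specified for C++20.
    return cplusplus_value >= 202002L;
  }

  // True when this translation unit itself is compiled as C++20 or newer.
  constexpr bool kThisTuCanUseModules =
      language_mode_enables_modules(__cplusplus);

A project could use such a constant in a ``static_assert`` to report a clear error when a
file that relies on modules is compiled with an older ``-std=`` value.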

How to produce a BMI
~~~~~~~~~~~~~~~~~~~~

It is possible to generate a BMI for an importable module unit by specifying the ``--precompile`` option.

File name requirement
~~~~~~~~~~~~~~~~~~~~~

The file name of an ``importable module unit`` should end with ``.cppm``
(or ``.ccm``, ``.cxxm``, ``.c++m``). The file name of a ``module implementation unit``
should end with ``.cpp`` (or ``.cc``, ``.cxx``, ``.c++``).

The file name of BMIs should end with ``.pcm``.
The file name of the BMI of a ``primary module interface unit`` should be ``module_name.pcm``.
The file name of the BMI of a ``module partition unit`` should be ``module_name-partition_name.pcm``.

If the file names use different extensions, Clang may fail to build the module.
For example, if the file name of an ``importable module unit`` ends with ``.cpp`` instead of ``.cppm``,
then we can't generate a BMI for the ``importable module unit`` with the ``--precompile`` option,
since ``--precompile`` would then only run the preprocessor, which is equivalent to ``-E``.
If we want the file name of an ``importable module unit`` to end with a suffix other than ``.cppm``,
we can put ``-x c++-module`` in front of the file. For example,

.. code-block:: c++

  // Hello.cpp
  module;
  #include <iostream>
  export module Hello;
  export void hello() {
    std::cout << "Hello World!\n";
  }

  // use.cpp
  import Hello;
  int main() {
    hello();
    return 0;
  }

Now that the file name of the ``module interface`` ends with ``.cpp`` instead of ``.cppm``,
we can't compile it with the original command lines. But we are still able to do it with:

.. code-block:: console

  $ clang++ -std=c++20 -x c++-module Hello.cpp --precompile -o Hello.pcm
  $ clang++ -std=c++20 use.cpp -fprebuilt-module-path=. Hello.pcm -o Hello.out
  $ ./Hello.out
  Hello World!
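
The naming rules above can be sketched as a few small compile-time checks. This is an
illustration only: the helper names (``has_suffix``, ``has_importable_unit_suffix``,
``consume_prefix``, ``is_expected_bmi_name``) are hypothetical and not part of Clang, and the
sketch assumes bare file names without directory components.

.. code-block:: c++

  #include <string_view>

  // Hypothetical helpers (not part of Clang) that restate the file name rules.

  // Returns true if `s` ends with `suffix`.
  constexpr bool has_suffix(std::string_view s, std::string_view suffix) {
    return s.size() >= suffix.size() &&
           s.substr(s.size() - suffix.size()) == suffix;
  }

  // Suffixes listed above for importable module units.
  constexpr bool has_importable_unit_suffix(std::string_view file) {
    return has_suffix(file, ".cppm") || has_suffix(file, ".ccm") ||
           has_suffix(file, ".cxxm") || has_suffix(file, ".c++m");
  }

  // Removes `prefix` from the front of `s`; returns false if it is not there.
  constexpr bool consume_prefix(std::string_view &s, std::string_view prefix) {
    if (s.substr(0, prefix.size()) != prefix)
      return false;
    s.remove_prefix(prefix.size());
    return true;
  }

  // Checks `file` against `module_name.pcm` or, when a partition name is
  // given, `module_name-partition_name.pcm`.
  constexpr bool is_expected_bmi_name(std::string_view file,
                                      std::string_view module_name,
                                      std::string_view partition_name = {}) {
    if (!consume_prefix(file, module_name))
      return false;
    if (!partition_name.empty() &&
        !(consume_prefix(file, "-") && consume_prefix(file, partition_name)))
      return false;
    return file == ".pcm";
  }

With these helpers, ``is_expected_bmi_name("M-interface_part.pcm", "M", "interface_part")``
holds, while ``has_importable_unit_suffix("Hello.cpp")`` does not, which corresponds to the
case where ``-x c++-module`` is needed.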

How to specify the dependent BMIs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The option ``-fprebuilt-module-path`` tells the compiler where to search for dependent BMIs.
It may be used multiple times, just like ``-I`` for specifying paths for header files. The lookup rules are:

* (1) When we import module ``M``, the compiler looks up ``M.pcm`` in the directories specified
  by ``-fprebuilt-module-path``.
* (2) When we import the module partition unit ``M:P``, the compiler looks up ``M-P.pcm`` in the
  directories specified by ``-fprebuilt-module-path``.

Another way to specify the dependent BMIs is to use ``-fmodule-file``. The main difference
is that ``-fprebuilt-module-path`` takes a directory, whereas ``-fmodule-file`` requires a
specific file. If both ``-fprebuilt-module-path`` and ``-fmodule-file`` are present, the
``-fmodule-file`` option takes precedence. In other words, if the compiler finds the wanted
BMI specified by ``-fmodule-file``, it does not look it up again in the directories specified
by ``-fprebuilt-module-path``.

When we compile a ``module implementation unit``, we must pass the BMI of the corresponding
``primary module interface unit`` with ``-fmodule-file``,
since the language specification says a module implementation unit implicitly imports
the primary module interface unit.

  [module.unit]p8

  A module-declaration that contains neither an export-keyword nor a module-partition implicitly
  imports the primary module interface unit of the module as if by a module-import-declaration.

Again, the option ``-fmodule-file`` may occur multiple times.
For example, the command line to compile ``M.cppm`` in
the above example could be rewritten as:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -fmodule-file=M-interface_part.pcm -fmodule-file=M-impl_part.pcm -o M.pcm

``-fprebuilt-module-path`` is more convenient and ``-fmodule-file`` is faster since
it saves time for file lookup.

Remember that module units still have an object counterpart to the BMI
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is easy to forget to compile BMIs at first, since we may think of module interfaces as being like headers.
However, this is not true.
Module units are translation units. We need to compile them to object files
and link the object files, as the example above shows.

For example, the traditional compilation process for headers looks like:

.. code-block:: text

  src1.cpp -+> clang++ src1.cpp --> src1.o ---,
  hdr1.h  --'                                 +-> clang++ src1.o src2.o -> executable
  hdr2.h  --,                                 |
  src2.cpp -+> clang++ src2.cpp --> src2.o ---'

And the compilation process for module units looks like:

.. code-block:: text

                src1.cpp ----------------------------------------+> clang++ src1.cpp -------> src1.o -,
  (header unit) hdr1.h    -> clang++ hdr1.h ...    -> hdr1.pcm --'                                    +-> clang++ src1.o mod1.o src2.o -> executable
                mod1.cppm -> clang++ mod1.cppm ... -> mod1.pcm --,--> clang++ mod1.pcm ... -> mod1.o -+
                src2.cpp ----------------------------------------+> clang++ src2.cpp -------> src2.o -'

As the diagrams show, we need to compile the BMI from module units to object files and link the object files.
(But we can't do this for the BMI from header units. See the later section for the definition of header units.)

If we want to create a module library, we can't just ship the BMIs in an archive.
We must compile these BMIs (``*.pcm``) into object files (``*.o``) and add those object files to the archive instead.

Consistency Requirement
~~~~~~~~~~~~~~~~~~~~~~~

If we envision modules as a cache to speed up compilation, then, as with other caching techniques,
it is important to keep the cache consistent.
So **currently** Clang performs very strict consistency checks.

Options consistency
^^^^^^^^^^^^^^^^^^^

The language options of module units and their non-module-unit users should be consistent.
The following example is not allowed:

.. code-block:: c++

  // M.cppm
  export module M;

  // Use.cpp
  import M;

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  $ clang++ -std=c++2b Use.cpp -fprebuilt-module-path=.

The compiler would reject the example due to the inconsistent language options.
Not all options are language options.
For example, the following is allowed:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  # Inconsistent optimization level.
  $ clang++ -std=c++20 -O3 Use.cpp -fprebuilt-module-path=.
  # Inconsistent debugging level.
  $ clang++ -std=c++20 -g Use.cpp -fprebuilt-module-path=.

Although the two examples have inconsistent optimization and debugging levels, both of them are accepted.

Note that **currently** the compiler doesn't consider inconsistent macro definitions a problem. For example:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  # Inconsistent optimization level and macro definition (NDEBUG).
  $ clang++ -std=c++20 -O3 -DNDEBUG Use.cpp -fprebuilt-module-path=.

Currently Clang would accept the above example. But it may produce surprising results if the
debugging code depends on consistent use of ``NDEBUG`` in other translation units as well.

Source content consistency
^^^^^^^^^^^^^^^^^^^^^^^^^^

When the compiler reads a BMI, it checks the consistency of the corresponding
source files. For example:

.. code-block:: c++

  // M.cppm
  export module M;
  export template <class T>
  T foo(T t) {
    return t;
  }

  // Use.cpp
  import M;
  void bar() {
    foo(5);
  }

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  $ rm M.cppm
  $ clang++ -std=c++20 Use.cpp -fmodule-file=M.pcm

The compiler would reject the example since it fails to find the source file needed to check the consistency.
The following example would be rejected too:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  $ echo "int i=0;" >> M.cppm
  $ clang++ -std=c++20 Use.cpp -fmodule-file=M.pcm

The compiler would reject it because it detects that the file was changed.

But it is OK to move the BMI as long as the source files remain:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -o M.pcm
  $ mkdir -p tmp
  $ mv M.pcm tmp/M.pcm
  $ clang++ -std=c++20 Use.cpp -fmodule-file=tmp/M.pcm

The above example would be accepted.

If the user can't follow the consistency requirement for some reason (e.g., when distributing BMIs),
the user could try ``-Xclang -fmodules-embed-all-files`` when producing the BMI. For example:

.. code-block:: console

  $ clang++ -std=c++20 M.cppm --precompile -Xclang -fmodules-embed-all-files -o M.pcm
  $ rm M.cppm
  $ clang++ -std=c++20 Use.cpp -fmodule-file=M.pcm

Now the compiler would accept the above example.
Important note: ``-Xclang`` options are intended to be used by the compiler internally, and their semantics
are not guaranteed to be preserved in future versions.

The compiler also records the paths of the header files included in the global module fragment and compares the
headers when the module is imported. For example,

.. code-block:: c++

  // foo.h
  #include <iostream>
  void Hello() {
    std::cout << "Hello World.\n";
  }

  // foo.cppm
  module;
  #include "foo.h"
  export module foo;
  export using ::Hello;

  // Use.cpp
  import foo;
  int main() {
    Hello();
  }

Then it is problematic if we remove ``foo.h`` before importing the ``foo`` module.

.. code-block:: console

  $ clang++ -std=c++20 foo.cppm --precompile -o foo.pcm
  $ mv foo.h foo.orig.h
  # The following one is rejected
  $ clang++ -std=c++20 Use.cpp -fmodule-file=foo.pcm -c

The above case will be rejected. We can still work around it with the ``-Xclang -fmodules-embed-all-files`` option:

.. code-block:: console

  $ clang++ -std=c++20 foo.cppm --precompile -Xclang -fmodules-embed-all-files -o foo.pcm
  $ mv foo.h foo.orig.h
  $ clang++ -std=c++20 Use.cpp -fmodule-file=foo.pcm -c -o Use.o
  $ clang++ Use.o foo.pcm

ABI Impacts
-----------

The declarations in a module unit which are not in the global module fragment have new linkage names.

For example,

.. code-block:: c++

  export module M;
  namespace NS {
    export int foo();
  }

The linkage name of ``NS::foo()`` would be ``_ZN2NSW1M3fooEv``.
This couldn't be demangled by previous versions of the debugger or demangler.
As of LLVM 15.x, users can utilize ``llvm-cxxfilt`` to demangle this:

.. code-block:: console

  $ llvm-cxxfilt _ZN2NSW1M3fooEv

The result would be ``NS::foo@M()``, which reads as ``NS::foo()`` in module ``M``.

The ABI implies that we can't declare something in a module unit and define it in a non-module unit (or vice-versa),
as this would result in linking errors.

Known Problems
--------------

The following describes issues in the current implementation of modules.
Please see https://github.com/llvm/llvm-project/labels/clang%3Amodules for more issues
or file a new issue if you don't find an existing one.
If you're going to create a new issue for standard C++ modules,
please start the title with ``[C++20] [Modules]`` (or ``[C++2b] [Modules]``, etc.)
and add the label ``clang:modules`` (if you have permissions for that).

For the implementation status of individual proposals, you can visit https://clang.llvm.org/cxx_status.html.

Support for clang-scan-deps
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The support for clang-scan-deps may be the most urgent problem for modules now.
Without the support for clang-scan-deps, it's hard to involve build systems.
This means that users can only use modules through makefiles or by writing a parser by hand.
This limits wider use of modules, which in turn limits the defect reports and requirements we receive.

This is tracked in: https://github.com/llvm/llvm-project/issues/51792.

Ambiguous deduction guide
~~~~~~~~~~~~~~~~~~~~~~~~~

Currently, when we use deduction guides from the global module fragment,
we may get an incorrect diagnostic such as ``ambiguous deduction``.

So if we're using a deduction guide from the global module fragment, we probably need to write:

.. code-block:: c++

  std::lock_guard<std::mutex> lk(mutex);

instead of

.. code-block:: c++

  std::lock_guard lk(mutex);

This is tracked in: https://github.com/llvm/llvm-project/issues/56916

Ignored PreferredName Attribute
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Due to a tricky problem, when Clang writes BMIs, it ignores the ``preferred_name`` attribute, if any.
This implies that the ``preferred_name`` won't show up in the debugger or in dumps.

This is tracked in: https://github.com/llvm/llvm-project/issues/56490

Don't emit macros about module declaration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is covered by P1857R3.
We mention it again here since users may abuse it before we implement it.

Someone may want to write code which can be compiled both with and without modules.
A direct idea would be to use macros like:

.. code-block:: c++

  MODULE
  IMPORT header_name
  EXPORT_MODULE MODULE_NAME;
  IMPORT header_name
  EXPORT ...

So this file could be treated as a module unit or a non-module unit depending on the definition
of some macros.
However, this kind of usage is forbidden by P1857R3, which we haven't implemented yet.
This means that it is possible to write illegal modules code now, and this will stop working
once P1857R3 is implemented.
A simple suggestion would be "Don't play macro tricks with module declarations".

This is tracked in: https://github.com/llvm/llvm-project/issues/56917

Inconsistent filename suffix requirement for importable module units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Currently, Clang requires that the file name of an ``importable module unit`` end with ``.cppm``
(or ``.ccm``, ``.cxxm``, ``.c++m``). However, this behavior is inconsistent with other compilers.

This is tracked in: https://github.com/llvm/llvm-project/issues/57416

Header Units
============

How to build projects using header unit
---------------------------------------

Quick Start
~~~~~~~~~~~

For the following example,

.. code-block:: c++

  import <iostream>;
  int main() {
    std::cout << "Hello World.\n";
  }

we could compile it as

.. code-block:: console

  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
  $ clang++ -std=c++20 -fmodule-file=iostream.pcm main.cpp

How to produce BMIs
~~~~~~~~~~~~~~~~~~~

Similar to named modules, we could use ``--precompile`` to produce the BMI.
But we need to specify that the input file is a header with ``-xc++-system-header`` or ``-xc++-user-header``.

We can also use the ``-fmodule-header={user,system}`` option to produce the BMI for header units
whose file names have a suffix such as ``.h`` or ``.hh``.
The value of ``-fmodule-header`` selects the user search path or the system search path.
The default value for ``-fmodule-header`` is ``user``.
For example,

.. code-block:: c++

  // foo.h
  #include <iostream>
  void Hello() {
    std::cout << "Hello World.\n";
  }

  // use.cpp
  import "foo.h";
  int main() {
    Hello();
  }

We could compile it as:

.. code-block:: console

  $ clang++ -std=c++20 -fmodule-header foo.h -o foo.pcm
  $ clang++ -std=c++20 -fmodule-file=foo.pcm use.cpp

For headers which don't have a suffix, we need to pass ``-xc++-header``
(or ``-xc++-system-header`` or ``-xc++-user-header``) to mark them as headers.
For example,

.. code-block:: c++

  // use.cpp
  import <iostream>;
  int main() {
    std::cout << "Hello World.\n";
  }

.. code-block:: console

  $ clang++ -std=c++20 -fmodule-header=system -xc++-header iostream -o iostream.pcm
  $ clang++ -std=c++20 -fmodule-file=iostream.pcm use.cpp

How to specify the dependent BMIs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We could use ``-fmodule-file`` to specify the BMIs, and this option may occur multiple times as well.

With the existing implementation, ``-fprebuilt-module-path`` cannot be used for header units
(since they are nominally anonymous).
For header units, use ``-fmodule-file`` to include the relevant PCM file for each header unit.

This is expected to be solved in future versions of the compiler, either by tooling that finds and specifies
the ``-fmodule-file`` options or by the use of a module mapper that understands how to map header names to their PCMs.

Don't compile the BMI
~~~~~~~~~~~~~~~~~~~~~

Another difference from named modules is that we can't compile the BMI of a header unit into an object file.
For example:

.. code-block:: console

  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
  # This is not allowed!
  $ clang++ iostream.pcm -c -o iostream.o

This makes sense given the semantics of header units, which are just like headers.

Include translation
~~~~~~~~~~~~~~~~~~~

The C++ spec allows vendors to convert ``#include header-name`` to ``import header-name;`` when possible.
Currently, Clang does this translation for ``#include`` directives in the global module fragment.

For example, the following two examples are equivalent:

.. code-block:: c++

  module;
  import <iostream>;
  export module M;
  export void Hello() {
    std::cout << "Hello.\n";
  }

and:

.. code-block:: c++

  module;
  #include <iostream>
  export module M;
  export void Hello() {
    std::cout << "Hello.\n";
  }

.. code-block:: console

  $ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
  $ clang++ -std=c++20 -fmodule-file=iostream.pcm --precompile M.cppm -o M.pcm

In the latter example, Clang can find the BMI for ``<iostream>``,
so it replaces ``#include <iostream>`` with ``import <iostream>;`` automatically.


Relationships between Clang modules
-----------------------------------

Header units have semantics that are quite similar to those of Clang modules.
The semantics of both of them are like headers.

In fact, we could even "mimic" the style of header units with Clang modules:

.. code-block:: c++

  module "iostream" {
    export *
    header "/path/to/libstdcxx/iostream"
  }

.. code-block:: console

  $ clang++ -std=c++20 -fimplicit-modules -fmodule-map-file=.modulemap main.cpp

It would be simpler if we are using libcxx:

.. code-block:: console

  $ clang++ -std=c++20 main.cpp -fimplicit-modules -fimplicit-module-maps

This works because there is already a
`module map <https://github.com/llvm/llvm-project/blob/main/libcxx/include/module.modulemap.in>`_
in the source of libcxx.

This immediately leads to the question: why don't we implement header units through Clang header modules?

The main reason is that Clang modules have additional semantics, such as hierarchy or
wrapping multiple headers together as one big module.
These things are not part of standard C++ header units,
and we want to avoid the impression that these additional semantics are standard C++ behavior.

Another reason is that there are proposals to introduce module mappers to the C++ standard
(for example, https://wg21.link/p1184r2).
If we decide to reuse Clang's modulemap, we may get in trouble once we need to introduce another module mapper.

So the final answer for why we don't reuse the interface of Clang modules for header units is that
there are some differences between header units and Clang modules and that ignoring those
differences now would likely become a problem in the future.

Possible Questions
==================

How modules speed up compilation
--------------------------------

A classic argument for why modules speed up compilation is:
if there are ``n`` headers and ``m`` source files and each header is included by each source file,
then the complexity of the compilation is ``O(n*m)``.
But if there are ``n`` module interfaces and ``m`` source files, the complexity of the compilation is
``O(n+m)``. So, using modules would be a big win when scaling.
Put more simply, we can get rid of many redundant compilations by using modules.

Roughly, this theory is correct. But the problem is that it is too rough.
The behavior depends on the optimization level, as we will illustrate below.

First is ``O0``. The compilation process is described in the following graph.

.. code-block:: none

  ├-------------frontend----------┼-------------middle end----------------┼----backend----┤
  │                               │                                       │               │
  └---parsing----sema----codegen--┴----- transformations ---- codegen ----┴---- codegen --┘

  ┌---------------------------------------------------------------------------------------┐
  │                                                                                       │
  │                                      source file                                      │
  │                                                                                       │
  └---------------------------------------------------------------------------------------┘

              ┌--------┐
              │        │
              │imported│
              │        │
              │  code  │
              │        │
              └--------┘

Here we can see that the source file (which could be a non-module unit or a module unit) is processed by the
whole pipeline.
But the imported code is only involved in semantic analysis, which is mainly about name lookup,
overload resolution and template instantiation.
All of these processes are fast relative to the whole compilation process.
More importantly, the imported code only needs to be processed once by frontend code generation,
as well as by the whole middle end and backend.
So we could get a big win for the compilation time at ``O0``.

But with optimizations, things are different:

(we omit the ``code generation`` part of each stage due to limited space)

.. code-block:: none

  ├-------- frontend ---------┼--------------- middle end --------------------┼------ backend ----┤
  │                           │                                               │                   │
  └--- parsing ---- sema -----┴--- optimizations --- IPO ---- optimizations---┴--- optimizations -┘

  ┌-----------------------------------------------------------------------------------------------┐
  │                                                                                               │
  │                                          source file                                          │
  │                                                                                               │
  └-----------------------------------------------------------------------------------------------┘
                ┌---------------------------------------┐
                │                                       │
                │                                       │
                │             imported code             │
                │                                       │
                │                                       │
                └---------------------------------------┘

It would be very unfortunate if we ended up with worse performance after using modules.
The main concern is that when we compile a source file, the compiler needs to see the function bodies
of imported module units so that it can perform IPO (InterProcedural Optimization, primarily inlining
in practice) to optimize functions in the current source file with the help of the information provided by
the imported module units.
In other words, the imported code would be processed again and again in the importing units
by optimizations (including IPO itself).
The optimizations before IPO and the IPO itself are the most time-consuming parts of the whole compilation process.
So from this perspective, we might not be able to get the improvements described in the theory.
But we could still save the time for optimizations after IPO and the whole backend.

Overall, at ``O0`` the implementations of functions defined in a module will not impact module users,
but at higher optimization levels the definitions of such functions are provided to user compilations for the
purposes of optimization (but definitions of these functions are still not included in the user's object file).
This means the build speedup at higher optimization levels may be lower than expected given ``O0`` experience,
but it does provide more optimization opportunities.
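
The counting argument at the start of this section can be written down as a small worked example.
This is only an idealized cost model under stated assumptions: the function names are hypothetical,
and one "step" stands for processing one header or one module interface once. It is not a measurement
of Clang.

.. code-block:: c++

  // Illustrative cost model only; the names are hypothetical and the unit of
  // work (one full processing of a header or module interface) is an
  // assumption, not a measurement.

  // Textual inclusion: each of the m source files re-processes all n headers.
  constexpr long header_processing_steps(long n, long m) { return n * m; }

  // Modules: each of the n interfaces is compiled once, then each of the m
  // source files is compiled once.
  constexpr long module_processing_steps(long n, long m) { return n + m; }

For ``n = 100`` and ``m = 50`` the model gives 5000 steps with textual inclusion and 150 with modules.
As discussed above, this idealized count is closest to the ``O0`` case; with optimizations enabled,
imported function bodies are processed again in each importing unit, so the real saving is smaller.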