1=====================
2LLVM Coding Standards
3=====================
4
5.. contents::
6   :local:
7
8Introduction
9============
10
11This document attempts to describe a few coding standards that are being used in
12the LLVM source tree.  Although no coding standards should be regarded as
13absolute requirements to be followed in all instances, coding standards are
14particularly important for large-scale code bases that follow a library-based
15design (like LLVM).
16
17While this document may provide guidance for some mechanical formatting issues,
18whitespace, or other "microscopic details", these are not fixed standards.
19Always follow the golden rule:
20
21.. _Golden Rule:
22
23    **If you are extending, enhancing, or bug fixing already implemented code,
24    use the style that is already being used so that the source is uniform and
25    easy to follow.**
26
27Note that some code bases (e.g. ``libc++``) have really good reasons to deviate
28from the coding standards.  In the case of ``libc++``, this is because the
29naming and other conventions are dictated by the C++ standard.  If you think
30there is a specific good reason to deviate from the standards here, please bring
31it up on the LLVM-dev mailing list.
32
33There are some conventions that are not uniformly followed in the code base
34(e.g. the naming convention).  This is because they are relatively new, and a
lot of code was written before they were put in place.  Our long-term goal is
for the entire code base to follow the convention, but we explicitly *do not*
37want patches that do large-scale reformatting of existing code.  On the other
38hand, it is reasonable to rename the methods of a class if you're about to
39change it in some other way.  Just do the reformatting as a separate commit
40from the functionality change.
41
42The ultimate goal of these guidelines is to increase the readability and
43maintainability of our common source base. If you have suggestions for topics to
44be included, please mail them to `Chris <mailto:[email protected]>`_.
45
46Languages, Libraries, and Standards
47===================================
48
49Most source code in LLVM and other LLVM projects using these coding standards
50is C++ code. There are some places where C code is used either due to
51environment restrictions, historical restrictions, or due to third-party source
52code imported into the tree. Generally, our preference is for standards
53conforming, modern, and portable C++ code as the implementation language of
54choice.
55
56C++ Standard Versions
57---------------------
58
59LLVM, Clang, and LLD are currently written using C++14 conforming code,
60although we restrict ourselves to features which are available in the major
61toolchains supported as host compilers. The LLDB project is even more
62aggressive in the set of host compilers supported and thus uses still more
63features. Regardless of the supported features, code is expected to (when
64reasonable) be standard, portable, and modern C++14 code. We avoid unnecessary
65vendor-specific extensions, etc.
66
67C++ Standard Library
68--------------------
69
70Use the C++ standard library facilities whenever they are available for
71a particular task. LLVM and related projects emphasize and rely on the standard
72library facilities for as much as possible. Common support libraries providing
73functionality missing from the standard library for which there are standard
74interfaces or active work on adding standard interfaces will often be
75implemented in the LLVM namespace following the expected standard interface.
76
There are some exceptions, such as the standard I/O streams library, which is
avoided. Much more detailed information on these subjects is available in the
:doc:`ProgrammersManual`.
80
81Supported C++14 Language and Library Features
82---------------------------------------------
83
84While LLVM, Clang, and LLD use C++14, not all features are available in all of
85the toolchains which we support. The set of features supported for use in LLVM
86is the intersection of those supported in the minimum requirements described
87in the :doc:`GettingStarted` page, section `Software`.
88The ultimate definition of this set is what build bots with those respective
89toolchains accept. Don't argue with the build bots. However, we have some
90guidance below to help you know what to expect.
91
92Each toolchain provides a good reference for what it accepts:
93
94* Clang: https://clang.llvm.org/cxx_status.html
95* GCC: https://gcc.gnu.org/projects/cxx-status.html#cxx14
96* MSVC: https://msdn.microsoft.com/en-us/library/hh567368.aspx
97
98Other Languages
99---------------
100
101Any code written in the Go programming language is not subject to the
102formatting rules below. Instead, we adopt the formatting rules enforced by
103the `gofmt`_ tool.
104
105Go code should strive to be idiomatic. Two good sets of guidelines for what
106this means are `Effective Go`_ and `Go Code Review Comments`_.
107
108.. _gofmt:
109  https://golang.org/cmd/gofmt/
110
111.. _Effective Go:
112  https://golang.org/doc/effective_go.html
113
114.. _Go Code Review Comments:
115  https://github.com/golang/go/wiki/CodeReviewComments
116
117Mechanical Source Issues
118========================
119
120Source Code Formatting
121----------------------
122
123Commenting
124^^^^^^^^^^
125
126Comments are one critical part of readability and maintainability.  Everyone
127knows they should comment their code, and so should you.  When writing comments,
128write them as English prose, which means they should use proper capitalization,
129punctuation, etc.  Aim to describe what the code is trying to do and why, not
130*how* it does it at a micro level. Here are a few critical things to document:
131
132.. _header file comment:
133
134File Headers
135""""""""""""
136
137Every source file should have a header on it that describes the basic purpose of
138the file.  If a file does not have a header, it should not be checked into the
139tree.  The standard header looks like this:
140
141.. code-block:: c++
142
143  //===-- llvm/Instruction.h - Instruction class definition -------*- C++ -*-===//
144  //
145  // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
146  // See https://llvm.org/LICENSE.txt for license information.
147  // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
148  //
149  //===----------------------------------------------------------------------===//
150  ///
151  /// \file
152  /// This file contains the declaration of the Instruction class, which is the
153  /// base class for all of the VM instructions.
154  ///
155  //===----------------------------------------------------------------------===//
156
157A few things to note about this particular format: The "``-*- C++ -*-``" string
158on the first line is there to tell Emacs that the source file is a C++ file, not
159a C file (Emacs assumes ``.h`` files are C files by default).
160
161.. note::
162
163    This tag is not necessary in ``.cpp`` files.  The name of the file is also
164    on the first line, along with a very short description of the purpose of the
    file.  This is important when printing out code and flipping through lots of
166    pages.
167
168The next section in the file is a concise note that defines the license that the
169file is released under.  This makes it perfectly clear what terms the source
170code can be distributed under and should not be modified in any way.
171
172The main body is a ``doxygen`` comment (identified by the ``///`` comment
173marker instead of the usual ``//``) describing the purpose of the file.  The
174first sentence (or a passage beginning with ``\brief``) is used as an abstract.
175Any additional information should be separated by a blank line.  If an
176algorithm is being implemented or something tricky is going on, a reference
177to the paper where it is published should be included, as well as any notes or
178*gotchas* in the code to watch out for.
179
180Class overviews
181"""""""""""""""
182
183Classes are one fundamental part of a good object oriented design.  As such, a
184class definition should have a comment block that explains what the class is
185used for and how it works.  Every non-trivial class is expected to have a
186``doxygen`` comment block.
187
188Method information
189""""""""""""""""""
190
191Methods defined in a class (as well as any global functions) should also be
192documented properly.  A quick note about what it does and a description of the
193borderline behaviour is all that is necessary here (unless something
194particularly tricky or insidious is going on).  The hope is that people can
195figure out how to use your interfaces without reading the code itself.
196
197Good things to talk about here are what happens when something unexpected
198happens: does the method return null?  Abort?  Format your hard disk?
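
As a minimal sketch of these guidelines, the hypothetical ``SymbolTable`` class
below (not part of LLVM; all names are assumptions made for illustration)
carries a class overview and documents what each method does in the borderline
cases:

.. code-block:: c++

  #include <map>
  #include <string>

  /// Maps symbol names to integer IDs.  (Hypothetical example class.)
  ///
  /// IDs are handed out sequentially starting at zero.  Entries are never
  /// removed, so an ID stays valid for the lifetime of the table.
  class SymbolTable {
  public:
    /// Returns the ID of \p Name, inserting it if it is not yet present.
    int getOrCreate(const std::string &Name) {
      auto It = IDs.find(Name);
      if (It != IDs.end())
        return It->second;
      int NewID = static_cast<int>(IDs.size());
      IDs.emplace(Name, NewID);
      return NewID;
    }

    /// Looks up \p Name without modifying the table.
    ///
    /// \returns the ID of \p Name, or -1 if \p Name was never inserted.
    int lookup(const std::string &Name) const {
      auto It = IDs.find(Name);
      return It == IDs.end() ? -1 : It->second;
    }

  private:
    std::map<std::string, int> IDs;
  };

A caller can tell from the comments alone that ``lookup`` signals a missing
name with -1 and never aborts.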
199
200Comment Formatting
201^^^^^^^^^^^^^^^^^^
202
203In general, prefer C++ style comments (``//`` for normal comments, ``///`` for
204``doxygen`` documentation comments).  They take less space, require
205less typing, don't have nesting problems, etc.  There are a few cases when it is
206useful to use C style (``/* */``) comments however:
207
208#. When writing C code: Obviously if you are writing C code, use C style
209   comments.
210
211#. When writing a header file that may be ``#include``\d by a C source file.
212
213#. When writing a source file that is used by a tool that only accepts C style
214   comments.
215
216#. When documenting the significance of constants used as actual parameters in
217   a call. This is most helpful for ``bool`` parameters, or passing ``0`` or
218   ``nullptr``. Typically you add the formal parameter name, which ought to be
219   meaningful. For example, it's not clear what the parameter means in this call:
220
221   .. code-block:: c++
222
223     Object.emitName(nullptr);
224
225   An in-line C-style comment makes the intent obvious:
226
227   .. code-block:: c++
228
229     Object.emitName(/*Prefix=*/nullptr);
230
231Commenting out large blocks of code is discouraged, but if you really have to do
232this (for documentation purposes or as a suggestion for debug printing), use
233``#if 0`` and ``#endif``. These nest properly and are better behaved in general
234than C style comments.
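
A minimal sketch of why ``#if 0`` is better behaved (``computeChecksum`` is a
hypothetical function used only for illustration): the disabled region below
contains a C style comment and a nested ``#if 0`` / ``#endif`` pair.  Wrapping
the same region in ``/* */`` would not work, because the comment would end at
the first inner ``*/``.

.. code-block:: c++

  #include <cstdio>

  // Hypothetical function; only the disabled region matters here.
  int computeChecksum(int Value) {
    int Sum = Value * 31;
  #if 0
    /* Debug printing, kept only as a suggestion. */
    std::printf("Sum before adjustment: %d\n", Sum);
  #if 0
    std::printf("Value: %d\n", Value);
  #endif
  #endif
    return Sum + 7;
  }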
235
236Doxygen Use in Documentation Comments
237^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
238
239Use the ``\file`` command to turn the standard file header into a file-level
240comment.
241
242Include descriptive paragraphs for all public interfaces (public classes,
243member and non-member functions).  Don't just restate the information that can
be inferred from the API name.  The first sentence (or a paragraph beginning
with ``\brief``) is used as an abstract. Prefer a single-sentence abstract over
an explicit ``\brief``, which adds visual clutter.  Put detailed discussion
into separate paragraphs.
248
249To refer to parameter names inside a paragraph, use the ``\p name`` command.
250Don't use the ``\arg name`` command since it starts a new paragraph that
251contains documentation for the parameter.
252
253Wrap non-inline code examples in ``\code ... \endcode``.
254
255To document a function parameter, start a new paragraph with the
256``\param name`` command.  If the parameter is used as an out or an in/out
257parameter, use the ``\param [out] name`` or ``\param [in,out] name`` command,
258respectively.
259
To describe a function's return value, start a new paragraph with the
``\returns`` command.
262
263A minimal documentation comment:
264
265.. code-block:: c++
266
267  /// Sets the xyzzy property to \p Baz.
268  void setXyzzy(bool Baz);
269
270A documentation comment that uses all Doxygen features in a preferred way:
271
272.. code-block:: c++
273
274  /// Does foo and bar.
275  ///
276  /// Does not do foo the usual way if \p Baz is true.
277  ///
278  /// Typical usage:
279  /// \code
280  ///   fooBar(false, "quux", Res);
281  /// \endcode
282  ///
283  /// \param Quux kind of foo to do.
284  /// \param [out] Result filled with bar sequence on foo success.
285  ///
286  /// \returns true on success.
287  bool fooBar(bool Baz, StringRef Quux, std::vector<int> &Result);
288
289Don't duplicate the documentation comment in the header file and in the
290implementation file.  Put the documentation comments for public APIs into the
291header file.  Documentation comments for private APIs can go to the
292implementation file.  In any case, implementation files can include additional
293comments (not necessarily in Doxygen markup) to explain implementation details
294as needed.
295
Don't duplicate the function or class name at the beginning of the comment.
297For humans it is obvious which function or class is being documented;
298automatic documentation processing tools are smart enough to bind the comment
299to the correct declaration.
300
301Wrong:
302
303.. code-block:: c++
304
305  // In Something.h:
306
307  /// Something - An abstraction for some complicated thing.
308  class Something {
309  public:
310    /// fooBar - Does foo and bar.
311    void fooBar();
312  };
313
314  // In Something.cpp:
315
316  /// fooBar - Does foo and bar.
317  void Something::fooBar() { ... }
318
319Correct:
320
321.. code-block:: c++
322
323  // In Something.h:
324
325  /// An abstraction for some complicated thing.
326  class Something {
327  public:
328    /// Does foo and bar.
329    void fooBar();
330  };
331
332  // In Something.cpp:
333
334  // Builds a B-tree in order to do foo.  See paper by...
335  void Something::fooBar() { ... }
336
337It is not required to use additional Doxygen features, but sometimes it might
338be a good idea to do so.
339
340Consider:
341
342* adding comments to any narrow namespace containing a collection of
343  related functions or types;
344
345* using top-level groups to organize a collection of related functions at
346  namespace scope where the grouping is smaller than the namespace;
347
348* using member groups and additional comments attached to member
349  groups to organize within a class.
350
351For example:
352
353.. code-block:: c++
354
355  class Something {
356    /// \name Functions that do Foo.
357    /// @{
358    void fooBar();
359    void fooBaz();
360    /// @}
361    ...
362  };
363
364``#include`` Style
365^^^^^^^^^^^^^^^^^^
366
367Immediately after the `header file comment`_ (and include guards if working on a
368header file), the `minimal list of #includes`_ required by the file should be
369listed.  We prefer these ``#include``\s to be listed in this order:
370
371.. _Main Module Header:
372.. _Local/Private Headers:
373
374#. Main Module Header
375#. Local/Private Headers
376#. LLVM project/subproject headers (``clang/...``, ``lldb/...``, ``llvm/...``, etc)
377#. System ``#include``\s
378
379and each category should be sorted lexicographically by the full path.
380
381The `Main Module Header`_ file applies to ``.cpp`` files which implement an
382interface defined by a ``.h`` file.  This ``#include`` should always be included
383**first** regardless of where it lives on the file system.  By including a
384header file first in the ``.cpp`` files that implement the interfaces, we ensure
385that the header does not have any hidden dependencies which are not explicitly
386``#include``\d in the header, but should be. It is also a form of documentation
387in the ``.cpp`` file to indicate where the interfaces it implements are defined.
388
389LLVM project and subproject headers should be grouped from most specific to least
390specific, for the same reasons described above.  For example, LLDB depends on
391both clang and LLVM, and clang depends on LLVM.  So an LLDB source file should
392include ``lldb`` headers first, followed by ``clang`` headers, followed by
393``llvm`` headers, to reduce the possibility (for example) of an LLDB header
394accidentally picking up a missing include due to the previous inclusion of that
395header in the main source file or some earlier header file.  clang should
396similarly include its own headers before including llvm headers.  This rule
397applies to all LLVM subprojects.
398
399.. _fit into 80 columns:
400
401Source Code Width
402^^^^^^^^^^^^^^^^^
403
404Write your code to fit within 80 columns of text.  This helps those of us who
405like to print out code and look at your code in an ``xterm`` without resizing
406it.
407
408The longer answer is that there must be some limit to the width of the code in
409order to reasonably allow developers to have multiple files side-by-side in
410windows on a modest display.  If you are going to pick a width limit, it is
411somewhat arbitrary but you might as well pick something standard.  Going with 90
412columns (for example) instead of 80 columns wouldn't add any significant value
413and would be detrimental to printing out code.  Also many other projects have
414standardized on 80 columns, so some people have already configured their editors
415for it (vs something else, like 90 columns).
416
417This is one of many contentious issues in coding standards, but it is not up for
418debate.
419
420Whitespace
421^^^^^^^^^^
422
423In all cases, prefer spaces to tabs in source files.  People have different
424preferred indentation levels, and different styles of indentation that they
425like; this is fine.  What isn't fine is that different editors/viewers expand
426tabs out to different tab stops.  This can cause your code to look completely
427unreadable, and it is not worth dealing with.
428
429As always, follow the `Golden Rule`_ above: follow the style of
430existing code if you are modifying and extending it.  If you like four spaces of
431indentation, **DO NOT** do that in the middle of a chunk of code with two spaces
432of indentation.  Also, do not reindent a whole source file: it makes for
433incredible diffs that are absolutely worthless.
434
435Do not commit changes that include trailing whitespace. If you find trailing
436whitespace in a file, do not remove it unless you're otherwise changing that
437line of code. Some common editors will automatically remove trailing whitespace
438when saving a file which causes unrelated changes to appear in diffs and
439commits.
440
441Indent Code Consistently
442^^^^^^^^^^^^^^^^^^^^^^^^
443
444Okay, in your first year of programming you were told that indentation is
445important. If you didn't believe and internalize this then, now is the time.
446Just do it. With the introduction of C++11, there are some new formatting
447challenges that merit some suggestions to help have consistent, maintainable,
448and tool-friendly formatting and indentation.
449
450Format Lambdas Like Blocks Of Code
451""""""""""""""""""""""""""""""""""
452
When formatting a multi-line lambda, format it like a block of code, because
that is what it is. If there is only one multi-line lambda in a statement, and
there are no expressions lexically after it in the statement, drop the indent
to the standard two space indent for a block of code, as if it were an if-block
opened by the preceding part of the statement:
458
459.. code-block:: c++
460
  std::sort(foo.begin(), foo.end(), [&](Foo a, Foo b) -> bool {
    if (a.blah != b.blah)
      return a.blah < b.blah;
    if (a.baz != b.baz)
      return a.baz < b.baz;
    return a.bam < b.bam;
  });
468
469To take best advantage of this formatting, if you are designing an API which
470accepts a continuation or single callable argument (be it a functor, or
471a ``std::function``), it should be the last argument if at all possible.
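
For illustration, a minimal sketch of such an API (``forEachEven`` and
``sumOfEvens`` are hypothetical names, not LLVM functions).  Because the
callable is the final parameter, a multi-line lambda at the call site can be
formatted like a block of code:

.. code-block:: c++

  #include <vector>

  // Hypothetical API: calls Fn on every even element of Values.  The
  // callable is deliberately the last parameter.
  template <typename Callable>
  void forEachEven(const std::vector<int> &Values, Callable Fn) {
    for (int V : Values)
      if (V % 2 == 0)
        Fn(V);
  }

  int sumOfEvens(const std::vector<int> &Values) {
    int Sum = 0;
    forEachEven(Values, [&](int V) {
      // The lambda body is indented like an ordinary block.
      Sum += V;
    });
    return Sum;
  }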
472
473If there are multiple multi-line lambdas in a statement, or there is anything
474interesting after the lambda in the statement, indent the block two spaces from
475the indent of the ``[]``:
476
477.. code-block:: c++
478
479  dyn_switch(V->stripPointerCasts(),
480             [] (PHINode *PN) {
481               // process phis...
482             },
483             [] (SelectInst *SI) {
484               // process selects...
485             },
486             [] (LoadInst *LI) {
487               // process loads...
488             },
489             [] (AllocaInst *AI) {
490               // process allocas...
491             });
492
493Braced Initializer Lists
494""""""""""""""""""""""""
495
496With C++11, there are significantly more uses of braced lists to perform
497initialization. These allow you to easily construct aggregate temporaries in
498expressions among other niceness. They now have a natural way of ending up
499nested within each other and within function calls in order to build up
500aggregates (such as option structs) from local variables. To make matters
501worse, we also have many more uses of braces in an expression context that are
502*not* performing initialization.
503
504The historically common formatting of braced initialization of aggregate
505variables does not mix cleanly with deep nesting, general expression contexts,
506function arguments, and lambdas. We suggest new code use a simple rule for
507formatting braced initialization lists: act as-if the braces were parentheses
508in a function call. The formatting rules exactly match those already well
509understood for formatting nested function calls. Examples:
510
511.. code-block:: c++
512
513  foo({a, b, c}, {1, 2, 3});
514
515  llvm::Constant *Mask[] = {
516      llvm::ConstantInt::get(llvm::Type::getInt32Ty(getLLVMContext()), 0),
517      llvm::ConstantInt::get(llvm::Type::getInt32Ty(getLLVMContext()), 1),
518      llvm::ConstantInt::get(llvm::Type::getInt32Ty(getLLVMContext()), 2)};
519
520This formatting scheme also makes it particularly easy to get predictable,
521consistent, and automatic formatting with tools like `Clang Format`_.
522
523.. _Clang Format: https://clang.llvm.org/docs/ClangFormat.html
524
525Language and Compiler Issues
526----------------------------
527
528Treat Compiler Warnings Like Errors
529^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
530
531If your code has compiler warnings in it, something is wrong --- you aren't
532casting values correctly, you have "questionable" constructs in your code, or
533you are doing something legitimately wrong.  Compiler warnings can cover up
534legitimate errors in output and make dealing with a translation unit difficult.
535
536It is not possible to prevent all warnings from all compilers, nor is it
537desirable.  Instead, pick a standard compiler (like ``gcc``) that provides a
538good thorough set of warnings, and stick to it.  At least in the case of
539``gcc``, it is possible to work around any spurious errors by changing the
540syntax of the code slightly.  For example, a warning that annoys me occurs when
541I write code like this:
542
543.. code-block:: c++
544
545  if (V = getValue()) {
546    ...
547  }
548
549``gcc`` will warn me that I probably want to use the ``==`` operator, and that I
550probably mistyped it.  In most cases, I haven't, and I really don't want the
551spurious errors.  To fix this particular problem, I rewrite the code like
552this:
553
554.. code-block:: c++
555
556  if ((V = getValue())) {
557    ...
558  }
559
560which shuts ``gcc`` up.  Any ``gcc`` warning that annoys you can be fixed by
561massaging the code appropriately.
562
563Write Portable Code
564^^^^^^^^^^^^^^^^^^^
565
566In almost all cases, it is possible and within reason to write completely
567portable code.  If there are cases where it isn't possible to write portable
568code, isolate it behind a well defined (and well documented) interface.
569
570In practice, this means that you shouldn't assume much about the host compiler
571(and Visual Studio tends to be the lowest common denominator).  If advanced
572features are used, they should only be an implementation detail of a library
573which has a simple exposed API, and preferably be buried in ``libSystem``.
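
One way to read this advice, sketched with hypothetical helpers
(``getPathSeparator`` and ``joinPath`` are not LLVM APIs): the check of the
``_WIN32`` macro is confined to a single function, and all other code stays
portable by going through that interface.

.. code-block:: c++

  #include <string>

  /// Returns the directory separator used by the host platform.
  ///
  /// Hypothetical helper: the only function that inspects platform macros.
  char getPathSeparator() {
  #ifdef _WIN32
    return '\\';
  #else
    return '/';
  #endif
  }

  /// Joins \p Dir and \p File with the host's directory separator.
  std::string joinPath(const std::string &Dir, const std::string &File) {
    return Dir + getPathSeparator() + File;
  }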
574
575Do not use RTTI or Exceptions
576^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
577
578In an effort to reduce code and executable size, LLVM does not use RTTI
(e.g. ``dynamic_cast<>``) or exceptions.  These two language features violate
580the general C++ principle of *"you only pay for what you use"*, causing
581executable bloat even if exceptions are never used in the code base, or if RTTI
582is never used for a class.  Because of this, we turn them off globally in the
583code.
584
That said, LLVM does make extensive use of a hand-rolled form of RTTI that uses
586templates like :ref:`isa\<>, cast\<>, and dyn_cast\<> <isa>`.
587This form of RTTI is opt-in and can be
588:doc:`added to any class <HowToSetUpLLVMStyleRTTI>`. It is also
589substantially more efficient than ``dynamic_cast<>``.
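
The general shape of this opt-in scheme can be sketched as follows.  This is an
illustration under stated assumptions, not LLVM's implementation: the
``Shape`` hierarchy is hypothetical, and ``isaSketch`` and ``dynCastSketch``
are simplified stand-ins written for this example rather than the real
``isa<>`` and ``dyn_cast<>`` templates.

.. code-block:: c++

  // Hypothetical hierarchy: the base class stores a kind tag.
  class Shape {
  public:
    enum ShapeKind { SK_Square, SK_Circle };
    explicit Shape(ShapeKind K) : Kind(K) {}
    ShapeKind getKind() const { return Kind; }

  private:
    const ShapeKind Kind;
  };

  class Square : public Shape {
  public:
    explicit Square(double S) : Shape(SK_Square), Side(S) {}
    double area() const { return Side * Side; }
    // Each class opts in by answering "is this object one of mine?".
    static bool classof(const Shape *S) { return S->getKind() == SK_Square; }

  private:
    double Side;
  };

  class Circle : public Shape {
  public:
    explicit Circle(double R) : Shape(SK_Circle), Radius(R) {}
    double getRadius() const { return Radius; }
    static bool classof(const Shape *S) { return S->getKind() == SK_Circle; }

  private:
    double Radius;
  };

  // Simplified stand-ins for illustration only.
  template <typename To, typename From> bool isaSketch(const From *V) {
    return To::classof(V);
  }

  template <typename To, typename From> To *dynCastSketch(From *V) {
    return isaSketch<To>(V) ? static_cast<To *>(V) : nullptr;
  }

No virtual functions or compiler-generated type information are involved; the
check is a comparison of an enum value, and only classes that define
``classof`` take part.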
590
591.. _static constructor:
592
593Do not use Static Constructors
594^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
595
596Static constructors and destructors (e.g. global variables whose types have a
597constructor or destructor) should not be added to the code base, and should be
598removed wherever possible.  Besides `well known problems
599<https://yosefk.com/c++fqa/ctors.html#fqa-10.12>`_ where the order of
600initialization is undefined between globals in different source files, the
601entire concept of static constructors is at odds with the common use case of
602LLVM as a library linked into a larger application.
603
604Consider the use of LLVM as a JIT linked into another application (perhaps for
605`OpenGL, custom languages <https://llvm.org/Users.html>`_, `shaders in movies
606<https://llvm.org/devmtg/2010-11/Gritz-OpenShadingLang.pdf>`_, etc). Due to the
607design of static constructors, they must be executed at startup time of the
608entire application, regardless of whether or how LLVM is used in that larger
609application.  There are two problems with this:
610
611* The time to run the static constructors impacts startup time of applications
612  --- a critical time for GUI apps, among others.
613
614* The static constructors cause the app to pull many extra pages of memory off
615  the disk: both the code for the constructor in each ``.o`` file and the small
616  amount of data that gets touched. In addition, touched/dirty pages put more
617  pressure on the VM system on low-memory machines.
618
619We would really like for there to be zero cost for linking in an additional LLVM
620target or other library into an application, but static constructors violate
621this goal.
622
623That said, LLVM unfortunately does contain static constructors.  It would be a
624`great project <https://llvm.org/PR11944>`_ for someone to purge all static
625constructors from LLVM, and then enable the ``-Wglobal-constructors`` warning
626flag (when building with Clang) to ensure we do not regress in the future.
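
One common C++ technique for avoiding a static constructor, shown here as a
sketch rather than as the required LLVM idiom, is to move the object into a
function so that it is constructed on first use instead of at application
startup.  ``getKnownTargets`` and its contents are hypothetical.

.. code-block:: c++

  #include <string>
  #include <vector>

  // Avoid: a global like the following needs a static constructor, which
  // runs at startup even if the list is never used.
  //
  //   static const std::vector<std::string> KnownTargets = {"x86", "arm"};

  // Hypothetical accessor: the vector is built the first time the function
  // is called.  Note that its destructor still runs at program exit.
  const std::vector<std::string> &getKnownTargets() {
    static const std::vector<std::string> Targets = {"x86", "arm", "riscv"};
    return Targets;
  }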
627
628Use of ``class`` and ``struct`` Keywords
629^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
630
631In C++, the ``class`` and ``struct`` keywords can be used almost
632interchangeably. The only difference is when they are used to declare a class:
633``class`` makes all members private by default while ``struct`` makes all
634members public by default.
635
636Unfortunately, not all compilers follow the rules and some will generate
637different symbols based on whether ``class`` or ``struct`` was used to declare
638the symbol (e.g., MSVC).  This can lead to problems at link time.
639
640* All declarations and definitions of a given ``class`` or ``struct`` must use
641  the same keyword.  For example:
642
643.. code-block:: c++
644
645  class Foo;
646
647  // Breaks mangling in MSVC.
648  struct Foo { int Data; };
649
650* As a rule of thumb, ``struct`` should be kept to structures where *all*
651  members are declared public.
652
653.. code-block:: c++
654
655  // Foo feels like a class... this is strange.
656  struct Foo {
657  private:
658    int Data;
659  public:
660    Foo() : Data(0) { }
661    int getData() const { return Data; }
662    void setData(int D) { Data = D; }
663  };
664
665  // Bar isn't POD, but it does look like a struct.
666  struct Bar {
667    int Data;
668    Bar() : Data(0) { }
669  };
670
671Do not use Braced Initializer Lists to Call a Constructor
672^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
673
674In C++11 there is a "generalized initialization syntax" which allows calling
675constructors using braced initializer lists. Do not use these to call
676constructors with any interesting logic or if you care that you're calling some
677*particular* constructor. Those should look like function calls using
678parentheses rather than like aggregate initialization. Similarly, if you need
679to explicitly name the type and call its constructor to create a temporary,
680don't use a braced initializer list. Instead, use a braced initializer list
681(without any type for temporaries) when doing aggregate initialization or
682something notionally equivalent. Examples:
683
684.. code-block:: c++
685
686  class Foo {
687  public:
688    // Construct a Foo by reading data from the disk in the whizbang format, ...
689    Foo(std::string filename);
690
691    // Construct a Foo by looking up the Nth element of some global data ...
692    Foo(int N);
693
694    // ...
695  };
696
697  // The Foo constructor call is very deliberate, no braces.
698  std::fill(foo.begin(), foo.end(), Foo("name"));
699
700  // The pair is just being constructed like an aggregate, use braces.
701  bar_map.insert({my_key, my_value});
702
If you use a braced initializer list when initializing a variable, use an equals
sign before the open curly brace:
704
705.. code-block:: c++
706
707  int data[] = {0, 1, 2, 3};
708
709Use ``auto`` Type Deduction to Make Code More Readable
710^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
711
Some are advocating a policy of "almost always ``auto``" in C++11; however,
LLVM uses a more moderate stance. Use ``auto`` if and only if it makes the code
more readable or easier to maintain. Don't "almost always" use ``auto``, but do
use ``auto`` with initializers like ``cast<Foo>(...)`` or other places where the
716type is already obvious from the context. Another time when ``auto`` works well
717for these purposes is when the type would have been abstracted away anyways,
718often behind a container's typedef such as ``std::vector<T>::iterator``.
719
720Similarly, C++14 adds generic lambda expressions where parameter types can be
721``auto``. Use these where you would have used a template.
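
A short sketch of both points, using hypothetical names (``Table``,
``countValues``, and ``doubleBoth`` are made up for this example): ``auto``
replaces an iterator type that is abstracted away anyway, a generic lambda
stands in for a small function template, and the result type is spelled out
where it would not be obvious.

.. code-block:: c++

  #include <map>
  #include <string>
  #include <vector>

  // Hypothetical container type used only for this example.
  using Table = std::map<std::string, std::vector<int>>;

  int countValues(const Table &T) {
    int Total = 0;
    // The iterator type is abstracted behind the container; auto is fine.
    for (auto It = T.begin(), End = T.end(); It != End; ++It)
      Total += static_cast<int>(It->second.size());
    return Total;
  }

  int doubleBoth() {
    // A generic lambda (C++14) where a function template would otherwise
    // be needed.
    auto Twice = [](auto X) { return X + X; };
    // The result type is not obvious from the call, so it is written out.
    std::string Text = Twice(std::string("ab"));
    return Twice(21) + static_cast<int>(Text.size());
  }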
722
723Beware unnecessary copies with ``auto``
724^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
725
726The convenience of ``auto`` makes it easy to forget that its default behavior
727is a copy.  Particularly in range-based ``for`` loops, careless copies are
728expensive.
729
730As a rule of thumb, use ``auto &`` unless you need to copy the result, and use
731``auto *`` when copying pointers.
732
733.. code-block:: c++
734
735  // Typically there's no reason to copy.
736  for (const auto &Val : Container) { observe(Val); }
737  for (auto &Val : Container) { Val.change(); }
738
739  // Remove the reference if you really want a new copy.
740  for (auto Val : Container) { Val.change(); saveSomewhere(Val); }
741
742  // Copy pointers, but make it clear that they're pointers.
743  for (const auto *Ptr : Container) { observe(*Ptr); }
744  for (auto *Ptr : Container) { Ptr->change(); }
745
746Beware of non-determinism due to ordering of pointers
747^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
748
749In general, there is no relative ordering among pointers. As a result,
when unordered containers like sets and maps are used with pointer keys,
the iteration order is undefined. Hence, iterating such containers may
result in non-deterministic code generation. While the generated code
might not necessarily be "wrong code", this non-determinism might result
in unexpected runtime crashes or simply hard-to-reproduce bugs on the
customer side, making them harder to debug and fix.
756
757As a rule of thumb, in case an ordered result is expected, remember to
sort an unordered container before iteration, or use ordered containers
759like vector/MapVector/SetVector if you want to iterate pointer keys.
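
The "sort before iteration" advice can be sketched with the standard library
alone. ``Node``, its ``Id`` field and ``sortedById`` are hypothetical names
used only for this illustration, and the sketch assumes that every node has a
unique ID that does not change from run to run.

.. code-block:: c++

  #include <algorithm>
  #include <unordered_set>
  #include <vector>

  // Hypothetical node type with an ID that is stable from run to run.
  struct Node {
    unsigned Id;
  };

  // Returns the pointers ordered by ID. Iterating Nodes directly would visit
  // them in an order that can change from run to run.
  std::vector<const Node *>
  sortedById(const std::unordered_set<const Node *> &Nodes) {
    std::vector<const Node *> Result(Nodes.begin(), Nodes.end());
    std::sort(Result.begin(), Result.end(),
              [](const Node *LHS, const Node *RHS) {
                return LHS->Id < RHS->Id;
              });
    return Result;
  }

The key point is that the comparison uses a property of the pointee, never the
pointer value itself.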
760
761Beware of non-deterministic sorting order of equal elements
762^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
763
``std::sort`` uses a non-stable sorting algorithm in which the order of equal
elements is not guaranteed to be preserved. Thus using ``std::sort`` for a
container having equal elements may result in non-deterministic behavior.
To uncover such instances of non-determinism, LLVM has introduced a new
``llvm::sort`` wrapper function. For an ``EXPENSIVE_CHECKS`` build this will
randomly shuffle the container before sorting. As a rule of thumb, always make
sure to use ``llvm::sort`` instead of ``std::sort``.
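
Whichever sort function is called, the result is only deterministic if the
comparator never treats two distinct elements as equal. The sketch below shows
one way to get there by adding a tie-breaker. It uses ``std::sort`` so that it
builds without LLVM headers; ``Entry`` and ``sortEntries`` are hypothetical
names, and the sketch assumes no two entries agree on both fields.

.. code-block:: c++

  #include <algorithm>
  #include <string>
  #include <tuple>
  #include <vector>

  // Hypothetical record; several entries may share the same Priority.
  struct Entry {
    int Priority;
    std::string Name;
  };

  // Comparing on Priority alone would leave entries with equal priority in
  // an unspecified relative order. Breaking ties on Name removes the
  // ambiguity.
  void sortEntries(std::vector<Entry> &Entries) {
    std::sort(Entries.begin(), Entries.end(),
              [](const Entry &LHS, const Entry &RHS) {
                return std::tie(LHS.Priority, LHS.Name) <
                       std::tie(RHS.Priority, RHS.Name);
              });
  }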
771
772Style Issues
773============
774
775The High-Level Issues
776---------------------
777
778Self-contained Headers
779^^^^^^^^^^^^^^^^^^^^^^
780
Header files should be self-contained (compile on their own) and end in ``.h``.
Non-header files that are meant for inclusion should end in ``.inc`` and be
used sparingly.
784
785All header files should be self-contained. Users and refactoring tools should
786not have to adhere to special conditions to include the header. Specifically, a
787header should have header guards and include all other headers it needs.
788
789There are rare cases where a file designed to be included is not
790self-contained. These are typically intended to be included at unusual
791locations, such as the middle of another file. They might not use header
guards, and might not include their prerequisites. Name such files with the
``.inc`` extension. Use them sparingly, and prefer self-contained headers when
possible.
794
795In general, a header should be implemented by one or more ``.cpp`` files.  Each
796of these ``.cpp`` files should include the header that defines their interface
797first.  This ensures that all of the dependences of the header have been
798properly added to the header itself, and are not implicit.  System headers
799should be included after user headers for a translation unit.
800
801Library Layering
802^^^^^^^^^^^^^^^^
803
804A directory of header files (for example ``include/llvm/Foo``) defines a
805library (``Foo``). Dependencies between libraries are defined by the
806``LLVMBuild.txt`` file in their implementation (``lib/Foo``). One library (both
807its headers and implementation) should only use things from the libraries
808listed in its dependencies.
809
810Some of this constraint can be enforced by classic Unix linkers (Mac & Windows
811linkers, as well as lld, do not enforce this constraint). A Unix linker
812searches left to right through the libraries specified on its command line and
813never revisits a library. In this way, no circular dependencies between
814libraries can exist.
815
816This doesn't fully enforce all inter-library dependencies, and importantly
817doesn't enforce header file circular dependencies created by inline functions.
A good way to answer the "is this layered correctly" question is to consider
whether a Unix linker would succeed at linking the program if all inline
functions were defined out-of-line, and for all valid orderings of
dependencies. Since link resolution is linear, it's possible that some implicit
dependencies can sneak through: if A depends on B and C, the valid orderings
are "C B A" and "B C A", and in both cases the explicit dependencies come
before their use. But in the first case, B could still link successfully if it
implicitly depended on C, and likewise C on B in the second case.
826
827.. _minimal list of #includes:
828
829``#include`` as Little as Possible
830^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
831
832``#include`` hurts compile time performance.  Don't do it unless you have to,
833especially in header files.
834
835But wait! Sometimes you need to have the definition of a class to use it, or to
836inherit from it.  In these cases go ahead and ``#include`` that header file.  Be
837aware however that there are many cases where you don't need to have the full
838definition of a class.  If you are using a pointer or reference to a class, you
839don't need the header file.  If you are simply returning a class instance from a
840prototyped function or method, you don't need it.  In fact, for most cases, you
841simply don't need the definition of a class. And not ``#include``\ing speeds up
842compilation.
843
It is easy to go overboard on this recommendation, however.  You
845**must** include all of the header files that you are using --- you can include
846them either directly or indirectly through another header file.  To make sure
847that you don't accidentally forget to include a header file in your module
848header, make sure to include your module header **first** in the implementation
849file (as mentioned above).  This way there won't be any hidden dependencies that
850you'll find out about later.
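
As a minimal sketch of when a forward declaration is enough, the single-file
example below declares three functions before the class they mention is
defined. ``Widget`` and all function names are hypothetical, and the comments
mark where the header and implementation boundary would be in a real project.

.. code-block:: c++

  #include <string>
  #include <utility>

  // Forward declaration: enough for every declaration below, because each
  // one only names Widget as a reference, a pointer, or a return type.
  class Widget;
  std::string describe(const Widget &W);
  const Widget *pickFirst(const Widget *A, const Widget *B);
  Widget makeWidget(std::string Name);

  // The full definition is needed only where members are accessed or objects
  // are created. In a real project it would live in its own header, which
  // only the implementation file has to include.
  class Widget {
  public:
    explicit Widget(std::string Name) : Name(std::move(Name)) {}
    const std::string &getName() const { return Name; }

  private:
    std::string Name;
  };

  std::string describe(const Widget &W) { return "widget " + W.getName(); }

  const Widget *pickFirst(const Widget *A, const Widget *B) {
    return A ? A : B;
  }

  Widget makeWidget(std::string Name) { return Widget(std::move(Name)); }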
851
852Keep "Internal" Headers Private
853^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
854
855Many modules have a complex implementation that causes them to use more than one
856implementation (``.cpp``) file.  It is often tempting to put the internal
857communication interface (helper classes, extra functions, etc) in the public
858module header file.  Don't do this!
859
860If you really need to do something like this, put a private header file in the
861same directory as the source files, and include it locally.  This ensures that
862your private interface remains private and undisturbed by outsiders.
863
864.. note::
865
866    It's okay to put extra implementation methods in a public class itself. Just
867    make them private (or protected) and all is well.
868
869.. _early exits:
870
871Use Early Exits and ``continue`` to Simplify Code
872^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
873
874When reading code, keep in mind how much state and how many previous decisions
875have to be remembered by the reader to understand a block of code.  Aim to
876reduce indentation where possible when it doesn't make it more difficult to
877understand the code.  One great way to do this is by making use of early exits
878and the ``continue`` keyword in long loops.  As an example of using an early
879exit from a function, consider this "bad" code:
880
881.. code-block:: c++
882
883  Value *doSomething(Instruction *I) {
884    if (!I->isTerminator() &&
885        I->hasOneUse() && doOtherThing(I)) {
886      ... some long code ....
887    }
888
    return nullptr;
890  }
891
892This code has several problems if the body of the ``'if'`` is large.  When
893you're looking at the top of the function, it isn't immediately clear that this
894*only* does interesting things with non-terminator instructions, and only
895applies to things with the other predicates.  Second, it is relatively difficult
896to describe (in comments) why these predicates are important because the ``if``
897statement makes it difficult to lay out the comments.  Third, when you're deep
898within the body of the code, it is indented an extra level.  Finally, when
899reading the top of the function, it isn't clear what the result is if the
900predicate isn't true; you have to read to the end of the function to know that
901it returns null.
902
903It is much preferred to format the code like this:
904
905.. code-block:: c++
906
907  Value *doSomething(Instruction *I) {
908    // Terminators never need 'something' done to them because ...
    if (I->isTerminator())
      return nullptr;

    // We conservatively avoid transforming instructions with multiple uses
    // because goats like cheese.
    if (!I->hasOneUse())
      return nullptr;

    // This is really just here for example.
    if (!doOtherThing(I))
      return nullptr;
920
921    ... some long code ....
922  }
923
924This fixes these problems.  A similar problem frequently happens in ``for``
925loops.  A silly example is something like this:
926
927.. code-block:: c++
928
929  for (Instruction &I : BB) {
930    if (auto *BO = dyn_cast<BinaryOperator>(&I)) {
931      Value *LHS = BO->getOperand(0);
932      Value *RHS = BO->getOperand(1);
933      if (LHS != RHS) {
934        ...
935      }
936    }
937  }
938
When you have very, very small loops, this sort of structure is fine. But if
the loop body exceeds 10-15 lines, it becomes difficult for people to read and
understand at a glance. The problem with this sort of code is that it gets very
nested very quickly, which means that the reader has to keep a lot of context
in their brain to remember what is going on in the loop, because they don't
know if/when the ``if`` conditions will have ``else``\s etc. It is strongly
preferred to structure the loop like this:
946
947.. code-block:: c++
948
949  for (Instruction &I : BB) {
950    auto *BO = dyn_cast<BinaryOperator>(&I);
951    if (!BO) continue;
952
953    Value *LHS = BO->getOperand(0);
954    Value *RHS = BO->getOperand(1);
955    if (LHS == RHS) continue;
956
957    ...
958  }
959
960This has all the benefits of using early exits for functions: it reduces nesting
961of the loop, it makes it easier to describe why the conditions are true, and it
962makes it obvious to the reader that there is no ``else`` coming up that they
963have to push context into their brain for.  If a loop is large, this can be a
964big understandability win.
965
966Don't use ``else`` after a ``return``
967^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
968
For reasons similar to those above (reduction of indentation and easier
reading), please do not use ``else`` or ``else if`` after something that
interrupts control flow, such as ``return``, ``break``, ``continue`` or
``goto``. For example, this is *bad*:
973
974.. code-block:: c++
975
976  case 'J': {
977    if (Signed) {
978      Type = Context.getsigjmp_bufType();
979      if (Type.isNull()) {
980        Error = ASTContext::GE_Missing_sigjmp_buf;
981        return QualType();
982      } else {
983        break;
984      }
985    } else {
986      Type = Context.getjmp_bufType();
987      if (Type.isNull()) {
988        Error = ASTContext::GE_Missing_jmp_buf;
989        return QualType();
990      } else {
991        break;
992      }
993    }
994  }
995
996It is better to write it like this:
997
998.. code-block:: c++
999
1000  case 'J':
1001    if (Signed) {
1002      Type = Context.getsigjmp_bufType();
1003      if (Type.isNull()) {
1004        Error = ASTContext::GE_Missing_sigjmp_buf;
1005        return QualType();
1006      }
1007    } else {
1008      Type = Context.getjmp_bufType();
1009      if (Type.isNull()) {
1010        Error = ASTContext::GE_Missing_jmp_buf;
1011        return QualType();
1012      }
1013    }
1014    break;
1015
1016Or better yet (in this case) as:
1017
1018.. code-block:: c++
1019
1020  case 'J':
1021    if (Signed)
1022      Type = Context.getsigjmp_bufType();
1023    else
1024      Type = Context.getjmp_bufType();
1025
1026    if (Type.isNull()) {
1027      Error = Signed ? ASTContext::GE_Missing_sigjmp_buf :
1028                       ASTContext::GE_Missing_jmp_buf;
1029      return QualType();
1030    }
1031    break;
1032
1033The idea is to reduce indentation and the amount of code you have to keep track
1034of when reading the code.
1035
1036Turn Predicate Loops into Predicate Functions
1037^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1038
1039It is very common to write small loops that just compute a boolean value.  There
1040are a number of ways that people commonly write these, but an example of this
1041sort of thing is:
1042
1043.. code-block:: c++
1044
1045  bool FoundFoo = false;
1046  for (unsigned I = 0, E = BarList.size(); I != E; ++I)
1047    if (BarList[I]->isFoo()) {
1048      FoundFoo = true;
1049      break;
1050    }
1051
1052  if (FoundFoo) {
1053    ...
1054  }
1055
1056This sort of code is awkward to write, and is almost always a bad sign.  Instead
1057of this sort of loop, we strongly prefer to use a predicate function (which may
1058be `static`_) that uses `early exits`_ to compute the predicate.  We prefer the
1059code to be structured like this:
1060
1061.. code-block:: c++
1062
1063  /// \returns true if the specified list has an element that is a foo.
1064  static bool containsFoo(const std::vector<Bar*> &List) {
1065    for (unsigned I = 0, E = List.size(); I != E; ++I)
1066      if (List[I]->isFoo())
1067        return true;
1068    return false;
1069  }
1070  ...
1071
1072  if (containsFoo(BarList)) {
1073    ...
1074  }
1075
1076There are many reasons for doing this: it reduces indentation and factors out
1077code which can often be shared by other code that checks for the same predicate.
1078More importantly, it *forces you to pick a name* for the function, and forces
1079you to write a comment for it.  In this silly example, this doesn't add much
1080value.  However, if the condition is complex, this can make it a lot easier for
1081the reader to understand the code that queries for this predicate.  Instead of
1082being faced with the in-line details of how we check to see if the BarList
1083contains a foo, we can trust the function name and continue reading with better
1084locality.
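
Once the predicate has a name, its body can also be written with a standard
algorithm instead of a hand-written loop. This is a sketch of that variation,
not a requirement of this section; ``Bar`` is a hypothetical stand-in for the
element type used in the example above.

.. code-block:: c++

  #include <algorithm>
  #include <vector>

  // Hypothetical element type for this illustration.
  struct Bar {
    bool Foo;
    bool isFoo() const { return Foo; }
  };

  /// \returns true if the specified list has an element that is a foo.
  static bool containsFoo(const std::vector<Bar *> &List) {
    return std::any_of(List.begin(), List.end(),
                       [](const Bar *B) { return B->isFoo(); });
  }

Callers are unaffected by the change, which is part of the benefit of naming
the predicate in the first place.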
1085
1086The Low-Level Issues
1087--------------------
1088
1089Name Types, Functions, Variables, and Enumerators Properly
1090^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1091
1092Poorly-chosen names can mislead the reader and cause bugs. We cannot stress
1093enough how important it is to use *descriptive* names.  Pick names that match
1094the semantics and role of the underlying entities, within reason.  Avoid
1095abbreviations unless they are well known.  After picking a good name, make sure
to use consistent capitalization for the name, as inconsistency requires clients
to either memorize the APIs or look them up to find the exact spelling.
1098
1099In general, names should be in camel case (e.g. ``TextFileReader`` and
1100``isLValue()``).  Different kinds of declarations have different rules:
1101
1102* **Type names** (including classes, structs, enums, typedefs, etc) should be
1103  nouns and start with an upper-case letter (e.g. ``TextFileReader``).
1104
1105* **Variable names** should be nouns (as they represent state).  The name should
1106  be camel case, and start with an upper case letter (e.g. ``Leader`` or
1107  ``Boats``).
1108
* **Function names** should be verb phrases (as they represent actions), and
  command-like functions should be imperative.  The name should be camel case,
  and start with a lower case letter (e.g. ``openFile()`` or ``isFoo()``).
1112
1113* **Enum declarations** (e.g. ``enum Foo {...}``) are types, so they should
1114  follow the naming conventions for types.  A common use for enums is as a
1115  discriminator for a union, or an indicator of a subclass.  When an enum is
1116  used for something like this, it should have a ``Kind`` suffix
1117  (e.g. ``ValueKind``).
1118
1119* **Enumerators** (e.g. ``enum { Foo, Bar }``) and **public member variables**
1120  should start with an upper-case letter, just like types.  Unless the
1121  enumerators are defined in their own small namespace or inside a class,
1122  enumerators should have a prefix corresponding to the enum declaration name.
1123  For example, ``enum ValueKind { ... };`` may contain enumerators like
1124  ``VK_Argument``, ``VK_BasicBlock``, etc.  Enumerators that are just
1125  convenience constants are exempt from the requirement for a prefix.  For
1126  instance:
1127
1128  .. code-block:: c++
1129
1130      enum {
1131        MaxSize = 42,
1132        Density = 12
1133      };
1134
1135As an exception, classes that mimic STL classes can have member names in STL's
1136style of lower-case words separated by underscores (e.g. ``begin()``,
1137``push_back()``, and ``empty()``). Classes that provide multiple
1138iterators should add a singular prefix to ``begin()`` and ``end()``
1139(e.g. ``global_begin()`` and ``use_begin()``).
1140
1141Here are some examples of good and bad names:
1142
1143.. code-block:: c++
1144
1145  class VehicleMaker {
1146    ...
1147    Factory<Tire> F;            // Bad -- abbreviation and non-descriptive.
1148    Factory<Tire> Factory;      // Better.
    Factory<Tire> TireFactory;  // Even better -- if VehicleMaker has more than one
                                // kind of factory.
1151  };
1152
1153  Vehicle makeVehicle(VehicleType Type) {
1154    VehicleMaker M;                         // Might be OK if having a short life-span.
1155    Tire Tmp1 = M.makeTire();               // Bad -- 'Tmp1' provides no information.
1156    Light Headlight = M.makeLight("head");  // Good -- descriptive.
1157    ...
1158  }
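
The rules for types, enumerators, variables and functions can be seen together
in one small sketch. ``ShapeKind``, ``Shape`` and their members are hypothetical
names chosen for this illustration only.

.. code-block:: c++

  // The enum is used as a discriminator, so it carries a Kind suffix, and
  // its enumerators carry a prefix derived from the enum name.
  enum ShapeKind { SK_Circle, SK_Square };

  // Type names are nouns and start with an upper-case letter.
  class Shape {
  public:
    explicit Shape(ShapeKind Kind) : Kind(Kind) {}

    // Function names are verb phrases and start with a lower-case letter.
    ShapeKind getKind() const { return Kind; }
    bool isRound() const { return Kind == SK_Circle; }

  private:
    // Variable names are nouns and start with an upper-case letter.
    ShapeKind Kind;
  };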
1159
1160Assert Liberally
1161^^^^^^^^^^^^^^^^
1162
Use the "``assert``" macro to its fullest.  Check all of your preconditions and
assumptions.  You never know when a bug (not necessarily even yours) might be
caught early by an assertion, which reduces debugging time dramatically.  The
"``<cassert>``" header file is probably already included by the header files you
are using, so it doesn't cost anything to use it.
1168
1169To further assist with debugging, make sure to put some kind of error message in
1170the assertion statement, which is printed if the assertion is tripped. This
1171helps the poor debugger make sense of why an assertion is being made and
1172enforced, and hopefully what to do about it.  Here is one complete example:
1173
1174.. code-block:: c++
1175
1176  inline Value *getOperand(unsigned I) {
1177    assert(I < Operands.size() && "getOperand() out of range!");
1178    return Operands[I];
1179  }
1180
1181Here are more examples:
1182
1183.. code-block:: c++
1184
1185  assert(Ty->isPointerType() && "Can't allocate a non-pointer type!");
1186
1187  assert((Opcode == Shl || Opcode == Shr) && "ShiftInst Opcode invalid!");
1188
  assert(Idx < getNumSuccessors() && "Successor # out of range!");
1190
1191  assert(V1.getType() == V2.getType() && "Constant types must be identical!");
1192
1193  assert(isa<PHINode>(Succ->front()) && "Only works on PHId BBs!");
1194
1195You get the idea.
1196
1197In the past, asserts were used to indicate a piece of code that should not be
1198reached.  These were typically of the form:
1199
1200.. code-block:: c++
1201
1202  assert(0 && "Invalid radix for integer literal");
1203
1204This has a few issues, the main one being that some compilers might not
1205understand the assertion, or warn about a missing return in builds where
1206assertions are compiled out.
1207
1208Today, we have something much better: ``llvm_unreachable``:
1209
1210.. code-block:: c++
1211
1212  llvm_unreachable("Invalid radix for integer literal");
1213
1214When assertions are enabled, this will print the message if it's ever reached
1215and then exit the program. When assertions are disabled (i.e. in release
1216builds), ``llvm_unreachable`` becomes a hint to compilers to skip generating
1217code for this branch. If the compiler does not support this, it will fall back
1218to the "abort" implementation.
1219
Neither assertions nor ``llvm_unreachable`` will abort the program on a release
1221build. If the error condition can be triggered by user input then the
1222recoverable error mechanism described in :doc:`ProgrammersManual` should be
1223used instead. In cases where this is not practical, ``report_fatal_error`` may
1224be used.
1225
1226Another issue is that values used only by assertions will produce an "unused
1227value" warning when assertions are disabled.  For example, this code will warn:
1228
1229.. code-block:: c++
1230
1231  unsigned Size = V.size();
1232  assert(Size > 42 && "Vector smaller than it should be");
1233
1234  bool NewToSet = Myset.insert(Value);
1235  assert(NewToSet && "The value shouldn't be in the set yet");
1236
These are two interestingly different cases. In the first case, the call to
1238``V.size()`` is only useful for the assert, and we don't want it executed when
1239assertions are disabled.  Code like this should move the call into the assert
1240itself.  In the second case, the side effects of the call must happen whether
1241the assert is enabled or not.  In this case, the value should be cast to void to
1242disable the warning.  To be specific, it is preferred to write the code like
1243this:
1244
1245.. code-block:: c++
1246
1247  assert(V.size() > 42 && "Vector smaller than it should be");
1248
1249  bool NewToSet = Myset.insert(Value); (void)NewToSet;
1250  assert(NewToSet && "The value shouldn't be in the set yet");
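
Where C++17 is available, the ``[[maybe_unused]]`` attribute is another way to
silence the warning without a cast. The sketch below shows that alternative; it
is not a rule stated in this section. ``insertFresh`` is a hypothetical helper,
and ``std::set`` stands in for the unspecified set type used above.

.. code-block:: c++

  #include <cassert>
  #include <set>

  // Inserts Value, which the caller promises is not in the set yet.
  void insertFresh(std::set<int> &Set, int Value) {
    // The insertion must happen in every build mode, so it stays outside the
    // assert. The attribute silences the unused-variable warning in builds
    // where the assert compiles to nothing.
    [[maybe_unused]] bool NewToSet = Set.insert(Value).second;
    assert(NewToSet && "The value shouldn't be in the set yet");
  }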
1251
1252Do Not Use ``using namespace std``
1253^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1254
1255In LLVM, we prefer to explicitly prefix all identifiers from the standard
1256namespace with an "``std::``" prefix, rather than rely on "``using namespace
1257std;``".
1258
1259In header files, adding a ``'using namespace XXX'`` directive pollutes the
1260namespace of any source file that ``#include``\s the header.  This is clearly a
1261bad thing.
1262
1263In implementation files (e.g. ``.cpp`` files), the rule is more of a stylistic
1264rule, but is still important.  Basically, using explicit namespace prefixes
makes the code **clearer**, because it is immediately obvious what facilities
are being used and where they are coming from, and **more portable**, because
namespace clashes cannot occur between LLVM code and other namespaces.  The
1268portability rule is important because different standard library implementations
1269expose different symbols (potentially ones they shouldn't), and future revisions
1270to the C++ standard will add more symbols to the ``std`` namespace.  As such, we
1271never use ``'using namespace std;'`` in LLVM.
1272
1273The exception to the general rule (i.e. it's not an exception for the ``std``
1274namespace) is for implementation files.  For example, all of the code in the
1275LLVM project implements code that lives in the 'llvm' namespace.  As such, it is
1276ok, and actually clearer, for the ``.cpp`` files to have a ``'using namespace
1277llvm;'`` directive at the top, after the ``#include``\s.  This reduces
1278indentation in the body of the file for source editors that indent based on
1279braces, and keeps the conceptual context cleaner.  The general form of this rule
1280is that any ``.cpp`` file that implements code in any namespace may use that
1281namespace (and its parents'), but should not use any others.
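
The general form of the rule can be sketched in a single file. ``mylib`` and
``greet`` are hypothetical names, and the comments mark which part would
normally sit in a header and which in the matching ``.cpp`` file. Defining the
function with an explicit ``mylib::`` qualifier is one possible style and is an
assumption of this sketch.

.. code-block:: c++

  // --- Would normally live in a header (hypothetical Greeter.h) ---
  #include <string>

  namespace mylib {
  std::string greet(const std::string &Name);
  } // end namespace mylib

  // --- Would normally live in the implementation file (Greeter.cpp) ---
  // The file implements code in mylib, so it may use that namespace. The
  // standard library is still spelled with an explicit std:: prefix.
  using namespace mylib;

  std::string mylib::greet(const std::string &Name) {
    return "hello " + Name;
  }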
1282
1283Provide a Virtual Method Anchor for Classes in Headers
1284^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1285
1286If a class is defined in a header file and has a vtable (either it has virtual
1287methods or it derives from classes with virtual methods), it must always have at
1288least one out-of-line virtual method in the class.  Without this, the compiler
1289will copy the vtable and RTTI into every ``.o`` file that ``#include``\s the
1290header, bloating ``.o`` file sizes and increasing link times.
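
A minimal sketch of such an anchor follows. ``Shape``, ``Triangle`` and the
method name ``anchor`` are hypothetical choices for this illustration; any
out-of-line virtual method serves the same purpose. The comments mark where the
header and implementation boundary would be in a real project.

.. code-block:: c++

  // --- Header part (hypothetical Shape.h) ---
  class Shape {
  public:
    virtual ~Shape() = default;
    virtual int sides() const = 0;

  private:
    // Declared here and defined out of line below. On common ABIs this
    // gives the vtable a single home object file.
    virtual void anchor();
  };

  class Triangle : public Shape {
  public:
    int sides() const override { return 3; }

  private:
    void anchor() override;
  };

  // --- Implementation part (hypothetical Shape.cpp) ---
  void Shape::anchor() {}
  void Triangle::anchor() {}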
1291
1292Don't use default labels in fully covered switches over enumerations
1293^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1294
1295``-Wswitch`` warns if a switch, without a default label, over an enumeration
1296does not cover every enumeration value. If you write a default label on a fully
1297covered switch over an enumeration then the ``-Wswitch`` warning won't fire
1298when new elements are added to that enumeration. To help avoid adding these
1299kinds of defaults, Clang has the warning ``-Wcovered-switch-default`` which is
1300off by default but turned on when building LLVM with a version of Clang that
1301supports the warning.
1302
1303A knock-on effect of this stylistic requirement is that when building LLVM with
1304GCC you may get warnings related to "control may reach end of non-void function"
1305if you return from each case of a covered switch-over-enum because GCC assumes
1306that the enum expression may take any representable value, not just those of
1307individual enumerators. To suppress this warning, use ``llvm_unreachable`` after
1308the switch.
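
The pattern looks roughly like the sketch below. So that it builds without LLVM
headers, it uses ``unreachableStandIn``, a hypothetical replacement for
``llvm_unreachable`` that prints the message and aborts; ``ColorKind`` and
``getColorName`` are likewise invented for this illustration.

.. code-block:: c++

  #include <cstdio>
  #include <cstdlib>

  // Hypothetical stand-in for llvm_unreachable: prints the message and
  // aborts in every build mode.
  [[noreturn]] inline void unreachableStandIn(const char *Msg) {
    std::fputs(Msg, stderr);
    std::abort();
  }

  enum ColorKind { CK_Red, CK_Green, CK_Blue };

  const char *getColorName(ColorKind Kind) {
    // No default label, so -Wswitch can report an enumerator added later.
    switch (Kind) {
    case CK_Red:
      return "red";
    case CK_Green:
      return "green";
    case CK_Blue:
      return "blue";
    }
    // Tells the compiler that control never falls off the end.
    unreachableStandIn("Unknown ColorKind");
  }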
1309
1310Use range-based ``for`` loops wherever possible
1311^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1312
1313The introduction of range-based ``for`` loops in C++11 means that explicit
1314manipulation of iterators is rarely necessary. We use range-based ``for``
1315loops wherever possible for all newly added code. For example:
1316
1317.. code-block:: c++
1318
1319  BasicBlock *BB = ...
1320  for (Instruction &I : *BB)
1321    ... use I ...
1322
1323Don't evaluate ``end()`` every time through a loop
1324^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1325
1326In cases where range-based ``for`` loops can't be used and it is necessary
1327to write an explicit iterator-based loop, pay close attention to whether
``end()`` is re-evaluated on each loop iteration. One common mistake is to
1329write a loop in this style:
1330
1331.. code-block:: c++
1332
1333  BasicBlock *BB = ...
1334  for (auto I = BB->begin(); I != BB->end(); ++I)
1335    ... use I ...
1336
1337The problem with this construct is that it evaluates "``BB->end()``" every time
1338through the loop.  Instead of writing the loop like this, we strongly prefer
1339loops to be written so that they evaluate it once before the loop starts.  A
1340convenient way to do this is like so:
1341
1342.. code-block:: c++
1343
1344  BasicBlock *BB = ...
1345  for (auto I = BB->begin(), E = BB->end(); I != E; ++I)
1346    ... use I ...
1347
1348The observant may quickly point out that these two loops may have different
1349semantics: if the container (a basic block in this case) is being mutated, then
1350"``BB->end()``" may change its value every time through the loop and the second
1351loop may not in fact be correct.  If you actually do depend on this behavior,
1352please write the loop in the first form and add a comment indicating that you
1353did it intentionally.
1354
1355Why do we prefer the second form (when correct)?  Writing the loop in the first
1356form has two problems. First it may be less efficient than evaluating it at the
1357start of the loop.  In this case, the cost is probably minor --- a few extra
1358loads every time through the loop.  However, if the base expression is more
1359complex, then the cost can rise quickly.  I've seen loops where the end
1360expression was actually something like: "``SomeMap[X]->end()``" and map lookups
1361really aren't cheap.  By writing it in the second form consistently, you
1362eliminate the issue entirely and don't even have to think about it.
1363
1364The second (even bigger) issue is that writing the loop in the first form hints
1365to the reader that the loop is mutating the container (a fact that a comment
1366would handily confirm!).  If you write the loop in the second form, it is
1367immediately obvious without even looking at the body of the loop that the
1368container isn't being modified, which makes it easier to read the code and
1369understand what it does.
1370
1371While the second form of the loop is a few extra keystrokes, we do strongly
1372prefer it.
1373
1374``#include <iostream>`` is Forbidden
1375^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1376
1377The use of ``#include <iostream>`` in library files is hereby **forbidden**,
1378because many common implementations transparently inject a `static constructor`_
1379into every translation unit that includes it.
1380
1381Note that using the other stream headers (``<sstream>`` for example) is not
1382problematic in this regard --- just ``<iostream>``. However, ``raw_ostream``
1383provides various APIs that are better performing for almost every use than
1384``std::ostream`` style APIs.
1385
1386.. note::
1387
1388  New code should always use `raw_ostream`_ for writing, or the
1389  ``llvm::MemoryBuffer`` API for reading files.
1390
1391.. _raw_ostream:
1392
1393Use ``raw_ostream``
1394^^^^^^^^^^^^^^^^^^^
1395
1396LLVM includes a lightweight, simple, and efficient stream implementation in
1397``llvm/Support/raw_ostream.h``, which provides all of the common features of
1398``std::ostream``.  All new code should use ``raw_ostream`` instead of
1399``ostream``.
1400
1401Unlike ``std::ostream``, ``raw_ostream`` is not a template and can be forward
1402declared as ``class raw_ostream``.  Public headers should generally not include
1403the ``raw_ostream`` header, but use forward declarations and constant references
1404to ``raw_ostream`` instances.
1405
1406Avoid ``std::endl``
1407^^^^^^^^^^^^^^^^^^^
1408
The ``std::endl`` modifier, when used with ``iostreams``, outputs a newline to
the output stream specified.  In addition to doing this, however, it also
1411flushes the output stream.  In other words, these are equivalent:
1412
1413.. code-block:: c++
1414
1415  std::cout << std::endl;
1416  std::cout << '\n' << std::flush;
1417
1418Most of the time, you probably have no reason to flush the output stream, so
1419it's better to use a literal ``'\n'``.
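
The extra flush can be observed with a stream buffer that counts calls to
``sync()``. This is a sketch using only the standard library; ``FlushCounter``
and the two helper functions are hypothetical names for this illustration.

.. code-block:: c++

  #include <ostream>
  #include <streambuf>

  // Hypothetical stream buffer that discards output and counts flushes.
  class FlushCounter : public std::streambuf {
  public:
    int Flushes = 0;

  protected:
    int_type overflow(int_type Ch) override {
      return traits_type::not_eof(Ch); // Pretend the character was written.
    }
    int sync() override {
      ++Flushes;
      return 0;
    }
  };

  // Writes Lines lines ending in std::endl and returns the flush count.
  int flushesWithEndl(int Lines) {
    FlushCounter Buf;
    std::ostream OS(&Buf);
    for (int I = 0; I != Lines; ++I)
      OS << "line" << std::endl;
    return Buf.Flushes;
  }

  // Writes the same text with '\n', which requests no flush.
  int flushesWithNewline(int Lines) {
    FlushCounter Buf;
    std::ostream OS(&Buf);
    for (int I = 0; I != Lines; ++I)
      OS << "line" << '\n';
    return Buf.Flushes;
  }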
1420
1421Don't use ``inline`` when defining a function in a class definition
1422^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1423
1424A member function defined in a class definition is implicitly inline, so don't
1425put the ``inline`` keyword in this case.
1426
1427Don't:
1428
1429.. code-block:: c++
1430
1431  class Foo {
1432  public:
1433    inline void bar() {
1434      // ...
1435    }
1436  };
1437
1438Do:
1439
1440.. code-block:: c++
1441
1442  class Foo {
1443  public:
1444    void bar() {
1445      // ...
1446    }
1447  };
1448
1449Microscopic Details
1450-------------------
1451
1452This section describes preferred low-level formatting guidelines along with
1453reasoning on why we prefer them.
1454
1455Spaces Before Parentheses
1456^^^^^^^^^^^^^^^^^^^^^^^^^
1457
1458We prefer to put a space before an open parenthesis only in control flow
1459statements, but not in normal function call expressions and function-like
1460macros.  For example, this is good:
1461
1462.. code-block:: c++
1463
1464  if (X) ...
1465  for (I = 0; I != 100; ++I) ...
1466  while (LLVMRocks) ...
1467
1468  somefunc(42);
1469  assert(3 != 4 && "laws of math are failing me");
1470
1471  A = foo(42, 92) + bar(X);
1472
1473and this is bad:
1474
1475.. code-block:: c++
1476
1477  if(X) ...
1478  for(I = 0; I != 100; ++I) ...
1479  while(LLVMRocks) ...
1480
1481  somefunc (42);
1482  assert (3 != 4 && "laws of math are failing me");
1483
1484  A = foo (42, 92) + bar (X);
1485
1486The reason for doing this is not completely arbitrary.  This style makes control
1487flow operators stand out more, and makes expressions flow better. The function
1488call operator binds very tightly as a postfix operator.  Putting a space after a
1489function name (as in the last example) makes it appear that the code might bind
1490the arguments of the left-hand-side of a binary operator with the argument list
1491of a function and the name of the right side.  More specifically, it is easy to
1492misread the "``A``" example as:
1493
1494.. code-block:: c++
1495
1496  A = foo ((42, 92) + bar) (X);
1497
when skimming through the code.  By avoiding a space in a function call, we
avoid this misinterpretation.
1500
1501Prefer Preincrement
1502^^^^^^^^^^^^^^^^^^^
1503
Hard and fast rule: Preincrement (``++X``) is no slower than postincrement
(``X++``) and could very well be a lot faster.  Use preincrement whenever
possible.
1507
1508The semantics of postincrement include making a copy of the value being
1509incremented, returning it, and then preincrementing the "work value".  For
1510primitive types, this isn't a big deal. But for iterators, it can be a huge
issue (for example, some iterators contain stack and set objects in them, and
copying an iterator could invoke their copy constructors as well).  In general,
1513get in the habit of always using preincrement, and you won't have a problem.
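
As a rough illustration of the copying cost, the sketch below counts copies of
a small iterator-like type.  ``CountingIter``, ``NumCopies``, and the two helper
functions are hypothetical names made up for this example; they are not part of
LLVM.  The exact number of copies per postincrement depends on whether the
compiler elides the copy on return, so only a lower bound is assumed:

.. code-block:: c++

  #include <cassert>
  #include <cstddef>

  // Number of CountingIter copies made so far (illustration only).
  static std::size_t NumCopies = 0;

  // Hypothetical iterator-like type whose copy constructor is observable.
  struct CountingIter {
    int Pos = 0;

    CountingIter() = default;
    CountingIter(const CountingIter &Other) : Pos(Other.Pos) { ++NumCopies; }

    // Preincrement: modifies in place and returns a reference, no copy.
    CountingIter &operator++() {
      ++Pos;
      return *this;
    }

    // Postincrement: must save the old value, which costs at least one copy.
    CountingIter operator++(int) {
      CountingIter Old(*this);
      ++Pos;
      return Old;
    }
  };

  // Returns how many copies N preincrements caused.
  static std::size_t copiesWithPreincrement(int N) {
    NumCopies = 0;
    CountingIter I;
    for (int Step = 0; Step != N; ++Step)
      ++I;
    return NumCopies;
  }

  // Returns how many copies N postincrements caused.
  static std::size_t copiesWithPostincrement(int N) {
    NumCopies = 0;
    CountingIter I;
    for (int Step = 0; Step != N; ++Step)
      I++;
    return NumCopies;
  }

Under these assumptions, ``copiesWithPreincrement(100)`` returns 0, while
``copiesWithPostincrement(100)`` returns at least 100.  For an iterator that
owns containers, each of those copies would also copy the containers.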
1514
1515
1516Namespace Indentation
1517^^^^^^^^^^^^^^^^^^^^^
1518
1519In general, we strive to reduce indentation wherever possible.  This is useful
1520because we want code to `fit into 80 columns`_ without wrapping horribly, but
1521also because it makes it easier to understand the code. To facilitate this and
1522avoid some insanely deep nesting on occasion, don't indent namespaces. If it
1523helps readability, feel free to add a comment indicating what namespace is
1524being closed by a ``}``.  For example:
1525
1526.. code-block:: c++
1527
1528  namespace llvm {
1529  namespace knowledge {
1530
1531  /// This class represents things that Smith can have an intimate
1532  /// understanding of and contains the data associated with it.
1533  class Grokable {
1534  ...
1535  public:
1536    explicit Grokable() { ... }
1537    virtual ~Grokable() = 0;
1538
1539    ...
1540
1541  };
1542
1543  } // end namespace knowledge
1544  } // end namespace llvm
1545
1546
1547Feel free to skip the closing comment when the namespace being closed is
1548obvious for any reason. For example, the outer-most namespace in a header file
is rarely a source of confusion.  But namespaces, both anonymous and named, in
source files that are closed halfway through the file probably could use
clarification.
1552
1553.. _static:
1554
1555Anonymous Namespaces
1556^^^^^^^^^^^^^^^^^^^^
1557
1558After talking about namespaces in general, you may be wondering about anonymous
1559namespaces in particular.  Anonymous namespaces are a great language feature
1560that tells the C++ compiler that the contents of the namespace are only visible
1561within the current translation unit, allowing more aggressive optimization and
1562eliminating the possibility of symbol name collisions.  Anonymous namespaces are
1563to C++ as "static" is to C functions and global variables.  While "``static``"
1564is available in C++, anonymous namespaces are more general: they can make entire
1565classes private to a file.
1566
1567The problem with anonymous namespaces is that they naturally want to encourage
1568indentation of their body, and they reduce locality of reference: if you see a
1569random function definition in a C++ file, it is easy to see if it is marked
1570static, but seeing if it is in an anonymous namespace requires scanning a big
1571chunk of the file.
1572
1573Because of this, we have a simple guideline: make anonymous namespaces as small
1574as possible, and only use them for class declarations.  For example, this is
1575good:
1576
1577.. code-block:: c++
1578
1579  namespace {
1580  class StringSort {
1581  ...
1582  public:
    StringSort(...);
1584    bool operator<(const char *RHS) const;
1585  };
1586  } // end anonymous namespace
1587
1588  static void runHelper() {
1589    ...
1590  }
1591
1592  bool StringSort::operator<(const char *RHS) const {
1593    ...
1594  }
1595
1596This is bad:
1597
1598.. code-block:: c++
1599
1600  namespace {
1601
1602  class StringSort {
1603  ...
1604  public:
    StringSort(...);
1606    bool operator<(const char *RHS) const;
1607  };
1608
1609  void runHelper() {
1610    ...
1611  }
1612
1613  bool StringSort::operator<(const char *RHS) const {
1614    ...
1615  }
1616
1617  } // end anonymous namespace
1618
This is bad specifically because if you're looking at "``runHelper``" in the
middle of a large C++ file, you have no immediate way to tell if it is local to
the file.  When it is marked static explicitly, this is immediately obvious.
1622Also, there is no reason to enclose the definition of "``operator<``" in the
1623namespace just because it was declared there.
1624
1625See Also
1626========
1627
1628A lot of these comments and recommendations have been culled from other sources.
1629Two particularly important books for our work are:
1630
1631#. `Effective C++
1632   <https://www.amazon.com/Effective-Specific-Addison-Wesley-Professional-Computing/dp/0321334876>`_
1633   by Scott Meyers.  Also interesting and useful are "More Effective C++" and
1634   "Effective STL" by the same author.
1635
1636#. `Large-Scale C++ Software Design
1637   <https://www.amazon.com/Large-Scale-Software-Design-John-Lakos/dp/0201633620>`_
1638   by John Lakos
1639
1640If you get some free time, and you haven't read them: do so, you might learn
1641something.
1642