===================================================================
How To Add Your Build Configuration To LLVM Buildbot Infrastructure
===================================================================

Introduction
============

This document describes how to add your own build configuration and
buildbot-worker to the LLVM Buildbot Infrastructure.

Buildmasters
============

There are two buildmasters running.

* The main buildmaster at `<https://lab.llvm.org/buildbot>`_. All builders
  attached to this machine will notify commit authors every time they break
  the build.
* The staging buildmaster at `<https://lab.llvm.org/staging>`_. All builders
  attached to this machine will be completely silent by default when the build
  is broken.

In order to remain connected to the main buildmaster (and thus notify
developers of failures), a buildbot must:

* Be building a supported configuration.  Builders for experimental backends
  should generally be attached to the staging buildmaster.
* Be able to keep up with new commits to the main branch, or at a minimum
  recover to tip of tree within a couple of days of falling behind.

Additionally, we encourage all bot owners to point their bots towards the
staging buildmaster during maintenance windows, instability troubleshooting,
and similar situations.

Roles & Expectations
====================

Each buildbot has an owner who is responsible for addressing problems that
arise with it.  We generally expect the bot owner to be reasonably
responsive.

For some bots, the ownership responsibility is split between a "resource owner"
who provides the underlying machine resource, and a "configuration owner" who
maintains the build configuration.  Generally, operational responsibility lies
with the "configuration owner".  We do expect "resource owners" (who are
generally the contact listed in a worker's attributes) to proxy requests to
the relevant "configuration owner" in a timely manner.

Most issues with a buildbot should be addressed directly with a bot owner
via email.  Please CC `Galina Kistanova <mailto:[email protected]>`_.

Steps To Add Builder To LLVM Buildbot
=====================================

Volunteers can provide their build machines to serve as build workers for the
public LLVM Buildbot.

Here are the steps you can follow to do so:

#. Check the existing build configurations to make sure the one you are
   interested in is not covered yet, or that it builds much faster on your
   computer than on the existing one. We prefer faster builds so developers
   will get feedback sooner after changes get committed.

#. The computer you will be registering with the LLVM buildbot
   infrastructure should have all dependencies installed, and you should be
   able to build your configuration on it successfully. Please check what
   degree of parallelism (the ``-j`` parameter) gives the fastest build.  You
   can build multiple configurations on one computer.

#. Install buildbot-worker (currently we are using buildbot version 2.8.5).
   Depending on the platform, buildbot-worker may be available to download and
   install with your package manager, or you can download it directly from
   `<http://trac.buildbot.net>`_ and install it manually.

#. Create a designated user account for your buildbot-worker to run under,
   and set appropriate permissions.

#. Choose the buildbot-worker root directory (all builds will be placed under
   it), and the buildbot-worker access name and password that the buildmaster
   will use to authenticate your buildbot-worker.

#. Create a buildbot-worker in the context of that user account. Point it
   to the **lab.llvm.org** port **9990** (see `Buildbot documentation,
   Creating a worker
   <http://docs.buildbot.net/current/tutorial/firstrun.html#creating-a-worker>`_
   for more details) by running the following command:

    .. code-block:: bash

       $ buildbot-worker create-worker <buildbot-worker-root-directory> \
                    lab.llvm.org:9990 \
                    <buildbot-worker-access-name> \
                    <buildbot-worker-access-password>

   To point a worker to the silent (staging) buildmaster, use
   ``lab.llvm.org:9994`` instead of ``lab.llvm.org:9990``.

#. Fill in the buildbot-worker description and admin name/e-mail.  Here is an
   example of the buildbot-worker description::

       Windows 7 x64
       Core i7 (2.66GHz), 16GB of RAM

       g++.exe (TDM-1 mingw32) 4.4.0
       GNU Binutils 2.19.1
       cmake version 2.8.4
       Microsoft(R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86

#. Make sure you can actually start the buildbot-worker successfully. Then set
   up your buildbot-worker to start automatically at boot.  See the
   buildbot documentation for help.  You may want to restart your computer
   to see if it works.

#. Send a patch which adds your build worker and your builder to
   `zorg <https://github.com/llvm/llvm-zorg>`_. Use the typical LLVM
   `workflow <https://llvm.org/docs/Contributing.html#how-to-submit-a-patch>`_.

   * workers are added to ``buildbot/osuosl/master/config/workers.py``
   * builders are added to ``buildbot/osuosl/master/config/builders.py``

   Please make sure your builder name and its builddir are unique within the
   file.

   All new builders should default to using the ``'collapseRequests': False``
   configuration.  This causes the builder to build each commit individually
   and not merge build requests.  To maximize quality of feedback to developers,
   we *strongly prefer* builders to be configured not to collapse requests.
   This flag should be removed only after all reasonable efforts have been
   exhausted to improve build times such that the builder can keep up with
   commit flow.

   It is possible to allow email addresses to unconditionally receive
   notifications on build failure; for this you'll need to add an
   ``InformativeMailNotifier`` to ``buildbot/osuosl/master/config/status.py``.
   This is particularly useful for the staging buildmaster, which is silent
   otherwise.

#. Send the buildbot-worker access name and the access password directly to
   `Galina Kistanova <mailto:[email protected]>`_, and wait until she
   lets you know that your changes have been applied and the buildmaster has
   been reconfigured.

#. Check the status of your buildbot-worker on the `Waterfall Display
   <http://lab.llvm.org/buildbot/#/waterfall>`_ to make sure it is connected,
   and the `Workers Display <http://lab.llvm.org/buildbot/#/workers>`_ to see if
   the administrator contact and worker information are correct.

#. Wait for the first build to succeed and enjoy.
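
The requirement in the patch step above that builder names and builddirs be
unique can be checked mechanically before sending the patch.  The following is
a minimal sketch and not part of zorg: the entry keys ``name``, ``builddir``
and ``workernames``, the sample values, and the helper ``find_duplicates`` are
assumptions made for illustration.  Only ``collapseRequests`` is taken from
the text above.

.. code-block:: python

   from collections import Counter

   # Hypothetical builder entries; the real ones live in builders.py in zorg.
   builders = [
       {"name": "example-builder-a", "builddir": "example-builder-a",
        "workernames": ["example-worker-1"], "collapseRequests": False},
       {"name": "example-builder-b", "builddir": "example-builder-b",
        "workernames": ["example-worker-2"], "collapseRequests": False},
   ]

   def find_duplicates(entries, key):
       """Return the sorted values of `key` that occur in more than one entry."""
       counts = Counter(entry[key] for entry in entries)
       return sorted(value for value, n in counts.items() if n > 1)

   # Report any name or builddir that is used by more than one builder.
   for key in ("name", "builddir"):
       duplicates = find_duplicates(builders, key)
       if duplicates:
           print(f"duplicate {key}: {duplicates}")

With the sample entries above the loop prints nothing, because every name and
builddir is used exactly once.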


Best Practices for Configuring a Fast Builder
=============================================

As mentioned above, we generally have a strong preference for
builders which can build every commit as it comes in.  This section
includes best practices and some recommendations on how to achieve
that end.

The goal
  In 2020, the monorepo had just under 35 thousand commits.  This works
  out to an average of 4 commits per hour.  Already, we can see that a
  builder must cycle in less than 15 minutes to have a hope of being
  useful.  However, those commits are not uniformly distributed.  They
  tend to cluster strongly during US working hours.  Looking at a couple
  of recent (Nov 2021) working days, we routinely see ~10 commits per
  hour during peak times, with occasional spikes as high as ~15 commits
  per hour.  Thus, as a rule of thumb, we should plan for our builder to
  complete ~10-15 builds an hour.
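
  The arithmetic behind these figures can be sketched as follows.  The commit
  count is rounded from the text above, and the names ``minutes_per_build``
  and ``average_per_hour`` are illustrative only.

  .. code-block:: python

     commits_in_2020 = 35_000      # "just under 35 thousand", rounded up
     hours_in_2020 = 366 * 24      # 2020 was a leap year

     average_per_hour = commits_in_2020 / hours_in_2020
     print(f"average commits per hour: {average_per_hour:.1f}")

     def minutes_per_build(commits_per_hour):
         """Time budget per build when every commit is built individually."""
         return 60 / commits_per_hour

     # Budget at the rounded average rate and at the observed peak rates.
     for rate in (4, 10, 15):
         print(f"{rate:2d} commits/hour -> {minutes_per_build(rate):4.1f} minutes")

  Substituting the commit rate you observe for the projects your builder
  monitors gives the corresponding budget for your own configuration.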

Resource Appropriately
  At 10-15 builds per hour, we need to complete a new build on average every
  4 to 6 minutes.  For anything except the fastest of hardware/build configs,
  this is going to be well beyond the ability of a single machine.  In buildbot
  terms, we are likely going to need multiple workers to build requests in
  parallel under a single builder configuration.  For some rough back of the
  envelope numbers, if your build config takes e.g. 30 minutes, you will need
  something on the order of 5-8 workers.  If your build config takes ~2 hours,
  you'll need something on the order of 20-30 workers.  The rest of this
  section focuses on how to reduce cycle times.
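
  The rough numbers above follow from multiplying the build time by the
  request rate.  A minimal sketch, in which ``workers_needed`` is a
  hypothetical helper and the model assumes that each worker runs one build at
  a time with no scheduling overhead:

  .. code-block:: python

     import math

     def workers_needed(build_minutes, builds_per_hour):
         """Workers required so that builds finish as fast as requests arrive."""
         return math.ceil(build_minutes * builds_per_hour / 60)

     # Evaluate both build times from the text at 10 and 15 builds per hour.
     for build_minutes in (30, 120):
         low = workers_needed(build_minutes, 10)
         high = workers_needed(build_minutes, 15)
         print(f"{build_minutes}-minute build: {low} to {high} workers")

  The same helper can be used to see how much a shorter cycle time saves: in
  this model, halving the build time halves the number of workers required.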

Restrict what you build and test
  Think hard about why you're setting up a bot, and restrict your build
  configuration as much as you can.  Basic functionality is probably
  already covered by other bots, and you don't need to duplicate that
  testing.  You only need to be building and testing the *unique* parts
  of the configuration.  (For example, for a multi-stage clang builder, you
  probably don't need to enable every target or build all the various
  utilities.)

  It can sometimes be worthwhile splitting a single builder into two or more,
  if you have multiple distinct purposes for the same builder.  As an example,
  if you want to both a) confirm that all of LLVM builds with your host
  compiler, and b) do a multi-stage clang build on your target, you
  may be better off with two separate bots.  Splitting increases resource
  consumption, but makes it easy for each bot to keep up with commit flow.
  Additionally, splitting bots may assist in triage by narrowing attention to
  relevant parts of the failing configuration.

  In general, we recommend Release build types with Assertions enabled.  This
  generally provides a good balance between build times and bug detection for
  most buildbots.  There may be room for including some debug info (e.g. with
  ``-gmlt``), but in general the balance between debug info quality and build
  times is a delicate one.

Use Ninja & LLD
  Ninja really does help build times over Make, particularly for highly
  parallel builds.  LLD helps to reduce both link times and memory usage
  during linking significantly.  With a build machine with sufficient
  parallelism, link times tend to dominate the critical path of the build,
  and are thus worth optimizing.

Use CCache and NOT incremental builds
  Using ccache materially improves average build times.  Incremental builds
  can be slightly faster, but introduce the risk of build corruption due to
  e.g. state changes.  At this point, the recommendation is not to
  use incremental builds and instead use ccache, as the latter captures the
  majority of the benefit with less risk of false positives.

  One of the non-obvious benefits of using ccache is that it makes the
  builder less sensitive to which projects are being monitored vs built.
  If a change triggers a build request, but doesn't change the build output
  (e.g. doc changes, python utility changes, etc.), the build will entirely
  hit in cache and the build request will complete in just the testing time.

  With multiple workers, it is tempting to try to configure a shared cache
  between the workers.  Experience to date indicates this is difficult to
  do well, and that having local per-worker caches gets most of the benefit
  anyway.  We don't currently recommend shared caches.

  CCache does depend on the builder hardware having sufficient IO to access
  the cache with reasonable access times, i.e. a fast disk, or enough memory
  for a RAM cache, etc.  For builders without such hardware, incremental
  builds may be your best option, but they are likely to require higher
  ongoing involvement from the sponsor.
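
  The recommendations from this entry and the two preceding ones translate
  into a handful of CMake settings.  The sketch below collects them as a
  Python list of arguments.  The option names are standard CMake and LLVM
  build options, but the helper ``fast_builder_cmake_args`` and the default
  target are assumptions made for illustration and are not zorg API.

  .. code-block:: python

     def fast_builder_cmake_args(targets=("X86",)):
         """Return CMake arguments for a Release+Asserts build with Ninja, LLD and ccache."""
         return [
             "-G", "Ninja",                    # generate Ninja files instead of Makefiles
             "-DCMAKE_BUILD_TYPE=Release",     # optimized build ...
             "-DLLVM_ENABLE_ASSERTIONS=ON",    # ... with assertions enabled
             "-DLLVM_USE_LINKER=lld",          # link with LLD
             "-DLLVM_CCACHE_BUILD=ON",         # route compilations through ccache
             # Restrict the build to the targets this builder actually tests.
             "-DLLVM_TARGETS_TO_BUILD=" + ";".join(targets),
         ]

     print(" ".join(fast_builder_cmake_args()))

  When passing the printed arguments to a shell, quote the
  ``LLVM_TARGETS_TO_BUILD`` value if it lists more than one target, since the
  separator is a semicolon.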

Enable batch builds
  As a last resort, you can configure your builder to batch build requests.
  This makes the build failure notifications markedly less actionable, and
  should only be done once all other reasonable measures have been taken.

Leave it on the staging buildmaster
  While most of this section has been biased towards builders intended for
  the main buildmaster, it is worth highlighting that builders can run
  indefinitely on the staging buildmaster.  Such a builder may still be
  useful for the sponsoring organization, without concern of negatively
  impacting the broader community.  The sponsoring organization simply
  has to take on the responsibility of all bisection and triage.