===================================================================
How To Add Your Build Configuration To LLVM Buildbot Infrastructure
===================================================================

Introduction
============

This document contains information about adding a build configuration and
a buildbot-worker to the LLVM Buildbot Infrastructure.

Buildmasters
============

There are two buildmasters running.

* The main buildmaster at `<https://lab.llvm.org/buildbot>`_. All builders
  attached to this machine will notify commit authors every time they break
  the build.
* The staging buildmaster at `<https://lab.llvm.org/staging>`_. All builders
  attached to this machine will be completely silent by default when the build
  is broken.

In order to remain connected to the main buildmaster (and thus notify
developers of failures), a buildbot must:

* Be building a supported configuration. Builders for experimental backends
  should generally be attached to the staging buildmaster.
* Be able to keep up with new commits to the main branch, or at a minimum
  recover to tip of tree within a couple of days of falling behind.

Additionally, we encourage all bot owners to point their bots towards the
staging buildmaster during maintenance windows, instability troubleshooting,
and similar situations.

Roles & Expectations
====================

Each buildbot has an owner who is the responsible party for addressing problems
which arise with said buildbot. We generally expect the bot owner to be
reasonably responsive.

For some bots, the ownership responsibility is split between a "resource owner"
who provides the underlying machine resource, and a "configuration owner" who
maintains the build configuration. Generally, operational responsibility lies
with the "config owner".
We do expect "resource owners" (who are generally
the contact listed in a worker's attributes) to proxy requests to the relevant
"config owner" in a timely manner.

Most issues with a buildbot should be addressed directly with a bot owner
via email. Please CC `Galina Kistanova <mailto:[email protected]>`_.

Steps To Add Builder To LLVM Buildbot
=====================================
Volunteers can provide their build machines to work as build workers for the
public LLVM Buildbot.

Here are the steps you can follow to do so:

#. Check the existing build configurations to make sure the one you are
   interested in is not already covered, or that your computer builds it much
   faster than the existing worker does. We prefer faster builds so developers
   will get feedback sooner after changes get committed.

#. The computer you will be registering with the LLVM buildbot
   infrastructure should have all dependencies installed, and you should
   confirm that it can actually build your configuration successfully. Please
   check what degree of parallelism (the ``-j`` parameter) gives the fastest
   build. You can build multiple configurations on one computer.

#. Install buildbot-worker (currently we are using buildbot version 2.8.5).
   Depending on the platform, buildbot-worker could be available to download and
   install with your package manager, or you can download it directly from
   `<http://trac.buildbot.net>`_ and install it manually.

#. Create a designated user account for your buildbot-worker to run under,
   and set appropriate permissions.

#. Choose the buildbot-worker root directory (all builds will be placed under
   it), and the buildbot-worker access name and password the buildmaster will
   use to authenticate your buildbot-worker.

#. Create a buildbot-worker in the context of that buildbot-worker account.
   Point it to the **lab.llvm.org** port **9990** (see `Buildbot documentation,
   Creating a worker
   <http://docs.buildbot.net/current/tutorial/firstrun.html#creating-a-worker>`_
   for more details) by running the following command:

   .. code-block:: bash

       $ buildbot-worker create-worker <buildbot-worker-root-directory> \
                    lab.llvm.org:9990 \
                    <buildbot-worker-access-name> \
                    <buildbot-worker-access-password>

   To point a worker to the staging (silent) buildmaster, use
   lab.llvm.org:9994 instead of lab.llvm.org:9990.

#. Fill in the buildbot-worker description and admin name/e-mail. Here is an
   example of the buildbot-worker description::

       Windows 7 x64
       Core i7 (2.66GHz), 16GB of RAM

       g++.exe (TDM-1 mingw32) 4.4.0
       GNU Binutils 2.19.1
       cmake version 2.8.4
       Microsoft(R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86

#. Make sure you can actually start the buildbot-worker successfully. Then set
   up your buildbot-worker to start automatically at system startup. See the
   buildbot documentation for help. You may want to restart your computer
   to see if it works.

#. Send a patch which adds your build worker and your builder to
   `zorg <https://github.com/llvm/llvm-zorg>`_. Use the typical LLVM
   `workflow <https://llvm.org/docs/Contributing.html#how-to-submit-a-patch>`_.

   * workers are added to ``buildbot/osuosl/master/config/workers.py``
   * builders are added to ``buildbot/osuosl/master/config/builders.py``

   Please make sure your builder name and its builddir are unique throughout
   the file.

   All new builders should default to using the ``'collapseRequests': False``
   configuration. This causes the builder to build each commit individually
   and not merge build requests. To maximize quality of feedback to developers,
   we *strongly prefer* builders to be configured not to collapse requests.
   This flag should be removed only after all reasonable efforts have been
   exhausted to improve build times such that the builder can keep up with
   commit flow.

   It is possible to allow email addresses to unconditionally receive
   notifications on build failure; for this you'll need to add an
   ``InformativeMailNotifier`` to ``buildbot/osuosl/master/config/status.py``.
   This is particularly useful for the staging buildmaster, which is silent
   otherwise.

#. Send the buildbot-worker access name and the access password directly to
   `Galina Kistanova <mailto:[email protected]>`_, and wait until she
   lets you know that your changes have been applied and the buildmaster has
   been reconfigured.

#. Check the status of your buildbot-worker on the `Waterfall Display
   <http://lab.llvm.org/buildbot/#/waterfall>`_ to make sure it is connected,
   and the `Workers Display <http://lab.llvm.org/buildbot/#/workers>`_ to see if
   the administrator contact and worker information are correct.

#. Wait for the first build to succeed and enjoy.


Best Practices for Configuring a Fast Builder
=============================================

As mentioned above, we generally have a strong preference for
builders which can build every commit as it comes in. This section
includes best practices and some recommendations as to how to achieve
that end.

The goal
  In 2020, the monorepo had just under 35 thousand commits. This works
  out to an average of 4 commits per hour. Already, we can see that a
  builder must cycle in less than 15 minutes to have a hope of being
  useful. However, those commits are not uniformly distributed. They
  tend to cluster strongly during US working hours. Looking at a couple
  of recent (Nov 2021) working days, we routinely see ~10 commits per
  hour during peak times, with occasional spikes as high as ~15 commits
  per hour.
  Thus, as a rule of thumb, we should plan for our builder to
  complete ~10-15 builds an hour.

Resource Appropriately
  At 10-15 builds per hour, we need to complete a new build on average every
  4 to 6 minutes. For anything except the fastest of hardware/build configs,
  this is going to be well beyond the ability of a single machine. In buildbot
  terms, we are likely going to need multiple workers to build requests in
  parallel under a single builder configuration. For some rough
  back-of-the-envelope numbers, if your build config takes e.g. 30 minutes,
  you will need something on the order of 5-8 workers. If your build config
  takes ~2 hours, you'll need something on the order of 20-30 workers. The
  rest of this section focuses on how to reduce cycle times.

Restrict what you build and test
  Think hard about why you're setting up a bot, and restrict your build
  configuration as much as you can. Basic functionality is probably
  already covered by other bots, and you don't need to duplicate that
  testing. You only need to be building and testing the *unique* parts
  of the configuration. For example, for a multi-stage clang builder, you
  probably don't need to enable every target or build all the various
  utilities.

  It can sometimes be worthwhile splitting a single builder into two or more,
  if you have multiple distinct purposes for the same builder. As an example,
  if you want to both a) confirm that all of LLVM builds with your host
  compiler, and b) do a multi-stage clang build on your target, you
  may be better off with two separate bots. Splitting increases resource
  consumption, but makes it easier for each bot to keep up with commit flow.
  Additionally, splitting bots may assist in triage by narrowing attention to
  relevant parts of the failing configuration.

  In general, we recommend Release build types with Assertions enabled.
  This generally provides a good balance between build times and bug
  detection for most buildbots. There may be room for including some debug
  info (e.g. with ``-gmlt``), but in general the balance between debug info
  quality and build times is a delicate one.

Use Ninja & LLD
  Ninja really does help build times over Make, particularly for highly
  parallel builds. LLD helps to reduce both link times and memory usage
  during linking significantly. With a build machine with sufficient
  parallelism, link times tend to dominate the critical path of the build,
  and are thus worth optimizing.

Use CCache and NOT incremental builds
  Using ccache materially improves average build times. Incremental builds
  can be slightly faster, but introduce the risk of build corruption due to
  e.g. state changes. At this point, the recommendation is not to
  use incremental builds and instead use ccache, as the latter captures the
  majority of the benefit with less risk of false positives.

  One of the non-obvious benefits of using ccache is that it makes the
  builder less sensitive to which projects are being monitored vs built.
  If a change triggers a build request, but doesn't change the build output
  (e.g. doc changes, python utility changes, etc.), the build will entirely
  hit in cache and the build request will complete in just the testing time.

  With multiple workers, it is tempting to try to configure a shared cache
  between the workers. Experience to date indicates this is difficult to
  do well, and that having local per-worker caches gets most of the benefit
  anyway. We don't currently recommend shared caches.

  CCache does depend on the builder hardware having sufficient IO to access
  the cache with reasonable access times, i.e. a fast disk, or enough memory
  for a RAM cache.
  For builders without such hardware, incremental builds may be your best
  option, but are likely to require higher ongoing involvement from the
  sponsor.

Enable batch builds
  As a last resort, you can configure your builder to batch build requests.
  This makes the build failure notifications markedly less actionable, and
  should only be done once all other reasonable measures have been taken.

Leave it on the staging buildmaster
  While most of this section has been biased towards builders intended for
  the main buildmaster, it is worth highlighting that builders can run
  indefinitely on the staging buildmaster. Such a builder may still be
  useful for the sponsoring organization, without concern of negatively
  impacting the broader community. The sponsoring organization simply
  has to take on the responsibility of all bisection and triage.
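
The worker counts quoted under "Resource Appropriately" follow from simple
arithmetic: the number of workers is roughly the build time divided by the
interval between commits, rounded up. A minimal sketch of that calculation is
shown here. The function name ``workers_needed`` is hypothetical and exists
only for illustration, and the commit rates are the rule-of-thumb figures from
this section, not measured values.

.. code-block:: bash

    #!/bin/sh
    # Estimate how many workers a builder needs to keep up with commit flow.
    #   $1 = build time in minutes
    #   $2 = expected commits per hour
    # Computes ceil(build_minutes * commits_per_hour / 60) with integer math.
    workers_needed() {
        build_minutes=$1
        commits_per_hour=$2
        echo $(( (build_minutes * commits_per_hour + 59) / 60 ))
    }

    workers_needed 30 10     # 30 minute build at ~10 commits/hour
    workers_needed 30 15     # 30 minute build at ~15 commits/hour
    workers_needed 120 10    # 2 hour build at ~10 commits/hour
    workers_needed 120 15    # 2 hour build at ~15 commits/hour

The four calls print 5, 8, 20 and 30, which matches the 5-8 and 20-30 worker
ranges given above. Treat the result as a rough upper bound: it assumes every
commit triggers a full build and does not account for ccache hits.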