Speedbuilding LLVM/Clang in 5 minutes
Frederic Cambus, May 11, 2021 [LLVM] [Compilers] [Toolchains]

This post is a spiritual successor to my "Building LLVM on OpenBSD/loongson" article, in which I retraced my attempts to build LLVM 3.7.1 on MIPS64 in a RAM-constrained environment.
After reading the excellent "Make LLVM fast again", I wanted to revisit the topic, and see how fast I could build a recent version of LLVM and Clang on modern x86 server hardware.
The system I'm using for this experiment is a CCX62 instance from Hetzner, which has 48 dedicated vCPUs and 192 GB of RAM. This is the fastest machine available in their cloud offering at the moment.
The system is running Fedora 34 with up-to-date packages and kernel.
The full output of cat /proc/cpuinfo is available here.
uname -a
Linux benchmarks 5.11.18-300.fc34.x86_64 #1 SMP Mon May 3 15:10:32 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Let's start by installing required packages:
dnf install clang cmake git lld ninja-build
The compiler used for the builds is Clang 12.0.0:
clang --version
clang version 12.0.0 (Fedora 12.0.0-0.3.rc1.fc34)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Regarding linkers, we are using GNU ld and GNU Gold from binutils 2.35.1, and LLD 12.0.0.
GNU ld version 2.35.1-41.fc34
GNU gold (version 2.35.1-41.fc34) 1.16
LLD 12.0.0 (compatible with GNU linkers)
For all the following runs, I'm building from the Git repository main branch commit 831cf15ca6892e2044447f8dc516d76b8a827f1e. The build directory is of course fully erased between each run.
commit 831cf15ca6892e2044447f8dc516d76b8a827f1e
Author: David Spickett <david.spickett@linaro.org>
Date: Wed May 5 11:49:35 2021 +0100
To get a baseline, let's do a full release build on this machine:
cd llvm-project
mkdir build
cd build
cmake -DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_PROJECTS=clang \
../llvm
time make -j48
real 11m19.852s
user 436m30.619s
sys 12m5.724s
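As a side note, the ratio of user time to wall-clock time gives a rough idea of how well the build parallelizes across the 48 vCPUs. A quick sketch of that calculation (the awk one-liner is mine, not part of the build steps):

```shell
# Rough effective parallelism of the baseline build:
# user time (436m31s) divided by wall-clock time (11m20s)
awk 'BEGIN { printf "%.1f\n", (436*60+31) / (11*60+20) }'
```

Roughly 38.5 concurrent CPU-seconds per wall-clock second, so the build keeps most, but not all, of the 48 vCPUs busy.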
By default, CMake generates Makefiles. As documented in the "Getting Started with the LLVM System" tutorial, most LLVM developers use Ninja.
Let's switch to generating Ninja build files, and using ninja to build:
cmake -DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_PROJECTS=clang \
-GNinja ../llvm
time ninja
[4182/4182] Generating ../../bin/llvm-readelf
real 10m13.755s
user 452m16.034s
sys 12m7.584s
By default, GNU ld is used for linking. Let's switch to using gold:
cmake -DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_PROJECTS=clang \
-DLLVM_USE_LINKER=gold \
-GNinja ../llvm
time ninja
[4182/4182] Generating ../../bin/llvm-readelf
real 10m13.405s
user 451m35.029s
sys 11m57.649s
LLD has been a viable option for some years now. Let's use it:
cmake -DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_PROJECTS=clang \
-DLLVM_USE_LINKER=lld \
-GNinja ../llvm
time ninja
[4182/4182] Generating ../../bin/llvm-readelf
real 10m12.710s
user 451m12.444s
sys 12m12.634s
During tests on smaller build machines, I had observed that using GNU gold or LLD instead of GNU ld resulted in noticeably faster builds. This doesn't seem to be the case on this machine: LLD gives us the fastest build of the three, but only by a small margin.
If we want to build faster, we can make some compromises and start stripping the build by removing some components.
Let's start by disabling additional architecture support:
cmake -DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_PROJECTS=clang \
-DLLVM_USE_LINKER=lld \
-DLLVM_TARGETS_TO_BUILD="X86" \
-GNinja ../llvm
time ninja
[3196/3196] Generating ../../bin/llvm-readelf
real 7m55.531s
user 344m56.462s
sys 8m53.970s
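For reference, dropping the non-x86 backends brought the wall-clock time from 10m12.7s down to 7m55.5s. A quick sketch of the relative saving (again my own calculation, not part of the build):

```shell
# Percentage of wall-clock time saved by building only the X86 backend:
# (612.7s - 475.5s) / 612.7s
awk 'BEGIN { printf "%.1f\n", (612.7 - 475.5) / 612.7 * 100 }'
```

Target pruning alone buys us a build that is roughly 22% faster.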
We can verify that the resulting Clang binary supports only x86 targets:
bin/clang --print-targets
Registered Targets:
x86 - 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64
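If you want to check this programmatically, here is a small sketch (the list_targets helper is a name I made up, not an LLVM tool) that extracts just the target names from the --print-targets output:

```shell
# Hypothetical helper: extract target names from `clang --print-targets`,
# whose lines have the form "  name - description" after a header line
list_targets() {
  awk -F' - ' 'NR > 1 { gsub(/^[ \t]+/, "", $1); print $1 }'
}

# Example usage against the stripped build:
#   bin/clang --print-targets | list_targets
```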
Let's go further and disable the static analyzer and the ARC Migration Tool:
cmake -DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_PROJECTS=clang \
-DLLVM_USE_LINKER=lld \
-DLLVM_TARGETS_TO_BUILD="X86" \
-DCLANG_ENABLE_STATIC_ANALYZER=OFF \
-DCLANG_ENABLE_ARCMT=OFF \
-GNinja ../llvm
time ninja
[3147/3147] Generating ../../bin/llvm-readelf
real 7m42.299s
user 334m47.916s
sys 8m44.704s
Let's disable building some LLVM tools and utils:
cmake -DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_PROJECTS=clang \
-DLLVM_USE_LINKER=lld \
-DLLVM_TARGETS_TO_BUILD="X86" \
-DCLANG_ENABLE_STATIC_ANALYZER=OFF \
-DCLANG_ENABLE_ARCMT=OFF \
-DLLVM_BUILD_TOOLS=OFF \
-DLLVM_BUILD_UTILS=OFF \
-GNinja ../llvm
time ninja
[2880/2880] Generating ../../bin/llvm-readelf
real 7m21.016s
user 315m42.127s
sys 8m9.377s
Compared to the previous build, the following binaries were not built: FileCheck, count, lli-child-target, llvm-jitlink-executor, llvm-PerfectShuffle, not, obj2yaml, yaml2obj, and yaml-bench.
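A list like the one above can be produced by comparing the bin/ directories of two build trees. A sketch, assuming hypothetical directory names build-full and build-min for the previous and current builds (this needs bash for the process substitution):

```shell
# List binaries present in the full build but missing from the minimal one
# (build-full/ and build-min/ are placeholder names for the two build trees)
comm -23 <(ls build-full/bin | sort) <(ls build-min/bin | sort)
```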
We are reaching the end of our journey: at this point, there is nothing significant left to strip.
Let's disable optimizations and do a last run:
cmake -DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++ \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_PROJECTS=clang \
-DLLVM_USE_LINKER=lld \
-DLLVM_TARGETS_TO_BUILD="X86" \
-DCLANG_ENABLE_STATIC_ANALYZER=OFF \
-DCLANG_ENABLE_ARCMT=OFF \
-DLLVM_BUILD_TOOLS=OFF \
-DLLVM_BUILD_UTILS=OFF \
-DCMAKE_CXX_FLAGS_RELEASE="-O0" \
-GNinja ../llvm
time ninja
[2880/2880] Linking CXX executable bin/c-index-test
real 5m37.225s
user 253m18.515s
sys 9m2.413s
That's it. Five minutes. Don't try this at home :-)