Differences between base and ports LLVM in OpenBSD
Frederic Cambus June 20, 2022 [OpenBSD] [LLVM] [Compilers] [Toolchains]
LLVM was imported into the OpenBSD ports tree back in 2008, and happily lived there for a long while before being imported into the source tree at the g2k16 hackathon in 2016. I previously wrote about this in "The state of toolchains in OpenBSD" last year.
As mentioned in my previous article, we do not use the upstream build system to build LLVM in the base system, but hand-written BSD Makefiles instead. Importing CMake into the base system was not an option, because of the size of the project and the large dependency chain it requires for building. As a drawback, the build is slower than it could be, were we able to take advantage of a more modern build system.
Nowadays, Clang is the default compiler on the amd64, arm64, armv7, i386, macppc, octeon, powerpc64, and riscv64 platforms. It is also available in the sparc64 base system.
But then, why do we still need LLVM in the ports tree? As an aside, for those wondering why we need a compiler in the base system in the first place, Julio Merino wrote about this in his "Compilers in the (BSD) base system" post.
In the OpenBSD base system, we only build the LLVM backend matching the architecture being built, so on amd64 and i386 we build LLVM's X86 backend. The mapping we do between OpenBSD's MACHINE_ARCH and LLVM_ARCH values can be found in gnu/usr.bin/clang/Makefile.arch.
Note that we also build the AMDGPU backend on platforms requiring it.
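To make the idea of a per-architecture backend concrete, here is a minimal shell sketch of such a lookup. It is not a transcription of Makefile.arch, which is written in BSD make syntax. The function name backend_for is hypothetical, and only the amd64 and i386 entries come from the text above; the other entries are assumptions based on LLVM's backend names.

```shell
# Illustrative only: map an OpenBSD MACHINE_ARCH value to an LLVM backend name.
# Only the amd64/i386 -> X86 pairing is stated in the article; every other
# entry is an assumption, and the real logic lives in Makefile.arch.
backend_for() {
    case "$1" in
        amd64|i386)        echo X86 ;;
        aarch64)           echo AArch64 ;;   # assumed
        powerpc|powerpc64) echo PowerPC ;;   # assumed
        riscv64)           echo RISCV ;;     # assumed
        *)                 echo "no backend known for: $1" >&2; return 1 ;;
    esac
}
```

Calling backend_for amd64 selects the X86 backend, which matches the x86 and x86-64 targets the base compiler registers.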
On an amd64 machine, the registered targets for the base compiler are:
$ clang --print-targets
Registered Targets:
amdgcn - AMD GCN GPUs
r600 - AMD GPUs HD2XXX-HD6XXX
x86 - 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64
And the ones for Clang installed from ports are:
$ clang-13 --print-targets
Registered Targets:
aarch64 - AArch64 (little endian)
aarch64_32 - AArch64 (little endian ILP32)
aarch64_be - AArch64 (big endian)
amdgcn - AMD GCN GPUs
arm - ARM
arm64 - ARM64 (little endian)
arm64_32 - ARM64 (little endian ILP32)
armeb - ARM (big endian)
avr - Atmel AVR Microcontroller
bpf - BPF (host endian)
bpfeb - BPF (big endian)
bpfel - BPF (little endian)
hexagon - Hexagon
lanai - Lanai
mips - MIPS (32-bit big endian)
mips64 - MIPS (64-bit big endian)
mips64el - MIPS (64-bit little endian)
mipsel - MIPS (32-bit little endian)
msp430 - MSP430 [experimental]
nvptx - NVIDIA PTX 32-bit
nvptx64 - NVIDIA PTX 64-bit
ppc32 - PowerPC 32
ppc32le - PowerPC 32 LE
ppc64 - PowerPC 64
ppc64le - PowerPC 64 LE
r600 - AMD GPUs HD2XXX-HD6XXX
riscv32 - 32-bit RISC-V
riscv64 - 64-bit RISC-V
sparc - Sparc
sparcel - Sparc LE
sparcv9 - Sparc V9
systemz - SystemZ
thumb - Thumb
thumbeb - Thumb (big endian)
wasm32 - WebAssembly 32-bit
wasm64 - WebAssembly 64-bit
x86 - 32-bit X86: Pentium-Pro and above
x86-64 - 64-bit X86: EM64T and AMD64
xcore - XCore
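The two listings can also be compared mechanically. The sketch below assumes both compilers are in PATH under the names used above (clang and clang-13) and that each target line has the form "name - description". The helper names target_names and extra_targets are hypothetical.

```shell
# Read `--print-targets` output on stdin and print the sorted target names.
# Only lines whose second field is "-" are kept, which skips the header.
target_names() {
    awk '$2 == "-" { print $1 }' | sort
}

# Print the targets registered by the second compiler but not by the first.
# Usage: extra_targets BASE_COMPILER PORTS_COMPILER
extra_targets() {
    base_list=$(mktemp) || return 1
    ports_list=$(mktemp) || { rm -f "$base_list"; return 1; }
    "$1" --print-targets | target_names > "$base_list"
    "$2" --print-targets | target_names > "$ports_list"
    # comm -13 keeps only the lines unique to the second (sorted) file.
    comm -13 "$base_list" "$ports_list"
    rm -f "$base_list" "$ports_list"
}
```

With the listings shown above, extra_targets clang clang-13 would be expected to print every ports target except amdgcn, r600, x86, and x86-64.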
The devel/llvm port is built using CMake and Ninja, resulting in more efficient builds. On top of building all available LLVM backends, we also build:
- The Clang Static Analyzer and its companion tool scan-build
- Clang utilities (clang-format and clang-* tools)
- LLVM binary utilities (llvm-ar, llvm-as, llvm-objcopy, llvm-objdump, etc.)
- Tools to process code coverage data (llvm-profdata and llvm-cov)
- Various other tools such as llc, lli, llvm-mc, llvm-mca, etc.
So in essence, we try to keep the base system LLVM somewhat minimal, and build additional features and tooling in the port version. This solution has worked well for us so far.
One last thing to note: we only build one version of LLVM in ports, which is kept in sync with the base version, so we do not ship packages for older (or newer) versions of LLVM.