Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
LLVM Techniques, Tips, and Best Practices Clang and Middle-End Libraries

You're reading from   LLVM Techniques, Tips, and Best Practices Clang and Middle-End Libraries Design powerful and reliable compilers using the latest libraries and tools from LLVM

Arrow left icon
Product type Paperback
Published in Apr 2021
Publisher Packt
ISBN-13 9781838824952
Length 370 pages
Edition 1st Edition
Languages
Tools
Concepts
Arrow right icon
Author (1):
Arrow left icon
Min-Yih Hsu Min-Yih Hsu
Author Profile Icon Min-Yih Hsu
Min-Yih Hsu
Arrow right icon
View More author details
Toc

Table of Contents (18) Chapters Close

Preface 1. Section 1: Build System and LLVM-Specific Tooling
2. Chapter 1: Saving Resources When Building LLVM FREE CHAPTER 3. Chapter 2: Exploring LLVM's Build System Features 4. Chapter 3: Testing with LLVM LIT 5. Chapter 4: TableGen Development 6. Section 2: Frontend Development
7. Chapter 5: Exploring Clang's Architecture 8. Chapter 6: Extending the Preprocessor 9. Chapter 7: Handling AST 10. Chapter 8: Working with Compiler Flags and Toolchains 11. Section 3: "Middle-End" Development
12. Chapter 9: Working with PassManager and AnalysisManager 13. Chapter 10: Processing LLVM IR 14. Chapter 11: Gearing Up with Support Utilities 15. Chapter 12: Learning LLVM IR Instrumentation 16. Assessments 17. Other Books You May Enjoy

Tweaking CMake arguments

This section will show you some of the most common CMake arguments in LLVM's build system that can help you customize your build and achieve maximum efficiency.

Before we start, you should have a build folder that has been CMake-configured. Most of the following subsections will modify a file in the build folder; that is, CMakeCache.txt.

Choosing the right build type

LLVM uses several predefined build types provided by CMake. The most common types among them are as follows:

  • Release: This is the default build type if you didn't specify any. It will adopt the highest optimization level (usually -O3) and eliminate most of the debug information. Usually, this build type will make the building speed slightly slower.
  • Debug: This build type will compile without any optimization applied (that is, -O0). It preserves all the debug information. Note that this will generate a huge number of artifacts and usually take up ~20 GB of space, so please be sure you have enough storage space when using this build type. This will usually make the building speed slightly faster since no optimization is being performed.
  • RelWithDebInfo: This build type applies as much compiler optimization as possible (usually -O2) and preserves all the debug information. This is an option balanced between space consumption, runtime speed, and debuggability.

You can choose one of them using the CMAKE_BUILD_TYPE CMake variable. For example, to use the RelWithDebInfo type, you can use the following command:

$ cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo …

It is recommended to use RelWithDebInfo first (if you're going to debug LLVM later). Modern compilers have gone a long way to improve the debug information's quality in optimized program binaries. So, always give it a try first to avoid unnecessary storage waste; you can always go back to the Debug type if things don't work out.

In addition to configuring build types, LLVM_ENABLE_ASSERTIONS is another CMake (Boolean) argument that controls whether assertions (that is, the assert(bool predicate) function, which will terminate the program if the predicate argument is not true) are enabled. By default, this flag will only be true if the build type is Debug, but you can always turn it on manually to enforce stricter checks, even in other build types.

Avoiding building all targets

The number of LLVM's supported targets (hardware) has grown rapidly in the past few years. At the time of writing this book, there are nearly 20 officially supported targets. Each of them deals with non-trivial tasks such as native code generation, so it takes a significant amount of time to build. However, the chances that you're going to be working on all of these targets at the same time are low. Thus, you can select a subset of targets to build using the LLVM_TARGETS_TO_BUILD CMake argument. For example, to build the X86 target only, we can use the following command:

$ cmake -DLLVM_TARGETS_TO_BUILD="X86" …

You can also specify multiple targets using a semicolon-separated list, as follows:

$ cmake -DLLVM_TARGETS_TO_BUILD="X86;AArch64;AMDGPU" …

Surround the list of targets with double quotes!

In some shells, such as BASH, a semicolon is an ending symbol for a command. So, the rest of the CMake command will be cut off if you don't surround the list of targets with double-quotes.

Let's see how building shared libraries can help tweak CMake arguments.

Building as shared libraries

One of the most iconic features of LLVM is its modular design. Each component, optimization algorithm, code generation, and utility libraries, to name a few, are put into their own libraries where developers can link individual ones, depending on their usage. By default, each component is built as a static library (*.a in Unix/Linux and *.lib in Windows). However, in this case, static libraries have the following drawbacks:

  • Linking against static libraries usually takes more time than linking against dynamic libraries (*.so in Unix/Linux and *.dll in Windows).
  • If multiple executables link against the same set of libraries, like many of the LLVM tools do, the total size of these executables will be significantly larger when you adopt the static library approach compared to its dynamic library counterpart. This is because each of the executables has a copy of those libraries.
  • When you're debugging LLVM programs with debuggers (GDB, for example), they usually spend quite some time loading the statically linked executables at the very beginning, hindering the debugging experience.

Thus, it's recommended to build every LLVM component as a dynamic library during the development phase by using the BUILD_SHARED_LIBS CMake argument:

$ cmake -DBUILD_SHARED_LIBS=ON …

This will save you a significant amount of storage space and speed up the building process.

Splitting the debug info

When you're building a program in debug mode – adding the -g flag when using you're GCC and Clang, for example – by default, the generated binary contains a section that stores debug information. This information is essential for using a debugger (for example, GDB) to debug that program. LLVM is a large and complex project, so when you're building it in debug mode – using the cmAKE_BUILD_TYPE=Debug variable – the compiled libraries and executables come with a huge amount of debug information that takes up a lot of disk space. This causes the following problems:

  • Due to the design of C/C++, several duplicates of the same debug information might be embedded in different object files (for example, the debug information for a header file might be embedded in every library that includes it), which wastes lots of disk space.
  • The linker needs to load object files AND their associated debug information into memory during the linking stage, meaning that memory pressure will increase if the object file contains a non-trivial amount of debug information.

To solve these problems, the build system in LLVM provides allows us to split debug information into separate files from the original object files. By detaching debug information from object files, the debug info of the same source file is condensed into one place, thus avoiding unnecessary duplicates being created and saving lots of disk space. In addition, since debug info is not part of the object files anymore, the linker no longer needs to load them into memory and thus saves lots of memory resources. Last but not least, this feature can also improve our incremental building speed – that is, rebuild the project after a (small) code change – since we only need to update the modified debug information in a single place.

To use this feature, please use the LLVM_USE_SPLIT_DWARF cmake variable:

$ cmake -DcmAKE_BUILD_TYPE=Debug -DLLVM_USE_SPLIT_DWARF=ON …

Note that this CMake variable only works for compilers that use the DWARF debug format, including GCC and Clang.

Building an optimized version of llvm-tblgen

TableGen is a Domain-Specific Language (DSL) for describing structural data that will be converted into the corresponding C/C++ code as part of LLVM's building process (we will learn more about this in the chapters to come). The conversion tool is called llvm-tblgen. In other words, the running time of llvm-tblgen will affect the building time of LLVM itself. Therefore, if you're not developing the TableGen part, it's always a good idea to build an optimized version of llvm-tblgen, regardless of the global build type (that is, CMAKE_BUILD_TYPE), making llvm-tblgen run faster and shortening the overall building time.

The following CMake command, for example, will create build configurations that build a debug version of everything except the llvm-tblgen executable, which will be built as an optimized version:

$ cmake -DLLVM_OPTIMIZED_TABLEGEN=ON -DCMAKE_BUILD_TYPE=Debug …

Lastly, you'll see how you can use Clang and the new PassManager.

Using the new PassManager and Clang

Clang is LLVM's official C-family frontend (including C, C++, and Objective-C). It uses LLVM's libraries to generate machine code, which is organized by one of the most important subsystems in LLVM – PassManager. PassManager puts together all the tasks (that is, the Passes) required for optimization and code generation.

In Chapter 9, Working with PassManager and AnalysisManager, will introduce LLVM's new PassManager, which builds from the ground up to replace the existing one somewhere in the future. The new PassManager has a faster runtime speed compared to the legacy PassManager. This advantage indirectly brings better runtime performance for Clang. Therefore, the idea here is pretty simple: if we build LLVM's source tree using Clang, with the new PassManager enabled, the compilation speed will be faster. Most of the mainstream Linux distribution package repositories already contain Clang. It's recommended to use Clang 6.0 or later if you want a more stable PassManager implementation. Use the LLVM_USE_NEWPM CMake variable to build LLVM with the new PassManager, as follows:

$ env CC=`which clang` CXX=`which clang++` \
  cmake -DLLVM_USE_NEWPM=ON …

LLVM is a huge project that takes a lot of time to build. The previous two sections introduced some useful tricks and tips for improving its building speed. In the next section, we're going to introduce an alternative build system to build LLVM. It has some advantages over the default CMake build system, which means it will be more suitable in some scenarios.

You have been reading a chapter from
LLVM Techniques, Tips, and Best Practices Clang and Middle-End Libraries
Published in: Apr 2021
Publisher: Packt
ISBN-13: 9781838824952
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image