1.3: Kernel build system
Now that we have examined the Linux kernel structure, it is worth spending some time investigating how we can build and run it. Linux also uses the make utility to build the kernel, though the Linux makefile is much more complicated. Before we take a look at the makefile, let’s learn some important concepts about the Linux build system, which is called “kbuild”.
A few essential kbuild concepts
-
The build process can be customized by using kbuild variables. Those variables are defined in Kconfig files. Here you can define the variables themselves and their default values. Variables can have different types, including string, boolean and integer. In a Kconfig file you can also define dependencies between variables (for example, you can say that if variable X is selected then variable Y will be selected implicitly). As an example, you can take a look at the arm64 Kconfig file. This file defines all variables specific to the arm64 architecture. Kconfig functionality is not part of standard make and is implemented in the Linux makefile. Variables defined in Kconfig are exposed to the kernel source code as well as to the nested makefiles. Variable values can be set during the kernel configuration step. For example, if you type make menuconfig, a console GUI will be shown. It allows you to customize values for all kernel variables and stores the values in .config. Use the make help command to view all possible options for configuring the kernel.
-
Linux uses a recursive build. This means that each subfolder of the Linux kernel can define its own Makefile and Kconfig. Most of the nested Makefiles are very simple and just define what object files need to be compiled. Usually, such definitions have the following format.
    obj-$(SOME_CONFIG_VARIABLE) += some_file.o
This definition means that some_file.c will be compiled and linked into the kernel only if SOME_CONFIG_VARIABLE is set to y. If you want to compile and link a file unconditionally, you need to change the previous definition to look like this.
    obj-y += some_file.o
An example of a nested Makefile can be found here.
-
Before we move forward, you need to understand the structure of a basic make rule and be comfortable with make terminology. The common rule structure is illustrated in the following diagram.
    targets : prerequisites
        recipe
        …
  - targets are file names, separated by spaces. Targets are generated after the rule is executed. Usually, there is only one target per rule.
  - prerequisites are files that make tracks to see whether it needs to update the targets.
  - recipe is a shell script. Make calls it when a target does not exist yet or when some of the prerequisites have been updated. The recipe is responsible for generating the targets.
  - Both targets and prerequisites can include wildcards (%). When wildcards are used, the recipe is executed for each of the matched prerequisites separately. In this case, you can use the $< and $@ variables to refer to the prerequisite and the target inside the recipe. We already did it in the RPi OS makefile.
For additional information about make rules, please refer to the official documentation.
-
make is very good at detecting whether any of the prerequisites have been changed and updating only the targets that need to be rebuilt. However, if a recipe is dynamically updated, make is unable to detect this change. How can this happen? Very easily. One good example is when you change some configuration variable, which results in appending an additional option to the recipe. By default, in this case, make will not recompile previously generated object files, because their prerequisites haven’t changed; only the recipe has been updated. To overcome this behavior, Linux introduces the if_changed function. To see how it works, let’s consider the following example.
    cmd_compile = gcc $(flags) -c -o $@ $<

    %.o: %.c FORCE
        $(call if_changed,compile)
Here, for each .c file we build the corresponding .o file by calling the if_changed function with the argument compile. if_changed then looks for the cmd_compile variable (it adds the cmd_ prefix to the first argument) and checks whether this variable has been updated since the last execution, or whether any of the prerequisites has changed. If so, the cmd_compile command is executed and the object file is regenerated. Our sample rule has 2 prerequisites: the source .c file and FORCE. FORCE is a special prerequisite that forces the recipe to be called each time the make command is called. Without it, the recipe would be called only if the .c file was changed. You can read more about the FORCE target here.
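The decision that if_changed makes can be sketched outside of make. The Python model below is only an illustration under simplifying assumptions, not the kernel's implementation: the functions needs_rebuild and build, the mtimes dictionary (file name to modification time) and the saved_cmds dictionary (target to the command used for its last build) are hypothetical names introduced for this sketch.

```python
def needs_rebuild(target, prereqs, command, mtimes, saved_cmds):
    """Return True if the target must be regenerated.

    mtimes:     hypothetical map, file name -> modification time
    saved_cmds: hypothetical map, target -> command used for the last build
    """
    # Plain make logic: the target is missing or older than a prerequisite.
    if target not in mtimes:
        return True
    if any(mtimes[p] > mtimes[target] for p in prereqs):
        return True
    # Extra if_changed-style check: the text of the recipe itself changed.
    return saved_cmds.get(target) != command


def build(target, prereqs, command, mtimes, saved_cmds, now):
    """Run the 'recipe' only when needed; return True if it ran."""
    if not needs_rebuild(target, prereqs, command, mtimes, saved_cmds):
        return False
    mtimes[target] = now          # pretend the compiler wrote the file
    saved_cmds[target] = command  # remember the command for the next run
    return True


mtimes = {"main.c": 1}
saved_cmds = {}

first = build("main.o", ["main.c"], "gcc -c -o main.o main.c", mtimes, saved_cmds, now=2)
second = build("main.o", ["main.c"], "gcc -c -o main.o main.c", mtimes, saved_cmds, now=3)
# A configuration change appends a flag; no source file was touched.
third = build("main.o", ["main.c"], "gcc -O2 -c -o main.o main.c", mtimes, saved_cmds, now=4)

print(first, second, third)
```

This prints True False True: the first call builds the missing object file, the second call finds nothing to do, and the third call rebuilds only because the command text changed. In this model the check runs on every call, which is the role FORCE plays in the make rule above.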
Building the kernel
Now that we have learned some important concepts about the Linux build system, let’s try to figure out what exactly is going on after you type the make command. This process is very complicated and includes a lot of details, most of which we will skip. Our goal will be to answer 2 questions.
- How exactly are source files compiled into object files?
- How are object files linked into the OS image?
We are going to tackle the second question first.
Link stage
-
As you might see from the output of the make help command, the default target, which is responsible for building the kernel, is called vmlinux.
-
The vmlinux target definition can be found here and it looks like this.
    cmd_link-vmlinux = \
        $(CONFIG_SHELL) $< $(LD) $(LDFLAGS) $(LDFLAGS_vmlinux) ; \
        $(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)

    vmlinux: scripts/link-vmlinux.sh vmlinux_prereq $(vmlinux-deps) FORCE
        +$(call if_changed,link-vmlinux)
This target uses the already familiar if_changed function. Whenever some of the prerequisites are updated, the cmd_link-vmlinux command is executed. This command executes the scripts/link-vmlinux.sh script (note the usage of the $< automatic variable in the cmd_link-vmlinux command). It also executes an architecture-specific postlink script, but we are not very interested in it.
-
When scripts/link-vmlinux.sh is executed, it assumes that all required object files are already built and their locations are stored in 3 variables: KBUILD_VMLINUX_INIT, KBUILD_VMLINUX_MAIN, KBUILD_VMLINUX_LIBS.
-
The link-vmlinux.sh script first creates a thin archive from all available object files. A thin archive is a special object that contains references to a set of object files as well as their combined symbol table. This is done inside the archive_builtin function. In order to create the thin archive, this function uses the ar utility. The generated thin archive is stored as the built-in.o file and has a format that is understandable by the linker, so it can be used like any other normal object file.
-
Next, modpost_link is called. This function calls the linker and generates the vmlinux.o object file. We need this object file to perform section mismatch analysis. This analysis is performed by the modpost program and is triggered at this line.
-
Next, the kernel symbol table is generated. It contains information about all functions and global variables as well as their location in the vmlinux binary. The main work is done inside the kallsyms function. This function first uses nm to extract all symbols from the vmlinux binary. Then it uses the scripts/kallsyms utility to generate a special assembler file containing all symbols in a special format, understandable by the Linux kernel. Next, this assembler file is compiled and linked together with the original binary. This process is repeated several times because after the final link the addresses of some symbols can change. Information from the kernel symbol table is used to generate the /proc/kallsyms file at runtime.
-
Finally, the vmlinux binary is ready and System.map is built. System.map contains the same information as /proc/kallsyms, but it is a static file and, unlike /proc/kallsyms, it is not generated at runtime. System.map is mostly used to resolve addresses to symbol names during a kernel oops. The same nm utility is used to build System.map. This is done here.
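To make the role of System.map more concrete, here is a small Python sketch of resolving an address to a symbol name. It assumes the "address type name" line layout that nm prints by default; the sample symbols and addresses are made up for illustration, and the function names parse_symbol_map and resolve are hypothetical. It is not how the kernel itself performs the lookup.

```python
import bisect

# Made-up sample in the "address type name" layout; not taken from a real kernel.
SAMPLE_MAP = """\
ffff000008080000 T _text
ffff000008081000 T start_kernel
ffff000008082400 T rest_init
ffff000008090000 D init_task
"""


def parse_symbol_map(text):
    """Parse map lines into a list of (address, name) sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        address, _kind, name = parts
        symbols.append((int(address, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return 'name+0xoffset' for the closest symbol at or below address."""
    addresses = [a for a, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None  # the address lies below the first known symbol
    base, name = symbols[index]
    return f"{name}+{address - base:#x}"


symbols = parse_symbol_map(SAMPLE_MAP)
print(resolve(symbols, 0xffff000008081010))
```

With the sample data this prints start_kernel+0x10, the same "symbol plus offset" form you see in an oops backtrace.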
Build stage
-
Now let’s take one step backward and examine how source code files are compiled into object files. As you might remember, one of the prerequisites of the vmlinux target is the $(vmlinux-deps) variable. Let me now copy a few relevant lines from the main Linux makefile to demonstrate how this variable is built.
    init-y := init/
    drivers-y := drivers/ sound/ firmware/
    net-y := net/
    libs-y := lib/
    core-y := usr/
    core-y += kernel/ certs/ mm/ fs/ ipc/ security/ crypto/ block/

    init-y := $(patsubst %/, %/built-in.o, $(init-y))
    core-y := $(patsubst %/, %/built-in.o, $(core-y))
    drivers-y := $(patsubst %/, %/built-in.o, $(drivers-y))
    net-y := $(patsubst %/, %/built-in.o, $(net-y))

    export KBUILD_VMLINUX_INIT := $(head-y) $(init-y)
    export KBUILD_VMLINUX_MAIN := $(core-y) $(libs-y2) $(drivers-y) $(net-y) $(virt-y)
    export KBUILD_VMLINUX_LIBS := $(libs-y1)
    export KBUILD_LDS := arch/$(SRCARCH)/kernel/vmlinux.lds

    vmlinux-deps := $(KBUILD_LDS) $(KBUILD_VMLINUX_INIT) $(KBUILD_VMLINUX_MAIN) $(KBUILD_VMLINUX_LIBS)
It all starts with variables like init-y, core-y, etc., which combined contain all subfolders of the Linux kernel that contain buildable source code. Then built-in.o is appended to all the subfolder names, so, for example, drivers/ becomes drivers/built-in.o. vmlinux-deps then just aggregates all resulting values. This explains how vmlinux eventually becomes dependent on all built-in.o files.
-
The next question is how all the built-in.o objects are created. Once again, let me copy all relevant lines and explain how it all works.
    $(sort $(vmlinux-deps)): $(vmlinux-dirs) ;

    vmlinux-dirs := $(patsubst %/,%,$(filter %/, $(init-y) $(init-m) \
                 $(core-y) $(core-m) $(drivers-y) $(drivers-m) \
                 $(net-y) $(net-m) $(libs-y) $(libs-m) $(virt-y)))

    #Copied from `scripts/Kbuild.include`
    build := -f $(srctree)/scripts/Makefile.build obj

    $(vmlinux-dirs): prepare scripts
        $(Q)$(MAKE) $(build)=$@
The first line tells us that vmlinux-deps depends on vmlinux-dirs. Next, we can see that vmlinux-dirs is a variable that contains all direct root subfolders without the / character at the end. And the most important line here is the recipe to build the $(vmlinux-dirs) target. After substitution of all variables, this recipe will look like the following (we use the drivers folder as an example, but this rule will be executed for all root subfolders).
    make -f scripts/Makefile.build obj=drivers
This line just calls another makefile (scripts/Makefile.build) and passes the obj variable, which contains a folder to be compiled.
-
The next logical step is to take a look at scripts/Makefile.build. The first important thing that happens after it is executed is that all variables from the Makefile or Kbuild files, defined in the current directory, are included. By current directory I mean the directory referenced by the obj variable. The inclusion is done in the following 3 lines.
    kbuild-dir := $(if $(filter /%,$(src)),$(src),$(srctree)/$(src))
    kbuild-file := $(if $(wildcard $(kbuild-dir)/Kbuild),$(kbuild-dir)/Kbuild,$(kbuild-dir)/Makefile)
    include $(kbuild-file)
Nested makefiles are mostly responsible for initializing variables like obj-y. As a quick reminder: the obj-y variable should contain the list of all object files to be built from the source code located in the current directory. Another important variable that is initialized by the nested makefiles is subdir-y. This variable contains a list of all subfolders that need to be visited before the source code in the current directory can be built. subdir-y is used to implement recursive descent into subfolders.
-
When make is called without specifying a target (as is the case when scripts/Makefile.build is executed), it uses the first target. The first target for scripts/Makefile.build is called __build and it can be found here. Let’s take a look at it.
    __build: $(if $(KBUILD_BUILTIN),$(builtin-target) $(lib-target) $(extra-y)) \
         $(if $(KBUILD_MODULES),$(obj-m) $(modorder-target)) \
         $(subdir-ym) $(always)
        @:
As you can see, the __build target has no real recipe (@: is a no-op), but it depends on a bunch of other targets. We are only interested in $(builtin-target), which is responsible for creating the built-in.o file, and $(subdir-ym), which is responsible for descending into nested directories.
-
Let’s take a look at subdir-ym. This variable is initialized here and it is just a concatenation of the subdir-y and subdir-m variables. (The subdir-m variable is similar to subdir-y, but it defines subfolders that need to be included in a separate kernel module. We skip the discussion of modules for now to stay focused.)
-
The subdir-ym target is defined here and should look familiar to you.
    $(subdir-ym):
        $(Q)$(MAKE) $(build)=$@
This target just triggers execution of scripts/Makefile.build in one of the nested subfolders.
-
Now it is time to examine the builtin-target target. Once again I am copying only the relevant lines here.
    cmd_make_builtin = rm -f $@; $(AR) rcSTP$(KBUILD_ARFLAGS)
    cmd_make_empty_builtin = rm -f $@; $(AR) rcSTP$(KBUILD_ARFLAGS)

    cmd_link_o_target = $(if $(strip $(obj-y)),\
                  $(cmd_make_builtin) $@ $(filter $(obj-y), $^) \
                  $(cmd_secanalysis),\
                  $(cmd_make_empty_builtin) $@)

    $(builtin-target): $(obj-y) FORCE
        $(call if_changed,link_o_target)
This target depends on the $(obj-y) target, and obj-y is a list of all object files that need to be built in the current folder. After those files become ready, the cmd_link_o_target command is executed. If the obj-y variable is empty, cmd_make_empty_builtin is called, which just creates an empty built-in.o. Otherwise, the cmd_make_builtin command is executed; it uses the familiar ar tool to create the built-in.o thin archive.
-
Finally we have reached the point where we need to compile something. You remember that our last unexplored dependency is $(obj-y), and obj-y is just a list of object files. The target that compiles all object files from the corresponding .c files is defined here. Let’s examine all the lines needed to understand this target.
    cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $<

    define rule_cc_o_c
        $(call echo-cmd,checksrc) $(cmd_checksrc) \
        $(call cmd_and_fixdep,cc_o_c) \
        $(cmd_modversions_c) \
        $(call echo-cmd,objtool) $(cmd_objtool) \
        $(call echo-cmd,record_mcount) $(cmd_record_mcount)
    endef

    $(obj)/%.o: $(src)/%.c $(recordmcount_source) $(objtool_dep) FORCE
        $(call cmd,force_checksrc)
        $(call if_changed_rule,cc_o_c)
Inside its recipe, this target calls rule_cc_o_c. This rule is responsible for a lot of things, like checking the source code for some common errors (cmd_checksrc), enabling versioning for exported module symbols (cmd_modversions_c), using objtool to validate some aspects of generated object files and constructing a list of calls to the mcount function so that ftrace can find them quickly. But most importantly, it calls the cmd_cc_o_c command that actually compiles all .c files to object files.
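The whole build stage can be summarized with a toy model. The Python sketch below imitates the recursive descent described above under strong simplifications: the TREE and CONFIG dictionaries, the directory and file names, and the functions enabled and build_dir are all hypothetical, and a child built-in.o is flattened into its parent's member list instead of being referenced as a nested thin archive. Treat it as an aid for reading the makefiles, not as a description of what kbuild literally executes.

```python
# Hypothetical source tree: each directory lists (config symbol, entry) pairs,
# mimicking "obj-$(CONFIG_X) += entry". An entry ending in "/" is a subfolder.
TREE = {
    "drivers": [("y", "base.o"), ("CONFIG_SERIAL", "tty/"), ("CONFIG_USB", "usb/")],
    "drivers/tty": [("y", "tty_io.o"), ("CONFIG_SERIAL", "serial.o")],
    "drivers/usb": [("y", "core.o")],
}

# Hypothetical .config contents: only symbols set to "y" are built in.
CONFIG = {"CONFIG_SERIAL": "y", "CONFIG_USB": "n"}


def enabled(symbol, config):
    """obj-y entries are unconditional; others need their symbol set to y."""
    return symbol == "y" or config.get(symbol) == "y"


def build_dir(directory, tree, config, log):
    """Toy model of 'make -f scripts/Makefile.build obj=<directory>'.

    Returns the object files that end up in <directory>/built-in.o.
    """
    entries = [e for s, e in tree.get(directory, []) if enabled(s, config)]

    # subdir-ym step: visit every enabled subfolder before building locally.
    children = {}
    for entry in entries:
        if entry.endswith("/"):
            child = f"{directory}/{entry[:-1]}"
            children[entry] = build_dir(child, tree, config, log)

    # cmd_cc_o_c step: compile local files, keeping the obj-y order.
    members = []
    for entry in entries:
        if entry in children:
            members.extend(children[entry])
        else:
            obj = f"{directory}/{entry}"
            log.append(f"CC {obj}")
            members.append(obj)

    # cmd_link_o_target step: collect everything into built-in.o.
    log.append(f"AR {directory}/built-in.o")
    return members


log = []
objects = build_dir("drivers", TREE, CONFIG, log)
print(objects)
print(log)
```

Running it shows that drivers/tty is fully built, including its own built-in.o, before drivers/base.o is compiled, and that nothing under drivers/usb is touched because its configuration symbol is not set to y.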
Conclusion
Wow, it was a long journey inside the kernel build system internals! Still, we skipped a lot of details and, for those who want to learn more about the subject, I can recommend reading the following document and continuing to read the Makefiles source code. Let me now emphasize the important points that you should take away from this chapter.
- How .c files are compiled into object files.
- How object files are combined into built-in.o files.
- How the recursive build picks up all child built-in.o files and combines them into a single one.
- How vmlinux is linked from all top-level built-in.o files.
My main goal was for you to gain a general understanding of all the above points after reading this chapter.