How to verify bugs and bisect regressions
This document describes how to check if some Linux kernel problem occurs in code currently supported by developers, and then explains how to locate the change causing the issue if it is a regression (i.e. it did not happen with earlier versions).
The text aims at people running kernels from mainstream Linux distributions on commodity hardware who want to report a kernel bug to the upstream Linux developers. Despite this intent, the instructions work just as well for users who are already familiar with building their own kernels: they help avoid mistakes occasionally made even by experienced developers.
The essence of the process (aka ‘TL;DR’)
[If you are new to building or bisecting Linux, ignore this section and head over to the ‘step-by-step guide’ below. It utilizes the same commands as this section while describing them in brief fashion. The steps are nevertheless easy to follow and together with accompanying entries in a reference section mention many alternatives, pitfalls, and additional aspects, all of which might be essential in your present case.]
In case you want to check if a bug is present in code currently supported by developers, execute just the preparations and segment 1; while doing so, consider the newest Linux kernel you regularly use to be the ‘working’ kernel. In the following example that’s assumed to be 6.0, which is why its sources will be used to prepare the .config file.
In case you face a regression, follow the steps at least till the end of segment 2. Then you can submit a preliminary report -- or continue with segment 3, which describes how to perform a bisection needed for a full-fledged regression report. In the following example 6.0.13 is assumed to be the ‘working’ kernel and 6.1.5 to be the first ‘broken’, which is why 6.0 will be considered the ‘good’ release and used to prepare the .config file.
Preparations: set up everything to build your own kernels:
    # * Remove any software that depends on externally maintained kernel modules
    #   or builds any automatically during bootup.
    # * Ensure Secure Boot permits booting self-compiled Linux kernels.
    # * If you are not already running the 'working' kernel, reboot into it.
    # * Install compilers and everything else needed for building Linux.
    # * Ensure to have 15 Gigabyte free space in your home directory.
    git clone -o mainline --no-checkout \
      https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ~/linux/
    cd ~/linux/
    git remote add -t master stable \
      https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
    git switch --detach v6.0
    # * Hint: if you used an existing clone, ensure no stale .config is around.
    make olddefconfig
    # * Ensure the former command picked the .config of the 'working' kernel.
    # * Connect external hardware (USB keys, tokens, ...), start a VM, bring up
    #   VPNs, mount network shares, and briefly try the feature that is broken.
    yes '' | make localmodconfig
    ./scripts/config --set-str CONFIG_LOCALVERSION '-local'
    ./scripts/config -e CONFIG_LOCALVERSION_AUTO
    # * Note, when short on storage space, check the guide for an alternative:
    ./scripts/config -d DEBUG_INFO_NONE -e KALLSYMS_ALL -e DEBUG_KERNEL \
      -e DEBUG_INFO -e DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT -e KALLSYMS
    # * Hint: at this point you might want to adjust the build configuration;
    #   you'll have to, if you are running Debian.
    make olddefconfig
    cp .config ~/kernel-config-working
Segment 1: build a kernel from the latest mainline codebase.
Among other things, this checks if the problem was fixed already and which developers later need to be told about the problem; in case of a regression, it also rules out a .config change as the root of the problem.
Checking out latest mainline code:
    cd ~/linux/
    git switch --discard-changes --detach mainline/master
Build, install, and boot a kernel:
    cp ~/kernel-config-working .config
    make olddefconfig
    make -j $(nproc --all)
    # * Make sure there is enough disk space to hold another kernel:
    df -h /boot/ /lib/modules/
    # * Note: on Arch Linux, its derivatives and a few other distributions
    #   the following commands will do nothing at all or only part of the
    #   job. See the step-by-step guide for further details.
    sudo make modules_install
    command -v installkernel && sudo make install
    # * Check how much space your self-built kernel actually needs, which
    #   enables you to make better estimates later:
    du -ch /boot/*$(make -s kernelrelease)* | tail -n 1
    du -sh /lib/modules/$(make -s kernelrelease)/
    # * Hint: the output of the following command will help you pick the
    #   right kernel from the boot menu:
    make -s kernelrelease | tee -a ~/kernels-built
    reboot
    # * Once booted, ensure you are running the kernel you just built by
    #   checking if the output of the next two commands matches:
    tail -n 1 ~/kernels-built
    uname -r
    cat /proc/sys/kernel/tainted
Check if the problem occurs with this kernel as well.
Segment 2: ensure the ‘good’ kernel is also a ‘working’ kernel.
Among other things, this verifies that the trimmed .config file actually works well, as bisecting with it otherwise would be a waste of time:
Start by checking out the sources of the ‘good’ version:
    cd ~/linux/
    git switch --discard-changes --detach v6.0
Build, install, and boot a kernel as described earlier in segment 1, section b -- just feel free to skip the ‘du’ commands, as you have a rough estimate already.
Ensure the feature that regressed with the ‘broken’ kernel actually works with this one.
Segment 3: perform and validate the bisection.
Retrieve the sources for your ‘bad’ version:
    git remote set-branches --add stable linux-6.1.y
    git fetch stable
Initialize the bisection:
    cd ~/linux/
    git bisect start
    git bisect good v6.0
    git bisect bad v6.1.5
Build, install, and boot a kernel as described earlier in segment 1, section b.
In case building or booting the kernel fails for unrelated reasons, run ‘git bisect skip’. In all other outcomes, check if the regressed feature works with the newly built kernel. If it does, tell Git by executing ‘git bisect good’; if it does not, run ‘git bisect bad’ instead. All three commands will make Git check out another commit; then re-execute this step (i.e. build, install, boot, and test a kernel to then tell Git the outcome). Do so again and again until Git shows which commit broke things. If you run short of disk space during this process, check the section ‘Complementary tasks: cleanup during and after the process’ below.
Once you have finished the bisection, put a few things away:
    cd ~/linux/
    git bisect log > ~/bisection-log
    cp .config ~/bisection-config-culprit
    git bisect reset
Try to verify the bisection result:
    git switch --discard-changes --detach mainline/master
    git revert --no-edit cafec0cacaca0
    cp ~/kernel-config-working .config
    ./scripts/config --set-str CONFIG_LOCALVERSION '-local-cafec0cacaca0-reverted'
This is optional, as some commits are impossible to revert. But if the second command worked flawlessly, build, install, and boot one more kernel; just this time skip the first command copying the base .config file over, as that already has been taken care of.
Complementary tasks: cleanup during and after the process.
To avoid running out of disk space during a bisection, you might need to remove some kernels you built earlier. You most likely want to keep those you built during segment 1 and 2 around for a while, but you will most likely no longer need kernels tested during the actual bisection (Segment 3 c). You can list them in build order using:
ls -ltr /lib/modules/*-local*
To then for example erase a kernel that identifies itself as ‘6.0-rc1-local-gcafec0cacaca0’, use this:
    sudo rm -rf /lib/modules/6.0-rc1-local-gcafec0cacaca0
    sudo kernel-install -v remove 6.0-rc1-local-gcafec0cacaca0
    # * Note, on some distributions kernel-install is missing
    #   or does only part of the job.
If you performed a bisection and successfully validated the result, feel free to remove all kernels built during the actual bisection (Segment 3 c); the kernels you built earlier and later you might want to keep around for a week or two.
Optional task: test a debug patch or a proposed fix later:
    git fetch mainline
    git switch --discard-changes --detach mainline/master
    git apply /tmp/foobars-proposed-fix-v1.patch
    cp ~/kernel-config-working .config
    ./scripts/config --set-str CONFIG_LOCALVERSION '-local-foobars-fix-v1'
Build, install, and boot a kernel as described in segment 1, section b -- but this time omit the first command copying the build configuration over, as that has been taken care of already.
Step-by-step guide on how to verify bugs and bisect regressions
This guide describes how to set up your own Linux kernels for investigating bugs or regressions you intend to report. How far you want to follow the instructions depends on your issue:
Execute all steps till the end of segment 1 to verify if your kernel problem is present in code supported by Linux kernel developers. If it is, you are all set to report the bug -- unless it did not happen with earlier kernel versions, as then you want to at least continue with segment 2 to check if the issue qualifies as a regression, which receives priority treatment. Depending on the outcome you then are ready to report a bug or submit a preliminary regression report; instead of the latter you could also head straight on and follow segment 3 to perform a bisection for a full-fledged regression report developers are obliged to act upon.
The steps in each segment illustrate the important aspects of the process, while a comprehensive reference section holds additional details for almost all of the steps. The reference section sometimes also outlines alternative approaches, pitfalls, as well as problems that might occur at the particular step -- and how to get things rolling again.
For further details on how to report Linux kernel issues or regressions check out ‘Reporting issues’, which works in conjunction with this document. It among others explains why you need to verify bugs with the latest ‘mainline’ kernel (e.g. versions like 6.0, 6.1-rc1, or 6.1-rc6), even if you face a problem with a kernel from a ‘stable/longterm’ series (say 6.0.13).
For users facing a regression that document also explains why sending a preliminary report after segment 2 might be wise, as the regression and its culprit might be known already. For further details on what actually qualifies as a regression check out ‘Reporting regressions’.
If you run into any problems while following this guide or have ideas how to improve it, please let the kernel developers know.
Preparations: set up everything to build your own kernels
The following steps lay the groundwork for all further tasks.
Note: the instructions assume you are building and testing on the same machine; if you want to compile the kernel on another system, check ‘Build kernels on a different machine’ below.
Create a fresh backup and put system repair and restore tools at hand, just to be prepared for the unlikely case of something going sideways.
[details]
Remove all software that depends on externally developed kernel drivers or builds them automatically. That includes but is not limited to DKMS, OpenZFS, VirtualBox, and Nvidia’s graphics drivers (including the GPLed kernel module).
[details]
On platforms with ‘Secure Boot’ or similar solutions, prepare everything to ensure the system will permit your self-compiled kernel to boot. The quickest and easiest way to achieve this on commodity x86 systems is to disable such techniques in the BIOS setup utility; alternatively, remove their restrictions through a process initiated by ‘mokutil --disable-validation’.
[details]
Determine the kernel versions considered ‘good’ and ‘bad’ throughout this guide:
Do you follow this guide to verify if a bug is present in the code the primary developers care for? Then consider the version of the newest kernel you regularly use currently as ‘good’ (e.g. 6.0, 6.0.13, or 6.1-rc2).
Do you face a regression, e.g. something broke or works worse after switching to a newer kernel version? In that case it depends on the version range during which the problem appeared:
Something regressed when updating from a stable/longterm release (say 6.0.13) to a newer mainline series (like 6.1-rc7 or 6.1) or a stable/longterm version based on one (say 6.1.5)? Then consider the mainline release your working kernel is based on to be the ‘good’ version (e.g. 6.0) and the first version to be broken as the ‘bad’ one (e.g. 6.1-rc7, 6.1, or 6.1.5). Note, at this point it is merely assumed that 6.0 is fine; this hypothesis will be checked in segment 2.
Something regressed when switching from one mainline version (say 6.0) to a later one (like 6.1-rc1) or a stable/longterm release based on it (say 6.1.5)? Then regard the last working version (e.g. 6.0) as ‘good’ and the first broken (e.g. 6.1-rc1 or 6.1.5) as ‘bad’.
Something regressed when updating within a stable/longterm series (say from 6.0.13 to 6.0.15)? Then consider those versions as ‘good’ and ‘bad’ (e.g. 6.0.13 and 6.0.15), as you need to bisect within that series.
Note, do not confuse ‘good’ version with ‘working’ kernel; the latter term throughout this guide will refer to the last kernel that has been working fine.
[details]
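The mapping from a stable/longterm version to its mainline base and to its stable branch can be sketched with two small shell helpers. This is only an illustration: the function names ‘mainline_base’ and ‘stable_branch’ are made up for this example, and the sketch merely handles plain version strings like the ones used in this guide.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of the kernel sources.

# Print the mainline release a stable/longterm version is based on,
# e.g. 6.0.13 -> 6.0. Versions without a third component (6.1, 6.1-rc7)
# are printed unchanged.
mainline_base() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Print the branch name of the series a stable/longterm version belongs to,
# e.g. 6.1.5 -> linux-6.1.y. Only meaningful for stable/longterm versions.
stable_branch() {
    printf 'linux-%s.y\n' "$(mainline_base "$1")"
}
```

With a ‘working’ kernel 6.0.13 and a ‘broken’ kernel 6.1.5, ‘mainline_base 6.0.13’ yields the ‘good’ version 6.0, while ‘stable_branch 6.1.5’ names the series whose code needs to be fetched later.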
Boot into the ‘working’ kernel and briefly use the apparently broken feature.
[details]
Ensure you have enough free space for building Linux. 15 Gigabyte in your home directory should typically suffice. If you have less available, be sure to pay attention to later steps about retrieving the Linux sources and handling of debug symbols: both explain approaches reducing the amount of space, which should allow you to master these tasks with about 4 Gigabytes free space.
[details]
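A quick way to check the available space could look like the following sketch. The helper names ‘free_gib’ and ‘enough_space’ are assumptions made for this example; the sketch counts in GiB, which is slightly stricter than the decimal Gigabyte figures mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print the free space of the filesystem holding the
# given directory in whole GiB, relying on the POSIX output format of df.
free_gib() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Hypothetical helper: succeed if the given directory has at least the
# given number of GiB available.
enough_space() {
    [ "$(free_gib "$1")" -ge "$2" ]
}
```

For example, ‘enough_space "$HOME" 15 || echo "less than 15 GiB free"’ prints a warning if the home directory falls short of the suggested amount.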
Install all software required to build a Linux kernel. Often you will need: ‘bc’, ‘binutils’ (‘ld’ et al.), ‘bison’, ‘flex’, ‘gcc’, ‘git’, ‘openssl’, ‘pahole’, ‘perl’, and the development headers for ‘libelf’ and ‘openssl’. The reference section shows how to quickly install those on various popular Linux distributions.
[details]
Retrieve the mainline Linux sources; then change into the directory holding them, as all further commands in this guide are meant to be executed from there.
Note, the following describes how to retrieve the sources using a full mainline clone, which downloads about 2.75 GByte as of early 2024. The reference section describes two alternatives: one downloads less than 500 MByte, the other works better with unreliable internet connections.
Execute the following command to retrieve a fresh mainline codebase while preparing things to add branches for stable/longterm series later:
    git clone -o mainline --no-checkout \
      https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ~/linux/
    cd ~/linux/
    git remote add -t master stable \
      https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
[details]
Is one of the versions you earlier established as ‘good’ or ‘bad’ a stable or longterm release (say 6.1.5)? Then download the code for the series it belongs to (‘linux-6.1.y’ in this example):
    git remote set-branches --add stable linux-6.1.y
    git fetch stable
Start preparing a kernel build configuration (the ‘.config’ file).
Before doing so, ensure you are still running the ‘working’ kernel an earlier step told you to boot; if you are unsure, check the current kernelrelease identifier using ‘uname -r’.
Afterwards check out the source code for the version earlier established as ‘good’. In the following example command this is assumed to be 6.0; note that the version number in this and all later Git commands needs to be prefixed with a ‘v’:
git switch --discard-changes --detach v6.0
Now create a build configuration file:
make olddefconfig
The kernel build scripts then will try to locate the build configuration file for the running kernel and then adjust it for the needs of the kernel sources you checked out. While doing so, it will print a few lines you need to check.
Look out for a line starting with ‘# using defaults found in’. It should be followed by a path to a file in ‘/boot/’ that contains the release identifier of your currently working kernel. If the line instead continues with something like ‘arch/x86/configs/x86_64_defconfig’, then the build infra failed to find the .config file for your running kernel -- in which case you have to put one there manually, as explained in the reference section.
In case you can not find such a line, look for one containing ‘# configuration written to .config’. If that’s the case you have a stale build configuration lying around. Unless you intend to use it, delete it; afterwards run ‘make olddefconfig’ again and check if it now picked up the right config file as base.
[details]
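The check of the ‘make olddefconfig’ output described above can be sketched as a small filter. The function name ‘config_base’ and the labels it prints are made up for this example; the two line patterns are the ones quoted in this step.

```shell
#!/bin/sh
# Hypothetical helper: read the output of 'make olddefconfig' on stdin and
# report which base configuration the build scripts picked.
config_base() {
    awk '
        /^# using defaults found in / {
            path = $NF
            # A path below arch/ means the scripts fell back to a defconfig.
            if (index(path, "arch/") == 1)
                print "fallback-defconfig " path
            else
                print "existing-config " path
            found = 1
            exit
        }
        /^# configuration written to \.config/ { written = 1 }
        END {
            # Without a "using defaults" line an already present .config
            # was reprocessed, which might be a stale one.
            if (!found && written)
                print "stale-config-reused"
            else if (!found)
                print "unknown"
        }'
}
```

You could feed it with ‘make olddefconfig 2>&1 | config_base’; when it reports an existing config, still verify that the printed path contains the release identifier of your ‘working’ kernel.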
Disable any kernel modules apparently superfluous for your setup. This is optional, but especially wise for bisections, as it speeds up the build process enormously -- at least unless the .config file picked up in the previous step was already tailored to your and your hardware’s needs, in which case you should skip this step.
To prepare the trimming, connect external hardware you occasionally use (USB keys, tokens, ...), quickly start a VM, and bring up VPNs. And if you rebooted since you started this guide, ensure that you tried using the feature causing trouble since you started the system. Only then trim your .config:
yes '' | make localmodconfig
There is a catch to this, as the ‘apparently’ in the initial sentence of this step and the preparation instructions already hinted at:
The ‘localmodconfig’ target easily disables kernel modules for features only used occasionally -- like modules for external peripherals not yet connected since booting, virtualization software not yet utilized, VPN tunnels, and a few other things. That’s because some tasks rely on kernel modules Linux only loads when you execute tasks like the aforementioned ones for the first time.
This drawback of localmodconfig is nothing you should lose sleep over, but something to keep in mind: if something is misbehaving with the kernels built during this guide, this is most likely the reason. You can reduce or nearly eliminate the risk with tricks outlined in the reference section; but when building a kernel just for quick testing purposes this is usually not worth spending much effort on, as long as it boots and allows you to properly test the feature that causes trouble.
[details]
Ensure all the kernels you will build are clearly identifiable using a special tag and a unique version number:
    ./scripts/config --set-str CONFIG_LOCALVERSION '-local'
    ./scripts/config -e CONFIG_LOCALVERSION_AUTO
[details]
Decide how to handle debug symbols.
In the context of this document it is often wise to enable them, as there is a decent chance you will need to decode a stack trace from a ‘panic’, ‘Oops’, ‘warning’, or ‘BUG’:
    ./scripts/config -d DEBUG_INFO_NONE -e KALLSYMS_ALL -e DEBUG_KERNEL \
      -e DEBUG_INFO -e DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT -e KALLSYMS
But if you are extremely short on storage space, you might want to disable debug symbols instead:
    ./scripts/config -d DEBUG_INFO -d DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT \
      -d DEBUG_INFO_DWARF4 -d DEBUG_INFO_DWARF5 -e CONFIG_DEBUG_INFO_NONE
[details]
Check if you may want or need to adjust some other kernel configuration options:
Are you running Debian? Then you want to avoid known problems by performing additional adjustments explained in the reference section.
[details].
If you want to influence other aspects of the configuration, do so now using your preferred tool. Note, to use make targets like ‘menuconfig’ or ‘nconfig’, you will need to install the development files of ncurses; for ‘xconfig’ you likewise need the Qt5 or Qt6 headers.
[details].
Reprocess the .config after the latest adjustments and store it in a safe place:
    make olddefconfig
    cp .config ~/kernel-config-working
[details]
Segment 1: try to reproduce the problem with the latest codebase
The following steps verify if the problem occurs with the code currently supported by developers. In case you face a regression, it also checks that the problem is not caused by some .config change, as reporting the issue then would be a waste of time. [details]
Check out the latest Linux codebase.
Are your ‘good’ and ‘bad’ versions from the same stable or longterm series? Then check the front page of kernel.org: if it lists a release from that series without an ‘[EOL]’ tag, check out the series’ latest version (‘linux-6.1.y’ in the following example):
    cd ~/linux/
    git switch --discard-changes --detach stable/linux-6.1.y
Your series is unsupported if it is not listed or carries an ‘end of life’ tag. In that case you might want to check if a successor series (say linux-6.2.y) or mainline (see next point) fixes the bug.
In all other cases, run:
    cd ~/linux/
    git switch --discard-changes --detach mainline/master
[details]
Build the image and the modules of your first kernel using the config file you prepared:
    cp ~/kernel-config-working .config
    make olddefconfig
    make -j $(nproc --all)
If you want your kernel packaged up as deb, rpm, or tar file, see the reference section for alternatives, which obviously will require other steps to install as well.
[details]
Install your newly built kernel.
Before doing so, consider checking if there is still enough space for it:
df -h /boot/ /lib/modules/
For now assume 150 MByte in /boot/ and 200 MByte in /lib/modules/ will suffice; how much your kernels actually require will be determined later during this guide.
Now install the kernel’s modules and its image, which will be stored in parallel to your Linux distribution’s kernels:
    sudo make modules_install
    command -v installkernel && sudo make install
The second command ideally will take care of three steps required at this point: copying the kernel’s image to /boot/, generating an initramfs, and adding an entry for both to the boot loader’s configuration.
Sadly some distributions (among them Arch Linux, its derivatives, and many immutable Linux distributions) will perform none or only some of those tasks. You therefore want to check if all of them were taken care of and manually perform those that were not. The reference section provides further details on that; your distribution’s documentation might help, too.
Once you figured out the steps needed at this point, consider writing them down: if you will build more kernels as described in segment 2 and 3, you will have to perform those again after executing ‘command -v installkernel [...]’.
[details]
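To see which of the install tasks left traces in /boot/, something like the following sketch might help. Both the helper name ‘check_install’ and the file name patterns are assumptions: the patterns match what some common distributions use, but yours may name or place these files differently. The boot loader entry is not checked at all, as its location depends heavily on the boot loader in use.

```shell
#!/bin/sh
# Hypothetical helper: report whether a kernel image and an initramfs for
# the given kernelrelease exist in the given boot directory. The file name
# patterns are assumptions; adjust them to your distribution.
check_install() (
    boot=$1
    release=$2
    if [ -e "$boot/vmlinuz-$release" ]; then
        echo "image: found"
    else
        echo "image: missing"
    fi
    if [ -e "$boot/initramfs-$release.img" ] ||
       [ -e "$boot/initrd.img-$release" ]; then
        echo "initramfs: found"
    else
        echo "initramfs: missing"
    fi
)
```

Run from the directory holding the sources, a call could look like ‘check_install /boot "$(make -s kernelrelease)"’.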
In case you plan to follow this guide further, check how much storage space the kernel, its modules, and other related files like the initramfs consume:
    du -ch /boot/*$(make -s kernelrelease)* | tail -n 1
    du -sh /lib/modules/$(make -s kernelrelease)/
Write down or remember those two values for later: they enable you to prevent running out of disk space accidentally during a bisection.
[details]
Show and store the kernelrelease identifier of the kernel you just built:
make -s kernelrelease | tee -a ~/kernels-built
Remember the identifier momentarily, as it will help you pick the right kernel from the boot menu upon restarting.
Reboot into your newly built kernel. To ensure you actually started the one you just built, you might want to verify if the output of these commands matches:
    tail -n 1 ~/kernels-built
    uname -r
Check if the kernel marked itself as ‘tainted’:
cat /proc/sys/kernel/tainted
If that command does not return ‘0’, check the reference section, as the cause for this might interfere with your testing.
[details]
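Both checks from the last two steps can be sketched as shell helpers. The names ‘is_expected_kernel’ and ‘taint_bits’ are made up for this example. The second helper relies on the taint value being a bitmask and merely prints the numbers of the bits that are set; what each bit means is explained in the kernel’s documentation on tainted kernels.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the last identifier recorded in the given
# file equals the given kernelrelease.
is_expected_kernel() {
    [ "$(tail -n 1 "$1")" = "$2" ]
}

# Hypothetical helper: print the numbers of all bits set in a taint value,
# separated by spaces; prints an empty line for an untainted kernel.
taint_bits() (
    value=$1
    bit=0
    out=
    while [ "$value" -gt 0 ]; do
        if [ $((value % 2)) -eq 1 ]; then
            out="$out $bit"
        fi
        value=$((value / 2))
        bit=$((bit + 1))
    done
    echo "${out# }"
)
```

Typical calls would be ‘is_expected_kernel ~/kernels-built "$(uname -r)"’ and ‘taint_bits "$(cat /proc/sys/kernel/tainted)"’.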
Verify if your bug occurs with the newly built kernel. If it does not, check out the instructions in the reference section to ensure nothing went sideways during your tests.
[details]
Did you just build a stable or longterm kernel? And were you able to reproduce the regression with it? Then you should test the latest mainline codebase as well, because the result determines which developers the bug must be submitted to.
To prepare that test, check out current mainline:
    cd ~/linux/
    git switch --discard-changes --detach mainline/master
Now use the checked out code to build and install another kernel using the commands the earlier steps already described in more detail:
    cp ~/kernel-config-working .config
    make olddefconfig
    make -j $(nproc --all)
    # * Check if the free space suffices holding another kernel:
    df -h /boot/ /lib/modules/
    sudo make modules_install
    command -v installkernel && sudo make install
    make -s kernelrelease | tee -a ~/kernels-built
    reboot
Confirm you booted the kernel you intended to start and check its tainted status:
    tail -n 1 ~/kernels-built
    uname -r
    cat /proc/sys/kernel/tainted
Now verify if this kernel is showing the problem. If it does, then you need to report the bug to the primary developers; if it does not, report it to the stable team. See ‘Reporting issues’ for details.
[details]
Do you follow this guide to verify if a problem is present in the code currently supported by Linux kernel developers? Then you are done at this point. If you later want to remove the kernel you just built, check out ‘Complementary tasks: cleanup during and after following this guide’.
In case you face a regression, move on and execute at least the next segment as well.
Segment 2: check if the kernels you build work fine
In case of a regression, you now want to ensure the trimmed configuration file you created earlier works as expected; a bisection with the .config file otherwise would be a waste of time. [details]
Build your own variant of the ‘working’ kernel and check if the feature that regressed works as expected with it.
Start by checking out the sources for the version earlier established as ‘good’ (once again assumed to be 6.0 here):
    cd ~/linux/
    git switch --discard-changes --detach v6.0
Now use the checked out code to configure, build, and install another kernel using the commands the previous subsection explained in more detail:
    cp ~/kernel-config-working .config
    make olddefconfig
    make -j $(nproc --all)
    # * Check if the free space suffices holding another kernel:
    df -h /boot/ /lib/modules/
    sudo make modules_install
    command -v installkernel && sudo make install
    make -s kernelrelease | tee -a ~/kernels-built
    reboot
When the system booted, you may want to verify once again that the kernel you started is the one you just built:
    tail -n 1 ~/kernels-built
    uname -r
Now check if this kernel works as expected; if not, consult the reference section for further instructions.
[details]
Segment 3: perform the bisection and validate the result
With all the preparations and precaution builds taken care of, you are now ready to begin the bisection. This will make you build quite a few kernels -- usually about 15 in case you encountered a regression when updating to a newer series (say from 6.0.13 to 6.1.5). But do not worry, due to the trimmed build configuration created earlier this works a lot faster than many people assume: on average it often takes just about 10 to 15 minutes to compile each kernel on commodity x86 machines.
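The number of kernels to build follows from the fact that every bisection step roughly halves the range of suspicious commits. The following sketch estimates the step count for a given number of commits; the function name ‘bisect_steps’ is made up for this example, and the estimate Git prints during a bisection may differ slightly, as the history is not a straight line.

```shell
#!/bin/sh
# Hypothetical helper: print how many halving steps are needed to narrow
# the given number of candidate commits down to a single one.
bisect_steps() (
    commits=$1
    span=1
    steps=0
    while [ "$span" -lt "$commits" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
)
```

Within your clone you could feed it with the size of the range you are about to bisect, for example ‘bisect_steps "$(git rev-list --count v6.0..v6.1.5)"’; multiplying the result by the time one build takes on your machine gives a rough idea of the overall compile time.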
Start the bisection and tell Git about the versions earlier established as ‘good’ (6.0 in the following example command) and ‘bad’ (6.1.5):
    cd ~/linux/
    git bisect start
    git bisect good v6.0
    git bisect bad v6.1.5
[details]
Now use the code Git checked out to build, install, and boot a kernel using the commands introduced earlier:
    cp ~/kernel-config-working .config
    make olddefconfig
    make -j $(nproc --all)
    # * Check if the free space suffices holding another kernel:
    df -h /boot/ /lib/modules/
    sudo make modules_install
    command -v installkernel && sudo make install
    make -s kernelrelease | tee -a ~/kernels-built
    reboot
If compilation fails for some reason, run ‘git bisect skip’ and restart executing the stack of commands from the beginning.
In case you skipped the ‘test latest codebase’ step in the guide, check its description as for why the ‘df [...]’ and ‘make -s kernelrelease [...]’ commands are here.
Important note: the latter command from this point on will print release identifiers that might look odd or wrong to you -- which they are not, as it’s totally normal to see release identifiers like ‘6.0-rc1-local-gcafec0cacaca0’ if you bisect between versions 6.1 and 6.2 for example.
[details]
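The odd-looking release identifiers carry useful information: the part after ‘-g’ is the abbreviated id of the commit the kernel was built from. A sketch for extracting it might look like this; the function name ‘release_commit’ is an assumption, and the sketch only handles identifiers that end with the commit id, like the example above.

```shell
#!/bin/sh
# Hypothetical helper: print the abbreviated commit id embedded at the end
# of a kernelrelease identifier; prints nothing if there is none.
release_commit() {
    printf '%s\n' "$1" | sed -n 's/.*-g\([0-9a-f]\{7,\}\)$/\1/p'
}
```

For example, ‘release_commit 6.0-rc1-local-gcafec0cacaca0’ prints ‘cafec0cacaca0’, which you could then pass to a command like ‘git show --no-patch’ to see which change a particular kernel corresponds to.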
Now check if the feature that regressed works in the kernel you just built.
You again might want to start by making sure the kernel you booted is the one you just built:
    cd ~/linux/
    tail -n 1 ~/kernels-built
    uname -r
Now verify if the feature that regressed works at this kernel bisection point. If it does, run this:
git bisect good
If it does not, run this:
git bisect bad
Be sure about what you tell Git, as getting this wrong just once will send the rest of the bisection totally off course.
While the bisection is ongoing, Git will use the information you provided to find and check out another bisection point for you to test. While doing so, it will print something like ‘Bisecting: 675 revisions left to test after this (roughly 10 steps)’ to indicate how many further changes it expects to be tested. Now build and install another kernel using the instructions from the previous step; afterwards follow the instructions in this step again.
Repeat this again and again until you finish the bisection -- that’s the case when Git after tagging a change as ‘good’ or ‘bad’ prints something like ‘cafecaca0c0dacafecaca0c0dacafecaca0c0da is the first bad commit’; right afterwards it will show some details about the culprit including the patch description of the change. The latter might fill your terminal screen, so you might need to scroll up to see the message mentioning the culprit; alternatively, run ‘git bisect log > ~/bisection-log’.
[details]
Store Git’s bisection log and the current .config file in a safe place before telling Git to reset the sources to the state before the bisection:
    cd ~/linux/
    git bisect log > ~/bisection-log
    cp .config ~/bisection-config-culprit
    git bisect reset
[details]
Try reverting the culprit on top of latest mainline to see if this fixes your regression.
This is optional, as it might be impossible or hard to realize. The former is the case if the bisection determined a merge commit as the culprit; the latter happens if other changes depend on the culprit. But if the revert succeeds, it is worth building another kernel, as it validates the result of a bisection, which can easily go astray; it furthermore will let kernel developers know if they can resolve the regression with a quick revert.
Begin by checking out the latest codebase depending on the range you bisected:
Did you face a regression within a stable/longterm series (say between 6.0.13 and 6.0.15) that does not happen in mainline? Then check out the latest codebase for the affected series like this:
    git fetch stable
    git switch --discard-changes --detach stable/linux-6.0.y
In all other cases check out latest mainline:
    git fetch mainline
    git switch --discard-changes --detach mainline/master
If you bisected a regression within a stable/longterm series that also happens in mainline, there is one more thing to do: look up the mainline commit-id. To do so, use a command like ‘git show abcdcafecabcd’ to view the patch description of the culprit. There will be a line near the top which looks like ‘commit cafec0cacaca0 upstream.’ or ‘Upstream commit cafec0cacaca0’; use that commit-id in the next command and not the one the bisection blamed.
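Looking up the mainline commit-id in the patch description of a stable/longterm commit can be sketched as a small filter. The function name ‘upstream_commit’ is made up for this example, and the sketch only recognizes the two notations quoted above; check the output by reading the patch description yourself.

```shell
#!/bin/sh
# Hypothetical helper: read a patch description on stdin and print the
# first mainline commit id referenced in one of the two notations
# 'commit <id> upstream.' or 'Upstream commit <id>'.
upstream_commit() {
    sed -n \
        -e 's/^[[:space:]]*commit \([0-9a-f]\{7,40\}\) upstream.*$/\1/p' \
        -e 's/^.*[Uu]pstream commit \([0-9a-f]\{7,40\}\).*$/\1/p' |
        head -n 1
}
```

With the example id used in this step, a call could look like ‘git show --no-patch abcdcafecabcd | upstream_commit’.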
Now try reverting the culprit by specifying its commit id:
git revert --no-edit cafec0cacaca0
If that fails, give up trying and move on to the next step; if it works, adjust the tag to facilitate the identification and prevent accidentally overwriting another kernel:
    cp ~/kernel-config-working .config
    ./scripts/config --set-str CONFIG_LOCALVERSION '-local-cafec0cacaca0-reverted'
Build a kernel using the familiar command sequence, just without copying the base .config over:
    make olddefconfig &&
    make -j $(nproc --all)
    # * Check if the free space suffices holding another kernel:
    df -h /boot/ /lib/modules/
    sudo make modules_install
    command -v installkernel && sudo make install
    make -s kernelrelease | tee -a ~/kernels-built
    reboot
Now check one last time if the feature that made you perform a bisection works with that kernel: if everything went well, it should not show the regression.
[details]
Complementary tasks: cleanup during and after the bisection
During and after following this guide you might want or need to remove some of the kernels you installed: the boot menu otherwise will become confusing or space might run out.
To remove one of the kernels you installed, look up its ‘kernelrelease’ identifier. This guide stores them in ‘~/kernels-built’, but the following command will print them as well:
ls -ltr /lib/modules/*-local*
In most situations you want to remove the oldest kernels built during the actual bisection (i.e. segment 3 of this guide). The two you created beforehand (i.e. to test the latest codebase and the version considered ‘good’) might come in handy to verify something later -- thus better keep them around, unless you are really short on storage space.
To remove a kernel with the kernelrelease identifier ‘6.0-rc1-local-gcafec0cacaca0’, start by removing the directory holding its modules:
sudo rm -rf /lib/modules/6.0-rc1-local-gcafec0cacaca0
Afterwards try the following command:
sudo kernel-install -v remove 6.0-rc1-local-gcafec0cacaca0
On quite a few distributions this will delete all other kernel files installed while also removing the kernel’s entry from the boot menu. But on some distributions kernel-install does not exist or leaves boot-loader entries or kernel image and related files behind; in that case remove them as described in the reference section.
Once you have finished the bisection, do not immediately remove anything you set up, as you might need a few things again. What is safe to remove depends on the outcome of the bisection:
Could you initially reproduce the regression with the latest codebase and after the bisection were able to fix the problem by reverting the culprit on top of the latest codebase? Then you want to keep those two kernels around for a while, but can safely remove all others with a ‘-local’ in the release identifier.
Did the bisection end on a merge-commit or does it seem questionable for other reasons? Then you want to keep as many kernels as possible around for a few days: it’s pretty likely that you will be asked to recheck something.
In other cases it likely is a good idea to keep the following kernels around for some time: the one built from the latest codebase, the one created from the version considered ‘good’, and the last three or four you compiled during the actual bisection process.
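For the last case, the list of removal candidates can be derived from ‘~/kernels-built’. This is only a sketch under a loud assumption: the file holds one kernelrelease identifier per line in build order, with the latest codebase and the ‘good’ version as its first two entries. The function names are invented for this example; adjust the counts if you built additional kernels before starting the bisection.

```shell
# List the kernelrelease identifiers worth keeping, assuming the list file holds
# one identifier per line in build order: the first two entries (latest codebase,
# 'good' version) and the last four built during the bisection.
kernels_to_keep() {
    list=${1:-"$HOME/kernels-built"}
    { head -n 2 "$list"; tail -n 4 "$list"; } | awk 'NF && !seen[$0]++'
}

# Everything else recorded in the list is a removal candidate.
kernels_to_remove() {
    list=${1:-"$HOME/kernels-built"}
    keep=$(kernels_to_keep "$list")
    awk 'NF && !seen[$0]++' "$list" | grep -v -x -F -e "$keep"
}
```

Both functions only print identifiers; review the output before removing anything with the commands shown earlier.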
Optional: test reverts, patches, or later versions¶
While or after reporting a bug, you might want or potentially will be asked to test reverts, debug patches, proposed fixes, or other versions. In that case follow these instructions.
Update your Git clone and check out the latest code.
In case you want to test mainline, fetch its latest changes before checking its code out:
git fetch mainline
git switch --discard-changes --detach mainline/master
In case you want to test a stable or longterm kernel, first add the branch holding the series you are interested in (6.2 in the example), unless you already did so earlier:
git remote set-branches --add stable linux-6.2.y
Then fetch the latest changes and check out the latest version from the series:
git fetch stable
git switch --discard-changes --detach stable/linux-6.2.y
Copy your kernel build configuration over:
cp ~/kernel-config-working .config
Your next step depends on what you want to do:
In case you just want to test the latest codebase, head to the next step; you are already all set.
In case you want to test if a revert fixes an issue, revert one or multiple changes by specifying their commit ids:
git revert --no-edit cafec0cacaca0
Now give that kernel a special tag to facilitate its identification and prevent accidentally overwriting another kernel:
./scripts/config --set-str CONFIG_LOCALVERSION '-local-cafec0cacaca0-reverted'
In case you want to test a patch, store the patch in a file like ‘/tmp/foobars-proposed-fix-v1.patch’ and apply it like this:
git apply /tmp/foobars-proposed-fix-v1.patch
In case of multiple patches, repeat this step with the others.
Now give that kernel a special tag to facilitate its identification and prevent accidentally overwriting another kernel:
./scripts/config --set-str CONFIG_LOCALVERSION '-local-foobars-fix-v1'
Build a kernel using the familiar commands, just without copying the kernel build configuration over, as that has been taken care of already:
make olddefconfig &&
make -j $(nproc --all)
# * Check if the free space suffices holding another kernel:
df -h /boot/ /lib/modules/
sudo make modules_install
command -v installkernel && sudo make install
make -s kernelrelease | tee -a ~/kernels-built
reboot
Now verify you booted the newly built kernel and check it.
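One way to verify the booted kernel is to compare the output of ‘uname -r’ with the identifier recorded last. The sketch below assumes the last line of ‘~/kernels-built’ names the kernel you just installed; the function name is hypothetical.

```shell
# Succeeds if the running kernel matches the most recently recorded build.
# Assumes the last line of the list file names the kernel you just installed.
booted_latest_build() {
    list=${1:-"$HOME/kernels-built"}
    expected=$(tail -n 1 "$list")
    running=$(uname -r)
    if [ "$running" = "$expected" ]; then
        echo "OK: running $running"
    else
        echo "Mismatch: running $running, expected $expected" >&2
        return 1
    fi
}
```

A mismatch usually means the boot menu picked a different entry than intended.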
Conclusion¶
You have reached the end of the step-by-step guide.
Did you run into trouble following any of the above steps not cleared up by the reference section below? Did you spot errors? Or do you have ideas how to improve the guide?
If any of that applies, please take a moment and let the maintainer of this document know by email (Thorsten Leemhuis <linux@leemhuis.info>), ideally while CCing the Linux docs mailing list (linux-doc@vger.kernel.org). Such feedback is vital to improve this text further, which is in everybody’s interest, as it will enable more people to master the task described here -- and hopefully also improve similar guides inspired by this one.
Reference section for the step-by-step guide¶
This section holds additional information for almost all the items in the above step-by-step guide.
Preparations for building your own kernels¶
The steps in this section lay the groundwork for all further tests.[...]
The steps in all later sections of this guide depend on those described here.
Prepare for emergencies¶
Create a fresh backup and put system repair and restore tools at hand.[...]
Remember, you are dealing with computers, which sometimes do unexpected things -- especially if you fiddle with crucial parts like the kernel of an operating system. That’s what you are about to do in this process. Hence, better prepare for something going sideways, even if that should not happen.
Remove anything related to externally maintained kernel modules¶
Remove all software that depends on externally developed kernel drivers or builds them automatically. [...]
Externally developed kernel modules can easily cause trouble during a bisection.
But there is a more important reason why this guide contains this step: most kernel developers will not care about reports about regressions occurring with kernels that utilize such modules. That’s because such kernels are not considered ‘vanilla’ anymore, as ‘Reporting issues’ explains in more detail.
Deal with techniques like Secure Boot¶
On platforms with ‘Secure Boot’ or similar techniques, prepare everything to ensure the system will permit your self-compiled kernel to boot later. [...]
Many modern systems allow only certain operating systems to start; that’s why they reject booting self-compiled kernels by default.
You ideally deal with this by making your platform trust your self-built kernels with the help of a certificate. How to do that is not described here, as it requires various steps that would take the text too far away from its purpose; ‘Kernel module signing facility’ and various websites already explain everything needed in more detail.
Temporarily disabling solutions like Secure Boot is another way to make your own Linux boot. On commodity x86 systems it is possible to do this in the BIOS Setup utility; the required steps vary a lot between machines and therefore cannot be described here.
On mainstream x86 Linux distributions there is a third and universal option: disable all Secure Boot restrictions for your Linux environment. You can initiate this process by running ‘mokutil --disable-validation’; this will tell you to create a one-time password, which is safe to write down. Now restart; right after your BIOS performed all self-tests the bootloader Shim will show a blue box with a message ‘Press any key to perform MOK management’. Hit some key before the countdown expires, which will open a menu. Choose ‘Change Secure Boot state’. Shim’s ‘MokManager’ will now ask you to enter three randomly chosen characters from the one-time password specified earlier. Once you provided them, confirm you really want to disable the validation. Afterwards, permit MokManager to reboot the machine.
Boot the last kernel that was working¶
Boot into the last working kernel and briefly recheck if the feature that regressed really works. [...]
This will make later steps that cover creating and trimming the configuration do the right thing.
Space requirements¶
Ensure to have enough free space for building Linux.[...]
The numbers mentioned are rough estimates with a generous margin to be on the safe side, so often you will need less.
If you have space constraints, be sure to pay attention to the step about debug symbols and its accompanying reference section, as disabling them will reduce the consumed disk space by quite a few gigabytes.
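A quick check of the available space might look like the following sketch. The function name ‘has_free_gib’ is invented for this example, and it counts in GiB (1024-based units), which is slightly stricter than the rough Gigabyte figures used in this guide.

```shell
# Succeeds if the file system holding directory $1 has at least $2 GiB free.
has_free_gib() {
    dir=$1
    need_gib=$2
    # 'df -Pk' prints POSIX-format output in 1024-byte blocks;
    # the fourth column of the second line holds the available space.
    avail_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge $((need_gib * 1024 * 1024)) ]
}

# Example: check for the 15 Gigabyte the guide asks for in the home directory.
#   has_free_gib "$HOME" 15 || echo "Free up some space first" >&2
```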
Bisection range¶
Determine the kernel versions considered ‘good’ and ‘bad’ throughout this guide. [...]
Establishing the range of commits to be checked is mostly straightforward, except when a regression occurred when switching from a release of one stable series to a release of a later series (e.g. from 6.0.13 to 6.1.5). In that case Git will need some hand holding, as there is no straight line of descent.
That’s because with the release of 6.0 mainline carried on to 6.1 while the stable series 6.0.y branched to the side. It’s therefore theoretically possible that the feature broken in 6.1.5 only worked in 6.0.13 because of a fix that went into one of the 6.0.y releases, but never hit mainline or the 6.1.y series. Thankfully that normally should not happen due to the way the stable/longterm maintainers maintain the code. It’s thus pretty safe to assume 6.0 as a ‘good’ kernel. That assumption will be tested anyway, as that kernel will be built and tested in segment 2 of this guide; Git would force you to do this as well, if you tried bisecting between 6.0.13 and 6.1.5.
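The mapping from a stable release to the mainline release it descends from can be expressed as a tiny helper. This is a sketch that assumes plain ‘major.minor[.patch]’ version strings; the function name ‘mainline_tag’ is made up for this illustration.

```shell
# Map a kernel version to the tag of the mainline release it is based on,
# e.g. 6.0.13 -> v6.0. Assumes 'major.minor[.patch]' version strings.
mainline_tag() {
    printf 'v%s\n' "$(printf '%s\n' "$1" | cut -d. -f1,2)"
}

# Example: the 'good' end of the range for a regression between 6.0.13 and 6.1.5.
#   mainline_tag 6.0.13
```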
Install build requirements¶
Install all software required to build a Linux kernel.[...]
The kernel is pretty stand-alone, but besides tools like the compiler you will sometimes need a few libraries to build one. How to install everything needed depends on your Linux distribution and the configuration of the kernel you are about to build.
Here are a few examples of what you typically need on some mainstream distributions:
Arch Linux and derivatives:
sudo pacman --needed -S bc binutils bison flex gcc git kmod libelf openssl \
  pahole perl zlib ncurses qt6-base
Debian, Ubuntu, and derivatives:
sudo apt install bc binutils bison dwarves flex gcc git kmod libelf-dev \
  libssl-dev make openssl pahole perl-base pkg-config zlib1g-dev \
  libncurses-dev qt6-base-dev g++
Fedora and derivatives:
sudo dnf install binutils \
  /usr/bin/{bc,bison,flex,gcc,git,openssl,make,perl,pahole,rpmbuild} \
  /usr/include/{libelf.h,openssl/pkcs7.h,zlib.h,ncurses.h,qt6/QtGui/QAction}
openSUSE and derivatives:
sudo zypper install bc binutils bison dwarves flex gcc git \
  kernel-install-tools libelf-devel make modutils openssl openssl-devel \
  perl-base zlib-devel rpm-build ncurses-devel qt6-base-devel
These commands install a few packages that are often, but not always needed. You for example might want to skip installing the development headers for ncurses, which you will only need in case you later want to adjust the kernel build configuration using the make targets ‘menuconfig’ or ‘nconfig’; likewise omit the headers of Qt6 if you do not plan to adjust the .config using ‘xconfig’.
You furthermore might need additional libraries and their development headers for tasks not covered in this guide -- for example when building utilities from the kernel’s tools/ directory.
Download the sources using Git¶
Retrieve the Linux mainline sources.[...]
The step-by-step guide outlines how to download the Linux sources using a full Git clone of Linus’ mainline repository. There is nothing more to say about that -- but there are two alternative ways to retrieve the sources that might work better for you:
If you have an unreliable internet connection, consider using a ‘Git bundle’.
If downloading the complete repository would take too long or require too much storage space, consider using a ‘shallow clone’.
Downloading Linux mainline sources using a bundle¶
Use the following commands to retrieve the Linux mainline sources using a bundle:
wget -c \
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/clone.bundle
git clone --no-checkout clone.bundle ~/linux/
cd ~/linux/
git remote remove origin
git remote add mainline \
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch mainline
git remote add -t master stable \
  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
In case the ‘wget’ command fails, just re-execute it; it will pick up where it left off.
Downloading Linux mainline sources using a shallow clone¶
First, execute the following command to retrieve the latest mainline codebase:
git clone -o mainline --no-checkout --depth 1 -b master \
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ~/linux/
cd ~/linux/
git remote add -t master stable \
  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
Now deepen your clone’s history to the second predecessor of the mainline release of your ‘good’ version. In case the latter is 6.0 or 6.0.13, 5.19 would be the first predecessor and 5.18 the second -- hence deepen the history up to that version:
git fetch --shallow-exclude=v5.18 mainline
Afterwards add the stable Git repository as remote and all required stable branches as explained in the step-by-step guide.
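Kernel version numbers do not follow a fixed scheme across major versions (5.19 was followed by 6.0), so the second predecessor cannot be computed from the number alone. The following sketch instead picks it from a list of tag names; the function name ‘second_predecessor’ is hypothetical, and ‘sort -V’ assumes GNU coreutils or a compatible implementation.

```shell
# Read tag names from stdin and print the mainline release two versions before $1.
# Only final mainline releases (vX.Y) are considered; -rc and stable tags are ignored.
second_predecessor() {
    grep -E '^v[0-9]+\.[0-9]+$' | sort -V -u | awk -v good="$1" '
        $0 == good { if (NR > 2) print lines[NR - 2]; exit }
        { lines[NR] = $0 }'
}

# Usage with the tags of the remote, as a shallow clone holds few tags itself:
#   git ls-remote --tags mainline |
#       sed -n 's|.*refs/tags/\(v[0-9.]*\)$|\1|p' | second_predecessor v6.0
```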
Note, shallow clones have a few peculiar characteristics:
For bisections the history needs to be deepened a few mainline versions farther than it seems necessary, as explained above already. That’s because Git otherwise will be unable to revert or describe most of the commits within a range (say 6.1..6.2), as they are internally based on earlier kernel releases (like 6.0-rc2 or 5.19-rc3).
This document in most places uses ‘git fetch’ with ‘--shallow-exclude=’ to specify the earliest version you care about (or to be precise: its git tag). You alternatively can use the parameter ‘--shallow-since=’ to specify an absolute (say ‘2023-07-15’) or relative (‘12 months’) date to define the depth of the history you want to download. When using it while bisecting mainline, ensure to deepen the history to at least 7 months before the release of the mainline release your ‘good’ kernel is based on.
Be warned, when deepening your clone you might encounter an error like ‘fatal: error in object: unshallow cafecaca0c0dacafecaca0c0dacafecaca0c0da’. In that case run ‘git repack -d’ and try again.
Start defining the build configuration for your kernel¶
Start preparing a kernel build configuration (the ‘.config’ file).[...]
Note, this is the first of multiple steps in this guide that create or modify build artifacts. The commands used in this guide store them right in the source tree to keep things simple. In case you prefer storing the build artifacts separately, create a directory like ‘~/linux-builddir/’ and add the parameter ‘O=~/linux-builddir/’ to all make calls used throughout this guide. You will have to point other commands there as well -- among them the ‘./scripts/config [...]’ commands, which will require ‘--file ~/linux-builddir/.config’ to locate the right build configuration.
Two things can easily go wrong when creating a .config file as advised:
The olddefconfig target will use a .config file from your build directory, if one is already present there (e.g. ‘~/linux/.config’). That’s totally fine if that’s what you intend (see next step), but in all other cases you want to delete it. This for example is important in case you followed this guide further, but due to problems come back here to redo the configuration from scratch.
Sometimes olddefconfig is unable to locate the .config file for your running kernel and will use defaults, as briefly outlined in the guide. In that case check if your distribution ships the configuration somewhere and manually put it in the right place (e.g. ‘~/linux/.config’) if it does. On distributions where /proc/config.gz exists this can be achieved using this command:
zcat /proc/config.gz > .config
Once you put it there, run ‘make olddefconfig’ again to adjust it to the needs of the kernel about to be built.
Note, the olddefconfig target will set any undefined build options to their default value. If you prefer to set such configuration options manually, use ‘make oldconfig’ instead. Then for each undefined configuration option you will be asked how to proceed; in case you are unsure what to answer, simply hit ‘enter’ to apply the default value. Note though that for bisections you normally want to go with the defaults, as you otherwise might enable a new feature that causes a problem looking like a regression (for example due to security restrictions).
Occasionally odd things happen when trying to use a config file prepared for one kernel (say 6.1) on an older mainline release -- especially if it is much older (say 5.15). That’s one of the reasons why the previous step in the guide told you to boot the kernel where everything works. If you manually add a .config file you thus want to ensure it’s from the working kernel and not from one that shows the regression.
In case you want to build kernels for another machine, locate its kernel build configuration; usually ‘ls /boot/config-$(uname -r)’ will print its name. Copy that file to the build machine and store it as ~/linux/.config; afterwards run ‘make olddefconfig’ to adjust it.
Trim the build configuration for your kernel¶
Disable any kernel modules apparently superfluous for your setup.[...]
As explained briefly in the step-by-step guide already: with localmodconfig it can easily happen that your self-built kernels will lack modules for tasks you did not perform at least once before utilizing this make target. That happens when a task requires kernel modules which are only autoloaded when you execute it for the first time. So when you never performed that task since starting your kernel the modules will not have been loaded -- and from localmodconfig’s point of view look superfluous, so it disables them to reduce the amount of code to be compiled.
You can try to avoid this by performing typical tasks that often will autoload additional kernel modules: start a VM, establish VPN connections, loop-mount a CD/DVD ISO, mount network shares (CIFS, NFS, ...), and connect all external devices (2FA keys, headsets, webcams, ...) as well as storage devices with file systems you otherwise do not utilize (btrfs, ext4, FAT, NTFS, XFS, ...). But it is hard to think of everything that might be needed -- even kernel developers often forget one thing or another at this point.
Do not let that risk bother you, especially when compiling a kernel only for testing purposes: everything typically crucial will be there. And if you forget something important you can turn on a missing feature manually later and quickly run the commands again to compile and install a kernel that has everything you need.
But if you plan to build and use self-built kernels regularly, you might want to reduce the risk by recording which modules your system loads over the course of a few weeks. You can automate this with modprobed-db. Afterwards use ‘LSMOD=<path>’ to point localmodconfig to the list of modules modprobed-db noticed being used:
yes '' | make LSMOD="${HOME}/.config/modprobed.db" localmodconfig
That parameter also allows you to build trimmed kernels for another machine in case you copied a suitable .config over to use as base (see previous step). Just run ‘lsmod > lsmod_foo-machine’ on that system and copy the generated file to your build host’s home directory. Then run this command instead of the one the step-by-step guide mentions:
yes '' | make LSMOD=~/lsmod_foo-machine localmodconfig
Tag the kernels about to be built¶
Ensure all the kernels you will build are clearly identifiable using a special tag and a unique version identifier. [...]
This allows you to differentiate your distribution’s kernels from those created during this process, as the files or directories for the latter will contain ‘-local’ in the name; it also helps picking the right entry in the boot menu and not losing track of your kernels, as their version numbers will look slightly confusing during the bisection.
Decide to enable or disable debug symbols¶
Decide how to handle debug symbols. [...]
Having debug symbols available can be important when your kernel throws a ‘panic’, ‘Oops’, ‘warning’, or ‘BUG’ later when running, as then you will be able to find the exact place where the problem occurred in the code. But collecting and embedding the needed debug information takes time and consumes quite a bit of space: in late 2022 the build artifacts for a typical x86 kernel trimmed with localmodconfig consumed around 5 Gigabyte of space with debug symbols, but less than 1 when they were disabled. The resulting kernel image and modules are bigger as well, which increases storage requirements for /boot/ and load times.
In case you want a small kernel and are unlikely to decode a stack trace later, you thus might want to disable debug symbols to avoid those downsides. If it later turns out that you need them, just enable them as shown and rebuild the kernel.
You on the other hand definitely want to enable them for this process, if there is a decent chance that you need to decode a stack trace later. The section ‘Decode failure messages’ in ‘Reporting issues’ explains this process in more detail.
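To see which way a given .config currently leans, you can look for the CONFIG_DEBUG_INFO symbol that the guide’s ‘scripts/config’ call enables. The following is a minimal sketch; the function name ‘debug_symbols_state’ is invented for this example.

```shell
# Report whether a kernel build configuration has debug symbols enabled.
# Checks for the CONFIG_DEBUG_INFO symbol; defaults to ./.config.
debug_symbols_state() {
    config=${1:-.config}
    if grep -q '^CONFIG_DEBUG_INFO=y$' "$config"; then
        echo enabled
    else
        echo disabled
    fi
}

# Usage in the source tree or on the configuration put aside:
#   debug_symbols_state ~/kernel-config-working
```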
Adjust build configuration¶
Check if you may want or need to adjust some other kernel configuration options:
Depending on your needs you at this point might want or have to adjust some kernel configuration options.
Distro specific adjustments¶
Are you running [...]
The following sections help you to avoid build problems that are known to occur when following this guide on a few commodity distributions.
Debian:
Remove a stale reference to a certificate file that would cause your build to fail:
./scripts/config --set-str SYSTEM_TRUSTED_KEYS ''
Alternatively, download the needed certificate and make that configuration option point to it, as the Debian handbook explains in more detail -- or generate your own, as explained in ‘Kernel module signing facility’.
Individual adjustments¶
If you want to influence the other aspects of the configuration, do so now. [...]
At this point you can use a command like ‘make menuconfig’ or ‘make nconfig’ to enable or disable certain features using a text-based user interface; to use a graphical configuration utility, run ‘make xconfig’ instead. All of them require development libraries from the toolkits they rely on (ncurses respectively Qt5 or Qt6); an error message will tell you if something required is missing.
Put the .config file aside¶
Reprocess the .config after the latest changes and store it in a safe place.[...]
Put the .config you prepared aside, as you want to copy it back to the build directory every time during this guide before you start building another kernel. That’s because going back and forth between different versions can alter .config files in odd ways; those occasionally cause side effects that could confuse testing or in some cases render the result of your bisection meaningless.
Try to reproduce the problem with the latest codebase¶
Verify the regression is not caused by some .config change and check if it still occurs with the latest codebase. [...]
For some readers it might seem unnecessary to check the latest codebase at this point, especially if you did that already with a kernel prepared by your distributor or face a regression within a stable/longterm series. But it’s highly recommended for these reasons:
You will run into any problems caused by your setup before you actually begin a bisection. That will make it a lot easier to differentiate between ‘this most likely is some problem in my setup’ and ‘this change needs to be skipped during the bisection, as the kernel sources at that stage contain an unrelated problem that causes building or booting to fail’.
These steps will rule out if your problem is caused by some change in the build configuration between the ‘working’ and the ‘broken’ kernel. This for example can happen when your distributor enabled an additional security feature in the newer kernel which was disabled or not yet supported by the older kernel. That security feature might get into the way of something you do -- in which case your problem from the perspective of the Linux kernel upstream developers is not a regression, as ‘Reporting regressions’ explains in more detail. You thus would waste your time if you tried to bisect this.
If the cause for your regression was already fixed in the latest mainline codebase, you’d perform the bisection for nothing. This holds true for a regression you encountered with a stable/longterm release as well, as they are often caused by problems in mainline changes that were backported -- in which case the problem will have to be fixed in mainline first. Maybe it already was fixed there and the fix is already in the process of being backported.
For regressions within a stable/longterm series it’s furthermore crucial to know if the issue is specific to that series or also happens in the mainline kernel, as the report needs to be sent to different people:
Regressions specific to a stable/longterm series are the stable team’s responsibility; mainline Linux developers might or might not care.
Regressions also happening in mainline are something the regular Linux developers and maintainers have to handle; the stable team does not care and does not need to be involved in the report; they just should be told to backport the fix once it’s ready.
Your report might be ignored if you send it to the wrong party -- and even when you get a reply there is a decent chance that developers tell you to evaluate which of the two cases it is before they take a closer look.
Check out the latest Linux codebase¶
Check out the latest Linux codebase.[...]
In case you later want to recheck if an even newer codebase might fix the problem, remember to run the ‘git fetch --shallow-exclude [...]’ command mentioned earlier again to update your local Git repository.
Build your kernel¶
Build the image and the modules of your first kernel using the config file you prepared. [...]
A lot can go wrong at this stage, but the instructions below will help you help yourself. Another subsection explains how to directly package your kernel up as deb, rpm or tar file.
Dealing with build errors¶
When a build error occurs, it might be caused by some aspect of your machine’s setup that often can be fixed quickly; other times though the problem lies in the code and can only be fixed by a developer. A close examination of the failure messages coupled with some research on the internet will often tell you which of the two it is. To perform such investigation, restart the build process like this:
make V=1
The ‘V=1’ activates verbose output, which might be needed to see the actual error. To make it easier to spot, this command also omits the ‘-j $(nproc --all)’ used earlier: it utilizes every CPU core in the system for the job, but this parallelism also results in some clutter when failures occur.
After a few seconds the build process should run into the error again. Now try to find the most crucial line describing the problem. Then search the internet for the most important and non-generic section of that line (say 4 to 8 words); avoid or remove anything that looks remotely system-specific, like your username or local path names like ‘/home/username/linux/’. First try your regular internet search engine with that string, afterwards search Linux kernel mailing lists via lore.kernel.org/all/.
This most of the time will find something that will explain what is wrong; quite often one of the hits will provide a solution for your problem, too. If you do not find anything that matches your problem, try again from a different angle by modifying your search terms or using another line from the error messages.
In the end, most issues you run into have likely been encountered and reported by others already. That includes issues where the cause is not your system, but lies in the code. If you run into one of those, you might thus find a solution (e.g. a patch) or workaround for your issue, too.
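Locating the crucial line is easier when the build output is stored in a file. The sketch below is one possible approach, not an official tool: the function name ‘first_build_error’ and the log file location are made up, and the pattern only catches compiler messages containing ‘error:’ and make’s own ‘***’ markers.

```shell
# Print the first line of a build log that reports an error, with the home
# directory path stripped so the result is less system-specific.
first_build_error() {
    log=$1
    home=${HOME:-/nonexistent}
    grep -m 1 -E 'error:|\*\*\*' "$log" | sed "s|$home/||g"
}

# Usage: capture the verbose, non-parallel build, then extract the line.
#   make V=1 2>&1 | tee ~/build.log
#   first_build_error ~/build.log
```

The printed line is only a starting point; often the lines right before it in the log provide context that leads to better search terms.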
Package your kernel up¶
The step-by-step guide uses the default make targets (e.g. ‘bzImage’ and ‘modules’ on x86) to build the image and the modules of your kernel, which later steps of the guide then install. You can instead build everything and directly package it up by using one of the following targets:
‘make -j $(nproc --all) bindeb-pkg’ to generate a deb package
‘make -j $(nproc --all) binrpm-pkg’ to generate an rpm package
‘make -j $(nproc --all) tarbz2-pkg’ to generate a bz2 compressed tarball
This is just a selection of available make targets for this purpose, see ‘make help’ for others. You can also use these targets after running ‘make -j $(nproc --all)’, as they will pick up everything already built.
If you employ the targets to generate deb or rpm packages, ignore the step-by-step guide’s instructions on installing and removing your kernel; instead install and remove the packages using the package utility for the format (e.g. dpkg and rpm) or a package management utility built on top of them (apt, aptitude, dnf/yum, zypper, ...). Be aware that the packages generated using these two make targets are designed to work on various distributions utilizing those formats; they thus will sometimes behave differently than your distribution’s kernel packages.
Put the kernel in place¶
Install the kernel you just built. [...]
What you need to do after executing the command in the step-by-step guide depends on the existence and the implementation of the ‘/sbin/installkernel’ executable on your distribution.
If installkernel is found, the kernel’s build system will delegate the actual installation of your kernel image to this executable, which then performs some or all of these tasks:
On almost all Linux distributions installkernel will store your kernel’s image in /boot/, usually as ‘/boot/vmlinuz-<kernelrelease_id>’; often it will put a ‘System.map-<kernelrelease_id>’ alongside it.
On most distributions installkernel will then generate an ‘initramfs’ (sometimes also called ‘initrd’), which usually is stored as ‘/boot/initramfs-<kernelrelease_id>.img’ or ‘/boot/initrd-<kernelrelease_id>’. Commodity distributions rely on this file for booting, hence ensure to execute the make target ‘modules_install’ first, as your distribution’s initramfs generator otherwise will be unable to find the modules that go into the image.
On some distributions installkernel will then add an entry for your kernel to your bootloader’s configuration.
You have to take care of some or all of the tasks yourself, if your distribution lacks an installkernel script or it only handles part of them. Consult the distribution’s documentation for details. If in doubt, install the kernel manually:
sudo install -m 0600 $(make -s image_name) /boot/vmlinuz-$(make -s kernelrelease)
sudo install -m 0600 System.map /boot/System.map-$(make -s kernelrelease)
Now generate your initramfs using the tools your distribution provides for this process. Afterwards add your kernel to your bootloader configuration and reboot.
Storage requirements per kernel¶
Check how much storage space the kernel, its modules, and other related files like the initramfs consume. [...]
The kernels built during a bisection consume quite a bit of space in /boot/ and /lib/modules/, especially if you enabled debug symbols. That makes it easy to fill up volumes during a bisection -- and due to that even kernels which used to work earlier might fail to boot. To prevent that you will need to know how much space each installed kernel typically requires.
Note, most of the time the pattern ‘/boot/$(make -s kernelrelease)’ used in the guide will match all files needed to boot your kernel -- but neither the path nor the naming scheme are mandatory. On some distributions you thus will need to look in different places.
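The measurement can be wrapped in a small helper that also works for kernels other than the one in your build directory. This is a sketch assuming the common layout described above; the function name ‘kernel_footprint’ and its optional directory arguments are invented for this example.

```shell
# Print the combined size of the files in $2 (default /boot) whose names contain
# the kernelrelease identifier $1, followed by the size of its modules directory
# below $3 (default /lib/modules).
kernel_footprint() {
    release=$1
    bootdir=${2:-/boot}
    moddir=${3:-/lib/modules}
    du -ch "$bootdir"/*"$release"* 2>/dev/null | tail -n 1
    du -sh "$moddir/$release" 2>/dev/null
}

# Usage from within the source tree, for the kernel just installed:
#   kernel_footprint "$(make -s kernelrelease)"
```

On distributions that store kernels elsewhere, pass the respective directories as second and third argument.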
Check if your newly built kernel considers itself ‘tainted’¶
Check if the kernel marked itself as ‘tainted’. [...]
Linux marks itself as tainted when something happens that potentially leads to follow-up errors that look totally unrelated. That is why developers might ignore or react scantly to reports from tainted kernels -- unless of course the kernel set the flag right when the reported bug occurred.
That’s why you want to check why a kernel is tainted as explained in Tainted kernels; doing so is also in your own interest, as your testing might be flawed otherwise.
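The kernel exposes its taint state as a number in /proc/sys/kernel/tainted, where 0 means ‘not tainted’. A minimal sketch of a check built on that file follows; the helper name taint_status is hypothetical, and decoding a non-zero value into individual reasons is left to the Tainted kernels document.

```shell
# Report whether the running kernel considers itself tainted.
# 'taint_status' is a hypothetical helper; the optional argument exists
# only so the check can be tried against a file other than the real one.
taint_status() (
    taint_file=${1:-/proc/sys/kernel/tainted}
    value=$(cat "$taint_file") || exit 1
    if [ "$value" -eq 0 ]; then
        echo "not tainted"
    else
        echo "tainted (value $value)"
    fi
)
```

Run taint_status right after booting and again after reproducing the problem: a value that changes in between hints that the taint was set by the bug itself.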
Check the kernel built from a recent mainline codebase¶
Verify if your bug occurs with the newly built kernel. [...]
There are a couple of reasons why your bug or regression might not show up with the kernel you built from the latest codebase. These are the most frequent:
The bug was fixed meanwhile.
What you suspected to be a regression was caused by a change in the build configuration the provider of your kernel carried out.
Your problem might be a race condition that does not show up with your kernel; the trimmed build configuration, a different setting for debug symbols, the compiler used, and various other things can cause this.
In case you encountered the regression with a stable/longterm kernel it might be a problem that is specific to that series; the next step in this guide will check this.
Check the kernel built from the latest stable/longterm codebase¶
Are you facing a regression within a stable/longterm release, but failed to reproduce it with the kernel you just built using the latest mainline sources? Then check if the latest codebase for the particular series might already fix the problem. [...]
If this kernel does not show the regression either, there most likely is no need for a bisection.
Ensure the ‘good’ version is really working well¶
Check if the kernels you build work fine.[...]
This section will reestablish a known working base. Skipping it might be appealing, but is usually a bad idea, as it does something important:
It will ensure the .config file you prepared earlier actually works as expected. That is in your own interest, as trimming the configuration is not foolproof -- and you might be building and testing ten or more kernels for nothing before starting to suspect something might be wrong with the build configuration.
That alone is reason enough to spend the time on this, but not the only reason.
Many readers of this guide normally run kernels that are patched, use add-on modules, or both. Those kernels thus are not considered ‘vanilla’ -- therefore it’s possible that the thing that regressed might never have worked in vanilla builds of the ‘good’ version in the first place.
There is a third reason for those that noticed a regression between stable/longterm kernels of different series (e.g. 6.0.13..6.1.5): it will ensure the kernel version you assumed to be ‘good’ earlier in the process (e.g. 6.0) actually is working.
Build your own version of the ‘good’ kernel¶
Build your own variant of the working kernel and check if the feature that regressed works as expected with it. [...]
In case the feature that broke with newer kernels does not work with your first self-built kernel, find and resolve the cause before moving on. There are a multitude of reasons why this might happen. Some ideas where to look:
Check the taint status and the output of dmesg, maybe something unrelated went wrong.
Maybe localmodconfig did something odd and disabled the module required to test the feature? Then you might want to recreate a .config file based on the one from the last working kernel and skip trimming it down; manually disabling some features in the .config might work as well to reduce the build time.
Maybe it’s not a kernel regression but something that is caused by some fluke, a broken initramfs (also known as initrd), new firmware files, or an updated userland software?
Maybe it was a feature added to your distributor’s kernel which vanilla Linux at that point never supported?
Note, if you found and fixed problems with the .config file, you want to use it to build another kernel from the latest codebase, as your earlier tests with mainline and the latest version from an affected stable/longterm series were most likely flawed.
Perform a bisection and validate the result¶
With all the preparations and precaution builds taken care of, you are now ready to begin the bisection. [...]
The steps in this segment perform and validate the bisection.
Start the bisection¶
Start the bisection and tell Git about the versions earlier established as ‘good’ and ‘bad’. [...]
This will start the bisection process; the last of the commands will make Git check out a commit round about half-way between the ‘good’ and the ‘bad’ changes for you to test.
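Because every tested kernel roughly halves the range of remaining changes, the number of kernels to build grows only with the logarithm of the number of commits between ‘good’ and ‘bad’. The sketch below illustrates that arithmetic; the helper name bisect_steps is hypothetical, and Git’s own estimate (printed after each bisection step) can differ slightly because the history is not a straight line.

```shell
# Estimate how many test rounds a bisection over N commits takes:
# each round roughly halves the remaining range, so about log2(N)
# rounds are needed. 'bisect_steps' is a hypothetical helper.
bisect_steps() (
    remaining=$1
    steps=0
    while [ "$remaining" -gt 1 ]; do
        # Round up, as the larger half might be the one that remains.
        remaining=$(( (remaining + 1) / 2 ))
        steps=$((steps + 1))
    done
    echo "$steps"
)
```

For a hypothetical range of 15000 commits, bisect_steps 15000 prints 14. The real number of commits in a range can be looked up with ‘git rev-list --count v6.0..v6.1’.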
Build a kernel from the bisection point¶
Build, install, and boot a kernel from the code Git checked out using the same commands you used earlier. [...]
There are two things worth noting here:
Occasionally building the kernel will fail or it might not boot due to some problem in the code at the bisection point. In that case run this command:
git bisect skip
Git will then check out another commit nearby which with a bit of luck should work better. Afterwards restart executing this step.
The kernelrelease identifiers of kernels built during a bisection can look slightly odd, because the Linux kernel subsystems prepare their changes for a new mainline release (say 6.2) before its predecessor (e.g. 6.1) is finished. They thus base them on a somewhat earlier point like 6.1-rc1 or even 6.0 -- and those changes then get merged for 6.2 without being rebased or squashed once 6.1 is out.
Bisection checkpoint¶
Check if the feature that regressed works in the kernel you just built. [...]
Ensure what you tell Git is accurate: getting it wrong just one time will bring the rest of the bisection totally off course, hence all testing after that point will be for nothing.
Put the bisection log away¶
Store Git’s bisection log and the current .config file in a safe place. [...]
As indicated above: declaring just one kernel wrongly as ‘good’ or ‘bad’ will render the end result of a bisection useless. In that case you’d normally have to restart the bisection from scratch. The log can prevent that, as it might allow someone to point out where a bisection likely went sideways -- and then instead of testing ten or more kernels you might only have to build a few to resolve things.
The .config file is put aside, as there is a decent chance that developers might ask for it after you report the regression.
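To give an idea of what such a log contains: ‘git bisect log’ writes one ‘git bisect good|bad|skip <commit>’ line per verdict, interleaved with comment lines starting with ‘#’. The following sketch counts the verdicts in a saved log, which is a quick way to double-check that it reflects the kernels you actually tested; the helper name summarize_bisect_log is hypothetical.

```shell
# Count the verdicts recorded in a saved bisection log.
# 'summarize_bisect_log' is a hypothetical helper; it expects a file
# created with 'git bisect log > <file>'.
summarize_bisect_log() (
    log_file=$1
    for verdict in good bad skip; do
        # grep -c prints 0 but exits non-zero when nothing matches.
        count=$(grep -c "^git bisect $verdict" "$log_file" || true)
        echo "$verdict: $count"
    done
)
```

If a verdict later turns out to be wrong, someone can edit the offending lines out of the saved file and feed the remainder to ‘git bisect replay <file>’ instead of starting from scratch.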
Try reverting the culprit¶
Try reverting the culprit on top of the latest codebase to see if this fixes your regression. [...]
This is an optional step, but whenever possible one you should try: there is a decent chance that developers will ask you to perform this step when you bring the bisection result up. So give it a try: you are in the flow already, and building one more kernel shouldn’t be a big deal at this point.
The step-by-step guide covers everything relevant already except one slightly rare thing: did you bisect a regression that also happened with mainline using a stable/longterm series, but Git failed to revert the commit in mainline? Then try to revert the culprit in the affected stable/longterm series -- and if that succeeds, test that kernel version instead.
Cleanup steps during and after following this guide¶
During and after following this guide you might want or need to remove some of the kernels you installed. [...]
The steps in this section describe clean-up procedures.
Cleaning up during the bisection¶
To remove one of the kernels you installed, look up its ‘kernelrelease’ identifier. [...]
The kernels you install during this process are easy to remove later, as their parts are only stored in two places and clearly identifiable. You thus do not need to worry about messing up your machine when you install a kernel manually (and thus bypass your distribution’s packaging system).
One of the two places is a directory in /lib/modules/, which holds the modules for each installed kernel. This directory is named after the kernel’s release identifier; hence, to remove all modules for one of the kernels you built, simply remove its modules directory in /lib/modules/.
The other place is /boot/, where typically two to five files will be placed during installation of a kernel. All of them usually contain the release name in their file name, but how many files and their exact names depend somewhat on your distribution’s installkernel executable and its initramfs generator. On some distributions the ‘kernel-install remove ...’ command mentioned in the step-by-step guide will delete all of these files for you while also removing the menu entry for the kernel from your bootloader configuration. On others you have to take care of these two tasks yourself. The following command should interactively remove the three main files of a kernel with the release name ‘6.0-rc1-local-gcafec0cacaca0’:
rm -i /boot/{System.map,vmlinuz,initr}-6.0-rc1-local-gcafec0cacaca0
Afterwards check for other files in /boot/ that have ‘6.0-rc1-local-gcafec0cacaca0’ in their name and consider deleting them as well. Now remove the boot entry for the kernel from your bootloader’s configuration; the steps to do that vary quite a bit between Linux distributions.
Note, be careful with wildcards like ‘*’ when deleting files or directories for kernels manually: you might accidentally remove files of a 6.0.13 kernel when all you want is to remove 6.0 or 6.0.1.
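The wildcard pitfall can be tried out safely in a scratch directory. The sketch below creates two empty stand-in files with made-up release names (‘6.0-local’ and ‘6.0.13-local’) and counts what a loose and a precise pattern match; nothing in /boot/ is touched.

```shell
# Demonstrate in a scratch directory why a loose wildcard is dangerous.
# The file names are made-up stand-ins for two installed kernels.
scratch=$(mktemp -d)
touch "$scratch/vmlinuz-6.0-local" "$scratch/vmlinuz-6.0.13-local"

# Helper that prints how many arguments (matched files) it received.
count_matches() { echo "$#"; }

# Too loose: '6.0*' also matches the 6.0.13 kernel.
loose_matches=$(count_matches "$scratch"/vmlinuz-6.0*)

# Safer: include the complete release identifier in the pattern.
exact_matches=$(count_matches "$scratch"/*-6.0-local)

echo "loose: $loose_matches, exact: $exact_matches"
rm -r "$scratch"
```

This prints ‘loose: 2, exact: 1’. Previewing a pattern with ls before handing it to rm is a cheap way to catch such surprises.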
Cleaning up after the bisection¶
Once you have finished the bisection, do not immediately remove anything you set up, as you might need a few things again. [...]
When you are really short of storage space removing the kernels as described in the step-by-step guide might not free as much space as you would like. In that case consider running ‘rm -rf ~/linux/*’ as well now. This will remove the build artifacts and the Linux sources, but will leave the Git repository (~/linux/.git/) behind -- a simple ‘git reset --hard’ thus will bring the sources back.
Removing the repository as well would likely be unwise at this point: there is a decent chance developers will ask you to build another kernel to perform additional tests -- like testing a debug patch or a proposed fix. Details on how to perform those can be found in the section ‘Optional tasks: test reverts, patches, or later versions’.
Additional tests are also the reason why you want to keep the ~/kernel-config-working file around for a few weeks.
Test reverts, patches, or later versions¶
While or after reporting a bug, you might want or potentially will be asked to test reverts, patches, proposed fixes, or other versions. [...]
All the commands used in this section should be pretty straightforward, so there is not much to add except one thing: when setting a kernel tag as instructed, ensure it is not much longer than the one used in the example, as problems will arise if the kernelrelease identifier exceeds 63 characters.
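The length limit is easy to check before starting a build. A minimal sketch, assuming you pass it the identifier printed by ‘make -s kernelrelease’; the helper name check_release_length is hypothetical.

```shell
# Check that a kernelrelease identifier stays within 63 characters.
# 'check_release_length' is a hypothetical helper.
check_release_length() (
    release=$1
    max=63
    if [ "${#release}" -gt "$max" ]; then
        echo "too long: ${#release} characters (limit $max)"
        exit 1
    fi
    echo "ok: ${#release} characters"
)
```

For example, check_release_length "$(make -s kernelrelease)" run in the source tree after setting the tag reports whether the resulting identifier is still acceptable.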
Additional information¶
Build kernels on a different machine¶
To compile kernels on another system, slightly alter the step-by-step guide’s instructions:
Start following the guide on the machine where you want to install and test the kernels later.
After executing ‘Boot into the working kernel and briefly use the apparently broken feature’, save the list of loaded modules to a file using ‘lsmod > ~/test-machine-lsmod’. Then locate the build configuration for the running kernel (see ‘Start defining the build configuration for your kernel’ for hints on where to find it) and store it as ‘~/test-machine-config-working’. Transfer both files to the home directory of your build host.
Continue the guide on the build host (e.g. with ‘Ensure to have enough free space for building [...]’).
When you reach ‘Start preparing a kernel build configuration [...]’: before running ‘make olddefconfig’ for the first time, execute the following command to base your configuration on the one from the test machine’s ‘working’ kernel:
cp ~/test-machine-config-working ~/linux/.config
During the next step to ‘disable any apparently superfluous kernel modules’ use the following command instead:
yes '' | make LSMOD=~/test-machine-lsmod localmodconfig
Continue the guide, but ignore the instructions outlining how to compile, install, and reboot into a kernel every time they come up. Instead build like this:
cp ~/kernel-config-working .config
make olddefconfig &&
make -j $(nproc --all) targz-pkg
This will generate a gzipped tar file whose name is printed in the last line shown; for example, a kernel with the kernelrelease identifier ‘6.0.0-rc1-local-g928a87efa423’ built for x86 machines usually will be stored as ‘~/linux/linux-6.0.0-rc1-local-g928a87efa423-x86.tar.gz’.
Copy that file to your test machine’s home directory.
Switch to the test machine to check if you have enough space to hold another kernel. Then extract the file you transferred:
sudo tar -xvzf ~/linux-6.0.0-rc1-local-g928a87efa423-x86.tar.gz -C /
Afterwards generate the initramfs and add the kernel to your bootloader’s configuration; on some distributions the following command will take care of both these tasks:
sudo /sbin/installkernel 6.0.0-rc1-local-g928a87efa423 /boot/vmlinuz-6.0.0-rc1-local-g928a87efa423
Now reboot and ensure you started the intended kernel.
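One simple way to confirm that the intended kernel is running is to compare the output of ‘uname -r’ with the kernelrelease identifier of the kernel you transferred. A minimal sketch; the helper name verify_running_kernel is hypothetical.

```shell
# Compare the running kernel's release with the one you expect.
# 'verify_running_kernel' is a hypothetical helper; the optional second
# argument replaces the output of 'uname -r' for dry runs.
verify_running_kernel() (
    expected=$1
    actual=${2:-$(uname -r)}
    if [ "$actual" = "$expected" ]; then
        echo "running the intended kernel: $actual"
    else
        echo "expected $expected, but running $actual"
        exit 1
    fi
)
```

On the test machine, verify_running_kernel 6.0.0-rc1-local-g928a87efa423 would then either confirm the boot or show which kernel the bootloader picked instead.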
This approach even works when building for another architecture: just install cross-compilers and add the appropriate parameters to every invocation of make (e.g. ‘make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- [...]’).
Additional reading material¶
The man page for ‘git bisect’ and fighting regressions with ‘git bisect’ in the Git documentation.
Working with git bisect from kernel developer Nathan Chancellor.
Using Git bisect to figure out when brokenness was introduced.