Kernel Debugging Techniques
Why This Matters
Kernel development follows a tight loop: write code → build → deploy → test → debug. Unlike in userspace, you cannot trivially attach a debugger, symbols may be stripped, and a crash halts the entire machine rather than just one process. Even experienced kernel developers identify debugging as the real bottleneck. The earlier you internalize these techniques, the less time you will spend staring at a blank screen after a panic.
1. Print Debug Messages with printk()
printk() is the kernel's equivalent of printf(). It writes a formatted string to the kernel's ring buffer, where it can be retrieved with dmesg or read from /proc/kmsg.
Log Levels
Every printk() call should specify a log level prefix. Levels run from 0 (most urgent) to 7 (most verbose):
| Macro | Level | Meaning |
|---|---|---|
| KERN_EMERG | 0 | System is unusable |
| KERN_ALERT | 1 | Action must be taken immediately |
| KERN_CRIT | 2 | Critical conditions |
| KERN_ERR | 3 | Error conditions |
| KERN_WARNING | 4 | Warning conditions |
| KERN_NOTICE | 5 | Normal but significant |
| KERN_INFO | 6 | Informational |
| KERN_DEBUG | 7 | Debug-level messages |
If you omit the level, the kernel applies the default message level, which is set at build time by CONFIG_MESSAGE_LOGLEVEL_DEFAULT and is normally 4 (KERN_WARNING).
printk(KERN_DEBUG "debug message from %s:%d\n", __func__, __LINE__);
Controlling Which Messages Appear
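The kernel prints a message to the console only if its level is numerically lower (higher priority) than the current console log level. Messages that fail this test are still stored in the ring buffer and remain visible through dmesg. You can inspect and change the console log level:
# Fields: current console level, default message level, minimum console level, boot-time default console level
$ cat /proc/sys/kernel/printk
4 4 1 7
# Show every level (0–7) on the console during development.
# The value must be 8, because only levels strictly below it are printed. Writing requires root:
$ sudo sh -c 'echo 8 > /proc/sys/kernel/printk'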
The Ring Buffer
The kernel message buffer is a fixed-size circular buffer. When it fills up it wraps around, discarding the oldest messages. If you are generating a lot of output, increase the buffer size by adding log_buf_len=1M (must be a power of 2) to the kernel boot parameters.
Special Format Specifiers
printk() supports extra format specifiers beyond standard printf:
/* Print a symbol name + offset from a function pointer */
printk("Calling: %pS\n", p->func); // "versatile_init+0x0/0x110"
printk("Faulted at %pS\n", (void *)regs->ip);
/* Print a symbol from a stack return address */
printk(" %s%pB\n", reliable ? "" : "? ", (void *)*stack);
These are invaluable for decoding raw addresses in panic output.
Convenience Wrappers
Rather than writing the log level prefix by hand, use the pr_* family:
pr_info("Module loaded, version %d\n", VERSION);
pr_debug("Entering %s\n", __func__);
pr_err("Failed to allocate buffer: %d\n", ret);
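Note that pr_debug() produces no output unless DEBUG is defined when the file is compiled or the call site is enabled through dynamic debug (CONFIG_DYNAMIC_DEBUG). For driver code that has a struct device *dev, use dev_info(), dev_err(), etc., which automatically prefix the device name. To generate the contents of a /proc file, use seq_printf(), which writes to the file being read rather than to the kernel log.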
2. Assertions: BUG_ON() and WARN_ON()
These macros are the kernel's equivalent of assert().
BUG_ON(ptr == NULL); // oops (and possibly a panic) if ptr is NULL
WARN_ON(len > MAX); // prints backtrace but keeps running
| Macro | When the condition is true | Use when |
|---|---|---|
| BUG_ON(c) | Kernel oops with full call stack; the current task is killed, and the kernel panics if panic_on_oops is set or the failure happens in interrupt context | The invariant violation is unrecoverable |
| WARN_ON(c) | Call stack printed, execution continues | The invariant violation is suspicious but survivable |
BUG_ON is a hard stop: use it only where continuing would corrupt data or produce nonsensical results. WARN_ON is for "this should not happen, but if it does, log it and limp on."
3. Analyzing Kernel Panic Messages
When the kernel panics it prints a message like:
RIP: 0010:lkp_init+0x41/0x1000
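RIP is the instruction pointer at the time of the fault. lkp_init+0x41/0x1000 means the faulting instruction is 0x41 bytes into the lkp_init function, which is 0x1000 bytes long. There are two ways to find the corresponding source line.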
Method 1: objdump
objdump -S lkp.o | less
-S interleaves source with disassembly (requires debug symbols). Scroll to the function and find the instruction at offset 0x41 from its start; the source lines printed just above it are the ones involved in the fault. Add -l if you also want file names and line numbers.
Method 2: gdb
gdb lkp.o
(gdb) list *(lkp_init+0x41)
gdb decodes the address directly and prints the surrounding source lines. This is faster once you know the syntax.
4. Interactive Debugging with QEMU and GDB
For stepping through kernel code interactively, the gold standard is running the kernel inside QEMU and attaching GDB over QEMU's built-in GDB stub.
Architecture Overview
+------------------+  :1234  +------------------+
|    GDB (host)    | <-TCP-> |  QEMU GDB stub   |
| (your terminal)  |         |  (controls VM)   |
+------------------+         +------------------+
                                      |
                             +------------------+
                             |   Linux kernel   |
                             |    (guest VM)    |
                             +------------------+
Because the GDB stub is wired directly into QEMU's emulation logic, GDB has full control: it can halt execution, set breakpoints anywhere (even in boot code), inspect registers, and walk kernel data structures.
Step 1: Build the Kernel with Debug Info
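Enable these options in .config (or via make menuconfig → Kernel hacking → Compile-time checks and compiler options):
CONFIG_DEBUG_INFO=y
CONFIG_GDB_SCRIPTS=y
CONFIG_DEBUG_INFO builds the kernel with DWARF debug information. On recent kernels it cannot be set directly; it is selected when you pick a DWARF option in the "Debug information" choice. CONFIG_GDB_SCRIPTS makes the Python helpers from scripts/gdb/ available in the build directory; they add Linux-aware lx-* commands and functions to GDB.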
Step 2: Launch QEMU with the GDB Stub
sudo qemu-system-x86_64 \
-s -nographic -smp 2 -m 2G \
-nic user,host=10.0.2.10,hostfwd=tcp:127.0.0.1:2200-:22 \
-net nic,model=e1000 \
-drive file=alpine.qcow2,format=qcow2 \
-kernel ${BZIMAGE} -append "nokaslr console=ttyS0 root=/dev/sda3"
Key flags:
- -s opens the GDB stub on TCP port 1234 (shorthand for -gdb tcp::1234).
- -S (optional, not used above) keeps the guest CPU stopped at startup until GDB tells it to continue.
- nokaslr, on the kernel command line, disables kernel address space layout randomization so symbol addresses are stable.
Step 3: Connect GDB
cd /path/to/linux-build
gdb vmlinux
(gdb) target remote :1234
You now have a live GDB session attached to the running kernel.
Useful GDB Commands for Kernel Debugging
(gdb) b lkp_init # breakpoint at function
(gdb) hbreak start_kernel # hardware breakpoint (needed for very early boot)
(gdb) d 1 # delete breakpoint 1
(gdb) c # continue
(gdb) bt # backtrace
(gdb) n # next (step over)
(gdb) s # step (step into)
(gdb) p variable # print variable
(gdb) p *ptr # print dereferenced pointer
(gdb) info registers # dump registers
Linux-Provided GDB Helpers (lx-*)
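With CONFIG_GDB_SCRIPTS=y, GDB loads vmlinux-gdb.py from the build directory when you open vmlinux. If GDB refuses to auto-load it, add the line add-auto-load-safe-path /path/to/linux-build to ~/.gdbinit. Then load the kernel and module symbols with: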
(gdb) lx-symbols
Then use Linux-specific helpers:
(gdb) lx-dmesg # print kernel log buffer of the target
(gdb) p $lx_current().pid # inspect current task's PID
(gdb) apropos lx # list all lx-* helpers
You can set breakpoints on kernel module functions before the module is loaded:
(gdb) b btrfs_init_sysfs
# GDB will ask: "Make breakpoint pending on future shared library load? (y or [n]) y"
(gdb) c # continue; GDB fires when the module loads
Practical Tips
- Always shut down the QEMU VM with poweroff (not by closing the terminal). Killing QEMU abruptly can corrupt the disk image.
- On a physical x86 host, add -enable-kvm to dramatically speed up emulation.
- If you are not seeing symbols, confirm vmlinux (not bzImage) is the file passed to GDB; bzImage is compressed and stripped.
Key Takeaways
- printk() with explicit log levels is your first line of defense. Set the console log level in /proc/sys/kernel/printk to 8 during development to see all messages, including KERN_DEBUG.
- BUG_ON() stops execution on an invariant violation; WARN_ON() logs and continues. Choose based on whether the violation is recoverable.
- Panic messages encode the fault address as function+offset. Decode it with objdump -S or with list *(func+offset) in gdb.
- QEMU + GDB gives you a full interactive debugger attached to a live kernel. The combination of CONFIG_DEBUG_INFO, -s (QEMU stub), and lx-symbols (GDB helper) is the standard kernel debugging setup.
- nokaslr is essential for QEMU/GDB debugging: without it, runtime addresses won't match the symbols in vmlinux.