Introduction to Linux Kernel & Developer Tools

Why This Matters

The Linux kernel is one of the largest and most actively developed software projects in history—over 27 million lines of C code, with roughly 1,600 contributors adding ~7,500 lines every single day. Before you can write a single line of kernel code, you need a working mental model of what you're modifying and how to work in it efficiently. Professional kernel developers do not just open files and start typing. They use a specific, battle-tested toolchain—and mastering it early saves enormous time later.

This module covers the kernel's architecture and development model, then walks through every tool you'll use throughout the course.

What Is the Linux Kernel?

An operating system kernel is the bridge between applications and hardware. The kernel's job is to:

Responsibility	Example
Abstract hardware	Expose a file descriptor instead of raw disk sectors
Multiplex resources	Schedule 1,000 processes across 8 CPU cores
Isolate processes	Prevent process A from reading process B's memory
Enable sharing	Let two processes open the same file

User-space programs communicate with the kernel through the system call interface:

fd = open("out", O_WRONLY);  // kernel opens the file, returns a handle
write(fd, "hello\n", 6);     // kernel writes to the underlying storage
pid = fork();                // kernel duplicates the calling process

The CPU enforces this boundary in hardware. On x86, user-space runs at ring 3 and the kernel at ring 0. Only ring-0 code may touch I/O devices, modify page tables, or execute privileged instructions.

Monolithic vs. Micro-Kernel Design

Linux uses a monolithic kernel design: the entire OS (scheduler, file systems, networking, device drivers) runs in kernel space, sharing one address space.

User space:   application A    application B
              ─────────────────────────────── (system call boundary)
Kernel space: scheduler | VFS | net stack | drivers  ← all one binary
Hardware:     CPU, RAM, disk, NIC

Trade-offs:

	Monolithic	Micro-kernel
Performance	Fast (direct function calls between subsystems)	Slower (IPC between servers)
Isolation	Weak (one bug can crash everything)	Strong (servers in user space crash safely)
Complexity	Lower interface count	More IPC plumbing
Examples	Linux, FreeBSD	Minix, seL4

The Tanenbaum–Torvalds debate (1992) argued this exact question. In practice, some production kernels (macOS's XNU, Windows NT) are usually described as hybrids, while Linux remains monolithic and relies on loadable modules for extensibility.

Linux History and Release Cycle

Year	Milestone
1991	First release by Linus Torvalds
1992	GPL license; first distros
1996	v2.0 – SMP (multiprocessor) support
2003	v2.6 – preemptible kernel, O(1) scheduler, many new architectures
2015	v4.0 – live patching
Today	Releases ~every 70 days, 13,000 patches/release

Version numbering: (major).(minor).(stable) — e.g., 6.1.71

Mainline: Linus's tree, where all new features land
Stable: bug fixes backported after each mainline release
LTS: a stable release maintained for several years (e.g., 6.1, 5.15)
RC: weekly release candidates (e.g., 6.2-rc3) cut from mainline for testing before each release

Linux is licensed under GPLv2: if you distribute a modified kernel, you must make the complete corresponding source available under the same license, including the scripts used to control compilation and installation.

Version Control: git

Git was invented by Linus Torvalds to manage Linux kernel development. It is a distributed VCS: every clone is a full repository with complete history.

Getting the kernel source

git clone https://github.com/torvalds/linux.git   # GitHub mirror
cd linux
git checkout v6.1                                  # pin to a release tag

Essential daily commands

# History and blame
git log                  # full commit history
git log <file>           # history for one file
git blame <file>         # who changed each line and when

# Local state
git status               # what has changed
git diff                 # exact line-level diff
git add <file>           # stage a file
git commit               # commit staged changes locally

# Sync
git push                 # send commits to remote
git pull                 # fetch and merge from remote

# Tags
git tag                  # list all tags
git checkout v6.1        # check out a tagged version

Use tig for a prettier ncurses log viewer. Useful aliases to add to ~/.gitconfig:

[alias]
    lg = log --graph
    lp = log --graph --pretty=oneline
    st = status
    co = checkout

The Kernel Source Tree

linux/
├── arch/        # Architecture-specific code (x86, arm, …)
├── block/       # Block device layer
├── Documentation/
├── drivers/     # Device drivers (largest directory)
├── fs/          # File systems (ext4, btrfs, proc, …)
├── include/     # Kernel headers
├── init/        # Early boot code (start_kernel lives here)
├── kernel/      # Core kernel: scheduler, signals, timers
├── mm/          # Memory management
├── net/         # Network stack
└── virt/        # Virtualization (KVM)

There are over 630 directories. You need tools to navigate this.

Building the Kernel

Building happens in three distinct phases.

Step 1 — Configure

The .config file at the repo root controls ~3,700 compilation flags for x86. Common approaches:

Command	What it does
`make menuconfig`	Interactive ncurses menu; requires `libncurses`, `flex`, `bison`
`make defconfig`	Default config for the current architecture
`make oldconfig`	Update an existing `.config` (on many distributions, one copied from `/boot/config-$(uname -r)`); prompts only for new options
`make localmodconfig`	Config based on currently loaded modules (smallest build)

Step 2 — Compile

make -j$(nproc)          # build everything: vmlinux, bzImage, and modules (.ko files)
make modules -j$(nproc)  # optional: rebuild only the loadable modules

The -j flag parallelizes across CPU cores. On a 16-core machine, make -j16 cuts build time dramatically. The output kernel image lands at arch/x86/boot/bzImage.

Step 3 — Install

sudo make modules_install   # installs .ko files to /lib/modules/
sudo make install           # copies bzImage and updates bootloader
sudo reboot                 # boots into the new kernel
uname -a                    # verify the version
dmesg                       # inspect kernel log

Alternative (package-based): Generate .deb or .rpm packages for safer, reversible installation:

make deb-pkg          # Debian/Ubuntu; .deb files land in the parent directory
sudo dpkg -i ../linux-image-*.deb ../linux-headers-*.deb   # file names vary with version and config

Exploring the Code

Linux Cross Reference (LXR)

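A web-based cross-referencer is the fastest way to explore the kernel without installing anything. The original LXR has largely been superseded by Elixir, hosted at elixir.bootlin.com, where you can: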

Browse any kernel version's source
Search for any identifier (function, variable, struct)
Follow cross-references to see every place a symbol is defined or used

cscope

A terminal-based C code browser. Build its database:

sudo apt install cscope
cd linux
ARCH=x86 make cscope       # x86-only (faster, smaller DB)
# or
make cscope                # all architectures

Inside cscope you can search for: C identifiers, function definitions, functions calling/called-by a given function, and text strings. Press Ctrl-d to quit.

ctags + vim

sudo apt install exuberant-ctags
cd linux; ARCH=x86 make tags -j2

In vim:

:tag start_kernel — jump to the definition of start_kernel
Ctrl-] — follow the tag under the cursor
Ctrl-t — jump back
:bp / :bn — navigate between open files

Terminal Multiplexer: tmux

tmux lets you run multiple terminal sessions inside one SSH connection and detach/reattach without losing state—essential for long kernel builds on a remote machine.

Command	Action
`tmux`	Start a new session
`Ctrl-b %`	Split pane vertically (left/right)
`Ctrl-b "`	Split pane horizontally (top/bottom)
`Ctrl-b z`	Zoom/unzoom current pane
`Ctrl-b c`	Create a new window
`Ctrl-b d`	Detach (session keeps running)
`tmux a`	Reattach to existing session

Kernel vs. User Programming

Writing kernel code feels different from application programming. Key differences:

No standard library

The kernel cannot link against libc. It ships its own equivalents:

User space	Kernel space
`#include <string.h>`	`#include <linux/string.h>`
`printf("Hello!\n")`	`printk(KERN_INFO "Hello!\n")`
`malloc(64)`	`kmalloc(64, GFP_KERNEL)`

GCC extensions

The kernel relies heavily on GCC-specific extensions:

static inline void func(void) { ... }        /* inlined function */

asm volatile("rdtsc" : "=a" (l), "=d" (h));  /* inline assembly */

if (unlikely(error)) { ... }   /* tell the compiler this branch is rare */
if (likely(success)) { ... }   /* tell the compiler this is the hot path */

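likely() and unlikely() wrap GCC's __builtin_expect(). They are hints to the compiler, which uses them to lay out code so the expected path is the straight-line one; they do not program the CPU's branch predictor directly. Use them only when you have profiling evidence or strong domain knowledge.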

Constrained environment

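No floating-point: the kernel does not save and restore FPU state for its own code paths, so floating-point use there would clobber a user process's registers
Tiny stack: 8 KB (2 pages) on 32-bit x86 and 16 KB on x86-64, fixed at build time; deep recursion or large on-stack buffers overflow it, which can corrupt adjacent memory or, on kernels with guard pages, fault immediately
No memory protection: a bad pointer doesn't segfault; it triggers a kernel oops (often leading to a full kernel panic)
Concurrency everywhere: kernel code can run on multiple CPUs simultaneously, be preempted at any time, and be interrupted by hardware interrupt handlers. You must reason carefully about every shared data structure.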

Linux Kernel Coding Style

The kernel enforces a specific style (see Documentation/process/coding-style.rst):

Indentation: 1 tab = 8 characters (not spaces, not 4-space tabs)
Naming: snake_case only — never CamelCase (spin_lock, not SpinLock)
Comments: C-style only — /* like this */, never // like this
Line length: 80 columns preferred; break longer lines unless doing so hurts readability
No typedef for structs by default

/*
 * Multi-line comment: always C-style.
 */
struct foo {
        int member1;      /* 1 tab = 8 chars */
        double member2;
};  /* no typedef! */

void my_function(int the_param, char *string,
        int another_long_parameter)
{
        int x = the_param % 42;

        if (!the_param)
                do_stuff();
        switch (x % 3) {
        case 0:
                cool_function();
                break;
        default:
                do_other_stuff();
                break;
        }
}

Writing code that matches the surrounding style is not optional—patches with style violations are rejected during code review.

Key Takeaways

The Linux kernel is huge and fast-moving. 27 million lines, 13,000 patches per release—you cannot read it all. You must use tools.
git is foundational. Every kernel patch, every version, every blame trace flows through git. Master log, blame, diff, checkout.
Building the kernel is a three-step process: configure (.config), compile (make -j), install (make install or .deb/.rpm packages).
Use LXR or cscope to navigate. You will constantly need to ask "who calls this?" and "where is this defined?", and those are exactly the questions cscope was built to answer.
Kernel programming is not user programming. No libc, no FP, tiny stack, no memory protection, mandatory concurrency awareness. These constraints affect every design decision.
Style is not optional. Patches are reviewed by humans; style violations get you ignored or rejected before anyone reads your logic.
