Memory Allocation, Advanced Data Structures, and Kernel Modules

Why This Matters

Kernel code runs without the safety net of a user-space allocator or runtime library. You cannot call malloc; you choose between two fundamentally different allocators depending on whether the hardware demands physically contiguous memory. On top of that, the kernel maintains its own high-performance data structures — radix trees, XArrays, and bitmaps — that power the page cache, IRQ tracking, and CPU hot-plug. Finally, recompiling and rebooting the entire kernel every time you change a driver is impractical, so the kernel module system lets you load and unload code while the system stays running. All three topics come together whenever you write a real driver or file-system component.


1. Memory Allocation in the Kernel

kmalloc vs vmalloc

The kernel provides two primary allocation APIs:

kmalloc(size, gfp_mask) vmalloc(size)
Physical layout Physically and virtually contiguous Virtually contiguous only
Max size ~4 MB on x86 (arch-dependent) Limited only by free RAM
Allocation unit Byte-granular (slab allocator) Page (4 KB)
Can sleep? Depends on gfp_mask Always may sleep
Use case DMA, memory-mapped I/O, small objects Large kernel buffers not needing DMA
Header <linux/slab.h> <linux/vmalloc.h> (also <linux/slab.h> works)
Free kfree(ptr) vfree(ptr)

Physical contiguity is essential for hardware that accesses memory by physical address, such as DMA controllers and memory-mapped I/O regions. If your code doesn't have that constraint, vmalloc is fine and avoids the size limit.

The gfp_mask

The second argument to kmalloc is a Get Free Pages mask that controls:

Two flags you must know:

#include <linux/slab.h>

char *buf = (char *)kmalloc(128, GFP_KERNEL);   /* may sleep */
my_struct *s = (my_struct *)kmalloc(sizeof(*s), GFP_ATOMIC); /* no sleep */
kfree(buf);
kfree(s);

2. Advanced Kernel Data Structures

2.1 Radix Tree (Trie)

A radix tree (or compact prefix tree) maps long-integer keys to pointer values. All descendants of a node share a key prefix, and values live only at leaves.

Linux's radix tree is tuned for page-cache lookups:

Since kernel 4.19, radix_tree_root is an alias for xarray (the successor API), but the radix tree API still works in older kernels.

Core API:

/* Declare or initialize */
RADIX_TREE(my_tree, GFP_KERNEL);
/* or at runtime: */
struct radix_tree_root my_tree;
INIT_RADIX_TREE(my_tree, GFP_KERNEL);

/* Insert / delete / lookup */
int  radix_tree_insert(struct radix_tree_root *root, unsigned long index, void *item);
void *radix_tree_delete(struct radix_tree_root *root, unsigned long index);
void *radix_tree_lookup(const struct radix_tree_root *root, unsigned long index);

/* Tag management */
void *radix_tree_tag_set  (struct radix_tree_root *root, unsigned long index, unsigned int tag);
void *radix_tree_tag_clear(struct radix_tree_root *root, unsigned long index, unsigned int tag);
int   radix_tree_tag_get  (const struct radix_tree_root *root, unsigned long index, unsigned int tag);

Real-world use — the page cache:

struct address_space {          /* page cache of a file */
    struct inode *host;
    struct radix_tree_root page_tree;   /* older kernels */
    /* struct xarray i_pages; */        /* newer kernels */
};

Every file's pages are indexed by page offset. When the VFS needs page N, it looks it up in the radix tree; on a miss it reads from disk and inserts the page.

2.2 XArray

XArray (merged in 4.19) is the modern replacement for the radix tree. It wraps the same radix-tree mechanics with a cleaner API and embeds a spinlock so you don't have to manage locking separately.

Key properties:

#include <linux/xarray.h>

DEFINE_XARRAY(my_xa);           /* static init */
/* or: */ xa_init(&my_xa);      /* runtime init */

xa_store(&my_xa, index, ptr, GFP_KERNEL);   /* store */
void *val = xa_load(&my_xa, index);         /* load */
xa_erase(&my_xa, index);                    /* delete */

xa_set_tag(&my_xa, index, tag);
xa_clear_tag(&my_xa, index, tag);
bool present = xa_get_tag(&my_xa, index, tag);

/* iteration */
xa_for_each(&my_xa, index, entry) { /* ... */ }
xa_for_each_marked(&my_xa, index, entry, filter) { /* ... */ }

In Linux 6.8 the page cache uses XArray directly:

struct address_space {
    struct inode *host;
    struct xarray i_pages;   /* xarray = radix tree + spinlock */
};

2.3 Data-Structure Comparison

Structure Ordered? Key type Best for
Red-black tree Yes (sorted) Any comparable Sortable key/value pairs
Radix tree / XArray By key value unsigned long Sparse arrays, page cache
Hash table No Hashable Fast point lookup, no ordering needed

2.4 Bitmap

A bitmap is a flat array of bits packed into one or more unsigned long words.

Kernel uses bitmaps for:

#include <linux/bitmap.h>

DECLARE_BITMAP(my_bits, 256);   /* 256-bit bitmap */

set_bit(nr, my_bits);
clear_bit(nr, my_bits);
change_bit(nr, my_bits);

bitmap_zero(my_bits, 256);
bitmap_fill(my_bits, 256);

unsigned long pos = find_first_bit(my_bits, 256);
unsigned long pos = find_first_zero_bit(my_bits, 256);

/* Iteration macros */
for_each_set_bit(bit, my_bits, 256) { /* bit is set */ }
for_each_clear_bit(bit, my_bits, 256) { /* bit is clear */ }

3. Kernel Modules

What Is a Kernel Module?

A kernel module is a chunk of kernel code (a .ko file) that can be dynamically loaded into and unloaded from a running kernel — no reboot required. Support appeared in Linux 1.2 (1995). Many subsystems (file systems, drivers, network protocols) ship as optional modules controlled by the .config flag =m.

CONFIG_KVM_GUEST=y   # built into vmlinux
CONFIG_XFS_FS=m      # compiled as xfs.ko

Why Use Modules?

Writing a Module

Every module needs at minimum:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>

static int __init lkp_init(void)
{
    printk(KERN_INFO "Module loaded\n");
    return 0;
}

static void __exit lkp_exit(void)
{
    printk(KERN_INFO "Module unloading\n");
}

module_init(lkp_init);
module_exit(lkp_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Xiaoguang Wang <xgwang9@uic.edu>");
MODULE_DESCRIPTION("A simple kernel module");

Use static for all symbols not intentionally exported. Use EXPORT_SYMBOL(name) to expose a function or variable to other modules. All exported kernel symbols are visible in /proc/kallsyms.

Building a Module

Modules are built against the running kernel's headers using the kernel's own build system:

obj-m := lkp.o
CONFIG_MODULE_SIG=n
KDIR  := /lib/modules/$(shell uname -r)/build
PWD   := $(shell pwd)

all: lkp.c
	make -C $(KDIR) M=$(PWD) modules

clean:
	make -C $(KDIR) M=$(PWD) clean

Result: lkp.ko. A module compiled against kernel version X will not load on version Y (the vermagic field enforces this).

Loading and Unloading

Tool Load Unload Notes
insmod sudo insmod file.ko Low-level; no dep resolution
rmmod sudo rmmod file Calls exit function
modprobe sudo modprobe <name> sudo modprobe -r <name> Resolves dependencies via modules.dep

modprobe looks for modules installed under /lib/modules/<kver>/ (put there by make modules_install). It auto-loads dependencies listed in modules.dep. Modules can be loaded at boot via /etc/modules or files under /etc/modprobe.d/.

Module Parameters

Pass arguments at load time with module_param:

static int int_param = 42;
static char *string_param = "default";

module_param(int_param, int, 0);
MODULE_PARM_DESC(int_param, "A sample integer parameter");

module_param(string_param, charp, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
MODULE_PARM_DESC(string_param, "A string parameter");

Load with: sudo insmod lkp.ko int_param=12 string_param="hello"

Parameters are also visible (and writable, if permissions allow) under /sys/module/<name>/parameters/.

Inspecting Modules

modinfo lkp.ko        # show metadata: author, license, vermagic, params
lsmod                 # list currently loaded modules

Key Takeaways

  1. kmalloc gives physically contiguous memory (needed for DMA/MMIO) up to ~4 MB; use GFP_KERNEL in process context and GFP_ATOMIC in interrupt/spinlock context.
  2. vmalloc gives large, virtually contiguous allocations with no DMA guarantee; it always may sleep and allocates in page units.
  3. Radix trees map unsigned long keys to pointers with O(log n) operations and tag bits for metadata; they power the Linux page cache.
  4. XArray is the modern, lock-embedded successor to the radix tree with a cleaner API (kernel ≥ 4.19).
  5. Bitmaps are compact bit arrays used for CPU masks, IRQ sets, and free-block tracking.
  6. Kernel modules are dynamically loadable .ko files; insmod/rmmod handle a single module while modprobe resolves dependencies automatically.
  7. A module is compiled against a specific kernel version and will not load on a different one.
  8. Use static for all non-exported symbols to avoid namespace pollution; use EXPORT_SYMBOL deliberately.

Practice

  1. Which of the following is the PRIMARY reason to prefer kmalloc over vmalloc when writing a DMA driver?
  2. An interrupt handler needs to allocate a small buffer. Which gfp_mask should it use with kmalloc?
  3. In the Linux radix tree implementation, how are internal (non-leaf) nodes structured?
  4. Which statement best describes the relationship between XArray and the Linux radix tree?
  5. Explain why the Linux page cache uses a radix tree (or XArray) rather than a hash table to store cached pages of a file.
  6. You call sudo insmod mydriver.ko on kernel 5.15 to load a module compiled against kernel 5.10. What happens?
  7. What is the key operational difference between insmod and modprobe when loading a kernel module?
  8. A teammate writes a kernel module and names every helper function without a prefix (e.g., init_device, cleanup, write_reg). Why is this a problem, and how should it be fixed?
  9. Given the following bitmap usage, what does for_each_set_bit(bit, cpu_mask, NR_CPUS) iterate over?
  10. You are writing a kernel module that accepts a file path as a parameter. Show the minimal module_param declaration for a char * parameter named target_path, with permissions that allow the owner to read and write it but others only to read it. Then explain what MODULE_PARM_DESC is for.