Kernel Data Structures I: Lists, Hash Tables, and Red-Black Trees

Why This Matters

As Linus Torvalds put it: "Bad programmers worry about the code. Good programmers worry about data structures and their relationships." The Linux kernel is a masterclass in this philosophy. Before you can read kernel source comfortably, you must understand a handful of generic data structures that appear everywhere — in the scheduler, the VFS, device drivers, and networking. This module covers the three most common: the linked list, the hash table, and the red-black tree.


1. The Linux Linked List

The Classic Approach vs. the Kernel Approach

In CS 101 you learn to embed prev/next pointers directly in each node type:

struct car {
    struct car *prev;
    struct car *next;
    unsigned int max_speed;
    unsigned int price_in_dollars;
};

This works, but you'd need separate list-manipulation functions for every struct type. The kernel solves this with inversion: instead of putting pointers in your struct, you embed a generic list_head node inside your struct, and all list APIs operate on list_head alone.

struct list_head {          /* include/linux/types.h */
    struct list_head *next, *prev;
};

struct car {
    struct list_head list;  /* embedded list node */
    unsigned int max_speed;
    unsigned int price_in_dollars;
};

Two Key Design Choices

Property Typical doubly linked list Linux list_head
Termination NULL Loops back to HEAD
HEAD node Contains data Sentinel — no data
Traversal end ptr == NULL ptr == HEAD
Empty test HEAD == NULL HEAD->next == HEAD

The sentinel HEAD means the list is never NULL when empty — the head node's prev and next both point back to itself. This eliminates a whole class of null-pointer bugs.

Getting Back to Your Data: container_of

Since APIs only hand you a list_head *, you need to recover the enclosing struct car *. The list_entry macro (an alias for container_of) does this with pointer arithmetic:

struct car *my_car = list_entry(car_list_ptr, struct car, list);

Under the hood:

#define container_of(ptr, type, member) ({          \
    void *__mptr = (void *)(ptr);                   \
    (type *)((char *)__mptr - offsetof(type, member)); })

It subtracts the known offset of list within struct car from the list_head address, giving the start of the parent struct.

Defining a List

Static (compile-time):

struct car my_car = {
    .max_speed = 150,
    .price_in_dollars = 10000,
    .list = LIST_HEAD_INIT(my_car.list),
};
LIST_HEAD(my_car_list);

Dynamic (runtime — most common in the kernel):

struct car *my_car = kmalloc(sizeof(*my_car), GFP_KERNEL);
my_car->max_speed = 150;
my_car->price_in_dollars = 10000;
INIT_LIST_HEAD(&my_car->list);

Core API (all O(1))

list_add(new, head);        /* insert after head (stack-like) */
list_add_tail(new, head);   /* insert before head (queue-like) */
list_del(entry);            /* unlink — memory not freed */
list_move(list, head);      /* move to another list's head */
list_move_tail(list, head); /* move to another list's tail */
list_empty(head);           /* returns non-zero if empty */
list_splice(list, head);    /* merge two lists */

Iteration (O(n))

/* Using a raw list_head cursor */
struct list_head *p;
struct car *current_car;
list_for_each(p, &my_car_list) {
    current_car = list_entry(p, struct car, list);
    printk(KERN_INFO "Price: %u\n", current_car->price_in_dollars);
}

/* Simpler: list_for_each_entry handles the list_entry call */
list_for_each_entry(current_car, &my_car_list, list) {
    printk(KERN_INFO "Price: %u\n", current_car->price_in_dollars);
}

Safe iteration while deleting — you must use the _safe variants, which pre-fetch the next pointer before you can corrupt it:

struct car *current_car, *next;
list_for_each_entry_safe(current_car, next, &my_car_list, list) {
    list_del(&current_car->list);
    kfree(current_car);
}

Without _safe, deleting a node invalidates the iterator and causes a use-after-free bug.

Where Linked Lists Appear in the Kernel


2. The Linux Hash Table

Structure

The kernel provides a fixed-size open chaining (separate chaining) hash table. Key characteristics:

/* include/linux/hashtable.h, types.h */
struct hlist_head {
    struct hlist_node *first;
};

struct hlist_node {
    struct hlist_node *next;
    struct hlist_node **pprev;  /* points to &prev->next */
};

The pprev double-pointer trick lets you delete a node in O(1) without knowing the head, even though the list is singly linked — you modify *pprev directly.

Like list_head, hlist_node is embedded inside your data struct, and container_of recovers the parent.

API

/* Define and initialize a table with 2^bits buckets */
DEFINE_HASHTABLE(my_table, 8);   /* 256 buckets */
hash_init(my_table);

/* Add an object (hash key determines bucket) */
hash_add(my_table, &obj->node, key);

/* Iterate every entry across all buckets */
hash_for_each(my_table, bkt, obj, node);

/* Iterate entries that hash to the same bucket as key */
hash_for_each_possible(my_table, obj, node, key);

/* Remove an entry */
hash_del(&obj->node);

Hash tables are used heavily in networking (connection tracking), device drivers, and filesystems.


3. The Red-Black Tree (rbtree)

Background

A red-black tree is a self-balancing binary search tree. Each node is colored red or black, and the tree enforces invariants that keep all paths from root to leaf within a factor of two of each other. This guarantees O(log N) search, insert, and delete.

Key properties maintained after every modification:

Linux rbtree

/* include/linux/rbtree.h */
struct rb_node {
    unsigned long __rb_parent_color;  /* parent ptr + color packed together */
    struct rb_node *rb_right;
    struct rb_node *rb_left;
};

struct rb_root {
    struct rb_node *rb_node;
};

#define RB_ROOT (struct rb_root) { NULL, }

Color and parent pointer are packed into a single unsigned long using the two low-order bits (always zero in aligned pointers) — a classic kernel space-saving trick.

API

/* Recover enclosing struct */
rb_entry(ptr, type, member);   /* same as container_of */

/* Navigation */
rb_first(root);   rb_last(root);
rb_next(node);    rb_prev(node);

/* Insert (caller supplies a comparison function) */
rb_add(node, tree, less_fn);

/* Search (caller supplies a key comparison function) */
rb_find(key, tree, cmp_fn);

The kernel does not provide a built-in search that knows your key type — you supply a comparator. This is the "toolbox" pattern explained below.

Real-World Usage: The CFS Scheduler

The Completely Fair Scheduler (CFS), Linux's default process scheduler, maintains a red-black tree keyed on each task's vruntime (virtual runtime — how much CPU time the task has consumed). The scheduler always picks the leftmost node (smallest vruntime) as the next task to run, which it can find in O(log N). This is one of the most performance-critical uses of rbtree in the entire kernel.


4. Design Patterns Across Kernel Data Structures

All three structures share the same idioms:

Embedded Nodes, Not External Containers

Each structure (list_head, hlist_node, rb_node) is embedded inside the application data struct, not the other way around. Benefits:

A Toolbox, Not a Complete Solution

None of the three structures provides a built-in "find by key" function. You build your own search loop using the primitives. This keeps the kernel's generic code minimal and forces callers to write exactly the logic they need.

Caller-Managed Locking

The data structures themselves are not thread-safe. The caller is responsible for acquiring the appropriate lock (spinlock, mutex, RCU, etc.) before calling any API. This separation of concerns keeps the generic code fast and simple.


Key Takeaways

Practice

  1. How does the Linux kernel's list_head-based linked list differ from a typical doubly linked list?
  2. What does container_of(ptr, struct car, list) compute?
  3. You are iterating a Linux linked list and want to call list_del and kfree on each element. Which macro must you use for the outer loop?
  4. Explain why the kernel's hash table uses hlist_node **pprev (a pointer to a pointer) in its singly linked collision chain rather than a simple hlist_node *prev.
  5. Which real Linux kernel subsystem uses a red-black tree keyed on per-task vruntime?
  6. A struct needs to be placed simultaneously on two independent linked lists (e.g., a 'free list' and an 'active list'). What is the correct approach using Linux list_head?
  7. The Linux rbtree API does not provide a rb_search(key) function. Why not, and what does the caller do instead?
  8. What is the time complexity of list_add, list_del, and hash_add in the Linux kernel implementations?
  9. Describe a scenario where failing to use list_for_each_entry_safe instead of list_for_each_entry leads to a use-after-free bug. Be specific about which pointer becomes invalid and when.
  10. The kernel's hash table bucket count must always be a power of two. What is the primary reason for this constraint?