xv6 Comprehensive Review

Why This Matters

xv6 is the teaching OS you have been living inside all semester. Every mechanism you touched — fork, exec, page faults, context switches, file writes — passes through code you can read in its entirety. This review ties those mechanisms into a single coherent picture so you can reason about how the pieces interact, not just recall isolated facts.

1. OS Architecture Recap

xv6 uses a monolithic kernel: all OS services (memory management, file system, device drivers) share one kernel address space. This contrasts with a microkernel, where each service lives in its own protected process and communicates via IPC.

Monolithic kernels are faster (no IPC overhead) but have a larger trusted computing base — a bug anywhere in the kernel can compromise the whole system.

Isolation mechanisms

Each process has its own page table; the kernel maps itself into the upper portion of every process's address space (the kernel half).
User code runs in ring 3 (unprivileged); kernel code runs in ring 0.
Transitions between user and kernel happen only through controlled entry points (traps).

2. Processes and the Process Lifecycle

`struct proc`

Every process is described by struct proc in proc.c. Key fields:

Field	Purpose
`pgdir`	Pointer to the process's page table
`kstack`	Per-process kernel stack (used during syscalls/traps)
`state`	`UNUSED`, `EMBRYO`, `SLEEPING`, `RUNNABLE`, `RUNNING`, `ZOMBIE`
`ofile[]`	Open file descriptors
`cwd`	Current working directory inode
`tf`	Saved `trapframe` (registers at last trap)
`context`	Saved `context` (registers for `swtch`)
`chan`	Sleep channel (what the process is waiting on)
`killed`	Set to 1 when a signal should terminate the process

Why does kill 2 2 not immediately kill the xv6 shell? kill() only sets p->killed = 1. The shell only checks that flag when it returns from a trap (system call or interrupt). If the shell is blocked in read(), it won't check killed until the next trap fires.

Process creation: `fork()`

allocproc() — allocate a struct proc, set up the kernel stack with a fake return address pointing at forkret.
Copy the parent's page table and all mapped pages into the child.
Copy the parent's trapframe so the child returns from fork() at the same instruction — but with return value 0 (child) vs. child PID (parent).
Copy open file descriptors and cwd.
Set state to RUNNABLE; the scheduler will pick it up.

First process: `userinit()` and `initcode`

The very first user process cannot be created by fork() — there is no parent yet. Instead:

main() calls userinit().
userinit() calls allocproc(), then builds a minimal page table with setupkvm() + inituvm().
The page table is loaded with initcode — a tiny hand-assembled binary (compiled from initcode.S) that is embedded directly in the kernel binary.
initcode.S executes exec("/init", argv) as its very first action, replacing itself with the real init program.

# initcode.S (simplified)
start:
  mov  $init, %rdi      # argv[0] = "/init"
  mov  $argv, %rsi
  mov  $SYS_exec, %rax
  syscall

Zombie processes and resource reclaim

When a process calls exit(), it transitions to ZOMBIE — it retains its struct proc so the parent can read the exit status.
The parent calls wait() to collect the status and call freeproc().
If a parent exits before its children, init (PID 1) is reparented to be their new parent, preventing orphaned zombies.

3. Scheduling and Context Switching

Round-robin scheduler

xv6 runs scheduler() in an infinite loop on each CPU core (called from mpmain() at boot). The loop scans ptable for the next RUNNABLE process and switches to it:

scheduler() loop:
  for each proc p in ptable:
    if p->state == RUNNABLE:
      switchuvm(p)        // install p's page table + TSS
      swtch(&cpu->scheduler, p->context)  // jump to p

switchuvm(p) installs the process's page table into CR3 and updates the TSS so that hardware knows where the kernel stack is on the next trap.

`swtch()` — saving and restoring context

swtch is written in assembly (swtch.S). It saves the callee-saved registers (%rbx, %rbp, %r12–%r15, and %rip implicitly via push/ret) to the old context struct and restores them from the new context struct.

These registers are saved because the C calling convention only guarantees that callee-saved registers survive a function call — the scheduler must preserve whatever the kernel was doing before the switch.

swtch is called in three situations:

Voluntary yield — a process calls sleep() or yield(), which calls sched() → swtch back to the scheduler.
Timer interrupt — the timer handler calls yield() → sched() → swtch.
Exit — exit() calls sched() → swtch to switch away permanently.

CFS fairness (Linux contrast)

Linux's Completely Fair Scheduler tracks vruntime (virtual runtime weighted by priority) for each task. It picks the task with the smallest vruntime. CPU-bound tasks accumulate vruntime fast; I/O-bound tasks (mostly sleeping) accumulate it slowly and are prioritized when they wake up — achieving fairness without starvation.

4. Traps, Interrupts, and Device Drivers

A trap is any CPU-forced transfer of control into the kernel. xv6 handles three kinds:

Path 1 — System calls

User syscall instruction
  → trapasm.S: syscall_entry  (saves trapframe)
  → proc.c / syscall.c: syscall()
  → individual handler: sys_read(), sys_write(), …
  → iret back to user

The syscall entry point address is written into the MSR_LSTAR model-specific register at boot.

Path 2 — Hardware interrupts (e.g., timer, keyboard)

Hardware asserts interrupt line
  → APIC routes to CPU
  → vector.S → trapasm.S: alltraps  (saves trapframe, calls trap())
  → trap.c: trap(struct trapframe *tf)
    • timer: ticks++; wakeup(&ticks); yield()
    • keyboard: kbdintr()
  → iret back to interrupted code

Path 3 — Exceptions (e.g., page fault)

Handled similarly to hardware interrupts via alltraps → trap(). The trap number (T_PGFLT = 14) tells the handler what happened.

In hw3 (Copy-on-Write), you used page faults productively:

fork() marks shared pages read-only instead of copying them.
On the first write, the CPU raises T_PGFLT.
copyonwrite() allocates a new page, copies the content, and remaps it writable.

5. Kernel Synchronization

Why locks are needed

With multiple CPUs and preemptible kernel code, two threads of execution can interleave reads and writes to shared data structures (e.g., ptable, the file table). Locks enforce mutual exclusion.

xv6 lock types

Lock	Use case	Behavior while waiting
`spinlock`	Short critical sections, interrupt handlers	Spins (busy-wait)
`sleeplock`	Long operations (disk I/O)	Calls `sleep()`, yields CPU

When to use a lock — checklist

Ask: "Is this data structure accessed by more than one CPU or interrupt handler concurrently?"

If yes: protect reads and writes.
Hold the lock for the shortest possible time.
Never call sleep() while holding a spinlock (deadlock via interrupt re-entry).

Deadlock

Deadlock occurs when two threads each hold a lock the other needs. xv6 avoids it by enforcing a global lock ordering (e.g., always acquire ptable.lock before p->lock).

Advanced lock types (Linux)

rwlock — multiple concurrent readers; writers get exclusive access. Good when reads dominate.
seqlock — writers never block; readers retry if they observe an odd sequence number (write in progress). Good for frequently-read, rarely-written data (e.g., jiffies).

6. File System

Disk layout

| boot | superblock | log | inodes | free bitmap | data blocks |

Superblock — stores filesystem metadata: total blocks, inode count, log start, etc.
dinode — on-disk inode: file type, size, link count, 12 direct block pointers + 1 indirect pointer.

Inodes (in-memory `struct inode`)

An inode is the kernel's handle to a file. iget(dev, inum) looks up or allocates a cached inode. Inodes are reference-counted; iput() drops a reference and, when the count reaches zero, frees the inode.

Directories and path resolution

A directory is just a file whose data blocks hold dirent records:

struct dirent {
  ushort inum;      // inode number (0 = free slot)
  char   name[14];
};

Path resolution (namei) walks the directory tree, calling dirlookup at each component.

Virtual File System (VFS)

VFS is an abstraction layer that makes multiple concrete filesystem types (ext4, FAT, tmpfs, …) look identical to the kernel. Operations like read, write, open dispatch through function pointers in the inode operations table. xv6's design is a simplified version of this concept.

Buffer cache (`bio.c`)

The buffer cache keeps recently-used disk blocks in memory to avoid slow disk I/O.

Function	Role
`bget(dev, blockno)`	Find or allocate a cache slot (LRU replacement)
`bread(dev, blockno)`	`bget` + read from disk if not cached
`bwrite(buf)`	Write block to disk

Eviction uses LRU (least recently used): the least-recently-accessed buffer is recycled when the cache is full.

Journaling and crash recovery

xv6 uses a write-ahead log (journal) to provide atomic multi-block updates:

begin_op() — start a transaction.
log_write(buf) — mark a buffer as part of the current transaction (dirty, logged).
end_op() — when all operations in the transaction complete, call commit(): a. Write all dirty blocks to the log region on disk. b. Write the log header (commit record). c. Install — copy log blocks to their real disk locations. d. Clear the log header.

If the system crashes between steps (a) and (c), recovery replays the log on the next boot. If it crashes before the commit record is written, the partial transaction is simply ignored — the filesystem remains consistent.

Key Takeaways

struct proc is the central data structure — everything about a process lives here.
userinit + initcode bootstrap the first user process by embedding a tiny binary directly in the kernel.
swtch performs context switching by saving/restoring callee-saved registers; it is called from sched() in yield, sleep, and exit paths.
Three trap paths: system call (via MSR entry), hardware interrupt (via APIC + vector table), and exception (e.g., page fault) — all funnel through alltraps / trap().
Spinlocks are for short, interrupt-safe critical sections; sleeplocks are for long I/O waits.
The filesystem layers build on each other: buffer cache → inode → directory → VFS.
Journaling (write-ahead log) guarantees crash consistency by committing all-or-nothing multi-block updates.

xv6 Comprehensive Review

Why This Matters

1. OS Architecture Recap

Isolation mechanisms

2. Processes and the Process Lifecycle

`struct proc`

Process creation: `fork()`

First process: `userinit()` and `initcode`

Zombie processes and resource reclaim

3. Scheduling and Context Switching

Round-robin scheduler

`swtch()` — saving and restoring context

CFS fairness (Linux contrast)

4. Traps, Interrupts, and Device Drivers

Path 1 — System calls

Path 2 — Hardware interrupts (e.g., timer, keyboard)

Path 3 — Exceptions (e.g., page fault)

5. Kernel Synchronization

Why locks are needed

xv6 lock types

When to use a lock — checklist

Deadlock

Advanced lock types (Linux)

6. File System

Disk layout

Inodes (in-memory `struct inode`)

Directories and path resolution

Virtual File System (VFS)

Buffer cache (`bio.c`)

Journaling and crash recovery

Key Takeaways

Practice

Model answer

Model answer

Model answer

Model answer

Results

xv6 Comprehensive Review

Why This Matters

1. OS Architecture Recap

Isolation mechanisms

2. Processes and the Process Lifecycle

struct proc

Process creation: fork()

First process: userinit() and initcode

Zombie processes and resource reclaim

3. Scheduling and Context Switching

Round-robin scheduler

swtch() — saving and restoring context

CFS fairness (Linux contrast)

4. Traps, Interrupts, and Device Drivers

Path 1 — System calls

Path 2 — Hardware interrupts (e.g., timer, keyboard)

Path 3 — Exceptions (e.g., page fault)

5. Kernel Synchronization

Why locks are needed

xv6 lock types

When to use a lock — checklist

Deadlock

Advanced lock types (Linux)

6. File System

Disk layout

Inodes (in-memory struct inode)

Directories and path resolution

Virtual File System (VFS)

Buffer cache (bio.c)

Journaling and crash recovery

Key Takeaways

Practice

Model answer

Model answer

Model answer

Model answer

Results

`struct proc`

Process creation: `fork()`

First process: `userinit()` and `initcode`

`swtch()` — saving and restoring context

Inodes (in-memory `struct inode`)

Buffer cache (`bio.c`)