CPU Scheduling
Why This Matters
Every time you switch between a terminal, a browser, and a video player, the OS scheduler is making rapid-fire decisions about which process gets the CPU and for how long. Without a scheduler, one runaway program would freeze every other process. Understanding scheduling gives you insight into system responsiveness, throughput, and fairness—and it explains subtle behaviors like why a text editor feels snappy even while a compile job burns all your CPU cores.
Process States in xv6
Before diving into the scheduler loop itself, you need to know what states a process can be in. In proc.h:
enum procstate { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };
| State | Meaning |
|---|---|
UNUSED |
Slot in the process table is free |
EMBRYO |
Being initialized by allocproc() |
SLEEPING |
Waiting on an event (blocking I/O, sleep()) |
RUNNABLE |
Ready to run, waiting for the CPU |
RUNNING |
Currently executing on a CPU |
ZOMBIE |
Exited but not yet reaped by its parent |
Key legal transitions:
- EMBRYO → RUNNABLE: after
fork()finishes setup - RUNNABLE → RUNNING: scheduler picks the process
- RUNNING → RUNNABLE:
yield()called (e.g., on timer interrupt) - RUNNING → SLEEPING: process calls
sleep() - SLEEPING → RUNNABLE: some other process calls
wakeup() - RUNNING → ZOMBIE: process calls
exit()
There is no direct path from EMBRYO to RUNNING—a new process must pass through RUNNABLE first.
The xv6 Scheduler Loop
The scheduler is the last thing called during kernel boot:
// main.c
mpmain() --> scheduler()
Inside scheduler():
for(p = ptable.proc; p < &ptable.proc[NPROC]; p++){
if(p->state != RUNNABLE) continue;
proc = p;
switchuvm(p); // load p's page table
p->state = RUNNING;
swtch(&cpu->scheduler, p->context); // context switch
switchkvm(); // restore kernel page table
...
}
This is a round-robin over the process table: scan from the beginning, pick the first RUNNABLE process, run it, then loop back. There is no priority and no time-slice enforcement beyond what the timer interrupt provides.
Context Switching: swtch()
The actual register save/restore is done by swtch() in assembly (swtch.S). It:
- Saves the current set of callee-saved registers onto the current stack and records the stack pointer in
*old(astruct context *). - Loads the stack pointer from
*newand pops the saved registers, resuming wherever that context last left off.
// Scheduler → process:
swtch(&cpu->scheduler, p->context);
// Process → scheduler (via sched()):
swtch(&proc->context, cpu->scheduler);
struct context holds only the kernel-mode registers needed to resume execution—it is not the user-space register set (that lives in struct trapframe).
Where does a new process "return"?
When fork() → allocproc() sets up the new process's kernel stack, it arranges the fake context so that swtch() will return into forkret(), which then calls trapret, which returns to user space. This is the clever trick that lets swtch() "resume" a brand-new process.
Yielding Back to the Scheduler
A running process returns the CPU in two situations:
1. Timer interrupt (preemption)
// trap.c
if(proc && proc->state == RUNNING && tf->trapno == T_IRQ0+IRQ_TIMER)
yield();
2. Blocking on I/O (voluntary)
void sleep(void *chan, struct spinlock *lk) {
proc->chan = chan;
proc->state = SLEEPING;
sched();
...
}
yield() and sleep() both call sched(), which calls swtch() back to the scheduler:
void yield(void) {
acquire(&ptable.lock);
proc->state = RUNNABLE;
sched();
release(&ptable.lock);
}
void sched(void) {
swtch(&proc->context, cpu->scheduler);
}
wakeup() scans the process table and moves any process sleeping on a matching channel back to RUNNABLE:
static void wakeup1(void *chan) {
for(p = ptable.proc; p < &ptable.proc[NPROC]; p++)
if(p->state == SLEEPING && p->chan == chan)
p->state = RUNNABLE;
}
Cooperative vs. Preemptive Multitasking
| Type | How the CPU is released | Examples |
|---|---|---|
| Cooperative | Process voluntarily yields | Windows 3.1, early macOS, Go runtime goroutines |
| Preemptive | OS forcibly interrupts via timer | Linux, Windows NT+, xv6 |
xv6 is preemptive: the timer interrupt fires periodically and can yank the CPU away from any running process. This guarantees no single process can starve others by looping forever.
I/O-Bound vs. CPU-Bound Processes
| Characteristic | I/O-bound | CPU-bound |
|---|---|---|
| Typical workload | Editor, shell, database query | Video encoder, scientific simulation |
| CPU usage | Short bursts | Long continuous runs |
| What matters | Low latency / quick response | High throughput / cache warmth |
A well-designed scheduler should give I/O-bound processes priority when they wake up (so the editor feels snappy), while letting CPU-bound jobs run for longer slices (so caches stay warm).
Scheduling Policies
Round-Robin (xv6)
Simple: give each RUNNABLE process a turn in table order. No priorities. Preemption only via timer interrupt.
Priority-Based Scheduling
Assign each process a numeric priority; higher-priority processes run before lower-priority ones.
Linux nice values (standard processes):
- Range: −20 (highest priority) to +19 (lowest priority); default is 0.
nice -n 5 vim— start vim at lower prioritysudo renice -n -5 -p $(pidof vim)— boost a running process
Linux real-time priority:
- Range: 0–99; higher value = higher priority.
- Real-time processes always preempt standard (nice) processes.
Problem with strict priority scheduling: Starvation. A low-priority process may never run if higher-priority processes are always ready.
Time Slices
The time slice is the maximum uninterrupted CPU time a process gets before being preempted.
- Too long → poor interactive performance (editor stutters)
- Too short → excessive context-switch overhead
Choosing the right time slice is a classical OS design tension.
Linux Completely Fair Scheduler (CFS)
CFS was the default Linux scheduler from kernel 2.6.23 through 6.6. Its goal: equal CPU share across all runnable processes of the same priority.
Virtual runtime (vruntime): each process accumulates CPU time as vruntime. CFS always picks the process with the smallest vruntime—i.e., the one that has received the least CPU time so far—and runs it next. This naturally handles the text-editor vs. video-encoder scenario:
- The editor sleeps most of the time, so its
vruntimestays low. - When it wakes up, it immediately has the smallest
vruntimeand preempts the encoder. - The encoder gets to run when the editor goes back to sleep.
No explicit priority assignment is needed for this behavior to emerge.
Key Takeaways
- xv6 process states form a strict state machine; not every transition is legal (e.g., EMBRYO → RUNNING is impossible).
swtch()in assembly is the only place registers are saved and restored during a context switch; the scheduler andsched()just arrange when to call it.- xv6 uses a simple round-robin scheduler with timer-based preemption—simple but fair.
- Cooperative multitasking relies on processes being good citizens; preemptive multitasking (xv6, Linux) enforces fairness via hardware timer interrupts.
- Linux priority has two orthogonal axes: nice (−20 to +19, default 0) and real-time (0–99); real-time always wins.
- Strict priority scheduling can starve low-priority processes; CFS fixes this with virtual runtime, giving each process a proportional share without starvation.