Processes

Why This Matters

Your computer runs dozens of programs simultaneously even though it has a limited number of CPU cores. The operating system creates the illusion that each program has its own dedicated CPU and private memory. The abstraction that makes this possible is the process. Understanding how a process is represented and created is foundational to everything else in OS design: scheduling, memory management, file I/O, and security all revolve around the process.

What Is a Process?

A process is a program currently executing in the system. It is the OS's unit of execution and isolation. A process is composed of:

Component	Description
CPU registers	The current values of the program counter, stack pointer, general-purpose registers, etc.
Text section	The compiled program code loaded into memory
Memory segments	Data segment (globals), heap (dynamic allocation), stack (local variables, call frames)
Kernel resources	Open file descriptors, current working directory, signal handlers, etc.

A process is the OS's virtualization of both the processor and memory: each process thinks it owns the CPU and a large, contiguous address space, even though neither is truly the case.

The xv6 User-Space View: fork, exec, wait

From a user program's perspective, three system calls define the life cycle of a process:

`fork()`

pid_t fork(void);

Creates a new process by duplicating the calling process. The child starts as an exact copy of the parent. fork() is unusual: it is called once but returns twice — once in the parent (returning the child's PID) and once in the child (returning 0).
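The two return values are easiest to see in a small host-side program. The sketch below uses the POSIX versions of fork() and waitpid() rather than xv6's (in xv6, exit() and wait() take no arguments); the function name fork_returns_twice is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks once. The child exits with child_code; the parent reaps it and
 * returns the exit code it observed, or -1 on error. */
int fork_returns_twice(int child_code) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;             /* fork failed: no child was created */
    if (pid == 0)
        _exit(child_code);     /* child: fork() returned 0 */

    /* parent: fork() returned the child's PID, which is always > 0 */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both branches of the if run, but in different processes: the child takes the pid == 0 branch and the parent takes the other.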

`exec()`

int exec(const char *path, const char *argv[]);

Replaces the current process image with a new one loaded from path. The process keeps its PID but gets a fresh text segment, stack, and heap. It does not return on success.
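As a rough illustration of "does not return on success", here is a host-side sketch using POSIX execv() in place of xv6's exec(). It assumes a shell exists at /bin/sh; the helper name run_shell_command and the exit code 127 for a failed exec are choices made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `sh -c cmd` in a child process and returns its exit code,
 * or -1 if the child could not be created or reaped. */
int run_shell_command(const char *cmd) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        char *argv[] = { "sh", "-c", (char *)cmd, NULL };
        execv("/bin/sh", argv);   /* assumption: /bin/sh exists */
        _exit(127);               /* reached only if execv failed */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The line after execv() is the only code in the child that can still run after the call, and it runs only when the new image could not be loaded.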

`wait()`

pid_t wait(void);

Blocks the caller until one of its children exits, then returns that child's PID; it returns -1 if the caller has no children. Reaping the child this way frees its process-table slot and prevents zombie processes from accumulating.
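A minimal sketch of reaping, again with POSIX wait() on a host system rather than xv6 itself. The helper spawn_and_reap is hypothetical; it starts n children that exit immediately and counts how many the parent reaps.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts up to n children that exit at once, then reaps every child.
 * Returns the number of children reaped. */
int spawn_and_reap(int n) {
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;             /* could not create more children */
        if (pid == 0)
            _exit(0);          /* child: terminate immediately */
    }

    int reaped = 0;
    while (wait(NULL) > 0)     /* returns -1 once no children remain */
        reaped++;
    return reaped;
}
```

Until the parent calls wait(), each exited child remains a zombie: it no longer runs, but its process-table entry is still occupied.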

Putting It Together: the `init` Process

"init" process
     │
     ├── fork()
     │        └── child
     │               ├── exec("sh", argv)   ← replaces itself with the shell
     │               └── exit()
     └── wait()

The init.c code from xv6 illustrates this pattern:

pid = fork();
if (pid == 0) {           // Child
    exec("sh", argv);
    printf(1, "init: exec sh failed\n");
    exit();
}
wait();                   // Parent waits for shell to exit

Because exec returns only on failure, the child reaches the error print and exit() only if the shell could not be started. The parent calls wait() so it can reap the child and restart the shell if needed.
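The restart behaviour can be sketched as a small supervisor loop. This is a host-side POSIX approximation, not the xv6 source: the names supervise, child_fn, and exit_immediately are invented here, and the loop is bounded by max_starts so the example terminates, whereas an init process would loop forever.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef void (*child_fn)(void);

/* Starts run_child in a new process, waits for it to exit, and starts it
 * again, up to max_starts times. Returns how many times it was started. */
int supervise(child_fn run_child, int max_starts) {
    int starts = 0;
    while (starts < max_starts) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            run_child();       /* stands in for exec("sh", argv) */
            _exit(1);          /* reached only if run_child returns */
        }
        starts++;

        /* Keep reaping until the child we started is the one that exited;
         * an init process may also reap children it did not start. */
        pid_t w;
        while ((w = wait(NULL)) >= 0 && w != pid)
            ;
    }
    return starts;
}

/* Example child body: exits straight away, like a shell that quits. */
void exit_immediately(void) { _exit(0); }
```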

The Process Descriptor: `struct proc`

Every process in xv6 is represented by a struct proc defined in proc.h. Think of it as the OS's "dossier" for a running program — everything the kernel needs to know about and manage a process lives here.

struct proc {
    addr_t sz;                  // Size of process memory (bytes)
    pde_t* pgdir;               // Page table
    char *kstack;               // Bottom of kernel stack for this process
    enum procstate state;       // Process state (UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE)
    int pid;                    // Process ID
    struct proc *parent;        // Parent process
    struct trapframe *tf;       // Trap frame for current syscall
    struct context *context;    // Saved registers for context switch
    void *chan;                 // Sleep channel (non-zero means sleeping on chan)
    int killed;                 // Non-zero if process has been killed
    struct file *ofile[NOFILE]; // Open files
    struct inode *cwd;          // Current working directory
    char name[16];              // Process name (for debugging)
};

Key fields to know:

Field	Role
`pid`	Unique numeric identifier for this process
`pgdir`	Pointer to the page table — defines the process's virtual address space
`kstack`	Each process has its own kernel stack used during system calls and interrupts
`state`	The scheduling state (RUNNABLE, RUNNING, SLEEPING, ZOMBIE, etc.)
`tf`	Trap frame: saves user-space register state when a trap/syscall is taken
`context`	Saved kernel-mode registers used by the scheduler to switch between processes
`ofile`	Array of open file pointers (file descriptors)
`parent`	Pointer to the creating process, needed for `wait()`
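The state field is easier to remember as a small state machine. The sketch below encodes the usual xv6 life cycle as a lookup function; can_transition is a hypothetical helper that does not exist in xv6, and the transition set reflects my reading of allocproc, the scheduler, sleep/wakeup, exit, and wait.

```c
enum procstate { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };

/* Returns 1 if a process may move directly from `from` to `to`. */
int can_transition(enum procstate from, enum procstate to) {
    switch (from) {
    case UNUSED:   return to == EMBRYO;                      /* allocproc() */
    case EMBRYO:   return to == RUNNABLE || to == UNUSED;    /* fork() ok / setup failed */
    case RUNNABLE: return to == RUNNING;                     /* picked by the scheduler */
    case RUNNING:  return to == RUNNABLE                     /* yield */
                       || to == SLEEPING                     /* sleep */
                       || to == ZOMBIE;                      /* exit */
    case SLEEPING: return to == RUNNABLE;                    /* wakeup */
    case ZOMBIE:   return to == UNUSED;                      /* reaped by parent's wait() */
    }
    return 0;
}
```

Note that a sleeping process never jumps straight to RUNNING: it must first become RUNNABLE and then be chosen by the scheduler.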

The Process Table

xv6 uses a fixed-size process table holding at most NPROC (64) entries:

struct {
    struct spinlock lock;
    struct proc proc[NPROC];
} ptable;

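This is simple and predictable but limits the system to 64 concurrent processes. A production OS typically allocates process descriptors dynamically, so the limit depends on available memory and configuration rather than on a compile-time array size.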
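A toy model shows the consequence of the fixed-size table: the 65th allocation fails until some slot is released. The names toy_proc, toy_table, toy_alloc, and toy_free are illustrative only, and locking is omitted.

```c
#define NPROC 64

enum slot_state { SLOT_UNUSED, SLOT_EMBRYO };

struct toy_proc {
    enum slot_state state;
    int pid;
};

static struct toy_proc toy_table[NPROC];   /* zero-initialised: all SLOT_UNUSED */
static int toy_nextpid = 1;

/* Linear scan for a free slot, as in allocproc(). Returns 0 if the table is full. */
struct toy_proc *toy_alloc(void) {
    for (struct toy_proc *p = toy_table; p < &toy_table[NPROC]; p++) {
        if (p->state == SLOT_UNUSED) {
            p->state = SLOT_EMBRYO;
            p->pid = toy_nextpid++;
            return p;
        }
    }
    return 0;
}

/* Marks a slot free again, as wait() does after reaping a zombie. */
void toy_free(struct toy_proc *p) {
    p->state = SLOT_UNUSED;
    p->pid = 0;
}
```

PIDs keep increasing while slots are reused, so a slot index and a PID are not the same thing.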

Process Creation: Inside `fork()`

Step 1 — Allocate a new process slot: `allocproc()`

fork() delegates the low-level setup to allocproc(), which:

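Scans ptable for an UNUSED slot and returns 0 if none is free.
Allocates a kernel stack (kalloc()).
Reserves space for the trap frame at the top of the kernel stack and points p->tf at it.
Pushes the address of syscall_trapret just below the trap frame, where forkret will find it as its return address.
Sets context->rip to forkret, so the first time the scheduler switches to this process it runs forkret, which then returns into syscall_trapret and from there to user space.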

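static struct proc* allocproc(void) {
    struct proc *p;
    char *sp;

    // Find an UNUSED slot; fail if the table is full
    for (p = ptable.proc; p < &ptable.proc[NPROC]; p++)
        if (p->state == UNUSED) goto found;
    return 0;

found:
    // Allocate the kernel stack; sp starts at its top and moves down
    p->kstack = kalloc();
    sp = p->kstack + KSTACKSIZE;

    // Leave room for the trap frame
    sp -= sizeof *p->tf;
    p->tf = (struct trapframe*)sp;

    // Return address that forkret will "return" to
    sp -= sizeof(addr_t);
    *(addr_t*)sp = (addr_t)syscall_trapret;

    // Saved context: the first switch to this process resumes at forkret
    sp -= sizeof *p->context;
    p->context = (struct context*)sp;
    memset(p->context, 0, sizeof *p->context);
    p->context->rip = (addr_t)forkret;
    ...
}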
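The stack carving in allocproc() can be tried outside the kernel. The sketch below reproduces the same pointer arithmetic on a malloc'd buffer that stands in for kalloc(). Everything prefixed toy_ and the function carve_kstack are hypothetical, the register sets are drastically reduced, and the KSTACKSIZE of 4096 is an assumption.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define KSTACKSIZE 4096        /* assumption: one 4 KiB page */
typedef uintptr_t addr_t;

struct toy_trapframe { addr_t rax, rip, rsp; };   /* simplified user registers */
struct toy_context   { addr_t rbx, rbp, rip; };   /* simplified kernel registers; rip last */

struct toy_kstack {
    char *base;                     /* lowest address of the stack */
    struct toy_trapframe *tf;       /* at the very top */
    addr_t *ret;                    /* return address just below the trap frame */
    struct toy_context *context;    /* saved context just below that */
};

/* Lays out trap frame, return address, and context from the top of a fresh
 * stack downward. On allocation failure, base is NULL. */
struct toy_kstack carve_kstack(addr_t trapret, addr_t forkret) {
    struct toy_kstack k = {0};
    k.base = malloc(KSTACKSIZE);    /* stand-in for kalloc() */
    if (k.base == NULL)
        return k;

    char *sp = k.base + KSTACKSIZE;

    sp -= sizeof *k.tf;
    k.tf = (struct toy_trapframe *)sp;

    sp -= sizeof(addr_t);
    k.ret = (addr_t *)sp;
    *k.ret = trapret;

    sp -= sizeof *k.context;
    k.context = (struct toy_context *)sp;
    memset(k.context, 0, sizeof *k.context);
    k.context->rip = forkret;
    return k;
}
```

Because rip is the highest-addressed field of the context, the slot directly above it is the return address: once the context switch has popped the saved registers and jumped to forkret, a plain return from forkret lands in the trap-return code.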

Step 2 — Copy the parent's state

Back in fork():

int fork(void) {
    int i;
    struct proc *np;

    // Allocate a process slot and kernel stack
    if ((np = allocproc()) == 0) return -1;

    np->pgdir = copyuvm(proc->pgdir, proc->sz);  // Copy address space
    np->sz    = proc->sz;
    np->parent = proc;
    *np->tf   = *proc->tf;                        // Copy trap frame (registers)
    np->tf->rax = 0;                              // Child sees fork() return 0

    for (i = 0; i < NOFILE; i++)
        if (proc->ofile[i])
            np->ofile[i] = filedup(proc->ofile[i]); // Dup file references
    ...
}
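One visible effect of duplicating file references instead of copying files is that parent and child share the same open file, including its offset. The host-side POSIX sketch below demonstrates this; it assumes /tmp is writable, and the function name offset_after_child_write is invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Opens a temporary file, lets a child write n bytes through the inherited
 * descriptor, and returns the offset the PARENT then observes (or -1). */
long offset_after_child_write(int n) {
    char path[] = "/tmp/forkdemoXXXXXX";   /* assumption: /tmp is writable */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                          /* file vanishes once closed */

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        for (int i = 0; i < n; i++)
            if (write(fd, "x", 1) != 1)
                _exit(1);
        _exit(0);
    }

    if (waitpid(pid, NULL, 0) != pid) {
        close(fd);
        return -1;
    }
    off_t off = lseek(fd, 0, SEEK_CUR);    /* parent never wrote anything */
    close(fd);
    return (long)off;
}
```

The parent's offset has advanced even though only the child wrote, which is what one would expect if both descriptors refer to a single shared file object whose reference count was incremented at fork time.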

Why Does `fork()` Return Twice?

This is a classic exam question. The mechanism:

Parent: fork() calls allocproc(), which sets up the child and returns the child's struct proc*. The parent returns the child's PID normally through the call stack.
Child: The child is marked RUNNABLE but has not run yet. Its trap frame is a copy of the parent's, except tf->rax is set to 0. When the scheduler eventually runs the child and syscall_trapret restores registers from the trap frame, the child's return value register holds 0. So from the child's point of view, fork() returned 0.

The key insight: the child never executes the body of fork(). Its first instructions run in forkret, which returns into syscall_trapret, and the trap frame restored there makes it look as though fork() returned 0.
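The mechanism can be modelled in a few lines of ordinary C. In this toy model, which is not xv6 code, "returning to user space" just means reading rax out of a process's saved trap frame; the types and functions prefixed toy_ are hypothetical, and the parent receives its return value through its own trap frame, as I understand the xv6 syscall dispatcher to do.

```c
struct toy_tf { long rax, rip, rsp; };   /* simplified saved user registers */

struct toy_uproc {
    int pid;
    struct toy_tf tf;
};

static int toy_next_pid = 2;

/* Kernel side of fork(): copy the trap frame, then patch the child's rax.
 * The return value is what the parent will see. */
int toy_fork(const struct toy_uproc *parent, struct toy_uproc *child) {
    child->pid = toy_next_pid++;
    child->tf = parent->tf;     /* same rip and rsp: resumes at the same spot */
    child->tf.rax = 0;          /* child's view of fork()'s return value */
    return child->pid;
}

/* Syscall dispatch for the parent: the result is stored in its trap frame. */
void toy_syscall_fork(struct toy_uproc *parent, struct toy_uproc *child) {
    parent->tf.rax = toy_fork(parent, child);
}

/* What user code sees in the return-value register after the trap returns. */
long toy_return_to_user(const struct toy_uproc *p) {
    return p->tf.rax;
}
```

Both processes resume at the same user instruction with identical registers except rax, which is why one call appears to return two different values.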

Trapframe vs. Context

Students often confuse these two:

Field	When used	What it saves
`tf` (trapframe)	Entering/exiting the kernel (syscall, interrupt)	User-space register state
`context`	Switching between processes inside the kernel (scheduler)	Kernel-space saved registers (callee-saved + `rip`)

The trapframe is the bridge between user space and kernel space. The context is the bridge between one kernel execution and another.

Key Takeaways

A process = program code + memory state + CPU register state + kernel resources.
xv6 represents each process with struct proc; up to 64 live in ptable.
fork() creates a child by duplicating the parent's address space and trap frame, then sets tf->rax = 0 so the child sees a return value of 0.
fork() "returns twice" because the parent returns normally while the child resumes execution via syscall_trapret with a pre-fabricated trap frame.
The trapframe saves user-space registers on a kernel-boundary crossing; the context saves kernel registers during a scheduler switch — they serve different purposes.
exec() replaces the process image; wait() reaps a terminated child.
