Virtual Memory I: Address Translation and Paging

Why This Matters

Every program you run believes it owns the entire address space — it can read from address 0x1000 without knowing whether another process is using that same address for something completely different. This illusion is virtual memory, and it is one of the most important abstractions an OS provides. It gives each process isolation, enables programs larger than physical RAM, and lets the kernel control exactly what memory each process can touch.

Three Kinds of Addresses

Understanding virtual memory starts with distinguishing three kinds of addresses on x86:

Term	What it is
Logical address	The raw address a CPU instruction generates (segment selector + offset)
Linear (virtual) address	After segment translation; what the paging hardware sees
Physical address	The actual address on the memory bus / in DRAM

Logical address
   (seg:offset)
       │
       │  segmentation (GDT)
       ▼
Linear / virtual address
       │
       │  paging (page tables)
       ▼
Physical address

In the 64-bit xv6 we study, segmentation is essentially a no-op (base = 0), so the offset part of a logical address and the resulting linear (virtual) address are the same value.

The Core Idea: Indirection via the MMU

Without virtual memory, software writes directly to physical addresses. The problem: two programs can collide, one bug can corrupt the kernel, and the OS has no way to enforce isolation.

The solution is a level of indirection:

Software only ever sees virtual addresses (VA).
The Memory Management Unit (MMU) — hardware inside the CPU — translates every VA into a physical address (PA) on every memory access.
The MMU consults a page table maintained by the kernel. Only the kernel can change that table.

This means a process literally cannot access memory the kernel hasn't mapped for it. Any attempt raises a hardware exception (page fault) that the OS handles.

Pages: The Unit of Translation

x86 divides both virtual and physical memory into fixed-size chunks called pages. The default page size is 4 KB (4096 bytes = 2¹²).

Because addresses are translated at page granularity:

The low 12 bits of a virtual address are the page offset — they pass through to the physical address unchanged.
The upper bits are the virtual page number (VPN), which the page table maps to a physical frame number (PFN).

Example — virtual address 0x1013:

0x1013 = 0001 0000 0001 0011
         ^^^^ ^^^^^^^^^^^^^^
         │    offset = 0x013 (low 12 bits)
         VPN = 0x1 (upper bits)
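The split above can be sketched in C. This is a minimal illustration assuming 32-bit addresses and 4 KB pages; the names PAGE_SHIFT, va_vpn, va_offset, and make_pa are hypothetical helpers chosen for this example, not taken from any particular kernel.

```c
#include <stdint.h>

#define PAGE_SHIFT 12u                  /* 4 KB pages: 2^12 bytes */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define PAGE_MASK  (PAGE_SIZE - 1u)     /* 0xFFF: selects the offset bits */

/* Virtual page number: everything above the low 12 bits. */
static inline uint32_t va_vpn(uint32_t va)
{
    return va >> PAGE_SHIFT;
}

/* Page offset: the low 12 bits, copied unchanged into the physical address. */
static inline uint32_t va_offset(uint32_t va)
{
    return va & PAGE_MASK;
}

/* Rebuild a physical address from a frame number and an offset. */
static inline uint32_t make_pa(uint32_t pfn, uint32_t offset)
{
    return (pfn << PAGE_SHIFT) | offset;
}
```

For 0x1013, va_vpn gives 0x1 and va_offset gives 0x013. If the page table happened to map VPN 0x1 to frame 0x80 (a made-up value), make_pa(0x80, 0x013) would give 0x80013.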

Naive Approach: A Flat Page Table

The simplest design is one big array of page table entries (PTEs):

#define GET_PTE(va) (&ptes[(va) >> 12])   // shift off the 12-bit offset, index the flat array

For 32-bit addresses with 4 KB pages:

20-bit VPN → 2²⁰ = 1 million entries
4 bytes per PTE → 4 MB per process

With 100 processes, that is 400 MB just for page tables — and most of it would be empty (processes don't use the full 4 GB). We need a smarter structure.
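The size arithmetic above can be checked with a small C sketch. It assumes 32-bit addresses, 4 KB pages, and 4-byte entries as stated; pte_t, flat_get_pte, and flat_table_bytes are hypothetical names used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT   12u
#define VA_BITS      32u
#define FLAT_ENTRIES (1u << (VA_BITS - PAGE_SHIFT))   /* 2^20 entries */

typedef uint32_t pte_t;                               /* one 4-byte entry */

/* Return a pointer to the entry that describes the page containing va. */
static inline pte_t *flat_get_pte(pte_t *ptes, uint32_t va)
{
    return &ptes[va >> PAGE_SHIFT];
}

/* Memory needed for one process's flat table, in bytes. */
static inline size_t flat_table_bytes(void)
{
    return (size_t)FLAT_ENTRIES * sizeof(pte_t);
}
```

flat_table_bytes() evaluates to 4 MB no matter how little of the address space the process actually uses, which is the cost the two-level design avoids.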

Two-Level Paging (x86-32)

x86-32 solves the size problem by splitting the 20-bit VPN into two 10-bit halves:

31      22 21      12 11       0
+----------+----------+---------+
| Dir idx  | Table idx|  Offset |
| (10 bits)| (10 bits)|(12 bits)|
+----------+----------+---------+

Page Directory (PD): A single 1024-entry table. Each entry points to a Page Table (PT), which is allocated only if the 4 MB region it covers (1024 pages × 4 KB) is in use.

Page Table (PT): Also 1024 entries. Each entry maps one 4 KB page.

Translation takes two table lookups, after which the offset is appended:

Take bits [31:22] → index into the Page Directory → get the address of a Page Table.
Take bits [21:12] → index into that Page Table → get the physical frame number.
Append the 12-bit offset → physical address.
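The three steps above can be modeled in software. The sketch below is a simplified simulation, not how the hardware or xv6 stores its tables: it assumes directory entries are plain C pointers (real entries hold physical addresses plus flag bits), and the names page_dir_t, page_table_t, PTE_PRESENT, and walk are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRIES     1024u
#define PTE_PRESENT 0x1u      /* assumed flag: bit 0 marks a valid entry */

/* Each entry holds (frame number << 12) | flags. */
typedef struct { uint32_t pte[ENTRIES]; } page_table_t;

/* Simplification: NULL means "no page table allocated for this 4 MB region". */
typedef struct { page_table_t *pt[ENTRIES]; } page_dir_t;

/* Translate va. Returns 0 and stores the result in *pa on success,
 * or -1 to model a page fault. */
static int walk(const page_dir_t *pd, uint32_t va, uint32_t *pa)
{
    uint32_t dir_idx   = (va >> 22) & 0x3FFu;   /* bits [31:22] */
    uint32_t table_idx = (va >> 12) & 0x3FFu;   /* bits [21:12] */
    uint32_t offset    = va & 0xFFFu;           /* bits [11:0]  */

    const page_table_t *pt = pd->pt[dir_idx];   /* lookup 1: directory */
    if (pt == NULL)
        return -1;

    uint32_t pte = pt->pte[table_idx];          /* lookup 2: page table */
    if (!(pte & PTE_PRESENT))
        return -1;

    *pa = (pte & ~0xFFFu) | offset;             /* frame base + offset */
    return 0;
}
```

Only the page tables for regions in use need to exist; every other directory slot stays NULL, which is where the memory saving comes from.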

Worked Quiz Example

Virtual address 0xCAFEBABE:

Binary: 1100101011   1111101011   101010111110
        ├────────┤   ├────────┤   ├──────────┤
        Dir idx      Table idx    Offset
        0x32B (811)  0x3EB (1003) 0xABE (2750)
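One way to check this answer is to let the compiler do the bit slicing. The helper names below (dir_index, table_index, page_offset) are hypothetical; the shifts and masks follow the 10/10/12 layout shown earlier.

```c
#include <stdint.h>

/* Bits [31:22]: index into the Page Directory. */
static inline uint32_t dir_index(uint32_t va)
{
    return (va >> 22) & 0x3FFu;
}

/* Bits [21:12]: index into the selected Page Table. */
static inline uint32_t table_index(uint32_t va)
{
    return (va >> 12) & 0x3FFu;
}

/* Bits [11:0]: byte offset within the 4 KB page. */
static inline uint32_t page_offset(uint32_t va)
{
    return va & 0xFFFu;
}
```

Applied to 0xCAFEBABE, the three helpers return 0x32B, 0x3EB, and 0xABE, matching the values worked out by hand.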

The TLB: Caching Translations

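A two-level page walk requires two extra memory accesses (one to the Page Directory, one to the Page Table) before reaching the data you actually wanted. Without caching, every load or store would cost three memory accesses instead of one.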

The hardware fixes this with the Translation Lookaside Buffer (TLB) — a small, very fast cache of recent VA→PA translations:

On every memory access, the CPU first checks the TLB.
TLB hit (common case): translation returned in ~1 cycle; no page-table walk needed.
TLB miss: MMU walks the page table, loads the PTE into the TLB, then retries.

When the kernel switches to a new page table (context switch), it must flush the TLB (or use address-space IDs to avoid it) so stale translations don't bleed across processes.
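The hit, miss, and flush behavior can be illustrated with a toy software model. This is a sketch under loud assumptions: a direct-mapped cache with 16 entries, whereas real TLBs are hardware structures that are usually set-associative and larger. All names (tlb_t, tlb_lookup, tlb_fill, tlb_flush) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TLB_ENTRIES 16u       /* assumed size, chosen for illustration */

typedef struct {
    bool     valid;
    uint32_t vpn;             /* virtual page number (the tag) */
    uint32_t pfn;             /* cached physical frame number */
} tlb_entry_t;

typedef struct { tlb_entry_t e[TLB_ENTRIES]; } tlb_t;

/* Returns true on a hit and stores the frame number in *pfn. */
static bool tlb_lookup(const tlb_t *tlb, uint32_t vpn, uint32_t *pfn)
{
    const tlb_entry_t *e = &tlb->e[vpn % TLB_ENTRIES];
    if (e->valid && e->vpn == vpn) {
        *pfn = e->pfn;
        return true;
    }
    return false;
}

/* After a miss and a page-table walk, cache the translation,
 * replacing whatever occupied the slot. */
static void tlb_fill(tlb_t *tlb, uint32_t vpn, uint32_t pfn)
{
    tlb_entry_t *e = &tlb->e[vpn % TLB_ENTRIES];
    e->valid = true;
    e->vpn   = vpn;
    e->pfn   = pfn;
}

/* Invalidate every entry, as on a switch to a different page table. */
static void tlb_flush(tlb_t *tlb)
{
    memset(tlb, 0, sizeof *tlb);
}
```

A translation path would call tlb_lookup first, fall back to the page-table walk on a miss, and then call tlb_fill so the next access to the same page hits.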

Programming the MMU: CR3

The kernel tells the MMU which page table to use by writing the physical address of the Page Directory into the %CR3 register:

%CR3  →  Page Directory base address

Only the kernel (ring 0) can write %CR3.
On every context switch, xv6 loads the new process's page-directory address into %CR3, instantly switching the entire address space.

Beyond x86-32: Real-World Page Tables

Modern hardware supports multiple page sizes and more levels:

Mode	Levels	Sizes
x86-32 (classic)	2	4 KB, 4 MB
x86-32 PAE	3	4 KB, 2 MB
x86-64	4	4 KB, 2 MB, 1 GB

The 4-level x86-64 structure (PML4 → PDPT → PD → PT) allows 48-bit virtual addresses (2⁴⁸ bytes = 256 TiB), which is enough for most current workloads.
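The same slicing idea extends to four levels. The sketch below assumes the standard x86-64 layout with 4 KB pages, in which each level is indexed by 9 bits (512 entries per table) above a 12-bit offset; those field widths come from the architecture rather than from the text above, and the helper names are hypothetical.

```c
#include <stdint.h>

#define IDX_MASK 0x1FFull     /* 9 bits: 512 entries per table */

/* Bits [47:39]: index into the PML4. */
static inline unsigned pml4_index(uint64_t va)
{
    return (unsigned)((va >> 39) & IDX_MASK);
}

/* Bits [38:30]: index into the PDPT. */
static inline unsigned pdpt_index(uint64_t va)
{
    return (unsigned)((va >> 30) & IDX_MASK);
}

/* Bits [29:21]: index into the Page Directory. */
static inline unsigned pd_index(uint64_t va)
{
    return (unsigned)((va >> 21) & IDX_MASK);
}

/* Bits [20:12]: index into the Page Table. */
static inline unsigned pt_index(uint64_t va)
{
    return (unsigned)((va >> 12) & IDX_MASK);
}

/* Bits [11:0]: byte offset within the 4 KB page. */
static inline unsigned page_off(uint64_t va)
{
    return (unsigned)(va & 0xFFFull);
}
```

Four 9-bit indices plus the 12-bit offset account for the 48 translated bits (4 × 9 + 12 = 48).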

Key Takeaways

Virtual memory = hardware indirection. The MMU translates every VA to a PA; software never touches physical addresses directly.
Pages are 4 KB by default. The low 12 bits are the offset; upper bits are the virtual page number.
Two-level paging saves memory. Only allocate page tables for regions actually used by a process.
The TLB is a translation cache. It makes paging fast by avoiding repeated page-table walks on hot pages.
CR3 is the root. The kernel programs the MMU by writing the page-directory physical address into %CR3.
Only the kernel can reprogram the MMU — this is what enforces process isolation.
