Virtual Memory I: Address Translation and Paging

Why This Matters

Every program you run believes it owns the entire address space โ€” it can read from address 0x1000 without knowing whether another process is using that same address for something completely different. This illusion is virtual memory, and it is one of the most important abstractions an OS provides. It gives each process isolation, enables programs larger than physical RAM, and lets the kernel control exactly what memory each process can touch.


Three Kinds of Addresses

Understanding virtual memory starts with distinguishing three kinds of addresses on x86:

Term What it is
Logical address The raw address a CPU instruction generates (segment selector + offset)
Linear (virtual) address After segment translation; what the paging hardware sees
Physical address The actual address on the memory bus / in DRAM
Logical address
   (seg:offset)
       โ”‚
       โ”‚  segmentation (GDT)
       โ–ผ
Linear / virtual address
       โ”‚
       โ”‚  paging (page tables)
       โ–ผ
Physical address

In the 64-bit xv6 we study, segmentation is essentially a no-op (base = 0), so a virtual address and a logical address are the same value.


The Core Idea: Indirection via the MMU

Without virtual memory, software writes directly to physical addresses. The problem: two programs can collide, one bug can corrupt the kernel, and the OS has no way to enforce isolation.

The solution is a level of indirection:

This means a process literally cannot access memory the kernel hasn't mapped for it. Any attempt raises a hardware exception (page fault) that the OS handles.


Pages: The Unit of Translation

x86 divides both virtual and physical memory into fixed-size chunks called pages. The default page size is 4 KB (4096 bytes = 2ยนยฒ).

Because addresses are translated at page granularity:

Example โ€” virtual address 0x1013:

0x1013 = 0001 0000 0001 0011
          ^^^^^^^^^^^^ ^^^^
          VPN = 0x1    offset = 0x013

Naive Approach: A Flat Page Table

The simplest design is one big array of page table entries (PTEs):

GET_PTE(va) = &ptes[va >> 12]   // shift off the 12-bit offset

For 32-bit addresses with 4 KB pages:

With 100 processes, that is 400 MB just for page tables โ€” and most of it would be empty (processes don't use the full 4 GB). We need a smarter structure.


Two-Level Paging (x86-32)

x86-32 solves the size problem by splitting the 20-bit VPN into two 10-bit halves:

31      22 21      12 11       0
+----------+----------+---------+
| Dir idx  | Table idx|  Offset |
| (10 bits)| (10 bits)|(12 bits)|
+----------+----------+---------+

Page Directory (PD): A single 1024-entry table. Each entry points to a Page Table (PT), which is only allocated if that 1 GB region is used.

Page Table (PT): Also 1024 entries. Each entry maps one 4 KB page.

Translation works in two hardware "walks":

  1. Take bits [31:22] โ†’ index into the Page Directory โ†’ get the address of a Page Table.
  2. Take bits [21:12] โ†’ index into that Page Table โ†’ get the physical frame number.
  3. Append the 12-bit offset โ†’ physical address.

Worked Quiz Example

Virtual address 0xCAFEBABE:

Binary: 1100 1010 1111 1110  1011 1010  1011 1110
        โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค  โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค  โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
        Dir idx = 0x32B(811) Table = 0x3EB(1003) Offset = 0xABE(2750)

The TLB: Caching Translations

A two-level page walk requires two memory accesses before reaching the data you actually wanted. That would halve memory throughput.

The hardware fixes this with the Translation Lookaside Buffer (TLB) โ€” a small, very fast cache of recent VAโ†’PA translations:

  1. On every memory access, the CPU first checks the TLB.
  2. TLB hit (common case): translation returned in ~1 cycle; no page-table walk needed.
  3. TLB miss: MMU walks the page table, loads the PTE into the TLB, then retries.

When the kernel switches to a new page table (context switch), it must flush the TLB (or use address-space IDs to avoid it) so stale translations don't bleed across processes.


Programming the MMU: CR3

The kernel tells the MMU which page table to use by writing the physical address of the Page Directory into the %CR3 register:

%CR3  โ†’  Page Directory base address

Beyond x86-32: Real-World Page Tables

Modern hardware supports multiple page sizes and more levels:

Mode Levels Sizes
x86-32 (classic) 2 4 KB, 4 MB
x86-32 PAE 3 4 KB, 2 MB
x86-64 4 4 KB, 2 MB, 1 GB

The 4-level x86-64 structure (PML4 โ†’ PDPT โ†’ PD โ†’ PT) allows 48-bit virtual addresses (~256 TB), enough for current workloads.


Key Takeaways

  1. Virtual memory = hardware indirection. The MMU translates every VA to a PA; software never touches physical addresses directly.
  2. Pages are 4 KB by default. The low 12 bits are the offset; upper bits are the virtual page number.
  3. Two-level paging saves memory. Only allocate page tables for regions actually used by a process.
  4. The TLB is a translation cache. It makes paging fast by avoiding repeated page-table walks on hot pages.
  5. CR3 is the root. The kernel programs the MMU by writing the page-directory physical address into %CR3.
  6. Only the kernel can reprogram the MMU โ€” this is what enforces process isolation.

Practice

  1. What is the correct order of address transformations on x86?
  2. On x86-32 with 4 KB pages and a flat (single-level) page table, how much memory would be needed just for one process's page table?
  3. On a 32-bit x86 CPU using two-level paging (10-10-12 split), how many entries does a single Page Table (not the Page Directory) contain?
  4. Break virtual address 0xCAFEBABE into its three fields under x86-32 two-level paging (10-10-12 split). Give the decimal and hex value for each field.
  5. What is the primary purpose of the TLB (Translation Lookaside Buffer)?
  6. Which x86 register holds the physical base address of the current process's Page Directory?
  7. Explain why only the kernel should be allowed to modify the page table and write to %CR3. What would go wrong if user-space code could do it?
  8. A page table entry (PTE) typically contains more than just the physical frame number. Name two common permission bits found in an x86 PTE and explain what each controls.
  9. On a 64-bit x86 CPU, how many levels of page tables does the hardware use by default (before 5-level paging extensions)?
  10. A process accesses virtual address 0x5000. The TLB does not contain an entry for this address. Describe, step by step, how the hardware resolves this access on an x86-32 system with two-level paging.