HW3 Hints: Page Deduplication, Copy-on-Write, and Interrupt Handling

Why This Matters

Two of the most important jobs an OS kernel does are managing physical memory efficiently and responding to hardware events without dropping the ball. This lecture ties those together: HW3 asks you to implement two advanced memory techniques (page deduplication and copy-on-write), while the interrupt-handling deep-dive shows you the exact mechanism the kernel uses to respond to timer ticks, keystrokes, and page faults. Getting comfortable with xv6's interrupt path also unlocks your understanding of system calls and scheduling, since they all flow through the same trap() function.


HW3 Part 1: Page Deduplication

The Idea

Many processes hold identical pages (e.g., the zeroed BSS page, a shared library loaded at the same content). Deduplication detects those duplicates and makes multiple virtual addresses point to a single physical page, freeing the extras.

The frameinfo Array

struct frameinfo {
  int refs;        // how many PTEs point at this physical page
  addr_t checksum; // content fingerprint for fast comparison
};
struct frameinfo frameinfo[PHYSTOP/PGSIZE];

One entry per physical page frame. The index is the page frame number (physical address / PGSIZE).

Reference Counting Helpers

Function What it does
kretain(char *v) Increment frameinfo[PA/PGSIZE].refs — "one more owner"
krelease(char *v) Decrement refs; when it hits 0, return the page to the free list

The existing kfree must become aware of reference counts: only truly free the page when refs reaches zero.

Implementing dedup(void *vstart, void *vend)

  1. Walk every mapped virtual page in [vstart, vend).
  2. Compute a checksum of the page's content.
  3. Scan the frameinfo array (or a hash table for efficiency) for another frame with the same checksum.
  4. If a match is found, verify byte-by-byte equality (checksums can collide).
  5. If truly identical: remap the current VA to the existing physical page, increment its refs via kretain, and free the now-unused page via krelease.
  6. Mark both PTEs read-only (because the page is now shared — a write must trigger CoW).

HW3 Part 2: Copy-on-Write (CoW)

The Problem with fork()

The naive fork() copies every parent page immediately. Most fork() calls are followed immediately by exec(), so the copy is wasted work.

The CoW Approach

  1. fork() maps both parent and child to the same physical pages, but marks every writable PTE as read-only in both address spaces.
  2. When either process attempts a write, the MMU raises a page fault (vector 14).
  3. The kernel's page-fault handler detects this is a CoW fault (not a true error):
    • Allocate a new physical page.
    • Copy the faulted page's content into it.
    • Update the faulting process's PTE to point to the new page, now writable.
  4. The other process still points at the original page, unchanged.

The refs field in frameinfo tells you whether you are the sole owner (safe to write in place) or must copy first.


Interrupts and Exceptions

The Hardware Path

Device (keyboard, timer, disk…)
        │  electrical signal
        ▼
   I/O APIC  (system chipset / south bridge)
        │  redistributes to a Local APIC
        ▼
   Local APIC  (inside each CPU)
        │  raises interrupt on the CPU
        ▼
        CPU

Interrupts vs. Exceptions

Both are handled through the same IDT mechanism:

Source Examples Term
Internal CPU fault divide-by-zero, page fault Exception
Software int n instruction system calls Software interrupt
External hardware timer, keyboard, disk Hardware interrupt

x86 Interrupt Vectors

Every interrupt/exception has a vector number (0–255):

Range Owner
0–31 Reserved by Intel (processor-defined exceptions)
32–255 Available for OS use

Key xv6 vectors:

Vector Meaning
14 Page fault (exception)
32 Timer (Local APIC timer)
33 Keyboard

xv6 Interrupt Handling: End-to-End

1. Boot-time Setup

main()
 ├─ lapicinit()   // configure the Local APIC
 ├─ tvinit()      // fill the IDT: mkgate(idt, n, vectors[n], 0) for each n
 └─ mpmain()
     └─ idtinit() // lidt(idt, PGSIZE) — load the IDT register

2. The vectors.S Stubs (generated by vectors.pl)

vectors.pl emits 256 tiny assembly stubs, one per vector. Each stub:

vector14:
  push $14       ; push error code (or 0 if none)
  jmp alltraps

The stubs are purely glue; they normalise the stack so alltraps always sees the same layout.

3. alltraps in trapasm.S

alltraps saves all general-purpose registers onto the kernel stack, constructing a struct trapframe:

alltraps:
  pushq %r15
  pushq %r14
  ...
  movq %rsp, %rdi   ; pass trapframe pointer as first arg
  callq trap

Before alltraps runs, the CPU hardware has already pushed (for a user→kernel transition): SS, RSP, RFLAGS, CS, RIP. In kernel mode it pushes RFLAGS, CS, RIP.

4. trap(struct trapframe *tf) in trap.c

void trap(struct trapframe *tf) {
  switch(tf->trapno) {
  case T_IRQ0 + IRQ_TIMER:   lapiceoi(); /* ack */ break;
  case T_IRQ0 + IRQ_KBD:     kbdintr(); lapiceoi(); break;
  case T_IRQ0 + IRQ_IDE:     /* disk I/O */ break;
  default:
    if (proc == 0 || (tf->cs & 3) == 0) panic("trap");
  }
}

lapiceoi() tells the APIC the interrupt is acknowledged. Without it, the APIC won't deliver further interrupts.

5. Return Path

trap() returns
  └─ trapret:
      pop all registers (reverse of alltraps)
      iretq          // restores RIP, CS, RFLAGS (and RSP, SS if returning to user)

Complete Flow Summary

APIC signal
  → CPU looks up vector in IDT
  → CPU pushes RIP/CS/RFLAGS[/RSP/SS] automatically
  → vectors.S stub: push vector number, jmp alltraps
  → alltraps: push remaining registers → call trap(tf)
  → trap(): dispatch on tf->trapno, handle, lapiceoi()
  → trapret: pop registers
  → iretq: resume interrupted code

The Top-Half / Bottom-Half Problem

xv6 handles the interrupt fully inside the interrupt context. For fast devices that's fine, but slow work (e.g., copying a disk buffer) blocks all other interrupts.

Linux's solution:


Key Takeaways

Practice

  1. Which interrupt vector range is available for OS-defined interrupts on x86?
  2. In xv6, which interrupt vector number is assigned to the keyboard?
  3. What does the label vector13: in xv6's vectors.S file represent?
  4. True or False: alltraps is automatically generated by vectors.pl.
  5. What is the primary reason Linux splits interrupt handling into a 'top half' and a 'bottom half'?
  6. Trace the complete path of a keyboard interrupt in xv6, from the key press to the return to interrupted code. Name each stage and the file/function involved.
  7. Explain how Copy-on-Write (CoW) works in HW3 Part 2. What happens at fork() time, and what happens when the child (or parent) writes to a shared page?
  8. In HW3 Part 1 (page deduplication), what role does the frameinfo array play, and what must happen to PTEs when two virtual addresses are mapped to the same physical page after deduplication?
  9. When transitioning from user mode to kernel mode due to an interrupt, which registers does the x86 CPU push onto the stack automatically (without software assistance)?
  10. What happens if a hardware interrupt handler in xv6 forgets to call lapiceoi() before returning?