Atomic Instructions and xv6 Locks

Why This Matters

Modern operating systems run multiple threads simultaneously. Whenever two threads touch the same data without coordination, the result can be wrong in subtle, hard-to-reproduce ways. This module builds the complete picture: why ordinary C statements aren't safe to share, what the CPU gives us to fix that, and how xv6 turns those CPU primitives into the spinlock you will use throughout the rest of the course.

The Race Condition Problem

Consider the simplest possible shared variable:

int i = 7;

void foo(void) {
    i++;
}

If two threads call foo() concurrently, intuition says i should end up as 9. But it might end up as 8. Here's why.

A single C statement compiles to multiple instructions

i++ looks like one operation, but the compiler translates it into (at least) three machine instructions:

LOAD   reg, [i]   // read i from memory into a register
ADD    reg, 1     // increment the register
STORE  [i], reg   // write the result back to memory

The CPU can switch context—or, on a multicore machine, a second CPU can run—between any of those three steps.

The race in pictures

Thread A	Thread B	Value of `i` in memory
LOAD reg_A ← 7		7
	LOAD reg_B ← 7	7
ADD reg_A = 8		7
	ADD reg_B = 8	7
STORE i ← 8		8
	STORE i ← 8	8 ← wrong!

Both threads read the value 7 before either has written 8 back. Both write 8. One increment is silently lost. This is called a race condition or data race.

The Solution: Atomic Instructions

An atomic operation is one that the CPU guarantees to complete without any possibility of interleaving with other threads. No other thread can observe a half-finished atomic operation.

Operation	Atomic?
`i++` in C	No – compiles to 3 instructions
`atomic_inc(&i)` (Linux)	Yes – single indivisible instruction

The hardware enforces atomicity physically: while one core executes an atomic instruction, the memory bus is locked so no other core can perform a conflicting access.

`xchg()`: The Key Primitive in xv6

xv6 builds its entire lock system on one atomic instruction: xchg (exchange). The function in x86.h is:

static inline uint xchg(volatile uint *addr, uint newval)
{
    uint result;
    asm volatile("lock; xchgl %0, %1" :
        "+m" (*addr), "=a" (result) :
        "1" (newval) :
        "cc");
    return result;
}

What it does

xchg(addr, newval) atomically:

Reads the current value at *addr into result.
Writes newval into *addr.
Returns the old value (result).

Example:

Before: *addr = 0, newval = 1
After:  *addr = 1, result (returned) = 0

The lock; prefix on the x86 instruction is what tells the CPU to assert the bus lock, making the swap indivisible. No other core can read or write *addr between the read and the write.

xv6 Spinlock: `acquire` and `release`

A spinlock uses xchg to implement mutual exclusion. The lock is just a struct with a single integer field locked (0 = free, 1 = held).

`acquire`

void acquire(struct spinlock *lk)
{
    pushcli();              // disable interrupts on this CPU
    if (holding(lk)) panic("acquire");

    // Spin until we can set lk->locked from 0 to 1.
    while (xchg(&lk->locked, 1) != 0)
        ;

    __sync_synchronize();   // memory barrier
}

Step by step:

pushcli() – Disables interrupts on the current CPU. This prevents a situation where the CPU holds the lock, then gets interrupted, and the interrupt handler also tries to acquire the same lock (deadlock).
holding() check – Panics if this CPU already holds the lock (avoids self-deadlock in debug builds).
Spin loop – Calls xchg(&lk->locked, 1) repeatedly. If the lock was already 1 (held by someone else), xchg returns 1 and we keep spinning. The moment the lock is released (set to 0), our xchg will atomically set it to 1 and return 0, breaking out of the loop.
__sync_synchronize() – A compiler/hardware memory barrier. It ensures that no memory read or write that occurs after this point in the code is reordered by the CPU or compiler to happen before we acquired the lock.

`release`

void release(struct spinlock *lk)
{
    __sync_synchronize();   // memory barrier

    asm volatile("movl $0, %0" : "+m" (lk->locked) : );
    popcli();               // re-enable interrupts
}

Step by step:

__sync_synchronize() – Ensures all writes protected by the lock are visible to other CPUs before the lock is released.
movl $0 – Writes 0 to lk->locked, releasing the lock. This is done with inline assembly (rather than lk->locked = 0) to prevent the compiler from reordering it past the barrier.
popcli() – Restores the interrupt-enable state that was in effect before the matching pushcli().

Why Interrupts Must Be Disabled

Disabling interrupts during lock hold prevents a specific deadlock scenario:

Thread A acquires lock L on CPU 0.
A timer interrupt fires on CPU 0.
The interrupt handler tries to acquire lock L.
Deadlock: the handler can never get L because A holds it, and A can never run to release L because the handler never returns.

On a uniprocessor, this is the only form of concurrency. On a multiprocessor, disabling interrupts is still necessary to handle this local-CPU scenario, but the xchg spin is still needed to exclude other CPUs.

Key Takeaways

i++ is not atomic: it compiles to load/add/store, and concurrent threads can interleave these steps, causing lost updates.
Atomic instructions (like x86 xchg with the lock prefix) are indivisible: the CPU ensures no other thread can observe a partial execution.
xv6's xchg(addr, newval) atomically swaps *addr with newval and returns the old value — the single primitive underpinning xv6 spinlocks.
acquire spins calling xchg until it wins the lock (reads 0), then sets a memory barrier; it also disables interrupts to prevent same-CPU deadlock.
release sets a memory barrier, then writes 0 to the lock word, then re-enables interrupts.
Memory barriers (__sync_synchronize) prevent the compiler and CPU from reordering loads/stores across the lock boundary, preserving the correctness guarantee the lock is supposed to provide.

Atomic Instructions and xv6 Locks

Why This Matters

The Race Condition Problem

A single C statement compiles to multiple instructions

The race in pictures

The Solution: Atomic Instructions

`xchg()`: The Key Primitive in xv6

What it does

xv6 Spinlock: `acquire` and `release`

`acquire`

`release`

Why Interrupts Must Be Disabled

Key Takeaways

Practice

Model answer

Model answer

Model answer

Results

Atomic Instructions and xv6 Locks

Why This Matters

The Race Condition Problem

A single C statement compiles to multiple instructions

The race in pictures

The Solution: Atomic Instructions

xchg(): The Key Primitive in xv6

What it does

xv6 Spinlock: acquire and release

acquire

release

Why Interrupts Must Be Disabled

Key Takeaways

Practice

Model answer

Model answer

Model answer

Results

`xchg()`: The Key Primitive in xv6

xv6 Spinlock: `acquire` and `release`

`acquire`

`release`