Return-Oriented Programming (ROP)

Modern operating systems mark the stack and heap non-executable (the W^X / DEP / NX policy): memory that can be written cannot be executed. That one rule kills classic shellcode injection — your payload lands on the stack but the CPU refuses to run it. ret2libc was the first step past this wall: jump to an existing libc function instead of injected bytes. But ret2libc is limited to whole function calls with fixed argument counts. Return-Oriented Programming generalizes the idea to arbitrary computation by chaining tiny snippets of existing code — no injected instructions required.

Why DEP/NX is not enough

When the OS enforces W^X:

Stack and heap pages are writable but not executable (W, no X).
Code pages are executable but not writable (X, no W).

An attacker who controls the return address can no longer jump into a buffer. But the program's own .text segment — and every shared library it loads — is already marked executable and cannot be taken away. Those pages contain every instruction the attacker needs; they just have to be assembled in the right order.

From ret2libc to ROP: the key insight

In ret2libc you overwrite the return address with the address of system(), place "/bin/sh" where the argument goes, and you are done. The problem with chaining three or more calls is argument cleanup: after fun1(t) returns, its argument t is still sitting on the stack. Before fun2(u, v) can be called correctly, the stack pointer must skip past t. The slides call these gadgets — small pieces of existing code that end in ret and perform exactly this cleanup:

popl %ebp; ret       ; skip 1 word, then dispatch to next gadget
popl %ebx; popl %ebp; ret   ; skip 2 words
addl $16, %esp; ret  ; skip 4 words (3 args + saved ebp equivalent)

Each ret pops the next address off the stack and jumps there, handing control to the following gadget. The stack is no longer just data — it has become the program counter for a new virtual machine.

What is a gadget?

A gadget is a short instruction sequence that already exists in the binary (or any loaded library) and ends with a ret instruction. Two to five instructions is typical. Examples:

pop %eax; ret          ; load a value from the stack into %eax
pop %ebx; ret          ; load a value into %ebx
mov [%ebx], %eax; ret  ; write %eax to the memory address in %ebx
add %eax, %ebx; ret    ; arithmetic
xor %eax, %eax; ret    ; zero a register

Crucially, gadgets do not have to align with the original function boundaries. On x86 (a variable-length ISA), you can start decoding mid-instruction in an existing function and discover a different valid instruction sequence ending in 0xc3 (ret). This means a large binary typically contains thousands of usable gadgets.

The gadget chain on the stack

When the overflow fires, %esp points into the attacker-controlled region. The CPU's ret instruction executes %eip = *(%esp); %esp += 4 — it pops the top of the stack into the instruction pointer. If the stack contains a sequence of gadget addresses interleaved with data values, each ret hands control to the next gadget and the gadget's own pop instructions consume the accompanying data:

Stack layout (low address at bottom, %esp points here initially):

 [ &gadget_1         ]  <-- ret pops this into %eip
 [ value_for_gadget_1]  <-- gadget_1's "pop reg" consumes this
 [ &gadget_2         ]  <-- gadget_1's "ret" pops this into %eip
 [ value_for_gadget_2]
 [ &gadget_3         ]
 ...

The ROP comparison table from the slides captures this elegantly:

Concept	Normal programming	ROP equivalent
Instruction pointer	`%eip`	`%esp` (stack pointer)
No-op	`nop`	`ret` (advances `%esp`, does nothing useful)
Unconditional jump	`jmp address`	Set `%esp` to address of gadget
Variables	Registers and memory	Mostly memory (values on stack)

Building a chain: the `Mem[v2] = v1` example

The slides walk through a concrete ROP operation: store value v1 at memory address v2. The desired effect Mem[v2] = v1 requires three gadgets found anywhere in the binary:

a1: pop eax; ret        -- loads v1 into %eax
a2: pop ebx; ret        -- loads v2 into %ebx
a3: mov [ebx], eax      -- writes %eax → *(%ebx)

Stack layout to execute this chain:

 [ &a1 ]   <-- initial return address (from overflow)
 [ v1  ]   <-- consumed by a1's "pop eax"
 [ &a2 ]   <-- a1's "ret" jumps here
 [ v2  ]   <-- consumed by a2's "pop ebx"
 [ &a3 ]   <-- a2's "ret" jumps here

After a3 runs, mem[v2] = v1. By assembling enough such triples you can perform arbitrary computation — move data, call functions, invoke syscalls — entirely from existing code.

Turing completeness

Hovav Shacham's 2007 paper "The Geometry of Innocent Flesh on the Bone" proved that ROP is Turing complete: any computation expressible in a normal program can be expressed as a gadget chain, as long as the binary is large enough to contain the necessary gadget types. The paper's key observation: "In any sufficiently large body of x86 executable code there will exist sufficiently many useful code sequences that an attacker who controls the stack will be able to cause the exploited program to undertake arbitrary computation."

Tools: finding gadgets automatically

Manual gadget hunting is tedious. Two widely used tools automate it:

Tool	Usage
ROPgadget	`ROPgadget --binary ./target --ropchain` scans the binary for gadgets and can automatically synthesize a chain to call `execve`. Pass `--binary /path/to/libc.so.6 --offset <load_addr>` to search a shared library.
pwntools (`ROP` class)	`from pwn import *; elf = ELF('./target'); rop = ROP(elf)` — programmatically build chains by calling `rop.call('system', [next(elf.search(b'/bin/sh'))])`.

ROPgadget's --ropchain output is a ready-to-paste Python snippet using struct.pack to assemble the chain byte by byte.

Defenses against ROP

ROP requires the attacker to know gadget addresses and to control the stack. Defenses attack both requirements:

Defense	Mechanism
Stack canary	Detects stack-buffer overflow before `ret` executes
ASLR	Randomizes load addresses — gadget addresses are unknown at exploit-write time
Control-Flow Integrity (CFI)	Hardware/compiler validates every indirect transfer; `ret` to an unexpected address is blocked
"Return-less" code	Compiler replaces `ret` with equivalent sequences that are not gadget terminators
Compartmentalization	Limits blast radius so even a successful chain cannot reach sensitive operations

In practice, ASLR is the most commonly deployed first-line defense against ROP; CFI (e.g., Intel CET shadow stack) is the more robust hardware solution deployed in modern systems.

Key takeaways

DEP/NX prevents code injection but cannot stop reuse of existing executable code.
A gadget is any short instruction sequence ending in ret found in the binary or its libraries; on x86 there are typically thousands.
Each ret pops the next gadget address off the stack, making %esp the effective instruction pointer of the ROP "machine."
The attacker places a gadget chain — interleaved addresses and data — in the overflow payload; no injected bytes are needed.
Cleanup gadgets (pop reg; ret, add $N, %esp; ret) skip over function arguments between calls, enabling chaining of multiple functions.
ROP is Turing complete: any computation is achievable given a large enough binary.
ROPgadget and pwntools automate gadget discovery and chain synthesis.
Robust defenses include stack canaries (detect the overflow), ASLR (hide addresses), and CFI / shadow stack (validate returns).

Return-Oriented Programming (ROP)

Why DEP/NX is not enough

From ret2libc to ROP: the key insight

What is a gadget?

The gadget chain on the stack

Building a chain: the `Mem[v2] = v1` example

Turing completeness

Tools: finding gadgets automatically

Defenses against ROP

Key takeaways

Practice

Model answer

Model answer

Results

Return-Oriented Programming (ROP)

Why DEP/NX is not enough

From ret2libc to ROP: the key insight

What is a gadget?

The gadget chain on the stack

Building a chain: the Mem[v2] = v1 example

Turing completeness

Tools: finding gadgets automatically

Defenses against ROP

Key takeaways

Practice

Model answer

Model answer

Results

Building a chain: the `Mem[v2] = v1` example