Control-Flow Integrity (CFI)
Stack canaries stop a linear buffer overflow from silently corrupting the return address. ASLR randomizes where code lives. But Return-Oriented Programming (ROP) can survive both: an attacker who leaks the canary, or who overwrites the return address with a write that skips over it, still controls ret (and many targets are built without canaries at all), and the chain reuses code that is already mapped, so ASLR falls once a single address leaks. The defense that directly targets the shape of execution, rather than how the attacker got there, is Control-Flow Integrity.
The threat: a recap of code-reuse attacks
Every indirect control-flow transfer in a program (indirect call/jmp through a register or memory, and ret) resolves its target at runtime. Attackers who can corrupt memory exploit this:
- Function-pointer overwrite: overwriting `fp` before `call fp` redirects a forward transfer to an arbitrary gadget.
- Return-address overwrite: overwriting the saved return address redirects `ret`, a backward transfer.
ROP strings together small existing code snippets ("gadgets") ending in ret, each gadget picking up its next target from the attacker-controlled stack. The result is Turing-complete computation without injecting a single byte of new code — which is why W^X / NX cannot stop it.
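To chain functions with arguments, the attacker constructs a fake stack. Under the 32-bit cdecl convention, a function entered via ret finds its return address at the top of the stack and its arguments just above it:
%esp → &fun1()
       &pop_ret_gadget      ; fun1's return address: pops arg1, then rets into fun2
       arg1_for_fun1
       &fun2()
       &pop_pop_ret_gadget  ; fun2's return address: pops both arguments
       arg1_for_fun2
       arg2_for_fun2
       ...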
Each pop; ret gadget adjusts %esp so the next function address surfaces at the top of the stack when fun1 returns, enabling arbitrary multi-function chains.
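The way ret walks an attacker-built stack can be sketched with a toy interpreter. This is only a simulation under stated assumptions: the "addresses" are small made-up constants, `toy_cpu` and `run_chain` are hypothetical names, and the model assumes cdecl (on entry via ret, the return address is at the top of the stack and the first argument sits just above it).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "addresses" for a toy machine; a real chain uses code addresses. */
enum { ADDR_HALT, ADDR_FUN1, ADDR_FUN2, ADDR_POP_RET, ADDR_POP_POP_RET };

typedef struct {
    const uint32_t *stack; /* attacker-controlled words */
    size_t len;            /* number of words in the fake stack */
    size_t sp;             /* index of the current top of stack, like %esp */
    uint32_t fun1_arg;     /* argument that fun1 observed */
    uint32_t fun2_sum;     /* fun2 adds its two arguments */
    int calls;             /* how many functions ran */
} toy_cpu;

/* Runs the chain; returns 0 on a clean halt, -1 on a malformed chain. */
static int run_chain(toy_cpu *cpu) {
    assert(cpu != NULL && cpu->stack != NULL);
    while (cpu->sp < cpu->len) {
        uint32_t target = cpu->stack[cpu->sp++]; /* ret: pop the next target */
        switch (target) {
        case ADDR_FUN1: /* [sp] = return address, [sp+1] = arg1 */
            if (cpu->sp + 1 >= cpu->len) return -1;
            cpu->fun1_arg = cpu->stack[cpu->sp + 1];
            cpu->calls++;
            break;      /* fun1's own ret pops [sp] on the next iteration */
        case ADDR_FUN2: /* [sp] = return address, [sp+1], [sp+2] = args */
            if (cpu->sp + 2 >= cpu->len) return -1;
            cpu->fun2_sum = cpu->stack[cpu->sp + 1] + cpu->stack[cpu->sp + 2];
            cpu->calls++;
            break;
        case ADDR_POP_RET:     cpu->sp += 1; break; /* discard one argument */
        case ADDR_POP_POP_RET: cpu->sp += 2; break; /* discard two arguments */
        case ADDR_HALT:        return 0;
        default:               return -1;           /* unknown target */
        }
    }
    return -1; /* ran off the end of the fake stack */
}
```

Feeding it the words `{ADDR_FUN1, ADDR_POP_RET, 7, ADDR_FUN2, ADDR_POP_POP_RET, 20, 22, ADDR_HALT}` runs fun1 with argument 7 and then fun2 with arguments 20 and 22, without any call instruction being involved.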
The CFI idea
The control-flow graph (CFG) of a program is a static representation where:
- Each node is a basic block: a maximal straight-line sequence of instructions that control can enter only at its first instruction and leave only at its last.
- Each edge represents a possible jump between basic blocks.
CFI enforcement rule: at every indirect transfer instruction, the actual runtime target must be a node that the static CFG says is a legal successor from that instruction. If the check fails, the program aborts.
Because ROP chains jump to arbitrary locations (middle of functions, gadget addresses that were never call targets), they violate the CFG and are caught.
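As a minimal sketch of the enforcement rule, the check amounts to a set-membership test on (transfer site, target) pairs. The edge table below is invented for illustration, and `cfi_edge_allowed` is a hypothetical name; a real implementation derives the table from the program's CFG and aborts instead of returning false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int site;   /* ID of an indirect transfer instruction */
    int target; /* ID of a basic block it may legally reach */
} cfg_edge;

/* Hypothetical CFG: site 0 may reach blocks 10 or 11, site 1 only block 20. */
static const cfg_edge allowed_edges[] = { {0, 10}, {0, 11}, {1, 20} };

/* Returns true if the static CFG contains the edge site -> target. */
static bool cfi_edge_allowed(int site, int target) {
    assert(site >= 0 && target >= 0);
    size_t n = sizeof allowed_edges / sizeof allowed_edges[0];
    for (size_t i = 0; i < n; i++) {
        if (allowed_edges[i].site == site && allowed_edges[i].target == target)
            return true;
    }
    return false; /* a real CFI runtime would abort the program here */
}
```

A linear scan is used only for clarity; production schemes encode the same membership test in labels, bit vectors, or jump tables so that the check costs a few instructions.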
Forward-edge CFI: protecting indirect calls and jumps
A forward-edge transfer is any call or jmp whose target is computed at runtime — including calls through function pointers and C++ virtual dispatch through vtables. These are the transfers attackers use to invoke arbitrary library functions or gadgets.
Label-based implementation (Abadi et al., TISSEC 2009)
The classic approach instruments every indirect jmp/call to check that the destination holds a known CFI label. For example, a jmp ecx becomes:
; before jump (source side)
cmp [ecx], 12345678h ; does destination start with our label?
jne error_label ; if not, abort
lea ecx, [ecx+4] ; skip past the label in the destination
jmp ecx ; jump to actual code
; at each valid destination (destination side)
78 56 34 12 ; label data 12345678h
mov eax, [esp+4] ; real first instruction
...
Labels are chosen so they are not interpretable as harmful instructions and do not appear accidentally elsewhere in code. Different call sites that may target different sets of destinations get different label IDs, giving per-call-site precision.
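The same check can be modeled in C, assuming each valid destination is represented as a record whose first word is its label. The types and names here (`labeled_target`, `checked_call`, `LABEL_B`) are hypothetical; the real scheme embeds the label in the instruction stream instead of in a struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int (*handler_fn)(int);

/* Models "label word, then code" at a valid destination. */
typedef struct {
    uint32_t label;
    handler_fn code;
} labeled_target;

#define LABEL_A 0x12345678u /* label used in the listing above */
#define LABEL_B 0x0badc0deu /* hypothetical label for a different target set */

static int double_it(int x) { return 2 * x; }
static int negate(int x) { return -x; }

static const labeled_target target_double = { LABEL_A, double_it };
static const labeled_target target_negate = { LABEL_B, negate };

/* Source-side check: returns false (a CFI violation) on a label mismatch. */
static bool checked_call(const labeled_target *dest, uint32_t expected,
                         int arg, int *out) {
    assert(dest != NULL && out != NULL);
    if (dest->label != expected) /* cmp [ecx], label ; jne error_label */
        return false;
    *out = dest->code(arg);      /* lea ecx, [ecx+4] ; jmp ecx */
    return true;
}
```

A call site that expects `LABEL_A` can reach `target_double` but is refused `target_negate`, even though both are legitimate functions somewhere in the program.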
Clang/LLVM forward-edge CFI (Google)
Clang's -fsanitize=cfi implements forward-edge CFI for indirect function calls (including C++ vtable calls). The mechanism:
- Build an indirect jump table (`f_JT`) of size 2^n covering all valid targets for a class of call sites.
- At each indirect call (`call *%rax`), transform the call: check that `%rax` falls within the table range, then redirect through the table entry to the real function.
call *%rax → <check %rax is in valid set>
call *%rax (via jump table slot)
Only addresses that appear in the jump table are reachable. A corrupted pointer that points outside the table causes the check to fail.
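A rough C model of the table check follows. It is a sketch of the idea described above, not the code Clang emits: `checked_icall`, `cfi_trap`, and the table contents are assumptions made for illustration, and unused slots are padded with a trap stub so the table keeps its power-of-two size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define JT_BITS 2
#define JT_SIZE (1u << JT_BITS) /* table size is 2^n */

static int add_one(int x) { return x + 1; }
static int square(int x) { return x * x; }
static int cfi_trap(int x) { (void)x; return -1; } /* hypothetical padding stub */

/* Jump table holding every valid target for one class of call sites. */
static int (*const f_JT[JT_SIZE])(int) = { add_one, square, cfi_trap, cfi_trap };

/* Accepts only pointers that name a slot inside f_JT, then calls through it. */
static bool checked_icall(int (*const *slot)(int), int arg, int *out) {
    assert(out != NULL);
    uintptr_t p = (uintptr_t)slot;
    uintptr_t base = (uintptr_t)f_JT;
    if (p < base || p >= base + sizeof f_JT)  /* outside the table range */
        return false;
    if ((p - base) % sizeof(f_JT[0]) != 0)    /* not aligned to a slot */
        return false;
    *out = (*slot)(arg);                      /* redirect through the entry */
    return true;
}
```

A pointer to `f_JT[1]` passes and calls `square`; a pointer to any other location, such as a function pointer stored in a local variable, is rejected because it does not fall inside the table.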
Backward-edge CFI: protecting returns
A backward-edge transfer is ret, which pops the return address from the stack. ROP lives here. Protecting returns requires ensuring every ret goes back to the actual call site that invoked the current function.
Shadow stack
The canonical solution is a shadow stack: a second, separate stack that stores a copy of each return address when a function is called, and checks the real stack's return address against the shadow copy when the function returns.
call foo → push return_addr onto shadow stack
call foo
ret → pop shadow copy
compare [esp] vs shadow copy
if mismatch → abort
ret normally
Because the shadow stack is stored in a memory region the attacker cannot reach by overflowing normal stack buffers (it is in a separate allocation or register-protected region), corrupting the regular stack's return address is detected the moment ret executes.
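The push-on-call, compare-on-return discipline can be sketched as a small data structure. This is a simulation only: `shadow_stack`, `shadow_push`, and `shadow_check_ret` are hypothetical names, return addresses are plain integers, and a real shadow stack also needs its memory protected from ordinary writes, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 64

typedef struct {
    uintptr_t slots[SHADOW_DEPTH]; /* copies of return addresses */
    size_t top;                    /* number of live entries */
} shadow_stack;

/* Call side: record the return address. Returns false if the stack is full. */
static bool shadow_push(shadow_stack *ss, uintptr_t ret_addr) {
    assert(ss != NULL);
    if (ss->top == SHADOW_DEPTH)
        return false;
    ss->slots[ss->top++] = ret_addr;
    return true;
}

/* Return side: pop the shadow copy and compare it with the address found on
 * the regular stack. False means a mismatch (or an unmatched return), where
 * a real implementation would abort. */
static bool shadow_check_ret(shadow_stack *ss, uintptr_t ret_on_stack) {
    assert(ss != NULL);
    if (ss->top == 0)
        return false;
    return ss->slots[--ss->top] == ret_on_stack;
}
```

If an overflow changes the saved return address on the regular stack between the push and the check, the comparison fails even though the attacker never touched the shadow copy.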
Code Pointer Integrity (CPI) / Code Pointer Separation (CPS)
Full software-based CFI carries performance overhead. CPI (Kuznetsov et al., OSDI 2014) takes a targeted approach: instead of checking all control transfers, it separates sensitive code pointers (return addresses, function pointers) onto a safe stack that is only reachable through safe accesses. The regular stack holds buffers (which can overflow) but not the return address; the safe stack holds the return address (which is never directly adjacent to a buffer).
| foo() regular stack | foo() safe stack |
|---|---|
| buf[16] | r (local variable, pointer) |
| (nothing sensitive) | ret address ← protected |
Coarse vs. fine-grained CFI
Not all CFI schemes enforce the same precision:
| Granularity | What is allowed at each indirect transfer | Attacker's room |
|---|---|---|
| Fine-grained | Only the specific targets the CFG says are valid for this call site | Very narrow |
| Coarse-grained | Any function entry point (or any return site) system-wide | Wide: ROP or COOP (counterfeit object-oriented programming) over valid targets remains possible |
Coarse-grained CFI (e.g., "return anywhere that looks like a return site") is much cheaper to implement but leaves substantial attack surface. Fine-grained CFI with per-call-site type checking (Clang's approach, or the original Abadi label scheme) dramatically reduces the valid target set, making practical exploitation far harder.
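The difference in precision can be illustrated with two checks over the same set of function entries. Everything here is a made-up example: the entry list, the signature IDs, and the names `coarse_ok` and `fine_ok` are assumptions, and matching on a signature ID stands in for whatever per-call-site target set a real fine-grained scheme computes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { SIG_INT_TO_INT, SIG_STR_TO_INT } sig_id;

typedef struct {
    const char *name; /* illustrative function name */
    sig_id sig;       /* signature class of the function */
} fn_entry;

/* Hypothetical list of every function entry point in the program. */
static const fn_entry known_entries[] = {
    { "parse_digit", SIG_INT_TO_INT },
    { "clamp",       SIG_INT_TO_INT },
    { "run_command", SIG_STR_TO_INT },
};
#define N_ENTRIES (sizeof known_entries / sizeof known_entries[0])

/* Coarse-grained: any known function entry is an acceptable target. */
static bool coarse_ok(size_t target) {
    return target < N_ENTRIES;
}

/* Fine-grained: the target must also match what this call site expects. */
static bool fine_ok(sig_id site_sig, size_t target) {
    assert(site_sig == SIG_INT_TO_INT || site_sig == SIG_STR_TO_INT);
    return target < N_ENTRIES && known_entries[target].sig == site_sig;
}
```

A call site that expects an int-to-int function may be redirected to `run_command` under the coarse check, while the fine-grained check refuses it.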
Hardware-assisted CFI: Intel CET
Software CFI instruments binaries, which is expensive and requires recompilation. Intel Control-flow Enforcement Technology (CET), available in recent Intel CPUs (Tiger Lake and later), bakes CFI primitives into the microarchitecture:
| CET feature | What it does | Edge protected |
|---|---|---|
| Shadow Stack (SS) | Hardware-maintained second stack; CALL pushes to it, RET compares against it and raises #CP on mismatch | Backward (returns) |
| ENDBRANCH (ENDBR32/ENDBR64) | Valid indirect-call/jump targets must begin with this instruction; the CPU faults if an indirect transfer lands elsewhere | Forward (indirect calls/jumps) |
Key properties of CET:
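- The shadow stack needs no recompilation of legacy code: CALL and RET do the bookkeeping in hardware for any binary running on a CET-enabled CPU, provided the OS enables the feature. Code that rewrites return addresses itself (some longjmp-style unwinding or JIT tricks) may still need to be shadow-stack aware.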
- ENDBRANCH-based forward-edge protection does require binaries to be compiled with CET support (so valid function entries contain the instruction).
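- Linux exposes CET status via `/proc/cpuinfo` (the `ibt` and `shstk` flags, for ENDBRANCH-based indirect branch tracking and the shadow stack respectively; exact flag names can depend on the kernel version); a small C program (`cet-support.c`) can query `CPUID` to confirm availability.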
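A CPUID query along these lines might look as follows. This is a sketch, not the `cet-support.c` mentioned above: `cet_caps`, `query_cet`, and `format_cet` are hypothetical names, the code assumes GCC or Clang's `<cpuid.h>`, and it relies on the feature bits documented by Intel for leaf 7, subleaf 0 (ECX bit 7 for the shadow stack, EDX bit 20 for indirect branch tracking). On non-x86 builds it simply reports both features as absent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

typedef struct {
    bool shadow_stack; /* CPU advertises CET shadow stack */
    bool ibt;          /* CPU advertises indirect branch tracking (ENDBRANCH) */
} cet_caps;

/* Reads the CET feature bits from CPUID leaf 7, subleaf 0. */
static cet_caps query_cet(void) {
    cet_caps caps = { false, false };
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        (void)eax;
        (void)ebx;
        caps.shadow_stack = ((ecx >> 7) & 1u) != 0;
        caps.ibt = ((edx >> 20) & 1u) != 0;
    }
#endif
    return caps;
}

/* Writes a short summary such as "shstk=1 ibt=0" into buf. */
static int format_cet(cet_caps caps, char *buf, size_t len) {
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "shstk=%d ibt=%d", caps.shadow_stack, caps.ibt);
}
```

These bits only say what the processor can do. Whether a given process actually runs with the shadow stack or ENDBRANCH checking active depends on the kernel and on how the binary was built.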
CFI in context: other defenses
| Defense | What it addresses | What it misses |
|---|---|---|
| Stack canary | Detects stack overflow before ret | Overwrite via non-linear write; ROP if canary leaked |
| ASLR | Randomizes code/stack addresses | Defeated by info leaks; doesn't stop reuse of known addresses |
| NX / W^X | Prevents injected code execution | Does not stop ROP (existing code is executable) |
| CFI | Restricts all indirect transfers to CFG edges | Cannot prevent attacks confined to valid CFG paths (COOP); requires precise CFG |
Key takeaways
- ROP defeats NX by chaining existing code gadgets; it works by corrupting indirect control-flow targets (return addresses, function pointers) — never injecting new code.
- CFI's core idea: compute a static control-flow graph at compile/link time, then insert runtime checks so every indirect transfer is validated against that graph before executing.
- Forward-edge CFI (Clang `-fsanitize=cfi`, Abadi label scheme) protects indirect `call`/`jmp` by ensuring the target is a member of the valid destination set for that call site.
- Backward-edge CFI via a shadow stack protects `ret` by keeping a tamper-resistant copy of each return address and comparing on return.
- Fine-grained CFI per call site is much stronger than coarse-grained; coarse-grained still leaves room for attacks using only legitimately reachable targets.
- Intel CET provides hardware shadow stack (backward-edge) and `ENDBRANCH` (forward-edge) with minimal overhead and no need to recompile legacy code for the shadow stack feature.
- CFI does not prevent all exploitation: an attacker confined entirely to valid CFG edges can still mount attacks (COOP, data-only attacks), but CFI eliminates the entire class of arbitrary code-reuse exploits that dominate real-world ROP.