Control-Flow Integrity (CFI)

Stack canaries detect a linear buffer overflow before the corrupted return address is used. ASLR randomizes where code lives. But Return-Oriented Programming (ROP) can survive both: it still overwrites the return address (a canary catches this only if one is present and has not been leaked or bypassed), and it reuses code that is already mapped (ASLR stops helping once a single address leaks). The defense that directly targets the shape of execution, rather than how the attacker got there, is Control-Flow Integrity.

The threat: code-reuse attacks recap

Every indirect control-flow transfer in a program (indirect call/jmp through a register or memory, and ret) resolves its target at runtime. Attackers who can corrupt memory exploit this:

ROP strings together small existing code snippets ("gadgets") ending in ret, each gadget picking up its next target from the attacker-controlled stack. The result is Turing-complete computation without injecting a single byte of new code — which is why W^X / NX cannot stop it.

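To chain functions with arguments, the attacker constructs a fake stack. Under the 32-bit cdecl convention, a function expects its return address at the top of the stack on entry and its arguments directly above it, so each function address is followed first by a return target and then by the arguments:

%esp → &fun1()
         &pop_ret_gadget       ; return address for fun1: pops arg1, then ret
         arg1_for_fun1
         &fun2()
         &pop_pop_ret_gadget   ; return address for fun2: pops both args, then ret
         arg1_for_fun2
         arg2_for_fun2
         ...

When fun1 returns, it lands in the pop; ret gadget, which advances %esp past arg1 so that &fun2() sits at the top of the stack for the gadget's own ret. Repeating this pattern enables arbitrary multi-function chains.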
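The way a chain like this consumes the fake stack can be replayed in a toy model. The sketch below is not an exploit and does not model a real CPU: the addresses, fun1/fun2, the gadget constants, and the EXIT sentinel are all hypothetical. It assumes 32-bit cdecl, where the word after a function's address acts as that function's return address and the arguments follow it.

```python
# Toy model of a fake stack under 32-bit cdecl. The "addresses", fun1/fun2,
# and the gadget names are hypothetical constants, not real code locations.
FUN1, FUN2 = 0x1000, 0x2000
POP_RET, POP_POP_RET = 0x3000, 0x3100
EXIT = 0xFFFF  # sentinel that ends the simulation


def run_chain(stack):
    """Replay a fake stack and return the calls it triggers, in order."""
    calls = []
    esp = 0
    eip, esp = stack[esp], esp + 1  # the hijacked `ret` pops the first word
    while eip != EXIT:
        if eip == FUN1:
            # On entry, stack[esp] is the return address, stack[esp + 1] is arg1.
            calls.append(("fun1", stack[esp + 1]))
        elif eip == FUN2:
            calls.append(("fun2", stack[esp + 1], stack[esp + 2]))
        elif eip == POP_RET:
            esp += 1  # pop: discard one argument word
        elif eip == POP_POP_RET:
            esp += 2  # pop; pop: discard two argument words
        else:
            raise ValueError(f"unmapped address {eip:#x}")
        eip, esp = stack[esp], esp + 1  # every snippet ends in `ret`
    return calls


fake_stack = [FUN1, POP_RET, 0xA, FUN2, POP_POP_RET, 0xB, 0xC, EXIT]
print(run_chain(fake_stack))
```

No new code is injected at any point: the attacker supplies only data (addresses and arguments), and every executed snippet already exists in the program.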

The CFI idea

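The control-flow graph (CFG) of a program is a static representation in which nodes are basic blocks (straight-line instruction sequences) and edges are the control transfers that may legally occur between them.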

CFI enforcement rule: at every indirect transfer instruction, the actual runtime target must be a node that the static CFG says is a legal successor from that instruction. If the check fails, the program aborts.

Because ROP chains typically jump to locations that are not legal successors (the middle of a function, or a gadget address that was never a call target or return site), they violate the CFG and are caught.
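The enforcement rule amounts to a set-membership test at each indirect transfer. The following is a minimal sketch under stated assumptions: the site names, target names, and the CFG dictionary are invented for illustration, and a real implementation performs the check inline in machine code rather than through a lookup function.

```python
# Hypothetical static CFG: each indirect-transfer site maps to the set of
# targets that the analysis considers legal successors.
CFG = {
    "call_site_A": {"parse_int", "parse_float"},
    "ret_in_parse_int": {"after_call_site_A"},
}


class CFIViolation(Exception):
    """Raised where a real implementation would abort the program."""


def checked_transfer(site, target):
    """Return the target if the CFG allows site -> target, else abort."""
    if target not in CFG.get(site, set()):
        raise CFIViolation(f"{site} -> {target} is not a CFG edge")
    return target


print(checked_transfer("call_site_A", "parse_int"))
```

A transfer from call_site_A to a gadget in the middle of some other function is not in the set, so checked_transfer raises instead of continuing.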

Forward-edge CFI: protecting indirect calls and jumps

A forward-edge transfer is any call or jmp whose target is computed at runtime — including calls through function pointers and C++ virtual dispatch through vtables. These are the transfers attackers use to invoke arbitrary library functions or gadgets.

Label-based implementation (Abadi et al., TISSEC 2009)

The classic approach instruments every indirect jmp/call to check that the destination holds a known CFI label. For example, a jmp ecx becomes:

; before jump (source side)
cmp  [ecx], 12345678h   ; does destination start with our label?
jne  error_label        ; if not, abort
lea  ecx, [ecx+4]       ; skip past the label in the destination
jmp  ecx                ; jump to actual code

; at each valid destination (destination side)
78 56 34 12             ; label data 12345678h
mov eax, [esp+4]        ; real first instruction
...

Labels are chosen so they are not interpretable as harmful instructions and do not appear accidentally elsewhere in code. Call sites whose target sets are disjoint get different label IDs. Call sites that share any destination must share one label, so precision is per equivalence class of targets rather than strictly per call site.
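The source-side check can be mimicked over a toy memory map. This sketch is an illustration only: memory is a Python dictionary, the addresses are made up, and instructions are stored as strings. It mirrors the cmp / jne / lea / jmp sequence above, not the actual rewriting tool from the paper.

```python
LABEL = 0x12345678  # label ID expected by this particular jump site

# Hypothetical memory: a labeled destination at 0x4000 and an unlabeled
# gadget at 0x5000.
memory = {
    0x4000: LABEL,               # label data placed before the real code
    0x4004: "mov eax, [esp+4]",  # real first instruction
    0x5000: "pop ebx",           # gadget, never a legal jump target
    0x5001: "ret",
}


class CFIViolation(Exception):
    """Stands in for the jump to error_label."""


def checked_jmp(ecx, expected_label=LABEL):
    """Model of: cmp [ecx], label / jne error / lea ecx, [ecx+4] / jmp ecx."""
    if memory.get(ecx) != expected_label:
        raise CFIViolation(f"no label {expected_label:#x} at {ecx:#x}")
    return ecx + 4  # address of the first real instruction


print(hex(checked_jmp(0x4000)))
```

A jump aimed at 0x5000 fails the comparison because the word stored there is not the label, so the gadget is never reached.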

Clang/LLVM forward-edge CFI (Google)

Clang's -fsanitize=cfi implements forward-edge CFI for indirect function calls (including C++ vtable calls). The mechanism:

  1. Build an indirect jump table (f_JT) of size 2^n covering all valid targets for a class of call sites.
  2. At each indirect call (call *%rax), transform the call: check that %rax falls within the table range, then redirect through the table entry to the real function.

call *%rax           →    <check %rax is in valid set>
                          call *%rax  (via jump table slot)

Only addresses that appear in the jump table are reachable. A corrupted pointer that points outside the table causes the check to fail.
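A range-plus-alignment test over a padded table captures the idea. This is a sketch under assumptions, not the code Clang emits: the table base, the slot size, the handler names, and the use of a "trap" entry for padding slots are all hypothetical choices made for illustration.

```python
ENTRY_SIZE = 8          # bytes per jump-table slot (assumed)
TABLE_BASE = 0x400000   # hypothetical address of the table
valid_targets = ["handler_a", "handler_b", "handler_c"]

# Pad the table to the next power of two; padding slots lead to a trap.
size = 1
while size < len(valid_targets):
    size *= 2
jump_table = valid_targets + ["trap"] * (size - len(valid_targets))


class CFIViolation(Exception):
    """Raised where instrumented code would abort."""


def checked_call(ptr):
    """Allow the call only if ptr is an aligned slot inside the table."""
    offset = ptr - TABLE_BASE
    in_range = 0 <= offset < size * ENTRY_SIZE
    if not in_range or offset % ENTRY_SIZE != 0:
        raise CFIViolation(f"{ptr:#x} is not a jump-table slot")
    target = jump_table[offset // ENTRY_SIZE]
    if target == "trap":
        raise CFIViolation(f"{ptr:#x} is a padding slot")
    return target


print(checked_call(TABLE_BASE + ENTRY_SIZE))
```

Because legitimate function pointers in this model are slot addresses, a pointer corrupted to refer to arbitrary code lies outside the table or is misaligned, and the check rejects it.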

Backward-edge CFI: protecting returns

A backward-edge transfer is ret, which pops the return address from the stack. ROP lives here. Protecting returns requires ensuring every ret goes back to the actual call site that invoked the current function.

Shadow stack

The canonical solution is a shadow stack: a second, separate stack that stores a copy of each return address when a function is called, and checks the real stack's return address against the shadow copy when the function returns.

call foo        →    push return_addr onto shadow stack
                     call foo

ret             →    pop shadow copy
                     compare [esp] vs shadow copy
                     if mismatch → abort
                     ret normally

Because the shadow stack is stored in a memory region the attacker cannot reach by overflowing normal stack buffers (it is in a separate allocation or register-protected region), corrupting the regular stack's return address is detected the moment ret executes.
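The call and return instrumentation can be modeled with two lists. This is a minimal sketch: the class name, the addresses, and the use of an exception in place of an abort are assumptions for illustration, and the model simply treats the shadow list as unreachable by the attacker.

```python
class CFIViolation(Exception):
    """Raised where the instrumented program would abort."""


class ShadowStackMachine:
    def __init__(self):
        self.stack = []   # regular stack: attacker can corrupt it
        self.shadow = []  # shadow stack: assumed out of the attacker's reach

    def call(self, return_addr):
        # Instrumented call: record the return address in both places.
        self.stack.append(return_addr)
        self.shadow.append(return_addr)

    def ret(self):
        # Instrumented ret: compare the regular copy with the shadow copy.
        actual = self.stack.pop()
        expected = self.shadow.pop()
        if actual != expected:
            raise CFIViolation(f"return to {actual:#x}, expected {expected:#x}")
        return actual


m = ShadowStackMachine()
m.call(0x401000)
m.call(0x402000)
print(hex(m.ret()), hex(m.ret()))
```

Overwriting the top entry of m.stack between a call and its ret (the effect of a stack buffer overflow) makes the comparison fail at the ret.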

Code Pointer Integrity (CPI) / Code Pointer Separation (CPS)

Full software-based CFI carries performance overhead. CPI (Kuznetsov et al., OSDI 2014) takes a targeted approach: instead of checking all control transfers, it moves sensitive code pointers (return addresses, function pointers) into protected memory that is only reachable through safe accesses. For stack data this means a safe stack: the regular stack holds buffers (which can overflow) but not the return address, and the safe stack holds the return address (which is therefore never adjacent to a buffer).

foo() regular stack:    foo() safe stack:
  buf[16]                 r (local variable, pointer)
  (nothing sensitive)     ret address   ← protected
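The effect of the split can be seen by running the same overflow against two frame layouts. The sketch below is a simplified model, not the CPI implementation: the frame sizes, the return address 0x401234, and both function names are made up for illustration.

```python
RET_ADDR = 0x401234  # hypothetical legitimate return address


def conventional_frame(payload):
    """buf[16] and the return address share one contiguous frame."""
    frame = bytearray(16) + RET_ADDR.to_bytes(4, "little")
    frame[0:len(payload)] = payload  # unchecked copy into buf
    return int.from_bytes(frame[16:20], "little")


def split_frame(payload):
    """buf[16] lives on the regular stack, the return address on the safe stack."""
    regular_stack = bytearray(16)
    safe_stack = [RET_ADDR]
    regular_stack[0:len(payload)] = payload  # same unchecked copy
    return safe_stack[-1]


overflow = b"A" * 16 + (0xDEADBEEF).to_bytes(4, "little")
print(hex(conventional_frame(overflow)), hex(split_frame(overflow)))
```

In the conventional layout the four bytes after the buffer replace the return address. In the split layout the same bytes spill into memory that holds nothing sensitive.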

Coarse vs. fine-grained CFI

Not all CFI schemes enforce the same precision:

| Granularity | What is allowed at each indirect transfer | Attacker's room |
| --- | --- | --- |
| Fine-grained | Only the specific targets the CFG says are valid for this call site | Very narrow |
| Coarse-grained | Any function entry point (or any return site) system-wide | Wide: COOP/ROP over valid targets still possible |

Coarse-grained CFI (e.g., "return anywhere that looks like a return site") is much cheaper to implement but leaves substantial attack surface. Fine-grained CFI with per-call-site type checking (Clang's approach, or the original Abadi label scheme) dramatically reduces the valid target set, making practical exploitation far harder.
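The difference in precision shows up as the size of the allowed-target set for one call site. In this sketch the function names, signatures, and the per-site target set are invented. The type-based policy is a simplified stand-in for signature matching, not a description of any particular compiler's check.

```python
# Hypothetical program: function name -> signature.
FUNCTIONS = {
    "open_file": "int(char*)",
    "remove_file": "int(char*)",
    "system": "int(char*)",
    "close_file": "int(int)",
}

# Hypothetical CFG result for one call site that calls through an
# int(char*) function pointer.
CFG_TARGETS = {"site_1": {"open_file", "remove_file"}}


def coarse_allowed(site):
    """Coarse-grained: any function entry point is a legal target."""
    return set(FUNCTIONS)


def type_allowed(signature):
    """Type-based: any function whose signature matches the pointer type."""
    return {name for name, sig in FUNCTIONS.items() if sig == signature}


def site_allowed(site):
    """Per-call-site: only the targets the CFG lists for this site."""
    return CFG_TARGETS[site]


for policy in (coarse_allowed("site_1"), type_allowed("int(char*)"), site_allowed("site_1")):
    print(sorted(policy))
```

Here system remains reachable under the coarse policy and, because it shares a signature, under the type-based one. Only the per-call-site set excludes it, which is the residual attack surface the table above describes.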

Hardware-assisted CFI: Intel CET

Software CFI instruments every indirect transfer, which adds runtime overhead and typically requires recompilation. Intel Control-flow Enforcement Technology (CET), available in recent Intel CPUs (Tiger Lake and later), builds CFI primitives into the hardware:

| CET feature | What it does | Edge protected |
| --- | --- | --- |
| Shadow Stack (SS) | Hardware-maintained second stack; CALL pushes to it, RET compares against it and raises #CP on mismatch | Backward (returns) |
| ENDBRANCH (ENDBR32/ENDBR64) | Valid indirect-call/jump targets must begin with this instruction; the CPU faults if an indirect transfer lands elsewhere | Forward (indirect calls/jumps) |
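The forward-edge rule can be expressed as a check on the first instruction at the branch target. This is a toy model rather than a description of the CPU's internals: the code map, the addresses, and the exception name are hypothetical, and instructions are represented as strings.

```python
# Hypothetical code layout: a function that starts with endbr64 and a
# gadget in the middle of other code.
code = {
    0x1000: "endbr64",   # legitimate function entry
    0x1004: "push rbp",
    0x2000: "pop rdi",   # gadget: not an ENDBRANCH landing pad
    0x2001: "ret",
}


class ControlProtectionFault(Exception):
    """Stands in for the #CP exception."""


def indirect_branch(target):
    """An indirect call/jmp must land on endbr32 or endbr64."""
    if code.get(target) not in ("endbr32", "endbr64"):
        raise ControlProtectionFault(f"no ENDBRANCH at {target:#x}")
    return target


print(hex(indirect_branch(0x1000)))
```

An indirect jump to 0x2000 faults because the instruction there is not an ENDBRANCH. Note that this is a coarse policy: any location that starts with ENDBRANCH is accepted, whichever site the branch came from.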

Key properties of CET:

CFI in context: other defenses

| Defense | What it addresses | What it misses |
| --- | --- | --- |
| Stack canary | Detects stack overflow before ret | Overwrite via non-linear write; ROP if canary leaked |
| ASLR | Randomizes code/stack addresses | Defeated by info leaks; doesn't stop reuse of known addresses |
| NX / W^X | Prevents injected code execution | Does not stop ROP (existing code is executable) |
| CFI | Restricts all indirect transfers to CFG edges | Cannot prevent attacks confined to valid CFG paths (COOP); requires precise CFG |

Key takeaways

Practice

  1. Why does NX (no-execute / W^X) fail to stop Return-Oriented Programming?
  2. In a control-flow graph (CFG) used by CFI, what does a node represent?
  3. Which of the following best describes the difference between forward-edge and backward-edge control-flow transfers?
  4. In the Abadi et al. label-based CFI implementation, what does the instrumented code check immediately before executing an indirect jmp ecx?
  5. Clang's -fsanitize=cfi enforces forward-edge CFI using a jump table. What is the security property this jump table enforces?
  6. A shadow stack protects return addresses by:
  7. Intel CET provides two CFI mechanisms: a hardware shadow stack and the ENDBRANCH instruction. Which edges do they protect, respectively?
  8. Compared to fine-grained CFI, coarse-grained CFI is weaker because:
  9. Explain how Code Pointer Integrity (CPI) reduces the overhead of full CFI while still protecting against control-flow hijacking. What is the key idea behind separating the 'safe stack' from the regular stack?