Control-Flow Integrity (CFI)

Stack canaries stop a contiguous buffer overflow from silently corrupting the return address. ASLR randomizes where code lives. But Return-Oriented Programming (ROP) can survive both: it overwrites the return address anyway (a canary is bypassed when its value leaks, when the write skips over it, or when the target was built without one), and it reuses code that is already mapped (defeating ASLR once a single address leaks). The defense that directly targets the shape of execution, rather than how the attacker got there, is Control-Flow Integrity.

The threat: a recap of code-reuse attacks

Every indirect control-flow transfer in a program (indirect call/jmp through a register or memory, and ret) resolves its target at runtime. Attackers who can corrupt memory exploit this:

Function-pointer overwrite — overwriting fp before call fp redirects a forward transfer to an arbitrary gadget.
Return-address overwrite — overwriting the saved return address redirects ret, a backward transfer.

ROP strings together small existing code snippets ("gadgets") ending in ret, each gadget picking up its next target from the attacker-controlled stack. With a sufficiently rich gadget set, the result is Turing-complete computation without injecting a single byte of new code, which is why W^X / NX cannot stop it.

To chain functions with arguments the attacker constructs a fake stack:

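%esp → &fun1
         &pop_ret_gadget       ; fun1's return address: pops arg1, then rets
         arg1_for_fun1
         &fun2
         &pop_pop_ret_gadget   ; fun2's return address: pops both args, then rets
         arg1_for_fun2
         arg2_for_fun2
         ...

Under the 32-bit cdecl convention, a function entered via ret expects its own return address at the top of the stack, with its arguments directly above it. When fun1 returns, it lands in the pop; ret gadget, which pops arg1 off the stack so that the gadget's ret picks up &fun2. Using one pop per argument keeps %esp lined up with the next function address, enabling arbitrary multi-function chains.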

The CFI idea

The control-flow graph (CFG) of a program is a static representation where:

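Each node is a basic block: a maximal straight-line sequence of instructions that control can enter only at its first instruction and leave only at its last.
Each edge represents a possible control transfer (jump, call, return, or fall-through) between basic blocks.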

CFI enforcement rule: at every indirect transfer instruction, the actual runtime target must be a node that the static CFG says is a legal successor from that instruction. If the check fails, the program aborts.

Because ROP chains jump to arbitrary locations (middle of functions, gadget addresses that were never call targets), they violate the CFG and are caught.
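The enforcement rule can be modeled as a set-membership test. The following is a minimal sketch, not how any particular compiler implements it: the cfg_entry type, the cfi_transfer_allowed function, and the use of plain integers as addresses are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: one indirect-transfer site and the successor
 * addresses that the static CFG permits for it. */
typedef struct {
    uintptr_t site;            /* address of the indirect call/jmp/ret */
    const uintptr_t *targets;  /* legal successors according to the CFG */
    size_t n_targets;
} cfg_entry;

/* Returns true only if `target` is a legal successor of `site`.
 * Unknown sites are rejected, so the policy fails closed. */
static bool cfi_transfer_allowed(const cfg_entry *cfg, size_t n_sites,
                                 uintptr_t site, uintptr_t target)
{
    for (size_t i = 0; i < n_sites; i++) {
        if (cfg[i].site != site)
            continue;
        for (size_t j = 0; j < cfg[i].n_targets; j++)
            if (cfg[i].targets[j] == target)
                return true;
        return false;  /* known site, illegal target */
    }
    return false;      /* site not in the CFG at all */
}
```

A real implementation would abort the program where this sketch returns false, and would encode the target sets far more compactly (labels, bitmaps, or jump tables) than a linear search.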

Forward-edge CFI: protecting indirect calls and jumps

A forward-edge transfer is any call or jmp whose target is computed at runtime — including calls through function pointers and C++ virtual dispatch through vtables. These are the transfers attackers use to invoke arbitrary library functions or gadgets.

Label-based implementation (Abadi et al., TISSEC 2009)

The classic approach instruments every indirect jmp/call to check that the destination holds a known CFI label. For example, a jmp ecx becomes:

; before jump (source side)
cmp  [ecx], 12345678h   ; does destination start with our label?
jne  error_label        ; if not, abort
lea  ecx, [ecx+4]       ; skip past the label in the destination
jmp  ecx                ; jump to actual code

; at each valid destination (destination side)
78 56 34 12             ; label data 12345678h
mov eax, [esp+4]        ; real first instruction
...

Labels are chosen so that they cannot be interpreted as harmful instructions and do not appear by accident elsewhere in the code. Call sites with disjoint target sets get different label IDs. Call sites whose target sets overlap must share one label, so precision is per equivalence class of destinations rather than strictly per call site.
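The source-side check can be simulated in C over a byte array standing in for the code segment. This is a sketch under stated assumptions: cfi_checked_target and CFI_LABEL are hypothetical names, and destinations are modeled as offsets into the array rather than real code addresses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CFI_LABEL 0x12345678u  /* the label value used in the listing above */

/* Returns the offset of the first real instruction if `dest` starts with
 * `label`, or -1 if the check fails (where real CFI would abort). */
static long cfi_checked_target(const uint8_t *code, size_t code_len,
                               size_t dest, uint32_t label)
{
    uint32_t found;
    if (dest > code_len || code_len - dest < sizeof found)
        return -1;                               /* destination out of bounds */
    memcpy(&found, code + dest, sizeof found);   /* cmp [ecx], label */
    if (found != label)
        return -1;                               /* jne error_label */
    return (long)(dest + sizeof found);          /* lea ecx, [ecx+4]; jmp ecx */
}
```

A destination that points into the middle of a function, or at a function carrying a different label, fails the comparison, which is what stops a corrupted pointer from reaching an arbitrary gadget.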

Clang/LLVM forward-edge CFI (Google)

Clang's -fsanitize=cfi implements forward-edge CFI for indirect function calls (including C++ vtable calls) and requires link-time optimization (-flto) so that the compiler can see every valid target. The mechanism:

Build an indirect jump table (f_JT) of size 2^n covering all valid targets for a class of call sites.
At each indirect call (call *%rax), transform the call: check that %rax falls within the table range, then redirect through the table entry to the real function.

call *%rax           →    <check %rax is in valid set>
                          call *%rax  (via jump table slot)

Only addresses that appear in the jump table are reachable. A corrupted pointer that points outside the table causes the check to fail.
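The range check can be sketched in C by modeling the pointer as a byte offset from the start of the table. This is an illustrative model only, not Clang's generated code: the slot size, the table contents, and the names checked_call, add_one, and twice are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define JT_SLOT_SIZE 8u  /* assumed bytes per jump-table slot */
#define JT_SLOTS     4u  /* 2^n slots, here n = 2 */

typedef int (*target_fn)(int);

static int add_one(int x) { return x + 1; }
static int twice(int x)   { return 2 * x; }

/* Unused padding slots stay NULL and are treated as invalid. */
static const target_fn f_JT[JT_SLOTS] = { add_one, twice, NULL, NULL };

/* `offset` models (%rax - table base). The call goes through only if the
 * offset is inside the table, slot aligned, and names a populated slot. */
static bool checked_call(uintptr_t offset, int arg, int *result)
{
    if (offset >= JT_SLOTS * JT_SLOT_SIZE)   /* outside the table */
        return false;
    if (offset % JT_SLOT_SIZE != 0)          /* middle of a slot */
        return false;
    target_fn fn = f_JT[offset / JT_SLOT_SIZE];
    if (fn == NULL)                          /* padding slot */
        return false;
    *result = fn(arg);
    return true;
}
```

Because the table size is a power of two and slots have a fixed size, a compiler can implement both tests with a subtraction, a mask or rotate, and one compare, which keeps the per-call cost low.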

Backward-edge CFI: protecting returns

A backward-edge transfer is ret, which pops the return address from the stack. ROP lives here. Protecting returns requires ensuring that every ret goes back to the instruction following the call that invoked the current function.

Shadow stack

The canonical solution is a shadow stack: a second, separate stack that stores a copy of each return address when a function is called, and checks the real stack's return address against the shadow copy when the function returns.

call foo        →    push return_addr onto shadow stack
                     call foo

ret             →    pop shadow copy
                     compare [esp] vs shadow copy
                     if mismatch → abort
                     ret normally

Because the shadow stack is stored in a memory region the attacker cannot reach by overflowing normal stack buffers (it is in a separate allocation or register-protected region), corrupting the regular stack's return address is detected the moment ret executes.
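The push and compare steps can be simulated with a small array-backed stack. This is a minimal sketch: the shadow_stack type, the fixed depth, and the function names are assumptions, and a real shadow stack lives in protected memory rather than an ordinary struct.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 64  /* assumed maximum call depth for this model */

typedef struct {
    uintptr_t slots[SHADOW_DEPTH];
    size_t top;  /* number of saved return addresses */
} shadow_stack;

/* Call-side instrumentation: save a copy of the return address. */
static bool shadow_push(shadow_stack *ss, uintptr_t ret_addr)
{
    if (ss->top == SHADOW_DEPTH)
        return false;               /* shadow stack exhausted */
    ss->slots[ss->top++] = ret_addr;
    return true;
}

/* Return-side instrumentation: pop the copy and compare it with the
 * return address found on the regular stack. false means "abort". */
static bool shadow_check_ret(shadow_stack *ss, uintptr_t ret_on_stack)
{
    if (ss->top == 0)
        return false;               /* ret without a matching call */
    return ss->slots[--ss->top] == ret_on_stack;
}
```

An overflow that rewrites the return address on the regular stack leaves the shadow copy untouched, so shadow_check_ret reports the mismatch before the corrupted address is used.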

Code Pointer Integrity (CPI) / Code Pointer Separation (CPS)

Full software-based CFI carries performance overhead. CPI (Kuznetsov et al., OSDI 2014) takes a targeted approach: instead of checking all control transfers, it protects sensitive code pointers (return addresses, function pointers) by moving them into a separate safe region that only provably safe memory accesses can reach. CPS is the lighter-weight variant that protects code pointers themselves but not the pointers that lead to them. The safe stack applies the same idea to stack frames: the regular stack holds buffers (which can overflow) but not the return address; the safe stack holds the return address, which is therefore never adjacent to a buffer.

foo() regular stack:    foo() safe stack:
  buf[16]                 r (local variable, pointer)
  (nothing sensitive)     ret address   ← protected
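The effect of the split can be modeled by replaying the same oversized copy against both layouts. This is a simulation under stated assumptions, not the compiler transformation itself: the stack_region type and the function names are hypothetical, and byte arrays stand in for real stack frames.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { BUF_SIZE = 16 };

/* A stack region big enough for buf[16] plus one pointer-sized slot. */
typedef struct {
    unsigned char bytes[BUF_SIZE + sizeof(uintptr_t)];
} stack_region;

/* Models an unchecked copy into buf at offset 0. The length is clamped to
 * the region size only so that the model itself stays well defined. */
static void copy_into_buf(stack_region *r, const void *src, size_t len)
{
    if (len > sizeof r->bytes)
        len = sizeof r->bytes;
    memcpy(r->bytes, src, len);
}

/* Conventional layout: the return address sits right above buf. */
static uintptr_t ret_after_copy_unified(uintptr_t ret, const void *src, size_t len)
{
    stack_region frame = {{0}};
    memcpy(frame.bytes + BUF_SIZE, &ret, sizeof ret);
    copy_into_buf(&frame, src, len);
    memcpy(&ret, frame.bytes + BUF_SIZE, sizeof ret);
    return ret;
}

/* Split layout: buf is on the regular stack, the return address is not. */
static uintptr_t ret_after_copy_split(uintptr_t ret, const void *src, size_t len)
{
    stack_region regular = {{0}};  /* buf plus non-sensitive data */
    uintptr_t safe_ret = ret;      /* stands in for the safe-stack slot */
    copy_into_buf(&regular, src, len);
    return safe_ret;
}
```

With a payload longer than buf, the unified layout returns a clobbered address while the split layout returns the original one: the overflow still happens, but nothing sensitive is within its reach.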

Coarse vs. fine-grained CFI

Not all CFI schemes enforce the same precision:

Granularity	What is allowed at each indirect transfer	Attacker's room
Fine-grained	Only the specific targets the CFG says are valid for this call site	Very narrow
Coarse-grained	Any function entry point (or any return site) system-wide	Wide — COOP/ROP over valid targets still possible

Coarse-grained CFI (e.g., "return anywhere that looks like a return site") is much cheaper to implement but leaves substantial attack surface. Fine-grained CFI, whether through type-based checks (Clang's approach) or per-target-set labels (the original Abadi scheme), dramatically reduces the valid target set, making practical exploitation far harder.
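The difference between the two policies can be made concrete with a toy registry of function entry points. This is a sketch only: the fn_entry type, the signature strings, and the function names in the usage note are hypothetical and stand in for whatever type information a real compiler would use.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry of function entry points and their type strings. */
typedef struct {
    const char *name;
    const char *signature;
} fn_entry;

static const fn_entry *find_entry(const fn_entry *fns, size_t n,
                                  const char *target)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(fns[i].name, target) == 0)
            return &fns[i];
    return NULL;
}

/* Coarse-grained policy: any known function entry point is acceptable. */
static bool coarse_allows(const fn_entry *fns, size_t n, const char *target)
{
    return find_entry(fns, n, target) != NULL;
}

/* Fine-grained (type-based) policy: the target must also have the
 * signature that the call site expects. */
static bool fine_allows(const fn_entry *fns, size_t n, const char *target,
                        const char *site_signature)
{
    const fn_entry *e = find_entry(fns, n, target);
    return e != NULL && strcmp(e->signature, site_signature) == 0;
}
```

For a call site that expects int(int), a registry containing square, negate, and run_shell (taking a string) lets the coarse policy accept all three, while the fine policy rejects run_shell. It still accepts negate where square was intended, which is the residual room that attacks over valid targets exploit.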

Hardware-assisted CFI: Intel CET

Software CFI instruments code, which adds runtime overhead and usually requires recompilation or binary rewriting. Intel Control-flow Enforcement Technology (CET), available in recent Intel CPUs (Tiger Lake and later), builds CFI primitives into the processor itself:

CET feature	What it does	Edge protected
Shadow Stack (SS)	Hardware-maintained second stack; `CALL` pushes to it, `RET` compares against it and raises `#CP` on mismatch	Backward (returns)
Indirect Branch Tracking (IBT), via `ENDBR32`/`ENDBR64`	Valid indirect-call/jump targets must begin with an ENDBRANCH instruction; the CPU raises `#CP` if an indirect transfer lands elsewhere	Forward (indirect calls/jumps)

Key properties of CET:

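In principle, the shadow stack needs no recompilation of legacy code: the hardware does the bookkeeping on every CALL and RET once the OS enables the feature. In practice, operating systems typically enable it per binary (Linux relies on a marker in the ELF file), and code that manipulates return addresses directly may need changes.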
ENDBRANCH-based forward-edge protection does require binaries to be compiled with CET support (so valid function entries contain the instruction).
Linux exposes CET status via /proc/cpuinfo (the ibt flag, and shstk or user_shstk depending on the kernel version); a small C program (cet-support.c) can query CPUID to confirm availability.
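The source does not show cet-support.c, so the following is only a sketch of what such a check might look like, assuming GCC or Clang and their <cpuid.h> helper. The cet_caps type and the cet_decode and cet_query functions are hypothetical names. The bit positions are those of CPUID leaf 7, subleaf 0: ECX bit 7 for shadow stack (CET_SS) and EDX bit 20 for indirect branch tracking (CET_IBT).

```c
#include <stdbool.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  /* GCC/Clang helper header (assumed toolchain) */
#endif

typedef struct {
    bool shadow_stack;  /* CET_SS:  CPUID.(EAX=7,ECX=0):ECX bit 7  */
    bool ibt;           /* CET_IBT: CPUID.(EAX=7,ECX=0):EDX bit 20 */
} cet_caps;

/* Pure decoding step, separated out so it can be checked without CET hardware. */
static cet_caps cet_decode(uint32_t ecx, uint32_t edx)
{
    cet_caps c;
    c.shadow_stack = ((ecx >> 7) & 1u) != 0;
    c.ibt = ((edx >> 20) & 1u) != 0;
    return c;
}

/* Queries the running CPU. Reports no support on non-x86 builds or when
 * CPUID leaf 7 is unavailable. */
static cet_caps cet_query(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return cet_decode(ecx, edx);
#endif
    return cet_decode(0, 0);
}
```

CPUID only reports what the processor can do. Whether a given process actually runs with the shadow stack or IBT enabled still depends on the OS and on how the binary was built.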

CFI in context: other defenses

Defense	What it addresses	What it misses
Stack canary	Detects stack overflow before ret	Overwrite via non-linear write; ROP if canary leaked
ASLR	Randomizes code/stack addresses	Defeated by info leaks; doesn't stop reuse of known addresses
NX / W^X	Prevents injected code execution	Does not stop ROP (existing code is executable)
CFI	Restricts all indirect transfers to CFG edges	Cannot prevent attacks confined to valid CFG paths (COOP); requires precise CFG

Key takeaways

ROP defeats NX by chaining existing code gadgets; it works by corrupting indirect control-flow targets (return addresses, function pointers) — never injecting new code.
CFI's core idea: compute a static control-flow graph at compile/link time, then insert runtime checks so every indirect transfer is validated against that graph before executing.
Forward-edge CFI (Clang -fsanitize=cfi, Abadi label scheme) protects indirect call/jmp by ensuring the target is a member of the valid destination set for that call site.
Backward-edge CFI via a shadow stack protects ret by keeping a tamper-resistant copy of each return address and comparing on return.
Fine-grained CFI per call site is much stronger than coarse-grained; coarse-grained still leaves room for attacks using only legitimately reachable targets.
Intel CET provides a hardware shadow stack (backward-edge) and ENDBRANCH-based indirect branch tracking (forward-edge) with minimal overhead; the shadow stack in principle needs no recompilation of legacy code.
CFI does not prevent all exploitation: an attacker confined entirely to valid CFG edges can still mount attacks (COOP, data-only attacks). What precise CFI does remove is the ability to redirect control to arbitrary gadget addresses, which is the basis of most real-world ROP.