AES, Encryption Modes & Hash Functions

Every time you connect to a website over HTTPS, buy something with a credit card, or unlock your phone, symmetric encryption and hashing are working behind the scenes. This module covers the two most important primitives: AES, the world-standard block cipher, and cryptographic hash functions, the workhorses of integrity and authentication. Getting either one wrong — using the wrong mode, reusing an IV, or storing passwords with MD5 — is a routine source of real-world vulnerabilities.

Background: block ciphers vs. stream ciphers

A block cipher partitions plaintext into fixed-size chunks and encrypts each chunk with a key. A stream cipher generates a pseudorandom key stream from a short key and XORs it with the plaintext byte-by-byte. AES is a block cipher; some modes of operation (CFB, CTR) turn it into a stream cipher.

AES: the Advanced Encryption Standard

DES (Data Encryption Standard, 1977) had a 56-bit key and 64-bit blocks; brute-force attacks succeeded in the 1990s. In 1997 NIST held an open competition for a replacement. The winner — Rijndael, designed by Daemen and Rijmen — became AES in 2001.

Key facts:

Property	Value
Block size	128 bits (16 bytes) — always
Key sizes	128, 192, or 256 bits
Rounds	10 (128-bit key), 12 (192-bit), 14 (256-bit)
Structure	Substitution-permutation network (not Feistel)

Each round applies four transformations to a 4×4 byte state matrix:

SubBytes — every byte is replaced by its entry in a fixed S-box (non-linear substitution that resists algebraic attacks).
ShiftRows — rows of the state matrix are cyclically shifted left by 0, 1, 2, and 3 positions, spreading bytes across columns.
MixColumns — each column is treated as a polynomial and multiplied modulo a fixed polynomial, achieving diffusion (one byte change affects the whole column).
AddRoundKey — the state is XORed with a 128-bit round key derived from the original key via the key schedule.

The final round omits MixColumns. Decryption runs all steps in reverse with inverse operations. You do not need to remember the math; what matters is that these four steps together ensure that a single-bit change in plaintext or key scrambles every output bit — the avalanche effect.

Encryption modes: handling multi-block messages

AES encrypts exactly one 16-byte block at a time. Real messages are longer. Cipher modes define how to chain block-cipher calls over a sequence of blocks.

ECB — Electronic Codebook (broken for most uses)

The simplest mode: encrypt each block independently.

C_i = E_K(P_i)

P1 ──► [AES] ──► C1
P2 ──► [AES] ──► C2      (same key every block)
P3 ──► [AES] ──► C3

Fatal flaw: identical plaintext blocks produce identical ciphertext blocks. Encrypt a bitmap image in ECB mode and the outline of the original image remains visible in the ciphertext — the famous "ECB penguin." ECB is essentially a lookup table (codebook), and any structure in the plaintext bleeds straight through into the ciphertext. Never use ECB for real data.

CBC — Cipher Block Chaining (the classic choice)

Each plaintext block is XORed with the previous ciphertext block before encryption. The first block uses an Initialization Vector (IV) in place of a prior ciphertext block.

C_i = E_K(P_i XOR C_{i-1})    where C_0 = IV

IV ──►(XOR)──► [AES] ──► C1
P1 ──►↑         ↓
               C1 ──►(XOR)──► [AES] ──► C2
        P2 ──►↑        ...

Identical plaintext blocks now produce different ciphertext blocks (because the chained ciphertext differs). Note:

Encryption is sequential (each block depends on the previous ciphertext).
Decryption can be parallelized (all ciphertext blocks are already available).

CFB — Cipher Feedback

CFB encrypts the previous ciphertext block with AES and XORs the result with the current plaintext — effectively turning AES into a stream cipher.

No padding required for the last block (output length equals input length).
Like CBC, encryption is sequential; decryption can be parallelized.

CTR — Counter Mode

A counter value (nonce + block number) is encrypted with AES to produce a key-stream block, which is then XORed with the plaintext. This is a stream-cipher construction.

Both encryption and decryption are fully parallelizable.
No padding required.
Loss of the counter/nonce uniqueness is catastrophic (see IV reuse below).

GCM — Galois/Counter Mode (authenticated encryption)

CTR mode extended with a Galois-field authentication tag. GCM provides both confidentiality and integrity in one pass. It is the standard choice in TLS 1.3. (Not covered in depth in this lecture, but know it exists and why it matters.)

Mode comparison

Mode	Padding needed	Encryption parallel	Decryption parallel	Hides patterns
ECB	Yes	Yes	Yes	No
CBC	Yes	No	Yes	Yes
CFB	No	No	Yes	Yes
CTR	No	Yes	Yes	Yes

Padding: PKCS#7

Block modes that require padding (ECB, CBC) must expand the last block to a full 16 bytes. PKCS#7 (also called PKCS#5 in many library docs) pads with N bytes each of value N. If the input is 9 bytes, 7 bytes of 0x07 are appended:

Original:  31 32 33 34 35 36 37 38 39
Padded:    31 32 33 34 35 36 37 38 39  07 07 07 07 07 07 07

The value of each padding byte tells the decryptor how many bytes to strip. If the last block is already full, a whole extra block of 0x10 (16) is added so the rule is unambiguous.

Stream-mode derivatives (CFB, CTR) produce output the same length as the input — no padding.

IV requirements and pitfalls

The IV must be:

Transmitted in plaintext alongside the ciphertext (it is not secret, just random).
Unique per encryption — reusing the same IV with the same key in CBC leaks whether two messages share a common prefix. In CTR mode, IV reuse is catastrophic: XORing two ciphertexts encrypted with the same keystream cancels the key and exposes the XOR of the two plaintexts.
Unpredictable — a predictable IV in CBC enables chosen-plaintext attacks (e.g., the BEAST attack against TLS 1.0).

Generate IVs with a cryptographically secure random number generator; never use a counter or timestamp as an IV in CBC.

Cryptographic hash functions

A hash function maps arbitrary-length input to a fixed-length output (the digest). A cryptographic hash function adds three security properties:

Property	Meaning
One-wayness (preimage resistance)	Given `h`, it is infeasible to find any `m` such that `hash(m) = h`
Second-preimage resistance	Given `m`, it is infeasible to find `m' ≠ m` with the same hash
Collision resistance	It is infeasible to find any pair `m1 ≠ m2` with `hash(m1) = hash(m2)`

Note: collision resistance implies second-preimage resistance, but not the other way around. A hash that is merely one-way is not necessarily collision-resistant.

Merkle-Damgard construction

MD5, SHA-1, and SHA-2 are built on the Merkle-Damgard construction: the message is split into fixed-size chunks, padded, and fed through a compression function iteratively:

IV ──►[h]──► intermediate ──►[h]──► ... ──►[h]──► Hash(M)
        ↑                      ↑
        M1                     M2 || ... || Mn || padding

This construction has a known weakness — length-extension attacks — which is why you should not use hash(key || message) as a MAC. Use HMAC instead.

The MD family

Algorithm	Status
MD2, MD4	Severely broken — obsolete
MD5	Collision resistance broken (practical collisions found); one-wayness is technically intact but MD5 is not safe for any security purpose
MD6	Developed in response to NIST's SHA-3 competition

The SHA family

Published by NIST:

Algorithm	Status
SHA-0	Withdrawn due to a design flaw shortly after publication
SHA-1	Collision attack demonstrated in 2017 (SHAttered); deprecated — do not use
SHA-2 (SHA-256, SHA-512, truncated variants)	Currently secure; the standard for most applications
SHA-3 (Keccak)	Released 2015; different sponge construction; no significant attack

SHA-256 (32-byte digest) and SHA-512 (64-byte digest) are the workhorses of SHA-2. Use them for new systems.

Applications of hash functions

Integrity verification

A single-bit change in the input completely changes the hash (avalanche effect). Distributors publish SHA-256 checksums of software downloads; users verify the download before trusting it. System-integrity tools like Tripwire use hashes to detect tampered files.

Password storage

Passwords must not be stored in plaintext. The correct pattern:

Generate a per-user random salt (e.g., 16 bytes).
Compute hash = KDF(password || salt) using a slow key-derivation function (bcrypt, scrypt, Argon2) or at minimum many rounds of SHA-512.
Store (algorithm_id, salt, hash) — never the password itself.

Linux stores this in /etc/shadow. An entry like $6$wDRrWCQz$... means: algorithm 6 (SHA-512), salt wDRrWCQz, then the hash.

Why salt? Without a salt, two users with the same password produce the same hash. An attacker who obtains the database can crack all identical entries at once with a single dictionary lookup (rainbow table). Unique per-user salts force the attacker to crack each entry independently.

Commitment schemes

Because hash(m) reveals nothing about m (one-wayness), you can publish a hash to commit to a secret without disclosing it. Later, revealing m lets anyone verify the commitment — useful in protocols, auctions, and blockchain applications.

MACs vs. bare hashes

A hash alone does not authenticate the sender (anyone can compute a hash). A Message Authentication Code (MAC) such as HMAC combines the hash with a shared secret key. Use HMAC when you need both integrity and authenticity.

Key takeaways

AES is a symmetric block cipher with 128-bit blocks and 128/192/256-bit keys; each round applies SubBytes, ShiftRows, MixColumns, and AddRoundKey.
ECB encrypts each block independently and is insecure for structured data — identical blocks produce identical ciphertext.
CBC chains each block through the previous ciphertext using an XOR and an IV; encryption is sequential, decryption is parallelizable.
CFB and CTR eliminate padding and can convert AES into a stream cipher; CTR is fully parallelizable.
PKCS#7 padding makes the last block a full 16 bytes; each padding byte equals the number of bytes added.
IVs must be unique, unpredictable, and non-secret; reusing an IV with the same key leaks plaintext relationships.
A cryptographic hash is one-way and collision-resistant; MD5 and SHA-1 are broken for collision resistance and must not be used.
SHA-256 / SHA-512 (SHA-2 family) are the current standards.
Passwords must be stored as hash(password || salt) with a per-user salt to defeat rainbow-table attacks; never store plaintext passwords or unsalted hashes.