PKI & Authentication

Every time your browser visits https://bank.com, two things must be true: the traffic is encrypted and you are talking to the real bank, not an imposter. Public Key Infrastructure (PKI) is the machinery that makes the second guarantee possible. Without it, public-key cryptography is wide open to a devastating man-in-the-middle (MITM) attack.

The Key-Authenticity Problem

In a basic public-key exchange, Alice sends her public key to Bob. Nothing stops Mallory from intercepting that transmission and substituting her own public key. Bob now encrypts data that only Mallory can read; Mallory decrypts it, re-encrypts with Alice's real key, and forwards it — neither party notices the interception.

Alice ──(Alice's pub key)──► Mallory ──(Mallory's pub key)──► Bob
Alice ◄──[secret data]◄──── Mallory ◄──[secret data]◄─────── Bob

Fundamental problem: Bob cannot tell whether the public key he received actually belongs to Alice.

Solution: Involve a trusted third party that binds an identity to a public key in a tamper-evident document called a digital certificate, and signs that document with its own private key. Because nobody without that private key can forge a valid signature, Bob can verify the certificate and learn the true owner of the key, provided he already has an authentic copy of the third party's public key.

Digital Certificates and X.509

A digital certificate (standardized as X.509) is a data structure containing:

Field	Purpose
Subject	Identity being certified (e.g., `CN=www.paypal.com`)
Subject Public Key	The public key that belongs to the subject
Issuer	The CA that signed this certificate
Validity Period	Not Before / Not After dates
Serial Number	Unique identifier assigned by the CA
Signature	CA's digital signature over all the above fields

The CA signs the certificate with its private key. Anyone who has the CA's public key can verify the signature, confirming that the binding of identity to public key has not been tampered with.

openssl s_client -connect www.paypal.com:443 </dev/null 2>/dev/null | openssl x509 -out paypal.pem     # save the server's certificate
openssl x509 -in paypal.pem -text -noout     # decode the certificate
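
The sign-then-verify step described above can be sketched in a few lines of Python. This is a toy under loud assumptions: it uses textbook RSA with two small Mersenne primes and no padding, so it is insecure and only illustrates the idea. The field names, the `sign_cert` and `verify_cert` helpers, and the placeholder key strings are hypothetical; real certificates are DER-encoded and signed with standardized signature schemes.

```python
import hashlib

# Toy CA key pair: textbook RSA with small Mersenne primes (insecure, illustration only).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # CA private exponent (needs Python 3.8+)


def digest(cert_fields: dict) -> int:
    """Hash the certificate fields and reduce the digest into the range of the modulus."""
    encoded = "|".join(f"{k}={cert_fields[k]}" for k in sorted(cert_fields)).encode()
    return int.from_bytes(hashlib.sha256(encoded).digest(), "big") % N


def sign_cert(cert_fields: dict) -> int:
    """CA side: sign the digest with the private exponent."""
    return pow(digest(cert_fields), D, N)


def verify_cert(cert_fields: dict, signature: int) -> bool:
    """Client side: only the CA's public key (N, E) is needed."""
    return pow(signature, E, N) == digest(cert_fields)


cert = {"subject": "CN=www.example.com", "public_key": "alice-public-key", "issuer": "CN=Toy CA"}
sig = sign_cert(cert)

# Mallory swaps in her own key but cannot produce a matching signature.
tampered = dict(cert, public_key="mallory-public-key")

print(verify_cert(cert, sig))      # genuine certificate
print(verify_cert(tampered, sig))  # certificate with a substituted key
```

The genuine certificate should verify and the tampered one should not, because changing any field changes the digest that the signature was computed over.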

Certificate Authorities and the Chain of Trust

A Certificate Authority (CA) is the trusted party that:

Verifies that the applicant controls the domain or identity claimed in the certificate request (Certificate Signing Request, or CSR).
Signs the certificate with its own private key.

CAs are organized in a hierarchy:

Root CA  (self-signed)
  ├── Intermediate CA 1
  │     └── Sub CA ── Domain Owner 1
  └── Intermediate CA 2
        ├── Sub CA 1 ── Domain Owner 2
        └── Sub CA 2 ── Domain Owner 3

Root CAs hold self-signed certificates (Issuer == Subject). Their certificates, and thus their public keys, are pre-installed in operating systems, browsers, and other software; these collections are the root stores (e.g., Windows Certificate Store, NSS/Mozilla store).
Intermediate CAs hold certificates signed by a root CA and in turn sign certificates for end entities or other sub-CAs.

This layered design limits risk: root CA private keys are kept offline and almost never used directly; intermediate CAs do day-to-day signing.
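
As a rough illustration of how a client follows the hierarchy, the sketch below walks issuer links from an end-entity certificate up to a trusted root. The certificate names, the `CERTS` table, and the `build_chain` helper are hypothetical, and the certificates are reduced to subject and issuer only. A real validator also verifies each signature, the validity dates, and the CA constraints at every step.

```python
from typing import List, Optional

# Hypothetical certificates, reduced to subject and issuer names.
CERTS = {
    "www.example.com": {"subject": "www.example.com", "issuer": "Intermediate CA 1"},
    "Intermediate CA 1": {"subject": "Intermediate CA 1", "issuer": "Root CA"},
    "Root CA": {"subject": "Root CA", "issuer": "Root CA"},  # self-signed
    "Rogue Root": {"subject": "Rogue Root", "issuer": "Rogue Root"},  # self-signed, untrusted
}
TRUST_STORE = {"Root CA"}  # roots pre-installed on the client


def build_chain(leaf: str, max_depth: int = 8) -> Optional[List[str]]:
    """Follow issuer links from the leaf; return the chain if it ends at a trusted root."""
    chain = []
    name = leaf
    for _ in range(max_depth):
        cert = CERTS.get(name)
        if cert is None:
            return None  # issuer certificate is missing
        chain.append(cert["subject"])
        if cert["subject"] in TRUST_STORE:
            return chain  # reached a trust anchor
        if cert["issuer"] == cert["subject"]:
            return None  # self-signed but not in the trust store
        name = cert["issuer"]
    return None  # chain longer than max_depth


print(build_chain("www.example.com"))
print(build_chain("Rogue Root"))
```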

Getting a Certificate (the CSR flow)

Generate a key pair (openssl genrsa).
Create a CSR with your identity information (openssl req -new).
Send the CSR to a CA. The CA verifies your identity and signs the certificate.
Install the issued certificate on your server (e.g., Apache SSLCertificateFile).

Certificate Revocation

A certificate may need to be invalidated before it expires (compromised private key, domain change). Two mechanisms, plus a refinement of the second:

Mechanism	How it works	Trade-off
CRL (Certificate Revocation List)	CA publishes a signed list of revoked serial numbers	Clients must download the full list; may be stale
OCSP (Online Certificate Status Protocol)	Client queries CA in real time for one certificate's status	Adds latency; CA learns browsing patterns
OCSP Stapling	Server fetches its own OCSP response and "staples" it to the TLS handshake	Removes OCSP's latency and privacy concerns, but requires server support
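
The CRL row can be pictured as a set lookup with a freshness check. The sketch below is a simplification under assumptions: the `crl_status` helper, the serial numbers, and the dates are all made up, and a real client would first verify the CA's signature on the list.

```python
from datetime import datetime, timezone


def crl_status(serial: int, revoked_serials: set, next_update: datetime, now: datetime) -> str:
    """Classify a certificate against a (hypothetical) already-verified CRL."""
    if now > next_update:
        return "unknown"  # the list is stale, so its answer cannot be relied on
    return "revoked" if serial in revoked_serials else "good"


revoked = {0x1A2B, 0x3C4D}  # made-up serial numbers
next_update = datetime(2024, 1, 8, tzinfo=timezone.utc)

print(crl_status(0x1A2B, revoked, next_update, datetime(2024, 1, 5, tzinfo=timezone.utc)))
print(crl_status(0x9999, revoked, next_update, datetime(2024, 1, 5, tzinfo=timezone.utc)))
print(crl_status(0x9999, revoked, next_update, datetime(2024, 2, 1, tzinfo=timezone.utc)))
```

The third call shows the staleness drawback: once the list is past its next-update time, the client cannot tell whether an unlisted certificate has since been revoked.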

How TLS Uses Certificates

During a TLS handshake the server presents its certificate. The client:

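Verifies the CA's signature on the certificate.
Walks the certificate chain up to a root CA in its trust store.
Checks that the hostname it requested matches a name in the certificate.
Checks validity dates and revocation status.
If everything checks out, uses the server's public key to authenticate the key exchange that produces the shared session key (older RSA key exchange instead encrypted the key material directly to that public key).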

This is why a compromised CA is catastrophic: a subverted CA can issue fraudulent certificates for any domain, enabling MITM attacks against any site whose visitors trust that CA, not only the CA's own customers.
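
In practice, application code rarely performs these checks by hand. As one example, Python's standard `ssl` module builds a client context that requires a valid chain and a matching hostname by default. The `fetch_peer_cert` helper below is a hypothetical wrapper written for this note; it is defined but not called here because it needs network access.

```python
import socket
import ssl

# Default client context: loads the system root store and enables verification.
ctx = ssl.create_default_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # chain must validate to a trusted root
print(ctx.check_hostname)                    # certificate must match the requested host


def fetch_peer_cert(host: str, port: int = 443) -> dict:
    """Connect with full verification and return the server certificate as a dict."""
    with socket.create_connection((host, port), timeout=5) as sock:
        # The handshake raises ssl.SSLCertVerificationError if any check fails.
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

Disabling either setting brings back exactly the key-authenticity problem that PKI exists to solve.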

Entity Authentication

Authentication is the process of verifying that an entity is who it claims to be. Three classic factors:

Factor	Description	Examples
Something you know	A secret the user memorizes	Password, PIN, security question
Something you have	A physical object the user possesses	Hardware token, smart card, TOTP app
Something you are	A biological characteristic	Fingerprint, face, iris, voice

Single-factor systems (e.g., password only) are weaker; multi-factor authentication (MFA / 2FA) requires at least two different factor types, so an attacker who steals one factor (say, a password) still cannot log in without the other.

Passwords and Secure Storage

Passwords are the most common "something you know" factor but are routinely misused.

Storing passwords safely:

Never store plaintext. A database breach immediately exposes all credentials.
Hash the password using a one-way function. On UNIX, /etc/shadow stores entries in the format $id$salt$hashed where $id indicates the algorithm ($1$ = MD5, $2y$ = bcrypt/Blowfish, $5$ = SHA-256, $6$ = SHA-512, $y$ = yescrypt).
Add a per-user random salt before hashing. The salt is stored alongside the hash. This means two users with the same password get different stored hashes, defeating precomputed rainbow tables.
Use a key-stretching / password-hashing function (bcrypt, scrypt, Argon2) rather than a raw cryptographic hash. These are intentionally slow, making brute-force and dictionary attacks orders of magnitude more expensive.
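
The storage rules above can be sketched with `hashlib.scrypt` from Python's standard library (available when Python is built against a recent OpenSSL). The cost parameters, the 16-byte salt length, and the `hash_password` and `verify_password` helper names are illustrative assumptions, not recommendations from these notes.

```python
import hashlib
import hmac
import secrets
from typing import Tuple

# Illustrative scrypt cost parameters; tune them for your own hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1}


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest); both are stored alongside the user record."""
    salt = secrets.token_bytes(16)  # fresh random salt per user
    digest = hashlib.scrypt(password.encode(), salt=salt, dklen=32, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode(), salt=salt, dklen=32, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)


salt_a, hash_a = hash_password("correct horse battery staple")
salt_b, hash_b = hash_password("correct horse battery staple")

print(hash_a != hash_b)  # same password, different salts, different stored hashes
print(verify_password("correct horse battery staple", salt_a, hash_a))
print(verify_password("wrong guess", salt_a, hash_a))
```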

Why a fast hash like MD5 or SHA-256 is wrong for passwords:
A GPU array can compute billions of SHA-256 hashes per second. Argon2 (the 2015 Password Hashing Competition winner) lets you tune memory and CPU cost so that each guess takes tens of milliseconds and a sizable block of memory. A single verification on your server is still fast enough to be imperceptible, but an attacker who must pay that cost for every guess is slowed by many orders of magnitude.
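
A back-of-the-envelope calculation shows the size of the gap. The numbers are assumptions chosen for illustration: an 8-character lowercase password, one billion fast-hash guesses per second, and a slow hash tuned to 50 ms per guess on a single worker. A real attacker would run many workers in parallel, which shrinks both figures by the same factor.

```python
KEYSPACE = 26 ** 8  # all 8-character lowercase passwords
FAST_RATE = 1e9     # assumed guesses per second for a fast hash
SLOW_RATE = 1 / 0.050  # assumed guesses per second at 50 ms per guess

SECONDS_PER_YEAR = 365 * 24 * 3600

fast_seconds = KEYSPACE / FAST_RATE
slow_years = KEYSPACE / SLOW_RATE / SECONDS_PER_YEAR

print(f"fast hash: about {fast_seconds / 60:.1f} minutes to exhaust the keyspace")
print(f"slow hash: about {slow_years:.0f} years to exhaust the keyspace")
```

Under these assumptions the same search goes from a few minutes to a few centuries.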

Password attacks:

Replay attack — capture a password by keylogging, network sniffing, or privilege escalation and reuse it.
Brute-force — try every possible string; feasible for short or low-entropy passwords.
Dictionary attack — try words from a common-password list; effective because human-chosen passwords cluster predictably (top 2023 passwords include 123456, password, qwerty).
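
To see why predictable passwords fall so quickly, here is a minimal sketch of a dictionary attack against an unsalted SHA-256 hash. The four-entry word list and the `dictionary_attack` helper are hypothetical; real lists contain millions of leaked passwords.

```python
import hashlib
from typing import List, Optional


def sha256_hex(password: str) -> str:
    """Fast, unsalted hash: exactly what should NOT be used for password storage."""
    return hashlib.sha256(password.encode()).hexdigest()


WORDLIST = ["123456", "password", "qwerty", "letmein"]  # tiny hypothetical list


def dictionary_attack(target_hash: str, wordlist: List[str]) -> Optional[str]:
    """Hash each candidate and compare it with the leaked hash."""
    for guess in wordlist:
        if sha256_hex(guess) == target_hash:
            return guess
    return None


leaked_hash = sha256_hex("qwerty")            # pretend this came from a breached database
strong_hash = sha256_hex("t7#Lq9vX!m2Rz0pW")  # random password, not in any list

print(dictionary_attack(leaked_hash, WORDLIST))
print(dictionary_attack(strong_hash, WORDLIST))
```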

Password hygiene rules: use long, randomly generated passwords, never reuse passwords across sites, and consider a password manager.

Multi-Factor Authentication and FIDO/WebAuthn

2FA adds a second factor — typically a time-based one-time password (TOTP) from an authenticator app (Duo, Authy, Google Authenticator), an SMS code, or a hardware security key.
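
For reference, the TOTP codes mentioned above can be computed with the standard library alone. The sketch follows the RFC 6238 construction (HMAC-SHA1 over a 30-second time counter, then dynamic truncation); the `totp` function name and its defaults are choices made for this note, and the secret is the ASCII test key from the RFC rather than anything a real service would use.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, at_time: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time password from a shared secret."""
    if at_time is None:
        at_time = time.time()
    counter = int(at_time) // step  # number of time steps since the Unix epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte picks the window
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


RFC_TEST_SECRET = b"12345678901234567890"  # test key from RFC 6238, not a real secret

print(totp(RFC_TEST_SECRET, at_time=59, digits=8))  # RFC 6238 test vector: 94287082
print(totp(RFC_TEST_SECRET))                        # code for the current 30-second window
```

Note that the server and the app share the same secret, which is why TOTP counts as a shared-secret scheme.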

FIDO2 / WebAuthn goes further by using public-key cryptography directly for authentication. The security key or device stores a private key that never leaves the hardware; the server holds only the corresponding public key. Because no shared secret is stored on the server or transmitted, a breach or an intercepted login yields nothing reusable, and because the signed response is bound to the site's origin, FIDO2 is phishing-resistant by design.

Challenge-response is the underlying pattern: the server sends a random challenge r; the client responds with f(r), for example an HMAC of r under a shared secret key, or a digital signature over r made with the client's private key. Because each challenge is fresh and used only once, a replayed response is useless.

Phishing: The Main Authentication Threat

Passwords, SMS codes, and TOTP codes can all be bypassed by phishing: tricking the user into voluntarily submitting credentials to a fake site. A certificate only proves that the server's operator controls the domain; it says nothing about whether the domain is trustworthy. FIDO2/WebAuthn is the strongest defense because the response signed with the private key is cryptographically bound to the origin, so a phishing site cannot obtain a response that the real site will accept, even if the user enters their PIN.
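
Origin binding can be pictured with a toy model. Loud assumption: an HMAC stands in here for the authenticator's public-key signature so that the sketch runs on the standard library alone; in real WebAuthn the server holds only a public key and verifies a signature over data that includes the origin. The function names and the phishing domain are hypothetical.

```python
import hashlib
import hmac
import secrets


def authenticator_respond(credential_key: bytes, challenge: bytes, origin: str) -> bytes:
    """Toy authenticator: the browser, not the web page, supplies `origin`."""
    message = challenge + b"|" + origin.encode()
    return hmac.new(credential_key, message, hashlib.sha256).digest()


def server_verify(credential_key: bytes, challenge: bytes, response: bytes,
                  expected_origin: str) -> bool:
    """Toy relying party: accepts only responses bound to its own origin."""
    message = challenge + b"|" + expected_origin.encode()
    expected = hmac.new(credential_key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


credential_key = secrets.token_bytes(32)
challenge = secrets.token_bytes(16)

# The user is on the real site: the response is bound to the real origin.
legit = authenticator_respond(credential_key, challenge, "https://bank.com")
# The user is on a look-alike site that relays the bank's challenge.
phished = authenticator_respond(credential_key, challenge, "https://bank-login.example")

print(server_verify(credential_key, challenge, legit, "https://bank.com"))
print(server_verify(credential_key, challenge, phished, "https://bank.com"))
```

The relayed response fails because it was computed over the phishing site's origin, which the user cannot override by being fooled.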

Key Takeaways

The MITM attack on public-key exchange motivates PKI: an attacker who controls the channel can substitute their own public key.
An X.509 certificate binds a subject's identity to a public key via a CA's digital signature; without the CA's private key it cannot be forged or altered without detection.
CAs form a hierarchy: root CAs (pre-installed in root stores) sign intermediate CAs which sign end-entity certificates — this is the chain of trust.
Revocation (CRL, OCSP, OCSP Stapling) allows certificates to be invalidated before expiry.
TLS uses the chain of trust during the handshake to authenticate the server; because clients generally accept a certificate from any trusted CA for any domain, a single compromised CA can undermine every site.
The three authentication factors are something you know, something you have, and something you are; MFA combines at least two.
Passwords must be stored as salted, key-stretched hashes (bcrypt/scrypt/Argon2); fast hashes (MD5, SHA-256) are dangerously weak for this purpose.
FIDO2/WebAuthn provides phishing-resistant authentication by binding the cryptographic response to the legitimate origin.