RSA
Every time you visit a website over HTTPS, authenticate with an SSH key, or verify a software signature, RSA is likely doing the heavy lifting. Invented by Rivest, Shamir, and Adleman in 1978 โ and honoured with the 2002 ACM Turing Award โ it remains the most widely deployed public-key cryptosystem in existence. Understanding how it works, why it works, and where it fails is essential for anyone building or auditing secure systems.
Background: modular arithmetic
RSA lives entirely in modular arithmetic. a mod n is the remainder after dividing a by n. The key property you need is that modulo distributes over multiplication and exponentiation:
(a * b) mod n = [(a mod n) * (b mod n)] mod n
a^x mod n = (a mod n)^x mod n
This means you can reduce intermediate results at each step when computing a huge exponentiation โ no need to evaluate the full number before taking the remainder.
Background: Euler's theorem
Euler's totient function ฯ(n) counts the positive integers less than n that share no common factor with n.
- If
pis prime,ฯ(p) = p - 1(every integer from 1 to p-1 is coprime to p). - If
n = p * qfor distinct primespandq, thenฯ(n) = ฯ(p) * ฯ(q) = (p-1)(q-1).
Euler's theorem states: for any a coprime to n,
a^ฯ(n) mod n = 1
This is the engine that makes RSA correct.
Key generation
- Choose two large distinct primes
pandq. Keep them secret. - Compute the modulus
n = p * q. This is published. - Compute
ฯ(n) = (p - 1)(q - 1). Keep this secret. - Choose a public exponent
esuch that1 < e < ฯ(n)andgcd(e, ฯ(n)) = 1(i.e.eis coprime toฯ(n)). The value 65537 is the standard choice in practice. - Compute the private exponent
das the modular inverse ofemoduloฯ(n):
This is solved with the extended Euclidean algorithm.e * d mod ฯ(n) = 1
Public key: (e, n) โ share freely.
Private key: d โ keep secret. Also discard p, q, and ฯ(n) after key generation; they must not be recoverable.
Encryption and decryption
Given a message m represented as an integer with 0 < m < n:
Encrypt (anyone with the public key can do this):
c = m^e mod n
Decrypt (only the private-key holder can do this):
m = c^d mod n
That's the complete textbook cipher. The asymmetry comes from the fact that computing d from e and n alone requires factoring n โ which is believed to be computationally infeasible for large n.
Why correctness works (Euler's theorem)
Decryption recovers the original message because:
c^d mod n
= (m^e)^d mod n
= m^(ed) mod n
Since e * d โก 1 (mod ฯ(n)), we can write ed = k * ฯ(n) + 1 for some integer k. Then:
m^(ed) mod n
= m^(k*ฯ(n)+1) mod n
= (m^ฯ(n))^k * m mod n
= 1^k * m mod n โ Euler's theorem: m^ฯ(n) mod n = 1
= m
The exponent "wraps around" in a cycle of length ฯ(n), landing back on m.
Worked example
Use the values from the lecture: p = 11, q = 7.
| Step | Value |
|---|---|
n = p * q |
11 * 7 = 77 |
ฯ(n) = (p-1)(q-1) |
10 * 6 = 60 |
Choose e coprime to 60 |
e = 37 |
Find d: 37 * d mod 60 = 1 |
d = 13 (since 37 * 13 = 481 = 8 * 60 + 1) |
| Public key | (37, 77) |
| Private key | 13 |
Encrypt message m = 15:
c = 15^37 mod 77 = 71
Decrypt ciphertext c = 71:
m = 71^13 mod 77 = 15 โ
You can verify small steps by reducing modulo 77 at each multiplication โ the distributive property of mod keeps numbers manageable.
Digital signatures
RSA naturally supports signing by inverting the roles of the keys:
- Sign (private key holder):
ฯ = m^d mod n - Verify (anyone with the public key): compute
ฯ^e mod n; if it equalsm, the signature is valid.
The same Euler's theorem argument guarantees (m^d)^e mod n = m, so verification recovers the original value.
Why you must hash before signing
Signing the raw message directly has two fatal problems:
- Size: RSA can only operate on values smaller than
n. A document of arbitrary length cannot be directly exponentiated. - Malleability: An attacker who has signatures
ฯโonmโandฯโonmโcan trivially forge a signatureฯโ * ฯโ mod nonmโ * mโ mod nwithout knowingd.
The fix is to sign a hash of the message: ฯ = H(m)^d mod n. A cryptographic hash (SHA-256 or better) produces a fixed-size digest, making both problems disappear. The standard scheme for signature padding is RSA-PSS.
Security foundation: factoring is hard
The security of RSA rests on the integer factorization assumption: given n = p * q, there is no efficient algorithm to recover p and q when they are large (โฅ 1024 bits each).
If an attacker could factor n, they would compute ฯ(n) = (p-1)(q-1) and then recover d = eโปยน mod ฯ(n) โ breaking the scheme completely. After more than 45 years of sustained cryptanalytic effort, no polynomial-time classical algorithm for factoring is known. (Note: Shor's algorithm on a sufficiently large quantum computer would break RSA, which is why post-quantum cryptography is an active area.)
Practical considerations
| Concern | What to do |
|---|---|
| Key size | Use at least 2048-bit keys (3072 or 4096 for long-term security) |
| Padding for encryption | Use OAEP (Optimal Asymmetric Encryption Padding), not textbook RSA |
| Padding for signatures | Use PSS (Probabilistic Signature Scheme) |
| Never use textbook RSA | Deterministic: same m always produces the same c; malleable; no semantic security |
| Hybrid encryption | RSA is slow โ use it to encrypt an AES session key, then encrypt data with AES |
Textbook RSA pitfall
Textbook (plain) RSA is deterministic: encrypting the same message twice gives the same ciphertext. An attacker who can make encryption queries can build a dictionary of (m, c) pairs and look up any observed ciphertext. OAEP adds randomness so every encryption is different, eliminating this attack.
Hybrid encryption in practice
Because RSA is orders of magnitude slower than symmetric ciphers, real protocols (TLS, SSH, PGP) use RSA only to securely exchange a random AES key. All subsequent data is encrypted with AES. This gives you the key-distribution convenience of RSA at the bulk speed of AES.
RSA in SSH
Running ssh-keygen generates an RSA key pair. The public key goes to the server (~/.ssh/authorized_keys); the private key stays on your machine (~/.ssh/id_rsa). During login, the server challenges you to prove you hold the private key โ no password ever crosses the network.
Key takeaways
- RSA key generation: pick primes
p,q; computen = pq,ฯ(n) = (p-1)(q-1); chooseecoprime toฯ(n); findd = eโปยน mod ฯ(n). Public key is(e, n), private key isd. - Encrypt with the public key:
c = m^e mod n. Decrypt with the private key:m = c^d mod n. - Correctness follows from Euler's theorem: because
ed โก 1 mod ฯ(n), raisingm^eto the powerdcycles back tom. - Signing inverts the key roles: sign with
d, verify withe. Always sign a hash, never the raw message. - Security rests on the hardness of factoring
n. There is no known efficient classical algorithm. - Never deploy textbook RSA. Use OAEP for encryption, PSS for signatures, and at least 2048-bit keys.
- In practice, RSA encrypts a symmetric session key (hybrid encryption); bulk data is encrypted with AES.