Malware
Every major security incident you read about — a hospital's systems locked by ransomware, millions of credit-card numbers stolen, a nuclear centrifuge silently destroyed — traces back to malware: software that runs on a machine and does something the attacker wants rather than what the owner intended. Understanding malware is not just about naming categories; it is about seeing the full attack lifecycle, from initial compromise through persistence, payload execution, and evasion of detection.
What Is Malware?
Malware (malicious software) is any set of instructions that executes on a computer to serve an attacker's goals. The key distinction from a vulnerability: a vulnerability is a flaw, whereas malware is code that may exploit such a flaw or get in by entirely non-technical means such as social engineering. A single piece of malware may use a buffer-overflow exploit to get in, a rootkit to stay hidden once installed, and file-encrypting ransomware as its payload, combining techniques across the attack chain.
Taxonomy: Types of Malware
| Type | Core behavior |
|---|---|
| Virus | Attaches itself to legitimate files/programs; spreads when the infected file is executed |
| Worm | Self-replicates across networks without user action; exploits network services |
| Trojan horse | Appears to do something useful; hides malicious functionality (e.g., spyware bundled with a free app) |
| Ransomware | Encrypts the victim's files and demands payment (e.g., cryptocurrency) to restore access |
| Rootkit | Modifies the OS to hide malware files, processes, and network connections |
| Bot / Botnet | Compromised machine (bot) controlled remotely by a botmaster via a command-and-control (C&C) server |
| Spyware | Covertly collects user data (keystrokes, screenshots, documents) |
| Adware | Serves unwanted advertisements, often as a revenue mechanism |
| Keylogger | Specifically captures keystrokes; can be software or hardware (a USB dongle between keyboard and PC) |
| Mobile malware | Targets smartphone operating systems (Android, iOS) |
Types frequently co-occur: a trojan may install a rootkit that installs a bot agent that enables ransomware to be pushed later.
What Can Malware Do? (Payloads)
Once running, malware can:
- Surveil — keylogging, screen/camera capture, document exfiltration
- Control — launch a reverse shell or remote desktop so the attacker can interact with the machine in real time
- Extort — encrypt all user files and display a ransom note (WannaCry is the canonical example: it encrypted files and demanded Bitcoin payment, with a countdown timer before the price doubled)
- Disrupt — deliver pop-ups (adware), delete data, or even physically damage hardware (Stuxnet caused Iranian centrifuges to spin at destructive speeds)
- Amplify — enroll the machine into a botnet for DDoS attacks, brute-force password campaigns, or cryptocurrency mining
How Does Malware Get Onto a System?
Getting the malware to execute is its own challenge. Common initial-access vectors:
- Exploit a network service — buffer/integer overflow in an HTTP, RPC, or file-sharing daemon (the attacker sends a crafted packet; the server executes injected code)
- Client-side exploit — malicious PDF, Word macro, Flash object, or browser exploit that triggers when an unsuspecting user opens or views the file
- Social engineering — trick the user into running the malware directly (phishing email with an attachment, fake software installer, malicious link)
- Autorun / physical media: a USB drive whose contents execute automatically on insertion (Stuxnet reached air-gapped Iranian facilities on infected USB drives)
- Drive-by download — user visits a compromised or malicious web page; an exploit kit fingerprints the browser and silently delivers the best-matching exploit
- Insider / local access — a malicious or compromised employee with physical or account access
Rootkits: Hiding in Plain Sight
A rootkit's defining goal is concealment. It modifies OS internals so that standard utilities (ls, ps, netstat) cannot see the malware's files, processes, or network connections. The mechanism:
- Intercept system calls: when an application calls readdir() or getpid(), the rootkit interposes between the application and the kernel
- Filter results: if the returned process ID matches the rootkit's own PID, suppress it from the listing: if (PID == rootkit_PID) → don't show
- Because detection tools that rely on the OS API will get filtered output, a compromised OS cannot be trusted to audit itself; clean-boot forensics (booting from external media) is required
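The filtering idea can be illustrated with a harmless simulation. The sketch below does not touch any real system call: `real_list_pids`, `hooked`, and the PID values are hypothetical stand-ins chosen for illustration. It only shows why a tool that asks the "infected" API sees less than an out-of-band scan would.

```python
from typing import Callable, List

# Hypothetical PID that the simulated rootkit wants to conceal.
HIDDEN_PIDS = {4242}


def real_list_pids() -> List[int]:
    """Stand-in for the kernel's true view of running processes."""
    return [1, 87, 4242, 5120]


def hooked(list_fn: Callable[[], List[int]]) -> Callable[[], List[int]]:
    """Wrap a listing function so that hidden PIDs are dropped from its result."""
    def filtered() -> List[int]:
        return [pid for pid in list_fn() if pid not in HIDDEN_PIDS]
    return filtered


# Every tool on the simulated "infected" system calls this filtered version.
os_api_list_pids = hooked(real_list_pids)

visible = os_api_list_pids()   # what ps-like tools would report
truth = real_list_pids()       # what a clean-boot, out-of-band scan would see
hidden = sorted(set(truth) - set(visible))

print("visible:", visible)
print("hidden from OS API:", hidden)
```

Comparing the two views is, in spirit, what cross-view rootkit detection does: any entry present in the trusted view but missing from the OS-reported view is suspect.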
An extreme variant is the virtualization-based rootkit ("Blue Pill"): the rootkit inserts a thin hypervisor (VMM) beneath the running OS, so the OS continues running as a guest virtual machine, unaware that it has been relocated. The attacker's code and the VMM then sit below the target OS stack, out of reach of any tool running inside it.
Botnets: Malware at Scale
A botnet is a collection of compromised machines (bots) under unified control of a botmaster. Key properties:
- Decoupled compromise and control — the method used to infect (worm, trojan, malicious URL) is separate from the ongoing control channel
- Upon infection, the new bot "phones home" to a command-and-control (C&C) server to register itself
- The botmaster pushes commands via C&C: DDoS (ping flood), brute-force password attacks, spam relay, cryptocurrency mining
- Modern botnets route C&C traffic through anonymizing infrastructure (e.g., Tor) to frustrate takedown; the PgMiner botnet used PostgreSQL as an initial compromise vector and Tor for C&C
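From the defender's side, the phone-home pattern can sometimes be spotted in connection logs. The sketch below assumes, purely for illustration, that a bot contacts its C&C server at nearly fixed intervals; real bots often add jitter specifically to defeat this kind of check. The function name `looks_like_beacon` and its thresholds are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence


def looks_like_beacon(timestamps: Sequence[float],
                      min_events: int = 5,
                      max_jitter_ratio: float = 0.1) -> bool:
    """Flag a series of connection times whose gaps are suspiciously regular.

    timestamps: connection times (seconds) from one host to one destination.
    max_jitter_ratio: allowed (std deviation / mean) of the gaps between events.
    """
    if len(timestamps) < min_events:
        return False  # too little data to judge
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap <= 0:
        return False
    return pstdev(gaps) / average_gap <= max_jitter_ratio


# Roughly one connection per minute: regular enough to be flagged.
periodic = [0, 60, 120.5, 180, 240.2, 300]
# Bursty, human-looking browsing: not flagged.
bursty = [0, 5, 70, 75, 400, 410]

print(looks_like_beacon(periodic))
print(looks_like_beacon(bursty))
```

A check like this is only a triage signal: legitimate software (update checkers, telemetry, NTP) also polls on a schedule, so flagged destinations still need to be reviewed.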
Detection Approaches
Signature-Based Detection
The traditional antivirus model: maintain a database of malware signatures (byte sequences or instruction patterns unique to each known malware family). Scan candidate files and look for matches.
- Fast — pattern matching is computationally cheap
- Precise for known threats — low false-positive rate on catalogued malware
- Blind to new/unknown malware — any sample not yet in the database is missed (zero-day problem)
- Signatures are proprietary and must be kept continuously updated
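A minimal sketch of the matching step, assuming signatures are plain byte sequences. The family names and patterns below are made up for illustration; production engines use far richer formats (wildcards, offsets, multi-pattern automata).

```python
from typing import Dict, List

# Hypothetical signature database: family name -> byte pattern.
SIGNATURES: Dict[str, bytes] = {
    "Example.FamilyA": bytes.fromhex("deadbeef4142"),
    "Example.FamilyB": b"\x90\x90\xeb\xfe",
}


def scan_bytes(data: bytes,
               signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in signatures.items()
                  if pattern in data)


clean_file = b"hello world"
infected_file = b"\x00" * 8 + bytes.fromhex("deadbeef4142") + b"tail"
# A sample whose bytes match nothing in the database goes undetected.
novel_sample = b"previously unseen malicious bytes"

print(scan_bytes(clean_file))
print(scan_bytes(infected_file))
print(scan_bytes(novel_sample))
```

The third call shows the zero-day problem in miniature: the scanner returns an empty list for anything it has no pattern for, whether benign or not.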
Allow/Block Listing (Hash-Based)
Maintain a database of cryptographic hashes of known-good files (OS binaries, popular applications) and known-bad files (confirmed malware). Compare the hash of each file on disk:
- Hash match in blocklist → quarantine
- Hash match in allowlist → trusted
- No match → unknown (potential new malware or custom software)
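The three-way decision above can be sketched with Python's standard `hashlib`. The choice of SHA-256, the helper names, and the sample contents are assumptions for illustration; the blocklist is checked first so that a file on both lists is still quarantined.

```python
import hashlib
from typing import Set


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def classify(data: bytes, allowlist: Set[str], blocklist: Set[str]) -> str:
    """Classify file contents as 'quarantine', 'trusted', or 'unknown'."""
    digest = sha256_hex(data)
    if digest in blocklist:
        return "quarantine"
    if digest in allowlist:
        return "trusted"
    return "unknown"


# Hypothetical databases built from placeholder file contents.
good = b"trusted system binary"
bad = b"known malicious sample"
allowlist = {sha256_hex(good)}
blocklist = {sha256_hex(bad)}

print(classify(good, allowlist, blocklist))
print(classify(bad, allowlist, blocklist))
# Changing a single byte produces a different hash, so the file is "unknown".
print(classify(bad + b"!", allowlist, blocklist))
```

The last call illustrates the main weakness of exact-hash blocklists: any trivial modification to a known-bad file moves it out of the blocklist and into the unknown bucket.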
Heuristic / Behavioral Analysis
Useful for zero-day malware that has no signature yet. Two sub-approaches:
- Static code analysis — inspect the binary's instructions without running it; flag code that contains patterns associated with malicious behavior (e.g., instructions to delete system files, enumerate processes, open raw network sockets)
- Dynamic analysis / sandboxing — execute the file in an isolated emulation environment; monitor what the program actually does (file writes, network calls, registry changes); if the observed behavior is harmful, classify as malware
Sandboxing is powerful but malware authors fight back: some samples detect they are running inside a VM or emulator and refuse to execute their malicious payload until they confirm they are on a real machine.
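The static sub-approach can be sketched as a weighted indicator score. The indicator strings below are real Windows API names loosely matching the behaviors mentioned above (deleting files, enumerating processes, opening sockets), but the weights and threshold are arbitrary assumptions, and a real engine would parse the binary's import table rather than search raw bytes.

```python
from typing import Dict

# Illustrative indicators: API name found in the binary -> suspicion weight.
INDICATORS: Dict[bytes, int] = {
    b"DeleteFileW": 2,                # can delete files
    b"CreateToolhelp32Snapshot": 2,   # can enumerate running processes
    b"WSASocketW": 1,                 # can open network sockets
}
THRESHOLD = 4  # hypothetical cut-off; tuning it trades misses for false alarms


def heuristic_score(data: bytes) -> int:
    """Sum the weights of all indicators that occur in the file contents."""
    return sum(weight for pattern, weight in INDICATORS.items()
               if pattern in data)


def is_suspicious(data: bytes) -> bool:
    """Flag the file when its combined score reaches the threshold."""
    return heuristic_score(data) >= THRESHOLD


sample = b"..DeleteFileW..CreateToolhelp32Snapshot.."
network_tool = b"..WSASocketW.."

print(heuristic_score(sample), is_suspicious(sample))
print(heuristic_score(network_tool), is_suspicious(network_tool))
```

Each indicator on its own is common in legitimate software, which is why the sketch flags only combinations. This is also why heuristic detection tends to produce more false positives than signature matching.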
IDS and IPS
| Term | Role |
|---|---|
| IDS (Intrusion Detection System) | Observes traffic or system activity and raises an alert after suspicious behavior is detected |
| IPS (Intrusion Prevention System) | Actively blocks or drops traffic/actions before they reach the target |
| HIDS / NIDS | Host-based vs. network-based placement |
Detection is inherently reactive (for example, memory may already be corrupted by the time the alert fires); prevention is proactive but requires high confidence to avoid blocking legitimate traffic. Tools like Snort can function as both IDS and IPS.
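The difference between the two roles can be reduced to one flag: what happens to a packet after a rule matches. The `Inspector` class below is a hypothetical toy, not Snort's rule language or API. In detection mode it alerts and forwards; in prevention mode it alerts and drops.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Inspector:
    """Toy traffic inspector; prevent=False acts as an IDS, True as an IPS."""
    bad_patterns: List[bytes]
    prevent: bool = False
    alerts: List[str] = field(default_factory=list)

    def handle(self, packet: bytes) -> bool:
        """Inspect one packet; return True if it is forwarded to the target."""
        for pattern in self.bad_patterns:
            if pattern in packet:
                self.alerts.append(f"matched {pattern!r}")
                # IDS: the packet still goes through. IPS: it is dropped.
                return not self.prevent
        return True


rules = [b"/etc/passwd"]  # illustrative pattern only
attack = b"GET /../../etc/passwd HTTP/1.1"
benign = b"GET /index.html HTTP/1.1"

ids = Inspector(rules)
ips = Inspector(rules, prevent=True)

print(ids.handle(attack), ids.alerts)
print(ips.handle(attack), ips.alerts)
print(ips.handle(benign))
```

The same rule set drives both modes. The cost of a false positive differs, though: in IDS mode it is a spurious alert, while in IPS mode it is a legitimate request that never arrives.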
Evasion: How Malware Fights Back Against Detection
Signature-based AV is only as good as its database, and malware authors know it.
Encrypted virus: the virus body is encrypted; only a small decryption engine is stored in plaintext. Each propagation uses a freshly generated key, producing a different ciphertext blob. Because the body differs in every copy, the AV must either match the constant decryption engine or emulate the file until it decrypts itself; in the latter case, the virus may detect the emulator and refuse to decrypt.
Polymorphic virus: an encrypted virus that also mutates its decryption engine on each infection (e.g., inserts padding/junk instructions, reorders equivalent instructions). No two copies share the same engine bytes, defeating simple engine signatures.
Metamorphic virus: goes further. The entire virus body is rewritten each generation using code permutation and instruction substitution (e.g., swapping an ADD for an equivalent sequence of INC instructions). No encryption is even needed; the code itself looks different every time. Metamorphic viruses are the hardest to detect with static signatures.
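The effect of per-copy encryption on signature matching can be shown with harmless placeholder bytes. In the sketch below, `BODY` and `STUB` are inert text standing in for the virus body and its plaintext decryption engine, and single-byte XOR is an assumed stand-in for whatever cipher a real sample would use.

```python
BODY = b"BENIGN-PLACEHOLDER-BODY"  # inert text standing in for the virus body
STUB = b"<decryptor-stub>"         # inert text standing in for the plaintext engine


def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key (applying it twice restores data)."""
    return bytes(b ^ key for b in data)


def make_copy(key: int) -> bytes:
    """Build one 'infected' copy: constant stub, then key, then encrypted body."""
    return STUB + bytes([key]) + xor_bytes(BODY, key)


copy_a = make_copy(0x21)
copy_b = make_copy(0x5C)

# A signature taken from the body matches neither copy.
print(BODY in copy_a, BODY in copy_b)
# A signature taken from the unchanging stub matches both.
print(STUB in copy_a, STUB in copy_b)
```

This is why scanners target the decryption engine of an encrypted virus, and why polymorphic viruses mutate that engine as well: once the stub also changes per copy, the second check fails just like the first.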
Defenses Summary
No single control is sufficient. Effective defense layers:
- Keep software patched — eliminates the known-vulnerability entry points
- Principle of least privilege — limits damage if malware runs
- Email/web filtering — blocks malicious attachments and drive-by sites at the perimeter
- Endpoint AV with behavioral detection — catches known and heuristically-suspicious samples
- Network IDS/IPS (e.g., Snort) — monitors traffic for C&C beacons, exploit payloads
- Sandboxed execution / application allowlisting — prevents unauthorized code from running
- Offline / out-of-band forensics — for rootkit investigation, boot from trusted external media so the compromised OS cannot filter results
- Backups — the primary recovery path when ransomware strikes
Key Takeaways
- Malware is software that serves the attacker's goals; it is distinct from (but may exploit) vulnerabilities.
- The main types — viruses, worms, trojans, ransomware, rootkits, botnets, spyware, keyloggers — differ primarily in how they spread and what payload they deliver.
- Rootkits hide by intercepting OS system calls; virtualization-based rootkits go further by inserting a hypervisor under the entire OS.
- Botnets decouple compromise from control: bots phone home to a C&C server and can be tasked with DDoS, spam, or mining at scale.
- Signature-based detection is fast but blind to unknown malware; heuristic/behavioral detection catches zero-days but is slower and can be fooled by sandbox-aware malware.
- Encrypted, polymorphic, and metamorphic viruses are specifically engineered to defeat signature scanning by mutating their byte patterns on each infection.
- Defense requires layered controls: patching, least privilege, AV, network IDS/IPS, sandboxing, and reliable offline backups.