Malware

Many of the major security incidents you read about (a hospital's systems locked by ransomware, millions of credit-card numbers stolen, a nuclear centrifuge silently destroyed) trace back to malware: software that runs on a machine and does something the attacker wants rather than what the owner intended. Understanding malware is not just about naming categories; it is about seeing the full attack lifecycle, from initial compromise through persistence, payload execution, and evasion of detection.

What Is Malware?

Malware (malicious software) is any set of instructions that executes on a computer to serve an attacker's goals. The key distinction from a vulnerability: a vulnerability is a flaw, while malware is the code that exploits or abuses that flaw (or that gets onto the machine through entirely non-technical means such as social engineering). A single piece of malware often combines techniques across the attack chain: a buffer-overflow exploit to get in, a rootkit to stay hidden once installed, and file-encrypting ransomware as its payload.

Taxonomy: Types of Malware

Type            Core behavior
Virus           Attaches itself to legitimate files/programs; spreads when the infected file is executed
Worm            Self-replicates across networks without user action; exploits network services
Trojan horse    Appears to do something useful; hides malicious functionality (e.g., spyware bundled with a free app)
Ransomware      Encrypts the victim's files and demands payment (e.g., cryptocurrency) to restore access
Rootkit         Modifies the OS to hide malware files, processes, and network connections
Bot / Botnet    Compromised machine (bot) controlled remotely by a botmaster via a command-and-control (C&C) server
Spyware         Covertly collects user data (keystrokes, screenshots, documents)
Adware          Serves unwanted advertisements, often as a revenue mechanism
Keylogger       Specifically captures keystrokes; can be software or hardware (a USB dongle between keyboard and PC)
Mobile malware  Targets smartphone operating systems (Android, iOS)

Types frequently co-occur: a trojan may install a rootkit that installs a bot agent that enables ransomware to be pushed later.

What Can Malware Do? (Payloads)

Once running, malware can:

How Does Malware Get Onto a System?

Getting the malware to execute is its own challenge. Common initial-access vectors:

  1. Exploit a network service — buffer/integer overflow in an HTTP, RPC, or file-sharing daemon (the attacker sends a crafted packet; the server executes injected code)
  2. Client-side exploit — malicious PDF, Word macro, Flash object, or browser exploit that triggers when an unsuspecting user opens or views the file
  3. Social engineering — trick the user into running the malware directly (phishing email with an attachment, fake software installer, malicious link)
  4. Autorun / physical media — USB drive with autorun functionality (Stuxnet spread this way into air-gapped Iranian facilities)
  5. Drive-by download — user visits a compromised or malicious web page; an exploit kit fingerprints the browser and silently delivers the best-matching exploit
  6. Insider / local access — a malicious or compromised employee with physical or account access

Rootkits: Hiding in Plain Sight

A rootkit's defining goal is concealment. It modifies OS internals so that standard utilities (ls, ps, netstat) cannot see the malware's files, processes, or network connections. The mechanism:

An extreme variant is the virtualization-based rootkit ("Blue Pill"): the rootkit inserts a tiny hypervisor (VMM) beneath the running OS, so the OS continues running as a guest virtual machine, unaware that it has been relocated. The VMM and the attacker's code then sit below the target OS stack, where tools running inside that OS cannot see them directly.
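Because a rootkit filters what the OS reports, one common way to look for it is a cross-view comparison: list the same objects through a trusted path (for example, from external boot media) and through the host's own tools, then diff the two. The sketch below only simulates this with in-memory lists. The `toyrk_` prefix, the file names, and both function names are hypothetical, and the prefix filter is just an assumed stand-in for whatever hiding rule a real rootkit would apply.

```python
def filtered_listing(raw_entries, hidden_prefix="toyrk_"):
    """Simulate a tampered listing call: entries with the prefix are dropped.

    Assumption for illustration only; real rootkits hide entries inside the OS.
    """
    return [e for e in raw_entries if not e.startswith(hidden_prefix)]


def cross_view_diff(trusted_view, untrusted_view):
    """Return entries present in the trusted view but missing from the OS view."""
    return sorted(set(trusted_view) - set(untrusted_view))


# What is really on disk (as seen from trusted media).
raw = ["notes.txt", "toyrk_payload.bin", "report.pdf"]

# What the compromised host's own utilities would report.
seen_by_os_tools = filtered_listing(raw)

print(seen_by_os_tools)                        # ['notes.txt', 'report.pdf']
print(cross_view_diff(raw, seen_by_os_tools))  # ['toyrk_payload.bin']
```

Any entry that shows up only in the trusted view is a candidate for something the running OS is hiding.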

Botnets: Malware at Scale

A botnet is a collection of compromised machines (bots) under unified control of a botmaster. Key properties:

Detection Approaches

Signature-Based Detection

The traditional antivirus model: maintain a database of malware signatures (byte sequences or instruction patterns unique to each known malware family). Scan candidate files and look for matches.
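A minimal sketch of that scan loop, assuming signatures are plain byte sequences searched for anywhere in the file. The `Signature` class, the family names, and the patterns are all made up for illustration; real engines use far richer pattern languages and much faster matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signature:
    name: str       # label for the malware family (hypothetical)
    pattern: bytes  # byte sequence treated as unique to that family


# Toy database: these patterns are invented for this example.
SIGNATURES = [
    Signature("Demo.FamilyA", bytes.fromhex("deadbeef4142")),
    Signature("Demo.FamilyB", b"TOY-MARKER-B"),
]


def scan_bytes(data: bytes, signatures=SIGNATURES) -> list[str]:
    """Return the names of all signatures whose pattern occurs in data."""
    return [sig.name for sig in signatures if sig.pattern in data]


clean = b"just an ordinary file"
infected = b"header" + bytes.fromhex("deadbeef4142") + b"trailer"

print(scan_bytes(clean))     # []
print(scan_bytes(infected))  # ['Demo.FamilyA']
```

The scanner can only report families that already have an entry in `SIGNATURES`, which is the limitation that heuristic and behavioral analysis try to address.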

Allow/Block Listing (Hash-Based)

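Maintain a database of cryptographic hashes of known-good files (OS binaries, popular applications) and known-bad files (confirmed malware). Compare the hash of each file on disk against both lists: a match on the known-good list can be trusted, a match on the known-bad list is flagged, and a file that matches neither remains unknown and needs further analysis.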
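The comparison can be sketched as a lookup in two sets of digests. SHA-256 is an assumed choice of hash, and the "files" here are placeholder byte strings, not real binaries. The function names and the three verdict labels are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


# Placeholder contents stand in for real files.
KNOWN_GOOD = {sha256_hex(b"trusted system binary")}
KNOWN_BAD = {sha256_hex(b"confirmed malicious sample")}


def classify(data: bytes) -> str:
    """Return 'block', 'allow', or 'unknown' based on the file's hash."""
    digest = sha256_hex(data)
    if digest in KNOWN_BAD:
        return "block"
    if digest in KNOWN_GOOD:
        return "allow"
    return "unknown"


print(classify(b"trusted system binary"))         # allow
print(classify(b"confirmed malicious sample"))    # block
print(classify(b"confirmed malicious sample!"))   # unknown
```

The last call shows the weakness of a pure block list: changing a single byte of a known-bad file yields a different hash, so the modified file is no longer recognized.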

Heuristic / Behavioral Analysis

Useful for zero-day malware that has no signature yet. Two sub-approaches:

Sandboxing is powerful but malware authors fight back: some samples detect they are running inside a VM or emulator and refuse to execute their malicious payload until they confirm they are on a real machine.
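As one illustration of a static heuristic (offered as an example, not necessarily one of the sub-approaches meant above), a scanner can flag files whose bytes look almost random, since packed or encrypted content tends to have high Shannon entropy. The 7.2 bits-per-byte threshold and the function names below are assumptions chosen for this sketch. High entropy is only a hint, because compressed archives and media files score high as well.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def looks_packed(data: bytes, threshold: float = 7.2) -> bool:
    """Heuristic flag: True when the data is close to uniformly random."""
    return shannon_entropy(data) >= threshold


text_like = b"plain readable configuration text " * 200
random_like = os.urandom(8192)  # stands in for encrypted or packed content

print(looks_packed(text_like))    # False
print(looks_packed(random_like))  # True
```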

IDS and IPS

Term                               Role
IDS (Intrusion Detection System)   Observes traffic or system activity and raises an alert after suspicious behavior is detected
IPS (Intrusion Prevention System)  Actively blocks or drops traffic/actions before they reach the target
HIDS / NIDS                        Host-based vs. network-based placement

Detection is inherently reactive: by the time the alert fires, the attack may already have succeeded (for example, memory may already be corrupted). Prevention is proactive but requires high confidence, because every false positive blocks legitimate traffic. Tools like Snort can function as both IDS and IPS.
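The difference between the two roles can be sketched with a single detection rule that is applied in two modes. The `Inspector` class, the `TOY-EXPLOIT` pattern, and the packet contents are hypothetical and do not reflect the configuration format of any real tool.

```python
from dataclasses import dataclass, field


@dataclass
class Inspector:
    mode: str                                # "ids" (alert only) or "ips" (block inline)
    bad_patterns: tuple = (b"TOY-EXPLOIT",)  # made-up pattern for illustration
    alerts: list = field(default_factory=list)

    def handle(self, packet: bytes) -> bool:
        """Inspect one packet; return True if it is forwarded to the target."""
        suspicious = any(p in packet for p in self.bad_patterns)
        if suspicious:
            self.alerts.append(packet)   # both modes record an alert
            if self.mode == "ips":
                return False             # only the IPS drops the packet
        return True


attack = b"GET /?q=TOY-EXPLOIT HTTP/1.1"
ids, ips = Inspector("ids"), Inspector("ips")

print(ids.handle(attack), len(ids.alerts))  # True 1
print(ips.handle(attack), len(ips.alerts))  # False 1
```

In IDS mode the suspicious packet still reaches the target and only an alert remains. In IPS mode the same rule drops it, which is also what would happen to a legitimate packet that happened to match.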

Evasion: How Malware Fights Back Against Detection

Signature-based AV is only as good as its database, and malware authors know it.

Encrypted virus: the virus body is encrypted; only a small decryption engine is stored in plaintext. Each propagation uses a freshly generated key, producing a different ciphertext blob, so a signature on the body bytes no longer matches. The AV must instead match the decryption engine, or emulate the file until the body decrypts itself. Emulation is not guaranteed to work: the virus may detect the emulator and refuse to decrypt.
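The effect on signatures can be seen with harmless placeholder bytes. The sketch assumes a repeating-key XOR as the cipher and a "stub, key, encrypted body" layout. Both are simplifications chosen for illustration, and `BODY`, `STUB`, and the keys are invented values.

```python
BODY = b"PLACEHOLDER-BODY-not-real-code"  # stands in for the encrypted body
STUB = b"TOY-DECRYPT-STUB"                # stands in for the plaintext decryption engine


def xor(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR; applying it twice with the same key restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def make_copy(key: bytes) -> bytes:
    """Build one 'generation': plaintext stub, then key, then encrypted body."""
    return STUB + key + xor(BODY, key)


copy_a = make_copy(bytes.fromhex("a1b2c3d4"))
copy_b = make_copy(bytes.fromhex("0f1e2d3c"))

print(copy_a == copy_b)                # False
print(BODY in copy_a, BODY in copy_b)  # False False
print(STUB in copy_a, STUB in copy_b)  # True True
```

A signature taken from the body matches neither copy, while the unchanged stub still matches both, which is why scanners target the decryption engine.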

Polymorphic virus: an encrypted virus that also mutates its decryption engine on each infection (e.g., inserts padding/junk instructions, reorders equivalent instructions). No two copies share the same engine bytes, defeating simple engine signatures.
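A toy model of junk insertion, with the engine represented as a list of mnemonic strings rather than real machine code. The mnemonics, the `mutate` and `normalize` helpers, and the seed are hypothetical. Stripping `NOP` is only one simple normalization a scanner might try.

```python
import random

ENGINE = ["MOV", "XOR", "INC", "LOOP"]  # toy stand-in for a decryption engine


def mutate(engine, rng):
    """Insert 0 to 2 'NOP' junk entries before each instruction."""
    out = []
    for ins in engine:
        out.extend(["NOP"] * rng.randint(0, 2))
        out.append(ins)
    return out


def normalize(engine):
    """Strip junk so that variants can be compared on their real instructions."""
    return [ins for ins in engine if ins != "NOP"]


rng = random.Random(7)  # fixed seed so runs are repeatable
variant_a = mutate(ENGINE, rng)
variant_b = mutate(ENGINE, rng)

print(variant_a)
print(variant_b)
print(normalize(variant_a) == normalize(variant_b) == ENGINE)  # True
```

An exact-match signature on the mutated sequence is fragile, while comparing normalized forms still links both variants to the same engine. Reordering equivalent instructions would defeat this particular normalization, which is part of why such viruses are hard to catch statically.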

Metamorphic virus: goes further. The entire virus body is rewritten each generation using code permutation and instruction substitution (e.g., replacing an ADD with an equivalent sequence of INC instructions). No encryption is even needed; the code itself looks different every time. Metamorphic viruses are the hardest to detect with static signatures.

Defenses Summary

No single control is sufficient. Effective defense layers:

  1. Keep software patched — eliminates the known-vulnerability entry points
  2. Principle of least privilege — limits damage if malware runs
  3. Email/web filtering — blocks malicious attachments and drive-by sites at the perimeter
  4. Endpoint AV with behavioral detection — catches known and heuristically-suspicious samples
  5. Network IDS/IPS (e.g., Snort) — monitors traffic for C&C beacons, exploit payloads
  6. Sandboxed execution / application allowlisting — prevents unauthorized code from running
  7. Offline / out-of-band forensics — for rootkit investigation, boot from trusted external media so the compromised OS cannot filter results
  8. Backups — the primary recovery path when ransomware strikes

Key Takeaways

Practice

  1. What is the defining difference between a virus and a worm?
  2. What is a rootkit's primary technique for staying hidden once installed?
  3. In a botnet, what is the role of the C&C (command-and-control) server?
  4. Why does pure signature-based detection have a structural blind spot for zero-day malware?
  5. What distinguishes a metamorphic virus from a polymorphic virus?
  6. Why is sandboxing (running a suspicious file in an isolated VM and watching its behavior) not a complete answer to zero-day malware?
  7. Why is it standard practice in incident response, when a rootkit is suspected, to boot from external trusted media rather than rely on the host's own utilities to investigate?
  8. Modern antivirus combines signature and heuristic/behavioral detection. Briefly explain the trade-off each makes and why neither alone is sufficient.
  9. Stuxnet spread into air-gapped Iranian nuclear facilities. Which initial-access vector did it primarily rely on, given the targets had no Internet connectivity?