Malware

Many of the major security incidents you read about (a hospital's systems locked by ransomware, millions of credit-card numbers stolen, nuclear centrifuges silently destroyed) trace back to malware: software that runs on a machine and does what the attacker wants rather than what the owner intended. Understanding malware is not just about naming categories; it is about seeing the full attack lifecycle, from initial compromise through persistence, payload execution, and evasion of detection.

What Is Malware?

Malware (malicious software) is any set of instructions that executes on a computer to serve an attacker's goals. The key distinction from a vulnerability: a vulnerability is a flaw, while malware is code that may exploit such a flaw to get in or may rely on entirely non-technical means such as social engineering. A single piece of malware often combines techniques across the attack chain: for example, a buffer-overflow exploit or a phishing lure for initial access, a rootkit to stay hidden, and file-encrypting ransomware as its payload.

Taxonomy: Types of Malware

Type	Core behavior
Virus	Attaches itself to legitimate files/programs; spreads when the infected file is executed
Worm	Self-replicates across networks without user action; exploits network services
Trojan horse	Appears to do something useful; hides malicious functionality (e.g., spyware bundled with a free app)
Ransomware	Encrypts the victim's files and demands payment (e.g., cryptocurrency) to restore access
Rootkit	Modifies the OS to hide malware files, processes, and network connections
Bot / Botnet	A bot is a compromised machine controlled remotely by a botmaster via a command-and-control (C&C) server; a botnet is a collection of such bots
Spyware	Covertly collects user data (keystrokes, screenshots, documents)
Adware	Serves unwanted advertisements, often as a revenue mechanism
Keylogger	Specifically captures keystrokes; can be software or hardware (a USB dongle between keyboard and PC)
Mobile malware	Targets smartphone operating systems (Android, iOS)

Types frequently co-occur: a trojan may install a rootkit that installs a bot agent that enables ransomware to be pushed later.

What Can Malware Do? (Payloads)

Once running, malware can:

Surveil — keylogging, screen/camera capture, document exfiltration
Control — launch a reverse shell or remote desktop so the attacker can interact with the machine in real time
Extort — encrypt all user files and display a ransom note (WannaCry is the canonical example: it encrypted files and demanded Bitcoin payment, with a countdown timer before the price doubled)
Disrupt — deliver pop-ups (adware), delete data, or even physically damage hardware (Stuxnet caused Iranian centrifuges to spin at destructive speeds)
Amplify — enroll the machine into a botnet for DDoS attacks, brute-force password campaigns, or cryptocurrency mining

How Does Malware Get Onto a System?

Getting the malware to execute is its own challenge. Common initial-access vectors:

Exploit a network service: buffer or integer overflow in an HTTP, RPC, or file-sharing daemon (the attacker sends a crafted packet; the server executes injected code)
Client-side exploit: malicious PDF, Word macro, Flash object, or browser exploit that triggers when an unsuspecting user opens or views the file
Social engineering: trick the user into running the malware directly (phishing email with an attachment, fake software installer, malicious link)
Autorun / physical media: a USB drive that runs code through autorun or a shortcut-handling flaw when it is plugged in (Stuxnet is widely reported to have reached air-gapped Iranian facilities on infected USB drives)
Drive-by download: user visits a compromised or malicious web page; an exploit kit fingerprints the browser and silently delivers the best-matching exploit
Insider / local access: a malicious or compromised employee with physical or account access

Rootkits: Hiding in Plain Sight

A rootkit's defining goal is concealment. It modifies OS internals so that standard utilities (ls, ps, netstat) cannot see the malware's files, processes, or network connections. The mechanism:

Intercept system calls: when a tool such as ls or ps lists a directory or /proc through readdir() (backed by the getdents system call on Linux), the rootkit interposes between the application and the kernel
Filter results: if a returned entry belongs to the rootkit (for example, a process whose PID equals the rootkit's own PID), suppress it from the listing: if (PID == rootkit_PID) → don't show
Consequence: detection tools that rely on the OS API get the same filtered output, so a compromised OS cannot be trusted to audit itself; clean-boot forensics (booting from trusted external media) is required
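
The filtering step, and why an out-of-band view defeats it, can be sketched with a toy model. This is a minimal illustration that touches no real OS: the process table is a plain Python list, the "hook" is an ordinary function, and every name (HIDDEN_PID, raw_process_table, hooked_process_list, cross_view_diff) is a hypothetical introduced here.

```python
# Toy model of rootkit filtering and cross-view detection.
# Nothing here hooks a real kernel; the data is made up for illustration.

HIDDEN_PID = 4242  # hypothetical PID the simulated rootkit wants to hide


def raw_process_table():
    """Ground truth, as a scan from trusted external media would see it."""
    return [1, 350, 977, HIDDEN_PID, 5120]


def hooked_process_list():
    """What tools running on the compromised OS see after filtering."""
    return [pid for pid in raw_process_table() if pid != HIDDEN_PID]


def cross_view_diff(trusted_view, untrusted_view):
    """Return entries present in the trusted view but missing from the other."""
    return sorted(set(trusted_view) - set(untrusted_view))


hidden = cross_view_diff(raw_process_table(), hooked_process_list())
print(hidden)  # [4242]
```

Comparing the two views exposes exactly the entry the filter suppressed, which is the reason the trusted view has to come from outside the compromised OS.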

An extreme variant is the virtualization-based rootkit ("Blue Pill"): the rootkit inserts a tiny hypervisor (VMM) beneath the running OS so that the OS continues running as a guest virtual machine, completely unaware it has been relocated. The VMM, and any attacker software running alongside it, sits below the target OS stack and is invisible to it.

Botnets: Malware at Scale

A botnet is a collection of compromised machines (bots) under unified control of a botmaster. Key properties:

Decoupled compromise and control — the method used to infect (worm, trojan, malicious URL) is separate from the ongoing control channel
Upon infection, the new bot "phones home" to a command-and-control (C&C) server to register itself
The botmaster pushes commands via C&C: DDoS attacks (e.g., ping floods), brute-force password attacks, spam relay, cryptocurrency mining
Modern botnets route C&C traffic through anonymizing infrastructure (e.g., Tor) to frustrate takedown; the PgMiner botnet used PostgreSQL as an initial compromise vector and Tor for C&C
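
From the defender's side, the phone-home behavior can leave a timing footprint. The sketch below is one possible heuristic, not something prescribed by the text above: it assumes a bot checks in at roughly fixed intervals and flags a connection series whose gaps are unusually regular. The function name, the thresholds, and the sample timestamps are all illustrative assumptions.

```python
from statistics import mean, pstdev


def looks_like_beacon(timestamps, max_jitter_ratio=0.1, min_events=5):
    """Flag a series of connection times (in seconds) with very regular gaps.

    Assumption: periodic check-ins show low variation between gaps,
    while human-driven traffic is bursty.
    """
    if len(timestamps) < min_events:
        return False
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average_gap = mean(gaps)
    if average_gap <= 0:
        return False
    # Coefficient of variation: standard deviation relative to the mean gap.
    return pstdev(gaps) / average_gap <= max_jitter_ratio


bot_checkins = [0, 60, 121, 180, 241, 300]    # roughly every 60 seconds
human_browsing = [0, 4, 95, 110, 400, 415]    # irregular bursts

print(looks_like_beacon(bot_checkins))    # True
print(looks_like_beacon(human_browsing))  # False
```

A real bot can add random jitter to its check-ins, so a heuristic like this is at best one signal among several.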

Detection Approaches

Signature-Based Detection

The traditional antivirus model: maintain a database of malware signatures (byte sequences or instruction patterns unique to each known malware family). Scan candidate files and look for matches.

Fast — pattern matching is computationally cheap
Precise for known threats — low false-positive rate on catalogued malware
Blind to new/unknown malware — any sample not yet in the database is missed (zero-day problem)
Signatures are proprietary and must be kept continuously updated
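
A minimal sketch of the matching step, assuming signatures are plain byte sequences. The database contents and family names below are invented placeholders, and real engines use far more efficient multi-pattern matching than a substring loop.

```python
# Hypothetical signature database: family name -> byte pattern.
# Both patterns are inert placeholders, not real malware signatures.
SIGNATURES = {
    "Demo.FamilyA": bytes.fromhex("deadbeef0badf00d"),
    "Demo.FamilyB": b"INERT-TEST-MARKER",
}


def scan_bytes(data):
    """Return the sorted names of all signatures found in the data."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern in data)


def scan_file(path):
    """Read a file in binary mode and scan its full contents."""
    with open(path, "rb") as handle:
        return scan_bytes(handle.read())


sample = b"header" + bytes.fromhex("deadbeef0badf00d") + b"trailer"
print(scan_bytes(sample))            # ['Demo.FamilyA']
print(scan_bytes(b"ordinary data"))  # []
```

The second call shows the blind spot: anything whose bytes are not in the database comes back clean, whether or not it is malicious.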

Allow/Block Listing (Hash-Based)

Maintain a database of cryptographic hashes of known-good files (OS binaries, popular applications) and known-bad files (confirmed malware). Compare the hash of each file on disk:

Hash match in blocklist → quarantine
Hash match in allowlist → trusted
No match → unknown (potential new malware or custom software)
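
The three-way decision above can be sketched as follows, assuming SHA-256 as the hash and in-memory sets as the lists. The list contents are placeholder byte strings standing in for real files.

```python
import hashlib


def sha256_hex(data):
    """Return the SHA-256 digest of a byte string as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical lists, built from placeholder contents for illustration.
ALLOWLIST = {sha256_hex(b"known good binary")}
BLOCKLIST = {sha256_hex(b"known bad sample")}


def classify(data):
    """Map file contents to 'quarantine', 'trusted', or 'unknown'."""
    digest = sha256_hex(data)
    if digest in BLOCKLIST:
        return "quarantine"
    if digest in ALLOWLIST:
        return "trusted"
    return "unknown"


print(classify(b"known bad sample"))   # quarantine
print(classify(b"known good binary"))  # trusted
print(classify(b"known bad sample!"))  # unknown
```

The last call differs from a blocklisted file by a single byte and still lands in "unknown", which is why hash lists only identify exact copies.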

Heuristic / Behavioral Analysis

Useful for zero-day malware that has no signature yet. Two sub-approaches:

Static code analysis — inspect the binary's instructions without running it; flag code that contains patterns associated with malicious behavior (e.g., instructions to delete system files, enumerate processes, open raw network sockets)
Dynamic analysis / sandboxing — execute the file in an isolated emulation environment; monitor what the program actually does (file writes, network calls, registry changes); if the observed behavior is harmful, classify as malware

Sandboxing is powerful but malware authors fight back: some samples detect they are running inside a VM or emulator and refuse to execute their malicious payload until they confirm they are on a real machine.
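
One simple way to turn observed behavior into a verdict is weighted scoring over the sandbox's event log. The sketch below assumes the sandbox emits events as short strings; the event names, weights, and threshold are all made up for illustration and would need tuning against real data.

```python
# Hypothetical weights for behaviors a sandbox might report.
SUSPICIOUS_WEIGHTS = {
    "delete_system_file": 5,
    "mass_file_rewrite": 5,
    "modify_registry_autorun": 3,
    "open_raw_socket": 2,
    "enumerate_processes": 1,
}
THRESHOLD = 6  # assumed cut-off; a real product would calibrate this


def score_events(events):
    """Sum the weights of all observed events; unknown events score zero."""
    return sum(SUSPICIOUS_WEIGHTS.get(event, 0) for event in events)


def verdict(events):
    """Classify a run as 'malicious' or 'benign' from its event log."""
    return "malicious" if score_events(events) >= THRESHOLD else "benign"


print(verdict(["enumerate_processes", "mass_file_rewrite", "open_raw_socket"]))  # malicious
print(verdict(["enumerate_processes", "read_config"]))                           # benign
print(verdict([]))                                                               # benign
```

The empty log in the last call is what a sandbox-aware sample produces when it refuses to run its payload: no observed behavior, so no detection.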

IDS and IPS

Term	Role
IDS (Intrusion Detection System)	Observes traffic or system activity and raises an alert after suspicious behavior is detected
IPS (Intrusion Prevention System)	Actively blocks or drops traffic/actions before they reach the target
HIDS / NIDS	Host-based vs. network-based placement

Detection is inherently reactive: by the time the alert fires, the attack may already have succeeded (for example, memory may already be corrupted). Prevention is proactive but requires high confidence to avoid blocking legitimate traffic. Tools like Snort can run as an IDS or, when deployed inline, as an IPS.
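
The difference between the two roles comes down to what happens after a rule matches. The sketch below applies the same rule set in both modes; it is a toy model with an invented rule and does not reflect Snort's actual rule syntax or API.

```python
# Hypothetical rule set: rule name -> byte pattern (an inert test string).
RULES = {"demo-exploit-marker": b"EXPLOIT-TEST-STRING"}


def inspect(packet, mode):
    """Return (forwarded, alerts) for a packet in 'ids' or 'ips' mode.

    Both modes raise the same alerts; only 'ips' withholds the packet.
    """
    if mode not in ("ids", "ips"):
        raise ValueError("mode must be 'ids' or 'ips'")
    alerts = [name for name, pattern in RULES.items() if pattern in packet]
    forwarded = not (mode == "ips" and alerts)
    return forwarded, alerts


bad = b"GET /EXPLOIT-TEST-STRING HTTP/1.1"
print(inspect(bad, "ids"))  # (True, ['demo-exploit-marker'])
print(inspect(bad, "ips"))  # (False, ['demo-exploit-marker'])
print(inspect(b"GET /index.html HTTP/1.1", "ips"))  # (True, [])
```

In IDS mode the flagged packet still reaches the target, which is the reactive gap described above; in IPS mode a false match would drop legitimate traffic instead.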

Evasion: How Malware Fights Back Against Detection

Signature-based AV is only as good as its database, and malware authors know it.

Encrypted virus: the virus body is encrypted; only a small decryption engine is stored in plaintext. Each propagation uses a freshly generated key, producing a different ciphertext blob, so a signature on the body never matches. The AV must instead match the unchanging decryption engine, or emulate the file until the body decrypts itself in memory. Emulation can fail, however: the virus may detect the emulator and refuse to decrypt.

Polymorphic virus: an encrypted virus that also mutates its decryption engine on each infection (e.g., inserts padding/junk instructions, reorders equivalent instructions). No two copies share the same engine bytes, defeating simple engine signatures.
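
The effect of both techniques on byte signatures can be seen with inert data. In this sketch the "body" and the "decryption engine" are harmless ASCII strings that never execute, XOR stands in for whatever cipher a real sample would use, and junk insertion is modeled as filler bytes between the engine's bytes. All names and constants are illustrative assumptions.

```python
import random

BODY = b"INERT-DEMO-BODY: nothing in this string executes"
STUB = b"[decrypt-stub-v1]"  # stand-in for the plaintext decryption engine
KEY_LEN = 8
JUNK = 0x90  # filler byte chosen because it never occurs in STUB


def xor_bytes(data, key):
    """XOR data with a repeating key (toy cipher, applied twice it undoes itself)."""
    return bytes(byte ^ key[i % len(key)] for i, byte in enumerate(data))


def fresh_key():
    """Random key with nonzero bytes, so every body byte changes."""
    return bytes(random.randrange(1, 256) for _ in range(KEY_LEN))


def encrypted_copy():
    """Constant stub, then the key, then the body encrypted under that key."""
    key = fresh_key()
    return STUB + key + xor_bytes(BODY, key)


def polymorphic_copy():
    """Like encrypted_copy, but with 1 to 3 filler bytes after each stub byte."""
    key = fresh_key()
    mutated = bytearray()
    for byte in STUB:
        mutated.append(byte)
        mutated.extend([JUNK] * random.randint(1, 3))
    return bytes(mutated) + key + xor_bytes(BODY, key)


def recover_body(copy):
    """What emulation would reveal for an encrypted_copy: the constant body."""
    key = copy[len(STUB):len(STUB) + KEY_LEN]
    return xor_bytes(copy[len(STUB) + KEY_LEN:], key)


first, second = encrypted_copy(), encrypted_copy()
print(STUB in first and STUB in second)  # the stub is a usable signature
print(recover_body(first) == BODY)       # decryption exposes the same body
print(polymorphic_copy().startswith(STUB))
```

In this model a body signature has nothing stable to match in the encrypted copies, a stub signature still works until the stub itself is mutated, and the recovered body stays constant, which is why emulating the decryption remains useful against both.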

Metamorphic virus: goes further by rewriting the entire virus body each generation using code permutation and instruction substitution (e.g., replacing an ADD of 1 with an equivalent INC). No encryption is even needed; the code itself looks different every time. Metamorphic viruses are the hardest to detect with static signatures.

Defenses Summary

No single control is sufficient. Effective defense layers:

Keep software patched: eliminates the known-vulnerability entry points
Principle of least privilege: limits damage if malware runs
Email/web filtering: blocks malicious attachments and drive-by sites at the perimeter
Endpoint AV with behavioral detection: catches known and heuristically suspicious samples
Network IDS/IPS (e.g., Snort): monitors traffic for C&C beacons and exploit payloads
Sandboxed execution / application allowlisting: sandboxing contains untrusted code; allowlisting prevents unauthorized code from running
Offline / out-of-band forensics: for rootkit investigation, boot from trusted external media so the compromised OS cannot filter results
Backups: the primary recovery path when ransomware strikes; keep copies offline so the malware cannot encrypt them too

Key Takeaways

Malware is software that serves the attacker's goals; it is distinct from (but may exploit) vulnerabilities.
The main types (viruses, worms, trojans, ransomware, rootkits, botnets, spyware, keyloggers) differ primarily in how they spread, how they hide, and what payload they deliver.
Rootkits hide by intercepting OS system calls; virtualization-based rootkits go further by inserting a hypervisor under the entire OS.
Botnets decouple compromise from control: bots phone home to a C&C server and can be tasked with DDoS, spam, or mining at scale.
Signature-based detection is fast but blind to unknown malware; heuristic/behavioral detection catches zero-days but is slower and can be fooled by sandbox-aware malware.
Encrypted, polymorphic, and metamorphic viruses are specifically engineered to defeat signature scanning by mutating their byte patterns on each infection.
Defense requires layered controls: patching, least privilege, AV, network IDS/IPS, sandboxing, and reliable offline backups.