Superblock and Inode
Why this matters
Every filesystem must answer four fundamental questions at runtime:
- How do we track disk/file metadata (permissions, size, type)?
- How do we track free space so we know where to write next?
- How do we find all the disk blocks belonging to a given file?
- How do we map human-readable paths to files?
The superblock and inode structures are xv6's answers to the first three. Understanding them is the key to understanding how any Unix-like filesystem works — the same ideas appear in ext2/ext4, UFS, and ZFS.
Disk Layout Overview
xv6 treats the disk as a flat array of 512-byte blocks. mkfs lays them out in a fixed order:
| Blocks | Contents |
|---|---|
| 0 | Boot block (unused by xv6) |
| 1 | Superblock |
| 2–31 | Log (journaling, covered later) |
| 32–57 | Inode blocks |
| 58 | Free-space bitmap |
| 59… | Data blocks |
None of these block numbers (32, 58, …) is hard-coded in the kernel. They are read from the superblock at mount time, so the layout can be tuned when the image is created.
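The arithmetic behind the layout can be sketched in a few lines of C. This is a minimal sketch, not xv6's mkfs code: the `struct layout` type and the `compute_layout` name are hypothetical, and the round-up rule (`n / per_block + 1`) is an assumption chosen because it reproduces the block numbers in the table above. The 64-byte inode size is taken from the dinode discussion later in this section.

```c
#include <assert.h>
#include <stdint.h>

#define BSIZE 512                  // block size in bytes
#define DINODE_SIZE 64             // bytes per on-disk inode
#define IPB (BSIZE / DINODE_SIZE)  // inodes per block: 8
#define BPB (BSIZE * 8)            // bitmap bits per block: 4096

// Hypothetical helper type for this sketch; not an xv6 structure.
struct layout {
    uint32_t logstart;    // first log block
    uint32_t inodestart;  // first inode block
    uint32_t bmapstart;   // first bitmap block
    uint32_t datastart;   // first data block
    uint32_t nblocks;     // number of data blocks
};

// Derive the region boundaries from the three sizing parameters.
struct layout compute_layout(uint32_t size, uint32_t ninodes, uint32_t nlog)
{
    struct layout l;
    uint32_t ninodeblocks = ninodes / IPB + 1;  // assumed round-up rule
    uint32_t nbitmap = size / BPB + 1;          // one bitmap bit per block

    l.logstart = 2;  // block 0 is the boot block, block 1 the superblock
    l.inodestart = l.logstart + nlog;
    l.bmapstart = l.inodestart + ninodeblocks;
    l.datastart = l.bmapstart + nbitmap;
    assert(size > l.datastart);  // the image must leave room for data
    l.nblocks = size - l.datastart;
    return l;
}
```

Calling `compute_layout(1000, 200, 30)` yields inode blocks starting at 32, the bitmap at 58, and data at 59, matching the table.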
The Superblock
The superblock lives in block 1 and records the global parameters of the filesystem image.
struct superblock {
uint size; // Total blocks in the image
uint nblocks; // Data blocks
uint ninodes; // Number of inodes
uint nlog; // Log blocks
uint logstart; // First log block
uint inodestart; // First inode block
uint bmapstart; // First bitmap block
};
mkfs writes this once; the kernel reads it at boot with readsb(). Everything else on disk is located relative to the offsets stored here — that is what makes the layout flexible.
Reading the raw bytes
You can inspect a real xv6 disk image with xxd:
$ xxd -s 512 -l 512 fs.img
00000200: e803 0000 ad03 0000 c800 0000 1e00 0000 ................
00000210: 0200 0000 2000 0000 3a00 0000 ...
Decoding (little-endian 32-bit integers):
| Field | Hex | Decimal |
|---|---|---|
| size | e8030000 | 1000 |
| nblocks | ad030000 | 941 |
| ninodes | c8000000 | 200 |
| nlog | 1e000000 | 30 |
| logstart | 02000000 | 2 |
| inodestart | 20000000 | 32 |
| bmapstart | 3a000000 | 58 |
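The same decoding can be done programmatically. The sketch below assumes the seven fields are stored back to back as little-endian 32-bit integers, as the dump suggests; `struct sb_fields`, `le32`, `parse_superblock`, and `sample_sb` are illustrative names, not part of xv6.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Same field order as struct superblock in the text, with fixed-width types.
struct sb_fields {
    uint32_t size, nblocks, ninodes, nlog, logstart, inodestart, bmapstart;
};

// Assemble a 32-bit value from four bytes stored least-significant first.
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Decode the seven fields from the first 28 bytes of the superblock block.
struct sb_fields parse_superblock(const unsigned char *block)
{
    struct sb_fields sb;
    assert(block != NULL);
    sb.size       = le32(block + 0);
    sb.nblocks    = le32(block + 4);
    sb.ninodes    = le32(block + 8);
    sb.nlog       = le32(block + 12);
    sb.logstart   = le32(block + 16);
    sb.inodestart = le32(block + 20);
    sb.bmapstart  = le32(block + 24);
    return sb;
}

// The first 28 bytes shown in the xxd dump above.
static const unsigned char sample_sb[28] = {
    0xe8, 0x03, 0x00, 0x00,  0xad, 0x03, 0x00, 0x00,
    0xc8, 0x00, 0x00, 0x00,  0x1e, 0x00, 0x00, 0x00,
    0x02, 0x00, 0x00, 0x00,  0x20, 0x00, 0x00, 0x00,
    0x3a, 0x00, 0x00, 0x00,
};
```

Assembling the bytes by hand rather than casting the buffer to a struct keeps the sketch independent of the host machine's byte order.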
Block I/O API
Two layers of functions sit between filesystem code and raw disk I/O: block-level helpers in fs.c and the buffer cache in bio.c.
| Function | File | Purpose |
|---|---|---|
| readsb(dev, sb) | fs.c | Populate a superblock struct from disk |
| balloc(dev) | fs.c | Allocate a zeroed data block; returns block number |
| bfree(dev, b) | fs.c | Mark block b free in the bitmap |
| bread(dev, blockno) | bio.c | Return a buffer (possibly cached) for a block |
| bwrite(buf) | bio.c | Flush a buffer to disk |
| bget(dev, blockno) | bio.c | Internal: locate or create a buffer cache entry |
bread/bwrite operate on a buffer cache so that frequently accessed blocks (like inode blocks) stay in RAM.
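To make the balloc and bfree rows concrete, here is a toy, in-memory model of a free-space bitmap. It is a sketch under simplifying assumptions: the bitmap and the "disk" are plain arrays rather than blocks fetched through bread, there is no device argument or log, and every `toy_` name is hypothetical. The sketch returns -1 when the disk is full, which is a choice made for this example only.

```c
#include <assert.h>
#include <string.h>

#define TOY_BSIZE 512
#define TOY_NBLOCKS 64  // hypothetical tiny disk

static unsigned char toy_bitmap[TOY_NBLOCKS / 8];     // one bit per block, 1 = in use
static unsigned char toy_disk[TOY_NBLOCKS][TOY_BSIZE];

// Find the first clear bit, set it, zero the block, and return its number.
// Returns -1 if every block is in use.
int toy_balloc(void)
{
    for (int b = 0; b < TOY_NBLOCKS; b++) {
        unsigned char mask = (unsigned char)(1u << (b % 8));
        if ((toy_bitmap[b / 8] & mask) == 0) {
            toy_bitmap[b / 8] |= mask;                   // mark in use
            memset(toy_disk[b], 0, sizeof toy_disk[b]);  // hand out a zeroed block
            return b;
        }
    }
    return -1;
}

// Clear the bit for block b so a later toy_balloc can reuse it.
void toy_bfree(int b)
{
    unsigned char mask = (unsigned char)(1u << (b % 8));
    assert(b >= 0 && b < TOY_NBLOCKS);
    assert(toy_bitmap[b / 8] & mask);  // freeing a free block is a bug
    toy_bitmap[b / 8] &= (unsigned char)~mask;
}
```

Because allocation always scans from the start, a freed low-numbered block is the first one handed out again.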
The On-Disk Inode (dinode)
Every file (regular file, directory, or device) gets one on-disk inode (struct dinode):
#define NDIRECT 12 // Must be defined before struct dinode uses it

struct dinode {
short type; // T_DIR, T_FILE, or T_DEV
short major; // Major device number (T_DEV only)
short minor; // Minor device number (T_DEV only)
short nlink; // Hard link count
uint size; // File size in bytes
uint addrs[NDIRECT+1]; // Block addresses (12 direct + 1 indirect)
};
Field meanings
| Field | Meaning |
|---|---|
| type | T_DIR = directory, T_FILE = regular file, T_DEV = device |
| major/minor | Identify the device driver for T_DEV files |
| nlink | How many directory entries point to this inode (hard links) |
| size | Logical file size in bytes |
| addrs[0..11] | Direct pointers: each holds the block number of a 512-byte data block |
| addrs[12] | Indirect pointer: points to a block full of 128 more block numbers |
Size of dinode
- short: 2 bytes × 4 fields = 8 bytes
- uint: 4 bytes × (1 + 13) fields = 56 bytes
- Total: 64 bytes per dinode
Inodes per block (IPB)
#define IPB (BSIZE / sizeof(struct dinode)) // 512 / 64 = 8
Eight dinodes pack into each 512-byte inode block. To find which block holds inode i:
#define IBLOCK(i, sb) ((i) / IPB + sb.inodestart)
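The lookup can be checked with a small standalone program. This sketch redeclares the dinode layout with fixed-width types and adds a byte-offset helper. The function names `inode_block` and `inode_offset` are hypothetical, and the 64-byte size check assumes a typical ABI with 2-byte short and no padding.

```c
#include <assert.h>
#include <stdint.h>

#define BSIZE 512
#define NDIRECT 12

// On-disk inode as described in the text, using fixed-width integers.
struct dinode {
    short type;
    short major;
    short minor;
    short nlink;
    uint32_t size;
    uint32_t addrs[NDIRECT + 1];
};

// Compile-time check of the size arithmetic (assumes 2-byte short).
_Static_assert(sizeof(struct dinode) == 64, "dinode should be 64 bytes");

#define IPB (BSIZE / sizeof(struct dinode))  // 8 inodes per block

// Disk block that holds inode inum; same arithmetic as IBLOCK.
uint32_t inode_block(uint32_t inum, uint32_t inodestart)
{
    assert(inodestart >= 2);  // blocks 0 and 1 are boot block and superblock
    return (uint32_t)(inum / IPB) + inodestart;
}

// Byte offset of inode inum inside that block.
uint32_t inode_offset(uint32_t inum)
{
    return (uint32_t)((inum % IPB) * sizeof(struct dinode));
}
```

With inodestart = 32, inode 20 lives in block 34 at byte offset 256, since 20 / 8 = 2 and 20 % 8 = 4.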
Maximum file size
- 12 direct blocks × 512 B = 6,144 B
- 1 indirect block → 512 / 4 = 128 block addresses → 128 × 512 B = 65,536 B
- Total: (12 + 128) × 512 = 71,680 bytes ≈ 70 KB
ext2 lifts this limit with double- and triple-indirect blocks; ext4 goes further with extent trees.
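The direct/indirect split can be expressed as a small mapping from a byte offset to the slot that holds its block number. This is a sketch of the address arithmetic only, not xv6's block-mapping code: `struct block_slot` and `locate_file_block` are hypothetical names, and no disk access or allocation is modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BSIZE 512
#define NDIRECT 12
#define NINDIRECT (BSIZE / sizeof(uint32_t))  // 128 addresses per indirect block
#define MAXFILE (NDIRECT + NINDIRECT)         // 140 blocks

// Hypothetical result type: where the block number for an offset is stored.
struct block_slot {
    int indirect;    // 0: slot is addrs[index]; 1: slot is entry index of the indirect block
    uint32_t index;
};

// Map a byte offset to its slot. Returns 0 on success, -1 if the offset
// lies beyond the largest file the inode can describe.
int locate_file_block(uint32_t off, struct block_slot *out)
{
    uint32_t bn = off / BSIZE;  // logical block number within the file
    assert(out != NULL);

    if (bn < NDIRECT) {
        out->indirect = 0;
        out->index = bn;
        return 0;
    }
    bn -= NDIRECT;
    if (bn < NINDIRECT) {
        out->indirect = 1;
        out->index = bn;
        return 0;
    }
    return -1;
}
```

Offset 6,143 is the last byte reachable through a direct pointer, offset 6,144 is the first byte behind the indirect block, and offset 71,680 is out of range.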
Hard links and nlink
$ echo hello > test
$ ln test tt # creates a second directory entry pointing to the same inode
$ ls -i
20 test 20 tt # same inode number!
Both test and tt have inode 20. The inode's nlink field is 2. When you rm test, the kernel decrements nlink to 1 but keeps the data. Only when nlink reaches 0 (and no process has the file open) are the blocks freed.
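The interplay between the two counters can be modeled with a toy structure. This is a deliberately simplified sketch: `struct toy_file` and its functions are hypothetical, `ref` stands for in-memory references such as an open file, and the real kernel performs the check when the last reference is dropped rather than in a shared helper.

```c
#include <assert.h>
#include <stdbool.h>

// Toy stand-in for an inode: only the two counters and a "freed" marker.
struct toy_file {
    int nlink;          // directory entries naming this inode
    int ref;            // in-memory references (for example, open files)
    bool blocks_freed;  // set once the data blocks would be released
};

// Data is released only when no name and no reference remains.
static void toy_maybe_free(struct toy_file *f)
{
    if (f->nlink == 0 && f->ref == 0)
        f->blocks_freed = true;
}

void toy_link(struct toy_file *f)   { f->nlink++; }  // like ln
void toy_open(struct toy_file *f)   { f->ref++; }    // like open

void toy_unlink(struct toy_file *f)                  // like rm
{
    assert(f->nlink > 0);
    f->nlink--;
    toy_maybe_free(f);
}

void toy_close(struct toy_file *f)                   // like close
{
    assert(f->ref > 0);
    f->ref--;
    toy_maybe_free(f);
}
```

Removing both names while the file is still open leaves the data intact; the blocks are released only after the final close.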
The In-Memory Inode
Reading from disk on every access would be slow. The kernel caches active inodes as struct inode (in file.h):
struct inode {
uint dev; // Which device this inode lives on
uint inum; // Inode number (index into inode table)
int ref; // Reference count (in-memory pointers to this inode: open files, current directories, ...)
struct sleeplock lock; // Protects inode contents; only one thread modifies at a time
int flags; // I_VALID: inode data has been read from disk
// Mirror of dinode fields:
short type;
short major;
short minor;
short nlink;
uint size;
uint addrs[NDIRECT+1];
};
dinode vs inode
| Field | dinode (disk) | inode (memory) | Purpose |
|---|---|---|---|
| type…addrs | ✓ | ✓ (copy) | File metadata |
| inum | implicit (position) | explicit field | Which inode this is |
| dev | — | ✓ | Supports multiple disks |
| ref | — | ✓ | Counts in-memory references (open files, current directories, transient kernel pointers) |
| lock | — | ✓ | Concurrent access control |
| flags | — | ✓ | Validity of cached data |
Key inode API
| Function | Purpose |
|---|---|
| ialloc(dev, type) | Allocate a new inode on disk |
| iget(dev, inum) | Find or allocate the in-memory inode for inum without reading the disk; increments ref |
| ilock(ip) | Acquire the inode lock; reads from disk if I_VALID is not set |
| iunlock(ip) | Release the inode lock |
| iupdate(ip) | Write in-memory inode back to disk |
| iput(ip) | Drop a reference; frees the inode when ref and nlink both hit 0 |
| readi(ip, dst, off, n) | Read n bytes at offset off from file |
| writei(ip, src, off, n) | Write n bytes at offset off into file |
The pattern is: iget → ilock → read/modify → iunlock → iput.
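A toy inode cache shows why the pattern separates iget from ilock: taking a reference is cheap and never touches the disk, while the first lock pays for the read. Everything here is a simplified sketch with hypothetical `toy_` names; the lock is a plain flag rather than a sleeplock, the disk read is replaced by a counter, and the `I_VALID` bit value is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NINODE 4  // hypothetical tiny cache
#define I_VALID 0x2   // bit value is an assumption for this sketch

struct cached_inode {
    uint32_t dev, inum;
    int ref;     // in-memory references to this slot
    int locked;  // stand-in for the sleeplock
    int flags;   // I_VALID once contents were "read from disk"
};

static struct cached_inode toy_icache[TOY_NINODE];
static int toy_disk_reads;  // counts simulated inode reads

// Return the cache slot for (dev, inum), claiming an empty one if needed.
// Never reads the disk. Returns NULL if the cache is full.
struct cached_inode *toy_iget(uint32_t dev, uint32_t inum)
{
    struct cached_inode *empty = NULL;
    for (int i = 0; i < TOY_NINODE; i++) {
        struct cached_inode *ip = &toy_icache[i];
        if (ip->ref > 0 && ip->dev == dev && ip->inum == inum) {
            ip->ref++;
            return ip;
        }
        if (empty == NULL && ip->ref == 0)
            empty = ip;  // remember the first free slot
    }
    if (empty == NULL)
        return NULL;
    empty->dev = dev;
    empty->inum = inum;
    empty->ref = 1;
    empty->flags = 0;  // contents not loaded yet
    return empty;
}

// Take the lock and load the contents on first use.
void toy_ilock(struct cached_inode *ip)
{
    ip->locked = 1;
    if (!(ip->flags & I_VALID)) {
        toy_disk_reads++;  // a real kernel would copy the dinode here
        ip->flags |= I_VALID;
    }
}

void toy_iunlock(struct cached_inode *ip) { ip->locked = 0; }

void toy_iput(struct cached_inode *ip)
{
    assert(ip->ref > 0);
    ip->ref--;
}
```

Two callers that ask for the same inode share one slot, and only the first toy_ilock triggers a simulated read.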
Key Takeaways
- The superblock (block 1) records the global layout; all other locations are derived from it.
- Each file's metadata lives in a dinode (64 bytes); eight pack per 512-byte block.
- addrs[0..11] are direct block pointers; addrs[12] is a single-indirect pointer, capping xv6 files at 71,680 bytes.
- The in-memory inode mirrors the dinode and adds kernel-only fields (ref, lock, dev, inum, flags).
- Hard links share one inode; nlink counts how many directory entries reference it, and blocks are freed only when nlink and ref both reach 0.
- Always follow the iget → ilock → work → iunlock → iput pattern to safely access inode data in a concurrent kernel.