The Virtual File System (VFS)
Why This Matters
A modern Linux system might have an ext4 root partition, a FAT32 USB stick, a tmpfs
in RAM, and a network-mounted NFS share — all visible in the same directory tree.
The fact that open(), read(), and write() work identically on all of them is
not magic: it is the Virtual File System (VFS). Understanding VFS tells you how the
kernel generalises over wildly different storage layouts, and it is the contract every
new filesystem driver must conform to.
What VFS Does
VFS sits between user space and the concrete filesystem implementations (ext4, XFS, FAT32, etc.). It provides two interfaces:
| Interface | Direction | Purpose |
|---|---|---|
| Top (with user space) | User ↔ VFS | Standard syscalls: open, read, write, lseek, … |
| Bottom (with filesystems) | VFS ↔ Driver | Operations a concrete filesystem must implement |
When a process calls read(), VFS receives the request, looks up which concrete
filesystem owns that file, and dispatches to that filesystem's read implementation.
The process never needs to know whether it is reading ext4 or FAT32.
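The dispatch step can be illustrated with a small user-space sketch. Everything here (`toy_file`, `toy_file_ops`, `toy_vfs_read`, and the two fake drivers) is a hypothetical stand-in invented for illustration; the real kernel structures carry many more fields and different signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
struct toy_file;

struct toy_file_ops {
    /* Each filesystem supplies its own read implementation. */
    long (*read)(struct toy_file *f, char *buf, size_t count);
};

struct toy_file {
    const struct toy_file_ops *f_op; /* chosen when the file is opened */
};

/* Copy at most count bytes of data into buf; return the number copied. */
static long toy_copy_out(const char *data, char *buf, size_t count)
{
    size_t n = strlen(data);
    if (n > count)
        n = count;
    memcpy(buf, data, n);
    return (long)n;
}

static long toy_ext4_read(struct toy_file *f, char *buf, size_t count)
{
    (void)f; /* a real driver would locate data blocks via the inode */
    return toy_copy_out("from ext4", buf, count);
}

static long toy_fat_read(struct toy_file *f, char *buf, size_t count)
{
    (void)f;
    return toy_copy_out("from fat32", buf, count);
}

const struct toy_file_ops toy_ext4_ops = { .read = toy_ext4_read };
const struct toy_file_ops toy_fat_ops  = { .read = toy_fat_read };

/* The generic layer: it never mentions ext4 or FAT32 by name. */
long toy_vfs_read(struct toy_file *f, char *buf, size_t count)
{
    assert(f != NULL && f->f_op != NULL);
    if (f->f_op->read == NULL)
        return -1; /* this filesystem does not support read */
    return f->f_op->read(f, buf, count);
}
```

The caller of `toy_vfs_read` passes the same arguments whichever table `f_op` points at, which is the property the paragraph above describes.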
Unix Filesystem Fundamentals
Before diving into VFS objects, recall the basic model:
- File: an ordered byte string from address 0 to (size − 1). Its data and metadata are stored separately.
- Directory: a container of files and sub-directories, forming a hierarchical tree. A path like /home/lkp/Desktop/file is a chain of nested directories.
VFS expresses these concepts through four kernel objects.
The Four Core VFS Objects
1. Superblock (struct super_block)
The superblock represents a mounted filesystem instance (for a disk-based filesystem, typically one partition). It holds global information:
- Filesystem type (ext4, XFS, …)
- Total size and block size
- The list of the filesystem's inodes currently held in memory
- Whether the filesystem is clean or has errors
A disk-based filesystem writes its superblock to a fixed location when the partition
is formatted. At mount time the filesystem reads it and hands a populated
struct super_block to VFS.
Superblock operations (struct super_operations) are function pointers that VFS
calls to manage the filesystem. Key ones:
| Function | When called |
|---|---|
| alloc_inode(sb) | Allocate a new inode object |
| destroy_inode(inode) | Free an inode |
| dirty_inode(inode) | Mark an inode as modified (needs writeback) |
| write_inode(inode, wait) | Flush inode to disk |
| drop_inode(inode) | Last reference to inode dropped |
| put_super(sb) | VFS is unmounting: release the superblock |
| sync_fs(sb, wait) | Sync filesystem metadata to disk |
| statfs(sb, statfs) | Return filesystem statistics |
Usage pattern: sb->s_op->alloc_inode(sb) — the superblock carries a pointer to its
operations table.
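The usage pattern can be sketched in user-space C. This is a toy model under stated assumptions: the `toy_` types, the `next_ino` and `live_inodes` counters, and the RAM-backed "driver" are all invented for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_super_block;

struct toy_inode {
    unsigned long i_ino;
    struct toy_super_block *i_sb; /* which filesystem owns this inode */
};

/* A reduced operations table with only two of the entries listed above. */
struct toy_super_ops {
    struct toy_inode *(*alloc_inode)(struct toy_super_block *sb);
    void (*destroy_inode)(struct toy_inode *inode);
};

struct toy_super_block {
    const struct toy_super_ops *s_op;
    unsigned long next_ino;    /* hypothetical: next inode number to hand out */
    unsigned long live_inodes; /* hypothetical: inodes currently allocated */
};

/* "Driver" side: how this particular filesystem allocates inodes. */
static struct toy_inode *ramfs_alloc_inode(struct toy_super_block *sb)
{
    struct toy_inode *inode = calloc(1, sizeof *inode);
    if (inode == NULL)
        return NULL;
    inode->i_ino = sb->next_ino++;
    inode->i_sb = sb;
    sb->live_inodes++;
    return inode;
}

static void ramfs_destroy_inode(struct toy_inode *inode)
{
    inode->i_sb->live_inodes--;
    free(inode);
}

const struct toy_super_ops toy_ramfs_super_ops = {
    .alloc_inode = ramfs_alloc_inode,
    .destroy_inode = ramfs_destroy_inode,
};

/* Generic side: reaches the driver only through sb->s_op. */
struct toy_inode *toy_new_inode(struct toy_super_block *sb)
{
    assert(sb != NULL && sb->s_op != NULL && sb->s_op->alloc_inode != NULL);
    return sb->s_op->alloc_inode(sb);
}

void toy_free_inode(struct toy_inode *inode)
{
    assert(inode != NULL && inode->i_sb != NULL);
    inode->i_sb->s_op->destroy_inode(inode);
}
```

Swapping in a different `toy_super_ops` table would change how inodes are allocated without touching `toy_new_inode` or `toy_free_inode`.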
2. Inode (struct inode)
An inode (index node) represents a single file or directory on the filesystem. Every file has exactly one inode. It stores:
- File type, size, permissions
- Owner/group, timestamps
- How to locate the file's data blocks on disk
When a file is created, the filesystem allocates a new inode.
Inode operations (struct inode_operations) include:
| Function | Purpose |
|---|---|
| create(dir, dentry, mode) | Create a regular file (called from creat/open) |
| lookup(dir, dentry) | Search a directory for a filename |
| link(old_dentry, dir, dentry) | Create a hard link |
| unlink(dir, dentry) | Remove a hard link / delete a file |
| symlink(dir, dentry, symname) | Create a symbolic link |
| mkdir(dir, dentry, mode) | Create a directory |
| rmdir(dir, dentry) | Remove a directory |
| mknod(dir, dentry, mode, dev) | Create a special file (device, pipe, socket) |
| rename(…) | Rename/move a file |
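To make three of these operations concrete, here is a minimal sketch of create, lookup, and unlink for a toy in-memory directory. The signatures are deliberately simplified (a plain name string instead of a dentry, no mode argument), and `toy_dir`, `toy_inode_ops`, and the `memdir_` functions are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ENTRIES 8
#define TOY_NAME_MAX 32

struct toy_dir;

/* Reduced, simplified version of an inode operations table. */
struct toy_inode_ops {
    int (*create)(struct toy_dir *dir, const char *name); /* new inode number, or -1 */
    int (*lookup)(struct toy_dir *dir, const char *name); /* inode number, or 0 if absent */
    int (*unlink)(struct toy_dir *dir, const char *name); /* 0 on success, -1 if absent */
};

struct toy_dir {
    const struct toy_inode_ops *i_op;
    char names[TOY_MAX_ENTRIES][TOY_NAME_MAX];
    int inos[TOY_MAX_ENTRIES]; /* 0 marks a free slot */
    int next_ino;              /* must start at 1, because 0 means "free" */
};

static int memdir_lookup(struct toy_dir *dir, const char *name)
{
    for (int i = 0; i < TOY_MAX_ENTRIES; i++)
        if (dir->inos[i] != 0 && strcmp(dir->names[i], name) == 0)
            return dir->inos[i];
    return 0;
}

static int memdir_create(struct toy_dir *dir, const char *name)
{
    /* Reject over-long names and names that already exist. */
    if (strlen(name) >= TOY_NAME_MAX || memdir_lookup(dir, name) != 0)
        return -1;
    for (int i = 0; i < TOY_MAX_ENTRIES; i++) {
        if (dir->inos[i] == 0) {
            strcpy(dir->names[i], name);
            dir->inos[i] = dir->next_ino++;
            return dir->inos[i];
        }
    }
    return -1; /* directory full */
}

static int memdir_unlink(struct toy_dir *dir, const char *name)
{
    for (int i = 0; i < TOY_MAX_ENTRIES; i++) {
        if (dir->inos[i] != 0 && strcmp(dir->names[i], name) == 0) {
            dir->inos[i] = 0; /* free the slot */
            return 0;
        }
    }
    return -1;
}

const struct toy_inode_ops toy_memdir_iops = {
    .create = memdir_create,
    .lookup = memdir_lookup,
    .unlink = memdir_unlink,
};
```

A caller would go through the table, for example `dir->i_op->lookup(dir, "a.txt")`, mirroring the `sb->s_op->...` pattern used for superblock operations.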
3. Dentry (struct dentry)
A dentry (directory entry) bridges names and inodes. For the path
/home/lkp/test.txt, VFS creates four dentries: one for /, one for home, one
for lkp, and one for test.txt. Each dentry stores:
- The component name
- Its position in the directory hierarchy
- A pointer to the corresponding inode
Dentries are constructed on the fly as paths are resolved — they are a cache of the on-disk directory structure.
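The "one dentry per component" rule can be checked with a short sketch that counts the dentries a path walk would touch. `toy_count_dentries` is a hypothetical helper; it assumes an absolute path and ignores symbolic links, mount points, and the special components `.` and `..`, all of which a real path walk has to handle.

```c
#include <assert.h>
#include <stddef.h>

/* Count one dentry for the root plus one per non-empty path component. */
int toy_count_dentries(const char *path)
{
    assert(path != NULL && path[0] == '/');

    int count = 1; /* the root dentry "/" */
    const char *p = path;

    while (*p != '\0') {
        while (*p == '/')      /* skip one or more separators */
            p++;
        if (*p == '\0')
            break;             /* trailing slash: no further component */
        count++;               /* found the start of a component */
        while (*p != '\0' && *p != '/')
            p++;               /* advance to the end of the component */
    }
    return count;
}
```

For `/home/lkp/test.txt` the function counts the root plus `home`, `lkp`, and `test.txt`.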
Dentry States
| State | d_inode | d_count | Can be reclaimed? |
|---|---|---|---|
| Used | valid | > 0 | No |
| Unused | valid | 0 | Yes (LRU) |
| Negative | NULL | — | Yes (LRU) |
A negative dentry arises when a path is looked up but does not exist (e.g.,
open() on a missing file). Caching the negative result avoids redundant disk
lookups for repeated failed opens.
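The state table translates directly into a small classification function. The structures below are hypothetical reductions that keep only the two fields the table mentions.

```c
#include <assert.h>
#include <stddef.h>

struct toy_inode {
    unsigned long i_ino;
};

/* Only the two fields that determine the state are modelled here. */
struct toy_dentry {
    struct toy_inode *d_inode; /* NULL for a negative dentry */
    int d_count;               /* number of active users */
};

enum toy_dentry_state {
    TOY_DENTRY_USED,
    TOY_DENTRY_UNUSED,
    TOY_DENTRY_NEGATIVE
};

enum toy_dentry_state toy_classify_dentry(const struct toy_dentry *d)
{
    assert(d != NULL && d->d_count >= 0);
    if (d->d_inode == NULL)
        return TOY_DENTRY_NEGATIVE; /* the name was looked up but does not exist */
    return d->d_count > 0 ? TOY_DENTRY_USED : TOY_DENTRY_UNUSED;
}

/* Per the table: everything except a used dentry may be reclaimed. */
int toy_dentry_reclaimable(const struct toy_dentry *d)
{
    return toy_classify_dentry(d) != TOY_DENTRY_USED;
}
```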
Dentry Cache (dcache)
The dcache is made up of three structures:
- A linked list of used dentries tied to their inodes via i_dentry
- An LRU list of unused and negative dentries (reclaimed from the tail)
- A hash table for fast (average O(1)) lookup of a dentry from its parent directory and component name
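The hash-table part can be sketched as follows. This toy version assumes the key is the pair (parent dentry, component name), uses a simple multiplicative string hash chosen only for illustration, and omits the LRU list and all locking; none of the `toy_` names are kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BUCKETS 16

struct toy_dentry {
    const struct toy_dentry *d_parent; /* NULL for the root */
    const char *d_name;                /* one path component */
    struct toy_dentry *d_hash_next;    /* next dentry in the same bucket */
};

struct toy_dcache {
    struct toy_dentry *buckets[TOY_BUCKETS];
};

/* Mix the parent pointer and the name so that equal names in
 * different directories usually land in different buckets. */
static size_t toy_hash(const struct toy_dentry *parent, const char *name)
{
    size_t h = (size_t)(uintptr_t)parent;
    for (; *name != '\0'; name++)
        h = h * 31 + (unsigned char)*name;
    return h % TOY_BUCKETS;
}

/* Insert a caller-owned dentry at the head of its bucket chain. */
void toy_d_add(struct toy_dcache *dc, struct toy_dentry *d)
{
    assert(dc != NULL && d != NULL && d->d_name != NULL);
    size_t b = toy_hash(d->d_parent, d->d_name);
    d->d_hash_next = dc->buckets[b];
    dc->buckets[b] = d;
}

/* Return the cached dentry for (parent, name), or NULL on a cache miss. */
struct toy_dentry *toy_d_lookup(struct toy_dcache *dc,
                                const struct toy_dentry *parent,
                                const char *name)
{
    struct toy_dentry *d = dc->buckets[toy_hash(parent, name)];
    for (; d != NULL; d = d->d_hash_next)
        if (d->d_parent == parent && strcmp(d->d_name, name) == 0)
            return d;
    return NULL;
}
```

Resolving a full path then becomes one lookup per component, each starting from the dentry found by the previous step.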
Dentry operations (struct dentry_operations):
| Function | Purpose |
|---|---|
| d_hash(dentry, name) | Compute the dcache hash for a name |
| d_compare(dentry, name1, name2) | Compare two filenames |
| d_delete(dentry) | Called when d_count hits zero |
4. File Object (struct file)
A file object represents an open file descriptor in a specific process.
- Created by open(), destroyed by close()
- If two processes open the same file, each gets its own struct file (with its own file-position offset), but both point to the same dentry, which points to the same inode
- Unlike the other three objects, the file object has no on-disk representation
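The independent-offset behaviour can be modelled in a few lines. This is a sketch under simplifying assumptions: the inode holds its data directly in memory, and the field names `f_dentry` and `f_pos` as well as `toy_open` and `toy_read` are chosen for illustration rather than taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_inode {
    const char *data; /* file contents, held in memory for this sketch */
    size_t i_size;
};

struct toy_dentry {
    const char *d_name;
    struct toy_inode *d_inode;
};

struct toy_file {
    struct toy_dentry *f_dentry; /* shared by every open of the same name */
    size_t f_pos;                /* private to this open */
};

/* Each open yields a fresh file object positioned at offset 0. */
struct toy_file toy_open(struct toy_dentry *dentry)
{
    assert(dentry != NULL && dentry->d_inode != NULL);
    struct toy_file f = { .f_dentry = dentry, .f_pos = 0 };
    return f;
}

/* Read up to count bytes at this file object's offset, then advance it. */
size_t toy_read(struct toy_file *f, char *buf, size_t count)
{
    const struct toy_inode *inode = f->f_dentry->d_inode;
    size_t left = inode->i_size - f->f_pos;
    size_t n = count < left ? count : left;

    memcpy(buf, inode->data + f->f_pos, n);
    f->f_pos += n;
    return n;
}
```

Reading through one `toy_file` advances only that object's `f_pos`; a second `toy_file` opened on the same dentry still starts at offset 0.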
File operations (struct file_operations):
| Function | Purpose |
|---|---|
| llseek(file, offset, origin) | Update the file position |
| read(file, buf, count, offset) | Synchronous read |
| aio_read(iocb, buf, count, offset) | Asynchronous read |
| write(file, buf, count, offset) | Synchronous write |
| aio_write(iocb, buf, count, offset) | Asynchronous write |
| readdir(file, dirent, filldir) | Read directory entries |
| ioctl(inode, file, cmd, arg) | Device-control command |
| mmap(file, vma) | Map file into a process address space |
| open(inode, file) | Open the file |
How the Objects Relate
struct file ──────────▶ struct dentry ──────────▶ struct inode
(per open fd)           (path component)          (file metadata)
file_operations         dentry_operations         inode_operations
struct super_block ◀──────────────────────────────── struct inode
(mounted partition)                                  (i_sb backlink)
super_operations
One inode can be referenced by multiple dentries (hard links). One dentry can be referenced by multiple file objects (multiple processes or fds).
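The hard-link relationship can be sketched with a link counter on the inode. The `i_nlink` field name follows the kernel's, but the structures and the `toy_link` / `toy_unlink` helpers are simplified, hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

struct toy_inode {
    unsigned long i_ino;
    unsigned int i_nlink; /* how many names currently refer to this inode */
};

struct toy_dentry {
    const char *d_name;
    struct toy_inode *d_inode; /* NULL when the name refers to nothing */
};

/* Attach a new name to an existing inode (a hard link). */
void toy_link(struct toy_dentry *dentry, struct toy_inode *inode)
{
    assert(dentry != NULL && inode != NULL && dentry->d_inode == NULL);
    dentry->d_inode = inode;
    inode->i_nlink++;
}

/* Detach one name. Returns 1 if it was the last name for the inode,
 * meaning the inode itself could now be released; otherwise 0. */
int toy_unlink(struct toy_dentry *dentry)
{
    assert(dentry != NULL && dentry->d_inode != NULL);
    struct toy_inode *inode = dentry->d_inode;

    dentry->d_inode = NULL;
    inode->i_nlink--;
    return inode->i_nlink == 0;
}
```

Two dentries linked to the same inode compare equal on `d_inode`, and removing one name leaves the other fully usable.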
Key Takeaways
- VFS is an abstraction layer with a top interface (syscalls for user space) and a bottom interface (operations that each concrete filesystem must implement).
- Four core objects: super_block (partition), inode (file metadata), dentry (path component / name cache), file (open file descriptor).
- Operations tables (e.g., super_operations, inode_operations) are structs of function pointers, essentially vtables in C, that let VFS dispatch to the right filesystem driver.
- Dentries are a cache: they are built on demand and kept in the dcache (an LRU list + hash table). Negative dentries cache lookup failures.
- File objects are per-process: two processes opening the same file share a dentry and inode but have independent file objects (independent offsets).
- Writing a new Linux filesystem means implementing the bottom VFS interface — filling in the operations structs for superblock, inode, dentry, and file.