Project 2: In-Memory Filesystem with FUSE

Project 2: In-Memory Filesystem with FUSE

Project Overview

Attribute Details
Difficulty Advanced
Time Estimate 2-3 weeks
Language C
Knowledge Area Filesystems / Kernel Interface
Main Book โ€œThe Linux Programming Interfaceโ€ by Michael Kerrisk
Prerequisites Project 1 or equivalent understanding, comfort with C pointers and structs

What Youโ€™ll Build

A complete filesystem that stores files in RAM, mounted as a real directory on your Linux system. Create files, directories, read, write, deleteโ€”all backed by your C code.

Real World Output:

$ mkdir /tmp/myfs
$ ./memfs /tmp/myfs
$ cd /tmp/myfs
$ echo "Hello filesystem!" > test.txt
$ cat test.txt
Hello filesystem!
$ ls -la
total 0
drwxr-xr-x 2 user user    0 Dec 18 10:00 .
drwxr-xr-x 3 user user 4096 Dec 18 10:00 ..
-rw-r--r-- 1 user user   18 Dec 18 10:01 test.txt
$ mkdir subdir
$ # Your filesystem is real and mounted!

Learning Objectives

By completing this project, you will:

  1. Understand the VFS abstraction and how Linux presents a unified interface to different filesystems
  2. Implement file operations (open, read, write, create, unlink) that handle real syscalls
  3. Design in-memory data structures for inodes, directory entries, and file data
  4. Work with FUSE (Filesystem in Userspace) to create mountable filesystems without kernel code
  5. Handle file metadata including permissions, timestamps, and link counts
  6. Debug userspace filesystems using FUSEโ€™s debugging features

The Core Question Youโ€™re Answering

โ€œWhat actually happens when a program calls open(), read(), or write()? How does the kernel translate those syscalls into filesystem operations?โ€

When you type cat file.txt, what happens? The cat program calls open("file.txt", O_RDONLY). The kernel receives this system call and needs to:

  1. Resolve โ€œfile.txtโ€ to an inode number
  2. Check permissions
  3. Allocate a file descriptor
  4. Return that descriptor to cat

FUSE (Filesystem in Userspace) lets you implement these operations in a regular C program instead of kernel code. You become the โ€œfilesystemโ€ that the kernel calls. When a user runs ls /mnt/myfs, the kernel asks YOUR code for directory contents.

This project demystifies the Virtual Filesystem (VFS) layer. Youโ€™ll see exactly how high-level operations like โ€œlist directoryโ€ decompose into low-level filesystem callbacks.


Deep Theoretical Foundation

1. The Virtual Filesystem (VFS) Layer

Linux supports dozens of filesystems (ext4, XFS, Btrfs, NFS, FUSE, etc.), but applications use the same system calls for all of them. The VFS layer makes this possible.

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                    User Application                      โ”‚
โ”‚               (calls open(), read(), write())            โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                         โ”‚ System calls
                         โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                    VFS Layer                             โ”‚
โ”‚         (routes operations to correct filesystem)        โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚    ext4     โ”‚    XFS      โ”‚     NFS       โ”‚    FUSE     โ”‚
โ”‚   driver    โ”‚   driver    โ”‚    client     โ”‚   bridge    โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                                                   โ”‚
                                                   โ–ผ
                                        โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
                                        โ”‚  Your Userspace  โ”‚
                                        โ”‚    Filesystem    โ”‚
                                        โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

VFS abstractions:

  • Superblock: Filesystem-wide metadata (type, size, mount options)
  • Inode: Per-file metadata (not the name)
  • Dentry: Directory entry cache (name โ†’ inode mapping)
  • File: Open file instance (position, flags, operations)

2. FUSE Architecture

FUSE provides a bridge between the kernel VFS and your userspace code:

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                    /dev/fuse                             โ”‚
โ”‚            (character device for communication)          โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                         โ”‚
          โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
          โ”‚                             โ”‚
          โ–ผ                             โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”      โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚   Kernel Module     โ”‚      โ”‚   libfuse Library    โ”‚
โ”‚   (fuse.ko)         โ”‚โ—„โ”€โ”€โ”€โ”€โ–บโ”‚   (userspace)        โ”‚
โ”‚                     โ”‚      โ”‚                      โ”‚
โ”‚   Translates VFS    โ”‚      โ”‚   Calls your         โ”‚
โ”‚   to /dev/fuse      โ”‚      โ”‚   callbacks          โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜      โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

How a read() syscall flows:

  1. Application calls read(fd, buf, 100)
  2. Kernel VFS dispatches to FUSE kernel module
  3. FUSE module writes request to /dev/fuse
  4. libfuse reads request, calls your read() callback
  5. Your code returns data
  6. libfuse writes response to /dev/fuse
  7. FUSE module returns data to VFS
  8. Application receives data

3. The fuse_operations Structure

FUSE filesystems implement callbacks in a fuse_operations struct:

struct fuse_operations {
    int (*getattr)(const char *path, struct stat *stbuf, ...);
    int (*readdir)(const char *path, void *buf, fuse_fill_dir_t filler, ...);
    int (*open)(const char *path, struct fuse_file_info *fi);
    int (*read)(const char *path, char *buf, size_t size, off_t offset, ...);
    int (*write)(const char *path, const char *buf, size_t size, off_t offset, ...);
    int (*create)(const char *path, mode_t mode, struct fuse_file_info *fi);
    int (*unlink)(const char *path);
    int (*mkdir)(const char *path, mode_t mode);
    int (*rmdir)(const char *path);
    int (*rename)(const char *from, const char *to, unsigned int flags);
    int (*truncate)(const char *path, off_t size, ...);
    int (*utimens)(const char *path, const struct timespec tv[2], ...);
    int (*chmod)(const char *path, mode_t mode, ...);
    int (*chown)(const char *path, uid_t uid, gid_t gid, ...);
    // ... many more
};

Critical operations:

  • getattr: Called for EVERY file access (stat, ls, open checks)
  • readdir: List directory contents
  • read/write: File data access
  • create/unlink: Create/delete files
  • mkdir/rmdir: Create/delete directories

4. The stat Structure

Every filesystem must provide file metadata via struct stat:

struct stat {
    dev_t     st_dev;      // Device ID
    ino_t     st_ino;      // Inode number
    mode_t    st_mode;     // File type + permissions
    nlink_t   st_nlink;    // Number of hard links
    uid_t     st_uid;      // Owner user ID
    gid_t     st_gid;      // Owner group ID
    off_t     st_size;     // Total size in bytes
    blksize_t st_blksize;  // Block size for I/O
    blkcnt_t  st_blocks;   // Number of 512B blocks
    struct timespec st_atim;  // Last access time
    struct timespec st_mtim;  // Last modification time
    struct timespec st_ctim;  // Last status change time
};

st_mode encoding:

// File type bits (mutually exclusive)
S_IFREG  0100000   // Regular file
S_IFDIR  0040000   // Directory
S_IFLNK  0120000   // Symbolic link

// Permission bits
S_IRUSR  00400     // Owner read
S_IWUSR  00200     // Owner write
S_IXUSR  00100     // Owner execute
S_IRGRP  00040     // Group read
// ... etc

// Example: Regular file with rw-r--r--
mode_t mode = S_IFREG | 0644;

// Example: Directory with rwxr-xr-x
mode_t mode = S_IFDIR | 0755;

Every file has a link count (st_nlink):

  • Regular files start with nlink=1
  • Directories start with nlink=2 (own . entry + parentโ€™s reference)
  • Each subdirectory adds 1 to parentโ€™s nlink (for .. entry)
/mydir              nlink=3  (., parent entry, subdir/..)
โ”œโ”€โ”€ .               (points to /mydir, doesn't add nlink)
โ”œโ”€โ”€ ..              (points to parent)
โ”œโ”€โ”€ file.txt        nlink=1
โ””โ”€โ”€ subdir/         nlink=2
    โ”œโ”€โ”€ .           (points to subdir)
    โ””โ”€โ”€ ..          (points to /mydir, adds to mydir's nlink)

Why this matters:

  • When nlink reaches 0 AND no open file handles exist, the file can be deleted
  • rm decrements nlink; actual deletion happens when nlink=0

6. In-Memory Data Structure Design

For a RAM filesystem, you need structures to store:

  1. Inodes: Metadata for each file/directory
  2. Directory entries: Name โ†’ inode mappings
  3. File data: Actual content bytes

Design decisions:

  • Fixed-size inode array vs dynamic allocation?
  • Store data in inode vs separate data pointers?
  • Hierarchical directory traversal vs flat path โ†’ inode map?

Simple approach for learning:

#define MAX_INODES 1024
#define MAX_PATH 256

struct my_inode {
    uint32_t ino;
    mode_t mode;
    uid_t uid;
    gid_t gid;
    off_t size;
    nlink_t nlink;
    time_t atime, mtime, ctime;

    // For regular files
    char *data;
    size_t data_capacity;

    // For directories
    struct dir_entry *entries;
    size_t num_entries;
};

struct dir_entry {
    uint32_t ino;
    char name[256];
};

// Global filesystem state
struct my_inode inodes[MAX_INODES];
int next_ino = 1;

Project Specification

Functional Requirements

  1. Mount and unmount
    • Mount filesystem to a directory
    • Unmount cleanly with fusermount -u
    • Handle mount options (foreground, debug)
  2. File operations
    • Create files (touch, echo > file)
    • Read files (cat, head)
    • Write files (echo >>, editors)
    • Delete files (rm)
    • Truncate files (resize)
  3. Directory operations
    • Create directories (mkdir)
    • List directories (ls, ls -la)
    • Remove directories (rmdir)
    • Navigate directories (cd)
  4. Metadata operations
    • Get attributes (stat, ls -l)
    • Set timestamps (touch -t)
    • Change permissions (chmod)
    • Change ownership (chown)
  5. Path operations
    • Resolve absolute paths
    • Handle . and .. correctly
    • Support nested paths (/a/b/c/file.txt)

Non-Functional Requirements

  • Support files up to 100MB
  • Support up to 1000 files
  • Handle concurrent access (basic)
  • Provide meaningful errors (ENOENT, EACCES, etc.)

Solution Architecture

Component Design

โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                    FUSE Callbacks                            โ”‚
โ”‚         (myfs_getattr, myfs_read, myfs_write, ...)          โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                           โ”‚
                           โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                   Path Resolution                            โ”‚
โ”‚              (path string โ†’ inode number)                    โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                           โ”‚
                           โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                   Inode Manager                              โ”‚
โ”‚         (allocate, lookup, update inode structures)         โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                  โ”‚                        โ”‚
                  โ–ผ                        โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚    Directory Manager    โ”‚  โ”‚       Data Manager             โ”‚
โ”‚  (entries, lookup)      โ”‚  โ”‚   (file content storage)       โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                  โ”‚                        โ”‚
                  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
                               โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚                    In-Memory Storage                         โ”‚
โ”‚              (inode array, data buffers)                     โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Data Structures

// File type enumeration
typedef enum {
    MY_FILE,
    MY_DIR,
    MY_SYMLINK
} my_file_type;

// Inode structure
struct my_inode {
    uint32_t ino;              // Unique inode number
    my_file_type type;         // File type
    mode_t mode;               // Permissions
    uid_t uid;
    gid_t gid;
    off_t size;                // File size
    nlink_t nlink;             // Link count
    struct timespec atime;     // Access time
    struct timespec mtime;     // Modification time
    struct timespec ctime;     // Status change time

    union {
        struct {               // For regular files
            char *data;
            size_t capacity;
        } file;
        struct {               // For directories
            struct dir_entry *entries;
            size_t count;
            size_t capacity;
        } dir;
        struct {               // For symlinks
            char *target;
        } symlink;
    };
};

// Directory entry
struct dir_entry {
    uint32_t ino;
    char name[256];
};

// Filesystem state
struct myfs_state {
    struct my_inode inodes[MAX_INODES];
    int inode_bitmap[MAX_INODES];  // 1 = used, 0 = free
    uint32_t next_ino;
    pthread_mutex_t lock;          // For thread safety
};

Key Functions

// Initialization
void myfs_init_state(struct myfs_state *state);
void myfs_destroy_state(struct myfs_state *state);

// Inode management
uint32_t inode_alloc(struct myfs_state *state, my_file_type type, mode_t mode);
void inode_free(struct myfs_state *state, uint32_t ino);
struct my_inode* inode_get(struct myfs_state *state, uint32_t ino);

// Path resolution
uint32_t resolve_path(struct myfs_state *state, const char *path);
uint32_t resolve_parent(struct myfs_state *state, const char *path, char *name_out);

// Directory operations
int dir_add_entry(struct my_inode *dir, const char *name, uint32_t ino);
int dir_remove_entry(struct my_inode *dir, const char *name);
uint32_t dir_lookup(struct my_inode *dir, const char *name);

// File data operations
int file_resize(struct my_inode *inode, off_t new_size);
ssize_t file_read(struct my_inode *inode, char *buf, size_t size, off_t offset);
ssize_t file_write(struct my_inode *inode, const char *buf, size_t size, off_t offset);

// FUSE callbacks
static int myfs_getattr(const char *path, struct stat *stbuf,
                        struct fuse_file_info *fi);
static int myfs_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
                        off_t offset, struct fuse_file_info *fi,
                        enum fuse_readdir_flags flags);
static int myfs_open(const char *path, struct fuse_file_info *fi);
static int myfs_read(const char *path, char *buf, size_t size,
                     off_t offset, struct fuse_file_info *fi);
static int myfs_write(const char *path, const char *buf, size_t size,
                      off_t offset, struct fuse_file_info *fi);
static int myfs_create(const char *path, mode_t mode,
                       struct fuse_file_info *fi);
static int myfs_unlink(const char *path);
static int myfs_mkdir(const char *path, mode_t mode);
static int myfs_rmdir(const char *path);

Implementation Guide

Phase 1: FUSE Skeleton (Days 1-2)

Goals:

  • Set up build environment with libfuse
  • Create minimal mountable filesystem
  • Implement getattr and readdir for root only

Steps:

  1. Install FUSE development headers:
    sudo apt install libfuse3-dev
    
  2. Create minimal skeleton: ```c #define FUSE_USE_VERSION 31 #include #include #include

static int myfs_getattr(const char *path, struct stat *stbuf, struct fuse_file_info *fi) { memset(stbuf, 0, sizeof(struct stat));

if (strcmp(path, "/") == 0) {
    stbuf->st_mode = S_IFDIR | 0755;
    stbuf->st_nlink = 2;
    stbuf->st_uid = getuid();
    stbuf->st_gid = getgid();
    return 0;
}

return -ENOENT; }

static int myfs_readdir(const char *path, void *buf, fuse_fill_dir_t filler, off_t offset, struct fuse_file_info *fi, enum fuse_readdir_flags flags) { if (strcmp(path, โ€œ/โ€) != 0) return -ENOENT;

filler(buf, ".", NULL, 0, 0);
filler(buf, "..", NULL, 0, 0);

return 0; }

static struct fuse_operations myfs_oper = { .getattr = myfs_getattr, .readdir = myfs_readdir, };

int main(int argc, char *argv[]) { return fuse_main(argc, argv, &myfs_oper, NULL); }


3. Compile and test:
```bash
gcc -Wall myfs.c -o myfs $(pkg-config fuse3 --cflags --libs)
mkdir /tmp/myfs
./myfs /tmp/myfs
ls /tmp/myfs    # Should show empty directory
fusermount -u /tmp/myfs

Key learning: FUSE callback structure, mounting basics

Phase 2: Inode Infrastructure (Days 2-4)

Goals:

  • Build inode allocation/deallocation
  • Create root inode at initialization
  • Implement inode lookup

Steps:

  1. Define inode structure and global state
  2. Initialize root inode (ino=1) as directory
  3. Implement inode_alloc() and inode_free()
  4. Update getattr to use inode data

Testing:

// Unit test for inode allocation
void test_inode_alloc() {
    struct myfs_state state;
    myfs_init_state(&state);

    uint32_t ino1 = inode_alloc(&state, MY_FILE, 0644);
    assert(ino1 == 2);  // Root is 1

    uint32_t ino2 = inode_alloc(&state, MY_DIR, 0755);
    assert(ino2 == 3);

    inode_free(&state, ino1);
    uint32_t ino3 = inode_alloc(&state, MY_FILE, 0644);
    assert(ino3 == 2);  // Reused!
}

Phase 3: Path Resolution (Days 4-5)

Goals:

  • Parse paths into components
  • Traverse directory tree
  • Find inode for any valid path

Steps:

  1. Implement path tokenization (split by โ€˜/โ€™)
  2. Start at root inode (ino=1)
  3. For each component, search directory entries
  4. Return final inode or -ENOENT
uint32_t resolve_path(struct myfs_state *state, const char *path) {
    if (strcmp(path, "/") == 0)
        return 1;  // Root inode

    char *path_copy = strdup(path);
    char *token = strtok(path_copy + 1, "/");  // Skip leading '/'
    uint32_t current_ino = 1;  // Start at root

    while (token != NULL) {
        struct my_inode *dir = inode_get(state, current_ino);
        if (dir->type != MY_DIR) {
            free(path_copy);
            return 0;  // Not found (ENOTDIR)
        }

        current_ino = dir_lookup(dir, token);
        if (current_ino == 0) {
            free(path_copy);
            return 0;  // Not found
        }

        token = strtok(NULL, "/");
    }

    free(path_copy);
    return current_ino;
}

Phase 4: File Creation and Deletion (Days 5-7)

Goals:

  • Implement create callback
  • Implement unlink callback
  • Update directory entries

Steps:

  1. create: Allocate inode, add directory entry
  2. unlink: Remove entry, decrement nlink, free if nlink=0
  3. Handle parent directory updates

Testing:

./myfs /tmp/myfs
touch /tmp/myfs/test.txt
ls /tmp/myfs     # Should show test.txt
rm /tmp/myfs/test.txt
ls /tmp/myfs     # Should be empty

Phase 5: Read and Write (Days 7-10)

Goals:

  • Store file data in memory
  • Handle offset and size correctly
  • Update size and timestamps

Steps:

  1. Allocate data buffer on first write
  2. Grow buffer as needed (realloc)
  3. Handle read beyond EOF (return partial/0)
  4. Update mtime on write, atime on read
static int myfs_write(const char *path, const char *buf, size_t size,
                      off_t offset, struct fuse_file_info *fi) {
    uint32_t ino = resolve_path(state, path);
    if (ino == 0) return -ENOENT;

    struct my_inode *inode = inode_get(state, ino);
    if (inode->type != MY_FILE) return -EISDIR;

    // Grow buffer if needed
    off_t new_size = offset + size;
    if (new_size > inode->file.capacity) {
        inode->file.data = realloc(inode->file.data, new_size);
        inode->file.capacity = new_size;
    }

    // Fill gap with zeros if offset > current size
    if (offset > inode->size) {
        memset(inode->file.data + inode->size, 0, offset - inode->size);
    }

    memcpy(inode->file.data + offset, buf, size);

    if (new_size > inode->size)
        inode->size = new_size;

    clock_gettime(CLOCK_REALTIME, &inode->mtime);
    inode->ctime = inode->mtime;

    return size;
}

Phase 6: Directory Operations (Days 10-12)

Goals:

  • Implement mkdir and rmdir
  • Update nlink correctly
  • Handle non-empty directory deletion

Steps:

  1. mkdir: Allocate inode, add . and .. entries, add to parent
  2. rmdir: Check empty, remove entry, decrement parent nlink
  3. Update readdir to iterate actual entries

Testing:

mkdir /tmp/myfs/subdir
mkdir /tmp/myfs/subdir/nested
ls -la /tmp/myfs/subdir  # Should show . and ..
rmdir /tmp/myfs/subdir   # Should fail (not empty)
rmdir /tmp/myfs/subdir/nested
rmdir /tmp/myfs/subdir   # Should succeed

Phase 7: Metadata Operations (Days 12-14)

Goals:

  • Implement chmod, chown
  • Implement utimens (timestamp updates)
  • Implement truncate

Steps:

  1. chmod: Update inode mode (preserve file type bits)
  2. chown: Update uid/gid
  3. utimens: Update atime/mtime
  4. truncate: Resize file (shrink or grow)

Testing:

echo "test" > /tmp/myfs/file.txt
chmod 600 /tmp/myfs/file.txt
ls -l /tmp/myfs/file.txt  # Should show -rw-------
truncate -s 0 /tmp/myfs/file.txt
cat /tmp/myfs/file.txt    # Should be empty

Testing Strategy

Unit Tests

// Test directory entry operations
void test_dir_operations() {
    struct my_inode dir;
    memset(&dir, 0, sizeof(dir));
    dir.type = MY_DIR;

    // Add entry
    int ret = dir_add_entry(&dir, "test.txt", 5);
    assert(ret == 0);
    assert(dir.dir.count == 1);

    // Lookup
    uint32_t ino = dir_lookup(&dir, "test.txt");
    assert(ino == 5);

    // Non-existent lookup
    ino = dir_lookup(&dir, "missing.txt");
    assert(ino == 0);

    // Remove
    ret = dir_remove_entry(&dir, "test.txt");
    assert(ret == 0);
    assert(dir.dir.count == 0);
}

// Test file read/write
void test_file_io() {
    struct my_inode file;
    memset(&file, 0, sizeof(file));
    file.type = MY_FILE;

    // Write at offset 0
    const char *data = "Hello World";
    ssize_t written = file_write(&file, data, strlen(data), 0);
    assert(written == 11);
    assert(file.size == 11);

    // Read back
    char buf[32];
    ssize_t read = file_read(&file, buf, sizeof(buf), 0);
    assert(read == 11);
    assert(memcmp(buf, data, 11) == 0);

    // Read with offset
    read = file_read(&file, buf, 5, 6);
    assert(read == 5);
    assert(memcmp(buf, "World", 5) == 0);
}

Integration Tests

#!/bin/bash
# test_myfs.sh

MOUNT=/tmp/myfs_test

cleanup() {
    fusermount -u $MOUNT 2>/dev/null
    rmdir $MOUNT 2>/dev/null
}
trap cleanup EXIT

mkdir -p $MOUNT
./myfs $MOUNT

# Test file creation
echo "hello" > $MOUNT/test.txt
[ "$(cat $MOUNT/test.txt)" = "hello" ] && echo "PASS: File write/read" || echo "FAIL"

# Test directory creation
mkdir $MOUNT/subdir
[ -d $MOUNT/subdir ] && echo "PASS: mkdir" || echo "FAIL"

# Test nested files
echo "nested" > $MOUNT/subdir/nested.txt
[ "$(cat $MOUNT/subdir/nested.txt)" = "nested" ] && echo "PASS: Nested file" || echo "FAIL"

# Test permissions
chmod 000 $MOUNT/test.txt
touch $MOUNT/test.txt 2>/dev/null && echo "FAIL: chmod not working" || echo "PASS: chmod"
chmod 644 $MOUNT/test.txt

# Test deletion
rm $MOUNT/test.txt
[ ! -f $MOUNT/test.txt ] && echo "PASS: unlink" || echo "FAIL"

echo "All tests completed"

Stress Tests

# Create many files
for i in $(seq 1 100); do
    echo "file $i" > $MOUNT/file_$i.txt
done

# Large file
dd if=/dev/urandom of=$MOUNT/large.bin bs=1M count=10

# Deep directory nesting
mkdir -p $MOUNT/a/b/c/d/e/f/g/h/i/j
echo "deep" > $MOUNT/a/b/c/d/e/f/g/h/i/j/file.txt

Common Pitfalls and Debugging

Pitfall 1: getattr Called Constantly

Problem: getattr is called for EVERY file access, even just to check existence.

Solution: Keep it fast. Cache inode lookups if needed.

// This is called constantly - keep it lightweight!
static int myfs_getattr(const char *path, struct stat *stbuf, ...) {
    // Don't do expensive operations here
}

Pitfall 2: Stale Mount Points

Problem: โ€œTransport endpoint is not connectedโ€ error.

Solution: Always unmount properly, or force unmount:

fusermount -u /tmp/myfs
# If that fails:
sudo umount -f /tmp/myfs

Pitfall 3: Wrong st_mode

Problem: ls shows wrong file type or permissions.

Solution: Always include file type bits:

// WRONG
stbuf->st_mode = 0755;

// CORRECT
stbuf->st_mode = S_IFDIR | 0755;  // Directory
stbuf->st_mode = S_IFREG | 0644;  // Regular file

Problem: Directory shows wrong link count, filesystem behaves oddly.

Solution: Track nlink carefully:

  • File: nlink=1 (or more for hard links)
  • Directory: nlink=2 + (number of subdirectories)

Debugging Techniques

1. Run in foreground with debug:

./myfs -f -d /tmp/myfs

This shows every FUSE operation as it happens.

2. Add logging:

#include <syslog.h>

openlog("myfs", LOG_PID, LOG_USER);
syslog(LOG_INFO, "getattr called for: %s", path);

3. Use strace on client programs:

strace ls /tmp/myfs 2>&1 | grep -E "(stat|getdents|open)"

Extensions and Challenges

static int myfs_symlink(const char *target, const char *path);
static int myfs_readlink(const char *path, char *buf, size_t size);
static int myfs_link(const char *from, const char *to);
// Increment nlink on target inode, add new directory entry

Extension 3: Extended Attributes

static int myfs_setxattr(const char *path, const char *name,
                         const char *value, size_t size, int flags);
static int myfs_getxattr(const char *path, const char *name,
                         char *value, size_t size);

Extension 4: Access Control

static int myfs_access(const char *path, int mask);
// Check if current user has requested permissions

Challenge: Thread Safety

Add proper locking for concurrent access:

pthread_rwlock_t fs_lock;

// Read operations: pthread_rwlock_rdlock()
// Write operations: pthread_rwlock_wrlock()

Real-World Connections

Cloud Storage Filesystems

FUSE powers many cloud storage integrations:

  • s3fs: Mount S3 buckets as local directories
  • gcsfuse: Google Cloud Storage access
  • rclone mount: Multi-cloud filesystem

Encryption Filesystems

  • EncFS: Encrypted overlay filesystem
  • gocryptfs: Modern encrypted FUSE filesystem
  • CryFS: Cloud-storage-optimized encryption

Development Tools

  • sshfs: Mount remote directories via SSH
  • unionfs-fuse: Overlay multiple directories
  • bindfs: Remount with different permissions

Interview Questions

Conceptual

  1. โ€œWhat is FUSE, and why is it useful?โ€
    • Expected: Userspace filesystems without kernel modules. Easier debugging, safer crashes, faster development.
  2. โ€œWhat happens when you run โ€˜lsโ€™ on a FUSE filesystem?โ€
    • Expected: ls โ†’ opendir โ†’ kernel โ†’ FUSE โ†’ your readdir โ†’ entries returned
  3. โ€œWhat is getattr, and why is it called so frequently?โ€
    • Expected: Itโ€™s stat(). Called for every file access. Keep it fast.
  4. โ€œHow does a filesystem resolve a path like โ€˜/a/b/cโ€™?โ€
    • Expected: Start at root, read directory, find โ€œaโ€, repeat for each component

Implementation

  1. โ€œWhatโ€™s the difference between create and open?โ€
    • Expected: create makes a new file, open opens existing. O_CREAT can blur the line.
  2. โ€œWhy do directories have link count >= 2?โ€
    • Expected: Own . entry + parentโ€™s reference. Each subdirectory adds one more.
  3. โ€œWhat should read() return if offset is past EOF?โ€
    • Expected: 0 bytes (EOF indicator, not an error)
  4. โ€œHow would you implement hard links?โ€
    • Expected: Multiple directory entries pointing to same inode. Increment nlink.

Self-Assessment Checklist

Before considering this project complete, verify you can:

  • Mount and unmount your filesystem cleanly
  • Create, read, write, and delete files
  • Create and remove directories
  • Navigate nested directory structures
  • Handle permissions correctly (chmod works)
  • Update timestamps appropriately
  • Handle edge cases (empty files, files at root, deep nesting)
  • Debug using FUSEโ€™s -d flag
  • Explain what happens for each syscall

Resources

Documentation

Books

  • โ€œThe Linux Programming Interfaceโ€ Ch. 14-18 - Files, directories, VFS
  • โ€œUnderstanding the Linux Kernelโ€ Ch. 12 - VFS internals
  • โ€œLinux System Programmingโ€ Ch. 3-4 - File I/O

Example Filesystems