Project 6: The “Mirror” Filesystem (FUSE)

Build a user-space filesystem that transforms data on read/write.

Quick Reference

Attribute       Value
Difficulty      Advanced
Time Estimate   1-2 weeks
Language        C or Python (Alt: Rust)
Prerequisites   C, pointers, Project 3
Key Topics      VFS, FUSE callbacks, mount points

1. Learning Objectives

By completing this project, you will:

  1. Implement a basic FUSE filesystem with passthrough behavior.
  2. Handle core callbacks: getattr, readdir, read, write.
  3. Transform data on read/write (reverse or encrypt).
  4. Understand the user-kernel bridge via /dev/fuse.

2. Theoretical Foundation

2.1 Core Concepts

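  • VFS layer: The kernel abstraction that exposes every filesystem through the same set of operations (open, read, write, stat).
  • FUSE: A kernel module plus a userspace library (libfuse) that forwards those operations to callbacks in an ordinary process.
  • Mount points: Directories where the kernel routes operations on everything beneath them to your filesystem.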

2.2 Why This Matters

FUSE shows how the kernel asks filesystems for data. It is the most approachable way to implement a filesystem.

2.3 Historical Context / Background

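FUSE made it possible to experiment with filesystems safely without writing kernel code: the only kernel piece is the generic FUSE module, and a bug in your filesystem crashes your process rather than the kernel. It powers tools like sshfs.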

2.4 Common Misconceptions

  • “Filesystems must be in kernel”: FUSE shows userspace can work too.
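  • “read/write are simple”: Each call carries an offset and a size. You must transfer exactly that range and return the number of bytes actually read or written.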

3. Project Specification

3.1 What You Will Build

A filesystem that mirrors a backing directory and transforms file contents on write/read.

3.2 Functional Requirements

  1. Mount a backing directory at a mount point.
  2. Implement getattr, readdir, open, read, write.
  3. Transform content (reverse or encrypt).
  4. Unmount cleanly.

3.3 Non-Functional Requirements

  • Reliability: Correct offsets and sizes.
  • Safety: Graceful unmount on exit.
  • Usability: Logs show which callbacks are called.

3.4 Example Usage / Output

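$ ./mirrorfs root_dir/ mount_point/
$ printf 'Hello' > mount_point/test.txt
$ cat root_dir/test.txt
olleH

printf is used rather than echo so that no trailing newline is written; reversing "Hello\n" would place the newline first.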

3.5 Real World Outcome

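You will write into the mount, see transformed content on disk, and read the original back through the mount:

$ ./mirrorfs root_dir/ mount_point/
$ printf 'Hello' > mount_point/test.txt
$ cat root_dir/test.txt
olleH
$ cat mount_point/test.txt
Hello
$ fusermount3 -u mount_point/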

4. Solution Architecture

4.1 High-Level Design

Read:  VFS call -> FUSE callback -> translate path -> pread -> transform -> return
Write: VFS call -> FUSE callback -> translate path -> transform -> pwrite -> return

4.2 Key Components

Component     Responsibility                   Key Decisions
Path mapper   Map mount path to backing path   Prefix join
Callbacks     getattr/read/write               Use libfuse API
Transformer   Reverse/encrypt                  Simple and deterministic

4.3 Data Structures

static const char *backing_root;
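
The path mapper from 4.2 can be sketched as a small helper that joins this root with the path a callback receives. This is a minimal sketch, not part of the libfuse API: map_path is a hypothetical name, and it takes the root as a parameter so it can be tried on its own. It assumes callback paths always begin with '/', so a plain concatenation is enough.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Join the backing root and a mount-relative path into `out`.
 * Assumes `path` starts with '/' (e.g. "/test.txt").
 * Returns 0 on success, or -ENAMETOOLONG if the result does not fit.
 * map_path is a hypothetical helper name used for illustration.
 */
int map_path(const char *root, const char *path, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "%s%s", root, path);

    /* snprintf reports the length it wanted; compare it to the buffer. */
    if (n < 0 || (size_t)n >= out_size)
        return -ENAMETOOLONG;
    return 0;
}
```

In a callback you would call map_path(backing_root, path, buf, sizeof buf) and return its result if it is non-zero. A trailing slash on the root (as in root_dir/) only produces a doubled slash, which path resolution treats as a single one. If the root is given as a relative path, consider resolving it with realpath at startup, because libfuse changes the working directory to / when it runs in the background.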

4.4 Algorithm Overview

Key Algorithm: Read/Write

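  1. Translate path to backing path.
  2. On write: apply the transform to a copy of the buffer, then pwrite it at the given offset.
  3. On read: pread at the given offset, then apply the transform to the bytes actually read.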

Complexity Analysis:

  • Time: O(n) per buffer
  • Space: O(n) buffer
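
The algorithm and its costs can be sketched with plain POSIX calls, independent of libfuse. The names reverse_bytes, mirror_write and mirror_read are hypothetical; in a real filesystem their bodies would sit inside the write and read callbacks. The sketch assumes the simplest form of the transform, reversing each buffer on its own, which only round-trips when a file is read back with the same offsets and sizes it was written with.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Reverse `len` bytes in place: O(n) time, O(1) extra space. */
void reverse_bytes(char *buf, size_t len)
{
    if (len < 2)
        return;
    for (size_t i = 0, j = len - 1; i < j; i++, j--) {
        char tmp = buf[i];
        buf[i] = buf[j];
        buf[j] = tmp;
    }
}

/*
 * Write path: the caller's buffer is const, so transform a private copy
 * (the O(n) space cost) and then pwrite it at the requested offset.
 * Returns bytes written or a negated errno. A short write is not retried
 * here, which would be wrong for this transform in production code.
 */
ssize_t mirror_write(int fd, const char *buf, size_t size, off_t offset)
{
    char *tmp = malloc(size ? size : 1);
    if (tmp == NULL)
        return -ENOMEM;
    memcpy(tmp, buf, size);
    reverse_bytes(tmp, size);

    ssize_t n = pwrite(fd, tmp, size, offset);
    int saved = errno;              /* free() may overwrite errno */
    free(tmp);
    return n == -1 ? -saved : n;
}

/* Read path: pread first, then transform only the bytes actually read. */
ssize_t mirror_read(int fd, char *buf, size_t size, off_t offset)
{
    ssize_t n = pread(fd, buf, size, offset);
    if (n == -1)
        return -errno;
    reverse_bytes(buf, (size_t)n);
    return n;
}
```

Because pread and pwrite take the offset as an argument, neither helper touches the file descriptor's shared position, which matters once several requests are in flight.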

5. Implementation Guide

5.1 Development Environment Setup

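Install the FUSE 3 development package (libfuse3-dev on Debian/Ubuntu), then confirm that the compiler flags resolve:

pkg-config fuse3 --cflags --libs

Compile by passing those flags to the compiler:

gcc -Wall mirrorfs.c $(pkg-config fuse3 --cflags --libs) -o mirrorfs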

5.2 Project Structure

project-root/
├── mirrorfs.c
└── README.md

5.3 The Core Question You’re Answering

“How can the kernel support many filesystems through a single interface?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. FUSE architecture
  2. Error codes (-ENOENT)
  3. Offset-based IO semantics
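
The error-code convention can be illustrated without libfuse: callbacks report failure by returning a negated errno value instead of -1. The sketch below uses a hypothetical helper, backing_getattr, shaped like the body of a getattr callback; it assumes the path has already been translated to the backing directory.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <sys/stat.h>

/*
 * Fill `st` from the backing file, or return the negated errno that
 * lstat reported. backing_getattr is a hypothetical name.
 */
int backing_getattr(const char *backing_path, struct stat *st)
{
    if (lstat(backing_path, st) == -1)
        return -errno;      /* e.g. -ENOENT when the path does not exist */
    return 0;
}
```

Returning the stat result unchanged also passes the backing file's mode bits through, so permission checks behave as they do on the backing directory.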

5.5 Questions to Guide Your Design

Before implementing, think through these:

  1. Should transformation happen on read, write, or both?
  2. How will you handle partial writes?
  3. How will you log callback activity for learning?
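
For the partial-write question, one common approach is to loop until every byte is stored. The sketch below is one possible answer rather than the required one; write_all is a hypothetical helper name, and it assumes the buffer has already been transformed.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Keep calling pwrite until all `size` bytes are stored or an error occurs.
 * Returns the number of bytes written, or a negated errno on failure.
 * write_all is a hypothetical helper name.
 */
ssize_t write_all(int fd, const char *buf, size_t size, off_t offset)
{
    size_t done = 0;

    while (done < size) {
        ssize_t n = pwrite(fd, buf + done, size - done, offset + (off_t)done);
        if (n == -1) {
            if (errno == EINTR)
                continue;           /* interrupted before any byte: retry */
            return -errno;
        }
        if (n == 0)
            break;                  /* no progress: report what was stored */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

Each retry advances both the buffer pointer and the file offset by the bytes already written, so a short write never leaves a gap or overwrites earlier data.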

5.6 Thinking Exercise

Trace a cat

List the sequence of callbacks your filesystem sees for cat file (typically getattr -> open -> read, repeated until it returns 0 -> flush -> release).

5.7 The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Why is FUSE slower than kernel filesystems?”
  2. “What is VFS and why does it exist?”
  3. “What does getattr return for a missing file?”

5.8 Hints in Layers

Hint 1: Start with passthrough. Get read/write working before adding transformation.

Hint 2: Use pread/pwrite. Avoid messing with file offsets globally.

Hint 3: Log callbacks. Print callback name and path for visibility.
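
Hint 3 can be sketched as a tiny logging helper. The names log_callback and LOG_CALL and the "[mirrorfs]" prefix are hypothetical choices for illustration, not libfuse features. The helper takes the output stream as a parameter so it can be pointed at stderr or at a log file.

```c
#include <stdio.h>

/* Write one line per callback to `out` and flush it immediately. */
void log_callback(FILE *out, const char *name, const char *path)
{
    fprintf(out, "[mirrorfs] %s %s\n", name, path);
    fflush(out);
}

/* Convenience macro: logs the name of the enclosing function to stderr. */
#define LOG_CALL(path) log_callback(stderr, __func__, (path))
```

Placing LOG_CALL(path) on the first line of each callback prints the callback's own function name. Output on stderr is only visible when the filesystem runs in the foreground (-f), since a backgrounded process no longer has your terminal attached.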

5.9 Books That Will Help

Topic   Book                         Chapter
VFS     “Linux Kernel Development”   Ch. 13
FUSE    FUSE docs                    Official Wiki

5.10 Implementation Phases

Phase 1: Foundation (3-4 days)

Goals:

  • Mount and passthrough.

Tasks:

  1. Implement getattr and readdir.
  2. Pass open/read/write to backing dir.

Checkpoint: ls and cat work on mount.

Phase 2: Core Functionality (3-4 days)

Goals:

  • Add data transformation.

Tasks:

  1. Reverse buffer on write.
  2. Reverse back on read.

Checkpoint: Data is transformed in backing dir.

Phase 3: Polish & Edge Cases (2-3 days)

Goals:

  • Improve errors and cleanup.

Tasks:

  1. Handle missing paths with -ENOENT.
  2. Implement clean unmount.

Checkpoint: fusermount3 -u mount_point/ succeeds, the mirrorfs process exits, and the mount point no longer shows the backing files.

5.11 Key Implementation Decisions

Decision    Options              Recommendation   Rationale
API         FUSE2 vs FUSE3       FUSE3            Current standard
Transform   reverse vs encrypt   reverse first    Simple to verify

6. Testing Strategy

6.1 Test Categories

Category    Purpose                  Examples
Mount       Validate FS mounts       ls mount
IO          Read/write correctness   cat file
Transform   Data change              reverse text

6.2 Critical Test Cases

  1. Writing a file through the mount stores transformed bytes in the backing directory.
  2. Reading the same file through the mount returns the original data.
  3. Unmount works without errors.

6.3 Test Data

Hello -> olleH

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall                    Symptom         Solution
Wrong path mapping         ENOENT          Join backing root correctly
Ignoring offset            Corrupt file    Use pread/pwrite
Not handling permissions   Access denied   Pass through mode bits

7.2 Debugging Strategies

  • Run with -f (foreground) so your callback logs stay on the terminal, or with -d to also see libfuse's own debug trace.
  • Compare with passthrough behavior.

7.3 Performance Traps

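The transform runs on every read and write, so its cost grows with the number of bytes transferred. Keep it to a single O(n) pass and avoid extra copies where you can.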


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add a case-transform feature.
  • Add read-only mode.

8.2 Intermediate Extensions

  • Add encryption with a simple XOR key.
  • Add metadata mapping (fake sizes).
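
The XOR extension can be sketched so that it stays correct for arbitrary offsets, which buffer reversal does not. The idea, stated here as an assumption about how you might design it, is to pick the key byte from each byte's absolute position in the file rather than its position in the buffer. xor_transform is a hypothetical name, and a repeating XOR key is for learning only; it offers no real security.

```c
#include <stddef.h>
#include <sys/types.h>

/*
 * XOR `len` bytes in place. The key byte for each position is chosen by
 * the byte's absolute file offset, so splitting one request into several
 * smaller ones yields identical output. Applying the function twice with
 * the same arguments restores the original data.
 */
void xor_transform(char *buf, size_t len, off_t offset,
                   const unsigned char *key, size_t key_len)
{
    if (key_len == 0)
        return;                     /* no key: leave the data untouched */

    for (size_t i = 0; i < len; i++) {
        size_t k = (size_t)(((unsigned long long)offset + i) % key_len);
        buf[i] = (char)((unsigned char)buf[i] ^ key[k]);
    }
}
```

The same call serves both directions: apply it to a copy of the buffer before pwrite and to the result of pread, passing the request's offset each time.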

8.3 Advanced Extensions

  • Add caching for metadata.
  • Add multi-thread safety and locks.

9. Real-World Connections

9.1 Industry Applications

  • User-space filesystems like sshfs and encfs.
  • sshfs: https://github.com/libfuse/sshfs
  • encfs: https://github.com/vgough/encfs

9.3 Interview Relevance

  • VFS and FUSE are strong systems design topics.

10. Resources

10.1 Essential Reading

  • libfuse docs and examples

10.2 Video Resources

  • FUSE tutorials (search “libfuse tutorial”)

10.3 Tools & Documentation

  • fusermount3 manual

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain VFS and FUSE roles.
  • I can implement read/write callbacks.
  • I can explain kernel/user context switches.

11.2 Implementation

  • Mount works and mirrors a directory.
  • Transformations are correct.
  • Unmount is clean.

11.3 Growth

  • I can extend with encryption.
  • I can explain FUSE performance limits.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Mount and read/write through passthrough.

Full Completion:

  • Add content transformation.

Excellence (Going Above & Beyond):

  • Add caching or encryption with key management.

This guide was generated from LEARN_LINUX_UNIX_INTERNALS_DEEP_DIVE.md. For the complete learning path, see the parent directory.