Project 11: File System Driver (FAT16)

Build a FAT16 file system driver that can read files from a disk image–implementing ATA/IDE disk I/O, FAT table parsing, directory traversal, and file reading–the foundation for any operating system that needs persistent storage.

Quick Reference

Attribute Value
Difficulty Expert
Time Estimate 3-4 weeks
Language C (alt: Rust)
Prerequisites Projects 6-8 (x86 protected mode, IDT, paging)
Key Topics ATA PIO, FAT16, block I/O, cluster chains, directory entries
Platform x86 32-bit (QEMU)
Tools GCC cross-compiler, NASM, QEMU, GDB
Main Book “Operating Systems: Three Easy Pieces” by Arpaci-Dusseau

1. Learning Objectives

By completing this project, you will:

  1. Understand block device I/O: Master how operating systems communicate with storage hardware at the lowest level
  2. Implement ATA PIO mode: Write a driver that reads sectors from a hard disk using x86 I/O ports
  3. Parse FAT16 structures: Decode the Boot Parameter Block (BPB), FAT table, and directory entries
  4. Navigate file hierarchies: Implement directory listing and subdirectory traversal
  5. Read file contents: Follow cluster chains through the FAT table to read complete files
  6. Handle fragmentation: Understand why files are stored in chains and how to reassemble them
  7. Debug storage code: Develop techniques for debugging persistent storage drivers
  8. Build toward a complete OS: File system support is essential for loading programs and configurations

2. Theoretical Foundation

2.1 Core Concepts

What is a File System?

A file system provides a logical abstraction over raw disk storage. Without a file system, a disk is just a sequence of 512-byte sectors. The file system organizes these sectors into files and directories:

┌─────────────────────────────────────────────────────────────────────────────┐
│                         RAW DISK vs FILE SYSTEM                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  RAW DISK (what hardware sees):                                             │
│  ─────────────────────────────                                              │
│                                                                             │
│  ┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┐       │
│  │ S0  │ S1  │ S2  │ S3  │ S4  │ S5  │ S6  │ S7  │ S8  │ S9  │ ... │       │
│  └─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┘       │
│    512   512   512   512   512   512   512   512   512   512  bytes        │
│                                                                             │
│  Just numbered sectors. No concept of files, names, or organization.       │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  FILE SYSTEM (what users see):                                              │
│  ─────────────────────────────                                              │
│                                                                             │
│  /                                                                          │
│  ├── hello.txt (17 bytes)                                                  │
│  ├── documents/                                                            │
│  │   ├── notes.txt (1,240 bytes)                                          │
│  │   └── data.csv (15,892 bytes)                                          │
│  └── programs/                                                             │
│      └── editor.exe (45,312 bytes)                                        │
│                                                                             │
│  Files have names, sizes, and can be organized in directories.             │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  THE FILE SYSTEM'S JOB:                                                     │
│                                                                             │
│  • Map file names to disk sectors                                          │
│  • Track which sectors belong to which file                                │
│  • Manage free space                                                        │
│  • Support directories (folders)                                           │
│  • Store metadata (size, dates, permissions)                               │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Why FAT16?

FAT16 (File Allocation Table, 16-bit) is the ideal learning file system:

┌─────────────────────────────────────────────────────────────────────────────┐
│                           WHY FAT16 FOR LEARNING?                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ADVANTAGES:                                                                │
│                                                                             │
│  ✓ Simple structure - only 3 regions to understand                         │
│  ✓ Publicly documented - Microsoft specification available                 │
│  ✓ No journaling complexity                                                │
│  ✓ Easy to create test images                                              │
│  ✓ Widely supported (Windows, Linux, embedded)                             │
│  ✓ Still used in many embedded systems                                     │
│                                                                             │
│  LIMITATIONS (that make it a good learning tool):                           │
│                                                                             │
│  • Max partition size ~2GB (perfect for learning)                          │
│  • Max file size 2GB                                                       │
│  • No permissions (keeps things simple)                                    │
│  • No journaling (simpler recovery)                                        │
│  • 8.3 filenames (easier parsing, LFN is optional)                         │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  COMPARISON:                                                                │
│                                                                             │
│  File System    Complexity    Learning Value    Real-World Use            │
│  ───────────    ──────────    ──────────────    ──────────────            │
│  FAT12          Very Low      Good              Floppy disks               │
│  FAT16          Low           Excellent         USB drives, cameras        │
│  FAT32          Medium        Good              USB, SD cards              │
│  ext2           Medium-High   Good              Linux legacy               │
│  ext4           High          Advanced          Linux current              │
│  NTFS           Very High     Expert            Windows current            │
│  ZFS            Extreme       Master            Enterprise storage         │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

FAT16 Disk Layout

A FAT16 partition has a simple three-region structure:

┌─────────────────────────────────────────────────────────────────────────────┐
│                           FAT16 DISK LAYOUT                                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  Sector 0             Reserved           FAT             Root Dir   Data   │
│  ↓                    Sectors            Region          Region     Region │
│  ┌────────┬─────────────────────┬────────────────┬──────────┬────────────┐ │
│  │ Boot   │ (optional reserved │ FAT1 │ FAT2   │ Root Dir │ Data       │ │
│  │ Sector │  sectors)          │      │ (copy) │ Entries  │ Clusters   │ │
│  └────────┴─────────────────────┴──────┴────────┴──────────┴────────────┘ │
│  │        │                     │               │          │              │
│  │        │                     │               │          │              │
│  ▼        ▼                     ▼               ▼          ▼              │
│                                                                             │
│  REGION 1: RESERVED SECTORS (starts at sector 0)                           │
│  ───────────────────────────────────────────────                           │
│  • First sector = Boot Sector (BPB)                                        │
│  • Contains file system parameters                                          │
│  • Usually 1 sector, but can be more                                        │
│                                                                             │
│  REGION 2: FAT TABLES (File Allocation Tables)                             │
│  ────────────────────────────────────────────                               │
│  • Array of 16-bit cluster pointers                                         │
│  • Maps cluster chains for files                                           │
│  • Usually 2 copies for redundancy                                          │
│  • Size depends on partition size                                           │
│                                                                             │
│  REGION 3: ROOT DIRECTORY                                                  │
│  ────────────────────────────                                               │
│  • Fixed-size area for root directory entries                              │
│  • Each entry is 32 bytes                                                  │
│  • Maximum entries defined in BPB (typically 512)                          │
│  • Unlike subdirectories, cannot grow                                       │
│                                                                             │
│  REGION 4: DATA REGION                                                     │
│  ────────────────────────                                                   │
│  • Where actual file contents are stored                                   │
│  • Organized in "clusters" (groups of sectors)                             │
│  • Subdirectories also stored here                                          │
│  • First cluster is cluster 2 (clusters 0,1 are reserved)                  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

The Boot Sector (BIOS Parameter Block)

The first sector contains critical file system parameters:

┌─────────────────────────────────────────────────────────────────────────────┐
│                    FAT16 BOOT SECTOR STRUCTURE (512 bytes)                  │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  Offset   Size   Name                Description                            │
│  ──────   ────   ─────────────────   ────────────────────────────────────  │
│  0x00     3      jmp_boot           Jump instruction (EB XX 90)            │
│  0x03     8      oem_name           OEM identifier ("MSWIN4.1")            │
│                                                                             │
│  ─── BIOS PARAMETER BLOCK (BPB) ──────────────────────────────────────────  │
│                                                                             │
│  0x0B     2      bytes_per_sector   Usually 512                            │
│  0x0D     1      sectors_per_cluster Power of 2 (1,2,4,8,16,32,64,128)     │
│  0x0E     2      reserved_sectors   Sectors before FAT (usually 1)         │
│  0x10     1      num_fats           Number of FAT copies (usually 2)       │
│  0x11     2      root_entry_count   Max root dir entries (512 for FAT16)   │
│  0x13     2      total_sectors_16   Total sectors if < 65536               │
│  0x15     1      media_type         0xF8 for hard disk                     │
│  0x16     2      fat_size_16        Sectors per FAT                        │
│  0x18     2      sectors_per_track  For CHS geometry (legacy)              │
│  0x1A     2      num_heads          For CHS geometry (legacy)              │
│  0x1C     4      hidden_sectors     Sectors before partition               │
│  0x20     4      total_sectors_32   Total sectors if >= 65536              │
│                                                                             │
│  ─── EXTENDED BOOT RECORD ────────────────────────────────────────────────  │
│                                                                             │
│  0x24     1      drive_number       BIOS drive number (0x80 for HD)        │
│  0x25     1      reserved1          Reserved                               │
│  0x26     1      boot_signature     Extended boot sig (0x29)               │
│  0x27     4      volume_id          Volume serial number                   │
│  0x2B     11     volume_label       Volume label ("MY DISK    ")           │
│  0x36     8      fs_type            "FAT16   " (informational only)        │
│                                                                             │
│  ─── BOOT CODE AND SIGNATURE ─────────────────────────────────────────────  │
│                                                                             │
│  0x3E     448    boot_code          Bootstrap code (or zeros)              │
│  0x1FE    2      signature          Boot signature (0xAA55)                │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  CALCULATING KEY VALUES:                                                    │
│                                                                             │
│  fat_start = reserved_sectors                                              │
│  root_dir_start = reserved_sectors + (num_fats * fat_size_16)              │
│  root_dir_sectors = (root_entry_count * 32) / bytes_per_sector             │
│  data_start = root_dir_start + root_dir_sectors                            │
│  first_data_sector = reserved_sectors + (num_fats * fat_size_16)           │
│                      + root_dir_sectors                                     │
│  cluster_to_lba(N) = first_data_sector + (N - 2) * sectors_per_cluster     │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

The FAT Table (File Allocation Table)

The FAT is an array of 16-bit values that form cluster chains:

┌─────────────────────────────────────────────────────────────────────────────┐
│                    FILE ALLOCATION TABLE (FAT) STRUCTURE                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  The FAT is an array where each entry corresponds to a cluster:            │
│                                                                             │
│  Index:     0      1      2      3      4      5      6      7      8      │
│           ┌──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┬──────┐ │
│  Value:   │0xFFF8│0xFFFF│  3   │  4   │0xFFFF│  7   │  0   │  8   │0xFFFF│ │
│           └──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┴──────┘ │
│             ↑      ↑      ↑      ↑      ↑      ↑      ↑      ↑      ↑      │
│             │      │      │      │      │      │      │      │      │      │
│          Media  Reserved │      │   End of   │   Free   │      │   End   │
│           ID    cluster  │      │   file A   │ cluster  │      │of file B│
│                          │      │            │          │      │         │
│                          └──────┴────────────┘          └──────┴─────────┘ │
│                          File A: clusters 2→3→4         File B: 5→7→8     │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  FAT ENTRY VALUES:                                                          │
│                                                                             │
│  Value          Meaning                                                     │
│  ─────          ───────                                                     │
│  0x0000         Free cluster (available for allocation)                    │
│  0x0001         Reserved (never used)                                       │
│  0x0002-0xFFEF  Next cluster in chain                                       │
│  0xFFF0-0xFFF6  Reserved                                                    │
│  0xFFF7         Bad cluster (damaged sector)                               │
│  0xFFF8-0xFFFF  End of cluster chain (EOF marker)                          │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  READING A FILE:                                                            │
│                                                                             │
│  1. Get starting cluster from directory entry                              │
│  2. Read data from that cluster                                            │
│  3. Look up next cluster in FAT table                                      │
│  4. If next cluster < 0xFFF8, repeat from step 2                           │
│  5. If next cluster >= 0xFFF8, file is complete                            │
│                                                                             │
│  EXAMPLE: Reading a file starting at cluster 2:                            │
│                                                                             │
│  Cluster 2: Read data, FAT[2] = 3 → continue                               │
│  Cluster 3: Read data, FAT[3] = 4 → continue                               │
│  Cluster 4: Read data, FAT[4] = 0xFFFF → END OF FILE                       │
│                                                                             │
│  File contents = data from clusters 2 + 3 + 4                              │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Directory Entry Structure

Each directory entry is 32 bytes:

┌─────────────────────────────────────────────────────────────────────────────┐
│                      FAT DIRECTORY ENTRY (32 bytes)                         │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  Offset   Size   Name             Description                               │
│  ──────   ────   ─────────        ───────────────────────────────────────  │
│  0x00     8      name             Short filename (8 chars, space-padded)   │
│  0x08     3      extension        Extension (3 chars, space-padded)        │
│  0x0B     1      attributes       File attributes (see below)              │
│  0x0C     1      reserved         Reserved for Windows NT                   │
│  0x0D     1      create_time_10ms Creation time in 10ms increments         │
│  0x0E     2      create_time      Creation time (hour:min:sec)             │
│  0x10     2      create_date      Creation date (year:month:day)           │
│  0x12     2      access_date      Last access date                         │
│  0x14     2      cluster_high     High 16 bits of cluster (FAT32 only)     │
│  0x16     2      modify_time      Last modification time                   │
│  0x18     2      modify_date      Last modification date                   │
│  0x1A     2      cluster_low      Starting cluster (low 16 bits)           │
│  0x1C     4      file_size        File size in bytes                       │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  ATTRIBUTE BYTE (offset 0x0B):                                             │
│                                                                             │
│  Bit   Hex     Name           Description                                   │
│  ───   ────    ────           ───────────────────────────────────────────  │
│  0     0x01    READ_ONLY      File is read-only                            │
│  1     0x02    HIDDEN         File is hidden                               │
│  2     0x04    SYSTEM         System file                                  │
│  3     0x08    VOLUME_ID      Volume label (special entry)                 │
│  4     0x10    DIRECTORY      Entry is a directory, not a file             │
│  5     0x20    ARCHIVE        File has been modified                       │
│                                                                             │
│  Special: 0x0F = Long File Name entry (see LFN section)                    │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  8.3 FILENAME FORMAT:                                                       │
│                                                                             │
│  Filename stored as 8 chars + 3 chars, uppercase, space-padded:            │
│                                                                             │
│  "HELLO   TXT" = HELLO.TXT                                                 │
│  "README     " = README (no extension)                                      │
│  "DATA    DAT" = DATA.DAT                                                  │
│  ".       " = "." (current directory)                                      │
│  "..         " = ".." (parent directory)                                   │
│                                                                             │
│  SPECIAL FIRST-BYTE VALUES:                                                 │
│                                                                             │
│  0x00 = End of directory (no more entries)                                 │
│  0xE5 = Deleted entry (slot can be reused)                                 │
│  0x05 = First char is actually 0xE5 (escaped)                              │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  DATE/TIME FORMAT:                                                          │
│                                                                             │
│  Time (2 bytes):                                                            │
│  Bits 15-11: Hours (0-23)                                                  │
│  Bits 10-5:  Minutes (0-59)                                                │
│  Bits 4-0:   Seconds/2 (0-29, representing 0-58)                           │
│                                                                             │
│  Date (2 bytes):                                                            │
│  Bits 15-9:  Year since 1980 (0-127, so 1980-2107)                        │
│  Bits 8-5:   Month (1-12)                                                  │
│  Bits 4-0:   Day (1-31)                                                    │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Long File Name (LFN) Entries

FAT16 supports long filenames through special directory entries:

┌─────────────────────────────────────────────────────────────────────────────┐
│                    LONG FILE NAME (LFN) ENTRIES                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  Long filenames are stored in special directory entries BEFORE the         │
│  standard 8.3 entry. Each LFN entry holds up to 13 Unicode characters.     │
│                                                                             │
│  LFN ENTRY STRUCTURE (32 bytes):                                           │
│                                                                             │
│  Offset   Size   Description                                                │
│  ──────   ────   ───────────────────────────────────────────────           │
│  0x00     1      Sequence number (1-based, 0x40 = last entry)              │
│  0x01     10     Characters 1-5 (Unicode, 2 bytes each)                    │
│  0x0B     1      Attribute (always 0x0F for LFN)                           │
│  0x0C     1      Type (0 for LFN entry)                                    │
│  0x0D     1      Checksum of 8.3 name                                      │
│  0x0E     12     Characters 6-11 (Unicode, 2 bytes each)                   │
│  0x1A     2      First cluster (always 0 for LFN)                          │
│  0x1C     4      Characters 12-13 (Unicode, 2 bytes each)                  │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  EXAMPLE: File "My Long Filename.txt"                                       │
│                                                                             │
│  Directory entries (read from disk):                                        │
│                                                                             │
│  Entry 1: LFN #2 (0x42) "name.txt"                                         │
│  Entry 2: LFN #1 (0x01) "My Long File"                                     │
│  Entry 3: 8.3 entry    "MYLONG~1TXT"                                       │
│                                                                             │
│  The 8.3 name is auto-generated with ~N suffix.                            │
│  LFN entries are stored in REVERSE order.                                   │
│  Sequence number 0x40 | N indicates last LFN entry.                         │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  LFN CHECKSUM ALGORITHM:                                                    │
│                                                                             │
│  uint8_t lfn_checksum(const char *shortname) {                             │
│      uint8_t sum = 0;                                                       │
│      for (int i = 0; i < 11; i++) {                                        │
│          sum = ((sum & 1) ? 0x80 : 0) + (sum >> 1) + shortname[i];         │
│      }                                                                      │
│      return sum;                                                            │
│  }                                                                          │
│                                                                             │
│  For this project, you can focus on 8.3 names first, then add LFN         │
│  support as an extension.                                                   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

ATA PIO Mode

ATA (AT Attachment) is the interface to IDE/PATA hard drives. PIO (Programmed I/O) mode uses CPU I/O ports:

┌─────────────────────────────────────────────────────────────────────────────┐
│                           ATA PIO MODE INTERFACE                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ATA uses two I/O port ranges:                                              │
│                                                                             │
│  Primary Controller:   0x1F0 - 0x1F7, 0x3F6                                │
│  Secondary Controller: 0x170 - 0x177, 0x376                                │
│                                                                             │
│  PORT LAYOUT (Primary Controller):                                          │
│                                                                             │
│  Port    Read                      Write                                   │
│  ─────   ────                      ─────                                   │
│  0x1F0   Data (16-bit)            Data (16-bit)                            │
│  0x1F1   Error register           Features register                        │
│  0x1F2   Sector count             Sector count                             │
│  0x1F3   LBA low (0-7)            LBA low (0-7)                            │
│  0x1F4   LBA mid (8-15)           LBA mid (8-15)                           │
│  0x1F5   LBA high (16-23)         LBA high (16-23)                         │
│  0x1F6   Drive/head               Drive/head                               │
│  0x1F7   Status                   Command                                  │
│  0x3F6   Alt status               Device control                           │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  STATUS REGISTER (0x1F7 read):                                              │
│                                                                             │
│  Bit   Name   Description                                                   │
│  ───   ────   ───────────────────────────────────────────────────────────  │
│  7     BSY    Drive is busy (wait for 0 before sending commands)           │
│  6     DRDY   Drive is ready                                               │
│  5     DF     Drive fault                                                  │
│  4     SRV    Service request                                              │
│  3     DRQ    Data request (ready to transfer data)                        │
│  2     CORR   Corrected data                                               │
│  1     IDX    Index mark (obsolete)                                        │
│  0     ERR    Error occurred (check error register)                        │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  DRIVE/HEAD REGISTER (0x1F6):                                               │
│                                                                             │
│  Bit 7:    1 (reserved)                                                    │
│  Bit 6:    LBA mode (1) or CHS mode (0)                                    │
│  Bit 5:    1 (reserved)                                                    │
│  Bit 4:    Drive (0 = master, 1 = slave)                                   │
│  Bits 3-0: LBA bits 24-27 (or head number in CHS)                          │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  ATA COMMANDS:                                                              │
│                                                                             │
│  0x20   READ SECTORS (PIO)                                                 │
│  0x30   WRITE SECTORS (PIO)                                                │
│  0xEC   IDENTIFY DRIVE                                                     │
│  0xE7   FLUSH CACHE                                                        │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

Reading a Sector with ATA PIO

┌─────────────────────────────────────────────────────────────────────────────┐
│                      ATA PIO READ SEQUENCE                                   │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  STEP 1: Wait for drive ready                                               │
│  ─────────────────────────────                                              │
│                                                                             │
│  while (inb(0x1F7) & 0x80);  // Wait for BSY = 0                           │
│                                                                             │
│  STEP 2: Select drive and send LBA address                                 │
│  ─────────────────────────────────────────                                  │
│                                                                             │
│  outb(0x1F6, 0xE0 | ((lba >> 24) & 0x0F));  // Drive 0, LBA bits 24-27    │
│  outb(0x1F2, 1);                             // Sector count = 1           │
│  outb(0x1F3, lba & 0xFF);                    // LBA bits 0-7               │
│  outb(0x1F4, (lba >> 8) & 0xFF);             // LBA bits 8-15              │
│  outb(0x1F5, (lba >> 16) & 0xFF);            // LBA bits 16-23             │
│                                                                             │
│  STEP 3: Send READ command                                                  │
│  ─────────────────────────                                                  │
│                                                                             │
│  outb(0x1F7, 0x20);  // READ SECTORS command                               │
│                                                                             │
│  STEP 4: Wait for data ready                                                │
│  ────────────────────────────                                               │
│                                                                             │
│  while (!(inb(0x1F7) & 0x08));  // Wait for DRQ = 1                        │
│                                                                             │
│  STEP 5: Read data (256 words = 512 bytes)                                 │
│  ─────────────────────────────────────────                                  │
│                                                                             │
│  for (int i = 0; i < 256; i++) {                                           │
│      buffer[i] = inw(0x1F0);  // Read 16-bit word                          │
│  }                                                                          │
│                                                                             │
│  ─────────────────────────────────────────────────────────────────────────  │
│                                                                             │
│  COMPLETE READ FUNCTION:                                                    │
│                                                                             │
│  void ata_read_sector(uint32_t lba, uint16_t *buffer) {                    │
│      // Wait for drive ready                                               │
│      while (inb(0x1F7) & 0x80);                                            │
│                                                                             │
│      // Select drive 0, LBA mode                                           │
│      outb(0x1F6, 0xE0 | ((lba >> 24) & 0x0F));                             │
│      outb(0x1F2, 1);                      // 1 sector                      │
│      outb(0x1F3, lba & 0xFF);             // LBA low                       │
│      outb(0x1F4, (lba >> 8) & 0xFF);      // LBA mid                       │
│      outb(0x1F5, (lba >> 16) & 0xFF);     // LBA high                      │
│                                                                             │
│      // Send read command                                                   │
│      outb(0x1F7, 0x20);                                                    │
│                                                                             │
│      // Wait for data                                                       │
│      while (!(inb(0x1F7) & 0x08));                                         │
│                                                                             │
│      // Read 256 words (512 bytes)                                         │
│      for (int i = 0; i < 256; i++) {                                       │
│          buffer[i] = inw(0x1F0);                                           │
│      }                                                                      │
│  }                                                                          │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

2.2 Why This Matters

Foundation for Operating Systems:

  • Every OS needs to read files from disk
  • File system drivers are essential kernel components
  • Understanding FAT helps understand more complex file systems

Understanding Storage:

  • How data is organized on disk
  • Why fragmentation affects performance
  • How file deletion really works

Practical Applications:

  • Embedded systems often use FAT
  • USB drives and SD cards use FAT32
  • Understanding enables data recovery
  • Firmware and bootloaders read FAT

2.3 Historical Context

The Evolution of FAT:

┌─────────────────────────────────────────────────────────────────────────────┐
│                         HISTORY OF FAT FILE SYSTEMS                         │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  1977 - FAT (original)                                                      │
│         Created by Marc McDonald for Microsoft Standalone Disk BASIC       │
│         Used in MDOS/MIDAS                                                  │
│                                                                             │
│  1980 - FAT12                                                               │
│         12-bit cluster entries                                              │
│         Max volume size: 32 MB                                              │
│         Used on floppy disks                                                │
│         QDOS (later MS-DOS) adopted FAT                                     │
│                                                                             │
│  1984 - FAT16                                                               │
│         16-bit cluster entries                                              │
│         Originally max 32 MB, later 2 GB                                    │
│         MS-DOS 3.0+, Windows 3.x                                            │
│                                                                             │
│  1996 - FAT32 (Windows 95 OSR2)                                             │
│         28-bit cluster entries (called "32" for marketing)                  │
│         Max volume size: 2 TB                                               │
│         Still widely used on USB drives                                     │
│                                                                             │
│  2006 - exFAT                                                               │
│         Designed for flash drives                                           │
│         Max file size: 16 EB (exabytes)                                    │
│         Used on SDXC cards                                                  │
│                                                                             │
│  Today - FAT is everywhere!                                                 │
│         - USB drives, SD cards                                              │
│         - Cameras, game consoles                                            │
│         - Embedded systems                                                  │
│         - EFI System Partition (ESP)                                        │
│         - Cross-platform compatibility                                      │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

2.4 Common Misconceptions

Misconception Reality
“FAT is obsolete” FAT is still the most portable file system, used on billions of devices
“FAT12/16/32 are different file systems” They share the same structure, only differ in cluster pointer size
“The FAT has to be at sector 1” FAT starts after reserved sectors (usually 1, but can be more)
“Root directory can grow” Unlike subdirectories, root directory has fixed size in FAT16
“LFN is part of original FAT” LFN was added by Microsoft in Windows 95, layered on top of 8.3
“Cluster 0 and 1 can be used” First usable cluster is always 2; 0 and 1 have special meanings

3. Project Specification

3.1 What You Will Build

A complete FAT16 file system driver that:

  1. Reads sectors from disk using ATA PIO mode
  2. Parses the boot sector to understand file system layout
  3. Reads the FAT table to follow cluster chains
  4. Lists directory contents showing files and subdirectories
  5. Reads file contents from fragmented clusters
  6. Navigates subdirectories to access nested files
  7. Provides a simple shell interface for user interaction

3.2 Functional Requirements

ID Requirement
FR1 Read sectors from ATA drive using PIO mode
FR2 Parse FAT16 boot sector (BPB) and extract parameters
FR3 Load and interpret the FAT table
FR4 List root directory contents with file names, sizes, and types
FR5 Read entire file contents by following cluster chains
FR6 Support subdirectory navigation
FR7 Handle both short (8.3) filenames and optional LFN
FR8 Provide ls and cat commands
FR9 Handle files larger than one cluster (fragmented files)
FR10 Display meaningful error messages for invalid operations

3.3 Non-Functional Requirements

ID Requirement
NFR1 Read speed should be limited only by hardware, not software
NFR2 Support disk images up to 128 MB (plenty for learning)
NFR3 Work correctly with standard FAT16 images from mkfs.fat
NFR4 Code should be readable and well-documented
NFR5 Must handle edge cases (empty directories, zero-byte files)

3.4 Example Output

$ make
nasm -f elf32 boot.asm -o boot.o
i686-elf-gcc -c kernel.c -o kernel.o -ffreestanding -Wall -Wextra
i686-elf-gcc -c ata.c -o ata.o -ffreestanding -Wall -Wextra
i686-elf-gcc -c fat16.c -o fat16.o -ffreestanding -Wall -Wextra
i686-elf-gcc -c shell.c -o shell.o -ffreestanding -Wall -Wextra
i686-elf-ld -T link.ld -o kernel.elf boot.o kernel.o ata.o fat16.o shell.o
$ qemu-system-i386 -kernel kernel.elf -hda disk.img -serial stdio

================================================================================
                    FAT16 File System Driver Demo v1.0
================================================================================

[INIT] Detecting ATA drives...
  Primary Master: 64 MB disk detected

[INIT] Reading boot sector...
  Bytes per sector: 512
  Sectors per cluster: 4
  Reserved sectors: 1
  Number of FATs: 2
  Root entry count: 512
  FAT size: 128 sectors
  Volume label: MY DISK
  Total clusters: 16256

[INIT] File system mounted successfully!

================================================================================
                              FAT16 Shell
================================================================================
Type 'help' for available commands.

> ls
Directory of /

  HELLO.TXT          17 bytes     2024-12-29 14:30
  README.MD         256 bytes     2024-12-29 14:32
  CONFIG.SYS         48 bytes     2024-12-29 14:28
  DOCS/             <DIR>         2024-12-29 14:35
  PROGRAMS/         <DIR>         2024-12-29 14:36

5 entries (321 bytes in files)

> cat hello.txt
Hello from file!

> ls docs
Directory of /DOCS

  .                 <DIR>
  ..                <DIR>
  NOTES.TXT       1,240 bytes     2024-12-29 14:35
  DATA.CSV       15,892 bytes     2024-12-29 14:35

4 entries (17,132 bytes in files)

> cat docs/notes.txt
These are my notes about FAT16 file system implementation.

The FAT16 file system was developed by Microsoft and has been
widely used in various operating systems including MS-DOS and
Windows 95/98.

Key components:
- Boot sector with BPB
- File Allocation Table
- Root directory
- Data region

...

> cd docs
Current directory: /DOCS

> ls
Directory of /DOCS

  .                 <DIR>
  ..                <DIR>
  NOTES.TXT       1,240 bytes     2024-12-29 14:35
  DATA.CSV       15,892 bytes     2024-12-29 14:35

4 entries (17,132 bytes in files)

> cat data.csv
Name,Value,Date
Item1,100,2024-01-01
Item2,200,2024-01-02
Item3,300,2024-01-03
...

> cd ..
Current directory: /

> cat programs/missing.txt
Error: File not found: MISSING.TXT

> help
Available commands:
  ls [path]     - List directory contents
  cat <file>    - Display file contents
  cd <path>     - Change directory
  pwd           - Print working directory
  info          - Show file system information
  help          - Show this help message

> info
=== FAT16 File System Information ===

Volume Label:      MY DISK
Volume Serial:     0x12345678
OEM Name:          MSDOS5.0

Geometry:
  Bytes/Sector:    512
  Sectors/Cluster: 4
  Cluster Size:    2048 bytes

Layout:
  Reserved:        1 sectors
  FAT Start:       sector 1
  FAT Size:        128 sectors x 2 copies
  Root Dir Start:  sector 257
  Root Dir Size:   32 sectors
  Data Start:      sector 289
  Total Clusters:  16256

Usage:
  Total Space:     64 MB
  Used:            2 MB
  Free:            62 MB

>

3.5 Real World Outcome

After completing this project, you will have:

  1. A working file system driver that can read files from a FAT16 disk image
  2. Deep understanding of how operating systems access persistent storage
  3. ATA driver experience for direct hardware communication
  4. Foundation for OS development – loading programs, reading configs
  5. Debugging skills for storage-related issues
  6. Practical FAT knowledge applicable to embedded systems

4. Solution Architecture

4.1 High-Level Design

┌─────────────────────────────────────────────────────────────────────────────┐
│                        FAT16 DRIVER ARCHITECTURE                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌────────────────────────────────────────────────────────────────────────┐│
│  │                         USER INTERFACE                                  ││
│  │  ┌──────────────────────────────────────────────────────────────────┐  ││
│  │  │                      Shell / Commands                            │  ││
│  │  │   ls    cat    cd    pwd    info    help                        │  ││
│  │  └──────────────────────────────────────────────────────────────────┘  ││
│  └────────────────────────────────────────────────────────────────────────┘│
│                                    │                                        │
│                                    ▼                                        │
│  ┌────────────────────────────────────────────────────────────────────────┐│
│  │                      FAT16 FILE SYSTEM LAYER                           ││
│  │  ┌─────────────────┐ ┌─────────────────┐ ┌────────────────────────┐   ││
│  │  │ Directory Ops   │ │   File Ops      │ │  Cluster Management    │   ││
│  │  │  - list_dir()   │ │  - read_file()  │ │  - get_next_cluster()  │   ││
│  │  │  - find_entry() │ │  - get_size()   │ │  - cluster_to_lba()    │   ││
│  │  │  - parse_entry()│ │                 │ │  - read_cluster()      │   ││
│  │  └─────────────────┘ └─────────────────┘ └────────────────────────┘   ││
│  │                              │                                          ││
│  │  ┌───────────────────────────────────────────────────────────────────┐ ││
│  │  │                    FAT Table Cache                                │ ││
│  │  │     (16-bit array mapping cluster -> next cluster)                │ ││
│  │  └───────────────────────────────────────────────────────────────────┘ ││
│  │                              │                                          ││
│  │  ┌───────────────────────────────────────────────────────────────────┐ ││
│  │  │                   BPB / Superblock Info                           │ ││
│  │  │   sectors_per_cluster, fat_start, root_start, data_start, etc.   │ ││
│  │  └───────────────────────────────────────────────────────────────────┘ ││
│  └────────────────────────────────────────────────────────────────────────┘│
│                                    │                                        │
│                                    ▼                                        │
│  ┌────────────────────────────────────────────────────────────────────────┐│
│  │                        BLOCK I/O LAYER                                  ││
│  │  ┌─────────────────────────────────────────────────────────────────┐   ││
│  │  │                    Sector Cache (optional)                       │   ││
│  │  └─────────────────────────────────────────────────────────────────┘   ││
│  │                              │                                          ││
│  │  ┌─────────────────────────────────────────────────────────────────┐   ││
│  │  │               read_sector(lba, buffer)                           │   ││
│  │  │               read_sectors(lba, count, buffer)                   │   ││
│  │  └─────────────────────────────────────────────────────────────────┘   ││
│  └────────────────────────────────────────────────────────────────────────┘│
│                                    │                                        │
│                                    ▼                                        │
│  ┌────────────────────────────────────────────────────────────────────────┐│
│  │                         ATA DRIVER                                      ││
│  │  ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────────┐   ││
│  │  │  ata_identify() │ │ ata_read_pio()  │ │  Port I/O (inb/outb)   │   ││
│  │  └─────────────────┘ └─────────────────┘ └─────────────────────────┘   ││
│  └────────────────────────────────────────────────────────────────────────┘│
│                                    │                                        │
│                                    ▼                                        │
│  ┌────────────────────────────────────────────────────────────────────────┐│
│  │                           HARDWARE                                      ││
│  │                     ATA Controller (0x1F0-0x1F7)                        ││
│  │                            Hard Disk                                    ││
│  └────────────────────────────────────────────────────────────────────────┘│
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

4.2 Key Components

Component File Responsibility
ATA Driver ata.c / ata.h Low-level disk I/O via PIO mode
FAT16 Driver fat16.c / fat16.h File system parsing and navigation
Shell shell.c / shell.h User command interface
Kernel Main kernel.c Initialization and main loop
Boot boot.asm Protected mode entry (from previous projects)
Linker Script link.ld Memory layout

4.3 Data Structures

//------------------------------------------
// ATA Driver Structures
//------------------------------------------

typedef struct {
    uint16_t base_port;       // Command block base (0x1F0 or 0x170)
    uint16_t ctrl_port;       // Control block (0x3F6 or 0x376)
    uint8_t  drive;           // 0 = master, 1 = slave
    uint32_t size_sectors;    // Total sectors on drive
    char     model[41];       // Drive model string
} ata_drive_t;

//------------------------------------------
// FAT16 Boot Parameter Block
//------------------------------------------

typedef struct __attribute__((packed)) {
    uint8_t  jmp_boot[3];           // Jump instruction
    char     oem_name[8];           // OEM identifier
    uint16_t bytes_per_sector;      // Usually 512
    uint8_t  sectors_per_cluster;   // Power of 2
    uint16_t reserved_sectors;      // Before first FAT
    uint8_t  num_fats;              // Usually 2
    uint16_t root_entry_count;      // Max root entries (512)
    uint16_t total_sectors_16;      // If < 65536
    uint8_t  media_type;            // 0xF8 for HD
    uint16_t fat_size_16;           // Sectors per FAT
    uint16_t sectors_per_track;     // CHS geometry
    uint16_t num_heads;             // CHS geometry
    uint32_t hidden_sectors;        // Before partition
    uint32_t total_sectors_32;      // If >= 65536

    // Extended boot record
    uint8_t  drive_number;          // BIOS drive
    uint8_t  reserved1;
    uint8_t  boot_signature;        // 0x29 if extended
    uint32_t volume_id;             // Serial number
    char     volume_label[11];      // Volume name
    char     fs_type[8];            // "FAT16   "
} fat16_bpb_t;

//------------------------------------------
// FAT16 Directory Entry
//------------------------------------------

typedef struct __attribute__((packed)) {
    char     name[8];               // Short name
    char     ext[3];                // Extension
    uint8_t  attributes;            // File attributes
    uint8_t  reserved;              // NT reserved
    uint8_t  create_time_tenth;     // 10ms units
    uint16_t create_time;           // Hour:min:sec
    uint16_t create_date;           // Year:month:day
    uint16_t access_date;           // Last access
    uint16_t cluster_high;          // High 16 bits (FAT32)
    uint16_t modify_time;           // Last modification
    uint16_t modify_date;
    uint16_t cluster_low;           // Starting cluster
    uint32_t file_size;             // Size in bytes
} fat16_dir_entry_t;

// Attribute flags
#define ATTR_READ_ONLY  0x01
#define ATTR_HIDDEN     0x02
#define ATTR_SYSTEM     0x04
#define ATTR_VOLUME_ID  0x08
#define ATTR_DIRECTORY  0x10
#define ATTR_ARCHIVE    0x20
#define ATTR_LFN        0x0F  // Long filename marker

//------------------------------------------
// FAT16 File System State
//------------------------------------------

typedef struct {
    // Drive reference
    ata_drive_t *drive;

    // BPB cached values
    uint16_t bytes_per_sector;
    uint8_t  sectors_per_cluster;
    uint16_t reserved_sectors;
    uint8_t  num_fats;
    uint16_t root_entry_count;
    uint16_t fat_size;
    uint32_t total_sectors;

    // Calculated values
    uint32_t fat_start_sector;
    uint32_t root_dir_start_sector;
    uint32_t root_dir_sectors;
    uint32_t data_start_sector;
    uint32_t total_clusters;
    uint32_t cluster_size;          // bytes

    // FAT table in memory
    uint16_t *fat_table;

    // Current directory tracking
    uint16_t current_dir_cluster;   // 0 = root
    char     current_path[256];
} fat16_fs_t;

//------------------------------------------
// File Handle (for reading files)
//------------------------------------------

typedef struct {
    fat16_fs_t     *fs;
    uint16_t       start_cluster;
    uint16_t       current_cluster;
    uint32_t       file_size;
    uint32_t       position;        // Current read position
    char           name[13];        // 8.3 + dot + null
} fat16_file_t;

4.4 Algorithm Overview

Mounting the File System:

1. Read sector 0 (boot sector)
2. Parse BPB structure
3. Calculate derived values:
   - fat_start = reserved_sectors
   - root_dir_start = reserved + (num_fats * fat_size)
   - root_dir_sectors = (root_entries * 32) / bytes_per_sector
   - data_start = root_dir_start + root_dir_sectors
4. Allocate and read FAT table into memory
5. Initialize current directory to root

Reading a Directory:

1. If directory is root:
   - Read sectors from root_dir_start
   - Process up to root_entry_count entries
2. If directory is subdirectory:
   - Start at directory's cluster
   - Read cluster, process entries
   - Follow cluster chain for large directories
3. For each entry:
   - Skip if first byte is 0x00 (end) or 0xE5 (deleted)
   - Skip if attributes == 0x0F (LFN entry)
   - Parse filename, size, attributes, cluster

Reading a File:

1. Find file's directory entry
2. Get starting cluster from entry
3. Initialize position = 0, bytes_read = 0
4. While position < file_size:
   a. Calculate offset within current cluster
   b. Read data from current cluster
   c. Advance position
   d. If reached end of cluster:
      - Look up next cluster in FAT
      - If FAT[cluster] >= 0xFFF8, EOF
      - Else current_cluster = FAT[cluster]
5. Return bytes_read

Converting Cluster to LBA:

lba = data_start_sector + (cluster - 2) * sectors_per_cluster

5. Implementation Guide

5.1 Environment Setup

# Same as previous x86 kernel projects

# Verify cross-compiler
i686-elf-gcc --version

# Verify NASM
nasm --version

# Verify QEMU
qemu-system-i386 --version

# Create a FAT16 disk image for testing
dd if=/dev/zero of=disk.img bs=1M count=64
mkfs.fat -F 16 -n "MY DISK" disk.img

# Mount and add test files (Linux)
mkdir mnt
sudo mount disk.img mnt
echo "Hello from file!" | sudo tee mnt/HELLO.TXT
sudo mkdir mnt/DOCS
echo "Test content" | sudo tee mnt/DOCS/TEST.TXT
sudo umount mnt

# Alternative: Use mtools (doesn't need root)
mformat -F -v "MY DISK" -i disk.img ::
echo "Hello from file!" | mwrite -i disk.img - ::/HELLO.TXT
mmd -i disk.img ::/DOCS
echo "Test content" | mwrite -i disk.img - ::/DOCS/TEST.TXT

5.2 Project Structure

fat16_driver/
├── Makefile
├── link.ld                 # Linker script
├── boot.asm                # Protected mode entry
├── kernel.c                # Main initialization
├── kernel.h                # Common definitions
├── ata.c                   # ATA PIO driver
├── ata.h
├── fat16.c                 # FAT16 file system
├── fat16.h
├── shell.c                 # Command interface
├── shell.h
├── string.c                # String utilities
├── string.h
├── vga.c                   # VGA text output (from P6)
├── vga.h
├── serial.c                # Serial debug output
├── serial.h
├── io.h                    # Port I/O functions
└── disk.img                # Test FAT16 image

5.3 The Core Question You’re Answering

“How does an operating system read files from a hard disk?”

You’ll answer this by:

  1. Writing code that communicates with the disk controller
  2. Understanding how data is organized into clusters and chains
  3. Parsing on-disk data structures to find files
  4. Following chains through the FAT to read complete files

5.4 Concepts You Must Understand First

Concept Self-Assessment Question Reference
x86 Port I/O How do inb/outb/inw work? Intel SDM, Vol 1
Protected Mode Is your kernel in protected mode? Project 6
Memory-mapped structs How to read a struct from raw bytes? C alignment
LBA addressing How is LBA different from CHS? ATA specification
Endianness Is FAT little-endian or big-endian? FAT specification

5.5 Questions to Guide Your Design

ATA Driver:

  • How do you detect if a drive is present?
  • What’s the maximum LBA you can address with 28-bit LBA?
  • How do you handle read errors?
  • Should you poll status or use interrupts?

FAT16 Driver:

  • How do you distinguish FAT12 from FAT16 from FAT32?
  • Should you load the entire FAT into memory?
  • How do you handle the root directory differently from subdirectories?
  • What happens when a file spans multiple clusters?

Shell:

  • How do you parse user input?
  • How do you handle paths with subdirectories?
  • What happens when a file is too large to display?

5.6 Thinking Exercise

Before coding, trace through this scenario by hand:

SCENARIO: Reading a 5,000 byte file on FAT16 with 2KB clusters

Given:
- bytes_per_sector = 512
- sectors_per_cluster = 4
- cluster_size = 2048 bytes
- File "DATA.TXT" starts at cluster 10
- FAT table: FAT[10] = 11, FAT[11] = 12, FAT[12] = 0xFFFF

Questions:
1. How many clusters does this file occupy?
2. What are the LBA addresses for each cluster?
3. How many sectors do you need to read?
4. How much of the last cluster is actually file data?

Draw the cluster chain and calculate byte ranges for each cluster.

5.7 Hints in Layers

Hint 1 - Starting Point: Start with the ATA driver. Get single-sector reads working first. Use a simple test: read sector 0 and verify the 0xAA55 signature at offset 0x1FE.

Hint 2 - Next Level: FAT16 detection: After reading the boot sector, check these conditions:

  • bpb->bytes_per_sector should be 512, 1024, 2048, or 4096
  • bpb->sectors_per_cluster should be 1, 2, 4, 8, 16, 32, 64, or 128
  • The signature at offset 0x1FE should be 0xAA55

Calculate total clusters:

uint32_t root_dir_sectors = ((bpb->root_entry_count * 32) +
                             (bpb->bytes_per_sector - 1)) /
                             bpb->bytes_per_sector;

uint32_t data_sectors = total_sectors -
                        (bpb->reserved_sectors +
                         (bpb->num_fats * bpb->fat_size_16) +
                         root_dir_sectors);

uint32_t total_clusters = data_sectors / bpb->sectors_per_cluster;

// FAT type determination:
// If total_clusters < 4085, it's FAT12
// If total_clusters < 65525, it's FAT16
// Otherwise, it's FAT32

Hint 3 - Technical Details: Directory entry parsing:

int parse_dir_entry(fat16_dir_entry_t *entry, char *filename) {
    // Check for end of directory
    if (entry->name[0] == 0x00) return -1;  // No more entries

    // Check for deleted entry
    if (entry->name[0] == 0xE5) return 0;   // Skip this entry

    // Check for LFN entry
    if ((entry->attributes & 0x0F) == 0x0F) return 0;  // Skip LFN

    // Check for volume label
    if (entry->attributes & ATTR_VOLUME_ID) return 0;  // Skip

    // Build 8.3 filename
    int pos = 0;
    for (int i = 0; i < 8 && entry->name[i] != ' '; i++) {
        filename[pos++] = entry->name[i];
    }

    // Add extension if present
    if (entry->ext[0] != ' ') {
        filename[pos++] = '.';
        for (int i = 0; i < 3 && entry->ext[i] != ' '; i++) {
            filename[pos++] = entry->ext[i];
        }
    }
    filename[pos] = '\0';

    return 1;  // Valid entry
}

Hint 4 - Debugging: Add extensive debug output:

void dump_bpb(fat16_bpb_t *bpb) {
    serial_printf("BPB Debug:\n");
    serial_printf("  bytes_per_sector:    %u\n", bpb->bytes_per_sector);
    serial_printf("  sectors_per_cluster: %u\n", bpb->sectors_per_cluster);
    serial_printf("  reserved_sectors:    %u\n", bpb->reserved_sectors);
    serial_printf("  num_fats:            %u\n", bpb->num_fats);
    serial_printf("  root_entry_count:    %u\n", bpb->root_entry_count);
    serial_printf("  fat_size_16:         %u\n", bpb->fat_size_16);
    serial_printf("  total_sectors:       %u\n",
                  bpb->total_sectors_16 ? bpb->total_sectors_16
                                        : bpb->total_sectors_32);
}

void dump_fat_chain(fat16_fs_t *fs, uint16_t start_cluster) {
    serial_printf("FAT chain from cluster %u:\n  ", start_cluster);
    uint16_t cluster = start_cluster;
    int count = 0;
    while (cluster < 0xFFF8 && count < 100) {  // Limit to prevent infinite loop
        serial_printf("%u -> ", cluster);
        cluster = fs->fat_table[cluster];
        count++;
    }
    serial_printf("END\n");
}

5.8 The Interview Questions They’ll Ask

  1. “Explain how FAT16 stores files on disk.”
    • Expected: Boot sector -> FAT table -> Directory entries -> Data clusters
    • Mention cluster chains and how FAT entries link clusters
  2. “What is fragmentation and how does FAT handle it?”
    • Expected: Files stored in non-contiguous clusters
    • FAT table links clusters regardless of physical location
    • Performance impact: more seeks needed
  3. “How does the operating system know where a file’s data is located?”
    • Expected: Directory entry contains starting cluster
    • FAT table chains to subsequent clusters
    • Data region starts after root directory
  4. “What’s the difference between a file and a directory in FAT?”
    • Expected: Both are directory entries
    • Directory has ATTR_DIRECTORY flag set
    • Directory’s “file data” is array of more directory entries
  5. “Why might FAT16 be preferred over FAT32 in some embedded systems?”
    • Expected: Smaller FAT table, simpler code, less memory
    • Good enough for small storage (< 2GB)
    • More compatible with legacy systems
  6. “How would you implement file deletion in FAT?”
    • Expected: Mark directory entry as deleted (0xE5)
    • Mark all clusters in chain as free (0x0000)
    • Data remains until overwritten (data recovery possible)
  7. “What happens if the FAT table becomes corrupted?”
    • Expected: File data becomes unrecoverable
    • That’s why two FAT copies exist
    • fsck/chkdsk compare and repair

5.9 Books That Will Help

Topic Book Chapters
File Systems Overview “Operating Systems: Three Easy Pieces” Ch. 39-42
FAT File System Microsoft FAT Specification Entire document
ATA Interface ATA/ATAPI-7 Specification Volume 1, Ch. 5-6
x86 I/O “Write Great Code, Vol 1” Ch. 11
Block I/O “Linux Device Drivers” Ch. 16

5.10 Implementation Phases

Phase 1: ATA Driver (Week 1)

  • Implement port I/O functions (inb, outb, inw, outw)
  • Write ATA initialization and drive detection
  • Implement single-sector read function
  • Test by reading sector 0 and verifying boot signature
  • Checkpoint: Can read any sector from disk

Phase 2: FAT16 Mounting (Week 1-2)

  • Parse boot sector into BPB structure
  • Calculate file system geometry (FAT start, root start, data start)
  • Allocate and load FAT table into memory
  • Verify FAT16 detection (cluster count check)
  • Checkpoint: Prints correct file system parameters

Phase 3: Directory Reading (Week 2)

  • Implement root directory reading
  • Parse directory entries, skip deleted and LFN entries
  • Format and display directory listing
  • Handle directories with many entries (multiple sectors)
  • Checkpoint: ls command shows root directory

Phase 4: File Reading (Week 2-3)

  • Implement file lookup by name
  • Follow cluster chain to read file data
  • Handle files larger than one cluster
  • Handle edge cases (empty files, exactly cluster-sized files)
  • Checkpoint: cat command displays file contents

Phase 5: Subdirectory Navigation (Week 3)

  • Implement subdirectory reading (similar to file reading)
  • Handle “.” and “..” entries
  • Implement path parsing and navigation
  • Implement cd command
  • Checkpoint: Can navigate and list subdirectories

Phase 6: Polish and Testing (Week 3-4)

  • Add pwd, info, help commands
  • Create comprehensive test disk image
  • Test edge cases (long paths, many files)
  • Add error handling and user-friendly messages
  • Optional: Add LFN support
  • Checkpoint: All tests pass, clean user experience

5.11 Key Implementation Decisions

Decision Option A Option B Recommendation
FAT caching Load entire FAT Load on demand Entire FAT (simpler, <128KB for 64MB disk)
Sector buffer Static global Stack-allocated Static (stack might be limited)
Case sensitivity Case-sensitive Case-insensitive Case-insensitive (FAT standard)
Path separator / Unix style \ DOS style / (Unix, more familiar)
Error handling Return codes Global errno Return codes (simpler)
Directory listing Simple Column-formatted Column-formatted (more readable)

6. Testing Strategy

6.1 Creating Test Disk Images

#!/bin/bash
# create_test_disk.sh

# Create 64MB FAT16 disk image
dd if=/dev/zero of=disk.img bs=1M count=64
mkfs.fat -F 16 -n "TESTDISK" disk.img

# Add test files using mtools (no root needed)
# Root directory files
echo "Hello from file!" > /tmp/hello.txt
mcopy -i disk.img /tmp/hello.txt ::HELLO.TXT

echo "Short file" > /tmp/short.txt
mcopy -i disk.img /tmp/short.txt ::SHORT.TXT

# Create a file larger than one cluster
dd if=/dev/urandom bs=5000 count=1 | base64 > /tmp/large.txt
mcopy -i disk.img /tmp/large.txt ::LARGE.TXT

# Create subdirectory with files
mmd -i disk.img ::DOCS
echo "Document content" > /tmp/doc.txt
mcopy -i disk.img /tmp/doc.txt ::DOCS/README.TXT

# Create nested subdirectory
mmd -i disk.img ::DOCS/NESTED
echo "Nested file" > /tmp/nested.txt
mcopy -i disk.img /tmp/nested.txt ::DOCS/NESTED/FILE.TXT

# Create empty directory
mmd -i disk.img ::EMPTY

# Create empty file
touch /tmp/empty.txt
mcopy -i disk.img /tmp/empty.txt ::EMPTY.TXT

# Show what was created
echo "Disk contents:"
mdir -i disk.img ::
mdir -i disk.img ::DOCS

6.2 Unit Tests

// test_ata.c

void test_ata_read_sector_0(void) {
    uint8_t buffer[512];

    int result = ata_read_sector(0, buffer);
    assert(result == 0);  // No error

    // Check boot signature
    uint16_t signature = *(uint16_t*)(buffer + 0x1FE);
    assert(signature == 0xAA55);

    serial_print("test_ata_read_sector_0: PASS\n");
}

void test_ata_read_multiple_sectors(void) {
    uint8_t buffer1[512], buffer2[512];

    // Read same sector twice, should match
    ata_read_sector(0, buffer1);
    ata_read_sector(0, buffer2);

    for (int i = 0; i < 512; i++) {
        assert(buffer1[i] == buffer2[i]);
    }

    // Read different sectors, should NOT match
    ata_read_sector(1, buffer2);
    int different = 0;
    for (int i = 0; i < 512; i++) {
        if (buffer1[i] != buffer2[i]) different++;
    }
    assert(different > 0);  // At least some bytes different

    serial_print("test_ata_read_multiple_sectors: PASS\n");
}
// test_fat16.c

void test_fat16_mount(void) {
    fat16_fs_t fs;

    int result = fat16_mount(&primary_drive, &fs);
    assert(result == 0);

    // Verify expected values
    assert(fs.bytes_per_sector == 512);
    assert(fs.sectors_per_cluster == 4);  // 2KB clusters for 64MB

    serial_print("test_fat16_mount: PASS\n");
}

void test_fat16_list_root(void) {
    fat16_fs_t fs;
    fat16_mount(&primary_drive, &fs);

    fat16_dir_entry_t entries[16];
    int count = fat16_list_dir(&fs, 0, entries, 16);

    assert(count > 0);  // Should have at least one entry

    // Look for HELLO.TXT
    int found = 0;
    for (int i = 0; i < count; i++) {
        char name[13];
        parse_filename(&entries[i], name);
        if (strcmp(name, "HELLO.TXT") == 0) {
            found = 1;
            assert(entries[i].file_size == 17);  // "Hello from file!\n"
        }
    }
    assert(found);

    serial_print("test_fat16_list_root: PASS\n");
}

void test_fat16_read_file(void) {
    fat16_fs_t fs;
    fat16_mount(&primary_drive, &fs);

    char buffer[64];
    int bytes = fat16_read_file(&fs, "HELLO.TXT", buffer, sizeof(buffer));

    assert(bytes == 17);
    buffer[bytes] = '\0';
    assert(strcmp(buffer, "Hello from file!\n") == 0);

    serial_print("test_fat16_read_file: PASS\n");
}

void test_fat16_large_file(void) {
    fat16_fs_t fs;
    fat16_mount(&primary_drive, &fs);

    // LARGE.TXT is ~6700 bytes, spans multiple clusters
    char buffer[8192];
    int bytes = fat16_read_file(&fs, "LARGE.TXT", buffer, sizeof(buffer));

    assert(bytes > 2048);  // Should be larger than one cluster

    // Verify it's valid base64 (no null bytes in middle)
    for (int i = 0; i < bytes - 1; i++) {
        assert(buffer[i] != '\0');
    }

    serial_print("test_fat16_large_file: PASS\n");
}

void test_fat16_subdirectory(void) {
    fat16_fs_t fs;
    fat16_mount(&primary_drive, &fs);

    // Find DOCS directory
    fat16_dir_entry_t entry;
    int result = fat16_find_entry(&fs, "DOCS", &entry);
    assert(result == 0);
    assert(entry.attributes & ATTR_DIRECTORY);

    // List DOCS directory
    fat16_dir_entry_t entries[16];
    int count = fat16_list_dir(&fs, entry.cluster_low, entries, 16);

    // Should have ".", "..", and "README.TXT"
    assert(count >= 3);

    serial_print("test_fat16_subdirectory: PASS\n");
}

6.3 Integration Tests

void test_shell_commands(void) {
    // Test ls command
    shell_execute("ls");
    // Manual verification: check serial output

    // Test cat command
    shell_execute("cat HELLO.TXT");
    // Should print: Hello from file!

    // Test cd and pwd
    shell_execute("cd DOCS");
    shell_execute("pwd");
    // Should print: /DOCS

    // Test relative paths
    shell_execute("cat README.TXT");
    // Should print: Document content

    // Test parent directory
    shell_execute("cd ..");
    shell_execute("pwd");
    // Should print: /

    serial_print("test_shell_commands: Manual verification required\n");
}

6.4 Edge Case Tests

void test_edge_cases(void) {
    fat16_fs_t fs;
    fat16_mount(&primary_drive, &fs);

    // Test empty file
    char buffer[64];
    int bytes = fat16_read_file(&fs, "EMPTY.TXT", buffer, sizeof(buffer));
    assert(bytes == 0);

    // Test non-existent file
    bytes = fat16_read_file(&fs, "NOEXIST.TXT", buffer, sizeof(buffer));
    assert(bytes == -1);  // Error return

    // Test empty directory
    fat16_dir_entry_t entries[16];
    fat16_dir_entry_t dir_entry;
    fat16_find_entry(&fs, "EMPTY", &dir_entry);
    int count = fat16_list_dir(&fs, dir_entry.cluster_low, entries, 16);
    assert(count == 2);  // Only "." and ".."

    serial_print("test_edge_cases: PASS\n");
}

7. Common Pitfalls and Debugging

7.1 Common Pitfalls

Symptom Likely Cause Fix
Garbage when reading sectors Wrong endianness x86 is little-endian, FAT is little-endian (should match)
First cluster data wrong Cluster numbering First data cluster is 2, not 0
Can’t find files Case sensitivity Convert filenames to uppercase for comparison
Partial file contents Wrong cluster chain Verify FAT table reading and interpretation
Subdirectory empty Reading root instead Root uses sectors, subdirs use clusters
Hang on disk read Not waiting for BSY Poll status register before and after commands
Wrong file size Attribute confusion Check attribute byte, don’t confuse with size
Truncated long files Single cluster read Must follow chain for multi-cluster files

7.2 Debugging Techniques

Hex Dump Sector:

void dump_sector(uint8_t *buffer) {
    for (int row = 0; row < 32; row++) {
        serial_printf("%04x: ", row * 16);
        for (int col = 0; col < 16; col++) {
            serial_printf("%02x ", buffer[row * 16 + col]);
        }
        serial_printf(" ");
        for (int col = 0; col < 16; col++) {
            char c = buffer[row * 16 + col];
            serial_printf("%c", (c >= 32 && c < 127) ? c : '.');
        }
        serial_printf("\n");
    }
}

Verify Disk Image with Linux:

# Mount the disk image to verify contents
sudo mount -o loop,ro disk.img /mnt
ls -la /mnt
cat /mnt/HELLO.TXT
sudo umount /mnt

# Or use mtools
mdir -i disk.img ::
mtype -i disk.img ::HELLO.TXT

# Hex dump boot sector
xxd disk.img | head -32

# Check FAT entries
# FAT starts at sector 1 (reserved_sectors = 1)
# Each FAT16 entry is 2 bytes
dd if=disk.img bs=512 skip=1 count=1 | xxd | head

GDB with QEMU:

# Terminal 1:
qemu-system-i386 -kernel kernel.elf -hda disk.img -s -S -serial stdio

# Terminal 2:
gdb kernel.elf
(gdb) target remote localhost:1234
(gdb) break fat16_read_file
(gdb) continue

# When breakpoint hits:
(gdb) print *fs
(gdb) print filename
(gdb) x/32xb buffer    # Examine buffer contents

7.3 The Classic Bugs

Bug 1: Off-by-two cluster calculation

// WRONG - forgetting that clusters start at 2
uint32_t lba = fs->data_start_sector + cluster * fs->sectors_per_cluster;

// RIGHT
uint32_t lba = fs->data_start_sector + (cluster - 2) * fs->sectors_per_cluster;

Bug 2: Reading root directory as clusters

// WRONG - treating root directory like a subdirectory
uint32_t lba = cluster_to_lba(fs, 0);

// RIGHT - root directory has special location
if (cluster == 0) {
    // Root directory: use root_dir_start_sector
    lba = fs->root_dir_start_sector;
} else {
    // Subdirectory: use cluster chain
    lba = cluster_to_lba(fs, cluster);
}

Bug 3: Case-sensitive comparison

// WRONG - FAT stores uppercase
if (strcmp(entry->name, "hello   ") == 0)

// RIGHT - compare uppercase
char name_upper[12];
for (int i = 0; i < 11; i++) {
    name_upper[i] = toupper(entry->name[i]);
}
if (memcmp(name_upper, "HELLO   TXT", 11) == 0)

Bug 4: Not handling packed structures

// WRONG - might have padding on some compilers
struct fat16_bpb {
    uint8_t jmp[3];
    char oem[8];
    uint16_t bytes_per_sector;  // Might be at wrong offset!
};

// RIGHT - force packed
struct fat16_bpb {
    uint8_t jmp[3];
    char oem[8];
    uint16_t bytes_per_sector;
} __attribute__((packed));

Bug 5: Not waiting for drive ready

// WRONG - sending commands immediately
outb(0x1F7, 0x20);  // Might fail if drive busy

// RIGHT - wait for BSY clear first
while (inb(0x1F7) & 0x80);  // Wait for BSY = 0
outb(0x1F7, 0x20);

8. Extensions and Challenges

8.1 Beginner Extensions

  1. File search: Implement recursive search for files by name
  2. File size formatting: Show sizes in KB/MB when appropriate
  3. Date formatting: Parse and display human-readable dates
  4. Tab completion: Basic filename completion for shell commands
  5. Command history: Up/down arrows to recall previous commands

8.2 Intermediate Extensions

  1. Long File Name (LFN) support: Read and display long filenames
  2. Write support: Implement file creation and writing
  3. Directory creation: Implement mkdir command
  4. File copying: Implement cp command (requires write)
  5. Multiple drives: Support secondary ATA drive

8.3 Advanced Extensions

  1. FAT32 support: Extend to handle 32-bit cluster numbers
  2. Caching layer: Add sector cache for better performance
  3. VFS layer: Abstract file system interface
  4. Partition table parsing: Read MBR and support partitions
  5. DMA transfers: Use ATA DMA mode for better performance

8.4 Expert Challenges

  1. File deletion and defragmentation: Implement delete and defrag
  2. fsck equivalent: Implement file system consistency checking
  3. Journaling wrapper: Add simple journaling for crash recovery
  4. Network file access: Load files from TFTP or HTTP
  5. Port to ext2: Implement a more complex file system

9. Real-World Connections

9.1 How Linux Does It

Linux VFS Layer: Linux has a Virtual File System layer that abstracts different file systems:

User Space:     open("/mnt/usb/file.txt", O_RDONLY)
                           │
                           ▼
Kernel VFS:     struct file_operations { .read, .write, ... }
                           │
                           ▼
FAT Driver:     fat_file_read() → fat_get_cluster() → sb_bread()
                           │
                           ▼
Block Layer:    submit_bio() → request queue
                           │
                           ▼
ATA Driver:     ata_scsi_queuecmd() → ata_pio_task()

Linux FAT Implementation:

  • Located in fs/fat/ directory
  • inode.c - inode operations
  • dir.c - directory operations
  • file.c - file operations
  • cache.c - FAT table caching
  • fatent.c - FAT entry manipulation

9.2 Modern Storage Stack

Your simple ATA PIO driver is the ancestor of modern storage:

Your Implementation:        Modern Linux:
─────────────────          ─────────────
                           ┌─────────────────┐
                           │ io_uring / AIO  │  Async interface
                           └────────┬────────┘
                                    │
FAT16 Driver     ───►      ┌────────▼────────┐
                           │    VFS Layer    │  Abstraction
                           └────────┬────────┘
                                    │
                           ┌────────▼────────┐
                           │  Block Layer    │  Scheduling, merging
                           └────────┬────────┘
                                    │
ATA PIO Driver   ───►      ┌────────▼────────┐
                           │    libata       │  AHCI, NVMe
                           └────────┬────────┘
                                    │
Direct Port I/O  ───►      ┌────────▼────────┐
                           │   DMA / IOMMU   │  Hardware acceleration
                           └─────────────────┘

9.3 Career Applications

Embedded Systems Developer:

  • Many embedded devices use FAT for external storage
  • SD cards in cameras, IoT devices, automotive
  • Understanding FAT is directly applicable

Operating Systems Developer:

  • File system development is core OS work
  • Companies: Microsoft, Apple, Google, Red Hat
  • Foundation for more complex file systems

Storage Systems Engineer:

  • Understanding from raw blocks to files
  • Companies: NetApp, Pure Storage, Dell EMC
  • Performance optimization, reliability

Forensics and Data Recovery:

  • Understanding FAT structure enables recovery
  • Deleted file recovery
  • Disk image analysis

10. Resources

10.1 Essential Reading

Resource URL/Location Usage
Microsoft FAT Spec Microsoft Hardware Dev Definitive reference
OSDev FAT wiki.osdev.org/FAT Implementation guide
OSDev ATA PIO wiki.osdev.org/ATA_PIO_Mode ATA driver reference
OSTEP File Systems ostep.org, Ch. 39-42 Conceptual foundation

10.2 Reference Implementations

Project URL Notes
FatFs elm-chan.org/fsw/ff/ Embedded FAT library
Linux FAT github.com/torvalds/linux/tree/master/fs/fat Production implementation
SerenityOS FAT serenityos.org Clean C++ implementation
xv6 file system github.com/mit-pdos/xv6-public Unix V6 style (simpler)

10.3 Specifications

Document Description
Microsoft FAT Specification Official FAT12/16/32 spec
ATA/ATAPI-7 T13 standard for ATA interface
ECMA-107 FAT file system standard

10.4 Tools

Tool Usage
mtools Manipulate FAT images without mounting
mkfs.fat Create FAT file systems
dosfsck Check/repair FAT file systems
xxd Hex dump for debugging
hd / hexdump Alternative hex dump

11. Self-Assessment Checklist

Before Starting

  • I have completed Project 6 (Protected Mode Kernel)
  • I understand x86 port I/O (inb/outb/inw/outw)
  • I can read and understand hardware specifications
  • I understand binary/hexadecimal and byte ordering
  • I’m comfortable with C structs and pointers

After Phase 2 (FAT16 Mounting)

  • I can explain the FAT16 disk layout
  • I understand the difference between sectors and clusters
  • My driver correctly parses the BPB
  • I can calculate FAT start, root start, and data start sectors
  • I understand why FAT is duplicated

After Phase 4 (File Reading)

  • I can explain how cluster chains work
  • My driver reads files spanning multiple clusters
  • I understand the difference between root directory and subdirectories
  • I can trace a file from directory entry to disk sectors
  • I handle edge cases (empty files, exact cluster-size files)

After Completion

  • All test cases pass
  • I can navigate directories with cd command
  • I can display files with cat command
  • I understand how to extend to FAT32
  • I can answer all interview questions in section 5.8
  • I could port this to a different architecture

12. Submission / Completion Criteria

Your project is complete when:

  1. Functionality:
    • ATA driver can read any sector from disk
    • FAT16 boot sector is correctly parsed
    • Root directory listing works
    • File reading works (including multi-cluster files)
    • Subdirectory navigation works
    • Shell provides ls, cat, cd, pwd commands
  2. Code Quality:
    • Clean separation: ata.c, fat16.c, shell.c
    • Well-documented structures and algorithms
    • No warnings with -Wall -Wextra
    • Handles errors gracefully
  3. Testing:
    • Works with standard FAT16 images from mkfs.fat
    • Handles edge cases (empty files, large files)
    • Verified against mtools output
  4. Understanding:
    • Can explain FAT16 structure
    • Can explain cluster chain following
    • Can add new features (LFN, write support)
    • Understands real-world applications

Summary

This project bridges the gap between raw disk hardware and the organized files users expect. You’ll learn:

  • Block Device I/O – How to talk to storage hardware
  • On-Disk Data Structures – How file systems organize data
  • Cluster Chains – How files are stored and retrieved
  • Directory Hierarchy – How paths map to disk locations

These concepts are fundamental to all operating systems. Whether you work on Linux, Windows, embedded systems, or cloud storage, understanding file systems from the ground up makes you a more effective systems programmer.

Estimated time: 3-4 weeks for complete implementation Difficulty: Expert (requires understanding of Projects 6-8) Reward: Your OS can now load files!


Previous: P10-kernel-multitasking.md - Preemptive multitasking kernel Next: P12-uefi-application.md - UEFI application development