Project 6: ZFS Lab (Datasets, Snapshots, Quotas)

Build a ZFS pool with datasets, snapshots, and quotas, then prove rollback works.

Quick Reference

Attribute                          Value
Difficulty                         Level 2
Time Estimate                      1-2 weekends
Main Programming Language          Shell (sh)
Alternative Programming Languages  csh, Python
Coolness Level                     Level 3
Business Potential                 Level 2
Prerequisites                      Projects 1-5 complete, basic storage concepts
Key Topics                         zpool, zfs datasets, snapshots, quotas

1. Learning Objectives

By completing this project, you will:

  1. Create and inspect a ZFS pool and datasets.
  2. Apply quotas and reservations to manage space.
  3. Create, list, and roll back snapshots safely.
  4. Document a repeatable ZFS workflow for recovery.

2. All Theory Needed (Per-Concept Breakdown)

Concept 1: ZFS Pools, Datasets, and Properties

Fundamentals ZFS combines a filesystem and volume manager. Storage is organized into pools (zpool) that contain datasets (zfs). Each dataset is a filesystem with its own properties like mountpoint, compression, and quota. ZFS properties are set per dataset and inherited, which makes layout planning important. In a FreeBSD VM, ZFS gives you snapshots, data integrity checks, and the ability to manage storage with fine-grained control. For this project, you must understand how pools and datasets relate, how properties cascade, and how to inspect the pool’s health.

In practice, write a short checklist for ZFS pools and datasets and confirm it after each reboot. This keeps the concept concrete and prevents accidental drift between sessions.

Deep Dive into the concept A ZFS pool is built from one or more virtual devices (vdevs), which can be disks, partitions, or files (in labs). The pool is the top-level storage allocator and is responsible for distributing writes across vdevs, tracking checksums, and managing space. When you create a dataset, you are creating a logical filesystem inside the pool. Datasets are lightweight and can be created for different directories (e.g., /home, /var, /usr/local). Because datasets are cheap, you can use them to isolate data and apply different properties.

Properties are at the heart of ZFS. They define behaviors like compression, recordsize, atime, quota, and reservation. Properties inherit from parent datasets unless overridden. This allows you to set global defaults at the pool level and then fine-tune specific datasets. For example, you might set compression=lz4 globally, but disable it for a dataset that stores already-compressed media. Quotas and reservations are also properties: a quota caps maximum usage, while a reservation guarantees minimum space. This is important for ensuring critical datasets cannot be starved by others.
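As a minimal sketch of inheritance and overrides, assuming the default zroot pool and a hypothetical zroot/media dataset:

zfs set compression=lz4 zroot          # children inherit lz4 unless overridden
zfs create zroot/media                 # hypothetical dataset for pre-compressed files
zfs set compression=off zroot/media    # a local value overrides the inherited one
zfs get -r compression zroot           # SOURCE column shows "local" vs "inherited"
zfs set quota=5G zroot/media           # cap maximum usage
zfs set reservation=1G zroot/media     # guarantee minimum space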

ZFS also integrates data integrity. Each block is checksummed, and the checksum is verified on read. This means ZFS can detect corruption, and in redundant pools it can repair it. In a VM with a single disk, you still benefit from detection even if you cannot repair. You can run zpool scrub to proactively verify data. This fits with FreeBSD’s operational ethos: systems should detect errors before they become outages.
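You can trigger and observe an integrity check at any time, using the pool name from this lab:

zpool scrub zroot     # read and verify every block's checksum
zpool status zroot    # shows scrub progress, completion time, and any errors found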

When planning datasets, you need to balance simplicity and control. A simple lab might create datasets for /home, /var, and /usr/local. This allows you to snapshot and rollback user data separately from system software. In later projects, this separation becomes important for upgrades and jails. The key is to treat datasets as boundaries: each one should contain a coherent category of data and have properties aligned to its usage.
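A sketch of that layout, assuming the default zroot pool; on some FreeBSD-on-ZFS installs a few of these datasets already exist, so adjust names as needed:

zfs create -o mountpoint=/home zroot/home   # explicit mountpoints avoid surprises
zfs create -o mountpoint=/var zroot/var     # skip if your install already has this
zfs list -o name,mountpoint,quota           # confirm the layout and properties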

Operationally, ZFS pools and datasets are easiest to keep stable when you treat them as a small contract between configuration, tooling, and observable outputs. Write down the exact files that own the state and the commands that reveal the current truth. Then verify the contract at three points: immediately after you make the change, after a reboot, and after a deliberate disturbance such as restarting services or reloading modules. FreeBSD rewards this discipline because it rarely hides state; if something changes, it is usually in a file you control. Make a habit of collecting a before-and-after snapshot of commands and outputs so you can explain which change caused which effect.

At scale, working with ZFS pools and datasets is also about failure containment. Identify what must remain available when something breaks and design a safe escape hatch. For example, keep console access for firewall changes, keep a previous boot environment for upgrades, or keep a dataset snapshot before risky edits. The same pattern applies across domains: define invariants, define the rollback path, and then only proceed when you can trigger that rollback quickly. Finally, test the failure path while the system is healthy; you learn more from a controlled rollback than from an emergency. This perspective turns the lab exercise into an operational capability you can trust on production systems.

How this fits into the project series

The dataset boundaries you create here carry forward: later projects depend on this separation for upgrades (boot environments) and for isolating jail data.

Definitions & key terms

  • zpool -> ZFS storage pool.
  • dataset -> ZFS filesystem inside a pool.
  • property -> Config value like compression or quota.
  • scrub -> Integrity check of all pool data.

Mental model diagram

Pool (zroot)
+-- dataset: / (root)
+-- dataset: /home
+-- dataset: /var

How it works (step-by-step, with invariants and failure modes)

  1. Create or import a pool.
  2. Create datasets inside the pool.
  3. Set properties and quotas.
  4. Verify pool health with zpool status.

Invariants:

  • Pool must be imported to access datasets.
  • Dataset mountpoints must not conflict.

Failure modes:

  • Pool out of space -> dataset writes fail.
  • Wrong mountpoint -> data appears missing.

Minimal concrete example

zpool status
zfs list
zfs get all zroot

Common misconceptions

  • “Datasets are just directories.” -> They are independent filesystems.
  • “Quotas are global.” -> They apply per dataset.

Check-your-understanding questions

  1. What is the difference between a pool and a dataset?
  2. How does property inheritance work?
  3. Why run a scrub?

Check-your-understanding answers

  1. Pools are storage containers; datasets are filesystems within pools.
  2. Child datasets inherit unless overridden.
  3. To verify checksums and detect corruption.

Real-world applications

  • Multi-tenant storage isolation on servers.
  • Safe testing with datasets per service.

Where you’ll apply it

Later projects on boot environments, update runbooks, and jails all build on the dataset layout and property discipline practiced here.

References

  • “Absolute FreeBSD, 3rd Edition” (Ch. 12)
  • FreeBSD Handbook: ZFS

Key insights ZFS datasets let you slice storage into recoverable, controllable units.

Summary Understanding pools, datasets, and properties is the foundation of safe ZFS use.

Homework/Exercises to practice the concept

  1. Create datasets for /home and /var.
  2. Change a property on a child dataset and observe inheritance.
  3. Run a scrub and check status.

Solutions to the homework/exercises

  1. zfs create zroot/home and zfs create zroot/var.
  2. zfs set compression=off zroot/var overrides parent.
  3. zpool scrub zroot then zpool status shows completion.

Concept 2: Snapshots, Rollback, and Space Management

Fundamentals ZFS snapshots are point-in-time copies that are space efficient because they use copy-on-write. A snapshot preserves the state of a dataset at a moment and can be used for rollback or cloning. Snapshots are not backups; they live in the same pool and can be lost if the pool is lost. Quotas and reservations control space usage, and snapshots can consume space if not managed. For this project, you must create snapshots, roll back safely, and observe how snapshots affect space usage.

In practice, write a short checklist for ZFS snapshots and rollback and confirm it after each reboot. This keeps the concept concrete and prevents accidental drift between sessions.

Deep Dive into the concept Snapshots in ZFS capture a consistent view of a dataset without copying all data. When you create a snapshot, ZFS freezes the current block references. As new data is written, ZFS writes new blocks and leaves the old blocks referenced by the snapshot. This is why snapshots are efficient: they only consume space for changed blocks. However, if you keep many snapshots or make large changes, they can accumulate and consume significant space. This is why snapshot management is part of operational hygiene.
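To see this accounting in action, create data before a snapshot and then delete it; the file path below is hypothetical:

dd if=/dev/urandom of=/home/blob bs=1M count=64   # write some data first
zfs snapshot zroot/home@t0                        # snapshot now references those blocks
rm /home/blob                                     # space is not freed...
zfs list -t snapshot                              # ...the snapshot's USED grows instead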

Rollback is a powerful but dangerous operation. Rolling back a dataset returns it to the exact state of the snapshot, discarding any later changes. This is perfect for recovery drills, but you must understand that it is destructive. In practice, you might clone a snapshot to inspect or recover files without destroying current state. ZFS snapshots can also be used for boot environments, which provide safe system upgrades. In that context, rollback is equivalent to booting into an older environment.
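A clone is the non-destructive alternative. This sketch uses the lab's snapshot name; the clone name and mountpoint are hypothetical:

zfs clone -o mountpoint=/mnt/recover zroot/home@before-test zroot/recover
cp /mnt/recover/test.txt /home/        # recover just the file you need
zfs destroy zroot/recover              # discard the clone when finished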

Space management is tied to snapshots because snapshots “hold” blocks that would otherwise be freed. If you delete files but snapshots reference the old blocks, the pool space does not recover. This can be confusing to new users. The correct mental model is: “space is freed only when no snapshots reference the blocks.” This is why zfs list -t snapshot and the usedby* properties are useful; they show how much space snapshots are holding.
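For example:

zfs get usedbysnapshots,usedbydataset zroot/home   # live data vs snapshot-held space
zfs list -r -t snapshot -o name,used zroot/home    # per-snapshot space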

Quotas and reservations provide another dimension of control. A quota limits how large a dataset can grow. A reservation guarantees space for a dataset even if the pool is otherwise full. In a lab, you can use quotas to prevent runaway datasets from filling the pool. This is especially important when experimenting with jails or logs. Proper quotas make your system more resilient and predictable.
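A quick, disposable way to prove enforcement; the 1G limit and file name are arbitrary:

zfs set quota=1G zroot/home
dd if=/dev/zero of=/home/fill bs=1M count=2048   # expect a quota-exceeded write error
rm /home/fill                                    # clean up
zfs set quota=none zroot/home                    # remove the limit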

Ultimately, snapshots are an operational tool. They are not backups, but they are the fastest way to recover from mistakes. The habit to build is: snapshot before risky changes, verify afterwards, and prune old snapshots. This mindset supports later projects like boot environments and update runbooks.

Operationally, snapshots and rollback reward the same contract described for Concept 1: write down the files and commands that own and reveal the state, then verify at the same three points (immediately after the change, after a reboot, and after a deliberate disturbance). Keep before-and-after command output so you can tie each effect to the change that caused it.

At scale, the failure-containment pattern repeats as well: define invariants, define the rollback path (here, the snapshot itself), and proceed only when you can trigger that rollback quickly. Test the failure path while the system is healthy; a controlled rollback teaches more than an emergency ever will.

How this fits into the project series

The snapshot-before-change habit built here is the foundation for later work with boot environments and update runbooks.

Definitions & key terms

  • snapshot -> Point-in-time dataset state.
  • rollback -> Revert dataset to a snapshot.
  • clone -> Writable copy of a snapshot.
  • usedby* -> Family of space-accounting properties (usedbysnapshots, usedbydataset, ...).

Mental model diagram

Dataset -> Snapshot -> Rollback

How it works (step-by-step, with invariants and failure modes)

  1. Create a snapshot with zfs snapshot.
  2. Make changes to files.
  3. Roll back with zfs rollback if needed.
  4. Destroy snapshots to free space.

Invariants:

  • Snapshot names are dataset@snap.
  • Rollback discards newer changes.

Failure modes:

  • Pool full due to many snapshots.
  • Rollback unintentionally deletes new data.

Minimal concrete example

zfs snapshot zroot/home@before-test
zfs rollback zroot/home@before-test

Common misconceptions

  • “Snapshots are backups.” -> They are not; they live in the same pool.
  • “Deleting files frees space immediately.” -> Snapshots can keep blocks alive.

Check-your-understanding questions

  1. Why are snapshots space efficient?
  2. What happens to changes after a rollback?
  3. How do you see snapshot space usage?

Check-your-understanding answers

  1. Only changed blocks consume extra space.
  2. They are discarded permanently.
  3. zfs list -t snapshot or zfs get usedby*.

Real-world applications

  • Safe configuration experiments.
  • Rapid recovery from bad updates.

Where you’ll apply it

You will snapshot before every risky change in later projects, from system upgrades (boot environments) to jail experiments.

References

  • “Absolute FreeBSD, 3rd Edition” (Ch. 12)
  • FreeBSD Handbook: ZFS snapshots

Key insights Snapshots are safety nets, but they require space discipline.

Summary Use snapshots before risky changes, and prune them to recover space.

Homework/Exercises to practice the concept

  1. Create a snapshot, delete a file, roll back.
  2. Create multiple snapshots and observe space usage.
  3. Destroy an old snapshot and verify space recovery.

Solutions to the homework/exercises

  1. The deleted file returns after rollback.
  2. zfs list -t snapshot shows usage growth.
  3. Space usage drops after destroying snapshot.
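The three exercises map to commands like the following, assuming the lab's zroot/home dataset and a throwaway file:

echo data > /home/probe.txt
zfs snapshot zroot/home@ex1        # exercise 1
rm /home/probe.txt
zfs rollback zroot/home@ex1        # probe.txt is back
zfs snapshot zroot/home@ex2        # exercise 2: more snapshots, watch USED
zfs list -t snapshot
zfs destroy zroot/home@ex1         # exercise 3: compare USED before and after
zfs list -t snapshot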

3. Project Specification

3.1 What You Will Build

A ZFS pool with multiple datasets, configured properties, snapshots, and quotas. You will demonstrate rollback by restoring deleted files.

3.2 Functional Requirements

  1. Pool created and healthy (zpool status).
  2. Datasets created with documented mountpoints.
  3. Snapshots taken and listed.
  4. Rollback performed to recover data.
  5. Quotas applied and verified.

3.3 Non-Functional Requirements

  • Performance: Snapshot and rollback complete in under 5 seconds.
  • Reliability: Pool remains online after scrub.
  • Usability: Dataset layout documented in notes.

3.4 Example Usage / Output

$ zfs list
NAME         USED  AVAIL  MOUNTPOINT
zroot          2G    18G  /
zroot/home   200M    18G  /home

$ zfs list -t snapshot
NAME                     USED
zroot/home@before-test     0B

3.5 Data Formats / Schemas / Protocols

  • Dataset plan
    /home -> zroot/home
    /var -> zroot/var
    
  • Snapshot naming
    zroot/home@before-test
    

3.6 Edge Cases

  • Pool full due to snapshots.
  • Rolling back the wrong dataset.
  • Quotas too strict causing write failures.

3.7 Real World Outcome

A ZFS layout that protects your data from mistakes and makes recovery fast.

3.7.1 How to Run (Copy/Paste)

zfs create zroot/home
mkdir -p /home/admin
echo "hello" > /home/admin/test.txt
zfs snapshot zroot/home@before-test
rm /home/admin/test.txt
zfs rollback zroot/home@before-test

3.7.2 Golden Path Demo (Deterministic)

  • Use fixed dataset names and snapshot labels.
  • Record outputs without timestamps.

3.7.3 If CLI: provide an exact terminal transcript

$ zfs snapshot zroot/home@before-test

$ rm /home/admin/test.txt

$ zfs rollback zroot/home@before-test

$ echo $?
0

Failure demo (deterministic)

$ zfs rollback zroot/home@does-not-exist
cannot open 'zroot/home@does-not-exist': snapshot does not exist

$ echo $?
1

4. Solution Architecture

4.1 High-Level Design

zpool -> datasets -> snapshots -> rollback

4.2 Key Components

Component   Responsibility      Key Decisions
Pool        Storage container   Single disk vs mirror
Datasets    Data separation     /home, /var, /usr/local
Snapshots   Recovery points     Naming scheme

4.3 Data Structures (No Full Code)

ZfsLayout
- pool: "zroot"
- datasets: ["zroot/home", "zroot/var"]
- snapshot_policy: "before-change"

4.4 Algorithm Overview

Key Algorithm: Safe Change with Snapshot (sketched after the steps below)

  1. Snapshot dataset.
  2. Perform change.
  3. Validate.
  4. Roll back if needed.
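A minimal sketch of this loop as a script; apply-change.sh and validate.sh are hypothetical stand-ins for whatever change you are making:

#!/bin/sh
DATASET="zroot/home"
SNAP="${DATASET}@pre-change"

zfs snapshot "$SNAP"                              # step 1: recovery point
if sh ./apply-change.sh && sh ./validate.sh; then # steps 2-3: change, then verify
    echo "validated; destroy $SNAP once you are confident"
else
    zfs rollback "$SNAP"                          # step 4: discard the failed change
    exit 1
fi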

Complexity Analysis:

  • Time: O(changed blocks)
  • Space: O(snapshot delta)

5. Implementation Guide

5.1 Development Environment Setup

# Ensure ZFS tools are installed (base system)
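ZFS ships with the FreeBSD base system, so nothing needs to be installed; these checks confirm it is active:

kldstat -m zfs     # kernel module is loaded
sysrc zfs_enable   # expect "zfs_enable: YES" on a ZFS-on-root system
zpool status       # pool is visible and ONLINE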

5.2 Project Structure

zfs-lab/
+-- dataset-plan.md
+-- snapshots.log
+-- rollback-notes.md

5.3 The Core Question You’re Answering

“Can I use ZFS to make storage safe and reversible?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. Pools, datasets, and properties
  2. Snapshots and rollback

5.5 Questions to Guide Your Design

  1. Which directories should be datasets?
  2. What snapshot naming scheme will you use?
  3. What quota limits are reasonable in a VM?

5.6 Thinking Exercise

Rollback Drill

Delete a file, then restore it from a snapshot.

5.7 The Interview Questions They’ll Ask

  1. What is a ZFS dataset?
  2. Why are snapshots not backups?
  3. What does a scrub do?
  4. How do you check pool health?
  5. What is a boot environment?

5.8 Hints in Layers

Hint 1: Start with /home Create a dataset for user data first.

Hint 2: Snapshot before changes Use a clear label like before-test.

Hint 3: Use zpool status Check pool health after each step.

Hint 4: Watch space usage Use zfs list -t snapshot.

5.9 Books That Will Help

Topic  Book                              Chapter
ZFS    “Absolute FreeBSD, 3rd Edition”   Ch. 12

5.10 Implementation Phases

Phase 1: Pool and Datasets (3-4 hours)

Goals: Create layout.

Tasks:

  1. Inspect pool with zpool status.
  2. Create datasets for /home and /var.

Checkpoint: zfs list shows datasets.

Phase 2: Snapshots (2-3 hours)

Goals: Practice snapshots and rollback.

Tasks:

  1. Create snapshot.
  2. Delete file and roll back.

Checkpoint: file restored.

Phase 3: Quotas (2-3 hours)

Goals: Apply quotas.

Tasks:

  1. Set quota on /home.
  2. Observe behavior when exceeded.

Checkpoint: quota enforced.

5.11 Key Implementation Decisions

Decision         Options            Recommendation  Rationale
Dataset layout   minimal, granular  moderate        Balance simplicity and control
Snapshot naming  timestamp, label   label + date    Human-readable
Quotas           none, per-dataset  per-dataset     Prevent runaway usage

6. Testing Strategy

6.1 Test Categories

Category        Purpose            Examples
Pool Tests      Health and import  zpool status
Snapshot Tests  Recovery           zfs rollback
Quota Tests     Limit enforcement  write until quota hit

6.2 Critical Test Cases

  1. Pool health: status reports ONLINE.
  2. Rollback: deleted file restored.
  3. Quota enforcement: writes fail when limit exceeded.
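A minimal sketch of these checks as a script, using the lab's dataset names and hypothetical file paths:

#!/bin/sh
set -e

# 1. Pool health
zpool status zroot | grep -q "state: ONLINE"

# 2. Rollback restores a deleted file
echo canary > /home/canary.txt
zfs snapshot zroot/home@test
rm /home/canary.txt
zfs rollback zroot/home@test
test -f /home/canary.txt

# 3. Quota enforcement (the write is expected to fail, so invert the test)
zfs set quota=100M zroot/home
! dd if=/dev/zero of=/home/fill bs=1M count=200 2>/dev/null
rm -f /home/fill
zfs set quota=none zroot/home

echo "all checks passed"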

6.3 Test Data

Dataset: zroot/home
Snapshot: zroot/home@before-test

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall                 Symptom         Solution
Snapshots fill pool     No space left   Destroy old snapshots
Wrong dataset rollback  Data loss       Verify dataset name
Quota too small         Write failures  Adjust quota

7.2 Debugging Strategies

  • Inspect space accounting: zfs get usedbysnapshots,usedbydataset zroot/home to see snapshot space.
  • Verify mountpoints: zfs get mountpoint.

7.3 Performance Traps

  • Over-snapshotting can degrade performance in small pools.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Create a dataset for /usr/local.
  • Enable compression on all datasets.

8.2 Intermediate Extensions

  • Create a snapshot schedule script (see the sketch after this list).
  • Clone a snapshot for recovery testing.
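One hedged starting point for the schedule script: a daily cron job with date-labeled snapshots. The script path is hypothetical, and pruning is left to your retention policy:

#!/bin/sh
# /usr/local/sbin/zfs-daily-snap.sh (hypothetical path)
# crontab entry (root): 0 2 * * * /usr/local/sbin/zfs-daily-snap.sh
zfs snapshot "zroot/home@daily-$(date +%Y-%m-%d)"
zfs list -t snapshot >> /root/snapshots.log   # hypothetical log, mirroring the lab's snapshots.log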

8.3 Advanced Extensions

  • Add a second disk and create a mirror.
  • Test send/receive to a second pool.

9. Real-World Connections

9.1 Industry Applications

  • Storage servers: datasets per tenant.
  • DevOps: snapshots before deployments.

9.2 Related Tools & Projects

  • OpenZFS: upstream ZFS project.
  • bectl: boot environment tool.

9.3 Interview Relevance

  • Explaining datasets vs snapshots.
  • Demonstrating recovery workflows.

10. Resources

10.1 Essential Reading

  • “Absolute FreeBSD, 3rd Edition” by Michael W. Lucas - Ch. 12
  • FreeBSD Handbook - ZFS

10.2 Video Resources

  • “ZFS Fundamentals” - community lectures

10.3 Tools & Documentation

  • zfs(8): ZFS filesystem management
  • zpool(8): ZFS pool management

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain pools vs datasets.
  • I can explain snapshots and rollback.
  • I understand quotas and reservations.

11.2 Implementation

  • Pool and datasets created.
  • Snapshot and rollback tested.
  • Quotas applied and verified.

11.3 Growth

  • I documented dataset layout choices.
  • I can recover data quickly.
  • I can teach ZFS basics to someone else.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • ZFS pool and datasets created.
  • Snapshot and rollback demonstrated.

Full Completion:

  • Quotas applied and verified.
  • Pool scrub run and status checked.

Excellence (Going Above & Beyond):

  • Mirror added or send/receive tested.
  • Snapshot schedule automated.