Project 10: Update and Recovery Runbook

Write and execute a step-by-step runbook for patching and recovery with rollback.

Quick Reference

Attribute	Value
Difficulty	Level 3
Time Estimate	2 weekends
Main Programming Language	Shell (sh)
Alternative Programming Languages	csh, Python
Coolness Level	Level 3
Business Potential	Level 3
Prerequisites	Projects 1-9 complete
Key Topics	Runbooks, freebsd-update, pkg, rollback

1. Learning Objectives

By completing this project, you will:

Build a deterministic update runbook with validation and rollback steps.
Apply the runbook to a VM and record outputs.
Handle a simulated failure and recover cleanly.
Produce a reusable operations checklist for future upgrades.

2. All Theory Needed (Per-Concept Breakdown)

Concept 1: Update Workflow Discipline and Change Control

Fundamentals

A runbook is a documented, repeatable procedure that reduces risk during maintenance. FreeBSD updates require a specific sequence: base system updates, reboot, validation, and then package updates. Each step should have a clear entry condition, expected outcome, and rollback trigger. Change control adds discipline: you plan a maintenance window, document pre-checks, and verify post-checks. For this project, you must be able to define a strict workflow and follow it without improvisation.

In practice, keep a short checklist of the runbook's checks and re-run it after each reboot. This keeps the concept concrete and prevents the documented state and the actual state from drifting apart between sessions.

Deep Dive into the concept

Operations reliability comes from repeatability. A runbook captures the exact commands, outputs, and decision points required to perform a change safely. In FreeBSD, the separation between base and packages makes this especially important. The base update affects the kernel and core libraries; packages depend on those libraries. This dependency chain means you must sequence updates correctly and validate at each step. A runbook formalizes the sequence so you are not relying on memory during a high-risk task.

Change control is not just bureaucracy. It is a set of safety checks designed to prevent surprises. Pre-checks ensure the system is healthy (disk space, ZFS status, service status). Execution steps are explicit and include safeguards such as creating a boot environment. Post-checks verify system health (version output, service status, logs). Rollback triggers are pre-defined: if a critical service fails or the system fails to boot, you switch to the previous BE. By writing these down, you reduce ambiguity and speed up recovery.
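The pre-checks described above can be sketched as a small sh function. This is a minimal sketch rather than a complete health check: the names capacity_ok and precheck and the 80% disk threshold are assumptions chosen for illustration.

```sh
#!/bin/sh
# Pre-check sketch. capacity_ok, precheck, and the 80% limit are
# illustrative assumptions, not part of the project specification.

# capacity_ok PERCENT LIMIT: succeed when PERCENT (for example "42%") is below LIMIT.
capacity_ok() {
    used=${1%\%}                 # strip the trailing percent sign
    [ "$used" -lt "$2" ]
}

# precheck: print health information; fail if the root filesystem is too full.
precheck() {
    echo "== versions ==";         freebsd-version -kru
    echo "== pool health ==";      zpool status -x
    echo "== enabled services =="; service -e
    echo "== disk space ==";       df -h
    # In POSIX (-P) df output, column 5 is the capacity percentage.
    root_used=$(df -P / | awk 'NR == 2 { print $5 }')
    capacity_ok "$root_used" 80 || { echo "FAIL: / is $root_used full"; return 1; }
}
```

Running precheck with its output redirected to precheck.log, before the boot environment is created, gives you the "before" half of the comparison.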

Determinism matters because you want to compare outcomes across runs. This is why the runbook should include fixed commands and expected outputs. If time-based data is needed, you set TZ=UTC to normalize timestamps. If randomness exists, you avoid it or explicitly note it. This makes your runbook a diagnostic tool as well as a procedure. When a step fails, you can compare the output to the expected output and immediately isolate the discrepancy.
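One way to make "compare the output to the expected output" concrete is a wrapper that runs each step under a fixed locale and time zone, logs it, and checks the output against a pattern. The function name run_step and its argument order are hypothetical; treat this as a sketch, not a finished tool.

```sh
#!/bin/sh
# run_step LOG EXPECTED COMMAND [ARGS...]
# Runs COMMAND with LC_ALL=C and TZ=UTC, appends the command, its output,
# and its exit status to LOG, then checks the output against the extended
# regular expression EXPECTED. (Hypothetical helper, for illustration.)
run_step() {
    log=$1; expected=$2; shift 2
    printf '$ %s\n' "$*" >> "$log"
    status=0
    out=$(LC_ALL=C TZ=UTC "$@" 2>&1) || status=$?
    printf '%s\n[exit %s]\n' "$out" "$status" >> "$log"
    if [ "$status" -ne 0 ]; then
        echo "STEP FAILED (exit $status): $*" >&2
        return 1
    fi
    if ! printf '%s\n' "$out" | grep -Eq -- "$expected"; then
        echo "UNEXPECTED OUTPUT from: $* (wanted /$expected/)" >&2
        return 1
    fi
}
```

For example, run_step postcheck.log 'RELEASE' freebsd-version -u fails loudly if the userland is not a RELEASE build, and the log keeps the evidence either way.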

In production, runbooks are often reviewed and approved before execution. In this lab, you simulate that by reading your runbook top to bottom before running it. You should be able to explain why each step exists. This creates the mental model that operations are systems, not ad-hoc tasks. The result is a workflow you can reuse in any FreeBSD environment.

Operationally, a runbook is easiest to keep stable when you treat it as a small contract between configuration, tooling, and observable outputs. Write down the exact files that own the state (for example /etc/rc.conf and /etc/freebsd-update.conf) and the commands that reveal the current truth (freebsd-version, bectl list, pkg audit). Then verify the contract at three points: immediately after you make the change, after a reboot, and after a deliberate disturbance such as restarting services or reloading modules. FreeBSD rewards this discipline because it rarely hides state; if something changes, it is usually in a file you control. Make a habit of collecting a before-and-after snapshot of commands and outputs so you can explain which change caused which effect.

At scale, update runbook discipline is also about failure containment. Identify what must remain available when something breaks and design a safe escape hatch. For example, keep console access for firewall changes, keep a previous boot environment for upgrades, or keep a dataset snapshot before risky edits. The same pattern applies across domains: define invariants, define the rollback path, and then only proceed when you can trigger that rollback quickly. Finally, test the failure path while the system is healthy; you learn more from a controlled rollback than from an emergency. This perspective turns the lab exercise into an operational capability you can trust on production systems.

How this fits on projects

This concept is used in Section 3.1 and Section 5.10.
It depends on P07 Boot Environments and P03 Package Workflow.

Definitions & key terms

Runbook -> Step-by-step operational procedure.
Pre-check -> Validation before changes.
Post-check -> Validation after changes.
Rollback trigger -> Condition that forces recovery.

Mental model diagram

Pre-check -> Change -> Validate -> Rollback if needed

How it works (step-by-step, with invariants and failure modes)

Capture system state and health checks.
Create rollback path (BE or snapshot).
Apply updates.
Validate system health.

Invariants:

Updates are not applied without rollback.
Validation is mandatory.

Failure modes:

Skipped validation -> hidden failures.
No rollback -> extended downtime.
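The first invariant, "updates are not applied without rollback", can be enforced rather than remembered. The sketch below assumes the boot environment is managed with bectl and relies on bectl list -H printing one tab-separated line per BE with the name in the first column; be_exists and guarded_update are hypothetical names.

```sh
#!/bin/sh
# be_exists NAME: succeed if a boot environment called NAME is listed.
be_exists() {
    bectl list -H |
        awk -F '\t' -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# guarded_update BE_NAME: refuse to touch the base system unless the
# rollback path already exists. (Hypothetical wrapper, for illustration.)
guarded_update() {
    be_exists "$1" || {
        echo "refusing to update: boot environment '$1' not found" >&2
        return 1
    }
    freebsd-update fetch install
}
```

Calling guarded_update preupdate instead of freebsd-update directly turns a skipped step into an immediate, visible failure.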

Minimal concrete example

bectl create preupdate
freebsd-update fetch install

Common misconceptions

“A runbook is just notes.” -> It must be executable and deterministic.
“Rollback is optional.” -> It is essential for safe changes.

Check-your-understanding questions

Why must validation be explicit?
What is the difference between pre-check and post-check?
Why do you document rollback triggers in advance?

Check-your-understanding answers

To detect failures before they become incidents.
Pre-checks ensure readiness; post-checks confirm success.
To avoid debate during an outage.

Real-world applications

Production patching with minimal downtime.
Compliance evidence for maintenance operations.

Where you’ll apply it

Section 3.7 Real World Outcome
Section 5.10 Phase 2
Also used in: P07 Boot Environments

References

“Absolute FreeBSD, 3rd Edition” (Ch. 18)
FreeBSD Handbook: Updating FreeBSD

Key insights

A runbook is a safety system, not just a checklist.

Summary

Operational success comes from a disciplined, repeatable update process.

Homework/Exercises to practice the concept

Write a pre-check list for disk space and services.
Define three rollback triggers.
Compare two update runs and note differences.

Solutions to the homework/exercises

Use df -h, zpool status, service -e.
SSH failure, kernel panic, critical service down.
Differences indicate changes in system state.

Concept 2: Security Advisories and Patch Planning

Fundamentals

FreeBSD publishes security advisories and errata that inform when patches are needed. Understanding how to read advisories and map them to your system is part of responsible maintenance. A patch plan includes identifying affected components, scheduling a maintenance window, and verifying that updates apply cleanly. For this project, you must identify where advisories come from and how they influence your update schedule.

In practice, keep a short checklist for advisory-driven patch planning and revisit it after each patch cycle. Rehearse the steps on a disposable VM so you can recognize normal outputs and failure signals quickly.

Deep Dive into the concept

Security advisories describe vulnerabilities in the FreeBSD base system or packages. The FreeBSD Security Team issues advisories with identifiers and severity information, and they often include affected versions. In practice, you check advisories to determine whether your current release is impacted. If an advisory affects your system, you schedule a patch window and follow your runbook.

Patch planning is more than “run updates.” It includes impact analysis: which services rely on affected components, what downtime is expected, and what tests must be run afterward. In a lab, you simulate this by identifying a change window and documenting expected outcomes. You also track the version before and after the update so you can prove the change took effect.
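Tracking the version before and after can be as simple as appending a labelled snapshot to a log. The sketch below uses the -k, -r, and -u options of freebsd-version (installed kernel, running kernel, userland); the function names are hypothetical. When the installed and running kernels differ, a reboot is still pending.

```sh
#!/bin/sh
# record_versions FILE: append a labelled snapshot of base versions to FILE.
record_versions() {
    {
        echo "installed kernel: $(freebsd-version -k)"
        echo "running kernel:   $(freebsd-version -r)"
        echo "userland:         $(freebsd-version -u)"
    } >> "$1"
}

# reboot_pending INSTALLED RUNNING: succeed when the two kernel version
# strings differ, meaning the new kernel is on disk but not yet booted.
reboot_pending() {
    [ "$1" != "$2" ]
}
```

A typical check after the install step would be reboot_pending "$(freebsd-version -k)" "$(freebsd-version -r)" && echo "reboot required".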

There are two layers to patching: base system and packages. Advisories apply to both, but the mechanisms differ. Base advisories are handled with freebsd-update, while package advisories are handled with pkg audit and upgrades. This is why a runbook must include both steps. In a real environment, you may also need to coordinate with application owners or schedule longer maintenance windows for critical services.

The discipline of reading advisories trains you to treat updates as risk management, not just routine chores. This is why this project ends the sequence: it pulls together installation, service management, storage, and rollback into a professional operational workflow.

Operationally, patch planning works best as a small contract between the advisory, your tooling, and observable outputs. Write down which advisory you are addressing, the version string that contains the fix, and the commands that reveal the current state (freebsd-version, pkg audit). Verify the contract at three points: immediately after applying the update, after the reboot, and after restarting the affected services. Collect before-and-after output so you can show which update produced which change.

At scale, patch planning is also about failure containment. Decide what must remain available if a patch misbehaves, keep a previous boot environment as the escape hatch, and proceed only when you can trigger that rollback quickly. Test the failure path while the system is healthy; you learn more from a controlled rollback than from an emergency.

Also, document the specific signals you will treat as success or failure. For a patch, success typically means a version change (freebsd-version reports the corrected patch level), a clean pkg audit, and a passing health check for each critical service. Writing down these signals forces you to define what “working” actually means and prevents you from moving forward on assumptions.

How this fits on projects

This concept is used in Section 3.2 and Section 5.10.
It builds on P03 Package Workflow and P07 Boot Environments.

Definitions & key terms

Security advisory -> Official notice of a vulnerability.
Errata -> Update notice for non-security issues.
Patch window -> Scheduled maintenance period.

Mental model diagram

Advisory -> Impact analysis -> Patch window -> Runbook execution

How it works (step-by-step, with invariants and failure modes)

Read advisory and confirm affected version.
Plan patch window and rollback.
Apply base and package updates.
Validate and document results.

Invariants:

Advisories must be mapped to actual system version.
Patch plan includes rollback.

Failure modes:

Applying updates without understanding scope.
Skipping package audits after base updates.
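Mapping an advisory to the actual system version is mostly string comparison. Advisories name the corrected patch level for each supported release (for example 14.1-RELEASE-p3), and the sketch below compares that against a version string such as the output of freebsd-version -u. The helper names are hypothetical, and the comparison only handles versions on the same release.

```sh
#!/bin/sh
# patch_level VERSION: print the numeric patch level.
# "14.1-RELEASE-p5" prints 5; "14.1-RELEASE" prints 0.
patch_level() {
    case $1 in
        *-p[0-9]*) echo "${1##*-p}" ;;
        *)         echo 0 ;;
    esac
}

# is_patched CURRENT FIXED: succeed when CURRENT is on the same release as
# FIXED and at or above its patch level. Returns 2 when the releases differ,
# because that case needs a human to read the advisory.
is_patched() {
    [ "${1%-p*}" = "${2%-p*}" ] || {
        echo "different release: compare against the advisory manually" >&2
        return 2
    }
    [ "$(patch_level "$1")" -ge "$(patch_level "$2")" ]
}
```

Usage would look like is_patched "$(freebsd-version -u)" 14.1-RELEASE-p3. Check the kernel version separately when the advisory affects the kernel, since the kernel and userland patch levels can differ.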

Minimal concrete example

pkg audit
freebsd-version

Common misconceptions

“Advisories are optional.” -> They guide safe patching.
“Only base system matters.” -> Packages can have vulnerabilities too.

Check-your-understanding questions

What tool checks package vulnerabilities?
Why do you record version numbers before and after?
What is a patch window?

Check-your-understanding answers

pkg audit.
To verify the update actually applied.
Scheduled time for safe maintenance.

Real-world applications

Compliance-driven patching.
Incident response to newly disclosed vulnerabilities.

Where you’ll apply it

Section 3.2 Functional Requirements
Section 5.10 Phase 1
Also used in: P03 Package Workflow

References

FreeBSD Security Advisories and Errata (official site)
“Mastering FreeBSD and OpenBSD Security” (Ch. 3-4)

Key insights

Patch planning starts with understanding advisories, not just running updates.

Summary

Security advisories drive maintenance schedules and define what “safe” means.

Homework/Exercises to practice the concept

Find a recent advisory and summarize its impact.
Write a patch plan with a rollback trigger.
Run pkg audit and document output.

Solutions to the homework/exercises

Advisory summary should list affected versions and fixes.
Patch plan includes window, steps, and rollback.
Audit output lists vulnerable packages.

3. Project Specification

3.1 What You Will Build

A complete update and recovery runbook that includes pre-checks, update steps, validation, and rollback procedures, along with a recorded execution on your VM.

3.2 Functional Requirements

Runbook written with explicit steps.
Pre-checks documented (disk, services, ZFS).
Base update performed with freebsd-update.
Package update performed with pkg.
Rollback tested or simulated.

3.3 Non-Functional Requirements

Performance: Runbook fits within a defined maintenance window.
Reliability: Steps are deterministic and repeatable.
Usability: Another person can follow the runbook without guesswork.

3.4 Example Usage / Output

$ freebsd-version
14.1-RELEASE

$ pkg audit
0 problem(s) in the installed packages found.

3.5 Data Formats / Schemas / Protocols

Runbook format

1) Pre-checks
2) Create BE
3) Update base
4) Reboot and validate
5) Update packages
6) Rollback if needed

3.6 Edge Cases

Update requires multiple reboots.
Network failure during fetch.
Package update fails due to ABI mismatch.

3.7 Real World Outcome

You can follow a written procedure to update and recover a FreeBSD system safely.

3.7.1 How to Run (Copy/Paste)

bectl create preupdate
freebsd-update fetch install
reboot
pkg upgrade

3.7.2 Golden Path Demo (Deterministic)

Use fixed BE name preupdate.
Use TZ=UTC for consistent logs.

3.7.3 Exact Terminal Transcript

$ TZ=UTC freebsd-version
14.1-RELEASE

$ TZ=UTC pkg audit
0 problem(s) in the installed packages found.

$ echo $?
0

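Failure demo (deterministic)

Disconnect the VM’s network before running the fetch step; freebsd-update install works from already-downloaded files and does not contact the server. The message below is illustrative, and the exact wording depends on the freebsd-update version.

$ TZ=UTC freebsd-update fetch
freebsd-update: Cannot fetch updates; server unreachable.

$ echo $?
1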

4. Solution Architecture

4.1 High-Level Design

Pre-checks -> BE -> Base update -> Reboot -> Validate -> Packages -> Rollback

4.2 Key Components

Component	Responsibility	Key Decisions
Runbook	Step-by-step procedure	Deterministic steps
BE	Rollback safety	Naming scheme
Validation checks	Proof of success	Services + versions

4.3 Data Structures (No Full Code)

RunbookStep
- id: 3
- description: "Update base system"
- command: "freebsd-update fetch install"
- expected: "no errors"
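In sh, a RunbookStep can be approximated as one delimited line per step in a plain text file. The file format, the file name steps.txt, and the helper names below are all assumptions made for illustration; a command that itself contains a | character does not fit this simple format.

```sh
#!/bin/sh
# Hypothetical steps file, one step per line: id|description|command|expected
# Lines starting with # are comments, for example:
#   3|Update base system|freebsd-update fetch install|no errors

# step_field FILE ID N: print field N of step ID
# (1 = id, 2 = description, 3 = command, 4 = expected).
step_field() {
    awk -F '|' -v id="$2" -v n="$3" \
        '!/^#/ && $1 == id { print $n; found = 1 } END { exit !found }' "$1"
}

# list_steps FILE: print "id: description" for every step, in file order.
list_steps() {
    awk -F '|' '!/^#/ && NF >= 4 { print $1 ": " $2 }' "$1"
}
```

Reading the runbook top to bottom then becomes list_steps steps.txt, and step_field steps.txt 3 3 prints the command for step 3.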

4.4 Algorithm Overview

Key Algorithm: Update Runbook Execution

Capture pre-check state.
Create rollback path.
Apply updates and reboot.
Validate system.
Roll back if any critical test fails.

Complexity Analysis:

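Time: dominated by download, install, and reboot time; it scales with the size of the update, not the length of the runbook.
Space: a new BE is a ZFS clone, so it consumes little at creation and grows as the updated system diverges from it.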
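The validate-then-roll-back part of the algorithm can be sketched as follows, to be run after the post-update reboot. CRITICAL_SERVICES, validate, rollback, and post_update are hypothetical names; the sketch treats any critical service that is not running as a rollback trigger and leaves the reboot itself to the operator.

```sh
#!/bin/sh
# Space-separated list of services that must be running (assumed default).
CRITICAL_SERVICES=${CRITICAL_SERVICES:-sshd}

# validate: succeed only if every critical service reports as running.
validate() {
    for svc in $CRITICAL_SERVICES; do
        service "$svc" status >/dev/null 2>&1 || {
            echo "rollback trigger: $svc is not running" >&2
            return 1
        }
    done
}

# rollback BE: make BE the default for the next boot.
rollback() {
    bectl activate "$1" && echo "activated $1; reboot to complete the rollback"
}

# post_update BE: validate the updated system, staging a rollback to BE
# if any critical check fails.
post_update() {
    if validate; then
        echo "validation passed"
    else
        rollback "$1"
        return 1
    fi
}
```

After the reboot you would run post_update preupdate. If the system does not boot far enough to run anything, the previous boot environment has to be selected from the console instead.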

5. Implementation Guide

5.1 Development Environment Setup

# Ensure you have console access in case of lockout

5.2 Project Structure

update-runbook/
+-- runbook.md
+-- precheck.log
+-- postcheck.log
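The layout above can be created with a few lines of sh. The function name scaffold is hypothetical; the sketch creates empty files and leaves existing ones untouched, so it is safe to re-run.

```sh
#!/bin/sh
# scaffold DIR: create the project layout without overwriting existing files.
scaffold() {
    mkdir -p "$1" || return 1
    for f in runbook.md precheck.log postcheck.log; do
        [ -e "$1/$f" ] || : > "$1/$f"
    done
}
```

For example, scaffold update-runbook creates the directory in the current working directory.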

5.3 The Core Question You’re Answering

“Can I update a FreeBSD system without fear?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

Update workflow discipline
Security advisories and patch planning

5.5 Questions to Guide Your Design

What is your maintenance window length?
Which services are critical to validate?
What is your rollback trigger list?

5.6 Thinking Exercise

Failure Scenario

Assume an update breaks SSH. How do you recover?

5.7 The Interview Questions They’ll Ask

What is freebsd-update used for?
Why upgrade packages after the base system?
How do you validate an upgrade?
What is your rollback plan?
Where do you find security advisories?

5.8 Hints in Layers

Hint 1: Write the checklist first. Don’t update without a plan.

Hint 2: Use BEs. Rollback should be easy.

Hint 3: Record outputs. Keep before/after version logs.

Hint 4: Simulate failure. Practice recovery before production.

5.9 Books That Will Help

Topic	Book	Chapter
Upgrades	“Absolute FreeBSD, 3rd Edition”	Ch. 18
Security	“Mastering FreeBSD and OpenBSD Security”	Ch. 3-4

5.10 Implementation Phases

Phase 1: Runbook Draft (2-3 hours)

Goals: Write a complete runbook.

Tasks:

Document pre-checks and commands.
Define rollback triggers.

Checkpoint: runbook has no missing steps.

Phase 2: Execution (3-4 hours)

Goals: Run the update in a VM.

Tasks:

Execute runbook steps.
Record outputs.

Checkpoint: post-checks match expected state.

Phase 3: Recovery Drill (2-3 hours)

Goals: Test rollback.

Tasks:

Simulate a failure.
Execute rollback steps.

Checkpoint: system returns to stable state.
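During the recovery drill it helps to know where the rollback stands at any moment. The sketch below reads the second column of bectl list -H, where N marks the boot environment that is active now and R the one that is active on the next reboot. The function names are hypothetical.

```sh
#!/bin/sh
# active_now: print the BE that is currently booted (flag N).
active_now()  { bectl list -H | awk -F '\t' '$2 ~ /N/ { print $1 }'; }

# active_next: print the BE that will be used on the next boot (flag R).
active_next() { bectl list -H | awk -F '\t' '$2 ~ /R/ { print $1 }'; }

# drill_status BE: report how far the rollback to BE has progressed.
drill_status() {
    if [ "$(active_now)" = "$1" ]; then
        echo "rollback complete: running $1"
    elif [ "$(active_next)" = "$1" ]; then
        echo "rollback staged: reboot to boot $1"
    else
        echo "rollback not started: run 'bectl activate $1'"
        return 1
    fi
}
```

Record the output of drill_status preupdate before activation, after activation, and after the reboot; the three lines together document the drill.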

5.11 Key Implementation Decisions

Decision	Options	Recommendation	Rationale
BE naming	preupdate, timestamp	preupdate	Clear intent
Validation scope	minimal, full	minimal + critical services	Balanced effort
Patch cadence	monthly, quarterly	quarterly	Stable for lab

6. Testing Strategy

6.1 Test Categories

Category	Purpose	Examples
Pre-check Tests	System readiness	zpool status, df -h
Validation Tests	Post-update health	freebsd-version
Rollback Tests	Recovery	bectl activate

6.2 Critical Test Cases

Pre-check success: system healthy before update.
Post-check success: version and services ok.
Rollback success: prior BE boots and services start.

6.3 Test Data

BE name: preupdate
Critical services: sshd

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall	Symptom	Solution
No rollback path	Downtime	Always create BE
Skipping package upgrades	Service failures	Run pkg upgrade
No post-checks	Hidden issues	Validate explicitly

7.2 Debugging Strategies

Compare before/after versions: identify what changed.
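Read logs: check /var/log/messages for package activity and service errors after the reboot, and keep the terminal output of freebsd-update in your run log, since that is where its errors appear.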
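Comparing before and after is a job for diff. The sketch below assumes the pre-check and post-check captures were written to two files, as in the project structure; compare_checks is a hypothetical name. It treats identical files as suspicious, because a successful update should change at least the version line.

```sh
#!/bin/sh
# compare_checks PRE POST: print a unified diff of the two captures.
# Succeeds when they differ, fails when they are identical or unreadable.
compare_checks() {
    rc=0
    diff -u "$1" "$2" || rc=$?
    case $rc in
        0) echo "no differences: did the update apply?" >&2; return 1 ;;
        1) return 0 ;;   # differences were printed above
        *) echo "diff could not compare the files" >&2; return 2 ;;
    esac
}
```

Typical usage is compare_checks precheck.log postcheck.log, with the resulting diff pasted into the runbook record.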

7.3 Performance Traps

Long updates without planning can exceed maintenance window.

8. Extensions & Challenges

8.1 Beginner Extensions

Add a quick smoke-test script to the runbook.
Include a backup reminder step.

8.2 Intermediate Extensions

Automate runbook execution with a script.
Add service-level health checks.

8.3 Advanced Extensions

Integrate monitoring alerts into rollback triggers.
Create a runbook template for other systems.

9. Real-World Connections

9.1 Industry Applications

Operations teams: standard patching runbooks.
Compliance: audit trails for system maintenance.

9.2 Related Tools

freebsd-update: base update tool.
pkg: package management tool.

9.3 Interview Relevance

Ability to explain safe maintenance workflows.
Understanding of rollback strategies.

10. Resources

10.1 Essential Reading

“Absolute FreeBSD, 3rd Edition” by Michael W. Lucas - Ch. 18
“Mastering FreeBSD and OpenBSD Security” - Ch. 3-4

10.2 Video Resources

“FreeBSD Update Best Practices” - community talks

10.3 Tools & Documentation

freebsd-update(8): base updates
pkg(8): package updates

10.4 Related Projects

P07 Boot Environments - rollback safety.
P03 Package Workflow - update hygiene.

11. Self-Assessment Checklist

11.1 Understanding

I can explain the update sequence.
I can define rollback triggers.
I understand how advisories drive patches.

11.2 Implementation

Runbook written and executed.
Outputs recorded for pre/post checks.
Rollback tested or simulated.

11.3 Growth

I can reuse this runbook for future upgrades.
I can explain the workflow in interviews.
I can teach safe update practices.

12. Submission / Completion Criteria

Minimum Viable Completion:

Runbook written with pre/post checks.
Base and package updates executed in VM.

Full Completion:

Rollback tested successfully.
Update outputs recorded and reviewed.

Excellence (Going Above & Beyond):

Automated runbook script.
Template created for additional systems.