Project 1: Ad-Hoc Fleet Baseline Audit

Build a repeatable, low-risk baseline visibility workflow for your host fleet before you automate writes.

Quick Reference

Attribute                          Value
Difficulty                         Level 1
Time Estimate                      4-6 hours
Main Programming Language          Ansible CLI
Alternative Programming Languages  Shell
Coolness Level                     Level 2
Business Potential                 1. Resume Gold
Prerequisites                      SSH basics, inventory basics, Linux service checks
Key Topics                         Inventory targeting, ad-hoc commands, baseline evidence

1. Learning Objectives

By completing this project, you will:

  1. Build and validate a static inventory for a multi-host lab.
  2. Run deterministic ad-hoc checks and collect comparable output.
  3. Distinguish connection failures from command failures.
  4. Produce a baseline report that can be compared before/after later projects.

2. All Theory Needed (Per-Concept Breakdown)

2.1 Inventory as a Trust Boundary

Fundamentals

Inventory is not a passive list. It defines the blast radius of every operation: a single grouping mistake can route commands to unintended systems. Good inventory design uses explicit groups, clear host aliases, and predictable variable attachment points.

Deep Dive into the concept

Treat inventory as production control data: review inventory changes like code, track ownership of host groups, and validate the resolved inventory before any write operation. Ad-hoc audits suit this well because they run read-only checks and exercise your group definitions under real connectivity conditions.

How this fits into the projects

Used directly in P01 and reused in P02-P07.

Definitions & key terms

  • Host pattern: selection expression (all, group, host).
  • Group: logical host set.
  • Alias: inventory name mapped to connection endpoint.

Mental model diagram

Inventory file -> host groups -> target pattern -> selected nodes -> command execution

How it works

  1. Load inventory.
  2. Resolve host pattern.
  3. Connect via SSH.
  4. Run module.
  5. Return per-host results.

Minimal concrete example

# inventory.ini
[linux_nodes]
node-a ansible_host=10.10.1.11
node-b ansible_host=10.10.1.12
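To make the alias-to-endpoint mapping and pattern resolution concrete, here is a minimal Python sketch. It is not Ansible's parser: the function names `parse_inventory` and `resolve` are hypothetical, and the sketch assumes a plain INI inventory with no `:children` or `:vars` sections and no host ranges.

```python
def parse_inventory(text):
    """Return {group: {alias: {var: value}}} for a very simple INI inventory."""
    groups = {}
    current = "ungrouped"  # hosts listed before any [section] land here
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups.setdefault(current, {})
            continue
        alias, *pairs = line.split()
        host_vars = dict(pair.split("=", 1) for pair in pairs if "=" in pair)
        groups.setdefault(current, {})[alias] = host_vars
    return groups


def resolve(groups, pattern):
    """Resolve a pattern that is 'all', a group name, or a single alias."""
    if pattern == "all":
        return sorted({alias for hosts in groups.values() for alias in hosts})
    if pattern in groups:
        return sorted(groups[pattern])
    known = any(pattern in hosts for hosts in groups.values())
    return [pattern] if known else []


SAMPLE = """\
[linux_nodes]
node-a ansible_host=10.10.1.11
node-b ansible_host=10.10.1.12
"""

groups = parse_inventory(SAMPLE)
print(resolve(groups, "linux_nodes"))
print(groups["linux_nodes"]["node-a"]["ansible_host"])
```

An unknown pattern resolves to an empty list here. For real inventories, check the resolved host list with Ansible's own tooling before running commands.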

Common misconceptions

  • “Inventory can be fixed later” -> wrong; every command you run today already uses it to decide which hosts are touched.

Check-your-understanding questions

  1. Why is inventory review required before playbook writes?

Check-your-understanding answers

  1. It sets which hosts can be changed.

Real-world applications

  • Fleet readiness and pre-maintenance checks.

Where you’ll apply it

P01, P02, P05.

References

  • Ansible inventory docs.

Key insights

Inventory correctness is the first operational safety gate.

Summary

Targeting errors are more dangerous than syntax errors: a syntax error stops the run, while a targeting error runs successfully against the wrong hosts.

Homework/Exercises

  • Build two groups and verify output with ansible-inventory --graph.

Solutions

  • Keep names explicit and verify group membership before command runs.

3. Project Specification

3.1 What You Will Build

A baseline audit command set that checks:

  • host reachability
  • uptime
  • OS family
  • active state of one critical service

Included: deterministic command transcript and recap summary.
Excluded: configuration mutation.

3.2 Functional Requirements

  1. Inventory with at least 3 hosts in one group.
  2. Ping success for all hosts.
  3. At least three read-only checks collected in one report.
  4. Failure output preserved for unreachable hosts.

3.3 Non-Functional Requirements

  • Reliability: repeated runs should produce a stable schema (the same fields in the same order).
  • Usability: output should be easy to compare across runs.
  • Security: no plaintext credentials in inventory.
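The security requirement can be checked mechanically before an inventory is committed. The sketch below is one possible approach, not a complete secret scanner: it flags inline assignments of a few well-known Ansible password variables for manual review. The function name `find_plaintext_secrets` and the sample password are hypothetical.

```python
import re

# Connection and privilege-escalation password variables that should not
# appear with inline values in a reviewed inventory file.
SECRET_VARS = re.compile(
    r"\b(ansible_(?:ssh_pass|password|become_pass(?:word)?))\s*=",
    re.IGNORECASE,
)


def find_plaintext_secrets(inventory_text):
    """Return (line_number, variable_name) for each suspicious assignment."""
    hits = []
    for number, line in enumerate(inventory_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith(("#", ";")):
            continue  # ignore comment lines
        for match in SECRET_VARS.finditer(stripped):
            hits.append((number, match.group(1)))
    return hits


SAMPLE = """\
[linux_nodes]
node-a ansible_host=10.10.1.11 ansible_password=example-only
node-b ansible_host=10.10.1.12
"""

print(find_plaintext_secrets(SAMPLE))
```

The scan flags any inline assignment, including templated references, so treat a hit as a prompt for review, not proof of a leak.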

3.4 Example Usage / Output

$ ansible linux_nodes -i inventory.ini -m ping
node-a | SUCCESS => {"changed": false, "ping": "pong"}
node-b | SUCCESS => {"changed": false, "ping": "pong"}
node-c | SUCCESS => {"changed": false, "ping": "pong"}

3.5 Data Formats / Schemas / Protocols

  • Input: INI inventory format.
  • Output: line-delimited per-host execution result.

3.6 Edge Cases

  • One unreachable host should not hide success on others.
  • One command failure should be reported with return code.
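Both edge cases depend on telling a connection failure apart from a command failure. The sketch below classifies per-host header lines. It assumes headers shaped like `host | STATUS | rc=N >>` or `host | STATUS! => {`; the `UNREACHABLE!` and `FAILED` shapes are assumptions about typical ad-hoc output, so verify them against your Ansible version. The name `classify` and the sample lines are illustrative.

```python
import re

# host | STATUS[!] [| rc=N] ...
HEADER = re.compile(
    r"^(?P<host>\S+) \| (?P<status>[A-Z]+)!?(?: \| rc=(?P<rc>-?\d+))?"
)


def classify(line):
    """Return a dict describing a header line, or None for body lines."""
    match = HEADER.match(line)
    if match is None:
        return None
    status = match.group("status")
    if status == "UNREACHABLE":
        kind = "connection_failure"  # Ansible never got to run the module
    elif status == "FAILED":
        kind = "command_failure"     # the module ran and reported failure
    else:
        kind = "ok"
    rc = match.group("rc")
    return {
        "host": match.group("host"),
        "kind": kind,
        "rc": int(rc) if rc is not None else None,
    }


for sample in (
    "node-a | CHANGED | rc=0 >>",
    "node-b | FAILED | rc=3 >>",
    "node-c | UNREACHABLE! => {",
    "active",
):
    print(classify(sample))
```

Keeping the return code next to the classification lets one host's non-zero exit be reported without hiding successes on other hosts.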

3.7 Real World Outcome

3.7.1 How to Run (Copy/Paste)

ansible all -i inventory.ini -m ping
ansible linux_nodes -i inventory.ini -m command -a "uptime"
ansible linux_nodes -i inventory.ini -m setup -a "filter=ansible_os_family"

3.7.2 Golden Path Demo (Deterministic)

All hosts respond, and the recap summary records zero failed and zero unreachable hosts.

3.7.3 Exact CLI Transcript

$ ansible linux_nodes -i inventory.ini -m command -a "systemctl is-active sshd"
node-a | CHANGED | rc=0 >>
active
node-b | CHANGED | rc=0 >>
active
node-c | CHANGED | rc=0 >>
active

4. Solution Architecture

4.1 High-Level Design

Inventory -> Connectivity check -> Read-only command set -> Report file

4.2 Key Components

Component          Responsibility    Key Decisions
Inventory file     Host selection    Explicit groups only
Audit command set  Baseline capture  Deterministic commands
Report output      Evidence          Keep stable ordering
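The "keep stable ordering" decision for report output could look like the sketch below. It assumes a JSON Lines file with one record per host, sorted by host name and with sorted keys. The function name `write_report`, the `baseline-<label>.jsonl` file name, and the sample records are hypothetical choices, not part of the project specification.

```python
import json
import tempfile
from pathlib import Path


def write_report(report_dir, records, label):
    """Write one JSON object per line, sorted by host; return the file path."""
    out_dir = Path(report_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"baseline-{label}.jsonl"
    ordered = sorted(records, key=lambda rec: rec["host"])
    lines = [json.dumps(rec, sort_keys=True) for rec in ordered]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    demo_records = [
        {"host": "node-b", "reachable": True},
        {"host": "node-a", "reachable": False},
    ]
    report_path = write_report(Path(tmp) / "reports", demo_records, "run1")
    demo_lines = report_path.read_text(encoding="utf-8").splitlines()
    print(demo_lines)
```

Sorting both hosts and keys means two runs over an unchanged fleet produce byte-identical files, which makes plain `diff` usable as a comparison tool.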

4.3 Data Structures (No Full Code)

baseline_record = {
  host: string,
  reachable: bool,
  uptime: string,
  os_family: string,
  service_state: string
}
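One way to express this record in Python is a frozen dataclass. The sketch assumes that fields which cannot be collected from an unreachable host are stored as `None`, so every record keeps the same keys; that convention is an assumption, not something the specification requires.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class BaselineRecord:
    host: str
    reachable: bool
    # None means "not collected", for example because the host was unreachable.
    uptime: Optional[str] = None
    os_family: Optional[str] = None
    service_state: Optional[str] = None


healthy = BaselineRecord("node-a", True, "up 3 days", "Debian", "active")
down = BaselineRecord(host="node-c", reachable=False)

print(asdict(healthy))
print(asdict(down))
```

Because both records expose identical keys, a report consumer never has to guess whether a missing field means "not checked" or "check failed to run".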

4.4 Algorithm Overview

  1. Resolve host list.
  2. Validate connectivity.
  3. Execute fixed command set.
  4. Store results by host.

Complexity:

  • Time: O(hosts * checks)
  • Space: O(hosts)
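The four steps can be modeled as a nested loop, which is where the O(hosts * checks) bound comes from. This is a sequential sketch only; Ansible itself fans each check out across hosts in parallel. The `run_check` callable is injected so the example runs without a real fleet, and `fake_run` with its canned values is purely illustrative.

```python
def collect_baseline(hosts, checks, run_check):
    """Return {host: {field: value}} using a fixed, ordered set of checks."""
    report = {}
    for host in sorted(hosts):                  # 1. resolved host list
        if not run_check(host, "ping"):         # 2. validate connectivity
            report[host] = {"reachable": False}
            continue
        row = {"reachable": True}
        for name in checks:                     # 3. fixed command set
            row[name] = run_check(host, name)
        report[host] = row                      # 4. store results by host
    return report


def fake_run(host, check):
    """Stand-in for a real check runner; node-c simulates an unreachable host."""
    if host == "node-c":
        return False if check == "ping" else None
    canned = {
        "ping": True,
        "uptime": "up 1 day",
        "os_family": "Debian",
        "service_state": "active",
    }
    return canned[check]


CHECKS = ["uptime", "os_family", "service_state"]
report = collect_baseline(["node-b", "node-a", "node-c"], CHECKS, fake_run)
for host, row in report.items():
    print(host, row)
```

An unreachable host gets a record of its own and does not stop the loop, so partial failures stay visible next to successes.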

5. Implementation Guide

5.1 Development Environment Setup

ansible --version
ssh node-a "echo ok"

5.2 Project Structure

P01/
├── inventory.ini
├── commands.txt
└── reports/

5.3 The Core Question You’re Answering

“Can I trust my host targeting and baseline telemetry before writing state changes?”

5.4 Concepts You Must Understand First

  1. Inventory host patterns.
  2. SSH authentication path.
  3. Per-host failure isolation.

5.5 Questions to Guide Your Design

  1. How will you verify inventory freshness?
  2. How will you keep baseline reports comparable?

5.6 Thinking Exercise

Sketch baseline report fields and mark which can change every minute.

5.7 The Interview Questions They’ll Ask

  1. How do you avoid accidental host targeting?
  2. Why start with ad-hoc read checks?
  3. What failure evidence do you keep?

5.8 Hints in Layers

  • Hint 1: Validate inventory graph first.
  • Hint 2: Use one group only initially.
  • Hint 3: Keep command set minimal.
  • Hint 4: Save raw output untouched.

5.9 Books That Will Help

Topic                    Book                     Chapter
Linux operations basics  How Linux Works          service/process chapters
Automation mindset       Ansible: Up and Running  inventory basics

5.10 Implementation Phases

  • Phase 1: Inventory and connectivity.
  • Phase 2: Baseline command bundle.
  • Phase 3: Report stabilization.

5.11 Key Implementation Decisions

Decision        Options                     Recommendation  Rationale
Output storage  ad-hoc screen only / files  files           auditability
Host naming     IP-only / aliases           aliases         readability

6. Testing Strategy

6.1 Test Categories

Category             Purpose                   Examples
Connectivity tests   Access validation         ping all
Output schema tests  Report consistency        host fields present
Edge tests           Partial failure handling  one host down

6.2 Critical Test Cases

  1. All hosts reachable.
  2. One host unreachable.
  3. One command returns non-zero on one host.

6.3 Test Data

Use 3 hosts; intentionally block SSH on one host for failure path.


7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall                Symptom              Solution
Wrong host alias       unreachable          verify inventory mapping
Mixed distro commands  inconsistent output  use fact-gated checks
No saved logs          no evidence          write report files

7.2 Debugging Strategies

  • Run with -vvvv for SSH diagnostics.
  • Test direct SSH outside Ansible first.
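Before reading verbose SSH logs, it can help to confirm that the SSH port accepts TCP connections at all. The sketch below uses only the Python standard library. It tests network reachability of the port, not authentication, and the helper name `tcp_port_open` is hypothetical.

```python
import socket


def tcp_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Self-contained demo against a throwaway local listener.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]

open_result = tcp_port_open("127.0.0.1", demo_port, timeout=1.0)
listener.close()
closed_result = tcp_port_open("127.0.0.1", demo_port, timeout=1.0)
print(open_result, closed_result)
```

If the port is closed, the problem is in the network path or the SSH daemon; if it is open but Ansible still reports the host as unreachable, look at keys, users, and host aliases next.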

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add disk and memory checks.
  • Add per-group baseline reports.

8.2 Intermediate Extensions

  • Export report as JSON.
  • Add daily scheduled baseline run.

8.3 Advanced Extensions

  • Compare baseline deltas automatically.
  • Trigger alerts on service state drift.
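A starting point for both advanced extensions is a diff that ignores fields expected to change on every run. The sketch assumes baselines are dicts keyed by host and that `uptime` is the only volatile field; the names `VOLATILE_FIELDS`, `stable_view`, and `diff_baselines` are hypothetical.

```python
VOLATILE_FIELDS = {"uptime"}  # assumed to change on every run


def stable_view(record):
    """Drop fields that are expected to differ between runs."""
    return {k: v for k, v in record.items() if k not in VOLATILE_FIELDS}


def diff_baselines(before, after):
    """Return {host: {field: (old, new)}} for every stable field that changed."""
    changes = {}
    for host in sorted(set(before) | set(after)):
        old = stable_view(before.get(host, {}))
        new = stable_view(after.get(host, {}))
        delta = {
            field: (old.get(field), new.get(field))
            for field in sorted(set(old) | set(new))
            if old.get(field) != new.get(field)
        }
        if delta:
            changes[host] = delta
    return changes


before = {"node-a": {"reachable": True, "uptime": "up 1 day", "service_state": "active"}}
after = {"node-a": {"reachable": True, "uptime": "up 2 days", "service_state": "inactive"}}

drift = diff_baselines(before, after)
print(drift)
```

An alerting hook would only need to check whether `service_state` appears in any host's delta.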

9. Real-World Connections

9.1 Industry Applications

  • Pre-maintenance fleet validation.
  • Compliance evidence snapshots.

9.2 Related Open Source Projects

  • ansible/ansible
  • community inventory plugins

9.3 Interview Relevance

Covers inventory safety, operational evidence, and failure triage.


10. Resources

10.1 Essential Reading

  • Ansible inventory docs.
  • Ansible ad-hoc command docs.

10.2 Tools & Documentation

  • ansible CLI
  • ansible-inventory

10.3 Related Projects in This Series

  • Next: P02-idempotent-web-tier-bootstrap.md

11. Self-Assessment Checklist

  • I can explain host targeting before running commands.
  • I can produce a consistent baseline report.
  • I can isolate and explain connection failures quickly.

12. Submission / Completion Criteria

Minimum Viable Completion

  • Successful baseline run for 3 hosts.
  • One saved report file.

Full Completion

  • Includes failure-path evidence and documented fix.

Excellence

  • Includes automated baseline comparison and drift notes.