Project 1: Ad-Hoc Fleet Baseline Audit
Build a repeatable, low-risk baseline visibility workflow for your host fleet before you automate writes.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 1 |
| Time Estimate | 4-6 hours |
| Main Programming Language | Ansible CLI |
| Alternative Programming Languages | Shell |
| Coolness Level | Level 2 |
| Business Potential | 1. Resume Gold |
| Prerequisites | SSH basics, inventory basics, Linux service checks |
| Key Topics | Inventory targeting, ad-hoc commands, baseline evidence |
1. Learning Objectives
By completing this project, you will:
- Build and validate a static inventory for a multi-host lab.
- Run deterministic ad-hoc checks and collect comparable output.
- Distinguish connection failures from command failures.
- Produce a baseline report that can be compared before/after later projects.
2. All Theory Needed (Per-Concept Breakdown)
2.1 Inventory as a Trust Boundary
Fundamentals
Inventory is not a passive list. It defines the blast radius for every operation. A single grouping mistake can route commands to unintended systems. Good inventory design includes explicit groups, clear host aliases, and predictable variable attachment points.
Deep Dive into the concept
Treat inventory as production control data. This means reviewing inventory changes like code, tracking ownership of host groups, and validating inventory output before any write operation. Ad-hoc audits are ideal for this because they run read-only checks and let you test your group definitions under real connectivity conditions.
How this fits in the projects
Used in P01 directly and reused in P02-P07.
Definitions & key terms
- Host pattern: selection expression (all, group, host).
- Group: logical host set.
- Alias: inventory name mapped to connection endpoint.
Mental model diagram
Inventory file -> host groups -> target pattern -> selected nodes -> command execution
How it works
- Load inventory.
- Resolve host pattern.
- Connect via SSH.
- Run module.
- Return per-host results.
Minimal concrete example
inventory.ini
[linux_nodes]
node-a ansible_host=10.10.1.11
node-b ansible_host=10.10.1.12
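The flow in the mental model (inventory file -> host groups -> target pattern -> selected nodes) can be sketched in a few lines of Python. This is a simplified illustration and not Ansible's actual inventory parser: it assumes only plain `[group]` sections with `alias key=value` host lines, and the names `parse_inventory` and `resolve_pattern` are hypothetical.

```python
# Simplified sketch of INI inventory parsing and host-pattern resolution.
# Not Ansible's real parser: no :children, :vars, ranges, or wildcards.

INVENTORY_TEXT = """\
[linux_nodes]
node-a ansible_host=10.10.1.11
node-b ansible_host=10.10.1.12
"""


def parse_inventory(text):
    """Return {group: {alias: {var: value}}} for plain [group] sections."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups.setdefault(current, {})
            continue
        if current is None:
            # Hosts listed before any section header (assumed group name).
            current = "ungrouped"
            groups.setdefault(current, {})
        alias, *pairs = line.split()
        groups[current][alias] = dict(p.split("=", 1) for p in pairs)
    return groups


def resolve_pattern(groups, pattern):
    """Resolve 'all', a group name, or one host alias to a sorted alias list."""
    every_host = {alias for hosts in groups.values() for alias in hosts}
    if pattern == "all":
        return sorted(every_host)
    if pattern in groups:
        return sorted(groups[pattern])
    if pattern in every_host:
        return [pattern]
    raise LookupError(f"pattern {pattern!r} matches no hosts")


groups = parse_inventory(INVENTORY_TEXT)
print(resolve_pattern(groups, "linux_nodes"))  # ['node-a', 'node-b']
print(groups["linux_nodes"]["node-a"]["ansible_host"])  # 10.10.1.11
```

Raising on an unmatched pattern is a deliberate choice in this sketch: a typo in a group name stops the run instead of quietly selecting nothing.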
Common misconceptions
- “Inventory can be fixed later” -> wrong; it defines safety now.
Check-your-understanding questions
- Why is inventory review required before playbook writes?
Check-your-understanding answers
- It sets which hosts can be changed.
Real-world applications
- Fleet readiness and pre-maintenance checks.
Where you’ll apply it
P01, P02, P05.
References
- Ansible inventory docs.
Key insights
Inventory correctness is the first operational safety gate.
Summary
Targeting errors are more dangerous than syntax errors.
Homework/Exercises
- Build two groups and verify output with ansible-inventory --graph.
Solutions
- Keep names explicit and verify group membership before command runs.
3. Project Specification
3.1 What You Will Build
A baseline audit command set that checks:
- host reachability
- uptime
- OS family
- active state of one critical service
Included: deterministic command transcript and recap summary. Excluded: configuration mutation.
3.2 Functional Requirements
- Inventory with at least 3 hosts in one group.
- Ping success for all hosts (the ping module checks SSH login and a usable Python on the target; it is not an ICMP ping).
- At least three read-only checks collected in one report.
- Failure output preserved for unreachable hosts.
3.3 Non-Functional Requirements
- Reliability: repeated runs should produce a stable schema.
- Usability: output should be easy to compare across runs.
- Security: no plaintext credentials in inventory.
3.4 Example Usage / Output
$ ansible linux_nodes -i inventory.ini -m ping
node-a | SUCCESS => {"changed": false, "ping": "pong"}
node-b | SUCCESS => {"changed": false, "ping": "pong"}
node-c | SUCCESS => {"changed": false, "ping": "pong"}
3.5 Data Formats / Schemas / Protocols
- Input: INI inventory format.
- Output: one result block per host, each introduced by a host | STATUS header line.
3.6 Edge Cases
- One unreachable host should not hide success on others.
- One command failure should be reported with return code.
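Both edge cases depend on telling a connection failure apart from a command failure. The sketch below classifies the per-host header lines of ad-hoc output. It assumes the default ad-hoc output style shown in this project (`host | STATUS ...`); the sample text is hand-written for illustration and not a captured transcript, and the names `HEADER` and `classify` are hypothetical.

```python
import re

# Matches headers such as:
#   node-a | CHANGED | rc=0 >>
#   node-c | UNREACHABLE! => {...}
HEADER = re.compile(
    r"^(?P<host>\S+) \| (?P<status>[A-Z]+)!?(?: \| rc=(?P<rc>-?\d+))?"
)

# Illustrative sample only; real message bodies will differ.
SAMPLE = """\
node-a | CHANGED | rc=0 >>
active
node-b | FAILED | rc=3 >>
inactive
node-c | UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh", "unreachable": true}
"""


def classify(output):
    """Map each host to 'ok', 'command_failed', or 'unreachable', plus its rc."""
    results = {}
    for line in output.splitlines():
        m = HEADER.match(line)
        if not m:
            continue  # body line that belongs to the previous header
        status, rc = m["status"], m["rc"]
        if status == "UNREACHABLE":
            kind = "unreachable"      # connection never succeeded
        elif status == "FAILED":
            kind = "command_failed"   # connected, but the task failed
        else:
            kind = "ok"               # SUCCESS or CHANGED
        results[m["host"]] = {
            "kind": kind,
            "rc": int(rc) if rc is not None else None,
        }
    return results


for host, info in classify(SAMPLE).items():
    print(host, info)
```

Because each host is keyed separately, one unreachable host cannot mask success on the others, and a failed command keeps its return code.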
3.7 Real World Outcome
3.7.1 How to Run (Copy/Paste)
ansible all -i inventory.ini -m ping
ansible linux_nodes -i inventory.ini -m command -a "uptime"
ansible linux_nodes -i inventory.ini -m setup -a "filter=ansible_os_family"
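To turn the setup output into a report field, the JSON after each `host | STATUS =>` header can be parsed. The sketch below assumes that header style and accepts both single-line and pretty-printed JSON bodies; the sample text is hand-written for illustration, and `split_blocks` and `os_families` are hypothetical helper names.

```python
import json
import re

HEADER = re.compile(r"^(?P<host>\S+) \| (?P<status>[A-Z]+)!? => (?P<rest>.*)$")

# Illustrative sample only; real fact output contains more keys.
SAMPLE = """\
node-a | SUCCESS => {
    "ansible_facts": {
        "ansible_os_family": "Debian"
    },
    "changed": false
}
node-b | SUCCESS => {"ansible_facts": {"ansible_os_family": "RedHat"}, "changed": false}
"""


def split_blocks(output):
    """Yield (host, status, json_text) for each 'host | STATUS => {...}' block."""
    host = status = None
    chunks = []
    for line in output.splitlines():
        m = HEADER.match(line)
        if m:
            if host is not None:
                yield host, status, "\n".join(chunks)
            host, status, chunks = m["host"], m["status"], [m["rest"]]
        elif host is not None:
            chunks.append(line)  # continuation of a pretty-printed body
    if host is not None:
        yield host, status, "\n".join(chunks)


def os_families(output):
    """Return {host: os_family}; the value is None when the fact is absent."""
    families = {}
    for host, _status, text in split_blocks(output):
        payload = json.loads(text)
        families[host] = payload.get("ansible_facts", {}).get("ansible_os_family")
    return families


print(os_families(SAMPLE))  # {'node-a': 'Debian', 'node-b': 'RedHat'}
```

An unreachable host carries no ansible_facts key, so it maps to None here rather than raising.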
3.7.2 Golden Path Demo (Deterministic)
All hosts respond, and your recap summary lists zero unreachable or failed hosts.
3.7.3 If CLI: exact transcript
$ ansible linux_nodes -i inventory.ini -m command -a "systemctl is-active sshd"
node-a | CHANGED | rc=0 >>
active
node-b | CHANGED | rc=0 >>
active
node-c | CHANGED | rc=0 >>
active
4. Solution Architecture
4.1 High-Level Design
Inventory -> Connectivity check -> Read-only command set -> Report file
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Inventory file | Host selection | Explicit groups only |
| Audit command set | Baseline capture | Deterministic commands |
| Report output | Evidence | Keep stable ordering |
4.3 Data Structures (No Full Code)
baseline_record = {
host: string,
reachable: bool,
uptime: string,
os_family: string,
service_state: string
}
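One possible concrete form of this record is a Python dataclass. The sketch below makes an assumption beyond the outline above: the three string fields are optional so that an unreachable host can still be recorded. The names `BaselineRecord` and `to_json_line` are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class BaselineRecord:
    host: str
    reachable: bool
    # Assumption: None marks "not collected", e.g. for an unreachable host.
    uptime: Optional[str]
    os_family: Optional[str]
    service_state: Optional[str]


def to_json_line(record):
    """Serialize with sorted keys so identical records give identical text."""
    return json.dumps(asdict(record), sort_keys=True)


ok = BaselineRecord("node-a", True, "up 3 days", "Debian", "active")
down = BaselineRecord("node-c", False, None, None, None)

print(to_json_line(ok))
print(to_json_line(down))
```

Sorted keys keep the on-disk schema stable between runs, which makes plain text diffs of two reports meaningful.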
4.4 Algorithm Overview
- Resolve host list.
- Validate connectivity.
- Execute fixed command set.
- Store results by host.
Complexity:
- Time: O(hosts * checks)
- Space: O(hosts)
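The four steps can be sketched as a nested loop, which is where the O(hosts * checks) time bound comes from. This is a sequential illustration only (Ansible itself runs hosts in parallel), and it takes the connectivity probe and the command runner as injected functions so that it runs without a lab. All names here, including `collect_baseline`, `fake_probe`, and `fake_run`, are hypothetical.

```python
def collect_baseline(hosts, checks, probe, run_check):
    """Build {host: record} from a fixed check set.

    hosts:     iterable of host aliases (already resolved from inventory)
    checks:    {field_name: command}
    probe:     callable(host) -> bool, the connectivity check
    run_check: callable(host, command) -> str, one read-only command
    """
    report = {}
    for host in sorted(hosts):            # stable ordering across runs
        reachable = probe(host)
        record = {"host": host, "reachable": reachable}
        for field, command in checks.items():
            # Skip commands on unreachable hosts but keep the field present.
            record[field] = run_check(host, command) if reachable else None
        report[host] = record
    return report


# Stand-ins for real connectivity and command execution.
def fake_probe(host):
    return host != "node-c"  # pretend node-c is down


def fake_run(host, command):
    return f"<output of {command!r} on {host}>"


CHECKS = {"uptime": "uptime", "service_state": "systemctl is-active sshd"}

report = collect_baseline(["node-c", "node-a", "node-b"], CHECKS, fake_probe, fake_run)
for host, record in report.items():
    print(host, record)
```

Every record carries the same fields whether or not the host answered, so the report schema does not change when a host is down.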
5. Implementation Guide
5.1 Development Environment Setup
ansible --version
ssh 10.10.1.11 "echo ok"  # node-a's ansible_host; use the alias only if DNS or ~/.ssh/config resolves it
5.2 Project Structure
P01/
├── inventory.ini
├── commands.txt
└── reports/
5.3 The Core Question You’re Answering
“Can I trust my host targeting and baseline telemetry before writing state changes?”
5.4 Concepts You Must Understand First
- Inventory host patterns.
- SSH authentication path.
- Per-host failure isolation.
5.5 Questions to Guide Your Design
- How will you verify inventory freshness?
- How will you keep baseline reports comparable?
5.6 Thinking Exercise
Sketch baseline report fields and mark which can change every minute.
5.7 The Interview Questions They’ll Ask
- How do you avoid accidental host targeting?
- Why start with ad-hoc read checks?
- What failure evidence do you keep?
5.8 Hints in Layers
- Hint 1: Validate inventory graph first.
- Hint 2: Use one group only initially.
- Hint 3: Keep command set minimal.
- Hint 4: Save raw output untouched.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Linux operations basics | How Linux Works | service/process chapters |
| Automation mindset | Ansible: Up and Running | inventory basics |
5.10 Implementation Phases
- Phase 1: Inventory and connectivity.
- Phase 2: Baseline command bundle.
- Phase 3: Report stabilization.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Output storage | ad-hoc screen only / files | files | auditability |
| Host naming | IP-only / aliases | aliases | readability |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Connectivity tests | Access validation | ping all |
| Output schema tests | Report consistency | host fields present |
| Edge tests | Partial failure handling | one host down |
6.2 Critical Test Cases
- All hosts reachable.
- One host unreachable.
- One command returns non-zero on one host.
6.3 Test Data
Use 3 hosts; intentionally block SSH on one host for failure path.
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Wrong host alias | unreachable | verify inventory mapping |
| Mixed distro commands | inconsistent output | use fact-gated checks |
| No saved logs | no evidence | write report files |
7.2 Debugging Strategies
- Run with -vvvv for SSH diagnostics.
- Test direct SSH outside Ansible first.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add disk and memory checks.
- Add per-group baseline reports.
8.2 Intermediate Extensions
- Export report as JSON.
- Add daily scheduled baseline run.
8.3 Advanced Extensions
- Compare baseline deltas automatically.
- Trigger alerts on service state drift.
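As a starting point for both advanced extensions, a delta comparison between two saved baselines might look like the sketch below. It assumes each baseline is a dictionary keyed by host with the report fields from this project, and it ignores uptime by default because that field changes on every run. The function name `diff_baselines` and the sample data are hypothetical.

```python
def diff_baselines(before, after, ignore=("uptime",)):
    """Return {host: {field: (old, new)}} for every field that differs."""
    drift = {}
    for host in sorted(set(before) | set(after)):
        old = before.get(host, {})
        new = after.get(host, {})
        changed = {
            field: (old.get(field), new.get(field))
            for field in sorted(set(old) | set(new))
            if field not in ignore and old.get(field) != new.get(field)
        }
        if changed:
            drift[host] = changed
    return drift


# Hand-written sample baselines for illustration.
before = {
    "node-a": {"reachable": True, "uptime": "up 3 days", "service_state": "active"},
    "node-b": {"reachable": True, "uptime": "up 9 days", "service_state": "active"},
}
after = {
    "node-a": {"reachable": True, "uptime": "up 4 days", "service_state": "active"},
    "node-b": {"reachable": True, "uptime": "up 10 days", "service_state": "inactive"},
}

print(diff_baselines(before, after))
# {'node-b': {'service_state': ('active', 'inactive')}}
```

An alert on service state drift could then be as simple as checking whether any host in the result has a service_state entry.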
9. Real-World Connections
9.1 Industry Applications
- Pre-maintenance fleet validation.
- Compliance evidence snapshots.
9.2 Related Open Source Projects
- ansible/ansible
- community inventory plugins
9.3 Interview Relevance
Covers inventory safety, operational evidence, and failure triage.
10. Resources
10.1 Essential Reading
- Ansible inventory docs.
- Ansible ad-hoc command docs.
10.2 Tools & Documentation
- ansible CLI
- ansible-inventory
10.4 Related Projects in This Series
- Next: P02-idempotent-web-tier-bootstrap.md
11. Self-Assessment Checklist
- I can explain host targeting before running commands.
- I can produce a consistent baseline report.
- I can isolate and explain connection failures quickly.
12. Submission / Completion Criteria
Minimum Viable Completion
- Successful baseline run for 3 hosts.
- One saved report file.
Full Completion
- Includes failure-path evidence and documented fix.
Excellence
- Includes automated baseline comparison and drift notes.