Project 3: Ownership Boundary Mapper (RACI 2.0)
Build a schema and validation tool that maps every technical asset to exactly one owning team and an escalation path.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | 1 Week (15-20 hours) |
| Primary Language | YAML / JSON |
| Alternative Languages | Python, Go |
| Prerequisites | Basic scripting, understanding of cloud resources |
| Key Topics | RACI, Bounded Contexts, Code Ownership |
1. Learning Objectives
By completing this project, you will:
- Define ownership at the asset level (repo, service, bucket, queue)
- Create a machine-readable ownership registry
- Build a validation tool that flags orphaned resources
- Implement the rule: “If it exists, someone owns it”
- Connect ownership to on-call and escalation
2. Theoretical Foundation
2.1 Core Concepts
The Ownership Problem
WITHOUT CLEAR OWNERSHIP                WITH CLEAR OWNERSHIP
┌─────────────────────────────────┐    ┌─────────────────────────────────┐
│ "Whose service is this?"        │    │ owner-check service-xyz         │
│ "I think Team A? Maybe B?"      │    │ => Team: Payments               │
│ "Let me Slack around..."        │    │ => On-call: @jane (PagerDuty)   │
│ [30 minutes of searching]       │    │ => Escalation: #payments-oncall │
│ "Actually, nobody knows"        │    │ [2 seconds]                     │
└─────────────────────────────────┘    └─────────────────────────────────┘
Every hour spent searching for ownership is an hour not spent fixing the problem.
The RACI Matrix Evolved
Traditional RACI defines roles:
- Responsible: Who does the work
- Accountable: Who is the decision-maker (only one!)
- Consulted: Who provides input
- Informed: Who needs to know
For operating models, we simplify:
- Owner: The single team accountable for the asset
- Contributors: Teams that can submit changes (PRs)
- Consumers: Teams that depend on the asset
- Escalation: Who to page when things break
Bounded Contexts (DDD)
Domain-Driven Design teaches that systems should be divided into “bounded contexts”—areas where a specific domain model applies. Ownership boundaries should align with these contexts.
┌──────────────────────────────────────────────────────────────┐
│                         ORDER DOMAIN                         │
│  ┌─────────────────┐  ┌─────────────────┐  ┌──────────────┐  │
│  │ Order Service   │  │ Cart Service    │  │ Promo Engine │  │
│  │ Owner: Orders   │  │ Owner: Orders   │  │ Owner: Orders│  │
│  └─────────────────┘  └─────────────────┘  └──────────────┘  │
└──────────────────────────────────────────────────────────────┘
                                │
                   Shared DB (Owned by Orders)
                                │
┌──────────────────────────────────────────────────────────────┐
│                        PAYMENT DOMAIN                        │
│  ┌─────────────────┐  ┌─────────────────┐                    │
│  │ Payment Gateway │  │ Fraud Detection │                    │
│  │ Owner: Payments │  │ Owner: Payments │                    │
│  └─────────────────┘  └─────────────────┘                    │
└──────────────────────────────────────────────────────────────┘
2.2 Why This Matters
Incident ping-pong is the #1 symptom of unclear ownership:
3:02 AM: Alert fires
3:05 AM: SRE pages Team A
3:15 AM: "Not our service" - Team A pages Team B
3:30 AM: "We just consume it" - Team B pages Team C
3:45 AM: "The original team left" - Team C pages... everyone?
4:00 AM: CTO joins call asking "WHO OWNS THIS?"
With an ownership registry, the 3:02 AM alert goes directly to the right team.
2.3 Historical Context
- CODEOWNERS (GitHub, 2017): First mainstream ownership-in-code
- Backstage (Spotify, 2020): Service catalog with ownership
- OpsLevel (2019): Dedicated service ownership platform
2.4 Common Misconceptions
| Misconception | Reality |
|---|---|
| “Shared ownership works” | “Shared” means “no one” when there’s a problem |
| “We can figure it out when needed” | You can’t figure it out at 3 AM under pressure |
| “Ownership = who wrote the code” | Ownership = who maintains, operates, and evolves |
| “We need ownership per file” | Start at service/repo level, go finer only if needed |
3. Project Specification
3.1 What You Will Build
A CLI tool (owner-check) that:
- Reads a registry of teams and assets
- Validates that every asset has exactly one owner
- Flags orphaned resources and stale team references
- Outputs ownership information for any asset
3.2 Functional Requirements
- Team Registry
  - Store team metadata (ID, name, Slack, on-call link)
  - Support team lifecycle (active, deprecated, merged)
- Asset Registry
  - Map assets to owning teams
  - Support multiple asset types (repo, service, S3 bucket, queue)
  - Include escalation path
- Validation
  - Every asset must have exactly one owner
  - Owner team must exist and be active
  - Dependencies must reference valid assets
- Query Interface
  - owner-check <asset-id> → returns owner info
  - owner-check --team <team-id> → lists all assets
  - owner-check --orphans → lists unowned resources
3.3 Non-Functional Requirements
- Schema must be version-controllable (YAML/JSON)
- Must support 1000+ assets, with single-asset lookups returning in under 1 second
- Should integrate with CI/CD (validate on PR)
- Must be extensible for new asset types
3.4 Example Usage / Output
Input (teams.yaml):
teams:
  - id: team-identity
    name: Identity & Access
    slack: "#team-identity"
    oncall: https://pagerduty.com/services/identity
    status: active
  - id: team-payments
    name: Payments & Billing
    slack: "#team-payments"
    oncall: https://pagerduty.com/services/payments
    status: active
  - id: team-rocket
    name: Legacy Rocket Team
    status: deprecated
    merged_into: team-identity
Input (assets.yaml):
assets:
  - id: service-auth
    type: service
    name: Authentication Service
    owner: team-identity
    repo: github.com/company/auth-service
    dependencies:
      - asset_id: service-userdb
        type: runtime
  - id: service-payment-gateway
    type: service
    name: Payment Gateway
    owner: team-payments
    repo: github.com/company/payment-gateway
  - id: bucket-legacy-logs
    type: s3_bucket
    name: Legacy Log Archive
    owner: team-rocket  # Problem: team is deprecated!
  - id: queue-notifications
    type: sqs_queue
    name: Notification Queue
    # Problem: no owner defined!
CLI Output:
$ ./owner-check service-auth
Asset: service-auth (Authentication Service)
Type: service
Owner: team-identity (Identity & Access)
On-call: https://pagerduty.com/services/identity
Slack: #team-identity
Repo: github.com/company/auth-service
Dependencies:
- service-userdb (runtime)
$ ./owner-check --orphans
[ERROR] queue-notifications: No owner defined
[ERROR] bucket-legacy-logs: Owner 'team-rocket' is deprecated
$ ./owner-check --team team-payments
Assets owned by team-payments (Payments & Billing):
- service-payment-gateway (service)
$ ./owner-check --validate
Validating ownership registry...
[OK] 2 teams active
[OK] 2 assets fully defined
[ERROR] bucket-legacy-logs: owner team-rocket is deprecated (merged into team-identity)
[ERROR] queue-notifications: missing owner field
Validation FAILED: 2 errors found
3.5 Real World Outcome
After implementing this tool:
- Incident response time decreases (right team paged immediately)
- Orphaned resources are discovered and assigned
- Offboarding is cleaner (assets transferred before team dissolves)
- Audit/compliance is simpler (clear ownership trail)
4. Solution Architecture
4.1 High-Level Design
┌─────────────────────────────────────────────────────────────────┐
│                        OWNERSHIP SYSTEM                         │
└─────────────────────────────────────────────────────────────────┘
                                 │
           ┌─────────────────────┼─────────────────────┐
           │                     │                     │
           ▼                     ▼                     ▼
   ┌───────────────┐     ┌───────────────┐     ┌───────────────┐
   │ teams.yaml    │     │ assets.yaml   │     │ Infrastructure│
   │               │     │               │     │ (AWS/GCP)     │
   │ Team registry │     │ Asset->Team   │     │ Actual state  │
   │ with metadata │     │ mapping       │     │ for comparison│
   └───────────────┘     └───────────────┘     └───────────────┘
           │                     │                     │
           └─────────────────────┼─────────────────────┘
                                 │
                                 ▼
                       ┌───────────────────┐
                       │    owner-check    │
                       │     CLI Tool      │
                       │                   │
                       │  - validate       │
                       │  - query          │
                       │  - compare        │
                       └───────────────────┘
                                 │
                 ┌───────────────┼───────────────┐
                 ▼               ▼               ▼
           ┌──────────┐    ┌──────────┐    ┌──────────┐
           │ Terminal │    │  CI/CD   │    │ Backstage│
           │  Output  │    │  Check   │    │   API    │
           └──────────┘    └──────────┘    └──────────┘
4.2 Key Components
- Team Registry (teams.yaml)
  - Source of truth for team metadata
  - Lifecycle management (active/deprecated/merged)
- Asset Registry (assets.yaml)
  - Maps every asset to one owner
  - Includes dependencies and escalation
- Validator
  - Checks referential integrity
  - Ensures every asset has valid owner
- Query Engine
  - Fast lookup by asset or team
  - Supports wildcards and filters
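The Query Engine's lookups could be sketched roughly as below. This is only an illustration, assuming assets are already parsed into dictionaries with `id`, `type`, and `owner` keys; the `OwnershipIndex` class and its method names are hypothetical, and wildcards are interpreted as shell-style patterns via Python's `fnmatch` module.

```python
from collections import defaultdict
from fnmatch import fnmatchcase


class OwnershipIndex:
    """Hypothetical in-memory index over parsed asset records."""

    def __init__(self, assets):
        self.by_id = {}
        self.by_team = defaultdict(list)
        for asset in assets:
            self.by_id[asset["id"]] = asset
            # Assets without an owner are grouped under None (orphans).
            self.by_team[asset.get("owner")].append(asset)

    def owner_of(self, asset_id):
        """Return the owning team id, or None if unknown or unowned."""
        asset = self.by_id.get(asset_id)
        return asset.get("owner") if asset else None

    def assets_of(self, team_id):
        """Return the ids of all assets owned by one team."""
        return [a["id"] for a in self.by_team.get(team_id, [])]

    def search(self, pattern, asset_type=None):
        """Return sorted asset ids matching a wildcard and optional type."""
        return sorted(
            asset_id
            for asset_id, asset in self.by_id.items()
            if fnmatchcase(asset_id, pattern)
            and (asset_type is None or asset.get("type") == asset_type)
        )


index = OwnershipIndex([
    {"id": "service-auth", "type": "service", "owner": "team-identity"},
    {"id": "service-payment-gateway", "type": "service", "owner": "team-payments"},
    {"id": "queue-notifications", "type": "sqs_queue"},
])
print(index.owner_of("service-auth"))
print(index.search("service-*"))
print(index.assets_of(None))
```

Building both dictionaries once at load time keeps each lookup a constant-time dictionary access, which is likely sufficient for a registry of a few thousand assets.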
4.3 Data Structures
# Schema: teams.yaml
teams:
  - id: string            # Unique identifier (required)
    name: string          # Human-readable name (required)
    slack: string         # Slack channel for contact
    oncall: url           # PagerDuty/OpsGenie link
    status: enum          # active | deprecated | pending
    merged_into: string   # If deprecated, new owner
    members:              # Optional: team composition
      - email: string
        role: string
# Schema: assets.yaml
assets:
  - id: string            # Unique identifier (required)
    type: enum            # service | repo | s3_bucket | rds | queue | etc.
    name: string          # Human-readable name
    owner: string         # team-id (required, must exist)
    repo: url             # GitHub/GitLab repository
    documentation: url    # Link to docs
    criticality: enum     # tier-1 | tier-2 | tier-3
    dependencies:         # What this asset depends on
      - asset_id: string
        type: enum        # runtime | build | data
    slo_link: url         # Link to SLO dashboard
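One possible way to mirror these schemas in `models.py` is with dataclasses. The sketch below covers only a subset of the fields; the class names, the `asset_from_dict` helper, and the choice to default `status` to "active" are assumptions for illustration, not part of the schema itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Team:
    id: str
    name: str
    slack: Optional[str] = None
    oncall: Optional[str] = None
    status: str = "active"  # assumed default when the field is omitted
    merged_into: Optional[str] = None


@dataclass
class Dependency:
    asset_id: str
    type: str = "runtime"


@dataclass
class Asset:
    id: str
    type: str
    name: Optional[str] = None
    # Optional here so the validator, not the parser, reports a missing owner.
    owner: Optional[str] = None
    repo: Optional[str] = None
    criticality: Optional[str] = None
    dependencies: List[Dependency] = field(default_factory=list)


def asset_from_dict(raw: dict) -> Asset:
    """Build an Asset from one parsed registry entry, ignoring unknown keys."""
    deps = [Dependency(**d) for d in raw.get("dependencies") or []]
    known = {k: v for k, v in raw.items() if k in Asset.__dataclass_fields__}
    known["dependencies"] = deps
    return Asset(**known)


auth = asset_from_dict({
    "id": "service-auth",
    "type": "service",
    "owner": "team-identity",
    "dependencies": [{"asset_id": "service-userdb", "type": "runtime"}],
})
orphan = asset_from_dict({"id": "queue-notifications", "type": "sqs_queue"})
print(auth.dependencies[0].asset_id, orphan.owner)
```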
4.4 Algorithm Overview
Validation Algorithm:
1. Load teams.yaml → team_map
2. Load assets.yaml → asset_list
3. For each asset:
   a. Check owner field exists
   b. Check owner is in team_map
   c. Check owner.status == 'active'
   d. For each dependency:
      i. Check dependency asset exists
4. For each team:
   a. Check at least one asset is owned (warn if none)
5. Report all errors
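The validation steps above can be sketched in Python as follows. This is a minimal version, assuming teams and assets are already parsed into lists of dictionaries; the function name, the message wording, and the decision to warn only about active teams that own nothing are illustrative choices.

```python
def validate(teams, assets):
    """Return (errors, warnings) for parsed team and asset dictionaries."""
    errors, warnings = [], []
    team_map = {t["id"]: t for t in teams}
    asset_ids = {a["id"] for a in assets}
    teams_with_assets = set()

    for asset in assets:
        owner = asset.get("owner")
        if not owner:
            errors.append(f"{asset['id']}: missing owner field")
        elif owner not in team_map:
            errors.append(f"{asset['id']}: unknown owner '{owner}'")
        elif team_map[owner].get("status") != "active":
            errors.append(f"{asset['id']}: owner '{owner}' is not active")
        else:
            teams_with_assets.add(owner)
        # Referential integrity: every dependency must point at a known asset.
        for dep in asset.get("dependencies") or []:
            if dep["asset_id"] not in asset_ids:
                errors.append(
                    f"{asset['id']}: unknown dependency '{dep['asset_id']}'"
                )

    # Assumption: only active teams are expected to own something.
    for team_id, team in team_map.items():
        if team.get("status") == "active" and team_id not in teams_with_assets:
            warnings.append(f"{team_id}: owns no assets")
    return errors, warnings


teams = [
    {"id": "team-identity", "status": "active"},
    {"id": "team-payments", "status": "active"},
    {"id": "team-rocket", "status": "deprecated", "merged_into": "team-identity"},
]
assets = [
    {"id": "service-auth", "owner": "team-identity",
     "dependencies": [{"asset_id": "service-userdb", "type": "runtime"}]},
    {"id": "service-payment-gateway", "owner": "team-payments"},
    {"id": "bucket-legacy-logs", "owner": "team-rocket"},
    {"id": "queue-notifications"},
]
errors, warnings = validate(teams, assets)
for message in errors:
    print("[ERROR]", message)
```

Collecting every error before reporting, rather than stopping at the first, lets a single CI run show the full list of fixes needed.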
Sync with Infrastructure (Advanced):
1. Scan AWS/GCP for resources with tags
2. Compare to assets.yaml
3. Flag resources not in registry
4. Flag registry entries not in infrastructure
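Once the cloud scan has produced a list of resource identifiers, the comparison itself reduces to set arithmetic. The sketch below assumes the scan results are already available as plain strings that use the same identifiers as the registry; it makes no cloud API calls, and the function and key names are hypothetical.

```python
def compare_with_infrastructure(registry_ids, discovered_ids):
    """Compare registry asset ids against ids discovered in the cloud."""
    registry, discovered = set(registry_ids), set(discovered_ids)
    return {
        # Exists in the cloud but nobody has claimed it.
        "unregistered": sorted(discovered - registry),
        # Claimed in the registry but no longer exists.
        "missing_in_cloud": sorted(registry - discovered),
        "matched": sorted(registry & discovered),
    }


report = compare_with_infrastructure(
    registry_ids=["bucket-legacy-logs", "bucket-invoices"],
    discovered_ids=["bucket-legacy-logs", "temp-data-export"],
)
print(report)
```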
5. Implementation Guide
5.1 Development Environment Setup
# Python implementation
python3 -m venv venv
source venv/bin/activate
pip install pyyaml click
# Or Go implementation
go mod init owner-check
go get gopkg.in/yaml.v3
go get github.com/spf13/cobra
5.2 Project Structure
owner-check/
├── data/
│ ├── teams.yaml
│ └── assets.yaml
├── src/
│ ├── __init__.py
│ ├── models.py # Data classes
│ ├── loader.py # YAML parsing
│ ├── validator.py # Validation logic
│ └── cli.py # Command-line interface
├── tests/
│ ├── test_validator.py
│ └── fixtures/
│ ├── valid_teams.yaml
│ └── invalid_assets.yaml
└── owner-check # Entry point script
5.3 The Core Question You’re Answering
“If this service breaks at 3 AM, whose phone rings, and do they know it’s their problem?”
Ambiguous ownership is the leading cause of “Incident Ping-Pong”—tickets bouncing between teams because no one is sure they own the fix.
5.4 Concepts You Must Understand First
Stop and research these before coding:
- Bounded Contexts (DDD)
  - How do you draw lines around code so it can be owned by one team?
  - What happens when two teams need to change the same file?
  - Book Reference: “Domain-Driven Design” by Eric Evans, Ch. 14
- The RACI Matrix
  - What’s the difference between Responsible and Accountable?
  - Why can there be multiple R’s but only one A?
  - Book Reference: Standard Project Management literature
- GitHub CODEOWNERS
  - How does GitHub enforce code review by owners?
  - What’s the difference between CODEOWNERS and service ownership?
  - Reference: GitHub documentation
5.5 Questions to Guide Your Design
Before implementing, think through these:
Granularity
- Do you own at the “Repo” level, the “Microservice” level, or the “S3 Bucket” level?
- What happens when one repo contains code for multiple services?
- Should infrastructure (VPCs, subnets) have owners?
The Registry
- Where do the “Teams” live? A YAML file? An LDAP group? A database?
- How do you handle team renames or merges?
- Who is allowed to change ownership?
Lifecycle
- What happens when an owner leaves the company?
- How do you transfer ownership gracefully?
- What’s the process for deprecating a team?
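For team merges, one option is to follow the `merged_into` field until an active team is reached, so the tool can suggest a new owner for assets that still point at a deprecated team. The sketch below assumes teams are held in a dictionary keyed by team id; `resolve_owner` is a hypothetical helper.

```python
def resolve_owner(team_id, team_map):
    """Follow merged_into links until an active team is found."""
    seen = set()
    current = team_id
    while current in team_map and current not in seen:
        seen.add(current)
        team = team_map[current]
        if team.get("status") == "active":
            return current
        current = team.get("merged_into")
    # Unknown team, a deprecated team with no successor, or a merge cycle.
    return None


team_map = {
    "team-identity": {"status": "active"},
    "team-rocket": {"status": "deprecated", "merged_into": "team-identity"},
    "team-a": {"status": "deprecated", "merged_into": "team-b"},
    "team-b": {"status": "deprecated", "merged_into": "team-a"},
}
print(resolve_owner("team-rocket", team_map))
```

The `seen` set guards against two deprecated teams that point at each other, which would otherwise loop forever.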
5.6 Thinking Exercise
The “Burning Building” Trace
Take a random microservice in your system. Imagine it starts returning 500 errors at 3 AM.
Questions while analyzing:
- Who is the first person to get an alert?
- How do they know which team the alert belongs to?
- If they look at the source code, is there a clear “Contact Us” or “Owned By” header?
- If the owner isn’t listed, how many people do they have to ask before finding the owner?
Write down:
- The path from “alert fires” to “correct human is paged”
- Every point where someone had to guess or ask
- Each guess/ask is a bug in your ownership model
5.7 Hints in Layers
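Hint 1: Use CODEOWNERS as Inspiration
GitHub’s CODEOWNERS file is a simple, widely used format. The last matching pattern takes precedence, so the catch-all fallback goes first and more specific rules follow it. A trailing slash matches a directory and everything beneath it:
# Pattern            Owners
# Fallback: listed first so that later, more specific rules override it
*                    @platform-team
/services/auth/      @team-identity
/services/payment/   @team-payments
Start here, then extend to non-code assets.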
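To get a feel for how rule order affects the result, here is a deliberately simplified matcher. It supports only two pattern forms (`*` for everything and root-anchored directory prefixes ending in `/`), whereas GitHub's real syntax is richer; the `owner_for_path` function and the `RULES` list are illustrative only.

```python
def owner_for_path(path, rules):
    """Return the owner from the last rule that matches `path`, or None."""
    owner = None
    normalized = "/" + path.lstrip("/")
    for pattern, rule_owner in rules:
        is_catch_all = pattern == "*"
        is_dir_match = pattern.endswith("/") and normalized.startswith(pattern)
        if is_catch_all or is_dir_match:
            owner = rule_owner  # later rules override earlier ones
    return owner


RULES = [
    ("*", "@platform-team"),  # fallback, listed first
    ("/services/auth/", "@team-identity"),
    ("/services/payment/", "@team-payments"),
]
print(owner_for_path("services/auth/handlers/oauth.py", RULES))
print(owner_for_path("README.md", RULES))
```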
Hint 2: Define the Team Schema First
Before mapping assets, define what a “team” is:
team:
  id: string
  name: string
  slack: string
  oncall: url
  status: active|deprecated
Hint 3: Create the Assets Schema
Map asset IDs to team IDs:
asset:
  id: string
  type: service|bucket|queue|...
  owner: team-id
Hint 4: Write the Validator
The core logic is simple:
# teams is a dict keyed by team id
errors = []
for asset in assets:
    if asset.owner is None:
        errors.append(f"Missing owner: {asset.id}")
    elif asset.owner not in teams:
        errors.append(f"Unknown owner: {asset.owner}")
    elif teams[asset.owner].status != "active":
        errors.append(f"Deprecated owner: {asset.owner}")
5.8 The Interview Questions They’ll Ask
Prepare to answer these:
- “How do you handle shared infrastructure that multiple teams use?”
  - One owner, multiple consumers. Owner manages the resource, others have usage rights.
- “What are the dangers of ‘Shared Ownership’?”
  - No single point of accountability. “Everyone’s job” becomes “no one’s job.”
- “How do you transition ownership of a legacy system to a new team?”
  - Knowledge transfer period, documented handoff, ownership change in registry, redirect of alerts.
- “Should the person who writes the code always be the one who owns it in production?”
  - No. Ownership is about maintenance and operations, not original authorship.
- “What metrics can you use to prove that ownership boundaries are clear?”
  - MTTA (Mean Time To Acknowledge), incident reassignment rate, orphan resource count.
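Two of these metrics are straightforward to compute from incident records. The sketch below assumes each incident is a dictionary with `alerted_at`, `acked_at`, and `reassignments` fields (hypothetical names) and defines the reassignment rate as the share of incidents reassigned at least once; it expects a non-empty list.

```python
from datetime import datetime
from statistics import mean


def ownership_metrics(incidents):
    """Compute MTTA in minutes and the incident reassignment rate."""
    ack_minutes = [
        (i["acked_at"] - i["alerted_at"]).total_seconds() / 60
        for i in incidents
    ]
    reassigned = sum(1 for i in incidents if i["reassignments"] > 0)
    return {
        "mtta_minutes": mean(ack_minutes),
        "reassignment_rate": reassigned / len(incidents),
    }


sample_incidents = [
    # Paged straight to the owner: acknowledged after 5 minutes.
    {"alerted_at": datetime(2024, 1, 1, 3, 2),
     "acked_at": datetime(2024, 1, 1, 3, 7), "reassignments": 0},
    # Bounced between three teams: acknowledged after 58 minutes.
    {"alerted_at": datetime(2024, 1, 2, 3, 2),
     "acked_at": datetime(2024, 1, 2, 4, 0), "reassignments": 3},
]
metrics = ownership_metrics(sample_incidents)
print(metrics)
```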
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Bounded Contexts | “Domain-Driven Design” by Eric Evans | Ch. 14: Maintaining Model Integrity |
| Ownership Patterns | “Modern Software Engineering” by David Farley | Ch. 12 |
| Incident Response | “Site Reliability Engineering” | Ch. 14: Managing Incidents |
5.10 Implementation Phases
Phase 1: Schema Design (2-3 hours)
- Define team schema with required fields
- Define asset schema with required fields
- Document the relationship rules
Phase 2: Registry Bootstrap (3-4 hours)
- Create teams.yaml with 5-10 teams
- Create assets.yaml with 20-30 assets
- Validate manually that mappings are correct
Phase 3: CLI Tool (4-5 hours)
- Implement YAML loading
- Implement validation logic
- Implement query commands
- Add colorized output for errors
Phase 4: CI/CD Integration (2-3 hours)
- Add GitHub Action to validate on PR
- Block merge if validation fails
- Add badge to README showing status
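A CI job typically decides pass or fail from the process exit code, so the CLI needs to translate validation results into one. The function below is a sketch of that step; `run_ci_check` and its `blocking` flag are hypothetical names, with the flag modelling a switch between advisory and blocking enforcement.

```python
import sys


def run_ci_check(errors, warnings=(), blocking=True, out=sys.stdout):
    """Print findings and return the exit code a CI job should use."""
    for message in warnings:
        print(f"[WARN] {message}", file=out)
    for message in errors:
        print(f"[ERROR] {message}", file=out)
    if errors:
        print(f"Validation FAILED: {len(errors)} errors found", file=out)
        # In advisory mode the findings are reported but the build passes.
        return 1 if blocking else 0
    print("Validation passed", file=out)
    return 0


exit_code = run_ci_check(["queue-notifications: missing owner field"])
# In the real entry point this would end with: sys.exit(exit_code)
```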
5.11 Key Implementation Decisions
| Decision | Option A | Option B | Recommendation |
|---|---|---|---|
| Storage | YAML files | Database | YAML (version control) |
| Team source | Manual YAML | LDAP sync | Manual first, LDAP later |
| Asset discovery | Manual | Cloud API scan | Manual first, scan for validation |
| Enforcement | Advisory | Blocking | Start advisory, move to blocking |
6. Testing Strategy
Unit Tests
# Import paths assume the module layout from section 5.2
from src.loader import load_assets, load_teams
from src.models import Asset, Team
from src.validator import validate


def test_valid_ownership():
    teams = load_teams("fixtures/valid_teams.yaml")
    assets = load_assets("fixtures/valid_assets.yaml")
    errors = validate(teams, assets)
    assert len(errors) == 0


def test_missing_owner():
    assets = [Asset(id="test", owner=None)]
    errors = validate([], assets)
    assert "missing owner" in errors[0]


def test_deprecated_owner():
    teams = [Team(id="old", status="deprecated")]
    assets = [Asset(id="test", owner="old")]
    errors = validate(teams, assets)
    assert "deprecated" in errors[0]
Integration Tests
- Load real teams.yaml and assets.yaml
- Validate all assets have valid owners
- Check for circular dependencies
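The circular-dependency check can be done with a depth-first search that tracks which nodes are on the current path. This is a sketch assuming assets are dictionaries whose `dependencies` entries carry an `asset_id`; the function name is hypothetical, and the recursive form is fine for small registries but may need an explicit stack for very deep graphs.

```python
def find_dependency_cycle(assets):
    """Return one dependency cycle as a list of asset ids, or None."""
    graph = {
        a["id"]: [d["asset_id"] for d in a.get("dependencies") or []]
        for a in assets
    }
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNVISITED for node in graph}

    def visit(node, path):
        state[node] = IN_PROGRESS
        path.append(node)
        for nxt in graph[node]:
            # Dependencies on unknown assets are treated as DONE and skipped.
            nxt_state = state.get(nxt, DONE)
            if nxt_state == IN_PROGRESS:
                return path[path.index(nxt):] + [nxt]
            if nxt_state == UNVISITED:
                cycle = visit(nxt, path)
                if cycle:
                    return cycle
        path.pop()
        state[node] = DONE
        return None

    for node in graph:
        if state[node] == UNVISITED:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None


cyclic = [
    {"id": "a", "dependencies": [{"asset_id": "b"}]},
    {"id": "b", "dependencies": [{"asset_id": "c"}]},
    {"id": "c", "dependencies": [{"asset_id": "a"}]},
]
print(find_dependency_cycle(cyclic))
```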
Smoke Tests
- Run owner-check --validate in CI
- Fail build if errors found
7. Common Pitfalls & Debugging
| Problem | Symptom | Root Cause | Fix |
|---|---|---|---|
| Orphaned resources | --validate finds unknown owners | Team renamed/merged without updating | Add owner to teams or update asset |
| Duplicate ownership | Multiple teams claim same asset | No single source of truth | Pick one owner, others become consumers |
| Stale registry | Production resources not in registry | Manual process, no automation | Add cloud scanning or PR requirement |
| Over-granular | 1000+ assets, unmaintainable | Mapped at file level instead of service | Aggregate to service/repo level |
8. Extensions & Challenges
Extension 1: Cloud Resource Sync
Use AWS/GCP APIs to discover resources. Compare to registry. Flag orphans.
$ owner-check --sync-aws
Discovered 150 S3 buckets in AWS
Matched 142 to registry
[WARN] 8 buckets have no owner:
- legacy-logs-2019
- temp-data-export
...
Extension 2: Dependency Graph
Visualize asset dependencies. Highlight cross-team dependencies.
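Before drawing anything, the cross-team edges can be extracted from the registry. The sketch below assumes the same dictionary shape for assets as the registry examples; `cross_team_dependencies` is a hypothetical helper that returns tuples a graph renderer could consume.

```python
def cross_team_dependencies(assets):
    """Return (asset, owner, dependency, dependency_owner) for cross-team edges."""
    owner_of = {a["id"]: a.get("owner") for a in assets}
    edges = []
    for asset in assets:
        for dep in asset.get("dependencies") or []:
            target = dep["asset_id"]
            # Skip unknown targets; the validator reports those separately.
            if target in owner_of and owner_of[target] != asset.get("owner"):
                edges.append(
                    (asset["id"], asset.get("owner"), target, owner_of[target])
                )
    return edges


registry = [
    {"id": "service-checkout", "owner": "team-orders",
     "dependencies": [{"asset_id": "service-payment-gateway"},
                      {"asset_id": "service-cart"}]},
    {"id": "service-cart", "owner": "team-orders"},
    {"id": "service-payment-gateway", "owner": "team-payments"},
]
print(cross_team_dependencies(registry))
```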
Extension 3: Ownership Cost Report
Integrate with billing. Show cost per team based on owned assets.
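The aggregation step of such a report might look like this. It assumes billing data has already been reduced to a mapping from asset id to cost; the function name, the "UNOWNED" bucket, and the sample figures are all illustrative.

```python
from collections import defaultdict


def cost_per_team(assets, cost_by_asset):
    """Sum asset costs by owning team; unowned assets go to 'UNOWNED'."""
    totals = defaultdict(float)
    for asset in assets:
        owner = asset.get("owner") or "UNOWNED"
        totals[owner] += cost_by_asset.get(asset["id"], 0.0)
    return dict(totals)


owned_assets = [
    {"id": "service-auth", "owner": "team-identity"},
    {"id": "bucket-audit", "owner": "team-identity"},
    {"id": "queue-notifications"},
]
monthly_cost = {
    "service-auth": 120.0,
    "bucket-audit": 30.5,
    "queue-notifications": 8.0,
}
print(cost_per_team(owned_assets, monthly_cost))
```

Surfacing an explicit "UNOWNED" line in the report gives orphaned resources a visible price tag, which can help motivate assigning them.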
Extension 4: Slack Integration
Slash command: /owner service-auth returns owner info in Slack.
9. Real-World Connections
How Big Tech Does This:
- Google: Service Mesh with mandatory owner metadata
- Netflix: Ownership tags on every AWS resource
- Spotify: Backstage service catalog with ownership
Tools and Platforms:
- Backstage: Open-source service catalog (catalog-info.yaml)
- OpsLevel: Commercial platform for service ownership maturity
- Cortex: Commercial internal developer portal
10. Resources
GitHub CODEOWNERS
Backstage
Articles
Related Projects
- P01: Team Interaction Audit - Map team relationships
- P11: Internal Service Catalog - Full catalog implementation
11. Self-Assessment Checklist
Before considering this project complete, verify:
- I can explain the difference between Responsible and Accountable
- teams.yaml has at least 5 teams with complete metadata
- assets.yaml has at least 20 assets mapped to owners
- owner-check --validate passes with no errors
- owner-check <asset> returns owner info in under 1 second
- There’s a CI check that validates on PR
- I’ve found and assigned at least one orphaned resource
12. Submission / Completion Criteria
This project is complete when you have:
- teams.yaml with 5+ active teams
- assets.yaml with 20+ mapped assets
- owner-check CLI with validate, query, and orphan commands
- CI integration blocking merges on validation errors
- Documentation explaining how to add new assets/teams
- One orphan resolution - found an unowned resource and assigned it