Project 27: Autonomy Boundaries and Self-Improvement Guardrails

Build policy controls that limit autonomous action by reversibility, risk, and economic consequence.

Quick Reference

Attribute	Value
Difficulty	Level 4: Expert
Time Estimate	10-18 hours
Language	Python (alt: TypeScript)
Prerequisites	Projects 6, 16, 22
Key Topics	autonomy levels, human checkpoints, safe adaptation, TCO analysis

Learning Objectives

Define explicit autonomy levels and transition rules.
Route irreversible/high-risk actions to human checkpoints.
Bound self-improvement loops with policy and eval gates.
Compare agent-vs-human economics including incident externalities.

The Core Question You’re Answering

“When should an agent act alone, and when must humans remain in the loop?”

Concepts You Must Understand First

Concept	Why It Matters	Where to Learn
Reversibility analysis	controls irreversible damage	risk management methods
Human checkpoint routing	keeps risk acceptable	approval workflow design
Safe adaptation boundaries	prevents policy drift	AI risk frameworks
Full-cost modeling	avoids false automation savings	operations economics

Theoretical Foundation

Task Risk + Reversibility + Confidence + Blast Radius -> Autonomy Policy -> Action or Human Review

Higher autonomy must be earned with measured evidence, such as eval pass rates and incident-free run history, and is never granted by default.
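One way to read the flow above is as a pure function from four task signals to an autonomy level. The sketch below is a minimal illustration, not a prescribed design: the `TaskSignals` fields, the 0-to-1 scales, the level names other than `review_only`, and every threshold are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSignals:
    risk: float           # 0.0 (benign) to 1.0 (severe)
    reversibility: float  # 0.0 (cannot be undone) to 1.0 (trivially undone)
    confidence: float     # 0.0 to 1.0, measured on evals, not self-reported
    blast_radius: float   # 0.0 (one record) to 1.0 (organization-wide)


def decide_autonomy(s: TaskSignals) -> str:
    """Map task signals to an autonomy level (all thresholds are illustrative)."""
    # Hard rule first: irreversible or severe actions always go to a human.
    if s.reversibility < 0.2 or s.risk > 0.8:
        return "review_only"
    # Otherwise let the worst single dimension drive the decision.
    exposure = max(s.risk, s.blast_radius, 1.0 - s.reversibility)
    if exposure > 0.5 or s.confidence < 0.7:
        return "act_with_approval"
    if exposure > 0.2 or s.confidence < 0.9:
        return "act_and_notify"
    return "full_autonomy"


if __name__ == "__main__":
    contract_update = TaskSignals(
        risk=0.9, reversibility=0.1, confidence=0.95, blast_radius=0.4
    )
    print(decide_autonomy(contract_update))  # review_only
```

Taking the maximum across dimensions, rather than an average, means one bad signal cannot be diluted by three good ones. High confidence alone never overrides the irreversibility rule.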

Project Specification

What You’ll Build

A boundary manager that:

Assigns autonomy levels
Applies mandatory approvals
Restricts adaptive changes
Produces economic decision reports

Functional Requirements

Risk/reversibility scoring engine
Human checkpoint router
Adaptation gate controls
Human-vs-agent cost model

Non-Functional Requirements

Auditable policy decisions
Unambiguous approval routing (each action maps to exactly one approver)
Clear rollback policy
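Auditability can be approximated with an append-only log whose entries carry the policy version that produced them. The sketch below assumes a hypothetical `Decision` record and adds SHA-256 hash chaining so that edits to past entries are detectable. The chaining is an extra technique used here for illustration, not something the requirements above mandate.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional

GENESIS = "0" * 64  # placeholder hash for the first entry


@dataclass(frozen=True)
class Decision:
    action: str
    autonomy_level: str
    approver: Optional[str]
    policy_version: str  # which policy matrix produced this decision


class AuditLog:
    """Append-only log; each entry's hash covers the previous entry's hash."""

    def __init__(self) -> None:
        self.entries = []
        self._last_hash = GENESIS

    def append(self, decision: Decision) -> str:
        payload = json.dumps(asdict(decision), sort_keys=True)
        digest = hashlib.sha256((self._last_hash + payload).encode()).hexdigest()
        self.entries.append(
            {"prev": self._last_hash, "payload": payload, "hash": digest}
        )
        self._last_hash = digest
        return digest

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev = GENESIS
        for entry in self.entries:
            expected = hashlib.sha256((prev + entry["payload"]).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True
```

A chain like this detects tampering but does not prevent it. In practice the log would also need to live somewhere the agent cannot write to directly.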

Real World Outcome

$ python p27_autonomy_guard.py --scenario "vendor_contract_update"
[risk] level=high reversibility=low
[autonomy] level=review_only
[checkpoint] legal_ops_required=true
[adaptation] online_learning_blocked=true
[economics] agent=$6.70 human=$14.20 incident_adjusted=agent_not_preferred

Architecture Overview

Policy Matrix -> Risk Scorer -> Approval Router -> Action Dispatcher -> Audit Log

Implementation Guide

Phase 1: Policy Matrix

Define autonomy levels and allowed actions.
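A policy matrix can start as a small, finite enum plus a deny-by-default lookup. The sketch below is one possible shape: the four level names, the action names, and the promotion thresholds (98% eval pass rate, 200 incident-free runs) are all hypothetical values picked for the example.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    REVIEW_ONLY = 0        # agent proposes, a human executes
    ACT_WITH_APPROVAL = 1  # agent executes after explicit sign-off
    ACT_AND_NOTIFY = 2     # agent executes, a human is informed afterwards
    FULL_AUTONOMY = 3      # agent executes with audit logging only


# Hypothetical action names; each level lists everything it may do.
POLICY_MATRIX = {
    AutonomyLevel.REVIEW_ONLY: frozenset({"draft_change"}),
    AutonomyLevel.ACT_WITH_APPROVAL: frozenset({"draft_change", "update_record"}),
    AutonomyLevel.ACT_AND_NOTIFY: frozenset(
        {"draft_change", "update_record", "send_reminder"}
    ),
    AutonomyLevel.FULL_AUTONOMY: frozenset(
        {"draft_change", "update_record", "send_reminder", "tag_ticket"}
    ),
}


def is_allowed(level: AutonomyLevel, action: str) -> bool:
    """Deny by default: unlisted levels and unlisted actions are rejected."""
    return action in POLICY_MATRIX.get(level, frozenset())


def next_level(
    current: AutonomyLevel,
    eval_pass_rate: float,
    incident_free_runs: int,
    had_incident: bool,
) -> AutonomyLevel:
    """Transition rule: promote one step on evidence, reset on any incident."""
    if had_incident:
        return AutonomyLevel.REVIEW_ONLY
    earned = eval_pass_rate >= 0.98 and incident_free_runs >= 200
    if earned and current < AutonomyLevel.FULL_AUTONOMY:
        return AutonomyLevel(current + 1)
    return current
```

Promotion moves one level at a time while demotion drops straight to `REVIEW_ONLY`. That asymmetry is a deliberate choice in this sketch: trust is slow to build and quick to lose.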

Phase 2: Runtime Routing

Add approval and dispatch controls.
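The important property of this phase is that the approval check sits inside the dispatch path, so there is no way to reach a handler without passing it. A minimal sketch follows, assuming string level names and a hypothetical `CHECKPOINTS` mapping. Only the `vendor_contract_update` to `legal_ops` pairing is taken from the sample output in this project.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Levels at which the agent may not execute without a recorded approval.
APPROVAL_LEVELS = {"review_only", "act_with_approval"}

# Hypothetical mapping from action name to the approver queue that owns it.
CHECKPOINTS = {"vendor_contract_update": "legal_ops"}
DEFAULT_APPROVER = "duty_reviewer"  # assumed fallback queue


class ApprovalRequired(Exception):
    """Raised when an action reaches dispatch without the required approval."""


@dataclass(frozen=True)
class Action:
    name: str
    autonomy_level: str
    approved_by: Optional[str] = None


def required_approver(action: Action) -> Optional[str]:
    """Return the single approver queue for this action, or None."""
    if action.autonomy_level in APPROVAL_LEVELS:
        return CHECKPOINTS.get(action.name, DEFAULT_APPROVER)
    return None


def dispatch(action: Action, handler: Callable[[Action], str]) -> str:
    """Run the handler only if the approval requirement is satisfied."""
    approver = required_approver(action)
    if approver is not None and action.approved_by != approver:
        raise ApprovalRequired(f"{action.name} needs approval from {approver}")
    return handler(action)
```

Because `required_approver` returns exactly one queue per action, routing stays unambiguous. An approval from the wrong queue is treated the same as no approval.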

Phase 3: Economic Layer

Add an incident-adjusted total cost of ownership (TCO) model.
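A simple way to adjust for incidents is to add the expected incident cost (probability times cost) to the direct cost per task. The sketch below reuses the $6.70 and $14.20 direct costs from the sample output. The incident probabilities and the $500 incident cost are invented for illustration and would need to come from your own incident data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostProfile:
    direct_cost: float           # cost per task, in dollars
    incident_probability: float  # chance that a task causes an incident
    incident_cost: float         # average cost of one incident, in dollars
    maintenance_per_task: float = 0.0  # amortized upkeep (evals, monitoring)

    def expected_cost(self) -> float:
        return (
            self.direct_cost
            + self.maintenance_per_task
            + self.incident_probability * self.incident_cost
        )


def preferred(agent: CostProfile, human: CostProfile) -> str:
    """Pick the cheaper option on expected cost; ties go to the human."""
    return "agent" if agent.expected_cost() < human.expected_cost() else "human"


if __name__ == "__main__":
    # Direct costs from the sample output; incident figures are hypothetical.
    agent = CostProfile(direct_cost=6.70, incident_probability=0.02, incident_cost=500.0)
    human = CostProfile(direct_cost=14.20, incident_probability=0.002, incident_cost=500.0)
    print(preferred(agent, human))  # human
```

With these assumed figures the agent's expected cost is about $16.70 against about $15.20 for the human, so the direct-cost advantage disappears once incidents are priced in.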

Testing Strategy

High-risk action routing tests
Policy bypass tests
Adaptation drift tests
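Routing and bypass tests can be written as plain assertion functions that a runner such as pytest would collect. The sketch below covers only the first two items in the list and tests a tiny stand-in `authorize` function. The function, the `APPROVERS` mapping, and the action names are hypothetical placeholders for your real guard.

```python
from typing import Optional

# Hypothetical stand-in for the guard under test.
APPROVERS = {"wire_transfer": "finance_ops"}
AGENT_ID = "agent"


def authorize(action: str, approved_by: Optional[str]) -> bool:
    """Allow an action only when its designated approver has signed off."""
    required = APPROVERS.get(action)
    if required is None:
        return False  # unknown actions are denied by default
    return approved_by == required


def test_high_risk_requires_approval():
    assert not authorize("wire_transfer", None)


def test_agent_cannot_self_approve():
    # Bypass attempt: the agent names itself as the approver.
    assert not authorize("wire_transfer", AGENT_ID)


def test_unknown_action_is_denied():
    # Bypass attempt: an action that is missing from the policy.
    assert not authorize("wire_transfer_v2", "finance_ops")


def test_designated_approver_passes():
    assert authorize("wire_transfer", "finance_ops")
```

The bypass cases matter more than the happy path here: each one encodes a specific way a checkpoint could be present on paper yet skippable in practice.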

Common Pitfalls & Debugging

Pitfall	Symptom	Fix
Cosmetic checkpoints	risky actions still execute	block dispatch until the required approval is recorded
Optimistic economics	over-automation	include incident and maintenance costs
Unbounded adaptation	policy drift	gate every adaptation change
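Gating every adaptation change can be sketched as a single function that each proposed change must pass before it is applied. The example below assumes three hypothetical checks: an allowlist of mutable parameters, a maximum step size per parameter, and a no-regression rule against a baseline eval score. The parameter names and limits are made up for illustration.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical allowlist: parameter name -> largest allowed change per update.
MUTABLE_PARAMETERS = {"retry_limit": 2.0, "summary_length": 50.0}


@dataclass(frozen=True)
class AdaptationProposal:
    parameter: str
    old_value: float
    new_value: float
    eval_score: float  # candidate's score on a held-out eval suite


def gate(
    proposal: AdaptationProposal, baseline_score: float, min_gain: float = 0.0
) -> Tuple[bool, str]:
    """Return (accepted, reason); the first failing check decides the reason."""
    max_step = MUTABLE_PARAMETERS.get(proposal.parameter)
    if max_step is None:
        # Policy-level settings (e.g. approval thresholds) are never self-tuned.
        return False, "parameter_not_mutable"
    if abs(proposal.new_value - proposal.old_value) > max_step:
        return False, "step_too_large"
    if proposal.eval_score < baseline_score + min_gain:
        return False, "eval_regression"
    return True, "accepted"
```

Keeping the guardrail parameters themselves off the allowlist is what stops the loop from loosening its own limits. The step bound slows drift even when each individual change passes evals.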

Interview Questions They’ll Ask

How do you define irreversibility for an automated action?
What should always require human approval?
How do you constrain self-improving loops?
How do you compare agent and human total cost?

Hints in Layers

Hint 1: Keep autonomy levels finite and explicit.
Hint 2: Treat approvals as workflow primitives.
Hint 3: Require policy version tags on decisions.
Hint 4: Stress-test economics with incident scenarios.

Submission / Completion Criteria

Minimum Completion

Enforced autonomy matrix + checkpoint routing

Full Completion

Adaptation limits + incident-adjusted economics

Excellence

Governance-ready autonomy policy pack