Project 38: AI Development Pipeline - Full Lifecycle Automation
Build an end-to-end development pipeline in which Claude Code handles issue triage, implementation, testing, code review, documentation, and deployment, with human checkpoints at critical stages.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Master |
| Time Estimate | 2+ months |
| Languages | TypeScript (Alternatives: Python, Go) |
| Prerequisites | All previous projects, CI/CD experience |
| Key Topics | CI/CD, GitOps, Human-in-the-Loop, Feature Flags |
| Knowledge Area | DevOps / AI-Native Development |
| Software/Tools | Claude Code, GitHub Actions, CI/CD pipelines |
| Coolness Level | Level 5: Pure Magic (Super Cool) |
| Business Potential | 5. The “Industry Disruptor” |
1. Project Overview
This is the “AI teammate” vision realized. You will integrate every Claude Code capability into a cohesive workflow that augments human developers rather than replacing them. The goal is not full automation - it is intelligent automation with human oversight at critical decision points.
What makes this Master level:
- End-to-end system integration
- Production safety concerns
- Human workflow integration
- Organizational change management
- Trust calibration for AI systems
2. Real World Outcome
You will have an AI-augmented development pipeline that handles routine work while keeping humans in control of critical decisions:
GitHub Issue Created:
+------------------------------------------------------------------+
| Issue #234: Add dark mode support to user settings |
+------------------------------------------------------------------+
| |
| Description: |
| Users have requested the ability to toggle dark mode in their |
| profile settings. This should persist across sessions and |
| respect system preferences. |
| |
| Labels: enhancement, frontend |
| Assignee: claude-pipeline-bot |
| |
+------------------------------------------------------------------+
Pipeline Execution:
+------------------------------------------------------------------+
| CLAUDE AI DEVELOPMENT PIPELINE |
+------------------------------------------------------------------+
| Trigger: Issue #234 |
| Started: 2024-01-15 09:00:00 |
+------------------------------------------------------------------+
PHASE 1: UNDERSTANDING (2 min)
+------------------------------------------------------------------+
| Status: COMPLETE |
| |
| - Parsed issue requirements |
| - Analyzed existing codebase for patterns |
| - Identified affected components: |
| - /src/components/Settings/ThemeToggle.tsx (new) |
| - /src/contexts/ThemeContext.tsx (new) |
| - /src/styles/themes/ (new directory) |
| - /src/components/Layout.tsx (modify) |
| - Created implementation plan |
| |
| Confidence: HIGH (similar patterns exist in codebase) |
+------------------------------------------------------------------+
PHASE 2: IMPLEMENTATION (8 min)
+------------------------------------------------------------------+
| Status: COMPLETE |
| |
| - Created branch: feature/issue-234-dark-mode |
| - Implemented ThemeContext with system preference detection |
| - Created ThemeToggle component |
| - Added dark theme CSS variables |
| - Updated Layout.tsx to use theme context |
| - Added localStorage persistence |
| |
| Files changed: 6 |
| Lines added: 347 |
+------------------------------------------------------------------+
PHASE 3: TESTING (5 min)
+------------------------------------------------------------------+
| Status: COMPLETE |
| |
| - Generated unit tests for ThemeContext |
| - Generated component tests for ThemeToggle |
| - Ran existing test suite |
| |
| Test Results: |
| Total: 59 |
| Passed: 59 |
| Failed: 0 |
| New: 12 |
+------------------------------------------------------------------+
PHASE 4: SELF-REVIEW (3 min)
+------------------------------------------------------------------+
| Status: COMPLETE |
| |
| Checks performed: |
| [PASS] Code follows project conventions |
| [PASS] No hardcoded values |
| [PASS] Accessibility (WCAG AA compliance) |
| [PASS] No security issues detected |
| [PASS] JSDoc documentation added |
| [PASS] No console.log statements |
| [PASS] Error handling implemented |
| |
| Recommendations: None |
+------------------------------------------------------------------+
PHASE 5: PR CREATION (1 min)
+------------------------------------------------------------------+
| Status: COMPLETE |
| |
| Created PR #89: |
| Title: "feat: Add dark mode support (closes #234)" |
| |
| PR contains: |
| - Implementation summary |
| - Screenshots of light/dark modes |
| - Test coverage report |
| - Breaking changes: None |
| |
| Requested reviewers: @frontend-team |
+------------------------------------------------------------------+
+------------------------------------------------------------------+
| HUMAN CHECKPOINT REQUIRED |
+------------------------------------------------------------------+
| |
| PR #89 awaiting human approval before merge. |
| |
| Reviewer: @frontend-team |
| Link: https://github.com/org/repo/pull/89 |
| |
| Actions available: |
| - Approve: Pipeline continues to deployment |
| - Request changes: Pipeline pauses for revision |
| - Close: Pipeline terminated |
| |
+------------------------------------------------------------------+
[After human approval and merge...]
PHASE 6: DEPLOYMENT (2 min)
+------------------------------------------------------------------+
| Status: COMPLETE |
| |
| - Merged to main |
| - CI/CD triggered |
| - Deployed to staging |
| - Smoke tests passed |
| - Feature flag: dark_mode_enabled = true (5% rollout) |
| |
| Staging URL: https://staging.example.com |
+------------------------------------------------------------------+
+------------------------------------------------------------------+
| PIPELINE SUMMARY |
+------------------------------------------------------------------+
| Total time: | 21 minutes |
| Files changed: | 6 |
| Lines added: | 347 |
| Lines removed: | 12 |
| Tests added: | 12 |
| Human interventions: | 1 (PR approval) |
| Issue #234: | RESOLVED |
+------------------------------------------------------------------+
3. The Core Question You Are Answering
“How do you build a development workflow where AI handles routine work while humans make critical decisions?”
This is not about replacing developers. It is about eliminating toil - the repetitive tasks that drain energy - so humans can focus on architecture, product decisions, and creative problem-solving.
Key balance to achieve:
- AI handles: boilerplate, tests, documentation, formatting, simple bug fixes
- Humans handle: architecture, product decisions, security review, deployment approval
4. Concepts You Must Understand First
Stop and research these before coding:
4.1 Pipeline Design
Questions to answer:
- What are the stages of software delivery?
- Where are natural checkpoints for human review?
- How do you handle pipeline failures?
+-----------------------------------------------------------------------+
|                      SOFTWARE DELIVERY PIPELINE                       |
+-----------------------------------------------------------------------+
|                                                                       |
|  Issue Created                                                        |
|        |                                                              |
|        v                                                              |
|  +-----------+     +-----------+     +-----------+     +-----------+  |
|  |  TRIAGE   | --> | IMPLEMENT | --> |   TEST    | --> |  REVIEW   |  |
|  +-----------+     +-----------+     +-----------+     +-----------+  |
|  AI-driven         AI-driven         AI-driven         HUMAN          |
|  scope check       coding            generation        checkpoint     |
|                                                              |        |
|        +-----------------------------------------------------+        |
|        v                                                              |
|  +-----------+     +-----------+     +-----------+     +-----------+  |
|  |   MERGE   | --> |  DEPLOY   | --> |  MONITOR  | --> | ROLLBACK? |  |
|  +-----------+     +-----------+     +-----------+     +-----------+  |
|  Auto after        AI-driven         AI-driven         HUMAN          |
|  approval          progressive       alerting          decision       |
|                                                                       |
+-----------------------------------------------------------------------+
Reference: “Continuous Delivery” by Humble & Farley - Chapters 5-6
4.2 Issue Understanding
Questions to answer:
- How do you parse natural language requirements?
- What makes an issue “actionable” for AI?
- How do you handle ambiguous requirements?
+-----------------------------------------------------------------------+
| ISSUE ACTIONABILITY SPECTRUM |
+-----------------------------------------------------------------------+
| |
|  FULLY ACTIONABLE        NEEDS CLARIFICATION      NOT ACTIONABLE      |
|  (AI can proceed)        (AI should ask)          (Human required)    |
|                                                                       |
|  "Fix typo in README     "Improve performance"    "Redesign the       |
|   line 42: 'teh' ->                                architecture"      |
|   'the'"                 "Add user feedback                           |
|                           feature"                "Investigate why    |
|  "Add input validation                             users are          |
|   to email field in      "Make the dashboard       churning"          |
|   signup form"            faster"                                     |
|                                                   "Should we use      |
|  "Update React from      "Handle edge cases        microservices?"    |
|   17.0.1 to 17.0.2"       better"                                     |
|                                                                       |
+-----------------------------------------------------------------------+
| |
| AI Detection Signals: |
| |
| - Specific file/line references --> More actionable |
| - Concrete examples --> More actionable |
| - Vague adjectives ("better", "faster") --> Less actionable |
| - Questions or uncertainties --> Needs clarification |
| - Strategic/architectural terms --> Human required |
| |
+-----------------------------------------------------------------------+
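The detection signals above can be approximated with a keyword heuristic before any model call. The following is a minimal sketch, not the pipeline's actual classifier: the function name, the keyword lists, and the regular expressions are all assumptions chosen for illustration. It defaults to "needs-clarification" when it finds no concrete anchor, so it will under-classify some issues that a human would call actionable.

```typescript
type Actionability = "actionable" | "needs-clarification" | "human-required";

// Hypothetical keyword lists; tune them against your own issue history.
const VAGUE_TERMS = ["better", "faster", "improve", "cleaner"];
const STRATEGIC_TERMS = ["architecture", "redesign", "microservices", "strategy"];

function assessActionability(title: string, body: string): Actionability {
  const text = `${title}\n${body}`.toLowerCase();
  const hasWord = (w: string): boolean => new RegExp(`\\b${w}\\b`).test(text);

  // Strategic or architectural terms route to a human regardless of other signals.
  if (STRATEGIC_TERMS.some(hasWord)) return "human-required";

  // Concrete anchors: a file name with a known extension, a "line N"
  // reference, an explicit version number, or a before -> after example.
  const hasFileRef = /\b[\w./-]+\.(tsx|ts|jsx|js|py|go|md|json|ya?ml)\b/.test(text);
  const hasLineRef = /\bline\s+\d+\b/.test(text);
  const hasVersion = /\b\d+\.\d+\.\d+\b/.test(text);
  const hasExample = /'[^']+'\s*->\s*'[^']+'/.test(text);
  const isConcrete = hasFileRef || hasLineRef || hasVersion || hasExample;

  // Vague adjectives or open questions without an anchor need clarification.
  const isVague = VAGUE_TERMS.some(hasWord) || text.includes("?");
  if (isVague && !isConcrete) return "needs-clarification";

  // Conservative default: without a concrete anchor, ask rather than proceed.
  return isConcrete ? "actionable" : "needs-clarification";
}

console.log(assessActionability("Fix typo in README line 42: 'teh' -> 'the'", ""));
console.log(assessActionability("Make the dashboard faster", ""));
console.log(assessActionability("Should we use microservices?", ""));
```

A heuristic like this is cheap enough to run on every issue; its result could be passed to the model as one more signal rather than treated as the final word.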
4.3 Safe Deployment
Questions to answer:
- What is progressive delivery?
- How do you implement feature flags?
- What monitoring do you need?
+-----------------------------------------------------------------------+
| PROGRESSIVE DELIVERY STAGES |
+-----------------------------------------------------------------------+
| |
| STAGE 1: INTERNAL (0.1%) |
| +-------------------------------------------------------------------+|
| | Deploy to internal users only ||
| | Monitor: Error rates, performance ||
| | Duration: 1 hour ||
| | Rollback trigger: >1% error rate ||
| +-------------------------------------------------------------------+|
| | |
| | All metrics healthy |
| v |
| STAGE 2: CANARY (5%) |
| +-------------------------------------------------------------------+|
| | Deploy to 5% of production traffic ||
| | Monitor: All metrics + user behavior ||
| | Duration: 4 hours ||
| | Rollback trigger: >0.5% error rate OR user complaints ||
| +-------------------------------------------------------------------+|
| | |
| | All metrics healthy |
| v |
| STAGE 3: GRADUAL (25% -> 50% -> 75%) |
| +-------------------------------------------------------------------+|
| | Increase traffic in steps ||
| | Monitor: All metrics + business metrics ||
| | Duration: 24 hours per step ||
| | Rollback trigger: Any regression ||
| +-------------------------------------------------------------------+|
| | |
| | All metrics healthy |
| v |
| STAGE 4: FULL (100%) |
| +-------------------------------------------------------------------+|
| | Full production deployment ||
| | Keep feature flag for emergency rollback ||
| | Remove flag after 1 week stable ||
| +-------------------------------------------------------------------+|
| |
+-----------------------------------------------------------------------+
Reference: “Accelerate” by Forsgren et al. - Chapters 2-4
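The staged rollout above can be modeled as a table of stages plus a function that decides whether to hold, advance, or roll back. This is a sketch under assumptions: the type and function names are hypothetical, the numeric error threshold for the gradual stages is an assumption (the diagram only says "any regression"), and the one-week minimum on the final stage stands in for the "remove flag after 1 week stable" rule.

```typescript
interface RolloutStage {
  name: string;
  trafficPercent: number;
  maxErrorRate: number;      // fraction: 0.01 means 1%
  minDurationHours: number;
}

const STAGES: RolloutStage[] = [
  { name: "internal", trafficPercent: 0.1, maxErrorRate: 0.01, minDurationHours: 1 },
  { name: "canary", trafficPercent: 5, maxErrorRate: 0.005, minDurationHours: 4 },
  // Assumption: gradual steps reuse the canary threshold as a proxy for "any regression".
  { name: "gradual-25", trafficPercent: 25, maxErrorRate: 0.005, minDurationHours: 24 },
  { name: "gradual-50", trafficPercent: 50, maxErrorRate: 0.005, minDurationHours: 24 },
  { name: "gradual-75", trafficPercent: 75, maxErrorRate: 0.005, minDurationHours: 24 },
  // Assumption: 168 hours (one week) of stability before the flag is removed.
  { name: "full", trafficPercent: 100, maxErrorRate: 0.005, minDurationHours: 168 },
];

type RolloutDecision = "hold" | "advance" | "rollback" | "done";

function evaluateStage(
  stageIndex: number,
  observedErrorRate: number,
  hoursElapsed: number
): RolloutDecision {
  const stage = STAGES[stageIndex];
  if (stage === undefined) throw new RangeError(`unknown stage index ${stageIndex}`);

  // A breached threshold wins over everything else, including elapsed time.
  if (observedErrorRate > stage.maxErrorRate) return "rollback";
  // Healthy, but the stage has not soaked for long enough yet.
  if (hoursElapsed < stage.minDurationHours) return "hold";
  return stageIndex === STAGES.length - 1 ? "done" : "advance";
}

console.log(evaluateStage(0, 0.02, 0.5));  // error rate above the 1% internal limit
console.log(evaluateStage(1, 0.001, 2));   // canary healthy but only 2 of 4 hours elapsed
console.log(evaluateStage(1, 0.001, 4));   // canary healthy and soaked
```

Keeping the thresholds in data rather than in code makes it easier to review and audit changes to the rollout policy.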
4.4 Human-in-the-Loop Design
Questions to answer:
- When should AI pause for human input?
- How do you present choices to humans?
- How do you handle human unavailability?
Reference: “Human + Machine” by Daugherty & Wilson - Chapters 5-6
5. Questions to Guide Your Design
Before implementing, think through these:
5.1 Scope Boundaries
- What types of issues should the pipeline handle?
  - Documentation updates
  - Bug fixes with clear reproduction steps
  - Feature additions with clear specifications
  - Dependency updates
- What should always require human implementation?
  - Security-sensitive changes
  - Database schema changes
  - Breaking API changes
  - Performance-critical code
- How do you detect out-of-scope issues?
  - Keyword detection
  - Complexity estimation
  - File sensitivity analysis
5.2 Quality Gates
- What checks must pass before each phase?
+-----------------------------------------------------------------------+
| QUALITY GATES BY PHASE |
+-----------------------------------------------------------------------+
| |
| BEFORE IMPLEMENTATION: |
| [ ] Issue is labeled ai-eligible |
| [ ] No blocking issues linked |
| [ ] Affected files are not in restricted list |
| [ ] Estimated complexity is within bounds |
| |
| BEFORE TESTING: |
| [ ] Code compiles without errors |
| [ ] Linting passes |
| [ ] No new security vulnerabilities |
| [ ] Changes are within expected scope |
| |
| BEFORE PR CREATION: |
| [ ] All tests pass |
| [ ] Coverage meets threshold |
| [ ] Self-review passed |
| [ ] Documentation updated |
| |
| BEFORE DEPLOYMENT: |
| [ ] Human approval received |
| [ ] No merge conflicts |
| [ ] CI pipeline passed |
| [ ] Feature flag configured |
| |
+-----------------------------------------------------------------------+
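One way to express a phase's checklist in code is as a list of named checks that are all evaluated, so a failing run reports every unmet gate at once. The sketch below is illustrative only: the `Gate` and `GateReport` types, the `runGates` function, and the stubbed check results (including the coverage numbers) are assumptions, not part of any existing tool.

```typescript
interface Gate {
  name: string;
  check: () => boolean;
}

interface GateReport {
  passed: boolean;
  failures: string[];
}

// Evaluate every gate without short-circuiting so the report is complete.
function runGates(gates: Gate[]): GateReport {
  const failures = gates.filter((g) => !g.check()).map((g) => g.name);
  return { passed: failures.length === 0, failures };
}

// Hypothetical measurements; a real pipeline would read these from CI output.
const measuredCoverage = 0.78;
const requiredCoverage = 0.8;

const beforePrCreation: Gate[] = [
  { name: "All tests pass", check: () => true },
  { name: "Coverage meets threshold", check: () => measuredCoverage >= requiredCoverage },
  { name: "Self-review passed", check: () => true },
  { name: "Documentation updated", check: () => true },
];

const report = runGates(beforePrCreation);
console.log(report.passed ? "Gates passed" : `Blocked by: ${report.failures.join(", ")}`);
```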
- How strict should automated review be?
  - Conservative: Flag anything uncertain
  - Balanced: Allow minor issues
  - Aggressive: Only block on errors
- What thresholds trigger human escalation?
  - Confidence below 80%
  - Changes to more than 10 files
  - Changes to security-sensitive paths
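The three escalation thresholds can be checked in one function that returns the reasons for escalating, which also gives the audit log something concrete to record. This is a minimal sketch: the type name, the function name, and the list of sensitive directory names are assumptions for illustration.

```typescript
interface ProposedChange {
  confidence: number;       // 0..1, as reported by the triage step
  changedFiles: string[];   // repository-relative paths
}

// Hypothetical directory names treated as security-sensitive.
const SENSITIVE_DIRS = new Set(["auth", "security", "billing", "migrations"]);

// Returns every reason to escalate; an empty array means no escalation is needed.
function escalationReasons(change: ProposedChange): string[] {
  const reasons: string[] = [];

  if (change.confidence < 0.8) {
    reasons.push(`confidence ${change.confidence} is below 0.8`);
  }
  if (change.changedFiles.length > 10) {
    reasons.push(`${change.changedFiles.length} files changed (limit is 10)`);
  }

  // A path is sensitive if any of its directory segments is on the list.
  const sensitive = change.changedFiles.filter((file) =>
    file.split("/").some((segment) => SENSITIVE_DIRS.has(segment))
  );
  if (sensitive.length > 0) {
    reasons.push(`sensitive paths touched: ${sensitive.join(", ")}`);
  }
  return reasons;
}

console.log(escalationReasons({ confidence: 0.92, changedFiles: ["docs/setup.md"] }));
console.log(escalationReasons({ confidence: 0.7, changedFiles: ["src/auth/login.ts"] }));
```

Returning reasons instead of a boolean keeps the decision explainable when a reviewer asks why an issue was handed back to a human.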
5.3 Rollback Strategy
- What happens if deployment fails?
  - Automatic rollback to previous version
  - Feature flag disabled
  - Alert sent to on-call
- How do you revert AI-generated changes?
  - Git revert of merge commit
  - Issue reopened with context
  - Learning feedback captured
- What is the blast radius of a bad change?
  - Feature flag limits impact
  - Canary deployment catches issues early
  - Monitoring detects regressions
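A feature flag limits blast radius only if the same user consistently lands on the same side of the flag. One common way to get that is deterministic bucketing: hash the flag name and user id into a fixed range and compare against the rollout percentage. The sketch below assumes an FNV-1a-style hash over UTF-16 code units and hypothetical function names; a hosted flag service would normally replace it.

```typescript
// FNV-1a-style 32-bit hash over the string's UTF-16 code units.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// rolloutPercent is 0..100 and may be fractional (for example 0.1 for internal users).
function isFlagEnabled(flag: string, userId: string, rolloutPercent: number): boolean {
  if (rolloutPercent <= 0) return false;   // acts as the emergency kill switch
  if (rolloutPercent >= 100) return true;

  // 10,000 buckets give a resolution of 0.01 percentage points.
  const bucket = fnv1a(`${flag}:${userId}`) % 10000;
  return bucket < rolloutPercent * 100;
}

// The same user gets the same answer on every call, and raising the
// percentage only ever adds users, it never removes them.
console.log(isFlagEnabled("dark_mode_enabled", "user-42", 5));
console.log(isFlagEnabled("dark_mode_enabled", "user-42", 100));
```

Including the flag name in the hash input means different flags pick different 5% slices of users, so the same early adopters are not hit by every experiment.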
6. Thinking Exercise: Map Issue Types to Automation Level
Consider these issue types and categorize them:
Issue Categories
+-----------------------------------------------------------------------+
| AUTOMATION LEVEL MATRIX |
+-----------------------------------------------------------------------+
| |
| Issue Type | Automation | Human Role |
| ------------------------------|------------|--------------------------|
| "Fix typo in README" | FULL | None (auto-merge) |
| "Add input validation" | HIGH | PR review |
| "Upgrade React 17 -> 18" | MEDIUM | Planning + review |
| "Redesign dashboard" | LOW | Architecture + review |
| "Investigate perf issue" | ASSIST | Investigation partner |
| |
+-----------------------------------------------------------------------+
Signals for Automation Level
+-----------------------------------------------------------------------+
| AUTOMATION DECISION TREE |
+-----------------------------------------------------------------------+
| |
|  Does issue have clear scope?                                         |
|       |                                                               |
|       +-- NO  --> Clarify with issue author                           |
|       |                                                               |
|       +-- YES --> Does it touch sensitive files?                      |
|                        |                                              |
|                        +-- YES --> Human implements                   |
|                        |                                              |
|                        +-- NO  --> Is it within complexity bounds?    |
|                                         |                             |
|                                         +-- NO  --> Human implements  |
|                                         |                             |
|                                         +-- YES --> AI implements     |
|                                                          |            |
|                                                          v            |
|                                                   Human reviews PR    |
| |
+-----------------------------------------------------------------------+
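The decision tree translates directly into a function of three yes/no facts. The sketch below uses hypothetical type and function names; how each fact is determined (scope analysis, path checks, complexity estimation) is left to the triage phase.

```typescript
type Route =
  | "clarify-with-author"
  | "human-implements"
  | "ai-implements-human-reviews";

interface TriageFacts {
  hasClearScope: boolean;
  touchesSensitiveFiles: boolean;
  withinComplexityBounds: boolean;
}

// Each early return corresponds to one branch of the tree, top to bottom.
function routeIssue(facts: TriageFacts): Route {
  if (!facts.hasClearScope) return "clarify-with-author";
  if (facts.touchesSensitiveFiles) return "human-implements";
  if (!facts.withinComplexityBounds) return "human-implements";
  return "ai-implements-human-reviews";
}

console.log(
  routeIssue({ hasClearScope: true, touchesSensitiveFiles: false, withinComplexityBounds: true })
);
```

Note that no branch ends in an unreviewed merge: even the most automated route finishes with a human reviewing the PR.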
Questions to Answer
For each issue type:
- What signals indicate automation level?
- Where should human checkpoints be?
- What could go wrong with full automation?
7. The Interview Questions They Will Ask
Prepare to answer these:
- “How do you prevent the AI from shipping bugs to production?”
  Think about: Multiple quality gates, human review, feature flags, progressive rollout, automated testing, monitoring and rollback
- “What is your strategy for handling security-sensitive changes?”
  Think about: File-based restrictions, mandatory human review, security scanning, audit logging, principle of least privilege
- “How do you measure the quality of AI-generated code?”
  Think about: Test coverage, code review feedback, production errors, technical debt metrics, developer satisfaction
- “What happens when the pipeline makes a mistake?”
  Think about: Rollback procedures, learning from failures, improving detection, adjusting thresholds
- “How do you train developers to work with AI teammates?”
  Think about: Gradual introduction, clear ownership boundaries, feedback mechanisms, trust calibration
8. Hints in Layers
Only read when stuck:
Hint 1: Start with Low-Risk Issues
Begin with documentation updates, test additions, and minor fixes. Build trust before expanding scope.
# Initial scope - maximum safety
ai-eligible-types:
  - documentation
  - typo-fix
  - test-addition
  - dependency-patch
restricted-paths:
  - "**/auth/**"
  - "**/security/**"
  - "**/billing/**"
  - "**/migrations/**"
  - "**/*.env*"
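To enforce the restricted paths, the pipeline needs to match changed files against those glob patterns. The sketch below is a deliberately small, dependency-free matcher that supports only `*` and `**`; the function names are hypothetical, and a production pipeline would more likely use an established glob library, whose handling of dotfiles and edge cases may differ from this one.

```typescript
// Convert a glob with "*" and "**" into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  let re = "";
  for (let i = 0; i < glob.length; i++) {
    const c = glob.charAt(i);
    if (c === "*" && glob.charAt(i + 1) === "*") {
      if (glob.charAt(i + 2) === "/") {
        re += "(?:.*/)?";   // "**/" matches zero or more leading directories
        i += 2;
      } else {
        re += ".*";         // a bare "**" matches anything, including "/"
        i += 1;
      }
    } else if (c === "*") {
      re += "[^/]*";        // "*" stays within a single path segment
    } else {
      re += c.replace(/[.+?^${}()|[\]\\]/g, "\\$&");  // escape regex metacharacters
    }
  }
  return new RegExp(`^${re}$`);
}

const RESTRICTED = [
  "**/auth/**",
  "**/security/**",
  "**/billing/**",
  "**/migrations/**",
  "**/*.env*",
].map(globToRegExp);

// Returns the subset of changed files that the pipeline must not touch.
function restrictedFiles(changedFiles: string[]): string[] {
  return changedFiles.filter((file) => RESTRICTED.some((re) => re.test(file)));
}

console.log(
  restrictedFiles(["docs/readme.md", "src/auth/login.ts", ".env.production", "src/author/bio.ts"])
);
```

Running this check both before implementation (on the planned files) and after it (on the actual diff) catches the case where the model strays outside its declared scope.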
Hint 2: GitHub Actions Integration
Use GitHub Actions as the orchestration layer. Claude Code headless runs as action steps.
name: AI Development Pipeline
on:
  issues:
    types: [opened, labeled]
jobs:
  triage:
    if: contains(github.event.issue.labels.*.name, 'ai-eligible')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Analyze Issue
        env:
          # Pass untrusted issue text through an environment variable rather than
          # interpolating it into the script, which would allow shell injection.
          ISSUE_BODY: ${{ github.event.issue.body }}
        run: |
          claude -p "Analyze this issue and create implementation plan:
          $ISSUE_BODY" \
            --output-format json > plan.json
Hint 3: Conservative Defaults
Default to human review. Only skip it for well-understood, low-risk changes.
const DEFAULT_POLICY = {
  requireHumanReview: true,
  requireApprovalCount: 1,
  autoMergeEnabled: false,
  featureFlagRequired: true
};
// Only relax for known-safe patterns
const RELAXED_POLICY = {
  patterns: [
    { path: 'docs/**', autoMerge: true },
    { path: '**/*.test.ts', requireApprovalCount: 0 }
  ]
};
Hint 4: Audit Everything
Log every decision the pipeline makes. You will need this for debugging and trust-building.
interface AuditEntry {
  timestamp: Date;
  phase: string;
  decision: string;
  reasoning: string;
  inputs: Record<string, any>;
  outputs: Record<string, any>;
  humanOverride?: {
    by: string;
    reason: string;
  };
}
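A simple way to start is an append-only log serialized as JSON Lines, one entry per line, which is easy to ship to a file or log store later. The sketch below keeps entries in memory and is an illustration only: the `AuditLog` class and its methods are hypothetical, and the timestamp is stored as an ISO 8601 string (rather than a `Date`) so entries survive a JSON round-trip.

```typescript
interface StoredAuditEntry {
  timestamp: string;   // ISO 8601
  phase: string;
  decision: string;
  reasoning: string;
  inputs: Record<string, unknown>;
  outputs: Record<string, unknown>;
  humanOverride?: { by: string; reason: string };
}

class AuditLog {
  private readonly lines: string[] = [];

  // Entries are only ever appended; nothing is edited or removed.
  record(entry: Omit<StoredAuditEntry, "timestamp">, now: Date = new Date()): StoredAuditEntry {
    const stamped: StoredAuditEntry = { timestamp: now.toISOString(), ...entry };
    this.lines.push(JSON.stringify(stamped));
    return stamped;
  }

  toJsonLines(): string {
    return this.lines.join("\n");
  }

  // Entries where a human overrode the pipeline, useful for trust calibration.
  overrides(): StoredAuditEntry[] {
    return this.lines
      .map((line) => JSON.parse(line) as StoredAuditEntry)
      .filter((entry) => entry.humanOverride !== undefined);
  }
}

const log = new AuditLog();
log.record({
  phase: "triage",
  decision: "proceed",
  reasoning: "Issue references a specific file and line",
  inputs: { issue: 234 },
  outputs: { complexity: "low" },
});
console.log(log.toJsonLines());
```

Tracking how often humans override the pipeline, and in which phase, gives a concrete measure of whether its thresholds are set too loosely or too tightly.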
9. Books That Will Help
| Topic | Book | Chapters | Why It Helps |
|---|---|---|---|
| CI/CD | “Continuous Delivery” by Humble & Farley | Ch. 5-7 | Pipeline design, deployment patterns |
| DevOps | “Accelerate” by Forsgren et al. | Ch. 2-4 | Measuring delivery performance |
| Human-AI | “Human + Machine” by Daugherty & Wilson | Ch. 5-6 | Collaboration patterns |
| Reliability | “Site Reliability Engineering” by Google | Ch. 3, 16 | Release engineering, progressive rollout |
| Feature Flags | “Feature Flag Best Practices” by LaunchDarkly | All | Flag strategies, rollout patterns |
10. Architecture Deep Dive
10.1 System Architecture
+------------------------------------------------------------------------+
| AI DEVELOPMENT PIPELINE ARCHITECTURE |
+------------------------------------------------------------------------+
| |
| GITHUB |
| +-------------------------------------------------------------------+ |
| | Issues | PRs | Actions | Webhooks | |
| +-------------------------------------------------------------------+ |
| | ^ | |
| | trigger | | status |
| v | v |
| +-------------------------------------------------------------------+ |
| | PIPELINE ORCHESTRATOR | |
| | (GitHub Actions / Custom Server) | |
| | | |
| | +------------+ +------------+ +------------+ +------------+ | |
| | | Triage | | Implement | | Test | | Review | | |
| | | Phase | | Phase | | Phase | | Phase | | |
| | +------------+ +------------+ +------------+ +------------+ | |
| +-------------------------------------------------------------------+ |
| | | | | |
| v v v v |
| +-------------------------------------------------------------------+ |
| | CLAUDE CODE (HEADLESS) | |
| | | |
| | - Analyze issues | |
| | - Generate code | |
| | - Write tests | |
| | - Self-review | |
| +-------------------------------------------------------------------+ |
| | |
| v |
| +-------------------------------------------------------------------+ |
| | QUALITY GATES | |
| | | |
| | +--------+ +--------+ +--------+ +--------+ +--------+ | |
| | | Linter | | Tests | |Security| |Coverage| | Review | | |
| | +--------+ +--------+ +--------+ +--------+ +--------+ | |
| +-------------------------------------------------------------------+ |
| | |
| v |
| +-------------------------------------------------------------------+ |
| | DEPLOYMENT | |
| | | |
| | Feature Flags --> Staging --> Canary --> Production | |
| +-------------------------------------------------------------------+ |
| |
+------------------------------------------------------------------------+
10.2 GitHub Actions Workflow
# .github/workflows/ai-pipeline.yml
name: AI Development Pipeline
on:
  issues:
    types: [opened, labeled]
env:
  # Claude Code reads its API key from ANTHROPIC_API_KEY.
  ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
jobs:
  triage:
    if: contains(github.event.issue.labels.*.name, 'ai-eligible')
    runs-on: ubuntu-latest
    outputs:
      should_proceed: ${{ steps.analyze.outputs.should_proceed }}
      plan: ${{ steps.analyze.outputs.plan }}
      session_id: ${{ steps.analyze.outputs.session_id }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Install Claude Code CLI
        run: npm install -g @anthropic-ai/claude-code
      - name: Analyze Issue
        id: analyze
        env:
          # Untrusted issue text is passed through environment variables, never
          # interpolated into the script body, to avoid shell injection.
          ISSUE_TITLE: ${{ github.event.issue.title }}
          ISSUE_BODY: ${{ github.event.issue.body }}
          ISSUE_LABELS: ${{ join(github.event.issue.labels.*.name, ', ') }}
        run: |
          claude -p "Analyze this GitHub issue and determine if it's actionable:
          Title: $ISSUE_TITLE
          Body: $ISSUE_BODY
          Labels: $ISSUE_LABELS
          Output only a JSON object with:
          - should_proceed: boolean
          - complexity: low|medium|high
          - affected_files: string[]
          - implementation_plan: string
          - risks: string[]" \
            --output-format json > response.json
          # The CLI wraps the reply in a JSON envelope; the model's text is in
          # .result. This assumes the reply is bare JSON without code fences.
          jq -r '.result' response.json > analysis.json
          cat analysis.json
          echo "plan=$(jq -c . analysis.json)" >> "$GITHUB_OUTPUT"
          echo "should_proceed=$(jq -r .should_proceed analysis.json)" >> "$GITHUB_OUTPUT"
          echo "session_id=$(jq -r .session_id response.json)" >> "$GITHUB_OUTPUT"
      - name: Check Scope
        run: node scripts/verify-scope.js analysis.json
  implement:
    needs: triage
    if: needs.triage.outputs.should_proceed == 'true'
    runs-on: ubuntu-latest
    permissions:
      contents: write  # needed to push the feature branch
    outputs:
      branch: ${{ steps.branch.outputs.name }}
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      # Each job runs on a fresh runner, so the CLI must be installed again.
      - name: Install Claude Code CLI
        run: npm install -g @anthropic-ai/claude-code
      - name: Create Branch
        id: branch
        run: |
          BRANCH="feature/issue-${{ github.event.issue.number }}"
          git checkout -b "$BRANCH"
          echo "name=$BRANCH" >> "$GITHUB_OUTPUT"
      - name: Implement
        id: implement
        env:
          # The plan is JSON and contains quotes, so pass it via the environment.
          PLAN: ${{ needs.triage.outputs.plan }}
        run: |
          # In print mode, stream-json output requires --verbose. acceptEdits
          # lets the headless run write files without interactive prompts.
          claude -p "Implement the changes according to this plan:
          $PLAN
          Follow the project's coding standards.
          Create or modify files as needed.
          Do not modify files outside the affected scope." \
            --output-format stream-json --verbose \
            --permission-mode acceptEdits | tee implementation.log
      - name: Commit Changes
        run: |
          git config user.name "Claude Pipeline"
          git config user.email "claude@pipeline.local"
          git add -A
          git commit -m "feat: Implement issue #${{ github.event.issue.number }}

          Automated implementation by Claude Pipeline.
          See implementation.log for details."
      - name: Push Branch
        run: git push -u origin "${{ steps.branch.outputs.name }}"
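  test:
    needs: implement
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          ref: ${{ needs.implement.outputs.branch }}
      - name: Setup Project
        run: npm ci
      # Each job runs on a fresh runner, so the CLI must be installed again.
      - name: Install Claude Code CLI
        run: npm install -g @anthropic-ai/claude-code
      - name: Generate Tests
        # The generated tests exist only in this job's workspace. Commit and
        # push them (as in the implement job) if they should appear in the PR.
        run: |
          claude -p "Generate tests for the changes in this branch.
          Focus on:
          - Unit tests for new functions
          - Integration tests for new features
          - Edge cases mentioned in the issue" \
            --permission-mode acceptEdits
      - name: Run Tests
        run: npm test
      - name: Coverage Report
        run: npm run coverage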
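  review:
    needs: [implement, test]
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          ref: ${{ needs.implement.outputs.branch }}
      # Each job runs on a fresh runner, so the CLI must be installed again.
      - name: Install Claude Code CLI
        run: npm install -g @anthropic-ai/claude-code
      - name: Self Review
        run: |
          claude -p "Review the changes in this branch for:
          - Code quality
          - Security issues
          - Performance concerns
          - Accessibility compliance
          - Documentation completeness
          Output a review with pass/fail for each category."
      - name: Create PR
        # peter-evans/create-pull-request commits workspace changes onto a new
        # branch. Here the branch is already pushed, so the GitHub CLI opens
        # the PR directly. The labels must already exist in the repository.
        env:
          GH_TOKEN: ${{ github.token }}
          ISSUE_TITLE: ${{ github.event.issue.title }}
          ISSUE_NUMBER: ${{ github.event.issue.number }}
          BRANCH: ${{ needs.implement.outputs.branch }}
        run: |
          cat > pr-body.md <<EOF
          ## Summary
          Automated implementation for issue #$ISSUE_NUMBER. Closes #$ISSUE_NUMBER.
          ## Changes
          [Auto-generated summary]
          ## Test Plan
          - [ ] Automated tests pass
          - [ ] Manual verification of feature
          - [ ] No regressions in existing functionality
          ## Human Review Required
          This PR was generated by the AI pipeline and requires human approval before merge.
          ---
          Generated by Claude Pipeline
          EOF
          # Team review requests use the org/team form and may need a token
          # with organization read access instead of the default token.
          gh pr create \
            --head "$BRANCH" \
            --title "feat: $ISSUE_TITLE (closes #$ISSUE_NUMBER)" \
            --body-file pr-body.md \
            --reviewer "${{ github.repository_owner }}/frontend-team" \
            --label ai-generated --label needs-review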
10.3 Human Checkpoint Implementation
// Assumed to exist elsewhere in the pipeline: an authenticated Octokit client
// (`github`), repository `config`, an `auditLog`, the `EscalationPolicy` type,
// and `waitForHumanResponse`.
interface HumanCheckpoint {
  id: string;
  type: 'approval' | 'decision' | 'review';
  context: Record<string, any>; // includes prNumber and reviewers
  options: CheckpointOption[];
  timeout: number; // milliseconds
  escalation: EscalationPolicy;
}
interface CheckpointOption {
  label: string;
  action: string;
  requiresReason: boolean;
}
// Shape inferred from how the response is used below.
interface CheckpointResult {
  decision: string;
  actor: string;
  reason?: string;
}
async function createHumanCheckpoint(
  checkpoint: HumanCheckpoint
): Promise<CheckpointResult> {
  // Create GitHub review request
  await github.pulls.requestReviewers({
    owner: config.owner,
    repo: config.repo,
    pull_number: checkpoint.context.prNumber,
    reviewers: checkpoint.context.reviewers
  });
  // Add checkpoint comment (PR comments use the issues API)
  await github.issues.createComment({
    owner: config.owner,
    repo: config.repo,
    issue_number: checkpoint.context.prNumber,
    body: formatCheckpointMessage(checkpoint)
  });
  // Wait for response or timeout
  const response = await waitForHumanResponse(checkpoint);
  // Log decision for audit
  await auditLog.record({
    checkpoint: checkpoint.id,
    decision: response.decision,
    decidedBy: response.actor,
    reason: response.reason,
    timestamp: new Date()
  });
  return response;
}
function formatCheckpointMessage(checkpoint: HumanCheckpoint): string {
  return `
## Human Review Required
This PR was generated by the AI Development Pipeline and requires human approval.
### Context
${JSON.stringify(checkpoint.context, null, 2)}
### Actions Available
${checkpoint.options.map(o => `- **${o.label}**: ${o.action}`).join('\n')}
### Timeout
This checkpoint will expire in ${checkpoint.timeout / 1000 / 60} minutes.
---
Pipeline ID: ${checkpoint.id}
`;
}
10.4 Required Human Approvals
// Mandatory human approval points
const HUMAN_CHECKPOINTS = {
// Always require human review before merge
prApproval: {
required: true,
minApprovers: 1,
requiredTeams: ['reviewers']
},
// Require approval for production deployment
productionDeploy: {
required: true,
minApprovers: 2,
requiredTeams: ['sre', 'product']
},
// Require approval for security-sensitive files
securityReview: {
paths: ['**/auth/**', '**/security/**', '**/crypto/**'],
required: true,
requiredTeams: ['security']
},
// Require approval for database changes
databaseChanges: {
paths: ['**/migrations/**', '**/schema/**'],
required: true,
requiredTeams: ['dba']
},
// Require approval for API changes
apiChanges: {
paths: ['**/api/**', '**/openapi/**'],
required: true,
requiredTeams: ['api-owners']
}
};
11. Implementation Milestones
Milestone 1: Simple Issues Implemented Automatically (Week 1-4)
Goal: Basic pipeline works for low-risk issues
Deliverables:
- Issue analysis and classification
- Branch creation and code generation
- Test generation
- PR creation
- Basic quality gates
Validation: Pipeline handles 5 documentation/typo issues end-to-end
Milestone 2: Human Checkpoints Work Correctly (Week 5-8)
Goal: Safety mechanisms are robust
Deliverables:
- PR review workflow
- Approval gates
- Timeout handling
- Escalation policies
- Audit logging
Validation: No changes deploy without human approval
Milestone 3: Complex Issues Get Appropriate Escalation (Week 9-12)
Goal: Pipeline knows its limits
Deliverables:
- Complexity estimation
- Scope detection
- Automatic escalation
- Partial automation (assist mode)
- Learning from feedback
Validation: Pipeline correctly routes 90% of issues
12. Issue Classification System
interface IssueClassification {
  automationLevel: 'full' | 'high' | 'medium' | 'low' | 'assist';
  confidence: number;
  reasoning: string;
  requiredHumanInput: string[];
  estimatedComplexity: number;
  affectedPaths: string[];
  risks: Risk[];
}
async function classifyIssue(issue: GitHubIssue): Promise<IssueClassification> {
  // Extract signals from issue
  const signals = await extractSignals(issue);
  // Check against rules
  const ruleResult = applyClassificationRules(signals);
  // Get AI assessment
  const aiResult = await getAIClassification(issue, signals);
  // Combine results (rules take precedence)
  return combineClassifications(ruleResult, aiResult);
}
function applyClassificationRules(signals: IssueSignals): Partial<IssueClassification> {
  const paths = signals.affectedPaths;
  // Security-sensitive paths -> always low automation
  if (paths.some(p => isSecurityPath(p))) {
    return {
      automationLevel: 'low',
      requiredHumanInput: ['security-review', 'implementation']
    };
  }
  // Database changes -> always low automation
  if (paths.some(p => isDatabasePath(p))) {
    return {
      automationLevel: 'low',
      requiredHumanInput: ['dba-review', 'migration-review']
    };
  }
  // `every` is vacuously true on an empty array, so require at least one
  // detected path before granting the doc-only or test-only levels.
  const hasPaths = paths.length > 0;
  // Documentation only -> full automation
  if (hasPaths && paths.every(p => isDocPath(p))) {
    return {
      automationLevel: 'full',
      requiredHumanInput: []
    };
  }
  // Test additions -> high automation
  if (hasPaths && paths.every(p => isTestPath(p))) {
    return {
      automationLevel: 'high',
      requiredHumanInput: ['pr-review']
    };
  }
  return {}; // Let AI decide
}
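The code above calls `combineClassifications` without showing it. One plausible reading of "rules take precedence" is sketched below, using trimmed-down types so the example is self-contained. The merge policy is an assumption: any field set by a rule wins, while the required human inputs from both sources are unioned so that neither side can remove a review the other asked for.

```typescript
type AutomationLevel = "full" | "high" | "medium" | "low" | "assist";

// Trimmed-down version of the classification type, for illustration only.
interface Classification {
  automationLevel: AutomationLevel;
  confidence: number;
  reasoning: string;
  requiredHumanInput: string[];
}

function combineClassifications(
  rule: Partial<Classification>,
  ai: Classification
): Classification {
  // Union of required inputs, with rule-mandated reviews listed first.
  const requiredHumanInput = Array.from(
    new Set([...(rule.requiredHumanInput ?? []), ...ai.requiredHumanInput])
  );

  // Record when a rule overrode the AI's suggestion, for the audit trail.
  const overridden =
    rule.automationLevel !== undefined && rule.automationLevel !== ai.automationLevel;
  const reasoning = overridden
    ? `Rule set level to ${rule.automationLevel} (AI suggested ${ai.automationLevel}). ${ai.reasoning}`
    : ai.reasoning;

  // Later spreads win: AI values first, then rule values on top.
  return { ...ai, ...rule, requiredHumanInput, reasoning };
}

const combined = combineClassifications(
  { automationLevel: "low", requiredHumanInput: ["security-review", "implementation"] },
  {
    automationLevel: "high",
    confidence: 0.9,
    reasoning: "Small, well-scoped change.",
    requiredHumanInput: ["pr-review"],
  }
);
console.log(combined.automationLevel, combined.requiredHumanInput);
```

Unioning the human inputs errs on the side of more review, which matches the conservative defaults recommended earlier in the hints.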
13. Monitoring and Observability
+------------------------------------------------------------------------+
| PIPELINE DASHBOARD |
+------------------------------------------------------------------------+
| |
| OVERALL HEALTH |
| +-------------------------------------------------------------------+ |
| | Success Rate: 94% | Avg Time: 23m | Issues/Week: 47 | |
| +-------------------------------------------------------------------+ |
| |
| CURRENT PIPELINES |
| +-------------------------------------------------------------------+ |
| | Issue | Phase | Duration | Status | Assignee | |
| |-------|------------|----------|-----------|------------------------| |
| | #234 | Review | 18m | WAITING | @frontend-team | |
| | #235 | Testing | 5m | RUNNING | - | |
| | #236 | Implement | 12m | RUNNING | - | |
| | #237 | Triage | 1m | RUNNING | - | |
| +-------------------------------------------------------------------+ |
| |
| AUTOMATION METRICS (Last 30 Days) |
| +-------------------------------------------------------------------+ |
| | Metric | Value | |
| |---------------------------------|----------------------------------| |
| | Issues fully automated | 32 (68%) | |
| | Issues with human assist | 12 (25%) | |
| | Issues escalated to human | 3 (7%) | |
| | Average time saved per issue | 2.3 hours | |
| | Bugs introduced by AI | 1 (caught in review) | |
| | Developer satisfaction | 4.2/5 | |
| +-------------------------------------------------------------------+ |
| |
| RECENT ISSUES |
| +-------------------------------------------------------------------+ |
| | #233 | Documentation update | FULL AUTO | Completed 2h ago | |
| | #232 | Fix login validation | ASSISTED | Completed 4h ago | |
| | #231 | Add dark mode | HIGH AUTO | Completed 1d ago | |
| | #230 | Redesign auth flow | ESCALATED | In progress | |
| +-------------------------------------------------------------------+ |
| |
+------------------------------------------------------------------------+
14. Trust Building Strategy
Phase 1: Observation Only (Week 1-2)
- Pipeline runs but creates draft PRs only
- Human implements manually
- Compare AI suggestions to human work
- Collect accuracy metrics
Phase 2: Low-Risk Automation (Week 3-6)
- Enable for documentation only
- Require two approvers
- Monitor closely
- Build trust with team
Phase 3: Expanded Scope (Week 7-12)
- Enable for typos, tests, simple fixes
- Reduce to one approver for known patterns
- Continue monitoring
- Gather developer feedback
Phase 4: Full Pipeline (Week 13+)
- Enable for most enhancement issues
- Auto-merge for documentation
- Single approver for code changes
- Continuous improvement
15. Common Pitfalls and Solutions
Pitfall 1: Over-Automation
Problem: AI makes changes that are technically correct but miss important context.
Solution:
- Always require human review for code changes
- Include context in PR description
- Make it easy to request changes
Pitfall 2: Trust Erosion
Problem: One bad change destroys team confidence in the pipeline.
Solution:
- Start very conservatively
- Celebrate successes publicly
- Handle failures transparently
- Continuous improvement visible to team
Pitfall 3: Context Loss
Problem: AI implementations miss project-specific patterns and conventions.
Solution:
- Comprehensive CLAUDE.md with project context
- Learn from code review feedback
- Style enforcement in quality gates
16. Success Criteria
You have mastered this project when:
- Pipeline handles documentation issues end-to-end
- Pipeline handles simple code changes with human review
- Human checkpoints work reliably
- Team trusts the pipeline for routine work
- Metrics show time savings
- Zero bugs shipped due to AI implementation
- Developers prefer using the pipeline for eligible issues
Source
This project is part of the Claude Code Mastery: 40 Projects learning path.