Vibe Coding Mastery: AI-Assisted Development with Claude Code and OpenAI Codex
Goal: Master the art and science of “vibe coding” - the revolutionary approach to software development where you collaborate with AI assistants to build software through natural language conversation. You will understand not just the mechanics of AI-assisted coding, but the cognitive strategies, prompt engineering techniques, and workflow patterns that separate amateur “prompt-and-pray” developers from professionals who reliably ship production-quality code with AI assistance. By completing these 20 projects, you will develop intuition for when to trust AI, when to verify, how to manage context effectively, and how to integrate AI coding tools into sustainable development workflows.
Why Vibe Coding Matters: The AI Coding Revolution
In February 2025, Andrej Karpathy (co-founder of OpenAI, former Tesla AI director) coined the term “vibe coding” to describe a new paradigm: fully giving in to the vibes, embracing exponentials, and forgetting that the code even exists. He described using voice commands to tell AI assistants to “decrease the padding on the sidebar by half” without ever reading the diffs or understanding the generated code.
This wasn’t a joke - it was a prediction. By July 2025, The Wall Street Journal reported that vibe coding had entered commercial use. Y Combinator’s Winter 2025 batch revealed that 25% of startups had codebases that were 95% AI-generated.
The Evolution of Software Development
┌─────────────────────────────────────────────────────────────────────────────┐
│ │
│ 1950s-1990s: MANUAL CODING │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Developer → Write every line → Compile → Debug → Ship │ │
│ │ Time to feature: Weeks to months │ │
│ │ Context: Only what's in your head │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ 2000s-2020s: IDE-ASSISTED CODING │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Developer → Autocomplete → Snippets → Stack Overflow → Ship │ │
│ │ Time to feature: Days to weeks │ │
│ │ Context: IDE + search + documentation │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ 2021-2024: AI-ASSISTED CODING (Copilot Era) │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Developer → Prompt → Accept/Reject → Manual fixes → Ship │ │
│ │ Time to feature: Hours to days │ │
│ │ Context: Current file + some project context │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ 2025+: VIBE CODING (Agentic Era) │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ Developer → Describe intent → AI reasons + plans + executes │ │
│ │ Time to feature: Minutes to hours │ │
│ │ Context: Entire codebase + external tools + memory │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘

The Numbers Don’t Lie (2025 Statistics)
The adoption of AI coding assistants has been explosive:
| Metric | 2024 | 2025 | Source |
|---|---|---|---|
| Developers using AI coding tools weekly | 45% | 65% | Stack Overflow Survey |
| AI-generated or AI-assisted code | 25% | 41% | Industry reports |
| Fortune 100 companies using GitHub Copilot | 75% | 90% | GitHub data |
| Cursor users completing tasks faster | - | 40-60% | User studies |
| GitHub Copilot market share | 50% | 42% | Industry analysis |
| Cursor revenue | $100M | $200M (projected) | Company data |
But there’s a critical caveat: 46% of developers actively distrust AI output accuracy (vs. 33% who trust it). Experienced developers are the most cautious - only 2.6% “highly trust” AI suggestions.
This creates a paradox: AI tools are powerful, but most developers don’t know how to use them effectively. That’s what this guide teaches.
The Vibe Coding Spectrum
Vibe coding isn’t binary. It exists on a spectrum:
VIBE CODING SPECTRUM
┌──────────────────────────────────────────────────────────────────────────────┐
│ │
│ PURE MANUAL ASSISTED COLLABORATIVE VIBE │
│ ────────────────────────────────────────────────────────────────────────── │
│ │ │ │ │ │
│ Write every line Autocomplete + AI proposes, AI does │
│ yourself suggestions human reviews everything│
│ │
│ └────────────────────┴──────────────────────┴──────────────────────┘ │
│ Copilot inline Claude Code Full vibe mode │
│ suggestions with review (Karpathy style) │
│ │
│ RECOMMENDED ZONE FOR PRODUCTION: ◄─────────────────► │
│ Assisted to Collaborative │
│ │
│ Risk level: LOW MODERATE HIGH EXTREME │
│ Productivity: 1x 3-5x 5-10x 10x+ │
│ Understanding: FULL HIGH MODERATE LOW │
│ Maintainability:EXCELLENT GOOD FAIR POOR │
│ │
└──────────────────────────────────────────────────────────────────────────────┘

When to Use Each Mode
| Mode | Best For | Avoid When |
|---|---|---|
| Pure Manual | Security-critical code, performance-critical loops, algorithmic puzzles | Time-constrained, boilerplate-heavy |
| Assisted | Day-to-day coding, familiar patterns, well-understood domains | Novel problems, learning new tech |
| Collaborative | Complex features, refactoring, exploring solutions | Simple tasks, security-critical |
| Full Vibe | Prototypes, throwaway scripts, learning experiments | Production code, anything important |
The Tool Landscape (2025)
AI CODING TOOL ECOSYSTEM
┌──────────────────────────────────────────────────────────────────────────────┐
│ │
│ INLINE ASSISTANTS AGENTIC TOOLS │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ │ │ │ │
│ │ GitHub Copilot │ │ Claude Code │ │
│ │ (42% market share) │ │ (Anthropic) │ │
│ │ │ │ │ │
│ │ - Inline suggest │ │ - 16 built-in tools│ │
│ │ - Tab to accept │ │ - Full codebase │ │
│ │ - File context │ │ - MCP integration │ │
│ │ │ │ - Headless mode │ │
│ └─────────────────────┘ └─────────────────────┘ │
│ │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ │ │ │ │
│ │ Cursor │ │ OpenAI Codex │ │
│ │ (18% market share) │ │ (CLI agent) │ │
│ │ │ │ │ │
│ │ - Full IDE │ │ - Local sandboxing │ │
│ │ - Composer agent │ │ - Approval policies│ │
│ │ - Codebase-aware │ │ - GitHub model │ │
│ │ │ │ │ │
│ └─────────────────────┘ └─────────────────────┘ │
│ │
│ ┌─────────────────────┐ ┌─────────────────────┐ │
│ │ │ │ │ │
│ │ Amazon Q Developer │ │ Gemini CLI │ │
│ │ (11% market share) │ │ (Google) │ │
│ │ │ │ │ │
│ │ - AWS integration │ │ - Google Cloud │ │
│ │ - Security focus │ │ - Long context │ │
│ │ │ │ - Multimodal │ │
│ └─────────────────────┘ └─────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────────┘

Prerequisites & Background Knowledge
Essential Prerequisites (Must Have)
Programming Experience:
- Comfortable reading and writing code in at least one language
- Understanding of version control (Git) fundamentals
- Basic command-line proficiency
- Experience with at least one complete project (beyond tutorials)
Cognitive Prerequisites:
- Ability to evaluate code quality (is this code good or bad?)
- Understanding of common patterns (CRUD, APIs, testing)
- Willingness to verify and not blindly trust AI output
Helpful But Not Required
LLM Understanding:
- How language models work at a conceptual level
- Token limits and context windows
- Temperature and sampling parameters
- Can learn during: Projects 1-3
Tool-Specific Knowledge:
- Claude Code CLI experience
- Cursor or Copilot experience
- MCP server concepts
- Can learn during: Projects 4-8
Self-Assessment Questions
Before starting, honestly answer:
- Can you read 100 lines of unfamiliar code and identify bugs?
  - YES: Ready for advanced vibe coding
  - NO: Start with Projects 1-5 only
- Do you understand why code review exists?
  - YES: You’ll understand when to trust AI
  - NO: Read “Code Complete” Chapter 21 first
- Have you shipped production code that others rely on?
  - YES: Full vibe coding with appropriate caution
  - NO: Stick to assisted mode, review everything
- Can you explain the difference between syntax and semantics?
  - YES: You can evaluate AI-generated code meaningfully
  - NO: Complete “Structure and Interpretation of Computer Programs” Chapter 1
- Do you know what a race condition is? (see the sketch below)
  - YES: You can catch AI-generated concurrency bugs
  - NO: AI will confidently generate broken concurrent code you won’t catch
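If that last question gave you pause, here is the bug in its simplest form. A minimal sketch of a check-then-act race, assuming a Prisma-style client and a `User` model with an `email` field (the same pattern Project 7’s review output later flags as a MEDIUM issue):

```typescript
import { PrismaClient } from '@prisma/client'; // assumes a generated client with a User model

// Check-then-act race: two concurrent registrations with the same email
// can BOTH pass the findUnique check before either create commits.
async function register(db: PrismaClient, email: string) {
  const existing = await db.user.findUnique({ where: { email } });
  if (existing) throw new Error('Email already registered');
  // RACE WINDOW: another request can insert between the check and the create
  return db.user.create({ data: { email } });
}
// Fix: add a unique constraint on email and handle the violation error,
// or wrap the check and create in a single transaction.
```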
Development Environment Setup
Required Tools:
# Claude Code
npm install -g @anthropic-ai/claude-code
# or via the native installer
curl -fsSL https://claude.ai/install.sh | bash
# OpenAI Codex CLI
npm install -g @openai/codex
# or via Homebrew
brew install codex
# Cursor (download from cursor.com)
# GitHub Copilot (via VS Code extension marketplace)
Recommended Configuration:
# Create a test project for safe experimentation
mkdir ~/vibe-coding-lab
cd ~/vibe-coding-lab
git init
# Set up Claude Code: start a session, then run /init to generate CLAUDE.md
claude
# Set up Codex: first run walks you through configuration
codex
Verification:
# Test Claude Code
claude --version
claude "What tools do you have access to?"
# Test Codex
codex --version
codex "List your capabilities"
Time Investment
| Project Group | Projects | Hours Each | Total |
|---|---|---|---|
| Foundations | 1-5 | 4-8 | 20-40 |
| Core Skills | 6-12 | 8-15 | 56-105 |
| Advanced | 13-17 | 15-25 | 75-125 |
| Mastery | 18-20 | 20-40 | 60-120 |
| Total | 20 | - | 211-390 hours |
Important Reality Check: Vibe coding appears easy but requires deep judgment. The projects below aren’t about making AI work - they’re about developing the intuition to use AI effectively and catch its mistakes before they become production bugs.
Core Concept Analysis
1. The AI Coding Mental Model
AI coding assistants are probabilistic code generators, not programmers. They:
- Predict the most likely next tokens based on training data
- Have no understanding of your specific business requirements
- Cannot reason about runtime behavior
- May confidently generate plausible-but-wrong code
- Excel at patterns they’ve seen many times
HOW AI CODING ASSISTANTS WORK
┌──────────────────────────────────────────────────────────────────────────────┐
│ │
│ YOUR PROMPT │
│ "Create a function that validates email addresses" │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ TOKENIZATION │ │
│ │ "Create" "a" "function" "that" "validates" "email" "addresses" │ │
│ └──────────────────────────────┬──────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ CONTEXT GATHERING │ │
│ │ │ │
│ │ System prompt + CLAUDE.md + Recent conversation + File contents │ │
│ │ + Tool descriptions + MCP server capabilities │ │
│ │ │ │
│ │ Total context: Up to 200K tokens │ │
│ └──────────────────────────────┬──────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ PATTERN MATCHING │ │
│ │ │ │
│ │ "I've seen 10,000+ email validation functions in training data" │ │
│ │ "Most common pattern: regex with specific format" │ │
│ │ "Framework-specific: Django has validate_email, Flask doesn't" │ │
│ │ │ │
│ │ ⚠️ Training data may contain bugs, outdated patterns, or │ │
│ │ security vulnerabilities that get reproduced │ │
│ └──────────────────────────────┬──────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ TOKEN GENERATION │ │
│ │ │ │
│ │ For each position, predict most likely next token: │ │
│ │ "def" (90%) | "function" (5%) | "const" (3%) | ... │ │
│ │ │ │
│ │ Temperature controls randomness: │ │
│ │ - Low (0.0): Always pick highest probability │ │
│ │ - High (1.0): Sample from distribution (more creative/random) │ │
│ └──────────────────────────────┬──────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ OUTPUT CODE │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ def validate_email(email: str) -> bool: │ │
│ │ pattern = r'^[\w\.-]+@[\w\.-]+\.\w+$' │ │
│ │ return bool(re.match(pattern, email)) │ │
│ │ │ │
│ │ ⚠️ This is PLAUSIBLE but may have edge case bugs! │ │
│ │ (This regex fails on valid emails like "user+tag@domain.com") │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────────┘
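You can verify the diagram’s warning in seconds. The snippet below runs the same regex against a plus-addressed email; `\w` matches only `[A-Za-z0-9_]`, so the `+` causes a false rejection:

```typescript
// The regex from the diagram above
const pattern = /^[\w\.-]+@[\w\.-]+\.\w+$/;

console.log(pattern.test('user@domain.com'));     // true
console.log(pattern.test('user+tag@domain.com')); // false - valid address, wrongly rejected
```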

2. Context is Everything
The quality of AI output is directly proportional to context quality:
CONTEXT HIERARCHY (Claude Code)
┌──────────────────────────────────────────────────────────────────────────────┐
│ │
│ Priority 1: ENTERPRISE CLAUDE.md (If exists) │
│ ├── Company-wide standards │
│ ├── Security policies │
│ └── Compliance requirements │
│ │ │
│ ▼ │
│ Priority 2: PROJECT CLAUDE.md │
│ ├── Architecture overview │
│ ├── Important file locations │
│ ├── Testing commands │
│ ├── Code style guidelines │
│ └── Common patterns to use │
│ │ │
│ ▼ │
│ Priority 3: ~/.claude/CLAUDE.md (Personal) │
│ ├── Your preferred coding style │
│ ├── Personal shortcuts │
│ └── Common tools you use │
│ │ │
│ ▼ │
│ Priority 4: Conversation history │
│ ├── Previous messages in session │
│ ├── Files you've discussed │
│ └── Decisions you've made │
│ │ │
│ ▼ │
│ Priority 5: On-demand context │
│ ├── Files Claude reads when needed │
│ ├── Grep/glob search results │
│ ├── MCP server responses │
│ └── Web search results │
│ │
│ ┌────────────────────────────────────────────────────────────────────────┐ │
│ │ CONTEXT WINDOW BUDGET │ │
│ │ │ │
│ │ Claude Sonnet: ~200K tokens (~150K words, ~500 pages of text) │ │
│ │ GPT-4 Turbo: ~128K tokens (~96K words, ~320 pages) │ │
│ │ Gemini 1.5: ~2M tokens (~1.5M words, ~5000 pages) │ │
│ │ │ │
│ │ TRADEOFF: More context = slower response + higher cost │ │
│ │ STRATEGY: Include what's relevant, not everything possible │ │
│ └────────────────────────────────────────────────────────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────────────────────┘
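To apply the budget in practice, a common rule of thumb is roughly 4 characters per token for English text and code (the exact ratio varies by tokenizer). A quick sketch for sizing files before pasting them into a conversation; the file paths are illustrative:

```typescript
import { readFileSync } from 'node:fs';

// ~4 chars/token is a rough approximation; real tokenizers vary
function estimateTokens(path: string): number {
  return Math.ceil(readFileSync(path, 'utf8').length / 4);
}

const candidates = ['src/app.ts', 'prisma/schema.prisma']; // illustrative paths
const total = candidates.reduce((sum, f) => sum + estimateTokens(f), 0);
console.log(`~${total} tokens of a 200K budget`);
```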

3. The Agentic Difference
Traditional AI assistants respond to prompts. Agentic tools like Claude Code and Codex can:
AGENTIC VS CHAT-BASED AI CODING
┌──────────────────────────────────────────────────────────────────────────────┐
│ │
│ CHAT-BASED (ChatGPT, early Copilot) │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Human: "How do I add authentication to my Express app?" │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ AI: "Here's sample code..." (Generic, may not fit your project) │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ Human: Must manually copy, adapt, integrate, test, debug │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ AGENTIC (Claude Code, Codex CLI) │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ Human: "Add JWT authentication to this Express app" │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ AI: [Thinks] "Let me understand the current architecture..." │ │
│ │ [Reads] package.json, src/app.ts, src/routes/*.ts │ │
│ │ [Plans] "I'll need to: 1) Add dependencies..." │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ AI: [Executes] npm install jsonwebtoken bcrypt │ │
│ │ [Writes] src/middleware/auth.ts │ │
│ │ [Edits] src/routes/user.ts (adds protected routes) │ │
│ │ [Creates] src/utils/jwt.ts │ │
│ │ [Updates] .env.example │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ AI: [Verifies] "Running tests... All passing" │ │
│ │ "Here's what I did: [summary]" │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
│ KEY DIFFERENCE: Agentic tools have TOOLS (read, write, execute) │
│ They can reason → plan → act → observe → iterate │
│ │
└──────────────────────────────────────────────────────────────────────────────┘
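The reason → plan → act → observe → iterate loop at the bottom of the diagram is worth internalizing. A minimal sketch of that control flow, with the model call and tool registry stubbed out (this shows the shape of the loop, not any particular SDK):

```typescript
type ToolCall = { name: string; args: Record<string, unknown> };
type ModelTurn = { toolCalls: ToolCall[]; done: boolean; text: string };

// Stub standing in for a real LLM API call with tool schemas attached
async function callModel(history: string[]): Promise<ModelTurn> {
  return { toolCalls: [], done: true, text: 'stub response' };
}

// Stub tool registry (read/write/execute in a real agent)
const tools: Record<string, (args: Record<string, unknown>) => Promise<string>> = {
  read_file: async (args) => `contents of ${args.path} (stub)`,
};

async function agentLoop(task: string): Promise<string> {
  const history = [task];
  for (let step = 0; step < 20; step++) {            // cap iterations: agents can loop forever
    const turn = await callModel(history);            // reason + plan
    if (turn.done) return turn.text;                  // agent declares completion
    for (const call of turn.toolCalls) {
      const result = await tools[call.name](call.args); // act
      history.push(`${call.name} -> ${result}`);        // observe, feed back
    }
  }
  throw new Error('Agent did not converge within step budget');
}
```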

4. The Trust Calibration Problem
The central challenge of vibe coding: calibrating your trust level appropriately.
TRUST CALIBRATION MATRIX
┌──────────────────────────────────────────────────────────────────────────────┐
│ │
│ AI CONFIDENCE │
│ LOW ◄──────────────────► HIGH │
│ │
│ ┌─────────────────┬─────────────────┐ │
│ HIGH │ │ │ │
│ │ INVESTIGATE │ TRUST BUT │ │
│ Y │ │ VERIFY │ │
│ O │ AI uncertain │ │ │
│ U │ → You should │ Both agree │ │
│ R │ dig deeper │ → Quick check │ │
│ │ │ then proceed │ │
│ C ├─────────────────┼─────────────────┤ │
│ O │ │ │ │
│ N │ REJECT OR │ DANGER ZONE │ │
│ F │ RESEARCH │ │ │
│ I │ │ AI confident, │ │
│ D │ Neither sure │ you're not │ │
│ E │ → Don't ship │ → ALWAYS │ │
│ N │ → Get help │ verify these │ │
│ C │ │ │ │
│ E └─────────────────┴─────────────────┘ │
│ LOW │
│ │
│ DANGER ZONE EXAMPLES: │
│ - AI-generated regex for security validation │
│ - AI-generated SQL without parameterization check │
│ - AI-generated crypto code (NEVER trust without expert review) │
│ - AI-generated performance-critical code │
│ - AI-generated code in unfamiliar language/framework │
│ │
└──────────────────────────────────────────────────────────────────────────────┘

5. The MCP (Model Context Protocol) Architecture
MCP is the open standard that lets AI tools connect to external systems:
MCP ARCHITECTURE
┌──────────────────────────────────────────────────────────────────────────────┐
│ │
│ ┌───────────────────────┐ │
│ │ AI APPLICATION │ │
│ │ (Claude Code, etc.) │ │
│ │ │ │
│ │ ┌─────────────────┐ │ │
│ │ │ MCP CLIENT │ │ │
│ │ └────────┬────────┘ │ │
│ └───────────┼───────────┘ │
│ │ │
│ ┌──────────────────────────┼──────────────────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ │
│ │ MCP SERVER │ │ MCP SERVER │ │ MCP SERVER │ │
│ │ (GitHub) │ │ (Database) │ │ (Slack) │ │
│ │ │ │ │ │ │ │
│ │ Tools: │ │ Tools: │ │ Tools: │ │
│ │ - list_repos │ │ - query │ │ - send_msg │ │
│ │ - create_pr │ │ - insert │ │ - read_channel│ │
│ │ - get_issues │ │ - schema │ │ - list_users │ │
│ │ │ │ │ │ │ │
│ │ Resources: │ │ Resources: │ │ Resources: │ │
│ │ - repo files │ │ - table data │ │ - messages │ │
│ └───────────────┘ └───────────────┘ └───────────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ │
│ │ GitHub API │ │ PostgreSQL │ │ Slack API │ │
│ └───────────────┘ └───────────────┘ └───────────────┘ │
│ │
│ BENEFITS: │
│ - Universal interface: AI doesn't need to learn each API │
│ - Security: MCP servers control what AI can access │
│ - Extensibility: Add any tool by writing an MCP server │
│ - Portability: Same servers work with any MCP-compatible AI │
│ │
└──────────────────────────────────────────────────────────────────────────────┘
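On the client side, Claude Code picks up project-scoped servers from a `.mcp.json` file at the repository root. A minimal example (the server name and path are illustrative; Project 4 builds the server itself):

```json
{
  "mcpServers": {
    "internal-tools": {
      "command": "node",
      "args": ["./my-mcp-server/dist/index.js"]
    }
  }
}
```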

Concept Summary Table
| Concept | What You Must Internalize |
|---|---|
| Probabilistic Generation | AI predicts likely code, not correct code. Verification is mandatory. |
| Context Quality | Better context = better output. CLAUDE.md is your most powerful lever. |
| Agentic Tools | Claude Code/Codex can read, write, execute. They act, not just advise. |
| Trust Calibration | Match your verification effort to your confidence gap with AI. |
| MCP Protocol | Standard interface for connecting AI to external tools and data. |
| Token Economics | Context has a budget. Prioritize relevant over comprehensive. |
| Hallucination Risk | AI confidently generates plausible-but-wrong code. Always verify. |
Deep Dive Reading by Concept
Understanding AI Coding Assistants
| Concept | Book & Chapter | Why This Matters |
|---|---|---|
| LLM fundamentals | “Deep Learning” by Goodfellow et al. - Ch. 10: Sequence Modeling | Understand why LLMs predict patterns, not truth |
| Prompt engineering | “Prompt Engineering Guide” by DAIR.AI - All chapters | Master the art of getting useful AI output |
| Code quality judgment | “Clean Code” by Robert Martin - Ch. 1-5 | Develop the taste to evaluate AI-generated code |
Working with AI Tools
| Concept | Book & Chapter | Why This Matters |
|---|---|---|
| Agentic systems | “Designing Data-Intensive Applications” by Kleppmann - Ch. 1 | Understand distributed system principles that apply to AI agents |
| Tool automation | “The Pragmatic Programmer” by Hunt & Thomas - Ch. 3: Basic Tools | Philosophy of leveraging tools effectively |
| Version control | “Pro Git” by Chacon - Ch. 2-3 | Essential for managing AI-generated changes |
Safety and Quality
| Concept | Book & Chapter | Why This Matters |
|---|---|---|
| Code review | “Code Complete” by McConnell - Ch. 21: Collaborative Development | Why review matters even for AI code |
| Security mindset | “The Web Application Hacker’s Handbook” - Ch. 1-2 | AI generates vulnerable code; learn to catch it |
| Testing strategy | “Working Effectively with Legacy Code” by Feathers - Ch. 2-4 | How to safely change code you don’t fully understand |
Quick Start: Your First 48 Hours
Feeling overwhelmed? Start here:
Day 1 (4 hours):
- Install Claude Code:
`npm install -g @anthropic-ai/claude-code`
- Run `claude` in a test project and use `/init` to generate CLAUDE.md
- Edit the generated CLAUDE.md with your project specifics
- Ask Claude: “What files are in this project? Explain the architecture.”
- Ask Claude: “Add a simple health check endpoint” (for a web project)
- Review the diff before accepting - What did it get right? Wrong?
Day 2 (4 hours):
- Start Project 1 (Understanding AI Coding Assistants)
- Install Codex:
`npm install -g @openai/codex`
- Try the same tasks in both tools
- Note differences in approach, output quality, tool usage
- Begin documenting your personal CLAUDE.md best practices
End of Weekend: You understand the basic workflow: prompt → context → generation → review → iteration. You can articulate when AI helps vs. when it misleads.
Next Steps:
- If comfortable: Continue to Projects 2-5
- If struggling: Re-read Core Concepts, focus on understanding before doing
Recommended Learning Paths
Path A: Developer New to AI Tools (8-12 weeks)
Week 1-2: Projects 1-3 (Foundations)
Week 3-4: Projects 4-6 (Tool Setup & Context)
Week 5-6: Projects 7-9 (Core Workflows)
Week 7-8: Projects 10-12 (Quality & Safety)
Week 9-10: Projects 13-15 (Advanced Techniques)
Week 11-12: Projects 16-18 (Integration & Mastery)
Path B: AI Tool User Wanting Mastery (4-6 weeks)
Week 1: Skim Projects 1-3, deep-dive Projects 4-6
Week 2-3: Projects 7-12 (Core competencies)
Week 4-5: Projects 13-17 (Advanced techniques)
Week 6: Projects 18-20 (Tool comparison & best practices)
Path C: Team Lead Evaluating AI Tools (2-3 weeks)
Week 1: Projects 1, 4, 10 (Fundamentals, setup, security)
Week 2: Projects 14, 15, 19 (CI/CD, collaboration, comparison)
Week 3: Project 20 (Best practices & anti-patterns)
Projects
Project 1: Understanding AI Coding Assistants
What You’ll Build: A systematic evaluation framework that measures AI coding assistant capabilities across dimensions: code generation accuracy, context understanding, tool usage, and hallucination detection.
Why This Teaches Vibe Coding: Before you can “vibe” with AI, you must understand what it actually does. Most developers anthropomorphize AI (“it understands,” “it knows”), leading to misplaced trust. This project forces you to see AI as a probabilistic system.
Core Challenges:
- Design test cases that reveal AI limitations (edge cases, novel problems, ambiguous specs)
- Measure accuracy quantitatively (% correct, % hallucinated, % incomplete)
- Identify patterns in when AI succeeds vs. fails
- Document your findings in a reusable evaluation rubric
Real World Outcome:
$ python eval_framework.py --tool claude-code --suite basic
AI Coding Assistant Evaluation Report
=====================================
Tool: Claude Code (claude-sonnet-4-20250514)
Test Suite: basic (25 tests)
Date: 2025-01-15
RESULTS BY CATEGORY:
                          Correct  Partial  Wrong  Hallucinated
Simple functions          8/8      0        0      0
Edge cases                3/6      2        1      0
Algorithm implementation  4/5      1        0      0
Security-sensitive code   1/4      1        1      1
Novel/unusual patterns    1/2      0        1      0
OVERALL ACCURACY: 68% correct, 16% partial, 12% wrong, 4% hallucinated
KEY FINDINGS:
- Excels at: Common patterns, standard library usage, boilerplate
- Struggles with: Crypto, complex algorithms, edge cases
- Hallucination triggers: Obscure libraries, recent API changes
RECOMMENDATIONS:
- Always verify: Security code, algorithms, API integrations
- Trust with verification: CRUD operations, formatting, refactoring
- Manual review required: Anything with novel patterns
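The report’s arithmetic is just categorical tallying. A minimal sketch of the scoring core (the CLI above happens to be a Python script; the TypeScript types and shapes here are illustrative):

```typescript
type Verdict = 'correct' | 'partial' | 'wrong' | 'hallucinated';
interface TestResult { category: string; verdict: Verdict; }

// Tally verdicts per category and overall percentages - the bookkeeping
// behind the report above.
function summarize(results: TestResult[]) {
  const byCategory = new Map<string, Record<Verdict, number>>();
  for (const r of results) {
    const row = byCategory.get(r.category) ??
      { correct: 0, partial: 0, wrong: 0, hallucinated: 0 };
    row[r.verdict]++;
    byCategory.set(r.category, row);
  }
  const pct = (v: Verdict) =>
    Math.round((100 * results.filter((r) => r.verdict === v).length) / results.length);
  return {
    byCategory,
    overall: {
      correct: pct('correct'), partial: pct('partial'),
      wrong: pct('wrong'), hallucinated: pct('hallucinated'),
    },
  };
}
```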
Key Concepts:
| Concept | Book Reference |
|---|---|
| Evaluation methodology | “Effective Testing” by Mauricio Aniche - Ch. 1 |
| Statistical thinking | “Thinking, Fast and Slow” by Kahneman - Part 3 |
| Software metrics | “Software Engineering at Google” - Ch. 8 |
| Difficulty: Beginner | Time: 8-12 hours |
Prerequisites:
- Basic Python or JavaScript
- Understanding of what “correct code” means in your domain
The Interview Question They’ll Ask: “How do you evaluate whether AI-generated code is production-ready?”
Project 2: Effective Prompting for Code Generation
What You’ll Build: A prompt engineering toolkit with templates, anti-patterns, and A/B testing capabilities for optimizing AI coding interactions.
Why This Teaches Vibe Coding: Prompt quality determines output quality. Most developers prompt like they’re talking to a junior developer (“make a function that…”). Learning to prompt effectively is the highest-leverage skill in vibe coding.
Core Challenges:
- Compare naive prompts vs. structured prompts (measure quality difference)
- Build a library of reusable prompt templates for common tasks
- Implement A/B testing to measure prompt effectiveness
- Document anti-patterns that consistently produce poor results
Real World Outcome:
$ prompt-toolkit compare --task "email validation"
PROMPT A (Naive):
"Create a function to validate emails"
PROMPT B (Structured):
"Create an email validation function with:
- Input: string email
- Output: tuple (bool is_valid, string error_message)
- Handle edge cases: empty string, no @, multiple @, invalid TLD
- Use RFC 5322 spec, not just regex
- Include docstring with examples
- Return specific error messages, not just true/false"
RESULTS (10 trials each):
                     Prompt A   Prompt B
Correct first try    30%        80%
Handled edge cases   10%        70%
Had tests included   0%         60%
Production ready     0%         50%
Average iterations   4.2        1.3
LESSON: Specific prompts with constraints dramatically improve first-try accuracy.
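Prompt B’s structure can be captured as a reusable template so you never fall back to Prompt A under deadline pressure. A sketch (the field names are illustrative, not a standard):

```typescript
interface CodePromptSpec {
  task: string;
  input: string;
  output: string;
  edgeCases: string[];
  constraints: string[];
}

// Render a structured prompt in the shape of Prompt B above
function buildPrompt(spec: CodePromptSpec): string {
  return [
    `Create ${spec.task} with:`,
    `- Input: ${spec.input}`,
    `- Output: ${spec.output}`,
    `- Handle edge cases: ${spec.edgeCases.join(', ')}`,
    ...spec.constraints.map((c) => `- ${c}`),
  ].join('\n');
}

console.log(buildPrompt({
  task: 'an email validation function',
  input: 'string email',
  output: 'tuple (bool is_valid, string error_message)',
  edgeCases: ['empty string', 'no @', 'multiple @', 'invalid TLD'],
  constraints: ['Use RFC 5322 spec, not just regex', 'Include docstring with examples'],
}));
```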
Key Concepts:
| Concept | Book Reference |
|---|---|
| Prompt engineering | DAIR.AI Prompt Engineering Guide - All |
| Specification writing | “Writing Effective Use Cases” by Cockburn - Ch. 1-3 |
| A/B testing | “Trustworthy Online Controlled Experiments” - Ch. 1 |
| Difficulty: Beginner | Time: 6-10 hours |
Project 3: Claude Code Setup and CLAUDE.md Configuration
What You’ll Build: A production-ready Claude Code environment with optimized CLAUDE.md files at global, project, and directory levels, plus custom slash commands.
Why This Teaches Vibe Coding: CLAUDE.md is your primary lever for improving AI output. A well-configured CLAUDE.md turns Claude from a generic assistant into a domain expert for your specific codebase. This is where vibe coding becomes powerful.
Core Challenges:
- Configure global ~/.claude/CLAUDE.md for personal preferences
- Create project CLAUDE.md with architecture, testing, and style info
- Build custom slash commands for repeated workflows
- Set up directory-specific CLAUDE.md for monorepo patterns
Real World Outcome:
# ~/.claude/CLAUDE.md
I prefer:
- TypeScript over JavaScript
- Functional patterns over OOP
- Explicit types over inference
- Tests alongside implementation
# /project/CLAUDE.md
# E-Commerce Platform
## Architecture
- Frontend: Next.js 14 with App Router
- Backend: tRPC with Prisma
- Database: PostgreSQL
- Auth: NextAuth.js
## Key Files
- src/server/routers/*.ts - API routes
- src/app/(shop)/* - Customer-facing pages
- prisma/schema.prisma - Database schema
## Commands
- `npm run dev` - Start development
- `npm test` - Run tests
- `npm run db:push` - Push schema changes
## Style
- Use named exports, not default
- Prefer async/await over .then()
- All API routes need rate limiting
## Testing
- Tests go in __tests__ next to source
- Use @testing-library/react for components
- Mock Prisma in tests (see src/test/setup.ts)
$ claude "Add a product review feature"
[Claude reads CLAUDE.md, understands architecture]
I'll add a product review feature. Based on your architecture:
1. First, I'll add the Review model to prisma/schema.prisma
2. Create src/server/routers/review.ts for the API
3. Add src/app/(shop)/products/[id]/reviews/* for UI
4. Include tests in __tests__/reviews.test.ts
Let me start with the schema...
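For the custom slash command challenge: commands are plain Markdown files under `.claude/commands/`, where the filename becomes the command name and `$ARGUMENTS` is substituted with whatever you type after the command. A small example (the path and content are illustrative):

```markdown
<!-- .claude/commands/review-security.md -> invoked as /review-security <path> -->
Review $ARGUMENTS for security issues. Check specifically for:
- SQL injection (unparameterized queries)
- Missing rate limiting on API routes (required by our style guide)
- Secrets or tokens committed in code
Report findings with file:line references and a severity for each.
```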
Key Concepts:
| Concept | Book Reference |
|---|---|
| Project documentation | “Docs for Developers” by Bhatti et al. - Ch. 3-4 |
| Configuration management | “The Pragmatic Programmer” - Ch. 6 |
| Developer experience | “Developer Experience” by Lawson - Ch. 1-2 |
| Difficulty: Beginner | Time: 4-8 hours |
Project 4: Context Management and MCP Servers
What You’ll Build: A custom MCP server that connects Claude Code to your team’s internal tools (ticketing system, documentation, deployment status), plus strategies for managing context window limits.
Why This Teaches Vibe Coding: Real-world vibe coding requires connecting AI to your actual development environment. MCP lets Claude access databases, APIs, and tools beyond just reading files. This transforms Claude from a code generator into a development partner.
Core Challenges:
- Understand MCP protocol basics (tools, resources, prompts)
- Build a simple MCP server that exposes custom tools
- Connect Claude Code to your server via .mcp.json (or `claude mcp add`)
- Manage context effectively when working with large codebases
Real World Outcome:
# Custom MCP server for internal tools
$ cat my-mcp-server/index.ts
import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';

const server = new Server(
  { name: 'internal-tools', version: '1.0.0' },
  { capabilities: { tools: {} } }
);

// Tool: Get current sprint tickets from Jira
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: 'get_sprint_tickets',
      description: 'Get all tickets in current sprint',
      inputSchema: {
        type: 'object',
        properties: {
          assignee: { type: 'string', description: 'Filter by assignee' }
        }
      }
    },
    {
      name: 'check_deployment_status',
      description: 'Check if a service is deployed and healthy',
      inputSchema: {
        type: 'object',
        properties: {
          service: { type: 'string', description: 'Service name' }
        },
        required: ['service']
      }
    }
  ]
}));

// A matching CallToolRequestSchema handler dispatches the actual calls
// to your Jira/deployment APIs (omitted for brevity).

// Expose the server over stdio so Claude Code can launch it
const transport = new StdioServerTransport();
await server.connect(transport);
# In Claude Code:
$ claude "What tickets are assigned to me in the current sprint?"
[Claude uses get_sprint_tickets tool]
Based on Jira, you have 3 tickets in the current sprint:
1. PROJ-123: Add product review feature (In Progress)
2. PROJ-124: Fix checkout race condition (To Do)
3. PROJ-125: Update API documentation (To Do)
Want me to start working on any of these?
$ claude "Check if the payments service is healthy"
[Claude uses check_deployment_status tool]
Payments service status:
- Production: HEALTHY (last deploy: 2h ago)
- Staging: DEGRADED (high latency: 2.3s avg)
- Dev: HEALTHY
Staging shows degraded performance. Want me to investigate the recent changes?
Key Concepts:
| Concept | Book Reference |
|---|---|
| Protocol design | “Designing Web APIs” by Brenda Jin - Ch. 3-4 |
| Integration patterns | “Enterprise Integration Patterns” by Hohpe - Ch. 1-3 |
| Context management | Claude Code documentation - Context section |
| Difficulty: Intermediate | Time: 12-16 hours |
Project 5: Multi-File Refactoring with AI
What You’ll Build: A systematic approach to large-scale refactoring using AI, including extraction, rename, and restructure operations across dozens of files while maintaining correctness.
Why This Teaches Vibe Coding: Refactoring is where vibe coding shines. AI can see patterns across your entire codebase and apply consistent transformations. But it can also introduce subtle bugs. This project teaches you to leverage AI’s breadth while maintaining correctness.
Core Challenges:
- Refactor a module extraction across 10+ files
- Rename a core concept throughout the codebase
- Restructure code organization (flat to nested, or vice versa)
- Verify correctness through tests at each step
Real World Outcome:
$ claude "Extract all authentication logic into a separate auth module"
[Claude analyzes codebase]
I found authentication logic scattered across:
- src/api/routes/users.ts (login, register, verify)
- src/api/routes/sessions.ts (session management)
- src/middleware/auth.ts (JWT verification)
- src/utils/password.ts (hashing)
- src/types/user.ts (auth-related types)
Proposed new structure:
src/auth/
├── index.ts (public API)
├── login.ts (login flow)
├── register.ts (registration flow)
├── session.ts (session management)
├── middleware.ts (Express middleware)
├── jwt.ts (token operations)
├── password.ts (hashing)
└── types.ts (auth types)
This will affect 23 files that import from the old locations.
Step 1: Create new module structure
Step 2: Move logic with updated imports
Step 3: Update all importing files
Step 4: Run tests after each major step
Shall I proceed? I'll pause for your review after each step.
Key Concepts:
| Concept | Book Reference |
|---|---|
| Refactoring patterns | “Refactoring” by Martin Fowler - All |
| Safe refactoring | “Working Effectively with Legacy Code” - Ch. 8-10 |
| Module design | “A Philosophy of Software Design” - Ch. 4-5 |
| Difficulty: Intermediate | Time: 10-15 hours |
Project 6: Test-Driven Development with AI
What You’ll Build: A TDD workflow where AI helps write tests first, then implements code to pass them, with feedback loops that catch AI-generated bugs before they reach production.
Why This Teaches Vibe Coding: TDD is the safest way to vibe code. When tests come first, AI has clear success criteria. When tests pass, you have some confidence the code works. This project combines the speed of AI with the safety of TDD.
Core Challenges:
- Prompt AI to write tests before implementation
- Have AI implement code to pass the tests
- Iterate when tests fail (without AI just deleting the failing tests)
- Add property-based tests for edge case coverage
Real World Outcome:
$ claude "I need a shopping cart module. Start with tests."
[Claude writes tests first]
I'll create comprehensive tests first. Here's what the shopping cart should do:
// src/__tests__/cart.test.ts
import { ShoppingCart } from '../cart'; // illustrative path

describe('ShoppingCart', () => {
  describe('addItem', () => {
    it('should add item to empty cart', () => {
      const cart = new ShoppingCart();
      cart.addItem({ id: '1', name: 'Widget', price: 9.99 });
      expect(cart.items).toHaveLength(1);
    });
    it('should increase quantity for existing item', () => {
      const cart = new ShoppingCart();
      cart.addItem({ id: '1', name: 'Widget', price: 9.99 });
      cart.addItem({ id: '1', name: 'Widget', price: 9.99 });
      expect(cart.items).toHaveLength(1);
      expect(cart.items[0].quantity).toBe(2);
    });
    it('should reject negative prices', () => {
      const cart = new ShoppingCart();
      expect(() => cart.addItem({ id: '1', name: 'Widget', price: -5 }))
        .toThrow('Price must be positive');
    });
    // ... 15 more tests
  });
  describe('calculateTotal', () => {
    it('should sum all items with quantities', () => { /* ... */ });
    it('should apply percentage discounts', () => { /* ... */ });
    it('should handle floating point correctly', () => {
      // Test: $1.10 + $1.10 + $1.10 should equal $3.30, not $3.3000000000000003
    });
  });
});
$ npm test
# 18 tests fail (no implementation yet)
$ claude "Now implement the cart to pass all tests"
[Claude implements]
# 18 tests pass
$ claude "Add property-based tests to find edge cases"
[Claude adds fast-check tests]
// Finds bug: cart breaks with very large quantities (integer overflow)
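That last step might look like this with fast-check (a sketch; `ShoppingCart` and its import path follow the tests above):

```typescript
import fc from 'fast-check';
import { ShoppingCart } from '../cart'; // illustrative path

// Property: for ANY list of valid prices, the total matches a sum computed
// in integer cents - this is how precision and overflow bugs surface.
test('total matches cent-accurate sum for any prices', () => {
  fc.assert(
    fc.property(
      fc.array(fc.integer({ min: 1, max: 10_000_000 }), { minLength: 1, maxLength: 100 }),
      (cents) => {
        const cart = new ShoppingCart();
        cents.forEach((c, i) =>
          cart.addItem({ id: String(i), name: `Item ${i}`, price: c / 100 })
        );
        const expected = cents.reduce((a, b) => a + b, 0) / 100;
        expect(cart.calculateTotal()).toBeCloseTo(expected, 2);
      }
    )
  );
});
```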
Key Concepts:
| Concept | Book Reference |
|---|---|
| TDD fundamentals | “Test Driven Development” by Kent Beck - All |
| Property-based testing | “Property-Based Testing with PropEr” - Ch. 1-3 |
| Testing strategies | “Unit Testing Principles” by Khorikov - Ch. 1-4 |
| Difficulty: Intermediate | Time: 10-15 hours |
Project 7: Code Review with AI Assistance
What You’ll Build: An AI-augmented code review workflow that uses Claude to pre-review PRs, identify potential issues, and suggest improvements before human reviewers spend time.
Why This Teaches Vibe Coding: Even vibe-coded code needs review. AI can catch common issues (security, performance, style) faster than humans. But AI also misses context-specific problems. This project teaches you to use AI as a first-pass filter, not a replacement for human judgment.
Core Challenges:
- Feed PR diffs to Claude with project context
- Generate structured review comments with severity ratings
- Distinguish AI-catchable issues from context-dependent ones
- Integrate with GitHub/GitLab PR workflow
Real World Outcome:
$ claude review-pr --pr-url https://github.com/org/repo/pull/123
PR Review: Add user authentication
============================================================
SECURITY ISSUES (2 found):
[HIGH] src/routes/auth.ts:45
Password stored without hashing. Use bcrypt with cost factor >= 12.
[MEDIUM] src/routes/auth.ts:78
JWT secret loaded from process.env without validation.
Risk: Server crashes in production if missing.
PERFORMANCE ISSUES (1 found):
[LOW] src/middleware/auth.ts:23
Regex email validation runs on every request.
Consider caching valid emails or using simpler check.
STYLE ISSUES (3 found):
[INFO] Multiple files
Inconsistent async/await vs .then() usage. Project prefers async/await.
POTENTIAL BUGS (1 found):
[MEDIUM] src/services/user.ts:89
Race condition: Check-then-create without transaction.
Two simultaneous registrations with same email could succeed.
CONTEXT-DEPENDENT (Human review needed):
- Business logic in validateSubscription() - I can't verify correctness
- Error messages may leak internal structure (security review needed)
- Database migration compatibility not verified
============================================================
Auto-fix available for: 2 issues
Run: claude review-pr --fix to apply safe fixes
Key Concepts:
| Concept | Book Reference |
|---|---|
| Code review practices | “Code Complete” by McConnell - Ch. 21 |
| Security review | “The Art of Software Security Assessment” - Ch. 1-2 |
| PR workflow | “Software Engineering at Google” - Ch. 9 |
| Difficulty: Intermediate | Time: 12-18 hours |
Project 8: Debugging with AI
What You’ll Build: A systematic debugging methodology that uses AI to analyze error logs, reproduce issues, identify root causes, and propose fixes - while teaching you to verify AI’s debugging conclusions.
Why This Teaches Vibe Coding: AI is surprisingly good at debugging - it’s seen millions of error messages and their solutions. But AI can also send you on wild goose chases. This project teaches you to use AI as a debugging partner while maintaining your own hypothesis-testing discipline.
Core Challenges:
- Feed error context (logs, stack traces, recent changes) to AI
- Evaluate AI-proposed root causes critically
- Use AI to generate hypotheses, but verify experimentally
- Document debugging sessions for future learning
Real World Outcome:
$ claude debug --error "Error: ECONNREFUSED 127.0.0.1:5432"
Analyzing error context...
ERROR ANALYSIS:
Type: Connection refused to PostgreSQL
Location: src/db/connection.ts:15 (pool.connect())
When: Startup, after recent deployment
HYPOTHESES (ordered by likelihood):
1. [HIGH] PostgreSQL service not running
- Verify: `docker ps | grep postgres` or `systemctl status postgresql`
- Recent change: docker-compose.yml modified 2h ago
2. [MEDIUM] Port conflict
- Verify: `lsof -i :5432`
- Another service may have claimed the port
3. [MEDIUM] Docker network issue
- Verify: `docker network inspect bridge`
- App container may be on different network than DB
4. [LOW] Firewall blocking
- Verify: `iptables -L | grep 5432`
- Unlikely on local development
RECOMMENDED STEPS:
1. $ docker ps | grep postgres
Expected: Running postgres container
If missing: $ docker-compose up -d db
2. $ docker logs postgres-container
Look for: "ready to accept connections"
Or: Error messages indicating startup failure
Want me to run these diagnostics?
$ claude debug --run-diagnostics
[Running docker ps]
No postgres container found.
[Checking docker-compose.yml]
Found issue: Line 23 has typo "postgress" instead of "postgres" in service name
ROOT CAUSE IDENTIFIED:
Recent commit a1b2c3d introduced typo in docker-compose.yml,
preventing database container from starting.
FIX:
- docker-compose.yml line 23: postgress -> postgres
Would you like me to apply this fix?
Key Concepts:
| Concept | Book Reference |
|---|---|
| Debugging methodology | “Debugging” by David Agans - All |
| Root cause analysis | “The Field Guide to Understanding Human Error” - Ch. 3 |
| System debugging | “Systems Performance” by Brendan Gregg - Ch. 2 |
| Difficulty: Intermediate | Time: 10-15 hours |
Project 9: Architecture Design Conversations
What You’ll Build: A structured approach to using AI as an architecture sounding board - discussing system design, evaluating tradeoffs, and documenting decisions - while recognizing AI’s limitations in understanding business context.
Why This Teaches Vibe Coding: Architecture is where AI is both most helpful and most dangerous. AI can suggest patterns, identify risks, and explore alternatives faster than any human. But AI lacks business context and may confidently recommend inappropriate solutions. This project teaches architectural dialogue with AI.
Core Challenges:
- Describe a complex system requirement to AI and evaluate proposed architectures
- Have AI challenge your design decisions (adversarial review)
- Use AI to generate ADRs (Architecture Decision Records)
- Recognize when AI suggestions don’t fit your constraints
Real World Outcome:
$ claude architect "We need to add real-time notifications to our e-commerce platform.
Current stack: Next.js, PostgreSQL, Redis, deployed on Vercel."
UNDERSTANDING YOUR CONSTRAINTS:
- Vercel: No persistent WebSocket connections (functions timeout at 30s)
- Current scale: ~10k DAU (inferred from Redis setup)
- Existing Redis: Can leverage for pub/sub
ARCHITECTURE OPTIONS:
Option 1: Third-party service (Pusher, Ably)
Pros: Simple integration, handles scaling, works with Vercel
Cons: Additional cost ($50-500/mo), vendor dependency
Best for: Quick ship, scale later
Option 2: Serverless WebSockets (AWS API Gateway)
Pros: Pay-per-use, scalable, full control
Cons: Requires AWS, more complex deployment
Best for: AWS-native teams, high volume
Option 3: Server-Sent Events (SSE) with Edge Functions
Pros: Works on Vercel Edge, simpler than WebSocket
Cons: One-way only (client receives), reconnection handling
Best for: Notification-only (no chat/collaboration)
Option 4: Long-polling fallback
Pros: Works everywhere, simplest implementation
Cons: Higher latency (1-10s), more requests
Best for: MVP, low-frequency notifications
MY RECOMMENDATION: Option 3 (SSE on Edge)
Reasoning: Fits Vercel constraints, notifications are one-way,
Redis pub/sub handles the backend. Migrate to Option 2 if
you outgrow Vercel or need bidirectional.
QUESTIONS I CAN'T ANSWER (need your context):
- Notification volume per user? (affects cost model)
- Latency requirements? (SSE has ~1s delay)
- Future needs for bidirectional communication?
Want me to:
1. Detail the SSE implementation?
2. Evaluate another option?
3. Generate an ADR for this decision?
$ claude architect --generate-adr 3
# ADR-007: Real-time Notifications via Server-Sent Events
## Status: Proposed
## Context
Our e-commerce platform requires real-time notifications for:
- Order status updates
- Inventory alerts for wishlist items
- Flash sale announcements
## Decision
Implement Server-Sent Events (SSE) via Vercel Edge Functions with Redis pub/sub backend.
## Consequences
### Positive
- No additional infrastructure (uses existing Redis)
- Compatible with Vercel's serverless model
- Lower cost than third-party at current scale
### Negative
- One-way communication only
- Must handle reconnection logic in client
- May require migration if bidirectional needed later
## Alternatives Considered
[Details of Options 1, 2, 4 with rejection reasoning]
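If you go with the recommendation, the core of Option 3 fits in one route handler. A sketch as a Next.js App Router handler with the Redis subscription stubbed out (not production reconnection-handling code):

```typescript
// app/api/notifications/route.ts - SSE endpoint sketch
export async function GET(): Promise<Response> {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    start(controller) {
      // Real version: subscribe to a Redis pub/sub channel here and
      // enqueue one "data:" frame per notification for this user.
      controller.enqueue(encoder.encode('data: {"type":"connected"}\n\n'));
    },
  });
  return new Response(stream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive',
    },
  });
}
// Client side: new EventSource('/api/notifications'), plus reconnect handling.
```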
Key Concepts:
| Concept | Book Reference |
|---|---|
| System design | “Designing Data-Intensive Applications” - All |
| Architecture patterns | “Fundamentals of Software Architecture” - Ch. 9-15 |
| ADRs | “Documenting Software Architectures” - Ch. 6 |
| Difficulty: Advanced | Time: 15-20 hours |
Project 10: Managing AI Hallucinations
What You’ll Build: A hallucination detection and prevention system that identifies when AI generates confidently-wrong information, with verification strategies and red flags to watch for.
Why This Teaches Vibe Coding: Hallucinations are the dark side of vibe coding. AI will confidently generate code that uses non-existent APIs, outdated syntax, or subtly incorrect logic. Andrej Karpathy himself noted that pure vibe coding failed him on complex projects. This project teaches you to catch hallucinations before they become bugs.
Core Challenges:
- Identify common hallucination patterns (fake APIs, wrong signatures, imagined features)
- Build verification workflows for AI-generated code
- Create a “red flag” checklist for high-hallucination-risk situations
- Document hallucinations you encounter for team learning
Real World Outcome:
HALLUCINATION DETECTION CHECKLIST
Before accepting AI-generated code, verify:
[ ] API CALLS
- Does this library/function actually exist?
- Is this the correct method signature?
- Are parameter names/types correct?
Quick check: Search npm/pypi, check official docs
[ ] LANGUAGE FEATURES
- Is this syntax valid for your version?
- Does this feature exist, or is it from a proposal/future version?
Quick check: MDN, official language docs, run in REPL
[ ] FRAMEWORK PATTERNS
- Is this the current way to do this?
- Or is this from an older version/deprecated?
Quick check: Framework changelog, migration guides
[ ] CONFIGURATION
- Are these actual config options?
- Do these environment variables exist?
Quick check: Tool documentation, --help output
RED FLAGS (High hallucination risk):
! AI mentions a "new feature" without version number
! API doesn't appear in official documentation
! Syntax feels unusual or overly clever
! AI "remembers" a library you've never used
! Generated tests pass but don't actually test the behavior
! AI claims something is "built-in" that you've never heard of
COMMON HALLUCINATION PATTERNS:
1. Inventing npm packages (always verify with `npm view <package>`)
2. Mixing React class and function component syntax
3. Using Python 3.10+ features in Python 3.8 projects
4. Imagining REST API endpoints that don't exist
5. Generating config options that look plausible but aren't real
VERIFICATION WORKFLOW:
1. Read the code (don't just accept)
2. Check every import - does this package exist?
3. Check every API call - is this the actual signature?
4. Run the code - does it actually work?
5. Write a test - does it do what it claims?
$ claude verify-hallucinations ./generated_code.ts
Checking for potential hallucinations...
VERIFIED (3):
+ lodash.debounce - Exists, signature correct
+ express.Router - Exists, usage correct
+ process.env.DATABASE_URL - Standard pattern
WARNINGS (2):
? @prisma/client.upsertMany - Method doesn't exist
Actual method: prisma.model.upsert() (singular)
AI may have confused with MongoDB's bulkWrite
? zod.string().email().domain() - .domain() not a zod method
Actual: Use .refine() with custom validator
LIKELY HALLUCINATIONS (1):
! import { rateLimit } from 'express' - Doesn't exist
express doesn't have built-in rate limiting
Correct: npm install express-rate-limit
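Pattern 1 from the list above is cheap to automate. A sketch that shells out to `npm view` for each suspect import (the second package name is deliberately made up):

```typescript
import { execFileSync } from 'node:child_process';

// `npm view <pkg> version` exits non-zero if the package doesn't exist
function packageExists(name: string): boolean {
  try {
    execFileSync('npm', ['view', name, 'version'], { stdio: 'pipe' });
    return true;
  } catch {
    return false;
  }
}

for (const pkg of ['express-rate-limit', 'express-builtin-rate-limit']) {
  console.log(pkg, packageExists(pkg) ? 'exists' : 'NOT on npm - possible hallucination');
}
```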
Key Concepts:
| Concept | Book Reference |
|---|---|
| Critical thinking | “Thinking, Fast and Slow” by Kahneman - Part 1-2 |
| Verification methods | “The Checklist Manifesto” by Gawande - Ch. 3-4 |
| Error detection | “An Introduction to Error Analysis” - Ch. 1 |
| Difficulty: Intermediate | Time: 8-12 hours |
Project 11: Version Control with AI Changes
What You’ll Build: A Git workflow optimized for AI-assisted development, with strategies for reviewing AI-generated diffs, maintaining meaningful commit history, and managing AI-introduced technical debt.
Why This Teaches Vibe Coding: AI generates code fast - sometimes too fast. Without discipline, you end up with incomprehensible diffs, meaningless commit messages, and hidden technical debt. This project teaches you to maintain code quality and history integrity while still moving fast.
Core Challenges:
- Review AI-generated diffs effectively (what to look for, what to skip)
- Write meaningful commit messages for AI-assisted work
- Stage AI changes incrementally (not “accept all”)
- Track AI-introduced technical debt for later cleanup
Real World Outcome:
AI-ASSISTED GIT WORKFLOW
BEFORE ACCEPTING AI CHANGES:
$ git diff --staged # or use Claude's diff output
REVIEW CHECKLIST:
[ ] Understand every changed file's purpose
[ ] Check for unintended deletions
[ ] Verify import changes make sense
[ ] Look for debug code that should be removed
[ ] Confirm tests were updated appropriately
COMMIT STRATEGY:
Bad: "AI generated code"
Bad: "Add feature" (too vague)
Good: "feat: Add product review system
- Add Review model with Prisma schema
- Implement create/read/delete API routes
- Add ReviewCard component with star rating
- Include tests for review validation
Generated with Claude Code assistance.
Areas needing human review: rating calculation logic"
TRACKING AI CONTRIBUTIONS:
# In commit messages
Co-authored-by: Claude <assistant@anthropic.com>
# Or use conventional commits
feat(ai-assisted): Add notification system
# For technical debt tracking
# TODO(ai-review): Claude generated this regex, needs validation
# FIXME(ai-debt): This caching strategy was AI-suggested, may need optimization
AI CHANGE REVIEW PATTERNS:
1. ACCEPT: Boilerplate, formatting, imports
2. REVIEW: Business logic, security, algorithms
3. VERIFY: Tests (do they actually test the behavior?)
4. REJECT: Unexplained complexity, magic numbers
$ claude commit --review
Changes staged for commit:
M src/api/reviews.ts
M src/components/ReviewCard.tsx
A src/models/review.ts
A src/__tests__/reviews.test.ts
M prisma/schema.prisma
CHANGE SUMMARY:
+ 245 lines added
- 12 lines removed
- 5 files modified
AI-GENERATED AREAS (flagged for review):
- src/api/reviews.ts:45-67 - Aggregation query
- src/models/review.ts:23-34 - Validation logic
SUGGESTED COMMIT MESSAGE:
feat(reviews): Add product review system
- Add Review model with star rating (1-5)
- Implement CRUD API endpoints
- Add ReviewCard component
- Include integration tests
AI-assisted implementation. Review notes:
- Rating aggregation uses simple average
- TODO: Add weighted rating by verified purchasers
Accept this commit message? [Y/n/edit]
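The TODO(ai-review)/FIXME(ai-debt) markers only pay off if something surfaces them. A small scanner sketch you could run in CI (assumes the npm `glob` package; the marker format follows the convention above):

```typescript
import { readFileSync } from 'node:fs';
import { globSync } from 'glob'; // assumes the npm "glob" package

// Surface AI-debt markers so they don't rot silently in the codebase
const marker = /(TODO|FIXME)\(ai-(review|debt)\):\s*(.+)/;

for (const file of globSync('src/**/*.{ts,tsx}')) {
  readFileSync(file, 'utf8').split('\n').forEach((line, i) => {
    const m = line.match(marker);
    if (m) console.log(`${file}:${i + 1}  [ai-${m[2]}] ${m[3]}`);
  });
}
```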
Key Concepts:
| Concept | Book Reference |
|---|---|
| Git best practices | “Pro Git” by Scott Chacon - Ch. 5-7 |
| Commit hygiene | “A Philosophy of Software Design” - Ch. 15 |
| Technical debt | “Managing Technical Debt” by Kruchten - Ch. 1-3 |
| Difficulty: Intermediate | Time: 8-12 hours |
Project 12: CI/CD Integration
What You’ll Build: A CI/CD pipeline that incorporates AI assistance for code review, test generation, and documentation updates, with appropriate safeguards to prevent AI from breaking production.
Why This Teaches Vibe Coding: AI in CI/CD is powerful but risky. AI can generate tests, update docs, and review code automatically. But AI can also approve bad code, generate useless tests, or make inappropriate changes. This project teaches safe AI integration in automated pipelines.
Core Challenges:
- Use AI for PR review as a CI check
- Generate missing tests with AI (but verify coverage is meaningful)
- Auto-update documentation with AI (with human approval gate)
- Set up headless Claude for scripted automation
Real World Outcome:
# .github/workflows/ai-assist.yml
name: AI-Assisted CI
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  ai-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: AI Code Review
        run: |
          claude review-pr \
            --headless \
            --output-format github \
            --severity-threshold medium \
            --pr ${{ github.event.pull_request.number }}
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        # AI review is advisory, not blocking
        continue-on-error: true
  ai-test-suggestions:
    runs-on: ubuntu-latest
    if: contains(github.event.pull_request.labels.*.name, 'needs-tests')
    steps:
      - uses: actions/checkout@v4
      - name: Generate Test Suggestions
        run: |
          claude suggest-tests \
            --headless \
            --diff ${{ github.event.pull_request.diff_url }} \
            --output tests-suggestions.md
      - name: Comment on PR
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const suggestions = fs.readFileSync('tests-suggestions.md', 'utf8');
            github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: '## AI-Generated Test Suggestions\n\n' + suggestions +
                '\n\n*Review these suggestions. They require human verification.*'
            });
  # Human must still approve before merge
  require-approval:
    runs-on: ubuntu-latest
    needs: [ai-review, ai-test-suggestions]
    steps:
      - name: Check required reviews
        run: echo "AI review complete. Human approval still required."
$ claude headless --prompt "Review the last 5 commits for security issues" --output json
{
  "review_results": [
    {
      "commit": "a1b2c3d",
      "severity": "high",
      "issues": [
        {
          "type": "sql_injection",
          "file": "src/api/search.ts",
          "line": 45,
          "description": "User input concatenated into SQL query",
          "fix": "Use parameterized query"
        }
      ]
    }
  ],
  "summary": {
    "commits_reviewed": 5,
    "high_severity": 1,
    "medium_severity": 2,
    "low_severity": 4
  }
}
Key Concepts:
| Concept | Book Reference |
|---|---|
| CI/CD practices | “Continuous Delivery” by Humble & Farley - Ch. 5-7 |
| GitHub Actions | “Learning GitHub Actions” by Brent Laster - Ch. 4-6 |
| Automation safety | “Release It!” by Michael Nygard - Ch. 4 |
| Difficulty: Advanced | Time: 15-20 hours |
Project 13: Documentation Generation
What You’ll Build: An AI-powered documentation system that generates READMEs, API docs, inline comments, and architectural documentation from your codebase, with verification workflows to ensure accuracy.
Why This Teaches Vibe Coding: Documentation is tedious but important. AI is great at generating documentation drafts. But AI can also generate incorrect documentation that misleads future developers. This project teaches you to use AI for documentation while maintaining accuracy.
Core Challenges:
- Generate API documentation from code (verify accuracy)
- Create README files that actually reflect the codebase
- Add inline comments that explain “why” not just “what”
- Generate architecture diagrams from code analysis
Real World Outcome:
$ claude doc generate --type readme
Analyzing codebase...
README.md generated. Key sections:
- Overview (from package.json description + code analysis)
- Installation (from package.json scripts + dependencies)
- Usage examples (from tests + CLI help)
- API reference (from JSDoc + type definitions)
- Configuration (from .env.example + config files)
VERIFICATION NEEDED:
[!] Line 45: "Supports PostgreSQL 12+" - Verify minimum version
[!] Line 78: "Run `npm start` for production" - Check if correct command
[!] Line 112: API response example - Verify actual response format
Review and verify flagged sections before committing.
$ claude doc generate --type api --output docs/api.md
API Documentation generated for 23 endpoints.
VERIFIED (auto-generated from code):
+ Method signatures
+ Parameter types
+ Return types
NEEDS VERIFICATION (inferred):
? Example responses (based on return types, not actual calls)
? Error codes (based on error handling patterns)
? Rate limits (not found in code - please add or confirm N/A)
$ claude doc add-comments ./src/utils/pricing.ts
Adding "why" comments to complex logic...
Before:
const discount = quantity > 100 ? 0.15 : quantity > 50 ? 0.10 : 0.05;
After:
// Tiered volume discount: incentivize bulk orders
// - 100+ units: 15% (enterprise customers, verified in Q3 analysis)
// - 50-99 units: 10% (mid-size orders, common for agencies)
// - <50 units: 5% (base discount, customer retention)
const discount = quantity > 100 ? 0.15 : quantity > 50 ? 0.10 : 0.05;
[!] These comments are inferred. Please verify business logic accuracy.
Key Concepts:
| Concept | Book Reference |
|---|---|
| Technical writing | “Docs for Developers” by Bhatti et al. - All |
| API documentation | “The Design of Web APIs” by Arnaud Lauret - Ch. 11 |
| Code comments | “A Philosophy of Software Design” - Ch. 13 |
| Difficulty: Intermediate | Time: 10-15 hours |
Project 14: API Design with AI
What You’ll Build: An AI-assisted API design workflow covering schema design, endpoint naming, error handling, versioning strategy, and OpenAPI specification generation.
Why This Teaches Vibe Coding: API design benefits from AI’s knowledge of patterns across thousands of APIs. AI can suggest RESTful conventions, identify inconsistencies, and generate OpenAPI specs. But AI may suggest patterns that don’t fit your domain. This project teaches critical API design collaboration with AI.
Core Challenges:
- Design a RESTful API with AI as a sounding board
- Generate OpenAPI specification from requirements
- Have AI review existing API for consistency and best practices
- Design error responses and status code usage
Real World Outcome:
$ claude api design "E-commerce order management system"
UNDERSTANDING DOMAIN:
Based on your request, the core resources are:
- Orders (placed by customers)
- OrderItems (products within orders)
- Payments (associated with orders)
- Shipments (fulfillment of orders)
PROPOSED API DESIGN:
Endpoints:
POST /orders Create new order
GET /orders List orders (with filters)
GET /orders/{id} Get order details
PATCH /orders/{id} Update order (limited fields)
DELETE /orders/{id} Cancel order (soft delete)
GET /orders/{id}/items List items in order
POST /orders/{id}/items Add item to order
POST /orders/{id}/payments Process payment
GET /orders/{id}/payments Payment history
POST /orders/{id}/shipments Create shipment
GET /orders/{id}/shipments Shipment tracking
Naming conventions:
- Plural nouns for resources
- Kebab-case for multi-word paths
- IDs in path, not query params for single resource
Status codes:
- 200: Success with body
- 201: Created (POST success)
- 204: Success no content (DELETE)
- 400: Validation error (with details)
- 404: Resource not found
- 409: Conflict (e.g., already paid)
- 422: Business rule violation
QUESTIONS FOR YOU:
1. Nested vs flat? I used /orders/{id}/items. Prefer /order-items?
2. Should order items be modifiable after creation?
3. Partial vs full updates? I used PATCH. Need PUT?
$ claude api generate-openapi
# Generated openapi.yaml (preview)
openapi: 3.0.3
info:
title: Order Management API
version: 1.0.0
paths:
/orders:
post:
summary: Create a new order
operationId: createOrder
requestBody:
required: true
content:
application/json:
schema:
$ref: '#/components/schemas/CreateOrderRequest'
responses:
'201':
description: Order created
content:
application/json:
schema:
$ref: '#/components/schemas/Order'
'400':
$ref: '#/components/responses/ValidationError'
...
[Full 450-line OpenAPI spec generated]
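To ground the status-code table, here is a minimal sketch of the POST /orders contract as an Express handler. The route shape follows the proposal above; createOrder and the error payloads are hypothetical stand-ins for your service layer:
```typescript
import express from "express";

const app = express();
app.use(express.json());

// Hypothetical service-layer stub; replace with real validation + persistence.
async function createOrder(items: unknown[]) {
  return { id: "ord_123", items, status: "pending" };
}

app.post("/orders", async (req, res) => {
  const { items } = req.body ?? {};
  if (!Array.isArray(items) || items.length === 0) {
    // 400: the request is structurally invalid
    return res.status(400).json({ error: "items must be a non-empty array" });
  }
  try {
    const order = await createOrder(items);
    return res.status(201).json(order); // 201: resource created
  } catch (err) {
    // 422: well-formed request that violates a business rule
    return res.status(422).json({ error: (err as Error).message });
  }
});

app.listen(3000);
```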
**Key Concepts:**
| Concept | Book Reference |
|---------|----------------|
| REST API design | “The Design of Web APIs” by Lauret - All |
| OpenAPI | “Designing APIs with Swagger and OpenAPI” - Ch. 3-5 |
| API patterns | “RESTful Web APIs” by Richardson - Ch. 4-8 |
**Difficulty:** Intermediate | **Time:** 12-16 hours
Project 15: Performance Optimization with AI
What You’ll Build: An AI-assisted performance optimization workflow for identifying bottlenecks, suggesting optimizations, and verifying improvements through benchmarks.
Why This Teaches Vibe Coding: Performance optimization is subtle. AI can suggest common optimizations (caching, batching, indexing) but may miss context-specific issues or suggest premature optimization. This project teaches you to use AI for performance work while measuring actual impact.
Core Challenges:
- Use AI to analyze code for performance anti-patterns
- Get optimization suggestions with complexity/benefit tradeoffs
- Benchmark before and after (AI can suggest but you must verify)
- Identify when AI suggestions are premature optimization
Real World Outcome:
$ claude perf analyze ./src/api/search.ts
PERFORMANCE ANALYSIS: search.ts
================================
IDENTIFIED ISSUES:
1. [HIGH] N+1 Query Pattern (Line 45-52)
Current: Loop fetching related products individually
Cost: ~100ms per item, O(n) database calls
Fix: Use JOIN or batch fetch
Expected improvement: 80-95% reduction
2. [MEDIUM] Unindexed Query (Line 78)
Current: WHERE category LIKE '%electronics%'
Issue: LIKE with leading wildcard can't use index
Fix: Full-text search or restructure query
Expected improvement: 50-80% on large datasets
3. [LOW] Synchronous String Building (Line 102)
Current: Repeated string concatenation in loop
Issue: Creates many intermediate strings
Fix: Use array.join() or template literal
Expected improvement: 10-30% (only matters at scale)
4. [INFO] No Caching
Current: Every search hits database
Consider: Redis cache for popular searches
Trade-off: Complexity vs. speed
Recommendation: Profile first, cache if >30% of searches are repeats
PREMATURE OPTIMIZATION WARNING:
Item 3 (string building) is likely not your bottleneck.
Fix items 1-2 first, measure, then decide on further optimization.
$ claude perf fix --issue 1
Generating fix for N+1 query pattern...
Before (problematic):
```javascript
const products = await db.product.findMany({ where: { searchTerm } });
for (const product of products) {
  product.reviews = await db.review.findMany({ where: { productId: product.id } });
}
```
After (optimized):
```javascript
const products = await db.product.findMany({
  where: { searchTerm },
  include: {
    reviews: {
      take: 5, // Limit to avoid over-fetching
      orderBy: { rating: 'desc' }
    }
  }
});
```
BENCHMARK REQUIRED: Run before and after with your data. My estimate may be wrong depending on your actual data distribution and database config.
Suggested benchmark:
console.time('search');
await search('electronics');
console.timeEnd('search');
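console.time gives one noisy sample. For a slightly sturdier before/after comparison, a small harness along these lines (a sketch; search here is a stand-in for the real function under test) warms up first and reports median and p95 over repeated runs:
```typescript
import { performance } from "node:perf_hooks";

// Stand-in for the real search under test; swap in your implementation.
async function search(term: string): Promise<unknown> {
  return new Promise((resolve) => setTimeout(() => resolve(term), 50));
}

async function bench(label: string, fn: () => Promise<unknown>, runs = 20) {
  await fn(); // warm-up: JIT, connection pools, caches
  const times: number[] = [];
  for (let i = 0; i < runs; i++) {
    const t0 = performance.now();
    await fn();
    times.push(performance.now() - t0);
  }
  times.sort((a, b) => a - b);
  const median = times[Math.floor(runs / 2)].toFixed(1);
  const p95 = times[Math.floor(runs * 0.95)].toFixed(1);
  console.log(`${label}: median ${median}ms, p95 ${p95}ms over ${runs} runs`);
}

await bench("search('electronics')", () => search("electronics"));
```
Run it against the same dataset before and after the fix; the median tells you the typical win, the p95 tells you whether the tail got worse.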
**Key Concepts:**
| Concept | Book Reference |
|---------|----------------|
| Performance analysis | "Systems Performance" by Brendan Gregg - Ch. 2-4 |
| Database optimization | "High Performance MySQL" - Ch. 6-7 |
| Web performance | "High Performance Browser Networking" - Ch. 1-4 |
**Difficulty:** Advanced | **Time:** 15-20 hours
---
### Project 16: Security Review with AI
**What You'll Build:**
An AI-powered security review system that identifies vulnerabilities in code, suggests secure alternatives, and helps prioritize remediation.
**Why This Teaches Vibe Coding:**
Security is critical and AI-generated code often has vulnerabilities. In 2025 studies, 48% of AI-generated code contained potential security issues. This project teaches you to use AI for security review while understanding its limitations (AI can't catch everything, especially business logic flaws).
**Core Challenges:**
1. Use AI to scan for OWASP Top 10 vulnerabilities
2. Review AI-generated code specifically for security issues
3. Get secure code alternatives from AI
4. Understand what AI security review misses (and needs human review)
**Real World Outcome:**
$ claude security review ./src
SECURITY REVIEW REPORT
Scan date: 2025-01-15 | Files scanned: 47 | Lines analyzed: 8,234
CRITICAL (Fix immediately):
- SQL Injection (src/api/search.ts:45)
  Code: db.query("SELECT * FROM products WHERE name = '" + name + "'")
  Risk: Full database compromise
  Fix: Use parameterized queries
  Secure version: db.query("SELECT * FROM products WHERE name = $1", [name])
- Hardcoded Secret (src/config/auth.ts:12)
  Code: const JWT_SECRET = "super_secret_key_12345"
  Risk: Token forgery, authentication bypass
  Fix: Use environment variable
  Secure version: const JWT_SECRET = process.env.JWT_SECRET
  Ensure: Validate presence on startup, use a strong random value
HIGH (Fix this sprint):
- XSS Vulnerability (src/components/Comment.tsx:23)
  Code: dangerouslySetInnerHTML={{ __html: comment.body }}
  Risk: Script injection, session hijacking
  Fix: Sanitize HTML or use text content
- Missing Rate Limiting (src/api/auth.ts)
  Issue: No rate limiting on login endpoint
  Risk: Brute force attacks
  Fix: Add express-rate-limit or similar
MEDIUM (Fix soon):
- Weak Password Requirements (src/utils/validation.ts:56)
- Missing CSRF Protection (src/api/forms.ts)
- Verbose Error Messages (src/middleware/error.ts)
AI LIMITATIONS - REQUIRES HUMAN REVIEW:
[!] Business logic vulnerabilities (e.g., authorization flaws)
[!] Race conditions in payment/inventory
[!] Data privacy compliance (GDPR, CCPA)
[!] Third-party dependency vulnerabilities (run npm audit separately)
[!] Infrastructure security (not visible in code)
NEXT STEPS:
- Fix CRITICAL issues before next deploy
- Run npm audit for dependency vulnerabilities
- Schedule penetration test for business logic review
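The two CRITICAL fixes and the rate-limiting item reduce to a few lines. A sketch of what they might look like together (window and limit values are illustrative, and this assumes the express-rate-limit package the report suggests):
```typescript
import express from "express";
import rateLimit from "express-rate-limit";

// Fail fast on a missing secret, per the hardcoded-secret fix above.
const JWT_SECRET = process.env.JWT_SECRET;
if (!JWT_SECRET) {
  throw new Error("JWT_SECRET must be set to a long random value");
}

const app = express();
app.use(express.json());

// Throttle the login endpoint to blunt brute-force attempts.
const loginLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15-minute window
  limit: 10,                // max attempts per window per client
  standardHeaders: true,    // send RateLimit-* response headers
});

app.post("/api/auth/login", loginLimiter, (req, res) => {
  // Real credential check omitted in this sketch.
  res.status(501).json({ error: "not implemented" });
});
```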
**Key Concepts:**
| Concept | Book Reference |
|---------|----------------|
| OWASP Top 10 | OWASP Foundation - Top 10 Web Application Security Risks |
| Secure coding | “The Web Application Hacker’s Handbook” - Ch. 9-13 |
| Security review | “The Art of Software Security Assessment” - Ch. 4-5 |
**Difficulty:** Advanced | **Time:** 15-20 hours
Project 17: Building Custom Skills and Slash Commands
What You’ll Build: A suite of custom Claude Code skills and slash commands tailored to your team’s workflows, including code generation templates, project-specific tools, and automated sequences.
Why This Teaches Vibe Coding: Custom skills are how you encode team knowledge into AI workflows. Instead of explaining your coding conventions every time, you embed them in skills. This project teaches you to extend Claude Code for your specific needs.
Core Challenges:
- Create a custom slash command for common project tasks
- Build a skill that generates boilerplate with team conventions
- Implement a multi-step skill that orchestrates complex workflows
- Share skills with your team via git
Real World Outcome:
# .claude/commands/new-feature.md
# Custom command: /new-feature
Create a new feature with our standard structure.
When the user runs /new-feature <name>:
1. Create feature directory: src/features/<name>/
2. Generate files using our conventions:
- index.ts (exports)
- <name>.tsx (main component)
- <name>.test.tsx (tests)
- <name>.stories.tsx (Storybook)
- types.ts (TypeScript types)
- hooks.ts (custom hooks if needed)
3. Add feature to src/features/index.ts exports
4. Create placeholder test that passes
5. Add Storybook entry with default story
Use our team's component template (see src/templates/component.tsx).
Follow naming conventions: PascalCase for components, camelCase for hooks.
Arguments: $ARGUMENTS (feature name)
$ claude /new-feature ProductReviews
Creating new feature: ProductReviews
[CREATE] src/features/ProductReviews/
[CREATE] src/features/ProductReviews/index.ts
[CREATE] src/features/ProductReviews/ProductReviews.tsx
[CREATE] src/features/ProductReviews/ProductReviews.test.tsx
[CREATE] src/features/ProductReviews/ProductReviews.stories.tsx
[CREATE] src/features/ProductReviews/types.ts
[CREATE] src/features/ProductReviews/hooks.ts
[UPDATE] src/features/index.ts
Feature scaffolded. Next steps:
1. Define types in types.ts
2. Implement component in ProductReviews.tsx
3. Run tests: npm test ProductReviews
4. View in Storybook: npm run storybook
Want me to start with the types based on your requirements?
# .claude/skills/api-endpoint/SKILL.md
# Skill: Create API Endpoint
Create a new API endpoint following our RESTful conventions.
## Triggers
Invoke when user says:
- "create endpoint"
- "add api route"
- "new api for..."
## Process
1. Ask for resource name and operations (CRUD subset)
2. Generate route file with proper error handling
3. Generate Zod schemas for validation
4. Generate tests for happy path and error cases
5. Update API router to include new routes
6. Generate OpenAPI documentation
## Templates
See ./templates/ for code templates.
## Conventions
- All routes wrapped in asyncHandler
- Validation with Zod, parsed in middleware
- Consistent error response format
- Request logging with correlation ID
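As a rough sketch of what those conventions produce in a generated route (the widget resource is made up, and asyncHandler is a local helper rather than a specific package):
```typescript
import { Router, type NextFunction, type Request, type Response } from "express";
import { z } from "zod";

// Convention: every route is wrapped so rejected promises reach the
// error-handling middleware instead of hanging the request.
const asyncHandler =
  (fn: (req: Request, res: Response) => Promise<unknown>) =>
  (req: Request, res: Response, next: NextFunction) =>
    fn(req, res).catch(next);

// Convention: Zod schema validates the body before any handler logic runs.
const createWidgetSchema = z.object({
  name: z.string().min(1),
  price: z.number().positive(),
});

const router = Router();

router.post(
  "/widgets",
  asyncHandler(async (req, res) => {
    const body = createWidgetSchema.parse(req.body); // ZodError -> error middleware
    res.status(201).json({ id: "w_1", ...body });    // persistence stubbed out
  })
);

export default router;
```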
**Key Concepts:**
| Concept | Book Reference |
|---------|----------------|
| Workflow automation | “The Pragmatic Programmer” - Ch. 6: While You Are Coding |
| Template patterns | “Patterns of Enterprise Application Architecture” - Ch. 14 |
| Developer experience | “Developer Experience” by Lawson - Ch. 4 |
**Difficulty:** Advanced | **Time:** 15-25 hours
Project 18: Building with Natural Language
What You’ll Build: A project built primarily through natural language conversation with AI, documenting the process, identifying where natural language works well, and where precise specifications are needed.
Why This Teaches Vibe Coding: This is the pure vibe coding experience - building by describing what you want. You’ll discover where this works (UI, CRUD, scaffolding) and where it fails (algorithms, performance-critical code, security). The goal is calibrated understanding, not blind faith.
Core Challenges:
- Build a complete feature using only natural language prompts
- Document where you had to be more specific (vague didn’t work)
- Identify patterns in successful vs. unsuccessful prompts
- Measure velocity vs. traditional development
Real World Outcome:
VIBE CODING SESSION LOG: Task Management App
============================================
Session goal: Build a task management app with natural language only
Time: 4 hours
CONVERSATION EXCERPT:
Me: "I want a task management app with projects and tasks"
Claude: [Created basic React app with project/task models]
Result: WORKED - Basic CRUD scaffolded quickly
Me: "Add drag and drop to reorder tasks"
Claude: [Implemented with react-beautiful-dnd]
Result: WORKED - Common pattern, AI knows it well
Me: "Make the task list infinite scrolling with virtualization"
Claude: [Implemented with react-window]
Result: PARTIALLY WORKED - Needed tweaks for dynamic row heights
Me: "Add real-time collaboration so multiple users can edit"
Claude: [Suggested WebSocket + CRDT approach]
Result: NEEDED SPECIFICATION - "What conflict resolution strategy?"
Learned: Complex features need architectural discussion first
Me: "Make it feel snappy, no loading states"
Claude: [Added optimistic updates]
Result: NEEDED REFINEMENT - Initial version had race conditions (fix sketched after this log)
Learned: Vague "feel" requirements need concrete specifications
VELOCITY COMPARISON:
                   Traditional   Vibe Coding
Basic CRUD         4 hours       45 min
Drag-and-drop      2 hours       15 min
Infinite scroll    3 hours       30 min (+ 30 min debugging)
Real-time collab   16 hours      3 hours (+ 2 hours arch discussion)
------------------------------------------------------------------
Total              25 hours      ~6 hours
FINDINGS:
- Natural language works for: Common patterns, UI, CRUD, libraries
- Need specification for: Architecture, edge cases, performance reqs
- Still need verification for: Security, business logic, algorithms
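The race condition behind the "snappy" request is the classic one: two overlapping optimistic edits, and the slower server reply overwrites the newer local state. One common fix, sketched here with an assumed /api/tasks endpoint and Task shape, is to tag each edit with a sequence number and drop stale responses:
```typescript
interface Task { id: string; title: string }

// Monotonic counter shared by all edits to the same piece of state.
let latestSeq = 0;

async function renameTask(task: Task, title: string, render: (t: Task) => void) {
  const seq = ++latestSeq;
  render({ ...task, title }); // optimistic: paint the new title immediately

  const res = await fetch(`/api/tasks/${task.id}`, {
    method: "PATCH",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ title }),
  });
  const serverTask: Task = await res.json();

  // Apply the server's answer only if no newer local edit happened meanwhile.
  if (seq === latestSeq) render(serverTask);
}
```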
**Key Concepts:**
| Concept | Book Reference |
|---------|----------------|
| Natural language processing | “Speech and Language Processing” by Jurafsky - Ch. 1-2 |
| Requirements elicitation | “Software Requirements” by Wiegers - Ch. 7 |
| Iterative development | “The Lean Startup” by Eric Ries - Ch. 3-4 |
**Difficulty:** Intermediate | **Time:** 8-15 hours
Project 19: Comparing AI Coding Tools
What You’ll Build: A systematic comparison of Claude Code, Cursor, GitHub Copilot, and OpenAI Codex across multiple dimensions: code quality, speed, context handling, tool integration, and cost.
Why This Teaches Vibe Coding: Different tools excel at different tasks. Cursor is great for IDE integration, Claude Code for agentic workflows, Copilot for inline suggestions. This project teaches you to choose the right tool for each situation.
Core Challenges:
- Run identical tasks across 4 tools and compare results
- Evaluate context handling (how much codebase does each understand?)
- Compare tool integration capabilities
- Calculate cost/value for each tool
Real World Outcome:
AI CODING TOOL COMPARISON
=========================
Test Suite: 20 tasks across 5 categories
Date: 2025-01-15
TOOLS TESTED:
- Claude Code (CLI) - Anthropic
- Cursor (IDE) - Anysphere
- GitHub Copilot (VS Code) - Microsoft/OpenAI
- Codex CLI - OpenAI
RESULTS BY CATEGORY:
1. CODE GENERATION (Simple functions)
Claude Code: 95% correct | Cursor: 92% | Copilot: 88% | Codex: 90%
Winner: Claude Code (best at following complex specs)
2. CONTEXT UNDERSTANDING (Multi-file awareness)
Claude Code: EXCELLENT | Cursor: EXCELLENT | Copilot: GOOD | Codex: EXCELLENT
Winner: TIE (Claude/Cursor/Codex) - Full codebase awareness
3. REFACTORING (Cross-file changes)
Claude Code: 90% | Cursor: 85% | Copilot: 60% | Codex: 85%
Winner: Claude Code (agentic approach handles multi-file well)
4. DEBUGGING (Error analysis)
Claude Code: 88% | Cursor: 80% | Copilot: 70% | Codex: 82%
Winner: Claude Code (tool usage for inspection)
5. SPEED (Time to first useful output)
Claude Code: 8s avg | Cursor: 2s | Copilot: 0.5s | Codex: 6s
Winner: Copilot (inline, no context switch)
TOOL INTEGRATION:
                  Claude Code    Cursor    Copilot   Codex
MCP Support       YES (100+)     NO        NO        NO
Git Integration   Native         Via IDE   Via IDE   Native
Web Search        YES            YES       NO        YES
Custom Tools      Skills+Hooks   Plugins   Limited   Limited
Headless Mode     YES            NO        NO        YES
COST COMPARISON (Monthly, Individual):
Claude Code: ~$20-100 (API usage based)
Cursor: $20/mo (Pro)
Copilot: $10/mo (Individual)
Codex: ~$20-100 (API usage based)
RECOMMENDATION BY USE CASE:
- Quick inline suggestions: Copilot (fastest, cheapest)
- Full IDE experience: Cursor (best balance)
- Agentic workflows: Claude Code (most capable)
- CI/CD integration: Claude Code or Codex (headless support)
- Budget-conscious: Copilot ($10/mo, good enough for most)
**Key Concepts:**
| Concept | Book Reference |
|---------|----------------|
| Tool evaluation | “The Pragmatic Programmer” - Ch. 3: Basic Tools |
| Decision frameworks | “Thinking, Fast and Slow” - Part 4 |
| TCO analysis | “The Art of Capacity Planning” - Ch. 1-2 |
**Difficulty:** Intermediate | **Time:** 12-18 hours
Project 20: Best Practices and Anti-Patterns
What You’ll Build: A comprehensive best practices guide and anti-pattern catalog for vibe coding, distilled from your experience across all previous projects.
Why This Teaches Vibe Coding: This is the synthesis project. You’ve learned through doing; now you articulate what works and what doesn’t. This becomes your team’s playbook for AI-assisted development.
Core Challenges:
- Document patterns that consistently work well
- Catalog anti-patterns with examples and fixes
- Create a decision framework for when to use AI (and when not to)
- Build a team onboarding guide for AI tools
Real World Outcome:
# VIBE CODING: BEST PRACTICES & ANTI-PATTERNS
## BEST PRACTICES
### 1. Context is King
DO: Invest time in CLAUDE.md. Every hour spent on context saves 10 hours of bad output.
DO: Include architecture diagrams, key files, testing commands.
DON'T: Include entire codebases (token waste, noise).
### 2. Trust but Verify
DO: Review every change, especially security-sensitive code.
DO: Run tests before accepting changes.
DON'T: "Accept All" on anything that matters.
### 3. Specify, Don't Hope
DO: "Create a function that validates email with RFC 5322, returns tuple (bool, error_message)"
DON'T: "Make an email validator"
### 4. Fail Fast
DO: Ask AI to explain its approach before generating code.
DO: Stop and redirect when output seems wrong.
DON'T: Keep iterating on bad direction hoping it improves.
### 5. Version Your AI Interactions
DO: Break AI-assisted work into small, meaningful commits.
DO: Document AI-generated code with comments.
DON'T: Accept 500-line changes with an "AI generated this" commit message.
## ANTI-PATTERNS
### 1. The Blind Accept
SYMPTOM: Accepting AI code without reading it
RISK: Bugs, security issues, technical debt
FIX: Always diff review, run tests, check imports
### 2. The Context Overload
SYMPTOM: Including entire codebase in every prompt
RISK: Slow responses, high cost, confused output
FIX: Include relevant files only, use pointers not copies
### 3. The Hallucination Ignore
SYMPTOM: Trusting AI claims about APIs/libraries without verification
RISK: Runtime errors, non-existent methods, wrong signatures
FIX: Always verify imports exist, check docs for claimed features (see the resolver sketch after this playbook)
### 4. The Prompt Vagueness
SYMPTOM: "Make it better" or "Add some tests"
RISK: Unpredictable output, wasted iterations
FIX: Specific requirements, expected behavior, edge cases
### 5. The Security Bypass
SYMPTOM: Using AI for auth/crypto without expert review
RISK: Vulnerabilities, data breaches, compliance failures
FIX: AI generates, security expert reviews, pen test validates
### 6. The Refactor Everything
SYMPTOM: Asking AI to "clean up" large codebases
RISK: Breaking changes, lost functionality, unreviewed modifications
FIX: Small, focused refactors with tests passing after each
### 7. The Debug Loop
SYMPTOM: Repeatedly asking AI to fix its own broken code
RISK: Infinite loops, lost time, frustration
FIX: Understand the bug yourself, give AI targeted fix instructions
## DECISION FRAMEWORK
WHEN TO USE AI:
+ Boilerplate and scaffolding
+ Common patterns and implementations
+ Code review assistance
+ Documentation drafts
+ Test generation
+ Refactoring with clear specifications
+ Exploring solution approaches
WHEN TO BE CAREFUL:
? Security-sensitive code (verify thoroughly)
? Performance-critical paths (benchmark)
? Complex algorithms (verify correctness)
? Business logic (verify requirements)
? Unfamiliar frameworks (verify patterns are current)
WHEN TO SKIP AI:
- Cryptographic implementations (use libraries)
- Safety-critical systems (formal verification needed)
- When you don't understand the output
- When you can't verify correctness
- Highly creative/novel solutions (AI copies patterns)
## TEAM ONBOARDING CHECKLIST
Week 1: Fundamentals
[ ] Install Claude Code, Cursor, and Copilot
[ ] Set up personal CLAUDE.md
[ ] Complete 3 simple tasks with AI assistance
[ ] Review and reflect on AI output quality
Week 2: Integration
[ ] Set up project CLAUDE.md
[ ] Use AI for code review
[ ] Practice prompt refinement
[ ] Learn to catch hallucinations
Week 3: Workflows
[ ] Integrate AI into daily development
[ ] Set up custom slash commands
[ ] Document personal best practices
[ ] Shadow senior developer using AI
Week 4: Advanced
[ ] Build something complex with AI
[ ] Debug AI-generated code
[ ] Contribute to team playbook
[ ] Teach a teammate
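Anti-pattern 3 is the most mechanically checkable item in the playbook. A small sketch (the script name and the regex-based parsing are simplifications; a real tool would use an AST parser) that flags imports which do not resolve to an installed package:
```typescript
// verify-imports.ts - run with: tsx verify-imports.ts <file>
import { readFileSync } from "node:fs";
import { createRequire } from "node:module";

const require = createRequire(import.meta.url);

const file = process.argv[2];
if (!file) {
  console.error("usage: tsx verify-imports.ts <file>");
  process.exit(1);
}
const source = readFileSync(file, "utf8");

// Naive match for `import ... from "pkg"` and `require("pkg")`; good enough
// to catch the common hallucinated-package case.
const specifiers = [...source.matchAll(/from\s+['"]([^'"]+)['"]|require\(\s*['"]([^'"]+)['"]\s*\)/g)]
  .map((m) => m[1] ?? m[2])
  .filter((s): s is string => !!s && !s.startsWith(".") && !s.startsWith("node:"));

for (const pkg of new Set(specifiers)) {
  try {
    require.resolve(pkg);
    console.log(`OK       ${pkg}`);
  } catch {
    console.log(`MISSING  ${pkg}  <- possibly hallucinated, or not installed`);
  }
}
```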
**Key Concepts:**
| Concept | Book Reference |
|---------|----------------|
| Best practices | “Practices of an Agile Developer” by Subramaniam - All |
| Pattern language | “A Pattern Language” by Alexander - Intro |
| Knowledge management | “The Fifth Discipline” by Senge - Ch. 1-3 |
**Difficulty:** Advanced | **Time:** 15-20 hours
Project Comparison Table
| # | Project | Difficulty | Time | Key Skill |
|---|---|---|---|---|
| 1 | Understanding AI Coding Assistants | Beginner | 8-12h | Mental model |
| 2 | Effective Prompting | Beginner | 6-10h | Prompt engineering |
| 3 | Claude Code Setup | Beginner | 4-8h | Tool configuration |
| 4 | Context Management & MCP | Intermediate | 12-16h | Tool integration |
| 5 | Multi-File Refactoring | Intermediate | 10-15h | Large-scale changes |
| 6 | Test-Driven Development | Intermediate | 10-15h | Quality assurance |
| 7 | Code Review with AI | Intermediate | 12-18h | Review workflow |
| 8 | Debugging with AI | Intermediate | 10-15h | Problem solving |
| 9 | Architecture Design | Advanced | 15-20h | System design |
| 10 | Managing Hallucinations | Intermediate | 8-12h | Verification |
| 11 | Version Control | Intermediate | 8-12h | Git workflow |
| 12 | CI/CD Integration | Advanced | 15-20h | Automation |
| 13 | Documentation Generation | Intermediate | 10-15h | Technical writing |
| 14 | API Design | Intermediate | 12-16h | Interface design |
| 15 | Performance Optimization | Advanced | 15-20h | System optimization |
| 16 | Security Review | Advanced | 15-20h | Security |
| 17 | Custom Skills | Advanced | 15-25h | Tool extension |
| 18 | Natural Language Building | Intermediate | 8-15h | Communication |
| 19 | Tool Comparison | Intermediate | 12-18h | Tool selection |
| 20 | Best Practices | Advanced | 15-20h | Synthesis |
Summary
This guide teaches you to master “vibe coding” - the art of AI-assisted software development. Coined by Andrej Karpathy in 2025, vibe coding represents a fundamental shift in how software is built: from writing every line manually to collaborating with AI through natural language.
Key Takeaways:
- AI coding tools are powerful but not magic. They’re probabilistic code generators that predict likely patterns, not programmers that understand your requirements.
- Context is everything. The quality of AI output directly correlates with the quality of the context you provide. CLAUDE.md is your most important configuration file.
- Trust calibration is the core skill. Know when to trust AI output, when to verify, and when to reject. This judgment separates effective vibe coders from frustrated ones.
- Different tools for different jobs. Copilot for quick inline suggestions, Cursor for IDE integration, Claude Code for agentic workflows. Learn when to use each.
- AI accelerates, not replaces. 41% of code is AI-assisted in 2025, yet 46% of developers distrust AI accuracy. The skill is leveraging AI speed while maintaining quality.
After completing these 20 projects, you will:
- Understand how AI coding assistants actually work
- Write prompts that consistently produce good output
- Configure tools for maximum effectiveness
- Catch and prevent AI hallucinations
- Integrate AI into professional workflows (CI/CD, code review, documentation)
- Choose the right tool for each task
- Teach others to use AI effectively
The future of software development is collaborative. Human judgment + AI speed = unprecedented productivity. But only if you develop the skills to use AI as a tool, not a crutch.
Now go build something.
Sources and References
Research and Statistics
- Vibe coding - Wikipedia
- Andrej Karpathy’s original vibe coding tweet
- IBM: What is Vibe Coding?
- Stack Overflow 2025 Developer Survey - AI Section
- AI Coding Assistant Statistics & Trends 2025
- Jellyfish: 2025 AI Metrics in Review
- Cursor AI Adoption Trends
Tool Documentation
- Claude Code: Best practices for agentic coding
- Using CLAUDE.MD files: Customizing Claude Code
- Connect Claude Code to tools via MCP
- Model Context Protocol Introduction
- MCP Servers on GitHub
Community Insights
- Simon Willison: Not all AI-assisted programming is vibe coding
- Writing a good CLAUDE.md
- How I use Claude Code (Builder.io)
- CLAUDE.md Best Practices (Arize)
Last updated: 2026-01-01 | Total projects: 20 | Estimated time: 220-322 hours | Difficulty: Beginner to Advanced