Project 35: CLAUDE.md Generator - Intelligent Context Builder

Comprehensive Learning Guide

Automatically analyze codebases and generate optimal CLAUDE.md files with project context, conventions, and architectural guidance.

Learning Objectives
Deep Theoretical Foundation
Complete Project Specification
Real World Outcome
Solution Architecture
Phased Implementation Guide
Testing Strategy
Common Pitfalls & Debugging
Extensions & Challenges
Resources
Self-Assessment Checklist

Learning Objectives

By completing this project, you will master:

Code Analysis Techniques: Learn to extract project structure, tech stack, and architectural patterns from codebases using AST parsing, file pattern analysis, and convention detection.
CLAUDE.md Semantics: Understand what information Claude Code needs to be maximally helpful, and how to structure that information for optimal context utilization.
Context Window Optimization: Balance detail vs. brevity to provide maximum value within Claude’s context limits, prioritizing high-value information.
Convention Detection: Infer coding standards, naming conventions, and architectural patterns from existing code rather than explicit configuration.
Documentation Generation: Apply documentation-as-code principles to generate living documentation that stays in sync with the codebase.
Incremental Updates: Build systems that detect when generated content is stale and update only changed sections while preserving manual additions.

Deep Theoretical Foundation

What Claude Code Needs to Know

CLAUDE.md bridges the gap between Claude’s general knowledge and your specific codebase. Understanding what to include requires understanding how Claude processes context:

Context Value Hierarchy:
=======================

HIGH VALUE (Always include):
+--------------------------------------------+
| 1. Project Purpose & Domain                |
|    - What does this project do?            |
|    - What problem does it solve?           |
|    - Who are the users?                    |
|                                            |
| 2. Architecture & Structure                |
|    - Key directories and their purposes    |
|    - Data flow between components          |
|    - External dependencies and integrations|
|                                            |
| 3. Critical Conventions                    |
|    - Naming patterns that are non-obvious  |
|    - File organization rules               |
|    - Testing requirements                  |
+--------------------------------------------+

MEDIUM VALUE (Include if space permits):
+--------------------------------------------+
| 4. Technology Stack Details                |
|    - Framework versions and configs        |
|    - Build system and tooling              |
|    - Environment setup                     |
|                                            |
| 5. Important Files                         |
|    - Entry points                          |
|    - Configuration files                   |
|    - Key abstractions                      |
+--------------------------------------------+

LOW VALUE (Usually omit):
+--------------------------------------------+
| 6. Information Claude can infer            |
|    - Common patterns (React hooks, etc)    |
|    - Standard file extensions              |
|    - Obvious directory names               |
|                                            |
| 7. Rapidly changing details                |
|    - Specific line numbers                 |
|    - Temporary workarounds                 |
|    - In-progress features                  |
+--------------------------------------------+

Reference: “Clean Architecture” by Robert C. Martin, Chapters 15-16 discuss how to communicate architectural intent.

Code Analysis Strategies

Different aspects of a codebase require different analysis approaches:

Analysis Strategy Matrix:
========================

ASPECT              METHOD                   TOOLS/TECHNIQUES
------              ------                   ----------------

Tech Stack          Manifest Analysis        package.json, Cargo.toml,
Detection           Config Detection         requirements.txt, go.mod
                    File Extension Scan      tsconfig.json, .eslintrc

Project             Directory Structure      src/, lib/, tests/
Structure           Entry Point Detection    main.ts, index.js, app.py
                    Import Graph Analysis    Dependency tracing

Conventions         AST Parsing              Babel, TypeScript API,
                    Pattern Mining           tree-sitter
                    Config File Reading      ESLint, Prettier rules

Architecture        Import/Export Analysis   Module boundaries
                    Directory Naming         Feature folders
                    Dependency Graph         Coupling analysis

Build & Test        Script Analysis          npm scripts, Makefile
                    CI Config Reading        .github/workflows/
                    Test Pattern Detection   *.test.ts, *_test.go


Detection Hierarchy (most reliable first):
==========================================

1. EXPLICIT CONFIG (Highest confidence)
   - package.json scripts and dependencies
   - tsconfig.json compiler options
   - .eslintrc rules
   - Example: "typescript" in devDependencies -> TypeScript project

2. NAMING PATTERNS (High confidence)
   - File extensions: .tsx -> TypeScript + JSX (usually React)
   - Directory names: /components/, /hooks/
   - Example: src/components/ -> Component-based architecture

3. FILE CONTENT ANALYSIS (Medium confidence)
   - Import statements
   - Export patterns
   - Code structure
   - Example: "import React" -> React framework

4. STATISTICAL PATTERNS (Lower confidence)
   - Most common patterns
   - Frequency analysis
   - Example: 80% of functions use camelCase -> camelCase convention

Reference: “Working Effectively with Legacy Code” by Michael Feathers, Chapter 16 discusses understanding code through analysis.
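The detection hierarchy above can be sketched as a small ranking step that keeps the strongest evidence seen for each claim. This is a minimal illustration, not part of the project specification: the `Signal` shape, the `rankSignals` name, and the numeric weights are assumptions; only the ordering of the four tiers comes from the hierarchy.

```typescript
// The four tiers of the detection hierarchy, most reliable first.
type Source = 'config' | 'naming' | 'content' | 'statistical';

interface Signal {
  claim: string; // e.g. "language:TypeScript"
  source: Source;
}

// Hypothetical weights: only their relative order is taken from the hierarchy.
const CONFIDENCE: Record<Source, number> = {
  config: 1.0,
  naming: 0.8,
  content: 0.6,
  statistical: 0.4,
};

// For each claim, remember the highest confidence among its supporting signals.
export function rankSignals(signals: Signal[]): Map<string, number> {
  const best = new Map<string, number>();
  for (const { claim, source } of signals) {
    const score = CONFIDENCE[source];
    if (score > (best.get(claim) ?? 0)) {
      best.set(claim, score);
    }
  }
  return best;
}
```

A generator could then drop, or mark as uncertain, any claim whose score falls below a cutoff of its choosing.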

CLAUDE.md Structure and Semantics

The structure of CLAUDE.md affects how Claude interprets and uses the information:

Optimal CLAUDE.md Structure:
===========================

# Project Name

One-paragraph overview that answers:
- What is this?
- What does it do?
- Who is it for?

## Architecture

Concise description of:
- Major components/layers
- How they interact
- Key abstractions

## Directory Structure

src/
+-- components/    # React UI components
+-- hooks/         # Custom React hooks
+-- services/      # API and business logic
+-- utils/         # Helper functions
+-- types/         # TypeScript type definitions

## Key Conventions

### Naming
- Components: PascalCase (UserProfile.tsx)
- Hooks: useCamelCase (useAuth.ts)
- Utils: camelCase (formatDate.ts)

### File Organization
- One component per file
- Co-located tests (Component.test.tsx)
- Styles in separate files (Component.styles.ts)

### Code Patterns
- Functional components only
- Custom hooks for shared logic
- Zod for runtime validation

## Important Files

- `src/App.tsx` - Application root
- `src/routes/` - Page components
- `src/api/client.ts` - API configuration
- `src/store/` - State management

## Commands

npm run dev      # Start development server
npm run build    # Production build
npm run test     # Run tests
npm run lint     # Check code quality

## Avoid

- Class components (use functional)
- any type (use proper types)
- Direct DOM manipulation
- Modifying generated files in src/generated/


Section Semantics:
=================

TITLE & OVERVIEW
- Sets context for everything below
- Should be scannable in < 10 seconds

ARCHITECTURE
- Mental model for code organization
- Helps Claude navigate unfamiliar areas

CONVENTIONS
- What might look wrong but is intentional
- Prevents Claude from "fixing" things

IMPORTANT FILES
- Where to look for specific concerns
- Reduces search time

COMMANDS
- How to build, test, run
- Essential for development tasks

AVOID
- Explicit anti-patterns
- Saves time by preventing mistakes

Context Window Optimization

CLAUDE.md competes for space with code, conversation, and other context:

Context Budget Allocation:
=========================

Total Context: ~200k tokens (Claude 3)

Typical Session Usage:
+------------------------------------------+
| System Prompt & Tools    | ~5k tokens    |
| CLAUDE.md                | ~2k tokens    | <-- Your budget
| Current Code Files       | ~20-100k      |
| Conversation History     | Variable      |
| Working Space            | Remaining     |
+------------------------------------------+

CLAUDE.md Optimization Strategies:
=================================

1. INFORMATION DENSITY
   Bad:  "The components directory contains React components"
   Good: "components/ - React UI (functional only)"

2. AVOID REDUNDANCY
   Bad:  List every file in a directory
   Good: Describe the pattern, give 1-2 examples

3. PRIORITIZE EXCEPTIONS
   Bad:  Document standard React patterns
   Good: Document project-specific deviations

4. USE REFERENCES
   Bad:  Copy code examples into CLAUDE.md
   Good: "See src/hooks/useAuth.ts for auth pattern"

5. STRUCTURED FORMATS
   Bad:  Prose paragraphs describing structure
   Good: ASCII tree diagrams, bullet lists, tables


Size Guidelines:
===============
- Minimum effective: ~500 tokens (basic structure)
- Recommended: 1000-2000 tokens (comprehensive)
- Maximum useful: ~3000 tokens (beyond this, diminishing returns)

Token Estimation:
- 1 token ~ 4 characters (English)
- 1 line of markdown ~ 10-20 tokens
- 50 lines of markdown ~ 500-1000 tokens
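The estimation rule and size guidelines above translate into a quick budget check. The sketch below is a heuristic only: it applies the 4-characters-per-token approximation rather than a real tokenizer, and the names `estimateTokens` and `checkBudget` are illustrative assumptions.

```typescript
// Rough heuristic from the guideline above: about 4 characters per token
// for English text. This is an approximation, not a real tokenizer.
const CHARS_PER_TOKEN = 4;

export function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

export type BudgetStatus = 'too-short' | 'ok' | 'too-long';

// Default thresholds mirror the size guidelines: ~500 tokens minimum,
// ~3000 tokens maximum useful.
export function checkBudget(text: string, min = 500, max = 3000): BudgetStatus {
  const tokens = estimateTokens(text);
  if (tokens < min) return 'too-short';
  if (tokens > max) return 'too-long';
  return 'ok';
}
```

Running `checkBudget` on the rendered output before writing the file gives an early warning when a section has grown past the point of diminishing returns.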

Complete Project Specification

Functional Requirements

Core Features (Must Have):

| Feature | Description | Priority |
|---------|-------------|----------|
| Tech stack detection | Identify languages, frameworks, tools | P0 |
| Directory structure analysis | Map project organization | P0 |
| Convention inference | Detect naming, file organization patterns | P0 |
| CLAUDE.md generation | Create formatted markdown output | P0 |
| Important file detection | Identify key entry points and configs | P0 |
| Command extraction | Find build, test, run scripts | P0 |
| Incremental updates | Update without losing manual content | P1 |
| Staleness detection | Know when regeneration is needed | P1 |
| Multiple output formats | JSON, markdown, YAML | P2 |
| Custom templates | User-defined section templates | P2 |

CLI Commands:

# Generate CLAUDE.md for current directory
claude-context generate [path]

# Preview without writing
claude-context generate --dry-run

# Update existing CLAUDE.md (preserve manual sections)
claude-context update

# Check if CLAUDE.md is stale
claude-context status

# Show detected information without generating
claude-context analyze [path]

# Use custom template
claude-context generate --template my-template.md

# Output as JSON (for tooling integration)
claude-context generate --format json

Real World Outcome

You’ll have a context generator:

Example Usage:

$ claude-context generate ./my-project

Analyzing project structure...
Detected: TypeScript + React + Node.js

Generated CLAUDE.md:
====================

# my-project

E-commerce platform built with TypeScript, React 18, and Node.js/Express backend.
Monorepo structure with shared types and utilities.

## Architecture

Frontend and backend in single repository with shared code:

Client (React SPA)
       |
       v
  API Gateway
       |
       v
Express Backend --> PostgreSQL
       |
       v
External Services (Stripe, SendGrid)


## Directory Structure

## Key Conventions

### TypeScript
- Strict mode enabled (`tsconfig.json: "strict": true`)
- Prefer interfaces over type aliases for object shapes
- Use discriminated unions for complex state

### React (Frontend)
- Functional components only (no class components)
- Custom hooks for shared logic (`src/client/hooks/`)
- Component structure: `Component.tsx` + `Component.styles.ts`
- Props interfaces defined above component

### API (Backend)
- RESTful endpoints organized by resource
- Validation with Zod schemas before handlers
- Error responses use Problem Details format (RFC 7807)

### Testing
- Co-located tests: `Component.test.tsx` next to `Component.tsx`
- Integration tests in `tests/integration/`
- Run `npm test` before committing

## Important Files

| File | Purpose |
|------|---------|
| `src/server/routes/index.ts` | API route registration |
| `src/client/App.tsx` | React app root component |
| `src/client/store/index.ts` | Redux store configuration |
| `src/common/types/api.ts` | API request/response types |
| `docker-compose.yml` | Local development services |

## Commands

```bash
npm run dev        # Start frontend and backend (concurrent)
npm run dev:client # Start frontend only
npm run dev:server # Start backend only
npm run build      # Production build
npm run test       # Run all tests
npm run test:watch # Run tests in watch mode
npm run lint       # ESLint check
npm run typecheck  # TypeScript check
```

## Environment Setup

Required environment variables (see `.env.example`):

- `DATABASE_URL` - PostgreSQL connection string
- `STRIPE_SECRET_KEY` - Stripe API key
- `SENDGRID_API_KEY` - SendGrid API key

## Avoid

- Class components in React (use functional)
- `any` type in TypeScript (use proper types)
- Direct DOM manipulation (use React refs)
- Modifying files in `src/generated/` (auto-generated)
- Committing `.env` files (use `.env.example`)

CLAUDE.md generated (2.1KB, ~500 tokens)
Would you like to add this to your project? [y/n]: y

Written to ./CLAUDE.md

---

## Solution Architecture

### System Architecture Diagram

+------------------------------------------------------------------+
|                     CLAUDE-CONTEXT GENERATOR                     |
+------------------------------------------------------------------+
|                                                                  |
|  +------------------+    +------------------+    +-----------+   |
|  | CLI Interface    |    | Analysis Engine  |    | Generator |   |
|  |------------------|    |------------------|    |-----------|   |
|  | - generate       |--->| - Stack detect   |--->| - Template|   |
|  | - analyze        |    | - Structure map  |    | - Render  |   |
|  | - update         |<---| - Convention     |<---| - Format  |   |
|  +------------------+    +------------------+    +-----------+   |
|           |                       |                    |         |
|           v                       v                    v         |
|  +------------------+    +------------------+    +-----------+   |
|  | File Scanner     |    | AST Analyzer     |    | Templates |   |
|  |------------------|    |------------------|    |-----------|   |
|  | - Glob patterns  |    | - TypeScript     |    | - Default |   |
|  | - Config files   |    | - Babel          |    | - Custom  |   |
|  | - Manifest parse |    | - Tree-sitter    |    | - Sections|   |
|  +------------------+    +------------------+    +-----------+   |
|                                                                  |
+------------------------------------------------------------------+
                                  |
                                  v
                        +-------------------+
                        |   Project Files   |
                        |   package.json    |
                        |   src/*/          |
                        +-------------------+


### Analysis Pipeline

Analysis Pipeline:
==================


### Module Breakdown

---

## Phased Implementation Guide

### Phase 1: Manifest and Stack Detection (Day 1-2)

**Goal**: Detect tech stack from package.json and config files.

**Milestone**: Running analyzer correctly identifies TypeScript + React project.

**Tasks**:

1. **Project Setup**
   ```bash
   mkdir claude-context && cd claude-context
   npm init -y
   npm install commander chalk glob yaml
   npm install -D typescript @types/node vitest
   npx tsc --init
   ```

Create Manifest Analyzer (src/analyzers/manifest.ts)

import { readFileSync, existsSync } from 'fs';

export interface ManifestInfo {
  type: 'npm' | 'cargo' | 'go' | 'python' | 'unknown';
  name?: string;
  dependencies: string[];
  devDependencies: string[];
  scripts: Record<string, string>;
}

export function analyzeManifest(projectPath: string): ManifestInfo {
  const packageJsonPath = `${projectPath}/package.json`;

  if (existsSync(packageJsonPath)) {
    const pkg = JSON.parse(readFileSync(packageJsonPath, 'utf-8'));
    return {
      type: 'npm',
      name: pkg.name,
      dependencies: Object.keys(pkg.dependencies || {}),
      devDependencies: Object.keys(pkg.devDependencies || {}),
      scripts: pkg.scripts || {}
    };
  }

  // Check for other manifest types...
  return { type: 'unknown', dependencies: [], devDependencies: [], scripts: {} };
}

Create Stack Detector (src/analyzers/stack.ts)

import type { ManifestInfo } from './manifest';

export interface StackInfo {
  languages: string[];
  frameworks: string[];
  tools: string[];
  runtime: string;
}

export function detectStack(manifest: ManifestInfo, files: string[]): StackInfo {
  const stack: StackInfo = {
    languages: [],
    frameworks: [],
    tools: [],
    runtime: ''
  };

  // Detect TypeScript
  if (manifest.devDependencies.includes('typescript') ||
      files.some(f => f.endsWith('.ts') || f.endsWith('.tsx'))) {
    stack.languages.push('TypeScript');
  }

  // Detect React
  if (manifest.dependencies.includes('react')) {
    stack.frameworks.push('React');
  }

  // More detections...
  return stack;
}

Test Detection

npx tsx src/analyzers/stack.ts ./test-project

Success Criteria: Correctly identifies languages and frameworks from real projects.

Phase 2: Structure and Convention Analysis (Day 3-4)

Goal: Map directory structure and infer coding conventions.

Milestone: Output includes accurate directory tree and detected patterns.

Tasks:

Create Structure Analyzer (src/analyzers/structure.ts)

export interface DirectoryInfo {
  path: string;
  purpose: string;
  fileCount: number;
  patterns: string[];
}

export function analyzeStructure(projectPath: string): DirectoryInfo[] {
  // Known directory patterns
  const knownPatterns: Record<string, string> = {
    'src': 'Source code',
    'lib': 'Library code',
    'test': 'Test files',
    'tests': 'Test files',
    'components': 'UI components',
    'hooks': 'React hooks',
    'utils': 'Utility functions',
    'services': 'Service layer',
    'models': 'Data models',
    'routes': 'Route handlers',
    'api': 'API layer',
    'types': 'Type definitions',
    'config': 'Configuration',
    'scripts': 'Build scripts',
    'docs': 'Documentation',
  };

  // Scan and categorize directories
  // ...
}
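The scan-and-categorize step is left open in the listing above. One possible shape, sketched here under assumptions, works on a flat list of relative file paths (for example, the output of a glob) so that it stays free of file-system calls and is easy to test. The `summarizeDirectories` name, the `DirectorySummary` type, and the reduced lookup table are illustrative, not part of the original design.

```typescript
interface DirectorySummary {
  path: string;
  purpose: string;
  fileCount: number;
}

// A small subset of the known-directory table, kept short for the example.
const KNOWN_PURPOSES = new Map<string, string>([
  ['components', 'UI components'],
  ['hooks', 'React hooks'],
  ['utils', 'Utility functions'],
]);

// Groups relative file paths (forward slashes assumed) by their parent
// directory. Only directories that directly contain files are listed.
export function summarizeDirectories(files: string[]): DirectorySummary[] {
  const counts = new Map<string, number>();
  for (const file of files) {
    const slash = file.lastIndexOf('/');
    if (slash === -1) continue; // skip files at the project root
    const dir = file.slice(0, slash);
    counts.set(dir, (counts.get(dir) ?? 0) + 1);
  }

  return Array.from(counts.entries())
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([path, fileCount]) => {
      const name = path.slice(path.lastIndexOf('/') + 1);
      return { path, purpose: KNOWN_PURPOSES.get(name) ?? 'Unknown', fileCount };
    });
}
```

Directories that fall through to `'Unknown'` are good candidates for a closer look, since they are the ones Claude is least likely to interpret correctly from the name alone.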

Create Convention Analyzer (src/analyzers/conventions.ts)

export interface ConventionInfo {
  naming: {
    components: string;  // PascalCase, kebab-case, etc.
    functions: string;
    files: string;
    directories: string;
  };
  fileOrganization: string[];  // Patterns detected
  testingPattern: string;       // Co-located, separate, etc.
}

export function analyzeConventions(files: string[]): ConventionInfo {
  // Analyze file names for patterns
  const componentFiles = files.filter(f =>
    f.includes('components/') || f.includes('Components/')
  );

  const namingPattern = detectNamingPattern(componentFiles);
  // ...
}

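function detectNamingPattern(files: string[]): string {
  // Test only the base name (no directories, no extensions) so that a
  // kebab-case folder such as `my-app/` does not skew the result and
  // single-word names such as `Footer` still count as PascalCase.
  const names = files.map(f => f.split('/').pop()!.split('.')[0]);
  const pascalCase = names.filter(n => /^[A-Z][A-Za-z0-9]*$/.test(n));
  const kebabCase = names.filter(n => /^[a-z0-9]+(-[a-z0-9]+)+$/.test(n));

  if (pascalCase.length > kebabCase.length) return 'PascalCase';
  if (kebabCase.length > pascalCase.length) return 'kebab-case';
  return 'mixed';
}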

Integrate with CLI
- Add analyze command
- Display detected structure and conventions

Success Criteria: Accurately detects naming conventions and file organization.
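One way to make this criterion measurable is to report a convention only when it clears a frequency threshold, in the spirit of the "80% of functions use camelCase" example from the detection hierarchy. The sketch below assumes each sample has already been labeled with a convention name; `dominantConvention` and `ConventionGuess` are hypothetical names, and the 0.8 default is taken from that example rather than from any measured data.

```typescript
export interface ConventionGuess {
  convention: string;
  share: number; // fraction of samples that follow the convention, 0..1
}

// Returns the most frequent label if it covers at least `threshold`
// of the samples, otherwise undefined.
export function dominantConvention(
  labels: string[],
  threshold = 0.8
): ConventionGuess | undefined {
  if (labels.length === 0) return undefined;

  const counts = new Map<string, number>();
  for (const label of labels) {
    counts.set(label, (counts.get(label) ?? 0) + 1);
  }

  let best: ConventionGuess | undefined;
  for (const [convention, count] of Array.from(counts.entries())) {
    const share = count / labels.length;
    if (!best || share > best.share) {
      best = { convention, share };
    }
  }

  // Report nothing when no convention clears the threshold.
  return best && best.share >= threshold ? best : undefined;
}
```

Returning `undefined` instead of a weak guess keeps low-confidence conventions out of the generated file, where they could mislead more than help.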

Phase 3: Important File and Command Detection (Day 5)

Goal: Identify key files and extract build commands.

Milestone: Output includes important files table and commands section.

Tasks:

Create File Importance Detector (src/analyzers/files.ts)

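import { existsSync } from 'fs';
import type { ManifestInfo } from './manifest';
import type { StackInfo } from './stack';

export interface ImportantFile {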
  path: string;
  purpose: string;
  importance: 'critical' | 'high' | 'medium';
}

export function detectImportantFiles(
  projectPath: string,
  manifest: ManifestInfo,
  stack: StackInfo
): ImportantFile[] {
  const important: ImportantFile[] = [];

  // Entry points
  const entryPoints = [
    'src/index.ts', 'src/index.tsx', 'src/main.ts',
    'src/App.tsx', 'src/app.ts', 'index.js', 'main.js'
  ];

  for (const entry of entryPoints) {
    if (existsSync(`${projectPath}/${entry}`)) {
      important.push({
        path: entry,
        purpose: 'Application entry point',
        importance: 'critical'
      });
      break;
    }
  }

  // Configuration files
  // Route files
  // Store/state management
  // ...

  return important;
}

Create Command Extractor (src/analyzers/commands.ts)

import type { ManifestInfo } from './manifest';

export interface Command {
  name: string;
  command: string;
  description: string;
}

export function extractCommands(manifest: ManifestInfo): Command[] {
  const commands: Command[] = [];

  const scriptDescriptions: Record<string, string> = {
    'dev': 'Start development server',
    'start': 'Start application',
    'build': 'Production build',
    'test': 'Run tests',
    'lint': 'Check code quality',
    'format': 'Format code',
    'typecheck': 'TypeScript type check',
  };

  for (const [name, cmd] of Object.entries(manifest.scripts)) {
    commands.push({
      name,
      command: `npm run ${name}`,
      description: scriptDescriptions[name] || inferDescription(cmd)
    });
  }

  return commands;
}
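`extractCommands` calls `inferDescription`, which the listing does not define. A minimal sketch of such a helper follows, assuming a keyword lookup over the script body; the `TOOL_HINTS` table and its wording are illustrative and would need extending for the tools a given project uses.

```typescript
// Hypothetical keyword table: maps a tool name found in the script body
// to a short description. Earlier entries win when several match.
const TOOL_HINTS: Array<[RegExp, string]> = [
  [/\b(vitest|jest|mocha)\b/, 'Run tests'],
  [/\beslint\b/, 'Check code quality'],
  [/\bprettier\b/, 'Format code'],
  [/\btsc\b/, 'TypeScript compile or type check'],
  [/\b(vite|webpack|rollup)\b.*\bbuild\b/, 'Production build'],
];

export function inferDescription(cmd: string): string {
  for (const [pattern, description] of TOOL_HINTS) {
    if (pattern.test(cmd)) return description;
  }
  // Fall back to the raw command so the output never contains a guess.
  return `Runs: ${cmd}`;
}
```

Echoing the raw command for unknown scripts is a deliberate choice: an honest "Runs: ..." line is more useful in CLAUDE.md than a confident but wrong description.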

Success Criteria: Important files and commands are correctly identified and described.

Phase 4: CLAUDE.md Generation (Day 6)

Goal: Generate formatted CLAUDE.md from analysis results.

Milestone: Full CLAUDE.md generated from project analysis.

Tasks:

1. **Create Template System** (`src/generators/templates.ts`)
   ```typescript
   export const defaultTemplate = `
   # {projectName}

   {overview}

   ## Architecture

   {architecture}

   ## Directory Structure

   \`\`\`
   {directoryTree}
   \`\`\`

   ## Key Conventions

   {conventions}

   ## Important Files

   {importantFiles}

   ## Commands

   \`\`\`bash
   {commands}
   \`\`\`

   ## Avoid

   {avoidPatterns}
   `;
   ```

2. **Create Section Renderers** (`src/generators/sections.ts`)
   ```typescript
   export function renderDirectoryTree(structure: DirectoryInfo[]): string {
     // Generate ASCII tree representation
   }

   export function renderConventions(conventions: ConventionInfo): string {
     // Format as markdown sections
   }

   export function renderImportantFiles(files: ImportantFile[]): string {
     // Format as markdown table
   }

   export function renderCommands(commands: Command[]): string {
     // Format as bash code block
   }
   ```

Create Markdown Generator (src/generators/markdown.ts)

export function generateClaudeMd(analysis: ProjectAnalysis): string {
  const template = loadTemplate();

  // Replacer functions keep `$&`, `$'`, and similar sequences in the
  // generated text literal; a plain string replacement would expand them.
  return template
    .replace('{projectName}', () => analysis.name)
    .replace('{overview}', () => generateOverview(analysis))
    .replace('{architecture}', () => renderArchitecture(analysis))
    .replace('{directoryTree}', () => renderDirectoryTree(analysis.structure))
    .replace('{conventions}', () => renderConventions(analysis.conventions))
    .replace('{importantFiles}', () => renderImportantFiles(analysis.files))
    .replace('{commands}', () => renderCommands(analysis.commands))
    .replace('{avoidPatterns}', () => renderAvoidPatterns(analysis));
}

Implement Generate Command
- Run all analyzers
- Merge results
- Render template
- Write to file (or stdout with --dry-run)

Success Criteria: Generated CLAUDE.md is accurate and well-formatted.

Phase 5: Update and Staleness Detection (Day 7)

Goal: Support incremental updates and detect when regeneration is needed.

Milestone: update command preserves manual sections; status shows staleness.

Tasks:

Create Update Logic (src/generators/update.ts)

export function updateClaudeMd(
  existingContent: string,
  newContent: string
): string {
  const marker = '<!-- AUTO-GENERATED: Do not edit above this line -->';

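  const markerIndex = existingContent.indexOf(marker);
  if (markerIndex === -1) {
    // No marker yet: nothing can be identified as manual content.
    return newContent;
  }

  // Everything after the first marker is the user's manual section.
  const manualSection = existingContent.slice(markerIndex + marker.length);
  // Drop any marker (and text after it) from the fresh output, so manual
  // content survives even when the generator does not emit the marker.
  const generatedSection = newContent.split(marker)[0].trimEnd();

  return `${generatedSection}\n\n${marker}${manualSection}`;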
}

Create Staleness Detector (src/cache/staleness.ts)

import { existsSync, readFileSync } from 'fs';

export interface StalenessInfo {
  isStale: boolean;
  reason?: string;
  changedFiles?: string[];
  lastGenerated?: string;
}

export function checkStaleness(projectPath: string): StalenessInfo {
  const cachePath = `${projectPath}/.claude-context-cache.json`;

  if (!existsSync(cachePath)) {
    return { isStale: true, reason: 'No cache found' };
  }

  const cache = JSON.parse(readFileSync(cachePath, 'utf-8'));
  const currentHashes = computeFileHashes(projectPath);

  // Compare with cached hashes
  const changed = findChangedFiles(cache.hashes, currentHashes);

  if (changed.length > 0) {
    return {
      isStale: true,
      reason: 'Files changed since last generation',
      changedFiles: changed,
      lastGenerated: cache.generatedAt
    };
  }

  return { isStale: false, lastGenerated: cache.generatedAt };
}
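`checkStaleness` relies on `computeFileHashes` and `findChangedFiles`, neither of which is shown. The sketch below illustrates the two underlying steps, hashing file content and diffing two hash maps, under assumed names (`hashContent`, `diffHashes`). Unlike the `string[]` used in `StalenessInfo`, it returns tagged entries so that a status report can distinguish added, modified, and removed files; treat that as one possible variant rather than the intended design.

```typescript
import { createHash } from 'node:crypto';

// Hash file content so that timestamp-only changes do not count as edits.
export function hashContent(content: string): string {
  return createHash('sha256').update(content).digest('hex');
}

type HashMap = Record<string, string>; // relative path -> content hash

export interface ChangedFile {
  path: string;
  change: 'added' | 'modified' | 'removed';
}

const has = (map: HashMap, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(map, key);

// Compare the cached hashes with the current ones.
export function diffHashes(cached: HashMap, current: HashMap): ChangedFile[] {
  const changes: ChangedFile[] = [];
  for (const [path, hash] of Object.entries(current)) {
    if (!has(cached, path)) {
      changes.push({ path, change: 'added' });
    } else if (cached[path] !== hash) {
      changes.push({ path, change: 'modified' });
    }
  }
  for (const path of Object.keys(cached)) {
    if (!has(current, path)) {
      changes.push({ path, change: 'removed' });
    }
  }
  return changes;
}
```

Hashing only the files that feed the analyzers (manifests, configs, and the directory listing) would keep the check fast on large repositories; which files to include is a design decision the guide leaves open.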

Implement Status Command

$ claude-context status

CLAUDE.md Status:
Generated: 2024-12-20 14:30
Status: STALE

Changed files (3):
- src/routes/api.ts (modified)
- src/components/NewFeature.tsx (added)
- package.json (modified)

Run `claude-context update` to regenerate.

Success Criteria: Updates preserve manual content; staleness accurately detected.

Testing Strategy

Unit Tests: Analyzers

// tests/analyzers/stack.test.ts
import { describe, it, expect } from 'vitest';
import { detectStack } from '../../src/analyzers/stack';

describe('stack detection', () => {
  it('detects TypeScript from dependencies', () => {
    const manifest = {
      type: 'npm' as const, // keep the literal type so the object matches ManifestInfo
      dependencies: [],
      devDependencies: ['typescript'],
      scripts: {}
    };

    const stack = detectStack(manifest, []);
    expect(stack.languages).toContain('TypeScript');
  });

  it('detects React from dependencies', () => {
    const manifest = {
      type: 'npm' as const,
      dependencies: ['react', 'react-dom'],
      devDependencies: [],
      scripts: {}
    };

    const stack = detectStack(manifest, []);
    expect(stack.frameworks).toContain('React');
  });

  it('detects TypeScript from file extensions', () => {
    const manifest = {
      type: 'npm' as const,
      dependencies: [],
      devDependencies: [],
      scripts: {}
    };

    const files = ['src/App.tsx', 'src/utils.ts'];
    const stack = detectStack(manifest, files);

    expect(stack.languages).toContain('TypeScript');
  });
});

Unit Tests: Convention Detection

// tests/analyzers/conventions.test.ts
import { describe, it, expect } from 'vitest';
import { analyzeConventions } from '../../src/analyzers/conventions';

describe('convention analysis', () => {
  it('detects PascalCase component naming', () => {
    const files = [
      'src/components/UserProfile.tsx',
      'src/components/NavigationBar.tsx',
      'src/components/Footer.tsx'
    ];

    const conventions = analyzeConventions(files);
    expect(conventions.naming.components).toBe('PascalCase');
  });

  it('detects co-located test pattern', () => {
    const files = [
      'src/components/Button.tsx',
      'src/components/Button.test.tsx',
      'src/components/Card.tsx',
      'src/components/Card.test.tsx'
    ];

    const conventions = analyzeConventions(files);
    expect(conventions.testingPattern).toBe('co-located');
  });

  it('detects separate test directory pattern', () => {
    const files = [
      'src/components/Button.tsx',
      'src/components/Card.tsx',
      'tests/components/Button.test.tsx',
      'tests/components/Card.test.tsx'
    ];

    const conventions = analyzeConventions(files);
    expect(conventions.testingPattern).toBe('separate');
  });
});

Integration Tests: Generation

// tests/integration/generate.test.ts
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import { mkdirSync, writeFileSync, rmSync, readFileSync } from 'fs';
import { generateForProject } from '../../src/index';

describe('CLAUDE.md generation', () => {
  const testProject = '/tmp/claude-context-test';

  beforeEach(() => {
    mkdirSync(`${testProject}/src/components`, { recursive: true });

    writeFileSync(`${testProject}/package.json`, JSON.stringify({
      name: 'test-project',
      dependencies: { react: '^18.0.0' },
      devDependencies: { typescript: '^5.0.0' },
      scripts: { dev: 'vite', build: 'vite build', test: 'vitest' }
    }));

    writeFileSync(`${testProject}/src/App.tsx`, 'export default function App() {}');
    writeFileSync(`${testProject}/src/components/Button.tsx`, 'export function Button() {}');
  });

  afterEach(() => {
    rmSync(testProject, { recursive: true });
  });

  it('generates valid CLAUDE.md', async () => {
    await generateForProject(testProject);

    const content = readFileSync(`${testProject}/CLAUDE.md`, 'utf-8');

    expect(content).toContain('# test-project');
    expect(content).toContain('TypeScript');
    expect(content).toContain('React');
    expect(content).toContain('npm run dev');
    expect(content).toContain('components/');
  });

  it('includes important files section', async () => {
    await generateForProject(testProject);

    const content = readFileSync(`${testProject}/CLAUDE.md`, 'utf-8');

    expect(content).toContain('App.tsx');
    expect(content).toContain('Important Files');
  });
});

Common Pitfalls & Debugging

Pitfall 1: Over-Generating Content

Symptom: CLAUDE.md is too long and wastes context.

Bad:

## All Files

- src/index.ts
- src/App.tsx
- src/components/Button.tsx
- src/components/Card.tsx
- src/components/Modal.tsx
[... 200 more files ...]

Good:

## Directory Structure

src/
+-- components/   # UI components (45 files)
+-- hooks/        # Custom hooks (12 files)
+-- utils/        # Utilities (8 files)

## Important Files

| File | Purpose |
|------|---------|
| src/App.tsx | Application root |

Pitfall 2: Missing Non-Obvious Conventions

Symptom: Claude makes changes that violate project conventions.

Bad:

## Conventions

- Use TypeScript
- Use React

Good:

## Conventions

### TypeScript
- Strict mode enabled
- **Prefer interfaces over types for object shapes**
- Use discriminated unions for complex state

### React
- **Functional components only** (no class components)
- Custom hooks go in `src/hooks/`
- Props interfaces defined above component

### Testing
- **Tests must be co-located** (`Component.test.tsx`)

Pitfall 3: Not Preserving Manual Additions

Symptom: User’s manual notes are lost on regeneration.

Bad:

// Overwrites entire file
writeFileSync('CLAUDE.md', generatedContent);

Good:

// Preserve manual section
const existingContent = readFileSync('CLAUDE.md', 'utf-8');
const updatedContent = updateClaudeMd(existingContent, generatedContent);
writeFileSync('CLAUDE.md', updatedContent);

Pitfall 4: Incorrect Framework Detection

Symptom: Wrong framework detected (e.g., Vue detected as React).

Debug:

// Check detection logic
console.log('Dependencies:', manifest.dependencies);
console.log('Files:', files.filter(f => f.endsWith('.tsx')));

Fix: Order detection by specificity:

// Check for specific frameworks first
if (deps.includes('@angular/core')) return 'Angular';
if (deps.includes('vue')) return 'Vue';
if (deps.includes('react')) return 'React';
// Generic fallback last

The Interview Questions They’ll Ask

Prepare to answer these:

“How would you detect coding conventions from existing code?”
- Pattern frequency analysis (naming, structure)
- Config file inference (ESLint, Prettier)
- AST analysis for code patterns
- Statistical confidence thresholds
“What’s your strategy for keeping generated documentation up to date?”
- Hash-based staleness detection
- File watcher for real-time updates
- Git hook integration for pre-commit
- Incremental regeneration
“How do you handle polyglot projects with multiple languages?”
- Detect all languages from extensions
- Generate sections per language
- Shared conventions vs. language-specific
- Priority for primary language
“What information is better left in the code vs. in CLAUDE.md?”
- CLAUDE.md: High-level architecture, project-specific conventions
- Code: Standard patterns, inline documentation
- Rule: If Claude can infer it, don’t repeat it
“How would you validate that the generated context is accurate?”
- Automated tests against known projects
- Confidence scores for detections
- User feedback mechanism
- Comparison with manual CLAUDE.md files

Hints in Layers

Hint 1: Start with Detection. Build detectors for common patterns: package.json, tsconfig.json, Cargo.toml, etc. These provide high-confidence signals.

Hint 2: Template-Based Generation. Start with templates for common stacks (React, Node, Python), then customize based on detection.

Hint 3: Use Claude. Run Claude over the codebase in headless mode to generate the CLAUDE.md (meta!). This can fill gaps in automated detection.

Hint 4: Preserve Manual Content. Use markers like `<!-- AUTO-GENERATED: Do not edit above this line -->` to separate generated from manual content. Never overwrite user additions.

Books That Will Help

Topic	Book	Chapter
Code analysis	“Clean Architecture” by Robert C. Martin	Ch. 15-16: Architecture
Pattern detection	“Working Effectively with Legacy Code” by Feathers	Ch. 16: Understanding Code
Documentation	“Docs Like Code” by Anne Gentle	Ch. 4: Automation
AST parsing	“Crafting Interpreters” by Robert Nystrom	Ch. 4-6: Scanning & Parsing
Heuristics	“Programming Pearls” by Jon Bentley	Ch. 2: Algorithm Design

Extensions & Challenges

Extension 1: AI-Assisted Analysis

Use Claude to analyze unclear patterns:

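// Assumes `claude` is a client wrapper exposing a generateObject-style call and
// `conventionSchema` describes the ConventionInfo shape (neither is defined here).
async function analyzeWithAI(codeSnippets: string[]): Promise<ConventionInfo> {
  // Join snippets with a visible separator; interpolating the array directly
  // would comma-join them and blur the snippet boundaries.
  const snippets = codeSnippets.join('\n\n---\n\n');
  const response = await claude.generateObject({
    schema: conventionSchema,
    prompt: `Analyze these code snippets and identify conventions:\n\n${snippets}`
  });
  return response.object;
}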

Extension 2: Multi-Language Support

Extend analyzers for Python, Rust, Go:

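// One analyzer per ecosystem; each function is assumed to share the Analyzer signature
const analyzers: Record<string, Analyzer> = {
  npm: analyzeNpmProject,       // JavaScript / TypeScript
  cargo: analyzeCargoProject,   // Rust
  go: analyzeGoProject,         // Go
  python: analyzePythonProject  // Python
};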
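
For polyglot repositories, one way to decide which analyzers to run and which language should lead the generated file is a simple census of file extensions. The sketch below is illustrative only: `EXTENSION_LANGUAGES`, `LanguageShare`, and `rankLanguages` are hypothetical names, and the extension table is deliberately short.

```typescript
// Hypothetical names; extend the table for the languages you support.
const EXTENSION_LANGUAGES: Record<string, string> = {
  ts: 'TypeScript', tsx: 'TypeScript',
  js: 'JavaScript', jsx: 'JavaScript',
  py: 'Python', rs: 'Rust', go: 'Go',
};

interface LanguageShare {
  language: string;
  files: number; // number of source files in this language
  share: number; // fraction of all recognized source files
}

function rankLanguages(filePaths: string[]): LanguageShare[] {
  const counts: Record<string, number> = {};
  let total = 0;
  for (const path of filePaths) {
    const dot = path.lastIndexOf('.');
    if (dot === -1) continue; // no extension at all
    const ext = path.slice(dot + 1).toLowerCase();
    if (!Object.prototype.hasOwnProperty.call(EXTENSION_LANGUAGES, ext)) continue;
    const language = EXTENSION_LANGUAGES[ext];
    counts[language] = (counts[language] ?? 0) + 1;
    total += 1;
  }
  // Most files first; ties are broken alphabetically for stable output.
  return Object.keys(counts)
    .map(language => ({ language, files: counts[language], share: counts[language] / total }))
    .sort((a, b) => b.files - a.files || a.language.localeCompare(b.language));
}
```

The first entry can be treated as the primary language, and any entry above a chosen share could trigger its own analyzer and its own section in CLAUDE.md. File counts are a rough proxy; generated or vendored directories should be excluded before counting.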

Extension 3: VS Code Extension

Create a VS Code extension that auto-updates CLAUDE.md:

{
  "activationEvents": ["onStartupFinished"],
  "contributes": {
    "commands": [{
      "command": "claude-context.generate",
      "title": "Generate CLAUDE.md"
    }]
  }
}

Self-Assessment Checklist

Conceptual Understanding

Can you explain what information Claude needs in CLAUDE.md?
Can you describe different code analysis strategies?
Can you explain context window optimization?
Can you list high-value vs. low-value information?
Can you explain staleness detection approaches?

Implementation Skills

Can you parse package.json and extract dependencies?
Can you detect tech stack from file patterns?
Can you infer naming conventions from files?
Can you generate formatted markdown from analysis?
Can you preserve manual sections during updates?

Code Quality

Is your code organized by analysis phase?
Are analyzers independently testable?
Is template rendering flexible?
Are confidence levels tracked?
Can another developer add a new analyzer?

The Core Question You’ve Answered

“What does Claude need to know about my project to be maximally helpful?”

CLAUDE.md bridges the gap between Claude’s general knowledge and your specific codebase. Automating its creation ensures Claude always has the right context.

By building this generator, you have mastered:

Code Analysis: Extracting structure, conventions, and patterns from code
Context Optimization: Providing maximum value in minimal tokens
Documentation Generation: Creating living documentation from code
Incremental Updates: Maintaining accuracy without losing manual additions

You can now give Claude Code accurate, up-to-date context for every project you work on.

Project Guide Version 1.0 - December 2025
