### Project 7: “The Executable Spec with mdflow” — Literate Programming
| Attribute | Value |
|---|---|
| File | KIRO_CLI_LEARNING_PROJECTS.md |
| Main Programming Language | Markdown / Bash |
| Coolness Level | Level 4: Hardcore Tech Flex |
| Difficulty | Level 3: Advanced |
| Knowledge Area | Literate Programming |
What you’ll build: A Markdown spec whose code blocks are executed and validated, keeping docs in sync with reality.
Why it teaches Executable Specs: Documentation that executes cannot rot.
Success criteria:
- The spec fails when code changes and passes after repair.
#### Real World Outcome
You’ll create a living specification document where every code example is automatically executed and validated. When your implementation changes, the spec either passes (proving docs are accurate) or fails (alerting you to update them).
Example: API Specification (`api-spec.md`):

````markdown
# User Authentication API

## Creating a User

The `/api/users` endpoint accepts POST requests with email and password:

```bash
curl -X POST http://localhost:3000/api/users \
  -H "Content-Type: application/json" \
  -d '{"email":"test@example.com","password":"secure123"}'
```

Expected response:

```json
{
  "id": "usr_abc123",
  "email": "test@example.com",
  "created_at": "2025-01-02T10:00:00Z"
}
```
````
When you run `mdflow execute api-spec.md`:

```
$ mdflow execute api-spec.md
Running: api-spec.md
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✓ Block 1: curl POST /api/users
  Status: 201 Created
  Response matched expected JSON schema
✓ Block 2: Expected response validation
  Field 'id' matches pattern: usr_[a-z0-9]+
  Field 'email' equals: test@example.com
  Field 'created_at' is valid ISO 8601
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
All blocks passed ✓ (2/2)
Execution time: 1.2s
```
When the API breaks:

```
$ mdflow execute api-spec.md
Running: api-spec.md
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
✗ Block 1: curl POST /api/users
  Status: 500 Internal Server Error
  Expected: 201 Created
  Response:
  {
    "error": "Database connection failed"
  }
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
FAILED: 1 of 2 blocks failed
Execution time: 0.8s
```
This forces you either to fix the implementation or to update the spec; the documentation can no longer drift silently away from reality.
#### The Core Question You’re Answering
“How do I ensure my documentation stays synchronized with my actual codebase as it evolves?”
Most documentation becomes outdated within weeks of writing. Code examples break, APIs change, but the docs remain frozen in time. This project addresses the fundamental problem: passive documentation rots, executable documentation validates itself.
By embedding executable tests directly in your specification, you create a contract that must be maintained. When the contract breaks, CI fails, forcing alignment.
#### Concepts You Must Understand First
Stop and research these before coding:
- **Literate Programming**
  - What did Donald Knuth mean by “programs as literature”?
  - How does weaving code with narrative improve understanding?
  - Why is the order of presentation different from the order of execution?
  - Book Reference: “Literate Programming” by Donald E. Knuth
- **Test-Driven Documentation**
  - How do executable examples serve as both docs and tests?
  - What makes a good assertion in documentation?
  - When should examples be simplified vs. realistic?
  - Book Reference: “Growing Object-Oriented Software, Guided by Tests”, Ch. 2
- **Markdown Processing**
  - How do you parse and extract fenced code blocks?
  - What metadata can be attached to code blocks (language, annotations)?
  - How do you preserve line numbers for error reporting?
  - Web Reference: CommonMark Specification, Fenced Code Blocks
#### Questions to Guide Your Design
Before implementing, think through these:
- **Execution Model**
  - How do you isolate each code block’s execution environment?
  - Should blocks share state, or run independently?
  - How do you handle blocks that depend on previous outputs?
  - What happens if block 3 fails: do you still run block 4?
- **Assertion Syntax**
  - How do users specify expected outputs (inline, separate blocks)?
  - Do you support regex matching, JSON schema validation, or both?
  - How do you handle non-deterministic outputs (timestamps, IDs)?
  - Should exit codes alone determine success, or stdout comparison?
- **Language Support**
  - How do you execute different languages (bash, python, curl)?
  - Do you need sandboxing (Docker containers, chroot)?
  - How do you manage dependencies (language runtimes, system packages)?
  - Should you support custom interpreters per project?
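One way to approach the interpreter question is a dispatch table from fence language to interpreter command, checked against the installed binaries before any block runs. A minimal Python sketch; the table contents and function name are illustrative, not part of any real mdflow tool:

```python
import shutil

# Hypothetical mapping from fence language to interpreter argv prefix.
# A real tool might let projects extend this via config.
INTERPRETERS = {
    "bash": ["bash"],
    "sh": ["sh"],
    "python": ["python3"],
    "js": ["node"],
}

def resolve_interpreter(lang):
    """Return the argv prefix for a fence language, or None if the
    language is unsupported or its binary is not installed."""
    argv = INTERPRETERS.get(lang)
    if argv is None or shutil.which(argv[0]) is None:
        return None
    return argv

print(resolve_interpreter("bash"))   # ["bash"] on systems where bash is installed
print(resolve_interpreter("cobol"))  # None (not in the table)
```

Resolving interpreters up front lets the runner report all missing dependencies in one pass instead of failing mid-spec.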
#### Thinking Exercise

**Trace: Multi-Step API Workflow**
Given this specification:

````markdown
## User Workflow

Create a user:

```bash
USER_ID=$(curl -s -X POST /api/users -d '{"email":"test@example.com"}' | jq -r .id)
```

Verify creation:

```bash
curl -s -X GET "/api/users/$USER_ID"
```

Expected: `{"id":"$USER_ID","email":"test@example.com"}`
````
*Questions while designing:*
- How do you propagate `$USER_ID` from block 1 to block 2?
- Should the spec run in a single shell session, or fresh shells per block?
- What if `USER_ID` is empty because block 1 failed—should block 2 run?
- How do you validate that the returned ID matches the captured variable?
**Design Decision Matrix:**
| Approach | Pros | Cons |
|----------|------|------|
| Single shell session | State persists naturally | Pollution between tests |
| Environment variables | Explicit data flow | Manual propagation |
| JSON output files | Language-agnostic | Filesystem clutter |
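The environment-variable row can be sketched by having each block source a shared env file on entry and dump its exported variables on exit, so state flows between otherwise independent subprocesses. A Python sketch driving bash; the `shared.env` name and helper are assumptions for illustration:

```python
import os
import subprocess
import tempfile

def run_block(code, env_file):
    """Run one bash block in a fresh subprocess, restoring exported state
    from env_file first and dumping exported variables back afterwards."""
    script = (
        f"[ -f '{env_file}' ] && . '{env_file}'\n"   # restore prior state, if any
        f"{code}\n"
        f"export -p > '{env_file}'\n"                 # persist state for the next block
    )
    return subprocess.run(["bash", "-c", script], capture_output=True, text=True)

env_file = os.path.join(tempfile.mkdtemp(), "shared.env")
run_block('USER_ID="usr_abc123"; export USER_ID', env_file)
out = run_block('echo "fetched $USER_ID"', env_file)
print(out.stdout.strip())  # fetched usr_abc123
```

This keeps each block in its own process (limiting pollution) while still making data flow explicit and inspectable on disk.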
---
#### The Interview Questions They'll Ask
1. "How would you design a system to execute code blocks from Markdown while preserving security boundaries?"
2. "Explain the tradeoffs between making documentation executable versus keeping separate test suites."
3. "How do you handle non-deterministic outputs (timestamps, random IDs) in executable documentation?"
4. "What strategies prevent test pollution when documentation blocks depend on shared state?"
5. "How would you integrate this into CI/CD to fail builds when documentation drifts from implementation?"
6. "Describe how you'd support multiple programming languages in a single specification document."
---
#### Hints in Layers
**Hint 1: Start with a Parser**
Use a Markdown parser (like `markdown-it` in Node.js or `mistune` in Python) to extract fenced code blocks. Store metadata (language, line numbers) for each block.
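A minimal hand-rolled extractor is enough to get started before reaching for a full parser. A Python sketch that records each fence's language tag and the 1-based line number of its opening fence (names illustrative; real fences can also carry attributes this regex ignores):

```python
import re

def extract_blocks(markdown_text):
    """Extract fenced code blocks with their language tag and the
    1-based line number where each fence opens (for error reporting)."""
    blocks, lines = [], markdown_text.splitlines()
    i = 0
    while i < len(lines):
        m = re.match(r"^```(\w+)?\s*$", lines[i])
        if m:
            lang, start, body = m.group(1) or "", i + 1, []
            i += 1
            while i < len(lines) and not lines[i].startswith("```"):
                body.append(lines[i])
                i += 1
            blocks.append({"lang": lang, "line": start, "code": "\n".join(body)})
        i += 1
    return blocks

spec = "# Spec\n\n```bash\necho hello\n```\n"
print(extract_blocks(spec))
# [{'lang': 'bash', 'line': 3, 'code': 'echo hello'}]
```

Storing `line` at extraction time is what later makes `Error in api-spec.md:15` style messages possible.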
**Hint 2: Execution Strategy**
For each code block:
- Write code to a temporary script file
- Execute using the appropriate interpreter (`bash`, `python3`, `node`)
- Capture stdout, stderr, and exit code
- Compare against expected outputs (if specified)
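The steps above can be sketched as a single function: write the block to a temp script, run it under the chosen interpreter, and capture everything the assertions will need. A minimal Python version, with no sandboxing and an assumed 30-second timeout:

```python
import os
import subprocess
import tempfile

def execute_block(code, interpreter="bash"):
    """Write a code block to a temporary script and execute it,
    capturing stdout, stderr, and the exit code."""
    fd, path = tempfile.mkstemp(suffix=".script")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(code)
        result = subprocess.run(
            [interpreter, path], capture_output=True, text=True, timeout=30
        )
        return {"exit": result.returncode,
                "stdout": result.stdout, "stderr": result.stderr}
    finally:
        os.unlink(path)  # always clean up the temp script

print(execute_block("echo ok"))
# {'exit': 0, 'stdout': 'ok\n', 'stderr': ''}
```

Running via a script file rather than `bash -c` keeps multi-line blocks and quoting intact, and makes the interpreter pluggable per fence language.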
**Hint 3: State Management**
Create a temporary directory as a "sandbox workspace":
```
/tmp/mdflow-session-abc123/
├── block-1.sh
├── block-1.stdout
├── block-2.sh
└── shared.env   # Environment variables for state
```
**Hint 4: Assertion Annotations**
Support special comments for assertions:
````markdown
```bash
curl /api/users/123
# expect-status: 200
# expect-json: {"id":"123"}
```
````
Parse these comments to build validation rules.
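A sketch of such a comment parser in Python. The `expect-*` annotation shapes follow the example above; the function name and rule format are illustrative:

```python
import json
import re

def parse_assertions(code):
    """Split `# expect-*:` comment lines out of a block, returning the
    runnable code and a dict of validation rules keyed by assertion kind."""
    rules, kept = {}, []
    for line in code.splitlines():
        m = re.match(r"^\s*#\s*expect-(\w+):\s*(.+)$", line)
        if m:
            rules[m.group(1)] = m.group(2)
        else:
            kept.append(line)
    return "\n".join(kept), rules

code, rules = parse_assertions(
    'curl /api/users/123\n# expect-status: 200\n# expect-json: {"id":"123"}'
)
print(rules)                      # {'status': '200', 'json': '{"id":"123"}'}
print(json.loads(rules["json"]))  # {'id': '123'}
```

Stripping the annotations before execution keeps them invisible to the interpreter, while leaving plain comments untouched.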
#### Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Literate Programming Philosophy | “Literate Programming” by Donald E. Knuth | Introduction & Ch. 1 |
| Test-Driven Development | “Test Driven Development: By Example” by Kent Beck | Part I |
| Markdown Parsing | “Crafting Interpreters” by Robert Nystrom | Ch. 4 (Scanning) |
| Documentation as Code | “Docs for Developers” by Jared Bhatti et al. | Ch. 6 |
#### Common Pitfalls & Debugging
**Problem 1: “Code blocks fail due to missing dependencies”**
- Why: The spec assumes tools are installed (curl, jq, etc.)
- Fix: Add a validation phase that checks for required binaries before execution
- Quick test:
  ```bash
  command -v curl || echo "Missing curl"
  ```
**Problem 2: “Non-deterministic outputs cause false failures”**
- Why: Timestamps, UUIDs, or random data change on every run
- Fix: Support regex patterns or placeholder matching (e.g. `expect-pattern: usr_[a-z0-9]+`)
- Quick test: Replace exact matches with pattern assertions
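One concrete placeholder approach: substitute regex patterns for tokens in the expected output before comparing against the actual output. A Python sketch; the placeholder tokens and their patterns are illustrative:

```python
import re

# Illustrative placeholder patterns for fields that change on every run.
PLACEHOLDERS = {
    "{{USER_ID}}": r"usr_[a-z0-9]+",
    "{{ISO_TS}}": r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z",
}

def matches(expected, actual):
    """Escape the expected string, swap placeholders for their regex
    patterns, and test the actual output against the result."""
    pattern = re.escape(expected)
    for token, rx in PLACEHOLDERS.items():
        pattern = pattern.replace(re.escape(token), rx)
    return re.fullmatch(pattern, actual) is not None

print(matches('{"id":"{{USER_ID}}"}', '{"id":"usr_abc123"}'))  # True
print(matches('{"id":"{{USER_ID}}"}', '{"id":"12345"}'))       # False
```

Escaping first means the rest of the expected text is matched literally; only the placeholder positions are flexible.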
**Problem 3: “State leaks between blocks”**
- Why: Environment variables, temp files, or database records persist
- Fix: Run each block in a fresh subprocess with an isolated environment
- Quick test: Add `set -u` to bash blocks to catch undefined variables
**Problem 4: “Error messages don’t point to the right line in the spec”**
- Why: You’re losing line-number context during extraction
- Fix: Store original line numbers when parsing and include them in error reports
- Quick test:
  ```
  Error in api-spec.md:15 (block 2)
  ```
#### Definition of Done

- Parser extracts all fenced code blocks with metadata (language, line numbers)
- Executor runs bash and at least one other language (Python or curl)
- Assertions validate exit codes and stdout/stderr content
- Failed blocks produce clear error messages with file/line references
- Spec execution stops on the first failure (or continues with a `--keep-going` flag)
- Environment isolation prevents state leaks between blocks
- README includes an example spec demonstrating success and failure cases
- A CI integration example shows how to fail builds on spec failures