Project 1: AI-Powered Expense Tracker CLI

Comprehensive Learning Guide

Transform messy, unstructured human text into clean, typed, validated data structures using AI.

Learning Objectives
Deep Theoretical Foundation
Complete Project Specification
Real World Outcome
Solution Architecture
Phased Implementation Guide
Testing Strategy
Common Pitfalls & Debugging
Extensions & Challenges
Resources
Self-Assessment Checklist

Learning Objectives

By completing this project, you will master:

Structured Output Generation: Understand how generateObject transforms natural language into validated TypeScript objects, and why this is the foundational pattern of modern AI applications.
Zod Schema Design for LLMs: Design schemas that serve dual purposes - TypeScript type safety AND LLM prompt engineering. Learn how field names, types, and descriptions guide AI output.
TypeScript Type Inference: Master z.infer<typeof schema> to derive types from schemas, eliminating type duplication and ensuring compile-time safety for AI-generated data.
AI Error Handling Patterns: Implement graceful degradation when LLMs produce invalid, partial, or no data. Understand the difference between validation errors and generation errors.
CLI Development Best Practices: Build professional command-line interfaces with proper argument parsing, helpful error messages, and intuitive user experience.
Data Persistence Strategies: Design file-based storage systems that organize data logically (by month/year) and handle concurrent access safely.
Prompt Engineering via Schema Design: Discover that schema descriptions are not just documentation - they are instructions to the LLM that dramatically affect output quality.

Deep Theoretical Foundation

How generateObject Works Internally

The generateObject function from the AI SDK is deceptively simple in its API but sophisticated in its implementation. Understanding its internals will make you a more effective AI developer.

+------------------+     +-------------------+     +------------------+
|   Your Code      |     |    AI SDK         |     |   LLM Provider   |
|                  |     |                   |     |   (OpenAI, etc)  |
|  generateObject( | --> | 1. Schema->JSON   | --> | Receives prompt  |
|    schema,       |     | 2. Build prompt   |     | with schema      |
|    prompt        |     | 3. API call       |     | instructions     |
|  )               |     |                   |     |                  |
+------------------+     +-------------------+     +------------------+
        ^                        |                         |
        |                        v                         v
        |                +-------------------+     +------------------+
        |                |  4. Parse JSON    |     | Generates JSON   |
        |                |  5. Zod validate  |     | matching schema  |
        +----------------+  6. Type infer    | <-- |                  |
                         +-------------------+     +------------------+

Step-by-Step Breakdown:

Schema Serialization: Your Zod schema is converted to JSON Schema format. This includes field names, types, constraints, and critically - the .describe() strings you provide.
Prompt Construction: The AI SDK builds a system prompt that includes:
- Instructions to output valid JSON
- The JSON Schema definition
- Your user prompt (the expense description)
API Call: The request goes to the LLM with specific parameters:
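- A structured-output setting when the provider has one, e.g. response_format: { type: "json_object" } for OpenAI (the exact mechanism depends on the provider and SDK version)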
- The constructed prompt
- Model parameters (temperature, etc.)
Response Parsing: The raw text response is parsed as JSON. If parsing fails, an error is thrown.
Zod Validation: The parsed JSON is validated against your Zod schema. This catches type mismatches, missing required fields, and constraint violations.
Type Inference: The validated object is returned with full TypeScript type inference based on your schema.
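
Steps 4 to 6 can be imitated by hand to see what the SDK does on your behalf. The sketch below is not the SDK's actual implementation: `parseAndValidate` and `ParseResult` are hypothetical names, and the schema simply mirrors the three-field example used in this guide.

```typescript
import { z } from 'zod';

// Three-field schema matching the simplified example in this guide.
const expenseSchema = z.object({
  amount: z.number().positive().describe('The monetary amount in USD'),
  vendor: z.string().min(1).describe('Business name where purchase was made'),
  category: z.enum(['dining', 'travel', 'office', 'other']),
});

type Expense = z.infer<typeof expenseSchema>;

// Hypothetical result type: a typed expense, or the stage that failed and why.
type ParseResult =
  | { ok: true; data: Expense }
  | { ok: false; stage: 'json' | 'schema'; problems: string[] };

// Hypothetical helper imitating steps 4-6 on a raw model response.
function parseAndValidate(rawText: string): ParseResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawText); // Step 4: parse JSON
  } catch {
    return { ok: false, stage: 'json', problems: ['Response was not valid JSON'] };
  }

  const result = expenseSchema.safeParse(parsed); // Step 5: Zod validation
  if (!result.success) {
    const problems = result.error.issues.map(
      (issue) => `${issue.path.join('.')}: ${issue.message}`,
    );
    return { ok: false, stage: 'schema', problems };
  }

  return { ok: true, data: result.data }; // Step 6: data is typed as Expense
}
```

Keeping the JSON stage separate from the schema stage makes it easier to tell "the model did not return JSON" apart from "the JSON had the wrong shape", which is the same split the error handling section relies on.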

What the LLM Actually Sees (Simplified):
========================================

System: You are a structured data extraction assistant.
        Output valid JSON matching this schema:

        {
          "type": "object",
          "properties": {
            "amount": {
              "type": "number",
              "description": "The monetary amount in USD"
            },
            "vendor": {
              "type": "string",
              "description": "Business name where purchase was made"
            },
            "category": {
              "type": "string",
              "enum": ["dining", "travel", "office", "other"]
            }
          },
          "required": ["amount", "vendor", "category"]
        }

User: Coffee with team $23.40 at Starbucks this morning

The LLM uses this schema as a template, filling in values extracted from your prompt. Better descriptions = better extraction.

Zod Schema Design Patterns

Zod is a TypeScript-first schema validation library. For AI applications, it serves three purposes:

Runtime Validation: Ensures LLM output matches expected structure
Type Generation: Provides compile-time TypeScript types
LLM Guidance: Descriptions and constraints guide AI extraction

Pattern 1: The Progressive Disclosure Schema

Start minimal, add complexity as needed:

Level 1: Bare Minimum               Level 2: With Descriptions
========================           ============================

z.object({                         z.object({
  amount: z.number(),                amount: z.number()
  vendor: z.string(),                  .describe('USD amount'),
})                                   vendor: z.string()
                                       .describe('Business name'),
                                   })


Level 3: With Constraints          Level 4: Production Ready
==========================         ===========================

z.object({                         z.object({
  amount: z.number()                 amount: z.number()
    .positive()                        .positive()
    .describe('USD amount'),           .describe('The monetary amount
  vendor: z.string()                              spent in US dollars.
    .min(1)                                       Extract from "$45.50",
    .describe('Business name'),                   "45 dollars", "about
  category: z.enum([...])                         50 bucks". Round to
    .describe('Expense type'),                    2 decimal places.'),
})                                   vendor: z.string()
                                       .min(1)
                                       .max(100)
                                       .describe('The business or
                                                  merchant name'),
                                     category: z.enum([
                                       'dining',
                                       'travel',
                                       'office',
                                       'entertainment',
                                       'other'
                                     ]).describe('Expense category'),
                                     date: z.string()
                                       .describe('ISO 8601 date'),
                                     notes: z.string()
                                       .optional()
                                       .describe('Additional context'),
                                   })

Pattern 2: Enum with Descriptions

When using enums, the LLM only sees the values, not their meaning. Add descriptions:

// BAD: LLM has no context for these categories
const category = z.enum(['ent', 'trv', 'din', 'off']);

// GOOD: Explicit descriptions guide categorization
const category = z.enum(['dining', 'travel', 'office', 'entertainment', 'other'])
  .describe(`Expense category. Use:
    - "dining" for restaurants, coffee shops, food delivery
    - "travel" for transportation, hotels, flights, rideshare
    - "office" for supplies, equipment, software subscriptions
    - "entertainment" for events, subscriptions, media
    - "other" for anything that doesn't fit above categories`);

Pattern 3: Optional Fields with Defaults

Handle missing information gracefully:

Required Fields           Optional with Fallback      Nullable Pattern
================          =======================     ================

amount: z.number()        notes: z.string()           vendor: z.string()
  // MUST exist             .optional()                  .nullable()
  // Throws if missing       .default('')                // Can be null
                            // Uses '' if missing        // Different from
                                                         // missing entirely
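
The three behaviours in the table can be checked directly. This is a small sketch, assuming a throwaway schema with one field of each kind; the variable names are illustrative only.

```typescript
import { z } from 'zod';

const schema = z.object({
  amount: z.number(), // required
  notes: z.string().optional().default(''), // optional with fallback
  vendor: z.string().nullable(), // key must be present, value may be null
});

// Missing `notes` is filled in with the default; a null `vendor` is accepted.
const filled = schema.parse({ amount: 12, vendor: null });

// Missing `amount` is rejected; safeParse reports failure instead of throwing.
const missingAmount = schema.safeParse({ vendor: 'Starbucks' });

// Nullable is not optional: omitting `vendor` entirely also fails.
const missingVendor = schema.safeParse({ amount: 12 });

console.log(filled, missingAmount.success, missingVendor.success);
```

The last case is the one that tends to surprise people: if the model may leave a field out altogether, `.nullable()` alone is not enough.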

Pattern 4: The Confidence Pattern

For uncertain extractions, ask the LLM to rate its confidence:

const expenseSchema = z.object({
  amount: z.number(),
  vendor: z.string(),
  confidence: z.number()
    .min(0)
    .max(1)
    .describe('How confident are you in this extraction? 1.0 = certain, 0.5 = guessing, 0 = no idea'),
});

// Then in your code:
if (result.confidence < 0.7) {
  console.log('Low confidence extraction - please verify');
}

How the LLM “Sees” Your Schema

Understanding the LLM’s perspective is crucial for effective schema design. The LLM receives a text representation of your schema and must generate matching JSON.

Your TypeScript Code:                    What the LLM Sees:
=====================                    ==================

const schema = z.object({     --->      "Extract data matching this format:
  amount: z.number()
    .positive()                          amount (number, positive):
    .describe('USD spent'),                The amount spent in USD

  vendor: z.string()                     vendor (string):
    .min(1)                                The business name
    .describe('Business name'),
                                         category (enum: dining|travel|
  category: z.enum([                                    office|other):
    'dining', 'travel',                    The expense type
    'office', 'other'
  ])                                     Output valid JSON only."
});

Critical Insight: Field names are part of the prompt!

Poor Field Names:                Better Field Names:
=================                ===================

z.object({                       z.object({
  a: z.number(),                   amountInUSD: z.number(),
  v: z.string(),                   vendorBusinessName: z.string(),
  c: z.string(),                   expenseCategory: z.string(),
})                               })

// LLM has no context!           // LLM understands intent

The Description Impact Test

Run this experiment to see descriptions in action:

// WITHOUT descriptions:
const schemaMinimal = z.object({
  amount: z.number(),
  vendor: z.string(),
});

// WITH descriptions:
const schemaDescribed = z.object({
  amount: z.number()
    .describe('The monetary amount spent in US dollars. Parse "$45.50" as 45.50, "about fifty bucks" as 50.00'),
  vendor: z.string()
    .describe('The business or merchant name where the purchase was made'),
});

// Test input: "grabbed coffee for like twenty bucks at the corner cafe"

// Minimal schema result:
// { amount: 20, vendor: "corner cafe" }  <-- might work, might not

// Described schema result:
// { amount: 20.00, vendor: "Corner Cafe" }  <-- more reliable

TypeScript Type Inference with Zod

One of Zod’s superpowers is deriving TypeScript types from runtime schemas. This eliminates type duplication and ensures your types always match your validation.

Traditional Approach (Duplication):       Zod Approach (Single Source of Truth):
===================================       ======================================

interface Expense {                       const expenseSchema = z.object({
  amount: number;                           amount: z.number(),
  vendor: string;                           vendor: z.string(),
  category: string;                         category: z.enum(['dining', ...]),
  notes?: string;                           notes: z.string().optional(),
}                                         });

// Must keep in sync!                     // Type derived automatically!
function validate(e: Expense) {...}       type Expense = z.infer<typeof expenseSchema>;

                                          // Expense = {
                                          //   amount: number;
                                          //   vendor: string;
                                          //   category: "dining" | "travel" | ...;
                                          //   notes?: string | undefined;
                                          // }

Deep Dive: How z.infer Works

Schema Definition                    Inferred Type
=================                    =============

z.string()                    -->    string
z.number()                    -->    number
z.boolean()                   -->    boolean
z.literal('foo')              -->    'foo' (literal type)
z.enum(['a', 'b'])            -->    'a' | 'b' (union)
z.array(z.string())           -->    string[]
z.object({ x: z.number() })   -->    { x: number }
z.string().optional()         -->    string | undefined
z.string().nullable()         -->    string | null
z.union([z.string(), z.number()]) --> string | number

Practical Example with Inference:

import { z } from 'zod';

const expenseSchema = z.object({
  id: z.string().uuid(),
  amount: z.number().positive(),
  vendor: z.string().min(1),
  category: z.enum(['dining', 'travel', 'office', 'entertainment', 'other']),
  date: z.string().datetime(),
  notes: z.string().optional(),
  tags: z.array(z.string()).default([]),
});

// Derive the type - ALWAYS in sync with schema!
type Expense = z.infer<typeof expenseSchema>;

// TypeScript now knows:
// - expense.amount is number
// - expense.category is 'dining' | 'travel' | 'office' | 'entertainment' | 'other'
// - expense.notes is string | undefined
// - expense.tags is string[]

function processExpense(expense: Expense) {
  // Full IntelliSense support!
  console.log(`${expense.vendor}: $${expense.amount}`);

  // TypeScript error: 'invalid' is not assignable to category type
  // expense.category = 'invalid';  // Compile error!
}

Reference: “Programming TypeScript” by Boris Cherny, Chapter 3 covers TypeScript’s type system fundamentals. Chapter 6 dives into advanced type inference patterns.

Error Handling in AI Systems

AI systems introduce unique error scenarios that don’t exist in traditional applications. Understanding these is critical for building robust tools.

Traditional Error Hierarchy:            AI Error Hierarchy:
============================           ===================

  Error                                  Error
    |                                      |
    +-- TypeError                          +-- NetworkError
    |                                      |     (API unreachable)
    +-- RangeError                         |
    |                                      +-- RateLimitError
    +-- ValidationError                    |     (too many requests)
                                           |
                                           +-- AI_NoObjectGeneratedError
                                           |     (LLM couldn't match schema)
                                           |
                                           +-- AI_InvalidResponseError
                                           |     (LLM returned invalid JSON)
                                           |
                                           +-- ZodError
                                                 (validation failed)

The Error Decision Tree:

                       generateObject() called
                               |
                               v
                    +----------+-----------+
                    |   Network Error?     |
                    +----------+-----------+
                         |            |
                        YES          NO
                         |            |
                         v            v
                    Retry with    +---+---+
                    backoff       | Parse |
                                  | JSON  |
                                  +---+---+
                                      |
                        +-------------+-------------+
                        |                           |
                    Parse Failed              Parse Success
                        |                           |
                        v                           v
               AI_InvalidResponseError    +--------+--------+
               (LLM output not JSON)      | Zod Validation  |
                                          +--------+--------+
                                                   |
                                     +-------------+-------------+
                                     |                           |
                                Validation Failed          Validation Success
                                     |                           |
                                     v                           v
                              +------+------+              Return typed
                              | Which error?|              object!
                              +------+------+
                                     |
                    +----------------+----------------+
                    |                                 |
            Missing required                   Type mismatch
            fields (ZodError)                  (ZodError)
                    |                                 |
                    v                                 v
            Prompt user for                   Schema needs
            missing info                      adjustment
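
The "retry with backoff" branch of the tree can be sketched as a small generic wrapper. `backoffDelays` and `withRetry` are hypothetical helpers, not part of the AI SDK, and the retry count and base delay are arbitrary assumptions. The SDK also has a retry setting of its own (`maxRetries`), so check that before layering another one on top.

```typescript
// Hypothetical helper: delay before each retry, doubling from baseMs.
function backoffDelays(retries: number, baseMs: number): number[] {
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: run `operation`, retrying only errors that `isRetryable` accepts.
async function withRetry<T>(
  operation: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  retries = 3,
  baseMs = 500,
): Promise<T> {
  const delays = backoffDelays(retries, baseMs);
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up on non-retryable errors or once the retries are used up.
      if (!isRetryable(error) || attempt >= delays.length) throw error;
      await sleep(delays[attempt]);
    }
  }
}
```

Only transient failures (network errors, rate limits) are worth retrying this way. A validation failure will usually repeat unless the prompt or schema changes, so `isRetryable` should return false for it.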

Handling Each Error Type:

import { generateObject, NoObjectGeneratedError } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';
// Path assumes the module layout described in this guide.
import { expenseSchema } from '../schemas/expense';

async function extractExpense(input: string) {
  try {
    const { object } = await generateObject({
      model: openai('gpt-4o-mini'),
      schema: expenseSchema,
      prompt: input,
    });
    // `as const` keeps `success` a literal so callers can narrow on it.
    return { success: true as const, data: object };

  } catch (error) {
    // Error Type 1: LLM couldn't generate a matching object.
    // The class is exported as NoObjectGeneratedError; 'AI_NoObjectGeneratedError'
    // is the value of its `name` property.
    if (NoObjectGeneratedError.isInstance(error)) {
      return {
        success: false as const,
        error: 'extraction_failed',
        message: 'Could not extract expense from your description',
        suggestion: 'Try including an amount and vendor name',
      };
    }

    // Error Type 2: Zod validation failed, for example from your own
    // schema.parse() calls. The SDK may report its own validation failures
    // through Error Type 1 instead, with the details in error.cause.
    if (error instanceof z.ZodError) {
      // `received` is part of the Zod 3 issue shape.
      const missingFields = error.issues
        .filter(issue => issue.code === 'invalid_type' && issue.received === 'undefined')
        .map(issue => issue.path.join('.'));

      return {
        success: false as const,
        error: 'validation_failed',
        message: 'Missing required information',
        missingFields,
      };
    }

    // Error Type 3: Network/API errors
    if (error instanceof Error && error.message.includes('rate limit')) {
      return {
        success: false as const,
        error: 'rate_limited',
        message: 'Too many requests, please wait a moment',
      };
    }

    // Unknown error - rethrow
    throw error;
  }
}

Reference: “Programming TypeScript” by Boris Cherny, Chapter 7 covers error handling patterns including discriminated unions for error types (the { success, error } pattern above).

Complete Project Specification

Functional Requirements

Core Features (Must Have):

Feature	Description	Priority
Natural language expense input	Accept free-form text like “Coffee $5 at Starbucks”	P0
Structured extraction	Extract amount, vendor, category, date, notes	P0
Category assignment	Automatically categorize expenses	P0
JSON persistence	Store expenses in monthly JSON files	P0
Error handling	Graceful messages for invalid input	P0
List command	View recent expenses	P1
Report command	Generate monthly summary	P1
CSV export	Export expenses for spreadsheets	P2

Expense Schema Requirements:

// Required fields (must be extracted or defaulted)
{
  id: string;          // Generated UUID
  amount: number;      // Positive, 2 decimal places
  vendor: string;      // 1-100 characters
  category: string;    // One of predefined categories
  date: string;        // ISO 8601 format
  createdAt: string;   // Timestamp of record creation
}

// Optional fields
{
  notes?: string;      // Additional context
  tags?: string[];     // User-defined tags
}
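
Only some of these fields come from the model; `id` and `createdAt` are generated by the application. A minimal sketch of that split, assuming hypothetical names `ExtractedFields`, `ExpenseRecord`, and `toRecord`:

```typescript
// Fields the model extracts (hypothetical type name).
interface ExtractedFields {
  amount: number;
  vendor: string;
  category: string;
  date: string; // ISO 8601 date, e.g. "2025-12-22"
  notes?: string;
  tags?: string[];
}

// Full stored record: extracted fields plus app-generated metadata.
interface ExpenseRecord extends ExtractedFields {
  id: string;
  createdAt: string;
}

// Hypothetical helper: the id and the clock are passed in so the function stays pure.
function toRecord(fields: ExtractedFields, id: string, now: Date): ExpenseRecord {
  return {
    ...fields,
    amount: Math.round(fields.amount * 100) / 100, // keep 2 decimal places
    id,
    createdAt: now.toISOString(),
  };
}
```

In the CLI this could be called as `toRecord(fields, randomUUID(), new Date())`, with `randomUUID` imported from `node:crypto`. Passing the id and time in makes the function easy to unit test.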

CLI Commands:

# Add expense (default command)
expense "description of expense"
expense add "description of expense"

# List expenses
expense list                    # Last 10 expenses
expense list --all              # All expenses
expense list --month 2025-12    # Specific month
expense list --category dining  # Filter by category

# Generate reports
expense report                  # Current month
expense report --month 2025-12  # Specific month
expense report --year 2025      # Full year

# Export data
expense export --month 2025-12 --format csv
expense export --month 2025-12 --format json
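
The "default command" rule (a bare description means `add`) is the only routing detail that is not obvious. The sketch below shows it without any library; `resolveCommand` is a hypothetical helper, and in the real project yargs would take over this job.

```typescript
const COMMANDS = ['add', 'list', 'report', 'export'] as const;
type CommandName = (typeof COMMANDS)[number];

interface ResolvedCommand {
  name: CommandName;
  rest: string[]; // remaining arguments (description, flags)
}

// Hypothetical helper: `args` is what process.argv.slice(2) would give you.
function resolveCommand(args: string[]): ResolvedCommand {
  const [first, ...rest] = args;
  const known = COMMANDS.find((c) => c === first);
  if (known) return { name: known, rest };
  // Anything else is treated as a description for the default `add` command.
  return { name: 'add', rest: args };
}
```

One consequence of this rule: a description that is exactly the word `list` or `report` is read as a command, so the explicit `expense add "..."` form is the safe choice for odd inputs.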

Non-Functional Requirements

Requirement	Target	Rationale
Response time	< 2 seconds	User shouldn’t wait for AI processing
Error rate	< 5% for well-formed inputs	Most expenses should extract cleanly
Storage format	Human-readable JSON	Easy debugging and manual edits
API cost	< $0.01 per expense	Use gpt-4o-mini for cost efficiency

Real World Outcome

When you run the CLI, here’s exactly what you’ll see in your terminal:

Adding an Expense

$ expense 'Coffee with team $23.40 at Starbucks this morning'

 Expense recorded

+-------------------------------------------------------------------+
|                        EXPENSE RECORD                             |
+-------------------------------------------------------------------+
|  Amount:     $23.40                                               |
|  Category:   dining                                               |
|  Vendor:     Starbucks                                            |
|  Date:       2025-12-22                                           |
|  Notes:      Coffee with team                                     |
+-------------------------------------------------------------------+
|  ID:         exp_a7f3b2c1                                         |
|  Created:    2025-12-22T10:34:12Z                                 |
+-------------------------------------------------------------------+

Saved to ~/.expenses/2025-12.json

Complex Natural Language Input

$ expense 'Took an Uber from airport to hotel, $67.80, for the Chicago conference trip'

 Expense recorded

+-------------------------------------------------------------------+
|                        EXPENSE RECORD                             |
+-------------------------------------------------------------------+
|  Amount:     $67.80                                               |
|  Category:   travel                                               |
|  Vendor:     Uber                                                 |
|  Date:       2025-12-22                                           |
|  Notes:      Airport to hotel, Chicago conference                 |
+-------------------------------------------------------------------+
|  ID:         exp_b8e4c3d2                                         |
|  Created:    2025-12-22T10:35:45Z                                 |
+-------------------------------------------------------------------+

Monthly Report

$ expense report --month 2025-12

+-------------------------------------------------------------------+
|              EXPENSE REPORT: December 2025                        |
+-------------------------------------------------------------------+
|                                                                   |
|  SUMMARY BY CATEGORY                                              |
|  -------------------                                              |
|  dining        |########             |  $234.50  (12 expenses)   |
|  travel        |#################### |  $567.80  (5 expenses)    |
|  office        |###                  |  $89.20   (3 expenses)    |
|  entertainment |##                   |  $45.00   (2 expenses)    |
|  -----------------------------------------------------------------|
|  TOTAL                                 $936.50  (22 expenses)    |
|                                                                   |
+-------------------------------------------------------------------+

Exported to ~/.expenses/report-2025-12.csv
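
A category summary like this boils down to a group-by plus a bar scaled to the largest total. The following is a sketch under assumptions: `summarize` and `renderBars` are hypothetical helpers, the bar width of 20 is arbitrary, and amounts are summed as floats (integer cents would be more precise).

```typescript
interface ExpenseLike {
  category: string;
  amount: number;
}

interface CategoryTotal {
  category: string;
  total: number;
  count: number;
}

// Hypothetical helper: sum amounts and count expenses per category, largest total first.
function summarize(expenses: ExpenseLike[]): CategoryTotal[] {
  const byCategory = new Map<string, CategoryTotal>();
  for (const { category, amount } of expenses) {
    const entry = byCategory.get(category) ?? { category, total: 0, count: 0 };
    entry.total += amount;
    entry.count += 1;
    byCategory.set(category, entry);
  }
  return Array.from(byCategory.values()).sort((a, b) => b.total - a.total);
}

// Hypothetical helper: one text row per category, bar length relative to the largest total.
function renderBars(rows: CategoryTotal[], width = 20): string[] {
  const max = Math.max(...rows.map((r) => r.total), 0);
  return rows.map((r) => {
    const filled = max > 0 ? Math.round((r.total / max) * width) : 0;
    const bar = '#'.repeat(filled).padEnd(width, ' ');
    const noun = r.count === 1 ? 'expense' : 'expenses';
    return `${r.category.padEnd(14)}|${bar}|  $${r.total.toFixed(2)}  (${r.count} ${noun})`;
  });
}
```

Scaling by amount rather than by count means the longest bar always belongs to the category where the most money went, which is usually what a spending report is meant to show.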

Error Handling

$ expense "bought something"

! Could not extract expense details

Missing information:
  - Amount: No monetary value found
  - Vendor: No vendor/merchant identified

Please include at least an amount, e.g.:
  expense 'bought lunch $15 at Chipotle'

Solution Architecture

System Architecture Diagram

+------------------------------------------------------------------+
|                        EXPENSE TRACKER CLI                        |
+------------------------------------------------------------------+
|                                                                   |
|  +------------------+     +------------------+     +-----------+  |
|  |  CLI Interface   |     |  Core Engine     |     |  Storage  |  |
|  |------------------|     |------------------|     |-----------|  |
|  | - Argument Parse |---->| - AI Extraction  |---->| - File I/O|  |
|  | - Command Route  |     | - Validation     |     | - JSON    |  |
|  | - Output Format  |<----| - Error Handle   |<----| - Reports |  |
|  +------------------+     +------------------+     +-----------+  |
|           |                       |                               |
|           |                       v                               |
|           |               +------------------+                    |
|           |               |   AI SDK Layer   |                    |
|           |               |------------------|                    |
|           +-------------->| - generateObject |                    |
|                           | - Zod Schemas    |                    |
|                           | - Provider Config|                    |
|                           +--------+---------+                    |
|                                    |                              |
+------------------------------------------------------------------+
                                     |
                                     v
                          +--------------------+
                          |   OpenAI API       |
                          | (gpt-4o-mini)      |
                          +--------------------+

Module Breakdown

src/
+-- index.ts              # Entry point, CLI routing
+-- commands/             # Command handlers
|   +-- add.ts            # Add expense command
|   +-- list.ts           # List expenses command
|   +-- report.ts         # Generate reports command
|   +-- export.ts         # Export to CSV/JSON
+-- core/                 # Core business logic
|   +-- extractor.ts      # AI extraction logic
|   +-- validator.ts      # Additional validation
|   +-- formatter.ts      # Output formatting
+-- schemas/              # Zod schema definitions
|   +-- expense.ts        # Expense schema
|   +-- report.ts         # Report schema
+-- storage/              # Data persistence
|   +-- json-store.ts     # JSON file operations
|   +-- paths.ts          # File path management
+-- utils/                # Utility functions
|   +-- date.ts           # Date parsing/formatting
|   +-- currency.ts       # Currency formatting
|   +-- logger.ts         # Colored console output
+-- types/                # TypeScript type definitions
    +-- index.ts          # Exported types (derived from schemas)

Data Flow Diagram

User Input                     Processing                      Output
==========                     ==========                      ======

"Coffee $5
at Starbucks"
      |
      v
+-------------+
| CLI Parser  |
| (yargs)     |
+------+------+
       |
       v
+-------------+    +--------+    +-------------+
| Extractor   |--->| AI SDK |--->|   OpenAI    |
|             |    |--------|    | gpt-4o-mini |
|             |<---|        |<---|             |
+------+------+    +--------+    +-------------+
       |
       v
+-------------+
| Validator   |
| (Zod)       |
+------+------+
       |
       +--------+--------+
       |                 |
       v                 v
+-------------+   +------------+
| JSON Store  |   | Formatter  |-----> Terminal
|             |   | (tables)   |       Output
+-------------+   +------------+
       |
       v
~/.expenses/
  2025-12.json

File Structure Recommendation

expense-tracker-cli/
+-- package.json
+-- tsconfig.json
+-- .env                    # OPENAI_API_KEY
+-- .env.example            # Template for .env
+-- .gitignore
+-- README.md
|
+-- src/
|   +-- index.ts            # Main entry point
|   +-- cli.ts              # CLI setup with yargs
|   +-- commands/
|   |   +-- add.ts
|   |   +-- list.ts
|   |   +-- report.ts
|   |   +-- export.ts
|   +-- core/
|   |   +-- extractor.ts
|   |   +-- validator.ts
|   |   +-- formatter.ts
|   +-- schemas/
|   |   +-- expense.ts
|   +-- storage/
|   |   +-- json-store.ts
|   +-- utils/
|       +-- date.ts
|       +-- logger.ts
|
+-- tests/
|   +-- unit/
|   |   +-- schemas.test.ts
|   |   +-- validator.test.ts
|   +-- integration/
|   |   +-- extractor.test.ts
|   |   +-- commands.test.ts
|   +-- fixtures/
|       +-- sample-inputs.json
|
+-- dist/                   # Compiled JavaScript (gitignored)

Phased Implementation Guide

Phase 1: Foundation (Day 1)

Goal: Get a minimal working extraction with hardcoded input.

Milestone: generateObject returns a parsed expense from natural language.

Tasks:

Project Setup

mkdir expense-tracker-cli && cd expense-tracker-cli
pnpm init
pnpm add ai @ai-sdk/openai zod
pnpm add -D typescript @types/node tsx
npx tsc --init

Create Basic Schema (src/schemas/expense.ts)
- Define expenseSchema with amount, vendor, category
- Export Expense type using z.infer
- Add descriptions to each field
Create Extractor (src/core/extractor.ts)
- Import generateObject from ai
- Import your schema
- Call generateObject with hardcoded prompt
- Log the result
Test Manually
```
pnpm tsx src/core/extractor.ts
```

Success Criteria: Running the extractor logs a parsed expense object.

Phase 2: CLI Interface (Day 2)

Goal: Accept input from command line and format output nicely.

Milestone: Running pnpm tsx src/index.ts 'Coffee $5 at Starbucks' shows formatted expense. (Use single quotes: inside double quotes the shell expands $5 before the CLI sees it.)

Tasks:

Add CLI Dependencies

pnpm add yargs chalk
pnpm add -D @types/yargs

Create CLI Entry Point (src/index.ts)
- Use yargs to parse arguments
- Route to appropriate command handler
- Handle --help and --version
Create Add Command (src/commands/add.ts)
- Accept natural language input
- Call extractor
- Format and display result
Create Formatter (src/core/formatter.ts)
- Use chalk for colors
- Create box-drawing output
- Handle success and error states

Success Criteria: Natural language input produces beautifully formatted output.

Phase 3: Persistence (Day 3)

Goal: Store expenses in JSON files organized by month.

Milestone: Expenses persist across CLI invocations; list command shows history.

Tasks:

Create Storage Module (src/storage/json-store.ts)
- Define expenses directory (~/.expenses/)
- Implement loadExpenses(month: string)
- Implement saveExpense(expense: Expense)
- Handle file creation and concurrent writes
Update Add Command
- Generate UUID for each expense
- Add createdAt timestamp
- Save to appropriate monthly file
Create List Command (src/commands/list.ts)
- Load expenses from storage
- Format as table
- Support --month and --category filters

Success Criteria: Expenses persist in ~/.expenses/2025-12.json; list shows them.
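
One possible shape for the storage module is sketched below. It is synchronous for brevity and takes the base directory as a parameter, which is an assumption: the task list's `loadExpenses(month)` and `saveExpense(expense)` would fix it to `~/.expenses/`. `StoredExpense` and `monthKey` are hypothetical names.

```typescript
import { existsSync, mkdirSync, readFileSync, renameSync, writeFileSync } from 'node:fs';
import { join } from 'node:path';

interface StoredExpense {
  id: string;
  date: string; // ISO 8601, e.g. "2025-12-22"
  [field: string]: unknown;
}

// "2025-12-22" -> "2025-12": the month key doubles as the file name.
function monthKey(isoDate: string): string {
  return isoDate.slice(0, 7);
}

function loadExpenses(baseDir: string, month: string): StoredExpense[] {
  const file = join(baseDir, `${month}.json`);
  if (!existsSync(file)) return []; // no expenses recorded for this month yet
  return JSON.parse(readFileSync(file, 'utf8')) as StoredExpense[];
}

function saveExpense(baseDir: string, expense: StoredExpense): string {
  mkdirSync(baseDir, { recursive: true });
  const month = monthKey(expense.date);
  const file = join(baseDir, `${month}.json`);
  const all = [...loadExpenses(baseDir, month), expense];
  // Write to a temp file, then rename, so readers never see a half-written file.
  const tmp = `${file}.${process.pid}.tmp`;
  writeFileSync(tmp, JSON.stringify(all, null, 2));
  renameSync(tmp, file);
  return file;
}
```

In the CLI the base directory could be `join(homedir(), '.expenses')`, with `homedir` from `node:os`. The temp-file-and-rename step guards against torn writes only. Two processes that read, append, and write at the same moment can still lose one update, so true concurrent safety would need a lock file or similar.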

Phase 4: Error Handling (Day 4)

Goal: Handle all error cases gracefully with helpful messages.

Milestone: Invalid inputs produce actionable error messages, not stack traces.

Tasks:

Enhance Extractor Error Handling
- Catch NoObjectGeneratedError (exported from ai; its error name is AI_NoObjectGeneratedError)
- Catch ZodError for validation failures
- Identify missing fields
Create Error Messages
- For missing amount: suggest format
- For missing vendor: suggest including business name
- For ambiguous input: ask for clarification
Add Retry Logic (Optional)
- Retry once with enhanced prompt on failure
- Include example in retry prompt
Test Edge Cases
- “bought something” (too vague)
- “” (empty input)
- “asdfghjkl” (gibberish)
- “coffee coffee coffee $5 $10 $15” (ambiguous)

Success Criteria: All error cases show user-friendly messages with suggestions.
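
The empty-input check and the optional retry can be sketched together. In this sketch the extractor is passed in as a function, so `ExtractFn`, `ExtractResult`, the hint wording, and the example text are all hypothetical; in the project, `extract` would wrap the generateObject call.

```typescript
// Result shape assumed for this sketch.
type ExtractResult<T> =
  | { success: true; data: T }
  | { success: false; error: 'empty_input' | 'extraction_failed'; hint: string };

// Stands in for a function that calls generateObject and throws on failure.
type ExtractFn<T> = (prompt: string) => Promise<T>;

const EXAMPLE = 'Example: "Coffee $5 at Starbucks" -> amount 5, vendor "Starbucks".';

async function extractWithRetry<T>(input: string, extract: ExtractFn<T>): Promise<ExtractResult<T>> {
  const text = input.trim();
  if (text === '') {
    // Fail before spending an API call on empty input.
    return {
      success: false,
      error: 'empty_input',
      hint: 'Describe the expense, e.g. "Coffee $5 at Starbucks".',
    };
  }
  try {
    return { success: true, data: await extract(`Extract expense from: ${text}`) };
  } catch {
    // First attempt failed: retry once with an example added to the prompt.
    // A fuller version would retry only on extraction errors, not network errors.
    try {
      const data = await extract(`${EXAMPLE}\nExtract expense from: ${text}`);
      return { success: true, data };
    } catch {
      return {
        success: false,
        error: 'extraction_failed',
        hint: 'Include an amount and a business name, e.g. "Lunch $12 at Chipotle".',
      };
    }
  }
}
```

Injecting `extract` keeps the retry policy testable without mocking the SDK, and the `hint` field gives the formatter something actionable to print instead of a stack trace.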

Phase 5: Reports & Polish (Day 5)

Goal: Complete the feature set with reports and professional polish.

Milestone: report command generates category summaries; CLI feels professional.

Tasks:

Create Report Command (src/commands/report.ts)
- Load month’s expenses
- Group by category
- Calculate totals
- Generate ASCII bar chart
Create Export Command (src/commands/export.ts)
- Export to CSV format
- Export to JSON format
- Support date range filters
Add Polish
- Progress spinners during AI call
- Confirmation prompts for destructive actions
- Help text with examples
- Version number from package.json
Create bin Script
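- Add "bin": { "expense": "./dist/index.js" } to package.json
- Put #!/usr/bin/env node on the first line of src/index.ts so the installed binary runs under Node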
- Compile TypeScript
- Test with pnpm link

Success Criteria: Professional CLI that can be installed globally and used daily.
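
The grouping and bar-chart steps of the report command might look like the sketch below. `totalsByCategory`, `renderBarChart`, and the `#` bar character are illustrative choices, not names the guide prescribes.

```typescript
// Only the fields the report needs; a stand-in for the stored expense type.
interface ReportExpense {
  amount: number;
  category: string;
}

// Sum amounts per category.
function totalsByCategory(expenses: ReportExpense[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const { category, amount } of expenses) {
    totals.set(category, (totals.get(category) ?? 0) + amount);
  }
  return totals;
}

// Render one line per category, largest first, with bars scaled to the largest total.
function renderBarChart(totals: Map<string, number>, maxBarWidth = 20): string {
  const entries = [...totals.entries()].sort((a, b) => b[1] - a[1]);
  if (entries.length === 0) return 'No expenses recorded.';
  const largest = entries[0][1];
  const labelWidth = Math.max(...entries.map(([name]) => name.length));
  return entries
    .map(([name, total]) => {
      const scaled = largest > 0 ? Math.round((total / largest) * maxBarWidth) : 0;
      const bar = '#'.repeat(Math.max(1, scaled)); // always show at least one mark
      return `${name.padEnd(labelWidth)}  ${bar} $${total.toFixed(2)}`;
    })
    .join('\n');
}

const sample: ReportExpense[] = [
  { amount: 5, category: 'dining' },
  { amount: 45.5, category: 'travel' },
  { amount: 10, category: 'dining' },
];
console.log(renderBarChart(totalsByCategory(sample)));
```

Scaling against the largest category rather than the grand total makes the biggest bar always fill the available width, which tends to read better in a narrow terminal.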

Testing Strategy

Unit Tests: Schema Validation

Test that your schemas correctly validate and reject data:

// tests/unit/schemas.test.ts
import { describe, it, expect } from 'vitest';
import { expenseSchema } from '../../src/schemas/expense';

describe('expenseSchema', () => {
  it('accepts valid expense data', () => {
    const valid = {
      amount: 45.50,
      vendor: 'Starbucks',
      category: 'dining',
      date: '2025-12-22',
    };

    const result = expenseSchema.safeParse(valid);
    expect(result.success).toBe(true);
  });

  it('rejects negative amounts', () => {
    const invalid = {
      amount: -10,
      vendor: 'Starbucks',
      category: 'dining',
      date: '2025-12-22',
    };

    const result = expenseSchema.safeParse(invalid);
    expect(result.success).toBe(false);
  });

  it('rejects invalid categories', () => {
    const invalid = {
      amount: 10,
      vendor: 'Starbucks',
      category: 'invalid_category',
      date: '2025-12-22',
    };

    const result = expenseSchema.safeParse(invalid);
    expect(result.success).toBe(false);
  });

  it('allows optional notes to be missing', () => {
    const valid = {
      amount: 10,
      vendor: 'Starbucks',
      category: 'dining',
      date: '2025-12-22',
      // notes is missing - should be okay
    };

    const result = expenseSchema.safeParse(valid);
    expect(result.success).toBe(true);
  });
});

Integration Tests: Mocked LLM

Test extraction logic without hitting the real API:

// tests/integration/extractor.test.ts
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { extractExpense } from '../../src/core/extractor';

// Mock only generateObject; keep the SDK's real exports (including its error classes)
vi.mock('ai', async (importOriginal) => ({
  ...(await importOriginal<typeof import('ai')>()),
  generateObject: vi.fn(),
}));

import { generateObject } from 'ai';

describe('extractExpense', () => {
  beforeEach(() => {
    vi.clearAllMocks();
  });

  it('extracts expense from simple input', async () => {
    // Mock the AI response
    vi.mocked(generateObject).mockResolvedValue({
      object: {
        amount: 5.00,
        vendor: 'Starbucks',
        category: 'dining',
        date: '2025-12-22',
        notes: 'Coffee',
      },
    });

    const result = await extractExpense('Coffee $5 at Starbucks');

    expect(result.success).toBe(true);
    expect(result.data?.amount).toBe(5.00);
    expect(result.data?.vendor).toBe('Starbucks');
  });

  it('handles extraction failure gracefully', async () => {
    // The class is exported as NoObjectGeneratedError; its name property is 'AI_NoObjectGeneratedError'
    const { NoObjectGeneratedError } = await import('ai');

    vi.mocked(generateObject).mockRejectedValue(
      // Required constructor fields vary by SDK version, hence the cast
      new NoObjectGeneratedError({ message: 'Could not generate' } as any)
    );

    const result = await extractExpense('bought something');

    expect(result.success).toBe(false);
    expect(result.error).toBe('extraction_failed');
  });
});

Edge Case Testing

Create a fixtures file with challenging inputs:

// tests/fixtures/sample-inputs.json
{
  "valid_inputs": [
    {
      "input": "Coffee $5 at Starbucks",
      "expected": { "amount": 5.00, "vendor": "Starbucks", "category": "dining" }
    },
    {
      "input": "Uber to airport $45.50",
      "expected": { "amount": 45.50, "vendor": "Uber", "category": "travel" }
    },
    {
      "input": "bought pens and paper at staples for around twenty bucks",
      "expected": { "amount": 20.00, "vendor": "Staples", "category": "office" }
    }
  ],
  "edge_cases": [
    {
      "input": "spent $10 and $20",
      "note": "Multiple amounts - should pick one or sum"
    },
    {
      "input": "coffee this morning",
      "note": "No amount - should fail gracefully"
    },
    {
      "input": "",
      "note": "Empty input - should fail immediately"
    }
  ]
}
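
To use the fixtures in a test loop, each extracted object has to be compared against a partial `expected` object. The helper below is one way to do that. `mismatches` is a hypothetical name, and the tolerance and case-insensitive string comparison are assumptions you may want to tighten.

```typescript
type Expected = Record<string, string | number>;

// Return the keys whose actual value differs from the fixture's expected value.
// Keys that appear only in `actual` (such as notes) are ignored.
function mismatches(actual: Record<string, unknown>, expected: Expected): string[] {
  return Object.keys(expected).filter((key) => {
    const want = expected[key];
    const got = actual[key];
    if (typeof want === 'number') {
      // Allow for tiny floating-point differences in amounts.
      return typeof got !== 'number' || Math.abs(got - want) > 0.005;
    }
    // Compare strings case-insensitively: "staples" vs "Staples" is not a real failure.
    return typeof got !== 'string' || got.toLowerCase() !== want.toLowerCase();
  });
}

const extracted = { amount: 20, vendor: 'staples', category: 'office', notes: 'pens and paper' };
console.log(mismatches(extracted, { amount: 20.0, vendor: 'Staples', category: 'office' }));
```

Returning the list of mismatched keys, rather than a boolean, makes a failing fixture report which field the model got wrong.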

Test Commands

# Run all tests
pnpm test

# Run with coverage
pnpm test --coverage

# Run specific test file
pnpm test tests/unit/schemas.test.ts

# Run in watch mode during development
pnpm test --watch

Common Pitfalls & Debugging

Pitfall 1: Missing Schema Descriptions

Symptom: LLM returns incorrect categories or misinterprets fields.

Bad:

const schema = z.object({
  amount: z.number(),
  category: z.enum(['d', 't', 'o', 'e']),  // What do these mean?
});

Good:

const schema = z.object({
  amount: z.number()
    .describe('The monetary amount in US dollars'),
  category: z.enum(['dining', 'travel', 'office', 'entertainment'])
    .describe('Use dining for restaurants and coffee shops'),
});

Debug: Print the JSON Schema that gets sent to the LLM:

import { zodToJsonSchema } from 'zod-to-json-schema';
console.log(JSON.stringify(zodToJsonSchema(schema), null, 2));

Pitfall 2: Not Handling AI Errors

Symptom: Unhandled promise rejection or cryptic error messages.

Bad:

const { object } = await generateObject({ ... });
// Crashes if extraction fails!

Good:

// NoObjectGeneratedError is exported from 'ai'; its name property is 'AI_NoObjectGeneratedError'
try {
  const { object } = await generateObject({ ... });
  return { success: true, data: object };
} catch (error) {
  if (NoObjectGeneratedError.isInstance(error)) {
    return { success: false, error: 'extraction_failed' };
  }
  throw error;
}
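
When several failure types need different user messages, it can help to classify the caught error first. The sketch below does this by the error's `name` property. `classifyError` and `FailureKind` are hypothetical, and the name strings are assumptions about what the AI SDK and Zod set, so verify them against your installed versions.

```typescript
type FailureKind = 'generation' | 'validation' | 'network' | 'unknown';

// Classify a caught value by its `name` property (assumed names, see above).
function classifyError(error: unknown): FailureKind {
  const name =
    typeof error === 'object' && error !== null && 'name' in error
      ? String((error as { name: unknown }).name)
      : '';
  switch (name) {
    case 'AI_NoObjectGeneratedError':
      return 'generation'; // the model produced no usable object
    case 'ZodError':
      return 'validation'; // data did not match the schema
    case 'AI_APICallError':
      return 'network'; // the request to the provider failed
    default:
      return 'unknown';
  }
}

console.log(classifyError(new Error('something else')));
```

Anything classified as 'unknown' is best rethrown rather than turned into a friendly message, so that genuine bugs stay visible.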

Pitfall 3: Hardcoded Date Handling

Symptom: “yesterday” always extracts as a fixed date.

Bad:

// No context about current date
const { object } = await generateObject({
  schema,
  prompt: input,
});

Good:

// Include current date in prompt
const today = new Date().toISOString().split('T')[0];
const { object } = await generateObject({
  schema,
  prompt: `Today is ${today}. Extract expense from: ${input}`,
});
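
One caveat with the snippet above: `toISOString()` reports the date in UTC, so late in the evening (or early in the morning, depending on your offset) it can differ from the user's local calendar date. A small helper built from the local date parts avoids that. `localISODate` is a hypothetical name.

```typescript
// Format a Date as YYYY-MM-DD using the machine's local time zone.
function localISODate(date: Date = new Date()): string {
  const year = date.getFullYear();
  const month = String(date.getMonth() + 1).padStart(2, '0'); // getMonth() is 0-based
  const day = String(date.getDate()).padStart(2, '0');
  return `${year}-${month}-${day}`;
}

// 11:30 PM local time on 22 December 2025
console.log(localISODate(new Date(2025, 11, 22, 23, 30)));
```

Passing the result into the prompt in place of `today` keeps "yesterday" anchored to the day the user actually experienced.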

Pitfall 4: Type Mismatch Between Schema and Storage

Symptom: TypeScript errors when saving or loading expenses.

Bad:

// Schema
const schema = z.object({ amount: z.number() });

// Storage adds extra fields without updating type
function save(expense: Expense) {
  const record = {
    ...expense,
    id: uuid(),        // Not in schema!
    createdAt: now(),  // Not in schema!
  };
  // TypeScript: 'record' doesn't match Expense type
}

Good:

// Schema for AI extraction
const extractionSchema = z.object({
  amount: z.number(),
  vendor: z.string(),
});

// Schema for storage (extends extraction)
const expenseRecordSchema = extractionSchema.extend({
  id: z.string().uuid(),
  createdAt: z.string().datetime(),
});

type ExtractedExpense = z.infer<typeof extractionSchema>;
type ExpenseRecord = z.infer<typeof expenseRecordSchema>;

Pitfall 5: Not Validating Loaded Data

Symptom: App crashes when JSON file is manually edited or corrupted.

Bad:

function loadExpenses(): Expense[] {
  const data = fs.readFileSync(file, 'utf-8');
  return JSON.parse(data);  // Trust whatever is in the file!
}

Good:

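function loadExpenses(): Expense[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(fs.readFileSync(file, 'utf-8'));
  } catch {
    // Corrupted or unreadable file: warn instead of crashing
    console.warn('Could not read expenses file; treating it as empty.');
    return [];
  }
  if (!Array.isArray(parsed)) return [];

  // Validate each expense once, keeping only the entries that pass
  return parsed.flatMap((item: unknown) => {
    const result = expenseSchema.safeParse(item);
    if (!result.success) {
      console.warn('Skipping invalid expense:', item);
      return [];
    }
    return [result.data];
  });
}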

Pitfall 6: Blocking I/O in Async Context

Symptom: CLI feels slow or unresponsive.

Bad:

// Blocks the event loop
const expenses = JSON.parse(fs.readFileSync(file, 'utf-8'));

Good:

// Non-blocking
const data = await fs.promises.readFile(file, 'utf-8');
const expenses = JSON.parse(data);

Pitfall 7: No API Key Error Message

Symptom: Cryptic error about authentication or missing API key.

Debug and Fix:

// At the top of your entry point
const apiKey = process.env.OPENAI_API_KEY;
if (!apiKey) {
  console.error('Error: OPENAI_API_KEY environment variable is not set.');
  console.error('');
  console.error('To fix this:');
  console.error('1. Create a .env file with: OPENAI_API_KEY=sk-...');
  console.error('2. Or export it: export OPENAI_API_KEY=sk-...');
  process.exit(1);
}

Extensions & Challenges

Extension 1: Smart Category Learning

Enhance the categorization to learn from user corrections:

Challenge: When a user corrects a category (e.g., “That should be ‘travel’, not ‘dining’”), store the correction and use it to influence future categorizations.

Implementation Ideas:

Store a corrections.json file mapping vendor names to preferred categories
Include recent corrections in the system prompt
Use a separate LLM call to check against known preferences

// Before extraction, check preferences
const vendorHint = getVendorPreference('Uber'); // Returns 'travel'

// Include the stored preference in the prompt
const prompt = `
  User preferences: Uber should be categorized as '${vendorHint}'.

  Extract expense from: ${input}
`;
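
Generalizing the idea above to a whole corrections file, the prompt can be built from every stored preference whose vendor appears in the input. This is a sketch under assumptions: `buildPrompt` is a hypothetical helper, and corrections.json is assumed to map lowercased vendor names to categories.

```typescript
// Assumed contents of corrections.json: vendor name (lowercased) -> preferred category.
type Corrections = Record<string, string>;

function buildPrompt(input: string, corrections: Corrections): string {
  const lowered = input.toLowerCase();
  // Only mention vendors that actually appear in the input, to keep the prompt short.
  const hints = Object.entries(corrections)
    .filter(([vendor]) => lowered.includes(vendor))
    .map(([vendor, category]) => `- "${vendor}" should be categorized as '${category}'.`);
  const preferences = hints.length > 0 ? `User preferences:\n${hints.join('\n')}\n\n` : '';
  return `${preferences}Extract expense from: ${input}`;
}

console.log(buildPrompt('Uber to airport $45.50', { uber: 'travel', starbucks: 'dining' }));
```

Filtering to matching vendors avoids sending the full correction history on every call, which matters once the file grows.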

Extension 2: Receipt Image Processing

Add support for extracting expenses from photos of receipts.

Challenge: Accept an image path and extract expense details using vision capabilities.

Implementation Ideas:

Use a vision-capable model (GPT-4o, Claude)
Load image and encode as base64
Pass to generateObject with vision prompt

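import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { readFileSync } from 'fs';
import { expenseSchema } from './schemas/expense'; // adjust the path to your file layout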

async function extractFromReceipt(imagePath: string) {
  const imageData = readFileSync(imagePath).toString('base64');

  const { object } = await generateObject({
    model: openai('gpt-4o'),
    schema: expenseSchema,
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: 'Extract expense details from this receipt:' },
          { type: 'image', image: imageData },
        ],
      },
    ],
  });

  return object;
}

Extension 3: Multi-Currency Support

Handle expenses in different currencies with automatic conversion.

Challenge: Extract currency from input (“50 EUR at cafe in Paris”) and convert to base currency.

Implementation Ideas:

Add currency field to schema
Integrate with exchange rate API
Store original amount and converted amount

const expenseSchema = z.object({
  amountOriginal: z.number(),
  currencyOriginal: z.string().length(3).describe('ISO 4217 currency code like USD, EUR, GBP'),
  amountUSD: z.number().optional().describe('Amount converted to USD'),
  // ...other fields
});
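
The conversion step itself can stay independent of any particular exchange rate API by taking the rates as an argument. In the sketch below, `withUsdAmount` and the `UsdRates` table are hypothetical, and the 1.1 rate is an illustrative number, not a real quote.

```typescript
// Hypothetical rate table: USD per one unit of the currency, fetched elsewhere.
type UsdRates = Record<string, number>;

interface ConvertedAmount {
  amountOriginal: number;
  currencyOriginal: string;
  amountUSD?: number;
}

function withUsdAmount(amountOriginal: number, currencyOriginal: string, rates: UsdRates): ConvertedAmount {
  const code = currencyOriginal.toUpperCase();
  const rate = code === 'USD' ? 1 : rates[code];
  if (rate === undefined) {
    // Unknown currency: keep the original and leave amountUSD unset.
    return { amountOriginal, currencyOriginal: code };
  }
  // Round to cents to avoid long floating-point tails.
  const amountUSD = Math.round(amountOriginal * rate * 100) / 100;
  return { amountOriginal, currencyOriginal: code, amountUSD };
}

console.log(withUsdAmount(50, 'EUR', { EUR: 1.1 })); // 1.1 is an illustrative rate
```

Leaving `amountUSD` unset for unknown currencies matches the optional field in the schema above and lets reports flag unconverted expenses instead of silently treating them as dollars.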

Extension 4: Voice Input

Accept voice memos and transcribe them before extraction.

Challenge: Record or accept audio file, transcribe with Whisper, then extract expense.

Implementation Ideas:

Use system microphone or accept audio file path
Call Whisper API for transcription
Pipe transcript to expense extractor

# Goal usage
expense --voice                    # Start recording
expense --audio recording.m4a     # From file

Resources

Essential Documentation

Resource	URL	What You’ll Learn
AI SDK Docs - Structured Data	https://ai-sdk.dev/docs/ai-sdk-core/generating-structured-data	Core `generateObject` usage
Zod Documentation	https://zod.dev	Schema design patterns
AI SDK Error Handling	https://ai-sdk.dev/docs/ai-sdk-core/error-handling	AI-specific error types

Books with Specific Chapters

Book	Author	Relevant Chapters
“Programming TypeScript”	Boris Cherny	Ch. 3 (Types), Ch. 6 (Advanced Types), Ch. 7 (Error Handling)
“JavaScript: The Definitive Guide”	David Flanagan	Ch. 13 (Asynchronous JavaScript)
“Command-Line Rust”	Ken Youens-Clark	Ch. 1-2 (CLI patterns applicable to TypeScript)

Video Resources

Fireship: Zod in 100 seconds
Theo: AI SDK deep dive
Matt Pocock: Advanced TypeScript patterns

Community

AI SDK Discord
TypeScript Discord
Zod GitHub Discussions

Self-Assessment Checklist

Before considering this project complete, verify your understanding:

Conceptual Understanding

Can you explain the difference between generateText and generateObject to a colleague?
Can you describe what data the LLM receives when you call generateObject?
Can you explain why schema descriptions improve extraction accuracy?
Can you list 3 types of errors that can occur in AI-based extraction?
Can you explain how z.infer<typeof schema> works?

Implementation Skills

Can you create a Zod schema with required and optional fields?
Can you handle AI_NoObjectGeneratedError gracefully?
Can you parse command-line arguments using yargs?
Can you persist JSON data to the filesystem?
Can you format terminal output with colors using chalk?

Schema Design

Do your schema field names clearly indicate their purpose?
Do all schema fields have descriptive .describe() calls?
Are appropriate fields marked as .optional()?
Does your enum have clear category descriptions?
Would the LLM understand your schema with no other context?

Error Handling

Does your CLI show helpful messages for missing amounts?
Does your CLI suggest corrections for invalid input?
Does your CLI handle network errors gracefully?
Does your CLI validate loaded data from JSON files?
Is it impossible to crash the CLI with bad input?

Code Quality

Is your code organized into logical modules?
Are types derived from schemas (not duplicated)?
Are async operations properly awaited?
Is the API key validated at startup?
Can another developer understand your code structure?

Real-World Readiness

Can you add an expense in under 2 seconds?
Can you generate a monthly report?
Can you export to CSV?
Does the CLI have helpful --help output?
Could you use this tool daily for real expense tracking?

The Core Question You’ve Answered

“How do I transform messy, unstructured human text into clean, typed, validated data structures using AI?”

This is THE fundamental pattern of modern AI applications. Every chatbot that fills out forms, every assistant that creates calendar events, every tool that extracts data from documents - they all use this pattern.

By building this expense tracker, you have mastered:

Schema Design for AI: How to craft Zod schemas that guide LLM output
Type-Safe AI: How to derive TypeScript types from runtime schemas
Graceful Error Handling: How to recover from AI extraction failures
Practical CLI Development: How to build tools people actually want to use

You are now ready to build more complex AI applications. The patterns you learned here - structured output, schema design, error handling - are the foundation of everything from chatbots to data pipelines.

Project Guide Version 1.0 - December 2025
