Project 6: Tool Router (Function Schemas as Contracts)

Tool-call trace logs with policy decisions and argument validation status.

Quick Reference

  • Difficulty: Level 3: Advanced
  • Time Estimate: See main guide estimates (typically 3-8 days except capstone)
  • Main Programming Language: TypeScript
  • Alternative Programming Languages: Python, Go
  • Coolness Level: Level 4: Agent Systems Core
  • Business Potential: 5. Product Foundation
  • Knowledge Area: Agent Tooling
  • Software or Tool: Intent router + policy gate
  • Main Book: Building Microservices (Newman)
  • Concept Clusters: Tool Calling and MCP Interoperability; Instruction Hierarchy and Injection Defense

1. Learning Objectives

By completing this project, you will:

  1. Design a tool registry with JSON Schema-based contracts that define each tool’s name, description, parameters, return type, and risk classification.
  2. Implement an intent classifier that maps natural language user requests to candidate tools with confidence scores and an explicit abstention path.
  3. Build a schema validator that checks LLM-generated tool arguments against JSON Schema definitions before any tool execution occurs.
  4. Implement a risk-aware policy gate that applies allow/deny/escalate decisions based on tool risk level, user tier, and environment context.
  5. Produce auditable trace logs that record every routing decision with sufficient detail for incident replay and compliance review.

2. All Theory Needed (Per-Concept Breakdown)

Tool/Function Calling Architecture Across LLM Providers

Fundamentals Tool calling (also called function calling) is the mechanism by which LLMs invoke external capabilities. Instead of generating free-form text, the model outputs a structured request to call a specific function with specific arguments. The LLM itself does not execute the function; the host application receives the structured call, validates it, executes the function, and returns the result to the model for incorporation into its response. This architecture transforms LLMs from text generators into orchestration engines that can take real-world actions: querying databases, making API calls, modifying state, and interacting with external systems. Understanding how different providers implement tool calling is essential for building a portable, robust tool router.

Deep Dive into the concept Tool calling follows a request-response cycle with the host application acting as intermediary between the model and the external tools. The cycle has five phases: (1) the model receives the user message plus available tool definitions, (2) the model decides whether to call a tool and generates a structured tool-call request, (3) the host application intercepts the request and validates it, (4) the host executes the tool and captures the result, (5) the host feeds the result back to the model for final response generation. The router you build in this project operates in phase 3: intercepting the model’s tool-call request and applying validation, policy checks, and routing logic before any execution occurs.

OpenAI’s implementation uses a tools parameter in the chat completion API. Each tool is defined with a name, description, and parameters object that follows JSON Schema. When the model decides to call a tool, it returns a message with tool_calls array, where each entry has an id, function.name, and function.arguments (a JSON string). The host parses the arguments, validates against the schema, executes the function, and sends a message with role tool containing the result. OpenAI also supports tool_choice to force or prevent tool calling: auto (model decides), none (no tools), required (must call a tool), or a specific function name.
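
Roughly, a request carrying one tool definition and the model's tool-call reply look like the following (values are invented; the wrapper keys request and response_message exist only to show both sides in one fragment):

```json
{
  "request": {
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "travel_search",
          "description": "Search for flights to a destination on a date",
          "parameters": {
            "type": "object",
            "properties": { "destination": { "type": "string" } },
            "required": ["destination"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  },
  "response_message": {
    "role": "assistant",
    "tool_calls": [
      {
        "id": "call_abc123",
        "type": "function",
        "function": {
          "name": "travel_search",
          "arguments": "{\"destination\": \"NYC\", \"date\": \"2026-02-20\"}"
        }
      }
    ]
  }
}
```

Note that function.arguments arrives as a JSON string, not a parsed object; the host must parse it before validation.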

Anthropic’s implementation uses a tools parameter in the Messages API with similar structure: each tool has a name, description, and input_schema (JSON Schema). When Claude decides to use a tool, it emits a tool_use content block with an id, name, and input object. The host validates and executes, then sends a message with role user containing a tool_result content block with the matching tool_use_id. Anthropic recently added structured outputs support that can guarantee the tool arguments match the schema exactly, reducing the need for client-side validation (but not eliminating it, since the schema itself may be wrong or the tool execution may fail).
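
The corresponding Anthropic shapes look roughly like this (ids invented; the wrapper keys assistant_content and followup_user_content are only for side-by-side display):

```json
{
  "assistant_content": [
    {
      "type": "tool_use",
      "id": "toolu_01A",
      "name": "travel_search",
      "input": { "destination": "NYC", "date": "2026-02-20" }
    }
  ],
  "followup_user_content": [
    {
      "type": "tool_result",
      "tool_use_id": "toolu_01A",
      "content": "{\"flights\": []}"
    }
  ]
}
```

Unlike OpenAI's string-encoded arguments, input is already a parsed object.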

Google’s implementation (Gemini API) uses function_declarations within a tools parameter. Each declaration has a name, description, and parameters using OpenAPI schema format (a superset of JSON Schema). The model returns a functionCall part with the function name and arguments. The host processes and returns a functionResponse part. Google also supports grounding with Google Search as a built-in tool type.

Despite these differences in wire format, the conceptual model is identical across providers: define tools with schemas, model decides and generates structured calls, host validates and executes, results flow back. Your router abstracts over these differences by working with a normalized tool definition format and a normalized call request format.
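
A minimal sketch of such a normalization layer in TypeScript. The NormalizedToolCall shape and the helper names are this example's inventions, not taken from any provider SDK; only the input shapes mirror the provider formats described above.

```typescript
// Provider-agnostic internal representation of a tool call.
interface NormalizedToolCall {
  callId: string;
  toolName: string;
  arguments: Record<string, unknown>; // always a parsed object, never a string
  provider: "openai" | "anthropic" | "google";
}

// OpenAI delivers arguments as a JSON *string* inside tool_calls[i].function.
function fromOpenAI(call: {
  id: string;
  function: { name: string; arguments: string };
}): NormalizedToolCall {
  return {
    callId: call.id,
    toolName: call.function.name,
    arguments: JSON.parse(call.function.arguments), // parse eagerly; fail fast on malformed JSON
    provider: "openai",
  };
}

// Anthropic emits a tool_use content block whose input is already an object.
function fromAnthropic(block: {
  id: string;
  name: string;
  input: Record<string, unknown>;
}): NormalizedToolCall {
  return {
    callId: block.id,
    toolName: block.name,
    arguments: block.input,
    provider: "anthropic",
  };
}
```

Every downstream stage (validation, policy, logging) then works only with NormalizedToolCall, so adding a provider means adding one normalizer, nothing else.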

Parallel tool calls are supported by all major providers. The model can request multiple tool calls in a single turn, which the host should execute (potentially in parallel) and return all results together. Your router must handle parallel validation: all calls are validated before any are executed, and if one fails validation, you must decide whether to execute the valid ones or reject the batch.
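
That validate-all-then-decide step can be sketched as follows; the type and mode names are illustrative, and the validation itself is assumed to have happened already.

```typescript
// Result of validating one call in a parallel batch.
type ValidationOutcome = { callId: string; ok: boolean; errors: string[] };

// Decide which calls to execute given batch semantics:
// "atomic"  - one failure rejects the whole batch;
// "lenient" - valid calls run, invalid ones get structured errors.
function planBatch(
  outcomes: ValidationOutcome[],
  mode: "atomic" | "lenient",
): { execute: string[]; reject: string[] } {
  const failed = outcomes.filter((o) => !o.ok);
  if (mode === "atomic" && failed.length > 0) {
    return { execute: [], reject: outcomes.map((o) => o.callId) };
  }
  return {
    execute: outcomes.filter((o) => o.ok).map((o) => o.callId),
    reject: failed.map((o) => o.callId),
  };
}
```

Atomic semantics suit safety-critical batches where partial execution could leave inconsistent state; lenient semantics reduce round-trips with the model.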

The key architectural insight is that the host application is the trust boundary. The model generates tool calls, but the host decides whether to execute them. This is where your router lives: between the model’s output and the actual tool execution. The router is the enforcement point for schema validation, policy checks, rate limiting, and audit logging.

How this fits into the project This concept provides the foundational architecture for Project 6. Understanding how tool calling works across providers lets you design a router that is provider-agnostic, validating and gating tool calls regardless of whether they come from OpenAI, Anthropic, or Google models.

Definitions & key terms

  • Tool definition: A structured description of a tool including name, description, and parameter schema that is provided to the model.
  • Tool call: A structured request from the model to invoke a specific tool with specific arguments.
  • Tool result: The response from executing a tool, fed back to the model for incorporation into its response.
  • Host application: The software layer between the model and external tools that manages the tool-call lifecycle.
  • tool_choice / tool_use: Provider-specific parameters that control whether and how the model uses tools.
  • Parallel tool calls: Multiple tool invocations requested by the model in a single turn.

Mental model diagram (ASCII)

           TOOL CALLING LIFECYCLE (PROVIDER-AGNOSTIC)
           ==========================================

  User Message                 Tool Definitions
  "Book me a flight            (JSON Schema contracts)
   to NYC next Friday"         ┌─────────────────────┐
        |                      │ travel_search:       │
        |                      │   dest: string       │
        |                      │   date: date         │
        v                      │   class: enum        │
  +-------------+              │ hotel_book:          │
  |   LLM       |<────────────│   city: string       │
  | (any        |   provided   │   nights: integer    │
  |  provider)  |   at call    │ payment_transfer:    │
  +------+------+   time       │   amount: number     │
         |                     │   account: string    │
         | model decides       └─────────────────────┘
         | to call a tool
         v
  +--------------------+
  | Tool Call Request   |
  | name: travel_search |
  | args: {             |
  |   dest: "NYC",      |
  |   date: "2026-02-20"|
  | }                   |
  +--------+-----------+
           |
           |  *** YOUR ROUTER LIVES HERE ***
           v
  +====================================+
  ‖         TOOL ROUTER (P06)          ‖
  ‖                                    ‖
  ‖  1. Schema Validation              ‖
  ‖     args match JSON Schema?        ‖
  ‖                                    ‖
  ‖  2. Policy Gate                    ‖
  ‖     risk level + user tier         ‖
  ‖     -> ALLOW / DENY / ESCALATE    ‖
  ‖                                    ‖
  ‖  3. Trace Logging                  ‖
  ‖     record everything              ‖
  +====================================+
           |
      ┌────+────┐
      |         |
      v         v
  [EXECUTE]  [DENY/ESCALATE]
      |
      v
  +--------------------+
  | Tool Result         |
  | { flights: [...] }  |
  +--------+-----------+
           |
           v
  +-------------+
  |   LLM       |  incorporates result
  +------+------+  into response
         |
         v
  "I found 3 flights to NYC..."

How it works (step-by-step, with invariants and failure modes)

  1. The model receives the user message and tool definitions. Invariant: tool definitions are loaded from the versioned tool registry, not hardcoded. Failure mode: stale tool definitions cause the model to generate calls for tools that no longer exist or have changed schemas.
  2. The model generates a tool-call request with function name and arguments. Invariant: the request follows the provider-specific format. Failure mode: the model hallucinates a tool name that does not exist in the registry.
  3. The router receives the raw tool-call request and normalizes it into the internal format. Invariant: normalization preserves all information without lossy transformation. Failure mode: provider format changes break the normalizer.
  4. Schema validation checks the arguments against the JSON Schema for the named tool. Invariant: validation is strict (no additional properties, correct types, required fields present). Failure mode: the JSON Schema is malformed, causing the validator itself to error.
  5. Policy gate evaluates the tool’s risk level, user tier, and environment against the policy rules. Invariant: the policy decision is deterministic given the same inputs. Failure mode: missing policy rules for a tool/tier combination; the default should be DENY, not ALLOW.
  6. If approved, the tool is executed. The result is captured and returned to the model. If denied, the denial reason is returned to the model so it can inform the user. Invariant: no tool execution occurs without passing both validation and policy. Failure mode: the tool execution itself fails (timeout, error); the router must handle this and return a structured error to the model.
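
The six steps above can be condensed into a routing skeleton. The registry shape, reason codes, and policy signature below are illustrative placeholders, not a prescribed design.

```typescript
type Decision = "ALLOW" | "DENY" | "ESCALATE";
type Risk = "read_only" | "mutating" | "irreversible";

interface ToolEntry {
  riskLevel: Risk;
  validate: (args: Record<string, unknown>) => string[]; // empty list = valid
}

function route(
  registry: Map<string, ToolEntry>,
  toolName: string,
  args: Record<string, unknown>,
  policy: (risk: Risk) => Decision | undefined,
): { decision: Decision; reason: string } {
  // Step 2 failure mode: hallucinated tool name.
  const entry = registry.get(toolName);
  if (!entry) return { decision: "DENY", reason: "TOOL_NOT_FOUND" };

  // Step 4: schema validation before anything executes.
  const errors = entry.validate(args);
  if (errors.length > 0) {
    return { decision: "DENY", reason: "SCHEMA_INVALID: " + errors.join("; ") };
  }

  // Step 5: a missing policy rule fails closed (DENY), never open.
  const decision = policy(entry.riskLevel) ?? "DENY";
  return { decision, reason: decision === "ALLOW" ? "OK" : "POLICY" };
}
```

A real router would also emit a trace record per call and wrap execution (step 6) with timeouts and structured error capture.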

Minimal concrete example

  Tool definition (JSON Schema):

  {
    "name": "travel_search",
    "description": "Search for flights to a destination on a date",
    "parameters": {
      "type": "object",
      "properties": {
        "destination": { "type": "string", "minLength": 1 },
        "date": { "type": "string", "format": "date" },
        "cabin_class": {
          "type": "string",
          "enum": ["economy", "business", "first"]
        }
      },
      "required": ["destination", "date"],
      "additionalProperties": false
    },
    "risk_level": "read_only",
    "requires_approval": false
  }

  Normalized tool-call request:

  {
    "call_id": "tc_001",
    "tool_name": "travel_search",
    "arguments": {
      "destination": "NYC",
      "date": "2026-02-20"
    },
    "provider": "openai",
    "model": "gpt-4",
    "trace_id": "trc_p06_001"
  }

  Validation result: PASS (all required fields present, types correct)
  Policy result: ALLOW (read_only tool, no approval required)

Common misconceptions

  • “The LLM executes the tool directly.” The model only generates the call request. The host application is responsible for validation, policy enforcement, and execution. The model never touches external systems.
  • “JSON Schema validation is sufficient for safety.” Schema validation checks structure, not semantics. A tool call can be schema-valid but policy-violating (e.g., transferring money to an unauthorized account). Both validation layers are needed.
  • “Tool definitions are static configuration.” Tool definitions should be versioned and loaded from a registry. When a tool’s schema changes, the router must handle both the old and new versions during the transition period.
  • “All providers work the same way.” While the conceptual model is the same, wire formats differ. A production router needs a normalization layer for each supported provider.
  • “If the model is confident, the tool call is safe.” Model confidence is orthogonal to safety. A model can be 99% confident about a tool call that violates business policy. Policy gates must override model confidence.

Check-your-understanding questions

  1. Why does the router sit between the model output and tool execution rather than before the model call?
  2. What should the router do when the model hallucinates a tool name not in the registry?
  3. How should parallel tool calls be handled when one passes validation but another fails?
  4. Why should the default policy for unknown tool/tier combinations be DENY rather than ALLOW?

Check-your-understanding answers

  1. The router intercepts tool-call requests because the model’s decision to call a tool and its argument generation are part of the model’s inference. The router cannot prevent the model from deciding to call a tool; it can only prevent the execution. Placing the router after model output and before execution creates a clean trust boundary.
  2. Return a structured error with reason code “TOOL_NOT_FOUND” and the attempted tool name. Feed this back to the model so it can either try a different tool or inform the user. Never attempt to fuzzy-match or guess the intended tool, as this could lead to unintended actions.
  3. This is a design decision. The strict approach rejects the entire batch if any call fails (atomic semantics). The lenient approach executes the valid calls and returns errors for the invalid ones. For safety-critical applications, atomic semantics are recommended. The router should support both modes via configuration.
  4. Failing open (ALLOW by default) means any new tool added without explicit policy rules is immediately executable by all users. Failing closed (DENY by default) means new tools require explicit policy configuration before they can be used, which is safer. This follows the principle of least privilege.

Real-world applications

  • ChatGPT plugins use the tool-calling mechanism to interact with external services, with OpenAI acting as the host application and router.
  • Anthropic’s Claude in computer use mode uses tool calling to interact with desktop applications through structured API calls.
  • Enterprise AI platforms like Microsoft Copilot route tool calls through policy layers that enforce organizational permissions before executing actions in Office 365, Dynamics, etc.
  • Autonomous coding agents (Cursor, Claude Code) use tool calling to read files, write code, and run commands, with the host application enforcing which operations are permitted.

Where you’ll apply it

  • Phase 1: designing the tool registry schema and the normalized call format.
  • Phase 2: implementing the schema validator and policy gate.
  • Phase 3: adding provider-specific normalizers and parallel call handling.

References

  • OpenAI Function Calling documentation (platform.openai.com)
  • Anthropic Tool Use documentation (docs.anthropic.com)
  • Google Gemini Function Calling documentation (ai.google.dev)
  • “AI Engineering” by Chip Huyen - Chapters on agentic systems and tool use
  • “Building LLM Apps” by Valentina Alto - Tool integration patterns

Key insights The host application, not the model, is the trust boundary for tool execution; the router is the enforcement mechanism that transforms the model’s probabilistic intent into deterministic, policy-compliant action.

Summary Tool calling across LLM providers follows a consistent pattern: define tools with schemas, model generates structured call requests, the host validates and executes, results flow back to the model. The router sits at the critical trust boundary between model output and tool execution, enforcing schema validation, policy rules, and audit logging. Provider differences are handled through a normalization layer. The default policy for unknown combinations must be DENY to maintain safety.

Homework/Exercises to practice the concept

  • Write JSON Schema tool definitions for 3 tools with different risk levels: a read-only search tool, a state-modifying booking tool, and a high-risk payment transfer tool. Include all required fields and proper constraints.
  • Design the normalized internal format that your router uses to represent tool-call requests from any provider. Show how an OpenAI tool_calls response and an Anthropic tool_use block both normalize to the same format.
  • Trace through the 5-phase tool-calling lifecycle for a scenario where the model requests two parallel tool calls, one of which fails schema validation.

Solutions to the homework/exercises

  • The search tool should have risk_level “read_only” with no approval required. The booking tool should have risk_level “mutating” with approval required for high-value bookings (e.g., total > $1000). The payment tool should have risk_level “irreversible” with mandatory human approval. Each schema should use additionalProperties: false, required fields, and appropriate type constraints (enum for status, format: date for dates, minimum: 0 for amounts).
  • The normalized format should include: call_id (from provider’s tool call id), tool_name, arguments (parsed JSON object, not string), provider (enum: openai, anthropic, google), model (model id string), trace_id (generated by the router), and timestamp. OpenAI normalization parses function.arguments from string to object. Anthropic normalization maps tool_use.input directly. Both produce identical internal format.
  • Phase 1: model receives message + 2 tool defs. Phase 2: model returns 2 parallel calls. Phase 3: router normalizes both, validates both. Call A passes schema. Call B fails (missing required field). Phase 4a (atomic mode): reject both, return structured error listing the failed call. Phase 4b (lenient mode): execute Call A, return error for Call B. Phase 5: feed results (one success, one error) back to the model.

JSON Schema for Tool Parameter Validation

Fundamentals JSON Schema is the contract language for tool parameters. Every tool definition includes a JSON Schema that specifies the exact structure, types, and constraints of the arguments the model must provide. When the model generates a tool call, the router validates the arguments against this schema before execution. This is not optional safety theater; it is the hard boundary between the model’s probabilistic output and deterministic tool execution. Without schema validation, a model could pass a string where an integer is expected, omit required fields, or include unexpected properties that downstream tools do not handle. Schema validation catches these errors before they propagate into real systems and cause data corruption, financial errors, or security breaches.

Deep Dive into the concept JSON Schema (draft 2020-12 is current) provides a rich vocabulary for describing data structures. For tool parameter validation, the most important features are:

Type constraints specify the expected data type for each property: string, number, integer, boolean, array, object, or null. Type mismatches are the most common validation failure: the model returns “42” (string) when 42 (integer) is expected, or returns an array when the tool expects a single object.

Required fields declare which properties must be present. A tool that expects both destination and date should declare both as required. If the model omits a required field, the validation fails immediately rather than passing incomplete data to the tool.

Enum constraints restrict string values to a predefined set. For a cabin_class parameter that only accepts “economy”, “business”, or “first”, an enum constraint prevents the model from hallucinating values like “premium_economy” or “elite” that the tool does not support.

Format annotations provide semantic validation beyond type checking. The format: "date" annotation validates that a string matches ISO 8601 date format. The format: "email" annotation validates email structure. Format validation catches cases where the type is correct (string) but the content is invalid (not a valid date).

additionalProperties: false is a critical safety setting. By default, JSON Schema allows extra properties not defined in the schema. Setting additionalProperties to false rejects any arguments the model provides that are not explicitly defined. This prevents the model from injecting unexpected fields that could be interpreted by downstream tools in unintended ways.

Nested schemas handle complex tool parameters. A booking tool might accept a passengers array where each element is an object with name (string), age (integer, minimum: 0), and seat_preference (enum). JSON Schema handles arbitrarily nested validation.

Conditional schemas (if/then/else, oneOf, anyOf) handle parameters whose validation depends on other parameter values. For example, if payment_method is “credit_card”, then card_number is required; if payment_method is “bank_transfer”, then account_number is required. These patterns are powerful but add complexity to both the schema and the model’s ability to generate valid arguments.

Schema versioning is essential for production systems. When a tool’s parameters change (adding a new required field, changing an enum), the schema version changes. The router must handle the transition: old model calls use the old schema, new calls use the new schema. Without versioning, a schema change can break all existing prompt templates that use that tool.
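
A hypothetical versioned-registry lookup illustrating this transition handling; the class and field names are invented for the sketch, and version ordering is assumed to follow registration order.

```typescript
interface VersionedSchema {
  version: string;
  schema: object;      // the JSON Schema document itself
  deprecated: boolean; // deprecated versions remain resolvable, but are never the default
}

class ToolSchemaRegistry {
  private byName = new Map<string, VersionedSchema[]>();

  register(name: string, entry: VersionedSchema): void {
    const list = this.byName.get(name) ?? [];
    list.push(entry); // registration order doubles as version order in this sketch
    this.byName.set(name, list);
  }

  // Resolve an explicitly pinned version (callers mid-transition),
  // or fall back to the newest non-deprecated version.
  resolve(name: string, version?: string): VersionedSchema | undefined {
    const list = this.byName.get(name) ?? [];
    if (version !== undefined) return list.find((s) => s.version === version);
    const live = list.filter((s) => !s.deprecated);
    return live[live.length - 1];
  }
}
```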

Schema composition with $ref allows shared schema definitions. If multiple tools accept an address parameter, define the address schema once and reference it from each tool. This prevents drift between schema copies and simplifies updates.
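
For example, a shared address definition referenced from two parameters might look like this (draft 2020-12 $defs/$ref syntax; property names are illustrative):

```json
{
  "type": "object",
  "$defs": {
    "address": {
      "type": "object",
      "properties": {
        "street": { "type": "string" },
        "city": { "type": "string" }
      },
      "required": ["street", "city"],
      "additionalProperties": false
    }
  },
  "properties": {
    "billing_address": { "$ref": "#/$defs/address" },
    "shipping_address": { "$ref": "#/$defs/address" }
  },
  "required": ["billing_address"],
  "additionalProperties": false
}
```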

How this fits into the project This concept drives the argument validation phase of the router (Phase 2). Every tool call passes through JSON Schema validation before reaching the policy gate. The quality and strictness of your schemas determine how many invalid tool calls are caught before execution.

Definitions & key terms

  • JSON Schema: A vocabulary for annotating and validating JSON documents, used here to define tool parameter contracts.
  • additionalProperties: A JSON Schema keyword that controls whether properties not defined in the schema are allowed. Set to false for strict validation.
  • $ref: A JSON Schema keyword for referencing shared schema definitions, enabling composition and reuse.
  • Schema draft: The version of the JSON Schema specification (current: draft 2020-12).
  • Coercion: Automatically converting a value to the expected type (e.g., string “42” to integer 42). Generally avoid in tool validation; prefer strict type checking.

Mental model diagram (ASCII)

           JSON SCHEMA VALIDATION PIPELINE
           ================================

  Model-Generated Arguments          Tool Schema (from registry)
  {                                  {
    "destination": "NYC",              "type": "object",
    "date": "2026-02-20",             "properties": {
    "passengers": 2                      "destination": {
  }                                        "type": "string",
           |                               "minLength": 1
           |                             },
           v                             "date": {
  +---------------------+                 "type": "string",
  | JSON SCHEMA         |                 "format": "date"
  | VALIDATOR           |               },
  |                     |               "cabin_class": {
  | Step 1: Parse JSON  |                 "type": "string",
  |   -> valid JSON? Y  |                 "enum": ["economy",
  |                     |                   "business", "first"]
  | Step 2: Type check  |               }
  |   destination: str Y |             },
  |   date: str        Y |             "required": ["destination",
  |   passengers: int  ? |               "date"],
  |   (not in schema!) X |             "additionalProperties": false
  |                     |            }
  | Step 3: Format check|
  |   date: ISO 8601? Y |
  |                     |
  | Step 4: Required    |
  |   destination? Y    |
  |   date? Y           |
  |                     |
  | Step 5: Additional  |
  |   "passengers" not  |
  |   in schema -> FAIL |
  +----------+----------+
             |
             v
  +---------------------+
  | VALIDATION RESULT   |
  |                     |
  | status: FAIL        |
  | errors: [           |
  |   {                 |
  |     path: "/",      |
  |     keyword:        |
  |       "additional   |
  |        Properties", |
  |     message:        |
  |       "unexpected   |
  |        property:    |
  |        passengers"  |
  |   }                 |
  | ]                   |
  +---------------------+

How it works (step-by-step, with invariants and failure modes)

  1. Parse the model’s arguments from JSON string to object. Invariant: parsing is strict (no comments, no trailing commas). Failure mode: invalid JSON (the model generated malformed JSON). Return a parse error with the character position.
  2. Load the schema for the named tool from the registry. Invariant: the schema is itself valid JSON Schema (validated at registration time). Failure mode: tool not found in registry; return TOOL_NOT_FOUND before attempting validation.
  3. Run the JSON Schema validator. Invariant: the validator implements the full draft specified in the schema’s $schema keyword. Failure mode: the validator library has bugs or incomplete draft support; use a well-tested library (Ajv for TypeScript, jsonschema for Python).
  4. Collect all validation errors. Invariant: errors include the JSON path to the failing property, the violated keyword, and a human-readable message. Failure mode: a single error aborts validation without checking remaining properties; configure the validator for “all errors” mode to collect everything.
  5. Return the validation result to the router. Invariant: a PASS result contains no errors; a FAIL result contains at least one error with full context. Failure mode: the validator returns ambiguous results (e.g., warnings instead of errors for required field violations); treat all violations as errors.
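
A hand-rolled sketch of this pipeline for flat object schemas, collecting all errors rather than stopping at the first. A production router should use a full validator such as Ajv with allErrors: true instead; this sketch only makes the steps concrete.

```typescript
interface FlatSchema {
  properties: Record<string, { type: "string" | "number" | "integer" | "boolean" }>;
  required: string[];
  // additionalProperties: false is implied throughout this sketch
}

// Returns every error found; an empty list means PASS.
function validateFlat(raw: string, schema: FlatSchema): string[] {
  // Step 1: strict parse of the model's argument string.
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (e) {
    return ["INVALID_JSON: " + (e as Error).message];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["/: expected an object"];
  }
  const args = parsed as Record<string, unknown>;
  const errors: string[] = [];

  // Required fields (step 4 in the diagram above).
  for (const key of schema.required) {
    if (!(key in args)) errors.push(`/${key}: required property missing`);
  }

  // Type checks and additional-property rejection.
  for (const [key, value] of Object.entries(args)) {
    const spec = schema.properties[key];
    if (!spec) {
      errors.push(`/: unexpected property: ${key}`); // additionalProperties: false
      continue;
    }
    const expected = spec.type === "integer" ? "number" : spec.type;
    if (typeof value !== expected) {
      errors.push(`/${key}: expected ${spec.type}, got ${typeof value}`);
    } else if (spec.type === "integer" && !Number.isInteger(value as number)) {
      errors.push(`/${key}: expected integer, got non-integer number`);
    }
  }
  return errors;
}
```

Note how one call can surface several independent errors at once, which is exactly the "all errors" behavior step 4 requires.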

Minimal concrete example

  Schema with conditional validation:

  {
    "type": "object",
    "properties": {
      "action": {
        "type": "string",
        "enum": ["search", "book", "cancel"]
      },
      "query": { "type": "string" },
      "booking_id": { "type": "string", "pattern": "^BK-[0-9]{6}$" },
      "reason": { "type": "string", "maxLength": 500 }
    },
    "required": ["action"],
    "additionalProperties": false,
    "if": { "properties": { "action": { "const": "cancel" } } },
    "then": { "required": ["booking_id", "reason"] }
  }

  Valid cancel call:
    { "action": "cancel", "booking_id": "BK-123456", "reason": "Changed plans" }
    -> PASS

  Invalid cancel call (missing reason):
    { "action": "cancel", "booking_id": "BK-123456" }
    -> FAIL: "reason" is required when action is "cancel"

Common misconceptions

  • “Schema validation is overkill; just check required fields.” Missing type checks, format validation, and additional property rejection creates silent failures downstream. Schema validation is cheap insurance.
  • “The model always generates valid JSON.” Models regularly produce malformed JSON, especially for complex nested schemas. Always parse before validating.
  • “additionalProperties should default to true for flexibility.” Allowing unexpected properties means the model can inject fields that downstream tools might interpret in unintended ways. Default to false for safety.
  • “Schema changes are backward-compatible.” Adding a new required field is a breaking change. Schema versioning is needed to handle transitions safely.

Check-your-understanding questions

  1. Why is additionalProperties: false important for tool parameter schemas?
  2. What is the difference between type checking and format validation for a date parameter?
  3. How should the router handle the case where the model generates invalid JSON (not just schema-invalid, but unparseable)?
  4. Why should validation collect all errors rather than stopping at the first one?

Check-your-understanding answers

  1. Without it, the model can include arbitrary extra fields that are not part of the tool’s interface. Downstream tools may ignore them (wasting tokens) or worse, interpret them in unexpected ways. Setting additionalProperties to false ensures the tool receives exactly the parameters it expects.
  2. Type checking confirms the value is a string (correct type). Format validation confirms the string content matches ISO 8601 date format (semantic correctness). A value like “not-a-date” passes type checking but fails format validation.
  3. Return a structured error with reason code INVALID_JSON, the raw string, and the parse error position. Feed this back to the model so it can retry with valid JSON. The router should not attempt to fix the JSON (no auto-correction), as this could silently change the intended arguments.
  4. Collecting all errors gives the model (or developer) complete feedback for a single correction cycle. If validation stops at the first error, the next attempt might fix that error but hit a second one, creating a frustrating multi-round correction loop.

Real-world applications

  • OpenAI’s structured outputs feature enforces JSON Schema on the server side, guaranteeing that model output matches the schema. This reduces client-side validation needs but does not eliminate them (the schema itself might be wrong).
  • API gateways (Kong, Apigee) validate request payloads against OpenAPI schemas before routing to backend services, using the same JSON Schema validation approach.
  • Terraform validates provider configurations against JSON Schema definitions before applying infrastructure changes.

Where you’ll apply it

  • Phase 2: building the schema validation component of the router.
  • Phase 3: adding schema versioning and migration support.

References

  • JSON Schema specification (json-schema.org) - draft 2020-12
  • Ajv (Another JSON Schema Validator) documentation for TypeScript/JavaScript
  • OpenAI Structured Outputs documentation
  • Anthropic Structured Outputs documentation (2025)
  • “Building LLM Apps” by Valentina Alto - chapters on structured output

Key insights JSON Schema validation is the firewall between the model’s probabilistic output and your deterministic tool execution; it is cheap, well-understood, and prevents an entire category of runtime errors.

Summary JSON Schema provides the contract language for tool parameters. Strict validation with additionalProperties: false, required field enforcement, type checking, format validation, and conditional schemas catches invalid arguments before they reach tool execution. Schema versioning handles tool evolution without breaking existing calls. The validation pipeline should collect all errors per call and return structured feedback.

Homework/Exercises to practice the concept

  • Write JSON Schemas for 3 tools: a read-only search (simple), a booking tool with nested passenger array (moderate), and a payment tool with conditional required fields based on payment method (complex).
  • Design the validation error format that your router uses internally. Include at minimum: error path, violated keyword, message, and schema version.
  • Trace through the validation of a tool call where the model provides a valid JSON string that fails 3 different schema constraints simultaneously. Show all 3 errors collected.

Solutions to the homework/exercises

  • Search tool: type object, properties (query: string, limit: integer with minimum 1 and maximum 100), required [query], additionalProperties false. Booking tool: type object, properties (flight_id: string, passengers: array of objects with name/age/seat properties, each validated), required [flight_id, passengers], additionalProperties false. Payment tool: type object with if/then for payment_method: if “credit_card” then require card_number (pattern-validated) and expiry; if “bank_transfer” then require account_number and routing_number.
  • Error format: { path: “/passengers/0/age” (JSON Pointer), keyword: “type” (schema keyword violated), message: “Expected integer, got string”, schema_version: “2.1.0”, tool_name: “booking_create” }. Each error is self-contained and actionable.
  • Example: model sends { “destination”: 42, “date”: “not-a-date”, “extra_field”: true }. Error 1: /destination, keyword: type, message: “Expected string, got number.” Error 2: /date, keyword: format, message: “Not a valid ISO 8601 date.” Error 3: /, keyword: additionalProperties, message: “Unexpected property: extra_field.”

Risk-Based Tool Gating and Policy Enforcement

Fundamentals Not all tool calls are equal in risk. A search query is read-only and harmless; a database deletion is irreversible and dangerous. Risk-based tool gating classifies tools by their potential impact and enforces access policies that match the risk level to the authorization context (user tier, environment, request metadata). The policy gate is the second layer of defense after schema validation: even if the arguments are structurally correct, the policy gate can deny execution based on who is requesting, what they are requesting, and under what circumstances. Without this layer, a structurally valid tool call from an unauthorized user or in an unsafe context would execute unchecked.

Deep Dive into the concept The risk classification taxonomy defines four levels based on reversibility and impact:

Read-only tools retrieve information without modifying state. Examples: search, fetch, query, list. These tools are generally safe to execute without additional authorization because they have no side effects. The main risk is information leakage (exposing data the user should not see), which is handled by data access policies rather than tool-level gating.

Mutating tools modify state but the changes are reversible. Examples: create (can delete), update (can revert), add to cart (can remove). These tools require validation that the user is authorized to make the change and that the change parameters are within acceptable bounds (e.g., updating a profile field vs. changing a security setting).

Irreversible tools make changes that cannot be easily undone. Examples: delete (hard delete), send email, post to social media, execute financial transfer. These tools should always require explicit authorization, often including a confirmation step or human-in-the-loop approval. The cost of a false positive (executing when it should not) is high.

Privileged tools affect system configuration, security settings, or administrative functions. Examples: modify permissions, create/delete users, change API keys, alter system configurations. These should be blocked entirely for most users and require multi-factor authorization for administrators.
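
The taxonomy above can be encoded directly in the tool registry so that every lookup carries a risk level. A minimal sketch (field names are this project's conventions, not a standard); note the fail-closed default for unclassified tools:

```typescript
// The four risk levels, ordered from least to most dangerous.
type RiskLevel = "read_only" | "mutating" | "irreversible" | "privileged";

interface RegistryEntry {
  name: string;
  riskLevel: RiskLevel;
  requiresApproval: boolean; // forces human-in-the-loop regardless of other policy
}

const registry: RegistryEntry[] = [
  { name: "search_products",    riskLevel: "read_only",    requiresApproval: false },
  { name: "create_booking",     riskLevel: "mutating",     requiresApproval: false },
  { name: "execute_payment",    riskLevel: "irreversible", requiresApproval: true },
  { name: "modify_permissions", riskLevel: "privileged",   requiresApproval: true },
];

// Unregistered or unclassified tools are treated as privileged (highest risk),
// so a fail-closed policy denies them by default.
function riskOf(toolName: string): RiskLevel {
  return registry.find(t => t.name === toolName)?.riskLevel ?? "privileged";
}
```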

Policy rules combine tool risk level with context variables to produce a decision. The policy engine evaluates rules in priority order and returns the first matching decision. A typical rule structure is:

  IF tool.risk_level == "irreversible"
    AND user.tier == "free"
    AND environment == "production"
  THEN DENY reason="Irreversible actions require premium tier"

The decision space has four outcomes: ALLOW (execute the tool), DENY (reject with reason, inform the user), ESCALATE (require human approval before execution), and ABSTAIN (the router cannot determine safety; fall back to a human decision). ABSTAIN is important because it prevents the router from making decisions in ambiguous situations.

Rate limiting adds a temporal dimension to policy. Even allowed tools should be rate-limited to prevent abuse. A user who calls the payment_transfer tool 100 times in a minute should be flagged and throttled regardless of individual call validity.
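
A minimal in-memory sketch of that temporal check, using a sliding window keyed by user and tool (a production router would back this with shared storage such as Redis; all names are illustrative):

```typescript
// Sliding-window rate limiter: at most maxCalls per windowMs, per user+tool key.
class SlidingWindowLimiter {
  private calls = new Map<string, number[]>(); // key -> timestamps (ms) of recent calls
  private maxCalls: number;
  private windowMs: number;

  constructor(maxCalls: number, windowMs: number) {
    this.maxCalls = maxCalls;
    this.windowMs = windowMs;
  }

  allow(userId: string, tool: string, now: number = Date.now()): boolean {
    const key = `${userId}:${tool}`;
    // Drop timestamps that have aged out of the window.
    const recent = (this.calls.get(key) ?? []).filter(t => now - t < this.windowMs);
    if (recent.length >= this.maxCalls) {
      this.calls.set(key, recent);
      return false; // over the limit: flag and throttle
    }
    recent.push(now);
    this.calls.set(key, recent);
    return true;
  }
}

// 100 calls per minute, as in the payment_transfer example above.
const paymentLimiter = new SlidingWindowLimiter(100, 60_000);
```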

Environment-aware policies distinguish between production, staging, and development environments. A tool call that is safe in development (where data is fake) may require stricter authorization in production (where data is real and actions have consequences).

Audit logging is not optional for policy decisions. Every ALLOW, DENY, and ESCALATE decision must be logged with: trace_id, timestamp, tool_name, user_id, user_tier, environment, risk_level, policy_rule_id (which rule matched), and the decision. This audit trail enables post-incident analysis (“who authorized that payment transfer?”) and compliance reporting.
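
The audit record can be captured as one typed entry per decision. This sketch uses an in-memory array as an append-only stand-in; a real sink would be durable and tamper-evident:

```typescript
// One record per policy decision, carrying the fields listed above.
interface AuditEntry {
  trace_id: string;
  timestamp: string;       // ISO 8601
  tool_name: string;
  user_id: string;
  user_tier: string;
  environment: string;
  risk_level: string;
  policy_rule_id: string;  // which rule matched
  decision: "ALLOW" | "DENY" | "ESCALATE" | "ABSTAIN";
}

const auditLog: AuditEntry[] = []; // in-memory stand-in for append-only storage

// Append-only: entries are only ever pushed, never updated or removed.
// A write failure here must be logged separately and must NOT block
// the routing decision itself.
function writeAudit(entry: AuditEntry): void {
  auditLog.push(entry);
}
```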

How this fits into the project This concept is the core of the policy gate component in Phase 2. After schema validation passes, the policy gate evaluates the tool call against the risk taxonomy and policy rules. The gate’s decision determines whether execution proceeds, is blocked, or requires human approval.

Definitions & key terms

  • Risk level: Classification of a tool’s potential impact: read_only, mutating, irreversible, privileged.
  • Policy rule: A conditional expression that maps a combination of tool risk, user context, and environment to a decision (ALLOW, DENY, ESCALATE, ABSTAIN).
  • Escalation: Routing a tool call to a human approver rather than auto-executing or auto-denying.
  • Principle of least privilege: Users should only have access to the minimum set of tools and permissions needed for their current task.
  • Fail closed: When no policy rule matches, the default decision is DENY (safe) rather than ALLOW (dangerous).

Mental model diagram (ASCII)

          RISK-BASED TOOL GATING ARCHITECTURE
          ====================================

  +-------------------+    +------------------+
  | Validated         |    | Policy Config    |
  | Tool Call         |    | (YAML)           |
  |                   |    |                  |
  | tool: payment_xfr |    | rules:           |
  | args: {valid}     |    |  - risk=irrev    |
  | user_tier: free   |    |    tier=free     |
  | env: production   |    |    -> DENY       |
  +--------+----------+    |  - risk=irrev    |
           |               |    tier=premium  |
           v               |    env=prod      |
  +========================|    -> ESCALATE   |
  | RISK CLASSIFIER        |  - risk=readonly |
  |                        |    -> ALLOW      |
  | payment_xfr ->         |  - default       |
  |   risk: IRREVERSIBLE   |    -> DENY       |
  +=========+==============+--------+---------+
            |                       |
            v                       v
  +==========================================+
  | POLICY ENGINE                            |
  |                                          |
  | Input:                                   |
  |   risk=IRREVERSIBLE, tier=free, env=prod |
  |                                          |
  | Rule evaluation (priority order):        |
  |   Rule 1: risk=irrev AND tier=free       |
  |     -> MATCH: DENY                       |
  |                                          |
  | Decision: DENY                           |
  | Reason: "Irreversible actions require    |
  |          premium tier in production"      |
  +==========================================+
            |
            v
  +---------+---------+----------+---------+
  |         |         |          |         |
  v         v         v          v         v
 ALLOW    DENY    ESCALATE   ABSTAIN   RATE
                  (human)    (fallback) LIMIT
  |         |         |          |
  v         v         v          v
 Execute  Return   Queue for   Return
 tool     error    approval    error
  |         |         |          |
  v         v         v          v
 +------------------------------------+
 | AUDIT LOG                          |
 | trace_id, decision, rule_id,       |
 | user, tool, timestamp, reason      |
 +------------------------------------+

How it works (step-by-step, with invariants and failure modes)

  1. Look up the tool’s risk level from the tool registry. Invariant: every registered tool has an explicit risk level. Failure mode: if the risk level is missing, treat as PRIVILEGED (highest risk) and DENY by default.
  2. Assemble the evaluation context: user_tier, environment, request metadata, rate limit state. Invariant: all context fields are authenticated and trusted (not user-supplied claims). Failure mode: if user_tier cannot be determined (auth failure), default to the lowest tier.
  3. Evaluate policy rules in priority order. Invariant: rules are deterministic and ordered; the first match wins. Failure mode: no rule matches; apply the default policy (DENY).
  4. Execute the decision. Invariant: DENY and ESCALATE never lead to tool execution in the same request. Failure mode: a race condition allows execution while escalation is pending (handle via state lock).
  5. Write the audit log entry. Invariant: the audit log is append-only and includes all decision context. Failure mode: audit log write failure should NOT block the decision (log the failure separately and continue).
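
The steps above can be condensed into a single evaluation function. This is a sketch, not a full engine: rule and context shapes are simplified from the YAML format that follows, and rate limiting and audit logging are left out:

```typescript
type Decision = "ALLOW" | "DENY" | "ESCALATE" | "ABSTAIN";

interface PolicyRule {
  id: string;
  // Absent condition fields match anything; present fields must match.
  condition: { risk_level?: string; user_tier?: string[]; environment?: string };
  decision: Decision;
  reason?: string;
}

interface EvalContext { risk_level: string; user_tier: string; environment: string }

interface PolicyResult { decision: Decision; ruleId: string; reason?: string }

// Rules are evaluated in priority order; the first match wins.
// No match at all -> fail closed with DENY.
function evaluatePolicy(rules: PolicyRule[], ctx: EvalContext): PolicyResult {
  for (const rule of rules) {
    const c = rule.condition;
    if (c.risk_level !== undefined && c.risk_level !== ctx.risk_level) continue;
    if (c.user_tier !== undefined && !c.user_tier.includes(ctx.user_tier)) continue;
    if (c.environment !== undefined && c.environment !== ctx.environment) continue;
    return { decision: rule.decision, ruleId: rule.id, reason: rule.reason };
  }
  return { decision: "DENY", ruleId: "default", reason: "No matching rule (fail closed)" };
}

// Rules mirroring part of the YAML example below.
const rules: PolicyRule[] = [
  { id: "allow_readonly", condition: { risk_level: "read_only" }, decision: "ALLOW" },
  { id: "escalate_irreversible_premium",
    condition: { risk_level: "irreversible", user_tier: ["premium", "enterprise"] },
    decision: "ESCALATE" },
  { id: "deny_irreversible_free",
    condition: { risk_level: "irreversible", user_tier: ["free"] },
    decision: "DENY", reason: "Irreversible actions require premium tier" },
];
```

An irreversible call from a free-tier user matches deny_irreversible_free, while a privileged call matches nothing and falls through to the fail-closed default.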

Minimal concrete example

  Policy configuration (YAML):

  default_decision: DENY

  rules:
    - id: allow_readonly
      condition:
        risk_level: read_only
      decision: ALLOW

    - id: allow_mutating_premium_prod
      condition:
        risk_level: mutating
        user_tier: [premium, enterprise]
        environment: production
      decision: ALLOW

    - id: escalate_irreversible_premium
      condition:
        risk_level: irreversible
        user_tier: [premium, enterprise]
      decision: ESCALATE
      escalation_target: "ops-team"

    - id: deny_irreversible_free
      condition:
        risk_level: irreversible
        user_tier: free
      decision: DENY
      reason: "Irreversible actions require premium tier"

    - id: deny_privileged_all
      condition:
        risk_level: privileged
      decision: DENY
      reason: "Privileged actions blocked via tool router"

  Trace evaluation for: tool=payment_xfr, risk=irreversible,
                         user_tier=free, env=production

  Rule 1 (allow_readonly): risk != read_only -> skip
  Rule 2 (allow_mutating_premium_prod): risk != mutating -> skip
  Rule 3 (escalate_irreversible_premium): tier != premium -> skip
  Rule 4 (deny_irreversible_free): MATCH
    -> Decision: DENY, reason: "Irreversible actions require premium tier"

Common misconceptions

  • “Schema validation is sufficient for safety; policy gates are redundant.” Schema validates structure; policy validates authorization. A structurally valid “delete all users” call should be blocked by policy even though its arguments are schema-valid.
  • “All users should have access to all tools.” This violates least privilege. Users should only see and use tools appropriate to their authorization level.
  • “Policy rules should be hardcoded.” Hardcoded rules are inflexible and require code deployments to change. Declarative policy configuration (YAML/JSON) enables rapid iteration and audit trail for policy changes.
  • “ESCALATE is just a slow ALLOW.” ESCALATE routes to a human decision-maker who can ALLOW or DENY. The outcome is uncertain, and the tool does not execute until the human decides. It is fundamentally different from a delayed ALLOW.
  • “Rate limiting is a separate concern from policy.” Rate limiting is a temporal policy dimension. A user who makes 100 valid tool calls per minute may be abusing the system even though each individual call is authorized.

Check-your-understanding questions

  1. Why should the default policy for unmatched rules be DENY rather than ALLOW?
  2. What is the difference between a DENY and an ESCALATE decision for irreversible tools?
  3. How does environment context change the policy evaluation for the same tool call?
  4. Why must audit logs be append-only?

Check-your-understanding answers

  1. DENY-by-default (fail closed) ensures that newly added tools or unexpected combinations cannot be exploited. If a tool is added to the registry without corresponding policy rules, it is blocked until someone explicitly writes a rule to allow it. This follows the principle of least privilege.
  2. DENY immediately blocks execution and returns an error to the user. ESCALATE pauses execution and routes the request to a human approver who makes the final decision. ESCALATE is appropriate when the action might be legitimate but requires human judgment.
  3. The same tool call might be ALLOW in development (fake data, no real impact), ESCALATE in staging (real-ish data, limited impact), and DENY in production (real data, real consequences). Environment context lets you enforce progressively stricter policies as you move toward production.
  4. Append-only audit logs prevent retroactive modification of decision records, which is essential for compliance and post-incident forensics. If someone could edit the audit log, they could cover up unauthorized tool executions.

Real-world applications

  • Cloud platforms (AWS IAM, GCP IAP) use policy engines to evaluate whether API calls are authorized based on caller identity, resource, action, and conditions.
  • Payment processors enforce risk-based gating: small transactions auto-approve, medium transactions flag for review, large transactions require multi-factor authorization.
  • Enterprise AI platforms (Microsoft Copilot, Salesforce Einstein) enforce organizational policies on AI agent tool usage to prevent data leakage and unauthorized actions.

Where you’ll apply it

  • Phase 2: implementing the policy gate with risk classification and rule evaluation.
  • Phase 3: adding rate limiting, environment-aware policies, and escalation routing.

References

  • “Security Engineering” by Ross Anderson - access control and policy design
  • “Site Reliability Engineering” by Google - Ch. 6 on monitoring, Ch. 14 on managing incidents
  • OWASP LLM Top 10 - tool misuse and insecure plugin design
  • “Systems Security Foundations for Agentic Computing” (2025 paper on agent sandboxing)
  • “Design Patterns to Secure LLM Agents In Action” (2025 lab report)

Key insights Risk classification separates the question “is this call valid?” (schema) from the question “should this call execute?” (policy), and both questions must be answered before any tool touches the real world.

Summary Risk-based tool gating classifies tools by impact (read-only, mutating, irreversible, privileged) and evaluates policy rules that combine risk level, user context, and environment to produce ALLOW/DENY/ESCALATE decisions. Policies should be declarative (YAML), fail closed by default, and produce append-only audit logs. Rate limiting adds a temporal dimension. The policy gate operates after schema validation, creating a two-layer defense between the model’s output and actual tool execution.

Homework/Exercises to practice the concept

  • Design a risk classification for 8 tools spanning all four risk levels. For each tool, justify the classification.
  • Write a YAML policy file with at least 6 rules covering: read-only allow-all, mutating by tier, irreversible with escalation, and privileged deny-all.
  • Trace through the policy evaluation for 3 tool calls with different risk/tier/environment combinations, showing which rule matches for each.

Solutions to the homework/exercises

  • Example classification: search_products (read_only: retrieves data, no side effects), get_user_profile (read_only: but may need data access policy), create_booking (mutating: creates a record that can be cancelled), update_booking (mutating: modifies existing record), send_email (irreversible: email cannot be unsent), execute_payment (irreversible: financial transfer), delete_account (irreversible: permanent data loss), modify_permissions (privileged: changes authorization structure).
  • The YAML should have rules ordered by specificity (most specific first, most general last) with a default DENY at the bottom. Each rule should have an id, condition block, decision, and optional reason/escalation_target.
  • Trace example: (1) search_products, free user, production -> matches allow_readonly -> ALLOW. (2) execute_payment, premium user, production -> matches escalate_irreversible_premium -> ESCALATE to ops-team. (3) modify_permissions, enterprise user, production -> matches deny_privileged_all -> DENY.

Model Context Protocol (MCP) and Tool Interoperability

Fundamentals The Model Context Protocol (MCP) is an open standard introduced by Anthropic in November 2024 that standardizes how AI applications integrate with external tools, data sources, and systems. Before MCP, every LLM provider and application implemented tool integration differently, creating an N x M integration problem: N applications each needed custom integrations with M tools. MCP solves this by defining a universal protocol that any application and any tool can implement, reducing the problem to N + M implementations. Understanding MCP is essential for this project because it represents the direction of tool interoperability and provides design patterns for your router’s tool registry and execution lifecycle.

Deep Dive into the concept MCP follows a client-server architecture. The MCP client runs inside the AI application (or your tool router) and communicates with MCP servers that wrap external tools and data sources. The protocol defines three main primitives:

Tools are executable functions that the model can invoke. Each MCP tool has a name, description, and input schema (JSON Schema). When the model decides to call a tool, the MCP client sends the call to the appropriate MCP server, which executes the tool and returns the result. This maps directly to the tool-calling lifecycle you are building in the router.

Resources are data sources that provide context to the model. Unlike tools, resources are read-only and do not have side effects. They provide structured data that the application can include in the model’s context. Resources are identified by URIs and can be static or dynamic.

Prompts are predefined templates that MCP servers can expose. These are pre-built prompt fragments that the application can use to construct messages to the model. This primitive is less relevant to the tool router but important for understanding the full MCP ecosystem.

The November 2025 specification introduced significant enhancements: modern authorization (OAuth 2.1 with PKCE), asynchronous execution for long-running tools, streamable HTTP transport (replacing the previous SSE-based transport), and structured error reporting. The authorization model is particularly relevant for the tool router: MCP servers declare their required permissions in a manifest file, and clients must obtain user consent before connecting.

MCP’s security model is relevant to your router design. Each MCP server runs as a separate process with its own security boundary. The server’s manifest declares what permissions it needs (filesystem access, network access, etc.), and the client requests user consent before granting these permissions. This capability-based security model aligns with the principle of least privilege: each tool server only has access to what it needs.

MCP adoption has accelerated rapidly. OpenAI, Google DeepMind, and major tool providers now support MCP, making it the de facto standard for AI tool integration. Building your router with MCP compatibility (or at least MCP-inspired patterns) ensures it can participate in this ecosystem.

For your router, MCP provides two key design patterns. First, the tool definition format (name, description, input_schema) that your registry should adopt. Second, the lifecycle pattern (discover tools, select tool, validate arguments, execute, return result) that your router pipeline should follow.

How this fits into the project MCP provides the standards-based design patterns for your tool registry and execution lifecycle. Even if you do not implement a full MCP server, adopting MCP-compatible tool definitions and lifecycle patterns ensures your router can integrate with the broader ecosystem.

Definitions & key terms

  • MCP (Model Context Protocol): An open standard for AI application integration with external tools and data sources.
  • MCP Client: The component in the AI application that communicates with MCP servers.
  • MCP Server: A process that wraps external tools and exposes them through the MCP protocol.
  • MCP Manifest: A JSON file describing an MCP server’s capabilities and required permissions.
  • Capability-based security: A security model where access is granted through explicit capabilities (permissions) rather than identity-based rules.

Mental model diagram (ASCII)

              MCP ARCHITECTURE AND TOOL ROUTER FIT
              =====================================

  +---------------------------------------------+
  |          AI APPLICATION / AGENT              |
  |                                              |
  |  +--------+    +-------------------------+  |
  |  |  LLM   |--->|  TOOL ROUTER (P06)      |  |
  |  |        |    |                         |  |
  |  +--------+    |  +---------+  +-------+ |  |
  |                |  | Schema  |  | Policy| |  |
  |                |  | Validate|  | Gate  | |  |
  |                |  +---------+  +-------+ |  |
  |                |         |               |  |
  |                |    MCP Client           |  |
  |                +----+----+----+----------+  |
  |                     |    |    |              |
  +---------------------|----|----|--------------+
                        |    |    |
          MCP Protocol  |    |    |  (JSON-RPC over
          (standardized)|    |    |   Streamable HTTP)
                        v    v    v
              +---------+  +-+--+ +----------+
              |MCP Server| |MCP | |MCP Server|
              |          | |Serv| |          |
              | Travel   | |er  | | Payment  |
              | API      | |    | | Gateway  |
              |          | |DB  | |          |
              | manifest:| |    | | manifest:|
              |  read    | |read| |  read,   |
              |  network | |only| |  write,  |
              +----------+ +----+ |  network |
                                  +----------+
  TOOL DEFINITION (MCP-COMPATIBLE):
  +-----------------------------------------+
  | name: "travel_search"                   |
  | description: "Search for flights..."     |
  | inputSchema: {                          |
  |   type: "object",                       |
  |   properties: { ... },                  |
  |   required: [...],                      |
  |   additionalProperties: false           |
  | }                                       |
  | ---- router extensions ----             |
  | risk_level: "read_only"                 |
  | requires_approval: false                |
  | rate_limit: { max: 100, window: "1m" } |
  +-----------------------------------------+

How it works (step-by-step, with invariants and failure modes)

  1. MCP servers register with the client, providing their manifest (capabilities, required permissions). Invariant: the client validates the manifest schema before accepting the server. Failure mode: malformed manifest rejected at connection time.
  2. The client discovers available tools from all connected MCP servers. Invariant: tool names are globally unique (namespaced by server). Failure mode: name collision between servers; the client must detect and reject or namespace.
  3. Tool definitions from MCP servers are loaded into the router’s registry with additional metadata (risk level, policy rules). Invariant: every MCP tool has a corresponding risk classification in the router. Failure mode: a new MCP server provides tools without risk classifications; the router blocks them until classified.
  4. When a tool call arrives, the router validates and gates as normal, then dispatches to the appropriate MCP server. Invariant: the MCP server receives only validated, policy-approved calls. Failure mode: MCP server is unreachable; return a structured error with retry guidance.
  5. The MCP server executes the tool and returns the result through the protocol. Invariant: results conform to the MCP response format. Failure mode: server returns an error or times out; the router wraps this in a structured error for the model.
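
The discovery-and-merge step, with the name-collision handling from step 2, can be sketched as follows. Manifest fetching is replaced by in-memory data, and the namespacing scheme ("server/tool") is one reasonable choice, not something MCP mandates:

```typescript
// Merge tool lists from several MCP servers into one registry,
// namespacing colliding names as "server/tool".
interface McpTool { name: string; description: string }
interface ServerManifest { name: string; tools: McpTool[] }

function mergeRegistries(servers: ServerManifest[]): Map<string, McpTool> {
  // Pass 1: count how many servers expose each bare tool name.
  const counts = new Map<string, number>();
  for (const s of servers) {
    for (const t of s.tools) counts.set(t.name, (counts.get(t.name) ?? 0) + 1);
  }
  // Pass 2: register tools, namespacing every colliding name with its server.
  const registry = new Map<string, McpTool>();
  for (const s of servers) {
    for (const t of s.tools) {
      const key = (counts.get(t.name) ?? 0) > 1 ? `${s.name}/${t.name}` : t.name;
      if (registry.has(key)) {
        // Same server exposing duplicate names: reject rather than overwrite.
        throw new Error(`Duplicate tool key: ${key}`);
      }
      registry.set(key, t);
    }
  }
  return registry;
}
```

Namespacing both colliding entries (rather than letting the first registration keep the bare name) keeps tool resolution order-independent across server connections.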

Minimal concrete example

  MCP Server manifest (JSON):

  {
    "name": "travel-service",
    "version": "1.2.0",
    "description": "Flight and hotel search",
    "capabilities": {
      "tools": true,
      "resources": false,
      "prompts": false
    },
    "permissions": {
      "network": ["api.travel-provider.com"],
      "filesystem": []
    },
    "tools": [
      {
        "name": "travel_search",
        "description": "Search flights by destination and date",
        "inputSchema": {
          "type": "object",
          "properties": {
            "destination": { "type": "string" },
            "date": { "type": "string", "format": "date" }
          },
          "required": ["destination", "date"]
        }
      }
    ]
  }

Common misconceptions

  • “MCP replaces the need for a custom router.” MCP standardizes the communication protocol but does not provide policy gating, risk classification, or audit logging. Your router adds these essential layers on top of MCP.
  • “MCP tools are automatically safe to execute.” MCP servers declare permissions, but the client (your router) must enforce policy. A malicious or misconfigured MCP server could request excessive permissions.
  • “MCP is only for Anthropic’s Claude.” MCP is provider-agnostic. OpenAI, Google, and many other providers have adopted it. It standardizes the tool interface, not the model.
  • “MCP means you do not need JSON Schema validation.” MCP uses JSON Schema for tool definitions, but your router should still validate arguments client-side before sending to the MCP server. Defense in depth.

Check-your-understanding questions

  1. How does MCP reduce the integration complexity from N x M to N + M?
  2. What is the role of the MCP manifest’s permissions field in security?
  3. Why should the router validate arguments before sending them to an MCP server, even though the server also validates?
  4. How do MCP tool definitions map to your router’s tool registry?

Check-your-understanding answers

  1. Without MCP, N AI applications each need custom integrations with M tools (N x M total integrations). With MCP, each application implements one MCP client (N implementations) and each tool implements one MCP server (M implementations). Any client can communicate with any server through the standard protocol.
  2. The permissions field declares what system resources the MCP server needs (network access, filesystem access, etc.). The MCP client shows these to the user for consent before connecting. This implements capability-based security: the server only gets the permissions it declared and the user approved.
  3. Defense in depth. The server’s validation is the last line of defense, but errors caught at the router level avoid the network round-trip to the server and provide faster feedback. Also, the router may have stricter policies than the server (e.g., the server allows any destination, but the router’s policy restricts to approved regions).
  4. MCP tool definitions (name, description, inputSchema) become the base of your registry entries. The router extends each entry with risk_level, policy_rules, rate_limits, and other metadata that MCP does not define. The registry is a superset of MCP tool definitions.

Real-world applications

  • Cursor, Zed, and other AI coding tools use MCP to connect to external services (GitHub, databases, documentation) through standardized MCP servers.
  • Enterprise platforms are adopting MCP for internal tool governance, with MCP servers wrapping internal APIs and the AI platform enforcing organizational policies.
  • The MCP ecosystem includes a growing registry of open-source MCP servers for common tools (Slack, GitHub, databases, cloud providers).

Where you’ll apply it

  • Phase 1: designing the tool registry with MCP-compatible definitions.
  • Phase 3: adding MCP client support for discovering and calling external tools.

References

  • Model Context Protocol specification (modelcontextprotocol.io) - November 2025 revision
  • MCP GitHub repository (github.com/modelcontextprotocol)
  • “One Year of MCP” blog post on the protocol’s evolution and adoption
  • Anthropic MCP documentation (docs.anthropic.com)
  • “2026: The Year for Enterprise-Ready MCP Adoption” (industry analysis)

Key insights MCP solves the integration problem (how tools connect) but not the governance problem (which tools should execute); your router provides the governance layer that MCP intentionally leaves to the client.

Summary MCP standardizes AI tool integration through a client-server protocol with tool definitions, resources, and prompts. The protocol provides the transport and discovery layer, while your router provides the governance layer (schema validation, policy gating, audit logging). MCP-compatible tool definitions serve as the base for your router’s registry, extended with risk classifications and policy rules. Adopting MCP patterns ensures your router can participate in the growing ecosystem of AI tool integrations.

Homework/Exercises to practice the concept

  • Design a registry schema that combines MCP tool definitions with router extensions (risk_level, policy_rules, rate_limits). Show how an MCP server’s manifest maps into your registry.
  • Compare the tool-calling lifecycle in your router with the MCP lifecycle. Identify where your router adds steps that MCP does not define.
  • Write a pseudocode MCP client that discovers tools from two MCP servers, merges them into a single registry, and handles name collisions.

Solutions to the homework/exercises

  • Registry entry: { mcp_server: “travel-service”, name: “travel_search”, description: “…”, inputSchema: {…}, risk_level: “read_only”, policy_rules: [“allow_readonly”], rate_limit: { max: 100, window: “1m” }, schema_version: “1.2.0”, last_updated: “2025-11-01” }. The MCP manifest’s tools array maps directly to the name/description/inputSchema fields. Router extensions (risk_level, policy_rules, rate_limit) are added during registration.
  • MCP lifecycle: discover -> call -> result. Router lifecycle: discover -> normalize -> validate schema -> evaluate policy -> rate limit check -> call -> capture result -> audit log. The router adds: normalization, schema validation, policy evaluation, rate limiting, and audit logging.
  • The pseudocode should: (1) connect to both servers and fetch manifests, (2) iterate over all tools from both servers, (3) namespace tool names with server name if collisions occur (e.g., “travel-service/search” vs “hotel-service/search”), (4) register each tool with the router’s registry, (5) log any collisions as warnings.

3. Project Specification

3.1 What You Will Build

A runtime tool-router API that maps intent to tool calls with strict schema and policy gates.

3.2 Functional Requirements

  1. Accept intent payloads and classify as direct answer, tool call, or abstain.
  2. Validate generated tool arguments against JSON schemas before execution.
  3. Apply policy gates by user tier, action risk, and environment.
  4. Return machine-readable routing decision with trace id.

3.3 Non-Functional Requirements

  • Performance: p95 routing latency below 250 ms excluding downstream tool execution.
  • Reliability: Same input/context/policy version yields the same routing decision.
  • Security/Policy: High-risk actions require explicit deny or escalation; never silent allow.

3.4 Example Usage / Output

$ npm run dev --workspace p06-tool-router
[ready] listening on http://localhost:3000

$ curl -s http://localhost:3000/v1/route \
  -H 'content-type: application/json' \
  -d '{
  "user_intent": "book me a flight to NYC next Friday",
  "context": {"user_tier": "free", "region": "US"}
}' | jq
{
  "decision": "TOOL_CALL",
  "tool_name": "travel_search",
  "arguments": {"destination": "NYC", "date": "2026-02-20"},
  "confidence": 0.91,
  "trace_id": "trc_p06_001"
}

3.5 Data Formats / Schemas / Protocols

  • Route request JSON: intent, context, optional tool preference.
  • Route response JSON: decision enum, tool, args, confidence, trace_id.
  • Policy file YAML: per-tool allow/deny/escalate rules by risk level.
  • Tool registry JSON: MCP-compatible definitions + risk level extensions.
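
The request/response shapes above can be pinned down as TypeScript types. Field names mirror the curl examples in this guide; treat them as a starting point rather than a fixed contract:

```typescript
// All routing outcomes, kept explicit and exhaustive.
const DECISIONS = ["TOOL_CALL", "DIRECT_ANSWER", "ABSTAIN", "DENIED", "ESCALATED"] as const;
type Decision = (typeof DECISIONS)[number];

interface RouteRequest {
  user_intent: string;
  context: { user_tier: string; region: string; environment?: string };
  tool_preference?: string;
}

interface RouteResponse {
  decision: Decision;
  tool_name: string | null;            // null for DIRECT_ANSWER / ABSTAIN
  arguments: Record<string, unknown> | null;
  confidence: number;
  trace_id: string;
}

// Runtime guard for the decision enum (useful when parsing persisted traces).
function isDecision(x: string): x is Decision {
  return (DECISIONS as readonly string[]).includes(x);
}
```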

3.6 Edge Cases

  • Intent maps to multiple tools with similar confidence.
  • Tool args pass schema but violate business policy.
  • User requests prohibited action with persuasive wording.
  • Tool schema changes without router refresh.
  • MCP server becomes unreachable mid-request.

3.7 Real World Outcome

This project is complete when your API can serve valid requests with typed responses and reject invalid/high-risk requests with a unified error shape.

3.7.1 How to Run (Copy/Paste)

$ npm run dev --workspace p06-tool-router

3.7.2 Golden Path Demo (Deterministic)

Use fixed fixture payloads and verify the same response shape and decision fields every run.

3.7.3 API Endpoints

| Method | Endpoint | Purpose |
|--------|----------|---------|
| POST | /v1/route | Route intent to tool call with validation and policy |
| GET | /v1/tools | List registered tools with risk levels |
| GET | /v1/health | Router health check |

3.7.4 Success Response Example

$ curl -s http://localhost:3000/v1/route \
  -H 'content-type: application/json' \
  -d '{
  "user_intent": "book me a flight to NYC next Friday",
  "context": {"user_tier": "free", "region": "US"}
}' | jq
{
  "decision": "TOOL_CALL",
  "tool_name": "travel_search",
  "arguments": {"destination": "NYC", "date": "2026-02-20"},
  "confidence": 0.91,
  "trace_id": "trc_p06_001"
}

3.7.5 Error Response Example

$ curl -s http://localhost:3000/v1/route \
  -H 'content-type: application/json' \
  -d '{
  "user_intent": "wire $20,000 to external account now",
  "context": {"user_tier": "free", "region": "US"}
}' | jq
{
  "error": {
    "code": "POLICY_BLOCKED",
    "message": "Human approval required before executing this action.",
    "trace_id": "trc_01J...",
    "risk_level": "irreversible",
    "matched_rule": "deny_irreversible_free",
    "project": "P06"
  }
}

4. Solution Architecture

4.1 High-Level Design

              TOOL ROUTER ARCHITECTURE
              ========================

  HTTP Request (POST /v1/route)
  { intent, context, tool_preference? }
         |
         v
  +---------------------------+
  |  REQUEST NORMALIZER       |
  |  - Parse and validate     |
  |  - Attach trace_id        |
  |  - Authenticate context   |
  +---------------------------+
         |
         v
  +---------------------------+      +-----------------+
  |  INTENT CLASSIFIER        |      | Tool Registry   |
  |  - Map intent to tool(s)  |<---->| (MCP-compat)    |
  |  - Confidence scoring     |      | - definitions   |
  |  - Abstain if ambiguous   |      | - risk levels   |
  +---------------------------+      +-----------------+
         |
         | candidate: { tool, args, confidence }
         v
  +---------------------------+      +-----------------+
  |  SCHEMA VALIDATOR         |      | JSON Schemas    |
  |  - Validate args vs schema|<---->| (per tool)      |
  |  - Collect all errors     |      +-----------------+
  |  - Return PASS/FAIL       |
  +---------------------------+
         |
         | validated args
         v
  +---------------------------+      +-----------------+
  |  POLICY GATE              |      | Policy Rules    |
  |  - Risk classification    |<---->| (YAML)          |
  |  - Rule evaluation        |      | - per tool/tier |
  |  - Rate limit check       |      | - per env       |
  |  - ALLOW/DENY/ESCALATE    |      +-----------------+
  +---------------------------+
         |
    +----+----+----------+
    |         |          |
    v         v          v
  ALLOW     DENY      ESCALATE
    |         |          |
    v         v          v
  Execute   Return    Queue for
  tool      error     approval
    |         |          |
    v         v          v
  +---------------------------+
  |  TRACE LOGGER             |
  |  - Append-only audit log  |
  |  - trace_id, decision,    |
  |    rule_id, timestamp     |
  +---------------------------+
         |
         v
  HTTP Response
  { decision, tool, args, trace_id }

4.2 Key Components

| Component | Responsibility | Key Decisions |
|-----------|----------------|---------------|
| Intent Classifier | Maps natural language intent to candidate tool(s). | Emit ABSTAIN when no tool scores above the confidence threshold. |
| Schema Validator | Validates generated arguments against JSON Schema. | Use strict mode: additionalProperties false, all-errors collection. |
| Policy Gate | Applies risk-based rules and tier/environment checks. | Fail closed (DENY by default). Append-only audit log. |
| Trace Logger | Records every routing decision for audit and replay. | Include all context: intent, tool, args, decision, rule_id, latency. |
| Tool Registry | Stores MCP-compatible tool definitions + risk metadata. | Versioned. Auto-reload on change. Namespace collision detection. |

4.3 Data Structures (No Full Code)

ToolDefinition:
  - name: string
  - description: string
  - inputSchema: JSONSchema
  - risk_level: enum(read_only, mutating, irreversible, privileged)
  - requires_approval: boolean
  - rate_limit: { max: number, window: string }
  - schema_version: string
  - mcp_server: string (optional)

RouteRequest:
  - user_intent: string
  - context: { user_tier, region, environment, ... }
  - tool_preference: string (optional)
  - trace_id: string (auto-generated)

RouteDecision:
  - decision: enum(TOOL_CALL, DIRECT_ANSWER, ABSTAIN, DENIED, ESCALATED)
  - tool_name: string (null if DIRECT_ANSWER/ABSTAIN)
  - arguments: object (null if not TOOL_CALL)
  - confidence: float
  - validation_result: { status, errors[] }
  - policy_result: { decision, rule_id, reason }
  - trace_id: string
  - latency_ms: number

TraceEntry:
  - trace_id: string
  - timestamp: datetime
  - intent: string
  - tool_name: string
  - decision: string
  - rule_id: string
  - user_tier: string
  - environment: string
  - risk_level: string
  - latency_ms: number
  - validation_errors: string[] (if any)
  - policy_reason: string (if denied/escalated)

4.4 Algorithm Overview

Key algorithm: Intent-to-tool routing with validation and policy pipeline

  1. Parse and normalize the incoming request. Attach trace_id and authenticate context.
  2. Classify the intent: match it against tool descriptions and score candidates by confidence. If no candidate exceeds the threshold, return ABSTAIN or DIRECT_ANSWER.
  3. For the top candidate: validate generated arguments against the tool’s JSON Schema. If validation fails, return structured error with all violations.
  4. If validation passes: evaluate policy rules using risk level, user tier, and environment. Return ALLOW, DENY, or ESCALATE.
  5. If ALLOW: execute the tool (or mock in the project). If DENY/ESCALATE: return the decision with reason.
  6. Log the trace entry with all decision context.
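
The six steps above can be sketched as one pipeline function. The `classify`, `validate`, `policy`, and `log` hooks are hypothetical stand-ins for the components described in section 4.2, and the 0.7 confidence threshold is illustrative:

```typescript
type Outcome = "TOOL_CALL" | "ABSTAIN" | "DENIED" | "ESCALATED";
interface Candidate { tool: string; args: Record<string, unknown>; confidence: number }

function route(
  intent: string,
  classify: (i: string) => Candidate | null,                 // step 2
  validate: (c: Candidate) => string[],                      // step 3: violations
  policy: (c: Candidate) => "ALLOW" | "DENY" | "ESCALATE",   // step 4
  log: (entry: object) => void,                              // step 6
): { decision: Outcome; errors?: string[] } {
  const candidate = classify(intent);
  if (!candidate || candidate.confidence < 0.7) {
    log({ intent, decision: "ABSTAIN" });
    return { decision: "ABSTAIN" };
  }
  const errors = validate(candidate);
  if (errors.length > 0) {
    log({ intent, tool: candidate.tool, decision: "DENIED", errors });
    return { decision: "DENIED", errors };
  }
  const verdict = policy(candidate);
  const decision: Outcome =
    verdict === "ALLOW" ? "TOOL_CALL" : verdict === "DENY" ? "DENIED" : "ESCALATED";
  log({ intent, tool: candidate.tool, decision });
  return { decision };
}
```

Note that every path, including ABSTAIN, writes a trace entry: the audit log is unconditional.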

Complexity Analysis (conceptual):

  • Time: O(T) for intent classification where T = number of registered tools. O(P) for policy evaluation where P = number of rules. Schema validation is O(1) with respect to registry size (its cost depends only on the schema and arguments, which are fixed per call).
  • Space: O(T) for tool registry. O(1) per request for routing state.

5. Implementation Guide

5.1 Development Environment Setup

# 1) Install Node.js 18+ and TypeScript
# 2) Initialize workspace with p06-tool-router package
# 3) Install Ajv for JSON Schema validation
# 4) Prepare tool registry under registry/
# 5) Prepare policy rules under policies/
# 6) Run: npm run dev --workspace p06-tool-router

5.2 Project Structure

p06/
├── src/
│   ├── server.ts           # HTTP server entry point
│   ├── router.ts           # Main routing pipeline
│   ├── classifier.ts       # Intent-to-tool classification
│   ├── validator.ts        # JSON Schema validation
│   ├── policy-gate.ts      # Risk-based policy evaluation
│   ├── trace-logger.ts     # Append-only audit logging
│   └── registry.ts         # Tool registry management
├── registry/
│   ├── tools.json          # Tool definitions (MCP-compatible)
│   └── schemas/            # Per-tool JSON Schemas
├── policies/
│   └── rules.yaml          # Policy rules by risk/tier/env
├── fixtures/
│   ├── golden_case.json    # Happy path test payloads
│   └── failure_case.json   # Error path test payloads
├── out/
│   └── traces/             # Audit log output
└── README.md

5.3 The Core Question You’re Answering

“How does the system choose whether to call a tool, and which one, without unsafe side effects?”

This question matters because it forces you to build a system that makes explicit, auditable decisions rather than blindly executing whatever the model suggests.

5.4 Concepts You Must Understand First

  1. Tool/function calling architecture across providers
    • How do OpenAI, Anthropic, and Google implement tool calling?
    • Book Reference: “AI Engineering” by Chip Huyen - agent architecture chapters
  2. JSON Schema for parameter validation
    • Why is additionalProperties: false essential for tool schemas?
    • Book Reference: JSON Schema specification (json-schema.org)
  3. Risk-based policy and access control
    • How do you classify tools by risk and enforce authorization?
    • Book Reference: “Security Engineering” by Ross Anderson - access control chapters
  4. Model Context Protocol (MCP) and interoperability
    • How does MCP standardize tool discovery and invocation?
    • Book Reference: MCP specification (modelcontextprotocol.io)

5.5 Questions to Guide Your Design

  1. Routing decisions
    • How do you handle ambiguous intents that map to multiple tools?
    • What confidence threshold triggers ABSTAIN vs best-guess routing?
    • Should the router support parallel tool calls?
  2. Validation strictness
    • Should the router attempt to coerce types (string “42” to int 42) or reject strictly?
    • How do you handle schema version mismatches between registry and model?
    • What happens when validation passes but the arguments are semantically wrong?
  3. Policy design
    • Should policies be per-tool, per-user, per-environment, or all three?
    • How do you handle policy rule conflicts (tool is allowed by one rule, denied by another)?
    • What is the escalation path for denied but potentially legitimate requests?

5.6 Thinking Exercise

Pre-Mortem for Tool Router

Before implementing, write down 10 ways this project can fail in production. Classify each failure into: classification, validation, policy, security, or operations.

Questions to answer:

  • Which failures can be prevented at schema design time?
  • Which failures require runtime monitoring?
  • What happens when the model confidently calls the wrong tool?

5.7 The Interview Questions They’ll Ask

  1. “How do you handle ambiguity between two plausible tool calls?”
  2. “Why must schema validation happen before tool execution, not after?”
  3. “What information should a routing trace contain for incident investigation?”
  4. “How do you blend model confidence with hard policy constraints?”
  5. “What is your fallback behavior when no tool is safe to call?”
  6. “How does MCP change the architecture of tool integration?”

5.8 Hints in Layers

Hint 1: Define the decision enum first. Keep outcomes explicit and exhaustive: TOOL_CALL, DIRECT_ANSWER, ABSTAIN, DENIED, ESCALATED. Every request gets exactly one outcome.

Hint 2: Version everything. Schema hash, policy version, and registry version should be logged with every trace. Without this, decisions are not reproducible.

Hint 3: Treat policy as data, not code. Keep rules in a declarative YAML file that can be reviewed, diffed, and rolled back independently of code deployments.

Hint 4: Build trace replay from day one. Design traces so that any decision can be replayed with the same inputs and verified. This is invaluable for debugging and incident response.
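
One way to sketch replay verification: re-run a pure decision function on each recorded trace and collect divergences. `decide` here is a hypothetical stand-in for whatever deterministic routing function your pipeline exposes:

```typescript
interface ReplayTrace { intent: string; context: object; decision: string }

// Returns the traces whose replayed decision diverges from the recorded one.
// An empty result means the router is still consistent with its history.
function replay(
  traces: ReplayTrace[],
  decide: (intent: string, context: object) => string,
): ReplayTrace[] {
  return traces.filter((t) => decide(t.intent, t.context) !== t.decision);
}
```

For this to work, `decide` must depend only on the logged inputs plus the versioned schema/policy/registry from Hint 2; any hidden state breaks replay.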

Hint 5: Start with 3 tools spanning three risk levels. A read-only search, a mutating booking, and an irreversible payment. This small set exercises most of the risk taxonomy (add a privileged tool later to cover the fourth level) without overwhelming the initial implementation.
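
A starter registry for this hint might look like the following. Tool names and schemas are illustrative, not a prescribed set:

```typescript
const starterRegistry = [
  {
    name: "travel_search",            // read_only: no side effects
    risk_level: "read_only",
    requires_approval: false,
    inputSchema: {
      type: "object",
      properties: { destination: { type: "string" }, date: { type: "string" } },
      required: ["destination"],
      additionalProperties: false,
    },
  },
  {
    name: "booking_create",           // mutating: changes state, reversible
    risk_level: "mutating",
    requires_approval: false,
    inputSchema: {
      type: "object",
      properties: { offer_id: { type: "string" } },
      required: ["offer_id"],
      additionalProperties: false,
    },
  },
  {
    name: "payment_transfer",         // irreversible: always gated
    risk_level: "irreversible",
    requires_approval: true,
    inputSchema: {
      type: "object",
      properties: { amount: { type: "number" }, account: { type: "string" } },
      required: ["amount", "account"],
      additionalProperties: false,
    },
  },
];
```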

5.9 Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Service boundary design | “Building Microservices” by Sam Newman | API chapters |
| Access control and policy | “Security Engineering” by Ross Anderson | Authz chapters |
| Reliability and monitoring | “Site Reliability Engineering” by Google | Ch. 6 |
| Agent architecture | “AI Engineering” by Chip Huyen | Agent systems chapters |
| Tool integration patterns | “Building LLM Apps” by Valentina Alto | Tool use chapters |

5.10 Implementation Phases

Phase 1: Foundation

  • Design and populate the tool registry with 3-5 tools spanning all risk levels.
  • Build the HTTP server skeleton with the /v1/route endpoint.
  • Implement the trace logger with append-only JSON output.
  • Checkpoint: Server starts, accepts requests, returns stub responses with trace_ids.

Phase 2: Core Functionality

  • Implement the intent classifier with confidence scoring and ABSTAIN path.
  • Build the JSON Schema validator using Ajv (or equivalent).
  • Implement the policy gate with risk classification and rule evaluation.
  • Wire the full pipeline: classify -> validate -> gate -> respond.
  • Checkpoint: Golden path produces TOOL_CALL response. Policy-blocked path produces DENIED response. Both have complete traces.

Phase 3: Operational Hardening

  • Add rate limiting per user/tool combination.
  • Add environment-aware policy rules.
  • Add parallel tool call support with atomic/lenient modes.
  • Build trace replay capability for debugging.
  • Checkpoint: Rate limiting triggers on burst requests. Trace replay produces identical decisions for historical requests.
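
The per-user/tool rate limiting from Phase 3 can be prototyped with a fixed-window counter. This is a simplified sketch; production routers often prefer sliding windows or token buckets to avoid boundary bursts:

```typescript
class RateLimiter {
  private counts = new Map<string, { windowStart: number; n: number }>();
  constructor(private max: number, private windowMs: number) {}

  // Returns true if the call is within budget; `now` is injectable for tests.
  allow(user: string, tool: string, now = Date.now()): boolean {
    const key = `${user}:${tool}`;
    const slot = this.counts.get(key);
    if (!slot || now - slot.windowStart >= this.windowMs) {
      this.counts.set(key, { windowStart: now, n: 1 }); // fresh window
      return true;
    }
    if (slot.n >= this.max) return false;               // over budget
    slot.n += 1;
    return true;
  }
}
```

When `allow` returns false, the router should emit the RATE_LIMITED error shape and still write a trace entry.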

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|----------|---------|----------------|-----------|
| Schema validation strictness | Strict (reject) / Coercive (auto-fix) | Strict (reject) | Auto-coercion can silently change semantics |
| Policy default | ALLOW / DENY | DENY (fail closed) | Principle of least privilege |
| Parallel calls | Atomic / Lenient | Configurable | Safety-critical apps need atomic; others benefit from lenient |
| Tool definitions | MCP-compatible / Custom | MCP-compatible with extensions | Future interoperability with MCP ecosystem |
| Audit logging | Synchronous / Asynchronous | Asynchronous (fire-and-forget) | Logging should never block routing latency |
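
The fail-closed policy default is worth making concrete: if no rule matches, the verdict is DENY. The rule shape below is illustrative, mirroring the risk/tier fields used throughout this guide:

```typescript
type Verdict = "ALLOW" | "DENY" | "ESCALATE";

interface PolicyRule {
  id: string;
  risk_level: string;
  user_tier: string;   // "*" matches any tier
  verdict: Verdict;
}

function evaluatePolicy(
  rules: PolicyRule[],
  risk: string,
  tier: string,
): { verdict: Verdict; rule_id: string | null } {
  for (const rule of rules) {
    if (rule.risk_level === risk && (rule.user_tier === "*" || rule.user_tier === tier)) {
      return { verdict: rule.verdict, rule_id: rule.id };
    }
  }
  // Fail closed: no matching rule means DENY, never a silent allow.
  return { verdict: "DENY", rule_id: null };
}
```

First-match-wins is one simple conflict-resolution strategy; your design questions in 5.5 may lead you to most-specific-wins or deny-overrides instead.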

6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|----------|---------|----------|
| Unit Tests | Validate individual components | Schema validation edge cases, policy rule matching |
| Integration Tests | Verify full routing pipeline | Golden path end-to-end, error path end-to-end |
| Edge Case Tests | Ensure robust failure handling | Ambiguous intent, unknown tool, malformed JSON |
| Security Tests | Verify policy enforcement | Privilege escalation attempts, policy bypass attempts |

6.2 Critical Test Cases

  1. Golden path: intent maps to correct tool, args validate, policy allows, trace is complete.
  2. Schema validation: missing required field returns structured error with JSON path.
  3. Policy DENY: irreversible tool + free tier returns POLICY_BLOCKED with matched rule.
  4. Ambiguous intent: two tools with similar confidence returns ABSTAIN.
  5. Unknown tool: model hallucinated tool name returns TOOL_NOT_FOUND.
  6. Rate limit: burst requests trigger throttling with RATE_LIMITED response.
  7. Trace replay: replaying a trace with same inputs produces same decision.

6.3 Test Data

fixtures/golden_case.json       # Happy path: clear intent, valid args, allowed policy
fixtures/failure_case.json      # Error path: invalid args, blocked policy
fixtures/ambiguous_intent.json  # Multiple tools with similar confidence
fixtures/edge_cases/
  malformed_json.txt            # Invalid JSON from model
  unknown_tool.json             # Hallucinated tool name
  policy_bypass.json            # Persuasive wording for blocked action
  parallel_calls.json           # Multiple tool calls in one request

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Root Cause | Solution |
|---------|------------|----------|
| “Wrong tool selected for edge intents” | Classifier labels are too coarse. | Add hierarchical intent taxonomy with ABSTAIN for ambiguous cases. |
| “Tool call succeeded but policy was violated” | Policy checks executed after tool invocation. | Move policy gate before execution path. Never skip the gate. |
| “Router decisions are inconsistent” | Context normalization is missing. | Normalize locale, date, and entity extraction before routing. |
| “Schema validation passes but tool fails” | Schema is too permissive. | Use additionalProperties: false, strict types, format annotations. |
| “Audit log is incomplete” | Logging is conditional or synchronous-blocking. | Log every decision unconditionally. Use async writes. |

7.2 Debugging Strategies

  • Replay traces with the same inputs and verify decisions match.
  • Add request_id logging to correlate across classifier, validator, and policy gate.
  • Compare validation error paths by sending known-bad inputs for each schema constraint.
  • Test policy rules independently with unit tests before integration.

7.3 Performance Traps

  • Schema validation on every request: precompile schemas at startup with Ajv.
  • Policy rule evaluation with linear scan: index rules by risk_level for O(1) lookup.
  • Synchronous audit logging blocking response: use async fire-and-forget writes.
  • Loading tool registry from disk on every request: cache in memory, watch for changes.
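
The "compile once at startup" pattern behind the first trap can be shown without the Ajv dependency: do the per-schema setup work once and return a cheap closure to call per request. With Ajv, the equivalent is calling `ajv.compile(schema)` at startup and reusing the returned validate function. The hand-rolled checker below covers only required fields, primitive types, and `additionalProperties: false`, so treat it as a stand-in, not a replacement:

```typescript
interface MiniSchema {
  required: string[];
  properties: Record<string, { type: string }>;
  additionalProperties: false;
}

function buildValidator(schema: MiniSchema) {
  // One-time work at startup: precompute lookup sets.
  const required = new Set(schema.required);
  const known = new Set(Object.keys(schema.properties));
  // Per-request closure: collects ALL violations, not just the first.
  return (args: Record<string, unknown>): string[] => {
    const errors: string[] = [];
    for (const field of required)
      if (!(field in args)) errors.push(`missing required field: ${field}`);
    for (const [key, value] of Object.entries(args)) {
      if (!known.has(key)) { errors.push(`unexpected property: ${key}`); continue; }
      if (typeof value !== schema.properties[key].type)
        errors.push(`wrong type for ${key}: expected ${schema.properties[key].type}`);
    }
    return errors;
  };
}
```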

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add 2 more tools to the registry with different risk levels and test policy coverage.
  • Add one new policy rule that considers request time-of-day (e.g., deny irreversible actions outside business hours).

8.2 Intermediate Extensions

  • Implement parallel tool call support with configurable atomic/lenient modes.
  • Add a /v1/tools endpoint that lists registered tools with risk levels (filtered by user tier).
  • Build a trace dashboard that visualizes routing decisions over time.

8.3 Advanced Extensions

  • Implement MCP client support: discover tools from external MCP servers and merge into the registry.
  • Add escalation workflow integration: ESCALATE decisions queue for human approval with configurable timeout.
  • Build trace replay with diff analysis: compare current routing decisions against historical decisions after policy changes.
  • Integrate with P13 (Tool Permission Firewall) for fine-grained capability-based security.

9. Real-World Connections

9.1 Industry Applications

  • Enterprise AI platforms routing tool calls through organizational policy layers before executing actions in CRM, ERP, and communication systems.
  • Autonomous coding agents using tool routers to enforce which file operations, terminal commands, and API calls are permitted.
  • Customer support bots using tool routing to safely execute actions like refunds, account changes, and escalations based on agent tier and request risk.

9.2 Open-Source Implementations

  • LangChain tool routing and agent executor implementations.
  • Model Context Protocol specification and reference implementations.
  • Open Policy Agent (OPA) for declarative policy evaluation.
  • Ajv (Another JSON Schema Validator) for TypeScript/JavaScript validation.

9.3 Interview Relevance

  • Demonstrates understanding of the trust boundary between LLM output and real-world actions.
  • Shows ability to design layered security: schema validation + policy gating + audit logging.
  • Illustrates production thinking: deterministic decisions, trace replay, fail-closed defaults.

10. Resources

10.1 Essential Reading

  • OpenAI Function Calling documentation.
  • Anthropic Tool Use and Structured Outputs documentation.
  • Model Context Protocol specification (November 2025 revision).
  • JSON Schema specification (draft 2020-12).
  • OWASP LLM Top 10 (tool misuse and insecure plugin design).

10.2 Video Resources

  • Talks on AI agent security and tool sandboxing.
  • Conference presentations on MCP architecture and adoption.
  • Workshops on JSON Schema design for API contracts.

10.3 Tools & Documentation

  • Ajv documentation (JSON Schema validator for TypeScript/JavaScript).
  • Open Policy Agent (OPA) documentation for declarative policy engines.
  • OpenTelemetry documentation for structured tracing.
  • MCP reference implementations on GitHub.

10.4 Related Projects

  • P01 (Prompt Contract Harness): Contract discipline applied to routing outputs.
  • P03 (Prompt Injection Red-Team Lab): Tests injection resistance in tool-calling prompts.
  • P05 (Few-Shot Example Curator): Example selection for intent classification training.
  • P13 (Tool Permission Firewall): Fine-grained capability-based tool access control.
  • P18 (Capstone): Integrates the router into the full production platform.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain the tool-calling lifecycle across at least two LLM providers.
  • I can describe the four risk levels and give examples of tools at each level.
  • I can explain why fail-closed is the correct default for policy evaluation.
  • I can describe how MCP standardizes tool integration and where the router adds value.

11.2 Implementation

  • Golden-path and failure-path flows both produce complete trace entries.
  • Schema validation catches missing fields, wrong types, and extra properties.
  • Policy gate correctly applies risk-based rules for different user tiers.
  • Trace replay produces identical decisions for the same inputs.

11.3 Growth

  • I can explain the tradeoff between atomic and lenient parallel tool call handling.
  • I can design a policy migration strategy for adding new tools without breaking existing rules.
  • I can describe how this project integrates with P13 (Firewall) and P18 (Capstone).

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Router accepts requests and returns structured decisions (TOOL_CALL, DENIED, ABSTAIN).
  • Schema validation catches invalid arguments and returns all errors.
  • Policy gate enforces at least one deny rule based on risk level + user tier.
  • Every decision produces a trace entry.

Full Completion:

  • Tool registry has 5+ tools spanning all four risk levels.
  • Policy rules cover all risk/tier/environment combinations with fail-closed default.
  • Rate limiting triggers on burst requests.
  • Trace replay verifies decision consistency.
  • Automated tests cover golden path, error path, and security cases.

Excellence (Above & Beyond):

  • MCP client discovers tools from external servers.
  • Escalation workflow with human approval integration.
  • Parallel tool call support with configurable atomic/lenient modes.
  • Integration with P13 (Firewall) for capability-based security.