Project 6: Tool Router (Function Schemas as Contracts)

Tool-call trace logs with policy decisions and argument validation status.

Quick Reference

  • Difficulty: Level 3: Advanced
  • Time Estimate: See main guide estimates (typically 3-8 days except capstone)
  • Main Programming Language: TypeScript
  • Alternative Programming Languages: Python, Go
  • Coolness Level: Level 4: Agent Systems Core
  • Business Potential: 5. Product Foundation
  • Knowledge Area: Agent Tooling
  • Software or Tool: Intent router + policy gate
  • Main Book: Building Microservices (Newman)
  • Concept Clusters: Tool Calling and MCP Interoperability; Instruction Hierarchy and Injection Defense

1. Learning Objectives

By completing this project, you will:

  1. Design a tool registry with JSON Schema-based contracts that define each tool’s name, description, parameters, return type, and risk classification.
  2. Implement an intent classifier that maps natural language user requests to candidate tools with confidence scores and an explicit abstention path.
  3. Build a schema validator that checks LLM-generated tool arguments against JSON Schema definitions before any tool execution occurs.
  4. Implement a risk-aware policy gate that applies allow/deny/escalate decisions based on tool risk level, user tier, and environment context.
  5. Produce auditable trace logs that record every routing decision with sufficient detail for incident replay and compliance review.

2. All Theory Needed (Per-Concept Breakdown)

Tool/Function Calling Architecture Across LLM Providers

Fundamentals Tool calling (also called function calling) is the mechanism by which LLMs invoke external capabilities. Instead of generating free-form text, the model outputs a structured request to call a specific function with specific arguments. The LLM itself does not execute the function; the host application receives the structured call, validates it, executes the function, and returns the result to the model for incorporation into its response. This architecture transforms LLMs from text generators into orchestration engines that can take real-world actions: querying databases, making API calls, modifying state, and interacting with external systems. Understanding how different providers implement tool calling is essential for building a portable, robust tool router.

Deep Dive into the concept Tool calling follows a request-response cycle with the host application acting as intermediary between the model and the external tools. The cycle has five phases: (1) the model receives the user message plus available tool definitions, (2) the model decides whether to call a tool and generates a structured tool-call request, (3) the host application intercepts the request and validates it, (4) the host executes the tool and captures the result, (5) the host feeds the result back to the model for final response generation. The router you build in this project operates in phase 3: intercepting the model’s tool-call request and applying validation, policy checks, and routing logic before any execution occurs.

OpenAI’s implementation uses a tools parameter in the chat completion API. Each tool is defined with a name, description, and parameters object that follows JSON Schema. When the model decides to call a tool, it returns a message with tool_calls array, where each entry has an id, function.name, and function.arguments (a JSON string). The host parses the arguments, validates against the schema, executes the function, and sends a message with role tool containing the result. OpenAI also supports tool_choice to force or prevent tool calling: auto (model decides), none (no tools), required (must call a tool), or a specific function name.
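
Roughly, a request carrying one tool definition and the model's tool-call reply look like the following (values are invented; the wrapper keys request and response_message exist only to show both sides in one fragment):

```json
{
  "request": {
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "travel_search",
          "description": "Search for flights to a destination on a date",
          "parameters": {
            "type": "object",
            "properties": { "destination": { "type": "string" } },
            "required": ["destination"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  },
  "response_message": {
    "role": "assistant",
    "tool_calls": [
      {
        "id": "call_abc123",
        "type": "function",
        "function": {
          "name": "travel_search",
          "arguments": "{\"destination\": \"NYC\", \"date\": \"2026-02-20\"}"
        }
      }
    ]
  }
}
```

Note that function.arguments arrives as a JSON string, not a parsed object; the host must parse it before validation.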

Anthropic’s implementation uses a tools parameter in the Messages API with similar structure: each tool has a name, description, and input_schema (JSON Schema). When Claude decides to use a tool, it emits a tool_use content block with an id, name, and input object. The host validates and executes, then sends a message with role user containing a tool_result content block with the matching tool_use_id. Anthropic recently added structured outputs support that can guarantee the tool arguments match the schema exactly, reducing the need for client-side validation (but not eliminating it, since the schema itself may be wrong or the tool execution may fail).
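
The corresponding Anthropic shapes look roughly like this (ids invented; the wrapper keys assistant_content and followup_user_content are only for side-by-side display):

```json
{
  "assistant_content": [
    {
      "type": "tool_use",
      "id": "toolu_01A",
      "name": "travel_search",
      "input": { "destination": "NYC", "date": "2026-02-20" }
    }
  ],
  "followup_user_content": [
    {
      "type": "tool_result",
      "tool_use_id": "toolu_01A",
      "content": "{\"flights\": []}"
    }
  ]
}
```

Unlike OpenAI's string-encoded arguments, input is already a parsed object.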

Google’s implementation (Gemini API) uses function_declarations within a tools parameter. Each declaration has a name, description, and parameters using OpenAPI schema format (a superset of JSON Schema). The model returns a functionCall part with the function name and arguments. The host processes and returns a functionResponse part. Google also supports grounding with Google Search as a built-in tool type.

Despite these differences in wire format, the conceptual model is identical across providers: define tools with schemas, model decides and generates structured calls, host validates and executes, results flow back. Your router abstracts over these differences by working with a normalized tool definition format and a normalized call request format.
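
A minimal sketch of such a normalization layer in TypeScript. The NormalizedToolCall shape and the helper names are this example's inventions, not taken from any provider SDK; only the input shapes mirror the provider formats described above.

```typescript
// Provider-agnostic internal representation of a tool call.
interface NormalizedToolCall {
  callId: string;
  toolName: string;
  arguments: Record<string, unknown>; // always a parsed object, never a string
  provider: "openai" | "anthropic" | "google";
}

// OpenAI delivers arguments as a JSON *string* inside tool_calls[i].function.
function fromOpenAI(call: {
  id: string;
  function: { name: string; arguments: string };
}): NormalizedToolCall {
  return {
    callId: call.id,
    toolName: call.function.name,
    arguments: JSON.parse(call.function.arguments), // parse eagerly; fail fast on malformed JSON
    provider: "openai",
  };
}

// Anthropic emits a tool_use content block whose input is already an object.
function fromAnthropic(block: {
  id: string;
  name: string;
  input: Record<string, unknown>;
}): NormalizedToolCall {
  return {
    callId: block.id,
    toolName: block.name,
    arguments: block.input,
    provider: "anthropic",
  };
}
```

Every downstream stage (validation, policy, logging) then works only with NormalizedToolCall, so adding a provider means adding one normalizer, nothing else.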

Parallel tool calls are supported by all major providers. The model can request multiple tool calls in a single turn, which the host should execute (potentially in parallel) and return all results together. Your router must handle parallel validation: all calls are validated before any are executed, and if one fails validation, you must decide whether to execute the valid ones or reject the batch.
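
That validate-all-then-decide step can be sketched as follows; the type and mode names are illustrative, and the validation itself is assumed to have happened already.

```typescript
// Result of validating one call in a parallel batch.
type ValidationOutcome = { callId: string; ok: boolean; errors: string[] };

// Decide which calls to execute given batch semantics:
// "atomic"  - one failure rejects the whole batch;
// "lenient" - valid calls run, invalid ones get structured errors.
function planBatch(
  outcomes: ValidationOutcome[],
  mode: "atomic" | "lenient",
): { execute: string[]; reject: string[] } {
  const failed = outcomes.filter((o) => !o.ok);
  if (mode === "atomic" && failed.length > 0) {
    return { execute: [], reject: outcomes.map((o) => o.callId) };
  }
  return {
    execute: outcomes.filter((o) => o.ok).map((o) => o.callId),
    reject: failed.map((o) => o.callId),
  };
}
```

Atomic semantics suit safety-critical batches where partial execution could leave inconsistent state; lenient semantics reduce round-trips with the model.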

The key architectural insight is that the host application is the trust boundary. The model generates tool calls, but the host decides whether to execute them. This is where your router lives: between the model’s output and the actual tool execution. The router is the enforcement point for schema validation, policy checks, rate limiting, and audit logging.

How this fits into the project This concept provides the foundational architecture for Project 6. Understanding how tool calling works across providers lets you design a router that is provider-agnostic, validating and gating tool calls regardless of whether they come from OpenAI, Anthropic, or Google models.

Definitions & key terms

  • Tool definition: A structured description of a tool including name, description, and parameter schema that is provided to the model.
  • Tool call: A structured request from the model to invoke a specific tool with specific arguments.
  • Tool result: The response from executing a tool, fed back to the model for incorporation into its response.
  • Host application: The software layer between the model and external tools that manages the tool-call lifecycle.
  • tool_choice / tool_use: Provider-specific parameters that control whether and how the model uses tools.
  • Parallel tool calls: Multiple tool invocations requested by the model in a single turn.

Mental model diagram (ASCII)

           TOOL CALLING LIFECYCLE (PROVIDER-AGNOSTIC)
           ==========================================

  User Message                 Tool Definitions
  "Book me a flight            (JSON Schema contracts)
   to NYC next Friday"         ┌─────────────────────┐
        |                      │ travel_search:       │
        |                      │   dest: string       │
        |                      │   date: date         │
        v                      │   class: enum        │
  +-------------+              │ hotel_book:          │
  |   LLM       |<────────────│   city: string       │
  | (any        |   provided   │   nights: integer    │
  |  provider)  |   at call    │ payment_transfer:    │
  +------+------+   time       │   amount: number     │
         |                     │   account: string    │
         | model decides       └─────────────────────┘
         | to call a tool
         v
  +--------------------+
  | Tool Call Request   |
  | name: travel_search |
  | args: {             |
  |   dest: "NYC",      |
  |   date: "2026-02-20"|
  | }                   |
  +--------+-----------+
           |
           |  *** YOUR ROUTER LIVES HERE ***
           v
  +====================================+
  ‖         TOOL ROUTER (P06)          ‖
  ‖                                    ‖
  ‖  1. Schema Validation              ‖
  ‖     args match JSON Schema?        ‖
  ‖                                    ‖
  ‖  2. Policy Gate                    ‖
  ‖     risk level + user tier         ‖
  ‖     -> ALLOW / DENY / ESCALATE    ‖
  ‖                                    ‖
  ‖  3. Trace Logging                  ‖
  ‖     record everything              ‖
  +====================================+
           |
      ┌────+────┐
      |         |
      v         v
  [EXECUTE]  [DENY/ESCALATE]
      |
      v
  +--------------------+
  | Tool Result         |
  | { flights: [...] }  |
  +--------+-----------+
           |
           v
  +-------------+
  |   LLM       |  incorporates result
  +------+------+  into response
         |
         v
  "I found 3 flights to NYC..."

How it works (step-by-step, with invariants and failure modes)

  1. The model receives the user message and tool definitions. Invariant: tool definitions are loaded from the versioned tool registry, not hardcoded. Failure mode: stale tool definitions cause the model to generate calls for tools that no longer exist or have changed schemas.
  2. The model generates a tool-call request with function name and arguments. Invariant: the request follows the provider-specific format. Failure mode: the model hallucinates a tool name that does not exist in the registry.
  3. The router receives the raw tool-call request and normalizes it into the internal format. Invariant: normalization preserves all information without lossy transformation. Failure mode: provider format changes break the normalizer.
  4. Schema validation checks the arguments against the JSON Schema for the named tool. Invariant: validation is strict (no additional properties, correct types, required fields present). Failure mode: the JSON Schema is malformed, causing the validator itself to error.
  5. Policy gate evaluates the tool’s risk level, user tier, and environment against the policy rules. Invariant: the policy decision is deterministic given the same inputs. Failure mode: missing policy rules for a tool/tier combination; the default should be DENY, not ALLOW.
  6. If approved, the tool is executed. The result is captured and returned to the model. If denied, the denial reason is returned to the model so it can inform the user. Invariant: no tool execution occurs without passing both validation and policy. Failure mode: the tool execution itself fails (timeout, error); the router must handle this and return a structured error to the model.
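
The six steps above can be condensed into a routing skeleton. The registry shape, reason codes, and policy signature below are illustrative placeholders, not a prescribed design.

```typescript
type Decision = "ALLOW" | "DENY" | "ESCALATE";
type Risk = "read_only" | "mutating" | "irreversible";

interface ToolEntry {
  riskLevel: Risk;
  validate: (args: Record<string, unknown>) => string[]; // empty list = valid
}

function route(
  registry: Map<string, ToolEntry>,
  toolName: string,
  args: Record<string, unknown>,
  policy: (risk: Risk) => Decision | undefined,
): { decision: Decision; reason: string } {
  // Step 2 failure mode: hallucinated tool name.
  const entry = registry.get(toolName);
  if (!entry) return { decision: "DENY", reason: "TOOL_NOT_FOUND" };

  // Step 4: schema validation before anything executes.
  const errors = entry.validate(args);
  if (errors.length > 0) {
    return { decision: "DENY", reason: "SCHEMA_INVALID: " + errors.join("; ") };
  }

  // Step 5: a missing policy rule fails closed (DENY), never open.
  const decision = policy(entry.riskLevel) ?? "DENY";
  return { decision, reason: decision === "ALLOW" ? "OK" : "POLICY" };
}
```

A real router would also emit a trace record per call and wrap execution (step 6) with timeouts and structured error capture.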

Minimal concrete example

  Tool definition (JSON Schema):

  {
    "name": "travel_search",
    "description": "Search for flights to a destination on a date",
    "parameters": {
      "type": "object",
      "properties": {
        "destination": { "type": "string", "minLength": 1 },
        "date": { "type": "string", "format": "date" },
        "cabin_class": {
          "type": "string",
          "enum": ["economy", "business", "first"]
        }
      },
      "required": ["destination", "date"],
      "additionalProperties": false
    },
    "risk_level": "read_only",
    "requires_approval": false
  }

  Normalized tool-call request:

  {
    "call_id": "tc_001",
    "tool_name": "travel_search",
    "arguments": {
      "destination": "NYC",
      "date": "2026-02-20"
    },
    "provider": "openai",
    "model": "gpt-4",
    "trace_id": "trc_p06_001"
  }

  Validation result: PASS (all required fields present, types correct)
  Policy result: ALLOW (read_only tool, no approval required)

Common misconceptions

  • “The LLM executes the tool directly.” The model only generates the call request. The host application is responsible for validation, policy enforcement, and execution. The model never touches external systems.
  • “JSON Schema validation is sufficient for safety.” Schema validation checks structure, not semantics. A tool call can be schema-valid but policy-violating (e.g., transferring money to an unauthorized account). Both validation layers are needed.
  • “Tool definitions are static configuration.” Tool definitions should be versioned and loaded from a registry. When a tool’s schema changes, the router must handle both the old and new versions during the transition period.
  • “All providers work the same way.” While the conceptual model is the same, wire formats differ. A production router needs a normalization layer for each supported provider.
  • “If the model is confident, the tool call is safe.” Model confidence is orthogonal to safety. A model can be 99% confident about a tool call that violates business policy. Policy gates must override model confidence.

Check-your-understanding questions

  1. Why does the router sit between the model output and tool execution rather than before the model call?
  2. What should the router do when the model hallucinates a tool name not in the registry?
  3. How should parallel tool calls be handled when one passes validation but another fails?
  4. Why should the default policy for unknown tool/tier combinations be DENY rather than ALLOW?

Check-your-understanding answers

  1. The router intercepts tool-call requests because the model’s decision to call a tool and its argument generation are part of the model’s inference. The router cannot prevent the model from deciding to call a tool; it can only prevent the execution. Placing the router after model output and before execution creates a clean trust boundary.
  2. Return a structured error with reason code “TOOL_NOT_FOUND” and the attempted tool name. Feed this back to the model so it can either try a different tool or inform the user. Never attempt to fuzzy-match or guess the intended tool, as this could lead to unintended actions.
  3. This is a design decision. The strict approach rejects the entire batch if any call fails (atomic semantics). The lenient approach executes the valid calls and returns errors for the invalid ones. For safety-critical applications, atomic semantics are recommended. The router should support both modes via configuration.
  4. Failing open (ALLOW by default) means any new tool added without explicit policy rules is immediately executable by all users. Failing closed (DENY by default) means new tools require explicit policy configuration before they can be used, which is safer. This follows the principle of least privilege.

Real-world applications

  • ChatGPT plugins use the tool-calling mechanism to interact with external services, with OpenAI acting as the host application and router.
  • Anthropic’s Claude in computer use mode uses tool calling to interact with desktop applications through structured API calls.
  • Enterprise AI platforms like Microsoft Copilot route tool calls through policy layers that enforce organizational permissions before executing actions in Office 365, Dynamics, etc.
  • Autonomous coding agents (Cursor, Claude Code) use tool calling to read files, write code, and run commands, with the host application enforcing which operations are permitted.

Where you’ll apply it

  • Phase 1: designing the tool registry schema and the normalized call format.
  • Phase 2: implementing the schema validator and policy gate.
  • Phase 3: adding provider-specific normalizers and parallel call handling.

References

  • OpenAI Function Calling documentation (platform.openai.com)
  • Anthropic Tool Use documentation (docs.anthropic.com)
  • Google Gemini Function Calling documentation (ai.google.dev)
  • “AI Engineering” by Chip Huyen - Chapters on agentic systems and tool use
  • “Building LLM Apps” by Valentina Alto - Tool integration patterns

Key insights The host application, not the model, is the trust boundary for tool execution; the router is the enforcement mechanism that transforms the model’s probabilistic intent into deterministic, policy-compliant action.

Summary Tool calling across LLM providers follows a consistent pattern: define tools with schemas, model generates structured call requests, the host validates and executes, results flow back to the model. The router sits at the critical trust boundary between model output and tool execution, enforcing schema validation, policy rules, and audit logging. Provider differences are handled through a normalization layer. The default policy for unknown combinations must be DENY to maintain safety.

Homework/Exercises to practice the concept

  • Write JSON Schema tool definitions for 3 tools with different risk levels: a read-only search tool, a state-modifying booking tool, and a high-risk payment transfer tool. Include all required fields and proper constraints.
  • Design the normalized internal format that your router uses to represent tool-call requests from any provider. Show how an OpenAI tool_calls response and an Anthropic tool_use block both normalize to the same format.
  • Trace through the 5-phase tool-calling lifecycle for a scenario where the model requests two parallel tool calls, one of which fails schema validation.

Solutions to the homework/exercises

  • The search tool should have risk_level “read_only” with no approval required. The booking tool should have risk_level “mutating” with approval required for high-value bookings (e.g., total > $1000). The payment tool should have risk_level “irreversible” with mandatory human approval. Each schema should use additionalProperties: false, required fields, and appropriate type constraints (enum for status, format: date for dates, minimum: 0 for amounts).
  • The normalized format should include: call_id (from provider’s tool call id), tool_name, arguments (parsed JSON object, not string), provider (enum: openai, anthropic, google), model (model id string), trace_id (generated by the router), and timestamp. OpenAI normalization parses function.arguments from string to object. Anthropic normalization maps tool_use.input directly. Both produce identical internal format.
  • Phase 1: model receives message + 2 tool defs. Phase 2: model returns 2 parallel calls. Phase 3: router normalizes both, validates both. Call A passes schema. Call B fails (missing required field). Phase 4a (atomic mode): reject both, return structured error listing the failed call. Phase 4b (lenient mode): execute Call A, return error for Call B. Phase 5: feed results (one success, one error) back to the model.

JSON Schema for Tool Parameter Validation

Fundamentals JSON Schema is the contract language for tool parameters. Every tool definition includes a JSON Schema that specifies the exact structure, types, and constraints of the arguments the model must provide. When the model generates a tool call, the router validates the arguments against this schema before execution. This is not optional safety theater; it is the hard boundary between the model’s probabilistic output and deterministic tool execution. Without schema validation, a model could pass a string where an integer is expected, omit required fields, or include unexpected properties that downstream tools do not handle. Schema validation catches these errors before they propagate into real systems and cause data corruption, financial errors, or security breaches.

Deep Dive into the concept JSON Schema (draft 2020-12 is current) provides a rich vocabulary for describing data structures. For tool parameter validation, the most important features are:

Type constraints specify the expected data type for each property: string, number, integer, boolean, array, object, or null. Type mismatches are the most common validation failure: the model returns “42” (string) when 42 (integer) is expected, or returns an array when the tool expects a single object.

Required fields declare which properties must be present. A tool that expects both destination and date should declare both as required. If the model omits a required field, the validation fails immediately rather than passing incomplete data to the tool.

Enum constraints restrict string values to a predefined set. For a cabin_class parameter that only accepts “economy”, “business”, or “first”, an enum constraint prevents the model from hallucinating values like “premium_economy” or “elite” that the tool does not support.

Format annotations provide semantic validation beyond type checking. The format: "date" annotation validates that a string matches ISO 8601 date format. The format: "email" annotation validates email structure. Format validation catches cases where the type is correct (string) but the content is invalid (not a valid date).

additionalProperties: false is a critical safety setting. By default, JSON Schema allows extra properties not defined in the schema. Setting additionalProperties to false rejects any arguments the model provides that are not explicitly defined. This prevents the model from injecting unexpected fields that could be interpreted by downstream tools in unintended ways.

Nested schemas handle complex tool parameters. A booking tool might accept a passengers array where each element is an object with name (string), age (integer, minimum: 0), and seat_preference (enum). JSON Schema handles arbitrarily nested validation.

Conditional schemas (if/then/else, oneOf, anyOf) handle parameters whose validation depends on other parameter values. For example, if payment_method is “credit_card”, then card_number is required; if payment_method is “bank_transfer”, then account_number is required. These patterns are powerful but add complexity to both the schema and the model’s ability to generate valid arguments.

Schema versioning is essential for production systems. When a tool’s parameters change (adding a new required field, changing an enum), the schema version changes. The router must handle the transition: old model calls use the old schema, new calls use the new schema. Without versioning, a schema change can break all existing prompt templates that use that tool.
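
A hypothetical versioned-registry lookup illustrating this transition handling; the class and field names are invented for the sketch, and version ordering is assumed to follow registration order.

```typescript
interface VersionedSchema {
  version: string;
  schema: object;      // the JSON Schema document itself
  deprecated: boolean; // deprecated versions remain resolvable, but are never the default
}

class ToolSchemaRegistry {
  private byName = new Map<string, VersionedSchema[]>();

  register(name: string, entry: VersionedSchema): void {
    const list = this.byName.get(name) ?? [];
    list.push(entry); // registration order doubles as version order in this sketch
    this.byName.set(name, list);
  }

  // Resolve an explicitly pinned version (callers mid-transition),
  // or fall back to the newest non-deprecated version.
  resolve(name: string, version?: string): VersionedSchema | undefined {
    const list = this.byName.get(name) ?? [];
    if (version !== undefined) return list.find((s) => s.version === version);
    const live = list.filter((s) => !s.deprecated);
    return live[live.length - 1];
  }
}
```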

Schema composition with $ref allows shared schema definitions. If multiple tools accept an address parameter, define the address schema once and reference it from each tool. This prevents drift between schema copies and simplifies updates.
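
For example, a shared address definition referenced from two parameters might look like this (draft 2020-12 $defs/$ref syntax; property names are illustrative):

```json
{
  "type": "object",
  "$defs": {
    "address": {
      "type": "object",
      "properties": {
        "street": { "type": "string" },
        "city": { "type": "string" }
      },
      "required": ["street", "city"],
      "additionalProperties": false
    }
  },
  "properties": {
    "billing_address": { "$ref": "#/$defs/address" },
    "shipping_address": { "$ref": "#/$defs/address" }
  },
  "required": ["billing_address"],
  "additionalProperties": false
}
```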

How this fits into the project This concept drives the argument validation phase of the router (Phase 2). Every tool call passes through JSON Schema validation before reaching the policy gate. The quality and strictness of your schemas determine how many invalid tool calls are caught before execution.

Definitions & key terms

  • JSON Schema: A vocabulary for annotating and validating JSON documents, used here to define tool parameter contracts.
  • additionalProperties: A JSON Schema keyword that controls whether properties not defined in the schema are allowed. Set to false for strict validation.
  • $ref: A JSON Schema keyword for referencing shared schema definitions, enabling composition and reuse.
  • Schema draft: The version of the JSON Schema specification (current: draft 2020-12).
  • Coercion: Automatically converting a value to the expected type (e.g., string “42” to integer 42). Generally avoid in tool validation; prefer strict type checking.

Mental model diagram (ASCII)

           JSON SCHEMA VALIDATION PIPELINE
           ================================

  Model-Generated Arguments          Tool Schema (from registry)
  {                                  {
    "destination": "NYC",              "type": "object",
    "date": "2026-02-20",             "properties": {
    "passengers": 2                      "destination": {
  }                                        "type": "string",
           |                               "minLength": 1
           |                             },
           v                             "date": {
  +---------------------+                 "type": "string",
  | JSON SCHEMA         |                 "format": "date"
  | VALIDATOR           |               },
  |                     |               "cabin_class": {
  | Step 1: Parse JSON  |                 "type": "string",
  |   -> valid JSON? Y  |                 "enum": ["economy",
  |                     |                   "business", "first"]
  | Step 2: Type check  |               }
  |   destination: str Y |             },
  |   date: str        Y |             "required": ["destination",
  |   passengers: int  ? |               "date"],
  |   (not in schema!) X |             "additionalProperties": false
  |                     |            }
  | Step 3: Format check|
  |   date: ISO 8601? Y |
  |                     |
  | Step 4: Required    |
  |   destination? Y    |
  |   date? Y           |
  |                     |
  | Step 5: Additional  |
  |   "passengers" not  |
  |   in schema -> FAIL |
  +----------+----------+
             |
             v
  +---------------------+
  | VALIDATION RESULT   |
  |                     |
  | status: FAIL        |
  | errors: [           |
  |   {                 |
  |     path: "/",      |
  |     keyword:        |
  |       "additional   |
  |        Properties", |
  |     message:        |
  |       "unexpected   |
  |        property:    |
  |        passengers"  |
  |   }                 |
  | ]                   |
  +---------------------+

How it works (step-by-step, with invariants and failure modes)

  1. Parse the model’s arguments from JSON string to object. Invariant: parsing is strict (no comments, no trailing commas). Failure mode: invalid JSON (the model generated malformed JSON). Return a parse error with the character position.
  2. Load the schema for the named tool from the registry. Invariant: the schema is itself valid JSON Schema (validated at registration time). Failure mode: tool not found in registry; return TOOL_NOT_FOUND before attempting validation.
  3. Run the JSON Schema validator. Invariant: the validator implements the full draft specified in the schema’s $schema keyword. Failure mode: the validator library has bugs or incomplete draft support; use a well-tested library (Ajv for TypeScript, jsonschema for Python).
  4. Collect all validation errors. Invariant: errors include the JSON path to the failing property, the violated keyword, and a human-readable message. Failure mode: a single error aborts validation without checking remaining properties; configure the validator for “all errors” mode to collect everything.
  5. Return the validation result to the router. Invariant: a PASS result contains no errors; a FAIL result contains at least one error with full context. Failure mode: the validator returns ambiguous results (e.g., warnings instead of errors for required field violations); treat all violations as errors.
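
A hand-rolled sketch of this pipeline for flat object schemas, collecting all errors rather than stopping at the first. A production router should use a full validator such as Ajv with allErrors: true instead; this sketch only makes the steps concrete.

```typescript
interface FlatSchema {
  properties: Record<string, { type: "string" | "number" | "integer" | "boolean" }>;
  required: string[];
  // additionalProperties: false is implied throughout this sketch
}

// Returns every error found; an empty list means PASS.
function validateFlat(raw: string, schema: FlatSchema): string[] {
  // Step 1: strict parse of the model's argument string.
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (e) {
    return ["INVALID_JSON: " + (e as Error).message];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["/: expected an object"];
  }
  const args = parsed as Record<string, unknown>;
  const errors: string[] = [];

  // Required fields (step 4 in the diagram above).
  for (const key of schema.required) {
    if (!(key in args)) errors.push(`/${key}: required property missing`);
  }

  // Type checks and additional-property rejection.
  for (const [key, value] of Object.entries(args)) {
    const spec = schema.properties[key];
    if (!spec) {
      errors.push(`/: unexpected property: ${key}`); // additionalProperties: false
      continue;
    }
    const expected = spec.type === "integer" ? "number" : spec.type;
    if (typeof value !== expected) {
      errors.push(`/${key}: expected ${spec.type}, got ${typeof value}`);
    } else if (spec.type === "integer" && !Number.isInteger(value as number)) {
      errors.push(`/${key}: expected integer, got non-integer number`);
    }
  }
  return errors;
}
```

Note how one call can surface several independent errors at once, which is exactly the "all errors" behavior step 4 requires.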

Minimal concrete example

  Schema with conditional validation:

  {
    "type": "object",
    "properties": {
      "action": {
        "type": "string",
        "enum": ["search", "book", "cancel"]
      },
      "query": { "type": "string" },
      "booking_id": { "type": "string", "pattern": "^BK-[0-9]{6}$" },
      "reason": { "type": "string", "maxLength": 500 }
    },
    "required": ["action"],
    "additionalProperties": false,
    "if": { "properties": { "action": { "const": "cancel" } } },
    "then": { "required": ["booking_id", "reason"] }
  }

  Valid cancel call:
    { "action": "cancel", "booking_id": "BK-123456", "reason": "Changed plans" }
    -> PASS

  Invalid cancel call (missing reason):
    { "action": "cancel", "booking_id": "BK-123456" }
    -> FAIL: "reason" is required when action is "cancel"

Common misconceptions

  • “Schema validation is overkill; just check required fields.” Missing type checks, format validation, and additional property rejection creates silent failures downstream. Schema validation is cheap insurance.
  • “The model always generates valid JSON.” Models regularly produce malformed JSON, especially for complex nested schemas. Always parse before validating.
  • “additionalProperties should default to true for flexibility.” Allowing unexpected properties means the model can inject fields that downstream tools might interpret in unintended ways. Default to false for safety.
  • “Schema changes are backward-compatible.” Adding a new required field is a breaking change. Schema versioning is needed to handle transitions safely.

Check-your-understanding questions

  1. Why is additionalProperties: false important for tool parameter schemas?
  2. What is the difference between type checking and format validation for a date parameter?
  3. How should the router handle the case where the model generates invalid JSON (not just schema-invalid, but unparseable)?
  4. Why should validation collect all errors rather than stopping at the first one?

Check-your-understanding answers

  1. Without it, the model can include arbitrary extra fields that are not part of the tool’s interface. Downstream tools may ignore them (wasting tokens) or worse, interpret them in unexpected ways. Setting additionalProperties to false ensures the tool receives exactly the parameters it expects.
  2. Type checking confirms the value is a string (correct type). Format validation confirms the string content matches ISO 8601 date format (semantic correctness). A value like “not-a-date” passes type checking but fails format validation.
  3. Return a structured error with reason code INVALID_JSON, the raw string, and the parse error position. Feed this back to the model so it can retry with valid JSON. The router should not attempt to fix the JSON (no auto-correction), as this could silently change the intended arguments.
  4. Collecting all errors gives the model (or developer) complete feedback for a single correction cycle. If validation stops at the first error, the next attempt might fix that error but hit a second one, creating a frustrating multi-round correction loop.

Real-world applications

  • OpenAI’s structured outputs feature enforces JSON Schema on the server side, guaranteeing that model output matches the schema. This reduces client-side validation needs but does not eliminate them (the schema itself might be wrong).
  • API gateways (Kong, Apigee) validate request payloads against OpenAPI schemas before routing to backend services, using the same JSON Schema validation approach.
  • Terraform validates provider configurations against JSON Schema definitions before applying infrastructure changes.

Where you’ll apply it

  • Phase 2: building the schema validation component of the router.
  • Phase 3: adding schema versioning and migration support.

References

  • JSON Schema specification (json-schema.org) - draft 2020-12
  • Ajv (Another JSON Schema Validator) documentation for TypeScript/JavaScript
  • OpenAI Structured Outputs documentation
  • Anthropic Structured Outputs documentation (2025)
  • “Building LLM Apps” by Valentina Alto - chapters on structured output

Key insights JSON Schema validation is the firewall between the model’s probabilistic output and your deterministic tool execution; it is cheap, well-understood, and prevents an entire category of runtime errors.

Summary JSON Schema provides the contract language for tool parameters. Strict validation with additionalProperties: false, required field enforcement, type checking, format validation, and conditional schemas catches invalid arguments before they reach tool execution. Schema versioning handles tool evolution without breaking existing calls. The validation pipeline should collect all errors per call and return structured feedback.

Homework/Exercises to practice the concept

  • Write JSON Schemas for 3 tools: a read-only search (simple), a booking tool with nested passenger array (moderate), and a payment tool with conditional required fields based on payment method (complex).
  • Design the validation error format that your router uses internally. Include at minimum: error path, violated keyword, message, and schema version.
  • Trace through the validation of a tool call where the model provides a valid JSON string that fails 3 different schema constraints simultaneously. Show all 3 errors collected.

Solutions to the homework/exercises

  • Search tool: type object, properties (query: string, limit: integer with minimum 1 and maximum 100), required [query], additionalProperties false. Booking tool: type object, properties (flight_id: string, passengers: array of objects with name/age/seat properties, each validated), required [flight_id, passengers], additionalProperties false. Payment tool: type object with if/then for payment_method: if “credit_card” then require card_number (pattern-validated) and expiry; if “bank_transfer” then require account_number and routing_number.
  • Error format: { path: “/passengers/0/age” (JSON Pointer), keyword: “type” (schema keyword violated), message: “Expected integer, got string”, schema_version: “2.1.0”, tool_name: “booking_create” }. Each error is self-contained and actionable.
  • Example: model sends { “destination”: 42, “date”: “not-a-date”, “extra_field”: true }. Error 1: /destination, keyword: type, message: “Expected string, got number.” Error 2: /date, keyword: format, message: “Not a valid ISO 8601 date.” Error 3: /, keyword: additionalProperties, message: “Unexpected property: extra_field.”

Risk-Based Tool Gating and Policy Enforcement

Fundamentals Not all tool calls are equal in risk. A search query is read-only and harmless; a database deletion is irreversible and dangerous. Risk-based tool gating classifies tools by their potential impact and enforces access policies that match the risk level to the authorization context (user tier, environment, request metadata). The policy gate is the second layer of defense after schema validation: even if the arguments are structurally correct, the policy gate can deny execution based on who is requesting, what they are requesting, and under what circumstances. Without this layer, a structurally valid tool call from an unauthorized user or in an unsafe context would execute unchecked.

Deep Dive into the concept The risk classification taxonomy defines four levels based on reversibility and impact:

Read-only tools retrieve information without modifying state. Examples: search, fetch, query, list. These tools are generally safe to execute without additional authorization because they have no side effects. The main risk is information leakage (exposing data the user should not see), which is handled by data access policies rather than tool-level gating.

Mutating tools modify state but the changes are reversible. Examples: create (can delete), update (can revert), add to cart (can remove). These tools require validation that the user is authorized to make the change and that the change parameters are within acceptable bounds (e.g., updating a profile field vs. changing a security setting).

Irreversible tools make changes that cannot be easily undone. Examples: delete (hard delete), send email, post to social media, execute financial transfer. These tools should always require explicit authorization, often including a confirmation step or human-in-the-loop approval. The cost of a false positive (executing when it should not) is high.

Privileged tools affect system configuration, security settings, or administrative functions. Examples: modify permissions, create/delete users, change API keys, alter system configurations. These should be blocked entirely for most users and require multi-factor authorization for administrators.
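
The taxonomy above can be encoded directly in the tool registry so that every lookup carries a risk level. A minimal sketch (field names are this project's conventions, not a standard); note the fail-closed default for unclassified tools:

```typescript
// The four risk levels, ordered from least to most dangerous.
type RiskLevel = "read_only" | "mutating" | "irreversible" | "privileged";

interface RegistryEntry {
  name: string;
  riskLevel: RiskLevel;
  requiresApproval: boolean; // forces human-in-the-loop regardless of other policy
}

const registry: RegistryEntry[] = [
  { name: "search_products",    riskLevel: "read_only",    requiresApproval: false },
  { name: "create_booking",     riskLevel: "mutating",     requiresApproval: false },
  { name: "execute_payment",    riskLevel: "irreversible", requiresApproval: true },
  { name: "modify_permissions", riskLevel: "privileged",   requiresApproval: true },
];

// Unregistered or unclassified tools are treated as privileged (highest risk),
// so a fail-closed policy denies them by default.
function riskOf(toolName: string): RiskLevel {
  return registry.find(t => t.name === toolName)?.riskLevel ?? "privileged";
}
```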

Policy rules combine tool risk level with context variables to produce a decision. The policy engine evaluates rules in priority order and returns the first matching decision. A typical rule structure is:

  IF tool.risk_level == "irreversible"
    AND user.tier == "free"
    AND environment == "production"
  THEN DENY reason="Irreversible actions require premium tier"

The decision space has four outcomes: ALLOW (execute the tool), DENY (reject with reason, inform the user), ESCALATE (require human approval before execution), and ABSTAIN (the router cannot determine safety; fall back to a human decision). ABSTAIN is important because it prevents the router from making decisions in ambiguous situations.

Rate limiting adds a temporal dimension to policy. Even allowed tools should be rate-limited to prevent abuse. A user who calls the payment_transfer tool 100 times in a minute should be flagged and throttled regardless of individual call validity.
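
A minimal in-memory sketch of that temporal check, using a sliding window keyed by user and tool (a production router would back this with shared storage such as Redis; all names are illustrative):

```typescript
// Sliding-window rate limiter: at most maxCalls per windowMs, per user+tool key.
class SlidingWindowLimiter {
  private calls = new Map<string, number[]>(); // key -> timestamps (ms) of recent calls
  private maxCalls: number;
  private windowMs: number;

  constructor(maxCalls: number, windowMs: number) {
    this.maxCalls = maxCalls;
    this.windowMs = windowMs;
  }

  allow(userId: string, tool: string, now: number = Date.now()): boolean {
    const key = `${userId}:${tool}`;
    // Drop timestamps that have aged out of the window.
    const recent = (this.calls.get(key) ?? []).filter(t => now - t < this.windowMs);
    if (recent.length >= this.maxCalls) {
      this.calls.set(key, recent);
      return false; // over the limit: flag and throttle
    }
    recent.push(now);
    this.calls.set(key, recent);
    return true;
  }
}

// 100 calls per minute, as in the payment_transfer example above.
const paymentLimiter = new SlidingWindowLimiter(100, 60_000);
```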

Environment-aware policies distinguish between production, staging, and development environments. A tool call that is safe in development (where data is fake) may require stricter authorization in production (where data is real and actions have consequences).

Audit logging is not optional for policy decisions. Every ALLOW, DENY, and ESCALATE decision must be logged with: trace_id, timestamp, tool_name, user_id, user_tier, environment, risk_level, policy_rule_id (which rule matched), and the decision. This audit trail enables post-incident analysis (“who authorized that payment transfer?”) and compliance reporting.
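
The audit record can be captured as one typed entry per decision. This sketch uses an in-memory array as an append-only stand-in; a real sink would be durable and tamper-evident:

```typescript
// One record per policy decision, carrying the fields listed above.
interface AuditEntry {
  trace_id: string;
  timestamp: string;       // ISO 8601
  tool_name: string;
  user_id: string;
  user_tier: string;
  environment: string;
  risk_level: string;
  policy_rule_id: string;  // which rule matched
  decision: "ALLOW" | "DENY" | "ESCALATE" | "ABSTAIN";
}

const auditLog: AuditEntry[] = []; // in-memory stand-in for append-only storage

// Append-only: entries are only ever pushed, never updated or removed.
// A write failure here must be logged separately and must NOT block
// the routing decision itself.
function writeAudit(entry: AuditEntry): void {
  auditLog.push(entry);
}
```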

How this fits into the project This concept is the core of the policy gate component in Phase 2. After schema validation passes, the policy gate evaluates the tool call against the risk taxonomy and policy rules. The gate’s decision determines whether execution proceeds, is blocked, or requires human approval.

Definitions & key terms

  • Risk level: Classification of a tool’s potential impact: read_only, mutating, irreversible, privileged.
  • Policy rule: A conditional expression that maps a combination of tool risk, user context, and environment to a decision (ALLOW, DENY, ESCALATE, ABSTAIN).
  • Escalation: Routing a tool call to a human approver rather than auto-executing or auto-denying.
  • Principle of least privilege: Users should only have access to the minimum set of tools and permissions needed for their current task.
  • Fail closed: When no policy rule matches, the default decision is DENY (safe) rather than ALLOW (dangerous).

Mental model diagram (ASCII)

          RISK-BASED TOOL GATING ARCHITECTURE
          ====================================

  +-------------------+    +------------------+
  | Validated         |    | Policy Config    |
  | Tool Call         |    | (YAML)           |
  |                   |    |                  |
  | tool: payment_xfr |    | rules:           |
  | args: {valid}     |    |  - risk=irrev    |
  | user_tier: free   |    |    tier=free     |
  | env: production   |    |    -> DENY       |
  +--------+----------+    |  - risk=irrev    |
           |               |    tier=premium  |
           v               |    env=prod      |
  +========================|    -> ESCALATE   |
  | RISK CLASSIFIER        |  - risk=readonly |
  |                        |    -> ALLOW      |
  | payment_xfr ->         |  - default       |
  |   risk: IRREVERSIBLE   |    -> DENY       |
  +=========+==============+--------+---------+
            |                       |
            v                       v
  +==========================================+
  | POLICY ENGINE                            |
  |                                          |
  | Input:                                   |
  |   risk=IRREVERSIBLE, tier=free, env=prod |
  |                                          |
  | Rule evaluation (priority order):        |
  |   Rule 1: risk=irrev AND tier=free       |
  |     -> MATCH: DENY                       |
  |                                          |
  | Decision: DENY                           |
  | Reason: "Irreversible actions require    |
  |          premium tier in production"      |
  +==========================================+
            |
            v
  +---------+---------+----------+---------+
  |         |         |          |         |
  v         v         v          v         v
 ALLOW    DENY    ESCALATE   ABSTAIN   RATE
                  (human)    (fallback) LIMIT
  |         |         |          |
  v         v         v          v
 Execute  Return   Queue for   Return
 tool     error    approval    error
  |         |         |          |
  v         v         v          v
 +------------------------------------+
 | AUDIT LOG                          |
 | trace_id, decision, rule_id,       |
 | user, tool, timestamp, reason      |
 +------------------------------------+

How it works (step-by-step, with invariants and failure modes)

  1. Look up the tool’s risk level from the tool registry. Invariant: every registered tool has an explicit risk level. Failure mode: if the risk level is missing, treat as PRIVILEGED (highest risk) and DENY by default.
  2. Assemble the evaluation context: user_tier, environment, request metadata, rate limit state. Invariant: all context fields are authenticated and trusted (not user-supplied claims). Failure mode: if user_tier cannot be determined (auth failure), default to the lowest tier.
  3. Evaluate policy rules in priority order. Invariant: rules are deterministic and ordered; the first match wins. Failure mode: no rule matches; apply the default policy (DENY).
  4. Execute the decision. Invariant: DENY and ESCALATE never lead to tool execution in the same request. Failure mode: a race condition allows execution while escalation is pending (handle via state lock).
  5. Write the audit log entry. Invariant: the audit log is append-only and includes all decision context. Failure mode: audit log write failure should NOT block the decision (log the failure separately and continue).
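
The steps above can be condensed into a single evaluation function. This is a sketch, not a full engine: rule and context shapes are simplified from the YAML format that follows, and rate limiting and audit logging are left out:

```typescript
type Decision = "ALLOW" | "DENY" | "ESCALATE" | "ABSTAIN";

interface PolicyRule {
  id: string;
  // Absent condition fields match anything; present fields must match.
  condition: { risk_level?: string; user_tier?: string[]; environment?: string };
  decision: Decision;
  reason?: string;
}

interface EvalContext { risk_level: string; user_tier: string; environment: string }

interface PolicyResult { decision: Decision; ruleId: string; reason?: string }

// Rules are evaluated in priority order; the first match wins.
// No match at all -> fail closed with DENY.
function evaluatePolicy(rules: PolicyRule[], ctx: EvalContext): PolicyResult {
  for (const rule of rules) {
    const c = rule.condition;
    if (c.risk_level !== undefined && c.risk_level !== ctx.risk_level) continue;
    if (c.user_tier !== undefined && !c.user_tier.includes(ctx.user_tier)) continue;
    if (c.environment !== undefined && c.environment !== ctx.environment) continue;
    return { decision: rule.decision, ruleId: rule.id, reason: rule.reason };
  }
  return { decision: "DENY", ruleId: "default", reason: "No matching rule (fail closed)" };
}

// Rules mirroring part of the YAML example below.
const rules: PolicyRule[] = [
  { id: "allow_readonly", condition: { risk_level: "read_only" }, decision: "ALLOW" },
  { id: "escalate_irreversible_premium",
    condition: { risk_level: "irreversible", user_tier: ["premium", "enterprise"] },
    decision: "ESCALATE" },
  { id: "deny_irreversible_free",
    condition: { risk_level: "irreversible", user_tier: ["free"] },
    decision: "DENY", reason: "Irreversible actions require premium tier" },
];
```

An irreversible call from a free-tier user matches deny_irreversible_free, while a privileged call matches nothing and falls through to the fail-closed default.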

Minimal concrete example

  Policy configuration (YAML):

  default_decision: DENY

  rules:
    - id: allow_readonly
      condition:
        risk_level: read_only
      decision: ALLOW

    - id: allow_mutating_premium_prod
      condition:
        risk_level: mutating
        user_tier: [premium, enterprise]
        environment: production
      decision: ALLOW

    - id: escalate_irreversible_premium
      condition:
        risk_level: irreversible
        user_tier: [premium, enterprise]
      decision: ESCALATE
      escalation_target: "ops-team"

    - id: deny_irreversible_free
      condition:
        risk_level: irreversible
        user_tier: free
      decision: DENY
      reason: "Irreversible actions require premium tier"

    - id: deny_privileged_all
      condition:
        risk_level: privileged
      decision: DENY
      reason: "Privileged actions blocked via tool router"

  Trace evaluation for: tool=payment_xfr, risk=irreversible,
                         user_tier=free, env=production

  Rule 1 (allow_readonly): risk != read_only -> skip
  Rule 2 (allow_mutating_premium_prod): risk != mutating -> skip
  Rule 3 (escalate_irreversible_premium): tier != premium -> skip
  Rule 4 (deny_irreversible_free): MATCH
    -> Decision: DENY, reason: "Irreversible actions require premium tier"

Common misconceptions

  • “Schema validation is sufficient for safety; policy gates are redundant.” Schema validates structure; policy validates authorization. A structurally valid “delete all users” call should be blocked by policy even though its arguments are schema-valid.
  • “All users should have access to all tools.” This violates least privilege. Users should only see and use tools appropriate to their authorization level.
  • “Policy rules should be hardcoded.” Hardcoded rules are inflexible and require code deployments to change. Declarative policy configuration (YAML/JSON) enables rapid iteration and audit trail for policy changes.
  • “ESCALATE is just a slow ALLOW.” ESCALATE routes to a human decision-maker who can ALLOW or DENY. The outcome is uncertain, and the tool does not execute until the human decides. It is fundamentally different from a delayed ALLOW.
  • “Rate limiting is a separate concern from policy.” Rate limiting is a temporal policy dimension. A user who makes 100 valid tool calls per minute may be abusing the system even though each individual call is authorized.

Check-your-understanding questions

  1. Why should the default policy for unmatched rules be DENY rather than ALLOW?
  2. What is the difference between a DENY and an ESCALATE decision for irreversible tools?
  3. How does environment context change the policy evaluation for the same tool call?
  4. Why must audit logs be append-only?

Check-your-understanding answers

  1. DENY-by-default (fail closed) ensures that newly added tools or unexpected combinations cannot be exploited. If a tool is added to the registry without corresponding policy rules, it is blocked until someone explicitly writes a rule to allow it. This follows the principle of least privilege.
  2. DENY immediately blocks execution and returns an error to the user. ESCALATE pauses execution and routes the request to a human approver who makes the final decision. ESCALATE is appropriate when the action might be legitimate but requires human judgment.
  3. The same tool call might be ALLOW in development (fake data, no real impact), ESCALATE in staging (real-ish data, limited impact), and DENY in production (real data, real consequences). Environment context lets you enforce progressively stricter policies as you move toward production.
  4. Append-only audit logs prevent retroactive modification of decision records, which is essential for compliance and post-incident forensics. If someone could edit the audit log, they could cover up unauthorized tool executions.

Real-world applications

  • Cloud platforms (AWS IAM, GCP IAP) use policy engines to evaluate whether API calls are authorized based on caller identity, resource, action, and conditions.
  • Payment processors enforce risk-based gating: small transactions auto-approve, medium transactions flag for review, large transactions require multi-factor authorization.
  • Enterprise AI platforms (Microsoft Copilot, Salesforce Einstein) enforce organizational policies on AI agent tool usage to prevent data leakage and unauthorized actions.

Where you’ll apply it

  • Phase 2: implementing the policy gate with risk classification and rule evaluation.
  • Phase 3: adding rate limiting, environment-aware policies, and escalation routing.

References

  • “Security Engineering” by Ross Anderson - access control and policy design
  • “Site Reliability Engineering” by Google - Ch. 6 on monitoring, Ch. 14 on managing incidents
  • OWASP LLM Top 10 - tool misuse and insecure plugin design
  • “Systems Security Foundations for Agentic Computing” (2025 paper on agent sandboxing)
  • “Design Patterns to Secure LLM Agents In Action” (2025 lab report)

Key insights Risk classification separates the question “is this call valid?” (schema) from the question “should this call execute?” (policy), and both questions must be answered before any tool touches the real world.

Summary Risk-based tool gating classifies tools by impact (read-only, mutating, irreversible, privileged) and evaluates policy rules that combine risk level, user context, and environment to produce ALLOW/DENY/ESCALATE decisions. Policies should be declarative (YAML), fail closed by default, and produce append-only audit logs. Rate limiting adds a temporal dimension. The policy gate operates after schema validation, creating a two-layer defense between the model’s output and actual tool execution.

Homework/Exercises to practice the concept

  • Design a risk classification for 8 tools spanning all four risk levels. For each tool, justify the classification.
  • Write a YAML policy file with at least 6 rules covering: read-only allow-all, mutating by tier, irreversible with escalation, and privileged deny-all.
  • Trace through the policy evaluation for 3 tool calls with different risk/tier/environment combinations, showing which rule matches for each.

Solutions to the homework/exercises

  • Example classification: search_products (read_only: retrieves data, no side effects), get_user_profile (read_only: but may need data access policy), create_booking (mutating: creates a record that can be cancelled), update_booking (mutating: modifies existing record), send_email (irreversible: email cannot be unsent), execute_payment (irreversible: financial transfer), delete_account (irreversible: permanent data loss), modify_permissions (privileged: changes authorization structure).
  • The YAML should have rules ordered by specificity (most specific first, most general last) with a default DENY at the bottom. Each rule should have an id, condition block, decision, and optional reason/escalation_target.
  • Trace example: (1) search_products, free user, production -> matches allow_readonly -> ALLOW. (2) execute_payment, premium user, production -> matches escalate_irreversible_premium -> ESCALATE to ops-team. (3) modify_permissions, enterprise user, production -> matches deny_privileged_all -> DENY.

Model Context Protocol (MCP) and Tool Interoperability

Fundamentals The Model Context Protocol (MCP) is an open standard introduced by Anthropic in November 2024 that standardizes how AI applications integrate with external tools, data sources, and systems. Before MCP, every LLM provider and application implemented tool integration differently, creating an N x M integration problem: N applications each needed custom integrations with M tools. MCP solves this by defining a universal protocol that any application and any tool can implement, reducing the problem to N + M implementations. Understanding MCP is essential for this project because it represents the direction of tool interoperability and provides design patterns for your router’s tool registry and execution lifecycle.

Deep Dive into the concept MCP follows a client-server architecture. The MCP client runs inside the AI application (or your tool router) and communicates with MCP servers that wrap external tools and data sources. The protocol defines three main primitives:

Tools are executable functions that the model can invoke. Each MCP tool has a name, description, and input schema (JSON Schema). When the model decides to call a tool, the MCP client sends the call to the appropriate MCP server, which executes the tool and returns the result. This maps directly to the tool-calling lifecycle you are building in the router.

Resources are data sources that provide context to the model. Unlike tools, resources are read-only and do not have side effects. They provide structured data that the application can include in the model’s context. Resources are identified by URIs and can be static or dynamic.

Prompts are predefined templates that MCP servers can expose. These are pre-built prompt fragments that the application can use to construct messages to the model. This primitive is less relevant to the tool router but important for understanding the full MCP ecosystem.

The November 2025 specification introduced significant enhancements: modern authorization (OAuth 2.1 with PKCE), asynchronous execution for long-running tools, streamable HTTP transport (replacing the previous SSE-based transport), and structured error reporting. The authorization model is particularly relevant for the tool router: MCP servers declare their required permissions in a manifest file, and clients must obtain user consent before connecting.

MCP’s security model is relevant to your router design. Each MCP server runs as a separate process with its own security boundary. The server’s manifest declares what permissions it needs (filesystem access, network access, etc.), and the client requests user consent before granting these permissions. This capability-based security model aligns with the principle of least privilege: each tool server only has access to what it needs.

MCP adoption has accelerated rapidly. OpenAI, Google DeepMind, and major tool providers now support MCP, making it the de facto standard for AI tool integration. Building your router with MCP compatibility (or at least MCP-inspired patterns) ensures it can participate in this ecosystem.

For your router, MCP provides two key design patterns. First, the tool definition format (name, description, input_schema) that your registry should adopt. Second, the lifecycle pattern (discover tools, select tool, validate arguments, execute, return result) that your router pipeline should follow.

How this fits into the project MCP provides the standards-based design patterns for your tool registry and execution lifecycle. Even if you do not implement a full MCP server, adopting MCP-compatible tool definitions and lifecycle patterns ensures your router can integrate with the broader ecosystem.

Definitions & key terms

  • MCP (Model Context Protocol): An open standard for AI application integration with external tools and data sources.
  • MCP Client: The component in the AI application that communicates with MCP servers.
  • MCP Server: A process that wraps external tools and exposes them through the MCP protocol.
  • MCP Manifest: A JSON file describing an MCP server’s capabilities and required permissions.
  • Capability-based security: A security model where access is granted through explicit capabilities (permissions) rather than identity-based rules.

Mental model diagram (ASCII)

              MCP ARCHITECTURE AND TOOL ROUTER FIT
              =====================================

  +---------------------------------------------+
  |          AI APPLICATION / AGENT              |
  |                                              |
  |  +--------+    +-------------------------+  |
  |  |  LLM   |--->|  TOOL ROUTER (P06)      |  |
  |  |        |    |                         |  |
  |  +--------+    |  +---------+  +-------+ |  |
  |                |  | Schema  |  | Policy| |  |
  |                |  | Validate|  | Gate  | |  |
  |                |  +---------+  +-------+ |  |
  |                |         |               |  |
  |                |    MCP Client           |  |
  |                +----+----+----+----------+  |
  |                     |    |    |              |
  +---------------------|----|----|--------------+
                        |    |    |
          MCP Protocol  |    |    |  (JSON-RPC over
          (standardized)|    |    |   Streamable HTTP)
                        v    v    v
              +---------+  +-+--+ +----------+
              |MCP Server| |MCP | |MCP Server|
              |          | |Serv| |          |
              | Travel   | |er  | | Payment  |
              | API      | |    | | Gateway  |
              |          | |DB  | |          |
              | manifest:| |    | | manifest:|
              |  read    | |read| |  read,   |
              |  network | |only| |  write,  |
              +----------+ +----+ |  network |
                                  +----------+
  TOOL DEFINITION (MCP-COMPATIBLE):
  +-----------------------------------------+
  | name: "travel_search"                   |
  | description: "Search for flights..."     |
  | inputSchema: {                          |
  |   type: "object",                       |
  |   properties: { ... },                  |
  |   required: [...],                      |
  |   additionalProperties: false           |
  | }                                       |
  | ---- router extensions ----             |
  | risk_level: "read_only"                 |
  | requires_approval: false                |
  | rate_limit: { max: 100, window: "1m" } |
  +-----------------------------------------+

How it works (step-by-step, with invariants and failure modes)

  1. MCP servers register with the client, providing their manifest (capabilities, required permissions). Invariant: the client validates the manifest schema before accepting the server. Failure mode: malformed manifest rejected at connection time.
  2. The client discovers available tools from all connected MCP servers. Invariant: tool names are globally unique (namespaced by server). Failure mode: name collision between servers; the client must detect and reject or namespace.
  3. Tool definitions from MCP servers are loaded into the router’s registry with additional metadata (risk level, policy rules). Invariant: every MCP tool has a corresponding risk classification in the router. Failure mode: a new MCP server provides tools without risk classifications; the router blocks them until classified.
  4. When a tool call arrives, the router validates and gates as normal, then dispatches to the appropriate MCP server. Invariant: the MCP server receives only validated, policy-approved calls. Failure mode: MCP server is unreachable; return a structured error with retry guidance.
  5. The MCP server executes the tool and returns the result through the protocol. Invariant: results conform to the MCP response format. Failure mode: server returns an error or times out; the router wraps this in a structured error for the model.
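
The discovery-and-merge step, with the name-collision handling from step 2, can be sketched as follows. Manifest fetching is replaced by in-memory data, and the namespacing scheme ("server/tool") is one reasonable choice, not something MCP mandates:

```typescript
// Merge tool lists from several MCP servers into one registry,
// namespacing colliding names as "server/tool".
interface McpTool { name: string; description: string }
interface ServerManifest { name: string; tools: McpTool[] }

function mergeRegistries(servers: ServerManifest[]): Map<string, McpTool> {
  // Pass 1: count how many servers expose each bare tool name.
  const counts = new Map<string, number>();
  for (const s of servers) {
    for (const t of s.tools) counts.set(t.name, (counts.get(t.name) ?? 0) + 1);
  }
  // Pass 2: register tools, namespacing every colliding name with its server.
  const registry = new Map<string, McpTool>();
  for (const s of servers) {
    for (const t of s.tools) {
      const key = (counts.get(t.name) ?? 0) > 1 ? `${s.name}/${t.name}` : t.name;
      if (registry.has(key)) {
        // Same server exposing duplicate names: reject rather than overwrite.
        throw new Error(`Duplicate tool key: ${key}`);
      }
      registry.set(key, t);
    }
  }
  return registry;
}
```

Namespacing both colliding entries (rather than letting the first registration keep the bare name) keeps tool resolution order-independent across server connections.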

Minimal concrete example

  MCP Server manifest (JSON):

  {
    "name": "travel-service",
    "version": "1.2.0",
    "description": "Flight and hotel search",
    "capabilities": {
      "tools": true,
      "resources": false,
      "prompts": false
    },
    "permissions": {
      "network": ["api.travel-provider.com"],
      "filesystem": []
    },
    "tools": [
      {
        "name": "travel_search",
        "description": "Search flights by destination and date",
        "inputSchema": {
          "type": "object",
          "properties": {
            "destination": { "type": "string" },
            "date": { "type": "string", "format": "date" }
          },
          "required": ["destination", "date"]
        }
      }
    ]
  }

Common misconceptions

  • “MCP replaces the need for a custom router.” MCP standardizes the communication protocol but does not provide policy gating, risk classification, or audit logging. Your router adds these essential layers on top of MCP.
  • “MCP tools are automatically safe to execute.” MCP servers declare permissions, but the client (your router) must enforce policy. A malicious or misconfigured MCP server could request excessive permissions.
  • “MCP is only for Anthropic’s Claude.” MCP is provider-agnostic. OpenAI, Google, and many other providers have adopted it. It standardizes the tool interface, not the model.
  • “MCP means you do not need JSON Schema validation.” MCP uses JSON Schema for tool definitions, but your router should still validate arguments client-side before sending to the MCP server. Defense in depth.

Check-your-understanding questions

  1. How does MCP reduce the integration complexity from N x M to N + M?
  2. What is the role of the MCP manifest’s permissions field in security?
  3. Why should the router validate arguments before sending them to an MCP server, even though the server also validates?
  4. How do MCP tool definitions map to your router’s tool registry?

Check-your-understanding answers

  1. Without MCP, N AI applications each need custom integrations with M tools (N x M total integrations). With MCP, each application implements one MCP client (N implementations) and each tool implements one MCP server (M implementations). Any client can communicate with any server through the standard protocol.
  2. The permissions field declares what system resources the MCP server needs (network access, filesystem access, etc.). The MCP client shows these to the user for consent before connecting. This implements capability-based security: the server only gets the permissions it declared and the user approved.
  3. Defense in depth. The server’s validation is the last line of defense, but errors caught at the router level avoid the network round-trip to the server and provide faster feedback. Also, the router may have stricter policies than the server (e.g., the server allows any destination, but the router’s policy restricts to approved regions).
  4. MCP tool definitions (name, description, inputSchema) become the base of your registry entries. The router extends each entry with risk_level, policy_rules, rate_limits, and other metadata that MCP does not define. The registry is a superset of MCP tool definitions.

Real-world applications

  • Cursor, Zed, and other AI coding tools use MCP to connect to external services (GitHub, databases, documentation) through standardized MCP servers.
  • Enterprise platforms are adopting MCP for internal tool governance, with MCP servers wrapping internal APIs and the AI platform enforcing organizational policies.
  • The MCP ecosystem includes a growing registry of open-source MCP servers for common tools (Slack, GitHub, databases, cloud providers).

Where you’ll apply it

  • Phase 1: designing the tool registry with MCP-compatible definitions.
  • Phase 3: adding MCP client support for discovering and calling external tools.

References

  • Model Context Protocol specification (modelcontextprotocol.io) - November 2025 revision
  • MCP GitHub repository (github.com/modelcontextprotocol)
  • “One Year of MCP” blog post on the protocol’s evolution and adoption
  • Anthropic MCP documentation (docs.anthropic.com)
  • “2026: The Year for Enterprise-Ready MCP Adoption” (industry analysis)

Key insights MCP solves the integration problem (how tools connect) but not the governance problem (which tools should execute); your router provides the governance layer that MCP intentionally leaves to the client.

Summary MCP standardizes AI tool integration through a client-server protocol with tool definitions, resources, and prompts. The protocol provides the transport and discovery layer, while your router provides the governance layer (schema validation, policy gating, audit logging). MCP-compatible tool definitions serve as the base for your router’s registry, extended with risk classifications and policy rules. Adopting MCP patterns ensures your router can participate in the growing ecosystem of AI tool integrations.

Homework/Exercises to practice the concept

  • Design a registry schema that combines MCP tool definitions with router extensions (risk_level, policy_rules, rate_limits). Show how an MCP server’s manifest maps into your registry.
  • Compare the tool-calling lifecycle in your router with the MCP lifecycle. Identify where your router adds steps that MCP does not define.
  • Write a pseudocode MCP client that discovers tools from two MCP servers, merges them into a single registry, and handles name collisions.

Solutions to the homework/exercises

  • Registry entry: { mcp_server: “travel-service”, name: “travel_search”, description: “…”, inputSchema: {…}, risk_level: “read_only”, policy_rules: [“allow_readonly”], rate_limit: { max: 100, window: “1m” }, schema_version: “1.2.0”, last_updated: “2025-11-01” }. The MCP manifest’s tools array maps directly to the name/description/inputSchema fields. Router extensions (risk_level, policy_rules, rate_limit) are added during registration.
  • MCP lifecycle: discover -> call -> result. Router lifecycle: discover -> normalize -> validate schema -> evaluate policy -> rate limit check -> call -> capture result -> audit log. The router adds: normalization, schema validation, policy evaluation, rate limiting, and audit logging.
  • The pseudocode should: (1) connect to both servers and fetch manifests, (2) iterate over all tools from both servers, (3) namespace tool names with server name if collisions occur (e.g., “travel-service/search” vs “hotel-service/search”), (4) register each tool with the router’s registry, (5) log any collisions as warnings.

3. Project Specification

3.1 What You Will Build

A runtime tool-router API that maps intent to tool calls with strict schema and policy gates.

3.2 Functional Requirements

  1. Accept intent payloads and classify as direct answer, tool call, or abstain.
  2. Validate generated tool arguments against JSON schemas before execution.
  3. Apply policy gates by user tier, action risk, and environment.
  4. Return machine-readable routing decision with trace id.

3.3 Non-Functional Requirements

  • Performance: p95 routing latency below 250 ms excluding downstream tool execution.
  • Reliability: Same input/context/policy version yields the same routing decision.
  • Security/Policy: High-risk actions require explicit deny or escalation; never silent allow.

3.4 Example Usage / Output

$ npm run dev --workspace p06-tool-router
[ready] listening on http://localhost:3000

$ curl -s http://localhost:3000/v1/route \
  -H 'content-type: application/json' \
  -d '{
  "user_intent": "book me a flight to NYC next Friday",
  "context": {"user_tier": "free", "region": "US"}
}' | jq
{
  "decision": "TOOL_CALL",
  "tool_name": "travel_search",
  "arguments": {"destination": "NYC", "date": "2026-02-20"},
  "confidence": 0.91,
  "trace_id": "trc_p06_001"
}

3.5 Data Formats / Schemas / Protocols

  • Route request JSON: intent, context, optional tool preference.
  • Route response JSON: decision enum, tool, args, confidence, trace_id.
  • Policy file YAML: per-tool allow/deny/escalate rules by risk level.
  • Tool registry JSON: MCP-compatible definitions + risk level extensions.
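
The request/response shapes above can be pinned down as TypeScript types. Field names mirror the curl examples in this guide; treat them as a starting point rather than a fixed contract:

```typescript
// All routing outcomes, kept explicit and exhaustive.
const DECISIONS = ["TOOL_CALL", "DIRECT_ANSWER", "ABSTAIN", "DENIED", "ESCALATED"] as const;
type Decision = (typeof DECISIONS)[number];

interface RouteRequest {
  user_intent: string;
  context: { user_tier: string; region: string; environment?: string };
  tool_preference?: string;
}

interface RouteResponse {
  decision: Decision;
  tool_name: string | null;            // null for DIRECT_ANSWER / ABSTAIN
  arguments: Record<string, unknown> | null;
  confidence: number;
  trace_id: string;
}

// Runtime guard for the decision enum (useful when parsing persisted traces).
function isDecision(x: string): x is Decision {
  return (DECISIONS as readonly string[]).includes(x);
}
```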

3.6 Edge Cases

  • Intent maps to multiple tools with similar confidence.
  • Tool args pass schema but violate business policy.
  • User requests prohibited action with persuasive wording.
  • Tool schema changes without router refresh.
  • MCP server becomes unreachable mid-request.

3.7 Real World Outcome

This project is complete when your API can serve valid requests with typed responses and reject invalid/high-risk requests with a unified error shape.

3.7.1 How to Run (Copy/Paste)

$ npm run dev --workspace p06-tool-router

3.7.2 Golden Path Demo (Deterministic)

Use fixed fixture payloads and verify the same response shape and decision fields every run.

3.7.3 API Endpoints

| Method | Endpoint | Purpose |
|--------|----------|---------|
| POST | /v1/route | Route intent to tool call with validation and policy |
| GET | /v1/tools | List registered tools with risk levels |
| GET | /v1/health | Router health check |

3.7.4 Success Response Example

$ curl -s http://localhost:3000/v1/route \
  -H 'content-type: application/json' \
  -d '{
  "user_intent": "book me a flight to NYC next Friday",
  "context": {"user_tier": "free", "region": "US"}
}' | jq
{
  "decision": "TOOL_CALL",
  "tool_name": "travel_search",
  "arguments": {"destination": "NYC", "date": "2026-02-20"},
  "confidence": 0.91,
  "trace_id": "trc_p06_001"
}

3.7.5 Error Response Example

$ curl -s http://localhost:3000/v1/route \
  -H 'content-type: application/json' \
  -d '{
  "user_intent": "wire $20,000 to external account now",
  "context": {"user_tier": "free", "region": "US"}
}' | jq
{
  "error": {
    "code": "POLICY_BLOCKED",
    "message": "Human approval required before executing this action.",
    "trace_id": "trc_01J...",
    "risk_level": "irreversible",
    "matched_rule": "deny_irreversible_free",
    "project": "P06"
  }
}

4. Solution Architecture

4.1 High-Level Design

              TOOL ROUTER ARCHITECTURE
              ========================

  HTTP Request (POST /v1/route)
  { intent, context, tool_preference? }
         |
         v
  +---------------------------+
  |  REQUEST NORMALIZER       |
  |  - Parse and validate     |
  |  - Attach trace_id        |
  |  - Authenticate context   |
  +---------------------------+
         |
         v
  +---------------------------+      +-----------------+
  |  INTENT CLASSIFIER        |      | Tool Registry   |
  |  - Map intent to tool(s)  |<---->| (MCP-compat)    |
  |  - Confidence scoring     |      | - definitions   |
  |  - Abstain if ambiguous   |      | - risk levels   |
  +---------------------------+      +-----------------+
         |
         | candidate: { tool, args, confidence }
         v
  +---------------------------+      +-----------------+
  |  SCHEMA VALIDATOR         |      | JSON Schemas    |
  |  - Validate args vs schema|<---->| (per tool)      |
  |  - Collect all errors     |      +-----------------+
  |  - Return PASS/FAIL       |
  +---------------------------+
         |
         | validated args
         v
  +---------------------------+      +-----------------+
  |  POLICY GATE              |      | Policy Rules    |
  |  - Risk classification    |<---->| (YAML)          |
  |  - Rule evaluation        |      | - per tool/tier |
  |  - Rate limit check       |      | - per env       |
  |  - ALLOW/DENY/ESCALATE    |      +-----------------+
  +---------------------------+
         |
    +----+----+----------+
    |         |          |
    v         v          v
  ALLOW     DENY      ESCALATE
    |         |          |
    v         v          v
  Execute   Return    Queue for
  tool      error     approval
    |         |          |
    v         v          v
  +---------------------------+
  |  TRACE LOGGER             |
  |  - Append-only audit log  |
  |  - trace_id, decision,    |
  |    rule_id, timestamp     |
  +---------------------------+
         |
         v
  HTTP Response
  { decision, tool, args, trace_id }

4.2 Key Components

| Component | Responsibility | Key Decisions |
|-----------|----------------|---------------|
| Intent Classifier | Maps natural language intent to candidate tool(s). | Emit ABSTAIN when no tool scores above the confidence threshold. |
| Schema Validator | Validates generated arguments against JSON Schema. | Use strict mode: additionalProperties false, all-errors collection. |
| Policy Gate | Applies risk-based rules and tier/environment checks. | Fail closed (DENY by default). Append-only audit log. |
| Trace Logger | Records every routing decision for audit and replay. | Include all context: intent, tool, args, decision, rule_id, latency. |
| Tool Registry | Stores MCP-compatible tool definitions + risk metadata. | Versioned. Auto-reload on change. Namespace collision detection. |

4.3 Data Structures (No Full Code)

ToolDefinition:
  - name: string
  - description: string
  - inputSchema: JSONSchema
  - risk_level: enum(read_only, mutating, irreversible, privileged)
  - requires_approval: boolean
  - rate_limit: { max: number, window: string }
  - schema_version: string
  - mcp_server: string (optional)

RouteRequest:
  - user_intent: string
  - context: { user_tier, region, environment, ... }
  - tool_preference: string (optional)
  - trace_id: string (auto-generated)

RouteDecision:
  - decision: enum(TOOL_CALL, DIRECT_ANSWER, ABSTAIN, DENIED, ESCALATED)
  - tool_name: string (null if DIRECT_ANSWER/ABSTAIN)
  - arguments: object (null if not TOOL_CALL)
  - confidence: float
  - validation_result: { status, errors[] }
  - policy_result: { decision, rule_id, reason }
  - trace_id: string
  - latency_ms: number

TraceEntry:
  - trace_id: string
  - timestamp: datetime
  - intent: string
  - tool_name: string
  - decision: string
  - rule_id: string
  - user_tier: string
  - environment: string
  - risk_level: string
  - latency_ms: number
  - validation_errors: string[] (if any)
  - policy_reason: string (if denied/escalated)

4.4 Algorithm Overview

Key algorithm: Intent-to-tool routing with validation and policy pipeline

  1. Parse and normalize the incoming request. Attach trace_id and authenticate context.
  2. Classify the intent: match it against tool descriptions and score candidates by confidence. If no candidate exceeds the threshold, return ABSTAIN or DIRECT_ANSWER.
  3. For the top candidate: validate generated arguments against the tool’s JSON Schema. If validation fails, return structured error with all violations.
  4. If validation passes: evaluate policy rules using risk level, user tier, and environment. Return ALLOW, DENY, or ESCALATE.
  5. If ALLOW: execute the tool (or mock in the project). If DENY/ESCALATE: return the decision with reason.
  6. Log the trace entry with all decision context.
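
The six steps above can be sketched as one pipeline function. The `classify`, `validate`, `policy`, and `log` hooks are hypothetical stand-ins for the components described in section 4.2, and the 0.7 confidence threshold is illustrative:

```typescript
type Outcome = "TOOL_CALL" | "ABSTAIN" | "DENIED" | "ESCALATED";
interface Candidate { tool: string; args: Record<string, unknown>; confidence: number }

function route(
  intent: string,
  classify: (i: string) => Candidate | null,                 // step 2
  validate: (c: Candidate) => string[],                      // step 3: violations
  policy: (c: Candidate) => "ALLOW" | "DENY" | "ESCALATE",   // step 4
  log: (entry: object) => void,                              // step 6
): { decision: Outcome; errors?: string[] } {
  const candidate = classify(intent);
  if (!candidate || candidate.confidence < 0.7) {
    log({ intent, decision: "ABSTAIN" });
    return { decision: "ABSTAIN" };
  }
  const errors = validate(candidate);
  if (errors.length > 0) {
    log({ intent, tool: candidate.tool, decision: "DENIED", errors });
    return { decision: "DENIED", errors };
  }
  const verdict = policy(candidate);
  const decision: Outcome =
    verdict === "ALLOW" ? "TOOL_CALL" : verdict === "DENY" ? "DENIED" : "ESCALATED";
  log({ intent, tool: candidate.tool, decision });
  return { decision };
}
```

Note that every path, including ABSTAIN, writes a trace entry: the audit log is unconditional.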

Complexity Analysis (conceptual):

  • Time: O(T) for intent classification where T = number of registered tools. O(P) for policy evaluation where P = number of rules. Schema validation is O(1) with respect to registry size (its cost depends only on the schema and arguments, which are fixed per call).
  • Space: O(T) for tool registry. O(1) per request for routing state.

5. Implementation Guide

5.1 Development Environment Setup

# 1) Install Node.js 18+ and TypeScript
# 2) Initialize workspace with p06-tool-router package
# 3) Install Ajv for JSON Schema validation
# 4) Prepare tool registry under registry/
# 5) Prepare policy rules under policies/
# 6) Run: npm run dev --workspace p06-tool-router

5.2 Project Structure

p06/
├── src/
│   ├── server.ts           # HTTP server entry point
│   ├── router.ts           # Main routing pipeline
│   ├── classifier.ts       # Intent-to-tool classification
│   ├── validator.ts        # JSON Schema validation
│   ├── policy-gate.ts      # Risk-based policy evaluation
│   ├── trace-logger.ts     # Append-only audit logging
│   └── registry.ts         # Tool registry management
├── registry/
│   ├── tools.json          # Tool definitions (MCP-compatible)
│   └── schemas/            # Per-tool JSON Schemas
├── policies/
│   └── rules.yaml          # Policy rules by risk/tier/env
├── fixtures/
│   ├── golden_case.json    # Happy path test payloads
│   └── failure_case.json   # Error path test payloads
├── out/
│   └── traces/             # Audit log output
└── README.md

5.3 The Core Question You’re Answering

“How does the system choose whether to call a tool, and which one, without unsafe side effects?”

This question matters because it forces you to build a system that makes explicit, auditable decisions rather than blindly executing whatever the model suggests.

5.4 Concepts You Must Understand First

  1. Tool/function calling architecture across providers
    • How do OpenAI, Anthropic, and Google implement tool calling?
    • Book Reference: “AI Engineering” by Chip Huyen - agent architecture chapters
  2. JSON Schema for parameter validation
    • Why is additionalProperties: false essential for tool schemas?
    • Book Reference: JSON Schema specification (json-schema.org)
  3. Risk-based policy and access control
    • How do you classify tools by risk and enforce authorization?
    • Book Reference: “Security Engineering” by Ross Anderson - access control chapters
  4. Model Context Protocol (MCP) and interoperability
    • How does MCP standardize tool discovery and invocation?
    • Book Reference: MCP specification (modelcontextprotocol.io)

5.5 Questions to Guide Your Design

  1. Routing decisions
    • How do you handle ambiguous intents that map to multiple tools?
    • What confidence threshold triggers ABSTAIN vs best-guess routing?
    • Should the router support parallel tool calls?
  2. Validation strictness
    • Should the router attempt to coerce types (string “42” to int 42) or reject strictly?
    • How do you handle schema version mismatches between registry and model?
    • What happens when validation passes but the arguments are semantically wrong?
  3. Policy design
    • Should policies be per-tool, per-user, per-environment, or all three?
    • How do you handle policy rule conflicts (tool is allowed by one rule, denied by another)?
    • What is the escalation path for denied but potentially legitimate requests?

5.6 Thinking Exercise

Pre-Mortem for Tool Router

Before implementing, write down 10 ways this project can fail in production. Classify each failure into: classification, validation, policy, security, or operations.

Questions to answer:

  • Which failures can be prevented at schema design time?
  • Which failures require runtime monitoring?
  • What happens when the model confidently calls the wrong tool?

5.7 The Interview Questions They’ll Ask

  1. “How do you handle ambiguity between two plausible tool calls?”
  2. “Why must schema validation happen before tool execution, not after?”
  3. “What information should a routing trace contain for incident investigation?”
  4. “How do you blend model confidence with hard policy constraints?”
  5. “What is your fallback behavior when no tool is safe to call?”
  6. “How does MCP change the architecture of tool integration?”

5.8 Hints in Layers

Hint 1: Define the decision enum first. Keep outcomes explicit and exhaustive: TOOL_CALL, DIRECT_ANSWER, ABSTAIN, DENIED, ESCALATED. Every request gets exactly one outcome.

Hint 2: Version everything. Schema hash, policy version, and registry version should be logged with every trace. Without this, decisions are not reproducible.

Hint 3: Treat policy as data, not code. Keep rules in a declarative YAML file that can be reviewed, diffed, and rolled back independently of code deployments.

Hint 4: Build trace replay from day one. Design traces so that any decision can be replayed with the same inputs and verified. This is invaluable for debugging and incident response.
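
One way to sketch replay verification: re-run a pure decision function on each recorded trace and collect divergences. `decide` here is a hypothetical stand-in for whatever deterministic routing function your pipeline exposes:

```typescript
interface ReplayTrace { intent: string; context: object; decision: string }

// Returns the traces whose replayed decision diverges from the recorded one.
// An empty result means the router is still consistent with its history.
function replay(
  traces: ReplayTrace[],
  decide: (intent: string, context: object) => string,
): ReplayTrace[] {
  return traces.filter((t) => decide(t.intent, t.context) !== t.decision);
}
```

For this to work, `decide` must depend only on the logged inputs plus the versioned schema/policy/registry from Hint 2; any hidden state breaks replay.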

Hint 5: Start with 3 tools spanning three risk levels. A read-only search, a mutating booking, and an irreversible payment. This small set exercises most of the risk taxonomy (add a privileged tool later to cover the fourth level) without overwhelming the initial implementation.
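
A starter registry for this hint might look like the following. Tool names and schemas are illustrative, not a prescribed set:

```typescript
const starterRegistry = [
  {
    name: "travel_search",            // read_only: no side effects
    risk_level: "read_only",
    requires_approval: false,
    inputSchema: {
      type: "object",
      properties: { destination: { type: "string" }, date: { type: "string" } },
      required: ["destination"],
      additionalProperties: false,
    },
  },
  {
    name: "booking_create",           // mutating: changes state, reversible
    risk_level: "mutating",
    requires_approval: false,
    inputSchema: {
      type: "object",
      properties: { offer_id: { type: "string" } },
      required: ["offer_id"],
      additionalProperties: false,
    },
  },
  {
    name: "payment_transfer",         // irreversible: always gated
    risk_level: "irreversible",
    requires_approval: true,
    inputSchema: {
      type: "object",
      properties: { amount: { type: "number" }, account: { type: "string" } },
      required: ["amount", "account"],
      additionalProperties: false,
    },
  },
];
```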

5.9 Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Service boundary design | “Building Microservices” by Sam Newman | API chapters |
| Access control and policy | “Security Engineering” by Ross Anderson | Authz chapters |
| Reliability and monitoring | “Site Reliability Engineering” by Google | Ch. 6 |
| Agent architecture | “AI Engineering” by Chip Huyen | Agent systems chapters |
| Tool integration patterns | “Building LLM Apps” by Valentina Alto | Tool use chapters |

5.10 Implementation Phases

Phase 1: Foundation

  • Design and populate the tool registry with 3-5 tools spanning all risk levels.
  • Build the HTTP server skeleton with the /v1/route endpoint.
  • Implement the trace logger with append-only JSON output.
  • Checkpoint: Server starts, accepts requests, returns stub responses with trace_ids.

Phase 2: Core Functionality

  • Implement the intent classifier with confidence scoring and ABSTAIN path.
  • Build the JSON Schema validator using Ajv (or equivalent).
  • Implement the policy gate with risk classification and rule evaluation.
  • Wire the full pipeline: classify -> validate -> gate -> respond.
  • Checkpoint: Golden path produces TOOL_CALL response. Policy-blocked path produces DENIED response. Both have complete traces.

Phase 3: Operational Hardening

  • Add rate limiting per user/tool combination.
  • Add environment-aware policy rules.
  • Add parallel tool call support with atomic/lenient modes.
  • Build trace replay capability for debugging.
  • Checkpoint: Rate limiting triggers on burst requests. Trace replay produces identical decisions for historical requests.
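
The per-user/tool rate limiting from Phase 3 can be prototyped with a fixed-window counter. This is a simplified sketch; production routers often prefer sliding windows or token buckets to avoid boundary bursts:

```typescript
class RateLimiter {
  private counts = new Map<string, { windowStart: number; n: number }>();
  constructor(private max: number, private windowMs: number) {}

  // Returns true if the call is within budget; `now` is injectable for tests.
  allow(user: string, tool: string, now = Date.now()): boolean {
    const key = `${user}:${tool}`;
    const slot = this.counts.get(key);
    if (!slot || now - slot.windowStart >= this.windowMs) {
      this.counts.set(key, { windowStart: now, n: 1 }); // fresh window
      return true;
    }
    if (slot.n >= this.max) return false;               // over budget
    slot.n += 1;
    return true;
  }
}
```

When `allow` returns false, the router should emit the RATE_LIMITED error shape and still write a trace entry.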

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|----------|---------|----------------|-----------|
| Schema validation strictness | Strict (reject) / Coercive (auto-fix) | Strict (reject) | Auto-coercion can silently change semantics |
| Policy default | ALLOW / DENY | DENY (fail closed) | Principle of least privilege |
| Parallel calls | Atomic / Lenient | Configurable | Safety-critical apps need atomic; others benefit from lenient |
| Tool definitions | MCP-compatible / Custom | MCP-compatible with extensions | Future interoperability with MCP ecosystem |
| Audit logging | Synchronous / Asynchronous | Asynchronous (fire-and-forget) | Logging should never block routing latency |
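
The fail-closed policy default is worth making concrete: if no rule matches, the verdict is DENY. The rule shape below is illustrative, mirroring the risk/tier fields used throughout this guide:

```typescript
type Verdict = "ALLOW" | "DENY" | "ESCALATE";

interface PolicyRule {
  id: string;
  risk_level: string;
  user_tier: string;   // "*" matches any tier
  verdict: Verdict;
}

function evaluatePolicy(
  rules: PolicyRule[],
  risk: string,
  tier: string,
): { verdict: Verdict; rule_id: string | null } {
  for (const rule of rules) {
    if (rule.risk_level === risk && (rule.user_tier === "*" || rule.user_tier === tier)) {
      return { verdict: rule.verdict, rule_id: rule.id };
    }
  }
  // Fail closed: no matching rule means DENY, never a silent allow.
  return { verdict: "DENY", rule_id: null };
}
```

First-match-wins is one simple conflict-resolution strategy; your design questions in 5.5 may lead you to most-specific-wins or deny-overrides instead.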

6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|----------|---------|----------|
| Unit Tests | Validate individual components | Schema validation edge cases, policy rule matching |
| Integration Tests | Verify full routing pipeline | Golden path end-to-end, error path end-to-end |
| Edge Case Tests | Ensure robust failure handling | Ambiguous intent, unknown tool, malformed JSON |
| Security Tests | Verify policy enforcement | Privilege escalation attempts, policy bypass attempts |

6.2 Critical Test Cases

  1. Golden path: intent maps to correct tool, args validate, policy allows, trace is complete.
  2. Schema validation: missing required field returns structured error with JSON path.
  3. Policy DENY: irreversible tool + free tier returns POLICY_BLOCKED with matched rule.
  4. Ambiguous intent: two tools with similar confidence returns ABSTAIN.
  5. Unknown tool: model hallucinated tool name returns TOOL_NOT_FOUND.
  6. Rate limit: burst requests trigger throttling with RATE_LIMITED response.
  7. Trace replay: replaying a trace with same inputs produces same decision.

6.3 Test Data

fixtures/golden_case.json       # Happy path: clear intent, valid args, allowed policy
fixtures/failure_case.json      # Error path: invalid args, blocked policy
fixtures/ambiguous_intent.json  # Multiple tools with similar confidence
fixtures/edge_cases/
  malformed_json.txt            # Invalid JSON from model
  unknown_tool.json             # Hallucinated tool name
  policy_bypass.json            # Persuasive wording for blocked action
  parallel_calls.json           # Multiple tool calls in one request

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Root Cause | Solution |
|---------|------------|----------|
| “Wrong tool selected for edge intents” | Classifier labels are too coarse. | Add hierarchical intent taxonomy with ABSTAIN for ambiguous cases. |
| “Tool call succeeded but policy was violated” | Policy checks executed after tool invocation. | Move policy gate before execution path. Never skip the gate. |
| “Router decisions are inconsistent” | Context normalization is missing. | Normalize locale, date, and entity extraction before routing. |
| “Schema validation passes but tool fails” | Schema is too permissive. | Use additionalProperties: false, strict types, format annotations. |
| “Audit log is incomplete” | Logging is conditional or synchronous-blocking. | Log every decision unconditionally. Use async writes. |

7.2 Debugging Strategies

  • Replay traces with the same inputs and verify decisions match.
  • Add request_id logging to correlate across classifier, validator, and policy gate.
  • Compare validation error paths by sending known-bad inputs for each schema constraint.
  • Test policy rules independently with unit tests before integration.

7.3 Performance Traps

  • Schema validation on every request: precompile schemas at startup with Ajv.
  • Policy rule evaluation with linear scan: index rules by risk_level for O(1) lookup.
  • Synchronous audit logging blocking response: use async fire-and-forget writes.
  • Loading tool registry from disk on every request: cache in memory, watch for changes.
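
The "compile once at startup" pattern behind the first trap can be shown without the Ajv dependency: do the per-schema setup work once and return a cheap closure to call per request. With Ajv, the equivalent is calling `ajv.compile(schema)` at startup and reusing the returned validate function. The hand-rolled checker below covers only required fields, primitive types, and `additionalProperties: false`, so treat it as a stand-in, not a replacement:

```typescript
interface MiniSchema {
  required: string[];
  properties: Record<string, { type: string }>;
  additionalProperties: false;
}

function buildValidator(schema: MiniSchema) {
  // One-time work at startup: precompute lookup sets.
  const required = new Set(schema.required);
  const known = new Set(Object.keys(schema.properties));
  // Per-request closure: collects ALL violations, not just the first.
  return (args: Record<string, unknown>): string[] => {
    const errors: string[] = [];
    for (const field of required)
      if (!(field in args)) errors.push(`missing required field: ${field}`);
    for (const [key, value] of Object.entries(args)) {
      if (!known.has(key)) { errors.push(`unexpected property: ${key}`); continue; }
      if (typeof value !== schema.properties[key].type)
        errors.push(`wrong type for ${key}: expected ${schema.properties[key].type}`);
    }
    return errors;
  };
}
```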

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add 2 more tools to the registry with different risk levels and test policy coverage.
  • Add one new policy rule that considers request time-of-day (e.g., deny irreversible actions outside business hours).

8.2 Intermediate Extensions

  • Implement parallel tool call support with configurable atomic/lenient modes.
  • Add a /v1/tools endpoint that lists registered tools with risk levels (filtered by user tier).
  • Build a trace dashboard that visualizes routing decisions over time.

8.3 Advanced Extensions

  • Implement MCP client support: discover tools from external MCP servers and merge into the registry.
  • Add escalation workflow integration: ESCALATE decisions queue for human approval with configurable timeout.
  • Build trace replay with diff analysis: compare current routing decisions against historical decisions after policy changes.
  • Integrate with P13 (Tool Permission Firewall) for fine-grained capability-based security.

9. Real-World Connections

9.1 Industry Applications

  • Enterprise AI platforms routing tool calls through organizational policy layers before executing actions in CRM, ERP, and communication systems.
  • Autonomous coding agents using tool routers to enforce which file operations, terminal commands, and API calls are permitted.
  • Customer support bots using tool routing to safely execute actions like refunds, account changes, and escalations based on agent tier and request risk.

9.2 Open-Source Implementations

  • LangChain tool routing and agent executor implementations.
  • Model Context Protocol specification and reference implementations.
  • Open Policy Agent (OPA) for declarative policy evaluation.
  • Ajv (Another JSON Schema Validator) for TypeScript/JavaScript validation.

9.3 Interview Relevance

  • Demonstrates understanding of the trust boundary between LLM output and real-world actions.
  • Shows ability to design layered security: schema validation + policy gating + audit logging.
  • Illustrates production thinking: deterministic decisions, trace replay, fail-closed defaults.

10. Resources

10.1 Essential Reading

  • OpenAI Function Calling documentation.
  • Anthropic Tool Use and Structured Outputs documentation.
  • Model Context Protocol specification (November 2025 revision).
  • JSON Schema specification (draft 2020-12).
  • OWASP LLM Top 10 (tool misuse and insecure plugin design).

10.2 Video Resources

  • Talks on AI agent security and tool sandboxing.
  • Conference presentations on MCP architecture and adoption.
  • Workshops on JSON Schema design for API contracts.

10.3 Tools & Documentation

  • Ajv documentation (JSON Schema validator for TypeScript/JavaScript).
  • Open Policy Agent (OPA) documentation for declarative policy engines.
  • OpenTelemetry documentation for structured tracing.
  • MCP reference implementations on GitHub.

10.4 Related Projects

  • P01 (Prompt Contract Harness): Contract discipline applied to routing outputs.
  • P03 (Prompt Injection Red-Team Lab): Tests injection resistance in tool-calling prompts.
  • P05 (Few-Shot Example Curator): Example selection for intent classification training.
  • P13 (Tool Permission Firewall): Fine-grained capability-based tool access control.
  • P18 (Capstone): Integrates the router into the full production platform.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain the tool-calling lifecycle across at least two LLM providers.
  • I can describe the four risk levels and give examples of tools at each level.
  • I can explain why fail-closed is the correct default for policy evaluation.
  • I can describe how MCP standardizes tool integration and where the router adds value.

11.2 Implementation

  • Golden-path and failure-path flows both produce complete trace entries.
  • Schema validation catches missing fields, wrong types, and extra properties.
  • Policy gate correctly applies risk-based rules for different user tiers.
  • Trace replay produces identical decisions for the same inputs.

11.3 Growth

  • I can explain the tradeoff between atomic and lenient parallel tool call handling.
  • I can design a policy migration strategy for adding new tools without breaking existing rules.
  • I can describe how this project integrates with P13 (Firewall) and P18 (Capstone).

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Router accepts requests and returns structured decisions (TOOL_CALL, DENIED, ABSTAIN).
  • Schema validation catches invalid arguments and returns all errors.
  • Policy gate enforces at least one deny rule based on risk level + user tier.
  • Every decision produces a trace entry.

Full Completion:

  • Tool registry has 5+ tools spanning all four risk levels.
  • Policy rules cover all risk/tier/environment combinations with fail-closed default.
  • Rate limiting triggers on burst requests.
  • Trace replay verifies decision consistency.
  • Automated tests cover golden path, error path, and security cases.

Excellence (Above & Beyond):

  • MCP client discovers tools from external servers.
  • Escalation workflow with human approval integration.
  • Parallel tool call support with configurable atomic/lenient modes.
  • Integration with P13 (Firewall) for capability-based security.