Project 1: Tool Caller Baseline (Non-Agent)

Build a deterministic, single-shot CLI assistant that calls tools with strict schemas, logs tool vs model failures, and produces a reproducible JSON report.


Quick Reference

Attribute       Value
Difficulty      Level 1: Intro
Time Estimate   6–10 hours (weekend)
Language        Python or JavaScript
Prerequisites   CLI basics, JSON, simple file I/O
Key Topics      tool schemas, validation, determinism, error boundaries

Learning Objectives

By completing this project, you will:

  1. Define tool contracts with strict input/output schemas.
  2. Separate tool failures from model failures in logs and reports.
  3. Guarantee deterministic output for the same inputs.
  4. Build a minimal tool registry and execution pipeline.
  5. Produce machine-verifiable reports that downstream systems can trust.

The Core Question You’re Answering

“What can structured tool calling accomplish without any agent loop, and where does it break?”

This project establishes a baseline. Without planning, memory, or iteration, the system is predictable. That predictability is your control group for all agentic behavior later.


Concepts You Must Understand First

Concept                  Why It Matters                      Where to Learn
JSON schema validation   Tool I/O must be verifiable         Pydantic/Zod docs
Deterministic execution  Debugging requires repeatability    Any testing guide
Error boundaries         Tool failures vs model failures     Systems design basics
CLI argument parsing     Reproducible inputs                 argparse / yargs
Structured outputs       Enables strict parsing              LLM function calling guides

Theoretical Foundation

Single-Shot Tool Calling as a Pipeline

A non-agent tool caller is a straight-line pipeline:

User Input -> Tool Call -> Tool Output -> JSON Report

There is no feedback loop. That means:

  • Pros: deterministic, testable, easy to trace
  • Cons: cannot adapt to errors, cannot plan, cannot recover
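
To make the shape concrete, here is a minimal sketch in Python; the function names (call_tool, run_once) are illustrative placeholders, not part of the spec:

import json

def call_tool(name: str, payload: dict) -> dict:
    # Stand-in dispatch; Phase 1 replaces this with a schema-validated registry.
    return {"tool": name, "echo": payload}

def run_once(user_input: dict) -> str:
    """One pass through the pipeline: input -> tool call -> output -> report."""
    tool_output = call_tool("parse_log", user_input)
    report = {"status": "success", "input": user_input, "result": tool_output}
    # sort_keys keeps the serialized report stable across runs
    return json.dumps(report, sort_keys=True, indent=2)

if __name__ == "__main__":
    print(run_once({"file_path": "logs/server.log"}))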

Error Boundaries

You must distinguish:

  • Tool errors (file not found, bad input)
  • Model errors (invalid JSON, missing fields)

Blending these will make debugging impossible once you scale.
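
One low-effort way to keep the boundary explicit is two exception types plus a single classifier that fills the error_type field used in the report. This is a sketch; the class names are assumptions:

class ToolError(Exception):
    """Raised when a tool itself fails (missing file, bad input, timeout)."""

class ModelError(Exception):
    """Raised when the model's response is unusable (invalid JSON, missing fields)."""

def classify(exc: Exception) -> dict:
    # Every failure is forced into exactly one bucket before it reaches the report.
    if isinstance(exc, ToolError):
        return {"status": "error", "error_type": "tool_error", "message": str(exc)}
    if isinstance(exc, ModelError):
        return {"status": "error", "error_type": "model_error", "message": str(exc)}
    return {"status": "error", "error_type": "unknown_error", "message": str(exc)}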


Project Specification

What You’ll Build

A CLI tool that runs a fixed tool chain (e.g., parse logs -> compute stats) and outputs a strict JSON report.

Functional Requirements

  1. Tool registry with input/output schemas
  2. Validation on tool calls and tool outputs
  3. Deterministic execution (temperature 0, no randomness)
  4. JSON report with metrics and tool logs
  5. Distinct error codes for tool vs model failures

Non-Functional Requirements

  • Reproducible outputs
  • Clear audit logs
  • Safe defaults (no dynamic code execution)

Real World Outcome

When you run the tool, you get deterministic, auditable output:

$ python tool_caller.py analyze --file logs/server.log

Calling tool: parse_log
Tool input: {"file_path": "logs/server.log"}
Tool output received (382 bytes)

Calling tool: summarize_stats
Tool input: {"events": 1523}
Tool output received (128 bytes)

Analysis complete.

Output file analysis_report.json:

{
  "status": "success",
  "input_file": "logs/server.log",
  "summary": {
    "total_lines": 1523,
    "errors": 47,
    "warnings": 132
  },
  "tool_calls": [
    {"name": "parse_log", "duration_ms": 145, "status": "ok"},
    {"name": "summarize_stats", "duration_ms": 23, "status": "ok"}
  ]
}

If a tool fails, the report is explicit:

{
  "status": "error",
  "error_type": "tool_error",
  "tool": "parse_log",
  "message": "File not found: logs/missing.log"
}

Architecture Overview

┌───────────────┐   validate   ┌─────────────────┐
│ CLI Interface │──────────────▶│ Tool Registry   │
└───────┬───────┘               └───────┬─────────┘
        │                               │
        ▼                               ▼
┌───────────────┐               ┌─────────────────┐
│ Tool Executor │──────────────▶│ Tool Implement. │
└───────┬───────┘               └───────┬─────────┘
        │                               │
        ▼                               ▼
┌─────────────────┐           ┌────────────────────┐
│ Report Builder  │◀──────────│ Error Boundary     │
└─────────────────┘           └────────────────────┘
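
One way to map those boxes to files, if you want a starting layout (module names are suggestions, not requirements):

tool_caller/
  cli.py        CLI interface: argument parsing, entry point
  registry.py   Tool registry: schemas + tool lookup
  executor.py   Tool executor: fixed-order execution, timing, logging
  errors.py     Error boundary: ToolError / ModelError, classification
  report.py     Report builder: report schema, deterministic serialization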

Implementation Guide

Phase 1: Tool Registry + Schemas (2–3h)

  • Define tool schemas with Pydantic/Zod
  • Validate input before execution
  • Checkpoint: invalid input fails fast
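
A minimal Phase 1 sketch using Pydantic v2; the tool and field names mirror the example report above but are otherwise assumptions:

from pydantic import BaseModel

class ParseLogInput(BaseModel):
    file_path: str

class ParseLogOutput(BaseModel):
    total_lines: int
    errors: int
    warnings: int

def parse_log(inp: ParseLogInput) -> dict:
    total = errors = warnings = 0
    with open(inp.file_path) as fh:          # missing file -> tool error, not model error
        for line in fh:
            total += 1
            errors += "ERROR" in line
            warnings += "WARNING" in line
    return {"total_lines": total, "errors": errors, "warnings": warnings}

# Registry: tool name -> (input schema, output schema, implementation)
TOOL_REGISTRY = {
    "parse_log": (ParseLogInput, ParseLogOutput, parse_log),
}

def validated_call(name: str, raw_input: dict) -> dict:
    in_model, out_model, fn = TOOL_REGISTRY[name]
    inp = in_model(**raw_input)              # ValidationError here = fail fast on bad input
    raw_out = fn(inp)
    return out_model(**raw_out).model_dump() # validate the tool's output as well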

Phase 2: Tool Executor + Logging (2–3h)

  • Execute tools in a fixed order
  • Log tool inputs, outputs, timings
  • Checkpoint: trace log is complete
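
A Phase 2 sketch of a fixed-order executor that records the trace. It assumes each tool is reached through a dict-in, dict-out callable (like validated_call above) and that each tool's output is a valid input for the next tool in the chain:

import json
import time

def execute_chain(chain: list[str], call, initial_input: dict):
    """Run tools in a fixed order; return (final_output, per-call trace)."""
    trace = []
    payload = initial_input
    for name in chain:
        print(f"Calling tool: {name}")
        print(f"Tool input: {json.dumps(payload, sort_keys=True)}")
        start = time.perf_counter()
        payload = call(name, payload)                       # dict in, dict out
        duration_ms = int((time.perf_counter() - start) * 1000)
        print(f"Tool output received ({len(json.dumps(payload))} bytes)")
        trace.append({"name": name, "duration_ms": duration_ms, "status": "ok"})
    return payload, trace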

Phase 3: Report Builder (2–4h)

  • Build deterministic JSON output
  • Add error classification
  • Checkpoint: report validates against schema
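
A Phase 3 sketch that validates the report against its own schema before writing, again assuming Pydantic v2; the field names follow the example report above:

import json
from pydantic import BaseModel

class ToolCallRecord(BaseModel):
    name: str
    duration_ms: int
    status: str

class Report(BaseModel):
    status: str
    input_file: str
    summary: dict
    tool_calls: list[ToolCallRecord]

def write_report(path: str, input_file: str, summary: dict, trace: list[dict]) -> None:
    report = Report(status="success", input_file=input_file,
                    summary=summary, tool_calls=trace)        # raises on schema drift
    with open(path, "w") as fh:
        # sort_keys + fixed indent keeps the file byte-identical for identical inputs
        json.dump(report.model_dump(), fh, sort_keys=True, indent=2)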

Common Pitfalls & Debugging

Pitfall              Symptom                      Fix
Mixed error types    All failures look identical  Enforce an error_type field
Non-determinism      Outputs differ run to run    Fix seeds; set temperature to 0
Silent schema drift  Missing fields in JSON       Validate outputs strictly
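
For the non-determinism pitfall, most variation comes from the model call and from serialization. A short checklist in code; the model_request dict is a placeholder, not any specific SDK's parameters:

import json
import random

random.seed(0)                      # pin any randomness you control yourself

model_request = {
    "temperature": 0,               # no sampling variation
    "seed": 0,                      # only if your provider accepts a seed parameter
}

def stable_dumps(obj: dict) -> str:
    # Sorted keys + fixed separators make repeated runs byte-for-byte comparable.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))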

Interview Questions They’ll Ask

  1. Why must tool outputs be validated if tool inputs are already valid?
  2. How do you distinguish model errors from tool errors?
  3. What makes deterministic outputs critical for debugging?

Hints in Layers

  • Hint 1: Start with a single tool and enforce strict input schema validation.
  • Hint 2: Add a tool registry so you can test tools in isolation.
  • Hint 3: Create a JSON report schema and validate it before saving.
  • Hint 4: Add structured error types so failures are unambiguous.

Learning Milestones

  1. Baseline Working: one tool call produces valid JSON.
  2. Observable: logs show tool inputs/outputs clearly.
  3. Reliable: outputs are deterministic and validated.

Submission / Completion Criteria

Minimum Completion

  • Fixed tool chain
  • Schema-validated inputs/outputs

Full Completion

  • JSON report with logs
  • Error classification

Excellence

  • Replay mode for stored runs
  • Metrics export (CSV/JSONL)
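
For the metrics export, JSONL is the lowest-friction format because each run appends independent lines. A sketch; the file name and record shape are assumptions:

import json

def append_run_metrics(trace: list[dict], path: str = "runs.jsonl") -> None:
    """Append one line per tool call so later runs can be aggregated or replayed."""
    with open(path, "a") as fh:
        for record in trace:
            fh.write(json.dumps(record, sort_keys=True) + "\n")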

This guide was generated from project_based_ideas/AI_AGENTS_LLM_RAG/AI_AGENTS_PROJECTS.md.