Project 2: Minimal ReAct Agent

Build a ReAct-style agent that loops through Thought -> Action -> Observation, updates state, and stops under explicit termination rules.

Quick Reference

Attribute	Value
Difficulty	Level 2: Intermediate
Time Estimate	10–16 hours
Language	Python or JavaScript
Prerequisites	Project 1, tool calling basics
Key Topics	agent loops, state, termination, trace logging

Learning Objectives

By completing this project, you will:

Implement a closed-loop agent cycle (Think → Act → Observe).
Maintain explicit agent state across steps.
Define termination rules to stop safely.
Summarize observations to avoid context overflow.
Log a trace that makes reasoning auditable.

The Core Question You’re Answering

“How does an agent use feedback to adapt its next action instead of guessing once?”

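This loop is the minimal mechanism that separates an agent from a fixed pipeline: each action is chosen in light of the previous observation. Without that feedback, the system only executes a pre-planned sequence of steps.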

Concepts You Must Understand First

Concept	Why It Matters	Where to Learn
ReAct pattern	Interleaves reasoning + actions	ReAct paper
State vs observation	Prevents repeated actions	Agent design blogs
Termination conditions	Stops runaway loops	LangChain docs
Tool schemas	Actions must be validated	Pydantic/Zod docs

Theoretical Foundation

ReAct Loop as Control System

Goal -> Think -> Act -> Observe -> Update State -> Think -> ...

Key properties:

Feedback-driven: each step uses real observations.
Bounded: stops on success, on an exhausted step budget, or on a detected loop.
Auditable: trace shows why actions happened.
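
A minimal sketch of this control loop, assuming a hand-written policy in place of the LLM. The names run_agent, scripted_policy, and TOOLS, and the canned tool outputs, are hypothetical and only illustrate the shape of the cycle:

```python
from typing import Callable, Dict, Tuple

# Hypothetical tools: the names and canned outputs are placeholders.
TOOLS: Dict[str, Callable[[dict], str]] = {
    "list_files": lambda args: "ARCHITECTURE.md, API_GUIDE.md, TUTORIAL.md",
    "finish": lambda args: "done",
}


def scripted_policy(goal: str, state: dict) -> Tuple[str, str, dict]:
    """Stand-in for the LLM call: returns (thought, action, args)."""
    if not state["history"]:
        return "I need the file list first", "list_files", {"path": "/docs"}
    return "The last observation answers the goal", "finish", {}


def run_agent(goal, policy, tools, max_steps=5):
    state = {"facts": [], "history": []}
    for step in range(1, max_steps + 1):
        thought, action, args = policy(goal, state)   # Think
        observation = tools[action](args)             # Act
        state["history"].append((action, args))       # Update state
        state["facts"].append(observation)            # Observe
        if action == "finish":
            return {"status": "success", "steps": step, "state": state}
    # Step budget exhausted without reaching the goal.
    return {"status": "max_steps", "steps": max_steps, "state": state}


result = run_agent("List the markdown files in /docs", scripted_policy, TOOLS)
print(result["status"], result["steps"])  # prints: success 2
```

The policy receives the state on every step, so its second decision depends on what the first action produced. That dependency is the feedback property listed above.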

Project Specification

What You’ll Build

A CLI agent that solves multi-step tasks such as:

“Find and summarize the three largest markdown files in /docs.”

Functional Requirements

Thought → Action → Observation loop
State object with facts + action history
Termination rules: success, max steps, loop detection
Observation summarization
JSONL trace log
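
One possible shape for the state object, sketched under the assumption that facts are a flat dictionary and that an action is identified by its name plus its arguments. AgentState, has_done, and update are hypothetical names:

```python
import json
from dataclasses import dataclass, field


@dataclass
class AgentState:
    goal: str
    facts: dict = field(default_factory=dict)
    actions_taken: list = field(default_factory=list)

    @staticmethod
    def _signature(action, args):
        # Stable string key for (action, args), independent of key order.
        return action + ":" + json.dumps(args, sort_keys=True)

    def has_done(self, action, args):
        """True if this exact action with these arguments was already taken."""
        return self._signature(action, args) in self.actions_taken

    def update(self, action, args, new_facts):
        """Record the action, merge new facts, and return only what changed."""
        diff = {k: v for k, v in new_facts.items()
                if k not in self.facts or self.facts[k] != v}
        self.facts.update(diff)
        self.actions_taken.append(self._signature(action, args))
        return diff


state = AgentState(goal="Find the three largest markdown files in /docs")
diff = state.update("read_file", {"path": "ARCHITECTURE.md"}, {"files_read": 1})
```

The returned diff is the kind of value that could be written to a state_diff field in the trace.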

Non-Functional Requirements

Deterministic mode for tests
Explicit error handling on tool failures
Trace replay support
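
Deterministic mode can be approximated in tests by swapping the model for a scripted stand-in. The sketch below assumes the agent talks to the model through a single complete(prompt) method; that interface and the ScriptedLLM name are hypothetical, not part of any specific SDK:

```python
from collections import deque


class ScriptedLLM:
    """Hypothetical test double: replays canned completions in order."""

    def __init__(self, responses):
        self._responses = deque(responses)
        self.prompts = []  # every prompt seen, kept for assertions in tests

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        if not self._responses:
            raise RuntimeError("script exhausted: agent asked for an extra step")
        return self._responses.popleft()


llm = ScriptedLLM([
    '{"action": "list_files", "args": {"path": "/docs"}}',
    '{"action": "finish", "args": {}}',
])
first = llm.complete("Goal: find the largest files")
second = llm.complete("Observation: 47 results")
```

Because the responses are fixed, the same goal always produces the same sequence of actions, which makes the test assertions stable.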

Real World Outcome

Example run:

$ python react_agent.py --goal "Find and summarize the three largest markdown files in /docs"

Step 1: list_files -> 47 results
Step 2: get_file_sizes -> top 3 found
Step 3: read_file -> ARCHITECTURE.md
Step 4: read_file -> API_GUIDE.md
Step 5: read_file -> TUTORIAL.md
Step 6: summarize -> report created

Trace entry example:

{"step": 3, "thought": "Read the largest file", "action": "read_file", "observation": "Read 450KB", "state_diff": {"files_read": 1}}
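
A minimal sketch of writing and replaying such a trace, assuming one JSON object per line and that each state_diff holds absolute values that overwrite earlier ones. The helper names are hypothetical, and the second entry is made up purely for illustration:

```python
import json
import os
import tempfile


def append_trace(path, entry):
    """Append one step as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry, sort_keys=True) + "\n")


def load_trace(path):
    """Read the trace back as a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def replay_state(entries):
    """Rebuild the final state by applying each state_diff in step order."""
    state = {}
    for entry in sorted(entries, key=lambda e: e["step"]):
        state.update(entry.get("state_diff", {}))
    return state


# Demo against a throwaway file in a temporary directory.
trace_path = os.path.join(tempfile.mkdtemp(), "trace.jsonl")
append_trace(trace_path, {"step": 3, "thought": "Read the largest file",
                          "action": "read_file", "observation": "Read 450KB",
                          "state_diff": {"files_read": 1}})
# Illustrative second entry (values invented for this sketch).
append_trace(trace_path, {"step": 4, "thought": "Read the second largest file",
                          "action": "read_file", "observation": "Read 300KB",
                          "state_diff": {"files_read": 2}})
entries = load_trace(trace_path)
final_state = replay_state(entries)
```

Appending line by line means a crash mid-run still leaves every completed step readable.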

Architecture Overview

┌──────────────┐   thoughts   ┌───────────────┐
│  LLM Brain   │─────────────▶│  Action Plan  │
└──────┬───────┘              └──────┬────────┘
       │                             │
       ▼                             ▼
┌──────────────┐              ┌──────────────┐
│ Tool Execute │─────────────▶│ Observation  │
└──────┬───────┘              └──────┬───────┘
       │                             │
       ▼                             ▼
┌────────────────┐          ┌─────────────────┐
│ State Update   │◀─────────│ Trace Logger    │
└────────────────┘          └─────────────────┘

Implementation Guide

Phase 1: Loop Skeleton (3–4h)

Implement a fixed loop with max steps
Hardcode action selection at first (no LLM calls yet)
Checkpoint: loop logs steps

Phase 2: Tool Selection + State (4–6h)

Generate actions from LLM
Track action history and facts
Checkpoint: agent completes 3-step task
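
Once actions come from the LLM, they should be validated before any tool runs. The sketch below uses only the standard library and assumes the model replies with JSON of the form {"action": ..., "args": {...}}. The schema table, the reply format, and the names parse_action and InvalidAction are hypothetical; Pydantic or Zod could fill the same role:

```python
import json

# Hypothetical schemas: tool name -> required argument names and types.
TOOL_SCHEMAS = {
    "list_files": {"path": str},
    "read_file": {"path": str},
}


class InvalidAction(ValueError):
    """Raised when the model output cannot be turned into a safe tool call."""


def parse_action(raw):
    """Parse and validate one model reply; return (tool_name, args)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise InvalidAction(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise InvalidAction("expected a JSON object")
    name = data.get("action")
    if not isinstance(name, str) or name not in TOOL_SCHEMAS:
        raise InvalidAction(f"unknown tool: {name!r}")
    args = data.get("args", {})
    if not isinstance(args, dict):
        raise InvalidAction("args must be a JSON object")
    schema = TOOL_SCHEMAS[name]
    if set(args) != set(schema):
        raise InvalidAction(f"expected arguments {sorted(schema)}, got {sorted(args)}")
    for key, expected_type in schema.items():
        if not isinstance(args[key], expected_type):
            raise InvalidAction(f"argument {key!r} must be {expected_type.__name__}")
    return name, args


action = parse_action('{"action": "read_file", "args": {"path": "API_GUIDE.md"}}')
```

A rejected action can be fed back to the model as an observation, so the next step has a chance to correct it.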

Phase 3: Termination + Summaries (3–6h)

Add loop detection and step budget
Summarize large observations
Checkpoint: trace file is replayable
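
The termination and summarization pieces of this phase might look like the sketch below. It assumes actions are stored as [name, args] pairs, treats more than two identical calls as a loop, and uses plain truncation as a stand-in for a real summarizer. All names and thresholds are hypothetical:

```python
import json
from collections import Counter


def is_looping(actions_taken, max_repeats=2):
    """True if any identical (action, args) pair occurred more than max_repeats times."""
    counts = Counter(json.dumps(a, sort_keys=True) for a in actions_taken)
    return any(n > max_repeats for n in counts.values())


def summarize_observation(text, max_chars=200):
    """Crude summary: keep the head and tail, note how much was dropped."""
    if len(text) <= max_chars:
        return text
    half = max_chars // 2
    dropped = len(text) - 2 * half
    return f"{text[:half]} ...[{dropped} chars omitted]... {text[-half:]}"


def should_stop(step, max_steps, goal_met, actions_taken):
    """Return a termination reason, or None to keep going."""
    if goal_met:
        return "success"
    if is_looping(actions_taken):
        return "loop_detected"
    if step >= max_steps:
        return "max_steps"
    return None


repeated = [["read_file", {"path": "TUTORIAL.md"}]] * 3
reason = should_stop(step=3, max_steps=10, goal_met=False, actions_taken=repeated)
short = summarize_observation("a" * 500 + "b" * 500)
```

Checking success first means a run that reaches the goal on its final allowed step is still reported as a success rather than as a budget failure.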

Common Pitfalls & Debugging

Pitfall	Symptom	Fix
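Infinite loops	The same action repeats with the same arguments	Compare each new action against the action history and stop or replan on a repeat
Context overflow	The model loses track of earlier steps	Summarize large observations before adding them to the prompt
Silent failures	Tool errors never appear in the trace	Catch tool exceptions and log the error as its own trace field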
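
For the silent-failure pitfall, one approach is to wrap every tool call so that it always returns a structured result. This is a sketch under the assumption that tools are plain Python callables taking keyword arguments; call_tool and the two demo tools are hypothetical:

```python
def call_tool(tools, name, args):
    """Run a tool and always return a structured result instead of raising."""
    if name not in tools:
        return {"ok": False, "output": None, "error": f"unknown tool: {name}"}
    try:
        return {"ok": True, "output": tools[name](**args), "error": None}
    except Exception as exc:  # deliberately broad: any tool failure becomes data
        return {"ok": False, "output": None, "error": f"{type(exc).__name__}: {exc}"}


def read_file(path):
    # Hypothetical tool that always fails, to exercise the error path.
    raise FileNotFoundError(path)


TOOLS = {"read_file": read_file, "echo": lambda text: text}

failed = call_tool(TOOLS, "read_file", {"path": "missing.md"})
ok = call_tool(TOOLS, "echo", {"text": "hi"})
print(failed["error"])  # prints: FileNotFoundError: missing.md
```

Keeping error separate from output lets the trace record failures explicitly, and lets the agent see the failure as an observation on its next step.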

Interview Questions They’ll Ask

How does ReAct differ from chain-of-thought prompting?
Why is explicit state tracking necessary?
How do you prevent infinite loops?

Hints in Layers

Hint 1: Implement the loop with a max step budget.
Hint 2: Store actions_taken and block repeats.
Hint 3: Summarize observations to preserve context.
Hint 4: Log each step as JSONL for replay.

Learning Milestones

Loop Works: three steps logged correctly.
Stateful: actions depend on observations.
Safe: termination rules prevent runaway loops.

Submission / Completion Criteria

Minimum Completion

Working ReAct loop
Basic trace log

Full Completion

Termination rules
Observation summarization

Excellence

Replay mode
Parallel tool calls

This guide was generated from project_based_ideas/AI_AGENTS_LLM_RAG/AI_AGENTS_PROJECTS.md.