Project 3: A Basic RAG System for a Document

Build a simple RAG pipeline that answers questions about a single document with structured validation.

Quick Reference

Attribute        Value
Difficulty       Level 2: Intermediate
Time Estimate    8-12 hours
Language         Python
Prerequisites    Embeddings basics, Pydantic models
Key Topics       retrieval, grounding, schema validation

1. Learning Objectives

By completing this project, you will:

  1. Chunk a document and store embeddings.
  2. Retrieve relevant context for a query.
  3. Generate answers with validated structure.
  4. Enforce evidence references in outputs.
  5. Evaluate accuracy on test questions.

2. Theoretical Foundation

2.1 Single-Document RAG

Retrieval-augmented generation (RAG) grounds the model's answer in context retrieved at query time instead of relying on its parametric memory alone. Starting with one document isolates chunking and retrieval issues before scaling to a multi-document corpus.


3. Project Specification

3.1 What You Will Build

A Q&A tool that answers questions about a specific document, with schema-validated output.

3.2 Functional Requirements

  1. Chunker with overlap.
  2. Embedding store for chunks.
  3. Retriever for top-k chunks.
  4. Answer generator with structured output (see the schema sketch after this list).
  5. Evaluation on sample queries.
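
Requirement 4 is a good place to start, because the evidence requirement from the learning objectives can be enforced directly in the output model. A minimal sketch, assuming Pydantic v2; the names (GroundedAnswer, evidence_ids) are illustrative rather than prescribed:

# src/answer.py (schema only) -- names are illustrative assumptions
from pydantic import BaseModel, Field


class GroundedAnswer(BaseModel):
    """Structured answer that must cite at least one retrieved chunk."""

    answer: str = Field(description="Answer grounded only in the retrieved chunks")
    evidence_ids: list[str] = Field(
        min_length=1, description="IDs of the chunks that support the answer"
    )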

3.3 Non-Functional Requirements

  • Deterministic runs for testing.
  • Clear citation fields in output.
  • Fallback when retrieval is empty (sketched below).
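
The empty-retrieval case is worth handling before any model integration: if nothing relevant is retrieved, skip generation entirely rather than letting the model answer without grounding. A minimal sketch; the function name and message text are assumptions:

# A minimal sketch of the empty-retrieval fallback; names and message are illustrative.
from typing import Callable

NO_EVIDENCE_MESSAGE = "No relevant passages were found in the document for this question."


def answer_or_fallback(
    question: str,
    retrieved_chunks: list[dict],
    generate: Callable[[str, list[dict]], str],
) -> str:
    """Skip generation entirely when retrieval comes back empty."""
    if not retrieved_chunks:
        # Safe fallback: no model call, no fabricated evidence IDs.
        return NO_EVIDENCE_MESSAGE
    return generate(question, retrieved_chunks)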

4. Solution Architecture

4.1 Components

Component   Responsibility
Chunker     Split document
Embedder    Generate vectors
Retriever   Fetch relevant chunks
Answerer    Generate validated output
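
To make the division of responsibilities concrete, the sketch below wires the four components into a single query path. The module layout follows Section 5.1, but the function names are assumptions to be replaced by whatever you build in Phases 1-3:

# Illustrative wiring only -- function names are assumptions, not a fixed API.
from src.chunk import chunk_document              # Chunker
from src.embed import embed_chunks, embed_query   # Embedder
from src.retrieve import retrieve                 # Retriever
from src.answer import generate_answer            # Answerer


def build_index(document_text: str) -> list[dict]:
    """Run once per document: chunk it and attach embeddings."""
    return embed_chunks(chunk_document(document_text))


def ask(index: list[dict], question: str, top_k: int = 4):
    """Run per query: retrieve context, then generate a validated answer."""
    context = retrieve(embed_query(question), index, top_k=top_k)
    return generate_answer(question, context)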

5. Implementation Guide

5.1 Project Structure

LEARN_PYDANTIC_AI/P03-basic-rag/
├── src/
│   ├── chunk.py
│   ├── embed.py
│   ├── retrieve.py
│   ├── answer.py
│   └── eval.py

5.2 Implementation Phases

Phase 1: Chunking + embedding (4-6h)

  • Chunk the document and embed sections.
  • Checkpoint: chunks stored with IDs.
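
A minimal sketch of this phase, assuming character-based chunking and sentence-transformers as the embedding backend (any embedding model works; the model name is only an example):

# src/chunk.py and src/embed.py -- a minimal Phase 1 sketch
from sentence_transformers import SentenceTransformer


def chunk_document(text: str, size: int = 800, overlap: int = 200) -> list[dict]:
    """Split text into overlapping character windows, each with a stable ID."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks, start, idx = [], 0, 0
    while start < len(text):
        chunks.append({"id": f"chunk-{idx}", "text": text[start:start + size]})
        start += size - overlap
        idx += 1
    return chunks


def embed_chunks(chunks: list[dict], model_name: str = "all-MiniLM-L6-v2") -> list[dict]:
    """Attach an embedding vector to each chunk (checkpoint: chunks stored with IDs)."""
    model = SentenceTransformer(model_name)
    vectors = model.encode([c["text"] for c in chunks])
    return [{**c, "embedding": vec} for c, vec in zip(chunks, vectors)]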

Phase 2: Retrieval (2-3h)

  • Retrieve top-k chunks per query.
  • Checkpoint: relevant chunks returned.
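
A minimal retrieval sketch using cosine similarity over the in-memory vectors from Phase 1 (plain numpy; no vector database is needed for a single document):

# src/retrieve.py -- a minimal Phase 2 sketch
import numpy as np


def retrieve(query_vec, indexed_chunks: list[dict], top_k: int = 4) -> list[dict]:
    """Return the top-k chunks ranked by cosine similarity to the query vector."""
    if not indexed_chunks:
        return []  # triggers the fallback path from Section 3.3
    matrix = np.array([c["embedding"] for c in indexed_chunks], dtype=float)
    query = np.asarray(query_vec, dtype=float)
    sims = matrix @ query / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(query) + 1e-12)
    order = np.argsort(sims)[::-1][:top_k]
    return [indexed_chunks[int(i)] for i in order]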

Phase 3: Answering + eval (2-3h)

  • Generate structured answers.
  • Evaluate accuracy.
  • Checkpoint: outputs include evidence IDs.
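
A minimal answering sketch built around a PydanticAI Agent. It assumes a recent PydanticAI release where the typed result is configured with output_type and read from result.output (older releases use result_type and result.data); the model name and prompt wording are illustrative:

# src/answer.py -- a minimal Phase 3 sketch
from pydantic import BaseModel, Field
from pydantic_ai import Agent


class GroundedAnswer(BaseModel):
    """Same shape as the Section 3.2 sketch: answers must cite chunk IDs."""

    answer: str
    evidence_ids: list[str] = Field(min_length=1)


agent = Agent(
    "openai:gpt-4o-mini",
    output_type=GroundedAnswer,
    system_prompt=(
        "Answer strictly from the provided chunks. "
        "List the IDs of the chunks you relied on in evidence_ids."
    ),
)


def generate_answer(question: str, chunks: list[dict]) -> GroundedAnswer:
    """Build a prompt from retrieved chunks and return a schema-validated answer."""
    context = "\n\n".join(f"[{c['id']}] {c['text']}" for c in chunks)
    result = agent.run_sync(f"Chunks:\n{context}\n\nQuestion: {question}")
    return result.output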

6. Testing Strategy

6.1 Test Categories

Category      Purpose     Examples
Unit          Chunking    Size/overlap correctness
Integration   Retrieval   Top-k relevance
Regression    Output      Schema validation

6.2 Critical Test Cases

  1. Answer includes evidence IDs.
  2. Empty retrieval yields safe fallback.
  3. Output validates against schema.
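
These cases map directly onto the earlier sketches. A minimal pytest version, assuming the illustrative names (GroundedAnswer, answer_or_fallback, NO_EVIDENCE_MESSAGE) live in src/answer.py:

# tests/test_critical_cases.py -- assumes the illustrative names from earlier sketches
import pytest
from pydantic import ValidationError

from src.answer import NO_EVIDENCE_MESSAGE, GroundedAnswer, answer_or_fallback


def test_answer_includes_evidence_ids():
    parsed = GroundedAnswer(answer="42", evidence_ids=["chunk-3"])
    assert parsed.evidence_ids  # at least one supporting chunk ID


def test_schema_rejects_missing_evidence():
    with pytest.raises(ValidationError):
        GroundedAnswer(answer="42", evidence_ids=[])  # min_length=1 must fail


def test_empty_retrieval_returns_fallback():
    result = answer_or_fallback("What is X?", [], generate=lambda q, c: "should not be called")
    assert result == NO_EVIDENCE_MESSAGE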

7. Common Pitfalls & Debugging

Pitfall               Symptom          Fix
Overlap too small     Missed context   Increase overlap
Hallucinated fields   Invalid output   Enforce schema
Slow retrieval        High latency     Reduce top-k

8. Extensions & Challenges

Beginner

  • Add PDF input support.
  • Add a CLI interface.

Intermediate

  • Add reranking.
  • Add citation coverage metrics.

Advanced

  • Add multi-document support.
  • Add evaluation dashboard.

9. Real-World Connections

  • Documentation assistants answer user questions about product manuals.
  • Compliance and audit workflows require answers that cite their source text.

10. Resources

  • PydanticAI documentation: https://ai.pydantic.dev
  • RAG system guides

11. Self-Assessment Checklist

  • I can build a RAG pipeline for a document.
  • I can validate outputs with Pydantic.
  • I can evaluate answer accuracy.

12. Submission / Completion Criteria

Minimum Completion:

  • Single-document RAG with schema output

Full Completion:

  • Evaluation on sample queries
  • Retrieval fallback

Excellence:

  • Reranking or citation metrics
  • Multi-document support

This guide was generated from project_based_ideas/AI_AGENTS_LLM_RAG/LEARN_PYDANTIC_AI.md.