Project 3: A Basic RAG System for a Single Document
Build a simple RAG pipeline that answers questions about a single document and returns schema-validated, evidence-backed answers.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 2: Intermediate |
| Time Estimate | 8-12 hours |
| Language | Python |
| Prerequisites | Embeddings basics, Pydantic models |
| Key Topics | retrieval, grounding, schema validation |
1. Learning Objectives
By completing this project, you will:
- Chunk a document and store embeddings.
- Retrieve relevant context for a query.
- Generate answers with validated structure.
- Enforce evidence references in outputs.
- Evaluate accuracy on test questions.
2. Theoretical Foundation
2.1 Single-Document RAG
Retrieval-Augmented Generation (RAG) retrieves relevant passages from a source and supplies them to the model as context, so answers stay grounded in the document rather than in the model's memory. Starting with a single document isolates chunking and retrieval issues before scaling to a larger corpus.
3. Project Specification
3.1 What You Will Build
A Q&A tool that answers questions about a specific document, with schema-validated output.
3.2 Functional Requirements
- Chunker with overlap.
- Embedding store for chunks (see the store sketch after this list).
- Retriever for top-k chunks.
- Answer generator with structured output.
- Evaluation on sample queries.
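The embedding store can start as a plain in-memory structure. The sketch below is one possible shape, not a prescribed design: `toy_embed` is a deterministic stand-in embedding (useful for the deterministic-runs requirement in 3.3) that you would swap for a real embedding model, and `ChunkStore` is a hypothetical name for whatever lives in `embed.py`.

```python
import hashlib


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Deterministic stand-in embedding (hash-bucketed token counts).

    Swap this for a real embedding model; it exists only so the pipeline
    can run deterministically in tests.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec


class ChunkStore:
    """In-memory store mapping chunk IDs to their text and embedding vector."""

    def __init__(self) -> None:
        self.chunks: list[dict] = []

    def add(self, chunk_id: str, text: str) -> None:
        self.chunks.append({"id": chunk_id, "text": text, "vector": toy_embed(text)})
```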
3.3 Non-Functional Requirements
- Deterministic runs for testing.
- Clear citation (evidence ID) fields in every output (see the schema sketch below).
- A safe fallback answer when retrieval returns nothing.
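A minimal output schema sketch follows, assuming the answer carries the answer text, the IDs of the chunks used as evidence, and a flag for the empty-retrieval fallback. The field names (`answer`, `evidence_ids`, `found`) are illustrative, not prescribed.

```python
from pydantic import BaseModel, Field


class GroundedAnswer(BaseModel):
    """Schema-validated answer with explicit evidence references."""

    answer: str = Field(description="Answer grounded in the retrieved chunks.")
    evidence_ids: list[str] = Field(
        default_factory=list,
        description="IDs of the chunks that support the answer.",
    )
    found: bool = Field(
        default=True,
        description="False when retrieval returned nothing usable (fallback case).",
    )


# Fallback used when retrieval comes back empty.
NO_ANSWER = GroundedAnswer(
    answer="The document does not contain enough information to answer this question.",
    evidence_ids=[],
    found=False,
)
```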
4. Solution Architecture
4.1 Components
| Component | Responsibility |
|---|---|
| Chunker | Split document |
| Embedder | Generate vectors |
| Retriever | Fetch relevant chunks |
| Answerer | Generate validated output |
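One way to pin down these component boundaries before implementing them is with typing protocols. The interfaces below are an assumption about how the modules in `src/` might fit together, not a required design.

```python
from typing import Protocol


class Chunker(Protocol):
    def chunk(self, text: str) -> list[dict]:
        """Split a document into chunks with IDs and text."""
        ...


class Embedder(Protocol):
    def embed(self, texts: list[str]) -> list[list[float]]:
        """Return one embedding vector per input text."""
        ...


class Retriever(Protocol):
    def retrieve(self, query: str, k: int = 4) -> list[dict]:
        """Return the top-k chunks most relevant to the query."""
        ...


class Answerer(Protocol):
    def answer(self, query: str, chunks: list[dict]) -> dict:
        """Generate an answer from retrieved chunks (a validated model in practice)."""
        ...
```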
5. Implementation Guide
5.1 Project Structure
LEARN_PYDANTIC_AI/P03-basic-rag/
├── src/
│ ├── chunk.py
│ ├── embed.py
│ ├── retrieve.py
│ ├── answer.py
│ └── eval.py
5.2 Implementation Phases
Phase 1: Chunking + embedding (4-6h)
- Chunk the document with overlap and embed each chunk (see the chunker sketch below).
- Checkpoint: chunks stored with stable IDs.
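A minimal chunker sketch for Phase 1, assuming fixed-size character windows with overlap; the `chunk_size` and `overlap` defaults are placeholders to tune against your document.

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 200) -> list[dict]:
    """Split text into overlapping character windows with stable IDs."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")

    chunks = []
    step = chunk_size - overlap
    for i, start in enumerate(range(0, len(text), step)):
        piece = text[start : start + chunk_size]
        if not piece.strip():
            continue
        chunks.append({"id": f"chunk-{i:04d}", "start": start, "text": piece})
    return chunks


if __name__ == "__main__":
    sample = "PydanticAI lets you validate model output against a schema. " * 40
    for c in chunk_text(sample)[:3]:
        print(c["id"], c["start"], len(c["text"]))
```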
Phase 2: Retrieval (2-3h)
- Retrieve the top-k most similar chunks for each query (see the retrieval sketch below).
- Checkpoint: relevant chunks returned for sample queries.
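A top-k retrieval sketch for Phase 2, assuming the query and the chunks were embedded with the same model and that each stored entry looks like `{"id": ..., "text": ..., "vector": [...]}` (the shape used in the store sketch in section 3.2).

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec: list[float], store: list[dict], k: int = 4) -> list[dict]:
    """Rank stored chunks by cosine similarity to the query embedding."""
    scored = sorted(store, key=lambda c: cosine(query_vec, c["vector"]), reverse=True)
    return scored[:k]
```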
Phase 3: Answering + eval (2-3h)
- Generate structured, schema-validated answers from retrieved chunks (see the agent sketch below).
- Evaluate accuracy on the test questions.
- Checkpoint: outputs include evidence IDs and pass schema validation.
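A sketch of Phase 3 answer generation with a PydanticAI agent, assuming a recent pydantic-ai release where the structured result type is passed at construction time. The parameter name (`output_type` vs. `result_type`) and the result attribute (`.output` vs. `.data`) differ across library versions, so check the PydanticAI docs for the version you have installed; the model name is also just an example.

```python
from pydantic import BaseModel, Field
from pydantic_ai import Agent


class GroundedAnswer(BaseModel):
    answer: str
    evidence_ids: list[str] = Field(default_factory=list)


# Assumed API: recent pydantic-ai versions take the structured result type as
# `output_type` (older releases call it `result_type`); adjust to your version.
agent = Agent(
    "openai:gpt-4o",
    output_type=GroundedAnswer,
    system_prompt=(
        "Answer using only the provided chunks. "
        "Cite the chunk IDs you used in evidence_ids."
    ),
)


def answer_question(question: str, chunks: list[dict]) -> GroundedAnswer:
    """Build a grounded prompt from retrieved chunks and return a validated answer."""
    context = "\n\n".join(f"[{c['id']}] {c['text']}" for c in chunks)
    result = agent.run_sync(f"Chunks:\n{context}\n\nQuestion: {question}")
    return result.output  # `.data` in older pydantic-ai releases
```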
6. Testing Strategy
6.1 Test Categories
| Category | Focus | Example checks |
|---|---|---|
| Unit | Chunking | Size and overlap correctness |
| Integration | Retrieval | Top-k relevance on sample queries |
| Regression | Output | Schema validation of generated answers |
6.2 Critical Test Cases
- Answer includes evidence IDs.
- Empty retrieval yields safe fallback.
- Output validates against the Pydantic schema (see the pytest sketch below).
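A pytest sketch covering these cases under stated assumptions: the schema mirrors the `GroundedAnswer` model sketched in section 3.3, and `answer_question` here is a local stand-in for the real pipeline entry point, which you would import from your own modules instead.

```python
import pytest
from pydantic import BaseModel, Field, ValidationError


class GroundedAnswer(BaseModel):
    # Mirrors the output schema from section 3.3; import yours instead.
    answer: str
    evidence_ids: list[str] = Field(default_factory=list)
    found: bool = True


def answer_question(question: str, chunks: list[dict]) -> GroundedAnswer:
    # Stand-in for the real pipeline: empty retrieval must yield the safe fallback.
    if not chunks:
        return GroundedAnswer(answer="Not found in the document.", found=False)
    return GroundedAnswer(answer="...", evidence_ids=[c["id"] for c in chunks])


def test_answer_includes_evidence_ids():
    out = answer_question("What is X?", [{"id": "chunk-0001", "text": "X is ..."}])
    assert out.evidence_ids == ["chunk-0001"]


def test_empty_retrieval_falls_back_safely():
    out = answer_question("Unknown topic?", [])
    assert out.found is False and out.evidence_ids == []


def test_raw_model_output_validates_against_schema():
    raw = '{"answer": "X is ...", "evidence_ids": ["chunk-0001"], "found": true}'
    assert GroundedAnswer.model_validate_json(raw).found is True


def test_missing_answer_field_is_rejected():
    with pytest.raises(ValidationError):
        GroundedAnswer.model_validate_json('{"evidence_ids": []}')
```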
7. Common Pitfalls & Debugging
| Pitfall | Symptom | Fix |
|---|---|---|
| Overlap too small | missed context | increase overlap |
| Hallucinated fields | invalid output | enforce schema |
| Slow retrieval | high latency | reduce top-k or cache chunk embeddings |
8. Extensions & Challenges
Beginner
- Add PDF input support.
- Add a CLI interface.
Intermediate
- Add reranking.
- Add citation coverage metrics.
Advanced
- Add multi-document support.
- Add evaluation dashboard.
9. Real-World Connections
- Documentation assistants answer questions about product manuals and internal wikis.
- Compliance and audit workflows require answers backed by explicit citations.
10. Resources
- PydanticAI docs
- RAG system guides
11. Self-Assessment Checklist
- I can build a RAG pipeline for a document.
- I can validate outputs with Pydantic.
- I can evaluate answer accuracy.
12. Submission / Completion Criteria
Minimum Completion:
- Single-document RAG with schema output
Full Completion:
- Evaluation on sample queries
- Retrieval fallback
Excellence:
- Reranking or citation metrics
- Multi-document support
This guide was generated from project_based_ideas/AI_AGENTS_LLM_RAG/LEARN_PYDANTIC_AI.md.