Project 10: End-to-End Research Assistant Agent
Build a full research assistant that plans, retrieves sources, synthesizes evidence, and produces a cited report with safety checks.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 4: Expert |
| Time Estimate | 20–40 hours |
| Language | Python or JavaScript |
| Prerequisites | Projects 2–9, retrieval basics |
| Key Topics | planning, RAG, citations, provenance, safety |
Learning Objectives
By completing this project, you will:
- Plan multi-step research tasks from a question.
- Retrieve and rank sources with relevance scoring.
- Synthesize evidence into grounded answers.
- Enforce citations on every claim and refuse to answer when evidence is missing.
- Evaluate report quality with a rubric.
The Core Question You’re Answering
“How do you build a research agent that refuses to guess and always shows evidence?”
This is the difference between a chatbot and a research tool.
Concepts You Must Understand First
| Concept | Why It Matters | Where to Learn |
|---|---|---|
| Retrieval grounding | Reduces hallucinations | RAG design guides |
| Provenance | Auditability of claims | Data lineage basics |
| Planning | Decompose research tasks | AI planning references |
| Citation enforcement | Trustworthy outputs | QA system design |
Theoretical Foundation
Research Pipeline
Question -> Plan -> Retrieve -> Synthesize -> Cite -> Report
Every claim must trace back to an evidence source.
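The pipeline above can be sketched as a chain of pluggable stages. The names here (`run_pipeline`, `Evidence`, `Finding`) are illustrative assumptions, not a prescribed API; the key property is the final filter, which drops any claim whose citations do not trace back to retrieved evidence.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    source_id: str
    text: str
    score: float

@dataclass
class Finding:
    claim: str
    citations: list  # source_ids backing the claim

def run_pipeline(question, plan, retrieve, synthesize):
    """Chain plan -> retrieve -> synthesize, then enforce grounding."""
    sub_questions = plan(question)
    evidence = [ev for sq in sub_questions for ev in retrieve(sq)]
    findings = synthesize(question, evidence)
    # Citation enforcement: keep only claims fully backed by retrieved sources.
    known = {ev.source_id for ev in evidence}
    return [f for f in findings if f.citations and set(f.citations) <= known]
```

Because each stage is passed in as a function, you can swap in a fixed dataset and deterministic stubs for evaluation mode without changing the pipeline itself.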
Project Specification
What You’ll Build
A research agent that answers questions by retrieving sources and producing a cited report.
Functional Requirements
- Planning: break question into sub-questions
- Retrieval: gather sources and score relevance
- Synthesis: generate claims with citations
- Refusal: block unsupported claims
- Evaluation: rubric-based scoring
Non-Functional Requirements
- Deterministic evaluation mode
- Traceable source storage
- Clear failure handling
Real World Outcome
Example report structure:
```json
{
  "question": "What caused Event X?",
  "summary": "...",
  "findings": [
    {"claim": "Cause A", "citations": ["src_12"]}
  ],
  "limitations": "No evidence for claim B",
  "sources": ["src_12", "src_19"]
}
```
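A validator for the report structure above is a cheap first safety check: it catches missing fields, uncited findings, and citations that point outside the report's own source list. This is a minimal sketch; a real implementation might use a JSON Schema instead.

```python
def validate_report(report: dict) -> list:
    """Return a list of problems; an empty list means the report passes."""
    problems = []
    for key in ("question", "summary", "findings", "limitations", "sources"):
        if key not in report:
            problems.append(f"missing field: {key}")
    sources = set(report.get("sources", []))
    for i, finding in enumerate(report.get("findings", [])):
        cites = finding.get("citations", [])
        if not cites:
            problems.append(f"finding {i} has no citations")
        for c in cites:
            if c not in sources:
                problems.append(f"finding {i} cites unknown source {c}")
    return problems
```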
Architecture Overview
```
┌──────────────┐  plan   ┌──────────────┐
│   Planner    │────────▶│  Retriever   │
└──────┬───────┘         └──────┬───────┘
       │                        ▼
       ▼                 ┌──────────────┐
┌──────────────┐         │ Synthesizer  │
│  Provenance  │◀────────│ + Citations  │
└──────────────┘         └──────────────┘
```
Implementation Guide
Phase 1: Planning + Retrieval (6–10h)
- Generate sub-questions
- Retrieve relevant sources
- Checkpoint: source list with scores
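One way to produce the Phase 1 checkpoint (a source list with scores) is simple lexical overlap scoring. A production system would use BM25 or embeddings; this Jaccard-similarity baseline is an illustrative starting point you can replace later.

```python
def score_sources(sub_question: str, sources: dict) -> list:
    """Rank sources by Jaccard token overlap with the sub-question.

    sources maps source_id -> document text.
    Returns (source_id, score) pairs, best first.
    """
    q_tokens = set(sub_question.lower().split())
    scored = []
    for source_id, text in sources.items():
        s_tokens = set(text.lower().split())
        union = q_tokens | s_tokens
        score = len(q_tokens & s_tokens) / len(union) if union else 0.0
        scored.append((source_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```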
Phase 2: Synthesis + Citations (6–12h)
- Produce claims with citations
- Checkpoint: each claim has evidence
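To meet the Phase 2 checkpoint, every claim must be tied to evidence before it enters the report. A crude but deterministic way to do this (useful as a backstop even when the LLM emits its own citations) is content-word overlap between claim and source text; the threshold here is an assumption you would tune.

```python
def attach_citations(claim: str, evidence: list, min_overlap: int = 2) -> list:
    """Cite every source sharing at least min_overlap words with the claim.

    evidence is a list of (source_id, text) pairs.
    A claim that returns [] has no support and should be dropped or refused.
    """
    claim_tokens = set(claim.lower().split())
    cites = []
    for source_id, text in evidence:
        if len(claim_tokens & set(text.lower().split())) >= min_overlap:
            cites.append(source_id)
    return cites
```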
Phase 3: Safety + Evaluation (6–12h)
- Enforce refusal for missing evidence
- Score with rubric
- Checkpoint: report meets rubric
Common Pitfalls & Debugging
| Pitfall | Symptom | Fix |
|---|---|---|
| Hallucinated citations | fake sources | validate against index |
| Overconfident claims | missing evidence | enforce refusal mode |
| Poor retrieval | weak answers | tune chunking/retrieval |
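The fix for hallucinated citations in the table above is to validate every citation against the actual source index. A minimal sketch of that check, assuming findings in the report schema shown earlier:

```python
def validate_citations(findings: list, index_ids: set) -> list:
    """Flag (claim, citation) pairs whose citation is not in the source index."""
    bad = []
    for finding in findings:
        for cite in finding.get("citations", []):
            if cite not in index_ids:
                bad.append((finding["claim"], cite))
    return bad
```

Run this before the report is emitted; any non-empty result should fail the report rather than ship a fabricated reference.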
Interview Questions They’ll Ask
- How do you enforce citations in generated text?
- What do you do when retrieval returns nothing?
- How do you evaluate research quality?
Hints in Layers
- Hint 1: Start with a fixed dataset and queries.
- Hint 2: Require citations in the output schema.
- Hint 3: Block claims without evidence.
- Hint 4: Build a rubric for evaluation.
Learning Milestones
- Grounded: every claim has a citation.
- Safe: unsupported questions trigger refusal.
- Measured: evaluation rubric scores reports.
Submission / Completion Criteria
Minimum Completion
- End-to-end research pipeline
Full Completion
- Citation enforcement + evaluation
Excellence
- Reranking or multi-source consensus
- Monitoring dashboard
This guide was generated from project_based_ideas/AI_AGENTS_LLM_RAG/AI_AGENTS_PROJECTS.md.