Project 10: End-to-End Research Assistant Agent

Build a full research assistant that plans, retrieves sources, synthesizes evidence, and produces a cited report with safety checks.


Quick Reference

Attribute       Value
Difficulty      Level 4: Expert
Time Estimate   20–40 hours
Language        Python or JavaScript
Prerequisites   Projects 2–9, retrieval basics
Key Topics      planning, RAG, citations, provenance, safety

Learning Objectives

By completing this project, you will:

  1. Plan multi-step research tasks from a question.
  2. Retrieve and rank sources with relevance scoring.
  3. Synthesize evidence into grounded answers.
  4. Enforce citations and refuse to answer when evidence is missing.
  5. Evaluate report quality with a rubric.

The Core Question You’re Answering

“How do you build a research agent that refuses to guess and always shows evidence?”

This is the difference between a chatbot and a research tool.


Concepts You Must Understand First

Concept                Why It Matters             Where to Learn
Retrieval grounding    Reduces hallucinations     RAG design guides
Provenance             Auditability of claims     Data lineage basics
Planning               Decompose research tasks   AI planning references
Citation enforcement   Trustworthy outputs        QA system design

Theoretical Foundation

Research Pipeline

Question -> Plan -> Retrieve -> Synthesize -> Cite -> Report

Every claim must trace back to an evidence source.
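
A minimal skeleton of that pipeline, with every stage stubbed out (the stubs are placeholder assumptions, not a prescribed implementation; the phases below replace them):

def plan(question: str) -> list[str]:
    return [question]  # stub: treat the question as its own sub-question

def retrieve(sub_questions: list[str]) -> list[dict]:
    return []  # stub: no corpus wired in yet

def synthesize(sub_questions: list[str], sources: list[dict]) -> list[dict]:
    # Every finding carries citations; no evidence means no claims.
    return [{"claim": s["text"], "citations": [s["id"]]} for s in sources]

def run(question: str) -> dict:
    subs = plan(question)
    sources = retrieve(subs)
    findings = synthesize(subs, sources)
    cited = sorted({c for f in findings for c in f["citations"]})
    return {"question": question, "findings": findings, "sources": cited}

print(run("What caused Event X?"))  # empty findings until Phase 1 is built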


Project Specification

What You’ll Build

A research agent that answers questions by retrieving sources and producing a cited report.

Functional Requirements

  1. Planning: break the question into sub-questions (see the planner sketch after this list)
  2. Retrieval: gather sources and score relevance
  3. Synthesis: generate claims with citations
  4. Refusal: block unsupported claims
  5. Evaluation: rubric-based scoring
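
A sketch of requirement 1. Here `llm` stands in for whichever model client you use (a hypothetical callable, not a real API), and the prompt wording and parsing are assumptions to adapt:

def plan_sub_questions(question: str, llm) -> list[str]:
    prompt = (
        "Decompose this research question into 3-5 answerable "
        "sub-questions, one per line:\n" + question
    )
    lines = llm(prompt).splitlines()
    return [ln.lstrip("- ").strip() for ln in lines if ln.strip()]

# Canned stand-in for the model, just to show the shape of the output:
fake_llm = lambda _: "- What is Event X?\n- What do sources say caused it?"
print(plan_sub_questions("What caused Event X?", fake_llm))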

Non-Functional Requirements

  • Deterministic evaluation mode (see the config sketch after this list)
  • Traceable source storage
  • Clear failure handling
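
One way to pin down the deterministic mode (every value here is an illustrative assumption, including the snapshot path): freeze the corpus snapshot, the sampling temperature, and the random seed so two runs yield identical reports.

import random
from dataclasses import dataclass

@dataclass(frozen=True)
class EvalConfig:
    corpus_snapshot: str = "corpus_snapshot.jsonl"  # frozen source set
    temperature: float = 0.0                        # greedy decoding
    seed: int = 42                                  # fixed RNG seed

cfg = EvalConfig()
random.seed(cfg.seed)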

Real World Outcome

Example report structure:

{
  "question": "What caused Event X?",
  "summary": "...",
  "findings": [
    {"claim": "Cause A", "citations": ["src_12"]}
  ],
  "limitations": "No evidence for claim B",
  "sources": ["src_12", "src_19"]
}
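
The same structure as typed Python, so malformed reports fail early (TypedDict is one option; a pydantic model works just as well):

from typing import TypedDict

class Finding(TypedDict):
    claim: str
    citations: list[str]

class Report(TypedDict):
    question: str
    summary: str
    findings: list[Finding]
    limitations: str
    sources: list[str]

report: Report = {
    "question": "What caused Event X?",
    "summary": "...",
    "findings": [{"claim": "Cause A", "citations": ["src_12"]}],
    "limitations": "No evidence for claim B",
    "sources": ["src_12", "src_19"],
}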

Architecture Overview

┌──────────────┐   plan    ┌──────────────┐
│ Planner      │──────────▶│ Retriever    │
└──────┬───────┘           └──────┬───────┘
       │                          │
       ▼                          ▼
┌──────────────┐           ┌──────────────┐
│ Provenance   │◀──────────│ Synthesizer  │
└──────────────┘           │ + Citations  │
                           └──────────────┘

Implementation Guide

Phase 1: Planning + Retrieval (6–10h)

  • Generate sub-questions
  • Retrieve relevant sources
  • Checkpoint: source list with relevance scores (see the scoring sketch below)
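
A deliberately simple scorer for that checkpoint: cosine similarity over term-frequency vectors, stdlib only. A production system would reach for BM25 or embeddings; this is just enough to rank a source list.

import math
from collections import Counter

def tf_vector(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def rank_sources(query: str, sources: dict[str, str]) -> list[tuple[str, float]]:
    q = tf_vector(query)
    ranked = [(sid, cosine(q, tf_vector(text))) for sid, text in sources.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Hypothetical two-document corpus, just to show the ranked output:
corpus = {"src_12": "Event X was caused by factor A", "src_19": "unrelated text"}
print(rank_sources("what caused event x", corpus))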

Phase 2: Synthesis + Citations (6–12h)

  • Produce claims with citations
  • Checkpoint: every claim carries at least one citation (see the synthesis sketch below)
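
A Phase 2 sketch: force citations through the output schema rather than trusting the model's prose. `llm` is again a hypothetical client; the key move is parsing structured output and dropping uncited claims.

import json

def synthesize_findings(sub_question: str, evidence: dict[str, str], llm) -> list[dict]:
    prompt = (
        "Answer using ONLY this evidence. Return JSON in the form "
        '[{"claim": "...", "citations": ["src_id"]}]\n'
        f"Question: {sub_question}\nEvidence: {json.dumps(evidence)}"
    )
    findings = json.loads(llm(prompt))
    # Structural enforcement: a claim without a citation never reaches the report.
    return [f for f in findings if f.get("citations")]

# Canned model output to show the filter at work (Cause B is dropped):
fake_llm = lambda _: '[{"claim": "Cause A", "citations": ["src_12"]}, {"claim": "Cause B", "citations": []}]'
print(synthesize_findings("What caused Event X?", {"src_12": "..."}, fake_llm))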

Phase 3: Safety + Evaluation (6–12h)

  • Enforce refusal for missing evidence
  • Score with rubric
  • Checkpoint: report meets the rubric (see the refusal-and-scoring sketch below)
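
A Phase 3 sketch: refuse when there is no supported finding, then score the report against a small rubric. The rubric checks here are assumptions to extend with your own criteria.

def answer_or_refuse(findings: list[dict]) -> dict:
    # Refusal is a first-class outcome, not an error path.
    if not findings:
        return {"refusal": "Insufficient evidence to answer this question."}
    return {"findings": findings}

RUBRIC = {
    "every_claim_cited": lambda r: bool(r.get("findings")) and all(f["citations"] for f in r["findings"]),
    "limitations_stated": lambda r: bool(r.get("limitations")),
    "sources_listed": lambda r: bool(r.get("sources")),
}

def rubric_score(report: dict) -> float:
    return sum(check(report) for check in RUBRIC.values()) / len(RUBRIC)

report = {"findings": [{"claim": "Cause A", "citations": ["src_12"]}],
          "limitations": "No evidence for claim B", "sources": ["src_12"]}
print(rubric_score(report))  # 1.0 when all rubric checks pass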

Common Pitfalls & Debugging

Pitfall                  Symptom            Fix
Hallucinated citations   fake sources       validate against index (sketch below)
Overconfident claims     missing evidence   enforce refusal mode
Poor retrieval           weak answers       tune chunking/retrieval
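
For the first pitfall, one simple check: a citation is only valid if its id exists in the retrieval index you actually searched. Anything else was fabricated by the model.

def validate_citations(findings: list[dict], index_ids: set[str]) -> list[str]:
    # Return fabricated citation ids; an empty list means the report is clean.
    return [c for f in findings for c in f["citations"] if c not in index_ids]

fabricated = validate_citations(
    [{"claim": "Cause A", "citations": ["src_12", "src_99"]}],
    index_ids={"src_12", "src_19"},
)
print(fabricated)  # ['src_99'] -> reject the claim or re-run synthesis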

Interview Questions They’ll Ask

  1. How do you enforce citations in generated text?
  2. What do you do when retrieval returns nothing?
  3. How do you evaluate research quality?

Hints in Layers

  • Hint 1: Start with a fixed dataset and queries.
  • Hint 2: Require citations in the output schema.
  • Hint 3: Block claims without evidence.
  • Hint 4: Build a rubric for evaluation.

Learning Milestones

  1. Grounded: every claim has a citation.
  2. Safe: unsupported questions trigger refusal.
  3. Measured: evaluation rubric scores reports.

Submission / Completion Criteria

Minimum Completion

  • End-to-end research pipeline

Full Completion

  • Citation enforcement + evaluation

Excellence

  • Reranking or multi-source consensus
  • Monitoring dashboard

This guide was generated from project_based_ideas/AI_AGENTS_LLM_RAG/AI_AGENTS_PROJECTS.md.