Project 2: Conversation Episode Store
Build a storage system that captures conversations as “episodes” with embeddings, timestamps, and metadata—the foundation for episodic memory in AI agents.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 2: Intermediate |
| Time Estimate | 1 week (15-20 hours) |
| Language | Python (Alternatives: TypeScript, Go) |
| Prerequisites | Project 1 (graph basics), vector embeddings understanding, async Python |
| Key Topics | Episodic memory, vector embeddings, chunking strategies, time-series data, hybrid storage |
1. Learning Objectives
By completing this project, you will:
- Understand the difference between episodic and semantic memory in AI systems.
- Design a storage schema that captures both raw conversations and their embeddings.
- Implement efficient chunking strategies for conversation data.
- Build a retrieval system that combines recency and semantic similarity.
- Create the foundation for temporal queries (“what did we discuss last week?”).
2. Theoretical Foundation
2.1 Core Concepts
- Episodic Memory: Stores specific events/experiences with temporal context. In AI, this means raw conversation turns with timestamps, not just extracted facts.
- Vector Embeddings: Dense numerical representations of text that capture semantic meaning. Similar texts have similar vectors (close in embedding space).
- Chunking Strategy: How you split conversations affects retrieval quality. Options: by turn, by time window, by topic shift, by token count.
- Hybrid Storage: Combining graph (relationships), vector (semantics), and traditional (metadata) storage for complete memory.
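To make the embedding idea concrete, here is a toy illustration of cosine similarity, the comparison used throughout this project. The 4-dimensional vectors are stand-ins for real model output (e.g. the 384 dimensions produced by all-MiniLM-L6-v2); the values are invented for illustration:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity in [-1, 1]; values near 1.0 mean semantically close."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dim "embeddings" standing in for real 384-dim model output
neo4j_q   = np.array([0.9, 0.1, 0.0, 0.2])  # "How do I connect to Neo4j?"
db_q      = np.array([0.8, 0.2, 0.1, 0.3])  # "Connecting to a graph database"
weather_q = np.array([0.0, 0.9, 0.8, 0.1])  # "Will it rain tomorrow?"

# The two database questions land closer together in embedding space
assert cosine_similarity(neo4j_q, db_q) > cosine_similarity(neo4j_q, weather_q)
```

This is exactly the comparison the vector store performs at query time, just over thousands of stored episode embeddings at once.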
2.2 Why This Matters
Raw conversations contain nuance that extracted facts lose:
- Tone and sentiment (“user was frustrated”)
- Context (“this was discussed after the meeting”)
- Uncertainty (“user mentioned they might change their mind”)
Episodic memory preserves this richness for later retrieval and synthesis.
2.3 Common Misconceptions
- “Just store everything in vectors.” Vectors alone lose structure, time, and exact wording.
- “Chunking doesn’t matter.” Poor chunking fragments context and hurts retrieval.
- “Recency is enough.” Users often ask about semantically relevant old conversations.
2.4 ASCII Diagram: Episode Structure
CONVERSATION SESSION
====================
          ┌────────────────────────────────────────┐
          │ Session: sess_001                      │
          │ Started: 2024-12-15T10:00:00Z          │
          │ User: user_123                         │
          └────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
│ Episode 1       │   │ Episode 2       │   │ Episode 3       │
│ Turns: 1-5      │──▶│ Turns: 6-12     │──▶│ Turns: 13-18    │
│ Time: 10:00-02  │   │ Time: 10:03-08  │   │ Time: 10:09-15  │
│ Topic: greeting │   │ Topic: API help │   │ Topic: debugging│
│                 │   │                 │   │                 │
│ ┌─────────────┐ │   │ ┌─────────────┐ │   │ ┌─────────────┐ │
│ │ Embedding   │ │   │ │ Embedding   │ │   │ │ Embedding   │ │
│ │ [0.12, ...] │ │   │ │ [0.45, ...] │ │   │ │ [0.78, ...] │ │
│ └─────────────┘ │   │ └─────────────┘ │   │ └─────────────┘ │
└─────────────────┘   └─────────────────┘   └─────────────────┘
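One way to represent the episode structure above in code is a plain dataclass sketch. The project's dependency list includes Pydantic, so the real models.py would likely use `pydantic.BaseModel` instead; field names here simply mirror the diagram:

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Turn:
    role: str      # "user" or "assistant"
    content: str

@dataclass
class Episode:
    id: str
    session_id: str
    turns: list            # list of Turn objects
    started_at: datetime
    ended_at: datetime
    topic: str = ""
    embedding: list = field(default_factory=list)  # e.g. 384 floats

ep = Episode(
    id="ep_001",
    session_id="sess_001",
    turns=[Turn("user", "Hi"), Turn("assistant", "Hello!")],
    started_at=datetime(2024, 12, 15, 10, 0),
    ended_at=datetime(2024, 12, 15, 10, 2),
    topic="greeting",
)
```

Keeping the embedding on the model is optional; in the architecture below it lives in the vector store, keyed by episode id.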
3. Project Specification
3.1 What You Will Build
A Python library and CLI for storing and retrieving conversation episodes:
- Store conversations as episodes with embeddings
- Retrieve by recency, semantic similarity, or both
- Support multiple users/sessions
- Export episodes for analysis
3.2 Functional Requirements
- Store episode: `store.add_episode(session_id, user_id, turns, metadata)`
- Retrieve by recency: `store.get_recent(user_id, limit=10)`
- Retrieve by similarity: `store.search(query, user_id, top_k=5)`
- Hybrid retrieval: `store.retrieve(query, user_id, recency_weight=0.3)`
- List sessions: `store.get_sessions(user_id)`
- Export session: `store.export(session_id, format='json')`
3.3 Non-Functional Requirements
- Latency: Search queries < 200ms for 10K episodes
- Scalability: Handle 100K+ episodes per user
- Isolation: Strict user data separation
3.4 Example Usage / Output
from episode_store import EpisodeStore

store = EpisodeStore()

# Store a conversation episode
episode = store.add_episode(
    session_id="sess_001",
    user_id="user_123",
    turns=[
        {"role": "user", "content": "How do I connect to Neo4j?"},
        {"role": "assistant", "content": "You can use the neo4j Python driver..."},
        {"role": "user", "content": "What about authentication?"},
        {"role": "assistant", "content": "Use the NEO4J_AUTH environment variable..."}
    ],
    metadata={"topic": "neo4j_setup", "sentiment": "neutral"}
)
print(f"Stored episode {episode.id} with {len(episode.turns)} turns")

# Retrieve by semantic similarity
results = store.search("database connection", user_id="user_123", top_k=3)
for r in results:
    print(f"[{r.score:.2f}] {r.episode.summary[:50]}...")

# Hybrid retrieval
results = store.retrieve(
    query="Neo4j authentication",
    user_id="user_123",
    recency_weight=0.3,
    top_k=5
)
CLI Example:
$ episode-store search "API rate limiting" --user user_123 --top-k 3

Found 3 relevant episodes:

1. [0.89] Session sess_045 (2024-12-10)
   Topic: API design discussion
   "...we talked about implementing rate limiting with Redis..."

2. [0.82] Session sess_032 (2024-11-28)
   Topic: Backend architecture
   "...rate limiting was mentioned as a future consideration..."

3. [0.76] Session sess_012 (2024-10-15)
   Topic: API security review
   "...discussed rate limiting as a security measure..."
4. Solution Architecture
4.1 High-Level Design
┌───────────────────┐
│ EpisodeStore API  │
└─────────┬─────────┘
          │
    ┌─────┴─────┐
    │           │
    ▼           ▼
┌────────┐  ┌──────────┐
│ SQLite │  │  Vector  │
│ (meta) │  │  Store   │
└────────┘  │ (embed)  │
            └──────────┘
4.2 Key Components
| Component | Responsibility | Technology |
|---|---|---|
| EpisodeStore | Main API, orchestration | Python class |
| MetadataStore | Sessions, episodes, metadata | SQLite |
| VectorStore | Embeddings, similarity search | ChromaDB/FAISS |
| Embedder | Text → vector conversion | sentence-transformers |
| Chunker | Split conversations into episodes | Custom logic |
4.3 Data Model
-- Sessions table
CREATE TABLE sessions (
    id TEXT PRIMARY KEY,
    user_id TEXT NOT NULL,
    started_at TIMESTAMP,
    ended_at TIMESTAMP,
    metadata JSON
);

-- Episodes table
CREATE TABLE episodes (
    id TEXT PRIMARY KEY,
    session_id TEXT REFERENCES sessions(id),
    sequence_num INTEGER,
    content TEXT,        -- JSON array of turns
    summary TEXT,
    created_at TIMESTAMP,
    token_count INTEGER,
    metadata JSON
);
-- Vector store (in ChromaDB)
-- Collection: episodes
-- Documents: episode content
-- Embeddings: episode vectors
-- Metadata: episode_id, session_id, user_id, timestamp
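The relational half of this data model can be exercised directly with Python's stdlib `sqlite3`; a minimal sketch against an in-memory database (swap `:memory:` for a file path in the real MetadataStore):

```python
import json
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")  # use a file path for persistence
conn.executescript("""
CREATE TABLE sessions (
    id TEXT PRIMARY KEY,
    user_id TEXT NOT NULL,
    started_at TIMESTAMP,
    ended_at TIMESTAMP,
    metadata JSON
);
CREATE TABLE episodes (
    id TEXT PRIMARY KEY,
    session_id TEXT REFERENCES sessions(id),
    sequence_num INTEGER,
    content TEXT,        -- JSON array of turns
    summary TEXT,
    created_at TIMESTAMP,
    token_count INTEGER,
    metadata JSON
);
""")

now = datetime.now(timezone.utc).isoformat()
conn.execute("INSERT INTO sessions VALUES (?, ?, ?, ?, ?)",
             ("sess_001", "user_123", now, None, "{}"))

# Turns are serialized as a JSON array into the content column
turns = [{"role": "user", "content": "How do I connect to Neo4j?"}]
conn.execute("INSERT INTO episodes VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
             ("ep_001", "sess_001", 1, json.dumps(turns),
              "Neo4j connection question", now, 12, "{}"))

row = conn.execute(
    "SELECT session_id FROM episodes WHERE id = ?", ("ep_001",)).fetchone()
assert row[0] == "sess_001"
```

The vector store holds only embeddings plus the ids needed to join back to these rows, so each side stays simple.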
4.4 Chunking Algorithm
Strategy: Sliding Window with Topic Detection
- Start with first N turns
- Compute embedding similarity between consecutive turns
- If similarity drops below threshold, start new episode
- Ensure minimum/maximum episode sizes
- Overlap edges for context continuity
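The steps above can be sketched as a small function. `chunk_turns` and `word_overlap` are illustrative names; the word-overlap similarity is a cheap stand-in for the embedding cosine similarity you would use in practice:

```python
def chunk_turns(turns, similarity, threshold=0.5, min_size=2, max_size=8):
    """Split turns into episodes at topic shifts.

    `similarity(a, b)` is any pairwise text-similarity function, e.g.
    cosine similarity of turn embeddings. A new episode starts when
    similarity between consecutive turns drops below `threshold`,
    subject to the min/max episode sizes.
    """
    if not turns:
        return []
    episodes, current = [], [turns[0]]
    for prev, turn in zip(turns, turns[1:]):
        boundary = similarity(prev["content"], turn["content"]) < threshold
        if (boundary and len(current) >= min_size) or len(current) >= max_size:
            episodes.append(current)
            current = [current[-1]]  # overlap one turn for context continuity
        current.append(turn)
    episodes.append(current)
    return episodes

def word_overlap(a, b):
    """Toy Jaccard similarity over words; replace with embedding cosine."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

turns = [{"content": "hi there"}, {"content": "hi how are you"},
         {"content": "tell me about neo4j drivers"},
         {"content": "neo4j drivers support auth"}]
eps = chunk_turns(turns, word_overlap)
```

Note the one-turn overlap at each boundary: it trades a little storage for retrieval results that carry the lead-in to each topic.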
5. Implementation Guide
5.1 Development Environment Setup
# Create project
mkdir episode-store && cd episode-store
python -m venv .venv && source .venv/bin/activate
# Install dependencies
pip install chromadb sentence-transformers sqlalchemy pydantic click
# Verify embedding model
python -c "from sentence_transformers import SentenceTransformer; m = SentenceTransformer('all-MiniLM-L6-v2'); print(m.encode('test').shape)"
5.2 Project Structure
episode-store/
├── src/
│   ├── __init__.py
│   ├── store.py       # Main EpisodeStore class
│   ├── metadata.py    # SQLite metadata storage
│   ├── vectors.py     # ChromaDB vector operations
│   ├── embedder.py    # Embedding generation
│   ├── chunker.py     # Conversation chunking
│   └── models.py      # Pydantic models
├── cli/
│   └── main.py        # Click CLI
├── tests/
└── README.md
5.3 Implementation Phases
Phase 1: Basic Storage (4-5h)
Goals:
- Store episodes with metadata
- Generate and store embeddings
Tasks:
- Set up SQLite schema for sessions/episodes
- Integrate sentence-transformers for embedding
- Set up ChromaDB for vector storage
- Implement the `add_episode` method
Checkpoint: Can store and retrieve an episode.
Phase 2: Retrieval Methods (4-5h)
Goals:
- Implement similarity search
- Implement recency-based retrieval
- Combine into hybrid retrieval
Tasks:
- Implement `search` with vector similarity
- Implement `get_recent` with timestamp ordering
- Implement hybrid RRF fusion
- Add user isolation to all queries
Checkpoint: All three retrieval methods working.
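The RRF fusion task can be sketched as follows. This is a weighted variant of Reciprocal Rank Fusion so that the `recency_weight` parameter from section 3.2 has somewhere to go; the function and list names are illustrative:

```python
def rrf_fuse(rankings, weights=None, k=60):
    """Weighted Reciprocal Rank Fusion over ranked lists of episode IDs.

    Each ranking is best-first. An item's fused score is the sum of
    weight / (k + rank) over every list it appears in; k=60 is the
    conventional RRF constant.
    """
    weights = weights or [1.0] * len(rankings)
    scores = {}
    for weight, ranking in zip(weights, rankings):
        for rank, episode_id in enumerate(ranking, start=1):
            scores[episode_id] = scores.get(episode_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

by_similarity = ["ep_7", "ep_3", "ep_9"]   # vector search order
by_recency    = ["ep_3", "ep_9", "ep_7"]   # newest first

# recency_weight=0.3 maps to weights (0.7, 0.3) over the two lists
fused = rrf_fuse([by_similarity, by_recency], weights=[0.7, 0.3])
```

Rank-based fusion like this sidesteps the problem that cosine scores and timestamps live on incomparable scales; only the orderings matter.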
Phase 3: Chunking and Polish (4-5h)
Goals:
- Intelligent episode chunking
- CLI interface
- Export functionality
Tasks:
- Implement topic-based chunking
- Build Click CLI
- Add JSON/CSV export
- Write documentation
Checkpoint: Full functionality with CLI.
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit | Test individual components | Chunker logic, embedding generation |
| Integration | Test storage operations | CRUD with both stores |
| Retrieval | Test search quality | Precision/recall on test queries |
6.2 Critical Test Cases
- User isolation: User A cannot see User B’s episodes
- Embedding consistency: Same text produces same embedding
- Hybrid ranking: Recency weight affects final ordering
- Chunking boundaries: Topics are correctly separated
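The user-isolation case can be written as a small test. `FakeStore` below is a minimal in-memory stand-in so the sketch is self-contained; the real test would import and exercise your `EpisodeStore` instead:

```python
# Minimal in-memory stand-in for EpisodeStore, for illustration only
class FakeStore:
    def __init__(self):
        self._episodes = []

    def add_episode(self, session_id, user_id, turns, metadata=None):
        self._episodes.append({"user_id": user_id, "turns": turns})

    def get_recent(self, user_id, limit=10):
        # The user_id filter under test: every query must include it
        mine = [e for e in self._episodes if e["user_id"] == user_id]
        return mine[-limit:]

def test_user_isolation():
    store = FakeStore()
    store.add_episode("sess_a", "user_a", [{"role": "user", "content": "secret"}])
    store.add_episode("sess_b", "user_b", [{"role": "user", "content": "hello"}])
    results = store.get_recent("user_b")
    assert results, "user_b should see their own episodes"
    assert all(e["user_id"] == "user_b" for e in results)

test_user_isolation()
```

The same assertion pattern applies to `search` and `retrieve`: seed two users, query as one, and verify nothing from the other leaks through.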
7. Common Pitfalls & Debugging
| Pitfall | Symptom | Solution |
|---|---|---|
| Embedding model too large | Slow startup, OOM | Use smaller model (MiniLM-L6-v2) |
| ChromaDB persistence | Data lost on restart | Configure persist_directory |
| Chunk size too small | Fragmented context | Increase minimum chunk size |
| Missing user filter | Cross-user data leaks | Add user_id to all queries |
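The ChromaDB persistence pitfall has a short fix: use an on-disk client rather than the default in-memory one. A configuration sketch for recent ChromaDB versions (0.4+), with an illustrative path:

```python
import chromadb

# PersistentClient writes to disk; the default in-memory client loses
# all embeddings when the process exits. The path is illustrative.
client = chromadb.PersistentClient(path="./chroma_data")
collection = client.get_or_create_collection("episodes")
```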
8. Extensions & Challenges
8.1 Beginner Extensions
- Add episode summarization using an LLM
- Add sentiment analysis per episode
- Implement session timeline visualization
8.2 Intermediate Extensions
- Add reranking with cross-encoder
- Implement streaming episode ingestion
- Add episode clustering for topic discovery
8.3 Advanced Extensions
- Multi-modal episodes (text + images)
- Distributed vector storage (Qdrant, Pinecone)
- Real-time episode updates with CDC
9. Real-World Connections
9.1 Industry Applications
- ChatGPT Memory: Stores conversation context across sessions
- Zep: Episodic memory layer for LLM applications
- Customer Support AI: Maintains conversation history per user
9.2 Interview Relevance
- Explain the tradeoff between RAG and long context
- Discuss embedding model selection criteria
- Describe hybrid retrieval strategies
10. Resources
10.1 Essential Reading
- “AI Engineering” by Chip Huyen — Ch. on RAG and Memory
- Sentence Transformers Documentation — Embedding best practices
- ChromaDB Documentation — Vector storage patterns
10.2 Related Projects in This Series
- Previous: Project 1 (Personal Memory Graph)
- Next: Project 3 (Entity Extraction Pipeline)
11. Self-Assessment Checklist
- I can explain episodic vs. semantic memory
- I understand how vector embeddings capture meaning
- I can implement hybrid retrieval with weighted fusion
- I know when to chunk by turns vs. by topic
12. Submission / Completion Criteria
Minimum Viable Completion:
- Store and retrieve episodes with embeddings
- Semantic similarity search working
- User isolation enforced
Full Completion:
- Hybrid retrieval with configurable weights
- Intelligent chunking strategy
- CLI with search and export
Excellence:
- Episode summarization
- Topic clustering
- Performance benchmarks