Project 4: Hybrid Memory Router
Build a policy-based router that decides which memory stores to query and how to merge results.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 3 |
| Time Estimate | 2-3 weeks |
| Main Programming Language | Python |
| Alternative Programming Languages | TypeScript, Go |
| Coolness Level | Level 3 |
| Business Potential | Level 3 |
| Prerequisites | Memory taxonomy, vector search basics |
| Key Topics | Routing policies, prompt budgeting, conflict resolution |
1. Learning Objectives
By completing this project, you will:
- Design routing policies for different memory types.
- Merge results from multiple memory stores.
- Enforce a memory token budget.
- Explain retrieval decisions with a debug trace.
2. All Theory Needed (Per-Concept Breakdown)
Memory Routing Policies and Prompt Budgeting
Fundamentals
A hybrid memory router decides which memory stores to query and how to allocate limited prompt space. It is the layer that makes memory practical: without routing, every store is queried and the prompt overflows. Routing policies combine query intent, memory type, recency, and confidence. Prompt budgeting limits how much memory is injected, ensuring that high-priority memories fit within the context window.
Deep Dive into the concept
Routing is a policy problem, not a storage problem. The router must interpret the query and decide which memory types are relevant. A question like “What did we decide last week?” should prioritize episodic memory and summaries, while “What tools did we use?” should prioritize procedural memory. You can implement this with explicit rules, a lightweight classifier, or retrieval-of-retrieval (e.g., using an LLM to select memory types). Whichever approach you choose, the router must be deterministic and auditable, so that the same query always produces the same routing decision and that decision can be evaluated.
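The explicit-rules option can be sketched as a small keyword table. This is a minimal illustration, not a reference implementation: the keyword lists, the `select_stores` name, and the `semantic` fallback are assumptions made for the example; only the store names come from the text.

```python
# Hypothetical keyword rules. The first matching rule wins, so order matters.
RULES = [
    (("decide", "decision"), ("episodic", "summary")),
    (("tool",), ("procedural",)),
    (("prefer",), ("preference",)),
]

# Assumption: queries that match no rule fall back to semantic memory.
DEFAULT_STORES = ("semantic",)


def select_stores(query):
    """Return the stores for the first rule whose keyword occurs in the query."""
    text = query.lower()
    for keywords, stores in RULES:
        if any(keyword in text for keyword in keywords):
            return stores
    return DEFAULT_STORES
```

Because the table is plain data and the match order is fixed, the same query always maps to the same stores, which is the property the evaluation relies on.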
Prompt budgeting is the second half of routing. Even if retrieval is perfect, injecting too much memory can harm answer quality. The model’s context window behaves like RAM: overfill it and you lose important task instructions. The router must allocate a memory budget (tokens or character count) and decide how to fill it. This often involves prioritization: high-confidence, high-recency, and policy-approved memories are injected first. Lower priority memories are either dropped or summarized.
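One way to fill a budget is a greedy pass over memories sorted by priority. The sketch below assumes a priority order of confidence first, then recency, and uses a whitespace word count as a rough stand-in for a real tokenizer; the `Memory` fields and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    id: str
    text: str
    confidence: float  # 0.0 to 1.0
    age_days: int      # smaller means more recent


def count_tokens(text):
    # Assumption: word count approximates token count well enough for a sketch.
    return len(text.split())


def fill_budget(memories, budget):
    """Greedily keep the highest-priority memories that still fit the budget."""
    ranked = sorted(memories, key=lambda m: (-m.confidence, m.age_days, m.id))
    kept, dropped, used = [], [], 0
    for memory in ranked:
        cost = count_tokens(memory.text)
        if used + cost <= budget:
            kept.append(memory.id)
            used += cost
        else:
            dropped.append(memory.id)  # candidates for summarization
    return kept, dropped, used
```

The `dropped` list is returned rather than discarded so that a later step can decide whether to summarize those memories or leave them out.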
Conflict resolution is another critical policy. Two memories may disagree due to drift or user updates. The router should not blindly inject both, as that confuses the model. Instead, it should apply conflict resolution rules: prefer newer memories, prefer memories with higher confidence, or surface conflicts explicitly. This is especially important for preference memory, where outdated preferences can cause user frustration.
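A possible resolution rule, sketched under assumptions: memories that share a hypothetical `key` field describe the same fact, the newer memory wins, confidence breaks timestamp ties, and an exact tie with different values is surfaced instead of silently resolved. The field names and the ordering are illustrative choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    id: str
    key: str          # hypothetical: the fact this memory is about
    value: str
    timestamp: int    # e.g. Unix seconds
    confidence: float


def resolve_conflicts(memories):
    """Return (winners, conflicting_keys), one winner per key."""
    groups = {}
    for memory in memories:
        groups.setdefault(memory.key, []).append(memory)

    winners, conflicts = [], []
    for key in sorted(groups):  # sorted keys keep the output order stable
        ranked = sorted(
            groups[key],
            key=lambda m: (-m.timestamp, -m.confidence, m.id),
        )
        best = ranked[0]
        winners.append(best)
        # An unresolved conflict: same timestamp and confidence, different value.
        if any(
            (m.timestamp, m.confidence) == (best.timestamp, best.confidence)
            and m.value != best.value
            for m in ranked[1:]
        ):
            conflicts.append(key)
    return winners, conflicts
```

Sorting on the memory id as the last tiebreaker keeps the winner stable even when timestamp and confidence are identical.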
Routing policies must also be safe. Sensitive memories should be filtered unless explicit consent is present. This requires the router to inspect metadata fields and enforce privacy constraints. A robust router therefore combines type selection, budgeting, conflict resolution, and safety filtering into a single decision pipeline. You will build that pipeline in this project.
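The safety filter can be sketched as a metadata check that fails closed. The `sensitive` and `scope` field names and the idea of a set of consented scopes are assumptions for this example; a memory with no sensitivity flag is treated as sensitive.

```python
def filter_sensitive(memories, consent_scopes):
    """Split memory ids into (allowed, blocked) based on sensitivity metadata."""
    allowed, blocked = [], []
    for memory in memories:
        # Fail closed: a missing flag is treated as sensitive.
        sensitive = memory.get("sensitive", True)
        if sensitive and memory.get("scope") not in consent_scopes:
            blocked.append(memory["id"])
        else:
            allowed.append(memory["id"])
    return allowed, blocked
```

Returning the blocked ids alongside the allowed ones lets the trace log record why a memory was withheld.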
From a systems perspective, treat the router as a first-class interface between data and behavior. That requires explicit invariants (what must always be true), observability (how you know it is true), and failure signatures (how it breaks when it is not). Engineers often skip this and rely on ad-hoc fixes, which creates hidden coupling between the memory subsystem and the rest of the agent stack. A better approach is to model routing as a pipeline stage with clear inputs, outputs, and preconditions: if inputs violate the contract, the stage should fail fast rather than silently corrupt memory. This matters because memory errors are long-lived and compound over time. You should also define operational metrics that reveal drift early, for example the percentage of memory entries that lack required metadata, the ratio of retrieved memories that the model later leaves unused, or the fraction of queries that trigger a fallback route because the primary memory store is empty. These metrics are not just for dashboards; they are design constraints that keep the system testable and predictable.
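A fail-fast precondition and one of the drift metrics might look like the sketch below. The required field names are assumptions taken from the metadata listed in the answers section of this guide (type, timestamp, confidence, sensitivity, consent); the function names are hypothetical.

```python
# Assumption: these are the metadata fields the router's contract requires.
REQUIRED_FIELDS = ("type", "timestamp", "confidence", "sensitivity", "consent")


def missing_fields(entry):
    """List the required metadata fields that the entry does not define."""
    return [name for name in REQUIRED_FIELDS if name not in entry]


def validate(entry):
    """Fail fast when an entry violates the metadata contract."""
    missing = missing_fields(entry)
    if missing:
        raise ValueError(f"memory {entry.get('id', '?')} is missing {missing}")
    return entry


def metadata_gap_ratio(entries):
    """Fraction of entries that lack at least one required field."""
    if not entries:
        return 0.0
    incomplete = sum(1 for entry in entries if missing_fields(entry))
    return incomplete / len(entries)
```

`validate` belongs at the pipeline boundary, while `metadata_gap_ratio` is the kind of number to track over time to notice drift before it causes routing errors.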
Lifecycle management is another critical dimension. A router may work well at small scale but degrade as the memory grows. This is where policies and thresholds matter: you need rules for promotion, demotion, merging, and deletion that prevent the memory from becoming a landfill. The policy should be deterministic and versioned, so that when it changes you can replay historical inputs and measure the delta in outputs. This is the same discipline used in data engineering for schema changes and backfills, and it applies equally to memory systems. Finally, remember that memory is an interface to user trust. If the memory system is noisy, the agent feels unreliable; if it is overly strict, the agent feels forgetful. The best designs expose these trade-offs explicitly, so you can tune them according to product goals rather than guessing.
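Replaying historical inputs against two policy versions can be sketched as a simple diff. Both policies here are hypothetical toy rules written only to show the replay mechanics.

```python
def policy_v1(query):
    # Hypothetical first version: matches only the verb form.
    return ("episodic", "summary") if "decide" in query.lower() else ("semantic",)


def policy_v2(query):
    # Hypothetical second version: also matches the noun form.
    text = query.lower()
    if "decide" in text or "decision" in text:
        return ("episodic", "summary")
    return ("semantic",)


def replay_delta(queries, old_policy, new_policy):
    """Return the historical queries whose routing changes between versions."""
    return [q for q in queries if old_policy(q) != new_policy(q)]
```

Running `replay_delta` over a log of past queries before shipping a policy change shows which routes would move, so the change can be reviewed instead of discovered in production.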
How this fits in the projects
This concept is central to Project 4 and is used directly in Project 10’s OS-style memory manager.
Definitions & key terms
- Routing policy: Rule set for selecting memory stores.
- Budgeting: Allocation of prompt space for memory.
- Conflict resolution: Rules for handling contradictory memories.
- Traceability: Logging why each memory was selected.
Mental model diagram (ASCII)
Query -> Router -> [Episodic] [Semantic] [Preference]
                        |          |          |
                   Rank/Filter     |          |
                        +----------+----------+
                                   v
                                Prompt
How It Works (Step-by-Step)
- Parse query and classify intent.
- Select eligible memory stores.
- Retrieve candidates from each store.
- Apply filters (recency, sensitivity, confidence).
- Resolve conflicts and enforce budget.
- Inject into prompt with trace log.
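The steps above can be sketched end to end as a single function. Everything here is illustrative: the in-memory `STORES`, the sample memory texts, the keyword rules, and the word-count token estimate are assumptions, and conflict resolution is left out to keep the example short.

```python
# Hypothetical in-memory stores with made-up sample memories.
STORES = {
    "episodic": [
        {"id": "EPI-0021", "text": "Team chose blue-green deployment",
         "confidence": 0.9, "sensitive": False},
        {"id": "EPI-0007", "text": "Lunch order was pizza",
         "confidence": 0.3, "sensitive": False},
    ],
    "summary": [
        {"id": "SUM-0008", "text": "Deployment moved to weekly releases",
         "confidence": 0.8, "sensitive": False},
    ],
    "preference": [
        {"id": "PRF-0001", "text": "User prefers short answers",
         "confidence": 0.7, "sensitive": False},
    ],
}


def route(query, budget=800, min_confidence=0.5):
    """Run the routing steps and return (selected_ids, trace_lines)."""
    if not query.strip():
        raise ValueError("empty query")
    text = query.lower()

    # Steps 1-2: classify intent with keyword rules and select stores.
    if "decide" in text or "decision" in text:
        stores = ["episodic", "summary"]
    elif "prefer" in text:
        stores = ["preference"]
    else:
        stores = sorted(STORES)
    trace = [f"[ROUTE] {' + '.join(stores)}"]

    # Step 3: retrieve candidates from each selected store.
    candidates = [m for name in stores for m in STORES[name]]

    # Step 4: filter on confidence and sensitivity.
    kept = [m for m in candidates
            if m["confidence"] >= min_confidence and not m["sensitive"]]

    # Step 5: enforce the budget, highest confidence first.
    kept.sort(key=lambda m: (-m["confidence"], m["id"]))
    selected, used = [], 0
    for memory in kept:
        cost = len(memory["text"].split())  # rough token estimate
        if used + cost <= budget:
            selected.append(memory["id"])
            used += cost

    # Step 6: record what will be injected.
    trace.append(f"[INJECT] selected_ids={','.join(selected)}")
    return selected, trace
```

Each stage appends to the trace rather than printing, so the same function can back both a CLI and a test.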
Minimal Concrete Example
routing_policy:
  if query contains "decide" or "decision" -> episodic + summary
  if query contains "preference" -> preference only
  budget: 800 tokens
Common Misconceptions
- “More memory is always better.” (False: excess memory reduces quality.)
- “Routing can be ad-hoc.” (False: inconsistent routing breaks evaluation.)
Check-Your-Understanding Questions
- Why is budgeting necessary?
- How do you resolve conflicting memories?
- What metadata fields are critical for routing?
Check-Your-Understanding Answers
- Because the context window is limited and overfilling reduces clarity.
- Prefer higher confidence or more recent memory; optionally surface conflict.
- Type, timestamp, confidence, sensitivity, consent.
Real-World Applications
- Assistants that combine vector memory and summaries.
- Enterprise agents with strict privacy constraints.
Where You’ll Apply It
- In this project: §5.4 Concepts You Must Understand First and §6 Testing Strategy.
- Also used in: Project 10.
References
- MemGPT (memory routing concept) - https://arxiv.org/abs/2310.08560
Key Insights
Routing policies determine whether memory helps or harms the agent.
Summary
Hybrid routing is the decision layer that makes memory selective, safe, and effective.
Homework/Exercises to Practice the Concept
- Write a routing policy for three query types.
- Design a budgeting strategy for 600 tokens.
Solutions to the Homework/Exercises
- Example: decision queries -> episodic+summary; preference queries -> preference; tool queries -> procedural.
- Allocate 60% (360 tokens) to high-priority memory and 40% (240 tokens) to recent context.
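The budget split in the second solution is simple arithmetic, sketched here with integer percentages so that rounding never loses or invents tokens. The function and bucket names are hypothetical.

```python
def split_budget(total_tokens, percents):
    """Split a token budget by integer percentages that sum to 100."""
    if sum(percents.values()) != 100:
        raise ValueError("percentages must sum to 100")
    allocation = {
        name: total_tokens * pct // 100 for name, pct in percents.items()
    }
    # Integer division can leave a few tokens over; give them to the first bucket.
    leftover = total_tokens - sum(allocation.values())
    first = next(iter(allocation))
    allocation[first] += leftover
    return allocation
```

For example, `split_budget(600, {"high_priority": 60, "recent_context": 40})` yields 360 and 240 tokens.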
3. Project Specification
3.1 What You Will Build
A hybrid router that:
- Classifies query intent
- Selects memory stores
- Retrieves and merges candidates
- Enforces token budget
- Logs decisions
3.2 Functional Requirements
- Store Selection: Choose relevant stores per query.
- Budget Enforcement: Limit memory tokens.
- Conflict Resolution: Handle contradictory memories.
- Trace Log: Explain each selection.
3.3 Non-Functional Requirements
- Performance: Routing adds less than 50 ms of overhead per query.
- Reliability: The same query against the same memory state yields the same routing decisions.
- Usability: Debug traces are human-readable.
3.4 Example Usage / Output
$ memrouter query "Summarize the deployment decision"
[ROUTE] episodic + summary
[FETCH] episodic=4 summary=1
[MERGE] selected=3 (budget=6)
3.5 Data Formats / Schemas / Protocols
route_trace:
query: "Summarize the deployment decision"
stores: ["episodic", "summary"]
selected_ids: ["EPI-0021", "SUM-0008"]
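The trace schema above maps naturally onto a small dataclass with JSON serialization. In this sketch the `reasons` field is an assumption added for illustration (it is not part of the schema shown), and the class and helper names are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RouteTrace:
    query: str
    stores: list
    selected_ids: list
    # Assumption: per-memory selection reasons, not part of the schema above.
    reasons: dict = field(default_factory=dict)


def to_json(trace):
    """Serialize with sorted keys so identical traces produce identical text."""
    return json.dumps(asdict(trace), sort_keys=True)


def from_json(payload):
    """Rebuild a RouteTrace from its JSON form."""
    return RouteTrace(**json.loads(payload))
```

Sorted keys make the serialized trace stable, which helps when comparing traces across runs.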
3.6 Edge Cases
- Conflicting memories with same confidence
- Budget too small for required memory
- Queries with ambiguous intent
3.7 Real World Outcome
3.7.1 How to Run (Copy/Paste)
$ memrouter query "Summarize the deployment decision"
$ memrouter trace last
3.7.2 Golden Path Demo (Deterministic)
$ memrouter query "Summarize the deployment decision"
[ROUTE] episodic + summary
[INJECT] selected_ids=EPI-0021,SUM-0008
exit_code=0
3.7.3 Failure Demo (Deterministic)
$ memrouter query ""
[ERROR] empty query
exit_code=1
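The failure path can be sketched with `argparse` and an explicit exit code. This is a hypothetical skeleton for a `memrouter`-style entry point: writing the error to stderr is a choice made for this example, and the routing itself is left as a placeholder.

```python
import argparse
import sys


def main(argv=None):
    """Parse `query <text>` and return a process exit code."""
    parser = argparse.ArgumentParser(prog="memrouter")
    subcommands = parser.add_subparsers(dest="command", required=True)
    query_cmd = subcommands.add_parser("query", help="route a query")
    query_cmd.add_argument("text")
    args = parser.parse_args(argv)

    if not args.text.strip():
        # Assumption: errors go to stderr; the message matches the demo above.
        print("[ERROR] empty query", file=sys.stderr)
        return 1

    # Placeholder: the real router would run here and print its trace.
    return 0
```

Returning the exit code from `main` instead of calling `sys.exit` inside it keeps the function easy to test; a script wrapper can pass the result to `sys.exit`.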
4. Solution Architecture
4.1 High-Level Design
Query -> Intent Classifier -> Store Selector -> Retrieve -> Merge -> Prompt
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Intent Classifier | Determine query type | Rule-based vs ML |
| Store Selector | Pick memory stores | Type mapping |
| Merger | Combine results | Conflict rules |
| Budgeter | Enforce prompt size | Token counting |
4.3 Data Structures (No Full Code)
RouteDecision:
stores: list
budget: int
selected: list
4.4 Algorithm Overview
- Classify query intent.
- Select eligible stores.
- Retrieve candidates.
- Filter and rerank.
- Enforce budget and resolve conflicts.
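Complexity Analysis: O(k) per store for filtering, where k is the number of candidates retrieved from that store; a sort-based rerank raises this to O(k log k).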
5. Implementation Guide
5.1 Development Environment Setup
- Load memory store configs
- Define token budget limits
5.2 Project Structure
project-root/
├── src/
│   ├── classify/
│   ├── route/
│   ├── merge/
│   └── trace/
5.3 The Core Question You’re Answering
“Which memories should be injected, and why?”
5.4 Concepts You Must Understand First
- Routing policies
- Budget enforcement
5.5 Questions to Guide Your Design
- How will you resolve conflicts?
- What is the maximum memory budget?
5.6 Thinking Exercise
Design a policy for handling preference vs episodic conflicts.
5.7 The Interview Questions They’ll Ask
- “How does a router improve memory quality?”
- “What is a memory budget?”
- “How do you handle conflicts?”
- “Why is traceability important?”
- “How do you evaluate routing?”
5.8 Hints in Layers
Hint 1: Start with keyword-based routing
Hint 2: Add confidence filters
Hint 3: Add budgeting
Hint 4: Add trace logging
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Architecture | “Fundamentals of Software Architecture” | Ch. 2 |
| Agent systems | “AI Engineering” | Ch. 6 |
5.10 Implementation Phases
Phase 1: Foundation
- Intent classifier and store selector
Phase 2: Core Functionality
- Retrieval and merge logic
Phase 3: Polish
- Budget enforcement and trace logs
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Routing method | Rule-based / LLM | Rule-based | Deterministic and auditable |
| Conflict policy | Recency / Confidence | Confidence + recency | Balances freshness and reliability |
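One way to combine confidence and recency is a weighted score with exponential recency decay. The 30-day half-life, the equal weighting, and the function name are assumptions chosen for illustration; they are tuning parameters, not recommended values.

```python
def memory_score(confidence, age_days, half_life_days=30.0, confidence_weight=0.5):
    """Blend confidence with a recency term that halves every half_life_days."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    recency = 0.5 ** (age_days / half_life_days)  # 1.0 when brand new
    return confidence_weight * confidence + (1.0 - confidence_weight) * recency
```

With these defaults a fresh memory at 0.6 confidence outranks a 90-day-old memory at 0.9 confidence; raising `confidence_weight` shifts the balance toward reliability.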
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit | Policy rules | Intent mapping |
| Integration | Full routing | Query -> prompt |
| Edge | Conflicts | Contradictory memories |
6.2 Critical Test Cases
- Budget enforcement limits memory size.
- Conflicting memories resolved deterministically.
- Trace log includes selection reasons.
6.3 Test Data
query: "Summarize the deployment decision"
expected_stores: ["episodic", "summary"]
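The test data above can drive a table-based unit test. The `select_stores` function below is a hypothetical stand-in for the router under test, and the test functions follow the plain `test_*` naming that pytest collects.

```python
def select_stores(query):
    # Hypothetical stand-in for the router's store selection.
    text = query.lower()
    if "decide" in text or "decision" in text:
        return ["episodic", "summary"]
    if "prefer" in text:
        return ["preference"]
    return ["semantic"]


# Each case pairs a query with the stores the policy is expected to pick.
CASES = [
    ("Summarize the deployment decision", ["episodic", "summary"]),
]


def test_expected_stores():
    for query, expected in CASES:
        assert select_stores(query) == expected, query


def test_routing_is_deterministic():
    # The same query must route the same way on every call.
    for query, _ in CASES:
        results = [select_stores(query) for _ in range(5)]
        assert all(result == results[0] for result in results)
```

New cases are added as rows in `CASES`, which keeps the policy's expected behavior readable in one place.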
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Over-routing | Prompt overflow | Enforce budget |
| Under-routing | Missing context | Expand store selection |
| No traceability | Hard to debug | Add trace log |
7.2 Debugging Strategies
- Compare routing outputs for similar queries.
- Add trace IDs to each decision.
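A trace ID can be derived from the inputs to a decision, so that similar queries are easy to compare. This sketch hashes a normalized query together with a policy version; the `TRC-` prefix, the 12-character length, and the normalization are assumptions for illustration.

```python
import hashlib


def trace_id(query, policy_version):
    """Derive a stable ID from the normalized query and the policy version."""
    normalized = " ".join(query.lower().split())  # collapse case and whitespace
    payload = f"{policy_version}\n{normalized}".encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    return f"TRC-{digest[:12]}"
```

Because the ID depends only on the query and the policy version, two runs that disagree under the same ID point to nondeterminism in the router rather than a change in input.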
7.3 Performance Traps
- Excessive reranking of large candidate sets.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add manual override rules
8.2 Intermediate Extensions
- Add reranking by confidence
8.3 Advanced Extensions
- Add adaptive budget based on task complexity
9. Real-World Connections
9.1 Industry Applications
- Production memory routers in multi-store RAG systems
9.2 Related Open Source Projects
- MemGPT
9.3 Interview Relevance
- Routing policies are a common system design topic.
10. Resources
10.1 Essential Reading
- MemGPT paper
10.2 Video Resources
- Talks on RAG routing and retrieval policies
10.3 Tools & Documentation
- LangChain memory routing examples
10.4 Related Projects in This Series
- Project 10: OS-style memory manager (uses this routing concept directly)
11. Self-Assessment Checklist
11.1 Understanding
- I can explain routing and budgeting.
11.2 Implementation
- Router selects stores and enforces budget.
11.3 Growth
- I can justify my routing policy.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Routing and budgeting implemented
Full Completion:
- Conflict resolution and trace logging
Excellence (Going Above & Beyond):
- Adaptive routing based on task complexity