Project 6: Temporal Query Engine
Build a query engine that answers natural language temporal questions like “What projects was Alice working on last quarter?” by translating to graph+temporal queries.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 3: Advanced |
| Time Estimate | 2 weeks (25-35 hours) |
| Language | Python (Alternatives: TypeScript) |
| Prerequisites | Projects 1-5, Cypher, LLM function calling |
| Key Topics | Natural language to query translation, temporal reasoning, Cypher generation, query planning, semantic parsing |
1. Learning Objectives
By completing this project, you will:
- Parse temporal expressions from natural language.
- Generate Cypher queries with temporal constraints.
- Build a query planner that combines graph traversal with time filtering.
- Use LLMs for query understanding while maintaining precision.
- Create a feedback loop for query refinement.
2. Theoretical Foundation
2.1 Core Concepts
- Temporal Expression Parsing: Converting “last quarter”, “in 2023”, “before the meeting” into date ranges.
- Semantic Parsing: Converting natural language to a structured query representation.
- Query Planning: Decomposing complex questions into executable query steps.
- Cypher Temporal Patterns: Using WHERE clauses with date comparisons and interval predicates.
- Disambiguation: Handling ambiguous time references (“this week” depends on context).
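To make the disambiguation point concrete, here is a minimal sketch of how the same phrase resolves to different ranges depending on the reference date. The function names `this_week` and `last_week` are hypothetical, and the Monday-to-Sunday week is an assumption; some locales start the week on Sunday.

```python
from datetime import date, timedelta


def this_week(reference: date) -> tuple[date, date]:
    """Return the Monday-to-Sunday week that contains `reference`."""
    # date.weekday() is 0 for Monday, so this steps back to the week's Monday.
    monday = reference - timedelta(days=reference.weekday())
    return monday, monday + timedelta(days=6)


def last_week(reference: date) -> tuple[date, date]:
    """Return the full week immediately before the one containing `reference`."""
    start, end = this_week(reference)
    return start - timedelta(days=7), end - timedelta(days=7)


# The phrase is identical, but the answer depends on when it was said.
print(this_week(date(2024, 11, 15)))  # week of 2024-11-11 to 2024-11-17
print(this_week(date(2024, 11, 22)))  # week of 2024-11-18 to 2024-11-24
```

Passing the reference date as an argument, rather than calling `date.today()` inside the function, keeps the parser deterministic and testable.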
2.2 Why This Matters
Users don’t think in Cypher—they think in questions:
- “What did Alice and I discuss about the API last month?”
- “When did we first mention the budget issue?”
- “Show me everything that changed since the reorg.”
A temporal query engine bridges human questions and graph+time queries.
2.3 Common Misconceptions
- “LLMs can generate perfect Cypher.” They can hallucinate labels and properties that are not in the schema, and they often miss temporal nuances.
- “Parse once, query once.” Complex questions need multi-step query plans.
- “Temporal parsing is solved.” Context-dependent expressions need the conversation context.
2.4 ASCII Diagram: Query Pipeline
USER QUESTION
"What projects was Alice working on last quarter?"
│
▼
┌─────────────────────────────────────────────────────────┐
│ TEMPORAL EXPRESSION PARSER │
│ │
│ "last quarter" → (2024-07-01, 2024-09-30) │
│ Context: current_date = 2024-11-15 │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ SEMANTIC PARSER (LLM-assisted) │
│ │
│ Intent: FIND_RELATIONSHIPS │
│ Subject: "Alice" │
│ Relationship: "working on" │
│ Object Type: "Project" │
│ Time Constraint: (2024-07-01, 2024-09-30) │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ QUERY PLANNER │
│ │
│ Step 1: Find entity "Alice" (fuzzy match) │
│ Step 2: Traverse WORKS_ON relationships │
│ Step 3: Filter by valid_time overlap with Q3 │
│ Step 4: Return project names with dates │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ CYPHER GENERATOR │
│ │
│ MATCH (p:Person {name: "Alice"})-[r:WORKS_ON]->(proj) │
│ WHERE r.valid_from <= date("2024-09-30") │
│ AND (r.valid_to IS NULL OR r.valid_to >= date("2024-07-01"))│
│ RETURN proj.name, r.valid_from, r.valid_to │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ EXECUTOR │
│ │
│ Results: │
│ - "API Redesign" (2024-03-01 to ongoing) │
│ - "Q3 Planning" (2024-07-15 to 2024-09-30) │
└─────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ RESPONSE FORMATTER │
│ │
│ "In Q3 2024, Alice was working on: │
│ • API Redesign (ongoing since March) │
│ • Q3 Planning (completed end of September)" │
└─────────────────────────────────────────────────────────┘
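The WHERE clause in the Cypher Generator step is an interval-overlap test with an open-ended `valid_to`. The same predicate can be sketched in Python to check the logic in isolation; the function name `overlaps` is hypothetical, and the sketch assumes inclusive bounds on both ends, as the Cypher above does.

```python
from datetime import date
from typing import Optional


def overlaps(valid_from: date, valid_to: Optional[date],
             window_start: date, window_end: date) -> bool:
    """True if [valid_from, valid_to] intersects [window_start, window_end].

    A valid_to of None means the relationship is still ongoing.
    """
    starts_before_window_ends = valid_from <= window_end
    ends_after_window_starts = valid_to is None or valid_to >= window_start
    return starts_before_window_ends and ends_after_window_starts


Q3_START, Q3_END = date(2024, 7, 1), date(2024, 9, 30)

# The two results from the diagram both overlap Q3 2024.
print(overlaps(date(2024, 3, 1), None, Q3_START, Q3_END))                # True
print(overlaps(date(2024, 7, 15), date(2024, 9, 30), Q3_START, Q3_END))  # True
# A relationship that ended in June does not.
print(overlaps(date(2024, 1, 1), date(2024, 6, 30), Q3_START, Q3_END))   # False
```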
3. Project Specification
3.1 What You Will Build
A Python query engine that:
- Parses temporal expressions from natural language
- Converts questions to Cypher with temporal constraints
- Executes queries against Neo4j
- Formats results for human consumption
3.2 Functional Requirements
- Parse temporal: engine.parse_temporal("last quarter") → DateRange
- Parse question: engine.parse_question(text) → QueryIntent
- Generate Cypher: engine.to_cypher(intent) → str
- Execute: engine.query(question) → Results
- Natural response: engine.answer(question) → str
- Explain plan: engine.explain(question) → QueryPlan
3.3 Example Usage / Output
from temporal_query import TemporalQueryEngine
engine = TemporalQueryEngine(neo4j_driver, llm_client)
# Simple temporal query
answer = engine.answer("What projects was Alice working on last quarter?")
print(answer)
# "In Q3 2024, Alice was working on:
# • API Redesign (ongoing since March 2024)
# • Q3 Planning (July - September 2024)"
# Explain the query plan
plan = engine.explain("When did we first discuss the budget issue?")
print(plan)
# QueryPlan:
# 1. Parse temporal: "first" → earliest occurrence
# 2. Find entity: "budget issue" → search Topics/Episodes
# 3. Query: Find earliest episode mentioning budget
# 4. Cypher: MATCH (e:Episode)-[:MENTIONS]->(t:Topic {name: "budget"})
# RETURN e ORDER BY e.timestamp LIMIT 1
# Complex query with relative time
answer = engine.answer("What changed in Alice's projects after the reorg?")
print(answer)
# "After the reorg (October 2024):
# • Alice moved from API Redesign to Platform Team
# • Started working on Infrastructure Migration"
# Query with knowledge time
answer = engine.answer("What did we know about Alice's role before the correction?")
# Uses transaction time to show pre-correction state
4. Solution Architecture
4.1 High-Level Design
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│   Question    │────▶│   Temporal    │────▶│   Semantic    │
│     Input     │     │    Parser     │     │    Parser     │
└───────────────┘     └───────────────┘     └───────────────┘
                                                    │
                                                    ▼
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│   Response    │◀────│   Executor    │◀────│     Query     │
│   Formatter   │     │               │     │    Planner    │
└───────────────┘     └───────────────┘     └───────────────┘
                              │
                              ▼
                      ┌───────────────┐
                      │     Neo4j     │
                      └───────────────┘
4.2 Key Components
| Component | Responsibility | Technology |
|---|---|---|
| TemporalParser | Extract and normalize time expressions | dateparser + custom rules |
| SemanticParser | Convert NL to intent structure | LLM with function calling |
| QueryPlanner | Decompose into executable steps | Rule-based + LLM |
| CypherGenerator | Generate valid Cypher | Template + LLM validation |
| Executor | Run queries, handle errors | Neo4j driver |
| ResponseFormatter | Human-readable answers | LLM summarization |
4.3 Data Models
from datetime import date
from typing import Literal

from pydantic import BaseModel


class DateRange(BaseModel):
    start: date | None
    end: date | None
    reference: str  # "last quarter", "in 2023"
    is_relative: bool


class QueryIntent(BaseModel):
    intent_type: Literal["find", "count", "when", "compare", "timeline"]
    subject: str | None
    relationship: str | None
    object_type: str | None
    temporal_constraint: DateRange | None
    knowledge_time: date | None  # For "what did we know" queries


class QueryStep(BaseModel):
    step_type: Literal["find_entity", "traverse", "filter_time", "aggregate"]
    description: str
    cypher_fragment: str | None


class QueryPlan(BaseModel):
    question: str
    steps: list[QueryStep]
    final_cypher: str
5. Implementation Guide
5.1 Development Environment Setup
mkdir temporal-query && cd temporal-query
python -m venv .venv && source .venv/bin/activate
pip install neo4j openai dateparser pydantic
5.2 Project Structure
temporal-query/
├── src/
│   ├── engine.py       # TemporalQueryEngine
│   ├── temporal.py     # Temporal expression parsing
│   ├── semantic.py     # NL to intent parsing
│   ├── planner.py      # Query planning
│   ├── cypher.py       # Cypher generation
│   ├── executor.py     # Query execution
│   └── formatter.py    # Response formatting
├── prompts/
│   ├── semantic_parse.txt
│   └── cypher_generate.txt
├── tests/
│   └── test_temporal.py
└── README.md
5.3 Implementation Phases
Phase 1: Temporal Parsing (6-8h)
Goals:
- Parse common temporal expressions
- Handle relative references
Tasks:
- Use dateparser for standard expressions
- Add custom rules for “last quarter”, “this year”, etc.
- Handle context-dependent expressions
- Build date range normalization
Checkpoint: “last quarter” returns correct date range.
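One way to reach this checkpoint is a small custom rule for calendar quarters. The sketch below assumes calendar quarters (Q1 starts in January) and an explicit reference date; `last_quarter` is a hypothetical helper, not part of dateparser.

```python
from datetime import date, timedelta


def last_quarter(reference: date) -> tuple[date, date]:
    """Return (first day, last day) of the calendar quarter before `reference`."""
    current_q = (reference.month - 1) // 3  # 0 for Q1 through 3 for Q4
    if current_q == 0:
        # The quarter before Q1 is Q4 of the previous year.
        year, q = reference.year - 1, 3
    else:
        year, q = reference.year, current_q - 1

    start = date(year, 3 * q + 1, 1)
    # The end is the day before the first day of the following quarter.
    if q == 3:
        next_start = date(year + 1, 1, 1)
    else:
        next_start = date(year, 3 * q + 4, 1)
    return start, next_start - timedelta(days=1)


# Matches the pipeline example: current_date = 2024-11-15 gives Q3 2024.
print(last_quarter(date(2024, 11, 15)))
```

Computing the end as "one day before the next quarter starts" avoids hard-coding month lengths.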
Phase 2: Semantic Parsing + Planning (10-12h)
Goals:
- Convert questions to structured intents
- Build query plans
Tasks:
- Design LLM prompt for intent extraction
- Use function calling for structured output
- Build rule-based query planner
- Handle multi-step queries
Checkpoint: Question parses to intent with correct temporal constraint.
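A rule-based planner can start as a function that maps the fields of a parsed intent to an ordered list of steps. The sketch below uses a plain dataclass named `Step` as a dependency-free stand-in for the `QueryStep` model; the `plan` function and its argument names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Step:
    """Stand-in for QueryStep, kept dependency-free for this sketch."""
    step_type: str
    description: str


def plan(subject: Optional[str], relationship: Optional[str],
         has_time_constraint: bool) -> list[Step]:
    """Emit steps in execution order: resolve, traverse, then filter by time."""
    steps: list[Step] = []
    if subject:
        steps.append(Step("find_entity", f"Resolve entity '{subject}'"))
    if relationship:
        steps.append(Step("traverse", f"Follow {relationship} relationships"))
    if has_time_constraint:
        steps.append(Step("filter_time",
                          "Keep relationships whose valid time overlaps the window"))
    return steps


for step in plan("Alice", "WORKS_ON", has_time_constraint=True):
    print(step.step_type, "-", step.description)
```

Keeping the planner deterministic means the LLM only has to produce the intent, which is easier to validate than free-form Cypher.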
Phase 3: Cypher Generation + Execution (8-10h)
Goals:
- Generate valid temporal Cypher
- Execute and format results
Tasks:
- Build Cypher templates for common patterns
- Add temporal WHERE clause generation
- Implement query executor with error handling
- Build natural language response formatter
Checkpoint: End-to-end question → answer working.
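For the template-based generator, a common pattern is to keep the Cypher text fixed and pass every user-derived value as a parameter. The sketch below mirrors the query from the pipeline diagram; the template constant, the function name, and the parameter names are assumptions.

```python
from datetime import date

# Fixed query text; user-derived values are passed only as $parameters.
WORKS_ON_TEMPLATE = (
    "MATCH (p:Person {name: $name})-[r:WORKS_ON]->(proj) "
    "WHERE r.valid_from <= $window_end "
    "AND (r.valid_to IS NULL OR r.valid_to >= $window_start) "
    "RETURN proj.name AS project, r.valid_from AS valid_from, r.valid_to AS valid_to"
)


def build_works_on_query(name: str, window_start: date,
                         window_end: date) -> tuple[str, dict]:
    """Return the query text and its parameter map, ready for a driver call."""
    params = {"name": name, "window_start": window_start, "window_end": window_end}
    return WORKS_ON_TEMPLATE, params


cypher, params = build_works_on_query("Alice", date(2024, 7, 1), date(2024, 9, 30))
print(cypher)
print(params)
```

With the Neo4j Python driver, the pair would typically be passed as `session.run(cypher, params)`, so the subject name never becomes part of the query string.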
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit | Test temporal parsing | “last week” → correct dates |
| Integration | Test full pipeline | Question → Cypher → results |
| Quality | Test answer accuracy | Compare to expected answers |
6.2 Critical Test Cases
- Temporal parsing: Various expressions parse correctly
- Interval overlap: Cypher correctly filters by time range
- NULL handling: Ongoing relationships included/excluded correctly
- Edge cases: “today”, “now”, timezone handling
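The timezone edge case is easy to reproduce: the same instant can fall on different calendar dates for different users. The sketch below uses fixed UTC offsets to stay self-contained; `local_today` is a hypothetical helper, and a real implementation would more likely use named zones via `zoneinfo` to account for daylight saving time.

```python
from datetime import date, datetime, timedelta, timezone


def local_today(now_utc: datetime, utc_offset_hours: int) -> date:
    """Return the calendar date at `now_utc` for a user at a fixed UTC offset."""
    user_tz = timezone(timedelta(hours=utc_offset_hours))
    return now_utc.astimezone(user_tz).date()


# 23:30 UTC on 15 November is already 16 November at UTC+9.
instant = datetime(2024, 11, 15, 23, 30, tzinfo=timezone.utc)
print(local_today(instant, 0))   # 2024-11-15
print(local_today(instant, 9))   # 2024-11-16
```

A test suite that only runs with the machine's local clock will miss this case, so it helps to inject both the instant and the user's zone.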
7. Common Pitfalls & Debugging
| Pitfall | Symptom | Solution |
|---|---|---|
| Wrong date context | “last week” off by a week | Pass current_date explicitly |
| Schema mismatch | LLM generates invalid property names | Provide schema in prompt |
| Cypher injection | User input in Cypher | Use parameters, not string concat |
| Empty results | Query returns nothing | Add explain mode, check filters |
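Beyond providing the schema in the prompt, a cheap guard against schema mismatch is to check generated Cypher against an allowlist before executing it. The sketch below is a rough regex-based check for relationship types only; the function name and pattern are assumptions, and it only sees the first type in alternations such as `[:A|B]`, so treat it as a first filter rather than a Cypher parser.

```python
import re

# Captures the type in patterns like [r:WORKS_ON] or [:MENTIONS].
REL_TYPE_PATTERN = re.compile(r"\[\s*\w*\s*:\s*([A-Za-z_]\w*)")


def unknown_relationship_types(cypher: str, allowed: set[str]) -> set[str]:
    """Return relationship types used in `cypher` that are not in `allowed`."""
    used = set(REL_TYPE_PATTERN.findall(cypher))
    return used - allowed


SCHEMA_RELS = {"WORKS_ON", "MENTIONS"}

good = "MATCH (p:Person)-[r:WORKS_ON]->(proj) RETURN proj.name"
bad = "MATCH (p:Person)-[:MANAGES]->(t) RETURN t"

print(unknown_relationship_types(good, SCHEMA_RELS))  # set()
print(unknown_relationship_types(bad, SCHEMA_RELS))   # {'MANAGES'}
```

A non-empty result is a signal to reject the query or send it back to the LLM with the schema attached.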
8. Extensions & Challenges
8.1 Beginner Extensions
- Add a “show me examples” option to the query explanation output
- Add query history and favorites
8.2 Intermediate Extensions
- Add multi-hop temporal reasoning
- Implement query result caching
8.3 Advanced Extensions
- Add query auto-correction from errors
- Implement temporal inference rules
9. Real-World Connections
9.1 Industry Applications
- Business Intelligence: Natural language BI queries
- Knowledge Management: Temporal Q&A over corporate memory
- AI Assistants: Conversational memory access
9.2 Interview Relevance
- Explain semantic parsing vs keyword search
- Discuss the pros and cons of using LLMs for query generation
- Describe temporal query optimization
10. Resources
10.1 Essential Reading
- “AI Engineering” by Chip Huyen — Ch. on Tool Use and Agents
- Neo4j Cypher Manual — Temporal functions
- dateparser documentation — Temporal expression parsing
10.2 Related Projects
- Previous: Project 5 (Bi-Temporal Fact Store)
- Next: Project 7 (Semantic Memory Synthesizer)
11. Self-Assessment Checklist
- I can parse “last quarter” to a date range given context
- I understand how to generate Cypher with temporal constraints
- I can decompose complex temporal questions into query steps
- I know the limitations of LLM-generated queries
12. Submission / Completion Criteria
Minimum Viable Completion:
- Temporal expression parsing working
- Basic question → Cypher generation
- Query execution returning results
Full Completion:
- Multi-step query planning
- Natural language response formatting
- Query explanation mode
Excellence:
- Query caching and optimization
- Auto-correction from errors
- Complex multi-hop temporal queries