Project 6: Temporal Query Engine

Build a query engine that answers natural-language temporal questions such as “What projects was Alice working on last quarter?” by translating them into graph queries with temporal constraints.

Quick Reference

Attribute	Value
Difficulty	Level 3: Advanced
Time Estimate	2 weeks (25-35 hours)
Language	Python (Alternatives: TypeScript)
Prerequisites	Projects 1-5, Cypher, LLM function calling
Key Topics	Natural language to query translation, temporal reasoning, Cypher generation, query planning, semantic parsing

1. Learning Objectives

By completing this project, you will:

Parse temporal expressions from natural language.
Generate Cypher queries with temporal constraints.
Build a query planner that combines graph traversal with time filtering.
Use LLMs for query understanding while maintaining precision.
Create a feedback loop for query refinement.

2. Theoretical Foundation

2.1 Core Concepts

Temporal Expression Parsing: Converting “last quarter”, “in 2023”, “before the meeting” into date ranges.
Semantic Parsing: Converting natural language to structured query representation.
Query Planning: Decomposing complex questions into executable query steps.
Cypher Temporal Patterns: Using WHERE clauses with date comparisons and interval predicates.
Disambiguation: Handling ambiguous time references (“this week” depends on context).
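
As a concrete instance of temporal expression parsing, the sketch below resolves “last quarter” against an explicit reference date. It assumes calendar quarters (January to March is Q1) rather than fiscal quarters, and the function names `quarter_bounds` and `last_quarter` are illustrative, not part of any library.

```python
from datetime import date, timedelta


def quarter_bounds(year: int, quarter: int) -> tuple[date, date]:
    """Return the first and last day of a calendar quarter (1-4)."""
    start = date(year, 3 * (quarter - 1) + 1, 1)
    # The quarter ends one day before the next quarter starts.
    if quarter == 4:
        next_start = date(year + 1, 1, 1)
    else:
        next_start = date(year, 3 * quarter + 1, 1)
    return start, next_start - timedelta(days=1)


def last_quarter(reference: date) -> tuple[date, date]:
    """Resolve "last quarter" relative to an explicit reference date."""
    current = (reference.month - 1) // 3 + 1
    if current == 1:
        # The quarter before Q1 is Q4 of the previous year.
        return quarter_bounds(reference.year - 1, 4)
    return quarter_bounds(reference.year, current - 1)


print(last_quarter(date(2024, 11, 15)))
# (datetime.date(2024, 7, 1), datetime.date(2024, 9, 30))
```

Passing the reference date as an argument, instead of calling `date.today()` inside the function, keeps the result reproducible in tests.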

2.2 Why This Matters

Users don’t think in Cypher—they think in questions:

“What did Alice and I discuss about the API last month?”
“When did we first mention the budget issue?”
“Show me everything that changed since the reorg.”

A temporal query engine bridges human questions and graph+time queries.

2.3 Common Misconceptions

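“LLMs can generate perfect Cypher.” In practice they hallucinate labels and properties that are not in the schema and miss temporal nuances such as open-ended intervals.
“Parse once, query once.” Complex questions need multi-step query plans.
“Temporal parsing is solved.” Context-dependent expressions such as “this week” or “after the reorg” can only be resolved against a reference date or the conversation context.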

2.4 ASCII Diagram: Query Pipeline

USER QUESTION
"What projects was Alice working on last quarter?"

                    │
                    ▼
┌─────────────────────────────────────────────────────────┐
│              TEMPORAL EXPRESSION PARSER                  │
│                                                          │
│  "last quarter" → (2024-07-01, 2024-09-30)              │
│  Context: current_date = 2024-11-15                      │
└─────────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────────┐
│              SEMANTIC PARSER (LLM-assisted)              │
│                                                          │
│  Intent: FIND_RELATIONSHIPS                              │
│  Subject: "Alice"                                        │
│  Relationship: "working on"                              │
│  Object Type: "Project"                                  │
│  Time Constraint: (2024-07-01, 2024-09-30)              │
└─────────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────────┐
│                  QUERY PLANNER                           │
│                                                          │
│  Step 1: Find entity "Alice" (fuzzy match)              │
│  Step 2: Traverse WORKS_ON relationships                │
│  Step 3: Filter by valid_time overlap with Q3           │
│  Step 4: Return project names with dates                │
└─────────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────────┐
│              CYPHER GENERATOR                            │
│                                                          │
│  MATCH (p:Person {name: "Alice"})-[r:WORKS_ON]->(proj)  │
│  WHERE r.valid_from <= date("2024-09-30")               │
│    AND (r.valid_to IS NULL                              │
│         OR r.valid_to >= date("2024-07-01"))            │
│  RETURN proj.name, r.valid_from, r.valid_to             │
└─────────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────────┐
│                  EXECUTOR                                │
│                                                          │
│  Results:                                                │
│  - "API Redesign" (2024-03-01 to ongoing)               │
│  - "Q3 Planning" (2024-07-15 to 2024-09-30)             │
└─────────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────────┐
│              RESPONSE FORMATTER                          │
│                                                          │
│  "In Q3 2024, Alice was working on:                     │
│   • API Redesign (ongoing since March)                  │
│   • Q3 Planning (completed end of September)"           │
└─────────────────────────────────────────────────────────┘
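
The WHERE clause in the Cypher Generator box is a closed-interval overlap test in which a missing valid_to means the relationship is still ongoing. The same logic can be sketched in plain Python to make it easy to unit test. The function name `overlaps` is illustrative, and the third example row (a relationship that ended in June) is made up for contrast.

```python
from datetime import date
from typing import Optional


def overlaps(valid_from: date, valid_to: Optional[date],
             range_start: date, range_end: date) -> bool:
    """Closed-interval overlap; valid_to=None means the relationship is ongoing."""
    starts_before_range_ends = valid_from <= range_end
    ends_after_range_starts = valid_to is None or valid_to >= range_start
    return starts_before_range_ends and ends_after_range_starts


q3 = (date(2024, 7, 1), date(2024, 9, 30))

print(overlaps(date(2024, 3, 1), None, *q3))                # True  (API Redesign, ongoing)
print(overlaps(date(2024, 7, 15), date(2024, 9, 30), *q3))  # True  (Q3 Planning)
print(overlaps(date(2024, 1, 1), date(2024, 6, 30), *q3))   # False (ended before Q3)
```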

3. Project Specification

3.1 What You Will Build

A Python query engine that:

Parses temporal expressions from natural language
Converts questions to Cypher with temporal constraints
Executes queries against Neo4j
Formats results for human consumption

3.2 Functional Requirements

Parse temporal: engine.parse_temporal("last quarter") → DateRange
Parse question: engine.parse_question(text) → QueryIntent
Generate Cypher: engine.to_cypher(intent) → str
Execute: engine.query(question) → Results
Natural response: engine.answer(question) → str
Explain plan: engine.explain(question) → QueryPlan

3.3 Example Usage / Output

from temporal_query import TemporalQueryEngine

engine = TemporalQueryEngine(neo4j_driver, llm_client)

# Simple temporal query
answer = engine.answer("What projects was Alice working on last quarter?")
print(answer)
# "In Q3 2024, Alice was working on:
#  • API Redesign (ongoing since March 2024)
#  • Q3 Planning (July - September 2024)"

# Explain the query plan
plan = engine.explain("When did we first discuss the budget issue?")
print(plan)
# QueryPlan:
#   1. Parse temporal: "first" → earliest occurrence
#   2. Find entity: "budget issue" → search Topics/Episodes
#   3. Query: Find earliest episode mentioning budget
#   4. Cypher: MATCH (e:Episode)-[:MENTIONS]->(t:Topic {name: "budget"})
#              RETURN e ORDER BY e.timestamp LIMIT 1

# Complex query with relative time
answer = engine.answer("What changed in Alice's projects after the reorg?")
print(answer)
# "After the reorg (October 2024):
#  • Alice moved from API Redesign to Platform Team
#  • Started working on Infrastructure Migration"

# Query with knowledge time
answer = engine.answer("What did we know about Alice's role before the correction?")
# Uses transaction time to show pre-correction state

4. Solution Architecture

4.1 High-Level Design

┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│   Question    │────▶│   Temporal    │────▶│   Semantic    │
│    Input      │     │   Parser      │     │   Parser      │
└───────────────┘     └───────────────┘     └───────────────┘
                                                   │
                                                   ▼
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│   Response    │◀────│   Executor    │◀────│    Query      │
│   Formatter   │     │               │     │   Planner     │
└───────────────┘     └───────────────┘     └───────────────┘
                            │
                            ▼
                      ┌───────────────┐
                      │    Neo4j      │
                      └───────────────┘

4.2 Key Components

Component	Responsibility	Technology
TemporalParser	Extract and normalize time expressions	dateparser + custom rules
SemanticParser	Convert NL to intent structure	LLM with function calling
QueryPlanner	Decompose into executable steps	Rule-based + LLM
CypherGenerator	Generate valid Cypher	Template + LLM validation
Executor	Run queries, handle errors	Neo4j driver
ResponseFormatter	Human-readable answers	LLM summarization

4.3 Data Models

from datetime import date
from typing import Literal

from pydantic import BaseModel

class DateRange(BaseModel):
    start: date | None = None  # None when this bound is open
    end: date | None = None    # None when this bound is open
    reference: str             # Original expression: "last quarter", "in 2023"
    is_relative: bool          # True if resolution depends on the current date

class QueryIntent(BaseModel):
    intent_type: Literal["find", "count", "when", "compare", "timeline"]
    subject: str | None = None
    relationship: str | None = None
    object_type: str | None = None
    temporal_constraint: DateRange | None = None
    knowledge_time: date | None = None  # For "what did we know" queries

class QueryStep(BaseModel):
    step_type: Literal["find_entity", "traverse", "filter_time", "aggregate"]
    description: str
    cypher_fragment: str | None = None

class QueryPlan(BaseModel):
    question: str
    steps: list[QueryStep]
    final_cypher: str
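
To show how an intent might turn into plan steps, here is a minimal rule-based sketch for a "find" intent. It uses a plain dataclass named `Step` as a dependency-free stand-in for QueryStep, and the function `plan_find` is hypothetical. The relationship type is interpolated into the fragment, so it is assumed to come from a fixed allow-list and never from raw user text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Step:
    """Stand-in for QueryStep so the sketch runs without pydantic."""
    step_type: str
    description: str
    cypher_fragment: Optional[str] = None


def plan_find(subject: str, rel_type: str,
              start: Optional[date], end: Optional[date]) -> list[Step]:
    """Build steps for a "find" intent; add a time filter only if a range is given."""
    steps = [
        Step("find_entity", f"Find entity {subject!r}",
             "MATCH (s {name: $subject})"),
        Step("traverse", f"Traverse {rel_type} relationships",
             f"MATCH (s)-[r:{rel_type}]->(o)"),
    ]
    if start is not None and end is not None:
        steps.append(Step(
            "filter_time",
            f"Keep relationships overlapping {start} to {end}",
            "WHERE r.valid_from <= $range_end "
            "AND (r.valid_to IS NULL OR r.valid_to >= $range_start)",
        ))
    return steps


for step in plan_find("Alice", "WORKS_ON", date(2024, 7, 1), date(2024, 9, 30)):
    print(step.step_type, "->", step.description)
```

Keeping each step's description next to its Cypher fragment is one way to make an explain mode cheap: the plan can be printed without executing anything.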

5. Implementation Guide

5.1 Development Environment Setup

mkdir temporal-query && cd temporal-query
python -m venv .venv && source .venv/bin/activate
pip install neo4j openai dateparser pydantic

5.2 Project Structure

temporal-query/
├── src/
│   ├── engine.py         # TemporalQueryEngine
│   ├── temporal.py       # Temporal expression parsing
│   ├── semantic.py       # NL to intent parsing
│   ├── planner.py        # Query planning
│   ├── cypher.py         # Cypher generation
│   ├── executor.py       # Query execution
│   └── formatter.py      # Response formatting
├── prompts/
│   ├── semantic_parse.txt
│   └── cypher_generate.txt
├── tests/
│   └── test_temporal.py
└── README.md

5.3 Implementation Phases

Phase 1: Temporal Parsing (6-8h)

Goals:

Parse common temporal expressions
Handle relative references

Tasks:

Use dateparser for standard expressions
Add custom rules for “last quarter”, “this year”, etc.
Handle context-dependent expressions
Build date range normalization

Checkpoint: “last quarter” returns correct date range.
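
The custom rules in this phase can start as a small lookup table from normalized phrases to resolver functions. The sketch below is one possible shape, assuming weeks run Monday to Sunday; the names `RULES` and `resolve` are illustrative. Expressions with no matching rule return None so the caller can fall back to dateparser or to conversation context.

```python
from datetime import date, timedelta
from typing import Callable, Optional

DateSpan = tuple[date, date]


def this_week(ref: date) -> DateSpan:
    """Monday through Sunday of the week containing ref."""
    monday = ref - timedelta(days=ref.weekday())
    return monday, monday + timedelta(days=6)


def last_week(ref: date) -> DateSpan:
    """The full Monday-Sunday week before the one containing ref."""
    return this_week(ref - timedelta(days=7))


def this_year(ref: date) -> DateSpan:
    return date(ref.year, 1, 1), date(ref.year, 12, 31)


RULES: dict[str, Callable[[date], DateSpan]] = {
    "this week": this_week,
    "last week": last_week,
    "this year": this_year,
}


def resolve(expression: str, reference: date) -> Optional[DateSpan]:
    """Resolve a known relative expression, or return None if no rule matches."""
    rule = RULES.get(expression.strip().lower())
    return rule(reference) if rule else None


print(resolve("last week", date(2024, 11, 15)))
# (datetime.date(2024, 11, 4), datetime.date(2024, 11, 10))
print(resolve("before the meeting", date(2024, 11, 15)))
# None
```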

Phase 2: Semantic Parsing + Planning (10-12h)

Goals:

Convert questions to structured intents
Build query plans

Tasks:

Design LLM prompt for intent extraction
Use function calling for structured output
Build rule-based query planner
Handle multi-step queries

Checkpoint: Question parses to intent with correct temporal constraint.
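
For structured output, the intent fields can be described as a JSON Schema inside a tool definition, in the shape used by OpenAI-style function calling. The sketch below is an assumption about how that might look for this project: the tool name `extract_query_intent` and the `temporal_expression` field are hypothetical, and the API call itself is omitted. Only the validation of the returned argument string is shown.

```python
import json

INTENT_TYPES = ["find", "count", "when", "compare", "timeline"]
NULLABLE_STRING = {"type": ["string", "null"]}

# Tool definition to pass to the LLM client (hypothetical name and fields).
EXTRACT_INTENT_TOOL = {
    "type": "function",
    "function": {
        "name": "extract_query_intent",
        "description": "Extract the structured intent of a temporal question.",
        "parameters": {
            "type": "object",
            "properties": {
                "intent_type": {"type": "string", "enum": INTENT_TYPES},
                "subject": NULLABLE_STRING,
                "relationship": NULLABLE_STRING,
                "object_type": NULLABLE_STRING,
                # Raw phrase such as "last quarter"; dates are resolved in code.
                "temporal_expression": NULLABLE_STRING,
            },
            "required": ["intent_type"],
        },
    },
}


def parse_tool_arguments(raw_arguments: str) -> dict:
    """Validate the JSON argument string the model returned for the tool call."""
    args = json.loads(raw_arguments)
    if not isinstance(args, dict) or args.get("intent_type") not in INTENT_TYPES:
        raise ValueError(f"unsupported intent: {raw_arguments!r}")
    fields = EXTRACT_INTENT_TOOL["function"]["parameters"]["properties"]
    # Missing optional fields become None; unexpected keys are dropped.
    return {name: args.get(name) for name in fields}


print(parse_tool_arguments(
    '{"intent_type": "find", "subject": "Alice", "temporal_expression": "last quarter"}'
))
```

Asking the model for the raw temporal phrase rather than concrete dates is a design choice: date arithmetic then stays in deterministic code, which is easier to test than model output.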

Phase 3: Cypher Generation + Execution (8-10h)

Goals:

Generate valid temporal Cypher
Execute and format results

Tasks:

Build Cypher templates for common patterns
Add temporal WHERE clause generation
Implement query executor with error handling
Build natural language response formatter

Checkpoint: End-to-end question → answer working.
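
A Cypher template for the common “find related items in a time range” pattern might look like the sketch below. User-supplied values travel as query parameters ($name, $range_start, $range_end) and are never formatted into the query text. Cypher does not accept a parameter in the relationship-type position, so the type is checked against an allow-list before being interpolated. The names `FIND_RELATED_TEMPLATE`, `ALLOWED_REL_TYPES`, and `build_find_query` are illustrative.

```python
from datetime import date

# Doubled braces survive str.format() as literal Cypher braces.
FIND_RELATED_TEMPLATE = (
    "MATCH (p:Person {{name: $name}})-[r:{rel_type}]->(o)\n"
    "WHERE r.valid_from <= $range_end\n"
    "  AND (r.valid_to IS NULL OR r.valid_to >= $range_start)\n"
    "RETURN o.name AS name, r.valid_from AS valid_from, r.valid_to AS valid_to"
)

# Assumed schema: extend with the relationship types your graph actually uses.
ALLOWED_REL_TYPES = {"WORKS_ON"}


def build_find_query(name: str, rel_type: str,
                     start: date, end: date) -> tuple[str, dict]:
    """Return (query, parameters); only the allow-listed type is interpolated."""
    if rel_type not in ALLOWED_REL_TYPES:
        raise ValueError(f"unknown relationship type: {rel_type!r}")
    query = FIND_RELATED_TEMPLATE.format(rel_type=rel_type)
    params = {"name": name, "range_start": start, "range_end": end}
    return query, params


query, params = build_find_query("Alice", "WORKS_ON",
                                 date(2024, 7, 1), date(2024, 9, 30))
print(query)
print(params)
```

The returned pair is meant to be handed to the Neo4j driver together, for example as `session.run(query, params)`, so the database receives the values separately from the query text.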

6. Testing Strategy

6.1 Test Categories

Category	Purpose	Examples
Unit	Test temporal parsing	“last week” → correct dates
Integration	Test full pipeline	Question → Cypher → results
Quality	Test answer accuracy	Compare to expected answers

6.2 Critical Test Cases

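Temporal parsing: “last quarter”, “in 2023”, and “last week” resolve to the expected date ranges for a fixed reference date
Interval overlap: Cypher keeps relationships that overlap the range, including those that only partly overlap it
NULL handling: Ongoing relationships (valid_to IS NULL) are included when they start on or before the range end and excluded when they start after it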
Edge cases: “today”, “now”, timezone handling
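
The timezone edge case is easy to reproduce: the same instant falls on different calendar dates depending on where the user is. The sketch below uses fixed UTC offsets from the standard library to stay dependency-free; a real implementation would more likely use named zones. The function name `today_in` is illustrative.

```python
from datetime import date, datetime, timedelta, timezone, tzinfo


def today_in(tz: tzinfo, now_utc: datetime) -> date:
    """Calendar date of the instant now_utc as seen in timezone tz."""
    return now_utc.astimezone(tz).date()


now_utc = datetime(2024, 12, 31, 23, 30, tzinfo=timezone.utc)
utc_plus_9 = timezone(timedelta(hours=9))    # fixed offset, e.g. Tokyo
utc_minus_5 = timezone(timedelta(hours=-5))  # fixed offset, e.g. New York in winter

print(today_in(utc_plus_9, now_utc))   # 2025-01-01
print(today_in(utc_minus_5, now_utc))  # 2024-12-31
```

Here “today” differs by a day, and “this year” by a whole year, so a test suite should pin both the instant and the timezone.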

7. Common Pitfalls & Debugging

Pitfall	Symptom	Solution
Wrong date context	“last week” off by a week	Pass current_date explicitly
Schema mismatch	LLM generates invalid property names	Provide schema in prompt
Cypher injection	User input in Cypher	Use parameters, not string concat
Empty results	Query returns nothing	Add explain mode, check filters
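
Beyond putting the schema in the prompt, generated Cypher can be checked against it before execution. The sketch below is a regex heuristic, not a Cypher parser: it misses properties inside map literals such as {name: $name}, sees only the first label in a multi-label pattern, and can be fooled by dots inside string literals. The `SCHEMA` contents are assumed from the examples in this project and the function name is hypothetical.

```python
import re

# Assumed schema, taken from the labels and properties used in this project.
SCHEMA = {
    "labels": {"Person", "Project", "Episode", "Topic"},
    "relationship_types": {"WORKS_ON", "MENTIONS"},
    "properties": {"name", "timestamp", "valid_from", "valid_to"},
}

NODE_LABEL = re.compile(r"\(\s*\w*\s*:\s*(\w+)")         # (p:Person
REL_TYPE = re.compile(r"\[\s*\w*\s*:\s*(\w+)")           # [r:WORKS_ON
PROPERTY = re.compile(r"\b[A-Za-z_]\w*\.([A-Za-z_]\w*)")  # r.valid_from


def unknown_schema_names(cypher: str) -> dict[str, set[str]]:
    """Return names used in the query that are absent from SCHEMA."""
    return {
        "labels": set(NODE_LABEL.findall(cypher)) - SCHEMA["labels"],
        "relationship_types": set(REL_TYPE.findall(cypher)) - SCHEMA["relationship_types"],
        "properties": set(PROPERTY.findall(cypher)) - SCHEMA["properties"],
    }


generated = (
    "MATCH (p:Employee)-[r:ASSIGNED_TO]->(x:Project) "
    "WHERE r.start_date <= $range_end RETURN x.name"
)
print(unknown_schema_names(generated))
```

Any non-empty set is a reason to reject the query or send the unknown names back to the model for a retry.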

8. Extensions & Challenges

8.1 Beginner Extensions

Add a “show me examples” option to the query explanation
Add query history and favorites

8.2 Intermediate Extensions

Add multi-hop temporal reasoning
Implement query result caching
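
One caveat for result caching: a relative phrase such as “last quarter” means different dates over time, so keying the cache on the question text alone would return stale answers. A minimal sketch, assuming the cache key includes the resolved date range; `cache_key` and `QueryCache` are hypothetical names, and invalidation when the graph changes is not handled here.

```python
from datetime import date
from typing import Any, Optional


def cache_key(question: str, start: Optional[date], end: Optional[date]) -> tuple:
    """Key on the normalized question plus the resolved range, not the raw phrase."""
    normalized = " ".join(question.lower().split())
    return (normalized, start, end)


class QueryCache:
    """In-memory cache with no eviction or invalidation."""

    def __init__(self) -> None:
        self._entries: dict[tuple, Any] = {}

    def get(self, key: tuple) -> Any:
        return self._entries.get(key)

    def put(self, key: tuple, result: Any) -> None:
        self._entries[key] = result


cache = QueryCache()
question = "What projects was Alice working on last quarter?"

# Asked in November 2024, "last quarter" resolves to Q3.
november_key = cache_key(question, date(2024, 7, 1), date(2024, 9, 30))
cache.put(november_key, ["API Redesign", "Q3 Planning"])

# Asked in February 2025, the same words resolve to Q4, so the key differs.
february_key = cache_key(question, date(2024, 10, 1), date(2024, 12, 31))

print(cache.get(november_key))  # ['API Redesign', 'Q3 Planning']
print(cache.get(february_key))  # None
```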

8.3 Advanced Extensions

Add query auto-correction from errors
Implement temporal inference rules

9. Real-World Connections

9.1 Industry Applications

Business Intelligence: Natural language BI queries
Knowledge Management: Temporal Q&A over corporate memory
AI Assistants: Conversational memory access

9.2 Interview Relevance

Explain how semantic parsing differs from keyword search
Discuss the pros and cons of using LLMs for query generation
Describe how to optimize temporal queries

10. Resources

10.1 Essential Reading

“AI Engineering” by Chip Huyen — Ch. on Tool Use and Agents
Neo4j Cypher Manual — Temporal functions
dateparser documentation — Temporal expression parsing

10.2 Related Projects

Previous: Project 5 (Bi-Temporal Fact Store)
Next: Project 7 (Semantic Memory Synthesizer)

11. Self-Assessment Checklist

I can parse “last quarter” to a date range given context
I understand how to generate Cypher with temporal constraints
I can decompose complex temporal questions into query steps
I know the limitations of LLM-generated queries

12. Submission / Completion Criteria

Minimum Viable Completion:

Temporal expression parsing working
Basic question → Cypher generation
Query execution returning results

Full Completion:

Multi-step query planning
Natural language response formatting
Query explanation mode

Excellence:

Query caching and optimization
Auto-correction from errors
Complex multi-hop temporal queries
