Project 6: Temporal Query Engine

Build a query engine that answers natural-language temporal questions such as “What projects was Alice working on last quarter?” by translating them into graph queries with temporal constraints.

Quick Reference

Attribute     | Value
--------------+---------------------------------------------------------------
Difficulty    | Level 3: Advanced
Time Estimate | 2 weeks (25-35 hours)
Language      | Python (Alternatives: TypeScript)
Prerequisites | Projects 1-5, Cypher, LLM function calling
Key Topics    | Natural language to query translation, temporal reasoning, Cypher generation, query planning, semantic parsing

1. Learning Objectives

By completing this project, you will:

  1. Parse temporal expressions from natural language.
  2. Generate Cypher queries with temporal constraints.
  3. Build a query planner that combines graph traversal with time filtering.
  4. Use LLMs for query understanding while maintaining precision.
  5. Create a feedback loop for query refinement.

2. Theoretical Foundation

2.1 Core Concepts

  • Temporal Expression Parsing: Converting “last quarter”, “in 2023”, “before the meeting” into date ranges.

  • Semantic Parsing: Converting natural language to structured query representation.

  • Query Planning: Decomposing complex questions into executable query steps.

  • Cypher Temporal Patterns: Using WHERE clauses with date comparisons and interval predicates.

  • Disambiguation: Handling ambiguous time references (“this week” depends on context).
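
As a rough illustration of temporal expression parsing, the sketch below resolves “last quarter” against an explicit reference date. It assumes calendar quarters (Q1 is January to March); a fiscal calendar would need different boundaries. The function names quarter_bounds and last_quarter are illustrative and do not come from any library.

```python
from datetime import date, timedelta


def quarter_bounds(year: int, quarter: int) -> tuple[date, date]:
    """Return the first and last day of a calendar quarter (1 to 4)."""
    start = date(year, 3 * (quarter - 1) + 1, 1)
    # The last day of a quarter is the day before the next quarter starts.
    if quarter == 4:
        next_start = date(year + 1, 1, 1)
    else:
        next_start = date(year, 3 * quarter + 1, 1)
    return start, next_start - timedelta(days=1)


def last_quarter(today: date) -> tuple[date, date]:
    """Resolve "last quarter" relative to an explicit reference date."""
    current = (today.month - 1) // 3 + 1
    if current == 1:
        # Q1 rolls back to Q4 of the previous year.
        return quarter_bounds(today.year - 1, 4)
    return quarter_bounds(today.year, current - 1)
```

With a reference date of 2024-11-15, last_quarter returns the range 2024-07-01 to 2024-09-30. Passing the reference date in, rather than calling date.today() inside the function, keeps the result reproducible in tests.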

2.2 Why This Matters

Users don’t think in Cypher—they think in questions:

  • “What did Alice and I discuss about the API last month?”
  • “When did we first mention the budget issue?”
  • “Show me everything that changed since the reorg.”

A temporal query engine bridges human questions and graph+time queries.

2.3 Common Misconceptions

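  • “LLMs can generate perfect Cypher.” They can hallucinate labels, relationship types, and property names, and they often mishandle temporal details such as open-ended intervals.
  • “Parse once, query once.” Complex questions need multi-step query plans.
  • “Temporal parsing is solved.” Context-dependent expressions such as “before the meeting” can only be resolved with the conversation context and a reference date.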
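
One way to guard against hallucinated schema is to check generated Cypher against known labels and relationship types before running it. The sketch below is a rough regex-based check, not a Cypher parser. The allow-lists are hypothetical, and the check ignores multi-label patterns, property names, and string literals that happen to contain pattern-like text.

```python
import re

# Hypothetical schema; in practice this would be read from the graph.
ALLOWED_LABELS = {"Person", "Project", "Episode", "Topic"}
ALLOWED_REL_TYPES = {"WORKS_ON", "MENTIONS"}

# Label inside a node pattern such as (p:Person ...) or (:Topic)
_NODE_LABEL = re.compile(r"\(\s*\w*\s*:\s*(\w+)")
# Type inside a relationship pattern such as -[r:WORKS_ON]-> or -[:MENTIONS]->
_REL_TYPE = re.compile(r"\[\s*\w*\s*:\s*(\w+)")


def unknown_schema_terms(cypher: str) -> set[str]:
    """Return labels and relationship types that are not in the allow-lists."""
    labels = set(_NODE_LABEL.findall(cypher))
    rel_types = set(_REL_TYPE.findall(cypher))
    return (labels - ALLOWED_LABELS) | (rel_types - ALLOWED_REL_TYPES)
```

An empty result means the query only uses known terms; anything else can be sent back to the LLM as feedback or rejected outright.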

2.4 ASCII Diagram: Query Pipeline

USER QUESTION
"What projects was Alice working on last quarter?"

                    │
                    ▼
┌─────────────────────────────────────────────────────────┐
│              TEMPORAL EXPRESSION PARSER                  │
│                                                          │
│  "last quarter" → (2024-07-01, 2024-09-30)              │
│  Context: current_date = 2024-11-15                      │
└─────────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────────┐
│              SEMANTIC PARSER (LLM-assisted)              │
│                                                          │
│  Intent: FIND_RELATIONSHIPS                              │
│  Subject: "Alice"                                        │
│  Relationship: "working on"                              │
│  Object Type: "Project"                                  │
│  Time Constraint: (2024-07-01, 2024-09-30)              │
└─────────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────────┐
│                  QUERY PLANNER                           │
│                                                          │
│  Step 1: Find entity "Alice" (fuzzy match)              │
│  Step 2: Traverse WORKS_ON relationships                │
│  Step 3: Filter by valid_time overlap with Q3           │
│  Step 4: Return project names with dates                │
└─────────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────────┐
│              CYPHER GENERATOR                            │
│                                                          │
│  MATCH (p:Person {name: "Alice"})-[r:WORKS_ON]->(proj)  │
│  WHERE r.valid_from <= date("2024-09-30")               │
│    AND (r.valid_to IS NULL OR r.valid_to >= date("2024-07-01"))│
│  RETURN proj.name, r.valid_from, r.valid_to             │
└─────────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────────┐
│                  EXECUTOR                                │
│                                                          │
│  Results:                                                │
│  - "API Redesign" (2024-03-01 to ongoing)               │
│  - "Q3 Planning" (2024-07-15 to 2024-09-30)             │
└─────────────────────────────────────────────────────────┘
                    │
                    ▼
┌─────────────────────────────────────────────────────────┐
│              RESPONSE FORMATTER                          │
│                                                          │
│  "In Q3 2024, Alice was working on:                     │
│   • API Redesign (ongoing since March)                  │
│   • Q3 Planning (completed end of September)"           │
└─────────────────────────────────────────────────────────┘

3. Project Specification

3.1 What You Will Build

A Python query engine that:

  • Parses temporal expressions from natural language
  • Converts questions to Cypher with temporal constraints
  • Executes queries against Neo4j
  • Formats results for human consumption

3.2 Functional Requirements

  1. Parse temporal: engine.parse_temporal("last quarter") → DateRange
  2. Parse question: engine.parse_question(text) → QueryIntent
  3. Generate Cypher: engine.to_cypher(intent) → str
  4. Execute: engine.query(question) → Results
  5. Natural response: engine.answer(question) → str
  6. Explain plan: engine.explain(question) → QueryPlan

3.3 Example Usage / Output

from temporal_query import TemporalQueryEngine

# neo4j_driver and llm_client are assumed to be constructed elsewhere
engine = TemporalQueryEngine(neo4j_driver, llm_client)

# Simple temporal query
answer = engine.answer("What projects was Alice working on last quarter?")
print(answer)
# "In Q3 2024, Alice was working on:
#  • API Redesign (ongoing since March 2024)
#  • Q3 Planning (July - September 2024)"

# Explain the query plan
plan = engine.explain("When did we first discuss the budget issue?")
print(plan)
# QueryPlan:
#   1. Parse temporal: "first" → earliest occurrence
#   2. Find entity: "budget issue" → search Topics/Episodes
#   3. Query: Find earliest episode mentioning budget
#   4. Cypher: MATCH (e:Episode)-[:MENTIONS]->(t:Topic {name: "budget"})
#              RETURN e ORDER BY e.timestamp LIMIT 1

# Complex query with relative time
answer = engine.answer("What changed in Alice's projects after the reorg?")
print(answer)
# "After the reorg (October 2024):
#  • Alice moved from API Redesign to Platform Team
#  • Started working on Infrastructure Migration"

# Query with knowledge time
answer = engine.answer("What did we know about Alice's role before the correction?")
# Uses transaction time to show pre-correction state
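
For the knowledge-time case, one possible shape is a query that filters on transaction time instead of valid time. The sketch below only builds the query text and its parameters. The HAS_ROLE relationship and the tx_from and tx_to properties are hypothetical names; substitute whatever your bi-temporal store from Project 5 uses.

```python
from datetime import date


def role_as_known_on(person: str, known_on: date) -> tuple[str, dict]:
    """Build a query for the role facts that were recorded as of known_on.

    Assumes each relationship carries a half-open transaction-time interval
    [tx_from, tx_to), with tx_to unset while the record is still current.
    """
    cypher = (
        "MATCH (p:Person {name: $name})-[r:HAS_ROLE]->(role) "
        "WHERE r.tx_from <= date($known_on) "
        "AND (r.tx_to IS NULL OR r.tx_to > date($known_on)) "
        "RETURN role.name AS role, r.valid_from AS valid_from, "
        "r.valid_to AS valid_to"
    )
    # Dates travel as ISO strings and are converted by date() in Cypher.
    return cypher, {"name": person, "known_on": known_on.isoformat()}
```

Setting known_on to the day before the correction returns the records as they stood before it was written.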

4. Solution Architecture

4.1 High-Level Design

┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│   Question    │────▶│   Temporal    │────▶│   Semantic    │
│    Input      │     │   Parser      │     │   Parser      │
└───────────────┘     └───────────────┘     └───────────────┘
                                                   │
                                                   ▼
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│   Response    │◀────│   Executor    │◀────│    Query      │
│   Formatter   │     │               │     │   Planner     │
└───────────────┘     └───────────────┘     └───────────────┘
                            │
                            ▼
                      ┌───────────────┐
                      │    Neo4j      │
                      └───────────────┘

4.2 Key Components

Component         | Responsibility                         | Technology
------------------+----------------------------------------+---------------------------
TemporalParser    | Extract and normalize time expressions | dateparser + custom rules
SemanticParser    | Convert NL to intent structure         | LLM with function calling
QueryPlanner      | Decompose into executable steps        | Rule-based + LLM
CypherGenerator   | Generate valid Cypher                  | Template + LLM validation
Executor          | Run queries, handle errors             | Neo4j driver
ResponseFormatter | Human-readable answers                 | LLM summarization

4.3 Data Models

from pydantic import BaseModel
from datetime import date
from typing import Literal

class DateRange(BaseModel):
    start: date | None
    end: date | None
    reference: str  # "last quarter", "in 2023"
    is_relative: bool

class QueryIntent(BaseModel):
    intent_type: Literal["find", "count", "when", "compare", "timeline"]
    subject: str | None
    relationship: str | None
    object_type: str | None
    temporal_constraint: DateRange | None
    knowledge_time: date | None  # For "what did we know" queries

class QueryStep(BaseModel):
    step_type: Literal["find_entity", "traverse", "filter_time", "aggregate"]
    description: str
    cypher_fragment: str | None

class QueryPlan(BaseModel):
    question: str
    steps: list[QueryStep]
    final_cypher: str
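
The QueryIntent fields map fairly directly onto a function-calling tool definition. The sketch below uses the JSON-schema shape of OpenAI-style tool definitions and validates the arguments string the model returns. The tool name extract_intent and the temporal_phrase field are hypothetical; the idea behind temporal_phrase is that the LLM only copies the time expression and the TemporalParser resolves the dates, which is one way to keep date arithmetic out of the model.

```python
import json

INTENT_TYPES = ["find", "count", "when", "compare", "timeline"]

# Hypothetical tool definition; pass it in the `tools` list of the LLM call.
EXTRACT_INTENT_TOOL = {
    "type": "function",
    "function": {
        "name": "extract_intent",
        "description": "Extract the structured intent of a temporal question.",
        "parameters": {
            "type": "object",
            "properties": {
                "intent_type": {"type": "string", "enum": INTENT_TYPES},
                "subject": {"type": "string"},
                "relationship": {"type": "string"},
                "object_type": {"type": "string"},
                "temporal_phrase": {
                    "type": "string",
                    "description": "Time expression copied verbatim "
                                   "from the question.",
                },
            },
            "required": ["intent_type"],
        },
    },
}


def parse_tool_arguments(arguments_json: str) -> dict:
    """Validate the JSON arguments string returned for the tool call."""
    args = json.loads(arguments_json)
    if args.get("intent_type") not in INTENT_TYPES:
        raise ValueError(f"unsupported intent_type: {args.get('intent_type')!r}")
    fields = EXTRACT_INTENT_TOOL["function"]["parameters"]["properties"]
    # Optional fields the model omitted become None.
    return {name: args.get(name) for name in fields}
```

The returned dict can then be combined with the parsed DateRange to construct a QueryIntent.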

5. Implementation Guide

5.1 Development Environment Setup

mkdir temporal-query && cd temporal-query
python -m venv .venv && source .venv/bin/activate
pip install neo4j openai dateparser pydantic

5.2 Project Structure

temporal-query/
├── src/
│   ├── engine.py         # TemporalQueryEngine
│   ├── temporal.py       # Temporal expression parsing
│   ├── semantic.py       # NL to intent parsing
│   ├── planner.py        # Query planning
│   ├── cypher.py         # Cypher generation
│   ├── executor.py       # Query execution
│   └── formatter.py      # Response formatting
├── prompts/
│   ├── semantic_parse.txt
│   └── cypher_generate.txt
├── tests/
│   └── test_temporal.py
└── README.md

5.3 Implementation Phases

Phase 1: Temporal Parsing (6-8h)

Goals:

  • Parse common temporal expressions
  • Handle relative references

Tasks:

  1. Use dateparser for standard expressions
  2. Add custom rules for “last quarter”, “this year”, etc.
  3. Handle context-dependent expressions
  4. Build date range normalization

Checkpoint: “last quarter” returns the correct date range for an explicitly supplied current date.

Phase 2: Semantic Parsing + Planning (10-12h)

Goals:

  • Convert questions to structured intents
  • Build query plans

Tasks:

  1. Design LLM prompt for intent extraction
  2. Use function calling for structured output
  3. Build rule-based query planner
  4. Handle multi-step queries

Checkpoint: Question parses to intent with correct temporal constraint.

Phase 3: Cypher Generation + Execution (8-10h)

Goals:

  • Generate valid temporal Cypher
  • Execute and format results

Tasks:

  1. Build Cypher templates for common patterns
  2. Add temporal WHERE clause generation
  3. Implement query executor with error handling
  4. Build natural language response formatter

Checkpoint: End-to-end question → answer working.
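
For tasks 1 and 2, a template can keep all user-supplied values in query parameters. In this sketch the relationship type is interpolated into the template text rather than passed as a parameter, so it is checked against an allow-list first. The allow-list contents and the function name build_relationship_query are hypothetical.

```python
from datetime import date

# Hypothetical allow-list; derive it from the real schema in practice.
ALLOWED_REL_TYPES = {"WORKS_ON", "MENTIONS"}


def build_relationship_query(name: str, rel_type: str,
                             start: date, end: date) -> tuple[str, dict]:
    """Build parameterized Cypher for relationships overlapping [start, end]."""
    if rel_type not in ALLOWED_REL_TYPES:
        raise ValueError(f"unknown relationship type: {rel_type!r}")
    cypher = (
        f"MATCH (p:Person {{name: $name}})-[r:{rel_type}]->(target) "
        "WHERE r.valid_from <= date($range_end) "
        "AND (r.valid_to IS NULL OR r.valid_to >= date($range_start)) "
        "RETURN target.name AS name, r.valid_from AS valid_from, "
        "r.valid_to AS valid_to"
    )
    params = {
        "name": name,
        "range_start": start.isoformat(),
        "range_end": end.isoformat(),
    }
    return cypher, params
```

The pair can be handed to the Neo4j Python driver, for example with session.run(cypher, params), so the person's name never becomes part of the query text.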


6. Testing Strategy

6.1 Test Categories

Category    | Purpose               | Examples
------------+-----------------------+------------------------------
Unit        | Test temporal parsing | “last week” → correct dates
Integration | Test full pipeline    | Question → Cypher → results
Quality     | Test answer accuracy  | Compare to expected answers

6.2 Critical Test Cases

  1. Temporal parsing: Various expressions parse correctly
  2. Interval overlap: Cypher correctly filters by time range
  3. NULL handling: Ongoing relationships (valid_to IS NULL) are included when they overlap the range and excluded when they start after it ends
  4. Edge cases: “today”, “now”, timezone handling
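
The timezone edge case is easy to reproduce: the same instant falls on different calendar days depending on the user's offset, so “today” needs an explicit timezone as well as an explicit clock. The sketch below uses fixed UTC offsets to stay dependency-free; real code would more likely use named zones from the standard zoneinfo module. The function name today_in is illustrative.

```python
from datetime import date, datetime, timedelta, timezone


def today_in(now_utc: datetime, utc_offset_hours: int) -> date:
    """Return the calendar date of an aware UTC instant at a fixed offset."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return now_utc.astimezone(local_zone).date()
```

At 23:30 UTC on 2024-12-31, a user at UTC+9 is already in 2025, so “this year” and “last quarter” resolve differently for them than for a user at UTC-8.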

7. Common Pitfalls & Debugging

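Pitfall            | Symptom                                   | Solution
-------------------+-------------------------------------------+-----------------------------------------------
Wrong date context | “last week” is off by a week              | Pass current_date explicitly
Schema mismatch    | LLM generates invalid property names      | Provide the schema in the prompt
Cypher injection   | User input is concatenated into Cypher    | Use query parameters, not string concatenation
Empty results      | Query returns nothing                     | Add an explain mode and check the filters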

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add a “show me examples” option to the query explanation
  • Add query history and favorites

8.2 Intermediate Extensions

  • Add multi-hop temporal reasoning
  • Implement query result caching

8.3 Advanced Extensions

  • Add query auto-correction from errors
  • Implement temporal inference rules

9. Real-World Connections

9.1 Industry Applications

  • Business Intelligence: Natural language BI queries
  • Knowledge Management: Temporal Q&A over corporate memory
  • AI Assistants: Conversational memory access

9.2 Interview Relevance

  • Explain how semantic parsing differs from keyword search
  • Discuss the pros and cons of using an LLM for query generation
  • Describe how to optimize temporal queries

10. Resources

10.1 Essential Reading

  • “AI Engineering” by Chip Huyen — Ch. on Tool Use and Agents
  • Neo4j Cypher Manual — Temporal functions
  • dateparser documentation — Temporal expression parsing
  • Previous: Project 5 (Bi-Temporal Fact Store)
  • Next: Project 7 (Semantic Memory Synthesizer)

11. Self-Assessment Checklist

  • I can parse “last quarter” to a date range given context
  • I understand how to generate Cypher with temporal constraints
  • I can decompose complex temporal questions into query steps
  • I know the limitations of LLM-generated queries

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Temporal expression parsing working
  • Basic question → Cypher generation
  • Query execution returning results

Full Completion:

  • Multi-step query planning
  • Natural language response formatting
  • Query explanation mode

Excellence:

  • Query caching and optimization
  • Auto-correction from errors
  • Complex multi-hop temporal queries