Project 5: Bi-Temporal Fact Store

Build a fact storage system with both valid time (when facts are true) and transaction time (when facts were recorded), enabling temporal queries like “What did we know on date X about Y?”

Quick Reference

Attribute	Value
Difficulty	Level 3: Advanced
Time Estimate	2 weeks (25-35 hours)
Language	Python (Alternatives: TypeScript, Go)
Prerequisites	Projects 1-4, temporal databases, SQL datetime handling
Key Topics	Bi-temporal modeling, valid time, transaction time, temporal queries, point-in-time reconstruction, Allen’s interval algebra

1. Learning Objectives

By completing this project, you will:

Understand the difference between valid time and transaction time.
Design a bi-temporal schema for knowledge graph facts.
Implement temporal CRUD operations that preserve history.
Build queries for point-in-time reconstruction.
Handle temporal predicates (before, during, overlaps, etc.).

2. Theoretical Foundation

2.1 Core Concepts

Valid Time (VT): When a fact is true in the real world. “Alice worked at Acme from 2020 to 2023.”
Transaction Time (TT): When a fact was recorded in the system. “We learned this on 2024-01-15.”
Bi-Temporal Model: Combining both dimensions allows four types of queries:
- Current knowledge about current facts
- Current knowledge about past facts
- Past knowledge about past facts (what we knew then)
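- Past knowledge that was later corrected (audit trails: comparing what we knew then with what we know now)
Temporal Predicates: Allen’s interval algebra defines 13 relations between time intervals:
- before, meets, overlaps, starts, during, finishes, their six inverses (for example, after is the inverse of before), and equals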
Snapshot vs. Delta Storage: Store full state at each point vs. store changes only.
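
As an illustration of the temporal predicates above, here is a minimal sketch that classifies a pair of intervals into one of the 13 Allen relations. The function name `allen_relation` and the inverse names (`met_by`, `overlapped_by`, `started_by`, `contains`, `finished_by`) are choices made for this sketch, not part of the project specification. It assumes half-open intervals with `start < end` and comparable endpoints (dates, datetimes, or numbers); an ongoing interval would first need its `None` end replaced by a sentinel such as `date.max`.

```python
from datetime import date


def allen_relation(a_start, a_end, b_start, b_end):
    """Return the Allen relation that interval A has to interval B.

    Both intervals must satisfy start < end. Exactly one of the
    13 relation names is returned.
    """
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals must have start < end")

    # Disjoint intervals with a gap between them.
    if a_end < b_start:
        return "before"
    if b_end < a_start:
        return "after"
    # Intervals that touch at exactly one endpoint.
    if a_end == b_start:
        return "meets"
    if b_end == a_start:
        return "met_by"
    # Intervals sharing one or both endpoints.
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started_by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished_by"
    # One interval strictly inside the other.
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Only partial overlap remains.
    return "overlaps" if a_start < b_start else "overlapped_by"


# Alice's employment compared with calendar year 2021.
employment = (date(2020, 3, 15), date(2023, 6, 30))
year_2021 = (date(2021, 1, 1), date(2022, 1, 1))
print(allen_relation(*employment, *year_2021))  # contains
```

Ordering the checks from disjoint to touching to shared endpoints means the final branch only has to tell the two partial-overlap cases apart.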

2.2 Why This Matters

AI memory without temporality is broken:

“Alice works at Acme” vs “Alice worked at Acme” (validity)
“User said X yesterday but corrected to Y today” (transaction)
“What did the agent believe last week?” (debugging, audit)

Without bi-temporal tracking, you can’t distinguish outdated facts from corrections.

2.3 Common Misconceptions

“Just use updated_at.” That’s transaction time only—you lose when facts were actually true.
“Delete old facts.” Deletion destroys audit trails and makes debugging impossible.
“One timestamp is enough.” You need two independent dimensions for full temporal reasoning.

2.4 ASCII Diagram: Bi-Temporal Space

                     TRANSACTION TIME (when recorded)
                     ─────────────────────────────────►
                     Jan      Feb      Mar      Apr
                  ┌────────────────────────────────────┐
              2020│ ░░░░░░░░ ░░░░░░░░                  │
                  │          (recorded in Feb that     │
     V            │           Alice worked at Acme     │
     A            │           from 2020)               │
     L         2021│                                   │
     I            │                                    │
     D            │                                    │
                  │                                    │
     T         2022│          ████████ ████████        │
     I            │          (recorded in Mar that    │
     M            │           Alice left Acme in 2022)│
     E            │                                    │
              2023│                    ▓▓▓▓▓▓▓▓       │
     ↓            │                   (correction:     │
                  │                    actually 2023)  │
                  └────────────────────────────────────┘

Query: "What did we know in February about Alice's employment?"
Answer: Alice works at Acme (started 2020, no end recorded yet)

Query: "What do we know NOW about Alice's employment?"
Answer: Alice worked at Acme 2020-2023 (corrected in April)

2.5 Bi-Temporal Fact Example

FACT: Alice works at Acme

Version 1 (recorded Feb 1):
┌─────────────────────────────────────────────────────┐
│ subject: Alice                                       │
│ predicate: WORKS_AT                                  │
│ object: Acme                                         │
│ valid_from: 2020-03-15                              │
│ valid_to: NULL (ongoing)                            │
│ tx_from: 2024-02-01                                 │
│ tx_to: NULL (current version)                       │
└─────────────────────────────────────────────────────┘

Version 2 (recorded Apr 1, correcting end date):
┌─────────────────────────────────────────────────────┐
│ subject: Alice                                       │
│ predicate: WORKS_AT                                  │
│ object: Acme                                         │
│ valid_from: 2020-03-15                              │
│ valid_to: 2023-06-30  ◄── Correction                │
│ tx_from: 2024-04-01   ◄── New version               │
│ tx_to: NULL (current version)                       │
└─────────────────────────────────────────────────────┘

Previous version now closed:
┌─────────────────────────────────────────────────────┐
│ ... (same as Version 1)                             │
│ tx_to: 2024-04-01  ◄── No longer current            │
└─────────────────────────────────────────────────────┘
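
The version boxes above can be sketched in memory as follows. This is an illustration rather than the store's actual implementation: the names `FactVersion`, `correct_valid_to`, and `version_known_at` are hypothetical, and the sketch assumes transaction time is half-open (a version is known from `tx_from` up to, but not including, `tx_to`).

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class FactVersion:
    subject: str
    predicate: str
    object: str
    valid_from: date
    valid_to: Optional[date]        # None = ongoing
    tx_from: date
    tx_to: Optional[date] = None    # None = current version


def correct_valid_to(versions: List[FactVersion], new_valid_to: date,
                     recorded_on: date) -> List[FactVersion]:
    """Close the current version and append a corrected one.

    Nothing is overwritten: the old version stays in the list with
    its tx_to set to the moment of the correction.
    """
    current = next(v for v in versions if v.tx_to is None)
    closed = replace(current, tx_to=recorded_on)
    corrected = replace(current, valid_to=new_valid_to, tx_from=recorded_on)
    return [closed if v is current else v for v in versions] + [corrected]


def version_known_at(versions: List[FactVersion],
                     when: date) -> Optional[FactVersion]:
    """Return the version that was current at transaction time `when`."""
    for v in versions:
        if v.tx_from <= when and (v.tx_to is None or when < v.tx_to):
            return v
    return None


v1 = FactVersion("Alice", "WORKS_AT", "Acme",
                 valid_from=date(2020, 3, 15), valid_to=None,
                 tx_from=date(2024, 2, 1))
versions = correct_valid_to([v1], date(2023, 6, 30), recorded_on=date(2024, 4, 1))

print(version_known_at(versions, date(2024, 3, 1)).valid_to)  # None
print(version_known_at(versions, date(2024, 4, 1)).valid_to)  # 2023-06-30
```

Because the closed version's `tx_to` equals the new version's `tx_from`, every transaction-time instant maps to at most one version.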

3. Project Specification

3.1 What You Will Build

A Python fact store with:

Bi-temporal fact storage (valid time + transaction time)
Temporal CRUD operations
Point-in-time queries
Interval predicates for complex temporal logic

3.2 Functional Requirements

Store fact: store.add_fact(subject, predicate, object, valid_from, valid_to)
Update fact: store.update_fact(fact_id, valid_to=new_date) (creates new version)
Invalidate fact: store.invalidate(fact_id) (sets tx_to on the current version; the row is kept for history)
Query current: store.query(subject, predicate) → current valid facts
Query at time: store.query_at(subject, predicate, as_of, known_at) → point-in-time
Temporal predicates: store.query(valid_time_overlaps=(start, end))
Fact history: store.history(fact_id) → all versions

3.3 Example Usage / Output

from temporal_store import BiTemporalFactStore
from datetime import date

store = BiTemporalFactStore()

# Record that Alice works at Acme (learned today).
# add_fact is assumed to return an id that stays stable across versions.
fact_id = store.add_fact(
    subject="Alice",
    predicate="WORKS_AT",
    object="Acme",
    valid_from=date(2020, 3, 15),
    valid_to=None  # ongoing
)

# Later, we learn she left
store.update_fact(fact_id, valid_to=date(2023, 6, 30))

# Query current knowledge
facts = store.query(subject="Alice", predicate="WORKS_AT")
print(facts)
# [Fact(object="Acme", valid_from=2020-03-15, valid_to=2023-06-30)]

# Query what we knew before the correction
# (dates follow section 2.5: recorded 2024-02-01, corrected 2024-04-01)
facts = store.query_at(
    subject="Alice",
    predicate="WORKS_AT",
    known_at=date(2024, 3, 1)  # Before correction
)
print(facts)
# [Fact(object="Acme", valid_from=2020-03-15, valid_to=None)] # Still ongoing!

# Query facts valid during 2021
facts = store.query(
    subject="Alice",
    valid_time_overlaps=(date(2021, 1, 1), date(2021, 12, 31))
)
print(facts)
# [Fact(predicate="WORKS_AT", object="Acme",...)]

# Get full history (tx_from/tx_to are transaction time, not valid time)
history = store.history(fact_id)
for version in history:
    print(f"Version recorded {version.tx_from} - {version.tx_to}: {version}")

4. Solution Architecture

4.1 High-Level Design

┌─────────────────────┐
│  BiTemporalFactStore│
└──────────┬──────────┘
           │
     ┌─────┴─────┐
     │           │
     ▼           ▼
┌─────────┐ ┌─────────────┐
│ SQLite  │ │  Temporal   │
│ Storage │ │  Index      │
└─────────┘ └─────────────┘
     │           │
     └─────┬─────┘
           ▼
    ┌─────────────┐
    │   Query     │
    │   Engine    │
    └─────────────┘

4.2 Key Components

Component	Responsibility	Technology
FactStore	CRUD operations with temporal semantics	Python class
TemporalIndex	Efficient interval queries	Interval trees or SQL indexes
QueryEngine	Temporal predicate evaluation	Allen’s algebra implementation
VersionManager	Handle fact versioning	Transaction time tracking

4.3 Data Model

CREATE TABLE facts (
    id TEXT PRIMARY KEY,
    subject TEXT NOT NULL,
    predicate TEXT NOT NULL,
    object TEXT NOT NULL,
    properties JSON,

    -- Valid time (when true in reality)
    valid_from DATE NOT NULL,
    valid_to DATE,  -- NULL means ongoing

    -- Transaction time (when recorded)
    tx_from TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    tx_to TIMESTAMP,  -- NULL means current version

    -- Provenance
    source_episode TEXT,
    confidence REAL DEFAULT 1.0
);

-- Index for current facts
CREATE INDEX idx_current ON facts(subject, predicate)
    WHERE tx_to IS NULL;

-- Index for valid time queries
CREATE INDEX idx_valid_time ON facts(valid_from, valid_to);
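
To check that the schema above loads in SQLite and that the current-version filter behaves as intended, it can be exercised with Python's built-in `sqlite3` module. This is a small sketch: the row ids `v1` and `v2` and the timestamps are made up for illustration, and dates are stored as ISO-8601 text so that string comparison matches chronological order.

```python
import sqlite3

SCHEMA = """
CREATE TABLE facts (
    id TEXT PRIMARY KEY,
    subject TEXT NOT NULL,
    predicate TEXT NOT NULL,
    object TEXT NOT NULL,
    properties JSON,
    valid_from DATE NOT NULL,
    valid_to DATE,
    tx_from TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    tx_to TIMESTAMP,
    source_episode TEXT,
    confidence REAL DEFAULT 1.0
);
CREATE INDEX idx_current ON facts(subject, predicate) WHERE tx_to IS NULL;
CREATE INDEX idx_valid_time ON facts(valid_from, valid_to);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Two versions of the same fact: v1 was superseded by v2 on 2024-04-01.
rows = [
    ("v1", "Alice", "WORKS_AT", "Acme", "2020-03-15", None,
     "2024-02-01 00:00:00", "2024-04-01 00:00:00"),
    ("v2", "Alice", "WORKS_AT", "Acme", "2020-03-15", "2023-06-30",
     "2024-04-01 00:00:00", None),
]
conn.executemany(
    "INSERT INTO facts (id, subject, predicate, object, valid_from, valid_to,"
    " tx_from, tx_to) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    rows,
)

# Current knowledge = versions whose transaction interval is still open.
current = conn.execute(
    "SELECT id, valid_from, valid_to FROM facts"
    " WHERE subject = ? AND predicate = ? AND tx_to IS NULL",
    ("Alice", "WORKS_AT"),
).fetchall()
print(current)  # [('v2', '2020-03-15', '2023-06-30')]
```

Passing ISO strings explicitly, instead of `date` objects, avoids relying on `sqlite3`'s default date adapters.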

5. Implementation Guide

5.1 Development Environment Setup

mkdir bi-temporal-store && cd bi-temporal-store
python -m venv .venv && source .venv/bin/activate
pip install pydantic sqlalchemy intervaltree

5.2 Project Structure

bi-temporal-store/
├── src/
│   ├── store.py          # BiTemporalFactStore
│   ├── models.py         # Fact, Query models
│   ├── temporal.py       # Allen's algebra
│   ├── indexing.py       # Interval tree index
│   └── queries.py        # Query builder
├── tests/
│   ├── test_temporal.py
│   └── test_queries.py
└── README.md

5.3 Implementation Phases

Phase 1: Basic Bi-Temporal Storage (8-10h)

Goals:

Store facts with valid_time and tx_time
Implement version-creating updates

Tasks:

Design SQLite schema with temporal columns
Implement add_fact with automatic tx_from
Implement update_fact that closes old version, creates new
Implement invalidate that sets tx_to

Checkpoint: Facts can be added and updated with version history.
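
One possible shape for the Phase 1 tasks is sketched below, using `sqlite3` directly on a trimmed version of the schema. To keep it short, it makes several simplifying assumptions: the current version is located by its (subject, predicate, object) triple rather than by a fact id, timestamps are UTC strings in the same format as `CURRENT_TIMESTAMP`, and an optional `now` parameter is added so tests can control transaction time. All function names and the `now` parameter are hypothetical.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE facts (
    id TEXT PRIMARY KEY,
    subject TEXT NOT NULL,
    predicate TEXT NOT NULL,
    object TEXT NOT NULL,
    valid_from TEXT NOT NULL,
    valid_to TEXT,
    tx_from TEXT NOT NULL,
    tx_to TEXT
);
"""
INSERT = ("INSERT INTO facts (id, subject, predicate, object, valid_from,"
          " valid_to, tx_from) VALUES (?, ?, ?, ?, ?, ?, ?)")
IS_CURRENT = "subject = ? AND predicate = ? AND object = ? AND tx_to IS NULL"


def utc_now():
    return datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


def add_fact(conn, subject, predicate, obj, valid_from, valid_to=None, now=None):
    """Insert the first version of a fact; tx_from is set automatically."""
    row_id = str(uuid.uuid4())
    with conn:  # commits on success, rolls back on error
        conn.execute(INSERT, (row_id, subject, predicate, obj,
                              valid_from, valid_to, now or utc_now()))
    return row_id


def update_fact(conn, subject, predicate, obj, valid_to, now=None):
    """Close the current version and insert a new one in one transaction."""
    now = now or utc_now()
    key = (subject, predicate, obj)
    with conn:
        row = conn.execute(
            f"SELECT id, valid_from FROM facts WHERE {IS_CURRENT}", key
        ).fetchone()
        if row is None:
            raise LookupError(f"no current version for {key}")
        old_id, valid_from = row
        conn.execute("UPDATE facts SET tx_to = ? WHERE id = ?", (now, old_id))
        new_id = str(uuid.uuid4())
        conn.execute(INSERT, (new_id, *key, valid_from, valid_to, now))
    return new_id


def invalidate(conn, subject, predicate, obj, now=None):
    """End transaction time for the current version without deleting it."""
    with conn:
        cur = conn.execute(
            f"UPDATE facts SET tx_to = ? WHERE {IS_CURRENT}",
            (now or utc_now(), subject, predicate, obj),
        )
    return cur.rowcount
```

Wrapping the close and the insert in a single `with conn:` block means a failure between the two statements cannot leave the fact without a current version.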

Phase 2: Temporal Queries (8-10h)

Goals:

Query current facts
Query as-of any point in time

Tasks:

Implement query for current facts (tx_to IS NULL)
Implement query_at with both as_of and known_at parameters
Add valid_time_overlaps predicate
Build history retrieval

Checkpoint: Can query “what did we know on date X about Y?”
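
A point-in-time query along both dimensions might look like the sketch below. It is written as a free function over a `sqlite3` connection rather than a store method, and it assumes three conventions that the project leaves open: transaction time is half-open (`tx_from <= known_at < tx_to`), `valid_to` is inclusive, and omitting a parameter means "now" for `known_at` and "any valid time" for `as_of`. Timestamps are passed as strings in the same format as the stored values, since mixing date-only and full-timestamp strings breaks string comparison.

```python
import sqlite3


def query_at(conn, subject, predicate, as_of=None, known_at=None):
    """Return (object, valid_from, valid_to) rows for one subject/predicate.

    as_of    -- valid-time point: keep facts that were true on this date
    known_at -- transaction-time point: use the versions recorded by then
    """
    sql = ["SELECT object, valid_from, valid_to FROM facts",
           "WHERE subject = ? AND predicate = ?"]
    params = [subject, predicate]

    if known_at is None:
        sql.append("AND tx_to IS NULL")  # current knowledge
    else:
        sql.append("AND tx_from <= ? AND (tx_to IS NULL OR tx_to > ?)")
        params += [known_at, known_at]

    if as_of is not None:
        # NULL valid_to means the fact is still ongoing.
        sql.append("AND valid_from <= ? AND (valid_to IS NULL OR valid_to >= ?)")
        params += [as_of, as_of]

    return conn.execute(" ".join(sql), params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (subject TEXT, predicate TEXT, object TEXT,"
    " valid_from TEXT NOT NULL, valid_to TEXT,"
    " tx_from TEXT NOT NULL, tx_to TEXT)"
)
conn.executemany("INSERT INTO facts VALUES (?, ?, ?, ?, ?, ?, ?)", [
    ("Alice", "WORKS_AT", "Acme", "2020-03-15", None,
     "2024-02-01 00:00:00", "2024-04-01 00:00:00"),
    ("Alice", "WORKS_AT", "Acme", "2020-03-15", "2023-06-30",
     "2024-04-01 00:00:00", None),
])

# What did we know on 1 March about Alice's employment?
print(query_at(conn, "Alice", "WORKS_AT", known_at="2024-03-01 00:00:00"))
# [('Acme', '2020-03-15', None)]

# What do we know now about where she worked on 1 January 2024?
print(query_at(conn, "Alice", "WORKS_AT", as_of="2024-01-01"))
# []
```

The second query returns nothing because the corrected version ends her employment on 2023-06-30, while the same `as_of` combined with `known_at="2024-03-01 00:00:00"` would still return the open-ended version.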

Phase 3: Advanced Predicates (6-8h)

Goals:

Full Allen’s interval algebra
Complex temporal logic

Tasks:

Implement all 13 Allen relations
Add compound temporal queries
Optimize with interval tree indexes
Add timeline visualization

Checkpoint: Complex temporal reasoning working.
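
For the timeline visualization task, one simple approach is to map each valid-time interval onto a fixed-width text axis. The sketch below is only a starting point: `render_timeline`, its parameters, and the 12-character label column are all hypothetical choices, and it draws valid time only.

```python
from datetime import date


def render_timeline(label, valid_from, valid_to, axis_start, axis_end, width=40):
    """Draw one fact's valid-time interval as a bar on a text axis.

    valid_to=None (ongoing) is drawn up to the right edge of the axis.
    Intervals entirely outside the axis produce an empty row.
    """
    total_days = (axis_end - axis_start).days
    if total_days <= 0:
        raise ValueError("axis_end must be after axis_start")

    def column(d):
        # Clamp to the axis, then scale to a column index in [0, width - 1].
        d = min(max(d, axis_start), axis_end)
        return round((d - axis_start).days / total_days * (width - 1))

    outside = valid_from > axis_end or (valid_to is not None and valid_to < axis_start)
    if outside:
        bar = ""
    else:
        first = column(valid_from)
        last = column(valid_to if valid_to is not None else axis_end)
        bar = " " * first + "█" * (last - first + 1)
    return f"{label:<12}|{bar:<{width}}|"


axis = (date(2020, 1, 1), date(2025, 1, 1))
print(render_timeline("Acme", date(2020, 3, 15), date(2023, 6, 30), *axis))
print(render_timeline("ongoing", date(2022, 1, 1), None, *axis))
```

Rendering one row per version, with the transaction-time interval in the label, would extend this into a view of both dimensions.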

6. Testing Strategy

6.1 Test Categories

Category	Purpose	Examples
Unit	Test temporal predicates	Interval overlap logic
Integration	Test versioning	Update → query old version
Regression	Test edge cases	NULL end dates, same-day changes

6.2 Critical Test Cases

Version isolation: Update doesn’t affect past queries
Current facts: tx_to IS NULL filter works
Overlapping intervals: Correct interval detection
NULL handling: Ongoing facts (valid_to=NULL) handled correctly
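
The overlap and NULL-handling cases above could be covered by tests along these lines. The helper `intervals_overlap` is a hypothetical stand-in for whatever predicate the store ends up using; it assumes inclusive date intervals and treats a `None` end as ongoing. The tests are written as plain pytest-style functions.

```python
from datetime import date


def intervals_overlap(a_from, a_to, b_from, b_to):
    """True if two inclusive date intervals share at least one day.

    An end of None means the interval is ongoing (open-ended).
    """
    a_end = a_to if a_to is not None else date.max
    b_end = b_to if b_to is not None else date.max
    return a_from <= b_end and b_from <= a_end


def test_ongoing_fact_overlaps_later_window():
    # valid_to=None must not cause the fact to be skipped.
    assert intervals_overlap(date(2020, 3, 15), None,
                             date(2021, 1, 1), date(2021, 12, 31))


def test_shared_boundary_day_counts_as_overlap():
    # Inclusive ends: both intervals contain 2023-06-30.
    assert intervals_overlap(date(2020, 3, 15), date(2023, 6, 30),
                             date(2023, 6, 30), date(2023, 12, 31))


def test_adjacent_days_do_not_overlap():
    assert not intervals_overlap(date(2020, 1, 1), date(2020, 6, 30),
                                 date(2020, 7, 1), date(2020, 12, 31))


def test_two_ongoing_intervals_overlap():
    assert intervals_overlap(date(2020, 1, 1), None, date(2024, 1, 1), None)
```

If the store adopts exclusive ends instead, the boundary test flips, which is why the convention is worth pinning down in a test.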

7. Common Pitfalls & Debugging

Pitfall	Symptom	Solution
Timezone confusion	Facts appear on wrong date	Use UTC everywhere, convert at display
Inclusive vs exclusive	Off-by-one errors	Document and test boundary behavior
NULL comparisons	Queries miss ongoing facts	Handle NULL explicitly in SQL
Version ordering	Wrong version returned	Order by tx_from DESC, take first
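
The NULL comparison pitfall is easy to reproduce. In SQL, comparing anything with NULL yields NULL, which a WHERE clause treats as false, so a plain `valid_to >= ?` silently drops ongoing facts. The sketch below uses a trimmed-down table, and the second employer, Initech, is made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (object TEXT, valid_from TEXT NOT NULL, valid_to TEXT)"
)
conn.executemany("INSERT INTO facts VALUES (?, ?, ?)", [
    ("Acme", "2020-03-15", None),             # ongoing
    ("Initech", "2015-01-01", "2019-12-31"),  # ended before the window
])

window_start, window_end = "2021-01-01", "2021-12-31"

# Naive overlap test: NULL >= '2021-01-01' is NULL, so Acme is filtered out.
naive = conn.execute(
    "SELECT object FROM facts WHERE valid_from <= ? AND valid_to >= ?",
    (window_end, window_start),
).fetchall()

# Explicit NULL handling: an open-ended fact overlaps any later window.
explicit = conn.execute(
    "SELECT object FROM facts"
    " WHERE valid_from <= ? AND (valid_to IS NULL OR valid_to >= ?)",
    (window_end, window_start),
).fetchall()

print(naive)     # []
print(explicit)  # [('Acme',)]
```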

8. Extensions & Challenges

8.1 Beginner Extensions

Add CLI for temporal queries
Add fact timeline ASCII visualization

8.2 Intermediate Extensions

Implement temporal joins (facts valid at same time)
Add coalescing for adjacent intervals
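
Coalescing can be sketched as a single pass over intervals sorted by start date. The version below is an illustration under stated assumptions: intervals are inclusive `(start, end)` date pairs belonging to the same logical fact, `None` as an end means ongoing, and two intervals are adjacent when one starts the day after the other ends. The name `coalesce` is hypothetical.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def coalesce(intervals):
    """Merge inclusive date intervals that overlap or are adjacent.

    Returns a new list sorted by start date. An end of None (ongoing)
    absorbs every interval that starts after it.
    """
    merged = []
    for start, end in sorted(intervals, key=lambda iv: iv[0]):
        if merged:
            last_start, last_end = merged[-1]
            touches = last_end is None or start <= last_end + ONE_DAY
            if touches:
                # Extend the previous interval only if this one reaches further.
                if last_end is not None and (end is None or end > last_end):
                    merged[-1] = (last_start, end)
                continue
        merged.append((start, end))
    return merged


stints = [
    (date(2020, 7, 1), date(2020, 12, 31)),
    (date(2020, 1, 1), date(2020, 6, 30)),   # adjacent to the one above
    (date(2022, 1, 1), None),                # separate, ongoing
]
print(coalesce(stints))
```

In the usage example, the first two stints merge into one interval covering all of 2020, while the ongoing interval stays separate because of the gap in 2021.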

8.3 Advanced Extensions

Integrate with Neo4j temporal properties
Build temporal reasoning rules engine

9. Real-World Connections

9.1 Industry Applications

Financial Systems: Trading records, audit trails
Healthcare: Patient history with corrections
AI Memory: Graphiti’s temporal edge model

9.2 Interview Relevance

Explain bi-temporal vs uni-temporal modeling
Discuss immutability and audit requirements
Describe temporal query optimization strategies

10. Resources

10.1 Essential Reading

“Temporal Data & the Relational Model” by Date, Darwen, Lorentzos
“Designing Data-Intensive Applications” by Kleppmann — Ch. on Change Data Capture
James F. Allen, “Maintaining Knowledge about Temporal Intervals,” Communications of the ACM (1983), the original interval algebra paper

10.2 Related Projects

Previous: Project 4 (Entity Resolution)
Next: Project 6 (Temporal Query Engine)

11. Self-Assessment Checklist

I can explain the difference between valid time and transaction time
I understand why updates create new versions instead of modifying
I can query “what did we know at time X about Y at time Z”
I know Allen’s basic interval relations

12. Submission / Completion Criteria

Minimum Viable Completion:

Facts stored with valid_from, valid_to, tx_from, tx_to
Updates create new versions
Basic query_at working

Full Completion:

All temporal query types implemented
Allen’s interval predicates
Full version history retrieval

Excellence:

Interval tree optimization
Temporal join operations
Integration with graph database
