Project 1: Personal Memory Graph CLI

Build a command-line tool that stores personal facts as nodes and relationships in Neo4j, with CRUD operations and basic Cypher queries—your first hands-on experience with graph data modeling for AI memory.

Quick Reference

Attribute	Value
Difficulty	Level 1: Beginner
Time Estimate	Weekend (8-12 hours)
Language	Python (Alternatives: TypeScript, Go)
Prerequisites	Basic Python, understanding of databases, Docker basics
Key Topics	Graph data modeling, Neo4j, Cypher query language, nodes and relationships, property graphs

1. Learning Objectives

By completing this project, you will:

Understand the fundamental difference between graph databases and relational databases.
Model real-world knowledge as nodes (entities) and relationships (edges) with properties.
Write basic Cypher queries for creating, reading, updating, and deleting graph data.
Design a schema for personal facts that could power an AI assistant’s memory.
Experience the “aha moment” of graph traversal vs. SQL joins.

2. Theoretical Foundation

2.1 Core Concepts

Property Graph Model: Nodes have labels (types) and properties (key-value pairs). Relationships have types and direction, and can also have properties. This is the data model used by Neo4j.
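Index-Free Adjacency: Relational databases resolve connections at query time through foreign keys and joins, typically via index lookups. A native graph database such as Neo4j instead stores direct pointers between connected nodes, so the cost of each hop depends on the number of relationships on the current node rather than on the total size of the graph.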
Cypher Query Language: Neo4j’s declarative query language for pattern matching. The syntax (a)-[r:KNOWS]->(b) reads naturally as “a KNOWS b”.
Labels vs. Properties: Labels categorize nodes (Person, Company, Fact); properties store attributes (name, date, value). Choose labels for filtering large sets; properties for specific attributes.
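
To make the property graph model and index-free adjacency concrete, here is a minimal in-memory sketch in plain Python. It is an illustration only, not how Neo4j stores data internally; the names `Node`, `Rel`, `connect`, and `neighbors` are hypothetical. Each node keeps direct references to its outgoing relationships, so a hop reads only that node's own list.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set
    props: dict
    out: list = field(default_factory=list)  # direct pointers to outgoing relationships


@dataclass
class Rel:
    type: str
    end: Node
    props: dict = field(default_factory=dict)


def connect(start: Node, rel_type: str, end: Node, **props) -> Rel:
    """Attach a directed, typed relationship (with optional properties) to the start node."""
    rel = Rel(rel_type, end, props)
    start.out.append(rel)
    return rel


def neighbors(node: Node, rel_type: str) -> list:
    """One hop: scans only this node's relationships, never the whole graph."""
    return [rel.end for rel in node.out if rel.type == rel_type]


alice = Node({"Person"}, {"name": "Alice"})
bob = Node({"Person"}, {"name": "Bob"})
acme = Node({"Company"}, {"name": "Acme Corp"})
connect(alice, "KNOWS", bob, since="2020")
connect(alice, "WORKS_AT", acme)

print([n.props["name"] for n in neighbors(alice, "KNOWS")])  # ['Bob']
```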

2.2 Why This Matters

Before you can build sophisticated AI memory systems, you need to internalize how graph databases think differently:

Relationships are first-class citizens: In SQL, relationships are implicit (foreign keys). In graphs, relationships have their own identity, type, and properties.
Traversal is cheap: Finding “friends of friends of friends” is a single variable-length pattern such as (a)-[:KNOWS*3]->(b), not a chain of three self-joins.
Schema flexibility: Add new relationship types without migrations.
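
As a sketch of the "traversal is cheap" point, the helper below builds a variable-length Cypher pattern for people reachable within a few KNOWS hops. The function name and the upper limit of 5 hops are assumptions for illustration; the `Entity` label and `name` property follow the data model in section 4.3. The hop bound is written as a literal in the pattern, so it is validated as an integer before being formatted in, while the entity name stays a parameter.

```python
def friends_within_query(max_hops: int) -> str:
    """Build a Cypher query for entities reachable over 1..max_hops KNOWS hops."""
    if not isinstance(max_hops, int) or not 1 <= max_hops <= 5:
        raise ValueError("max_hops must be an integer between 1 and 5")
    return (
        "MATCH (me:Entity {name: $name})-[:KNOWS*1.." + str(max_hops) + "]->(p:Entity) "
        "WHERE p <> me "
        "RETURN DISTINCT p.name AS name"
    )


print(friends_within_query(3))
```

The same query shape covers one hop or three; only the bound changes, whereas the SQL equivalent needs one additional self-join per hop.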

2.3 Common Misconceptions

“Graph databases are just for social networks.” They excel at any connected data: knowledge bases, recommendations, fraud detection, and yes, AI memory.
“You need to know graph theory.” You don’t. The property graph model is intuitive—nodes are things, relationships connect things.
“Cypher is hard to learn.” It’s actually more readable than SQL for relationship queries. MATCH (a)-[:FRIEND]->(b) is clearer than three-way joins.

2.4 ASCII Diagram: Graph vs Relational

RELATIONAL (SQL)                    GRAPH (Neo4j)
================                    ==============

┌─────────────────┐
│ persons         │                      (Alice)
├─────────────────┤                         │
│ id │ name       │                    [:WORKS_AT]
│ 1  │ Alice      │                         │
│ 2  │ Bob        │                         ▼
└─────────────────┘                     (Acme Corp)
                                            ▲
┌─────────────────┐                    [:WORKS_AT]
│ employment      │                         │
├─────────────────┤                         │
│ person_id│org_id│                       (Bob)
│ 1        │ 1    │
│ 2        │ 1    │
└─────────────────┘

Query: "Who works at Acme?"                Query: "Who works at Acme?"
SELECT p.name                              MATCH (p)-[:WORKS_AT]->(c:Company)
FROM persons p                             WHERE c.name = 'Acme Corp'
JOIN employment e ON p.id = e.person_id    RETURN p.name
JOIN companies c ON e.org_id = c.id
WHERE c.name = 'Acme Corp';
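
The relational half of the comparison can be tried without any server. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the `companies` table is implied by the SQL query but not drawn in the diagram, so its columns here are an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE persons    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE companies  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employment (person_id INTEGER, org_id INTEGER);
    INSERT INTO persons    VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO companies  VALUES (1, 'Acme Corp');
    INSERT INTO employment VALUES (1, 1), (2, 1);
    """
)

# Two joins are needed to walk person -> employment -> company.
rows = conn.execute(
    """
    SELECT p.name
    FROM persons p
    JOIN employment e ON p.id = e.person_id
    JOIN companies c ON e.org_id = c.id
    WHERE c.name = ?
    ORDER BY p.name
    """,
    ("Acme Corp",),
).fetchall()

names = [name for (name,) in rows]
print(names)  # ['Alice', 'Bob']
```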

3. Project Specification

3.1 What You Will Build

A command-line tool that lets you:

Add facts about yourself (preferences, relationships, events)
Query facts using natural patterns
Update facts when things change
Delete facts that are no longer relevant
See the graph structure visually (ASCII or Neo4j Browser)

3.2 Functional Requirements

Add a fact: memory add "I prefer Python for scripting"
Add a relationship: memory relate "Alice" "WORKS_WITH" "Bob"
Query by entity: memory query "Alice" → shows all facts about Alice
Query by relationship: memory query --rel WORKS_WITH → all work relationships
Update a fact: memory update <id> "I now prefer Rust for scripting"
Delete a fact: memory delete <id>
Visualize: memory show → ASCII representation of the graph
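
The command surface above could be parsed along these lines. This sketch uses the standard-library argparse so that it runs with no dependencies; the project itself recommends Click, where each subcommand would become a decorated function instead. Argument names such as `rel_type` and `fact_id` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="memory", description="Personal memory graph")
    sub = parser.add_subparsers(dest="command", required=True)

    add = sub.add_parser("add", help="store a fact")
    add.add_argument("text")

    relate = sub.add_parser("relate", help="link two entities")
    relate.add_argument("source")
    relate.add_argument("rel_type")
    relate.add_argument("target")
    relate.add_argument("--since", default=None)

    query = sub.add_parser("query", help="show facts for an entity or relationship type")
    query.add_argument("entity", nargs="?")  # optional so that --rel can be used alone
    query.add_argument("--rel", default=None)

    update = sub.add_parser("update", help="replace the text of a fact")
    update.add_argument("fact_id")
    update.add_argument("text")

    delete = sub.add_parser("delete", help="remove a fact")
    delete.add_argument("fact_id")

    sub.add_parser("show", help="print the graph")
    return parser


args = build_parser().parse_args(["relate", "Alice", "WORKS_WITH", "Bob"])
print(args.command, args.source, args.rel_type, args.target)
```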

3.3 Non-Functional Requirements

Reliability: Handle Neo4j connection failures gracefully
Usability: Clear error messages and help text
Performance: Queries should complete in < 100ms for graphs under 1000 nodes

3.4 Example Usage / Output

$ memory add "I prefer dark mode in all applications"
Created fact: (Preference {value: "dark mode in all applications"})

$ memory relate "Me" "PREFERS" "dark mode" --since "2023-01-01"
Created relationship: (Me)-[:PREFERS {since: 2023-01-01}]->(Preference)

$ memory query "Me"
Entity: Me
├── [:PREFERS] → dark mode (since: 2023-01-01)
├── [:PREFERS] → Python (since: 2020-03-15)
├── [:WORKS_AT] → Acme Corp (since: 2022-06-01)
└── [:KNOWS] → Alice, Bob, Charlie

$ memory show
Graph Visualization (15 nodes, 23 relationships):
   Me ──PREFERS──► dark_mode
    │ ──PREFERS──► Python
    │ ──WORKS_AT─► Acme_Corp
    └──KNOWS─────► Alice ──WORKS_WITH──► Bob
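
One possible way to produce the tree shown for `memory query` is sketched below. It assumes the service layer hands back a list of (relationship type, target, properties) tuples; that shape and the function name are hypothetical.

```python
def format_entity_tree(name: str, edges: list) -> str:
    """Render (rel_type, target, props) tuples as an indented tree."""
    lines = [f"Entity: {name}"]
    for i, (rel_type, target, props) in enumerate(edges):
        branch = "└──" if i == len(edges) - 1 else "├──"  # last row closes the tree
        detail = ", ".join(f"{key}: {value}" for key, value in props.items())
        suffix = f" ({detail})" if detail else ""
        lines.append(f"{branch} [:{rel_type}] → {target}{suffix}")
    return "\n".join(lines)


print(format_entity_tree("Me", [
    ("PREFERS", "dark mode", {"since": "2023-01-01"}),
    ("WORKS_AT", "Acme Corp", {"since": "2022-06-01"}),
    ("KNOWS", "Alice, Bob, Charlie", {}),
]))
```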

4. Solution Architecture

4.1 High-Level Design

┌───────────────┐      commands      ┌──────────────────┐
│   CLI (Click) │───────────────────▶│  Memory Service  │
└───────────────┘                    └────────┬─────────┘
                                              │
                                              │ Cypher
                                              ▼
                                     ┌──────────────────┐
                                     │   Neo4j Driver   │
                                     └────────┬─────────┘
                                              │
                                              ▼
                                     ┌──────────────────┐
                                     │   Neo4j (Docker) │
                                     └──────────────────┘

4.2 Key Components

Component	Responsibility	Key Decisions
CLI Layer	Parse commands, format output	Use Click for argument parsing
Memory Service	Business logic, query building	Keep Cypher queries centralized
Neo4j Driver	Database connection, transactions	Use official neo4j Python driver
Data Model	Define node labels and relationships	Start simple: Entity, Fact, and Preference labels plus a few relationship types

4.3 Data Model

Node Labels:
- Entity: Anything that can have facts (Person, Place, Concept)
- Fact: A piece of information with a value
- Preference: A special type of fact about preferences

Relationship Types:
- HAS_FACT: Entity → Fact
- PREFERS: Entity → Entity/Concept
- KNOWS: Person → Person
- WORKS_AT: Person → Organization
- (custom types as needed)

Properties:
- All nodes: id (UUID), created_at, updated_at
- Entity: name, type
- Fact: value, source, confidence
- Relationships: since, until, source
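
A minimal sketch of how models.py might mirror these properties with dataclasses. The default values ("Concept", "cli", 1.0) and the ISO-8601 string timestamps are assumptions; the field names come from the lists above.

```python
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


def _now() -> str:
    """UTC timestamp as an ISO-8601 string."""
    return datetime.now(timezone.utc).isoformat()


def _new_id() -> str:
    return str(uuid.uuid4())


@dataclass
class Entity:
    name: str
    type: str = "Concept"  # e.g. Person, Place, Concept
    id: str = field(default_factory=_new_id)
    created_at: str = field(default_factory=_now)
    updated_at: str = field(default_factory=_now)


@dataclass
class Fact:
    value: str
    source: str = "cli"
    confidence: float = 1.0
    id: str = field(default_factory=_new_id)
    created_at: str = field(default_factory=_now)
    updated_at: str = field(default_factory=_now)


alice = Entity(name="Alice", type="Person")
print(asdict(alice))
```

The dictionary from `asdict` can be passed as a single Cypher parameter, for example with `CREATE (e:Entity) SET e = $props`.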

4.4 Algorithm Overview

Adding a Fact:

Parse the fact text to identify entity and value
Check if entity already exists (MERGE)
Create the fact node
Create relationship from entity to fact
Return confirmation with IDs
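
The steps for adding a fact can be sketched as a transaction function. The parsing rule here is deliberately naive and an assumption (a leading "I" maps to the entity "Me", otherwise the first word is the entity), and `ADD_FACT`, `parse_fact`, and `add_fact_tx` are hypothetical names. The `tx` argument is expected to behave like the transaction object the official driver passes to `session.execute_write()`.

```python
import uuid

# MERGE finds or creates the entity; CREATE always makes a new fact node.
ADD_FACT = """
MERGE (e:Entity {name: $entity})
  ON CREATE SET e.id = $entity_id, e.created_at = datetime()
CREATE (f:Fact {id: $fact_id, value: $value, created_at: datetime()})
CREATE (e)-[:HAS_FACT]->(f)
RETURN e.id AS entity_id, f.id AS fact_id
"""


def parse_fact(text: str) -> tuple:
    """Step 1 (naive): split the text into an entity name and a value."""
    words = text.strip().split(maxsplit=1)
    if len(words) < 2:
        raise ValueError("a fact needs an entity and a value")
    entity = "Me" if words[0] == "I" else words[0]
    return entity, words[1]


def add_fact_tx(tx, text: str) -> dict:
    """Steps 2-5: one Cypher statement, parameters passed separately."""
    entity, value = parse_fact(text)
    result = tx.run(
        ADD_FACT,
        entity=entity,
        value=value,
        entity_id=str(uuid.uuid4()),
        fact_id=str(uuid.uuid4()),
    )
    return result.single().data()


print(parse_fact("I prefer Python for scripting"))
```

With a real driver this would be called as `session.execute_write(add_fact_tx, "I prefer Python for scripting")`, which commits if the function returns without raising.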

Querying:

Match the starting pattern
Optionally filter by relationship type or properties
Collect connected nodes and relationships
Format as tree or table for display
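
The querying steps might look like the following sketch. The optional relationship filter is passed as a parameter and compared with `type(r)`, so no string formatting is needed. `NEIGHBORS`, `neighbors_tx`, and `group_by_type` are hypothetical names, and `tx` is assumed to be a driver transaction as passed by `session.execute_read()`.

```python
from collections import defaultdict
from typing import Optional

NEIGHBORS = """
MATCH (e:Entity {name: $name})-[r]->(n)
WHERE $rel_type IS NULL OR type(r) = $rel_type
RETURN type(r) AS rel_type,
       coalesce(n.name, n.value) AS target,
       properties(r) AS props
ORDER BY rel_type, target
"""


def neighbors_tx(tx, name: str, rel_type: Optional[str] = None) -> list:
    """Steps 1-3: match, optionally filter, and collect rows as dictionaries."""
    result = tx.run(NEIGHBORS, name=name, rel_type=rel_type)
    return [record.data() for record in result]


def group_by_type(rows: list) -> dict:
    """Step 4 helper: group targets under their relationship type for display."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["rel_type"]].append(row["target"])
    return dict(grouped)


sample = [
    {"rel_type": "KNOWS", "target": "Alice", "props": {}},
    {"rel_type": "KNOWS", "target": "Bob", "props": {}},
    {"rel_type": "WORKS_AT", "target": "Acme Corp", "props": {"since": "2022-06-01"}},
]
print(group_by_type(sample))
```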

5. Implementation Guide

5.1 Development Environment Setup

# Start Neo4j with Docker
docker run -d \
  --name neo4j-memory \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/password \
  neo4j:latest

# Create Python project
mkdir personal-memory-graph && cd personal-memory-graph
python -m venv .venv && source .venv/bin/activate
pip install neo4j click python-dotenv

# Verify connection
python -c "from neo4j import GraphDatabase; d = GraphDatabase.driver('bolt://localhost:7687', auth=('neo4j', 'password')); d.verify_connectivity(); print('Connected!')"

5.2 Project Structure

personal-memory-graph/
├── src/
│   ├── __init__.py
│   ├── cli.py          # Click command definitions
│   ├── service.py      # Memory service business logic
│   ├── driver.py       # Neo4j connection management
│   └── models.py       # Data structures
├── tests/
│   ├── test_service.py
│   └── test_queries.py
├── .env                # NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD
└── README.md
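
A small sketch of how driver.py could read the three settings named in .env. The `Settings` class, `load_settings` function, and the default URI and user are assumptions (the defaults match the Docker command in 5.1). The password has no default so that a missing value fails loudly.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    uri: str
    user: str
    password: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read connection settings from a mapping (the process environment by default)."""
    password = env.get("NEO4J_PASSWORD")
    if not password:
        raise RuntimeError("NEO4J_PASSWORD is not set; add it to .env or the environment")
    return Settings(
        uri=env.get("NEO4J_URI", "bolt://localhost:7687"),
        user=env.get("NEO4J_USER", "neo4j"),
        password=password,
    )


settings = load_settings({"NEO4J_PASSWORD": "password"})
print(settings.uri, settings.user)
```

With python-dotenv installed, calling `load_dotenv()` before `load_settings()` copies the .env values into the environment; the result can then be passed to `GraphDatabase.driver(settings.uri, auth=(settings.user, settings.password))`.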

5.3 Implementation Phases

Phase 1: Connection and Basic CRUD (3-4h)

Goals:

Connect to Neo4j
Create and read single nodes

Tasks:

Set up Neo4j driver with connection pooling
Implement add command for simple facts
Implement query command to retrieve nodes by name
Add basic error handling

Checkpoint: Can add a fact and retrieve it by name.

Phase 2: Relationships and Traversal (3-4h)

Goals:

Create relationships between entities
Traverse the graph

Tasks:

Implement relate command for creating relationships
Enhance query to show connected nodes
Add relationship type filtering
Implement basic visualization

Checkpoint: Can create relationships and see graph structure.
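
One detail of the relate command deserves a sketch. Assuming a Neo4j version in which a relationship type cannot be supplied as a query parameter, the type has to be formatted into the Cypher text, so it should be validated first. The allowed pattern (uppercase letters, digits, underscores) and the function names are assumptions.

```python
import re

_REL_TYPE = re.compile(r"[A-Z][A-Z0-9_]*")


def build_relate_query(rel_type: str) -> str:
    """Return Cypher that links two entities; only the validated type is interpolated."""
    if not _REL_TYPE.fullmatch(rel_type):
        raise ValueError(f"invalid relationship type: {rel_type!r}")
    return (
        "MERGE (a:Entity {name: $source}) "
        "MERGE (b:Entity {name: $target}) "
        f"MERGE (a)-[r:{rel_type}]->(b) "
        "SET r += $props "
        "RETURN type(r) AS rel_type"
    )


def relate_tx(tx, source: str, rel_type: str, target: str, **props):
    """Transaction function: names and properties still travel as parameters."""
    return tx.run(build_relate_query(rel_type), source=source, target=target, props=props)


print(build_relate_query("WORKS_WITH"))
```

Using MERGE for the relationship means that running the same relate command twice does not create a second edge between the same pair.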

Phase 3: Update, Delete, and Polish (2-3h)

Goals:

Complete CRUD operations
Improve UX

Tasks:

Implement update command
Implement delete command (with confirmation)
Add timestamps and metadata
Improve output formatting

Checkpoint: Full CRUD with nice output.
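
Update and delete could be sketched as two more transaction functions plus a confirmation prompt. All names here are hypothetical, and facts are assumed to be looked up by the `id` property from the data model. DETACH DELETE is used because Neo4j refuses to delete a node that still has relationships.

```python
UPDATE_FACT = """
MATCH (f:Fact {id: $id})
SET f.value = $value, f.updated_at = datetime()
RETURN f.id AS id
"""

DELETE_FACT = """
MATCH (f:Fact {id: $id})
DETACH DELETE f
RETURN count(*) AS deleted
"""


def update_fact_tx(tx, fact_id: str, value: str) -> bool:
    """Return True if a fact with this id existed and was updated."""
    record = tx.run(UPDATE_FACT, id=fact_id, value=value).single()
    return record is not None


def delete_fact_tx(tx, fact_id: str) -> int:
    """Return the number of fact nodes removed (0 if the id was unknown)."""
    return tx.run(DELETE_FACT, id=fact_id).single()["deleted"]


def confirm(prompt: str, answer_fn=input) -> bool:
    """Ask before deleting; answer_fn is injectable so tests need no keyboard."""
    return answer_fn(f"{prompt} [y/N] ").strip().lower() in {"y", "yes"}


print(confirm("Delete fact?", answer_fn=lambda _: "n"))
```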

5.4 Key Implementation Decisions

Decision	Options	Recommendation	Rationale
Entity identification	Name-based vs UUID	Both (name for UX, UUID for internal)	Names can change; UUIDs are stable
Relationship direction	Always directed vs bidirectional	Always directed	Neo4j stores every relationship with a direction; a query can still ignore it by omitting the arrow
Schema enforcement	Strict labels vs freeform	Freeform initially	Discover patterns before constraining

6. Testing Strategy

6.1 Test Categories

Category	Purpose	Examples
Unit	Test query building	Cypher generation, input parsing
Integration	Test Neo4j operations	CRUD operations, transactions
E2E	Test CLI commands	Full command execution

6.2 Critical Test Cases

Create node: Verify node exists with correct properties
Create relationship: Verify both nodes and relationship exist
Duplicate handling: MERGE doesn’t create duplicates
Query traversal: Returns all connected nodes within depth
Delete cascade: Deleting a node with DETACH DELETE also removes its relationships
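
For the unit-test category, one approach is a stand-in transaction that records calls, so Cypher generation can be checked without a running database. This is a sketch: `RecordingTx` and `merge_entity_tx` are hypothetical names, and the test is written in the plain-assert style that pytest collects. Whether MERGE really avoids duplicates still has to be verified by an integration test against Neo4j.

```python
class RecordingTx:
    """Stand-in for a driver transaction: records calls instead of running them."""

    def __init__(self):
        self.calls = []

    def run(self, query, **params):
        self.calls.append((query, params))
        return []


def merge_entity_tx(tx, name: str):
    """Code under test (hypothetical): uses MERGE so reruns do not add duplicates."""
    return tx.run("MERGE (e:Entity {name: $name}) RETURN e", name=name)


def test_merge_entity_uses_merge_and_parameters():
    tx = RecordingTx()
    merge_entity_tx(tx, "Alice")
    query, params = tx.calls[0]
    assert query.startswith("MERGE")   # not CREATE
    assert "Alice" not in query        # value is passed as a parameter, not inlined
    assert params == {"name": "Alice"}


test_merge_entity_uses_merge_and_parameters()
print("ok")
```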

7. Common Pitfalls & Debugging

Pitfall	Symptom	Solution
Connection string wrong	“Unable to connect”	Use `bolt://` not `http://`; check port 7687
Authentication failed	“Invalid credentials”	Check NEO4J_AUTH matches driver config
CREATE vs MERGE confusion	Duplicate nodes	Use MERGE for entities; CREATE for unique facts
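Wrong relationship direction in a pattern	Query returns nothing	Match the direction used at creation, or omit the arrow to match either way, e.g. `(a)-[:KNOWS]-(b)`
Explicit transaction never committed	Data disappears	Prefer `session.execute_write()`, which commits when the function returns; if you call `session.begin_transaction()`, call `commit()` yourself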

Debugging Strategies:

Use Neo4j Browser (http://localhost:7474) to visualize graph state
Log Cypher queries before execution
Check EXPLAIN output for query plans
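
The logging and EXPLAIN strategies can be combined in a thin wrapper, sketched below. The logger name "memory.cypher" and both function names are assumptions; `tx` is any object with a driver-style `run(query, **params)` method.

```python
import logging

log = logging.getLogger("memory.cypher")


def run_logged(tx, query: str, **params):
    """Log the Cypher text and parameters at DEBUG level, then execute."""
    log.debug("cypher=%s params=%s", " ".join(query.split()), params)
    return tx.run(query, **params)


def explain(query: str) -> str:
    """Prefix a query with EXPLAIN so Neo4j returns the plan without running it."""
    return "EXPLAIN " + query.strip()


print(explain("MATCH (e:Entity {name: $name}) RETURN e"))
```

Enabling the output is a matter of `logging.basicConfig(level=logging.DEBUG)` at CLI startup. Avoid logging parameters if they may ever contain secrets.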

8. Extensions & Challenges

8.1 Beginner Extensions

Add a memory import command to load facts from a JSON file
Add colored output using rich library
Add --format json flag for machine-readable output

8.2 Intermediate Extensions

Add temporal properties (valid_from, valid_until) to relationships
Implement fuzzy search for entity names
Add graph export to GraphML or JSON format

8.3 Advanced Extensions

Add natural language parsing for fact extraction (basic NLP)
Implement shortest path queries between entities
Add Cypher REPL for direct query execution

9. Real-World Connections

9.1 Industry Applications

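Personal Knowledge Management: Tools like Roam and Obsidian use graph structures
AI Memory Systems: Zep and Mem0 offer graph-based memory; LangGraph models agent workflows as graphs, though it is an orchestration framework rather than a graph database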
Enterprise Knowledge Graphs: Google Knowledge Graph, Amazon Product Graph

9.2 Related Open Source Projects

Neo4j: The graph database you’re using
LangChain Neo4j Integration: Graph-based RAG patterns
Memgraph: Alternative graph database with Python-first approach

9.3 Interview Relevance

Explain when to use graph vs. relational databases
Discuss trade-offs of property graph vs. RDF models
Describe how graph databases enable AI memory

10. Resources

10.1 Essential Reading

“Graph Databases” by Robinson, Webber, Eifrem — Neo4j fundamentals (Ch. 1-3)
Neo4j Cypher Manual — Official query language reference
“Designing Data-Intensive Applications” by Kleppmann — Ch. 2 (Data Models)

10.2 Tools & Documentation

Neo4j Desktop (visualization and development)
Neo4j Browser (web-based query interface)
Cypher Refcard (quick reference)

10.3 Related Projects in This Series

Previous: None (start here)
Next: Project 2 (Conversation Episode Store) — add time-series conversation storage

11. Self-Assessment Checklist

I can explain the difference between graph and relational databases
I can write MERGE, MATCH, and CREATE Cypher queries
I understand when to use labels vs. properties
I can traverse relationships with variable-length paths
I can design a graph schema for a new domain

12. Submission / Completion Criteria

Minimum Viable Completion:

CLI with add, query, and delete commands
Neo4j connection working
Can create entities and relationships

Full Completion:

All CRUD operations working
Relationship traversal in queries
ASCII or formatted visualization
Proper error handling

Excellence (Going Above & Beyond):

Temporal properties on relationships
Import/export functionality
Natural language fact parsing
