Project 1: Personal Memory Graph CLI
Build a command-line tool that stores personal facts as nodes and relationships in Neo4j, with CRUD operations and basic Cypher queries—your first hands-on experience with graph data modeling for AI memory.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 1: Beginner |
| Time Estimate | Weekend (8-12 hours) |
| Language | Python (Alternatives: TypeScript, Go) |
| Prerequisites | Basic Python, understanding of databases, Docker basics |
| Key Topics | Graph data modeling, Neo4j, Cypher query language, nodes and relationships, property graphs |
1. Learning Objectives
By completing this project, you will:
- Understand the fundamental difference between graph databases and relational databases.
- Model real-world knowledge as nodes (entities) and relationships (edges) with properties.
- Write basic Cypher queries for creating, reading, updating, and deleting graph data.
- Design a schema for personal facts that could power an AI assistant’s memory.
- Experience the “aha moment” of graph traversal vs. SQL joins.
2. Theoretical Foundation
2.1 Core Concepts
- Property Graph Model: Nodes have labels (types) and properties (key-value pairs). Relationships have a type and a direction, and can also have properties. This is the data model used by Neo4j.
- Index-Free Adjacency: Relational databases resolve connections through foreign keys and joins, whereas a native graph database such as Neo4j stores direct pointers between connected nodes. The cost of each hop therefore depends on the number of relationships attached to the current node, not on the total size of the graph.
- Cypher Query Language: Neo4j’s declarative query language for pattern matching. The syntax `(a)-[r:KNOWS]->(b)` reads naturally as “a KNOWS b”.
- Labels vs. Properties: Labels categorize nodes (Person, Company, Fact); properties store attributes (name, date, value). Use labels to narrow a query to a category of nodes, and properties for the attributes of an individual node.
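To make the property graph model and index-free adjacency concrete, here is a minimal in-memory sketch in plain Python. It is an illustration only and says nothing about how Neo4j is implemented internally; the `Node` and `Relationship` classes and the `relate` and `neighbors` helpers are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    labels: set
    properties: dict
    # Each node holds direct references to its own outgoing relationships,
    # so following a hop never scans the rest of the graph.
    outgoing: list = field(default_factory=list)


@dataclass(eq=False)
class Relationship:
    type: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


def relate(start, rel_type, end, **properties):
    """Create a directed, typed relationship and attach it to its start node."""
    rel = Relationship(rel_type, start, end, properties)
    start.outgoing.append(rel)
    return rel


def neighbors(node, rel_type):
    """Follow one hop; the work depends on this node's relationships only."""
    return [rel.end for rel in node.outgoing if rel.type == rel_type]


alice = Node({"Person"}, {"name": "Alice"})
bob = Node({"Person"}, {"name": "Bob"})
acme = Node({"Company"}, {"name": "Acme Corp"})
relate(alice, "KNOWS", bob, since="2020-03-15")
relate(alice, "WORKS_AT", acme)

print([n.properties["name"] for n in neighbors(alice, "KNOWS")])  # ['Bob']
```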
2.2 Why This Matters
Before you can build sophisticated AI memory systems, you need to internalize how graph databases think differently:
- Relationships are first-class citizens: In SQL, relationships are implicit (foreign keys). In graphs, relationships have their own identity, type, and properties.
- Traversal is cheap: Finding “friends of friends of friends” is one query, not multiple joins.
- Schema flexibility: Add new relationship types without migrations.
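The "friends of friends of friends" case can be expressed as one variable-length pattern. The sketch below assumes `Person` nodes with a `name` property linked by `KNOWS` relationships; `friends_within_three_hops` is a hypothetical helper, and `tx` is expected to be a transaction object from the official Python driver.

```python
FRIENDS_WITHIN_3 = """
MATCH (me:Person {name: $name})-[:KNOWS*1..3]->(friend:Person)
WHERE friend <> me
RETURN DISTINCT friend.name AS name
ORDER BY name
"""


def friends_within_three_hops(tx, name):
    """Return the names reachable over one to three KNOWS hops."""
    result = tx.run(FRIENDS_WITHIN_3, name=name)
    return [record["name"] for record in result]
```

With driver 5.x this would typically be called as `session.execute_read(friends_within_three_hops, "Alice")`. Changing the depth means editing `*1..3`, not adding another join.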
2.3 Common Misconceptions
- “Graph databases are just for social networks.” They excel at any connected data: knowledge bases, recommendations, fraud detection, and yes, AI memory.
- “You need to know graph theory.” You don’t. The property graph model is intuitive—nodes are things, relationships connect things.
- “Cypher is hard to learn.” It’s actually more readable than SQL for relationship queries. `MATCH (a)-[:FRIEND]->(b)` is clearer than three-way joins.
2.4 ASCII Diagram: Graph vs Relational
RELATIONAL (SQL)
================
┌──────────────────┐
│ persons          │
├──────────────────┤
│ id │ name        │
│ 1  │ Alice       │
│ 2  │ Bob         │
└──────────────────┘
┌──────────────────┐
│ employment       │
├──────────────────┤
│ person_id│org_id │
│ 1        │ 1     │
│ 2        │ 1     │
└──────────────────┘
(companies table with id and name columns not shown)

Query: "Who works at Acme?"
SELECT p.name
FROM persons p
JOIN employment e ON p.id = e.person_id
JOIN companies c ON e.org_id = c.id
WHERE c.name = 'Acme Corp';

GRAPH (Neo4j)
=============
(Alice)
   │
[:WORKS_AT]
   │
   ▼
(Acme Corp)
   ▲
   │
[:WORKS_AT]
   │
(Bob)

Query: "Who works at Acme?"
MATCH (p)-[:WORKS_AT]->(c:Company)
WHERE c.name = 'Acme Corp'
RETURN p.name
3. Project Specification
3.1 What You Will Build
A command-line tool that lets you:
- Add facts about yourself (preferences, relationships, events)
- Query facts using natural patterns
- Update facts when things change
- Delete facts that are no longer relevant
- See the graph structure visually (ASCII or Neo4j Browser)
3.2 Functional Requirements
- Add a fact: `memory add "I prefer Python for scripting"`
- Add a relationship: `memory relate "Alice" "WORKS_WITH" "Bob"`
- Query by entity: `memory query "Alice"` → shows all facts about Alice
- Query by relationship: `memory query --rel WORKS_WITH` → all work relationships
- Update a fact: `memory update <id> "I now prefer Rust for scripting"`
- Delete a fact: `memory delete <id>`
- Visualize: `memory show` → ASCII representation of the graph
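The command surface above could be sketched as follows. This version uses the standard library's `argparse` as a stand-in so it runs without extra packages; the guide itself recommends Click, where each subparser would become a Click command. `build_parser` is a hypothetical name, and the `--since` option is taken from the example usage in section 3.4.

```python
import argparse


def build_parser():
    """Build a parser exposing the subcommands listed above."""
    parser = argparse.ArgumentParser(prog="memory")
    sub = parser.add_subparsers(dest="command", required=True)

    add = sub.add_parser("add", help="store a fact")
    add.add_argument("text")

    relate = sub.add_parser("relate", help="link two entities")
    relate.add_argument("source")
    relate.add_argument("rel_type")
    relate.add_argument("target")
    relate.add_argument("--since")

    query = sub.add_parser("query", help="show facts about an entity")
    query.add_argument("entity", nargs="?")
    query.add_argument("--rel", help="filter by relationship type")

    update = sub.add_parser("update", help="replace a fact's text")
    update.add_argument("id")
    update.add_argument("text")

    delete = sub.add_parser("delete", help="remove a fact")
    delete.add_argument("id")

    sub.add_parser("show", help="print an ASCII view of the graph")
    return parser


args = build_parser().parse_args(["relate", "Alice", "WORKS_WITH", "Bob"])
print(args.command, args.source, args.rel_type, args.target)
# relate Alice WORKS_WITH Bob
```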
3.3 Non-Functional Requirements
- Reliability: Handle Neo4j connection failures gracefully
- Usability: Clear error messages and help text
- Performance: Queries should complete in < 100ms for graphs under 1000 nodes
3.4 Example Usage / Output
$ memory add "I prefer dark mode in all applications"
Created fact: (Preference {value: "dark mode in all applications"})
$ memory relate "Me" "PREFERS" "dark mode" --since "2023-01-01"
Created relationship: (Me)-[:PREFERS {since: 2023-01-01}]->(Preference)
$ memory query "Me"
Entity: Me
├── [:PREFERS] → dark mode (since: 2023-01-01)
├── [:PREFERS] → Python (since: 2020-03-15)
├── [:WORKS_AT] → Acme Corp (since: 2022-06-01)
└── [:KNOWS] → Alice, Bob, Charlie
$ memory show
Graph Visualization (15 nodes, 23 relationships):
Me ──PREFERS──► dark_mode
│ ──PREFERS──► Python
│ ──WORKS_AT─► Acme_Corp
└──KNOWS─────► Alice ──WORKS_WITH──► Bob
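One possible way to produce the `memory query` tree shown above is a small formatter over already-fetched rows. This is a sketch, not a prescribed design; `format_entity_tree` and the `(rel_type, target, props)` row shape are assumptions made for illustration.

```python
def format_entity_tree(name, rows):
    """Render (rel_type, target, props) rows in the tree style of `memory query`."""
    lines = [f"Entity: {name}"]
    for index, (rel_type, target, props) in enumerate(rows):
        # The last row closes the tree with a corner instead of a tee.
        branch = "└──" if index == len(rows) - 1 else "├──"
        line = f"{branch} [:{rel_type}] → {target}"
        if props:
            details = ", ".join(f"{key}: {value}" for key, value in props.items())
            line += f" ({details})"
        lines.append(line)
    return "\n".join(lines)


rows = [
    ("PREFERS", "dark mode", {"since": "2023-01-01"}),
    ("WORKS_AT", "Acme Corp", {"since": "2022-06-01"}),
    ("KNOWS", "Alice", {}),
]
print(format_entity_tree("Me", rows))
```

Keeping formatting separate from querying lets the same rows feed a table or JSON output later.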
4. Solution Architecture
4.1 High-Level Design
┌───────────────┐      commands      ┌──────────────────┐
│  CLI (Click)  │───────────────────▶│  Memory Service  │
└───────────────┘                    └────────┬─────────┘
                                              │
                                              │ Cypher
                                              ▼
                                     ┌──────────────────┐
                                     │   Neo4j Driver   │
                                     └────────┬─────────┘
                                              │
                                              ▼
                                     ┌──────────────────┐
                                     │  Neo4j (Docker)  │
                                     └──────────────────┘
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| CLI Layer | Parse commands, format output | Use Click for argument parsing |
| Memory Service | Business logic, query building | Keep Cypher queries centralized |
| Neo4j Driver | Database connection, transactions | Use official neo4j Python driver |
| Data Model | Define node labels and relationships | Start simple: Entity, Fact, Relationship |
4.3 Data Model
Node Labels:
- Entity: Anything that can have facts (Person, Place, Concept)
- Fact: A piece of information with a value
- Preference: A special type of fact about preferences
Relationship Types:
- HAS_FACT: Entity → Fact
- PREFERS: Entity → Entity/Concept
- KNOWS: Person → Person
- WORKS_AT: Person → Organization
- (custom types as needed)
Properties:
- All nodes: id (UUID), created_at, updated_at
- Entity: name, type
- Fact: value, source, confidence
- Relationships: since, until, source
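The node properties listed above might map onto `models.py` roughly like this. It is a sketch under assumptions: the field types, the optional `source` and `confidence`, and the use of UTC timestamps are choices made here, not requirements stated by the guide.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _now():
    """Timezone-aware timestamp, so stored values are unambiguous."""
    return datetime.now(timezone.utc)


def _new_id():
    """Random UUID rendered as a string, suitable for a node property."""
    return str(uuid.uuid4())


@dataclass
class Entity:
    name: str
    type: str  # e.g. "Person", "Place", "Concept"
    id: str = field(default_factory=_new_id)
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)


@dataclass
class Fact:
    value: str
    source: Optional[str] = None
    confidence: Optional[float] = None
    id: str = field(default_factory=_new_id)
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)


me = Entity(name="Me", type="Person")
fact = Fact(value="I prefer Python for scripting", source="cli", confidence=1.0)
print(me.name, "->", fact.value)
```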
4.4 Algorithm Overview
Adding a Fact:
- Parse the fact text to identify entity and value
- Check if entity already exists (MERGE)
- Create the fact node
- Create relationship from entity to fact
- Return confirmation with IDs
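Steps 2 through 5 above can be sketched as a single transaction function. The parsing step is left out: `entity` and `value` are assumed to arrive already separated. `add_fact` is a hypothetical name, and generating IDs with Cypher's `randomUUID()` is one option among several (they could equally be generated in Python).

```python
ADD_FACT = """
MERGE (e:Entity {name: $entity})
  ON CREATE SET e.id = randomUUID(), e.created_at = datetime()
CREATE (f:Fact {id: randomUUID(), value: $value, created_at: datetime()})
CREATE (e)-[:HAS_FACT]->(f)
RETURN e.id AS entity_id, f.id AS fact_id
"""


def add_fact(tx, entity, value):
    """MERGE the entity, CREATE the fact, link them, and return both IDs."""
    record = tx.run(ADD_FACT, entity=entity, value=value).single()
    return {"entity_id": record["entity_id"], "fact_id": record["fact_id"]}
```

With driver 5.x this would be invoked as `session.execute_write(add_fact, "Me", "prefers dark mode")`, which runs it inside a managed transaction.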
Querying:
- Match the starting pattern
- Optionally filter by relationship type or properties
- Collect connected nodes and relationships
- Format as tree or table for display
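The querying steps could look like the following sketch. It assumes entities are `Entity` nodes identified by `name`, and that a connected node shows either its `name` or its `value`; `query_entity` and the returned row shape are hypothetical.

```python
QUERY_ENTITY = """
MATCH (e:Entity {name: $name})-[r]->(n)
WHERE $rel_type IS NULL OR type(r) = $rel_type
RETURN type(r) AS rel_type,
       coalesce(n.name, n.value) AS target,
       properties(r) AS props
ORDER BY rel_type, target
"""


def query_entity(tx, name, rel_type=None):
    """Return (rel_type, target, props) rows for an entity's outgoing relationships."""
    result = tx.run(QUERY_ENTITY, name=name, rel_type=rel_type)
    return [(row["rel_type"], row["target"], row["props"]) for row in result]
```

Filtering with `type(r) = $rel_type` keeps the relationship type a parameter, so one query string serves both the filtered and unfiltered cases.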
5. Implementation Guide
5.1 Development Environment Setup
# Start Neo4j with Docker
docker run -d \
--name neo4j-memory \
-p 7474:7474 -p 7687:7687 \
-e NEO4J_AUTH=neo4j/password \
neo4j:latest
# Create Python project
mkdir personal-memory-graph && cd personal-memory-graph
python -m venv .venv && source .venv/bin/activate
pip install neo4j click python-dotenv
# Verify connection
python -c "from neo4j import GraphDatabase; d = GraphDatabase.driver('bolt://localhost:7687', auth=('neo4j', 'password')); d.verify_connectivity(); print('Connected!')"
5.2 Project Structure
personal-memory-graph/
├── src/
│ ├── __init__.py
│ ├── cli.py # Click command definitions
│ ├── service.py # Memory service business logic
│ ├── driver.py # Neo4j connection management
│ └── models.py # Data structures
├── tests/
│ ├── test_service.py
│ └── test_queries.py
├── .env # NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD
└── README.md
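A possible shape for `driver.py`, reading the three variables named in `.env`. The defaults mirror the Docker command in section 5.1; `load_config` and `get_driver` are hypothetical names. The `neo4j` import is deferred so the configuration logic can be exercised without the driver installed.

```python
import os

# Fallbacks matching the local Docker setup; override them via the environment.
DEFAULTS = {
    "NEO4J_URI": "bolt://localhost:7687",
    "NEO4J_USER": "neo4j",
    "NEO4J_PASSWORD": "password",
}


def load_config(env=None):
    """Read connection settings, falling back to the local Docker defaults."""
    env = os.environ if env is None else env
    return {key: env.get(key, default) for key, default in DEFAULTS.items()}


def get_driver(config=None):
    """Create a driver and fail fast if Neo4j is unreachable."""
    from neo4j import GraphDatabase  # deferred: only needed when connecting

    config = config or load_config()
    driver = GraphDatabase.driver(
        config["NEO4J_URI"],
        auth=(config["NEO4J_USER"], config["NEO4J_PASSWORD"]),
    )
    driver.verify_connectivity()
    return driver
```

Calling `load_dotenv()` from python-dotenv before `load_config()` would copy the `.env` values into the environment.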
5.3 Implementation Phases
Phase 1: Connection and Basic CRUD (3-4h)
Goals:
- Connect to Neo4j
- Create and read single nodes
Tasks:
- Set up Neo4j driver with connection pooling
- Implement `add` command for simple facts
- Implement `query` command to retrieve nodes by name
- Add basic error handling
Checkpoint: Can add a fact and retrieve it by name.
Phase 2: Relationships and Traversal (3-4h)
Goals:
- Create relationships between entities
- Traverse the graph
Tasks:
- Implement `relate` command for creating relationships
- Enhance `query` to show connected nodes
- Add relationship type filtering
- Implement basic visualization
Checkpoint: Can create relationships and see graph structure.
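One detail worth sketching for `relate`: Cypher has traditionally not accepted a relationship type as a query parameter, so a user-supplied type usually has to be placed in the query text. The sketch below validates the type against a conservative pattern first. `build_relate_query`, `relate`, and the allowed pattern are assumptions, not part of the guide.

```python
import re

# Upper-case letters, digits, and underscores only, e.g. WORKS_WITH.
REL_TYPE_PATTERN = re.compile(r"[A-Z][A-Z0-9_]*")


def build_relate_query(rel_type):
    """Return Cypher for one relationship type, rejecting unsafe type names."""
    if not REL_TYPE_PATTERN.fullmatch(rel_type):
        raise ValueError(f"invalid relationship type: {rel_type!r}")
    # Only the validated type is interpolated; names and properties
    # remain query parameters.
    return (
        "MERGE (a:Entity {name: $source}) "
        "MERGE (b:Entity {name: $target}) "
        f"MERGE (a)-[r:{rel_type}]->(b) "
        "SET r += $props "
        "RETURN type(r) AS rel_type"
    )


def relate(tx, source, rel_type, target, **props):
    """Create or reuse a directed relationship between two named entities."""
    query = build_relate_query(rel_type)
    record = tx.run(query, source=source, target=target, props=props).single()
    return record["rel_type"]
```

A call such as `session.execute_write(relate, "Me", "PREFERS", "dark mode", since="2023-01-01")` would correspond to the `--since` example in section 3.4.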
Phase 3: Update, Delete, and Polish (2-3h)
Goals:
- Complete CRUD operations
- Improve UX
Tasks:
- Implement `update` command
- Implement `delete` command (with confirmation)
- Add timestamps and metadata
- Improve output formatting
Checkpoint: Full CRUD with nice output.
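A minimal sketch of the Phase 3 operations, assuming facts are `Fact` nodes addressed by an `id` property. `update_fact`, `delete_fact`, and `confirm` are hypothetical helpers; the `ask` parameter exists only so the prompt can be replaced in tests.

```python
UPDATE_FACT = """
MATCH (f:Fact {id: $id})
SET f.value = $value, f.updated_at = datetime()
RETURN f.id AS id
"""

DELETE_FACT = """
MATCH (f:Fact {id: $id})
DETACH DELETE f
"""


def update_fact(tx, fact_id, value):
    """Overwrite a fact's value; return False when no fact has that id."""
    return tx.run(UPDATE_FACT, id=fact_id, value=value).single() is not None


def delete_fact(tx, fact_id):
    """Delete a fact and its relationships; return the number of nodes removed."""
    summary = tx.run(DELETE_FACT, id=fact_id).consume()
    return summary.counters.nodes_deleted


def confirm(prompt, ask=input):
    """Ask before a destructive action; only 'y' or 'yes' proceeds."""
    return ask(f"{prompt} [y/N] ").strip().lower() in {"y", "yes"}
```

`DETACH DELETE` removes the node's relationships along with the node, which is what the "delete cascade" test case in section 6.2 checks.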
5.4 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Entity identification | Name-based vs UUID | Both (name for UX, UUID for internal) | Names can change; UUIDs are stable |
| Relationship direction | Always directed vs bidirectional | Always directed | Neo4j stores every relationship with a direction; a query can still match either way by omitting the arrow |
| Schema enforcement | Strict labels vs freeform | Freeform initially | Discover patterns before constraining |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit | Test query building | Cypher generation, input parsing |
| Integration | Test Neo4j operations | CRUD operations, transactions |
| E2E | Test CLI commands | Full command execution |
6.2 Critical Test Cases
- Create node: Verify node exists with correct properties
- Create relationship: Verify both nodes and relationship exist
- Duplicate handling: MERGE doesn’t create duplicates
- Query traversal: Returns all connected nodes within depth
- Delete cascade: Deleting a node also removes its relationships (Neo4j requires DETACH DELETE for a node that still has relationships)
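For the unit-level cases, one option is a recording test double that stands in for the transaction, so query-building code runs without a database. This is a sketch: `RecordingTx` and `merge_entity` are hypothetical, and the check covers only the shape of the query. Confirming that MERGE really avoids duplicates still needs an integration test against a running Neo4j.

```python
class RecordingTx:
    """Test double that records queries instead of talking to Neo4j."""

    def __init__(self, rows=()):
        self.calls = []
        self._rows = list(rows)

    def run(self, query, **params):
        self.calls.append((query, params))
        return list(self._rows)


def merge_entity(tx, name):
    """Hypothetical service function under test."""
    tx.run("MERGE (e:Entity {name: $name}) RETURN e.name AS name", name=name)


def test_merge_entity_uses_merge_and_parameters():
    tx = RecordingTx()
    merge_entity(tx, "Alice")
    merge_entity(tx, "Alice")
    assert len(tx.calls) == 2
    for query, params in tx.calls:
        assert query.startswith("MERGE")    # MERGE rather than CREATE
        assert "Alice" not in query         # the value travels as a parameter
        assert params == {"name": "Alice"}


test_merge_entity_uses_merge_and_parameters()
```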
7. Common Pitfalls & Debugging
| Pitfall | Symptom | Solution |
|---|---|---|
| Connection string wrong | “Unable to connect” | Use bolt:// not http://; check port 7687 |
| Authentication failed | “Invalid credentials” | Check NEO4J_AUTH matches driver config |
| CREATE vs MERGE confusion | Duplicate nodes | Use MERGE for entities; CREATE for unique facts |
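| Wrong relationship direction in a pattern | Query returns nothing | Check that the arrow in MATCH matches how the relationship was created, or omit the arrow to match either direction |
| Transaction not committed | Data disappears | Run writes through session.execute_write() rather than a bare run(); an explicit transaction must be committed before it closes |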
Debugging Strategies:
- Use Neo4j Browser (http://localhost:7474) to visualize graph state
- Log Cypher queries before execution
- Check `EXPLAIN` output for query plans
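Logging queries before execution can be as small as a wrapper around `tx.run`. The sketch below uses the standard `logging` module; `run_logged` and the logger name `memory.cypher` are hypothetical. It logs parameter names only, on the assumption that values in a personal memory graph may be sensitive.

```python
import logging

logger = logging.getLogger("memory.cypher")


def run_logged(tx, query, **params):
    """Log the Cypher text and parameter names, then execute the query."""
    # Collapse whitespace so a multi-line query fits on one log line.
    logger.debug("Cypher: %s | params: %s", " ".join(query.split()), sorted(params))
    return tx.run(query, **params)
```

Enabling it during development is a matter of `logging.basicConfig(level=logging.DEBUG)`.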
8. Extensions & Challenges
8.1 Beginner Extensions
- Add a `memory import` command to load facts from a JSON file
- Add colored output using the `rich` library
- Add a `--format json` flag for machine-readable output
8.2 Intermediate Extensions
- Add temporal properties (valid_from, valid_until) to relationships
- Implement fuzzy search for entity names
- Add graph export to GraphML or JSON format
8.3 Advanced Extensions
- Add natural language parsing for fact extraction (basic NLP)
- Implement shortest path queries between entities
- Add Cypher REPL for direct query execution
9. Real-World Connections
9.1 Industry Applications
- Personal Knowledge Management: Tools like Roam, Obsidian use graph structures
- AI Memory Systems: Zep and Mem0 offer graph-backed memory; LangGraph structures agent workflows as graphs rather than storing memory in a graph database
- Enterprise Knowledge Graphs: Google Knowledge Graph, Amazon Product Graph
9.2 Related Open Source Projects
- Neo4j: The graph database you’re using
- LangChain Neo4j Integration: Graph-based RAG patterns
- Memgraph: Alternative graph database with Python-first approach
9.3 Interview Relevance
- Explain when to use graph vs. relational databases
- Discuss trade-offs of property graph vs. RDF models
- Describe how graph databases enable AI memory
10. Resources
10.1 Essential Reading
- “Graph Databases” by Robinson, Webber, Eifrem — Neo4j fundamentals (Ch. 1-3)
- Neo4j Cypher Manual — Official query language reference
- “Designing Data-Intensive Applications” by Kleppmann — Ch. 2 (Data Models)
10.2 Tools & Documentation
- Neo4j Desktop (visualization and development)
- Neo4j Browser (web-based query interface)
- Cypher Refcard (quick reference)
10.3 Related Projects in This Series
- Previous: None (start here)
- Next: Project 2 (Conversation Episode Store) — add time-series conversation storage
11. Self-Assessment Checklist
- I can explain the difference between graph and relational databases
- I can write MERGE, MATCH, and CREATE Cypher queries
- I understand when to use labels vs. properties
- I can traverse relationships with variable-length paths
- I can design a graph schema for a new domain
12. Submission / Completion Criteria
Minimum Viable Completion:
- CLI with add, query, and delete commands
- Neo4j connection working
- Can create entities and relationships
Full Completion:
- All CRUD operations working
- Relationship traversal in queries
- ASCII or formatted visualization
- Proper error handling
Excellence (Going Above & Beyond):
- Temporal properties on relationships
- Import/export functionality
- Natural language fact parsing