Project 1: In-Memory Cache with LRU Eviction (Mini-Redis)
Build a TCP key-value cache with LRU eviction using a hash table plus doubly linked list.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 1-2 weeks |
| Language | C |
| Prerequisites | C pointers, sockets, memory management |
| Key Topics | hash tables, LRU list, eviction, resizing |
1. Learning Objectives
By completing this project, you will:
- Implement O(1) lookup with a chained hash table.
- Implement O(1) LRU updates and eviction with a doubly linked list.
- Track memory usage and enforce a hard max-memory limit.
- Serve a simple TCP protocol with SET/GET/DEL/STAT commands.
2. Theoretical Foundation
2.1 Core Concepts
- Hash tables: Provide O(1) average lookup; collisions require chaining or open addressing.
- LRU eviction: Requires a data structure that can move elements to the front and evict from the tail in O(1).
- Resizing: Rehashing is expensive; you must decide how and when to resize.
2.2 Why This Matters
Caches like Redis use this exact structure. Understanding the mechanics shows why linked lists are occasionally the best choice.
2.3 Historical Context / Background
LRU caches became standard in systems software because they balance recency and performance with minimal overhead.
2.4 Common Misconceptions
- “A hash table is enough”: It cannot answer “least recently used” without extra structure.
- “Resizing is free”: Rehashing is O(n) and can block clients.
3. Project Specification
3.1 What You Will Build
A TCP server that supports basic cache commands and evicts least-recently-used keys when memory is full.
3.2 Functional Requirements
- Support
SET key value,GET key,DEL key, andSTATScommands. - Enforce a max-memory limit with LRU eviction.
- Track hits, misses, evictions, and load factor.
3.3 Non-Functional Requirements
- Performance: Keep GET/SET near O(1).
- Reliability: Do not exceed max-memory.
- Usability: Clear protocol responses and stats.
3.4 Example Usage / Output
$ ./mini_cache --max-memory 1048576
> SET user:1 Alice
OK
> GET user:1
Alice
> STATS
Total keys: 1
Evictions: 0
3.5 Real World Outcome
You can connect with netcat and see eviction happen once you exceed the limit:
$ ./mini_cache --max-memory 1048576
Server listening on 6379
$ nc localhost 6379
SET user:1 Alice
OK
GET user:1
Alice
STATS
Total keys: 1
Evictions: 0
4. Solution Architecture
4.1 High-Level Design
TCP server -> parse commands -> hash table lookup -> LRU update -> respond
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Hash table | O(1) key lookup | Chaining vs open addressing |
| LRU list | Track recency | Intrusive doubly linked list |
| Eviction | Free old entries | Evict from tail |
| Protocol | Simple text commands | Line-based parsing |
4.3 Data Structures
typedef struct cache_entry {
char *key;
char *value;
size_t value_len;
struct cache_entry *prev, *next; // LRU list
struct cache_entry *hash_next; // Hash chain
} cache_entry_t;
4.4 Algorithm Overview
Key Algorithm: LRU Touch
- Remove entry from its current list position.
- Insert at LRU head.
Complexity Analysis:
- Time: O(1) for GET/SET average
- Space: O(n) entries
5. Implementation Guide
5.1 Development Environment Setup
gcc --version
5.2 Project Structure
project-root/
├── src/
│ ├── server.c
│ ├── cache.c
│ └── hash.c
├── include/
└── README.md
5.3 The Core Question You’re Answering
“Why do caches combine a hash table with a doubly linked list for LRU?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- Hash table chaining
- Doubly linked list operations in O(1)
- Memory accounting per entry
5.5 Questions to Guide Your Design
Before implementing, think through these:
- When should you update LRU order? GET only, or GET and SET?
- What happens if a new value is larger than max-memory?
- How will you resize the hash table without a long pause?
5.6 Thinking Exercise
LRU sequence
Walk through: SET a, SET b, GET a, SET c with capacity 2. Which key is evicted and why?
5.7 The Interview Questions They’ll Ask
Prepare to answer these:
- “Explain why LRU needs a linked list.”
- “What is your complexity for GET/SET and eviction?”
- “How would you make this cache thread-safe?”
5.8 Hints in Layers
Hint 1: Intrusive nodes Store LRU pointers inside the cache entry to avoid separate allocations.
Hint 2: Two-sample CPU If you add stats, sample over time for meaningful metrics.
Hint 3: Handle updates Updating a key should refresh its LRU position.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| LRU cache design | “Designing Data-Intensive Applications” | Ch. 3 |
| Hash tables | “Mastering Algorithms with C” | Ch. 8 |
| Cache locality | “CS:APP” | Ch. 6 |
5.10 Implementation Phases
Phase 1: Foundation (3-4 days)
Goals:
- Hash table with basic insert/lookup.
Tasks:
- Implement hash function.
- Handle collisions with chaining.
Checkpoint: GET returns correct values.
Phase 2: Core Functionality (4-5 days)
Goals:
- LRU list integration and eviction.
Tasks:
- Implement list operations.
- Evict from tail when memory is full.
Checkpoint: Oldest key is removed on overflow.
Phase 3: Polish & Edge Cases (2-3 days)
Goals:
- Add stats and protocol stability.
Tasks:
- Add STATS output.
- Handle malformed commands.
Checkpoint: Server remains stable under fuzzed input.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Eviction policy | LRU vs FIFO | LRU | Better hit rate |
| Hash resizing | immediate vs incremental | immediate first | Simpler to implement |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Protocol | Validate commands | SET/GET/DEL |
| LRU | Validate eviction order | Access patterns |
| Memory | Validate limit | Fill to capacity |
6.2 Critical Test Cases
- GET of missing key returns NOT FOUND.
- Oldest key evicted after capacity reached.
- Memory usage never exceeds max-memory.
6.3 Test Data
Keys: user:1, user:2, user:3
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| LRU not updated | Wrong eviction | Call touch on GET |
| Incorrect memory accounting | Over-limit usage | Include key/value sizes |
| Hash parse errors | Crashes | Validate input lines |
7.2 Debugging Strategies
- Add a
DUMPcommand to print LRU order. - Run with AddressSanitizer for memory issues.
7.3 Performance Traps
Resizing the hash table too often causes pauses; use a sensible load factor.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add TTL expiration.
- Add a simple benchmark mode.
8.2 Intermediate Extensions
- Implement incremental rehashing.
- Add multi-threaded request handling.
8.3 Advanced Extensions
- Add persistence (AOF-style append log).
- Support binary-safe values.
9. Real-World Connections
9.1 Industry Applications
- Cache layers in databases, CDNs, and microservices.
9.2 Related Open Source Projects
- Redis: https://redis.io
- memcached: https://memcached.org
9.3 Interview Relevance
- LRU cache implementation is a classic systems interview topic.
10. Resources
10.1 Essential Reading
- redis.conf and Redis eviction policies docs
- hash functions: FNV-1a reference
10.2 Video Resources
- LRU cache walkthroughs (search “LRU cache C”)
10.3 Tools & Documentation
- nc(1) -
man 1 nc
10.4 Related Projects in This Series
- Log-Structured Key-Value Store: persistence and compaction.
11. Self-Assessment Checklist
11.1 Understanding
- I can explain why LRU needs a linked list.
- I can compute hash table load factor.
- I can explain eviction order.
11.2 Implementation
- GET/SET/DEL commands work.
- Evictions follow LRU.
- Memory limit is enforced.
11.3 Growth
- I can extend the cache with TTL or persistence.
- I can explain tradeoffs in interview settings.
12. Submission / Completion Criteria
Minimum Viable Completion:
- TCP server with GET/SET and LRU eviction.
Full Completion:
- STATS command and memory limit enforcement.
Excellence (Going Above & Beyond):
- Incremental rehashing and persistence.
This guide was generated from SPRINT_4_5_REAL_WORLD_PROJECTS.md. For the complete learning path, see the parent directory.