Project 1: In-Memory Cache with LRU Eviction (Mini-Redis)

Build a TCP key-value cache with LRU eviction using a hash table plus doubly linked list.

Quick Reference

| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 1-2 weeks |
| Language | C |
| Prerequisites | C pointers, sockets, memory management |
| Key Topics | hash tables, LRU list, eviction, resizing |

1. Learning Objectives

By completing this project, you will:

  1. Implement O(1) lookup with a chained hash table.
  2. Implement O(1) LRU updates and eviction with a doubly linked list.
  3. Track memory usage and enforce a hard max-memory limit.
  4. Serve a simple TCP protocol with SET/GET/DEL/STATS commands.

2. Theoretical Foundation

2.1 Core Concepts

  • Hash tables: Provide O(1) average lookup; collisions require chaining or open addressing.
  • LRU eviction: Requires a data structure that can move elements to the front and evict from the tail in O(1).
  • Resizing: Rehashing is expensive; you must decide how and when to resize.

2.2 Why This Matters

Caches like Redis use this exact structure. Understanding the mechanics shows why linked lists are occasionally the best choice.

2.3 Historical Context / Background

LRU caches became standard in systems software because they balance recency and performance with minimal overhead.

2.4 Common Misconceptions

  • “A hash table is enough”: It cannot answer “least recently used” without extra structure.
  • “Resizing is free”: Rehashing is O(n) and can block clients.

3. Project Specification

3.1 What You Will Build

A TCP server that supports basic cache commands and evicts least-recently-used keys when memory is full.

3.2 Functional Requirements

  1. Support SET key value, GET key, DEL key, and STATS commands.
  2. Enforce a max-memory limit with LRU eviction.
  3. Track hits, misses, evictions, and load factor.

3.3 Non-Functional Requirements

  • Performance: Keep GET/SET near O(1).
  • Reliability: Do not exceed max-memory.
  • Usability: Clear protocol responses and stats.

3.4 Example Usage / Output

$ ./mini_cache --max-memory 1048576
> SET user:1 Alice
OK
> GET user:1
Alice
> STATS
Total keys: 1
Evictions: 0

3.5 Real World Outcome

You can connect with netcat and see eviction happen once you exceed the limit:

$ ./mini_cache --max-memory 1048576
Server listening on 6379
$ nc localhost 6379
SET user:1 Alice
OK
GET user:1
Alice
STATS
Total keys: 1
Evictions: 0

4. Solution Architecture

4.1 High-Level Design

TCP server -> parse commands -> hash table lookup -> LRU update -> respond

4.2 Key Components

| Component | Responsibility | Key Decisions |
|---|---|---|
| Hash table | O(1) key lookup | Chaining vs open addressing |
| LRU list | Track recency | Intrusive doubly linked list |
| Eviction | Free old entries | Evict from tail |
| Protocol | Simple text commands | Line-based parsing |

4.3 Data Structures

typedef struct cache_entry {
    char *key;
    char *value;
    size_t value_len;
    struct cache_entry *prev, *next;    // LRU list
    struct cache_entry *hash_next;      // Hash chain
} cache_entry_t;
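The entry struct above covers a single node; a top-level handle is also needed to tie the hash table and the LRU list together. A minimal sketch follows; every field name here is an assumption, not a required layout.

```c
#include <stddef.h>

struct cache_entry;                     /* entry type defined above */

typedef struct cache {
    struct cache_entry **buckets;       /* hash table: array of chain heads */
    size_t nbuckets;                    /* current table size */
    size_t nkeys;                       /* live entries, for load factor */
    struct cache_entry *lru_head;       /* most recently used */
    struct cache_entry *lru_tail;       /* least recently used: evict here */
    size_t mem_used;                    /* bytes charged to entries */
    size_t mem_limit;                   /* --max-memory */
    size_t hits, misses, evictions;     /* STATS counters */
} cache_t;
```

Keeping `lru_head`/`lru_tail` alongside the table is what makes GET both a hash lookup and an O(1) list move.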

4.4 Algorithm Overview

Key Algorithm: LRU Touch

  1. Remove entry from its current list position.
  2. Insert at LRU head.
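The two steps above can be sketched as follows; a stripped-down entry type keeps the example self-contained, and the field names are assumptions.

```c
#include <stddef.h>

typedef struct cache_entry {
    struct cache_entry *prev, *next;   /* LRU links only, for brevity */
} cache_entry_t;

typedef struct {
    cache_entry_t *head, *tail;        /* head = most recent */
} lru_list_t;

/* Unlink e from its current position, then push it to the front.
   Both steps are pointer updates only, so the whole touch is O(1). */
static void lru_touch(lru_list_t *l, cache_entry_t *e)
{
    if (l->head == e)                  /* already most recent */
        return;

    /* 1. Remove from current list position. */
    if (e->prev) e->prev->next = e->next;
    if (e->next) e->next->prev = e->prev;
    if (l->tail == e) l->tail = e->prev;

    /* 2. Insert at LRU head. */
    e->prev = NULL;
    e->next = l->head;
    if (l->head) l->head->prev = e;
    l->head = e;
    if (!l->tail) l->tail = e;
}
```

Because unlinking a freshly zeroed entry is a no-op, the same function can double as the insert path for new entries.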

Complexity Analysis:

  • Time: O(1) for GET/SET average
  • Space: O(n) entries

5. Implementation Guide

5.1 Development Environment Setup

gcc --version

5.2 Project Structure

project-root/
├── src/
│   ├── server.c
│   ├── cache.c
│   └── hash.c
├── include/
└── README.md

5.3 The Core Question You’re Answering

“Why do caches combine a hash table with a doubly linked list for LRU?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. Hash table chaining
  2. Doubly linked list operations in O(1)
  3. Memory accounting per entry
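Concept 3 is worth pinning down before coding. One possible cost formula is sketched below; `ENTRY_OVERHEAD` is an assumed constant, and the real rule is that insert, update, and eviction must all charge and credit the same formula.

```c
#include <stddef.h>
#include <string.h>

#define ENTRY_OVERHEAD 64  /* assumed fixed cost: struct + allocator slack */

/* Approximate bytes charged to one entry: struct overhead plus the
   key (with its NUL terminator) and the raw value bytes. */
static size_t entry_cost(const char *key, size_t value_len)
{
    return ENTRY_OVERHEAD + strlen(key) + 1 + value_len;
}
```

If the charge and credit formulas ever drift apart, `mem_used` will slowly leak in one direction and either blow past the limit or trigger spurious evictions.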

5.5 Questions to Guide Your Design

Before implementing, think through these:

  1. When should you update LRU order? GET only, or GET and SET?
  2. What happens if a new value is larger than max-memory?
  3. How will you resize the hash table without a long pause?

5.6 Thinking Exercise

LRU sequence

Walk through: SET a, SET b, GET a, SET c with capacity 2. Which key is evicted and why?

5.7 The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Explain why LRU needs a linked list.”
  2. “What is your complexity for GET/SET and eviction?”
  3. “How would you make this cache thread-safe?”

5.8 Hints in Layers

Hint 1: Intrusive nodes. Store the LRU pointers inside the cache entry itself to avoid separate node allocations.

Hint 2: Sample stats over time. If you report rates (e.g. hits per second), sample counters over an interval rather than printing raw totals.

Hint 3: Handle updates. Updating an existing key should refresh its LRU position and adjust memory accounting if the value size changed.

5.9 Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| LRU cache design | “Designing Data-Intensive Applications” | Ch. 3 |
| Hash tables | “Mastering Algorithms with C” | Ch. 8 |
| Cache locality | “CS:APP” | Ch. 6 |

5.10 Implementation Phases

Phase 1: Foundation (3-4 days)

Goals:

  • Hash table with basic insert/lookup.

Tasks:

  1. Implement hash function.
  2. Handle collisions with chaining.
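For task 1, FNV-1a (referenced in section 10.1) is a common starting point. A minimal 32-bit version is sketched below; the `bucket_index` helper name is an assumption.

```c
#include <stdint.h>
#include <stddef.h>

/* 32-bit FNV-1a: XOR each byte into the hash, then multiply by the
   FNV prime. Fast and reasonably well distributed for short keys. */
static uint32_t fnv1a(const char *s)
{
    uint32_t h = 2166136261u;          /* FNV offset basis */
    for (; *s; s++) {
        h ^= (uint8_t)*s;
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* Bucket index: reduce the hash modulo the table size. */
static size_t bucket_index(const char *key, size_t nbuckets)
{
    return fnv1a(key) % nbuckets;
}
```

For task 2, each bucket then holds the head of a `hash_next` chain; insert prepends to the chain, and lookup walks it comparing keys with `strcmp`.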

Checkpoint: GET returns correct values.

Phase 2: Core Functionality (4-5 days)

Goals:

  • LRU list integration and eviction.

Tasks:

  1. Implement list operations.
  2. Evict from tail when memory is full.
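Task 2 can be sketched as a loop that pops the tail until the new value fits. This is a simplified model: hash-chain removal and freeing are elided, and the type and function names are illustrative.

```c
#include <stddef.h>

typedef struct entry {
    struct entry *prev, *next;
    size_t cost;                        /* bytes charged to this entry */
} entry_t;

typedef struct {
    entry_t *head, *tail;               /* tail = least recently used */
    size_t mem_used, mem_limit;
    size_t evictions;
} cache_t;

/* Pop the LRU tail and credit its bytes back. A real implementation
   would also remove the victim from its hash chain and free it. */
static void evict_one(cache_t *c)
{
    entry_t *victim = c->tail;
    c->tail = victim->prev;
    if (c->tail) c->tail->next = NULL;
    else c->head = NULL;
    c->mem_used -= victim->cost;
    c->evictions++;
}

/* Evict from the tail until `need` more bytes fit. Returns 0 on
   success, -1 if the value could never fit even in an empty cache. */
static int evict_until_fit(cache_t *c, size_t need)
{
    if (need > c->mem_limit)
        return -1;
    while (c->mem_used + need > c->mem_limit && c->tail)
        evict_one(c);
    return 0;
}
```

The early `need > mem_limit` check answers design question 2 above: a value larger than the whole budget must be rejected, not evicted toward.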

Checkpoint: Oldest key is removed on overflow.

Phase 3: Polish & Edge Cases (2-3 days)

Goals:

  • Add stats and protocol stability.

Tasks:

  1. Add STATS output.
  2. Handle malformed commands.

Checkpoint: Server remains stable under fuzzed input.

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Eviction policy | LRU vs FIFO | LRU | Better hit rate |
| Hash resizing | Immediate vs incremental | Immediate first | Simpler to implement |

6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|---|---|---|
| Protocol | Validate commands | SET/GET/DEL |
| LRU | Validate eviction order | Access patterns |
| Memory | Validate limit | Fill to capacity |

6.2 Critical Test Cases

  1. GET of missing key returns NOT FOUND.
  2. Oldest key evicted after capacity reached.
  3. Memory usage never exceeds max-memory.

6.3 Test Data

Keys: user:1, user:2, user:3

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---|---|---|
| LRU not updated on access | Wrong key evicted | Move the entry to the list head on every GET |
| Incorrect memory accounting | Usage exceeds the limit | Charge key, value, and struct overhead consistently |
| Protocol parse errors | Crashes on malformed input | Validate and bounds-check every input line |

7.2 Debugging Strategies

  • Add a DUMP command to print LRU order.
  • Run with AddressSanitizer for memory issues.

7.3 Performance Traps

Resizing the hash table too often causes pauses; use a sensible load factor.
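One common trigger is a load factor of 0.75, expressed here in integer arithmetic to avoid floating point. The 0.75 threshold and doubling strategy are conventional defaults, not requirements.

```c
#include <stddef.h>

/* Resize when chains are getting long: load factor = nkeys / nbuckets.
   nkeys * 4 >= nbuckets * 3 is the integer form of "load >= 0.75". */
static int should_resize(size_t nkeys, size_t nbuckets)
{
    return nkeys * 4 >= nbuckets * 3;
}
```

Doubling the bucket count on each resize keeps the amortized cost per insert O(1) even though each individual rehash is O(n).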


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add TTL expiration.
  • Add a simple benchmark mode.

8.2 Intermediate Extensions

  • Implement incremental rehashing.
  • Add multi-threaded request handling.

8.3 Advanced Extensions

  • Add persistence (AOF-style append log).
  • Support binary-safe values.

9. Real-World Connections

9.1 Industry Applications

  • Cache layers in databases, CDNs, and microservices.
  • Redis: https://redis.io
  • memcached: https://memcached.org

9.2 Interview Relevance

  • LRU cache implementation is a classic systems interview topic.

10. Resources

10.1 Essential Reading

  • redis.conf and Redis eviction policies docs
  • hash functions: FNV-1a reference

10.2 Video Resources

  • LRU cache walkthroughs (search “LRU cache C”)

10.3 Tools & Documentation

  • nc(1) - man 1 nc

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain why LRU needs a linked list.
  • I can compute hash table load factor.
  • I can explain eviction order.

11.2 Implementation

  • GET/SET/DEL commands work.
  • Evictions follow LRU.
  • Memory limit is enforced.

11.3 Growth

  • I can extend the cache with TTL or persistence.
  • I can explain tradeoffs in interview settings.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • TCP server with GET/SET and LRU eviction.

Full Completion:

  • STATS command and memory limit enforcement.

Excellence (Going Above & Beyond):

  • Incremental rehashing and persistence.

This guide was generated from SPRINT_4_5_REAL_WORLD_PROJECTS.md. For the complete learning path, see the parent directory.