Project 8: ETS Cache and Session Service
An ETS-backed cache with TTL eviction and stats.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 2 |
| Time Estimate | 10-15 hours |
| Main Programming Language | Erlang or Elixir |
| Alternative Programming Languages | Gleam |
| Coolness Level | Level 3 |
| Business Potential | Level 2 |
| Prerequisites | ETS + State |
| Key Topics | ETS, caching, TTL eviction |
1. Learning Objectives
By completing this project, you will:
- Design a BEAM process model that isolates failures.
- Apply OTP supervision strategies to real services.
- Validate correctness with deterministic test scenarios.
2. All Theory Needed (Per-Concept Breakdown)
ETS + State
Fundamentals ETS provides in-memory tables for fast access to shared data. Tables are created and owned by a process and are destroyed when the owner exits. ETS supports different table types (set, ordered_set, bag, duplicate_bag) and is optimized for key-based access. It is not a full database; it is a high-performance in-memory store that must be used with care. Mnesia builds on these concepts to provide transactions and replication, but it adds complexity.
Deep Dive into the concept ETS is central to many BEAM systems because it provides fast shared access without requiring explicit locks. The owner process defines lifecycle, while other processes can read or write depending on access settings. This enables cache-like behavior and registry-like lookups.
The key design challenge is lifecycle management. If the owner process crashes, the table disappears. For ephemeral caches, this might be acceptable. For critical state, you must add persistence or replication. ETS itself does not provide durability.
Querying ETS requires discipline. While match and select operations exist, they can be expensive for large tables because they may require scanning. This means you should design keys and indexes for constant-time lookups whenever possible.
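The contrast can be sketched in a few lines (the table and key names here are illustrative, not part of the project spec): a keyed lookup on a `:set` table is constant-time, while a `select` with a match spec may walk the whole table.

```elixir
# Constant-time lookup by key: a :set table holds at most one object
# per key, so :ets.lookup/2 does not scan.
table = :ets.new(:demo_cache, [:set, :public])
:ets.insert(table, {:user_42, %{name: "Ada"}})
[{:user_42, user}] = :ets.lookup(table, :user_42)

# A select with a match spec may examine every object; fine for small
# tables, expensive for large ones.
spec = [{{:"$1", :"$2"}, [], [:"$1"]}]
keys = :ets.select(table, spec)
```

This is why cache keys should be designed so the hot path is always a direct `lookup`, with `select` reserved for rare maintenance tasks.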
Mnesia adds distributed transactions and replication but comes with operational complexity. It is powerful but not always necessary. For learning projects, ETS plus a journaled log is often sufficient and teaches you the same trade-offs.
The best practice is to wrap ETS behind a GenServer or dedicated API so you control access and prevent uncontrolled writes. This preserves the benefits of isolation while still allowing fast lookups.
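One possible shape for that wrapper, as a minimal sketch (module name, table name, and record layout are assumptions for illustration): writes are serialized through the GenServer that owns the table, while reads hit ETS directly.

```elixir
defmodule Cache do
  use GenServer

  @table :cache_table

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts, name: __MODULE__)
  end

  # Writes go through the server process, so the table owner
  # controls all mutation.
  def put(key, value, ttl_ms) do
    GenServer.call(__MODULE__, {:put, key, value, ttl_ms})
  end

  # Reads bypass the server entirely: callers hit ETS directly,
  # checking the stored expiry so stale entries read as misses.
  def get(key) do
    now = System.monotonic_time(:millisecond)

    case :ets.lookup(@table, key) do
      [{^key, value, expires_at}] when expires_at > now -> {:ok, value}
      _ -> :miss
    end
  end

  @impl true
  def init(_opts) do
    # The GenServer owns the table; if this process crashes, the
    # table (and the cached data) dies with it.
    :ets.new(@table, [:named_table, :set, :public, read_concurrency: true])
    {:ok, %{}}
  end

  @impl true
  def handle_call({:put, key, value, ttl_ms}, _from, state) do
    expires_at = System.monotonic_time(:millisecond) + ttl_ms
    :ets.insert(@table, {key, value, expires_at})
    {:reply, :ok, state}
  end
end
```

Usage: `Cache.start_link()`, then `Cache.put(:a, 1, 60_000)` and `Cache.get(:a)` return `{:ok, 1}` while the entry is fresh and `:miss` afterward.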
How this fits into the project This concept is essential to this project and appears repeatedly in BEAM systems.
Definitions & key terms
- ETS table: an in-memory key-value store created and owned by a single process.
- Owner process: the process that created the table; the table is destroyed when it exits.
- TTL (time to live): how long a cached entry remains valid before it is evicted.
- Eviction: removing expired or excess entries so memory use stays bounded.
Mental model diagram
[Input] -> [Process] -> [State] -> [Output]
How it works (step-by-step, with invariants and failure modes)
- Identify inputs and their constraints.
- Apply the core rules of the concept.
- Validate outputs and error states.
- Invariant: the system preserves isolation and deterministic behavior.
- Failure modes: overload, incorrect state transitions, missing supervision.
Minimal concrete example
Small example flow using messages and state updates.
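A hedged sketch of such a flow, assuming a bare process whose state is a single counter and a two-message protocol (`{:add, n}` and `{:get, caller}` are illustrative names):

```elixir
defmodule Counter do
  # Spawn a process whose entire state is one integer.
  def start(initial \\ 0), do: spawn(fn -> loop(initial) end)

  defp loop(state) do
    receive do
      # Input message -> state update; no shared mutation.
      {:add, n} ->
        loop(state + n)

      # Output: reply to the caller with the current state.
      {:get, caller} ->
        send(caller, {:value, state})
        loop(state)
    end
  end
end

pid = Counter.start()
send(pid, {:add, 5})
send(pid, {:get, self()})

receive do
  {:value, v} -> IO.puts("state = #{v}")  # prints "state = 5"
end
```

Because a process handles its mailbox in order, the `{:add, 5}` is applied before the `{:get, ...}` is answered, which is what makes the flow deterministic.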
Common misconceptions
- Confusing representation with runtime behavior.
- Assuming failures are rare instead of expected.
Check-your-understanding questions
- Explain the concept in your own words.
- Predict the outcome of a simple failure scenario.
- Why is this concept crucial for reliability?
Check-your-understanding answers
- It defines how the system behaves under concurrency.
- The supervisor should recover failed components.
- Without it, failure handling becomes ad hoc and unreliable.
Real-world applications
- High-concurrency services
- Fault-tolerant backends
- Real-time pipelines
Where you’ll apply it
- In this project’s core runtime loop and error handling.
References
- Official OTP and BEAM documentation for this concept.
Key insights This concept is the lever that makes BEAM systems resilient and scalable.
Summary Mastering this concept makes the project predictable and robust.
Homework/Exercises to practice the concept
- Draw a failure path and its recovery.
- Design a message flow for a small subsystem.
Solutions to the homework/exercises
- The failure should trigger a supervisor restart.
- The flow should isolate state and avoid shared mutation.
3. Project Specification
3.1 What You Will Build
Build a focused service with clear inputs, outputs, and failure behavior. It should be observable and testable, and should demonstrate the core BEAM concept for the project.
3.2 Functional Requirements
- Validated Input: Reject malformed or out-of-range values.
- Deterministic Output: Same input always yields the same output.
- Fault Behavior: Defined recovery path when a worker crashes.
3.3 Non-Functional Requirements
- Performance: Must handle expected input rate without unbounded queues.
- Reliability: Must recover from injected failures.
- Usability: Outputs are explicit and reproducible.
3.4 Example Usage / Output
$ run-project --demo
[expected output]
3.5 Data Formats / Schemas / Protocols
- Inputs: CLI arguments or small config file
- Outputs: structured logs + CLI status
3.6 Edge Cases
- Empty input
- Over-limit bursts
- Process crashes mid-operation
3.7 Real World Outcome
3.7.1 How to Run (Copy/Paste)
- Build: mix compile (Elixir) or rebar3 compile (Erlang)
- Run: ./bin/project --demo
3.7.2 Golden Path Demo (Deterministic)
A known input produces a known, testable output.
3.7.3 If CLI: exact terminal transcript
$ ./bin/project --demo
[result line 1]
[result line 2]
4. Solution Architecture
4.1 High-Level Design
[Client] -> [Router] -> [Worker] -> [State]
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Router | Dispatch requests | Choose routing strategy |
| Worker | Handle operations | Isolation and failure handling |
| Storage | Maintain state | ETS vs process state |
4.3 Data Structures (No Full Code)
- Process state maps
- ETS tables for shared data
- Message structs with tagged fields
4.4 Algorithm Overview
Key Algorithm: Core Service Loop
- Parse input into a message.
- Dispatch to target process.
- Update state and emit output.
Complexity Analysis:
- Time: O(n) in number of messages
- Space: O(k) in number of active keys/processes
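The three steps of the core loop can be sketched as pure functions, which keeps them trivially testable (the module name, command grammar, and map-based state are assumptions for illustration, not the required design):

```elixir
defmodule ServiceLoop do
  # Step 1: parse raw input into a tagged message; reject
  # malformed input before it reaches any stateful process.
  def parse("put " <> rest) do
    case String.split(rest, " ", parts: 2) do
      [key, value] -> {:ok, {:put, key, value}}
      _ -> {:error, :malformed}
    end
  end

  def parse("get " <> key), do: {:ok, {:get, key}}
  def parse(_), do: {:error, :malformed}

  # Steps 2-3: dispatch the message against the current state and
  # return {output, new_state}; O(1) work per message.
  def dispatch({:put, key, value}, state), do: {:ok, Map.put(state, key, value)}
  def dispatch({:get, key}, state), do: {Map.get(state, key, :miss), state}
end
```

In the real service the state map would live inside a worker process (or ETS), but separating parse and dispatch like this is what makes the O(n)-in-messages bound easy to reason about.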
5. Implementation Guide
5.1 Development Environment Setup
# Use standard OTP tooling (mix or rebar3)
5.2 Project Structure
project-root/
├── lib/
│ ├── router.ex
│ ├── worker.ex
│ └── storage.ex
├── test/
│ └── project_test.exs
└── README.md
5.3 The Core Question You’re Answering
“How do I make this service recover automatically without global failure?”
5.4 Concepts You Must Understand First
- Review the concepts above and ensure you can explain them clearly.
5.5 Questions to Guide Your Design
- How will you partition work across processes?
- Where is state stored and how is it protected?
- What happens when a worker crashes?
5.6 Thinking Exercise
Sketch the message flow and failure paths before coding.
5.7 The Interview Questions They’ll Ask
- “Why does this design scale better than shared-memory locks?”
- “How do you detect and recover from failures?”
- “How do you prevent mailbox buildup?”
5.8 Hints in Layers
Hint 1: Start with one worker process and a single message type.
Hint 2: Add supervision and confirm restart behavior.
Hint 3: Add routing and concurrency only after correctness.
Hint 4: Validate output with scripted test vectors.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Core BEAM | “Programming Erlang” | Processes chapter |
| OTP | “Designing for Scalability with Erlang/OTP” | Supervision chapter |
5.10 Implementation Phases
Phase 1: Foundation (2-4 hours)
- Build a single worker process
- Define message shapes
Phase 2: Core Functionality (4-8 hours)
- Add routing and state storage
- Validate correctness on test cases
Phase 3: Polish & Edge Cases (2-4 hours)
- Add failure injection tests
- Document recovery behavior
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| State location | process vs ETS | ETS | fast shared reads; this project is built around ETS |
| Supervision | one_for_one vs one_for_all | one_for_one | isolate failures |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Validate core logic | message parsing |
| Integration Tests | End-to-end flow | demo scenario |
| Failure Tests | Crash recovery | kill worker |
6.2 Critical Test Cases
- Normal path: request -> response
- Crash path: worker exit -> restart
- Overload: burst input -> bounded queue
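The crash path can be exercised directly against a one_for_one supervisor. A minimal sketch, assuming a worker registered under its module name (the module here is illustrative):

```elixir
defmodule CrashyWorker do
  use GenServer
  def start_link(_), do: GenServer.start_link(__MODULE__, nil, name: __MODULE__)
  @impl true
  def init(_), do: {:ok, nil}
end

# one_for_one: only the crashed child is restarted.
children = [%{id: CrashyWorker, start: {CrashyWorker, :start_link, [nil]}}]
{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)

old_pid = Process.whereis(CrashyWorker)
Process.exit(old_pid, :kill)

# Give the supervisor a moment to restart the child, then confirm a
# fresh process has taken over the registered name.
Process.sleep(50)
new_pid = Process.whereis(CrashyWorker)
true = is_pid(new_pid) and new_pid != old_pid
```

The same pattern generalizes: kill the worker mid-operation, then assert both that the name is re-registered and that any ETS-backed state behaves as documented (survives or is rebuilt, depending on the design).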
6.3 Test Data
inputs: demo messages
expected: stable outputs
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Missing supervisor | app dies on crash | add supervisor |
| Oversized mailbox | latency spikes | split processes |
| Unbounded state | memory growth | add eviction |
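For the unbounded-state pitfall, a periodic sweep is one common eviction shape. A minimal sketch, assuming entries stored as `{key, value, expires_at}` tuples (the interval and module name are illustrative):

```elixir
defmodule Sweeper do
  use GenServer

  @interval_ms 1_000

  def start_link(table), do: GenServer.start_link(__MODULE__, table)

  @impl true
  def init(table) do
    schedule_sweep()
    {:ok, table}
  end

  @impl true
  def handle_info(:sweep, table) do
    now = System.monotonic_time(:millisecond)
    # Delete every {key, value, expires_at} whose deadline has passed.
    spec = [{{:_, :_, :"$1"}, [{:<, :"$1", now}], [true]}]
    :ets.select_delete(table, spec)
    schedule_sweep()
    {:noreply, table}
  end

  defp schedule_sweep, do: Process.send_after(self(), :sweep, @interval_ms)
end
```

`:ets.select_delete/2` removes matching objects in a single pass, so memory stays bounded without reads ever blocking on the sweeper.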
7.2 Debugging Strategies
- Use observer to inspect process counts and queues
- Add structured logs on message receive
7.3 Performance Traps
- Avoid heavy work in a single process
8. Extensions & Challenges
8.1 Beginner Extensions
- Add structured logging
- Add basic metrics
8.2 Intermediate Extensions
- Add distribution across two nodes
- Add persistence layer
8.3 Advanced Extensions
- Add rolling upgrades
- Add multi-region replication
9. Real-World Connections
9.1 Industry Applications
- Chat, presence, and realtime monitoring systems
9.2 Related Open Source Projects
- Phoenix Channels
- GenStage-based pipelines
9.3 Interview Relevance
- OTP supervision and process model questions
10. Resources
10.1 Essential Reading
- “Programming Erlang” by Joe Armstrong
- “Designing for Scalability with Erlang/OTP” by Cesarini/Thompson
10.2 Video Resources
- Conference talks on OTP supervision and BEAM concurrency