Learning Time Series Databases: A Project-Based Journey

Introduction

Time Series Databases (TSDBs) are specialized storage systems optimized for time-stamped data—metrics, logs, IoT sensor readings, financial ticks, and monitoring telemetry. Unlike traditional databases that optimize for random access and updates, TSDBs are built around a fundamental insight: time is the primary query dimension, and data is almost always appended, never updated.

To truly understand TSDBs, you need to grapple with their core challenges:

  1. Write-Heavy Workloads: Millions of data points per second, append-only
  2. Time-Range Queries: “Give me all CPU metrics between 2pm and 3pm”
  3. Compression: Adjacent values are often similar—exploit this
  4. High Cardinality: Millions of unique metric/tag combinations
  5. Retention & Downsampling: Old data is less valuable, can be aggregated
  6. Efficient Aggregations: Sum, avg, percentiles over massive datasets

This guide provides 15 projects that progressively build your understanding from basic concepts to implementing a full-featured TSDB.


Core Concept Analysis

What Makes Time Series Data Special?

Traditional DB Row:  | user_id | name    | email           | created_at |
Time Series Point:   | timestamp | metric_name | tags{} | value |

Key characteristics:

  • Immutable writes: Data points are rarely updated after insertion
  • Sequential access: Queries almost always involve time ranges
  • High write throughput: Systems may ingest millions of points/second
  • Predictable patterns: Values often change slowly between adjacent timestamps

The TSDB Stack

┌─────────────────────────────────────────────┐
│              Query Engine                    │
│         (PromQL, InfluxQL, SQL)             │
├─────────────────────────────────────────────┤
│              Index Layer                     │
│     (Tag/Label → Series ID mapping)         │
├─────────────────────────────────────────────┤
│           Compression Layer                  │
│    (Gorilla, Delta, Run-Length Encoding)    │
├─────────────────────────────────────────────┤
│            Storage Engine                    │
│      (TSM Tree, Blocks, LSM variants)       │
├─────────────────────────────────────────────┤
│         Write-Ahead Log (WAL)               │
│           (Durability layer)                │
└─────────────────────────────────────────────┘

Project 1: In-Memory Time Series Store

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Go, Zig
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
  • Difficulty: Level 2: Intermediate (The Developer)
  • Knowledge Area: Data Structures / Time Series Fundamentals
  • Software or Tool: Custom Time Series Store
  • Main Book: “C Interfaces and Implementations” by David R. Hanson

What you’ll build

An in-memory time series store that can ingest data points, store them efficiently by series, and answer basic time-range queries.

Why it teaches Time Series Databases

This is your foundation. You’ll understand the fundamental data model: series identification (metric name + tags), timestamps, and values. You’ll see why organizing data by series (not by insertion order) is crucial for query performance.

Core challenges you’ll face

  • Series identification (how do you uniquely identify “cpu.usage{host=server1,region=us-east}”?) → maps to tag indexing
  • Efficient time-range lookups (binary search vs. linear scan) → maps to time-based indexing
  • Memory layout decisions (array of structs vs. struct of arrays) → maps to columnar storage
  • Handling out-of-order writes (sensor data arrives late) → maps to write buffering

Key Concepts

  • Time Series Data Model: “Designing Data-Intensive Applications” Chapter 3 - Martin Kleppmann
  • Hash Tables for Series Lookup: “Algorithms” Chapter 3.4 - Robert Sedgewick
  • Binary Search for Time Ranges: “C Programming: A Modern Approach” Chapter 17 - K.N. King
  • Memory Layout: “Computer Systems: A Programmer’s Perspective” Chapter 6 - Bryant & O’Hallaron

Difficulty

Beginner-Intermediate

Time estimate

1 week

Prerequisites

Basic C, understanding of arrays and hash tables

Real world outcome

$ ./tsdb-mem
> INSERT cpu.usage host=server1 region=us-east 1703001600 0.75
OK
> INSERT cpu.usage host=server1 region=us-east 1703001610 0.82
OK
> INSERT cpu.usage host=server2 region=us-west 1703001600 0.45
OK
> QUERY cpu.usage{host=server1} FROM 1703001600 TO 1703001620
[1703001600, 0.75]
[1703001610, 0.82]
2 points returned in 0.001ms

Implementation Hints

The core data structure is a hash map where:

  • Key: A “series key” combining metric name and sorted tags (e.g., cpu.usage|host=server1|region=us-east)
  • Value: A dynamic array of (timestamp, value) pairs, kept sorted by timestamp

Pseudo-code structure:

STRUCT DataPoint:
    timestamp: int64
    value: float64

STRUCT Series:
    key: string
    points: DynamicArray<DataPoint>

STRUCT TimeSeriesStore:
    series_map: HashMap<string, Series>

FUNCTION insert(metric, tags, timestamp, value):
    series_key = build_series_key(metric, tags)
    IF series_key NOT IN series_map:
        series_map[series_key] = new Series(series_key)
    series = series_map[series_key]
    insert_sorted(series.points, DataPoint(timestamp, value))

FUNCTION query(metric, tags, start_time, end_time):
    series_key = build_series_key(metric, tags)
    series = series_map[series_key]
    start_idx = binary_search_lower_bound(series.points, start_time)
    end_idx = binary_search_upper_bound(series.points, end_time)
    RETURN series.points[start_idx:end_idx]

Critical insight: Building the series key requires sorting the tags so that {host=a,region=b} and {region=b,host=a} produce the same key.
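
A minimal Go sketch of this canonicalization, using the separator from the example key above (the function name is illustrative):

package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildSeriesKey produces a canonical key: metric name followed by
// tags sorted by tag key, so the insertion order of tags does not matter.
func buildSeriesKey(metric string, tags map[string]string) string {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	b.WriteString(metric)
	for _, k := range keys {
		b.WriteString("|")
		b.WriteString(k)
		b.WriteString("=")
		b.WriteString(tags[k])
	}
	return b.String()
}

func main() {
	a := buildSeriesKey("cpu.usage", map[string]string{"host": "server1", "region": "us-east"})
	b := buildSeriesKey("cpu.usage", map[string]string{"region": "us-east", "host": "server1"})
	fmt.Println(a == b) // true: both produce cpu.usage|host=server1|region=us-east
}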

Learning milestones

  1. Basic insert/query works → You understand the series concept
  2. Binary search time-range queries are fast → You understand why time ordering matters
  3. Handle 1M+ points without slowdown → You understand memory layout implications

Project 2: Delta Encoding Compressor

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Go, Python
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
  • Difficulty: Level 2: Intermediate (The Developer)
  • Knowledge Area: Compression / Data Encoding
  • Software or Tool: Compression Library
  • Main Book: “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron

What you’ll build

A compression library that implements delta encoding for timestamps and values, reducing time series data by 60-80%.

Why it teaches Time Series Databases

Compression is the defining characteristic of TSDBs. Without it, storing billions of data points would be prohibitively expensive. You’ll understand why time series data compresses so well: adjacent values are similar.

Core challenges you’ll face

  • Delta encoding for timestamps (store differences, not absolutes) → maps to temporal locality
  • Delta-of-delta encoding (when timestamps are evenly spaced) → maps to pattern exploitation
  • Variable-length integer encoding (small deltas need fewer bytes) → maps to bit-level efficiency
  • Handling edge cases (large jumps, negative deltas) → maps to robustness

Key Concepts

  • Delta Encoding Fundamentals: Delta Encoding Tutorial - smarx.com
  • Variable-Length Integers: “Computer Systems: A Programmer’s Perspective” Chapter 2 - Bryant & O’Hallaron
  • Bit Manipulation: “The Art of Computer Programming Vol 4A” Section 7.1.3 - Knuth
  • Compression Theory: Time Series Compression Algorithms Explained - TigerData

Difficulty

Intermediate

Time estimate

1 week

Prerequisites

Bit manipulation, understanding of binary representation

Real world outcome

$ ./delta-compress input.bin output.compressed
Original size:     1,000,000 bytes (125,000 data points)
Compressed size:      234,567 bytes
Compression ratio: 4.26x
Timestamps: avg delta = 10s, avg bits/delta = 4.2
Values: avg delta = 0.003, avg bits/delta = 12.1

$ ./delta-decompress output.compressed recovered.bin
Decompressed 125,000 points
Verification: PASSED (bit-exact match)

Implementation Hints

Timestamp delta encoding:

ORIGINAL TIMESTAMPS: [1000, 1010, 1020, 1030, 1040]
DELTA ENCODED:       [1000,   10,   10,   10,   10]  # First value + deltas

Delta-of-delta (for regular intervals):

DELTAS:              [10, 10, 10, 10]
DELTA-OF-DELTA:      [10,  0,  0,  0]  # First delta + differences

Variable-length integer encoding (Simple8b or varint):

  • If delta fits in 7 bits: use 1 byte
  • If delta fits in 14 bits: use 2 bytes
  • And so on…

Pseudo-code:

FUNCTION encode_timestamps(timestamps):
    output = BitStream()
    output.write_int64(timestamps[0])  # First timestamp in full

    prev_delta = 0
    FOR i FROM 1 TO len(timestamps):
        delta = timestamps[i] - timestamps[i-1]
        delta_of_delta = delta - prev_delta
        output.write_varint(zigzag_encode(delta_of_delta))
        prev_delta = delta

    RETURN output

FUNCTION zigzag_encode(n):
    # Maps negative numbers to positive: 0,-1,1,-2,2 → 0,1,2,3,4
    IF n >= 0: RETURN 2 * n
    ELSE: RETURN 2 * abs(n) - 1

Critical insight: Zig-zag encoding is essential for handling negative deltas efficiently in variable-length encoding.
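
A compact Go sketch of zig-zag plus varint encoding using only the standard library (the delta values are illustrative):

package main

import (
	"encoding/binary"
	"fmt"
)

// zigzag maps signed deltas to unsigned values so small magnitudes,
// positive or negative, encode to few bytes: 0,-1,1,-2,2 -> 0,1,2,3,4.
func zigzag(n int64) uint64 {
	return uint64((n << 1) ^ (n >> 63))
}

func unzigzag(u uint64) int64 {
	return int64(u>>1) ^ -int64(u&1)
}

func main() {
	deltas := []int64{10, 0, 0, -3, 250}
	buf := make([]byte, binary.MaxVarintLen64)
	for _, d := range deltas {
		n := binary.PutUvarint(buf, zigzag(d))
		fmt.Printf("delta %4d -> %d byte(s): %x\n", d, n, buf[:n])
	}
	// Round-trip one value to confirm the encoding is lossless.
	n := binary.PutUvarint(buf, zigzag(-3))
	u, _ := binary.Uvarint(buf[:n])
	fmt.Println(unzigzag(u)) // -3
}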

Learning milestones

  1. Basic delta encoding works → You understand the core concept
  2. Variable-length integers reduce size further → You understand bit-level optimization
  3. Round-trip (encode → decode) is bit-exact → You’ve built production-quality code

Project 3: Gorilla Compression (Facebook Paper)

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Go, Java
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold” (Educational/Personal Brand)
  • Difficulty: Level 3: Advanced (The Engineer)
  • Knowledge Area: Compression / Floating Point / Bit Manipulation
  • Software or Tool: Gorilla Compression Library
  • Main Book: “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron

What you’ll build

A complete implementation of Facebook’s Gorilla compression algorithm, achieving ~12x compression on real-world metrics data.

Why it teaches Time Series Databases

This is the industry-standard compression for floating-point time series. By implementing it, you’ll understand why 51% of values compress to a single bit and how to exploit the structure of IEEE 754 floating-point numbers.

Core challenges you’ll face

  • XOR-based floating point compression (adjacent floats have similar bit patterns) → maps to floating point internals
  • Leading/trailing zero counting (efficiently encode the XOR result) → maps to bit manipulation mastery
  • Block-based encoding (headers, data layout) → maps to format design
  • Streaming decode (decompress without loading entire block) → maps to memory efficiency

Key Concepts

  • IEEE 754 Floating Point: “Computer Systems: A Programmer’s Perspective” Chapter 2.4 - Bryant & O’Hallaron
  • XOR Properties: “The Art of Computer Programming Vol 4A” - Knuth
  • Bit Stream Implementation: “Practical Binary Analysis” Chapter 2 - Dennis Andriesse
  • Compression Block Design: Gorilla Paper Section 4.1

Difficulty

Advanced

Time estimate

2-3 weeks

Prerequisites

Strong bit manipulation, understanding of IEEE 754 floating point

Real world outcome

$ ./gorilla-compress metrics.raw metrics.gorilla
Input:  8,000,000 bytes (500,000 timestamp-value pairs)
Output:    685,000 bytes
Compression ratio: 11.68x

Breakdown:
  Timestamps: 1.03 bytes/point (delta-of-delta encoding)
  Values:     0.34 bytes/point (XOR encoding)
    - 51.2% compressed to 1 bit (identical to previous)
    - 29.8% compressed with '10' prefix (avg 26.6 bits)
    - 19.0% compressed with '11' prefix (avg 36.2 bits)

$ ./gorilla-decompress metrics.gorilla recovered.raw
Decompressed 500,000 points in 12.3ms
Verification: PASSED

Implementation Hints

The XOR insight: When two floating-point values are similar, their XOR has many leading and trailing zeros.

Value 1:     0x400921FB54442D18 (3.141592653589793)
Value 2:     0x400921FB54442D19 (the next representable double)
XOR result:  0x0000000000000001 (only 1 bit differs!)

Encoding the XOR result:

FUNCTION encode_value(current, previous):
    xor = current XOR previous

    IF xor == 0:
        write_bit(0)  # Values identical - just 1 bit!
        RETURN

    write_bit(1)  # Values differ

    leading_zeros = count_leading_zeros(xor)
    trailing_zeros = count_trailing_zeros(xor)
    meaningful_bits = 64 - leading_zeros - trailing_zeros

    IF leading_zeros >= prev_leading AND trailing_zeros >= prev_trailing:
        # Fits in previous window
        write_bit(0)
        write_bits(xor >> prev_trailing, prev_meaningful_bits)
    ELSE:
        # New window
        write_bit(1)
        write_bits(leading_zeros, 5)    # 5 bits for leading zeros (0-31)
        write_bits(meaningful_bits, 6)  # 6 bits for length (0-63)
        write_bits(xor >> trailing_zeros, meaningful_bits)
        # Remember the new window for subsequent values
        prev_leading = leading_zeros
        prev_trailing = trailing_zeros
        prev_meaningful_bits = meaningful_bits

Block structure:

┌──────────────────────────────────────────┐
│ Header: timestamp_start (64 bits)        │
├──────────────────────────────────────────┤
│ First value (64 bits, uncompressed)      │
├──────────────────────────────────────────┤
│ Compressed (timestamp, value) pairs...   │
│ Each pair: delta-of-delta + XOR encoded  │
└──────────────────────────────────────────┘

Critical insight: The algorithm exploits two properties: (1) timestamps arrive at regular intervals, (2) metric values change slowly. This is why it works so well for monitoring data.
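
A small Go sketch of the XOR step on real doubles, using the standard library to count leading and trailing zeros (the two sample values are illustrative):

package main

import (
	"fmt"
	"math"
	"math/bits"
)

func main() {
	prev := 3.14159
	curr := 3.14160

	// Reinterpret the doubles as raw 64-bit patterns and XOR them.
	x := math.Float64bits(curr) ^ math.Float64bits(prev)

	lead := bits.LeadingZeros64(x)
	trail := bits.TrailingZeros64(x)
	meaningful := 64 - lead - trail

	fmt.Printf("xor = %016x\n", x)
	fmt.Printf("leading zeros = %d, trailing zeros = %d, meaningful bits = %d\n",
		lead, trail, meaningful)
	// Similar values leave only a narrow band of meaningful bits,
	// which is what the '10'/'11' control bits encode.
}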

Learning milestones

  1. XOR encoding produces small bit counts → You understand floating point similarity
  2. Block compression achieves 10x+ ratio → You’ve matched Facebook’s results
  3. Streaming decompression works → You understand how to read bit-aligned data

Project 4: Write-Ahead Log (WAL) for Time Series

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Go, Zig
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced (The Engineer)
  • Knowledge Area: Durability / Storage Systems
  • Software or Tool: WAL Implementation
  • Main Book: “Operating Systems: Three Easy Pieces” by Remzi H. Arpaci-Dusseau

What you’ll build

A write-ahead log that ensures durability for time series writes, with crash recovery and segment rotation.

Why it teaches Time Series Databases

Every production TSDB uses a WAL. You’ll understand the fundamental trade-off: write to a sequential log first (fast), then organize data later (async). This is why TSDBs can handle millions of writes per second.

Core challenges you’ll face

  • Sequential write optimization (why appending is fast) → maps to disk I/O patterns
  • Segment rotation (WAL files can’t grow forever) → maps to file management
  • Crash recovery (replay WAL to rebuild state) → maps to durability guarantees
  • Checkpointing (truncate WAL after data is persisted) → maps to space reclamation
  • CRC checksums (detect corruption) → maps to data integrity

Key Concepts

  • Write-Ahead Logging: “Operating Systems: Three Easy Pieces” Chapter 42 - Arpaci-Dusseau
  • fsync and Durability: “The Linux Programming Interface” Chapter 13 - Michael Kerrisk
  • CRC Checksums: “Computer Networks” Chapter 3.2 - Tanenbaum
  • Segment Files: InfluxDB WAL Documentation

Difficulty

Advanced

Time estimate

2 weeks

Prerequisites

File I/O, understanding of durability concepts

Real world outcome

$ ./wal-demo
Starting WAL demo...

[Write Phase]
Wrote 100,000 data points to WAL
WAL segments: wal-00001.log (16MB), wal-00002.log (16MB)
Write throughput: 245,000 points/second

[Simulated Crash]
Killing process without clean shutdown...

[Recovery Phase]
$ ./wal-demo --recover
Scanning WAL segments...
  wal-00001.log: 65,432 valid entries, 0 corrupt
  wal-00002.log: 34,568 valid entries, 0 corrupt
Recovered 100,000 data points in 234ms
In-memory state rebuilt successfully!

[Checkpoint Phase]
Checkpointing WAL after persistent storage write...
Truncated wal-00001.log (data now in TSM files)

Implementation Hints

WAL entry format:

┌─────────────────────────────────────────────┐
│ Length (4 bytes)                            │
├─────────────────────────────────────────────┤
│ CRC32 checksum (4 bytes)                    │
├─────────────────────────────────────────────┤
│ Entry type (1 byte): WRITE | DELETE | ...   │
├─────────────────────────────────────────────┤
│ Payload (variable length)                   │
│   - Series key                              │
│   - Timestamp                               │
│   - Value                                   │
└─────────────────────────────────────────────┘

Segment rotation pseudo-code:

STRUCT WAL:
    current_segment: File
    segment_number: int
    segment_size: int
    max_segment_size: int = 16MB

FUNCTION write_entry(entry):
    serialized = serialize(entry)
    crc = crc32(serialized)

    IF segment_size + len(serialized) > max_segment_size:
        rotate_segment()

    write(current_segment, length || crc || serialized)
    fsync(current_segment)  # Critical for durability!
    segment_size += len(serialized)

FUNCTION rotate_segment():
    close(current_segment)
    segment_number += 1
    current_segment = open("wal-{segment_number}.log", CREATE)
    segment_size = 0

Crash recovery:

FUNCTION recover():
    state = empty HashMap

    FOR EACH segment IN sorted(glob("wal-*.log")):
        FOR EACH entry IN read_entries(segment):
            IF verify_crc(entry):
                apply_to_state(state, entry)
            ELSE:
                LOG("Corrupt entry detected, stopping replay")
                BREAK

    RETURN state

Critical insight: The WAL must call fsync() to guarantee durability. Without it, data can be lost on crash even if write() returned successfully.
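
A minimal Go sketch of the durable append, using a simplified version of the entry format above (length and CRC only; the file name is illustrative):

package main

import (
	"encoding/binary"
	"hash/crc32"
	"os"
)

// appendEntry writes one length-prefixed, checksummed record and forces
// it to stable storage before returning.
func appendEntry(f *os.File, payload []byte) error {
	var hdr [8]byte
	binary.LittleEndian.PutUint32(hdr[0:4], uint32(len(payload)))
	binary.LittleEndian.PutUint32(hdr[4:8], crc32.ChecksumIEEE(payload))

	if _, err := f.Write(hdr[:]); err != nil {
		return err
	}
	if _, err := f.Write(payload); err != nil {
		return err
	}
	// Without Sync (fsync), the OS may still lose this entry on a crash
	// even though Write returned successfully.
	return f.Sync()
}

func main() {
	f, err := os.OpenFile("wal-00001.log", os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o644)
	if err != nil {
		panic(err)
	}
	defer f.Close()

	if err := appendEntry(f, []byte("cpu.usage|host=server1 1703001600 0.75")); err != nil {
		panic(err)
	}
}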

Learning milestones

  1. Sequential writes hit 200K+ points/second → You understand why WAL is fast
  2. Recovery rebuilds state after kill -9 → You understand durability
  3. Segment rotation prevents unbounded growth → You understand operational concerns

Project 5: Inverted Index for Time Series Tags

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Go, Java
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced (The Engineer)
  • Knowledge Area: Indexing / Search
  • Software or Tool: Tag Index
  • Main Book: “Algorithms” by Robert Sedgewick

What you’ll build

An inverted index that maps tag values to series IDs, enabling fast queries like “find all series where region=us-east”.

Why it teaches Time Series Databases

Without an index, finding series matching a tag filter requires scanning all series—impossibly slow at scale. You’ll understand why TSDBs need two types of indexes: one for time (within a series) and one for tags (across series).

Core challenges you’ll face

  • Inverted index construction (tag value → set of series IDs) → maps to search engine fundamentals
  • Set intersection (combine multiple tag filters efficiently) → maps to query optimization
  • Bitmap vs. sorted array (space/time trade-offs) → maps to data structure selection
  • Index persistence (store index to disk) → maps to serialization
  • Index updates (new series arrive continuously) → maps to incremental indexing

Key Concepts

Difficulty

Advanced

Time estimate

2 weeks

Prerequisites

Hash tables, set operations, basic understanding of search

Real world outcome

$ ./tag-index-demo
Building index from 1,000,000 series...

Index stats:
  Unique tag keys: 15
  Unique tag values: 45,230
  Index size: 12.4 MB (bitmaps)

Query: region=us-east
  Matching series: 125,432
  Query time: 0.8ms

Query: region=us-east AND env=production
  Matching series: 41,234
  Query time: 1.2ms (intersection of 2 postings lists)

Query: region=us-east AND env=production AND service=api
  Matching series: 8,432
  Query time: 1.5ms (intersection of 3 postings lists)

Throughput: 650 queries/second

Implementation Hints

Index structure:

STRUCT TagIndex:
    # Maps "label_name=label_value" → set of series IDs
    postings: HashMap<string, RoaringBitmap>

    # Maps series ID → series key (for result retrieval)
    series_by_id: HashMap<int, string>

    next_series_id: int

FUNCTION add_series(series_key, tags):
    series_id = next_series_id++
    series_by_id[series_id] = series_key

    FOR EACH (key, value) IN tags:
        posting_key = "{key}={value}"
        IF posting_key NOT IN postings:
            postings[posting_key] = new RoaringBitmap()
        postings[posting_key].add(series_id)

Query execution:

FUNCTION query(filters):
    # filters is like [("region", "us-east"), ("env", "production")]

    IF len(filters) == 0:
        RETURN all_series_ids()

    # Start with smallest postings list (optimization)
    sorted_filters = sort_by_postings_size(filters)

    result = postings["{sorted_filters[0].key}={sorted_filters[0].value}"]

    FOR EACH filter IN sorted_filters[1:]:
        posting_key = "{filter.key}={filter.value}"
        result = result.intersection(postings[posting_key])

        IF result.is_empty():
            RETURN empty  # Early termination

    RETURN result

Critical insight: Roaring Bitmaps are essential for production TSDBs. They compress well and support fast set operations. For this project, you can start with sorted arrays and implement intersection via merge-join, then upgrade to Roaring Bitmaps.
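
A minimal Go sketch of the sorted-array starting point, intersecting two postings lists with a merge-join (the series IDs are illustrative):

package main

import "fmt"

// intersect merges two sorted postings lists of series IDs.
// Roaring Bitmaps do the same job with better compression at scale.
func intersect(a, b []uint32) []uint32 {
	out := make([]uint32, 0)
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++
		case a[i] > b[j]:
			j++
		default:
			out = append(out, a[i])
			i++
			j++
		}
	}
	return out
}

func main() {
	usEast := []uint32{1, 4, 7, 9, 12}      // series with region=us-east
	production := []uint32{2, 4, 9, 13, 20} // series with env=production
	fmt.Println(intersect(usEast, production)) // [4 9]
}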

Learning milestones

  1. Simple tag queries work → You understand inverted indexes
  2. Multi-tag intersection is fast → You understand query optimization
  3. Index scales to 1M+ series → You understand the cardinality challenge

Project 6: Block-Based Storage Engine (Prometheus-Style)

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: Go
  • Alternative Programming Languages: Rust, C, C++
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 4: Expert (The Systems Architect)
  • Knowledge Area: Storage Engines / Database Internals
  • Software or Tool: Block Storage Engine
  • Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann

What you’ll build

A Prometheus-style storage engine with a write-ahead head block, immutable on-disk blocks, and compaction.

Why it teaches Time Series Databases

Prometheus’s TSDB is elegantly simple yet production-ready. You’ll understand the hybrid architecture: in-memory head block for recent data, immutable blocks for historical data, with compaction merging blocks over time.

Core challenges you’ll face

  • Head block management (in-memory buffer with WAL backing) → maps to write path optimization
  • Block creation (flush head to immutable on-disk block) → maps to data lifecycle
  • Chunk encoding (compress samples within chunks) → maps to storage efficiency
  • Index file format (series → chunk references) → maps to read path optimization
  • Compaction (merge adjacent blocks) → maps to storage maintenance

Key Concepts

  • Block Structure: “Designing Data-Intensive Applications” Chapter 3 - Martin Kleppmann
  • Compaction Strategies: Prometheus Storage Architecture - Palark
  • Memory-Mapped Files: “The Linux Programming Interface” Chapter 49 - Michael Kerrisk
  • Chunk Format: Prometheus source code (pkg/chunkenc)

Difficulty

Expert

Time estimate

3-4 weeks

Prerequisites

WAL implementation (Project 4), compression (Projects 2-3), strong systems programming

Real world outcome

$ ./prometheus-style-tsdb
Starting TSDB with 2h block range...

[Ingestion - 1 hour]
Ingested 5,000,000 samples across 10,000 series
Head block: 180MB in-memory, WAL: 45MB on disk

[Block Creation - at 2h mark]
Flushing head block to disk...
Created block: 01HQKV3X5M2N4P6R8S0T
  Time range: 1703001600 - 1703008800
  Samples: 5,000,000
  Size on disk: 23MB (7.8x compression)
  Chunks: 10,000
  Index size: 1.2MB

[Query]
> http_requests_total{method="GET"}[5m]
Scanned: 1 block, 45 chunks
Returned: 30 samples
Query time: 4.2ms

[Compaction - later]
Compacting blocks: 01HQKV3X5M..., 01HQKV7Y9N...
Created merged block: 01HQKZ2A8B
  Time range: 1703001600 - 1703016000 (4 hours)
  Reduced from 2 blocks (46MB) to 1 block (42MB)

Implementation Hints

Block directory structure:

data/
├── wal/
│   ├── 00000001
│   └── 00000002
├── 01HQKV3X5M2N4P6R8S0T/    # Block 1
│   ├── meta.json
│   ├── index
│   ├── chunks/
│   │   ├── 000001
│   │   └── 000002
│   └── tombstones
└── 01HQKV7Y9N3O5Q7R9S1U/    # Block 2
    └── ...

meta.json:

{
    "ulid": "01HQKV3X5M2N4P6R8S0T",
    "minTime": 1703001600000,
    "maxTime": 1703008800000,
    "stats": {
        "numSamples": 5000000,
        "numSeries": 10000,
        "numChunks": 10000
    },
    "compaction": {
        "level": 1,
        "sources": []
    }
}

Head block pseudo-code:

STRUCT HeadBlock:
    series: HashMap<SeriesID, MemSeries>
    wal: WAL
    min_time: int64
    max_time: int64

STRUCT MemSeries:
    labels: LabelSet
    chunks: List<MemChunk>

STRUCT MemChunk:
    samples: List<(timestamp, value)>
    min_time: int64
    max_time: int64

FUNCTION append(series_id, timestamp, value):
    wal.write(series_id, timestamp, value)

    series = get_or_create_series(series_id)
    chunk = series.head_chunk()

    IF chunk.is_full():
        chunk = series.create_new_chunk()

    chunk.append(timestamp, value)
    update_time_bounds(timestamp)

Block creation (compaction from head):

FUNCTION create_block(head, block_dir):
    block_id = generate_ulid()

    # Create chunks file
    chunks_file = open("{block_dir}/chunks/000001")
    chunk_refs = []

    FOR EACH series IN head.series:
        FOR EACH chunk IN series.chunks:
            compressed = gorilla_compress(chunk)
            offset = chunks_file.write(compressed)
            chunk_refs.append(ChunkRef(series.id, chunk.min_time, offset))

    # Create index file
    write_index(block_dir, head.series, chunk_refs)

    # Write metadata
    write_meta_json(block_dir, block_id, time_range, stats)

Critical insight: The block structure separates chunks (actual data) from index (series → chunk mapping). This allows efficient queries: first find relevant chunks via index, then only read those chunks.
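
A small Go sketch of using meta.json for block selection (the struct mirrors the meta.json shown above; the path is the example block directory):

package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// blockMeta mirrors the meta.json above; only the fields needed for
// block selection are included in this sketch.
type blockMeta struct {
	ULID    string `json:"ulid"`
	MinTime int64  `json:"minTime"`
	MaxTime int64  `json:"maxTime"`
}

// overlaps reports whether the block can contain samples for [start, end).
func (m blockMeta) overlaps(start, end int64) bool {
	return m.MinTime < end && m.MaxTime >= start
}

func main() {
	raw, err := os.ReadFile("data/01HQKV3X5M2N4P6R8S0T/meta.json")
	if err != nil {
		panic(err)
	}
	var m blockMeta
	if err := json.Unmarshal(raw, &m); err != nil {
		panic(err)
	}
	// Only blocks whose time range overlaps the query need their
	// index and chunks opened at all.
	fmt.Println(m.ULID, m.overlaps(1703001600000, 1703005200000))
}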

Learning milestones

  1. Head block accepts writes → You understand the write path
  2. Blocks persist to disk → You understand block creation
  3. Queries span head + blocks → You understand the read path
  4. Compaction merges blocks → You understand storage maintenance

Project 7: TSM Tree Storage Engine (InfluxDB-Style)

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: Go
  • Alternative Programming Languages: Rust, C++
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 4: Expert (The Systems Architect)
  • Knowledge Area: LSM Trees / Storage Engines
  • Software or Tool: TSM Storage Engine
  • Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann

What you’ll build

A Time Structured Merge Tree (TSM) storage engine, inspired by InfluxDB’s architecture—a specialized LSM tree for time series.

Why it teaches Time Series Databases

TSM is the production storage engine behind InfluxDB. You’ll understand how LSM trees adapt to time series workloads: write-optimized ingestion, background compaction, and columnar storage for efficient compression.

Core challenges you’ll face

  • Cache + WAL architecture (in-memory cache backed by WAL) → maps to write optimization
  • TSM file format (series-organized, compressed blocks) → maps to read optimization
  • Compaction levels (merge small files into larger ones) → maps to LSM fundamentals
  • Index within TSM files (find data by series key + time range) → maps to efficient lookups
  • Cache eviction (flush to TSM when memory limit reached) → maps to resource management

Key Concepts

  • LSM Tree Fundamentals: “Designing Data-Intensive Applications” Chapter 3 - Martin Kleppmann
  • Compaction Strategies: LSM Compaction Mechanism - Alibaba Cloud
  • Write Amplification: “Designing Data-Intensive Applications” Chapter 3 - Martin Kleppmann
  • Bloom Filters: “Algorithms” Chapter 5 - Sedgewick

Difficulty

Expert

Time estimate

4-5 weeks

Prerequisites

Project 4 (WAL), Project 3 (Gorilla compression), understanding of LSM trees

Real world outcome

$ ./tsm-engine
TSM Storage Engine initialized
Cache size limit: 256MB
TSM file target size: 256MB

[Write Phase]
Writing 10,000,000 points across 50,000 series...
Cache usage: 245MB
WAL segments: 3

[Flush Phase - cache full]
Flushing cache to TSM file...
Created: data/000001.tsm (98MB compressed)
  Series: 50,000
  Points: 10,000,000
  Compression ratio: 2.5x

[Compaction Phase]
Level 1 compaction: 000001.tsm + 000002.tsm + 000003.tsm + 000004.tsm
  → 000005.tsm (Level 2, 320MB)
  Write amplification avoided: 3.2x

[Query Phase]
Query: SELECT * FROM cpu WHERE host='server1' AND time > now() - 1h
Files scanned: 1 TSM (using index)
Points returned: 3,600
Query time: 2.1ms

Implementation Hints

TSM file structure:

┌─────────────────────────────────────────────────────┐
│                    TSM File                         │
├─────────────────────────────────────────────────────┤
│  Block 1: Series A, timestamps 1000-2000            │
│    - Compressed timestamps (delta-of-delta)         │
│    - Compressed values (Gorilla XOR)                │
├─────────────────────────────────────────────────────┤
│  Block 2: Series A, timestamps 2001-3000            │
├─────────────────────────────────────────────────────┤
│  Block 3: Series B, timestamps 1000-2000            │
├─────────────────────────────────────────────────────┤
│  ... more blocks ...                                │
├─────────────────────────────────────────────────────┤
│  Index:                                             │
│    Series A → [Block 1 offset, Block 2 offset]      │
│    Series B → [Block 3 offset]                      │
├─────────────────────────────────────────────────────┤
│  Footer: Index offset, magic number                 │
└─────────────────────────────────────────────────────┘

Cache structure:

STRUCT Cache:
    entries: HashMap<SeriesKey, SortedList<(timestamp, value)>>
    size_bytes: int
    max_size_bytes: int

FUNCTION write(series_key, timestamp, value):
    wal.append(series_key, timestamp, value)

    IF series_key NOT IN entries:
        entries[series_key] = new SortedList()

    entries[series_key].insert(timestamp, value)
    size_bytes += POINT_SIZE

    IF size_bytes >= max_size_bytes:
        flush_to_tsm()

Compaction levels:

Level 1: Small files from cache flush (~50MB each)
Level 2: Merged from 4 Level-1 files (~200MB)
Level 3: Merged from 4 Level-2 files (~800MB)
Level 4: Merged from 4 Level-3 files (~3GB)

Compaction trigger: When 4+ files exist at a level

Critical insight: Unlike Prometheus blocks (time-partitioned), TSM files contain data from all time ranges for the series they cover. This means queries must check the index of every TSM file, which is why bloom filters become important at scale.
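
A minimal Go sketch of a Bloom filter for skipping TSM files whose index cannot contain a series key (bit-set size, probe count, and hashing scheme are illustrative):

package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a minimal Bloom filter: k hash probes into a bit set.
// Production engines size m and k from the expected series count
// and a target false-positive rate.
type bloom struct {
	bits []uint64
	m    uint64 // number of bits
	k    uint64 // number of probes
}

func newBloom(m, k uint64) *bloom {
	return &bloom{bits: make([]uint64, (m+63)/64), m: m, k: k}
}

func (b *bloom) positions(key string) []uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	h1 := h.Sum64()
	h2 := h1>>33 | h1<<31 // derive a second hash by rotation
	pos := make([]uint64, b.k)
	for i := uint64(0); i < b.k; i++ {
		pos[i] = (h1 + i*h2) % b.m
	}
	return pos
}

func (b *bloom) Add(key string) {
	for _, p := range b.positions(key) {
		b.bits[p/64] |= 1 << (p % 64)
	}
}

// MightContain returns false only if the key is definitely absent,
// letting a query skip that TSM file's index entirely.
func (b *bloom) MightContain(key string) bool {
	for _, p := range b.positions(key) {
		if b.bits[p/64]&(1<<(p%64)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := newBloom(1<<16, 4)
	f.Add("cpu.usage|host=server1")
	fmt.Println(f.MightContain("cpu.usage|host=server1")) // true
	fmt.Println(f.MightContain("cpu.usage|host=server9")) // almost certainly false
}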

Learning milestones

  1. Cache flushes produce valid TSM files → You understand the write path
  2. Queries use TSM index correctly → You understand the read path
  3. Compaction reduces file count → You understand LSM maintenance
  4. Benchmark shows 100K+ writes/sec → You’ve built production-quality code

Project 8: Time-Based Sharding System

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: Go
  • Alternative Programming Languages: Rust, Python, Java
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced (The Engineer)
  • Knowledge Area: Distributed Systems / Partitioning
  • Software or Tool: Time-Based Shard Manager
  • Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann

What you’ll build

A sharding layer that partitions time series data by time windows (hour, day, week), enabling efficient retention policies and parallel queries.

Why it teaches Time Series Databases

Time-based sharding is fundamental to TSDB scalability. You’ll understand why: (1) old data can be deleted by dropping entire shards, (2) queries touching specific time ranges only need specific shards, (3) compaction happens per-shard independently.

Core challenges you’ll face

  • Shard routing (which shard handles a given timestamp?) → maps to partitioning logic
  • Cross-shard queries (query spans multiple time windows) → maps to query fan-out
  • Shard lifecycle (create, compact, delete) → maps to data lifecycle management
  • Retention enforcement (drop shards older than X days) → maps to operational automation
  • Hot/cold tiering (recent shards in fast storage, old in slow) → maps to storage economics

Key Concepts

  • Time-Based Partitioning: “Designing Data-Intensive Applications” Chapter 6 - Martin Kleppmann
  • Retention Policies: InfluxDB Retention Documentation
  • Shard Groups: InfluxDB Shard Groups
  • Query Planning: “Designing Data-Intensive Applications” Chapter 6 - Martin Kleppmann

Difficulty

Advanced

Time estimate

2 weeks

Prerequisites

Basic understanding of distributed systems, file I/O

Real world outcome

$ ./shard-manager status
Shard Configuration:
  Shard duration: 1 day
  Retention: 7 days

Active Shards:
  shard_2024-01-15  [HOT]   size: 2.3GB  series: 50,000
  shard_2024-01-14  [WARM]  size: 4.1GB  series: 50,000
  shard_2024-01-13  [WARM]  size: 4.0GB  series: 50,000
  shard_2024-01-12  [COLD]  size: 3.8GB  series: 50,000
  ...
  shard_2024-01-08  [COLD]  size: 3.5GB  series: 50,000

$ ./shard-manager query "SELECT * FROM cpu WHERE time > '2024-01-14'"
Query plan:
  Shards to scan: shard_2024-01-15, shard_2024-01-14
  Parallel execution: 2 workers
  Estimated points: 8,640,000

Results: 8,523,412 points in 1.2s

$ ./shard-manager enforce-retention
Retention policy: 7 days
Shards to delete: shard_2024-01-07
Deleted 3.2GB, recovered disk space

Implementation Hints

Shard structure:

STRUCT ShardManager:
    shard_duration: Duration  # e.g., 1 day
    retention_period: Duration  # e.g., 7 days
    shards: HashMap<ShardID, Shard>

STRUCT Shard:
    id: string  # e.g., "shard_2024-01-15"
    start_time: Timestamp
    end_time: Timestamp
    storage: StorageEngine  # Could be TSM, Block-based, etc.
    state: HOT | WARM | COLD

FUNCTION get_shard_for_timestamp(timestamp):
    shard_start = truncate_to_boundary(timestamp, shard_duration)
    shard_id = format_shard_id(shard_start)

    IF shard_id NOT IN shards:
        shards[shard_id] = create_shard(shard_id, shard_start)

    RETURN shards[shard_id]

FUNCTION truncate_to_boundary(timestamp, duration):
    # For daily shards: truncate to midnight
    # For hourly shards: truncate to hour start
    RETURN timestamp - (timestamp % duration)

Cross-shard query:

FUNCTION query(start_time, end_time, filters):
    relevant_shards = []

    FOR EACH shard IN shards:
        IF shard.end_time >= start_time AND shard.start_time <= end_time:
            relevant_shards.append(shard)

    # Execute in parallel
    results = parallel_map(relevant_shards,
        lambda shard: shard.query(start_time, end_time, filters))

    RETURN merge_sorted_by_time(results)

Retention enforcement:

FUNCTION enforce_retention():
    cutoff = now() - retention_period

    FOR EACH shard IN shards:
        IF shard.end_time < cutoff:
            shard.close()
            delete_shard_directory(shard.id)
            shards.remove(shard.id)
            LOG("Deleted shard {shard.id}")

Critical insight: The shard boundary alignment is crucial. If you have daily shards, all data for a given day must go to the same shard. This is why time-based sharding works so well for retention: you never partially delete a shard.
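
A small Go sketch of the boundary math for daily shards, matching truncate_to_boundary above (the timestamps and shard-naming format are illustrative):

package main

import (
	"fmt"
	"time"
)

const shardDuration = int64(24 * 60 * 60) // daily shards, in seconds

// shardIDFor maps a Unix timestamp (seconds) to its shard: every sample
// from the same UTC day lands in the same shard.
func shardIDFor(ts int64) string {
	start := ts - (ts % shardDuration)
	return "shard_" + time.Unix(start, 0).UTC().Format("2006-01-02")
}

func main() {
	fmt.Println(shardIDFor(1703001600)) // 16:00 UTC on 2023-12-19 -> shard_2023-12-19
	fmt.Println(shardIDFor(1703023200)) // 22:00 UTC the same day  -> same shard
	fmt.Println(shardIDFor(1703030400)) // 00:00 UTC the next day  -> shard_2023-12-20
}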

Learning milestones

  1. Writes route to correct shard → You understand partitioning
  2. Queries span multiple shards correctly → You understand fan-out
  3. Retention deletes old shards → You understand data lifecycle
  4. Hot/cold tiering works → You understand storage economics

Project 9: Downsampling and Rollup Engine

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: Go
  • Alternative Programming Languages: Python, Rust, Java
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced (The Engineer)
  • Knowledge Area: Data Aggregation / Stream Processing
  • Software or Tool: Downsampling Engine
  • Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann

What you’ll build

An engine that automatically creates lower-resolution rollups of time series data (e.g., 1-minute raw → 1-hour aggregates → 1-day aggregates).

Why it teaches Time Series Databases

Long-term storage of high-resolution data is expensive and rarely needed. You’ll understand tiered retention: keep raw data for 7 days, hourly rollups for 90 days, daily rollups forever.

Core challenges you’ll face

  • Aggregation functions (min, max, sum, count, avg, percentiles) → maps to statistics fundamentals
  • Incremental computation (don’t reprocess old data) → maps to stream processing
  • Rollup scheduling (when to run, what to process) → maps to job scheduling
  • Rollup storage (where to store aggregated data) → maps to data organization
  • Query routing (use rollups for long-range queries) → maps to query optimization

Key Concepts

Difficulty

Advanced

Time estimate

2 weeks

Prerequisites

Basic aggregation understanding, time-based sharding

Real world outcome

$ ./rollup-engine status
Rollup Configuration:
  Raw data retention: 7 days (1-second resolution)
  1-minute rollups: 30 days
  1-hour rollups: 1 year
  1-day rollups: forever

Rollup Jobs:
  Last 1-minute rollup: 2024-01-15 14:35:00 (processed 60 seconds of data)
  Last 1-hour rollup: 2024-01-15 14:00:00 (processed 60 minutes of data)
  Last 1-day rollup: 2024-01-15 00:00:00 (processed 24 hours of data)

$ ./rollup-engine query "avg(cpu_usage) WHERE host='server1'" --range="30d"
Query plan:
  Last 7 days: raw data (1-second resolution)
  Days 8-30: 1-minute rollups
  Total points scanned: 604,800 + 33,120 = 637,920
  (vs. 2,592,000 without rollups - 4x fewer points!)

Result: avg=0.342, min=0.05, max=0.98, count=2,592,000

$ ./rollup-engine query "avg(cpu_usage)" --range="1y"
Query plan:
  Last 7 days: raw data
  Days 8-30: 1-minute rollups
  Days 31-365: 1-hour rollups
  Total points scanned: 604,800 + 33,120 + 8,040 = 645,960
  (vs. 31,536,000 without rollups - 48x fewer points!)

Implementation Hints

Rollup point structure:

STRUCT RollupPoint:
    timestamp: int64       # Start of aggregation window
    count: int64           # Number of raw points
    sum: float64           # Sum of values
    min: float64           # Minimum value
    max: float64           # Maximum value
    # Derived: avg = sum / count

# For percentiles, use a sketch data structure:
STRUCT PercentileRollupPoint:
    timestamp: int64
    t_digest: TDigest      # Or DDSketch

Rollup job pseudo-code:

STRUCT RollupJob:
    source_resolution: Duration    # e.g., 1 second
    target_resolution: Duration    # e.g., 1 minute
    last_processed: Timestamp

FUNCTION run_rollup_job(job):
    start = job.last_processed
    end = truncate_to_boundary(now(), job.target_resolution)

    IF start >= end:
        RETURN  # Nothing new to process

    windows = generate_windows(start, end, job.target_resolution)

    FOR EACH window IN windows:
        raw_points = query_raw_data(window.start, window.end)

        rollup = RollupPoint(
            timestamp = window.start,
            count = len(raw_points),
            sum = sum(p.value for p in raw_points),
            min = min(p.value for p in raw_points),
            max = max(p.value for p in raw_points)
        )

        store_rollup(rollup)

    job.last_processed = end

Query routing:

FUNCTION smart_query(start_time, end_time, aggregation):
    segments = []
    current = start_time

    # Use highest resolution available for each time segment
    WHILE current < end_time:
        IF current > now() - raw_retention:
            segments.append(QuerySegment("raw", current, min(end_time, now())))
            current = now()
        ELIF current > now() - minute_rollup_retention:
            segment_end = min(end_time, now() - raw_retention)
            segments.append(QuerySegment("1min", current, segment_end))
            current = segment_end
        ELSE:
            segment_end = min(end_time, now() - minute_rollup_retention)
            segments.append(QuerySegment("1hour", current, segment_end))
            current = segment_end

    RETURN merge_results(execute_segments(segments))

Critical insight: Rollups must be additive—you can compute sum and count from rollups and derive avg, but you cannot compute median without storing a sketch. This is why production systems use t-digest or DDSketch for percentile rollups.
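
A small Go sketch showing why the stored fields are mergeable: two 1-minute rollups combine into a coarser rollup without rereading raw points (the values are illustrative):

package main

import "fmt"

// rollup mirrors the RollupPoint structure above: everything stored is
// additive, so coarser windows can be built from finer ones.
type rollup struct {
	count         int64
	sum, min, max float64
}

func merge(a, b rollup) rollup {
	out := rollup{count: a.count + b.count, sum: a.sum + b.sum, min: a.min, max: a.max}
	if b.min < out.min {
		out.min = b.min
	}
	if b.max > out.max {
		out.max = b.max
	}
	return out
}

func main() {
	m1 := rollup{count: 60, sum: 20.4, min: 0.05, max: 0.80} // one 1-minute rollup
	m2 := rollup{count: 60, sum: 22.8, min: 0.10, max: 0.98} // the next minute
	h := merge(m1, m2)                                       // building toward a 1-hour rollup
	fmt.Printf("avg=%.3f min=%.2f max=%.2f count=%d\n",
		h.sum/float64(h.count), h.min, h.max, h.count)
	// A median or p99 cannot be merged this way, which is why percentile
	// rollups store a sketch (t-digest / DDSketch) instead.
}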

Learning milestones

  1. Basic aggregations work → You understand rollup fundamentals
  2. Incremental processing works → You understand stream processing
  3. Query routing uses appropriate rollups → You understand query optimization
  4. Percentile rollups work → You understand sketch data structures

Project 10: PromQL-Style Query Engine

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: Go
  • Alternative Programming Languages: Rust, Python, TypeScript
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 4: Expert (The Systems Architect)
  • Knowledge Area: Query Languages / Compilers
  • Software or Tool: Query Engine
  • Main Book: “Language Implementation Patterns” by Terence Parr

What you’ll build

A query engine that parses and executes PromQL-style queries with selectors, functions, and binary operators.

Why it teaches Time Series Databases

PromQL is the de facto standard for time series queries. You’ll understand how a functional query language works: expression trees, type system (scalars vs. vectors), and streaming execution.

Core challenges you’ll face

  • Lexer and parser (tokenize and build AST) → maps to language implementation
  • Type system (instant vector, range vector, scalar) → maps to type theory basics
  • Selector execution (query storage for matching series) → maps to storage integration
  • Function implementation (rate, avg_over_time, histogram_quantile) → maps to aggregation logic
  • Binary operator matching (how to combine two vectors) → maps to set operations

Key Concepts

  • Lexing and Parsing: “Language Implementation Patterns” Chapters 2-5 - Terence Parr
  • Expression Trees: “Compilers: Principles and Practice” Chapter 5 - Dave & Dave
  • Type Systems: “Language Implementation Patterns” Chapter 8 - Terence Parr
  • Vector Matching: PromQL Documentation

Difficulty

Expert

Time estimate

3-4 weeks

Prerequisites

Basic compiler knowledge (lexer/parser), understanding of time series data model

Real world outcome

$ ./promql-engine
PromQL Engine v0.1

> http_requests_total
TYPE: instant vector
{method="GET", status="200"} 1523
{method="GET", status="404"} 42
{method="POST", status="200"} 891
Query time: 2.3ms

> rate(http_requests_total[5m])
TYPE: instant vector
{method="GET", status="200"} 5.07
{method="GET", status="404"} 0.14
{method="POST", status="200"} 2.97
Query time: 4.1ms

> sum by (method) (rate(http_requests_total[5m]))
TYPE: instant vector
{method="GET"} 5.21
{method="POST"} 2.97
Query time: 4.8ms

> http_requests_total / ignoring(status) http_requests_started
TYPE: instant vector (ratio of completed to started)
{method="GET"} 0.95
{method="POST"} 0.89
Query time: 5.2ms

Implementation Hints

Token types:

TOKENS:
  IDENTIFIER    # metric names, label names
  STRING        # "value"
  NUMBER        # 123, 45.67
  DURATION      # 5m, 1h, 30s
  LBRACE        # {
  RBRACE        # }
  LPAREN        # (
  RPAREN        # )
  LBRACKET      # [
  RBRACKET      # ]
  COMMA         # ,
  EQ            # =
  NEQ           # !=
  REGEX_MATCH   # =~
  REGEX_NOMATCH # !~

  # Binary operators
  ADD, SUB, MUL, DIV, MOD, POW

  # Keywords
  SUM, AVG, MIN, MAX, COUNT, BY, WITHOUT, ON, IGNORING, RATE, ...

AST node types:

ENUM ExprType:
    VECTOR_SELECTOR     # http_requests_total{method="GET"}
    MATRIX_SELECTOR     # http_requests_total{method="GET"}[5m]
    SCALAR_LITERAL      # 42
    UNARY_EXPR          # -metric
    BINARY_EXPR         # a + b
    CALL                # rate(metric[5m])
    AGGREGATION         # sum by (label) (expr)

STRUCT VectorSelector:
    metric_name: string
    label_matchers: List<LabelMatcher>

STRUCT MatrixSelector:
    vector: VectorSelector
    range: Duration

STRUCT BinaryExpr:
    left: Expr
    right: Expr
    op: Operator
    matching: VectorMatching  # on(), ignoring()

Execution pseudo-code:

FUNCTION evaluate(expr, eval_time):
    MATCH expr:
        CASE VectorSelector(name, matchers):
            series = storage.select(name, matchers)
            RETURN instant_vector(series, eval_time)

        CASE MatrixSelector(vector, range):
            series = storage.select(vector.name, vector.matchers)
            RETURN range_vector(series, eval_time - range, eval_time)

        CASE Call("rate", [matrix_expr]):
            matrix = evaluate(matrix_expr, eval_time)
            RETURN apply_rate(matrix)

        CASE BinaryExpr(left, right, op, matching):
            lhs = evaluate(left, eval_time)
            rhs = evaluate(right, eval_time)
            RETURN vector_binary_op(lhs, rhs, op, matching)

        CASE Aggregation("sum", by_labels, expr):
            vector = evaluate(expr, eval_time)
            RETURN aggregate_sum(vector, by_labels)

Vector matching (the tricky part):

FUNCTION vector_binary_op(lhs, rhs, op, matching):
    result = []

    # Build signature map for RHS
    rhs_map = {}
    FOR EACH sample IN rhs:
        sig = signature(sample.labels, matching.on_labels)
        rhs_map[sig] = sample

    # Match LHS against RHS
    FOR EACH lsample IN lhs:
        sig = signature(lsample.labels, matching.on_labels)
        IF sig IN rhs_map:
            rsample = rhs_map[sig]
            result_value = apply_op(lsample.value, rsample.value, op)
            result.append(Sample(lsample.labels, result_value))

    RETURN result

Critical insight: PromQL is a functional language where every subexpression evaluates to a result (scalar, instant vector, or range vector) that feeds into the parent expression. The type rules are strict: rate() requires a range vector, aggregations produce instant vectors, etc.
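
As a taste of function evaluation, here is a deliberately simplified Go sketch of rate() over a range vector; real PromQL also handles counter resets and extrapolates to the window boundaries:

package main

import "fmt"

type sample struct {
	ts    int64 // unix seconds
	value float64
}

// simpleRate is a simplified rate(): the per-second increase between the
// first and last sample of a range vector.
func simpleRate(rangeVector []sample) float64 {
	if len(rangeVector) < 2 {
		return 0
	}
	first, last := rangeVector[0], rangeVector[len(rangeVector)-1]
	return (last.value - first.value) / float64(last.ts-first.ts)
}

func main() {
	// http_requests_total scraped every 15s over a 1m window (illustrative values).
	window := []sample{
		{1703001600, 1400}, {1703001615, 1430},
		{1703001630, 1462}, {1703001645, 1495}, {1703001660, 1523},
	}
	fmt.Printf("%.2f requests/second\n", simpleRate(window)) // (1523-1400)/60 ≈ 2.05
}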

Learning milestones

  1. Simple selectors work → You understand the lexer/parser
  2. rate() and other functions work → You understand range vectors
  3. Binary operators with matching work → You understand PromQL’s type system
  4. Aggregations work → You’ve built a functional query engine

Project 11: High-Throughput Ingestion Pipeline

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: Rust
  • Alternative Programming Languages: Go, C++, C
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 4: Expert (The Systems Architect)
  • Knowledge Area: High Performance / Concurrency
  • Software or Tool: Ingestion Pipeline
  • Main Book: “Rust Atomics and Locks” by Mara Bos

What you’ll build

A high-performance ingestion pipeline capable of handling 1M+ data points per second with batching, back-pressure, and parallel processing.

Why it teaches Time Series Databases

Production TSDBs must handle enormous write loads from thousands of agents. You’ll understand write-path optimization: batching, lock-free data structures, memory pools, and the trade-offs between latency and throughput.

Core challenges you’ll face

  • Batching strategies (when to flush: time-based, size-based, or both) → maps to latency/throughput trade-offs
  • Lock-free structures (concurrent writes without blocking) → maps to concurrency patterns
  • Memory pooling (avoid allocation overhead) → maps to memory management
  • Back-pressure (what happens when writes exceed processing capacity) → maps to flow control
  • Wire protocol (efficient parsing of incoming data) → maps to serialization

Key Concepts

  • Lock-Free Programming: “Rust Atomics and Locks” Chapters 4-6 - Mara Bos
  • Batching and Buffering: “Designing Data-Intensive Applications” Chapter 11 - Martin Kleppmann
  • Memory Pools: “C Interfaces and Implementations” Chapter 5 - David Hanson
  • Back-Pressure: “Designing Data-Intensive Applications” Chapter 11 - Martin Kleppmann

Difficulty

Expert

Time estimate

3-4 weeks

Prerequisites

Strong concurrency knowledge, understanding of memory layout

Real world outcome

$ ./ingestion-bench --writers=100 --duration=60s
Starting ingestion benchmark...
Writers: 100 concurrent
Duration: 60 seconds

Results:
  Total points ingested: 72,345,678
  Throughput: 1,205,761 points/second

  Latency (p50): 0.4ms
  Latency (p99): 2.1ms
  Latency (p99.9): 8.3ms

  Batches flushed: 723,456
  Avg batch size: 100 points

  Memory usage: 512MB (stable)
  GC pauses: 0 (using memory pool)

$ ./ingestion-server --port=9090
Ingestion server listening on :9090
Protocol: InfluxDB line protocol compatible

$ echo "cpu,host=server1 usage=0.75 1703001600000000000" | nc localhost 9090
OK

$ ./telegraf --config=telegraf.conf  # Send 10K metrics/sec
Connected to ingestion server
Sending metrics...

Implementation Hints

Batching architecture:

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Receiver 1    │────▶│   Batch Queue   │────▶│   Flusher 1     │
├─────────────────┤     │   (lock-free)   │     ├─────────────────┤
│   Receiver 2    │────▶│                 │────▶│   Flusher 2     │
├─────────────────┤     │                 │     ├─────────────────┤
│   Receiver N    │────▶│                 │────▶│   Flusher M     │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                              ▲
                              │
                        Flush trigger:
                        - Time: every 100ms
                        - Size: every 10K points

Lock-free batch queue pseudo-code:

STRUCT BatchQueue:
    batches: Array<AtomicPtr<Batch>>
    write_idx: AtomicUint64
    read_idx: AtomicUint64

STRUCT Batch:
    points: FixedArray<Point, 10000>
    count: AtomicUint32
    state: AtomicEnum<FILLING, READY, FLUSHING>

FUNCTION append_point(queue, point):
    LOOP:
        idx = queue.write_idx.load()
        batch = queue.batches[idx % NUM_BATCHES]

        IF batch.state.load() == FILLING:
            slot = batch.count.fetch_add(1)
            IF slot < BATCH_SIZE:
                batch.points[slot] = point
                RETURN OK
            ELSE:
                # Batch full, try to seal it
                batch.state.compare_swap(FILLING, READY)

        # Move to next batch
        queue.write_idx.compare_swap(idx, idx + 1)

Memory pool:

STRUCT PointPool:
    free_list: LockFreeStack<*Point>
    slabs: Vec<*Slab>

FUNCTION allocate():
    IF point = free_list.pop():
        RETURN point
    ELSE:
        RETURN allocate_from_new_slab()

FUNCTION release(point):
    clear(point)
    free_list.push(point)

Back-pressure:

FUNCTION handle_write(points):
    IF pending_batches() > MAX_PENDING:
        IF blocking_mode:
            wait_for_flush()
        ELSE:
            RETURN ERROR_BACKPRESSURE

    FOR EACH point IN points:
        append_point(batch_queue, point)

    RETURN OK

Critical insight: The key to high throughput is avoiding contention. Each receiver thread should have its own batch being filled, and only contend when batches are swapped. Memory pools eliminate allocation overhead, which is critical at 1M+ points/second.
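
A Go sketch of the pooling idea using sync.Pool (batch capacity and names are illustrative; in Rust the same role is played by an explicit free list or arena):

package main

import (
	"fmt"
	"sync"
)

type point struct {
	ts    int64
	value float64
}

// batchPool recycles fixed-capacity batches so the hot ingest path does
// not allocate per write; this shows the pooling idea, not the full pipeline.
var batchPool = sync.Pool{
	New: func() any { return make([]point, 0, 10000) },
}

func handleBatch(points []point) {
	batch := batchPool.Get().([]point)[:0] // reuse the backing array
	batch = append(batch, points...)
	flush(batch)         // in the real pipeline this hands off to a flusher
	batchPool.Put(batch) // recycle once the flusher is done with it
}

func flush(batch []point) {
	fmt.Println("flushed", len(batch), "points")
}

func main() {
	handleBatch([]point{{1703001600, 0.75}, {1703001610, 0.82}})
}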

Learning milestones

  1. Single-threaded hits 200K points/sec → You understand parsing overhead
  2. Multi-threaded scales linearly → You understand lock-free design
  3. Latency stays stable under load → You understand batching trade-offs
  4. Memory stays flat → You understand pooling

Project 12: Cardinality Management System

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: Go
  • Alternative Programming Languages: Rust, Python, Java
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 3: Advanced (The Engineer)
  • Knowledge Area: Metrics / Observability
  • Software or Tool: Cardinality Analyzer
  • Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann

What you’ll build

A system that tracks, analyzes, and limits the cardinality of time series (unique combinations of metric + tags), preventing the “cardinality explosion” problem.

Why it teaches Time Series Databases

High cardinality is the #1 operational challenge for TSDBs. A single metric with a “user_id” tag can create millions of series. You’ll understand why this breaks TSDBs and how to detect and limit it.

Core challenges you’ll face

  • Cardinality counting (how many unique series exist?) → maps to HyperLogLog and sketches
  • Top-K tracking (which labels contribute most cardinality?) → maps to streaming algorithms
  • Cardinality limiting (reject writes that would create too many series) → maps to admission control
  • Label analysis (detect problematic labels like user_id) → maps to data quality
  • Cardinality forecasting (predict future growth) → maps to capacity planning

Key Concepts

  • HyperLogLog: “Algorithms” (probabilistic counting) - or HyperLogLog paper
  • Top-K/Heavy Hitters: “Algorithms” - Sedgewick (or Space-Saving algorithm)
  • Admission Control: “Designing Data-Intensive Applications” Chapter 11 - Martin Kleppmann
  • Cardinality Estimation: Greptime: Unveiling High Cardinality

Difficulty

Advanced

Time estimate

2 weeks

Prerequisites

Basic probability/statistics, understanding of time series data model

Real world outcome

$ ./cardinality-analyzer
Cardinality Analysis Report
===========================

Total unique series: 1,234,567
Growth rate: +12,345/hour

Top contributors to cardinality:
  1. http_requests{path=*}         523,456 series (42%)
     WARNING: "path" label has high cardinality
     Unique values: 523,456
     Recommendation: Use regex grouping or drop label

  2. container_cpu{pod_id=*}       312,789 series (25%)
     WARNING: "pod_id" is likely ephemeral
     Unique values: 312,789
     Recommendation: Consider pod_name instead

  3. api_latency{user_id=*}        234,567 series (19%)
     CRITICAL: "user_id" creates unbounded cardinality
     Unique values: 234,567
     Recommendation: Remove label, use exemplars

Cardinality limits:
  Metric "http_requests": 10,000 series (current: 523,456 - EXCEEDED)
  Action: New series with unique "path" values are being dropped
  Dropped in last hour: 45,678 writes

$ ./cardinality-analyzer forecast --days=30
Predicted cardinality in 30 days: 2,456,789
WARNING: Exceeds recommended limit of 2,000,000

Implementation Hints

Cardinality tracking structure:

STRUCT CardinalityTracker:
    # Approximate total count
    global_hll: HyperLogLog

    # Per-metric counts
    metric_hlls: HashMap<MetricName, HyperLogLog>

    # Per-label cardinality (for analysis)
    label_hlls: HashMap<(MetricName, LabelName), HyperLogLog>

    # Top-K series by write frequency
    heavy_hitters: SpaceSaving<SeriesKey>

    # Limits
    limits: HashMap<MetricName, uint64>

    # Dropped writes counter
    dropped: AtomicUint64

FUNCTION track_write(metric, labels, value):
    series_key = build_series_key(metric, labels)

    # Update HLL
    global_hll.add(hash(series_key))
    metric_hlls[metric].add(hash(series_key))

    FOR EACH (name, value) IN labels:
        label_hlls[(metric, name)].add(hash(value))

    heavy_hitters.add(series_key)

Cardinality limiting:

FUNCTION should_accept_write(metric, labels):
    series_key = build_series_key(metric, labels)

    IF series_key IN known_series:
        RETURN true  # Existing series, always accept

    # New series - check limits
    current = metric_hlls[metric].count()
    limit = limits.get(metric, DEFAULT_LIMIT)

    IF current >= limit:
        dropped.increment()
        LOG("Dropping write: cardinality limit exceeded for {metric}")
        RETURN false

    known_series.add(series_key)
    RETURN true
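
A minimal sketch of the admission-control step in Go, assuming an exact in-memory set of known series keys plus a per-metric counter; real systems bound the memory of this set or fall back to the HLL estimates above once it grows too large. The names (SeriesLimiter, Accept) are illustrative, not taken from any particular TSDB.

package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// SeriesLimiter rejects writes that would create new series beyond a per-metric limit.
type SeriesLimiter struct {
	mu      sync.Mutex
	known   map[string]struct{} // series key -> exists
	counts  map[string]int      // metric -> number of known series
	limits  map[string]int      // metric -> max series
	dropped int
}

func NewSeriesLimiter(limits map[string]int) *SeriesLimiter {
	return &SeriesLimiter{known: map[string]struct{}{}, counts: map[string]int{}, limits: limits}
}

// seriesKey builds a canonical key: metric name plus sorted label pairs.
func seriesKey(metric string, labels map[string]string) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	b.WriteString(metric)
	for _, k := range keys {
		fmt.Fprintf(&b, ",%s=%s", k, labels[k])
	}
	return b.String()
}

// Accept returns true for existing series, and for new series only while under the limit.
func (l *SeriesLimiter) Accept(metric string, labels map[string]string) bool {
	key := seriesKey(metric, labels)
	l.mu.Lock()
	defer l.mu.Unlock()

	if _, ok := l.known[key]; ok {
		return true // existing series: always accept
	}
	limit, ok := l.limits[metric]
	if !ok {
		limit = 10000 // default limit (illustrative)
	}
	if l.counts[metric] >= limit {
		l.dropped++
		return false // would exceed the per-metric cardinality limit
	}
	l.known[key] = struct{}{}
	l.counts[metric]++
	return true
}

func main() {
	lim := NewSeriesLimiter(map[string]int{"http_requests": 2})
	fmt.Println(lim.Accept("http_requests", map[string]string{"path": "/a"})) // true
	fmt.Println(lim.Accept("http_requests", map[string]string{"path": "/b"})) // true
	fmt.Println(lim.Accept("http_requests", map[string]string{"path": "/c"})) // false (limit reached)
}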

Label analysis:

FUNCTION analyze_labels(metric):
    report = []
    metric_cardinality = metric_hlls[metric].count()

    FOR EACH ((m, label_name), hll) IN label_hlls WHERE m == metric:
        label_cardinality = hll.count()
        contribution = label_cardinality / metric_cardinality

        IF contribution > 0.5:
            report.append(Warning(
                "Label '{label_name}' contributes {contribution*100}% of cardinality",
                "Consider removing or bucketing this label"
            ))

        IF label_cardinality > 10000:
            report.append(Critical(
                "Label '{label_name}' has {label_cardinality} unique values",
                "This is likely unbounded (user_id, request_id, etc.)"
            ))

    RETURN report

Critical insight: The cardinality problem is fundamentally about combinatorial explosion. If you have 3 labels with 100, 50, and 20 unique values, you could have up to 100,000 unique series. Adding a fourth label with 1,000 values (like user_id) multiplies that to up to 100,000,000 series, far beyond what a single node can index.

Learning milestones

  1. HyperLogLog counts work → You understand approximate counting
  2. Top-K identifies problematic labels → You understand streaming algorithms
  3. Cardinality limiting works → You understand admission control
  4. Analysis produces actionable recommendations → You understand operational needs

Project 13: Distributed Time Series Store

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: Go
  • Alternative Programming Languages: Rust, Java
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 5. The “Industry Disruptor” (VC-Backable Platform)
  • Difficulty: Level 5: Master (The First-Principles Wizard)
  • Knowledge Area: Distributed Systems
  • Software or Tool: Distributed TSDB
  • Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann

What you’ll build

A distributed time series store with replication, consistent hashing for series placement, and federated queries across nodes.

Why it teaches Time Series Databases

Single-node TSDBs can’t scale forever. You’ll understand how production systems like Cortex, Mimir, and Thanos distribute data: hash ring for writes, fan-out for queries, replication for durability.

Core challenges you’ll face

  • Consistent hashing (assign series to nodes) → maps to partitioning
  • Replication (write to N nodes for durability) → maps to fault tolerance
  • Query fan-out (query all nodes, merge results) → maps to distributed query execution
  • Node failure handling (detect and recover) → maps to failure detection
  • Membership protocol (nodes joining/leaving) → maps to cluster management

Key Concepts

  • Consistent Hashing: “Designing Data-Intensive Applications” Chapter 6 - Martin Kleppmann
  • Replication: “Designing Data-Intensive Applications” Chapter 5 - Martin Kleppmann
  • Gossip Protocols: “Designing Data-Intensive Applications” Chapter 8 - Martin Kleppmann
  • Distributed Query Execution: Grafana Mimir Architecture

Difficulty

Master

Time estimate

6-8 weeks

Prerequisites

All previous projects, strong distributed systems knowledge

Real world outcome

$ ./tsdb-cluster init --nodes=3 --replication=2
Initializing 3-node cluster with replication factor 2...
Node 1 (localhost:9001): ONLINE
Node 2 (localhost:9002): ONLINE
Node 3 (localhost:9003): ONLINE
Hash ring configured: 256 virtual nodes per physical node

$ ./tsdb-cluster status
Cluster Status: HEALTHY
  Nodes: 3/3 online
  Total series: 5,234,567
  Distribution:
    Node 1: 1,756,234 series (33.5%)
    Node 2: 1,745,678 series (33.3%)
    Node 3: 1,732,655 series (33.1%)
  Replication lag: <100ms

$ ./tsdb-cluster write "cpu{host=server1} 0.75"
Written to: Node 1 (primary), Node 2 (replica)
Ack: 2/2 nodes confirmed

$ ./tsdb-cluster query "cpu{host=server1}[5m]"
Query plan:
  Fan-out to: Node 1 (primary holder)
  Fallback: Node 2 (replica)
Result: 30 samples in 4.2ms

$ ./tsdb-cluster kill node-2
Node 2 marked as FAILED
Rebalancing: 1,745,678 series being re-replicated to Node 3
Progress: 45%... 78%... 100%
Cluster Status: DEGRADED (2/3 nodes)
All data remains available (replication factor satisfied)

Implementation Hints

Hash ring structure:

STRUCT HashRing:
    nodes: List<Node>
    virtual_nodes: SortedMap<uint64, Node>  # hash → node
    replication_factor: int

FUNCTION add_node(node):
    FOR i FROM 0 TO VIRTUAL_NODES_PER_PHYSICAL:
        hash = hash(node.id + "_" + i)
        virtual_nodes[hash] = node

FUNCTION get_nodes_for_series(series_key):
    hash = hash(series_key)
    # Wrap around to the first key if no virtual node hashes >= hash
    start_idx = virtual_nodes.ceiling_key(hash) OR virtual_nodes.first_key()

    result = []
    seen_physical = set()

    FOR EACH (_, node) IN virtual_nodes starting from start_idx, wrapping around the ring:
        IF node.id NOT IN seen_physical:
            result.append(node)
            seen_physical.add(node.id)
            IF len(result) >= replication_factor:
                BREAK

    RETURN result
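
A compact Go sketch of the hash ring, assuming FNV-1a for hashing and a sorted slice of virtual-node hashes searched with sort.Search; the wraparound at the end of the ring is the detail most implementations get wrong first. Names and constants here are illustrative.

package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

const virtualNodesPerPhysical = 256

type vnode struct {
	hash uint64
	node string // physical node ID
}

// Ring is a consistent-hash ring with virtual nodes, kept as a sorted slice.
type Ring struct {
	vnodes            []vnode
	replicationFactor int
}

func hash64(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

func (r *Ring) AddNode(id string) {
	for i := 0; i < virtualNodesPerPhysical; i++ {
		r.vnodes = append(r.vnodes, vnode{hash: hash64(fmt.Sprintf("%s_%d", id, i)), node: id})
	}
	sort.Slice(r.vnodes, func(i, j int) bool { return r.vnodes[i].hash < r.vnodes[j].hash })
}

// NodesFor returns the replicationFactor distinct physical nodes responsible for a series.
func (r *Ring) NodesFor(seriesKey string) []string {
	h := hash64(seriesKey)
	// First virtual node with hash >= h; sort.Search returns len(vnodes) if none, which wraps to 0 below.
	start := sort.Search(len(r.vnodes), func(i int) bool { return r.vnodes[i].hash >= h })

	var result []string
	seen := map[string]bool{}
	for i := 0; i < len(r.vnodes) && len(result) < r.replicationFactor; i++ {
		v := r.vnodes[(start+i)%len(r.vnodes)] // walk clockwise, wrapping around the ring
		if !seen[v.node] {
			seen[v.node] = true
			result = append(result, v.node)
		}
	}
	return result
}

func main() {
	ring := &Ring{replicationFactor: 2}
	ring.AddNode("node-1")
	ring.AddNode("node-2")
	ring.AddNode("node-3")
	fmt.Println(ring.NodesFor(`cpu{host=server1}`)) // e.g. [node-2 node-1]
}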

Write path with replication:

FUNCTION write(series_key, timestamp, value):
    nodes = ring.get_nodes_for_series(series_key)

    # Write to all replicas
    responses = parallel_write(nodes, series_key, timestamp, value)

    # Wait for quorum (majority)
    successes = count(r for r in responses if r.success)

    IF successes >= (replication_factor / 2) + 1:
        RETURN OK
    ELSE:
        RETURN ERROR("Failed to achieve write quorum")
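
A sketch of the quorum calculation and parallel replica writes in Go. The writeTo function is a placeholder for the per-node RPC, and this version waits for all replicas before counting acknowledgements; a production write path would return as soon as the quorum is met.

package main

import (
	"errors"
	"fmt"
	"sync"
)

// writeTo is a placeholder for the per-node RPC; here it always succeeds.
func writeTo(node, seriesKey string, ts int64, value float64) error {
	return nil
}

// quorum is a strict majority of the replication factor (2-of-2, 2-of-3, ...).
func quorum(replicationFactor int) int { return replicationFactor/2 + 1 }

// replicatedWrite sends the sample to every replica in parallel and succeeds
// once a majority of them acknowledge.
func replicatedWrite(nodes []string, seriesKey string, ts int64, value float64) error {
	results := make(chan error, len(nodes))
	var wg sync.WaitGroup
	for _, n := range nodes {
		wg.Add(1)
		go func(node string) {
			defer wg.Done()
			results <- writeTo(node, seriesKey, ts, value)
		}(n)
	}
	wg.Wait()
	close(results)

	successes := 0
	for err := range results {
		if err == nil {
			successes++
		}
	}
	if successes >= quorum(len(nodes)) {
		return nil
	}
	return errors.New("failed to achieve write quorum")
}

func main() {
	nodes := []string{"node-1", "node-2"} // from ring.NodesFor(...)
	err := replicatedWrite(nodes, `cpu{host=server1}`, 1703001600, 0.75)
	fmt.Println("write error:", err)
}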

Query fan-out:

FUNCTION query(selectors, time_range):
    # For each matching series, find primary node
    # (In practice, broadcast to all nodes and merge)

    results = []

    FOR EACH node IN ring.nodes:
        IF node.is_healthy():
            partial = node.query(selectors, time_range)
            results.extend(partial)

    # Merge and deduplicate
    RETURN merge_sorted_by_time(deduplicate(results))
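
The merge-and-deduplicate step can be as simple as concatenating per-node results, sorting by (series, timestamp), and collapsing duplicate points returned by replicas. A rough Go sketch, with Sample as an assumed type:

package main

import (
	"fmt"
	"sort"
)

// Sample is one data point from one node's partial result.
type Sample struct {
	SeriesKey string
	Timestamp int64
	Value     float64
}

// mergeResults concatenates partial results from each node, sorts by
// (series, timestamp), and drops duplicates produced by replicas.
func mergeResults(partials [][]Sample) []Sample {
	var all []Sample
	for _, p := range partials {
		all = append(all, p...)
	}
	sort.Slice(all, func(i, j int) bool {
		if all[i].SeriesKey != all[j].SeriesKey {
			return all[i].SeriesKey < all[j].SeriesKey
		}
		return all[i].Timestamp < all[j].Timestamp
	})

	var out []Sample
	for _, s := range all {
		if n := len(out); n > 0 && out[n-1].SeriesKey == s.SeriesKey && out[n-1].Timestamp == s.Timestamp {
			continue // same point returned by another replica
		}
		out = append(out, s)
	}
	return out
}

func main() {
	node1 := []Sample{{"cpu{host=server1}", 100, 0.75}, {"cpu{host=server1}", 110, 0.82}}
	node2 := []Sample{{"cpu{host=server1}", 100, 0.75}} // replica overlap
	fmt.Println(mergeResults([][]Sample{node1, node2}))
}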

Failure detection and rebalancing:

STRUCT MembershipManager:
    ring: HashRing
    heartbeat_interval: Duration = 1s
    failure_threshold: int = 3  # missed heartbeats

FUNCTION monitor_nodes():
    LOOP:
        FOR EACH node IN ring.nodes:
            IF time_since_last_heartbeat(node) > failure_threshold * heartbeat_interval:
                mark_node_failed(node)
                trigger_rebalance(node)
        sleep(heartbeat_interval)

FUNCTION trigger_rebalance(failed_node):
    # Find series that were on failed node
    affected_series = get_series_on_node(failed_node)

    FOR EACH series IN affected_series:
        # Find new replica location
        new_nodes = ring.get_nodes_for_series(series, excluding=failed_node)

        # Copy data from existing replica to new location
        source = find_healthy_replica(series)
        target = new_nodes.last()  # New replica
        copy_series_data(source, target, series)

Critical insight: The fundamental trade-off in distributed TSDBs is consistency vs. availability. Most choose eventual consistency for writes (all replicas will converge) and read-your-writes consistency where possible. Queries may return slightly stale data during network partitions.

Learning milestones

  1. Writes distribute across nodes → You understand consistent hashing
  2. Reads work with one node down → You understand replication
  3. Cluster rebalances on node failure → You understand failure recovery
  4. Query performance scales with nodes → You’ve built a distributed system

Project 14: Real-Time Alerting Engine

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: Go
  • Alternative Programming Languages: Rust, Python, Java
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 3: Advanced (The Engineer)
  • Knowledge Area: Stream Processing / Alerting
  • Software or Tool: Alerting Engine
  • Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann

What you’ll build

An alerting engine that evaluates rules against incoming time series data and fires alerts when thresholds are breached.

Why it teaches Time Series Databases

Alerting is one of the most important use cases for TSDBs. You’ll understand streaming evaluation: how to efficiently check thousands of rules against millions of incoming data points without overwhelming the system.

Core challenges you’ll face

  • Rule evaluation (periodically query and check thresholds) → maps to scheduled execution
  • Alert state machine (pending → firing → resolved) → maps to state management
  • Deduplication (don’t spam on every evaluation) → maps to event processing
  • Grouping and routing (aggregate related alerts) → maps to alert management
  • For-duration (alert only if condition persists for X minutes) → maps to temporal logic

Key Concepts

  • Alerting Rules: Prometheus Alerting Rules
  • Alert State Machine: Prometheus Alertmanager
  • Stream Processing: “Designing Data-Intensive Applications” Chapter 11 - Martin Kleppmann
  • Event Processing: “Designing Data-Intensive Applications” Chapter 11 - Martin Kleppmann

Difficulty

Advanced

Time estimate

2-3 weeks

Prerequisites

Query engine (Project 10), basic understanding of state machines

Real world outcome

$ ./alerting-engine --config=alerts.yaml
Loading 150 alerting rules...
Starting evaluation loop (interval: 15s)

$ cat alerts.yaml
groups:
  - name: system
    rules:
      - alert: HighCPU
        expr: avg(cpu_usage) > 0.8
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"

[Evaluation at 14:00:00]
Rule "HighCPU": avg(cpu_usage) = 0.72 (below threshold)
State: inactive

[Evaluation at 14:00:15]
Rule "HighCPU": avg(cpu_usage) = 0.85 (above threshold!)
State: pending (for: 15s of 5m)

[Evaluation at 14:05:15]
Rule "HighCPU": avg(cpu_usage) = 0.87 (above threshold)
State: firing (for: 5m15s >= 5m)
ALERT FIRED: HighCPU
  Labels: {severity=warning, alertname=HighCPU}
  Annotations: {summary="High CPU usage detected"}
  Firing since: 2024-01-15 14:00:15

[Sending to notification channels...]
  Slack #alerts: ⚠️ HighCPU - High CPU usage detected
  PagerDuty: Created incident PD-12345

Implementation Hints

Alert rule structure:

STRUCT AlertRule:
    name: string
    expression: PromQLExpr  # Parsed query
    for_duration: Duration  # How long before firing
    labels: Map<string, string>
    annotations: Map<string, string>

STRUCT AlertState:
    rule: AlertRule
    state: INACTIVE | PENDING | FIRING
    active_since: Timestamp  # When became pending/firing
    last_evaluation: Timestamp
    last_sample_value: float64
    firing_alerts: Map<LabelSet, FiringAlert>

STRUCT FiringAlert:
    labels: LabelSet
    annotations: Map<string, string>
    started_at: Timestamp
    resolved_at: Optional<Timestamp>

Evaluation loop:

FUNCTION evaluation_loop(rules, interval):
    LOOP:
        eval_time = now()

        FOR EACH rule IN rules:
            evaluate_rule(rule, eval_time)

        sleep_until(eval_time + interval)

FUNCTION evaluate_rule(rule, eval_time):
    # Execute the PromQL expression. Comparison operators filter, so only
    # series that currently satisfy the condition appear in the result.
    result = query_engine.eval(rule.expression, eval_time)

    active_label_sets = set()

    FOR EACH sample IN result:
        active_label_sets.add(sample.labels)
        handle_active(rule, sample.labels, eval_time)

    # Series that were pending/firing but no longer appear have recovered
    FOR EACH labels IN tracked_label_sets(rule):
        IF labels NOT IN active_label_sets:
            handle_inactive(rule, labels, eval_time)

State machine:

FUNCTION handle_active(rule, labels, eval_time):
    state = get_or_create_state(rule, labels)

    MATCH state.state:
        CASE INACTIVE:
            state.state = PENDING
            state.active_since = eval_time
            LOG("Alert {rule.name} now PENDING")

        CASE PENDING:
            duration = eval_time - state.active_since
            IF duration >= rule.for_duration:
                state.state = FIRING
                fire_alert(rule, labels, eval_time)
                LOG("Alert {rule.name} now FIRING")

        CASE FIRING:
            # Already firing, no action needed
            pass

FUNCTION handle_inactive(rule, labels, eval_time):
    state = get_state(rule, labels)

    IF state AND state.state IN [PENDING, FIRING]:
        IF state.state == FIRING:
            resolve_alert(rule, labels, eval_time)
        state.state = INACTIVE
        LOG("Alert {rule.name} now INACTIVE")
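
A Go sketch of the inactive → pending → firing transitions, keyed by a canonical label-set string and using hypothetical fire/resolve hooks (here just prints). The only state needed per (rule, labels) pair is the current phase and when the condition first became true.

package main

import (
	"fmt"
	"time"
)

type Phase int

const (
	Inactive Phase = iota
	Pending
	Firing
)

// alertState tracks one (rule, label set) pair.
type alertState struct {
	phase       Phase
	activeSince time.Time // when the condition first evaluated true
}

type AlertRule struct {
	Name string
	For  time.Duration
}

// states is keyed by a canonical label-set string.
var states = map[string]*alertState{}

func handleActive(rule AlertRule, labelKey string, evalTime time.Time) {
	st, ok := states[labelKey]
	if !ok {
		st = &alertState{}
		states[labelKey] = st
	}
	switch st.phase {
	case Inactive:
		st.phase, st.activeSince = Pending, evalTime
		fmt.Printf("alert %s %s now PENDING\n", rule.Name, labelKey)
	case Pending:
		if evalTime.Sub(st.activeSince) >= rule.For {
			st.phase = Firing
			fmt.Printf("alert %s %s now FIRING (active since %s)\n", rule.Name, labelKey, st.activeSince.Format(time.RFC3339))
		}
	case Firing:
		// already firing: nothing to do
	}
}

func handleInactive(rule AlertRule, labelKey string, evalTime time.Time) {
	st, ok := states[labelKey]
	if !ok || st.phase == Inactive {
		return
	}
	if st.phase == Firing {
		fmt.Printf("alert %s %s RESOLVED\n", rule.Name, labelKey)
	}
	st.phase = Inactive
}

func main() {
	rule := AlertRule{Name: "HighCPU", For: 30 * time.Second}
	t0 := time.Now()
	handleActive(rule, `{severity="warning"}`, t0)                       // -> PENDING
	handleActive(rule, `{severity="warning"}`, t0.Add(45*time.Second))   // -> FIRING
	handleInactive(rule, `{severity="warning"}`, t0.Add(60*time.Second)) // -> RESOLVED
}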

Alert grouping (simplified):

STRUCT AlertGroup:
    labels: LabelSet  # Common labels (e.g., {severity=critical})
    alerts: List<FiringAlert>
    last_notification: Timestamp

FUNCTION group_alerts(firing_alerts, group_by_labels):
    groups = HashMap<LabelSet, AlertGroup>()

    FOR EACH alert IN firing_alerts:
        group_key = extract_labels(alert.labels, group_by_labels)

        IF group_key NOT IN groups:
            groups[group_key] = AlertGroup(group_key)

        groups[group_key].alerts.append(alert)

    RETURN groups

FUNCTION send_notifications(groups):
    FOR EACH group IN groups:
        IF should_send(group):  # Rate limiting, dedup
            notification = render_notification(group)
            send_to_channels(notification)
            group.last_notification = now()
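
Grouping in Go is essentially a map keyed by the subset of labels chosen for grouping, plus a last-notification timestamp for rate limiting. A small sketch with assumed types:

package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

type FiringAlert struct {
	Labels map[string]string
}

type AlertGroup struct {
	Key              string
	Alerts           []FiringAlert
	LastNotification time.Time // used by the rate-limiting/dedup step
}

// groupKey extracts only the labels named in groupBy and renders them canonically.
func groupKey(labels map[string]string, groupBy []string) string {
	parts := make([]string, 0, len(groupBy))
	for _, name := range groupBy {
		parts = append(parts, name+"="+labels[name])
	}
	sort.Strings(parts)
	return strings.Join(parts, ",")
}

// groupAlerts buckets firing alerts so one notification can cover many alerts.
func groupAlerts(alerts []FiringAlert, groupBy []string) map[string]*AlertGroup {
	groups := map[string]*AlertGroup{}
	for _, a := range alerts {
		k := groupKey(a.Labels, groupBy)
		if groups[k] == nil {
			groups[k] = &AlertGroup{Key: k}
		}
		groups[k].Alerts = append(groups[k].Alerts, a)
	}
	return groups
}

func main() {
	alerts := []FiringAlert{
		{Labels: map[string]string{"alertname": "HighCPU", "severity": "warning", "host": "server1"}},
		{Labels: map[string]string{"alertname": "HighCPU", "severity": "warning", "host": "server2"}},
	}
	for k, g := range groupAlerts(alerts, []string{"alertname", "severity"}) {
		fmt.Printf("group %q: %d alerts\n", k, len(g.Alerts))
	}
}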

Critical insight: The for-duration is essential to prevent flapping. Without it, a metric oscillating around the threshold would generate constant alert/resolve cycles; the pending state acts as a debounce.

Learning milestones

  1. Simple threshold alerts work → You understand rule evaluation
  2. For-duration prevents flapping → You understand state machines
  3. Grouping reduces noise → You understand alert management
  4. Notifications integrate with external systems → You’ve built a production tool

Project 15: Full-Featured TSDB from Scratch

  • File: LEARNING_TIME_SERIES_DATABASES.md
  • Main Programming Language: Rust
  • Alternative Programming Languages: Go, C++
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 5. The “Industry Disruptor” (VC-Backable Platform)
  • Difficulty: Level 5: Master (The First-Principles Wizard)
  • Knowledge Area: Complete TSDB Implementation
  • Software or Tool: Full TSDB
  • Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann

What you’ll build

A complete, production-quality time series database combining all previous projects: storage engine, compression, indexing, query language, ingestion, and clustering.

Why it teaches Time Series Databases

This is the capstone project. You’ll integrate everything you’ve learned into a cohesive system, facing the real challenges of making components work together: buffer management, resource contention, query planning, and operational concerns.

Core challenges you’ll face

  • Component integration (storage + index + query + ingestion) → maps to system design
  • Resource management (memory limits, file handles, connections) → maps to production engineering
  • Query planning (use indexes, rollups, appropriate storage tier) → maps to optimizer design
  • Operational API (compaction, retention, cluster management) → maps to admin interface
  • Observability (metrics about the database itself) → maps to meta-monitoring

Key Concepts

  • System Integration: “Designing Data-Intensive Applications” - Martin Kleppmann (entire book)
  • Production Engineering: “Release It!” - Michael Nygard
  • Query Optimization: “Designing Data-Intensive Applications” Chapter 3 - Martin Kleppmann
  • Observability: “Designing Data-Intensive Applications” Chapter 11 - Martin Kleppmann

Difficulty

Master

Time estimate

3-6 months

Prerequisites

All previous projects completed and understood

Real world outcome

$ ./mytsdb start
MyTSDB v1.0 starting...
  Storage: /var/lib/mytsdb
  WAL: initialized (256MB limit)
  Cache: 1GB
  Compaction: background threads started
  Query engine: PromQL compatible
  HTTP API: listening on :9090
  gRPC API: listening on :9091

$ curl -X POST localhost:9090/write -d '
cpu_usage{host="server1",region="us-east"} 0.75 1703001600
cpu_usage{host="server1",region="us-east"} 0.82 1703001610
memory_usage{host="server1"} 0.45 1703001600
'
{"status":"success","samples_written":3}

$ curl 'localhost:9090/query?query=rate(cpu_usage[5m])&time=1703001800'
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"host": "server1", "region": "us-east"}, "value": [1703001800, "0.023"]}
    ]
  }
}

$ curl localhost:9090/admin/status
{
  "status": "healthy",
  "storage": {
    "total_series": 125432,
    "total_samples": 45678901,
    "disk_usage_bytes": 1234567890,
    "compression_ratio": 12.5
  },
  "ingestion": {
    "samples_per_second": 125000,
    "active_series": 45000
  },
  "query": {
    "queries_per_second": 150,
    "avg_latency_ms": 12.5
  }
}

Implementation Hints

System architecture:

┌────────────────────────────────────────────────────────────────┐
│                        HTTP/gRPC API                           │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐ │
│  │   Ingestion  │  │    Query     │  │       Admin          │ │
│  │   Pipeline   │  │    Engine    │  │       API            │ │
│  └──────┬───────┘  └──────┬───────┘  └──────────┬───────────┘ │
│         │                 │                     │              │
│  ┌──────▼─────────────────▼─────────────────────▼───────────┐ │
│  │                    Storage Engine                         │ │
│  │  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────────┐  │ │
│  │  │  Head   │  │  Blocks │  │  Index  │  │ Compaction  │  │ │
│  │  │  Block  │  │ (TSM)   │  │         │  │   Manager   │  │ │
│  │  └────┬────┘  └────┬────┘  └────┬────┘  └──────┬──────┘  │ │
│  │       │            │            │              │          │ │
│  │  ┌────▼────────────▼────────────▼──────────────▼───────┐ │ │
│  │  │                      WAL                             │ │ │
│  │  └──────────────────────────────────────────────────────┘ │ │
│  └───────────────────────────────────────────────────────────┘ │
│                                                                │
│  ┌──────────────────────────────────────────────────────────┐ │
│  │                 Resource Manager                          │ │
│  │  Memory limits, file handles, connection pools            │ │
│  └──────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────┘

Component interfaces:

TRAIT StorageEngine:
    FUNCTION append(series_ref, timestamp, value) → Result
    FUNCTION query(matchers, start, end) → Iterator<Sample>
    FUNCTION compact() → Result
    FUNCTION snapshot() → Snapshot

TRAIT QueryEngine:
    FUNCTION eval(expr, time) → Result<Value>
    FUNCTION eval_range(expr, start, end, step) → Result<Matrix>

TRAIT IngestionPipeline:
    FUNCTION ingest(samples) → Result<WriteResponse>
    FUNCTION flush() → Result

TRAIT AdminAPI:
    FUNCTION status() → SystemStatus
    FUNCTION compact_now() → Result
    FUNCTION snapshot() → Result
    FUNCTION reload_config() → Result

Resource management:

STRUCT ResourceManager:
    memory_limit: Bytes
    current_memory: AtomicBytes
    file_handle_limit: int
    open_files: AtomicInt

FUNCTION allocate_memory(size):
    new_total = current_memory.fetch_add(size) + size   # fetch_add returns the previous value
    IF new_total > memory_limit:
        current_memory.fetch_sub(size)
        trigger_memory_pressure()
        RETURN ERROR("memory limit exceeded")
    RETURN OK

FUNCTION trigger_memory_pressure():
    # Flush buffers, drop caches, etc.
    head_block.flush_oldest_series()
    query_cache.clear()
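
A Go sketch of the memory-accounting path using an atomic counter; it follows the same add-then-revert pattern as the pseudocode, and the pressure hook is a placeholder for flushing buffers or dropping caches.

package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// ResourceManager tracks an approximate memory budget for in-flight buffers.
type ResourceManager struct {
	memoryLimit   int64
	currentMemory atomic.Int64
	onPressure    func() // e.g. flush the oldest head-block series, drop query caches
}

// Allocate reserves size bytes, or reverts and signals memory pressure if that
// would exceed the limit.
func (rm *ResourceManager) Allocate(size int64) error {
	newTotal := rm.currentMemory.Add(size)
	if newTotal > rm.memoryLimit {
		rm.currentMemory.Add(-size) // revert the reservation
		if rm.onPressure != nil {
			rm.onPressure()
		}
		return errors.New("memory limit exceeded")
	}
	return nil
}

// Release returns bytes to the budget when a buffer is flushed or freed.
func (rm *ResourceManager) Release(size int64) {
	rm.currentMemory.Add(-size)
}

func main() {
	rm := &ResourceManager{memoryLimit: 1 << 20, onPressure: func() { fmt.Println("memory pressure: flushing") }}
	fmt.Println(rm.Allocate(800_000)) // ok
	fmt.Println(rm.Allocate(400_000)) // exceeds limit, pressure hook runs
	rm.Release(800_000)
}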

Startup sequence:

FUNCTION start():
    # 1. Initialize resource manager
    resource_manager = ResourceManager(config.memory_limit, ...)

    # 2. Open/recover WAL
    wal = WAL.open_or_create(config.wal_dir)

    # 3. Initialize storage engine
    storage = StorageEngine(config.data_dir, wal, resource_manager)
    storage.recover()  # Replay WAL, load block metadata

    # 4. Initialize index
    index = InvertedIndex.load(config.data_dir)

    # 5. Start background tasks
    spawn(compaction_loop, storage)
    spawn(retention_loop, storage, config.retention)
    spawn(wal_checkpoint_loop, wal, storage)

    # 6. Initialize query engine
    query_engine = QueryEngine(storage, index)

    # 7. Initialize ingestion pipeline
    ingestion = IngestionPipeline(storage, index, config.batch_size)

    # 8. Start API servers
    start_http_server(ingestion, query_engine, admin_api)
    start_grpc_server(ingestion, query_engine)

Self-monitoring (meta-metrics):

FUNCTION collect_internal_metrics():
    RETURN [
        Metric("mytsdb_storage_series_total", storage.series_count()),
        Metric("mytsdb_storage_samples_total", storage.sample_count()),
        Metric("mytsdb_storage_bytes", storage.disk_usage()),
        Metric("mytsdb_ingestion_samples_per_second", ingestion.rate()),
        Metric("mytsdb_query_latency_seconds", query_engine.latency_histogram()),
        Metric("mytsdb_memory_usage_bytes", resource_manager.current_memory),
        Metric("mytsdb_compaction_duration_seconds", compaction.duration_histogram()),
        Metric("mytsdb_wal_bytes", wal.size()),
    ]

Critical insight: Building a full TSDB is an exercise in trade-offs. Every decision (block size, flush interval, compaction strategy) affects multiple dimensions: write throughput, query latency, disk usage, memory usage, and durability. The art is finding the right balance for your use case.

Learning milestones

  1. All components integrate → You understand system design
  2. Benchmarks match expectations → You understand performance
  3. Failure scenarios are handled → You understand production engineering
  4. Self-monitoring works → You’ve built a production-quality system
  5. Documentation is complete → You can explain your design decisions

Project Comparison Table

| Project                        | Difficulty            | Time       | Depth of Understanding | Fun Factor |
| 1. In-Memory Store             | Beginner-Intermediate | 1 week     | ⭐⭐                   | ⭐⭐⭐      |
| 2. Delta Encoding              | Intermediate          | 1 week     | ⭐⭐⭐                 | ⭐⭐⭐      |
| 3. Gorilla Compression         | Advanced              | 2-3 weeks  | ⭐⭐⭐⭐               | ⭐⭐⭐⭐⭐  |
| 4. WAL Implementation          | Advanced              | 2 weeks    | ⭐⭐⭐⭐               | ⭐⭐⭐      |
| 5. Inverted Index              | Advanced              | 2 weeks    | ⭐⭐⭐⭐               | ⭐⭐⭐      |
| 6. Block Storage (Prometheus)  | Expert                | 3-4 weeks  | ⭐⭐⭐⭐⭐             | ⭐⭐⭐⭐    |
| 7. TSM Tree (InfluxDB)         | Expert                | 4-5 weeks  | ⭐⭐⭐⭐⭐             | ⭐⭐⭐⭐    |
| 8. Time-Based Sharding         | Advanced              | 2 weeks    | ⭐⭐⭐⭐               | ⭐⭐⭐      |
| 9. Downsampling Engine         | Advanced              | 2 weeks    | ⭐⭐⭐⭐               | ⭐⭐⭐      |
| 10. PromQL Query Engine        | Expert                | 3-4 weeks  | ⭐⭐⭐⭐⭐             | ⭐⭐⭐⭐⭐  |
| 11. High-Throughput Ingestion  | Expert                | 3-4 weeks  | ⭐⭐⭐⭐⭐             | ⭐⭐⭐⭐    |
| 12. Cardinality Management     | Advanced              | 2 weeks    | ⭐⭐⭐⭐               | ⭐⭐⭐      |
| 13. Distributed Store          | Master                | 6-8 weeks  | ⭐⭐⭐⭐⭐             | ⭐⭐⭐⭐⭐  |
| 14. Alerting Engine            | Advanced              | 2-3 weeks  | ⭐⭐⭐⭐               | ⭐⭐⭐⭐    |
| 15. Full TSDB                  | Master                | 3-6 months | ⭐⭐⭐⭐⭐             | ⭐⭐⭐⭐⭐  |

Recommendation

If you’re just starting out:

Start with Project 1 (In-Memory Store) and Project 2 (Delta Encoding). These give you the foundation without overwhelming complexity. You’ll understand the core data model and why compression is essential.

If you want to understand storage engines deeply:

Do Projects 4 (WAL), 6 (Block Storage), and 7 (TSM Tree). These are the heart of any TSDB. Understanding both Prometheus-style and InfluxDB-style approaches gives you perspective on the design space.

If you’re interested in query languages:

Focus on Project 10 (PromQL Engine). This is a self-contained compiler/interpreter project that teaches you how time series queries actually work.

If you want the “wow factor”:

Project 3 (Gorilla Compression) and Project 13 (Distributed Store) are the showstoppers. Implementing Facebook’s compression paper is impressive, and building a distributed system is always a challenge.

If you have 3-6 months:

Go for Project 15 (Full TSDB). This is the ultimate learning experience—you’ll face every challenge in TSDB design and come out understanding the field deeply.


Summary

| #  | Project                                       | Main Language |
| 1  | In-Memory Time Series Store                   | C             |
| 2  | Delta Encoding Compressor                     | C             |
| 3  | Gorilla Compression (Facebook Paper)          | C             |
| 4  | Write-Ahead Log (WAL) for Time Series         | C             |
| 5  | Inverted Index for Time Series Tags           | C             |
| 6  | Block-Based Storage Engine (Prometheus-Style) | Go            |
| 7  | TSM Tree Storage Engine (InfluxDB-Style)      | Go            |
| 8  | Time-Based Sharding System                    | Go            |
| 9  | Downsampling and Rollup Engine                | Go            |
| 10 | PromQL-Style Query Engine                     | Go            |
| 11 | High-Throughput Ingestion Pipeline            | Rust          |
| 12 | Cardinality Management System                 | Go            |
| 13 | Distributed Time Series Store                 | Go            |
| 14 | Real-Time Alerting Engine                     | Go            |
| 15 | Full-Featured TSDB from Scratch               | Rust          |

Sources

Time Series Database Architecture

Storage Engines

Compression

LSM Trees

High Cardinality

PromQL