Learn HAProxy: From Zero to Load Balancer Master
Goal: Deeply understand HAProxy—not just configuration, but how it works behind the scenes. Build your own high-performance load balancer from scratch in C, mastering event-driven programming, network protocols, and the art of handling millions of connections.
Why HAProxy Matters
HAProxy sits at the heart of modern infrastructure. It’s the invisible layer that:
- Routes traffic to your backend servers
- Terminates SSL/TLS connections
- Detects failed servers and routes around them
- Enables zero-downtime deployments
Understanding HAProxy’s internals teaches you:
- Event-driven architecture: How to handle 1M+ connections in a single process
- Network programming: TCP/IP, HTTP parsing, socket management
- High-performance C: Zero-copy buffers, cache-friendly code, SIMD optimization
- Systems programming: epoll/kqueue, non-blocking I/O, process management
After completing these projects, you will:
- Build a production-capable load balancer from scratch
- Understand every syscall HAProxy makes
- Know exactly why event-driven beats thread-per-connection
- Read HAProxy source code like documentation
HAProxy vs NGINX: Architecture Comparison
| Aspect | HAProxy | NGINX |
|---|---|---|
| Design Focus | Purpose-built load balancer | Web server with LB added |
| Process Model | Multi-threaded single process | Multi-process (workers) |
| Event Model | Event loop per thread | Event loop per worker |
| Protocol Focus | TCP/HTTP load balancing | HTTP serving + proxying |
| Configuration | Declarative sections | Hierarchical blocks |
| Health Checks | Very sophisticated | Basic |
| Stats/Metrics | 61+ metrics, rich dashboard | Basic stub_status |
| SSL Performance | Excellent | Slightly better |
| Static Content | Not designed for this | Excellent |
| L4 (TCP) LB | First-class | Supported |
| Connection Handling | Often 10-15% faster in raw proxying benchmarks | Strong, slightly behind for pure LB |
Core Architecture Overview
┌─────────────────────────────────────────────────────────────────────┐
│ HAProxy Architecture │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ Master Process │ │
│ │ • Configuration parsing │ │
│ │ • Signal handling │ │
│ │ • Hot reload coordination │ │
│ └───────────────────────────┬─────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────▼─────────────────────────────────┐ │
│ │ Worker Process │ │
│ │ ┌─────────────────────────────────────────────────────┐ │ │
│ │ │ Event Loop (epoll/kqueue) │ │ │
│ │ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │ │
│ │ │ │ Thread 1│ │ Thread 2│ │ Thread N│ │ │ │
│ │ │ │ (events)│ │ (events)│ │ (events)│ │ │ │
│ │ │ └────┬────┘ └────┬────┘ └────┬────┘ │ │ │
│ │ │ │ │ │ │ │ │
│ │ │ └────────────┼────────────┘ │ │ │
│ │ │ ▼ │ │ │
│ │ │ ┌─────────────────────────────────────────────┐ │ │ │
│ │ │ │ Connection Table │ │ │ │
│ │ │ │ • Frontend connections (clients) │ │ │ │
│ │ │ │ • Backend connections (servers) │ │ │ │
│ │ │ │ • Connection state machines │ │ │ │
│ │ │ └─────────────────────────────────────────────┘ │ │ │
│ │ └─────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────────┐ │ │
│ │ │ Buffers │ │ Timers │ │ Health Checks │ │ │
│ │ │ (ring/pool)│ │ (wheel) │ │ (active/passive)│ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ Frontends │ │
│ │ • Bind to ports (80, 443, etc.) │ │
│ │ • Accept connections │ │
│ │ • SSL termination │ │
│ │ • Request parsing & routing │ │
│ └───────────────────────────┬─────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────▼─────────────────────────────────┐ │
│ │ Backends │ │
│ │ • Server pools │ │
│ │ • Load balancing algorithms │ │
│ │ • Connection pooling │ │
│ │ • Health checking │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────┘
Data Flow:
Client → Frontend (accept, parse) → Backend (select server) → Server
Server → Backend (receive) → Frontend (forward) → Client
Fundamental Concepts
- Event-Driven Model
- Single-threaded or multi-threaded event loop
- epoll (Linux) / kqueue (BSD/macOS) for I/O multiplexing
- Non-blocking sockets for all connections
- State machines for connection handling
- Frontend/Backend Model
- Frontend: Listening sockets, accepts connections, parses requests
- Backend: Server pools, selects server, forwards requests
- Server: Individual backend server with health state
- Connection Lifecycle
ACCEPT → READ_REQUEST → SELECT_BACKEND → CONNECT_SERVER → FORWARD_REQUEST → READ_RESPONSE → FORWARD_RESPONSE → CLOSE/KEEPALIVE
- Load Balancing Algorithms
- Round Robin: Rotate through servers
- Least Connections: Choose server with fewest active connections
- Source Hash: Same client IP always goes to same server
- URI Hash: Same URL always goes to same server
- Consistent Hashing: Minimal disruption when servers change
- Health Checking
- Active: Periodic probes (TCP connect, HTTP request)
- Passive: Track failed requests
- Server states: UP, DOWN, MAINT, DRAIN
- Buffers and Zero-Copy
- Ring buffers for streaming data
- Splice/sendfile for zero-copy when possible
- Buffer pools to avoid allocation
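The ring-buffer idea above can be sketched minimally. This is an illustration, not HAProxy's actual buffer code: names like `ring_write` are mine, and the capacity is an arbitrary power of two so the index wrap is a cheap mask instead of a modulo.

```c
#include <stddef.h>
#include <stdint.h>

// Capacity must be a power of two for the mask trick below (hypothetical size)
#define RING_CAP 4096

struct ring {
    uint8_t data[RING_CAP];
    size_t  head;   // total bytes ever written (free-running counter)
    size_t  tail;   // total bytes ever read (free-running counter)
};

static size_t ring_used(const struct ring *r) { return r->head - r->tail; }
static size_t ring_free(const struct ring *r) { return RING_CAP - ring_used(r); }

// Copy up to len bytes in; returns how many were actually stored
size_t ring_write(struct ring *r, const uint8_t *src, size_t len) {
    size_t n = len < ring_free(r) ? len : ring_free(r);
    for (size_t i = 0; i < n; i++)
        r->data[(r->head + i) & (RING_CAP - 1)] = src[i];
    r->head += n;
    return n;
}

// Copy up to len bytes out; returns how many were actually read
size_t ring_read(struct ring *r, uint8_t *dst, size_t len) {
    size_t n = len < ring_used(r) ? len : ring_used(r);
    for (size_t i = 0; i < n; i++)
        dst[i] = r->data[(r->tail + i) & (RING_CAP - 1)];
    r->tail += n;
    return n;
}
```

Because head and tail are free-running counters, full and empty are unambiguous (used == 0 vs used == RING_CAP) without sacrificing one slot.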
Project List
Projects are ordered from foundational concepts to advanced implementations. All projects are in C.
Project 1: TCP Echo Server with Select/Poll
- File: LEARN_HAPROXY_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 2: Practical but Forgettable
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 2: Intermediate
- Knowledge Area: Network Programming / Systems
- Software or Tool: TCP Server Foundation
- Main Book: “The Linux Programming Interface” by Michael Kerrisk
What you’ll build: A simple TCP echo server that handles multiple clients using select() or poll(). This is the foundation—before epoll, you need to understand the basics.
Why it teaches HAProxy: HAProxy evolved from simpler I/O multiplexing to epoll. Understanding select/poll’s limitations explains why epoll exists and how event loops work.
Core challenges you’ll face:
- Socket creation and binding → maps to HAProxy’s bind directive
- Accepting connections → maps to frontend connection handling
- Multiplexing with select/poll → maps to event loop basics
- Non-blocking I/O → maps to HAProxy’s core I/O model
Key Concepts:
- Socket Programming: “The Linux Programming Interface” Chapter 56-61 - Michael Kerrisk
- select() and poll(): “Beej’s Guide to Network Programming” - Brian Hall
- Non-blocking I/O: Non-Blocking Sockets Tutorial
Difficulty: Intermediate. Time estimate: 1 week. Prerequisites: Basic C programming, understanding of TCP/IP.
Real world outcome:
# Compile and run
$ gcc -o echoserver echoserver.c
$ ./echoserver 8080
Echo server listening on port 8080
Max clients: 1024
# In another terminal
$ nc localhost 8080
Hello, server!
Hello, server! # Echoed back
# Multiple clients work simultaneously
$ for i in {1..100}; do echo "Client $i" | nc localhost 8080 & done
Client 1
Client 2
...
# Server output
[Client 1] Connected from 127.0.0.1:54321
[Client 1] Received: Hello, server!
[Client 1] Sent: Hello, server!
[Client 2] Connected from 127.0.0.1:54322
...
Active connections: 100
Implementation Hints:
Basic socket setup:
int create_listening_socket(int port) {
int sockfd = socket(AF_INET, SOCK_STREAM, 0);
// Allow address reuse (important for quick restarts)
int opt = 1;
setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
struct sockaddr_in addr = {
.sin_family = AF_INET,
.sin_addr.s_addr = INADDR_ANY,
.sin_port = htons(port)
};
bind(sockfd, (struct sockaddr*)&addr, sizeof(addr));
listen(sockfd, SOMAXCONN);
// Make non-blocking
fcntl(sockfd, F_SETFL, fcntl(sockfd, F_GETFL, 0) | O_NONBLOCK); // preserve existing flags
return sockfd;
}
Event loop with poll():
#define MAX_CLIENTS 1024
struct pollfd fds[MAX_CLIENTS];
int nfds = 1;
// Add listening socket
fds[0].fd = listen_fd;
fds[0].events = POLLIN;
while (1) {
int ready = poll(fds, nfds, -1); // Wait forever
// Check listening socket for new connections
if (fds[0].revents & POLLIN) {
int client_fd = accept(listen_fd, NULL, NULL);
fcntl(client_fd, F_SETFL, fcntl(client_fd, F_GETFL, 0) | O_NONBLOCK);
fds[nfds].fd = client_fd;
fds[nfds].events = POLLIN;
nfds++;
}
// Check client sockets for data
for (int i = 1; i < nfds; i++) {
if (fds[i].revents & POLLIN) {
char buf[1024];
int n = read(fds[i].fd, buf, sizeof(buf));
if (n <= 0) {
close(fds[i].fd);
// Remove from array...
} else {
write(fds[i].fd, buf, n); // Echo back
}
}
}
}
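One way to fill in the "remove from array" step: swap the dead slot with the last entry, since poll() does not care about the ordering of client slots. A sketch, with `remove_client` being a name I'm introducing:

```c
#include <poll.h>
#include <unistd.h>

// Drop slot i by overwriting it with the last entry and shrinking the
// count: O(1), no shifting. Slot 0 (the listening socket) is never
// removed this way.
void remove_client(struct pollfd *fds, int *nfds, int i) {
    close(fds[i].fd);
    fds[i] = fds[*nfds - 1];
    *nfds -= 1;
}
```

After the swap, the event loop should re-examine index i on the same pass, since it now holds a different descriptor.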
Learning milestones:
- Single client works → You understand basic sockets
- Multiple clients work → You understand multiplexing
- No blocking on slow clients → You understand non-blocking I/O
- You feel the O(n) pain at 10K clients → You’re ready for epoll
Project 2: High-Performance Event Loop with epoll
- File: LEARN_HAPROXY_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 3: Advanced
- Knowledge Area: Systems Programming / High Performance
- Software or Tool: Event Loop Library
- Main Book: “The Linux Programming Interface” by Michael Kerrisk
What you’ll build: A high-performance event loop using epoll (Linux) that can handle 100K+ connections. This is HAProxy’s core—the engine that makes everything else possible.
Why it teaches HAProxy: HAProxy is built around an event loop that waits for events with epoll and processes them as fast as possible. This IS HAProxy’s architecture at its core.
Core challenges you’ll face:
- epoll_create, epoll_ctl, epoll_wait → maps to HAProxy’s poller abstraction
- Edge-triggered vs level-triggered → maps to performance optimization
- Handling thousands of connections → maps to C10K/C100K problem
- Timer management → maps to timeout handling
Resources for key challenges:
- Kernel Queue Complete Guide - epoll, kqueue, IOCP comparison
- epoll Tutorial - Non-blocking sockets with epoll
Key Concepts:
- epoll API: “The Linux Programming Interface” Chapter 63 - Michael Kerrisk
- Edge-Triggered Mode: Essential for performance
- Timer Wheel: Efficient timeout management
Difficulty: Advanced. Time estimate: 2 weeks. Prerequisites: Project 1 completed.
Real world outcome:
# Compile with optimization
$ gcc -O3 -o eventloop eventloop.c
$ ./eventloop 8080
Event loop started (epoll)
Listening on port 8080
# Benchmark with wrk
$ wrk -t12 -c10000 -d30s http://localhost:8080/
Running 30s test @ http://localhost:8080/
12 threads and 10000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 1.23ms 2.45ms 50.12ms 95.23%
Req/Sec 85.23k 12.34k 120.45k 72.00%
30,567,890 requests in 30.00s, 3.45GB read
Requests/sec: 1,018,929.67
Transfer/sec: 117.89MB
# Server stats
Active connections: 10000
Events processed: 61,234,567
Average event latency: 0.89μs
Implementation Hints:
epoll event loop:
#define MAX_EVENTS 1024
int epfd = epoll_create1(0);
// Add listening socket
struct epoll_event ev = {
.events = EPOLLIN | EPOLLET, // Edge-triggered!
.data.fd = listen_fd
};
epoll_ctl(epfd, EPOLL_CTL_ADD, listen_fd, &ev);
struct epoll_event events[MAX_EVENTS];
while (1) {
int nready = epoll_wait(epfd, events, MAX_EVENTS, 1000); // 1s timeout
for (int i = 0; i < nready; i++) {
int fd = events[i].data.fd;
if (fd == listen_fd) {
// Accept ALL pending connections (edge-triggered!)
while (1) {
int client = accept(listen_fd, NULL, NULL);
if (client < 0) {
if (errno == EAGAIN || errno == EWOULDBLOCK)
break; // No more pending
perror("accept");
break;
}
fcntl(client, F_SETFL, fcntl(client, F_GETFL, 0) | O_NONBLOCK);
struct epoll_event cev = {
// Note: with ET, a permanently-registered EPOLLOUT only fires on the
// not-writable -> writable transition; real proxies toggle it on demand
.events = EPOLLIN | EPOLLET | EPOLLOUT,
.data.fd = client
};
epoll_ctl(epfd, EPOLL_CTL_ADD, client, &cev);
}
} else {
// Edge-triggered events can report EPOLLIN and EPOLLOUT together;
// check both (not else-if), or a pending write gets lost
if (events[i].events & EPOLLIN)
handle_read(fd);
if (events[i].events & EPOLLOUT)
handle_write(fd);
}
}
}
Edge-triggered gotcha:
// With edge-triggered, you MUST read/write until EAGAIN
void handle_read(int fd) {
while (1) {
char buf[4096];
ssize_t n = read(fd, buf, sizeof(buf));
if (n < 0) {
if (errno == EAGAIN)
break; // Would block, done for now
// Real error, close connection
close(fd);
return;
}
if (n == 0) {
// EOF, client closed
close(fd);
return;
}
// Process data...
}
}
Timer wheel for timeouts:
#define WHEEL_SIZE 1024
#define TICK_MS 100
struct timer_entry {
int fd;
uint64_t expires_at;
struct timer_entry *next;
};
struct timer_entry *wheel[WHEEL_SIZE];
void add_timer(int fd, int timeout_ms) {
uint64_t now = get_time_ms();
uint64_t expires = now + timeout_ms;
int slot = (expires / TICK_MS) % WHEEL_SIZE;
struct timer_entry *entry = malloc(sizeof(*entry));
entry->fd = fd;
entry->expires_at = expires;
entry->next = wheel[slot];
wheel[slot] = entry;
}
void check_timers() {
uint64_t now = get_time_ms();
// NB: this only scans the current slot; call it at least once per
// TICK_MS (or walk every slot passed since the last call), otherwise
// entries in skipped slots are missed for a full wheel rotation
int slot = (now / TICK_MS) % WHEEL_SIZE;
struct timer_entry **pp = &wheel[slot];
while (*pp) {
if ((*pp)->expires_at <= now) {
// Timer expired! Close connection or handle timeout
handle_timeout((*pp)->fd);
struct timer_entry *expired = *pp;
*pp = expired->next;
free(expired);
} else {
pp = &(*pp)->next;
}
}
}
Learning milestones:
- Handles 10K connections → You understand epoll basics
- Edge-triggered works correctly → You understand the subtleties
- Timeouts work → You understand timer management
- Memory doesn’t grow → You understand resource management
Project 3: HTTP/1.1 Parser
- File: LEARN_HAPROXY_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 3: Advanced
- Knowledge Area: Protocol Parsing / Performance
- Software or Tool: HTTP Parser (like picohttpparser)
- Main Book: “HTTP: The Definitive Guide” by David Gourley
What you’ll build: A zero-copy, streaming HTTP/1.1 request parser. Parse method, path, headers, and body without allocating memory for each request.
Why it teaches HAProxy: HAProxy must parse HTTP at wire speed. Understanding incremental parsing, header extraction, and content-length handling explains HAProxy’s HTTP mode.
Core challenges you’ll face:
- Incremental parsing → maps to handling partial reads
- Zero-copy design → maps to performance optimization
- Header parsing → maps to HAProxy’s header manipulation
- Chunked encoding → maps to HTTP/1.1 transfer encoding
Resources for key challenges:
- picohttpparser - High-performance HTTP parser
- Cloudflare AVX2 Optimization - SIMD parsing
Key Concepts:
- HTTP/1.1 Protocol: RFC 7230-7235
- Streaming Parsers: State machine design
- SIMD Optimization: Using SSE4.2/AVX2 for fast parsing
Difficulty: Advanced. Time estimate: 2 weeks. Prerequisites: Understanding of HTTP protocol, state machines.
Real world outcome:
# Benchmark the parser
$ ./http_parser_bench
Parsing 1,000,000 requests...
Simple GET: 4,521,345 req/sec (221 ns/req)
GET with 10 headers: 2,345,678 req/sec (426 ns/req)
POST with body: 1,987,654 req/sec (503 ns/req)
# Test with real requests
$ echo -e "GET /path HTTP/1.1\r\nHost: example.com\r\n\r\n" | ./http_parser
Parsed HTTP Request:
Method: GET
Path: /path
Version: HTTP/1.1
Headers:
Host: example.com
Body: (none)
Parse time: 156 ns
# Handle partial data (streaming)
$ ./http_parser_test --partial
Feeding "GET /pa"... INCOMPLETE (need more data)
Feeding "th HTTP/1.1\r\n"... INCOMPLETE (need headers)
Feeding "Host: x\r\n\r\n"... COMPLETE!
Implementation Hints:
HTTP request structure (zero-copy):
struct http_header {
const char *name; // Pointer into buffer
size_t name_len;
const char *value; // Pointer into buffer
size_t value_len;
};
struct http_request {
const char *method;
size_t method_len;
const char *path;
size_t path_len;
int minor_version; // HTTP/1.x
struct http_header headers[64];
size_t num_headers;
const char *body;
size_t body_len;
size_t content_length;
// Parser state
int state;
size_t bytes_parsed;
};
Incremental parser (state machine):
enum parser_state {
S_METHOD,
S_PATH,
S_VERSION,
S_HEADER_NAME,
S_HEADER_VALUE,
S_BODY,
S_DONE
};
int parse_http_request(struct http_request *req, const char *buf, size_t len) {
const char *p = buf + req->bytes_parsed;
const char *end = buf + len;
while (p < end) {
switch (req->state) {
case S_METHOD:
// Find space after method
while (p < end && *p != ' ') {
if (!is_token_char(*p)) return -1; // Invalid
p++;
}
if (p == end) {
req->bytes_parsed = p - buf;
return 0; // Need more data
}
req->method = buf;
req->method_len = p - buf;
p++; // Skip space
req->state = S_PATH;
break;
case S_PATH:
req->path = p;
while (p < end && *p != ' ') p++;
if (p == end) {
req->bytes_parsed = req->path - buf;
return 0; // Need more data
}
req->path_len = p - req->path;
p++; // Skip space
req->state = S_VERSION;
break;
// ... more states for version, headers, body
}
}
req->bytes_parsed = p - buf;
return (req->state == S_DONE) ? 1 : 0;
}
Fast header search (case-insensitive):
// Find header value by name
const char *find_header(struct http_request *req, const char *name, size_t *len) {
size_t name_len = strlen(name);
for (size_t i = 0; i < req->num_headers; i++) {
if (req->headers[i].name_len == name_len &&
strncasecmp(req->headers[i].name, name, name_len) == 0) {
*len = req->headers[i].value_len;
return req->headers[i].value;
}
}
return NULL;
}
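The parser above calls `is_token_char()` without defining it. A sketch based on RFC 7230's `tchar` production, the set of characters legal in a method or header name:

```c
#include <stdbool.h>

// RFC 7230 tchar: ALPHA / DIGIT / "!" / "#" / "$" / "%" / "&" / "'" /
// "*" / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~"
bool is_token_char(unsigned char c) {
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
        (c >= '0' && c <= '9'))
        return true;
    switch (c) {
    case '!': case '#': case '$': case '%': case '&': case '\'':
    case '*': case '+': case '-': case '.': case '^': case '_':
    case '`': case '|': case '~':
        return true;
    default:
        return false;                    // rejects SP, CTLs, separators
    }
}
```

A table-driven version (a 256-entry lookup array) is what fast parsers like picohttpparser actually use, since it avoids the branches entirely.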
Learning milestones:
- Simple requests parse → You understand HTTP format
- Partial data handles correctly → You understand streaming
- All headers extracted → You understand header parsing
- Performance is good (>1M req/sec) → You understand optimization
Project 4: Round-Robin Load Balancer
- File: LEARN_HAPROXY_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Load Balancing / Networking
- Software or Tool: Basic Load Balancer
- Main Book: “High Performance Browser Networking” by Ilya Grigorik
What you’ll build: A TCP load balancer that distributes connections across multiple backend servers using round-robin. This combines your event loop and parser into a working proxy.
Why it teaches HAProxy: This is HAProxy’s core functionality. You’ll understand how connections are accepted, routed to backends, and data is proxied bidirectionally.
Core challenges you’ll face:
- Accepting and forwarding → maps to frontend/backend model
- Bidirectional proxying → maps to HAProxy’s stream processing
- Connection pairing → maps to client-server association
- Backend selection → maps to load balancing algorithms
Key Concepts:
- Round-Robin Algorithm: ByteByteGo Load Balancing
- TCP Proxying: Connection splicing pattern
- Backend Pools: Server grouping and selection
Difficulty: Advanced. Time estimate: 2 weeks. Prerequisites: Projects 1-3 completed.
Real world outcome:
# Start backend servers (simple echo servers)
$ ./echoserver 9001 &
$ ./echoserver 9002 &
$ ./echoserver 9003 &
# Start load balancer
$ ./loadbalancer --frontend 8080 --backends 127.0.0.1:9001,127.0.0.1:9002,127.0.0.1:9003
Load Balancer Started
Frontend: 0.0.0.0:8080
Backends:
[0] 127.0.0.1:9001 (weight=1)
[1] 127.0.0.1:9002 (weight=1)
[2] 127.0.0.1:9003 (weight=1)
Algorithm: round-robin
# Connections are distributed
$ for i in {1..9}; do echo "Request $i" | nc localhost 8080; done
Request 1 # Went to :9001
Request 2 # Went to :9002
Request 3 # Went to :9003
Request 4 # Went to :9001 (wraps around)
...
# Check stats
$ curl localhost:8081/stats
{
"frontend": {
"connections_total": 9,
"connections_active": 0,
"bytes_in": 90,
"bytes_out": 90
},
"backends": [
{"address": "127.0.0.1:9001", "connections": 3, "status": "UP"},
{"address": "127.0.0.1:9002", "connections": 3, "status": "UP"},
{"address": "127.0.0.1:9003", "connections": 3, "status": "UP"}
]
}
Implementation Hints:
Connection structure:
struct connection {
int client_fd;
int server_fd;
struct backend *backend;
// Buffers for bidirectional proxying
char client_buf[8192];
size_t client_buf_len;
size_t client_buf_sent;
char server_buf[8192];
size_t server_buf_len;
size_t server_buf_sent;
// State
enum conn_state state;
uint64_t created_at;
uint64_t last_activity;
};
struct backend {
char *address;
int port;
int weight;
int active_connections;
uint64_t total_connections;
enum { UP, DOWN, MAINT } status;
};
Round-robin selection:
static unsigned int rr_counter = 0; // unsigned: wraparound stays non-negative, so % n is safe
struct backend *select_backend_roundrobin(struct backend *backends, int n) {
int attempts = 0;
while (attempts < n) {
int idx = rr_counter % n;
rr_counter++;
if (backends[idx].status == UP) {
return &backends[idx];
}
attempts++;
}
return NULL; // All backends down!
}
Bidirectional proxy:
void handle_connection(struct connection *conn, uint32_t events) {
// Client has data to read
if (events & EPOLLIN && conn->client_fd >= 0) {
ssize_t n = read(conn->client_fd, conn->client_buf + conn->client_buf_len,
sizeof(conn->client_buf) - conn->client_buf_len);
if (n > 0) {
conn->client_buf_len += n;
// Enable write to server
modify_epoll(conn->server_fd, EPOLLIN | EPOLLOUT);
} else if (n == 0) {
// Client closed, shutdown server write
shutdown(conn->server_fd, SHUT_WR);
}
}
// Server ready to receive
if (events & EPOLLOUT && conn->server_fd >= 0) {
if (conn->client_buf_len > conn->client_buf_sent) {
ssize_t n = write(conn->server_fd,
conn->client_buf + conn->client_buf_sent,
conn->client_buf_len - conn->client_buf_sent);
if (n > 0) {
conn->client_buf_sent += n;
if (conn->client_buf_sent == conn->client_buf_len) {
conn->client_buf_len = 0;
conn->client_buf_sent = 0;
}
}
}
}
// Similar for server → client direction...
}
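The elided server-to-client direction mirrors the code above, and the drain step both directions share can be factored into one helper. A sketch under that assumption; `flush_buffer` is a name I'm introducing, not part of the struct shown earlier:

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

// Drain whatever is buffered toward dst_fd. Returns bytes written,
// 0 if nothing was pending (or the write would block), -1 on a fatal
// error. buf_len/buf_sent reset together once fully drained, exactly
// like the client->server path above.
ssize_t flush_buffer(int dst_fd, char *buf, size_t *buf_len, size_t *buf_sent) {
    if (*buf_len <= *buf_sent)
        return 0;                                   // nothing pending
    ssize_t n = write(dst_fd, buf + *buf_sent, *buf_len - *buf_sent);
    if (n < 0)
        return (errno == EAGAIN || errno == EWOULDBLOCK) ? 0 : -1;
    *buf_sent += (size_t)n;
    if (*buf_sent == *buf_len) {                    // fully drained: reuse
        *buf_len = 0;
        *buf_sent = 0;
    }
    return n;
}
```

With this helper, the server-to-client branch becomes `flush_buffer(conn->client_fd, conn->server_buf, &conn->server_buf_len, &conn->server_buf_sent)`, and a -1 return tears down the whole connection pair.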
Learning milestones:
- Connections route to backends → You understand forwarding
- Data flows both directions → You understand bidirectional proxy
- Round-robin works correctly → You understand load balancing
- Dead servers are skipped → You’re ready for health checks
Project 5: Advanced Load Balancing Algorithms
- File: LEARN_HAPROXY_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Algorithms / Distributed Systems
- Software or Tool: Load Balancing Library
- Main Book: “Designing Data-Intensive Applications” by Martin Kleppmann
What you’ll build: Implement multiple load balancing algorithms: least connections, weighted round-robin, source IP hash, and consistent hashing. Make them pluggable.
Why it teaches HAProxy: HAProxy supports many algorithms for different use cases. Understanding when to use each explains HAProxy’s balance directive options.
Core challenges you’ll face:
- Least connections tracking → maps to HAProxy’s leastconn
- Weighted distribution → maps to HAProxy’s weight parameter
- Consistent hashing → maps to HAProxy’s hash-type consistent
- Session persistence → maps to HAProxy’s stick-tables
Key Concepts:
- Load Balancing Algorithms: ByteByteGo Comparison
- Consistent Hashing: Maglev Paper
- Session Affinity: Sticky sessions patterns
Difficulty: Advanced. Time estimate: 2 weeks. Prerequisites: Project 4 completed.
Real world outcome:
# Test different algorithms
$ ./loadbalancer --algorithm leastconn --backends 9001,9002,9003
# Create uneven load
$ (while true; do curl -s localhost:8080; done) & # Slow client to :9001
$ for i in {1..100}; do curl -s localhost:8080; done
# Leastconn sends new requests to :9002, :9003 while :9001 is busy
# Source IP hashing (same client always goes to same server)
$ ./loadbalancer --algorithm source --backends 9001,9002,9003
$ curl localhost:8080 # Always goes to same backend
$ curl localhost:8080 # Same backend!
# Consistent hashing (minimal disruption when servers change)
$ ./loadbalancer --algorithm consistent --backends 9001,9002,9003
# Remove one backend, only ~1/3 of connections remap
$ ./loadbalancer --algorithm consistent --backends 9001,9002
# Requests that went to 9003 now distributed to 9001, 9002
# Requests that went to 9001, 9002 stay there!
# Stats show distribution
$ curl localhost:8081/stats
Algorithm: consistent-hash (256 virtual nodes)
9001: 34.2% (3,420 requests)
9002: 33.1% (3,310 requests)
9003: 32.7% (3,270 requests)
Implementation Hints:
Least connections:
struct backend *select_leastconn(struct backend *backends, int n) {
struct backend *best = NULL;
int min_conns = INT_MAX;
for (int i = 0; i < n; i++) {
if (backends[i].status != UP) continue;
// Weight-adjusted: connections / weight
int adjusted = backends[i].active_connections * 100 / backends[i].weight;
if (adjusted < min_conns) {
min_conns = adjusted;
best = &backends[i];
}
}
return best;
}
Source IP hash:
struct backend *select_source_hash(struct backend *backends, int n,
struct sockaddr_in *client_addr) {
uint32_t hash = hash_ip(client_addr->sin_addr.s_addr);
// Count UP servers
int up_count = 0;
for (int i = 0; i < n; i++)
if (backends[i].status == UP) up_count++;
if (up_count == 0) return NULL;
int target = hash % up_count;
int current = 0;
for (int i = 0; i < n; i++) {
if (backends[i].status == UP) {
if (current == target) return &backends[i];
current++;
}
}
return NULL;
}
Consistent hashing with virtual nodes:
#define VIRTUAL_NODES 256
struct ring_entry {
uint32_t hash;
int backend_idx;
};
struct consistent_hash {
struct ring_entry ring[MAX_BACKENDS * VIRTUAL_NODES];
int ring_size;
};
void build_ring(struct consistent_hash *ch, struct backend *backends, int n) {
ch->ring_size = 0;
for (int i = 0; i < n; i++) {
if (backends[i].status != UP) continue;
for (int v = 0; v < VIRTUAL_NODES; v++) {
char key[256];
snprintf(key, sizeof(key), "%s:%d-%d",
backends[i].address, backends[i].port, v);
ch->ring[ch->ring_size].hash = hash_string(key);
ch->ring[ch->ring_size].backend_idx = i;
ch->ring_size++;
}
}
// Sort by hash
qsort(ch->ring, ch->ring_size, sizeof(ch->ring[0]), compare_ring_entry);
}
struct backend *select_consistent_hash(struct consistent_hash *ch,
struct backend *backends,
const char *key) {
if (ch->ring_size == 0) return NULL;
uint32_t hash = hash_string(key);
// Binary search for first entry >= hash
int lo = 0, hi = ch->ring_size;
while (lo < hi) {
int mid = (lo + hi) / 2;
if (ch->ring[mid].hash < hash)
lo = mid + 1;
else
hi = mid;
}
// Wrap around
if (lo == ch->ring_size) lo = 0;
return &backends[ch->ring[lo].backend_idx];
}
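Both `hash_ip()` and `hash_string()` are left undefined above. FNV-1a is a reasonable stand-in: tiny, branch-free, with decent distribution (HAProxy itself ships several hash functions selectable via `hash-type`, e.g. sdbm, djb2, wt6, crc32):

```c
#include <stdint.h>

// 32-bit FNV-1a over a NUL-terminated string
uint32_t hash_string(const char *s) {
    uint32_t h = 2166136261u;            // FNV offset basis
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;                  // FNV prime
    }
    return h;
}

// Same mix over the four raw bytes of an IPv4 address (network order
// doesn't matter here as long as it's consistent)
uint32_t hash_ip(uint32_t addr) {
    uint32_t h = 2166136261u;
    for (int i = 0; i < 4; i++) {
        h ^= (addr >> (i * 8)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}
```

For consistent hashing specifically, avalanche quality matters more than speed, since clustered hashes make the ring distribution uneven across virtual nodes.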
Learning milestones:
- Leastconn balances under uneven load → You understand connection tracking
- Source hash provides affinity → You understand session persistence
- Consistent hash minimizes remapping → You understand the algorithm
- Weighted distribution is accurate → You understand weight handling
Project 6: Health Checking System
- File: LEARN_HAPROXY_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Monitoring / Reliability
- Software or Tool: Health Check System
- Main Book: “Site Reliability Engineering” by Google
What you’ll build: A health checking system with TCP checks, HTTP checks, configurable intervals, thresholds, and graceful up/down transitions.
Why it teaches HAProxy: HAProxy’s health checking is sophisticated—it’s not just “is port open?” Understanding rise/fall thresholds and check intervals explains HAProxy’s option httpchk and related directives.
Core challenges you’ll face:
- Non-blocking health checks → maps to parallel checking
- Rise/fall thresholds → maps to avoiding flapping
- HTTP health checks → maps to HAProxy’s httpchk
- Health state transitions → maps to UP/DOWN/DRAIN states
Key Concepts:
- Health Check Patterns: “Site Reliability Engineering” Chapter 22
- Circuit Breaker: Prevent cascading failures
- Graceful Degradation: Slow drain vs hard failure
Difficulty: Advanced. Time estimate: 1-2 weeks. Prerequisites: Projects 2-4 completed.
Real world outcome:
# Configure health checks
$ ./loadbalancer --backends 9001,9002,9003 \
--health-check tcp \
--check-interval 2000 \
--rise 3 \
--fall 2
Health Check Configuration:
Type: TCP connect
Interval: 2000ms
Rise threshold: 3 (consecutive successes to go UP)
Fall threshold: 2 (consecutive failures to go DOWN)
# Simulate backend failure
$ kill $(lsof -ti :9002)
# Logs show transition
[14:32:45] Backend 127.0.0.1:9002: check FAILED (1/2)
[14:32:47] Backend 127.0.0.1:9002: check FAILED (2/2) -> DOWN
[14:32:47] Backend 127.0.0.1:9002: marked DOWN, 0 active connections
# Restart backend
$ ./echoserver 9002 &
[14:33:01] Backend 127.0.0.1:9002: check OK (1/3)
[14:33:03] Backend 127.0.0.1:9002: check OK (2/3)
[14:33:05] Backend 127.0.0.1:9002: check OK (3/3) -> UP
[14:33:05] Backend 127.0.0.1:9002: marked UP
# HTTP health checks
$ ./loadbalancer --backends 9001,9002,9003 \
--health-check http \
--health-uri /health \
--health-expect 200
[14:35:00] Backend 9001: HTTP GET /health -> 200 OK (5ms)
[14:35:00] Backend 9002: HTTP GET /health -> 503 Service Unavailable -> FAIL
[14:35:00] Backend 9003: HTTP GET /health -> 200 OK (3ms)
Implementation Hints:
Health check state machine:
enum health_state {
HEALTH_UP,
HEALTH_DOWN,
HEALTH_CHECKING_UP, // Was DOWN, checking if UP
HEALTH_CHECKING_DOWN // Was UP, checking if DOWN
};
struct health_check {
enum health_state state;
int consecutive_success;
int consecutive_failure;
int rise_threshold;
int fall_threshold;
uint64_t last_check;
uint64_t next_check;
int check_fd; // For async check
};
void update_health_state(struct backend *b, bool check_passed) {
struct health_check *h = &b->health;
if (check_passed) {
h->consecutive_success++;
h->consecutive_failure = 0;
if (h->state == HEALTH_DOWN || h->state == HEALTH_CHECKING_UP) {
if (h->consecutive_success >= h->rise_threshold) {
h->state = HEALTH_UP;
b->status = UP;
log_info("Backend %s: marked UP", b->address);
} else {
h->state = HEALTH_CHECKING_UP;
}
}
} else {
h->consecutive_failure++;
h->consecutive_success = 0;
if (h->state == HEALTH_UP || h->state == HEALTH_CHECKING_DOWN) {
if (h->consecutive_failure >= h->fall_threshold) {
h->state = HEALTH_DOWN;
b->status = DOWN;
log_info("Backend %s: marked DOWN", b->address);
} else {
h->state = HEALTH_CHECKING_DOWN;
}
}
}
}
Non-blocking TCP check:
int start_tcp_check(struct backend *b) {
int fd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
struct sockaddr_in addr = {
.sin_family = AF_INET,
.sin_port = htons(b->port)
};
inet_pton(AF_INET, b->address, &addr.sin_addr);
int ret = connect(fd, (struct sockaddr*)&addr, sizeof(addr));
if (ret == 0) {
// Immediate success (unlikely for non-blocking)
close(fd);
return 1; // Check passed
}
if (errno == EINPROGRESS) {
// Connection in progress, add to epoll
b->health.check_fd = fd;
struct epoll_event ev = {
.events = EPOLLOUT, // Wait for connect to complete
.data.ptr = b
};
epoll_ctl(health_epfd, EPOLL_CTL_ADD, fd, &ev);
return 0; // Pending
}
close(fd);
return -1; // Check failed
}
void complete_tcp_check(struct backend *b, uint32_t events) {
int err;
socklen_t len = sizeof(err);
getsockopt(b->health.check_fd, SOL_SOCKET, SO_ERROR, &err, &len);
bool passed = (err == 0);
// Deregister before closing: EPOLL_CTL_DEL on an already-closed fd fails with EBADF
epoll_ctl(health_epfd, EPOLL_CTL_DEL, b->health.check_fd, NULL);
close(b->health.check_fd);
update_health_state(b, passed);
schedule_next_check(b);
}
HTTP health check:
int start_http_check(struct backend *b, const char *uri, int expect_status) {
// start_tcp_check() returns a status code (1 = connected, 0 = pending,
// -1 = failed), not a file descriptor; the connecting socket itself
// lives in b->health.check_fd
int ret = start_tcp_check(b);
if (ret < 0) return -1;
// Prepare HTTP request
char request[1024];
snprintf(request, sizeof(request),
"GET %s HTTP/1.1\r\n"
"Host: %s\r\n"
"Connection: close\r\n"
"\r\n",
uri, b->address);
// Store for sending once the connect completes
b->health.http_request = strdup(request);
b->health.expect_status = expect_status;
return ret;
}
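To finish the check, the response's status line has to be compared against `expect_status`. A minimal parser for the first line of an HTTP/1.x response (a sketch; `parse_status_code` is a name I'm introducing):

```c
#include <stddef.h>
#include <string.h>

// Extract the status code from "HTTP/1.x NNN ...".
// Returns the three-digit code, or -1 if this isn't an HTTP status line.
int parse_status_code(const char *resp, size_t len) {
    if (len < 12 || strncmp(resp, "HTTP/1.", 7) != 0)
        return -1;
    const char *p = resp + 9;            // skip "HTTP/1.x " (9 bytes)
    if (p[0] < '0' || p[0] > '9' || p[1] < '0' || p[1] > '9' ||
        p[2] < '0' || p[2] > '9')
        return -1;
    return (p[0] - '0') * 100 + (p[1] - '0') * 10 + (p[2] - '0');
}
```

The check then passes when `parse_status_code(...) == b->health.expect_status`; HAProxy's `http-check expect status` generalizes this to ranges and regexes.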
Learning milestones:
- TCP checks detect down servers → You understand basic health checking
- Rise/fall prevents flapping → You understand thresholds
- HTTP checks validate application → You understand layer 7 checks
- Checks are non-blocking → You understand async checking
Project 7: Connection Pooling and Keep-Alive
- File: LEARN_HAPROXY_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Performance Optimization
- Software or Tool: Connection Pool
- Main Book: “High Performance Browser Networking” by Ilya Grigorik
What you’ll build: A connection pool for backend servers, reusing connections for multiple requests. Implement HTTP keep-alive handling on both frontend and backend.
Why it teaches HAProxy: Connection reuse is crucial for performance. HAProxy’s http-reuse directive and connection pooling explain how it achieves high throughput with fewer connections.
Core challenges you’ll face:
- Connection pool management → maps to HAProxy’s connection reuse
- HTTP keep-alive parsing → maps to request boundaries
- Idle timeout management → maps to pool size vs latency
- Thread-safe pool access → maps to multi-threaded HAProxy
Key Concepts:
- HTTP Keep-Alive: “High Performance Browser Networking” Chapter 11
- Connection Pooling: Amortizing TCP handshake cost
- Pool Sizing: Balancing memory vs latency
Difficulty: Advanced Time estimate: 2 weeks Prerequisites: Projects 3-4 completed
Real world outcome:
# Without pooling (connection per request)
$ ./loadbalancer --no-pool --backends 9001,9002
$ wrk -t4 -c100 -d10s http://localhost:8080/
Requests/sec: 15,234
Backend connections opened: 15,234
Average latency: 6.5ms
# With pooling
$ ./loadbalancer --pool-size 50 --backends 9001,9002
$ wrk -t4 -c100 -d10s http://localhost:8080/
Requests/sec: 45,678 # 3x improvement!
Backend connections opened: 50
Connection reuse rate: 99.9%
Average latency: 2.1ms
# Pool stats
$ curl localhost:8081/pool-stats
{
"backend_127.0.0.1:9001": {
"pool_size": 25,
"idle_connections": 5,
"active_connections": 20,
"total_requests": 22839,
"reuse_rate": 99.89
},
"backend_127.0.0.1:9002": {
"pool_size": 25,
"idle_connections": 8,
"active_connections": 17,
"total_requests": 22839,
"reuse_rate": 99.91
}
}
Implementation Hints:
Connection pool structure:
struct pooled_connection {
int fd;
struct backend *backend;
uint64_t last_used;
bool in_use;
struct pooled_connection *next; // For free list
};
struct connection_pool {
struct backend *backend;
struct pooled_connection *connections;
int pool_size;
int active_count;
int idle_count;
struct pooled_connection *free_list; // Idle connections
pthread_mutex_t lock; // For thread safety
};
Acquiring and releasing connections:
struct pooled_connection *pool_acquire(struct connection_pool *pool) {
    pthread_mutex_lock(&pool->lock);
    // Try to get an idle connection first
    if (pool->free_list) {
        struct pooled_connection *conn = pool->free_list;
        pool->free_list = conn->next;
        conn->in_use = true;
        pool->idle_count--;
        pool->active_count++;
        pthread_mutex_unlock(&pool->lock);
        return conn;
    }
    // Create a new connection if under the limit. Reserve the slot while
    // still holding the lock, so two threads can't both pass the check
    // and overshoot pool_size.
    if (pool->active_count < pool->pool_size) {
        pool->active_count++;              // reserve before unlocking
        pthread_mutex_unlock(&pool->lock);
        int fd = connect_to_backend(pool->backend);
        struct pooled_connection *conn = (fd >= 0) ? malloc(sizeof(*conn)) : NULL;
        if (!conn) {
            if (fd >= 0) close(fd);
            pthread_mutex_lock(&pool->lock);
            pool->active_count--;          // release the reserved slot
            pthread_mutex_unlock(&pool->lock);
            return NULL;
        }
        conn->fd = fd;
        conn->backend = pool->backend;
        conn->in_use = true;
        conn->next = NULL;
        return conn;
    }
    pthread_mutex_unlock(&pool->lock);
    return NULL; // Pool exhausted: caller must wait or fail the request
}
void pool_release(struct connection_pool *pool, struct pooled_connection *conn) {
pthread_mutex_lock(&pool->lock);
conn->in_use = false;
conn->last_used = get_time_ms();
// Return to free list
conn->next = pool->free_list;
pool->free_list = conn;
pool->idle_count++;
pool->active_count--;
pthread_mutex_unlock(&pool->lock);
}
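The "idle timeout management" challenge above needs one more piece: a reaper that periodically drops connections that have sat idle too long, since backends (and middleboxes) silently close stale connections. A hedged sketch, with pared-down stand-in types so the unlinking logic is visible; in the real pool this runs under the pool lock, decrements `idle_count`, and `close()`s each fd:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Minimal stand-in for the pooled_connection in the text (illustrative). */
struct pooled_connection {
    int fd;
    uint64_t last_used;
    struct pooled_connection *next;
};

/* Walk the free list, unlink and free connections idle longer than
 * idle_timeout_ms. Returns how many were reaped. The pointer-to-pointer
 * walk lets us unlink without tracking a separate "previous" node. */
int pool_reap_idle(struct pooled_connection **free_list,
                   uint64_t now_ms, uint64_t idle_timeout_ms) {
    int reaped = 0;
    struct pooled_connection **pp = free_list;
    while (*pp) {
        struct pooled_connection *c = *pp;
        if (now_ms - c->last_used > idle_timeout_ms) {
            *pp = c->next;   // unlink from the free list
            free(c);         // real code: close(c->fd) first
            reaped++;
        } else {
            pp = &c->next;
        }
    }
    return reaped;
}
```

Running this every few seconds from the event loop's timer keeps the pool from holding connections the backend has already abandoned.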
HTTP keep-alive handling:
// Determine if connection can be reused
bool can_reuse_connection(struct http_request *req, struct http_response *resp) {
// HTTP/1.0: must have explicit Connection: keep-alive
if (req->minor_version == 0) {
const char *conn = find_header(req, "Connection", NULL);
if (!conn || strcasecmp(conn, "keep-alive") != 0)
return false;
}
// HTTP/1.1: keep-alive is default, check for Connection: close
const char *conn = find_header(resp, "Connection", NULL);
if (conn && strcasecmp(conn, "close") == 0)
return false;
// Must have Content-Length or be chunked
if (!has_content_length(resp) && !is_chunked(resp))
return false;
return true;
}
Learning milestones:
- Connections are reused → You understand pooling basics
- Performance improves significantly → You understand the benefit
- Keep-alive boundaries work → You understand HTTP framing
- Pool handles load correctly → You understand resource management
Project 8: SSL/TLS Termination
- File: LEARN_HAPROXY_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 4. The “Open Core” Infrastructure
- Difficulty: Level 4: Expert
- Knowledge Area: Security / Cryptography
- Software or Tool: TLS Termination (like HAProxy SSL)
- Main Book: “Bulletproof SSL and TLS” by Ivan Ristić
What you’ll build: Add SSL/TLS termination to your load balancer using OpenSSL. Handle certificates, SNI for multiple domains, and TLS handshakes efficiently.
Why it teaches HAProxy: HAProxy often terminates TLS, offloading crypto from backends. Understanding SNI, session resumption, and certificate management explains HAProxy’s SSL configuration.
Core challenges you’ll face:
- TLS handshake integration → maps to HAProxy’s bind ssl
- SNI handling → maps to HAProxy’s crt directive
- Certificate management → maps to HAProxy’s CA configuration
- Performance optimization → maps to session tickets, OCSP stapling
Key Concepts:
- TLS Protocol: “Bulletproof SSL and TLS” Chapters 1-4
- SNI (Server Name Indication): Virtual hosting with TLS
- OpenSSL API: SSL_CTX, SSL_new, SSL_accept
Difficulty: Expert Time estimate: 3 weeks Prerequisites: Projects 1-4 completed, understanding of TLS
Real world outcome:
# Generate certificates
$ openssl req -x509 -newkey rsa:2048 -keyout key.pem -out cert.pem -days 365 -nodes
# Start with SSL
$ ./loadbalancer --ssl --cert cert.pem --key key.pem --frontend 8443 --backends 9001,9002
SSL Frontend: 0.0.0.0:8443
Certificate: cert.pem (CN=localhost)
Ciphers: TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256:...
# Test
$ curl -k https://localhost:8443/
Hello from backend!
# Check SSL handshake
$ openssl s_client -connect localhost:8443 -servername example.com
...
SSL-Session:
Protocol : TLSv1.3
Cipher : TLS_AES_256_GCM_SHA384
...
# SNI for multiple domains
$ ./loadbalancer --ssl \
--sni example.com:example.pem \
--sni api.example.com:api.pem \
--default-cert default.pem
# Stats
$ curl localhost:8081/ssl-stats
{
"handshakes": 1234,
"session_reuse_rate": 45.2,
"cipher_usage": {
"TLS_AES_256_GCM_SHA384": 890,
"TLS_CHACHA20_POLY1305_SHA256": 344
}
}
Implementation Hints:
SSL context setup:
#include <openssl/ssl.h>
#include <openssl/err.h>
SSL_CTX *create_ssl_context(const char *cert_file, const char *key_file) {
    SSL_CTX *ctx = SSL_CTX_new(TLS_server_method());
    if (!ctx) {
        ERR_print_errors_fp(stderr);
        exit(1);
    }
    SSL_CTX_set_min_proto_version(ctx, TLS1_2_VERSION);
    SSL_CTX_set_max_proto_version(ctx, TLS1_3_VERSION);
    // Load certificate and key
    if (SSL_CTX_use_certificate_file(ctx, cert_file, SSL_FILETYPE_PEM) <= 0) {
        ERR_print_errors_fp(stderr);
        exit(1);
    }
    if (SSL_CTX_use_PrivateKey_file(ctx, key_file, SSL_FILETYPE_PEM) <= 0) {
        ERR_print_errors_fp(stderr);
        exit(1);
    }
    if (!SSL_CTX_check_private_key(ctx)) {
        fprintf(stderr, "Certificate and private key do not match\n");
        exit(1);
    }
    // Cipher list covers TLS 1.2 and below; TLS 1.3 suites are configured
    // separately via SSL_CTX_set_ciphersuites()
    SSL_CTX_set_cipher_list(ctx, "ECDHE+AESGCM:DHE+AESGCM");
    return ctx;
}
SNI callback:
struct sni_entry {
char *hostname;
SSL_CTX *ctx;
};
struct sni_entry sni_table[MAX_SNI_ENTRIES];
int sni_count = 0;
int sni_callback(SSL *ssl, int *alert, void *arg) {
const char *servername = SSL_get_servername(ssl, TLSEXT_NAMETYPE_host_name);
if (!servername) return SSL_TLSEXT_ERR_NOACK;
for (int i = 0; i < sni_count; i++) {
if (strcasecmp(sni_table[i].hostname, servername) == 0) {
SSL_set_SSL_CTX(ssl, sni_table[i].ctx);
return SSL_TLSEXT_ERR_OK;
}
}
return SSL_TLSEXT_ERR_NOACK; // Use default context
}
// Setup
SSL_CTX_set_tlsext_servername_callback(default_ctx, sni_callback);
Non-blocking SSL handshake:
enum ssl_state {
SSL_HANDSHAKE,
SSL_ESTABLISHED,
SSL_SHUTDOWN
};
struct ssl_connection {
int fd;
SSL *ssl;
enum ssl_state state;
struct connection *conn; // Underlying connection
};
void handle_ssl_event(struct ssl_connection *sc, uint32_t events) {
if (sc->state == SSL_HANDSHAKE) {
int ret = SSL_accept(sc->ssl);
if (ret == 1) {
// Handshake complete!
sc->state = SSL_ESTABLISHED;
log_info("SSL handshake complete, cipher=%s",
SSL_get_cipher(sc->ssl));
} else {
int err = SSL_get_error(sc->ssl, ret);
if (err == SSL_ERROR_WANT_READ) {
modify_epoll(sc->fd, EPOLLIN);
} else if (err == SSL_ERROR_WANT_WRITE) {
modify_epoll(sc->fd, EPOLLOUT);
} else {
// Real error
close_ssl_connection(sc);
}
}
} else if (sc->state == SSL_ESTABLISHED) {
// Use SSL_read/SSL_write instead of read/write
if (events & EPOLLIN) {
char buf[4096];
int n = SSL_read(sc->ssl, buf, sizeof(buf));
// Handle data...
}
}
}
Learning milestones:
- TLS handshake works → You understand SSL integration
- SNI routes correctly → You understand virtual hosting
- Non-blocking handshake works → You understand async TLS
- Performance is acceptable → You understand optimization
Project 9: Configuration Parser and Hot Reload
- File: LEARN_HAPROXY_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 3: Advanced
- Knowledge Area: Systems Programming / DevOps
- Software or Tool: Configuration System
- Main Book: “The Linux Programming Interface” by Michael Kerrisk
What you’ll build: A configuration file parser (HAProxy-style) and a hot reload mechanism that applies new configs without dropping connections.
Why it teaches HAProxy: HAProxy’s configuration syntax and seamless reload are critical for production use. Understanding how config changes propagate explains operational aspects.
Core challenges you’ll face:
- Configuration parsing → maps to HAProxy’s config syntax
- Validation → maps to HAProxy’s -c flag
- Hot reload → maps to HAProxy’s -sf/-st
- Zero-downtime → maps to connection draining
Key Concepts:
- HAProxy Configuration: HAProxy Configuration Manual
- Graceful Restart: Unix signal handling, socket passing
- Configuration Validation: Fail-fast on syntax errors
Difficulty: Advanced Time estimate: 2 weeks Prerequisites: Projects 4-6 completed
Real world outcome:
# Create configuration file
$ cat loadbalancer.cfg
global
maxconn 10000
log stdout format short
defaults
timeout connect 5s
timeout client 30s
timeout server 30s
frontend http
bind *:8080
default_backend webservers
backend webservers
balance roundrobin
option httpchk GET /health
server web1 127.0.0.1:9001 check weight 1
server web2 127.0.0.1:9002 check weight 2
server web3 127.0.0.1:9003 check weight 1
# Validate configuration
$ ./loadbalancer -c -f loadbalancer.cfg
Configuration file is valid.
# Start with config
$ ./loadbalancer -f loadbalancer.cfg
# Hot reload (update config, apply without dropping connections)
$ vi loadbalancer.cfg # Add server web4
$ ./loadbalancer -f loadbalancer.cfg -sf $(cat /var/run/loadbalancer.pid)
[15:23:45] Received SIGUSR2, reloading configuration...
[15:23:45] New configuration validated
[15:23:45] Starting new worker with updated config
[15:23:45] Old worker draining 42 connections
[15:23:46] Old worker finished, 42 connections migrated
[15:23:46] Reload complete
Implementation Hints:
Configuration structures:
struct server_config {
char *name;
char *address;
int port;
int weight;
bool check_enabled;
char *check_uri;
};
struct backend_config {
char *name;
enum lb_algorithm algorithm;
struct server_config *servers;
int server_count;
};
struct frontend_config {
char *name;
char *bind_address;
int bind_port;
bool ssl_enabled;
char *ssl_cert;
char *ssl_key;
char *default_backend;
};
struct config {
int maxconn;
int timeout_connect_ms;
int timeout_client_ms;
int timeout_server_ms;
struct frontend_config *frontends;
int frontend_count;
struct backend_config *backends;
int backend_count;
};
Simple config parser:
struct config *parse_config(const char *filename) {
    FILE *f = fopen(filename, "r");
    if (!f) return NULL;  // Fail fast: a missing config is a fatal error
    struct config *cfg = calloc(1, sizeof(*cfg));
    char line[1024];
    char section[64] = "";
    while (fgets(line, sizeof(line), f)) {
char *p = line;
while (*p == ' ' || *p == '\t') p++; // Skip whitespace
if (*p == '#' || *p == '\n') continue; // Comment or empty
// Section headers
if (strncmp(p, "global", 6) == 0) {
strcpy(section, "global");
} else if (strncmp(p, "defaults", 8) == 0) {
strcpy(section, "defaults");
} else if (strncmp(p, "frontend ", 9) == 0) {
strcpy(section, "frontend");
// Create new frontend...
} else if (strncmp(p, "backend ", 8) == 0) {
strcpy(section, "backend");
// Create new backend...
} else {
// Parse directive within section
parse_directive(cfg, section, p);
}
}
fclose(f);
return cfg;
}
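`parse_directive()` is left elided above. The fiddliest part of it is parsing HAProxy-style durations like `timeout connect 5s`, where values carry unit suffixes and bare numbers default to milliseconds. A hedged sketch of that helper (the function name and exact semantics are illustrative, not HAProxy's parser):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Parse an HAProxy-style duration ("5s", "100ms", "2m", "1h"; a bare
 * number means milliseconds) into milliseconds. Returns -1 on a
 * malformed or negative value, so the validator can fail fast. */
long parse_duration_ms(const char *s) {
    char *end;
    long v = strtol(s, &end, 10);
    if (end == s || v < 0) return -1;          // no digits, or negative
    if (*end == '\0') return v;                // default unit: ms
    if (strcmp(end, "ms") == 0) return v;
    if (strcmp(end, "s")  == 0) return v * 1000;
    if (strcmp(end, "m")  == 0) return v * 60 * 1000;
    if (strcmp(end, "h")  == 0) return v * 3600 * 1000;
    return -1;                                 // unknown suffix
}
```

`parse_directive()` would then route `timeout connect <value>` through this into `cfg->timeout_connect_ms`, returning an error (with line number) on -1, which is what makes the `-c` validation mode useful.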
Hot reload with socket passing:
void handle_reload_signal(int sig) {
log_info("Received reload signal, starting new worker");
// Fork new worker
pid_t pid = fork();
if (pid == 0) {
// Child: exec new binary with updated config
char *argv[] = {"loadbalancer", "-f", config_path, "-x", socket_path, NULL};
execv("/path/to/loadbalancer", argv);
exit(1);
}
// Parent: enter draining mode
draining = true;
drain_start_time = get_time_ms();
// Wait for connections to finish, then exit
}
// New worker receives listening sockets via a Unix socket
int receive_sockets(const char *socket_path) {
    int sock = socket(AF_UNIX, SOCK_STREAM, 0);
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    strcpy(addr.sun_path, socket_path);
    connect(sock, (struct sockaddr*)&addr, sizeof(addr));
    // Receive file descriptors via SCM_RIGHTS -- the magic that allows
    // seamless reload. The control buffer must be wired up first, and
    // recvmsg() must complete BEFORE the cmsg headers are parsed.
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    char cbuf[CMSG_SPACE(sizeof(int))];
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = cbuf, .msg_controllen = sizeof(cbuf)
    };
    recvmsg(sock, &msg, 0);
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd; // Listening socket
}
Learning milestones:
- Config parses correctly → You understand the format
- Validation catches errors → You understand safety checks
- Hot reload works → You understand zero-downtime reload
- Connections survive reload → You understand socket passing
Project 10: Stats and Monitoring Dashboard
- File: LEARN_HAPROXY_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go
- Coolness Level: Level 3: Genuinely Clever
- Business Potential: 3. The “Service & Support” Model
- Difficulty: Level 2: Intermediate
- Knowledge Area: Monitoring / Observability
- Software or Tool: Stats Dashboard
- Main Book: “Site Reliability Engineering” by Google
What you’ll build: A real-time stats endpoint and simple web dashboard showing connections, request rates, latencies, and server health—like HAProxy’s famous stats page.
Why it teaches HAProxy: HAProxy’s stats page is legendary—61+ metrics per backend. Understanding what to measure and how to expose it explains observability in proxies.
Core challenges you’ll face:
- Metric collection → maps to atomic counters, histograms
- JSON/HTML output → maps to HAProxy stats formats
- Real-time updates → maps to rate calculations
- Latency percentiles → maps to P50, P95, P99
Key Concepts:
- HAProxy Stats: HAProxy Stats Documentation
- Histograms for Latency: HDR Histogram pattern
- Rate Calculation: Rolling windows
Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Basic HTTP serving, JSON knowledge
Real world outcome:
# Enable stats endpoint
$ ./loadbalancer -f config.cfg --stats-port 8081
# JSON stats
$ curl localhost:8081/stats
{
"uptime_seconds": 3600,
"total_connections": 1234567,
"active_connections": 42,
"requests_per_second": 1523.4,
"bytes_in": 15234567890,
"bytes_out": 45678901234,
"frontends": [
{
"name": "http",
"bind": "*:8080",
"status": "UP",
"connections": 42,
"requests": 456789
}
],
"backends": [
{
"name": "webservers",
"algorithm": "roundrobin",
"servers": [
{
"name": "web1",
"address": "127.0.0.1:9001",
"status": "UP",
"weight": 1,
"connections": 14,
"requests": 152341,
"latency_p50_ms": 2.3,
"latency_p95_ms": 15.2,
"latency_p99_ms": 45.1
}
]
}
]
}
# HTML dashboard
$ open http://localhost:8081/
# Prometheus-compatible metrics
$ curl localhost:8081/metrics
# HELP haproxy_frontend_connections Total connections
# TYPE haproxy_frontend_connections counter
haproxy_frontend_connections{frontend="http"} 456789
...
Implementation Hints:
Atomic counters:
#include <stdatomic.h>
struct server_stats {
atomic_uint_fast64_t connections_total;
atomic_uint_fast64_t connections_active;
atomic_uint_fast64_t bytes_in;
atomic_uint_fast64_t bytes_out;
atomic_uint_fast64_t requests;
atomic_uint_fast64_t errors;
// Latency histogram buckets (microseconds)
atomic_uint_fast64_t latency_bucket[16]; // <1ms, <2ms, <5ms, ...
};
void record_latency(struct server_stats *stats, uint64_t latency_us) {
int bucket = latency_to_bucket(latency_us);
atomic_fetch_add(&stats->latency_bucket[bucket], 1);
}
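`latency_to_bucket()` is referenced but not defined. A hedged sketch with roughly logarithmic bucket boundaries (the exact boundaries here are illustrative, chosen to match the `<1ms, <2ms, <5ms, ...` comment above), plus the companion function that turns bucket counts back into a coarse percentile estimate:

```c
#include <assert.h>
#include <stdint.h>

/* Upper bounds (microseconds) for buckets 0..14; bucket 15 is overflow.
 * Illustrative 1-2-5 progression: <1ms, <2ms, <5ms, <10ms, ... */
static const uint64_t bucket_bounds_us[15] = {
    1000, 2000, 5000, 10000, 20000, 50000, 100000, 200000,
    500000, 1000000, 2000000, 5000000, 10000000, 20000000, 50000000
};

int latency_to_bucket(uint64_t latency_us) {
    for (int i = 0; i < 15; i++)
        if (latency_us < bucket_bounds_us[i]) return i;
    return 15; // overflow bucket
}

/* Estimate a percentile (0-100) from bucket counts: find the first bucket
 * where the cumulative count exceeds p% of the total and report its upper
 * bound. Coarse, but O(16) and allocation-free -- the HDR-histogram idea
 * in miniature. */
uint64_t percentile_us(const uint64_t counts[16], double p) {
    uint64_t total = 0;
    for (int i = 0; i < 16; i++) total += counts[i];
    if (total == 0) return 0;
    uint64_t target = (uint64_t)(total * p / 100.0), cum = 0;
    for (int i = 0; i < 15; i++) {
        cum += counts[i];
        if (cum > target) return bucket_bounds_us[i];
    }
    return bucket_bounds_us[14]; // clamp at the last finite bound
}
```

The trade-off is resolution for speed: a P99 of "under 5ms" is plenty for a stats page, and recording stays a single atomic increment on the hot path.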
Rate calculation (rolling window):
#define RATE_WINDOW_SIZE 60 // 60 seconds
struct rate_calculator {
uint64_t buckets[RATE_WINDOW_SIZE];
int current_bucket;
uint64_t last_update;
};
double calculate_rate(struct rate_calculator *rc) {
uint64_t now = get_time_seconds();
rotate_buckets(rc, now);
uint64_t sum = 0;
for (int i = 0; i < RATE_WINDOW_SIZE; i++) {
sum += rc->buckets[i];
}
return (double)sum / RATE_WINDOW_SIZE;
}
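`rotate_buckets()` is the elided half of the rolling window: advancing the ring to "now" and zeroing every second's bucket that was skipped, so stale counts never leak into the rate. A hedged sketch reusing the structure above (field names from the text; `rate_record` is an illustrative helper):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RATE_WINDOW_SIZE 60  // 60 one-second buckets

struct rate_calculator {
    uint64_t buckets[RATE_WINDOW_SIZE];
    int current_bucket;
    uint64_t last_update;    // seconds
};

/* Advance the ring to 'now', zeroing every bucket we skipped over. If the
 * whole window has elapsed, everything is stale and the ring is cleared. */
void rotate_buckets(struct rate_calculator *rc, uint64_t now) {
    uint64_t elapsed = now - rc->last_update;
    if (elapsed == 0) return;
    if (elapsed >= RATE_WINDOW_SIZE) {
        memset(rc->buckets, 0, sizeof(rc->buckets));
        rc->current_bucket = 0;
    } else {
        for (uint64_t i = 0; i < elapsed; i++) {
            rc->current_bucket = (rc->current_bucket + 1) % RATE_WINDOW_SIZE;
            rc->buckets[rc->current_bucket] = 0;  // reuse the slot
        }
    }
    rc->last_update = now;
}

/* Record one event in the current second's bucket. */
void rate_record(struct rate_calculator *rc, uint64_t now) {
    rotate_buckets(rc, now);
    rc->buckets[rc->current_bucket]++;
}
```

Calling `rotate_buckets()` from both the record and the read path keeps the window honest even when traffic stops entirely.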
JSON stats endpoint:
void handle_stats_request(int client_fd) {
    char response[65536];
    int len = 0;
    // uint_fast64_t has no fixed printf format, so cast explicitly
    len += snprintf(response + len, sizeof(response) - len,
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: application/json\r\n"
        "Connection: close\r\n\r\n"
        "{\n"
        "  \"uptime_seconds\": %llu,\n"
        "  \"total_connections\": %llu,\n"
        "  \"active_connections\": %llu,\n",
        (unsigned long long)get_uptime(),
        (unsigned long long)atomic_load(&global_stats.connections_total),
        (unsigned long long)atomic_load(&global_stats.connections_active));
    // Add backends, servers...
    len += snprintf(response + len, sizeof(response) - len, "}\n");
    write(client_fd, response, len);
}
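The Prometheus-compatible `/metrics` output shown earlier is just line-oriented text in the form `metric{label="value"} count`. A one-function sketch of the formatter (the function name is illustrative, not a Prometheus library call):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Emit one Prometheus-style counter line into buf; returns chars written
 * (or what would have been written, per snprintf semantics). */
int format_prom_counter(char *buf, size_t size, const char *metric,
                        const char *label, const char *label_val,
                        unsigned long long value) {
    return snprintf(buf, size, "%s{%s=\"%s\"} %llu\n",
                    metric, label, label_val, value);
}
```

The `# HELP` and `# TYPE` comment lines are emitted once per metric name before its samples; since the exposition format is plain text, the whole endpoint is a loop of `snprintf` calls over the same atomic counters the JSON endpoint reads.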
Learning milestones:
- Basic stats work → You understand metric collection
- Rates are accurate → You understand rolling windows
- Latency percentiles work → You understand histograms
- Dashboard updates in real-time → Full observability
Project Comparison Table
| Project | Difficulty | Time | Depth of Understanding | Fun Factor |
|---|---|---|---|---|
| 1. TCP Echo (select/poll) | Intermediate | 1 week | ⭐⭐ | ⭐⭐⭐ |
| 2. Event Loop (epoll) | Advanced | 2 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| 3. HTTP Parser | Advanced | 2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| 4. Round-Robin LB | Advanced | 2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 5. Advanced Algorithms | Advanced | 2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 6. Health Checking | Advanced | 1-2 weeks | ⭐⭐⭐ | ⭐⭐⭐ |
| 7. Connection Pooling | Advanced | 2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| 8. SSL/TLS Termination | Expert | 3 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| 9. Config & Hot Reload | Advanced | 2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| 10. Stats Dashboard | Intermediate | 1-2 weeks | ⭐⭐⭐ | ⭐⭐⭐⭐ |
Recommended Learning Path
Phase 1: Foundations (The Core)
Project 1 (select/poll) → Project 2 (epoll) → Project 3 (HTTP Parser)
This teaches you event-driven programming—HAProxy’s foundation.
Phase 2: Load Balancing
Project 4 (Round-Robin) → Project 5 (Advanced Algorithms) → Project 6 (Health Checks)
This teaches you what makes a load balancer.
Phase 3: Production Features
Project 7 (Pooling) → Project 8 (SSL) → Project 9 (Config) → Project 10 (Stats)
This makes your load balancer production-ready.
Final Capstone: MiniHAProxy
- File: LEARN_HAPROXY_DEEP_DIVE.md
- Main Programming Language: C
- Alternative Programming Languages: Rust
- Coolness Level: Level 5: Pure Magic
- Business Potential: 5. The “Industry Disruptor”
- Difficulty: Level 5: Master
- Knowledge Area: Systems Programming / Full Stack
- Software or Tool: Complete Load Balancer
- Main Book: All previous books combined
What you’ll build: Combine all projects into a complete HAProxy-like load balancer with:
- Multi-threaded event loop (epoll/kqueue)
- HTTP/1.1 and TCP modes
- Multiple load balancing algorithms
- Health checking
- Connection pooling
- SSL/TLS termination
- HAProxy-compatible configuration
- Hot reload
- Stats dashboard
Real world outcome:
# Full HAProxy-compatible configuration
$ cat minihaproxy.cfg
global
maxconn 50000
nbthread 4
defaults
mode http
timeout connect 5s
timeout client 30s
timeout server 30s
frontend https
bind *:443 ssl crt /etc/ssl/cert.pem
default_backend webservers
backend webservers
balance leastconn
option httpchk GET /health
http-check expect status 200
server web1 192.168.1.10:8080 check weight 100
server web2 192.168.1.11:8080 check weight 100
server web3 192.168.1.12:8080 check weight 50
listen stats
bind *:8404
stats enable
stats uri /stats
$ ./minihaproxy -f minihaproxy.cfg
MiniHAProxy 1.0
Threads: 4
Max connections: 50000
Frontend 'https': 0.0.0.0:443 (ssl)
Backend 'webservers': 3 servers, leastconn
$ wrk -t12 -c10000 -d60s https://localhost:443/
Running 60s test @ https://localhost:443/
12 threads and 10000 connections
Requests/sec: 250,000+
Latency P99: 12ms
Summary
| # | Project | Main Language |
|---|---|---|
| 1 | TCP Echo Server with Select/Poll | C |
| 2 | High-Performance Event Loop with epoll | C |
| 3 | HTTP/1.1 Parser | C |
| 4 | Round-Robin Load Balancer | C |
| 5 | Advanced Load Balancing Algorithms | C |
| 6 | Health Checking System | C |
| 7 | Connection Pooling and Keep-Alive | C |
| 8 | SSL/TLS Termination | C |
| 9 | Configuration Parser and Hot Reload | C |
| 10 | Stats and Monitoring Dashboard | C |
| Capstone | MiniHAProxy (Complete Load Balancer) | C |
Key Resources
Books
- “The Linux Programming Interface” by Michael Kerrisk - Essential for systems programming
- “High Performance Browser Networking” by Ilya Grigorik - Network protocols and optimization
- “Beej’s Guide to Network Programming” by Brian Hall - Free online
- “Bulletproof SSL and TLS” by Ivan Ristić - TLS deep dive
HAProxy Documentation
- HAProxy Configuration Manual - Official docs
- HAProxy Management Guide - Internals and operations
- HAProxy Source Code - The ultimate reference
Tutorials & Articles
- Kernel Queue Complete Guide - epoll, kqueue, IOCP
- Non-Blocking Sockets with epoll - epoll tutorial
- llb - Dead Simple Load Balancer - Reference implementation
- picohttpparser - High-performance HTTP parser
Reference Implementations
- HAProxy Source - The real thing
- llb - Simple C load balancer
- GLB - Galera Load Balancer
“To understand HAProxy, build HAProxy. The event loop will humble you, the syscalls will enlighten you, and the performance numbers will amaze you.”