LEARN WEB INFRASTRUCTURE TOOLS DEEP DIVE

Learn Web Infrastructure Tools: From Zero to Infrastructure Master

Goal: Deeply understand the entire spectrum of web infrastructure—from web servers and reverse proxies to load balancers, API gateways, and CDNs. You will learn not just HOW to configure these tools, but WHY they exist, what problems they solve, how HTTP flows through them, and when to choose each one. After completing these projects, you’ll be able to architect production-grade infrastructure, troubleshoot complex networking issues, and understand the trade-offs that power the modern internet.


Why Web Infrastructure Tools Matter

In 1989, Tim Berners-Lee invented HTTP and the first web server at CERN. It could handle a handful of requests per second. Today, companies like Netflix serve over 400 million hours of content daily, Cloudflare handles 57 million HTTP requests per second, and a single misconfigured load balancer can take down an entire service.

The tools you’re about to learn power the entire internet:

  • Apache HTTP Server (1995) - The original dominant web server, still powers 31% of all websites
  • Nginx (2004) - Created to solve the C10K problem (handling 10,000+ concurrent connections)
  • HAProxy (2000) - The gold standard for TCP/HTTP load balancing, used by GitHub, Reddit, Stack Overflow
  • Cloudflare (2010) - Handles ~20% of all internet traffic, providing CDN and DDoS protection
  • Envoy (2016) - Born at Lyft, now the backbone of service mesh architectures everywhere

Understanding these tools is understanding how the modern internet works. Every request you make travels through multiple layers of this infrastructure:

Your Browser
     │
     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                              CDN LAYER                                   │
│  (Cloudflare, Fastly, Akamai)                                           │
│  - Edge caching, DDoS protection, TLS termination                       │
│  - 300+ global PoPs, sub-50ms latency worldwide                         │
└─────────────────────────────────────────────────────────────────────────┘
     │
     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                         LOAD BALANCER LAYER                              │
│  (HAProxy, Nginx, Envoy, Traefik)                                       │
│  - Traffic distribution, health checks, SSL termination                 │
│  - Rate limiting, connection pooling                                    │
└─────────────────────────────────────────────────────────────────────────┘
     │
     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                         API GATEWAY LAYER                                │
│  (Kong, OpenResty, Istio Ingress)                                       │
│  - Authentication, rate limiting, request transformation                │
│  - Service discovery, routing rules                                     │
└─────────────────────────────────────────────────────────────────────────┘
     │
     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                         WEB SERVER LAYER                                 │
│  (Apache, Nginx, Caddy, LiteSpeed, IIS)                                 │
│  - Static file serving, reverse proxying to app servers                 │
│  - Compression, caching headers, security headers                       │
└─────────────────────────────────────────────────────────────────────────┘
     │
     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                      APPLICATION SERVER LAYER                            │
│  (Tomcat, Jetty, Undertow, Kestrel, Gunicorn, uWSGI)                    │
│  - Business logic execution, dynamic content generation                 │
│  - Database connections, session management                             │
└─────────────────────────────────────────────────────────────────────────┘
     │
     ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                         CACHING LAYER                                    │
│  (Squid, Apache Traffic Server, Varnish, Redis)                         │
│  - Response caching, cache invalidation strategies                      │
│  - Memory vs disk caching, cache hierarchies                            │
└─────────────────────────────────────────────────────────────────────────┘

The Tools You Will Master

Category 1: Traditional Web Servers

| Tool | Primary Use | Key Strength |
|------|-------------|--------------|
| Apache HTTP Server | General-purpose web serving | .htaccess flexibility, module ecosystem |
| Nginx | High-concurrency web serving | Event-driven architecture, reverse proxy |
| LiteSpeed | WordPress-optimized hosting | Built-in caching, .htaccess compatible |
| Caddy | Modern web serving | Automatic HTTPS, simple configuration |
| Microsoft IIS | Windows ecosystem | Deep Windows/ASP.NET integration |

Category 2: Reverse Proxies & Load Balancers

| Tool | Primary Use | Key Strength |
|------|-------------|--------------|
| HAProxy | High-performance load balancing | Raw TCP/HTTP performance, reliability |
| Nginx | Reverse proxy + web server | Versatility, HTTP/3 support |
| Envoy | Service mesh data plane | Observability, gRPC support |
| Traefik | Cloud-native routing | Auto-discovery, Kubernetes native |

Category 3: API Gateways

| Tool | Primary Use | Key Strength |
|------|-------------|--------------|
| Kong | Enterprise API management | Plugin ecosystem, scalability |
| OpenResty | Programmable Nginx | Lua scripting at the edge |
| Istio Ingress Gateway | Service mesh ingress | mTLS, traffic management |

Category 4: Caching Proxies

| Tool | Primary Use | Key Strength |
|------|-------------|--------------|
| Squid | Forward/reverse proxy caching | Mature, flexible ACLs |
| Apache Traffic Server | CDN-scale caching | High-performance, used by Yahoo/Apple |

Category 5: Application Servers (Java/.NET)

| Tool | Primary Use | Key Strength |
|------|-------------|--------------|
| Tomcat | Java Servlet container | Stability, reference implementation |
| Jetty | Embedded Java server | Lightweight, embeddable |
| Undertow | High-performance Java server | Non-blocking I/O, WildFly core |
| Kestrel | ASP.NET Core server | Cross-platform, modern .NET |

Category 6: CDNs (Content Delivery Networks)

| Tool | Primary Use | Key Strength |
|------|-------------|--------------|
| Cloudflare | CDN + security + edge compute | Largest network, free tier |
| Fastly | Real-time CDN | Instant purge, VCL customization |
| Akamai | Enterprise CDN | Deepest ISP embedding, scale |


Core Concept Analysis

1. The HTTP Request Lifecycle

Before configuring any tool, you must understand what happens when a browser makes a request:

┌─────────────────────────────────────────────────────────────────────────────┐
│                        HTTP REQUEST LIFECYCLE                                │
└─────────────────────────────────────────────────────────────────────────────┘

1. DNS Resolution                    2. TCP Connection
   ┌─────────────┐                      ┌─────────────┐
   │   Browser   │                      │   Browser   │
   │  "google.   │                      │             │
   │   com" →    │                      │  ──SYN──►   │
   │  142.250.   │                      │  ◄─SYN/ACK─ │
   │   80.46     │                      │  ──ACK──►   │
   └─────────────┘                      └─────────────┘

3. TLS Handshake (if HTTPS)          4. HTTP Request
   ┌─────────────┐                      ┌─────────────────────────────┐
   │  ClientHello│                      │ GET /index.html HTTP/1.1    │
   │  ServerHello│                      │ Host: example.com           │
   │  Certificate│                      │ User-Agent: Mozilla/5.0     │
   │  Key Exchange                      │ Accept: text/html           │
   │  Finished   │                      │ Accept-Encoding: gzip       │
   └─────────────┘                      └─────────────────────────────┘

5. Server Processing                  6. HTTP Response
   ┌─────────────┐                      ┌─────────────────────────────┐
   │ Parse request                      │ HTTP/1.1 200 OK             │
   │ Check cache │                      │ Content-Type: text/html     │
   │ Run app logic                      │ Content-Length: 1234        │
   │ Query DB    │                      │ Cache-Control: max-age=3600 │
   │ Render resp │                      │                             │
   └─────────────┘                      │ <html>...</html>            │
                                        └─────────────────────────────┘

Every tool you learn operates at one or more of these stages.
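
You can watch most of these stages from a terminal. The sketch below assumes nothing beyond a standard curl build; exact verbose output differs between versions, so the annotations are indicative rather than literal.

# One request, with each lifecycle stage visible in curl's verbose output
$ curl -v https://example.com/ -o /dev/null
#   "*" lines → DNS lookup, TCP connect, TLS handshake, certificate details
#   ">" lines → the HTTP request the server (or CDN) actually receives
#   "<" lines → the HTTP response: status, Content-Type, Cache-Control, ...

# Time the stages individually using curl's write-out variables
$ curl -s -o /dev/null -w "dns=%{time_namelookup} tcp=%{time_connect} tls=%{time_appconnect} ttfb=%{time_starttransfer} total=%{time_total}\n" https://example.com/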


2. Forward Proxy vs Reverse Proxy

This is the most fundamental concept for understanding these tools:

┌─────────────────────────────────────────────────────────────────────────────┐
│                            FORWARD PROXY                                     │
│                                                                              │
│    Client knows it's using a proxy. Proxy acts on behalf of CLIENT.         │
│                                                                              │
│    ┌────────┐        ┌─────────────┐        ┌──────────────┐                │
│    │ Client │──────► │Forward Proxy│──────► │Origin Server │                │
│    │        │◄────── │  (Squid)    │◄────── │              │                │
│    └────────┘        └─────────────┘        └──────────────┘                │
│                                                                              │
│    Use cases:                                                                │
│    • Corporate content filtering                                             │
│    • Anonymizing client IP                                                   │
│    • Caching for multiple clients                                            │
│    • Bypassing geo-restrictions                                              │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│                            REVERSE PROXY                                     │
│                                                                              │
│    Client doesn't know proxy exists. Proxy acts on behalf of SERVER.        │
│                                                                              │
│    ┌────────┐        ┌──────────────┐        ┌──────────────┐               │
│    │ Client │──────► │Reverse Proxy │──────► │Origin Server │               │
│    │        │◄────── │(Nginx/HAProxy│◄────── │              │               │
│    └────────┘        └──────────────┘        └──────────────┘               │
│                            │                                                 │
│                            ├──────► │Origin Server 2│                       │
│                            │                                                 │
│                            └──────► │Origin Server 3│                       │
│                                                                              │
│    Use cases:                                                                │
│    • Load balancing across servers                                           │
│    • SSL/TLS termination                                                     │
│    • Caching static content                                                  │
│    • Hiding server infrastructure                                            │
│    • Request routing based on path/host                                      │
└─────────────────────────────────────────────────────────────────────────────┘
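
From the client's point of view the difference is easy to demonstrate with curl; the proxy hostname below is a placeholder.

# FORWARD proxy: the client opts in and points at the proxy explicitly
$ curl -x http://proxy.internal.example:3128 https://example.com/

# REVERSE proxy: the client requests the site as usual; DNS for example.com
# resolves to the proxy, which quietly forwards to the hidden origin servers
$ curl https://example.com/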

3. Load Balancing Algorithms

Every load balancer you configure will use one of these algorithms:

┌─────────────────────────────────────────────────────────────────────────────┐
│                        LOAD BALANCING ALGORITHMS                             │
└─────────────────────────────────────────────────────────────────────────────┘

ROUND ROBIN                          LEAST CONNECTIONS
┌──────────────────────┐             ┌──────────────────────┐
│ Request 1 → Server A │             │ Server A: 5 conns    │
│ Request 2 → Server B │             │ Server B: 2 conns ◄──│── Next request
│ Request 3 → Server C │             │ Server C: 8 conns    │
│ Request 4 → Server A │             │                      │
│ (repeat)             │             │ (always pick lowest) │
└──────────────────────┘             └──────────────────────┘
Simple, even distribution            Better for varying request times

WEIGHTED ROUND ROBIN                 IP HASH (Session Affinity)
┌──────────────────────┐             ┌──────────────────────┐
│ Server A (weight=3)  │             │ hash(client_ip) % N  │
│ Server B (weight=1)  │             │                      │
│ Server C (weight=2)  │             │ Client 1 → Server A  │
│                      │             │ Client 2 → Server C  │
│ A,A,A,B,C,C,A,A,A... │             │ Client 1 → Server A  │
└──────────────────────┘             │ (always same server) │
Different server capacities          └──────────────────────┘
                                     Sticky sessions, stateful apps

LEAST RESPONSE TIME                  RANDOM
┌──────────────────────┐             ┌──────────────────────┐
│ Server A: 50ms avg   │             │ Randomly pick server │
│ Server B: 20ms avg ◄─│             │                      │
│ Server C: 80ms avg   │             │ Simple but effective │
│                      │             │ for large clusters   │
│ (pick fastest)       │             │                      │
└──────────────────────┘             └──────────────────────┘
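
Most of these algorithms map directly onto load balancer configuration. Here is a minimal Nginx sketch with placeholder backend addresses; enable at most one balancing directive per upstream (the default with none is round robin):

upstream app_backends {
    # least_conn;            # pick the server with the fewest active connections
    # ip_hash;               # session affinity: same client IP → same server
    # random;                # random selection (useful for very large pools)

    server 10.0.0.11:8080 weight=3;   # weighted round robin: ~3x the traffic
    server 10.0.0.12:8080 weight=1;
    server 10.0.0.13:8080 weight=2;
}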

4. Connection Handling Models

Understanding WHY Nginx outperforms Apache for high concurrency:

┌─────────────────────────────────────────────────────────────────────────────┐
│                    PROCESS-PER-CONNECTION (Apache prefork)                   │
└─────────────────────────────────────────────────────────────────────────────┘

    Connection 1 ──► [Process 1] ─┐
    Connection 2 ──► [Process 2] ─┼──► Kernel
    Connection 3 ──► [Process 3] ─┤
    Connection 4 ──► [Process 4] ─┘

    • Each connection gets dedicated process
    • Memory: ~10MB per connection
    • Context switching overhead
    • Simple programming model
    • 1000 connections = 10GB RAM

┌─────────────────────────────────────────────────────────────────────────────┐
│                    THREAD-PER-CONNECTION (Apache worker)                     │
└─────────────────────────────────────────────────────────────────────────────┘

    Connection 1 ──► [Thread 1] ──┐
    Connection 2 ──► [Thread 2] ──┼──► Process ──► Kernel
    Connection 3 ──► [Thread 3] ──┤
    Connection 4 ──► [Thread 4] ──┘

    • Each connection gets dedicated thread
    • Memory: ~1MB per connection (stack)
    • Less overhead than processes
    • Thread safety concerns
    • 10,000 connections = 10GB RAM

┌─────────────────────────────────────────────────────────────────────────────┐
│                    EVENT-DRIVEN (Nginx, HAProxy, Envoy)                      │
└─────────────────────────────────────────────────────────────────────────────┘

    Connection 1 ──┐
    Connection 2 ──┤              ┌─────────────┐
    Connection 3 ──┼──► [epoll] ──│Single Worker│──► Kernel
    Connection 4 ──┤              │   Process   │
    ...            │              └─────────────┘
    Connection N ──┘

    • Single thread handles thousands of connections
    • Non-blocking I/O with epoll/kqueue
    • Memory: ~2.5KB per connection
    • State machine programming model
    • 10,000 connections = 25MB RAM

    WHY IT WORKS:
    ┌───────────────────────────────────────────────────┐
    │ Most time is spent WAITING for:                   │
    │ • Network I/O (client sending data)               │
    │ • Disk I/O (reading files)                        │
    │ • Backend response                                │
    │                                                   │
    │ Event-driven = do useful work while waiting       │
    └───────────────────────────────────────────────────┘
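
These models are exactly what you tune in server configuration. A rough sketch of where the knobs live; the numbers are illustrative, not recommendations:

# Apache event MPM: a few processes, each running a pool of threads
<IfModule mpm_event_module>
    StartServers             2
    ThreadsPerChild         25
    MaxRequestWorkers      400     # hard ceiling on simultaneous requests
</IfModule>

# Nginx: one event-driven worker per CPU core, each multiplexing many sockets
worker_processes auto;
events {
    worker_connections 10240;      # per worker; epoll/kqueue chosen automatically
}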

5. TLS/SSL Termination

Where encryption ends matters for architecture:

┌─────────────────────────────────────────────────────────────────────────────┐
│                         TLS TERMINATION OPTIONS                              │
└─────────────────────────────────────────────────────────────────────────────┘

OPTION 1: Terminate at Load Balancer
┌────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  Client ══HTTPS══► Load Balancer ──HTTP──► Backend Servers                 │
│                    (terminates TLS)                                         │
│                                                                             │
│  ✓ Offloads CPU from backends        ✗ Internal traffic unencrypted        │
│  ✓ Single place to manage certs      ✗ Load balancer sees plaintext        │
│  ✓ Backend config simpler                                                   │
└────────────────────────────────────────────────────────────────────────────┘

OPTION 2: TLS Passthrough
┌────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  Client ══HTTPS══► Load Balancer ══HTTPS══► Backend Servers                │
│                    (TCP proxy only)         (terminate TLS)                 │
│                                                                             │
│  ✓ End-to-end encryption             ✗ Can't inspect traffic               │
│  ✓ Backend controls certificates     ✗ No Layer 7 routing                  │
│  ✓ Load balancer never sees data     ✗ Each backend needs cert             │
└────────────────────────────────────────────────────────────────────────────┘

OPTION 3: Re-encryption (TLS Bridging)
┌────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  Client ══HTTPS══► Load Balancer ══HTTPS══► Backend Servers                │
│                    (terminate + re-encrypt)                                 │
│                                                                             │
│  ✓ Can inspect and route traffic     ✗ Double encryption overhead          │
│  ✓ Internal traffic encrypted        ✗ Complex certificate management      │
│  ✓ Full Layer 7 features                                                    │
└────────────────────────────────────────────────────────────────────────────┘

OPTION 4: mTLS (Mutual TLS) - Service Mesh Pattern
┌────────────────────────────────────────────────────────────────────────────┐
│                                                                             │
│  Client ══HTTPS══► Ingress ══mTLS══► Service A ══mTLS══► Service B         │
│                    Gateway          (both sides verify certificates)        │
│                                                                             │
│  ✓ Zero-trust security               ✗ Certificate management complexity   │
│  ✓ Service identity verification     ✗ Performance overhead                │
│  ✓ Encrypted service-to-service                                             │
└────────────────────────────────────────────────────────────────────────────┘
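
In Nginx terms, Option 1 and Option 2 look roughly like the sketch below; certificate paths and backend addresses are placeholders, and the stream {} block lives at the top level of nginx.conf, alongside http {}.

# OPTION 1: terminate TLS at the proxy, speak plain HTTP to the backend
server {
    listen 443 ssl;
    server_name app.example.com;
    ssl_certificate     /etc/ssl/certs/app.example.com.pem;
    ssl_certificate_key /etc/ssl/private/app.example.com.key;
    location / {
        proxy_pass http://10.0.0.11:8080;   # internal hop is unencrypted
    }
}

# OPTION 2: TLS passthrough - proxy raw TCP, the backend terminates TLS itself
stream {
    server {
        listen 443;
        proxy_pass 10.0.0.11:443;           # no Layer 7 inspection or routing
    }
}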

6. Caching Layers and Strategies

┌─────────────────────────────────────────────────────────────────────────────┐
│                           CACHING HIERARCHY                                  │
└─────────────────────────────────────────────────────────────────────────────┘

                    FASTEST
                       │
    ┌──────────────────┼──────────────────┐
    │     BROWSER CACHE (Client-side)     │  • Cache-Control headers
    │     TTL: seconds to days            │  • ETag/If-None-Match
    └──────────────────┼──────────────────┘  • Last-Modified/If-Modified-Since
                       │
    ┌──────────────────┼──────────────────┐
    │      CDN EDGE CACHE (300+ PoPs)     │  • Geographic distribution
    │      TTL: minutes to hours          │  • Stale-while-revalidate
    └──────────────────┼──────────────────┘  • Instant purge APIs
                       │
    ┌──────────────────┼──────────────────┐
    │   REVERSE PROXY CACHE (Nginx/Varnish)│  • Full-page caching
    │   TTL: seconds to minutes           │  • ESI (Edge Side Includes)
    └──────────────────┼──────────────────┘  • Vary header handling
                       │
    ┌──────────────────┼──────────────────┐
    │   APPLICATION CACHE (Redis/Memcached)│  • Session data
    │   TTL: seconds to hours             │  • Database query results
    └──────────────────┼──────────────────┘  • Computed values
                       │
    ┌──────────────────┼──────────────────┐
    │      DATABASE QUERY CACHE           │  • Query result caching
    │      TTL: automatic invalidation    │  • Buffer pool
    └──────────────────┼──────────────────┘
                       │
                    SLOWEST


CACHE-CONTROL HEADER CHEAT SHEET:
┌─────────────────────────────────────────────────────────────────────────────┐
│ Directive              │ Meaning                                            │
├────────────────────────┼────────────────────────────────────────────────────┤
│ max-age=3600           │ Cache for 1 hour                                   │
│ s-maxage=3600          │ Cache for 1 hour (shared caches like CDN)          │
│ no-cache               │ Must revalidate with origin before using           │
│ no-store               │ Never cache this response                          │
│ private                │ Only browser can cache (not CDN)                   │
│ public                 │ Any cache can store this                           │
│ must-revalidate        │ Don't use stale content if revalidation fails      │
│ stale-while-revalidate │ Serve stale while fetching fresh in background     │
│ stale-if-error         │ Serve stale if origin returns error                │
└─────────────────────────────────────────────────────────────────────────────┘
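
At the reverse-proxy layer these directives turn into cache configuration. A minimal Nginx sketch; the zone name, path, and times are illustrative:

proxy_cache_path /var/cache/nginx keys_zone=app_cache:10m max_size=1g inactive=60m;

server {
    location / {
        proxy_cache            app_cache;
        proxy_cache_valid      200 301 10m;              # proxy-side freshness window
        proxy_cache_use_stale  error timeout updating;   # roughly stale-if-error / stale-while-revalidate
        proxy_cache_background_update on;
        add_header X-Cache-Status $upstream_cache_status;  # HIT / MISS / STALE, handy for debugging
        proxy_pass http://10.0.0.11:8080;
    }
}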

7. Health Checks and Failover

┌─────────────────────────────────────────────────────────────────────────────┐
│                         HEALTH CHECK TYPES                                   │
└─────────────────────────────────────────────────────────────────────────────┘

TCP CHECK (Layer 4)                  HTTP CHECK (Layer 7)
┌─────────────────────┐              ┌─────────────────────┐
│ Can I open a TCP    │              │ Does GET /health    │
│ connection to       │              │ return 200 OK?      │
│ port 8080?          │              │                     │
│                     │              │ Can check:          │
│ ✓ Fast              │              │ • Response code     │
│ ✓ Simple            │              │ • Response body     │
│ ✗ App might be hung │              │ • Response time     │
└─────────────────────┘              └─────────────────────┘

HEALTH CHECK STATE MACHINE:
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                              │
│    ┌─────────┐  3 consecutive    ┌──────────┐  2 consecutive   ┌─────────┐ │
│    │ HEALTHY │────failures──────►│ DEGRADED │────failures─────►│  DOWN   │ │
│    │         │◄─────────────────│          │◄─────────────────│         │ │
│    └─────────┘  2 consecutive    └──────────┘  3 consecutive   └─────────┘ │
│                 successes                      successes                    │
│                                                                              │
│    Traffic: 100%                 Traffic: 50%                 Traffic: 0%   │
└─────────────────────────────────────────────────────────────────────────────┘

CIRCUIT BREAKER PATTERN:
┌─────────────────────────────────────────────────────────────────────────────┐
│                                                                              │
│    ┌────────┐  failures > threshold  ┌────────┐  timeout    ┌───────────┐  │
│    │ CLOSED │───────────────────────►│  OPEN  │────────────►│HALF-OPEN  │  │
│    │(normal)│                        │(reject)│             │(probe)    │  │
│    └────────┘◄───────────────────────└────────┘◄────────────└───────────┘  │
│               success in half-open              probe fails                 │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
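
A minimal HAProxy sketch of an active Layer 7 health check; addresses, intervals, and thresholds are illustrative:

backend app_servers
    option httpchk GET /health
    http-check expect status 200
    default-server inter 2s fall 3 rise 2    # 3 failed probes → down, 2 good probes → back up
    server app1 10.0.0.11:8080 check
    server app2 10.0.0.12:8080 check
    server app3 10.0.0.13:8080 check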

Concept Summary Table

| Concept Cluster | What You Need to Internalize |
|-----------------|------------------------------|
| HTTP Lifecycle | Every request goes through DNS, TCP, TLS, HTTP request, processing, and response. Each tool operates at one or more of these stages. |
| Forward vs Reverse Proxy | A forward proxy acts for clients (hiding them); a reverse proxy acts for servers (hiding them). Most tools in this guide are reverse proxies. |
| Load Balancing | Distribute traffic using algorithms (round robin, least connections, IP hash). Health checks detect failures. Sticky sessions support stateful apps. |
| Connection Models | Process-per-connection (Apache prefork) wastes RAM. Event-driven servers (Nginx) handle 10,000+ connections in megabytes of RAM. |
| TLS Termination | Where you decrypt determines what you can inspect and what stays encrypted. mTLS enables zero-trust service-to-service traffic. |
| Caching | Multiple layers from browser to database. Cache-Control headers drive behavior. stale-while-revalidate trades freshness for performance. |
| Health Checks | TCP checks are fast but shallow; HTTP checks verify application health. Circuit breakers prevent cascading failures. |

Deep Dive Reading by Concept

This section maps each concept to specific book chapters for deeper understanding.

HTTP Protocol Fundamentals

| Concept | Book & Chapter |
|---------|----------------|
| HTTP Message Format | “TCP/IP Illustrated, Volume 1” by W. Richard Stevens — Ch. 16: HTTP |
| HTTP/2 and HTTP/3 | “Computer Networks” by Tanenbaum — Ch. 7: Application Layer (Section 7.3) |
| TLS/SSL Handshake | “Serious Cryptography” by Aumasson — Ch. 14: TLS |
| TCP Connection States | “TCP/IP Illustrated, Volume 1” — Ch. 13: TCP Connection Management |

Web Server Architecture

| Concept | Book & Chapter |
|---------|----------------|
| Process vs Thread Models | “Operating Systems: Three Easy Pieces” — Ch. 26-27: Concurrency |
| Event-Driven I/O | “The Linux Programming Interface” by Kerrisk — Ch. 63: Alternative I/O Models |
| epoll/kqueue | “Linux System Programming” by Love — Ch. 4: Advanced I/O |
| Non-blocking Sockets | “The Sockets Networking API” by Stevens — Ch. 16: Nonblocking I/O |

Load Balancing & Proxying

| Concept | Book & Chapter |
|---------|----------------|
| Reverse Proxy Patterns | “Building Microservices” by Sam Newman — Ch. 6: Deployment |
| Load Balancing Algorithms | “Designing Data-Intensive Applications” by Kleppmann — Ch. 6: Partitioning |
| Health Checks | “Release It!” by Nygard — Ch. 5: Stability Patterns |
| Circuit Breakers | “Release It!” by Nygard — Ch. 5: Circuit Breaker Pattern |

Caching Strategies

| Concept | Book & Chapter |
|---------|----------------|
| Cache Invalidation | “Designing Data-Intensive Applications” — Ch. 5: Replication (Caching section) |
| CDN Architecture | “Computer Networks” by Tanenbaum — Ch. 7.5: Content Delivery Networks |
| HTTP Caching Headers | “HTTP: The Definitive Guide” by Gourley — Ch. 7: Caching |

Essential Reading Order

For maximum comprehension, read in this order:

  1. Foundation (Week 1-2):
    • “Computer Networks” Ch. 5-6 (Transport & Application layers)
    • “The Linux Programming Interface” Ch. 56-63 (Sockets & I/O)
  2. Web Servers Deep Dive (Week 3):
    • “Operating Systems: Three Easy Pieces” Ch. 26-33 (Concurrency)
    • Nginx/Apache official documentation
  3. Production Patterns (Week 4):
    • “Release It!” Ch. 4-8 (Stability patterns)
    • “Building Microservices” Ch. 6-8 (Deployment & Resilience)
  4. Advanced Topics (Week 5+):
    • “Designing Data-Intensive Applications” Ch. 5-6
    • Service mesh documentation (Istio, Envoy)

PROJECT LIST


Project 1: Build a Multi-Site Apache HTTP Server with Virtual Hosts

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: Bash/Configuration (Apache Config)
  • Alternative Programming Languages: Python (for CGI), PHP, Perl
  • Coolness Level: Level 2: Practical but Forgettable
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 1: Beginner
  • Knowledge Area: Web Servers, Virtual Hosting, HTTP
  • Software or Tool: Apache HTTP Server (httpd)
  • Main Book: “Apache: The Definitive Guide” by Ben Laurie & Peter Laurie

What you’ll build: A complete Apache installation serving 3+ different websites on a single server, with name-based virtual hosts, SSL certificates, .htaccess overrides, mod_rewrite rules, and custom error pages.

Why it teaches web infrastructure: Apache is the grandfather of web servers. Understanding its configuration paradigm—directory-based config, .htaccess cascading, and module system—gives you the mental model that all other web servers either adopted or rejected. You’ll learn WHY Nginx later chose a different path.

Core challenges you’ll face:

  • Understanding the Apache configuration hierarchy → maps to how servers parse and apply configuration
  • Setting up name-based virtual hosts → maps to the Host header and how one IP serves many sites
  • Configuring SSL/TLS with Let’s Encrypt → maps to certificate management and HTTPS
  • Writing mod_rewrite rules → maps to URL manipulation and SEO-friendly URLs
  • Tuning MPM (prefork vs worker vs event) → maps to connection handling models

Key Concepts:

  • Virtual Hosting: “Apache: The Definitive Guide” Ch. 3 - Laurie & Laurie
  • SSL/TLS Configuration: “Serious Cryptography” Ch. 14 - Aumasson
  • mod_rewrite: Apache official docs, “Apache Cookbook” Ch. 8
  • MPM Tuning: “Apache Performance Tuning” - Apache Foundation docs

Difficulty: Beginner
Time estimate: Weekend
Prerequisites: Basic Linux command line, understanding of DNS (A records), basic HTTP concepts


Real World Outcome

You’ll have a fully functional web server hosting multiple websites. When someone visits site1.example.com, they see Site 1’s content. When they visit site2.example.com on the same server, they see completely different content.

Example Output:

# Check Apache is running
$ sudo systemctl status apache2
● apache2.service - The Apache HTTP Server
     Active: active (running) since Mon 2024-12-22 10:00:00 UTC

# Test virtual hosts
$ curl -H "Host: site1.example.com" http://localhost
<!DOCTYPE html>
<html><head><title>Welcome to Site 1!</title></head>
<body><h1>This is Site 1</h1></body></html>

$ curl -H "Host: site2.example.com" http://localhost
<!DOCTYPE html>
<html><head><title>Site 2 Dashboard</title></head>
<body><h1>Site 2 - Different Content!</h1></body></html>

# Test SSL
$ curl -I https://site1.example.com
HTTP/2 200
server: Apache/2.4.57
content-type: text/html
strict-transport-security: max-age=31536000

# Test mod_rewrite (pretty URLs)
$ curl -I http://site1.example.com/products/laptop-123
HTTP/1.1 200 OK
# Internally rewrites to /product.php?id=laptop-123

# Check server status page
$ curl http://localhost/server-status
Apache Server Status for localhost
Server uptime: 2 days 5 hours 23 minutes
Total accesses: 145823 - Total Traffic: 2.3 GB
CPU Usage: u.12 s.08 cu0 cs0 - .00128% CPU load
12 requests/sec - 205.3 kB/second - 17.2 kB/request

The Core Question You’re Answering

“How does a single server with one IP address serve completely different websites depending on what domain name someone types?”

Before you write any configuration, sit with this question. The answer lies in the Host HTTP header—the browser tells the server which site it wants. But this seemingly simple mechanism unlocks the entire shared hosting industry where millions of sites run on thousands of servers.


Concepts You Must Understand First

Stop and research these before configuring:

  1. HTTP Host Header
    • What header does the browser send to specify which site it wants?
    • Why was this header added in HTTP/1.1 and not HTTP/1.0?
    • What happens if no Host header is sent?
    • Book Reference: “HTTP: The Definitive Guide” Ch. 5 - Gourley
  2. DNS A Records
    • How does site1.example.com resolve to an IP address?
    • Can multiple domains point to the same IP?
    • What’s the difference between A and CNAME records?
    • Book Reference: “DNS and BIND” Ch. 4 - Albitz & Liu
  3. File Permissions on Linux
    • Why does Apache run as the www-data user?
    • What permissions should web files have (644 vs 755)?
    • Why is putting web files in /root a bad idea?
    • Book Reference: “The Linux Command Line” Ch. 9 - Shotts
  4. TLS Certificate Chain
    • What is a certificate authority and chain of trust?
    • How does Let’s Encrypt verify you own a domain?
    • What’s the difference between HTTP-01 and DNS-01 challenges?
    • Book Reference: “Serious Cryptography” Ch. 14 - Aumasson

Questions to Guide Your Design

Before configuring, think through these:

  1. Virtual Host Strategy
    • Will you use name-based or IP-based virtual hosts?
    • What should the “default” virtual host show for unknown domains?
    • Where will you store each site’s files (/var/www/site1 vs /home/user/site1)?
  2. SSL Configuration
    • Will you terminate SSL at Apache or use a reverse proxy?
    • How will you handle HTTP to HTTPS redirects?
    • Will you use a wildcard certificate or individual certs?
  3. Logging Strategy
    • Separate log files per virtual host or combined?
    • What log format will you use (combined, custom)?
    • How will you handle log rotation?
  4. Security
    • Which directories need .htaccess to be enabled?
    • What security headers will you add (HSTS, X-Frame-Options)?
    • How will you protect sensitive files (.git, .env)?

Thinking Exercise

Trace an HTTP Request Through Apache

Before configuring, trace what happens when a browser requests https://site2.example.com/products/shoes:

Browser types URL
      │
      ▼
[DNS Resolution] → What IP does site2.example.com resolve to?
      │
      ▼
[TCP Connection] → Browser connects to port 443
      │
      ▼
[TLS Handshake] → Which certificate does Apache send?
      │
      ▼
[HTTP Request] → GET /products/shoes, Host: site2.example.com
      │
      ▼
[Virtual Host Selection] → How does Apache pick which <VirtualHost> block?
      │
      ▼
[mod_rewrite] → Does /products/shoes get rewritten to something else?
      │
      ▼
[.htaccess Check] → Does Apache look for .htaccess files?
      │
      ▼
[File Serving] → What file actually gets read from disk?
      │
      ▼
[Response] → HTTP 200, content-type, security headers

Questions while tracing:

  • At which step does Apache decide which virtual host to use?
  • Where exactly in the config does the SSL certificate get specified?
  • If mod_rewrite changes the URL, does the log show the original or rewritten URL?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What’s the difference between <VirtualHost *:80> and <VirtualHost 192.168.1.1:80>?”
  2. “How would you configure Apache to redirect HTTP to HTTPS?”
  3. “Explain what .htaccess files are and when you would disable AllowOverride.”
  4. “What’s the difference between prefork, worker, and event MPMs? When would you use each?”
  5. “How would you troubleshoot ‘Permission denied’ errors when Apache serves files?”
  6. “What does the Options -Indexes directive do and why is it important?”
  7. “How would you set up Apache to proxy requests to a backend application server?”

Hints in Layers

Hint 1: Starting Point Install Apache, create two directories under /var/www/, and create a virtual host file for each in /etc/apache2/sites-available/.

Hint 2: Virtual Host Structure Each virtual host file needs <VirtualHost *:80>, ServerName, DocumentRoot, and ErrorLog/CustomLog directives. Use a2ensite to enable them.
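
A minimal sketch of one such file, assuming the domain site1.example.com and the document root /var/www/site1 (adjust both to your own setup):

# /etc/apache2/sites-available/site1.example.com.conf
<VirtualHost *:80>
    ServerName   site1.example.com
    DocumentRoot /var/www/site1

    ErrorLog  ${APACHE_LOG_DIR}/site1_error.log
    CustomLog ${APACHE_LOG_DIR}/site1_access.log combined
</VirtualHost>

# Enable it and reload:
#   sudo a2ensite site1.example.com.conf && sudo systemctl reload apache2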

Hint 3: SSL Setup Install certbot, run certbot --apache -d site1.example.com, and it will automatically create the SSL virtual host configuration.

Hint 4: Debugging Use apachectl configtest to validate config, tail -f /var/log/apache2/error.log to watch errors, and curl -v to see exactly what headers are sent/received.


Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Apache basics | “Apache: The Definitive Guide” by Laurie | Ch. 1-4 |
| Virtual hosts | “Apache: The Definitive Guide” | Ch. 3 |
| SSL/TLS | “Serious Cryptography” by Aumasson | Ch. 14 |
| mod_rewrite | “Apache Cookbook” by Coar & Bowen | Ch. 8 |
| Linux permissions | “The Linux Command Line” by Shotts | Ch. 9 |

Learning milestones:

  1. First site works on HTTP → You understand virtual hosts and DocumentRoot
  2. Multiple sites with SSL → You understand certificate configuration and SNI
  3. mod_rewrite working → You understand URL manipulation and regex in server config
  4. Tuned for performance → You understand connection handling and caching


Project 2: Configure Nginx as Reverse Proxy with Load Balancing

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: Nginx Configuration
  • Alternative Programming Languages: Lua (OpenResty), Python (backend apps)
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Reverse Proxying, Load Balancing, HTTP
  • Software or Tool: Nginx
  • Main Book: “Nginx HTTP Server” by Clément Nedelcu

What you’ll build: A complete Nginx setup that acts as a reverse proxy in front of multiple backend application servers, with load balancing, health checks, SSL termination, caching, and rate limiting. You’ll deploy 3 identical backend apps and watch Nginx distribute traffic.

Why it teaches web infrastructure: Nginx represents the event-driven revolution in web servers. By configuring it as a reverse proxy, you’ll understand the architectural pattern that powers most modern deployments—separating the “traffic cop” from the “business logic.” Every large website uses this pattern.

Core challenges you’ll face:

  • Understanding upstream blocks and load balancing → maps to traffic distribution algorithms
  • Configuring health checks → maps to failure detection and automatic failover
  • SSL termination vs passthrough → maps to where encryption happens in your architecture
  • Proxy headers (X-Forwarded-For, X-Real-IP) → maps to preserving client information through proxies
  • Response caching with cache keys → maps to reducing backend load

Key Concepts:

  • Reverse Proxy Pattern: “Building Microservices” Ch. 6 - Sam Newman
  • Load Balancing Algorithms: “Designing Data-Intensive Applications” Ch. 6 - Kleppmann
  • Nginx Configuration: “Nginx HTTP Server” Ch. 3-5 - Nedelcu
  • HTTP Caching: “HTTP: The Definitive Guide” Ch. 7 - Gourley

Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Basic Nginx, running backend apps (can be simple Python/Node servers), Project 1 concepts


Real World Outcome

You’ll have Nginx distributing requests across multiple backend servers. When you stop one backend, Nginx automatically routes traffic to healthy servers. When you run a load test, you’ll see requests evenly distributed.

Example Output:

# Start 3 backend servers on different ports
$ python3 backend.py --port 8001 &  # Server A
$ python3 backend.py --port 8002 &  # Server B
$ python3 backend.py --port 8003 &  # Server C

# Check Nginx config
$ sudo nginx -t
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

# Make requests and see load balancing in action
$ for i in {1..6}; do curl -s http://localhost/api/whoami; done
{"server": "backend-8001", "request_id": 1}
{"server": "backend-8002", "request_id": 1}
{"server": "backend-8003", "request_id": 1}
{"server": "backend-8001", "request_id": 2}
{"server": "backend-8002", "request_id": 2}
{"server": "backend-8003", "request_id": 2}

# Stop one backend and watch failover
$ kill %1  # Stop Server A

$ for i in {1..4}; do curl -s http://localhost/api/whoami; done
{"server": "backend-8002", "request_id": 3}
{"server": "backend-8003", "request_id": 3}
{"server": "backend-8002", "request_id": 4}
{"server": "backend-8003", "request_id": 4}
# No requests to 8001, Nginx detected it's down!

# Check upstream status
$ curl http://localhost/upstream_status
Upstream: backends
  backend-8001: DOWN (last_fail: 2 seconds ago)
  backend-8002: UP (active_connections: 12)
  backend-8003: UP (active_connections: 11)

# Test rate limiting
$ for i in {1..20}; do curl -s -o /dev/null -w "%{http_code}\n" http://localhost/api/data; done
200
200
200
200
200
429  # Rate limited!
429
429
...

The Core Question You’re Answering

“How does a reverse proxy decide which backend server should handle each request, and what happens when one of those servers dies?”

Before you configure anything, understand that you’re building a traffic cop—something that receives all incoming requests and makes intelligent decisions about where to send them. This is the foundation of horizontal scaling.


Concepts You Must Understand First

Stop and research these before configuring:

  1. The Reverse Proxy Pattern
    • Why put a proxy in front of your application servers?
    • What’s the difference between a load balancer and a reverse proxy?
    • What information does the proxy hide from clients?
    • Book Reference: “Building Microservices” Ch. 6 - Sam Newman
  2. HTTP Connection Lifecycle
    • What’s the difference between the client→Nginx and Nginx→backend connections?
    • Why does Nginx maintain a connection pool to backends?
    • What is HTTP keep-alive and how does it affect proxy performance?
    • Book Reference: “HTTP: The Definitive Guide” Ch. 4 - Gourley
  3. Load Balancing Algorithms
    • Round-robin vs least-connections vs IP-hash—when to use each?
    • What is session affinity and when is it needed?
    • What are the trade-offs of sticky sessions?
    • Book Reference: “Designing Data-Intensive Applications” Ch. 6 - Kleppmann
  4. Health Checks
    • What’s the difference between passive (fail-based) and active (probe-based) health checks?
    • How quickly should a failed server be removed from rotation?
    • What happens during the “recovery” phase?
    • Book Reference: “Release It!” Ch. 5 - Nygard

Questions to Guide Your Design

Before configuring, think through these:

  1. Upstream Configuration
    • How many backend servers will you run?
    • What load balancing algorithm fits your use case?
    • How will you handle server weights if servers have different capacities?
  2. Health Check Strategy
    • Will you use passive checks only, or implement active health checks?
    • After how many failures should a server be marked down?
    • How long before attempting to use a recovered server?
  3. Header Forwarding
    • Which headers need to be forwarded to backends (Host, X-Forwarded-For)?
    • How will backends know the original client’s IP address?
    • How will you handle the X-Forwarded-Proto header for HTTPS detection?
  4. Caching Strategy
    • Which responses should be cached at the Nginx layer?
    • How will you handle cache invalidation?
    • What cache key will you use (URL only, or URL + headers)?

Thinking Exercise

Trace a Request Through the Proxy

Map out what happens when a client requests https://api.example.com/users/123:

Client Request
      │
      ▼
[Nginx receives on port 443]
      │
      ▼
[SSL Termination] → Nginx decrypts using which certificate?
      │
      ▼
[Location Matching] → /users/123 matches which location block?
      │
      ▼
[Rate Limit Check] → Is this client within their rate limit?
      │
      ▼
[Cache Lookup] → Is there a cached response for this request?
      │
      ▼
[Upstream Selection] → Which backend server gets this request?
      │
      ▼
[Proxy Request] → Nginx connects to backend on port 8001
      │            → What headers does Nginx add/modify?
      │
      ▼
[Backend Response] → Backend returns 200 with JSON
      │
      ▼
[Cache Storage] → Should Nginx cache this response?
      │
      ▼
[Client Response] → Response sent to client

Questions while tracing:

  • If the selected backend times out, what happens next?
  • Can you see the backend server’s internal port in any response headers?
  • What if backend 8001 returns 500—does Nginx try another backend?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What’s the difference between proxy_pass http://backend and proxy_pass http://backend/?” (trailing slash matters!)
  2. “How would you configure Nginx to retry a request on a different backend if one returns 502?”
  3. “Explain the upstream block and what least_conn does.”
  4. “How do you preserve the client’s real IP when using a reverse proxy?”
  5. “What’s the difference between proxy_connect_timeout and proxy_read_timeout?”
  6. “How would you implement rate limiting per client IP in Nginx?”
  7. “What happens to active connections when you reload Nginx configuration?”

Hints in Layers

Hint 1: Starting Point Create an upstream block with your backend servers, then use proxy_pass in a location block to forward requests to that upstream.

Hint 2: Configuration Structure

upstream backends {
    least_conn;
    server 127.0.0.1:8001;
    server 127.0.0.1:8002;
    server 127.0.0.1:8003;
}

server {
    listen 80;
    location / {
        proxy_pass http://backends;
    }
}

Hint 3: Essential Proxy Headers Add proxy_set_header Host $host;, proxy_set_header X-Real-IP $remote_addr;, and proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; to preserve client info.
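
Folding Hint 3 into the location block from Hint 2 looks roughly like this:

location / {
    proxy_pass http://backends;
    proxy_set_header Host              $host;
    proxy_set_header X-Real-IP         $remote_addr;
    proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;   # lets backends detect HTTPS
}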

Hint 4: Debugging Use curl -v to see all headers, check /var/log/nginx/error.log for upstream errors, and add $upstream_addr to your access log format to see which backend handled each request.


Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Nginx configuration | “Nginx HTTP Server” by Nedelcu | Ch. 3-6 |
| Load balancing concepts | “Designing Data-Intensive Applications” | Ch. 6 |
| Proxy patterns | “Building Microservices” by Newman | Ch. 6 |
| Failure handling | “Release It!” by Nygard | Ch. 5 |
| HTTP headers | “HTTP: The Definitive Guide” | Ch. 5 |

Learning milestones:

  1. Basic proxy working → You understand proxy_pass and location matching
  2. Load balancing distributes traffic → You understand upstream blocks and algorithms
  3. Health checks working → You understand failure detection and recovery
  4. Caching reducing backend load → You understand cache keys and invalidation


Project 3: Build an API Rate Limiter with HAProxy

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: HAProxy Configuration
  • Alternative Programming Languages: Lua (for custom logic), Python (backend)
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Load Balancing, Rate Limiting, TCP/HTTP
  • Software or Tool: HAProxy
  • Main Book: HAProxy Starter Guide & Configuration Manual (official documentation from the HAProxy project, created by Willy Tarreau)

What you’ll build: An HAProxy configuration that provides sophisticated rate limiting for an API—per-IP limits, per-API-key limits, sliding window tracking, and graceful degradation with 429 responses. You’ll also implement connection queuing and tarpit behavior for abusers.

Why it teaches web infrastructure: HAProxy is the gold standard for raw load balancing performance. It thinks in terms of TCP connections and HTTP transactions in a way that’s more explicit than Nginx. Building a rate limiter forces you to understand stick tables, ACLs, and the request/response lifecycle at a deep level.

Core challenges you’ll face:

  • Understanding stick tables → maps to distributed state tracking at the proxy layer
  • Writing ACLs (Access Control Lists) → maps to request matching and routing decisions
  • Configuring rate limiting with counters → maps to sliding window algorithms
  • Implementing connection queuing → maps to handling traffic spikes gracefully
  • Layer 4 vs Layer 7 load balancing → maps to when to inspect HTTP vs raw TCP

Key Concepts:

  • Stick Tables: HAProxy official documentation - “Using stick tables”
  • Rate Limiting Algorithms: “Designing Data-Intensive Applications” Ch. 4 - Kleppmann
  • ACL Logic: “HAProxy Starter Guide” - haproxy.com
  • Token Bucket vs Leaky Bucket: Computer networking textbooks, algorithm references

Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Basic load balancing concepts, understanding of HTTP, Project 2 concepts


Real World Outcome

You’ll have an HAProxy instance that protects your API from abuse. Legitimate users get smooth service, while abusers get rate-limited and eventually blocked. You can see real-time statistics on who’s hitting your API and at what rate.

Example Output:

# Normal requests work fine
$ for i in {1..5}; do curl -s -o /dev/null -w "%{http_code} " http://api.example.com/data; done
200 200 200 200 200

# Exceed rate limit (10 requests per second per IP)
$ for i in {1..20}; do curl -s -o /dev/null -w "%{http_code} " http://api.example.com/data; done
200 200 200 200 200 200 200 200 200 200 429 429 429 429 429 429 429 429 429 429

# Check the rate limit headers
$ curl -I http://api.example.com/data
HTTP/1.1 200 OK
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 7
X-RateLimit-Reset: 1703260000

# After exceeding limit
$ curl -I http://api.example.com/data
HTTP/1.1 429 Too Many Requests
Retry-After: 3
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 0
Content-Type: application/json

{"error": "rate_limit_exceeded", "retry_after": 3}

# Check HAProxy stats to see rate limiting in action
$ curl http://localhost:9000/stats
# Frontend: api_frontend
#   Rate limited requests: 1,247
#   Total requests: 15,892
#   Current connections: 45
#
# Stick table: per_ip_rates
#   192.168.1.100: http_req_rate(10s)=12, http_err_rate(10s)=0
#   192.168.1.101: http_req_rate(10s)=3, http_err_rate(10s)=0
#   10.0.0.50: http_req_rate(10s)=150 [BLOCKED]

# Show stick table contents
$ echo "show table per_ip_rates" | socat stdio /var/run/haproxy.sock
# table: per_ip_rates, type: ip, size:1048576, used:3
0x1234: key=192.168.1.100 use=1 exp=9850 http_req_rate(10000)=12
0x1235: key=192.168.1.101 use=1 exp=9900 http_req_rate(10000)=3
0x1236: key=10.0.0.50 use=1 exp=9500 http_req_rate(10000)=150

The Core Question You’re Answering

“How do you protect an API from abuse while ensuring legitimate users never notice the protection is there?”

Before writing configuration, understand that rate limiting is a balancing act. Too aggressive and you block legitimate traffic spikes. Too lenient and you don’t protect against abuse. The best rate limiters are invisible to normal users.


Concepts You Must Understand First

Stop and research these before configuring:

  1. Rate Limiting Algorithms
    • What’s the difference between token bucket and leaky bucket?
    • How does a sliding window differ from fixed windows?
    • What are the trade-offs of each approach?
    • Book Reference: “Designing Data-Intensive Applications” Ch. 4 - Kleppmann
  2. HAProxy Architecture
    • What’s the difference between frontend, backend, and listen sections?
    • How do ACLs work and when are they evaluated?
    • What are stick tables and how do they track state?
    • Book Reference: HAProxy official documentation
  3. HTTP Response Codes for Rate Limiting
    • Why is 429 the correct response code (not 503)?
    • What headers should you include in a rate limit response?
    • What’s the Retry-After header and why does it matter?
    • Book Reference: RFC 6585 - Additional HTTP Status Codes
  4. Connection Handling
    • What’s the difference between connection rate and request rate?
    • How does HTTP keep-alive affect rate limiting?
    • What is connection queuing and when should you use it?
    • Book Reference: “TCP/IP Illustrated” Ch. 13 - Stevens

Questions to Guide Your Design

Before configuring, think through these:

  1. Rate Limit Dimensions
    • Rate limit per IP address, per API key, or both?
    • What time window (1 second, 10 seconds, 1 minute)?
    • Different limits for different endpoints?
  2. Limit Values
    • What’s the normal traffic pattern for legitimate users?
    • How much headroom for traffic spikes?
    • Should authenticated users have higher limits?
  3. Enforcement Actions
    • Return 429 immediately, or queue requests?
    • Tarpit (slow down) repeat offenders?
    • Ban IPs that consistently exceed limits?
  4. Observability
    • How will you expose rate limit stats?
    • What should the 429 response body contain?
    • How will you alert on unusual rate limit activity?

Thinking Exercise

Trace a Request Through HAProxy Rate Limiting

Map out what happens when a request arrives:

Client Request Arrives
      │
      ▼
[Frontend Receives] → Which frontend block handles this?
      │
      ▼
[ACL Evaluation] → Extract client IP, check stick table
      │
      ├── Client IP not in stick table
      │   └── Create entry, increment counter, allow
      │
      ├── Client IP in table, under limit
      │   └── Increment counter, allow
      │
      └── Client IP in table, OVER limit
          │
          ▼
      [Deny/Queue Decision]
          │
          ├── Return 429 immediately
          │
          └── Queue request, wait for rate to decrease
                    │
                    ▼
              [Timeout or slot available]
                    │
                    ├── Slot available → forward to backend
                    │
                    └── Timeout → return 503

Questions while tracing:

  • Where exactly does the counter increment happen (before or after backend response)?
  • What happens to the stick table entry after the TTL expires?
  • If you have multiple HAProxy instances, how do you share stick tables?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What’s the difference between rate limiting at Layer 4 vs Layer 7?”
  2. “How would you implement rate limiting that spans multiple HAProxy instances?”
  3. “What’s a stick table in HAProxy and how does it differ from a session table?”
  4. “How do you rate limit by API key instead of IP address?”
  5. “What’s the difference between http_req_rate and conn_rate counters?”
  6. “How would you implement exponential backoff for repeat offenders?”
  7. “What happens to in-flight requests when you reload HAProxy configuration?”

Hints in Layers

Hint 1: Starting Point Create a stick table in your frontend with stick-table type ip size 1m expire 10s store http_req_rate(10s). This tracks request rate per IP over a 10-second sliding window.

Hint 2: Tracking and Limiting Use http-request track-sc0 src to track the source IP, then http-request deny deny_status 429 if { sc_http_req_rate(0) gt 10 } to limit to 10 requests per 10 seconds.
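
Combining Hints 1 and 2 into a frontend gives a sketch like the one below; section names, ports, and the limit of 10 are illustrative:

frontend api_frontend
    bind :80
    stick-table type ip size 1m expire 10s store http_req_rate(10s)
    http-request track-sc0 src
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 10 }
    default_backend api_servers

backend api_servers
    server app1 127.0.0.1:8001 check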

Hint 3: Adding Headers Use http-response set-header X-RateLimit-Limit 10 to advertise the limit. Computing X-RateLimit-Remaining means subtracting the tracked rate from the limit with converters (for example %[sc0_http_req_rate,sub(10),neg]); note that sc0_http_req_rate takes no time argument, since the window comes from the stick table definition.

Hint 4: Debugging Enable the stats page with stats enable, use show stat on the admin socket, and check show table <table_name> to see current stick table contents.


Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| HAProxy fundamentals | HAProxy Starter Guide | haproxy.com |
| Stick tables | HAProxy Documentation | “Using stick tables” |
| Rate limiting algorithms | “Designing Data-Intensive Applications” | Ch. 4 |
| TCP connection handling | “TCP/IP Illustrated, Vol. 1” | Ch. 13 |
| HTTP status codes | RFC 6585 | Section 4 |

Learning milestones:

  1. Basic rate limiting works → You understand stick tables and ACLs
  2. Per-endpoint limits → You understand ACL chaining and conditions
  3. Proper 429 responses with headers → You understand HTTP response manipulation
  4. Stats show rate limiting activity → You understand HAProxy observability


Project 4: Set Up Automatic HTTPS with Caddy

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: Caddyfile (Caddy Configuration)
  • Alternative Programming Languages: JSON (Caddy admin API), Go (Caddy plugins)
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 1: Beginner
  • Knowledge Area: Web Servers, TLS/SSL, ACME Protocol
  • Software or Tool: Caddy
  • Main Book: Caddy Official Documentation (caddyserver.com/docs)

What you’ll build: A Caddy server that automatically obtains and renews TLS certificates for multiple domains, serves as a reverse proxy to backend applications, handles HTTP to HTTPS redirects, and provides clean, minimal configuration compared to Apache/Nginx.

Why it teaches web infrastructure: Caddy revolutionized web servers by making HTTPS automatic and default. Understanding how it works teaches you the ACME protocol (Let’s Encrypt), the importance of sensible defaults, and what modern web server design looks like when you start fresh without 25 years of legacy.

Core challenges you’ll face:

  • Understanding ACME certificate challenges → maps to how Let’s Encrypt verifies domain ownership
  • Configuring Caddyfile syntax → maps to declarative vs imperative configuration
  • Setting up reverse proxy with automatic TLS → maps to zero-config HTTPS termination
  • Using Caddy’s API for dynamic configuration → maps to runtime configuration changes
  • Handling wildcard certificates with DNS challenge → maps to DNS-01 ACME challenge

Key Concepts:

  • ACME Protocol: RFC 8555 - Automatic Certificate Management Environment
  • TLS/SSL: “Serious Cryptography” Ch. 14 - Aumasson
  • Caddy Configuration: Caddy official documentation - caddyserver.com/docs
  • DNS Challenges: Let’s Encrypt documentation

Difficulty: Beginner. Time estimate: Weekend. Prerequisites: Basic understanding of DNS, a domain name you control.


Real World Outcome

You’ll have a Caddy server that automatically manages HTTPS for any domain you point at it. No certbot, no cron jobs, no certificate expiration emergencies at 3 AM. Just point your DNS and Caddy handles everything.

Example Output:

# Simple Caddyfile - that's literally all you need for HTTPS
$ cat /etc/caddy/Caddyfile
example.com {
    respond "Hello, HTTPS World!"
}

api.example.com {
    reverse_proxy localhost:8080
}

# Start Caddy
$ sudo systemctl start caddy

# Check certificate was automatically obtained
$ curl -v https://example.com 2>&1 | grep -E "(subject|issuer|expire)"
*  subject: CN=example.com
*  issuer: C=US; O=Let's Encrypt; CN=R3
*  expire date: Mar 22 00:00:00 2024 GMT

# HTTP automatically redirects to HTTPS
$ curl -I http://example.com
HTTP/1.1 308 Permanent Redirect
Location: https://example.com/

# Caddy serves HTTPS with modern TLS settings
$ curl -I https://example.com
HTTP/2 200
server: Caddy
content-type: text/plain; charset=utf-8
alt-svc: h3=":443"; ma=2592000

# Check certificate status via admin API
$ curl localhost:2019/config/apps/tls/automation/policies
[
  {
    "subjects": ["example.com", "api.example.com"],
    "issuers": [{"module": "acme"}],
    "on_demand": false
  }
]

# Certificate automatically renewed (check logs)
$ journalctl -u caddy | grep -i certificate
Dec 20 03:00:00 caddy[1234]: obtained certificate for example.com
Dec 20 03:00:01 caddy[1234]: certificate for example.com is valid for 89 more days

The Core Question You’re Answering

“Why has obtaining and managing TLS certificates historically been such a pain, and how does Caddy make it completely invisible?”

Before configuring, understand that HTTPS was once a luxury—expensive certificates, complex installation, manual renewals. Let’s Encrypt and ACME changed everything, but someone had to build a server that actually used this properly. That was Caddy.


Concepts You Must Understand First

Stop and research these before configuring:

  1. The ACME Protocol
    • How does Let’s Encrypt verify you own a domain?
    • What’s the difference between HTTP-01 and DNS-01 challenges?
    • Why do certificates need to be renewed every 90 days?
    • Book Reference: RFC 8555, Let’s Encrypt documentation
  2. TLS Certificate Chain
    • What is a root CA vs intermediate CA vs end-entity certificate?
    • Why do browsers trust Let’s Encrypt certificates?
    • What happens if a certificate in the chain expires?
    • Book Reference: “Serious Cryptography” Ch. 14 - Aumasson
  3. DNS Requirements for ACME
    • Why must ports 80 and 443 be accessible for HTTP-01?
    • When would you use DNS-01 instead?
    • What’s the role of CAA records?
    • Book Reference: “DNS and BIND” Ch. 4 - Albitz & Liu
  4. Modern TLS Best Practices
    • What TLS versions should you support (1.2, 1.3)?
    • Why is TLS 1.0/1.1 deprecated?
    • What cipher suites should you enable?
    • Book Reference: Mozilla SSL Configuration Generator

Questions to Guide Your Design

Before configuring, think through these:

  1. Domain Setup
    • Which domains will you serve?
    • Are DNS records already pointing to your server?
    • Do you need wildcard certificates?
  2. Challenge Type
    • Can your server receive connections on ports 80 and 443?
    • If behind a corporate firewall, will you use DNS-01?
    • Which DNS provider do you use (for DNS-01 plugin)?
  3. Reverse Proxy Setup
    • What backends will Caddy proxy to?
    • What headers need to be forwarded?
    • Will you use Caddy’s built-in load balancing?
  4. Admin API
    • Will you enable the admin API for dynamic config?
    • How will you secure the admin endpoint?
    • Will you use the JSON config or Caddyfile?

Thinking Exercise

Trace the Certificate Issuance Process

Map out what happens when Caddy starts with a new domain:

Caddy Starts
      │
      ▼
[Parse Caddyfile] → Discover domain names that need certificates
      │
      ▼
[Check Certificate Storage] → Do valid certificates already exist?
      │
      ├── Yes → Load certificates, start serving
      │
      └── No → Need to obtain certificates
              │
              ▼
        [Contact ACME Server] → Request certificate from Let's Encrypt
              │
              ▼
        [Create Challenge] → Let's Encrypt sends challenge
              │
              ├── HTTP-01: "Put this token at /.well-known/acme-challenge/xxx"
              │     └── Caddy automatically serves the token
              │
              └── DNS-01: "Create TXT record _acme-challenge.example.com"
                    └── Caddy updates DNS via provider API
              │
              ▼
        [Validation] → Let's Encrypt checks challenge
              │
              ▼
        [Certificate Issued] → Caddy stores certificate
              │
              ▼
        [Start HTTPS] → Begin serving with new certificate
              │
              ▼
        [Schedule Renewal] → Check renewal ~30 days before expiry

Questions while tracing:

  • What happens if the HTTP-01 challenge fails?
  • Where does Caddy store the certificates on disk?
  • How does Caddy handle the chicken-and-egg problem of needing to serve HTTP to get HTTPS?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What’s the difference between HTTP-01 and DNS-01 ACME challenges?”
  2. “How does Caddy handle certificate renewal automatically?”
  3. “Why does Caddy require no configuration for HTTPS by default?”
  4. “What’s the admin API in Caddy and how would you use it for zero-downtime config updates?”
  5. “How would you configure Caddy for a wildcard certificate?”
  6. “What happens if Let’s Encrypt rate limits your domain?”
  7. “How does Caddy’s configuration syntax differ from Nginx’s, and what are the trade-offs?”

Hints in Layers

Hint 1: Starting Point Create a simple Caddyfile with just example.com on a line followed by respond "Hello". That’s it—Caddy handles HTTPS automatically.

Hint 2: Reverse Proxy Use reverse_proxy localhost:8080 inside a site block to proxy to a backend. Caddy automatically handles WebSocket, HTTP/2, and header forwarding.

Hint 3: DNS Challenge for Wildcard For *.example.com, you need DNS-01. Use tls { dns cloudflare {env.CF_API_TOKEN} } with the appropriate DNS provider plugin.

Hint 4: Debugging Use caddy validate --config Caddyfile to check syntax, caddy adapt --config Caddyfile to see the JSON equivalent, and journalctl -u caddy -f to watch logs.
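
A Caddyfile that combines the hints above might look like the sketch below. The domains, backend ports, and the Cloudflare DNS plugin are assumptions for illustration; the wildcard block only works in a Caddy build that includes the matching DNS provider module.

# Caddyfile (sketch) - automatic HTTPS, reverse proxy, and a DNS-01 wildcard
example.com {
    respond "Hello, HTTPS World!"
}

api.example.com {
    # reverse_proxy forwards WebSockets, HTTP/2, and the standard X-Forwarded-* headers
    reverse_proxy localhost:8080
}

*.example.com {
    # wildcard certificates require the DNS-01 challenge
    tls {
        dns cloudflare {env.CF_API_TOKEN}
    }
    reverse_proxy localhost:9090
}

Check the file with caddy validate --config Caddyfile, then caddy adapt --config Caddyfile to see the JSON configuration it compiles to.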


Books That Will Help

Topic | Book | Chapter
Caddy configuration | Caddy Official Docs | caddyserver.com/docs
ACME protocol | RFC 8555 | All
TLS fundamentals | “Serious Cryptography” by Aumasson | Ch. 14
DNS for ACME | “DNS and BIND” by Albitz & Liu | Ch. 4
Modern TLS config | Mozilla SSL Config Generator | Web

Learning milestones:

  1. Automatic HTTPS working → You understand ACME and HTTP-01
  2. Multiple domains with reverse proxy → You understand Caddyfile syntax
  3. Wildcard certificate via DNS-01 → You understand DNS challenges
  4. Dynamic config via admin API → You understand zero-downtime updates


Project 5: Deploy a Java Application with Tomcat, Jetty, and Undertow

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: Java
  • Alternative Programming Languages: Kotlin, Scala, Groovy
  • Coolness Level: Level 2: Practical but Forgettable
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Application Servers, Java Servlets, HTTP
  • Software or Tool: Apache Tomcat, Eclipse Jetty, JBoss Undertow
  • Main Book: “Tomcat: The Definitive Guide” by Jason Brittain & Ian Darwin

What you’ll build: The same Java web application deployed on all three servers (Tomcat, Jetty, Undertow), with performance benchmarks comparing them. You’ll configure thread pools, connection settings, and observe how each server handles load differently.

Why it teaches web infrastructure: These three servers represent different philosophies for running Java applications. Tomcat is the reference implementation (stable, familiar). Jetty is modular and embeddable. Undertow is high-performance and non-blocking. Understanding their differences teaches you what trade-offs matter in application server selection.

Core challenges you’ll face:

  • Understanding the Servlet specification → maps to the contract between app and container
  • Configuring thread pools → maps to connection handling capacity
  • Tuning connection timeouts → maps to resource management under load
  • Comparing blocking vs non-blocking I/O → maps to architectural differences
  • Deploying as WAR vs embedded → maps to deployment model trade-offs

Key Concepts:

  • Servlet Specification: Java Servlet 4.0 Specification
  • Thread Pool Tuning: “Java Concurrency in Practice” - Goetz
  • Tomcat Architecture: “Tomcat: The Definitive Guide” - Brittain & Darwin
  • Non-blocking I/O: “The Linux Programming Interface” Ch. 63 - Kerrisk

Difficulty: Intermediate. Time estimate: 1-2 weeks. Prerequisites: Basic Java, understanding of HTTP, familiarity with Maven/Gradle.


Real World Outcome

You’ll have the same application running on three different servers, with measurable performance data showing their differences. You’ll understand when to choose each one.

Example Output:

# Deploy same WAR file to each server
$ cp myapp.war /opt/tomcat/webapps/
$ cp myapp.war /opt/jetty/webapps/
$ java -jar undertow-app.jar  # Undertow embedded

# Start each server and verify
$ curl http://localhost:8080/myapp/api/health  # Tomcat
{"status": "UP", "server": "Apache Tomcat/10.1.16"}

$ curl http://localhost:8081/myapp/api/health  # Jetty
{"status": "UP", "server": "Jetty/11.0.18"}

$ curl http://localhost:8082/api/health  # Undertow
{"status": "UP", "server": "Undertow/2.3.10"}

# Run performance benchmark
$ wrk -t12 -c400 -d30s http://localhost:8080/myapp/api/data

Tomcat Results:
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    12.34ms    8.92ms  245.67ms   89.45%
    Req/Sec     2.45k   312.45     3.12k    72.00%
  Requests/sec: 29,234.12
  Transfer/sec:      5.23MB

Jetty Results:
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    10.78ms    7.23ms  198.34ms   91.23%
    Req/Sec     2.78k   287.12     3.45k    75.00%
  Requests/sec: 33,127.89
  Transfer/sec:      5.92MB

Undertow Results:
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     8.92ms    5.67ms  156.78ms   93.45%
    Req/Sec     3.12k   245.67     3.89k    78.00%
  Requests/sec: 37,456.23
  Transfer/sec:      6.69MB

# Monitor thread pool usage
$ jconsole  # Connect to each JVM
# Tomcat: 200 threads in http-nio pool
# Jetty: 50 threads in QueuedThreadPool
# Undertow: 8 worker threads (non-blocking)

# Memory comparison
$ jstat -gc <tomcat_pid> 1000
# Tomcat heap: 512MB used, 1GB allocated
$ jstat -gc <jetty_pid> 1000
# Jetty heap: 256MB used, 512MB allocated
$ jstat -gc <undertow_pid> 1000
# Undertow heap: 128MB used, 256MB allocated

The Core Question You’re Answering

“Why do we have three popular Java application servers, and what does each one do differently that matters for my application?”

Before deploying anything, understand that all three implement the same Servlet specification—your application code is portable between them. The differences are in HOW they run that code, and those differences matter enormously at scale.
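
To see that portability concretely, the same servlet class can be packaged into a WAR and deployed to any of the three containers unchanged. The sketch below uses the jakarta.servlet namespace that Tomcat 10 and Jetty 11 (the versions shown in the example output) expect; on Servlet 4.0 containers the imports are javax.servlet instead. Class and path names are illustrative.

// HealthServlet.java (sketch) - identical code runs on Tomcat, Jetty, or Undertow
import java.io.IOException;
import jakarta.servlet.annotation.WebServlet;
import jakarta.servlet.http.HttpServlet;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;

@WebServlet("/api/health")
public class HealthServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        resp.setContentType("application/json");
        // getServerInfo() reports which container is actually running the code,
        // e.g. "Apache Tomcat/10.1.x" or "jetty/11.0.x"
        String server = getServletContext().getServerInfo();
        resp.getWriter().write("{\"status\": \"UP\", \"server\": \"" + server + "\"}");
    }
}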


Concepts You Must Understand First

Stop and research these before deploying:

  1. The Servlet Specification
    • What contract does a Servlet container provide?
    • What’s the difference between servlets, filters, and listeners?
    • What is the servlet lifecycle (init, service, destroy)?
    • Book Reference: Java Servlet 4.0 Specification
  2. Thread Pool Models
    • What is a thread pool and why does size matter?
    • What’s the difference between bounded and unbounded queues?
    • How do thread pools affect memory and context switching?
    • Book Reference: “Java Concurrency in Practice” Ch. 8 - Goetz
  3. Blocking vs Non-Blocking I/O
    • Why does one thread per connection limit scalability?
    • How does NIO (New I/O) change the model?
    • What’s the difference between BIO, NIO, and APR in Tomcat?
    • Book Reference: “The Linux Programming Interface” Ch. 63 - Kerrisk
  4. WAR vs Embedded Deployment
    • What’s in a WAR file and how does a container deploy it?
    • Why do microservices often embed the server in the JAR?
    • What are the operational trade-offs?
    • Book Reference: “Tomcat: The Definitive Guide” Ch. 5 - Brittain

Questions to Guide Your Design

Before deploying, think through these:

  1. Thread Pool Configuration
    • How many concurrent users do you expect?
    • What’s the average request duration?
    • How many threads can your server handle (CPU, memory)?
  2. Connection Settings
    • What connection timeout is appropriate?
    • Should you use HTTP keep-alive?
    • How many connections can your database handle?
  3. Deployment Model
    • Will you deploy WAR files or embedded JARs?
    • How will you handle configuration per environment?
    • How will you deploy updates (rolling, blue-green)?
  4. Monitoring
    • What JMX metrics will you expose?
    • How will you detect thread pool exhaustion?
    • How will you profile under load?

Thinking Exercise

Compare Request Handling Models

Map out how each server handles an HTTP request:

TOMCAT (NIO Connector - Default)
┌─────────────────────────────────────────────────────────────────────┐
│ Acceptor Thread          Worker Thread Pool (200 threads default)   │
│      │                            │                                 │
│      ▼                            ▼                                 │
│ [Accept Connection] ───► [Assign to Worker] ───► [Run Servlet]     │
│                                   │                                 │
│                                   └─► [Block on I/O if needed]     │
└─────────────────────────────────────────────────────────────────────┘

JETTY (QueuedThreadPool)
┌─────────────────────────────────────────────────────────────────────┐
│ Acceptor              SelectorManager        Thread Pool           │
│      │                      │                     │                 │
│      ▼                      ▼                     ▼                 │
│ [Accept] ───► [Register with Selector] ───► [Dispatch to Thread]   │
│                      │                                              │
│                      └─► [Selector multiplexes many connections]   │
└─────────────────────────────────────────────────────────────────────┘

UNDERTOW (XNIO - Non-blocking)
┌─────────────────────────────────────────────────────────────────────┐
│ I/O Threads (few)              Worker Threads (for blocking ops)   │
│      │                                  │                          │
│      ▼                                  ▼                          │
│ [Handle connection via NIO] ───► [Dispatch to worker only if       │
│ [Run most request handling]      needed for blocking operation]    │
│                                                                     │
│ Most requests NEVER leave I/O threads - pure non-blocking          │
└─────────────────────────────────────────────────────────────────────┘

Questions while tracing:

  • Why does Undertow need fewer threads for the same load?
  • What happens in Tomcat when all 200 worker threads are busy?
  • How does Jetty’s selector pattern reduce thread count?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What’s the difference between Tomcat’s BIO, NIO, and APR connectors?”
  2. “How would you size a thread pool for a web application?”
  3. “Why might you choose Jetty over Tomcat for embedded deployments?”
  4. “What makes Undertow faster than Tomcat in benchmarks?”
  5. “How do you monitor thread pool utilization in a Java application server?”
  6. “What’s the difference between WAR deployment and Spring Boot’s fat JAR approach?”
  7. “How would you configure connection timeouts to prevent resource exhaustion?”

Hints in Layers

Hint 1: Starting Point Create a simple Spring Boot application with @RestController. Spring Boot can use any of these servers—just swap the starter dependency.

Hint 2: Server Configuration For Tomcat: edit server.xml or use application.properties. For Jetty: use jetty.xml or programmatic config. For Undertow: configure via application.properties in Spring Boot.

Hint 3: Key Settings to Tune Focus on: maxThreads, acceptCount (queue size), connectionTimeout, and keepAliveTimeout. These have the biggest impact on behavior under load.

Hint 4: Benchmarking Use wrk or ab (Apache Bench) for load testing. Use jconsole or VisualVM for thread monitoring. Use jstat for GC observation.
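
As a concrete sketch of hints 1-3: in a Spring Boot project the connector settings from hint 3 map to application.properties keys. The values below are illustrative starting points under those assumptions, not recommendations; measure before copying.

# application.properties (sketch) - embedded Tomcat connector tuning in Spring Boot
server.port=8080
server.tomcat.threads.max=200
server.tomcat.accept-count=100
server.tomcat.connection-timeout=20s
server.tomcat.keep-alive-timeout=60s

# When the Jetty or Undertow starter is on the classpath instead, the analogous keys
# are server.jetty.threads.max and server.undertow.threads.worker (check the Spring Boot docs)

To switch containers in Spring Boot, exclude spring-boot-starter-tomcat from spring-boot-starter-web and add spring-boot-starter-jetty or spring-boot-starter-undertow; the application code stays the same.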


Books That Will Help

Topic | Book | Chapter
Tomcat architecture | “Tomcat: The Definitive Guide” | Ch. 1-4
Thread pool tuning | “Java Concurrency in Practice” | Ch. 8
Servlet specification | Java Servlet 4.0 Spec | All
Performance testing | “Java Performance” by Scott Oaks | Ch. 2-3
Non-blocking I/O | “The Linux Programming Interface” | Ch. 63

Learning milestones:

  1. Same app runs on all three → You understand servlet portability
  2. Thread pools configured correctly → You understand capacity planning
  3. Benchmark shows performance differences → You understand architectural trade-offs
  4. Understand when to use each → You can make informed technology choices


Project 6: Build a Programmable API Gateway with OpenResty (Nginx + Lua)

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: Lua
  • Alternative Programming Languages: C (Nginx modules), JavaScript (Node.js alternatives)
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 3: Advanced
  • Knowledge Area: API Gateway, Edge Computing, Scripting
  • Software or Tool: OpenResty (Nginx + LuaJIT)
  • Main Book: “Programming OpenResty” by Yichun Zhang (creator)

What you’ll build: A custom API gateway using OpenResty that performs JWT authentication, request transformation, rate limiting with Redis, response caching, and request/response logging—all in Lua code running inside Nginx at near-C performance.

Why it teaches web infrastructure: OpenResty represents the ultimate power-user configuration of Nginx. Instead of static configuration files, you write Lua code that executes at each phase of request processing. This teaches you the HTTP lifecycle at a level most developers never see, and shows you how tools like Kong are built under the hood.

Core challenges you’ll face:

  • Understanding Nginx request phases → maps to when code executes during request processing
  • Writing non-blocking Lua code → maps to event-driven programming at the edge
  • Connecting to Redis from Lua → maps to shared state in distributed systems
  • JWT verification in Lua → maps to authentication at the edge
  • Request/response transformation → maps to API gateway patterns

Key Concepts:

  • Nginx Request Phases: OpenResty documentation - phases guide
  • LuaJIT: “Programming in Lua” - Roberto Ierusalimschy
  • API Gateway Patterns: “Building Microservices” Ch. 11 - Sam Newman
  • JWT Authentication: RFC 7519, jwt.io

Difficulty: Advanced. Time estimate: 2-4 weeks. Prerequisites: Basic Nginx, basic Lua syntax, understanding of HTTP, Redis basics.


Real World Outcome

You’ll have an API gateway that authenticates requests, transforms headers, applies rate limits, and logs everything—running at tens of thousands of requests per second on a single core.

Example Output:

# Request without JWT token - rejected at the edge
$ curl -I http://api.example.com/v1/users
HTTP/1.1 401 Unauthorized
Content-Type: application/json
WWW-Authenticate: Bearer realm="api"

{"error": "missing_token", "message": "Authorization header required"}

# Request with valid JWT - authenticated and routed
$ curl -H "Authorization: Bearer eyJhbGciOiJIUzI1NiIs..." http://api.example.com/v1/users
HTTP/1.1 200 OK
X-Request-ID: abc123-def456
X-Authenticated-User: user@example.com
X-Rate-Limit-Remaining: 97

{"users": [...]}

# Rate limit hit
$ for i in {1..150}; do curl -s -H "Authorization: Bearer ..." -o /dev/null -w "%{http_code}\n" http://api.example.com/v1/users; done
200
200
... (first 100 succeed)
429
429
... (remaining rejected)

# Check gateway metrics
$ curl http://localhost:9145/metrics
# HELP gateway_requests_total Total requests processed
# TYPE gateway_requests_total counter
gateway_requests_total{status="2xx"} 15234
gateway_requests_total{status="4xx"} 892
gateway_requests_total{status="5xx"} 12

# HELP gateway_latency_seconds Request latency in seconds
# TYPE gateway_latency_seconds histogram
gateway_latency_seconds_bucket{le="0.001"} 12456
gateway_latency_seconds_bucket{le="0.01"} 14892

# View transformed request (backend sees modified headers)
$ # Backend logs show:
# X-User-ID: 12345
# X-User-Email: user@example.com
# X-Request-ID: abc123-def456
# (original Authorization header removed)

The Core Question You’re Answering

“How do you add custom logic to an Nginx/reverse proxy without sacrificing its legendary performance?”

Before writing any Lua, understand that OpenResty embeds LuaJIT—a just-in-time compiled Lua—directly into Nginx workers. Your code runs at nearly native speed, in the same process as Nginx, with access to the same event loop. This is how you get programmability without the overhead of a separate service.


Concepts You Must Understand First

Stop and research these before coding:

  1. Nginx Request Processing Phases
    • What are the 11 phases of Nginx request processing?
    • Which phases can you hook with Lua (access, content, log, etc.)?
    • Why does the order of phases matter?
    • Book Reference: OpenResty documentation - “Nginx phases”
  2. Non-blocking I/O in OpenResty
    • Why must all I/O in OpenResty be non-blocking?
    • What happens if you use blocking Lua libraries?
    • How does ngx.socket.tcp differ from Lua’s standard sockets?
    • Book Reference: “Programming OpenResty” Ch. 3
  3. LuaJIT vs Standard Lua
    • Why is LuaJIT faster than standard Lua?
    • What’s the FFI (Foreign Function Interface)?
    • What are the memory limitations of LuaJIT?
    • Book Reference: LuaJIT documentation
  4. JWT Structure and Verification
    • What are the three parts of a JWT?
    • How is the signature verified?
    • What claims should you validate (exp, iss, aud)?
    • Book Reference: RFC 7519, jwt.io introduction

Questions to Guide Your Design

Before coding, think through these:

  1. Authentication Strategy
    • Where will you validate JWTs (access phase)?
    • How will you handle expired tokens?
    • How will you pass user information to backends?
  2. Rate Limiting Architecture
    • Will you use Redis or Nginx shared memory?
    • Per-user, per-IP, or per-API-key limits?
    • What happens when Redis is unavailable?
  3. Request Transformation
    • What headers will you add/remove?
    • How will you handle request body transformation?
    • How will you version your API transformations?
  4. Observability
    • What metrics will you expose?
    • How will you correlate requests across services (request ID)?
    • What will you log and where?

Thinking Exercise

Trace Request Through OpenResty Phases

Map out what happens in each phase:

Request Arrives
      │
      ▼
[set_by_lua] → Set variables from Lua
      │
      ▼
[rewrite_by_lua] → URL rewriting, early redirects
      │
      ▼
[access_by_lua] → Authentication, authorization, rate limiting
      │                   ├── JWT validation
      │                   ├── Redis rate limit check
      │                   └── Return 401/403/429 if failed
      │
      ▼
[content_by_lua] → Generate response OR...
      │
      ▼
[proxy_pass] → Forward to upstream
      │
      ▼
[header_filter_by_lua] → Modify response headers
      │
      ▼
[body_filter_by_lua] → Modify response body (streaming)
      │
      ▼
[log_by_lua] → Custom logging (non-blocking)

Questions while tracing:

  • If authentication fails in access_by_lua, do later phases run?
  • Can you modify the request body before proxying?
  • How do you share data between phases (ngx.ctx)?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What’s the difference between content_by_lua and content_by_lua_block?”
  2. “How would you implement distributed rate limiting with OpenResty?”
  3. “Why can’t you use standard Lua socket libraries in OpenResty?”
  4. “How does OpenResty share data between requests (shared dictionaries)?”
  5. “What’s the cosocket and why is it important?”
  6. “How would you handle JWT refresh tokens in an API gateway?”
  7. “What are the memory limitations of LuaJIT and how do you work around them?”

Hints in Layers

Hint 1: Starting Point Install OpenResty, create a basic nginx.conf with content_by_lua_block { ngx.say("Hello from Lua!") } to verify everything works.

Hint 2: JWT Library Use lua-resty-jwt for JWT verification. Install via opm get SkyLothar/lua-resty-jwt. The library handles all the cryptographic heavy lifting.

Hint 3: Redis Connection Use lua-resty-redis for non-blocking Redis. Always use connection pooling with set_keepalive() to avoid connection overhead.

Hint 4: Debugging Use ngx.log(ngx.ERR, "message") for logging, check /usr/local/openresty/nginx/logs/error.log. Enable lua_code_cache off; during development.
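
Combined, the hints sketch out an access phase like the one below. The JWT secret, the Redis address, the backend address, and the 100-requests-per-10-seconds limit are assumptions, and error handling is trimmed for brevity.

# nginx.conf fragment (sketch) - JWT check plus Redis-backed rate limit in access_by_lua
location /v1/ {
    access_by_lua_block {
        local jwt = require "resty.jwt"

        -- 1. authenticate: expect "Authorization: Bearer <token>"
        local auth = ngx.req.get_headers()["Authorization"]
        local token = auth and auth:match("Bearer%s+(.+)")
        if not token then
            ngx.status = ngx.HTTP_UNAUTHORIZED
            ngx.say('{"error": "missing_token", "message": "Authorization header required"}')
            return ngx.exit(ngx.HTTP_UNAUTHORIZED)
        end

        local verified = jwt:verify("my-jwt-secret", token)   -- secret is an assumption
        if not verified.verified then
            return ngx.exit(ngx.HTTP_UNAUTHORIZED)
        end

        -- 2. rate limit per subject using Redis over non-blocking cosockets
        local redis = require "resty.redis"
        local red = redis:new()
        red:set_timeout(100)  -- milliseconds
        local ok = red:connect("127.0.0.1", 6379)
        if ok then
            local key = "rl:" .. (verified.payload.sub or ngx.var.remote_addr)
            local count = red:incr(key)
            if count == 1 then red:expire(key, 10) end
            red:set_keepalive(10000, 100)  -- return the connection to the pool
            if count and count > 100 then
                return ngx.exit(429)
            end
        end

        -- 3. pass identity to the backend, drop the original credential
        ngx.req.set_header("X-User-Email", verified.payload.email or "")
        ngx.req.clear_header("Authorization")
    }

    proxy_pass http://127.0.0.1:8080;
}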


Books That Will Help

Topic | Book | Chapter
OpenResty fundamentals | “Programming OpenResty” | All
Lua programming | “Programming in Lua” by Ierusalimschy | Ch. 1-10
API gateway patterns | “Building Microservices” | Ch. 11
JWT authentication | RFC 7519 | All
Redis | “Redis in Action” by Carlson | Ch. 1-3

Learning milestones:

  1. Hello World from Lua → You understand basic OpenResty setup
  2. JWT authentication working → You understand access phase
  3. Redis rate limiting → You understand non-blocking I/O
  4. Full request transformation → You understand the complete gateway pattern


Project 7: Configure Envoy as a Modern Service Proxy

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: YAML (Envoy configuration)
  • Alternative Programming Languages: Go (xDS control plane), Python (control plane)
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Service Mesh, L7 Proxy, Observability
  • Software or Tool: Envoy Proxy
  • Main Book: “Istio in Action” by Christian Posta (covers Envoy deeply)

What you’ll build: An Envoy proxy configuration with traffic routing, circuit breaking, retries with exponential backoff, distributed tracing (with Jaeger), and Prometheus metrics—the complete observability stack that makes Envoy the backbone of modern service meshes.

Why it teaches web infrastructure: Envoy was designed for the microservices era. Unlike Nginx or HAProxy which evolved from simpler use cases, Envoy was built from day one with observability, dynamic configuration, and complex routing. Understanding Envoy teaches you modern infrastructure thinking.

Core challenges you’ll face:

  • Understanding Envoy’s architecture → maps to listeners, clusters, and filter chains
  • Configuring circuit breakers → maps to protecting downstream services
  • Setting up distributed tracing → maps to request correlation across services
  • Dynamic configuration with xDS → maps to control plane concepts
  • Traffic splitting and canary deployments → maps to progressive delivery

Key Concepts:

  • Envoy Architecture: Envoy documentation - “What is Envoy”
  • Circuit Breakers: “Release It!” Ch. 5 - Nygard
  • Distributed Tracing: “Distributed Systems Observability” - Sridharan
  • Service Mesh: “Istio in Action” Ch. 1-4 - Posta

Difficulty: Advanced. Time estimate: 2-4 weeks. Prerequisites: Understanding of microservices, Docker/containers, HTTP/2, gRPC basics.


Real World Outcome

You’ll have an Envoy proxy with sophisticated traffic management, automatic retries, circuit breaking, and full observability. You’ll see distributed traces across multiple services and metrics in Prometheus.

Example Output:

# Start Envoy with your configuration
$ envoy -c /etc/envoy/envoy.yaml --log-level info

# Check listeners and clusters
$ curl http://localhost:9901/listeners
[
  {
    "name": "listener_0",
    "address": {"socket_address": {"address": "0.0.0.0", "port_value": 8080}}
  }
]

$ curl http://localhost:9901/clusters
user-service::10.0.0.1:8080::cx_active::15
user-service::10.0.0.1:8080::rq_total::45234
user-service::10.0.0.2:8080::cx_active::12
user-service::10.0.0.2:8080::outlier_detection.ejected::false

# Make request with tracing
$ curl -H "x-request-id: test-trace-123" http://localhost:8080/api/users
{"users": [...]}
# Check Jaeger UI: see complete trace across services!

# Circuit breaker in action
$ # Backend service starts failing
$ curl http://localhost:8080/api/users
HTTP/1.1 503 Service Unavailable
x-envoy-overloaded: true

# Check circuit breaker stats
$ curl http://localhost:9901/stats | grep circuit
cluster.user-service.circuit_breakers.default.cx_open: 1
cluster.user-service.circuit_breakers.default.remaining_cx: 0
cluster.user-service.upstream_rq_pending_overflow: 234

# Traffic splitting (90% v1, 10% v2)
$ for i in {1..100}; do curl -s http://localhost:8080/api/version; done | sort | uniq -c
     90 {"version": "v1"}
     10 {"version": "v2"}

# Prometheus metrics endpoint
$ curl http://localhost:9901/stats/prometheus
# TYPE envoy_cluster_upstream_rq_total counter
envoy_cluster_upstream_rq_total{envoy_cluster_name="user-service",envoy_response_code="200"} 45000
envoy_cluster_upstream_rq_total{envoy_cluster_name="user-service",envoy_response_code="503"} 234

# TYPE envoy_cluster_upstream_rq_time histogram
envoy_cluster_upstream_rq_time_bucket{envoy_cluster_name="user-service",le="5"} 35000
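
The 90/10 split in the output above comes from route-level weighted clusters. A fragment of route configuration that could produce it might look like the sketch below; the cluster names and weights are assumptions.

# route_config fragment (sketch) - 90/10 traffic split between two service versions
virtual_hosts:
- name: backend
  domains: ["*"]
  routes:
  - match: { prefix: "/api/version" }
    route:
      weighted_clusters:
        clusters:
        - name: version-v1
          weight: 90
        - name: version-v2
          weight: 10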

The Core Question You’re Answering

“How do you build a proxy that can handle the complexity of microservices—automatic retries, circuit breaking, traffic splitting, and distributed tracing—all at the data plane?”

Before configuring, understand that Envoy represents a philosophy: put intelligence at the edge of each service, not in a centralized load balancer. Every service gets its own proxy (sidecar pattern) that handles all this complexity.


Concepts You Must Understand First

Stop and research these before configuring:

  1. Envoy Architecture
    • What’s the difference between listeners, clusters, and endpoints?
    • What are filter chains and why do they matter?
    • What’s the difference between downstream and upstream?
    • Book Reference: Envoy documentation - architecture overview
  2. Circuit Breakers
    • What are the three states of a circuit breaker (closed, open, half-open)?
    • When should a circuit breaker trip?
    • What’s the difference between consecutive failures and outlier detection?
    • Book Reference: “Release It!” Ch. 5 - Nygard
  3. Distributed Tracing
    • What is a trace ID and how does it propagate?
    • What’s the difference between spans and traces?
    • How does B3 propagation work?
    • Book Reference: “Distributed Systems Observability” - Sridharan
  4. xDS Protocol
    • What are LDS, RDS, CDS, and EDS?
    • Why does Envoy use dynamic configuration?
    • What’s a control plane vs data plane?
    • Book Reference: Envoy xDS documentation

Questions to Guide Your Design

Before configuring, think through these:

  1. Listener Configuration
    • What ports will Envoy listen on?
    • Will you use HTTP connection manager or TCP proxy?
    • How will you route based on path/headers?
  2. Cluster Configuration
    • How will you discover upstream endpoints?
    • What load balancing algorithm will you use?
    • What health check configuration makes sense?
  3. Resilience Settings
    • What retry policy (attempts, backoff)?
    • What circuit breaker thresholds?
    • What timeout values?
  4. Observability
    • Which tracing backend (Jaeger, Zipkin)?
    • What sampling rate for traces?
    • Which metrics are most important?

Thinking Exercise

Trace a Request Through Envoy

Map out what happens when a request arrives:

Request Arrives at Listener
      │
      ▼
[Listener Filter Chain] → TLS termination, protocol detection
      │
      ▼
[HTTP Connection Manager] → Parse HTTP, apply HTTP filters
      │
      ▼
[Route Matching] → Match request to a cluster
      │
      ▼
[Cluster Selection] → Pick an endpoint (load balancing)
      │
      ├── Health check: is endpoint healthy?
      ├── Circuit breaker: is circuit open?
      └── Outlier detection: is endpoint ejected?
      │
      ▼
[Upstream Request] → Send request to endpoint
      │
      ├── Success → Return response
      │
      └── Failure → Retry policy kicks in
              │
              ├── Retries available? → Pick different endpoint
              │
              └── No retries → Return error, trip circuit

Questions while tracing:

  • Where does distributed tracing inject the trace header?
  • If the circuit is open, what happens to the request?
  • How does Envoy decide when to eject an endpoint?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What’s the difference between a listener and a cluster in Envoy?”
  2. “How does Envoy’s outlier detection differ from traditional health checks?”
  3. “Explain how circuit breakers prevent cascade failures.”
  4. “What’s the xDS protocol and why is dynamic configuration important?”
  5. “How would you implement canary deployments with Envoy?”
  6. “What’s the difference between retries and hedging in Envoy?”
  7. “How does Envoy integrate with Prometheus for metrics?”

Hints in Layers

Hint 1: Starting Point Start with Envoy’s minimal configuration: one listener on port 8080, one cluster pointing to a backend, and HTTP connection manager.

Hint 2: Static vs Dynamic Begin with static configuration in YAML. Once that works, consider xDS for dynamic configuration. You can run a simple gRPC control plane.

Hint 3: Circuit Breaker Settings Start conservative: max_connections: 100, max_pending_requests: 100, max_requests: 100. Tune based on observed behavior.

Hint 4: Debugging Enable admin interface on port 9901. Use /config_dump to see complete config, /clusters for cluster health, /stats for all metrics.
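
For hints 1 and 3 together, a minimal static bootstrap (v3 API) might look like the sketch below; the cluster name, backend address, and thresholds are assumptions to adapt.

# envoy.yaml (sketch) - one listener, one cluster, conservative circuit breakers
admin:
  address:
    socket_address: { address: 127.0.0.1, port_value: 9901 }

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address: { address: 0.0.0.0, port_value: 8080 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: user-service }
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

  clusters:
  - name: user-service
    connect_timeout: 1s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: user-service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: 127.0.0.1, port_value: 8081 }
    circuit_breakers:
      thresholds:
      - max_connections: 100
        max_pending_requests: 100
        max_requests: 100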


Books That Will Help

Topic | Book | Chapter
Envoy fundamentals | Envoy Documentation | All
Service mesh patterns | “Istio in Action” by Posta | Ch. 1-6
Circuit breakers | “Release It!” by Nygard | Ch. 5
Distributed tracing | “Distributed Systems Observability” | Ch. 4
Microservices | “Building Microservices” | Ch. 8-11

Learning milestones:

  1. Basic proxy routing works → You understand listeners and clusters
  2. Circuit breakers trip under load → You understand resilience patterns
  3. Traces appear in Jaeger → You understand distributed tracing
  4. Traffic splitting for canary → You understand progressive delivery


Project 8: Implement Kubernetes Ingress with Traefik

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: YAML (Kubernetes manifests)
  • Alternative Programming Languages: Go (Traefik plugins), TOML (Traefik config)
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Kubernetes, Ingress, Cloud Native
  • Software or Tool: Traefik Proxy
  • Main Book: “Kubernetes in Action” by Marko Lukša

What you’ll build: A complete Kubernetes ingress setup with Traefik that automatically discovers services, handles TLS with cert-manager, provides path-based and host-based routing, implements middleware chains (auth, rate limiting, compression), and exposes metrics and dashboards.

Why it teaches web infrastructure: Traefik represents the “cloud-native” approach to reverse proxies. Instead of static configuration files, it watches Kubernetes for changes and automatically configures itself. Understanding Traefik teaches you how modern infrastructure is declarative and self-configuring.

Core challenges you’ll face:

  • Understanding Kubernetes Ingress → maps to how traffic enters a cluster
  • Configuring Traefik IngressRoutes → maps to advanced routing beyond standard Ingress
  • Setting up cert-manager integration → maps to automatic TLS in Kubernetes
  • Writing middleware chains → maps to request processing pipelines
  • Service discovery and auto-configuration → maps to GitOps and declarative infrastructure

Key Concepts:

  • Kubernetes Ingress: “Kubernetes in Action” Ch. 5 - Lukša
  • Traefik Architecture: Traefik documentation - concepts
  • cert-manager: cert-manager.io documentation
  • Middleware Patterns: “Building Microservices” Ch. 6 - Newman

Difficulty: Intermediate. Time estimate: 1-2 weeks. Prerequisites: Basic Kubernetes (pods, services, deployments), kubectl, understanding of TLS.


Real World Outcome

You’ll have a Traefik ingress controller that automatically configures routing as you deploy services. Add a new service with the right annotations, and Traefik routes to it immediately—no manual configuration needed.

Example Output:

# Deploy Traefik to Kubernetes
$ helm install traefik traefik/traefik -n traefik-system

# Check Traefik is running
$ kubectl get pods -n traefik-system
NAME                      READY   STATUS    RESTARTS   AGE
traefik-7d9c9f8b4-x2k9p   1/1     Running   0          2m

# Deploy a service with IngressRoute
$ kubectl apply -f myapp-ingressroute.yaml
ingressroute.traefik.containo.us/myapp created

# Traefik immediately discovers and routes to it!
$ curl https://myapp.example.com/api/health
{"status": "UP", "service": "myapp", "version": "1.0.0"}

# Check Traefik dashboard (port-forward)
$ kubectl port-forward -n traefik-system svc/traefik 9000:9000
# Visit http://localhost:9000/dashboard/
# See all routers, services, and middlewares!

# View automatic TLS certificate
$ curl -v https://myapp.example.com 2>&1 | grep -E "(subject|issuer)"
*  subject: CN=myapp.example.com
*  issuer: C=US; O=Let's Encrypt; CN=R3

# Test middleware chain (auth + rate limit + compression)
$ curl -H "Authorization: Bearer invalid" https://myapp.example.com/api/data
HTTP/2 401 Unauthorized

$ curl -H "Authorization: Bearer valid_token" https://myapp.example.com/api/data
HTTP/2 200 OK
Content-Encoding: gzip
X-RateLimit-Remaining: 99

# Add new service - Traefik auto-discovers!
$ kubectl apply -f newservice.yaml
deployment.apps/newservice created
service/newservice created
ingressroute.traefik.containo.us/newservice created

$ curl https://newservice.example.com/
{"message": "Hello from new service!"}
# No Traefik restart needed - automatic!

The Core Question You’re Answering

“How do you build an ingress controller that automatically configures itself as services come and go in Kubernetes?”

Before writing manifests, understand that Traefik watches the Kubernetes API for changes. When you create an IngressRoute, Traefik sees it within seconds and updates its routing table. This is the declarative, self-healing infrastructure that makes Kubernetes powerful.


Concepts You Must Understand First

Stop and research these before configuring:

  1. Kubernetes Ingress vs Gateway API
    • What’s the standard Ingress resource and its limitations?
    • What additional features do IngressRoutes provide?
    • What’s the new Gateway API and how does Traefik support it?
    • Book Reference: “Kubernetes in Action” Ch. 5 - Lukša
  2. Traefik Architecture
    • What’s the difference between entrypoints, routers, and services?
    • How does Traefik discover configuration (providers)?
    • What are middlewares and how do they chain?
    • Book Reference: Traefik documentation - concepts
  3. TLS with cert-manager
    • How does cert-manager integrate with Traefik?
    • What’s an Issuer vs ClusterIssuer?
    • How do Certificate resources work?
    • Book Reference: cert-manager.io documentation
  4. Kubernetes RBAC for Ingress
    • What permissions does Traefik need?
    • Why does it need to watch Ingress, Secret, and Service resources?
    • How do you secure the Traefik dashboard?
    • Book Reference: Kubernetes RBAC documentation

Questions to Guide Your Design

Before configuring, think through these:

  1. Entrypoint Configuration
    • Which ports will Traefik listen on (80, 443)?
    • Will you redirect HTTP to HTTPS?
    • Will you use hostPort, NodePort, or LoadBalancer?
  2. Routing Strategy
    • Path-based routing, host-based, or both?
    • How will you handle routing priority?
    • Will you use IngressRoute CRDs or standard Ingress?
  3. TLS Strategy
    • Will you use cert-manager for automatic certs?
    • Will you use a wildcard certificate?
    • How will you handle certificate secrets?
  4. Middleware Configuration
    • What middlewares will you use globally vs per-route?
    • How will you chain middlewares?
    • How will you test middleware behavior?

Thinking Exercise

Trace Request Through Traefik in Kubernetes

Map out what happens when traffic reaches the cluster:

External Request
      │
      ▼
[Cloud Load Balancer] → Sends traffic to Traefik pods
      │
      ▼
[Traefik Entrypoint] → Port 443, TLS termination
      │
      ▼
[Router Matching] → Match Host header + path
      │
      ├── Host: myapp.example.com, Path: /api/*
      │   └── Router: myapp-router
      │
      └── Host: other.example.com
          └── Router: other-router
      │
      ▼
[Middleware Chain] → Each middleware processes request
      │
      ├── auth-middleware → JWT validation
      ├── rate-limit-middleware → Rate limiting
      └── compress-middleware → Gzip compression
      │
      ▼
[Service Selection] → Pick Kubernetes service
      │
      ▼
[Endpoint Selection] → Pick pod (load balancing)
      │
      ▼
[Pod] → Request reaches your application

Questions while tracing:

  • What happens if no router matches the request?
  • If a middleware returns an error, do subsequent middlewares run?
  • How does Traefik know which pods back a service?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What’s the difference between Traefik’s IngressRoute and standard Kubernetes Ingress?”
  2. “How does Traefik discover new services in Kubernetes?”
  3. “How would you configure automatic TLS with cert-manager and Traefik?”
  4. “What’s a middleware in Traefik and how do you chain them?”
  5. “How would you implement rate limiting per user in Traefik?”
  6. “What’s the difference between Traefik’s providers (file, Kubernetes, Docker)?”
  7. “How would you do a canary deployment with Traefik in Kubernetes?”

Hints in Layers

Hint 1: Starting Point Install Traefik with Helm and expose the dashboard. Create a simple IngressRoute that routes to a test service.

Hint 2: CRD Installation Traefik’s IngressRoute is a CRD (Custom Resource Definition). The Helm chart installs these automatically, but you need them before creating IngressRoutes.

Hint 3: Middleware Definition Create Middleware resources first, then reference them in IngressRoutes with middlewares:. The order in the list determines execution order.

Hint 4: Debugging Use the Traefik dashboard to see all routers/services/middlewares. Check kubectl logs for Traefik pods. Use kubectl describe ingressroute for status.
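
A sketch of hint 3’s middleware-plus-route wiring is shown below. The names, namespace, limits, and the letsencrypt resolver are assumptions; the apiVersion matches the traefik.containo.us CRDs seen in the example output.

# myapp-ingressroute.yaml (sketch) - a Middleware chained into an IngressRoute
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: ratelimit
  namespace: default
spec:
  rateLimit:
    average: 100
    burst: 50
---
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: myapp
  namespace: default
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`myapp.example.com`) && PathPrefix(`/api`)
      kind: Rule
      middlewares:
        - name: ratelimit          # middlewares run in list order
      services:
        - name: myapp
          port: 80
  tls:
    certResolver: letsencrypt      # or reference a cert-manager-managed Secret via secretName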


Books That Will Help

Topic | Book | Chapter
Kubernetes fundamentals | “Kubernetes in Action” by Lukša | Ch. 1-5
Ingress concepts | “Kubernetes in Action” | Ch. 5
Traefik configuration | Traefik Documentation | All
cert-manager | cert-manager.io docs | All
GitOps | “GitOps and Kubernetes” | Ch. 1-4

Learning milestones:

  1. Traefik routes to a service → You understand basic IngressRoutes
  2. Automatic TLS working → You understand cert-manager integration
  3. Middleware chain working → You understand request processing
  4. Auto-discovery of new services → You understand cloud-native patterns


Project 9: Set Up a Forward Proxy with Squid for Corporate Network

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: Squid Configuration
  • Alternative Programming Languages: Python (ICAP), C (custom helpers)
  • Coolness Level: Level 2: Practical but Forgettable
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Forward Proxy, Content Filtering, Caching
  • Software or Tool: Squid Proxy Server
  • Main Book: “Squid: The Definitive Guide” by Duane Wessels

What you’ll build: A Squid forward proxy that provides internet access for a corporate network, with content filtering, user authentication, bandwidth throttling, SSL inspection (HTTPS interception), and detailed access logging for compliance.

Why it teaches web infrastructure: While most modern work focuses on reverse proxies, forward proxies remain critical for corporate environments. Understanding Squid teaches you ACLs, HTTP filtering, certificate handling for HTTPS interception, and the client-side perspective of proxying—the opposite of everything else in this course.

Core challenges you’ll face:

  • Configuring ACLs (Access Control Lists) → maps to who can access what
  • Setting up authentication → maps to user identity in network traffic
  • Implementing HTTPS interception → maps to SSL/TLS bump/splice
  • Configuring caching policies → maps to bandwidth optimization
  • Delay pools for bandwidth management → maps to QoS at the proxy layer

Key Concepts:

  • Forward Proxy Fundamentals: “Squid: The Definitive Guide” Ch. 1-3 - Wessels
  • ACL Logic: “Squid: The Definitive Guide” Ch. 6 - Wessels
  • HTTPS Interception: “Serious Cryptography” Ch. 14 - Aumasson
  • HTTP Caching: “HTTP: The Definitive Guide” Ch. 7 - Gourley

Difficulty: Intermediate. Time estimate: 1-2 weeks. Prerequisites: Basic networking (IP, ports, DNS), understanding of HTTP/HTTPS, certificate concepts.


Real World Outcome

You’ll have a fully functional forward proxy that controls internet access for clients. You can see who’s accessing what, block certain sites, cache popular content, and even inspect HTTPS traffic (with proper certificates deployed).

Example Output:

# Start Squid
$ sudo systemctl start squid
$ sudo systemctl status squid
● squid.service - Squid Web Proxy Server
     Active: active (running)

# Configure client browser to use proxy (proxy.corp.local:3128)

# Test proxy is working
$ curl -x http://proxy.corp.local:3128 http://example.com
<!DOCTYPE html>
<html>...

# After enabling proxy authentication, the same request is challenged
$ curl -x http://proxy.corp.local:3128 http://example.com
HTTP/1.1 407 Proxy Authentication Required
Proxy-Authenticate: Basic realm="Squid proxy"

$ curl -x http://user:password@proxy.corp.local:3128 http://example.com
<!DOCTYPE html>... (success!)

# Test content filtering
$ curl -x http://user:pass@proxy.corp.local:3128 http://blocked-site.com
HTTP/1.1 403 Forbidden
X-Squid-Error: ERR_ACCESS_DENIED 0

# View access logs
$ tail -f /var/log/squid/access.log
1703260000.000    123 192.168.1.50 TCP_MISS/200 1234 GET http://example.com/ user1 DIRECT/93.184.216.34 text/html
1703260001.000     45 192.168.1.51 TCP_DENIED/403 567 GET http://blocked-site.com/ user2 NONE/- text/html

# Check cache hit ratio
$ squidclient mgr:info | grep -E "(Request Hit|Object)"
Request Hit Ratios:     5min: 23.4%,    60min: 28.7%
Object Storage:         12345 objects, 456 MB

# View current connections
$ squidclient mgr:active_requests
Connection: 192.168.1.50:52345 -> 93.184.216.34:80
  User: user1
  Request: GET http://example.com/large-file.zip
  Bytes: 15MB of 100MB (15%)

The Core Question You’re Answering

“How do you control, monitor, and optimize internet access for hundreds of users from a single point?”

Before configuring, understand that a forward proxy sits between your users and the internet. Every HTTP request they make goes through it. This gives you unprecedented visibility and control—but also responsibility for security and privacy.


Concepts You Must Understand First

Stop and research these before configuring:

  1. Forward vs Reverse Proxy (Again)
    • In forward proxy, who is the client?
    • Why must browsers be configured to use a forward proxy?
    • What’s “transparent” proxying?
    • Book Reference: “Squid: The Definitive Guide” Ch. 1
  2. Access Control Lists (ACLs)
    • How do ACLs combine to make access decisions?
    • What’s the difference between allow and deny?
    • What order are ACLs evaluated?
    • Book Reference: “Squid: The Definitive Guide” Ch. 6
  3. HTTPS Interception (SSL Bump)
    • Why can’t a proxy see HTTPS content by default?
    • What does “SSL bumping” do?
    • Why is this controversial? (Privacy, security implications)
    • Book Reference: Squid SSL-Bump documentation
  4. HTTP Caching for Clients
    • What’s the difference between a cache hit and miss?
    • When should a proxy cache responses?
    • How does If-Modified-Since work?
    • Book Reference: “HTTP: The Definitive Guide” Ch. 7

Questions to Guide Your Design

Before configuring, think through these:

  1. Authentication Strategy
    • Will you authenticate users (LDAP, NTLM, Basic)?
    • Will you allow anonymous access from certain IPs?
    • How will you handle service accounts?
  2. Access Control Policy
    • What categories of sites will you block?
    • Will you use a blocklist, allowlist, or both?
    • Different policies for different user groups?
  3. HTTPS Inspection
    • Will you inspect HTTPS traffic?
    • How will you handle certificate pinning?
    • What are the legal/privacy implications in your jurisdiction?
  4. Caching Strategy
    • What’s your cache size (disk and memory)?
    • What content should never be cached?
    • How aggressive should cache freshness be?

Thinking Exercise

Trace a Request Through Squid

Map out what happens when a browser makes a request:

Browser configured with proxy.corp.local:3128
      │
      ▼
[Browser sends CONNECT or GET to proxy]
      │
      ▼
[Squid receives request]
      │
      ▼
[ACL Evaluation - Order matters!]
      │
      ├── Check src (source IP)
      ├── Check dst (destination)
      ├── Check port
      ├── Check protocol
      ├── Check time_of_day
      ├── Check user (if authenticated)
      └── Check url_regex
      │
      ├── http_access deny → Return 403
      │
      └── http_access allow → Continue
              │
              ▼
        [Cache Lookup]
              │
              ├── Cache HIT → Return cached response
              │
              └── Cache MISS → Forward to origin
                      │
                      ▼
                [Origin Response]
                      │
                      ▼
                [Store in cache if cacheable]
                      │
                      ▼
                [Return to client]

Questions while tracing:

  • What happens if no ACL matches?
  • Can you cache authenticated content?
  • How does CONNECT (for HTTPS) differ from GET?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What’s the difference between a forward proxy and a reverse proxy?”
  2. “How would you configure Squid to authenticate against Active Directory?”
  3. “Explain how ACL rules are evaluated in Squid.”
  4. “What are the security and privacy implications of HTTPS interception?”
  5. “How would you configure Squid to block a category of websites?”
  6. “What’s a transparent proxy and when would you use it?”
  7. “How does Squid’s delay pools feature work for bandwidth management?”

Hints in Layers

Hint 1: Starting Point Install Squid, set http_port 3128, configure a basic http_access allow localnet, and test with curl -x.

Hint 2: ACL Order ACLs are evaluated in order. Put specific rules before general ones. The first matching http_access rule wins.

Hint 3: HTTPS Interception For SSL bumping, you need to generate a CA certificate, deploy it to all clients, and configure ssl_bump rules carefully.

Hint 4: Debugging Use squid -k parse to check config syntax. Use squidclient mgr: commands for runtime info. Check cache.log for errors.
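
Pulling the hints together, a minimal squid.conf might look like the sketch below; the subnet, file paths, and the auth helper location vary by distribution and are assumptions here.

# squid.conf (sketch)
http_port 3128

# who may use the proxy
acl localnet src 192.168.1.0/24

# basic authentication against an htpasswd-style file (helper path is distro-specific)
auth_param basic program /usr/lib/squid/basic_ncsa_auth /etc/squid/passwd
auth_param basic realm Squid proxy
acl authenticated proxy_auth REQUIRED

# content filtering: one domain per line in the blocklist file
acl blocked_sites dstdomain "/etc/squid/blocked_domains.txt"

# order matters: the first matching http_access rule wins
http_access deny blocked_sites
http_access allow localnet authenticated
http_access deny all

# cache and logging
cache_dir ufs /var/spool/squid 10240 16 256
access_log /var/log/squid/access.log squid

Run squid -k parse after every change to catch syntax errors before reloading.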


Books That Will Help

Topic | Book | Chapter
Squid fundamentals | “Squid: The Definitive Guide” | Ch. 1-5
ACLs in depth | “Squid: The Definitive Guide” | Ch. 6
HTTPS interception | Squid SSL-Bump docs | All
HTTP caching | “HTTP: The Definitive Guide” | Ch. 7
TLS/SSL | “Serious Cryptography” | Ch. 14

Learning milestones:

  1. Proxy working for HTTP → You understand basic Squid configuration
  2. Authentication required → You understand proxy authentication
  3. Content filtering working → You understand ACLs
  4. HTTPS interception (optional) → You understand SSL bump


Project 10: Build a High-Performance Caching Layer with Apache Traffic Server

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: Traffic Server Configuration
  • Alternative Programming Languages: C/C++ (plugins), Lua (ts_lua plugin)
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 3: Advanced
  • Knowledge Area: CDN, Caching, High Performance
  • Software or Tool: Apache Traffic Server
  • Main Book: Apache Traffic Server Documentation

What you’ll build: A high-performance caching reverse proxy using Apache Traffic Server (the technology behind Yahoo, Apple, Akamai CDNs) that caches content at scale, handles cache hierarchies, implements cache purging APIs, and serves millions of requests per day on modest hardware.

Why it teaches web infrastructure: Apache Traffic Server (ATS) is what CDNs are built on. It’s designed for the kind of scale where Nginx/HAProxy aren’t enough. Understanding ATS teaches you about cache architecture at a level most developers never need—but that powers every major website.

Core challenges you’ll face:

  • Understanding the cache architecture → maps to how CDNs store and retrieve content
  • Configuring remap rules → maps to URL-based routing at CDN scale
  • Setting up cache hierarchies → maps to parent caching and cache tiering
  • Implementing cache invalidation → maps to the hardest problem in caching
  • Tuning for high performance → maps to understanding disk I/O, memory, and networking

Key Concepts:

  • Cache Architecture: Apache Traffic Server documentation - cache internals
  • HTTP Caching: “HTTP: The Definitive Guide” Ch. 7 - Gourley
  • CDN Design: “Building Scalable Web Sites” Ch. 14 - Henderson
  • High Performance I/O: “The Linux Programming Interface” Ch. 63 - Kerrisk

Difficulty: Advanced. Time estimate: 2-4 weeks. Prerequisites: Understanding of HTTP caching, disk I/O concepts, production Linux experience.


Real World Outcome

You’ll have a caching proxy capable of handling CDN-scale traffic. You’ll see cache hit ratios climb as content is cached, watch disk I/O patterns, and understand why CDNs are so effective at reducing origin load.

Example Output:

# Start Traffic Server
$ sudo traffic_server start
Traffic Server is now running.

# Check status
$ traffic_ctl server status
Proxy -- on

# Make requests and watch caching
$ curl -I http://cdn.example.com/images/logo.png
HTTP/1.1 200 OK
Age: 0
X-Cache: MISS
X-Cache-Key: http://origin.example.com/images/logo.png

$ curl -I http://cdn.example.com/images/logo.png
HTTP/1.1 200 OK
Age: 5
X-Cache: HIT
X-Cache-Key: http://origin.example.com/images/logo.png
# Second request served from cache!

# Check cache statistics
$ traffic_ctl metric get proxy.process.cache.bytes_total
proxy.process.cache.bytes_total: 15234567890

$ traffic_ctl metric get proxy.process.http.completed_requests
proxy.process.http.completed_requests: 1234567

$ traffic_ctl metric get proxy.node.cache.percent_free
proxy.node.cache.percent_free: 45.67

# View cache hit ratio
$ traffic_ctl metric match hit
proxy.node.cache_hit_ratio: 0.847
proxy.node.bandwidth_hit_ratio: 0.912
# 84.7% of requests served from cache!

# Purge specific URL from cache
$ curl -X PURGE http://cdn.example.com/images/logo.png
HTTP/1.1 200 OK
X-Purge-Status: OK

# Run load test
$ wrk -t12 -c400 -d60s http://cdn.example.com/static/app.js
Running 60s test @ http://cdn.example.com/static/app.js
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.23ms    0.45ms   12.34ms   95.67%
    Req/Sec    25.12k     2.34k    32.45k    78.00%
  18,123,456 requests in 60s, 892.34GB read
Requests/sec: 302,057.60
# 300K requests per second from cache!

The Core Question You’re Answering

“How do you build a caching layer that can serve millions of requests per day while keeping origin servers protected?”

Before configuring, understand that ATS was designed for CDN-scale traffic from the start. Yahoo donated it to Apache after using it to handle billions of requests daily. It is in the same technology class as Varnish, but with different trade-offs.


Concepts You Must Understand First

Stop and research these before configuring:

  1. Cache Architecture
    • What’s the difference between memory cache and disk cache?
    • How does ATS organize its cache (spans, stripes, volumes)?
    • What’s the cache object store format?
    • Book Reference: ATS documentation - cache architecture
  2. HTTP Caching Semantics
    • What headers determine cacheability (Cache-Control, Expires, ETag)?
    • What’s the difference between freshness and validation?
    • When must a cache validate with the origin?
    • Book Reference: “HTTP: The Definitive Guide” Ch. 7
  3. Remap Configuration
    • How do remap rules work (from URL to origin)?
    • What’s the difference between map, reverse_map, and regex_map?
    • How do plugins interact with remap rules?
    • Book Reference: ATS remap.config documentation
  4. Cache Invalidation
    • Why is cache invalidation called “hard”?
    • What’s the difference between purge and refresh?
    • How do you handle emergency cache clears?
    • Book Reference: ATS purge documentation

Questions to Guide Your Design

Before configuring, think through these:

  1. Cache Storage
    • How much disk space will you allocate?
    • What disk type (SSD, NVMe, spinning)?
    • How will you partition between RAM and disk cache?
  2. Remap Strategy
    • One origin or multiple origins?
    • Path-based routing rules?
    • Host header manipulation?
  3. Caching Policies
    • What TTLs for different content types?
    • What should never be cached?
    • How will you handle query strings?
  4. Invalidation Strategy
    • How will you purge cached content?
    • Who has permission to purge?
    • Instant purge vs scheduled refresh?

Thinking Exercise

Trace a Request Through ATS Cache

Map out what happens when a request arrives:

Request Arrives
      │
      ▼
[Remap Lookup] → Map incoming URL to origin URL
      │
      ├── No match → Return 404 or pass through
      │
      └── Match → Continue with remapped URL
              │
              ▼
        [RAM Cache Lookup]
              │
              ├── RAM HIT → Return immediately (fastest)
              │
              └── RAM MISS → Check disk cache
                      │
                      ▼
                [Disk Cache Lookup]
                      │
                      ├── DISK HIT
                      │   │
                      │   ├── Fresh → Return from disk
                      │   │
                      │   └── Stale → Validate with origin
                      │             │
                      │             ├── 304 Not Modified → Return cached
                      │             │
                      │             └── 200 → Update cache, return new
                      │
                      └── DISK MISS → Fetch from origin
                              │
                              ▼
                        [Origin Request]
                              │
                              ▼
                        [Store in cache if cacheable]
                              │
                              ▼
                        [Return to client]

Questions while tracing:

  • What determines if an object goes to RAM cache vs disk only?
  • How does ATS handle If-Modified-Since validation?
  • What happens during a “thundering herd” (many requests for same uncached object)?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What’s the difference between Apache Traffic Server and Varnish?”
  2. “How does a cache hierarchy (parent caching) work and when would you use it?”
  3. “Explain the difference between purge, refresh, and cache aging.”
  4. “How would you handle the thundering herd problem in a cache?”
  5. “What metrics would you monitor for cache health?”
  6. “How does ATS store objects on disk?”
  7. “When would you choose ATS over Nginx as a caching layer?”

Hints in Layers

Hint 1: Starting Point Install ATS, configure storage.config with a cache partition, set up a simple remap rule in remap.config, and test with curl.
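
For example, storage.config can be as small as a single line (the path and size below are assumptions; raw devices are also supported):

/var/cache/trafficserver 64G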

Hint 2: Remap Configuration Format: map http://cdn.example.com/ http://origin.example.com/. The first URL is what clients request, the second is where ATS forwards.
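
A minimal remap.config sketch of that format (the hostnames are the same placeholders used in the Example Output):

map         http://cdn.example.com/     http://origin.example.com/
reverse_map http://origin.example.com/  http://cdn.example.com/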

Hint 3: Cache Tuning Start with defaults. Use traffic_ctl metric match cache to see hit ratios. Tune proxy.config.http.cache.required_headers based on your origin’s headers.

Hint 4: Debugging Enable transaction logging in logging.yaml. Use traffic_ctl server backtrace for stack traces. Check traffic.out for startup errors.


Books That Will Help

Topic Book Chapter
ATS configuration ATS Documentation All
HTTP caching “HTTP: The Definitive Guide” Ch. 7
CDN architecture “Building Scalable Web Sites” Ch. 14
High-performance I/O “The Linux Programming Interface” Ch. 63
Caching patterns “Designing Data-Intensive Applications” Ch. 5

Learning milestones:

  1. Basic caching working → You understand remap rules and cache storage
  2. High cache hit ratio → You understand cache tuning
  3. Purge API working → You understand cache invalidation
  4. High-performance under load → You understand production cache operations


Project 11: Deploy ASP.NET Core with Kestrel and IIS as Reverse Proxy

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: C#
  • Alternative Programming Languages: F#, VB.NET
  • Coolness Level: Level 2: Practical but Forgettable
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Windows Server, ASP.NET Core, .NET Runtime
  • Software or Tool: Microsoft IIS + Kestrel
  • Main Book: Microsoft ASP.NET Core Documentation

What you’ll build: An ASP.NET Core application running on Kestrel with IIS as a reverse proxy, configured for production with proper request limits, timeouts, application pool settings, Windows Authentication, and health monitoring.

Why it teaches web infrastructure: The Windows ecosystem has its own patterns for web hosting. Understanding how Kestrel (the modern cross-platform server) works with IIS (the Windows-native server) teaches you about the Microsoft way of doing things—and when you need IIS vs when you can run Kestrel edge-to-edge.

Core challenges you’ll face:

  • Understanding the IIS + Kestrel architecture → maps to in-process vs out-of-process hosting
  • Configuring application pools → maps to process isolation and recycling
  • Setting up Windows Authentication → maps to integrated Windows security
  • Configuring request limits and timeouts → maps to protecting against slow clients
  • Understanding the ASP.NET Core Module → maps to how IIS proxies to Kestrel

Key Concepts:

  • ASP.NET Core Hosting: Microsoft ASP.NET Core documentation
  • IIS Architecture: IIS.net documentation
  • Kestrel Server: Microsoft Kestrel documentation
  • Windows Authentication: “Windows Security Internals” - Forshaw

Difficulty: Intermediate Time estimate: 1-2 weeks Prerequisites: Basic C#/.NET, Windows Server familiarity, understanding of HTTP


Real World Outcome

You’ll have an ASP.NET Core application running in production configuration with IIS handling external requests and Kestrel running your application. You’ll understand when to use in-process vs out-of-process hosting and how to configure both.

Example Output:

# Deploy application to IIS
PS> dotnet publish -c Release -o C:\inetpub\wwwroot\myapp

# Check application pool
PS> Get-IISAppPool -Name "MyAppPool"
Name        State   CLRConfigFile ProcessModel
----        -----   ------------- ------------
MyAppPool   Started               InProcess

# Test the application
PS> Invoke-WebRequest -Uri http://myapp.local/api/health
StatusCode: 200
Content: {"status":"Healthy","server":"Kestrel"}

# Check Windows Authentication
PS> Invoke-WebRequest -Uri http://myapp.local/api/user -UseDefaultCredentials
StatusCode: 200
Content: {"user":"DOMAIN\\username","authenticated":true}

# View application logs in Event Viewer
PS> Get-WinEvent -LogName Application -MaxEvents 10 | Where-Object {$_.ProviderName -eq 'ASP.NET Core Module'}
TimeCreated          Message
-----------          -------
12/22/2024 10:00:00  Application '/LM/W3SVC/1/ROOT/myapp' started process '1234' successfully

# Monitor requests in IIS logs
PS> Get-Content C:\inetpub\logs\LogFiles\W3SVC1\u_ex241222.log -Tail 10
2024-12-22 10:00:01 GET /api/health - 80 - 192.168.1.100 200 0 0 15
2024-12-22 10:00:02 GET /api/user - 80 DOMAIN\user 192.168.1.100 200 0 0 23

# Check worker process memory
PS> Get-Process -Name w3wp | Select-Object ProcessName, WorkingSet64, CPU
ProcessName WorkingSet64      CPU
----------- ------------      ---
w3wp        234567890         45.23

The Core Question You’re Answering

“Why do modern .NET applications still use IIS, and what does each layer (Kestrel vs IIS) do?”

Before configuring, understand that Kestrel is designed to be fast and lightweight, but IIS provides features that Kestrel doesn’t: process management, Windows authentication, request filtering, and more. The combination gives you the best of both worlds.


Concepts You Must Understand First

Stop and research these before deploying:

  1. In-Process vs Out-of-Process Hosting
    • What’s the difference in architecture?
    • Why is in-process faster?
    • When would you choose out-of-process?
    • Book Reference: Microsoft ASP.NET Core hosting documentation
  2. IIS Application Pools
    • What is an application pool?
    • Why do pools recycle?
    • What’s the difference between Classic and Integrated modes?
    • Book Reference: IIS.net documentation
  3. ASP.NET Core Module (ANCM)
    • How does ANCM proxy requests to Kestrel?
    • What’s the difference between ANCM v1 and v2?
    • How does in-process hosting work at the module level?
    • Book Reference: Microsoft ANCM documentation
  4. Windows Authentication
    • What’s NTLM vs Kerberos?
    • How does negotiate authentication work?
    • Why does Windows Auth require IIS?
    • Book Reference: Windows Authentication documentation

Questions to Guide Your Design

Before deploying, think through these:

  1. Hosting Model
    • In-process or out-of-process?
    • Will you run multiple sites on one server?
    • How will you handle app pool recycling?
  2. Authentication
    • Windows Auth, Forms, or JWT?
    • Will you need NTLM fallback?
    • How will you handle anonymous access?
  3. Request Limits
    • What’s the maximum request size?
    • What timeout values are appropriate?
    • How will you handle slow clients?
  4. Monitoring
    • How will you monitor application health?
    • What logs will you collect?
    • How will you alert on failures?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What’s the difference between in-process and out-of-process hosting in ASP.NET Core?”
  2. “Why would you use IIS in front of Kestrel instead of running Kestrel directly?”
  3. “How does IIS application pool recycling work and why is it important?”
  4. “How would you configure Windows Authentication for an ASP.NET Core app?”
  5. “What’s the ASP.NET Core Module and what does it do?”
  6. “How would you troubleshoot a 502.5 error in ASP.NET Core on IIS?”
  7. “What request limits should you configure in IIS for a production app?”

Hints in Layers

Hint 1: Starting Point Create a new ASP.NET Core web API, publish it to a folder, create an IIS site pointing to that folder, and ensure the .NET Core Hosting Bundle is installed.

Hint 2: In-Process Configuration In your csproj, set <AspNetCoreHostingModel>InProcess</AspNetCoreHostingModel>; dotnet publish writes it into web.config as hostingModel="inprocess". This runs your app inside the IIS worker process (w3wp.exe).
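
For reference, here is where that setting lives in each file (MyApp.dll is a placeholder name):

<!-- MyApp.csproj -->
<PropertyGroup>
  <AspNetCoreHostingModel>InProcess</AspNetCoreHostingModel>
</PropertyGroup>

<!-- web.config produced by dotnet publish -->
<aspNetCore processPath="dotnet" arguments=".\MyApp.dll" hostingModel="inprocess" stdoutLogEnabled="false" />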

Hint 3: Windows Authentication Enable Windows Auth in IIS, disable Anonymous Auth, and add [Authorize] to your controllers. Use HttpContext.User.Identity to get user info.
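
A minimal controller sketch matching the /api/user call in the Example Output (class and route names are assumptions):

using Microsoft.AspNetCore.Authorization;
using Microsoft.AspNetCore.Mvc;

[Authorize]
[ApiController]
[Route("api/user")]
public class UserController : ControllerBase
{
    [HttpGet]
    public IActionResult Get() => Ok(new
    {
        user = User.Identity?.Name,                    // e.g. DOMAIN\username
        authenticated = User.Identity?.IsAuthenticated // true once Windows Auth succeeds
    });
}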

Hint 4: Debugging Enable stdout logging in web.config temporarily. Check Event Viewer for ANCM errors. Use Failed Request Tracing for detailed diagnostics.


Books That Will Help

Topic Book Chapter
ASP.NET Core hosting Microsoft Docs Host and deploy
IIS administration IIS.net All
Kestrel configuration Microsoft Docs Kestrel web server
Windows Authentication Microsoft Docs Security/Authentication

Learning milestones:

  1. App running on IIS → You understand basic IIS + Kestrel architecture
  2. Windows Auth working → You understand integrated Windows security
  3. Health checks configured → You understand production monitoring
  4. Request limits tuned → You understand production hardening


Project 12: Configure Kong as an Enterprise API Gateway

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: YAML/JSON (Kong configuration)
  • Alternative Programming Languages: Lua (Kong plugins), Go (Kong plugins)
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 3: Advanced
  • Knowledge Area: API Gateway, Authentication, Rate Limiting
  • Software or Tool: Kong Gateway
  • Main Book: Kong Documentation + “Building Microservices” by Sam Newman

What you’ll build: A full Kong API gateway deployment with OAuth 2.0 authentication, API key management, rate limiting per consumer, request/response transformation, logging to external systems, and a developer portal for API consumers.

Why it teaches web infrastructure: Kong is the most popular open-source API gateway. Understanding Kong teaches you enterprise API management patterns—consumer management, authentication plugins, traffic control, and observability. It’s also built on OpenResty, so you’ll see how your OpenResty knowledge applies.

Core challenges you’ll face:

  • Understanding Kong’s architecture → maps to services, routes, consumers, and plugins
  • Configuring authentication plugins → maps to OAuth 2.0, JWT, API keys
  • Setting up rate limiting → maps to traffic control per consumer
  • Using the Admin API → maps to declarative vs imperative configuration
  • Deploying Kong in DB-less mode → maps to GitOps and declarative infrastructure

Key Concepts:

  • API Gateway Patterns: “Building Microservices” Ch. 11 - Newman
  • OAuth 2.0: RFC 6749
  • Kong Architecture: Kong documentation
  • Rate Limiting: “Designing Data-Intensive Applications” Ch. 4 - Kleppmann

Difficulty: Advanced Time estimate: 2-4 weeks Prerequisites: Understanding of REST APIs, authentication concepts, OpenResty basics (Project 6)


Real World Outcome

You’ll have an enterprise-grade API gateway that handles authentication, rate limiting, and traffic management for multiple APIs. You’ll be able to onboard new API consumers, manage their quotas, and monitor their usage.

Example Output:

# Start Kong
$ kong start -c /etc/kong/kong.conf

# Create a service and route
$ curl -X POST http://localhost:8001/services \
  --data "name=user-service" \
  --data "url=http://backend:8080/api/users"

$ curl -X POST http://localhost:8001/services/user-service/routes \
  --data "name=user-route" \
  --data "paths[]=/users"

# Test the route (no auth yet)
$ curl http://localhost:8000/users
{"users": [...]}

# Enable API key authentication
$ curl -X POST http://localhost:8001/services/user-service/plugins \
  --data "name=key-auth"

# Create a consumer and API key
$ curl -X POST http://localhost:8001/consumers \
  --data "username=acme-corp"

$ curl -X POST http://localhost:8001/consumers/acme-corp/key-auth \
  --data "key=acme-secret-key-123"

# Test without an API key (expect 401), then with one
$ curl http://localhost:8000/users
HTTP/1.1 401 Unauthorized
{"message": "No API key found in request"}

$ curl -H "apikey: acme-secret-key-123" http://localhost:8000/users
HTTP/1.1 200 OK
{"users": [...]}

# Enable rate limiting per consumer
$ curl -X POST http://localhost:8001/services/user-service/plugins \
  --data "name=rate-limiting" \
  --data "config.minute=100" \
  --data "config.policy=local"

# Check rate limit headers
$ curl -I -H "apikey: acme-secret-key-123" http://localhost:8000/users
HTTP/1.1 200 OK
X-RateLimit-Limit-Minute: 100
X-RateLimit-Remaining-Minute: 99
RateLimit-Reset: 58

# View consumer usage in Kong Manager
$ curl http://localhost:8001/consumers/acme-corp
{
  "username": "acme-corp",
  "id": "abc-123-def",
  "created_at": 1703260000,
  "custom_id": null
}

# View all active plugins
$ curl http://localhost:8001/plugins
{
  "data": [
    {"name": "key-auth", "service": {"id": "..."}, "enabled": true},
    {"name": "rate-limiting", "service": {"id": "..."}, "config": {"minute": 100}}
  ]
}

The Core Question You’re Answering

“How do you manage dozens of APIs with different consumers, each with their own authentication credentials and rate limits?”

Before configuring, understand that Kong sits between your API consumers and your backend services. Every request goes through Kong, which can authenticate, rate limit, transform, log, and route—all configured per-service, per-route, or per-consumer.


Concepts You Must Understand First

Stop and research these before configuring:

  1. Kong’s Data Model
    • What’s the relationship between services, routes, and plugins?
    • What are consumers and why do they matter?
    • What’s the difference between global vs scoped plugins?
    • Book Reference: Kong documentation - Admin API
  2. Authentication Patterns
    • When to use API keys vs OAuth 2.0 vs JWT?
    • What’s the difference between client credentials and authorization code flow?
    • How does Kong validate JWTs?
    • Book Reference: RFC 6749, Kong authentication plugins
  3. Rate Limiting Strategies
    • Per consumer vs per API vs global?
    • What’s the difference between local, cluster, and redis policies?
    • How do you handle rate limit synchronization?
    • Book Reference: “Designing Data-Intensive Applications” Ch. 4
  4. DB Mode vs DB-less Mode
    • What’s declarative configuration?
    • When would you use DB-less mode?
    • How do you version control Kong configuration?
    • Book Reference: Kong DB-less documentation

Questions to Guide Your Design

Before configuring, think through these:

  1. Service Organization
    • How will you organize services (one per microservice)?
    • How will you version your APIs?
    • Will you use workspaces for multi-tenancy?
  2. Authentication Strategy
    • One auth method or multiple?
    • How will consumers obtain credentials?
    • How will you handle credential rotation?
  3. Rate Limiting
    • What limits per consumer tier (free, premium)?
    • What time windows (second, minute, hour)?
    • How will you handle rate limit violations?
  4. Deployment Mode
    • Traditional DB mode or DB-less?
    • Single node or clustered?
    • How will you handle configuration updates?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What’s the difference between Kong’s services and routes?”
  2. “How would you implement tiered rate limiting (different limits for different API consumers)?”
  3. “Explain Kong’s plugin execution order and how it affects request processing.”
  4. “How would you configure OAuth 2.0 with Kong for a mobile application?”
  5. “What’s the difference between Kong’s traditional mode and DB-less mode?”
  6. “How would you migrate from API keys to JWT authentication without downtime?”
  7. “How does Kong’s rate limiting work in a clustered deployment?”

Hints in Layers

Hint 1: Starting Point Start Kong in DB-less mode with a simple declarative YAML config. Create one service, one route, and test basic proxying works.
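
A minimal declarative sketch (kong.yml) equivalent to the Admin API calls in the Example Output above; the _format_version value depends on your Kong release:

_format_version: "3.0"

services:
  - name: user-service
    url: http://backend:8080/api/users
    routes:
      - name: user-route
        paths:
          - /users
    plugins:
      - name: key-auth
      - name: rate-limiting
        config:
          minute: 100
          policy: local

consumers:
  - username: acme-corp
    keyauth_credentials:
      - key: acme-secret-key-123

Point Kong at it with database = off and declarative_config = /path/to/kong.yml in kong.conf (or the KONG_DATABASE and KONG_DECLARATIVE_CONFIG environment variables).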

Hint 2: Admin API Use http://localhost:8001 for the Admin API. All configuration can be done via REST. Use curl or HTTPie to explore.

Hint 3: Plugin Ordering Plugins execute in fixed phases (access, header_filter, body_filter, log), and within each phase by plugin priority. Authentication plugins run early in the access phase; logging plugins run last.

Hint 4: Debugging Enable debug logging with log_level = debug in kong.conf. Use curl -v to see all headers. Check /status endpoint for health.


Books That Will Help

Topic Book Chapter
API gateway patterns “Building Microservices” Ch. 11
Kong architecture Kong Documentation All
OAuth 2.0 RFC 6749 All
Rate limiting “Designing Data-Intensive Applications” Ch. 4
Microservices security “Microservices Security in Action” Ch. 5-8

Learning milestones:

  1. Basic proxy working → You understand services and routes
  2. API key auth working → You understand consumers and plugins
  3. Rate limiting per consumer → You understand traffic control
  4. OAuth 2.0 flow working → You understand enterprise authentication


Project 13: Set Up Istio Service Mesh with Ingress Gateway

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: YAML (Kubernetes + Istio CRDs)
  • Alternative Programming Languages: Go (for extending)
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 4: Expert
  • Knowledge Area: Service Mesh, Kubernetes, Microservices Security
  • Software or Tool: Istio + Envoy
  • Main Book: “Istio in Action” by Christian Posta

What you’ll build: A complete Istio service mesh with ingress gateway, mTLS between all services, traffic management (canary deployments, traffic splitting), observability (Kiali, Jaeger, Prometheus), and authorization policies—the full service mesh experience.

Why it teaches web infrastructure: Istio represents the current state-of-the-art in service mesh technology. It combines traffic management, security, and observability in a way that would take months to build manually. Understanding Istio teaches you the patterns that will define infrastructure for the next decade.

Core challenges you’ll face:

  • Understanding the Istio architecture → maps to control plane vs data plane
  • Configuring the Ingress Gateway → maps to external traffic entry
  • Setting up mTLS → maps to zero-trust networking
  • Creating traffic management rules → maps to VirtualService and DestinationRule
  • Implementing authorization policies → maps to service-to-service access control

Key Concepts:

  • Service Mesh: “Istio in Action” Ch. 1-2 - Posta
  • Envoy Proxy: Envoy documentation (Istio’s data plane)
  • Zero Trust Security: “Zero Trust Networks” - Barth & Gilman
  • Kubernetes: “Kubernetes in Action” - Lukša

Difficulty: Expert Time estimate: 3-4 weeks Prerequisites: Kubernetes experience, understanding of Envoy (Project 7), mTLS concepts


Real World Outcome

You’ll have a service mesh where all traffic is encrypted, all services are authenticated, and you have complete visibility into every request. You’ll be able to do canary deployments, A/B testing, and implement sophisticated authorization rules.

Example Output:

# Install Istio
$ istioctl install --set profile=demo
✔ Istio core installed
✔ Istiod installed
✔ Ingress gateways installed
✔ Installation complete

# Enable sidecar injection for namespace
$ kubectl label namespace default istio-injection=enabled

# Deploy sample application
$ kubectl apply -f bookinfo.yaml
deployment.apps/productpage created
deployment.apps/reviews-v1 created
deployment.apps/reviews-v2 created

# Check sidecars are injected
$ kubectl get pods
NAME                          READY   STATUS    RESTARTS
productpage-abc123            2/2     Running   0     # 2/2 = app + sidecar
reviews-v1-def456             2/2     Running   0
reviews-v2-ghi789             2/2     Running   0

# Create Gateway and VirtualService
$ kubectl apply -f gateway.yaml
gateway.networking.istio.io/bookinfo-gateway created
virtualservice.networking.istio.io/bookinfo created

# Test external access
$ curl http://$GATEWAY_IP/productpage
<html>...(Book Info page)...</html>

# Enable mTLS for all services
$ kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: default
spec:
  mtls:
    mode: STRICT
EOF

# Verify mTLS is working
$ istioctl x authz check pod/productpage-abc123
LISTENER   MATCH
inbound    All connections mTLS

# Create traffic split (90% v1, 10% v2)
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts: [reviews]
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 90
    - destination:
        host: reviews
        subset: v2
      weight: 10
EOF
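
# Note: the v1/v2 subsets above must also be defined in a DestinationRule.
# A minimal sketch (labels assumed to match the deployments' version labels):
$ kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
EOF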

# View in Kiali dashboard
$ istioctl dashboard kiali
# See service graph with traffic percentages!

# View traces in Jaeger
$ istioctl dashboard jaeger
# See complete request traces across services!

# Create authorization policy
$ kubectl apply -f - <<EOF
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: reviews-policy
spec:
  selector:
    matchLabels:
      app: reviews
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/default/sa/productpage"]
EOF
# Only productpage can call reviews!

The Core Question You’re Answering

“How do you secure, observe, and manage traffic between dozens of microservices without changing application code?”

Before deploying, understand that Istio injects a sidecar proxy (Envoy) next to every pod. All traffic goes through these proxies, which are centrally configured by Istio’s control plane. This gives you consistent security, observability, and traffic management across all services.


Concepts You Must Understand First

Stop and research these before deploying:

  1. Service Mesh Architecture
    • What’s the difference between control plane and data plane?
    • What is sidecar injection and how does it work?
    • How do sidecars intercept traffic (iptables)?
    • Book Reference: “Istio in Action” Ch. 2 - Posta
  2. mTLS and Zero Trust
    • What is mutual TLS and why does it matter?
    • How does Istio provision certificates automatically?
    • What’s the SPIFFE identity standard?
    • Book Reference: “Zero Trust Networks” Ch. 4 - Barth & Gilman
  3. Istio Traffic Management
    • What’s a VirtualService vs DestinationRule?
    • How do traffic splits work?
    • What’s the difference between Gateway and VirtualService?
    • Book Reference: “Istio in Action” Ch. 4-5 - Posta
  4. Istio Security
    • How do AuthorizationPolicies work?
    • What’s the difference between ALLOW, DENY, and CUSTOM actions?
    • How does Istio integrate with external auth systems?
    • Book Reference: “Istio in Action” Ch. 7-8 - Posta

Questions to Guide Your Design

Before deploying, think through these:

  1. Installation Profile
    • Which Istio profile (minimal, default, demo)?
    • Which components do you need?
    • How will you upgrade Istio?
  2. mTLS Policy
    • Strict or permissive mTLS?
    • How will you handle non-mesh services?
    • How will you rotate certificates?
  3. Traffic Management
    • What routing rules do you need?
    • Will you use traffic mirroring for testing?
    • How will you implement canary deployments?
  4. Observability
    • Which backends (Prometheus, Jaeger, Kiali)?
    • What sampling rate for traces?
    • What retention for metrics?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What’s the difference between Istio’s control plane and data plane?”
  2. “How does Istio’s sidecar injection work?”
  3. “Explain how mTLS works in Istio and how certificates are managed.”
  4. “What’s the difference between a Gateway and a VirtualService?”
  5. “How would you implement a canary deployment with Istio?”
  6. “What’s a DestinationRule and when would you use it?”
  7. “How do AuthorizationPolicies work and how would you debug them?”

Hints in Layers

Hint 1: Starting Point Use istioctl install --set profile=demo for a full-featured installation. Enable sidecar injection on your namespace with kubectl label.

Hint 2: Traffic Entry External traffic enters through the Ingress Gateway. You need both a Gateway resource (ports/hosts) and a VirtualService (routing rules).
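
A minimal sketch of that pair, based on the bookinfo example (host, path, and port values are assumptions):

apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: bookinfo-gateway
spec:
  selector:
    istio: ingressgateway        # bind to Istio's default ingress gateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: bookinfo
spec:
  hosts: ["*"]
  gateways: [bookinfo-gateway]
  http:
  - match:
    - uri:
        exact: /productpage
    route:
    - destination:
        host: productpage
        port:
          number: 9080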

Hint 3: mTLS Debugging Use istioctl x authz check to verify mTLS status. Use istioctl analyze to find configuration issues.

Hint 4: Observability Use istioctl dashboard to access Kiali (service graph), Jaeger (traces), and Prometheus (metrics). These are essential for understanding traffic flow.


Books That Will Help

Topic Book Chapter
Istio fundamentals “Istio in Action” by Posta Ch. 1-4
Traffic management “Istio in Action” Ch. 5-6
Security “Istio in Action” Ch. 7-8
Zero trust “Zero Trust Networks” Ch. 1-5
Kubernetes “Kubernetes in Action” Ch. 1-10

Learning milestones:

  1. Sidecars injected and running → You understand the mesh architecture
  2. mTLS enforced → You understand zero-trust networking
  3. Traffic splitting working → You understand traffic management
  4. Authorization policies enforced → You understand service-level security


Project 14: Configure LiteSpeed for High-Performance WordPress Hosting

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: LiteSpeed Configuration
  • Alternative Programming Languages: PHP, JavaScript
  • Coolness Level: Level 2: Practical but Forgettable
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Web Servers, WordPress, Caching
  • Software or Tool: LiteSpeed Web Server + LSCache
  • Main Book: LiteSpeed Documentation

What you’ll build: A high-performance WordPress hosting environment using LiteSpeed (or OpenLiteSpeed) with the LSCache plugin, configured for optimal performance with properly tuned PHP process settings (LSAPI), ESI (Edge Side Includes), and crawler-based cache warming.

Why it teaches web infrastructure: LiteSpeed is the fastest-growing web server for WordPress hosting. Unlike Nginx which requires complex caching setup, LiteSpeed has a native cache module that integrates deeply with WordPress. Understanding this integration teaches you about application-aware caching.

Core challenges you’ll face:

  • Understanding LiteSpeed architecture → maps to event-driven + Apache-compatible
  • Configuring LSCache → maps to page caching with intelligent invalidation
  • Setting up ESI → maps to partial page caching for dynamic content
  • PHP-FPM optimization → maps to PHP process management
  • Cache warming with crawler → maps to proactive cache population

Key Concepts:

  • LiteSpeed Architecture: LiteSpeed documentation
  • HTTP Caching: “HTTP: The Definitive Guide” Ch. 7 - Gourley
  • WordPress Performance: WordPress Developer documentation
  • ESI: W3C Edge Side Includes specification

Difficulty: Intermediate Time estimate: 1 week Prerequisites: Basic web server configuration, WordPress familiarity, PHP basics


Real World Outcome

You’ll have a WordPress site that loads in under 1 second, handles thousands of concurrent visitors, and automatically invalidates cache when content changes. You’ll understand why LiteSpeed is the preferred server for high-traffic WordPress sites.

Example Output:

# Start OpenLiteSpeed
$ sudo systemctl start lsws

# Check status
$ sudo /usr/local/lsws/bin/lswsctrl status
litespeed is running with PID 1234.

# Install WordPress and LSCache plugin
$ wp plugin install litespeed-cache --activate

# Run performance test BEFORE cache
$ curl -o /dev/null -s -w "Time: %{time_total}s\n" https://mysite.com/
Time: 2.345s  # Slow!

# Enable LSCache
$ wp litespeed-cache option set cache.enabled 1

# Run performance test AFTER cache
$ curl -o /dev/null -s -w "Time: %{time_total}s\n" https://mysite.com/
Time: 0.045s  # 50x faster!

# Check cache headers
$ curl -I https://mysite.com/
HTTP/2 200
x-litespeed-cache: hit
cf-cache-status: DYNAMIC
x-litespeed-tag: D9A_URL.6e9f,D9A_H.1

# View cache statistics
$ wp litespeed-cache admin status
Cache Status:
  Entries: 1,234
  Size: 45.6 MB
  Hit Rate: 98.7%

# Trigger cache purge for specific post
$ wp litespeed-cache purge 123
Purging post 123 from cache...
Purged URLs: /sample-post/

# View crawler status
$ wp litespeed-cache crawler status
Crawler Status: Active
  Pages crawled: 456
  Pages remaining: 23
  Last run: 2024-12-22 10:00:00

# Check LiteSpeed real-time stats
$ cat /tmp/lshttpd/.rtreport
VERSION: 1.0
UPTIME: 86400
BPS_IN: 123456
BPS_OUT: 7654321
SSL_BPS_IN: 98765
REQ_RATE: 45.6

The Core Question You’re Answering

“How do you make WordPress fast without manually configuring complex caching rules?”

Before configuring, understand that LiteSpeed’s advantage for WordPress is deep integration. The LSCache plugin communicates directly with the server’s cache module, enabling automatic invalidation when content changes—something that requires complex configuration with Nginx or Varnish.


Concepts You Must Understand First

Stop and research these before configuring:

  1. LiteSpeed vs OpenLiteSpeed
    • What features are in Enterprise vs Open?
    • What’s the Apache compatibility layer?
    • How does LSAPI compare to PHP-FPM?
    • Book Reference: LiteSpeed documentation
  2. LSCache Plugin
    • How does the plugin communicate with the server?
    • What are cache tags and how do they enable smart invalidation?
    • What’s the difference between Public and Private cache?
    • Book Reference: LSCache plugin documentation
  3. Edge Side Includes (ESI)
    • What are ESI blocks?
    • When should you use ESI vs full-page caching?
    • What’s the performance impact of ESI?
    • Book Reference: W3C ESI specification
  4. PHP Process Management
    • What’s LSAPI vs PHP-FPM?
    • How do you size the PHP process pool?
    • What’s OPcache and how does it help?
    • Book Reference: PHP documentation

Questions to Guide Your Design

Before configuring, think through these:

  1. Caching Strategy
    • What pages should be cached (all public, specific)?
    • How long should cache live (TTL)?
    • What should bypass cache (logged-in users, cart)?
  2. Dynamic Content
    • What content varies per user (login status, cart)?
    • Will you use ESI for partially dynamic pages?
    • How will you handle AJAX requests?
  3. Cache Warming
    • Will you use the built-in crawler?
    • What pages should be pre-warmed?
    • How often should the crawler run?
  4. Invalidation
    • When should cache be purged (post update, comment)?
    • Will you use manual purge triggers?
    • How will you handle plugin/theme updates?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What makes LiteSpeed faster than Nginx for WordPress?”
  2. “How does LSCache automatically invalidate cache when content changes?”
  3. “What’s Edge Side Includes (ESI) and when would you use it?”
  4. “How does LSAPI differ from PHP-FPM?”
  5. “What’s the difference between page cache and object cache?”
  6. “How would you troubleshoot a page that’s not being cached?”
  7. “How would you configure caching for a WooCommerce site with logged-in users?”

Hints in Layers

Hint 1: Starting Point Install OpenLiteSpeed from the official repository, install WordPress, and activate the LSCache plugin. Basic caching should work immediately.

Hint 2: Cache Configuration Most settings can be configured in the LSCache plugin admin. Start with defaults, then tune based on your site’s needs.

Hint 3: ESI for Dynamic Elements Use ESI for navigation bars showing login status, shopping carts, or user-specific content while keeping the main page cached.
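
Generic ESI markup looks like this (the fragment URL is an assumption); the enclosing page stays in the public cache while the included fragment is fetched and cached separately:

<div id="mini-cart">
  <esi:include src="/esi/mini-cart" />
</div>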

Hint 4: Debugging Check response headers for x-litespeed-cache: hit/miss. Use the plugin’s Debug Log feature. Check /usr/local/lsws/logs/error.log for server errors.


Books That Will Help

Topic Book Chapter
LiteSpeed configuration LiteSpeed Docs All
HTTP caching “HTTP: The Definitive Guide” Ch. 7
WordPress optimization WordPress Codex Performance
PHP optimization PHP documentation OPcache, configuration

Learning milestones:

  1. Basic caching working → You understand LSCache fundamentals
  2. Cache hit ratio > 90% → You understand cache configuration
  3. ESI for dynamic content → You understand partial page caching
  4. Crawler warming cache → You understand proactive performance


Project 15: Integrate with CDNs: Cloudflare, Fastly, and Akamai

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: Various (Cloudflare Workers: JavaScript, Fastly VCL, Akamai EdgeWorkers: JavaScript)
  • Alternative Programming Languages: Python (API scripting)
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 3: Advanced
  • Knowledge Area: CDN, Edge Computing, Caching
  • Software or Tool: Cloudflare, Fastly, Akamai
  • Main Book: Each CDN’s documentation

What you’ll build: Configure all three major CDNs in front of your application—Cloudflare (with Workers for edge compute), Fastly (with VCL for custom logic), and Akamai (with EdgeWorkers). You’ll implement caching rules, security features, and edge compute functions on each platform.

Why it teaches web infrastructure: These three CDNs power most of the internet’s traffic. Each has different philosophies: Cloudflare emphasizes ease of use and edge compute, Fastly emphasizes real-time and customization, Akamai emphasizes enterprise scale and ISP embedding. Understanding all three gives you a complete picture of modern content delivery.

Core challenges you’ll face:

  • Understanding CDN architecture → maps to PoPs, anycast, and edge caching
  • Configuring cache rules → maps to when to cache, cache keys, TTLs
  • Writing edge functions → maps to Cloudflare Workers, Fastly Compute, Akamai EdgeWorkers
  • Setting up security features → maps to WAF, DDoS protection, bot management
  • Cache invalidation → maps to instant purge, soft purge, cache tags

Key Concepts:

  • CDN Architecture: “Computer Networks” Ch. 7.5 - Tanenbaum
  • Edge Computing: CDN documentation, serverless edge patterns
  • HTTP Caching: “HTTP: The Definitive Guide” Ch. 7 - Gourley
  • Web Security: “Serious Cryptography” Ch. 14 - Aumasson

Difficulty: Advanced Time estimate: 3-4 weeks Prerequisites: Understanding of HTTP caching, DNS, JavaScript, production web experience


Real World Outcome

You’ll have experience with all three major CDNs, understanding their strengths and differences. You’ll be able to recommend and configure the right CDN for any use case, and you’ll have hands-on experience with edge computing on each platform.

Example Output:

# ===== CLOUDFLARE =====

# Configure caching via Page Rules
$ curl -X POST "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/pagerules" \
  -H "Authorization: Bearer $CF_TOKEN" \
  --data '{
    "targets": [{"target": "url", "constraint": {"operator": "matches", "value": "*.example.com/static/*"}}],
    "actions": [{"id": "cache_level", "value": "cache_everything"}, {"id": "edge_cache_ttl", "value": 86400}]
  }'

# Deploy Cloudflare Worker for A/B testing
$ wrangler publish
Uploaded worker (1.23 KB)
Published to https://example.com/ (via Route)

# Test A/B test header
$ curl -I https://example.com/
X-AB-Test: variant-b
CF-Cache-Status: HIT

# ===== FASTLY =====

# Create VCL snippet for custom caching
$ curl -X POST "https://api.fastly.com/service/$SERVICE_ID/version/$VERSION/snippet" \
  -H "Fastly-Key: $FASTLY_KEY" \
  --data 'name=cache-static&type=recv&content=if (req.url ~ "^/static/") { set req.http.X-Cache-TTL = "86400"; }'

# Instant purge by URL
$ curl -X POST "https://api.fastly.com/purge/www.example.com/static/app.js" \
  -H "Fastly-Key: $FASTLY_KEY"
{"status": "ok", "id": "abc123"}
# Purge takes <150ms globally!

# Deploy Compute@Edge function
$ fastly compute publish
✓ Deployed to https://example.com

# ===== AKAMAI =====

# Configure caching via Property Manager API
$ akamai property update example.com \
  --behavior caching \
  --option default-ttl=86400 \
  --option must-revalidate=false

# Deploy EdgeWorker
$ akamai ew upload --bundle bundle.tgz
EdgeWorker ID: 12345 uploaded successfully

# Purge by CP Code
$ akamai purge invalidate --cp-code 123456
Invalidation request accepted. Estimated completion: 5 seconds.

# Check cache status
$ curl -I https://example.com/
X-Akamai-Cache: TCP_HIT
X-Akamai-Request-ID: abc123.456
X-Akamai-Staging: ESSL

# ===== COMPARISON =====

# Test TTFB from different locations
$ for cdn in cloudflare fastly akamai; do
    echo "Testing $cdn..."
    curl -o /dev/null -s -w "TTFB: %{time_starttransfer}s\n" https://$cdn.example.com/
done
Testing cloudflare...
TTFB: 0.045s
Testing fastly...
TTFB: 0.038s
Testing akamai...
TTFB: 0.052s

The Core Question You’re Answering

“What are the differences between Cloudflare, Fastly, and Akamai, and when would you choose each one?”

Before configuring any CDN, understand that they all do the same fundamental job—cache content close to users—but with different trade-offs. Cloudflare is the easiest and cheapest, Fastly is the most customizable and real-time, Akamai is the most reliable and enterprise-focused.


Concepts You Must Understand First

Stop and research these before configuring:

  1. CDN Architecture
    • What is a Point of Presence (PoP)?
    • How does anycast routing work?
    • What’s the difference between pull and push CDN?
    • Book Reference: “Computer Networks” Ch. 7.5 - Tanenbaum
  2. Cache-Control and Caching Logic
    • How do CDNs interpret Cache-Control headers?
    • What’s the difference between browser cache and edge cache?
    • How do surrogate keys (cache tags) work?
    • Book Reference: “HTTP: The Definitive Guide” Ch. 7
  3. Edge Computing
    • What’s the difference between CDN edge and cloud edge?
    • What can you do in a Workers/Compute function?
    • What are the memory/time limits?
    • Book Reference: Each CDN’s edge compute documentation
  4. Security Features
    • What does a CDN’s WAF protect against?
    • How does DDoS protection work at the edge?
    • What’s bot management?
    • Book Reference: Each CDN’s security documentation

Questions to Guide Your Design

Before configuring, think through these:

  1. CDN Selection
    • What’s your budget?
    • How important is real-time cache invalidation?
    • Do you need edge compute?
  2. Caching Strategy
    • What content should be cached at the edge?
    • What TTLs for different content types?
    • How will you handle personalized content?
  3. Security Configuration
    • What WAF rules do you need?
    • How will you handle rate limiting?
    • What bot protection is required?
  4. Edge Compute Use Cases
    • A/B testing at the edge?
    • Geolocation-based routing?
    • Request/response transformation?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What are the main differences between Cloudflare, Fastly, and Akamai?”
  2. “How would you implement A/B testing at the CDN edge?”
  3. “Explain how cache invalidation works differently across CDNs.”
  4. “What’s the difference between Cloudflare Workers and Fastly Compute@Edge?”
  5. “How would you debug a caching issue on a CDN?”
  6. “When would you choose Akamai over Cloudflare or Fastly?”
  7. “How does a CDN protect against DDoS attacks?”

Hints in Layers

Hint 1: Starting Point Start with Cloudflare (free tier) for the easiest setup. Point DNS to Cloudflare, configure basic caching, and deploy a simple Worker.
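
A minimal Worker sketch that produces the X-AB-Test header shown in the Example Output (module syntax; no sticky bucketing, so a visitor may flip variants between requests):

export default {
  async fetch(request) {
    // pick a variant, fetch the upstream response, and tag it
    const variant = Math.random() < 0.5 ? "variant-a" : "variant-b";
    const upstream = await fetch(request);
    const response = new Response(upstream.body, upstream); // copy so headers are mutable
    response.headers.set("X-AB-Test", variant);
    return response;
  }
};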

Hint 2: Fastly VCL Fastly uses VCL (Varnish Configuration Language). Start with their boilerplate and modify recv, deliver, and error subroutines.
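
A small snippet sketch of that kind of custom logic, written for the fetch stage (the path pattern and TTL are assumptions), forcing a one-day edge TTL on static assets:

if (req.url ~ "^/static/") {
  set beresp.ttl = 86400s;
}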

Hint 3: Akamai Property Manager Akamai uses a rule-based system. Start with a basic property, add behaviors for caching and security, then test in staging.

Hint 4: Debugging Use each CDN’s response headers to understand cache status. Cloudflare: CF-Cache-Status, Fastly: X-Served-By, Akamai: X-Cache.


Books That Will Help

Topic Book Chapter
CDN fundamentals “Computer Networks” by Tanenbaum Ch. 7.5
HTTP caching “HTTP: The Definitive Guide” Ch. 7
Cloudflare Cloudflare documentation All
Fastly Fastly documentation All
Akamai Akamai documentation All

Learning milestones:

  1. Basic caching on each CDN → You understand the fundamentals
  2. Custom caching rules → You understand configuration
  3. Edge function deployed → You understand edge compute
  4. Security features configured → You understand CDN security

Final Project: Production-Grade Multi-Tool Infrastructure Stack

  • File: LEARN_WEB_INFRASTRUCTURE_TOOLS_DEEP_DIVE.md
  • Main Programming Language: Docker/Docker Compose (YAML), Bash, Python
  • Alternative Programming Languages: Go, Node.js
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 4. The “Open Core” Infrastructure (Enterprise Scale)
  • Difficulty: Level 5: Master (The First-Principles Wizard)
  • Knowledge Area: Full Stack Infrastructure / DevOps / Platform Engineering
  • Software or Tool: All 20 tools in combination
  • Main Book: “Site Reliability Engineering” by Google

What you’ll build: A complete, production-ready infrastructure stack that combines multiple tools from this learning path: Nginx as web server, HAProxy for load balancing, Kong as API gateway, Squid as egress (forward) proxy, Traefik for dynamic container routing, Envoy as sidecar proxy, with Cloudflare in front. You’ll deploy a multi-service application that demonstrates real-world patterns.

Why it teaches everything: This project forces you to understand how ALL these tools work together. You’ll see the actual request flow from user through CDN → Load Balancer → API Gateway → Service Mesh → Application Server → Backend. You’ll understand WHY each layer exists and WHEN to use each tool.

Core challenges you’ll face:

  • Request routing through multiple layers → maps to understanding request lifecycle
  • TLS termination at the right layer → maps to security architecture
  • Health checks at each tier → maps to reliability patterns
  • Log aggregation from all components → maps to observability
  • Configuration management → maps to infrastructure as code

Key Concepts:

  • Multi-tier Architecture: “Site Reliability Engineering” Ch. 1 - Google
  • Service Mesh Patterns: “Istio in Action” Ch. 1-3 - Christian Posta
  • Load Balancing: “HAProxy Documentation” - Configuration Reference
  • API Gateway Patterns: “Kong Documentation” - Plugin Development Guide
  • Infrastructure as Code: “Infrastructure as Code” Ch. 4 - Kief Morris

Difficulty: Master Time estimate: 1-2 months Prerequisites: Complete Projects 1-15, strong Docker/Kubernetes knowledge, understanding of networking fundamentals, TLS/SSL certificates


Real World Outcome

You’ll have a fully functional infrastructure stack running locally (via Docker Compose) or on Kubernetes that looks like this:

┌─────────────────────────────────────────────────────────────────────┐
│                         YOUR BROWSER                                │
└─────────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    CLOUDFLARE (CDN/WAF/DDoS)                        │
│                    - Edge caching for static                         │
│                    - Workers for A/B testing                         │
│                    - WAF rules active                                │
└─────────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    HAPROXY (Global Load Balancer)                   │
│                    - SSL termination                                 │
│                    - Rate limiting                                   │
│                    - Health checks                                   │
└─────────────────────────────────────────────────────────────────────┘
                                 │
                    ┌───────────┴───────────┐
                    ▼                       ▼
┌─────────────────────────┐   ┌─────────────────────────┐
│    KONG (API Gateway)   │   │    NGINX (Static)       │
│    - Auth (JWT/OAuth)   │   │    - Static files       │
│    - Rate limiting      │   │    - HTML/CSS/JS        │
│    - Request transform  │   │    - Images             │
└─────────────────────────┘   └─────────────────────────┘
            │
            ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    TRAEFIK (Container Routing)                      │
│                    - Service discovery                               │
│                    - Dynamic config                                  │
│                    - Metrics                                         │
└─────────────────────────────────────────────────────────────────────┘
            │
   ┌────────┼────────┐
   ▼        ▼        ▼
┌──────┐ ┌──────┐ ┌──────┐
│Svc A │ │Svc B │ │Svc C │
│+Envoy│ │+Envoy│ │+Envoy│  ← Envoy sidecar proxies
└──────┘ └──────┘ └──────┘
   │        │        │
   └────────┼────────┘
            ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    SQUID (Egress Proxy)                             │
│                    - Outbound traffic control                        │
│                    - Caching external APIs                           │
│                    - Security filtering                              │
└─────────────────────────────────────────────────────────────────────┘

Example Output:

$ docker-compose up -d
Creating network "infra_default" with the default driver
Creating infra_postgres_1       ... done
Creating infra_redis_1          ... done
Creating infra_squid_1          ... done
Creating infra_service-a_1      ... done
Creating infra_service-b_1      ... done
Creating infra_service-c_1      ... done
Creating infra_envoy-a_1        ... done
Creating infra_envoy-b_1        ... done
Creating infra_envoy-c_1        ... done
Creating infra_traefik_1        ... done
Creating infra_kong_1           ... done
Creating infra_kong-database_1  ... done
Creating infra_nginx_1          ... done
Creating infra_haproxy_1        ... done

$ curl -v https://myapp.local/api/v1/users
*   Trying 127.0.0.1:443...
* Connected to myapp.local (127.0.0.1) port 443
* TLS handshake completed
> GET /api/v1/users HTTP/2
> Host: myapp.local
>
< HTTP/2 200
< x-kong-upstream-latency: 12
< x-kong-proxy-latency: 3
< x-request-id: abc-123-def-456
< x-served-by: haproxy-01
< x-traefik-router: api-router@docker
< x-envoy-upstream-service-time: 5
< cf-cache-status: DYNAMIC
<
[
  {"id": 1, "name": "Alice"},
  {"id": 2, "name": "Bob"}
]

$ curl https://myapp.local/health
{
  "status": "healthy",
  "components": {
    "haproxy": "up",
    "kong": "up",
    "traefik": "up",
    "service-a": "up",
    "service-b": "up",
    "service-c": "up",
    "postgres": "up",
    "redis": "up"
  }
}

# Observe request flow through all layers
$ docker logs infra_haproxy_1 | tail -1
[HAProxy] 192.168.1.100:54321 -> kong_backend/kong-1 200 "GET /api/v1/users HTTP/2.0" 45ms

$ docker logs infra_kong_1 | tail -1
[Kong] client=10.0.0.1 route=users-route service=users-service latency_kong=3 latency_upstream=12 status=200

$ docker logs infra_traefik_1 | tail -1
[Traefik] 10.0.0.2 - - "GET /users HTTP/1.1" 200 142 - 5ms

$ docker logs infra_envoy-a_1 | tail -1
[Envoy] upstream_cluster=local_service method=GET path=/users response_code=200 duration=5

The Core Question You’re Answering

“How do all these infrastructure tools fit together in a real production system, and why do we need each layer?”

This is THE question that separates junior from senior engineers. Anyone can configure one tool. Understanding how and WHY tools are layered together—what each layer adds, what the trade-offs are—is what makes you an infrastructure expert.


Concepts You Must Understand First

Stop and research these before building:

  1. Defense in Depth
    • Why do we have multiple layers of security?
    • What does each layer protect against?
    • What’s the principle of least privilege?
    • Book Reference: “Site Reliability Engineering” Ch. 14 - Google
  2. Request Routing Tiers
    • Why separate CDN from Load Balancer from API Gateway?
    • When should routing decisions be made at each tier?
    • What’s the performance impact of each hop?
    • Book Reference: “Building Microservices” Ch. 11 - Sam Newman
  3. Observability
    • How do you trace a request through multiple services?
    • What metrics matter at each layer?
    • How do you correlate logs across systems?
    • Book Reference: “Distributed Systems Observability” - Cindy Sridharan
  4. Failure Modes
    • What happens if each component fails?
    • How do circuit breakers work?
    • What’s graceful degradation?
    • Book Reference: “Release It!” Ch. 5 - Michael Nygard
  5. Configuration Management
    • How do you version infrastructure config?
    • What’s GitOps?
    • How do you handle secrets?
    • Book Reference: “Infrastructure as Code” Ch. 4 - Kief Morris

Questions to Guide Your Design

Before building, think through these:

  1. Architecture Decisions
    • Which requests need to hit the API gateway vs. go direct to static?
    • Where should authentication happen?
    • What’s your caching strategy at each layer?
  2. Failure Handling
    • If Kong goes down, what happens to API traffic?
    • If one Envoy sidecar fails, do other services continue?
    • What’s your circuit breaker strategy?
  3. Security Layers
    • Where does TLS terminate?
    • How does mTLS work between services?
    • What does each WAF layer protect?
  4. Scaling Considerations
    • Which components need horizontal scaling?
    • How do you scale the stateful components (Kong DB, etc.)?
    • What’s your auto-scaling trigger?
  5. Observability
    • How will you trace a request end-to-end?
    • What dashboards do you need?
    • How will you be alerted to problems?

Thinking Exercise

Trace a Request Through Every Layer

Before coding, trace this request manually:

User clicks "Login" button
  → https://myapp.local/api/v1/auth/login
  → POST { "email": "user@example.com", "password": "***" }

Draw the complete flow:

1. Browser DNS lookup for myapp.local
   └─→ Returns: Cloudflare edge IP (anycast)

2. Request hits Cloudflare edge (nearest PoP)
   └─→ Cloudflare checks:
       - Is this cached? NO (POST request)
       - WAF rules pass? YES
       - Rate limit exceeded? NO
   └─→ Forwards to origin (HAProxy)

3. Request hits HAProxy
   └─→ HAProxy checks:
       - Which backend pool? API (path starts with /api)
       - SSL termination? YES (decrypt here)
       - Health check passed? YES
       - Rate limit for this IP? OK
   └─→ Load balances to Kong instance

4. Request hits Kong
   └─→ Kong checks:
       - Which route? /api/v1/auth/* → auth-service
       - Auth plugin? Not for /login endpoint
       - Rate limit plugin? 100 req/min per IP → OK
       - Transform request? Add X-Request-ID header
   └─→ Forwards to Traefik

5. Request hits Traefik
   └─→ Traefik checks:
       - Which router? api-router (Host: myapp.local)
       - Which service? auth-service@docker
       - Container healthy? YES
   └─→ Routes to auth-service container

6. Request hits Envoy sidecar
   └─→ Envoy checks:
       - Circuit breaker open? NO
       - Retry policy? 3 retries for 5xx
       - Timeout? 30s
   └─→ Forwards to local app (localhost:8080)

7. Application processes request
   └─→ App connects to PostgreSQL directly (internal traffic; Squid only proxies egress to external APIs)
   └─→ App validates credentials
   └─→ App generates JWT token
   └─→ Returns response

8. Response travels back through each layer
   └─→ Envoy adds x-envoy-upstream-service-time header
   └─→ Traefik adds x-traefik-router header
   └─→ Kong adds x-kong-proxy-latency header
   └─→ HAProxy adds x-served-by header
   └─→ Cloudflare adds CF-Cache-Status: DYNAMIC
   └─→ Browser receives response

Questions while tracing:

  • At which layer would you add auth for protected endpoints?
  • If the auth-service is down, at which layer does the request fail?
  • How does each layer know about the health of downstream services?
  • What happens if you need to debug a 500 error?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Walk me through how a request flows through your infrastructure stack.”
  2. “Why do you have both HAProxy AND Kong? Isn’t that redundant?”
  3. “Where would you put authentication in this architecture?”
  4. “How would you debug a slow request that’s timing out?”
  5. “What happens if your Kong database goes down?”
  6. “How do you handle SSL certificate rotation?”
  7. “Explain your circuit breaker strategy.”
  8. “How would you add a new microservice to this stack?”
  9. “What’s your disaster recovery plan for this infrastructure?”
  10. “How do you prevent a single service from taking down the entire system?”

Hints in Layers

Hint 1: Start Simple. Don't build everything at once. Start with just HAProxy → Nginx for static files and verify it works. Then add Kong, then Traefik, verifying each layer before adding the next.

Hint 2: Docker Compose First. Build everything in Docker Compose before attempting Kubernetes. Compose gives you faster iteration and easier debugging. Only move to K8s when Compose works perfectly.
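
A minimal first Compose file for the HAProxy → Nginx starting point might look like this; the image tags and published port are assumptions, and the mounted paths match the layout in Hint 3:

# docker-compose.yml — first iteration: HAProxy in front of Nginx
services:
  haproxy:
    image: haproxy:2.9
    ports:
      - "8080:80"                       # reach HAProxy at http://localhost:8080
    volumes:
      - ./haproxy/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg:ro
    depends_on:
      - nginx

  nginx:
    image: nginx:1.25
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro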

Hint 3: Configuration Strategy. Create a directory structure like:

infrastructure/
├── docker-compose.yml
├── haproxy/
│   └── haproxy.cfg
├── kong/
│   └── kong.yml
├── traefik/
│   └── traefik.yml
├── nginx/
│   └── nginx.conf
├── envoy/
│   └── envoy.yaml
└── squid/
    └── squid.conf

Hint 4: Health Check Chain. Configure health checks at each layer to check the NEXT layer:

  • Cloudflare checks HAProxy health
  • HAProxy checks Kong health
  • Kong checks Traefik health
  • Traefik checks container health
  • Envoy checks app health
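
For the Traefik → container link in that chain, one option (assuming the Docker provider and a /health endpoint on the app, both assumptions here) is to declare the check as labels on the service container:

# docker-compose.yml fragment — Traefik health-checking a backend container
services:
  auth-service:
    image: myapp/auth-service:latest    # hypothetical image
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.auth.rule=PathPrefix(`/api/v1/auth`)"
      - "traefik.http.services.auth.loadbalancer.healthcheck.path=/health"
      - "traefik.http.services.auth.loadbalancer.healthcheck.interval=10s"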

Hint 5: Debugging Tools. Use these to trace requests:

  • curl -v for headers at each hop
  • docker logs <container> for each component
  • Jaeger/Zipkin for distributed tracing
  • Prometheus/Grafana for metrics
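
For example, replaying the login request from the thinking exercise and filtering the response headers shows which layers the request actually crossed (the hostname and header names follow that trace; adjust them to your stack):

# Replay the request and keep only the per-layer response headers
# (add -k if you're using a self-signed certificate locally)
curl -sv https://myapp.local/api/v1/auth/login \
     -X POST -H 'Content-Type: application/json' \
     -d '{"email":"user@example.com","password":"secret"}' 2>&1 \
  | grep -iE '^< (x-request-id|x-kong|x-envoy|x-served-by|cf-)'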

Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Overall architecture | “Site Reliability Engineering” by Google | Ch. 1, 14 |
| Microservices patterns | “Building Microservices” by Sam Newman | Ch. 11-13 |
| Failure handling | “Release It!” by Michael Nygard | Ch. 4-5 |
| Observability | “Distributed Systems Observability” by Sridharan | All |
| Infrastructure as Code | “Infrastructure as Code” by Kief Morris | Ch. 4-6 |
| Service mesh | “Istio in Action” by Christian Posta | Ch. 1-5 |
| Load balancing | “The Art of Scalability” by Abbott & Fisher | Ch. 23 |

Learning milestones:

  1. Basic stack running → You understand component interaction
  2. End-to-end request traced → You understand the request lifecycle
  3. Failure injection tested → You understand resilience patterns
  4. Metrics and logs unified → You understand observability
  5. Production-ready config → You understand real-world deployment

Project Comparison Table

| # | Project Name | Tools Covered | Difficulty | Time | Depth | Fun Factor |
|---|--------------|---------------|------------|------|-------|------------|
| 1 | Apache Virtual Hosts | Apache HTTP Server | Beginner | Weekend | ⭐⭐ | ⭐⭐ |
| 2 | Nginx Reverse Proxy | Nginx | Beginner | Weekend | ⭐⭐⭐ | ⭐⭐⭐ |
| 3 | HAProxy Rate Limiter | HAProxy | Intermediate | 1 week | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| 4 | Caddy Automatic HTTPS | Caddy | Beginner | Weekend | ⭐⭐⭐ | ⭐⭐⭐⭐ |
| 5 | Java App Servers | Tomcat, Jetty, Undertow | Intermediate | 1 week | ⭐⭐⭐ | ⭐⭐ |
| 6 | OpenResty API Gateway | OpenResty (Nginx+Lua) | Advanced | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 7 | Envoy Service Proxy | Envoy | Advanced | 1-2 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| 8 | Traefik K8s Ingress | Traefik | Intermediate | 1 week | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 9 | Squid Forward Proxy | Squid | Intermediate | 1 week | ⭐⭐⭐ | ⭐⭐ |
| 10 | Traffic Server Cache | Apache Traffic Server | Advanced | 1-2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| 11 | Kestrel + IIS | Kestrel, IIS | Intermediate | 1 week | ⭐⭐⭐ | ⭐⭐ |
| 12 | Kong API Gateway | Kong | Intermediate | 1 week | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 13 | Istio Service Mesh | Istio | Expert | 2-3 weeks | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 14 | LiteSpeed WordPress | LiteSpeed | Intermediate | 1 week | ⭐⭐⭐ | ⭐⭐⭐ |
| 15 | CDN Comparison | Cloudflare, Fastly, Akamai | Advanced | 2 weeks | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| 16 | Full Infrastructure Stack | ALL TOOLS | Master | 1-2 months | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |

Recommendation

Starting Points Based on Your Experience:

Complete Beginner to Infrastructure: Start with Project 1 (Apache) or Project 2 (Nginx). These are foundational and teach you how web servers actually work before adding complexity.

Intermediate (Know basics, want practical skills): Jump to Project 3 (HAProxy) and Project 8 (Traefik). These are the most commonly used in modern DevOps and give you immediately marketable skills.

Advanced (Ready for production-grade tools): Focus on Project 6 (OpenResty), Project 7 (Envoy), and Project 12 (Kong). These are what you’ll find in serious production systems.

Expert Track (Service Mesh & Cloud-Native): Tackle Project 13 (Istio) and Project 15 (CDNs). These represent the cutting edge of infrastructure and are highly valued at large tech companies.

Week 1-2: Project 2 (Nginx) → Foundation of reverse proxying
    ↓
Week 3-4: Project 3 (HAProxy) → Load balancing deep dive
    ↓
Week 5-6: Project 12 (Kong) → API Gateway patterns
    ↓
Week 7-8: Project 8 (Traefik) → Container orchestration
    ↓
Week 9-10: Project 7 (Envoy) → Service proxy internals
    ↓
Week 11-12: Project 13 (Istio) → Service mesh architecture
    ↓
Week 13-14: Project 15 (CDNs) → Edge computing
    ↓
Month 3-4: Project 16 (Full Stack) → Everything together

Summary

This learning path covers web infrastructure tools through 16 hands-on projects. Here’s the complete list:

| # | Project Name | Main Language | Difficulty | Time Estimate |
|---|--------------|---------------|------------|---------------|
| 1 | Apache HTTP Server Virtual Hosts | Configuration | Beginner | Weekend |
| 2 | Nginx Reverse Proxy with Load Balancing | Configuration/Bash | Beginner | Weekend |
| 3 | HAProxy Rate Limiter & Circuit Breaker | Configuration/Python | Intermediate | 1 week |
| 4 | Caddy Automatic HTTPS Server | Caddyfile/Go | Beginner | Weekend |
| 5 | Java Application Servers Comparison | Java | Intermediate | 1 week |
| 6 | OpenResty (Nginx + Lua) API Gateway | Lua | Advanced | 1-2 weeks |
| 7 | Envoy Service Proxy | YAML/Go | Advanced | 1-2 weeks |
| 8 | Traefik Kubernetes Ingress Controller | YAML/Docker | Intermediate | 1 week |
| 9 | Squid Forward Proxy & Content Filter | Configuration | Intermediate | 1 week |
| 10 | Apache Traffic Server Caching Layer | Configuration | Advanced | 1-2 weeks |
| 11 | Kestrel + IIS for ASP.NET Core | C#/PowerShell | Intermediate | 1 week |
| 12 | Kong API Gateway with Plugins | YAML/Lua | Intermediate | 1 week |
| 13 | Istio Service Mesh | YAML/Go | Expert | 2-3 weeks |
| 14 | LiteSpeed for WordPress Optimization | Configuration/PHP | Intermediate | 1 week |
| 15 | CDN Comparison (Cloudflare/Fastly/Akamai) | JavaScript/VCL | Advanced | 2 weeks |
| 16 | Production Multi-Tool Infrastructure Stack | Docker/Python/Bash | Master | 1-2 months |

All 20 Tools Covered:

| Category | Tools Covered |
|----------|---------------|
| Web Servers | Apache HTTP Server, Nginx, LiteSpeed, Caddy, Microsoft IIS |
| Reverse Proxies/Load Balancers | HAProxy, Envoy, Traefik |
| API Gateways | Kong, OpenResty, Istio Ingress Gateway |
| Caching Proxies | Squid, Apache Traffic Server |
| Application Servers | Tomcat, Jetty, Undertow, Kestrel |
| CDNs | Cloudflare, Fastly, Akamai |

  • For beginners: Start with projects #2, #4, #1
  • For intermediate: Jump to projects #3, #12, #8
  • For advanced: Focus on projects #6, #7, #13, #15
  • For complete mastery: Complete all 15 projects, then tackle #16

Expected Outcomes

After completing these projects, you will:

  • Understand how HTTP requests flow from browser to backend and back
  • Know when to use each tool and why (Apache vs Nginx, HAProxy vs Traefik, etc.)
  • Configure web servers, reverse proxies, load balancers, and API gateways
  • Implement rate limiting, circuit breakers, and health checks
  • Deploy service meshes and understand sidecar proxy patterns
  • Configure CDNs for caching, security, and edge computing
  • Build production-grade multi-tier infrastructure stacks
  • Debug complex distributed systems issues
  • Answer infrastructure interview questions with confidence

You’ll have built 16 working projects that demonstrate deep understanding of web infrastructure from first principles to production-grade systems.