
API GATEWAY ENGINEERING MASTERY

Learn API Gateway Engineering: From Proxy to Edge Intelligence

Goal: Deeply understand the architecture, implementation, and extension of modern API gateways like Kong and Envoy. You will move beyond “just using” gateways to engineering them, mastering traffic control, security protocols, and edge-computing extensions to build resilient, high-performance distributed systems.


Why API Gateway Engineering Matters

In the monolithic era, security and traffic management were baked into the application. In the microservices era, this approach fails. If you have 100 services, you cannot implement rate-limiting, authentication, and logging 100 times. You need a centralized, high-performance “front door.”

API Gateway Engineering is the discipline of managing this entry point. It’s where the internet meets your internal network.

The Business Case for API Gateway Mastery

The numbers tell the story: The global API Gateway market reached $4.3 billion in 2024 and is projected to hit $20.2 billion by 2033, growing at a 20.8% CAGR (source). This explosive growth reflects a fundamental shift in how software is architected.

Current Industry Adoption (2024-2025):

  • 63% of enterprises integrate gateways into CI/CD pipelines for speed, scalability, and monitoring (source)
  • Microservices adoption has reached 68%, driving gateway demand (source)
  • 58% of US cloud applications use gateways for traffic management and real-time diagnostics (source)
  • 62% of new API products now ship with advanced security features (rate limiting, token introspection, DDoS prevention) (source)

Why Engineers Must Master This

  • Separation of Concerns: Let developers focus on business logic while you handle the “plumbing” (TLS, Auth, CORS).
  • Resilience: Gateways are the first line of defense against DDoS and cascading failures via circuit breakers.
  • Observability: If it doesn’t pass through the gateway, it didn’t happen. It’s the ultimate source of truth for system health.
  • Evolution: Modern gateways are moving toward “Edge Computing,” where transformation and logic happen as close to the user as possible.

The Architecture Shift

Monolithic Era                    Microservices Era (Today)
┌─────────────────────┐          ┌──────────────────────────┐
│   Single App        │          │   API Gateway            │
│                     │          │  (Kong/Envoy)            │
│ • Auth              │          │                          │
│ • Rate Limiting     │          │ • TLS Termination        │
│ • Logging           │          │ • Auth/AuthZ             │
│ • Business Logic    │          │ • Rate Limiting          │
│                     │          │ • Observability          │
└─────────────────────┘          └─────────┬────────────────┘
                                           │
                     ┌─────────────────────┼────────────────────┐
                     │                     │                    │
                     ▼                     ▼                    ▼
              ┌─────────────┐      ┌─────────────┐     ┌─────────────┐
              │  Service A  │      │  Service B  │     │  Service C  │
              │ (Logic only)│      │ (Logic only)│     │ (Logic only)│
              └─────────────┘      └─────────────┘     └─────────────┘

The Gateway becomes the critical control point: Envoy Proxy (created at Lyft in 2016) is now the de-facto standard for service mesh and API gateway functionality, adopted by Google, Stripe, and countless enterprises (source). Kong Gateway, built on NGINX/OpenResty, powers traffic “twenty times that of Netflix worldwide” (source).


Core Concept Analysis

1. Data Plane vs. Control Plane

Modern gateways are split into two distinct functional areas. Understanding this split is fundamental to scaling.

      Users (The Internet)
           │
           ▼
┌───────────────────────────┐
│        DATA PLANE         │ <── High Performance
│ (Envoy / Kong Workers)    │     (C++ / Nginx)
│                           │
│  - Packet Processing      │
│  - TLS Termination        │
│  - Header Mutation        │
│  - Load Balancing         │
└───────────┬───────────────┘
            │
            │ xDS / Admin API (Configuration)
            │
┌───────────▼───────────────┐
│       CONTROL PLANE       │ <── Intelligence
│ (Istio / Kong Admin)      │     (Go / Python)
│                           │
│  - Policy Definition      │
│  - Service Discovery      │
│  - Certificate Mgmt       │
│  - Monitoring Aggregation │
└───────────────────────────┘
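
To make the split concrete, here is a minimal data-plane-side sketch: an Envoy bootstrap fragment that tells the proxy to fetch its configuration from a control plane over xDS (the aggregated ADS variant). The cluster name control_plane and its address are placeholders, and exact field names can vary slightly between Envoy versions.

dynamic_resources:
  ads_config:                    # one gRPC stream for all resource types (ADS)
    api_type: GRPC
    transport_api_version: V3
    grpc_services:
      - envoy_grpc: { cluster_name: control_plane }
  lds_config: { ads: {} }        # listeners come from the control plane
  cds_config: { ads: {} }        # clusters come from the control plane

static_resources:
  clusters:
    - name: control_plane        # placeholder: your control plane (Istio, Kong Admin, or your own)
      type: STRICT_DNS
      # note: this cluster must be configured for HTTP/2, since xDS is gRPC
      load_assignment:
        cluster_name: control_plane
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address: { address: controlplane.local, port_value: 18000 }

The data plane keeps serving traffic with its last known configuration even if the control plane disappears, which is exactly the decoupling the diagram above describes.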

2. The Filter Chain (Envoy Architecture)

Envoy processes requests through a series of filters. This is a “Pipe and Filter” architecture that allows for extreme modularity.

Downstream (Client) ──▶ [ Listener ]
                            │
                            ▼
                    [ Network Filter 1 ] (e.g., TLS)
                            │
                            ▼
                    [ Network Filter 2 ] (e.g., HTTP Connection Manager)
                            │
                            ▼
                    [ HTTP Filter 1 ] (e.g., Rate Limit)
                            │
                            ▼
                    [ HTTP Filter 2 ] (e.g., Router) ──▶ Upstream (Service)
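
As a rough YAML illustration of that chain (a fragment, not a complete listener): the HTTP Connection Manager is itself a network filter, and the HTTP filters run in the order listed, with the router always last.

filter_chains:
  - filters:
      - name: envoy.filters.network.http_connection_manager   # network filter: parses bytes into HTTP
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          http_filters:
            - name: envoy.filters.http.cors                   # HTTP filter 1 (example)
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.http.cors.v3.Cors
            - name: envoy.filters.http.router                 # HTTP filter N: always last, forwards upstream
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
          route_config: {}                                    # routing table omitted; see Project 1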

3. The Plugin Lifecycle (Kong/Nginx Architecture)

Kong, built on Nginx and OpenResty, uses “Phases” to hook into the request/response lifecycle.

  • certificate: Before SSL handshake.
  • rewrite: Before the request is parsed (useful for custom URI changes).
  • access: Authentication and authorization happen here.
  • header_filter: Modify response headers before they go to the client.
  • body_filter: Modify the response body (streaming).
  • log: Asynchronous logging after the client has received the response.

Concept Summary Table

  • Data Plane: The “dumb” but fast worker that handles actual bytes. Must be optimized for latency.
  • Control Plane: The “brain” that pushes configuration to the data plane. Handles state and policy.
  • Filter Chain: Requests are a sequence of transformations. Modularity is key to extension.
  • Service Discovery: How the gateway knows where services live (DNS, Consul, Kubernetes API).
  • xDS Protocol: The specific API protocol Envoy uses to receive dynamic updates without restart.
  • Edge Logic: Moving logic (like JWT validation) to the gateway to protect upstream services.

Deep Dive Reading by Concept

This section maps each concept from above to specific book chapters for deeper understanding. Read these before or alongside the projects to build strong mental models.

Architecture & Theory

  • Microservice Patterns: “Building Microservices” by Sam Newman — Ch. 4: “Integration”
  • Data Plane/Control Plane: “Cloud Native Patterns” by Cornelia Davis — Ch. 7: “Service Discovery”
  • API Patterns: “API Design Patterns” by JJ Geewax — Ch. 18: “API Gateways”

Tool-Specific Mastery

  • Envoy Internals: “Envoy Proxy Essentials” by Richard Johnson — Ch. 2: “Architecture & Core Components”
  • Kong Plugins: “Kong API Gateway Essentials” by Richard Johnson — Ch. 5: “The Plugin Framework”
  • xDS Protocol: “Envoy Proxy Essentials” by Richard Johnson — Ch. 4: “Dynamic Configuration via xDS”

Essential Reading Order

  1. Foundation (Week 1):
    • Building Microservices Ch. 4 (The ‘Why’ of Gateways)
    • Cloud Native Patterns Ch. 7 (Distributed system fundamentals)
  2. The Data Plane (Week 2):
    • Envoy Proxy Essentials Ch. 2 (Understanding the filter chain)
    • Kong API Gateway Essentials Ch. 1-2 (Nginx/OpenResty foundations)

Prerequisites & Background Knowledge

Essential Prerequisites (Must Have)

Before diving into API Gateway engineering, you should have:

  1. Networking Fundamentals
    • Understanding of TCP/IP, HTTP/HTTPS protocols
    • OSI model (especially Layers 4 and 7)
    • DNS resolution and how domain names work
    • Basic understanding of TLS/SSL
  2. Container & Orchestration Basics
    • Docker fundamentals (running containers, port mapping, volumes)
    • Basic YAML syntax for configuration files
    • Familiarity with container networking concepts
  3. Programming Competence
    • Comfortable reading code in at least one language (Go, Python, Lua, or Rust)
    • Understanding of asynchronous programming concepts
    • Basic shell scripting for automation
  4. HTTP & REST APIs
    • HTTP methods (GET, POST, PUT, DELETE)
    • Status codes and their meanings
    • Headers, request/response structure
    • Basic authentication mechanisms (API keys, tokens)

Helpful But Not Required

These concepts will be learned/reinforced through the projects:

  • Kubernetes and service mesh architectures
  • OAuth 2.0 and JWT internals
  • gRPC and Protocol Buffers
  • WebAssembly (Wasm) programming
  • Advanced cryptography (mTLS, certificate rotation)
  • Observability platforms (Prometheus, Jaeger)

Self-Assessment Questions

Can you answer these before starting?

  • Can you explain the difference between a reverse proxy and a forward proxy?
  • Do you understand what happens during a TLS handshake?
  • Can you describe how DNS resolves a domain name to an IP address?
  • Do you know the difference between Layer 4 (TCP) and Layer 7 (HTTP) load balancing?
  • Have you ever configured a reverse proxy (even simple Nginx)?
  • Can you explain what “stateless” means in the context of HTTP?
  • Do you understand the basic structure of a JWT?

If you answered “no” to more than 3 questions, consider reviewing:

  • “Computer Networks” by Tanenbaum — Chapters 1, 5, 6 (Networking fundamentals)
  • “HTTP: The Definitive Guide” — Chapters 1-4 (HTTP protocol)
  • “Building Microservices” by Sam Newman — Chapter 4 (Integration patterns)

Development Environment Setup

Required Tools:

  1. Docker (version 24.0+)
    docker --version
    # You'll run Envoy, Kong, and backend services in containers
    
  2. Docker Compose (version 2.0+)
    docker-compose --version
    # For multi-container orchestration
    
  3. curl or httpie
    curl --version
    # For testing API endpoints
    
  4. Text Editor/IDE with YAML support
    • VSCode with YAML extension
    • vim/neovim with syntax highlighting

Recommended Tools:

  • jq (for parsing JSON responses)
  • openssl (for certificate generation and inspection)
  • grpcurl (for testing gRPC endpoints in Project 7)
  • Postman or Insomnia (for complex API testing)
  • Wireshark or tcpdump (for packet inspection)

Time Investment

Realistic Timeline:

  • Project 1-2 (Basics): 2-3 days each
  • Project 3-6 (Intermediate): 1-2 weeks each
  • Project 7-10 (Advanced): 2-4 weeks each

Total Time to Mastery: 3-6 months of consistent practice (10-15 hours/week)

Important Reality Check

API Gateway Engineering is hard because:

  1. Debugging is non-obvious - Traffic doesn’t flow the way you think it does
  2. Configuration is YAML hell - One indentation error breaks everything
  3. Documentation is scattered - Envoy docs assume deep networking knowledge
  4. Failure modes are subtle - “Why is my circuit breaker not triggering?”

But it’s worth it because:

  • High demand, low supply - Few engineers deeply understand this layer
  • Central to cloud-native - Every modern architecture needs this skill
  • Excellent compensation - Senior API Gateway engineers command $150k-$250k+
  • Transferable knowledge - Concepts apply to Istio, Linkerd, AWS API Gateway, etc.

Quick Start Guide (For the Overwhelmed)

If you’re feeling lost, start here (First 48 hours):

Day 1: The Fundamentals (4 hours)

Morning (2 hours):

  1. Read “Building Microservices” Ch. 4 (The ‘Why’ of Gateways)
  2. Watch: “Envoy Internals Deep Dive” by Matt Klein (Lyft) on YouTube
  3. Install Docker and Docker Compose

Afternoon (2 hours):

  1. Start Project 1: The Naked Proxy
  2. Follow the Envoy quickstart to get something running
  3. Send your first HTTP request through Envoy

Success Criteria: You have Envoy running and routing traffic to a backend

Day 2: Your First Real Configuration (4 hours)

Morning (2 hours):

  1. Read Envoy docs on “Listeners” and “Clusters”
  2. Modify your Project 1 config to route two different paths
  3. Break it intentionally and learn to read Envoy error messages

Afternoon (2 hours):

  1. Add basic observability (Envoy admin endpoint)
  2. Inspect the /stats endpoint to see metrics
  3. Trigger a 503 error by stopping a backend and observe the behavior

Success Criteria: You understand how Envoy decides where to send traffic

Week 1: Security Foundations

Move to Project 2: Secure the Edge

  • Set up Kong Gateway
  • Configure HTTPS (generate self-signed certs)
  • Add JWT validation plugin

Week 2-3: Deep Dive

Choose your path:

  • Security Focus: Projects 2 → 9 → 10
  • Performance Focus: Projects 3 → 6 → 8
  • Extension Focus: Projects 4 → 5 → 7

Path 1: The SRE Track (Reliability & Observability)

Focus: Running gateways in production, handling failures, monitoring

Project Sequence:

  1. Project 1: The Naked Proxy → Understand the basics
  2. Project 6: The Watchman → Add observability
  3. Project 8: The Shadow Boxer → Safe deployments with traffic shadowing
  4. Project 10: The Service Mesh → mTLS for service-to-service security

Best for: Site Reliability Engineers, DevOps, Platform Engineers

Outcome: You can operate a production API gateway at scale with confidence


Path 2: The Security Engineer Track

Focus: Hardening the edge, preventing attacks, cryptographic protocols

Project Sequence:

  1. Project 1: The Naked Proxy → Basics
  2. Project 2: Secure the Edge → TLS and JWT
  3. Project 3: The Traffic Cop → Rate limiting (DDoS defense)
  4. Project 9: The Fortress → Web Application Firewall
  5. Project 10: The Service Mesh → mTLS and certificate management

Best for: Security Engineers, AppSec specialists

Outcome: You can architect and defend a zero-trust edge infrastructure


Path 3: The Platform Architect Track

Focus: Building extensible platforms, control planes, advanced routing

Project Sequence:

  1. Project 1: The Naked Proxy → Basics
  2. Project 4: The Shape-Shifter → WebAssembly extensions
  3. Project 5: The Master Architect → Build your own control plane
  4. Project 7: The Diplomat → gRPC transcoding

Best for: Backend architects, platform engineers, infrastructure teams

Outcome: You can build custom gateway platforms for your organization


Path 4: The Full-Stack Engineer Track (Quickest)

Focus: Enough knowledge to integrate with gateways, troubleshoot issues

Project Sequence:

  1. Project 1: The Naked Proxy → 2 days
  2. Project 2: Secure the Edge → 1 week
  3. Project 6: The Watchman → 3 days

Best for: Backend developers who need to understand the edge but aren’t specializing

Outcome: You can effectively work with gateway teams and debug integration issues


Project List

Projects are ordered from fundamental proxying to advanced edge engineering.


Project 1: The “Naked” Proxy (Foundational Mechanics)

  • File: API_GATEWAY_ENGINEERING_MASTERY.md
  • Main Programming Language: YAML (Envoy Configuration)
  • Alternative Programming Languages: Nginx Config, HCL
  • Coolness Level: Level 2: Practical but Forgettable
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 1: Beginner
  • Knowledge Area: Reverse Proxy / Infrastructure
  • Software or Tool: Envoy Proxy
  • Main Book: “Envoy Proxy Essentials” by Richard Johnson

What you’ll build: A static Envoy configuration that acts as a basic reverse proxy for two different backends (e.g., an echo service and a web service), handling path-based routing and simple load balancing.

Why it teaches API Gateway Engineering: This project strips away the “magic” of Control Planes. You have to manually define the Listener, Filter Chain, Clusters, and Endpoints. You’ll understand how Envoy maps a port to a routing table.

Core challenges you’ll face:

  • Defining the HTTP Connection Manager → maps to understanding how Envoy layers HTTP processing on top of a raw TCP connection
  • Configuring Path-based Routing → maps to learning how regex or prefix matching works at the edge
  • Cluster/Endpoint separation → maps to understanding the difference between a logical service (Cluster) and its physical instances (Endpoints)

Key Concepts:

  • Static Configuration: Envoy Docs - Static Config
  • Listeners & Clusters: “Envoy Proxy Essentials” Ch. 2

Difficulty: Beginner Time estimate: Weekend Prerequisites: Basic understanding of Docker (to run Envoy/Backends)


Real World Outcome

You will have a single binary (Envoy) running in a container that accepts traffic on port 8080 and correctly routes it to internal services based on the URL path.

Example Output:

# Request to the echo service
$ curl localhost:8080/service/echo
{"message": "Hello from echo service!", "path": "/service/echo"}

# Request to the static web service
$ curl localhost:8080/service/web
<html><body>Welcome to the Web Service</body></html>

# Request to an undefined path
$ curl -I localhost:8080/unknown
HTTP/1.1 404 Not Found

The Core Question You’re Answering

“How does a single binary decide where thousands of different requests should go without getting confused?”

Before you write any code, sit with this question. Imagine a post office with one door but a million PO boxes. How is the mail sorted at high speed? The “sorting” logic is the core of the gateway.


Concepts You Must Understand First

Stop and research these before coding:

  1. Reverse Proxy vs. Forward Proxy
    • Who is the proxy “acting” for? The client or the server?
    • Book Reference: “Building Microservices” Ch. 4 - Sam Newman
  2. The 7-Layer OSI Model
    • Why do gateways usually operate at Layer 4 (TCP) or Layer 7 (HTTP)?
    • Book Reference: “Computer Networks” Ch. 1 - Tanenbaum

Questions to Guide Your Design

Before implementing, think through these:

  1. Routing Logic
    • What happens if a request matches two different path prefixes? (e.g., /api and /api/v1)
    • How does Envoy handle the order of rules?
  2. Upstream Health
    • If a backend service is down, what should Envoy return to the user?
    • How does it know the service is down without a health check?

Thinking Exercise

The Static Route Trace

Look at this Envoy snippet:

routes:
  - match: { prefix: "/static" }
    route: { cluster: "web_service" }

Questions while tracing:

  • If I request /static/images/logo.png, what is the “path” sent to the web_service?
  • Does Envoy strip the /static prefix by default? Should it?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What is the difference between a Listener and a Cluster in Envoy?”
  2. “How does Envoy’s threading model handle thousands of concurrent connections?”
  3. “Explain the ‘Hot Restart’ feature in Envoy.”
  4. “Why would you choose Envoy over Nginx for a cloud-native environment?”
  5. “What is a ‘sidecar’ proxy and why is it useful?”

Hints in Layers

Hint 1: The Structure Your YAML file has two top-level sections: static_resources (which contains your listeners and clusters) and admin.

Hint 2: The Filter To handle HTTP traffic, you must use the envoy.filters.network.http_connection_manager network filter.

Hint 3: Routing The route_config inside the HTTP Connection Manager is where you map virtual_hosts to clusters.

Hint 4: Debugging Run Envoy with -l debug to see exactly how it’s parsing your YAML and matching your requests.
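
Putting the hints together, a minimal static configuration sketch might look like the following. The cluster names, container hostnames (echo, web), and ports are placeholders for whatever backends you actually run; treat this as a starting skeleton, not a verified config for your Envoy version.

static_resources:
  listeners:
    - name: main_listener
      address:
        socket_address: { address: 0.0.0.0, port_value: 8080 }
      filter_chains:
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                http_filters:
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
                route_config:
                  virtual_hosts:
                    - name: all_hosts
                      domains: ["*"]
                      routes:
                        - match: { prefix: "/service/echo" }
                          route: { cluster: echo_service }
                        - match: { prefix: "/service/web" }
                          route: { cluster: web_service }
  clusters:
    - name: echo_service
      type: STRICT_DNS
      load_assignment:
        cluster_name: echo_service
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address: { address: echo, port_value: 8080 }
    - name: web_service
      type: STRICT_DNS
      load_assignment:
        cluster_name: web_service
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address: { address: web, port_value: 80 }

admin:
  address:
    socket_address: { address: 0.0.0.0, port_value: 9901 }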


Books That Will Help

  • Proxy Architecture: “Building Microservices” by Sam Newman, Ch. 4
  • Envoy Configuration: “Envoy Proxy Essentials” by Richard Johnson, Ch. 2
  • Protocol Basics: “Computer Networks” by Andrew Tanenbaum, Ch. 5

Project 2: Secure the Edge (Authentication & TLS)

  • File: API_GATEWAY_ENGINEERING_MASTERY.md
  • Main Programming Language: Lua (for Kong configuration/declarative mode)
  • Alternative Programming Languages: Go (Kong plugins), YAML
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Security / Identity
  • Software or Tool: Kong API Gateway
  • Main Book: “Kong API Gateway Essentials” by Richard Johnson

What you’ll build: A Kong gateway deployment that enforces HTTPS (TLS) and validates JWT (JSON Web Tokens) at the entry point. Requests without a valid token are rejected before they ever hit your internal services.

Why it teaches API Gateway Engineering: This teaches “Edge Security.” You’ll learn that the gateway is the perfect place to offload expensive cryptographic operations (like JWT verification) so your backends remain lightweight.


Real World Outcome

An API that is inaccessible unless you present a cryptographically signed token. Your backend service doesn’t even know what a JWT is; it just receives clean, authenticated requests.

Example Output:

# Attempt without token
$ curl -k https://localhost:8443/secure-api
HTTP/1.1 401 Unauthorized
{"message":"Unauthorized"}

# Attempt with invalid token
$ curl -k -H "Authorization: Bearer bad-token" https://localhost:8443/secure-api
HTTP/1.1 401 Unauthorized

# Attempt with valid token
$ curl -k -H "Authorization: Bearer <valid_jwt>" https://localhost:8443/secure-api
HTTP/1.1 200 OK
{"data": "Protected data accessible!"}

The Core Question You’re Answering

“If my gateway handles authentication, how do my internal services trust that the request is actually safe?”

Before building this, consider: In a zero-trust architecture, the gateway is the trust boundary. Everything beyond it (your internal services) assumes the request has been validated. If the gateway says “this user is authenticated,” the backend services believe it. This is why gateway security engineering is mission-critical.


Concepts You Must Understand First

Stop and research these before coding:

  1. TLS Handshake Process
    • What happens during the CLIENT_HELLO and SERVER_HELLO exchange?
    • Why does TLS termination at the gateway improve backend performance?
    • What’s the difference between TLS 1.2 and TLS 1.3?
    • Book Reference: “Serious Cryptography” Ch. 13 - Jean-Philippe Aumasson
  2. JWT Structure & Validation
    • What are the three parts of a JWT (header, payload, signature)?
    • How does asymmetric signing (RS256) differ from symmetric (HS256)?
    • Why should you never accept unsigned JWTs (alg: "none" attack)?
    • Book Reference: “Foundations of Information Security” Ch. 8 - Jason Andress
  3. Certificate Management
    • What is a Certificate Authority (CA)?
    • How do you generate a self-signed certificate for development?
    • What’s the difference between a certificate and a private key?
    • Book Reference: “Serious Cryptography” Ch. 12 - Jean-Philippe Aumasson
  4. OAuth 2.0 Flow (Conceptual)
    • What’s the difference between authentication and authorization?
    • Why use JWT as an access token instead of opaque tokens?
    • Book Reference: “Building Microservices” Ch. 9 - Sam Newman

Questions to Guide Your Design

Before implementing, think through these:

  1. Certificate Strategy
    • Will you use self-signed certificates (dev) or Let’s Encrypt (production)?
    • Where should the private key be stored? (Kubernetes secrets? Vault?)
    • How will you handle certificate rotation without downtime?
  2. JWT Validation
    • Where is the JWT signature verification key stored?
    • What happens if the JWT is expired but the signature is valid?
    • Should you validate iss (issuer) and aud (audience) claims?
  3. Error Handling
    • What HTTP status code should you return for an expired JWT? (401 vs 403)
    • Should you log failed authentication attempts for security monitoring?
    • How do you prevent timing attacks during JWT validation?

Thinking Exercise

The JWT Validation Trace

Before implementing, analyze this JWT payload:

{
  "sub": "user123",
  "iss": "https://auth.myapp.com",
  "aud": "api.myapp.com",
  "exp": 1735689600,
  "iat": 1735686000,
  "scopes": ["read:orders", "write:orders"]
}

Questions while analyzing:

  • If the current timestamp is 1735690000, should this token be accepted?
  • What should the gateway check before validating the signature?
  • If the aud claim is wrong-api.com, what should happen?
  • How can you verify this JWT was signed by the correct authority?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Explain the difference between authentication and authorization in the context of an API gateway.”
  2. “How does JWT signature validation work? What algorithm would you choose and why?”
  3. “What’s the security risk of accepting JWTs with alg: none?” (source)
  4. “How would you implement mTLS between the gateway and backend services?” (source)
  5. “Describe the TLS handshake process. At what point does the gateway decrypt traffic?”
  6. “What’s the difference between TLS termination and TLS passthrough?” (source)

Hints in Layers

Hint 1: The Architecture You’ll need three components: (1) Kong Gateway, (2) A backend service (can be a simple HTTP echo server), (3) A JWT issuer (you can use jwt.io to manually create tokens for testing).

Hint 2: Certificate Generation For development, generate a self-signed cert with OpenSSL:

openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes

Then configure Kong to use these files for TLS.

Hint 3: Kong JWT Plugin Kong has a built-in jwt plugin. You’ll configure it with the public key used to verify signatures. The plugin intercepts requests, validates the JWT in the Authorization: Bearer <token> header, and either passes the request through or returns 401.
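
If you run Kong in declarative (DB-less) mode, a hedged kong.yml sketch for this setup could look roughly like the following; the upstream URL, route path, and consumer values are placeholders, and the RSA public key is whichever key pair you generated.

_format_version: "3.0"

services:
  - name: secure-api
    url: http://backend:3000            # placeholder upstream
    routes:
      - name: secure-route
        paths: ["/secure-api"]
        plugins:
          - name: jwt                   # Kong's built-in JWT plugin
            config:
              claims_to_verify: ["exp"] # reject expired tokens

consumers:
  - username: demo-user
    jwt_secrets:
      - key: "https://auth.myapp.com"   # must match the token's iss claim by default
        algorithm: RS256
        rsa_public_key: |
          -----BEGIN PUBLIC KEY-----
          (your public key here)
          -----END PUBLIC KEY-----

TLS certificates for the HTTPS listener can be declared in a top-level certificates: block or managed however your Kong deployment handles certs.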

Hint 4: Testing Create a valid JWT at https://jwt.io with your signing key, then:

# Without token (should fail)
curl -k https://localhost:8443/api

# With valid token (should succeed)
curl -k -H "Authorization: Bearer <your_jwt>" https://localhost:8443/api

Books That Will Help

  • TLS/SSL Internals: “Serious Cryptography” by Jean-Philippe Aumasson, Ch. 12-13
  • JWT Security: “Foundations of Information Security” by Jason Andress, Ch. 8
  • API Security Patterns: “Building Microservices” by Sam Newman, Ch. 9
  • Kong Configuration: “Kong API Gateway Essentials” by Richard Johnson, Ch. 3-4
  • OAuth 2.0: “Building Microservices” by Sam Newman, Ch. 9

Common Pitfalls & Debugging

Problem 1: “Kong returns 400 Bad Request when I send a JWT”

  • Why: The JWT plugin is configured but the token format is incorrect (missing Bearer prefix, or the token has extra whitespace)
  • Fix: Ensure your Authorization header is exactly: Authorization: Bearer <token> with a single space
  • Quick test: echo "$TOKEN" | wc -c to check for hidden newlines

Problem 2: “Certificate error: unable to get local issuer certificate”

  • Why: Self-signed certificates aren’t trusted by default. Your client (curl) is validating the cert against system CAs
  • Fix: Use curl -k (insecure mode) for development, or add your self-signed CA to the system trust store
  • Quick test: openssl s_client -connect localhost:8443 to see the cert chain

Problem 3: “JWT validation fails with ‘signature verification failed’”

  • Why: The signing key used to create the JWT doesn’t match the public key configured in Kong
  • Fix: Verify that the key your JWT references (the iss claim by default, or the kid header if configured) matches a JWT credential configured in Kong’s JWT plugin
  • Quick test: Decode the JWT at jwt.io and verify the signature section

Problem 4: “Kong accepts expired JWTs”

  • Why: JWT exp claim validation isn’t enabled, or clocks are out of sync
  • Fix: Enable config.claims_to_verify=exp in Kong’s JWT plugin config
  • Quick test: Create a JWT with exp set to a past timestamp and verify it’s rejected

Problem 5: “TLS handshake timeout or connection reset”

  • Why: Firewall blocking port 8443, or Kong isn’t listening on HTTPS
  • Fix: Check docker-compose ps to verify Kong’s port mapping includes 8443:8443
  • Quick test: netstat -an | grep 8443 to verify the port is listening

Project 3: The Traffic Cop (Custom Rate Limiter)

  • File: API_GATEWAY_ENGINEERING_MASTERY.md
  • Main Programming Language: Lua
  • Alternative Programming Languages: Go, Python
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 2. The “Micro-SaaS / Pro Tool”
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Traffic Engineering / Algorithms
  • Software or Tool: Kong / Redis
  • Main Book: “Kong API Gateway Essentials” by Richard Johnson

What you’ll build: A custom Kong plugin in Lua that implements a “Tiered Rate Limiter” using Redis as a backend. Users are limited based on their API Key: Free users (5 req/min), Pro users (100 req/min).

Why it teaches API Gateway Engineering: Rate limiting is the most requested feature in production gateways. You’ll learn the Token Bucket algorithm, how to use Redis for distributed rate limiting (so multiple gateway instances share state), and how to write production-grade Lua code that runs in the critical request path.

Core challenges you’ll face:

  • Implementing the Token Bucket algorithm → maps to understanding time-based rate limiting vs simple counters
  • Redis integration in Lua → maps to learning how to make external calls without blocking the event loop
  • Tiered logic based on API key → maps to understanding how to look up user metadata during request processing
  • Testing edge cases → maps to handling clock skew, Redis failures, and burst traffic

Key Concepts:

  • Token Bucket Algorithm: “Grokking Algorithms” Ch. 10 - Aditya Bhargava
  • Lua in OpenResty: “Kong API Gateway Essentials” Ch. 5 - Richard Johnson
  • Distributed Systems: “Designing Data-Intensive Applications” Ch. 5 - Martin Kleppmann

Difficulty: Advanced Time estimate: 1-2 Weeks Prerequisites:

  • Completed Projects 1-2 (understanding of Kong plugin architecture)
  • Basic Lua syntax (Kong’s plugin system uses Lua)
  • Redis fundamentals (key expiration, atomic operations)
  • Understanding of rate limiting concepts

Real World Outcome

You’ll have a working rate limiter that returns HTTP 429 Too Many Requests when users exceed their quota, with a Retry-After header telling them when they can try again.

Example Output:

# Free user (5 req/min limit)
$ curl -H "X-API-Key: free-user-123" http://localhost:8000/api
{"data": "Success!"}
# Headers: X-RateLimit-Limit: 5, X-RateLimit-Remaining: 4

$ # Send 5 more requests rapidly...
$ for i in {1..5}; do curl -H "X-API-Key: free-user-123" http://localhost:8000/api; done

# 6th request within the minute
$ curl -i -H "X-API-Key: free-user-123" http://localhost:8000/api
HTTP/1.1 429 Too Many Requests
Retry-After: 42
X-RateLimit-Limit: 5
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1735690042

{"message": "API rate limit exceeded"}

# Pro user (100 req/min limit) - different API key, different tier
$ curl -H "X-API-Key: pro-user-456" http://localhost:8000/api
{"data": "Success!"}
# Headers: X-RateLimit-Limit: 100, X-RateLimit-Remaining: 99

You’re seeing the same behavior as Stripe, GitHub, and Twitter APIs - this is production-grade rate limiting.


The Core Question You’re Answering

“How do you fairly limit thousands of users hitting your API without a database query on every request?”

The naive approach is to track request counts in a database. But querying a DB on every API call destroys performance. The solution: in-memory counters in Redis with atomic increment operations. This project teaches you how to build rate limiting that scales to millions of requests per second.


Concepts You Must Understand First

Stop and research these before coding:

  1. Token Bucket Algorithm
    • How does a bucket “refill” at a constant rate?
    • What’s the difference between “rate” and “burst” limits?
    • Why is Token Bucket better than a simple counter?
    • Book Reference: “Grokking Algorithms” Ch. 10 - Aditya Bhargava
  2. Redis Atomic Operations
    • What is INCR and why is it atomic?
    • How does EXPIRE work with existing keys?
    • What happens if Redis becomes unavailable? (Fail open vs fail closed)
    • Book Reference: “Designing Data-Intensive Applications” Ch. 5 - Martin Kleppmann
  3. OpenResty Lua Execution Context
    • What is the access phase in Kong’s request lifecycle?
    • Why must you use ngx.timer.at for async operations?
    • How do you make non-blocking calls to Redis?
    • Book Reference: “Kong API Gateway Essentials” Ch. 5 - Richard Johnson
  4. HTTP 429 Semantics
    • What headers should you include with a 429 response?
    • What’s the difference between X-RateLimit-* headers and Retry-After?
    • Book Reference: “HTTP: The Definitive Guide” Ch. 3 - Gourley & Totty

Questions to Guide Your Design

Before implementing, think through these:

  1. Algorithm Choice
    • Token Bucket vs Leaky Bucket vs Fixed Window vs Sliding Window?
    • Should you allow bursts above the rate limit? (100 req/min = burst of 100, then throttle?)
    • How do you handle sub-second granularity? (e.g., 1000 req/sec)
  2. Storage Strategy
    • Redis key naming: ratelimit:{api_key}:{minute} or ratelimit:{api_key}?
    • Should you use a single Redis key with fields, or separate keys?
    • How long should keys expire? (TTL strategy)
  3. Failure Modes
    • If Redis is down, do you fail open (allow all requests) or fail closed (reject all)?
    • What if Redis latency spikes to 500ms? Do you timeout and allow the request?
    • How do you prevent cache stampede when limits reset?
  4. User Experience
    • Should you return the same 429 response whether a user is 1 request or 100 requests over?
    • Do you want a “warning” header when users are at 80% of their quota?

Thinking Exercise

The Token Bucket Simulation

Before coding, trace this scenario by hand:

User: free-user-123 (limit: 5 requests per minute)
Redis key: ratelimit:free-user-123
Algorithm: Token Bucket with refill

Time: 10:00:00 → Request 1
  Tokens in bucket: 5
  After request: 4 tokens remain
  Response: 200 OK

Time: 10:00:10 → Request 2, 3, 4, 5, 6 (burst)
  Tokens: 4, 3, 2, 1, 0
  6th request: 0 tokens left
  Response: 429 Too Many Requests

Time: 10:00:12 → Request 7 (12 seconds elapsed)
  Tokens refilled: 0 + (12 seconds / 12 seconds per token) = 1 token
  After request: 0 tokens
  Response: 200 OK

Questions:

  • How did you calculate the refill rate? (5 tokens per 60 seconds = 1 token every 12 seconds)
  • What happens if the user waits 2 minutes before the next request?
  • How would you implement this refill logic in Lua + Redis?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “Explain the difference between the Token Bucket and Leaky Bucket algorithms. When would you choose one over the other?”
  2. “How would you implement distributed rate limiting across multiple API gateway instances?” (source)
  3. “What happens to rate limits if Redis goes down? How do you ensure availability?”
  4. “Describe how you’d implement a ‘sliding window’ rate limiter.”
  5. “What’s the security risk of rate limiting by IP address vs. API key?” (source)
  6. “How would you prevent a malicious user from exhausting another user’s rate limit?”

Hints in Layers

Hint 1: The Architecture Your Lua plugin will hook into Kong’s access phase. On each request: (1) Extract the API key from headers, (2) Look up the user’s tier (free/pro) from a config table, (3) Check/update Redis counter, (4) Allow or reject the request.
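
For orientation only, the configuration such a plugin might consume could look like this; the plugin name tiered-rate-limiting and its fields are hypothetical, something you would define yourself in the plugin's schema.lua.

plugins:
  - name: tiered-rate-limiting     # hypothetical custom plugin, enabled globally or per-route
    config:
      redis_host: redis            # shared Redis so every Kong node sees the same counters
      redis_port: 6379
      window_seconds: 60
      tiers:                       # requests allowed per window, keyed by tier
        free: 5
        pro: 100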

Hint 2: Redis Operation Use a Lua script (via EVAL) to make the increment and expiry atomic:

local current = redis.call('INCR', KEYS[1])
if current == 1 then
  redis.call('EXPIRE', KEYS[1], 60)  -- 60 seconds
end
return current

Hint 3: Kong Plugin Skeleton Your plugin file (handler.lua) will have:

function plugin:access(conf)
  local api_key = kong.request.get_header("X-API-Key")
  local limit = get_user_limit(api_key)  -- 5 or 100
  local count = increment_redis_counter(api_key)

  if count > limit then
    return kong.response.exit(429, {message = "Rate limit exceeded"})
  end

  -- Add headers
  kong.response.set_header("X-RateLimit-Limit", limit)
  kong.response.set_header("X-RateLimit-Remaining", limit - count)
end

Hint 4: Testing Use Apache Bench or hey to send 100 requests rapidly:

ab -n 100 -c 10 -H "X-API-Key: free-user-123" http://localhost:8000/api
# Should see 95 requests get 429 responses

Books That Will Help

  • Token Bucket Algorithm: “Grokking Algorithms” by Aditya Bhargava, Ch. 10
  • Kong Plugin Development: “Kong API Gateway Essentials” by Richard Johnson, Ch. 5
  • Distributed Rate Limiting: “Designing Data-Intensive Applications” by Martin Kleppmann, Ch. 5
  • Redis Operations: “Designing Data-Intensive Applications” by Martin Kleppmann, Ch. 3
  • HTTP Standards: “HTTP: The Definitive Guide” by Gourley & Totty, Ch. 3

Common Pitfalls & Debugging

Problem 1: “Rate limit resets every second instead of every minute”

  • Why: You’re using EXPIRE incorrectly - it’s being set on every request instead of only on the first request in a window
  • Fix: Use the Lua script pattern above where EXPIRE only runs when current == 1
  • Quick test: redis-cli TTL ratelimit:yourkey - should show ~60 seconds, not resetting constantly

Problem 2: “Different gateway instances have different counts”

  • Why: You’re using local memory instead of Redis, or Redis connections aren’t shared
  • Fix: Ensure all Kong instances point to the same Redis server
  • Quick test: Send requests to different Kong IPs, check if the counter increments globally

Problem 3: “Redis connection timeout under load”

  • Why: Connection pool exhausted, or Redis is overwhelmed
  • Fix: Increase Redis connection pool size in Kong’s config, or use Redis pipelining
  • Quick test: Monitor redis-cli INFO stats for rejected_connections

Problem 4: “Users bypass rate limits by changing API keys”

  • Why: You’re only rate limiting by API key, not by IP or user ID
  • Fix: Implement multi-level rate limiting (by IP AND by API key)
  • Quick test: Security audit - can a user register 100 free accounts and bypass limits?

Problem 5: “Race condition: two requests at exactly the same time both pass the limit”

  • Why: Non-atomic check-then-increment operation
  • Fix: Use Redis INCR which is atomic, not GET followed by SET
  • Quick test: Send 1000 concurrent requests, verify the count is exactly 1000, not 999 or 1001

Project 4: The Shape-Shifter (Wasm-based Body Transformation)

  • File: API_GATEWAY_ENGINEERING_MASTERY.md
  • Main Programming Language: Rust (compiling to WebAssembly)
  • Alternative Programming Languages: C++, Go (TinyGo)
  • Coolness Level: Level 5: Pure Magic (Super Cool)
  • Business Potential: 5. The “Industry Disruptor”
  • Difficulty: Level 4: Expert
  • Knowledge Area: Edge Computing / WebAssembly
  • Software or Tool: Envoy Proxy / Proxy-Wasm
  • Main Book: “Envoy Proxy Essentials” by Richard Johnson

What you’ll build: A WebAssembly (Wasm) filter for Envoy that intercepts a JSON response from a legacy backend and “transcodes” it into a modern format or masks sensitive PII (Personally Identifiable Information) before it reaches the client.

Why it teaches API Gateway Engineering: This is the future of gateway extensibility. WebAssembly lets you write custom logic in Rust/C++/Go and compile it to a portable binary that runs inside Envoy at near-native speed. Companies like Google, Stripe, and Lyft use Wasm filters to implement custom transformations without modifying Envoy’s C++ codebase (source).

Core challenges you’ll face:

  • Understanding the Proxy-Wasm ABI → maps to learning how Wasm modules communicate with the host proxy
  • Compiling Rust to Wasm32-wasi target → maps to understanding WebAssembly compilation toolchains
  • Manipulating HTTP bodies in streaming fashion → maps to handling large responses without buffering everything in memory
  • Testing Wasm modules → maps to debugging compiled binaries without traditional debugging tools

Key Concepts:

  • WebAssembly Fundamentals: Web resources on Wasm architecture
  • Proxy-Wasm Specification: “What is Proxy-Wasm” - Kong Blog (source)
  • Rust Wasm Compilation: “The Rust Programming Language” Ch. 20 - Klabnik & Nichols
  • Envoy Filter Chain: “Envoy Proxy Essentials” Ch. 2 - Richard Johnson

Difficulty: Expert Time estimate: 2 Weeks Prerequisites:

  • Completed Project 1 (Envoy architecture understanding)
  • Intermediate Rust (ownership, traits, error handling)
  • Basic understanding of WebAssembly concepts
  • Familiarity with JSON parsing and transformation

Real World Outcome

You’ll have a Wasm filter that transforms legacy API responses in real-time. Imagine a backend returns this:

{
  "user_id": "12345",
  "ssn": "123-45-6789",
  "credit_card": "4111-1111-1111-1111",
  "name": "John Doe"
}

Your filter intercepts it and the client receives:

{
  "user_id": "12345",
  "ssn": "***-**-6789",
  "credit_card": "****-****-****-1111",
  "name": "John Doe"
}

Example Output:

# Without the Wasm filter
$ curl http://localhost:8080/user/12345
{"user_id":"12345","ssn":"123-45-6789","credit_card":"4111-1111-1111-1111","name":"John Doe"}

# With the Wasm filter loaded in Envoy
$ curl http://localhost:8080/user/12345
{"user_id":"12345","ssn":"***-**-6789","credit_card":"****-****-****-1111","name":"John Doe"}

# Envoy logs show the filter was executed
[2024-12-28 10:15:32] [wasm] [info] PII masking filter: masked 2 fields (ssn, credit_card)

This is production-grade data protection - the backend doesn’t need to change, and you’ve added compliance (GDPR, PCI-DSS) at the edge.


The Core Question You’re Answering

“How do you safely extend a high-performance C++ proxy without recompiling or risking crashes?”

Envoy is written in C++. Traditionally, adding custom logic meant forking Envoy, writing C++, and maintaining a custom build. WebAssembly changes this: you write Rust (memory-safe), compile to Wasm, and Envoy loads it as a sandboxed module. If your Wasm code crashes, it doesn’t bring down the entire proxy.


Concepts You Must Understand First

Stop and research these before coding:

  1. WebAssembly Architecture
    • What is the Wasm bytecode format?
    • Why is Wasm sandboxed? (No direct memory access outside the module)
    • What’s the difference between wasm32-unknown-unknown and wasm32-wasi?
    • Resource: “WebAssembly: The Definitive Guide” or MDN Web Docs
  2. Proxy-Wasm Specification
    • What is the ABI (Application Binary Interface) that Wasm modules use?
    • What are the lifecycle callbacks? (on_http_request_headers, on_http_response_body)
    • How does host communication work? (importing host functions)
    • Resource: Proxy-Wasm spec on GitHub (source)
  3. Rust Ownership & Borrowing
    • Why does Rust prevent memory leaks and data races?
    • What is &str vs String in the context of Wasm?
    • How do you handle errors without panicking (which crashes Wasm)?
    • Book Reference: “The Rust Programming Language” Ch. 4, 9 - Klabnik & Nichols
  4. HTTP Body Streaming
    • Why can’t you always buffer the entire response body?
    • What happens if the response is 1GB? (Memory exhaustion)
    • How do chunked transfer encodings work?
    • Book Reference: “HTTP: The Definitive Guide” Ch. 3, 15 - Gourley & Totty

Questions to Guide Your Design

Before implementing, think through these:

  1. Safety vs. Performance
    • Should you parse JSON in every chunk of the response? (Expensive)
    • Or buffer the entire body, parse once, then transform? (Risky for large responses)
    • What’s the maximum response size you’ll support?
  2. PII Detection Strategy
    • Hardcode field names (ssn, credit_card) or use regex patterns?
    • What about nested JSON? (user.profile.ssn)
    • Should you log which fields were masked for audit purposes?
  3. Error Handling
    • If JSON parsing fails (malformed response), do you pass it through unchanged or return 500?
    • If the Wasm filter panics, does Envoy crash or just disable the filter?
  4. Testing & Observability
    • How do you debug a Wasm module? (No println!, no debugger)
    • Should you emit metrics (number of fields masked) via Envoy stats?

Thinking Exercise

The Wasm Callback Flow

Before coding, trace the execution flow when a client requests /user/123:

1. Client → Envoy: GET /user/123
2. Envoy → Upstream: GET /user/123
3. Upstream → Envoy: 200 OK + JSON body (1024 bytes)
4. Envoy → Wasm filter: on_http_response_headers(...)
   - Filter sees: Content-Type: application/json
   - Filter decision: "I need to transform this"
5. Envoy → Wasm filter: on_http_response_body(chunk1, is_final=false)
   - Filter sees: First 512 bytes of JSON
   - Filter decision: "Buffer this, not complete yet"
6. Envoy → Wasm filter: on_http_response_body(chunk2, is_final=true)
   - Filter sees: Last 512 bytes
   - Filter action: Parse full JSON, mask PII, return modified body
7. Envoy → Client: 200 OK + Modified JSON

Questions:

  • What happens if the response is sent in 10 chunks? Do you buffer all of them?
  • How do you know when you have the complete JSON? (Look for is_final=true)
  • What if the Content-Type is text/html? Should the filter run?

The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What is WebAssembly and why is it useful for API gateways?” (source)
  2. “Explain the security benefits of running custom logic in Wasm vs native code.”
  3. “How would you debug a Wasm filter that’s crashing in production?”
  4. “What’s the performance overhead of Wasm compared to native C++ filters?” (source)
  5. “Describe the Proxy-Wasm ABI and how it enables cross-proxy compatibility.” (source)
  6. “How would you handle a 1GB response body in a streaming Wasm filter?”

Hints in Layers

Hint 1: The Rust SDK Use the proxy-wasm crate. Your filter is a struct that implements traits:

use proxy_wasm::traits::*;
use proxy_wasm::types::*;

struct PiiMaskingFilter;

impl HttpContext for PiiMaskingFilter {
    fn on_http_response_headers(&mut self, ...) -> Action { ... }
    fn on_http_response_body(&mut self, body_size: usize, end_of_stream: bool) -> Action { ... }
}

Hint 2: Compiling to Wasm Build with the correct target:

cargo build --target wasm32-wasi --release
# Output: target/wasm32-wasi/release/pii_filter.wasm

Hint 3: Loading in Envoy Add to your Envoy config:

http_filters:
  - name: envoy.filters.http.wasm
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
      config:
        name: "pii_masking"
        vm_config:
          runtime: "envoy.wasm.runtime.v8"
          code:
            local:
              filename: "/etc/envoy/pii_filter.wasm"

Hint 4: Testing Create a mock backend that returns JSON with PII:

# Backend returns raw data
curl http://localhost:9090/user/123
{"ssn":"123-45-6789"}

# Through Envoy with Wasm filter
curl http://localhost:8080/user/123
{"ssn":"***-**-6789"}

Books That Will Help

  • Rust Basics: “The Rust Programming Language” by Klabnik & Nichols, Ch. 1-10
  • Rust Wasm: “The Rust Programming Language” by Klabnik & Nichols, Ch. 20
  • Envoy Architecture: “Envoy Proxy Essentials” by Richard Johnson, Ch. 2
  • HTTP Streaming: “HTTP: The Definitive Guide” by Gourley & Totty, Ch. 3, 15
  • Rust Error Handling: “Rust for Rustaceans” by Jon Gjengset, Ch. 3

Common Pitfalls & Debugging

Problem 1: “Wasm module fails to load with ‘unknown import’”

  • Why: Your Rust code is trying to use a function not provided by the Proxy-Wasm host ABI
  • Fix: Ensure you’re only using proxy-wasm crate APIs, not std library functions like println!
  • Quick test: wasm-objdump -x pii_filter.wasm | grep import to see what imports are required

Problem 2: “Filter crashes with ‘out of bounds memory access’”

  • Why: Buffer overflow or accessing dropped memory (common in Rust Wasm without proper ownership)
  • Fix: Use cargo clippy to catch memory safety issues, review all buffer slicing code
  • Quick test: Add bounds checks before all array/slice access

Problem 3: “JSON parsing works locally but fails in Envoy”

  • Why: Response body is chunked, and you’re trying to parse incomplete JSON
  • Fix: Buffer all chunks until end_of_stream=true, then parse the complete string
  • Quick test: Add logging (via log::info!) to see chunk sizes

Problem 4: “Performance degrades with large responses”

  • Why: Buffering the entire multi-megabyte body in Wasm memory
  • Fix: Implement streaming JSON parsing (difficult) or set a maximum body size and pass through if exceeded
  • Quick test: Send a 10MB response, monitor Envoy’s memory usage with top

Problem 5: “Wasm compilation succeeds but Envoy rejects the module”

  • Why: Wrong Wasm target (wasm32-unknown-unknown instead of wasm32-wasi)
  • Fix: Rebuild with --target wasm32-wasi and ensure the ABI version matches Envoy’s Wasm runtime
  • Quick test: file pii_filter.wasm should show “WebAssembly (wasm) binary module”

Project 5: The Master Architect (Building an xDS Control Plane)

  • File: API_GATEWAY_ENGINEERING_MASTERY.md
  • Main Programming Language: Go
  • Alternative Programming Languages: Python, Java
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 5. The “Industry Disruptor”
  • Difficulty: Level 5: Master
  • Knowledge Area: Distributed Systems / Control Planes
  • Software or Tool: gRPC / Envoy xDS API
  • Main Book: “Envoy Proxy Essentials” by Richard Johnson

What you’ll build: A minimal “Control Plane” server in Go that implements the Envoy xDS protocol (specifically LDS and CDS). It will watch a simple database or file and push configuration updates to a fleet of Envoy proxies in real-time without restarting them.

Why it teaches API Gateway Engineering: This is the holy grail project - you’re building the brain behind Istio, AWS App Mesh, and Google Traffic Director. The xDS protocol is how control planes (Istio, Consul) dynamically configure thousands of Envoy instances without restarting them. Mastering this puts you in the top 1% of gateway engineers.

Real World Outcome:

# Start your control plane
$ go run control-plane.go
[2024-12-28 10:30:00] Control plane listening on :18000
[2024-12-28 10:30:05] Envoy node 'envoy-1' connected

# Start Envoy configured to connect to your control plane
$ envoy -c dynamic-config.yaml
[2024-12-28 10:30:05] xDS connection to localhost:18000 established

# Update service endpoints in your control plane's config file
$ echo '{"backend-v2": ["10.0.1.10:8080", "10.0.1.11:8080"]}' > services.json

# Control plane detects change and pushes update via xDS
[2024-12-28 10:30:10] Configuration changed, pushing CDS update to 1 node(s)
[2024-12-28 10:30:10] Sent cluster 'backend-v2' to envoy-1

# Envoy receives update and reconfigures WITHOUT restarting
[envoy] [info] cds: added/updated 1 cluster(s), removed 0 cluster(s)

# Traffic now routes to new backends immediately
$ curl http://localhost:8080/api
{"served_by": "backend-v2-instance-10.0.1.10"}

The Interview Questions They’ll Ask:

  1. “Explain the xDS protocol. What are LDS, CDS, EDS, and RDS?” (source)
  2. “How does Envoy perform zero-downtime configuration updates?”
  3. “What’s the difference between SotW (State of the World) and incremental xDS?”
  4. “How would you handle version conflicts when pushing config to 1000 Envoy instances?”

Key Concepts:

  • xDS Protocol: “Envoy Proxy Essentials” Ch. 4 - Richard Johnson
  • gRPC Streaming: “gRPC Up & Running” Ch. 2-3
  • Service Discovery: “Designing Data-Intensive Applications” Ch. 5 - Martin Kleppmann

Project 6: The Watchman (Observability Pipeline)

  • File: API_GATEWAY_ENGINEERING_MASTERY.md
  • Main Programming Language: Go/Python (for exporters)
  • Alternative Programming Languages: Rust
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: Observability / Monitoring
  • Software or Tool: Prometheus / Jaeger / OpenTelemetry
  • Main Book: “Cloud Native Patterns” by Cornelia Davis

What you’ll build: A gateway observability stack where Envoy/Kong exports metrics to Prometheus and distributed traces to Jaeger using OpenTelemetry. You’ll create a dashboard that shows the “Golden Signals” (Latency, Traffic, Errors, Saturation) at the edge.

Why it teaches API Gateway Engineering: “You can’t improve what you don’t measure.” This project teaches you how to instrument a gateway for production monitoring. You’ll learn the Golden Signals (Latency, Traffic, Errors, Saturation), distributed tracing with Jaeger, and how to correlate logs across microservices.

Real World Outcome:

# Start the observability stack
$ docker-compose up prometheus jaeger grafana envoy

# Envoy exposes metrics on the admin endpoint
$ curl localhost:9901/stats/prometheus
envoy_http_downstream_rq_total{envoy_http_conn_manager_prefix="ingress"} 1542
envoy_http_downstream_rq_xx{envoy_response_code_class="2xx"} 1450
envoy_http_downstream_rq_xx{envoy_response_code_class="5xx"} 92
envoy_cluster_upstream_rq_time_bucket{le="50"} 1200
envoy_cluster_upstream_rq_time_bucket{le="100"} 1430

# Jaeger shows distributed trace
$ curl localhost:8080/checkout
# Trace ID: 5f3a2b1c4d5e6f7a8b9c0d1e
# Span 1: [API Gateway] 120ms
# Span 2: [Auth Service] 45ms
# Span 3: [Payment Service] 320ms <- bottleneck identified!
# Total: 485ms

# Grafana dashboard shows Golden Signals in real-time
- Latency (p50, p95, p99): 120ms, 450ms, 890ms
- Traffic: 1542 req/min
- Errors: 5.9% (92 errors / 1542 requests)
- Saturation: CPU 45%, Memory 62%
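
The metrics side is mostly wiring: a minimal Prometheus scrape job for Envoy's admin endpoint might look like this (the envoy target name assumes the docker-compose service names above):

# prometheus.yml (fragment)
scrape_configs:
  - job_name: envoy
    metrics_path: /stats/prometheus   # Envoy exposes Prometheus-format stats on the admin port
    scrape_interval: 5s
    static_configs:
      - targets: ["envoy:9901"]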

The Interview Questions They’ll Ask:

  1. “What are the ‘Golden Signals’ of monitoring and why are they important?” (source)
  2. “Explain how distributed tracing works. What is a span? What is a trace?”
  3. “How would you debug high p99 latency at the API gateway?” (source)
  4. “What’s the difference between RED metrics and USE metrics?”

Key Concepts:

  • Observability: “Designing Data-Intensive Applications” Ch. 1 - Martin Kleppmann
  • Distributed Tracing: “Cloud Native Patterns” Ch. 8 - Cornelia Davis
  • Prometheus: Official Prometheus docs

Project 7: The Diplomat (gRPC-to-JSON Transcoding)

  • File: API_GATEWAY_ENGINEERING_MASTERY.md
  • Main Programming Language: Protobuf
  • Alternative Programming Languages: Go, Python
  • Coolness Level: Level 3: Genuinely Clever
  • Business Potential: 4. The “Open Core” Infrastructure
  • Difficulty: Level 3: Advanced
  • Knowledge Area: Protocols / API Translation
  • Software or Tool: Envoy Proxy / gRPC-JSON Transcoder
  • Main Book: “Envoy Proxy Essentials” by Richard Johnson

What you’ll build: Configure Envoy to accept JSON HTTP requests from web clients and automatically convert them to gRPC calls to your backend services, then convert the gRPC responses back to JSON - all without the backend knowing anything about HTTP/JSON.

Why it teaches API Gateway Engineering: Many modern backends use gRPC for efficiency (binary protocol, HTTP/2, streaming), but browsers and mobile apps speak JSON. This project teaches protocol translation at the edge - a critical skill for hybrid architectures.

Real World Outcome:

# Backend only speaks gRPC (Protobuf)
$ grpcurl -plaintext -d '{"user_id": "123"}' localhost:9090 user.UserService/GetUser
{
  "user_id": "123",
  "name": "Alice",
  "email": "alice@example.com"
}

# Web client calls a plain HTTP/JSON endpoint via Envoy (automatic transcoding)
$ curl http://localhost:8080/v1/users/123
{"user_id": "123", "name": "Alice", "email": "alice@example.com"}

# Envoy logs show the transcoding
[2024-12-28 11:00:00] [grpc-json] Transcoded GET /v1/users/123 → gRPC user.UserService/GetUser
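
The transcoding is driven by Envoy's grpc_json_transcoder HTTP filter, fed with a compiled protobuf descriptor. A hedged configuration fragment (descriptor path and service name are placeholders matching the example above):

http_filters:
  - name: envoy.filters.http.grpc_json_transcoder
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_json_transcoder.v3.GrpcJsonTranscoder
      proto_descriptor: "/etc/envoy/user_service.pb"   # built with protoc --descriptor_set_out --include_imports
      services: ["user.UserService"]                   # fully qualified gRPC service exposed as JSON
      print_options:
        always_print_primitive_fields: true
  - name: envoy.filters.http.router
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

The HTTP-to-RPC mapping itself comes from google.api.http annotations in the .proto file, so the gateway, not the backend, owns the REST surface.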

The Interview Questions They’ll Ask:

  1. “What are the benefits of gRPC over REST?”
  2. “How does Envoy perform gRPC-JSON transcoding without hardcoding API schemas?” (source)
  3. “What’s the performance impact of protocol transcoding at the gateway?”

Project 8: The Shadow Boxer (Traffic Shadowing)

  • File: API_GATEWAY_ENGINEERING_MASTERY.md
  • Main Programming Language: YAML / Envoy Config
  • Alternative Programming Languages: Kong Admin API
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 1. The “Resume Gold”
  • Difficulty: Level 2: Intermediate
  • Knowledge Area: DevOps / Risk Management
  • Software or Tool: Envoy Proxy
  • Main Book: “Envoy Proxy Essentials” by Richard Johnson

What you’ll build: Configure Envoy to “shadow” production traffic - sending 100% of requests to your production backend AND duplicating them to a v2 backend for testing, but only returning the production response to the client.

Why it teaches API Gateway Engineering: This is how Netflix, Google, and Stripe safely test new backend versions with real production traffic without risking user experience. If v2 crashes, users never see it. You’re learning dark launch and A/B testing at the infrastructure level.

Real World Outcome:

# Normal request (goes to both prod and v2)
$ curl http://localhost:8080/api/order/456
{"order_id": "456", "status": "shipped"}  # Response from PROD

# Envoy logs show both backends received the request
[2024-12-28 12:00:00] [router] Cluster 'production' responded in 45ms (200 OK)
[2024-12-28 12:00:00] [router] Shadow cluster 'v2-canary' responded in 120ms (500 ERROR) <- Caught a bug!

# v2 backend logs show it received traffic but errors didn't affect users
[v2-backend] ERROR: NullPointerException in new feature
# User never saw this error - they got the prod response

The Interview Questions They’ll Ask:

  1. “What is traffic shadowing and why is it safer than canary deployments?”
  2. “How would you measure the success of a shadowed deployment?” (source)
  3. “What are the risks of shadowing write operations (POST/PUT/DELETE)?”

Project 9: The Fortress (API Web Application Firewall)

  • File: API_GATEWAY_ENGINEERING_MASTERY.md
  • Main Programming Language: Go (Coraza WAF)
  • Alternative Programming Languages: Lua
  • Coolness Level: Level 4: Hardcore Tech Flex
  • Business Potential: 3. The “Service & Support” Model
  • Difficulty: Level 4: Expert
  • Knowledge Area: Security / WAF
  • Software or Tool: Coraza WAF / Envoy Wasm
  • Main Book: “Practical Malware Analysis” by Michael Sikorski

What you’ll build: Integrate a Web Application Firewall (WAF) into Envoy using a Wasm-based Coraza module that inspects requests for common attacks: SQL injection, XSS, path traversal, and command injection.

Why it teaches API Gateway Engineering: The gateway is the first line of defense against application-layer attacks. This project teaches you to implement OWASP Top 10 protection at the edge, blocking malicious requests before they reach your backend services. Companies like Cloudflare and AWS WAF do this at massive scale.
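
A rough sketch of how a Wasm WAF module plugs into Envoy's HTTP filter chain; the module path is an assumption, and the plugin configuration payload (SecLang directives, OWASP CRS rules) follows the Coraza plugin's own schema, which is only hinted at here:

# HTTP filter sketch (Envoy v3) loading a WAF compiled to Wasm
http_filters:
  - name: envoy.filters.http.wasm
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
      config:
        name: "coraza-waf"
        vm_config:
          runtime: "envoy.wasm.runtime.v8"
          code:
            local:
              filename: "/etc/envoy/coraza-waf.wasm"  # assumed path to the compiled module
        configuration:
          "@type": type.googleapis.com/google.protobuf.StringValue
          value: |
            (plugin-specific configuration goes here, e.g. SecLang directives
             enabling the OWASP Core Rule Set - see the Coraza docs for exact keys)
  - name: envoy.filters.http.router
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router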

Real World Outcome:

# Normal request (allowed)
$ curl http://localhost:8080/search?q=laptops
{"results": [...]}

# SQL injection attempt (blocked by WAF)
$ curl "http://localhost:8080/search?q=' OR '1'='1"
HTTP/1.1 403 Forbidden
{"error": "Request blocked by WAF", "rule_id": "942100", "attack_type": "SQL Injection"}

# XSS attempt (blocked)
$ curl "http://localhost:8080/comment?text=<script>alert('XSS')</script>"
HTTP/1.1 403 Forbidden
{"error": "Request blocked by WAF", "rule_id": "941100", "attack_type": "XSS"}

# Envoy WAF logs
[2024-12-28 13:00:00] [wasm][waf] Blocked request from 192.168.1.100: SQL injection pattern detected
[2024-12-28 13:00:05] [wasm][waf] Blocked request from 192.168.1.105: XSS pattern detected

The Interview Questions They’ll Ask:

  1. “What is a Web Application Firewall and how does it differ from a network firewall?”
  2. “Explain the OWASP Top 10. How would you protect against SQL injection at the API gateway?” (source)
  3. “What are false positives in WAF rules and how do you tune them?”
  4. “Should WAF rules be implemented at the gateway or in the application? Why?”

Key Concepts:

  • OWASP Top 10: OWASP official documentation
  • WAF Patterns: “Practical Malware Analysis” Ch. 14 - Michael Sikorski
  • Security at the Edge: “Building Microservices” Ch. 9 - Sam Newman

Project 10: The Service Mesh (Sidecar mTLS)

  • File: API_GATEWAY_ENGINEERING_MASTERY.md
  • Main Programming Language: Shell / Docker-Compose
  • Alternative Programming Languages: Go (for cert rotation)
  • Coolness Level: Level 5: Pure Magic
  • Business Potential: 5. The “Industry Disruptor”
  • Difficulty: Level 5: Master
  • Knowledge Area: Networking / Service Mesh
  • Software or Tool: Envoy Proxy / SPIFFE
  • Main Book: “Envoy Proxy Essentials” by Richard Johnson

What you’ll build: Deploy Envoy as a sidecar proxy for each microservice, implementing mutual TLS (mTLS) for service-to-service authentication and encryption. Each service gets cryptographic identity via SPIFFE, and traffic between services is automatically encrypted without application code changes.

Why it teaches API Gateway Engineering: This is the service mesh pattern - the foundation of Istio, Linkerd, and Consul Connect. You’re moving from a single edge gateway to a mesh of gateways where every service gets its own proxy. This teaches zero-trust networking, certificate rotation, and identity-based security.
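
A minimal sketch of envoy-a's upstream TLS config for the service-a → service-b hop, with certificates mounted from disk to keep the example readable (a production mesh would deliver and rotate them via SDS and a SPIRE agent); cluster names, ports, and paths are illustrative:

# On envoy-a: require mTLS to service-b and pin its SPIFFE identity
clusters:
  - name: service-b
    type: STRICT_DNS
    load_assignment:
      cluster_name: service-b
      endpoints:
        - lb_endpoints:
            - endpoint: { address: { socket_address: { address: service-b, port_value: 8080 } } }
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        common_tls_context:
          tls_certificates:
            - certificate_chain: { filename: "/certs/service-a.crt" }
              private_key: { filename: "/certs/service-a.key" }
          validation_context:
            trusted_ca: { filename: "/certs/ca.crt" }
            match_typed_subject_alt_names:
              - san_type: URI
                matcher: { exact: "spiffe://cluster.local/service-b" }

The receiving side (envoy-b) mirrors this with a DownstreamTlsContext and require_client_certificate: true on its listener - that requirement is what makes the TLS mutual.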

Real World Outcome:

# Deploy services with Envoy sidecars
$ docker-compose up service-a service-b envoy-a envoy-b

# Service A → Service B communication (automatically mTLS encrypted)
# From inside service-a container:
$ curl http://localhost:15001/api/data
{"data": "Secret information from Service B"}

# Envoy logs show mTLS handshake
[2024-12-28 14:00:00] [envoy-a] Established mTLS connection to service-b
[2024-12-28 14:00:00] [envoy-a] Peer certificate verified: spiffe://cluster.local/service-b

# Packet capture shows only encrypted traffic (TLS 1.3)
$ sudo tcpdump -i lo -A port 8080
# [Encrypted Application Data] - no plaintext visible!

# Certificate rotation happens automatically every 24 hours
[2024-12-29 14:00:00] [cert-rotation] Renewed certificate for service-a (expires 2024-12-30)
[2024-12-29 14:00:00] [envoy-a] Hot-reloaded new certificate, 0 dropped connections

The Interview Questions They’ll Ask:

  1. “What is mutual TLS (mTLS) and how does it differ from regular TLS?” (source)
  2. “Explain the service mesh pattern. Why run a proxy alongside every service?”
  3. “What is SPIFFE and how does it provide service identity?” (source)
  4. “How do you handle certificate rotation in a service mesh without downtime?”
  5. “What are the performance costs of mTLS at scale?”

Key Concepts:

  • mTLS: “Serious Cryptography” Ch. 13 - Jean-Philippe Aumasson
  • Service Mesh: “Cloud Native Patterns” Ch. 9 - Cornelia Davis
  • SPIFFE: Official SPIFFE documentation
  • Zero Trust Architecture: “Building Microservices” Ch. 9 - Sam Newman

Common Pitfalls:

  • Clock skew: mTLS certificate validation fails if system clocks drift
  • Certificate rotation: Services crash during rotation if not using hot-reload
  • Performance overhead: mTLS adds CPU cost and extra latency, concentrated in handshakes; benchmark it at your expected request volume
  • Debugging encrypted traffic: Use Envoy’s admin endpoint to inspect certificates

Project Comparison Table

| Project          | Difficulty | Time      | Depth of Understanding | Fun Factor |
|------------------|------------|-----------|------------------------|------------|
| 1. Naked Proxy   | Level 1    | Weekend   | ★★☆☆☆                  | ★★☆☆☆      |
| 2. Secure Edge   | Level 2    | 1 Week    | ★★★☆☆                  | ★★★☆☆      |
| 3. Traffic Cop   | Level 3    | 1-2 Weeks | ★★★★☆                  | ★★★★☆      |
| 4. Shape-Shifter | Level 4    | 2 Weeks   | ★★★★★                  | ★★★★★      |
| 5. Master Arch.  | Level 5    | 1 Month   | ★★★★★                  | ★★★★☆      |
| 6. Watchman      | Level 2    | Weekend   | ★★★☆☆                  | ★★★☆☆      |
| 7. Diplomat      | Level 3    | 1 Week    | ★★★★☆                  | ★★★☆☆      |
| 8. Shadow Boxer  | Level 2    | Weekend   | ★★★☆☆                  | ★★★★☆      |
| 9. Fortress      | Level 4    | 2 Weeks   | ★★★★☆                  | ★★★★☆      |
| 10. Mesh         | Level 5    | 1 Month   | ★★★★★                  | ★★★★★      |

Recommendation

Start with Project 1 (Envoy Static Config). Most modern tutorials use high-level tools like Istio or Kubernetes Ingress, which hide the complexity. By writing an Envoy config file by hand, you will wrestle with “Clusters” vs “Listeners,” and that struggle is where the real learning happens.

If you are a Python/JS developer, start with Project 2 (Kong + JWT) to see how plugins can simplify your code.


Final Overall Project: The “Edge Intel” Platform

The Goal: Build a fully integrated API Gateway platform that serves a fictional “E-commerce” system.

Requirements:

  1. Hybrid Gateway: Use Envoy for high-speed edge routing and Kong for developer-friendly service management.
  2. Unified Control Plane: Build a Go service that manages configuration for both.
  3. Smart Rate Limiting: Limit users based on their “Customer Lifetime Value” (fetched from a database at the edge).
  4. Resilience: Implement Circuit Breaking and Outlier Detection. If a backend service fails 5% of requests, the gateway must automatically eject it from the load balancer (see the config sketch after this list).
  5. Security: Mandatory TLS 1.3, JWT validation, and WAF inspection.
  6. Observability: A single Grafana dashboard showing the flow of money (successful checkouts) vs. the flow of errors.
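
For requirement 4, a minimal sketch of Envoy's per-cluster circuit breaking plus failure-percentage outlier detection (the cluster name and thresholds are illustrative):

# Resilience sketch: bound concurrency with circuit breakers, eject hosts failing >= 5% of requests
clusters:
  - name: checkout-service
    connect_timeout: 1s
    circuit_breakers:
      thresholds:
        - max_connections: 1000
          max_pending_requests: 200
          max_retries: 3
    outlier_detection:
      interval: 10s
      base_ejection_time: 30s
      max_ejection_percent: 50
      failure_percentage_threshold: 5        # eject hosts whose failure rate reaches 5%
      enforcing_failure_percentage: 100
      failure_percentage_request_volume: 50  # only judge hosts that saw enough requests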

Summary

This learning path covers API Gateway Engineering through 10 hands-on projects.

| #  | Project Name         | Main Language | Difficulty   | Time Estimate |
|----|----------------------|---------------|--------------|---------------|
| 1  | The Naked Proxy      | YAML          | Beginner     | Weekend       |
| 2  | Secure the Edge      | Lua           | Intermediate | 1 Week        |
| 3  | The Traffic Cop      | Lua           | Advanced     | 1-2 Weeks     |
| 4  | The Shape-Shifter    | Rust (Wasm)   | Expert       | 2 Weeks       |
| 5  | The Master Architect | Go            | Master       | 1 Month       |
| 6  | The Watchman         | Go/Python     | Intermediate | Weekend       |
| 7  | The Diplomat         | Protobuf      | Advanced     | 1 Week        |
| 8  | The Shadow Boxer     | YAML          | Intermediate | Weekend       |
| 9  | The Fortress         | Go (Wasm)     | Expert       | 2 Weeks       |
| 10 | The Service Mesh     | Shell/Docker  | Master       | 1 Month       |

For SREs: Focus on 1, 6, 8, 10. For Security Engineers: Focus on 2, 3, 9. For Backend Architects: Focus on 4, 5, 7.

Expected Outcomes

After completing these projects, you will:

  • Internalize the Data Plane/Control Plane separation.
  • Be able to write custom extensions in Lua or Wasm.
  • Understand the gRPC and xDS protocols at a binary level.
  • Design resilient, secure systems that handle thousands of requests per second.
  • Master the “Golden Signals” of observability at the network edge.