Project 6: HTTP Server with Request Pooling (Capstone)
Sprint: 2 - Data & Invariants
Difficulty: Expert
Time Estimate: 2-4 weeks
Prerequisites: All previous Sprint 2 projects, basic understanding of sockets
Overview
What you'll build: A single-threaded HTTP/1.1 server that handles multiple concurrent connections using select()/poll(), with a custom memory pool for request parsing, demonstrating every concept from this sprint in a production-relevant context.
Why this is the ultimate test: A network server is where memory bugs become security bugs. You must:
- Parse untrusted input into owned data structures
- Track connection state with strict invariants (partial reads, connection lifecycle)
- Prevent buffer overflows in parsing (attackers WILL send malformed data)
- Free resources correctly when connections close unexpectedly
- Handle ownership of request data across parse/handle/respond phases
The Core Question You're Answering:
"Can I build a production-quality network service that handles untrusted input, manages complex state, and never leaks memory or crashes?"
Learning Objectives
By the end of this project, you will be able to:
- Implement an event-driven server using select()/poll()
- Design a connection state machine with clear invariants
- Use pool allocation for request handling
- Parse HTTP safely with bounds checking
- Handle partial reads/writes correctly
- Survive fuzzing without crashes or leaks
- Build production-quality systems software
Theoretical Foundation
Connection State Machine
CONNECTION STATES:

  +-------------+    read data     +---------------------+
  |   READING   | ---------------> |   READING_HEADERS   |
  |   REQUEST   |                  |  (partial request)  |
  +-------------+                  +---------------------+
         |                                    |
         | headers complete                   | more headers
         v                                    |
  +-------------+                             |
  | PROCESSING  | <---------------------------+
  |  (building  |
  |  response)  |
  +-------------+
         |
         | response ready
         v
  +-------------+   send complete   +-------------+
  |   SENDING   | ----------------> |   CLOSING   |
  |  RESPONSE   |                   |             |
  +-------------+                   +-------------+
     ^   |
     |   | partial send
     +---+

INVARIANTS:
1. Each connection is in EXACTLY one state
2. Buffer contains only data valid for current state
3. State transitions only happen on specific events
4. Resources are freed when entering CLOSING state
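The invariants above can be made executable. The following is a minimal sketch rather than the project's reference design: the `Event` enum, `next_state()`, and `state_is_valid()` are hypothetical names introduced here, and the transition table is one reading of the diagram, under the assumption that any (state, event) pair not listed is an error that closes the connection.

```c
#include <assert.h>
#include <stdbool.h>

/* States mirror the diagram; the CONN_ prefix follows the project's enum. */
typedef enum {
    CONN_READING_REQUEST,
    CONN_READING_HEADERS,
    CONN_PROCESSING,
    CONN_SENDING_RESPONSE,
    CONN_CLOSING,
    CONN_STATE_COUNT /* hypothetical sentinel, used only for range checks */
} ConnectionState;

/* Hypothetical event names, one per labelled arrow in the diagram. */
typedef enum {
    EV_REQUEST_LINE_READ,
    EV_MORE_HEADERS,
    EV_HEADERS_COMPLETE,
    EV_RESPONSE_READY,
    EV_PARTIAL_SEND,
    EV_SEND_COMPLETE,
    EV_ERROR
} Event;

/* Invariant 1: a connection is in exactly one, known state. */
static bool state_is_valid(ConnectionState s) {
    return (int)s >= 0 && (int)s < (int)CONN_STATE_COUNT;
}

/* Invariant 3: transitions happen only on specific events.
 * Any pair not listed here falls through to CLOSING. */
static ConnectionState next_state(ConnectionState s, Event ev) {
    assert(state_is_valid(s));
    switch (s) {
    case CONN_READING_REQUEST:
        if (ev == EV_REQUEST_LINE_READ) return CONN_READING_HEADERS;
        break;
    case CONN_READING_HEADERS:
        if (ev == EV_MORE_HEADERS) return CONN_READING_HEADERS;
        if (ev == EV_HEADERS_COMPLETE) return CONN_PROCESSING;
        break;
    case CONN_PROCESSING:
        if (ev == EV_RESPONSE_READY) return CONN_SENDING_RESPONSE;
        break;
    case CONN_SENDING_RESPONSE:
        if (ev == EV_PARTIAL_SEND) return CONN_SENDING_RESPONSE;
        if (ev == EV_SEND_COMPLETE) return CONN_CLOSING;
        break;
    case CONN_CLOSING:
    case CONN_STATE_COUNT:
        break;
    }
    return CONN_CLOSING;
}
```

Routing every state change through one function like this gives a single place to log transitions and count invariant checks.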
Request Pool Allocation
Per-Request Memory Pool:

  Connection starts:
  +---------------------------------------------------+
  | Pool: 4KB                                         |
  | [----------------- available -----------------]   |
  +---------------------------------------------------+

  After parsing request:
  +---------------------------------------------------+
  | Pool: 4KB                                         |
  | [method][uri][headers][        available      ]   |
  |  <---- 312 bytes used ---->  <-- 3784 free -->    |
  +---------------------------------------------------+

  After response sent:
  pool_reset() -> 0 bytes used, ready for next request

  KEY INSIGHT:
  - Zero malloc() calls during request handling
  - All request data has same lifetime
  - pool_reset() is O(1), not O(n) frees

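To make the diagram concrete, here is a minimal bump-allocator sketch. It is an illustration under stated assumptions, not the Arena from the earlier project: the `Pool` struct and the `pool_init()`, `pool_alloc()`, and `pool_reset()` functions are hypothetical, and the backing storage is supplied by the caller so the sketch itself never calls malloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-capacity pool; the project's Arena may be laid out
 * differently. */
typedef struct {
    unsigned char* base; /* backing storage, owned by the caller */
    size_t capacity;     /* total bytes available */
    size_t used;         /* bump offset; invariant: used <= capacity */
} Pool;

static void pool_init(Pool* p, void* storage, size_t capacity) {
    p->base = storage;
    p->capacity = capacity;
    p->used = 0;
}

/* Returns NULL when the request does not fit.
 * align must be a non-zero power of two. */
static void* pool_alloc(Pool* p, size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    assert(p->used <= p->capacity);
    uintptr_t start = (uintptr_t)p->base + p->used;
    size_t padding = (size_t)(-start & (align - 1)); /* bytes to next boundary */
    size_t remaining = p->capacity - p->used;
    if (padding > remaining) return NULL;
    if (size > remaining - padding) return NULL;
    void* result = p->base + p->used + padding;
    p->used += padding + size;
    return result;
}

/* O(1): every allocation is released at once by rewinding the offset. */
static void pool_reset(Pool* p) {
    p->used = 0;
}
```

With a 4096-byte buffer, allocating 312 bytes leaves 3784 free, matching the figures in the diagram. The size checks are written as subtractions from the remaining space so that a huge `size` cannot wrap around and appear to fit.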
Defensive Parsing
HTTP Request Format:
--------------------
GET /index.html HTTP/1.1\r\n
Host: localhost\r\n
Content-Length: 0\r\n
\r\n

ATTACK VECTORS (must handle all):

  1. Oversized URI
     GET /AAAAAAA...(10KB)...AAAA HTTP/1.1
     -> Reject with 414 URI Too Long

  2. Invalid Content-Length
     Content-Length: 99999999999
     -> Reject with 413 Payload Too Large

  3. Null bytes in headers
     Host: localhost\x00malicious
     -> Reject with 400 Bad Request

  4. Missing terminator
     GET /index.html HTTP/1.1\r\nHost: local(connection closes)
     -> Timeout and close cleanly

  5. Slowloris attack
     Send partial request, hold connection
     -> Timeout after 30 seconds, close

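Attack vectors 2 and 3 both come down to never trusting a length or a terminator supplied by the client. The sketch below shows one way to handle them. The names `parse_content_length()`, `has_nul_byte()`, `ClResult`, and the 1 MiB `MAX_BODY_BYTES` limit are assumptions made for illustration, not part of the project specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limit for this sketch; the project would pick its own. */
#define MAX_BODY_BYTES ((size_t)1 << 20)

typedef enum { CL_OK, CL_BAD_REQUEST, CL_TOO_LARGE } ClResult;

/* Parses a Content-Length value given as a (pointer, length) slice.
 * Digits only: no sign, no whitespace, no hex. The running total is
 * checked against the limit before each multiply, so it cannot overflow. */
static ClResult parse_content_length(const char* s, size_t len, size_t* out) {
    assert(s != NULL && out != NULL);
    if (len == 0) return CL_BAD_REQUEST;
    size_t value = 0;
    for (size_t i = 0; i < len; i++) {
        if (s[i] < '0' || s[i] > '9') return CL_BAD_REQUEST;
        size_t digit = (size_t)(s[i] - '0');
        if (value > (MAX_BODY_BYTES - digit) / 10) return CL_TOO_LARGE;
        value = value * 10 + digit;
    }
    *out = value;
    return CL_OK;
}

/* memchr() honours the explicit length, whereas strlen-style functions
 * would silently stop at the first NUL and hide the rest of the header. */
static bool has_nul_byte(const char* s, size_t len) {
    return memchr(s, '\0', len) != NULL;
}
```

In a handler, `CL_TOO_LARGE` would map to 413 and `CL_BAD_REQUEST` or a NUL hit to 400, following the table above.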
Project Specification
Core API
// Server lifecycle
Server* server_create(int port, const char* docroot);
void server_run(Server* server); // Main event loop
void server_stop(Server* server);
void server_destroy(Server* server);
// Configuration
void server_set_pool_size(Server* server, size_t size);
void server_set_max_connections(Server* server, int max);
void server_set_timeout(Server* server, int seconds);
// Statistics
ServerStats server_get_stats(Server* server);
void server_print_stats(Server* server);
Expected Output
$ ./httpserver --pool-size 4096 --max-connections 128 --port 8080 ./www
[INFO] Initializing memory pools...
[INFO] Created 128 connection pools (4096 bytes each)
[INFO] Total pre-allocated memory: 512 KB
[INFO] Listening on 0.0.0.0:8080
[INFO] Document root: /home/user/project/www
[INFO] Server ready. Press Ctrl+C to stop.
[CONN 0] New connection from 127.0.0.1:52341
[CONN 0] State: READING_REQUEST_LINE
[CONN 0] Read 78 bytes, buffer: 78/4096
[CONN 0] Parsed: GET /index.html HTTP/1.1
[CONN 0] State: READING_HEADERS -> PROCESSING
[CONN 0] Headers parsed: 6 headers, used 312 bytes pool
[CONN 0] State: PROCESSING -> SENDING_RESPONSE
[CONN 0] Serving file: ./www/index.html (1247 bytes)
[CONN 0] Sent 1247 bytes
[CONN 0] State: SENDING_RESPONSE -> CLOSING
[CONN 0] Pool reset, 312 bytes freed
[CONN 0] Connection closed
[CONN 0] Total lifetime: 3ms, heap allocs: 0
Memory Statistics
========== Memory Statistics ==========
Total requests served: 1,247
Total bytes received: 156,783
Total bytes sent: 4,291,034
Pool Statistics:
Active connections: 3
Peak connections: 47
Pool resets: 1,247
Average pool usage: 412 bytes/request
Peak pool usage: 3,891 bytes
Pool overflows: 0
Heap Statistics:
malloc() calls during runtime: 0
free() calls during runtime: 0
Current heap usage: 0 bytes
Connection State Breakdown:
READING_REQUEST_LINE: 2
READING_HEADERS: 1
PROCESSING: 0
SENDING_RESPONSE: 0
Invariant Checks Passed: 47,291
Invariant Violations: 0
=======================================
Solution Architecture
Data Structures
typedef enum {
    CONN_READING_REQUEST,
    CONN_READING_HEADERS,
    CONN_PROCESSING,
    CONN_SENDING_RESPONSE,
    CONN_CLOSING
} ConnectionState;

typedef struct {
    int fd;
    ConnectionState state;
    // Request parsing
    char read_buffer[8192];
    size_t read_pos;
    // Pool for request data
    Arena* pool;
    // Parsed request (allocated from pool)
    char* method;
    char* uri;
    char* version;
    HttpHeader* headers;
    size_t header_count;
    // Response
    char* response_buffer;
    size_t response_len;
    size_t response_sent;
    // Timing
    time_t connected_at;
    time_t last_activity;
} Connection;

typedef struct {
    int listen_fd;
    Connection* connections[MAX_CONNECTIONS];
    Arena* pools[MAX_CONNECTIONS];
    const char* docroot;
    ServerStats stats;
    bool running;
} Server;
Implementation Guide
Phase 1: Basic Socket Server (Day 1-2)
Server* server_create(int port, const char* docroot) {
    Server* server = calloc(1, sizeof(Server));
    if (!server) return NULL;
    // Keep a private copy of docroot; it is released on every error path below
    char* root = strdup(docroot);
    if (!root) {
        free(server);
        return NULL;
    }
    server->docroot = root;
    // Create listening socket
    server->listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (server->listen_fd < 0) {
        perror("socket");
        free(root);
        free(server);
        return NULL;
    }
    // Allow port reuse
    int opt = 1;
    setsockopt(server->listen_fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
    // Bind
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(server->listen_fd, (struct sockaddr*)&addr, sizeof(addr)) < 0) {
        perror("bind");
        close(server->listen_fd);
        free(root);
        free(server);
        return NULL;
    }
    if (listen(server->listen_fd, 128) < 0) {
        perror("listen");
        close(server->listen_fd);
        free(root);
        free(server);
        return NULL;
    }
    // Pre-allocate pools (a failed arena_create() should also be handled here)
    for (int i = 0; i < MAX_CONNECTIONS; i++) {
        server->pools[i] = arena_create(POOL_SIZE);
    }
    return server;
}
Phase 2: Event Loop with select() (Day 2-3)
void server_run(Server* server) {
    server->running = true;
    while (server->running) {
        fd_set read_fds, write_fds;
        FD_ZERO(&read_fds);
        FD_ZERO(&write_fds);
        int max_fd = server->listen_fd;
        FD_SET(server->listen_fd, &read_fds);
        // Add active connections.
        // FD_SET is only defined for fds below FD_SETSIZE, so
        // accept_connection() must refuse any fd at or above that limit.
        for (int i = 0; i < MAX_CONNECTIONS; i++) {
            Connection* conn = server->connections[i];
            if (!conn) continue;
            if (conn->state == CONN_READING_REQUEST ||
                conn->state == CONN_READING_HEADERS) {
                FD_SET(conn->fd, &read_fds);
            }
            if (conn->state == CONN_SENDING_RESPONSE) {
                FD_SET(conn->fd, &write_fds);
            }
            if (conn->fd > max_fd) max_fd = conn->fd;
        }
        struct timeval timeout = {.tv_sec = 1, .tv_usec = 0};
        int ready = select(max_fd + 1, &read_fds, &write_fds, NULL, &timeout);
        if (ready < 0) {
            if (errno == EINTR) continue;
            perror("select");
            break;
        }
        // Accept new connections
        if (FD_ISSET(server->listen_fd, &read_fds)) {
            accept_connection(server);
        }
        // Handle existing connections
        for (int i = 0; i < MAX_CONNECTIONS; i++) {
            Connection* conn = server->connections[i];
            if (!conn) continue;
            if (FD_ISSET(conn->fd, &read_fds)) {
                handle_read(server, conn);
            }
            // A handler may close the connection and free conn, so reload
            // the slot before touching it again
            conn = server->connections[i];
            if (conn && FD_ISSET(conn->fd, &write_fds)) {
                handle_write(server, conn);
                conn = server->connections[i];
            }
            // Invariant 4: resources are freed when a connection enters CLOSING
            if (conn && conn->state == CONN_CLOSING) {
                close_connection(server, i);
            }
        }
        // Check timeouts
        check_timeouts(server);
    }
}
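The overview mentions poll() as an alternative to select(), but the loop above only shows select(). As a hedged sketch of the poll() side, the helper below builds the interest set from connection states. `ConnView`, `events_for_state()`, and `build_poll_set()` are hypothetical names, and `ConnView` is a cut-down stand-in for the real `Connection` struct.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

typedef enum {
    CONN_READING_REQUEST,
    CONN_READING_HEADERS,
    CONN_PROCESSING,
    CONN_SENDING_RESPONSE,
    CONN_CLOSING
} ConnectionState;

/* Hypothetical slimmed-down connection: only what the poll set needs. */
typedef struct {
    int fd;
    ConnectionState state;
} ConnView;

/* Maps a state to the poll() events the loop should wait for. */
static short events_for_state(ConnectionState state) {
    switch (state) {
    case CONN_READING_REQUEST:
    case CONN_READING_HEADERS:
        return POLLIN;
    case CONN_SENDING_RESPONSE:
        return POLLOUT;
    default:
        return 0; /* PROCESSING and CLOSING wait on nothing */
    }
}

/* Writes the listening socket to out[0] and interested connections after it.
 * Returns the number of entries written; out must hold count + 1 entries. */
static size_t build_poll_set(int listen_fd, const ConnView* conns,
                             size_t count, struct pollfd* out) {
    assert(out != NULL);
    out[0].fd = listen_fd;
    out[0].events = POLLIN;
    out[0].revents = 0;
    size_t n = 1;
    for (size_t i = 0; i < count; i++) {
        short ev = events_for_state(conns[i].state);
        if (ev == 0) continue;
        out[n].fd = conns[i].fd;
        out[n].events = ev;
        out[n].revents = 0;
        n++;
    }
    return n;
}
```

The resulting array would be passed as `poll(set, n, 1000)` for a one-second timeout, with readiness reported in each entry's `revents`. Unlike select(), poll() places no FD_SETSIZE ceiling on descriptor values.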
Phase 3: HTTP Parsing with Bounds Checking (Day 3-5)
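static bool parse_request_line(Connection* conn) {
    // read_buffer is not NUL-terminated, so search only the read_pos bytes
    // received so far instead of calling strstr()/strchr() on it
    char* buf = conn->read_buffer;
    char* line_end = NULL;
    for (size_t i = 0; i + 1 < conn->read_pos; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n') {
            line_end = buf + i;
            break;
        }
    }
    if (!line_end) {
        // Incomplete, unless no terminator can still arrive within the limit
        if (conn->read_pos > MAX_REQUEST_LINE + 1) {
            conn->state = CONN_CLOSING;  // Send 414 URI Too Long
        }
        return false;
    }
    size_t line_len = (size_t)(line_end - buf);
    if (line_len > MAX_REQUEST_LINE) {
        conn->state = CONN_CLOSING;
        // Send 414 URI Too Long
        return false;
    }
    if (memchr(buf, '\0', line_len)) {
        conn->state = CONN_CLOSING;
        // Send 400 Bad Request (embedded NUL byte)
        return false;
    }
    // Parse: METHOD URI VERSION
    char* cursor = buf;
    // Method (allocate from pool)
    char* method_end = memchr(cursor, ' ', (size_t)(line_end - cursor));
    if (!method_end) {
        conn->state = CONN_CLOSING;  // Bad request
        return false;
    }
    size_t method_len = (size_t)(method_end - cursor);
    conn->method = arena_alloc(conn->pool, method_len + 1, 1);
    if (!conn->method) {
        conn->state = CONN_CLOSING;  // Pool exhausted
        return false;
    }
    memcpy(conn->method, cursor, method_len);
    conn->method[method_len] = '\0';
    cursor = method_end + 1;
    // URI
    char* uri_end = memchr(cursor, ' ', (size_t)(line_end - cursor));
    if (!uri_end) {
        conn->state = CONN_CLOSING;  // Bad request
        return false;
    }
    size_t uri_len = (size_t)(uri_end - cursor);
    if (uri_len > MAX_URI_LENGTH) {
        conn->state = CONN_CLOSING;  // 414 URI Too Long
        return false;
    }
    conn->uri = arena_alloc(conn->pool, uri_len + 1, 1);
    if (!conn->uri) {
        conn->state = CONN_CLOSING;  // Pool exhausted
        return false;
    }
    memcpy(conn->uri, cursor, uri_len);
    conn->uri[uri_len] = '\0';
    // Validate URI (path traversal, etc.; NUL bytes were rejected above)
    if (!validate_uri(conn->uri)) {
        conn->state = CONN_CLOSING;  // 400 Bad Request
        return false;
    }
    cursor = uri_end + 1;
    // Version
    size_t version_len = (size_t)(line_end - cursor);
    conn->version = arena_alloc(conn->pool, version_len + 1, 1);
    if (!conn->version) {
        conn->state = CONN_CLOSING;  // Pool exhausted
        return false;
    }
    memcpy(conn->version, cursor, version_len);
    conn->version[version_len] = '\0';
    // Shift buffer
    memmove(conn->read_buffer, line_end + 2, conn->read_pos - line_len - 2);
    conn->read_pos -= line_len + 2;
    conn->state = CONN_READING_HEADERS;
    return true;
}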
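Once the request line is consumed, the CONN_READING_HEADERS state needs the same bounded treatment for each "Name: value" line. The guide does not show that parser, so the following is only a sketch: `Slice` and `parse_header_line()` are hypothetical, and the function returns non-owning views into the read buffer that the caller would copy into the pool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A non-owning view into the read buffer (hypothetical helper type). */
typedef struct {
    const char* ptr;
    size_t len;
} Slice;

/* Splits "Name: value" (without the trailing CRLF) into two slices.
 * Rejects lines with no colon, an empty name, whitespace inside the name,
 * or any NUL byte. Spaces and tabs around the value are trimmed. */
static bool parse_header_line(const char* line, size_t len,
                              Slice* name, Slice* value) {
    assert(line != NULL && name != NULL && value != NULL);
    if (memchr(line, '\0', len)) return false;
    const char* colon = memchr(line, ':', len);
    if (!colon || colon == line) return false;
    size_t name_len = (size_t)(colon - line);
    for (size_t i = 0; i < name_len; i++) {
        if (line[i] == ' ' || line[i] == '\t') return false;
    }
    const char* v = colon + 1;
    const char* end = line + len;
    while (v < end && (*v == ' ' || *v == '\t')) v++;
    while (end > v && (end[-1] == ' ' || end[-1] == '\t')) end--;
    name->ptr = line;
    name->len = name_len;
    value->ptr = v;
    value->len = (size_t)(end - v);
    return true;
}
```

Every search takes an explicit length, so the function never reads past the bytes the caller says are valid, even when the buffer has no terminator.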
Phase 4: File Serving (Day 5-6)
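static void serve_file(Server* server, Connection* conn, const char* path) {
    // Construct full path; snprintf's return value reveals truncation
    char full_path[PATH_MAX];
    int n = snprintf(full_path, sizeof(full_path), "%s%s", server->docroot, path);
    if (n < 0 || (size_t)n >= sizeof(full_path)) {
        send_error(conn, 414, "URI Too Long");
        return;
    }
    // Validate path (prevent directory traversal)
    char resolved[PATH_MAX];
    if (!realpath(full_path, resolved)) {
        send_error(conn, 404, "Not Found");
        return;
    }
    // Check it's under docroot (assumes docroot is already canonical, with no
    // trailing slash). The byte after the prefix must be '/' or the end of the
    // string, otherwise "/srv/www-private" would pass for docroot "/srv/www".
    size_t root_len = strlen(server->docroot);
    if (strncmp(resolved, server->docroot, root_len) != 0 ||
        (resolved[root_len] != '/' && resolved[root_len] != '\0')) {
        send_error(conn, 403, "Forbidden");
        return;
    }
    // Open file
    int fd = open(resolved, O_RDONLY);
    if (fd < 0) {
        send_error(conn, 404, "Not Found");
        return;
    }
    // Get file size; only regular files are served
    struct stat st;
    if (fstat(fd, &st) < 0 || !S_ISREG(st.st_mode)) {
        close(fd);
        send_error(conn, 404, "Not Found");
        return;
    }
    size_t file_size = (size_t)st.st_size;
    // Build response
    char header[512];
    int header_len = snprintf(header, sizeof(header),
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: %s\r\n"
        "Content-Length: %zu\r\n"
        "Connection: close\r\n"
        "\r\n",
        get_content_type(resolved),
        file_size);
    if (header_len < 0 || (size_t)header_len >= sizeof(header)) {
        close(fd);
        send_error(conn, 500, "Internal Server Error");
        return;
    }
    // Allocate response buffer; NULL means the file does not fit in the pool
    conn->response_len = (size_t)header_len + file_size;
    conn->response_buffer = arena_alloc(conn->pool, conn->response_len, 1);
    if (!conn->response_buffer) {
        close(fd);
        send_error(conn, 507, "Insufficient Storage");
        return;
    }
    memcpy(conn->response_buffer, header, (size_t)header_len);
    // read() may return fewer bytes than requested, so loop until done
    size_t got = 0;
    while (got < file_size) {
        ssize_t r = read(fd, conn->response_buffer + header_len + got,
                         file_size - got);
        if (r < 0 && errno == EINTR) continue;
        if (r <= 0) break;  // Error, or the file shrank after fstat()
        got += (size_t)r;
    }
    close(fd);
    if (got != file_size) {
        send_error(conn, 500, "Internal Server Error");
        return;
    }
    conn->response_sent = 0;
    conn->state = CONN_SENDING_RESPONSE;
}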
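The docroot containment check is easy to get subtly wrong, because a plain prefix comparison accepts sibling directories whose names merely start with the docroot. The sketch below isolates that check in a hypothetical helper, `path_is_under()`, assuming both arguments are canonical absolute paths (for example, both produced by realpath()) and that the root has no trailing slash.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if `resolved` is `root` itself or lies beneath it.
 * Assumes canonical absolute paths and no trailing slash on `root`. */
static bool path_is_under(const char* root, const char* resolved) {
    assert(root != NULL && resolved != NULL);
    size_t root_len = strlen(root);
    /* If resolved is shorter than root, strncmp stops at its NUL and
     * reports a mismatch, so the index below is never out of bounds. */
    if (strncmp(resolved, root, root_len) != 0) return false;
    /* The prefix matched; require a path-component boundary after it. */
    return resolved[root_len] == '/' || resolved[root_len] == '\0';
}
```

With root `/srv/www`, the path `/srv/www/index.html` is accepted while `/srv/www-private/key.pem` is rejected, even though both share the same leading bytes.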
Phase 5: Cleanup and Statistics (Day 6-7)
static void close_connection(Server* server, int slot) {
    Connection* conn = server->connections[slot];
    if (!conn) return;
    close(conn->fd);
    arena_reset(conn->pool);  // O(1) cleanup!
    server->stats.total_connections++;
    server->stats.active_connections--;
    free(conn);
    server->connections[slot] = NULL;
}

void server_print_stats(Server* server) {
    // Guard the average against a zero request count
    double avg_pool_usage = server->stats.total_requests
        ? (double)server->stats.total_pool_usage / server->stats.total_requests
        : 0.0;
    printf("========== Server Statistics ==========\n");
    printf("Total requests: %lu\n", server->stats.total_requests);
    printf("Active connections: %d\n", server->stats.active_connections);
    printf("Peak connections: %d\n", server->stats.peak_connections);
    printf("Bytes received: %lu\n", server->stats.bytes_received);
    printf("Bytes sent: %lu\n", server->stats.bytes_sent);
    printf("Pool resets: %lu\n", server->stats.pool_resets);
    printf("Average pool usage: %.1f bytes\n", avg_pool_usage);
    printf("Invariant violations: %d\n", server->stats.invariant_violations);
    printf("========================================\n");
}
Testing Strategy
Functional Testing
# Basic request
curl -v http://localhost:8080/index.html
# Concurrent connections
ab -n 10000 -c 100 http://localhost:8080/test.html
# Large file
curl http://localhost:8080/large_file.bin -o /dev/null
Fuzzing
# Send malformed requests
./fuzzer --target localhost:8080 --malformed-requests 1000
# Expected: All rejected gracefully, no crashes, no leaks
Memory Verification
$ valgrind --leak-check=full ./httpserver --port 8080 ./www
# Handle 1000 requests, then shutdown
==12345== All heap blocks were freed -- no leaks are possible
Common Pitfalls
Pitfall 1: Partial Reads
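// WRONG: Assuming read() returns complete request
int n = read(conn->fd, buffer, sizeof(buffer));
// May return partial data!

// CORRECT: Accumulate in buffer, check for complete message.
// One byte is reserved so the buffer stays NUL-terminated for strstr().
ssize_t n = read(conn->fd, conn->read_buffer + conn->read_pos,
                 sizeof(conn->read_buffer) - 1 - conn->read_pos);
if (n <= 0) {
    // n == 0: peer closed (or the buffer is full); n < 0: error.
    // EINTR and EAGAIN are not fatal: retry on the next readiness event.
    if (n == 0 || (errno != EINTR && errno != EAGAIN)) {
        conn->state = CONN_CLOSING;
    }
    return;
}
conn->read_pos += (size_t)n;
conn->read_buffer[conn->read_pos] = '\0';
// Check if we have a complete request line
if (strstr(conn->read_buffer, "\r\n")) {
    parse_request_line(conn);
}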
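A self-contained version of this accumulate-then-check pattern is sketched below so it can be exercised against a pipe. `ReadBuf`, `ReadStatus`, `find_crlf()`, and `read_some()` are hypothetical names, and the 256-byte buffer is an arbitrary size chosen for the example. Unlike strstr(), the search here is bounded by the number of bytes received, so the buffer needs no terminator.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

typedef enum { READ_MORE, READ_LINE_READY, READ_CLOSED, READ_TOO_LONG } ReadStatus;

/* Hypothetical stand-in for the read half of Connection. */
typedef struct {
    char data[256];
    size_t len; /* bytes received so far; invariant: len <= sizeof data */
} ReadBuf;

/* Bounded search for "\r\n" in the first len bytes.
 * Returns the offset of '\r', or -1 if no terminator is present yet. */
static long find_crlf(const char* buf, size_t len) {
    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n') return (long)i;
    }
    return -1;
}

/* Performs one read() and reports whether a full line is now buffered. */
static ReadStatus read_some(int fd, ReadBuf* rb) {
    assert(rb->len <= sizeof rb->data);
    if (rb->len == sizeof rb->data) return READ_TOO_LONG;
    ssize_t n = read(fd, rb->data + rb->len, sizeof rb->data - rb->len);
    if (n == 0) return READ_CLOSED; /* peer closed the connection */
    if (n < 0) {
        /* Interrupted or would block: not fatal, retry on the next event. */
        if (errno == EINTR || errno == EAGAIN || errno == EWOULDBLOCK) {
            return READ_MORE;
        }
        return READ_CLOSED;
    }
    rb->len += (size_t)n;
    return find_crlf(rb->data, rb->len) >= 0 ? READ_LINE_READY : READ_MORE;
}
```

Feeding the request line in two writes ("GET / HT", then "TP/1.1\r\n") yields READ_MORE after the first read and READ_LINE_READY after the second, which is the situation a real client can produce at any time.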
Pitfall 2: Buffer Overflow in URI
// WRONG: No length check
strcpy(path, uri); // Buffer overflow!
// CORRECT: Bounds checking
if (strlen(uri) >= sizeof(path)) {
send_error(conn, 414, "URI Too Long");
return;
}
strncpy(path, uri, sizeof(path) - 1);
path[sizeof(path) - 1] = '\0';
Pitfall 3: Pool Exhaustion
// WRONG: Not checking allocation
char* data = arena_alloc(conn->pool, size, 1);
memcpy(data, src, size); // Crash if data is NULL!

// CORRECT: Check for pool exhaustion
char* data = arena_alloc(conn->pool, size, 1);
if (!data) {
    send_error(conn, 507, "Insufficient Storage");
    return;
}
memcpy(data, src, size); // Safe: data is known to be non-NULL
Interview Preparation
Common Questions
- "How does select() work for handling multiple connections?"
  - Single-threaded event loop
  - select() blocks until any fd is ready
  - Check which fds are ready, handle them
  - Repeat
- "Why use memory pools instead of malloc?"
  - Zero malloc overhead per request
  - All request data has same lifetime
  - O(1) cleanup with pool_reset()
  - Prevents fragmentation
- "How do you prevent buffer overflows when parsing HTTP?"
  - Check all lengths before copying
  - Validate Content-Length limits
  - Reject requests that exceed buffer sizes
  - Use safe string functions
- "What happens if a client sends data very slowly?"
  - Timeout mechanism
  - Close connections that are idle too long
  - Limit partial request duration
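The last answer names two distinct limits, and it helps to see why an idle timeout alone is not enough. In the sketch below, `ConnTimes` and `conn_expired()` are hypothetical, and the limits are passed in as parameters rather than taken from the project's configuration. A client that trickles one byte every few seconds keeps `last_activity` fresh, so the unfinished request is also bounded from the moment the connection was accepted.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical subset of Connection's timing fields. */
typedef struct {
    time_t connected_at;
    time_t last_activity;
    bool request_complete; /* true once the headers have been fully parsed */
} ConnTimes;

/* Returns true if the connection should be closed at time `now`.
 * Both limits are in seconds. */
static bool conn_expired(const ConnTimes* c, time_t now,
                         double idle_limit, double request_limit) {
    assert(c != NULL);
    assert(difftime(c->last_activity, c->connected_at) >= 0.0);
    /* Idle check: nothing read or written for too long. */
    if (difftime(now, c->last_activity) > idle_limit) return true;
    /* Slow-client check: the request must complete within a fixed window
     * measured from connection start, regardless of recent activity. */
    if (!c->request_complete &&
        difftime(now, c->connected_at) > request_limit) {
        return true;
    }
    return false;
}
```

A check_timeouts() pass could call this for each active slot once per event-loop iteration and move expired connections to CONN_CLOSING.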
Self-Assessment Checklist
Functionality
- Serves static files correctly
- Handles 100+ concurrent connections
- Timeouts work correctly
- Graceful shutdown releases resources
Security
- Rejects oversized requests
- Prevents directory traversal
- No buffer overflows (survives fuzzing)
- Validates all input
Memory
- Zero heap allocations during request handling
- Valgrind clean after extended use
- Pool usage stays bounded
- No leaks on connection close
Invariants
- State machine transitions correct
- Invariant checker passes continuously
- Statistics are accurate
Summary
The HTTP Server capstone proves you can build production-quality systems software:
- Invariants everywhere: Connection state, buffer validity, pool ownership
- Defensive parsing: Untrusted input, bounds checking, validation
- Pool allocation: Zero malloc per request, O(1) cleanup
- Event-driven architecture: Non-blocking I/O, state machines
- Security awareness: Fuzzing survival, attack resistance
When you can build a server that handles thousands of requests, survives fuzzing, never leaks memory, and clearly documents its invariants, you've proven mastery of Sprint 2's core concepts.
Congratulations on completing Sprint 2: Data & Invariants!
You've gone from "hoping code works" to "proving code is correct." You understand ownership, invariants, and defensive design at a professional level. This is the discipline that separates production-quality systems code from fragile prototypes.