Project 4: The Asynchronous HTTP 1.0 Client
Build a complete HTTP client that chains DNS resolution, TCP connection, request writing, and response reading into a seamless async workflow.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Expert |
| Time Estimate | 1-2 weeks |
| Language | C |
| Prerequisites | Project 3, HTTP basics |
| Key Topics | DNS resolution, callback composition, multi-stage async, HTTP protocol |
1. Learning Objectives
By completing this project, you will:
- Perform asynchronous DNS lookups with uv_getaddrinfo()
- Chain multiple async operations (DNS → connect → write → read)
- Manage state across callback boundaries
- Parse simple HTTP responses
- Handle errors at each stage of the pipeline
- Build a complete network client from primitives
- Understand the “callback hell” challenge and how to structure code
2. Theoretical Foundation
2.1 Core Concepts
Multi-Stage Async Operations
Making an HTTP request requires four sequential async operations:
┌────────────────────────────────────────────────────────────────────┐
│ HTTP Request Pipeline │
├────────────────────────────────────────────────────────────────────┤
│ │
│ Stage 1: DNS Resolution │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ uv_getaddrinfo("example.com") ──► IP: 93.184.216.34 │ │
│ └──────────────────────────────────────────────────────────┬──┘ │
│ │ │
│ on_resolved() │ │
│ ▼ │
│ Stage 2: TCP Connection │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ uv_tcp_connect(93.184.216.34:80) ──► Socket connected │ │
│ └──────────────────────────────────────────────────────────┬──┘ │
│ │ │
│ on_connect() │ │
│ ▼ │
│ Stage 3: Send HTTP Request │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ uv_write("GET / HTTP/1.0\r\n...") ──► Request sent │ │
│ └──────────────────────────────────────────────────────────┬──┘ │
│ │ │
│ on_write() │ │
│ ▼ │
│ Stage 4: Read HTTP Response │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ uv_read_start() ──► Multiple on_read() calls ──► Complete │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────┘
Asynchronous DNS with uv_getaddrinfo
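A DNS lookup through getaddrinfo() blocks the calling thread. libuv's uv_getaddrinfo() runs the lookup on its thread pool and reports the result through a callback:
// The request object (must stay valid until the callback runs)
uv_getaddrinfo_t resolver;
// The hints (what we're looking for); zero the struct first so unset fields are not garbage
struct addrinfo hints;
memset(&hints, 0, sizeof(hints));
hints.ai_family = AF_INET; // IPv4
hints.ai_socktype = SOCK_STREAM; // TCP
hints.ai_flags = 0;
hints.ai_protocol = IPPROTO_TCP;
// Start the resolution; a negative return value means the request was never queued
int r = uv_getaddrinfo(loop, &resolver, on_resolved, "example.com", "80", &hints);
if (r < 0) {
    fprintf(stderr, "uv_getaddrinfo failed: %s\n", uv_strerror(r));
}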
In the callback, you get a linked list of addresses:
void on_resolved(uv_getaddrinfo_t* resolver, int status, struct addrinfo* res) {
    if (status < 0) {
        // Handle error; res must not be used on this path
        return;
    }
    // res->ai_addr is the sockaddr to connect to
    // Use it with uv_tcp_connect()
    uv_freeaddrinfo(res); // Must free!
}
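Because the result is a linked list, the callback can walk it through `ai_next` rather than assuming the first entry is usable. The following is a minimal sketch, assuming IPv4 only; `first_ipv4` is a hypothetical helper name used for illustration and is not part of libuv.

```c
#define _POSIX_C_SOURCE 200112L
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

// Hypothetical helper: walk the addrinfo list and format the first IPv4
// address into out. Returns 0 on success, -1 if no usable entry exists.
int first_ipv4(const struct addrinfo* res, char* out, size_t out_len) {
    for (const struct addrinfo* p = res; p != NULL; p = p->ai_next) {
        if (p->ai_family != AF_INET) {
            continue;  // skip IPv6 and anything else
        }
        const struct sockaddr_in* sin = (const struct sockaddr_in*)p->ai_addr;
        if (inet_ntop(AF_INET, &sin->sin_addr, out, (socklen_t)out_len) != NULL) {
            return 0;
        }
    }
    return -1;
}
```

Inside on_resolved() this could be called as `first_ipv4(res, ip, sizeof(ip))` before `uv_freeaddrinfo(res)`; the same loop shape is what a retry-on-next-address strategy would build on.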
State Management Across Callbacks
Since each callback is a separate function, you need a way to pass state:
┌─────────────────────────────────────────────────────────────────┐
│ State Management Options │
├─────────────────────────────────────────────────────────────────┤
│ │
│ Option 1: Global Variables (Simple but ugly) │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ char* url; │ │
│ │ uv_tcp_t socket; │ │
│ │ // Access from any callback │ │
│ └───────────────────────────────────────────────────────────┘ │
│ │
│ Option 2: Handle->data pointer (Recommended) │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ typedef struct { │ │
│ │ char host[256]; │ │
│ │ char path[1024]; │ │
│ │ int port; │ │
│ │ uv_tcp_t socket; │ │
│ │ uv_connect_t connect_req; │ │
│ │ // ... more state │ │
│ │ } http_request_t; │ │
│ │ │ │
│ │ // Attach to handle │ │
│ │ socket.data = &request_context; │ │
│ │ │ │
│ │ // Retrieve in callback │ │
│ │ http_request_t* ctx = (http_request_t*)handle->data; │ │
│ └───────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
HTTP 1.0 Protocol Basics
HTTP 1.0 is text-based and simple:
Request Format:
GET /path HTTP/1.0\r\n
Host: example.com\r\n
\r\n
Response Format:
HTTP/1.0 200 OK\r\n
Content-Type: text/html\r\n
Content-Length: 1234\r\n
\r\n
<html>...</html>
Key points:
- Lines end with \r\n (CRLF)
- Headers end with a blank line (\r\n\r\n)
- HTTP 1.0 closes the connection after the response
- No chunked encoding (simpler to parse)
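The first line of the response format above is enough to tell success from failure. Here is a minimal sketch of extracting the status code with `sscanf`; `parse_status_line` is a hypothetical helper, and the parsing is deliberately lenient (it accepts any numeric version, since a server may answer with a different minor version than the one requested).

```c
#include <stdio.h>

// Hypothetical helper: extract the numeric status code from a status line
// such as "HTTP/1.0 200 OK\r\n". Returns the code, or -1 if malformed.
int parse_status_line(const char* line) {
    int major = 0, minor = 0, code = 0;
    // All three numbers must be present for the line to count as valid.
    if (sscanf(line, "HTTP/%d.%d %d", &major, &minor, &code) != 3) {
        return -1;
    }
    if (code < 100 || code > 599) {
        return -1;  // outside the range of defined status classes
    }
    return code;
}
```

The reason phrase ("OK", "Not Found") is ignored on purpose: clients are expected to act on the numeric code.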
2.2 Why This Matters
HTTP clients are everywhere:
- REST API clients
- Web scrapers
- Health check tools
- Webhook senders
- Microservice communication
Understanding the async flow helps you:
- Debug HTTP issues at the protocol level
- Build custom HTTP clients
- Understand higher-level libraries
2.3 Historical Context
- 1996: HTTP 1.0 standardized (RFC 1945)
- 1997: HTTP 1.1 adds persistent connections
- 2015: HTTP/2 with multiplexing
- 2022: HTTP/3 with QUIC
HTTP 1.0 is simpler (the connection closes after each response), which makes it ideal for learning.
2.4 Common Misconceptions
| Misconception | Reality |
|---|---|
| “DNS is instant” | Can take 10-1000ms depending on cache |
| “Connect is instant” | TCP handshake takes at least 1 RTT |
| “One read gets the whole response” | May take many reads |
| “HTTP is binary” | HTTP 1.x is text-based |
| “Connection stays open” | HTTP 1.0 closes after response |
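The "many reads" row has a practical consequence: the \r\n\r\n that ends the headers can be split across two on_read() calls, so searching each chunk on its own can miss it. One way to handle this, sketched here under the assumption that only the header/body boundary is of interest, is a tiny scanner that remembers how much of the terminator it has matched; `header_scanner_t` and `scan_chunk` are hypothetical names.

```c
#include <stddef.h>

// Hypothetical scanner state: how many bytes of "\r\n\r\n" have matched so far.
typedef struct {
    int matched;
} header_scanner_t;

// Feed one chunk of response data. Returns the offset within this chunk of
// the first body byte (just past "\r\n\r\n"), or -1 if the headers have not
// ended yet. State carries over between calls.
long scan_chunk(header_scanner_t* s, const char* data, size_t len) {
    static const char term[4] = {'\r', '\n', '\r', '\n'};
    for (size_t i = 0; i < len; i++) {
        if (data[i] == term[s->matched]) {
            s->matched++;
            if (s->matched == 4) {
                s->matched = 0;
                return (long)(i + 1);
            }
        } else {
            // Mismatch: start over, but a '\r' may begin a new match.
            s->matched = (data[i] == '\r') ? 1 : 0;
        }
    }
    return -1;
}
```

A zero-initialized `header_scanner_t` stored in the request context would be passed to `scan_chunk` from each on_read() call with `buf->base` and `nread`.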
3. Project Specification
3.1 What You Will Build
A command-line HTTP client that:
- Accepts a URL as an argument
- Parses the host and path
- Resolves the domain to an IP
- Connects and sends an HTTP GET request
- Prints the response to stdout
3.2 Functional Requirements
- Parse URLs in format http://host[:port]/path
- Default to port 80 if not specified
- Default to path “/” if not specified
- Perform async DNS resolution
- Connect via TCP
- Send HTTP 1.0 GET request with Host header
- Print complete response (headers + body)
- Exit cleanly after response
3.3 Non-Functional Requirements
- Handle connection timeouts (10 second default)
- Handle DNS failures gracefully
- Handle connection refused gracefully
- Clean compilation (no warnings)
- No memory leaks
3.4 Example Usage / Output
$ ./http-client http://example.com/
Resolving example.com...
Connecting to 93.184.216.34:80...
Sending request...
HTTP/1.0 200 OK
Age: 527297
Cache-Control: max-age=604800
Content-Type: text/html; charset=UTF-8
Date: Mon, 15 Jan 2024 12:00:00 GMT
...
<!doctype html>
<html>
<head>
<title>Example Domain</title>
...
$ ./http-client http://nonexistent.invalid/
Resolving nonexistent.invalid...
DNS resolution failed: unknown node or service
$ ./http-client http://localhost:9999/
Resolving localhost...
Connecting to 127.0.0.1:9999...
Connection failed: connection refused
3.5 Real World Outcome
A working HTTP client demonstrating:
- Multi-stage async composition
- DNS resolution
- TCP client implementation
- HTTP protocol basics
- Proper error handling at each stage
4. Solution Architecture
4.1 High-Level Design
┌────────────────────────────────────────────────────────────────────┐
│ main() │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ 1. Parse URL (host, port, path) │ │
│ │ 2. Initialize request context │ │
│ │ 3. uv_getaddrinfo(host, on_resolved) │ │
│ │ 4. uv_run(loop) │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ on_resolved(resolver, status, res) │ │
│ │ - Check for errors │ │
│ │ - Extract IP address │ │
│ │ - uv_tcp_init(&socket) │ │
│ │ - uv_tcp_connect(socket, addr, on_connect) │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ on_connect(connect_req, status) │ │
│ │ - Check for errors │ │
│ │ - Build HTTP request string │ │
│ │ - uv_write(socket, request, on_write) │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ on_write(write_req, status) │ │
│ │ - Check for errors │ │
│ │ - uv_read_start(socket, alloc_buffer, on_read) │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ on_read(stream, nread, buf) ◄──────────────┐ │ │
│ │ - nread > 0: print data, continue reading ─────────────────┘ │ │
│ │ - nread == UV_EOF: close socket, done │ │
│ │ - nread < 0: error, close socket │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────────┘
4.2 Key Components
| Component | Type | Purpose |
|---|---|---|
| http_request_t | struct | Request context (state across callbacks) |
| resolver | uv_getaddrinfo_t | DNS resolution request |
| socket | uv_tcp_t | TCP connection handle |
| connect_req | uv_connect_t | Connect request |
| write_req | uv_write_t | Write request |
4.3 Data Structures
// Request context structure
typedef struct {
// URL components
char host[256];
char path[1024];
int port;
// libuv handles and requests
uv_loop_t* loop;
uv_getaddrinfo_t resolver;
uv_tcp_t socket;
uv_connect_t connect_req;
uv_write_t write_req;
// Request buffer
char request_buf[4096];
// State tracking
int connected;
int response_started;
} http_request_t;
4.4 Algorithm Overview
ALGORITHM: Async HTTP Client
INPUT: URL from command line
OUTPUT: HTTP response to stdout
1. PARSE URL
- Extract host (required)
- Extract port (default: 80)
- Extract path (default: "/")
2. DNS RESOLUTION
- Call uv_getaddrinfo(host, port)
- Wait for callback
3. ON_RESOLVED
- If error: print message, exit
- Get sockaddr from result
- Initialize TCP handle
- Call uv_tcp_connect()
- Free addrinfo
4. ON_CONNECT
- If error: print message, exit
- Build HTTP request:
"GET {path} HTTP/1.0\r\n"
"Host: {host}\r\n"
"\r\n"
- Call uv_write()
5. ON_WRITE
- If error: print message, close
- Start reading response
- Call uv_read_start()
6. ON_READ (may be called multiple times)
- If nread > 0: print data
- If nread == UV_EOF: close socket (done)
- If nread < 0: error, close socket
- Free buffer
7. CLEANUP
- Close socket
- Loop exits
- Program ends
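Step 4 of the algorithm (building the request) is easy to get subtly wrong if the buffer is too small, because `snprintf` truncates silently. A minimal sketch that reports truncation instead; `build_get_request` is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper: format an HTTP 1.0 GET request into out.
// Returns the request length in bytes, or -1 if it would not fit.
int build_get_request(char* out, size_t out_len, const char* host, const char* path) {
    int n = snprintf(out, out_len,
                     "GET %s HTTP/1.0\r\n"
                     "Host: %s\r\n"
                     "\r\n",
                     path, host);
    if (n < 0 || (size_t)n >= out_len) {
        return -1;  // encoding error or truncated output
    }
    return n;
}
```

The returned length can be passed straight to `uv_buf_init()`, which avoids a second pass over the buffer with `strlen()`.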
5. Implementation Guide
5.1 Development Environment Setup
# Create project
mkdir http-client && cd http-client
# Create Makefile
cat > Makefile << 'EOF'
CC = gcc
CFLAGS = -Wall -Wextra -g $(shell pkg-config --cflags libuv)
LDFLAGS = $(shell pkg-config --libs libuv)

# NOTE: recipe lines below must begin with a tab character, not spaces
http-client: main.c
	$(CC) $(CFLAGS) -o $@ $^ $(LDFLAGS)

clean:
	rm -f http-client

.PHONY: clean
EOF
touch main.c
5.2 Project Structure
http-client/
├── Makefile
└── main.c
5.3 The Core Question You’re Answering
How do you coordinate multiple dependent asynchronous operations where each step requires the result of the previous?
The answer: Callback chaining, with a context structure to carry state.
5.4 Concepts You Must Understand First
- What information does DNS resolution provide?
  - IP address(es) for the hostname
  - Returned as struct sockaddr for direct use
- Why chain callbacks instead of blocking?
  - Blocking would halt the entire program
  - Async allows handling multiple requests
  - Non-blocking is more scalable
- What’s the HTTP 1.0 request format?
  - Method + path + version on first line
  - Headers on following lines
  - Blank line signals end of headers
5.5 Questions to Guide Your Design
URL Parsing:
- How do you extract host, port, and path?
- What’s a simple parsing strategy?
- What edge cases exist?
State Management:
- What state needs to persist across callbacks?
- How will each callback access shared state?
- How do you pass state to the DNS callback?
Error Handling:
- What if DNS fails?
- What if connection is refused?
- What if the write fails?
- How do you cleanup at each failure point?
5.6 Thinking Exercise
Trace a successful request:
URL: http://example.com/test
Time T0: main() parses URL
- host = "example.com"
- port = 80
- path = "/test"
Time T1: uv_getaddrinfo() called
- Work queued to thread pool
Time T2: DNS resolution completes
- on_resolved() fires
- IP: 93.184.216.34
Time T3: uv_tcp_connect() called
- TCP SYN sent
Time T4: TCP handshake completes
- on_connect() fires
- Request built:
"GET /test HTTP/1.0\r\n"
"Host: example.com\r\n"
"\r\n"
Time T5: uv_write() called
- Data queued for sending
Time T6: Write completes
- on_write() fires
- uv_read_start() called
Time T7: First response data arrives
- on_read(nread=1024) fires
- Print 1024 bytes
Time T8: More data arrives
- on_read(nread=500) fires
- Print 500 bytes
Time T9: Server closes connection
- on_read(UV_EOF) fires
- Close socket
Time T10: Close complete
- on_close() fires
- Loop exits
Questions:
- How many callbacks are involved?
- Where is the request string built?
- When is memory freed?
5.7 Hints in Layers
Hint 1: URL Parsing (Simple Version)
// Simple URL parser (minimal validation)
// Format: http://host[:port][/path]
// Returns 0 on success, -1 on a malformed or oversized URL
int parse_url(const char* url, http_request_t* req) {
    // Skip "http://"
    if (strncmp(url, "http://", 7) != 0) {
        return -1;
    }
    url += 7;
    // Find end of host (port or path)
    const char* host_end = strpbrk(url, ":/");
    size_t host_len = host_end ? (size_t)(host_end - url) : strlen(url);
    // Reject empty hosts and hosts that would overflow req->host
    if (host_len == 0 || host_len >= sizeof(req->host)) {
        return -1;
    }
    memcpy(req->host, url, host_len);
    req->host[host_len] = '\0';
    // Default port
    req->port = 80;
    // Check for port
    if (host_end && *host_end == ':') {
        req->port = atoi(host_end + 1);
        if (req->port <= 0 || req->port > 65535) {
            return -1;
        }
        host_end = strchr(host_end, '/');
    }
    // Path (default to "/"), rejected if it would overflow req->path
    const char* path = (host_end && *host_end == '/') ? host_end : "/";
    if (strlen(path) >= sizeof(req->path)) {
        return -1;
    }
    strcpy(req->path, path);
    return 0;
}
Hint 2: Starting the Request
int main(int argc, char* argv[]) {
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <url>\n", argv[0]);
        return 1;
    }
    uv_loop_t* loop = uv_default_loop();
    // Allocate a zeroed context and parse the URL into it
    http_request_t* req = calloc(1, sizeof(*req));
    if (req == NULL) {
        fprintf(stderr, "Out of memory\n");
        return 1;
    }
    req->loop = loop;
    if (parse_url(argv[1], req) < 0) {
        fprintf(stderr, "Invalid URL\n");
        free(req);
        return 1;
    }
    printf("Resolving %s...\n", req->host);
    // DNS lookup
    struct addrinfo hints;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;
    char port_str[16];
    snprintf(port_str, sizeof(port_str), "%d", req->port);
    req->resolver.data = req; // Attach context
    int r = uv_getaddrinfo(loop, &req->resolver, on_resolved,
                           req->host, port_str, &hints);
    if (r < 0) {
        // The callback will never run, so clean up here
        fprintf(stderr, "DNS resolution failed: %s\n", uv_strerror(r));
        free(req);
        return 1;
    }
    return uv_run(loop, UV_RUN_DEFAULT);
}
Hint 3: DNS and Connect
void on_resolved(uv_getaddrinfo_t* resolver, int status, struct addrinfo* res) {
    http_request_t* req = (http_request_t*)resolver->data;
    if (status < 0) {
        fprintf(stderr, "DNS resolution failed: %s\n", uv_strerror(status));
        free(req); // No handle has been initialized yet, so freeing directly is safe
        return;
    }
    // Get the address
    char ip[17];
    uv_ip4_name((struct sockaddr_in*)res->ai_addr, ip, sizeof(ip));
    printf("Connecting to %s:%d...\n", ip, req->port);
    // Initialize socket
    uv_tcp_init(req->loop, &req->socket);
    req->socket.data = req; // Attach context
    // Connect
    req->connect_req.data = req;
    int r = uv_tcp_connect(&req->connect_req, &req->socket,
                           res->ai_addr, on_connect);
    uv_freeaddrinfo(res);
    if (r < 0) {
        // on_connect will not run; close the handle so on_close can clean up
        fprintf(stderr, "Connection failed: %s\n", uv_strerror(r));
        uv_close((uv_handle_t*)&req->socket, on_close);
    }
}
Hint 4: Write and Read
void on_connect(uv_connect_t* connect_req, int status) {
    http_request_t* req = (http_request_t*)connect_req->data;
    if (status < 0) {
        fprintf(stderr, "Connection failed: %s\n", uv_strerror(status));
        uv_close((uv_handle_t*)&req->socket, on_close);
        return;
    }
    printf("Sending request...\n\n");
    // Build HTTP request
    snprintf(req->request_buf, sizeof(req->request_buf),
             "GET %s HTTP/1.0\r\n"
             "Host: %s\r\n"
             "Connection: close\r\n"
             "\r\n",
             req->path, req->host);
    uv_buf_t buf = uv_buf_init(req->request_buf,
                               (unsigned int)strlen(req->request_buf));
    req->write_req.data = req;
    int r = uv_write(&req->write_req, (uv_stream_t*)&req->socket, &buf, 1, on_write);
    if (r < 0) {
        // on_write will not run; report and close here
        fprintf(stderr, "Write error: %s\n", uv_strerror(r));
        uv_close((uv_handle_t*)&req->socket, on_close);
    }
}
void on_write(uv_write_t* write_req, int status) {
    http_request_t* req = (http_request_t*)write_req->data;
    if (status < 0) {
        fprintf(stderr, "Write error: %s\n", uv_strerror(status));
        uv_close((uv_handle_t*)&req->socket, on_close);
        return;
    }
    // Start reading response
    uv_read_start((uv_stream_t*)&req->socket, alloc_buffer, on_read);
}
void on_read(uv_stream_t* stream, ssize_t nread, const uv_buf_t* buf) {
    // The request context is available as stream->data if needed; it is not
    // fetched here because an unused variable would trigger a compiler warning.
    if (nread > 0) {
        // Print response
        fwrite(buf->base, 1, (size_t)nread, stdout);
    } else if (nread == UV_EOF) {
        // Done!
        printf("\n");
        uv_close((uv_handle_t*)stream, on_close);
    } else if (nread < 0) {
        fprintf(stderr, "Read error: %s\n", uv_strerror((int)nread));
        uv_close((uv_handle_t*)stream, on_close);
    }
    // Release the buffer handed out by alloc_buffer (free(NULL) is a no-op)
    free(buf->base);
}
5.8 The Interview Questions They’ll Ask
- “Walk me through the stages of making an HTTP request asynchronously.”
  - DNS resolution → TCP connect → send request → read response
  - Each stage is async with its own callback
  - State passed through context structure
- “What’s ‘callback hell’ and how do you manage it?”
  - Deep nesting of callbacks
  - Manage with: context structs, separate functions, state machines
  - Modern: Promises, async/await (in other languages)
- “How do you handle timeout for an HTTP request?”
  - Start a timer when beginning the request
  - If timer fires before response, cancel the operation
  - Use uv_timer_t with appropriate timeout
- “What if the DNS returns multiple IP addresses?”
  - The result is a linked list (res->ai_next)
  - Could try each one on failure
  - Usually just use the first
- “How would you add HTTPS support?”
  - Need TLS library (OpenSSL, mbedtls)
  - Wrap the socket with TLS context
  - Handle TLS handshake before HTTP
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| HTTP protocol | HTTP: The Definitive Guide | Chapters 1-4 |
| DNS | DNS and BIND | Chapters 2, 14 |
| TCP connections | TCP/IP Illustrated | Chapters 17-18 |
| Socket programming | UNIX Network Programming | Chapters 8-11 |
| libuv | An Introduction to libuv | Networking chapter |
5.10 Implementation Phases
Phase 1: URL Parsing (1 hour)
Goal: Parse URL and print components.
./http-client http://example.com/test
# Output: host=example.com port=80 path=/test
Phase 2: DNS Resolution (2 hours)
Goal: Resolve hostname and print IP.
- Implement uv_getaddrinfo()
- Handle DNS errors
- Print resolved IP
Phase 3: TCP Connect (2 hours)
Goal: Connect to the resolved IP.
- Implement on_connect
- Handle connection errors
- Print “Connected!”
Phase 4: Send Request (2 hours)
Goal: Send HTTP request and start reading.
- Build request string
- Send with uv_write()
- Handle write errors
Phase 5: Read Response (3 hours)
Goal: Read and print the response.
- Implement on_read
- Handle multiple reads
- Handle EOF
- Clean up properly
Phase 6: Error Handling & Polish (2 hours)
Goal: Robust error handling everywhere.
- Check all return values
- Print meaningful errors
- Test edge cases
- Verify no memory leaks
5.11 Key Implementation Decisions
| Decision | Options | Recommendation |
|---|---|---|
| State storage | Global / context struct | Context struct |
| URL parsing | Regex / manual | Manual (simpler) |
| Port handling | String / int | Both (int for logic, string for getaddrinfo) |
| Buffer allocation | Static / dynamic | Dynamic (via alloc_cb) |
| Request buffer | Stack / in context | In context (lives across callbacks) |
6. Testing Strategy
Test URLs
# Simple GET
./http-client http://example.com/
# With path
./http-client http://example.com/path/to/resource
# Different port
./http-client http://localhost:8080/
# HTTPS (should fail gracefully - not implemented)
./http-client https://example.com/
Error Cases
# Non-existent domain
./http-client http://nonexistent.invalid/
# Connection refused
./http-client http://localhost:9999/
# No argument
./http-client
# Invalid URL
./http-client not-a-url
Memory Testing
valgrind --leak-check=full ./http-client http://example.com/
7. Common Pitfalls & Debugging
| Problem | Symptom | Root Cause | Fix |
|---|---|---|---|
| Crash in callback | Segfault | Context not attached | Set handle->data |
| Hangs forever | No response | Forgot to start reading | Add uv_read_start() |
| Partial response | Truncated | Not handling multiple reads | Loop until EOF |
| Memory leak | Valgrind error | Didn’t free addrinfo | Call uv_freeaddrinfo() |
| Wrong host | Connection refused | Used wrong address | Check sockaddr usage |
| No response | Timeout | Missing headers | Add Host: header |
Debugging Tips
# Test with netcat first
printf 'GET / HTTP/1.0\r\nHost: example.com\r\n\r\n' | nc example.com 80
# Use curl as reference
curl -v http://example.com/
# Trace DNS
dig example.com
# Trace TCP connection
tcpdump -i any port 80
8. Extensions & Challenges
Extension 1: Follow Redirects
Handle 301/302 redirects by parsing Location header.
Challenge: Need to parse headers, make new request.
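A minimal sketch of the header-parsing half of this extension, assuming the full header block has already been collected into a NUL-terminated string. Header names are case-insensitive, so the comparison uses `strncasecmp`; `find_location` is a hypothetical helper name.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>

// Hypothetical helper: copy the value of the Location header into out.
// headers must be a NUL-terminated header block with CRLF line endings.
// Returns 0 on success, -1 if the header is absent or does not fit.
int find_location(const char* headers, char* out, size_t out_len) {
    const char* line = headers;
    while (line != NULL && *line != '\0') {
        const char* eol = strstr(line, "\r\n");
        size_t line_len = eol ? (size_t)(eol - line) : strlen(line);
        if (line_len == 0) {
            break;  // blank line: end of headers, do not scan the body
        }
        if (line_len >= 9 && strncasecmp(line, "Location:", 9) == 0) {
            const char* val = line + 9;
            size_t val_len = line_len - 9;
            // Skip optional whitespace after the colon.
            while (val_len > 0 && (*val == ' ' || *val == '\t')) {
                val++;
                val_len--;
            }
            if (val_len >= out_len) {
                return -1;
            }
            memcpy(out, val, val_len);
            out[val_len] = '\0';
            return 0;
        }
        line = eol ? eol + 2 : NULL;
    }
    return -1;
}
```

The extracted value could then be fed to the same URL parser used for the command-line argument, with a cap on the number of redirects followed to avoid loops.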
Extension 2: POST Support
Add -d "data" flag for POST requests.
Challenge: Different request format, Content-Length.
Extension 3: Timeout
Add connection and response timeouts.
Challenge: Use uv_timer_t coordinated with request.
Extension 4: Multiple URLs
Fetch multiple URLs concurrently.
Challenge: Manage multiple request contexts.
Extension 5: Keep-Alive
Use HTTP 1.1 with persistent connections.
Challenge: Parse Content-Length or chunked encoding.
9. Real-World Connections
How curl Works
curl http://example.com/
Under the hood:
1. URL parsed by libcurl
2. DNS resolved (with cache)
3. TCP connected (with timeout)
4. TLS handshake (if HTTPS)
5. Request sent
6. Response read and parsed
7. Output written
8. Connection cached for reuse
Production HTTP Client Features
| Feature | Purpose |
|---|---|
| Connection pooling | Reuse TCP connections |
| Cookie handling | Session management |
| Redirect following | Handle 3xx responses |
| Compression | Accept gzip/deflate |
| Retries | Handle transient failures |
| Proxy support | Corporate environments |
| TLS/SSL | Secure connections |
10. Resources
Documentation
Reference Implementations
11. Self-Assessment Checklist
You’re ready when:
- Can fetch http://example.com/ successfully
- Handles non-existent domains gracefully
- Handles connection refused gracefully
- Handles custom ports
- Handles custom paths
- No memory leaks
- Can explain the callback chain
- Can explain why each stage is async
- Could add a new feature (timeout, POST)
12. Submission / Completion Criteria
Your project is complete when:
- Functional: Fetches HTTP pages correctly
- Robust: All error cases handled
- Clean: No warnings, no memory leaks
- Documented: Comments explain flow
Bonus: Implement at least one extension.
Congratulations!
You’ve completed all four libuv projects. You now have a deep understanding of:
- Event loops and async I/O
- File operations with threadpool
- TCP server implementation
- Multi-stage async clients
Next Steps:
- Build a more complete HTTP server
- Add TLS support with OpenSSL
- Implement WebSocket protocol
- Explore libuv in Node.js source code