Project 11: Simple HTTP Server

A basic HTTP server that serves static responses and logs requests.

Quick Reference

Attribute	Value
Difficulty	Level 2: Intermediate
Time Estimate	Week
Main Programming Language	Python
Alternative Programming Languages	C, Go, Rust
Coolness Level	Level 2: Practical but Forgettable
Business Potential	2. The “Micro-SaaS / Pro Tool”
Prerequisites	See concepts below
Key Topics	Transport and Ports (TCP and UDP), Layering and Encapsulation

1. Learning Objectives

Build and validate a basic HTTP server that serves static responses and logs requests.
Explain protocol behavior and verify it with a capture or trace.
Handle edge cases and produce reproducible results.

2. All Theory Needed (Per-Concept Breakdown)

Transport and Ports (TCP and UDP)

Fundamentals

The transport layer provides end-to-end communication between applications. TCP offers reliable, ordered delivery through acknowledgments, retransmission, and flow control. UDP provides minimal overhead without reliability guarantees, which makes it a good fit for low-latency or simple query/response protocols like DNS. Ports are the addressing mechanism for applications. A network connection is identified by a 5-tuple: source IP, source port, destination IP, destination port, and protocol. Understanding how TCP and UDP differ is essential for building tools such as connectivity probes, port scanners, and proxies.

Deep Dive

TCP is a stateful protocol. It begins with a three-way handshake that establishes initial sequence numbers on both ends. Once established, each side maintains a sliding window of bytes that have been sent but not yet acknowledged. Retransmission occurs on timeout or after duplicate acknowledgments. Flow control uses the receiver’s advertised window to prevent buffer overflow, while congestion control reacts to signs of network congestion by reducing the sending rate. These behaviors are not just theoretical: they explain why a large file transfer can slow down after packet loss, and why a connection may stall if ACKs are filtered or delayed.

UDP is the opposite: it simply wraps application data with source and destination ports and a checksum. There is no handshake, no retransmission, and no ordering. This makes UDP great for short queries, live streaming, or gaming, where timeliness is more important than perfect delivery. But it also means that the application must handle loss, duplication, or reordering if those problems matter. This is why protocols like DNS and DHCP include their own retry logic and transaction IDs.

Ports are the multiplexing mechanism of the transport layer. A single IP address can host many services because each service listens on a different port. Clients use ephemeral ports chosen by the OS to distinguish their connections. Firewalls and NAT devices often make decisions based on ports and protocol state, which is why understanding TCP states (such as SYN_SENT, ESTABLISHED, and FIN_WAIT_1) is crucial for debugging. For example, a firewall that drops inbound SYN packets but allows inbound ACKs makes every new inbound connection fail while already established connections continue to work.
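The 5-tuple and the OS-chosen ephemeral port can be observed directly. The following is a minimal sketch, assuming a loopback-only setup; the function name loopback_five_tuple is hypothetical and not part of the project.

```python
import socket

def loopback_five_tuple():
    """Open a TCP connection to ourselves and return the client's view of the 5-tuple."""
    # Port 0 asks the OS to pick any free port for the listener.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    server_addr = listener.getsockname()

    # create_connection() returns once the three-way handshake has completed.
    client = socket.create_connection(server_addr)
    accepted, peer_addr = listener.accept()

    src_ip, src_port = client.getsockname()   # ephemeral port chosen by the OS
    dst_ip, dst_port = client.getpeername()   # the listener's address
    five_tuple = (src_ip, src_port, dst_ip, dst_port, "tcp")

    # The server sees the same flow from the other side.
    assert peer_addr == (src_ip, src_port)

    for s in (accepted, client, listener):
        s.close()
    return five_tuple

print(loopback_five_tuple())
```

Running it twice usually prints a different source port each time, which is the ephemeral port at work.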

Transport behavior is also shaped by MTU and fragmentation. TCP segments are sized to fit within the path MTU. If a segment is too large and fragmentation is blocked, the connection can stall in ways that appear random. UDP has no built-in recovery for lost fragments, which can make large UDP payloads unreliable. This is why many UDP-based protocols keep messages small or implement their own segmentation and reassembly.

When you build transport-layer tools, you are interacting with a state machine. A port scanner that uses TCP SYN packets is testing how a host responds to a state transition. A proxy server is managing two concurrent TCP state machines and relaying data between them. A VPN tunnel often runs over UDP or TCP and must handle reliability differently depending on the transport. The more you understand transport mechanics, the more precise your debugging and design choices become.

How this fits into the projects

Projects 2, 3, 9, 11, 17, and 19 depend directly on transport behavior.

Definitions & key terms

5-tuple: The identifiers of a transport flow.
Handshake: TCP connection establishment.
Window: Flow control mechanism for TCP.
Ephemeral port: Temporary client port assigned by the OS.

Mental model diagram

Client                     Server
SYN  -------------------->  (listening)
SYN-ACK <------------------
ACK  -------------------->  (established)

How it works

TCP establishes state via handshake.
Data is sent with sequence numbers and ACKs.
Loss triggers retransmission and window reduction.
UDP sends without state; the application handles retries if needed.

Invariants: TCP guarantees order if the connection stays up.
Failure modes: half-open connections, blocked SYNs, dropped ACKs.

Minimal concrete example

UDP request/response:
Client (ephemeral port) -> Server UDP:53   query id=0x1234
Server UDP:53 -> Client (ephemeral port)   response id=0x1234
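The exchange above can be sketched in Python. This is an illustration under stated assumptions, not a DNS client: the two-byte transaction ID prefix, the echo responder, and the names run_echo_server and query are all hypothetical. It shows the two things UDP leaves to the application: retrying on timeout and matching replies to requests.

```python
import socket
import struct
import threading

def run_echo_server(sock, count=1):
    """Toy responder: echo each datagram back to whoever sent it."""
    for _ in range(count):
        data, addr = sock.recvfrom(512)
        sock.sendto(data, addr)

def query(server_addr, txn_id, payload, retries=3, timeout=0.5):
    """Send one datagram and wait for a reply carrying the same transaction ID."""
    request = struct.pack("!H", txn_id) + payload
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        for _ in range(retries):
            client.sendto(request, server_addr)
            try:
                reply, _ = client.recvfrom(512)
            except socket.timeout:
                continue  # UDP will not retransmit for us, so send again
            if len(reply) >= 2 and struct.unpack("!H", reply[:2])[0] == txn_id:
                return reply[2:]
            # Mismatched ID: treat it as a stale or unrelated datagram and retry.
    return None

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
worker = threading.Thread(target=run_echo_server, args=(server,), daemon=True)
worker.start()

answer = query(server.getsockname(), 0x1234, b"example query")
worker.join(timeout=1)
server.close()
print(answer)
```

If the responder never answers, query returns None after the configured number of attempts instead of blocking forever.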

Common misconceptions

“UDP is always faster.” It can be, but loss may negate benefits.
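“TCP guarantees delivery across the internet.” It guarantees reliable, ordered delivery only while the connection stays up; if the path fails, the application sees a timeout or reset rather than its data arriving.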

Check-your-understanding questions

Why does a TCP connection need both sequence and acknowledgment numbers?
When would you choose UDP over TCP for a home network tool?

Check-your-understanding answers

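Each side numbers the bytes it sends (sequence number) and reports the next byte it expects from its peer (acknowledgment number), so both directions can detect loss and restore order.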
For low-latency, small messages where retries are acceptable.

Real-world applications

Reliable file transfer versus real-time streaming.
Port scanning and service discovery.

Where you will apply it

Projects 2, 3, 9, 11, 17, 19

References

RFC 9293 (TCP), RFC 768 (UDP)
“TCP/IP Illustrated, Vol 1” by Stevens - Ch. 11-16

Key insights

Transport protocols are state machines; your tools must respect their state.

Summary

TCP and UDP make different promises. Your designs must align with those promises.

Homework/Exercises to practice the concept

List three application protocols that use UDP and why.
Draw the TCP close sequence and label each side’s state.

Solutions to the homework/exercises

DNS (small queries), NTP (time sync), VoIP (latency).
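FIN/ACK exchange in each direction. The side that closes first (the active closer) passes through FIN_WAIT_1 and FIN_WAIT_2 and ends in TIME_WAIT; the other side passes through CLOSE_WAIT and LAST_ACK.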

Layering and Encapsulation

Fundamentals

Layering is the core mental model of networking. Each layer wraps the data from the layer above with its own header, creating a structure that can be forwarded and understood without needing to interpret higher-level meaning. This is called encapsulation. At the sender, application data becomes a transport segment, then an IP packet, then a link-layer frame, and finally a stream of bits. At the receiver, the process runs in reverse. Layering matters because it limits complexity: you can troubleshoot and replace one layer without rewriting the others. It also shapes failure modes. A broken DNS resolver looks different from a broken Wi-Fi link even though both present as “the internet is down.” Understanding the boundaries and contracts between layers is the prerequisite for every project in this guide.

Deep Dive

Layering is both a conceptual tool and a real architectural boundary. The OSI model is useful vocabulary, while the TCP/IP model represents how systems are actually built. Each layer provides a service to the layer above and consumes a service from the layer below. The practical consequence is that the same application can run over different link types (Ethernet or Wi-Fi) without changing application logic, because the transport and network layers abstract away those differences. Encapsulation is the mechanism that makes this abstraction possible. The application emits bytes; the transport layer adds source and destination ports and reliability metadata; the network layer adds source and destination IP addresses and routing-related fields; the link layer adds source and destination MAC addresses and a frame check sequence. Each router hop strips and rewrites the link-layer envelope, decrements the IP TTL, and otherwise leaves the higher layers untouched. This design creates stability but also creates overhead and limits.

The concept of a “layer” is not just a teaching device. It defines the fields you see on the wire and what devices are allowed to change. Switches operate at Layer 2 and rewrite nothing above it. Routers operate at Layer 3 and do not (in principle) modify transport payloads. However, real networks introduce cross-layer devices like NATs, firewalls, and proxies, which inspect or alter information above their nominal layer. This is why you can have a perfectly valid TCP segment that never reaches its destination: the middleboxes change or block it. Learning to see this boundary between ideal layering and real-world layering is essential for debugging home networks.

Layering also explains why maximum transmission unit (MTU) problems are so confusing. The link layer has a maximum frame size (for Ethernet, this is typically 1500 bytes of payload). The IP layer can either fragment larger packets or rely on Path MTU Discovery to avoid fragmentation. TCP endpoints advertise a maximum segment size (MSS) during the handshake so that segments fit within the MTU. If an intermediate device drops ICMP “fragmentation needed” messages, Path MTU Discovery breaks and connections appear to hang only for large transfers. This failure is a cross-layer interaction: a link-layer size limit, an IP-layer fragmentation and discovery mechanism, and a transport-layer segment size that depends on that discovery working. Layering gives you a vocabulary to explain it, but you must be able to test it with real captures and commands.
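The relationship between MTU and MSS is simple arithmetic. The sketch below assumes IPv4 and TCP headers without options (20 bytes each); the function names are hypothetical. The 1492-byte MTU is the value commonly seen on PPPoE links.

```python
IPV4_HEADER = 20   # bytes, assuming no IP options
TCP_HEADER = 20    # bytes, assuming no TCP options

def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - TCP_HEADER

def segments_needed(payload_len, mtu):
    """Number of full-size segments required to carry payload_len bytes."""
    mss = mss_for_mtu(mtu)
    return -(-payload_len // mss)  # ceiling division

print(mss_for_mtu(1500))              # 1460
print(mss_for_mtu(1492))              # 1452
print(segments_needed(10_000, 1500))  # 7
```

A sender that assumes 1460 bytes on a path limited to 1492 produces packets that are 8 bytes too large, which is the situation where a blocked ICMP message turns into a stalled transfer.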

Layering is also a performance story. Each layer adds overhead. A tiny application payload can become a much larger frame once transport, network, and link headers are applied. On Wi-Fi, retries and contention can further increase airtime. On a busy home network, that overhead matters. It also affects security: a link-layer encryption protocol like WPA3 protects the frame, but not necessarily the application data once it leaves the Wi-Fi link. Understanding which layer provides confidentiality, integrity, and authentication is a prerequisite for designing secure systems.

Finally, layering shapes how you design tools. A packet sniffer observes data across layers and therefore must decode multiple headers. A port scanner uses the transport layer to infer service behavior. A DNS resolver is an application-layer tool that relies on transport and network layers to function. Each project in this guide is an experiment that isolates and stresses a specific layer contract. The result is a mental model where failures become testable hypotheses rather than mysteries.

How this fits into the projects

You will repeatedly build tools that operate at specific layers (ARP at link, ping at network, DNS at application). This concept helps you decide which headers to read and which variables to control.

Definitions & key terms

Layer: A conceptual boundary that provides a service to the layer above.
Encapsulation: Wrapping data with headers as it moves down the stack.
Decapsulation: Stripping headers as data moves up the stack.
MTU: Maximum payload size at the link layer.
MSS: Maximum TCP payload size per segment, advertised by each endpoint during the handshake.

Mental model diagram

APP DATA
  |
  v
[ TCP hdr | APP DATA ]
  |
  v
[ IP hdr | TCP hdr | APP DATA ]
  |
  v
[ ETH hdr | IP hdr | TCP hdr | APP DATA | FCS ]
  |
  v
BITS ON THE WIRE

How it works

Application produces bytes.
Transport adds ports and reliability metadata.
Network adds IP addressing and TTL.
Link adds MAC addresses and integrity checks.
Each router hop rewrites the link header; in the absence of NAT the IP addresses stay the same, and only fields such as TTL and the header checksum are updated.

Invariants: Higher layers should not depend on link type.
Failure modes: MTU mismatch, middlebox interference, incorrect assumptions about which layer provides security.
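The wrap and unwrap steps at a single layer can be sketched with the 8-byte UDP header, which is small enough to build by hand. This is an illustration only: the helper names are hypothetical, the checksum is left at 0 (which IPv4 permits for UDP, meaning "not computed"), and the bytes are never put on a wire.

```python
import struct

# UDP header layout: source port, destination port, length, checksum (all 16-bit, network order)
UDP_HEADER = struct.Struct("!HHHH")

def encapsulate(src_port, dst_port, payload):
    """Prefix the payload with a UDP header, as the transport layer would."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload

def decapsulate(datagram):
    """Strip the UDP header and return (src_port, dst_port, payload)."""
    src_port, dst_port, length, _checksum = UDP_HEADER.unpack_from(datagram)
    return src_port, dst_port, datagram[UDP_HEADER.size:length]

datagram = encapsulate(50000, 53, b"query")
print(len(datagram))          # 13
print(decapsulate(datagram))  # (50000, 53, b'query')
```

The IP and link layers repeat the same pattern with their own headers, each treating everything it was handed as an opaque payload.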

Minimal concrete example

Encapsulation trace (sizes):
APP payload: 120 bytes
TCP header: 20 bytes
IP header: 20 bytes
Ethernet header + FCS: 18 bytes
Total on wire: 178 bytes
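The same trace can be reproduced as arithmetic, which also makes the overhead ratio visible. A minimal sketch using the header sizes listed above (no TCP or IP options assumed); the function names are hypothetical.

```python
# Header sizes in bytes, taken from the trace above.
HEADERS = {"tcp": 20, "ip": 20, "ethernet+fcs": 18}

def on_wire_size(payload_len):
    """Total frame size once every layer has added its header."""
    return payload_len + sum(HEADERS.values())

def overhead_percent(payload_len):
    """Share of the frame that is headers rather than application data."""
    total = on_wire_size(payload_len)
    return round((total - payload_len) / total * 100, 1)

print(on_wire_size(120))       # 178
print(overhead_percent(120))   # 32.6
print(overhead_percent(1460))  # 3.8
```

Small payloads pay proportionally far more for headers than full-size segments do.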

Common misconceptions

“A router forwards frames.” It forwards packets; frames are link-local.
“NAT is security.” NAT is address translation, not a security control.

Check-your-understanding questions

Which layer is responsible for port numbers?
Why does a packet get a new MAC destination at every hop?

Check-your-understanding answers

Transport layer.
MAC addresses are only meaningful within a local link.

Real-world applications

Debugging MTU black holes.
Explaining why Wi-Fi encryption does not secure end-to-end traffic.

Where you will apply it

Projects 1-4, 9, 11, 14, 17

References

“Computer Networks” by Tanenbaum and Wetherall - Ch. 1
“TCP/IP Illustrated, Vol 1” by Stevens - Ch. 1-2

Key insights

Encapsulation is the reason you can diagnose failures by layer instead of guessing.

Summary

Layering reduces complexity, but real networks leak across layers. You must know both the ideal model and the messy reality.

Homework/Exercises to practice the concept

Draw the encapsulation stack for a DNS query over UDP.
Identify which headers change at each hop in a traceroute.

Solutions to the homework/exercises

APP -> UDP -> IP -> Ethernet, with UDP destination port 53 on the query.
Link-layer headers change every hop; IP header TTL changes each hop.

3. Project Specification

3.1 What You Will Build

A basic HTTP server that serves static responses and logs requests.

Included:

CLI tool with clear output
Validation steps and logging
Documentation of assumptions

Excluded:

Production-grade performance tuning
Full security hardening

3.2 Functional Requirements

Core function: Accept TCP connections, read each HTTP request, return a static response, and log the request.
Observable output: Produce deterministic output comparable to the Real World Outcome.
Error handling: Handle client timeouts, malformed requests, and dropped connections gracefully.

3.3 Non-Functional Requirements

Performance: Complete typical tasks within a few seconds on a LAN.
Reliability: Fail safely and clearly on errors.
Usability: Provide concise CLI flags and helpful messages.

3.4 Example Usage / Output

$ curl -i http://localhost:8080/
HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 15

Hello from LAN

3.5 Data Formats / Schemas / Protocols

Protocols: HTTP/1.1 over TCP. Background concepts: Transport and Ports (TCP and UDP), Layering and Encapsulation.

3.6 Edge Cases

Client stalls or disconnects before sending a complete request
Malformed or unexpected requests
Multiple interfaces or subnets (which address the server binds to)

3.7 Real World Outcome

$ curl -i http://localhost:8080/
HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 15

Hello from LAN

3.7.1 How to Run (Copy/Paste)

Build: make (or create a virtual environment as needed)
Run: ./P11-simple-http-server
Config: update any constants in a config file or flags
Working directory: project root

3.7.2 Golden Path Demo (Deterministic)

Run against a known local target and compare with the expected output.

3.7.3 If CLI: provide an exact terminal transcript

$ curl -i http://localhost:8080/
HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 15

Hello from LAN

4. Solution Architecture

4.1 High-Level Design

CLI Input -> Core Engine -> Output/Logs

4.2 Key Components

Component	Responsibility	Key Decisions
CLI Parser	Parse flags and inputs	Keep interface minimal
Core Engine	Protocol logic and state	Keep deterministic timing
Output/Logs	Present results and errors	Use consistent formatting

4.4 Data Structures (No Full Code)

Request:
  - target
  - protocol fields
Response:
  - status
  - timing
State:
  - retries
  - cache entries

5. Implementation Guide

5.1 Development Environment Setup

Ensure required tools (tcpdump, dig, ip) are installed.
Use elevated privileges where raw sockets are required.

5.2 Project Structure

project/
  README.md
  docs/
  src/
  tests/
  data/

5.3 The Core Question You’re Answering

“What does a real application protocol look like over raw TCP?”

5.4 Concepts You Must Understand First

TCP streams
- Why is TCP a byte stream, not message-based?
- Book Reference: “UNIX Network Programming, Vol 1” - Ch. 1-3
HTTP basics
- What defines a request line and headers?
- Book Reference: “TCP/IP Illustrated, Vol 1” - Ch. 14
Connection lifecycle
- How do you handle multiple clients safely?
- Book Reference: “Computer Networks” - Ch. 6
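To make "request line and headers" concrete, here is a minimal parsing sketch. It assumes the head (everything before the blank line) has already been read in full, ignores obsolete header line folding, and uses a hypothetical function name parse_request.

```python
def parse_request(head: bytes):
    """Split a request head into (method, target, version, headers)."""
    # ISO-8859-1 maps every byte to a character, so decoding never fails.
    lines = head.decode("iso-8859-1").split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {lines[0]!r}")
    method, target, version = parts

    headers = {}
    for line in lines[1:]:
        if not line:
            continue
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed header: {line!r}")
        # Header names are case-insensitive, so normalize to lowercase.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers

raw = b"GET / HTTP/1.1\r\nHost: localhost:8080\r\nUser-Agent: curl/8.0\r\n"
print(parse_request(raw))
```

Splitting the header on the first colon only matters here: the Host value itself contains a colon before the port.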

5.5 Questions to Guide Your Design

Parsing and buffering
- How do you detect end of headers?
Concurrency
- Will you handle one request at a time or many?
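One possible answer to the concurrency question is a thread per connection, sketched here with the standard library's socketserver module. The handler, the fixed response, and the 2-second timeout are assumptions for illustration; the demo shows that an idle client does not prevent a second client from being served.

```python
import socket
import socketserver
import threading

RESPONSE = b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\nConnection: close\r\n\r\nok\n"

class Handler(socketserver.BaseRequestHandler):
    def handle(self):
        self.request.settimeout(2.0)      # do not wait forever on a silent client
        buf = b""
        try:
            while b"\r\n\r\n" not in buf:
                chunk = self.request.recv(1024)
                if not chunk:
                    return                # client closed before finishing the request
                buf += chunk
        except socket.timeout:
            return
        self.request.sendall(RESPONSE)

class ThreadedServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True                 # handler threads die with the process

def fetch(addr):
    """Send a minimal GET and read until the server closes the connection."""
    with socket.create_connection(addr, timeout=2.0) as s:
        s.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        data = b""
        while True:
            chunk = s.recv(1024)
            if not chunk:
                return data
            data += chunk

server = ThreadedServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

stalled = socket.create_connection(server.server_address)  # connects, sends nothing
reply = fetch(server.server_address)                       # still gets an answer
stalled.close()
server.shutdown()
server.server_close()
print(reply.split(b"\r\n")[0])
```

A single-threaded accept loop would have blocked on the stalled client until its timeout expired; an event loop is another way to get the same effect without threads.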

5.6 Thinking Exercise

Partial reads

How do you handle a request that arrives in multiple TCP segments?

Questions to answer:

What indicates the end of the headers?
What if the client sends more data after the headers?
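One way to work through this exercise is to buffer until the header terminator appears and keep whatever follows it. The sketch below assumes the terminator is CRLF CRLF and uses a hypothetical helper read_head; a deliberately tiny chunk size forces several reads, and socket.socketpair() stands in for a real connection.

```python
import socket

HEADER_END = b"\r\n\r\n"

def read_head(conn, chunk_size=4096, max_bytes=65536):
    """Read until the blank line; return (head, leftover) without the terminator."""
    buf = b""
    while HEADER_END not in buf:
        if len(buf) > max_bytes:
            raise ValueError("request head too large")
        chunk = conn.recv(chunk_size)
        if not chunk:
            raise ConnectionError("peer closed before end of headers")
        buf += chunk
    head, _, leftover = buf.partition(HEADER_END)
    return head, leftover

# Two connected sockets, so the demo needs no network or listener.
sender, receiver = socket.socketpair()
for piece in (b"GET / HT", b"TP/1.1\r\nHost: local", b"host\r\n\r", b"\nextra"):
    sender.sendall(piece)

head, leftover = read_head(receiver, chunk_size=8)
sender.close()
receiver.close()
print(head)
print(leftover)
```

The leftover bytes are the start of whatever comes next (a request body or a following request), so they should be kept rather than discarded.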

5.7 The Interview Questions They’ll Ask

“Why is HTTP built on TCP rather than UDP?”
“How do you know when an HTTP request is complete?”
“What is keep-alive and why does it matter?”
“How do you prevent a slow client from blocking others?”
“What is the difference between HTTP and HTTPS?”

5.8 Hints in Layers

Hint 1: Start with a fixed response. Serve a static string for every request.

Hint 2: Print raw requests. Log incoming bytes to understand headers.

Hint 3: Pseudocode outline

- accept TCP connection
- read until blank line
- parse request line
- write response headers and body
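The outline above could look like this in Python. This is a minimal sketch, not a reference solution: it assumes the body is "Hello from LAN" followed by a newline, binds to an OS-chosen port instead of 8080 so the demo cannot collide with a running server, adds a Connection: close header, and serves a fixed number of requests so the script terminates. The names handle, serve, and fetch are hypothetical.

```python
import socket
import threading

BODY = b"Hello from LAN\n"   # assumed body; the trailing newline counts toward the length

def handle(conn):
    """Read one request head, send the fixed response, return the request line."""
    buf = b""
    while b"\r\n\r\n" not in buf:            # read until blank line
        chunk = conn.recv(4096)
        if not chunk:                        # client closed early
            return None
        buf += chunk
    request_line = buf.split(b"\r\n", 1)[0].decode("iso-8859-1")
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/plain\r\n"
        f"Content-Length: {len(BODY)}\r\n"   # computed, never hard-coded
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    conn.sendall(head + BODY)
    return request_line

def serve(listener, max_requests, log):
    """Accept and answer max_requests connections, one at a time."""
    for _ in range(max_requests):
        conn, addr = listener.accept()
        with conn:
            log.append((addr, handle(conn)))

def fetch(addr):
    """Tiny client used only to exercise the server in this demo."""
    with socket.create_connection(addr, timeout=2.0) as s:
        s.sendall(b"GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
        reply = b""
        while True:
            chunk = s.recv(4096)
            if not chunk:
                return reply
            reply += chunk

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("127.0.0.1", 0))
listener.listen(8)

log = []
worker = threading.Thread(target=serve, args=(listener, 1, log), daemon=True)
worker.start()
reply = fetch(listener.getsockname())
worker.join(timeout=2)
listener.close()

print(reply.decode("ascii"))
print(log)
```

For the real project, bind to port 8080, loop without a request limit, and point curl at the server instead of the built-in fetch helper.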

Hint 4: Verification. Use curl -i and compare with expected output.

5.9 Books That Will Help

Topic	Book	Chapter
TCP streams	“UNIX Network Programming, Vol 1”	Ch. 1-3
HTTP over TCP	“TCP/IP Illustrated, Vol 1”	Ch. 14
Concurrency	“Computer Networks”	Ch. 6

5.10 Implementation Phases

Establish core protocol I/O and a minimal success path.
Add parsing, validation, and timeouts.
Add logging, metrics, and polish for output.
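For the logging phase, one option is a single access-log line per request built on the standard logging module. The field order, the logger name p11.access, and the helper names are assumptions for illustration, not a format the project prescribes.

```python
import io
import logging

def make_access_logger(stream):
    """Return a logger that writes timestamped lines to the given stream."""
    logger = logging.getLogger("p11.access")   # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False                   # keep lines out of the root logger
    logger.handlers.clear()                    # avoid duplicate handlers on re-run
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
    logger.addHandler(handler)
    return logger

def log_request(logger, client, request_line, status, body_len, elapsed_ms):
    """Log client address, request line, status, body size, and handling time."""
    logger.info('%s:%d "%s" %d %d %.1fms',
                client[0], client[1], request_line, status, body_len, elapsed_ms)

# Write to an in-memory buffer so the demo has no side effects; use sys.stderr
# or a file object in the server itself.
buffer = io.StringIO()
access_log = make_access_logger(buffer)
log_request(access_log, ("127.0.0.1", 50000), "GET / HTTP/1.1", 200, 15, 0.8)
line = buffer.getvalue()
print(line, end="")
```

Keeping the message format fixed makes the log easy to compare across runs, apart from the leading timestamp.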

5.11 Key Implementation Decisions

Which interface and capture point provides visibility?
What timeout and retry strategy balances speed and accuracy?
How will results be validated against reference tools?

6. Testing Strategy

6.1 Test Categories

Unit: parsing and validation logic
Integration: protocol exchange with a real device
System: full run with reference tools

6.2 Critical Test Cases

Successful request/response path
Timeout and retry behavior
Invalid or unexpected input handling

6.3 Test Data

Local gateway IP and a known reachable host
A non-routable IP to trigger timeouts

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Problem 1: “Client hangs”

Why: Server does not send Content-Length or close connection.
Fix: Provide Content-Length or close after response.
Quick test: Compare headers with curl -i.
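A common source of this problem is a Content-Length that does not match the bytes actually sent. The sketch below computes the length from the encoded body; the helper name build_response and the added Connection: close header are assumptions for illustration.

```python
def build_response(body: str, status="200 OK",
                   content_type="text/plain; charset=utf-8"):
    """Build a complete HTTP/1.1 response with a matching Content-Length."""
    payload = body.encode("utf-8")
    head = (
        f"HTTP/1.1 {status}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(payload)}\r\n"   # bytes, not characters
        "Connection: close\r\n"
        "\r\n"
    )
    return head.encode("ascii") + payload

plain = build_response("Hello from LAN\n")
accented = build_response("héllo\n")

print(plain.decode("utf-8"))
print(accented.split(b"\r\n")[2])   # b'Content-Length: 7'
```

The second response has six characters but seven bytes, because the accented letter takes two bytes in UTF-8. A length that is too large makes the client wait for bytes that never arrive; one that is too small truncates the body.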

Problem 2: “Requests are partial”

Why: TCP segmentation splits the headers.
Fix: Read until header terminator sequence.
Quick test: Simulate slow client with nc.
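The slow client can also be simulated from Python, which is easier to script than nc. In this sketch the request is sent in pieces with a pause between them; stand_in_server is a hypothetical placeholder for the server under test and simply records each chunk it receives. How the pieces are grouped into reads depends on the OS, so the demo prints the count rather than promising a value.

```python
import socket
import threading
import time

def slow_request(addr, pieces, delay=0.05):
    """Send the request piece by piece, then read the whole reply."""
    with socket.create_connection(addr, timeout=2.0) as s:
        for piece in pieces:
            s.sendall(piece)
            time.sleep(delay)
        reply = b""
        while True:
            chunk = s.recv(4096)
            if not chunk:
                return reply
            reply += chunk

def stand_in_server(listener, seen):
    """Placeholder server: collect chunks until the blank line, then answer."""
    conn, _ = listener.accept()
    with conn:
        buf = b""
        while b"\r\n\r\n" not in buf:
            chunk = conn.recv(4096)
            if not chunk:
                return
            seen.append(chunk)
            buf += chunk
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n")

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

seen = []
worker = threading.Thread(target=stand_in_server, args=(listener, seen), daemon=True)
worker.start()

pieces = [b"GET / HT", b"TP/1.1\r\nHo", b"st: localhost\r\n\r\n"]
reply = slow_request(listener.getsockname(), pieces)
worker.join(timeout=2)
listener.close()

print(len(seen), "reads were needed to assemble the request")
print(reply.split(b"\r\n")[0])
```

A server that parses after a single recv() would have seen only the first fragment here.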

7.2 Debugging Strategies

Capture traffic with tcpdump or Wireshark.
Compare against a known-good tool.
Log timestamps and retry logic.

7.3 Performance Traps

Excessive retries causing long runtimes.
Inefficient parsing under high packet rates.

8. Extensions & Challenges

8.1 Beginner Extensions

Add basic configuration flags for interface and timeout.
Improve output formatting and sorting.

8.2 Intermediate Extensions

Add caching or state persistence.
Add CSV or JSON output export.

8.3 Advanced Extensions

Add concurrency with careful rate limiting.
Add visualization or integration with a dashboard.

9. Real-World Connections

9.1 Industry Applications

Network diagnostics and troubleshooting
Security monitoring and policy enforcement

9.2 Related Tools

tcpdump / Wireshark
nmap / dnsmasq / unbound (as applicable)

9.3 Interview Relevance

Explaining protocol behavior
Diagnosing failures by layer

10. Resources

10.1 Essential Reading

Topic	Book	Chapter
TCP streams	“UNIX Network Programming, Vol 1”	Ch. 1-3
HTTP over TCP	“TCP/IP Illustrated, Vol 1”	Ch. 14
Concurrency	“Computer Networks”	Ch. 6

10.2 Video Resources

Wireshark or tcpdump walkthroughs (search for recent tutorials)
Vendor or RFC explainers for the relevant protocol

10.3 Tools & Documentation

RFCs for the protocols used in this project
man tcpdump, man ip, man ss