Project 16: Network Stack Exploration

Build a packet capture tool and a tiny TCP/IP stack simulator.

Quick Reference

| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 16-24 hours |
| Main Programming Language | C |
| Alternative Programming Languages | Rust, Go |
| Coolness Level | Very High |
| Business Potential | Medium (network tooling) |
| Prerequisites | sockets, packet formats, binary parsing |
| Key Topics | Ethernet/IP/TCP headers, packet capture, TCP state machine |

1. Learning Objectives

By completing this project, you will:

  1. Capture packets from a real interface and decode headers.
  2. Simulate a minimal TCP state machine for handshake and teardown.
  3. Compare simulated events with real kernel traffic.
  4. Explain how socket API calls map to packets on the wire.

2. All Theory Needed (Per-Concept Breakdown)

TCP/IP Stack and Packet Flow

Fundamentals

The network stack turns application bytes into packets on the wire and back again. TCP provides a reliable byte stream over IP, and on a wired LAN IP is carried in Ethernet frames. Each layer prepends its own header. A packet capture tool reads raw frames from a network interface and decodes these headers in order. A TCP connection follows a state machine: SYN, SYN-ACK, ACK to establish, and a FIN/ACK exchange in each direction to close. Understanding the header fields and states lets you connect application behavior to network events.

Deep Dive into the concept

Ethernet frames have source/destination MAC addresses and an EtherType field indicating the payload (IPv4, IPv6, ARP). The IP header includes source/destination IPs, protocol number, and fragmentation fields. TCP adds ports, sequence numbers, acknowledgment numbers, flags (SYN, ACK, FIN), and window size. These fields drive connection semantics.

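A packet capture tool can be implemented with raw sockets or with libpcap. On Linux, opening a raw or packet socket requires root privileges or the CAP_NET_RAW capability; a packet socket (AF_PACKET) delivers link-layer frames, while an AF_INET raw socket delivers IP packets without the Ethernet header. You must parse variable-length headers: the IPv4 header length (IHL) varies with options, and so does the TCP data offset. Endianness matters too: multi-byte fields are in network byte order (big-endian), so failing to convert them with ntohs/ntohl when reading yields incorrect values on little-endian hosts.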
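As a minimal sketch of the byte-order point, the helpers below read 16-bit and 32-bit fields from a captured buffer. The names read_net16 and read_net32 are hypothetical; only ntohs and ntohl come from the standard sockets headers.

```c
#include <arpa/inet.h>  /* ntohs, ntohl */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 16-bit network-order field starting at p (hypothetical helper). */
uint16_t read_net16(const uint8_t *p)
{
    uint16_t v;
    assert(p != NULL);
    /* A fixed-size copy avoids an unaligned pointer cast; compilers
     * typically reduce it to a plain load. */
    memcpy(&v, p, sizeof v);
    return ntohs(v);
}

/* Read a 32-bit network-order field starting at p (hypothetical helper). */
uint32_t read_net32(const uint8_t *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return ntohl(v);
}
```

For example, the two bytes 01 BB at the start of a TCP header decode to port 443. Reading them without ntohs on a little-endian host would give 47873 instead.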

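For a given sequence of events the TCP state machine is deterministic. The side that calls connect() (active open) goes CLOSED -> SYN-SENT -> ESTABLISHED; the listening side (passive open) goes LISTEN -> SYN-RECEIVED -> ESTABLISHED. On teardown, the side that closes first passes through FIN-WAIT-1, FIN-WAIT-2, and TIME-WAIT before returning to CLOSED, while its peer passes through CLOSE-WAIT and LAST-ACK. Your simulator can model only this subset (handshake and teardown), but it should track sequence and acknowledgment numbers. You can compare your simulated handshake to captured packets from a real TCP connection to verify correctness.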
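One way to model this subset is a pure transition function. The sketch below is an illustration, not a full RFC state machine: the enum and function names are hypothetical, and it leaves out RST handling, simultaneous open, and simultaneous close (the CLOSING state).

```c
#include <assert.h>

enum tcp_st { ST_CLOSED, ST_LISTEN, ST_SYN_SENT, ST_SYN_RECEIVED,
              ST_ESTABLISHED, ST_FIN_WAIT_1, ST_FIN_WAIT_2,
              ST_CLOSE_WAIT, ST_LAST_ACK, ST_TIME_WAIT };

enum tcp_ev { EV_ACTIVE_OPEN, EV_PASSIVE_OPEN, EV_RCV_SYN, EV_RCV_SYN_ACK,
              EV_RCV_ACK, EV_CLOSE, EV_RCV_FIN, EV_TIMEOUT };

/* Returns the state that follows s when event e occurs. */
enum tcp_st tcp_next(enum tcp_st s, enum tcp_ev e)
{
    assert(s <= ST_TIME_WAIT);
    switch (s) {
    case ST_CLOSED:
        if (e == EV_ACTIVE_OPEN)  return ST_SYN_SENT;     /* send SYN */
        if (e == EV_PASSIVE_OPEN) return ST_LISTEN;
        break;
    case ST_LISTEN:
        if (e == EV_RCV_SYN) return ST_SYN_RECEIVED;      /* send SYN-ACK */
        break;
    case ST_SYN_SENT:
        if (e == EV_RCV_SYN_ACK) return ST_ESTABLISHED;   /* send ACK */
        break;
    case ST_SYN_RECEIVED:
        if (e == EV_RCV_ACK) return ST_ESTABLISHED;
        break;
    case ST_ESTABLISHED:
        if (e == EV_CLOSE)   return ST_FIN_WAIT_1;        /* send FIN */
        if (e == EV_RCV_FIN) return ST_CLOSE_WAIT;        /* send ACK */
        break;
    case ST_FIN_WAIT_1:
        if (e == EV_RCV_ACK) return ST_FIN_WAIT_2;
        break;
    case ST_FIN_WAIT_2:
        if (e == EV_RCV_FIN) return ST_TIME_WAIT;         /* send ACK */
        break;
    case ST_CLOSE_WAIT:
        if (e == EV_CLOSE) return ST_LAST_ACK;            /* send FIN */
        break;
    case ST_LAST_ACK:
        if (e == EV_RCV_ACK) return ST_CLOSED;
        break;
    case ST_TIME_WAIT:
        if (e == EV_TIMEOUT) return ST_CLOSED;            /* wait timer expired */
        break;
    }
    return s;  /* events that do not apply leave the state unchanged */
}
```

Keeping the function free of side effects makes it easy to feed it events derived from a real capture and compare the resulting states with the simulator's log.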

This project bridges OS and networking. You will see how the kernel takes a connect() call and emits a SYN packet. You can also observe retransmissions and timeouts by dropping packets or blocking responses. The combination of capture and simulation makes the TCP/IP stack tangible.

How this fits into the projects

This concept builds on Project 8 (syscall tracing) and Project 15 (kernel modules). It also uses IPC and buffer management concepts.

Definitions & key terms

  • Ethernet frame: link-layer packet format.
  • IP packet: network-layer unit with IP header.
  • TCP segment: transport-layer unit with TCP header.
  • SYN/ACK: TCP handshake flags.

Mental model diagram (ASCII)

App -> TCP -> IP -> Ethernet -> Wire

How it works (step-by-step)

  1. Capture packet from interface.
  2. Decode Ethernet header.
  3. Decode IP header and verify protocol.
  4. Decode TCP header and flags.
  5. Update simulator state.
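Steps 2 through 4 can be sketched as a single decode function over an already captured buffer. This is an illustration under stated assumptions: untagged Ethernet II, IPv4 only, and hypothetical names (struct tcp_info, decode_tcp). Step 1 (capture) and step 5 (feeding the simulator) are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN  14       /* untagged Ethernet II header */
#define ETHERTYPE_V4 0x0800
#define PROTO_TCP    6

/* Hypothetical result type for this sketch. */
struct tcp_info {
    uint16_t src_port, dst_port;
    uint32_t seq, ack;
    uint8_t  flags;           /* low 6 bits: URG ACK PSH RST SYN FIN */
};

static uint16_t be16(const uint8_t *p) { return (uint16_t)((p[0] << 8) | p[1]); }
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 and fills *out for an IPv4/TCP frame, -1 for anything else. */
int decode_tcp(const uint8_t *frame, size_t len, struct tcp_info *out)
{
    assert(frame != NULL && out != NULL);

    /* Step 2: Ethernet header; EtherType sits at bytes 12-13. */
    if (len < ETH_HDR_LEN || be16(frame + 12) != ETHERTYPE_V4)
        return -1;

    /* Step 3: IPv4 header; IHL counts 32-bit words, minimum 5. */
    const uint8_t *ip = frame + ETH_HDR_LEN;
    size_t ip_avail = len - ETH_HDR_LEN;
    if (ip_avail < 20 || (ip[0] >> 4) != 4)
        return -1;
    size_t ihl = (size_t)(ip[0] & 0x0F) * 4;
    if (ihl < 20 || ihl > ip_avail || ip[9] != PROTO_TCP)
        return -1;
    if ((be16(ip + 6) & 0x1FFF) != 0)
        return -1;            /* later fragment: no TCP header here */

    /* Step 4: fixed 20-byte part of the TCP header. */
    if (ip_avail - ihl < 20)
        return -1;
    const uint8_t *tcp = ip + ihl;
    out->src_port = be16(tcp);
    out->dst_port = be16(tcp + 2);
    out->seq      = be32(tcp + 4);
    out->ack      = be32(tcp + 8);
    out->flags    = tcp[13] & 0x3F;
    return 0;
}
```

Every offset is checked against the remaining length before it is read, so short or truncated frames are rejected instead of causing an out-of-bounds access.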

Minimal concrete example

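/* Assumes an untagged Ethernet II frame that has already been length-checked. */
struct iphdr *ip = (struct iphdr *)(frame + 14);   /* struct iphdr: <netinet/ip.h> (Linux) */
if (ip->version == 4 && ip->protocol == IPPROTO_TCP) { /* TCP header starts ip->ihl * 4 bytes after ip */ }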

Common misconceptions

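  • “TCP is stateless”: each endpoint keeps per-connection state and follows a strict state machine.
  • “Packets are always complete”: IP fragmentation can split a datagram across several packets, and only the first fragment carries the TCP header.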

Check-your-understanding questions

  1. Why do we need sequence numbers in TCP?
  2. What does the SYN flag mean?
  3. Why must you use ntohs when parsing ports?

Check-your-understanding answers

  1. To order bytes and retransmit reliably.
  2. It asks to open a connection and carries the sender's initial sequence number (SYN stands for synchronize).
  3. Fields are in network byte order.

Real-world applications

  • Packet sniffers, network debugging, IDS systems.

Where you’ll apply it

  • This project: Section 3.2, Section 3.7, Section 5.10 Phase 2.
  • Also used in: Project 8.

References

  • “TCP/IP Illustrated” Vol. 1
  • “UNIX Network Programming” Vol. 1

Key insights

TCP behavior is visible in packet traces; the state machine explains the patterns.

Summary

By capturing and simulating packets, you connect socket APIs to real network traffic.

Homework/Exercises to practice the concept

  1. Add UDP header parsing.
  2. Implement retransmission timeout in simulator.
  3. Compare kernel vs simulator for a full HTTP request.

Solutions to the homework/exercises

  1. Parse UDP header fields (src/dst port, length).
  2. Add a timer and resend the SYN if no SYN-ACK arrives before it expires.
  3. Capture traffic and compare state transitions.
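For the first exercise, a possible shape for the UDP parser is sketched below. The struct and function names are hypothetical; the 8-byte layout (source port, destination port, length, checksum) is the standard UDP header, and the caller is assumed to have already checked that the IP protocol field is 17.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct udp_info { uint16_t src_port, dst_port, length; };

/* p points at the first byte after the IP header; avail is the number
 * of captured bytes from there. Returns 0 on success, -1 on bad input. */
int parse_udp(const uint8_t *p, size_t avail, struct udp_info *out)
{
    assert(p != NULL && out != NULL);
    if (avail < 8)
        return -1;                       /* UDP header is always 8 bytes */
    out->src_port = (uint16_t)((p[0] << 8) | p[1]);
    out->dst_port = (uint16_t)((p[2] << 8) | p[3]);
    out->length   = (uint16_t)((p[4] << 8) | p[5]);
    if (out->length < 8)
        return -1;                       /* length covers header + payload */
    return 0;
}
```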

3. Project Specification

3.1 What You Will Build

A packet capture CLI that decodes Ethernet/IP/TCP headers and a minimal TCP simulator that prints state transitions for a synthetic connection.

3.2 Functional Requirements

  1. Capture packets from a specified interface.
  2. Decode Ethernet, IP, and TCP headers.
  3. Implement a TCP handshake simulator.
  4. Compare simulator output with real capture logs.

3.3 Non-Functional Requirements

  • Performance: process 10k packets/sec on a small capture.
  • Reliability: handle malformed packets gracefully.
  • Usability: ./pktcap eth0 and ./mini_tcp_sim.

3.4 Example Usage / Output

$ sudo ./pktcap eth0
[12:00:01] RX TCP 192.168.1.10:443 -> 192.168.1.50:51432 len=74
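A log line in this shape could be produced by a small formatter like the one below. It is a sketch: format_tcp_line is a hypothetical name, the timestamp prefix is left to the caller, and the addresses are passed as the four raw bytes found in the IPv4 header.

```c
#include <arpa/inet.h>    /* inet_ntop */
#include <assert.h>
#include <netinet/in.h>   /* struct in_addr, INET_ADDRSTRLEN */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Writes "RX TCP a.b.c.d:port -> a.b.c.d:port len=N" into buf.
 * Returns snprintf's result, or -1 if an address cannot be formatted. */
int format_tcp_line(char *buf, size_t size,
                    const uint8_t src[4], uint16_t sport,
                    const uint8_t dst[4], uint16_t dport, unsigned len)
{
    struct in_addr sa, da;
    char s[INET_ADDRSTRLEN], d[INET_ADDRSTRLEN];

    assert(buf != NULL && src != NULL && dst != NULL);
    memcpy(&sa, src, 4);  /* header bytes are already in network order */
    memcpy(&da, dst, 4);
    if (inet_ntop(AF_INET, &sa, s, sizeof s) == NULL ||
        inet_ntop(AF_INET, &da, d, sizeof d) == NULL)
        return -1;
    return snprintf(buf, size, "RX TCP %s:%u -> %s:%u len=%u",
                    s, (unsigned)sport, d, (unsigned)dport, len);
}
```

The ports are expected in host byte order here, that is, after conversion with ntohs.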

3.5 Data Formats / Schemas / Protocols

  • Ethernet II frames
  • IPv4 headers
  • TCP headers

3.6 Edge Cases

  • Non-TCP packets.
  • IP header with options.
  • Short frames.

3.7 Real World Outcome

3.7.1 How to Run (Copy/Paste)

sudo ./pktcap eth0
./mini_tcp_sim --seed 42

3.7.2 Golden Path Demo (Deterministic)

  • mini_tcp_sim uses fixed seed and deterministic state transitions.

3.7.3 Exact Terminal Transcript (CLI)

$ ./mini_tcp_sim --seed 42
SYN -> SYN-ACK -> ACK
state=ESTABLISHED

Failure demo (deterministic):

$ ./pktcap nope0
error: interface not found

Exit codes:

  • 0 success
  • 2 invalid args
  • 3 capture error
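The argument check in pktcap might look like the sketch below. The constant and function names are hypothetical, the usage message is invented for illustration, and mapping "interface not found" to exit code 3 is an assumption, since the text does not say which code that failure returns. if_nametoindex is the standard call from <net/if.h>.

```c
#include <assert.h>
#include <net/if.h>   /* if_nametoindex */
#include <stdio.h>

enum { PKT_EXIT_OK = 0, PKT_EXIT_BAD_ARGS = 2, PKT_EXIT_CAPTURE = 3 };

/* Validates the command line of "pktcap <interface>" and returns the
 * exit code the program should use if it stopped here. */
int pktcap_check_args(int argc, char *const argv[])
{
    assert(argv != NULL);
    if (argc != 2) {
        fprintf(stderr, "usage: pktcap <interface>\n");
        return PKT_EXIT_BAD_ARGS;
    }
    if (if_nametoindex(argv[1]) == 0) {   /* 0 means no such interface */
        fprintf(stderr, "error: interface not found\n");
        return PKT_EXIT_CAPTURE;          /* assumed mapping, see above */
    }
    return PKT_EXIT_OK;
}
```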

4. Solution Architecture

4.1 High-Level Design

Packet capture -> Decoder -> Logger
TCP simulator -> State machine -> Output

4.2 Key Components

| Component | Responsibility | Key Decisions |
|---|---|---|
| Capture | read raw frames | libpcap or raw socket |
| Decoder | parse headers | careful offsets |
| Simulator | TCP states | small deterministic model |

4.3 Data Structures (No Full Code)

struct tcp_state {
    int state;
    uint32_t seq;
    uint32_t ack;
};

4.4 Algorithm Overview

Key Algorithm: handshake simulation

  1. Send SYN, move to SYN-SENT.
  2. Receive SYN-ACK and check that it acknowledges the SYN.
  3. Send ACK, move to ESTABLISHED.

Complexity Analysis:

  • Time: O(1) per state transition
  • Space: O(1)

5. Implementation Guide

5.1 Development Environment Setup

sudo apt-get install libpcap-dev

5.2 Project Structure

project-root/
|-- pktcap.c
|-- mini_tcp_sim.c
`-- Makefile

5.3 The Core Question You’re Answering

“How does the OS move bytes from sockets to packets, and back again?”

5.4 Concepts You Must Understand First

  1. Ethernet/IP/TCP header formats.
  2. Endianness conversions.
  3. TCP state machine basics.

5.5 Questions to Guide Your Design

  1. How will you handle IP header options?
  2. How will you validate packet length?
  3. What fields will you log for debugging?

5.6 Thinking Exercise

Draw the TCP three-way handshake with state transitions.

5.7 The Interview Questions They’ll Ask

  1. What is the difference between TCP and UDP?
  2. Why does TCP need sequence numbers?

5.8 Hints in Layers

Hint 1: Start with packet capture and Ethernet parsing.

Hint 2: Add IP and TCP parsing.

Hint 3: Add state machine simulator.

5.9 Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| TCP/IP | TCP/IP Illustrated | 1-6 |
| Sockets | UNIX Network Programming | 1-5 |

5.10 Implementation Phases

Phase 1: Capture + decode (6-8 hours)

Goals: decode headers and print logs.

Phase 2: Simulator (4-6 hours)

Goals: handshake state machine.

Phase 3: Compare (4-6 hours)

Goals: map simulator logs to real capture.

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Capture API | raw socket vs libpcap | libpcap | portability |
| Header parsing | manual vs struct | manual | control and safety |

6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|---|---|---|
| Unit | header parsing | known packet bytes |
| Integration | capture | loopback traffic |
| Simulator | state transitions | deterministic sequence |

6.2 Critical Test Cases

  1. Parse a TCP SYN packet correctly.
  2. Handle non-IP packets gracefully.
  3. Simulator transitions to ESTABLISHED.

6.3 Test Data

packet bytes (hex): 45 00 00 34 ...
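The four bytes shown are enough for a first unit test of the IPv4 parser: version and IHL come from byte 0, total length from bytes 2-3. The sketch below uses hypothetical names and deliberately reads only those bytes, since the rest of the packet is elided.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ipv4_prefix { unsigned version, ihl_bytes; uint16_t total_len; };

/* Decodes the first 4 bytes of an IPv4 header. Returns 0, or -1 if
 * fewer than 4 bytes are available. */
int parse_ipv4_prefix(const uint8_t *p, size_t avail, struct ipv4_prefix *out)
{
    assert(p != NULL && out != NULL);
    if (avail < 4)
        return -1;
    out->version   = p[0] >> 4;              /* high nibble */
    out->ihl_bytes = (p[0] & 0x0Fu) * 4u;    /* low nibble, in 32-bit words */
    out->total_len = (uint16_t)((p[2] << 8) | p[3]);
    return 0;
}
```

For 45 00 00 34 this gives version 4, a 20-byte header (no options), and a total length of 52 bytes.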

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---|---|---|
| Wrong offsets | garbage fields | compute header lengths |
| Endianness | wrong port numbers | use ntohs/ntohl |
| Short frames | crashes | length checks |

7.2 Debugging Strategies

  • Compare with tcpdump -xx output.
  • Add verbose logs for header fields.

7.3 Performance Traps

  • Copying whole packets with memcpy at each layer; decode in place from the capture buffer instead.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add UDP parsing.

8.2 Intermediate Extensions

  • Implement basic retransmission.
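A simple starting point for the simulator is a retransmission timeout that doubles on every retry up to a cap. The function name and parameters below are hypothetical, and the values in the usage note (1 second initial, 60 second cap) are illustrative choices rather than requirements of this project.

```c
#include <assert.h>
#include <stdint.h>

/* Timeout in milliseconds for the given attempt (0 = first transmission):
 * base_ms doubled once per retry, never exceeding max_ms. */
uint32_t rto_for_attempt(uint32_t base_ms, unsigned attempt, uint32_t max_ms)
{
    assert(base_ms > 0 && max_ms >= base_ms);
    uint32_t rto = base_ms;
    while (attempt-- > 0 && rto < max_ms) {
        /* Compare against max_ms / 2 first so rto * 2 cannot overflow. */
        rto = (rto > max_ms / 2) ? max_ms : rto * 2;
    }
    return rto;
}
```

With base_ms = 1000 and max_ms = 60000 the successive timeouts are 1000, 2000, 4000, 8000 and so on, settling at 60000 from the sixth retry onward.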

8.3 Advanced Extensions

  • Add a user-space TCP stack that can fetch a web page.

9. Real-World Connections

9.1 Industry Applications

  • Packet sniffers and network debugging tools.
  • tcpdump, Wireshark.

9.3 Interview Relevance

  • Networking stack questions and TCP state machine.

10. Resources

10.1 Essential Reading

  • TCP/IP Illustrated Vol. 1

10.2 Video Resources

  • TCP/IP lectures

10.3 Tools & Documentation

  • libpcap docs

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain TCP handshake states.
  • I can parse packet headers.

11.2 Implementation

  • Packet capture and simulator work.

11.3 Growth

  • I can explain socket-to-wire flow.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Packet capture with TCP header decode.

Full Completion:

  • TCP simulator with handshake.

Excellence (Going Above & Beyond):

  • User-space TCP stack or retransmission logic.