Project 6: Network Stack Server

Build a user-space TCP/IP stack server with IPC-based socket APIs.

Quick Reference

Attribute       Value
Difficulty      Expert
Time Estimate   1 month
Language        C (Alternatives: Rust)
Prerequisites   TCP/IP basics, IPC, driver framework
Key Topics      TCP state machine, sockets, packet IO

1. Learning Objectives

By completing this project, you will:

  1. Implement a minimal TCP/IP stack in user space.
  2. Expose a socket-like IPC API to clients.
  3. Integrate with a network driver or TAP device.
  4. Explain the tradeoffs of network services in microkernels.

2. Theoretical Foundation

2.1 Core Concepts

  • TCP State Machine: Connection states such as SYN_SENT, ESTABLISHED, and FIN_WAIT_1, driven by SYN, ACK, and FIN segments and by retransmission timers.
  • IP Routing: Basic routing and ARP/ND resolution.
  • Socket API: A user-facing abstraction for connection-oriented IO.
  • Packet IO: Driver interface for sending/receiving frames.

2.2 Why This Matters

Network stacks are among the most complex OS services: they combine protocol state machines, timers, and buffer management. Running one as a user-space server shows that a microkernel can modularize even these services.

2.3 Historical Context / Background

lwIP and BSD stacks influenced many embedded and microkernel systems. QNX uses message passing for socket APIs.

2.4 Common Misconceptions

  • “TCP is just send/recv.” TCP needs timers, retransmissions, and windowing.
  • “You must start with TCP.” Start with UDP/ICMP for sanity.

3. Project Specification

3.1 What You Will Build

A network server that accepts IPC requests for socket operations and handles TCP/UDP traffic using a driver or TAP device.

3.2 Functional Requirements

  1. Socket API: socket/connect/send/recv/close via IPC.
  2. UDP support: send/receive datagrams.
  3. TCP handshake: SYN/SYN-ACK/ACK.
  4. Packet IO: transmit/receive packets via driver.
  5. Multi-client: per-client socket tables.
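
One way to carry these calls over IPC is a fixed request/reply message pair, sketched below. Everything in it is an assumption made for illustration: the opcode names, the `net_req_t` and `net_reply_t` layouts, the error codes, and the `net_handle_request` dispatcher are hypothetical, not an interface defined by this project. The sketch only validates requests and manages the per-client socket table; the actual protocol work is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes, one per socket call in requirement 1. */
typedef enum {
    NET_OP_SOCKET, NET_OP_CONNECT, NET_OP_SEND, NET_OP_RECV, NET_OP_CLOSE
} net_op_t;

/* Hypothetical error codes returned in net_reply_t.result. */
enum { NET_EBADF = -1, NET_ENFILE = -2, NET_EINVAL = -3 };

#define NET_MAX_SOCKETS 16
#define NET_MAX_PAYLOAD 1024

/* Request as it might travel over IPC (layout is an assumption). */
typedef struct {
    net_op_t op;
    int32_t  sock;                   /* socket id, ignored for NET_OP_SOCKET */
    uint32_t addr;                   /* IPv4 address, used by connect only */
    uint16_t port;
    uint16_t len;                    /* valid bytes in data[] */
    uint8_t  data[NET_MAX_PAYLOAD];  /* one IPC carries a whole buffer */
} net_req_t;

typedef struct {
    int32_t result;                  /* >= 0 on success, negative on error */
} net_reply_t;

/* Per-client socket table (requirement 5). */
typedef struct {
    bool in_use[NET_MAX_SOCKETS];
} net_client_t;

static bool sock_valid(const net_client_t *c, int32_t s)
{
    return s >= 0 && s < NET_MAX_SOCKETS && c->in_use[s];
}

/* Validate one request against the calling client's table and answer it. */
net_reply_t net_handle_request(net_client_t *c, const net_req_t *req)
{
    assert(c != NULL && req != NULL);
    net_reply_t rep = { .result = NET_EINVAL };

    switch (req->op) {
    case NET_OP_SOCKET:
        rep.result = NET_ENFILE;               /* assume the table is full */
        for (int32_t i = 0; i < NET_MAX_SOCKETS; i++) {
            if (!c->in_use[i]) { c->in_use[i] = true; rep.result = i; break; }
        }
        break;
    case NET_OP_CLOSE:
        if (!sock_valid(c, req->sock)) { rep.result = NET_EBADF; break; }
        c->in_use[req->sock] = false;
        rep.result = 0;
        break;
    case NET_OP_CONNECT:
    case NET_OP_SEND:
    case NET_OP_RECV:
        /* Check the request before touching the stack; real work omitted. */
        if (!sock_valid(c, req->sock)) { rep.result = NET_EBADF; break; }
        if (req->len > NET_MAX_PAYLOAD) { rep.result = NET_EINVAL; break; }
        rep.result = 0;
        break;
    }
    return rep;
}
```

Because each client has its own `net_client_t`, a socket id from one client is never valid in another client's requests.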

3.3 Non-Functional Requirements

  • Correctness: Basic TCP state transitions and retransmissions.
  • Performance: Avoid per-byte IPC when possible.
  • Robustness: Defensive parsing of packets.
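
As an illustration of defensive parsing, the sketch below checks every length field of an IPv4 header against the number of bytes actually received before reading anything that depends on it. The `ipv4_view_t` type and the `ipv4_parse` name are hypothetical, and header checksum verification is deliberately left out to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Parsed view of an IPv4 header (field selection is illustrative). */
typedef struct {
    uint8_t  header_len;     /* bytes, 20..60 */
    uint8_t  protocol;       /* 1 = ICMP, 6 = TCP, 17 = UDP */
    uint16_t total_len;      /* header + payload, in bytes */
    uint32_t src, dst;       /* addresses in host byte order */
    const uint8_t *payload;  /* points into the caller's buffer */
    size_t   payload_len;
} ipv4_view_t;

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the packet must be dropped. */
int ipv4_parse(const uint8_t *buf, size_t len, ipv4_view_t *out)
{
    assert(buf != NULL && out != NULL);

    if (len < 20) return -1;                    /* shorter than a minimal header */
    if ((buf[0] >> 4) != 4) return -1;          /* not IPv4 */

    size_t ihl = (size_t)(buf[0] & 0x0F) * 4;   /* header length in bytes */
    if (ihl < 20 || ihl > len) return -1;       /* invalid or truncated header */

    uint16_t total = rd16(buf + 2);
    if (total < ihl || total > len) return -1;  /* length field exceeds the data */

    out->header_len  = (uint8_t)ihl;
    out->protocol    = buf[9];
    out->total_len   = total;
    out->src         = rd32(buf + 12);
    out->dst         = rd32(buf + 16);
    out->payload     = buf + ihl;
    out->payload_len = (size_t)total - ihl;     /* excludes link-layer padding */
    return 0;
}
```

The same pattern (check the length, then read) applies to the UDP and TCP headers inside the payload.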

3.4 Example Usage / Output

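char buf[4096];
int s = net_socket(AF_INET, SOCK_STREAM, 0);   /* each call is an IPC to the net server */
net_connect(s, "93.184.216.34", 80);
net_send(s, "GET / HTTP/1.0\r\n\r\n", 18);     /* 18 = length of the request in bytes */
int n = net_recv(s, buf, sizeof(buf));         /* n = number of bytes received */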

3.5 Real World Outcome

$ ./net_server &
[net] server ready on endpoint "net"

$ ./http_client example.com
[net] socket() -> 3
[net] connect() -> SYN sent
[net] ESTABLISHED
[net] recv() 1256 bytes
HTTP/1.0 200 OK

4. Solution Architecture

4.1 High-Level Design

┌──────────────┐  IPC   ┌──────────────┐   packets   ┌──────────────┐
│   Clients    │ ─────▶ │  Net Server  │ ─────────▶ │  Net Driver  │
└──────────────┘ ◀───── │ TCP/IP stack │ ◀───────── │ (TAP/real)   │
                         └──────────────┘            └──────────────┘

4.2 Key Components

Component          Responsibility       Key Decisions
Socket manager     Per-client sockets   Table per client
TCP layer          State, retransmit    Timer wheel vs heap
IP layer           Routing, ARP         Static routing first
Driver interface   Send/recv frames     TAP device for dev

4.3 Data Structures

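#include <stdint.h>

/* One TCP connection, identified by its 4-tuple.
   tcp_state_t is assumed to be an enum of TCP states defined in net.h. */
typedef struct {
    int id;                            /* connection identifier */
    uint32_t local_ip, remote_ip;      /* IPv4 addresses */
    uint16_t local_port, remote_port;
    tcp_state_t state;                 /* current state-machine state */
    uint32_t snd_nxt, rcv_nxt;         /* next seq to send / next seq expected */
} tcp_conn_t;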
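
The structure above does not define `tcp_state_t` or say how connections are found. The sketch below fills both in under stated assumptions: the enum lists the eleven states from the TCP specification (RFC 793 / RFC 9293), while the fixed-size table, `TCP_MAX_CONNS`, `tcp_alloc`, and `tcp_lookup` are hypothetical names chosen for this example. The lookup is a linear scan for simplicity; a hash table keyed on the 4-tuple would be the usual replacement once the connection count grows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The eleven connection states named in the TCP specification. */
typedef enum {
    TCP_CLOSED,        /* value 0, so a zeroed table is all CLOSED */
    TCP_LISTEN, TCP_SYN_SENT, TCP_SYN_RECEIVED, TCP_ESTABLISHED,
    TCP_FIN_WAIT_1, TCP_FIN_WAIT_2, TCP_CLOSE_WAIT, TCP_CLOSING,
    TCP_LAST_ACK, TCP_TIME_WAIT
} tcp_state_t;

typedef struct {       /* same fields as the structure above */
    int id;
    uint32_t local_ip, remote_ip;
    uint16_t local_port, remote_port;
    tcp_state_t state;
    uint32_t snd_nxt, rcv_nxt;
} tcp_conn_t;

#define TCP_MAX_CONNS 64                      /* arbitrary limit for the sketch */
static tcp_conn_t tcp_conns[TCP_MAX_CONNS];   /* static storage: zero-initialized */

/* Claim the first CLOSED slot for a new connection, or return NULL if full. */
tcp_conn_t *tcp_alloc(uint32_t lip, uint16_t lport,
                      uint32_t rip, uint16_t rport, tcp_state_t initial)
{
    assert(initial != TCP_CLOSED);
    for (int i = 0; i < TCP_MAX_CONNS; i++) {
        if (tcp_conns[i].state == TCP_CLOSED) {
            tcp_conns[i] = (tcp_conn_t){
                .id = i, .local_ip = lip, .remote_ip = rip,
                .local_port = lport, .remote_port = rport, .state = initial,
            };
            return &tcp_conns[i];
        }
    }
    return NULL;
}

/* Find the live connection matching an incoming segment's 4-tuple. */
tcp_conn_t *tcp_lookup(uint32_t lip, uint16_t lport,
                       uint32_t rip, uint16_t rport)
{
    for (int i = 0; i < TCP_MAX_CONNS; i++) {
        tcp_conn_t *c = &tcp_conns[i];
        if (c->state != TCP_CLOSED &&
            c->local_ip == lip && c->local_port == lport &&
            c->remote_ip == rip && c->remote_port == rport) {
            return c;
        }
    }
    return NULL;   /* no match: the caller would typically answer with RST */
}
```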

4.4 Algorithm Overview

Key Algorithm: TCP Handshake

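  1. Client sends SYN carrying its initial sequence number (ISN).
  2. Server responds with SYN-ACK: its own ISN, plus an acknowledgment of the client's ISN + 1.
  3. Client replies with an ACK of the server's ISN + 1; the connection is established.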

Complexity Analysis:

  • Time: O(1) per packet, assuming connections are looked up by 4-tuple in a hash table (a linear scan of the table is O(N))
  • Space: O(N) connections
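
The client side of this handshake can be sketched as two small functions. This is a minimal illustration, not a full state machine: it covers only the active open, ignores RST, options, and window fields, and all names (`hs_conn_t`, `tcp_seg_t`, `hs_connect`, `hs_on_segment`) are hypothetical. The flag values are the standard TCP header bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TCP_FLAG_SYN 0x02
#define TCP_FLAG_ACK 0x10

typedef enum { HS_CLOSED, HS_SYN_SENT, HS_ESTABLISHED } hs_state_t;

/* Only the header fields the handshake needs. */
typedef struct {
    uint32_t seq, ack;
    uint8_t  flags;
} tcp_seg_t;

typedef struct {
    hs_state_t state;
    uint32_t snd_nxt;   /* next sequence number we will send */
    uint32_t rcv_nxt;   /* next sequence number we expect from the peer */
} hs_conn_t;

/* Step 1: build the SYN. iss is our initial sequence number. */
tcp_seg_t hs_connect(hs_conn_t *c, uint32_t iss)
{
    assert(c != NULL);
    c->state   = HS_SYN_SENT;
    c->snd_nxt = iss + 1;           /* the SYN consumes one sequence number */
    c->rcv_nxt = 0;
    return (tcp_seg_t){ .seq = iss, .ack = 0, .flags = TCP_FLAG_SYN };
}

/* Steps 2 and 3: on a valid SYN-ACK, fill *reply with the final ACK and
   return true. Any other segment is ignored and false is returned. */
bool hs_on_segment(hs_conn_t *c, const tcp_seg_t *in, tcp_seg_t *reply)
{
    assert(c != NULL && in != NULL && reply != NULL);
    const uint8_t want = TCP_FLAG_SYN | TCP_FLAG_ACK;

    if (c->state != HS_SYN_SENT) return false;
    if ((in->flags & want) != want) return false;
    if (in->ack != c->snd_nxt) return false;   /* does not acknowledge our SYN */

    c->rcv_nxt = in->seq + 1;                  /* the peer's SYN also counts as one */
    c->state   = HS_ESTABLISHED;
    *reply = (tcp_seg_t){ .seq = c->snd_nxt, .ack = c->rcv_nxt,
                          .flags = TCP_FLAG_ACK };
    return true;
}
```

With a client ISN of 1000 and a server ISN of 5000, the three segments are SYN (seq 1000), SYN-ACK (seq 5000, ack 1001), and ACK (seq 1001, ack 5001).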

5. Implementation Guide

5.1 Development Environment Setup

cc -O2 -g -Wall -Iinclude -o net_server src/*.c

5.2 Project Structure

net_server/
├── src/
│   ├── socket.c
│   ├── tcp.c
│   ├── ip.c
│   ├── driver.c
│   └── main.c
├── include/
│   └── net.h
└── tests/
    └── test_tcp.c

5.3 The Core Question You’re Answering

“Can a full TCP/IP stack live outside the kernel and still feel usable?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. TCP state machine
  2. IP headers and checksums
  3. ARP for local networks
  4. Timers and retransmission
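
For item 2, the checksum used by IPv4, ICMP, UDP, and TCP is the 16-bit one's-complement sum described in RFC 1071. A minimal sketch follows; the function name `inet_checksum` is an assumption, and the routine returns the value in host order, so the caller must store it into the header in network byte order. For UDP and TCP the pseudo-header has to be included in the summed data, which this sketch leaves to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement checksum over len bytes, read as big-endian 16-bit words.
   The 32-bit accumulator cannot overflow for inputs up to 64 KiB. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t sum = 0;

    while (len > 1) {
        sum += (uint32_t)((data[0] << 8) | data[1]);
        data += 2;
        len  -= 2;
    }
    if (len == 1) {
        sum += (uint32_t)(data[0] << 8);    /* pad the odd byte with zero */
    }
    while (sum >> 16) {
        sum = (sum & 0xFFFFu) + (sum >> 16);   /* fold carries back in */
    }
    return (uint16_t)~sum;
}
```

To build a header, zero the checksum field, compute the checksum, and write it in. To verify a received header, run the same function over the header as received; a result of 0 means the checksum is valid.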

5.5 Questions to Guide Your Design

  1. How will you route packets (static vs dynamic)?
  2. How will you store TCP connections?
  3. How will you handle retransmission timers?
  4. How will you expose socket errors to clients?

5.6 Thinking Exercise

Simulate a TCP Handshake

Write out the state transitions and sequence numbers for a 3-way handshake.

5.7 The Interview Questions They’ll Ask

  1. “What states does TCP go through during connection setup?”
  2. “Why does TCP need sequence numbers?”
  3. “How do microkernels handle sockets?”

5.8 Hints in Layers

Hint 1: Start with UDP. Implement stateless packet send/receive first.

Hint 2: Add the TCP handshake. Hardcode a handshake before full data transfer.

Hint 3: Implement retransmission. Add timers that resend unacknowledged SYN and data segments.
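
For Hint 3, a per-segment retransmission timer with exponential backoff might look like the sketch below. It is a simplification under stated assumptions: the initial timeout of 1 second and the doubling on each expiry follow RFC 6298, but the 60-second cap, the retry limit of 5, and all type and function names are choices made for this example. It does not estimate round-trip time, and the caller supplies the current time in milliseconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RTO_INITIAL_MS 1000u    /* initial retransmission timeout */
#define RTO_MAX_MS     60000u   /* cap on the backed-off timeout (assumed) */
#define REXMIT_MAX     5u       /* retries before giving up (assumed) */

typedef enum { REXMIT_NONE, REXMIT_RESEND, REXMIT_GIVE_UP } rexmit_action_t;

typedef struct {
    bool     armed;
    uint32_t rto_ms;        /* current timeout */
    uint64_t deadline_ms;   /* absolute time of the next expiry */
    unsigned retries;
} rexmit_timer_t;

/* Call when a SYN or data segment is first transmitted. */
void rexmit_arm(rexmit_timer_t *t, uint64_t now_ms)
{
    assert(t != NULL);
    t->armed       = true;
    t->rto_ms      = RTO_INITIAL_MS;
    t->deadline_ms = now_ms + RTO_INITIAL_MS;
    t->retries     = 0;
}

/* Call when the segment has been acknowledged. */
void rexmit_ack(rexmit_timer_t *t)
{
    assert(t != NULL);
    t->armed = false;
}

/* Call periodically; tells the caller whether to resend or abort. */
rexmit_action_t rexmit_poll(rexmit_timer_t *t, uint64_t now_ms)
{
    assert(t != NULL);
    if (!t->armed || now_ms < t->deadline_ms) return REXMIT_NONE;

    if (t->retries >= REXMIT_MAX) {
        t->armed = false;
        return REXMIT_GIVE_UP;      /* report a timeout error to the client */
    }
    t->retries++;
    t->rto_ms = (t->rto_ms * 2 > RTO_MAX_MS) ? RTO_MAX_MS : t->rto_ms * 2;
    t->deadline_ms = now_ms + t->rto_ms;
    return REXMIT_RESEND;
}
```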

5.9 Books That Will Help

Topic        Book                 Chapter
TCP/IP       TCP/IP Illustrated   Vol 1
Socket API   CS:APP               Ch. 11

5.10 Implementation Phases

Phase 1: Foundation (1 week)

Goals:

  • Packet IO
  • UDP send/recv

Tasks:

  1. Use TAP or a raw socket for packet IO.
  2. Send a UDP packet and receive it.

Checkpoint: A UDP datagram sent through your stack is echoed back and received (a UDP echo works).
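
For task 1, opening a TAP device on Linux usually takes one `open` and one `ioctl`. The sketch below assumes a Linux host with `/dev/net/tun` available and a process that has permission to create the interface (typically root or CAP_NET_ADMIN); the function name `tap_open` and the interface name passed to it are illustrative. On other systems the mechanism differs.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Open (or create) the TAP interface `name`.
   Returns a file descriptor, or -1 on failure with errno set by the
   failing call. Each read() on the descriptor yields one Ethernet frame;
   each write() transmits one. */
int tap_open(const char *name)
{
    assert(name != NULL);

    int fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0) {
        return -1;
    }

    struct ifreq ifr;
    memset(&ifr, 0, sizeof ifr);
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI;   /* Ethernet frames, no extra header */
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

After `tap_open("tap0")` succeeds, the interface still has to be brought up and given an address on the host side before traffic flows.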

Phase 2: Core Functionality (2 weeks)

Goals:

  • TCP connection setup
  • Basic send/recv

Tasks:

  1. Implement TCP state machine.
  2. Implement basic data transfer.

Checkpoint: HTTP GET works via your stack.

Phase 3: Polish & Edge Cases (1 week)

Goals:

  • Retransmission and close

Tasks:

  1. Add retransmit timers.
  2. Implement FIN/close.

Checkpoint: Repeated HTTP requests succeed.

5.11 Key Implementation Decisions

Decision         Options                  Recommendation   Rationale
Driver backend   TAP, raw NIC             TAP first        Safer and easier
TCP timers       per-conn thread, wheel   Timer wheel      Scales better
IPC API          POSIX-like, custom       POSIX-like       Familiar to users
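
To make the timer-wheel recommendation concrete, here is a minimal single-level hashed wheel. It is a sketch under assumptions: one slot per tick, 256 slots, intrusive singly linked lists, and no cancel operation. All names (`wheel_t`, `wtimer_t`, `wheel_add`, `wheel_tick`) are hypothetical, and `wheel_count_cb` is only a demonstration callback. Insertion is O(1), and each tick touches only the timers hashed to the current slot, which is why this scales better than one thread per connection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WHEEL_SLOTS 256u   /* must divide 2^32 so tick wraparound is harmless */

typedef struct wtimer {
    struct wtimer *next;
    uint32_t rounds;            /* full wheel turns left before firing */
    void (*fn)(void *arg);
    void *arg;
} wtimer_t;

typedef struct {
    wtimer_t *slot[WHEEL_SLOTS];
    uint32_t  now;              /* current tick */
} wheel_t;

/* Schedule t to fire `ticks` ticks from now (ticks >= 1). O(1). */
void wheel_add(wheel_t *w, wtimer_t *t, uint32_t ticks)
{
    assert(w != NULL && t != NULL && t->fn != NULL && ticks >= 1);
    uint32_t idx = (w->now + ticks) % WHEEL_SLOTS;
    t->rounds = (ticks - 1) / WHEEL_SLOTS;
    t->next = w->slot[idx];
    w->slot[idx] = t;
}

/* Advance time by one tick and fire every timer that is due. */
void wheel_tick(wheel_t *w)
{
    assert(w != NULL);
    w->now++;
    uint32_t idx = w->now % WHEEL_SLOTS;

    wtimer_t *list = w->slot[idx];
    w->slot[idx] = NULL;        /* detach, so callbacks may re-add timers */

    while (list != NULL) {
        wtimer_t *t = list;
        list = t->next;
        if (t->rounds > 0) {    /* not due yet: keep for a later turn */
            t->rounds--;
            t->next = w->slot[idx];
            w->slot[idx] = t;
        } else {
            t->fn(t->arg);
        }
    }
}

/* Demonstration callback: counts how many timers have fired. */
void wheel_count_cb(void *arg)
{
    (*(int *)arg)++;
}
```

A retransmit timer would embed a `wtimer_t` in the connection structure and re-add itself from the callback with the backed-off timeout.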

6. Testing Strategy

6.1 Test Categories

Category            Purpose            Examples
Unit Tests          Packet parsing     checksum tests
Integration Tests   TCP handshake      SYN/SYN-ACK/ACK
Load Tests          Multiple clients   50 connections

6.2 Critical Test Cases

  1. Malformed packet is dropped safely.
  2. Retransmit on loss works.
  3. Concurrent connections do not interfere.

6.3 Test Data

TCP ports: 80, 8080
Payload sizes: 64B, 4KB

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall              Symptom            Solution
Wrong checksum       Packets ignored    Verify checksum algorithm
Missing retransmit   Connections hang   Add timers
Global locks         Throughput low     Use per-conn locks

7.2 Debugging Strategies

  • Use tcpdump to inspect packets.
  • Add a trace log of TCP state transitions.

7.3 Performance Traps

Per-byte IPC calls are slow. Buffer sends and do fewer IPCs.
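
One way to buffer sends is a small client-side staging buffer that issues an IPC only when it fills or when the caller flushes. The sketch below assumes a transport hook of type `ipc_send_fn` that performs one IPC per call; that hook, the 4 KB capacity, and the `txbuf_*` names are all hypothetical. `demo_ipc_send` is a stand-in that only counts calls so the effect is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TXBUF_CAP 4096

/* Hypothetical transport hook: one call is one IPC to the net server.
   Returns 0 on success, negative on error. */
typedef int (*ipc_send_fn)(int sock, const void *data, size_t len);

typedef struct {
    int           sock;
    ipc_send_fn   send;
    size_t        used;
    unsigned char data[TXBUF_CAP];
} txbuf_t;

/* Push any staged bytes to the server in a single IPC. */
int txbuf_flush(txbuf_t *b)
{
    assert(b != NULL && b->send != NULL);
    if (b->used == 0) return 0;
    int rc = b->send(b->sock, b->data, b->used);
    if (rc < 0) return rc;
    b->used = 0;
    return 0;
}

/* Stage bytes locally; an IPC happens only when the buffer fills. */
int txbuf_write(txbuf_t *b, const void *src, size_t len)
{
    assert(b != NULL && (src != NULL || len == 0));
    const unsigned char *p = src;

    while (len > 0) {
        size_t room = TXBUF_CAP - b->used;
        size_t n = len < room ? len : room;
        memcpy(b->data + b->used, p, n);
        b->used += n;
        p       += n;
        len     -= n;
        if (b->used == TXBUF_CAP) {
            int rc = txbuf_flush(b);
            if (rc < 0) return rc;
        }
    }
    return 0;
}

/* Stand-in transport for demonstration: counts IPCs and bytes. */
static int    demo_ipc_calls;
static size_t demo_ipc_bytes;

int demo_ipc_send(int sock, const void *data, size_t len)
{
    (void)sock;
    (void)data;
    demo_ipc_calls++;
    demo_ipc_bytes += len;
    return 0;
}
```

With this in place, 1000 one-byte writes followed by a flush cost one IPC instead of 1000. The tradeoff is latency: data sits in the buffer until a flush, so interactive protocols need an explicit flush after each message.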


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add ICMP echo (ping).
  • Add DNS lookup via UDP.

8.2 Intermediate Extensions

  • Implement basic congestion control.
  • Support IPv6.

8.3 Advanced Extensions

  • Add TLS offload via user-space library.
  • Integrate with a capability system for socket access.

9. Real-World Connections

9.1 Industry Applications

  • QNX: Network stack as a server process.
  • Fuchsia: Netstack is a component.

9.2 Related Open Source Projects

  • lwIP: https://savannah.nongnu.org/projects/lwip/
  • gVisor netstack: https://github.com/google/gvisor

9.3 Interview Relevance

TCP state machine and socket design are frequent interview topics.


10. Resources

10.1 Essential Reading

  • TCP/IP Illustrated, Vol 1 - Protocol details.
  • CS:APP - Socket programming.

10.2 Video Resources

  • Network lectures from university OS courses.

10.3 Tools & Documentation

  • tcpdump: packet capture
  • Wireshark: protocol analysis

10.4 Related Projects in This Series

  • Project 5: File system server pattern.
  • Project 12: Benchmark IPC overhead.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain TCP connection setup and teardown.
  • I can describe how my IPC API maps to sockets.

11.2 Implementation

  • UDP and TCP send/recv work.
  • Multiple clients can connect simultaneously.

11.3 Growth

  • I can outline how to add congestion control.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • UDP works end-to-end.
  • TCP handshake succeeds.

Full Completion:

  • TCP data transfer works reliably.
  • Multiple connections are supported.

Excellence (Going Above & Beyond):

  • Congestion control or TLS support.
  • Performance metrics documented.

This guide was generated from LEARN_MICROKERNELS.md. For the complete learning path, see the parent directory.