Project 4: Network Packet Sniffer
A sniffer that captures frames, decodes Ethernet and IP headers, and prints a readable summary.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 2: Intermediate |
| Time Estimate | 1 week |
| Main Programming Language | C |
| Alternative Programming Languages | Python, Go, Rust |
| Coolness Level | Level 3: Genuinely Clever |
| Business Potential | 2. The “Micro-SaaS / Pro Tool” |
| Prerequisites | See concepts below |
| Key Topics | Layering and Encapsulation, Link Layer and LAN Behavior (Ethernet, Wi-Fi, ARP, Switching) |
1. Learning Objectives
- Build and validate a sniffer that captures frames, decodes Ethernet and IP headers, and prints a readable summary.
- Explain protocol behavior and verify it with a capture or trace.
- Handle edge cases and produce reproducible results.
2. All Theory Needed (Per-Concept Breakdown)
Layering and Encapsulation
Fundamentals
Layering is the core mental model of networking. Each layer wraps the data from the layer above with its own header, creating a structure that can be forwarded and understood without needing to interpret higher-level meaning. This is called encapsulation. At the sender, application data becomes a transport segment, then an IP packet, then a link-layer frame, and finally a stream of bits. At the receiver, the process runs in reverse. Layering matters because it limits complexity: you can troubleshoot and replace one layer without rewriting the others. It also shapes failure modes. A broken DNS resolver looks different from a broken Wi-Fi link even though both present as “the internet is down.” Understanding the boundaries and contracts between layers is the prerequisite for every project in this guide.
Deep Dive
Layering is both a conceptual tool and a real architectural boundary. The OSI model is useful vocabulary, while the TCP/IP model represents how systems are actually built. Each layer provides a service to the layer above and consumes a service from the layer below. The practical consequence is that the same application can run over different link types (Ethernet or Wi-Fi) without changing application logic, because the transport and network layers abstract away those differences. Encapsulation is the mechanism that makes this abstraction possible. The application emits bytes; the transport layer adds source and destination ports and reliability metadata; the network layer adds source and destination IP addresses and routing-related fields; the link layer adds source and destination MAC addresses and a frame check sequence. Each router hop strips and rewrites the link-layer envelope and, within the IP header, decrements the TTL and updates the header checksum; the IP addresses and everything above the IP header pass through untouched. This design creates stability but also creates overhead and limits.
The concept of a “layer” is not just a teaching device. It defines the fields you see on the wire and what devices are allowed to change. Switches operate at Layer 2 and rewrite nothing above it. Routers operate at Layer 3 and do not (in principle) modify transport payloads. However, real networks introduce cross-layer devices like NATs, firewalls, and proxies, which inspect or alter information above their nominal layer. This is why you can have a perfectly valid TCP segment that never reaches its destination: the middleboxes change or block it. Learning to see this boundary between ideal layering and real-world layering is essential for debugging home networks.
Layering also explains why maximum transmission unit (MTU) problems are so confusing. The link layer has a maximum frame size (for Ethernet, this is typically 1500 bytes of payload). The IP layer can either fragment larger packets or rely on Path MTU Discovery to avoid fragmentation. Transport protocols like TCP negotiate a maximum segment size (MSS) that fits within the path MTU. If an intermediate device drops ICMP “fragmentation needed” messages, Path MTU Discovery breaks and connections appear to hang only for large transfers. This failure is a cross-layer interaction: a link-layer size limit, an IP-layer fragmentation and path discovery mechanism that depends on ICMP, and a transport-layer segment size. Layering gives you a vocabulary to explain it, but you must be able to test it with real captures and commands.
Layering is also a performance story. Each layer adds overhead. A tiny application payload can become a much larger frame once transport, network, and link headers are applied. On Wi-Fi, retries and contention can further increase airtime. On a busy home network, that overhead matters. It also affects security: a link-layer encryption protocol like WPA3 protects the frame, but not necessarily the application data once it leaves the Wi-Fi link. Understanding which layer provides confidentiality, integrity, and authentication is a prerequisite for designing secure systems.
Finally, layering shapes how you design tools. A packet sniffer observes data across layers and therefore must decode multiple headers. A port scanner uses the transport layer to infer service behavior. A DNS resolver is an application-layer tool that relies on transport and network layers to function. Each project in this guide is an experiment that isolates and stresses a specific layer contract. The result is a mental model where failures become testable hypotheses rather than mysteries.
How this fits into the projects
You will repeatedly build tools that operate at specific layers (ARP at link, ping at network, DNS at application). This concept helps you decide which headers to read and which variables to control.
Definitions & key terms
- Layer: A conceptual boundary that provides a service to the layer above.
- Encapsulation: Wrapping data with headers as it moves down the stack.
- Decapsulation: Stripping headers as data moves up the stack.
- MTU: Maximum payload size at the link layer.
- MSS: Maximum TCP payload size negotiated by endpoints.
Mental model diagram
APP DATA
|
v
[ TCP hdr | APP DATA ]
|
v
[ IP hdr | TCP hdr | APP DATA ]
|
v
[ ETH hdr | IP hdr | TCP hdr | APP DATA | FCS ]
|
v
BITS ON THE WIRE
How it works
- Application produces bytes.
- Transport adds ports and reliability metadata.
- Network adds IP addressing and TTL.
- Link adds MAC addresses and integrity checks.
- At each router hop the link header is rewritten; the IP addresses stay the same, and only the TTL and header checksum change.
Invariants: Higher layers should not depend on link type.
Failure modes: MTU mismatch, middlebox interference, incorrect assumptions about which layer provides security.
Minimal concrete example
Encapsulation trace (sizes):
APP payload: 120 bytes
TCP header: 20 bytes
IP header: 20 bytes
Ethernet header + FCS: 18 bytes
Total on wire: 178 bytes
Common misconceptions
- “A router forwards frames.” It forwards packets; frames are link-local.
- “NAT is security.” NAT is address translation, not a security control.
Check-your-understanding questions
- Which layer is responsible for port numbers?
- Why does a packet get a new MAC destination at every hop?
Check-your-understanding answers
- Transport layer.
- MAC addresses are only meaningful within a local link.
Real-world applications
- Debugging MTU black holes.
- Explaining why Wi-Fi encryption does not secure end-to-end traffic.
Where you will apply it
- Projects 1-4, 9, 11, 14, 17
References
- “Computer Networks” by Tanenbaum and Wetherall - Ch. 1
- “TCP/IP Illustrated, Vol 1” by Stevens - Ch. 1-2
Key insights
Encapsulation is the reason you can diagnose failures by layer instead of guessing.
Summary
Layering reduces complexity, but real networks leak across layers. You must know both the ideal model and the messy reality.
Homework/Exercises to practice the concept
- Draw the encapsulation stack for a DNS query over UDP.
- Identify which headers change at each hop in a traceroute.
Solutions to the homework/exercises
- APP -> UDP -> IP -> Ethernet, with UDP destination port 53.
- Link-layer headers change every hop; IP header TTL changes each hop.
Link Layer and LAN Behavior (Ethernet, Wi-Fi, ARP, Switching)
Fundamentals
The link layer is the realm of MAC addresses, frames, and local delivery. It is responsible for getting data from one device to another device on the same local network segment. Ethernet and Wi-Fi are the most common link layers in home and office networks. Ethernet uses switches that learn where MAC addresses live and forward frames to the correct port. Wi-Fi is a shared medium where devices contend for airtime and associate with an access point, which then bridges traffic to the wired LAN. ARP (Address Resolution Protocol) is the bridge between IP and MAC addresses, allowing an IP address to be mapped to a link-layer destination. The link layer defines the boundaries of broadcast domains, which strongly influences performance and security.
Deep Dive
The link layer is the place where “local” really means local. A switch builds a table of MAC address to port mappings by observing the source MAC of incoming frames. When it sees a destination MAC it does not know, it floods the frame out all ports (except the one it arrived on). Over time, this learning behavior makes forwarding efficient, but it also creates risks like MAC table overflow or broadcast storms if the network is poorly segmented. Unlike routers, switches do not understand IP addresses; they only see MACs and EtherType values. This is why ARP is necessary. When a host wants to send to an IP on the same subnet, it broadcasts an ARP request asking who owns that IP, and the owner replies with its MAC. That reply is cached for a limited time. If the cache is stale, ARP traffic increases and devices appear to “mysteriously” fail or slow down.
Wi-Fi adds complexity because the medium is shared and half-duplex. Devices must contend for airtime, and interference or poor signal quality can cause retransmissions that look like packet loss at higher layers. An access point is effectively a bridge between the Wi-Fi link and the wired Ethernet LAN. When you see a device “connected” but unable to reach the internet, the cause could be at the association layer (link) rather than at IP or DNS. Another subtlety is that Wi-Fi uses different frame formats and encryption (WPA2/WPA3) to protect frames on the air. This encryption is per-link, not end-to-end. Once a frame leaves the access point and enters the wired LAN, that Wi-Fi encryption no longer applies.
VLANs are a link-layer technique for segmentation. They allow multiple logical networks to share the same physical switch by tagging frames. In a home/office setting, VLANs are used to separate guest networks or IoT devices from trusted devices. This is a key security and performance tool, but it requires that all participating switches and access points handle VLAN tags correctly. Misconfigured VLAN tagging leads to symptoms like DHCP working on one SSID but not another, or devices that can access the router but not other devices.
Understanding link-layer behavior is essential for building scanners and sniffers. A packet sniffer on a wired switch port sees only the frames destined for that port unless you use port mirroring. On Wi-Fi, you might need monitor mode to see frames not destined for your device. This is a practical limitation that affects how you validate your tools. It is also why many network tools appear to “miss” traffic when run on the wrong interface or in the wrong capture mode.
Finally, link-layer security is not the same as network-layer security. ARP has no authentication, which is why ARP spoofing is possible. Wi-Fi encryption does not prevent a malicious device from joining the network if the passphrase is weak. If you understand the link layer, you can explain and detect these risks with concrete evidence, such as duplicate IP address warnings or sudden changes in ARP cache entries.
How this fits into the projects
Projects 1, 4, 12, 15, and 18 force you to understand ARP, frames, and capture limitations.
Definitions & key terms
- MAC address: A link-layer identifier used for local delivery.
- Frame: The unit of data at the link layer.
- ARP: Protocol that maps IP addresses to MAC addresses on a LAN.
- Broadcast domain: The scope of a link-layer broadcast.
- VLAN: A logical segmentation of a link layer.
Mental model diagram
Device A        Switch        Device B
AA:AA          CAM Table        BB:BB
  |                |              |
  | ARP who-has?   |    flood     |
  |--------------->|------------->|
  |                |  ARP reply   |
  |<---------------|<-------------|
  | data frame     |   unicast    |
How it works
- Host wants to send to IP in same subnet.
- Host broadcasts ARP request for target IP.
- Target replies with its MAC address.
- Host caches mapping and sends frames directly.
Invariants: ARP traffic does not cross routers.
Failure modes: ARP cache poisoning, switch flooding, weak Wi-Fi encryption.
Minimal concrete example
ARP exchange (text):
Request: Who has 192.168.1.50? Tell 192.168.1.10
Reply: 192.168.1.50 is at 00:11:22:33:44:55
Common misconceptions
- “Switches block broadcasts.” Switches forward broadcasts to all ports.
- “Wi-Fi is just wireless Ethernet.” The medium behavior and security differ.
Check-your-understanding questions
- Why does ARP not work across subnets?
- What does a switch do with an unknown destination MAC?
Check-your-understanding answers
- ARP requests are link-local broadcasts; routers do not forward them.
- It floods the frame out all ports except the ingress port.
Real-world applications
- Diagnosing why a device is visible but not reachable.
- Segmenting IoT devices with VLANs and guest Wi-Fi.
Where you will apply it
- Projects 1, 4, 12, 15, 18
References
- “TCP/IP Illustrated, Vol 1” by Stevens - Ch. 2, 4
- RFC 826 (ARP)
Key insights
The link layer is the truth of local delivery; everything else is built on it.
Summary
Mastering ARP, MACs, and switching gives you real control over your LAN behavior.
Homework/Exercises to practice the concept
- Map your own ARP cache and identify every device.
- Draw your network and mark the broadcast domain boundaries.
Solutions to the homework/exercises
- Use `arp -a` and compare with your router client list.
- Each router boundary separates a broadcast domain.
3. Project Specification
3.1 What You Will Build
A sniffer that captures frames, decodes Ethernet and IP headers, and prints a readable summary.
Included:
- CLI tool with clear output
- Validation steps and logging
- Documentation of assumptions
Excluded:
- Production-grade performance tuning
- Full security hardening
3.2 Functional Requirements
- Core function: Capture frames on the selected interface and decode Ethernet and IP headers.
- Observable output: Print a readable summary per frame in the format shown under Real World Outcome.
- Error handling: Handle missing privileges, a wrong interface, and truncated or malformed frames gracefully.
3.3 Non-Functional Requirements
- Performance: Complete typical tasks within a few seconds on a LAN.
- Reliability: Fail safely and clearly on errors.
- Usability: Provide concise CLI flags and helpful messages.
3.4 Example Usage / Output
$ sudo ./sniff --iface en0 --filter "tcp"
[14:32:01.102] ETH src=aa:bb:cc:11:22:33 dst=00:11:22:aa:bb:cc type=0x0800
IP src=192.168.1.12 dst=93.184.216.34 ttl=64 proto=TCP
TCP sport=52344 dport=443 flags=SYN
3.5 Data Formats / Schemas / Protocols
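Protocols: Ethernet II, IPv4, and ARP, with TCP and UDP headers decoded far enough to print ports and flags. Related concepts: Layering and Encapsulation, Link Layer and LAN Behavior (Ethernet, Wi-Fi, ARP, Switching).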
3.6 Edge Cases
- Target unreachable or timing out
- Malformed or unexpected responses
- Multiple interfaces or subnets
3.7 Real World Outcome
$ sudo ./sniff --iface en0 --filter "tcp"
[14:32:01.102] ETH src=aa:bb:cc:11:22:33 dst=00:11:22:aa:bb:cc type=0x0800
IP src=192.168.1.12 dst=93.184.216.34 ttl=64 proto=TCP
TCP sport=52344 dport=443 flags=SYN
3.7.1 How to Run (Copy/Paste)
- Build: `make` (or create a virtual environment as needed)
- Run: `./P04-network-packet-sniffer`
- Config: update any constants in a config file or flags
- Working directory: project root
3.7.2 Golden Path Demo (Deterministic)
Run against a known local target and compare with the expected output.
3.7.3 If CLI: provide an exact terminal transcript
$ sudo ./sniff --iface en0 --filter "tcp"
[14:32:01.102] ETH src=aa:bb:cc:11:22:33 dst=00:11:22:aa:bb:cc type=0x0800
IP src=192.168.1.12 dst=93.184.216.34 ttl=64 proto=TCP
TCP sport=52344 dport=443 flags=SYN
4. Solution Architecture
4.1 High-Level Design
CLI Input -> Core Engine -> Output/Logs
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| CLI Parser | Parse flags and inputs | Keep interface minimal |
| Core Engine | Protocol logic and state | Keep deterministic timing |
| Output/Logs | Present results and errors | Use consistent formatting |
4.4 Data Structures (No Full Code)
Request:
- target
- protocol fields
Response:
- status
- timing
State:
- retries
- cache entries
5. Implementation Guide
5.1 Development Environment Setup
- Ensure required tools (tcpdump, dig, ip) are installed.
- Use elevated privileges where raw sockets are required.
5.2 Project Structure
project/
README.md
docs/
src/
tests/
data/
5.3 The Core Question You’re Answering
“What is actually on the wire when my computer talks to the internet?”
5.4 Concepts You Must Understand First
- Ethernet framing
- What fields make up an Ethernet II frame?
- Book Reference: “TCP/IP Illustrated, Vol 1” - Ch. 2
- IPv4 header layout
- Which fields matter for routing and validation?
- Book Reference: “TCP/IP Illustrated, Vol 1” - Ch. 3
- Capture scope
- Why do you only see some traffic on a switched network?
- Book Reference: “Understanding Linux Network Internals” - Ch. 1
5.5 Questions to Guide Your Design
- Parsing order
- How do you detect EtherType and decide next header?
- Performance
- How will you avoid dropping packets under load?
5.6 Thinking Exercise
Parse by hand
Given a hex dump of an Ethernet frame, identify where the IP header starts.
Questions to answer:
- How many bytes are in the Ethernet header?
- How do you tell whether it is IPv4 or ARP?
5.7 The Interview Questions They’ll Ask
- “Why does a sniffer miss traffic on a switched network?”
- “What is promiscuous mode and why is it required?”
- “How do you identify TCP vs UDP in a capture?”
- “What is the difference between capture filters and display filters?”
- “Why might capture timestamps be inaccurate?”
5.8 Hints in Layers
Hint 1: Start with Ethernet only
Decode only Ethernet headers before adding IP parsing.
Hint 2: Use small filters
Capture just ARP or ICMP while testing to reduce noise.
Hint 3: Pseudocode outline
- open capture on interface
- read raw frame bytes
- parse Ethernet header
- if IPv4: parse IP header
- if TCP/UDP: print ports
Hint 4: Validation
Compare your output against tcpdump -n -e for the same traffic.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Ethernet | “TCP/IP Illustrated, Vol 1” | Ch. 2 |
| IPv4 | “TCP/IP Illustrated, Vol 1” | Ch. 3 |
| Linux capture | “Understanding Linux Network Internals” | Ch. 1 |
5.10 Implementation Phases
- Establish core protocol I/O and a minimal success path.
- Add parsing, validation, and timeouts.
- Add logging, metrics, and polish for output.
5.11 Key Implementation Decisions
- Which interface and capture point provides visibility?
- What timeout and retry strategy balances speed and accuracy?
- How will results be validated against reference tools?
6. Testing Strategy
6.1 Test Categories
- Unit: parsing and validation logic
- Integration: protocol exchange with a real device
- System: full run with reference tools
6.2 Critical Test Cases
- Successful request/response path
- Timeout and retry behavior
- Invalid or unexpected input handling
6.3 Test Data
- Local gateway IP and a known reachable host
- A non-routable IP to trigger timeouts
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
Problem 1: “Capture shows nothing”
- Why: Wrong interface or missing permissions.
- Fix: Use the correct interface and run with elevated privileges.
- Quick test: `tcpdump -n -i <iface> -c 1`
Problem 2: “Parsed fields look wrong”
- Why: Misaligned offsets or wrong byte order.
- Fix: Validate header sizes and use network byte order.
- Quick test: Compare with Wireshark field values.
7.2 Debugging Strategies
- Capture traffic with tcpdump or Wireshark.
- Compare against a known-good tool.
- Log timestamps and retry logic.
7.3 Performance Traps
- Excessive retries causing long runtimes.
- Inefficient parsing under high packet rates.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add basic configuration flags for interface and timeout.
- Improve output formatting and sorting.
8.2 Intermediate Extensions
- Add caching or state persistence.
- Add CSV or JSON output export.
8.3 Advanced Extensions
- Add concurrency with careful rate limiting.
- Add visualization or integration with a dashboard.
9. Real-World Connections
9.1 Industry Applications
- Network diagnostics and troubleshooting
- Security monitoring and policy enforcement
9.2 Related Open Source Projects
- tcpdump / Wireshark
- nmap / dnsmasq / unbound (as applicable)
9.3 Interview Relevance
- Explaining protocol behavior
- Diagnosing failures by layer
10. Resources
10.1 Essential Reading
| Topic | Book | Chapter |
|---|---|---|
| Ethernet | “TCP/IP Illustrated, Vol 1” | Ch. 2 |
| IPv4 | “TCP/IP Illustrated, Vol 1” | Ch. 3 |
| Linux capture | “Understanding Linux Network Internals” | Ch. 1 |
10.2 Video Resources
- Wireshark or tcpdump walkthroughs (search for recent tutorials)
- Vendor or RFC explainers for the relevant protocol
10.3 Tools & Documentation
- RFCs for the protocols used in this project
- `man tcpdump`, `man ip`, `man ss`