Project 3: Build Your Own traceroute Utility
A traceroute clone that maps the hop-by-hop path to a destination.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 2: Intermediate |
| Time Estimate | Week |
| Main Programming Language | C |
| Alternative Programming Languages | Python, Go, Rust |
| Coolness Level | Level 3: Genuinely Clever |
| Business Potential | 1. The “Resume Gold” |
| Prerequisites | See concepts below |
| Key Topics | IP Addressing, Subnetting, and Routing (Including NAT), Operations, Diagnostics, and Security |
1. Learning Objectives
- Build and validate: A traceroute clone that maps the hop-by-hop path to a destination.
- Explain protocol behavior and verify it with a capture or trace.
- Handle edge cases and produce reproducible results.
2. All Theory Needed (Per-Concept Breakdown)
IP Addressing, Subnetting, and Routing (Including NAT)
Fundamentals
IP is the addressing and routing system of the internet. IPv4 uses 32-bit addresses and relies on subnet masks (CIDR) to determine which destinations are local versus remote. Your device sends local traffic directly and remote traffic to the default gateway (your router). Routing is a hop-by-hop decision made by routers using the longest prefix match in the routing table. NAT (Network Address Translation) is common at the edge: it maps many private addresses to a single public address. Subnetting is the tool that partitions a network into smaller segments, which is critical for performance, security, and scaling in home/office environments.
Deep Dive
An IP address has two parts: the network prefix and the host identifier. CIDR notation (for example, /24) tells you how many bits belong to the prefix. The prefix defines the boundary of local delivery. If the destination IP shares the same prefix, the host uses ARP to find the destination MAC and sends directly. If not, it sends to the default gateway. This simple rule explains most “why can’t I reach this device” problems. Subnetting is the practice of choosing a prefix length that fits your network size and segmentation goals. Too large a subnet creates unnecessary broadcast traffic and larger failure domains. Too small a subnet causes address shortages or awkward routing rules.
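The local-versus-remote rule above can be sketched in a few lines of C. This is a minimal illustration, not part of the project code: the helper names `prefix_to_mask`, `ipv4`, and `is_on_link` are hypothetical, and addresses are held as host-order 32-bit integers for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a netmask from a prefix length, e.g. 24 -> 0xFFFFFF00.
 * Addresses in this sketch are plain host-order 32-bit integers. */
uint32_t prefix_to_mask(unsigned prefix_len)
{
    assert(prefix_len <= 32);
    /* Shifting a 32-bit value by 32 is undefined in C, so /0 is special-cased. */
    return prefix_len == 0 ? 0u : 0xFFFFFFFFu << (32 - prefix_len);
}

/* Pack four dotted-quad octets into one host-order integer. */
uint32_t ipv4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | (uint32_t)d;
}

/* True when dst shares the sender's prefix (resolve with ARP, send directly).
 * False means the packet is handed to the default gateway. */
bool is_on_link(uint32_t self, uint32_t dst, unsigned prefix_len)
{
    uint32_t mask = prefix_to_mask(prefix_len);
    return (self & mask) == (dst & mask);
}
```

For a host at 192.168.1.50/24, `is_on_link` is true for 192.168.1.1 and false for 93.184.216.34, which is why the second destination goes through the gateway.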
Routing decisions are made by examining the destination IP and finding the most specific (longest) match in the routing table. A default route (0.0.0.0/0) catches everything else. Each router decrements the TTL field, which prevents routing loops from persisting forever. When TTL reaches zero, the router drops the packet and sends an ICMP Time Exceeded message back to the source. This is the mechanism that traceroute exploits. In home networks, your router usually has a handful of routes: the local LAN, perhaps a guest LAN, and a default route to your ISP. In small office networks, you might also add static routes to reach a lab subnet, or a VPN route to reach a remote office. Misconfigured routes can cause asymmetric paths, which are hard to debug without captures.
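Longest prefix match is easy to state and easy to get subtly wrong, so a small sketch may help. It assumes a linear scan over a tiny table, which is fine for illustration even though real routers use tries or hardware lookups. The `struct route` layout and the `route_lookup` and `mask_for` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct route {
    uint32_t prefix;       /* network address, host byte order */
    unsigned prefix_len;   /* 0..32 */
    const char *next_hop;  /* label only; a real table stores an address or interface */
};

uint32_t mask_for(unsigned prefix_len)
{
    assert(prefix_len <= 32);
    return prefix_len == 0 ? 0u : 0xFFFFFFFFu << (32 - prefix_len);
}

/* Return the most specific route that matches dst, or NULL when none does. */
const struct route *route_lookup(const struct route *table, size_t n, uint32_t dst)
{
    const struct route *best = NULL;
    for (size_t i = 0; i < n; i++) {
        uint32_t mask = mask_for(table[i].prefix_len);
        if ((dst & mask) != (table[i].prefix & mask))
            continue;                       /* this route does not cover dst */
        if (best == NULL || table[i].prefix_len > best->prefix_len)
            best = &table[i];               /* longer prefix wins */
    }
    return best;
}
```

A `/0` entry has an all-zero mask, so it matches every destination but loses to any longer prefix, which is exactly how a default route behaves.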
NAT adds a stateful translation table at the edge. It replaces private source IPs and ports with a public IP and a chosen source port (often called PAT). When replies come back, the router uses this table to translate them back to the internal host. This is how many devices share one public address. NAT also breaks the original end-to-end model of the internet, which is why inbound connections typically require explicit port forwarding. Understanding NAT is critical for troubleshooting “I can browse the web but cannot host a server” issues, and for understanding why some peer-to-peer applications struggle. NAT is not a firewall, but it has similar observable effects because unsolicited inbound traffic is dropped by default when no translation table entry exists.
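To make the translation table concrete, here is a deliberately tiny PAT sketch. Everything in it is an assumption made for illustration: the fixed table size, the starting port 45001, and the names `nat_outbound` and `nat_inbound`. Real NAT implementations also key on protocol and destination and expire idle entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NAT_MAX_ENTRIES 8      /* tiny on purpose; real tables are far larger */
#define NAT_FIRST_PORT  45001  /* arbitrary starting port for this sketch */

struct nat_entry {
    uint32_t inside_ip;    /* private source address */
    uint16_t inside_port;  /* private source port */
    uint16_t public_port;  /* port used on the public address */
};

struct nat_table {
    struct nat_entry entries[NAT_MAX_ENTRIES];
    size_t count;
};

/* Outbound flow: reuse or create a mapping. Returns the public port,
 * or 0 when the table is exhausted (one of the failure modes of NAT). */
uint16_t nat_outbound(struct nat_table *t, uint32_t inside_ip, uint16_t inside_port)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (t->entries[i].inside_ip == inside_ip &&
            t->entries[i].inside_port == inside_port)
            return t->entries[i].public_port;
    }
    if (t->count == NAT_MAX_ENTRIES)
        return 0;
    struct nat_entry *e = &t->entries[t->count];
    e->inside_ip = inside_ip;
    e->inside_port = inside_port;
    e->public_port = (uint16_t)(NAT_FIRST_PORT + t->count);
    t->count++;
    return e->public_port;
}

/* Inbound packet: find the inside host for a public port.
 * NULL means no entry exists, so unsolicited traffic is dropped. */
const struct nat_entry *nat_inbound(const struct nat_table *t, uint16_t public_port)
{
    assert(t != NULL);
    for (size_t i = 0; i < t->count; i++) {
        if (t->entries[i].public_port == public_port)
            return &t->entries[i];
    }
    return NULL;
}
```

The `NULL` return from `nat_inbound` is the "similar observable effect" to a firewall: nothing was filtered by policy, there was simply no state to translate against.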
IPv6 changes the addressing landscape. It uses 128-bit addresses, which removes the address-scarcity motivation for NAT at the edge. Instead of ARP, IPv6 uses Neighbor Discovery (ND). IPv6 hosts often use Stateless Address Autoconfiguration (SLAAC) to build their own addresses from router advertisements. In practice, many home networks operate dual-stack (IPv4 and IPv6). This means troubleshooting can involve two parallel protocol stacks, with different failure modes. A device might fail over IPv4 but work over IPv6, or the reverse. Understanding IP addressing in both versions makes you far more effective at diagnosing real-world issues.
Finally, routing and addressing are where policy is enforced. VLANs map to IP subnets. Firewall rules frequently reference IP ranges. VPNs create new routes. If you know how to design and reason about addresses and routing tables, you can predict how traffic will flow before you even run a packet capture. That ability is what transforms you from a user of networks into an engineer of networks.
How this fits into the projects
Subnet math and routing logic are required for Projects 1-3, 9, 14, 18, and 19.
Definitions & key terms
- CIDR: Prefix notation for networks (e.g., /24).
- Default gateway: Router used for off-subnet traffic.
- Longest prefix match: Routing rule that chooses the most specific route.
- NAT/PAT: Translation of internal addresses to a public address with ports.
Mental model diagram
LAN 192.168.1.0/24 Router/NAT Internet
Host 192.168.1.50 -> [NAT table] -> 203.0.113.5:45001
How it works
- Host checks if destination is in local subnet.
- If local, ARP and send directly; if not, send to gateway.
- Router chooses route via longest prefix match.
- NAT rewrites source IP/port for outbound flows.
Invariants: TTL always decrements at routers.
Failure modes: wrong subnet mask, missing default route, NAT table exhaustion.
Minimal concrete example
Routing table snippet:
192.168.1.0/24 -> eth0 (direct)
0.0.0.0/0 -> 192.168.1.1 (default)
Common misconceptions
- “Two devices with the same IP will work if they are on different switches.” They will conflict if on the same subnet.
- “NAT protects me from all inbound attacks.” It does not replace a firewall.
Check-your-understanding questions
- Why does a /24 network have 254 usable addresses?
- What happens when no route matches a destination?
Check-your-understanding answers
- Two addresses are reserved: network and broadcast.
- The packet is dropped (and often an ICMP unreachable is sent).
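The first answer is just arithmetic, and a one-function sketch shows it. The name `usable_hosts` is hypothetical, and the function deliberately covers only /1 through /30, since /31 and /32 follow special rules.

```c
#include <assert.h>
#include <stdint.h>

/* Usable host addresses in an IPv4 subnet: 2^(32 - prefix_len) total,
 * minus the network address and the broadcast address. */
uint32_t usable_hosts(unsigned prefix_len)
{
    assert(prefix_len >= 1 && prefix_len <= 30);
    return (UINT32_C(1) << (32 - prefix_len)) - 2;
}
```

A /24 leaves 8 host bits, so 256 addresses minus 2 reserved gives 254.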
Real-world applications
- Planning guest and IoT subnets.
- Diagnosing port forwarding and inbound access problems.
Where you will apply it
- Projects 1-3, 9, 14, 18, 19
References
- “TCP/IP Illustrated, Vol 1” by Stevens - Ch. 3
- RFC 791 (IPv4), RFC 8200 (IPv6)
Key insights
Addressing and routing are the map; NAT and policy are the gatekeepers.
Summary
If you can compute subnets and read routing tables, you can predict traffic flow.
Homework/Exercises to practice the concept
- Divide 192.168.10.0/24 into four /26 subnets.
- Sketch a routing table for a network with a guest VLAN and a VPN.
Solutions to the homework/exercises
- /26 blocks at .0, .64, .128, .192.
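The first solution can be checked mechanically. The sketch below splits a network into equal smaller subnets. The function name `split_subnet` and its signature are hypothetical, and it is limited to small splits to keep the arithmetic obvious.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split base/old_len into subnets of length new_len.
 * Writes up to out_cap network addresses (host byte order) into out
 * and returns how many were written. */
size_t split_subnet(uint32_t base, unsigned old_len, unsigned new_len,
                    uint32_t *out, size_t out_cap)
{
    assert(out != NULL);
    assert(new_len >= 1 && old_len <= new_len && new_len <= 32);
    assert(new_len - old_len < 16);                /* keep the sketch to small splits */

    size_t count = (size_t)1 << (new_len - old_len);   /* number of child subnets */
    uint32_t step = UINT32_C(1) << (32 - new_len);     /* addresses per child subnet */
    size_t written = 0;
    for (size_t i = 0; i < count && written < out_cap; i++)
        out[written++] = base + (uint32_t)i * step;
    return written;
}
```

Going from /24 to /26 borrows 2 bits, so there are 4 subnets of 64 addresses each, starting at .0, .64, .128, and .192.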
- Include routes for each subnet plus a default route to the ISP.
Operations, Diagnostics, and Security
Fundamentals
Diagnostics and security are where theory becomes practical. ICMP provides feedback about connectivity, latency, and routing failures. Tools like ping and traceroute rely on ICMP to make network paths visible. Packet capture reveals the truth of what is happening on the wire, which is essential when logs or assumptions are wrong. Security in home/office networks is built on segmentation, stateful filtering, and secure Wi-Fi configuration. Understanding these tools and controls lets you diagnose failures quickly and prevent common threats.
Deep Dive
ICMP is often misunderstood as “just ping,” but it is the control plane of IP. It reports unreachable destinations, TTL expiry, and fragmentation requirements. Traceroute manipulates TTL values to force routers along a path to send ICMP Time Exceeded messages, which reveals hop-by-hop routing. Understanding ICMP types and codes lets you differentiate between “host unreachable” and “port unreachable,” which can save hours of guessing. But ICMP is not guaranteed to be delivered, so diagnostics must be combined with other evidence like TCP resets and packet captures.
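Type and code values are what a traceroute clone actually branches on. The sketch below maps the values defined in RFC 792 to a verdict. The enum and the function name `classify_icmp` are hypothetical, and treating Port Unreachable as "destination reached" assumes the probes are UDP datagrams sent to an unused port.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum hop_verdict {
    HOP_INTERMEDIATE,  /* a router on the path answered */
    HOP_DESTINATION,   /* the target itself answered */
    HOP_UNREACHABLE,   /* the network reported a delivery failure */
    HOP_IGNORE         /* unrelated message, or too short to judge */
};

/* icmp points at the first byte of the ICMP header (type, then code). */
enum hop_verdict classify_icmp(const uint8_t *icmp, size_t len)
{
    assert(icmp != NULL);
    if (len < 2)
        return HOP_IGNORE;

    uint8_t type = icmp[0];
    uint8_t code = icmp[1];
    switch (type) {
    case 11:  /* Time Exceeded; code 0 = TTL exceeded in transit */
        return code == 0 ? HOP_INTERMEDIATE : HOP_IGNORE;
    case 3:   /* Destination Unreachable; code 3 = port unreachable */
        return code == 3 ? HOP_DESTINATION : HOP_UNREACHABLE;
    case 0:   /* Echo Reply, seen when the probes are ICMP echo requests */
        return HOP_DESTINATION;
    default:
        return HOP_IGNORE;
    }
}
```

Host unreachable (type 3, code 1) and port unreachable (type 3, code 3) share a type but mean opposite things for a trace: the first is a failure, the second is the normal end of a UDP-based run.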
Packet capture is the definitive tool for understanding network behavior. A capture shows headers at every layer, timestamps, retransmissions, and the exact sequence of events that produced a failure. It also reveals hidden behaviors, such as DNS retries, TCP window size changes, or unexpected multicast traffic. Effective capture requires careful filtering and knowledge of what you are looking for. On switched networks, you may only see traffic destined for your host, so you may need port mirroring or monitor mode on Wi-Fi. Without this awareness, you can falsely conclude that traffic is not present when you are simply not observing the right link.
Security in home networks is primarily about reducing attack surface and limiting trust. A stateful firewall tracks connection state and allows return traffic while blocking unsolicited inbound flows. This is different from NAT, though NAT produces a similar effect. Segmentation isolates devices so that a compromised IoT device cannot reach sensitive systems. Wi-Fi security must protect the air interface using WPA2 or WPA3, and should avoid legacy configurations like WEP or open networks. VPNs create encrypted tunnels across untrusted networks; they can be site-to-site (linking networks) or remote access (linking a device to a network). VPNs also alter routing by creating new routes over the tunnel interface, which is why they sometimes break local network access or change DNS behavior.
Operational visibility is not just about troubleshooting failures. It is about validating performance and policy. A bandwidth monitor can show whether a device is saturating the uplink. A DNS sinkhole can show which devices are contacting malicious domains. A firewall log can show attempted scans or misconfigured services. These tools are the foundation of a healthy home or office network.
Finally, security is a system property, not a single setting. A strong Wi-Fi passphrase is meaningless if devices are on the same flat network with no segmentation. A firewall rule is only as good as the routing table that feeds it. The projects in this guide are structured to force you to test and verify these properties, not just configure them.
How this fits into the projects
Projects 2-4 and 9-20 use diagnostics and security concepts directly.
Definitions & key terms
- ICMP: Control messages for IP.
- Stateful firewall: Filters traffic based on connection state.
- Segmentation: Separating devices into isolated networks.
- VPN: Encrypted tunnel over untrusted networks.
Mental model diagram
Client -> Router -> Internet
| | |
| firewall state |
| ICMP feedback |
+-> capture point |
How it works
- Use ICMP to test reachability and path.
- Capture packets to confirm actual behavior.
- Enforce policy with firewall and segmentation.
- Use VPNs to extend trust securely.
Invariants: ICMP is best-effort, not guaranteed.
Failure modes: blocked ICMP, asymmetric routing, misapplied firewall rules.
Minimal concrete example
Traceroute logic:
TTL=1 -> ICMP Time Exceeded from hop 1
TTL=2 -> ICMP Time Exceeded from hop 2
...
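The ladder above can be simulated without touching the network, which is a useful way to check your mental model before writing socket code. The function name `responder_for_ttl` is hypothetical, and the model assumes every router decrements TTL by exactly one and always replies.

```c
#include <assert.h>

/* Path model: `routers` routers in a row, then the destination host.
 * Returns the 1-based index of the node that answers a probe sent with `ttl`.
 * Indices 1..routers are routers (ICMP Time Exceeded);
 * index routers + 1 is the destination itself. */
unsigned responder_for_ttl(unsigned routers, unsigned ttl)
{
    assert(ttl >= 1);
    for (unsigned hop = 1; hop <= routers; hop++) {
        ttl--;              /* each router decrements TTL before forwarding */
        if (ttl == 0)
            return hop;     /* this router drops the packet and reports back */
    }
    return routers + 1;     /* TTL outlived every router: the destination answers */
}
```

With four routers in front of the target, TTL values 1 through 4 expose one router each, and any TTL of 5 or more reaches the destination.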
Common misconceptions
- “If ping fails, the host is down.” ICMP may be blocked.
- “NAT equals firewall.” NAT does not express security policy.
Check-your-understanding questions
- Why can traceroute fail even when the destination is reachable?
- Why might a VPN break access to local printers?
Check-your-understanding answers
- Routers or hosts may block ICMP Time Exceeded messages.
- VPN routes may override local routes, sending traffic into the tunnel.
Real-world applications
- Diagnosing intermittent Wi-Fi dropouts.
- Segmenting IoT devices away from trusted systems.
Where you will apply it
- Projects 2-4, 9-20
References
- RFC 792 (ICMP)
- “TCP/IP Illustrated, Vol 1” by Stevens - Ch. 6
Key insights
You cannot secure or fix what you cannot observe.
Summary
Diagnostics and security are practical disciplines that transform network theory into reliable systems.
Homework/Exercises to practice the concept
- Capture a TCP handshake and annotate each packet.
- Design a guest network segmentation plan for your home.
Solutions to the homework/exercises
- Identify SYN, SYN-ACK, ACK, then the first data packet.
- Create a separate VLAN or SSID with no access to LAN subnets.
3. Project Specification
3.1 What You Will Build
A traceroute clone that maps the hop-by-hop path to a destination.
Included:
- CLI tool with clear output
- Validation steps and logging
- Documentation of assumptions
Excluded:
- Production-grade performance tuning
- Full security hardening
3.2 Functional Requirements
- Core function: Implement the primary behavior described in the project goal.
- Observable output: Produce deterministic output comparable to the Real World Outcome.
- Error handling: Handle timeouts, invalid inputs, and unreachable hosts gracefully.
3.3 Non-Functional Requirements
- Performance: Complete typical tasks within a few seconds on a LAN.
- Reliability: Fail safely and clearly on errors.
- Usability: Provide concise CLI flags and helpful messages.
3.4 Example Usage / Output
$ sudo ./mytrace example.com
1 192.168.1.1 1.2 ms 1.0 ms 1.1 ms
2 10.0.0.1 8.4 ms 7.9 ms 8.1 ms
3 203.0.113.9 15.3 ms 15.0 ms 15.2 ms
4 198.51.100.5 25.0 ms 24.7 ms 25.4 ms
5 93.184.216.34 34.1 ms 33.9 ms 34.2 ms
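One possible way to produce rows in this shape is sketched below. The function name `format_hop`, the fixed three probes per hop, and the convention that a negative RTT means "timed out" are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PROBES_PER_HOP 3

/* Format one output row into buf. A negative RTT marks a probe that timed out
 * and is printed as "*". addr may be NULL when no reply arrived at all.
 * Returns the length of the formatted row. */
int format_hop(char *buf, size_t cap, int hop, const char *addr,
               const double rtt_ms[PROBES_PER_HOP])
{
    assert(buf != NULL && cap > 0 && rtt_ms != NULL);

    snprintf(buf, cap, "%d", hop);
    if (addr != NULL) {
        size_t used = strlen(buf);
        snprintf(buf + used, cap - used, " %s", addr);
    }
    for (int i = 0; i < PROBES_PER_HOP; i++) {
        size_t used = strlen(buf);   /* snprintf always terminates when cap > 0 */
        if (rtt_ms[i] < 0)
            snprintf(buf + used, cap - used, " *");
        else
            snprintf(buf + used, cap - used, " %.1f ms", rtt_ms[i]);
    }
    return (int)strlen(buf);
}
```

Keeping formatting separate from probing makes the output easy to unit test, and a hop with no replies renders as `2 * * *`.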
3.5 Data Formats / Schemas / Protocols
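Protocols: IPv4 (TTL field), ICMP (Time Exceeded from intermediate routers, plus the reply that marks the destination), and UDP or ICMP echo probes. Background concepts: IP Addressing, Subnetting, and Routing (Including NAT); Operations, Diagnostics, and Security.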
3.6 Edge Cases
- Target unreachable or timing out
- Malformed or unexpected responses
- Multiple interfaces or subnets
3.7 Real World Outcome
$ sudo ./mytrace example.com
1 192.168.1.1 1.2 ms 1.0 ms 1.1 ms
2 10.0.0.1 8.4 ms 7.9 ms 8.1 ms
3 203.0.113.9 15.3 ms 15.0 ms 15.2 ms
4 198.51.100.5 25.0 ms 24.7 ms 25.4 ms
5 93.184.216.34 34.1 ms 33.9 ms 34.2 ms
3.7.1 How to Run (Copy/Paste)
- Build: make (or create a virtual environment as needed)
- Run: ./P03-build-your-own-traceroute
- Config: update any constants in a config file or flags
- Working directory: project root
3.7.2 Golden Path Demo (Deterministic)
Run against a known local target and compare with the expected output.
3.7.3 If CLI: provide an exact terminal transcript
$ sudo ./mytrace example.com
1 192.168.1.1 1.2 ms 1.0 ms 1.1 ms
2 10.0.0.1 8.4 ms 7.9 ms 8.1 ms
3 203.0.113.9 15.3 ms 15.0 ms 15.2 ms
4 198.51.100.5 25.0 ms 24.7 ms 25.4 ms
5 93.184.216.34 34.1 ms 33.9 ms 34.2 ms
4. Solution Architecture
4.1 High-Level Design
CLI Input -> Core Engine -> Output/Logs
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| CLI Parser | Parse flags and inputs | Keep interface minimal |
| Core Engine | Protocol logic and state | Keep deterministic timing |
| Output/Logs | Present results and errors | Use consistent formatting |
4.4 Data Structures (No Full Code)
Request:
- target
- protocol fields
Response:
- status
- timing
State:
- retries
- cache entries
5. Implementation Guide
5.1 Development Environment Setup
- Ensure required tools (tcpdump, dig, ip) are installed.
- Use elevated privileges where raw sockets are required.
5.2 Project Structure
project/
README.md
docs/
src/
tests/
data/
5.3 The Core Question You’re Answering
“What path does my traffic take, and which router boundaries exist between me and a destination?”
5.4 Concepts You Must Understand First
- TTL behavior
- How does TTL prevent loops?
- Book Reference: “TCP/IP Illustrated, Vol 1” - Ch. 8
- ICMP Time Exceeded
- What generates it and what data it includes?
- Book Reference: “TCP/IP Illustrated, Vol 1” - Ch. 6
- Routing basics
- Why do hops change at router boundaries?
- Book Reference: “Computer Networks” - Ch. 5
5.5 Questions to Guide Your Design
- Probe type
- Use UDP probes or ICMP echo? What tradeoffs?
- Timeout and retries
- How many probes per hop? How long to wait?
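If you choose UDP probes, the sending side can be an ordinary datagram socket with its TTL set through `setsockopt`. The sketch below assumes a POSIX system; `open_probe_socket` and `probe_port` are hypothetical names. Port 33434 is the conventional base port for UDP traceroute. Reading the ICMP replies still needs a separate mechanism such as a raw ICMP socket, which is where elevated privileges usually come in.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#define BASE_PORT 33434  /* conventional first destination port for UDP traceroute */

/* Create a UDP socket whose outgoing packets carry the given TTL.
 * Returns the file descriptor, or -1 on failure. */
int open_probe_socket(int ttl)
{
    assert(ttl >= 1 && ttl <= 255);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_IP, IP_TTL, &ttl, sizeof ttl) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Destination port for the seq-th probe (0-based). Using a different port
 * per probe lets each ICMP reply be matched to the probe that caused it. */
unsigned short probe_port(unsigned seq)
{
    return (unsigned short)(BASE_PORT + seq);
}
```

The tradeoff to weigh: UDP probes need no privileges to send, while ICMP echo probes tend to pass through more firewalls but require a raw or ICMP datagram socket on both the send and receive side.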
5.6 Thinking Exercise
TTL ladder
Predict what you should see for TTL values 1, 2, and 3 when tracing to a remote host.
Questions to answer:
- Which hop will send the ICMP reply?
- What changes between hops besides RTT?
5.7 The Interview Questions They’ll Ask
- “How does traceroute work at the protocol level?”
- “Why can traceroute results differ between runs?”
- “What is the difference between UDP-based and ICMP-based traceroute?”
- “Why might some hops show as * * *?”
- “What does a sudden RTT spike at a hop indicate?”
5.8 Hints in Layers
Hint 1: Start with a small max hop
Limit max hops to avoid long runs while debugging.
Hint 2: Capture ICMP replies
Filter for ICMP type 11 to validate the TTL behavior.
Hint 3: Pseudocode outline
for ttl in 1..max:
    send probe with TTL
    wait for ICMP time exceeded or destination reached
    record hop IP and RTT
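The "wait for ICMP" step hides most of the parsing work. An ICMP error quotes the IP header and first 8 bytes of the datagram that triggered it, which is enough to recover the UDP destination port of your own probe. The sketch below assumes UDP probes and a buffer that starts with the outer IPv4 header, as IPv4 raw sockets typically deliver it. The names `reply_info` and `parse_icmp_error` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct reply_info {
    uint8_t  type;        /* ICMP type (11 = Time Exceeded, 3 = Dest Unreachable) */
    uint8_t  code;        /* ICMP code */
    uint16_t probe_port;  /* UDP destination port quoted from the original probe */
};

/* Returns false when the packet is too short, is not an ICMP error,
 * or does not quote a UDP datagram. */
bool parse_icmp_error(const uint8_t *pkt, size_t len, struct reply_info *out)
{
    assert(pkt != NULL && out != NULL);

    /* Outer IPv4 header: version in the high nibble, header length in 32-bit words. */
    if (len < 20 || (pkt[0] >> 4) != 4)
        return false;
    size_t outer_hl = (size_t)(pkt[0] & 0x0F) * 4;
    if (outer_hl < 20 || len < outer_hl + 8)
        return false;
    if (pkt[9] != 1)                       /* protocol 1 = ICMP */
        return false;

    const uint8_t *icmp = pkt + outer_hl;  /* 8-byte ICMP error header */
    if (icmp[0] != 11 && icmp[0] != 3)
        return false;
    out->type = icmp[0];
    out->code = icmp[1];

    /* Quoted original datagram: inner IPv4 header plus 8 bytes of payload. */
    const uint8_t *inner = icmp + 8;
    size_t remaining = len - outer_hl - 8;
    if (remaining < 20 || (inner[0] >> 4) != 4)
        return false;
    size_t inner_hl = (size_t)(inner[0] & 0x0F) * 4;
    if (inner_hl < 20 || remaining < inner_hl + 8)
        return false;
    if (inner[9] != 17)                    /* protocol 17 = UDP */
        return false;

    const uint8_t *udp = inner + inner_hl; /* src port, then dst port, big-endian */
    out->probe_port = (uint16_t)((udp[2] << 8) | udp[3]);
    return true;
}
```

The hop address itself comes from the source address of the outer header (or from `recvfrom`), so this function only has to answer "was this reply caused by my probe, and which one?"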
Hint 4: Compare with system traceroute
Run traceroute or tracepath and compare hop counts.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| ICMP errors | “TCP/IP Illustrated, Vol 1” | Ch. 6 |
| Routing | “Computer Networks” | Ch. 5 |
| IP header | “TCP/IP Illustrated, Vol 1” | Ch. 3 |
5.10 Implementation Phases
- Establish core protocol I/O and a minimal success path.
- Add parsing, validation, and timeouts.
- Add logging, metrics, and polish for output.
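If the first phase uses ICMP echo probes rather than UDP, the ICMP header needs a valid Internet checksum before the packet is sent. The sketch below follows the one's-complement sum described in RFC 1071. The function name `internet_checksum` is hypothetical, and the result is the 16-bit value to store in the header high byte first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement checksum over data, treating it as big-endian 16-bit words.
 * The checksum field itself must be zero while this is computed. */
uint16_t internet_checksum(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t sum = 0;

    while (len > 1) {
        sum += (uint32_t)((data[0] << 8) | data[1]);
        data += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)(data[0] << 8);   /* odd trailing byte is padded with zero */

    while (sum >> 16)                      /* fold carries back into the low 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A quick self-check: recomputing the checksum over a header that already contains a correct checksum yields 0, which is also how a receiver validates the packet.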
5.11 Key Implementation Decisions
- Which interface and capture point provides visibility?
- What timeout and retry strategy balances speed and accuracy?
- How will results be validated against reference tools?
6. Testing Strategy
6.1 Test Categories
- Unit: parsing and validation logic
- Integration: protocol exchange with a real device
- System: full run with reference tools
6.2 Critical Test Cases
- Successful request/response path
- Timeout and retry behavior
- Invalid or unexpected input handling
6.3 Test Data
- Local gateway IP and a known reachable host
- A non-routable IP to trigger timeouts
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
Problem 1: “All hops are * * *”
- Why: ICMP replies blocked by routers or firewall.
- Fix: Try ICMP-based probes or different destination.
- Quick test: Run system traceroute for comparison.
Problem 2: “Stops early”
- Why: Destination blocked, rate-limited, or TTL max too low.
- Fix: Increase max hops and adjust timeouts.
- Quick test: Use ping to confirm reachability.
7.2 Debugging Strategies
- Capture traffic with tcpdump or Wireshark.
- Compare against a known-good tool.
- Log timestamps and retry logic.
7.3 Performance Traps
- Excessive retries causing long runtimes.
- Inefficient parsing under high packet rates.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add basic configuration flags for interface and timeout.
- Improve output formatting and sorting.
8.2 Intermediate Extensions
- Add caching or state persistence.
- Add CSV or JSON output export.
8.3 Advanced Extensions
- Add concurrency with careful rate limiting.
- Add visualization or integration with a dashboard.
9. Real-World Connections
9.1 Industry Applications
- Network diagnostics and troubleshooting
- Security monitoring and policy enforcement
9.2 Related Open Source Projects
- tcpdump / Wireshark
- nmap / dnsmasq / unbound (as applicable)
9.3 Interview Relevance
- Explaining protocol behavior
- Diagnosing failures by layer
10. Resources
10.1 Essential Reading
| Topic | Book | Chapter |
|---|---|---|
| ICMP errors | “TCP/IP Illustrated, Vol 1” | Ch. 6 |
| Routing | “Computer Networks” | Ch. 5 |
| IP header | “TCP/IP Illustrated, Vol 1” | Ch. 3 |
10.2 Video Resources
- Wireshark or tcpdump walkthroughs (search for recent tutorials)
- Vendor or RFC explainers for the relevant protocol
10.3 Tools & Documentation
- RFCs for the protocols used in this project
- man tcpdump, man ip, man ss
10.4 Related Projects in This Series
- Project 1: Network Device Scanner (ARP Discovery Tool)
- Project 2: Build Your Own ping Utility
- Project 6: DHCP Client
- Project 8: DHCP Server
- Project 9: Port Scanner
- Project 10: DNS Sinkhole (Pi-hole Style)
- Project 12: Bandwidth Monitor
- Project 13: Simple Packet Filter Firewall
- Project 14: Software Router with NAT
- Project 15: Wake-on-LAN Tool
- Project 17: HTTP Proxy Server
- Project 18: Network Topology Mapper
- Project 19: Simple VPN Server
- Project 20: Complete Home Network Stack (Capstone)