Project 5: BGP Path Visualizer Lab

Build a multi-vantage routing analysis CLI that maps destination paths to ASNs and surfaces interconnection risk signals relevant to telecom quality.

Quick Reference

Attribute Value
Difficulty Level 2: Intermediate
Time Estimate 8-14 hours
Main Programming Language Python
Alternative Programming Languages Go, JavaScript
Coolness Level Level 3: Genuinely Clever
Business Potential 2. The “Micro-SaaS / Pro Tool”
Prerequisites Basic APIs/JSON, networking fundamentals
Key Topics ASN mapping, path normalization, peering/transit inference

1. Learning Objectives

By completing this project, you will:

  1. Retrieve and normalize multi-probe routing path data.
  2. Map hops to ASNs and organizations.
  3. Compare route diversity across regions.
  4. Distinguish observed facts from inferred interconnection relationships.
  5. Generate operator-friendly risk summaries.

2. All Theory Needed (Per-Concept Breakdown)

2.1 BGP Policy and Path Selection

Fundamentals

BGP exchanges reachability between autonomous systems. Path choices are policy-driven, not purely latency-optimized. Telecom quality outcomes often depend on these policies.

Deep Dive into the concept

AS-level routing is shaped by business relationships and local policies. A destination can be reachable through multiple upstreams, and each region/provider may choose differently. This produces path asymmetry and quality variation for the same service.

Key practical consequence: one local traceroute is insufficient. You need multiple vantage points to understand user-facing behavior. For communication services, this matters because jitter and congestion can originate in interconnect segments far from endpoints.

Your tool should therefore aggregate paths from multiple probes and produce normalized AS-path summaries. It should preserve uncertainty when mapping hop data to AS-level relationships.

How this fits into the projects

  • Standalone concept for this project.
  • Supports capstone troubleshooting and provider strategy.

Definitions & key terms

  • ASN -> Autonomous system identifier.
  • AS Path -> Ordered sequence of ASNs to destination.
  • Peering -> Bilateral network interconnection.
  • Transit -> Paid upstream reachability service.

Mental model diagram

Probe A -> AS6453 -> AS3356 -> AS15169
Probe B -> AS7018 -> AS1299 -> AS15169

How it works

  1. Resolve target.
  2. Query path data from multiple vantage points.
  3. Map hops to ASNs.
  4. Compare and summarize path diversity.

Failure modes: missing ASN mappings, unstable probe selection.
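Steps 3-4 above can be sketched in a few lines of Python. This is a minimal sketch, not a real API integration: the probe names, input shape, and the `normalize_as_path`/`summarize` helpers are illustrative assumptions, and a real tool would fetch hop data from a measurement service rather than hard-code it.

```python
from collections import Counter

def normalize_as_path(raw_hops):
    """Collapse consecutive duplicate ASNs (e.g. prepending) and keep unknowns explicit."""
    path = []
    for asn in raw_hops:
        if asn is None:
            asn = "?"          # missing mapping stays visible, never silently dropped
        if not path or path[-1] != asn:
            path.append(asn)
    return tuple(path)

def summarize(paths_by_probe):
    """Compare normalized AS paths across vantage points."""
    normalized = {p: normalize_as_path(hops) for p, hops in paths_by_probe.items()}
    unique = Counter(normalized.values())
    return {"paths": normalized, "unique_paths": len(unique)}

summary = summarize({
    "eu-west": [3333, 3356, 3356, 15169],   # repeated hop collapses to one ASN
    "us-east": [7018, None, 1299, 15169],   # one hop has no ASN mapping
})
```

Note how the unknown hop survives as `"?"` rather than disappearing, which is exactly the "preserve uncertainty" requirement from the deep dive.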

Minimal concrete example

target=voice.example.com
paths:
 EU: AS3333 -> AS3356 -> AS15169
 US: AS7018 -> AS1299 -> AS15169

Common misconceptions

  • “Shortest AS path always means best quality.” -> policy and congestion can dominate.

Check-your-understanding questions

  1. Why can two users see different paths to the same destination?
  2. Why is multi-vantage analysis essential?

Check-your-understanding answers

  1. Different upstream relationships and local policies.
  2. Single-vantage data hides regional variability.

Real-world applications

  • Carrier incident triage.
  • Multi-provider capacity planning.

Where you’ll apply it

  • Final capstone quality/risk reporting.

References

  • RFC 4271
  • APNIC BGP in 2025 report

Key insights

Interconnection policy strongly influences perceived communication quality.

Summary

AS-path literacy links local service symptoms to internet-scale causes.

Homework/Exercises

  1. Compare three regions for one target.
  2. Identify likely single points of upstream dependency.

Solutions

  1. Normalize paths before comparison.
  2. Flag repeated single-upstream patterns.

2.2 Path Observability and Inference Discipline

Fundamentals

Routing tools often invite overconfident conclusions. Strong engineering separates observed facts from inferred relationship labels.

Deep Dive into the concept

Observed facts include probe source, measured hop/path sequence, and ASN mapping confidence. Inference includes likely peering/transit classification and risk interpretation. Your tool should print both distinctly.

Use confidence labels for inferred edges and explain uncertainty sources (missing hops, IXP visibility limits, incomplete mapping). This builds operator trust and prevents misleading diagnostics.

Deterministic reporting requires fixed probe sets and time windows. If probes rotate or datasets shift, comparisons become noisy. Include caching and explicit run metadata to keep analyses reproducible.

How this fits into the projects

  • Provides reporting discipline useful for every telecom operations artifact.

Definitions & key terms

  • Observed path -> Directly measured routing sequence.
  • Inference -> Interpretation derived from observations.
  • Confidence label -> Qualitative certainty indicator for interpretation.

Mental model diagram

Observed Data -> Normalization -> Inference Engine -> Report
      |                                  |
      +------------- kept distinct -------+

How it works

  1. Collect measured paths.
  2. Normalize and annotate unknowns.
  3. Apply inference heuristics.
  4. Emit split report: facts vs interpretations.

Failure modes: mixing inferred claims with raw measurements.

Minimal concrete example

Observed: AS3356 -> AS15169
Inference: likely transit-to-content edge (confidence=medium)
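The observed/inferred split above can be enforced in the renderer itself, so the two categories can never blend into one line. A minimal sketch, assuming an illustrative inference dict shape (`claim`, `confidence` are hypothetical field names):

```python
def render_report(facts, inferences):
    """Emit facts and inferences on clearly tagged, separate lines."""
    lines = [f"[FACT] {f}" for f in facts]
    lines += [f"[INFER] {i['claim']} (confidence={i['confidence']})"
              for i in inferences]
    return "\n".join(lines)

text = render_report(
    facts=["Observed: AS3356 -> AS15169"],
    inferences=[{"claim": "likely transit-to-content edge",
                 "confidence": "medium"}],
)
```

Because the tag is applied by the renderer, no code path can print an inference without its confidence label attached.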

Common misconceptions

  • “ASN owner names alone reveal relationship type.” -> not always.

Check-your-understanding questions

  1. Why mark inference confidence explicitly?
  2. How do fixed probe sets improve analysis quality?

Check-your-understanding answers

  1. To prevent overclaiming from incomplete visibility.
  2. They reduce run-to-run variance and improve comparability.

Real-world applications

  • NOC reporting.
  • Provider performance reviews.

Where you’ll apply it

  • Capstone reporting and incident reviews.

References

  • RIPEstat API docs
  • APNIC routing analysis resources

Key insights

Reliable routing analysis requires epistemic discipline, not just API calls.

Summary

Separate facts and inference to make routing reports actionable and credible.

Homework/Exercises

  1. Add confidence labels to three inferred edges.
  2. Re-run with different probe sets and compare variance.

Solutions

  1. High confidence for repeated consistent edges, lower for sparse data.
  2. Large variance indicates weak baseline comparability.

3. Project Specification

3.1 What You Will Build

A CLI that resolves targets, queries path data, maps ASNs, and outputs multi-vantage summaries with risk flags.

3.2 Functional Requirements

  1. Resolve domain to IP.
  2. Fetch path data from at least two regions.
  3. Map ASNs to org labels.
  4. Print normalized AS paths.
  5. Emit risk summary with confidence.

3.3 Non-Functional Requirements

  • Performance: Return report within practical API latency limits.
  • Reliability: Handle transient API errors gracefully.
  • Usability: Clear output separation between facts and inference.

3.4 Example Usage / Output

bgp_path_lab trace --target voice.example.com --probes eu,us,ap

3.5 Data Formats / Schemas / Protocols

  • REST API JSON responses.
  • Internal normalized AS-path representation.
  • Optional JSON export report.

3.6 Edge Cases

  • Unresolved domains.
  • Missing ASN mappings.
  • API rate limits/timeouts.

3.7 Real World Outcome

3.7.1 How to Run (Copy/Paste)

$ bgp_path_lab trace --target voice.example.com --probes eu-west,us-east

3.7.2 Golden Path Demo (Deterministic)

  • Fixed probe set and timestamp window.
  • Stable path summary and risk output.

3.7.3 Exact Terminal Transcript (CLI)

$ bgp_path_lab trace --target voice.example.com --probes eu-west,us-east
[FACT] eu-west: AS3333 -> AS3356 -> AS15169
[FACT] us-east: AS7018 -> AS1299 -> AS15169
[INFER] edge AS3356->AS15169 likely transit/content (confidence=medium)
[SUMMARY] unique_paths=2 risk=medium
[EXIT] code=0

4. Solution Architecture

4.1 High-Level Design

Resolver -> API Client -> Path Normalizer -> ASN Mapper -> Inference Layer -> Report Renderer

4.2 Key Components

Component Responsibility Key Decisions
Resolver domain/IP normalization deterministic resolution strategy
API Client data retrieval and retries backoff + caching
Normalizer path cleanup/dedupe explicit unknown handling
Reporter human + machine outputs separate facts and inference

4.3 Data Structures (No Full Code)

ProbePath: probe_id, region, as_path[], raw_hops[]
Inference: edge, hypothesis, confidence
Report: facts[], inferences[], summary
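The three structures above might look like the following as Python dataclasses. This is a sketch of one reasonable layout; the field comments state the intent, and the names simply mirror the list above.

```python
from dataclasses import dataclass, field

@dataclass
class ProbePath:
    probe_id: str
    region: str
    as_path: list    # normalized ASNs, "?" for unknown hops
    raw_hops: list   # original hop records, kept for auditing

@dataclass
class Inference:
    edge: tuple      # (left_asn, right_asn)
    hypothesis: str  # e.g. "peering", "transit"
    confidence: str  # "low" | "medium" | "high"

@dataclass
class Report:
    facts: list = field(default_factory=list)        # list[ProbePath]
    inferences: list = field(default_factory=list)   # list[Inference]
    summary: dict = field(default_factory=dict)
```

Using `default_factory` keeps each `Report` from sharing mutable lists, which matters once you compare multiple runs side by side.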

4.4 Algorithm Overview

  1. Resolve target.
  2. Pull probe path data.
  3. Normalize and map ASNs.
  4. Derive risk/inference annotations.
  5. Render outputs.

Complexity: O(p * h) where p=probe count, h=average hops.
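The O(p * h) bound falls out of visiting every hop of every probe exactly once, as in this skeleton (the inline `map_asn` table is a stand-in for a cached lookup against a real mapping service):

```python
def map_asn(hop_ip):
    # Stand-in for a cached ASN lookup; a real version queries a mapping
    # service and caches the answer (assumption for this sketch).
    return {"192.0.2.1": 3356, "198.51.100.1": 15169}.get(hop_ip)

def build_as_paths(probe_hops):
    """One pass over p probes x h hops each -> O(p * h) lookups."""
    result = {}
    for probe, hops in probe_hops.items():            # p iterations
        result[probe] = [map_asn(ip) for ip in hops]  # h per probe
    return result

paths = build_as_paths({"eu-west": ["192.0.2.1", "198.51.100.1"]})
```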


5. Implementation Guide

5.1 Development Environment Setup

$ mkdir -p cache reports
$ toolchain --check-http-json

5.2 Project Structure

bgp-path-lab/
├── src/
├── cache/
├── reports/
└── fixtures/

5.3 The Core Question You’re Answering

“How do I turn internet-scale routing data into actionable telecom quality insight?”

5.4 Concepts You Must Understand First

  • BGP policy vs shortest path.
  • ASN mapping caveats.
  • Measurement variance across probes/time.

5.5 Questions to Guide Your Design

  1. What constitutes a stable probe baseline?
  2. How will you label uncertainty?
  3. Which risk flags are actually actionable?

5.6 Thinking Exercise

Define three risk heuristics and decide which are observation-based versus inference-based.

5.7 The Interview Questions They’ll Ask

  1. Why is single-probe analysis insufficient?
  2. How does peering differ from transit operationally?
  3. How can routing policy affect voice quality?
  4. How do you prevent overclaiming from limited data?

5.8 Hints in Layers

Hint 1: Build deterministic input normalization first.

Hint 2: Cache ASN lookups aggressively.

Hint 3 (pseudocode):

collect_paths()
normalize()
annotate_confidence()
render_report()

5.9 Books That Will Help

Topic Book Chapter
BGP fundamentals Halabi 2-3
Interconnection models Halabi 4
Routing scale context APNIC 2025 report Full article

5.10 Implementation Phases

Phase 1: Foundation (2-4 hours)

  • Resolver and API client with retries.

Phase 2: Core Functionality (3-6 hours)

  • Path normalization and ASN mapping.

Phase 3: Polish & Edge Cases (3-4 hours)

  • Inference confidence labels + report exports.

5.11 Key Implementation Decisions

Decision Options Recommendation Rationale
Output format text/json/both both operator + automation use
Probe strategy random/fixed set fixed baseline deterministic comparisons

6. Testing Strategy

6.1 Test Categories

Category Purpose Examples
Unit parser/normalizer checks AS-path dedupe
Integration API-to-report flow multi-probe trace
Edge API failures/missing ASN fallback handling

6.2 Critical Test Cases

  1. Stable output for fixed probe/time parameters.
  2. Graceful handling of missing ASN mappings.
  3. Retry/backoff under transient API errors.
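Test cases 1 and 2 can be exercised directly against the normalizer. A sketch in plain-assert style (the `normalize_as_path` function is the illustrative normalizer from the theory section, repeated here so the tests are self-contained):

```python
def normalize_as_path(raw_hops):
    """Collapse repeats; unknown mappings become '?' instead of vanishing."""
    path = []
    for asn in raw_hops:
        asn = "?" if asn is None else asn
        if not path or path[-1] != asn:
            path.append(asn)
    return path

def test_missing_asn_is_preserved():
    # Edge case: a hop with no ASN mapping must stay visible in the path.
    assert normalize_as_path([3333, None, 15169]) == [3333, "?", 15169]

def test_fixed_input_is_deterministic():
    # Same input over repeated runs must yield exactly one distinct output.
    runs = {tuple(normalize_as_path([7018, 7018, 1299, 15169])) for _ in range(5)}
    assert runs == {(7018, 1299, 15169)}

test_missing_asn_is_preserved()
test_fixed_input_is_deterministic()
```

Retry/backoff (case 3) is best tested against a fake API client that fails a fixed number of times before succeeding.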

6.3 Test Data

fixtures/path_sample_eu.json
fixtures/path_sample_us.json
fixtures/path_missing_asn.json

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall Symptom Solution
Probe drift inconsistent runs pin probe set
Overconfident inference misleading reports add confidence labels
No cache API throttling persistent lookup cache

7.2 Debugging Strategies

  • Compare raw API payloads with normalized output.
  • Validate one region at a time before multi-region aggregation.
  • Keep run metadata (timestamp/probes/target) with each report.
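Attaching run metadata (third bullet) can be as simple as stamping the report before it is written out. Field names here are an illustrative choice:

```python
from datetime import datetime, timezone

def with_run_metadata(report, target, probes):
    """Attach the parameters needed to reproduce or compare this run."""
    report["meta"] = {
        "target": target,
        "probes": sorted(probes),  # stable ordering for diffing runs
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }
    return report

r = with_run_metadata({"summary": {}}, "voice.example.com",
                      ["us-east", "eu-west"])
```

Sorting the probe list means two reports with the same probe set always diff cleanly, regardless of the order they were requested in.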

7.3 Performance Traps

  • Repeated uncached ASN lookups increase latency and failure risk.
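A lookup cache that avoids this trap can be sketched with a dict-backed memoizer. This in-memory version is a stand-in; a production version would persist the store to disk (the `cache/` directory in the project structure) and expire stale entries:

```python
class AsnCache:
    """Memoize ASN lookups so each address hits the backend at most once."""
    def __init__(self, lookup_fn):
        self._lookup = lookup_fn
        self._store = {}
        self.misses = 0   # backend calls actually made

    def get(self, ip):
        if ip not in self._store:
            self.misses += 1
            self._store[ip] = self._lookup(ip)
        return self._store[ip]

# Hypothetical backend: a static table standing in for a real lookup API.
cache = AsnCache(lambda ip: {"192.0.2.1": 3356}.get(ip))
first = cache.get("192.0.2.1")
second = cache.get("192.0.2.1")   # served from cache, no second backend call
```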

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add markdown report export.
  • Add historical run diff view.

8.2 Intermediate Extensions

  • Add route-change alerting heuristic.
  • Add simple latency correlation input.

8.3 Advanced Extensions

  • Add RPKI/ROV context overlay.
  • Add provider scorecards by region.

9. Real-World Connections

9.1 Industry Applications

  • Carrier NOC route diagnostics.
  • Multi-cloud and UC provider path-risk assessment.
  • RIPE Atlas tooling ecosystem.
  • BGP analysis dashboards.

9.2 Interview Relevance

  • Demonstrates ability to connect protocol policy to customer-facing quality.

10. Resources

10.1 Essential Reading

  • RFC 4271
  • RIPEstat API docs
  • APNIC BGP in 2025 report

10.2 Video Resources

  • BGP policy engineering talks.

10.3 Tools & Documentation

  • RIPEstat and route analysis docs.
10.4 Related Projects

  • Previous: P04
  • Completes the route/interconnect layer for the capstone.

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain why BGP paths vary by region.
  • I can separate observed path facts from inferred relationships.

11.2 Implementation

  • Multi-probe output is deterministic with fixed inputs.
  • Report includes confidence labels and risk summary.

11.3 Growth

  • I can propose production extensions for routing observability.

12. Submission / Completion Criteria

Minimum Viable Completion

  • Multi-probe AS-path report with ASN labels.

Full Completion

  • Deterministic report with confidence-tagged inferences and error handling.

Excellence

  • Adds trend comparisons and proactive route-risk heuristics.