Project 2: Neighborhood Walkability Analyzer
Build a reproducible network-based walkability scoring pipeline using real street graphs and amenity accessibility metrics.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 2: Intermediate (The Developer) |
| Time Estimate | 12-16 hours |
| Main Programming Language | Python (Alternatives: R, Julia) |
| Coolness Level | Level 3: Genuinely Clever |
| Business Potential | 2. The “Micro-SaaS / Pro Tool” |
| Prerequisites | CRS fundamentals, vector joins, graph basics |
| Key Topics | OSMnx graphs, isochrones, spatial scoring |
1. Learning Objectives
- Acquire and validate a walkable street network from OSM data.
- Compute neighborhood accessibility metrics using travel-time constraints.
- Build a transparent composite walkability score.
- Publish a map and a ranked report with a per-component score decomposition.
- Explain uncertainty and bias sources in score construction.
2. All Theory Needed (Per-Concept Breakdown)
2.1 Street Networks as Directed Graphs
Fundamentals
- Intersections are nodes; traversable street segments are edges.
- Walking speed assumptions and edge impedance define travel-time metrics.
- Connectivity quality determines whether accessibility metrics are meaningful.
Deep Dive
Walkability based on Euclidean distance is systematically biased in irregular street networks. Real movement follows network constraints such as cul-de-sacs, blocked crossings, and disconnected paths. A graph model encodes this structure and supports travel-time reachability queries.
Validate graph connectivity before scoring. Small disconnected components can produce extremely low scores that reflect data topology issues rather than real neighborhood conditions. Also define mode-specific assumptions, because a walk graph differs from a drive graph.
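The connectivity check described above can be sketched in plain Python. This is a minimal illustration, not the project's implementation: the edge-list input, the `connectivity_report` name, and the 95% `min_share` threshold are all assumptions chosen for the example. With a NetworkX graph, `networkx.weakly_connected_components` would likely do the grouping instead.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Group nodes into components, ignoring edge direction; largest first."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen, components = set(), []
    for start in list(adjacency):
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return sorted(components, key=len, reverse=True)


def connectivity_report(edges, min_share=0.95):
    """Summarize how much of the graph sits in its largest component.

    `min_share` is a hypothetical QA threshold, not a value from the spec.
    """
    components = connected_components(edges)
    total = sum(len(c) for c in components)
    largest_share = len(components[0]) / total if components else 0.0
    stray_nodes = sorted(node for c in components[1:] for node in c)
    return {
        "n_components": len(components),
        "largest_share": largest_share,
        "passes": largest_share >= min_share,
        "stray_nodes": stray_nodes,
    }


# Toy graph: a triangle plus one detached segment.
report = connectivity_report([(1, 2), (2, 3), (3, 1), (10, 11)])
print(report["n_components"], report["stray_nodes"])
```

Any neighborhood whose snapped origin node appears in `stray_nodes` is a candidate for a lowered confidence flag rather than a silently low score.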
2.2 Amenity Access and Spatial Join Policy
Fundamentals
- Amenities must be mapped to neighborhoods through explicit join rules.
- Boundary features require deterministic tie-break handling.
Deep Dive
Amenity assignment is often underestimated. Boundary ambiguity and incomplete OSM tagging can skew scores, so you need an explicit inclusion policy and confidence notes. If an amenity lies on a border, choose a deterministic fallback (for example, the nearest neighborhood centroid) and log the decision.
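One way to make the nearest-centroid fallback deterministic is sketched below. The function name, the returned record fields, and the assumption that coordinates are already in a projected CRS (so plain Euclidean distance is meaningful) are all illustrative choices, not part of the specification.

```python
import math


def assign_boundary_amenity(amenity_id, amenity_xy, candidate_centroids):
    """Pick one neighborhood for an amenity that touches several polygons.

    candidate_centroids maps neighborhood_id -> (x, y) in a projected CRS.
    Returns a decision record that can be written to a log as-is.
    """
    if not candidate_centroids:
        return {"amenity_id": amenity_id, "neighborhood_id": None,
                "rule": "no_candidate", "n_candidates": 0}

    # Sort by distance first, then by id, so exact distance ties still
    # resolve the same way on every run.
    ranked = sorted(
        candidate_centroids.items(),
        key=lambda item: (math.dist(amenity_xy, item[1]), item[0]),
    )
    rule = "single_candidate" if len(ranked) == 1 else "nearest_centroid"
    return {"amenity_id": amenity_id, "neighborhood_id": ranked[0][0],
            "rule": rule, "n_candidates": len(ranked)}


decision = assign_boundary_amenity(
    "cafe-1", (0.0, 0.0), {"N1": (10.0, 0.0), "N2": (0.0, 5.0)}
)
print(decision["neighborhood_id"], decision["rule"])
```

The secondary sort on the id is what guarantees determinism: distance alone can tie, and dictionary order depends on how the candidates happened to be collected.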
2.3 Composite Metric Design
Fundamentals
- Composite scores combine normalized components.
- Weights encode value judgments and must be documented.
Deep Dive
A walkability score is not objective truth; it is a policy lens. Publish the component breakdown so users can contest the assumptions behind it. If the metric is stable, a sensitivity check (for example, changing each weight by +/-10%) should leave the ranking largely intact rather than reorder it.
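A simple weight-sensitivity check might look like the following sketch. The component names, the example weights, and the "maximum rank shift" stability measure are assumptions made for illustration; the text above does not prescribe a particular stability statistic.

```python
def composite(components, weights):
    """Weighted mean of already-normalized component scores."""
    total = sum(weights.values())
    return sum(weights[name] * components[name] for name in weights) / total


def ranking(table, weights):
    """Neighborhood ids ordered best-first; ties broken by id."""
    scores = {nid: composite(parts, weights) for nid, parts in table.items()}
    return sorted(scores, key=lambda nid: (-scores[nid], nid))


def weight_sensitivity(table, weights, delta=0.10):
    """Perturb each weight by +/-delta and report the largest rank shift."""
    baseline = ranking(table, weights)
    base_pos = {nid: pos for pos, nid in enumerate(baseline)}
    shifts = {}
    for name in weights:
        for sign in (+1, -1):
            perturbed = dict(weights)
            perturbed[name] = weights[name] * (1 + sign * delta)
            order = ranking(table, perturbed)
            shifts[(name, sign)] = max(
                abs(base_pos[nid] - pos) for pos, nid in enumerate(order)
            )
    return baseline, shifts


# Hypothetical normalized components on a 0-100 scale.
table = {
    "A": {"reachability": 90.0, "amenity": 80.0, "density": 70.0},
    "B": {"reachability": 50.0, "amenity": 50.0, "density": 50.0},
    "C": {"reachability": 10.0, "amenity": 20.0, "density": 30.0},
}
weights = {"reachability": 0.4, "amenity": 0.4, "density": 0.2}
baseline, shifts = weight_sensitivity(table, weights)
print(baseline, max(shifts.values()))
```

In this toy table each neighborhood beats the next on every component, so no positive weighting can reorder them and every shift is zero. Real data will usually show some movement, and a large shift from a small perturbation is worth reporting next to the scores.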
3. Project Specification
3.1 What You Will Build
A pipeline that ingests one city boundary and outputs:
- neighborhood walkability scores,
- isochrone overlays,
- component decomposition report.
Included:
- network QA checks,
- score components and final rank,
- map artifact with explainable popups.
Excluded:
- real-time traffic,
- multimodal optimization,
- policy recommendation automation.
3.2 Functional Requirements
- Download walkable OSM network for target place.
- Build isochrones for configured time thresholds.
- Compute amenity access and intersection density metrics.
- Normalize metrics and compute weighted score.
- Export map and tabular report.
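The normalization requirement can be illustrated with a min-max rescaling sketch. The 0-100 target range is an assumption (it matches the score range printed in the demo log but is not stated as a requirement), and other schemes such as z-scores or percentile ranks would be equally valid choices.

```python
def min_max_normalize(values, lo=0.0, hi=100.0):
    """Rescale raw metric values to [lo, hi]; missing values stay None.

    `values` maps neighborhood_id -> raw metric (float or None).
    """
    present = [v for v in values.values() if v is not None]
    if not present:
        return {nid: None for nid in values}

    vmin, vmax = min(present), max(present)
    if vmax == vmin:
        # No spread: assign the midpoint instead of dividing by zero.
        midpoint = (lo + hi) / 2
        return {nid: (None if v is None else midpoint)
                for nid, v in values.items()}

    scale = (hi - lo) / (vmax - vmin)
    return {nid: (None if v is None else lo + (v - vmin) * scale)
            for nid, v in values.items()}


raw_density = {"N1": 10.0, "N2": 20.0, "N3": 30.0, "N4": None}
print(min_max_normalize(raw_density))
```

Keeping `None` as `None` leaves the decision about sparse-data neighborhoods to the scoring layer, where it can be reflected in the confidence flag instead of being hidden as a zero.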
3.3 Non-Functional Requirements
- Reproducibility: Fixed OSM snapshot date or cache.
- Interpretability: Score decomposition visible in outputs.
- Robustness: Handles sparse-data neighborhoods without crashing.
3.4 Data Formats / Schemas
Neighborhood output schema:
- neighborhood_id
- reachability_score
- amenity_access_score
- intersection_density_score
- final_walkability_score
- confidence_flag
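As a sketch, the schema above could be pinned down with a dataclass. The field types, the 0-100 score range, and the three allowed `confidence_flag` values are assumptions added for illustration; the schema itself only lists the field names.

```python
from dataclasses import asdict, dataclass

SCORE_FIELDS = (
    "reachability_score",
    "amenity_access_score",
    "intersection_density_score",
    "final_walkability_score",
)
ALLOWED_FLAGS = ("high", "medium", "low")  # assumed vocabulary


@dataclass(frozen=True)
class NeighborhoodScore:
    neighborhood_id: str
    reachability_score: float
    amenity_access_score: float
    intersection_density_score: float
    final_walkability_score: float
    confidence_flag: str

    def __post_init__(self):
        # Reject rows that would silently corrupt the report.
        for name in SCORE_FIELDS:
            value = getattr(self, name)
            if not 0.0 <= value <= 100.0:
                raise ValueError(f"{name}={value} is outside 0-100")
        if self.confidence_flag not in ALLOWED_FLAGS:
            raise ValueError(f"unknown confidence_flag: {self.confidence_flag}")


row = NeighborhoodScore("N-001", 70.0, 55.5, 62.0, 63.1, "high")
print(list(asdict(row)))
```

Validating at construction time means a malformed row fails where it is produced, not later when the map or report is rendered.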
3.5 Edge Cases
- Disconnected graph component for centroid node
- Neighborhood with no mapped amenities
- Boundary amenities with multiple candidate polygons
- Extremely small polygon with unstable metric values
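Three of these edge cases can be folded into the `confidence_flag` field rather than handled as crashes. The sketch below is one possible rule; the reason strings, the flag levels, and the 0.05 km² area threshold are hypothetical values, not figures from the specification.

```python
def confidence_flag(centroid_in_main_component, amenity_count, area_km2,
                    min_area_km2=0.05):
    """Derive a coarse confidence level plus the reasons behind it.

    Returns (flag, reasons) where flag is "high", "medium", or "low".
    """
    reasons = []
    if not centroid_in_main_component:
        reasons.append("disconnected_centroid")
    if amenity_count == 0:
        reasons.append("no_mapped_amenities")
    if area_km2 < min_area_km2:
        reasons.append("tiny_polygon")

    if not reasons:
        flag = "high"
    elif len(reasons) == 1:
        flag = "medium"
    else:
        flag = "low"
    return flag, reasons


print(confidence_flag(True, 12, 1.8))
print(confidence_flag(False, 0, 0.01))
```

Returning the reasons alongside the flag keeps the output explainable: a reader can see whether a low-confidence score stems from the network, the amenity data, or the polygon geometry.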
3.6 Real World Outcome
3.6.1 How to Run
$ python run_project2_walkability.py --place "San Francisco, California, USA" --mode walk
3.6.2 Golden Path Demo
$ python run_project2_walkability.py --fixture fixtures/sf_walkability_snapshot.parquet
[INFO] neighborhoods=121
[INFO] graph_nodes=42811 graph_edges=96204
[INFO] score range: min=28.4 median=63.2 max=89.7
[DONE] outputs generated in outputs/
3.6.3 Exact Terminal Transcript (Live)
$ python run_project2_walkability.py --place "San Francisco, California, USA"
[INFO] Downloading walk network
[INFO] Computing 10/15-minute isochrones
[INFO] Calculating component metrics
[INFO] Exporting outputs/walkability_scores.geojson
[INFO] Exporting outputs/walkability_map.html
[DONE] Completed in 3m 25s
4. Solution Architecture
4.1 High-Level Design
OSM Data -> Graph Builder -> Neighborhood Metric Engine -> Scoring Layer -> Map + Report
4.2 Key Components
| Component | Responsibility | Key Decision |
|---|---|---|
| Graph Loader | Build walk network | OSM extract date and place boundary |
| Isochrone Engine | Compute reachability | Time thresholds and speed assumption |
| Amenity Matcher | Assign amenity access | Boundary tie-break policy |
| Scoring Module | Normalize + weight metrics | Weight governance and sensitivity checks |
| Reporter | Export map/report | Explainability fields |
4.3 Algorithm Overview
- Load graph and neighborhoods.
- Compute component metrics.
- Normalize each metric.
- Compute weighted composite.
- Export artifacts.
Complexity:
- Dominated by repeated shortest-path queries and spatial joins.
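The shortest-path step behind each isochrone can be sketched as Dijkstra's algorithm with a travel-time cutoff. This is a plain-Python illustration over a hypothetical adjacency dictionary. The 80 m/min walking speed (4.8 km/h) is an assumed value, and on a NetworkX graph `networkx.single_source_dijkstra_path_length` with its `cutoff` argument would likely serve the same purpose.

```python
import heapq

WALK_SPEED_M_PER_MIN = 80.0  # assumption: 4.8 km/h walking speed


def reachable_within(adjacency, source, max_minutes,
                     speed_m_per_min=WALK_SPEED_M_PER_MIN):
    """Return {node: travel_minutes} for nodes reachable within the cutoff.

    adjacency maps node -> list of (neighbor, length_m) pairs.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        minutes, node = heapq.heappop(heap)
        if minutes > best.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbor, length_m in adjacency.get(node, ()):
            candidate = minutes + length_m / speed_m_per_min
            # Paths that exceed the cutoff are never expanded.
            if candidate <= max_minutes and candidate < best.get(
                    neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return best


# Toy network with edge lengths in meters.
streets = {
    "a": [("b", 400.0), ("c", 1200.0)],
    "b": [("a", 400.0), ("c", 400.0)],
    "c": [("a", 1200.0), ("b", 400.0), ("d", 800.0)],
    "d": [("c", 800.0)],
}
print(reachable_within(streets, "a", max_minutes=10))
```

With a binary heap, each query costs O((V + E) log V) in the worst case. The cutoff keeps the explored subgraph small, which matters because the query is repeated once per neighborhood origin and per time threshold.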
5. Implementation Guide
5.1 Development Environment Setup
$ mamba create -n geo-p02 -c conda-forge python=3.11 osmnx geopandas networkx folium pyarrow -y
$ mamba activate geo-p02
5.2 Project Structure
project2/
├── src/
│ ├── network.py
│ ├── isochrone.py
│ ├── amenities.py
│ ├── score.py
│ └── main.py
├── fixtures/
├── outputs/
└── tests/
5.3 The Core Question You Are Answering
“Can we measure neighborhood walkability in a way that reflects real movement constraints and remains transparent?”
5.4 Concepts You Must Understand First
- Directed graph routing basics.
- Spatial join determinism.
- Metric normalization and weighting.
5.5 Questions to Guide Your Design
- Which component should dominate score interpretation?
- What minimum data completeness is acceptable per neighborhood?
- How will you surface uncertainty?
5.6 Thinking Exercise
Compare two neighborhoods with similar amenity counts but different network connectivity, and predict the score difference before running any code.
5.7 Interview Questions
- Why is network distance preferable to Euclidean distance for walkability?
- How do weighting choices affect policy interpretation?
- How do you debug disconnected graph artifacts?
5.8 Hints in Layers
- Hint 1: Validate graph connectivity first.
- Hint 2: Implement one component metric at a time.
- Hint 3: Add decomposition fields before composite score.
- Hint 4: Run simple sensitivity check on weights.
6. Testing Strategy
| Category | Purpose |
|---|---|
| Unit | Metric functions and normalization logic |
| Integration | End-to-end outputs with fixture city |
| Edge Case | Sparse amenities and disconnected components |
Critical tests:
- A known neighborhood from the fixed fixture should reproduce its expected score within a stated tolerance.
- Boundary amenity assignment should be deterministic.
- Missing amenity category should not crash pipeline.
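The first critical test can be backed by a small comparison helper like the sketch below. The helper name, the 0.1-point absolute tolerance, and the example ids and scores are assumptions made for illustration.

```python
import math


def score_drift(expected, actual, abs_tol=0.1):
    """Return {neighborhood_id: (expected, actual)} for scores out of tolerance.

    A neighborhood missing from `actual` is reported with None.
    """
    drift = {}
    for nid, want in expected.items():
        got = actual.get(nid)
        if got is None or not math.isclose(want, got, rel_tol=0.0,
                                           abs_tol=abs_tol):
            drift[nid] = (want, got)
    return drift


# Hypothetical fixture scores versus a fresh pipeline run.
expected = {"N1": 60.0, "N2": 30.0, "N3": 75.0}
actual = {"N1": 60.05, "N2": 31.0}
print(score_drift(expected, actual))
```

In a test suite, the assertion would simply be that `score_drift(...)` returns an empty dictionary, and the non-empty result doubles as a readable failure message.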
7. Common Pitfalls & Debugging
| Pitfall | Symptom | Fix |
|---|---|---|
| Disconnected graph nodes | Zero reachability in dense area | Component QA + node snap validation |
| Hidden weight bias | Counterintuitive rankings | Publish decomposition and sensitivity check |
| Amenity overcounting | Inflated scores | Deduplicate amenities by stable IDs |
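
The deduplication fix in the last row can be sketched as follows. OSM ids are only unique within an element type (node, way, or relation), so the sketch keys on the pair of type and id. The `osm_type` and `osm_id` field names are hypothetical and depend on how the amenity table was loaded.

```python
def dedupe_amenities(amenities):
    """Keep the first record seen for each (osm_type, osm_id) pair."""
    seen = set()
    unique = []
    for record in amenities:
        key = (record["osm_type"], record["osm_id"])
        if key in seen:
            continue  # same OSM element pulled in by more than one query
        seen.add(key)
        unique.append(record)
    return unique


amenities = [
    {"osm_type": "node", "osm_id": 101, "category": "cafe"},
    {"osm_type": "node", "osm_id": 101, "category": "restaurant"},
    {"osm_type": "way", "osm_id": 101, "category": "school"},
]
print(len(dedupe_amenities(amenities)))
```

Note that this removes only exact element duplicates. A single venue mapped as both a node and a building way would still count twice and needs a separate spatial or name-based rule.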
8. Extensions & Challenges
- Add multimodal access (walk + transit).
- Add temporal scenarios (weekday peak vs off-peak assumptions).
- Add equity lens with demographic overlays.
9. Real-World Connections
- Housing search tools with access scoring.
- Urban planning accessibility audits.
- Retail site selection screening.
10. Resources
- OSMnx docs: https://osmnx.readthedocs.io/
- OSMnx methods paper: https://geoffboeing.com/publications/osmnx-complex-street-networks/
- GeoPandas user guide: https://geopandas.org/en/stable/docs/user_guide.html
11. Self-Assessment Checklist
- I can explain each score component and its weight.
- I can justify graph assumptions used for isochrones.
- I can reproduce the same ranking from fixture input.
- I can identify data quality limitations in my output.
12. Submission / Completion Criteria
Minimum Viable Completion
- One-city walkability score and map exported.
- Deterministic fixture output generated.
Full Completion
- Includes component decomposition and uncertainty flags.
- Includes sensitivity analysis summary.
Excellence
- Adds policy-specific scoring profiles (family, commuter, senior).
- Adds automated regression test for ranking stability.