Project 2: Neighborhood Walkability Analyzer

Build a reproducible network-based walkability scoring pipeline using real street graphs and amenity accessibility metrics.

Quick Reference

Attribute	Value
Difficulty	Level 2: Intermediate (The Developer)
Time Estimate	12-16 hours
Main Programming Language	Python (Alternatives: R, Julia)
Coolness Level	Level 3: Genuinely Clever
Business Potential	2. The “Micro-SaaS / Pro Tool”
Prerequisites	CRS fundamentals, vector joins, graph basics
Key Topics	OSMnx graphs, isochrones, spatial scoring

1. Learning Objectives

Acquire and validate a walkable street network from OSM data.
Compute neighborhood accessibility metrics using travel-time constraints.
Build a transparent composite walkability score.
Publish a map and a ranked report with a per-component decomposition.
Explain uncertainty and bias sources in score construction.

2. All Theory Needed (Per-Concept Breakdown)

2.1 Street Networks as Directed Graphs

Fundamentals

Intersections are nodes; traversable street segments are edges.
Walking speed assumptions and edge impedance define travel-time metrics.
Connectivity quality determines whether accessibility metrics are meaningful.
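
As a minimal sketch of the impedance idea above, the function below turns an edge length into a walking travel time. The function name and the 4.8 km/h default are illustrative assumptions, not values prescribed by this project.

```python
def edge_travel_time_s(length_m: float, speed_kmh: float = 4.8) -> float:
    """Seconds needed to walk an edge of length_m at a constant speed."""
    if length_m < 0 or speed_kmh <= 0:
        raise ValueError("length must be >= 0 and speed must be > 0")
    meters_per_second = speed_kmh * 1000 / 3600
    return length_m / meters_per_second


# A 400 m block at the assumed speed takes about five minutes.
print(round(edge_travel_time_s(400)))
```

Keeping the speed as a named parameter makes the assumption visible in configuration and easy to vary in sensitivity runs.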

Deep Dive

Walkability based on Euclidean distance is systematically biased in irregular street networks: straight-line distance can never exceed network distance, so it overstates access wherever routes detour. Real movement follows network constraints such as cul-de-sacs, blocked crossings, and disconnected paths. A graph model encodes this structure and supports travel-time reachability queries.

Validate graph connectivity before scoring. Small disconnected components can produce extremely low scores that reflect data topology issues rather than real neighborhood conditions. Also state mode-specific assumptions explicitly, because a walk graph differs from a drive graph (for example, in which ways are included and whether one-way restrictions apply).
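
The connectivity check can be sketched with a plain breadth-first search. This version treats edges as undirected, which is an assumption that suits walking; the function names and the toy graph are hypothetical, and a real pipeline would more likely use the equivalent graph-library routines.

```python
from collections import deque


def connected_components(nodes, edges):
    """Return components (sets of nodes), largest first, ignoring edge direction."""
    neighbors = {n: set() for n in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            current = queue.popleft()
            for nxt in neighbors[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    component.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return sorted(components, key=len, reverse=True)


def nodes_outside_main_component(nodes, edges):
    """Nodes that are not in the largest component and deserve a QA flag."""
    components = connected_components(nodes, edges)
    main = components[0] if components else set()
    return sorted(set(nodes) - main)


# Toy graph: a four-node network plus an isolated two-node fragment.
toy_nodes = [1, 2, 3, 4, 90, 91]
toy_edges = [(1, 2), (2, 3), (3, 4), (90, 91)]
print(nodes_outside_main_component(toy_nodes, toy_edges))
```

Any neighborhood whose origin node appears in that list is a candidate for a low-confidence flag rather than a literal zero score.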

2.2 Amenity Access and Spatial Join Policy

Fundamentals

Amenities must be mapped to neighborhoods through explicit join rules.
Boundary features require deterministic tie-break handling.

Deep Dive

Amenity assignment is often underestimated. Boundary ambiguity and incomplete OSM tagging can skew scores, so you need an explicit inclusion policy and confidence notes. If an amenity lies on a border, choose a deterministic fallback (for example, nearest centroid) and log the decision.
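
One way to make the border fallback deterministic is sketched below. It assumes centroids are given in a projected CRS and breaks distance ties by the smallest neighborhood ID; the function name, the reason strings, and the sample IDs are hypothetical.

```python
import math


def assign_amenity(amenity_xy, candidates):
    """Pick one neighborhood for an amenity that touches several polygons.

    candidates maps neighborhood_id -> centroid (x, y) in a projected CRS.
    Ties on distance fall back to the smallest neighborhood_id, so the
    result never depends on input order.
    """
    if not candidates:
        return None, "no_candidate"
    ranked = sorted(
        candidates.items(),
        key=lambda item: (math.dist(amenity_xy, item[1]), item[0]),
    )
    chosen_id = ranked[0][0]
    reason = "single_candidate" if len(ranked) == 1 else "nearest_centroid"
    return chosen_id, reason


# An amenity exactly between two centroids resolves to the lower ID.
decision = assign_amenity((5.0, 0.0), {"N02": (10.0, 0.0), "N01": (0.0, 0.0)})
print(decision)
```

Returning the reason alongside the ID gives the pipeline something concrete to write to the decision log.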

2.3 Composite Metric Design

Fundamentals

Composite scores combine normalized components.
Weights encode value judgments and must be documented.

Deep Dive

A walkability score is not objective truth; it is a policy lens. Publish the component breakdown so users can contest assumptions. Run sensitivity checks (for example, changing each weight by +/-10%): if the metric is stable, small perturbations should not fully reorder the rankings.
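
A weight sensitivity check along those lines might look like the following sketch. It assumes the components are already normalized to 0-1 and renormalizes the weights after each perturbation; all names and the sample values are made up for illustration.

```python
def rank(scores):
    """Neighborhood IDs ordered from highest to lowest score."""
    return [nid for nid, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


def composite(components, weights):
    """Weighted mean of already-normalized components per neighborhood."""
    total = sum(weights.values())
    return {
        nid: sum(weights[name] * value for name, value in parts.items()) / total
        for nid, parts in components.items()
    }


def rank_changes(components, weights, delta=0.10):
    """Perturb each weight by +/-delta and report runs whose ranking differs."""
    baseline = rank(composite(components, weights))
    changed = []
    for name in weights:
        for sign in (+1, -1):
            trial = dict(weights)
            trial[name] = weights[name] * (1 + sign * delta)
            if rank(composite(components, trial)) != baseline:
                changed.append((name, sign * delta))
    return baseline, changed


# Made-up normalized components for three neighborhoods.
sample_components = {
    "A": {"reach": 0.9, "amenity": 0.8, "density": 0.7},
    "B": {"reach": 0.6, "amenity": 0.5, "density": 0.6},
    "C": {"reach": 0.2, "amenity": 0.3, "density": 0.1},
}
sample_weights = {"reach": 0.4, "amenity": 0.4, "density": 0.2}
baseline, changed = rank_changes(sample_components, sample_weights)
print(baseline, changed)
```

An empty change list means no single-weight perturbation altered the ranking; a long list is a signal to document the fragile weights in the report.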

3. Project Specification

3.1 What You Will Build

A pipeline that ingests one city boundary and outputs:

neighborhood walkability scores,
isochrone overlays,
component decomposition report.

Included:

network QA checks,
score components and final rank,
map artifact with explainable popups.

Excluded:

real-time traffic,
multimodal optimization,
policy recommendation automation.

3.2 Functional Requirements

Download walkable OSM network for target place.
Build isochrones for configured time thresholds.
Compute amenity access and intersection density metrics.
Normalize metrics and compute weighted score.
Export map and tabular report.
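
The isochrone requirement reduces to a shortest-path search that stops at a time budget. The sketch below is a plain Dijkstra with a cutoff over an adjacency dict; the function name, the graph representation, and the toy travel times are assumptions for illustration. Turning the reachable node set into a polygon overlay is a separate step not shown here.

```python
import heapq


def reachable_within(adjacency, source, cutoff_s):
    """Travel time in seconds to every node reachable within cutoff_s.

    adjacency maps node -> list of (neighbor, travel_time_s) pairs.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        elapsed, node = heapq.heappop(heap)
        if elapsed > best.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, cost in adjacency.get(node, []):
            candidate = elapsed + cost
            if candidate <= cutoff_s and candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return best


# Toy network with travel times in seconds on each edge.
toy = {
    "a": [("b", 240.0), ("c", 500.0)],
    "b": [("a", 240.0), ("c", 200.0), ("d", 420.0)],
    "c": [("a", 500.0), ("b", 200.0)],
    "d": [("b", 420.0)],
}
ten_minutes = reachable_within(toy, "a", cutoff_s=600)
print(sorted(ten_minutes.items()))
```

Node "d" is left out because the cheapest route to it takes 660 seconds, which exceeds the 10-minute budget.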

3.3 Non-Functional Requirements

Reproducibility: Pin an OSM snapshot date or reuse a cached extract so reruns see identical data.
Interpretability: Score decomposition visible in outputs.
Robustness: Handles sparse-data neighborhoods without crashing.

3.4 Data Formats / Schemas

Neighborhood output schema:
- neighborhood_id
- reachability_score
- amenity_access_score
- intersection_density_score
- final_walkability_score
- confidence_flag
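
The schema above could be carried through the pipeline as a small record type. In this sketch the field types, the 0-100 score range, and the "ok" flag value are assumptions; only the field names come from the schema.

```python
from dataclasses import asdict, dataclass

SCORE_FIELDS = (
    "reachability_score",
    "amenity_access_score",
    "intersection_density_score",
    "final_walkability_score",
)


@dataclass(frozen=True)
class NeighborhoodScore:
    neighborhood_id: str
    reachability_score: float
    amenity_access_score: float
    intersection_density_score: float
    final_walkability_score: float
    confidence_flag: str

    def __post_init__(self):
        # Assumed convention: every score lives on a 0-100 scale.
        for name in SCORE_FIELDS:
            value = getattr(self, name)
            if not 0.0 <= value <= 100.0:
                raise ValueError(f"{name}={value} is outside 0-100")


# Made-up values, shown only to illustrate the row shape.
row = NeighborhoodScore("N01", 71.0, 64.5, 58.0, 66.1, "ok")
print(asdict(row))
```

Validating at construction time means a bad normalization step fails loudly instead of leaking into the exported report.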

3.5 Edge Cases

Centroid node snapped to a disconnected graph component
Neighborhood with no mapped amenities
Boundary amenities with multiple candidate polygons
Extremely small polygon with unstable metric values
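
Several of these edge cases can feed the confidence_flag field instead of crashing the run. The sketch below is one possible policy; the flag values, reason strings, and the minimum-area threshold are placeholders to tune per city.

```python
def confidence_flag(in_main_component, amenity_count, area_km2, min_area_km2=0.05):
    """Return (flag, reasons) for one neighborhood.

    The threshold and labels are illustrative assumptions, not project rules.
    """
    reasons = []
    if not in_main_component:
        reasons.append("centroid_on_disconnected_component")
    if amenity_count == 0:
        reasons.append("no_mapped_amenities")
    if area_km2 < min_area_km2:
        reasons.append("polygon_too_small")
    return ("low" if reasons else "ok"), reasons


print(confidence_flag(True, 12, 1.4))
print(confidence_flag(False, 0, 0.01))
```

Keeping the reasons next to the flag lets the map popup explain why a score should be read with caution.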

3.6 Real World Outcome

3.6.1 How to Run

$ python run_project2_walkability.py --place "San Francisco, California, USA" --mode walk
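
A command-line interface matching the flags used in this section might be declared as follows. Treating --place and --fixture as mutually exclusive and defaulting --mode to walk are assumptions inferred from the example commands, not documented behavior.

```python
import argparse


def build_parser():
    """Argument parser for the flags shown in the run commands."""
    parser = argparse.ArgumentParser(prog="run_project2_walkability.py")
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--place", help="Place name to geocode and download")
    source.add_argument("--fixture", help="Path to a cached snapshot file")
    parser.add_argument("--mode", default="walk", help="Network type to build")
    return parser


args = build_parser().parse_args(
    ["--place", "San Francisco, California, USA", "--mode", "walk"]
)
print(args.place, args.mode, args.fixture)
```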

3.6.2 Golden Path Demo

$ python run_project2_walkability.py --fixture fixtures/sf_walkability_snapshot.parquet
[INFO] neighborhoods=121
[INFO] graph_nodes=42811 graph_edges=96204
[INFO] score range: min=28.4 median=63.2 max=89.7
[DONE] outputs generated in outputs/

3.6.3 Exact Terminal Transcript (Live)

$ python run_project2_walkability.py --place "San Francisco, California, USA"
[INFO] Downloading walk network
[INFO] Computing 10/15-minute isochrones
[INFO] Calculating component metrics
[INFO] Exporting outputs/walkability_scores.geojson
[INFO] Exporting outputs/walkability_map.html
[DONE] Completed in 3m 25s

4. Solution Architecture

4.1 High-Level Design

OSM Data -> Graph Builder -> Neighborhood Metric Engine -> Scoring Layer -> Map + Report

4.2 Key Components

Component	Responsibility	Key Decision
Graph Loader	Build walk network	OSM extract date and place boundary
Isochrone Engine	Compute reachability	Time thresholds and speed assumption
Amenity Matcher	Assign amenity access	Boundary tie-break policy
Scoring Module	Normalize + weight metrics	Weight governance and sensitivity checks
Reporter	Export map/report	Explainability fields

4.3 Algorithm Overview

Load graph and neighborhoods.
Compute component metrics.
Normalize each metric.
Compute weighted composite.
Export artifacts.
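
The normalize-then-weight steps above can be sketched in a few lines. This version uses min-max scaling, which is only one possible normalization, and maps a constant metric to 0.5 so it neither helps nor hurts any neighborhood; the metric names, weights, and raw values are made up.

```python
def min_max(values):
    """Scale a dict of raw metric values to 0-1; constant columns map to 0.5."""
    low, high = min(values.values()), max(values.values())
    if high == low:
        return {key: 0.5 for key in values}
    return {key: (value - low) / (high - low) for key, value in values.items()}


def walkability_scores(raw_metrics, weights):
    """Combine raw per-neighborhood metrics into a 0-100 composite."""
    normalized = {name: min_max(column) for name, column in raw_metrics.items()}
    total_weight = sum(weights.values())
    ids = next(iter(raw_metrics.values())).keys()
    return {
        nid: 100 * sum(weights[m] * normalized[m][nid] for m in weights) / total_weight
        for nid in ids
    }


# Made-up raw metrics keyed by metric name, then neighborhood ID.
raw = {
    "reachability": {"N01": 120, "N02": 80, "N03": 40},
    "amenity_access": {"N01": 30, "N02": 10, "N03": 20},
    "intersection_density": {"N01": 95, "N02": 95, "N03": 95},
}
weights = {"reachability": 0.5, "amenity_access": 0.3, "intersection_density": 0.2}
scores = walkability_scores(raw, weights)
print({nid: round(score, 1) for nid, score in scores.items()})
```

Because min-max scaling is relative to the neighborhoods in the run, scores from different cities or snapshots are not directly comparable without a shared reference range.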

Complexity:

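Runtime is dominated by repeated shortest-path queries over the graph and by spatial joins between amenities and neighborhood polygons. With a binary-heap Dijkstra, each query costs O((V + E) log V) in the worst case, so bounding each search by the time threshold keeps it manageable.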

5. Implementation Guide

5.1 Development Environment Setup

$ mamba create -n geo-p02 python=3.11 osmnx geopandas networkx folium pyarrow -y
$ mamba activate geo-p02

5.2 Project Structure

project2/
├── src/
│   ├── network.py
│   ├── isochrone.py
│   ├── amenities.py
│   ├── score.py
│   └── main.py
├── fixtures/
├── outputs/
└── tests/

5.3 The Core Question You Are Answering

“Can we measure neighborhood walkability in a way that reflects real movement constraints and remains transparent?”

5.4 Concepts You Must Understand First

Directed graph routing basics.
Spatial join determinism.
Metric normalization and weighting.

5.5 Questions to Guide Your Design

Which component should dominate score interpretation?
What minimum data completeness is acceptable per neighborhood?
How will you surface uncertainty?

5.6 Thinking Exercise

Compare two neighborhoods with similar amenity counts but different network connectivity; predict score difference before running code.

5.7 Interview Questions

Why is network distance preferable to Euclidean distance for walkability?
How do weighting choices affect policy interpretation?
How do you debug disconnected graph artifacts?

5.8 Hints in Layers

Hint 1: Validate graph connectivity first.
Hint 2: Implement one component metric at a time.
Hint 3: Add decomposition fields before the composite score.
Hint 4: Run a simple sensitivity check on the weights.

6. Testing Strategy

Category	Purpose
Unit	Metric functions and normalization logic
Integration	End-to-end outputs with fixture city
Edge Case	Sparse amenities and disconnected components

Critical tests:

A known neighborhood in the fixed fixture should reproduce its score within tolerance.
Boundary amenity assignment should be deterministic.
Missing amenity category should not crash pipeline.
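
The tolerance test could be built on a small comparison helper like the sketch below. The helper name, the 0.5-point tolerance, and the golden values are hypothetical; in practice the golden scores would come from a stored fixture run.

```python
import math


def check_regression(actual, golden, tolerance=0.5):
    """Return (missing, drifted) neighborhood IDs relative to golden scores."""
    missing = sorted(set(golden) - set(actual))
    drifted = sorted(
        nid
        for nid in golden
        if nid in actual
        and not math.isclose(actual[nid], golden[nid], abs_tol=tolerance)
    )
    return missing, drifted


# Made-up scores: N01 stays within tolerance, N02 drifts by 1.7 points.
golden = {"N01": 70.0, "N02": 40.0}
actual = {"N01": 70.2, "N02": 41.7}
print(check_regression(actual, golden))
```

A test would then assert that both returned lists are empty for the fixture city.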

7. Common Pitfalls & Debugging

Pitfall	Symptom	Fix
Disconnected graph nodes	Zero reachability in dense area	Component QA + node snap validation
Hidden weight bias	Counterintuitive rankings	Publish decomposition and sensitivity check
Amenity overcounting	Inflated scores	Deduplicate amenities by stable IDs
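
The deduplication fix can be sketched as a first-seen filter keyed on the OSM element type and ID, since OSM IDs are only unique within an element type. The record layout with osm_type and osm_id keys is an assumption about how amenities are stored.

```python
def deduplicate_amenities(amenities):
    """Keep the first record per (osm_type, osm_id); input order is preserved."""
    seen, unique = set(), []
    for amenity in amenities:
        key = (amenity["osm_type"], amenity["osm_id"])
        if key not in seen:
            seen.add(key)
            unique.append(amenity)
    return unique


# Made-up records: the second entry repeats the first, the third shares only the ID.
records = [
    {"osm_type": "node", "osm_id": 101, "category": "grocery"},
    {"osm_type": "node", "osm_id": 101, "category": "grocery"},
    {"osm_type": "way", "osm_id": 101, "category": "park"},
]
print(len(deduplicate_amenities(records)))
```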

8. Extensions & Challenges

Add multimodal access (walk + transit).
Add temporal scenarios (weekday peak vs off-peak assumptions).
Add equity lens with demographic overlays.

9. Real-World Connections

Housing search tools with access scoring.
Urban planning accessibility audits.
Retail site selection screening.

10. Resources

OSMnx docs: https://osmnx.readthedocs.io/
OSMnx methods paper: https://geoffboeing.com/publications/osmnx-complex-street-networks/
GeoPandas user guide: https://geopandas.org/en/stable/docs/user_guide.html

11. Self-Assessment Checklist

I can explain each score component and its weight.
I can justify graph assumptions used for isochrones.
I can reproduce the same ranking from fixture input.
I can identify data quality limitations in my output.

12. Submission / Completion Criteria

Minimum Viable Completion

One-city walkability score and map exported.
Deterministic fixture output generated.

Full Completion

Includes component decomposition and uncertainty flags.
Includes sensitivity analysis summary.

Excellence

Adds policy-specific scoring profiles (family, commuter, senior).
Adds automated regression test for ranking stability.