Project 5: Satellite Land Cover Classifier

Build a repeatable remote-sensing pipeline that classifies land cover and quantifies change by administrative zone.

Quick Reference

Attribute | Value
Difficulty | Level 3: Advanced (The Engineer)
Time Estimate | 28-40 hours
Main Programming Language | Python (Alternatives: R, Julia)
Coolness Level | Level 4: Hardcore Tech Flex
Business Potential | 3. The “Service & Support” Model
Prerequisites | Raster fundamentals, CRS alignment, basic classification
Key Topics | Multi-band rasters, masking, change detection, zonal stats

1. Learning Objectives

  1. Select and prepare comparable Sentinel-2 scenes.
  2. Apply raster alignment and nodata/cloud masking correctly.
  3. Compute spectral indices and classify core land-cover classes.
  4. Summarize change by zones with coverage-aware confidence flags.
  5. Explain uncertainty and false-change failure modes.

2. All Theory Needed (Per-Concept Breakdown)

2.1 Raster Grid Integrity

Fundamentals

  • Raster analytics depend on CRS, transform, resolution, and nodata metadata.
  • Multi-date comparison requires common grid alignment.

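Deep Dive

Most false change signals come from preprocessing mismatches rather than real land transformation. Align both scenes to one canonical grid (same CRS, transform, resolution, and extent) and choose the resampling method by variable type: nearest-neighbor for categorical layers such as class maps and masks, and an interpolating method such as bilinear for continuous reflectance bands. Keep metadata checks explicit and fail fast when a mismatch is detected.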
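
A fail-fast alignment check could be sketched as follows. This is a minimal illustration, not the project's actual implementation: `GridSpec`, `GridMismatchError`, and `assert_same_grid` are hypothetical names, and the tolerance is an assumed default. With rasterio, the fields could be filled from a dataset's `crs`, `transform`, `width`, and `height` attributes.

```python
import math
from dataclasses import dataclass


class GridMismatchError(ValueError):
    """Raised when two rasters do not share the same pixel grid."""


@dataclass(frozen=True)
class GridSpec:
    # Hypothetical container for the metadata that defines a raster grid.
    crs: str            # e.g. "EPSG:32633"
    transform: tuple    # affine coefficients (a, b, c, d, e, f)
    width: int
    height: int


def assert_same_grid(t1: GridSpec, t2: GridSpec, tol: float = 1e-9) -> None:
    """Raise GridMismatchError listing every grid property that differs."""
    problems = []
    if t1.crs != t2.crs:
        problems.append(f"crs {t1.crs} != {t2.crs}")
    if (t1.width, t1.height) != (t2.width, t2.height):
        problems.append(
            f"shape {t1.width}x{t1.height} != {t2.width}x{t2.height}"
        )
    same_transform = len(t1.transform) == len(t2.transform) and all(
        math.isclose(a, b, rel_tol=0.0, abs_tol=tol)
        for a, b in zip(t1.transform, t2.transform)
    )
    if not same_transform:
        problems.append(f"transform {t1.transform} != {t2.transform}")
    if problems:
        raise GridMismatchError("grid mismatch: " + "; ".join(problems))
```

Collecting every difference before raising keeps the error message useful: a half-pixel origin shift and a CRS mismatch call for different fixes, and seeing both at once avoids a second failed run.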

2.2 Masking and Spectral Indices

Fundamentals

  • Cloud and nodata masks must be applied before index calculations.
  • Spectral indices amplify specific land-cover signatures.

Deep Dive

Without masking, clouds and cloud shadows can mimic dramatic change. Index values such as NDVI are only comparable across dates when the scenes are radiometrically consistent, for example the same processing level and a similar season and illumination. Compute valid-pixel coverage as a QA metric before reporting any change percentages.
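
The mask-first rule can be illustrated with a small NumPy sketch. It assumes the red and near-infrared bands are already aligned arrays and that a boolean `invalid` mask (clouds, shadows, nodata) has been built elsewhere. The function names are illustrative.

```python
import numpy as np


def masked_ndvi(red, nir, invalid):
    """NDVI = (NIR - Red) / (NIR + Red), with NaN wherever a pixel is unusable."""
    red = np.asarray(red, dtype="float64")
    nir = np.asarray(nir, dtype="float64")
    denom = nir + red
    # A pixel is usable only if it is unmasked and the ratio is defined.
    usable = ~np.asarray(invalid, dtype=bool) & (denom != 0)
    ndvi = np.full(red.shape, np.nan)
    ndvi[usable] = (nir[usable] - red[usable]) / denom[usable]
    return ndvi


def valid_coverage(index):
    """Fraction of pixels that carry a finite index value."""
    return float(np.isfinite(index).mean())


red = np.array([[0.10, 0.20], [0.00, 0.30]])
nir = np.array([[0.50, 0.20], [0.00, 0.60]])
cloud_or_nodata = np.array([[False, False], [False, True]])

ndvi = masked_ndvi(red, nir, cloud_or_nodata)
coverage = valid_coverage(ndvi)  # 0.5: one zero-denominator pixel, one masked pixel
```

Writing NaN into masked pixels, rather than a sentinel number, means a later difference or mean cannot silently treat a cloud as a measurement.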

2.3 Classification and Zonal Interpretation

Fundamentals

  • Classification uncertainty must be reported, not hidden.
  • Zonal summaries inherit polygon size and boundary effects.

Deep Dive

Per-zone change reports are decision-friendly, but they can overstate confidence for small polygons or zones with low valid-pixel coverage. Add confidence flags based on pixel support and class uncertainty. Publish both absolute and relative change, because relative change alone ranks tiny zones with small absolute shifts above large zones with substantial ones.
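
One possible shape for the flagging and the two change measures is sketched below. The thresholds (`min_pixels=100`, `min_coverage=0.6`) and the flag labels are placeholders chosen for illustration, not values prescribed by this project.

```python
def confidence_flag(valid_pixels, total_pixels, min_pixels=100, min_coverage=0.6):
    """Return a coarse confidence label for one zone (thresholds are placeholders)."""
    if total_pixels <= 0 or valid_pixels <= 0:
        return "no_data"
    if valid_pixels < min_pixels:
        return "low_support"
    if valid_pixels / total_pixels < min_coverage:
        return "low_coverage"
    return "ok"


def class_change(area_t1_ha, area_t2_ha):
    """Return (absolute change in hectares, relative change in percent or None)."""
    absolute = area_t2_ha - area_t1_ha
    # Relative change is undefined when the class was absent at t1.
    relative = 100.0 * absolute / area_t1_ha if area_t1_ha > 0 else None
    return absolute, relative


small_zone = class_change(2.0, 4.0)        # (2.0, 100.0)
large_zone = class_change(1000.0, 1100.0)  # (100.0, 10.0)
```

Ranked by relative change, the small zone (+100%) looks ten times more dramatic than the large one (+10%), even though the large zone gained fifty times more area. Reporting both columns lets the reader see that.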


3. Project Specification

3.1 What You Will Build

A pipeline that:

  • queries and selects quality-controlled Sentinel-2 scenes,
  • computes indices and classifies land cover,
  • compares classes across dates,
  • outputs a zone-level change report and map.

Included:

  • deterministic scene IDs,
  • coverage-aware reporting,
  • interpretable class change outputs.

Excluded:

  • deep neural segmentation,
  • near-real-time stream processing,
  • global-scale processing.

3.2 Functional Requirements

  1. Select two comparable scenes for an AOI.
  2. Reproject/resample to a canonical grid.
  3. Apply cloud and nodata masks.
  4. Compute indices and classify land cover.
  5. Calculate zonal change metrics and export outputs.
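
Requirement 1 could be approached along these lines. The sketch assumes scene metadata has already been fetched into plain dictionaries. The keys (`id`, `date`, `cloud_pct`), the scene IDs, and all thresholds except the 15% cloud limit shown in the transcript are hypothetical. The day-of-year comparison ignores leap-year offsets, which is adequate for a seasonal window measured in weeks.

```python
from datetime import date
from itertools import combinations


def pick_scene_pair(scenes, max_cloud_pct=15.0, max_season_gap_days=30,
                    min_separation_days=300):
    """Pick the two clear scenes whose positions in the calendar year match best.

    Returns (earlier, later), or None when no pair qualifies.
    """
    clear = [s for s in scenes if s["cloud_pct"] <= max_cloud_pct]
    ordered = sorted(clear, key=lambda s: (s["date"], s["id"]))
    best = None
    for a, b in combinations(ordered, 2):
        if (b["date"] - a["date"]).days < min_separation_days:
            continue
        # Seasonal offset: day-of-year distance, wrapped around the year end.
        doy_gap = abs(a["date"].timetuple().tm_yday - b["date"].timetuple().tm_yday)
        doy_gap = min(doy_gap, 365 - doy_gap)
        if doy_gap > max_season_gap_days:
            continue
        # Scene IDs break ties so the same inputs always give the same pair.
        score = (doy_gap, a["cloud_pct"] + b["cloud_pct"], a["id"], b["id"])
        if best is None or score < best[0]:
            best = (score, a, b)
    return None if best is None else (best[1], best[2])


candidates = [
    {"id": "scene_a", "date": date(2024, 6, 3), "cloud_pct": 4.0},
    {"id": "scene_b", "date": date(2024, 9, 15), "cloud_pct": 5.0},
    {"id": "scene_c", "date": date(2025, 6, 2), "cloud_pct": 40.0},
    {"id": "scene_d", "date": date(2025, 6, 7), "cloud_pct": 8.0},
]
pair = pick_scene_pair(candidates)
```

Here `scene_c` is rejected for cloud cover and `scene_b` has no partner far enough away in time, so the pair is `scene_a` and `scene_d`.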

3.3 Non-Functional Requirements

  • Reproducibility: Scene IDs and thresholds recorded.
  • Trustworthiness: Coverage and uncertainty metrics included.
  • Scalability: Windowed/tiled processing for memory stability.
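
The windowed-processing idea can be sketched without committing to a specific reader. In the example below, `read_tile` is a hypothetical callback standing in for whatever returns a block of integer class codes. With rasterio, the same offsets could feed `rasterio.windows.Window(col_off, row_off, width, height)`. The 512-pixel tile size is an assumed default.

```python
import numpy as np


def iter_windows(height, width, tile=512):
    """Yield (row_off, col_off, n_rows, n_cols) tiles covering the raster exactly once."""
    if tile <= 0:
        raise ValueError("tile must be positive")
    for row_off in range(0, height, tile):
        n_rows = min(tile, height - row_off)
        for col_off in range(0, width, tile):
            n_cols = min(tile, width - col_off)
            yield row_off, col_off, n_rows, n_cols


def tiled_class_counts(read_tile, height, width, n_classes, tile=512):
    """Accumulate per-class pixel counts one tile at a time.

    Class codes are assumed to be integers in the range 0..n_classes-1.
    """
    counts = np.zeros(n_classes, dtype=np.int64)
    for window in iter_windows(height, width, tile):
        block = np.asarray(read_tile(*window))
        counts += np.bincount(block.ravel(), minlength=n_classes)
    return counts


# Stand-in for a raster on disk: a small in-memory class map.
class_map = np.arange(20).reshape(5, 4) % 3
counts = tiled_class_counts(
    lambda r, c, h, w: class_map[r:r + h, c:c + w], 5, 4, n_classes=3, tile=2
)
```

Because counts are additive, the tiled result matches a single full-array pass while peak memory stays bounded by one tile.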

3.4 Data Formats / Schemas

Zone change output columns:
- zone_id
- valid_pixel_coverage
- class_water_delta_pct
- class_vegetation_delta_pct
- class_built_up_delta_pct
- class_bare_soil_delta_pct
- confidence_flag
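
A writer that enforces this column order could look like the sketch below. It uses only the standard library. The function name is illustrative and the sample row contains made-up values, not real results.

```python
import csv
import io

ZONE_CHANGE_COLUMNS = [
    "zone_id",
    "valid_pixel_coverage",
    "class_water_delta_pct",
    "class_vegetation_delta_pct",
    "class_built_up_delta_pct",
    "class_bare_soil_delta_pct",
    "confidence_flag",
]


def write_zone_change_csv(rows, handle):
    """Write rows in the documented column order; reject incomplete or extra keys."""
    writer = csv.DictWriter(handle, fieldnames=ZONE_CHANGE_COLUMNS)
    writer.writeheader()
    for row in rows:
        missing = [col for col in ZONE_CHANGE_COLUMNS if col not in row]
        if missing:
            raise ValueError(f"zone {row.get('zone_id')!r} lacks columns: {missing}")
        writer.writerow(row)  # DictWriter raises ValueError on unknown keys


# Illustrative values only.
example_rows = [{
    "zone_id": "Z001",
    "valid_pixel_coverage": 0.91,
    "class_water_delta_pct": -0.4,
    "class_vegetation_delta_pct": -3.2,
    "class_built_up_delta_pct": 2.9,
    "class_bare_soil_delta_pct": 0.7,
    "confidence_flag": "ok",
}]
buffer = io.StringIO()
write_zone_change_csv(example_rows, buffer)
```

Failing on a missing column, instead of writing an empty cell, keeps a zone without a `confidence_flag` from reaching a reader who would assume it passed.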

3.5 Edge Cases

  • Cloud-heavy scene with low valid coverage
  • Seasonal mismatch between scenes
  • Tiny polygons with unstable class percentages
  • Nodata collisions during differencing
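
The nodata-collision case is worth a concrete illustration. The sketch below assumes class rasters use code 0 for "unclassified or nodata", which is a convention chosen for this example only. It counts change only where both dates are valid.

```python
import numpy as np


def class_transitions(classes_t1, classes_t2, nodata=0):
    """Compare two class maps only where both dates hold a valid class."""
    c1 = np.asarray(classes_t1)
    c2 = np.asarray(classes_t2)
    if c1.shape != c2.shape or c1.size == 0:
        raise ValueError("class maps must be non-empty and share one shape")
    both_valid = (c1 != nodata) & (c2 != nodata)
    changed = both_valid & (c1 != c2)
    n_valid = int(both_valid.sum())
    return {
        "joint_valid_fraction": n_valid / c1.size,
        "changed_fraction": int(changed.sum()) / n_valid if n_valid else None,
    }


t1 = np.array([[1, 1, 0], [2, 2, 3]])
t2 = np.array([[1, 2, 1], [0, 2, 3]])

summary = class_transitions(t1, t2)
naive_changed = float((t1 != t2).mean())  # 0.5: counts both nodata pixels as change
```

The naive comparison reports half of the pixels as changed. Restricting to jointly valid pixels gives one changed pixel out of four, and the `joint_valid_fraction` of 4/6 tells the reader how much of the area that figure actually describes.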

3.6 Real World Outcome

3.6.1 How to Run

$ python run_project5_landcover.py --aoi aoi.geojson --start 2024-05-01 --end 2025-05-31

3.6.2 Golden Path Demo

$ python run_project5_landcover.py --fixture fixtures/sentinel_pair_manifest.json
[INFO] scenes selected: t1=2024-06-03 t2=2025-06-07
[INFO] valid coverage median=0.87
[INFO] zones processed=84
[DONE] change outputs exported

3.6.3 Exact Terminal Transcript (Live)

$ python run_project5_landcover.py --aoi aoi.geojson --start 2024-05-01 --end 2025-05-31
[INFO] Querying scenes with cloud <= 15%
[INFO] Aligning rasters to canonical grid
[INFO] Computing NDVI/NDBI/NDWI
[INFO] Running land-cover classification
[INFO] Computing zonal change metrics
[INFO] Exporting outputs/landcover_change_by_zone.csv
[DONE] Completed in 11m 42s

4. Solution Architecture

4.1 High-Level Design

Scene Query -> Grid Alignment -> Masking -> Index + Classification -> Change Engine -> Zonal Reporter

4.2 Key Components

Component | Responsibility | Key Decision
Scene Selector | Choose comparable imagery | Cloud and season thresholds
Preprocessor | Align grids and masks | Canonical transform policy
Classifier | Assign land-cover classes | Rule-based vs supervised baseline
Change Engine | Compare class maps | Temporal pairing strategy
Zonal Reporter | Aggregate by polygons | Coverage and confidence policy
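
For the Classifier component, a rule-based baseline might look like the following sketch. The class codes and all three thresholds are placeholder assumptions that would need tuning per AOI. They are not recommended values, and the rule order is one possible priority scheme among many.

```python
import numpy as np

# Hypothetical class codes; 0 marks pixels that could not be classified.
UNCLASSIFIED, WATER, VEGETATION, BUILT_UP, BARE_SOIL = 0, 1, 2, 3, 4


def classify_rules(ndvi, ndwi, ndbi, ndwi_water=0.2, ndvi_veg=0.4, ndbi_built=0.1):
    """Assign one class per pixel from three index arrays (NaN means masked)."""
    ndvi, ndwi, ndbi = (np.asarray(a, dtype="float64") for a in (ndvi, ndwi, ndbi))
    valid = np.isfinite(ndvi) & np.isfinite(ndwi) & np.isfinite(ndbi)
    # np.select takes the first matching condition, so list order sets rule priority.
    classes = np.select(
        [~valid, ndwi > ndwi_water, ndvi > ndvi_veg, ndbi > ndbi_built],
        [UNCLASSIFIED, WATER, VEGETATION, BUILT_UP],
        default=BARE_SOIL,
    )
    return classes.astype(np.uint8)


ndvi = np.array([0.70, -0.10, 0.10, 0.05, np.nan])
ndwi = np.array([-0.30, 0.50, -0.20, -0.10, 0.00])
ndbi = np.array([-0.20, -0.40, 0.30, -0.05, 0.00])

labels = classify_rules(ndvi, ndwi, ndbi)  # vegetation, water, built-up, bare soil, unclassified
```

A rule-based baseline like this is easy to audit, since every label traces back to one threshold. That makes it a useful reference point before trying a supervised model.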

4.3 Algorithm Overview

  1. Select valid scene pair.
  2. Align to common grid.
  3. Apply masks and compute indices.
  4. Classify both dates.
  5. Compute class deltas and zonal summaries.
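
Step 5 can be sketched with plain NumPy. The example assumes zone polygons have already been rasterized onto the canonical grid as an integer `zones` array, that class code 0 means nodata, and that each `*_delta_pct` column is read as a change in class share in percentage points. The schema does not state that definition, so treat it as an assumption.

```python
import numpy as np


def zonal_class_deltas(zones, classes_t1, classes_t2, class_codes, nodata=0):
    """Per-zone change in class share (percentage points) over jointly valid pixels."""
    zones, c1, c2 = (np.asarray(a) for a in (zones, classes_t1, classes_t2))
    both_valid = (c1 != nodata) & (c2 != nodata)
    report = {}
    for zone_id in np.unique(zones):
        in_zone = zones == zone_id
        valid = in_zone & both_valid
        n_valid = int(valid.sum())
        row = {"valid_pixel_coverage": n_valid / int(in_zone.sum())}
        for name, code in class_codes.items():
            key = f"class_{name}_delta_pct"
            if n_valid == 0:
                row[key] = None  # no evidence at all for this zone
                continue
            share_t1 = (c1[valid] == code).mean()
            share_t2 = (c2[valid] == code).mean()
            row[key] = float(100.0 * (share_t2 - share_t1))
        report[int(zone_id)] = row
    return report


zones = np.array([[1, 1, 2, 2], [1, 1, 2, 2]])
t1 = np.array([[2, 2, 1, 1], [2, 2, 0, 1]])
t2 = np.array([[2, 3, 1, 1], [3, 3, 1, 1]])
codes = {"water": 1, "vegetation": 2, "built_up": 3}

report = zonal_class_deltas(zones, t1, t2, codes)
```

In this toy case zone 1 loses 75 points of vegetation share to built-up at full coverage, while zone 2 shows no change but only 75% coverage because one pixel was nodata at the first date.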

5. Implementation Guide

5.1 Development Environment Setup

$ mamba create -n geo-p05 python=3.11 rasterio geopandas numpy scipy -y
$ mamba activate geo-p05

5.2 Project Structure

project5/
├── src/
│   ├── scenes.py
│   ├── preprocess.py
│   ├── classify.py
│   ├── change.py
│   └── main.py
├── fixtures/
├── outputs/
└── tests/

5.3 The Core Question You Are Answering

“How do we convert satellite pixels into change signals that remain defensible after uncertainty checks?”

5.4 Concepts You Must Understand First

  1. Raster alignment invariants.
  2. Mask-first processing.
  3. Coverage-aware zonal reporting.

5.5 Questions to Guide Your Design

  1. How strict should the cloud threshold be for your AOI?
  2. Which classes are operationally meaningful for stakeholders?
  3. What minimum valid coverage is required per zone?

5.6 Thinking Exercise

Choose one zone with large observed change and list three alternative explanations unrelated to true land-cover change.

5.7 Interview Questions

  1. Why is grid alignment mandatory before differencing?
  2. How can nodata bias zonal change metrics?
  3. What makes a land-cover report decision-grade?

5.8 Hints in Layers

  • Hint 1: Freeze scene IDs first.
  • Hint 2: Validate transform/resolution equality before index math.
  • Hint 3: Compute coverage before classification summary.
  • Hint 4: Flag low-confidence zones explicitly in output.

6. Testing Strategy

Category | Purpose
Unit | Mask logic, index calculations, class delta math
Integration | End-to-end scene-pair workflow
Edge Case | Cloud-heavy scenes, low-coverage zones

Critical tests:

  1. Misaligned test rasters must fail fast with a clear error.
  2. Applying the cloud mask to the cloud-mask fixture should reduce false-change spikes.
  3. Low-coverage zones should carry a confidence warning flag.

7. Common Pitfalls & Debugging

Pitfall | Symptom | Fix
Misaligned rasters | Checkerboard-like false change | Align both scenes to canonical grid
Mask omission | Implausible class transitions | Apply cloud/nodata masks before classification
Overconfident reporting | Small zones show extreme noisy deltas | Add minimum coverage thresholds and flags

8. Extensions & Challenges

  • Add a multi-year trend pipeline.
  • Add class-specific uncertainty estimates.
  • Add temporal smoothing for noisy class transitions.

9. Real-World Connections

  • Urban growth and impervious-surface monitoring.
  • Conservation and vegetation-loss tracking.
  • Municipal land-use change reporting.

10. Resources

  • Sentinel-2 mission overview: https://sentinels.copernicus.eu/web/sentinel/missions/sentinel-2
  • Rasterio docs: https://rasterio.readthedocs.io/en/stable/
  • GDAL docs: https://gdal.org/en/stable/

11. Self-Assessment Checklist

  • I can explain why scene pairing choices matter.
  • I can prove my rasters are aligned before differencing.
  • I can report valid-pixel coverage with every zone result.
  • I can articulate uncertainty sources in class-change outputs.

12. Submission / Completion Criteria

Minimum Viable Completion

  • Working class-change report for one AOI with deterministic scene IDs.

Full Completion

  • Includes coverage-aware zonal report and confidence flags.

Excellence

  • Includes uncertainty analysis and false-change diagnostics appendix.