Project 5: Satellite Land Cover Classifier
Build a repeatable remote-sensing pipeline that classifies land cover and quantifies change by administrative zone.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 3: Advanced (The Engineer) |
| Time Estimate | 28-40 hours |
| Main Programming Language | Python (Alternatives: R, Julia) |
| Coolness Level | Level 4: Hardcore Tech Flex |
| Business Potential | 3. The “Service & Support” Model |
| Prerequisites | Raster fundamentals, CRS alignment, basic classification |
| Key Topics | Multi-band rasters, masking, change detection, zonal stats |
1. Learning Objectives
- Select and prepare comparable Sentinel-2 scenes.
- Apply raster alignment and nodata/cloud masking correctly.
- Compute spectral indices and classify core land-cover classes.
- Summarize change by zone with coverage-aware confidence flags.
- Explain uncertainty and false-change failure modes.
2. All Theory Needed (Per-Concept Breakdown)
2.1 Raster Grid Integrity
Fundamentals
- Raster analytics depend on CRS, transform, resolution, and nodata metadata.
- Multi-date comparison requires common grid alignment.
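Deep Dive
Most false change signals come from preprocessing mismatches, not true land transformation. Ensure both scenes are aligned to a canonical grid and choose the resampling method by variable type: nearest-neighbor for categorical layers such as class maps and masks, and an interpolating method such as bilinear for continuous reflectance. Keep metadata checks explicit and fail fast if a mismatch is detected.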
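A minimal sketch of an explicit, fail-fast grid check, assuming the grid metadata has already been read into a small container. `GridSpec`, `GridMismatchError`, and `assert_same_grid` are hypothetical names introduced here for illustration; in a rasterio-based pipeline the same fields would typically come from an open dataset's `crs`, `transform`, `width`, `height`, and `nodata` attributes.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class GridSpec:
    """Hypothetical container for the metadata that defines a raster grid."""
    crs: str                      # e.g. "EPSG:32633"
    transform: Tuple[float, ...]  # six affine coefficients (a, b, c, d, e, f)
    width: int
    height: int
    nodata: Optional[float]


class GridMismatchError(ValueError):
    """Raised when two rasters do not share one grid."""


def _same_nodata(x: Optional[float], y: Optional[float]) -> bool:
    # None means "no nodata declared"; NaN never equals itself, so test it explicitly.
    if x is None or y is None:
        return x is None and y is None
    return x == y or (math.isnan(x) and math.isnan(y))


def assert_same_grid(a: GridSpec, b: GridSpec, tol: float = 1e-9) -> None:
    """Collect every mismatch, then raise once so the error message is complete."""
    problems = []
    if a.crs != b.crs:
        problems.append(f"crs {a.crs} != {b.crs}")
    if (a.width, a.height) != (b.width, b.height):
        problems.append(f"shape {a.width}x{a.height} != {b.width}x{b.height}")
    if len(a.transform) != len(b.transform) or any(
        abs(x - y) > tol for x, y in zip(a.transform, b.transform)
    ):
        problems.append(f"transform {a.transform} != {b.transform}")
    if not _same_nodata(a.nodata, b.nodata):
        problems.append(f"nodata {a.nodata} != {b.nodata}")
    if problems:
        raise GridMismatchError("; ".join(problems))


# Usage: the second grid is shifted by half a 10 m pixel in x.
t1 = GridSpec("EPSG:32633", (10.0, 0.0, 500000.0, 0.0, -10.0, 4100000.0), 1000, 1000, 0.0)
t2 = GridSpec("EPSG:32633", (10.0, 0.0, 500005.0, 0.0, -10.0, 4100000.0), 1000, 1000, 0.0)
try:
    assert_same_grid(t1, t2)
except GridMismatchError as err:
    print(f"grid check failed: {err}")
```

Collecting all mismatches before raising is a design choice: a single run then reports every metadata problem instead of one per attempt.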
2.2 Masking and Spectral Indices
Fundamentals
- Cloud and nodata masks must be applied before index calculations.
- Spectral indices amplify specific land-cover signatures.
Deep Dive
Without masking, clouds and shadows can mimic dramatic change. Interpreting NDVI-like indices across dates also depends on scene comparability and radiometric consistency. Build QA metrics for valid-pixel coverage before reporting any change percentages.
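The mask-first rule can be sketched in plain Python on flattened pixel lists. This is an illustration under stated assumptions, not the project's implementation: `masked_ndvi` and `valid_coverage` are hypothetical helper names, the inputs are assumed to be reflectance values for the red and near-infrared bands (B4 and B8 on Sentinel-2), and `valid` is assumed to be a boolean mask already derived from the cloud and nodata layers.

```python
import math


def masked_ndvi(red, nir, valid):
    """Return NDVI = (NIR - Red) / (NIR + Red) per pixel.

    Pixels that are masked out, or whose denominator is zero, become NaN
    so they cannot leak into later statistics.
    """
    out = []
    for r, n, ok in zip(red, nir, valid):
        denom = n + r
        if not ok or denom == 0:
            out.append(math.nan)
        else:
            out.append((n - r) / denom)
    return out


def valid_coverage(values):
    """Fraction of pixels that carry a usable (non-NaN) value."""
    if not values:
        return 0.0
    return sum(not math.isnan(v) for v in values) / len(values)


# Four pixels: the third is flagged as cloud, the fourth has zero reflectance.
red = [0.1, 0.3, 0.2, 0.0]
nir = [0.5, 0.3, 0.6, 0.0]
valid = [True, True, False, True]

ndvi = masked_ndvi(red, nir, valid)
coverage = valid_coverage(ndvi)
print(ndvi, coverage)
```

On real scenes the same logic would normally be vectorized with NumPy; the per-pixel loop is used here only to keep the rule visible.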
2.3 Classification and Zonal Interpretation
Fundamentals
- Classification uncertainty must be reported, not hidden.
- Zonal summaries inherit polygon size and boundary effects.
Deep Dive
Per-zone change reports are decision-friendly, but they can overstate confidence for small polygons or zones with low valid-pixel coverage. Add confidence flags based on pixel support and class uncertainty. Publish both absolute and relative change so that zone rankings are not driven by small denominators.
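A short sketch of why both change measures matter, and of a pixel-support confidence flag. The function names, the `"low"`/`"ok"` flag values, and the thresholds (`min_pixels=100`, `min_coverage=0.6`) are assumptions chosen for illustration and would need calibration per AOI. The sketch also assumes that `valid_pixels` counts pixels that are valid on both dates.

```python
def class_change(pixels_t1, pixels_t2, valid_pixels):
    """Return (absolute change in percentage points of valid area,
    relative change in percent of the class's t1 area)."""
    if valid_pixels <= 0:
        return None, None
    delta = pixels_t2 - pixels_t1
    absolute_pp = 100.0 * delta / valid_pixels
    # Relative change is undefined when the class was absent at t1.
    relative_pct = None if pixels_t1 == 0 else 100.0 * delta / pixels_t1
    return absolute_pp, relative_pct


def confidence_flag(valid_pixels, total_pixels, min_pixels=100, min_coverage=0.6):
    """Flag zones whose result rests on too few or too sparse valid pixels."""
    coverage = valid_pixels / total_pixels if total_pixels else 0.0
    if valid_pixels < min_pixels or coverage < min_coverage:
        return "low"
    return "ok"


# A tiny zone: vegetation grows from 2 to 6 of 40 valid pixels.
small = class_change(2, 6, 40)
# A large zone: vegetation grows from 3000 to 3500 of 10000 valid pixels.
large = class_change(3000, 3500, 10000)

print("small zone:", small, confidence_flag(40, 50))
print("large zone:", large, confidence_flag(10000, 11000))
```

Ranked by relative change, the tiny zone dominates even though it gained four pixels; the flag and the absolute figure make that visible to the reader of the report.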
3. Project Specification
3.1 What You Will Build
A pipeline that:
- queries and selects quality-controlled Sentinel-2 scenes,
- computes indices and classifies land cover,
- compares classes across dates,
- outputs a zone-level change report and map.
Included:
- deterministic scene IDs,
- coverage-aware reporting,
- interpretable class change outputs.
Excluded:
- deep neural segmentation,
- near-real-time stream processing,
- global-scale processing.
3.2 Functional Requirements
- Select two comparable scenes for an AOI.
- Reproject/resample to canonical grid.
- Apply cloud and nodata masks.
- Compute indices and classify land cover.
- Calculate zonal change metrics and export outputs.
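One way the first requirement could be approached is sketched below, assuming scene metadata is already available as a list of dictionaries. `select_pair`, the dictionary keys, the scene IDs, and the `max_doy_gap` and `min_days_apart` defaults are all hypothetical; only the 15% cloud threshold echoes the transcript in this guide. The idea is to keep low-cloud scenes, require the pair to be roughly a year apart, and prefer the pair closest in day of year so that seasonal differences are small.

```python
from datetime import date
from itertools import combinations


def select_pair(scenes, max_cloud=15.0, max_doy_gap=21, min_days_apart=300):
    """Pick the scene pair with the smallest day-of-year gap, or None."""
    candidates = sorted(
        (s for s in scenes if s["cloud_pct"] <= max_cloud),
        key=lambda s: (s["date"], s["id"]),
    )
    best = None
    for a, b in combinations(candidates, 2):
        if (b["date"] - a["date"]).days < min_days_apart:
            continue
        gap = abs(a["date"].timetuple().tm_yday - b["date"].timetuple().tm_yday)
        gap = min(gap, 365 - gap)  # wrap around the year boundary
        if gap > max_doy_gap:
            continue
        # Ties are broken by total cloud cover, then by ID, so the choice is deterministic.
        key = (gap, a["cloud_pct"] + b["cloud_pct"], a["id"], b["id"])
        if best is None or key < best[0]:
            best = (key, a, b)
    return None if best is None else (best[1], best[2])


# Hypothetical scene metadata.
scenes = [
    {"id": "S2_A", "date": date(2024, 6, 3), "cloud_pct": 4.0},
    {"id": "S2_B", "date": date(2024, 7, 20), "cloud_pct": 2.0},
    {"id": "S2_C", "date": date(2025, 6, 7), "cloud_pct": 9.0},
    {"id": "S2_D", "date": date(2025, 5, 20), "cloud_pct": 40.0},
]
pair = select_pair(scenes)
print([s["id"] for s in pair] if pair else "no comparable pair")
```

Returning `None` instead of the least-bad pair keeps the pipeline from silently comparing scenes that fail the comparability rules.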
3.3 Non-Functional Requirements
- Reproducibility: Scene IDs and thresholds recorded.
- Trustworthiness: Coverage and uncertainty metrics included.
- Scalability: Windowed/tiled processing for memory stability.
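A minimal sketch of the tiling arithmetic behind windowed processing, assuming a fixed square tile size. `iter_windows` and the 512-pixel default are illustrative choices; rasterio offers its own window support (for example `rasterio.windows.Window` and a dataset's `block_windows()`), which would normally be preferred because it can follow the file's internal block layout.

```python
def iter_windows(height, width, tile=512):
    """Yield (row_off, col_off, n_rows, n_cols) tiles that cover the raster once.

    Edge tiles are clipped so no window extends past the raster bounds.
    """
    if tile <= 0:
        raise ValueError("tile must be positive")
    for row in range(0, height, tile):
        for col in range(0, width, tile):
            yield row, col, min(tile, height - row), min(tile, width - col)


windows = list(iter_windows(1000, 1200, tile=512))
print(len(windows), windows[-1])
```

Peak memory then scales with the tile size rather than the scene size, as long as each stage processes and writes one window at a time.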
3.4 Data Formats / Schemas
Zone change output columns:
- zone_id
- valid_pixel_coverage
- class_water_delta_pct
- class_vegetation_delta_pct
- class_built_up_delta_pct
- class_bare_soil_delta_pct
- confidence_flag
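The column list above can be enforced at write time. The sketch below uses only the standard library; `write_zone_report` is a hypothetical helper, and the row values (including the `"ok"` flag and the `Z001` zone ID) are made-up illustration data, not results.

```python
import csv
import io

COLUMNS = [
    "zone_id",
    "valid_pixel_coverage",
    "class_water_delta_pct",
    "class_vegetation_delta_pct",
    "class_built_up_delta_pct",
    "class_bare_soil_delta_pct",
    "confidence_flag",
]


def write_zone_report(rows, handle):
    """Write rows as CSV, rejecting any row that does not match the schema."""
    # DictWriter raises ValueError on unexpected keys by default (extrasaction="raise").
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        # Missing keys would otherwise be written silently as empty strings.
        missing = [c for c in COLUMNS if c not in row]
        if missing:
            raise ValueError(f"zone {row.get('zone_id')!r} is missing columns: {missing}")
        writer.writerow(row)


# Illustrative row only; the numbers are invented.
example = {
    "zone_id": "Z001",
    "valid_pixel_coverage": 0.87,
    "class_water_delta_pct": 0.1,
    "class_vegetation_delta_pct": -2.4,
    "class_built_up_delta_pct": 1.9,
    "class_bare_soil_delta_pct": 0.4,
    "confidence_flag": "ok",
}
buffer = io.StringIO()
write_zone_report([example], buffer)
print(buffer.getvalue())
```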
3.5 Edge Cases
- Cloud-heavy scene with low valid coverage
- Seasonal mismatch between scenes
- Tiny polygons with unstable class percentages
- Nodata collisions during differencing
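The last edge case can be made concrete with a small sketch. It assumes class maps are flattened lists of integer codes and that a single sentinel marks nodata; the value `255` and the name `transition_counts` are assumptions for illustration. A pixel counts toward change only when it is valid on both dates, and skipped pixels are reported rather than dropped silently.

```python
from collections import Counter

NODATA = 255  # hypothetical sentinel for masked or missing pixels


def transition_counts(classes_t1, classes_t2, nodata=NODATA):
    """Count (class_t1, class_t2) pairs, skipping pixels invalid on either date."""
    if len(classes_t1) != len(classes_t2):
        raise ValueError("class maps must have the same number of pixels")
    counts = Counter()
    skipped = 0
    for a, b in zip(classes_t1, classes_t2):
        if a == nodata or b == nodata:
            skipped += 1
            continue
        counts[(a, b)] += 1
    return counts, skipped


t1 = [1, 1, 2, NODATA, 3]
t2 = [1, 2, 2, 2, NODATA]
counts, skipped = transition_counts(t1, t2)
print(dict(counts), "skipped:", skipped)
```

Without the guard, the fourth pixel would be reported as a transition from class 255 to class 2, which is exactly the kind of implausible change a nodata collision produces.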
3.6 Real World Outcome
3.6.1 How to Run
$ python run_project5_landcover.py --aoi aoi.geojson --start 2024-05-01 --end 2025-05-31
3.6.2 Golden Path Demo
$ python run_project5_landcover.py --fixture fixtures/sentinel_pair_manifest.json
[INFO] scenes selected: t1=2024-06-03 t2=2025-06-07
[INFO] valid coverage median=0.87
[INFO] zones processed=84
[DONE] change outputs exported
3.6.3 Exact Terminal Transcript (Live)
$ python run_project5_landcover.py --aoi aoi.geojson --start 2024-05-01 --end 2025-05-31
[INFO] Querying scenes with cloud <= 15%
[INFO] Aligning rasters to canonical grid
[INFO] Computing NDVI/NDBI/NDWI
[INFO] Running land-cover classification
[INFO] Computing zonal change metrics
[INFO] Exporting outputs/landcover_change_by_zone.csv
[DONE] Completed in 11m 42s
4. Solution Architecture
4.1 High-Level Design
Scene Query -> Grid Alignment -> Masking -> Index + Classification -> Change Engine -> Zonal Reporter
4.2 Key Components
| Component | Responsibility | Key Decision |
|---|---|---|
| Scene Selector | Choose comparable imagery | Cloud and season thresholds |
| Preprocessor | Align grids and apply masks | Canonical transform policy |
| Classifier | Assign land-cover classes | Rule-based vs supervised baseline |
| Change Engine | Compare class maps | Temporal pairing strategy |
| Zonal Reporter | Aggregate by polygons | Coverage and confidence policy |
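For the Classifier row, a rule-based baseline might look like the sketch below. Every threshold is an uncalibrated placeholder and the rule order is an assumption; separating built-up from bare soil with NDBI alone is known to be unreliable, so treat this as a starting point to compare against a supervised baseline rather than a recommended rule set. The class names follow the output schema in section 3.4.

```python
import math


def classify_pixel(ndvi, ndwi, ndbi,
                   water_ndwi=0.2, veg_ndvi=0.4, built_ndbi=0.0):
    """Assign one land-cover label from three index values.

    Rules are checked in order: water, vegetation, built-up, else bare soil.
    Any NaN input (a masked pixel) yields "nodata".
    """
    if any(math.isnan(v) for v in (ndvi, ndwi, ndbi)):
        return "nodata"
    if ndwi > water_ndwi:
        return "water"
    if ndvi > veg_ndvi:
        return "vegetation"
    if ndbi > built_ndbi:
        return "built_up"
    return "bare_soil"


samples = [
    (0.7, -0.3, -0.2),
    (-0.1, 0.5, -0.3),
    (0.1, -0.2, 0.15),
    (0.1, -0.2, -0.1),
    (math.nan, 0.0, 0.0),
]
labels = [classify_pixel(*s) for s in samples]
print(labels)
```

Keeping the thresholds as parameters makes it straightforward to record them alongside the scene IDs, which the reproducibility requirement asks for.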
4.3 Algorithm Overview
- Select valid scene pair.
- Align to common grid.
- Apply masks and compute indices.
- Classify both dates.
- Compute class deltas and zonal summaries.
5. Implementation Guide
5.1 Development Environment Setup
$ mamba create -n geo-p05 python=3.11 rasterio geopandas numpy scipy -y
$ mamba activate geo-p05
5.2 Project Structure
project5/
├── src/
│ ├── scenes.py
│ ├── preprocess.py
│ ├── classify.py
│ ├── change.py
│ └── main.py
├── fixtures/
├── outputs/
└── tests/
5.3 The Core Question You Are Answering
“How do we convert satellite pixels into change signals that remain defensible after uncertainty checks?”
5.4 Concepts You Must Understand First
- Raster alignment invariants.
- Mask-first processing.
- Coverage-aware zonal reporting.
5.5 Questions to Guide Your Design
- How strict should the cloud threshold be for your AOI?
- Which classes are operationally meaningful for stakeholders?
- What minimum valid coverage is required per zone?
5.6 Thinking Exercise
Choose one zone with large observed change and list three alternative explanations unrelated to true land-cover change.
5.7 Interview Questions
- Why is grid alignment mandatory before differencing?
- How can nodata bias zonal change metrics?
- What makes a land-cover report decision-grade?
5.8 Hints in Layers
- Hint 1: Freeze scene IDs first.
- Hint 2: Validate transform/resolution equality before index math.
- Hint 3: Compute coverage before summarizing classification results.
- Hint 4: Flag low-confidence zones explicitly in output.
6. Testing Strategy
| Category | Purpose |
|---|---|
| Unit | Mask logic, index calculations, class delta math |
| Integration | End-to-end scene-pair workflow |
| Edge Case | Cloud-heavy scenes, low-coverage zones |
Critical tests:
- Misaligned test rasters must fail fast with a clear error.
- Applying the cloud mask to a cloud-contaminated fixture should reduce false-change spikes.
- Low-coverage zones should carry a confidence warning flag.
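The cloud-mask test can be written against a tiny synthetic fixture. The sketch below is self-contained and runs under pytest or as a plain function call; `changed_fraction` is a hypothetical stand-in for the project's change metric, and the eight-pixel fixture is invented so that the expected values are easy to verify by hand.

```python
def changed_fraction(classes_t1, classes_t2, cloud_t2=None):
    """Share of compared pixels whose class differs between dates.

    When a cloud mask for the second date is given, cloudy pixels are
    excluded from both the numerator and the denominator.
    """
    pairs = [
        (a, b)
        for i, (a, b) in enumerate(zip(classes_t1, classes_t2))
        if cloud_t2 is None or not cloud_t2[i]
    ]
    if not pairs:
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)


def test_cloud_mask_reduces_false_change():
    # Stable vegetation on both dates; two cloudy pixels at t2 were
    # misclassified as bare soil.
    t1 = ["vegetation"] * 8
    t2 = ["vegetation"] * 6 + ["bare_soil"] * 2
    cloud_t2 = [False] * 6 + [True] * 2

    unmasked = changed_fraction(t1, t2)
    masked = changed_fraction(t1, t2, cloud_t2)

    assert unmasked == 0.25   # 2 of 8 pixels appear to change
    assert masked == 0.0      # no change once cloudy pixels are excluded
    assert masked < unmasked
```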
7. Common Pitfalls & Debugging
| Pitfall | Symptom | Fix |
|---|---|---|
| Misaligned rasters | Checkerboard-like false change | Align both scenes to canonical grid |
| Mask omission | Implausible class transitions | Apply cloud/nodata masks before classification |
| Overconfident reporting | Small zones show extreme noisy deltas | Add minimum coverage thresholds and flags |
8. Extensions & Challenges
- Add multi-year trend pipeline.
- Add class-specific uncertainty estimates.
- Add temporal smoothing for noisy class transitions.
9. Real-World Connections
- Urban growth and impervious-surface monitoring.
- Conservation and vegetation-loss tracking.
- Municipal land-use change reporting.
10. Resources
- Sentinel-2 mission overview: https://sentinels.copernicus.eu/web/sentinel/missions/sentinel-2
- Rasterio docs: https://rasterio.readthedocs.io/en/stable/
- GDAL docs: https://gdal.org/en/stable/
11. Self-Assessment Checklist
- I can explain why scene pairing choices matter.
- I can prove my rasters are aligned before differencing.
- I can report valid-pixel coverage with every zone result.
- I can articulate uncertainty sources in class-change outputs.
12. Submission / Completion Criteria
Minimum Viable Completion
- Working class-change report for one AOI with deterministic scene IDs.
Full Completion
- Includes coverage-aware zonal report and confidence flags.
Excellence
- Includes uncertainty analysis and false-change diagnostics appendix.