Project 12: Debug Overlay & Profiler

Build an in-game performance overlay.

Quick Reference

Attribute | Value
Difficulty | Level 4
Time Estimate | 2-3 weeks
Main Programming Language | Game Boy Assembly (SM83/LR35902)
Alternative Programming Languages | C (GBDK-2020), C++ (GBDK-2020)
Coolness Level | Level 4
Business Potential | Level 1
Prerequisites | DMG memory map, VBlank timing, basic assembly concepts
Key Topics | profiling, frame budget, prioritization

1. Learning Objectives

By completing this project, you will:

  1. Build and verify a working debug overlay & profiler system.
  2. Apply DMG hardware constraints (timing, memory, and I/O rules).
  3. Create repeatable validation steps using emulator tooling.
  4. Document decisions and trade-offs for future projects.

2. All Theory Needed (Per-Concept Breakdown)

Debugging, Profiling, and Frame Budgets

Fundamentals

DMG games run under tight CPU and VRAM budgets. Profiling makes those budgets visible so you can prioritize work and avoid frame drops.

Deep Dive into the concept

The DMG CPU executes a fixed number of cycles per frame: 70,224 clock cycles (17,556 machine cycles), at roughly 59.7 frames per second. If your logic exceeds that budget, the PPU still advances, so work meant for VBlank runs late, misses its VRAM window, and produces unstable output. A profiler helps you measure where the time goes. A simple strategy is to mark frame start and end using a timer or VBlank counter, then compute the delta. You can also count sprites, VRAM writes, and bank switches to identify heavy workloads. Once you can see budget usage, you can enforce limits: skip low-priority tasks, reduce effect density, or delay non-critical updates. This turns performance from guesswork into a controlled system.
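To put numbers on the budget, here is a small host-side C sketch (plain C that runs on your PC, not GBDK code). It uses the commonly documented DMG timings of 456 clock cycles per scanline and 154 scanlines per frame. The constant and function names are illustrative assumptions, not part of any toolchain.

```c
#include <stdint.h>

/* Commonly documented DMG timing constants (see Pan Docs). */
enum {
    DOTS_PER_LINE   = 456, /* clock cycles (dots) per scanline */
    LINES_PER_FRAME = 154, /* 144 visible lines + 10 VBlank lines */
    VBLANK_LINES    = 10,
    DOTS_PER_MCYCLE = 4    /* one machine cycle = 4 clock cycles */
};

/* Total clock cycles in one frame. */
uint32_t frame_dots(void) {
    return (uint32_t)DOTS_PER_LINE * LINES_PER_FRAME;
}

/* Total machine cycles in one frame. */
uint32_t frame_mcycles(void) {
    return frame_dots() / DOTS_PER_MCYCLE;
}

/* Machine cycles available during VBlank alone. */
uint32_t vblank_mcycles(void) {
    return (uint32_t)DOTS_PER_LINE * VBLANK_LINES / DOTS_PER_MCYCLE;
}

/* Share of the frame used by `used_mcycles`, in percent, rounded down. */
uint32_t budget_percent(uint32_t used_mcycles) {
    return used_mcycles * 100u / frame_mcycles();
}
```

VBlank is only a small slice of the frame (1,140 of 17,556 machine cycles), which is why VRAM-heavy updates need to be rationed.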

How this fits into the project

This concept is central to the Debug Overlay & Profiler. It informs the build pipeline, timing discipline, and verification steps used throughout the project.

Definitions & key terms

  • Frame budget: The total CPU time available per frame.
  • Instrumentation: Measuring performance by adding counters or timers.

Mental model diagram

Frame Start -> Work -> Frame End -> Budget Check

How it works (step-by-step)

  1. Measure frame start and end with a timer.
  2. Compute delta and compare to budget.
  3. If over budget, skip low-priority tasks.
  4. Display metrics for visibility.
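The steps above can be sketched in host-side C. This version measures in scanlines rather than timer ticks: on hardware the current scanline is read from the LY register (0xFF44), which counts 0 to 153. Here the scanline is passed in as a parameter so the logic runs off-target. The type `profiler_t`, the function names, and the budget value are all hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

#define LINES_PER_FRAME 154u /* LY counts 0..153 on DMG */

typedef struct {
    uint8_t start_ly;     /* scanline when this frame's work began */
    uint8_t last_lines;   /* scanlines consumed by the last measured work */
    uint8_t budget_lines; /* hypothetical budget, in scanlines */
    bool    over_budget;  /* true when last_lines exceeded the budget */
} profiler_t;

/* Step 1 (start): remember the scanline at which work begins. */
void prof_begin(profiler_t *p, uint8_t ly_now) {
    p->start_ly = ly_now;
}

/* Steps 1-2 (end + compare): compute the delta with wraparound at 154
   scanlines, then compare it against the budget. */
void prof_end(profiler_t *p, uint8_t ly_now) {
    uint8_t delta =
        (uint8_t)((ly_now + LINES_PER_FRAME - p->start_ly) % LINES_PER_FRAME);
    p->last_lines = delta;
    p->over_budget = delta > p->budget_lines;
}

/* Step 3: callers consult this before running low-priority tasks. */
bool prof_should_skip_optional(const profiler_t *p) {
    return p->over_budget;
}
```

One limitation of this sketch: work that takes longer than a full frame wraps around and under-reports, so a real overlay would likely pair it with a VBlank counter to detect dropped frames.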

Minimal concrete example (pseudocode)

if frame_time > budget: skip_optional_tasks()

Common misconceptions

  • “Optimization is premature” -> on DMG, it is survival.

Check-your-understanding questions

  • Why can’t you exceed frame budget on DMG?
  • How do you reduce profiling overhead?
  • Predict what happens if you skip audio updates.

Check-your-understanding answers

  • PPU and timers progress regardless of CPU load.
  • Measure at fixed points and keep instrumentation light.
  • You will hear audio glitches or timing drift.

Real-world applications

  • Real-time performance tuning
  • Debug overlays

Where you’ll apply it

You’ll apply it in Section 3.1 and Section 5.10. Also used in: P12 Debug Overlay & Profiler.

References

  • Making Embedded Systems, timing chapters

Key insights

Performance is a visible budget, not a vague feeling.

Summary

Profiling makes DMG timing constraints measurable and enforceable.

Homework/Exercises to practice the concept

  • Define a budget and list tasks by priority.

Solutions to the homework/exercises

  • Audio and input stay high priority; cosmetic effects are optional.
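One way to turn that priority list into code is a task table walked in priority order. The sketch below is host-side C under assumed names (`task_t`, `schedule_frame`) and made-up cost estimates. Required tasks always run; optional tasks run only if their estimated cost still fits in the remaining budget.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

typedef struct {
    const char *name;
    uint16_t    cost;     /* estimated cost in hypothetical units (e.g. scanlines) */
    bool        required; /* required tasks run even when the budget is exceeded */
    bool        ran;      /* set by schedule_frame */
} task_t;

/* Walks tasks in priority order (index 0 = highest priority).
   Returns the total estimated cost of the tasks that were admitted. */
uint16_t schedule_frame(task_t *tasks, size_t count, uint16_t budget) {
    uint16_t spent = 0;
    for (size_t i = 0; i < count; i++) {
        bool fits = (uint32_t)spent + tasks[i].cost <= budget;
        tasks[i].ran = tasks[i].required || fits;
        if (tasks[i].ran) {
            spent = (uint16_t)(spent + tasks[i].cost);
        }
    }
    return spent;
}
```

With a budget of 100 and tasks audio (20, required), input (5, required), sprites (60), particles (40), and HUD (10), the particles are skipped while the cheaper HUD update still fits.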

3. Project Specification

3.1 What You Will Build

You will build a debug overlay and profiler for a DMG ROM: instrumentation that measures per-frame work, plus an on-screen (or logged) readout of those measurements. It will be functional, repeatable, and verifiable in strict emulators, with clear output signals and a documented validation process. Advanced extras such as CGB-only features are out of scope.

3.2 Functional Requirements

  1. Core functionality: Measure per-frame work and show the result in a debug overlay.
  2. Deterministic output: Provide a repeatable visible or logged result.
  3. Hardware constraints: Respect VRAM/OAM timing and memory map rules.

3.3 Non-Functional Requirements

  • Performance: Must stay within a safe per-frame budget.
  • Reliability: Must behave consistently across two emulators.
  • Usability: Clear on-screen or logged indicators for success.

3.4 Example Usage / Output

Build:
$ rgbasm -o build/main.o src/main.asm
$ rgblink -o build/game.gb build/main.o
$ rgbfix -v -p 0 build/game.gb

Run:
$ sameboy build/game.gb

Expected:
- No header warnings
- Stable on-screen indicator or log output

Exit Codes:
- 0 = success
- 1 = build failure
- 2 = header validation failure
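For context on the header validation step: the cartridge header holds a checksum byte at 0x014D, computed over bytes 0x0134 through 0x014C as documented in Pan Docs, and `rgbfix -v` writes the correct value. A host-side check might look like the sketch below. The function names are hypothetical, and this is not how any particular tool implements it.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define HDR_START    0x0134u /* first byte covered by the checksum */
#define HDR_END      0x014Cu /* last byte covered (inclusive) */
#define HDR_CHECKSUM 0x014Du /* where the checksum byte is stored */

/* Header checksum as documented in Pan Docs:
   x = 0; for each covered byte: x = x - byte - 1 (8-bit arithmetic). */
uint8_t gb_header_checksum(const uint8_t *rom) {
    uint8_t x = 0;
    for (size_t a = HDR_START; a <= HDR_END; a++) {
        x = (uint8_t)(x - rom[a] - 1);
    }
    return x;
}

/* True when the ROM is long enough and the stored checksum matches. */
bool gb_header_valid(const uint8_t *rom, size_t len) {
    if (len <= HDR_CHECKSUM) {
        return false;
    }
    return gb_header_checksum(rom) == rom[HDR_CHECKSUM];
}
```

A build script could run a check like this after linking and return the header-validation exit code when it fails.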

3.5 Data Formats / Schemas / Protocols

StateRecord (pseudocode shape):

  • frame_counter: u16
  • input_mask: u8
  • flags: u8

AssetIndex (pseudocode shape):

  • bank_id: u8
  • offset: u16
  • length: u16

UpdateQueue item:

  • target: VRAM/OAM
  • dest: address
  • size: bytes
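The shapes above map naturally onto C structs. The sketch below is host-side C with assumed type names. It also shows one possible byte layout for StateRecord, little-endian to match the SM83, so a record can be logged and compared across runs. The layout is an illustration, not a required format.

```c
#include <stdint.h>

typedef struct {
    uint16_t frame_counter;
    uint8_t  input_mask;
    uint8_t  flags;
} state_record_t;

typedef struct {
    uint8_t  bank_id;
    uint16_t offset;
    uint16_t length;
} asset_index_t;

typedef enum { TARGET_VRAM, TARGET_OAM } update_target_t;

typedef struct {
    update_target_t target; /* VRAM or OAM */
    uint16_t        dest;   /* destination address */
    uint16_t        size;   /* number of bytes to copy */
} update_item_t;

/* Serializes a StateRecord into 4 bytes, low byte of the counter first. */
void state_record_pack(const state_record_t *s, uint8_t out[4]) {
    out[0] = (uint8_t)(s->frame_counter & 0xFFu);
    out[1] = (uint8_t)(s->frame_counter >> 8);
    out[2] = s->input_mask;
    out[3] = s->flags;
}

/* Inverse of state_record_pack. */
void state_record_unpack(const uint8_t in[4], state_record_t *s) {
    s->frame_counter = (uint16_t)(in[0] | ((uint16_t)in[1] << 8));
    s->input_mask = in[2];
    s->flags = in[3];
}
```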

3.6 Edge Cases

  • VRAM/OAM access outside safe windows
  • Bank switch not restored
  • Input sampled multiple times per frame
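Two of these edge cases can be guarded with very little logic. The sketch below is host-side C with hypothetical helper names. The access rules follow the PPU modes reported in the low two bits of the STAT register (0xFF41): VRAM is inaccessible during mode 3, and OAM is inaccessible during modes 2 and 3. The input helper is meant to be called once per frame so that every subsystem sees the same snapshot.

```c
#include <stdint.h>
#include <stdbool.h>

#define STAT_MODE_MASK 0x03u

enum {
    MODE_HBLANK   = 0,
    MODE_VBLANK   = 1,
    MODE_OAM_SCAN = 2,
    MODE_DRAWING  = 3
};

/* VRAM can be accessed in every mode except drawing (mode 3). */
bool vram_accessible(uint8_t stat) {
    return (stat & STAT_MODE_MASK) != MODE_DRAWING;
}

/* OAM can be accessed only during HBlank and VBlank. */
bool oam_accessible(uint8_t stat) {
    uint8_t mode = stat & STAT_MODE_MASK;
    return mode == MODE_HBLANK || mode == MODE_VBLANK;
}

typedef struct {
    uint8_t held;    /* buttons down this frame */
    uint8_t pressed; /* buttons that went down this frame */
} input_state_t;

/* Call exactly once per frame with the raw button mask. */
void input_update(input_state_t *in, uint8_t raw) {
    in->pressed = (uint8_t)(raw & (uint8_t)~in->held);
    in->held = raw;
}
```

On hardware, a mode check only proves the window was open at the moment of the read, so long copies are still safest inside VBlank.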

3.7 Real World Outcome

A stable, reproducible DMG component that can be verified visually or via emulator logs, with no flicker or corruption.

3.7.1 How to Run (Copy/Paste)

$ rgbasm -o build/main.o src/main.asm
$ rgblink -o build/game.gb build/main.o
$ rgbfix -v -p 0 build/game.gb
$ sameboy build/game.gb

Exit Codes:

  • 0 = success
  • 1 = build failure
  • 2 = header validation failure

3.7.2 Golden Path Demo (Deterministic)

  • Load ROM
  • Observe the expected on-screen state or log output
  • Confirm stability for 30 seconds

3.7.3 Failure Demo (Deterministic)

  • Force a known invalid state (e.g., wrong bank selected)
  • Observe expected failure behavior (visual corruption or emulator warning)

4. Solution Architecture

4.1 High-Level Design

Input/Timer -> Core Logic -> Render/Sound Updates -> Validation

4.2 Key Components

Component | Responsibility | Key Decisions
Input/Timing | Stable cadence | VBlank-driven loop
Data/Assets | Storage & layout | Fixed bank + banked assets
Renderer | Safe updates | VBlank/STAT windows

4.3 Data Structures (No Full Code)

  • State: counters, flags, and last-input snapshot
  • Asset tables: offsets + bank IDs
  • Update queue: list of VRAM/OAM updates

4.4 Algorithm Overview

Key Algorithm: Frame Update

  1. Wait for VBlank
  2. Read input and update state
  3. Apply safe VRAM/OAM updates
  4. Render or log output

Complexity Analysis:

  • Time: O(n) over visible entities or updates
  • Space: O(n) for entity/state tables
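Step 3 of the frame update can be sketched as a queue that is drained under a byte cap, so a busy frame defers work instead of overrunning the safe window. This is host-side C: `mem` stands in for the 64 KiB address space, and the names, queue capacity, and cap are assumptions made for illustration.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define QUEUE_CAP 16u /* hypothetical maximum number of pending updates */

typedef struct {
    uint16_t       dest; /* destination address in the simulated memory */
    const uint8_t *src;  /* source bytes */
    uint8_t        size; /* number of bytes to copy */
} update_t;

typedef struct {
    update_t items[QUEUE_CAP];
    size_t   head;
    size_t   count;
} update_queue_t;

/* Appends an update; returns 0 when the queue is full, 1 otherwise. */
int queue_push(update_queue_t *q, update_t u) {
    if (q->count == QUEUE_CAP) {
        return 0;
    }
    q->items[(q->head + q->count) % QUEUE_CAP] = u;
    q->count++;
    return 1;
}

/* Applies queued updates in order until the next one would exceed
   max_bytes. Remaining updates stay queued for a later frame.
   Returns the number of bytes written. */
size_t queue_flush(update_queue_t *q, uint8_t *mem, size_t max_bytes) {
    size_t written = 0;
    while (q->count > 0) {
        const update_t *u = &q->items[q->head];
        if (written + u->size > max_bytes) {
            break;
        }
        memcpy(mem + u->dest, u->src, u->size);
        written += u->size;
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
    }
    return written;
}
```

In this sketch a single update larger than the cap would never be flushed, so updates need to be split into chunks no larger than `max_bytes`.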

5. Implementation Guide

5.1 Development Environment Setup

Install RGBDS
Install DMG-accurate emulator
Set up project folder

5.2 Project Structure

project-root/
|---- src/
|   |---- main.asm
|   |---- hardware.asm
|   `---- assets.asm
|---- build/
|---- tools/
`---- README.md

5.3 The Core Question You’re Answering

“How do I build a debug overlay and profiler that behaves correctly under DMG hardware constraints?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. Debugging, Profiling, and Frame Budgets
    • What parts are most timing-sensitive?
    • Why does DMG hardware enforce this?
    • Book Reference: “Game Boy Coding Adventure” - relevant chapters

5.5 Questions to Guide Your Design

  1. Timing and Safety
    • Where are the safe update windows?
    • How will you ensure you only write during those windows?
  2. Validation
    • What will you see or log when it works?
    • How will you reproduce the result exactly?

5.6 Thinking Exercise

Draw the Timing Window

Sketch a frame timeline and mark exactly where your updates will occur.

5.7 The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What hardware constraints drive your design?”
  2. “How do you validate correctness on DMG?”
  3. “What makes your updates deterministic?”
  4. “How do you avoid timing glitches?”
  5. “How do you debug errors when you have no OS?”

5.8 Hints in Layers

Hint 1: Start with a stable VBlank loop
Build the simplest loop that waits for VBlank and updates a single state.

Hint 2: Add one subsystem at a time
Layer in input, rendering, or audio only after the base loop is stable.

Hint 3: Validate with emulator tools
Use VRAM/OAM viewers and breakpoints to confirm data correctness.

Hint 4: Stress test timing
Intentionally add workload and watch for corruption or flicker.


5.9 Books That Will Help

Topic | Book | Chapter
DMG fundamentals | “Game Boy Coding Adventure” | Ch. 1-5
Low-level systems | “The Art of Assembly Language” | Ch. 1-6

5.10 Implementation Phases

Phase 1: Foundation (2-4 days)

Goals:

  • Build a bootable ROM and stable loop
  • Create a minimal visible output

Tasks:

  1. Set up toolchain and build pipeline
  2. Display a simple on-screen indicator

Checkpoint: Emulator shows stable output without warnings

Phase 2: Core Functionality (1 week)

Goals:

  • Implement the core system for Debug Overlay & Profiler
  • Add validation logs or overlays

Tasks:

  1. Build the main subsystem
  2. Verify correctness with emulator tools

Checkpoint: System behaves correctly for 30 seconds

Phase 3: Polish & Edge Cases (3-5 days)

Goals:

  • Handle edge cases and timing failures
  • Document limitations and fixes

Tasks:

  1. Add edge case handling
  2. Stress test and refine

Checkpoint: No flicker/corruption under stress test

5.11 Key Implementation Decisions

Decision | Options | Recommendation | Rationale
Update timing | VBlank only / HBlank | VBlank only | Safest for DMG
Data layout | Dense / Aligned | Aligned | Easier debugging

6. Testing Strategy

6.1 Test Categories

Category | Purpose | Examples
Unit Tests | Validate small routines | Input decoding, counters
Integration Tests | Subsystem behavior | Loop + rendering
Edge Case Tests | Timing stress | Max sprites, heavy updates

6.2 Critical Test Cases

  1. Baseline run: ROM boots and shows stable output.
  2. Stress test: Maximum updates without flicker.
  3. Regression test: Repeat run after changes and compare results.

6.3 Test Data

Input sequence: Up, Up, A, Start
Expected: deterministic state changes and stable display
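A replay harness makes "deterministic state changes" checkable: feed the recorded input sequence to the update function and compare the final state against a known-good value. The sketch below is host-side C. The button bit assignments, the state fields, and what each button does are all invented for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical button bit assignments, for this sketch only. */
enum { BTN_A = 0x01, BTN_START = 0x08, BTN_UP = 0x40 };

/* Hypothetical state flags. */
enum { FLAG_CONFIRMED = 0x01, FLAG_PAUSED = 0x02 };

typedef struct {
    uint16_t frame_counter;
    uint8_t  cursor_y;
    uint8_t  flags;
} game_state_t;

/* One frame of game logic; depends only on the state and the input. */
void step(game_state_t *s, uint8_t pressed) {
    s->frame_counter++;
    if (pressed & BTN_UP) {
        s->cursor_y--;
    }
    if (pressed & BTN_A) {
        s->flags |= FLAG_CONFIRMED;
    }
    if (pressed & BTN_START) {
        s->flags ^= FLAG_PAUSED;
    }
}

/* Replays one input per frame from a fixed initial state. */
game_state_t replay(const uint8_t *inputs, size_t n) {
    game_state_t s = {0, 10, 0};
    for (size_t i = 0; i < n; i++) {
        step(&s, inputs[i]);
    }
    return s;
}
```

Because `step` reads nothing but its arguments, replaying Up, Up, A, Start twice yields identical final states, which is the property a regression test relies on.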

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall | Symptom | Solution
Wrong update timing | Flicker or corruption | Move writes to VBlank
Bank not restored | Random crashes | Save/restore bank state
Incorrect register setup | Blank screen | Verify I/O writes
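The save/restore fix for the bank pitfall can be sketched as follows. MBC bank registers cannot be read back, so the usual approach is to mirror the selected bank in a RAM variable. In this host-side C sketch the hardware register is replaced by a plain variable so it runs off-target, and every name is hypothetical.

```c
#include <stdint.h>

/* Stand-in for the write-only MBC ROM bank register (assumption:
   on hardware this would be a write into the 0x2000-0x3FFF range). */
static uint8_t mbc_bank_register;

/* RAM shadow of the currently selected bank. */
static uint8_t current_bank = 1;

/* Selects a bank and keeps the shadow copy in sync. */
void bank_set(uint8_t bank) {
    current_bank = bank;
    mbc_bank_register = bank;
}

/* Returns the bank recorded in the shadow copy. */
uint8_t bank_get(void) {
    return current_bank;
}

/* Returns what the simulated hardware register last received. */
uint8_t bank_hw_value(void) {
    return mbc_bank_register;
}

/* Runs fn with `bank` selected, then restores the previous bank. */
void with_bank(uint8_t bank, void (*fn)(void *ctx), void *ctx) {
    uint8_t saved = bank_get();
    bank_set(bank);
    fn(ctx);
    bank_set(saved);
}

/* Example callback: records which bank was active while it ran. */
void record_active_bank(void *ctx) {
    *(uint8_t *)ctx = bank_get();
}
```

Routing every bank change through one setter is what keeps the shadow copy trustworthy; a direct register write anywhere else would break the restore.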

7.2 Debugging Strategies

  • Use emulator VRAM/OAM viewers: confirm data and timing.
  • Log state changes: compare expected vs actual frames.

7.3 Performance Traps

Overloading a frame with too many updates causes missed safe windows. Cap work per frame.


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add a visual status indicator for success
  • Add a simple on-screen counter
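For an on-screen counter, hexadecimal is a common choice because the SM83 has no divide instruction and hex digits need only shifts and masks. The host-side C sketch below converts a byte into two tile indices. It assumes a font whose 16 glyphs (0-9, A-F) are stored as consecutive tiles starting at `FONT_BASE`, which is a made-up layout.

```c
#include <stdint.h>

/* Hypothetical layout: tiles FONT_BASE..FONT_BASE+15 hold 0-9, A-F. */
#define FONT_BASE 0x10u

/* Writes the tile indices for the high and low hex digits of value. */
void byte_to_hex_tiles(uint8_t value, uint8_t out[2]) {
    out[0] = (uint8_t)(FONT_BASE + (value >> 4));
    out[1] = (uint8_t)(FONT_BASE + (value & 0x0Fu));
}
```

The two indices would then be written to the tile map during a safe window, for example through the same update path used for other VRAM writes.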

8.2 Intermediate Extensions

  • Add a debug toggle for extra metrics
  • Add a second validation scenario

8.3 Advanced Extensions

  • Run the ROM on real hardware via flash cart
  • Add a small automated test harness

9. Real-World Connections

9.1 Industry Applications

  • Embedded firmware: fixed timing loops and I/O constraints
  • Retro toolchains: reproducible builds for constrained devices

9.2 Related Open Source Projects

  • RGBDS: https://rgbds.gbdev.io/ - DMG assembler/linker
  • SameBoy: https://sameboy.github.io/ - Accurate DMG emulator

9.3 Interview Relevance

  • Hardware timing questions
  • Memory map and register-level reasoning

10. Resources

10.1 Essential Reading

  • Game Boy Coding Adventure by Maximilien Dagois - DMG fundamentals
  • The Art of Assembly Language by Randall Hyde - registers and timing

10.2 Video Resources

  • DMG dev walkthroughs (YouTube) - focus on timing and VRAM rules

10.3 Tools & Documentation

  • Pan Docs: https://gbdev.io/pandocs/ - hardware reference
  • RGBDS: https://rgbds.gbdev.io/ - toolchain docs

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain the core hardware constraints behind this project
  • I can describe why the chosen timing model works
  • I can explain one trade-off I made

11.2 Implementation

  • All functional requirements are met
  • All test cases pass in two emulators
  • The output is stable and deterministic

11.3 Growth

  • I can explain this project in an interview
  • I documented what I would do differently next time

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Core functionality works and is visible
  • ROM builds without warnings
  • Behavior is reproducible

Full Completion:

  • All edge cases handled
  • Performance budget respected

Excellence (Going Above & Beyond):

  • Verified on real hardware
  • Includes automated validation steps