Project 9: Real-Time System Monitor

Build a live dashboard that shows CPU usage, temperature, and FPS graphs in real time.

Quick Reference

| Attribute | Value |
|-----------|-------|
| Difficulty | Level 2: Intermediate |
| Time Estimate | 1-2 weeks |
| Main Programming Language | C |
| Alternative Programming Languages | Rust |
| Coolness Level | Level 4: Hardcore |
| Business Potential | 2. The “Diagnostics” Level |
| Prerequisites | Projects 2-3, timers and ADC basics |
| Key Topics | Timers, ADC sampling, real-time graphs |

1. Learning Objectives

By completing this project, you will:

  1. Measure CPU utilization with timer-based sampling.
  2. Read on-chip temperature via ADC and convert to degrees.
  3. Render scrolling graphs efficiently with dirty rectangles.
  4. Maintain a stable update rate (e.g., 10 Hz).
  5. Build a dashboard UI with numeric and graphical data.

2. All Theory Needed (Per-Concept Breakdown)

2.1 Timers, Ticks, and CPU Utilization

Fundamentals

A timer is a hardware counter that increments at a known rate. By sampling timers at regular intervals, you can measure elapsed time and calculate CPU utilization. One common method is idle-time measurement: count how many cycles the CPU spends in an idle loop versus total cycles in a time window. CPU utilization = 1 - (idle_cycles / total_cycles). This approach is simple and works even without an RTOS.

Deep Dive into the concept

To compute CPU utilization, you need a stable time base, such as a hardware timer that increments at a fixed frequency. In each sampling interval (e.g., 100 ms), you measure total cycles and idle cycles. Idle cycles can be counted by incrementing a counter in the idle loop or by timing how long the core sleeps in a low-power wait instruction (such as WFI). Both counts must be in the same units before you divide: raw idle-loop iterations are not timer ticks, so either convert them or compare against the iteration count of a fully idle window. If your system uses DMA or interrupts, you must decide whether those count as CPU work or idle. A common approach is to count any time outside the idle loop as active. You must also ensure that the sampling interval does not drift; trigger it from a timer interrupt or a hardware alarm rather than a software delay.

You must also handle the observer effect: the monitoring code itself consumes CPU. To minimize this, keep sampling routines lightweight and fixed in cost. If you update the display too frequently, it will distort the CPU usage you are measuring. Thus, decouple sampling (e.g., 100 Hz) from display updates (e.g., 10 Hz), and compute averages. This yields smoother graphs and more accurate readings.
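One way to decouple the two rates is to let the fast sampler only accumulate and leave the averaging to the slower display path. The sketch below assumes the 100 Hz / 10 Hz split from the paragraph above; `util_accum_t`, `accum_add`, and `accum_take` are hypothetical names for illustration, not part of any SDK.

```c
#include <stdint.h>

#define SAMPLES_PER_FRAME 10   /* 100 Hz sampling / 10 Hz display (rates from the text) */

typedef struct {
    uint32_t sum;    /* running sum of utilization samples, in percent */
    uint8_t  count;  /* samples accumulated since the last display update */
} util_accum_t;

/* Fast path, called at the sampling rate: one add and one increment,
 * so its cost is small and constant. */
static inline void accum_add(util_accum_t *a, uint8_t util_percent) {
    a->sum += util_percent;
    a->count++;
}

/* Slow path, called at the display rate. Returns 1 and writes the average
 * once a full batch is available; otherwise returns 0 and leaves state alone. */
static inline int accum_take(util_accum_t *a, uint8_t *avg_out) {
    if (a->count < SAMPLES_PER_FRAME) return 0;
    *avg_out = (uint8_t)(a->sum / a->count);
    a->sum = 0;
    a->count = 0;
    return 1;
}
```

If `accum_add` runs in an interrupt handler and `accum_take` in the main loop, the two would need to be protected against each other (for example by briefly disabling the interrupt); that is left out here to keep the sketch short.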

How this fits on projects

Timer-based measurement appears in Section 3.2 and Section 5.10 Phase 2. It is also used in Project 7 (benchmarking) and Project 13 (task scheduler).

Definitions & key terms

  • Tick -> Timer increment unit.
  • Sampling interval -> Fixed time window for measurement.
  • Idle loop -> Code that runs when no tasks are active.
  • Observer effect -> Measurement affecting the system.

Mental model diagram (ASCII)

Time window: [----100 ms----]
CPU cycles: total
Idle cycles: measured in idle loop
Utilization = 1 - idle/total

How it works (step-by-step)

  1. Configure a hardware timer at a fixed rate.
  2. In the idle loop, increment an idle counter.
  3. Every sampling interval, snapshot idle and total cycles.
  4. Compute utilization and reset counters.

Failure modes:

  • Sampling interval drifts -> wrong results.
  • Monitor too heavy -> skewed utilization.

Minimal concrete example

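#include <stdint.h>

volatile uint32_t idle_ticks = 0;   /* idle-loop passes counted in the current window */
uint32_t idle_ticks_max = 1;        /* passes in a fully idle window (the "total", in the same units); measure once at startup */
float cpu_util = 0.0f;              /* 0.0 = fully idle, 1.0 = fully busy */

/* Body of the idle loop: runs only when there is no other work. */
void idle_loop(void) { idle_ticks++; }

/* Call once per sampling interval (e.g., from a 100 ms timer alarm). */
void sample_cpu(void) {
  uint32_t idle = idle_ticks;
  idle_ticks = 0;                                    /* start the next window */
  if (idle > idle_ticks_max) idle = idle_ticks_max;  /* clamp so utilization never goes negative */
  cpu_util = 1.0f - (float)idle / (float)idle_ticks_max;
}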

Common misconceptions

  • “CPU usage must be measured with an RTOS.” -> You can do it bare-metal.
  • “Higher update rate means better accuracy.” -> It can add overhead.

Check-your-understanding questions

  1. Why use an idle loop counter?
  2. How does the observer effect show up?
  3. Why separate sampling and display update rates?

Check-your-understanding answers

  1. It approximates how much time the CPU is idle.
  2. Monitoring code consumes CPU, altering the measurement.
  3. To reduce overhead while still keeping the display responsive.

Real-world applications

  • Embedded system diagnostics dashboards
  • Battery-powered devices measuring load

Where you’ll apply it

  • This project: Section 3.2, Section 5.10 Phase 2
  • Also used in: Project 7, Project 13

References

  • “Making Embedded Systems” Ch. 7
  • RP2350 timer documentation

Key insights

CPU usage metrics are only meaningful with controlled sampling.

Summary

Measure idle time in fixed windows for reliable CPU utilization.

Homework/Exercises to practice the concept

  1. Implement a 100 ms timer interrupt.
  2. Measure idle ticks with and without display updates.
  3. Plot utilization over 10 seconds.

Solutions to the homework/exercises

  1. Use timer alarms to trigger every 100 ms.
  2. Utilization increases when display updates are frequent.
  3. Use a circular buffer for samples.

2.2 ADC Temperature Sensing

Fundamentals

The RP2350 includes an ADC with an input connected to an on-chip temperature sensor. Converting ADC counts to temperature requires the slope and offset constants and the formula given in the datasheet. Because ADC readings are noisy, you should average multiple samples for stable values.

Deep Dive into the concept

The ADC returns a raw count proportional to the input voltage. The internal temperature sensor has a known slope and offset, which are provided in the datasheet. You read the ADC channel, convert the raw count to voltage, then convert voltage to temperature. For example: temp = 27 - (voltage - 0.706)/0.001721, with voltage in volts and temp in degrees Celsius. These constants are typical values and vary from device to device, so you may need calibration. Noise and quantization cause fluctuations; a moving average of 8-16 samples yields a stable reading. Sampling rate also matters: too fast wastes CPU and shows noise; too slow makes the graph laggy. A sampling rate of 1 Hz (one reading per second) is often sufficient for temperature.
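The moving average mentioned above can be kept cheap by maintaining a running sum, so each new reading costs one subtract and one add regardless of window size. This is a minimal sketch, assuming an 8-sample window and a zero-initialized filter; `moving_avg_t` and `moving_avg_push` are hypothetical names.

```c
#include <stdint.h>

#define AVG_WINDOW 8   /* within the 8-16 sample range suggested in the text */

typedef struct {
    uint16_t samples[AVG_WINDOW]; /* most recent raw ADC counts */
    uint32_t sum;                 /* running sum of samples[] */
    uint8_t  next;                /* slot that will be overwritten next */
    uint8_t  filled;              /* valid slots, saturates at AVG_WINDOW */
} moving_avg_t;

/* Push one raw reading and return the average of the readings seen so far,
 * limited to the last AVG_WINDOW. The struct must start zero-initialized. */
static inline uint16_t moving_avg_push(moving_avg_t *f, uint16_t raw) {
    f->sum -= f->samples[f->next];   /* drop the oldest value (0 while filling) */
    f->samples[f->next] = raw;
    f->sum += raw;
    f->next = (uint8_t)((f->next + 1) % AVG_WINDOW);
    if (f->filled < AVG_WINDOW) f->filled++;
    return (uint16_t)(f->sum / f->filled);
}
```

Averaging the raw counts before converting to degrees keeps the filter in integer math; the floating-point conversion then runs once per displayed value instead of once per sample.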

How this fits on projects

Temperature sensing feeds the graphs in Section 3.2 and the dashboard in Section 3.7. It also relates to Project 1 (ADC bring-up) and Project 10 (bare-metal register access).

Definitions & key terms

  • ADC -> Analog-to-digital converter.
  • LSB -> Least significant bit: the smallest step the ADC can resolve (3.3 V / 4096, about 0.8 mV, for a 12-bit reading).
  • Calibration -> Adjusting for sensor offset/slope.
  • Moving average -> Average of last N samples.

Mental model diagram (ASCII)

ADC counts -> voltage -> temperature -> graph

How it works (step-by-step)

  1. Configure ADC channel for temperature sensor.
  2. Read raw ADC counts.
  3. Convert to voltage.
  4. Apply temperature formula.
  5. Smooth with moving average.

Failure modes:

  • Wrong formula -> incorrect temperature.
  • No averaging -> noisy graph.

Minimal concrete example

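/* Assumes the temperature-sensor ADC input is already enabled and selected. */
float voltage = (float)adc_read() * 3.3f / 4096.0f;   /* 12-bit counts -> volts (3.3 V reference) */
float temp = 27.0f - (voltage - 0.706f) / 0.001721f;  /* volts -> degrees C (datasheet formula) */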

Common misconceptions

  • “ADC values are stable.” -> Noise is normal.
  • “Temperature is precise without calibration.” -> It can be off.

Check-your-understanding questions

  1. Why average ADC samples?
  2. What is the role of calibration?
  3. Why not sample temperature at 100 Hz?

Check-your-understanding answers

  1. To reduce noise and jitter.
  2. To correct sensor offset and slope.
  3. Temperature changes slowly; high rate wastes CPU.

Real-world applications

  • Thermal monitoring in embedded devices
  • Adaptive clock scaling based on temperature

Where you’ll apply it

  • This project: Section 3.2, Section 5.10 Phase 1
  • Also used in: Project 10

References

  • RP2350 ADC datasheet section

Key insights

Temperature sensing is simple but must be filtered to be useful.

Summary

ADC + smoothing yields stable temperature data for dashboards.

Homework/Exercises to practice the concept

  1. Read ADC values 100 times and compute variance.
  2. Implement a moving average filter.
  3. Compare results with and without calibration.

Solutions to the homework/exercises

  1. Expect visible variance due to noise.
  2. Use a circular buffer of samples.
  3. Calibration shifts the baseline closer to true values.
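For exercise 3, a one-point calibration is probably the simplest place to start: compare one reading against a trusted thermometer and store the difference as an offset. The sketch below reuses the formula from Section 2.2; the helper names and the single-offset approach are assumptions for illustration, and a one-point offset corrects the baseline only, not the slope.

```c
#include <stdint.h>

#define ADC_VREF        3.3f      /* reference voltage used in the text's example */
#define ADC_FULL_SCALE  4096.0f   /* 12-bit converter */

/* Formula from Section 2.2 plus a per-device offset in degrees C. */
static inline float adc_counts_to_celsius(uint16_t raw, float offset_c) {
    float voltage = (float)raw * ADC_VREF / ADC_FULL_SCALE;
    return 27.0f - (voltage - 0.706f) / 0.001721f + offset_c;
}

/* One-point calibration: the offset that makes `raw` read as `reference_c`. */
static inline float calibration_offset(uint16_t raw, float reference_c) {
    return reference_c - adc_counts_to_celsius(raw, 0.0f);
}
```

With these constants, one ADC count corresponds to roughly 0.47 degrees, which is why an unfiltered reading appears to jump in half-degree steps.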

3. Project Specification

3.1 What You Will Build

A real-time system monitor UI that displays CPU usage, temperature, and FPS. It includes scrolling graphs updated at 10 Hz and numeric readouts.
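This guide does not prescribe how FPS is obtained. One common way is to count presented frames and divide by the elapsed time once per second, as in the sketch below. The struct, the function name, and the one-second window are assumptions; the timestamp is passed in so the logic does not depend on a particular timer API.

```c
#include <stdint.h>

#define FPS_WINDOW_US 1000000u   /* report once per second (an assumption) */

typedef struct {
    uint32_t window_start_us;  /* timestamp when the current window began */
    uint16_t frames;           /* frames presented in the current window */
    uint8_t  fps;              /* result from the last completed window */
} fps_counter_t;

/* Call once per presented frame with the current time in microseconds. */
static inline void fps_frame(fps_counter_t *c, uint32_t now_us) {
    c->frames++;
    uint32_t elapsed = now_us - c->window_start_us;  /* unsigned math survives wraparound */
    if (elapsed >= FPS_WINDOW_US) {
        uint64_t fps = ((uint64_t)c->frames * 1000000u) / elapsed;
        c->fps = (uint8_t)(fps > 255 ? 255 : fps);   /* clamp to an 8-bit readout */
        c->frames = 0;
        c->window_start_us = now_us;
    }
}
```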

3.2 Functional Requirements

  1. Timer sampling for CPU usage.
  2. ADC reads for temperature.
  3. Graph rendering with scrolling buffer.
  4. Stable update loop at 10 Hz.
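For requirement 4, a loop that sleeps for a fixed delay after each update will drift, because the time spent doing the update is added to every period. A sketch of the usual alternative is to advance an absolute deadline by exactly one period each time. `ticker_t` and its functions are hypothetical names; the microsecond timestamp could come from a free-running counter such as the Pico SDK's `time_us_32()`.

```c
#include <stdint.h>

#define UPDATE_PERIOD_US 100000u   /* 10 Hz */

typedef struct {
    uint32_t next_deadline_us;     /* absolute time of the next update */
} ticker_t;

static inline void ticker_start(ticker_t *t, uint32_t now_us) {
    t->next_deadline_us = now_us + UPDATE_PERIOD_US;
}

/* Returns 1 when an update is due. The deadline advances by exactly one
 * period, so polling late does not push later updates back. */
static inline int ticker_due(ticker_t *t, uint32_t now_us) {
    if ((int32_t)(now_us - t->next_deadline_us) < 0) return 0;  /* wrap-safe compare */
    t->next_deadline_us += UPDATE_PERIOD_US;
    return 1;
}
```

The main loop would poll `ticker_due` and run the display update only when it returns 1. If an update ever takes longer than a full period, this sketch catches up by firing on consecutive polls; dropping the missed updates instead is an equally reasonable policy.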

3.3 Non-Functional Requirements

  • Performance: Monitor updates without dropping frames.
  • Reliability: Values remain stable and meaningful.
  • Usability: Clear labels and units.

3.4 Example Usage / Output

CPU: 42%  Temp: 33C  FPS: 58
Graph: CPU usage scrolling left

3.5 Data Formats / Schemas / Protocols

  • Circular buffer for samples (size 100)
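A minimal sketch of such a buffer follows, assuming one `uint8_t` value per sample for brevity (the same pattern applies to a struct element). `ring_t`, `ring_push`, and `ring_get` are hypothetical names. Reading by age (0 = newest) matches how a scrolling graph is drawn from the right edge toward the left.

```c
#include <stdint.h>

#define SAMPLE_CAPACITY 100   /* buffer size from the spec above */

typedef struct {
    uint8_t  data[SAMPLE_CAPACITY];
    uint16_t head;    /* index where the next sample will be written */
    uint16_t count;   /* valid samples, saturates at SAMPLE_CAPACITY */
} ring_t;

/* Append a sample, overwriting the oldest one once the buffer is full. */
static inline void ring_push(ring_t *r, uint8_t v) {
    r->data[r->head] = v;
    r->head = (uint16_t)((r->head + 1) % SAMPLE_CAPACITY);
    if (r->count < SAMPLE_CAPACITY) r->count++;
}

/* Read a sample by age: 0 is the newest. Caller must ensure age < r->count. */
static inline uint8_t ring_get(const ring_t *r, uint16_t age) {
    uint16_t idx = (uint16_t)((r->head + SAMPLE_CAPACITY - 1 - age) % SAMPLE_CAPACITY);
    return r->data[idx];
}
```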

3.6 Edge Cases

  • CPU usage pinned at 0% or 100%
  • ADC read failure
  • Graph overflow
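The first edge case is easier to reason about if the percentage calculation is defensive about its inputs. The sketch below is one possible helper, not a required design: it treats an empty window as 0% and clamps an overcounted idle value rather than letting the subtraction wrap. The name `util_percent` is hypothetical.

```c
#include <stdint.h>

/* Convert idle/total counts (same units) to a 0..100 utilization percentage. */
static inline uint8_t util_percent(uint32_t idle, uint32_t total) {
    if (total == 0) return 0;       /* no window measured yet: avoid divide by zero */
    if (idle >= total) return 0;    /* fully idle, or idle overcounted */
    /* 64-bit intermediate so idle * 100 cannot overflow */
    return (uint8_t)(100u - (uint32_t)(((uint64_t)idle * 100u) / total));
}
```

A reading that stays at exactly 0 or 100 for many consecutive windows is then a hint that a counter is not being reset or not being incremented, rather than an arithmetic artifact.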

3.7 Real World Outcome

The LCD shows three graphs updating smoothly with numeric values. CPU usage responds to load changes (e.g., enabling a render loop increases CPU%).

3.7.1 How to Run (Copy/Paste)

cd LEARN_RP2350_LCD_DEEP_DIVE/system_monitor
mkdir -p build
cd build
cmake ..
make -j4
cp system_monitor.uf2 /Volumes/RP2350

3.7.2 Golden Path Demo (Deterministic)

  • Idle CPU shows <10% utilization.
  • Start a render loop, CPU jumps to ~60%.
  • Temperature slowly increases by 1-2C.

3.7.3 Failure Demo (Deterministic)

  • Disable idle counter reset.
  • CPU usage becomes stuck at 0% or 100%.
  • Fix: reset counters each interval.

4. Solution Architecture

4.1 High-Level Design

[Timer Sampler] -> [Metrics] -> [Graph Renderer] -> LCD
[ADC Reader] ----^                   |

4.2 Key Components

| Component | Responsibility | Key Decisions |
|-----------|----------------|---------------|
| Sampler | Collect CPU/Temp | 10 Hz sampling |
| Graph buffer | Store samples | Circular buffer |
| Renderer | Draw graphs | Dirty rectangles |

4.3 Data Structures (No Full Code)

typedef struct { uint8_t cpu; int16_t temp; uint8_t fps; } sample_t;

4.4 Algorithm Overview

Key Algorithm: Scrolling Graph

  1. Shift graph left by 1 pixel.
  2. Draw new sample at right edge.
  3. Update only graph region.

Complexity Analysis:

  • Time: O(graph width) per update
  • Space: O(sample buffer)
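The three steps can be sketched against a small in-memory graph region. Everything here is an assumption for illustration: the one-byte-per-pixel buffer, the 100 x 40 size, and the function names. A real LCD driver would use its own pixel format and then send only this region to the panel.

```c
#include <stdint.h>
#include <string.h>

#define GRAPH_W 100   /* one column per sample */
#define GRAPH_H 40

/* Hypothetical graph region: 1 = lit pixel, 0 = background. */
static uint8_t graph_px[GRAPH_H][GRAPH_W];

/* Map 0..100 percent to a row; row 0 is the top of the graph. */
static inline int graph_row_for(uint8_t percent) {
    if (percent > 100) percent = 100;
    return (GRAPH_H - 1) - (percent * (GRAPH_H - 1)) / 100;
}

/* Steps 1 and 2: shift every row left by one pixel, then plot the new
 * sample in the rightmost column. */
static inline void graph_scroll_push(uint8_t percent) {
    int y = graph_row_for(percent);
    for (int row = 0; row < GRAPH_H; row++) {
        memmove(&graph_px[row][0], &graph_px[row][1], GRAPH_W - 1);
        graph_px[row][GRAPH_W - 1] = (uint8_t)(row == y);
    }
}
```

Note that this version moves width x height bytes per update because it shifts actual pixels. Redrawing the line from the sample buffer, or using a display controller's hardware scroll if one is available, keeps the per-update work closer to one operation per column.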

5. Implementation Guide

5.1 Development Environment Setup

# Use pico-sdk timer + ADC examples

5.2 Project Structure

system_monitor/
- src/
  - metrics.c
  - graphs.c
  - main.c

5.3 The Core Question You’re Answering

“How do I measure and visualize system health in real time?”

5.4 Concepts You Must Understand First

  1. Timer sampling
  2. ADC conversion
  3. Graph rendering with dirty rectangles

5.5 Questions to Guide Your Design

  1. What sampling rate balances accuracy and overhead?
  2. How will you smooth noisy values?
  3. How will you handle graph scaling?

5.6 Thinking Exercise

Calculate how many pixels you need to move on each update to scroll a 100-sample graph by one sample, given the graph width and height you have chosen.

5.7 The Interview Questions They’ll Ask

  1. How do you measure CPU usage on bare metal?
  2. What is the trade-off between sampling rate and overhead?
  3. How do you display graphs efficiently?

5.8 Hints in Layers

  • Hint 1: Start with numeric values before graphs.
  • Hint 2: Add a circular buffer for samples.
  • Hint 3: Use dirty rectangles for updates.

5.9 Books That Will Help

| Topic | Book | Chapter |
|-------|------|---------|
| Timers | “Making Embedded Systems” | Ch. 7 |

5.10 Implementation Phases

Phase 1: Sampling (3-4 days)

  • Goals: Measure CPU and temp.
  • Tasks: Timer and ADC integration.
  • Checkpoint: Stable numeric values.

Phase 2: Graphs (4-5 days)

  • Goals: Render scrolling graphs.
  • Tasks: Graph buffer and draw routine.
  • Checkpoint: Smooth graph updates.

Phase 3: Optimization (2-3 days)

  • Goals: Reduce overhead.
  • Tasks: Dirty rectangles and batching.
  • Checkpoint: Monitor doesn’t distort CPU usage.

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|----------|---------|----------------|-----------|
| Sampling rate | 1 Hz vs 10 Hz | 10 Hz | Responsive but manageable |
| Graph update | Full vs partial | Partial | Lower overhead |


6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|----------|---------|----------|
| Unit Tests | Math checks | utilization formula |
| Integration Tests | ADC reads | temperature stability |
| Performance Tests | Overhead | monitor impact |

6.2 Critical Test Cases

  1. Idle test: CPU usage below 10%.
  2. Load test: CPU usage above 60%.
  3. Temperature smoothing: stable within +/-0.5C.

6.3 Test Data

Sample buffer size: 100

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---------|---------|----------|
| Counter not reset | CPU stuck at 100% | Reset per interval |
| No averaging | Temp jitter | Moving average |
| Full redraw | FPS drop | Dirty rectangles |

7.2 Debugging Strategies

  • Log raw ADC values.
  • Toggle GPIO in sampler to measure timing.

7.3 Performance Traps

  • Graph rendering too frequently; limit to 10 Hz.

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add memory usage display.

8.2 Intermediate Extensions

  • Log samples to SD card.

8.3 Advanced Extensions

  • Add Wi-Fi telemetry (if external module available).

9. Real-World Connections

9.1 Industry Applications

  • Embedded device diagnostics
  • Thermal monitoring dashboards
  • LVGL monitor examples

9.3 Interview Relevance

  • Timer sampling and ADC usage are common interview topics.

10. Resources

10.1 Essential Reading

  • RP2350 datasheet (timers, ADC)

10.2 Video Resources

  • Embedded monitoring UI tutorials

10.3 Tools & Documentation

  • Logic analyzer for timing validation

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain idle-time CPU measurement.
  • I can convert ADC counts to temperature.

11.2 Implementation

  • Graphs update smoothly.
  • CPU usage responds to load changes.

11.3 Growth

  • I can explain system monitoring in an interview.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Numeric CPU and temperature values displayed.

Full Completion:

  • Graphs update at 10 Hz with minimal overhead.

Excellence (Going Above & Beyond):

  • Logging and alert thresholds with on-screen indicators.