Project 7: Interrupt Latency Profiler

A measurement harness that profiles interrupt latency under different system loads and priorities.

Quick Reference

Attribute	Value
Difficulty	Level 3: Advanced
Time Estimate	1-2 weeks
Main Programming Language	C
Alternative Programming Languages	C++, Rust, Ada
Coolness Level	Level 4: Hardcore Tech Flex
Business Potential	1. The “Resume Gold”
Prerequisites	Timers, GPIO toggling, NVIC basics
Key Topics	NVIC priority, latency, profiling, jitter

1. Learning Objectives

By completing this project, you will:

Measure ISR latency with GPIO and/or cycle counters.
Quantify worst-case latency under load.
Understand how priorities affect preemption.
Produce a latency report with min/avg/max values.

2. All Theory Needed (Per-Concept Breakdown)

NVIC, Interrupt Priority, and Latency Measurement

Fundamentals

Interrupts allow the MCU to respond to events without polling. The NVIC assigns priorities and decides which interrupt runs next. Latency is the time between an interrupt event and when its handler actually executes. Measuring latency is critical for real-time systems because it defines how quickly you can respond to sensors, timers, or communication events.

Deep Dive into the concept

On Cortex-M, when an interrupt event occurs, the NVIC determines whether it can preempt the current execution. If the new interrupt has higher priority, the core pushes registers onto the stack and jumps to the handler. This stacking and unstacking takes a predictable number of cycles, but additional latency can be introduced if interrupts are disabled or if higher-priority ISRs are running. The STM32F3 implements 4 priority bits, giving 16 configurable priority levels; you must define a priority scheme that matches your system’s timing requirements.

Latency measurement is done by toggling a GPIO at the first instruction of an ISR and measuring the delay from the event source. For timer interrupts, you can compare the timer output event to the GPIO toggle. For external interrupts, you can inject a known edge and measure the response. Another tool is the DWT cycle counter, which can count CPU cycles between event and handler execution.

In many systems, average latency is less important than worst-case latency, because worst-case defines deadline misses. Therefore, a good profiler will stress the system by enabling other interrupts, running CPU-heavy tasks, and then measuring the maximum latency observed. Your results guide design decisions: if latency is too high, you might raise priority, reduce ISR work, or move tasks to DMA. This project teaches you that ‘real-time’ is a measured property, not a claim.

How this fits in the project

In Interrupt Latency Profiler, you build a latency profiler that measures interrupt response under load and produces a report of min/avg/max latency.

Definitions & key terms

NVIC -> Nested Vectored Interrupt Controller that prioritizes and dispatches interrupts.
Latency -> Time between an event and ISR execution.
Preemption -> Higher-priority interrupt interrupting lower-priority code.
DWT -> Data Watchpoint and Trace unit, includes cycle counter.
Critical section -> Code where interrupts are temporarily disabled.

Mental model diagram (ASCII)

Event -> NVIC arbitration -> stacking -> ISR entry -> GPIO toggle

How it works (step-by-step, with invariants and failure modes)

Configure a periodic timer interrupt as the event source.
Toggle a GPIO at the first line of the ISR.
Measure delay between timer event and GPIO edge using scope/logic analyzer.
Stress the system with other interrupts and compute max latency.
Invariant: ISR entry time remains bounded; failure mode: unbounded latency due to long critical sections.

Minimal concrete example

void TIM3_IRQHandler(void) {
    GPIOA->ODR ^= (1u << 5); // toggle first so the edge marks ISR entry
    TIM3->SR = ~TIM_SR_UIF;  // clear only UIF (SR flags are write-0-to-clear)
}

Common misconceptions

“Average latency is enough” ignores worst-case deadlines.
“Interrupts are always immediate” ignores priority and masking.
“ISRs can do heavy work” ignores that long ISRs increase latency for others.

Check-your-understanding questions

What factors increase worst-case interrupt latency?
How do priorities affect preemption?
Why is GPIO toggling a good measurement method?

Check-your-understanding answers

Long critical sections, higher-priority ISRs, and flash wait states increase latency.
Higher-priority interrupts can preempt lower-priority handlers at any time.
It produces a physical edge that can be measured precisely with external tools.
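
The preemption rule from answer 2 can be modeled as a pure function. This is a host-side sketch, not CMSIS code: `can_preempt` and `sub_bits` are hypothetical names, and it assumes the Cortex-M convention that a numerically lower priority value is more urgent, with the low `sub_bits` bits acting as subpriority that orders pending interrupts but never causes preemption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the preemption decision.
 * pending_prio / active_prio: logical priority numbers (0 = most urgent).
 * sub_bits: number of low bits treated as subpriority (0 = none). */
static bool can_preempt(uint8_t pending_prio, uint8_t active_prio,
                        uint8_t sub_bits)
{
    assert(sub_bits <= 8);
    /* Only the group (upper) part of the priority decides preemption. */
    uint8_t pending_group = (uint8_t)(pending_prio >> sub_bits);
    uint8_t active_group  = (uint8_t)(active_prio >> sub_bits);
    return pending_group < active_group;
}
```

With no subpriority bits, priority 1 preempts priority 3 but not the other way around, and equal priorities never preempt each other.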

Real-world applications

Safety-critical control loops.
Real-time motor control and power electronics.
High-speed data acquisition systems.

Where you’ll apply it

In this project: see Section 6.2 Critical Test Cases and Section 7.2 Debugging Strategies.
Also used in: P03 Timer-Driven LED Sequencer.

References

ARM Cortex-M4 Technical Reference Manual (exception entry/exit).
Joseph Yiu, ‘The Definitive Guide to ARM Cortex-M3/M4’ (NVIC).
ST application notes on interrupt latency measurement.

Key insights

Real-time behavior is defined by worst-case latency, not average latency.

Summary

Interrupt profiling makes latency visible and measurable. By understanding NVIC priorities and measuring response time, you can engineer systems that meet real-time deadlines.

Homework/Exercises to practice the concept

Measure ISR latency with and without other interrupts enabled.
Configure two interrupts with different priorities and observe preemption.
Use the DWT cycle counter to compute cycles between event and ISR entry.

Solutions to the homework/exercises

Latency increases when other interrupts run; log max latency as your design bound.
The higher-priority ISR preempts immediately; the lower-priority ISR waits until it returns.
DWT->CYCCNT gives cycle-accurate timing; subtract to compute latency.
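
The subtraction in solution 3 can be sketched as follows. This is a host-side illustration that takes two counter snapshots as plain integers rather than reading `DWT->CYCCNT` directly; `cycles_elapsed` and `cycles_to_ns` are hypothetical helper names, and the core clock frequency is assumed to be known.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed cycles between two 32-bit cycle-counter snapshots.
 * Unsigned subtraction is modulo 2^32, so the result stays correct
 * across a single counter wrap. */
static uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert a cycle count to nanoseconds (truncated) for a given core clock. */
static uint64_t cycles_to_ns(uint32_t cycles, uint32_t core_hz)
{
    assert(core_hz != 0);
    return ((uint64_t)cycles * 1000000000ULL) / core_hz;
}
```

At 72 MHz, 72 cycles correspond to 1000 ns. The helper only tolerates one wrap between snapshots, so the measured interval must stay shorter than 2^32 cycles.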

Timers, Prescalers, and Interrupt Scheduling

Fundamentals

Hardware timers are the MCU’s timekeeping engines. They count ticks derived from a clock source and can generate periodic events without CPU busy-waiting. By configuring a prescaler and auto-reload value, you set a precise interval. Timers can trigger interrupts, toggle pins, or drive DMA. Interrupts allow your firmware to react to timer events with bounded latency, which is the basis of real-time scheduling. Using timers and interrupts instead of delay loops is the core technique for reliable embedded timing.

Deep Dive into the concept

STM32 timers are flexible peripherals that can operate as basic timebases, output compare generators, PWM engines, or input capture units. At their core is a counter clocked by a timer clock derived from the APB bus. The prescaler divides the timer clock, and the auto-reload register sets the period at which the counter resets. Each update event can trigger an interrupt, set a status flag, or generate a DMA request. The advantage over software delays is determinism: the hardware counts regardless of CPU load.

However, interrupts introduce latency. The NVIC prioritizes interrupts, and higher-priority ISRs can delay timer handlers. If you rely on timer interrupts for scheduling, you must measure latency and account for jitter. A timer-driven LED sequencer is an ideal training ground: you build a state machine that advances on each timer tick, ensuring that your pattern timing is stable even if the CPU is busy. Advanced timers also support one-pulse mode, dead-time insertion, and complementary outputs, which are critical for motor control but beyond the basics here.

For a correct timer configuration, you compute the prescaler as (timer_clock / tick_rate) - 1, and the auto-reload as (tick_rate * desired_interval) - 1, where tick_rate is the counter frequency in Hz and desired_interval is in seconds. You then verify by toggling a GPIO on each interrupt and measuring the period. Misconfigurations are easy to spot: a factor-of-two error usually means you forgot the APB timer clock doubling rule. Another common failure is forgetting to clear interrupt flags, which results in repeated or missed interrupts. In short, timers are your deterministic clockwork. Mastering them means you can build schedules, measure drift, and guarantee timing, which is the heart of embedded systems.

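How this fits in the project

In Interrupt Latency Profiler, a hardware timer is the periodic event source whose interrupt you profile. Its prescaler and auto-reload values set the event rate, so the timer configuration must be verified before any latency number can be trusted.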

Definitions & key terms

Prescaler -> Divider that slows the timer clock.
Auto-reload -> Value that sets the timer period.
Update event -> Timer overflow event that can trigger interrupt or DMA.
Interrupt latency -> Time between event and ISR execution.
Jitter -> Variation in interrupt timing due to system load.

Mental model diagram (ASCII)

Timer clock -> prescaler -> counter -> overflow -> ISR -> state machine step

How it works (step-by-step, with invariants and failure modes)

Select a timer and enable its clock.
Compute prescaler and auto-reload for desired tick rate.
Enable update interrupt and configure NVIC priority.
In ISR, update the state machine and clear interrupt flags.
Invariant: timer tick period remains stable; failure mode: drift or missed interrupts if flags are not cleared.
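
Step 2 above can be sketched as a small host-side helper. This is a minimal sketch, not a vendor API: `timer_cfg_t` and `timer_compute` are hypothetical names, and it assumes a 16-bit prescaler register and leaves it to the caller to check the auto-reload value against the counter width of the chosen timer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t psc;  /* value for the prescaler register */
    uint32_t arr;  /* value for the auto-reload register */
} timer_cfg_t;

/* Compute PSC and ARR so the counter ticks at tick_hz and overflows every
 * period_ticks ticks. Returns false if the clock does not divide evenly
 * or the divider does not fit a 16-bit prescaler (assumed width). */
static bool timer_compute(uint32_t timer_clk_hz, uint32_t tick_hz,
                          uint32_t period_ticks, timer_cfg_t *out)
{
    assert(out != NULL);
    if (tick_hz == 0 || period_ticks == 0) return false;
    if (timer_clk_hz % tick_hz != 0) return false;

    uint32_t div = timer_clk_hz / tick_hz;
    if (div == 0 || div > 65536u) return false;

    out->psc = (uint16_t)(div - 1u);   /* hardware divides by PSC + 1 */
    out->arr = period_ticks - 1u;      /* counter runs 0..ARR inclusive */
    return true;
}
```

For a 72 MHz timer clock, a 10 kHz tick with 2500 ticks per period yields PSC = 7199 and ARR = 2499, which is a 250 ms period. A 1 kHz tick is rejected because the divider 72000 exceeds 16 bits.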

Minimal concrete example

// Timer ISR advances the LED pattern by one step
void TIM2_IRQHandler(void) {
    if (TIM2->SR & TIM_SR_UIF) {
        TIM2->SR = ~TIM_SR_UIF; // clear only UIF (SR flags are write-0-to-clear)
        step_led_pattern();
    }
}

Common misconceptions

“Delays are fine for scheduling” ignores CPU load and jitter.
“Timer interrupts are always on time” ignores NVIC priority and blocking ISRs.
“Auto-reload equals milliseconds” ignores prescaler and timer clock source.

Check-your-understanding questions

What causes a factor-of-two error in timer frequency?
How do you measure jitter in a timer-driven system?
Why must you clear the update flag in the ISR?

Check-your-understanding answers

Forgetting that the timer clock runs at twice the APB clock whenever the APB prescaler is not 1.
Toggle a GPIO in the ISR and measure period variation with a logic analyzer.
If not cleared, the flag stays pending and the ISR re-enters immediately after returning, starving the main loop and lower-priority interrupts.
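
The doubling rule behind answer 1 can be written down as a one-line helper. This is a sketch of the commonly documented STM32 rule, stated here as an assumption to verify against the clock tree in your reference manual; `timer_clock_hz` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Timer input clock for a timer on an APB bus (assumed rule):
 * equal to PCLK when the APB prescaler is 1, otherwise 2 x PCLK. */
static uint32_t timer_clock_hz(uint32_t hclk_hz, uint32_t apb_div)
{
    assert(apb_div != 0);
    uint32_t pclk = hclk_hz / apb_div;
    return (apb_div == 1u) ? pclk : 2u * pclk;
}
```

With HCLK at 72 MHz and an APB prescaler of 2, PCLK is 36 MHz but the timer still sees 72 MHz, which is the usual source of a factor-of-two surprise.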

Real-world applications

LED sequencing and deterministic UI timing.
Sensor sampling schedules with fixed intervals.
Control loops in motors or power electronics.

Where you’ll apply it

In this project: see Section 4.4 Algorithm Overview and Section 5.10 Implementation Phases.
Also used in: P04 PWM Motor or LED Brightness Controller, P05 ADC Sensor Sampler + Logging.

References

STM32F3 Reference Manual (timer chapter).
Joseph Yiu, ‘The Definitive Guide to ARM Cortex-M3/M4’ (interrupts).
Elecia White, ‘Making Embedded Systems’ (timing and scheduling).

Key insights

Timers create deterministic time; interrupts make that time actionable without blocking the CPU.

Summary

Timers and interrupts are the foundation of real-time embedded behavior. They replace blocking delays with precise schedules and make timing measurable.

Homework/Exercises to practice the concept

Calculate prescaler and auto-reload for a 250 ms timer tick at 72 MHz.
Toggle a GPIO on each timer interrupt and measure drift over 5 minutes.
Experiment with different NVIC priorities and observe latency.

Solutions to the homework/exercises

With a 72 MHz timer clock, prescaler 7199 gives a 10 kHz tick (71999 for a 1 kHz tick does not fit the 16-bit prescaler register); auto-reload 2499 then gives a 250 ms interval.
Drift near zero indicates correct timer configuration; large drift indicates clock misconfig.
Raising priority reduces latency but can starve lower-priority handlers.

Timebase, SysTick, and Measurement Discipline

Fundamentals

A reliable timebase is how you reason about time in firmware. SysTick is a dedicated timer tied to the core clock that can generate periodic interrupts. If you configure it for 1 kHz, you get a millisecond tick that becomes your system heartbeat. This tick drives delays, scheduling, timeouts, and timestamps. But SysTick is only reliable if you know the core clock frequency and if you measure the actual timing. Without measurement, a ‘1 ms tick’ is just a guess. A bring-up project must therefore treat SysTick as a calibration point and validate it with real-world observation.

Deep Dive into the concept

SysTick is a 24-bit down-counter integrated into the Cortex-M core. It takes the core clock (or core clock divided by 8) and counts down from a reload value to zero, then sets a flag and optionally fires an interrupt. Because it is in the core, it is unaffected by peripheral bus prescalers, which makes it a good reference for measuring the system clock. To configure SysTick, you load the reload register with (core_clock / desired_tick) - 1, select the clock source, enable the counter, and enable its interrupt. The interrupt handler typically increments a global tick counter. On STM32, HAL_Delay and other drivers are often built on this tick.

The main danger is implicit coupling: if SysTick is configured incorrectly, every delay and timeout in your system is wrong. Another hazard is jitter. SysTick interrupts can be delayed by higher-priority interrupts, which means the tick is not a precise real-time clock, but a best-effort scheduler. For measurement, you should use SysTick only as a reference and then validate it with a GPIO toggle or timer output. A robust timebase audit logs the computed reload value, the configured clock source, and the observed period. The audit should also include a drift test: toggle a pin every N ticks for several minutes and measure drift relative to a stopwatch or logic analyzer. If drift is observed, the root cause is often clock source accuracy (HSI vs HSE) or an incorrect reload value due to a wrong core clock assumption.

You also need to consider wraparound. A 24-bit counter at 72 MHz wraps after about 233 ms if used without an interrupt, and a 32-bit tick counter will wrap after ~49 days at 1 kHz. In embedded systems, you typically handle wraparound by using unsigned arithmetic and comparing time differences rather than absolute values. Finally, you must decide what ‘good enough’ measurement means. For LED blinking, 1-2% accuracy is acceptable. For UART baud or ADC sampling, you need tighter tolerances. The discipline is to tie every time-dependent feature to a measured reference rather than an assumption.

How this fits in the project

In Interrupt Latency Profiler, SysTick is your baseline for clock validation. You compute the reload value, log it, and then validate the resulting tick with physical measurement.

Definitions & key terms

SysTick -> Core-integrated timer used for periodic interrupts and timekeeping.
Reload value -> The count value loaded into SysTick before it starts counting down.
Tick -> A periodic time event, commonly 1 ms in embedded systems.
Jitter -> Variation in the timing of periodic events due to interrupt latency.
Drift -> Long-term timing error relative to a reference clock.

Mental model diagram (ASCII)

Core clock -> SysTick down-counter -> interrupt -> tick++
                  |
                  v
             timing reference

How it works (step-by-step, with invariants and failure modes)

Compute reload value based on expected core clock and desired tick rate.
Configure SysTick to use core clock and enable interrupt.
Increment a global tick counter in the SysTick handler.
Toggle a GPIO every N ticks and measure the period.
Invariant: tick increments at configured frequency; failure mode: drift or jitter due to wrong clock or interrupt priority.
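
Step 1 above can be sketched as a checked computation. This is a host-side illustration under the assumption that SysTick is clocked directly from the core clock; `systick_reload` and `SYSTICK_MAX_RELOAD` are hypothetical names, and the range check reflects the 24-bit width of the counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SYSTICK_MAX_RELOAD 0x00FFFFFFu  /* 24-bit down-counter */

/* Compute the reload value for a desired tick rate.
 * Returns false if the rate is zero or the value does not fit 24 bits. */
static bool systick_reload(uint32_t core_hz, uint32_t tick_hz,
                           uint32_t *reload)
{
    assert(reload != NULL);
    if (tick_hz == 0) return false;

    uint32_t ticks = core_hz / tick_hz;
    if (ticks == 0 || ticks - 1u > SYSTICK_MAX_RELOAD) return false;

    *reload = ticks - 1u;  /* counter counts reload..0, i.e. reload + 1 cycles */
    return true;
}
```

At 72 MHz, a 1 kHz tick gives 71999 and a 2 kHz tick gives 35999, while a 1 Hz tick is rejected because 71,999,999 does not fit in 24 bits.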

Minimal concrete example

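static volatile uint32_t g_tick_ms; // written by the ISR, read by delay_ms

void SysTick_Handler(void) {
    g_tick_ms++;
}

void delay_ms(uint32_t ms) {
    uint32_t start = g_tick_ms;
    // Unsigned subtraction stays correct across tick-counter wraparound.
    while ((uint32_t)(g_tick_ms - start) < ms) {
        __WFI(); // sleep until the next interrupt
    }
}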

Common misconceptions

“SysTick gives precise real-time scheduling” ignores interrupt latency and jitter.
“If LED blinks, tick is correct” ignores the need to measure drift over time.
“Tick counters never overflow” ignores wraparound in long-running systems.

Check-your-understanding questions

Why might SysTick interrupts be delayed even if configured correctly?
How do you compute reload for 1 kHz at 72 MHz?
What is the safest way to compare timeouts with wraparound?

Check-your-understanding answers

Higher-priority interrupts can preempt SysTick, causing jitter.
Reload = (72,000,000 / 1,000) - 1 = 71,999.
Use unsigned subtraction: if (now - start) >= timeout.
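
The comparison in answer 3 can be checked on a host with values that straddle the 32-bit wrap. This is a minimal sketch; `timeout_expired` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True once at least `timeout` ticks have passed since `start`.
 * The difference is computed modulo 2^32, so a wrap between `start`
 * and `now` does not break the comparison. */
static bool timeout_expired(uint32_t now, uint32_t start, uint32_t timeout)
{
    return (uint32_t)(now - start) >= timeout;
}
```

Starting at 0xFFFFFFFB with a 10-tick timeout, the helper reports expiry at tick 5 after the wrap, whereas a naive `now >= start + timeout` on absolute values would depend on where the wrap falls.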

Real-world applications

Scheduling periodic sensor sampling.
Timeouts in communication protocols.
Measuring CPU load and real-time performance.

Where you’ll apply it

In this project: see Section 3.4 Example Usage and Section 6.2 Critical Test Cases.
Also used in: P03 Timer-Driven LED Sequencer.

References

ARM Cortex-M4 Technical Reference Manual (SysTick).
Joseph Yiu, ‘The Definitive Guide to ARM Cortex-M3/M4’ (timers and exceptions).
Elecia White, ‘Making Embedded Systems’ (timing discipline).

Key insights

A tick is only trustworthy when you can measure it and bound its jitter and drift.

Summary

SysTick provides a convenient timebase, but it is only as accurate as your clock configuration and interrupt discipline. Treat it as a calibrated instrument, not a magic delay source.

Homework/Exercises to practice the concept

Compute the reload value for 2 kHz and verify it in code.
Measure the drift of a 1 Hz LED blink over 10 minutes.
Experiment with interrupt priorities and observe SysTick jitter.

Solutions to the homework/exercises

Reload for 2 kHz at 72 MHz is 35,999.
A 1 Hz blink should complete 600 cycles in exactly 600 seconds; record the measured duration, log the error, and compute percent drift.
Setting a higher-priority timer interrupt increases SysTick jitter; adjust priorities if needed.
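
The percent-drift figure from solution 2 is a simple ratio. The sketch below assumes both durations are measured in seconds against a trusted reference; `drift_percent` is a hypothetical helper name.

```c
#include <assert.h>

/* Signed drift in percent: positive means the device took longer than
 * expected, so its clock runs slow relative to the reference. */
static double drift_percent(double expected_s, double measured_s)
{
    assert(expected_s > 0.0);
    return (measured_s - expected_s) / expected_s * 100.0;
}
```

If 600 blink cycles take 600.6 s on the reference, the drift is about 0.1%.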

3. Project Specification

3.1 What You Will Build

A latency profiler that triggers interrupts from a timer, toggles a GPIO at ISR entry, and reports latency statistics.

3.2 Functional Requirements

Latency Measurement: Measure latency from event to ISR entry using GPIO or DWT.
Priority Tests: Test multiple priority configurations and compare results.
Load Tests: Introduce background load and measure worst-case latency.
Report: Output min/avg/max latency statistics.

3.3 Non-Functional Requirements

Performance: Measurement resolution <1 us with DWT or external logic analyzer.
Reliability: Repeatable results across runs.
Usability: Simple configuration for priorities and load levels.

3.4 Example Usage / Output

Latency (us): min=1.2 avg=1.6 max=3.8
Priority: timer=1, uart=3
Load: 40% CPU
Status: PASS

3.5 Data Formats / Schemas / Protocols

Latency report: LAT_MIN_US=<f> LAT_AVG_US=<f> LAT_MAX_US=<f>
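
A report line in this format could be produced with `snprintf`. This is a sketch only: `format_report` is a hypothetical name, one decimal place is chosen to match the example output, and it assumes the C library in use supports floating-point formatting, which some embedded C libraries disable by default.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write one report line into buf. Returns the snprintf result: the number
 * of characters the full line needs, excluding the terminating NUL. */
static int format_report(char *buf, size_t len,
                         double min_us, double avg_us, double max_us)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "LAT_MIN_US=%.1f LAT_AVG_US=%.1f LAT_MAX_US=%.1f",
                    min_us, avg_us, max_us);
}
```

A return value greater than or equal to `len` means the line was truncated, so the caller should size the buffer with some margin.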

3.6 Edge Cases

Interrupts disabled during critical sections.
High-priority ISR starves lower-priority interrupts.
Measurement pin toggled too late in ISR.
DWT counter not enabled or overflowing.

3.7 Real World Outcome

You will produce a latency report that quantifies worst-case response time.

3.7.1 How to Run (Copy/Paste)

$ make flash
$ screen /dev/tty.usbmodem* 115200

3.7.2 Golden Path Demo (Deterministic)

Run with fixed load and priority settings; record results.

3.7.3 CLI Transcript (Success)

LAT_MIN_US=1.2 LAT_AVG_US=1.6 LAT_MAX_US=3.8
RESULT=PASS
# Exit code: 0

3.7.4 Failure Demo (Interrupts Masked)

LAT_MAX_US=120.0
RESULT=FAIL
# Exit code: 2

4. Solution Architecture

Timer events generate interrupts; ISR toggles a GPIO and DWT captures cycle counts for latency measurement.

4.1 High-Level Design

Timer Event -> NVIC -> ISR Entry -> GPIO Toggle -> Latency Report

4.2 Key Components

Component	Responsibility	Key Decisions
Event Source	Timer or external trigger	Timer for repeatability
ISR Probe	GPIO toggle and DWT capture	Probe at first instruction
Latency Analyzer	Computes statistics	Store samples in buffer

4.3 Data Structures (No Full Code)

typedef struct {
    uint32_t samples[256];
    uint16_t count;
} latency_buf_t;
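
One way to fill this buffer is a bounded append that drops samples once it is full. This is a sketch rather than the full design: `latency_buf_push` and `LAT_BUF_CAP` are hypothetical names, and it assumes a single writer (the ISR) with the reader only touching the buffer after capture has stopped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LAT_BUF_CAP 256u

typedef struct {
    uint32_t samples[LAT_BUF_CAP];  /* latency samples in CPU cycles */
    uint16_t count;                 /* number of valid entries */
} latency_buf_t;

/* Append one sample. Returns false and drops the sample when full,
 * so the ISR never writes past the end of the array. */
static bool latency_buf_push(latency_buf_t *b, uint32_t cycles)
{
    assert(b != NULL);
    if (b->count >= LAT_BUF_CAP) return false;
    b->samples[b->count++] = cycles;
    return true;
}
```

Dropping on overflow keeps the per-sample cost constant; a ring buffer that overwrites the oldest entry is an alternative when only the most recent window matters.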

4.4 Algorithm Overview

Latency Capture

Record cycle counter on event.
Record cycle counter at ISR entry.
Compute delta and store.

Complexity: O(1) per sample.
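
Once the deltas are stored, the min/avg/max report reduces to a single pass over the samples. The sketch below runs outside the ISR; `latency_stats_t` and `latency_stats` are hypothetical names, and the average is an integer mean in cycles, truncated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t min;
    uint32_t max;
    uint32_t avg;  /* integer mean, truncated */
} latency_stats_t;

/* Compute min, max, and mean over n samples (n must be at least 1). */
static latency_stats_t latency_stats(const uint32_t *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    latency_stats_t s = { samples[0], samples[0], 0 };
    uint64_t sum = 0;  /* 64-bit so the sum cannot overflow */

    for (size_t i = 0; i < n; i++) {
        if (samples[i] < s.min) s.min = samples[i];
        if (samples[i] > s.max) s.max = samples[i];
        sum += samples[i];
    }
    s.avg = (uint32_t)(sum / n);
    return s;
}
```

Keeping the statistics in cycles and converting to microseconds only when printing avoids floating-point work on the target.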

5. Implementation Guide

5.1 Development Environment Setup

make init
make flash
screen /dev/tty.usbmodem* 115200

5.2 Project Structure

project-root/
|-- src/
|   |-- main.c
|   |-- drivers/
|   `-- app/
|-- include/
|-- Makefile
`-- README.md

5.3 The Core Question You’re Answering

“What is my worst-case interrupt latency under real load?”

5.4 Concepts You Must Understand First

NVIC priorities and preemption.
GPIO timing measurement.
DWT cycle counter usage.

5.5 Questions to Guide Your Design

How will you trigger the event deterministically?
Where exactly will you toggle the GPIO?
What constitutes unacceptable latency?

5.6 Thinking Exercise

Latency Budget

Deadline = 50 us
Worst-case latency <= 10 us
ISR execution <= 40 us
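
The budget above amounts to one inequality: worst-case latency plus worst-case ISR execution must not exceed the deadline. A minimal sketch, with `budget_ok` as a hypothetical helper name and all values in microseconds:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the measured worst cases fit inside the deadline. */
static bool budget_ok(uint32_t worst_latency_us, uint32_t worst_isr_us,
                      uint32_t deadline_us)
{
    assert(deadline_us > 0);
    /* Widen before adding so the sum cannot overflow 32 bits. */
    return (uint64_t)worst_latency_us + worst_isr_us <= deadline_us;
}
```

With the numbers above, 10 us of latency and 40 us of ISR time exactly meet the 50 us deadline, so a single extra microsecond of measured latency fails the check.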

5.7 The Interview Questions They’ll Ask

What is interrupt latency and why does worst-case matter?
How do priorities affect preemption?
How do you measure ISR latency on hardware?

5.8 Hints in Layers

Hint 1: Toggle GPIO at first line of ISR.
Hint 2: Use DWT cycle counter for higher resolution.
Hint 3: Introduce synthetic CPU load for stress testing.

5.9 Books That Will Help

Topic	Book	Chapter
Interrupts	Definitive Guide to ARM Cortex-M	Ch. 7
Real-time design	Making Embedded Systems	Ch. 8

5.10 Implementation Phases

Phase 1: Baseline Measurement (2 days)

Measure latency with minimal load.

Phase 2: Priority Experiments (4 days)

Change NVIC priorities and record effects.

Phase 3: Load Testing (4 days)

Introduce load and compute worst-case latency.

5.11 Key Implementation Decisions

Decision	Options	Recommendation	Rationale
Measurement	GPIO vs DWT	Both	GPIO for physical timing, DWT for precision
Event source	Timer vs external	Timer	Deterministic events

6. Testing Strategy

6.1 Test Categories

Category	Purpose	Examples
Unit Tests	Stats calculation	Min/avg/max computation
Integration Tests	Latency capture	GPIO timing validation
Edge Case Tests	Interrupt masking	Critical section delays

6.2 Critical Test Cases

Baseline Latency: Latency stays within expected range with minimal load.
Priority Impact: Lower-priority interrupts show increased latency.
Masked Interrupts: Latency spikes when interrupts are disabled.

6.3 Test Data

Min=1.1 us, Avg=1.5 us, Max=3.7 us

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall	Symptom	Solution
GPIO toggled late	Latency appears larger than reality	Toggle at ISR entry
DWT disabled	No cycle counts	Enable CYCCNT
Logging in ISR	Latency skewed	Store samples and log later

7.2 Debugging Strategies

Use a logic analyzer to validate GPIO timing.
Compare DWT cycles to measured GPIO edges.
Reduce ISR work and see latency improve.

7.3 Performance Traps

Heavy ISRs block equal- and lower-priority interrupts for their full duration, which raises worst-case latency for those interrupts; keep ISRs short and defer work to the main loop.

8. Extensions & Challenges

8.1 Beginner Extensions

Add a histogram of latency values.
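
A fixed-width histogram is one possible starting point for this extension. The sketch below is an assumption about the design, not a prescribed one: `latency_hist_t`, `hist_add`, and `HIST_BINS` are hypothetical names, bins are measured in cycles, and the last bin collects everything above the covered range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HIST_BINS 16u

typedef struct {
    uint32_t bin_width;          /* cycles covered by each bin */
    uint32_t counts[HIST_BINS];  /* last bin also collects overflow */
} latency_hist_t;

/* Count one sample in the bin that covers it. */
static void hist_add(latency_hist_t *h, uint32_t cycles)
{
    assert(h != NULL && h->bin_width > 0);
    uint32_t idx = cycles / h->bin_width;
    if (idx >= HIST_BINS) idx = HIST_BINS - 1u;  /* clamp outliers */
    h->counts[idx]++;
}
```

Clamping outliers into the last bin keeps rare worst-case samples visible instead of silently discarding them.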

8.2 Intermediate Extensions

Measure latency for external interrupt input.

8.3 Advanced Extensions

Automate sweep of priority configurations and plot results.

9. Real-World Connections

9.1 Industry Applications

Motor control: Interrupt latency determines control loop stability.
Safety systems: Guaranteed response times for fault detection.

9.2 Related Open Source Projects

FreeRTOS: RTOS timing and interrupt latency discussions.
CMSIS: NVIC and DWT access reference.

9.3 Interview Relevance

Interrupt latency measurement scenarios.
Priority and preemption design questions.

10. Resources

10.1 Essential Reading

Definitive Guide to ARM Cortex-M by Joseph Yiu - NVIC and exception details.
Making Embedded Systems by Elecia White - Real-time timing discipline.

10.2 Video Resources

Cortex-M interrupt latency measurement tutorial.
NVIC priority configuration demo.

10.3 Tools & Documentation

Logic analyzer: Measure GPIO timing.
STM32CubeIDE: Build and debug.

10.4 Related Projects in This Series

P03 Timer-Driven LED Sequencer - uses timer interrupts.
P14 Fault Injection and Watchdog Recovery - reliability under faults.

11. Self-Assessment Checklist

11.1 Understanding

I can explain NVIC priority and preemption.
I understand worst-case vs average latency.
I can use DWT for cycle measurements.

11.2 Implementation

Latency report produced with min/avg/max.
Load tests show expected changes.
Measurements repeat across runs.

11.3 Growth

I can design a latency budget for a new system.
I can improve latency by reducing ISR work.
I can explain these measurements in an interview.

12. Submission / Completion Criteria

Minimum Viable Completion:

Latency measurement works.
Report produced.
At least two priority configs tested.

Full Completion:

Load tests performed.
Worst-case latency documented.

Excellence (Going Above & Beyond):

Automated latency sweep and plots created.
