Project 19: Real-Time Latency and Jitter Profiler (NVIC + Lock-Free Queue)
Build a measurement-first real-time control pipeline for NeoTrellis M4 that proves deterministic behavior using latency and jitter percentiles.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 4: Expert |
| Time Estimate | 2-3 weeks |
| Main Programming Language | C (Arduino core + CMSIS register access) |
| Alternative Programming Languages | Bare-metal C, Rust embedded (conceptual port) |
| Coolness Level | Level 4: The “Whoa, You Built That?” |
| Business Potential | 1. The “Resume Gold” |
| Prerequisites | Interrupts, timers, queue data structures, basic measurement tooling |
| Key Topics | NVIC priority tuning, latency instrumentation, jitter analysis, ISR/main-loop partitioning |
1. Learning Objectives
By completing this project, you will:
- Define explicit timing budgets for hard, firm, and soft real-time tasks.
- Implement ISR-safe event capture with lock-free SPSC queue handoff.
- Measure and report p50/p90/p99/max latency and jitter under stress load.
- Tune interrupt priorities to reduce jitter tails without starving lower-priority work.
- Build an overload-degradation policy that preserves musical timing quality.
2. All Theory Needed (Project-Scoped)
2.1 Deterministic Timing Contracts
A real-time system counts as deterministic only if its worst-case behavior is bounded and that bound has been measured. Define deadlines first, then design scheduling around them.
2.2 NVIC Priority and Preemption
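Use the CMSIS-Core NVIC API and the device documentation to define which interrupts may preempt which. On Cortex-M, a numerically lower priority value is more urgent. Keep high-priority ISRs short and bounded.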
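One way to make the preemption policy explicit is to keep it in a table and validate it before applying it. The sketch below is illustrative only: `irq_plan_t`, `plan_is_valid`, `plan_apply`, and the IRQ numbers in `example_plan` are hypothetical placeholders, not values from the SAMD51 header. `NVIC_SetPriority` and `__NVIC_PRIO_BITS` come from CMSIS-Core and the device header, and the target-only part assumes the Arduino core pulls those in through `Arduino.h`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#ifdef ARDUINO
#include <Arduino.h>   /* assumed to pull in the CMSIS device header on the target */
#endif

/* One row of the preemption policy. Lower prio value = more urgent. */
typedef struct {
    const char *name;
    int         irqn;  /* placeholder; take real IRQ numbers from the device header */
    uint8_t     prio;
} irq_plan_t;

/* Illustrative plan: the timing probe is the most urgent interrupt. */
static const irq_plan_t example_plan[] = {
    { "key-edge probe", 0, 0 },
    { "usb",            1, 2 },
    { "led refresh",    2, 3 },
};

/* Every priority must fit in the implemented bits, and the probe row
 * must be strictly more urgent than every other row. */
static bool plan_is_valid(const irq_plan_t *plan, size_t n,
                          size_t probe_idx, unsigned prio_bits)
{
    assert(plan != NULL);
    if (probe_idx >= n) return false;
    for (size_t i = 0; i < n; i++) {
        if (plan[i].prio >= (1u << prio_bits)) return false;
        if (i != probe_idx && plan[i].prio <= plan[probe_idx].prio) return false;
    }
    return true;
}

#ifdef ARDUINO
/* Target only: write the validated plan into the NVIC. */
static void plan_apply(const irq_plan_t *plan, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        NVIC_SetPriority((IRQn_Type)plan[i].irqn, plan[i].prio);
    }
}
#endif
```

Keeping the plan as data makes it easy to record the rationale next to each row and to compare measurements before and after a priority change.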
2.3 Latency vs Jitter Metrics
Latency is the delay between an event and the system's response; jitter is the variation in that delay from one event to the next. Musical feel depends mostly on the jitter tails (p99 and max), not on the mean.
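To see why the tail matters more than the mean, a small host-side sketch can summarize a batch of latency samples. It uses the nearest-rank percentile definition; `latency_summary_t`, `summarize`, and the `tail_spread` field (p99 minus p50, one possible jitter indicator) are names chosen for illustration, not part of the project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

static int cmp_u32(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Nearest-rank percentile on an already sorted array, pct in 1..100. */
static uint32_t percentile_sorted(const uint32_t *sorted, size_t n, unsigned pct)
{
    assert(n > 0 && pct >= 1 && pct <= 100);
    size_t rank = (n * pct + 99) / 100;   /* ceil(n * pct / 100) */
    return sorted[rank - 1];
}

typedef struct {
    uint32_t mean, p50, p99, max;
    uint32_t tail_spread;                 /* p99 - p50, an assumed jitter indicator */
} latency_summary_t;

/* Sorts the samples in place, then fills in the summary. */
static latency_summary_t summarize(uint32_t *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    qsort(samples, n, sizeof samples[0], cmp_u32);

    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) sum += samples[i];

    latency_summary_t s;
    s.mean = (uint32_t)(sum / n);
    s.p50  = percentile_sorted(samples, n, 50);
    s.p99  = percentile_sorted(samples, n, 99);
    s.max  = samples[n - 1];
    s.tail_spread = s.p99 - s.p50;
    return s;
}
```

With 98 samples of 10 us and 2 samples of 500 us, the mean moves only to 19 us while p99 jumps to 500 us, which is the behavior a listener would actually notice.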
2.4 Lock-Free Event Transport
Use a single-producer (ISR), single-consumer (main loop) ring buffer so the handoff never blocks and cannot cause priority inversion.
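A minimal sketch of such a ring, assuming C11 atomics are available in the toolchain and using a plain `uint32_t` payload in place of the real event record. The names `spsc_ring_t`, `ring_push`, and `ring_pop` are illustrative. The producer writes only `head`, the consumer writes only `tail`, and the indices run freely so that the power-of-two mask handles wraparound.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_CAPACITY 256u   /* must be a power of two */
static_assert((RING_CAPACITY & (RING_CAPACITY - 1u)) == 0u,
              "capacity must be a power of two");

typedef struct {
    uint32_t         slots[RING_CAPACITY];
    _Atomic uint32_t head;            /* written only by the producer (ISR) */
    _Atomic uint32_t tail;            /* written only by the consumer (main loop) */
    uint32_t         overflow_count;  /* written only by the producer */
} spsc_ring_t;

/* Producer side: never blocks; counts and rejects when full. */
static bool ring_push(spsc_ring_t *r, uint32_t value)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_CAPACITY) {
        r->overflow_count++;
        return false;
    }
    r->slots[head & (RING_CAPACITY - 1u)] = value;
    /* Release: the slot write becomes visible before the new head. */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool ring_pop(spsc_ring_t *r, uint32_t *out)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail) return false;
    *out = r->slots[tail & (RING_CAPACITY - 1u)];
    /* Release: the slot read completes before the slot is handed back. */
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return true;
}
```

Neither side disables interrupts or spins, so the ISR's worst-case cost stays a fixed handful of loads and stores. The ring must be zero-initialized (for example, a `static` instance) before first use.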
2.5 Failure Modes
- ISR overload
- Queue overflow
- Hidden long critical sections
- Telemetry itself perturbing timing
3. Project Specification
3.1 What You Will Build
A firmware instrumentation harness that captures:
- edge-to-ISR latency
- ISR runtime
- queue transit delay
- event-to-output latency
It must run while LEDs, USB MIDI, and control scanning are active.
3.2 Functional Requirements
- Timestamp capture for each stage with cycle-level granularity.
- Lock-free queue path from ISR to main context.
- Periodic percentile report generation without blocking critical paths.
- Configurable stress modes (idle, moderate, full load).
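For cycle-level timestamps, one common option on Cortex-M4 is the DWT cycle counter. The sketch below assumes the Arduino core exposes the CMSIS `DWT` and `CoreDebug` register blocks through `Arduino.h`; `cyc_init`, `cyc_now`, `cyc_elapsed`, and `cyc_to_ns` are hypothetical helper names. The 120 MHz figure in the usage note is an assumption about the board's core clock and should be checked against the actual configuration.

```c
#include <assert.h>
#include <stdint.h>

#ifdef ARDUINO
#include <Arduino.h>   /* assumed to provide the CMSIS core register definitions */

/* Target only: enable the free-running 32-bit cycle counter. */
static inline void cyc_init(void)
{
    CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;  /* enable the trace block */
    DWT->CYCCNT = 0;
    DWT->CTRL  |= DWT_CTRL_CYCCNTENA_Msk;            /* start counting cycles */
}

/* Target only: a timestamp is a single register read. */
static inline uint32_t cyc_now(void)
{
    return DWT->CYCCNT;
}
#endif

/* Elapsed cycles between two timestamps. Unsigned subtraction stays
 * correct across one counter wrap, as long as the true interval is
 * shorter than 2^32 cycles. */
static inline uint32_t cyc_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Convert a cycle count to nanoseconds for a given core clock in Hz. */
static inline uint64_t cyc_to_ns(uint32_t cycles, uint32_t core_hz)
{
    assert(core_hz > 0);
    return ((uint64_t)cycles * 1000000000ull) / core_hz;
}
```

At an assumed 120 MHz core clock, 936 cycles convert to 7800 ns. Storing raw cycle counts in the hot path and converting only when a report is produced keeps the per-event cost low.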
3.3 Non-Functional Requirements
- Performance: instrumentation adds less than 10% overhead in stress mode, checked with an A/B run that toggles instrumentation.
- Reliability: No queue corruption over 30-minute stress test.
- Repeatability: Same test profile gives comparable metrics across runs.
3.4 Real World Outcome
$ neotrellis_rt_profiler --window 15s --stress all
[15.000s] irq_latency_us: p50=7.8 p90=11.2 p99=18.4 max=31.0
[15.000s] queue_delay_us: p50=22.1 p90=44.7 p99=87.9 max=140.3
[15.000s] event_to_midi_us: p50=410.2 p90=730.6 p99=1188.9 max=1642.4
[15.000s] queue_overflow=0 deadline_miss=0
PASS: deterministic target met
4. Solution Architecture
4.1 High-Level Design
GPIO/Timer IRQ --> ISR timestamp + enqueue --> SPSC queue --> main dispatcher --> output sink
                            |                                        |
                            +--> irq runtime stats                   +--> end-to-end stats
4.2 Key Components
| Component | Responsibility | Key Decision |
|---|---|---|
| IRQ probe | Capture edge and ISR-entry timestamps | Use low-overhead counter reads |
| SPSC queue | ISR-to-main event transport | Fixed-size power-of-two ring |
| Metrics aggregator | Compute percentiles | Windowed summary computed off the critical path, no heavy per-event math |
| Stress harness | Reproducible workload injection | Deterministic stress profiles |
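
One way to keep per-event work constant is a fixed-bin histogram: recording is a single increment, and percentiles are derived only when a report window closes. This is a sketch under assumed parameters (64 bins of 32 us each); `hist_t`, `hist_record`, and `hist_percentile_us` are illustrative names, and the resolution is limited to the bin width.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HIST_BINS   64u
#define HIST_BIN_US 32u   /* assumed bin width; larger values saturate into the last bin */

typedef struct {
    uint32_t bins[HIST_BINS];
    uint32_t count;
    uint32_t max_us;
} hist_t;

static void hist_reset(hist_t *h)
{
    memset(h, 0, sizeof *h);
}

/* O(1) per event: one divide, one clamp, two increments, one compare. */
static void hist_record(hist_t *h, uint32_t us)
{
    uint32_t idx = us / HIST_BIN_US;
    if (idx >= HIST_BINS) idx = HIST_BINS - 1u;
    h->bins[idx]++;
    h->count++;
    if (us > h->max_us) h->max_us = us;
}

/* Nearest-rank percentile, reported as the upper edge of the bin that
 * holds the rank, clamped to the observed maximum. */
static uint32_t hist_percentile_us(const hist_t *h, unsigned pct)
{
    assert(h->count > 0 && pct >= 1 && pct <= 100);
    uint64_t rank = ((uint64_t)h->count * pct + 99u) / 100u;
    uint64_t seen = 0;
    for (uint32_t i = 0; i < HIST_BINS; i++) {
        seen += h->bins[i];
        if (seen >= rank) {
            uint32_t edge = (i + 1u) * HIST_BIN_US;
            return (i == HIST_BINS - 1u || edge > h->max_us) ? h->max_us : edge;
        }
    }
    return h->max_us;   /* not reached when count > 0 */
}
```

Reporting the upper bin edge errs on the pessimistic side, which suits a pass/fail check against a deadline. The report itself can run from the main loop at the end of each window, followed by `hist_reset`.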
4.3 Data Shapes (Pseudocode)
event = {type, key_id, ts_irq_entry, ts_isr_exit, ts_dispatch_start, ts_output_done}
queue = ring<event, 256>
metrics = {hist_irq_latency, hist_queue_delay, hist_end_to_end, overflow_count}
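The event shape above could be expressed in C roughly as follows. This is a sketch, not the project's definitive layout: the field widths, the `event_type_t` values, and the `event_stages` helper are assumptions. Timestamps are taken to be raw 32-bit cycle counts, so stage durations use unsigned subtraction, which stays correct across one counter wrap. Edge-to-ISR latency would need an additional edge timestamp that the pseudocode does not list, so it is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { EV_NOTE_ON, EV_NOTE_OFF, EV_CONTROL } event_type_t;  /* assumed values */

typedef struct {
    uint8_t  type;               /* an event_type_t value */
    uint8_t  key_id;
    uint32_t ts_irq_entry;       /* cycle count at ISR entry */
    uint32_t ts_isr_exit;        /* cycle count just before the ISR returns */
    uint32_t ts_dispatch_start;  /* cycle count when the main loop dequeues */
    uint32_t ts_output_done;     /* cycle count after the output is written */
} event_t;

typedef struct {
    uint32_t isr_runtime;
    uint32_t queue_delay;
    uint32_t end_to_end;
} stage_cycles_t;

/* Derive per-stage durations (in cycles) from one completed event. */
static stage_cycles_t event_stages(const event_t *e)
{
    assert(e != NULL);
    stage_cycles_t s;
    s.isr_runtime = e->ts_isr_exit       - e->ts_irq_entry;
    s.queue_delay = e->ts_dispatch_start - e->ts_isr_exit;
    s.end_to_end  = e->ts_output_done    - e->ts_irq_entry;
    return s;
}
```

Computing the differences in the main loop, after the output is done, keeps the ISR down to storing timestamps and enqueueing.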
5. Implementation Guide
5.1 Phases
- Baseline ISR timing capture only.
- Add queue handoff and main-loop dispatch timing.
- Add stress modes and percentile reporting.
- Tune priorities and validate under sustained load.
5.2 The Core Question You’re Answering
“Can the device prove bounded timing behavior under realistic burst load?”
5.3 Questions to Guide Design
- Which interrupts are allowed to preempt others?
- What overflow policy protects note-on/off integrity?
- How do you avoid measurement code becoming the bottleneck?
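One possible answer to the overflow question is a high-water-mark admission policy: once the queue passes a threshold, only note-on and note-off events are admitted, and droppable classes are shed first. The sketch below is one such policy under assumed parameters; the event classes, the 75% threshold, and the name `admit_event` are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    EV_NOTE_ON,
    EV_NOTE_OFF,
    EV_CONTROL,     /* assumed droppable: a later value supersedes it */
    EV_TELEMETRY    /* assumed droppable: measurement only */
} event_class_t;

/* Decide whether an event may be enqueued at the current fill level.
 * Above the high-water mark, the remaining slots are reserved for
 * note-on/off so that bursts of droppable events cannot crowd them out. */
static bool admit_event(event_class_t cls, size_t used, size_t capacity)
{
    assert(capacity > 0 && used <= capacity);
    if (used == capacity) return false;            /* hard full: nothing fits */

    size_t high_water = capacity - capacity / 4;   /* assumed 75% threshold */
    if (used >= high_water) {
        return cls == EV_NOTE_ON || cls == EV_NOTE_OFF;
    }
    return true;
}
```

A hard-full queue still rejects note events, so every rejection should increment an overflow counter that the report exposes. The reserve only makes that case less likely; it does not remove it.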
5.4 Thinking Exercise
Create a budget table for a 1 ms control frame and assign worst-case microseconds to each stage. Identify over-budget risk before coding.
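The budget table can also live in code so that it is checked rather than remembered. The sketch below is a minimal illustration: the stage names and the microsecond values in `example_budget` are made-up placeholders, not measurements, and `budget_row_t`, `budget_total_us`, and `budget_slack_us` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *stage;
    uint32_t    worst_case_us;
} budget_row_t;

/* Placeholder numbers for a 1 ms (1000 us) control frame. */
static const budget_row_t example_budget[] = {
    { "edge to ISR entry",   20u },
    { "ISR runtime",         10u },
    { "queue transit",      150u },
    { "dispatch + output",  600u },
};

/* Sum of the worst-case allocations across all stages. */
static uint32_t budget_total_us(const budget_row_t *rows, size_t n)
{
    assert(rows != NULL || n == 0);
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++) total += rows[i].worst_case_us;
    return total;
}

/* Remaining slack in the frame; a negative result means over budget. */
static int32_t budget_slack_us(const budget_row_t *rows, size_t n, uint32_t frame_us)
{
    return (int32_t)frame_us - (int32_t)budget_total_us(rows, n);
}
```

With these placeholder values the stages sum to 780 us, leaving 220 us of slack in a 1000 us frame. Once real measurements exist, the measured max for each stage can be compared against its row.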
5.5 Interview Questions They Will Ask
- How do you tune NVIC priorities in mixed workloads?
- Why is p99 more useful than average latency?
- How do you design a lock-free ISR-safe queue?
- What overload policy keeps user experience stable?
- How do you verify determinism, not just speed?
5.6 Hints in Layers
- Hint 1: Keep ISR logic to capture/enqueue/exit only.
- Hint 2: Separate metric collection from metric formatting.
- Hint 3: Validate queue invariants with long stress runs.
- Hint 4: Compare before/after each priority tuning change.
5.7 Common Pitfalls and Debugging
| Problem | Why | Fix | Quick Test |
|---|---|---|---|
| Jitter spikes every second | periodic heavy logging | batch logs and run logging at lower priority | disable logs and compare p99 |
| Event drops | queue too small or wrong policy | increase capacity, classify droppable events | burst test with overflow counter |
| Timing gets worse with metrics | instrumentation overhead too high | sample subset or lower report cadence | A/B run with instrumentation toggle |
5.8 Definition of Done
- p99 and max latency values are reported for at least 3 pipeline stages.
- Queue overflow behavior is explicit and tested.
- 30-minute stress test completes with zero corruption.
- Priority assignment rationale is documented.
6. References
- ARM Cortex-M4 Technical Reference Manual
- CMSIS-Core NVIC API
- Microchip ATSAMD51J19 product page and datasheet
- “Making Embedded Systems,” 2nd ed., by Elecia White