Project 20: CMSIS-DSP Polyphonic Wavetable Engine (DMA + Double Buffer DAC)

Build a serious embedded synth core on NeoTrellis M4 using CMSIS-DSP primitives, deterministic buffer scheduling, and zero-glitch playback targets.

Quick Reference

| Attribute | Value |
| --- | --- |
| Difficulty | Level 5: Master |
| Time Estimate | 3-4 weeks |
| Main Programming Language | C/C++ with CMSIS-DSP |
| Alternative Programming Languages | Bare-metal C |
| Coolness Level | Level 5: The “WTF, That’s Possible?” |
| Business Potential | 2. The “Micro-SaaS” |
| Prerequisites | DSP basics, timer/DMA understanding, real-time profiling |
| Key Topics | FIR/IIR filters, envelopes, voice allocation, double buffering, DMA DAC scheduling |

1. Learning Objectives

By completing this project, you will:

  1. Design a block-based audio pipeline with explicit deadline budgets.
  2. Implement polyphonic voice management with deterministic stealing policy.
  3. Apply CMSIS-DSP filter kernels (FIR and biquad/IIR) in real-time flow.
  4. Compare fixed-point and floating-point tradeoffs for CPU, memory, and audio behavior.
  5. Prove zero-glitch operation through underrun and timing telemetry.

2. All Theory Needed (Project-Scoped)

2.1 Block Deadline Math

At sample rate Fs and block size N, the per-block deadline is N/Fs: at 48 kHz with a 128-sample block, 128/48000 ≈ 2.667 ms. All DSP for the next block must finish before the DMA controller finishes consuming the current one.

2.2 CMSIS-DSP in Practice

Use CMSIS-DSP's optimized kernels for filter and vector operations; reserve hand-written loops for math that is truly unique to your project.

2.3 Voice Lifecycle

A voice includes oscillator state, envelope state, and note assignment metadata. Deterministic voice-stealing prevents unpredictable output.

2.4 Filter Choices

  • FIR: linear phase, higher CPU cost for long tap counts.
  • IIR/biquad: efficient and common in embedded synthesis, but coefficient stability matters.

2.5 Fixed vs Float

On a Cortex-M4F, single-precision float is often practical and simpler thanks to the hardware FPU. Fixed-point (Q15/Q31) pays off in high-voice-count or memory-constrained paths.


3. Project Specification

3.1 What You Will Build

A polyphonic wavetable synth with:

  • 8-16 voices
  • ADSR envelopes
  • filter stage
  • DMA-driven double-buffer DAC output
  • performance telemetry

3.2 Functional Requirements

  1. Stable wavetable oscillator per active voice.
  2. Envelope handling without clicks.
  3. Configurable filter chain using CMSIS-DSP primitives.
  4. Double-buffer swap synchronized to DMA half/full events.

3.3 Non-Functional Requirements

  • Performance: p99 render time below configured block budget.
  • Reliability: zero buffer underruns in long stress sessions.
  • Quality: no audible zipper noise or clicks during note transitions.

3.4 Real World Outcome

$ neotrellis_synth --voices 12 --rate 48000 --block 128 --profile 20s
[20.000s] block_deadline_ms=2.667
[20.000s] render_ms: p50=0.94 p90=1.31 p99=1.72 max=2.01
[20.000s] active_voices_peak=12 voice_steals=37
[20.000s] underruns=0
[20.000s] click_detector_events=0
PASS: zero-glitch profile achieved

4. Solution Architecture

4.1 High-Level Design

MIDI events --> voice manager --> oscillator bank --> envelope stage --> filter stage --> mix/limit --> audio block
                                                                                                 |
                                                                                                 v
                                                                                        DMA double-buffer DAC

4.2 Key Components

| Component | Responsibility | Key Decision |
| --- | --- | --- |
| Voice manager | note allocation and stealing | oldest/quietest/release-state policy |
| Oscillator bank | wavetable lookup and phase increment | table size vs interpolation cost |
| Envelope engine | ADSR control | audio-rate vs control-rate update strategy |
| DSP stage | filtering and shaping | FIR/IIR selection per use case |
| DMA scheduler | glitch-free output handoff | half/full interrupt orchestration |

4.3 Core Data Shapes (Pseudocode)

voice = {note, phase, phase_inc, env_state, env_level, active, start_ts}
audio_block = array<float, 128>
state = {voices[16], current_buffer, underrun_count, render_hist}

5. Implementation Guide

5.1 Phases

  1. Mono synth baseline with one waveform.
  2. Add polyphony and deterministic voice stealing.
  3. Integrate ADSR and click prevention.
  4. Add CMSIS-DSP filters and timing telemetry.
  5. Harden buffer scheduling and stress-test.

5.2 The Core Question You’re Answering

“Can this MCU sustain musically useful polyphony with deterministic deadlines and no audible artifacts?”

5.3 Questions to Guide Design

  1. How much cycle budget remains after voice rendering but before filters?
  2. Which parameter updates can run per block instead of per sample?
  3. What fails first when polyphony is pushed beyond safe limits?

5.4 Thinking Exercise

Compute cycle budget at 44.1 kHz and 48 kHz for block sizes 64, 128, 256. Decide which mode gives best latency/performance balance for your goals.

5.5 Interview Questions They Will Ask

  1. Why is double-buffering fundamental in real-time audio?
  2. How do FIR and IIR differ in embedded tradeoffs?
  3. How do you avoid note-on/note-off clicks?
  4. What evidence proves zero-glitch behavior?
  5. When would you switch a stage from float to fixed-point?

5.6 Hints in Layers

  • Hint 1: Build a reliable mono path before adding polyphony.
  • Hint 2: Instrument block render time from day one.
  • Hint 3: Keep allocations static; avoid dynamic allocation in audio path.
  • Hint 4: Use CMSIS-DSP kernels in hotspots before writing custom loops.

5.7 Common Pitfalls and Debugging

| Problem | Why | Fix | Quick Test |
| --- | --- | --- | --- |
| Random clicks at high note density | deadline misses | reduce voices or optimize kernels | stress chord bursts, inspect underrun counter |
| Envelope zipper noise | coarse control-rate updates | increase update rate or interpolate | slow attack/release sweep listening test |
| Filter instability | invalid coefficient scaling | verify coefficients and numeric ranges | sine sweep + stability monitor |

5.8 Definition of Done

  • 20-minute stress run with zero underruns.
  • p99 render time remains below budget.
  • Envelope transitions are click-free.
  • Voice stealing behavior is deterministic and documented.

6. References