Project 20: CMSIS-DSP Polyphonic Wavetable Engine (DMA + Double Buffer DAC)

Build a serious embedded synth core on NeoTrellis M4 using CMSIS-DSP primitives, deterministic buffer scheduling, and zero-glitch playback targets.

Quick Reference

| Attribute | Value |
| --- | --- |
| Difficulty | Level 5: Master |
| Time Estimate | 3-4 weeks |
| Main Programming Language | C/C++ with CMSIS-DSP |
| Alternative Programming Languages | Bare-metal C |
| Coolness Level | Level 5: The “WTF, That’s Possible?” |
| Business Potential | 2. The “Micro-SaaS” |
| Prerequisites | DSP basics, timer/DMA understanding, real-time profiling |
| Key Topics | FIR/IIR filters, envelopes, voice allocation, double buffering, DMA DAC scheduling |

1. Learning Objectives

By completing this project, you will:

  1. Design a block-based audio pipeline with explicit deadline budgets.
  2. Implement polyphonic voice management with deterministic stealing policy.
  3. Apply CMSIS-DSP filter kernels (FIR and biquad/IIR) in real-time flow.
  4. Compare fixed-point and floating-point tradeoffs for CPU, memory, and audio behavior.
  5. Prove zero-glitch operation through underrun and timing telemetry.

2. All Theory Needed (Project-Scoped)

2.1 Block Deadline Math

At sample rate Fs and block size N, the per-block deadline is N/Fs: at 48 kHz with a 128-sample block, 128/48000 ≈ 2.667 ms. All DSP for the next block must finish before the DMA controller finishes consuming the current one.

2.2 CMSIS-DSP in Practice

Use CMSIS-DSP's optimized kernels for filter and vector operations; reserve hand-written loops for math that is truly unique to your project.

2.3 Voice Lifecycle

A voice includes oscillator state, envelope state, and note assignment metadata. Deterministic voice-stealing prevents unpredictable output.

2.4 Filter Choices

  • FIR: linear phase, higher CPU cost for long tap counts.
  • IIR/biquad: efficient and common in embedded synthesis, but coefficient stability matters.

2.5 Fixed vs Float

On a Cortex-M4F, single-precision float is often practical and simpler thanks to the hardware FPU. Fixed-point (Q15/Q31) pays off in high-voice-count or memory-constrained paths.


3. Project Specification

3.1 What You Will Build

A polyphonic wavetable synth with:

  • 8-16 voices
  • ADSR envelopes
  • filter stage
  • DMA-driven double-buffer DAC output
  • performance telemetry

3.2 Functional Requirements

  1. Stable wavetable oscillator per active voice.
  2. Envelope handling without clicks.
  3. Configurable filter chain using CMSIS-DSP primitives.
  4. Double-buffer swap synchronized to DMA half/full events.

3.3 Non-Functional Requirements

  • Performance: p99 render time below configured block budget.
  • Reliability: zero buffer underruns in long stress sessions.
  • Quality: no audible zipper noise or clicks during note transitions.

3.4 Real World Outcome

$ neotrellis_synth --voices 12 --rate 48000 --block 128 --profile 20s
[20.000s] block_deadline_ms=2.667
[20.000s] render_ms: p50=0.94 p90=1.31 p99=1.72 max=2.01
[20.000s] active_voices_peak=12 voice_steals=37
[20.000s] underruns=0
[20.000s] click_detector_events=0
PASS: zero-glitch profile achieved

4. Solution Architecture

4.1 High-Level Design

MIDI events --> voice manager --> oscillator bank --> envelope stage --> filter stage --> mix/limit --> audio block
                                                                                                 |
                                                                                                 v
                                                                                        DMA double-buffer DAC

4.2 Key Components

| Component | Responsibility | Key Decision |
| --- | --- | --- |
| Voice manager | note allocation and stealing | oldest/quietest/release-state policy |
| Oscillator bank | wavetable lookup and phase increment | table size vs interpolation cost |
| Envelope engine | ADSR control | audio-rate vs control-rate update strategy |
| DSP stage | filtering and shaping | FIR/IIR selection per use case |
| DMA scheduler | glitch-free output handoff | half/full interrupt orchestration |

4.3 Core Data Shapes (Pseudocode)

voice = {note, phase, phase_inc, env_state, env_level, active, start_ts}
audio_block = array<float, 128>
state = {voices[16], current_buffer, underrun_count, render_hist}

5. Implementation Guide

5.1 Phases

  1. Mono synth baseline with one waveform.
  2. Add polyphony and deterministic voice stealing.
  3. Integrate ADSR and click prevention.
  4. Add CMSIS-DSP filters and timing telemetry.
  5. Harden buffer scheduling and stress-test.

5.2 The Core Question You’re Answering

“Can this MCU sustain musically useful polyphony with deterministic deadlines and no audible artifacts?”

5.3 Questions to Guide Design

  1. How much cycle budget remains after voice rendering but before filters?
  2. Which parameter updates can run per block instead of per sample?
  3. What fails first when polyphony is pushed beyond safe limits?

5.4 Thinking Exercise

Compute cycle budget at 44.1 kHz and 48 kHz for block sizes 64, 128, 256. Decide which mode gives best latency/performance balance for your goals.

5.5 Interview Questions They Will Ask

  1. Why is double-buffering fundamental in real-time audio?
  2. How do FIR and IIR differ in embedded tradeoffs?
  3. How do you avoid note-on/note-off clicks?
  4. What evidence proves zero-glitch behavior?
  5. When would you switch a stage from float to fixed-point?

5.6 Hints in Layers

  • Hint 1: Build a reliable mono path before adding polyphony.
  • Hint 2: Instrument block render time from day one.
  • Hint 3: Keep allocations static; avoid dynamic allocation in audio path.
  • Hint 4: Use CMSIS-DSP kernels in hotspots before writing custom loops.

5.7 Common Pitfalls and Debugging

| Problem | Why | Fix | Quick Test |
| --- | --- | --- | --- |
| Random clicks at high note density | deadline misses | reduce voices or optimize kernels | stress chord bursts, inspect underrun counter |
| Envelope zipper noise | coarse control-rate updates | increase update rate or interpolate | slow attack/release sweep listening test |
| Filter instability | invalid coefficient scaling | verify coefficients and numeric ranges | sine sweep + stability monitor |

5.8 Definition of Done

  • 20-minute stress run with zero underruns.
  • p99 render time remains below budget.
  • Envelope transitions are click-free.
  • Voice stealing behavior is deterministic and documented.

6. References