Project 4: Real-Time Audio Spectrum Analyzer

Project Overview

Attribute	Value
Difficulty	Advanced
Time Estimate	3-4 weeks
Main Language	C
Alternatives	Rust, Arduino C++, MicroPython
Primary Book	The Scientist and Engineer’s Guide to DSP by Steven W. Smith
Knowledge Areas	DSP, Audio Processing, I2S, Dual-Core FreeRTOS, DMA

What You’ll Build

A device that captures audio through a microphone, performs FFT analysis, and displays a real-time frequency spectrum on an LED matrix or OLED display.

Physical Setup:

ESP32 connected to INMP441 (I2S digital microphone) or MAX4466 (analog)
8x32 or 16x16 WS2812B LED matrix, or SSD1306 OLED display
Music or voice near the microphone creates dancing visualizations

What You’ll See:

LED Matrix (8 frequency bands):
     █
     █     █
█    █  █  █        █
█ █  █  █  █  █  █  █
▔ ▔  ▔  ▔  ▔  ▔  ▔  ▔
20 50 125 315 800 2k 5k 12k Hz

OLED Display (128x64):
┌────────────────────────────────┐
│ Real-Time Audio Spectrum       │
│                                │
│ ▂▃▄▅▆▇█▇▆▅▄▃▂▁▁▂▃▄▅▆▇█▇▆▅▄▃▂▁ │
│                                │
│ Peak: 2.4kHz    dB: -18        │
│ FPS: 45         Clipping: No   │
└────────────────────────────────┘

Learning Objectives

By completing this project, you will be able to:

Sample audio using I2S with DMA for zero-copy continuous capture
Implement the Fast Fourier Transform on resource-constrained hardware
Leverage ESP32’s dual-core architecture for parallel processing
Apply digital signal processing concepts: windowing, magnitude calculation, dB scaling
Drive WS2812B LED matrices using the RMT peripheral
Profile and optimize real-time systems to achieve target frame rates
Understand time-frequency domain trade-offs in spectrum analysis

Deep Theoretical Foundation

Digital Audio Fundamentals

Sound is a continuous pressure wave in air. To process it digitally, we must sample it at discrete points in time.

Sampling and Nyquist Theorem

Continuous Sound Wave:

Amplitude
    ^
    │    ╭───╮      ╭───╮      ╭───╮
    │   ╱     ╲    ╱     ╲    ╱     ╲
 0 ─┼──╯───────╲──╯───────╲──╯───────╲─→ Time
    │           ╰╮         ╰╮         ╰
    │             ╰─────────╰───────────
    │

Sampled at discrete points:

    │    •         •         •
    │   •           •         •
    │  •             •         •
 0 ─┼─•───────────────•─────────•──────→ Time
    │                   •
    │                    •

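Nyquist Theorem: To accurately represent a frequency f, you must sample at more than 2f samples per second. Equivalently, a sample rate fs can only represent frequencies below fs/2 (the Nyquist frequency).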

Sample Rate	Maximum Frequency	Common Use
8,000 Hz	4,000 Hz	Telephone
22,050 Hz	11,025 Hz	AM radio quality
44,100 Hz	22,050 Hz	CD quality
48,000 Hz	24,000 Hz	Professional audio

Why 44.1kHz for this project: Human hearing extends to ~20kHz. Sampling at 44.1kHz captures the full audible spectrum with headroom.

Aliasing: What Happens When Nyquist is Violated

If you sample a 15kHz tone at 22kHz (less than 2×15kHz):

Original signal (15kHz):
    ╭─╮ ╭─╮ ╭─╮ ╭─╮ ╭─╮
───╯  ╰╯  ╰╯  ╰╯  ╰╯  ╰───

Sample points at 22kHz:
    •   •   •   •   •   •

What we reconstruct (7kHz!):
      ╭────╮    ╭────╮
────╯      ╰───╯      ╰───

The 15kHz signal "aliases" to 7kHz (22-15=7)

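Anti-aliasing filter: A low-pass filter that removes frequencies above Nyquist before sampling. Aliasing cannot be undone once the signal has been sampled, so the filter has to sit ahead of the sampler. Many I2S microphones include this internally.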

The Fast Fourier Transform (FFT)

The FFT transforms time-domain samples into frequency-domain components—the mathematical heart of a spectrum analyzer.

What FFT Computes

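Given N real time-domain samples, the FFT produces N complex outputs. For real input the upper half mirrors the lower half, so only the first N/2 frequency “bins” are useful: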

Time Domain (1024 samples):              Frequency Domain (512 bins):

Amplitude                                Magnitude
    ^                                        ^
    │ ╭╮╭╮   ╭╮╭╮   ╭╮╭╮                    │        █
    │╭╯╰╯╰╮ ╭╯╰╯╰╮ ╭╯╰╯╰╮                   │     █  █  █
 0 ─┤      ╰╯    ╰╯    ╰╯                   │  █  █  █  █  █
    │                                    0 ─┼──────────────────→ Frequency
    │                            FFT         0Hz            22kHz
    └──────────────────────→ Time            │←  bin 0    bin 511 →│

Each bin is centered on a multiple of the bin width:

Bin width = Sample Rate / N = 44100 / 1024 = 43.07 Hz per bin
Bin 0 = 0 Hz (DC component)
Bin 1 = 43 Hz (center frequency)
Bin 100 = 4307 Hz (center frequency)
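The bin arithmetic can be sketched in two helpers. Bin k of an N-point FFT is centered at k × (sample rate / N). The function names here are assumptions for illustration only.

```c
/* Center frequency (Hz) of FFT bin k for an n-point FFT at sample rate fs_hz. */
double bin_center_hz(int k, double fs_hz, int n) {
    return k * fs_hz / n;
}

/* Index of the bin whose center is nearest to f_hz (valid for 0 <= f_hz <= fs/2). */
int nearest_bin(double f_hz, double fs_hz, int n) {
    return (int)(f_hz * n / fs_hz + 0.5);
}
```

For the 44.1kHz, 1024-point setup used in this project, nearest_bin(440.0, 44100.0, 1024) gives bin 10 and nearest_bin(1000.0, 44100.0, 1024) gives bin 23.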

FFT Output: Complex Numbers

FFT output is complex: each bin has real and imaginary components.

FFT output for bin k: X[k] = a + bi

Where:
- a = real component (cosine amplitude)
- b = imaginary component (sine amplitude)

Magnitude: |X[k]| = √(a² + b²)
Phase:     ∠X[k] = atan2(b, a)  (not needed for visualization)

For spectrum display, we only care about magnitude—how strong each frequency is.

Why FFT is Fast

DFT (Discrete Fourier Transform):
- Direct calculation: O(N²)
- 1024 samples: ~1 million operations
- Too slow for real-time

FFT (Fast Fourier Transform):
- Divide and conquer: O(N log N)
- 1024 samples: ~10,000 operations
- 100x faster!

The FFT exploits symmetry in the DFT equations using the Cooley-Tukey algorithm (1965).

Windowing: Reducing Spectral Leakage

FFT assumes the input repeats infinitely. But our 1024-sample buffer doesn’t perfectly align with signal periods.

Without windowing - discontinuity at edges:

Sample buffer:
│╭───╮   ╭───╮   ╭───│  Discontinuity!
││    ╲ ╱     ╲ ╱    │
││     ╳       ╳     │
│╰───╯ ╰───╯ ╰───╯   │
│←─── 1024 samples ──→│

This discontinuity creates artificial frequencies (spectral leakage)

Window functions taper the signal to zero at the edges:

Hann Window:
    ╭──────────────╮
   ╱                ╲
  ╱                  ╲
 ╱                    ╲
╱________________________╲
│←─── 1024 samples ────→│

Applied to signal:
- Multiplied sample-by-sample
- Edges become zero (no discontinuity)
- Center preserved (minimal signal distortion)

Window	Sidelobe Level	Frequency Resolution	Use Case
Rectangular	-13 dB	Excellent	Testing only
Hann	-31 dB	Good	General purpose
Hamming	-43 dB	Good	Speech analysis
Blackman	-58 dB	Poor	High dynamic range

For spectrum visualizers: Hann window is ideal—good compromise between frequency resolution and sidelobe suppression.

I2S: Digital Audio Interface

I2S (Inter-IC Sound) is a standard for transmitting digital audio between chips.

I2S Signal Lines

BCLK (Bit Clock):     ┌┐┌┐┌┐┌┐┌┐┌┐┌┐┌┐┌┐┌┐┌┐┌┐┌┐┌┐┌┐┌┐
                      └┘└┘└┘└┘└┘└┘└┘└┘└┘└┘└┘└┘└┘└┘└┘└┘

WS (Word Select):     ┌───────────────────┐
      Left channel ───┘                   └───────────────
                                            Right channel

DOUT (Data):          │D15│D14│...│D1│D0│D15│D14│...│D1│D0│
                      │← Left channel →│← Right channel →│

BCLK: One clock per bit (e.g., 44100 × 16 × 2 = 1.4 MHz for 16-bit stereo)
WS/LRCLK: High = Right channel, Low = Left channel
DOUT: Serial audio data, MSB first

I2S on ESP32

ESP32 has two I2S peripherals. For audio input:

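#include "driver/i2s.h"   // Legacy I2S driver API

i2s_config_t i2s_config = {
    .mode = I2S_MODE_MASTER | I2S_MODE_RX,  // Receive mode
    .sample_rate = 44100,
    // The INMP441 sends 24-bit data in 32-bit slots; if 16-bit reads look wrong,
    // try I2S_BITS_PER_SAMPLE_32BIT and shift each sample down
    .bits_per_sample = I2S_BITS_PER_SAMPLE_16BIT,
    .channel_format = I2S_CHANNEL_FMT_ONLY_LEFT,
    .communication_format = I2S_COMM_FORMAT_STAND_I2S,
    .dma_buf_count = 8,      // Number of DMA buffers
    .dma_buf_len = 1024,     // Samples per buffer
    .use_apll = true,        // Use APLL for accurate sample rate
};

i2s_pin_config_t pin_config = {
    .bck_io_num = 26,        // Bit Clock
    .ws_io_num = 25,         // Word Select (Left/Right)
    .data_out_num = I2S_PIN_NO_CHANGE,  // RX only; left unset, this field is 0 (GPIO0)
    .data_in_num = 33,       // Data input
};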

DMA: Zero-Copy Audio Capture

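DMA (Direct Memory Access) moves data between peripherals and memory without CPU involvement. The I2S peripheral writes samples straight into RAM buffers, and the CPU only touches the data once a whole buffer is ready.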

Without DMA:                        With DMA:

I2S → CPU → RAM                     I2S → DMA → RAM
                                              │
CPU must copy each sample              CPU is free to
(blocks processing)                    run FFT while DMA
                                       fills next buffer

Double Buffering (Ping-Pong)

Time →  ────────────────────────────────────────────────→

DMA:    │ Fill Buffer A │ Fill Buffer B │ Fill Buffer A │
                │                │                │
CPU:            │ Process A      │ Process B      │ Process A
                ▼                ▼                ▼

Buffer A:  [samples...]     (being processed)   [new samples]
Buffer B:  (being filled)    [samples...]       (being filled)

This allows continuous audio capture with no gaps.
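The buffer hand-off can be sketched in portable C as below. This is an illustration of the idea only: on the ESP32 the I2S driver manages its own ring of DMA buffers (dma_buf_count), so the project does not need this code. The type and function names are assumptions.

```c
#include <stdint.h>

#define BLOCK_SAMPLES 1024

typedef struct {
    int16_t buf[2][BLOCK_SAMPLES];
    int fill_index;              /* buffer the producer is currently writing */
} pingpong_t;

/* Buffer the producer (DMA or capture code) should write into. */
int16_t *pingpong_fill_buffer(pingpong_t *p) {
    return p->buf[p->fill_index];
}

/* Call when the fill buffer is full: swaps the two roles and returns the
 * buffer that was just completed, which the consumer may now process. */
int16_t *pingpong_swap(pingpong_t *p) {
    int16_t *ready = p->buf[p->fill_index];
    p->fill_index ^= 1;
    return ready;
}
```

The consumer must finish with the returned buffer before the next swap, which is the same timing constraint as "FFT should complete before next buffer".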

ESP32 Dual-Core Architecture

ESP32 has two Xtensa LX6 cores. We can dedicate each to specific tasks:

┌─────────────────────────────────────────────────────────────┐
│                         ESP32                                │
├─────────────────────────┬───────────────────────────────────┤
│        Core 0           │            Core 1                  │
├─────────────────────────┼───────────────────────────────────┤
│ Audio Capture Task      │ FFT Processing Task               │
│ - I2S DMA management    │ - Apply window function           │
│ - Buffer ready signal   │ - Compute FFT                     │
│ - Continuous sampling   │ - Calculate magnitudes            │
│                         │ - Map to display                   │
├─────────────────────────┼───────────────────────────────────┤
│                         │ Display Update Task               │
│                         │ - Render LED/OLED                 │
│                         │ - Gamma correction                │
└─────────────────────────┴───────────────────────────────────┘

Task Pinning

// Pin audio capture to Core 0
xTaskCreatePinnedToCore(
    audio_capture_task,    // Function
    "audio",               // Name
    4096,                  // Stack size
    NULL,                  // Parameters
    10,                    // Priority (high)
    &audio_task_handle,    // Task handle
    0                      // Core 0
);

// Pin FFT processing to Core 1
xTaskCreatePinnedToCore(
    fft_process_task,
    "fft",
    8192,                  // Stack size in bytes; keep large FFT buffers static, not on the stack
    NULL,
    5,                     // Medium priority
    &fft_task_handle,
    1                      // Core 1
);

WS2812B LED Matrix Driving

WS2812B LEDs use a single-wire timing-critical protocol. Each bit is encoded by pulse duration:

Bit 0:                    Bit 1:
High: 0.4µs              High: 0.8µs
Low:  0.85µs             Low:  0.45µs

│───────│                │─────────────│
│       └────────────│   │             └──────│
│← 0.4 →│← 0.85 →│       │← 0.8 →│← 0.45 →│

Total bit time: 1.25µs (800kHz data rate)

For an 8×32 matrix (256 LEDs × 24 bits = 6,144 bits):

Transmission time: 6,144 × 1.25µs = 7.68ms
Maximum refresh rate: ~130 Hz
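The same arithmetic as a sketch, useful when trying other matrix sizes. The function names are assumptions, and the reset gap is passed in as a parameter because it varies between LED revisions; 50µs is assumed in the example below.

```c
/* Time (µs) to clock out one frame: 24 bits per LED at 1.25 µs per bit. */
double ws2812_frame_time_us(int num_leds) {
    return num_leds * 24 * 1.25;
}

/* Upper bound on refresh rate (Hz), including the reset gap between frames. */
double ws2812_max_refresh_hz(int num_leds, double reset_us) {
    return 1e6 / (ws2812_frame_time_us(num_leds) + reset_us);
}
```

For 256 LEDs the data time is 7,680µs, and with an assumed 50µs reset gap the ceiling works out to just under 130 Hz.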

ESP32 RMT Peripheral

The RMT (Remote Control Transceiver) peripheral generates precise timing for WS2812B:

#include "driver/rmt.h"

// Configure RMT for WS2812B
rmt_config_t config = {
    .rmt_mode = RMT_MODE_TX,
    .channel = RMT_CHANNEL_0,
    .gpio_num = LED_GPIO,
    .clk_div = 2,                    // 40MHz
    .mem_block_num = 1,
};

// Timing for WS2812B
#define T0H 16  // 0.4µs at 40MHz
#define T1H 32  // 0.8µs
#define T0L 34  // 0.85µs
#define T1L 18  // 0.45µs
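The tick values above can be derived and applied as in the sketch below. It is hardware-independent: it only computes the high/low durations for each bit, which real firmware would then copy into RMT items. The names ns_to_ticks, ws2812_pulse_t and ws2812_encode_byte are hypothetical.

```c
#include <stdint.h>

/* Convert nanoseconds to timer ticks at clock_hz, rounding to nearest. */
uint32_t ns_to_ticks(uint32_t ns, uint32_t clock_hz) {
    return (uint32_t)(((uint64_t)ns * clock_hz + 500000000ULL) / 1000000000ULL);
}

/* One WS2812B bit: a high period followed by a low period, in ticks. */
typedef struct {
    uint16_t high_ticks;
    uint16_t low_ticks;
} ws2812_pulse_t;

/* Expand one byte into 8 pulses, MSB first, using the 40 MHz tick counts. */
void ws2812_encode_byte(uint8_t value, ws2812_pulse_t out[8]) {
    for (int bit = 0; bit < 8; bit++) {
        int is_one = (value >> (7 - bit)) & 1;
        out[bit].high_ticks = is_one ? 32 : 16;   /* T1H : T0H */
        out[bit].low_ticks  = is_one ? 18 : 34;   /* T1L : T0L */
    }
}
```

At 40 MHz one tick is 25 ns, so ns_to_ticks(400, 40000000) gives 16 and ns_to_ticks(850, 40000000) gives 34, matching the defines.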

Project Specification

Hardware Requirements

Component	Quantity	Purpose
ESP32 DevKit	1	Main MCU
INMP441 I2S Mic	1	Digital audio input
8x32 WS2812B Matrix	1	Spectrum display
5V 3A Power Supply	1	LED power
Level Shifter (3.3V→5V)	1	Data line for LEDs
Capacitor (1000µF)	1	Power smoothing

Wiring Diagram

ESP32 DevKit                    INMP441 Microphone
┌─────────────┐                ┌────────────┐
│      GPIO26 │────────────────│ SCK (BCLK) │
│      GPIO25 │────────────────│ WS (LRCLK) │
│      GPIO33 │────────────────│ SD (DOUT)  │
│        3.3V │────────────────│ VDD        │
│         GND │────────────────│ GND        │
│             │                │ L/R → GND  │ (left channel)
│             │                └────────────┘
│             │
│             │                WS2812B LED Matrix
│             │                ┌────────────────────┐
│      GPIO13 │───[Level]──────│ DIN                │
│             │   Shifter      │                    │
│         GND │────────────────│ GND                │
│             │                └────────────────────┘
│             │
│             │                ← 5V 3A Supply
│             │                ┌────────────────────┐
│             │                │ 5V ──────→ LED VCC │
│             │                │ GND ─────→ LED GND │
│             │                └────────────────────┘
└─────────────┘

Note: Add 1000µF capacitor across LED power rails
      Add 330Ω resistor in series with DIN line

Functional Requirements

Audio Capture
- 44.1kHz sample rate, 16-bit mono
- 1024-sample FFT window (23ms latency)
- Continuous capture via DMA
FFT Processing
- Apply Hann window to reduce spectral leakage
- Compute 1024-point FFT
- Calculate magnitude in dB scale
Frequency Mapping
- Map 512 bins to 8 display bands (logarithmic)
- Apply smoothing for pleasant visuals
- Implement peak hold (optional)
Display Output
- 30+ FPS refresh rate
- Rainbow color gradient
- Gamma correction for LEDs
Performance
- Total latency < 50ms (audio to display)
- No audio dropouts
- CPU usage < 80% per core

Solution Architecture

System Pipeline

┌───────────┐    ┌───────────┐    ┌───────────┐    ┌───────────┐
│ I2S/DMA   │───→│ Window    │───→│   FFT     │───→│ Magnitude │
│ Capture   │    │ Function  │    │ (1024pt)  │    │ Calc      │
└───────────┘    └───────────┘    └───────────┘    └───────────┘
    23ms             1ms             8ms              2ms

                                                         │
                                                         ▼
┌───────────┐    ┌───────────┐    ┌───────────┐    ┌───────────┐
│ LED       │←───│ Color     │←───│ Smoothing │←───│ Bin       │
│ Output    │    │ Mapping   │    │ Filter    │    │ Mapping   │
└───────────┘    └───────────┘    └───────────┘    └───────────┘
    8ms              1ms              1ms              1ms

Total pipeline latency: ~45ms (the sum of the stage times above, within the 50ms target)

Task Structure

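#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "freertos/queue.h"
#include "driver/i2s.h"
#include "esp_err.h"
#include "esp_dsp.h"

#define FFT_SIZE 1024

// Queue of whole sample blocks; create it before starting the tasks, e.g.
// audio_queue = xQueueCreate(2, FFT_SIZE * sizeof(int16_t));
extern QueueHandle_t audio_queue;
extern float hann_window[FFT_SIZE];

// Defined elsewhere in the project
void calculate_band_magnitudes(const float *fft_data, float *magnitudes);
void update_leds(const float *magnitudes);

// Core 0: Audio capture
void audio_capture_task(void* param) {
    // static keeps the 2 KB buffer off the 4 KB task stack
    static int16_t samples[FFT_SIZE];
    size_t bytes_read;

    while (1) {
        // Blocks until DMA has delivered a full block of samples
        i2s_read(I2S_NUM_0, samples, sizeof(samples), &bytes_read, portMAX_DELAY);

        // Copy the block into the queue; with a zero timeout the new block
        // is dropped if the FFT task has fallen behind
        xQueueSend(audio_queue, samples, 0);
    }
}

// Core 1: FFT processing and display
void fft_process_task(void* param) {
    // static: these buffers total ~10 KB and would overflow the 8 KB task stack
    static int16_t samples[FFT_SIZE];
    static float fft_data[FFT_SIZE * 2];  // Interleaved complex: re, im, re, im, ...
    float magnitudes[8];                  // 8 frequency bands

    // Build the twiddle-factor table once before the first transform
    ESP_ERROR_CHECK(dsps_fft2r_init_fc32(NULL, FFT_SIZE));

    while (1) {
        // Wait for audio data
        xQueueReceive(audio_queue, samples, portMAX_DELAY);

        // Convert to float, apply window, and zero the imaginary parts
        for (int i = 0; i < FFT_SIZE; i++) {
            fft_data[2 * i]     = samples[i] * hann_window[i];
            fft_data[2 * i + 1] = 0.0f;
        }

        // In-place complex FFT; the output is bit-reversed until reordered
        dsps_fft2r_fc32(fft_data, FFT_SIZE);
        dsps_bit_rev_fc32(fft_data, FFT_SIZE);

        // Calculate magnitudes for 8 bands
        calculate_band_magnitudes(fft_data, magnitudes);

        // Update LED display
        update_leds(magnitudes);
    }
}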

Frequency Band Mapping

Human hearing is logarithmic. We use logarithmic spacing for natural-looking response:

FFT bin → Frequency → Display Bar

Bin Range    Frequency Range    Bar    Description
─────────    ───────────────    ───    ───────────
1-2          43-86 Hz           0      Sub-bass
3-5          86-215 Hz          1      Bass (kick drum)
6-12         215-516 Hz         2      Low-mid (bass guitar)
13-25        516-1075 Hz        3      Mid (vocals fundamental)
26-50        1075-2150 Hz       4      Upper-mid (presence)
51-100       2150-4300 Hz       5      High-mid (consonants)
101-200      4300-8600 Hz       6      High (sibilance)
201-400      8600-17200 Hz      7      Air (sparkle)
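One way to reduce the bins to bars is to average the magnitudes inside each range from the table. The sketch below assumes an array of 512 bin magnitudes has already been computed; the function names are hypothetical, and taking the maximum per band instead of the average is an equally reasonable choice.

```c
/* Average magnitude over the inclusive bin range [bin_start, bin_end]. */
float band_average(const float *bin_mags, int bin_start, int bin_end) {
    float sum = 0.0f;
    for (int k = bin_start; k <= bin_end; k++) {
        sum += bin_mags[k];
    }
    return sum / (float)(bin_end - bin_start + 1);
}

/* Reduce 512 bin magnitudes to 8 band levels using the table's bin ranges. */
void bins_to_bands(const float *bin_mags, float band_levels[8]) {
    static const int start[8] = {1, 3, 6, 13, 26, 51, 101, 201};
    static const int end[8]   = {2, 5, 12, 25, 50, 100, 200, 400};
    for (int b = 0; b < 8; b++) {
        band_levels[b] = band_average(bin_mags, start[b], end[b]);
    }
}
```

Bin 0 (DC) and bins above 400 are deliberately ignored, as in the table. Averaging means a single strong bin in a wide band reads lower than the same bin in a narrow band, which is worth keeping in mind when tuning sensitivity.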

Key Data Structures

// Pre-computed Hann window coefficients
float hann_window[1024];  // Computed once at startup

// FFT band configuration
typedef struct {
    uint16_t bin_start;
    uint16_t bin_end;
    float smoothing;      // 0.0 = instant, 0.9 = very smooth
    float peak;           // Peak hold value
} fft_band_t;

fft_band_t bands[8] = {
    {1, 2, 0.7, 0},
    {3, 5, 0.7, 0},
    // ... etc
};

// LED frame buffer
typedef struct {
    uint8_t r, g, b;
} rgb_t;

rgb_t led_buffer[256];  // 8x32 matrix

Phased Implementation Guide

Phase 1: Audio Capture (Day 1-3)

Goal: See audio waveform in serial plotter

Configure I2S
- Set up INMP441 microphone
- 44.1kHz, 16-bit, mono
- Verify with serial output
Visualize Raw Samples
- Print samples to serial
- Use Arduino Serial Plotter
- Speak/clap → see waveform
Verify DMA Operation
- Check no buffer overruns
- Measure timing consistency
- Confirm continuous capture

Checkpoint: Serial plotter shows clean audio waveform when you speak

Phase 2: FFT Implementation (Day 4-7)

Goal: See frequency spectrum in serial

Compute Basic FFT
- Use ESP-DSP library
- 1024-point FFT
- Print raw bin values
Add Windowing
- Pre-compute Hann coefficients
- Apply before FFT
- Compare with/without (reduced leakage)
Calculate Magnitudes
- sqrt(real² + imag²)
- Convert to dB: 20*log10(mag)
- Print 8-band summary
Test with Tone Generator
- Play 440Hz tone from phone
- Should peak in bin ~10 (440/43)
- Verify frequency accuracy

Checkpoint: A 1kHz tone peaks near bin 23 (1000/43), inside the 516-1075 Hz band

Phase 3: Dual-Core Pipeline (Day 8-10)

Goal: Parallel audio capture and FFT processing

Create Task Structure
- Audio capture on Core 0
- FFT processing on Core 1
- Queue for sample transfer
Measure Performance
- Time each stage
- FFT should complete before next buffer
- Check for queue overflows
Handle Edge Cases
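- Queue full → decide which buffer to drop (xQueueSend with a zero timeout discards the newest; to drop the oldest, remove one item with xQueueReceive first)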
- FFT too slow → reduce size or optimize
- Monitor heap usage

Checkpoint: Continuous processing at ~43 buffers per second (1024 samples / 44100 Hz = 23ms per buffer)

Phase 4: LED Display (Day 11-14)

Goal: Visualization on LED matrix

Configure RMT for WS2812B
- Set timing parameters
- Test with solid color
- Verify all 256 LEDs work
Implement Bar Graph
- Map magnitude to bar height
- Apply logarithmic scaling
- Add color gradient (rainbow or single-color)
Add Visual Polish
- Smoothing filter (exponential moving average)
- Peak hold with decay
- Gamma correction for LEDs

Checkpoint: Dancing spectrum display responding to music

Phase 5: Optimization (Day 15-21)

Goal: Smooth, responsive, efficient

Profile Performance
- Measure total pipeline latency
- Identify bottlenecks
- Target <50ms total delay
Optimize FFT
- Use ESP-DSP optimized functions
- Consider fixed-point math
- Benchmark alternatives
Reduce Memory Usage
- Static allocation where possible
- Share buffers carefully
- Monitor for leaks
Add Features
- Multiple visualization modes
- Sensitivity adjustment
- Beat detection (bonus)

Testing Strategy

Unit Tests

Component	Test	Expected Result
I2S	Read 1024 samples	Non-zero values
FFT	440Hz input	Peak at bin 10±1
FFT	White noise	Flat spectrum
Window	Apply Hann	Edge samples = 0
LED	Set color	Correct color displayed

Performance Tests

Metric	Target	How to Measure
FFT time	< 15ms	`esp_timer_get_time()`
Display update	< 10ms	Timer around LED write
Total latency	< 50ms	Clap test (visual delay)
Frame rate	> 30 FPS	Count frames per second
CPU usage	< 80%	`vTaskGetRunTimeStats()`

Audio Quality Tests

Frequency Accuracy
- Play known frequencies (100Hz, 1kHz, 10kHz)
- Verify correct bars light up
Dynamic Range
- Whisper → quiet bars
- Loud music → full bars
- No clipping at max volume
Response Time
- Sharp transients (clap)
- Should appear within 2 frames (~66ms)

Common Pitfalls and Debugging

Audio Issues

Problem: No audio input

Check I2S pin connections
Verify microphone power (3.3V)
L/R pin determines channel (try toggling)

Problem: Audio is distorted

Check for clipping (samples pinned at -32768 or +32767)
Reduce gain or add attenuation
Verify sample rate matches microphone

Problem: High-frequency noise

Add decoupling capacitor near mic (0.1µF)
Use shielded wires
Check for WiFi interference (disable WiFi if not needed)

FFT Issues

Problem: Spectrum looks wrong

Verify windowing is applied
Check FFT size matches sample count
Ensure correct bin-to-frequency mapping

Problem: All frequencies show same level

Check for DC offset (subtract average)
Verify magnitude calculation (√(real² + imag²))
Window might not be applied
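Subtracting the average can be sketched as follows. The function name is hypothetical; it converts to float at the same time, so it can run just before the window is applied.

```c
#include <stdint.h>

/* Subtract the block mean from every sample so bin 0 (DC) no longer dominates. */
void remove_dc(const int16_t *in, float *out, int n) {
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        sum += in[i];
    }
    float mean = (float)(sum / n);
    for (int i = 0; i < n; i++) {
        out[i] = (float)in[i] - mean;
    }
}
```

For example, the samples {100, 102, 98, 100} have a mean of 100 and come out as {0, 2, -2, 0}.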

Display Issues

Problem: LEDs flicker

Check power supply (5V 3A covers 256 LEDs only at reduced brightness; at full white each LED can draw about 60mA)
Add capacitor across power rails
Reduce brightness if power limited

Problem: Colors are wrong

WS2812B is GRB, not RGB
Check color order in library
Verify gamma correction

Problem: Only some LEDs work

Check data line connections
Level shifter may be needed (3.3V → 5V)
Test with fewer LEDs first

Extensions and Challenges

Beginner Extensions

Multiple Display Modes
- Bar graph, waterfall, oscilloscope
- Button to cycle through modes
Color Themes
- Rainbow, fire, ocean, custom
- Store preference in NVS

Intermediate Challenges

Beat Detection
- Detect kick drum hits
- Flash LEDs on beat
- Calculate BPM
OLED Display
- Alternative to LED matrix
- Higher resolution spectrum
- Show peak frequency and dB level

Advanced Challenges

Stereo Analysis
- Two microphones
- Compare left/right channels
- Visualize stereo field
Wireless Audio
- Bluetooth A2DP sink
- Analyze streamed audio
- No microphone needed
Machine Learning
- Train classifier on ESP32
- Detect music vs speech
- Identify specific songs

Real-World Connections

Commercial Products

Product	Your Project Skill
Equalizer apps	FFT analysis, visualization
Guitar tuners	Frequency detection
Smart speakers	Audio processing, DSP
Music visualizers	Real-time graphics

Industry Applications

Audio Engineering: Spectrum analyzers, room correction
Voice Assistants: Preprocessing before speech recognition
Musical Instruments: Electronic effects, synthesizers
Environmental Monitoring: Sound level meters, noise detection

Resources

Official Documentation

Resource	URL
ESP-IDF I2S	docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/peripherals/i2s.html
ESP-DSP Library	github.com/espressif/esp-dsp
WS2812B Datasheet	cdn-shop.adafruit.com/datasheets/WS2812B.pdf

Books

Book	Author	Relevant Chapters
DSP Guide	Steven W. Smith	Ch. 8-12: FFT (free online)
Making Embedded Systems	Elecia White	Ch. 6, 8: DMA, Multitasking
Mastering FreeRTOS	FreeRTOS.org	Tasks, Queues (free PDF)

Online Resources

Resource	Description
DSPGuide.com	Free complete DSP textbook
INMP441 Hookup Guide	SparkFun tutorial
FastLED Library	Arduino LED library

Self-Assessment Checklist

Fundamentals

I can explain the Nyquist theorem
I understand what FFT computes and why it’s fast
I can describe why windowing reduces spectral leakage
I know how DMA enables zero-copy audio capture

Implementation

Audio waveform is clean in serial plotter
Known frequencies map to correct FFT bins
Display achieves 30+ FPS
Latency is imperceptible (<50ms, matching the performance target)

Code Quality

No audio dropouts during operation
Memory usage is stable over time
Both cores have headroom (<80% usage)
Display looks smooth and responsive

Interview Preparation

Be ready to answer these questions:

“Explain how FFT converts time-domain audio to frequency-domain spectrum.”
- Decomposes signal into constituent frequencies, O(n log n), complex output, magnitude = strength of each frequency
“Why must sample rate be at least 2× the highest frequency?”
- Nyquist theorem, aliasing if violated, frequencies fold back
“How does I2S DMA work and why is it necessary?”
- DMA moves data without CPU, prevents sample drops, ping-pong buffering
“How do you divide work between ESP32’s two cores?”
- Pin tasks with xTaskCreatePinnedToCore, queues for communication
“What is spectral leakage and how does windowing fix it?”
- Discontinuity at buffer edges causes energy spread, window tapers edges to zero
“How do you map 512 FFT bins to 8 display bars?”
- Logarithmic frequency spacing (humans hear logarithmically), bin averaging

Next Project: P05-ota-smart-home-hub.md - OTA-Updatable Smart Home Hub