Project 1: The Spectrum Eye (IQ Visualizer)
Build a real-time (or file-driven) spectrum + waterfall analyzer that turns raw IQ samples into an interpretable picture of RF energy over time.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Beginner to Intermediate |
| Time Estimate | 2-3 weeks |
| Main Programming Language | Python (Alternatives: C++/Qt, Rust/egui, JavaScript/Electron) |
| Alternative Programming Languages | C, C++, Rust, Julia |
| Coolness Level | High |
| Business Potential | Medium (test tools, RF monitoring) |
| Prerequisites | Complex numbers, basic DSP, Python file I/O |
| Key Topics | IQ sampling, FFT, windowing, dB scaling, waterfall rendering |
1. Learning Objectives
By completing this project, you will:
- Parse interleaved IQ data and convert it to a correctly scaled complex stream.
- Implement an FFT pipeline with windowing, magnitude, and dB conversion.
- Render a stable waterfall visualization with correct axes and dynamic range.
- Diagnose common SDR artifacts (DC spike, IQ swap, clipping) from spectral cues.
- Build a deterministic test harness using known synthetic signals.
2. All Theory Needed (Per-Concept Breakdown)
2.1 Complex Baseband and IQ Sample Handling
Fundamentals
IQ data is the lingua franca of software-defined radio. Instead of storing only a real-valued voltage, SDRs provide two streams: in-phase (I) and quadrature (Q). These represent the x and y components of a rotating phasor in the complex plane. The key benefit is that a complex signal can represent both positive and negative frequencies unambiguously, which allows you to treat a radio signal near 100 MHz as if it lives near 0 Hz after downconversion. In practice, an SDR delivers interleaved I and Q samples as bytes or signed integers. Before any math, you must de-interleave, convert to signed values, and scale them into a useful range, typically floating-point values in [-1, 1]. This step is deceptively easy to get wrong. If you forget to subtract the mid-point (127.5 for unsigned 8-bit data) you will see a false DC spike and a raised noise floor. If you swap I and Q, your spectrum is mirrored and frequency tuning will appear inverted. If you misinterpret endianness or signedness, you may clip or distort everything that follows.
A complex sample s[n] = I[n] + jQ[n] is a snapshot of amplitude and phase relative to the SDR’s local oscillator. Multiplying a complex stream by exp(-j2πft) shifts every frequency component down by f (use exp(+j2πft) to shift up). This is the core digital tuning operation used in every SDR pipeline. Even though this project does not yet demodulate a signal, you must still understand the geometric meaning of IQ and how sampling rate, complex representation, and scaling determine what you see in a spectrum plot.
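The tuning operation can be checked numerically. The following is a minimal sketch on a synthetic tone; the values of `fs`, `f_tone`, and `f_shift` are illustrative assumptions, and `peak_freq` is a hypothetical helper, not part of any library.

```python
import numpy as np

fs = 1_000_000            # assumed sample rate in Hz (illustrative)
n = np.arange(4000)       # 4000 samples -> 250 Hz bins, so both tones land on a bin
t = n / fs

f_tone = 125_000          # synthetic complex tone at +125 kHz
x = np.exp(2j * np.pi * f_tone * t)

f_shift = 100_000         # amount to tune down by
y = x * np.exp(-2j * np.pi * f_shift * t)   # every component moves down by f_shift

def peak_freq(sig, fs):
    """Return the frequency (Hz) of the strongest FFT bin."""
    spec = np.fft.fft(sig)
    freqs = np.fft.fftfreq(len(sig), d=1 / fs)
    return freqs[np.argmax(np.abs(spec))]

before = peak_freq(x, fs)   # tone location before mixing
after = peak_freq(y, fs)    # tone location after mixing
```

With these numbers the peak moves from +125 kHz to +25 kHz; flipping the sign of the exponent would move it up to +225 kHz instead.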
A second fundamental issue is sample layout. Most cheap SDRs output 8-bit unsigned IQ at a fixed rate. Others use 12-bit packed samples, 16-bit signed, or float32. There is no standard. The first responsibility of a spectrum analyzer is to parse and normalize that format. You also need to understand how the sample rate defines the bandwidth of the spectrum view. For complex samples, the usable bandwidth is approximately equal to the sample rate. That is why a 2.4 MSPS IQ stream spans about 2.4 MHz of spectrum.
Finally, you must recognize that IQ streams are not perfect: they include DC offsets (constant bias in I and/or Q), gain imbalance (different amplitude in I vs Q), and phase error (not exactly 90 degrees). These imperfections manifest as spectral artifacts. A good visualizer helps you detect them quickly and provides tools to mitigate them, such as DC removal or basic IQ correction.
Deep Dive into the Concept
Think of IQ sampling as a geometric camera. Each sample is a vector from the origin to a point (I, Q). A pure carrier is a circle. AM is a circle with a varying radius. FM is a circle that speeds up and slows down. When you plot I and Q in time, it may look like noise, but in the complex plane it is structured motion. This view leads directly to how a spectrum analyzer works: it is not measuring “frequency” directly but estimating which complex rotations are present in the signal.
A typical SDR pipeline begins with a hardware downconverter. The antenna sees an entire RF band. A mixer shifts the band to baseband or a low IF. The ADC samples it, producing digital IQ. From that moment, IQ is the universal representation. The sample format is often unsigned 8-bit interleaved [I0, Q0, I1, Q1, …]. Converting this to complex float requires: (1) read bytes, (2) split into I and Q arrays, (3) subtract the mid-point, (4) divide by the peak, (5) form complex values. The reason for subtracting the mid-point is that unsigned ADCs encode zero voltage at mid-scale. If you skip this, the average is non-zero, resulting in a spike at DC. Spectrally, a DC offset is a delta at 0 Hz that smears into nearby bins due to finite windowing. This is why many SDR tools include a “DC remove” toggle.
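A "DC remove" step can be as simple as subtracting the block mean. The sketch below assumes a synthetic u8 capture whose samples rest slightly above mid-scale (the bias value and noise level are made up for illustration), and `dc_bin_db` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1024

# Synthetic u8 capture whose I/Q rest above mid-scale (illustrative bias).
raw = np.clip(np.round(rng.normal(135.0, 5.0, size=2 * N)), 0, 255).astype(np.uint8)

# Steps from the text: split, subtract mid-point, scale, form complex values.
i = raw[0::2].astype(np.float32) - 127.5
q = raw[1::2].astype(np.float32) - 127.5
samples = (i + 1j * q) / 127.5

def dc_bin_db(x):
    """Level of the 0 Hz bin in dB relative to a full-scale tone."""
    spec = np.fft.fft(x) / len(x)
    return 20 * np.log10(np.abs(spec[0]) + 1e-12)

dc_before = dc_bin_db(samples)
centered = samples - samples.mean()      # simple block-wise DC removal
dc_after = dc_bin_db(centered)
```

Block-wise mean subtraction is only one option; a slow high-pass filter or a running mean would track a drifting offset better on live streams.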
Another subtlety is IQ imbalance. Suppose I has 1.0 gain and Q has 0.9 gain, and the phase offset between them is 86 degrees instead of 90. Then the complex representation is skewed, which causes mirror images in the spectrum. The image rejection ratio (IRR) can be as low as 20-30 dB on cheap hardware. A visualizer should let you see this by displaying the full spectrum from -Fs/2 to +Fs/2. If you only show the positive half, you will miss the mirror and misdiagnose issues. For this reason, your waterfall should display negative and positive frequencies symmetrically around 0.
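The mirror image can be reproduced with a synthetic tone. This is a sketch using the gain (0.9) and phase (86 degrees) figures from the paragraph above; the sample rate, tone frequency, and block length are arbitrary assumptions chosen so the tone falls exactly on a bin.

```python
import numpy as np

fs = 1_000_000
N = 1000                        # 1 kHz bins so the tone lands exactly on a bin
t = np.arange(N) / fs
f_tone = 100_000

gain_q = 0.9                    # Q amplitude relative to I
phase_err = np.deg2rad(4.0)     # Q lags I by 86 degrees instead of 90

i = np.cos(2 * np.pi * f_tone * t)
q = gain_q * np.sin(2 * np.pi * f_tone * t + phase_err)
x = i + 1j * q

spec = np.abs(np.fft.fft(x)) / N
k = int(f_tone * N / fs)        # bin index of the wanted tone
wanted = spec[k]
image = spec[-k]                # mirror bin at -f_tone

irr_db = 20 * np.log10(wanted / image)   # image rejection ratio in dB
```

For these values the image sits roughly 24 dB below the wanted tone, which is why it is easy to spot on a full -Fs/2 to +Fs/2 display and invisible on a positive-only one.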
Sampling theory also matters here. IQ data is already complex, so Nyquist says a sample rate Fs gives usable bandwidth ~Fs. If you have real samples, bandwidth is Fs/2. This is why complex IQ is so valuable. But it also means your frequency axis must be correct. A common mistake is to plot 0..Fs/2 for complex IQ, which compresses the spectrum by 2x and makes tuning confusing. Another mistake is to ignore center frequency offsets. If your SDR is tuned to 100 MHz and you see a peak at +100 kHz in baseband, the actual station is at 100.1 MHz. Your visualizer should provide a correct frequency axis, and ideally allow you to label both “absolute frequency” and “baseband offset.”
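A frequency axis with both labels can be built directly from the sample rate and tuned frequency. The sketch below uses the 100 MHz example from the text; the variable names are illustrative.

```python
import numpy as np

fs = 2.4e6        # sample rate in Hz
fc = 100.0e6      # tuned center frequency in Hz
N = 1024

# Baseband offsets from -fs/2 up to just below +fs/2, in display order.
offset_hz = np.fft.fftshift(np.fft.fftfreq(N, d=1 / fs))
absolute_hz = fc + offset_hz

# A peak near +100 kHz in baseband corresponds to about 100.1 MHz on air
# (to within one bin of 2343.75 Hz).
k = np.argmin(np.abs(offset_hz - 100e3))
station_mhz = absolute_hz[k] / 1e6
```

Keeping `offset_hz` and `absolute_hz` as separate arrays makes it easy to switch the axis label between "baseband offset" and "absolute frequency" without recomputing the spectrum.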
Precision and scaling are another deep topic. The FFT output is complex. To plot a power spectrum, you compute magnitude or magnitude-squared. If you use magnitude, convert with 20log10. If you use magnitude-squared (power), convert with 10log10. Used correctly the two agree exactly, because 10log10(|X|^2) = 20log10(|X|); applying the wrong multiplier doubles or halves every dB value, so a -40 dB signal reads as -80 dB or -20 dB. For real-world captures, the noise floor is also influenced by ADC quantization. An ideal 8-bit ADC gives roughly 48-50 dB of SNR for a full-scale sine (about 6 dB per bit; the usual formula is 6.02N + 1.76 dB). That means your plot should reasonably show signals at -40 dBFS. If your scaling is off, everything may look saturated or invisible. You need to select a reference level and dynamic range so that spectral features are visible without being clipped.
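A small numeric check of the two dB conventions, plus the ideal quantization SNR for a full-scale sine. The amplitude value is an arbitrary example.

```python
import numpy as np

amplitude = 0.1                      # linear amplitude, full scale = 1.0
power = amplitude ** 2

amp_db = 20 * np.log10(amplitude)    # amplitude convention
pow_db = 10 * np.log10(power)        # power convention, same level
mixed_db = 10 * np.log10(amplitude)  # wrong: power multiplier on an amplitude

# Ideal quantization SNR for a full-scale sine: 6.02 * bits + 1.76 dB.
snr_8bit = 6.02 * 8 + 1.76
snr_12bit = 6.02 * 12 + 1.76
```

Here `amp_db` and `pow_db` both come out at -20 dB, while `mixed_db` reads -10 dB: the wrong multiplier scales the dB value rather than adding a fixed offset.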
Finally, consider performance. A waterfall display is effectively a rolling image. You compute spectra at a fixed rate (e.g., 10-30 frames per second). If you process too much data per frame or use too large an FFT, you will lag. A good design decimates to the required resolution, uses overlapped FFTs, and precomputes windows. In Python, you can use numpy FFT which is already optimized, but you must avoid per-sample loops. For real-time streams, the pipeline is: read a block of N complex samples, apply window, compute FFT, shift, convert to dB, map to color, render. If any stage is too slow, the waterfall becomes jumpy and misleading. Understanding the cost of each step is essential for a tool that feels real.
How this fits on projects
- You will implement IQ parsing and scaling in §5.2 and §5.4.
- You will use DC removal and IQ sanity checks in §7.1.
- You will apply this concept again in P02 (AM demod), P03 (FM demod), and P10 (GPS acquisition).
Definitions & key terms
- IQ sample: A complex sample with in-phase (I) and quadrature (Q) components.
- Complex baseband: A signal representation centered at 0 Hz with both positive and negative frequencies.
- DC offset: A constant bias in I or Q producing a spike at 0 Hz.
- IQ imbalance: Gain or phase mismatch between I and Q leading to mirror images.
- dBFS: Decibels relative to full-scale ADC amplitude.
Mental model diagram (ASCII)
Antenna -> Mixer/LO -> ADC -> IQ bytes -> [I,Q] -> complex samples
                        |                                 |
                        |                                 v
                    DC offset                       Complex plane
How it works (step-by-step, with invariants and failure modes)
- Read raw bytes from a file or device stream.
- Interpret the format (u8, s16, interleaved).
- Convert to signed values and subtract mid-point.
- Scale to [-1, 1] or another normalized range.
- Form complex samples I + jQ.
Invariants:
- Mean(I) and Mean(Q) should be near 0 after centering.
- |I| and |Q| should each stay at or below 1.0 if scaling is correct; the complex magnitude can approach sqrt(2) only when both components are at full scale at once.
Failure modes:
- Incorrect signedness wraps sample values around mid-scale, producing a heavily distorted, clipped-looking spectrum.
- Swap I/Q leads to inverted frequency axis.
- Missing centering causes a strong DC spike.
Minimal concrete example
import numpy as np
raw = np.fromfile("capture.iq", dtype=np.uint8)
I = raw[0::2].astype(np.float32) - 127.5
Q = raw[1::2].astype(np.float32) - 127.5
samples = (I + 1j * Q) / 127.5
Common misconceptions
- “IQ is stereo audio.” It is not; I/Q encodes phase and amplitude of one signal.
- “Nyquist is 2x the carrier.” It is 2x the bandwidth, not the RF center.
- “Unsigned bytes already represent zero at 0.” Zero is at mid-scale for unsigned ADCs.
Check-your-understanding questions
- Why does complex sampling allow you to see both positive and negative frequencies?
- What visual artifact indicates a DC offset?
- What happens if you swap I and Q?
- Why does an 8-bit SDR have a higher noise floor than a 12-bit SDR?
Check-your-understanding answers
- Because complex samples encode rotation direction; positive and negative frequencies rotate oppositely.
- A sharp spike at 0 Hz in the spectrum.
- The spectrum appears mirrored left-right.
- Fewer ADC bits increase quantization noise, reducing dynamic range.
Real-world applications
- SDR front-end diagnostics for labs and field engineering.
- Spectrum monitoring and interference hunting.
- IQ data capture and playback for RF testing.
Where you’ll apply it
- This project: §5.2 (Project Structure), §5.4 (Concepts), §7.1 (Pitfalls).
- Also used in: P02 AM Demodulator, P03 FM Receiver, P10 GPS L1 Tracker.
References
- “Software-Defined Radio for Engineers” by Collins et al., Chapter 3
- “Understanding Digital Signal Processing” by Lyons, Chapter 4
Key insights
Complex baseband is the universal representation that makes software tuning and visualization possible.
Summary
You learned how IQ data encodes amplitude and phase, why centering and scaling matter, and how front-end imperfections appear in spectra.
Homework/Exercises to practice the concept
- Load a short IQ file and compute the mean of I and Q; verify they are near zero after centering.
- Intentionally swap I and Q and observe the mirrored spectrum.
- Add a DC offset in software and confirm the DC spike appears.
Solutions to the homework/exercises
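- After subtracting mid-scale, mean(I) and mean(Q) should be near 0; if not, the capture has a residual DC offset, so subtract the measured mean (and re-check the mid-scale value for your sample format).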
- The spectrum flips left-right; signals that were at +100 kHz appear at -100 kHz.
- Adding +0.1 to I creates a visible spike at 0 Hz that persists across frames.
2.2 FFT-Based Spectral Estimation, Windowing, and dB Scaling
Fundamentals
A spectrum analyzer estimates how much energy exists at each frequency. The FFT is the workhorse: it converts a block of time-domain samples into a frequency-domain representation. But the FFT is not magical; it is a finite-length approximation. Because you only observe N samples, you implicitly multiply the infinite signal by a rectangular window. In the frequency domain, that window becomes a sinc-shaped smear, which spreads energy into adjacent bins. This is called spectral leakage. Windowing (Hann, Hamming, Blackman) reduces leakage at the cost of wider main lobes. For a waterfall display, the goal is not perfect spectral purity but a stable, interpretable picture. Hann is usually a good default.
After the FFT, you compute magnitude or power. If you want a display that matches how RF engineers and audio folks think, you convert to dB. The dB scale is logarithmic: it compresses large ranges into visible differences. In a waterfall, dynamic range is essential. Without dB scaling, weak signals disappear. But dB scaling requires care: if you compute magnitude and then 20log10, you are displaying amplitude in dB. If you compute power and then 10log10, you are displaying power in dB. Both are fine, but you must be consistent when comparing levels.
A practical waterfall also requires smoothing. Real spectra are noisy, especially with low bit-depth SDRs. If you render a raw FFT each time, the display flickers. You can stabilize it by averaging spectra across a few frames or applying an exponential moving average (EMA). This does not “cheat”; it simply trades time resolution for interpretability. Understanding this trade-off is part of building a useful tool.
Deep Dive into the Concept
Let’s unpack what the FFT is actually doing. Suppose you have N complex samples sampled at Fs. The FFT returns N frequency bins, each representing a complex coefficient for a discrete frequency. The bin spacing is Fs/N. If Fs = 2.4 MSPS and N = 1024, you get about 2.34 kHz per bin. That is your bin spacing, and roughly your frequency resolution (a window widens each peak somewhat beyond one bin). A larger FFT gives better resolution but costs more compute and latency. In a waterfall, you must balance resolution with frame rate. A good starting point is 1024 or 2048 points at 10-20 frames per second. If you need finer resolution, you can increase FFT size and reduce overlap or update rate.
Windowing is essential because your sample block is rarely aligned to an integer number of cycles. A rectangular window has the narrowest main lobe but the worst leakage: its highest side lobe is only about 13 dB below the main lobe. Hann pushes the highest side lobe down to about -31 dB, and its side lobes fall off quickly with distance, which greatly cleans up the display for strong signals. Hamming lowers the nearest side lobes further (to roughly -43 dB), although its distant side lobes decay more slowly than Hann's. Blackman gives very low side lobes (around -58 dB) but widens the main lobe. For a spectrum eye, Hann or Blackman is usually best. If you choose no window, strong signals will smear and hide weaker ones. This is especially noticeable around FM broadcast stations, which can overwhelm adjacent weak signals.
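Side-lobe levels can be measured rather than memorized. The sketch below zero-pads each window, walks down the main lobe to its first null, and reports the highest remaining lobe; `peak_sidelobe_db`, the window length, and the padding factor are illustrative choices, not a standard API.

```python
import numpy as np

def peak_sidelobe_db(window, pad=64):
    """Highest side lobe relative to the main-lobe peak, in dB."""
    n = len(window)
    # Zero-padded FFT gives a finely sampled view of the window's spectrum.
    spec = np.abs(np.fft.fft(window, n * pad))[: n * pad // 2]
    spec_db = 20 * np.log10(spec / spec[0] + 1e-20)
    # Walk down the main lobe to its first null (first local minimum).
    k = 1
    while k < len(spec_db) - 1 and spec_db[k + 1] < spec_db[k]:
        k += 1
    return spec_db[k:].max()

N = 64
levels = {
    "rectangular": peak_sidelobe_db(np.ones(N)),
    "hann": peak_sidelobe_db(np.hanning(N)),
    "hamming": peak_sidelobe_db(np.hamming(N)),
    "blackman": peak_sidelobe_db(np.blackman(N)),
}
```

Printing `levels` is a quick way to compare candidate windows before committing to one for the waterfall.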
Scaling to dB requires a reference. If you are using normalized float samples in [-1, 1], the maximum possible magnitude of a sine wave is 1.0, and the maximum power is 1.0 as well. You can display dBFS (decibels relative to full scale), where 0 dBFS is the maximum. If you take magnitude, you display 20*log10(|X|), which yields 0 dBFS for a full-scale tone provided the FFT output is first divided by the sum of the window coefficients (N for a rectangular window); without that normalization the peak reads high by 20*log10 of that sum. The noise floor of an 8-bit SDR might sit around -50 to -60 dBFS, depending on gain and FFT size. If you use magnitude-squared (power), use 10*log10(|X|^2), and still interpret 0 dBFS as full-scale power. The key is consistency.
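The dBFS reference can be verified with a full-scale synthetic tone. This sketch assumes the FFT output is normalized by the sum of the window coefficients, which is one common convention for reading tone amplitudes; the sample rate and bin choice are arbitrary.

```python
import numpy as np

fs = 1_000_000
N = 1024
k_tone = 100                                          # put the tone exactly on bin 100
t = np.arange(N) / fs
tone = np.exp(2j * np.pi * (k_tone * fs / N) * t)     # full-scale complex tone, |x| = 1

window = np.hanning(N)
raw_spec = np.fft.fft(tone * window)
spec = raw_spec / window.sum()                        # undo the window's coherent gain

amp_dbfs = 20 * np.log10(np.abs(spec) + 1e-12)        # amplitude convention
pow_dbfs = 10 * np.log10(np.abs(spec) ** 2 + 1e-24)   # power convention

peak_dbfs = amp_dbfs.max()
unnormalized_peak_db = 20 * np.log10(np.abs(raw_spec).max())
```

The normalized peak sits at 0 dBFS under either convention, while the unnormalized peak reads about 54 dB high for this window length.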
A waterfall is a 2D matrix: time on one axis, frequency on the other, and color representing magnitude. The trick is to map dB values to colors. If you map linearly, the display will be dominated by the strongest signals. Most SDR tools choose a min and max dB range (e.g., -100 to 0 dBFS) and then map that to a color gradient. You should provide controls for this range because noise floor varies with gain. An auto-ranging option can compute the median or 10th percentile of the spectrum as “noise” and set the floor accordingly.
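The dB-to-color step can be sketched as a clamp followed by a linear map to palette indices. `db_to_levels` and `auto_floor` are hypothetical helper names, and the 10th-percentile floor is just the heuristic mentioned above, not a calibrated noise estimate.

```python
import numpy as np

def db_to_levels(spec_db, min_db=-100.0, max_db=0.0):
    """Clamp dB values to [min_db, max_db] and scale to uint8 0..255."""
    clipped = np.clip(spec_db, min_db, max_db)
    scaled = (clipped - min_db) / (max_db - min_db)
    return np.round(scaled * 255).astype(np.uint8)

def auto_floor(spec_db, percentile=10.0):
    """Estimate the noise floor as a low percentile of the current spectrum."""
    return float(np.percentile(spec_db, percentile))

spec_db = np.array([-120.0, -100.0, -50.0, -20.0, 0.0, 10.0])
levels = db_to_levels(spec_db)     # values outside the range saturate at 0 or 255
floor = auto_floor(spec_db)
```

The resulting indices can be fed to any 256-entry palette, for example a matplotlib colormap when rendering offline.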
Smoothing is another subtle issue. A single FFT is noisy due to random noise in samples. Averaging M independent spectra reduces the variance of the estimate by a factor of M, so the standard deviation (the visible fluctuation) falls by the square root of M. For example, averaging 4 spectra halves the fluctuation. But if you average too much, you blur fast changes. A moving average of 4-8 frames is often a good compromise for a waterfall. Another technique is overlapped processing (as in Welch's method): compute FFTs on overlapping blocks so that samples attenuated near the edge of one window are weighted strongly in the next. With a Hann window, 50% overlap makes the shifted windows sum to a constant, so every sample contributes evenly and the visuals stay stable.
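The effect of averaging can be measured on synthetic noise. This sketch assumes complex white Gaussian noise and averages in linear power before any dB conversion (a common choice, stated here as an assumption); the EMA coefficient `alpha` is an arbitrary tuning value.

```python
import numpy as np

rng = np.random.default_rng(1)
N, frames = 256, 400

# Complex white noise with unit power, one row per FFT frame.
noise = (rng.standard_normal((frames, N)) + 1j * rng.standard_normal((frames, N))) / np.sqrt(2)
power = np.abs(np.fft.fft(noise, axis=1)) ** 2 / N    # per-bin power, mean about 1

# Block average: combine every 4 consecutive frames.
avg4 = power.reshape(frames // 4, 4, N).mean(axis=1)

std_raw = power.std()
std_avg4 = avg4.std()          # about half of std_raw (variance falls by 4)

# Exponential moving average across frames.
alpha = 0.25
ema = np.empty_like(power)
ema[0] = power[0]
for k in range(1, frames):
    ema[k] = alpha * power[k] + (1 - alpha) * ema[k - 1]
std_ema = ema[50:].std()       # skip the start-up transient
```

A smaller `alpha` smooths more but makes short bursts fade in and out more slowly, which is the time-resolution trade-off described above.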
Finally, remember that numpy's FFT returns bins starting at 0 Hz, followed by the positive frequencies up to just below Fs/2, and then the negative frequencies from -Fs/2 up to just below 0 (equivalently, 0 up to just below Fs). To display a centered spectrum, you use fftshift to move the negative frequencies to the left. This is crucial for IQ data. If you forget, the display will be confusing, with 0 Hz at the left edge. A correct spectrum should show 0 Hz in the middle, negative frequencies on the left, positive on the right.
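The bin ordering is easiest to see with a tiny example. The sizes here are deliberately small and purely illustrative.

```python
import numpy as np

N, fs = 8, 8.0                              # tiny sizes so the order is easy to read
raw_order = np.fft.fftfreq(N, d=1 / fs)     # order in which np.fft.fft returns bins
shifted = np.fft.fftshift(raw_order)        # display order, 0 Hz at index N // 2
```

`raw_order` starts at 0 and wraps to the negative frequencies halfway through, while `shifted` runs monotonically from -4 to +3.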
How this fits on projects
- You will implement FFT sizing, windowing, and dB conversion in §5.4 and §5.10.
- You will use FFT-based detection in P04 (ADS-B) and P09 (GSM burst visualization).
Definitions & key terms
- FFT (Fast Fourier Transform): Efficient algorithm for computing the discrete Fourier transform.
- Window function: A taper applied to samples before FFT to reduce spectral leakage.
- Spectral leakage: Energy from a tone spreading into adjacent bins due to finite sample length.
- dBFS: Decibels relative to full-scale amplitude or power.
- Waterfall: A time-stacked series of spectra visualized as an image.
Mental model diagram (ASCII)
Time samples -> [Window] -> [FFT] -> [Magnitude] -> [dB] -> [Color map]
     |                                                           |
     v                                                           v
Sliding blocks                                            Waterfall image
How it works (step-by-step, with invariants and failure modes)
- Choose FFT size N and overlap ratio.
- Multiply samples by a window of length N.
- Compute FFT and shift to center 0 Hz.
- Compute magnitude or power.
- Convert to dB and clamp to display range.
- Map to colors and render a line of the waterfall.
Invariants:
- Bin spacing = Fs / N.
- 0 Hz bin should be centered after fftshift.
Failure modes:
- No window: strong leakage, weak signals disappear.
- Wrong dB scaling: mixing the amplitude and power formulas doubles or halves every dB value.
- Incorrect shift: spectrum appears “wrapped.”
Minimal concrete example
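import numpy as np
N = 1024
window = np.hanning(N)  # precompute once and reuse for every frame
block = samples[:N] * window  # samples: complex array from the example in §2.1
spec = np.fft.fftshift(np.fft.fft(block)) / window.sum()  # center 0 Hz; normalize so a full-scale tone reads 0 dBFS
mag = np.abs(spec)
mag_db = 20 * np.log10(mag + 1e-12)  # small offset avoids log10(0)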
Common misconceptions
- “Bigger FFT is always better.” It costs compute and reduces refresh rate.
- “Windowing reduces resolution.” It broadens the main lobe but improves leakage behavior.
- “dB is only for power.” dB can represent amplitude or power; the formula differs.
Check-your-understanding questions
- What does windowing do to spectral leakage?
- How does FFT size affect frequency resolution?
- Why do we use fftshift on IQ data?
- What is the difference between 20log10 and 10log10?
Check-your-understanding answers
- It reduces side lobes, making leakage weaker.
- Larger FFT gives finer bin spacing (Fs/N).
- To center 0 Hz and show negative frequencies on the left.
- 20log10 is for amplitude; 10log10 is for power.
Real-world applications
- Spectrum analyzers and signal monitoring tools.
- Interference hunting and RF compliance testing.
- Radar and communications diagnostics.
Where you’ll apply it
- This project: §3.4 (Example Output), §5.10 (Phases), §7.3 (Performance).
- Also used in: P04 ADS-B, P09 GSM.
References
- “Understanding Digital Signal Processing” by Lyons, Chapters 8-9
- “Digital Signal Processing” by Oppenheim & Schafer, Chapter 9
Key insights
The FFT is a finite, windowed estimate of frequency content; windowing and dB scaling are what make it readable.
Summary
You learned how FFT size, windowing, and scaling shape the spectrum, and how to map that into a stable waterfall.
Homework/Exercises to practice the concept
- Compare Hann vs Blackman windows on a single-tone signal and observe leakage.
- Plot spectra with 512, 1024, 4096 point FFTs and compare resolution.
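- Try amplitude vs power dB scaling: compute 20log10 of the magnitude and 10log10 of the power, then deliberately apply the wrong multiplier to each and note the difference in levels.
Solutions to the homework/exercises
- Blackman reduces leakage more but broadens the main lobe; Hann is a balanced compromise.
- Larger FFT shows narrower peaks and finer bin spacing but updates slower.
- Done correctly, both scalings give identical levels, since 10log10(|X|^2) = 20log10(|X|). Applying 10log10 to a magnitude halves every dB value, and applying 20log10 to a power doubles it.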
3. Project Specification
3.1 What You Will Build
A spectrum eye that can ingest raw IQ data (file or live), compute FFT-based spectra, and render a real-time waterfall with correct frequency axes, dynamic range control, and basic artifact mitigation (DC removal, IQ swap option).
Included:
- File input for recorded IQ.
- Optional live SDR input via rtl_sdr or SoapySDR.
- FFT pipeline with windowing.
- dB scaling and color mapping.
- Waterfall and current spectrum views.
- Frequency axis with center frequency and offsets.
Excluded:
- Demodulation of signals (handled in later projects).
- Transmission or re-radiation.
- Advanced calibration beyond DC removal and IQ swap.
3.2 Functional Requirements
- IQ Parsing: Support u8 IQ interleaved files; allow selection of s16 format.
- FFT Pipeline: Compute spectra for configurable FFT sizes (512-4096).
- Waterfall Rendering: Render at >=10 FPS for 1024-point FFT.
- Dynamic Range Control: User-configurable min/max dB range.
- Center Frequency Support: Display absolute and relative frequency axes.
- Artifact Controls: Toggle DC removal and IQ swap.
- Deterministic Test Mode: Load a synthetic IQ file with known tones.
3.3 Non-Functional Requirements
- Performance: Sustains 10 FPS on a 2.4 MSPS file with 1024 FFT.
- Reliability: Deterministic output for identical input files.
- Usability: Clear CLI flags and readable labels for axes and units.
3.4 Example Usage / Output
$ python spectrum_eye.py --input capture.iq --fs 2.4e6 --fc 100.1e6 --fft 1024 \
--min-db -100 --max-db -20 --dc-remove
[INFO] Loaded 2.4e6 samples
[INFO] FFT size: 1024 | Bin width: 2343.75 Hz
[INFO] Waterfall: 12 FPS | Range: -100..-20 dBFS
3.5 Data Formats / Schemas / Protocols
IQ File (u8 interleaved):
- Layout: [I0, Q0, I1, Q1, …]
- Type: uint8
- Scaling: value 0..255, mid-scale 127.5 => 0.0
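A parser covering both the u8 layout above and the selectable s16 format might look like the sketch below. `parse_iq` is a hypothetical name, and little-endian byte order for s16 is an assumption that should be checked against the actual capture tool.

```python
import numpy as np

def parse_iq(buf, fmt="u8"):
    """Convert raw interleaved IQ bytes to complex64, each component in about [-1, 1].

    fmt: "u8"  unsigned 8-bit, mid-scale 127.5 maps to 0.0
         "s16" signed 16-bit, assumed little-endian here
    """
    if fmt == "u8":
        raw = np.frombuffer(buf, dtype=np.uint8).astype(np.float32)
        raw = (raw - 127.5) / 127.5
    elif fmt == "s16":
        usable = len(buf) // 2 * 2                    # ignore a trailing odd byte
        raw = np.frombuffer(buf[:usable], dtype="<i2").astype(np.float32)
        raw = raw / 32768.0
    else:
        raise ValueError(f"unsupported format: {fmt}")
    pairs = len(raw) // 2                             # drop a dangling I with no Q
    i = raw[0 : 2 * pairs : 2]
    q = raw[1 : 2 * pairs : 2]
    return (i + 1j * q).astype(np.complex64)

u8_samples = parse_iq(bytes([0, 255, 128, 127, 9]))   # last byte has no partner
s16_samples = parse_iq(np.array([-32768, 16384], dtype="<i2").tobytes(), fmt="s16")
```

For a file, the bytes could come from `open(path, "rb").read()`; for large captures, reading in fixed-size chunks keeps memory bounded.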
Config (CLI flags):
- --fs (float): sample rate in Hz
- --fc (float): center frequency in Hz
- --fft (int): FFT size
- --min-db, --max-db (float): display range
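The flags can be wired up with the standard library's argparse. This is a sketch only: the flag names come from the usage examples in this document, while the defaults, help strings, and the `build_parser` name are assumptions.

```python
import argparse

def build_parser():
    """Build a CLI parser for the flags used in the examples (defaults are assumed)."""
    p = argparse.ArgumentParser(prog="spectrum_eye.py")
    p.add_argument("--input", required=True, help="path to interleaved IQ file")
    p.add_argument("--fs", type=float, required=True, help="sample rate in Hz")
    p.add_argument("--fc", type=float, default=0.0, help="center frequency in Hz")
    p.add_argument("--fft", type=int, default=1024, help="FFT size")
    p.add_argument("--min-db", type=float, default=-100.0, help="bottom of display range")
    p.add_argument("--max-db", type=float, default=0.0, help="top of display range")
    p.add_argument("--dc-remove", action="store_true", help="subtract the block mean")
    p.add_argument("--output", default=None, help="optional image path")
    return p

# Parse the example command line from section 3.4 without touching sys.argv.
args = build_parser().parse_args(
    ["--input", "capture.iq", "--fs", "2.4e6", "--fc", "100.1e6",
     "--fft", "1024", "--min-db", "-100", "--max-db", "-20", "--dc-remove"]
)
bin_width_hz = args.fs / args.fft
```

With the example arguments, `bin_width_hz` works out to 2343.75 Hz, matching the bin width shown in the sample log output.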
3.6 Edge Cases
- Empty or short IQ file (< FFT length).
- Saturated samples (all 0 or 255) due to clipping.
- Incorrect format flag (u8 interpreted as s16).
- DC offset dominating spectrum.
- Very strong signal saturating display.
3.7 Real World Outcome
You will produce a stable waterfall that correctly shows real RF signals (FM broadcast peaks, ADS-B bursts) and their time variation.
3.7.1 How to Run (Copy/Paste)
python spectrum_eye.py --input capture.iq --fs 2.4e6 --fc 100.1e6 --fft 1024 \
--min-db -100 --max-db -20 --dc-remove --output waterfall.png
3.7.2 Golden Path Demo (Deterministic)
Use a synthetic test file containing two tones at +50 kHz and -200 kHz.
Expected: two bright lines in the waterfall at those offsets.
3.7.3 CLI Transcript (Exact)
$ python spectrum_eye.py --input test_two_tone.iq --fs 1.0e6 --fc 100.0e6 --fft 1024 --min-db -80 --max-db -10
[INFO] FFT size: 1024 | Bin width: 976.56 Hz
[INFO] Tone 1 detected at +50.0 kHz
[INFO] Tone 2 detected at -200.0 kHz
[OK] Wrote waterfall.png
3.7.4 Failure Demo (Bad Input)
$ python spectrum_eye.py --input empty.iq --fs 1.0e6
[ERROR] Input file too short for FFT size 1024
[EXIT] code=2
4. Solution Architecture
4.1 High-Level Design
+-------------+   +-----------+   +------------+   +-----------+
| IQ Source   |-->| Preprocess|-->| FFT Engine |-->| Renderer  |
+-------------+   +-----------+   +------------+   +-----------+
       |                |               |                |
       v                v               v                v
   File/SDR       Center,scale    Spectrum bins   Waterfall image
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| IQ Reader | Read/stream IQ bytes | Buffer size vs latency |
| Preprocess | Center, scale, optional DC removal | CPU vs correctness |
| FFT Engine | Window + FFT + shift | FFT size trade-offs |
| Visualizer | Color mapping, axis labels | dB range and palette |
4.3 Data Structures (No Full Code)
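from dataclasses import dataclass
import numpy as np

@dataclass
class SpectrumFrame:
    timestamp: float      # time of this frame in seconds
    bins_db: np.ndarray   # shape (N,), dB values in display (fftshift) order
    fs: float             # sample rate in Hz
    fc: float             # center frequency in Hz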
4.4 Algorithm Overview
Key Algorithm: Sliding FFT Waterfall
- Read a block of N samples.
- Apply window.
- Compute FFT and shift.
- Convert to dB and clamp to range.
- Append line to waterfall buffer.
Complexity Analysis:
- Time: O(N log N) per frame
- Space: O(N * history) for waterfall image
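The steps above can be sketched as a single function that turns a block of complex samples into waterfall rows. `waterfall_db`, the 50% hop, and the sum-of-window normalization are illustrative assumptions rather than the required design.

```python
import numpy as np

def waterfall_db(samples, n_fft=1024, hop=512, min_db=-100.0, max_db=0.0):
    """Return a (frames, n_fft) array of clamped dBFS rows with 0 Hz centered."""
    if len(samples) < n_fft:
        raise ValueError(f"need at least {n_fft} samples, got {len(samples)}")
    window = np.hanning(n_fft)                 # computed once, reused per frame
    norm = window.sum()                        # so a full-scale tone reads 0 dBFS
    n_frames = 1 + (len(samples) - n_fft) // hop
    rows = np.empty((n_frames, n_fft), dtype=np.float32)
    for k in range(n_frames):
        block = samples[k * hop : k * hop + n_fft] * window
        spec = np.fft.fftshift(np.fft.fft(block)) / norm
        rows[k] = np.clip(20 * np.log10(np.abs(spec) + 1e-12), min_db, max_db)
    return rows

fs = 1_000_000
t = np.arange(8192) / fs
demo = 0.5 * np.exp(2j * np.pi * 125_000 * t)  # synthetic half-scale tone at +125 kHz
image = waterfall_db(demo)
```

The returned array can be passed straight to an image display (for example matplotlib's `imshow`), with each row one time step. The loop runs once per frame, not per sample, so the heavy lifting stays inside numpy.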
5. Implementation Guide
5.1 Development Environment Setup
python -m venv .venv
source .venv/bin/activate
pip install numpy scipy matplotlib
5.2 Project Structure
spectrum-eye/
├── src/
│ ├── main.py
│ ├── iq_reader.py
│ ├── fft_pipeline.py
│ ├── render.py
│ └── config.py
├── tests/
│ ├── test_iq_reader.py
│ ├── test_fft.py
│ └── fixtures/
│ └── test_two_tone.iq
└── README.md
5.3 The Core Question You’re Answering
“How can I make invisible RF energy visible in a reliable, interpretable way?”
5.4 Concepts You Must Understand First
Stop and review these before coding:
- Complex baseband and IQ scaling (see §2.1)
- FFT, windowing, and bin resolution (see §2.2)
- dB scaling and dynamic range
- Sampling rate vs bandwidth
5.5 Questions to Guide Your Design
- How many FFT bins do you need to resolve an FM station spacing (200 kHz)?
- What update rate makes the waterfall feel smooth without overwhelming CPU?
- How will you map dB values to colors so weak signals are visible?
- How will you verify frequency accuracy using a known signal?
5.6 Thinking Exercise
Sketch a waterfall with two tones: one steady, one that appears for 1 second. How would each look as a vertical streak?
5.7 The Interview Questions They’ll Ask
- Why do we use fftshift for IQ data?
- What is spectral leakage and how do you reduce it?
- How does FFT size relate to frequency resolution?
- Why is DC offset visible as a spike?
5.8 Hints in Layers
Hint 1: Start with file input and offline plots using matplotlib.
Hint 2: Use numpy’s FFT and precompute your window.
Hint 3: Only then add real-time streaming and a waterfall buffer.
Hint 4: If the spectrum is mirrored, swap I and Q or negate Q.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| FFT basics | Understanding Digital Signal Processing (Lyons) | Ch. 8-9 |
| SDR basics | Software-Defined Radio for Engineers (Collins) | Ch. 3 |
| Spectral estimation | Digital Signal Processing (Oppenheim) | Ch. 9 |
5.10 Implementation Phases
Phase 1: Foundation (3-4 days)
Goals: Read IQ data, produce a single spectrum plot.
Tasks:
- Build an IQ reader for u8 interleaved.
- Convert to complex samples and plot FFT.
Checkpoint: A static spectrum plot shows at least one known tone.
Phase 2: Core Functionality (5-7 days)
Goals: Continuous FFT and waterfall.
Tasks:
- Implement sliding FFT loop and buffering.
- Map dB to colors and render waterfall.
Checkpoint: Waterfall updates at 10 FPS for a test file.
Phase 3: Polish & Edge Cases (3-4 days)
Goals: Robustness and controls.
Tasks:
- Add DC removal and IQ swap toggle.
- Add frequency axis and dB scaling controls.
Checkpoint: DC spike can be suppressed; spectrum aligns with known signals.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| FFT size | 512/1024/2048 | 1024 | Good balance of resolution and speed |
| Window | Rect/Hann/Blackman | Hann | Reduces leakage with minimal blur |
| Scaling | dBFS vs linear | dBFS | Better visibility of weak signals |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Validate parsing and scaling | IQ conversion, DC removal |
| Integration Tests | Validate FFT pipeline | Known tone appears in correct bin |
| Edge Case Tests | Handle invalid input | Empty file, wrong format |
6.2 Critical Test Cases
- Two-tone IQ file: Peaks at expected offsets within one bin.
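- Zero file (every byte at mid-scale, 127 or 128): Spectrum should be flat at the display floor, apart from a small residual in and immediately around the 0 Hz bin.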
- DC offset file: DC spike appears unless DC removal enabled.
6.3 Test Data
fs=1e6, tone1=+50kHz, tone2=-200kHz
Expected peaks at bins: 512+51 and 512-205 (approx)
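The fixture and its expected bins can be generated and checked in one pass. This sketch assumes tone amplitudes of 0.4 each (so the sum never clips) and uses the u8 layout from section 3.5; the file write is left commented out.

```python
import numpy as np

fs, N = 1_000_000, 1024
n = np.arange(8 * N)
tones = (0.4 * np.exp(2j * np.pi * 50_000 * n / fs)
         + 0.4 * np.exp(2j * np.pi * -200_000 * n / fs))

# Quantize to interleaved u8: 0..255 with mid-scale 127.5 representing 0.0.
iq = np.empty(2 * len(n), dtype=np.uint8)
iq[0::2] = np.clip(np.round(tones.real * 127.5 + 127.5), 0, 255).astype(np.uint8)
iq[1::2] = np.clip(np.round(tones.imag * 127.5 + 127.5), 0, 255).astype(np.uint8)
# iq.tofile("test_two_tone.iq")   # uncomment to write the fixture

# Read back the first block and find the strongest bin on each side of 0 Hz.
i = iq[0::2].astype(np.float32) - 127.5
q = iq[1::2].astype(np.float32) - 127.5
x = (i + 1j * q) / 127.5
spec = np.abs(np.fft.fftshift(np.fft.fft(x[:N] * np.hanning(N))))
pos_peak = N // 2 + int(np.argmax(spec[N // 2:]))   # strongest bin at or above 0 Hz
neg_peak = int(np.argmax(spec[: N // 2]))           # strongest bin below 0 Hz
```

With a bin width of 976.56 Hz, +50 kHz falls 51.2 bins above center and -200 kHz falls 204.8 bins below it, so the peaks land on indices 563 and 307.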
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Wrong IQ scaling | Overly strong DC spike | Subtract mid-scale and normalize |
| Missing fftshift | Spectrum looks “wrapped” | Apply fftshift before plotting |
| Wrong dB formula | dB values doubled or halved | Use 20*log10 for amplitude, 10*log10 for power |
7.2 Debugging Strategies
- Plot I/Q histograms: verify centering and clipping.
- Inject synthetic tones: validate frequency axis.
- Compare with rtl_power: sanity-check noise floor.
7.3 Performance Traps
- Recomputing window each frame (precompute it).
- Copying arrays too often (reuse buffers).
8. Extensions & Challenges
8.1 Beginner Extensions
- Add a peak marker that displays strongest frequency.
- Add average spectrum (slow moving average line).
8.2 Intermediate Extensions
- Add a zoomed FFT view around the cursor.
- Implement automatic noise floor estimation.
8.3 Advanced Extensions
- Add IQ imbalance correction and visualize improvement.
- Implement real-time streaming from SoapySDR.
9. Real-World Connections
9.1 Industry Applications
- RF test equipment: handheld spectrum analyzers and monitoring tools.
- Compliance testing: verify emissions and spurious signals.
9.2 Related Open Source Projects
- Gqrx: desktop SDR receiver with waterfall.
- QSpectrumanalyzer: fast real-time spectrum tool.
9.3 Interview Relevance
- Understanding FFT and windowing is a common DSP interview topic.
- Ability to reason about SDR sample formats shows systems awareness.
10. Resources
10.1 Essential Reading
- “Software-Defined Radio for Engineers” (Collins) Ch. 3
- “Understanding Digital Signal Processing” (Lyons) Ch. 8-9
10.2 Video Resources
- “GNU Radio: FFT and Spectral Analysis” (YouTube, GNU Radio conf)
- “IQ Explained” (SDR tutorials)
10.3 Tools & Documentation
- rtl_sdr: capture IQ samples for offline analysis.
- numpy.fft: FFT operations.
10.4 Related Projects in This Series
- P02 AM Demodulator: first audio decode.
- P03 FM Receiver: demodulation pipeline.
11. Self-Assessment Checklist
11.1 Understanding
- I can explain why IQ data is complex.
- I can compute FFT bin spacing from Fs and N.
- I can explain why windowing reduces leakage.
11.2 Implementation
- My waterfall updates at >=10 FPS.
- I can detect DC offset and remove it.
- Frequency labels match known signals.
11.3 Growth
- I can describe a performance optimization I applied.
- I documented one surprising artifact I observed.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Correct IQ parsing and FFT display from a file.
- Waterfall view with correct frequency axis.
- Deterministic test file yields expected peaks.
Full Completion:
- Live SDR input with stable waterfall.
- DC removal and IQ swap toggles work.
- Documented noise floor and dynamic range.
Excellence (Going Above & Beyond):
- Real-time peak labeling and zoom.
- IQ imbalance correction with before/after comparison.