Project 10: DMA Streamer - Continuous Data Pipeline
Build a continuous DMA pipeline with double-buffering and throughput measurement.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 3: Advanced |
| Time Estimate | 20-30 hours |
| Main Programming Language | C++ |
| Alternative Programming Languages | C |
| Coolness Level | Level 4: Hardcore Tech Flex |
| Business Potential | 3. The “Platform” Model |
| Prerequisites | C/C++ basics, Teensyduino setup, basic electronics, ability to use a multimeter/logic analyzer |
| Key Topics | DMA, buffers, cache coherency, throughput |
1. Learning Objectives
By completing this project, you will:
- Explain the core question for this project in your own words.
- Implement the main workflow and validate it with measurements.
- Handle at least two failure modes and document recovery.
- Produce a deterministic report that matches hardware behavior.
2. All Theory Needed (Per-Concept Breakdown)
DMA Pipelines, Buffering, and Coherency
Fundamentals
DMA moves data between peripherals and memory without CPU intervention. It enables high-throughput, low-jitter pipelines, but it demands careful buffer placement and cache management: if buffers live in the wrong memory region, or caches are not synchronized around transfers, data corruption follows.
Deep Dive into the concept
DMA engines are bus masters: they read and write memory while the CPU keeps running. Teensy 4.x splits its RAM into regions with different properties. Tightly coupled memory (DTCM, RAM1) is fastest for the CPU but is generally not usable as a DMA target, so transfers into it may fail; the `DMAMEM` qualifier places buffers in OCRAM (RAM2), which the DMA engine can reach. OCRAM, however, sits behind the data cache, which raises the second critical issue: cache coherency. The CPU may hold dirty cache lines the DMA cannot see, or stale lines the DMA has since overwritten in memory, so you must flush the cache before DMA reads a buffer and invalidate it after DMA writes one. Buffering determines stability. A single buffer risks being overwritten mid-processing; double buffers let capture and processing run in parallel. Buffer size trades latency against stability. This project builds a double-buffered DMA pipeline, measures throughput, and logs overruns to prove the pipeline stays stable under load.
How this fits into the project
This concept directly drives the implementation choices and validation steps in this project.
Definitions & key terms
- DMA: Direct Memory Access engine that moves data without CPU.
- Descriptor: DMA configuration describing source, destination, and size.
- Coherency: Consistency between CPU cache and main memory.
- Double buffer: Two buffers used to overlap capture and processing.
Mental model diagram (ASCII)
Peripheral -> DMA -> Buffer A/B -> CPU processing
How it works (step-by-step)
- Allocate DMA-safe buffers and align them.
- Configure DMA descriptors for continuous transfers.
- Handle buffer-full interrupts and swap buffers.
- Measure throughput and overrun counts.
Minimal concrete example
```cpp
// DMAMEM places buf in OCRAM (RAM2), which the DMA engine can access;
// 32-byte alignment matches the Cortex-M7 cache line size so that
// flush/invalidate operations cover exactly this buffer.
DMAMEM static uint16_t buf[1024] __attribute__((aligned(32)));
DMAChannel dma;
dma.destinationBuffer(buf, sizeof(buf));
```
Common misconceptions
- DMA always improves performance.
- Any RAM buffer works for DMA.
- Cache coherency is automatic.
Check-your-understanding questions
- Why do DMA buffers need special placement?
- What does double buffering solve?
- How do you detect overruns?
Check-your-understanding answers
- DMA cannot access certain tightly coupled memory regions.
- It prevents data overwrite while CPU processes old data.
- Count buffer-full events and missed interrupts.
Real-world applications
- Audio streaming
- High-rate sensor capture
- USB data pipelines
Where you’ll apply it
- See §3.2 Functional Requirements and §5.10 Implementation Phases in this file.
- Also used in: P11-audio-lab-real-time-audio-effects-chain.md, P16-teensy-data-logger-sd-card-sensor-archive.md
References
- NXP DMA chapter
- PJRC DMA documentation
Key insights
DMA throughput is real only when buffers and caches are managed correctly.
Summary
A stable DMA pipeline needs buffer design, coherency control, and measurement.
Homework/Exercises to practice the concept
- Create a double-buffered DMA loop and log overrun count.
- Measure throughput with and without CPU load.
Solutions to the homework/exercises
- Use DMA completion interrupts to swap buffers and track counters.
- Add a busy loop and compare throughput logs.
3. Project Specification
3.1 What You Will Build
Build a continuous DMA pipeline with double-buffering and throughput measurement.
3.2 Functional Requirements
- Configure DMA for continuous transfers.
- Implement double-buffered processing.
- Measure throughput and overrun count.
- Report buffer occupancy over time.
3.3 Non-Functional Requirements
- Performance: Meet the target timing/throughput for the project.
- Reliability: Detect errors and recover without undefined behavior.
- Usability: Provide clear logs and a repeatable workflow.
3.4 Example Usage / Output
```bash
./P10-dma-streamer-continuous-data-pipeline --run
```
3.5 Data Formats / Schemas / Protocols
CSV with columns: timestamp, buf_a_fill, buf_b_fill, overruns
3.6 Edge Cases
- Buffer overrun when CPU slow
- Cache incoherency corrupts data
- DMA stalled by bus contention
3.7 Real World Outcome
You will run the project and see deterministic logs and measurements that match physical hardware behavior.
3.7.1 How to Run (Copy/Paste)
```bash
cd project-root
make
./P10-dma-streamer-continuous-data-pipeline --run
```
3.7.2 Golden Path Demo (Deterministic)
Use a fixed input configuration and a known test signal. Capture output for 60 seconds and verify it matches expected values.
3.7.3 Exact Terminal Transcript (CLI)
```text
$ ./P10-dma-streamer-continuous-data-pipeline --run --seed 42
[INFO] DMA Streamer - Continuous Data Pipeline starting
[INFO] Report saved to data/report.csv
[INFO] Status: OK
$ echo $?
0
```
3.7.4 Failure Demo (Deterministic)
```text
$ ./P10-dma-streamer-continuous-data-pipeline --run --missing-device
[ERROR] Device not detected
$ echo $?
2
```
4. Solution Architecture
4.1 High-Level Design
Inputs -> Acquisition -> Processing -> Output/Log
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Acquisition | Configure peripherals and capture data | Use stable clock settings |
| Processing | Convert raw data to meaningful values | Apply calibration/filters |
| Output/Log | Emit reports and logs | CSV for reproducibility |
4.3 Data Structures (No Full Code)
```cpp
struct Sample {
    uint32_t timestamp_us;
    uint32_t value;
    uint32_t flags;
};
```
4.4 Algorithm Overview
Key Algorithm: Measurement + Report
- Initialize hardware and verify configuration.
- Capture data and record timestamps.
- Compute metrics and write report.
Complexity Analysis:
- Time: O(n) in samples
- Space: O(n) for log storage
5. Implementation Guide
5.1 Development Environment Setup
```bash
# Arduino IDE + Teensyduino must be installed.
# Optional CLI workflow:
arduino-cli core update-index
arduino-cli core install teensy:avr
```
5.2 Project Structure
```text
project-root/
├── src/
│   ├── main.ino
│   ├── hw_config.h
│   └── measurements.cpp
├── tools/
│   └── analyze.py
├── data/
│   └── samples.csv
└── README.md
```
5.3 The Core Question You’re Answering
“How do I stream data continuously without CPU bottlenecks?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- DMA, buffers, cache coherency, throughput
- Data logging and measurement techniques
- Basic timing math and error analysis
5.5 Questions to Guide Your Design
- How large should buffers be?
- Which RAM region is DMA-safe?
- How will you detect overruns?
5.6 Thinking Exercise
Compute maximum data rate given buffer size and CPU processing time.
5.7 The Interview Questions They’ll Ask
- What is DMA and why use it?
- How does cache coherency affect DMA?
- What is double buffering?
5.8 Hints in Layers
- Start with a small buffer and measure overruns.
- Move the buffers into `DMAMEM`.
- Log buffer fill percentages.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| DMA concepts | Making Embedded Systems | Ch. 7 |
| Memory systems | Computer Systems | Ch. 6 |
| Embedded performance | Real-Time Concepts | Ch. 5 |
5.10 Implementation Phases
Phase 1: Foundation (6 hours)
Goals:
- Configure DMA
- Validate buffer data
Tasks:
- Configure DMA
- Validate buffer data
Checkpoint: DMA transfer works
Phase 2: Core Functionality (10 hours)
Goals:
- Double-buffer pipeline
- Throughput logs
Tasks:
- Double-buffer pipeline
- Throughput logs
Checkpoint: Stable streaming
Phase 3: Polish (6 hours)
Goals:
- Overrun analysis
- Optimization
Tasks:
- Overrun analysis
- Optimization
Checkpoint: Final report
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Buffering | Single buffer, double buffer | Double buffer | Avoids data loss during processing |
| Logging format | CSV, binary | CSV | Human-readable while still scriptable |
| Clock speed | Default, overclock | Default | Keeps peripherals in spec |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Validate math, parsing, and conversions | Timer math, CRC checks |
| Integration Tests | Verify peripherals and pipelines | DMA -> buffer -> log |
| Edge Case Tests | Handle boundary conditions | Brownout, missing sensor |
6.2 Critical Test Cases
{test_cases}
6.3 Test Data
Use a fixed test input pattern and record outputs to data/report.csv
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
{pitfalls}
7.2 Debugging Strategies
{debug_strats}
7.3 Performance Traps
Large buffers improve stability but increase latency. Measure both throughput and jitter to choose the right size.
8. Extensions & Challenges
8.1 Beginner Extensions
{ex_begin}
8.2 Intermediate Extensions
{ex_inter}
8.3 Advanced Extensions
{ex_adv}
9. Real-World Connections
9.1 Industry Applications
{industry_apps}
9.2 Related Open Source Projects
{open_source}
9.3 Interview Relevance
{interview_rel}
10. Resources
10.1 Essential Reading
{resources}
10.2 Video Resources
- Embedded systems timing walkthrough (YouTube)
- Teensy hardware deep dive (Conference talk)
10.3 Tools & Documentation
- Teensyduino: Toolchain for Teensy boards
- Logic Analyzer: Timing verification
- Multimeter: Voltage and current measurement
10.4 Related Projects in This Series
{related_projects}
11. Self-Assessment Checklist
11.1 Understanding
- I can explain the main concept without notes.
- I can explain why the measurements match (or do not match) expectations.
- I understand at least one tradeoff made in this project.
11.2 Implementation
- All functional requirements are met.
- All critical test cases pass.
- Logs and reports are reproducible.
- Edge cases are handled.
11.3 Growth
- I documented lessons learned.
- I can explain this project in a job interview.
- I identified one improvement for next iteration.
12. Submission / Completion Criteria
Minimum Viable Completion: {comp_min}
Full Completion: {comp_full}
Excellence (Going Above & Beyond): {comp_ex}