Project 17: SPI Master
Implement an SPI master that supports Mode 0 and Mode 3.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 1 week |
| Main Programming Language | Verilog |
| Alternative Programming Languages | VHDL, SystemVerilog |
| Coolness Level | High |
| Business Potential | Medium |
| Prerequisites | Shift registers, Clock dividers |
| Key Topics | SPI modes, CS framing |
1. Learning Objectives
- Generate SPI clock with correct polarity
- Shift data on correct edges
- Frame transactions with CS
2. All Theory Needed (Per-Concept Breakdown)
SPI Timing and Modes
Description/Expanded Explanation of the concept
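SPI is synchronous: the master drives a shared clock (SCK) that times every bit on MOSI and MISO. CPOL sets the idle level of SCK; CPHA selects whether data is sampled on the leading or the trailing clock edge. Master and slave must use the same mode, otherwise bits are captured on the wrong edge.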
Definitions & Key Terms
- CPOL -> clock polarity
- CPHA -> clock phase
- CS -> chip select framing signal
Mental Model Diagram (ASCII)
Mode 0: CPOL=0, CPHA=0 (SCK idles low; sample on rising edge, shift on falling edge)
Mode 3: CPOL=1, CPHA=1 (SCK idles high; sample on rising edge, shift on falling edge)
How It Works (Step-by-Step)
- Generate SCK with correct idle polarity.
- Drive MOSI on the shift edge.
- Sample MISO on the sample edge.
- Keep CS low for the entire frame.
Minimal Concrete Example
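// Modes 0 and 3: sample MISO on the rising edge, change MOSI on the falling edge.
// In Mode 0 the first MOSI bit must already be driven when CS goes low.
if (sck_rise) shift <= {shift[6:0], miso}; // capture MISO, advance to the next bit
if (sck_fall) mosi <= shift[7];            // present the next MSB on MOSI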
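The `sck_rise` and `sck_fall` strobes used in the example are not defined elsewhere in this guide. One possible way to produce them, assuming SCK is generated inside the `clk` domain, is to compare SCK against a copy delayed by one cycle. The module name `sck_edges` and all of its ports are hypothetical.

```verilog
// Hypothetical helper: one-cycle strobes for SCK edges, in the clk domain.
module sck_edges (
    input  wire clk,
    input  wire reset,     // synchronous, active high
    input  wire cpol,      // idle level of SCK, used as the reset value
    input  wire sck,       // internally generated SPI clock
    output wire sck_rise,  // high during the first clk cycle in which SCK is 1
    output wire sck_fall   // high during the first clk cycle in which SCK is 0
);
    reg sck_d;             // SCK delayed by one clk cycle

    always @(posedge clk) begin
        if (reset) sck_d <= cpol;
        else       sck_d <= sck;
    end

    // An edge is visible while SCK and its delayed copy disagree.
    assign sck_rise =  sck & ~sck_d;
    assign sck_fall = ~sck &  sck_d;
endmodule
```

Because the strobes are ordinary synchronous signals, the shift logic can stay in a single `always @(posedge clk)` block instead of being clocked by SCK itself.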
Common Misconceptions
- “CPOL/CPHA only affect speed.” -> They define sampling edges.
- “CS can toggle mid-byte.” -> It usually frames the transaction.
Check-Your-Understanding Questions
- What is the difference between Mode 0 and Mode 3?
- When should MOSI change relative to SCK?
- Why is CS important?
Where You’ll Apply It
- This project: used in Section 3.2 and Section 4
- Also used in: P15-uart-transmitter.md (for comparison)
Shift Registers
Description/Expanded Explanation of the concept
Shift registers move data one bit per clock. They are the core of serial protocols and LED chasers. They can shift left, right, or rotate, and they are often paired with counters for timing.
Definitions & Key Terms
- Shift register -> register that shifts bits each cycle
- Serial -> one bit per clock
- Parallel -> multiple bits at once
Mental Model Diagram (ASCII)
[ b7 b6 b5 b4 b3 b2 b1 b0 ] -> shift -> [ b6 ... b0 new ]

How It Works (Step-by-Step)
- Load a register with a value.
- On each clock edge, shift left or right.
- Insert a new bit at the shifted end.
Minimal Concrete Example
shift <= {shift[6:0], serial_in};
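The one-line shift above can be wrapped into a small module. The sketch below assumes an 8-bit, MSB-first register with a parallel load, which is the shape a serial transmitter typically needs; the module name `shift8` and its ports are hypothetical.

```verilog
// Hypothetical 8-bit MSB-first shift register with parallel load.
module shift8 (
    input  wire       clk,
    input  wire       load,         // load parallel_in this cycle (wins over shift_en)
    input  wire       shift_en,     // shift left by one bit this cycle
    input  wire       serial_in,    // bit inserted at the LSB end
    input  wire [7:0] parallel_in,
    output wire       serial_out,   // current MSB, the next bit to leave
    output reg  [7:0] data
);
    assign serial_out = data[7];

    always @(posedge clk) begin
        if (load)          data <= parallel_in;
        else if (shift_en) data <= {data[6:0], serial_in};
    end
endmodule
```

Tying `serial_in` to `serial_out` turns the shift into a rotate, which is a quick way to see the difference between the two in a waveform.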
Common Misconceptions
- “Shift registers are only for serial I/O.” -> They are also for patterns and delays.
- “Shifts are free.” -> They still consume registers and routing.
Check-Your-Understanding Questions
- What is the difference between shift and rotate?
- How many cycles to shift out 8 bits?
- Why use shift registers in UART TX?
Where You’ll Apply It
- This project: used in Section 3.2 and Section 4
- Also used in: P15-uart-transmitter.md
Verification with Testbenches and Waveforms
Description/Expanded Explanation of the concept
Testbenches are simulation-only modules that apply stimulus and check outputs. Waveforms (VCD) are the hardware engineer’s microscope; they reveal timing, glitches, and ordering problems. A good testbench is deterministic and covers edge cases.
Definitions & Key Terms
- Testbench -> a non-synthesizable module that drives a DUT
- VCD -> Value Change Dump waveform file
- Deterministic test -> same inputs produce same outputs every run
Mental Model Diagram (ASCII)
[Testbench] -> [DUT] -> [VCD] -> [GTKWave]

How It Works (Step-by-Step)
- Initialize inputs to known values.
- Apply stimulus over time.
- Dump waveforms and check outputs.
- Add assertions or PASS/FAIL messages.
Minimal Concrete Example
initial begin
$dumpfile("wave.vcd");
$dumpvars(0, tb);
a = 0; b = 1; #10;
$finish;
end
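The example above dumps a waveform but never checks a result. A minimal self-checking variant might look like the sketch below. The AND gate stands in for a real DUT, and the module name `tb_selfcheck` and the `check` task are hypothetical.

```verilog
`timescale 1ns/1ps
// Hypothetical self-checking testbench; the "DUT" here is just an AND gate.
module tb_selfcheck;
    reg  a, b;
    wire y = a & b;          // stand-in for a real DUT instance
    integer errors = 0;

    // Compare y against the expected value and count mismatches.
    task check(input expected);
        begin
            if (y !== expected) begin
                errors = errors + 1;
                $display("FAIL: a=%b b=%b y=%b expected=%b", a, b, y, expected);
            end
        end
    endtask

    initial begin
        $dumpfile("wave.vcd");
        $dumpvars(0, tb_selfcheck);
        // Fixed stimulus in a fixed order keeps the run deterministic.
        a = 0; b = 0; #10; check(1'b0);
        a = 0; b = 1; #10; check(1'b0);
        a = 1; b = 0; #10; check(1'b0);
        a = 1; b = 1; #10; check(1'b1);
        if (errors == 0) $display("PASS");
        else             $display("FAIL: %0d mismatches", errors);
        $finish;
    end
endmodule
```

Using `!==` rather than `!=` means an unknown (`x`) output counts as a mismatch instead of silently passing.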
Common Misconceptions
- “If it simulates once, it’s correct.” -> Cover all relevant cases.
- “Waveforms are optional.” -> They are often the only way to debug timing.
Check-Your-Understanding Questions
- Why keep testbench and DUT separate?
- What is the purpose of $dumpvars?
- How do you make a testbench deterministic?
Where You’ll Apply It
- This project: used throughout Section 6 (testing)
- Also used in: all other projects in this folder
3. Project Specification
3.1 What You Will Build
An SPI master with Mode 0 and Mode 3 support.
3.2 Functional Requirements
- Requirement 1: Configurable clock divider
- Requirement 2: Support Mode 0 and Mode 3
- Requirement 3: Full-duplex 8-bit transfer
3.3 Non-Functional Requirements
- Performance: Stable operation at the target clock and interfaces.
- Reliability: Deterministic outputs on all defined inputs.
- Usability: Clear ports and documented behavior.
3.4 Example Usage / Output
See the CLI transcript in Section 3.7.3 for the expected output of a Mode 0 transfer.
3.5 Data Formats / Schemas / Protocols
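Each transfer is a single 8-bit full-duplex frame: CS stays low for the whole byte, and one bit moves on MOSI and one on MISO per SCK cycle. The examples in this guide shift MSB-first.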
3.6 Edge Cases
- Mode mismatch
- CS timing
3.7 Real World Outcome
3.7.1 How to Run (Copy/Paste)
iverilog -o spi_tb src/top.v src/core.v tb/tb.v  # file layout from Section 5.2
vvp spi_tb
3.7.2 Golden Path Demo (Deterministic)
Run the demo command above with the provided testbench and confirm the outputs match the golden transcript.
3.7.3 CLI Transcript
TX=0x9A RX=0x3C mode=0
3.7.4 Failure Demo (Expected)
# Example failure case
ERROR: Output mismatch at vector 3
Expected: 0x0A, Got: 0x0B
EXIT CODE: 1
Notes:
- Exit code 0 indicates all tests passed
- Exit code 1 indicates a test failure
4. Solution Architecture
4.1 High-Level Design
[clk, reset, start, tx_data, mode, miso] -> [sck_gen + shift] -> [sck, mosi, cs, rx_data, busy]

4.2 Key Components
| Component | Responsibility |
|---|---|
| sck_gen | SPI clock generation |
| shift | MOSI/MISO shift register |
4.3 Data Structures (No Full Code)
// Example signals (adapt to your design)
reg [7:0] state_reg;
reg [7:0] data_reg;
4.4 Algorithm Overview
Key Algorithm: Core control flow
- Initialize state/reset conditions.
- Apply inputs and compute outputs.
- Update state on clock edges (if sequential).
Complexity Analysis:
- Time: O(1) per cycle
- Space: O(N) for registers and logic
5. Implementation Guide
5.1 Development Environment Setup
iverilog -V
# Ensure GTKWave is installed for waveform viewing
5.2 Project Structure
project-root/
|-- src/
| |-- top.v
| |-- core.v
|-- tb/
| |-- tb.v
|-- Makefile
|-- README.md

5.3 The Core Question You’re Answering
“How do you align data with a shared clock across devices?”
5.4 Concepts You Must Understand First
- Shift registers
- Clock dividers
5.5 Questions to Guide Your Design
- Which edge should sample in each mode?
- How long should CS stay low?
5.6 Thinking Exercise
Draw SCK and MOSI for Mode 0 and Mode 3.
5.7 The Interview Questions They’ll Ask
- What do CPOL and CPHA mean?
- Why is CS required?
5.8 Hints in Layers
- Drive MOSI on one edge, sample MISO on the other.
- Keep CS low for the full frame.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Serial protocols | Making Embedded Systems | Ch. 8 |
5.10 Implementation Phases
Phase 1: Foundation
Goals:
- Establish core module structure
- Implement minimal behavior
Tasks:
- Scaffold module ports and internal signals
- Write a minimal testbench that compiles
Checkpoint: Simulation runs without errors
Phase 2: Core Functionality
Goals:
- Implement full logic
- Verify edge cases
Tasks:
- Complete core logic
- Add directed tests for edge cases
Checkpoint: All tests pass and waveforms match expectations
Phase 3: Polish & Edge Cases
Goals:
- Improve readability
- Document behavior
Tasks:
- Add comments and README notes
- Expand tests for unusual inputs
Checkpoint: Design is deterministic and documented
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Reset strategy | Sync / Async | Sync | Simpler timing closure |
| Test coverage | Directed / Exhaustive | Exhaustive for small logic | Prevents missed cases |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Test core logic | Small vectors |
| Integration Tests | Test modules together | Full system |
| Edge Case Tests | Boundary conditions | Max/min values |
6.2 Critical Test Cases
- Test 1: Mode 0 transfer
- Test 2: Mode 3 transfer
6.3 Test Data
Use deterministic vectors and document expected outputs.
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Mode mismatch | Received data shifted by one bit | Verify that CPOL/CPHA match on master and slave |
7.2 Debugging Strategies
- Inspect waveforms at key internal signals
- Add temporary debug outputs to verify state
- Reduce testcases to the smallest failing case
7.3 Performance Traps
- Overly wide counters or combinational paths can reduce max clock
8. Extensions & Challenges
8.1 Beginner Extensions
- Add parameterization for widths
- Add optional features (enable, reset)
8.2 Intermediate Extensions
- Add configuration registers
- Build a simple driver or demo program
8.3 Advanced Extensions
- Integrate with another project in this series
- Implement a hardware demo on FPGA
9. Real-World Connections
9.1 Industry Applications
- Digital control systems and embedded peripherals
- FPGA prototyping and validation
9.2 Related Open Source Projects
- Yosys / nextpnr toolchain for open-source FPGA flow
- Example HDL projects in the FPGA community
9.3 Interview Relevance
- Demonstrates RTL thinking and verification skills
10. Resources
10.1 Essential Reading
- Making Embedded Systems - Focus on Ch. 8
10.2 Video Resources
- Search for project-specific HDL walkthroughs and waveforms
10.3 Tools & Documentation
- Icarus Verilog
- GTKWave
10.4 Related Projects in This Series
- See adjacent projects in VERILOG_FROM_ZERO_PROJECTS/
11. Self-Assessment Checklist
11.1 Understanding
- I can explain the core concept without notes
- I can predict waveform behavior for basic inputs
11.2 Implementation
- All functional requirements are met
- All tests pass
- Edge cases are documented
11.3 Growth
- I can explain this project in an interview
- I documented at least one lesson learned
12. Submission / Completion Criteria
Minimum Viable Completion:
- Functional requirements implemented
- Testbench passes
- Waveforms inspected
Full Completion:
- All minimum criteria plus
- Edge cases covered and documented
Excellence (Going Above & Beyond):
- Hardware demo on FPGA
- Clear write-up of lessons learned
Appendix A: Deep Dive Walkthrough
A.1 Signal Map and Timing Contract
- Inputs: clk, reset, start, tx_data[7:0], mode, miso
- Outputs: sck, mosi, cs, rx_data[7:0], busy
A.2 Mode Summary
- Mode 0 (CPOL=0, CPHA=0): sample on rising, shift on falling.
- Mode 3 (CPOL=1, CPHA=1): sample on rising, shift on falling, idle high.
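Both modes above sample on the rising edge, and that follows from how CPOL and CPHA combine. As a sketch, assuming the common numbering where the mode number is the 2-bit value {CPOL, CPHA}, the decode below derives the sample edge for any of the four modes. The module and port names are hypothetical, and the `mode` input in A.1 may be encoded differently in your design.

```verilog
// Hypothetical decode of a 2-bit SPI mode number into CPOL, CPHA and sample edge.
module spi_mode_decode (
    input  wire [1:0] mode,           // assumed numbering: mode = {CPOL, CPHA}
    output wire       cpol,           // idle level of SCK
    output wire       cpha,           // 0: sample on leading edge, 1: on trailing edge
    output wire       sample_on_rise  // 1 when data is sampled on the rising SCK edge
);
    assign cpol = mode[1];
    assign cpha = mode[0];

    // The leading edge rises when CPOL=0 and falls when CPOL=1.
    // CPHA=0 samples on the leading edge, CPHA=1 on the trailing edge,
    // so the sample edge is a rising edge exactly when CPOL == CPHA.
    assign sample_on_rise = (cpol == cpha);
endmodule
```

With this numbering, Modes 0 and 3 give `sample_on_rise = 1` and Modes 1 and 2 give 0, which is why a master limited to Modes 0 and 3 can share one set of edge logic and only switch the idle level.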
A.3 Timing Sketch
Mode 0: SCK _/\_/\_/\_ (idle low)
Mode 3: SCK ‾\_/\_/\_‾ (idle high)

A.4 Deterministic Test Vectors
- Connect a loopback (MOSI -> MISO) in the testbench.
- Send 0x3C and expect 0x3C back in both modes.
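Before the real master exists, the loopback idea can be tried on a purely behavioral model. The sketch below is simulation-only and is not the SPI master itself: a task bit-bangs one byte MSB-first, shifting on falling edges and sampling on rising edges, with MISO tied to MOSI. The names `tb_loopback_model` and `transfer` are hypothetical, and the 10 ns half-period is arbitrary.

```verilog
`timescale 1ns/1ps
// Behavioral loopback sketch (simulation only, not the SPI master itself).
module tb_loopback_model;
    reg        sck, cs, mosi;
    wire       miso = mosi;           // loopback: MOSI tied to MISO
    reg  [7:0] tx, rx;
    integer    i, errors;

    // Clock one byte MSB-first: shift on falling edges, sample on rising edges.
    task transfer(input cpol, input [7:0] data);
        begin
            tx  = data;
            rx  = 8'h00;
            sck = cpol;               // idle level: 0 for Mode 0, 1 for Mode 3
            cs  = 0;                  // CS low frames the whole byte
            for (i = 0; i < 8; i = i + 1) begin
                if (cpol) sck = 0;    // Mode 3: leading edge is the falling edge
                mosi = tx[7];         // Mode 0: SCK is already low here
                tx   = {tx[6:0], 1'b0};
                #10 sck = 1;          // rising edge: both modes sample here
                rx   = {rx[6:0], miso};
                #10 if (!cpol) sck = 0;  // Mode 0 returns low; Mode 3 stays high
            end
            cs = 1;
        end
    endtask

    initial begin
        errors = 0; cs = 1; mosi = 0; sck = 0;
        transfer(1'b0, 8'h3C);
        if (rx !== 8'h3C) errors = errors + 1;
        $display("mode=0 TX=3c RX=%h", rx);
        #20 transfer(1'b1, 8'h3C);
        if (rx !== 8'h3C) errors = errors + 1;
        $display("mode=3 TX=3c RX=%h", rx);
        if (errors == 0) $display("PASS"); else $display("FAIL");
        $finish;
    end
endmodule
```

Once the master is written, the same expectation (received byte equals transmitted byte in both modes) carries over by replacing the task with the DUT and keeping the MOSI-to-MISO tie.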
A.5 Debugging Tip
If bits are reversed, check whether you shift MSB-first or LSB-first and keep it consistent.
13. Deep Dive Appendix
13.1 Timing and Resource Budget
- SCK frequency = clk_hz / divider, so SCK period = divider / clk_hz. Ensure divider >= 2.
- CPOL/CPHA determine idle level and sample edge.
- CS setup/hold must meet the slave device datasheet.
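One possible shape for the divider is sketched below. It assumes an even `DIV` of at least 2, so that SCK toggles every `DIV / 2` system clock cycles and runs at the system clock frequency divided by `DIV`; for example, a 50 MHz clock with `DIV = 50` would give a 1 MHz SCK. The parameter, the `run` input, and the 16-bit counter width are assumptions made for illustration.

```verilog
// Hypothetical SCK generator: f_sck = f_clk / DIV while `run` is high.
module sck_gen #(
    parameter DIV = 4                 // assumed even and >= 2
) (
    input  wire clk,
    input  wire reset,                // synchronous, active high
    input  wire run,                  // high while a frame is being clocked
    input  wire cpol,                 // idle level: 0 for Mode 0, 1 for Mode 3
    output reg  sck
);
    localparam HALF = DIV / 2;        // clk cycles per SCK half-period
    reg [15:0] count;

    always @(posedge clk) begin
        if (reset || !run) begin
            count <= 16'd0;
            sck   <= cpol;            // park SCK at its idle level
        end else if (count == HALF - 1) begin
            count <= 16'd0;
            sck   <= ~sck;            // toggle every HALF clk cycles
        end else begin
            count <= count + 16'd1;
        end
    end
endmodule
```

In this sketch, dropping `run` forces SCK back to its idle level on the next clock, so the controlling state machine is responsible for holding `run` until the last bit has completed and for inserting any CS setup and hold time the slave needs.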
13.2 Waveform Interpretation Guide
- Verify CS goes low before the first clock edge and stays low until after the last bit.
- Check that MOSI changes on the shift edge and that MISO is sampled on the capture edge.
Example mode 0:
CPOL=0, CPHA=0 -> sample on rising edge, shift on falling edge
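The CS framing check can also be automated instead of being read off the waveform by eye. The monitor below is a simulation-only sketch under strict assumptions: CS is active low, every frame is exactly 8 bits (16 SCK edges), and any SCK edge that coincides with or falls outside the CS-low window is treated as a violation. The module name `cs_frame_monitor` and its counters are hypothetical.

```verilog
// Hypothetical simulation-only monitor for CS framing of 8-bit transfers.
module cs_frame_monitor (
    input wire sck,
    input wire cs                     // active low, as in this guide
);
    integer violations     = 0;       // total framing problems seen
    integer edges_in_frame = 0;       // SCK edges counted while CS is low

    // Edges at time 0 are treated as initialization and ignored.
    always @(sck) begin
        if ($time > 0) begin
            if (cs === 1'b1) begin
                violations = violations + 1;
                $display("ERROR: SCK edge at %0t while CS is high", $time);
            end else if (cs === 1'b0) begin
                edges_in_frame = edges_in_frame + 1;
            end
        end
    end

    // When CS rises, an 8-bit frame should have seen 8 rising and 8 falling edges.
    always @(posedge cs) begin
        if (edges_in_frame != 0 && edges_in_frame != 16) begin
            violations = violations + 1;
            $display("ERROR: frame ended after %0d SCK edges", edges_in_frame);
        end
        edges_in_frame = 0;
    end
endmodule
```

Instantiating the monitor in the testbench next to the DUT and checking that `violations` is still zero at the end of the run turns the two waveform checks above into a PASS/FAIL condition.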
13.3 Hardware Bring-Up Notes
- Use a logic analyzer to verify edge alignment.
- Ensure all devices share a common ground.
- Many sensors are 3.3V only; use a level shifter if needed.
13.4 Alternate Implementations and Trade-offs
- Add support for mode 1 and 2 for wider device compatibility.
- Implement multi-byte transfers with a length counter.
- Add multi-slave selection with multiple CS lines.
13.5 Additional Exercises
- Build a loopback test by tying MOSI to MISO.
- Add a configurable bit order (MSB/LSB first).
- Support quad-SPI signals as an advanced challenge.