Project 5: 8-bit ALU (Arithmetic Logic Unit)

Create an 8-bit ALU with arithmetic and logic operations and status-flag outputs.

Quick Reference

| Attribute | Value |
|---|---|
| Difficulty | Intermediate |
| Time Estimate | 1 week |
| Main Programming Language | Verilog |
| Alternative Programming Languages | VHDL, SystemVerilog |
| Coolness Level | High |
| Business Potential | Medium |
| Prerequisites | Adders, Two’s complement, Muxes |
| Key Topics | Opcode decoding, Flags, Datapath |

1. Learning Objectives

  1. Design an opcode table for ALU operations
  2. Generate flags (Z, C, N)
  3. Write a testbench with edge cases

2. All Theory Needed (Per-Concept Breakdown)

Adders and Two’s Complement Arithmetic

Description/Expanded Explanation of the concept

Adders are the core of arithmetic hardware. Two’s complement lets the same adder perform subtraction: invert the subtrahend and add one, so a - b = a + ~b + 1. In a ripple-carry adder each stage waits for the carry from the stage below it, so carry propagation sets the worst-case delay.

Definitions & Key Terms
  • Full adder -> adds a, b, and carry-in
  • Two’s complement -> negate by invert + 1
  • Carry propagation -> carry ripples through bit positions
Mental Model Diagram (ASCII)
[A0+B0] -> c1 -> [A1+B1] -> c2 -> [A2+B2] -> c3

Ripple-carry chain

How It Works (Step-by-Step)
  1. Build a 1-bit full adder.
  2. Chain full adders for N-bit addition.
  3. For subtraction, invert B and add 1 (for example, by setting the carry-in to 1).
  4. Verify carry and overflow behavior.
Minimal Concrete Example
assign {c_out, sum} = a + b + c_in;
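
The four steps above can be sketched structurally. This is a minimal sketch, not a reference design: the module names `full_adder` and `ripple_addsub8` and the `sub` control input are assumptions made for illustration.

```verilog
// 1-bit full adder: sum and carry-out of a + b + c_in.
module full_adder (
    input  wire a, b, c_in,
    output wire sum, c_out
);
    assign sum   = a ^ b ^ c_in;
    assign c_out = (a & b) | (c_in & (a ^ b));
endmodule

// 8-bit ripple-carry adder/subtractor (hypothetical names).
// sub = 0: y = a + b;  sub = 1: y = a + ~b + 1 = a - b.
module ripple_addsub8 (
    input  wire [7:0] a, b,
    input  wire       sub,
    output wire [7:0] y,
    output wire       c_out
);
    wire [7:0] b_eff = b ^ {8{sub}};  // invert b when subtracting
    wire [8:0] c;                     // c[i] is the carry into bit i
    assign c[0] = sub;                // the "+1" of two's complement

    genvar i;
    generate
        for (i = 0; i < 8; i = i + 1) begin : stage
            full_adder fa (
                .a(a[i]), .b(b_eff[i]), .c_in(c[i]),
                .sum(y[i]), .c_out(c[i+1])
            );
        end
    endgenerate

    assign c_out = c[8];              // carry out of the top bit
endmodule
```

With `sub = 1`, `a = 8'h00`, and `b = 8'h01`, this produces `y = 8'hFF` and `c_out = 0`. In practice a synthesis tool will usually build a better adder from a plain `+`, so the structural form is mainly useful for seeing the carry chain.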
Common Misconceptions
  • “Carry and overflow are the same.” -> They differ for signed numbers.
  • “Ripple carry is always fine.” -> The carry chain grows with bit width, so it can be too slow for wide adders or high clock rates.
Check-Your-Understanding Questions
  1. How do you compute a - b using an adder?
  2. Why is carry a timing bottleneck?
  3. What is the difference between carry and overflow?
Where You’ll Apply It
  • This project: used in Section 3.2 and Section 4 (architecture)
  • Also used in: P19-calculator-with-7-segment-display.md

Datapaths and Flag Generation

Description/Expanded Explanation of the concept

A datapath is the collection of registers, muxes, and an ALU that moves and transforms data. Flags (zero, carry, negative) are simple but essential signals that let control logic make decisions.

Definitions & Key Terms
  • Datapath -> the data-processing portion of a design
  • Control -> logic that selects operations and routes data
  • Flags -> status bits derived from results
Mental Model Diagram (ASCII)
[Regs] -> [ALU] -> [Regs]
          |  |
         Z  C flags

Registers feed the ALU, which writes results back to registers and produces flags

How It Works (Step-by-Step)
  1. Registers feed operands into the ALU.
  2. Control selects the ALU operation.
  3. Result and flags are computed in the same cycle.
  4. Flags are stored or forwarded for decision-making.
Minimal Concrete Example
assign z = (y == 0);
assign n = y[7];
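
To make the register -> ALU -> register loop concrete, here is a minimal sketch of an accumulator datapath that stores its flags. The module name `acc_datapath`, the `en` control input, and the choice to clear all flags on reset are assumptions for illustration, not part of the project specification.

```verilog
// Accumulator datapath sketch (hypothetical names).
// The adder is combinational; the result and flags are captured
// together on the clock edge.
module acc_datapath (
    input  wire       clk,
    input  wire       rst,      // synchronous, active high
    input  wire       en,       // control: accumulate din this cycle
    input  wire [7:0] din,
    output reg  [7:0] acc,
    output reg        z, c, n
);
    // ALU step: 9-bit sum keeps the carry-out in bit 8.
    wire [8:0] sum = acc + din;

    always @(posedge clk) begin
        if (rst) begin
            acc <= 8'h00;
            z   <= 1'b0;        // assumption: flags cleared on reset
            c   <= 1'b0;
            n   <= 1'b0;
        end else if (en) begin
            acc <= sum[7:0];
            z   <= (sum[7:0] == 8'h00);
            c   <= sum[8];
            n   <= sum[7];
        end
    end
endmodule
```

Starting from reset, accumulating `8'hFF` and then `8'h01` leaves `acc = 8'h00` with `z = 1` and `c = 1`.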
Common Misconceptions
  • “Flags are only for CPUs.” -> Many control systems use them too.
  • “Flags don’t need testing.” -> They are part of the contract.
Check-Your-Understanding Questions
  1. When should the zero flag assert?
  2. How do you compute a negative flag for unsigned data?
  3. Why separate datapath and control?
Where You’ll Apply It
  • This project: used in Section 3.2 and Section 4
  • Also used in: P19-calculator-with-7-segment-display.md, Final CPU

Verification with Testbenches and Waveforms

Description/Expanded Explanation of the concept

Testbenches are simulation-only modules that apply stimulus and check outputs. Waveforms (VCD) are the hardware engineer’s microscope; they reveal timing, glitches, and ordering problems. A good testbench is deterministic and covers edge cases.

Definitions & Key Terms
  • Testbench -> a non-synthesizable module that drives a DUT
  • VCD -> Value Change Dump waveform file
  • Deterministic test -> same inputs produce same outputs every run
Mental Model Diagram (ASCII)
[Testbench] -> [DUT] -> [VCD] -> [GTKWave]

Testbench DUT VCD GTKWave flow

How It Works (Step-by-Step)
  1. Initialize inputs to known values.
  2. Apply stimulus over time.
  3. Dump waveforms and check outputs.
  4. Add assertions or PASS/FAIL messages.
Minimal Concrete Example
initial begin
  $dumpfile("wave.vcd");
  $dumpvars(0, tb);
  a = 0; b = 1; #10;
  $finish;
end
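
The fragment above only dumps waveforms. A self-checking version might look like the following sketch. The stand-in DUT `dut_add8` and the `check` task are hypothetical and exist only so the example runs on its own; swap in your ALU and its ports.

```verilog
`timescale 1ns/1ps
// Stand-in DUT (hypothetical) so the example is self-contained.
module dut_add8 (
    input  wire [7:0] a, b,
    output wire [7:0] y
);
    assign y = a + b;
endmodule

module tb;
    reg  [7:0] a, b;
    wire [7:0] y;
    integer    errors;

    dut_add8 dut (.a(a), .b(b), .y(y));

    // Apply one vector, wait for the outputs to settle, then compare.
    task check(input [7:0] va, input [7:0] vb, input [7:0] expected);
        begin
            a = va; b = vb; #10;
            if (y !== expected) begin
                errors = errors + 1;
                $display("FAIL: %h + %h = %h (expected %h)", va, vb, y, expected);
            end
        end
    endtask

    initial begin
        $dumpfile("wave.vcd");
        $dumpvars(0, tb);
        errors = 0;
        check(8'h00, 8'h00, 8'h00);
        check(8'h12, 8'h34, 8'h46);
        check(8'hFF, 8'h01, 8'h00);   // 8-bit result wraps around
        if (errors == 0) $display("PASS");
        else             $display("FAILED with %0d error(s)", errors);
        $finish;
    end
endmodule
```

With the stand-in adder all three checks pass and the run prints `PASS`. The `!==` comparison is used so that an `x` or `z` on the output counts as a mismatch rather than being silently ignored.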
Common Misconceptions
  • “If it simulates once, it’s correct.” -> Cover all relevant cases.
  • “Waveforms are optional.” -> They are often the only way to debug timing.
Check-Your-Understanding Questions
  1. Why keep testbench and DUT separate?
  2. What is the purpose of $dumpvars?
  3. How do you make a testbench deterministic?
Where You’ll Apply It
  • This project: used throughout Section 6 (testing)
  • Also used in: all other projects in this folder

3. Project Specification

3.1 What You Will Build

An 8-bit ALU supporting add/sub/and/or/xor/shift and flags.

3.2 Functional Requirements

  1. Requirement 1: Support at least 6 ALU operations
  2. Requirement 2: Generate zero, carry, and negative flags
  3. Requirement 3: Provide deterministic simulation outputs

3.3 Non-Functional Requirements

  • Performance: Result and flags settle within one clock period at the target frequency.
  • Reliability: Deterministic outputs on all defined inputs.
  • Usability: Clear ports and documented behavior.

3.4 Example Usage / Output

See Section 3.7.3 for a sample transcript line.

3.5 Data Formats / Schemas / Protocols

The port list and opcode encoding are given in Appendix A.1.

3.6 Edge Cases

  • 0x00 + 0x00
  • 0xFF + 0x01
  • 0x00 - 0x01

3.7 Real World Outcome

3.7.1 How to Run (Copy/Paste)

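# Compile first (paths follow the layout in Section 5.2; adjust to your file names)
iverilog -o alu_tb src/top.v src/core.v tb/tb.v
vvp alu_tb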

3.7.2 Golden Path Demo (Deterministic)

Run the demo command above with the provided testbench and confirm the outputs match the golden transcript.

3.7.3 CLI Transcript

OP=ADD 0x12 + 0x34 = 0x46 Z=0 C=0 N=0

3.7.4 Failure Demo (Expected)

# Example failure case
ERROR: Output mismatch at vector 3
Expected: 0x0A, Got: 0x0B
EXIT CODE: 1

Notes:

  • Exit code 0 indicates all tests passed
  • Exit code 1 indicates a test failure

4. Solution Architecture

4.1 High-Level Design

[inputs] -> [core logic] -> [outputs]

Core logic flow

4.2 Key Components

| Component | Responsibility |
|---|---|
| alu_core | Arithmetic and logic operations |
| flag_gen | Zero/carry/negative flag logic |

4.3 Data Structures (No Full Code)

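// Example signals (names follow Appendix A.1; adapt to your design)
wire [7:0] a, b;   // operands
wire [2:0] op;     // opcode
reg  [7:0] y;      // result, assigned in a combinational always block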

4.4 Algorithm Overview

Key Algorithm: Core control flow

  1. Initialize state/reset conditions.
  2. Apply inputs and compute outputs.
  3. Update state on clock edges (if sequential).

Complexity Analysis:

  • Time: O(1) per cycle
  • Space: O(N) for registers and logic

5. Implementation Guide

5.1 Development Environment Setup

iverilog -V   # print the installed Icarus Verilog version
# Ensure GTKWave is installed for waveform viewing

5.2 Project Structure

project-root/
|-- src/
|   |-- top.v
|   |-- core.v
|-- tb/
|   |-- tb.v
|-- Makefile
|-- README.md

Project folder structure

5.3 The Core Question You’re Answering

“How do real CPUs choose arithmetic operations and generate flags?”

5.4 Concepts You Must Understand First

  • Adders
  • Two’s complement
  • Muxes

5.5 Questions to Guide Your Design

  • How many opcode bits do you need?
  • Will shifts be logical or arithmetic?

5.6 Thinking Exercise

Define a 3-bit opcode table for 8 operations.

5.7 The Interview Questions They’ll Ask

  • What is the zero flag?
  • Carry vs overflow: difference?
  • How do you implement subtraction?

5.8 Hints in Layers

  • Implement add/sub first, then bitwise ops.
  • Use a case statement to select output.

5.9 Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| ALU design | Digital Design and Computer Architecture | Ch. 5 |
| Two’s complement | Code (Petzold) | Ch. 12 |

5.10 Implementation Phases

Phase 1: Foundation

Goals:

  • Establish core module structure
  • Implement minimal behavior

Tasks:

  1. Scaffold module ports and internal signals
  2. Write a minimal testbench that compiles

Checkpoint: Simulation runs without errors

Phase 2: Core Functionality

Goals:

  • Implement full logic
  • Verify edge cases

Tasks:

  1. Complete core logic
  2. Add directed tests for edge cases

Checkpoint: All tests pass and waveforms match expectations

Phase 3: Polish & Edge Cases

Goals:

  • Improve readability
  • Document behavior

Tasks:

  1. Add comments and README notes
  2. Expand tests for unusual inputs

Checkpoint: Design is deterministic and documented

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Reset strategy | Sync / Async | Sync | Simpler timing closure |
| Test coverage | Directed / Exhaustive | Exhaustive for small logic | Prevents missed cases |
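
For an 8-bit datapath, an exhaustive sweep over both operands is only 65,536 vectors per opcode, which simulates quickly. The sketch below shows the idea for a single ADD operation; the stand-in DUT `dut_add8c` and the testbench name are hypothetical.

```verilog
`timescale 1ns/1ps
// Stand-in DUT (hypothetical): 8-bit add with carry-out.
module dut_add8c (
    input  wire [7:0] a, b,
    output wire [7:0] y,
    output wire       c
);
    assign {c, y} = a + b;
endmodule

module tb_exhaustive;
    reg  [7:0] a, b;
    wire [7:0] y;
    wire       c;
    integer    i, j, errors;
    reg  [8:0] expected;

    dut_add8c dut (.a(a), .b(b), .y(y), .c(c));

    initial begin
        errors = 0;
        for (i = 0; i < 256; i = i + 1) begin
            for (j = 0; j < 256; j = j + 1) begin
                a = i;
                b = j;
                #1;                      // let combinational outputs settle
                expected = i + j;        // reference model: integer arithmetic
                if ({c, y} !== expected) begin
                    errors = errors + 1;
                    if (errors <= 10)    // cap the amount of log output
                        $display("FAIL: %h + %h -> c=%b y=%h", a, b, c, y);
                end
            end
        end
        if (errors == 0) $display("PASS: 65536 vectors");
        else             $display("FAILED: %0d mismatches", errors);
        $finish;
    end
endmodule
```

Extending this to the full ALU means adding an outer loop over the opcode and a reference expression per operation.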

6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Test core logic | Small vectors |
| Integration Tests | Test modules together | Full system |
| Edge Case Tests | Boundary conditions | Max/min values |

6.2 Critical Test Cases

  1. Test 1: Edge cases for overflow and underflow
  2. Test 2: Random vectors with expected results

6.3 Test Data

Use deterministic vectors and document expected outputs.
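
One way to keep random vectors deterministic is to pass a fixed seed variable to `$random`, so every run replays the same sequence. This is a minimal sketch; the stand-in DUT `dut_xor8`, the seed value, and the vector count are arbitrary choices for illustration.

```verilog
`timescale 1ns/1ps
// Stand-in DUT (hypothetical): XOR built from OR/AND/NOT terms.
module dut_xor8 (
    input  wire [7:0] a, b,
    output wire [7:0] y
);
    assign y = (a | b) & ~(a & b);
endmodule

module tb_random;
    reg  [7:0] a, b;
    wire [7:0] y;
    integer    seed, k, errors;

    dut_xor8 dut (.a(a), .b(b), .y(y));

    initial begin
        seed   = 12345;              // fixed seed: same vectors on every run
        errors = 0;
        for (k = 0; k < 100; k = k + 1) begin
            a = $random(seed);       // low 8 bits of the 32-bit value
            b = $random(seed);
            #1;
            // Log each vector so the expected outputs are documented.
            $display("vec %0d: a=%h b=%h y=%h", k, a, b, y);
            if (y !== (a ^ b)) errors = errors + 1;
        end
        if (errors == 0) $display("PASS");
        else             $display("FAILED: %0d mismatches", errors);
        $finish;
    end
endmodule
```

Saving the printed log from a known-good run gives a transcript that later runs can be compared against.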

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

| Pitfall | Symptom | Solution |
|---|---|---|
| Wrong carry flag | Flag incorrect on subtraction | Define flag behavior and test |

7.2 Debugging Strategies

  • Inspect waveforms at key internal signals
  • Add temporary debug outputs to verify state
  • Reduce testcases to the smallest failing case

7.3 Performance Traps

  • Long combinational paths (for example, a ripple-carry chain followed by a wide result mux) can reduce the maximum clock frequency

8. Extensions & Challenges

8.1 Beginner Extensions

  • Add parameterization for widths
  • Add optional features (enable, reset)
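
As a starting point for the width-parameterization extension, here is a minimal sketch of an adder with flags whose width is a module parameter. The module name `add_flags` and the parameter name `WIDTH` are assumptions for illustration.

```verilog
// Width-parameterized add with flags (hypothetical names).
module add_flags #(
    parameter WIDTH = 8
) (
    input  wire [WIDTH-1:0] a, b,
    output wire [WIDTH-1:0] y,
    output wire             z, c, n
);
    assign {c, y} = a + b;      // carry-out lands in c
    assign z = ~|y;             // reduction NOR: 1 when every bit of y is 0
    assign n = y[WIDTH-1];      // sign bit of the result
endmodule
```

A 16-bit instance would then be written as `add_flags #(.WIDTH(16)) u_add16 (...)`, with no change to the module body.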

8.2 Intermediate Extensions

  • Add configuration registers
  • Build a simple driver or demo program

8.3 Advanced Extensions

  • Integrate with another project in this series
  • Implement a hardware demo on FPGA

9. Real-World Connections

9.1 Industry Applications

  • Digital control systems and embedded peripherals
  • FPGA prototyping and validation

9.2 Related Open Source Projects

  • Yosys / nextpnr toolchain for open-source FPGA flow
  • Example HDL projects in the FPGA community

9.3 Interview Relevance

  • Demonstrates RTL thinking and verification skills

10. Resources

10.1 Essential Reading

  • Digital Design and Computer Architecture - Focus on Ch. 5
  • Code (Petzold) - Focus on Ch. 12

10.2 Video Resources

  • Search for project-specific HDL walkthroughs and waveforms

10.3 Tools & Documentation

  • Icarus Verilog
  • GTKWave
  • See adjacent projects in VERILOG_FROM_ZERO_PROJECTS/

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain the core concept without notes
  • I can predict waveform behavior for basic inputs

11.2 Implementation

  • All functional requirements are met
  • All tests pass
  • Edge cases are documented

11.3 Growth

  • I can explain this project in an interview
  • I documented at least one lesson learned

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Functional requirements implemented
  • Testbench passes
  • Waveforms inspected

Full Completion:

  • All minimum criteria plus
  • Edge cases covered and documented

Excellence (Going Above & Beyond):

  • Hardware demo on FPGA
  • Clear write-up of lessons learned

Appendix A: Deep Dive Walkthrough

A.1 Signal Map and Opcode Contract

  • Inputs: a[7:0], b[7:0], op[2:0]
  • Outputs: y[7:0], z (zero), n (negative), c (carry), v (overflow)

Suggested opcode map (document in your RTL and tests):

  • 000: ADD
  • 001: SUB
  • 010: AND
  • 011: OR
  • 100: XOR
  • 101: SLT (signed)
  • 110: SHL (logical left)
  • 111: SHR (logical right)
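
The opcode map above could be decoded with a single case statement, roughly as sketched here. The module name `alu8_sketch` and the `OP_*` constant names are hypothetical, flags are left out, and the shifts move by one position because the source does not specify a shift amount.

```verilog
// Result-only ALU sketch following the suggested opcode map.
module alu8_sketch (
    input  wire [7:0] a, b,
    input  wire [2:0] op,
    output reg  [7:0] y
);
    localparam OP_ADD = 3'b000, OP_SUB = 3'b001, OP_AND = 3'b010,
               OP_OR  = 3'b011, OP_XOR = 3'b100, OP_SLT = 3'b101,
               OP_SHL = 3'b110, OP_SHR = 3'b111;

    always @* begin
        case (op)
            OP_ADD:  y = a + b;
            OP_SUB:  y = a - b;
            OP_AND:  y = a & b;
            OP_OR:   y = a | b;
            OP_XOR:  y = a ^ b;
            // Signed compare: 1 when a < b as two's complement values.
            OP_SLT:  y = ($signed(a) < $signed(b)) ? 8'd1 : 8'd0;
            OP_SHL:  y = a << 1;   // assumption: shift by one, zero fill
            OP_SHR:  y = a >> 1;   // assumption: shift by one, zero fill
            default: y = 8'h00;    // keeps the block free of latches
        endcase
    end
endmodule
```

For example, `op = 3'b101` with `a = 8'h80` and `b = 8'h01` gives `y = 8'h01`, since 0x80 is -128 when read as signed.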

A.2 Flag Definitions

  • Z: y == 0
  • N: y[7] == 1 (signed negative)
  • C: carry-out of the adder. SUB is computed as a + ~b + 1, so C=1 means no borrow and C=0 means a borrow occurred (the convention used by the vectors in A.3)
  • V: signed overflow for add/sub

A.3 Reference Vectors (ADD/SUB)

| op | a | b | y | c | v | z |
|-----|----|----|----|---|---|---|
| ADD | 7F | 01 | 80 | 0 | 1 | 0 |
| ADD | FF | 01 | 00 | 1 | 0 | 1 |
| SUB | 00 | 01 | FF | 0 | 0 | 0 |
| SUB | 80 | 01 | 7F | 1 | 1 | 0 |

A.4 Dataflow Sketch

         +-------------------+
A ------>|                   |----> Y
B ------>|  ALU core + flags |
OP ----->|                   |
         +-------------------+

ALU core block

A.5 Verification Checklist

  • Verify each opcode with at least 3 vectors (normal, boundary, extreme).
  • Check flags independently of y (especially V and C).
  • Confirm shift operations fill with zeros (logical).

13. Deep Dive Appendix

13.1 Timing and Resource Budget

  • The ALU critical path typically goes through op decode -> mux -> adder/logic -> flags.
  • For FPGA, add/sub uses the carry chain; logic ops map to LUTs. The mux depth often sets the clock limit.
  • If you plan to clock the ALU, reserve timing slack for the slowest operation (usually ADD/SUB).

13.2 Waveform Interpretation Guide

  • Sweep opcode and verify that result and flags (Z, C, N, V) match the expected math.
  • Check flag timing: flags should update in the same cycle as the result.

Examples:

A=0xF0, B=0x30, ADD (unsigned) -> Y=0x20, C=1
A=0x80, B=0x80, ADD (signed) -> Y=0x00, V=1

13.3 Hardware Bring-Up Notes

  • Use DIP switches for A and B, another bank for opcode.
  • Display result on LEDs or a 7-seg bank via a simple hex display module.
  • If your board has few switches, build a small input latch controlled by a button.
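
The input latch mentioned in the last bullet might look like the sketch below, which shares one bank of eight switches between both operands. The module name `operand_latch` and all port names are hypothetical, and `load` is assumed to be already debounced and synchronized to `clk`.

```verilog
// Share one 8-bit switch bank between operands A and B (hypothetical names).
module operand_latch (
    input  wire       clk,
    input  wire       load,     // assumption: debounced, synchronized button
    input  wire       sel,      // 0: capture into a, 1: capture into b
    input  wire [7:0] sw,       // switch bank
    output reg  [7:0] a, b
);
    always @(posedge clk) begin
        if (load) begin
            if (sel) b <= sw;
            else     a <= sw;
        end
    end
endmodule
```

The registers have no reset here, so in simulation they stay unknown until the first load. Add a reset if your board flow needs a defined power-up value.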

13.4 Alternate Implementations and Trade-offs

  • Unified adder: implement ADD/SUB by conditionally inverting B and adding carry-in.
  • Dedicated logic block: keep AND/OR/XOR in a separate module and mux at the end.
  • Pipelined ALU: add a register stage after the ALU for higher clock rates.

13.5 Additional Exercises

  • Add shift and rotate operations with a barrel shifter.
  • Implement compare operations (EQ, LT, GE) and expose a compare flag bus.
  • Add saturating arithmetic for signal-processing style behavior.
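
As a starting point for the saturating-arithmetic exercise, here is a minimal sketch of an unsigned saturating add. The module name `sat_add8` is hypothetical, and a signed variant would need to clamp at both 0x7F and 0x80 instead.

```verilog
// Unsigned saturating add: clamps at 0xFF instead of wrapping (hypothetical name).
module sat_add8 (
    input  wire [7:0] a, b,
    output wire [7:0] y
);
    wire [8:0] full = a + b;                 // bit 8 is the carry-out
    assign y = full[8] ? 8'hFF : full[7:0];  // saturate when the sum overflows
endmodule
```

For `a = 8'hF0` and `b = 8'h30` the wrapping adder would return 0x20, while this version returns 0xFF.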