Project 1: Blink LED on AVR — Bare Metal Arduino

Strip away all abstractions and control an LED with pure register manipulation on an ATmega328P microcontroller—your first step into the world where your code is the only thing between the CPU and the hardware.


Quick Reference

Attribute      │  Value
───────────────┼───────────────────────────────────────────────────────
Difficulty     │  Beginner
Time Estimate  │  Weekend (8-12 hours)
Language       │  C (alt: AVR Assembly)
Platform       │  Arduino Uno / ATmega328P
Prerequisites  │  Basic C programming
Key Topics     │  GPIO, registers, cross-compilation, memory-mapped I/O

1. Learning Objectives

By completing this project, you will:

  1. Understand memory-mapped I/O — Know why writing to specific memory addresses controls hardware peripherals
  2. Read and interpret datasheets — Extract register information from the ATmega328P documentation
  3. Master cross-compilation — Compile C code for a different CPU architecture (AVR vs x86)
  4. Grasp the difference between bare metal and Arduino — Appreciate what the Arduino framework does for you (and what it costs)
  5. Implement software timing — Create delays without library functions using calibrated loops
  6. Minimize code size — Write efficient embedded code that’s 5-6x smaller than Arduino equivalent
  7. Use bitwise operations fluently — Set, clear, toggle, and test individual bits in registers

What You Will NOT Learn (Yet)

  • Hardware interrupts (Project 2: UART and Project 3: Timers)
  • Timer-based precise timing (Project 3)
  • Serial communication for debugging (Project 2)
  • ARM or x86 architectures (Projects 4, 5, 6+)

2. Theoretical Foundation

2.1 Core Concepts

What is Memory-Mapped I/O?

In most microcontrollers, hardware peripherals are controlled by reading and writing to specific memory addresses. The CPU doesn’t distinguish between “real” RAM and “hardware registers”—it’s all just addresses on the memory bus.

Memory Map of ATmega328P:
┌──────────────────────────────────────┐ 0x08FF (2303)
│          SRAM                        │
│          (2KB working memory)        │
├──────────────────────────────────────┤ 0x0100 (256)
│          Extended I/O Registers      │
├──────────────────────────────────────┤ 0x0060 (96)
│          I/O Registers               │ ← DDRB, PORTB, PINB are here!
│          (PORTB at 0x25)             │
├──────────────────────────────────────┤ 0x0020 (32)
│          32 General Purpose          │
│          Registers (R0-R31)          │
└──────────────────────────────────────┘ 0x0000

When you write:   PORTB = 0x20;
CPU executes:     Store value 0x20 at memory address 0x25
Hardware reacts:  Pin PB5 goes HIGH (5V appears on physical pin)
                  LED connected to that pin turns ON

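Key insight: The CPU needs no dedicated GPIO instruction. An ordinary memory store is enough, because the port hardware is wired to respond at that address. (AVR also provides in, out, sbi, and cbi as shorter, faster encodings for the low I/O addresses, but they reach the same registers.)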
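As a rough illustration of that store, here is a host-side sketch in which an ordinary array stands in for the ATmega328P data address space, so it can run on a PC. The names data_space, MMIO8, PORTB_ADDR, and portb_write_0x20 are made up for this example. On the real chip the macro would cast the address directly, which is roughly what the PORTB macro from <avr/io.h> expands to.

```c
#include <stdint.h>

/* Stand-in for the AVR data address space (0x0000-0x08FF) so this runs on a PC.
   Hypothetical name; on real hardware there is no such array. */
static uint8_t data_space[0x0900];

/* Treat an address as an 8-bit register. On the ATmega328P itself this would be
   (*(volatile uint8_t *)(addr)), with no array involved. */
#define MMIO8(addr) (*(volatile uint8_t *)&data_space[(addr)])

#define PORTB_ADDR 0x25u  /* data-space address of PORTB, from the memory map above */

/* Equivalent of "PORTB = 0x20;": an ordinary store to address 0x25. */
void portb_write_0x20(void) {
    MMIO8(PORTB_ADDR) = 0x20;
}
```

After calling portb_write_0x20(), only the byte at index 0x25 has changed. On hardware, that same store is what raises PB5.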

The Three GPIO Registers Per Port

Every GPIO port on AVR has three registers that work together:

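Register  │  Address (Port B)  │  Purpose         │  Read Behavior             │  Write Behavior
──────────┼────────────────────┼──────────────────┼────────────────────────────┼────────────────────────────────────────────────────────────────
DDRx      │  0x24              │  Data Direction  │  Current direction         │  0=input, 1=output
PORTx     │  0x25              │  Data/Pull-up    │  Current output state      │  Output value (or enable pull-up for inputs)
PINx      │  0x23              │  Input Pins      │  Actual pin voltage level  │  Writing 1 toggles the corresponding PORTx bit (AVR-specific)

GPIO Register Flow: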
                    ┌─────────────────────────────────────────────┐
                    │                  DDRB Register               │
                    │  Bit 7  Bit 6  Bit 5  Bit 4  Bit 3  ...     │
                    │    0      0      1      0      0            │
                    │                  ↓                           │
                    │              OUTPUT MODE                     │
                    │            (for LED pin)                     │
                    └─────────────────────────────────────────────┘
                                         │
                                         ▼
                    ┌─────────────────────────────────────────────┐
                    │                 PORTB Register               │
                    │    0      0      1      0      0            │
                    │                  ↓                           │
                    │              5V OUTPUT                       │
                    │            (LED turns ON)                    │
                    └─────────────────────────────────────────────┘
                                         │
                                         ▼
                                    Physical Pin
                                         │
                                         ▼
                                    ┌─────────┐
                                    │   LED   │
                                    │   💡    │
                                    └────┬────┘
                                         │
                                        GND
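
The interplay between DDRx and PORTx can be summarized in a small decoder. This is only a sketch that runs on a PC against copies of the register values. The names pin_state_t and pin_state are hypothetical, and it assumes the pull-ups have not been globally disabled.

```c
#include <stdint.h>

typedef enum {
    PIN_INPUT_FLOATING,  /* DDR bit 0, PORT bit 0 */
    PIN_INPUT_PULLUP,    /* DDR bit 0, PORT bit 1 */
    PIN_OUTPUT_LOW,      /* DDR bit 1, PORT bit 0 */
    PIN_OUTPUT_HIGH      /* DDR bit 1, PORT bit 1 */
} pin_state_t;

/* Decode what one pin is doing from copies of its port's DDRx and PORTx values. */
pin_state_t pin_state(uint8_t ddr, uint8_t port, uint8_t bit) {
    uint8_t mask = (uint8_t)(1u << bit);
    if (ddr & mask) {
        return (port & mask) ? PIN_OUTPUT_HIGH : PIN_OUTPUT_LOW;
    }
    return (port & mask) ? PIN_INPUT_PULLUP : PIN_INPUT_FLOATING;
}
```

For the LED case in the diagram, pin_state(0x20, 0x20, 5) yields PIN_OUTPUT_HIGH. With DDRB left at 0, the same PORTB value would only enable the pull-up on PB5.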

Bitwise Operations: The Language of Hardware

Bare metal programming relies heavily on bitwise operations. Here’s why: registers pack multiple controls into a single byte. You must manipulate individual bits without disturbing others.

// SET bit 5 (turn ON pin PB5) - leaves other bits unchanged
PORTB |= (1 << 5);      // PORTB = PORTB | 0b00100000
                         // If PORTB was 0b00000011, result is 0b00100011

// CLEAR bit 5 (turn OFF pin PB5) - leaves other bits unchanged
PORTB &= ~(1 << 5);     // PORTB = PORTB & 0b11011111
                         // If PORTB was 0b00100011, result is 0b00000011

// TOGGLE bit 5 (flip state) - leaves other bits unchanged
PORTB ^= (1 << 5);      // XOR with 0b00100000

// CHECK if bit 5 is set (non-zero means set)
if (PORTB & (1 << 5)) { /* PORTB bit 5 is set; read PINB for the actual pin level */ }

The shift operator explained:

  • 1 << 0 = 0b00000001 = 1
  • 1 << 1 = 0b00000010 = 2
  • 1 << 5 = 0b00100000 = 32 (0x20)
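
To experiment with these operations away from the hardware, the four idioms can be wrapped in small pure functions. This is a sketch with hypothetical helper names (with_bit_set and friends). Each one returns the new value rather than writing a register, so it runs on any PC.

```c
#include <stdint.h>
#include <stdbool.h>

/* Each helper returns the new register value instead of touching hardware,
   so the results can be checked on a PC. */
uint8_t with_bit_set(uint8_t reg, uint8_t bit)     { return (uint8_t)(reg |  (1u << bit)); }
uint8_t with_bit_cleared(uint8_t reg, uint8_t bit) { return (uint8_t)(reg & ~(1u << bit)); }
uint8_t with_bit_toggled(uint8_t reg, uint8_t bit) { return (uint8_t)(reg ^  (1u << bit)); }
bool    has_bit(uint8_t reg, uint8_t bit)          { return (reg & (1u << bit)) != 0; }
```

Starting from 0b00000011, with_bit_set(0x03, 5) gives 0x23 and with_bit_cleared(0x23, 5) gives 0x03 again, matching the comments in the listing above.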

2.2 Why This Matters

Understanding What Abstractions Hide

When you use Arduino’s digitalWrite(13, HIGH), here’s what actually happens:

Arduino Framework Execution:
┌─────────────────────────────────────────────────────────────────┐
│  digitalWrite(13, HIGH);                                        │
│           │                                                     │
│           ▼                                                     │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │  digitalPinToPort(13)  →  Returns: &PORTB               │   │
│  │  digitalPinToBitMask(13)  →  Returns: 0x20 (bit 5)      │   │
│  │  Check: is this pin a timer output? Handle PWM disable  │   │
│  │  cli()  →  Disable interrupts (atomic operation start)  │   │
│  │  temp = *port  →  Read current PORTB value              │   │
│  │  temp |= bitmask  →  OR in the new bit                  │   │
│  │  *port = temp  →  Write back to PORTB                   │   │
│  │  sei()  →  Re-enable interrupts                         │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│  Machine code: ~15-20 instructions                              │
│  Flash usage: ~900 bytes for minimal program                    │
│  Execution time: ~50-100 cycles                                 │
└─────────────────────────────────────────────────────────────────┘

Your Bare Metal Code:
┌─────────────────────────────────────────────────────────────────┐
│  PORTB |= (1 << PB5);                                          │
│           │                                                     │
│           ▼                                                     │
│  sbi PORTB, 5    ← Single AVR instruction!                     │
│                                                                 │
│  Machine code: 1 instruction (2 bytes)                         │
│  Flash usage: ~150-200 bytes for minimal program               │
│  Execution time: 2 cycles                                       │
└─────────────────────────────────────────────────────────────────┘

Size comparison: Arduino ~900 bytes vs Bare Metal ~176 bytes = 5.1x smaller!
Speed comparison: Arduino ~50+ cycles vs Bare Metal 2 cycles = 25x faster!

Real-World Applications

The same techniques you learn here apply to:

Industry         │  Application                            │  Why Bare Metal Matters
─────────────────┼─────────────────────────────────────────┼────────────────────────────────────────────────────────────────
Medical Devices  │  Pacemakers, insulin pumps              │  Timing-critical, ultra-low power, certification requirements
Automotive       │  Engine control units (ECUs)            │  Real-time requirements, fail-safe operation
Aerospace        │  Flight controllers, satellite systems  │  Size/weight/power constraints, radiation hardening
Industrial       │  PLC controllers, robotics              │  Deterministic timing, reliability
Consumer IoT     │  Battery-powered sensors                │  Years of operation on coin cell battery
Audio            │  DSP effects, synthesizers              │  Sample-accurate timing at 44.1kHz+

2.3 ATmega328P Architecture

The ATmega328P is an 8-bit microcontroller based on the AVR RISC architecture:

ATmega328P Block Diagram (Simplified):
┌─────────────────────────────────────────────────────────────────────┐
│                          ATmega328P                                  │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                     CPU Core                                 │   │
│  │  ┌──────────────┐   ┌──────────────┐   ┌─────────────────┐ │   │
│  │  │   ALU        │   │  32 General  │   │  Program Counter │ │   │
│  │  │  (8-bit)     │   │  Purpose     │   │  (14-bit)        │ │   │
│  │  │              │   │  Registers   │   │                   │ │   │
│  │  └──────────────┘   │  (R0-R31)    │   └─────────────────┘ │   │
│  │                     └──────────────┘                        │   │
│  └─────────────────────────────────────────────────────────────┘   │
│           │                    │                    │                │
│           │                    │                    │                │
│  ┌────────▼────────┐  ┌────────▼────────┐  ┌───────▼──────────┐   │
│  │  Flash Memory   │  │   SRAM          │  │   EEPROM         │   │
│  │  (32KB)         │  │   (2KB)         │  │   (1KB)          │   │
│  │  Program Storage│  │   Variables     │  │   Persistent     │   │
│  └─────────────────┘  └─────────────────┘  └──────────────────┘   │
│                                                                     │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                   I/O Peripherals                            │   │
│  │  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────────────┐ │   │
│  │  │ GPIO    │  │ Timer0  │  │ Timer1  │  │ USART (Serial)  │ │   │
│  │  │ Ports   │  │ (8-bit) │  │ (16-bit)│  │                 │ │   │
│  │  │ B, C, D │  │         │  │         │  │                 │ │   │
│  │  └────┬────┘  └─────────┘  └─────────┘  └─────────────────┘ │   │
│  └───────┼──────────────────────────────────────────────────────┘   │
│          │                                                          │
└──────────┼──────────────────────────────────────────────────────────┘
           │
           ▼
      Physical Pins
      (28-pin DIP or 32-pin TQFP)

Key Specifications:

  • Clock Speed: 16 MHz (with external crystal on Arduino Uno)
  • Flash: 32 KB (program storage) — minus 0.5KB for bootloader
  • SRAM: 2 KB (variables, stack)
  • EEPROM: 1 KB (persistent storage)
  • GPIO: 23 I/O pins (across 3 ports)
  • Architecture: Harvard (separate program and data memory)
  • Instruction Width: 16-bit (most instructions)

Arduino Uno Pin 13 = ATmega328P PB5:

Arduino Pin  │  ATmega328P Pin  │  Port/Bit  │  Function
─────────────┼──────────────────┼────────────┼─────────────────
Pin 13       │  Pin 19 (DIP)    │  PB5       │  Built-in LED
             │  Pin 17 (TQFP)   │            │  Also: SCK (SPI)
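
On the Uno, digital pins 8 to 13 sit on Port B as PB0 to PB5, so the translation from an Arduino pin number to a port bit is a subtraction. A minimal sketch, with hypothetical function names, that covers only Port B:

```c
#include <stdint.h>

/* Map an Arduino Uno digital pin in the range 8-13 to its Port B bit number.
   Returns -1 for pins that are not on Port B. */
int uno_pin_to_portb_bit(int arduino_pin) {
    if (arduino_pin < 8 || arduino_pin > 13) {
        return -1;
    }
    return arduino_pin - 8;  /* D8 -> PB0, ..., D13 -> PB5 */
}

/* Bit mask for that pin, or 0 if the pin is not on Port B. */
uint8_t uno_pin_to_portb_mask(int arduino_pin) {
    int bit = uno_pin_to_portb_bit(arduino_pin);
    return (bit < 0) ? 0 : (uint8_t)(1u << bit);
}
```

For pin 13 this returns bit 5 and mask 0x20, the same value the LED examples in this project use.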

2.4 Common Misconceptions

Misconception                        │  Reality
─────────────────────────────────────┼──────────────────────────────────────────────────────────────────────────────────────
“Bare metal is only for experts”     │  It’s actually simpler: fewer layers to understand, more predictable behavior
“Arduino library is more efficient”  │  Arduino adds significant overhead for safety and portability (~5x code size)
“I need assembly for bare metal”     │  C compiles to excellent code; assembly is rarely needed except for critical timing
“Registers are like variables”       │  Registers may have side effects when read/written (reading clears flags, etc.)
“volatile is optional”               │  Without volatile, the compiler may optimize away hardware accesses, causing bugs
“Any delay loop works”               │  Compiler optimizations can eliminate delay loops; the counter must be volatile
“Pin numbers are the same”           │  Arduino pin numbers differ from ATmega328P port/bit designations

3. Project Specification

3.1 What You Will Build

A standalone C program that:

  1. Compiles with avr-gcc (not Arduino IDE)
  2. Configures pin 13 (PB5) as an output
  3. Toggles the LED on and off in an infinite loop
  4. Uses a software delay (no timer interrupts yet)
  5. Produces a binary smaller than 250 bytes

Program Flow:
┌────────────────────┐
│    Power On /      │
│    Reset           │
└─────────┬──────────┘
          │
          ▼
┌────────────────────┐
│  Configure GPIO    │◄─── Set DDRB bit 5 = 1 (output mode)
│  (one-time setup)  │
└─────────┬──────────┘
          │
          ▼
┌────────────────────┐
│     LED ON         │◄─── Set PORTB bit 5 = 1 (5V on pin)
└─────────┬──────────┘
          │
          ▼
┌────────────────────┐
│   Delay ~500ms     │◄─── Software loop (calibrated)
└─────────┬──────────┘
          │
          ▼
┌────────────────────┐
│     LED OFF        │◄─── Clear PORTB bit 5 = 0 (0V on pin)
└─────────┬──────────┘
          │
          ▼
┌────────────────────┐
│   Delay ~500ms     │
└─────────┬──────────┘
          │
          │ (loop forever)
          └──────────────────────────────────────┐
                                                 │
                                                 ▼
                                          Return to LED ON

3.2 Functional Requirements

ID   │  Requirement                                               │  Verification Method
─────┼────────────────────────────────────────────────────────────┼─────────────────────────────────────
FR1  │  LED blinks at approximately 1 Hz (on 500ms, off 500ms)    │  Visual observation with stopwatch
FR2  │  Uses direct register manipulation (no Arduino functions)  │  Code review
FR3  │  Compiles with avr-gcc and standard Makefile               │  Build succeeds without errors
FR4  │  Flash size under 250 bytes                                │  Check with avr-size
FR5  │  Works on Arduino Uno (ATmega328P @ 16MHz)                 │  Hardware test
FR6  │  No external dependencies except avr-libc headers          │  Code review

3.3 Non-Functional Requirements

ID    │  Requirement                                             │  Rationale
──────┼──────────────────────────────────────────────────────────┼───────────────────────────────────────
NFR1  │  Code has comments explaining each register access       │  Educational value
NFR2  │  Makefile has separate compile, link, and flash targets  │  Build system best practices
NFR3  │  Timing accuracy within ±20% of target                   │  Acceptable for visual demonstration
NFR4  │  Code compiles without warnings using -Wall              │  Code quality

3.4 Example Output

Build Process:

$ make
avr-gcc -mmcu=atmega328p -Os -Wall -c blink.c -o blink.o
avr-gcc -mmcu=atmega328p -o blink.elf blink.o
avr-objcopy -O ihex blink.elf blink.hex
avr-size blink.elf
   text    data     bss     dec     hex filename
    176       0       0     176      b0 blink.elf

Flash Process:

$ make flash
avrdude -p m328p -c arduino -P /dev/ttyACM0 -b 115200 -U flash:w:blink.hex

avrdude: AVR device initialized and ready to accept instructions
avrdude: Device signature = 0x1e950f (ATmega328P)
avrdude: NOTE: "flash" memory has been specified, an erase cycle will be performed
avrdude: Erasing chip
avrdude: Reading input file "blink.hex"
avrdude: Writing flash (176 bytes)
avrdude: 176 bytes of flash verified

avrdude done.  Thank you.

Comparison with Arduino:

# Arduino IDE Blink sketch
$ arduino-cli compile --fqbn arduino:avr:uno Blink
Sketch uses 924 bytes (2%) of program storage space.

# Your bare metal version
$ avr-size blink.elf
   text    data     bss     dec     hex filename
    176       0       0     176      b0 blink.elf

# Result: 176 bytes vs 924 bytes = 5.25x smaller!

3.5 Real World Outcome

When complete, you will have:

  1. A working bare metal program running on real hardware without any framework
  2. Understanding of the build process: source → compile → link → hex → flash
  3. Ability to read the ATmega328P datasheet to find any register or peripheral
  4. Foundation for all future projects in this series
  5. Talking points for interviews: “I’ve written bare metal firmware that’s 5x smaller than framework code”
  6. Confidence to approach any microcontroller—the concepts transfer

4. Solution Architecture

4.1 High-Level Design

Source Files                  Build Process                 Hardware
┌──────────────┐             ┌──────────────┐              ┌──────────────┐
│   blink.c    │ ──────────► │   avr-gcc    │              │  ATmega328P  │
│              │             │  (compile)   │              │              │
│ - main()     │             └──────┬───────┘              │ ┌──────────┐ │
│ - delay_ms() │                    │                      │ │  Flash   │ │
└──────────────┘                    │                      │ │ 32KB     │ │
                                    ▼                      │ └────┬─────┘ │
┌──────────────┐             ┌──────────────┐              │      │       │
│  Makefile    │             │   blink.o    │              │      ▼       │
│              │             │   (object)   │              │ ┌──────────┐ │
│ - all        │             └──────┬───────┘              │ │   CPU    │ │
│ - flash      │                    │                      │ │  16MHz   │ │
│ - clean      │                    │                      │ └────┬─────┘ │
└──────────────┘                    │                      │      │       │
                                    ▼                      │      ▼       │
                             ┌──────────────┐              │ ┌──────────┐ │
                             │  blink.elf   │              │ │  GPIO    │ │
                             │  (linked)    │              │ │ PORTB    │─┼──► LED
                             └──────┬───────┘              │ └──────────┘ │
                                    │                      │              │
                                    ▼                      └──────────────┘
                             ┌──────────────┐                     ▲
                             │  blink.hex   │                     │
                             │  (Intel HEX) │ ────────────────────┘
                             └──────────────┘      avrdude
                                                  (programmer)

4.2 Key Components

Component   │  Purpose                                 │  Implementation Details
────────────┼──────────────────────────────────────────┼────────────────────────────────────────
main()      │  Entry point, GPIO setup, infinite loop  │  Sets DDRB, loops toggling PORTB
delay_ms()  │  Software delay using volatile counter   │  Calibrated nested loops
Makefile    │  Build automation                        │  Targets for compile, link, hex, flash

4.3 Register Map

These are the registers you’ll manipulate:

Register  │  Address  │  Purpose in This Project
──────────┼───────────┼────────────────────────────────────────
DDRB      │  0x24     │  Set bit 5 to 1 to make PB5 an output
PORTB     │  0x25     │  Set/clear bit 5 to turn LED on/off

DDRB Bit Layout (Data Direction Register B):

Bit:     7      6      5      4      3      2      1      0
Pin:   PB7    PB6    PB5    PB4    PB3    PB2    PB1    PB0
       (n/a)  (n/a)  LED    D12    D11    D10    D9     D8
                     ↑
                     Set this to 1 for output

PORTB Bit Layout (Port B Data Register):

Bit:     7      6      5      4      3      2      1      0
Pin:   PB7    PB6    PB5    PB4    PB3    PB2    PB1    PB0
                     ↑
                     Set to 1 = LED ON (5V)
                     Set to 0 = LED OFF (0V)

4.4 Timing Analysis

Software Delay Calculation:

Given:

  • CPU Clock: 16 MHz = 16,000,000 cycles/second
  • Target delay: 500 ms = 0.5 seconds
  • Total cycles needed: 16,000,000 × 0.5 = 8,000,000 cycles

For a simple decrement loop:

while (count--) { }

This compiles to approximately:

loop:
    subi r24, 1     ; 1 cycle (subtract 1 from low byte)
    sbci r25, 0     ; 1 cycle (subtract carry from high byte)
    brne loop       ; 2 cycles if branch taken, 1 if not
                    ; Total: ~4 cycles per iteration

So for 1ms delay:

  • Cycles per ms: 16,000
  • Cycles per loop: ~4
  • Iterations per ms: 16,000 / 4 = 4,000

For 500ms:

  • Call delay_ms(500)
  • Inner loop runs 4,000 times per millisecond
  • Outer loop runs 500 times

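Note: Actual timing depends on compiler optimization and the exact instruction sequence. In particular, a volatile counter is kept in memory and is loaded and stored on every pass, so each iteration costs more than the ~4 cycles of the register-only loop shown above and the delay runs longer than calculated. Treat 4,000 as a starting value and calibrate by observation.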
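
The arithmetic above can be sketched as two small functions. The cycle cost per iteration is left as a parameter because it depends on the code the compiler actually emits. The function names are hypothetical and the code runs on a PC.

```c
#include <stdint.h>

/* Loop iterations needed for a 1 ms delay, given the clock in Hz and an
   estimated cycle cost per loop iteration. */
uint32_t iterations_per_ms(uint32_t f_cpu_hz, uint32_t cycles_per_iteration) {
    uint32_t cycles_per_ms = f_cpu_hz / 1000u;
    return cycles_per_ms / cycles_per_iteration;
}

/* Total CPU cycles spent in a delay of the given length. */
uint32_t cycles_for_delay(uint32_t f_cpu_hz, uint32_t delay_ms) {
    return (f_cpu_hz / 1000u) * delay_ms;
}
```

With the figures from this section, iterations_per_ms(16000000, 4) gives 4000 and cycles_for_delay(16000000, 500) gives 8000000. If the disassembly shows a costlier loop body, pass that cycle count instead.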


5. Implementation Guide

5.1 Development Environment Setup

macOS

# Install Homebrew if not present
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install AVR toolchain
brew tap osx-cross/avr
brew install avr-gcc avrdude

# Verify installation
avr-gcc --version
# Expected: avr-gcc (GCC) 12.x.x or similar

avrdude -v
# Expected: avrdude version 7.x

# Find your Arduino's serial port
ls /dev/tty.usbmodem*  # or /dev/tty.usbserial*

Linux (Ubuntu/Debian)

sudo apt update
sudo apt install gcc-avr avr-libc avrdude make

# Verify
avr-gcc --version

# Find your Arduino's serial port
ls /dev/ttyACM*  # or /dev/ttyUSB*

# Add yourself to dialout group for serial port access
sudo usermod -a -G dialout $USER
# Log out and back in for this to take effect

Windows

Option 1: WSL (Windows Subsystem for Linux)

# In WSL Ubuntu, follow Linux instructions above
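# For serial port access under WSL 1, the port will be /dev/ttyS[N] where N corresponds to COM port
# WSL 2 does not map COM ports this way; the USB device must first be attached to WSL (for example with usbipd-win)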

Option 2: Native Windows

  1. Download and install WinAVR or use MSYS2:
    pacman -S mingw-w64-x86_64-avr-gcc mingw-w64-x86_64-avrdude
    

5.2 Project Structure

blink/
├── blink.c           # Main source code (~30 lines)
├── Makefile          # Build automation (~25 lines)
└── README.md         # Project documentation (optional)

Minimal, focused directory structure—bare metal projects should stay simple.

5.3 The Core Question You’re Answering

“How do I control hardware without an operating system or library?”

The answer has three parts:

  1. The CPU treats hardware registers as memory addresses
    • Writing to address 0x25 (PORTB) changes the voltage on physical pins
    • No special I/O instructions needed—just memory loads and stores
  2. The volatile keyword ensures actual memory access
    • Without volatile, the compiler may cache values in CPU registers
    • Hardware requires actual memory bus transactions
  3. The datasheet is your API documentation
    • No header files describe what registers do—only the datasheet
    • Learning to read datasheets is the core skill of embedded development

5.4 Concepts You Must Understand First

Before writing code, ensure you can answer these questions:

Concept            │  Self-Test Question                                              │  Resource
───────────────────┼──────────────────────────────────────────────────────────────────┼───────────────────────────────────
Binary/Hex         │  “What is 0x20 in binary? What is 0b00100000 in decimal?”        │  Any C book, Ch. on number bases
Bitwise ops        │  “What is the result of (1 << 5)? What about 0x03 | 0x20?”       │  K&R Ch. 2.9
Pointers           │  “What does *(volatile uint8_t *)0x25 = 0x20 do step by step?”   │  K&R Ch. 5
volatile           │  “Why can’t the compiler cache hardware register values?”        │  See explanation below
Cross-compilation  │  “Why can’t I run avr-gcc output on my laptop?”                  │  GCC documentation

Understanding volatile:

// WITHOUT volatile - the compiler may merge the two writes into one:
int *ptr = (int *)0x1000;
*ptr = 1;   // Compiler might remove this...
*ptr = 2;   // ...and keep only this

// WITH volatile - compiler must emit both writes:
volatile int *ptr = (volatile int *)0x1000;
*ptr = 1;   // Compiler MUST write 1
*ptr = 2;   // Compiler MUST write 2

// For hardware, both writes matter because the device
// might be watching for state changes!

5.5 Questions to Guide Your Design

Hardware Questions (Answer from Datasheet)

  1. Which pin on the Arduino Uno has the built-in LED?
  2. What port and bit does that pin correspond to on the ATmega328P?
  3. What is the clock speed of the ATmega328P on Arduino Uno?
  4. What voltage does the ATmega328P run at?

Software Questions (Answer Through Implementation)

  1. How will you create a delay without using delay() or timers?
  2. How many loop iterations are needed for ~500ms at 16MHz?
  3. How will you prevent the compiler from optimizing away your delay loop?
  4. Should you use SET/CLEAR or XOR for toggling? What are the tradeoffs?

Toolchain Questions (Answer by Running Commands)

  1. What compiler flags are required to target ATmega328P?
  2. How do you convert ELF to Intel HEX format?
  3. What parameters does avrdude need for Arduino Uno?
  4. How do you check the size of your compiled binary?

5.6 Thinking Exercise

Before writing any code, trace through this instruction on paper:

PORTB |= (1 << 5);

Step-by-step trace:

  1. Evaluate (1 << 5)
    • Start with 1: 0b00000001
    • Shift left 5 positions: 0b00100000 = 32 = 0x20
  2. Read current PORTB value
    • Assume PORTB contains 0b00000011 (pins 0 and 1 are high)
  3. Perform bitwise OR
    • 0b00000011 | 0b00100000 = 0b00100011
    • Bit 5 is now set, bits 0 and 1 remain set
  4. Write result back to PORTB
    • PORTB now contains 0b00100011
  5. Hardware effect
    • Bit 5 of PORTB corresponds to pin PB5
    • The ATmega328P drives that pin HIGH (5V)
    • Current flows through the LED, turning it ON

Now trace PORTB &= ~(1 << 5); yourself. What does the ~ operator do? What’s the final value if PORTB started at 0b00100011?
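
One detail worth checking while tracing: the ~ operator works on an int, not on a byte, so the inverted mask is wider than the register until it is stored. The sketch below, with hypothetical function names, runs on a PC and shows both views of the mask.

```c
#include <stdint.h>

/* ~ is applied after the operand is promoted to int, so the result is wider
   than 8 bits: every bit above bit 7 is also 1 (the value is negative). */
int inverted_mask_as_int(uint8_t bit) {
    return ~(1 << bit);
}

/* Storing into an 8-bit register keeps only the low byte of that int. */
uint8_t inverted_mask_as_byte(uint8_t bit) {
    return (uint8_t)inverted_mask_as_int(bit);
}
```

Calling both with bit 5 and comparing the results against your paper trace is a quick way to confirm which bits survive the AND.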

5.7 Hints in Layers

Hint 1: Starting Point (Conceptual Direction)

Look at the ATmega328P datasheet, Section 14 (I/O Ports). Find the register summary table. Pin 13 on Arduino is PB5—that’s bit 5 of Port B.

Your code will have this basic structure:

  • Include a header for register definitions
  • A function to delay
  • main() that sets up GPIO and loops forever

Hint 2: Next Level (More Specific Guidance)

#include <avr/io.h>  // Provides DDRB, PORTB, PB5 macros

void delay_ms(unsigned int ms);  // You'll implement this

int main(void) {
    // 1. Set PB5 as output
    // 2. Loop forever: ON, delay, OFF, delay
}

void delay_ms(unsigned int ms) {
    // Use nested loops with volatile counters
}

The <avr/io.h> header automatically includes the correct definitions for your target MCU (specified by -mmcu=atmega328p).

Hint 3: Technical Details (Approach/Pseudocode)

For the delay function at 16MHz:

  • 16,000,000 cycles per second
  • For 1ms, you need ~16,000 cycles
  • A simple while(count--) loop takes ~4 cycles per iteration
  • So ~4,000 iterations per millisecond

void delay_ms(unsigned int ms) {
    while (ms--) {
        // Inner loop for ~1ms
        volatile unsigned int count = 4000;  // Calibrate this!
        while (count--) {
            // Empty body - volatile prevents optimization
        }
    }
}

For GPIO:

DDRB |= (1 << PB5);   // PB5 = output (LED pin)

while (1) {
    PORTB |= (1 << PB5);   // LED ON
    delay_ms(500);
    PORTB &= ~(1 << PB5);  // LED OFF
    delay_ms(500);
}

Hint 4: Verification Methods

# After flashing, use a stopwatch or phone timer
# LED should complete one on/off cycle per second
# If too fast: increase delay counter
# If too slow: decrease delay counter

# Check binary size:
avr-size blink.elf
# text should be under 250 bytes

# Verify hex file was created:
ls -la blink.hex

# If LED doesn't blink at all:
# 1. Check port/pin (PB5 = Arduino pin 13)
# 2. Check DDRB is set (output mode)
# 3. Check delay isn't optimized away (volatile?)

5.8 The Interview Questions They’ll Ask

After completing this project, you should be able to answer:

  1. “What is memory-mapped I/O and why is it used?”
    • Hardware registers appear as memory addresses
    • CPU uses standard load/store instructions
    • Simplifies hardware interface—no special I/O instructions needed
    • Allows C code to directly manipulate hardware
  2. “Why use volatile for hardware registers?”
    • Without volatile, compiler may:
      • Cache register values in CPU registers
      • Reorder or eliminate memory accesses
      • Combine multiple writes into one
    • Hardware expects actual memory bus transactions
    • volatile forces the compiler to emit every access
  3. “What’s the difference between DDRB and PORTB?”
    • DDRB: Data Direction Register—sets pin as input (0) or output (1)
    • PORTB: Data Register—sets output value (HIGH/LOW) or enables pull-up for inputs
    • PORT drives the pin only while the matching DDR bit is 1; with the DDR bit at 0, writing PORT switches the pull-up instead
  4. “How would you make the timing more accurate?”
    • Use hardware timers instead of software loops
    • Timers count independently of CPU execution
    • Can generate interrupts at precise intervals
    • (This is covered in Project 3)
  5. “What happens at address 0x0000 when the chip powers on?”
    • Reset vector: the address of the first instruction executed after reset (on an Uno, the bootloader runs first and then jumps here)
    • Contains a jump to the start of your program
    • Part of the interrupt vector table
  6. “How does cross-compilation work?”
    • Compiler runs on host (x86) but generates code for target (AVR)
    • Uses target’s instruction set and register model
    • Links against target’s libraries (avr-libc)
    • Output cannot run on host—only on target hardware
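
One way to see cross-compilation from inside the code is to look at the macros each compiler predefines. avr-gcc defines __AVR__, and with -mmcu=atmega328p it also defines __AVR_ATmega328P__; a native host compiler defines neither. The enum and function names below are hypothetical.

```c
/* Which target this translation unit was compiled for. */
typedef enum {
    TARGET_HOST,        /* native compiler on the development machine */
    TARGET_AVR_OTHER,   /* avr-gcc, but a different -mmcu device */
    TARGET_ATMEGA328P   /* avr-gcc -mmcu=atmega328p */
} build_target_t;

/* The answer is fixed at compile time by the compiler's predefined macros. */
build_target_t build_target(void) {
#if defined(__AVR_ATmega328P__)
    return TARGET_ATMEGA328P;
#elif defined(__AVR__)
    return TARGET_AVR_OTHER;
#else
    return TARGET_HOST;
#endif
}
```

The same source file yields a different function body depending on which compiler builds it, which is why the avr-gcc output is useless on the laptop that produced it.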

5.9 Books That Will Help

Topic                       │  Book                                             │  Chapter/Section
────────────────────────────┼───────────────────────────────────────────────────┼─────────────────────────────
AVR Architecture Overview   │  “Make: AVR Programming” by Elliot Williams       │  Chapters 1-2
GPIO and Port Manipulation  │  “AVR Workshop” by John Boxall                    │  Chapter 2
Embedded C Fundamentals     │  “Making Embedded Systems” by Elecia White        │  Chapter 2
Bitwise Operations          │  “C Programming: A Modern Approach” by K.N. King  │  Chapter 20
Cross-Compilation           │  GCC Manual                                       │  Section 3 (Invoking GCC)
ATmega328P Details          │  ATmega328P Datasheet                             │  Section 14 (I/O Ports)
Build Systems               │  “The GNU Make Book”                              │  Chapter 1

5.10 Implementation Phases

Phase 1: Minimal Blink

Goal: Get any LED blinking; don’t worry about timing accuracy yet.

// Minimal test - inline delay, no functions
#include <avr/io.h>

int main(void) {
    DDRB |= (1 << PB5);  // Set PB5 as output

    while (1) {
        PORTB |= (1 << PB5);
        for (volatile long i = 0; i < 100000; i++);
        PORTB &= ~(1 << PB5);
        for (volatile long i = 0; i < 100000; i++);
    }
}

Checkpoint: LED blinks (at any rate). If not, debug hardware connection and build process.

Phase 2: Proper Delay Function (2 hours)

Goal: Calibrated 500ms delay in a reusable function.

#include <avr/io.h>

#define F_CPU 16000000UL  // 16 MHz clock (documents the assumption; the loop below is calibrated by hand)

void delay_ms(unsigned int ms) {
    while (ms--) {
        // Approximately 1ms at 16MHz
        // Calibrate this value based on actual timing
        for (volatile unsigned int i = 0; i < 4000; i++);
    }
}

int main(void) {
    DDRB |= (1 << PB5);

    while (1) {
        PORTB |= (1 << PB5);
        delay_ms(500);
        PORTB &= ~(1 << PB5);
        delay_ms(500);
    }
}

Checkpoint: LED blinks approximately once per second (use stopwatch).

Phase 3: Complete Project with Makefile (4 hours)

Goal: Professional build system, optimized code, documented.

Create a Makefile with proper targets, add comments to code explaining each line, verify size requirements.

5.11 Key Implementation Decisions

Decision          │  Options                 │  Recommended    │  Rationale
──────────────────┼──────────────────────────┼─────────────────┼──────────────────────────────────────────────────
LED pin           │  Any GPIO                │  PB5 (pin 13)   │  Built-in LED, no external wiring needed
Delay method      │  Software loop vs timer  │  Software loop  │  Simpler for first project; timers in Project 3
Optimization      │  -O0, -Os, -O2, -O3      │  -Os            │  Size-optimized (smallest code, still readable)
Toggle method     │  SET/CLEAR vs XOR        │  SET/CLEAR      │  More explicit, easier to understand and debug
Clock definition  │  Hardcode vs F_CPU       │  Define F_CPU   │  Documents assumption, enables reuse
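
The toggle-method decision is easier to weigh with both styles side by side. This is a sketch with hypothetical names (led_write, led_toggle) that operates on a copy of the PORTB value so it can run on a PC.

```c
#include <stdint.h>

#define LED_BIT 5u  /* PB5, the built-in LED on the Uno */

/* SET/CLEAR style: the caller states the wanted LED state explicitly. */
uint8_t led_write(uint8_t portb, int on) {
    if (on) {
        return (uint8_t)(portb | (1u << LED_BIT));
    }
    return (uint8_t)(portb & ~(1u << LED_BIT));
}

/* XOR style: flips whatever state the LED bit currently has. */
uint8_t led_toggle(uint8_t portb) {
    return (uint8_t)(portb ^ (1u << LED_BIT));
}
```

Calling led_write twice with the same argument leaves the LED where it was, so the code always says which state it wants. led_toggle is shorter, but its result depends on the previous state, which is harder to reason about when debugging.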

6. Testing Strategy

6.1 Visual Verification

Test                    │  Expected Result                                   │  Pass/Fail
────────────────────────┼────────────────────────────────────────────────────┼─────────────
LED blinks after flash  │  LED toggles on/off                                │
Blink rate ~1 Hz        │  ~1 second per complete cycle                      │
Consistent timing       │  No visible variation in blink rate                │
Survives power cycle    │  Blinking resumes after unplugging/replugging USB  │

6.2 Build Verification

# Test 1: Code compiles without warnings
$ avr-gcc -mmcu=atmega328p -Os -Wall -c blink.c -o blink.o
# Expected: No output (no warnings or errors)

# Test 2: Size is under 250 bytes
$ avr-size blink.elf
   text    data     bss     dec     hex filename
    176       0       0     176      b0 blink.elf
# Expected: text < 250

# Test 3: HEX file is valid Intel HEX format
$ head -5 blink.hex
:100000000C9434000C943E000C943E000C943E0082
...
# Expected: Lines starting with ':' (Intel HEX format)

# Test 4: Disassembly shows expected code
$ avr-objdump -d blink.elf | head -30
# Expected: Shows main function with sbi/cbi instructions for PORTB

6.3 Hardware Testing Checklist

  • Arduino Uno recognized by system (ls /dev/tty* shows USB device)
  • avrdude can communicate with chip: avrdude -p m328p -c arduino -P /dev/ttyACM0
  • Flash succeeds without verification errors
  • LED blinks after reset button press
  • LED continues blinking after USB disconnect (using external power)

6.4 Timing Verification

# Use phone stopwatch or computer timer
# Count 10 complete blink cycles
# Should take approximately 10 seconds

# Acceptable range: 8-12 seconds for 10 cycles (±20%)
# If outside this range, adjust delay_ms() calibration value
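
Adjusting the calibration value can be done by proportion rather than trial and error. The sketch below assumes the delay scales linearly with the inner-loop count, which is only approximately true because the outer loop adds a small fixed overhead. The helper name recalibrate is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: rescale the inner-loop count of delay_ms().
 * measured: stopwatch time for 10 blink cycles
 * target:   desired time for those cycles (10 for a 1 Hz blink)
 * Both times must use the same unit; measured must be non-zero. */
uint32_t recalibrate(uint32_t old_count, uint32_t measured, uint32_t target)
{
    /* Delay is roughly proportional to loop count: scale by target/measured. */
    return (old_count * target) / measured;
}
```

For example, if 10 cycles took 25 seconds with a count of 4000, the suggested new count is 1600; if they took 8 seconds, it is 5000. Re-time after each change.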

7. Common Pitfalls & Debugging

7.1 Compilation Errors

| Error | Cause | Fix |
|---|---|---|
| 'DDRB' undeclared | Missing include | Add #include <avr/io.h> at top of file |
| undefined reference to 'main' | Wrong entry point or linking error | Ensure main() function exists and is spelled correctly |
| cannot find -lgcc | Wrong MCU flag | Use -mmcu=atmega328p in both compile and link steps |
| unknown MCU 'atmega328p' | avr-gcc not properly installed | Reinstall avr-gcc; verify with avr-gcc --target-help |

7.2 Flash Errors

| Error | Cause | Fix |
|---|---|---|
| stk500_recv(): programmer is not responding | Wrong port or baud rate | Check port with ls /dev/tty*; try 57600 baud for older boards |
| AVR device not responding | Arduino not connected or wrong board | Check USB connection; verify board type |
| verification error | Bad connection during flash | Retry; use shorter USB cable; check for loose connections |
| permission denied on serial port | User not in dialout group (Linux) | Run sudo usermod -a -G dialout $USER and re-login |

7.3 Runtime Issues

| Symptom | Likely Cause | How to Fix |
|---|---|---|
| LED doesn’t blink at all | Wrong pin, DDRB not set, or code not running | Verify PB5, check DDRB setup, try a simpler test |
| LED appears always ON (possibly dim) | Delay optimized away, so the pin toggles too fast to see | Add volatile to loop counter |
| LED always OFF | Pin not set as output | Ensure DDRB bit 5 is set to 1 |
| LED blinks too fast | Delay too short or optimized away | Increase loop count; verify volatile usage |
| LED blinks erratically | Power issues or clock problems | Check USB connection; verify fuses (advanced) |

7.4 Debugging Without printf

Since you don’t have serial output yet (that’s Project 2), use these techniques:

  1. LED as status indicator: Different blink patterns for different code paths
    // Fast blink = code reached point A
    for (int i = 0; i < 10; i++) {
        PORTB ^= (1 << PB5);
        delay_ms(50);
    }
    
  2. Infinite loop trap: Add while(1); at suspected failure points

  3. Simplify: Remove code until it works, then add back piece by piece

  4. Multimeter test: Measure voltage on pin 13 (should be ~0V or ~5V)

  5. Check with avr-objdump: Disassemble your binary to verify code generation
    avr-objdump -d blink.elf
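
The status-indicator idea from technique 1 can be made reusable with a small helper that flashes a numeric code. This is a sketch, not the guide's reference implementation: blink_code and led_set are hypothetical names, the 100 ms and 800 ms timings are arbitrary choices, and the non-AVR branch exists only so the pattern logic can be checked on a PC.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
void delay_ms(unsigned int ms);          /* delay_ms() from this project */
static void led_set(uint8_t on)
{
    if (on) PORTB |= (1 << PB5);
    else    PORTB &= ~(1 << PB5);
}
#else
/* Host stand-ins (assumption: used only for checking the logic). */
static unsigned int  pulses;             /* times the LED was switched on */
static unsigned long elapsed_ms;         /* total simulated delay */
static void delay_ms(unsigned int ms) { elapsed_ms += ms; }
static void led_set(uint8_t on)       { if (on) pulses++; }
#endif

/* Flash the LED `code` times, then pause, so checkpoint 2 looks
 * different from checkpoint 3 to the eye. */
void blink_code(uint8_t code)
{
    for (uint8_t i = 0; i < code; i++) {
        led_set(1);
        delay_ms(100);
        led_set(0);
        delay_ms(100);
    }
    delay_ms(800);                       /* gap that separates repeats */
}
```

Calling blink_code(2) at one suspected failure point and blink_code(3) at another lets you tell by eye how far execution got.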
    

8. Extensions & Challenges

After completing the basic blink, try these progressively harder extensions:

8.1 Easy Extensions

  1. Multiple LEDs: Wire external LEDs to other pins; blink them at different rates

  2. Button Input: Add a button to another pin; only blink when button is pressed
    // Hint: For input, clear DDR bit and read PIN register
    DDRB &= ~(1 << PB0);  // PB0 as input
    PORTB |= (1 << PB0);  // Enable internal pull-up
    if (!(PINB & (1 << PB0))) { /* button pressed (active low) */ }
    
  3. Morse Code: Blink “SOS” pattern (... --- ...)
    • Dot: 200ms on
    • Dash: 600ms on
    • Gap between symbols: 200ms off
    • Gap between letters: 600ms off
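
One way to approach the Morse extension is to drive the LED from a pattern string. The sketch below uses the timings listed above; the names morse, symbol, and led_set are hypothetical, and the non-AVR branch is a stand-in that only records timing so the logic can be checked on a PC.

```c
#include <stdint.h>

#define DOT_MS        200
#define DASH_MS       600
#define SYMBOL_GAP_MS 200
#define LETTER_GAP_MS 600

#ifdef __AVR__
#include <avr/io.h>
void delay_ms(unsigned int ms);          /* delay_ms() from this project */
static void led_set(uint8_t on)
{
    if (on) PORTB |= (1 << PB5);
    else    PORTB &= ~(1 << PB5);
}
#else
/* Host stand-ins (assumption: used only for checking the logic). */
static uint8_t       led_state;
static unsigned long on_ms, total_ms;
static void led_set(uint8_t on) { led_state = on; }
static void delay_ms(unsigned int ms)
{
    total_ms += ms;
    if (led_state) on_ms += ms;
}
#endif

/* One dot or dash, followed by the inter-symbol gap. */
static void symbol(unsigned int on_time)
{
    led_set(1);
    delay_ms(on_time);
    led_set(0);
    delay_ms(SYMBOL_GAP_MS);
}

/* pattern: '.' = dot, '-' = dash, ' ' = letter boundary */
void morse(const char *pattern)
{
    for (; *pattern; pattern++) {
        if (*pattern == '.')      symbol(DOT_MS);
        else if (*pattern == '-') symbol(DASH_MS);
        else delay_ms(LETTER_GAP_MS - SYMBOL_GAP_MS); /* top up to 600 ms */
    }
}
```

Calling morse("... --- ...") inside the main loop would send SOS repeatedly; add a longer pause between repetitions so the words stay distinguishable.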

8.2 Intermediate Challenges

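  1. Optimize for Size: Get below 100 bytes
    • Use assembly instead of C for delay
    • Use XOR toggle instead of explicit set/clear
    • Inline everything
    • Note: the default startup code places the ATmega328P's 26-entry interrupt vector table (104 bytes) at the start of flash, so getting under 100 bytes also means linking without it (for example with -nostartfiles), which is an advanced step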
  2. Pure Assembly Version: Rewrite entirely in AVR assembly
    ; Hint: Look up sbi, cbi, rjmp, dec, brne instructions
    
  3. Variable Speed: Read ADC (analog potentiometer) to control blink rate
    • Requires understanding ADC peripheral (preview of future projects)
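
For the variable-speed challenge, the ADC setup itself is left for a later project, but the mapping from a reading to a delay is plain arithmetic. This sketch assumes a 10-bit result (0 to 1023, as on the ATmega328P) and an arbitrary delay range of 50 to 1000 ms; adc_to_delay_ms and both limits are hypothetical choices.

```c
#include <stdint.h>

#define MIN_DELAY_MS 50u     /* assumed fastest blink half-period */
#define MAX_DELAY_MS 1000u   /* assumed slowest blink half-period */

/* Map a 10-bit ADC reading (0..1023) linearly onto a delay in ms. */
uint16_t adc_to_delay_ms(uint16_t adc)
{
    if (adc > 1023u) adc = 1023u;               /* clamp bad input */
    uint32_t span = MAX_DELAY_MS - MIN_DELAY_MS;
    /* 32-bit intermediate: 1023 * 950 does not fit in 16 bits. */
    return (uint16_t)(MIN_DELAY_MS + ((uint32_t)adc * span) / 1023u);
}
```

The 32-bit intermediate matters on AVR, where int is 16 bits and the multiplication would otherwise overflow.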

8.3 Advanced Challenges

  1. Power Optimization: Add sleep mode between blinks
    #include <avr/sleep.h>
    // Hint: Use Timer interrupt to wake from sleep
    
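  2. Watchdog Timer: Use WDT for timing instead of software loop
    • Independent of compiler optimization, unlike a software delay loop (though the watchdog's own 128 kHz oscillator is only approximately accurate)
    • Works even in low-power modes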
  3. Fuse Exploration: Learn to read and modify AVR fuses
    • Change clock source
    • Understand brownout detection
    • Warning: Incorrect fuse settings can brick your chip!

9. Real-World Connections

9.1 Where These Techniques Are Used

| Industry | Application | Why This Matters |
|---|---|---|
| Automotive | Dashboard indicators, ECU diagnostics | GPIO control is foundation of all embedded I/O |
| Medical | Pacemaker status LEDs, device indicators | Reliability requires understanding hardware directly |
| Industrial | PLC status indicators, alarm systems | Deterministic timing for safety-critical systems |
| Consumer | Toy microcontrollers, wearables | Cost optimization requires minimal code size |
| Aerospace | Satellite status, avionics indicators | Radiation-hardened systems have minimal abstraction |

9.2 Skills That Transfer

| What You Learned | Where It Applies |
|---|---|
| Reading datasheets | Every new chip, peripheral, or sensor |
| Memory-mapped I/O | All microcontrollers (ARM, PIC, MSP430, RISC-V) |
| Cross-compilation | Embedded Linux, mobile development, WebAssembly |
| Bit manipulation | Network protocols, file formats, compression |
| Build systems | Any professional software project |

9.3 Production Considerations

If this were production code, you would add:

#include <avr/io.h>
#include <avr/interrupt.h>
#include <avr/wdt.h>      // Watchdog timer for safety

#define LED_PIN PB5
#define BLINK_RATE_MS 500

// Production version would include:
// - Timer interrupt for reliable timing (not software delay)
// - Watchdog timer to reset if code hangs
// - Error handling for edge cases
// - Low-power sleep between blinks (for battery operation)
// - Configuration via EEPROM (adjustable blink rate)
// - Self-test on startup
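
To illustrate the first item in that list, timing can be driven by a periodic tick instead of a busy loop. The sketch below shows only the tick logic; on the AVR, blink_tick_1ms() would be called from a timer compare-match interrupt configured for 1 ms, which is covered in Project 3 and deliberately omitted here. All names are hypothetical.

```c
#include <stdint.h>

#define BLINK_RATE_MS 500

static volatile uint16_t ms_since_toggle;   /* ticks since last toggle */
static volatile uint8_t  led_is_on;         /* current logical LED state */

/* Call exactly once per millisecond (assumption: from a timer ISR). */
void blink_tick_1ms(void)
{
    if (++ms_since_toggle >= BLINK_RATE_MS) {
        ms_since_toggle = 0;
        led_is_on ^= 1;     /* on target: also write this bit to PORTB */
    }
}

/* Read the logical LED state, e.g. from the main loop. */
uint8_t blink_led_state(void)
{
    return led_is_on;
}
```

Because the timing no longer depends on instruction counts, the blink rate stays the same across optimization levels, and the main loop is free to sleep or do other work between ticks.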

10. Resources

10.1 Essential References

| Resource | URL | Purpose |
|---|---|---|
| ATmega328P Datasheet | Microchip | Official hardware documentation |
| avr-libc Manual | nongnu.org | C library documentation for AVR |
| AVR Instruction Set | Microchip | Assembly reference |
| Arduino Pin Mapping | Arduino.cc | ATmega328P pin to Arduino pin mapping |

10.2 Tutorials & Examples

| Resource | URL | Purpose |
|---|---|---|
| AVR Bare Metal Examples | GitHub | Working code examples |
| Hackster.io Tutorial | Hackster | Step-by-step beginner guide |
| AVR Freaks Forum | avrfreaks.net | Community support and discussions |

10.3 Tools Reference

| Tool | Purpose | Key Commands |
|---|---|---|
| avr-gcc | Compiler | avr-gcc -mmcu=atmega328p -Os -Wall -c file.c |
| avr-objcopy | Convert formats | avr-objcopy -O ihex file.elf file.hex |
| avr-size | Check binary size | avr-size file.elf |
| avr-objdump | Disassemble | avr-objdump -d file.elf |
| avrdude | Flash programmer | avrdude -p m328p -c arduino -P /dev/ttyACM0 -U flash:w:file.hex |

11. Self-Assessment Checklist

Before moving to Project 2 (UART Serial Communication), verify you can:

Knowledge

  • Explain what memory-mapped I/O means in one sentence
  • Describe the difference between DDRB, PORTB, and PINB
  • Explain why volatile is necessary for hardware registers
  • Read the ATmega328P datasheet to find any register’s address
  • Calculate approximate delay loop iterations for a given timing at 16MHz
  • Explain what happens when you write PORTB |= (1 << 5)

Skills

  • Write bare metal C code for AVR without using Arduino framework
  • Create a Makefile with compile, link, hex, and flash targets
  • Flash firmware to Arduino using avrdude command line
  • Debug basic issues without printf (using LED patterns)
  • Compare your binary size to Arduino framework equivalent

Confidence Checks

  • I could add a second LED on a different pin without looking at references
  • I could port this code to a different AVR chip (with datasheet)
  • I could explain this project clearly in a technical interview
  • I understand why the Arduino framework exists (and when to use it)

12. Completion Criteria

Your project is complete when:

Required

  1. Code compiles with avr-gcc -mmcu=atmega328p -Os -Wall with no warnings
  2. Binary is under 250 bytes (verify with avr-size)
  3. LED blinks at approximately 1 Hz on Arduino Uno hardware
  4. No Arduino framework used (no setup(), loop(), digitalWrite())
  5. Makefile works with make and make flash targets
  6. Code is commented explaining key register operations

Bonus Achievements

  • Binary under 150 bytes
  • Toggle LED using XOR instead of explicit on/off
  • Written an assembly version for comparison
  • Documented exact timing calculation in comments
  • Tested on a second AVR chip (e.g., ATtiny85)

Evidence of Completion

  • Screenshot or video of blinking LED
  • avr-size output showing binary size
  • Comparison with Arduino IDE Blink sketch size
  • Source code with comments

What’s Next?

Next Project: P02 - UART Serial Communication

Now that you can blink an LED, you’re ready to add serial output—your debugging lifeline for all future bare metal projects. You’ll learn about:

  • UART protocol and baud rate calculation
  • Polling vs interrupt-driven I/O
  • Implementing your own printf() for debugging
  • Ring buffers for buffered communication

This guide was expanded from LEARN_BARE_METAL_PROGRAMMING.md. For the complete learning path, see the README.