Project 2: UART Serial Communication - Bare Metal AVR

Build a bare metal UART driver that sends and receives data over serial. This driver becomes your debugging lifeline for all later embedded projects.


Quick Reference

Attribute       Value
Difficulty      Intermediate
Time Estimate   1 week (15-25 hours)
Language        C (alt: AVR Assembly)
Platform        Arduino Uno / ATmega328P
Prerequisites   Project 1, understanding of serial communication
Key Topics      UART, interrupts, ring buffers, baud rate calculation

1. Learning Objectives

By completing this project, you will:

  1. Understand asynchronous serial communication - Know how UART transmits and receives data without a clock line
  2. Master baud rate calculation - Derive clock divider values from system clock and desired baud rate
  3. Implement polling vs interrupt-driven I/O - Understand the tradeoffs between blocking and non-blocking approaches
  4. Build a ring buffer - Implement circular buffers for efficient data handling without data loss
  5. Create printf-like output - Enable formatted debugging output on bare metal
  6. Program the USART peripheral - Configure TX, RX, and control registers directly from the datasheet
  7. Handle critical sections - Safely share data between interrupt handlers and main code

What You Will Build

A complete UART driver that:

  • Initializes UART at any standard baud rate (9600, 115200, etc.)
  • Transmits characters, strings, and formatted output
  • Receives characters with interrupt-driven buffering
  • Implements a simple command-line interface
  • Provides foundation for all future debugging

2. Theoretical Foundation

2.1 UART Protocol Deep Dive

What is UART?

UART (Universal Asynchronous Receiver/Transmitter) is a hardware communication protocol that transmits data serially - one bit at a time - without a shared clock signal between devices. This “asynchronous” nature means both devices must agree on timing parameters beforehand.

UART Communication Model:

  Device A                                          Device B
┌──────────────┐                                  ┌──────────────┐
│              │          TX ─────────────────►   │              │
│   USART      │                                  │    USART     │
│  Peripheral  │          RX ◄─────────────────   │  Peripheral  │
│              │                                  │              │
│              │          GND ═══════════════════ │              │
└──────────────┘                                  └──────────────┘

Note: TX of Device A connects to RX of Device B and vice versa
      (Cross-over connection, not straight-through)

Frame Format (8N1)

The most common UART configuration is “8N1”: 8 data bits, No parity, 1 stop bit.

UART Frame Structure (8N1):

Idle State (HIGH/Mark)
    │
    ▼
    ┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
    │    │ D0 │ D1 │ D2 │ D3 │ D4 │ D5 │ D6 │ D7 │    │    │
    │ S  │    │    │    │    │    │    │    │    │ P  │ SP │
    │ T  │ L  │    │    │    │    │    │    │ M  │ A  │ O  │
    │ A  │ S  │    │    │    │    │    │    │ S  │ R  │ P  │
    │ R  │ B  │    │    │    │    │    │    │ B  │ I  │    │
    │ T  │    │    │    │    │    │    │    │    │ T  │    │
    └────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘
      │    └────────────────────────────────────┘    │    │
      │              8 Data Bits                     │    │
      │              (LSB first)                     │    │
      │                                              │    │
      ▼                                              ▼    ▼
    LOW                                           Optional HIGH
    (Space)                                       (None for (Mark)
                                                   8N1)

Timeline at 9600 baud (104.17 us per bit):
├────────┤
  1 bit
  = 104.17 us

Total frame time = 10 bits = 1.042 ms
Maximum throughput = ~960 bytes/second (theoretical)
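
The frame layout and timing figures above can be sketched in C. This is only an illustration of the arithmetic, not part of the driver; the helper names `uart_frame_8n1`, `uart_bit_time_ns`, and `uart_max_bytes_per_sec` are hypothetical.

```c
#include <stdint.h>

/* Build the 10-bit 8N1 frame for one byte, in transmission order.
 * Bit 0 of the result goes on the wire first: start bit (0), then
 * D0..D7 (LSB first), then the stop bit (1). Hypothetical helper. */
static uint16_t uart_frame_8n1(uint8_t data)
{
    uint16_t frame = 0;                 /* bit 0 = start bit = LOW  */
    frame |= (uint16_t)data << 1;       /* bits 1..8 = D0..D7       */
    frame |= (uint16_t)1 << 9;          /* bit 9 = stop bit = HIGH  */
    return frame;
}

/* Duration of one bit in nanoseconds at the given baud rate
 * (integer division, so the result is truncated). */
static uint32_t uart_bit_time_ns(uint32_t baud)
{
    return 1000000000UL / baud;
}

/* Theoretical payload throughput in bytes/second: 8N1 puts
 * 10 bits on the wire for every data byte. */
static uint32_t uart_max_bytes_per_sec(uint32_t baud)
{
    return baud / 10;
}
```

For example, `uart_frame_8n1('A')` yields 0x282: a 0 start bit, the bits of 0x41 shifted up by one, and a 1 stop bit in position 9.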

Start Bit Detection

The receiver must detect when a frame begins. Since the line idles HIGH:

Start Bit Detection:

Idle (HIGH) ─────────┐
                     │   ◄── Falling edge triggers sampling
                     └──────────────────────────────────
                          │
                          ├── Wait 0.5 bit time
                          │
                          ├── Sample (should still be LOW)
                          │
                          ├── Wait 1 bit time, sample D0
                          │
                          ├── Wait 1 bit time, sample D1
                          │
                          ... (continue for all data bits)
                          │
                          └── Sample stop bit (should be HIGH)

Sampling Point Strategy:
┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┐
│START│ D0  │ D1  │ D2  │ D3  │ D4  │ D5  │ D6  │ D7  │STOP │
└─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┘
   ↑     ↑     ↑     ↑     ↑     ↑     ↑     ↑     ↑     ↑
   │     │     │     │     │     │     │     │     │     │
   └──┬──┴──┬──┴──┬──┴──┬──┴──┬──┴──┬──┴──┬──┴──┬──┴──┬──┘
      │     │     │     │     │     │     │     │     │
   Sample points at middle of each bit for best noise immunity
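
The detection and sampling steps above can be sketched as a small software decoder. This is a host-side illustration under stated assumptions, not how the USART hardware is implemented: the line is assumed to be captured at 16 samples per bit, and `uart_decode_8n1` and `SAMPLES_PER_BIT` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

#define SAMPLES_PER_BIT 16  /* assumed oversampling factor of the capture */

/* Decode one 8N1 frame from an oversampled line capture.
 * samples[i] is 1 for HIGH and 0 for LOW. Returns the byte (0..255),
 * or -1 if no valid frame is found. */
static int uart_decode_8n1(const uint8_t *samples, size_t n)
{
    size_t i = 1;

    /* Look for the falling edge that begins the start bit. */
    while (i < n && !(samples[i - 1] == 1 && samples[i] == 0))
        i++;
    if (i >= n)
        return -1;

    /* Wait half a bit time: middle of the start bit. */
    size_t mid = i + SAMPLES_PER_BIT / 2;
    if (mid + 9 * SAMPLES_PER_BIT >= n)
        return -1;                          /* capture too short */
    if (samples[mid] != 0)
        return -1;                          /* glitch, not a start bit */

    uint8_t byte = 0;
    for (int bit = 0; bit < 8; bit++) {
        mid += SAMPLES_PER_BIT;             /* middle of the next data bit */
        if (samples[mid])
            byte |= (uint8_t)(1u << bit);   /* LSB arrives first */
    }

    mid += SAMPLES_PER_BIT;                 /* middle of the stop bit */
    if (samples[mid] != 1)
        return -1;                          /* framing error */
    return byte;
}
```

Sampling at the middle of each bit gives the largest margin against edge jitter and small baud rate mismatch, which is why the half-bit wait after the falling edge matters.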

Why No Clock Line?

Comparison: Synchronous vs Asynchronous Serial

Synchronous (SPI/I2C):               Asynchronous (UART):
┌────────┐   CLK    ┌────────┐       ┌────────┐         ┌────────┐
│ Master │─────────►│ Slave  │       │Device A│   TX    │Device B│
│        │  DATA    │        │       │        │────────►│        │
│        │─────────►│        │       │        │   RX    │        │
└────────┘          └────────┘       │        │◄────────│        │
                                     │        │   GND   │        │
                                     │        │═════════│        │
                                     └────────┘         └────────┘

Advantages:                          Advantages:
+ Perfect synchronization            + Fewer wires (2 + GND)
+ Higher speeds possible             + Point-to-point simplicity
+ No baud rate errors                + Works over long distances
                                     + Simple hardware
Disadvantages:
- Extra wire for clock               Disadvantages:
- Clock skew at distances            - Must agree on baud rate
- Master/slave relationship          - Baud rate errors possible
                                     - Lower practical speeds

2.2 ATmega328P USART Hardware

Register Overview

The ATmega328P has one USART peripheral (USART0) controlled by six registers:

Register   Address   Purpose
UDR0       0xC6      Data Register (read for RX, write for TX)
UCSR0A     0xC0      Status Register (flags for TX/RX state)
UCSR0B     0xC1      Control Register B (enable TX/RX, interrupts)
UCSR0C     0xC2      Control Register C (frame format)
UBRR0H     0xC5      Baud Rate Register High byte
UBRR0L     0xC4      Baud Rate Register Low byte

USART Block Diagram (ATmega328P):

                           ┌─────────────────────────────────────────┐
                           │              ATmega328P                  │
                           │                                         │
                           │  ┌─────────────────────────────────┐   │
          ┌────────────────┼──┤         Clock Generator         │   │
          │                │  │                                 │   │
          │                │  │  ┌───────────────────────────┐ │   │
          │                │  │  │     UBRR0 (16-bit)        │ │   │
          │                │  │  │   ┌───────┬───────┐       │ │   │
  16 MHz ─┼────────────────┼──┼──┼──►│UBRR0H │UBRR0L │       │ │   │
  System  │                │  │  │   └───────┴───────┘       │ │   │
  Clock   │                │  │  │          │                │ │   │
          │                │  │  │    ┌─────▼─────┐          │ │   │
          │                │  │  │    │ Prescaler │          │ │   │
          │                │  │  │    │  ÷16 or ÷8│          │ │   │
          │                │  │  │    └─────┬─────┘          │ │   │
          │                │  │  │          │ Baud Rate      │ │   │
          │                │  │  │          │ Clock          │ │   │
          │                │  │  └──────────┼────────────────┘ │   │
          │                │  └─────────────┼──────────────────┘   │
          │                │                │                       │
          │                │  ┌─────────────▼───────────────────┐  │
          │                │  │         Transmitter              │  │
          │                │  │  ┌─────────────────────────┐    │  │
          │   CPU Data Bus │  │  │   Transmit Shift Reg    │────┼──┼───► TX Pin
          │        │       │  │  └────────────▲────────────┘    │  │     (PD1)
          │        │       │  │               │                  │  │
          │        ▼       │  │  ┌────────────┴────────────┐    │  │
          │   ┌────────────┼──┼──┤       UDR0 (TX)         │    │  │
          │   │            │  │  │   (write buffer)        │    │  │
          │   │            │  │  └─────────────────────────┘    │  │
          │   │            │  └──────────────────────────────────┘  │
          │   │            │                                        │
          │   │            │  ┌──────────────────────────────────┐  │
          │   │            │  │         Receiver                 │  │
          │   │            │  │  ┌─────────────────────────┐    │  │
          │   │            │  │  │   Receive Shift Reg     │◄───┼──┼──── RX Pin
          │   │            │  │  └────────────┬────────────┘    │  │     (PD0)
          │   │            │  │               │                  │  │
          │   │            │  │  ┌────────────▼────────────┐    │  │
          │   └────────────┼──┼──┤       UDR0 (RX)         │    │  │
          │                │  │  │   (read buffer)         │    │  │
          │                │  │  └─────────────────────────┘    │  │
          │                │  └──────────────────────────────────┘  │
          │                │                                        │
          │                │  ┌──────────────────────────────────┐  │
          │                │  │       Control & Status           │  │
          │                │  │  ┌────────┐ ┌────────┐ ┌───────┐ │  │
          │                │  │  │UCSR0A  │ │UCSR0B  │ │UCSR0C │ │  │
          │                │  │  │(status)│ │(enable)│ │(format)│ │  │
          │                │  │  └────────┘ └────────┘ └───────┘ │  │
          │                │  └──────────────────────────────────┘  │
          │                └────────────────────────────────────────┘
          │
          └──────────────────── (continued to other peripherals)

Status Register (UCSR0A) Bit Details

UCSR0A Register (0xC0):
┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┐
│ Bit7│ Bit6│ Bit5│ Bit4│ Bit3│ Bit2│ Bit1│ Bit0│
├─────┼─────┼─────┼─────┼─────┼─────┼─────┼─────┤
│RXC0 │TXC0 │UDRE0│ FE0 │DOR0 │UPE0 │U2X0 │MPCM0│
└─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┘
  │     │     │     │     │     │     │     │
  │     │     │     │     │     │     │     └── Multi-processor Mode
  │     │     │     │     │     │     └──────── Double TX Speed
  │     │     │     │     │     └────────────── Parity Error
  │     │     │     │     └──────────────────── Data Overrun
  │     │     │     └────────────────────────── Frame Error
  │     │     └──────────────────────────────── TX Buffer Empty (READY)
  │     └────────────────────────────────────── TX Complete
  └──────────────────────────────────────────── RX Complete (DATA READY)

Key flags for basic operation:
- RXC0:  Set when unread data in receive buffer (read UDR0 to clear)
- UDRE0: Set when transmit buffer ready for new data
- TXC0:  Set when transmission complete (including shift register)

Control Register B (UCSR0B) Bit Details

UCSR0B Register (0xC1):
┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┐
│ Bit7│ Bit6│ Bit5│ Bit4│ Bit3│ Bit2│ Bit1│ Bit0│
├─────┼─────┼─────┼─────┼─────┼─────┼─────┼─────┤
│RXCIE│TXCIE│UDRIE│RXEN0│TXEN0│UCSZ02│RXB80│TXB80│
└─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┘
  │     │     │     │     │     │     │     │
  │     │     │     │     │     │     │     └── TX bit 8 (9-bit mode)
  │     │     │     │     │     │     └──────── RX bit 8 (9-bit mode)
  │     │     │     │     │     └────────────── Data size bit 2
  │     │     │     │     └──────────────────── Transmitter Enable
  │     │     │     └────────────────────────── Receiver Enable
  │     │     └──────────────────────────────── Data Reg Empty Int Enable
  │     └────────────────────────────────────── TX Complete Int Enable
  └──────────────────────────────────────────── RX Complete Int Enable

Essential settings for 8N1:
- RXEN0 = 1:  Enable receiver
- TXEN0 = 1:  Enable transmitter
- RXCIE0 = 1: Enable RX interrupt (for interrupt-driven receive)

Control Register C (UCSR0C) for Frame Format

UCSR0C Register (0xC2):
┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┐
│ Bit7│ Bit6│ Bit5│ Bit4│ Bit3│ Bit2│ Bit1│ Bit0│
├─────┼─────┼─────┼─────┼─────┼─────┼─────┼─────┤
│UMSEL01│UMSEL00│UPM01│UPM00│USBS0│UCSZ01│UCSZ00│UCPOL0│
└─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┘
  │     │     │     │     │     │     │     │
  │     │     │     │     │     │     │     └── Clock Polarity (sync only)
  │     │     │     │     │     └─────┴──────── Data bits (11=8-bit)
  │     │     │     │     └──────────────────── Stop bits (0=1, 1=2)
  │     │     └─────┴────────────────────────── Parity (00=none)
  └─────┴────────────────────────────────────── Mode (00=async)

For 8N1 configuration:
- UMSEL0[1:0] = 00: Asynchronous USART
- UPM0[1:0]   = 00: No parity
- USBS0       = 0:  1 stop bit
- UCSZ0[1:0]  = 11: 8 data bits (combined with UCSZ02=0 in UCSR0B)
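
The 8N1 settings listed for UCSR0B and UCSR0C translate into two register writes. The sketch below is one possible way to express them; `uart_config_8n1` is a hypothetical name, and the `#else` branch holds host-side stand-ins (bit positions copied from the diagrams above) so the code can also be compiled off-target. On the AVR, `<avr/io.h>` supplies the real definitions.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>            /* real register and bit definitions */
#else
/* Host-side stand-ins (assumption: for off-target compilation only). */
static volatile uint8_t UCSR0B, UCSR0C;
#define RXCIE0 7
#define RXEN0  4
#define TXEN0  3
#define UCSZ01 2
#define UCSZ00 1
#endif

/* Select 8N1 framing and enable the USART.
 * The baud rate (UBRR0H/UBRR0L) must be programmed separately. */
static void uart_config_8n1(void)
{
    /* UMSEL0 = 00 (async), UPM0 = 00 (no parity), USBS0 = 0 (1 stop bit),
     * UCSZ0[1:0] = 11 (8 data bits). */
    UCSR0C = (uint8_t)((1 << UCSZ01) | (1 << UCSZ00));

    /* Enable receiver, transmitter, and the RX-complete interrupt.
     * UCSZ02 stays 0, as required for 8-bit frames. */
    UCSR0B = (uint8_t)((1 << RXEN0) | (1 << TXEN0) | (1 << RXCIE0));
}
```

The bit names are bit numbers, not masks, so each one is shifted into place with `1 << NAME` before being OR-ed together.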

2.3 Ring Buffers

Why Ring Buffers?

When receiving data via interrupts, bytes can arrive faster than the main program processes them. A ring buffer (circular buffer) lets the ISR store incoming bytes immediately, so the main program can read them later without blocking reception.

Problem Without Buffer:

Time ──────────────────────────────────────────────────►

Incoming:  'H'   'e'   'l'   'l'   'o'
            │     │     │     │     │
            ▼     ▼     ▼     ▼     ▼
           ┌───┐ ┌───┐ ┌───┐ ┌───┐ ┌───┐
           │ H │ │ e │ │ l │ │ l │ │ o │
           └─┬─┘ └─┬─┘ └─┬─┘ └─┬─┘ └─┬─┘
             │     │     │     │     │
Main code:   ▼     X     X     ▼     X      (X = missed while processing 'H')
          Read 'H'             Read 'l'

Result: "Hl" instead of "Hello" - DATA LOSS!


Solution With Ring Buffer:

           ┌───┬───┬───┬───┬───┬───┬───┬───┐
Buffer:    │ H │ e │ l │ l │ o │   │   │   │
           └───┴───┴───┴───┴───┴───┴───┴───┘
             ▲                   ▲
            head               tail
           (read)             (write)

ISR writes at tail, main() reads at head - no data loss, as long as main() drains the buffer before it fills!

Ring Buffer Operations

Ring Buffer State Diagram:

Initial (Empty):
┌───┬───┬───┬───┬───┬───┬───┬───┐
│   │   │   │   │   │   │   │   │
└───┴───┴───┴───┴───┴───┴───┴───┘
  ▲
 head = tail = 0
 count = 0

After writing 'A', 'B', 'C':
┌───┬───┬───┬───┬───┬───┬───┬───┐
│ A │ B │ C │   │   │   │   │   │
└───┴───┴───┴───┴───┴───┴───┴───┘
  ▲           ▲
 head        tail
 count = 3

After reading 'A', 'B':
┌───┬───┬───┬───┬───┬───┬───┬───┐
│   │   │ C │   │   │   │   │   │
└───┴───┴───┴───┴───┴───┴───┴───┘
          ▲   ▲
        head tail
        count = 1

Wrap-around (after writing 'D', 'E', 'F', 'G', 'H', 'I'):
┌───┬───┬───┬───┬───┬───┬───┬───┐
│ I │   │ C │ D │ E │ F │ G │ H │
└───┴───┴───┴───┴───┴───┴───┴───┘
      ▲   ▲
    tail head
    count = 7

Key insight: tail "wraps around" to the beginning!

Ring Buffer Algorithm

Write Operation (from ISR):
┌────────────────────────────────────────┐
│ if (count < BUFFER_SIZE)               │
│     buffer[tail] = new_byte            │
│     tail = (tail + 1) % BUFFER_SIZE    │  ◄── Modulo wraps around
│     count++                            │
│ else                                   │
│     // Buffer full - discard byte      │
│     // (or set overflow flag)          │
└────────────────────────────────────────┘

Read Operation (from main):
┌────────────────────────────────────────┐
│ cli()  // Disable interrupts           │
│ if (count > 0)                         │
│     byte = buffer[head]                │
│     head = (head + 1) % BUFFER_SIZE    │
│     count--                            │
│     sei()  // Enable interrupts        │
│     return byte                        │
│ else                                   │
│     sei()                              │
│     return -1  // No data              │
└────────────────────────────────────────┘
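
The write and read operations above map almost line for line onto C. The following is a minimal sketch rather than a finished module: the names `ring_buffer_t`, `rb_write`, `rb_read`, and `RB_ATOMIC` are hypothetical, the size is kept at 8 to match the diagrams, and the critical section uses avr-libc's `ATOMIC_BLOCK` on the AVR while compiling to a plain block on a host.

```c
#include <stdint.h>

#define RB_SIZE 8   /* small on purpose, to match the diagrams */

#ifdef __AVR__
#include <util/atomic.h>
#define RB_ATOMIC ATOMIC_BLOCK(ATOMIC_RESTORESTATE)
#else
#define RB_ATOMIC   /* host build: no interrupts to mask */
#endif

typedef struct {
    volatile uint8_t buf[RB_SIZE];
    volatile uint8_t head;    /* next index to read  (main code) */
    volatile uint8_t tail;    /* next index to write (ISR)       */
    volatile uint8_t count;   /* bytes currently stored          */
} ring_buffer_t;

/* Intended to be called from the ISR, where interrupts are already
 * disabled. Returns 0 on success, -1 if the buffer is full and the
 * byte was discarded. */
static int rb_write(ring_buffer_t *rb, uint8_t byte)
{
    if (rb->count >= RB_SIZE)
        return -1;
    rb->buf[rb->tail] = byte;
    rb->tail = (uint8_t)((rb->tail + 1) % RB_SIZE);   /* wrap around */
    rb->count++;
    return 0;
}

/* Called from main code. Returns the byte (0..255) or -1 if empty. */
static int rb_read(ring_buffer_t *rb)
{
    int result = -1;
    RB_ATOMIC {
        if (rb->count > 0) {
            result = rb->buf[rb->head];
            rb->head = (uint8_t)((rb->head + 1) % RB_SIZE);
            rb->count--;
        }
    }
    return result;
}
```

Choosing a power-of-two size lets the compiler reduce the modulo to a bit mask, which matters on an 8-bit core without a hardware divider.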

Buffer Size Considerations

Buffer Size vs Latency:

Small Buffer (8 bytes):
┌───┬───┬───┬───┬───┬───┬───┬───┐
│   │   │   │   │   │   │   │   │
└───┴───┴───┴───┴───┴───┴───┴───┘
+ Less RAM usage (precious on ATmega328P: 2KB total)
- Overflows easily if main code is slow
- At 115200 baud: fills in ~0.7ms

Medium Buffer (64 bytes):
┌───┬───┬───┬   ...64 bytes...   ───┬───┐
│   │   │   │                       │   │
└───┴───┴───┴───────────────────────┴───┘
+ Good balance for most applications
- Uses 64 bytes of RAM (3% of available)
- At 115200 baud: fills in ~5.5ms

Large Buffer (256 bytes):
+ Can handle long bursts
- Uses significant RAM (12.5%)
- May hide performance problems in main code

Recommendation: Start with 64 bytes, adjust based on testing
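
The fill times quoted above follow from 10 bits per byte in 8N1 framing. A small helper makes it easy to check other combinations; `rx_buffer_fill_time_us` is a hypothetical name, and the result assumes bytes arrive back-to-back with no idle time.

```c
#include <stdint.h>

/* Microseconds until a receive buffer of `size` bytes fills when bytes
 * arrive back-to-back in 8N1 framing (10 bits per byte).
 * The result is truncated to whole microseconds. */
static uint32_t rx_buffer_fill_time_us(uint32_t baud, uint32_t size)
{
    /* 64-bit intermediate avoids overflow for large buffers. */
    return (uint32_t)(((uint64_t)size * 10u * 1000000u) / baud);
}
```

At 115200 baud this gives 694 us for 8 bytes and 5555 us for 64 bytes, which matches the ~0.7 ms and ~5.5 ms figures above. That is the longest the main loop may go without draining the buffer during a continuous burst.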

2.4 Polling vs Interrupts

Polling (Busy-Wait) Approach

Polling Model:

main() {
    while(1) {
        // Check if data available (busy polling)
        ┌──────────────────────────────┐
        │ if (UCSR0A & (1 << RXC0)) {  │ ◄── Check flag repeatedly
        │     data = UDR0;             │
        │     process(data);           │
        │ }                            │
        └──────────────────────────────┘
                 │
                 │ Loop continuously
                 ▼
        ┌────────────────────────┐
        │ // Do other work       │ ◄── But might miss data!
        │ update_display();      │
        │ check_buttons();       │
        └────────────────────────┘
    }
}

Timeline:
╔═══════════════════════════════════════════════════════════╗
║ main():  Check──Work──Check──Work──Check──Work──Check     ║
║                                                           ║
║ UART RX:        ┌─────┐     ┌─────┐                       ║
║                 │  A  │     │  B  │  ← 'B' arrives while  ║
║                 └─────┘     └─────┘    doing "Work"       ║
║                      ↑           ↑                        ║
║                   Caught      MISSED! (overwritten by     ║
║                               next byte before read)      ║
╚═══════════════════════════════════════════════════════════╝

Pros:                          Cons:
+ Simple code                  - Wastes CPU cycles checking
+ Easy to debug                - Can miss data if check is slow
+ No race conditions           - Can't do other work reliably
+ Deterministic timing         - Not suitable for real-time
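
As a concrete version of the polling model, here is a sketch of polled transmit and receive. The function names `uart_putc_polled` and `uart_getc_polled` and the spin-count timeout are assumptions made for illustration; the `#else` branch provides host-side stand-ins so the code compiles off-target, with UCSR0A starting at its reset value 0x20.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#else
/* Host-side stand-ins (assumption: for off-target compilation only). */
static volatile uint8_t UCSR0A = 0x20, UDR0;
#define RXC0  7
#define UDRE0 5
#endif

/* Blocking transmit: wait until the transmit buffer is empty,
 * then hand the byte to the hardware. */
static void uart_putc_polled(char c)
{
    while (!(UCSR0A & (1 << UDRE0)))
        ;                               /* busy-wait */
    UDR0 = (uint8_t)c;
}

/* Polled receive with a bounded wait. `spins` is a loop count, not a
 * calibrated time. Returns the byte (0..255) or -1 on timeout. */
static int uart_getc_polled(uint16_t spins)
{
    while (!(UCSR0A & (1 << RXC0))) {
        if (spins-- == 0)
            return -1;
    }
    return UDR0;                        /* reading UDR0 clears RXC0 */
}
```

RXC0 and UDRE0 are bit numbers, so the flag tests use `(1 << RXC0)` and `(1 << UDRE0)` as masks. A timeout expressed in real time would need a timer, which is outside this sketch.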

Interrupt-Driven Approach

Interrupt Model:

ISR(USART_RX_vect) {         // Called automatically on receive
    buffer_write(UDR0);       // Store immediately
}

main() {
    while(1) {
        ┌──────────────────────────────┐
        │ if (buffer_available())      │ ◄── Check buffer, not hardware
        │     process(buffer_read());  │
        └──────────────────────────────┘
                 │
                 ▼
        ┌────────────────────────┐
        │ // Do other work       │ ◄── Safely do other things!
        │ update_display();      │
        │ check_buttons();       │
        └────────────────────────┘
    }
}

Timeline:
╔═══════════════════════════════════════════════════════════╗
║ main():   Work───Work───Read A───Work───Read B───Work     ║
║                    ▲               ▲                      ║
║                    │               │                      ║
║ ISR:     ┌─────┐   │     ┌─────┐   │                      ║
║          │ISR A│───┘     │ISR B│───┘                      ║
║          └─────┘         └─────┘                          ║
║              ▲               ▲                            ║
║ UART RX: ┌─────┐         ┌─────┐                          ║
║          │  A  │         │  B  │  ← Both bytes caught!    ║
║          └─────┘         └─────┘                          ║
╚═══════════════════════════════════════════════════════════╝

Pros:                          Cons:
+ Never misses data            - More complex code
+ CPU free between bytes       - Race conditions possible
+ Responsive to bursts         - Debugging harder
+ Real-time capable            - Interrupt latency

Critical Section Problem

Race Condition Example:

Shared variable: count (tracks bytes in buffer)

count-- is not atomic on AVR: it compiles to load, decrement, store.

        main()                          ISR()
           │                               │
           │  if (count > 0) {             │
           │      byte = buffer[head++];   │
           │      // count-- begins:       │
           │      // load count (reg = 1)  │
           │                     ◄─────────│ Interrupt fires!
           │                               │ data = UDR0;
           │                               │ buffer[tail++] = data;
           │                               │ count++;  // count = 2
           │                               │ return;
           │                     ──────────►
           │      // decrement (reg = 0)
           │      // store count = 0 ???
           │  }
           │
           ▼
        Problem: count should be 1 (one unread byte), but main() stored a stale 0. The ISR's increment is lost!


Solution: Critical Sections

        main()                          ISR()
           │                               │
           │  cli();  // Disable interrupts│
           │  if (count > 0) {             │
           │      byte = buffer[head++];   │
           │      count--;                 │
           │  }                            │
           │  sei();  // Enable interrupts │
           │                     ◄─────────│ NOW interrupt can fire
           │                               │ (safely)
           ▼
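
A common variation on the cli()/sei() pair saves the status register first and restores it afterwards, so that code which was already running with interrupts disabled does not get them switched back on unexpectedly. The sketch below illustrates the pattern; `irq_save`, `irq_restore`, and `take_one` are hypothetical names, and the `#else` branch fakes SREG and cli() so the logic can be exercised on a host.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/io.h>
#include <avr/interrupt.h>
#else
/* Host-side stand-ins: a fake status register whose bit 7 plays the
 * role of the global interrupt enable flag (I-bit). */
static volatile uint8_t SREG = 0x80;
static void cli(void) { SREG &= (uint8_t)~0x80; }
#endif

/* Enter a critical section: remember the interrupt state, then disable. */
static inline uint8_t irq_save(void)
{
    uint8_t sreg = SREG;
    cli();
    return sreg;
}

/* Leave a critical section: put the I-bit back the way it was. */
static inline void irq_restore(uint8_t sreg)
{
    SREG = sreg;
}

static volatile uint8_t count;   /* shared with the ISR */

/* Consume one unit from the shared counter if it is non-zero.
 * Returns 1 if something was consumed, 0 if the counter was empty. */
static int take_one(void)
{
    int taken = 0;
    uint8_t sreg = irq_save();
    if (count > 0) {
        count--;                 /* load/decrement/store is now uninterruptible */
        taken = 1;
    }
    irq_restore(sreg);
    return taken;
}
```

On the AVR, avr-libc's `ATOMIC_BLOCK(ATOMIC_RESTORESTATE)` from `<util/atomic.h>` packages the same save/restore idea as a block macro.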

2.5 Common Misconceptions

Misconception                        Reality
“UART is slow”                       115200 baud = ~11.5 KB/s, plenty for debugging and many applications
“I need RS-232 hardware”             Modern UART uses TTL levels (0V/5V); RS-232 (-12V/+12V) is legacy
“TX and RX are the same”             TX is transmit (output), RX is receive (input) - cross them between devices
“Baud rate and bps are identical”    For UART yes; for modems no (multiple bits per symbol possible)
“Flow control is required”           Not for simple debugging; needed for high-speed bulk transfers
“Interrupts are always better”       Polling is simpler and fine when there is nothing else to do
“Ring buffer overflow is rare”       It’s common! At 115200 baud, a 64-byte buffer fills in ~5.5ms
“Any baud rate works”                Both ends must use the same rate, and the clock divider must hit it closely (see the error table in 4.4); rates with high error may not work

3. Project Specification

3.1 What You Will Build

A layered UART driver with:

Application Layer
┌─────────────────────────────────────────────────────┐
│  uart_printf("Temperature: %d C\n", temp);          │
│  char *cmd = uart_readline(buffer, 64);             │
└───────────────────────────┬─────────────────────────┘
                            │
                            ▼
String Functions Layer
┌─────────────────────────────────────────────────────┐
│  uart_puts("Hello");                                │
│  uart_gets(buffer, size);                           │
└───────────────────────────┬─────────────────────────┘
                            │
                            ▼
Character I/O Layer
┌─────────────────────────────────────────────────────┐
│  uart_putc('A');                                    │
│  char c = uart_getc();                              │
└───────────────────────────┬─────────────────────────┘
                            │
                            ▼
Driver Layer
┌─────────────────────────────────────────────────────┐
│  uart_init(115200);                                 │
│  ISR(USART_RX_vect) { /* buffer incoming */ }       │
└───────────────────────────┬─────────────────────────┘
                            │
                            ▼
Hardware Layer
┌─────────────────────────────────────────────────────┐
│  UDR0, UCSR0A, UCSR0B, UCSR0C, UBRR0                │
└─────────────────────────────────────────────────────┘

3.2 Functional Requirements

ID    Requirement                                                 Priority
FR1   Initialize UART at configurable baud rate (9600, 115200)    Must
FR2   Transmit single character (blocking until sent)             Must
FR3   Transmit null-terminated string                             Must
FR4   Receive single character (blocking)                         Must
FR5   Receive with timeout (non-blocking option)                  Should
FR6   Interrupt-driven receive with ring buffer                   Should
FR7   printf-like formatted output (%d, %s, %x, %c)               Should
FR8   readline with echo and backspace handling                   Nice
FR9   Support multiple baud rates without recompile               Nice
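
For FR7, a formatter covering %d, %s, %x, and %c can be quite small. The following is a sketch under stated assumptions, not the reference implementation: `mini_printf`, `put_uint`, `put_str`, and the `putc_fn` callback type are hypothetical names, there is no width or padding support, and the `capture_putc` sink exists only so the formatter can be tried without hardware. In the driver, the callback would be the character output routine.

```c
#include <stdarg.h>
#include <stdint.h>

typedef void (*putc_fn)(char c);   /* character sink, e.g. the UART putc */

static void put_str(putc_fn out, const char *s)
{
    while (*s)
        out(*s++);
}

/* Print an unsigned value in the given base (10 or 16). */
static void put_uint(putc_fn out, unsigned int v, unsigned int base)
{
    char tmp[sizeof(unsigned int) * 8];   /* enough for any base >= 2 */
    int n = 0;
    do {
        tmp[n++] = "0123456789abcdef"[v % base];
        v /= base;
    } while (v != 0);
    while (n > 0)
        out(tmp[--n]);                    /* digits were produced in reverse */
}

/* Minimal printf: supports %d, %x, %s, %c and %%. */
static void mini_printf(putc_fn out, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    for (; *fmt; fmt++) {
        if (*fmt != '%') {
            out(*fmt);
            continue;
        }
        fmt++;
        if (*fmt == '\0')
            break;                        /* lone '%' at end of string */
        switch (*fmt) {
        case 'd': {
            int v = va_arg(ap, int);
            unsigned int u = (unsigned int)v;
            if (v < 0) {
                out('-');
                u = 0u - u;               /* also correct for INT_MIN */
            }
            put_uint(out, u, 10);
            break;
        }
        case 'x': put_uint(out, va_arg(ap, unsigned int), 16); break;
        case 's': put_str(out, va_arg(ap, const char *)); break;
        case 'c': out((char)va_arg(ap, int)); break;
        case '%': out('%'); break;
        default:  out('%'); out(*fmt); break;   /* unknown: echo as-is */
        }
    }
    va_end(ap);
}

/* Host-side sink for trying the formatter without hardware. */
static char capture[64];
static unsigned int capture_len;
static void capture_putc(char c)
{
    if (capture_len < sizeof capture - 1) {
        capture[capture_len++] = c;
        capture[capture_len] = '\0';
    }
}
```

Routing all output through a callback keeps the formatter independent of the hardware layer, which also makes it testable on a PC.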

3.3 Non-Functional Requirements

ID     Requirement                                Metric
NFR1   No data loss at 115200 baud                Test with continuous stream
NFR2   Code size under 1KB                        Check with avr-size
NFR3   Receive buffer of at least 64 bytes        Ring buffer size
NFR4   Works with standard serial terminals       screen, minicom, PuTTY
NFR5   Response time under 1ms for single char    Scope measurement

3.4 Example Output

$ make flash
avrdude: writing flash (876 bytes)...

$ screen /dev/ttyACM0 115200
=== AVR Bare Metal Serial Demo ===
System initialized.
Build: Dec 21 2024 14:30:00
UART: 115200 baud, 8N1

> help
Available commands:
  led on    - Turn LED on
  led off   - Turn LED off
  status    - Show system status
  echo <msg> - Echo message back
  reboot    - Reset the system

> led on
LED is now ON

> led off
LED is now OFF

> status
Uptime: 42 seconds
Temperature: 23 C (ADC raw: 472)
Free RAM: 1847 bytes
RX buffer: 3/64 bytes used

> echo Hello World
Hello World

> test
Unknown command: test
Type 'help' for available commands.

>
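
A session like the one above implies some form of command dispatch. One possible shape is a table of names and handlers, sketched below. Everything here is hypothetical: the `command_t` type, the `dispatch` function, and the demo `led` handler (which only flips a variable instead of driving a pin) are illustrations, not the project's required design.

```c
#include <string.h>

typedef void (*cmd_fn)(const char *args);

typedef struct {
    const char *name;      /* first word typed by the user */
    cmd_fn      handler;   /* called with the rest of the line */
} command_t;

/* Find the entry whose name equals the first word of `line` and call
 * its handler with the remainder of the line (leading spaces skipped).
 * Returns 0 if a command ran, -1 if the command is unknown. */
static int dispatch(const command_t *table, size_t n, const char *line)
{
    size_t len = strcspn(line, " ");          /* length of first word */
    for (size_t i = 0; i < n; i++) {
        if (strlen(table[i].name) == len &&
            strncmp(table[i].name, line, len) == 0) {
            const char *args = line + len;
            while (*args == ' ')
                args++;
            table[i].handler(args);
            return 0;
        }
    }
    return -1;
}

/* Demo handler: a variable stands in for the real LED pin. */
static int led_on;
static void cmd_led(const char *args)
{
    led_on = (strcmp(args, "on") == 0);
}

static const command_t demo_commands[] = {
    { "led", cmd_led },
};
```

When `dispatch` returns -1, the caller can print the "Unknown command" message shown in the session above.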

3.5 Real World Outcome

When complete:

  1. Debugging capability: Print any variable, register, or message to terminal
  2. Interactive system: Build command-line interfaces for configuration
  3. Foundation for future projects: All subsequent projects use UART for output
  4. Industry-relevant skill: UART configuration is a common topic in embedded job interviews
  5. Portfolio piece: Demonstrate understanding of hardware-level serial communication

4. Solution Architecture

4.1 High-Level Design

┌─────────────────────────────────────────────────────────────────┐
│                        main.c                                    │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │ int main(void) {                                            ││
│  │     uart_init(115200);                                      ││
│  │     sei();  // Enable global interrupts                     ││
│  │     uart_puts("System ready\r\n");                          ││
│  │     while(1) {                                              ││
│  │         if (uart_available()) {                             ││
│  │             char *cmd = uart_readline(buffer, 64);          ││
│  │             process_command(cmd);                           ││
│  │         }                                                   ││
│  │     }                                                       ││
│  │ }                                                           ││
│  └─────────────────────────────────────────────────────────────┘│
└────────────────────────────┬────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                        uart.c                                    │
│                                                                  │
│  Transmit Path                    Receive Path                   │
│  ┌──────────────┐                ┌──────────────┐               │
│  │uart_printf() │                │uart_readline()│               │
│  │      │       │                │      ▲       │               │
│  │      ▼       │                │      │       │               │
│  │ uart_puts()  │                │ uart_getc()  │               │
│  │      │       │                │      ▲       │               │
│  │      ▼       │                │      │       │               │
│  │ uart_putc()  │                │ ring_buffer  │◄──┐           │
│  │      │       │                │      ▲       │   │           │
│  │      ▼       │                │      │       │   │           │
│  │  [UCSR0A]    │                │  [UCSR0A]    │   │ ISR       │
│  │  wait UDRE0  │                │  RXC0 flag   │   │           │
│  │      │       │                │      │       │   │           │
│  │      ▼       │                │      │       │   │           │
│  │  [UDR0]      │                │  [UDR0]  ────┼───┘           │
│  │  write byte  │                │  read byte   │               │
│  └──────────────┘                └──────────────┘               │
│                                                                  │
└────────────────────────────┬────────────────────────────────────┘
                             │
                             ▼
┌─────────────────────────────────────────────────────────────────┐
│                     Hardware (USART0)                            │
│                                                                  │
│   ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐ │
│   │  UDR0    │    │ UCSR0A   │    │ UCSR0B   │    │  UBRR0   │ │
│   │  (data)  │    │ (status) │    │ (control)│    │ (baud)   │ │
│   └──────────┘    └──────────┘    └──────────┘    └──────────┘ │
│                                                                  │
│   TX (PD1) ─────────────────────────────►  To USB-Serial        │
│   RX (PD0) ◄─────────────────────────────  From USB-Serial      │
└─────────────────────────────────────────────────────────────────┘

4.2 Key Components

Component            File            Purpose
uart_init()          uart.c          Configure baud rate, enable TX/RX, setup interrupts
uart_putc()          uart.c          Transmit one character (blocking)
uart_getc()          uart.c          Receive one character from buffer
uart_puts()          uart.c          Transmit null-terminated string
uart_available()     uart.c          Check if data in receive buffer
uart_printf()        uart.c          Formatted output (minimal implementation)
Ring buffer          ring_buffer.c   Store incoming bytes without data loss
ISR(USART_RX_vect)   uart.c          Interrupt handler for receive
Command parser       main.c          Process user commands

4.3 UART Register Map

ATmega328P USART0 Register Summary:

Address   Name      Read/Write   Initial Value   Purpose
──────────────────────────────────────────────────────────────────
0xC6      UDR0      R/W          0x00           Data register
0xC5      UBRR0H    R/W          0x00           Baud rate high byte
0xC4      UBRR0L    R/W          0x00           Baud rate low byte
0xC2      UCSR0C    R/W          0x06           Frame format
0xC1      UCSR0B    R/W          0x00           Control (TX/RX enable)
0xC0      UCSR0A    R/W          0x20           Status flags

Initial state after reset:
- UART disabled (TXEN0 = RXEN0 = 0)
- UDRE0 = 1 (transmit buffer empty)
- 8-bit data, no parity, 1 stop bit (UCSR0C = 0x06)

4.4 Baud Rate Calculation

The baud rate is derived from the system clock using a prescaler:

Normal mode (U2X0 = 0):
UBRR = (F_CPU / (16 * BAUD)) - 1

Double-speed mode (U2X0 = 1):
UBRR = (F_CPU / (8 * BAUD)) - 1

Error calculation:
Actual_Baud = F_CPU / (16 * (UBRR + 1))       [normal mode]
Error = ((Actual_Baud - Desired_Baud) / Desired_Baud) * 100%


Example calculations at F_CPU = 16 MHz:

┌──────────┬──────────────┬──────────────┬──────────┬──────────────────┐
│Baud Rate │ UBRR (normal)│UBRR (double) │ Error(N) │ Error(D)         │
├──────────┼──────────────┼──────────────┼──────────┼──────────────────┤
│   2400   │     416      │     832      │  -0.1%   │   0.0%           │
│   9600   │     103      │     207      │   0.2%   │   0.2%           │
│  19200   │      51      │     103      │   0.2%   │   0.2%           │
│  38400   │      25      │      51      │   0.2%   │   0.2%           │
│  57600   │      16      │      34      │   2.1%   │  -0.8%           │
│ 115200   │       8      │      16      │  -3.5%   │   2.1%           │
│ 230400   │       3      │       8      │   8.5%   │  -3.5%           │
│ 250000   │       3      │       7      │   0.0%   │   0.0%           │
│ 500000   │       1      │       3      │   0.0%   │   0.0%           │
│1000000   │       0      │       1      │   0.0%   │   0.0%           │
└──────────┴──────────────┴──────────────┴──────────┴──────────────────┘

Note: Error > 2% may cause communication problems!
      At 115200, double-speed mode cuts the error from -3.5% to +2.1%.


Worked example: 115200 baud at 16 MHz

Normal mode:
  UBRR = (16000000 / (16 * 115200)) - 1
       = (16000000 / 1843200) - 1
       = 8.68 - 1
       = 7.68 → rounds to 8

  Actual baud = 16000000 / (16 * (8 + 1))
              = 16000000 / 144
              = 111111 baud

  Error = (111111 - 115200) / 115200 * 100
        = -3.5%   ← Marginal, may work

Double-speed mode:
  UBRR = (16000000 / (8 * 115200)) - 1
       = (16000000 / 921600) - 1
       = 17.36 - 1
       = 16.36 → rounds to 16

  Actual baud = 16000000 / (8 * (16 + 1))
              = 16000000 / 136
              = 117647 baud

  Error = (117647 - 115200) / 115200 * 100
        = +2.1%   ← Better!
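
The arithmetic in the worked example can be reproduced in integer math. This is a sketch that runs on a PC; the function names `ubrr_calc`, `baud_actual`, and `baud_error_tenths` are made up for illustration. It assumes the requested baud rate is low enough that the rounded quotient is at least 1.

```c
#include <stdint.h>

/* Nearest-integer UBRR. The divisor is 16 in normal mode and 8 in
   double-speed mode (U2X0 = 1). Adding half the denominator before
   dividing rounds to nearest instead of truncating. */
uint16_t ubrr_calc(uint32_t f_cpu, uint32_t baud, int double_speed) {
    uint32_t denom = (double_speed ? 8UL : 16UL) * baud;
    return (uint16_t)((f_cpu + denom / 2) / denom - 1);
}

/* Baud rate the hardware actually generates for a given UBRR
   (truncated to whole baud). */
uint32_t baud_actual(uint32_t f_cpu, uint16_t ubrr, int double_speed) {
    uint32_t div = double_speed ? 8UL : 16UL;
    return f_cpu / (div * ((uint32_t)ubrr + 1));
}

/* Signed error in tenths of a percent, so -35 means -3.5%.
   Computed as (F_CPU - D) / D with D = div * (UBRR + 1) * baud,
   which avoids the precision lost by truncating the actual baud. */
int32_t baud_error_tenths(uint32_t f_cpu, uint32_t baud,
                          uint16_t ubrr, int double_speed) {
    int64_t d = (int64_t)(double_speed ? 8 : 16) * ((int64_t)ubrr + 1) * baud;
    int64_t num = ((int64_t)f_cpu - d) * 1000;
    /* Round half away from zero. */
    return (int32_t)((num >= 0 ? num + d / 2 : num - d / 2) / d);
}
```

Calling `ubrr_calc(16000000UL, 115200UL, 0)` and then feeding the result to the other two helpers lets you compare each row of the table against your own calculation.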

5. Implementation Guide

5.1 Development Environment Setup

# macOS
brew install minicom
# or use built-in 'screen'

# Linux (Ubuntu/Debian)
sudo apt install minicom picocom screen
# Add user to dialout group for serial port access
sudo usermod -a -G dialout $USER
# Log out and back in for group change to take effect

# Verify serial port
ls /dev/tty*  # Look for ttyACM0 or ttyUSB0

# Connect at 115200 baud using screen
screen /dev/ttyACM0 115200
# Exit with Ctrl-A, then K, then Y

# Or use minicom
minicom -D /dev/ttyACM0 -b 115200
# Exit with Ctrl-A, then X

# Or use picocom (recommended - cleaner interface)
picocom -b 115200 /dev/ttyACM0
# Exit with Ctrl-A, then Ctrl-X

5.2 Project Structure

uart_project/
├── main.c              # Application with command loop
├── uart.h              # UART function declarations
├── uart.c              # UART implementation
├── ring_buffer.h       # Ring buffer declarations
├── ring_buffer.c       # Ring buffer implementation (optional separate file)
├── Makefile            # Build automation
└── README.md           # Project documentation

Alternative (simpler structure):
uart_project/
├── main.c              # Everything in one file for simple projects
└── Makefile

5.3 The Core Question You’re Answering

“How do I communicate with a bare metal system without an operating system?”

The answer involves:

  1. Configuring hardware registers for the UART protocol
  2. Converting your system clock speed to baud rate prescaler values
  3. Handling data byte-by-byte at the register level
  4. Using interrupts and buffers to avoid missing incoming data
  5. Building higher-level functions (strings, printf) on top of character I/O

5.4 Concepts You Must Understand First

Concept            Self-Test Question                                                        Reference
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Async serial       “How does the receiver know when a byte starts without a clock?”          Make: AVR Programming Ch. 5
Baud rate          “What’s the bit period in microseconds at 9600 baud?”                     ATmega328P datasheet Section 20
Interrupts         “What happens to the CPU when an interrupt fires? What gets saved?”       Make: AVR Programming Ch. 7
Critical sections  “Why must you disable interrupts when reading from a shared buffer?”      OSTEP Ch. 26
Ring buffers       “How does a ring buffer prevent data loss? What happens when it’s full?”  Any data structures book
Volatile           “Why must buffer variables be volatile?”                                  Embedded.com articles

5.5 Questions to Guide Your Design

Hardware Questions

  1. Which pins are TX and RX on the ATmega328P? (Answer: PD1 and PD0)
  2. What is the maximum baud rate the ATmega328P supports at 16MHz? (Answer: up to 2 Mbaud in double-speed mode, 1 Mbaud in normal mode)
  3. How do you know when the transmit buffer is ready for new data? (Answer: UDRE0 flag)
  4. How do you know when received data is available? (Answer: RXC0 flag)

Software Questions

  1. How big should the receive buffer be for your application?
  2. What happens if the buffer overflows? How will you detect/handle it?
  3. How will you handle backspace characters in readline?
  4. Should newlines be CR, LF, or CRLF?

Edge Cases

  1. What if someone sends data faster than you can process it?
  2. What if the baud rate calculation has significant error?
  3. How do you handle different line endings from different terminals?
  4. What if the user types more characters than your buffer can hold?
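
One possible way to handle backspace, mixed line endings, and over-long input is a small line editor fed one character at a time. The sketch below is hardware-independent so it can be tried on a PC first; `line_editor_t`, `line_editor_feed`, and `LINE_BUF_SIZE` are hypothetical names, and dropping extra characters is just one policy among several.

```c
#include <stddef.h>

#define LINE_BUF_SIZE 32   /* hypothetical limit, including the terminator */

typedef struct {
    char   buf[LINE_BUF_SIZE];
    size_t len;    /* characters collected so far */
    char   last;   /* previous input character, used to collapse CR LF */
} line_editor_t;

/* Feed one received character. Returns 1 when a complete line is ready
   in ed->buf (NUL-terminated, line ending stripped), otherwise 0.
   The caller must use ed->buf before feeding the next character. */
int line_editor_feed(line_editor_t *ed, char c) {
    char prev = ed->last;
    ed->last = c;

    if (c == '\n' && prev == '\r') {
        return 0;                       /* second half of CRLF, already handled */
    }
    if (c == '\r' || c == '\n') {
        ed->buf[ed->len] = '\0';        /* accept CR, LF, or CRLF as end of line */
        ed->len = 0;
        return 1;
    }
    if (c == '\b' || c == 0x7F) {       /* backspace or DEL */
        if (ed->len > 0) {
            ed->len--;
        }
        return 0;
    }
    if (ed->len < LINE_BUF_SIZE - 1) {
        ed->buf[ed->len++] = c;         /* when full, extra characters are dropped */
    }
    return 0;
}
```

A zero-initialized `line_editor_t` is a valid starting state. On the target you would call `line_editor_feed()` for each byte returned by `uart_getc()` and, if the terminal does not echo locally, send the erase sequence yourself when a backspace is accepted.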

5.6 Thinking Exercise

Before coding, manually calculate and verify:

Exercise 1: Baud Rate Calculation

Calculate the UBRR value for 9600 baud at 16MHz:

  1. Using normal mode: UBRR = (16000000 / (16 * 9600)) - 1 = ?
  2. What is the actual baud rate with this UBRR value?
  3. What is the error percentage?

Exercise 2: Timing Analysis

At 115200 baud:

  1. How long does it take to transmit one character (10 bits)?
  2. How many CPU cycles is that at 16MHz?
  3. If your main loop takes 1000 cycles, how many characters could arrive before you check the buffer?

Exercise 3: Buffer Sizing

  1. At 115200 baud, maximum ~11,520 bytes/second arrive
  2. If your main loop processes one command in 100ms, how many bytes could arrive?
  3. What buffer size would you need to not lose data?
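
To check your hand calculations for Exercises 2 and 3, the same arithmetic can be written as three small helpers. This is a sketch that assumes 8N1 framing (10 bits per character); the function names are invented for illustration.

```c
#include <stdint.h>

/* Time on the wire for one 8N1 character (10 bits), in nanoseconds. */
uint32_t char_time_ns(uint32_t baud) {
    return (uint32_t)((10ULL * 1000000000ULL) / baud);
}

/* CPU cycles that elapse while one character is being transferred. */
uint32_t cycles_per_char(uint32_t f_cpu, uint32_t baud) {
    return (uint32_t)((10ULL * f_cpu) / baud);
}

/* Worst-case number of bytes that can arrive in a window of `ms`
   milliseconds: baud / 10 bytes per second, rounded up. */
uint32_t bytes_in_window(uint32_t baud, uint32_t ms) {
    return (uint32_t)(((uint64_t)baud * ms + 9999) / 10000);
}
```

Comparing `cycles_per_char()` with the cycle count of your main loop shows how many characters can pile up between two checks, and `bytes_in_window()` gives a lower bound for the receive buffer size.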

5.7 Hints in Layers

Hint 1: Starting Point (Conceptual)

Start with transmit-only, polling mode. Get characters flowing to your terminal before worrying about receive or interrupts.

The minimal sequence:

  1. Calculate UBRR for your baud rate
  2. Write UBRR to registers
  3. Enable transmitter
  4. Set frame format
  5. Loop: wait for UDRE0, write character to UDR0

Hint 2: Next Level (More Specific)

Basic initialization structure:

void uart_init(uint32_t baud) {
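    // Calculate baud rate register value, rounded to the nearest integer.
    // Plain truncation would give 7 instead of 8 at 115200 baud (+8.5% error).
    uint16_t ubrr = (uint16_t)((F_CPU + 8UL * baud) / (16UL * baud) - 1);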

    // Set baud rate
    UBRR0H = (uint8_t)(ubrr >> 8);
    UBRR0L = (uint8_t)ubrr;

    // Enable transmitter and receiver
    UCSR0B = (1 << TXEN0) | (1 << RXEN0);

    // Set frame format: 8 data bits, 1 stop bit, no parity
    UCSR0C = (1 << UCSZ01) | (1 << UCSZ00);
}

Simple transmit function:

void uart_putc(char c) {
    // Wait for transmit buffer to be empty
    while (!(UCSR0A & (1 << UDRE0)));
    // Write data to transmit buffer
    UDR0 = c;
}

Hint 3: Technical Details (Interrupt-Driven RX)

For interrupt-driven receive, you need:

#include <avr/interrupt.h>

// Ring buffer structure
#define RX_BUFFER_SIZE 64
volatile uint8_t rx_buffer[RX_BUFFER_SIZE];
volatile uint8_t rx_head = 0;  // Read index
volatile uint8_t rx_tail = 0;  // Write index
volatile uint8_t rx_count = 0; // Bytes in buffer

// Initialize with RX interrupt enabled
void uart_init(uint32_t baud) {
    // ... baud rate setup as before ...

    // Enable TX, RX, and RX Complete Interrupt
    UCSR0B = (1 << TXEN0) | (1 << RXEN0) | (1 << RXCIE0);

    // Enable global interrupts
    sei();
}

// Receive complete interrupt handler
ISR(USART_RX_vect) {
    uint8_t data = UDR0;  // Must read UDR0 to clear interrupt flag

    if (rx_count < RX_BUFFER_SIZE) {
        rx_buffer[rx_tail] = data;
        rx_tail = (rx_tail + 1) % RX_BUFFER_SIZE;
        rx_count++;
    }
    // If buffer full, data is silently lost (overflow)
}

// Check if data available
uint8_t uart_available(void) {
    uint8_t count;
    cli();           // Disable interrupts
    count = rx_count;
    sei();           // Enable interrupts
    return count;
}

// Get one character (blocking)
char uart_getc(void) {
    while (rx_count == 0);  // Wait for data

    cli();  // Critical section
    char data = rx_buffer[rx_head];
    rx_head = (rx_head + 1) % RX_BUFFER_SIZE;
    rx_count--;
    sei();

    return data;
}

Hint 4: Verification Method

# Test 1: Transmit only - startup message
$ screen /dev/ttyACM0 115200
# Should see startup message when Arduino resets

# Test 2: Echo test - type characters, they should echo back
# Each character you type appears once (sent back by Arduino)

# Test 3: Buffer test - paste a long string quickly
# Should see complete string without missing characters

# Test 4: Printf test
# Output should be correctly formatted

# Test 5: Overflow test (optional)
$ head -c 1000 /dev/urandom > /dev/ttyACM0
# Device should not crash; some data loss acceptable

5.8 The Interview Questions They’ll Ask

  1. “Explain UART protocol. What does 8N1 mean?”
    • 8 data bits, No parity, 1 stop bit
    • Asynchronous (no clock), uses start/stop bits for framing
    • LSB transmitted first
  2. “How do you calculate baud rate registers?”
    • UBRR = (F_CPU / (16 * BAUD)) - 1 for normal mode
    • Double-speed mode uses divisor of 8 instead of 16
    • Error should stay under about 2% for reliable operation
  3. “Why use a ring buffer for receive?”
    • Interrupt can arrive anytime, need to store data immediately
    • Main code processes at different speed than data arrives
    • FIFO order preserved, efficient memory use
  4. “What’s the difference between polling and interrupt-driven I/O?”
    • Polling: CPU constantly checks flags, simple but wastes cycles
    • Interrupt: CPU notified when data ready, efficient but complex
    • Polling can miss data if check interval > byte time
  5. “How do you handle critical sections in interrupt-driven code?”
    • Disable interrupts (cli) around shared data access
    • Keep critical sections short to minimize latency
    • Use volatile for shared variables
  6. “What happens if you don’t read UDR0 when data arrives?”
    • RXC0 flag stays set; the two-level receive buffer fills, and further incoming bytes are lost
    • Data OverRun flag (DOR0) set, indicating lost data
    • With RXCIE0 enabled, the RX ISR re-fires as soon as it returns, starving the main code
  7. “How would you implement flow control?”
    • Software: XON/XOFF characters (Ctrl-Q/Ctrl-S)
    • Hardware: RTS/CTS lines (no dedicated pins on the ATmega328P USART; emulate with GPIO)
    • Buffer high-water mark triggers flow control
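
The high-water mark idea from the last answer can be sketched as a small state machine. The thresholds and the names `flow_state_t` and `flow_update` are assumptions made for this example; XON (0x11) and XOFF (0x13) are the standard ASCII control codes.

```c
#include <stdint.h>

#define XON  0x11   /* Ctrl-Q: sender may resume */
#define XOFF 0x13   /* Ctrl-S: sender should pause */

#define RX_BUFFER_SIZE 64
#define HIGH_WATER (RX_BUFFER_SIZE * 3 / 4)   /* hypothetical: pause at 75% full */
#define LOW_WATER  (RX_BUFFER_SIZE / 4)       /* hypothetical: resume at 25% full */

typedef struct {
    uint8_t paused;   /* 1 after XOFF has been sent and not yet cancelled */
} flow_state_t;

/* Call whenever the buffer fill level changes. Returns XOFF or XON when
   that control character should be transmitted, or 0 when nothing needs
   to be sent. The gap between the two thresholds prevents rapid toggling. */
uint8_t flow_update(flow_state_t *fs, uint8_t rx_count) {
    if (!fs->paused && rx_count >= HIGH_WATER) {
        fs->paused = 1;
        return XOFF;
    }
    if (fs->paused && rx_count <= LOW_WATER) {
        fs->paused = 0;
        return XON;
    }
    return 0;
}
```

The sender keeps transmitting until it has processed the XOFF, so the high-water mark should leave enough free space for the bytes already in flight.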

5.9 Books That Will Help

Topic              Book                                          Chapter
───────────────────────────────────────────────────────────────────────────────────────────────
UART Protocol      “Make: AVR Programming” by Elliot Williams    Chapter 5: Serial I/O
Baud Rate Math     ATmega328P Datasheet                          Section 20.3: Clock Generation
Interrupts         “Make: AVR Programming” by Elliot Williams    Chapter 7: Hardware Interrupts
Ring Buffers       “Mastering Algorithms with C” by Kyle Loudon  Chapter 5: Linked Lists
Serial Debugging   “Making Embedded Systems” by Elecia White     Chapter 8: System Design
Critical Sections  “OSTEP” by Arpaci-Dusseau                     Chapter 26: Concurrency

5.10 Implementation Phases

Phase 1: Polling TX Only (2-3 hours)

Goal: Get characters appearing in terminal

#include <avr/io.h>

#define F_CPU 16000000UL
#define BAUD 9600

void uart_init(void) {
    uint16_t ubrr = (F_CPU + 8UL * BAUD) / (16UL * BAUD) - 1;  // rounded to nearest
    UBRR0H = (ubrr >> 8);
    UBRR0L = ubrr;
    UCSR0B = (1 << TXEN0);  // Enable transmitter only
    UCSR0C = (1 << UCSZ01) | (1 << UCSZ00);  // 8N1
}

void uart_putc(char c) {
    while (!(UCSR0A & (1 << UDRE0)));  // Wait for empty buffer
    UDR0 = c;
}

void uart_puts(const char *s) {
    while (*s) uart_putc(*s++);
}

int main(void) {
    uart_init();
    uart_puts("Hello from bare metal!\r\n");
    while (1);
    return 0;
}

Phase 2: Add Polling RX (2-3 hours)

Goal: Echo characters back

// Add to Phase 1 code:

char uart_getc(void) {
    while (!(UCSR0A & (1 << RXC0)));  // Wait for data
    return UDR0;
}

int main(void) {
    uart_init();
    uart_puts("Echo test - type something:\r\n");

    while (1) {
        char c = uart_getc();
        uart_putc(c);  // Echo back

        // Also echo newline
        if (c == '\r') uart_putc('\n');
    }
}

Phase 3: Interrupt-Driven RX with Buffer (4-6 hours)

Goal: No data loss during processing

Add ring buffer and ISR for receive (see Hint 3 for code structure).

Phase 4: Printf and Command Parser (4-6 hours)

Goal: Formatted output and interactive commands

Implement minimal printf supporting %d, %s, %x, %c, and add command parsing.
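
One way to structure the minimal printf is to write the formatter against an output callback, so the same code can be tested on a PC and later pointed at `uart_putc`. This is a sketch under those assumptions: `mini_printf`, `mini_vprintf`, `putc_fn`, and the capture helper are hypothetical names, field widths and padding are not supported, and `%s` assumes a non-NULL string.

```c
#include <stdarg.h>
#include <stddef.h>

/* Output hook: on the target this would be uart_putc. */
typedef void (*putc_fn)(char);

/* Print an unsigned value in the given base (10 or 16 here). */
static void print_unsigned(putc_fn out, unsigned int v, unsigned int base) {
    char tmp[sizeof(unsigned int) * 8];   /* enough digits for any base >= 2 */
    int n = 0;
    do {
        tmp[n++] = "0123456789abcdef"[v % base];
        v /= base;
    } while (v != 0);
    while (n > 0) {
        out(tmp[--n]);                    /* digits were generated in reverse */
    }
}

void mini_vprintf(putc_fn out, const char *fmt, va_list ap) {
    for (; *fmt; fmt++) {
        if (*fmt != '%') { out(*fmt); continue; }
        fmt++;
        if (*fmt == '\0') break;          /* stray '%' at end of format */
        switch (*fmt) {
            case 'd': {
                int v = va_arg(ap, int);
                unsigned int u = (unsigned int)v;
                if (v < 0) { out('-'); u = 0u - u; }   /* also correct for INT_MIN */
                print_unsigned(out, u, 10);
                break;
            }
            case 'x': print_unsigned(out, va_arg(ap, unsigned int), 16); break;
            case 's': {
                const char *s = va_arg(ap, const char *);
                while (*s) out(*s++);
                break;
            }
            case 'c': out((char)va_arg(ap, int)); break;
            case '%': out('%'); break;
            default:  out('%'); out(*fmt); break;      /* unknown: print as-is */
        }
    }
}

void mini_printf(putc_fn out, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    mini_vprintf(out, fmt, ap);
    va_end(ap);
}

/* Host-side stand-in for uart_putc: collects output in a buffer. */
char   capture_buf[64];
size_t capture_len;

void capture_putc(char c) {
    if (capture_len < sizeof capture_buf - 1) {
        capture_buf[capture_len++] = c;
        capture_buf[capture_len] = '\0';
    }
}
```

On the target, a call would look like `mini_printf(uart_putc, "Value: %d\r\n", value);`, with `uart_putc` being the function from Phase 1.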

5.11 Key Implementation Decisions

Decision           Options               Recommended           Rationale
────────────────────────────────────────────────────────────────────────────────────────────────
Initial baud rate  9600 vs 115200        9600 first            Lower error, easier to debug
RX handling        Polling vs interrupt  Interrupt             Avoid data loss in real apps
Buffer size        16/32/64/128          64 bytes              Good balance of RAM vs safety
Printf             Full vs minimal       Minimal (%d, %s, %x)  Flash size constraints
Line ending        CR/LF/CRLF            CRLF (\r\n)           Universal terminal compatibility
Double-speed       U2X0=0 vs U2X0=1      Enable for 115200     Lower baud rate error

6. Testing Strategy

6.1 Unit Tests (on Host)

Test ring buffer logic on your PC before embedding:

// test_ringbuffer.c - compile with gcc on host
#include <assert.h>
#include <stdint.h>  // for uint8_t
#include <stdio.h>

#define BUFFER_SIZE 8
uint8_t buffer[BUFFER_SIZE];
uint8_t head = 0, tail = 0, count = 0;

void buffer_put(uint8_t data) {
    if (count < BUFFER_SIZE) {
        buffer[tail] = data;
        tail = (tail + 1) % BUFFER_SIZE;
        count++;
    }
}

int buffer_get(void) {
    if (count > 0) {
        uint8_t data = buffer[head];
        head = (head + 1) % BUFFER_SIZE;
        count--;
        return data;
    }
    return -1;
}

void test_basic() {
    buffer_put('A');
    buffer_put('B');
    assert(buffer_get() == 'A');
    assert(buffer_get() == 'B');
    assert(buffer_get() == -1);  // Empty
    printf("Basic test passed\n");
}

void test_wraparound() {
    // Try to overfill: only the first BUFFER_SIZE puts are stored,
    // the rest are dropped, and tail wraps back to index 0
    for (int i = 0; i < 100; i++) {
        buffer_put('X');
    }
    assert(count == BUFFER_SIZE);  // Should be full
    for (int i = 0; i < BUFFER_SIZE; i++) {
        assert(buffer_get() == 'X');
    }
    assert(count == 0);  // Should be empty
    printf("Wraparound test passed\n");
}

int main() {
    test_basic();
    head = tail = count = 0;  // Reset
    test_wraparound();
    printf("All tests passed!\n");
    return 0;
}

6.2 Hardware Tests

Test      Method                    Expected Result
─────────────────────────────────────────────────────────────────────────
Basic TX  Boot device               Startup message appears in terminal
Echo      Type characters           Characters echo back immediately
String    Type “hello” + Enter      Echoed string, processed as command
Buffer    Paste 100 chars quickly   All characters received (check count)
Overflow  Paste 1000 chars          Some loss OK, no crash, recovers
Commands  Type “help”               Help message displayed
Printf    Output formatted numbers  Correct decimal, hex, string output

6.3 Stress Testing

# Send continuous data and verify device stays responsive
$ while true; do echo "test"; sleep 0.01; done > /dev/ttyACM0

# In another terminal, check if device responds
$ echo "status" > /dev/ttyACM0

# Verify no crash after extended operation
$ cat /dev/urandom | base64 | head -c 100000 > /dev/ttyACM0
$ echo "status" > /dev/ttyACM0  # Should still respond

6.4 Baud Rate Verification

# If characters appear garbled, calculate actual vs expected baud rate
# Use oscilloscope if available to measure actual bit timing

# Common symptoms:
# - Garbage characters: baud rate mismatch
# - Every other char wrong: double-speed mode mismatch
# - First char correct, rest wrong: framing error

7. Common Pitfalls & Debugging

7.1 No Output At All

Symptom                 Likely Cause       Fix
─────────────────────────────────────────────────────────────────────────────
Nothing in terminal     Wrong serial port  Check ls /dev/tty*, try each
Terminal shows nothing  TXEN0 not set      Verify UCSR0B configuration
Port won’t open         Permission denied  Add user to dialout group (Linux)
Arduino not recognized  Driver missing     Install USB serial driver

7.2 Garbled Output

Expected: "Hello World"
Actual:   "HHHHeeellllllooo"   ← Baud rate too slow
Actual:   "H?l?o"              ← Baud rate too fast
Actual:   "Íello World"        ← Framing error (wrong start bit)
Actual:   "Hello World " + garbage ← Line ending mismatch

Debug steps:

  1. Verify UBRR calculation matches datasheet formula
  2. Check if double-speed mode is correctly enabled/disabled
  3. Ensure both ends use same baud rate, data bits, parity, stop bits

7.3 Lost Characters

Symptom                               Cause                           Fix
───────────────────────────────────────────────────────────────────────────────────────────────────────
Missing chars in fast input           Buffer overflow                 Increase buffer size
Random missing chars                  Interrupt latency too high      Check for long interrupt handlers
First char always lost                Not waiting for RXC0 initially  Check initialization sequence
Chars lost during command processing  Processing too slow             Increase buffer, optimize code

7.4 Interrupt Problems

Symptom          Cause                     Fix
─────────────────────────────────────────────────────────────────────────────
System hangs     ISR not clearing flag     Always read UDR0 in RX ISR
Corrupted data   Missing critical section  Add cli()/sei() around shared data
Random resets    Stack overflow in ISR     Keep ISR code minimal
ISR never fires  Interrupts not enabled    Call sei() after initialization

7.5 Debugging Tips

// Debug without UART (if UART itself is broken)
// Use LED to show progress through code
// Needs <avr/io.h> for the port registers and <util/delay.h> for _delay_ms()
#define DEBUG_LED_INIT()  (DDRB |= (1 << PB5))
#define DEBUG_LED_ON()    (PORTB |= (1 << PB5))
#define DEBUG_LED_OFF()   (PORTB &= ~(1 << PB5))
#define DEBUG_LED_TOGGLE() (PORTB ^= (1 << PB5))

// Checkpoint debugging
DEBUG_LED_INIT();
DEBUG_LED_ON();    // Made it to checkpoint 1
_delay_ms(500);
DEBUG_LED_OFF();   // Moving to checkpoint 2
// ... next operation ...
DEBUG_LED_ON();    // Made it to checkpoint 2

// Once UART works, add debug levels
#define DEBUG_LEVEL 2
#if DEBUG_LEVEL >= 1
    #define DEBUG_INFO(s) uart_puts(s)
#else
    #define DEBUG_INFO(s)
#endif

#if DEBUG_LEVEL >= 2
    #define DEBUG_VERBOSE(s) uart_puts(s)
#else
    #define DEBUG_VERBOSE(s)
#endif

8. Extensions & Challenges

8.1 Easy Extensions (1-2 hours each)

  1. Hex dump function: Print memory regions in hex format with ASCII sidebar
  2. Number input: Parse decimal and hex integers from user input
  3. Uptime counter: Display seconds since boot in status command
  4. LED control: Blink LED based on serial commands

8.2 Intermediate Challenges (3-5 hours each)

  1. Software flow control (XON/XOFF): Send XOFF when buffer near full, XON when space available
  2. ANSI escape codes: Add color output and cursor positioning
  3. Command history: Up arrow recalls previous command (like bash)
  4. Tab completion: Complete partial command names
  5. Multiple baud rates: Switch baud rate at runtime via command

8.3 Advanced Challenges (1-2 days each)

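  1. DMA-style TX: The ATmega328P has no DMA controller, so use a TX ring buffer drained by the USART_UDRE_vect interrupt to send large transfers with minimal CPU intervention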
  2. Binary protocol: Implement SLIP or COBS for reliable binary data
  3. Bootloader communication: Update firmware over UART
  4. RS-485 support: Add direction control for half-duplex multi-drop
  5. UART-to-I2C bridge: Send I2C commands via UART terminal

9. Real-World Connections

9.1 Industrial Applications

Application         How UART is Used        Typical Baud Rate
─────────────────────────────────────────────────────────────
GPS modules         NMEA sentence output    4800-115200
Bluetooth modules   AT commands and data    9600-460800
Industrial sensors  Modbus RTU over RS-485  9600-115200
Barcode scanners    Scan data output        9600-115200
Debug consoles      Every embedded device   115200
3D printer control  G-code commands         115200-250000
Cellular modems     AT commands             115200

9.2 Professional Code Comparison

Your bare metal code vs. Arduino Serial library:

// Your bare metal version
// ~200-400 bytes depending on features
uart_init(115200);
uart_printf("Value: %d\n", value);

// Arduino version
// ~2KB for Serial class
Serial.begin(115200);
Serial.print("Value: ");
Serial.println(value);

// Your advantages:
// - 5-10x smaller code size
// - Full control over behavior
// - Can optimize for specific use case
// - Deep understanding of what's happening
// - No hidden abstractions

9.3 Career Impact

UART skills are foundational for:

  • Embedded Software Engineer: Every embedded system has debug serial
  • IoT Developer: Many modules (GPS, Bluetooth, cellular modems) use UART
  • Hardware Engineer: Bringing up new boards requires serial debug
  • Security Researcher: Serial consoles often enable hardware hacking
  • Robotics Engineer: Motor controllers, sensors all use serial

10. Resources

10.1 Essential References

Resource                         Purpose
──────────────────────────────────────────────────────────────────────
ATmega328P Datasheet Section 20  Official USART register documentation
avr-libc interrupt.h             Interrupt macros and ISR syntax
RS-232 Wikipedia                 Protocol history and specifications

10.2 Online Tutorials

Resource                   URL                                                Purpose
──────────────────────────────────────────────────────────────────────────────────────────────────────────────
AVR UART Tutorial          maxembedded.com/avr-uart                           Step-by-step UART guide
Ring Buffer Guide          embedjournal.com                                   Circular buffer implementation
Serial Protocol Deep Dive  learn.sparkfun.com/tutorials/serial-communication  Protocol fundamentals

10.3 Tools

Tool      Purpose                      Usage
─────────────────────────────────────────────────────────────────────────
screen    Serial terminal              screen /dev/ttyACM0 115200
minicom   Serial terminal with config  minicom -D /dev/ttyACM0 -b 115200
picocom   Simple serial terminal       picocom -b 115200 /dev/ttyACM0
PuTTY     Windows serial terminal      GUI configuration
CoolTerm  Cross-platform terminal      GUI with logging

11. Self-Assessment Checklist

Knowledge

  • Can explain UART frame format (start bit, data bits, stop bit)
  • Can calculate UBRR for any baud rate at 16MHz
  • Can explain when to use normal vs double-speed mode
  • Can describe interrupt-driven vs polling I/O tradeoffs
  • Can explain why ring buffers are needed for serial receive
  • Can identify and explain all USART registers
  • Can describe what happens during a critical section

Skills

  • Can initialize UART from scratch without reference
  • Can implement transmit and receive functions
  • Can write an ISR for serial receive
  • Can implement a working ring buffer
  • Can debug baud rate mismatch problems
  • Can use a serial terminal effectively

Confidence

  • Could add UART to any AVR project without documentation
  • Could debug communication problems systematically
  • Could implement a command-line interface
  • Could explain this project confidently in an interview
  • Could teach UART basics to another developer

12. Completion Criteria

Required (Must Have)

  1. UART initializes at 115200 baud - configurable via define or function parameter
  2. Can transmit strings - uart_puts() works correctly
  3. Can receive characters - polling or interrupt-driven
  4. Works with standard terminal - screen, minicom, or picocom
  5. Code under 1KB - check with avr-size

Bonus (Should Have)

  • Interrupt-driven receive with ring buffer
  • No data loss at sustained 115200 baud input
  • printf-style formatted output (%d, %s, %x, %c)
  • Command-line interface with help command

Evidence of Completion

Provide:

  1. Screenshot of terminal showing bidirectional communication
  2. Code demonstrating command parsing
  3. avr-size output showing code size
  4. Brief explanation of your design decisions

Previous Project: P01 - Blink LED

Next Project: P03 - Hardware Timer and PWM


With UART working, you have the debugging lifeline for all future projects. Every bare metal system needs serial output - now you know how to build it from scratch. Next, we’ll master precise timing with hardware timers!