Project 4: UART Driver from Scratch

Back to ARM Deep Dive


Quick Reference

Attribute Details
Difficulty Advanced
Time Estimate 1-2 weeks
Language C (primary), ARM Assembly/Rust (alternatives)
Prerequisites Project 3 (bare-metal), understanding of serial communication, interrupts
Key Topics UART protocol, peripheral registers, NVIC, ring buffers, baud rate
Hardware Required STM32 board + USB-to-Serial adapter (or use built-in ST-Link VCP)
Book Reference “Making Embedded Systems, 2nd Edition” by Elecia White

Learning Objectives

By completing this project, you will:

  1. Understand UART protocol - Know the electrical and timing characteristics of serial communication
  2. Configure peripheral registers - Work with status, data, baud rate, and control registers
  3. Calculate baud rates - Derive divisor values from clock frequencies
  4. Implement interrupt handlers - Configure NVIC, write ISRs, handle flags
  5. Build ring buffers - Create efficient data structures for async I/O
  6. Debug communication issues - Troubleshoot timing, framing, and buffer problems

Theoretical Foundation

Core Concepts

1. UART Protocol Fundamentals

UART (Universal Asynchronous Receiver/Transmitter) is the most common serial protocol:

UART Frame Structure:

 Idle    Start   D0   D1   D2   D3   D4   D5   D6   D7  Parity  Stop   Idle
 (High)    |                                              (opt)   |    (High)
           v                                                      v
   ___     _    _____     _    _____     _____     _         _   ___
      |___| |___| :   |___| |___| :   |___| :   |___| ... ___| |_|
           |                                              |     |
           |<------------ 8-10 bit times --------------->|<-1->|

Key Parameters:
  - Baud rate: Bits per second (e.g., 115200)
  - Data bits: Usually 8
  - Parity: None, Even, or Odd
  - Stop bits: 1 or 2

Common notation: 115200-8N1 = 115200 baud, 8 data, No parity, 1 stop

Timing Example at 115200 baud:

  • Bit time = 1/115200 = 8.68 microseconds
  • Frame time (8N1) = 10 bits * 8.68us = 86.8 microseconds
  • Max throughput = 115200/10 = 11,520 bytes/second

2. UART vs Other Serial Protocols

Protocol Comparison:
+----------+--------+--------+---------+--------+------------+
| Protocol | Wires  | Speed  | Distance| Full   | Use Case   |
|          |        |        |         | Duplex |            |
+----------+--------+--------+---------+--------+------------+
| UART     | 2 (TX/ | 1Mbps  | 15m     | Yes    | Debug,     |
|          | RX)    |        |         |        | modems     |
+----------+--------+--------+---------+--------+------------+
| SPI      | 4+     | 50Mbps | <1m     | Yes    | Sensors,   |
|          |        |        |         |        | Flash      |
+----------+--------+--------+---------+--------+------------+
| I2C      | 2      | 3.4Mbps| <1m     | No     | Sensors,   |
|          | (SCL/  |        |         | (half) | EEPROM     |
|          | SDA)   |        |         |        |            |
+----------+--------+--------+---------+--------+------------+

UART advantages:

  • Only 2 wires (TX, RX) + ground
  • Widely supported (USB-serial adapters everywhere)
  • Works over long distances with RS-232/RS-485
  • Simple to implement and debug

3. STM32 USART Registers

The STM32 USART peripheral has several key registers:

USART Register Map (STM32F4):
+--------+---------------------------------------------------+
| Offset | Register | Purpose                                |
+--------+---------------------------------------------------+
| 0x00   | SR       | Status: TXE, RXNE, TC, ORE, FE, PE    |
| 0x04   | DR       | Data Register (read=RX, write=TX)      |
| 0x08   | BRR      | Baud Rate Register (mantissa+fraction) |
| 0x0C   | CR1      | Control 1: UE, TE, RE, RXNEIE, TXEIE  |
| 0x10   | CR2      | Control 2: STOP bits, clock settings   |
| 0x14   | CR3      | Control 3: DMA, flow control           |
| 0x18   | GTPR     | Guard time and prescaler               |
+--------+---------------------------------------------------+

Key Status Register (SR) Bits:
+------+------+----------------------------------------+
| Bit  | Name | Meaning                                |
+------+------+----------------------------------------+
| 7    | TXE  | TX buffer Empty (ready to send)        |
| 6    | TC   | Transmission Complete                  |
| 5    | RXNE | RX buffer Not Empty (data received)    |
| 4    | IDLE | Idle line detected                     |
| 3    | ORE  | Overrun Error (data lost)              |
| 2    | NF   | Noise detected                         |
| 1    | FE   | Framing Error (bad stop bit)           |
| 0    | PE   | Parity Error                           |
+------+------+----------------------------------------+

4. Baud Rate Calculation

The baud rate is set via a divisor in the BRR register:

Baud Rate Formula:
                    fPCLK
BaudRate = --------------------------
           16 * USARTDIV

Where USARTDIV = Mantissa + (Fraction / 16)

Example: 115200 baud with 16MHz peripheral clock

USARTDIV = 16,000,000 / (16 * 115200) = 8.6805

Mantissa = 8
Fraction = 0.6805 * 16 = 10.88 ≈ 11

BRR = (Mantissa << 4) | Fraction
BRR = (8 << 4) | 11 = 0x8B = 139

Actual baud rate:
16,000,000 / (16 * 8.6875) = 115,108 baud
Error = (115200 - 115108) / 115200 = 0.08% (acceptable)

5. Interrupt-Driven I/O

Polling wastes CPU cycles waiting. Interrupts allow the CPU to do other work:

Polling Approach:
+------------------+
| while(1) {       |     CPU stuck
|   if (RXNE)      | <-- waiting for
|     read byte    |     each byte
|   if (TXE)       |
|     send byte    |
| }                |
+------------------+

Interrupt Approach:
+------------------+     +------------------+
| main() {         |     | USART_IRQHandler |
|   enable_irq()   |     | {                |
|   while(1) {     |     |   if (RXNE)      |
|     do_work()    |     |     buffer_add() |
|   }              |     |   if (TXE)       |
| }                |     |     send_next()  |
+------------------+     | }                |
       |                 +--------^---------+
       |                          |
       +--- IRQ fires when -------+
            data ready

6. Ring Buffer Data Structure

Ring buffers (circular buffers) enable async I/O without blocking:

Ring Buffer Structure:
                    write here
                        |
                        v
+---+---+---+---+---+---+---+---+
| A | B | C | D |   |   |   |   |
+---+---+---+---+---+---+---+---+
  ^                   ^
  |                   |
  read next           head (write position)
  (tail)

typedef struct {
    uint8_t data[BUFFER_SIZE];
    volatile uint8_t head;  // Write position
    volatile uint8_t tail;  // Read position
} ring_buffer_t;

Operations:
- Write: data[head] = byte; head = (head + 1) % SIZE;
- Read:  byte = data[tail]; tail = (tail + 1) % SIZE;
- Empty: head == tail
- Full:  (head + 1) % SIZE == tail

Why volatile? Head/tail modified in interrupt AND main code!

Why This Matters

UART is the “printf debugging” of embedded systems:

  • Every embedded project needs debug output
  • Universal connectivity - works with any computer
  • Protocol foundation - same concepts apply to SPI, I2C
  • Interrupt patterns - same ISR patterns for all peripherals
  • Driver architecture - template for all peripheral drivers

Historical Context

UART dates back to the 1960s (teletypewriters). RS-232 was standardized in 1962. Today’s USB-serial adapters convert UART to USB, but the protocol remains identical. Understanding UART means understanding 60 years of serial communication.


Project Specification

What You’ll Build

A complete UART driver for bare-metal ARM that:

  1. Initializes UART with configurable baud rate
  2. Implements polling mode for simple use
  3. Implements interrupt mode for efficiency
  4. Uses ring buffers for TX and RX
  5. Provides printf-like formatted output
  6. Handles errors (overrun, framing)
  7. Serves as a debug console

Requirements

Core Requirements:

  • Configure UART pins (TX/RX as alternate function)
  • Calculate and set baud rate divisor
  • Enable UART peripheral (UE, TE, RE)
  • Polling transmit: wait for TXE, write DR
  • Polling receive: wait for RXNE, read DR
  • Basic echo functionality

Extended Requirements:

  • Configure NVIC for USART interrupt
  • Implement RX interrupt handler
  • Implement TX interrupt handler
  • Ring buffer for receive
  • Ring buffer for transmit
  • printf-style formatted output
  • Error detection and reporting

Hardware Setup

STM32 Nucleo boards have built-in ST-Link with Virtual COM Port:

  • USART2 TX: PA2
  • USART2 RX: PA3
  • Shows as /dev/ttyACM0 on Linux, COM3 on Windows

External USB-Serial Adapter:

  • Connect TX to RX on adapter
  • Connect RX to TX on adapter
  • Connect GND to GND
  • 3.3V logic levels!

Example Output

# On your computer:
$ screen /dev/ttyACM0 115200

=== ARM UART Driver Test ===
UART initialized at 115200 baud
Echo mode active. Type characters:

> Hello, ARM!
You typed: Hello, ARM!

> test interrupt
IRQ count: 47, TX: 14, RX: 14, Errors: 0

> stats
Buffer status: RX 0/64, TX 0/64
Overruns: 0, Framing errors: 0

> 0123456789ABCDEF (rapid typing test)
0123456789ABCDEF

Status:
  RX interrupts: 892
  TX interrupts: 1456
  Overrun errors: 0
  Characters processed: 2348

Solution Architecture

High-Level Design

+----------------+          +-------------------+
|   Application  |          |                   |
|  (main.c)      |          |   UART ISR        |
|                |          |                   |
| uart_puts()    |--+       |  RXNE: read DR,   |
| uart_printf()  |  |       |  add to rx_buf    |
| uart_getc()    |  |       |                   |
+-------+--------+  |       |  TXE: get from    |
        |           |       |  tx_buf, write DR |
        |           |       +---------+---------+
        v           v                 |
+-------------------+                 |
|   Ring Buffers    |<----------------+
|                   |
| +-------+  +-----+|
| |TX buf |  |RX   ||
| |       |  |buf  ||
| +-------+  +-----+|
+--------+----------+
         |
         v
+-------------------+
|  UART Registers   |
|  (hardware)       |
|                   |
|  SR, DR, BRR, CR1 |
+-------------------+
         |
         v
      TX/RX Pins
         |
         v
   Serial Cable to PC

File Structure

uart-driver/
+-- startup.s          # Vector table (from Project 3)
+-- main.c             # Application code
+-- uart.c             # UART driver implementation
+-- uart.h             # UART driver interface
+-- ring_buffer.c      # Ring buffer implementation
+-- ring_buffer.h      # Ring buffer interface
+-- stm32f4.h          # Register definitions
+-- linker.ld          # Memory layout
+-- Makefile           # Build script

UART Driver Interface

/* uart.h */
#ifndef UART_H
#define UART_H

#include <stdint.h>
#include <stddef.h>

/* Configuration */
typedef enum {
    UART_PARITY_NONE,
    UART_PARITY_EVEN,
    UART_PARITY_ODD
} uart_parity_t;

typedef enum {
    UART_STOPBITS_1,
    UART_STOPBITS_2
} uart_stopbits_t;

typedef struct {
    uint32_t baudrate;
    uart_parity_t parity;
    uart_stopbits_t stopbits;
    uint8_t use_interrupts;
} uart_config_t;

/* Initialization */
void uart_init(uart_config_t *config);
void uart_deinit(void);

/* Polling mode */
void uart_putc_blocking(char c);
char uart_getc_blocking(void);

/* Non-blocking (interrupt mode) */
int uart_putc(char c);           /* Returns -1 if buffer full */
int uart_getc(void);             /* Returns -1 if buffer empty */

/* String operations */
void uart_puts(const char *str);
void uart_printf(const char *fmt, ...);

/* Buffer status */
size_t uart_rx_available(void);
size_t uart_tx_space(void);
void uart_flush_tx(void);

/* Statistics */
typedef struct {
    uint32_t rx_bytes;
    uint32_t tx_bytes;
    uint32_t rx_interrupts;
    uint32_t tx_interrupts;
    uint32_t overrun_errors;
    uint32_t framing_errors;
} uart_stats_t;

void uart_get_stats(uart_stats_t *stats);

#endif /* UART_H */

Implementation Guide

Environment Setup

Reuse Project 3 setup, adding:

# Test terminal programs
sudo apt install screen minicom picocom

# Connect to serial port
screen /dev/ttyACM0 115200
# Exit with: Ctrl-A, then K, then Y

# Or use minicom
minicom -D /dev/ttyACM0 -b 115200

Implementation Hints in Layers

Hint 1: GPIO Configuration for UART

UART pins must be configured as alternate functions:

/* Configure PA2 (TX) and PA3 (RX) for USART2 */
void uart_gpio_init(void) {
    /* 1. Enable GPIOA clock */
    RCC->AHB1ENR |= RCC_AHB1ENR_GPIOAEN;

    /* 2. Set PA2, PA3 to Alternate Function mode */
    GPIOA->MODER &= ~((0x3 << (2 * 2)) | (0x3 << (3 * 2)));
    GPIOA->MODER |= (0x2 << (2 * 2)) | (0x2 << (3 * 2));

    /* 3. Set alternate function to AF7 (USART2) */
    /* AFR[0] covers pins 0-7 */
    GPIOA->AFR[0] &= ~((0xF << (2 * 4)) | (0xF << (3 * 4)));
    GPIOA->AFR[0] |= (7 << (2 * 4)) | (7 << (3 * 4));

    /* 4. High speed for TX pin */
    GPIOA->OSPEEDR |= (0x2 << (2 * 2));

    /* 5. No pull-up/pull-down (or weak pull-up for RX) */
    GPIOA->PUPDR &= ~((0x3 << (2 * 2)) | (0x3 << (3 * 2)));
    GPIOA->PUPDR |= (0x1 << (3 * 2));  /* Pull-up on RX */
}

Hint 2: UART Initialization (Polling Mode)

Start with simple polling - get it working before interrupts:

void uart_init_polling(uint32_t baudrate) {
    /* 1. Enable USART2 clock */
    RCC->APB1ENR |= RCC_APB1ENR_USART2EN;

    /* 2. Configure GPIO pins */
    uart_gpio_init();

    /* 3. Disable UART during configuration */
    USART2->CR1 = 0;

    /* 4. Set baud rate */
    /* Assuming APB1 clock = 16MHz (default) */
    /* USARTDIV = fPCLK / (16 * baud) */
    uint32_t pclk = 16000000;  /* Get actual clock if using PLL */
    uint32_t div = (pclk + baudrate / 2) / baudrate;  /* Round */
    USART2->BRR = div;

    /* 5. Configure frame: 8N1 */
    /* CR1: M=0 (8 bits), PCE=0 (no parity) */
    /* CR2: STOP=00 (1 stop bit) */
    USART2->CR2 = 0;

    /* 6. Enable TX, RX, and UART */
    USART2->CR1 = USART_CR1_TE | USART_CR1_RE | USART_CR1_UE;

    /* 7. Wait for USART to be ready */
    while (!(USART2->SR & USART_SR_TC));
}

Hint 3: Polling Transmit and Receive

Simple blocking I/O for testing:

void uart_putc_blocking(char c) {
    /* Wait until TX buffer is empty */
    while (!(USART2->SR & USART_SR_TXE));

    /* Write character to data register */
    USART2->DR = c;
}

char uart_getc_blocking(void) {
    /* Wait until RX buffer has data */
    while (!(USART2->SR & USART_SR_RXNE));

    /* Read and return character */
    return (char)USART2->DR;
}

void uart_puts(const char *str) {
    while (*str) {
        uart_putc_blocking(*str++);
    }
}

/* Test it! */
int main(void) {
    uart_init_polling(115200);

    uart_puts("Hello, UART!\r\n");

    while (1) {
        char c = uart_getc_blocking();
        uart_puts("You typed: ");
        uart_putc_blocking(c);
        uart_puts("\r\n");
    }
}

Hint 4: Ring Buffer Implementation

Create a reusable ring buffer module:

/* ring_buffer.h */
#define BUFFER_SIZE 64

typedef struct {
    uint8_t data[BUFFER_SIZE];
    volatile uint8_t head;  /* Write position */
    volatile uint8_t tail;  /* Read position */
} ring_buffer_t;

static inline void rb_init(ring_buffer_t *rb) {
    rb->head = 0;
    rb->tail = 0;
}

static inline uint8_t rb_is_empty(ring_buffer_t *rb) {
    return rb->head == rb->tail;
}

static inline uint8_t rb_is_full(ring_buffer_t *rb) {
    return ((rb->head + 1) % BUFFER_SIZE) == rb->tail;
}

static inline uint8_t rb_count(ring_buffer_t *rb) {
    return (rb->head - rb->tail + BUFFER_SIZE) % BUFFER_SIZE;
}

static inline int rb_put(ring_buffer_t *rb, uint8_t byte) {
    if (rb_is_full(rb)) return -1;
    rb->data[rb->head] = byte;
    rb->head = (rb->head + 1) % BUFFER_SIZE;
    return 0;
}

static inline int rb_get(ring_buffer_t *rb) {
    if (rb_is_empty(rb)) return -1;
    uint8_t byte = rb->data[rb->tail];
    rb->tail = (rb->tail + 1) % BUFFER_SIZE;
    return byte;
}

Hint 5: Interrupt-Driven UART

Now add interrupts for efficient operation:

/* Global buffers */
static ring_buffer_t rx_buffer;
static ring_buffer_t tx_buffer;
static uart_stats_t stats;

void uart_init_interrupt(uint32_t baudrate) {
    /* Same as polling init, plus: */

    /* Initialize buffers */
    rb_init(&rx_buffer);
    rb_init(&tx_buffer);

    /* Enable RXNE interrupt (receive not empty) */
    USART2->CR1 |= USART_CR1_RXNEIE;

    /* Configure NVIC for USART2 */
    /* USART2 is IRQ 38 on STM32F4 */
    NVIC_SetPriority(USART2_IRQn, 1);
    NVIC_EnableIRQ(USART2_IRQn);
}

/* Interrupt handler */
void USART2_IRQHandler(void) {
    /* Check for RX data */
    if (USART2->SR & USART_SR_RXNE) {
        uint8_t data = USART2->DR;  /* Read clears RXNE */
        if (rb_put(&rx_buffer, data) < 0) {
            stats.overrun_errors++;
        }
        stats.rx_bytes++;
        stats.rx_interrupts++;
    }

    /* Check for TX ready */
    if ((USART2->SR & USART_SR_TXE) &&
        (USART2->CR1 & USART_CR1_TXEIE)) {
        int data = rb_get(&tx_buffer);
        if (data >= 0) {
            USART2->DR = data;
            stats.tx_bytes++;
        } else {
            /* Buffer empty, disable TXE interrupt */
            USART2->CR1 &= ~USART_CR1_TXEIE;
        }
        stats.tx_interrupts++;
    }

    /* Check for errors */
    if (USART2->SR & USART_SR_ORE) {
        (void)USART2->DR;  /* Read DR to clear ORE */
        stats.overrun_errors++;
    }
    if (USART2->SR & USART_SR_FE) {
        (void)USART2->DR;
        stats.framing_errors++;
    }
}

/* Non-blocking send */
int uart_putc(char c) {
    if (rb_is_full(&tx_buffer)) {
        return -1;  /* Buffer full */
    }

    /* Critical section - disable interrupts briefly */
    __disable_irq();
    rb_put(&tx_buffer, c);

    /* Enable TXE interrupt to start sending */
    USART2->CR1 |= USART_CR1_TXEIE;
    __enable_irq();

    return 0;
}

/* Non-blocking receive */
int uart_getc(void) {
    return rb_get(&rx_buffer);
}

Hint 6: Add the Interrupt Vector

Update your vector table in startup.s:

    .section .vectors, "a"
    .word   _stack_top          @ 0x00
    .word   reset_handler       @ 0x04
    .word   nmi_handler         @ 0x08
    .word   hardfault_handler   @ 0x0C
    @ ... fill in reserved entries ...
    .word   0                   @ 0x94 (IRQ 37)
    .word   USART2_IRQHandler   @ 0x98 (IRQ 38 = USART2)
    @ ... more vectors if needed ...

Interview Questions You Should Be Able to Answer

After completing this project, you should confidently answer:

  1. “Explain how UART framing works.”
    • Idle high, start bit (low), 8 data bits, optional parity, stop bit(s)
    • Both sides must agree on baud rate
  2. “How do you calculate the baud rate divisor?”
    • USARTDIV = fPCLK / (16 * BaudRate)
    • Split into mantissa and fractional parts
    • Error should be < 3%
  3. “Why use a ring buffer for UART?”
    • Decouples interrupt timing from application timing
    • Prevents data loss during bursts
    • Allows non-blocking I/O
  4. “What causes an overrun error?”
    • New data arrives before previous was read
    • CPU too slow or interrupts disabled too long
    • Buffer too small or slow processing
  5. “Why must ring buffer indices be volatile?”
    • Modified by both ISR and main code
    • Without volatile, compiler might cache values
    • Could miss updates from interrupt
  6. “How do you debug UART without an oscilloscope?”
    • Loopback test (connect TX to RX)
    • LED toggle in ISR shows activity
    • Counter for RX/TX interrupts
    • Known test pattern

Books That Will Help

Topic Book Chapter
UART Protocol “Making Embedded Systems” - White Chapter 9
ARM Interrupts “Definitive Guide to ARM Cortex-M3/M4” - Yiu Chapter 8
Ring Buffers “Mastering Algorithms with C” - Loudon Chapter 6
Baud Rate STM32 Reference Manual USART Chapter
NVIC Configuration ARM Cortex-M TRM Interrupt Chapter

Testing Strategy

Test Cases

/* test_uart.c */

void test_loopback(void) {
    /* Connect TX to RX externally */
    uart_puts("Loopback test: ");

    char test[] = "ABCDEFGH";
    for (int i = 0; test[i]; i++) {
        uart_putc_blocking(test[i]);
    }

    /* Small delay for bytes to arrive */
    delay(1000);

    for (int i = 0; test[i]; i++) {
        int c = uart_getc();
        if (c != test[i]) {
            uart_puts("FAIL\r\n");
            return;
        }
    }
    uart_puts("PASS\r\n");
}

void test_buffer_stress(void) {
    /* Send 1000 characters rapidly */
    uart_puts("Buffer stress test: ");

    for (int i = 0; i < 1000; i++) {
        while (uart_putc('X') < 0);  /* Retry if full */
    }

    uart_stats_t s;
    uart_get_stats(&s);
    uart_printf("Sent: %d, Overruns: %d\r\n",
                s.tx_bytes, s.overrun_errors);
}

void test_echo(void) {
    uart_puts("Type characters, Ctrl-C to exit:\r\n");

    while (1) {
        int c = uart_getc();
        if (c >= 0) {
            if (c == 3) break;  /* Ctrl-C */
            uart_putc(c);       /* Echo */
        }
    }
}

Test Script on Host

#!/bin/bash
# test_uart.sh

PORT=/dev/ttyACM0
BAUD=115200

# Send test pattern
echo "HELLO" > $PORT

# Read response with timeout
timeout 1 cat $PORT

# Python test for more control
python3 << 'EOF'
import serial
import time

ser = serial.Serial('/dev/ttyACM0', 115200, timeout=1)

# Send test
ser.write(b'test\r\n')
time.sleep(0.1)

# Read response
response = ser.read(100)
print(f"Received: {response}")

# Speed test
start = time.time()
for i in range(1000):
    ser.write(b'X')
elapsed = time.time() - start
print(f"1000 bytes in {elapsed:.3f}s = {1000/elapsed:.0f} bytes/sec")

ser.close()
EOF

Common Pitfalls & Debugging

Problem Symptom Root Cause Fix
No output Nothing on terminal TX pin not configured, clock not enabled Check RCC and GPIO
Garbage characters Random symbols Wrong baud rate Verify clock and BRR value
Missing characters “Heo Wold” Overrun, ISR too slow Increase buffer, check priority
Echo works but interrupt doesn’t Polling OK, IRQ fails NVIC not configured Check NVIC enable and priority
TX stops Sends first char only TXEIE not re-enabled Enable in putc()
Characters doubled “HHeellloo” Echo from terminal Disable local echo in screen
Framing errors FE flag set Mismatched baud or noise Check both sides, use shorter cable

Debugging Techniques

/* Debug: toggle LED in interrupt */
void USART2_IRQHandler(void) {
    GPIOA->ODR ^= (1 << 5);  /* Toggle LED */
    /* ... rest of handler ... */
}

/* Debug: count interrupts */
volatile uint32_t irq_count = 0;
void USART2_IRQHandler(void) {
    irq_count++;
    /* ... */
}
/* In main: uart_printf("IRQs: %d\n", irq_count); */

/* Debug: check register values */
void uart_debug(void) {
    uart_printf("SR: 0x%08X\r\n", USART2->SR);
    uart_printf("CR1: 0x%08X\r\n", USART2->CR1);
    uart_printf("BRR: 0x%08X\r\n", USART2->BRR);
}

Extensions & Challenges

After Basic Completion

  1. Add DMA: Transfer entire buffers without CPU intervention
  2. Hardware flow control: Implement RTS/CTS
  3. Multiple UARTs: Support USART1, USART2, USART6
  4. Shell interface: Parse commands like “led on”, “stats”
  5. Binary protocol: Implement XMODEM for file transfer

Challenge Mode

  • Implement printf with full format specifiers
  • Add software UART (bit-bang) on any GPIO
  • Build a UART-to-I2C bridge
  • Implement a simple debugger over UART

Advanced: DMA-Based UART

/* Enable DMA for TX */
void uart_init_dma(void) {
    /* Enable DMA1 clock */
    RCC->AHB1ENR |= RCC_AHB1ENR_DMA1EN;

    /* Configure DMA1 Stream 6 for USART2 TX */
    DMA1_Stream6->CR = 0;
    DMA1_Stream6->CR |= (4 << 25);      /* Channel 4 */
    DMA1_Stream6->CR |= DMA_SxCR_MINC;  /* Memory increment */
    DMA1_Stream6->CR |= (1 << 6);       /* Memory to peripheral */
    DMA1_Stream6->PAR = (uint32_t)&USART2->DR;

    /* Enable USART DMA transmit */
    USART2->CR3 |= USART_CR3_DMAT;
}

void uart_dma_send(const uint8_t *data, uint16_t len) {
    DMA1_Stream6->M0AR = (uint32_t)data;
    DMA1_Stream6->NDTR = len;
    DMA1_Stream6->CR |= DMA_SxCR_EN;
}

Self-Assessment Checklist

Before moving to Project 5, verify you can:

  • Configure GPIO pins for alternate function (UART)
  • Calculate baud rate divisor for any clock/baud combination
  • Explain the difference between polling and interrupt I/O
  • Implement a ring buffer from scratch
  • Configure NVIC for a peripheral interrupt
  • Write an interrupt handler that clears flags correctly
  • Debug UART issues systematically
  • Explain what causes overrun and framing errors

Final Test

Without looking at your code:

  1. Write the baud rate calculation for 9600 baud at 84MHz clock
  2. Draw the ring buffer state after: put(A), put(B), get(), put(C)
  3. List the steps to enable USART2 RX interrupt on STM32

If you can do this, you understand UART deeply!


Learning Milestones

Track your progress through these milestones:

Milestone You Can… Understanding Level
1 See characters in terminal Basic TX works
2 Echo mode works Both TX and RX work
3 Interrupts don’t crash NVIC configured correctly
4 No characters lost at high speed Ring buffers work correctly

Complete Example: uart.c

Here’s a complete reference implementation:

/* uart.c - Complete UART driver */
#include "uart.h"
#include "stm32f4.h"
#include <stdarg.h>

#define BUFFER_SIZE 64

typedef struct {
    uint8_t data[BUFFER_SIZE];
    volatile uint8_t head;
    volatile uint8_t tail;
} ring_buffer_t;

static ring_buffer_t rx_buffer;
static ring_buffer_t tx_buffer;
static uart_stats_t stats;

/* Ring buffer helpers */
static inline void rb_init(ring_buffer_t *rb) {
    rb->head = rb->tail = 0;
}

static inline int rb_put(ring_buffer_t *rb, uint8_t byte) {
    uint8_t next = (rb->head + 1) % BUFFER_SIZE;
    if (next == rb->tail) return -1;
    rb->data[rb->head] = byte;
    rb->head = next;
    return 0;
}

static inline int rb_get(ring_buffer_t *rb) {
    if (rb->head == rb->tail) return -1;
    uint8_t byte = rb->data[rb->tail];
    rb->tail = (rb->tail + 1) % BUFFER_SIZE;
    return byte;
}

void uart_init(uart_config_t *config) {
    /* Enable clocks */
    RCC->AHB1ENR |= RCC_AHB1ENR_GPIOAEN;
    RCC->APB1ENR |= RCC_APB1ENR_USART2EN;

    /* Configure PA2/PA3 for USART2 */
    GPIOA->MODER &= ~((3 << 4) | (3 << 6));
    GPIOA->MODER |= (2 << 4) | (2 << 6);
    GPIOA->AFR[0] |= (7 << 8) | (7 << 12);

    /* Configure USART */
    USART2->CR1 = 0;
    USART2->BRR = 16000000 / config->baudrate;
    USART2->CR1 = USART_CR1_TE | USART_CR1_RE | USART_CR1_UE;

    if (config->use_interrupts) {
        rb_init(&rx_buffer);
        rb_init(&tx_buffer);
        USART2->CR1 |= USART_CR1_RXNEIE;
        NVIC_EnableIRQ(USART2_IRQn);
    }
}

void USART2_IRQHandler(void) {
    if (USART2->SR & USART_SR_RXNE) {
        uint8_t data = USART2->DR;
        rb_put(&rx_buffer, data);
        stats.rx_bytes++;
    }

    if ((USART2->SR & USART_SR_TXE) &&
        (USART2->CR1 & USART_CR1_TXEIE)) {
        int data = rb_get(&tx_buffer);
        if (data >= 0) {
            USART2->DR = data;
            stats.tx_bytes++;
        } else {
            USART2->CR1 &= ~USART_CR1_TXEIE;
        }
    }
}

void uart_putc_blocking(char c) {
    while (!(USART2->SR & USART_SR_TXE));
    USART2->DR = c;
}

int uart_putc(char c) {
    __disable_irq();
    int result = rb_put(&tx_buffer, c);
    USART2->CR1 |= USART_CR1_TXEIE;
    __enable_irq();
    return result;
}

int uart_getc(void) {
    return rb_get(&rx_buffer);
}

void uart_puts(const char *str) {
    while (*str) {
        uart_putc_blocking(*str++);
    }
}

<- Previous Project: Bare-Metal LED Blinker Next Project: ARM Memory Allocator ->