Project 7: Context Switcher & Mini-Scheduler

Build a preemptive scheduler that runs multiple “tasks” concurrently on a single ARM core, with context switching driven by the SysTick timer—like a tiny FreeRTOS kernel.


Quick Reference

Attribute Value
Language ARM Assembly + C
Difficulty Expert
Time 2-4 weeks
Coolness ★★★★★ Pure Magic (Super Cool)
Portfolio Value Resume Gold
Prerequisites Projects 3-4 (bare-metal + UART), ARM assembly, stack concepts
Key Topics Context switching, SysTick, TCB, preemption, scheduling

Learning Objectives

By completing this project, you will:

  1. Understand how multitasking works: See that “parallel” execution is an illusion created by fast switching
  2. Master ARM exception handling: Implement PendSV and SysTick handlers
  3. Save and restore CPU context: Know exactly which registers must be preserved
  4. Design Task Control Blocks: Create the data structure that represents a running task
  5. Implement scheduling algorithms: Build round-robin and priority-based schedulers
  6. Handle critical sections: Prevent race conditions in kernel code

The Core Question You’re Answering

“How can a single CPU core appear to run multiple programs simultaneously?”

This is the fundamental question of operating systems. The answer is: it can’t—but by switching between programs fast enough (thousands of times per second), we create the illusion of parallelism. Understanding this illusion requires knowing exactly what constitutes a program’s “state” and how to save/restore it.


Concepts You Must Understand First

Concept Why It Matters Where to Learn
ARM register set You’ll save/restore all of them Project 2, ARM TRM
Stack pointer and stack frames Each task needs its own stack CS:APP Chapter 3
ARM exception model Understand automatic stacking ARM Cortex-M TRM
Interrupt priorities SysTick vs PendSV ordering Joseph Yiu’s book Ch. 8
Function call convention Which registers are caller/callee saved AAPCS document
Linked lists TCBs are linked together Any DS course

Self-Assessment Questions

  1. Registers: In Cortex-M, which registers are automatically pushed to the stack on exception entry?
  2. Stack pointer: What’s the difference between MSP and PSP?
  3. Exception return: What does 0xFFFFFFF9 mean as a return value from an exception?
  4. Priority: Should SysTick or PendSV have higher priority? Why?
  5. Calling convention: Which registers must a function preserve (callee-saved)?

Theoretical Foundation

What Is Context?

A program’s “context” is everything needed to resume its execution:

Task Context on ARM Cortex-M:
┌─────────────────────────────────────────────────────────────┐
│  CPU Registers:                                             │
│    R0-R12      - General purpose registers                  │
│    SP (R13)    - Stack pointer (points to task's stack)     │
│    LR (R14)    - Link register (return address)             │
│    PC (R15)    - Program counter (where to resume)          │
│    xPSR        - Program status (flags, exception state)    │
│                                                             │
│  Task-specific:                                             │
│    Stack       - Task's private stack memory                │
│    TCB         - Metadata (state, priority, etc.)           │
└─────────────────────────────────────────────────────────────┘

The Context Switch Illusion

Time →
────────────────────────────────────────────────────────────────────────────

Task A:  █████████░░░░░░░░░░█████████░░░░░░░░░░█████████░░░░░░░░░░...
Task B:  ░░░░░░░░░█████████░░░░░░░░░░█████████░░░░░░░░░░█████████░...
Task C:  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░...

         ↑         ↑         ↑         ↑         ↑
         │         │         │         │         │
    Context    Context   Context   Context   Context
    switch     switch    switch    switch    switch
    (A→B)      (B→A)     (A→B)     (B→A)     (A→B)

Each "█" = 10ms of execution
Switch happens ~100 times/second (10ms time slice)
From a human perspective, A and B appear to run "at the same time"

Cortex-M Exception Stack Frame

When an exception occurs, the hardware automatically pushes 8 registers:

High addresses (before exception)
    ┌─────────────────┐ ← SP before exception
    │                 │
    │ (previous data) │
    │                 │
    ├─────────────────┤
    │      xPSR       │ ← Pushed automatically by hardware
    ├─────────────────┤
    │       PC        │ ← Where to return to
    ├─────────────────┤
    │       LR        │
    ├─────────────────┤
    │       R12       │
    ├─────────────────┤
    │       R3        │
    ├─────────────────┤
    │       R2        │
    ├─────────────────┤
    │       R1        │
    ├─────────────────┤
    │       R0        │ ← SP after exception entry
    └─────────────────┘
Low addresses

Note: R4-R11 are NOT saved automatically!
      You must save them manually in the context switch.

Why PendSV Instead of SysTick?

SysTick fires at regular intervals for time-keeping. PendSV is a “pendable” exception specifically designed for context switching:

Problem with switching directly in SysTick:
┌────────────────────────────────────────────────────────────────┐
│ SysTick fires during higher-priority ISR                       │
│ → Context switch happens in middle of ISR                      │
│ → ISR state corrupted                                          │
│ → System crash!                                                │
└────────────────────────────────────────────────────────────────┘

Solution: Use PendSV with lowest priority:
┌────────────────────────────────────────────────────────────────┐
│ SysTick fires → Sets PendSV pending bit                        │
│ Higher-priority ISRs complete                                  │
│ PendSV fires (lowest priority)                                 │
│ → Context switch happens safely                                │
│ → All ISRs have completed                                      │
└────────────────────────────────────────────────────────────────┘

Priority configuration:
- SysTick: Low but not lowest (e.g., 0xE0)
- PendSV: Lowest possible (0xFF)
- Other ISRs: Higher priority

Task Control Block (TCB)

The TCB holds everything the scheduler needs to manage a task:

typedef struct tcb {
    uint32_t *sp;           // Saved stack pointer (MUST be first!)
    struct tcb *next;       // Next task in list (for round-robin)
    uint32_t *stack_base;   // Base of allocated stack
    uint32_t stack_size;    // Size of stack
    uint8_t priority;       // Task priority (lower = higher priority)
    uint8_t state;          // READY, RUNNING, BLOCKED, SUSPENDED
    uint32_t ticks;         // Sleep counter / statistics
    char name[16];          // Human-readable name
} TCB_t;

Task states:
┌─────────────────────────────────────────────────────────────┐
                                                             
    ┌─────────┐    schedule     ┌─────────┐                 
      READY   ─────────────▶  RUNNING                  
    └─────────┘                 └─────────┘                 
                                                           
                                     preempt / yield       
                                                           
             event         ┌─────────────┐                 
         └─────────────────   BLOCKED                     
                             (wait/sleep)                 
                            └─────────────┘                 
                                                             
└─────────────────────────────────────────────────────────────┘

Stack Initialization for a New Task

When creating a task, we must set up its stack as if it had been switched out:

// Initialize task stack to look like it was interrupted
void init_task_stack(TCB_t *tcb, void (*task_func)(void)) {
    uint32_t *sp = tcb->stack_base + tcb->stack_size / 4;

    // Stack grows down, so start at top

    // Exception frame (what hardware pushes)
    *(--sp) = 0x01000000;           // xPSR - Thumb bit set
    *(--sp) = (uint32_t)task_func;  // PC - task entry point
    *(--sp) = 0xFFFFFFFD;           // LR - special EXC_RETURN value
    *(--sp) = 0x12121212;           // R12
    *(--sp) = 0x03030303;           // R3
    *(--sp) = 0x02020202;           // R2
    *(--sp) = 0x01010101;           // R1
    *(--sp) = 0x00000000;           // R0 - could be argument

    // Software saved registers (what PendSV pushes manually)
    *(--sp) = 0x11111111;           // R11
    *(--sp) = 0x10101010;           // R10
    *(--sp) = 0x09090909;           // R9
    *(--sp) = 0x08080808;           // R8
    *(--sp) = 0x07070707;           // R7
    *(--sp) = 0x06060606;           // R6
    *(--sp) = 0x05050505;           // R5
    *(--sp) = 0x04040404;           // R4

    tcb->sp = sp;
}

The PendSV Handler (Heart of Context Switching)

// PendSV_Handler - performs the actual context switch
PendSV_Handler:
    // Disable interrupts during switch
    CPSID   I

    // Get current task's stack pointer (PSP)
    MRS     R0, PSP

    // Save R4-R11 to current task's stack
    STMDB   R0!, {R4-R11}

    // Save updated SP to current TCB
    LDR     R1, =current_tcb
    LDR     R2, [R1]            // R2 = current_tcb pointer
    STR     R0, [R2]            // current_tcb->sp = R0

    // Call scheduler to get next task
    PUSH    {R14}               // Save LR (EXC_RETURN)
    BL      scheduler           // Returns next TCB in R0
    POP     {R14}

    // Update current_tcb pointer
    LDR     R1, =current_tcb
    STR     R0, [R1]

    // Get new task's stack pointer
    LDR     R0, [R0]            // R0 = new_tcb->sp

    // Restore R4-R11 from new task's stack
    LDMIA   R0!, {R4-R11}

    // Set PSP to new task's stack
    MSR     PSP, R0

    // Re-enable interrupts
    CPSIE   I

    // Return (hardware will pop R0-R3, R12, LR, PC, xPSR)
    BX      LR

Scheduling Algorithms

Round-Robin (Time-Sliced):

Simple: Each task gets equal time slice, cycle through all tasks

    Task A → Task B → Task C → Task A → ...

scheduler():
    current = current->next
    if current == NULL:
        current = task_list_head
    return current

Priority-Based:

Higher priority tasks run first, same priority uses round-robin

    Priority 0: A, B (highest)
    Priority 1: C
    Priority 2: D, E (lowest)

    Run order: A → B → A → B → ... (C, D, E starve unless A, B block)

scheduler():
    // Find highest priority ready task
    for each priority level:
        for each task at this level:
            if task->state == READY:
                return task

Priority with Aging:

Prevent starvation by boosting priority of waiting tasks

    Every tick: waiting_tasks.priority++
    After running: task.priority = base_priority

Critical Sections

Kernel data structures must be protected from concurrent access:

// WRONG - race condition
void add_task(TCB_t *task) {
    task->next = task_list;  // SysTick here → corrupted list!
    task_list = task;
}

// RIGHT - disable interrupts
void add_task(TCB_t *task) {
    uint32_t primask = __get_PRIMASK();
    __disable_irq();

    task->next = task_list;
    task_list = task;

    __set_PRIMASK(primask);  // Restore previous state
}

// Alternative: Use LDREX/STREX for lock-free operations

Why This Matters

Understanding context switching is essential for:

  • Operating systems development: Every OS does this
  • RTOS usage: Know what FreeRTOS/Zephyr do under the hood
  • Debugging: Understand stack traces, watchdogs, crashes
  • Performance: Context switches have overhead
  • Real-time systems: Predictable switching times are critical

Project Specification

What You Will Build

A preemptive mini-scheduler with:

  1. Task management: Create, start, suspend, resume tasks
  2. Preemptive scheduling: SysTick-driven context switches
  3. Round-robin scheduler: Fair time sharing between tasks
  4. Priority support: Higher priority tasks run first
  5. Kernel API: yield(), sleep(), get_tick()
  6. Statistics: Context switch count, CPU usage per task

Functional Requirements

  1. Task Creation:
    • Create tasks with name, function, stack size, priority
    • Initialize stack for first context switch
    • Support at least 8 concurrent tasks
  2. Scheduling:
    • 10ms time slice (100 Hz tick rate)
    • Round-robin among same-priority tasks
    • Higher priority preempts lower priority
    • Idle task runs when no other tasks ready
  3. Kernel Services:
    • yield(): Voluntarily give up CPU
    • sleep(ticks): Block for N tick periods
    • get_tick(): Return current tick count
    • suspend(task) / resume(task): Task state control
  4. Safety:
    • Stack overflow detection (canary values)
    • Watchdog integration (optional)
    • No priority inversion (basic)

Non-Functional Requirements

  • Deterministic: Bounded context switch time (<10us)
  • Low overhead: Minimal cycles spent in scheduler
  • Debuggable: Easy to inspect task states and stacks

Real World Outcome

=== ARM Mini-Scheduler Demo ===

Creating tasks:
  Task 1: LED blinker (priority 1, stack 256)
  Task 2: Serial echo (priority 2, stack 512)
  Task 3: Counter display (priority 3, stack 256)

Scheduler started, tick = 10ms

[00000010] Task3: Count = 1
[00000020] Task3: Count = 2
[00000025] Task2: Echo 'H'
[00000026] Task2: Echo 'i'
[00000030] Task3: Count = 3
[00000040] Task3: Count = 4
[00000500] Task1: LED ON
[00001000] Task1: LED OFF
[00001010] Task3: Count = 100

> stats
Scheduler Statistics:
  Uptime: 15.34 seconds
  Context switches: 1523
  Tasks: 3 active, 1 idle

  Task           State     Priority  CPU%   Switches  Stack
  LED_blinker    READY     1         5%     127       240/256
  Serial_echo    BLOCKED   2         12%    412       380/512
  Counter        RUNNING   3         78%    983       210/256
  Idle           READY     255       5%     1         60/128

> suspend 3
Task3 suspended

> resume 3
Task3 resumed

> priority 2 1
Task2 priority changed: 2 -> 1

Physical result: LED blinks, serial echoes typed chars, counter prints

Solution Architecture

High-Level Design

┌─────────────────────────────────────────────────────────────────────┐
│                       Mini-Scheduler                                │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐           │
│  │   SysTick    │──▶│    PendSV    │──▶│  Scheduler   │           │
│  │   Handler    │   │    Handler   │   │   (C code)   │           │
│  │  (tick++)    │   │  (asm switch)│   │  (pick next) │           │
│  └──────────────┘   └──────────────┘   └──────────────┘           │
│         │                  │                  │                    │
│         │                  │                  │                    │
│         ▼                  ▼                  ▼                    │
│  ┌─────────────────────────────────────────────────────────────┐  │
│  │                      Task List                               │  │
│  │  ┌─────┐    ┌─────┐    ┌─────┐    ┌─────┐                  │  │
│  │  │TCB 1│───▶│TCB 2│───▶│TCB 3│───▶│Idle │───▶NULL          │  │
│  │  └─────┘    └─────┘    └─────┘    └─────┘                  │  │
│  └─────────────────────────────────────────────────────────────┘  │
│                                                                     │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐           │
│  │   Kernel     │   │   Memory     │   │  Statistics  │           │
│  │   Services   │   │  Management  │   │   Tracking   │           │
│  │ yield,sleep  │   │  (stacks)    │   │              │           │
│  └──────────────┘   └──────────────┘   └──────────────┘           │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Key Components

Component Responsibility Key Functions
SysTick Handler Periodic tick, trigger PendSV SysTick_Handler
PendSV Handler Perform context switch (asm) PendSV_Handler
Scheduler Choose next task scheduler()
Task Manager Create, destroy, state changes task_create, task_suspend
Kernel API Services for tasks yield, sleep, get_tick
Stack Manager Allocate, initialize stacks init_task_stack

Data Structures

// Task state enumeration
typedef enum {
    TASK_READY = 0,
    TASK_RUNNING,
    TASK_BLOCKED,
    TASK_SUSPENDED,
    TASK_TERMINATED
} task_state_t;

// Task Control Block
typedef struct tcb {
    uint32_t *sp;              // Saved stack pointer (MUST BE FIRST)
    struct tcb *next;          // Next TCB in list
    struct tcb *prev;          // Previous TCB (for easy removal)

    // Stack info
    uint32_t *stack_base;      // Bottom of stack allocation
    uint32_t stack_size;       // Size in bytes

    // Scheduling info
    uint8_t priority;          // 0 = highest, 255 = lowest
    uint8_t base_priority;     // Original priority (for aging)
    task_state_t state;        // Current state

    // Timing
    uint32_t wake_tick;        // Tick count to wake up (if sleeping)
    uint32_t run_time;         // Total ticks spent running

    // Debug info
    char name[16];             // Task name
    uint32_t id;               // Unique task ID

    // Stack canary for overflow detection
    uint32_t stack_canary;
} TCB_t;

#define STACK_CANARY_VALUE 0xDEADBEEF

// Scheduler state
typedef struct {
    TCB_t *task_list;          // Head of task list
    TCB_t *current_task;       // Currently running task
    TCB_t *idle_task;          // Idle task (always runs)

    uint32_t tick_count;       // System tick counter
    uint32_t switch_count;     // Context switch counter
    uint32_t task_count;       // Number of tasks

    bool scheduler_running;    // Is scheduler started?
} scheduler_state_t;

System Configuration

// Scheduler configuration
#define TICK_RATE_HZ        100     // 100 Hz = 10ms tick
#define MAX_TASKS           16
#define IDLE_STACK_SIZE     128
#define MIN_STACK_SIZE      128
#define DEFAULT_STACK_SIZE  256

// Priority levels
#define PRIORITY_HIGHEST    0
#define PRIORITY_HIGH       1
#define PRIORITY_NORMAL     2
#define PRIORITY_LOW        3
#define PRIORITY_IDLE       255

// Calculate SysTick reload value
// SystemCoreClock / TICK_RATE_HZ - 1
#define SYSTICK_RELOAD      (SystemCoreClock / TICK_RATE_HZ - 1)

Implementation Guide

Development Environment Setup

# Tools needed
sudo apt-get install gcc-arm-none-eabi gdb-multiarch

# Create project structure
mkdir -p mini-scheduler/{src,include,tests}
cd mini-scheduler

Project Structure

mini-scheduler/
├── src/
│   ├── startup.s          # Startup code, vector table
│   ├── main.c             # Application entry, demo tasks
│   ├── scheduler.c        # Scheduler logic, task management
│   ├── context_switch.s   # PendSV handler in assembly
│   ├── systick.c          # SysTick configuration and handler
│   ├── kernel.c           # Kernel services (yield, sleep)
│   └── uart.c             # Serial output (from Project 4)
├── include/
│   ├── scheduler.h        # Public scheduler API
│   ├── config.h           # Configuration options
│   └── port.h             # Hardware-specific definitions
├── linker.ld
├── Makefile
└── tests/
    ├── test_basic.c       # Basic task switching test
    └── test_priority.c    # Priority scheduling test

Implementation Phases

Phase 1: Timer and Tick Counter (Days 1-3)

Goals:

  • Configure SysTick for 10ms ticks
  • Implement tick counter
  • Test periodic interrupt firing

Tasks:

  1. Configure SysTick with proper reload value
  2. Implement SysTick_Handler (just increment counter)
  3. Add get_tick() function
  4. Test with LED toggle

Key code:

volatile uint32_t tick_count = 0;

void systick_init(void) {
    // Disable SysTick during configuration
    SysTick->CTRL = 0;

    // Set reload value for 10ms ticks
    // Assuming 16 MHz clock: 16,000,000 / 100 - 1 = 159,999
    SysTick->LOAD = (SystemCoreClock / TICK_RATE_HZ) - 1;

    // Clear current value
    SysTick->VAL = 0;

    // Set priority (not lowest, that's for PendSV)
    NVIC_SetPriority(SysTick_IRQn, 0xE0);

    // Enable SysTick with processor clock and interrupt
    SysTick->CTRL = SysTick_CTRL_CLKSOURCE_Msk |
                    SysTick_CTRL_TICKINT_Msk |
                    SysTick_CTRL_ENABLE_Msk;
}

void SysTick_Handler(void) {
    tick_count++;

    // Pend PendSV to trigger context switch
    SCB->ICSR |= SCB_ICSR_PENDSVSET_Msk;
}

uint32_t get_tick(void) {
    return tick_count;
}

Checkpoint: LED toggles every 500ms using get_tick().

Phase 2: Task Creation and Stack Setup (Days 4-7)

Goals:

  • Define TCB structure
  • Implement task creation
  • Initialize task stacks correctly

Tasks:

  1. Define TCB structure with all fields
  2. Allocate stack space (static or malloc from pool)
  3. Initialize stack with correct initial values
  4. Create task list management

Key code:

TCB_t *task_create(const char *name, void (*func)(void),
                   uint32_t stack_size, uint8_t priority) {
    // Allocate TCB
    TCB_t *tcb = allocate_tcb();
    if (!tcb) return NULL;

    // Allocate stack
    tcb->stack_base = allocate_stack(stack_size);
    if (!tcb->stack_base) {
        free_tcb(tcb);
        return NULL;
    }
    tcb->stack_size = stack_size;

    // Initialize TCB fields
    strncpy(tcb->name, name, sizeof(tcb->name) - 1);
    tcb->priority = priority;
    tcb->base_priority = priority;
    tcb->state = TASK_READY;
    tcb->id = next_task_id++;

    // Set stack canary
    tcb->stack_base[0] = STACK_CANARY_VALUE;

    // Initialize stack
    init_task_stack(tcb, func);

    // Add to task list
    add_task_to_list(tcb);

    return tcb;
}

void init_task_stack(TCB_t *tcb, void (*func)(void)) {
    // Start at top of stack (stack grows down)
    uint32_t *sp = tcb->stack_base + (tcb->stack_size / 4);

    // Exception frame (hardware saved)
    *(--sp) = 0x01000000;           // xPSR (Thumb bit)
    *(--sp) = (uint32_t)func;       // PC
    *(--sp) = (uint32_t)task_exit;  // LR (task exit handler)
    *(--sp) = 0;                    // R12
    *(--sp) = 0;                    // R3
    *(--sp) = 0;                    // R2
    *(--sp) = 0;                    // R1
    *(--sp) = 0;                    // R0

    // Software saved (manually by PendSV)
    *(--sp) = 0;  // R11
    *(--sp) = 0;  // R10
    *(--sp) = 0;  // R9
    *(--sp) = 0;  // R8
    *(--sp) = 0;  // R7
    *(--sp) = 0;  // R6
    *(--sp) = 0;  // R5
    *(--sp) = 0;  // R4

    tcb->sp = sp;
}

Checkpoint: Tasks can be created and their stacks look correct in debugger.

Phase 3: Context Switch Implementation (Days 8-14)

Goals:

  • Implement PendSV handler in assembly
  • Write scheduler selection logic
  • Test switching between two tasks

Tasks:

  1. Write PendSV_Handler in assembly
  2. Implement simple round-robin scheduler
  3. Create two test tasks (blink different LEDs)
  4. Start scheduler and verify switching

Key assembly code:

// context_switch.s
.syntax unified
.cpu cortex-m4
.thumb

.global PendSV_Handler
.type PendSV_Handler, %function

PendSV_Handler:
    // Disable interrupts
    CPSID   I

    // Get current PSP
    MRS     R0, PSP

    // Is this the first switch? (PSP == 0 means first time)
    CBZ     R0, PendSV_restore

    // Save R4-R11 on current task's stack
    STMDB   R0!, {R4-R11}

    // Get current TCB pointer
    LDR     R1, =current_task
    LDR     R2, [R1]

    // Save SP to current TCB (sp is first field)
    STR     R0, [R2]

PendSV_restore:
    // Call scheduler to get next task (returns TCB* in R0)
    PUSH    {LR}
    BL      scheduler_select_next
    POP     {LR}

    // Update current_task pointer
    LDR     R1, =current_task
    STR     R0, [R1]

    // Get new task's SP from TCB (first field)
    LDR     R0, [R0]

    // Restore R4-R11 from new task's stack
    LDMIA   R0!, {R4-R11}

    // Set PSP to new task's stack
    MSR     PSP, R0

    // Ensure return uses PSP, thread mode
    ORR     LR, LR, #0x04

    // Re-enable interrupts
    CPSIE   I

    // Return - hardware restores R0-R3, R12, LR, PC, xPSR
    BX      LR

.size PendSV_Handler, .-PendSV_Handler

C scheduler:

TCB_t *scheduler_select_next(void) {
    TCB_t *next = current_task->next;

    // Round-robin: find next READY task
    while (next != current_task) {
        if (next->state == TASK_READY) {
            return next;
        }
        next = next->next;
        if (next == NULL) {
            next = task_list;
        }
    }

    // No other ready task? Return current or idle
    if (current_task->state == TASK_READY) {
        return current_task;
    }
    return idle_task;
}

Checkpoint: Two tasks alternate execution, visible via different LED blink rates.

Phase 4: Priority Scheduling (Days 15-18)

Goals:

  • Implement priority-based scheduling
  • Add priority inheritance (basic)
  • Test priority preemption

Tasks:

  1. Modify scheduler for priority ordering
  2. Add priority change API
  3. Test that high priority preempts low
  4. Implement basic aging to prevent starvation

Key code:

TCB_t *scheduler_select_next(void) {
    TCB_t *best = idle_task;
    uint8_t best_priority = PRIORITY_IDLE;

    // Find highest priority ready task
    TCB_t *task = task_list;
    while (task) {
        if (task->state == TASK_READY && task->priority < best_priority) {
            best = task;
            best_priority = task->priority;
        }
        task = task->next;
    }

    // Apply aging to tasks that didn't get selected
    task = task_list;
    while (task) {
        if (task != best && task->state == TASK_READY) {
            // Boost priority of waiting tasks (prevent starvation)
            if (task->priority > 0) {
                task->priority--;
            }
        }
        task = task->next;
    }

    // Reset selected task's priority to base
    best->priority = best->base_priority;

    return best;
}

Checkpoint: High-priority task runs first, lower priority only runs when higher blocks.

Phase 5: Kernel Services (Days 19-23)

Goals:

  • Implement yield(), sleep(), get_tick()
  • Add task suspension/resumption
  • Implement delay functions

Tasks:

  1. Implement yield() - voluntary preemption
  2. Implement sleep(ticks) - block for duration
  3. Add wake-up logic in SysTick handler
  4. Implement suspend()/resume()

Key code:

void yield(void) {
    // Trigger PendSV to switch tasks
    SCB->ICSR |= SCB_ICSR_PENDSVSET_Msk;
}

void sleep(uint32_t ticks) {
    // Calculate wake-up tick
    __disable_irq();
    current_task->wake_tick = tick_count + ticks;
    current_task->state = TASK_BLOCKED;
    __enable_irq();

    // Trigger context switch
    yield();
}

// In SysTick_Handler
void SysTick_Handler(void) {
    tick_count++;

    // Wake up sleeping tasks
    TCB_t *task = task_list;
    while (task) {
        if (task->state == TASK_BLOCKED && task->wake_tick <= tick_count) {
            task->state = TASK_READY;
        }
        task = task->next;
    }

    // Pend context switch
    SCB->ICSR |= SCB_ICSR_PENDSVSET_Msk;
}

void task_suspend(TCB_t *task) {
    __disable_irq();
    task->state = TASK_SUSPENDED;
    __enable_irq();

    if (task == current_task) {
        yield();  // Switch away from suspended task
    }
}

void task_resume(TCB_t *task) {
    __disable_irq();
    if (task->state == TASK_SUSPENDED) {
        task->state = TASK_READY;
    }
    __enable_irq();
}

Checkpoint: Tasks can sleep for specific durations, wake up on time.

Phase 6: Polish and Demo (Days 24-28)

Goals:

  • Add statistics tracking
  • Stack overflow detection
  • Create compelling demo

Tasks:

  1. Track CPU time per task
  2. Implement stack canary check
  3. Add statistics reporting via UART
  4. Create demo with multiple interesting tasks

Statistics code:

void scheduler_update_stats(TCB_t *old_task, TCB_t *new_task) {
    if (old_task) {
        old_task->run_time += (tick_count - old_task->last_run_tick);
    }
    new_task->last_run_tick = tick_count;
    switch_count++;
}

void check_stack_overflow(TCB_t *task) {
    if (task->stack_base[0] != STACK_CANARY_VALUE) {
        uart_printf("STACK OVERFLOW: Task %s\n", task->name);
        // Halt or reset
        while (1);
    }
}

void print_stats(void) {
    uart_printf("=== Scheduler Statistics ===\n");
    uart_printf("Uptime: %u ms\n", tick_count * 10);
    uart_printf("Context switches: %u\n", switch_count);

    TCB_t *task = task_list;
    while (task) {
        uint32_t cpu_pct = (task->run_time * 100) / tick_count;
        uint32_t stack_used = ((uint32_t)task->stack_base + task->stack_size -
                              (uint32_t)task->sp) * 100 / task->stack_size;
        uart_printf("  %s: State=%d, Priority=%d, CPU=%u%%, Stack=%u%%\n",
                    task->name, task->state, task->priority, cpu_pct, stack_used);
        task = task->next;
    }
}

Checkpoint: Full demo runs with multiple tasks, statistics display works.


Hints in Layers

Hint 1: MSP vs PSP

Cortex-M has two stack pointers:

  • MSP (Main Stack Pointer): Used by exception handlers, startup code
  • PSP (Process Stack Pointer): Used by tasks in thread mode

Configure to use PSP for tasks:

// In startup/first task initialization
__set_CONTROL(__get_CONTROL() | CONTROL_SPSEL_Msk);
__ISB();  // Instruction barrier required after CONTROL change

Exception handlers automatically use MSP, tasks use PSP.

Hint 2: EXC_RETURN Values

The LR value on exception entry has special meaning:

0xFFFFFFF1: Return to handler mode, use MSP
0xFFFFFFF9: Return to thread mode, use MSP
0xFFFFFFFD: Return to thread mode, use PSP ← Want this for tasks!

Ensure your initial stack setup and PendSV use the correct value.

Hint 3: Starting the First Task

The first context switch is special - there’s no task to save:

void scheduler_start(void) {
    // Configure PendSV to lowest priority
    NVIC_SetPriority(PendSV_IRQn, 0xFF);

    // Set current_task to first task
    current_task = task_list;
    current_task->state = TASK_RUNNING;

    // Set PSP to first task's stack
    __set_PSP((uint32_t)current_task->sp);

    // Switch to PSP for thread mode
    __set_CONTROL(__get_CONTROL() | CONTROL_SPSEL_Msk);
    __ISB();

    // Trigger first switch
    SCB->ICSR |= SCB_ICSR_PENDSVSET_Msk;

    // Enable interrupts - this will trigger PendSV
    __enable_irq();

    // Should never reach here
    while (1);
}

Or check PSP == 0 in PendSV to detect first switch.

Hint 4: Priority Inversion Problem

Priority inversion occurs when:

  1. Low-priority task holds a resource
  2. High-priority task waits for that resource
  3. Medium-priority task runs instead of low!

Solution: Priority inheritance

void mutex_lock(mutex_t *m) {
    if (m->owner && m->owner->priority > current_task->priority) {
        // Boost owner's priority to prevent inversion
        m->owner->priority = current_task->priority;
    }
    // ... acquire mutex
}
Hint 5: Stack Size Determination

How much stack does a task need?

  • Local variables in all functions
  • Function call depth (each call uses stack)
  • Interrupt nesting (exception frames)

Rule of thumb:

stack_size = max_call_depth * avg_frame_size + interrupt_allowance + safety_margin

Simple task: 256 bytes
Medium task: 512 bytes
Complex task: 1024+ bytes

Use canaries and high-water-mark tracking to tune.

Hint 6: Debugging Context Switches

Add trace output to PendSV (carefully, don’t block):

// In a ring buffer, not UART (UART is too slow)
volatile struct {
    uint32_t tick;
    uint32_t from_task;
    uint32_t to_task;
} switch_trace[32];
volatile int trace_idx = 0;

// At start of PendSV:
switch_trace[trace_idx].tick = tick_count;
switch_trace[trace_idx].from_task = current_task->id;
// At end:
switch_trace[trace_idx].to_task = current_task->id;
trace_idx = (trace_idx + 1) % 32;

Examine trace buffer in debugger when things go wrong.


Testing Strategy

Test Categories

Category Purpose Examples
Unit Tests Test individual functions TCB creation, stack init
Switch Tests Verify context switches Two tasks, alternation
Priority Tests Priority scheduling High preempts low
Timing Tests Sleep, yield accuracy sleep(100) takes ~1s
Stress Tests Many tasks, long runs 8 tasks, 1 hour

Critical Test Cases

// Test 1: Basic switching
void test_basic_switch(void) {
    volatile int count1 = 0, count2 = 0;

    void task1(void) { while(1) count1++; }
    void task2(void) { while(1) count2++; }

    task_create("T1", task1, 256, PRIORITY_NORMAL);
    task_create("T2", task2, 256, PRIORITY_NORMAL);
    scheduler_start();

    // After some time, both counts should be non-zero
    delay_ms(1000);
    assert(count1 > 0);
    assert(count2 > 0);
}

// Test 2: Priority
void test_priority(void) {
    volatile int high_runs = 0, low_runs = 0;

    void high_task(void) {
        while(1) { high_runs++; yield(); }
    }
    void low_task(void) {
        while(1) { low_runs++; yield(); }
    }

    task_create("High", high_task, 256, PRIORITY_HIGH);
    task_create("Low", low_task, 256, PRIORITY_LOW);
    scheduler_start();

    // High priority should run much more often
    delay_ms(1000);
    assert(high_runs > low_runs * 10);
}

// Test 3: Sleep timing
void test_sleep(void) {
    void sleep_task(void) {
        while (1) {
            uint32_t start = get_tick();
            sleep(100);  // Sleep 1 second (100 * 10ms)
            uint32_t elapsed = get_tick() - start;
            assert(elapsed >= 100 && elapsed <= 102);  // Allow small variance
        }
    }

    task_create("Sleeper", sleep_task, 256, PRIORITY_NORMAL);
    task_create("Spinner", spin_task, 256, PRIORITY_NORMAL);
    scheduler_start();
}

Common Pitfalls & Debugging

Frequent Mistakes

Pitfall Symptom Solution
Wrong stack alignment Hard fault on first switch Ensure 8-byte alignment
Forgetting to save LR Exception return fails PUSH {LR} before BL
Wrong EXC_RETURN Returns to wrong mode/stack Use 0xFFFFFFFD for tasks
Interrupt enable timing Race in first switch Enable after PSP set up
Priority inversion High-priority task starves Implement priority inheritance
Stack overflow Random crashes Add canaries, increase stack

Debugging Tips

  1. Use debugger: Set breakpoint in PendSV, examine registers
  2. Check stack pointers: PSP and MSP should be in valid ranges
  3. Validate TCBs: Ensure sp field is first, values look sane
  4. Single-step assembly: Follow PendSV instruction by instruction
  5. Reduce to two tasks: Eliminate variables when debugging

Extensions & Challenges

Beginner Extensions

  • CPU usage display: Show percentage per task via UART
  • Task names in debug: Print task name on switch
  • Stack usage high-water mark: Track maximum stack usage

Intermediate Extensions

  • Mutexes: Implement basic mutex with priority inheritance
  • Message queues: Inter-task communication
  • Event flags: Wait for multiple conditions
  • Dynamic task creation/deletion: Add/remove tasks at runtime

Advanced Extensions

  • Tickless idle: Don’t run SysTick during long idle
  • Memory protection: Use MPU to isolate task stacks
  • Multicore: Extend for ARM with multiple cores
  • Real-time guarantees: Rate-monotonic scheduling

The Interview Questions They’ll Ask

  1. “What is a context switch and what does it involve?”
    • Saving CPU registers, switching stack pointers, loading new task’s state
  2. “What’s the difference between preemptive and cooperative scheduling?”
    • Preemptive: Timer forces switch. Cooperative: Task must yield.
  3. “What is priority inversion and how do you prevent it?”
    • High-priority waits for low-priority’s resource while medium runs
    • Priority inheritance: boost holder’s priority
  4. “Why use PendSV instead of switching directly in SysTick?”
    • PendSV has lowest priority, runs after all other ISRs complete
  5. “How do you determine stack size for a task?”
    • Analyze call depth, local variables, ISR nesting, add margin
  6. “What’s the overhead of a context switch?”
    • Save/restore registers, scheduler run time, cache effects
    • Typically 1-10 microseconds on Cortex-M

Books That Will Help

Topic Book Chapter
Context switching fundamentals Operating Systems: Three Easy Pieces Ch. 6
ARM exception handling The Definitive Guide to ARM Cortex-M3 Ch. 8
SysTick timer Making Embedded Systems Ch. 6
Process control blocks Modern Operating Systems Ch. 2
Critical sections Operating Systems: Three Easy Pieces Ch. 28
RTOS concepts Making Embedded Systems Ch. 10

Self-Assessment Checklist

Understanding

  • I can explain what a context switch involves on ARM
  • I understand the difference between MSP and PSP
  • I know why PendSV should have lowest priority
  • I can describe the automatic exception stack frame
  • I understand priority inversion and how to prevent it

Implementation

  • Tasks can be created with specified stack and priority
  • Context switching works between multiple tasks
  • Higher priority tasks preempt lower priority
  • sleep() blocks task for correct duration
  • Stack overflow is detected via canary

Testing

  • Two tasks alternate execution
  • Priority ordering is correct
  • Timing functions are accurate
  • System runs stable for extended periods

Learning Milestones

  1. SysTick fires regularly → Timer interrupt works
  2. You can switch between two tasks manually → Context save/restore works
  3. Preemption works automatically → Scheduler and PendSV integrated
  4. Three+ tasks run smoothly → Round-robin or priority scheduling works

This guide was expanded from LEARN_ARM_DEEP_DIVE.md. For the complete learning path, see the project index.