Project 7: Context Switcher & Mini-Scheduler
Build a preemptive scheduler that runs multiple “tasks” concurrently on a single ARM core, with context switching driven by the SysTick timer—like a tiny FreeRTOS kernel.
Quick Reference
| Attribute | Value |
|---|---|
| Language | ARM Assembly + C |
| Difficulty | Expert |
| Time | 2-4 weeks |
| Coolness | ★★★★★ Pure Magic (Super Cool) |
| Portfolio Value | Resume Gold |
| Prerequisites | Projects 3-4 (bare-metal + UART), ARM assembly, stack concepts |
| Key Topics | Context switching, SysTick, TCB, preemption, scheduling |
Learning Objectives
By completing this project, you will:
- Understand how multitasking works: See that “parallel” execution is an illusion created by fast switching
- Master ARM exception handling: Implement PendSV and SysTick handlers
- Save and restore CPU context: Know exactly which registers must be preserved
- Design Task Control Blocks: Create the data structure that represents a running task
- Implement scheduling algorithms: Build round-robin and priority-based schedulers
- Handle critical sections: Prevent race conditions in kernel code
The Core Question You’re Answering
“How can a single CPU core appear to run multiple programs simultaneously?”
This is the fundamental question of operating systems. The answer is: it can’t—but by switching between programs fast enough (thousands of times per second), we create the illusion of parallelism. Understanding this illusion requires knowing exactly what constitutes a program’s “state” and how to save/restore it.
Concepts You Must Understand First
| Concept | Why It Matters | Where to Learn |
|---|---|---|
| ARM register set | You’ll save/restore all of them | Project 2, ARM TRM |
| Stack pointer and stack frames | Each task needs its own stack | CS:APP Chapter 3 |
| ARM exception model | Understand automatic stacking | ARM Cortex-M TRM |
| Interrupt priorities | SysTick vs PendSV ordering | Joseph Yiu’s book Ch. 8 |
| Function call convention | Which registers are caller/callee saved | AAPCS document |
| Linked lists | TCBs are linked together | Any DS course |
Self-Assessment Questions
- Registers: In Cortex-M, which registers are automatically pushed to the stack on exception entry?
- Stack pointer: What’s the difference between MSP and PSP?
- Exception return: What does 0xFFFFFFF9 mean as a return value from an exception?
- Priority: Should SysTick or PendSV have higher priority? Why?
- Calling convention: Which registers must a function preserve (callee-saved)?
Theoretical Foundation
What Is Context?
A program’s “context” is everything needed to resume its execution:
Task Context on ARM Cortex-M:
┌─────────────────────────────────────────────────────────────┐
│ CPU Registers: │
│ R0-R12 - General purpose registers │
│ SP (R13) - Stack pointer (points to task's stack) │
│ LR (R14) - Link register (return address) │
│ PC (R15) - Program counter (where to resume) │
│ xPSR - Program status (flags, exception state) │
│ │
│ Task-specific: │
│ Stack - Task's private stack memory │
│ TCB - Metadata (state, priority, etc.) │
└─────────────────────────────────────────────────────────────┘
The Context Switch Illusion
Time →
────────────────────────────────────────────────────────────────────────────
Task A: █████████░░░░░░░░░░█████████░░░░░░░░░░█████████░░░░░░░░░░...
Task B: ░░░░░░░░░█████████░░░░░░░░░░█████████░░░░░░░░░░█████████░...
Task C: ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░...
↑ ↑ ↑ ↑ ↑
│ │ │ │ │
Context Context Context Context Context
switch switch switch switch switch
(A→B) (B→A) (A→B) (B→A) (A→B)
Each "█" = 10ms of execution
Switch happens ~100 times/second (10ms time slice)
From a human perspective, A and B appear to run "at the same time"
Cortex-M Exception Stack Frame
When an exception occurs, the hardware automatically pushes 8 registers:
High addresses (before exception)
┌─────────────────┐ ← SP before exception
│ │
│ (previous data) │
│ │
├─────────────────┤
│ xPSR │ ← Pushed automatically by hardware
├─────────────────┤
│ PC │ ← Where to return to
├─────────────────┤
│ LR │
├─────────────────┤
│ R12 │
├─────────────────┤
│ R3 │
├─────────────────┤
│ R2 │
├─────────────────┤
│ R1 │
├─────────────────┤
│ R0 │ ← SP after exception entry
└─────────────────┘
Low addresses
Note: R4-R11 are NOT saved automatically!
You must save them manually in the context switch.
Why PendSV Instead of SysTick?
SysTick fires at regular intervals for time-keeping. PendSV is a “pendable” exception specifically designed for context switching:
Problem with switching directly in SysTick:
┌────────────────────────────────────────────────────────────────┐
│ SysTick fires during higher-priority ISR │
│ → Context switch happens in middle of ISR │
│ → ISR state corrupted │
│ → System crash! │
└────────────────────────────────────────────────────────────────┘
Solution: Use PendSV with lowest priority:
┌────────────────────────────────────────────────────────────────┐
│ SysTick fires → Sets PendSV pending bit │
│ Higher-priority ISRs complete │
│ PendSV fires (lowest priority) │
│ → Context switch happens safely │
│ → All ISRs have completed │
└────────────────────────────────────────────────────────────────┘
Priority configuration:
- SysTick: Low but not lowest (e.g., 0xE0)
- PendSV: Lowest possible (0xFF)
- Other ISRs: Higher priority
Task Control Block (TCB)
The TCB holds everything the scheduler needs to manage a task:
typedef struct tcb {
uint32_t *sp; // Saved stack pointer (MUST be first!)
struct tcb *next; // Next task in list (for round-robin)
uint32_t *stack_base; // Base of allocated stack
uint32_t stack_size; // Size of stack
uint8_t priority; // Task priority (lower = higher priority)
uint8_t state; // READY, RUNNING, BLOCKED, SUSPENDED
uint32_t ticks; // Sleep counter / statistics
char name[16]; // Human-readable name
} TCB_t;
Task states:
┌─────────────────────────────────────────────────────────────┐
│ │
│ ┌─────────┐ schedule ┌─────────┐ │
│ │ READY │ ─────────────▶ │ RUNNING │ │
│ └─────────┘ └─────────┘ │
│ ▲ │ │
│ │ │ preempt / yield │
│ │ ▼ │
│ │ event ┌─────────────┐ │
│ └───────────────── │ BLOCKED │ │
│ │ (wait/sleep)│ │
│ └─────────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Stack Initialization for a New Task
When creating a task, we must set up its stack as if it had been switched out:
// Initialize task stack to look like it was interrupted
void init_task_stack(TCB_t *tcb, void (*task_func)(void)) {
uint32_t *sp = tcb->stack_base + tcb->stack_size / 4;
// Stack grows down, so start at top
// Exception frame (what hardware pushes)
*(--sp) = 0x01000000; // xPSR - Thumb bit set
*(--sp) = (uint32_t)task_func; // PC - task entry point
*(--sp) = 0xFFFFFFFD; // LR - special EXC_RETURN value
*(--sp) = 0x12121212; // R12
*(--sp) = 0x03030303; // R3
*(--sp) = 0x02020202; // R2
*(--sp) = 0x01010101; // R1
*(--sp) = 0x00000000; // R0 - could be argument
// Software saved registers (what PendSV pushes manually)
*(--sp) = 0x11111111; // R11
*(--sp) = 0x10101010; // R10
*(--sp) = 0x09090909; // R9
*(--sp) = 0x08080808; // R8
*(--sp) = 0x07070707; // R7
*(--sp) = 0x06060606; // R6
*(--sp) = 0x05050505; // R5
*(--sp) = 0x04040404; // R4
tcb->sp = sp;
}
The PendSV Handler (Heart of Context Switching)
// PendSV_Handler - performs the actual context switch
PendSV_Handler:
// Disable interrupts during switch
CPSID I
// Get current task's stack pointer (PSP)
MRS R0, PSP
// Save R4-R11 to current task's stack
STMDB R0!, {R4-R11}
// Save updated SP to current TCB
LDR R1, =current_tcb
LDR R2, [R1] // R2 = current_tcb pointer
STR R0, [R2] // current_tcb->sp = R0
// Call scheduler to get next task
PUSH {R14} // Save LR (EXC_RETURN)
BL scheduler // Returns next TCB in R0
POP {R14}
// Update current_tcb pointer
LDR R1, =current_tcb
STR R0, [R1]
// Get new task's stack pointer
LDR R0, [R0] // R0 = new_tcb->sp
// Restore R4-R11 from new task's stack
LDMIA R0!, {R4-R11}
// Set PSP to new task's stack
MSR PSP, R0
// Re-enable interrupts
CPSIE I
// Return (hardware will pop R0-R3, R12, LR, PC, xPSR)
BX LR
Scheduling Algorithms
Round-Robin (Time-Sliced):
Simple: Each task gets equal time slice, cycle through all tasks
Task A → Task B → Task C → Task A → ...
scheduler():
current = current->next
if current == NULL:
current = task_list_head
return current
Priority-Based:
Higher priority tasks run first, same priority uses round-robin
Priority 0: A, B (highest)
Priority 1: C
Priority 2: D, E (lowest)
Run order: A → B → A → B → ... (C, D, E starve unless A, B block)
scheduler():
// Find highest priority ready task
for each priority level:
for each task at this level:
if task->state == READY:
return task
Priority with Aging:
Prevent starvation by boosting priority of waiting tasks
Every tick: waiting_tasks.priority++
After running: task.priority = base_priority
Critical Sections
Kernel data structures must be protected from concurrent access:
// WRONG - race condition
void add_task(TCB_t *task) {
task->next = task_list; // SysTick here → corrupted list!
task_list = task;
}
// RIGHT - disable interrupts
void add_task(TCB_t *task) {
uint32_t primask = __get_PRIMASK();
__disable_irq();
task->next = task_list;
task_list = task;
__set_PRIMASK(primask); // Restore previous state
}
// Alternative: Use LDREX/STREX for lock-free operations
Why This Matters
Understanding context switching is essential for:
- Operating systems development: Every OS does this
- RTOS usage: Know what FreeRTOS/Zephyr do under the hood
- Debugging: Understand stack traces, watchdogs, crashes
- Performance: Context switches have overhead
- Real-time systems: Predictable switching times are critical
Project Specification
What You Will Build
A preemptive mini-scheduler with:
- Task management: Create, start, suspend, resume tasks
- Preemptive scheduling: SysTick-driven context switches
- Round-robin scheduler: Fair time sharing between tasks
- Priority support: Higher priority tasks run first
- Kernel API: yield(), sleep(), get_tick()
- Statistics: Context switch count, CPU usage per task
Functional Requirements
- Task Creation:
- Create tasks with name, function, stack size, priority
- Initialize stack for first context switch
- Support at least 8 concurrent tasks
- Scheduling:
- 10ms time slice (100 Hz tick rate)
- Round-robin among same-priority tasks
- Higher priority preempts lower priority
- Idle task runs when no other tasks ready
- Kernel Services:
yield(): Voluntarily give up CPUsleep(ticks): Block for N tick periodsget_tick(): Return current tick countsuspend(task)/resume(task): Task state control
- Safety:
- Stack overflow detection (canary values)
- Watchdog integration (optional)
- No priority inversion (basic)
Non-Functional Requirements
- Deterministic: Bounded context switch time (<10us)
- Low overhead: Minimal cycles spent in scheduler
- Debuggable: Easy to inspect task states and stacks
Real World Outcome
=== ARM Mini-Scheduler Demo ===
Creating tasks:
Task 1: LED blinker (priority 1, stack 256)
Task 2: Serial echo (priority 2, stack 512)
Task 3: Counter display (priority 3, stack 256)
Scheduler started, tick = 10ms
[00000010] Task3: Count = 1
[00000020] Task3: Count = 2
[00000025] Task2: Echo 'H'
[00000026] Task2: Echo 'i'
[00000030] Task3: Count = 3
[00000040] Task3: Count = 4
[00000500] Task1: LED ON
[00001000] Task1: LED OFF
[00001010] Task3: Count = 100
> stats
Scheduler Statistics:
Uptime: 15.34 seconds
Context switches: 1523
Tasks: 3 active, 1 idle
Task State Priority CPU% Switches Stack
LED_blinker READY 1 5% 127 240/256
Serial_echo BLOCKED 2 12% 412 380/512
Counter RUNNING 3 78% 983 210/256
Idle READY 255 5% 1 60/128
> suspend 3
Task3 suspended
> resume 3
Task3 resumed
> priority 2 1
Task2 priority changed: 2 -> 1
Physical result: LED blinks, serial echoes typed chars, counter prints
Solution Architecture
High-Level Design
┌─────────────────────────────────────────────────────────────────────┐
│ Mini-Scheduler │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ SysTick │──▶│ PendSV │──▶│ Scheduler │ │
│ │ Handler │ │ Handler │ │ (C code) │ │
│ │ (tick++) │ │ (asm switch)│ │ (pick next) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │ │ │ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ Task List │ │
│ │ ┌─────┐ ┌─────┐ ┌─────┐ ┌─────┐ │ │
│ │ │TCB 1│───▶│TCB 2│───▶│TCB 3│───▶│Idle │───▶NULL │ │
│ │ └─────┘ └─────┘ └─────┘ └─────┘ │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Kernel │ │ Memory │ │ Statistics │ │
│ │ Services │ │ Management │ │ Tracking │ │
│ │ yield,sleep │ │ (stacks) │ │ │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────┘
Key Components
| Component | Responsibility | Key Functions |
|---|---|---|
| SysTick Handler | Periodic tick, trigger PendSV | SysTick_Handler |
| PendSV Handler | Perform context switch (asm) | PendSV_Handler |
| Scheduler | Choose next task | scheduler() |
| Task Manager | Create, destroy, state changes | task_create, task_suspend |
| Kernel API | Services for tasks | yield, sleep, get_tick |
| Stack Manager | Allocate, initialize stacks | init_task_stack |
Data Structures
// Task state enumeration
typedef enum {
TASK_READY = 0,
TASK_RUNNING,
TASK_BLOCKED,
TASK_SUSPENDED,
TASK_TERMINATED
} task_state_t;
// Task Control Block
typedef struct tcb {
uint32_t *sp; // Saved stack pointer (MUST BE FIRST)
struct tcb *next; // Next TCB in list
struct tcb *prev; // Previous TCB (for easy removal)
// Stack info
uint32_t *stack_base; // Bottom of stack allocation
uint32_t stack_size; // Size in bytes
// Scheduling info
uint8_t priority; // 0 = highest, 255 = lowest
uint8_t base_priority; // Original priority (for aging)
task_state_t state; // Current state
// Timing
uint32_t wake_tick; // Tick count to wake up (if sleeping)
uint32_t run_time; // Total ticks spent running
// Debug info
char name[16]; // Task name
uint32_t id; // Unique task ID
// Stack canary for overflow detection
uint32_t stack_canary;
} TCB_t;
#define STACK_CANARY_VALUE 0xDEADBEEF
// Scheduler state
typedef struct {
TCB_t *task_list; // Head of task list
TCB_t *current_task; // Currently running task
TCB_t *idle_task; // Idle task (always runs)
uint32_t tick_count; // System tick counter
uint32_t switch_count; // Context switch counter
uint32_t task_count; // Number of tasks
bool scheduler_running; // Is scheduler started?
} scheduler_state_t;
System Configuration
// Scheduler configuration
#define TICK_RATE_HZ 100 // 100 Hz = 10ms tick
#define MAX_TASKS 16
#define IDLE_STACK_SIZE 128
#define MIN_STACK_SIZE 128
#define DEFAULT_STACK_SIZE 256
// Priority levels
#define PRIORITY_HIGHEST 0
#define PRIORITY_HIGH 1
#define PRIORITY_NORMAL 2
#define PRIORITY_LOW 3
#define PRIORITY_IDLE 255
// Calculate SysTick reload value
// SystemCoreClock / TICK_RATE_HZ - 1
#define SYSTICK_RELOAD (SystemCoreClock / TICK_RATE_HZ - 1)
Implementation Guide
Development Environment Setup
# Tools needed
sudo apt-get install gcc-arm-none-eabi gdb-multiarch
# Create project structure
mkdir -p mini-scheduler/{src,include,tests}
cd mini-scheduler
Project Structure
mini-scheduler/
├── src/
│ ├── startup.s # Startup code, vector table
│ ├── main.c # Application entry, demo tasks
│ ├── scheduler.c # Scheduler logic, task management
│ ├── context_switch.s # PendSV handler in assembly
│ ├── systick.c # SysTick configuration and handler
│ ├── kernel.c # Kernel services (yield, sleep)
│ └── uart.c # Serial output (from Project 4)
├── include/
│ ├── scheduler.h # Public scheduler API
│ ├── config.h # Configuration options
│ └── port.h # Hardware-specific definitions
├── linker.ld
├── Makefile
└── tests/
├── test_basic.c # Basic task switching test
└── test_priority.c # Priority scheduling test
Implementation Phases
Phase 1: Timer and Tick Counter (Days 1-3)
Goals:
- Configure SysTick for 10ms ticks
- Implement tick counter
- Test periodic interrupt firing
Tasks:
- Configure SysTick with proper reload value
- Implement SysTick_Handler (just increment counter)
- Add get_tick() function
- Test with LED toggle
Key code:
volatile uint32_t tick_count = 0;
void systick_init(void) {
// Disable SysTick during configuration
SysTick->CTRL = 0;
// Set reload value for 10ms ticks
// Assuming 16 MHz clock: 16,000,000 / 100 - 1 = 159,999
SysTick->LOAD = (SystemCoreClock / TICK_RATE_HZ) - 1;
// Clear current value
SysTick->VAL = 0;
// Set priority (not lowest, that's for PendSV)
NVIC_SetPriority(SysTick_IRQn, 0xE0);
// Enable SysTick with processor clock and interrupt
SysTick->CTRL = SysTick_CTRL_CLKSOURCE_Msk |
SysTick_CTRL_TICKINT_Msk |
SysTick_CTRL_ENABLE_Msk;
}
void SysTick_Handler(void) {
tick_count++;
// Pend PendSV to trigger context switch
SCB->ICSR |= SCB_ICSR_PENDSVSET_Msk;
}
uint32_t get_tick(void) {
return tick_count;
}
Checkpoint: LED toggles every 500ms using get_tick().
Phase 2: Task Creation and Stack Setup (Days 4-7)
Goals:
- Define TCB structure
- Implement task creation
- Initialize task stacks correctly
Tasks:
- Define TCB structure with all fields
- Allocate stack space (static or malloc from pool)
- Initialize stack with correct initial values
- Create task list management
Key code:
TCB_t *task_create(const char *name, void (*func)(void),
uint32_t stack_size, uint8_t priority) {
// Allocate TCB
TCB_t *tcb = allocate_tcb();
if (!tcb) return NULL;
// Allocate stack
tcb->stack_base = allocate_stack(stack_size);
if (!tcb->stack_base) {
free_tcb(tcb);
return NULL;
}
tcb->stack_size = stack_size;
// Initialize TCB fields
strncpy(tcb->name, name, sizeof(tcb->name) - 1);
tcb->priority = priority;
tcb->base_priority = priority;
tcb->state = TASK_READY;
tcb->id = next_task_id++;
// Set stack canary
tcb->stack_base[0] = STACK_CANARY_VALUE;
// Initialize stack
init_task_stack(tcb, func);
// Add to task list
add_task_to_list(tcb);
return tcb;
}
void init_task_stack(TCB_t *tcb, void (*func)(void)) {
// Start at top of stack (stack grows down)
uint32_t *sp = tcb->stack_base + (tcb->stack_size / 4);
// Exception frame (hardware saved)
*(--sp) = 0x01000000; // xPSR (Thumb bit)
*(--sp) = (uint32_t)func; // PC
*(--sp) = (uint32_t)task_exit; // LR (task exit handler)
*(--sp) = 0; // R12
*(--sp) = 0; // R3
*(--sp) = 0; // R2
*(--sp) = 0; // R1
*(--sp) = 0; // R0
// Software saved (manually by PendSV)
*(--sp) = 0; // R11
*(--sp) = 0; // R10
*(--sp) = 0; // R9
*(--sp) = 0; // R8
*(--sp) = 0; // R7
*(--sp) = 0; // R6
*(--sp) = 0; // R5
*(--sp) = 0; // R4
tcb->sp = sp;
}
Checkpoint: Tasks can be created and their stacks look correct in debugger.
Phase 3: Context Switch Implementation (Days 8-14)
Goals:
- Implement PendSV handler in assembly
- Write scheduler selection logic
- Test switching between two tasks
Tasks:
- Write PendSV_Handler in assembly
- Implement simple round-robin scheduler
- Create two test tasks (blink different LEDs)
- Start scheduler and verify switching
Key assembly code:
// context_switch.s
.syntax unified
.cpu cortex-m4
.thumb
.global PendSV_Handler
.type PendSV_Handler, %function
PendSV_Handler:
// Disable interrupts
CPSID I
// Get current PSP
MRS R0, PSP
// Is this the first switch? (PSP == 0 means first time)
CBZ R0, PendSV_restore
// Save R4-R11 on current task's stack
STMDB R0!, {R4-R11}
// Get current TCB pointer
LDR R1, =current_task
LDR R2, [R1]
// Save SP to current TCB (sp is first field)
STR R0, [R2]
PendSV_restore:
// Call scheduler to get next task (returns TCB* in R0)
PUSH {LR}
BL scheduler_select_next
POP {LR}
// Update current_task pointer
LDR R1, =current_task
STR R0, [R1]
// Get new task's SP from TCB (first field)
LDR R0, [R0]
// Restore R4-R11 from new task's stack
LDMIA R0!, {R4-R11}
// Set PSP to new task's stack
MSR PSP, R0
// Ensure return uses PSP, thread mode
ORR LR, LR, #0x04
// Re-enable interrupts
CPSIE I
// Return - hardware restores R0-R3, R12, LR, PC, xPSR
BX LR
.size PendSV_Handler, .-PendSV_Handler
C scheduler:
TCB_t *scheduler_select_next(void) {
TCB_t *next = current_task->next;
// Round-robin: find next READY task
while (next != current_task) {
if (next->state == TASK_READY) {
return next;
}
next = next->next;
if (next == NULL) {
next = task_list;
}
}
// No other ready task? Return current or idle
if (current_task->state == TASK_READY) {
return current_task;
}
return idle_task;
}
Checkpoint: Two tasks alternate execution, visible via different LED blink rates.
Phase 4: Priority Scheduling (Days 15-18)
Goals:
- Implement priority-based scheduling
- Add priority inheritance (basic)
- Test priority preemption
Tasks:
- Modify scheduler for priority ordering
- Add priority change API
- Test that high priority preempts low
- Implement basic aging to prevent starvation
Key code:
TCB_t *scheduler_select_next(void) {
TCB_t *best = idle_task;
uint8_t best_priority = PRIORITY_IDLE;
// Find highest priority ready task
TCB_t *task = task_list;
while (task) {
if (task->state == TASK_READY && task->priority < best_priority) {
best = task;
best_priority = task->priority;
}
task = task->next;
}
// Apply aging to tasks that didn't get selected
task = task_list;
while (task) {
if (task != best && task->state == TASK_READY) {
// Boost priority of waiting tasks (prevent starvation)
if (task->priority > 0) {
task->priority--;
}
}
task = task->next;
}
// Reset selected task's priority to base
best->priority = best->base_priority;
return best;
}
Checkpoint: High-priority task runs first, lower priority only runs when higher blocks.
Phase 5: Kernel Services (Days 19-23)
Goals:
- Implement yield(), sleep(), get_tick()
- Add task suspension/resumption
- Implement delay functions
Tasks:
- Implement yield() - voluntary preemption
- Implement sleep(ticks) - block for duration
- Add wake-up logic in SysTick handler
- Implement suspend()/resume()
Key code:
void yield(void) {
// Trigger PendSV to switch tasks
SCB->ICSR |= SCB_ICSR_PENDSVSET_Msk;
}
void sleep(uint32_t ticks) {
// Calculate wake-up tick
__disable_irq();
current_task->wake_tick = tick_count + ticks;
current_task->state = TASK_BLOCKED;
__enable_irq();
// Trigger context switch
yield();
}
// In SysTick_Handler
void SysTick_Handler(void) {
tick_count++;
// Wake up sleeping tasks
TCB_t *task = task_list;
while (task) {
if (task->state == TASK_BLOCKED && task->wake_tick <= tick_count) {
task->state = TASK_READY;
}
task = task->next;
}
// Pend context switch
SCB->ICSR |= SCB_ICSR_PENDSVSET_Msk;
}
void task_suspend(TCB_t *task) {
__disable_irq();
task->state = TASK_SUSPENDED;
__enable_irq();
if (task == current_task) {
yield(); // Switch away from suspended task
}
}
void task_resume(TCB_t *task) {
__disable_irq();
if (task->state == TASK_SUSPENDED) {
task->state = TASK_READY;
}
__enable_irq();
}
Checkpoint: Tasks can sleep for specific durations, wake up on time.
Phase 6: Polish and Demo (Days 24-28)
Goals:
- Add statistics tracking
- Stack overflow detection
- Create compelling demo
Tasks:
- Track CPU time per task
- Implement stack canary check
- Add statistics reporting via UART
- Create demo with multiple interesting tasks
Statistics code:
void scheduler_update_stats(TCB_t *old_task, TCB_t *new_task) {
if (old_task) {
old_task->run_time += (tick_count - old_task->last_run_tick);
}
new_task->last_run_tick = tick_count;
switch_count++;
}
void check_stack_overflow(TCB_t *task) {
if (task->stack_base[0] != STACK_CANARY_VALUE) {
uart_printf("STACK OVERFLOW: Task %s\n", task->name);
// Halt or reset
while (1);
}
}
void print_stats(void) {
uart_printf("=== Scheduler Statistics ===\n");
uart_printf("Uptime: %u ms\n", tick_count * 10);
uart_printf("Context switches: %u\n", switch_count);
TCB_t *task = task_list;
while (task) {
uint32_t cpu_pct = (task->run_time * 100) / tick_count;
uint32_t stack_used = ((uint32_t)task->stack_base + task->stack_size -
(uint32_t)task->sp) * 100 / task->stack_size;
uart_printf(" %s: State=%d, Priority=%d, CPU=%u%%, Stack=%u%%\n",
task->name, task->state, task->priority, cpu_pct, stack_used);
task = task->next;
}
}
Checkpoint: Full demo runs with multiple tasks, statistics display works.
Hints in Layers
Hint 1: MSP vs PSP
Cortex-M has two stack pointers:
- MSP (Main Stack Pointer): Used by exception handlers, startup code
- PSP (Process Stack Pointer): Used by tasks in thread mode
Configure to use PSP for tasks:
// In startup/first task initialization
__set_CONTROL(__get_CONTROL() | CONTROL_SPSEL_Msk);
__ISB(); // Instruction barrier required after CONTROL change
Exception handlers automatically use MSP, tasks use PSP.
Hint 2: EXC_RETURN Values
The LR value on exception entry has special meaning:
0xFFFFFFF1: Return to handler mode, use MSP
0xFFFFFFF9: Return to thread mode, use MSP
0xFFFFFFFD: Return to thread mode, use PSP ← Want this for tasks!
Ensure your initial stack setup and PendSV use the correct value.
Hint 3: Starting the First Task
The first context switch is special - there’s no task to save:
void scheduler_start(void) {
// Configure PendSV to lowest priority
NVIC_SetPriority(PendSV_IRQn, 0xFF);
// Set current_task to first task
current_task = task_list;
current_task->state = TASK_RUNNING;
// Set PSP to first task's stack
__set_PSP((uint32_t)current_task->sp);
// Switch to PSP for thread mode
__set_CONTROL(__get_CONTROL() | CONTROL_SPSEL_Msk);
__ISB();
// Trigger first switch
SCB->ICSR |= SCB_ICSR_PENDSVSET_Msk;
// Enable interrupts - this will trigger PendSV
__enable_irq();
// Should never reach here
while (1);
}
Or check PSP == 0 in PendSV to detect first switch.
Hint 4: Priority Inversion Problem
Priority inversion occurs when:
- Low-priority task holds a resource
- High-priority task waits for that resource
- Medium-priority task runs instead of low!
Solution: Priority inheritance
void mutex_lock(mutex_t *m) {
if (m->owner && m->owner->priority > current_task->priority) {
// Boost owner's priority to prevent inversion
m->owner->priority = current_task->priority;
}
// ... acquire mutex
}
Hint 5: Stack Size Determination
How much stack does a task need?
- Local variables in all functions
- Function call depth (each call uses stack)
- Interrupt nesting (exception frames)
Rule of thumb:
stack_size = max_call_depth * avg_frame_size + interrupt_allowance + safety_margin
Simple task: 256 bytes
Medium task: 512 bytes
Complex task: 1024+ bytes
Use canaries and high-water-mark tracking to tune.
Hint 6: Debugging Context Switches
Add trace output to PendSV (carefully, don’t block):
// In a ring buffer, not UART (UART is too slow)
volatile struct {
uint32_t tick;
uint32_t from_task;
uint32_t to_task;
} switch_trace[32];
volatile int trace_idx = 0;
// At start of PendSV:
switch_trace[trace_idx].tick = tick_count;
switch_trace[trace_idx].from_task = current_task->id;
// At end:
switch_trace[trace_idx].to_task = current_task->id;
trace_idx = (trace_idx + 1) % 32;
Examine trace buffer in debugger when things go wrong.
Testing Strategy
Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Test individual functions | TCB creation, stack init |
| Switch Tests | Verify context switches | Two tasks, alternation |
| Priority Tests | Priority scheduling | High preempts low |
| Timing Tests | Sleep, yield accuracy | sleep(100) takes ~1s |
| Stress Tests | Many tasks, long runs | 8 tasks, 1 hour |
Critical Test Cases
// Test 1: Basic switching
void test_basic_switch(void) {
volatile int count1 = 0, count2 = 0;
void task1(void) { while(1) count1++; }
void task2(void) { while(1) count2++; }
task_create("T1", task1, 256, PRIORITY_NORMAL);
task_create("T2", task2, 256, PRIORITY_NORMAL);
scheduler_start();
// After some time, both counts should be non-zero
delay_ms(1000);
assert(count1 > 0);
assert(count2 > 0);
}
// Test 2: Priority
void test_priority(void) {
volatile int high_runs = 0, low_runs = 0;
void high_task(void) {
while(1) { high_runs++; yield(); }
}
void low_task(void) {
while(1) { low_runs++; yield(); }
}
task_create("High", high_task, 256, PRIORITY_HIGH);
task_create("Low", low_task, 256, PRIORITY_LOW);
scheduler_start();
// High priority should run much more often
delay_ms(1000);
assert(high_runs > low_runs * 10);
}
// Test 3: Sleep timing
void test_sleep(void) {
void sleep_task(void) {
while (1) {
uint32_t start = get_tick();
sleep(100); // Sleep 1 second (100 * 10ms)
uint32_t elapsed = get_tick() - start;
assert(elapsed >= 100 && elapsed <= 102); // Allow small variance
}
}
task_create("Sleeper", sleep_task, 256, PRIORITY_NORMAL);
task_create("Spinner", spin_task, 256, PRIORITY_NORMAL);
scheduler_start();
}
Common Pitfalls & Debugging
Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Wrong stack alignment | Hard fault on first switch | Ensure 8-byte alignment |
| Forgetting to save LR | Exception return fails | PUSH {LR} before BL |
| Wrong EXC_RETURN | Returns to wrong mode/stack | Use 0xFFFFFFFD for tasks |
| Interrupt enable timing | Race in first switch | Enable after PSP set up |
| Priority inversion | High-priority task starves | Implement priority inheritance |
| Stack overflow | Random crashes | Add canaries, increase stack |
Debugging Tips
- Use debugger: Set breakpoint in PendSV, examine registers
- Check stack pointers: PSP and MSP should be in valid ranges
- Validate TCBs: Ensure sp field is first, values look sane
- Single-step assembly: Follow PendSV instruction by instruction
- Reduce to two tasks: Eliminate variables when debugging
Extensions & Challenges
Beginner Extensions
- CPU usage display: Show percentage per task via UART
- Task names in debug: Print task name on switch
- Stack usage high-water mark: Track maximum stack usage
Intermediate Extensions
- Mutexes: Implement basic mutex with priority inheritance
- Message queues: Inter-task communication
- Event flags: Wait for multiple conditions
- Dynamic task creation/deletion: Add/remove tasks at runtime
Advanced Extensions
- Tickless idle: Don’t run SysTick during long idle
- Memory protection: Use MPU to isolate task stacks
- Multicore: Extend for ARM with multiple cores
- Real-time guarantees: Rate-monotonic scheduling
The Interview Questions They’ll Ask
- “What is a context switch and what does it involve?”
- Saving CPU registers, switching stack pointers, loading new task’s state
- “What’s the difference between preemptive and cooperative scheduling?”
- Preemptive: Timer forces switch. Cooperative: Task must yield.
- “What is priority inversion and how do you prevent it?”
- High-priority waits for low-priority’s resource while medium runs
- Priority inheritance: boost holder’s priority
- “Why use PendSV instead of switching directly in SysTick?”
- PendSV has lowest priority, runs after all other ISRs complete
- “How do you determine stack size for a task?”
- Analyze call depth, local variables, ISR nesting, add margin
- “What’s the overhead of a context switch?”
- Save/restore registers, scheduler run time, cache effects
- Typically 1-10 microseconds on Cortex-M
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Context switching fundamentals | Operating Systems: Three Easy Pieces | Ch. 6 |
| ARM exception handling | The Definitive Guide to ARM Cortex-M3 | Ch. 8 |
| SysTick timer | Making Embedded Systems | Ch. 6 |
| Process control blocks | Modern Operating Systems | Ch. 2 |
| Critical sections | Operating Systems: Three Easy Pieces | Ch. 28 |
| RTOS concepts | Making Embedded Systems | Ch. 10 |
Self-Assessment Checklist
Understanding
- I can explain what a context switch involves on ARM
- I understand the difference between MSP and PSP
- I know why PendSV should have lowest priority
- I can describe the automatic exception stack frame
- I understand priority inversion and how to prevent it
Implementation
- Tasks can be created with specified stack and priority
- Context switching works between multiple tasks
- Higher priority tasks preempt lower priority
- sleep() blocks task for correct duration
- Stack overflow is detected via canary
Testing
- Two tasks alternate execution
- Priority ordering is correct
- Timing functions are accurate
- System runs stable for extended periods
Learning Milestones
- SysTick fires regularly → Timer interrupt works
- You can switch between two tasks manually → Context save/restore works
- Preemption works automatically → Scheduler and PendSV integrated
- Three+ tasks run smoothly → Round-robin or priority scheduling works
This guide was expanded from LEARN_ARM_DEEP_DIVE.md. For the complete learning path, see the project index.