Project 15: Tiny Operating System

Build a minimal but complete operating system with preemptive multitasking, memory protection (using MPU), IPC (semaphores, queues), and a simple shell—like a tiny FreeRTOS you built yourself.


Quick Reference

Attribute Value
Language C + ARM Assembly
Difficulty Master
Time 1-2 months
Prerequisites Projects 6-7, 9, strong OS theory
Key Topics Multitasking, MPU, Syscalls, IPC
Coolness Level 5: Pure Magic (Super Cool)
Portfolio Value Industry Disruptor
Hardware STM32 board

Learning Objectives

By completing this project, you will:

  1. Implement privilege separation: Understand ARM privilege levels and mode switching
  2. Configure the Memory Protection Unit: Set up MPU regions for kernel and user space isolation
  3. Build a system call interface: Use SVCall to provide kernel services to user tasks
  4. Implement inter-process communication: Create semaphores, mutexes, and message queues
  5. Create a preemptive scheduler: Round-robin or priority-based task scheduling
  6. Build a command shell: Interactive interface for process management
  7. Integrate everything from previous projects: This capstone ties together all ARM concepts

The Core Question You’re Answering

“How does an operating system provide isolation, resource management, and coordination for multiple concurrent tasks on a single-core processor, and what hardware features make this possible on ARM?”

This question forces you to understand that an OS is not magic—it’s a carefully designed layer of software that leverages specific hardware features (privilege modes, MPU, interrupts, timers) to create the illusion of multiple programs running simultaneously while protecting them from each other.


Concepts You Must Understand First

Before starting this project, ensure you understand these concepts:

Concept Why It Matters Where to Learn
Context switching Core of multitasking Project 7
ARM exception model Syscalls use SVCall Project 9
Timer interrupts SysTick drives preemption Project 7, 12
Stack management Each task needs its own stack Project 5, 7
ARM privilege levels Kernel vs user mode ARM TRM
Scheduling algorithms Round-robin, priority OSTEP

Key Concepts Deep Dive

  1. ARM Privilege Levels
    • What is the difference between privileged (Handler) and unprivileged (Thread) modes?
    • How do you switch from privileged to unprivileged mode?
    • Why can’t unprivileged code access certain registers?
    • How does the processor automatically switch to privileged mode on exceptions?
    • “The Definitive Guide to ARM Cortex-M3/M4” Chapter 3
  2. Memory Protection Unit (MPU)
    • What does the MPU protect against?
    • How do you define memory regions (base, size, permissions)?
    • What happens when an MPU violation occurs?
    • How do you configure different regions for different tasks?
    • “The Definitive Guide to ARM Cortex-M3/M4” Chapter 11
  3. System Calls (SVCall)
    • How does the SVC instruction work?
    • How do you extract the syscall number from the SVC instruction?
    • How do you pass parameters and return values?
    • What is the advantage of SVCall over direct function calls?
    • “Operating Systems: Three Easy Pieces” Chapter 6
  4. Inter-Process Communication
    • What are the fundamental IPC primitives (semaphores, mutexes, queues)?
    • How do semaphores enable synchronization and mutual exclusion?
    • What is priority inversion and how do you prevent it?
    • How do message queues enable asynchronous communication?
    • “Operating Systems: Three Easy Pieces” Chapters 27-31
  5. Process State and Scheduling
    • What states can a task be in (Ready, Running, Blocked)?
    • How does the scheduler choose which task to run next?
    • What triggers a context switch (voluntary vs preemptive)?
    • How do you handle task creation and termination?
    • “Operating Systems: Three Easy Pieces” Chapters 7-8

Theoretical Foundation

Operating System Layer Cake

A minimal OS provides these layers of abstraction:

OS Architecture:
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│                    User Space (Unprivileged)                    │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │  Task 1      Task 2      Task 3      Shell              │   │
│  │  [user]      [user]      [user]      [user]             │   │
│  └─────────────────────────────────────────────────────────┘   │
│                              │                                  │
│                         SVC (syscall)                           │
│                              │                                  │
│                              ▼                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                   Kernel Space                           │   │
│  │  ┌─────────────────────────────────────────────────────┐│   │
│  │  │ Syscall Handler │ Scheduler │ IPC │ Memory Manager  ││   │
│  │  └─────────────────────────────────────────────────────┘│   │
│  └─────────────────────────────────────────────────────────┘   │
│                              │                                  │
│                              ▼                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                      Hardware                            │   │
│  │  MPU │ SysTick │ NVIC │ UART │ GPIO                      │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

ARM Privilege Modes

Cortex-M has two privilege levels:

Privilege Levels:
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   Privileged (Handler Mode)                                     │
│   ─────────────────────────                                     │
│   • Always in Handler mode during exceptions                    │
│   • Full access to all registers (including control regs)       │
│   • Can access MPU, NVIC, SCB                                   │
│   • Can execute privileged instructions                         │
│   • Kernel code runs here                                       │
│                                                                 │
│   Unprivileged (Thread Mode)                                    │
│   ─────────────────────────                                     │
│   • Configured via CONTROL register                             │
│   • Limited register access                                     │
│   • Cannot directly modify MPU, NVIC                            │
│   • Cannot execute privileged instructions                      │
│   • User tasks run here                                         │
│                                                                 │
│   Transition:                                                   │
│   • Exception entry: Automatic to Privileged                    │
│   • Exception return: Can specify unprivileged via EXC_RETURN   │
│   • CONTROL register: Can only be written in privileged mode    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Memory Protection Unit (MPU)

The MPU enforces memory access permissions:

MPU Configuration:
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   MPU Region Definition:                                        │
│   ┌─────────────────────────────────────────────────────────┐  │
│   │ Region Number │ Base Address │ Size │ Permissions │ EN   │  │
│   ├─────────────────────────────────────────────────────────┤  │
│   │      0        │ 0x08000000   │ 32K  │ RO+X        │  1   │  │ Kernel code
│   │      1        │ 0x20000000   │ 8K   │ RW          │  1   │  │ Kernel data
│   │      2        │ 0x08008000   │ 64K  │ RO+X (user) │  1   │  │ User code
│   │      3        │ 0x20002000   │ 4K   │ RW (user)   │  1   │  │ Task 1 stack
│   │      4        │ 0x20003000   │ 4K   │ RW (user)   │  0   │  │ Task 2 stack
│   └─────────────────────────────────────────────────────────┘  │
│                                                                 │
│   Access Permissions (AP bits):                                 │
│   • 000: No access                                              │
│   • 001: Privileged RW only                                     │
│   • 010: Privileged RW, User RO                                 │
│   • 011: Full access (Privileged and User RW)                   │
│                                                                 │
│   On violation:                                                 │
│   • MemManage fault exception triggers                          │
│   • MMFSR indicates type of violation                           │
│   • Kernel can terminate task or log error                      │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

System Call Mechanism

System calls use the SVC instruction:

System Call Flow:
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   User Code:                                                    │
│   ──────────                                                    │
│   int result = sys_write(fd, buf, len);                         │
│                                                                 │
│   Expands to:                                                   │
│   ┌─────────────────────────────────────────────────────────┐  │
│   │  MOV R0, fd                                              │  │
│   │  MOV R1, buf                                             │  │
│   │  MOV R2, len                                             │  │
│   │  SVC #1          ; Syscall number in immediate           │  │
│   │  ; Return value in R0                                    │  │
│   └─────────────────────────────────────────────────────────┘  │
│                              │                                  │
│                              ▼                                  │
│   Kernel (SVC Handler):                                         │
│   ─────────────────────                                         │
│   ┌─────────────────────────────────────────────────────────┐  │
│   │  1. Read stacked PC to find SVC instruction              │  │
│   │  2. Extract syscall number from immediate field          │  │
│   │  3. Read arguments from stacked R0, R1, R2               │  │
│   │  4. Execute kernel function                              │  │
│   │  5. Store return value in stacked R0                     │  │
│   │  6. Return from exception                                │  │
│   └─────────────────────────────────────────────────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Task State Machine

Tasks transition between states:

Task States:
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│             ┌──────────────────────────────────────┐            │
│             │                                      │            │
│             ▼                                      │            │
│        ┌─────────┐    schedule    ┌─────────┐     │            │
│        │  READY  │───────────────▶│ RUNNING │     │            │
│        └─────────┘                └─────────┘     │            │
│             ▲                          │          │            │
│             │                          │ wait()   │ preempt/   │
│        signal                          │          │ yield      │
│             │                          ▼          │            │
│             │                     ┌─────────┐     │            │
│             └─────────────────────│ BLOCKED │     │            │
│                                   └─────────┘     │            │
│                                                   │            │
│             ┌─────────┐                           │            │
│             │ ZOMBIE  │◀──────────────────────────┘            │
│             └─────────┘     exit()                             │
│                                                                 │
│   State Transitions:                                            │
│   • READY → RUNNING: Scheduler selects this task                │
│   • RUNNING → READY: Preemption (time slice expired)            │
│   • RUNNING → BLOCKED: Task calls wait() on semaphore/queue     │
│   • BLOCKED → READY: Resource becomes available (signal)        │
│   • RUNNING → ZOMBIE: Task calls exit() or returns from main    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Preemptive Scheduling

SysTick enables time-sliced scheduling:

Preemptive Scheduling:
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   Time ─────────────────────────────────────────────────────▶  │
│                                                                 │
│   SysTick: ┃    ┃    ┃    ┃    ┃    ┃    ┃    ┃    ┃          │
│            ▼    ▼    ▼    ▼    ▼    ▼    ▼    ▼    ▼          │
│                                                                 │
│   Task 1:  ████████      ████████      ████████                │
│   Task 2:        ████████      ████████      ████████          │
│   Task 3:              ████████      ████████      ████████    │
│                                                                 │
│   Each tick:                                                    │
│   1. SysTick interrupt fires                                    │
│   2. Current task is preempted (context saved)                  │
│   3. Scheduler selects next ready task                          │
│   4. New task's context is restored                             │
│   5. Return from interrupt to new task                          │
│                                                                 │
│   Time quantum: e.g., 10ms per task                             │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Semaphores and Synchronization

Semaphores coordinate access to shared resources:

Semaphore Operation:
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   Semaphore Structure:                                          │
│   ┌─────────────────────┐                                       │
│   │ count: int          │  Current value (available resources)  │
│   │ wait_queue: list    │  Tasks waiting for this semaphore     │
│   └─────────────────────┘                                       │
│                                                                 │
│   wait(sem):                           signal(sem):             │
│   ┌────────────────────┐               ┌────────────────────┐   │
│   │ disable interrupts │               │ disable interrupts │   │
│   │ if (sem.count > 0) │               │ sem.count++        │   │
│   │   sem.count--      │               │ if (wait_queue)    │   │
│   │   return           │               │   wake first task  │   │
│   │ else               │               │ enable interrupts  │   │
│   │   add to wait_queue│               └────────────────────┘   │
│   │   block this task  │                                        │
│   │ enable interrupts  │                                        │
│   └────────────────────┘                                        │
│                                                                 │
│   Mutex = binary semaphore (count = 0 or 1)                     │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Message Queues

Queues enable inter-task communication:

Message Queue:
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   Queue Structure:                                              │
│   ┌─────────────────────────────────────────────┐              │
│   │ buffer[N]: messages                         │              │
│   │ head, tail: indices                         │              │
│   │ count: current messages                     │              │
│   │ sem_full: semaphore for full condition      │              │
│   │ sem_empty: semaphore for empty condition    │              │
│   └─────────────────────────────────────────────┘              │
│                                                                 │
│   send(queue, msg):               receive(queue, &msg):        │
│   ┌──────────────────┐            ┌──────────────────┐         │
│   │ wait(sem_full)   │            │ wait(sem_empty)  │         │
│   │ buffer[tail]=msg │            │ msg=buffer[head] │         │
│   │ tail = (tail+1)%N│            │ head=(head+1)%N  │         │
│   │ signal(sem_empty)│            │ signal(sem_full) │         │
│   └──────────────────┘            └──────────────────┘         │
│                                                                 │
│   Producer ──[msg]──▶ Queue ──[msg]──▶ Consumer                │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Common Misconceptions

Misconception 1: “User mode can’t do anything useful” Reality: User mode can access its own memory and execute normal instructions. Only privileged operations (I/O, MPU config) are restricted.

Misconception 2: “Each task needs its own MPU configuration” Reality: You can share code regions and only change data/stack regions on context switch, or use a simpler model with fewer regions.

Misconception 3: “Preemption is complex” Reality: Preemption is just a context switch triggered by a timer interrupt instead of a voluntary yield.

Misconception 4: “Syscalls are slow” Reality: SVC is fast—just exception entry/exit overhead. The kernel function itself dominates latency.


Project Specification

What You Will Build

A complete minimal operating system with:

  • Preemptive multitasking with up to 8 tasks
  • User/kernel mode separation with MPU protection
  • System call interface for kernel services
  • Semaphores, mutexes, and message queues
  • Interactive shell for process management
  • Task creation, termination, and resource cleanup

Functional Requirements

  1. Task Management:
    • Create tasks with specified stack size and priority
    • Terminate tasks gracefully with resource cleanup
    • Support at least 8 concurrent tasks
    • Track task state (Ready, Running, Blocked, Zombie)
  2. Scheduling:
    • Preemptive round-robin or priority-based scheduling
    • Configurable time quantum (default 10ms)
    • Idle task when no other tasks ready
    • Yield syscall for voluntary context switch
  3. Memory Protection:
    • Kernel code/data inaccessible from user tasks
    • Each task has protected stack region
    • MPU fault handler for violations
    • Shared code region for user tasks
  4. System Calls:
    • sys_write(fd, buf, len) - Output to console
    • sys_read(fd, buf, len) - Input from console
    • sys_yield() - Voluntary context switch
    • sys_exit(code) - Terminate current task
    • sys_spawn(entry, priority) - Create new task
    • sys_sleep(ms) - Sleep for duration
    • Semaphore operations (wait, signal, create)
    • Message queue operations (send, receive, create)
  5. IPC Primitives:
    • Binary semaphores (mutexes)
    • Counting semaphores
    • Message queues with configurable size
    • Blocking and non-blocking variants
  6. Shell Interface:
    • ps - List all tasks with state and priority
    • exec <name> - Start a new task
    • kill <pid> - Terminate a task
    • mem - Show memory usage
    • sem - Show semaphore status
    • msg - Show message queue status

Non-Functional Requirements

  • Footprint: Kernel < 16KB Flash, < 4KB RAM
  • Latency: Syscall < 10us, context switch < 50us
  • Reliability: No kernel panics on user misbehavior
  • Determinism: Consistent scheduling behavior

Real World Outcome

When complete, your OS will work like this:

=== TinyOS v1.0 ===
Kernel: 8KB Flash, 2KB RAM
User space: 120KB Flash, 30KB RAM
Tasks: 8 max, priority 0-7

Boot sequence:
  [OK] MPU configured: 8 regions
  [OK] Kernel in privileged mode
  [OK] User tasks in unprivileged mode
  [OK] SysTick @ 1ms
  [OK] Shell task started

TinyOS> ps
PID  NAME       PRI  STATE     STACK  CPU%
  1  idle         7  READY     128    85%
  2  shell        2  RUNNING   512     2%
  3  blinker      3  READY     256     1%
  4  sensor       1  BLOCKED   256    12%

TinyOS> exec counter
Starting task 'counter' (PID 5)

TinyOS> kill 3
Task 'blinker' terminated

TinyOS> mem
Kernel heap: 1024/2048 bytes used
User heap:   4096/30720 bytes used
Task stacks: 1280 bytes total
Free memory: 25344 bytes

TinyOS> sem
SEM       VALUE  WAITERS
uart_tx       1  (none)
sensor_rdy    0  sensor(4)
data_mutex    1  (none)

TinyOS> msg
QUEUE     SIZE  PENDING
cmd_q     16    2 messages
data_q    64    0 messages

TinyOS> exec producer
Starting task 'producer' (PID 6)

TinyOS> exec consumer
Starting task 'consumer' (PID 7)

TinyOS> ps
PID  NAME       PRI  STATE     STACK  CPU%
  1  idle         7  READY     128    62%
  2  shell        2  RUNNING   512     3%
  4  sensor       1  READY     256    15%
  5  counter      4  READY     256     5%
  6  producer     3  BLOCKED   256     8%
  7  consumer     3  READY     256     7%

# User task trying to access kernel memory:
TinyOS> exec bad_task
Starting task 'bad_task' (PID 8)
[FAULT] MPU violation at 0x20000010 by task 8
[FAULT] Terminating task 'bad_task'

TinyOS> ps
PID  NAME       PRI  STATE     STACK  CPU%
  1  idle         7  READY     128    62%
  2  shell        2  RUNNING   512     3%
  ...
# Task 8 was terminated due to protection fault

Questions to Guide Your Design

Work through these questions BEFORE writing code:

  1. Memory Layout: How do you partition Flash and RAM between kernel and user space?

  2. Stack Allocation: Do you use static stacks or a stack allocator? How big should stacks be?

  3. MPU Regions: How many regions do you need? How do you handle task switching with limited regions?

  4. Syscall Design: How do you pass arguments? Return values? Errors?

  5. Priority Inversion: What if a high-priority task waits on a semaphore held by a low-priority task?

  6. Idle Task: What should the idle task do? (WFI instruction for power saving)

  7. Task Exit: How do you clean up task resources? Who reclaims the stack?

  8. Reentrant Kernel: Can one syscall be interrupted by another? Should it?


Thinking Exercise

Before writing any code, design these scenarios on paper:

Scenario 1: Task Creation

  1. Shell calls sys_spawn("blinker", 3)
  2. What kernel structures are created?
  3. How is the new task’s stack initialized?
  4. What MPU regions change (if any)?
  5. When does the new task first run?

Scenario 2: Mutex Deadlock

  1. Task A holds mutex M1, waiting for M2
  2. Task B holds mutex M2, waiting for M1
  3. How would you detect this?
  4. How would you prevent it?

Scenario 3: MPU Fault Handling

  1. User task reads from address 0x20000010 (kernel data)
  2. MPU triggers MemManage fault
  3. What information is available to the handler?
  4. How do you terminate just that task, not the whole system?

Solution Architecture

High-Level Design

┌─────────────────────────────────────────────────────────────────┐
│                          TinyOS                                  │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   User Space                                                     │
│   ──────────                                                     │
│   ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐           │
│   │  Shell  │  │ Blinker │  │ Sensor  │  │  Idle   │           │
│   └────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘           │
│        │            │            │            │                  │
│        └────────────┴────────────┴────────────┘                 │
│                            │                                     │
│                       SVC (syscall)                              │
│                            │                                     │
│   ─────────────────────────┼─────────────────────────────────   │
│                            ▼                                     │
│   Kernel Space                                                   │
│   ────────────                                                   │
│   ┌─────────────────────────────────────────────────────────┐  │
│   │  Syscall    │  Scheduler  │  IPC      │  Memory  │ Shell │  │
│   │  Handler    │             │  Manager  │  Manager │       │  │
│   └──────┬──────────────┬───────────┬───────────┬──────┬────┘  │
│          │              │           │           │      │        │
│          ▼              ▼           ▼           ▼      ▼        │
│   ┌─────────────────────────────────────────────────────────┐  │
│   │                    Hardware Abstraction                  │  │
│   │   SysTick  │  MPU  │  NVIC  │  UART  │  GPIO            │  │
│   └─────────────────────────────────────────────────────────┘  │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Memory Map

Memory Layout:
┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│   Flash (128KB example):                                        │
│   ┌─────────────────────────────────────────────────────────┐  │
│   │ 0x08000000  Vector Table (kernel)           256 bytes   │  │
│   │ 0x08000100  Kernel Code                     ~8 KB       │  │
│   │ 0x08002000  User Code (shared)              ~118 KB     │  │
│   │ 0x0801FFFF  End of Flash                                │  │
│   └─────────────────────────────────────────────────────────┘  │
│                                                                 │
│   RAM (32KB example):                                           │
│   ┌─────────────────────────────────────────────────────────┐  │
│   │ 0x20000000  Kernel Data/BSS                 ~2 KB       │  │
│   │ 0x20000800  Kernel Heap                     ~2 KB       │  │
│   │ 0x20001000  User Heap                       ~4 KB       │  │
│   │ 0x20002000  Task Stacks (8 x 1KB)           ~8 KB       │  │
│   │ 0x2000A000  Free                            ~16 KB      │  │
│   │ 0x20007FFF  End of RAM                                  │  │
│   └─────────────────────────────────────────────────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Data Structures

// Task Control Block
typedef struct TCB {
    uint32_t    *stack_ptr;      // Current stack pointer (must be first!)
    uint32_t    *stack_base;     // Bottom of stack
    uint32_t     stack_size;     // Size of stack
    uint8_t      state;          // READY, RUNNING, BLOCKED, ZOMBIE
    uint8_t      priority;       // 0 (highest) to 7 (lowest)
    uint8_t      pid;            // Process ID
    char         name[16];       // Task name
    uint32_t     sleep_until;    // For sleep syscall
    void        *waiting_on;     // Semaphore/queue we're blocked on
    struct TCB  *next;           // Next in ready/wait queue
} TCB;

// Task states
#define TASK_READY    0
#define TASK_RUNNING  1
#define TASK_BLOCKED  2
#define TASK_ZOMBIE   3

// Semaphore
typedef struct {
    volatile int32_t count;
    TCB             *wait_queue;
    char             name[16];
} Semaphore;

// Message Queue
typedef struct {
    uint8_t     *buffer;
    uint32_t     msg_size;
    uint32_t     capacity;
    uint32_t     count;
    uint32_t     head;
    uint32_t     tail;
    Semaphore    sem_full;
    Semaphore    sem_empty;
    Semaphore    sem_mutex;
    char         name[16];
} MessageQueue;

// Kernel state
typedef struct {
    TCB         *current_task;
    TCB         *ready_queue[8];  // One queue per priority
    TCB          tasks[MAX_TASKS];
    uint8_t      num_tasks;
    uint32_t     ticks;           // System tick counter
    Semaphore    semaphores[MAX_SEMS];
    MessageQueue queues[MAX_QUEUES];
} Kernel;

Syscall Table

// System call numbers
#define SYS_YIELD    0
#define SYS_EXIT     1
#define SYS_WRITE    2
#define SYS_READ     3
#define SYS_SPAWN    4
#define SYS_SLEEP    5
#define SYS_SEM_CREATE  10
#define SYS_SEM_WAIT    11
#define SYS_SEM_SIGNAL  12
#define SYS_QUEUE_CREATE 20
#define SYS_QUEUE_SEND   21
#define SYS_QUEUE_RECV   22

// Syscall wrapper (user space)
static inline int syscall(int num, int arg0, int arg1, int arg2) {
    register int r0 __asm__("r0") = arg0;
    register int r1 __asm__("r1") = arg1;
    register int r2 __asm__("r2") = arg2;
    register int ret __asm__("r0");

    __asm__ volatile (
        "svc %[num]"
        : "=r" (ret)
        : [num] "I" (num), "r" (r0), "r" (r1), "r" (r2)
        : "memory"
    );
    return ret;
}

// User space wrappers
void sys_yield(void) { syscall(SYS_YIELD, 0, 0, 0); }
void sys_exit(int code) { syscall(SYS_EXIT, code, 0, 0); }
int sys_write(int fd, const void *buf, int len) {
    return syscall(SYS_WRITE, fd, (int)buf, len);
}

Implementation Guide

Development Environment Setup

# Required tools
sudo apt-get install gcc-arm-none-eabi gdb-multiarch openocd

# Create project structure
mkdir -p tinyos/{kernel,user,inc}
cd tinyos

Project Structure

tinyos/
├── kernel/
│   ├── main.c              # Kernel entry point
│   ├── scheduler.c         # Task scheduling
│   ├── syscall.c           # Syscall handler
│   ├── ipc.c               # Semaphores, queues
│   ├── mpu.c               # MPU configuration
│   ├── fault.c             # Fault handlers
│   └── startup.s           # Vector table, context switch
├── user/
│   ├── shell.c             # Shell task
│   ├── syscall_wrappers.c  # Syscall user-side
│   ├── blinker.c           # Example task
│   └── user_entry.c        # User task entry points
├── inc/
│   ├── kernel.h            # Kernel interfaces
│   ├── syscall.h           # Syscall numbers
│   ├── ipc.h               # IPC structures
│   └── stm32f4xx.h         # Hardware definitions
├── linker_kernel.ld        # Kernel linker script
├── linker_user.ld          # User linker script (optional)
└── Makefile

Implementation Phases

Phase 1: Basic Kernel (Days 1-7)

Goals:

  • Boot into kernel mode
  • Run a single task in unprivileged mode
  • Implement syscall mechanism

Tasks:

  1. Set up kernel startup code (stack, BSS clear)
  2. Create first user task structure
  3. Switch to unprivileged mode
  4. Implement SVC handler skeleton
  5. Implement sys_yield (no-op at first)
  6. Test: User task runs and can call syscall

Checkpoint: User task runs and successfully calls sys_write.

// Kernel main
void kernel_main(void) {
    // Initialize kernel structures
    kernel_init();

    // Create shell task
    task_create(shell_main, "shell", 2, 512);

    // Create idle task
    task_create(idle_main, "idle", 7, 128);

    // Start first task
    scheduler_start();
    // Never returns
}

// Switch to unprivileged mode (assembly)
// startup.s:
.global start_first_task
.thumb_func
start_first_task:
    // Load task's stack pointer
    ldr r0, =current_task
    ldr r0, [r0]
    ldr r0, [r0]         // r0 = stack_ptr

    // Restore context from stack
    ldmia r0!, {r4-r11}  // Restore R4-R11
    msr psp, r0          // Set PSP to task's stack

    // Switch to unprivileged mode using PSP
    mov r0, #3           // CONTROL: PSP, unprivileged
    msr control, r0
    isb

    // "Return" to user task
    pop {r0-r3, r12, lr}
    pop {pc}

Phase 2: Preemptive Scheduling (Days 8-14)

Goals:

  • Multiple tasks running
  • SysTick-driven preemption
  • Round-robin scheduling

Tasks:

  1. Implement ready queue
  2. Configure SysTick for 1ms ticks
  3. Implement PendSV for context switching
  4. Add multiple test tasks
  5. Implement sys_yield properly
  6. Test: Tasks take turns running

Checkpoint: Multiple tasks run in round-robin fashion.

// SysTick handler
void SysTick_Handler(void) {
    kernel.ticks++;

    // Check for sleeping tasks to wake
    wake_sleeping_tasks();

    // Trigger context switch
    SCB->ICSR |= SCB_ICSR_PENDSVSET_Msk;
}

// PendSV handler (context switch)
// In assembly for precise control
.global PendSV_Handler
.thumb_func
PendSV_Handler:
    // Save current context
    mrs r0, psp
    stmdb r0!, {r4-r11}  // Push R4-R11

    // Save stack pointer to current TCB
    ldr r1, =current_task
    ldr r1, [r1]
    str r0, [r1]         // Save SP to TCB

    // Call scheduler to get next task
    push {lr}
    bl scheduler_next_task
    pop {lr}

    // Load new task's stack pointer
    ldr r1, =current_task
    ldr r1, [r1]
    ldr r0, [r1]

    // Restore context
    ldmia r0!, {r4-r11}  // Pop R4-R11
    msr psp, r0

    // Return to new task
    bx lr

// Scheduler
TCB *scheduler_next_task(void) {
    // Find highest priority ready task
    for (int p = 0; p < 8; p++) {
        if (kernel.ready_queue[p]) {
            TCB *task = kernel.ready_queue[p];
            kernel.ready_queue[p] = task->next;
            task->state = TASK_RUNNING;
            kernel.current_task = task;
            return task;
        }
    }
    // Should never reach here (idle task is always ready)
    return NULL;
}

Phase 3: MPU Protection (Days 15-21)

Goals:

  • Configure MPU regions
  • Kernel memory protected
  • Fault on violations

Tasks:

  1. Define memory regions
  2. Implement MPU configuration
  3. Set up MemManage fault handler
  4. Test protection with bad accesses
  5. Terminate violating task

Checkpoint: User task that touches kernel memory is terminated, system continues.

// MPU configuration
void mpu_init(void) {
    // Disable MPU during configuration
    MPU->CTRL = 0;

    // Region 0: Kernel code (read-only, privileged only)
    MPU->RNR = 0;
    MPU->RBAR = 0x08000000 | MPU_RBAR_VALID_Msk | 0;
    MPU->RASR = MPU_RASR_ENABLE_Msk |
                (14 << MPU_RASR_SIZE_Pos) |  // 32KB
                (5 << MPU_RASR_AP_Pos);      // RO, privileged only

    // Region 1: Kernel data (read-write, privileged only)
    MPU->RNR = 1;
    MPU->RBAR = 0x20000000 | MPU_RBAR_VALID_Msk | 1;
    MPU->RASR = MPU_RASR_ENABLE_Msk |
                (12 << MPU_RASR_SIZE_Pos) |  // 8KB
                (1 << MPU_RASR_AP_Pos) |     // RW, privileged only
                MPU_RASR_XN_Msk;             // No execute

    // Region 2: User code (read-only, user accessible)
    MPU->RNR = 2;
    MPU->RBAR = 0x08008000 | MPU_RBAR_VALID_Msk | 2;
    MPU->RASR = MPU_RASR_ENABLE_Msk |
                (16 << MPU_RASR_SIZE_Pos) |  // 128KB
                (6 << MPU_RASR_AP_Pos);      // RO, user accessible

    // Region 3-7: Task stacks (configured per task)

    // Enable MPU with default memory map for privileged
    MPU->CTRL = MPU_CTRL_ENABLE_Msk |
                MPU_CTRL_PRIVDEFENA_Msk;
    __DSB();
    __ISB();
}

// Configure stack region for task
void mpu_configure_task_stack(TCB *task, int region) {
    MPU->RNR = region;
    MPU->RBAR = (uint32_t)task->stack_base | MPU_RBAR_VALID_Msk | region;
    MPU->RASR = MPU_RASR_ENABLE_Msk |
                (9 << MPU_RASR_SIZE_Pos) |   // 1KB (adjust as needed)
                (3 << MPU_RASR_AP_Pos) |     // RW, full access
                MPU_RASR_XN_Msk;             // No execute
    __DSB();
    __ISB();
}

// MemManage fault handler
void MemManage_Handler(void) {
    uint32_t mmfar = SCB->MMFAR;
    uint32_t cfsr = SCB->CFSR;

    printf("[FAULT] MPU violation at 0x%08X by task %d\n",
           mmfar, kernel.current_task->pid);
    printf("[FAULT] CFSR: 0x%08X\n", cfsr);
    printf("[FAULT] Terminating task '%s'\n", kernel.current_task->name);

    // Terminate the offending task
    task_terminate(kernel.current_task);

    // Clear fault status
    SCB->CFSR = cfsr;

    // Trigger reschedule
    SCB->ICSR |= SCB_ICSR_PENDSVSET_Msk;
}

Phase 4: IPC (Days 22-28)

Goals:

  • Implement semaphores
  • Implement message queues
  • Task blocking on IPC

Tasks:

  1. Implement semaphore create/wait/signal
  2. Implement blocking (move to wait queue)
  3. Implement waking (move to ready queue)
  4. Implement message queue
  5. Test producer/consumer scenario

Checkpoint: Producer/consumer with message queue works correctly.

// Semaphore wait
int sem_wait(Semaphore *sem) {
    // Must be atomic
    __disable_irq();

    if (sem->count > 0) {
        sem->count--;
        __enable_irq();
        return 0;
    }

    // Block current task
    TCB *task = kernel.current_task;
    task->state = TASK_BLOCKED;
    task->waiting_on = sem;

    // Add to wait queue
    task->next = sem->wait_queue;
    sem->wait_queue = task;

    __enable_irq();

    // Trigger context switch
    SCB->ICSR |= SCB_ICSR_PENDSVSET_Msk;
    __ISB();

    // When we get here, we've been woken up
    return 0;
}

// Semaphore signal
int sem_signal(Semaphore *sem) {
    __disable_irq();

    if (sem->wait_queue) {
        // Wake first waiting task
        TCB *task = sem->wait_queue;
        sem->wait_queue = task->next;

        task->state = TASK_READY;
        task->waiting_on = NULL;

        // Add to ready queue
        add_to_ready_queue(task);
    } else {
        sem->count++;
    }

    __enable_irq();
    return 0;
}

// Message queue send
int queue_send(MessageQueue *queue, void *msg) {
    sem_wait(&queue->sem_full);    // Wait if queue is full
    sem_wait(&queue->sem_mutex);   // Lock queue

    // Copy message to buffer
    memcpy(queue->buffer + (queue->tail * queue->msg_size),
           msg, queue->msg_size);
    queue->tail = (queue->tail + 1) % queue->capacity;
    queue->count++;

    sem_signal(&queue->sem_mutex); // Unlock queue
    sem_signal(&queue->sem_empty); // Signal message available

    return 0;
}

Phase 5: Shell and Polish (Days 29-35+)

Goals:

  • Interactive shell
  • Complete syscall set
  • Thorough testing

Tasks:

  1. Implement shell task with command parsing
  2. Implement ps, kill, exec commands
  3. Add memory stats command
  4. Test all edge cases
  5. Optimize and clean up

Checkpoint: Full OS with shell managing tasks.


Hints in Layers

Hint 1: Getting Started

Start without MPU:

  1. Run two tasks in round-robin
  2. Get context switching working
  3. Then add MPU protection
  4. Debug kernel first, protect later
Hint 2: Context Switch Debugging

Common issues:

  • Missing R4-R11 in save/restore (they’re not auto-saved)
  • Wrong stack pointer (PSP vs MSP)
  • EXC_RETURN value wrong (should be 0xFFFFFFFD for PSP/Thread)
  • Stack misalignment (must be 8-byte aligned)
Hint 3: MPU Size Encoding

MPU region size is encoded as power of 2:

  • Size = 2^(RASR.SIZE+1)
  • SIZE=4 → 32 bytes
  • SIZE=9 → 1KB
  • SIZE=14 → 32KB
  • Minimum size is 32 bytes (SIZE=4)
Hint 4: Syscall Number Extraction

The SVC number is in the instruction:

void SVC_Handler_C(uint32_t *stack) {
    uint32_t pc = stack[6];  // Return address
    uint8_t svc_num = ((uint8_t *)pc)[-2];  // SVC number
}
Hint 5: Blocking Correctly

When a task blocks:

  1. Set state to BLOCKED
  2. Remove from ready queue
  3. Add to wait queue
  4. Call scheduler to run next task
  5. When signaled: state to READY, add to ready queue
Hint 6: Priority Inversion

If high-priority task waits on semaphore held by low-priority task:

  • Low task never runs (high is spinning)
  • Solution: Priority inheritance
  • Temporarily boost low task’s priority

Testing Strategy

Test Categories

Category Purpose Examples
Unit Tests Test kernel primitives Semaphore wait/signal
Scheduling Test task switching Multiple tasks, priorities
Protection Test MPU Bad access terminated
IPC Test communication Producer/consumer
Stress Test limits Many tasks, long runs

Critical Test Cases

  1. Basic Scheduling: Three tasks run round-robin
  2. Priority: High priority preempts low
  3. Semaphore: Producer/consumer synchronized
  4. Message Queue: Messages delivered in order
  5. MPU Violation: Bad access terminates task
  6. Task Exit: Resources cleaned up
  7. Long Run: 1 hour with no crashes
  8. Stress: 8 tasks all active

Test Tasks

// Producer task
void producer_main(void) {
    int count = 0;
    while (1) {
        Message msg = { .value = count++ };
        queue_send(&data_queue, &msg);
        sys_sleep(100);
    }
}

// Consumer task
void consumer_main(void) {
    while (1) {
        Message msg;
        queue_receive(&data_queue, &msg);
        printf("Received: %d\n", msg.value);
    }
}

// Bad task (for MPU testing)
void bad_task_main(void) {
    volatile int *kernel_data = (int *)0x20000010;
    *kernel_data = 0xDEADBEEF;  // Should fault!
}

Common Pitfalls & Debugging

Frequent Mistakes

Pitfall Symptom Solution
PSP not set Crash on first task Set PSP before unprivileged
R4-R11 lost Corrupted data after switch Save/restore in PendSV
MPU order Wrong region takes priority Higher number = higher priority
Interrupt in syscall Corruption Disable IRQ in critical sections
Stack overflow Random crashes Check stack pointer, add guard
Deadlock System hangs Analyze wait graphs

Debugging Strategies

  1. LED Debugging: Toggle LED at key points
  2. UART Logging: Printf in kernel (not user)
  3. Fault Analysis: Dump all status registers
  4. GDB Remote: Use your Project 14 stub!
  5. Statistics: Track context switches, syscalls

Kernel Panic Handler

void kernel_panic(const char *msg) {
    __disable_irq();

    printf("\n\n=== KERNEL PANIC ===\n");
    printf("Message: %s\n", msg);
    printf("Current task: %s (PID %d)\n",
           kernel.current_task->name, kernel.current_task->pid);

    // Dump registers, stack, etc.
    dump_system_state();

    // Halt
    while (1) {
        // Blink LED rapidly
        toggle_led();
        delay_ms(100);
    }
}

Extensions & Challenges

Beginner Extensions

  • More syscalls: getpid, nice, time
  • Task arguments: Pass arguments to new tasks
  • Exit codes: Report exit codes to parent
  • Statistics: CPU usage per task

Intermediate Extensions

  • Priority inheritance: Solve priority inversion
  • Signals: SIGTERM, SIGKILL
  • File system: Simple in-memory FS
  • Dynamic memory: malloc/free for user tasks
  • Drivers: Abstract driver interface

Advanced Extensions

  • Virtual memory: Simplified paging
  • Networking: TCP/IP stack
  • USB support: USB device stack
  • Multi-core: SMP scheduling
  • Security: Capabilities, namespaces

Self-Assessment Checklist

Understanding

  • I can explain ARM privilege levels and mode switching
  • I understand how the MPU enforces memory protection
  • I can describe the syscall flow from user to kernel
  • I understand preemptive vs cooperative scheduling
  • I can explain semaphore and queue operations
  • I know how priority inversion occurs and how to prevent it

Implementation

  • Tasks run in unprivileged mode
  • Context switch works correctly
  • MPU protects kernel from user
  • Syscalls work for all implemented calls
  • Semaphores block and wake correctly
  • Message queues work with producer/consumer
  • Shell can manage tasks

Quality

  • System runs for hours without crashes
  • Bad tasks don’t crash the kernel
  • Kernel code is well-organized
  • Memory usage is reasonable

The Interview Questions They’ll Ask

After completing this project, you’ll be ready for these questions:

  1. “How does an OS provide process isolation?”
    • Hardware support: MMU/MPU restricts memory access
    • Privilege levels: User code can’t execute privileged instructions
    • Syscalls: Only way to request kernel services
    • Each process has own address space
  2. “Explain how context switching works”
    • Save current task’s registers to its stack
    • Update current TCB with new stack pointer
    • Select next task to run (scheduler)
    • Load new task’s registers from its stack
    • Return to new task
  3. “What is a deadlock and how do you prevent it?”
    • Deadlock: Circular wait for resources
    • Prevention: Lock ordering, timeout, detection
    • Avoidance: Banker’s algorithm
    • Recovery: Kill one process
  4. “How do syscalls work?”
    • User executes SVC instruction with syscall number
    • CPU switches to privileged mode, jumps to handler
    • Kernel extracts number and arguments from stack
    • Kernel performs operation, stores return value
    • Returns to user mode
  5. “What is priority inversion and how do you solve it?”
    • High-priority task waits on resource held by low-priority
    • Medium-priority task runs, blocking low from releasing
    • Solution: Priority inheritance (boost low temporarily)

Books That Will Help

Topic Book Chapter
OS fundamentals “Operating Systems: Three Easy Pieces” All chapters
ARM privilege modes “Definitive Guide to ARM Cortex-M3/M4” Chapter 3
MPU “Definitive Guide to ARM Cortex-M3/M4” Chapter 11
Scheduling “Operating Systems: Three Easy Pieces” Chapters 7-10
Synchronization “Operating Systems: Three Easy Pieces” Chapters 26-31
Embedded OS design “Making Embedded Systems” Chapters 9-10

Resources

Documentation

  • ARM Cortex-M Technical Reference Manual
  • ARM MPU documentation
  • FreeRTOS source code (reference implementation)

Example Projects


This guide was expanded from LEARN_ARM_DEEP_DIVE.md. For the complete learning path, see the project index.