Learn RTOS from Scratch in C: From Bare Metal to a Preemptive Kernel
Goal: To deeply understand the inner workings of a Real-Time Operating System (RTOS) by building one from the ground up in C on a real microcontroller. You will learn not just the theory, but the practical, low-level details of scheduling, context switching, synchronization, and hardware interaction.
Why Build an RTOS?
Most programmers use an OS every day, but treat it as a black box. Building one, even a simple one, tears that box open. An RTOS, with its focus on determinism and resource constraints, is a perfect subject. It strips away the complexity of a general-purpose OS like Linux, leaving only the essential kernel components: a scheduler, tasks, and synchronization primitives.
By building an RTOS, you will master concepts that are fundamental to all concurrent software and gain a profound understanding of the interface between hardware and software. This is not about building the next FreeRTOS; it’s about building knowledge.
After completing these projects, you will:
- Understand how a computer boots with no operating system.
- Directly control hardware peripherals using memory-mapped I/O.
- Master interrupt handling and its role in an OS.
- Implement cooperative and preemptive multitasking from scratch.
- Write the ARM assembly code for a context switch.
- Build your own semaphores, mutexes, and message queues.
Hardware and Toolchain Setup
This is not a simulation. You will need real hardware.
- Target Board: An STM32F4xx series board (like a “Black Pill” or a Nucleo board). These are cheap, powerful, well-documented, and use a standard ARM Cortex-M4 core. The concepts are transferable to other ARM MCUs.
- Debugger/Programmer: An ST-Link V2. This is essential for flashing your code onto the board and for debugging.
- Toolchain:
  - ARM GCC Toolchain: The compiler, assembler, and linker (`arm-none-eabi-gcc`).
  - Make: To automate the build process.
  - OpenOCD: The software that communicates with the ST-Link to flash and debug the chip.
  - GDB: The GNU Debugger, specifically the ARM version (`arm-none-eabi-gdb`).
Core Concept Analysis
An RTOS is a minimal kernel whose primary job is to decide which “task” should be running at any given moment, with timing guarantees.
┌──────────────────────────────────────────────────────────────┐
│                          USER TASKS                          │
│                                                              │
│   ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
│   │    Task A    │    │    Task B    │    │    Task C    │   │
│   │  (e.g. GUI)  │    │ (e.g. WiFi)  │    │ (e.g. Sensor)│   │
│   └──────────────┘    └──────────────┘    └──────────────┘   │
└──────────────────────────────┬───────────────────────────────┘
                               │ (System Calls: sleep, wait, post)
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                       YOUR RTOS KERNEL                       │
│                                                              │
│  ┌────────────┐   ┌────────────┐  ┌───────────────────────┐  │
│  │ Scheduler  ├─► │  Sync/ITC  │  │  Task Control Blocks  │  │
│  │ (Decides)  │   │(Sem, Mutex)│  │ (Stacks, State, Prio) │  │
│  └────────────┘   └────────────┘  └───────────────────────┘  │
└──────────────────────────────┬───────────────────────────────┘
                               │ (Controls...)
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                     HARDWARE ABSTRACTION                     │
│                                                              │
│  ┌────────────┐   ┌────────────┐  ┌───────────────────────┐  │
│  │  SysTick   ├─► │  Context   │  │  Peripheral Drivers   │  │
│  │  (Timer)   │   │   Switch   │  │     (GPIO, UART)      │  │
│  └────────────┘   └────────────┘  └───────────────────────┘  │
└──────────────────────────────┬───────────────────────────────┘
                               │ (Manipulates...)
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                      PHYSICAL HARDWARE                       │
│                                                              │
│         CPU Registers    Timers    GPIO Pins    UART         │
└──────────────────────────────────────────────────────────────┘
Key RTOS Concepts
- Task: An independent thread of execution, with its own stack and state.
- Scheduler: The part of the kernel that decides which task to run. A preemptive scheduler can interrupt a running task to run a higher-priority one.
- System Tick: A periodic hardware timer interrupt that serves as the heartbeat of the RTOS, driving the preemptive scheduler.
- Context Switch: The low-level, architecture-specific process of saving the entire CPU state (registers, stack pointer) of the currently running task and restoring the state of the next task.
- Synchronization: Mechanisms like Mutexes and Semaphores that allow tasks to safely share resources and coordinate their actions.
The Project: Building Your RTOS Step-by-Step
Project 1: The Bare-Metal “Hello, World”
- File: LEARN_RTOS_FROM_SCRATCH_IN_C.md
- Main Programming Language: C, ARM Assembly
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 3: Advanced
- Knowledge Area: Bare-Metal Programming / Embedded Systems
- Software or Tool: ARM GCC, Make, OpenOCD
- Main Book: “Mastering STM32, 2nd Edition” by Carmine Noviello
What you’ll build: A C program with no OS and no standard libraries that directly manipulates hardware registers to make an LED blink.
Why it teaches the fundamentals: This is the true starting point. You’ll learn how a microcontroller boots, how to write a linker script to place your code in memory, and how to control hardware by writing magic numbers to magic memory addresses. You will understand that there is nothing “magic” about hardware control.
Core challenges you’ll face:
- Toolchain Setup → maps to installing and configuring `arm-none-eabi-gcc`
- Linker Script → maps to telling the linker where RAM and Flash memory start and end
- Startup File (Assembly) → maps to setting up the initial stack pointer and calling your `main` function
- Memory-Mapped I/O → maps to `#define LED_PIN_REGISTER (*(volatile uint32_t*)0x40020C14)` and writing to it
Key Concepts:
- Memory-Mapped Peripherals: The core concept of embedded systems. Hardware is controlled by writing to specific memory addresses defined in the microcontroller’s datasheet.
- Linker Script: Defines the memory layout of the final executable file.
- Vector Table: A table located at the start of Flash. On Cortex-M its first word is the initial stack pointer value, followed by pointers to the exception/interrupt handler functions.
Difficulty: Advanced. Time estimate: Weekend. Prerequisites: Strong C skills, willingness to read datasheets.
Real world outcome:
You will compile a .bin file, flash it to your board, and a single LED will blink. It seems simple, but you will have built it from absolute scratch, controlling every byte.
Implementation Hints:
- Find the documentation for your STM32 chip. The register descriptions are in the “Reference Manual” (RM), not the shorter datasheet.
- Look up the chapter on GPIO (General Purpose Input/Output).
- To turn on a pin, you need to:
  a. Enable the clock for the GPIO port (e.g., GPIOC) in the RCC (Reset and Clock Control) peripheral’s `AHB1ENR` register.
  b. Configure the specific pin (e.g., PC13) as an output in the GPIO’s `MODER` register.
  c. Set or clear the pin by writing to the GPIO’s `BSRR` or `ODR` register.
- Your `main` function will be a `while(1)` loop that toggles the pin and has a simple `for` loop delay.
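The register steps above can be sketched in C as follows. This is a minimal sketch, assuming an STM32F4 part with the LED on PC13; the addresses and bit positions come from the STM32F4 memory map and should be checked against your chip's Reference Manual. The names `gpio_regs_t`, `gpio_make_output`, `gpio_toggle`, `busy_delay`, and `blink_forever` are illustrative and not part of any vendor library.

```c
#include <stdint.h>

/* Register layout of one STM32F4 GPIO port (first seven registers only). */
typedef struct {
    volatile uint32_t MODER;   /* 0x00: two mode bits per pin */
    volatile uint32_t OTYPER;  /* 0x04: output type */
    volatile uint32_t OSPEEDR; /* 0x08: output speed */
    volatile uint32_t PUPDR;   /* 0x0C: pull-up / pull-down */
    volatile uint32_t IDR;     /* 0x10: input data */
    volatile uint32_t ODR;     /* 0x14: output data */
    volatile uint32_t BSRR;    /* 0x18: atomic bit set / reset */
} gpio_regs_t;

/* Assumed STM32F4 addresses: verify them in your Reference Manual. */
#define RCC_AHB1ENR  (*(volatile uint32_t *)(uintptr_t)0x40023830u)
#define GPIOC        ((gpio_regs_t *)(uintptr_t)0x40020800u)
#define RCC_GPIOC_EN (1u << 2)
#define LED_PIN      13u

/* Step b: select "general-purpose output" (mode bits 01) for one pin. */
void gpio_make_output(gpio_regs_t *port, unsigned pin)
{
    uint32_t moder = port->MODER;
    moder &= ~(3u << (pin * 2u));   /* clear both mode bits */
    moder |= (1u << (pin * 2u));    /* 01 = output */
    port->MODER = moder;
}

/* Step c: flip the pin through the output data register. */
void gpio_toggle(gpio_regs_t *port, unsigned pin)
{
    port->ODR ^= (1u << pin);
}

/* Crude delay; the volatile counter keeps the loop from being optimized away. */
void busy_delay(volatile uint32_t count)
{
    while (count--) { }
}

/* Body of main() on the target: never returns. */
void blink_forever(void)
{
    RCC_AHB1ENR |= RCC_GPIOC_EN;          /* step a: clock the port */
    gpio_make_output(GPIOC, LED_PIN);
    for (;;) {
        gpio_toggle(GPIOC, LED_PIN);
        busy_delay(500000u);              /* arbitrary count, tune by eye */
    }
}
```

Passing the port as a pointer, rather than hard-coding the address inside each helper, is a design choice made here so the same functions can also be run on a PC against an ordinary struct while you are debugging the bit arithmetic.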
Learning milestones:
- Your code compiles and links without `libc` → You understand the minimal requirements for an executable.
- The LED blinks → You have successfully controlled hardware by writing to memory.
- You can change the blink rate → You are in full control of the CPU.
Project 2: The System Tick Interrupt
- File: LEARN_RTOS_FROM_SCRATCH_IN_C.md
- Main Programming Language: C, ARM Assembly
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 3: Advanced
- Knowledge Area: Interrupt Handling / Embedded Systems
- Software or Tool: ARM Cortex-M SysTick Timer
- Main Book: “The Definitive Guide to ARM® Cortex®-M3 and Cortex®-M4 Processors” by Joseph Yiu
What you’ll build: A program that configures the Cortex-M SysTick timer to generate an interrupt every millisecond. You will write an Interrupt Service Routine (ISR) for this interrupt that increments a global tick_count variable.
Why it teaches the fundamentals: This is the heartbeat of your RTOS. Almost all scheduling decisions in a preemptive RTOS are driven by this timer interrupt. You will learn how the CPU automatically suspends your main code, runs your ISR, and then resumes, laying the groundwork for preemption.
Core challenges you’ll face:
- Configuring the SysTick Timer → maps to setting the reload value and enabling the timer and its interrupt
- Writing an ISR → maps to a C function, conventionally named `SysTick_Handler`, that the hardware calls through its entry in the vector table
- Understanding the Vector Table → maps to placing a pointer to your ISR in the correct slot in the vector table in your startup assembly file
- Volatile Keyword → maps to learning why you need `volatile uint32_t tick_count;` to prevent the compiler from optimizing away reads of a variable changed in an ISR
Key Concepts:
- Interrupts: A signal to the CPU from hardware that requires immediate attention.
- Interrupt Service Routine (ISR): The function that the CPU executes when an interrupt occurs.
- SysTick: A standard 24-bit down-counting timer defined by the ARM Cortex-M architecture (always present on the Cortex-M4 used here), designed specifically for this purpose.
Difficulty: Advanced. Time estimate: Weekend. Prerequisites: Project 1.
Real world outcome:
Your main loop will no longer use a busy-wait for loop for delays. Instead, it can create precise, non-blocking delays by waiting for the tick_count variable to reach a certain value. You can have two LEDs blinking at different, precise rates (e.g., 1Hz and 2.5Hz) in your main loop.
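A tick-based delay of this kind can be sketched as a small polling helper. This is one possible shape, not the only one: `blinker_t` and `blinker_poll` are hypothetical names, and the caller is assumed to pass in the current value of `tick_count` (in milliseconds). The unsigned subtraction keeps the comparison correct even when the 32-bit counter wraps around.

```c
#include <stdbool.h>
#include <stdint.h>

/* State for one independently blinking LED. */
typedef struct {
    uint32_t last_toggle;     /* tick value at the previous toggle */
    uint32_t half_period_ms;  /* time between toggles */
    bool     led_on;          /* current logical LED state */
} blinker_t;

/*
 * Returns true (and flips led_on) when at least half_period_ms ticks have
 * elapsed since the last toggle; returns false immediately otherwise.
 * It never waits, so the caller can poll several blinkers in one loop.
 */
bool blinker_poll(blinker_t *b, uint32_t now)
{
    /* Unsigned subtraction yields the elapsed ticks even across wraparound. */
    if ((uint32_t)(now - b->last_toggle) >= b->half_period_ms) {
        b->last_toggle += b->half_period_ms;  /* schedule from the ideal time, avoiding drift */
        b->led_on = !b->led_on;
        return true;
    }
    return false;
}
```

For the rates mentioned above, a 1Hz LED would use a half period of 500 ms and a 2.5Hz LED a half period of 200 ms. The main loop would call `blinker_poll(&slow, tick_count)` and `blinker_poll(&fast, tick_count)` on every pass and write the pin whenever the call returns true.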
Implementation Hints:
- The SysTick registers (CTRL, LOAD, VAL) are standardized by ARM. You’ll find them in the Cortex-M programming manual.
- Set the `LOAD` register to `(SystemClockFrequency / 1000) - 1` for a 1ms tick.
- Enable the timer, its interrupt, and set its clock source in the `CTRL` register.
- In your `startup.s` file, you must have a vector table that includes an entry for `SysTick_Handler`.
- Your ISR should be short and fast. For now, it just does `tick_count++;`.
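The hints above translate into only a few lines of C. The following is a sketch under stated assumptions: the SysTick register block sits at `0xE000E010` with the `CTRL` bit positions defined by ARM for Cortex-M, and `systick_regs_t` and `systick_init` are names invented for this example. The function takes the tick rate as a parameter so it can reject reload values that do not fit in the 24-bit `LOAD` register.

```c
#include <stdint.h>

/* SysTick register block as laid out by ARM for Cortex-M cores. */
typedef struct {
    volatile uint32_t CTRL;   /* control and status */
    volatile uint32_t LOAD;   /* reload value, 24 bits wide */
    volatile uint32_t VAL;    /* current counter value */
    volatile uint32_t CALIB;  /* calibration value */
} systick_regs_t;

#define SYSTICK            ((systick_regs_t *)(uintptr_t)0xE000E010u)
#define SYSTICK_ENABLE     (1u << 0)   /* start counting */
#define SYSTICK_TICKINT    (1u << 1)   /* raise the exception at zero */
#define SYSTICK_CLKSOURCE  (1u << 2)   /* 1 = processor clock */
#define SYSTICK_MAX_RELOAD 0x00FFFFFFu

/* Written only by the ISR; volatile so other code re-reads it every time. */
volatile uint32_t tick_count;

/*
 * Configure SysTick to fire tick_hz times per second.
 * Returns 0 on success, -1 if the requested rate cannot be represented.
 */
int systick_init(systick_regs_t *st, uint32_t core_clock_hz, uint32_t tick_hz)
{
    if (tick_hz == 0u) {
        return -1;
    }
    uint32_t cycles_per_tick = core_clock_hz / tick_hz;
    if (cycles_per_tick < 2u || cycles_per_tick - 1u > SYSTICK_MAX_RELOAD) {
        return -1;                       /* reload of 0 or beyond 24 bits */
    }
    st->LOAD = cycles_per_tick - 1u;     /* counter runs from LOAD down to 0 */
    st->VAL  = 0u;                       /* any write clears the counter */
    st->CTRL = SYSTICK_CLKSOURCE | SYSTICK_TICKINT | SYSTICK_ENABLE;
    return 0;
}

/* The ISR: keep it short. */
void SysTick_Handler(void)
{
    tick_count++;
}
```

On the target the call would look like `systick_init(SYSTICK, 16000000u, 1000u)`, assuming the core is still running from the 16 MHz internal oscillator that STM32F4 parts use after reset; substitute your actual core clock if you have configured the PLL.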
Learning milestones:
- Your ISR is successfully called every millisecond → You have mastered the basics of interrupt handling.
- You can create a precise 1-second delay by checking `tick_count` → You have a working system timer.
- You understand the difference between a blocking `for` loop delay and a non-blocking `tick_count` delay → You are thinking about concurrency.
Project 3: A Cooperative Multi-Tasking Scheduler
- File: LEARN_RTOS_FROM_SCRATCH_IN_C.md
- Main Programming Language: C, ARM Assembly
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 4: Expert
- Knowledge Area: OS Kernel Design / Context Switching
- Software or Tool: ARM GCC Inline Assembly
- Main Book: “The Definitive Guide to ARM® Cortex®-M3 and Cortex®-M4 Processors” by Joseph Yiu
What you’ll build: The core of an OS. You will create two independent “tasks” (C functions), each with its own stack. You will write a yield() function that performs a context switch: it saves the CPU registers of the current task to its stack, chooses the next task to run, and restores the registers of that next task from its stack.
Why it teaches the fundamentals: This is the single most important project. It demystifies what a “task” or “thread” is. You will learn that it’s nothing more than a function with its own dedicated stack space, and a “context switch” is just a carefully orchestrated sequence of saving and restoring CPU registers.
Core challenges you’ll face:
- Designing the Task Control Block (TCB) → maps to a `struct` that holds a pointer to the task’s stack
- Allocating stacks for each task → maps to creating static arrays `uint8_t task1_stack[1024];`
- Initializing the task stacks → maps to pre-filling the stack of a new task with “fake” register values so the context switcher can restore from it the first time
- Writing the context switch logic in assembly → maps to using `__asm volatile` to `PUSH` and `POP` the ARM registers (R4-R11, LR) and manipulate the stack pointer (SP)
Key Concepts:
- Task Control Block (TCB): A data structure used by the kernel to manage information about a task.
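- Context Switch: The process of saving the state of one task and restoring the state of another. On ARM Cortex-M, this means saving/restoring registers R0-R12, SP, LR, PC, and xPSR. The hardware does part of the work: on exception entry it automatically pushes R0-R3, R12, LR, PC, and xPSR onto the stack, so your code only has to save R4-R11 and the stack pointer.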
- Stack Frame: The layout of data on the stack, including saved registers.
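The stack frame of a task that has never run can be built by hand, as in the sketch below. It assumes a 16-word frame: eight words that the context switcher pops itself (R4-R11) underneath the eight words that the hardware pops on exception return (R0-R3, R12, LR, PC, xPSR). The names `tcb_t`, `task_init`, and `task_exit_trap` are hypothetical, and the TCB here uses a `uint32_t *` so the frame can be indexed by word.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t *stack_pointer;   /* saved SP while the task is not running */
} tcb_t;

#define INITIAL_XPSR 0x01000000u   /* only the Thumb state bit (bit 24) set */

/* Hypothetical landing spot if a task function ever returns. */
void task_exit_trap(void)
{
    for (;;) { }
}

/*
 * Build an initial frame at the top of `stack` so that the first context
 * switch into this task "returns" to the start of `entry`.
 */
void task_init(tcb_t *tcb, uint32_t *stack, size_t words, void (*entry)(void))
{
    uint32_t *sp = stack + words;                        /* stacks grow downward */
    sp = (uint32_t *)((uintptr_t)sp & ~(uintptr_t)7u);   /* 8-byte alignment */

    /* Hardware-restored part, highest address first. */
    *--sp = INITIAL_XPSR;                                /* xPSR */
    *--sp = (uint32_t)(uintptr_t)entry & ~1u;            /* PC, bit 0 cleared */
    *--sp = (uint32_t)(uintptr_t)task_exit_trap;         /* LR */
    *--sp = 0u;                                          /* R12 */
    *--sp = 0u;                                          /* R3 */
    *--sp = 0u;                                          /* R2 */
    *--sp = 0u;                                          /* R1 */
    *--sp = 0u;                                          /* R0 */

    /* Software-restored part: R11 down to R4. */
    for (int i = 0; i < 8; i++) {
        *--sp = 0u;
    }

    tcb->stack_pointer = sp;   /* now points at the slot for R4 */
}
```

A `uint32_t` array is used for the stack here instead of the `uint8_t` array mentioned earlier only because it makes word indexing and alignment simpler; either works as long as the top of the stack ends up 8-byte aligned.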
Difficulty: Expert. Time estimate: 1-2 weeks. Prerequisites: Project 2.
Real world outcome:
You will have two LEDs, each controlled by a separate task function. One task will blink its LED in a loop and call yield(). The other task will do the same. You will see both LEDs blinking, seemingly at the same time, as your scheduler switches between the two tasks.
Implementation Hints:
- The TCB is simple at first: `struct TCB { void *stack_pointer; };`.
- When a task is created, its stack must be initialized to look as if it had been interrupted right before its first instruction. This means pushing initial values for the registers, with the Program Counter (PC) pointing to the task’s C function and the `xPSR` register having the “Thumb mode” bit set.
- The context switch logic is the hardest part. The ARM `PendSV` exception is designed for this. Your `yield()` function will trigger the `PendSV` exception. The `PendSV_Handler` ISR will contain your assembly code.
- The `PendSV_Handler` does this:
  a. Disable interrupts.
  b. Save registers R4-R11 of the current task onto its stack.
  c. Save the current stack pointer into the current task’s TCB.
  d. Call your C scheduler function (`scheduler()`) to choose the next task.
  e. Get the stack pointer from the next task’s TCB.
  f. Restore registers R4-R11 for the next task from its stack.
  g. Enable interrupts and return from the exception. The hardware automatically restores the remaining registers (R0-R3, R12, LR, PC, and xPSR).
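One way to arrange these pieces is sketched below. It is a variant of the steps above rather than a literal transcription: `scheduler()` is called from `yield()` before PendSV is pended, so the handler itself never calls into C and only swaps stacks. The sketch assumes tasks run in Thread mode on the main stack (MSP), the FPU is left disabled, PendSV has the lowest exception priority, and `current_tcb` points at a valid entry before the first switch. The globals `tasks`, `task_count`, `current_tcb`, and `next_tcb` are names invented for this example, and starting the very first task is not shown. The assembly is compiled only for Cortex-M targets.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t *stack_pointer;    /* must stay the first member: the asm reads offset 0 */
} tcb_t;

#define MAX_TASKS 4u
tcb_t  tasks[MAX_TASKS];
size_t task_count;              /* number of entries of tasks[] in use */
tcb_t *volatile current_tcb;    /* task that is running now */
tcb_t *volatile next_tcb;       /* task PendSV should switch to */

/* Round-robin choice: the task after the current one, wrapping around. */
tcb_t *scheduler(void)
{
    size_t current_index = (size_t)(current_tcb - tasks);
    return &tasks[(current_index + 1u) % task_count];
}

#if defined(__arm__) && defined(__thumb2__)   /* Cortex-M builds only */

#define SCB_ICSR       (*(volatile uint32_t *)0xE000ED04u)
#define ICSR_PENDSVSET (1u << 28)

/* Cooperative yield: pick the next task, then request the switch. */
void yield(void)
{
    next_tcb = scheduler();
    SCB_ICSR = ICSR_PENDSVSET;
    __asm volatile("dsb \n isb" ::: "memory");  /* let PendSV fire before continuing */
}

/* Hardware has already pushed R0-R3, R12, LR, PC, xPSR on the task's stack. */
__attribute__((naked)) void PendSV_Handler(void)
{
    __asm volatile(
        "cpsid i                \n"  /* a. mask interrupts */
        "push  {r4-r11}         \n"  /* b. save callee-saved registers */
        "ldr   r0, =current_tcb \n"
        "ldr   r1, [r0]         \n"  /*    r1 = current_tcb */
        "mov   r2, sp           \n"
        "str   r2, [r1]         \n"  /* c. current_tcb->stack_pointer = SP */
        "ldr   r3, =next_tcb    \n"
        "ldr   r1, [r3]         \n"  /*    r1 = next_tcb (chosen in yield) */
        "str   r1, [r0]         \n"  /*    current_tcb = next_tcb */
        "ldr   r2, [r1]         \n"  /* e. fetch the next task's saved SP */
        "mov   sp, r2           \n"
        "pop   {r4-r11}         \n"  /* f. restore its callee-saved registers */
        "cpsie i                \n"  /* g. unmask and return from the exception */
        "bx    lr               \n"
        ".ltorg                 \n"  /* place the literal pool right here */
    );
}

#endif
```

Keeping the handler free of C calls means LR still holds the exception-return value at the final `bx lr`, so it does not need to be saved. If you prefer to call `scheduler()` from inside the handler as listed in step d, LR must be saved and restored around that call.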
Learning milestones:
- You can switch from Task A to Task B and back → You have successfully written a context switch.
- Each task maintains its own local variables → You understand the role of separate stacks.
- You can add a third task and have it run in sequence → Your scheduler is general enough to handle N tasks.
Project 4: A Preemptive, Priority-Based Scheduler
- File: LEARN_RTOS_FROM_SCRATCH_IN_C.md
- Main Programming Language: C
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 4: Expert
- Knowledge Area: OS Kernel Design / Scheduling Algorithms
- Software or Tool: ARM SysTick Timer
- Main Book: “Real-Time Concepts for Embedded Systems” by Qing Li
What you’ll build: You will upgrade your cooperative scheduler to be fully preemptive. The SysTick interrupt (from Project 2) will now trigger the context switch, forcibly interrupting tasks. You will also add a priority level to each task and modify the scheduler to always choose the highest-priority “Ready” task to run.
Why it teaches the fundamentals: This completes the core of a modern RTOS kernel. You learn how an OS can guarantee that high-priority work (like processing a critical sensor) will always run immediately, even if a low-priority task (like updating a display) is currently running. This is the essence of “real-time” behavior.
Core challenges you’ll face:
- Triggering the scheduler from the SysTick ISR → maps to calling your context switch logic at the end of the `SysTick_Handler`
- Adding states to your TCB → maps to including a `task_state` enum (`READY`, `RUNNING`, `BLOCKED`)
- Implementing a priority-based scheduling algorithm → maps to looping through your TCB array to find the highest-priority task that is in the `READY` state
- Protecting critical sections → maps to learning to disable/enable interrupts around short sections of code in the kernel that must not be interrupted
Key Concepts:
- Preemption: The ability of the OS to interrupt a running task to run another, higher-priority task.
- Task States: A task isn’t just running or not; it can be ready to run, actively running, or blocked waiting for an event.
- Priority-Based Scheduling: A simple and deterministic scheduling algorithm where every task has a fixed priority.
Difficulty: Expert. Time estimate: 1-2 weeks. Prerequisites: Project 3.
Real world outcome:
You will have two tasks blinking LEDs at different rates. Task A has a high priority, Task B has a low priority. If Task A needs to run (e.g., its sleep timer expires), it will immediately preempt Task B, which will only get to run when Task A is sleeping or blocked. This deterministic behavior will be visible in the timing of the LEDs.
Implementation Hints:
- Your `SysTick_Handler` will now do more than just increment a tick counter. It will also be responsible for decrementing sleep timers for any blocked tasks. If a task’s sleep timer reaches zero, its state changes from `BLOCKED` to `READY`.
- After updating timers, the SysTick handler will call the scheduler.
- The scheduler logic is simple: iterate through all tasks. Keep track of the highest-priority task you’ve seen so far that has a `READY` state. After checking all tasks, that’s the one you switch to.
- What happens if no task is ready? You should have an “Idle” task (the lowest-priority task that just sits in a `while(1)` loop) to run in this case.
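The timer bookkeeping and the priority scan described above can be sketched in plain C. Several choices here are assumptions made for illustration: a larger `priority` number means higher priority, the task that is currently `RUNNING` is still considered a candidate alongside `READY` tasks (otherwise it would lose the CPU to a lower-priority task on every tick), and a `BLOCKED` task with `sleep_ticks == 0` is taken to be waiting on something other than time. The names `tick_update`, `pick_next`, and `task_sleep` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { READY, RUNNING, BLOCKED } task_state_t;

typedef struct {
    uint32_t    *stack_pointer;
    task_state_t state;
    uint8_t      priority;     /* assumption: larger value = higher priority */
    uint32_t     sleep_ticks;  /* remaining ticks while sleeping, else 0 */
} tcb_t;

/* Called once per SysTick: wake every task whose sleep timer expires. */
void tick_update(tcb_t *tasks, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (tasks[i].state == BLOCKED && tasks[i].sleep_ticks > 0u) {
            if (--tasks[i].sleep_ticks == 0u) {
                tasks[i].state = READY;
            }
        }
    }
}

/*
 * Return the highest-priority task that is not blocked. With an idle task
 * that never blocks in the array, the result is never NULL.
 */
tcb_t *pick_next(tcb_t *tasks, size_t count)
{
    tcb_t *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (tasks[i].state == BLOCKED) {
            continue;
        }
        if (best == NULL || tasks[i].priority > best->priority) {
            best = &tasks[i];
        }
    }
    return best;
}

/* Put a task to sleep for `ticks` system ticks (0 is a no-op). */
void task_sleep(tcb_t *task, uint32_t ticks)
{
    if (ticks == 0u) {
        return;
    }
    task->sleep_ticks = ticks;
    task->state = BLOCKED;
}
```

In a kernel these functions would run with interrupts masked, or from within the SysTick handler itself, so that the task array is never observed half-updated. The caller is also responsible for moving the chosen task to `RUNNING` and the previous one back to `READY` before performing the context switch.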
Learning milestones:
- The SysTick interrupt successfully preempts a running task → You have a working preemptive kernel.
- When a high-priority task becomes ready, it always runs immediately → You have implemented priority-based scheduling.
- You have an idle task that runs when no other work is available → Your kernel is robust and always has something to schedule.
- You understand what “priority inversion” is and why a mutex is more than just a binary semaphore → You are thinking about advanced scheduling problems.
Summary
| Project | Key RTOS Concept | C/Hardware Technology | Difficulty |
|---|---|---|---|
| 1. The Bare-Metal “Hello, World” | Hardware Control | Memory-Mapped I/O | Advanced |
| 2. The System Tick Interrupt | Kernel Heartbeat | ARM SysTick, ISRs | Advanced |
| 3. Cooperative Multi-Tasking | Context Switching | Inline Assembly, Stacks | Expert |
| 4. Preemptive, Priority-Based Scheduler | Preemption, Scheduling | SysTick, Task States | Expert |