Project 5: Sleep, Delay, and Idle Task

Add time services so tasks can sleep without busy-waiting, and implement an idle task that runs when nothing else is READY.

Quick Reference

Attribute	Value
Difficulty	Advanced
Time Estimate	1-2 weeks
Main Programming Language	C
Alternative Programming Languages	Rust, Ada
Coolness Level	High
Business Potential	High
Prerequisites	Projects 1-4, SysTick timebase, preemptive scheduler
Key Topics	Time services, sleep queues, tick wraparound, idle task, low power modes

1. Learning Objectives

By completing this project, you will:

Implement a sleep_ms() API that blocks a task without busy-waiting.
Manage a sleep queue and wake tasks at precise tick counts.
Handle tick counter wraparound safely.
Implement an idle task and optional low-power entry.

2. All Theory Needed (Per-Concept Breakdown)

2.1 Time Services and Tick Arithmetic

Fundamentals

Time services in an RTOS are built on a monotonically increasing tick counter. A sleep or delay call records a wake-up tick and blocks the task until the global tick reaches that value. Because ticks are finite-width integers, they eventually wrap around. Correct time comparisons must therefore use unsigned arithmetic and subtraction rather than naive greater-than comparisons. A stable time service is foundational for timeouts, periodic tasks, and timers.

Additional fundamentals: Sleeping is not the same as delaying in a busy loop. A sleep API is an explicit contract with the scheduler: the task gives up the CPU and expects to be resumed later. This lets other tasks run and reduces power consumption, which is a core motivation for building an RTOS in the first place.

Deep Dive into the concept

The global tick counter is typically a 32-bit unsigned integer incremented by SysTick. At 1 ms per tick, a 32-bit counter wraps in about 49.7 days. If your scheduler compares now >= wake_tick directly, it will fail across a wrap. The standard technique is to compare using signed subtraction: if ((int32_t)(now - wake_tick) >= 0) which works correctly even when now has wrapped. This works because unsigned subtraction is modulo arithmetic, and casting to signed preserves ordering within half the range. This is a well-known pattern in embedded timing.

A sleep service must block a task without wasting CPU. That means changing its state from RUNNING to BLOCKED and storing its wake-up tick. The scheduler then ignores it until the wake-up tick arrives. The wake-up mechanism can be implemented as a linear scan of tasks each tick, or with a sorted list of sleeping tasks. A sorted list improves efficiency when many tasks sleep, but it adds insertion overhead. For a small kernel, a linear scan is acceptable and simpler to reason about.

Another aspect is periodic tasks. If a task needs to run every 100 ms, it should calculate its next wake-up time based on an absolute schedule, not relative to the current time. Otherwise, delays accumulate and cause drift. The correct pattern is next += period, then sleep until next. This is the same principle used in software timers and time-triggered systems. Your sleep API should support this pattern by accepting an absolute tick or by letting the task track it.

Time services also interact with scheduler correctness. If the SysTick ISR delays (due to interrupts being masked), the tick count may lag real time, causing tasks to wake late. This is acceptable in soft real-time systems but must be quantified in hard real-time contexts. In this project, you will measure wake-up accuracy to validate your implementation. Later, this will feed into jitter analysis in Project 10.

Additional depth: Time services must also consider the resolution and granularity of ticks. A 1 ms tick cannot accurately represent sub-millisecond delays, and tasks that require microsecond precision must use hardware timers or cycle counters. This is why many RTOSes provide both tick-based delays and hardware timer APIs. In your system, you should clearly document that sleep_ms() has a resolution of one tick and that actual sleep times can be up to one tick longer than requested. This sets correct expectations and prevents misunderstandings in later projects.

Another subtlety is that tick-based delays are subject to interrupt masking. If interrupts are disabled for a long period, the tick counter will not advance, effectively “freezing” time in the kernel. This can cause tasks to wake late and can accumulate delay. A robust system minimizes the time interrupts are masked and keeps critical sections short. When you measure sleep accuracy, include tests where you intentionally add long critical sections to see how the time service behaves. This teaches you to consider worst-case timing rather than average behavior.

The sleep API also interacts with task priorities. If multiple tasks become READY at the same tick, the scheduler must decide which runs first. In a priority-based system, higher-priority tasks should run first; lower-priority tasks may be delayed even if their wake time has arrived. This is correct behavior, but it means that wake-up time is necessary but not sufficient for immediate execution. Your timing report should distinguish between wake-up time (task becomes READY) and actual run time (task gets CPU). This distinction becomes important in later jitter measurements.

Finally, you should consider the data structure invariants for your sleep queue. If you allow tasks to cancel or change their sleep, you need to remove them from the queue safely. Even if you do not implement cancel, you should design the queue with clear ownership rules and ensure that tasks cannot be in multiple queues simultaneously. This discipline will pay off when you add more kernel objects like timers and event flags.

How this fit on projects

Time services are implemented in Sec. 3.1 and validated in Sec. 3.7.2. They are reused in software timers in Project 8.

Definitions & key terms

Tick: Periodic time unit generated by SysTick.
Wake-up tick: The tick count at which a sleeping task becomes READY.
Wraparound: Overflow of tick counter back to zero.
Absolute scheduling: Scheduling based on fixed time points.

Mental model diagram (ASCII)

Tick counter: 0 ... 0xFFFF_FFFF -> wraps -> 0
Task sleep: wake_tick stored
Scheduler: if (now - wake_tick) >= 0 -> READY

How it works (step-by-step, with invariants and failure modes)

Task calls sleep_ms and computes wake_tick.
Task state set to BLOCKED; scheduler selects another task.
SysTick increments now; scheduler checks sleeping tasks.
When (now - wake_tick) >= 0, task becomes READY.
Failure mode: naive comparisons break on wrap.

Minimal concrete example

void task_sleep(uint32_t ticks) {
    current->wake_tick = g_tick + ticks;
    current->state = BLOCKED;
    schedule();
}

bool time_reached(uint32_t now, uint32_t wake) {
    return (int32_t)(now - wake) >= 0;
}

Common misconceptions

“Time never wraps” -> it always wraps with finite-width counters.
“Delay is just a busy loop” -> that wastes CPU and breaks deadlines.
“Relative sleep is enough” -> it introduces drift for periodic tasks.

Check-your-understanding questions

Why does now >= wake_tick fail after wraparound?
How do you avoid drift in periodic tasks?
What happens if a task sleeps longer than half the counter range?

Check-your-understanding answers

Because wrap makes now smaller even though time progressed.
Use absolute scheduling: next += period.
The signed subtraction trick breaks; avoid very long sleeps.

Real-world applications

Periodic sensor sampling in industrial systems.
Real-time control loops with fixed deadlines.

Where you’ll apply it

This project: Sec. 3.1, Sec. 3.7.2, Sec. 5.10 Phase 1.
Also used in: Project 8 and Project 10.

References

“Real-Time Concepts for Embedded Systems” by Qing Li, Ch. 9-11.
ARM documentation on SysTick and counter wrap.

Key insights

Correct time arithmetic is the difference between deterministic sleep and subtle timing bugs.

Summary

A reliable tick-based time service requires correct wraparound handling and absolute scheduling for periodic tasks.

Homework/Exercises to practice the concept

Simulate tick wraparound with a small 8-bit counter.
Implement a periodic task with absolute scheduling and measure drift.
Add unit tests for the time_reached() function across wrap.

Solutions to the homework/exercises

Wrap at 255 and verify comparisons across 0.
Expect near-zero drift after many periods.
Test cases where now is less than wake but time has advanced.

2.2 Sleep Queues and Idle Task Design

Fundamentals

A sleep queue is the data structure that holds blocked tasks waiting for a future tick. Each entry includes the task pointer and its wake-up time. The idle task is a special task that runs when no other tasks are READY; it can simply loop or enter a low-power mode. Together, these mechanisms prevent busy-waiting and keep the CPU available for real work.

Additional fundamentals: The idle task is a safety net and a power tool at the same time. It guarantees that the scheduler always has a valid task to run, and it provides a natural place to enter low-power modes or collect statistics. Treat it as part of the kernel, not as an afterthought.

Deep Dive into the concept

A sleep queue can be implemented in different ways: unsorted lists, sorted lists, or timer wheels. A simple unsorted list is easiest but requires scanning all sleeping tasks on every tick. This is acceptable for a small number of tasks. A sorted list reduces per-tick work because you only check the head, but it increases insertion cost. In this project, you can choose either; the key is to document the tradeoff. For an educational RTOS, clarity and correctness usually win over micro-optimization.

The idle task must never block and must be always READY when no other tasks are. It is typically given the lowest priority. In low-power designs, the idle task can execute WFI (Wait For Interrupt) to put the CPU into a low-power state until the next interrupt occurs. This can significantly reduce power consumption, which matters for battery-powered systems. However, entering low-power modes can introduce wake-up latency, so you must consider whether it is acceptable for your timing requirements. In a learning RTOS, implementing a simple idle loop is enough, but adding a WFI instruction is a valuable extension.

Sleep queues also interact with task state transitions. When a task calls sleep, it moves to BLOCKED and is inserted into the sleep queue. When its wake time arrives, it is moved back to READY and becomes eligible for scheduling. This movement must be done atomically with respect to interrupts to avoid losing tasks. If an ISR wakes a task while the scheduler is updating the queue, the task could be lost or woken twice. This is why you must use critical sections around queue updates.

The idle task also serves as a diagnostics hook. You can measure how often the system is idle to estimate CPU utilization. For example, increment a counter in the idle loop to approximate idle time. This is useful for tuning system performance and is a common technique in commercial RTOSes. In later projects, you can combine this with jitter measurements to estimate worst-case behavior.

Additional depth: Idle task design is often underestimated. In a real system, the idle task is the place where the kernel can perform background maintenance, such as cleaning up resources from terminated tasks, checking stack canaries, or collecting statistics. It is also where many RTOSes implement power management by entering low-power modes. When you execute WFI, the core stops until an interrupt arrives, which saves power but can add a small wake-up latency. Measuring that latency is a useful exercise to understand the tradeoff between power and responsiveness.

The idle task must be the lowest priority so it never preempts real work. If you accidentally give it a higher priority, it can starve real tasks and create confusing bugs. To prevent this, many RTOSes create the idle task internally and disallow user code from changing its priority. In your learning kernel, make sure the idle task priority is fixed and documented.

Another subtle aspect is that the idle task should not use a lot of stack. It typically runs a tight loop and, if it calls WFI, it should not call other functions that consume stack. This makes it easier to size the idle stack and to detect stack issues elsewhere. You can also use the idle task to measure CPU utilization by incrementing a counter each loop and comparing it to a reference. This provides a low-cost metric for system load and helps you evaluate whether your time services are efficient.

Finally, the idle task is a great place to integrate debugging hooks. For example, you can check for stack canary corruption or log periodic statistics without disturbing time-critical tasks. This transforms the idle task from a simple placeholder into a useful system service.

How this fit on projects

You will implement sleep queue operations in Sec. 3.1 and idle behavior in Sec. 3.7.2. The idle task is also referenced in Project 9 when checking stack usage.

Definitions & key terms

Sleep queue: List of tasks blocked until a wake-up tick.
Idle task: Task that runs when no others are READY.
WFI: Cortex-M instruction to wait for interrupt.
CPU utilization: Fraction of time not spent in idle.

Mental model diagram (ASCII)

READY tasks -> scheduler -> run
No READY -> idle task -> optional WFI
Sleeping tasks -> sleep queue -> wake on tick

How it works (step-by-step, with invariants and failure modes)

Task calls sleep; added to sleep queue.
Each tick, scheduler checks sleep queue.
Tasks with expired wake time become READY.
If no READY tasks, idle task runs.
Failure mode: sleep queue update without critical section -> lost tasks.

Minimal concrete example

void idle_task(void) {
    while (1) {
        __WFI();
    }
}

Common misconceptions

“Idle task is wasted” -> it is required and can save power.
“All sleeping tasks wake on same tick” -> each has its own wake time.
“Critical sections not needed” -> race conditions will drop tasks.

Check-your-understanding questions

Why is an idle task mandatory in a preemptive RTOS?
What is the tradeoff between a sorted and unsorted sleep queue?
How can idle time be used to estimate CPU utilization?

Check-your-understanding answers

Because the scheduler must always have something to run.
Sorted list reduces per-tick checks but increases insertion cost.
The idle loop count is proportional to available CPU time.

Real-world applications

Low-power IoT devices that sleep between events.
Industrial controllers with periodic control loops and idle time.

Where you’ll apply it

This project: Sec. 3.1, Sec. 3.7.2, Sec. 5.10 Phase 2.
Also used in: Project 9.

References

ARM documentation for WFI/WFE instructions.
“Real-Time Concepts for Embedded Systems” by Qing Li, Ch. 11.

Key insights

A correct sleep queue and idle task turn a busy-loop system into a real RTOS.

Summary

Sleep queues block tasks until time expires, and the idle task safely occupies the CPU when nothing else is ready.

Homework/Exercises to practice the concept

Implement both sorted and unsorted sleep queues and measure tick overhead.
Add an idle counter and compute CPU utilization.
Test WFI and measure wake-up latency.

Solutions to the homework/exercises

Count cycles per tick and compare CPU usage.
Use idle loop counts to estimate percent idle time.
Measure GPIO pulse before and after entering WFI.

3. Project Specification

3.1 What You Will Build

A time service layer that allows tasks to sleep for N ticks and an idle task that runs when no other tasks are READY. The system will handle tick wraparound and will wake tasks at precise times.

3.2 Functional Requirements

Sleep API: sleep_ms() blocks a task until wake tick.
Wake-up Logic: Tasks wake at correct tick, including wraparound cases.
Idle Task: Runs when no other tasks are READY.

3.3 Non-Functional Requirements

Performance: Sleep queue processing < 50 us per tick with <=8 tasks.
Reliability: No missed wake-ups over 10,000 cycles.
Usability: Wake-up timing verifiable via GPIO or UART.

3.4 Example Usage / Output

Task A sleeps 100 ms and toggles LED.
Idle task toggles another LED slowly when all tasks are sleeping.

3.5 Data Formats / Schemas / Protocols

Sleep queue entries: {task_id, wake_tick}.

3.6 Edge Cases

Tick wraparound during sleep.
Task sleeps for 0 ticks.
All tasks asleep, idle task must run.

3.7 Real World Outcome

Tasks sleep without busy-waiting, waking at expected times, while the idle task runs during idle periods and optionally enters low power.

3.7.1 How to Run (Copy/Paste)

make clean all
make flash

Exit codes:

make: 0 success, 2 build failure.
openocd: 0 flash success, 1 connection failure.

3.7.2 Golden Path Demo (Deterministic)

Task A toggles LED every 100 ms using sleep_ms(100).
Task B toggles LED every 250 ms using sleep_ms(250).
Idle task toggles a slow heartbeat when both are sleeping.

3.7.3 Failure Demo (Deterministic)

Replace the wrap-safe compare with now >= wake.
Force tick close to overflow and sleep across wrap.
Task fails to wake; restore safe compare and verify fix.

4. Solution Architecture

4.1 High-Level Design

SysTick -> tick++ -> check sleep queue -> READY tasks -> scheduler
                                    -> no READY -> idle task

4.2 Key Components

Component	Responsibility	Key Decisions
`sleep.c`	Sleep API and queue maintenance	Sorted vs unsorted queue
`idle.c`	Idle task behavior	WFI vs busy loop
`sched.c`	Task state transitions	READY/BLOCKED handling

4.4 Data Structures (No Full Code)

typedef struct {
    tcb_t *task;
    uint32_t wake_tick;
} sleep_entry_t;

4.4 Algorithm Overview

Key Algorithm: Wake Scan

For each sleeping task, check if wake time reached.
If yes, set task to READY.
If no READY tasks, run idle task.

Complexity Analysis:

Time: O(n) per tick for unsorted list.
Space: O(n) for sleep entries.

5. Implementation Guide

5.1 Development Environment Setup

brew install arm-none-eabi-gcc openocd

5.2 Project Structure

rtos-p05/
|-- src/
|   |-- sleep.c
|   |-- idle.c
|   `-- main.c
|-- include/
`-- Makefile

5.3 The Core Question You’re Answering

“How do I block tasks without wasting CPU and still wake them at precise times?”

5.4 Concepts You Must Understand First

Tick arithmetic and wraparound-safe comparisons.
Sleep queue management and idle task behavior.

5.5 Questions to Guide Your Design

How will you represent sleeping tasks efficiently?
How will you handle wraparound safely?
Should the idle task use WFI or just loop?

5.6 Thinking Exercise

Design a test case where a task sleeps across tick wrap and predict the correct wake time.

5.7 The Interview Questions They’ll Ask

“Why is busy-waiting bad in an RTOS?”
“How do you handle tick overflow?”
“What does the idle task do?”

5.8 Hints in Layers

Hint 1: Start unsorted Use a simple list first and validate correctness.

Hint 2: Add wrap-safe compare Use (int32_t)(now - wake) >= 0.

Hint 3: Idle task heartbeat Toggle a GPIO to confirm idle execution.

5.9 Books That Will Help

Topic	Book	Chapter
Time services	“Real-Time Concepts for Embedded Systems”	Ch. 11
RTOS basics	“Making Embedded Systems”	Ch. 6

5.10 Implementation Phases

Phase 1: Sleep API (3-4 days)

Goals: Implement sleep_ms and wake logic. Tasks: Add wake_tick, block tasks, scan on tick. Checkpoint: Tasks wake on time in simple tests.

Phase 2: Idle Task (2-3 days)

Goals: Idle task runs when no READY tasks. Tasks: Create idle task and assign lowest priority. Checkpoint: Idle heartbeat visible.

Phase 3: Wraparound Validation (2-3 days)

Goals: Prove wrap-safe compare works. Tasks: Force tick near overflow, run sleep across wrap. Checkpoint: Task wakes correctly.

5.11 Key Implementation Decisions

Decision	Options	Recommendation	Rationale
Sleep queue structure	Sorted vs unsorted	Unsorted	Simpler and sufficient for few tasks
Idle behavior	Busy loop vs WFI	WFI optional	Power saving when supported

6. Testing Strategy

6.1 Test Categories

6.2 Critical Test Cases

Task sleeps across wrap and wakes correctly.
Idle task runs when all tasks are blocked.
Sleep durations measured within 1 tick of target.

6.3 Test Data

Ticks per second: 1000
Wrap period: ~49.7 days (32-bit)

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

7.2 Debugging Strategies

Log wake_tick and current tick to UART.
Use GPIO pulses to measure wake timing.

7.3 Performance Traps

Scanning a large sleep queue every tick can become expensive.

8. Extensions & Challenges

8.1 Beginner Extensions

Add a sleep_until(tick) API.
Add a simple CPU utilization counter in idle.

8.2 Intermediate Extensions

Implement a sorted sleep queue.
Add tickless idle for long sleeps.

8.3 Advanced Extensions

Implement a timer wheel for large numbers of tasks.
Add dynamic clock scaling in idle.

9. Real-World Connections

9.1 Industry Applications

Power-managed IoT nodes that sleep between sensor reads.
Real-time control systems with periodic cycles.

FreeRTOS: vTaskDelay and idle task hooks.
Zephyr: tickless idle and power management.

9.3 Interview Relevance

RTOS sleep and delay APIs are frequent interview topics.

10. Resources

10.1 Essential Reading

“Real-Time Concepts for Embedded Systems” by Qing Li, Ch. 11.

10.2 Video Resources

RTOS delay and idle task tutorials.

10.3 Tools & Documentation

Logic analyzer or scope for wake-up timing.

11. Self-Assessment Checklist

11.1 Understanding

I can explain tick wraparound handling.
I understand why idle tasks are required.

11.2 Implementation

Tasks sleep and wake correctly.
Idle task runs only when no other tasks are READY.

11.3 Growth

I can describe tradeoffs between sorted and unsorted sleep queues.

12. Submission / Completion Criteria

Minimum Viable Completion:

sleep_ms() blocks tasks and wakes correctly.
Idle task runs when all tasks are blocked.

Full Completion:

Wraparound-safe timing verified.
Wake-up timing measured and documented.

Excellence (Going Above & Beyond):

Tickless idle implemented.
CPU utilization reported via idle counter.