Learn Windows System Programming: Execution Model, Processes, Threads, Fibers & Job Objects

Goal: Deeply understand the Windows execution model: how processes and threads differ from their UNIX counterparts, how fibers enable cooperative multitasking, and how Job Objects support sandboxing and resource management.

Why This Matters

Windows is the dominant desktop OS and a major player in enterprise servers. Yet most developers treat its internals as a black box. The Windows execution model is fundamentally different from UNIX:

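Processes are heavyweight resource containers; threads, not processes, are the unit of scheduling (classic UNIX schedules processes, although modern kernels such as Linux also schedule per thread)
Fibers provide user-mode cooperative scheduling (UNIX has no direct equivalent; the closest relative is the ucontext family, makecontext()/swapcontext())
Job Objects enable process grouping and resource limiting (similar in purpose to Linux cgroups, but with different semantics)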
The Windows kernel (NTOS) has unique handle-based object management

After completing these projects, you will:

Understand the Windows process/thread model at the kernel level
Be able to build your own cooperative schedulers using Fibers
Master Job Objects for sandboxing and resource management
Know how to use Windows debugging APIs
Understand security contexts, tokens, and impersonation

Core Concept Analysis

The Windows Execution Hierarchy

┌─────────────────────────────────────────────────────────────────┐
│                         Job Object                               │
│  ┌─────────────────────────────────────────────────────────────┐ │
│  │                        Process                               │ │
│  │  ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ │
│  │  │     Thread 1    │ │     Thread 2    │ │     Thread 3    │ │ │
│  │  │  ┌───────────┐  │ │  ┌───────────┐  │ │                 │ │ │
│  │  │  │  Fiber A  │  │ │  │  Fiber C  │  │ │                 │ │ │
│  │  │  │  Fiber B  │  │ │  │  Fiber D  │  │ │                 │ │ │
│  │  │  └───────────┘  │ │  └───────────┘  │ │                 │ │ │
│  │  └─────────────────┘ └─────────────────┘ └─────────────────┘ │ │
│  └─────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘

Fundamental Concepts

1. Processes in Windows

A Windows process is:

A container for resources: Virtual address space, handles, security token
NOT a unit of execution: The process itself doesn’t run—threads do
Identified by: Process ID (PID) and a HANDLE
Created via: CreateProcess() (not fork() + exec() like UNIX)

Key structures:

EPROCESS: Kernel-mode process block
PEB (Process Environment Block): User-mode process info
TEB (Thread Environment Block): Per-thread info accessible from user mode

2. Threads in Windows

A Windows thread is:

The unit of scheduling: The kernel schedules threads, not processes
Lightweight within a process: Shares address space, handles, token
Has its own: Stack, registers, TEB, thread-local storage
Priority-based scheduling: 0-31 priority levels

Key APIs:

CreateThread() / _beginthreadex()
SuspendThread() / ResumeThread()
WaitForSingleObject() / WaitForMultipleObjects()
SetThreadPriority() / SetThreadAffinityMask()

3. Fibers (User-Mode Threads)

Fibers are:

Cooperatively scheduled: No preemption—must explicitly yield
Lighter than threads: Just a stack and register context
Scheduled by your code: You call SwitchToFiber() to switch
Useful for: Porting code expecting cooperative multitasking, coroutines

Key APIs:

ConvertThreadToFiber() / ConvertFiberToThread()
CreateFiber() / CreateFiberEx()
SwitchToFiber()
DeleteFiber()

4. Job Objects

Job Objects provide:

Process grouping: Multiple processes in one manageable unit
Resource limits: CPU time, memory, I/O, process count
Security boundaries: Prevent processes from escaping the job
Accounting: Track resource usage across all processes

Key APIs:

CreateJobObject()
AssignProcessToJobObject()
SetInformationJobObject() / QueryInformationJobObject()
TerminateJobObject()

Project List

Projects are ordered from foundational understanding to advanced implementations.

Project 1: Process & Thread Inspector

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C
Alternative Programming Languages: C++, Rust (windows-rs)
Coolness Level: Level 3: Genuinely Clever
Business Potential: 1. The “Resume Gold”
Difficulty: Level 2: Intermediate
Knowledge Area: Windows Internals / Process Management
Software or Tool: Process Explorer (what you’re building a mini version of)
Main Book: “Windows Internals, Part 1” by Pavel Yosifovich, Alex Ionescu, Mark Russinovich, David Solomon

What you’ll build: A command-line tool that enumerates all running processes, lists their threads, shows thread states (Running, Waiting, etc.), and displays key information like base priority, CPU time, and start address.

Why it teaches Windows execution model: You cannot understand the execution model without seeing it. This project forces you to use the fundamental enumeration APIs and understand how Windows organizes processes and threads. You’ll see that threads are the real execution units.

Core challenges you’ll face:

Enumerating processes → maps to understanding PROCESSENTRY32 and snapshot APIs
Enumerating threads per process → maps to understanding THREADENTRY32
Getting detailed thread info → maps to THREAD_BASIC_INFORMATION, NtQueryInformationThread
Handling access rights → maps to understanding Windows security model

Key Concepts:

Toolhelp32 API: “Windows Via C/C++” Chapter 4 - Jeffrey Richter
Process/Thread structures: “Windows Internals, Part 1” Chapter 3 - Yosifovich et al.
Handle and Object model: “Windows Internals, Part 1” Chapter 8 - Yosifovich et al.
Security and Access Rights: “Windows System Programming” Chapter 15 - Johnson Hart

Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Basic C programming, understanding of Windows API conventions (HANDLE, DWORD, etc.), familiarity with Visual Studio or command-line compilation with cl.exe

Real world outcome:

C:\> process_inspector.exe

PID     Process Name              Threads  Priority  Working Set
------  -----------------------   -------  --------  -----------
4       System                    142      8         24 KB
632     csrss.exe                 14       13        4,512 KB
1284    explorer.exe              47       8         98,304 KB
        ├─ TID 1288  State: Wait    Priority: 9   CPU: 00:00:05.234
        ├─ TID 1292  State: Wait    Priority: 8   CPU: 00:00:00.015
        ├─ TID 1420  State: Running Priority: 10  CPU: 00:02:15.891
        └─ ...
5678    chrome.exe                34       8         245,760 KB
        ├─ TID 5680  State: Wait    Priority: 8   CPU: 00:00:45.123
        └─ ...

Press 'R' to refresh, 'Q' to quit, or enter PID for details: _

Implementation Hints:

The core enumeration pattern in Windows uses “snapshots”:

High-level approach:
CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0) - snapshot all processes
Process32First/Process32Next - iterate through processes
CreateToolhelp32Snapshot(TH32CS_SNAPTHREAD, 0) - snapshot all threads
Thread32First/Thread32Next - iterate, filter by owning PID
OpenThread() + GetThreadTimes() for detailed info

Key structures to understand:

PROCESSENTRY32 - Contains dwSize, th32ProcessID, szExeFile, cntThreads, etc.
THREADENTRY32 - Contains th32ThreadID, th32OwnerProcessID, tpBasePri, etc.

For deeper thread information, you’ll need:

OpenThread(THREAD_QUERY_INFORMATION, ...) to get a thread handle
GetThreadTimes() for CPU time
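Partially documented: NtQueryInformationThread with ThreadQuerySetWin32StartAddress for the start address (ThreadBasicInformation returns the TEB address, client ID, and priorities, not the start address)
Thread state (Running, Waiting, ...) is not exposed by Toolhelp32; it comes from NtQuerySystemInformation(SystemProcessInformation)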
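GetThreadTimes() reports kernel and user time as FILETIME values, which are 64-bit counts of 100 ns ticks split into two 32-bit halves. A small, portable sketch of turning those into the `hh:mm:ss.mmm` format used in the sample output might look like this; the helper names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Combine the low and high halves of a FILETIME
   (dwLowDateTime, dwHighDateTime) into one 64-bit tick count. */
uint64_t ticks_from_parts(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}

/* Format kernel + user time (both in 100 ns ticks) as hh:mm:ss.mmm.
   Writes into `out` and returns it. */
char *format_cpu_time(uint64_t kernel_ticks, uint64_t user_ticks,
                      char *out, size_t out_size)
{
    uint64_t ms = (kernel_ticks + user_ticks) / 10000; /* 10,000 ticks per ms */
    unsigned long long hours   = ms / 3600000;
    unsigned long long minutes = (ms / 60000) % 60;
    unsigned long long seconds = (ms / 1000) % 60;
    unsigned long long millis  = ms % 1000;

    snprintf(out, out_size, "%02llu:%02llu:%02llu.%03llu",
             hours, minutes, seconds, millis);
    return out;
}
```

For example, 5 seconds of kernel time (50,000,000 ticks) plus 234 ms of user time (2,340,000 ticks) formats as `00:00:05.234`.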

Questions to ask yourself:

Why does Windows use snapshots instead of live enumeration?
What happens if a process exits while you’re enumerating?
Why can’t you always open every process/thread?

Learning milestones:

You enumerate all processes → You understand snapshot-based enumeration
You list threads per process → You understand the process/thread relationship
You show thread states and priorities → You understand Windows scheduling
You handle access denied gracefully → You understand Windows security model

Project 2: Thread Synchronization Playground

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C
Alternative Programming Languages: C++, Rust
Coolness Level: Level 2: Practical but Forgettable
Business Potential: 1. The “Resume Gold”
Difficulty: Level 2: Intermediate
Knowledge Area: Concurrency / Thread Synchronization
Software or Tool: Custom debugging/learning tool
Main Book: “Windows Via C/C++” by Jeffrey Richter

What you’ll build: An interactive console application that demonstrates all major Windows synchronization primitives: Critical Sections, Mutexes, Semaphores, Events (manual and auto-reset), and Slim Reader/Writer Locks. Each demo shows the behavior with multiple threads and visualizes who holds the lock.

Why it teaches Windows threading: Threads are useless without synchronization. Windows has a rich set of primitives, each with different semantics. Building this playground forces you to understand when to use each primitive and how they behave under contention.

Core challenges you’ll face:

Critical Section behavior → maps to understanding user-mode fast path vs kernel transition
Mutex vs Critical Section → maps to understanding process-local vs cross-process
Event signaling patterns → maps to producer-consumer, one-shot vs persistent signals
Reader/Writer patterns → maps to SRWLock exclusive vs shared acquisition
Visualizing lock state → maps to understanding wait lists and ownership

Key Concepts:

Critical Sections: “Windows Via C/C++” Chapter 8 - Jeffrey Richter
Kernel Synchronization Objects: “Windows Via C/C++” Chapter 9 - Jeffrey Richter
Slim Reader/Writer Locks: “Windows Internals, Part 1” Chapter 8 - Yosifovich et al.
Wait Functions: “Windows System Programming” Chapter 10 - Johnson Hart

Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Project 1 completed, understanding of basic threading concepts (race conditions, deadlocks)

Real world outcome:

C:\> sync_playground.exe

=== Windows Synchronization Playground ===

Choose a demo:
1. Critical Section - Fast mutex (process-local only)
2. Mutex - Cross-process synchronization
3. Semaphore - Counting synchronization
4. Auto-Reset Event - One thread wakes per signal
5. Manual-Reset Event - All threads wake per signal
6. SRWLock - Multiple readers OR one writer
7. Condition Variable - Wait for complex conditions
8. Barrier - Synchronize N threads at a point

Enter choice: 1

=== Critical Section Demo ===
Spawning 4 threads competing for one critical section...

[Thread 1] Waiting to enter critical section...
[Thread 2] Waiting to enter critical section...
[Thread 3] Waiting to enter critical section...
[Thread 4] Waiting to enter critical section...
[Thread 1] >>> ENTERED critical section (Owner: TID 1234)
[Thread 1] Working for 500ms...
[Thread 1] <<< LEAVING critical section
[Thread 3] >>> ENTERED critical section (Owner: TID 1238)
...

Critical Section internals:
  - LockCount: -1 when unlocked in the classic layout; on newer Windows a bit field (bit 0 clear = locked, upper bits = ones-complement of the waiter count)
  - RecursionCount: How many times owner entered
  - OwningThread: Current owner's TID
  - SpinCount: Iterations before kernel wait

Implementation Hints:

Structure your playground as a menu-driven application where each option spawns worker threads that demonstrate the primitive:

Conceptual architecture:
1. Main menu loop - select primitive to demo
2. For each primitive:
   a. Initialize the sync object
   b. Create N worker threads
   c. Each worker: try to acquire, do work, release
   d. Print state transitions in real-time
   e. Clean up when demo ends

For Critical Sections, explore:

InitializeCriticalSectionAndSpinCount() - Why spin before blocking?
The internal RTL_CRITICAL_SECTION structure (debug info available)
TryEnterCriticalSection() - Non-blocking acquisition
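When you inspect RTL_CRITICAL_SECTION in a debugger, the LockCount field needs decoding. The sketch below assumes the bit layout described in the debugger documentation for Windows Server 2003 SP1 and later; it is an internal detail that may change, and the type and function names here are hypothetical.

```c
#include <stdint.h>

typedef struct {
    int      locked;        /* 1 if some thread owns the critical section */
    int      waiter_woken;  /* 1 if a waiting thread has already been woken */
    uint32_t waiters;       /* number of threads blocked on the lock */
} lock_state;

/* Decode a raw LockCount value (assumed post-2003-SP1 layout). */
lock_state decode_lock_count(int32_t lock_count)
{
    uint32_t raw = (uint32_t)lock_count;
    lock_state s;

    s.locked       = (raw & 1u) == 0;  /* bit 0 clear means locked */
    s.waiter_woken = (raw & 2u) == 0;  /* bit 1 clear means a waiter was woken */
    s.waiters      = (~raw) >> 2;      /* remaining bits: ones-complement count */
    return s;
}
```

Under this layout, -1 decodes as unlocked with no waiters, -2 as locked with no waiters, and -6 as locked with one waiter.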

For Events, demonstrate the key difference:

Auto-reset: WaitForSingleObject resets the event, only ONE waiter wakes
Manual-reset: Event stays signaled, ALL waiters wake

Questions to ask yourself:

Why would you use a Mutex instead of a Critical Section?
What happens if you signal an auto-reset event twice before any wait?
How does SRWLock achieve multiple simultaneous readers?

Learning milestones:

Critical Section demo works → You understand user-mode synchronization
Mutex cross-process demo works → You understand kernel objects and naming
Event demos show correct wake behavior → You understand signaling semantics
SRWLock shows concurrent reads → You understand reader/writer patterns

Project 3: Thread Pool from Scratch

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C++
Alternative Programming Languages: C, Rust
Coolness Level: Level 3: Genuinely Clever
Business Potential: 2. The “Micro-SaaS / Pro Tool”
Difficulty: Level 3: Advanced
Knowledge Area: Concurrency / Thread Management
Software or Tool: Windows Thread Pool (what you’re reimplementing)
Main Book: “C++ Concurrency in Action” by Anthony Williams

What you’ll build: A complete thread pool implementation that supports: work items, wait callbacks (trigger on object signal), timer callbacks, and I/O completion callbacks. You’ll essentially rebuild the Windows Thread Pool API (CreateThreadpoolWork, SubmitThreadpoolWork, etc.).

Why it teaches Windows threading: The Windows Thread Pool is how professional Windows apps handle concurrency. By building your own, you’ll understand: thread lifecycle management, efficient waiting on multiple objects, completion ports, and callback-based programming.

Core challenges you’ll face:

Work queue design → maps to lock-free vs locked queues, work stealing
Dynamic thread management → maps to when to create/destroy worker threads
Wait object integration → maps to WaitForMultipleObjects limitations (64 objects)
Timer coalescing → maps to efficient timer management with timer queues
I/O completion integration → maps to IOCP fundamentals

Key Concepts:

Thread Pool API: “Windows Via C/C++” Chapter 11 - Jeffrey Richter
I/O Completion Ports: “Windows Via C/C++” Chapter 10 - Jeffrey Richter
Work Queue Design: “C++ Concurrency in Action” Chapter 9 - Anthony Williams
Scalable Synchronization: “The Art of Multiprocessor Programming” Chapter 7 - Herlihy & Shavit

Difficulty: Advanced
Time estimate: 1 month+
Prerequisites: Projects 1-2 completed, solid understanding of threading, familiarity with C++ (for cleaner design, though C is fine)

Real world outcome:

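// Example usage of YOUR thread pool:
#include <windows.h>
#include <stdio.h>
#include "my_threadpool.h"

void ProcessFile(void* context) {
    const char* filename = (const char*)context;
    // GetCurrentThreadId() returns a DWORD (unsigned long), hence %lu
    printf("Processing %s on thread %lu\n", filename, GetCurrentThreadId());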
    // ... do work ...
}

int main() {
    MyThreadPool pool(4, 16);  // min 4, max 16 threads

    // Submit work items
    for (int i = 0; i < 100; i++) {
        pool.SubmitWork(ProcessFile, filenames[i]);
    }

    // Wait callback: run when file handle is signaled (I/O complete)
    pool.RegisterWait(fileHandle, OnFileReady, context, INFINITE);

    // Timer callback: run every 5 seconds
    pool.CreateTimer(HeartbeatCallback, context, 5000, 5000);

    pool.WaitForAll();
    return 0;
}

Output:

C:\> threadpool_demo.exe

[Pool] Initialized with 4 worker threads
[Pool] Worker 1 (TID 4532) started
[Pool] Worker 2 (TID 4536) started
[Pool] Worker 3 (TID 4540) started
[Pool] Worker 4 (TID 4544) started

Processing file001.txt on thread 4532
Processing file002.txt on thread 4536
Processing file003.txt on thread 4540
[Pool] High load detected, spawning worker 5 (TID 4548)
Processing file004.txt on thread 4544
Processing file005.txt on thread 4548
...

[Timer] Heartbeat at T+5000ms
[Wait] File handle signaled, running callback
...

[Pool] All work complete. Processed 100 items in 2.3 seconds.
[Pool] Peak threads: 8, Final threads: 4

Implementation Hints:

Start with a basic work queue thread pool, then add features:

Phase 1: Basic work queue
- Circular buffer or linked list of work items
- N worker threads sleeping on a condition variable
- SubmitWork wakes one worker
- Worker grabs item, executes, loops

Phase 2: Dynamic thread management
- Track queue depth and thread activity
- Spawn threads when queue backs up
- Retire threads after idle timeout
- Respect min/max thread limits

Phase 3: Wait callbacks
- Dedicated waiter thread(s)
- Use WaitForMultipleObjects (max 64)
- For >64 waits, use multiple waiter threads or completion port
- When object signals, queue work item to call user callback

Phase 4: Timer callbacks
- Use CreateTimerQueueTimer or manual timer wheel
- Queue work item when timer fires

Phase 5: I/O completion callbacks
- Create an IOCP with CreateIoCompletionPort
- Worker threads call GetQueuedCompletionStatus
- Or use separate I/O completion threads
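For Phase 1, the circular buffer of work items can be sketched in portable C as below. This is a single-threaded illustration only: all names are hypothetical, and in a real pool every push and pop would run under a lock (for example a CRITICAL_SECTION paired with a condition variable to wake a worker).

```c
#include <stddef.h>

#define QUEUE_CAPACITY 8

typedef void (*work_fn)(void *context);

typedef struct {
    work_fn fn;
    void   *context;
} work_item;

typedef struct {
    work_item items[QUEUE_CAPACITY];
    size_t    head;   /* index of the oldest item */
    size_t    count;  /* number of queued items */
} work_queue;

void queue_init(work_queue *q)
{
    q->head = 0;
    q->count = 0;
}

/* Returns 1 on success, 0 if the queue is full. */
int queue_push(work_queue *q, work_fn fn, void *context)
{
    size_t tail;
    if (q->count == QUEUE_CAPACITY)
        return 0;
    tail = (q->head + q->count) % QUEUE_CAPACITY;  /* wrap around */
    q->items[tail].fn = fn;
    q->items[tail].context = context;
    q->count++;
    return 1;
}

/* Returns 1 and fills *out on success, 0 if the queue is empty. */
int queue_pop(work_queue *q, work_item *out)
{
    if (q->count == 0)
        return 0;
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return 1;
}
```

A worker loop would pop an item and call `item.fn(item.context)`. A fixed capacity keeps the sketch short; a linked list avoids the "queue full" case at the cost of an allocation per item.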

Questions to ask yourself:

Why does Windows have a 64-object limit on WaitForMultipleObjects?
How do you avoid the “thundering herd” when signaling workers?
When should threads exit vs sleep?
How does IOCP scale better than select/poll models?

Learning milestones:

Basic work queue processes items → You understand producer-consumer threading
Dynamic thread scaling works → You understand pool management heuristics
Wait callbacks fire correctly → You understand kernel object signaling
I/O completions are handled → You understand IOCP fundamentals

Project 4: Fiber-Based Coroutine Library

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C
Alternative Programming Languages: C++, Rust (for comparison)
Coolness Level: Level 4: Hardcore Tech Flex
Business Potential: 1. The “Resume Gold”
Difficulty: Level 3: Advanced
Knowledge Area: Cooperative Multitasking / Coroutines
Software or Tool: Windows Fibers API
Main Book: “Windows Via C/C++” by Jeffrey Richter

What you’ll build: A coroutine library using Windows Fibers that provides yield(), resume(), and spawn() primitives. You’ll build a scheduler that runs multiple coroutines on a single thread, switching between them cooperatively.

Why it teaches Fibers: Fibers are Windows’ answer to user-mode threading. Unlike threads (preemptively scheduled by the kernel), fibers are cooperatively scheduled by your code. This project teaches you exactly how context switching works at the user level.

Core challenges you’ll face:

Converting threads to fibers → maps to understanding ConvertThreadToFiber
Creating fiber contexts → maps to understanding stack allocation and start routines
Implementing yield → maps to saving/restoring fiber context with SwitchToFiber
Building a scheduler → maps to round-robin, priority, or work-stealing scheduling
Handling fiber completion → maps to cleanup and return value propagation

Key Concepts:

Fiber API: “Windows Via C/C++” Chapter 12 - Jeffrey Richter
Coroutine Concepts: “C++ Concurrency in Action” Chapter 4.4 - Anthony Williams
Cooperative Scheduling: “Modern Operating Systems” Chapter 2.4 - Tanenbaum
Context Switching: “Windows Internals, Part 1” Chapter 5 - Yosifovich et al.

Difficulty: Advanced
Time estimate: 2-3 weeks
Prerequisites: Projects 1-3 completed, understanding of call stacks, strong C programming

Real world outcome:

// Example usage of YOUR coroutine library:
#include "my_coroutines.h"

void task_a(void* arg) {
    for (int i = 0; i < 5; i++) {
        printf("[Task A] Step %d\n", i);
        yield();  // Give other coroutines a chance
    }
}

void task_b(void* arg) {
    for (int i = 0; i < 5; i++) {
        printf("[Task B] Step %d\n", i);
        yield();
    }
}

void task_c(void* arg) {
    printf("[Task C] Waiting for event...\n");
    yield_until(some_condition);  // Advanced: wait for condition
    printf("[Task C] Event received!\n");
}

int main() {
    scheduler_init();

    coro_spawn(task_a, NULL);
    coro_spawn(task_b, NULL);
    coro_spawn(task_c, NULL);

    scheduler_run();  // Run until all coroutines complete

    printf("All tasks complete.\n");
    return 0;
}

Output:

C:\> fiber_coroutines.exe

[Scheduler] Initialized, main thread converted to fiber
[Scheduler] Spawned coroutine 1 (task_a)
[Scheduler] Spawned coroutine 2 (task_b)
[Scheduler] Spawned coroutine 3 (task_c)
[Scheduler] Starting execution...

[Task A] Step 0
[Scheduler] Switch: coro 1 -> coro 2
[Task B] Step 0
[Scheduler] Switch: coro 2 -> coro 3
[Task C] Waiting for event...
[Scheduler] Switch: coro 3 -> coro 1
[Task A] Step 1
[Scheduler] Switch: coro 1 -> coro 2
[Task B] Step 1
...
[Task A] Step 4
[Scheduler] Coroutine 1 completed
[Task B] Step 4
[Scheduler] Coroutine 2 completed
[Task C] Event received!
[Scheduler] Coroutine 3 completed

All tasks complete.

Implementation Hints:

The core fiber operations:

Fiber lifecycle:
1. ConvertThreadToFiber(NULL) - Your main thread becomes a fiber
2. CreateFiber(stackSize, fiberProc, param) - Create new fiber
3. SwitchToFiber(fiberAddress) - Switch to another fiber
4. DeleteFiber(fiberAddress) - Clean up when done

Scheduler design:
- Maintain a list of "ready" fibers
- The current fiber yields by:
  1. Adding itself back to ready list
  2. Picking next fiber from ready list
  3. SwitchToFiber to that fiber
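- When a fiber's function returns, the thread running that fiber exits
  (as with ExitThread), so a finished fiber must switch back to the
  scheduler instead of returning

Key insight: The fiber function must not simply return, because returning ends the whole thread and every fiber on it. You must: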

Have the fiber function explicitly switch to the scheduler fiber when done
Or wrap the user’s function in a trampoline that handles return

Advanced features to consider:

yield_until(predicate) - Don’t reschedule until condition is true
coro_join(coro_id) - Wait for another coroutine to complete
Return values from coroutines
Exception propagation across fibers (tricky!)

Questions to ask yourself:

What’s stored in a fiber’s context? (Answer: registers, stack pointer)
Why can’t a fiber just return from its function?
How would you implement sleeping (time-based yield)?
What happens if a fiber throws an exception?

Learning milestones:

Basic yield/resume works → You understand fiber context switching
Scheduler runs multiple fibers → You understand cooperative scheduling
Fiber completion is handled → You understand fiber lifecycle
Advanced yields work → You understand blocking operations in cooperative systems

Project 5: Job Object Sandbox

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C
Alternative Programming Languages: C++, Rust (windows-rs)
Coolness Level: Level 4: Hardcore Tech Flex
Business Potential: 3. The “Service & Support” Model
Difficulty: Level 3: Advanced
Knowledge Area: Process Isolation / Sandboxing
Software or Tool: Job Objects, Windows Sandbox
Main Book: “Windows Internals, Part 1” by Yosifovich et al.

What you’ll build: A process launcher that runs untrusted programs inside a Job Object with strict resource limits: maximum memory, CPU time, process count, and restricted UI access. The sandbox will also prevent child processes from escaping.

Why it teaches Job Objects: Job Objects are Windows’ answer to “how do I contain a process?” Chrome, Edge, Docker for Windows, and many sandboxing tools use them. This project teaches you the critical controls for security and resource management.

Core challenges you’ll face:

Creating and configuring Job Objects → maps to understanding JOBOBJECT_*_LIMIT_INFORMATION
Assigning processes to jobs → maps to understanding assignment rules and inheritance
Preventing job escape → maps to JOB_OBJECT_LIMIT_BREAKAWAY_OK and security
Monitoring resource usage → maps to QueryInformationJobObject for accounting
Handling limit violations → maps to completion port notifications

Key Concepts:

Job Object API: “Windows Via C/C++” Chapter 5 - Jeffrey Richter
Job Object Limits: “Windows Internals, Part 1” Chapter 5 - Yosifovich et al.
Process Security: “Windows Internals, Part 1” Chapter 7 - Yosifovich et al.
Sandboxing Techniques: Chromium Sandbox Design Document (online)

Difficulty: Advanced
Time estimate: 2-3 weeks
Prerequisites: Projects 1-4 completed, understanding of Windows security basics

Real world outcome:

C:\> sandbox.exe --memory 100MB --cpu-time 5s --processes 3 -- untrusted.exe arg1 arg2

=== Job Object Sandbox ===
Configuration:
  Memory limit:    100 MB (hard limit)
  CPU time limit:  5 seconds (total across all processes)
  Process limit:   3 (including initial process)
  UI restrictions: No clipboard, no display changes, no global atoms
  Breakaway:       DISABLED (children cannot escape)

[Sandbox] Created job object: \BaseNamedObjects\Sandbox_12345
[Sandbox] Applied limits successfully
[Sandbox] Launching: untrusted.exe arg1 arg2
[Sandbox] Process 7890 assigned to job

--- Running ---
[Monitor] Memory: 24 MB / 100 MB
[Monitor] CPU time: 0.5s / 5.0s
[Monitor] Processes: 1 / 3

[Monitor] Child process 7892 created (cmd.exe)
[Monitor] Processes: 2 / 3

[Monitor] Memory: 78 MB / 100 MB
[Monitor] CPU time: 2.1s / 5.0s

[WARNING] Memory approaching limit (78%)

[Monitor] Memory: 95 MB / 100 MB
[LIMIT] Memory limit exceeded - process terminated

=== Sandbox Report ===
Exit reason:      Memory limit exceeded
Total CPU time:   2.8 seconds
Peak memory:      98 MB
Peak processes:   2
Exit code:        Terminated by job

Implementation Hints:

Job Object creation and configuration flow:

1. CreateJobObject(NULL, name) - Create the job

2. Set basic limits (JOBOBJECT_BASIC_LIMIT_INFORMATION):
   - LimitFlags: JOB_OBJECT_LIMIT_*
   - ActiveProcessLimit: Max processes
   - PerProcessUserTimeLimit / PerJobUserTimeLimit: User-mode CPU time in 100ns units

3. Set extended limits (JOBOBJECT_EXTENDED_LIMIT_INFORMATION):
   - ProcessMemoryLimit: Per-process max memory
   - JobMemoryLimit: Total job max memory
   - JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE: Kill all processes when the last job handle is closed
   - JOB_OBJECT_LIMIT_BREAKAWAY_OK: A flag, not a value; leave it (and JOB_OBJECT_LIMIT_SILENT_BREAKAWAY_OK) unset to prevent escapes

4. Set UI restrictions (JOBOBJECT_BASIC_UI_RESTRICTIONS):
   - JOB_OBJECT_UILIMIT_DESKTOP: Can't switch desktops
   - JOB_OBJECT_UILIMIT_GLOBALATOMS: Can't create global atoms
   - JOB_OBJECT_UILIMIT_READCLIPBOARD / JOB_OBJECT_UILIMIT_WRITECLIPBOARD: Can't read from or write to the clipboard

5. Associate a completion port for notifications:
   - CreateIoCompletionPort + SetInformationJobObject(JobObjectAssociateCompletionPortInformation)
   - Get notified of: new process, exit, limit violations

6. CreateProcess with CREATE_SUSPENDED, then AssignProcessToJobObject,
   then ResumeThread - ensures process starts in job

7. Monitor via completion port or polling QueryInformationJobObject

Key limit flags to understand:

JOB_OBJECT_LIMIT_PROCESS_MEMORY - Per-process limit
JOB_OBJECT_LIMIT_JOB_MEMORY - Total job limit
JOB_OBJECT_LIMIT_PROCESS_TIME - Per-process CPU limit
JOB_OBJECT_LIMIT_JOB_TIME - Total job CPU limit
JOB_OBJECT_LIMIT_ACTIVE_PROCESS - Max simultaneous processes
JOB_OBJECT_LIMIT_DIE_ON_UNHANDLED_EXCEPTION - No crash dialog
JOB_OBJECT_LIMIT_BREAKAWAY_OK - Lets children created with CREATE_BREAKAWAY_FROM_JOB leave the job; omit it to deny escape

Questions to ask yourself:

Why would you ever allow breakaway?
How do you handle the case where the process is already in a job?
What notifications does the completion port receive?
How would you extend this to network restrictions? (Answer: you can’t with just jobs—need firewall/containers)

Learning milestones:

Job creation and limits work → You understand basic job configuration
Processes can’t exceed limits → You understand limit enforcement
Child processes inherit job → You understand job inheritance
Notifications work → You understand job completion ports

Project 6: Process Relationship Visualizer

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C++
Alternative Programming Languages: C, Python (for visualization)
Coolness Level: Level 3: Genuinely Clever
Business Potential: 1. The “Resume Gold”
Difficulty: Level 2: Intermediate
Knowledge Area: Process Management / Debugging
Software or Tool: Process Monitor / Process Explorer
Main Book: “Windows Internals, Part 1” by Yosifovich et al.

What you’ll build: A tool that shows the process tree (parent-child relationships), thread lineage, and job object memberships in a visual tree format. It will also show which processes share handles to the same objects.

Why it teaches processes and threads: Understanding the relationships between execution units is crucial. Who spawned whom? Which processes are in the same job? This project makes the invisible structure visible.

Core challenges you’ll face:

Building the process tree → maps to understanding ppid and timing issues
Detecting job membership → maps to IsProcessInJob and NtQueryInformationProcess
Finding shared handles → maps to NtQuerySystemInformation SystemHandleInformation
Real-time updates → maps to ETW or polling strategies

Key Concepts:

Process Parent Tracking: “Windows Internals, Part 1” Chapter 3 - Yosifovich et al.
Handle Table: “Windows Internals, Part 1” Chapter 8 - Yosifovich et al.
ETW for Process Events: “Windows Internals, Part 2” Chapter 9 - Yosifovich et al.
NtQuerySystemInformation: Various undocumented API references

Difficulty: Intermediate
Time estimate: 1-2 weeks
Prerequisites: Project 1 completed, understanding of tree data structures

Real world outcome:

C:\> process_tree.exe

Process Tree (with Job Objects and Shared Handles)
==================================================

[Session 0 - Services]
├─ services.exe (PID 684)
│  ├─ svchost.exe (PID 856) [Job: WMI_ProviderSubsystem]
│  ├─ svchost.exe (PID 932)
│  │  └─ taskhostw.exe (PID 4512)
│  └─ SearchIndexer.exe (PID 3244)
│
[Session 1 - User: Douglas]
├─ explorer.exe (PID 4128)
│  ├─ chrome.exe (PID 5632) [Job: Chromium_Sandbox]
│  │  ├─ chrome.exe (PID 5640) [GPU Process]
│  │  ├─ chrome.exe (PID 5712) [Renderer]
│  │  └─ chrome.exe (PID 5756) [Renderer]
│  │      └─ Shares handle 0x1A4 with PID 5632 (Mutex)
│  ├─ cmd.exe (PID 6120)
│  │  └─ process_tree.exe (PID 6124) [This process]
│  └─ notepad.exe (PID 5890)

Job Object Summary:
  Chromium_Sandbox: 4 processes, Memory limit: 4GB, Breakaway: No
  WMI_ProviderSubsystem: 1 process, No limits

Legend: [Job: name] = Job membership, [Role] = Known process roles

Implementation Hints:

Building the tree:

1. Enumerate all processes (Toolhelp32)
2. For each process, find parent PID (th32ParentProcessID)
   - Caveat: Parent may have exited! Check if parent exists.
3. Build tree structure in memory
4. Query job membership for each:
   - OpenProcess + IsProcessInJob (for known jobs)
   - Or: NtQueryInformationProcess(ProcessJobObjectAssociations)
5. Print tree with indentation

Detecting shared handles (advanced):
- NtQuerySystemInformation(SystemHandleInformation)
- This returns ALL handles on the system
- Group by object address - handles with same address point to same object
- Filter to show only handles shared across processes

Process tree challenges:

Parent process may exit before child - ppid then refers to non-existent process
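Windows does not reparent “orphaned” processes: the recorded parent PID never changes, and that PID may later be reused by an unrelated process (compare creation times to detect this)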
Multiple roots: different sessions, different security contexts
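The parent lookup with a stale-PID check can be sketched in portable C. The `proc_entry` layout and function names are hypothetical; the sketch assumes you have already filled in each process's creation time (for example via GetProcessTimes), since PROCESSENTRY32 does not carry it.

```c
/* One row per process, filled from your enumeration pass. */
typedef struct {
    unsigned long      pid;
    unsigned long      parent_pid;   /* th32ParentProcessID */
    unsigned long long create_time;  /* assumed: 100 ns ticks from GetProcessTimes */
} proc_entry;

/* Returns the index of the entry's genuine parent, or -1 if the entry
   should be treated as a root (parent gone, or its PID was reused). */
int find_parent(const proc_entry *procs, int count, int child)
{
    int i;
    for (i = 0; i < count; i++) {
        if (i == child)
            continue;
        if (procs[i].pid != procs[child].parent_pid)
            continue;
        /* A real parent must have been created before its child. */
        if (procs[i].create_time <= procs[child].create_time)
            return i;
    }
    return -1;
}

/* Depth of an entry in the tree (0 for roots), used for indentation. */
int tree_depth(const proc_entry *procs, int count, int index)
{
    int depth = 0;
    int parent = find_parent(procs, count, index);
    while (parent >= 0 && depth < count) {  /* cap guards against cycles */
        depth++;
        parent = find_parent(procs, count, parent);
    }
    return depth;
}
```

Entries whose parent lookup fails become additional roots, which matches the "multiple roots" case above. The linear scan is quadratic overall; a hash map keyed by PID would be the natural upgrade for large process lists.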

Questions to ask yourself:

What happens to child processes when parent exits?
How would you detect a process that changed its parent (spoofing)?
What’s the difference between a process tree and a job object tree?

Learning milestones:

Tree structure builds correctly → You understand process relationships
Job memberships shown → You understand job object queries
Shared handles detected → You understand the kernel handle table
Real-time updates work → You understand process notification mechanisms

Project 7: Thread Context Debugger

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C
Alternative Programming Languages: C++, Rust
Coolness Level: Level 4: Hardcore Tech Flex
Business Potential: 1. The “Resume Gold”
Difficulty: Level 4: Expert
Knowledge Area: Debugging / Thread Internals
Software or Tool: WinDbg, x64dbg
Main Book: “Windows Internals, Part 1” by Yosifovich et al.

What you’ll build: A debugger that can attach to any process, suspend its threads, dump their full register context (RIP/EIP, RSP/ESP, all general-purpose registers, flags), show the call stack, and optionally single-step through instructions.

Why it teaches thread internals: The thread context (CONTEXT structure) IS the thread state. When the kernel switches threads, it saves/restores this structure. By manipulating it directly, you understand exactly what “execution state” means.

Core challenges you’ll face:

Attaching to a process → maps to DebugActiveProcess and debug privileges
Handling debug events → maps to WaitForDebugEvent loop, event types
Reading thread context → maps to SuspendThread + GetThreadContext
Stack walking → maps to frame pointer chains, StackWalk64 or manual
Single stepping → maps to trap flag in EFLAGS, EXCEPTION_SINGLE_STEP

Key Concepts:

Debugging API: “Windows Via C/C++” Chapter 24 - Jeffrey Richter
CONTEXT structure: “Windows Internals, Part 1” Chapter 5 - Yosifovich et al.
Stack Walking: “Debugging with WinDbg” online documentation
x64 Calling Convention: Microsoft x64 ABI documentation

Difficulty: Expert Time estimate: 1 month+ Prerequisites: Projects 1-6 completed, assembly language basics (x86-64), understanding of calling conventions

Real world outcome:

C:\> mini_debugger.exe --attach 5632

[Debugger] Attached to process 5632 (notepad.exe)
[Debugger] 3 threads found

> threads
TID 5636  State: Waiting      RIP: ntdll!NtWaitForMultipleObjects+0x14
TID 5640  State: Running      RIP: user32!GetMessageW+0x2a
TID 5644  State: Waiting      RIP: ntdll!NtWaitForWorkViaWorkerFactory+0x14

> context 5636
Thread 5636 Context:
  RIP: 0x00007FFB1A2E0014  RSP: 0x000000BBCF1FF6E8
  RAX: 0x0000000000000000  RBX: 0x00000000FFFFFFFF
  RCX: 0x0000000000000002  RDX: 0x000000BBCF1FF758
  R8:  0x0000000000000000  R9:  0x0000000000000000
  ...
  EFLAGS: 0x00000246 [ ZF PF IF ]

> stack 5636
Call Stack for TID 5636:
  #0  ntdll!NtWaitForMultipleObjects+0x14
  #1  KERNELBASE!WaitForMultipleObjectsEx+0xf1
  #2  user32!MsgWaitForMultipleObjectsEx+0x15d
  #3  notepad!WinMain+0x156
  #4  notepad!__mainCRTStartup+0x1a5
  #5  kernel32!BaseThreadInitThunk+0x14
  #6  ntdll!RtlUserThreadStart+0x21

> break 0x00007FF654321000
[Debugger] Breakpoint 1 set at 0x00007FF654321000

> continue
[Debugger] Breakpoint 1 hit in thread 5640
  RIP: 0x00007FF654321000

> step
[Debugger] Single step complete
  RIP: 0x00007FF654321003  ; moved 3 bytes (one instruction)
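
The bracketed flag names in the `context` output come from individual bits of EFLAGS. A small decoder might look like the sketch below; `eflags_names` is a hypothetical helper, and only the common status and control flags are covered.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Writes the names of the set flags into `out` (caller supplies at least
   32 bytes), highest bit first, and returns `out`. */
char *eflags_names(uint32_t eflags, char *out)
{
    static const struct { uint32_t bit; const char *name; } flags[] = {
        {1u << 11, "OF"}, {1u << 10, "DF"}, {1u << 9, "IF"}, {1u << 8, "TF"},
        {1u << 7, "SF"}, {1u << 6, "ZF"}, {1u << 4, "AF"}, {1u << 2, "PF"},
        {1u << 0, "CF"},
    };
    assert(out != NULL);
    out[0] = '\0';
    for (size_t i = 0; i < sizeof flags / sizeof flags[0]; i++) {
        if (eflags & flags[i].bit) {
            if (out[0]) strcat(out, " ");
            strcat(out, flags[i].name);
        }
    }
    return out;
}
```

For the sample value 0x00000246 this yields the same three flags shown above (IF, ZF, PF); bit 8 (TF) is the trap flag used for single stepping.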

Implementation Hints:

The debugging loop structure:

1. Get debug privileges (SE_DEBUG_NAME)
2. DebugActiveProcess(pid) - attach to process
3. Debug event loop:
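   DEBUG_EVENT event;
   while (WaitForDebugEvent(&event, INFINITE)) {
     DWORD status = DBG_CONTINUE;
     switch (event.dwDebugEventCode) {
       case CREATE_PROCESS_DEBUG_EVENT:
         // Save base address and handles; close u.CreateProcessInfo.hFile
         break;
       case CREATE_THREAD_DEBUG_EVENT:
         // Track new thread
         break;
       case EXCEPTION_DEBUG_EVENT:
         // Handle breakpoints, single-step
         // For exceptions you did not cause: status = DBG_EXCEPTION_NOT_HANDLED;
         break;
       case EXIT_PROCESS_DEBUG_EVENT:
         // Clean up
         break;
       ...
     }
     ContinueDebugEvent(event.dwProcessId, event.dwThreadId, status);
   }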

Getting thread context:

CONTEXT ctx = { .ContextFlags = CONTEXT_ALL };
if (SuspendThread(hThread) != (DWORD)-1) {   // (DWORD)-1 means the call failed
    if (GetThreadContext(hThread, &ctx)) {
        // Read ctx.Rip, ctx.Rsp, ctx.Rax, etc.
    }
    ResumeThread(hThread);
}

Setting a breakpoint:

1. ReadProcessMemory at target address (save original byte)
2. WriteProcessMemory with 0xCC (int 3), then FlushInstructionCache
3. On EXCEPTION_BREAKPOINT:
   - Restore original byte
   - Back up RIP by 1 with SetThreadContext (it points just past the 0xCC)
   - Set single-step flag to re-set breakpoint after
4. On EXCEPTION_SINGLE_STEP:
   - Re-write 0xCC if this was a breakpoint step
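
The breakpoint life cycle can be exercised without a debuggee by modelling memory as a byte array. In this sketch `mem` stands in for the memory a real debugger reaches through ReadProcessMemory/WriteProcessMemory, and `Regs`, `Breakpoint`, `bp_arm`, `bp_on_hit`, and `bp_on_step` are hypothetical names. It also backs RIP up by one byte on a hit, since the CPU has already executed the `int 3`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3      0xCC
#define TRAP_FLAG 0x100u   /* bit 8 of EFLAGS */

/* Hypothetical stand-ins for CONTEXT.Rip / CONTEXT.EFlags and for the
   debugger's own breakpoint bookkeeping. */
typedef struct { uint64_t rip; uint32_t eflags; } Regs;
typedef struct { size_t addr; uint8_t saved; int armed; } Breakpoint;

void bp_arm(uint8_t *mem, Breakpoint *bp, size_t addr)
{
    assert(mem != NULL && bp != NULL);
    bp->addr = addr;
    bp->saved = mem[addr];   /* step 1: save original byte */
    mem[addr] = INT3;        /* step 2: patch in int 3 */
    bp->armed = 1;
}

/* EXCEPTION_BREAKPOINT: int 3 has executed, so rip == addr + 1. */
void bp_on_hit(uint8_t *mem, Breakpoint *bp, Regs *r)
{
    mem[bp->addr] = bp->saved;  /* restore the original instruction byte */
    r->rip -= 1;                /* re-execute the restored instruction */
    r->eflags |= TRAP_FLAG;     /* single-step so the breakpoint can be re-armed */
    bp->armed = 0;
}

/* EXCEPTION_SINGLE_STEP after a breakpoint hit: put the 0xCC back. */
void bp_on_step(uint8_t *mem, Breakpoint *bp)
{
    mem[bp->addr] = INT3;
    bp->armed = 1;
}
```

In a real debugger each write to `mem` becomes a WriteProcessMemory call and the register changes go through GetThreadContext/SetThreadContext.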

Stack walking (basic approach for x64):

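- Start with RIP, RSP, and RBP from the thread's CONTEXT
- Use StackWalk64 with proper callbacks (SymFunctionTableAccess64, SymGetModuleBase64)
- Or manual: follow RBP chain (only if frame pointers are used; x64 Windows code usually omits them and is unwound from the .pdata unwind data instead)
- Use symbol APIs (DbgHelp) for function names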
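
For code that does keep frame pointers, the manual walk is short. The sketch below runs against a captured copy of the stack; `StackImage`, `read_u64`, and `walk_rbp_chain` are hypothetical names, and a real tool would fill the image with ReadProcessMemory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a captured stack region: `words` holds the 8-byte
   slots starting at address `base`. */
typedef struct { uint64_t base; const uint64_t *words; size_t count; } StackImage;

static int read_u64(const StackImage *s, uint64_t addr, uint64_t *out)
{
    if (addr < s->base || (addr - s->base) % 8 != 0) return 0;
    uint64_t idx = (addr - s->base) / 8;
    if (idx >= s->count) return 0;
    *out = s->words[idx];
    return 1;
}

/* Follows the frame-pointer chain: [rbp] holds the caller's rbp and
   [rbp + 8] holds the return address. Stops at a null or non-increasing
   frame pointer, an unreadable slot, or when `max` entries were stored.
   Returns the number of return addresses written to `out`. */
size_t walk_rbp_chain(const StackImage *s, uint64_t rbp, uint64_t *out, size_t max)
{
    assert(s != NULL && out != NULL);
    size_t n = 0;
    while (rbp != 0 && n < max) {
        uint64_t next, ret;
        if (!read_u64(s, rbp, &next) || !read_u64(s, rbp + 8, &ret)) break;
        out[n++] = ret;
        if (next <= rbp) break;   /* stack grows down, so callers sit higher */
        rbp = next;
    }
    return n;
}
```

The "non-increasing frame pointer" check is what keeps a corrupted chain from looping forever.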

Questions to ask yourself:

What’s in CONTEXT_FULL vs CONTEXT_ALL?
Why must you suspend a thread before getting its context?
How do breakpoints work at the CPU level?
What’s the difference between software and hardware breakpoints?

Learning milestones:

Attach and enumerate threads → You understand debug attachment
Read and display context → You understand thread state
Walk the stack → You understand calling conventions
Breakpoints work → You understand exception handling and memory patching

Project 8: Mini Process Monitor (ETW-based)

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C++
Alternative Programming Languages: C, Rust (windows-rs)
Coolness Level: Level 4: Hardcore Tech Flex
Business Potential: 3. The “Service & Support” Model
Difficulty: Level 4: Expert
Knowledge Area: Event Tracing / System Monitoring
Software or Tool: Process Monitor (Sysinternals)
Main Book: “Windows Internals, Part 2” by Yosifovich et al.

What you’ll build: A real-time process/thread creation and termination monitor using Event Tracing for Windows (ETW). It will show process start with full command line, thread creation with start address, and process exit with exit code—all in real-time without polling.

Why it teaches the execution model: ETW is how Windows itself monitors these events. By consuming ETW events, you see process/thread lifecycle as the kernel reports it: no polling, high-resolution timestamps, and no missed events as long as the session's buffers keep up (ETW drops events rather than stall the system).

Core challenges you’ll face:

Setting up ETW sessions → maps to StartTrace, EnableTraceEx2
Consuming events in real-time → maps to OpenTrace with PROCESS_TRACE_MODE_REAL_TIME
Parsing event data → maps to TdhGetEventInformation, property parsing
Handling high event rates → maps to buffering, missed events

Key Concepts:

ETW Architecture: “Windows Internals, Part 2” Chapter 9 - Yosifovich et al.
Kernel Providers: Microsoft Process/Thread ETW Provider documentation
TDH API: Microsoft TraceDataHelper documentation
Real-time Tracing: “Troubleshooting with Windows Sysinternals Tools” - Russinovich

Difficulty: Expert Time estimate: 2-3 weeks Prerequisites: Projects 1-6 completed, understanding of provider GUIDs, experience with complex Windows APIs

Real world outcome:

C:\> mini_procmon.exe

[ETW] Starting real-time process/thread trace...
[ETW] Session started: MiniProcMon_Session

Timestamp           Event                PID   TID   Details
---------           -----                ---   ---   -------
10:23:45.123456     Process Start        7890   -    cmd.exe
                                                      Parent: 4128 (explorer.exe)
                                                      CmdLine: "cmd.exe" /K "echo hello"
                                                      User: DESKTOP\Douglas

10:23:45.124789     Thread Start         7890  7894  Start: 0x00007FF7A1234560
                                                      Stack: 1 MB

10:23:45.125012     Thread Start         7890  7898  Start: ntdll!TppWorkerThread

10:23:45.234567     Process Start        7902   -    echo.exe
                                                      Parent: 7890 (cmd.exe)

10:23:45.245678     Process Exit         7902   -    ExitCode: 0

10:23:47.891234     Thread Exit          7890  7898  ExitCode: 0

^C
[ETW] Session stopped. Events captured: 127
      Processes created: 2
      Processes exited: 1
      Threads created: 15
      Threads exited: 12

Implementation Hints:

ETW setup flow:

1. Define event trace properties (EVENT_TRACE_PROPERTIES)
   - LogFileMode: EVENT_TRACE_REAL_TIME_MODE
   - LoggerName: Your session name

2. StartTrace(&sessionHandle, sessionName, &properties)

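3. Enable the provider(s):
   - Manifest-based provider Microsoft-Windows-Kernel-Process:
     GUID {22FB2CD6-0E7B-422B-A0C7-2FAD1FD0E716}
     EnableTraceEx2(sessionHandle, &providerGuid, ...)
   - Classic kernel events (Process class {3D6FA8D0-FE05-11D0-9DDA-00C04FD7BA7C},
     Thread class {3D6FA8D1-FE05-11D0-9DDA-00C04FD7BA7C}) are not enabled through
     EnableTraceEx2. Set EnableFlags = EVENT_TRACE_FLAG_PROCESS | EVENT_TRACE_FLAG_THREAD
     in the properties of a kernel logger session before StartTrace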

4. Set up consumer:
   EVENT_TRACE_LOGFILE logfile = {
     .LoggerName = sessionName,
     .ProcessTraceMode = PROCESS_TRACE_MODE_REAL_TIME |
                         PROCESS_TRACE_MODE_EVENT_RECORD,
     .EventRecordCallback = YourCallback
   };
   TRACEHANDLE traceHandle = OpenTrace(&logfile);

5. ProcessTrace(&traceHandle, 1, NULL, NULL);
   // This blocks and calls your callback for each event

6. In callback, parse event:
   - event->EventHeader.ProviderId tells you which provider
   - event->EventHeader.EventDescriptor.Opcode tells you event type
   - Use TdhGetEventInformation + TdhGetProperty to extract fields

Key event opcodes:

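Classic kernel Process class: Start=1, End=2, DCStart=3, DCEnd=4
Classic kernel Thread class: Start=1, End=2, DCStart=3, DCEnd=4
(DCStart/DCEnd are rundown events for processes and threads that already exist when the session starts or stops. Microsoft-Windows-Kernel-Process distinguishes events by EventDescriptor.Id instead: ProcessStart=1, ProcessStop=2, ThreadStart=3, ThreadStop=4.)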

Questions to ask yourself:

Why use ETW instead of polling?
What happens if events come faster than you can process?
How do you correlate thread events with their owning process?
What’s the overhead of ETW tracing?

Learning milestones:

Session starts and provider enabled → You understand ETW setup
Process events captured → You understand provider events
Event data parsed correctly → You understand TDH parsing
High-rate events handled → You understand ETW performance

Project 9: User-Mode Scheduler (UMS)

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C
Alternative Programming Languages: C++
Coolness Level: Level 5: Pure Magic (Super Cool)
Business Potential: 1. The “Resume Gold”
Difficulty: Level 5: Master
Knowledge Area: Advanced Threading / Scheduler Implementation
Software or Tool: Windows UMS API
Main Book: “Windows Internals, Part 1” by Yosifovich et al.

What you’ll build: A custom user-mode scheduler using Windows User-Mode Scheduling (UMS) that manages a pool of user-mode threads with your own scheduling algorithm—without kernel transitions for context switches.

Why it teaches execution model: UMS is the apex of Windows threading. It lets YOU be the scheduler. When a UMS thread blocks, the kernel notifies you (not the kernel scheduler), and you pick the next thread. SQL Server's SQLOS applies the same idea with its own cooperative user-mode scheduler to achieve massive concurrency.

Core challenges you’ll face:

Creating UMS scheduler threads → maps to EnterUmsSchedulingMode
Creating UMS worker threads → maps to CreateUmsThreadContext, CreateRemoteThreadEx
Handling scheduler callbacks → maps to UmsSchedulerProc, blocking notifications
Implementing scheduling decisions → maps to picking next thread, priority, fairness
Avoiding deadlocks → maps to careful lock-free programming in scheduler

Key Concepts:

UMS Architecture: “Windows Internals, Part 1” Chapter 5 - Yosifovich et al.
UMS API: Microsoft UMS documentation (MSDN)
Scheduler Design: “Operating Systems: Three Easy Pieces” - Scheduling chapters
Lock-Free Programming: “C++ Concurrency in Action” Chapter 7 - Anthony Williams

Difficulty: Master Time estimate: 1 month+ Prerequisites: All previous projects completed, deep understanding of threading, OS scheduling theory

Real world outcome:

C:\> ums_scheduler.exe --workers 100 --algorithm priority

[UMS] Initializing User-Mode Scheduling
[UMS] Scheduler thread: TID 4532
[UMS] Creating 100 UMS worker threads...
[UMS] Workers created and attached to scheduler

=== Custom UMS Scheduler Demo ===
Scheduling algorithm: Priority-based with aging
Workers: 100
Estimated context switches/sec: 50,000+

[Scheduler] Received: ThreadBlocked (worker 45 blocked on I/O)
[Scheduler] Picking next: Worker 23 (priority 8)
[Scheduler] Executing: Worker 23

[Scheduler] Received: ThreadBlocked (worker 23 blocked on mutex)
[Scheduler] Picking next: Worker 67 (priority 6, aged from 4)
[Scheduler] Executing: Worker 67

[Scheduler] Received: ThreadYield (worker 67 voluntary yield)
[Scheduler] Received: ThreadBlocked (worker 12 completed)

... (thousands of switches per second) ...

=== Statistics (after 10 seconds) ===
Total context switches:  523,456
Kernel transitions:      1,234 (only for I/O completion)
Average switch time:     < 1 microsecond
Workers completed:       89 / 100
Queue length (avg):      12
Starvation events:       0

Implementation Hints:

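UMS is complex, and it is only available to 64-bit processes. Microsoft's documentation also states that UMS is not supported as of Windows 11 (calls fail with ERROR_NOT_SUPPORTED), so plan to build and test on Windows 7 through Windows 10. Here’s the conceptual flow: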

Initialization:
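1. Create a completion list:
   CreateUmsCompletionList(&completionList);

2. Create scheduler thread(s) - these are regular threads

3. Call EnterUmsSchedulingMode(&startupInfo) from scheduler thread
   - UMS_SCHEDULER_STARTUP_INFO carries UmsVersion, CompletionList,
     SchedulerProc (your UmsSchedulerProc callback), and SchedulerParam
   - Thread becomes a UMS scheduler

4. Create UMS worker contexts:
   CreateUmsThreadContext(&context);

5. Create actual threads with UMS:
   attrlist with PROC_THREAD_ATTRIBUTE_UMS_THREAD
     (value: UMS_CREATE_THREAD_ATTRIBUTES with the context and completion list)
   CreateRemoteThreadEx(... attrlist ...)
   - New workers do not run on their own; they show up on the completion list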

Scheduler callback (UmsSchedulerProc):
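- Called when (Reason argument):
  - UmsSchedulerStartup: the scheduler thread has just entered UMS mode
  - UmsSchedulerThreadBlocked: worker blocks in the kernel (I/O, kernel wait, page fault) or exits
  - UmsSchedulerThreadYield: worker yields explicitly (UmsThreadYield)

- Your job:
  1. DequeueUmsCompletionListItems: move new and newly unblocked workers to the ready list
     (check UmsThreadIsTerminated via QueryUmsThreadInformation to retire finished workers)
  2. Pick next runnable thread from ready list
  3. Call ExecuteUmsThread(nextThread)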

- This is YOUR scheduling algorithm!

Critical rules for UMS scheduler:

Scheduler code MUST NOT block (or you deadlock)
Use lock-free data structures for ready/wait queues
Handle all notification reasons properly
Don’t starve threads

Advanced features:

Priority queues with aging
Work stealing between multiple scheduler threads
Affinity hints
Fairness guarantees
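
Priority scheduling with aging, the first item above, can be prototyped independently of the UMS API. In this sketch `Worker`, `AGING_STEP`, and `pick_next` are hypothetical; `ctx` stands in for the UMS thread context, and a linear scan is used instead of a lock-free queue to keep the idea visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ready-list entry; `ctx` would be the worker's UMS context. */
typedef struct {
    void *ctx;
    int base_priority;   /* higher runs first */
    int waited;          /* scheduling rounds spent waiting in the ready list */
    int ready;
} Worker;

#define AGING_STEP 4     /* assumed: +1 effective priority per 4 rounds waited */

static int effective_priority(const Worker *w)
{
    return w->base_priority + w->waited / AGING_STEP;
}

/* Picks the ready worker with the highest effective priority (lowest index
   wins ties), resets its wait counter, and ages every other ready worker.
   Returns the chosen index, or -1 if nothing is ready. */
int pick_next(Worker *ws, size_t n)
{
    assert(ws != NULL || n == 0);
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (!ws[i].ready) continue;
        if (best < 0 || effective_priority(&ws[i]) > effective_priority(&ws[best]))
            best = (int)i;
    }
    for (size_t i = 0; i < n; i++)
        if (ws[i].ready && (int)i != best) ws[i].waited++;
    if (best >= 0) ws[best].waited = 0;
    return best;
}
```

With a priority-8 and a priority-6 worker both permanently ready, the lower-priority worker is picked once its wait has pushed its effective priority above 8, which is the "priority 6, aged from 4" behavior shown in the sample output.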

Questions to ask yourself:

When does the kernel intervene vs your scheduler?
How does UMS achieve near-zero-cost context switches?
What operations cause a UMS worker to trap to your scheduler?
How would you implement priority inheritance?

Learning milestones:

UMS scheduler receives notifications → You understand UMS architecture
Workers run and switch correctly → You understand ExecuteUmsThread
Custom algorithm works → You understand scheduler implementation
High throughput achieved → You understand UMS performance benefits

Project 10: Process Injection Detector

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C
Alternative Programming Languages: C++, Rust
Coolness Level: Level 4: Hardcore Tech Flex
Business Potential: 3. The “Service & Support” Model
Difficulty: Level 4: Expert
Knowledge Area: Security / Malware Detection
Software or Tool: Windows Defender, EDR tools
Main Book: “Practical Malware Analysis” by Sikorski & Honig

What you’ll build: A security tool that detects when code is injected into a process: new threads with unusual start addresses, memory regions with RWX permissions, and hollowed process indicators.

Why it teaches execution model security: Attackers abuse the execution model—injecting threads, hollowing processes, using fibers maliciously. Understanding detection means understanding how these primitives can be misused.

Core challenges you’ll face:

Detecting remote thread creation → maps to monitoring CreateRemoteThread, ETW
Scanning for RWX memory → maps to VirtualQueryEx, memory protection flags
Detecting hollowed processes → maps to comparing disk vs memory image
Identifying suspicious start addresses → maps to is start address in a known module?

Key Concepts:

Injection Techniques: “Practical Malware Analysis” Chapter 12 - Sikorski & Honig
Memory Forensics: “The Art of Memory Forensics” - Ligh et al.
ETW for Security: Microsoft Threat Intelligence ETW providers
Process Hollowing: Various security research papers

Difficulty: Expert Time estimate: 2-3 weeks Prerequisites: Projects 1-8 completed, understanding of PE format, security mindset

Real world outcome:

C:\> injection_detector.exe --monitor-all

[Detector] Monitoring for process injection...
[Detector] Using ETW + periodic scanning

=== Injection Detection Events ===

[ALERT] Remote Thread Detected
  Target:     notepad.exe (PID 7890)
  Source:     unknown.exe (PID 9012)
  Thread TID: 8456
  Start Addr: 0x00000001CAFE0000
  SUSPICIOUS: Start address NOT in any loaded module!
  Confidence: HIGH

[ALERT] RWX Memory Region Detected
  Process:    svchost.exe (PID 456)
  Address:    0x000001A0B7C00000
  Size:       65536 bytes
  Content:    Looks like shellcode (MZ header, API hashes)
  Confidence: HIGH

[ALERT] Possible Process Hollowing
  Process:    iexplore.exe (PID 2345)
  Finding:    EntryPoint mismatch
    Disk:     0x00401000
    Memory:   0x7FFE0000 (unmapped from original)
  Finding:    .text section differs significantly
  Confidence: MEDIUM

[INFO] Normal thread creation
  Process:    chrome.exe (PID 5632)
  Thread:     TID 5680
  Start Addr: chrome.dll!base::Thread::ThreadMain
  Status:     BENIGN (known module)

=== Summary (after 60 seconds) ===
Processes scanned: 127
Threads analyzed:  1,456
Alerts generated:  3
False positives:   0 (estimated)

Implementation Hints:

Detection strategies:

1. Remote Thread Detection:
   - ETW: Microsoft-Windows-Kernel-Process for thread creation
   - Check if creating process != owning process
   - Check if start address is in a legitimate module:
     EnumProcessModules + GetModuleInformation
     If start address not in any module range → suspicious

2. RWX Memory Scanning:
   - VirtualQueryEx to enumerate memory regions
   - Flag any region with PAGE_EXECUTE_READWRITE
   - Extra suspicious if not backed by a file (MEMORY_BASIC_INFORMATION.Type is MEM_PRIVATE, or GetMappedFileName fails)
   - Scan content for shellcode patterns

3. Process Hollowing Detection:
   - Compare PEB.ImageBaseAddress with disk image
   - Check if entry point matches PE header
   - Compare section contents (especially .text) with disk
   - Look for memory regions that don't match the original file

4. Additional heuristics:
   - Thread start address in stack/heap (bad)
   - NtCreateThreadEx called from suspicious process
   - SetThreadContext to change EIP/RIP (used in some injections)
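
The first two strategies reduce to a range check and a protection check. The sketch below is an illustration only: `ModuleRange` and `RegionInfo` are hypothetical summaries of what EnumProcessModules/GetModuleInformation and VirtualQueryEx return, and the scoring weights are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical module range, as filled from MODULEINFO (lpBaseOfDll, SizeOfImage). */
typedef struct { uint64_t base; uint64_t size; } ModuleRange;

/* Hypothetical region summary taken from MEMORY_BASIC_INFORMATION. */
typedef struct { uint32_t protect; int is_private; } RegionInfo;

#define PROT_EXECUTE_READWRITE 0x40u  /* numeric value of PAGE_EXECUTE_READWRITE */

int addr_in_any_module(const ModuleRange *mods, size_t n, uint64_t addr)
{
    assert(mods != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (addr >= mods[i].base && addr - mods[i].base < mods[i].size)
            return 1;
    return 0;
}

/* Returns a coarse score: 0 = nothing flagged, higher = more suspicious. */
int score_thread_start(const ModuleRange *mods, size_t n, uint64_t start,
                       const RegionInfo *region)
{
    int score = 0;
    if (!addr_in_any_module(mods, n, start)) score += 2;  /* outside every module */
    /* The low byte holds the base protection; modifier bits such as
       PAGE_GUARD live above it. */
    if ((region->protect & 0xFFu) == PROT_EXECUTE_READWRITE) score += 1;
    if (region->is_private) score += 1;                   /* not file-backed */
    return score;
}
```

A score alone is not a verdict: JIT compilers legitimately produce private executable memory, so real tools combine it with allow-lists and other signals.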

Monitoring approaches:

Real-time: ETW (Microsoft-Windows-Kernel-Process)
Periodic: Enumerate all processes/threads, scan for anomalies
On-demand: Scan specific process when suspicious activity detected

Questions to ask yourself:

How would an attacker evade your detection?
What’s the false positive rate for RWX detection? (JIT compilers use RWX)
How do you distinguish legitimate remote threads (debugging) from malicious?
What about injection via APC queue instead of remote threads?

Learning milestones:

Remote thread detection works → You understand thread creation monitoring
RWX scanning finds injections → You understand memory protection analysis
Hollowing detection works → You understand process image verification
Low false positives → You understand the benign use cases

Project 11: Fiber-Based Game Engine Scheduler

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C++
Alternative Programming Languages: C
Coolness Level: Level 4: Hardcore Tech Flex
Business Potential: 2. The “Micro-SaaS / Pro Tool”
Difficulty: Level 4: Expert
Knowledge Area: Game Development / Job Systems
Software or Tool: Unity/Unreal Job System concepts
Main Book: “Game Engine Architecture” by Jason Gregory

What you’ll build: A job-based task scheduler using fibers, inspired by the Naughty Dog GDC talk “Parallelizing the Naughty Dog Engine Using Fibers.” Jobs can spawn sub-jobs and wait for them without blocking threads.

Why it teaches fibers advanced usage: Game engines need extreme efficiency. By using fibers, you can have thousands of “jobs” that suspend mid-execution when waiting for dependencies, without consuming threads. This is practical cooperative scheduling at scale.

Core challenges you’ll face:

Job dependency graphs → maps to expressing and tracking job dependencies
Fiber pooling → maps to reusing fibers efficiently
Lock-free job queues → maps to lock-free MPMC queues
Work stealing → maps to load balancing across worker threads
Fiber-aware waiting → maps to yielding fiber, not blocking thread

Key Concepts:

Job System Design: “Game Engine Architecture” Chapter 7 - Jason Gregory
Naughty Dog Fiber System: GDC 2015 talk “Parallelizing the Naughty Dog Engine Using Fibers” - Christian Gyrling
Lock-Free Queues: “C++ Concurrency in Action” Chapter 7 - Anthony Williams
Work Stealing: “The Art of Multiprocessor Programming” Chapter 16 - Herlihy & Shavit

Difficulty: Expert Time estimate: 1 month+ Prerequisites: Project 4 completed, understanding of game engine architecture, lock-free programming

Real world outcome:

// Usage example:
JobSystem::Initialize(8);  // 8 worker threads, each with fiber pool

// Define jobs
Job* physics = CreateJob([](JobContext& ctx) {
    // Simulate physics for all entities
    for (int i = 0; i < 1000; i++) {
        SimulateEntity(i);
        if (i % 100 == 0) ctx.Yield();  // Cooperate
    }
});

Job* render = CreateJob([](JobContext& ctx) {
    ctx.WaitFor(physics);  // Fiber yields, doesn't block thread!
    RenderWorld();
});

Job* ai = CreateJob([](JobContext& ctx) {
    // Spawn child jobs
    Job* children[10];
    for (int i = 0; i < 10; i++) {
        children[i] = ctx.SpawnChild([=](JobContext& c) {
            UpdateAI(i * 100, (i + 1) * 100);
        });
    }
    ctx.WaitForAll(children, 10);  // Wait for all AI jobs
});

SubmitJobs({physics, ai, render});
JobSystem::WaitForAll();

Output:

C:\> fiber_job_system.exe

[JobSystem] Initialized: 8 workers, 64 fibers per worker
[JobSystem] Frame 1 starting...

[Worker 0] Running: Physics (Job 0x1234)
[Worker 1] Running: AI (Job 0x5678)
[Worker 2] Running: AI_Child_0 (Job 0x9ABC)
...
[Worker 0] Physics yielded at step 100
[Worker 0] Picked up: AI_Child_3
[Worker 1] AI waiting for children - fiber yielded
[Worker 1] Picked up: AI_Child_5
...
[Worker 3] Render waiting for Physics - fiber yielded
[Worker 3] Picked up: AI_Child_7
...
[Worker 0] Physics completed
[Worker 3] Render can now run (dependency satisfied)
[Worker 3] Render completed

[JobSystem] Frame 1 complete: 12.4ms
  Jobs executed: 14
  Context switches: 156
  Thread blocks: 0 (all waits were fiber yields)

[JobSystem] Frame 2 starting...

Implementation Hints:

Architecture overview:

Components:
1. Worker Threads (N = core count)
   - Each thread is converted to fiber
   - Has local fiber pool + job queue

2. Fiber Pool
   - Pre-allocated fibers with fixed stack size
   - Recycled when job completes

3. Job Queue (per worker + global)
   - Lock-free MPMC queue
   - Support work stealing from other workers

4. Wait System
   - Counter-based: counter starts at the number of jobs in a batch, each finishing job decrements it, waiters wait for zero
   - When wait needed: suspend fiber, schedule new fiber
   - When counter hits zero: resume waiting fibers

Flow:
- CreateJob: allocate job descriptor, return handle
- SubmitJob: push to queue
- Worker loop:
  1. Pop job from local queue (or steal)
  2. If job has unsatisfied dependencies, push to waiting list
  3. Run job in current fiber
  4. On WaitFor: switch to scheduler fiber, pick new job
  5. On complete: signal dependents, recycle fiber
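
The wait system's counter can be sketched with C11 atomics. `JobCounter` and the three functions are hypothetical names; the counter starts at the batch size, each finished job decrements it, and the worker that brings it to zero is the one responsible for waking the waiting fibers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wait counter shared by one batch of jobs. */
typedef struct { atomic_int remaining; } JobCounter;

/* Called before submitting a batch of `jobs` jobs. */
void counter_init(JobCounter *c, int jobs)
{
    assert(c != NULL && jobs >= 0);
    atomic_init(&c->remaining, jobs);
}

/* Called by a worker when one job of the batch finishes. Returns true for
   the caller that finished the last job; that caller should move the fibers
   waiting on this counter back to the ready queue. */
bool counter_job_done(JobCounter *c)
{
    return atomic_fetch_sub_explicit(&c->remaining, 1, memory_order_acq_rel) == 1;
}

/* Checked by WaitFor: if this returns false, the job's fiber is parked and
   the worker switches to another fiber instead of blocking the thread. */
bool counter_is_done(JobCounter *c)
{
    return atomic_load_explicit(&c->remaining, memory_order_acquire) == 0;
}
```

Having exactly one caller observe the transition to zero avoids waking the same waiters twice.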

Key fiber operations:

SwitchToFiber(schedulerFiber) - yield current job
Scheduler fiber picks next ready job’s fiber
SwitchToFiber(jobFiber) - resume job

Work stealing algorithm:

Try local queue
If empty, try global queue
If empty, steal from random other worker
If all empty, spin/sleep

Questions to ask yourself:

Why use fibers instead of just more threads?
How do you handle job exceptions in a fiber?
What’s the memory overhead of 1000 fibers vs 1000 threads?
How would you add priority to jobs?

Learning milestones:

Jobs run on workers → You understand fiber-based worker pool
WaitFor doesn’t block threads → You understand fiber yielding
Work stealing balances load → You understand distributed scheduling
High job throughput achieved → You understand practical fiber systems

Project 12: Windows Service with Job Object Management

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C++
Alternative Programming Languages: C, Rust
Coolness Level: Level 3: Genuinely Clever
Business Potential: 4. The “Open Core” Infrastructure
Difficulty: Level 3: Advanced
Knowledge Area: Windows Services / System Programming
Software or Tool: Windows Service Control Manager
Main Book: “Windows System Programming” by Johnson Hart

What you’ll build: A Windows service that manages child worker processes in job objects. The service starts, monitors, restarts failed workers, enforces resource limits, and provides a control interface.

Why it teaches real-world Windows programming: Services + Job Objects is how real Windows infrastructure works. Docker, IIS, and SQL Server use these patterns. This combines process management, job objects, and service architecture.

Core challenges you’ll face:

Service lifecycle → maps to ServiceMain, control handler, status reporting
Managing child processes → maps to CreateProcess, monitoring, restart logic
Job object integration → maps to resource limits, accounting, termination
Inter-process communication → maps to named pipes or other IPC for control
Graceful shutdown → maps to propagating stop to children, cleanup

Key Concepts:

Windows Services: “Windows System Programming” Chapter 13 - Johnson Hart
Service Control Manager: “Windows Via C/C++” Chapter 4 - Jeffrey Richter
Job Objects for Services: “Windows Internals, Part 1” Chapter 5 - Yosifovich et al.
Named Pipes IPC: “Windows System Programming” Chapter 11 - Johnson Hart

Difficulty: Advanced Time estimate: 2-3 weeks Prerequisites: Projects 1-6 completed, understanding of Windows services basics

Real world outcome:

C:\> sc start WorkerManagerService
[SC] StartService SUCCESS

C:\> worker_ctl.exe status

=== Worker Manager Service ===
Status: Running (PID 2345)
Uptime: 00:05:32

Job Object: WorkerJob_001
  Memory Limit:  500 MB per process
  CPU Limit:     80% (hard cap)
  Process Limit: 10

Workers:
  ID   PID    Status    CPU    Memory   Restarts   Uptime
  ---  ----   ------    ---    ------   --------   ------
  1    3456   Running   12%    45 MB    0          00:05:30
  2    3789   Running   8%     62 MB    0          00:05:28
  3    4012   Running   15%    38 MB    1          00:02:15
  4    4234   Crashed   -      -        3          -

Last restart: Worker 4 crashed (exit code 0xC0000005) - restarting...

C:\> worker_ctl.exe scale 6
[WorkerCtl] Scaling to 6 workers...
[WorkerCtl] Started worker 5 (PID 4567)
[WorkerCtl] Started worker 6 (PID 4890)
[WorkerCtl] Scale complete: 6 workers running

C:\> sc stop WorkerManagerService
[SC] StopService SUCCESS
  (Service gracefully terminated all workers)

Implementation Hints:

Service structure:

ServiceMain():
1. Register control handler (SERVICE_CONTROL_STOP, etc.)
2. Report SERVICE_START_PENDING
3. Initialize: create job object, set limits
4. Spawn initial workers (CreateProcess into job)
5. Report SERVICE_RUNNING
6. Enter monitoring loop:
   - Wait on job completion port OR control event
   - Handle worker exits (restart if needed)
   - Handle control signals (stop/pause)
7. On STOP: terminate job, cleanup, report SERVICE_STOPPED

Control handler:
- SERVICE_CONTROL_STOP: set stop event
- SERVICE_CONTROL_INTERROGATE: report current status
- Custom controls (128-255): plain codes delivered to the handler via ControlService; richer commands go over the named pipe

Worker management:
- CreateProcess with CREATE_SUSPENDED
- AssignProcessToJobObject
- ResumeThread
- Monitor via job completion port

IPC (for worker_ctl.exe):
- Named pipe: \\.\pipe\WorkerManagerControl
- Commands: status, scale N, restart ID, shutdown
- Responses: JSON or simple text protocol
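
If the simple text protocol is chosen, the service side needs a strict parser for the four commands. A minimal sketch, with `CmdKind`, `Command`, and `parse_command` as hypothetical names and one command per pipe message assumed:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef enum { CMD_INVALID, CMD_STATUS, CMD_SCALE, CMD_RESTART, CMD_SHUTDOWN } CmdKind;
typedef struct { CmdKind kind; int arg; } Command;

/* Parses "status", "scale N", "restart ID", or "shutdown".
   Trailing junk and negative numbers yield CMD_INVALID. */
Command parse_command(const char *line)
{
    Command cmd = { CMD_INVALID, 0 };
    int value = 0, end = 0;
    assert(line != NULL);

    if (strcmp(line, "status") == 0) {
        cmd.kind = CMD_STATUS;
    } else if (strcmp(line, "shutdown") == 0) {
        cmd.kind = CMD_SHUTDOWN;
    } else if (sscanf(line, "scale %d%n", &value, &end) == 1 &&
               line[end] == '\0' && value >= 0) {
        cmd.kind = CMD_SCALE;
        cmd.arg = value;
    } else if (sscanf(line, "restart %d%n", &value, &end) == 1 &&
               line[end] == '\0' && value >= 0) {
        cmd.kind = CMD_RESTART;
        cmd.arg = value;
    }
    return cmd;
}
```

The `%n` conversion records how much input was consumed, which lets the parser reject inputs such as `scale 6x`.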

Service installation:

sc create WorkerManagerService binPath= "C:\path\to\service.exe"
sc config WorkerManagerService start= auto
sc start WorkerManagerService

Questions to ask yourself:

How do you handle the service being killed (power loss)?
What happens if a worker is stuck (not crashing, but hung)?
How do you upgrade workers without downtime?
How would you add per-worker configuration?

Learning milestones:

Service installs and starts → You understand service lifecycle
Workers spawn in job object → You understand job integration
Crashed workers restart → You understand monitoring and recovery
Control interface works → You understand IPC patterns

Project 13: PE Loader and Executor

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C
Alternative Programming Languages: C++, Rust
Coolness Level: Level 5: Pure Magic (Super Cool)
Business Potential: 1. The “Resume Gold”
Difficulty: Level 5: Master
Knowledge Area: Binary Loading / Windows Internals
Software or Tool: Windows Loader (ntdll)
Main Book: “Windows Internals, Part 1” by Yosifovich et al.

What you’ll build: A minimal PE (Portable Executable) loader that can load a Windows executable into memory, resolve its imports, apply relocations, and jump to its entry point—without using LoadLibrary.

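Why it teaches the execution model: This IS the execution model. When you run an EXE, the Windows loader performs essentially these steps (it maps the file as an image section instead of copying bytes, but the logic is the same). By building your own, you understand every step from bytes on disk to running code.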

Core challenges you’ll face:

Parsing PE headers → maps to DOS header, NT headers, section headers
Mapping sections → maps to VirtualAlloc with correct protections
Resolving imports → maps to walking import directory, GetProcAddress
Applying relocations → maps to base relocation table, pointer patching
Executing the image → maps to TLS callbacks, entry point, DllMain

Key Concepts:

PE Format: Microsoft PE/COFF specification
Windows Loader: “Windows Internals, Part 1” Chapter 3 - Yosifovich et al.
Import/Export Tables: “Practical Malware Analysis” Chapter 1 - Sikorski & Honig
Relocations: “Practical Binary Analysis” Chapter 4 - Andriesse

Difficulty: Master Time estimate: 1 month+ Prerequisites: All previous projects, deep understanding of PE format, assembly basics

Real world outcome:

C:\> pe_loader.exe calc.exe

[Loader] Loading: calc.exe
[Loader] PE Type: PE32+ (64-bit)
[Loader] Image size: 0x22000
[Loader] Entry point: 0x1A20

[Loader] Mapping sections:
  .text    0x1000   size 0x10000  RX
  .rdata   0x11000  size 0x8000   R
  .data    0x19000  size 0x2000   RW
  .pdata   0x1B000  size 0x1000   R
  .rsrc    0x1C000  size 0x6000   R

[Loader] Allocated at base: 0x00007FF700000000

[Loader] Processing relocations:
  Base delta: +0x7FF6FF800000
  Relocations applied: 1,247

[Loader] Resolving imports:
  KERNEL32.dll (42 functions)
    GetModuleHandleW -> 0x00007FFB1A234560
    CreateFileW -> 0x00007FFB1A235780
    ...
  USER32.dll (28 functions)
    MessageBoxW -> 0x00007FFB18012340
    ...
  [Total: 8 DLLs, 156 imports resolved]

[Loader] Executing TLS callbacks: 0

[Loader] Jumping to entry point: 0x00007FF700001A20

*** Calculator window appears ***
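
The first lines of that output (PE type and entry point) come straight from the headers. A minimal, bounds-checked sketch of that step is shown below; `PeKind`, `rd16`, `rd32`, and `pe_classify` are hypothetical names, and the offsets follow the PE/COFF layout rather than the Windows SDK structs so the code compiles anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { PE_INVALID, PE_32, PE_32PLUS } PeKind;

/* Little-endian readers, so the sketch does not depend on host byte order
   or on struct packing. */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | p[1] << 8); }
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Classifies a raw file image and, on success, stores AddressOfEntryPoint
   (an RVA) in *entry_rva. Layout: e_lfanew at 0x3C, then "PE\0\0", a
   20-byte file header, then the optional header whose Magic is 0x10B
   (PE32) or 0x20B (PE32+). */
PeKind pe_classify(const uint8_t *file, size_t len, uint32_t *entry_rva)
{
    assert(file != NULL && entry_rva != NULL);
    if (len < 0x40 || file[0] != 'M' || file[1] != 'Z') return PE_INVALID;

    uint32_t nt = rd32(file + 0x3C);
    if (nt > len || len - nt < 4 + 20 + 20) return PE_INVALID;
    if (memcmp(file + nt, "PE\0\0", 4) != 0) return PE_INVALID;

    const uint8_t *opt = file + nt + 4 + 20;
    uint16_t magic = rd16(opt);
    if (magic != 0x10B && magic != 0x20B) return PE_INVALID;

    *entry_rva = rd32(opt + 16);   /* AddressOfEntryPoint */
    return magic == 0x20B ? PE_32PLUS : PE_32;
}
```

Every offset is validated against `len` before it is read, which matters because a loader is parsing untrusted input.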

Implementation Hints:

PE loading steps:

1. Read file into memory (raw bytes)

2. Parse headers:
   - DOS Header: e_lfanew points to NT headers
   - NT Headers: FileHeader + OptionalHeader
   - OptionalHeader: ImageBase, SizeOfImage, EntryPoint, etc.
   - Section Headers: name, VirtualAddress, SizeOfRawData, Characteristics

3. Allocate memory:
   VirtualAlloc(preferredBase, SizeOfImage, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE)
   - May not get preferred base → need relocations

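4. Copy headers and sections:
   memcpy(allocBase, fileData, OptionalHeader.SizeOfHeaders)
   For each section:
     memcpy(allocBase + section.VirtualAddress,
            fileData + section.PointerToRawData,
            section.SizeOfRawData)
     // Bytes between SizeOfRawData and VirtualSize stay zero (e.g. .bss)

5. Process relocations (if base != ImageBase):
   - Find the relocation directory (IMAGE_DIRECTORY_ENTRY_BASERELOC)
   - For each relocation block:
     - For each 16-bit entry: type = entry >> 12, offset = entry & 0xFFF
       IMAGE_REL_BASED_DIR64 / IMAGE_REL_BASED_HIGHLOW: add baseDelta to the
       value at allocBase + block.VirtualAddress + offset
       IMAGE_REL_BASED_ABSOLUTE: padding, skip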

6. Resolve imports:
   - Find import directory (IMAGE_DIRECTORY_ENTRY_IMPORT)
   - For each imported DLL:
     - LoadLibrary(dllName)  // or recursive load
     - For each import: GetProcAddress, write to IAT

7. Apply section protections:
   VirtualProtect each section based on Characteristics
   (IMAGE_SCN_MEM_EXECUTE, _READ, _WRITE)

8. Call TLS callbacks (IMAGE_DIRECTORY_ENTRY_TLS)

9. Jump to entry point:
   ((void(*)())entryPoint)();
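
Step 5 is the part that most often goes wrong, so here is a hedged sketch of it working on plain byte buffers. `apply_relocs` and the `REL_*` constants are hypothetical names (the values match the IMAGE_REL_BASED_* constants), and the code assumes a little-endian host, which holds for x86 and x64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REL_ABSOLUTE 0   /* IMAGE_REL_BASED_ABSOLUTE: padding, skip */
#define REL_HIGHLOW  3   /* IMAGE_REL_BASED_HIGHLOW: 32-bit field */
#define REL_DIR64    10  /* IMAGE_REL_BASED_DIR64: 64-bit field */

/* Applies every block of a base relocation table to a mapped image.
   `delta` = actual base - preferred ImageBase. Returns the number of
   patched fields, or -1 on a malformed table or unsupported entry. */
long apply_relocs(uint8_t *image, size_t image_size,
                  const uint8_t *reloc, size_t reloc_size, uint64_t delta)
{
    long patched = 0;
    size_t off = 0;
    assert(image != NULL && reloc != NULL);
    while (off + 8 <= reloc_size) {
        uint32_t page_rva, block_size;
        memcpy(&page_rva, reloc + off, 4);        /* block's VirtualAddress */
        memcpy(&block_size, reloc + off + 4, 4);  /* block's SizeOfBlock */
        if (block_size < 8 || block_size > reloc_size - off) return -1;
        for (size_t e = 8; e + 2 <= block_size; e += 2) {
            uint16_t entry;
            memcpy(&entry, reloc + off + e, 2);
            unsigned type = entry >> 12;                           /* high 4 bits */
            size_t target = (size_t)page_rva + (entry & 0x0FFFu);  /* low 12 bits */
            if (type == REL_DIR64 && target + 8 <= image_size) {
                uint64_t v;
                memcpy(&v, image + target, 8);
                v += delta;
                memcpy(image + target, &v, 8);
                patched++;
            } else if (type == REL_HIGHLOW && target + 4 <= image_size) {
                uint32_t v;
                memcpy(&v, image + target, 4);
                v += (uint32_t)delta;
                memcpy(image + target, &v, 4);
                patched++;
            } else if (type != REL_ABSOLUTE) {
                return -1;   /* unsupported type or target out of range */
            }
        }
        off += block_size;
    }
    return patched;
}
```

Using `memcpy` for every read and write sidesteps alignment problems, since relocation targets are not guaranteed to be naturally aligned.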

Advanced considerations:

Bound imports (optional optimization)
Delay-load imports
Exception handling tables (.pdata)
Resource loading (if you want icons, etc.)

Questions to ask yourself:

What happens if a DLL isn’t found?
Why do we need relocations?
How does ASLR affect PE loading?
What’s the difference between loading a DLL vs an EXE?

Learning milestones:

Headers parse correctly → You understand PE format
Sections mapped with correct protections → You understand memory layout
Imports resolve → You understand the IAT/INT
Simple EXE runs → You’ve built a working loader

Project 14: Cross-Process Communication Framework

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C++
Alternative Programming Languages: C, Rust
Coolness Level: Level 3: Genuinely Clever
Business Potential: 4. The “Open Core” Infrastructure
Difficulty: Level 3: Advanced
Knowledge Area: IPC / Distributed Systems
Software or Tool: Named Pipes, Shared Memory
Main Book: “Windows System Programming” by Johnson Hart

What you’ll build: A complete IPC framework supporting named pipes and shared memory (both pagefile-backed and file-backed mappings) behind a unified API. Include message passing, RPC-style calls, and event notification.

Why it teaches process relationships: Processes in Windows are isolated by design. IPC is how they cooperate. By building a framework, you understand all the mechanisms Windows provides for cross-process communication.

Core challenges you’ll face:

Named pipe implementation → maps to CreateNamedPipe, overlapped I/O
Shared memory → maps to CreateFileMapping, MapViewOfFile, synchronization
Message protocol → maps to framing, serialization, message types
Synchronization across processes → maps to named mutexes, events, semaphores
Security → maps to SECURITY_ATTRIBUTES, impersonation

Key Concepts:

Named Pipes: “Windows System Programming” Chapter 11 - Johnson Hart
Memory-Mapped Files: “Windows Via C/C++” Chapter 17 - Jeffrey Richter
Interprocess Synchronization: “Windows Via C/C++” Chapter 9 - Jeffrey Richter
Security Descriptors: “Windows Internals, Part 1” Chapter 7 - Yosifovich et al.

Difficulty: Advanced
Time estimate: 2-3 weeks
Prerequisites: Projects 1-6 completed, understanding of serialization

Real world outcome:

// Server side:
#include "ipc_framework.h"

IPCServer server("MyService");
server.OnConnect([](IPCClient& client) {
    printf("Client connected: %s\n", client.GetProcessName());
});

server.RegisterHandler("add", [](const Message& req) -> Message {
    int a = req.GetInt("a");
    int b = req.GetInt("b");
    return Message().Set("result", a + b);
});

server.RegisterHandler("getData", [](const Message& req) -> Message {
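    // Return via shared memory for large data.
    // The mapping must stay open until the client has mapped it; this
    // sketch assumes Message keeps the SharedMemory handle alive.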
    auto shm = SharedMemory::Create("DataBuffer", 1024 * 1024);
    FillWithData(shm.GetPtr(), shm.GetSize());
    return Message().SetSharedMemory("data", shm);
});

server.Run();

// Client side:
IPCClient client("MyService");
client.Connect();

auto response = client.Call("add", Message().Set("a", 5).Set("b", 3));
printf("5 + 3 = %d\n", response.GetInt("result")); // 8

auto dataResp = client.Call("getData", Message());
auto shm = dataResp.GetSharedMemory("data");
ProcessData(shm.GetPtr(), shm.GetSize());

Output:

=== Server ===
[Server] Listening on \\.\pipe\MyService
[Server] Client connected: client.exe (PID 4567)
[Server] Handling 'add' request
[Server] Handling 'getData' request (shared memory path)

=== Client ===
[Client] Connected to MyService
5 + 3 = 8
[Client] Received 1 MB via shared memory
[Client] Data processed successfully

Implementation Hints:

Framework architecture:

Layers:
1. Transport Layer
   - NamedPipeTransport: byte streams via named pipes
   - SharedMemoryTransport: direct memory access
   - Both implement ITransport interface

2. Protocol Layer
   - Message framing (length prefix)
   - Serialization (JSON, MessagePack, custom binary)
   - Request/response correlation (message IDs)

3. API Layer
   - Server: listen, accept, route handlers
   - Client: connect, call, async notifications
   - Both support sync and async operations

Named Pipe Server:
CreateNamedPipe(pipeName, PIPE_ACCESS_DUPLEX | FILE_FLAG_OVERLAPPED, ...)
ConnectNamedPipe(pipe, &overlapped)  // async accept
ReadFile/WriteFile with overlapped for async I/O

Shared Memory:
CreateFileMapping(INVALID_HANDLE_VALUE, ..., size, name)
MapViewOfFile(...) - returns pointer
// Sender writes, signals event
// Receiver maps same name, reads

Synchronization:
CreateEvent(NULL, FALSE, FALSE, "Global\\MyEvent")
// Named events work cross-process
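
The "message framing (length prefix)" item in the protocol layer can be sketched independently of the transport. The code below is a minimal illustration, not a prescribed wire format: the 4-byte little-endian length, the 16 MB cap, and the names `encode_frame`, `try_decode_frame`, and `DecodeResult` are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::uint32_t kMaxFrameSize = 16u * 1024 * 1024;  // hypothetical cap

enum class DecodeResult { NeedMore, Ok, Invalid };

// Appends [4-byte little-endian length][payload] to `out`.
void encode_frame(const std::string& payload, std::vector<std::uint8_t>& out) {
    assert(payload.size() <= kMaxFrameSize);
    const auto n = static_cast<std::uint32_t>(payload.size());
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>(n >> shift));
    out.insert(out.end(), payload.begin(), payload.end());
}

// Removes one complete frame from the front of `buffer` into `payload`.
// NeedMore means the caller should read more bytes from the pipe first.
DecodeResult try_decode_frame(std::vector<std::uint8_t>& buffer,
                              std::string& payload) {
    if (buffer.size() < 4) return DecodeResult::NeedMore;
    std::uint32_t n = 0;
    for (int i = 0; i < 4; ++i)
        n |= static_cast<std::uint32_t>(buffer[i]) << (8 * i);
    if (n > kMaxFrameSize) return DecodeResult::Invalid;  // reject before allocating
    if (buffer.size() - 4 < n) return DecodeResult::NeedMore;
    payload.assign(buffer.begin() + 4, buffer.begin() + 4 + n);
    buffer.erase(buffer.begin(), buffer.begin() + 4 + n);
    return DecodeResult::Ok;
}
```

A byte-mode pipe can deliver half a message or several messages in one ReadFile, so the receiver appends whatever it reads to `buffer` and calls `try_decode_frame` in a loop until it returns NeedMore. Encoding the length byte by byte keeps the format independent of struct layout in either process.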

Design decisions:

Do you serialize everything through pipes, or use shared memory for large data?
How do you handle multiple concurrent requests?
How do you implement timeouts?
How do you secure the pipe (who can connect)?
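
One possible answer to the concurrency and timeout questions is a table of in-flight calls keyed by message ID. The sketch below assumes a single I/O thread owns the table (for example, the thread that services overlapped completions), so it uses no locks; `PendingCalls` and its methods are hypothetical names, and the caller passes in the current time so that expiry stays deterministic.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

class PendingCalls {
public:
    using Clock = std::chrono::steady_clock;
    using Callback = std::function<void(bool ok, const std::string& payload)>;

    // Registers a request and returns the ID to send along with it.
    std::uint64_t begin(Clock::time_point deadline, Callback onDone) {
        assert(onDone);
        const std::uint64_t id = nextId_++;
        pending_.emplace(id, Entry{deadline, std::move(onDone)});
        return id;
    }

    // Routes a response to its caller. Returns false for unknown IDs,
    // which covers replies that arrive after the call has timed out.
    bool complete(std::uint64_t id, const std::string& payload) {
        auto it = pending_.find(id);
        if (it == pending_.end()) return false;
        Callback cb = std::move(it->second.onDone);
        pending_.erase(it);
        cb(true, payload);
        return true;
    }

    // Fails every call whose deadline has passed; returns how many expired.
    std::size_t expire(Clock::time_point now) {
        std::vector<Callback> timedOut;
        for (auto it = pending_.begin(); it != pending_.end();) {
            if (it->second.deadline <= now) {
                timedOut.push_back(std::move(it->second.onDone));
                it = pending_.erase(it);
            } else {
                ++it;
            }
        }
        // Invoke after the loop so callbacks may safely start new calls.
        for (auto& cb : timedOut) cb(false, std::string());
        return timedOut.size();
    }

private:
    struct Entry {
        Clock::time_point deadline;
        Callback onDone;
    };
    std::uint64_t nextId_ = 1;
    std::unordered_map<std::uint64_t, Entry> pending_;
};
```

A multi-threaded client would need a mutex around the map, and a blocking `Call` could be layered on top by having the callback signal an event.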

Questions to ask yourself:

What’s the performance difference between pipes and shared memory?
How do you handle the server crashing mid-request?
What happens if client and server disagree on data layout (32-bit vs 64-bit process, struct padding, or byte order if the protocol ever leaves the machine)?
How would you add encryption?

Learning milestones:

Basic pipe communication works → You understand named pipes
Shared memory transfers work → You understand memory mapping
RPC-style calls work → You understand request/response protocols
Concurrent clients handled → You understand async I/O

Project 15: Mini Windows Task Manager

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C++
Alternative Programming Languages: C
Coolness Level: Level 3: Genuinely Clever
Business Potential: 1. The “Resume Gold”
Difficulty: Level 3: Advanced
Knowledge Area: System Monitoring / GUI
Software or Tool: Windows Task Manager
Main Book: “Windows Via C/C++” by Jeffrey Richter

What you’ll build: A functional Task Manager clone with: process list (name, PID, CPU, memory), thread view per process, performance graphs (CPU, memory over time), and the ability to terminate processes.

Why it teaches the execution model visually: Everything you’ve learned comes together in a visual tool. Processes, threads, performance counters, job objects—all visible and interactive.

Core challenges you’ll face:

Performance counter sampling → maps to PDH API or NtQuerySystemInformation
CPU usage calculation → maps to delta of process times / delta of system time
Memory metrics → maps to working set, private bytes, commit size
Real-time graph rendering → maps to GDI/Direct2D, ring buffer for history
Safe process termination → maps to TerminateProcess with proper rights
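
The "ring buffer for history" mentioned above can be quite small. This is a minimal fixed-capacity sketch; `History` is a hypothetical name, and the class simply overwrites the oldest sample once it is full, which is what a scrolling 60-second graph needs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Fixed-capacity history: pushing into a full buffer drops the oldest sample.
template <typename T, std::size_t N>
class History {
    static_assert(N > 0, "capacity must be positive");

public:
    void push(const T& sample) {
        data_[head_] = sample;
        head_ = (head_ + 1) % N;
        if (size_ < N) ++size_;
    }

    std::size_t size() const { return size_; }

    // Index 0 is the oldest retained sample, size() - 1 the newest.
    const T& operator[](std::size_t i) const {
        assert(i < size_);
        const std::size_t start = (head_ + N - size_) % N;
        return data_[(start + i) % N];
    }

private:
    std::array<T, N> data_{};
    std::size_t head_ = 0;  // slot the next push will write
    std::size_t size_ = 0;  // number of valid samples
};
```

With one sample per second, `History<float, 60>` holds the window shown in the mock-up, and the graph control can draw samples 0 through size() - 1 from left to right.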

Key Concepts:

Performance Counters: Microsoft PDH documentation
Process Memory: “Windows Internals, Part 1” Chapter 5 - Yosifovich et al.
CPU Time Calculation: “Windows System Programming” Chapter 6 - Johnson Hart
Win32 GUI: “Programming Windows” - Charles Petzold

Difficulty: Advanced
Time estimate: 2-3 weeks
Prerequisites: Projects 1-6 completed, basic Win32 GUI or willingness to learn

Real world outcome:

┌─────────────────────────────────────────────────────────────────────┐
│  Mini Task Manager                                           [_][X] │
├─────────────────────────────────────────────────────────────────────┤
│ [Processes] [Performance] [Details]                                 │
├─────────────────────────────────────────────────────────────────────┤
│ Name                PID    CPU%    Memory   Threads   Description   │
├─────────────────────────────────────────────────────────────────────┤
│ chrome.exe         5632   12.5%   245 MB   34        Google Chrome  │
│   └─ chrome.exe    5640    8.2%    89 MB   12        GPU Process    │
│   └─ chrome.exe    5712    2.1%    45 MB    8        Renderer       │
│ explorer.exe       4128    1.2%    98 MB   47        Windows Explor │
│ System             4       0.8%    24 KB   142       NT Kernel      │
│ ...                                                                 │
├─────────────────────────────────────────────────────────────────────┤
│ CPU Usage: ████████░░░░░░░░ 48%    Memory: ███████████░░░ 72%       │
│                                                                      │
│   100%│    ╭─╮                                                       │
│       │   ╭╯ ╰╮    ╭╮                                               │
│    50%│──╮│   ╰────╯╰─────────                                      │
│       │  ╰╯                                                          │
│     0%└────────────────────────────────────────                     │
│        -60s                              now                         │
└─────────────────────────────────────────────────────────────────────┘

Implementation Hints:

Architecture:

Components:
1. Data Collection (background thread)
   - Sample every 1 second
   - Use Toolhelp32 for process list
   - Use GetProcessTimes for CPU calculation
   - Store history in ring buffer

2. CPU Calculation:
   - Get system times: GetSystemTimes
   - Get process times: GetProcessTimes
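   - CPU% = (process_kernel + process_user) delta / (system_kernel + system_user) delta * 100
   - GetSystemTimes sums over all cores and its kernel time includes idle time,
     so this yields 0-100% of total capacity; dividing by wall-clock time
     instead can exceed 100% for a multi-threaded process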

3. Memory Metrics:
   - GetProcessMemoryInfo → PROCESS_MEMORY_COUNTERS_EX
   - WorkingSetSize = physical RAM currently in use
   - PrivateUsage = committed private memory (only in the _EX structure)
   - GlobalMemoryStatusEx for system totals

4. GUI:
   - ListView for process list
   - Custom control for graphs (owner-draw or Direct2D)
   - Timer for periodic refresh

5. Process Tree:
   - Build parent-child relationships
   - Display as indented list or actual tree

6. Actions:
   - End Task: TerminateProcess (need PROCESS_TERMINATE)
   - Open file location: QueryFullProcessImageName (or GetModuleFileNameEx,
     which needs more access rights) + ShellExecute
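
The CPU calculation in component 2 comes down to arithmetic on two samples. The sketch below assumes the FILETIME values from GetProcessTimes and GetSystemTimes have already been converted to 64-bit tick counts (100 ns units); `CpuSample`, `ticks`, and `cpu_percent` are hypothetical names, and no Windows headers are needed to try it.

```cpp
#include <cassert>
#include <cstdint>

// Combines the two 32-bit halves of a FILETIME into one 100 ns tick count.
constexpr std::uint64_t ticks(std::uint32_t high, std::uint32_t low) {
    return (static_cast<std::uint64_t>(high) << 32) | low;
}

// One sampling point. System values are summed across all processors,
// and system kernel time includes idle time (as GetSystemTimes reports it).
struct CpuSample {
    std::uint64_t processKernel;
    std::uint64_t processUser;
    std::uint64_t systemKernel;
    std::uint64_t systemUser;
};

// Share of total machine capacity used by the process between two samples.
double cpu_percent(const CpuSample& prev, const CpuSample& cur) {
    const std::uint64_t processDelta =
        (cur.processKernel - prev.processKernel) +
        (cur.processUser - prev.processUser);
    const std::uint64_t systemDelta =
        (cur.systemKernel - prev.systemKernel) +
        (cur.systemUser - prev.systemUser);
    if (systemDelta == 0) return 0.0;  // samples taken too close together
    return 100.0 * static_cast<double>(processDelta) /
           static_cast<double>(systemDelta);
}
```

As a worked example: on a 4-core machine, a one-second interval adds 40,000,000 ticks of system time. A process that consumed 5,000,000 ticks (half a second of one core) in that interval comes out at 12.5%.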

Performance tips:

Don’t rebuild the whole list on every refresh; diff against the previous snapshot and update only the rows that changed
Use batch updates for the ListView
Double-buffer graph drawing to avoid flicker
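
One way to keep ListView updates cheap is to compare each new snapshot with the previous one and touch only the rows that changed. The sketch below diffs two sorted PID lists; `SnapshotDiff` and `diff_snapshots` are hypothetical names. Because Windows can reuse a PID after a process exits, a fuller version would probably key on PID plus creation time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <vector>

struct SnapshotDiff {
    std::vector<std::uint32_t> started;  // PIDs only in the new snapshot
    std::vector<std::uint32_t> exited;   // PIDs only in the old snapshot
};

// Both inputs must be sorted ascending (sort the PIDs after enumeration).
SnapshotDiff diff_snapshots(const std::vector<std::uint32_t>& previous,
                            const std::vector<std::uint32_t>& current) {
    assert(std::is_sorted(previous.begin(), previous.end()));
    assert(std::is_sorted(current.begin(), current.end()));
    SnapshotDiff diff;
    std::set_difference(current.begin(), current.end(),
                        previous.begin(), previous.end(),
                        std::back_inserter(diff.started));
    std::set_difference(previous.begin(), previous.end(),
                        current.begin(), current.end(),
                        std::back_inserter(diff.exited));
    return diff;
}
```

The GUI thread can then insert rows for `started`, delete rows for `exited`, and refresh only the CPU and memory columns of the rows that remain.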

Questions to ask yourself:

How do you handle processes you can’t open (access denied)?
How do you calculate per-core CPU usage?
What’s the difference between working set and private bytes?
How would you add network usage per process?

Learning milestones:

Process list displays → You understand enumeration
CPU usage is accurate → You understand time calculations
Graphs work smoothly → You understand visualization
End task works → You understand process control

Project Comparison Table

Project	Difficulty	Time	Depth of Understanding	Fun Factor
1. Process & Thread Inspector	Intermediate	1-2 weeks	⭐⭐⭐	⭐⭐⭐
2. Thread Sync Playground	Intermediate	1-2 weeks	⭐⭐⭐⭐	⭐⭐⭐
3. Thread Pool from Scratch	Advanced	1 month+	⭐⭐⭐⭐⭐	⭐⭐⭐⭐
4. Fiber Coroutine Library	Advanced	2-3 weeks	⭐⭐⭐⭐⭐	⭐⭐⭐⭐
5. Job Object Sandbox	Advanced	2-3 weeks	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐
6. Process Relationship Visualizer	Intermediate	1-2 weeks	⭐⭐⭐	⭐⭐⭐
7. Thread Context Debugger	Expert	1 month+	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐
8. Mini Process Monitor (ETW)	Expert	2-3 weeks	⭐⭐⭐⭐⭐	⭐⭐⭐⭐
9. User-Mode Scheduler (UMS)	Master	1 month+	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐
10. Process Injection Detector	Expert	2-3 weeks	⭐⭐⭐⭐	⭐⭐⭐⭐⭐
11. Fiber Game Engine Scheduler	Expert	1 month+	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐
12. Windows Service + Job Objects	Advanced	2-3 weeks	⭐⭐⭐⭐	⭐⭐⭐⭐
13. PE Loader and Executor	Master	1 month+	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐
14. Cross-Process IPC Framework	Advanced	2-3 weeks	⭐⭐⭐⭐	⭐⭐⭐⭐
15. Mini Task Manager	Advanced	2-3 weeks	⭐⭐⭐⭐	⭐⭐⭐⭐⭐

Recommended Learning Path

Phase 1: Foundations (4-6 weeks)

Project 1: Process & Thread Inspector - See the execution model
Project 2: Thread Synchronization Playground - Understand synchronization
Project 6: Process Relationship Visualizer - Understand process trees

Phase 2: Threading Deep Dive (6-8 weeks)

Project 3: Thread Pool from Scratch - Build serious threading infrastructure
Project 4: Fiber Coroutine Library - Master cooperative scheduling
Project 7: Thread Context Debugger - Understand thread internals

Phase 3: Job Objects & Isolation (4-6 weeks)

Project 5: Job Object Sandbox - Master resource isolation
Project 12: Windows Service + Job Objects - Real-world integration

Phase 4: Advanced Topics (8-12 weeks)

Project 8: Mini Process Monitor (ETW) - System-level monitoring
Project 9: User-Mode Scheduler (UMS) - Apex of threading
Project 13: PE Loader and Executor - Understand the loader

Phase 5: Capstone Projects (Choose 2-3)

Project 10: Process Injection Detector - Security application
Project 11: Fiber Game Engine Scheduler - High-performance computing
Project 14: Cross-Process IPC Framework - System integration
Project 15: Mini Task Manager - Everything together

Final Capstone: Build a Container Runtime

File: LEARN_WINDOWS_SYSTEM_PROGRAMMING_EXECUTION_MODEL.md
Main Programming Language: C++
Alternative Programming Languages: C, Rust
Coolness Level: Level 5: Pure Magic (Super Cool)
Business Potential: 5. The “Industry Disruptor”
Difficulty: Level 5: Master
Knowledge Area: Virtualization / Containerization
Software or Tool: Docker for Windows, Windows Containers
Main Book: “Windows Internals, Part 1” + “Windows Internals, Part 2”

What you’ll build: A minimal Windows container runtime that isolates processes using: Job Objects for resource limits, process namespaces (via Silos), filesystem virtualization, and registry redirection.

Why this is the ultimate project: A container runtime draws on nearly everything you’ve learned: processes, threads, job objects, security, the Windows loader, IPC, and more. This is what mastery looks like.

Core challenges you’ll face:

Process isolation → maps to Job Objects + Silo (Server Silo for true isolation)
Filesystem virtualization → maps to minifilter driver or bind mounts
Registry virtualization → maps to registry hive loading, key remapping
Network isolation → maps to Windows Filtering Platform or HNS
Container image handling → maps to layer extraction, copy-on-write

Key Concepts:

Windows Containers: Microsoft container documentation
Silos: “Windows Internals, Part 1” Chapter 3 - Yosifovich et al.
HCS (Host Compute Service): Windows Container APIs
WC API: Low-level container primitives

Difficulty: Master
Time estimate: 3-6 months
Prerequisites: ALL previous projects completed, deep Windows internals knowledge

Real world outcome:

C:\> mini_container.exe run --memory 512m --cpus 2 myimage

[Container] Pulling image: myimage
[Container] Extracting 3 layers...
[Container] Creating container filesystem at C:\Containers\abc123
[Container] Creating Server Silo
[Container] Setting job limits: Memory=512MB, CPU=2 cores
[Container] Starting init process in container...

Container abc123 is running.

C:\> mini_container.exe exec abc123 cmd
Microsoft Windows [Version 10.0.19041]
(c) Container Environment

C:\> whoami
ContainerUser

C:\> dir C:\
 Volume in drive C is ContainerRoot
 Directory of C:\

05/15/2024  10:00 AM    <DIR>          Windows
05/15/2024  10:00 AM    <DIR>          Program Files
05/15/2024  10:01 AM    <DIR>          Users
              0 File(s)              0 bytes

C:\> exit

C:\> mini_container.exe stop abc123
[Container] Terminating container processes...
[Container] Cleanup complete.

Implementation Hints:

This is a massive project. Start with these milestones:

Milestone 1: Process-in-Job
- Create job object with limits
- Start process in job
- Monitor and restrict

Milestone 2: Filesystem Isolation
- Create overlay filesystem (or use bind mounts)
- Redirect filesystem calls (minifilter driver, or API hooking for a user-mode prototype)
- Copy-on-write layer

Milestone 3: Server Silo (if available)
- Use HCS/HCN APIs or lower-level NtCreate* calls
- True namespace isolation

Milestone 4: Networking
- Create virtual switch (HNS)
- Assign container to network
- NAT or bridge mode

Milestone 5: Image format
- Layer-based images (like Docker)
- Extract and apply layers
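
For Milestone 1, the `--memory 512m --cpus 2` flags from the sample session have to become job limits. The sketch below shows one plausible translation. The memory value is meant for JobMemoryLimit in JOBOBJECT_EXTENDED_LIMIT_INFORMATION, and the CPU value for CpuRate in JOBOBJECT_CPU_RATE_CONTROL_INFORMATION, which is expressed in cycles per 10,000. The flag syntax, the function names, and the choice to treat `--cpus` as a fraction of all logical processors are assumptions made for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Parses sizes such as "512m" or "2g" into bytes. Returns 0 on invalid input.
// Overflow checking is omitted to keep the sketch short.
std::uint64_t parse_memory_limit(const std::string& text) {
    std::uint64_t value = 0;
    std::size_t i = 0;
    while (i < text.size() && std::isdigit(static_cast<unsigned char>(text[i]))) {
        value = value * 10 + static_cast<std::uint64_t>(text[i] - '0');
        ++i;
    }
    if (i == 0) return 0;                  // no digits at all
    if (i == text.size()) return value;    // no suffix: plain bytes
    if (i + 1 != text.size()) return 0;    // at most one suffix character
    switch (std::tolower(static_cast<unsigned char>(text[i]))) {
        case 'k': return value << 10;
        case 'm': return value << 20;
        case 'g': return value << 30;
        default:  return 0;
    }
}

// Converts a CPU count into a CpuRate value (1..10000, cycles per 10,000).
// Returns 0 when no cap was requested.
std::uint32_t cpu_rate_for(double cpus, unsigned logicalProcessors) {
    assert(logicalProcessors > 0);
    if (cpus <= 0.0) return 0;
    double rate = cpus / logicalProcessors * 10000.0;
    if (rate > 10000.0) rate = 10000.0;  // cannot exceed the whole machine
    if (rate < 1.0) rate = 1.0;          // smallest non-zero rate
    return static_cast<std::uint32_t>(rate);
}
```

On an 8-processor host, `--memory 512m --cpus 2` yields a limit of 536,870,912 bytes and a CpuRate of 2500. The runtime would then apply both with SetInformationJobObject, using JOB_OBJECT_LIMIT_JOB_MEMORY for the memory cap and JOB_OBJECT_CPU_RATE_CONTROL_ENABLE | JOB_OBJECT_CPU_RATE_CONTROL_HARD_CAP for the CPU cap.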

Resources:

Windows Container source (some is open)
HCS API documentation
RunHCS (Microsoft’s runtime)
hcsshim (Go library for HCS)

Learning milestones:

Process isolation works → Foundation complete
Filesystem appears separate → Virtualization working
Network isolated → Full isolation achieved
Images work → Distribution solved

Summary

#	Project	Main Language
1	Process & Thread Inspector	C
2	Thread Synchronization Playground	C
3	Thread Pool from Scratch	C++
4	Fiber-Based Coroutine Library	C
5	Job Object Sandbox	C
6	Process Relationship Visualizer	C++
7	Thread Context Debugger	C
8	Mini Process Monitor (ETW)	C++
9	User-Mode Scheduler (UMS)	C
10	Process Injection Detector	C
11	Fiber-Based Game Engine Scheduler	C++
12	Windows Service with Job Objects	C++
13	PE Loader and Executor	C
14	Cross-Process Communication Framework	C++
15	Mini Windows Task Manager	C++
Capstone	Container Runtime	C++

Essential Resources

Books

“Windows Internals, Part 1” by Yosifovich, Ionescu, Russinovich, Solomon - THE definitive reference
“Windows Internals, Part 2” (7th edition) - Continues with system mechanisms, virtualization, management and tracing, caching and file systems, and startup/shutdown
“Windows Via C/C++” by Jeffrey Richter - Practical API programming
“Windows System Programming” by Johnson Hart - Classical systems programming

Online Resources

Microsoft Docs: Official API documentation
ReactOS Source: Open-source Windows implementation (great for learning)
The Old New Thing (blog): Raymond Chen’s insights into Windows design
Windows Internals Training: Pavel Yosifovich’s courses

Tools

WinDbg: Kernel and user-mode debugging
Process Monitor/Explorer: Sysinternals tools
API Monitor: See API calls in real-time
x64dbg: User-mode debugging

Good luck on your Windows internals journey! 🖥️