Project 7: NUMA Memory Migration Tool
Build a CLI tool that inspects page locations for a process and migrates them to specific NUMA nodes, then verifies the result.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 3: Advanced |
| Time Estimate | 2 weeks |
| Main Programming Language | C |
| Alternative Programming Languages | Rust, Python |
| Coolness Level | Level 3: Genuinely Useful |
| Business Potential | 2. The “Internal Tool” |
| Prerequisites | Linux syscalls, virtual memory basics, NUMA policies |
| Key Topics | move_pages, migrate_pages, page tables, NUMA balancing |
1. Learning Objectives
By completing this project, you will:
- Explain how Linux tracks the NUMA location of pages.
- Use move_pages to query and migrate page locations.
- Handle errors like EBUSY and EACCES gracefully.
- Verify migration using /proc/<pid>/numa_maps.
- Understand how NUMA balancing can move pages back.
- Provide deterministic migration demos with fixed page ranges.
2. All Theory Needed (Per-Concept Breakdown)
2.1 Linux Virtual Memory, Page Tables, and Page Migration
Fundamentals
Linux manages memory in pages. Each page is mapped from a virtual address to a physical frame, and that frame belongs to a NUMA node. The kernel can migrate pages between nodes to improve locality or balance load. The move_pages system call lets you query or request migration of pages in a process’s address space. Understanding page tables, page faults, and how pages are tracked in the kernel is essential to building a reliable migration tool.
Deep Dive into the Concept
Virtual memory provides each process with a large, contiguous address space, but physical memory is divided into page frames. The page table translates virtual addresses to physical frames. On a NUMA system, each frame is allocated from a specific node. When a process first accesses a page, the kernel handles a page fault and allocates a frame according to the current NUMA policy. Under the default policy the frame comes from the node of the CPU that touched the page first, which is why first-touch placement is so important.
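Linux tracks page locations and can migrate them for better locality or memory balancing. The move_pages system call can be used in two ways: query mode (where nodes is NULL) and migrate mode (where nodes is an array of target node IDs). In query mode, the kernel fills the status array with the node ID where each page currently resides, or with a negative errno value such as -ENOENT for a page that is not present. In migrate mode, the kernel attempts to move those pages to the requested nodes and reports the outcome per page in the same array. Individual pages can fail for several reasons: the page is busy, for example under I/O (-EBUSY); the page is mapped by multiple processes and MPOL_MF_MOVE_ALL was not specified (-EACCES); or the target node lacks free memory (-ENOMEM). A caller that is not allowed to act on the target process at all sees the whole call fail with errno set to EPERM.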
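Query mode can be tried on the calling process without any extra library. The following is a minimal sketch, assuming a Linux system where SYS_move_pages is exposed through <sys/syscall.h>; query_page_nodes and node_of_address are hypothetical helper names chosen for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Query mode: nodes is NULL, so nothing is moved. On success each status[i]
 * holds a node ID (>= 0) or a negated errno value. pid 0 means "this process". */
long query_page_nodes(void **pages, unsigned long count, int *status)
{
    assert(pages != NULL && status != NULL);
    return syscall(SYS_move_pages, 0, count, pages, NULL, status, 0);
}

/* Hypothetical convenience wrapper: node holding the page that contains addr,
 * or a negated errno value if the call or the page lookup failed. */
int node_of_address(const void *addr)
{
    uintptr_t page_size = (uintptr_t)sysconf(_SC_PAGESIZE);
    void *page = (void *)((uintptr_t)addr & ~(page_size - 1));
    int status = -EFAULT;

    if (query_page_nodes(&page, 1, &status) < 0)
        return -errno;          /* whole call failed, e.g. EPERM or ENOSYS */
    return status;              /* node ID, or per-page negated errno */
}
```

On a single-node machine the result is normally 0 for any touched page. In a sandbox that filters the syscall, expect a negative value instead, which is why the wrapper keeps the errno rather than collapsing it to -1.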
Migration is not instantaneous. The kernel may need to copy the page contents, update page tables, and synchronize with running threads. This can be expensive, especially for large ranges. Moreover, migration may be reversed by automatic NUMA balancing if the process continues to access pages from a different node. Your tool should therefore include a verification step and possibly a “pin” or “policy” option to prevent immediate remigration.
It is also important to understand the granularity: move_pages operates on pages, not on large blocks. That means your tool should align addresses to page boundaries, and you should expect partial success. A robust report should include the number of pages moved, failed, busy, and already local. These statistics are critical for users to interpret the result.
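The report counters can be derived by comparing a query taken before the migration with the status array returned by the migrate call. This is a sketch under that assumption; PageTally and tally_pages are hypothetical names, and treating every non-EBUSY mismatch as "failed" is one possible convention rather than the only one.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical counters for the migration report. */
typedef struct {
    long moved;      /* was elsewhere, now on the target node */
    long unchanged;  /* already on the target node before the call */
    long busy;       /* -EBUSY: worth retrying later */
    long failed;     /* any other error, or still on the wrong node */
} PageTally;

/* before[] comes from a query-mode call, after[] from the migrate call. */
PageTally tally_pages(const int *before, const int *after, size_t n, int target)
{
    assert(before != NULL && after != NULL);
    PageTally t = {0, 0, 0, 0};

    for (size_t i = 0; i < n; i++) {
        if (after[i] == -EBUSY)
            t.busy++;
        else if (after[i] != target)
            t.failed++;
        else if (before[i] == target)
            t.unchanged++;
        else
            t.moved++;
    }
    return t;
}
```

Keeping "unchanged" separate from "moved" avoids overstating the effect of a run on a range that was mostly local already.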
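Finally, you must consider permissions. A process needs no special privileges to query or move its own pages. Acting on another process is subject to a permission check: depending on the kernel version this means CAP_SYS_NICE, matching user IDs, or a ptrace-style access check, and MPOL_MF_MOVE_ALL requires CAP_SYS_NICE in every case. Therefore, your tool should allow migrating its own process without special privileges, and it should provide clear error messages when it is refused access to another process.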
How This Fits in the Project
This concept defines the system calls and semantics your tool relies on. It directly informs how you query, migrate, and report page locations.
Definitions & Key Terms
- Page table -> Data structure mapping virtual to physical addresses.
- Page frame -> Physical memory unit backing a page.
- move_pages -> Syscall to query or migrate pages.
- Migration -> Copying a page to a different NUMA node.
- EBUSY -> Error indicating a page is busy/locked.
Mental Model Diagram (ASCII)
Virtual Address -> Page Table -> Physical Frame (Node 0)
move_pages(query) -> returns node ids
move_pages(migrate) -> request move to Node 1
How It Works (Step-by-Step)
- Align the target address range to page boundaries.
- Build an array of page addresses.
- Call move_pages with nodes=NULL to query locations.
- Call move_pages with target node IDs to migrate.
- Re-query to verify new placement.
Invariants: Addresses must be page-aligned; the nodes and status arrays each hold exactly one entry per page.
Failure modes: EBUSY, EACCES, ENOMEM, partial migration.
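The first two steps, aligning the range and building the address array, can be sketched as follows. build_page_list is a hypothetical helper; it assumes the page size is a power of two (as returned by sysconf(_SC_PAGESIZE)) and expands the range outward so that partially covered pages are included.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Expand [start, start + len) outward to page boundaries and return one
 * address per page in *out (caller frees). Returns the page count, or 0 on
 * empty input or allocation failure. page_size must be a power of two. */
size_t build_page_list(uintptr_t start, size_t len, size_t page_size, void ***out)
{
    assert(out != NULL && page_size != 0 && (page_size & (page_size - 1)) == 0);
    *out = NULL;
    if (len == 0)
        return 0;

    uintptr_t mask  = (uintptr_t)page_size - 1;
    uintptr_t first = start & ~mask;                 /* round down */
    uintptr_t end   = (start + len + mask) & ~mask;  /* round up   */
    size_t count = (end - first) / page_size;

    void **list = malloc(count * sizeof *list);
    if (list == NULL)
        return 0;
    for (size_t i = 0; i < count; i++)
        list[i] = (void *)(first + i * page_size);

    *out = list;
    return count;
}
```

For example, a 4096-byte range starting at address 4097 straddles two 4 KiB pages, so the list contains the pages at 4096 and 8192.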
Minimal Concrete Example
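#include <numaif.h>                       /* move_pages(), MPOL_MF_MOVE; link with -lnuma */
long page_size = sysconf(_SC_PAGESIZE);
unsigned long pages = len / page_size;    /* len: byte length, a multiple of page_size */
void *addrs[pages];
int status[pages];
for (unsigned long i = 0; i < pages; i++)
    addrs[i] = (char *)base + i * page_size;   /* base: page-aligned start (assumed name) */
// Query: nodes is NULL, status[i] receives the current node or a negated errno
move_pages(pid, pages, addrs, NULL, status, 0);
// Migrate: one target node per page
int nodes[pages];
for (unsigned long i = 0; i < pages; i++) nodes[i] = target_node;
move_pages(pid, pages, addrs, nodes, status, MPOL_MF_MOVE);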
Common Misconceptions
- “Migration is always successful” -> Many pages can be busy or pinned.
- “move_pages works without privileges” -> Guaranteed only for the calling process; other targets are subject to permission checks.
- “Migration is permanent” -> NUMA balancing can move pages back.
Check-Your-Understanding Questions
- Why must addresses be page-aligned for move_pages?
- What does EBUSY mean in migration?
- How can you verify page locations after migration?
Check-Your-Understanding Answers
- move_pages operates on pages, not arbitrary addresses.
- The page is locked or in use, so it cannot be moved.
- Use move_pages query mode or /proc/<pid>/numa_maps.
Real-World Applications
- Repairing bad placement after a process migrates.
- Memory tuning for long-running services.
Where You’ll Apply It
- In this project: Sec. 3.2, Sec. 3.7, Sec. 5.10.
- Also used in: P05-numa-aware-memory-allocator.
References
- “Operating Systems: Three Easy Pieces” – Ch. 13-21
- “The Linux Programming Interface” – Ch. 6
Key Insights
Page migration is powerful but partial; your tool must expect and report failures.
Summary
Linux can migrate pages between NUMA nodes, but the operation is constrained by page state and permissions. move_pages provides both query and migration modes, and robust tools must handle partial success and verification.
Homework/Exercises to Practice the Concept
- Use move_pages to query your own process’s stack pages.
- Migrate a small heap region to a different node.
- Compare the results with /proc/<pid>/numa_maps.
Solutions to the Homework/Exercises
- The status array should show the node IDs for stack pages.
- Migration will succeed if pages are not locked and target node has space.
- The node distribution should match the migration report.
2.2 NUMA Balancing, Policies, and Verification
Fundamentals
Linux has automatic NUMA balancing, which moves pages based on observed access patterns. Policies like bind and preferred also influence placement. After you migrate pages, NUMA balancing might move them back if threads continue to access them from another node. Verification is therefore essential: you must re-check page locations and report whether the migration “stuck.” A good migration tool surfaces both the immediate result and the expected stability of placement.
Deep Dive into the Concept
Automatic NUMA balancing is a kernel feature that samples memory accesses and migrates pages closer to the accessing CPU. It can improve performance in workloads that move across nodes, but it can also fight against explicit placement. If you migrate pages to node 1 but threads continue to run on node 0, the kernel may move those pages back to node 0 over time. This can be surprising if you expect migration to be permanent.
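Policies such as bind, preferred, and interleave still matter, but not by blocking the move itself. Memory policy does not constrain the destination nodes used by move_pages, so pages of a process bound to node 0 can be moved to node 1; a request can still be rejected if the target node lies outside the process’s cpuset. The policy keeps governing new allocations, however, so pages faulted in later are placed according to the policy rather than next to the pages you moved, and NUMA balancing can still pull migrated pages back. To make migration stick, you can change the policy to match the new placement or pin threads to the target node. This is why your tool should include options like --policy=bind or --pin-threads when migrating.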
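Pinning threads to a node first requires knowing which CPUs belong to it. Linux exposes this as a CPU list in /sys/devices/system/node/node<N>/cpulist, in a format like "0-3,8-11". The sketch below parses such a string; parse_cpulist is a hypothetical helper, and reading the file and applying the affinity are left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Parse a kernel CPU list such as "0-3,8-11\n" into cpus[]. Returns the
 * number of CPUs written, or -1 on malformed input or if max is exceeded. */
int parse_cpulist(const char *text, int *cpus, int max)
{
    assert(text != NULL && cpus != NULL && max > 0);
    int count = 0;
    const char *p = text;

    while (*p != '\0' && *p != '\n') {
        char *end;
        long lo = strtol(p, &end, 10);
        if (end == p || lo < 0)
            return -1;
        long hi = lo;
        if (*end == '-') {                  /* range "lo-hi" */
            const char *q = end + 1;
            hi = strtol(q, &end, 10);
            if (end == q || hi < lo)
                return -1;
        }
        for (long cpu = lo; cpu <= hi; cpu++) {
            if (count == max)
                return -1;
            cpus[count++] = (int)cpu;
        }
        if (*end == ',')
            end++;                          /* next list element */
        else if (*end != '\0' && *end != '\n')
            return -1;                      /* unexpected character */
        p = end;
    }
    return count;
}
```

Each returned CPU can then be added to a cpu_set_t with CPU_SET and applied with sched_setaffinity, which is one way a --pin-threads option could be implemented.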
Verification is best done through multiple sources: move_pages query mode provides immediate location, and /proc/<pid>/numa_maps gives a higher-level summary of where memory is allocated. numastat -p provides aggregated stats. Your tool can combine these to produce a clear report: how many pages moved, how many failed, and how many later drifted back.
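Each line of /proc/<pid>/numa_maps carries per-node page counts as N<node>=<pages> tokens. A minimal sketch of extracting them from one line follows; numa_maps_node_pages is a hypothetical helper, and it assumes whitespace-separated tokens (mapped file names containing spaces are not handled).

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Sum the N<node>=<pages> tokens of one numa_maps line into per_node[].
 * Returns the total page count found for nodes below max_nodes. */
long numa_maps_node_pages(const char *line, long *per_node, int max_nodes)
{
    assert(line != NULL && per_node != NULL && max_nodes > 0);
    memset(per_node, 0, sizeof(long) * (size_t)max_nodes);
    long total = 0;
    const char *p = line;

    while (*p != '\0') {
        /* Skip separators, then remember where the token starts. */
        while (*p == ' ' || *p == '\t' || *p == '\n')
            p++;
        const char *tok = p;
        while (*p != '\0' && *p != ' ' && *p != '\t' && *p != '\n')
            p++;

        if (tok[0] == 'N' && isdigit((unsigned char)tok[1])) {
            char *end;
            long node = strtol(tok + 1, &end, 10);
            if (*end == '=' && node < max_nodes) {
                long pages = strtol(end + 1, NULL, 10);
                per_node[node] += pages;
                total += pages;
            }
        }
    }
    return total;
}
```

Calling this for every line whose address falls inside the migrated range gives a second opinion on the distribution reported by move_pages query mode.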
Another important factor is permissions. Migration of another process often requires CAP_SYS_NICE. When you cannot migrate, you should still provide a query mode so users can inspect page distribution. This makes the tool useful even without elevated privileges.
Finally, determinism: for repeatable tests, you should provide a fixture mode that allocates a predictable memory range in a helper process and then migrates it. This ensures that results do not depend on random allocation patterns or unrelated memory activity.
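A fixture of this kind mostly needs an anonymous mapping whose pages have all been touched, so that every page has a physical frame before the tool looks at it. The sketch below shows one way to do that; fixture_alloc and the 0xA5 fill byte are illustrative choices, not part of the project specification.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map an anonymous region and write one byte per page so that every page is
 * faulted in. Returns NULL if the mapping fails; release with munmap(). */
void *fixture_alloc(size_t bytes)
{
    assert(bytes > 0);
    void *region = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                        MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (region == MAP_FAILED)
        return NULL;

    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    volatile unsigned char *touch = region;
    for (size_t off = 0; off < bytes; off += page_size)
        touch[off] = 0xA5;      /* first touch decides the initial node */
    return region;
}
```

A helper process built around this would print its PID and the region address, then sleep, so the tool has a stable, known range to query and migrate.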
How This Fits in the Project
This concept informs how you verify and interpret migration results, and why you might need to adjust policy or affinity for stable outcomes.
Definitions & Key Terms
- NUMA balancing -> Kernel feature that migrates pages based on access patterns.
- Policy enforcement -> Applying bind/preferred rules to memory ranges.
- Verification -> Checking whether migration succeeded and persisted.
- Drift -> Pages moving back due to access patterns.
Mental Model Diagram (ASCII)
Migrate pages -> Node 1
Threads still on Node 0
Kernel NUMA balancing -> pages drift back to Node 0
How It Works (Step-by-Step)
- Query page locations.
- Migrate pages to target node.
- Optionally pin threads and set policy.
- Re-query after a delay to detect drift.
- Report stable vs migrated pages.
Invariants: Migration success is time-sensitive; placement can change.
Failure modes: NUMA balancing override, policy mismatch, permission errors.
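The delayed re-query step boils down to comparing two status snapshots. A minimal sketch, assuming both arrays come from move_pages query mode over the same page list; count_drifted is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Count pages that were on the target node right after migration but are no
 * longer reported there in the delayed snapshot. */
size_t count_drifted(const int *after_migrate, const int *after_delay,
                     size_t n, int target)
{
    assert(after_migrate != NULL && after_delay != NULL);
    size_t drifted = 0;

    for (size_t i = 0; i < n; i++)
        if (after_migrate[i] == target && after_delay[i] != target)
            drifted++;
    return drifted;
}
```

Pages that never reached the target node are deliberately not counted here, since they belong in the failure statistics rather than in the drift figure.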
Minimal Concrete Example
# Disable automatic balancing for deterministic tests
echo 0 | sudo tee /proc/sys/kernel/numa_balancing
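The tool itself can read the same sysctl file to warn the user when balancing is active. This is a small sketch; read_numa_balancing is a hypothetical helper, and the path is passed in so the function can be exercised against any file.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the integer stored in the given sysctl file, or -1 if it cannot be
 * read (for example on kernels built without NUMA balancing). */
int read_numa_balancing(const char *path)
{
    assert(path != NULL);
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int value = -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}
```

Called with "/proc/sys/kernel/numa_balancing", a result of 0 means balancing is off; any positive value suggests the verification step should expect drift.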
Common Misconceptions
- “Migration is permanent” -> Balancing may move pages back.
- “Policies don’t matter” -> Policies still steer new allocations, and a cpuset can reject a target node.
- “One verification is enough” -> you should check after a delay.
Check-Your-Understanding Questions
- Why might pages move back after migration?
- How can you make migration more stable?
- What tools can verify placement?
Check-Your-Understanding Answers
- NUMA balancing relocates pages closer to active threads.
- Pin threads and adjust memory policy.
- move_pages query, /proc/<pid>/numa_maps, numastat -p.
Real-World Applications
- Fixing placement in long-running services after load changes.
- Tuning performance in memory-bound analytics jobs.
Where You’ll Apply It
- In this project: Sec. 3.7, Sec. 5.8.
- Also used in: P09-numa-aware-thread-pool, P10-numa-aware-database-buffer-pool.
References
- “Operating Systems: Three Easy Pieces” – Ch. 13-21
- “Linux Kernel Development” (Love) – Ch. 11
Key Insights
Migration is a negotiation between your intent and the kernel’s balancing policy.
Summary
NUMA balancing can undo explicit migration, and memory policy keeps steering new allocations. A good tool verifies not only immediate success but also stability over time, and it provides guidance on how to make placement stick.
Homework/Exercises to Practice the Concept
- Migrate pages and re-check after 10 seconds.
- Disable NUMA balancing and compare results.
- Pin threads to target node and observe reduced drift.
Solutions to the Homework/Exercises
- Some pages may drift back if threads stay on the old node.
- With balancing off, migration is more stable.
- Pinning threads reduces drift and improves locality.
3. Project Specification
3.1 What You Will Build
A CLI tool numa-migrate that can query and migrate pages of a process, report migration statistics, and verify results after a delay.
Included: query mode, migrate mode, verification, deterministic test helper. Excluded: kernel-level balancing control (optional).
3.2 Functional Requirements
- Query page locations for a PID and address range.
- Migrate pages to a target node using move_pages.
- Report counts: moved, failed, busy, unchanged.
- Verify placement via re-query and numa_maps.
- Support deterministic helper process with fixed allocation.
- Provide clear errors for permission issues.
3.3 Non-Functional Requirements
- Performance: handle millions of pages without excessive overhead.
- Reliability: handle partial failures gracefully.
- Usability: clear output and exit codes.
3.4 Example Usage / Output
$ ./numa-migrate --pid 1234 --range 1G --to-node 1
Pages examined: 262144
Moved: 248901
Busy: 1023
Failed: 121
3.5 Data Formats / Schemas / Protocols
JSON output:
{
"pid": 1234,
"pages": 262144,
"moved": 248901,
"busy": 1023,
"failed": 121,
"target_node": 1
}
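One way to emit this object without a JSON library is a single snprintf call, since every field is a plain integer. The sketch below assumes the MigStats layout from Sec. 4.3; stats_to_json is a hypothetical name, and it writes the object on one line rather than pretty-printed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    long pages;
    long moved;
    long busy;
    long failed;
} MigStats;

/* Writes one JSON object into buf. Returns 0 on success, -1 if the output
 * did not fit into size bytes. */
int stats_to_json(char *buf, size_t size, int pid, int target_node,
                  const MigStats *s)
{
    assert(buf != NULL && s != NULL);
    int n = snprintf(buf, size,
                     "{\"pid\": %d, \"pages\": %ld, \"moved\": %ld, "
                     "\"busy\": %ld, \"failed\": %ld, \"target_node\": %d}",
                     pid, s->pages, s->moved, s->busy, s->failed, target_node);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Because no field is a free-form string, no escaping is needed; that would change if the report ever included process names or paths.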
3.6 Edge Cases
- Process exits mid-migration.
- Pages are locked or pinned (EBUSY).
- No permission to access process (EPERM).
3.7 Real World Outcome
You can repair bad page placement without restarting the process, and you can confirm whether the migration persists.
3.7.1 How to Run (Copy/Paste)
cc -O2 -Wall -o numa-migrate src/*.c -lnuma
sudo ./numa-migrate --pid 1234 --range 1G --to-node 1
3.7.2 Golden Path Demo (Deterministic)
$ ./numa-migrate --pid 4321 --range 512M --to-node 1 --verify
Pages examined: 131072
Moved: 125000
Busy: 500
Failed: 100
Verification: 95.8% on node 1
3.7.3 If CLI: Exact Terminal Transcript
$ ./numa-migrate --pid 4321 --check
Node 0: 4.2% pages
Node 1: 95.8% pages
3.7.4 Failure Demo (Deterministic)
$ ./numa-migrate --pid 99999 --range 1G --to-node 1
ERROR: process not found
EXIT: 1
Exit Codes:
- 0: success
- 1: invalid arguments or process not found
- 2: permission denied
- 3: migration failure
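The mapping from errno values to these exit codes can live in one small function. This is a sketch, not a fixed contract: exit_code_for and the enum names are hypothetical, and returning 3 when the call succeeded but some pages failed is an assumption about how partial migration should be reported.

```c
#include <assert.h>
#include <errno.h>

enum { EXIT_OK = 0, EXIT_USAGE = 1, EXIT_PERM = 2, EXIT_MIGRATE = 3 };

/* err is the errno of the failed move_pages call (0 if it succeeded);
 * failed_pages is the number of pages that did not reach the target. */
int exit_code_for(int err, long failed_pages)
{
    assert(err >= 0 && failed_pages >= 0);
    switch (err) {
    case 0:
        return failed_pages == 0 ? EXIT_OK : EXIT_MIGRATE;
    case EINVAL:                /* bad flags or arguments */
    case ESRCH:                 /* no such process */
        return EXIT_USAGE;
    case EPERM:                 /* not allowed to act on the target */
    case EACCES:                /* target node outside the allowed set */
        return EXIT_PERM;
    default:
        return EXIT_MIGRATE;
    }
}
```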
4. Solution Architecture
4.1 High-Level Design
+--------------+ +--------------+ +------------------+
| Query Engine |-->| Migration |-->| Verification |
+--------------+ +--------------+ +------------------+
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Query Engine | Build page list and call move_pages | page alignment |
| Migration Engine | Request move to target node | MPOL_MF_MOVE flags |
| Verification | Re-query and parse numa_maps | delay interval |
| Reporter | Summarize results | JSON vs table |
4.3 Data Structures (No Full Code)
typedef struct {
long pages;
long moved;
long busy;
long failed;
} MigStats;
4.4 Algorithm Overview
Key Algorithm: Migration
- Align range to page boundaries.
- Query current locations.
- Request migration to target node.
- Re-query and report stats.
Complexity Analysis:
- Time: O(pages)
- Space: O(pages) for address arrays
5. Implementation Guide
5.1 Development Environment Setup
sudo apt-get install -y build-essential libnuma-dev numactl
5.2 Project Structure
numa-migrate/
|-- src/
| |-- main.c
| |-- query.c
| |-- migrate.c
| +-- report.c
|-- tests/
| +-- helper_process.c
|-- Makefile
+-- README.md
5.3 The Core Question You’re Answering
“Can I repair bad page placement without restarting the process?”
5.4 Concepts You Must Understand First
- Page tables and virtual memory.
- move_pages semantics and errors.
- NUMA balancing and policy effects.
5.5 Questions to Guide Your Design
- How will you handle partial migration?
- How will you verify results after a delay?
- Will you provide a helper process for deterministic tests?
5.6 Thinking Exercise
If you migrate pages but leave threads on the old node, what happens to performance over time?
5.7 The Interview Questions They’ll Ask
- “What is the difference between move_pages and migrate_pages?”
- “Why might migration fail?”
- “How does NUMA balancing interact with migration?”
5.8 Hints in Layers
Hint 1: Start with query mode (nodes=NULL).
Hint 2: Add migration with MPOL_MF_MOVE.
Hint 3: Implement verification via /proc/<pid>/numa_maps.
Hint 4: Handle EBUSY by retrying or reporting.
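For Hint 4, a retry pass only needs the pages whose status came back as -EBUSY. A minimal sketch of collecting them follows; collect_busy is a hypothetical helper, and how many retry rounds to allow is left to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Copy the addresses whose status is -EBUSY into retry[] (which must have
 * room for n entries). Returns how many pages should be retried. */
size_t collect_busy(void *const *pages, const int *status, size_t n,
                    void **retry)
{
    assert(pages != NULL && status != NULL && retry != NULL);
    size_t busy = 0;

    for (size_t i = 0; i < n; i++)
        if (status[i] == -EBUSY)
            retry[busy++] = pages[i];
    return busy;
}
```

The returned list can be passed straight back to move_pages after a short pause; pages that stay busy after a few rounds are better reported than retried forever.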
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Virtual memory | “Operating Systems: Three Easy Pieces” | Ch. 13-21 |
| Linux system calls | “The Linux Programming Interface” | Ch. 6 |
5.10 Implementation Phases
Phase 1: Query Mode (3-4 days)
Goals: implement page location queries.
Tasks:
- Build address list for a range.
- Call move_pages in query mode.
Checkpoint: prints accurate node distribution.
Phase 2: Migration Mode (4-5 days)
Goals: implement migration requests.
Tasks:
- Add target node array.
- Collect migration stats.
Checkpoint: pages moved on test process.
Phase 3: Verification (2-3 days)
Goals: verify and report stability.
Tasks:
- Re-query after delay.
- Parse /proc/<pid>/numa_maps.
Checkpoint: reports both immediate and delayed placement.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Verification | move_pages vs numa_maps | both | redundancy and clarity |
| Error handling | fail fast vs partial | partial + report | migration is often partial |
| Helper process | required vs optional | optional | supports deterministic tests |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Alignment and parsing | page boundary checks |
| Integration Tests | Migration success | helper process migration |
| Edge Case Tests | Permissions | EPERM and EACCES handling |
6.2 Critical Test Cases
- Query mode reports correct distribution.
- Migration moves >90% of pages for a free range.
- Permission error exits with code 2.
6.3 Test Data
helper_process alloc=512MB, seed=42
expected_after_migration >= 95% on target node
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Unaligned range | EINVAL errors | Align to page size |
| Page locks | Many EBUSY | Retry or report |
| NUMA balancing | Pages drift back | Pin threads or disable balancing |
7.2 Debugging Strategies
- Use strace to inspect move_pages calls.
- Compare with numastat -p.
7.3 Performance Traps
- Migrating huge ranges can stall the process; throttle requests.
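Throttling can be as simple as splitting the page list into fixed-size batches and pausing between them. The sketch below shows that loop with a callback standing in for the actual move_pages call; for_each_batch, batch_fn, BatchLog, and log_batch are hypothetical names, and suitable batch sizes and pauses would have to be found by measurement.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef int (*batch_fn)(void **pages, size_t count, void *ctx);

/* Feed pages to fn in chunks of at most batch entries, sleeping pause_ms
 * between chunks. Stops early and returns fn's value if it is non-zero. */
int for_each_batch(void **pages, size_t total, size_t batch, long pause_ms,
                   batch_fn fn, void *ctx)
{
    assert(batch > 0 && fn != NULL && pause_ms >= 0);
    for (size_t done = 0; done < total; done += batch) {
        size_t count = total - done < batch ? total - done : batch;
        int rc = fn(pages + done, count, ctx);
        if (rc != 0)
            return rc;
        if (pause_ms > 0 && done + count < total) {
            struct timespec ts = { pause_ms / 1000,
                                   (pause_ms % 1000) * 1000000L };
            nanosleep(&ts, NULL);
        }
    }
    return 0;
}

/* Demo callback: records how many batches and pages it saw. */
typedef struct { size_t calls; size_t pages; } BatchLog;

int log_batch(void **pages, size_t count, void *ctx)
{
    (void)pages;
    BatchLog *log = ctx;
    log->calls++;
    log->pages += count;
    return 0;
}
```

In the real tool the callback would issue one move_pages call per batch and accumulate statistics in ctx. Batching also bounds the size of the nodes and status arrays, which matters when a range covers millions of pages.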
8. Extensions & Challenges
8.1 Beginner Extensions
- Add --dry-run mode to only query.
- Add --csv output.
8.2 Intermediate Extensions
- Add throttling and batching.
- Integrate with automatic NUMA balancing controls.
8.3 Advanced Extensions
- Build a daemon that continuously optimizes page placement.
9. Real-World Connections
9.1 Industry Applications
- Performance tuning for memory-bound services.
- Repairing bad placements after VM migrations.
9.2 Related Open Source Projects
- numactl – includes migratepages utility.
9.3 Interview Relevance
- Understanding of Linux virtual memory and NUMA policies.
10. Resources
10.1 Essential Reading
- “Operating Systems: Three Easy Pieces” – Ch. 13-21
- “The Linux Programming Interface” – Ch. 6
10.2 Video Resources
- NUMA balancing talks.
10.3 Tools & Documentation
- numastat – placement statistics.
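10.4 Related Projects in This Series
- P05-numa-aware-memory-allocator
- P09-numa-aware-thread-pool
- P10-numa-aware-database-buffer-pool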
11. Self-Assessment Checklist
11.1 Understanding
- I can explain how move_pages works.
- I can explain why migration can fail.
11.2 Implementation
- Migration stats are accurate.
- Verification works after delay.
11.3 Growth
- I can explain this tool in a systems interview.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Query and migrate modes work for current process.
- Basic stats reported.
Full Completion:
- Verification with numa_maps and delayed checks.
- Proper error handling for permissions and busy pages.
Excellence (Going Above & Beyond):
- Background optimizer daemon with adaptive policies.