Project 7: The “Poor Man’s Docker” (Container Runtime)

Run a command in isolated PID and mount namespaces with a minimal rootfs.

Quick Reference

| Attribute | Value |
|---|---|
| Difficulty | Expert |
| Time Estimate | 2 weeks |
| Language | Go or C (Alt: Rust, Python) |
| Prerequisites | Project 2, root access |
| Key Topics | namespaces, cgroups, chroot/pivot_root |

1. Learning Objectives

By completing this project, you will:

  1. Create a new PID namespace with clone or unshare.
  2. Mount a new /proc for isolated process views.
  3. Set up a root filesystem with chroot or pivot_root.
  4. Apply basic resource limits with cgroups.

2. Theoretical Foundation

2.1 Core Concepts

  • Namespaces: Isolate process trees, mounts, hostnames, and networks.
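  • PID 1 role: The first process in a PID namespace acts as its init. It adopts orphaned processes and must reap them, it receives only signals it has installed handlers for (apart from SIGKILL and SIGSTOP sent from the host), and when it exits the kernel kills every other process in the namespace.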
  • Cgroups: Enforce resource limits for CPU and memory.
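
One way to make namespace membership concrete: the kernel exposes each of a process's namespaces as a symlink under /proc/self/ns. The sketch below reads one of them. The helper name ns_identity is hypothetical, and it assumes /proc is mounted.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: copy the identity of one of this process's
 * namespaces (e.g. "pid", "mnt", "uts") into buf, in the kernel's
 * "type:[inode]" form. Returns 0 on success, -1 on error. */
int ns_identity(const char *type, char *buf, size_t len)
{
    char path[64];

    assert(type != NULL && buf != NULL);
    if (len == 0)
        return -1;
    if (snprintf(path, sizeof path, "/proc/self/ns/%s", type) >= (int)sizeof path)
        return -1;

    /* readlink() does not NUL-terminate, so leave room and do it here. */
    ssize_t n = readlink(path, buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}
```

Two processes are in the same namespace exactly when these strings match, so comparing the value on the host with the value inside the container is a quick isolation check.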

2.2 Why This Matters

Containers are just processes with restricted views of kernel resources. This project reveals the mechanics.

2.3 Historical Context / Background

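Linux gained namespaces incrementally: mount namespaces in 2.4.19 (2002), UTS and IPC namespaces in 2.6.19, PID namespaces in 2.6.24, and user namespaces were completed in 3.8 (2013). Cgroups were merged in 2.6.24 (2008). Container runtimes combine these independent features to provide isolation and resource control.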

2.4 Common Misconceptions

  • “Containers are VMs”: They share the kernel.
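  • “chroot is a container”: chroot only changes where path lookups start. Without namespaces, the process still sees the host's process table, hostname, and network, and a root process can break out of the chroot.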

3. Project Specification

3.1 What You Will Build

A small runtime that spawns a command in new namespaces with a private root and /proc.

3.2 Functional Requirements

  1. Create PID and mount namespaces.
  2. Mount a fresh /proc inside the container.
  3. Set hostname in UTS namespace.
  4. Run a command as PID 1 inside container.

3.3 Non-Functional Requirements

  • Safety: Clean up mounts and processes on exit.
  • Reliability: PID 1 should reap child processes.
  • Usability: Simple CLI of the form mycontainer run <cmd>.

3.4 Example Usage / Output

$ sudo ./mycontainer run /bin/bash
container# ps -eo pid,args
PID  COMMAND
1    /bin/bash
2    ps -eo pid,args

3.5 Real World Outcome

You will enter a shell with an isolated PID tree and hostname:

$ sudo ./mycontainer run /bin/bash
container# hostname
container-host
container# ps -eo pid,args
PID  COMMAND
1    /bin/bash
2    ps -eo pid,args

4. Solution Architecture

4.1 High-Level Design

parent -> clone new namespaces -> setup rootfs/proc -> exec command -> monitor PID 1

4.2 Key Components

| Component | Responsibility | Key Decisions |
|---|---|---|
| Namespace setup | PID/UTS/MNT | Use clone flags |
| Rootfs setup | chroot/pivot_root | chroot first |
| Proc mount | /proc inside | mount("proc") |
| Cgroups | Resource limits | optional in v1 |

4.3 Data Structures

struct child_config { char **argv; char *rootfs; };
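
A minimal sketch of filling this struct from the command line, assuming the run <cmd> form from section 3.2. The function name parse_run is hypothetical, and passing the rootfs path in separately is an assumption, since the spec does not say where it comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct child_config { char **argv; char *rootfs; };

/* Hypothetical parser for: mycontainer run <cmd> [args...]
 * Returns 0 on success, -1 on a usage error. */
int parse_run(int argc, char **argv, char *rootfs, struct child_config *cfg)
{
    assert(argv != NULL && cfg != NULL);
    if (argc < 3 || strcmp(argv[1], "run") != 0)
        return -1;

    /* argv is NULL-terminated, so the slice starting at <cmd> is too,
     * which is the shape execvp() expects. No copy is made. */
    cfg->argv = &argv[2];
    cfg->rootfs = rootfs;
    return 0;
}
```

Because cfg->argv points into the original argv, the struct is only valid while main's argv is, which holds for the whole lifetime of a runtime like this one.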

4.4 Algorithm Overview

Key Algorithm: Container Spawn

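  1. clone with CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWUTS.
  2. In child, make mount propagation private, set the hostname, and enter the rootfs (chroot, then chdir("/")).
  3. Mount /proc inside the new root, then exec the command.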
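
The spawn steps can be sketched in C roughly as follows. This is an illustration under assumptions, not a finished runtime: the function names, the 1 MiB stack size, and the hard-coded hostname (taken from the example output) are all placeholders, the rootfs is assumed to contain a /proc directory, and everything except the two small helpers needs root.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <unistd.h>

struct child_config { char **argv; char *rootfs; };

/* Step 1 flags. SIGCHLD in the low byte lets the parent waitpid(). */
int container_clone_flags(void)
{
    return CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWUTS | SIGCHLD;
}

/* Rootfs part of step 2: switch root, then move the cwd inside it. */
int enter_rootfs(const char *rootfs)
{
    assert(rootfs != NULL);
    if (chroot(rootfs) != 0)
        return -1;
    return chdir("/");
}

/* Runs as PID 1 of the new namespaces. Returns only on failure. */
static int child_main(void *arg)
{
    struct child_config *cfg = arg;
    const char *name = "container-host";   /* placeholder hostname */

    /* Keep mounts made here from propagating back to the host. */
    if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) != 0) return 1;
    if (sethostname(name, strlen(name)) != 0) return 1;
    if (enter_rootfs(cfg->rootfs) != 0) return 1;
    /* Step 3: a fresh procfs reflects the new PID namespace. */
    if (mount("proc", "/proc", "proc", 0, NULL) != 0) return 1;
    execvp(cfg->argv[0], cfg->argv);
    return 127;                             /* exec failed */
}

/* Returns the child's PID as seen by the parent, or -1 on error. */
pid_t spawn_container(struct child_config *cfg)
{
    const size_t stack_size = 1024 * 1024;
    char *stack = malloc(stack_size);
    if (stack == NULL)
        return -1;
    /* The stack grows down on common architectures: pass its top. */
    pid_t pid = clone(child_main, stack + stack_size,
                      container_clone_flags(), cfg);
    free(stack);   /* no CLONE_VM, so the child has its own copy */
    return pid;
}
```

The parent would then waitpid() on the returned PID, which is the "monitor PID 1" step of the high-level design.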

Complexity Analysis:

  • Time: O(1) per spawn
  • Space: O(1)

5. Implementation Guide

5.1 Development Environment Setup

go version
# or gcc --version

5.2 Project Structure

project-root/
├── main.c
└── README.md

5.3 The Core Question You’re Answering

“What actually is a container, and how is it different from a VM?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. namespaces (man 7 namespaces)
  2. chroot vs pivot_root
  3. PID 1 behavior

5.5 Questions to Guide Your Design

Before implementing, think through these:

  1. How will PID 1 handle signals and reaping?
  2. Will you isolate networking or use host network?
  3. How will you clean up mounts on exit?
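
For the first question, one possible shape is for PID 1 to fork the real command and stay behind as a reaper. The sketch below is one such design under that assumption. The names init_wait and run_as_init are hypothetical, and the 128+signal exit convention is borrowed from shells.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Block until main_child exits, reaping every other child (including
 * adopted orphans) along the way. Returns the main child's exit code,
 * 128+signal if it was killed, or -1 if waitpid() fails first. */
int init_wait(pid_t main_child)
{
    assert(main_child > 0);
    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, 0);    /* wait for any child */
        if (pid < 0) {
            if (errno == EINTR)
                continue;                       /* interrupted: retry */
            return -1;                          /* e.g. ECHILD */
        }
        if (pid != main_child)
            continue;                           /* reaped a zombie */
        if (WIFEXITED(status))
            return WEXITSTATUS(status);
        if (WIFSIGNALED(status))
            return 128 + WTERMSIG(status);
    }
}

/* Fork the workload and act as its init until it finishes. */
int run_as_init(char *const argv[])
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        execvp(argv[0], argv);
        _exit(127);                             /* exec failed */
    }
    return init_wait(child);
}
```

The simpler alternative, exec'ing the command directly as PID 1 as the spec describes, leaves reaping to that command, which works for a shell but not for programs that never call wait.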

5.6 Thinking Exercise

The ps lie

Explain why ps shows host processes if you don’t remount /proc.

5.7 The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What is a namespace?”
  2. “Why is PID 1 special?”
  3. “How do cgroups differ from namespaces?”

5.8 Hints in Layers

Hint 1: Use unshare first. Experiment with unshare --fork --pid --mount-proc /bin/bash.

Hint 2: Set hostname. Use sethostname in UTS namespace.

Hint 3: Mount /proc. Mount a fresh procfs inside the container root.

5.9 Books That Will Help

| Topic | Book | Chapter |
|---|---|---|
| Namespaces | “The Linux Programming Interface” (TLPI), Michael Kerrisk | namespaces section |
| Containers | “Container Security”, Liz Rice | Ch. 2-3 |

5.10 Implementation Phases

Phase 1: Foundation (3-4 days)

Goals:

  • Spawn a process in new PID namespace.

Tasks:

  1. clone/unshare with CLONE_NEWPID. With unshare, fork afterwards: the caller keeps its old PID and only its next child becomes PID 1.
  2. Exec a command.

Checkpoint: echo $$ inside the spawned shell prints 1. ps still lists host processes at this stage because /proc has not been remounted yet.

Phase 2: Core Functionality (4-5 days)

Goals:

  • Add rootfs and /proc mount.

Tasks:

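  1. chroot into rootfs, then chdir("/") so the working directory is inside the new root.
  2. Mount proc at /proc inside the new root.

Checkpoint: ps inside the container lists only container processes.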

Phase 3: Polish & Edge Cases (3-4 days)

Goals:

  • Add hostname and basic cgroups.

Tasks:

  1. Set hostname.
  2. Add memory/cpu limits.

Checkpoint: Limits enforced and cleanup works.
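
For the limits task, a minimal sketch assuming a cgroup v2 host with the hierarchy mounted at /sys/fs/cgroup. There, a cgroup is a directory and a limit is a write to a control file inside it. The function name cgroup_set and the group name used in the note below are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Write value to <root>/<group>/<file>, creating the group directory
 * if needed. With root = "/sys/fs/cgroup" the mkdir creates the cgroup
 * and the kernel supplies the control files.
 * Returns 0 on success, -1 on error. */
int cgroup_set(const char *root, const char *group,
               const char *file, const char *value)
{
    char path[512];

    assert(root && group && file && value);
    if (snprintf(path, sizeof path, "%s/%s", root, group) >= (int)sizeof path)
        return -1;
    if (mkdir(path, 0755) != 0 && errno != EEXIST)
        return -1;

    if (snprintf(path, sizeof path, "%s/%s/%s", root, group, file)
            >= (int)sizeof path)
        return -1;
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    int ok = fputs(value, f) >= 0;
    if (fclose(f) != 0)      /* rejected values surface when flushed */
        ok = 0;
    return ok ? 0 : -1;
}
```

With a group named mycontainer, writing 67108864 to memory.max caps memory at 64 MiB, writing "50000 100000" to cpu.max allows 50 ms of CPU time per 100 ms period, and writing the child's PID to cgroup.procs moves it into the group. The memory and cpu controllers must be enabled in the parent's cgroup.subtree_control for those files to exist.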

5.11 Key Implementation Decisions

| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Root isolation | chroot vs pivot_root | chroot first | Simpler |
| Networking | host vs isolated | host first | Less setup |
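
When moving from chroot to pivot_root later, the sequence could look like the sketch below. It follows the pivot_root(".", ".") approach described in the pivot_root(2) manual page and assumes it runs as root inside a new mount namespace whose root mount has already been made private. The function name pivot_into is hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Replace the root mount with new_root and detach the old one.
 * glibc has no pivot_root() wrapper, hence syscall().
 * Returns 0 on success, -1 on error. */
int pivot_into(const char *new_root)
{
    assert(new_root != NULL);

    /* pivot_root needs new_root to be a mount point: bind it to itself. */
    if (mount(new_root, new_root, NULL, MS_BIND | MS_REC, NULL) != 0)
        return -1;
    if (chdir(new_root) != 0)
        return -1;

    /* Make the new root the root mount; the old root ends up stacked
     * at the same path. */
    if (syscall(SYS_pivot_root, ".", ".") != 0)
        return -1;

    /* Detach the old root so the host filesystem is unreachable. */
    if (umount2(".", MNT_DETACH) != 0)
        return -1;
    return chdir("/");
}
```

Unlike chroot, this leaves no reference to the host root inside the mount namespace, which is why production runtimes prefer it.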

6. Testing Strategy

6.1 Test Categories

| Category | Purpose | Examples |
|---|---|---|
| Namespace | PID isolation | ps output |
| Rootfs | chroot works | ls / |
| Proc | proper mount | ps shows 2 processes |

6.2 Critical Test Cases

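  1. PID 1 exits cleanly on SIGTERM. This needs an installed handler, because the kernel drops signals that a namespace's PID 1 has no handler for.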
  2. /proc mounted inside container.
  3. Host processes hidden from container.

6.3 Test Data

Command: /bin/bash

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

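| Pitfall | Symptom | Solution |
|---|---|---|
| No /proc mount | ps fails or lists host processes | mount proc inside the container |
| PID 1 never reaps | Zombie children accumulate | reap in init loop |
| chroot only | Host processes and hostname still visible | use namespaces |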

7.2 Debugging Strategies

  • Start with unshare CLI to validate.
  • Use strace to confirm mount and clone calls.

7.3 Performance Traps

None significant for a minimal runtime.


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add a --hostname flag.
  • Add simple logging.

8.2 Intermediate Extensions

  • Add network namespace with veth pair.
  • Add overlayfs rootfs.

8.3 Advanced Extensions

  • Add image format and unpacking.
  • Implement minimal OCI runtime spec.

9. Real-World Connections

9.1 Industry Applications

  • Container runtimes and sandboxing tooling.

9.2 Related Open Source Projects

  • runc: https://github.com/opencontainers/runc
  • nsjail: https://github.com/google/nsjail

9.3 Interview Relevance

  • Namespaces, cgroups, and PID 1 behavior are common interview topics.

10. Resources

10.1 Essential Reading

  • namespaces(7) - man 7 namespaces
  • clone(2) - man 2 clone

10.2 Video Resources

  • Container internals talks (search “Linux namespaces cgroups”)

10.3 Tools & Documentation

  • unshare(1) - man 1 unshare

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain namespace types.
  • I can explain PID 1 responsibilities.
  • I can describe cgroup limits.

11.2 Implementation

  • Container runs with isolated PID tree.
  • /proc is correctly mounted.
  • Hostname is set inside container.

11.3 Growth

  • I can extend to network isolation.
  • I can explain differences vs VMs.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Run a command in a PID namespace with a new /proc.

Full Completion:

  • Add rootfs isolation and hostname change.

Excellence (Going Above & Beyond):

  • Add cgroup limits and network namespace.

This guide was generated from LEARN_LINUX_UNIX_INTERNALS_DEEP_DIVE.md. For the complete learning path, see the parent directory.