LEARN PROXMOX FROM SCRATCH IN C
Learn Proxmox from First Principles in C
Goal: To deeply understand how a hyper-converged virtualization platform like Proxmox VE works by building its core components—a VM hypervisor, a container runtime, a virtual networking stack, and a distributed storage system—from scratch in C on Linux.
Why Build Proxmox from Scratch?
Proxmox is a masterpiece of open-source integration. It combines a KVM-based hypervisor, LXC containers, a software-defined storage system (Ceph), and a robust clustering layer into a single, cohesive platform. To simply learn its web interface is to miss the beauty of the underlying technologies.
Building simplified versions of these components in C is the ultimate “first principles” approach. You will bypass the layers of abstraction and interact directly with the Linux kernel features that make the cloud possible. This path is not about replacing Proxmox; it’s about understanding it so deeply that it no longer seems like magic, but like an elegant composition of technologies you have built yourself.
After completing these projects, you will:
- Understand that KVM is a kernel module that turns Linux into a hypervisor, and you can control it directly.
- Know that an LXC container is just a Linux process with a restricted view of the system, a concept you will build by hand.
- Grasp how virtual networking is constructed from Linux bridges and `veth` pairs.
- Implement the core principles of distributed and copy-on-write storage, demystifying technologies like Ceph and ZFS.
- Appreciate Proxmox as a brilliant orchestration layer on top of powerful C-based and kernel-level primitives.
Core Concept Analysis: The Pillars of Proxmox
Proxmox isn’t a single application; it’s a curated collection of powerful open-source systems. To understand Proxmox is to understand its pillars:
┌──────────────────────────────────────────────────────────────────────────┐
│                           PROXMOX VE PLATFORM                            │
│          (Web UI, `pveproxy`, `pvedaemon`, Clustering/Corosync)          │
└──────────────────────┬────────────────────────────────────┬──────────────┘
                       │                                    │
     (Orchestrates...) │                                    │
                       ▼                                    ▼
        ┌──────────────────────────────┐     ┌──────────────────────────────┐
        │        COMPUTE LAYER         │     │        STORAGE LAYER         │
        │                              │     │                              │
        │ ┌───────────┐ ┌────────────┐ │     │  ┌─────────┐  ┌───────────┐  │
        │ │ KVM / QEMU│ │    LXC     │ │     │  │   ZFS   │  │   Ceph    │  │
        │ │ (Full VMs)│ │(Containers)│ │     │  │ (Local) │  │(Clustered)│  │
        │ └───────────┘ └────────────┘ │     │  └─────────┘  └───────────┘  │
        └──────────────┬───────────────┘     └──────────────┬───────────────┘
                       │                                    │
        (Relies on...) │                                    │
                       ▼                                    ▼
┌──────────────────────────────────────────────────────────────────────────┐
│                  LINUX KERNEL & SYSTEM-LEVEL PRIMITIVES                  │
│                                                                          │
│  `/dev/kvm`  Namespaces  cgroups  Linux Bridge  Netlink Sockets  LVM ... │
└──────────────────────────────────────────────────────────────────────────┘
Our journey will focus on building toy versions of the components in the COMPUTE and STORAGE layers by using the PRIMITIVES at the bottom.
The Project Path: From Kernel Syscall to Mini-Cloud
Project 1: The Raw KVM Hypervisor Call
- File: LEARN_PROXMOX_FROM_SCRATCH_IN_C.md
- Main Programming Language: C
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 5: Master
- Knowledge Area: Virtualization / Kernel Programming
- Software or Tool: Linux KVM API (`/dev/kvm`)
- Main Book: The Linux Kernel Documentation (`Documentation/virt/kvm`)
What you’ll build: A C program that bypasses QEMU and libvirt entirely. It will open /dev/kvm, create a virtual machine, create a virtual CPU (VCPU), allocate some memory for the VM, load a tiny piece of 16-bit “real mode” assembly code into that memory, and run it. The assembly code will do nothing but write a character to the serial port, which you can observe.
Why it teaches the fundamentals: This is the absolute lowest level of virtualization control. You will learn that Proxmox doesn’t “create” VMs; it tells QEMU to ask the Linux kernel (via the KVM API) to create and run them. This project shows you how to be QEMU. It demystifies virtualization completely.
Core challenges you’ll face:
- Understanding the KVM `ioctl` API → maps to a sequence of system calls to create the VM, create VCPUs, and configure memory
- Setting up VM memory → maps to `mmap`ing memory in your C program and telling the kernel to use it as the guest’s physical RAM
- Running the VCPU → maps to the `KVM_RUN` ioctl loop, which enters the guest, runs code, and exits on events like I/O
- Handling VM exits → maps to checking why the VCPU exited (e.g., `KVM_EXIT_IO`) and emulating the required hardware (like a serial port)
Key Concepts:
- `/dev/kvm` API: The file descriptor that represents the KVM hypervisor.
- VM/VCPU File Descriptors: In Linux, everything is a file. A VM and its CPUs are file descriptors you control with `ioctl`.
- VM Exit: When the guest VM needs something from the host (like I/O), it exits, returning control to your C program, which must then emulate the necessary hardware.
- Difficulty: Master
- Time estimate: 1-2 weeks
- Prerequisites: Excellent C skills, basic understanding of computer architecture (registers, memory).
Real world outcome: You will run your C program, and it will print a message like “Guest says: H”. You will have created a virtual machine and run code inside it using nothing but your own C code and the Linux kernel.
Implementation Hints:
- The sequence is roughly: `open("/dev/kvm")`, `ioctl(kvm_fd, KVM_CREATE_VM)`, `ioctl(vm_fd, KVM_CREATE_VCPU)`, `mmap()` some RAM, set up the memory layout, load your machine code into RAM, then loop with `ioctl(vcpu_fd, KVM_RUN)`.
- Start with a tiny “flat binary” of 16-bit assembly. You can find examples online for simple KVM tutorials.
- The hardest part is handling the VM exits. When your guest code tries to `out` to port `0x3f8` (the serial port), the VM will exit. Your C code must catch this `KVM_EXIT_IO`, read the character the guest wanted to print from the VCPU’s state, and print it to the host’s console.
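
To make that sequence concrete, here is a minimal sketch of driving `/dev/kvm` directly, in the spirit of the classic "kvmtest" tutorials. Assumptions: x86_64 Linux, permission to open `/dev/kvm`, and most error checking omitted for brevity; this illustrates the API flow, not anything Proxmox or QEMU actually ship.

```c
/* hello_kvm.c — a stripped-down illustration of the /dev/kvm ioctl flow.
 * The guest is seven bytes of 16-bit real-mode code: write 'H' to the
 * serial port (0x3f8), then halt. Error checks are mostly omitted. */
#include <fcntl.h>
#include <linux/kvm.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    const uint8_t code[] = {
        0xba, 0xf8, 0x03,   /* mov dx, 0x3f8 ; serial port           */
        0xb0, 'H',          /* mov al, 'H'                           */
        0xee,               /* out dx, al    ; causes a KVM_EXIT_IO  */
        0xf4,               /* hlt           ; causes a KVM_EXIT_HLT */
    };

    int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
    if (kvm < 0) { perror("/dev/kvm"); return 1; }
    int vm = ioctl(kvm, KVM_CREATE_VM, 0);

    /* Guest "physical RAM": one page of our own memory, mapped in at
       guest physical address 0x1000. */
    void *mem = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    memcpy(mem, code, sizeof(code));
    struct kvm_userspace_memory_region region = {
        .slot = 0,
        .guest_phys_addr = 0x1000,
        .memory_size = 0x1000,
        .userspace_addr = (uint64_t)(uintptr_t)mem,
    };
    ioctl(vm, KVM_SET_USER_MEMORY_REGION, &region);

    /* One VCPU, plus its shared kvm_run structure. */
    int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);
    int run_size = ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, 0);
    struct kvm_run *run = mmap(NULL, run_size, PROT_READ | PROT_WRITE,
                               MAP_SHARED, vcpu, 0);

    /* Real mode, flat CS at 0, instruction pointer at our code. */
    struct kvm_sregs sregs;
    ioctl(vcpu, KVM_GET_SREGS, &sregs);
    sregs.cs.base = 0;
    sregs.cs.selector = 0;
    ioctl(vcpu, KVM_SET_SREGS, &sregs);
    struct kvm_regs regs = { .rip = 0x1000, .rflags = 0x2 };
    ioctl(vcpu, KVM_SET_REGS, &regs);

    /* The KVM_RUN loop: enter the guest, then handle whatever made it exit. */
    for (;;) {
        ioctl(vcpu, KVM_RUN, 0);
        if (run->exit_reason == KVM_EXIT_HLT)
            break;                                   /* guest executed hlt */
        if (run->exit_reason == KVM_EXIT_IO &&
            run->io.direction == KVM_EXIT_IO_OUT &&
            run->io.port == 0x3f8) {
            /* Emulate the "serial port": the byte sits inside kvm_run. */
            printf("Guest says: %c\n", *((char *)run + run->io.data_offset));
        }
    }
    return 0;
}
```

Compile with `gcc hello_kvm.c -o hello_kvm` and run it as a user with access to `/dev/kvm` (typically membership in the `kvm` group).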
Learning milestones:
- You can successfully create a VM and a VCPU → You understand the KVM object hierarchy.
- You can load and run machine code in the guest → You have created and executed a virtual machine.
- You can handle a VM exit and emulate a serial port → You grasp the fundamental interaction between guest and host that makes virtualization possible.
Project 2: The LXC Container Runtime
- File: LEARN_PROXMOX_FROM_SCRATCH_IN_C.md
- Main Programming Language: C
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 1. The “Resume Gold”
- Difficulty: Level 4: Expert
- Knowledge Area: OS Internals / Linux Namespaces & Cgroups
- Software or Tool: `clone`, `unshare`, cgroupfs
- Main Book: “The Linux Programming Interface” by Michael Kerrisk
What you’ll build: A C program that creates a fully isolated container, replicating the core functionality of LXC (which Proxmox uses). It will use clone() to create a new process in new PID, network, and mount namespaces, use cgroups to limit its memory, and use pivot_root to give it a private root filesystem.
Why it teaches the fundamentals: This project reveals that a Proxmox “CT” (Container) is not a VM. It’s a standard Linux process that has been firewalled from the rest of the system by the kernel. Building this proves you understand the Linux primitives that enabled the entire container revolution (including Docker).
Core challenges you’ll face:
- Using `clone()` with multiple namespace flags → maps to `CLONE_NEWPID | CLONE_NEWNET | CLONE_NEWNS`
- Setting up a private root filesystem → maps to downloading a minimal rootfs (like Alpine), and using `pivot_root` to make it the new root for the container process
- Configuring a virtual network for the container → maps to creating a `veth` pair, putting one end in the host and the other in the container’s network namespace
- Orchestrating all these steps in the correct order → maps to a complex dance of syscalls and setup before the final `exec`
Key Concepts:
- Namespaces: Isolates what a process can see.
- Cgroups: Limits what a process can use.
- `pivot_root`: Changes the root filesystem of a process. More powerful and secure than `chroot`.
- `veth` pairs: A virtual ethernet tunnel used to connect network namespaces to the outside world (often via a bridge on the host).
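
Since the cgroup side of this project is mostly file I/O against cgroupfs, here is a minimal sketch of applying a memory limit. Assumptions: a cgroup v2 hierarchy mounted at `/sys/fs/cgroup`, root privileges, and an arbitrary group name (`mycontainer`) and limit (128 MiB).

```c
/* cgroup_limit.c — a minimal sketch of the cgroup v2 side (assumes a cgroup2
 * mount at /sys/fs/cgroup and root privileges; the group name "mycontainer"
 * and the 128 MiB cap are arbitrary choices for illustration). */
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

static int write_file(const char *path, const char *value) {
    FILE *f = fopen(path, "w");
    if (!f) { perror(path); return -1; }
    fprintf(f, "%s", value);
    return fclose(f);
}

/* Create a cgroup, cap its memory at 128 MiB, and move `pid` into it. */
static int limit_memory(pid_t pid) {
    char buf[32];
    mkdir("/sys/fs/cgroup/mycontainer", 0755);
    if (write_file("/sys/fs/cgroup/mycontainer/memory.max", "134217728") != 0)
        return -1;
    snprintf(buf, sizeof(buf), "%d", (int)pid);
    return write_file("/sys/fs/cgroup/mycontainer/cgroup.procs", buf);
}

int main(void) {
    /* Demo: apply the limit to this process. A container runtime would pass
       the clone()'d child's PID instead. */
    return limit_memory(getpid()) == 0 ? 0 : 1;
}
```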
- Difficulty: Expert
- Time estimate: 1-2 weeks
- Prerequisites: Strong C skills, solid Linux command-line knowledge.
Real world outcome:
You’ll have a program ./mycontainer /bin/sh that gives you a shell. Inside this shell, ps aux will show your shell as PID 1, ifconfig will show only a private veth network interface, and ls / will show the Alpine rootfs, not your host’s. You will have built your own docker run.
Implementation Hints:
- This is a complex orchestration. A good approach is to have the parent process create the namespaces and network, and then the child process configures them before calling `exec`.
- Setting up the network involves: creating a bridge on the host, creating the `veth` pair, attaching one end to the bridge, and pushing the other end into the child’s network namespace using its PID.
- The sequence of `mount` and `pivot_root` is tricky and must be done carefully to avoid breaking the filesystem.
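
A minimal sketch of the `clone()` call and the child-side setup, to show how the pieces fit together. Assumptions: run as root, a minimal rootfs already extracted to `./rootfs` (e.g., Alpine), `chroot` standing in for the full `pivot_root` dance, and no cgroup or `veth` setup yet.

```c
/* mycontainer.c — namespaces only: no cgroups or veth here, and chroot stands
 * in for pivot_root. Expects a rootfs (e.g. extracted Alpine) at ./rootfs. */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <sys/mount.h>
#include <sys/wait.h>
#include <unistd.h>

static char child_stack[1024 * 1024];   /* stack for the clone()'d child */

static int child_main(void *arg) {
    char **argv = arg;

    /* Inside the new namespaces this process is PID 1. */
    sethostname("mycontainer", 11);

    /* Keep our mounts from propagating back into the host's namespace. */
    mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL);

    /* Enter the private root filesystem (the real project uses pivot_root). */
    if (chroot("./rootfs") != 0 || chdir("/") != 0) {
        perror("chroot");
        return 1;
    }

    /* A fresh /proc so tools like `ps` only see this PID namespace. */
    mount("proc", "/proc", "proc", 0, NULL);

    execv(argv[0], argv);   /* e.g. /bin/sh becomes the container's PID 1 */
    perror("execv");
    return 1;
}

int main(int argc, char **argv) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s /bin/sh\n", argv[0]);
        return 1;
    }

    /* New PID, mount, UTS and network namespaces for the child. */
    int flags = CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWNET | SIGCHLD;
    pid_t pid = clone(child_main, child_stack + sizeof(child_stack), flags, &argv[1]);
    if (pid < 0) { perror("clone"); return 1; }

    /* This is where the parent would create the veth pair, push one end into
       the child's netns (identified by pid), and write limits into cgroupfs. */
    waitpid(pid, NULL, 0);
    return 0;
}
```

Compile with `gcc mycontainer.c -o mycontainer` and run it as root; under these assumptions, `ps` inside the resulting shell should report the shell as PID 1 and `hostname` should print `mycontainer`.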
Learning milestones:
- Your container has its own PID space → You have mastered PID namespaces.
- Your container has its own network interface and can’t see the host’s → You have mastered network namespaces.
- Your container has a private root filesystem → You have mastered mount namespaces and `pivot_root`.
Project 3: A Copy-on-Write Storage Layer
- File: LEARN_PROXMOX_FROM_SCRATCH_IN_C.md
- Main Programming Language: C
- Coolness Level: Level 4: Hardcore Tech Flex
- Business Potential: 2. The “Micro-SaaS / Pro Tool”
- Difficulty: Level 3: Advanced
- Knowledge Area: Filesystems / Storage
- Software or Tool: File I/O, `mmap`
- Main Book: N/A, but online articles on the qcow2 format are helpful.
What you’ll build: A program that simulates a copy-on-write (CoW) block device. You will have a read-only “base image” file. Your program will expose a “virtual disk” to a client. When the client reads a block, your program reads from the base image. When the client writes a block, your program writes the new data to a separate “overlay” file and records the new location in a map. Subsequent reads for that block will now come from the overlay.
Why it teaches the fundamentals: This demystifies a key feature of both ZFS and VM disk formats like qcow2, which Proxmox uses extensively. CoW is what makes creating a new VM from a template or taking a snapshot an instantaneous, space-efficient operation. You learn that a “snapshot” isn’t a full copy, but a promise to not change the original.
Core challenges you’ll face:
- Designing the overlay format → maps to creating a mapping from logical block addresses to physical locations in the overlay file
- Intercepting reads and writes → maps to creating functions `read_block` and `write_block` that contain the CoW logic
- Managing the mapping → maps to using a hash map or a simple array to track which blocks are in the overlay vs. the base image
- “Committing” an overlay → maps to (bonus challenge) writing a tool to merge the overlay file back into a new base image.
Key Concepts:
- Copy-on-Write (CoW): The strategy of not modifying data in place, but writing a modified copy to a new location.
- Block Storage: The concept of a disk as a linear array of fixed-size blocks.
- Indirection/Mapping: The core idea of tracking where the “latest” version of a block of data lives.
- Difficulty: Advanced
- Time estimate: Weekend
- Prerequisites: Solid C skills, especially file I/O.
Real world outcome:
You will have a base file base.img and programs to interact with it. You can create a “snapshot” that creates an empty overlay1.img. After writing to the virtual disk, overlay1.img will contain data, but base.img will be untouched. You can then create a second snapshot overlay2.img from the same base image.
Implementation Hints:
- Divide your virtual disk into fixed-size blocks (e.g., 4KB).
- Your `write_block(block_id, data)` function should: (a) write `data` to the end of the overlay file; (b) record the new offset in your mapping: `map[block_id] = new_offset`.
- Your `read_block(block_id)` function should: (a) check if `block_id` exists in your map; (b) if yes, read from the recorded offset in the overlay file; (c) if no, read from `block_id * block_size` in the base image file.
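
A minimal sketch of those two paths. Assumptions: hypothetical file names `base.img` and `overlay1.img`, fixed 4 KiB blocks, and a simple in-memory array as the mapping (a real tool would persist the map in the overlay file itself).

```c
/* cow_disk.c — illustrative only: fixed 4 KiB blocks, an in-memory map, and
 * hypothetical file names base.img (read-only, must exist) / overlay1.img. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 4096
#define NUM_BLOCKS 1024                 /* virtual disk = 4 MiB */

static FILE *base, *overlay;
/* block_map[b] = byte offset of block b inside the overlay, or -1 if untouched. */
static long block_map[NUM_BLOCKS];

/* Copy-on-write: never touch base.img; append the new version to the overlay. */
static void write_block(uint32_t block_id, const void *data) {
    fseek(overlay, 0, SEEK_END);
    long off = ftell(overlay);
    fwrite(data, 1, BLOCK_SIZE, overlay);
    block_map[block_id] = off;          /* later reads of this block hit the overlay */
}

/* Reads fall through to the base image unless the block has been overwritten. */
static void read_block(uint32_t block_id, void *out) {
    if (block_map[block_id] >= 0) {
        fseek(overlay, block_map[block_id], SEEK_SET);
        fread(out, 1, BLOCK_SIZE, overlay);
    } else {
        fseek(base, (long)block_id * BLOCK_SIZE, SEEK_SET);
        size_t n = fread(out, 1, BLOCK_SIZE, base);
        memset((char *)out + n, 0, BLOCK_SIZE - n);   /* past end of base: zeros */
    }
}

int main(void) {
    base = fopen("base.img", "rb");          /* the shared, read-only template */
    overlay = fopen("overlay1.img", "w+b");  /* this "VM"'s private writable layer */
    if (!base || !overlay) { perror("fopen"); return 1; }
    for (int i = 0; i < NUM_BLOCKS; i++) block_map[i] = -1;

    char buf[BLOCK_SIZE];
    read_block(0, buf);                      /* served from base.img */
    memset(buf, 'X', BLOCK_SIZE);
    write_block(0, buf);                     /* lands in overlay1.img */
    read_block(0, buf);                      /* now served from the overlay */
    printf("block 0 now starts with: %c\n", buf[0]);
    return 0;
}
```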
Learning milestones:
- Reads correctly fall through to the base image → Your read path is working.
- Writes are stored in the overlay file and subsequent reads get the new data → Your copy-on-write logic is correct.
- You can have multiple, independent overlays on the same base image → You have replicated the core of VM templating.
Project 4: The Cluster Control Plane
- File: LEARN_PROXMOX_FROM_SCRATCH_IN_C.md
- Main Programming Language: C
- Coolness Level: Level 5: Pure Magic (Super Cool)
- Business Potential: 4. The “Open Core” Infrastructure
- Difficulty: Level 5: Master
- Knowledge Area: Distributed Systems / Cloud Orchestration
- Software or Tool: Sockets API, `libmicrohttpd` (optional)
- Main Book: “Distributed Systems” by Tanenbaum and van Steen
What you’ll build: A miniature version of the Proxmox management daemons and API. This is a distributed system with two parts:
- A Node Agent Daemon: Runs on each machine in your “cluster”. It listens on a TCP socket for commands like `start_vm:<vm_name>` and uses your code from Project 1 or 2 to execute them.
- An API Server: Runs on one node. It provides a simple HTTP REST API (e.g., `POST /vms` with some JSON). When it receives a request, it chooses a node and sends the appropriate command to that node’s Agent Daemon.
Why it teaches the fundamentals: This project synthesizes everything and replicates the core architecture of Proxmox, OpenStack, and Kubernetes. You learn how a simple HTTP request is transformed into a running workload on a specific machine in a cluster, which is the fundamental loop of all cloud platforms.
Core challenges you’ll face:
- Designing a simple client-server protocol → maps to defining the text-based or JSON commands your agents will accept
- Writing a daemon process → maps to using `fork` to background a process, and setting up a listening TCP socket
- Implementing the API server → maps to parsing HTTP requests and serializing JSON responses
- Basic scheduling logic → maps to the API server maintaining a list of available nodes and picking one (e.g., round-robin)
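
As a starting point for the data-plane side, here is a minimal sketch of the node agent's command loop. Assumptions: port 9000 and the plain-text `start_vm:<name>` protocol are arbitrary choices, and `handle_start_vm()` is a hypothetical hook where your Project 1 or 2 code would go.

```c
/* node_agent.c — the agent's command loop only: it accepts "start_vm:<name>"
 * over TCP and calls a stub where real Project 1/2 code would go. The port
 * and the text protocol are arbitrary choices for this sketch. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define AGENT_PORT 9000

/* Hypothetical hook: launch a VM (Project 1) or a container (Project 2). */
static void handle_start_vm(const char *name) {
    printf("agent: starting workload '%s'\n", name);
}

int main(void) {
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    int one = 1;
    setsockopt(srv, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

    struct sockaddr_in addr = {
        .sin_family = AF_INET,
        .sin_port = htons(AGENT_PORT),
        .sin_addr.s_addr = htonl(INADDR_ANY),
    };
    if (bind(srv, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        return 1;
    }
    listen(srv, 8);

    for (;;) {
        int client = accept(srv, NULL, NULL);
        if (client < 0) continue;

        /* One command per connection, e.g. "start_vm:my-vm\n". */
        char buf[256] = {0};
        ssize_t n = read(client, buf, sizeof(buf) - 1);
        if (n > 0) {
            buf[strcspn(buf, "\r\n")] = '\0';
            if (strncmp(buf, "start_vm:", 9) == 0) {
                handle_start_vm(buf + 9);
                write(client, "ok\n", 3);
            } else {
                write(client, "err: unknown command\n", 21);
            }
        }
        close(client);
    }
}
```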
Key Concepts:
- Control Plane: The API server and any central logic. This is what the user talks to.
- Data Plane: The node agents and the workloads (VMs/containers) they manage. This is where the work gets done.
- Agent Architecture: The common cloud pattern of a central controller instructing distributed agents.
- REST API: The standard interface for modern web services.
- Difficulty: Master
- Time estimate: 1-2 weeks
- Prerequisites: All previous projects, strong networking and C skills.
Real world outcome:
You will have two “node” machines running your agent daemon. From a third machine, you can run curl -X POST -d '{"name": "my-vm"}' http://<api_server_ip>:8080/vms. A moment later, you can ssh into one of the nodes and see that a new VM or container named my-vm is running. You will have built a fully functional, distributed, multi-node orchestration system from scratch.
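
On the API server side, the scheduling and dispatch step can be sketched like this. Assumptions: the node IPs are placeholders, the port and `start_vm:` protocol match the agent sketch above, and round-robin is the simplest possible scheduler.

```c
/* dispatch.c — the API server's side of the loop: pick a node round-robin and
 * send its agent a start command over TCP. Node IPs are placeholders; the port
 * and "start_vm:" protocol match the agent sketch above. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static const char *nodes[] = { "10.0.0.11", "10.0.0.12" };   /* placeholder IPs */
static int next_node = 0;

/* Choose a node and ask its agent to start the workload `name`. */
static int dispatch_start_vm(const char *name) {
    const char *ip = nodes[next_node];
    next_node = (next_node + 1) % (int)(sizeof(nodes) / sizeof(nodes[0]));

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = { .sin_family = AF_INET, .sin_port = htons(9000) };
    inet_pton(AF_INET, ip, &addr.sin_addr);
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("connect");
        close(fd);
        return -1;
    }

    char cmd[128], reply[64] = {0};
    snprintf(cmd, sizeof(cmd), "start_vm:%s\n", name);
    write(fd, cmd, strlen(cmd));
    read(fd, reply, sizeof(reply) - 1);          /* expect "ok\n" from the agent */
    close(fd);
    return strncmp(reply, "ok", 2) == 0 ? 0 : -1;
}

int main(void) {
    /* In the real API server this would be called from the HTTP handler for
       POST /vms, with the name taken from the request body. */
    return dispatch_start_vm("my-vm") == 0 ? 0 : 1;
}
```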
Learning milestones:
- Your node agent can receive a command and start a VM/container → You have a working data plane worker.
- Your API server can accept an HTTP request and correctly parse it → You have a working control plane entrypoint.
- The API server can successfully command a node agent to start a workload → You have a complete, end-to-end cloud management loop.
Summary
| Project | Proxmox Component | Key C/Linux Technology | Difficulty |
|---|---|---|---|
| 1. The Raw KVM Hypervisor Call | KVM/QEMU | /dev/kvm ioctl API | Master |
| 2. The LXC Container Runtime | LXC Containers | clone, namespaces, cgroups | Expert |
| 3. A Copy-on-Write Storage Layer | ZFS/qcow2 Snapshots | File I/O, mmap | Advanced |
| 4. The Cluster Control Plane | pvedaemon/pveproxy/API | Sockets, HTTP, RPC | Master |