Project 13: Network Boot (PXE) Client
Build a bootloader that obtains an IP address via DHCP, downloads a kernel via TFTP, and boots it, exactly how diskless workstations and data center servers boot.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | ★★★★☆ Expert |
| Time Estimate | 2-3 weeks |
| Language | x86 Assembly + C (alt: Pure C with UEFI) |
| Prerequisites | Project 4 (Two-Stage Bootloader), basic networking concepts, BIOS interrupt programming |
| Key Topics | PXE API, UNDI, DHCP handshake, TFTP protocol, network firmware services, data center infrastructure |
1. Learning Objectives
By completing this project, you will:
- Understand PXE Architecture: Learn how the Preboot Execution Environment provides network services before any OS exists, including the relationship between PXE, UNDI, and network option ROMs.
- Master Network Protocols at Boot Time: Implement DHCP discovery and TFTP file transfer in a pre-OS environment where no network stack exists.
- Explore Firmware Network Services: Understand how BIOS-provided network APIs work and how they abstract different network hardware.
- Build Data Center Infrastructure Knowledge: Learn the technology that powers diskless workstations, cloud server provisioning, OS deployment, and system recovery.
- Handle Robust Network Programming: Implement timeout handling, retry logic, and error recovery in constrained environments.
- Practice Protocol Implementation: Gain hands-on experience implementing RFC-specified protocols (DHCP, TFTP) from scratch.
2. Theoretical Foundation
2.1 Core Concepts
PXE (Preboot Execution Environment)
PXE is an Intel-developed standard that enables network booting. It defines:
- A client-server architecture for boot image delivery
- APIs for network access before OS initialization
- A standardized boot flow: DHCP -> TFTP -> Execute
PXE Boot Flow Overview:
┌─────────────────────────────────────────────────────────────────────────────┐
│ PXE NETWORK BOOT SEQUENCE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────┐ ┌───────────────────┐ │
│ │ Client │ │ Infrastructure │ │
│ │ System │ │ │ │
│ └─────┬─────┘ └─────────┬─────────┘ │
│ │ │ │
│ │ 1. POWER ON │ │
│ ├──────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ 2. BIOS/UEFI POST │ │
│ ├──────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ 3. Network Option ROM loads (PXE ROM) │ │
│ │ - UNDI driver initializes NIC │ │
│ │ - PXE API becomes available │ │
│ ├──────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ 4. DHCP DISCOVER ─────────────────────────────────►│ │
│ │ (broadcast: "I need an IP and boot info") │ │
│ │ │ │
│ │ 5. DHCP OFFER ◄─────────────────────────────────────│ │
│ │ (unicast: "Here's IP + TFTP server + filename") │ DHCP Server │
│ │ │ │
│ │ 6. DHCP REQUEST ───────────────────────────────────►│ │
│ │ (broadcast: "I accept this offer") │ │
│ │ │ │
│ │ 7. DHCP ACK ◄───────────────────────────────────────│ │
│ │ (unicast: "Configuration confirmed") │ │
│ ├──────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ 8. TFTP READ REQUEST ──────────────────────────────►│ │
│ │ (unicast: "Send me <bootfile>") │ TFTP Server │
│ │ │ │
│ │ 9. TFTP DATA ◄──────────────────────────────────────│ │
│ │ (512-byte blocks with ACKs) │ │
│ ├──────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ 10. JUMP TO DOWNLOADED CODE │ │
│ │ (kernel or second-stage bootloader executes) │ │
│ ▼ ▼ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
UNDI (Universal Network Driver Interface)
UNDI is the low-level API that abstracts network hardware:
UNDI Architecture:
┌─────────────────────────────────────────────────────────────────────────────┐
│ UNDI STACK │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ YOUR BOOTLOADER CODE │ │
│ │ (Calls PXE API or UNDI API directly) │ │
│ └───────────────────────────────┬─────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ PXE API LAYER │ │
│ │ │ │
│ │ PXENV_GET_CACHED_INFO - Get DHCP info from cached response │ │
│ │ PXENV_UDP_OPEN - Open UDP connection │ │
│ │ PXENV_UDP_WRITE - Send UDP packet │ │
│ │ PXENV_UDP_READ - Receive UDP packet │ │
│ │ PXENV_TFTP_READ_FILE - Download file via TFTP │ │
│ │ PXENV_UNDI_GET_STATE - Query driver state │ │
│ │ │ │
│ └───────────────────────────────┬─────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ UNDI DRIVER LAYER │ │
│ │ │ │
│ │ PXENV_UNDI_STARTUP - Initialize driver │ │
│ │ PXENV_UNDI_INITIALIZE - Initialize NIC │ │
│ │ PXENV_UNDI_OPEN - Activate NIC │ │
│ │ PXENV_UNDI_TRANSMIT - Send raw packet │ │
│ │ PXENV_UNDI_ISR - Handle interrupts │ │
│ │ PXENV_UNDI_GET_INFORMATION - Get MAC address, link status │ │
│ │ │ │
│ └───────────────────────────────┬─────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ NETWORK INTERFACE CARD (NIC) │ │
│ │ │ │
│ │ Intel PRO/1000 | Realtek RTL8139 | Broadcom BCM │ │
│ │ │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
DHCP (Dynamic Host Configuration Protocol)
DHCP provides network configuration and boot parameters:
DHCP Packet Structure (simplified):
┌─────────────────────────────────────────────────────────────────────────────┐
│ DHCP PACKET FORMAT │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Byte Offset Field Description │
│ ────────────────────────────────────────────────────────────────────────── │
│ 0 op 1=Request, 2=Reply │
│ 1 htype Hardware type (1=Ethernet) │
│ 2 hlen Hardware address length (6) │
│ 3 hops Relay agent hops │
│ 4-7 xid Transaction ID (random) │
│ 8-9 secs Seconds since boot started │
│ 10-11 flags Broadcast flag │
│ 12-15 ciaddr Client IP (if known) │
│ 16-19 yiaddr "Your" IP (assigned by server) │
│ 20-23 siaddr Server IP (TFTP server) │
│ 24-27 giaddr Gateway IP (relay agent) │
│ 28-43 chaddr Client hardware address (MAC) │
│ 44-107 sname Server hostname (optional) │
│ 108-235 file Boot filename │
│ 236-239 magic 99.130.83.99 (DHCP magic cookie) │
│ 240+ options DHCP options (variable length) │
│ │
│ IMPORTANT OPTIONS FOR PXE: │
│ ────────────────────────────────────────────────────────────────────────── │
│ Option 53 - DHCP Message Type (1=Discover, 2=Offer, 3=Request, 5=ACK) │
│ Option 54 - Server Identifier │
│ Option 66 - TFTP Server Name │
│ Option 67 - Boot Filename │
│ Option 43 - Vendor-Specific (contains PXE extensions) │
│ Option 60 - Vendor Class Identifier ("PXEClient:...") │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
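The option list that starts at byte 240 is a sequence of type-length-value records, so most of the parsing work is one loop. The following is a minimal sketch of that loop in C, not a complete DHCP parser: the function name `dhcp_find_option` is an illustrative assumption, and it ignores option overload (option 52), where `sname` and `file` carry extra options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DHCP_OPTIONS_OFFSET 240   /* 236-byte fixed header + 4-byte magic cookie */
#define DHCP_OPT_PAD        0     /* single byte, no length field */
#define DHCP_OPT_END        255   /* terminates the option list */

/* Hypothetical helper: return a pointer to the value of option `code` inside
 * a raw DHCP packet, or NULL if the option is absent or the packet is
 * malformed. The value length is stored in *len_out. */
static const uint8_t *dhcp_find_option(const uint8_t *pkt, size_t pkt_len,
                                       uint8_t code, uint8_t *len_out)
{
    static const uint8_t magic[4] = { 99, 130, 83, 99 };
    assert(pkt != NULL && len_out != NULL);

    if (pkt_len < DHCP_OPTIONS_OFFSET)
        return NULL;
    for (size_t k = 0; k < 4; k++)
        if (pkt[236 + k] != magic[k])
            return NULL;                 /* not a DHCP packet */

    size_t i = DHCP_OPTIONS_OFFSET;
    while (i < pkt_len) {
        uint8_t opt = pkt[i];
        if (opt == DHCP_OPT_END)
            break;
        if (opt == DHCP_OPT_PAD) {
            i++;
            continue;
        }
        if (i + 1 >= pkt_len)
            return NULL;                 /* truncated: no length byte */
        uint8_t len = pkt[i + 1];
        if (i + 2 + len > pkt_len)
            return NULL;                 /* truncated value */
        if (opt == code) {
            *len_out = len;
            return &pkt[i + 2];
        }
        i += 2 + (size_t)len;
    }
    return NULL;
}
```

A caller would look up option 53 first to confirm the message type, then options 54, 66, and 67 as needed.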
TFTP (Trivial File Transfer Protocol)
TFTP is a simple, UDP-based file transfer protocol:
TFTP Protocol Flow:
┌─────────────────────────────────────────────────────────────────────────────┐
│ TFTP FILE DOWNLOAD │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ CLIENT SERVER │
│ ────── ────── │
│ │ │ │
│ │ 1. RRQ (Read Request) │ │
│ │ Opcode=1, Filename, Mode="octet" │ │
│ │ ─────────────────────────────────────────────────► │ │
│ │ UDP port 69 │ │
│ │ │ │
│ │ 2. DATA Block #1 │ │
│ │ Opcode=3, Block=1, Data (512 bytes) │ │
│ │ ◄───────────────────────────────────────────────── │ │
│ │ New UDP port (TID) │ │
│ │ │ │
│ │ 3. ACK Block #1 │ │
│ │ Opcode=4, Block=1 │ │
│ │ ─────────────────────────────────────────────────► │ │
│ │ │ │
│ │ 4. DATA Block #2 │ │
│ │ Opcode=3, Block=2, Data (512 bytes) │ │
│ │ ◄───────────────────────────────────────────────── │ │
│ │ │ │
│ │ 5. ACK Block #2 │ │
│ │ ─────────────────────────────────────────────────► │ │
│ │ │ │
│ │ ... continues until ... │ │
│ │ │ │
│ │ N. DATA Block #N (< 512 bytes = last block) │ │
│ │ ◄───────────────────────────────────────────────── │ │
│ │ │ │
│ │ N+1. ACK Block #N (transfer complete) │ │
│ │ ─────────────────────────────────────────────────► │ │
│ │ │ │
│ ▼ ▼ │
│ │
│ TFTP Packet Types: │
│ ────────────────────────────────────────────────────────────────────────── │
│ Opcode 1 = RRQ (Read Request) [opcode][filename\0][mode\0] │
│ Opcode 2 = WRQ (Write Request) [opcode][filename\0][mode\0] │
│ Opcode 3 = DATA [opcode][block#][data...] │
│ Opcode 4 = ACK [opcode][block#] │
│ Opcode 5 = ERROR [opcode][errcode][errmsg\0] │
│ │
│ Block numbers are 16-bit, starting at 1 │
│ Standard block size is 512 bytes │
│ Last block has < 512 bytes (signals end of transfer) │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
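The packet layouts above are small enough to encode by hand. As a sketch under stated assumptions (the function names are illustrative, and only the RRQ and ACK packets a read-only client sends are covered), the two encoders might look like this in C:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { TFTP_RRQ = 1, TFTP_WRQ = 2, TFTP_DATA = 3, TFTP_ACK = 4, TFTP_ERROR = 5 };

/* Build an RRQ: [opcode=1][filename\0]["octet"\0].
 * Returns the packet length, or 0 if it does not fit in `cap` bytes. */
static size_t tftp_build_rrq(uint8_t *out, size_t cap, const char *filename)
{
    static const char mode[] = "octet";          /* binary transfer mode */
    assert(out != NULL && filename != NULL);

    size_t name_len = strlen(filename) + 1;      /* include the NUL */
    size_t total = 2 + name_len + sizeof mode;   /* sizeof includes the NUL */
    if (total > cap)
        return 0;
    out[0] = 0;                                  /* opcodes are big-endian */
    out[1] = TFTP_RRQ;
    memcpy(out + 2, filename, name_len);
    memcpy(out + 2 + name_len, mode, sizeof mode);
    return total;
}

/* Build an ACK: [opcode=4][block#], both 16-bit big-endian. */
static size_t tftp_build_ack(uint8_t *out, size_t cap, uint16_t block)
{
    assert(out != NULL);
    if (cap < 4)
        return 0;
    out[0] = 0;
    out[1] = TFTP_ACK;
    out[2] = (uint8_t)(block >> 8);
    out[3] = (uint8_t)(block & 0xFF);
    return 4;
}
```

The RRQ goes to the server's port 69; every ACK goes to the new port (TID) the server used for its first DATA packet.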
2.2 Why This Matters
Data Center Reality
Network boot is fundamental to modern infrastructure:
Data Center Boot Scenarios:
┌─────────────────────────────────────────────────────────────────────────────┐
│ WHY PXE MATTERS IN THE REAL WORLD │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ SCENARIO 1: Diskless Workstations │
│ ────────────────────────────────────────────────────────────────────────── │
│ • Libraries, schools, call centers with 100+ identical machines │
│ • No local storage = no malware persistence, no data theft │
│ • Central image = update once, all machines get new software │
│ • Cost savings = no local drives to fail, easy hardware swap │
│ │
│ SCENARIO 2: Cloud Server Provisioning │
│ ────────────────────────────────────────────────────────────────────────── │
│ • AWS, Azure, GCP: new VM instances boot from network │
│ • Bare-metal cloud (Packet, Equinix Metal): physical servers PXE boot │
│ • Dynamic OS deployment: same hardware, different OS per customer │
│ • Automated at scale: thousands of servers, zero manual intervention │
│ │
│ SCENARIO 3: OS Installation │
│ ────────────────────────────────────────────────────────────────────────── │
│ • Enterprise deployments: SCCM, Cobbler, Foreman │
│ • Linux installers: Anaconda, Debian-installer support PXE │
│ • Windows Deployment Services (WDS): PXE-based │
│ • Recovery environments: boot recovery tools without media │
│ │
│ SCENARIO 4: Embedded Systems │
│ ────────────────────────────────────────────────────────────────────────── │
│ • Routers, switches: boot image from central management server │
│ • Industrial systems: update firmware across factory floor │
│ • Kiosk systems: identical boot image, central management │
│ │
│ THE SCALE: │
│ ────────────────────────────────────────────────────────────────────────── │
│ • Hyperscale: Facebook/Meta provisions servers at rate of 1000s/week │
│ • Each AWS availability zone: 10,000s of physical servers │
│ • Enterprise: Fortune 500 companies manage 50,000+ endpoints via PXE │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
2.3 Historical Context
The Evolution of Network Boot
- 1997: Intel releases PXE 1.0 specification
- 1999: PXE 2.0 adds UNDI abstraction layer
- 2000s: Etherboot/gPXE, and later iPXE (forked from gPXE in 2010), extend PXE with HTTP, iSCSI, SAN boot
- 2010s: UEFI includes native network boot (HTTP Boot in UEFI 2.5)
- Today: PXE remains standard, with iPXE and UEFI HTTP Boot as alternatives
Network Boot Evolution:
1990 1997 1999 2005 2015 2024
│ │ │ │ │ │
▼ ▼ ▼ ▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ BOOTP │──►│ PXE 1.0 │─►│ PXE 2.0 │─►│ iPXE │─►│ UEFI │─►│ HTTP Boot│
│ (RFC951) │ │ │ │ + UNDI │ │ gPXE │ │ Network │ │ Secure │
└──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘
│ │ │ │ │ │
│ │ │ │ │ │
Simple Standard Abstracted Extended Native Modern
boot network hardware HTTP/iSCSI UEFI TLS/HTTP
protocol boot drivers support support support
2.4 Common Misconceptions
- “PXE requires special network hardware”
  - Reality: Most modern NICs have PXE ROMs; onboard NICs almost always include one
  - Even virtual NICs in QEMU/VirtualBox support PXE
- “Network boot is slow”
  - Reality: On a low-latency gigabit LAN, a 10MB kernel typically downloads in a few seconds
  - TFTP is simple (no TCP or encryption overhead), but it sends one block per round trip, so round-trip time rather than bandwidth limits its speed
- “PXE only works with BIOS, not UEFI”
  - Reality: UEFI has native PXE support and adds HTTP Boot
  - UEFI PXE follows the same DHCP/TFTP flow but is exposed through EFI boot-services protocols instead of the INT 1Ah/PXENV+ interface
- “You need expensive infrastructure”
  - Reality: A laptop running dnsmasq can be a complete PXE server
  - QEMU has built-in TFTP support for testing
- “DHCP and PXE are the same thing”
  - Reality: DHCP assigns IPs; PXE uses DHCP extensions for boot info
  - You can have DHCP without PXE, but PXE requires DHCP (or BOOTP)
3. Project Specification
3.1 What You Will Build
A bootloader that:
- Detects PXE/UNDI firmware presence
- Uses the PXE API to perform DHCP discovery
- Obtains an IP address, subnet mask, and boot file information
- Downloads a kernel image via TFTP
- Loads the kernel to memory and transfers control
3.2 Functional Requirements
| ID | Requirement | Priority |
|---|---|---|
| FR1 | Detect PXE ROM presence via INT 1Ah | Must Have |
| FR2 | Initialize UNDI driver and get NIC information | Must Have |
| FR3 | Perform complete DHCP handshake (DORA) | Must Have |
| FR4 | Parse DHCP response for IP, gateway, TFTP server, filename | Must Have |
| FR5 | Download boot file via TFTP | Must Have |
| FR6 | Handle TFTP block acknowledgment and retransmission | Must Have |
| FR7 | Display progress during download | Should Have |
| FR8 | Verify downloaded file integrity (optional checksum) | Could Have |
| FR9 | Support PXE API (high-level) and UNDI API (low-level) | Should Have |
| FR10 | Jump to downloaded kernel with boot info structure | Must Have |
3.3 Non-Functional Requirements
| ID | Requirement | Priority |
|---|---|---|
| NFR1 | Stage 1 bootloader fits in MBR (446 bytes usable) | Must Have |
| NFR2 | Total code size < 64KB (single segment) | Should Have |
| NFR3 | Handle network timeouts gracefully | Must Have |
| NFR4 | Retry failed operations up to 3 times | Should Have |
| NFR5 | Work in QEMU with virtio-net or e1000 | Must Have |
3.4 Example Usage / Output
# Server setup (on host machine):
$ mkdir -p /srv/tftp
$ cp kernel.bin /srv/tftp/
$ dnsmasq --interface=eth0 \
--dhcp-range=10.0.2.100,10.0.2.200,12h \
--enable-tftp \
--tftp-root=/srv/tftp \
--dhcp-boot=kernel.bin
# QEMU with network boot:
$ qemu-system-x86_64 \
-boot n \
-netdev user,id=net0,tftp=/srv/tftp,bootfile=kernel.bin \
-device e1000,netdev=net0 \
-serial stdio
# Expected bootloader output:
PXE Network Bootloader v1.0
===========================
Detecting PXE environment...
PXE ROM found at segment 0x9C00
PXENV+ structure at 0x9C00:0x0010
PXE version: 2.1
UNDI code segment: 0x9800
Initializing network interface...
UNDI: Intel PRO/1000 MT Desktop (82540EM)
MAC Address: 52:54:00:12:34:56
Link status: UP
Performing DHCP discovery...
Sending DHCP DISCOVER (broadcast)...
Received DHCP OFFER from 10.0.2.2
Offered IP: 10.0.2.100
Sending DHCP REQUEST...
Received DHCP ACK
Assigned IP: 10.0.2.100
Subnet Mask: 255.255.255.0
Gateway: 10.0.2.2
TFTP Server: 10.0.2.2
Boot File: kernel.bin
Downloading kernel.bin via TFTP...
Requesting file from 10.0.2.2:69
[=====================================] 100%
Downloaded 65536 bytes (129 blocks)
Preparing to boot...
Loading kernel at 0x100000
Entry point: 0x100000
Jumping to kernel...
[Kernel output begins here]
Hello from network-booted kernel!
3.5 Real World Outcome
You will have built the same technology that powers:
- Every diskless thin client
- Cloud provider bare-metal provisioning (AWS, Packet, Equinix)
- Enterprise OS deployment (SCCM, Cobbler, Foreman)
- Linux installer network boot
- System recovery without bootable media
4. Solution Architecture
4.1 High-Level Design
Network Boot Architecture:
┌─────────────────────────────────────────────────────────────────────────────┐
│ NETWORK BOOTLOADER ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ STAGE 1 (MBR - 512 bytes) │ │
│ ├─────────────────────────────────────────────────────────────────────┤ │
│ │ 1. Minimal initialization (stack, segments) │ │
│ │ 2. Detect PXE (INT 1Ah, AX=5650h) │ │
│ │ 3. If PXE found: jump to Stage 2 in PXE ROM area │ │
│ │ 4. If no PXE: display error, halt │ │
│ └──────────────────────────────────┬──────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ STAGE 2 (Network Boot Code) │ │
│ ├─────────────────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐ │ │
│ │ │ PXE Detection │───►│ DHCP Handler │───►│ TFTP Handler │ │ │
│ │ │ │ │ │ │ │ │ │
│ │ │ - Find PXENV+ │ │ - Discover │ │ - Open file │ │ │
│ │ │ - Find !PXE │ │ - Offer │ │ - Read blocks │ │ │
│ │ │ - Get API │ │ - Request │ │ - ACK blocks │ │ │
│ │ │ entry point │ │ - Acknowledge │ │ - Close file │ │ │
│ │ └───────────────┘ └───────────────┘ └───────────────┘ │ │
│ │ │ │ │ │ │
│ │ │ │ │ │ │
│ │ ▼ ▼ ▼ │ │
│ │ ┌─────────────────────────────────────────────────────────────────┐ │ │
│ │ │ PXE API INTERFACE │ │ │
│ │ │ │ │ │
│ │ │ pxe_call(opcode, parameter_block) │ │ │
│ │ │ - Sets BX to opcode, ES:DI to parameter block │ │ │
│ │ │ - Calls PXE entry point via far call │ │ │
│ │ │ - Returns status in AX │ │ │
│ │ │ │ │ │
│ │ └─────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ └──────────────────────────────┼────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ PXE ROM (Network Card BIOS) │ │
│ ├─────────────────────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ UNDI Layer PXE Services │ │
│ │ ────────── ──────────── │ │
│ │ - Hardware abstraction - DHCP caching │ │
│ │ - Packet TX/RX - UDP open/read/write │ │
│ │ - Interrupt handling - TFTP file operations │ │
│ │ │ │
│ └──────────────────────────────┬──────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ NETWORK INTERFACE CARD │ │
│ └──────────────────────────────┬──────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────────────┐ │
│ │ NETWORK │ │
│ │ (DHCP Server + TFTP Server) │ │
│ └─────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
4.2 Key Components
- PXE Detector: Finds PXE structures in memory using INT 1Ah or scanning
- API Interface: Wrapper functions for calling PXE/UNDI functions
- DHCP Client: Performs the 4-way handshake, parses options
- TFTP Client: Downloads files in 512-byte blocks with ACKs
- Boot Info Builder: Prepares data structure for kernel
- Display Handler: Shows progress and status messages
4.3 Data Structures
/* PXE Environment Structure (PXENV+) - found via INT 1Ah */
typedef struct {
uint8_t signature[6]; /* "PXENV+" */
uint16_t version; /* PXE version (e.g., 0x0201 = 2.1) */
uint8_t length; /* Structure length */
uint8_t checksum; /* Byte checksum (sum to 0) */
uint32_t rm_entry; /* Real-mode entry point (SEG:OFF) */
uint32_t pm_offset; /* Protected-mode entry offset */
uint16_t pm_selector; /* Protected-mode entry selector */
uint16_t stack_seg; /* Stack segment */
uint16_t stack_size; /* Stack size */
uint16_t bc_code_seg; /* Base-code code segment */
uint16_t bc_code_size; /* Base-code code size */
uint16_t bc_data_seg; /* Base-code data segment */
uint16_t bc_data_size; /* Base-code data size */
uint16_t undi_data_seg; /* UNDI data segment */
uint16_t undi_data_size; /* UNDI data size */
uint16_t undi_code_seg; /* UNDI code segment */
uint16_t undi_code_size; /* UNDI code size */
uint32_t pxe_ptr; /* Pointer to !PXE structure */
} __attribute__((packed)) pxenv_t;
/* !PXE Structure (newer, preferred) */
typedef struct {
uint8_t signature[4]; /* "!PXE" */
uint8_t struct_length; /* Structure length */
uint8_t struct_checksum; /* Checksum */
uint8_t struct_rev; /* Revision */
uint8_t reserved;
uint32_t undi_rom_id; /* UNDI ROM ID pointer */
uint32_t base_rom_id; /* Base ROM ID pointer */
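uint32_t entry_point_sp; /* API entry point (SEG:OFF) for callers with a 16-bit stack segment */
uint32_t entry_point_esp; /* API entry point for callers with a 32-bit stack segment */
uint32_t status_callout; /* Status callout procedure (SEG:OFF) */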
uint8_t reserved2;
uint8_t seg_desc_cnt; /* Segment descriptor count */
uint16_t first_seg_sel; /* First segment selector */
/* ... segment descriptors follow */
} __attribute__((packed)) pxe_t;
/* DHCP Cached Information (returned by PXENV_GET_CACHED_INFO) */
typedef struct {
uint16_t status;
uint16_t packet_type; /* 1=DHCPDISCOVER, 2=DHCPACK, 3=Cached */
uint16_t buffer_size;
uint16_t buffer_offset;
uint16_t buffer_segment;
uint16_t buffer_limit;
} __attribute__((packed)) pxenv_cached_info_t;
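/* TFTP Read File (PXENV_TFTP_READ_FILE), field order per the PXE 2.1 specification */
typedef struct {
    uint16_t status;
    uint8_t filename[128]; /* Filename to download (NUL-terminated) */
    uint32_t buffer_size; /* In: size of receive buffer; out: bytes downloaded */
    uint32_t buffer; /* 32-bit physical address of receive buffer */
    uint32_t server_ip; /* TFTP server IP (network byte order) */
    uint32_t gateway_ip; /* Gateway IP (if different subnet) */
    uint32_t mcast_ip; /* Multicast IP (0 for a unicast transfer) */
    uint16_t tftp_clnt_port; /* Client UDP port (network byte order) */
    uint16_t tftp_srv_port; /* Server UDP port (network byte order, normally 69) */
    uint16_t tftp_open_timeout; /* Timeout for the initial server response */
    uint16_t tftp_reopen_delay; /* Delay before reopening (multicast transfers) */
} __attribute__((packed)) pxenv_tftp_read_file_t;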
/* UDP Open/Read/Write structures for manual DHCP/TFTP */
typedef struct {
uint16_t status;
uint32_t src_ip; /* Source IP for this socket */
} __attribute__((packed)) pxenv_udp_open_t;
typedef struct {
uint16_t status;
uint32_t ip; /* Source IP of received packet */
uint32_t dest_ip; /* Destination IP of received packet (the UDP write structure carries the gateway here instead) */
uint16_t src_port; /* Source UDP port */
uint16_t dest_port; /* Destination UDP port */
uint16_t buffer_size; /* Buffer size (in/out) */
uint16_t buffer_offset; /* Buffer offset */
uint16_t buffer_segment; /* Buffer segment */
} __attribute__((packed)) pxenv_udp_read_t;
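Both PXENV+ and !PXE carry a length byte and a checksum byte chosen so that all bytes of the structure sum to zero, and their pointers are stored as real-mode SEG:OFF pairs. The sketch below shows both checks in C; the helper names are assumptions for illustration, and a real bootloader would do the same arithmetic in assembly against the structure found via INT 1Ah.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A structure is valid when its `length` bytes sum to 0 modulo 256.
 * `length` comes from the structure's own length field. */
static int pxe_checksum_ok(const uint8_t *s, size_t length)
{
    assert(s != NULL);
    uint8_t sum = 0;
    for (size_t i = 0; i < length; i++)
        sum = (uint8_t)(sum + s[i]);
    return sum == 0;
}

/* Convert a SEG:OFF pair, read as one little-endian 32-bit value
 * (offset in the low word, segment in the high word), to a linear address. */
static uint32_t segoff_to_linear(uint32_t segoff)
{
    uint32_t seg = segoff >> 16;
    uint32_t off = segoff & 0xFFFFu;
    return (seg << 4) + off;
}
```

For example, a PXENV+ structure reported at 0x9C00:0x0010 sits at linear address 0x9C010.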
4.4 Algorithm Overview
PXE Boot Algorithm:
1. INITIALIZATION
├─ Set up stack at safe location (0x7C00 - 0x1000)
├─ Save DL (boot drive, usually 0x00 for network)
└─ Clear direction flag, set up segments
2. PXE DETECTION
├─ Call INT 1Ah with AX=5650h ("PV" for PXE Verify)
├─ If CF=0 and AX=564Eh ("VN"), PXE present
│ ├─ ES:BX points to PXENV+ structure
│ ├─ Validate signature "PXENV+"
│ ├─ Verify checksum (all bytes sum to 0)
│ └─ Follow pxe_ptr to find !PXE if version >= 2.1
└─ If CF=1, no PXE (display error, halt)
3. UNDI INITIALIZATION
├─ Call PXENV_UNDI_GET_STATE to verify driver ready
├─ Call PXENV_UNDI_GET_INFORMATION for MAC address
└─ Display NIC information to user
4. DHCP ACQUISITION
├─ Option A: Use cached DHCP (simple)
│ ├─ Call PXENV_GET_CACHED_INFO, PacketType=2 (DHCPACK)
│ └─ Parse returned BOOTPLAYER structure (a BOOTP reply) for IP, server, filename
│
└─ Option B: Manual DHCP (full control)
├─ Call PXENV_UDP_OPEN with source IP 0.0.0.0
├─ Build DHCP DISCOVER packet
├─ Call PXENV_UDP_WRITE to 255.255.255.255:67
├─ Call PXENV_UDP_READ for OFFER (with timeout)
├─ Build DHCP REQUEST packet
├─ Call PXENV_UDP_WRITE to server
├─ Call PXENV_UDP_READ for ACK
└─ Parse ACK for IP, gateway, TFTP server, filename
5. TFTP DOWNLOAD
├─ Option A: Use PXE TFTP API (simple)
│ ├─ Build PXENV_TFTP_READ_FILE structure
│ ├─ Set server IP, filename, buffer location
│ ├─ Call PXE API function 0x0023
│ └─ File is loaded to buffer, size returned
│
└─ Option B: Manual TFTP (full control)
├─ Build TFTP RRQ (Read Request) packet
├─ Send to server:69 via PXENV_UDP_WRITE
├─ Loop:
│ ├─ Receive DATA block via PXENV_UDP_READ
│ ├─ Copy data to destination buffer
│ ├─ Send ACK for block number
│ ├─ If block < 512 bytes, done
│ └─ Increment expected block number
└─ Display progress during transfer
6. BOOT KERNEL
├─ Set up boot info structure (memory map, framebuffer, etc.)
├─ Transition to protected mode if needed
├─ Jump to kernel entry point
└─ Pass pointer to boot info in register (e.g., EBX)
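The manual TFTP loop in step 5 (Option B) is mostly a decision about what to do with each incoming DATA packet. The following C sketch isolates that decision so it can be tested without a network; the type and function names are hypothetical, the 512-byte block size is the standard default from the protocol section, and sending the ACK itself is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TFTP_BLOCK_SIZE 512u

typedef enum {
    TFTP_RX_CONTINUE,   /* block stored: ACK it and wait for the next one */
    TFTP_RX_DONE,       /* short block stored: ACK it and stop */
    TFTP_RX_DUPLICATE,  /* previous block seen again: re-ACK, store nothing */
    TFTP_RX_ERROR       /* malformed, out of order, not DATA, or buffer full */
} tftp_rx_result;

typedef struct {
    uint8_t *dest;             /* destination buffer for the file */
    size_t   dest_cap;         /* capacity of dest in bytes */
    size_t   received;         /* bytes stored so far */
    uint16_t expected_block;   /* 1 before the first DATA packet */
} tftp_rx_state;

/* Process one received UDP payload. On CONTINUE, DONE, or DUPLICATE,
 * *ack_block holds the block number the caller should acknowledge. */
static tftp_rx_result tftp_handle_data(tftp_rx_state *st, const uint8_t *pkt,
                                       size_t pkt_len, uint16_t *ack_block)
{
    assert(st != NULL && pkt != NULL && ack_block != NULL);

    if (pkt_len < 4 || pkt_len > 4 + TFTP_BLOCK_SIZE)
        return TFTP_RX_ERROR;
    uint16_t opcode = (uint16_t)((pkt[0] << 8) | pkt[1]);
    uint16_t block  = (uint16_t)((pkt[2] << 8) | pkt[3]);
    if (opcode != 3)                       /* includes opcode 5 = ERROR */
        return TFTP_RX_ERROR;

    if (st->received > 0 && block == (uint16_t)(st->expected_block - 1)) {
        *ack_block = block;                /* our ACK was probably lost */
        return TFTP_RX_DUPLICATE;
    }
    if (block != st->expected_block)
        return TFTP_RX_ERROR;

    size_t payload = pkt_len - 4;
    if (payload > st->dest_cap - st->received)
        return TFTP_RX_ERROR;              /* file larger than the buffer */
    memcpy(st->dest + st->received, pkt + 4, payload);
    st->received += payload;
    st->expected_block++;
    *ack_block = block;
    return payload < TFTP_BLOCK_SIZE ? TFTP_RX_DONE : TFTP_RX_CONTINUE;
}
```

Keeping the state in a small struct makes the retransmission case explicit: a duplicate block is acknowledged again but never copied twice.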
5. Implementation Guide
5.1 Development Environment Setup
# Required tools
$ brew install nasm qemu # macOS
$ apt install nasm qemu-system-x86 dnsmasq # Linux
# Create project directory
$ mkdir pxe-bootloader && cd pxe-bootloader
# Project structure
pxe-bootloader/
├── src/
│ ├── stage1.asm # MBR loader (optional, QEMU can boot NBP directly)
│ ├── pxe_main.asm # Main PXE bootloader entry
│ ├── pxe_api.asm # PXE API wrapper functions
│ ├── dhcp.asm # DHCP client implementation
│ ├── tftp.asm # TFTP client implementation
│ ├── display.asm # Console output routines
│ └── kernel.asm # Simple test kernel to boot
├── include/
│ └── pxe.inc # PXE structure definitions
├── tftp/
│ └── kernel.bin # Test kernel for TFTP server
├── Makefile
└── run.sh # QEMU launch script
5.2 Project Structure
Makefile:
AS = nasm
ASFLAGS = -f bin
.PHONY: all run clean
all: pxeboot.bin kernel.bin
pxeboot.bin: src/pxe_main.asm src/pxe_api.asm src/dhcp.asm src/tftp.asm
	$(AS) $(ASFLAGS) -o $@ src/pxe_main.asm
kernel.bin: src/kernel.asm
	$(AS) $(ASFLAGS) -o $@ src/kernel.asm
# pxeboot.bin must also be in the TFTP root, because QEMU serves it as the boot file
run: all
	mkdir -p tftp
	cp pxeboot.bin kernel.bin tftp/
	qemu-system-x86_64 \
		-boot n \
		-netdev user,id=net0,tftp=tftp,bootfile=pxeboot.bin \
		-device e1000,netdev=net0 \
		-serial stdio
clean:
	rm -f *.bin tftp/*.bin
5.3 The Core Question You’re Answering
“How does a computer boot an operating system when it has no local storage, using only network infrastructure?”
This forces you to understand:
- How firmware provides network services before any OS exists
- How IP configuration works (DHCP) at the lowest level
- How files are transferred (TFTP) with minimal protocol overhead
- The chain of trust in network boot (firmware -> bootloader -> kernel)
5.4 Concepts You Must Understand First
Before implementing, you should be able to answer:
- What is the difference between PXE and UNDI?
  - PXE = high-level boot services (DHCP, TFTP)
  - UNDI = low-level network driver interface (packet TX/RX)
  - PXE uses UNDI internally
- Why is TFTP used instead of HTTP or FTP?
  - TFTP uses UDP (no connection state)
  - Minimal code size to implement
  - Fits in boot ROM constraints
  - Simple block-based transfer
- What is the DHCP 4-way handshake (DORA)?
  - Discover: Client broadcasts “I need an IP”
  - Offer: Server offers an IP
  - Request: Client requests the offered IP
  - Acknowledge: Server confirms the lease
- How does PXE API calling convention work?
  - PXENV+ entry point: opcode in BX, parameter structure at ES:DI
  - !PXE entry point: push the parameter structure's segment, then its offset, then the opcode; the caller removes them from the stack after the call
  - Far call to PXE entry point
  - Exit code returned in AX (0 = success); detailed status in the structure's first field
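To make the Discover step of DORA concrete, here is a hedged C sketch that fills in a DHCPDISCOVER using the field offsets from the packet format in section 2.1. The function name is illustrative, and the option set is a simplification: real PXE ROMs send a longer vendor class string ("PXEClient:...") plus additional client identification options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DHCP_MIN_PACKET 300   /* BOOTP minimum: 236-byte header + 64-byte options area */

/* Write a DHCPDISCOVER into `out`. Returns the packet length (always
 * DHCP_MIN_PACKET), or 0 if the buffer is too small. */
static size_t dhcp_build_discover(uint8_t *out, size_t cap,
                                  const uint8_t mac[6], uint32_t xid)
{
    static const uint8_t tail[] = {
        99, 130, 83, 99,                 /* magic cookie at offset 236 */
        53, 1, 1,                        /* option 53: message type = DISCOVER */
        60, 9, 'P', 'X', 'E', 'C', 'l', 'i', 'e', 'n', 't',  /* option 60 (shortened) */
        55, 4, 1, 3, 66, 67,             /* option 55: ask for mask, router, TFTP server, boot file */
        255                              /* end of options */
    };
    assert(out != NULL && mac != NULL);
    if (cap < DHCP_MIN_PACKET)
        return 0;

    memset(out, 0, DHCP_MIN_PACKET);     /* ciaddr, yiaddr, siaddr, giaddr stay 0 */
    out[0] = 1;                          /* op: request */
    out[1] = 1;                          /* htype: Ethernet */
    out[2] = 6;                          /* hlen: MAC length */
    out[4] = (uint8_t)(xid >> 24);       /* xid, big-endian */
    out[5] = (uint8_t)(xid >> 16);
    out[6] = (uint8_t)(xid >> 8);
    out[7] = (uint8_t)xid;
    out[10] = 0x80;                      /* flags: broadcast bit set */
    memcpy(out + 28, mac, 6);            /* chaddr */
    memcpy(out + 236, tail, sizeof tail);
    return DHCP_MIN_PACKET;
}
```

The packet is sent from 0.0.0.0 port 68 to 255.255.255.255 port 67; the same `xid` must appear in the server's OFFER and in the later REQUEST.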
5.5 Questions to Guide Your Design
PXE Detection:
- How do you find the PXENV+ structure? (INT 1Ah or memory scan?)
- What’s the difference between PXENV+ and !PXE structures?
- How do you validate the structures (checksum)?
DHCP Implementation:
- Should you use cached DHCP info or perform full handshake?
- What options do you need to parse (IP, gateway, TFTP server, filename)?
- How do you handle timeouts and retries?
TFTP Implementation:
- Use PXE TFTP API or implement over UDP?
- How do you handle block acknowledgment?
- What buffer address should you use for the download?
Kernel Handoff:
- Where do you load the kernel in memory?
- What information does the kernel need (IP, memory map)?
- How do you transition from real mode to kernel entry?
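One way to approach the kernel handoff questions is to fix the boot info layout before writing any assembly. The structure below is purely a hypothetical example in C: the field set, the magic value, and the names `boot_info_t` and `boot_info_init` are assumptions made for illustration, not a format defined by PXE or by this project. It uses the same GCC/Clang `packed` attribute as the structures in section 4.3.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BOOT_INFO_MAGIC 0x42455850u   /* hypothetical tag so the kernel can sanity-check the pointer */

/* Hypothetical 32-byte handoff block passed to the kernel (e.g. via EBX). */
typedef struct {
    uint32_t magic;         /* BOOT_INFO_MAGIC */
    uint32_t client_ip;     /* yiaddr, network byte order */
    uint32_t server_ip;     /* siaddr, network byte order */
    uint32_t gateway_ip;    /* router, network byte order (0 if unknown) */
    uint32_t kernel_addr;   /* physical load address of the kernel */
    uint32_t kernel_size;   /* bytes downloaded via TFTP */
    uint8_t  mac[6];        /* NIC hardware address */
    uint8_t  reserved[2];   /* keeps the size a multiple of 4 */
} __attribute__((packed)) boot_info_t;

static void boot_info_init(boot_info_t *bi, uint32_t client_ip,
                           uint32_t server_ip, uint32_t gateway_ip,
                           const uint8_t mac[6],
                           uint32_t kernel_addr, uint32_t kernel_size)
{
    assert(bi != NULL && mac != NULL);
    memset(bi, 0, sizeof *bi);
    bi->magic = BOOT_INFO_MAGIC;
    bi->client_ip = client_ip;
    bi->server_ip = server_ip;
    bi->gateway_ip = gateway_ip;
    bi->kernel_addr = kernel_addr;
    bi->kernel_size = kernel_size;
    memcpy(bi->mac, mac, 6);
}
```

A fixed-size block with a magic value is easy to consume from a freestanding kernel; a memory map or framebuffer description would be added as further fields if the kernel needs them.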
5.6 Thinking Exercise
Before coding, trace through this scenario manually:
You are a PXE bootloader running on a diskless workstation.
The NIC's PXE ROM has already initialized and done basic DHCP.
1. You call INT 1Ah with AX=5650h. What do you expect in ES:BX?
2. You find PXENV+ at 0x9C00:0x0010. How do you verify it's valid?
3. You call PXENV_GET_CACHED_INFO. What structure do you pass?
4. The DHCP response shows siaddr=10.0.2.2, file="kernel.bin".
How do you initiate the TFTP download?
5. TFTP sends you block 1 (512 bytes). What's your response?
6. Block 47 arrives with only 128 bytes. What does this mean?
7. You've loaded 23,680 bytes at 0x100000. How do you boot it?
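The numbers in questions 6 and 7 can be checked with the TFTP end-of-transfer rule: the transfer ends with the first block shorter than the block size, so a file always takes one more block than its count of full blocks. A small C sketch of that arithmetic (the function names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Number of DATA packets needed for a file: every full block, plus one
 * final short block (which is empty when the size is an exact multiple). */
static uint32_t tftp_block_count(uint32_t file_size, uint32_t block_size)
{
    assert(block_size > 0);
    return file_size / block_size + 1;
}

/* Payload length of the final DATA packet. */
static uint32_t tftp_last_block_len(uint32_t file_size, uint32_t block_size)
{
    assert(block_size > 0);
    return file_size % block_size;
}
```

With 512-byte blocks, 23,680 bytes is 46 full blocks plus a 128-byte block 47, which matches the scenario. A 65,536-byte file needs 129 packets, because 128 full blocks must be followed by an empty one to signal the end.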
5.7 Hints in Layers
Hint 1: Finding PXE (Starting Point)
The standard way to detect PXE:
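; INT 1Ah, AX=5650h ("PV") returns:
; CF=0, AX=564Eh ("VN") if PXE present
; ES:BX = pointer to PXENV+ structure
detect_pxe:
    mov ax, 0x5650          ; "PV" - PXE Verify
    int 0x1a
    jc .no_pxe              ; CF set = not present
    cmp ax, 0x564E          ; "VN" = Valid Network
    jne .no_pxe
    ; ES:BX now points to PXENV+; save it for signature and checksum validation
    mov [pxenv_off], bx     ; pxenv_off/pxenv_seg: word variables (illustrative names)
    mov [pxenv_seg], es
    clc                     ; CF=0: PXE present
    ret
.no_pxe:
    stc                     ; CF=1: no PXE, caller displays error and halts
    ret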
Hint 2: Calling PXE API (Next Level)
All PXE calls follow this pattern:
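; Generic PXE API call (PXENV+ entry point convention)
; Input: BX = opcode, ES:DI = parameter block
; Output: AX = exit code (0 = success); detailed status in the
;         first word of the parameter block
; The !PXE entry point instead takes its arguments on the stack
pxe_call:
    push ds
    push es
    push di
    ; Far call through the saved entry point. pxe_entry_off must be
    ; immediately followed in memory by pxe_entry_seg (offset:segment dword).
    call far [pxe_entry_off]
    pop di                  ; Restore registers the PXE stack may have changed
    pop es
    pop ds
    ret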
Hint 3: Using Cached DHCP (Technical Details)
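; Get cached DHCP information
; PacketType: 1=DHCPDISCOVER, 2=DHCPACK, 3=Cached reply
; Assumes DS = 0 (the NBP is loaded at 0000:7C00), so labels are segment-0 offsets
get_dhcp_info:
    mov word [cached_info.status], 0
    mov word [cached_info.packet_type], 2       ; Want DHCPACK
    mov word [cached_info.buffer_size], 1024    ; Size of dhcp_buffer
    mov word [cached_info.buffer_segment], 0
    mov word [cached_info.buffer_offset], dhcp_buffer
    push ds
    pop es                                      ; ES:DI must point to the parameter block
    mov di, cached_info
    mov bx, 0x0071                              ; PXENV_GET_CACHED_INFO opcode
    call pxe_call
    test ax, ax                                 ; Nonzero AX = call failed
    jnz .failed
    ; dhcp_buffer now contains the cached BOOTP reply (BOOTPLAYER format)
    ; yiaddr (offset 16) = our IP
    ; siaddr (offset 20) = TFTP server IP
    ; file (offset 108) = boot filename
    ret
.failed:
    stc                                         ; cached_info.status holds the error code
    ret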
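The three offsets listed in Hint 3 are fixed fields of the BOOTP header, so extracting them needs no option parsing. A minimal C sketch of that extraction follows; the struct and function names are assumptions for illustration, and it deliberately ignores the case where a server puts the filename in option 67 instead of the `file` field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the values the bootloader needs. */
typedef struct {
    uint8_t your_ip[4];      /* yiaddr: address assigned to this client */
    uint8_t server_ip[4];    /* siaddr: next server, i.e. the TFTP server */
    char    boot_file[129];  /* file field plus a guaranteed NUL */
} bootp_params_t;

/* Copy yiaddr, siaddr, and file out of a cached BOOTP reply.
 * Returns 0 on success, -1 if the buffer is too short or not a reply. */
static int bootp_extract(const uint8_t *pkt, size_t pkt_len, bootp_params_t *out)
{
    assert(pkt != NULL && out != NULL);
    if (pkt_len < 236 || pkt[0] != 2)    /* op must be 2 = Reply */
        return -1;
    memcpy(out->your_ip, pkt + 16, 4);
    memcpy(out->server_ip, pkt + 20, 4);
    memcpy(out->boot_file, pkt + 108, 128);
    out->boot_file[128] = '\0';          /* the field may lack a terminator */
    return 0;
}
```

The addresses stay in network byte order as byte arrays, which is the form the PXE TFTP and UDP parameter blocks expect.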
Hint 4: TFTP Download (Verification)
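; Use PXE's built-in TFTP (simplest approach)
; Field names follow the PXENV_TFTP_READ_FILE layout in the PXE 2.1 specification
; Assumes DS = ES and that tftp_read was zero-filled beforehand
tftp_download:
    ; Set up TFTP read file structure
    mov eax, [tftp_server_ip]
    mov [tftp_read.server_ip], eax
    mov dword [tftp_read.gateway_ip], 0         ; Use default
    mov word [tftp_read.tftp_srv_port], 0x4500  ; Port 69 in network byte order
    ; Copy filename (fixed 128-byte field)
    cld
    mov si, boot_filename
    mov di, tftp_read.filename
    mov cx, 128
    rep movsb
    ; Set destination buffer (a 32-bit physical address, not SEG:OFF)
    mov dword [tftp_read.buffer], 0x10000       ; 0x10000 linear
    mov dword [tftp_read.buffer_size], 0x70000  ; 448KB max: ends at 0x80000, leaving the top of
                                                ; conventional memory (PXE stack, EBDA) untouched
    ; Call TFTP read file
    mov bx, 0x0023                              ; PXENV_TFTP_READ_FILE opcode
    mov di, tftp_read
    call pxe_call
    ; On success (AX = 0), buffer_size contains actual bytes downloaded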
5.8 The Interview Questions They’ll Ask
- “Explain the PXE boot process from power-on to OS load.”
  - Strong answer: “CPU starts, BIOS POST runs, network option ROM initializes NIC and runs DHCP to get IP and boot filename. PXE downloads NBP via TFTP, executes it. NBP may do more TFTP to get kernel, then jumps to kernel entry point. The key is firmware-provided network stack before any OS exists.”
- “Why does PXE use TFTP instead of HTTP?”
  - Strong answer: “TFTP is UDP-based with no connection state, requires minimal code to implement (fits in boot ROM), has simple block-based transfer with implicit flow control via ACKs. HTTP requires TCP stack, more complex parsing. Modern UEFI adds HTTP Boot, but traditional PXE uses TFTP for simplicity.”
- “How would you debug a PXE boot failure?”
  - Strong answer: “Check physical connection and DHCP server logs. Verify DHCP offers include next-server and filename options. Check TFTP server is running on port 69 and file exists. Use Wireshark to capture DHCP/TFTP traffic. On client, add debug output after each PXE call to see which step fails.”
- “What’s the difference between PXENV+ and !PXE structures?”
  - Strong answer: “PXENV+ is the older structure, found via INT 1Ah; its entry point takes the opcode in BX and the parameter block at ES:DI. !PXE is newer (PXE 2.1+); its entry point takes arguments on the stack, and the structure describes the segments needed to call the API from protected mode. PXENV+ has a pxe_ptr field pointing to !PXE. Use !PXE if available, fall back to PXENV+.”
- “How do data centers use PXE at scale?”
  - Strong answer: “Provisioning systems like Cobbler, Foreman, MAAS use PXE to image thousands of servers identically. DHCP servers dynamically return different boot files based on MAC address. iPXE allows chaining to HTTP for larger boot images. BMC (IPMI/Redfish) can trigger PXE boot remotely.”
5.9 Books That Will Help
| Topic | Book Reference |
|---|---|
| PXE Specification | Intel PXE Specification v2.1 - Primary reference |
| DHCP Protocol | “TCP/IP Illustrated, Volume 1” by W. Richard Stevens, Chapter 16 |
| TFTP Protocol | RFC 1350 + “TCP/IP Illustrated, Volume 1” Chapter 15 |
| x86 Real Mode | “Low-Level Programming” by Igor Zhirkov, Chapters 1-3 |
| Networking Basics | “Computer Networks” by Tanenbaum, Chapter 5 (Network Layer) |
| iPXE Reference | iPXE Documentation - Modern PXE extension |
5.10 Implementation Phases
Phase 1: PXE Detection and Display (3-4 days)
- Find PXENV+ via INT 1Ah
- Validate structure and checksum
- Display PXE version and NIC info
- Test: PXE info displays correctly in QEMU
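The validation step in Phase 1 can be sketched in C as below. This is a host-testable sketch rather than bootloader-ready code: the function names `pxenv_is_valid` and `pxenv_version` are illustrative, and the field offsets (version at byte 6, length at byte 8) are assumed from the PXE 2.1 layout of PXENV+ and should be checked against the specification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets assumed from the PXE 2.1 PXENV+ layout; verify against the spec. */
#define PXENV_SIG_LEN     6
#define PXENV_OFF_VERSION 6
#define PXENV_OFF_LENGTH  8
#define PXENV_MIN_LEN     10   /* signature + version + length + checksum */

/* Returns 1 if p points at a plausible PXENV+ structure, 0 otherwise.
 * avail is how many bytes may safely be read starting at p. */
int pxenv_is_valid(const uint8_t *p, size_t avail)
{
    if (avail < PXENV_MIN_LEN || memcmp(p, "PXENV+", PXENV_SIG_LEN) != 0)
        return 0;

    size_t len = p[PXENV_OFF_LENGTH];
    if (len < PXENV_MIN_LEN || len > avail)
        return 0;

    /* All bytes of the structure, checksum byte included, must sum to 0
     * modulo 256. */
    uint8_t sum = 0;
    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + p[i]);
    return sum == 0;
}

/* Version is stored little-endian: 0x0201 means PXE 2.1. */
uint16_t pxenv_version(const uint8_t *p)
{
    return (uint16_t)(p[PXENV_OFF_VERSION] | (p[PXENV_OFF_VERSION + 1] << 8));
}
```

In the bootloader itself, `p` would be the real-mode pointer built from the ES:BX pair returned by INT 1Ah, and the same loop translates directly to a few lines of assembly.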
Phase 2: DHCP Information Retrieval (3-4 days)
- Use PXENV_GET_CACHED_INFO
- Parse BOOTP reply packet for IP, server, filename
- Display network configuration
- Test: Correct IP and filename shown
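Parsing the cached reply mostly means reading fixed-offset fields from the BOOTP header. The sketch below assumes the standard BOOTP/DHCP layout (yiaddr at byte 16, siaddr at byte 20, the 128-byte file field at byte 108, options starting at byte 236). The struct and function names are hypothetical, and `snprintf` is used only because this version is meant to run on a host; a bootloader would need its own number printing.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Fixed offsets in the BOOTP/DHCP header. */
enum {
    BOOTP_OFF_OP     = 0,    /* 1 = request, 2 = reply */
    BOOTP_OFF_YIADDR = 16,   /* "your" (client) IP address */
    BOOTP_OFF_SIADDR = 20,   /* next-server (TFTP server) IP address */
    BOOTP_OFF_FILE   = 108,  /* boot filename, NUL-padded */
    BOOTP_FILE_LEN   = 128,
    BOOTP_MIN_LEN    = 236,  /* header size before the options area */
    BOOTP_OP_REPLY   = 2
};

/* Hypothetical container for the values Phase 2 displays. */
struct net_config {
    uint8_t client_ip[4];
    uint8_t server_ip[4];
    char    boot_file[BOOTP_FILE_LEN + 1];
};

/* Returns 0 on success, -1 if the packet is too short or not a reply. */
int bootp_parse_reply(const uint8_t *pkt, size_t len, struct net_config *out)
{
    if (len < BOOTP_MIN_LEN || pkt[BOOTP_OFF_OP] != BOOTP_OP_REPLY)
        return -1;
    memcpy(out->client_ip, pkt + BOOTP_OFF_YIADDR, 4);
    memcpy(out->server_ip, pkt + BOOTP_OFF_SIADDR, 4);
    memcpy(out->boot_file, pkt + BOOTP_OFF_FILE, BOOTP_FILE_LEN);
    out->boot_file[BOOTP_FILE_LEN] = '\0';   /* field may lack a terminator */
    return 0;
}

/* Formats an IPv4 address (network byte order) as dotted decimal. */
void format_ipv4(const uint8_t ip[4], char *buf, size_t n)
{
    snprintf(buf, n, "%u.%u.%u.%u",
             (unsigned)ip[0], (unsigned)ip[1], (unsigned)ip[2], (unsigned)ip[3]);
}
```

Some DHCP servers deliver the boot filename in option 67 instead of the fixed file field, so an empty `boot_file` is not necessarily an error.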
Phase 3: TFTP Download (4-5 days)
- Use PXENV_TFTP_READ_FILE
- Set up buffer and filename
- Display progress (bytes received)
- Test: kernel.bin downloads correctly
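For the progress display and for sizing the download buffer, it helps to know how a file size maps onto TFTP blocks. The sketch below assumes the default 512-byte block size from RFC 1350 with no blksize option and no block-number rollover; the function names are illustrative. Showing live progress also assumes the file is read block by block (for example through the PXE open/read/close TFTP calls) rather than with the single-call PXENV_TFTP_READ_FILE.

```c
#include <stdint.h>

#define TFTP_BLOCK_SIZE 512u
#define TFTP_MAX_BLOCKS 65535u   /* block numbers are 16-bit and start at 1 */

/* Number of DATA packets for a file. A transfer ends with a block shorter
 * than 512 bytes, so a file that is an exact multiple of 512 needs one
 * extra, empty block. */
uint32_t tftp_data_packets(uint32_t file_size)
{
    return file_size / TFTP_BLOCK_SIZE + 1;
}

/* Payload length of the final DATA packet (0..511). */
uint32_t tftp_last_block_len(uint32_t file_size)
{
    return file_size % TFTP_BLOCK_SIZE;
}

/* 1 if the file can be transferred without block-number rollover. */
int tftp_fits_without_rollover(uint32_t file_size)
{
    return tftp_data_packets(file_size) <= TFTP_MAX_BLOCKS;
}

/* Integer percentage for a progress line; an unknown total reports 100. */
uint32_t progress_percent(uint32_t received, uint32_t total)
{
    if (total == 0 || received >= total)
        return 100;
    return (uint32_t)((uint64_t)received * 100 / total);
}
```

Under these assumptions the largest transferable file is 65535 × 512 − 1 = 33,553,919 bytes, which is one reason larger images are usually fetched over HTTP via iPXE.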
Phase 4: Kernel Boot (2-3 days)
- Jump to loaded kernel
- Pass boot info structure
- Handle protected mode if needed
- Test: Kernel executes and prints message
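Before jumping to the kernel, the load address has to be expressed as a real-mode segment:offset pair, and the same conversion is needed whenever a buffer is handed to the PXE API. The following is a small sketch of that arithmetic; the struct mirrors the offset-then-segment ordering of a far pointer, and the helper names are made up for illustration.

```c
#include <stdint.h>

/* Real-mode far pointer, stored offset first and segment second. */
struct segoff16 {
    uint16_t offset;
    uint16_t segment;
};

/* Linear address = segment * 16 + offset. */
uint32_t segoff_to_linear(struct segoff16 p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}

/* Converts a linear address below 1 MiB into a normalized far pointer
 * (offset in 0..15). Returns 0 on success, -1 if out of real-mode range. */
int linear_to_segoff(uint32_t linear, struct segoff16 *out)
{
    if (linear > 0xFFFFFu)
        return -1;
    out->segment = (uint16_t)(linear >> 4);
    out->offset  = (uint16_t)(linear & 0xFu);
    return 0;
}
```

For example, a kernel linked to run at linear address 0x10000 would be downloaded to and entered at 1000:0000. Many different segment:offset pairs name the same byte, which is why 0000:7C00 and 07C0:0000 are equivalent.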
Phase 5: Polish and Error Handling (2-3 days)
- Add timeout handling
- Retry failed operations
- Clean error messages
- Test: Graceful handling of network errors
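Timeout handling and retries can share one helper. Below is a minimal sketch of a retry loop with a doubling, capped timeout. The callback type, the function names, and the `flaky_attempt` stand-in are all hypothetical; in the bootloader, the callback would wrap a PXE call and return nonzero on timeout or error.

```c
#include <stdint.h>

/* An attempt returns 0 on success, nonzero on timeout or error. */
typedef int (*attempt_fn)(void *ctx, uint32_t timeout_ms);

/* Doubles the timeout until it reaches cap_ms. */
uint32_t next_timeout_ms(uint32_t current_ms, uint32_t cap_ms)
{
    if (current_ms >= cap_ms / 2)
        return cap_ms;
    return current_ms * 2;
}

/* Calls attempt up to max_tries times. Returns the number of attempts
 * used on success, or -1 if every attempt failed. */
int retry_with_backoff(attempt_fn attempt, void *ctx, unsigned max_tries,
                       uint32_t first_timeout_ms, uint32_t cap_ms)
{
    uint32_t timeout = first_timeout_ms;
    for (unsigned i = 0; i < max_tries; i++) {
        if (attempt(ctx, timeout) == 0)
            return (int)i + 1;
        timeout = next_timeout_ms(timeout, cap_ms);
    }
    return -1;
}

/* Stand-in for a network operation: fails until the counter that ctx
 * points at reaches zero, then succeeds. */
int flaky_attempt(void *ctx, uint32_t timeout_ms)
{
    unsigned *remaining_failures = (unsigned *)ctx;
    (void)timeout_ms;
    if (*remaining_failures == 0)
        return 0;
    (*remaining_failures)--;
    return -1;
}
```

Returning the attempt count makes it easy to print messages such as "TFTP succeeded after 3 tries", and the -1 result is the point at which a clean error message should be shown instead of hanging.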
5.11 Key Implementation Decisions
- Use Cached DHCP vs Manual DHCP?
- Cached: Simple, PXE ROM already did DHCP
- Manual: Full control, learning experience
- Recommendation: Start with cached, add manual as extension
- Use PXE TFTP API vs UDP API?
- PXE TFTP: Single call downloads file
- UDP: Manual TFTP implementation
- Recommendation: Start with PXE TFTP, add UDP as extension
- Real Mode vs Protected Mode?
- Real Mode: PXE API works directly
- Protected Mode: Need V86 mode or UNDI callbacks
- Recommendation: Stay in real mode for simplicity
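To gauge how much work the manual-TFTP option in the second decision involves, here is a sketch of the packet builders it would need. The wire formats follow RFC 1350 (2-byte big-endian opcode; RRQ carries filename and mode as NUL-terminated strings; ACK carries a 2-byte block number). The function names are illustrative, and sending the bytes through the PXE UDP calls is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* TFTP opcodes from RFC 1350. */
enum { TFTP_RRQ = 1, TFTP_DATA = 3, TFTP_ACK = 4, TFTP_ERROR = 5 };

/* Builds a read request in binary ("octet") mode.
 * Returns the packet length, or 0 if buf is too small. */
size_t tftp_build_rrq(uint8_t *buf, size_t cap, const char *filename)
{
    static const char mode[] = "octet";
    size_t name_len = strlen(filename);
    size_t total = 2 + name_len + 1 + sizeof mode;  /* sizeof counts the NUL */

    if (total > cap)
        return 0;
    buf[0] = 0;
    buf[1] = TFTP_RRQ;
    memcpy(buf + 2, filename, name_len + 1);
    memcpy(buf + 2 + name_len + 1, mode, sizeof mode);
    return total;
}

/* Builds an ACK for the given block. Returns 4, or 0 if buf is too small. */
size_t tftp_build_ack(uint8_t *buf, size_t cap, uint16_t block)
{
    if (cap < 4)
        return 0;
    buf[0] = 0;
    buf[1] = TFTP_ACK;
    buf[2] = (uint8_t)(block >> 8);
    buf[3] = (uint8_t)(block & 0xFF);
    return 4;
}

/* Returns the block number of a DATA packet, or -1 if pkt is not DATA. */
int32_t tftp_data_block(const uint8_t *pkt, size_t len)
{
    if (len < 4 || pkt[0] != 0 || pkt[1] != TFTP_DATA)
        return -1;
    return (int32_t)((pkt[2] << 8) | pkt[3]);
}
```

A full client adds a receive loop around these: send the RRQ to port 69, then ACK each DATA block to the server's reply port until a block shorter than 512 bytes arrives.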
6. Testing Strategy
Unit Tests:
# Test PXE detection
$ qemu-system-x86_64 -boot n -netdev user,id=n0,tftp=tftp,bootfile=pxeboot.bin -device e1000,netdev=n0
# Expected: "PXE ROM found" message
# Test DHCP retrieval
# Expected: IP address and TFTP info displayed
# Test TFTP download
$ echo "Test kernel content" > tftp/kernel.bin
# Expected: File downloaded, size matches
# Test kernel execution
# Expected: Kernel runs and prints message
Integration Tests:
# Full boot test with real dnsmasq
$ sudo dnsmasq --interface=br0 --dhcp-range=192.168.1.100,192.168.1.200 \
--enable-tftp --tftp-root=/srv/tftp \
--dhcp-boot=pxeboot.bin
# Test with different NICs
$ qemu-system-x86_64 -netdev user,... -device virtio-net,...
$ qemu-system-x86_64 -netdev user,... -device rtl8139,...
$ qemu-system-x86_64 -netdev user,... -device e1000,...
7. Common Pitfalls & Debugging
Problem 1: INT 1Ah returns CF=1 (no PXE)
- Cause: Not booting from network or NIC has no PXE ROM
- Fix: Use -boot n in QEMU, or set the BIOS boot order to network
- Verify: qemu-system-x86_64 -boot order=n ...
Problem 2: PXENV+ structure fails checksum
- Cause: Reading wrong memory location or structure corrupted
- Fix: Print ES:BX values, verify signature bytes are “PXENV+”
- Verify: Sum all bytes of the structure (as many as its length field says); the result modulo 256 should be 0
Problem 3: DHCP info shows all zeros
- Cause: DHCP not completed before bootloader runs
- Fix: PXE ROM should have done DHCP; check QEMU network config
- Verify: Look for DHCP traffic in Wireshark
Problem 4: TFTP download hangs
- Cause: TFTP server not running, wrong port, or file not found
- Fix: Check that dnsmasq is running, the file is in the TFTP root, and UDP port 69 is open
- Verify: tftp localhost -c get kernel.bin
Problem 5: Kernel doesn’t run after download
- Cause: Wrong load address, missing jump, or kernel format issue
- Fix: Verify load address matches kernel link address, check entry point
- Verify: Add debug print before jump, check kernel is flat binary
8. Extensions & Challenges
- Manual DHCP Implementation: Instead of cached info, implement full DORA handshake
- Manual TFTP Over UDP: Implement TFTP using PXENV_UDP_READ/WRITE
- Progress Bar: Show graphical progress bar during download
- Menu System: Download menu file, let user choose boot option
- HTTP Boot: Extend to support HTTP via iPXE chainloading
- iSCSI Boot: Research and prototype iSCSI root filesystem boot
- Multicast TFTP: Implement MTFTP for multiple simultaneous clients
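For the manual DHCP extension, most of the parsing work is walking the options area, which begins after the 236-byte header and the 4-byte magic cookie (99, 130, 83, 99). The sketch below assumes the tag-length-value encoding from RFC 2132, where tag 0 is single-byte padding and tag 255 ends the list. The function name is illustrative, and option overloading (option 52) is not handled.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    DHCP_OPT_PAD       = 0,
    DHCP_OPT_MSG_TYPE  = 53,  /* 1=DISCOVER 2=OFFER 3=REQUEST 5=ACK */
    DHCP_OPT_SERVER_ID = 54,
    DHCP_OPT_TFTP_NAME = 66,
    DHCP_OPT_BOOTFILE  = 67,
    DHCP_OPT_END       = 255
};

/* Searches the options area (opts points just past the magic cookie).
 * On success returns a pointer to the option value and stores its length
 * in *out_len; returns NULL if the tag is absent or the data is truncated. */
const uint8_t *dhcp_find_option(const uint8_t *opts, size_t len,
                                uint8_t tag, uint8_t *out_len)
{
    size_t i = 0;
    while (i < len) {
        uint8_t t = opts[i];
        if (t == DHCP_OPT_END)
            break;
        if (t == DHCP_OPT_PAD) {      /* padding has no length byte */
            i++;
            continue;
        }
        if (i + 1 >= len)             /* length byte missing */
            break;
        uint8_t l = opts[i + 1];
        if (i + 2 + (size_t)l > len)  /* value runs past the buffer */
            break;
        if (t == tag) {
            *out_len = l;
            return &opts[i + 2];
        }
        i += 2 + (size_t)l;
    }
    return NULL;
}
```

During the DORA handshake, the client would use this to confirm that a reply is an OFFER or ACK (option 53) and to copy the server identifier (option 54) into its REQUEST.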
9. Real-World Connections
- AWS Nitro: Uses PXE for bare-metal provisioning
- Cobbler: Popular Linux provisioning server uses PXE
- Foreman: Enterprise provisioning with PXE integration
- MAAS (Metal as a Service): Canonical’s data center tool
- Windows Deployment Services: Microsoft’s PXE implementation
- iPXE: Open source PXE replacement with HTTP, iSCSI, SAN boot
10. Resources
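Specifications:
- Intel PXE Specification v2.1 - PXE API and boot flow
- RFC 2131 - Dynamic Host Configuration Protocol (DHCP)
- RFC 1350 - The TFTP Protocol (Revision 2)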
Tools:
- dnsmasq - DHCP and TFTP server
- Wireshark - Network packet analysis
- iPXE - Extended PXE implementation for reference
11. Self-Assessment Checklist
Before considering this project complete, verify:
- Can explain PXE architecture (PXE, UNDI, option ROM)
- Can describe DHCP 4-way handshake
- Can explain TFTP protocol operation
- PXE detection works in QEMU
- DHCP info correctly retrieved and displayed
- TFTP download completes successfully
- Downloaded kernel executes correctly
- Error handling works (network timeouts, missing file)
- Can explain data center use cases for PXE
- Can debug PXE boot failures systematically
12. Submission / Completion Criteria
Your implementation is complete when:
- PXE Detection: Bootloader finds and validates PXENV+ structure
- DHCP Display: Shows assigned IP, TFTP server, and boot filename
- TFTP Download: Successfully downloads kernel.bin from TFTP server
- Progress Display: Shows download progress (bytes or percentage)
- Kernel Execution: Downloaded kernel runs and produces output
- Error Handling: Graceful messages for network failures
- Documentation: README explains how to set up and test
Bonus Points:
- Manual DHCP implementation (not just cached)
- Manual TFTP over UDP implementation
- Support for multiple file download
- Integration with real hardware (not just QEMU)
“Network boot is the invisible infrastructure powering the cloud. Every EC2 instance, every Kubernetes node, every diskless workstation boots this way. Understanding PXE means understanding how modern data centers operate at the most fundamental level.”