Project 5: Build a WASI Runtime
Extend your WASM interpreter to support file I/O, environment variables, and command-line arguments through WASI
Project Overview
| Attribute | Value |
|---|---|
| Difficulty | Advanced |
| Time Estimate | 2-3 weeks |
| Languages | C (primary), Rust, Go, Zig |
| Prerequisites | Project 3 (Interpreter), POSIX familiarity |
| Main Reference | WASI Specification (wasi.dev) |
| Knowledge Area | System Interfaces, Security, Sandboxing |
Learning Objectives
After completing this project, you will be able to:
- Implement WASI syscalls - Provide fd_write, fd_read, path_open, and other POSIX-like operations
- Understand capability-based security - Implement preopened directories and file descriptor rights
- Marshal data across boundaries - Transfer strings and buffers between host and WASM memory
- Handle command-line arguments - Pass args and environment to WASM programs
- Build a production-quality runtime - Run real command-line WASM programs
- Understand sandboxing tradeoffs - Balance security with functionality
Conceptual Foundation
1. What Is WASI?
WASI (WebAssembly System Interface) is a standardized API that allows WASM modules to interact with the operating system in a portable, sandboxed way:
┌───────────────────────────────────────────────────────────────────┐
│                         WASI Architecture                         │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│  WASM Module                   WASI Runtime                       │
│  ┌──────────────────┐          ┌──────────────────────────────┐   │
│  │                  │          │                              │   │
│  │ (import "wasi"   │─────────▶│ fd_write()     → write()     │   │
│  │   "fd_write")    │          │ fd_read()      → read()      │   │
│  │                  │          │ path_open()    → open()      │   │
│  │ call $fd_write   │          │ environ_get()  → getenv()    │   │
│  │                  │          │                              │   │
│  └──────────────────┘          └──────────────────────────────┘   │
│           │                                   │                   │
│           ▼                                   ▼                   │
│  ┌──────────────────┐          ┌──────────────────────────────┐   │
│  │  Linear Memory   │◀─────────│ Host copies data in/out      │   │
│  │  (strings, bufs) │          │ of WASM memory               │   │
│  └──────────────────┘          └──────────────────────────────┘   │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘

Key insight: WASI doesn't give WASM modules direct system access. Instead, every operation goes through controlled API functions that the runtime implements.
2. Why WASI Exists
Without WASI, WASM modules can only:
- Compute (arithmetic, control flow)
- Access their own linear memory
- Call imported functions from the host
With WASI, WASM modules can:
- Read/write files
- Access environment variables
- Get command-line arguments
- Query the clock
- Generate random numbers
- (Future) Network, async I/O
The value proposition: write once, run anywhere, on any OS, in any runtime, with strong sandboxing.
3. Capability-Based Security Model
WASI uses capabilities instead of ambient authority:
Traditional Model (POSIX):
──────────────────────────
Process can access ANY file it has permission for.
open("/etc/passwd", O_RDONLY) // Works if process has read permission
Capability Model (WASI):
────────────────────────
Process can ONLY access preopened directories.
open("/etc/passwd") // FAILS - /etc not preopened
open("./data.txt") // Works IF . was preopened
Preopens are capabilities granted at startup:
┌─────────────────────────────────────────────┐
│ wasi-runtime program.wasm                   │
│   --dir=/home/user/data:/data               │
│   --dir=/tmp:/tmp                           │
│                                             │
│ Module can access:                          │
│   /data/*  (mapped from /home/user/data)    │
│   /tmp/*   (mapped from /tmp)               │
│   NOTHING ELSE                              │
└─────────────────────────────────────────────┘

Why capabilities?
- Defense in depth: Even if WASM code is malicious, it can't escape its sandbox
- Explicit permissions: User decides what the module can access at runtime
- Auditability: Clear boundary of what's allowed
4. WASI Versions
WASI has evolved through several versions:
| Version | Status | Features |
|---|---|---|
| preview1 | Stable | Files, args, env, clock, random |
| preview2 | Released (as WASI 0.2) | Component model, async, improved APIs |
| preview3 | Planned | Full async I/O, networking |
This project implements preview1, which is what most existing tools use.
5. The WASI API Surface
WASI preview1 defines these functions (imported from wasi_snapshot_preview1):
Arguments & Environment:
- args_get - Get command-line arguments
- args_sizes_get - Get argument count and buffer size
- environ_get - Get environment variables
- environ_sizes_get - Get environment count and buffer size
Clock:
- clock_res_get - Get clock resolution
- clock_time_get - Get current time
File Descriptors:
- fd_advise - Provide file advisory information
- fd_allocate - Allocate space in file
- fd_close - Close a file descriptor
- fd_datasync - Synchronize file data
- fd_fdstat_get - Get file descriptor status
- fd_fdstat_set_flags - Set file descriptor flags
- fd_fdstat_set_rights - Set file descriptor rights
- fd_filestat_get - Get file statistics
- fd_filestat_set_size - Set file size
- fd_filestat_set_times - Set file times
- fd_pread - Read from file at offset
- fd_prestat_dir_name - Get preopened directory name
- fd_prestat_get - Get prestat info
- fd_pwrite - Write to file at offset
- fd_read - Read from file descriptor
- fd_readdir - Read directory entries
- fd_renumber - Renumber a file descriptor
- fd_seek - Seek in file
- fd_sync - Synchronize file state
- fd_tell - Get file position
- fd_write - Write to file descriptor
Path Operations:
- path_create_directory - Create directory
- path_filestat_get - Get file status by path
- path_filestat_set_times - Set times by path
- path_link - Create hard link
- path_open - Open file by path
- path_readlink - Read symbolic link
- path_remove_directory - Remove directory
- path_rename - Rename file
- path_symlink - Create symbolic link
- path_unlink_file - Remove file
Miscellaneous:
- poll_oneoff - Poll for events
- proc_exit - Exit process
- proc_raise - Send signal to self
- random_get - Get random bytes
- sched_yield - Yield processor
- sock_recv / sock_send / sock_shutdown - Socket operations (often unimplemented)
6. Data Structures in WASM Memory
WASI uses pointers into linear memory for data transfer:
Example: fd_write(fd, iovs_ptr, iovs_len, nwritten_ptr)
WASM Linear Memory:
┌─────────────────────────────────────────────────────────────┐
│                                                             │
│  0x1000: ┌─────────────────┐  ← iovs_ptr                    │
│          │ buf_ptr: 0x2000 │  (iovec[0].buf)                │
│          │ buf_len: 13     │  (iovec[0].buf_len)            │
│          └─────────────────┘                                │
│                                                             │
│  0x2000: "Hello, World!"      ← actual data to write        │
│                                                             │
│  0x3000: ┌─────────────────┐  ← nwritten_ptr                │
│          │ (output)        │  runtime writes bytes written  │
│          └─────────────────┘                                │
│                                                             │
└─────────────────────────────────────────────────────────────┘
struct iovec {
uint32_t buf; // Pointer to buffer in WASM memory
uint32_t buf_len; // Length of buffer
};

Key functions need to:
- Read pointers from WASM memory
- Follow those pointers to read/write data
- Write results back to WASM memory
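The three steps above can be sketched with a few bounds-checked helpers. This is a minimal illustration, not part of the WASI API: `GuestMem`, `GuestIovec`, and the `guest_*` functions are hypothetical names chosen for this example, and the sketch assumes linear memory is a flat byte array whose current size is known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the module's linear memory. */
typedef struct {
    uint8_t *data;
    uint32_t size;   /* current size in bytes */
} GuestMem;

/* Mirrors the 8-byte iovec layout shown above. */
typedef struct {
    uint32_t buf;
    uint32_t buf_len;
} GuestIovec;

/* True when [addr, addr + len) lies inside linear memory.
 * The sum is computed in 64 bits so it cannot wrap around. */
bool guest_range_ok(const GuestMem *m, uint32_t addr, uint32_t len) {
    return (uint64_t)addr + len <= m->size;
}

/* Little-endian load of a u32; refuses out-of-bounds addresses. */
bool guest_load_u32(const GuestMem *m, uint32_t addr, uint32_t *out) {
    assert(m != NULL && out != NULL);
    if (!guest_range_ok(m, addr, 4)) return false;
    const uint8_t *p = m->data + addr;
    *out = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return true;
}

/* Little-endian store of a u32; refuses out-of-bounds addresses. */
bool guest_store_u32(GuestMem *m, uint32_t addr, uint32_t val) {
    assert(m != NULL);
    if (!guest_range_ok(m, addr, 4)) return false;
    for (int i = 0; i < 4; i++) {
        m->data[addr + (uint32_t)i] = (uint8_t)(val >> (8 * i));
    }
    return true;
}

/* Reads iovec number `index` from the array at iovs_ptr, then checks
 * that the buffer it points at also lies inside linear memory. */
bool guest_load_iovec(const GuestMem *m, uint32_t iovs_ptr,
                      uint32_t index, GuestIovec *out) {
    uint64_t addr = (uint64_t)iovs_ptr + (uint64_t)index * 8;
    if (addr + 8 > m->size) return false;
    if (!guest_load_u32(m, (uint32_t)addr, &out->buf)) return false;
    if (!guest_load_u32(m, (uint32_t)addr + 4, &out->buf_len)) return false;
    return guest_range_ok(m, out->buf, out->buf_len);
}
```

Returning `false` rather than trapping lets each WASI function translate a bad pointer into an errno of its choosing, such as the `WASI_ERRNO_FAULT` used later in this guide.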
7. Error Handling
WASI functions return an errno value (0 = success):
typedef enum {
WASI_ERRNO_SUCCESS = 0,
WASI_ERRNO_2BIG = 1, // Argument list too long
WASI_ERRNO_ACCES = 2, // Permission denied
WASI_ERRNO_ADDRINUSE = 3, // Address in use
// ... many more ...
WASI_ERRNO_BADF = 8, // Bad file descriptor
WASI_ERRNO_NOENT = 44, // No such file or directory
WASI_ERRNO_NOTDIR = 54, // Not a directory
WASI_ERRNO_INVAL = 28, // Invalid argument
// ... etc ...
} wasi_errno_t;
8. File Descriptor Table
Your runtime must maintain a file descriptor table:
#define MAX_FDS 1024
typedef struct {
int host_fd; // Actual OS file descriptor
char* path; // For preopened dirs: the virtual path
uint64_t rights_base; // What operations are allowed
uint64_t rights_inheriting; // Rights for opened files
uint8_t type; // File type (regular, directory, etc.)
bool is_preopen; // Is this a preopened directory?
} WasiFd;
typedef struct {
WasiFd fds[MAX_FDS];
int next_fd;
} FdTable;
// Standard file descriptors
// fd 0 = stdin
// fd 1 = stdout
// fd 2 = stderr
// fd 3+ = preopened directories and opened files
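One detail the table above leaves open is how to tell a free slot from a used one. Host descriptor 0 is stdin, so a zeroed `host_fd` cannot double as an "unused" marker. The sketch below assumes an explicit `in_use` flag instead; `SlotFd`, `SlotTable`, and the `slot_*` functions are hypothetical names, and the table is kept tiny for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_FDS 8  /* small on purpose; the table above uses 1024 */

typedef struct {
    bool in_use;   /* explicit marker, because host_fd 0 is a valid value */
    int host_fd;
} SlotFd;

typedef struct {
    SlotFd fds[SKETCH_MAX_FDS];
} SlotTable;

/* Marks every slot free, then reserves 0, 1, 2 for stdin/stdout/stderr. */
void slot_table_init(SlotTable *t) {
    assert(t != NULL);
    for (int i = 0; i < SKETCH_MAX_FDS; i++) {
        t->fds[i].in_use = false;
        t->fds[i].host_fd = -1;
    }
    for (int i = 0; i < 3; i++) {
        t->fds[i].in_use = true;
        t->fds[i].host_fd = i;
    }
}

bool slot_is_open(const SlotTable *t, int fd) {
    return fd >= 0 && fd < SKETCH_MAX_FDS && t->fds[fd].in_use;
}

/* Returns the lowest free guest fd, or -1 when the table is full. */
int slot_alloc(SlotTable *t, int host_fd) {
    for (int fd = 0; fd < SKETCH_MAX_FDS; fd++) {
        if (!t->fds[fd].in_use) {
            t->fds[fd].in_use = true;
            t->fds[fd].host_fd = host_fd;
            return fd;
        }
    }
    return -1;
}

/* Frees a slot. A real runtime would also close() the host descriptor.
 * Returns false for out-of-range or already-closed descriptors. */
bool slot_release(SlotTable *t, int fd) {
    if (!slot_is_open(t, fd)) return false;
    t->fds[fd].in_use = false;
    t->fds[fd].host_fd = -1;
    return true;
}
```

Reusing the lowest free slot is a POSIX habit rather than something this guide requires; a monotonically increasing `next_fd`, as in the struct above, also works as long as closed slots are marked.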
9. Rights and Capabilities
Each file descriptor has associated rights:
// Rights bits (from WASI spec)
#define WASI_RIGHT_FD_DATASYNC (1ULL << 0)
#define WASI_RIGHT_FD_READ (1ULL << 1)
#define WASI_RIGHT_FD_SEEK (1ULL << 2)
#define WASI_RIGHT_FD_FDSTAT_SET_FLAGS (1ULL << 3)
#define WASI_RIGHT_FD_SYNC (1ULL << 4)
#define WASI_RIGHT_FD_TELL (1ULL << 5)
#define WASI_RIGHT_FD_WRITE (1ULL << 6)
#define WASI_RIGHT_FD_ADVISE (1ULL << 7)
#define WASI_RIGHT_FD_ALLOCATE (1ULL << 8)
#define WASI_RIGHT_PATH_CREATE_DIRECTORY (1ULL << 9)
#define WASI_RIGHT_PATH_CREATE_FILE (1ULL << 10)
#define WASI_RIGHT_PATH_LINK_SOURCE (1ULL << 11)
#define WASI_RIGHT_PATH_LINK_TARGET (1ULL << 12)
#define WASI_RIGHT_PATH_OPEN (1ULL << 13)
#define WASI_RIGHT_FD_READDIR (1ULL << 14)
#define WASI_RIGHT_PATH_READLINK (1ULL << 15)
#define WASI_RIGHT_PATH_RENAME_SOURCE (1ULL << 16)
#define WASI_RIGHT_PATH_RENAME_TARGET (1ULL << 17)
#define WASI_RIGHT_PATH_FILESTAT_GET (1ULL << 18)
#define WASI_RIGHT_PATH_FILESTAT_SET_SIZE (1ULL << 19)
#define WASI_RIGHT_PATH_FILESTAT_SET_TIMES (1ULL << 20)
#define WASI_RIGHT_FD_FILESTAT_GET (1ULL << 21)
#define WASI_RIGHT_FD_FILESTAT_SET_SIZE (1ULL << 22)
#define WASI_RIGHT_FD_FILESTAT_SET_TIMES (1ULL << 23)
#define WASI_RIGHT_PATH_SYMLINK (1ULL << 24)
#define WASI_RIGHT_PATH_REMOVE_DIRECTORY (1ULL << 25)
#define WASI_RIGHT_PATH_UNLINK_FILE (1ULL << 26)
#define WASI_RIGHT_POLL_FD_READWRITE (1ULL << 27)
#define WASI_RIGHT_SOCK_SHUTDOWN (1ULL << 28)
Rights checking:
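wasi_errno_t check_rights(FdTable* table, int fd, uint64_t required) {
    // host_fd 0 is a valid descriptor (stdin), so "!host_fd" cannot mean "unused".
    // Slots at or above next_fd were never allocated; closed slots need their
    // own marker (for example host_fd = -1, set by fd_close).
    if (fd < 0 || fd >= MAX_FDS || fd >= table->next_fd) {
        return WASI_ERRNO_BADF;
    }
    if ((table->fds[fd].rights_base & required) != required) {
        return WASI_ERRNO_NOTCAPABLE;
    }
    return WASI_ERRNO_SUCCESS;
}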
Project Specification
Required WASI Functions
Implement these functions to run most command-line programs:
Tier 1 (Absolutely Required):
- fd_write - Write to stdout/stderr
- fd_read - Read from stdin
- fd_close - Close file descriptors
- proc_exit - Exit the program
- args_get, args_sizes_get - Command-line arguments
- environ_get, environ_sizes_get - Environment variables
Tier 2 (File Operations):
- path_open - Open files
- fd_prestat_get, fd_prestat_dir_name - Preopened directories
- fd_seek, fd_tell - File positioning
- fd_filestat_get - File metadata
Tier 3 (Full Compatibility):
- clock_time_get - Get current time
- random_get - Random number generation
- path_create_directory, path_remove_directory
- path_unlink_file
- fd_readdir - Directory listing
Input/Output
# Run a simple program
$ ./wasi-runtime hello.wasm
Hello, World!
# With arguments
$ ./wasi-runtime --dir=.:. cat.wasm file.txt
(contents of file.txt)
# With preopened directories
$ ./wasi-runtime --dir=./data:/data process.wasm
(processes files in ./data)
# With environment
$ ./wasi-runtime --env=DEBUG=1 app.wasm
Success Criteria
- Hello World: Print "Hello, World!" to stdout
- Echo: Read args and print them back
- Cat: Read a file and print its contents
- Environment: Read and print environment variables
- File write: Create and write to a file
- Compatibility: Run programs compiled with clang --target=wasm32-wasi
Solution Architecture
Runtime Structure
┌───────────────────────────────────────────────────────────────────┐
│                     WASI Runtime Architecture                     │
├───────────────────────────────────────────────────────────────────┤
│                                                                   │
│  main.c                                                           │
│    │                                                              │
│    ├──▶ wasm_interp/    (From Project 3)                          │
│    │      parser.c                                                │
│    │      exec.c                                                  │
│    │      memory.c                                                │
│    │                                                              │
│    └──▶ wasi/                                                     │
│           wasi.c        Main WASI implementation                  │
│           wasi.h        WASI types and constants                  │
│           fd_table.c    File descriptor management                │
│           fd_table.h                                              │
│           args_env.c    Arguments and environment                 │
│           clock.c       Clock functions                           │
│           random.c      Random number generation                  │
│                                                                   │
│  Data flow:                                                       │
│  ──────────                                                       │
│  1. Load WASM module                                              │
│  2. Initialize WASI state (fd table, args, env)                   │
│  3. Register WASI imports                                         │
│  4. Instantiate module                                            │
│  5. Call _start or main                                           │
│  6. WASI calls go through your implementations                    │
│                                                                   │
└───────────────────────────────────────────────────────────────────┘

Key Data Structures
// wasi.h
// WASI context passed to all WASI functions
typedef struct {
// Memory access
uint8_t* memory;
uint32_t memory_size;
// File descriptors
FdTable fd_table;
// Arguments
char** args;
int argc;
// Environment
char** environ;
int environ_count;
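// Exit code (set by proc_exit)
int exit_code;
bool exited;
// Jump target used by proc_exit; arm it with setjmp() before calling _start
// (requires <setjmp.h>)
jmp_buf exit_jmp;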
} WasiCtx;
// Import function signature
typedef uint32_t (*WasiFunc)(WasiCtx* ctx, uint32_t* args);
// Import registration
typedef struct {
const char* name;
WasiFunc func;
int param_count;
} WasiImport;
Integration with Interpreter
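// In your interpreter's import resolution
WasiCtx* wasi_ctx;
Value call_import(const char* module, const char* name, Value* args, int nargs) {
    if (strcmp(module, "wasi_snapshot_preview1") == 0) {
        // Find and call WASI function
        WasiFunc func = lookup_wasi_func(name);
        if (func) {
            // WasiFunc takes 32-bit slots, so each i64 argument is split into
            // (low, high) halves. path_open then needs 11 slots, which is why
            // the buffer is larger than 8.
            // Assumes Value also carries VAL_I64 / .i64 from Project 3.
            uint32_t wasi_args[16];
            int n = 0;
            for (int i = 0; i < nargs && n + 2 <= 16; i++) {
                if (args[i].type == VAL_I64) {
                    wasi_args[n++] = (uint32_t)args[i].i64;
                    wasi_args[n++] = (uint32_t)((uint64_t)args[i].i64 >> 32);
                } else {
                    wasi_args[n++] = args[i].i32;
                }
            }
            uint32_t result = func(wasi_ctx, wasi_args);
            return (Value){.type = VAL_I32, .i32 = result};
        }
    }
    trap("Unknown import: %s.%s", module, name);
}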
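The code above calls `lookup_wasi_func` without showing it. One simple way to back such a lookup is a static table searched by name. The sketch below is self-contained and uses hypothetical stand-ins (`DemoCtx`, `DemoFunc`, `DemoImport`, and two stub handlers); unlike `lookup_wasi_func` it returns the whole table entry so the caller can also see the parameter count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct DemoCtx DemoCtx;   /* stands in for WasiCtx */
typedef uint32_t (*DemoFunc)(DemoCtx *ctx, uint32_t *args);

typedef struct {
    const char *name;
    DemoFunc func;
    int param_count;
} DemoImport;

/* Stub handlers so the table has something to point at. */
uint32_t demo_fd_write(DemoCtx *ctx, uint32_t *args) {
    (void)ctx; (void)args;
    return 0;
}

uint32_t demo_proc_exit(DemoCtx *ctx, uint32_t *args) {
    (void)ctx; (void)args;
    return 0;
}

static const DemoImport demo_imports[] = {
    { "fd_write",  demo_fd_write,  4 },
    { "proc_exit", demo_proc_exit, 1 },
};

/* Linear search by name; returns NULL for unknown imports. */
const DemoImport *demo_lookup(const char *name) {
    assert(name != NULL);
    size_t n = sizeof demo_imports / sizeof demo_imports[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(demo_imports[i].name, name) == 0) {
            return &demo_imports[i];
        }
    }
    return NULL;
}
```

A linear scan is usually enough here: preview1 has only a few dozen functions, and the lookup can be done once per import at instantiation rather than on every call.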
Implementation Guide
Phase 1: Basic Output (Days 1-3)
Goal: Print "Hello, World!"
Implement fd_write:
// fd_write(fd: i32, iovs: i32, iovs_len: i32, nwritten: i32) -> errno
uint32_t wasi_fd_write(WasiCtx* ctx, uint32_t* args) {
    uint32_t fd = args[0];
    uint32_t iovs_ptr = args[1];
    uint32_t iovs_len = args[2];
    uint32_t nwritten_ptr = args[3];
    // Phase 1 only supports stdout (1) and stderr (2)
    if (fd != 1 && fd != 2) {
        return WASI_ERRNO_BADF;
    }
    // Bounds-check the iovec array and the result slot up front.
    // 64-bit arithmetic keeps the sums from wrapping around.
    if ((uint64_t)iovs_ptr + (uint64_t)iovs_len * 8 > ctx->memory_size ||
        (uint64_t)nwritten_ptr + 4 > ctx->memory_size) {
        return WASI_ERRNO_FAULT;
    }
    int host_fd = (fd == 1) ? STDOUT_FILENO : STDERR_FILENO;
    uint32_t total_written = 0;
    // Process each iovec
    for (uint32_t i = 0; i < iovs_len; i++) {
        // Read iovec from WASM memory
        uint32_t iov_offset = iovs_ptr + i * 8;
        uint32_t buf_ptr = read_u32(ctx->memory, iov_offset);
        uint32_t buf_len = read_u32(ctx->memory, iov_offset + 4);
        // Bounds check the buffer itself
        if ((uint64_t)buf_ptr + buf_len > ctx->memory_size) {
            return WASI_ERRNO_FAULT;
        }
        // Write to host fd
        ssize_t written = write(host_fd, ctx->memory + buf_ptr, buf_len);
        if (written < 0) {
            return errno_to_wasi(errno);
        }
        total_written += (uint32_t)written;
        // Short write: report what was written and let the guest retry
        if ((uint32_t)written < buf_len) {
            break;
        }
    }
    // Write number of bytes written
    write_u32(ctx->memory, nwritten_ptr, total_written);
    return WASI_ERRNO_SUCCESS;
}
Test program (compile with wasi-sdk):
#include <stdio.h>
int main() {
printf("Hello, World!\n");
return 0;
}
Checkpoint: "Hello, World!" prints to terminal.
Phase 2: Process Exit (Day 4)
Goal: Handle program termination
// proc_exit(code: i32) -> noreturn
uint32_t wasi_proc_exit(WasiCtx* ctx, uint32_t* args) {
ctx->exit_code = args[0];
ctx->exited = true;
// You have several options here:
// 1. longjmp back to the runtime
// 2. Throw an exception
// 3. Set a flag and check it in the execution loop
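// For simplicity, use longjmp. This needs a `jmp_buf exit_jmp` field in
// WasiCtx, armed with setjmp() before the runtime calls _start:
longjmp(ctx->exit_jmp, 1);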
return 0; // Never reached
}
Checkpoint: Programs can exit with different codes.
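The longjmp option only works if the runtime arms the jump buffer before entering guest code. A minimal sketch of both halves follows, assuming a hypothetical `ExitCtx` that holds just the fields involved; `GuestEntry`, `run_until_exit`, and the two demo guests are illustrative names, with a plain C function standing in for interpreting `_start`.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical slice of WasiCtx: only what exit handling needs. */
typedef struct {
    jmp_buf exit_jmp;
    int exit_code;
    bool exited;
} ExitCtx;

/* Host side of proc_exit: record the code, then unwind to the runtime. */
void guest_proc_exit(ExitCtx *ctx, int code) {
    ctx->exit_code = code;
    ctx->exited = true;
    longjmp(ctx->exit_jmp, 1);
}

/* Stand-in for "interpret _start". */
typedef void (*GuestEntry)(ExitCtx *ctx);

/* Arms the jump buffer, runs the guest, and returns its exit code.
 * A guest that returns without calling proc_exit yields code 0. */
int run_until_exit(ExitCtx *ctx, GuestEntry entry) {
    assert(ctx != NULL && entry != NULL);
    ctx->exit_code = 0;
    ctx->exited = false;
    if (setjmp(ctx->exit_jmp) == 0) {
        entry(ctx);   /* first pass: run the guest */
    }
    /* Reached either by normal return or via longjmp from proc_exit. */
    return ctx->exit_code;
}

/* Two demo guests: one exits explicitly, one just returns. */
void demo_guest_exits(ExitCtx *ctx) { guest_proc_exit(ctx, 42); }
void demo_guest_returns(ExitCtx *ctx) { (void)ctx; }
```

Because the exit state lives in `*ctx` rather than in local variables of `run_until_exit`, nothing needs to be declared `volatile` to survive the jump. Any resources the interpreter allocated on its own call stack are skipped by `longjmp`, so cleanup has to hang off the context.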
Phase 3: Arguments (Days 5-7)
Goal: Pass command-line arguments
// args_sizes_get(argc: i32, argv_buf_size: i32) -> errno
uint32_t wasi_args_sizes_get(WasiCtx* ctx, uint32_t* args) {
uint32_t argc_ptr = args[0];
uint32_t argv_buf_size_ptr = args[1];
// Calculate buffer size needed
uint32_t buf_size = 0;
for (int i = 0; i < ctx->argc; i++) {
buf_size += strlen(ctx->args[i]) + 1; // +1 for null terminator
}
write_u32(ctx->memory, argc_ptr, ctx->argc);
write_u32(ctx->memory, argv_buf_size_ptr, buf_size);
return WASI_ERRNO_SUCCESS;
}
// args_get(argv: i32, argv_buf: i32) -> errno
uint32_t wasi_args_get(WasiCtx* ctx, uint32_t* args) {
uint32_t argv_ptr = args[0]; // Array of pointers
uint32_t argv_buf_ptr = args[1]; // Actual string data
uint32_t buf_offset = 0;
for (int i = 0; i < ctx->argc; i++) {
// Write pointer to this arg
write_u32(ctx->memory, argv_ptr + i * 4, argv_buf_ptr + buf_offset);
// Write the string
size_t len = strlen(ctx->args[i]) + 1;
memcpy(ctx->memory + argv_buf_ptr + buf_offset, ctx->args[i], len);
buf_offset += len;
}
return WASI_ERRNO_SUCCESS;
}
Test program:
#include <stdio.h>
int main(int argc, char** argv) {
for (int i = 0; i < argc; i++) {
printf("arg[%d] = %s\n", i, argv[i]);
}
return 0;
}
Checkpoint: ./wasi-runtime echo.wasm hello world prints arguments.
Phase 4: Environment (Days 8-9)
Goal: Pass environment variables
// environ_sizes_get(count: i32, buf_size: i32) -> errno
uint32_t wasi_environ_sizes_get(WasiCtx* ctx, uint32_t* args) {
uint32_t count_ptr = args[0];
uint32_t buf_size_ptr = args[1];
uint32_t buf_size = 0;
for (int i = 0; i < ctx->environ_count; i++) {
buf_size += strlen(ctx->environ[i]) + 1;
}
write_u32(ctx->memory, count_ptr, ctx->environ_count);
write_u32(ctx->memory, buf_size_ptr, buf_size);
return WASI_ERRNO_SUCCESS;
}
// environ_get(environ: i32, environ_buf: i32) -> errno
uint32_t wasi_environ_get(WasiCtx* ctx, uint32_t* args) {
uint32_t environ_ptr = args[0];
uint32_t environ_buf_ptr = args[1];
uint32_t buf_offset = 0;
for (int i = 0; i < ctx->environ_count; i++) {
write_u32(ctx->memory, environ_ptr + i * 4, environ_buf_ptr + buf_offset);
size_t len = strlen(ctx->environ[i]) + 1;
memcpy(ctx->memory + environ_buf_ptr + buf_offset, ctx->environ[i], len);
buf_offset += len;
}
return WASI_ERRNO_SUCCESS;
}
Checkpoint: Programs can read environment variables.
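`args_get` and `environ_get` write the same layout: an array of 32-bit pointers plus a buffer of NUL-terminated strings. Neither version above checks that the guest-supplied addresses are in range. A shared, bounds-checked helper could look like the sketch below; `marshal_strings` and `put_u32` are hypothetical names, and memory is assumed to be a flat byte array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian store; the caller has already bounds-checked addr. */
static void put_u32(uint8_t *mem, uint32_t addr, uint32_t v) {
    for (int i = 0; i < 4; i++) {
        mem[addr + (uint32_t)i] = (uint8_t)(v >> (8 * i));
    }
}

/* Writes `count` guest pointers at ptrs_addr and the NUL-terminated
 * strings back to back at buf_addr. Returns false as soon as a write
 * would leave linear memory; entries written before that point stay. */
bool marshal_strings(uint8_t *mem, uint32_t mem_size,
                     uint32_t ptrs_addr, uint32_t buf_addr,
                     const char *const *strs, uint32_t count) {
    assert(mem != NULL && strs != NULL);
    if ((uint64_t)ptrs_addr + (uint64_t)count * 4 > mem_size) return false;
    uint64_t cursor = buf_addr;
    for (uint32_t i = 0; i < count; i++) {
        size_t len = strlen(strs[i]) + 1;   /* include the terminator */
        if (cursor + len > mem_size) return false;
        put_u32(mem, ptrs_addr + i * 4, (uint32_t)cursor);
        memcpy(mem + cursor, strs[i], len);
        cursor += len;
    }
    return true;
}
```

With a helper like this, both WASI functions reduce to reading their two pointer arguments and returning `WASI_ERRNO_FAULT` when the helper reports failure.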
Phase 5: File Descriptor Table (Days 10-12)
Goal: Set up fd table with preopens
void init_fd_table(FdTable* table) {
memset(table, 0, sizeof(FdTable));
// fd 0 = stdin
table->fds[0] = (WasiFd){
.host_fd = STDIN_FILENO,
.rights_base = WASI_RIGHT_FD_READ,
.type = WASI_FILETYPE_CHARACTER_DEVICE,
};
// fd 1 = stdout
table->fds[1] = (WasiFd){
.host_fd = STDOUT_FILENO,
.rights_base = WASI_RIGHT_FD_WRITE,
.type = WASI_FILETYPE_CHARACTER_DEVICE,
};
// fd 2 = stderr
table->fds[2] = (WasiFd){
.host_fd = STDERR_FILENO,
.rights_base = WASI_RIGHT_FD_WRITE,
.type = WASI_FILETYPE_CHARACTER_DEVICE,
};
table->next_fd = 3;
}
int add_preopen(FdTable* table, const char* host_path, const char* guest_path) {
    if (table->next_fd >= MAX_FDS) return -1;  // table full
    int host_fd = open(host_path, O_RDONLY | O_DIRECTORY);
    if (host_fd < 0) return -1;
    int fd = table->next_fd++;
    // DIRECTORY_RIGHTS and FILE_RIGHTS are masks you define yourself by
    // OR-ing the WASI_RIGHT_* bits a directory or a regular file may use.
    table->fds[fd] = (WasiFd){
        .host_fd = host_fd,
        .path = strdup(guest_path),
        .rights_base = DIRECTORY_RIGHTS,
        .rights_inheriting = FILE_RIGHTS,
        .type = WASI_FILETYPE_DIRECTORY,
        .is_preopen = true,
    };
    return fd;
}
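`add_preopen` takes a host path and a guest path, which the command line supplies as `--dir=host:guest`. A possible parser for that option is sketched below. `parse_dir_option` is a hypothetical helper; splitting at the first colon and reusing the host path when no colon is present are assumptions of this sketch, not rules stated by this guide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Splits "--dir=HOST:GUEST" into two NUL-terminated strings.
 * With no ':' the guest path defaults to the host path.
 * Returns false for other options, empty halves, or small buffers. */
bool parse_dir_option(const char *arg,
                      char *host, size_t host_cap,
                      char *guest, size_t guest_cap) {
    static const char prefix[] = "--dir=";
    const size_t prefix_len = sizeof prefix - 1;
    assert(arg != NULL && host != NULL && guest != NULL);

    if (strncmp(arg, prefix, prefix_len) != 0) return false;
    const char *value = arg + prefix_len;
    const char *colon = strchr(value, ':');

    size_t host_len = colon ? (size_t)(colon - value) : strlen(value);
    const char *guest_src = colon ? colon + 1 : value;
    size_t guest_len = strlen(guest_src);

    if (host_len == 0 || guest_len == 0) return false;
    if (host_len >= host_cap || guest_len >= guest_cap) return false;

    memcpy(host, value, host_len);
    host[host_len] = '\0';
    memcpy(guest, guest_src, guest_len);
    guest[guest_len] = '\0';
    return true;
}
```

The two output strings can then be passed straight to `add_preopen(table, host, guest)` while walking `argv` in `main`.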
Phase 6: Prestat Functions (Days 13-14)
Goal: Let programs discover preopened directories
// fd_prestat_get(fd: i32, buf: i32) -> errno
uint32_t wasi_fd_prestat_get(WasiCtx* ctx, uint32_t* args) {
uint32_t fd = args[0];
uint32_t buf_ptr = args[1];
if (fd >= MAX_FDS || !ctx->fd_table.fds[fd].is_preopen) {
return WASI_ERRNO_BADF;
}
WasiFd* wasi_fd = &ctx->fd_table.fds[fd];
// Write prestat struct:
// u8 tag (0 = directory)
// u32 name_len
write_u8(ctx->memory, buf_ptr, 0); // PREOPENTYPE_DIR
write_u32(ctx->memory, buf_ptr + 4, strlen(wasi_fd->path));
return WASI_ERRNO_SUCCESS;
}
// fd_prestat_dir_name(fd: i32, path: i32, path_len: i32) -> errno
uint32_t wasi_fd_prestat_dir_name(WasiCtx* ctx, uint32_t* args) {
uint32_t fd = args[0];
uint32_t path_ptr = args[1];
uint32_t path_len = args[2];
if (fd >= MAX_FDS || !ctx->fd_table.fds[fd].is_preopen) {
return WASI_ERRNO_BADF;
}
WasiFd* wasi_fd = &ctx->fd_table.fds[fd];
size_t name_len = strlen(wasi_fd->path);
if (path_len < name_len) {
return WASI_ERRNO_NAMETOOLONG;
}
memcpy(ctx->memory + path_ptr, wasi_fd->path, name_len);
return WASI_ERRNO_SUCCESS;
}
Phase 7: File Operations (Days 15-18)
Goal: Open and read files
// path_open(fd, dirflags, path, path_len, oflags, fs_rights_base,
// fs_rights_inheriting, fdflags, opened_fd) -> errno
// The two rights parameters are i64; each arrives here as a (low, high) pair.
uint32_t wasi_path_open(WasiCtx* ctx, uint32_t* args) {
    uint32_t dir_fd = args[0];
    uint32_t dirflags = args[1];   // lookup flags; unused in this sketch
    uint32_t path_ptr = args[2];
    uint32_t path_len = args[3];
    uint32_t oflags = args[4];
    uint64_t rights_base = args[5] | ((uint64_t)args[6] << 32);
    uint64_t rights_inherit = args[7] | ((uint64_t)args[8] << 32);
    uint32_t fdflags = args[9];    // append/nonblock/sync flags; unused in this sketch
    uint32_t opened_fd_ptr = args[10];
    (void)dirflags;
    (void)fdflags;
    // Validate directory fd
    if (dir_fd >= MAX_FDS || ctx->fd_table.fds[dir_fd].type != WASI_FILETYPE_DIRECTORY) {
        return WASI_ERRNO_NOTDIR;
    }
    WasiFd* dir = &ctx->fd_table.fds[dir_fd];
    // Validate guest pointers before touching memory
    if ((uint64_t)path_ptr + path_len > ctx->memory_size ||
        (uint64_t)opened_fd_ptr + 4 > ctx->memory_size) {
        return WASI_ERRNO_FAULT;
    }
    if (ctx->fd_table.next_fd >= MAX_FDS) return WASI_ERRNO_MFILE;  // table full
    // Read path from WASM memory
    char path[PATH_MAX];
    if (path_len >= PATH_MAX) return WASI_ERRNO_NAMETOOLONG;
    memcpy(path, ctx->memory + path_ptr, path_len);
    path[path_len] = '\0';
    // NOTE: reject absolute paths and ".." escapes here, before openat()
    // (see "Path Resolution Security" under Common Pitfalls)
    // Convert WASI flags to POSIX
    int posix_flags = 0;
    if (oflags & WASI_OFLAGS_CREAT) posix_flags |= O_CREAT;
    if (oflags & WASI_OFLAGS_EXCL) posix_flags |= O_EXCL;
    if (oflags & WASI_OFLAGS_TRUNC) posix_flags |= O_TRUNC;
    if ((rights_base & WASI_RIGHT_FD_READ) && (rights_base & WASI_RIGHT_FD_WRITE)) {
        posix_flags |= O_RDWR;
    } else if (rights_base & WASI_RIGHT_FD_WRITE) {
        posix_flags |= O_WRONLY;
    } else {
        posix_flags |= O_RDONLY;
    }
    // Open relative to directory fd
    int host_fd = openat(dir->host_fd, path, posix_flags, 0666);
    if (host_fd < 0) {
        return errno_to_wasi(errno);
    }
    // Allocate new fd; it never holds more rights than the directory passes on
    int new_fd = ctx->fd_table.next_fd++;
    ctx->fd_table.fds[new_fd] = (WasiFd){
        .host_fd = host_fd,
        .rights_base = rights_base & dir->rights_inheriting,
        .rights_inheriting = rights_inherit & dir->rights_inheriting,
        .type = WASI_FILETYPE_REGULAR_FILE,
    };
    write_u32(ctx->memory, opened_fd_ptr, new_fd);
    return WASI_ERRNO_SUCCESS;
}
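// fd_read(fd, iovs, iovs_len, nread) -> errno
uint32_t wasi_fd_read(WasiCtx* ctx, uint32_t* args) {
    uint32_t fd = args[0], iovs_ptr = args[1], iovs_len = args[2], nread_ptr = args[3];
    // Mirrors fd_write, with read() in place of write().
    // An unused slot has no rights, so it is rejected by the rights test.
    if (fd >= MAX_FDS) return WASI_ERRNO_BADF;
    WasiFd* f = &ctx->fd_table.fds[fd];
    if (!(f->rights_base & WASI_RIGHT_FD_READ)) return WASI_ERRNO_NOTCAPABLE;
    if ((uint64_t)iovs_ptr + (uint64_t)iovs_len * 8 > ctx->memory_size ||
        (uint64_t)nread_ptr + 4 > ctx->memory_size) return WASI_ERRNO_FAULT;
    uint32_t total = 0;
    for (uint32_t i = 0; i < iovs_len; i++) {
        uint32_t buf_ptr = read_u32(ctx->memory, iovs_ptr + i * 8);
        uint32_t buf_len = read_u32(ctx->memory, iovs_ptr + i * 8 + 4);
        if ((uint64_t)buf_ptr + buf_len > ctx->memory_size) return WASI_ERRNO_FAULT;
        ssize_t n = read(f->host_fd, ctx->memory + buf_ptr, buf_len);
        if (n < 0) return errno_to_wasi(errno);
        total += (uint32_t)n;
        if ((uint32_t)n < buf_len) break;  // EOF or short read: stop here
    }
    write_u32(ctx->memory, nread_ptr, total);
    return WASI_ERRNO_SUCCESS;
}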
Phase 8: Clock and Random (Days 19-21)
Goal: Time and randomness
// clock_time_get(clock_id, precision, time) -> errno
uint32_t wasi_clock_time_get(WasiCtx* ctx, uint32_t* args) {
uint32_t clock_id = args[0];
uint64_t precision = args[1] | ((uint64_t)args[2] << 32);
uint32_t time_ptr = args[3];
struct timespec ts;
clockid_t posix_clock;
switch (clock_id) {
case WASI_CLOCK_REALTIME:
posix_clock = CLOCK_REALTIME;
break;
case WASI_CLOCK_MONOTONIC:
posix_clock = CLOCK_MONOTONIC;
break;
default:
return WASI_ERRNO_INVAL;
}
if (clock_gettime(posix_clock, &ts) != 0) {
return errno_to_wasi(errno);
}
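// The result is a u64, so 8 bytes at time_ptr must be in bounds
if ((uint64_t)time_ptr + 8 > ctx->memory_size) {
return WASI_ERRNO_FAULT;
}
uint64_t nanos = (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
write_u64(ctx->memory, time_ptr, nanos);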
return WASI_ERRNO_SUCCESS;
}
// random_get(buf, buf_len) -> errno
uint32_t wasi_random_get(WasiCtx* ctx, uint32_t* args) {
uint32_t buf_ptr = args[0];
uint32_t buf_len = args[1];
// 64-bit sum so that buf_ptr + buf_len cannot wrap around
if ((uint64_t)buf_ptr + buf_len > ctx->memory_size) {
return WASI_ERRNO_FAULT;
}
// Use system random source
int fd = open("/dev/urandom", O_RDONLY);
if (fd < 0) {
return errno_to_wasi(errno);
}
ssize_t n = read(fd, ctx->memory + buf_ptr, buf_len);
close(fd);
// A failed or short read must not be reported as success
if (n < 0 || (uint64_t)n != buf_len) {
return WASI_ERRNO_IO;
}
return WASI_ERRNO_SUCCESS;
}
Testing Strategy
Unit Tests
Test individual WASI functions:
void test_fd_write() {
WasiCtx ctx = create_test_ctx();
// Set up iovec in memory
write_u32(ctx.memory, 0x1000, 0x2000); // buf ptr
write_u32(ctx.memory, 0x1004, 5); // buf len
memcpy(ctx.memory + 0x2000, "hello", 5);
uint32_t args[] = {1, 0x1000, 1, 0x3000}; // fd=1, iovs, iovs_len=1, nwritten
uint32_t result = wasi_fd_write(&ctx, args);
assert(result == WASI_ERRNO_SUCCESS);
assert(read_u32(ctx.memory, 0x3000) == 5);
}
Integration Tests
Use pre-compiled WASI programs:
# Compile test programs with wasi-sdk
/opt/wasi-sdk/bin/clang --target=wasm32-wasi -o hello.wasm hello.c
/opt/wasi-sdk/bin/clang --target=wasm32-wasi -o cat.wasm cat.c
# Test
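./wasi-runtime hello.wasm | grep "Hello"
# /dev/stdin lies outside every preopen, so read from a file in a granted directory
mkdir -p data && echo "test" > data/test.txt
./wasi-runtime --dir=./data:/data cat.wasm /data/test.txt | grep "test"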
Compatibility Tests
Run programs compiled by others:
# Download wasi-libc test suite
git clone https://github.com/WebAssembly/wasi-libc.git
cd wasi-libc/test
# Try to run each test
for wasm in *.wasm; do
./wasi-runtime "$wasm" && echo "PASS: $wasm" || echo "FAIL: $wasm"
done
Compare Against wasmtime
# Run same program in wasmtime and your runtime
wasmtime program.wasm arg1 arg2 > expected.txt
./wasi-runtime program.wasm arg1 arg2 > actual.txt
diff expected.txt actual.txt
Common Pitfalls
1. Endianness in Memory
WASM uses little-endian. Be consistent:
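uint32_t read_u32(uint8_t* mem, uint32_t addr) {
    // Cast each byte before shifting: a uint8_t promotes to int, and
    // shifting a byte with its top bit set left by 24 overflows a signed int.
    return (uint32_t)mem[addr] |
           ((uint32_t)mem[addr + 1] << 8) |
           ((uint32_t)mem[addr + 2] << 16) |
           ((uint32_t)mem[addr + 3] << 24);
}
void write_u32(uint8_t* mem, uint32_t addr, uint32_t val) {
    mem[addr] = val & 0xff;
    mem[addr + 1] = (val >> 8) & 0xff;
    mem[addr + 2] = (val >> 16) & 0xff;
    mem[addr + 3] = (val >> 24) & 0xff;
}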
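The clock code earlier calls `write_u64`, which is not shown anywhere in this guide. A matching pair of 64-bit helpers might look like the following sketch. The names and argument order are chosen to mirror `read_u32` and `write_u32`, and, like those, the helpers assume the caller has already checked that `addr + 8` is in bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little-endian load of a u64: the byte at addr is the least significant. */
uint64_t read_u64(const uint8_t *mem, uint32_t addr) {
    assert(mem != NULL);
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | mem[addr + (uint32_t)i];
    }
    return v;
}

/* Little-endian store of a u64. */
void write_u64(uint8_t *mem, uint32_t addr, uint64_t val) {
    assert(mem != NULL);
    for (int i = 0; i < 8; i++) {
        mem[addr + (uint32_t)i] = (uint8_t)(val >> (8 * i));
    }
}
```

Writing the bytes one at a time keeps the helpers correct on big-endian hosts and for unaligned guest addresses, at a small cost compared with `memcpy` on little-endian machines.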
2. Path Resolution Security
Don't allow path traversal attacks:
// BAD: allows escape via ../
path_open(3, 0, "../../../etc/passwd", ...)
// Your runtime must:
// 1. Resolve the path relative to the preopen
// 2. Check that resolved path is still under preopen
// 3. Use openat() to ensure atomicity
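A first line of defense for step 2 is a purely lexical check that the guest path never climbs above the directory it is resolved against. The sketch below is one way to write it; `path_stays_inside` is a hypothetical helper. It deliberately ignores symlinks, which can still point outside the preopen, so it complements rather than replaces component-by-component resolution (or, on recent Linux kernels, `openat2` with `RESOLVE_BENEATH`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true when `path`, taken relative to a preopen, never steps
 * above that directory. Lexical only: symlinks are not considered. */
bool path_stays_inside(const char *path) {
    assert(path != NULL);
    if (path[0] == '/') return false;   /* absolute paths bypass the preopen */

    int depth = 0;                       /* directories below the preopen */
    const char *p = path;
    while (*p != '\0') {
        size_t len = strcspn(p, "/");    /* length of the next component */
        if (len == 2 && p[0] == '.' && p[1] == '.') {
            if (--depth < 0) return false;   /* climbed above the root */
        } else if (len > 0 && !(len == 1 && p[0] == '.')) {
            depth++;                     /* ordinary component */
        }
        /* Empty components ("//") and "." leave the depth unchanged. */
        p += len;
        if (*p == '/') p++;              /* skip the separator */
    }
    return true;
}
```

In `path_open` this check would run right after the path is copied out of linear memory, returning an error such as `WASI_ERRNO_NOTCAPABLE` when it fails.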
3. Rights Inheritance
When opening a file from a directory, the file's rights are limited:
// File can't have more rights than directory allows
new_fd.rights_base = requested_rights & dir_fd.rights_inheriting;
4. errno Mapping
Map POSIX errno to WASI errno correctly:
wasi_errno_t errno_to_wasi(int posix_errno) {
switch (posix_errno) {
case 0: return WASI_ERRNO_SUCCESS;
case EACCES: return WASI_ERRNO_ACCES;
case EBADF: return WASI_ERRNO_BADF;
case EEXIST: return WASI_ERRNO_EXIST;
case EINVAL: return WASI_ERRNO_INVAL;
case ENOENT: return WASI_ERRNO_NOENT;
case ENOTDIR: return WASI_ERRNO_NOTDIR;
case EPERM: return WASI_ERRNO_PERM;
// ... many more ...
default: return WASI_ERRNO_IO; // Fallback
}
}
5. 64-bit Arguments
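Some WASI functions take 64-bit arguments. At the WASM level each one is a single i64 parameter, but WasiFunc receives an array of uint32_t, so the import glue has to split every i64 into a low and a high half before the call. The handler then reassembles it:
// rights is a u64, delivered as two u32 slots (low half first)
uint64_t rights_base = args[5] | ((uint64_t)args[6] << 32);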
Extensions
1. Socket Support
Add sock_* functions for networking:
uint32_t wasi_sock_connect(WasiCtx* ctx, uint32_t* args) {
// Create socket, connect to address
// Return new fd
}
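2. Async I/O (preview2)
Implement poll_oneoff properly. It is preview1's only polling primitive and a useful stepping stone toward preview2's richer async model: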
uint32_t wasi_poll_oneoff(WasiCtx* ctx, uint32_t* args) {
// Use poll() or select() on host
// Return which subscriptions are ready
}
3. Thread Support
Add thread_spawn for parallel execution:
uint32_t wasi_thread_spawn(WasiCtx* ctx, uint32_t* args) {
// Create new thread with shared memory
// Each thread needs its own stack
}
4. Filesystem Virtualization
Map multiple host paths to virtual filesystem:
// Virtual FS structure
// /home -> /Users/me
// /tmp -> /var/tmp
// /app -> read-only from WASM bundle
5. Capability Tokens
Implement fine-grained capabilities:
// Instead of blanket preopens:
// Grant specific operations on specific paths
grant_capability(ctx, "/data/output.txt", WRITE_ONLY);
grant_capability(ctx, "/data/input.txt", READ_ONLY);
Real-World Connections
Edge Computing
Cloudflare Workers, Fastly Compute@Edge, AWS Lambda:
- Run WASI programs with millisecond cold starts
- Strong isolation between tenants
- Your runtime demonstrates the security model
Container Alternative
Krustlet, runwasi:
- Run WASM in Kubernetes instead of containers
- WASI provides the system interface
- Smaller, faster, more portable than Docker
Plugin Systems
Figma, Shopify, Envoy Proxy:
- Run third-party code safely
- WASI provides controlled access to resources
- Your runtime shows how this works
Blockchain
Near Protocol, Polkadot:
- Deterministic execution for smart contracts
- WASI-like interfaces for chain interaction
- Gas metering (add instruction counting!)
Self-Assessment Checklist
WASI Understanding
- Explain capability-based security vs ambient authority
- Describe how preopens limit file access
- Explain why WASI uses pointers into linear memory for data
- List the WASI preview1 function categories
Implementation
- Run "Hello, World!" to stdout
- Pass command-line arguments to programs
- Open and read files from preopened directories
- Properly handle all errno cases
Security
- Prevent path traversal attacks (../)
- Enforce rights on file descriptors
- Validate all memory accesses
- Sandbox file access to preopens only
Compatibility
- Run programs compiled with wasi-sdk
- Match wasmtime behavior for basic programs
- Handle wasi-libc initialization correctly
Resources
Specifications
- WASI Specification - Official spec
- wasi-libc - WASI C library
- WASI Tutorial - Wasmtime docs
Reference Implementations
- wasmtime WASI - Production implementation
- wasm3 WASI - Clean, readable
Articles
- Standardizing WASI - Lin Clark's illustrated guide
- The WebAssembly System Interface - Official site
Key Insights
WASI is POSIX with guardrails. The API surface is familiar, but the capability model fundamentally changes the security story. Programs can only access what they're explicitly granted.
The host is in control. Every WASI call goes through your runtime. You decide what files exist, what time it is, what random numbers look like. This is powerful for sandboxing and testing.
Memory marshalling is the hard part. WASI function signatures look simple, but the real complexity is in correctly reading and writing data structures from WASM linear memory.
Preopens are the security boundary. Understanding preopens is understanding WASI security. A program can't access anything it wasn't granted at startup.
After completing this project, you will have built a complete WebAssembly runtime that can run real command-line programs. The Capstone project integrates everything into a professional toolchain.
The Core Question You're Answering
"How does sandboxed code safely interact with the outside world, and what does capability-based security really mean?"
This is the fundamental question that WASI addresses. When you run untrusted code, you face a dilemma:
- Too restrictive: Code canโt do anything useful (no I/O, no persistence)
- Too permissive: Code can access anything (read your SSH keys, delete files)
WASI's answer: Explicit capability grants. Instead of asking "does this process have permission?", you ask "was this process given the capability?". The difference is profound:
Traditional (Ambient Authority):
┌─────────────────────────────────────────────────────────────────┐
│ Process starts with access to everything user can access        │
│ System checks: "Does user have permission for this file?"       │
│ Problem: Any code in process can access any user file           │
└─────────────────────────────────────────────────────────────────┘
Capability-Based (WASI):
┌─────────────────────────────────────────────────────────────────┐
│ Process starts with access to NOTHING                           │
│ Host explicitly grants: "Here's a handle to /data directory"    │
│ Module can only access what was explicitly given                │
│ Problem solved: Untrusted code can't escape its sandbox         │
└─────────────────────────────────────────────────────────────────┘

By building a WASI runtime, you're implementing the answer to this question in code.
Concepts You Must Understand First
Before diving into implementation, ensure you have solid grounding in these concepts:
1. WASI Preview 1/2 Specification
WASI defines a standardized ABI for WebAssembly modules to interact with the host system. Preview 1 (stable) uses the wasi_snapshot_preview1 module namespace. Preview 2 (WASI 0.2) uses the Component Model with the WIT IDL for more expressive interfaces.
Key differences:
| Aspect | Preview 1 | Preview 2 |
|---|---|---|
| Module format | Core WASM | Components |
| Interface definition | Fixed function signatures | WIT IDL |
| Type system | i32/i64/f32/f64 only | Rich types, resources |
| Async I/O | poll_oneoff (limited) | Native async (planned) |
2. Capability-Based Security Model
Unlike access control lists (ACLs) where permissions are checked against an identity, capabilities are unforgeable tokens that grant specific rights. A file descriptor in WASI is a capability:
// The fd IS the capability. Having fd 3 means you have access.
// You can't fabricate fd 3 if it wasn't given to you.
// You can't convert fd 3 (read-only) to fd 3 (read-write).
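The last point, that rights can shrink but never grow, is what a call like fd_fdstat_set_rights has to enforce. A minimal sketch of that rule follows; `DemoFd`, `demo_set_rights`, and `demo_may` are hypothetical names, and the two bit positions are copied from the rights table earlier in this guide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RIGHT_FD_READ  (1ULL << 1)
#define DEMO_RIGHT_FD_WRITE (1ULL << 6)

typedef struct {
    uint64_t rights_base;
} DemoFd;

/* Replaces the fd's rights with `wanted`, but only if `wanted` is a
 * subset of what the fd already holds. Rights can be dropped, never added. */
bool demo_set_rights(DemoFd *fd, uint64_t wanted) {
    assert(fd != NULL);
    if ((wanted & ~fd->rights_base) != 0) return false;  /* asks for more */
    fd->rights_base = wanted;
    return true;
}

/* True when the fd holds every bit in `right`. */
bool demo_may(const DemoFd *fd, uint64_t right) {
    return (fd->rights_base & right) == right;
}
```

A failed attempt leaves the descriptor unchanged, so a module that has dropped write access has no path back to it through this call.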
Historical context: This model was pioneered by systems like:
- Capsicum (FreeBSD): Sandboxing with capabilities
- CloudABI: POSIX subset with capabilities (heavily influenced WASI)
- seL4: Capability-based microkernel
3. File Descriptors and Preopens
Preopens are the bridge between the host filesystem and the sandboxed module:
// Before execution:
// fd 0 = stdin (capability to read console)
// fd 1 = stdout (capability to write console)
// fd 2 = stderr (capability to write console)
// fd 3 = preopen "/" -> "/home/user/app/data" (capability to access this dir)
// The module sees "/" but can only access /home/user/app/data on host
The module discovers preopens by calling fd_prestat_get starting at fd 3 until it gets EBADF.
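On the host side, that discovery loop only works if the runtime answers fd_prestat_get for each preopen and returns EBADF past the last one. The following is a minimal sketch under stated assumptions: `SkPreopen` and `sk_fd_prestat_get` are hypothetical names, preopens are assumed to sit at fds 3, 4, ... in table order, and the numeric errno values (8 for badf, 21 for fault) are taken from the Preview 1 definitions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical preopen record: the guest-visible name plus the host directory fd. */
typedef struct {
    const char *guest_path;
    int         host_fd;
} SkPreopen;

/* Sketch of fd_prestat_get. Preopens occupy fds 3, 4, ... in table order.
 * Guest layout written at out_ptr (8 bytes): tag u8 at offset 0 (0 = directory),
 * name length as a little-endian u32 at offset 4. */
uint32_t sk_fd_prestat_get(const SkPreopen *pre, size_t n_pre,
                           uint8_t *mem, uint64_t mem_size,
                           uint32_t fd, uint32_t out_ptr) {
    if (fd < 3 || (size_t)(fd - 3) >= n_pre) return 8;   /* badf: ends the guest's loop */
    if ((uint64_t)out_ptr + 8 > mem_size) return 21;     /* fault: pointer out of bounds */
    uint32_t len = (uint32_t)strlen(pre[fd - 3].guest_path);
    memset(mem + out_ptr, 0, 8);                         /* tag = 0, padding zeroed */
    mem[out_ptr + 4] = (uint8_t)(len);
    mem[out_ptr + 5] = (uint8_t)(len >> 8);
    mem[out_ptr + 6] = (uint8_t)(len >> 16);
    mem[out_ptr + 7] = (uint8_t)(len >> 24);
    return 0;
}
```

fd_prestat_dir_name would follow the same pattern, copying the guest path bytes (without a terminating NUL) into the buffer the module supplies.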
4. System Call Abstraction
WASI functions map to host system calls but with sandboxing:
WASI call            Host System Call          Sandboxing Applied
──────────────────────────────────────────────────────────────────
fd_write(1, ...)   → write(STDOUT, ...)        (fd mapping)
path_open(3, "x")  → openat(host_fd, "x")      (path resolution + rights)
fd_read(4, ...)    → read(mapped_fd, ...)      (fd mapping + rights check)
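The "fd mapping" and "rights check" columns above could be backed by one small lookup that every fd-based call goes through. This is a sketch, assuming a fixed-size table: `SkFdTable`, `SkFdEntry`, and `sk_fd_lookup` are hypothetical names, while the rights bit positions and errno numbers (8 for badf, 76 for notcapable) follow the Preview 1 definitions.

```c
#include <stdint.h>

#define SK_MAX_FDS        64                     /* assumed table capacity */
#define SK_RIGHT_FD_READ  (UINT64_C(1) << 1)     /* Preview 1 rights bit 1 */
#define SK_RIGHT_FD_WRITE (UINT64_C(1) << 6)     /* Preview 1 rights bit 6 */

typedef struct {
    int      active;        /* 0 = slot unused or closed */
    int      host_fd;       /* descriptor on the host side */
    uint64_t rights_base;   /* operations allowed on this fd */
} SkFdEntry;

typedef struct {
    SkFdEntry fds[SK_MAX_FDS];
} SkFdTable;

/* Translate a guest fd into a host fd, enforcing validity and rights.
 * Returns a WASI errno: 0 on success, 8 (badf), or 76 (notcapable). */
uint32_t sk_fd_lookup(const SkFdTable *t, uint32_t fd,
                      uint64_t needed, int *host_fd_out) {
    if (fd >= SK_MAX_FDS || !t->fds[fd].active) return 8;
    if ((t->fds[fd].rights_base & needed) != needed) return 76;
    *host_fd_out = t->fds[fd].host_fd;
    return 0;
}
```

Checking the index before touching the table matters: the guest controls `fd`, so an unchecked `fds[fd]` would be an out-of-bounds read in the host.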
5. Sandboxing and Isolation
WASI provides multiple layers of isolation:
┌──────────────────────────────────────────────────────┐
│ Layer 1: Memory Isolation                            │
│   - WASM linear memory is separate from host memory  │
│   - All pointers are offsets into this memory        │
├──────────────────────────────────────────────────────┤
│ Layer 2: Capability Restriction                      │
│   - Only preopened directories are accessible        │
│   - Rights bits limit operations per fd              │
├──────────────────────────────────────────────────────┤
│ Layer 3: Path Sandboxing                             │
│   - ".." cannot escape preopen directory             │
│   - Symlinks resolved within sandbox                 │
├──────────────────────────────────────────────────────┤
│ Layer 4: Resource Limits (optional)                  │
│   - Memory limits                                    │
│   - CPU limits (fuel/gas)                            │
└──────────────────────────────────────────────────────┘

6. Host Function Implementation
Your runtime provides host functions that the WASM module imports:
// Module imports: (import "wasi_snapshot_preview1" "fd_write" (func ...))
// Your runtime provides the implementation:
typedef struct {
const char* module; // "wasi_snapshot_preview1"
const char* name; // "fd_write"
void* host_func; // pointer to your implementation
int param_count; // 4 (fd, iovs, iovs_len, nwritten)
int result_count; // 1 (errno)
} HostImport;
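At link time the interpreter has to turn each (module, name) pair from the import section into one of these entries. One possible approach is a linear search over a static table, sketched below. Everything here is illustrative: `HostImportEntry`, `HostFn`, `demo_imports`, `stub_fd_write`, and `find_host_import` are hypothetical names, and the uniform function signature is an assumption about how your interpreter passes arguments.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed uniform signature for host functions: raw args in, WASI errno out. */
typedef uint32_t (*HostFn)(void *ctx, const uint64_t *args);

typedef struct {
    const char *module;        /* e.g. "wasi_snapshot_preview1" */
    const char *name;          /* e.g. "fd_write" */
    HostFn      fn;            /* host implementation */
    int         param_count;
    int         result_count;
} HostImportEntry;

/* Stub used only to populate the table in this sketch. */
uint32_t stub_fd_write(void *ctx, const uint64_t *args) {
    (void)ctx;
    (void)args;
    return 0;                  /* WASI errno 0 = success */
}

static const HostImportEntry demo_imports[] = {
    { "wasi_snapshot_preview1", "fd_write", stub_fd_write, 4, 1 },
};

/* Return the matching entry, or NULL so the loader can report an unresolved import. */
const HostImportEntry *find_host_import(const char *module, const char *name) {
    size_t n = sizeof(demo_imports) / sizeof(demo_imports[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(demo_imports[i].module, module) == 0 &&
            strcmp(demo_imports[i].name, name) == 0) {
            return &demo_imports[i];
        }
    }
    return NULL;
}
```

Resolving imports once at instantiation, and failing loudly on a NULL result, keeps the call path in the interpreter loop free of string comparisons.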
Questions to Guide Your Design
fd_write and fd_read Implementation
- How do you read the iovec array from WASM memory?
- What's the layout of an iovec struct? (buf: u32, buf_len: u32)
- How do you iterate through the array?
- What happens if the pointer is out of bounds?
- How do you map WASI fd to host fd?
- Where do you store the mapping?
- How do you handle closed fds?
- What if the WASI fd doesn't exist?
- How do you check rights before performing the operation?
- Does fd 4 have WASI_RIGHT_FD_WRITE?
- What errno do you return for insufficient rights?
- How do you handle partial writes?
- Host write() returned less than requested
- Do you retry or return partial count?
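For the iovec questions above, one way to answer them is to validate the whole array before touching any buffer. The sketch below assumes the 8-byte guest layout {buf: u32, buf_len: u32} and uses hypothetical helper names (`sk_read_u32_le`, `sk_in_bounds`, `sk_check_iovs`); errno 21 is the Preview 1 value for fault.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_ERRNO_SUCCESS 0u
#define SK_ERRNO_FAULT   21u   /* WASI "fault": bad address */

/* Read a little-endian u32; the caller has already bounds-checked off..off+4. */
uint32_t sk_read_u32_le(const uint8_t *mem, uint32_t off) {
    return (uint32_t)mem[off] | ((uint32_t)mem[off + 1] << 8) |
           ((uint32_t)mem[off + 2] << 16) | ((uint32_t)mem[off + 3] << 24);
}

/* True if [ptr, ptr+len) lies inside memory; 64-bit math avoids u32 wraparound. */
int sk_in_bounds(uint64_t mem_size, uint32_t ptr, uint64_t len) {
    return (uint64_t)ptr + len <= mem_size;
}

/* Validate every iovec and sum the buffer lengths into *total_out. */
uint32_t sk_check_iovs(const uint8_t *mem, uint64_t mem_size,
                       uint32_t iovs_ptr, uint32_t iovs_len,
                       uint64_t *total_out) {
    if (!sk_in_bounds(mem_size, iovs_ptr, (uint64_t)iovs_len * 8))
        return SK_ERRNO_FAULT;
    uint64_t total = 0;
    for (uint32_t i = 0; i < iovs_len; i++) {
        uint32_t entry = iovs_ptr + i * 8;          /* safe: array is in bounds */
        uint32_t buf = sk_read_u32_le(mem, entry);
        uint32_t len = sk_read_u32_le(mem, entry + 4);
        if (!sk_in_bounds(mem_size, buf, len))
            return SK_ERRNO_FAULT;
        total += len;
    }
    *total_out = total;
    return SK_ERRNO_SUCCESS;
}
```

On partial writes, a common choice is to report the short count in nwritten and return success, leaving the retry to the guest's libc, which mirrors how POSIX write() callers already behave.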
Path Resolution in Sandbox
- How do you prevent ".." from escaping the sandbox?
- Track directory depth during resolution
- Use openat() to stay relative
- What about symlinks pointing outside?
- How do you handle absolute paths?
- WASI doesn't allow absolute paths
- Return ENOTCAPABLE if path starts with /
- How do you resolve paths relative to preopens?
- Module requests "data/file.txt"
- Which preopen matches this path?
- What if multiple preopens could match?
- What's your strategy for symlink resolution?
- O_NOFOLLOW on each component
- Read symlink target, validate it stays in sandbox
- Handle symlink loops (ELOOP after 40 iterations)
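The depth-tracking idea from the list above can be sketched as a purely lexical pre-check. This is a minimal illustration with a hypothetical function name (`sk_path_stays_inside`); it does not look at the filesystem, so it cannot catch symlinks that point outside the preopen and has to be combined with the O_NOFOLLOW or openat2() strategies.

```c
#include <stddef.h>

/* Returns 1 if the relative path never climbs above its starting directory,
 * 0 if it is absolute or a ".." would go above the root.
 * Lexical only: symlinks are not examined here. */
int sk_path_stays_inside(const char *path) {
    if (path[0] == '/') return 0;              /* absolute paths are rejected */
    int depth = 0;
    const char *p = path;
    while (*p) {
        while (*p == '/') p++;                 /* skip repeated separators */
        const char *start = p;
        while (*p && *p != '/') p++;           /* scan one component */
        size_t len = (size_t)(p - start);
        if (len == 0) break;                   /* trailing slash */
        if (len == 1 && start[0] == '.') continue;
        if (len == 2 && start[0] == '.' && start[1] == '.') {
            if (--depth < 0) return 0;         /* would escape the preopen */
        } else {
            depth++;
        }
    }
    return 1;
}
```

Note that "data/../file.txt" is accepted because the ".." only undoes a step taken inside the sandbox, while "a/../../b" is rejected at the second "..".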
Managing Capabilities (Preopens)
- How do you parse preopen arguments from CLI?
- --dir=/host/path:/guest/path
- Default rights for directories vs files?
- How do you expose preopens to the module?
- fd_prestat_get returns preopen info
- fd_prestat_dir_name returns the virtual path
- Module calls these to discover its capabilities
- How does rights inheritance work?
- Directory has rights_inheriting
- Files opened from dir get: requested_rights & rights_inheriting
- How do you handle rights on opened files?
- Store rights in fd table entry
- Check before each operation
- Never escalate rights
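Parsing the --dir=/host/path:/guest/path form from the list above might look like the sketch below. The names `SkPreopenSpec` and `sk_parse_dir_arg` are hypothetical, and two behaviors are assumptions rather than anything the text specifies: the split happens at the first ':' and a missing guest part falls back to the host path.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    char host_path[256];    /* directory on the host */
    char guest_path[256];   /* name the module sees */
} SkPreopenSpec;

/* Parse "--dir=HOST:GUEST" (or "--dir=HOST"). Returns 0 on success, -1 if malformed. */
int sk_parse_dir_arg(const char *arg, SkPreopenSpec *out) {
    const char *prefix = "--dir=";
    size_t plen = strlen(prefix);
    if (strncmp(arg, prefix, plen) != 0) return -1;

    const char *spec = arg + plen;
    const char *colon = strchr(spec, ':');      /* assumption: split at first ':' */
    size_t host_len = colon ? (size_t)(colon - spec) : strlen(spec);
    const char *guest = colon ? colon + 1 : spec;
    size_t guest_len = colon ? strlen(guest) : host_len;

    if (host_len == 0 || guest_len == 0) return -1;
    if (host_len >= sizeof out->host_path) return -1;
    if (guest_len >= sizeof out->guest_path) return -1;

    memcpy(out->host_path, spec, host_len);
    out->host_path[host_len] = '\0';
    memcpy(out->guest_path, guest, guest_len);
    out->guest_path[guest_len] = '\0';
    return 0;
}
```

After parsing, the runtime would open host_path as a directory and store the resulting host fd, the guest path, and the default rights in the next free fd table slot starting at 3.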
Mapping WASI Calls to Host OS
- How do you handle WASI errno vs POSIX errno?
- Different numeric values
- Some WASI errors don't exist in POSIX
- Create mapping table
- How do you handle platform differences?
- Linux has openat2() with RESOLVE_BENEATH
- macOS needs manual path component walking
- Windows needs completely different approach
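- How do you handle 64-bit values in 32-bit WASM?
- Rights are u64 and arrive as single i64 params (i64 is a native WASM type; "wasm32" only means 32-bit pointers)
- File sizes and timestamps are u64, stored in memory as 8 bytes
- Use proper little-endian encoding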
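The errno mapping table mentioned above can start as a plain switch. The sketch below covers only a handful of codes and uses hypothetical names (`sk_errno_to_wasi`, the `SK_WASI_` constants); the numeric values are the Preview 1 errno numbers, and falling back to the I/O error code for unmapped values is an assumption, not a rule from the spec.

```c
#include <errno.h>
#include <stdint.h>

/* A few WASI Preview 1 errno values (numbering differs from the host's). */
enum {
    SK_WASI_SUCCESS    = 0,
    SK_WASI_ACCES      = 2,
    SK_WASI_BADF       = 8,
    SK_WASI_INVAL      = 28,
    SK_WASI_IO         = 29,
    SK_WASI_NOENT      = 44,
    SK_WASI_NOTCAPABLE = 76    /* no POSIX counterpart */
};

/* Translate a host errno into a WASI errno. */
uint16_t sk_errno_to_wasi(int host_errno) {
    switch (host_errno) {
        case 0:      return SK_WASI_SUCCESS;
        case EACCES: return SK_WASI_ACCES;
        case EBADF:  return SK_WASI_BADF;
        case EINVAL: return SK_WASI_INVAL;
        case ENOENT: return SK_WASI_NOENT;
        default:     return SK_WASI_IO;   /* assumed fallback for unmapped codes */
    }
}
```

Switching on the symbolic host constants rather than literal numbers keeps the table portable, since the host values differ between Linux, macOS, and other systems.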
Thinking Exercise
Trace a fd_write Call from WASM Through WASI to Actual File I/O
Follow this complete journey step by step:
Step 1: WASM Module Executes
────────────────────────────
(call $fd_write
  (i32.const 1)        ;; fd = 1 (stdout)
  (i32.const 0x1000)   ;; iovs ptr in linear memory
  (i32.const 1)        ;; iovs_len = 1 iovec
  (i32.const 0x2000)   ;; nwritten ptr for output
)
Linear Memory at this point:
  Addr 0x1000: [0x00, 0x30, 0x00, 0x00]   ;; buf = 0x3000
  Addr 0x1004: [0x0D, 0x00, 0x00, 0x00]   ;; len = 13
  Addr 0x3000: "Hello, World!"            ;; actual string data
Step 2: Interpreter Recognizes Import Call
──────────────────────────────────────────
- Instruction: call $fd_write
- Lookup: fd_write is import from wasi_snapshot_preview1
- Pop 4 values from stack: [1, 0x1000, 1, 0x2000]
- Call host function: wasi_fd_write(ctx, args)
Step 3: Your wasi_fd_write Implementation
─────────────────────────────────────────
uint32_t wasi_fd_write(WasiCtx* ctx, uint32_t* args) {
    uint32_t fd = args[0];            // 1
    uint32_t iovs_ptr = args[1];      // 0x1000
    uint32_t iovs_len = args[2];      // 1
    uint32_t nwritten_ptr = args[3];  // 0x2000
Step 4: Validate File Descriptor
────────────────────────────────
    // Reject fds past the end of the table before indexing
    // (fd_table.count is an assumed field holding the table size)
    if (fd >= ctx->fd_table.count) return WASI_ERRNO_BADF;
    // Look up fd 1 in table
    WasiFd* wasi_fd = &ctx->fd_table.fds[fd];
    // Check: is fd valid?
    if (!wasi_fd->active) return WASI_ERRNO_BADF;
    // Check: does fd have write rights?
    if (!(wasi_fd->rights_base & WASI_RIGHT_FD_WRITE))
        return WASI_ERRNO_NOTCAPABLE;
Step 5: Read iovec Array from WASM Memory
─────────────────────────────────────────
    // Bounds check in 64 bits so iovs_ptr + iovs_len * 8 cannot wrap around
    if ((uint64_t)iovs_ptr + (uint64_t)iovs_len * 8 > ctx->memory_size)
        return WASI_ERRNO_FAULT;
    // Read first iovec (a full implementation loops over all iovs_len entries)
    uint32_t buf_ptr = read_u32_le(ctx->memory, iovs_ptr);      // 0x3000
    uint32_t buf_len = read_u32_le(ctx->memory, iovs_ptr + 4);  // 13
    // Bounds check the buffer, again without 32-bit overflow
    if ((uint64_t)buf_ptr + buf_len > ctx->memory_size)
        return WASI_ERRNO_FAULT;
Step 6: Map to Host File Descriptor
───────────────────────────────────
    // wasi fd 1 (stdout) → host fd 1 (STDOUT_FILENO)
    int host_fd = wasi_fd->host_fd;  // STDOUT_FILENO
Step 7: Perform Actual System Call
──────────────────────────────────
    // Prepare iovec for host
    struct iovec host_iov = {
        .iov_base = ctx->memory + buf_ptr,  // &memory[0x3000]
        .iov_len = buf_len                  // 13
    };
    // Call writev (handles multiple iovecs efficiently)
    ssize_t written = writev(host_fd, &host_iov, 1);
    // Handle errors
    if (written < 0) {
        return errno_to_wasi(errno);
    }
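Step 8: Write Result Back to WASM Memory
────────────────────────────────────────
    // Bounds check the output pointer before storing through it
    if ((uint64_t)nwritten_ptr + 4 > ctx->memory_size)
        return WASI_ERRNO_FAULT;
    // Store bytes written at nwritten_ptr
    write_u32_le(ctx->memory, nwritten_ptr, (uint32_t)written);
    return WASI_ERRNO_SUCCESS;
}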
Step 9: Return to WASM Execution
────────────────────────────────
- Host function returns 0 (SUCCESS)
- Interpreter pushes result onto stack
- WASM code checks result, reads nwritten if needed
Step 10: Observable Effect
──────────────────────────
Terminal shows: Hello, World!
Draw this as a diagram in your notes. Understanding this flow is essential.
The Interview Questions They'll Ask
WASI Fundamentals
Q: What is WASI and why does it exist?
WASI (WebAssembly System Interface) is a standardized API that allows WebAssembly modules to interact with the operating system portably. It exists because WASM alone has no I/O capabilities; it's pure computation. WASI provides a POSIX-like API with capability-based security, enabling WASM modules to do file I/O, networking, etc., while maintaining strong sandboxing guarantees.
Q: How does WASI differ from browser WebAssembly?
In browsers, JavaScript provides host bindings. WASI standardizes non-browser host bindings for system interfaces. Key differences:
- Standard module namespace (wasi_snapshot_preview1)
- Capability-based security model
- POSIX-like API surface
- Designed for CLI/server, not browser DOM
Q: What's the difference between WASI Preview 1 and Preview 2?
Preview 1 uses fixed function signatures with raw i32/i64 types. Preview 2 (WASI 0.2) introduces the Component Model with Wit IDL, providing rich types, resources, and better language interop. Preview 2 is modular (wasi-filesystem, wasi-sockets, etc.) while Preview 1 is monolithic.
Sandboxing and Security
Q: How does WASI prevent a module from reading /etc/passwd?
WASI uses capability-based security with preopens. The module can only access directories explicitly granted at startup via preopens. If /etc wasn't preopened, there's no capability to access it. Path resolution prevents ".." from escaping the sandbox. The module literally has no way to reference /etc/passwd.
Q: What is a preopen and how does it work?
A preopen is a directory opened by the host and passed to the module at startup as a file descriptor (starting at fd 3). The module discovers preopens by iterating fd_prestat_get from fd 3 until EBADF. Each preopen maps a host directory to a virtual path the module sees. All path operations are relative to some preopen.
Q: Explain the difference between ambient authority and capability-based security.
Ambient authority: A process inherits all permissions of its user. Any code in the process can access any file the user can. Example: POSIX open("/etc/passwd"). Capability-based: A process starts with no permissions. Capabilities (unforgeable tokens) must be explicitly granted. Example: WASI preopen grants access to specific directories only.
Q: How would you prevent a path traversal attack in your WASI implementation?
- Reject absolute paths (start with /)
- Track depth during path resolution; ".." can't go below 0
- Use openat() with O_NOFOLLOW for each component
- Resolve symlinks within sandbox, fail if they escape
- On Linux, use openat2() with RESOLVE_BENEATH flag
Implementation Details
Q: How do you transfer data between host and WASM memory?
WASM memory is a contiguous byte array. Pointers in WASI calls are offsets into this array. To transfer data:
- Read: Copy from memory[ptr] to host buffer
- Write: Copy from host buffer to memory[ptr]
Always bounds-check first: ptr + len <= memory_size, computed in 64-bit arithmetic so the sum cannot wrap around.
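The bounds check described above can be wrapped into a pair of copy helpers so that no WASI function touches guest memory directly. A minimal sketch, with hypothetical names (`SkMemory`, `sk_copy_from_guest`, `sk_copy_to_guest`):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical view of the module's linear memory. */
typedef struct {
    uint8_t *data;
    uint64_t size;   /* current size in bytes */
} SkMemory;

/* Copy len bytes out of guest memory. Returns 0 on success, -1 if out of bounds. */
int sk_copy_from_guest(const SkMemory *m, void *dst, uint32_t ptr, uint32_t len) {
    if ((uint64_t)ptr + len > m->size) return -1;   /* 64-bit sum cannot wrap */
    memcpy(dst, m->data + ptr, len);
    return 0;
}

/* Copy len bytes into guest memory. Returns 0 on success, -1 if out of bounds. */
int sk_copy_to_guest(SkMemory *m, uint32_t ptr, const void *src, uint32_t len) {
    if ((uint64_t)ptr + len > m->size) return -1;
    memcpy(m->data + ptr, src, len);
    return 0;
}
```

A failed copy would typically be reported to the module as the WASI fault errno rather than aborting the runtime.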
Q: What happens when a WASM module calls proc_exit()?
You can't just call exit() because you'd kill the host process. Options:
- longjmp back to runtime entry point
- Throw an exception (C++) or propagate a trap value up the call chain (Rust)
- Set a flag and check it in the execution loop
The exit code should be captured and made available to the caller.
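The longjmp option can be sketched in a few lines of C. The names (`SkRuntime`, `sk_proc_exit`, `sk_run`) are hypothetical, and `sk_fake_guest_main` is only a stand-in for the interpreter loop, which in a real runtime would execute WASM until the guest calls proc_exit.

```c
#include <setjmp.h>
#include <stdint.h>

typedef struct {
    jmp_buf  exit_jmp;    /* where proc_exit unwinds to */
    uint32_t exit_code;   /* captured for the embedder */
} SkRuntime;

/* Host side of proc_exit: record the code and unwind. Never returns. */
void sk_proc_exit(SkRuntime *rt, uint32_t code) {
    rt->exit_code = code;
    longjmp(rt->exit_jmp, 1);
}

/* Stand-in for the interpreter loop: pretends the guest called proc_exit(42). */
void sk_fake_guest_main(SkRuntime *rt) {
    sk_proc_exit(rt, 42);
}

/* Run the guest and return its exit code without terminating the host. */
uint32_t sk_run(SkRuntime *rt) {
    if (setjmp(rt->exit_jmp) == 0) {
        sk_fake_guest_main(rt);
        rt->exit_code = 0;   /* guest returned normally without proc_exit */
    }
    return rt->exit_code;
}
```

Because longjmp skips every frame between proc_exit and sk_run, any host resources acquired in those frames have to be tracked in the runtime structure and released after sk_run returns.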
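Q: How do you handle 64-bit values in WASI?
"wasm32" describes 32-bit pointers, not the value types: i64 is a native WASM type, so Preview 1 passes 64-bit arguments (rights in path_open, the offset in fd_seek, timestamps) as single i64 parameters. 64-bit results come back through pointers into linear memory, stored as 8 little-endian bytes. If you read such a value as two u32 halves, the low half comes first:
u64 value = low | ((u64)high << 32).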
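When a 64-bit value lives in linear memory (a file size in a stat result, for instance), it is stored as eight little-endian bytes. The helpers below are a small sketch with hypothetical names (`sk_read_u64_le`, `sk_write_u64_le`); they assume the caller has already bounds-checked off..off+8.

```c
#include <stdint.h>

/* Read 8 little-endian bytes starting at off (caller has bounds-checked). */
uint64_t sk_read_u64_le(const uint8_t *mem, uint32_t off) {
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | mem[off + i];   /* highest byte is at the highest address */
    }
    return v;
}

/* Write v as 8 little-endian bytes starting at off (caller has bounds-checked). */
void sk_write_u64_le(uint8_t *mem, uint32_t off, uint64_t v) {
    for (int i = 0; i < 8; i++) {
        mem[off + i] = (uint8_t)(v >> (8 * i));
    }
}
```

Going byte by byte avoids both alignment problems and any dependence on the host's own endianness.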
Q: What rights does a file inherit when opened from a directory?
file_rights = requested_rights & directory.rights_inheriting
The file can never have more rights than the directory allows. This is capability attenuation: you can only reduce rights, never escalate.
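As a concrete illustration of that mask, the sketch below uses three Preview 1 rights bits and two hypothetical helpers (`sk_attenuate`, `sk_has_right`). It follows the masking behavior described here; a runtime could instead choose to fail the open when the request exceeds what the directory allows.

```c
#include <stdint.h>

/* Three of the Preview 1 rights bits. */
#define SK_RIGHT_FD_READ  (UINT64_C(1) << 1)
#define SK_RIGHT_FD_SEEK  (UINT64_C(1) << 2)
#define SK_RIGHT_FD_WRITE (UINT64_C(1) << 6)

/* Rights a newly opened file receives: never more than the directory passes on. */
uint64_t sk_attenuate(uint64_t requested, uint64_t dir_inheriting) {
    return requested & dir_inheriting;
}

/* True only if every bit in needed is present in rights_base. */
int sk_has_right(uint64_t rights_base, uint64_t needed) {
    return (rights_base & needed) == needed;
}
```

For example, requesting read and write from a directory that only passes on read and seek yields a file descriptor that can read but not write.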
Hints in Layers
Layer 0: Start Simple
- Implement only fd_write for fd 1 (stdout) and fd 2 (stderr)
- Hardcode the file descriptor mapping
- Don't worry about preopens yet
- Goal: Print "Hello, World!"
Layer 1: Add Basic Infrastructure
- Create an FdTable structure to map WASI fds to host fds
- Initialize stdin/stdout/stderr at startup
- Add bounds checking for memory access
- Implement proc_exit with longjmp
Layer 2: Command-Line Support
- Implement args_sizes_get and args_get
- Parse argc/argv from host and store in WasiCtx
- Implement environ_sizes_get and environ_get
- Test: Run program that echoes its arguments
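The args pair works in two phases: the module first asks for sizes, allocates, then asks the host to fill a pointer array and a string buffer. A sketch of the host side follows, with hypothetical names (`sk_args_sizes`, `sk_args_get`, `sk_write_u32_le`) and an assumed -1 return for out-of-bounds pointers that a real implementation would translate to a WASI errno.

```c
#include <stdint.h>
#include <string.h>

/* Store v as 4 little-endian bytes (caller has bounds-checked). */
void sk_write_u32_le(uint8_t *mem, uint32_t off, uint32_t v) {
    mem[off]     = (uint8_t)v;
    mem[off + 1] = (uint8_t)(v >> 8);
    mem[off + 2] = (uint8_t)(v >> 16);
    mem[off + 3] = (uint8_t)(v >> 24);
}

/* args_sizes_get: number of args and total bytes, counting each NUL terminator. */
void sk_args_sizes(int argc, const char *const *argv,
                   uint32_t *count, uint32_t *buf_size) {
    uint32_t total = 0;
    for (int i = 0; i < argc; i++) total += (uint32_t)strlen(argv[i]) + 1;
    *count = (uint32_t)argc;
    *buf_size = total;
}

/* args_get: write one guest pointer per arg at argv_ptr, strings packed at buf_ptr. */
int sk_args_get(uint8_t *mem, uint64_t mem_size,
                int argc, const char *const *argv,
                uint32_t argv_ptr, uint32_t buf_ptr) {
    uint32_t count, buf_size;
    sk_args_sizes(argc, argv, &count, &buf_size);
    if ((uint64_t)argv_ptr + (uint64_t)count * 4 > mem_size) return -1;
    if ((uint64_t)buf_ptr + buf_size > mem_size) return -1;

    uint32_t cursor = buf_ptr;
    for (int i = 0; i < argc; i++) {
        uint32_t len = (uint32_t)strlen(argv[i]) + 1;   /* include the NUL */
        sk_write_u32_le(mem, argv_ptr + (uint32_t)i * 4, cursor);
        memcpy(mem + cursor, argv[i], len);
        cursor += len;
    }
    return 0;
}
```

environ_sizes_get and environ_get can reuse the same two functions with "KEY=value" strings in place of the arguments.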
Layer 3: Preopens Foundation
- Add preopen support to FdTable
- Parse --dir=host:guest from CLI
- Implement fd_prestat_get and fd_prestat_dir_name
- Test: Module can discover preopened directories
Layer 4: File Operations
- Implement path_open with basic flags
- Use openat() relative to preopen's host_fd
- Implement fd_read similar to fd_write
- Implement fd_close, fd_seek, fd_tell
- Test: Read a file and print its contents
Layer 5: Security Hardening
- Add rights checking to all operations
- Implement path resolution with ".." protection
- Handle symlinks safely (O_NOFOLLOW + validate)
- Add comprehensive bounds checking
- Test: Verify module can't escape sandbox
Layer 6: Full Compatibility
- Implement remaining fd operations (stat, sync, etc.)
- Implement directory operations (readdir, mkdir, etc.)
- Add clock_time_get and random_get
- Test against wasi-libc test suite
Debugging Tips
- Print every WASI call with arguments for tracing
- Compare output with wasmtime for same program
- Use a simple test program, not complex wasi-libc programs
- Check errno mapping carefully
Books That Will Help
| Book | Author | Relevant Chapters | Why It Helps |
|---|---|---|---|
| The Linux Programming Interface | Michael Kerrisk | Ch. 4-5 (File I/O), Ch. 13-14 (File Systems), Ch. 56-57 (Sockets) | The definitive reference for POSIX system calls. You'll implement WASI by mapping to these. Essential for understanding fd semantics, file operations, and system call error handling. |
| Operating Systems: Three Easy Pieces | Remzi Arpaci-Dusseau | Virtualization section (Ch. 4-11), Persistence section (Ch. 36-42) | Explains process isolation, virtual memory, and file systems at a conceptual level. Helps you understand WHY sandboxing works the way it does. Free online: https://pages.cs.wisc.edu/~remzi/OSTEP/ |
| Computer Systems: A Programmer's Perspective | Bryant & O'Hallaron | Ch. 8 (Exceptional Control Flow), Ch. 10 (System-Level I/O) | Deep dive into how system calls work at the hardware level, including traps, context switches, and I/O. Essential for understanding the host side of WASI. |
| Secure Programming Cookbook | Viega & Messier | Ch. 1-2 (Access Control), Ch. 13 (Randomness) | Practical security patterns. Helps you implement random_get correctly and understand security implications of your choices. |
Supplementary Reading
| Resource | Type | Focus |
|---|---|---|
| WASI Specification | Spec | Authoritative reference for all WASI functions |
| CloudABI Capsicum Paper | Paper | The security model that inspired WASI |
| Wasmtime WASI Tutorial | Tutorial | Practical guide to running WASI programs |
Real-World Outcome
After completing this project, you'll be able to run WASI programs with sandboxed file I/O:
$ tree test_data/
test_data/
├── input.txt
└── output/
$ cat test_data/input.txt
Hello from the sandboxed world!
Line 2 of input
Line 3 of input
# Compile a file copy program with wasi-sdk
$ cat file_copy.c
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv) {
if (argc != 3) {
fprintf(stderr, "Usage: file_copy <src> <dst>\n");
return 1;
}
FILE* src = fopen(argv[1], "r");
if (!src) {
perror("fopen src");
return 1;
}
FILE* dst = fopen(argv[2], "w");
if (!dst) {
perror("fopen dst");
fclose(src);
return 1;
}
char buf[4096];
size_t n;
while ((n = fread(buf, 1, sizeof(buf), src)) > 0) {
fwrite(buf, 1, n, dst);
}
printf("Copied %s to %s\n", argv[1], argv[2]);
fclose(src);
fclose(dst);
return 0;
}
$ /opt/wasi-sdk/bin/clang --target=wasm32-wasi -o file_copy.wasm file_copy.c
# Run with your WASI runtime - note the sandboxed directory mapping
$ ./wasi-runtime \
--dir=./test_data:/data \
file_copy.wasm \
/data/input.txt \
/data/output/copy.txt
Copied /data/input.txt to /data/output/copy.txt
$ cat test_data/output/copy.txt
Hello from the sandboxed world!
Line 2 of input
Line 3 of input
# Demonstrate sandbox security - attempt to escape
$ ./wasi-runtime \
--dir=./test_data:/data \
escape_attempt.wasm
# escape_attempt.wasm tries: fopen("/etc/passwd", "r")
WASI Error: path_open failed with ENOTCAPABLE (76)
The program cannot access /etc/passwd - not in sandbox
# escape_attempt.wasm tries: fopen("../../../etc/passwd", "r")
WASI Error: path_open failed with EACCES (2)
Path traversal blocked - ".." cannot escape preopen
# Demonstrate preopens discovery
$ cat list_preopens.c
#include <stdio.h>
#include <wasi/api.h>
int main() {
for (__wasi_fd_t fd = 3; ; fd++) {
__wasi_prestat_t prestat;
__wasi_errno_t err = __wasi_fd_prestat_get(fd, &prestat);
if (err == __WASI_ERRNO_BADF) break;
if (err != __WASI_ERRNO_SUCCESS) continue;
char path[256];
if (prestat.u.dir.pr_name_len >= sizeof(path)) continue; // name would overflow path
__wasi_fd_prestat_dir_name(fd, (uint8_t*)path, prestat.u.dir.pr_name_len);
path[prestat.u.dir.pr_name_len] = '\0';
printf("fd %d: preopen '%s'\n", fd, path);
}
return 0;
}
$ /opt/wasi-sdk/bin/clang --target=wasm32-wasi -o list_preopens.wasm list_preopens.c
$ ./wasi-runtime \
--dir=./test_data:/data \
--dir=/tmp:/tmp \
list_preopens.wasm
fd 3: preopen '/data'
fd 4: preopen '/tmp'
# Verbose output showing WASI calls (with --trace flag)
$ ./wasi-runtime --trace \
--dir=./test_data:/data \
file_copy.wasm /data/input.txt /data/output/copy.txt
[WASI] args_sizes_get() -> argc=3, buf_size=53
[WASI] args_get() -> ["file_copy.wasm", "/data/input.txt", "/data/output/copy.txt"]
[WASI] fd_prestat_get(3) -> type=DIR, name_len=5
[WASI] fd_prestat_dir_name(3) -> "/data"
[WASI] fd_prestat_get(4) -> EBADF
[WASI] path_open(3, "input.txt", O_RDONLY) -> fd=5
[WASI] path_open(3, "output/copy.txt", O_CREAT|O_WRONLY) -> fd=6
[WASI] fd_read(5, iovs=1) -> 64 bytes
[WASI] fd_write(6, iovs=1) -> 64 bytes
[WASI] fd_read(5, iovs=1) -> 0 bytes (EOF)
[WASI] fd_write(1, "Copied /data/input.txt...") -> 48 bytes
[WASI] fd_close(5) -> SUCCESS
[WASI] fd_close(6) -> SUCCESS
[WASI] proc_exit(0)
Exit code: 0
This demonstrates a complete WASI runtime that:
- Maps host directories to virtual paths (sandboxing)
- Implements file read/write operations through fd_read/fd_write
- Handles command-line arguments and environment
- Prevents sandbox escape via path traversal
- Supports preopen discovery for guest programs
- Provides tracing for debugging WASI calls