Project 3: Custom Wayland Protocol Extension

Design a custom Wayland protocol in XML and implement both server and client support using wayland-scanner.

Quick Reference

Attribute	Value
Difficulty	Advanced (Level 3)
Time Estimate	1-2 weeks
Main Programming Language	C
Alternative Programming Languages	Rust, C++, Zig
Coolness Level	Level 4: Hardcore Tech Flex
Business Potential	Level 1: Resume Gold
Prerequisites	Project 1 completed, basic protocol literacy
Key Topics	Wayland XML, code generation, versioning, server resources

1. Learning Objectives

By completing this project, you will:

Design a new Wayland protocol extension in XML.
Generate client and server bindings with wayland-scanner.
Implement server-side behavior in your compositor or a test server.
Implement a client that uses your new protocol safely.
Handle versioning and protocol errors gracefully.

2. All Theory Needed (Per-Concept Breakdown)

2.1 Wayland Protocol XML Structure

Description / Expanded Explanation

Wayland protocols are defined in XML. Each interface declares requests, events, enums, and descriptions. The XML is the single source of truth that drives code generation and wire compatibility.

Definitions & Key Terms

protocol -> XML file containing interfaces
interface -> collection of requests/events
request -> client-to-server message
event -> server-to-client message
enum -> named constants

Mental Model Diagram (ASCII)

protocol.xml
  <interface name="example">
    <request name="do_thing"/>
    <event name="done"/>
  </interface>

How It Works (Step-by-Step)

Define a protocol XML file.
Add interface, requests, events, and enums.
Run wayland-scanner to generate header and source.
Implement server side callbacks.
Use client side API to call requests.

Minimal Concrete Example

<protocol name="example">
  <interface name="zwlr_example_v1" version="1">
    <request name="ping"/>
    <event name="pong"/>
  </interface>
</protocol>

Common Misconceptions

Misconception -> XML is just documentation. Correction -> It drives code generation and ABI.
Misconception -> You can change XML freely after clients exist. Correction -> Changes must be versioned carefully.
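One way to see why edits must be versioned: on the wire, a request is identified by its opcode, which is its position within the interface in the XML. The sketch below models that with two plain string tables. The `destroy` and `set_mode` request names are hypothetical and not part of the example above; the point is only that inserting a request in the middle changes what an old client's opcode means.

```c
#include <stddef.h>
#include <string.h>

/* Request names in declaration order; the array index plays the role of the
 * wire opcode. Both tables are illustrative, not generated code. */
static const char *const requests_v1[] = { "ping", "destroy" };

/* A careless edit: a new request inserted before "destroy". */
static const char *const requests_bad_edit[] = { "ping", "set_mode", "destroy" };

/* Return the opcode (index) of `name` in `table`, or -1 if it is absent. */
static int opcode_of(const char *const *table, size_t count, const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i], name) == 0)
            return (int)i;
    }
    return -1;
}
```

An old client still sends opcode 1 for `destroy`, but a server built from the edited table would dispatch opcode 1 as `set_mode`. Appending new requests at the end, with a version bump, avoids the shift.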

Check-Your-Understanding Questions

Why is XML the source of truth for the protocol?
What happens if you remove a request from the protocol?
Predict how clients behave if the server does not implement an advertised request.

Where You’ll Apply It

In this project: see §5.10 Phase 1 and §3.5 Data Formats.
Also used in: P01-bare-metal-wayland-client.

2.2 Code Generation with wayland-scanner

Description / Expanded Explanation

wayland-scanner reads XML and generates C headers and stubs. There are separate client and server outputs. Understanding how these files map to XML helps you read and debug the generated code.

Definitions & Key Terms

wayland-scanner -> code generation tool
client-header -> client API declarations
server-header -> server-side interface declarations
private-code -> internal protocol tables

Mental Model Diagram (ASCII)

protocol.xml -> wayland-scanner -> protocol-client.h + protocol-server.h + protocol-protocol.c

How It Works (Step-by-Step)

Run wayland-scanner client-header -> client header.
Run wayland-scanner server-header -> server header.
Run wayland-scanner private-code -> protocol c file.
Compile generated files into your project.

Minimal Concrete Example

wayland-scanner client-header protocol.xml protocol-client.h
wayland-scanner server-header protocol.xml protocol-server.h
wayland-scanner private-code protocol.xml protocol-protocol.c
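To make the private-code output less mysterious, here is a simplified model of the tables it contains. The `sketch_message` and `sketch_interface` structs are stand-ins written for this guide; the real types are `struct wl_message` and `struct wl_interface` from wayland-util.h, which also carry a types array for object arguments that is omitted here.

```c
#include <stddef.h>

/* Stand-in for one request or event entry. */
struct sketch_message {
    const char *name;       /* request or event name from the XML */
    const char *signature;  /* one character per argument; "" means none */
};

/* Stand-in for the per-interface table that both sides link against. */
struct sketch_interface {
    const char *name;
    int version;
    int request_count;
    const struct sketch_message *requests;
    int event_count;
    const struct sketch_message *events;
};

static const struct sketch_message example_requests[] = { { "ping", "" } };
static const struct sketch_message example_events[]   = { { "pong", "" } };

/* Roughly the shape of what private-code would emit for zwlr_example_v1. */
static const struct sketch_interface example_interface = {
    "zwlr_example_v1", 1,
    1, example_requests,
    1, example_events,
};
```

The generated headers only declare the interface symbol; the definition lives in the private-code file, which is why forgetting to compile it shows up as a link error rather than a compile error.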

Common Misconceptions

Misconception -> Only client headers are needed. Correction -> The server needs its own generated header, and both client and server must compile the private-code output.
Misconception -> Generated code is optional. Correction -> Hand-written marshalling easily drifts from the XML; use the generated code so both sides stay wire compatible.

Check-Your-Understanding Questions

What is the purpose of the private-code output?
Why does the server need its own generated header?
Predict what happens if you forget to compile the protocol c file.

Where You’ll Apply It

In this project: see §5.2 Project Structure and §5.10 Phase 1.
Also used in: P02-simple-wayland-compositor-wlroots.

2.3 Versioning and Compatibility

Description / Expanded Explanation

Wayland protocols evolve by incrementing version numbers while maintaining backwards compatibility. Clients bind to a version and should only use requests up to that version. Your protocol must define a stable surface for clients.

Definitions & Key Terms

version -> interface version number
bind -> client request specifying desired version
compatibility -> ability for old clients to work with new servers

Mental Model Diagram (ASCII)

server advertises version 2
client binds version 1 -> only uses v1 requests

How It Works (Step-by-Step)

Server advertises a version.
Client binds at min(server-advertised, client-supported) version.
Server must accept requests up to that version.
New requests are added only in higher versions.
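The steps above reduce to two small checks on the client side. This is a minimal sketch with hypothetical helper names (`negotiate_version`, `request_available`); in a real client the negotiated value is what you would pass as the last argument of `wl_registry_bind`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pick the version to bind: never more than the server advertises,
 * never more than this client was written against. */
static uint32_t negotiate_version(uint32_t advertised, uint32_t client_max)
{
    return advertised < client_max ? advertised : client_max;
}

/* A request introduced in version `since` may only be sent on an object
 * that was bound at that version or later. */
static bool request_available(uint32_t bound_version, uint32_t since)
{
    return bound_version >= since;
}
```

A client built for v2 talking to a v1 server binds at 1 and must skip any request introduced in v2.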

Minimal Concrete Example

protocol = wl_registry_bind(registry, name, &zwlr_example_v1_interface, 1);

Common Misconceptions

Misconception -> Versioning is optional. Correction -> It is required for safe extension evolution.
Misconception -> You can change request signatures in-place. Correction -> That breaks ABI; you must add new requests instead.

Check-Your-Understanding Questions

Why does the client choose a version at bind time?
What happens if a client sends a v2 request while bound to v1?
Predict how you would deprecate a request safely.

Where You’ll Apply It

In this project: see §3.2 Functional Requirements and §6.2 Critical Test Cases.
Also used in: P04-wayland-panel-bar-layer-shell.

2.4 Server-Side Resource Management

Description / Expanded Explanation

When a client binds your protocol, the server creates a resource object. You must track its state and clean it up when the client disconnects. This prevents crashes and leaks.

Definitions & Key Terms

resource -> server-side object bound to a client
destroy request -> client-initiated cleanup
client disconnect -> server cleanup path

Mental Model Diagram (ASCII)

client bind -> resource create -> callbacks -> destroy -> cleanup

How It Works (Step-by-Step)

Client binds your protocol global.
Server creates a resource and sets implementation callbacks.
Client sends requests; server handles them.
Client destroys resource or disconnects.
Server cleans up any associated state.
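The lifecycle above works best when there is exactly one cleanup path. The sketch below models that in plain C without libwayland: `binding_state` stands in for whatever your handlers keep next to a `wl_resource`, and all names are hypothetical. In a real server, the destroy callback passed to `wl_resource_set_implementation` plays the role of `binding_destroy`, since it runs both on explicit destroy and on client disconnect.

```c
#include <stdlib.h>

/* Per-binding state owned by the server. */
struct binding_state {
    void *pixels;                  /* example of an owned allocation */
    struct binding_state *next;
};

static struct binding_state *bindings;  /* all live bindings */
static int live_bindings;

/* Called when a client binds: allocate and track state. */
static struct binding_state *binding_create(size_t pixel_bytes)
{
    struct binding_state *b = calloc(1, sizeof *b);
    if (!b)
        return NULL;
    b->pixels = malloc(pixel_bytes);
    if (!b->pixels) {
        free(b);
        return NULL;
    }
    b->next = bindings;
    bindings = b;
    live_bindings++;
    return b;
}

/* The single cleanup path: unlink from the list and free everything owned. */
static void binding_destroy(struct binding_state *b)
{
    struct binding_state **link = &bindings;
    while (*link && *link != b)
        link = &(*link)->next;
    if (!*link)
        return;                    /* not tracked; nothing to do */
    *link = b->next;
    free(b->pixels);
    free(b);
    live_bindings--;
}

/* Models a client disconnect: every remaining binding takes the same path. */
static void client_disconnect(void)
{
    while (bindings)
        binding_destroy(bindings);
}
```

Because disconnect reuses `binding_destroy`, there is no second code path that can forget to free `pixels`.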

Minimal Concrete Example

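/* In the global's bind handler: create the per-client resource. */
struct wl_resource *res = wl_resource_create(client, &zwlr_example_v1_interface,
                                             version, id);
if (!res) { wl_client_post_no_memory(client); return; }
/* destroy_cb runs on explicit destroy and on client disconnect. */
wl_resource_set_implementation(res, &impl, data, destroy_cb);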

Common Misconceptions

Misconception -> Client disconnect automatically frees all memory. Correction -> libwayland destroys the client's resources, but you must free your own associated state in the destroy callback.
Misconception -> You can ignore destroy callbacks. Correction -> That leaks resources and may crash later.

Check-Your-Understanding Questions

What is the difference between destroy request and client disconnect?
Why must the server own resource lifetimes?
Predict the result of accessing a resource after destroy.

Where You’ll Apply It

In this project: see §5.10 Phase 2 and §7.1 Frequent Mistakes.
Also used in: P02-simple-wayland-compositor-wlroots.

2.5 Client Binding and Dispatch

Description / Expanded Explanation

The client binds to your protocol and installs listeners for events. Proper dispatch ensures requests and events are processed in order. This is similar to core Wayland but you control the API design.

Definitions & Key Terms

listener -> set of callbacks for events
dispatch -> process incoming events
roundtrip -> synchronize client and server

Mental Model Diagram (ASCII)

bind -> add_listener -> send request -> receive event

How It Works (Step-by-Step)

Client binds protocol from registry.
Client adds event listener.
Client sends a request.
Server emits event.
Client dispatches and handles event.
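The key point in these steps is that a listener callback never runs on its own; it runs only when the client dispatches. The sketch below is a toy model of that behavior in plain C, with hypothetical names throughout (`sketch_connection`, `queue_pong`, `dispatch_pending`). In a real client the dispatching is done by `wl_display_dispatch` or `wl_display_roundtrip`.

```c
#include <stddef.h>

#define MAX_PENDING 8

typedef void (*pong_handler)(void *data);

struct sketch_connection {
    int pending_pongs;       /* events received but not yet dispatched */
    pong_handler on_pong;    /* listener callback */
    void *listener_data;     /* user data handed back to the callback */
};

/* Models the server's reply arriving: the event is only queued. */
static void queue_pong(struct sketch_connection *c)
{
    if (c->pending_pongs < MAX_PENDING)
        c->pending_pongs++;
}

/* Models dispatch: run the listener once per queued event.
 * Returns the number of events handled. */
static int dispatch_pending(struct sketch_connection *c)
{
    int handled = 0;
    while (c->pending_pongs > 0) {
        c->pending_pongs--;
        if (c->on_pong)
            c->on_pong(c->listener_data);
        handled++;
    }
    return handled;
}

/* Example listener: counts how many pongs were delivered. */
static void count_pong(void *data)
{
    (*(int *)data)++;
}
```

Until `dispatch_pending` is called, the counter stays at zero even though the event has arrived, which is the same reason a client that never dispatches never sees results or errors.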

Minimal Concrete Example

zwlr_example_v1_add_listener(example, &example_listener, NULL);
zwlr_example_v1_ping(example);

Common Misconceptions

Misconception -> Requests are synchronous. Correction -> Requests are asynchronous; use events to receive results.
Misconception -> Listener can be omitted if you only send requests. Correction -> You still need to dispatch to process errors and disconnects.

Check-Your-Understanding Questions

Why are Wayland requests asynchronous?
What happens if you never dispatch events?
Predict how a protocol error is delivered to the client.

Where You’ll Apply It

In this project: see §5.10 Phase 3 and §3.4 Example Usage.
Also used in: P01-bare-metal-wayland-client.

2.6 File Descriptor Passing and Unix Socket Semantics

Description / Expanded Explanation

Wayland can pass file descriptors over the socket using SCM_RIGHTS. This is how dmabuf, shared memory, and many custom protocols transfer resources. Understanding the lifetime rules and duplication semantics is essential when your custom protocol includes fd arguments.

Definitions & Key Terms

SCM_RIGHTS -> Unix mechanism for passing fds between processes
fd argument -> protocol argument of type “fd” in XML
duplication -> the receiver gets its own fd referring to the same file description
close-after-send -> safe because the fd is duplicated in transit

Mental Model Diagram (ASCII)

server fd -> sendmsg(SCM_RIGHTS) -> client receives new fd
  |                                      |
  +-- server can close                   +-- client owns its copy

How It Works (Step-by-Step)

Define an argument with type="fd" in the protocol XML.
The sender passes the fd: the server in an event, or the client in a request.
The kernel duplicates the fd into the receiving process.
The receiver handles and closes its copy when done.
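libwayland hides the socket work, but it helps to see the underlying mechanism once. The sketch below passes a pipe's read end over a `socketpair` with `SCM_RIGHTS`, closes the sender's copy, and reads through the received fd. The function names (`send_fd`, `recv_fd`, `fd_roundtrip_demo`) are illustrative only, and both ends live in one process to keep the example short.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one fd as ancillary data; returns 0 on success, -1 on failure. */
static int send_fd(int sock, int fd)
{
    char byte = 'F';  /* at least one data byte must accompany the fd */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { struct cmsghdr align; char buf[CMSG_SPACE(sizeof(int))]; } ctrl;
    struct msghdr msg = {0};

    memset(&ctrl, 0, sizeof ctrl);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd; returns the new descriptor or -1 on failure. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union { struct cmsghdr align; char buf[CMSG_SPACE(sizeof(int))]; } ctrl;
    struct msghdr msg = {0};
    int fd = -1;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Returns 0 if data written before the transfer is readable via the
 * received fd even though the sender closed its own copy. */
static int fd_roundtrip_demo(void)
{
    int sv[2], pipefd[2];
    char out[5];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (pipe(pipefd) < 0)
        return -1;
    if (write(pipefd[1], "hello", 5) != 5)
        return -1;
    close(pipefd[1]);

    if (send_fd(sv[0], pipefd[0]) < 0)
        return -1;
    close(pipefd[0]);              /* sender drops its copy after sending */

    int received = recv_fd(sv[1]);
    if (received < 0)
        return -1;
    ssize_t n = read(received, out, sizeof out);
    close(received);               /* receiver owns and closes its copy */
    close(sv[0]);
    close(sv[1]);
    return (n == 5 && memcmp(out, "hello", 5) == 0) ? 0 : -1;
}
```

The received descriptor usually has a different number than the one sent, yet it refers to the same open file description, which is why the read still succeeds after the sender's `close`.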

Minimal Concrete Example

// server side
int fd = open("/tmp/data.bin", O_RDONLY);
if (fd >= 0) { // only send a valid descriptor
  my_proto_send_fd(resource, fd);
  close(fd); // safe after send
}

Common Misconceptions

Misconception -> The receiver shares the same fd number. Correction -> The receiver gets a new fd referring to the same underlying file.
Misconception -> You must keep the fd open forever after sending. Correction -> You can close after send; the receiver has its own copy.

Check-Your-Understanding Questions

Why does fd passing require a Unix domain socket?
What happens if the receiver never closes its fd?
How do you express an fd in a Wayland protocol XML file?

Where You’ll Apply It

In this project: see §3.5 Data Formats and §5.10 Phase 2.
Also used in: P02-simple-wayland-compositor-wlroots for dmabuf support.

2.7 Protocol Error Handling and Contract Enforcement

Description / Expanded Explanation

Wayland is strict: protocol violations are fatal. Your custom protocol must clearly define invariants and enforce them with wl_resource_post_error or wl_client_post_no_memory. This prevents undefined behavior and keeps the compositor and client in sync.

Definitions & Key Terms

wl_resource_post_error -> report a protocol error for a specific resource
wl_client_post_no_memory -> report OOM to a client
wl_display_terminate -> stop the compositor's event loop so wl_display_run returns (for fatal server errors)
invariant -> a rule that must always hold (e.g., size > 0)

Mental Model Diagram (ASCII)

client sends request
        |
        v
validate arguments -> ok -> execute
                  -> invalid -> post_error + disconnect

How It Works (Step-by-Step)

Define invariants in your protocol documentation.
Validate every request argument on the server side.
If invalid, post a protocol error and let Wayland disconnect the client.
Use no-memory errors when allocation fails.
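Validation is easier to test when it is a pure function that the request handler calls before doing any work. Here is a minimal sketch for a capture size check, assuming 4 bytes per pixel and a buffer size that must fit in a signed 32-bit integer. The enum values and the function name are hypothetical, not part of any generated header.

```c
#include <stdint.h>

/* Hypothetical result codes; a real handler would map the failures to
 * error values declared in the protocol XML. */
enum capture_check {
    CAPTURE_OK = 0,
    CAPTURE_BAD_SIZE = 1,
    CAPTURE_TOO_LARGE = 2,
};

/* Check the invariants (width > 0, height > 0, size fits in int32) and
 * compute the buffer size in bytes without overflowing. */
static enum capture_check check_capture_size(int32_t width, int32_t height,
                                             uint32_t *out_bytes)
{
    if (width <= 0 || height <= 0)
        return CAPTURE_BAD_SIZE;

    /* Widen before multiplying so the product cannot wrap. */
    uint64_t bytes = (uint64_t)width * (uint64_t)height * 4u;
    if (bytes > INT32_MAX)
        return CAPTURE_TOO_LARGE;

    *out_bytes = (uint32_t)bytes;
    return CAPTURE_OK;
}
```

In the handler, any result other than `CAPTURE_OK` would lead to `wl_resource_post_error` followed by an early return.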

Minimal Concrete Example

if (width <= 0 || height <= 0) {
  wl_resource_post_error(resource, MY_PROTO_ERROR_BAD_SIZE,
                         "width/height must be > 0");
  return;
}

Common Misconceptions

Misconception -> You should ignore invalid requests to be forgiving. Correction -> Wayland relies on strict errors to keep both sides consistent.
Misconception -> Errors are only for server bugs. Correction -> Errors are for client misuse; server must enforce them.

Check-Your-Understanding Questions

Why does Wayland terminate the client on protocol errors?
When should you prefer wl_client_post_no_memory over a custom error?
What happens if you continue after detecting invalid arguments?

Where You’ll Apply It

In this project: see §3.2 Functional Requirements and §6.2 Critical Test Cases.
Also used in: P01-bare-metal-wayland-client when handling configure rules.

2.8 Protocol Design Patterns and Object Lifecycles

Description / Expanded Explanation

Wayland protocols are object trees. Managers create objects, objects emit events, and destroy requests close lifecycles. Designing your protocol with clear factories and lifetimes makes it easier to implement and test.

Definitions & Key Terms

factory object -> global that creates new protocol objects
destroy request -> explicit lifecycle end for a resource
event-driven state -> state that changes only in response to events
id allocation -> client chooses IDs for new objects

Mental Model Diagram (ASCII)

global manager
   |
   +-- create_widget() -> widget object
   |        |
   |        +-- set_state(), destroy()
   |
   +-- create_controller() -> controller object

How It Works (Step-by-Step)

Define a global manager interface.
Provide requests that create new objects (factory pattern).
Specify explicit destroy requests to release resources.
Use events to notify clients of state changes or results.
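Factory requests rely on the client choosing the ID of the object it creates, which lets it send follow-up requests on the new object without waiting for a reply. The sketch below models the ID ranges: client-allocated IDs sit below 0xff000000, server-allocated IDs start there, and 0 means null. The allocator is deliberately simplified (libwayland also reuses freed IDs) and its names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SERVER_ID_START 0xff000000u  /* first ID in the server's range */

/* True for IDs a client may pick for new objects. */
static bool is_client_allocated(uint32_t id)
{
    return id != 0 && id < SERVER_ID_START;
}

/* True for IDs reserved for server-created objects. */
static bool is_server_allocated(uint32_t id)
{
    return id >= SERVER_ID_START;
}

/* Simplified client-side allocator: hands out increasing IDs.
 * ID 1 is the display object, so `next` should start at 2. */
struct id_allocator {
    uint32_t next;
};

/* Returns a fresh client ID, or 0 when the client range is exhausted. */
static uint32_t allocate_id(struct id_allocator *a)
{
    if (!is_client_allocated(a->next))
        return 0;
    return a->next++;
}
```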

Minimal Concrete Example

<interface name="my_manager" version="1">
  <request name="create_widget">
    <arg name="id" type="new_id" interface="my_widget"/>
  </request>
</interface>

Common Misconceptions

Misconception -> You can omit destroy requests and rely on disconnect. Correction -> Explicit destroy lets clients release resources early.
Misconception -> A single interface should do everything. Correction -> Smaller interfaces with clear lifetimes are easier to evolve.

Check-Your-Understanding Questions

Why does the client allocate object IDs instead of the server?
What is the benefit of a factory object in Wayland protocols?
How would you version a protocol that adds a new request?

Where You’ll Apply It

In this project: see §4.1 High-Level Design and §5.10 Phase 1.
Also used in: P02-simple-wayland-compositor-wlroots for globals and resources.

3. Project Specification

3.1 What You Will Build

A custom protocol extension called zwlr_screenshot_v1 (or similar) that allows a client to request a screenshot from the compositor. The server side will return a shm buffer containing the captured pixels. The client will save the image to a PNG file.

3.2 Functional Requirements

Protocol XML: define requests and events for screenshot capture.
Code generation: use wayland-scanner to generate bindings.
Server implementation: respond to capture requests and send a buffer.
Client implementation: request a screenshot and save to disk.
Versioning: support version 1 and reject higher versions.

3.3 Non-Functional Requirements

Performance: capture within 500ms for a 1080p output.
Reliability: handle client disconnects gracefully.
Usability: clear CLI output and errors.

3.4 Example Usage / Output

$ ./screenshot-client --output HDMI-A-1 --file shot.png
[info] requesting screenshot
[info] received buffer 1920x1080
[info] wrote shot.png

3.5 Data Formats / Schemas / Protocols

Protocol XML with interface zwlr_screenshot_v1
Buffer format: ARGB8888 shm buffer
File output: PNG
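The two formats order their channels differently, so the client needs a conversion step between mapping the buffer and encoding the PNG. A minimal sketch, assuming each ARGB8888 pixel is read as a 32-bit value with alpha in the top byte and that the PNG encoder wants R, G, B, A bytes in that order. The function names are hypothetical, and no particular PNG library is assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Split one ARGB8888 pixel into R, G, B, A bytes. */
static void argb8888_to_rgba(uint32_t pixel, uint8_t out[4])
{
    out[0] = (uint8_t)(pixel >> 16);  /* R */
    out[1] = (uint8_t)(pixel >> 8);   /* G */
    out[2] = (uint8_t)pixel;          /* B */
    out[3] = (uint8_t)(pixel >> 24);  /* A */
}

/* Convert one row of `width` pixels into a tightly packed RGBA row. */
static void convert_row(const uint32_t *src, uint8_t *dst, size_t width)
{
    for (size_t x = 0; x < width; x++)
        argb8888_to_rgba(src[x], dst + x * 4);
}
```

If the capture can contain translucent pixels, check whether the alpha is premultiplied before writing the PNG; this sketch copies the channels unchanged.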

3.6 Edge Cases

Output name not found -> error.
Client requests while no buffer available -> error event.
Client disconnect mid-transfer -> server cleans resources.

3.7 Real World Outcome

3.7.1 How to Run (Copy/Paste)

meson setup build
meson compile -C build
./build/screenshot-server
./build/screenshot-client --output HDMI-A-1 --file shot.png

3.7.2 Golden Path Demo (Deterministic)

Capture a screenshot on a fixed output with a known background color and verify the file hash is stable.

3.7.3 Exact Terminal Transcript (CLI)

$ ./screenshot-client --output HDMI-A-1 --file shot.png
[info] run_id=0003
[info] request sent
[info] event: screenshot_ready size=1920x1080
[info] wrote shot.png
$ echo $?
0

Failure Demo (Deterministic)

$ ./screenshot-client --output DOES_NOT_EXIST --file shot.png
[error] output not found
$ echo $?
4

4. Solution Architecture

4.1 High-Level Design

client -> registry bind -> zwlr_screenshot_v1 -> request
server -> capture -> shm buffer -> event -> client saves file

Screenshot Protocol Flow

4.2 Key Components

4.3 Data Structures (No Full Code)

struct screenshot_request {
  struct wl_resource *resource;
  struct wl_buffer *buffer;
  int width;
  int height;
};

4.4 Algorithm Overview

Key Algorithm: Screenshot Capture

Client sends capture request.
Server allocates shm buffer and renders the output into it.
Server emits event with buffer and metadata.
Client maps buffer and saves PNG.

Complexity Analysis:

Time: O(width * height) for copying pixels.
Space: O(width * height) for buffer.
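The O(width * height) cost comes from a row-by-row copy. A sketch of that copy, assuming 4 bytes per pixel and that source and destination may have different strides (bytes per row, including padding). The function name and parameters are illustrative.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `height` rows of `width` 4-byte pixels from src to dst.
 * Strides are in bytes and may exceed width * 4 when rows are padded. */
static void copy_rows(const uint8_t *src, size_t src_stride,
                      uint8_t *dst, size_t dst_stride,
                      int32_t width, int32_t height)
{
    size_t row_bytes = (size_t)width * 4;

    for (int32_t y = 0; y < height; y++) {
        memcpy(dst + (size_t)y * dst_stride,
               src + (size_t)y * src_stride,
               row_bytes);
    }
}
```

Copying per row rather than in a single `memcpy` is what keeps padding bytes in the source from leaking into a tightly packed destination.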

5. Implementation Guide

5.1 Development Environment Setup

sudo apt install libwayland-dev wayland-protocols meson ninja-build pkg-config

5.2 Project Structure

protocol/
├── protocol.xml
├── generated/
│   ├── protocol-client.h
│   ├── protocol-server.h
│   └── protocol-protocol.c
├── server/
│   └── screenshot-server.c
└── client/
    └── screenshot-client.c

Protocol Project Tree

5.3 The Core Question You’re Answering

“How do Wayland extensions work end-to-end, from XML definition to runtime behavior?”

5.4 Concepts You Must Understand First

XML protocol structure (see §2.1).
wayland-scanner outputs (see §2.2).
Versioning rules (see §2.3).
Resource lifecycle (see §2.4).

5.5 Questions to Guide Your Design

What requests and events does v1 need?
How will you encode errors (event vs protocol error)?
How will clients specify which output to capture?
What is the expected buffer format?

5.6 Thinking Exercise

Sketch the exact sequence of messages for a screenshot request: request, buffer creation, event, client save.

5.7 The Interview Questions They’ll Ask

How does Wayland protocol extension differ from X11 extensions?
Why is code generation necessary?
How do you version a Wayland protocol safely?
What happens when a client uses an unsupported request?

5.8 Hints in Layers

Hint 1: Keep v1 simple. Only support capturing the primary output.

Hint 2: Reuse existing buffers. If you already have a renderer, reuse its output buffer.

Hint 3: Add a protocol error for misuse. If the client sends a second request before the first completes, send an error.
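Hint 3 amounts to a one-flag state machine per resource. A minimal sketch with hypothetical names; in the real handler the `CAPTURE_ALREADY_PENDING` result would turn into a `wl_resource_post_error` call.

```c
#include <stdbool.h>

/* Hypothetical outcomes of a capture request. */
enum capture_result {
    CAPTURE_STARTED,
    CAPTURE_ALREADY_PENDING,
};

/* Per-resource state: is a capture currently in flight? */
struct capture_state {
    bool pending;
};

/* Called from the capture request handler. */
static enum capture_result capture_begin(struct capture_state *s)
{
    if (s->pending)
        return CAPTURE_ALREADY_PENDING;  /* misuse: second request too early */
    s->pending = true;
    return CAPTURE_STARTED;
}

/* Called once the result event has been sent. */
static void capture_finish(struct capture_state *s)
{
    s->pending = false;
}
```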

5.9 Books That Will Help

Topic	Book	Chapter
Protocol design	The Wayland Book	Ch. 5
API design	The Practice of Programming	Ch. 8

5.10 Implementation Phases

Phase 1: XML + Codegen (2-3 days)

Goals:

Write protocol XML
Generate bindings

Tasks:

Define interface and requests.
Add build rules for wayland-scanner.

Checkpoint: build produces protocol headers and c file.

Phase 2: Server Implementation (4-5 days)

Goals:

Implement request handlers
Return screenshot buffers

Tasks:

Create resources on bind.
Implement capture request.

Checkpoint: server logs show request handled.

Phase 3: Client Tool (3-4 days)

Goals:

Bind protocol and request screenshot
Save PNG to disk

Tasks:

Bind and send request.
Map buffer and encode PNG.

Checkpoint: valid PNG is saved.

5.11 Key Implementation Decisions

6. Testing Strategy

6.1 Test Categories

6.2 Critical Test Cases

Version mismatch: client binds a version higher than the server advertises -> bind is rejected with a protocol error.
Concurrent requests: two capture requests -> error.
Client disconnect: ensure resources freed.

6.3 Test Data

output=HDMI-A-1 file=shot.png

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

7.2 Debugging Strategies

Use WAYLAND_DEBUG=1 to see request/event flow.
Add server logs in request handlers.

7.3 Performance Traps

Copying large buffers unnecessarily; reuse where possible.

8. Extensions & Challenges

8.1 Beginner Extensions

Add a request to list available outputs.
Support capturing only a region.

8.2 Intermediate Extensions

Add PNG compression options.
Add async progress events.

8.3 Advanced Extensions

Add DMA-BUF support for zero-copy capture.
Integrate with portal-like permissions.

9. Real-World Connections

9.1 Industry Applications

Wayland protocols like xdg-shell and layer-shell are defined the same way.
Screenshot and screencast portals use similar patterns.

wayland-protocols repository
xdg-desktop-portal

9.3 Interview Relevance

Protocol versioning and ABI stability.
Designing clear client-server APIs.

10. Resources

10.1 Essential Reading

The Wayland Book, protocol design chapter
wayland-scanner documentation

10.2 Video Resources

Talks on Wayland protocol evolution

10.3 Tools & Documentation

wayland-scanner
wayland-info

P01-bare-metal-wayland-client - client basics
P02-simple-wayland-compositor-wlroots - server integration

10.5 Protocol References (Quick Links)

Wayland core protocol: https://wayland.freedesktop.org/docs/html/
Wayland protocol index: https://wayland.app/protocols
wayland-protocols repo: https://gitlab.freedesktop.org/wayland/wayland-protocols
linux-dmabuf (FD passing patterns): https://wayland.app/protocols/linux-dmabuf-unstable-v1

11. Self-Assessment Checklist

11.1 Understanding

I can write a minimal Wayland protocol in XML.
I can explain how wayland-scanner outputs are used.
I can describe versioning rules.

11.2 Implementation

Client can request a screenshot and save it.
Server handles disconnects safely.
Generated code is integrated into build.

11.3 Growth

I can design a new protocol and justify its API.
I can explain this project in an interview.

12. Submission / Completion Criteria

Minimum Viable Completion:

Protocol XML and generated bindings compile successfully.
Client sends request and receives response.

Full Completion:

All minimum criteria plus:
Clean error handling and version checks.
Screenshot saved as PNG.

Excellence (Going Above & Beyond):

Zero-copy DMA-BUF capture.
Protocol supports permissions and user confirmation.

13. Deep Dive Appendix: Protocol Design Review Checklist

13.1 Design Review Questions

Does every object have a clear owner and lifetime?
Are all requests validated with explicit error codes?
Is there a deterministic ordering of events for state changes?
Can the protocol evolve without breaking older clients?

13.2 XML Skeleton Walkthrough

<protocol name="my_proto">
  <interface name="my_manager" version="1">
    <request name="create_widget">
      <arg name="id" type="new_id" interface="my_widget"/>
    </request>
  </interface>
  <interface name="my_widget" version="1">
    <request name="set_color">
      <arg name="r" type="uint"/>
      <arg name="g" type="uint"/>
      <arg name="b" type="uint"/>
    </request>
    <event name="ready"/>
    <request name="destroy" type="destructor"/>
  </interface>
</protocol>

Manager creates widgets.
Widgets accept state updates and emit events.
destroy enforces explicit lifecycle end.

13.3 Compatibility Matrix

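Add request -> bump the interface version, mark the request with since, and keep old behavior for older clients.
Add event -> bump the version and mark the event with since; send it only to resources bound at that version or later, because older clients cannot decode an unknown event.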
Remove or change semantics -> not safe; create a new interface instead.
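On the server, the event rule in this matrix becomes a version check before every newer event. The sketch below uses a hypothetical `progress` event introduced in version 2 and a stand-in `sketch_resource` struct. In real code the bound version comes from `wl_resource_get_version`, and wayland-scanner generates `*_SINCE_VERSION` macros you can compare against.

```c
#include <stdbool.h>
#include <stdint.h>

#define PROGRESS_EVENT_SINCE 2u  /* hypothetical event added in v2 */

/* Stand-in for a bound resource and a counter of events sent to it. */
struct sketch_resource {
    uint32_t bound_version;
    int progress_events_sent;
};

/* Send the v2-only event if, and only if, the client bound v2 or later.
 * Returns true when the event was sent. */
static bool send_progress_if_supported(struct sketch_resource *r)
{
    if (r->bound_version < PROGRESS_EVENT_SINCE)
        return false;            /* older client: it cannot decode this */
    r->progress_events_sent++;   /* stands in for the generated send call */
    return true;
}
```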

13.4 Deterministic Failure Demo

If a client sends negative sizes, the server must disconnect it:

[client] my_widget.set_size(-1, 20)
[server] protocol error: BAD_SIZE
[server] disconnect client