Project 14: Web Terminal (xterm.js Backend)
Build a web-based terminal using xterm.js in the browser and a backend that manages PTYs over WebSockets.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 3: Advanced |
| Time Estimate | 3-6 weeks |
| Main Programming Language | Go + JavaScript |
| Alternative Programming Languages | Python, Rust |
| Coolness Level | Level 4: Hardcore Tech Flex |
| Business Potential | Level 4: Open Core Infrastructure |
| Prerequisites | PTY basics, WebSockets, auth |
| Key Topics | PTY over WebSocket, resize, security |
1. Learning Objectives
By completing this project, you will:
- Bridge a PTY to a browser over WebSockets.
- Handle input/output streaming with backpressure.
- Propagate resize events from browser to PTY.
- Implement authentication and access control.
- Deliver a responsive web terminal UI.
2. All Theory Needed (Per-Concept Breakdown)
Concept 1: PTY-over-WebSocket Streaming and Backpressure
Fundamentals
A web terminal forwards bytes between a PTY and a WebSocket. The PTY produces a stream of bytes; the browser consumes them. Backpressure occurs when the browser cannot keep up, so the backend must manage buffering and flow control to avoid memory growth and latency.
Deep Dive into the Concept
In a web terminal, the backend creates a PTY for each client and spawns a shell. It then streams PTY output to the browser via a WebSocket connection. The browser sends keystrokes back over the same connection. This is conceptually similar to a local terminal, but the network adds latency and buffering.
Backpressure is the main challenge. If the backend writes to the WebSocket faster than the client can read, buffers grow. To handle this, the backend should either drop data (not acceptable for terminals) or throttle reading from the PTY. Most systems rely on the PTY buffer to apply backpressure: if you stop reading from the PTY, the kernel buffer fills and the child process will eventually block on write. This provides natural flow control. The backend must therefore integrate WebSocket send readiness with PTY reads.
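One way to picture that integration is a bounded queue between the PTY reader and the WebSocket writer. The sketch below is a minimal illustration, not a definitive implementation: `pump`, `demoPump`, the queue depth, and the chunk size are hypothetical names and values, and plain `io.Reader` / `io.Writer` values stand in for the PTY master and the WebSocket connection.

```go
package main

import (
	"bytes"
	"io"
)

// pump copies src (standing in for the PTY master) to dst (standing in for
// the WebSocket writer) through a queue holding at most depth chunks.
// When dst is slow, the queue fills, the reader goroutine blocks on send,
// and src is no longer drained. With a real PTY, the kernel buffer then
// fills and the child process blocks on write.
func pump(src io.Reader, dst io.Writer, depth, chunkSize int) error {
	queue := make(chan []byte, depth)
	readErr := make(chan error, 1)
	done := make(chan struct{})
	defer close(done) // lets the reader goroutine exit if the writer fails

	go func() {
		defer close(queue)
		for {
			buf := make([]byte, chunkSize)
			n, err := src.Read(buf)
			if n > 0 {
				select {
				case queue <- buf[:n]: // blocks while the queue is full
				case <-done:
					return
				}
			}
			if err != nil {
				if err == io.EOF {
					err = nil
				}
				readErr <- err
				return
			}
		}
	}()

	for chunk := range queue {
		if _, err := dst.Write(chunk); err != nil {
			return err // e.g. the client disconnected
		}
	}
	return <-readErr
}

// demoPump runs pump over in-memory buffers and returns what dst received.
func demoPump() string {
	var out bytes.Buffer
	src := bytes.NewReader([]byte("hello from the pty\r\n"))
	if err := pump(src, &out, 2, 4); err != nil {
		return "error: " + err.Error()
	}
	return out.String()
}
```

Memory use is bounded by roughly `depth * chunkSize` bytes per session, and a single reader feeding a single writer through a FIFO queue keeps output in order. In a real server the reader may still be parked inside `Read` when the writer fails, so closing the PTY on disconnect is what finally releases it.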
On the browser side, xterm.js provides APIs for writing data. If you write too fast, the browser UI can become laggy. Chunking output and using requestAnimationFrame to flush helps. Some implementations batch output and apply frame pacing to keep the UI smooth.
Another key issue is framing. WebSockets can transmit text or binary frames. Terminal data is bytes, so you should use binary frames to avoid UTF-8 conversion issues. The browser can treat the data as Uint8Array and feed it into xterm.js. This avoids corruption of non-UTF8 sequences.
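The byte-fidelity problem is easy to reproduce without a network. RFC 6455 requires text frames to contain valid UTF-8, yet a PTY read can end in the middle of a multi-byte character. The sketch below uses a hypothetical helper, `throughTextChannel`, to imitate a lossy text path by replacing invalid bytes with U+FFFD. A real endpoint may instead close the connection, so treat this as an illustration of the risk rather than a model of any particular library.

```go
package main

import (
	"bytes"
	"strings"
	"unicode/utf8"
)

// throughTextChannel imitates forcing raw bytes through a UTF-8 text path:
// every run of invalid bytes is replaced with U+FFFD (EF BF BD).
func throughTextChannel(raw []byte) []byte {
	return []byte(strings.ToValidUTF8(string(raw), "\uFFFD"))
}

// demoSplitRune simulates a PTY read that ends halfway through "é"
// (bytes C3 A9). The whole character is valid UTF-8, the first half alone
// is not, and the first half does not survive the text path unchanged.
func demoSplitRune() (wholeValid, halfValid, halfSurvived bool) {
	whole := []byte("\u00e9")
	half := whole[:1]
	return utf8.Valid(whole), utf8.Valid(half),
		bytes.Equal(half, throughTextChannel(half))
}
```

With binary frames the two halves arrive as-is and the terminal parser in the browser reassembles the character.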
How this fits on projects
This concept builds on P01 and P13.
Definitions & Key Terms
- WebSocket -> bidirectional network protocol.
- Backpressure -> throttling when consumer is slower than producer.
- Binary frames -> raw byte frames in WebSocket.
Mental Model Diagram (ASCII)
PTY -> backend -> WebSocket -> browser xterm.js
 ^                                   |
 |                                   v
 +-------------- input --------------+
How It Works (Step-by-Step)
- Create PTY per client.
- Read PTY output and send via WebSocket.
- Receive input from WebSocket and write to PTY.
- Apply throttling if WebSocket buffers grow.
Invariants:
- Output order preserved.
- No byte loss under backpressure.
Failure modes:
- Unbounded WebSocket buffer growth.
- UTF-8 conversion corrupting bytes.
Minimal Concrete Example
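for {
	n, err := pty.Read(buf)
	if err != nil {
		break // PTY closed or child exited
	}
	if err := ws.WriteMessage(websocket.BinaryMessage, buf[:n]); err != nil {
		break // client disconnected
	}
}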
Common Misconceptions
- “Text frames are fine.” -> Text frames must carry valid UTF-8, so raw bytes and multi-byte characters split across reads can be corrupted or rejected.
- “Backpressure is automatic.” -> You must integrate it into the loop.
Check-Your-Understanding Questions
- Why use binary WebSocket frames?
- How do you handle backpressure from the browser?
- What happens if you drop PTY output?
Check-Your-Understanding Answers
- To preserve raw bytes without UTF-8 conversion.
- Pause PTY reads or buffer with a strict limit.
- The terminal state becomes inconsistent.
Real-World Applications
- Web-based SSH terminals
- Cloud IDE terminals
Where You’ll Apply It
- This project: Section 3.2 (streaming), Section 7.3 (performance)
- Also used in: P13-full-terminal-emulator
References
- WebSocket protocol RFC
- xterm.js docs
Key Insight
A web terminal is a PTY bridge with strict flow control and byte fidelity.
Summary
Streaming raw bytes over WebSockets requires careful buffering and backpressure management.
Homework/Exercises to Practice the Concept
- Build a PTY-to-WebSocket bridge and test with a local client.
- Simulate slow client reads and observe buffer growth.
- Switch from text to binary frames and compare output.
Solutions to the Homework/Exercises
- Use a simple Go server and a `websocat` client.
- Add sleep delays on the client and log buffer size.
- Use binary frames to preserve control bytes.
Concept 2: Resize Semantics and Session Security
Fundamentals
Terminal size changes must propagate from the browser to the PTY so applications can redraw correctly. Security is also critical: a web terminal exposes a shell over HTTP, so you need authentication, authorization, and safe sandboxing.
Deep Dive into the Concept
In a browser terminal, resizing the window changes the number of rows and columns displayed by xterm.js. The backend must be notified of these changes and must issue the TIOCSWINSZ ioctl on the PTY master file descriptor (the end the backend holds). The kernel then sends SIGWINCH to the foreground process group on that terminal. This ensures programs like vim and htop resize correctly. Without this, applications will think they are still at the old size and may render incorrectly or corrupt output.
Resize messages should be small and structured, such as JSON: {"type":"resize","cols":120,"rows":40}. The backend receives the message and updates the PTY. Resize events should be debounced to avoid flooding the backend on continuous window resizing.
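Parsing and validating that message on the backend could look like the following sketch. The names `resizeMsg`, `parseResize`, and the `maxDim` bound of 1000 are assumptions made for illustration; the source does not specify limits. The result is returned as `uint16` because the kernel's window-size structure stores rows and columns as unsigned 16-bit values.

```go
package main

import (
	"encoding/json"
	"errors"
)

// resizeMsg mirrors the JSON shape {"type":"resize","cols":120,"rows":40}.
type resizeMsg struct {
	Type string `json:"type"`
	Cols int    `json:"cols"`
	Rows int    `json:"rows"`
}

// maxDim is an assumed sanity bound, not a value taken from any standard.
const maxDim = 1000

// parseResize decodes a resize message and rejects anything that is not a
// well-formed resize with dimensions in the range 1..maxDim.
func parseResize(payload []byte) (cols, rows uint16, err error) {
	var m resizeMsg
	if err := json.Unmarshal(payload, &m); err != nil {
		return 0, 0, err
	}
	if m.Type != "resize" {
		return 0, 0, errors.New("not a resize message")
	}
	if m.Cols < 1 || m.Cols > maxDim || m.Rows < 1 || m.Rows > maxDim {
		return 0, 0, errors.New("resize dimensions out of range")
	}
	return uint16(m.Cols), uint16(m.Rows), nil
}
```

Rejecting zero, negative, and very large sizes before they reach the PTY keeps a malformed or hostile client from pushing nonsense dimensions into running applications.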
Security is non-negotiable. A web terminal is effectively remote code execution. You must enforce authentication (e.g., tokens or sessions), authorization (which user can access which PTY), and rate limits. You should also isolate the shell in a restricted environment: container, chroot, or limited user account. This project focuses on minimal auth and safe defaults, but you should design the architecture so stronger isolation can be added later.
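For the minimal-auth starting point, a shared token can be checked in middleware before the WebSocket upgrade happens. The sketch below is one possible shape under stated assumptions: `requireToken` and `demoStatus` are hypothetical names, the token is read from a `token` query parameter, and the error body reuses the JSON format from Section 3.5. `httptest` appears only so the demo can run without opening a socket.

```go
package main

import (
	"crypto/subtle"
	"net/http"
	"net/http/httptest"
)

// requireToken wraps next and answers 401 unless the "token" query
// parameter matches want. The comparison is constant-time so the response
// time does not reveal how many leading characters were correct.
func requireToken(want string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		got := r.URL.Query().Get("token")
		// An empty configured token would match every tokenless request,
		// so it is treated as "deny everything".
		if want == "" || subtle.ConstantTimeCompare([]byte(got), []byte(want)) != 1 {
			w.Header().Set("Content-Type", "application/json")
			w.WriteHeader(http.StatusUnauthorized)
			w.Write([]byte(`{"error":"unauthorized","code":"AUTH"}`))
			return
		}
		next.ServeHTTP(w, r)
	})
}

// demoStatus sends one in-memory request through the guard and returns the
// HTTP status code. The inner handler stands in for the WebSocket upgrade.
func demoStatus(target string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest("GET", target, nil)
	requireToken("demo123", ok).ServeHTTP(rec, req)
	return rec.Code
}
```

Because the guard is an ordinary `http.Handler` wrapper, it can later be swapped for session or OAuth checks without touching the PTY code.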
Finally, you must ensure that the WebSocket connection cannot be hijacked. Use HTTPS/WSS, include CSRF protections for session cookies, and enforce origin checks. Without these, attackers could connect to the terminal and take over sessions.
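An origin check can be sketched as a small predicate that runs before the upgrade. The helper names and the allow-list contents below are hypothetical, and the sketch compares only the host part of the `Origin` header; a stricter version might also compare the scheme.

```go
package main

import (
	"net/http"
	"net/url"
)

// originAllowed reports whether the request's Origin header names a host in
// allowedHosts. A missing or unparsable Origin is rejected. That is an
// assumption suited to browser clients; non-browser tools that omit Origin
// would need a separate policy.
func originAllowed(r *http.Request, allowedHosts map[string]bool) bool {
	origin := r.Header.Get("Origin")
	if origin == "" {
		return false
	}
	u, err := url.Parse(origin)
	if err != nil {
		return false
	}
	return allowedHosts[u.Host]
}

// demoOrigin builds a bare request carrying the given Origin value and
// checks it against an allow-list containing only localhost:8080.
func demoOrigin(origin string) bool {
	r := &http.Request{Header: http.Header{}}
	if origin != "" {
		r.Header.Set("Origin", origin)
	}
	return originAllowed(r, map[string]bool{"localhost:8080": true})
}
```

The check complements token auth rather than replacing it: it stops a malicious page from opening a socket with the victim's cookies, while the token stops everyone else.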
How this fits on projects
This concept is critical for P14 and related to P15 security.
Definitions & Key Terms
- SIGWINCH -> signal on window resize.
- Auth token -> credential for WebSocket access.
- Sandbox -> restricted execution environment.
Mental Model Diagram (ASCII)
browser resize -> WS resize msg -> backend -> TIOCSWINSZ -> SIGWINCH
How It Works (Step-by-Step)
- Browser detects resize and sends message.
- Backend parses resize message.
- Backend calls `ioctl(TIOCSWINSZ)` on PTY.
- Child receives SIGWINCH and redraws.
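The ioctl step can be sketched with only the standard library. This assumes a Linux or macOS build, since `syscall.TIOCSWINSZ` and `syscall.SYS_IOCTL` are platform-specific; `winsize` and `setWinsize` are illustrative names, and PTY libraries typically wrap the same call for you.

```go
package main

import (
	"syscall"
	"unsafe"
)

// winsize matches the layout of the kernel's struct winsize: rows, columns,
// then the pixel width and height (left at zero here).
type winsize struct {
	Rows, Cols     uint16
	XPixel, YPixel uint16
}

// setWinsize applies rows x cols to the terminal referred to by fd, which
// should be the PTY master held by the backend. On success the kernel
// notifies the foreground process group with SIGWINCH.
func setWinsize(fd uintptr, rows, cols uint16) error {
	ws := winsize{Rows: rows, Cols: cols}
	_, _, errno := syscall.Syscall(
		syscall.SYS_IOCTL,
		fd,
		uintptr(syscall.TIOCSWINSZ),
		uintptr(unsafe.Pointer(&ws)),
	)
	if errno != 0 {
		return errno
	}
	return nil
}
```

With an `*os.File` for the PTY master, the call would be `setWinsize(master.Fd(), 40, 120)`. Note the argument order: the kernel structure lists rows before columns, while the JSON message lists `cols` first.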
Invariants:
- Resize messages are validated.
- Only authenticated clients can resize sessions.
Failure modes:
- No resize propagation causes broken TUIs.
- Missing auth exposes shell to the public.
Minimal Concrete Example
if msg.Type == "resize" {
pty.Resize(msg.Cols, msg.Rows)
}
Common Misconceptions
- “Resize only affects UI.” -> It changes PTY size too.
- “Local network is safe.” -> Web terminals need auth everywhere.
Check-Your-Understanding Questions
- Why is resize propagation required?
- What is the minimum auth required for a web terminal?
- Why use HTTPS/WSS for WebSocket?
Check-Your-Understanding Answers
- To deliver correct size to the child app via SIGWINCH.
- At least a token or session-based authentication.
- To prevent sniffing and hijacking of the terminal stream.
Real-World Applications
- Cloud shells (AWS, GCP)
- Remote dev environments
Where You’ll Apply It
- This project: Section 3.2 (resize), Section 7.1 (pitfalls)
- Also used in: P15-feature-complete-terminal-capstone
References
- `ioctl(TIOCSWINSZ)` documentation
- Web security best practices
Key Insight
A web terminal is a remote shell; resize and security are first-class features.
Summary
Correct resize propagation and strong authentication are required for a usable and safe web terminal.
Homework/Exercises to Practice the Concept
- Implement resize messages and verify `stty size` in the shell.
- Add a token-based auth guard on the WebSocket.
- Add an origin check and deny cross-site connections.
Solutions to the Homework/Exercises
- Send resize JSON and call `TIOCSWINSZ`.
- Require a token in query params or headers.
- Check the `Origin` header and reject mismatches.
3. Project Specification
3.1 What You Will Build
A web terminal that:
- Uses xterm.js in the browser.
- Bridges PTY I/O over WebSockets.
- Handles resize events.
- Enforces authentication and basic security.
Intentionally excluded:
- Multi-user access control lists or billing.
3.2 Functional Requirements
- PTY per client: spawn a PTY for each WebSocket.
- Streaming: binary frames for PTY output and input.
- Resize: handle resize messages and call `TIOCSWINSZ`.
- Auth: require a token for WebSocket access.
- UI: basic web page with xterm.js.
3.3 Non-Functional Requirements
- Latency: keystroke echo under 100 ms for local network.
- Security: WSS, origin checks, rate limits.
- Stability: handle disconnects gracefully.
3.4 Example Usage / Output
$ ./webterm --port 8080
[webterm] listening on http://localhost:8080
3.5 Data Formats / Schemas / Protocols
- Resize message: `{"type":"resize","cols":120,"rows":40}`
- Error JSON: `{"error":"unauthorized","code":"AUTH"}`
3.6 Edge Cases
- Client disconnects mid-session.
- Resize spam on rapid window drag.
- Invalid auth token.
3.7 Real World Outcome
A browser terminal that runs vim and htop with near-local responsiveness.
3.7.1 How to Run (Copy/Paste)
./webterm --port 8080 --auth-token demo123
3.7.2 Golden Path Demo (Deterministic)
- Open `http://localhost:8080/?token=demo123`.
- Verify shell prompt appears and responds.
3.7.3 Failure Demo (Deterministic)
$ curl -i http://localhost:8080/ws
HTTP/1.1 401 Unauthorized
{"error":"unauthorized","code":"AUTH"}
exit status: 0
3.7.4 If Web App: screen-by-screen flows
- Landing page: shows a terminal area and a connect button.
- Connected state: terminal grid with input focus.
- Error state: banner “Unauthorized” when token missing.
ASCII wireframe:
+--------------------------------------+
| Web Terminal |
| [Connect] |
|--------------------------------------|
| $ |
| |
| |
+--------------------------------------+
4. Solution Architecture
4.1 High-Level Design
Browser (xterm.js) <-> WebSocket <-> PTY backend
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Web UI | Render xterm.js | Minimal HTML/JS |
| WS Handler | Auth and stream bytes | Binary frames |
| PTY Manager | Spawn shells | Per-connection PTY |
| Resize Handler | Apply window size | Debounce messages |
4.3 Data Structures (No Full Code)
type ResizeMsg struct { Type string; Cols int; Rows int }
4.4 Algorithm Overview
Key Algorithm: WS Loop
- Authenticate connection.
- Start PTY and goroutines for read/write.
- Forward input to PTY and output to WS.
- Handle resize messages.
Complexity Analysis:
- Time: O(n) for bytes forwarded
- Space: O(buffer size)
5. Implementation Guide
5.1 Development Environment Setup
go version
5.2 Project Structure
webterm/
|-- server/
| |-- main.go
| |-- pty.go
| `-- ws.go
|-- web/
| |-- index.html
| `-- app.js
|-- Makefile
`-- README.md
5.3 The Core Question You’re Answering
“How do you bridge a PTY over the network while preserving terminal semantics?”
5.4 Concepts You Must Understand First
- PTY lifecycle and job control.
- WebSocket streaming and backpressure.
- Resize propagation and security.
5.5 Questions to Guide Your Design
- How will you authenticate clients?
- How will you handle reconnects?
- How will you limit session resources?
5.6 Thinking Exercise
Estimate the bandwidth needed for a 120x40 terminal refreshing at 30 FPS.
5.7 The Interview Questions They’ll Ask
- Why do you need binary WebSocket frames?
- How do you handle resize events?
- What security risks exist in web terminals?
5.8 Hints in Layers
Hint 1: Use a simple token. Start with a single shared token.
Hint 2: Use binary frames. Avoid UTF-8 conversion.
Hint 3: Add resize messages. Keep them small and debounced.
Hint 4: Add rate limits. Protect against abuse.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Networking | “UNIX Network Programming” Vol 1 | Ch. 5-7 |
| Security | “Foundations of Information Security” | Ch. 2 |
5.10 Implementation Phases
Phase 1: Backend PTY bridge (1-2 weeks)
Goals: PTY <-> WS streaming.
Tasks:
- Spawn PTY per connection.
- Stream bytes both directions.
Checkpoint: Browser shell works.
Phase 2: UI and resize (1 week)
Goals: xterm.js integration.
Tasks:
- Add xterm.js UI.
- Send resize messages.
Checkpoint: `stty size` matches browser.
Phase 3: Security (1 week)
Goals: auth and rate limits.
Tasks:
- Add token auth.
- Add origin checks.
Checkpoint: unauthorized access rejected.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| WS frames | Text vs binary | Binary | Preserve bytes |
| Auth | Token vs OAuth | Token | Minimal viable |
| Resize | Continuous vs debounce | Debounce | Reduce load |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Resize handler | Valid/invalid sizes |
| Integration Tests | PTY bridge | Echo input |
| Security Tests | Unauthorized access | 401 response |
6.2 Critical Test Cases
- Resize: `stty size` matches browser.
- Backpressure: large output does not crash server.
- Auth: invalid token rejected.
6.3 Test Data
Input: "echo hello" through WebSocket
Expected: "hello" echoed in browser
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Using text frames | Corrupted escape sequences | Use binary frames |
| No resize propagation | Full-screen apps broken | Send TIOCSWINSZ |
| No auth | Public shell exposure | Require token |
7.2 Debugging Strategies
- Log WebSocket frame sizes and timing.
- Compare PTY output with local terminal output.
7.3 Performance Traps
Writing each byte as a separate WS frame is slow; batch data.
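Batching can be as simple as accumulating small reads until a size threshold is reached. The `batcher` type, its `limit` field, and the demo below are illustrative assumptions. A production version would likely also flush on a short timer so that a lone keystroke echo is not held back waiting for more data.

```go
package main

// batcher coalesces small PTY reads into larger frames. The send callback
// stands in for writing one binary WebSocket frame. It must not retain the
// slice it receives, because the buffer is reused after each flush.
type batcher struct {
	buf   []byte
	limit int
	send  func(frame []byte) error
}

// Add appends p and flushes once at least limit bytes are pending.
func (b *batcher) Add(p []byte) error {
	b.buf = append(b.buf, p...)
	if len(b.buf) >= b.limit {
		return b.Flush()
	}
	return nil
}

// Flush sends whatever is pending as a single frame.
func (b *batcher) Flush() error {
	if len(b.buf) == 0 {
		return nil
	}
	err := b.send(b.buf)
	b.buf = b.buf[:0]
	return err
}

// demoBatch feeds 11 single-byte reads through a batcher with limit 4 and
// reports how many frames were sent and how many bytes they carried.
func demoBatch() (frames, total int) {
	b := &batcher{limit: 4, send: func(f []byte) error {
		frames++
		total += len(f)
		return nil
	}}
	for _, c := range []byte("hello world") {
		b.Add([]byte{c})
	}
	b.Flush()
	return frames, total
}
```

Here 11 one-byte reads become 3 frames instead of 11, with no bytes lost or reordered.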
8. Extensions & Challenges
8.1 Beginner Extensions
- Add reconnect support.
- Add a session list UI.
8.2 Intermediate Extensions
- Add multi-user session sharing.
- Add audit logging.
8.3 Advanced Extensions
- Add containerized sessions per user.
- Implement full auth (OAuth).
9. Real-World Connections
9.1 Industry Applications
- Cloud IDEs and dev environments
- Remote admin consoles
9.2 Related Open Source Projects
- xterm.js: browser terminal
- ttyd: web terminal server
9.3 Interview Relevance
- Networking and protocol bridging
- Security considerations in web apps
10. Resources
10.1 Essential Reading
- WebSocket RFC 6455
- xterm.js documentation
10.2 Video Resources
- Talks on web terminal architecture
10.3 Tools & Documentation
- `websocat` for testing WebSockets
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain PTY-over-WebSocket.
- I can handle resize semantics.
- I can design basic auth.
11.2 Implementation
- Browser terminal works with low latency.
- Security checks enforce access control.
- Tests cover resize and auth.
11.3 Growth
- I can extend to multi-user sessions.
- I can add container isolation.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Working web terminal with PTY bridge.
- Resize propagation and basic auth.
Full Completion:
- Backpressure handling and security hardening.
- Deterministic integration tests.
Excellence (Going Above & Beyond):
- Session persistence and reconnect support.
- Containerized multi-user architecture.