Project 11: Sixel/Image Protocol Support
Add inline image support (Sixel or Kitty) with safe decoding, placement, and scroll behavior.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 3: Advanced |
| Time Estimate | 3-5 weeks |
| Main Programming Language | C |
| Alternative Programming Languages | Rust |
| Coolness Level | Level 4: Hardcore Tech Flex |
| Business Potential | Level 2: Open Source Builder |
| Prerequisites | Parser, renderer, memory safety |
| Key Topics | Sixel/Kitty protocols, DCS parsing, image compositing |
1. Learning Objectives
By completing this project, you will:
- Parse DCS or APC image sequences safely.
- Decode a chosen image protocol (Sixel or Kitty).
- Place images inside the terminal grid correctly.
- Enforce size and memory limits to prevent abuse.
- Integrate images with scrollback behavior.
2. All Theory Needed (Per-Concept Breakdown)
Concept 1: Sixel Encoding and DCS Parsing
Fundamentals
Sixel is a bitmap encoding that represents a 6-pixel-high column per character. It is transmitted as a DCS (Device Control String) sequence that begins with ESC P and ends with ST (ESC \). The payload encodes pixel data and optional color definitions. Correct parsing requires buffering until the terminator and then decoding the payload into a pixel buffer.
Deep Dive into the Concept
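Sixel encodes images as a stream of printable characters in the range 0x3F-0x7E ('?' through '~'). Subtracting 0x3F from a character gives a 6-bit value that describes one vertical slice of six pixels, with the least significant bit as the top pixel. The protocol also supports color changes (#), line breaks (-), carriage returns ($) that go back to the left edge of the current band so another color can be overprinted, and repeat counts (!). A simple Sixel decoder reads these characters and builds a bitmap one 6-pixel band at a time. Because Sixel works in 6-pixel vertical units, the decoded height is a multiple of 6, and you must handle line breaks properly to advance to the next band.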
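The decoding rules above can be sketched as a small single-color decoder. This is a minimal illustration, not a complete Sixel implementation: the function name `sixel_decode_mono`, the `SIXEL_MAX_REPEAT` cap, and the one-byte-per-pixel bitmap layout are assumptions made for the example, and color commands are simply skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIXEL_MAX_REPEAT 4096 /* assumed cap on '!' repeat counts */

/* Decode Sixel data characters into a w*h bitmap (1 byte per pixel, 1 = set).
 * Handles data characters, '!' repeat, '$' (back to column 0) and
 * '-' (next 6-pixel band). Anything else is ignored in this sketch.
 * Pixels outside the bitmap are dropped, never written.
 * Returns 0 on success, -1 on a malformed repeat introducer. */
int sixel_decode_mono(const char *data, size_t len,
                      uint8_t *bitmap, int w, int h)
{
    assert(data != NULL && bitmap != NULL && w > 0 && h > 0);
    memset(bitmap, 0, (size_t)w * (size_t)h);
    int x = 0, band = 0; /* band = index of the current 6-row group */
    size_t i = 0;
    while (i < len) {
        unsigned char c = (unsigned char)data[i++];
        int repeat = 1;
        if (c == '!') { /* "!<count><char>" repeats the next data character */
            int digits = 0;
            repeat = 0;
            while (i < len && data[i] >= '0' && data[i] <= '9') {
                if (repeat <= SIXEL_MAX_REPEAT) /* stop growing once over the cap */
                    repeat = repeat * 10 + (data[i] - '0');
                i++;
                digits++;
            }
            if (digits == 0 || i >= len) return -1;
            if (repeat < 1) repeat = 1; /* a count of 0 is treated as 1 here */
            if (repeat > SIXEL_MAX_REPEAT) repeat = SIXEL_MAX_REPEAT;
            c = (unsigned char)data[i++];
            if (c < 0x3F || c > 0x7E) return -1;
        }
        if (c == '$') { x = 0; continue; }
        if (c == '-') { x = 0; if (band <= h / 6) band++; continue; }
        if (c < 0x3F || c > 0x7E) continue;
        int bits = c - 0x3F; /* bit 0 is the top pixel of the slice */
        for (int r = 0; r < repeat && x < w; r++, x++) {
            for (int b = 0; b < 6; b++) {
                int y = band * 6 + b;
                if (((bits >> b) & 1) && y < h)
                    bitmap[(size_t)y * (size_t)w + (size_t)x] = 1;
            }
        }
    }
    return 0;
}
```

For example, decoding the five bytes `~!2@-` into a 4x6 bitmap sets all six pixels of column 0 and only the top pixel of columns 1 and 2. Clamping the repeat count and bounds-checking every write keeps the decoder inside its buffer even on hostile input.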
Sixel sequences are embedded in a DCS: ESC P, followed by optional numeric parameters and the final character q, starts a Sixel sequence, and ST (ESC \) terminates it. The parser must collect all bytes until the terminator, and the two bytes of the terminator may themselves be split across reads. This is a streaming parsing problem, similar to OSC but with potentially large payloads. You should enforce a maximum payload size to prevent memory exhaustion.
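One way to handle a terminator split across reads is a byte-at-a-time state machine that remembers whether the previous byte was ESC. The sketch below assumes 7-bit controls only (it does not recognize the 8-bit forms 0x90 and 0x9C), and the names `dcs_parser`, `dcs_feed`, and the 4096-byte `DCS_MAX_PAYLOAD` are illustrative choices; a real cap would likely be larger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DCS_MAX_PAYLOAD 4096 /* assumed cap for illustration */

enum dcs_state { DCS_GROUND, DCS_ESC, DCS_BODY, DCS_BODY_ESC };

struct dcs_parser {
    enum dcs_state state;
    uint8_t buf[DCS_MAX_PAYLOAD];
    size_t len;
    int overflow; /* set once the payload exceeds the cap */
};

/* Feed one byte. Returns 1 when a complete payload is in buf[0..len),
 * -1 when a sequence terminated but was dropped for exceeding the cap,
 * and 0 otherwise. State survives between calls, so ESC and '\' may
 * arrive in different reads. */
int dcs_feed(struct dcs_parser *p, uint8_t byte)
{
    assert(p != NULL);
    switch (p->state) {
    case DCS_GROUND:
        if (byte == 0x1B) p->state = DCS_ESC;
        return 0;
    case DCS_ESC:
        if (byte == 'P') {
            p->state = DCS_BODY;
            p->len = 0;
            p->overflow = 0;
        } else {
            p->state = (byte == 0x1B) ? DCS_ESC : DCS_GROUND;
        }
        return 0;
    case DCS_BODY:
        if (byte == 0x1B) { p->state = DCS_BODY_ESC; return 0; }
        if (p->len < DCS_MAX_PAYLOAD) p->buf[p->len++] = byte;
        else p->overflow = 1; /* keep consuming, but store nothing more */
        return 0;
    case DCS_BODY_ESC:
        if (byte == '\\') {
            p->state = DCS_GROUND;
            return p->overflow ? -1 : 1;
        }
        /* ESC not followed by '\': abandon the string and treat the
         * byte as the start of a new escape sequence. */
        p->state = DCS_ESC;
        return dcs_feed(p, byte);
    }
    return 0;
}
```

The payload handed to the decoder starts with any numeric parameters and the final `q`. Because the buffer is fixed-size, an unterminated or oversized sequence can never grow memory; it only sets the overflow flag until a terminator or a new escape arrives.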
Color handling in Sixel is optional. The protocol allows defining palette entries and switching active colors. A minimal implementation can support a fixed palette and ignore custom definitions, but for correctness you should parse # commands to update the palette. Sixel pixels are drawn with the current color index, so the decoder must track active color state as it processes the stream.
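As a rough sketch of palette tracking, the function below handles the two common forms of the color command: `#Pc` to select a color register and `#Pc;2;R;G;B` to define one in RGB, where each component is a percentage from 0 to 100. The names `sixel_palette` and `sixel_color_cmd` are hypothetical, the HLS coordinate system (type 1) is rejected rather than converted, and making a newly defined register active is an assumption you should check against your target terminal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SIXEL_PALETTE_SIZE 256 /* assumed number of color registers */

struct sixel_palette {
    uint32_t rgb[SIXEL_PALETTE_SIZE]; /* 0x00RRGGBB */
    int active;                       /* currently selected register */
};

/* Read up to 5 ';'-separated decimal parameters starting at s[*pos]. */
static int read_params(const char *s, size_t len, size_t *pos, int out[5])
{
    int n = 0;
    while (n < 5) {
        int v = 0, digits = 0;
        while (*pos < len && s[*pos] >= '0' && s[*pos] <= '9') {
            if (v < 100000) v = v * 10 + (s[*pos] - '0'); /* avoid overflow */
            (*pos)++;
            digits++;
        }
        if (digits == 0) break;
        out[n++] = v;
        if (*pos < len && s[*pos] == ';') (*pos)++;
        else break;
    }
    return n;
}

/* Handle a color command; s[*pos] is the byte just after '#'.
 * Advances *pos past the parameters. Returns 0 on success, -1 if the
 * command is malformed or unsupported in this sketch. */
int sixel_color_cmd(struct sixel_palette *pal,
                    const char *s, size_t len, size_t *pos)
{
    assert(pal != NULL && s != NULL && pos != NULL);
    int p[5] = {0};
    int n = read_params(s, len, pos, p);
    if (n != 1 && n != 5) return -1;
    if (p[0] >= SIXEL_PALETTE_SIZE) return -1;
    if (n == 5) {
        if (p[1] != 2) return -1; /* 2 = RGB; HLS (1) is not handled here */
        if (p[2] > 100 || p[3] > 100 || p[4] > 100) return -1;
        uint32_t r = (uint32_t)(p[2] * 255 / 100);
        uint32_t g = (uint32_t)(p[3] * 255 / 100);
        uint32_t b = (uint32_t)(p[4] * 255 / 100);
        pal->rgb[p[0]] = (r << 16) | (g << 8) | b;
    }
    pal->active = p[0];
    return 0;
}
```

Rejecting register numbers outside the palette keeps a hostile stream from indexing past the array, which is the same bounded-state rule that applies to the pixel buffer.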
The parser must also handle invalid sequences. Unterminated DCS should be discarded after a timeout or length cap. Repeated characters with invalid counts should be rejected or clamped. The key invariant is: the decoder must never read beyond its buffers or allocate unlimited memory.
How this fits on projects
This concept is central to this project and reused in P13 and P15.
Definitions & Key Terms
- Sixel -> 6-bit vertical bitmap encoding.
- DCS -> Device Control String (ESC P … ESC \).
- ST -> String Terminator (ESC \).
- Palette -> color map used by Sixel.
Mental Model Diagram (ASCII)
ESC P q ... sixel data ... ESC \
|--- payload ---|
How It Works (Step-by-Step)
- Parse DCS start and buffer payload.
- Decode Sixel characters into pixels.
- Apply palette and line breaks.
- Produce a bitmap image.
Invariants:
- Payload size is bounded.
- Decoder state resets after completion.
Failure modes:
- Unterminated sequences consume memory.
- Invalid repeat counts crash decoder.
Minimal Concrete Example
if (state == DCS && byte == '\\' && prev == ESC) end_dcs();
Common Misconceptions
- “Sixel is just base64.” -> It is a custom 6-bit bitmap encoding.
- “DCS ends at ESC.” -> It ends at ESC followed by ‘\’.
Check-Your-Understanding Questions
- Why is Sixel height in multiples of 6?
- How do you detect the end of a DCS sequence?
- Why must Sixel payload size be capped?
Check-Your-Understanding Answers
- Each Sixel character encodes a 6-pixel column.
- Look for ESC followed by ‘\’ (ST).
- To prevent memory exhaustion and DoS.
Real-World Applications
- Inline image rendering in terminals
- Plotting tools like gnuplot
Where You’ll Apply It
- This project: Section 3.2 (decoder), Section 6.2 (tests)
- Also used in: P13-full-terminal-emulator
References
- Sixel specification
- DEC terminal documentation
Key Insight
Sixel is a stateful bitmap protocol embedded inside a DCS stream.
Summary
Parsing and decoding Sixel safely is the first step to inline images.
Homework/Exercises to Practice the Concept
- Decode a minimal Sixel payload into a bitmap.
- Implement a size cap and test with oversized payloads.
- Verify termination with ST sequences.
Solutions to the Homework/Exercises
- Decode a single character into a bitmap 1 pixel wide and 6 pixels tall, and log each pixel.
- Reject payloads larger than your cap.
- Feed a known sequence and ensure it terminates correctly.
Concept 2: Image Placement, Compositing, and Scroll Behavior
Fundamentals
Images must be placed into the terminal grid. Some protocols define images in pixel coordinates, others in cell coordinates. The renderer must decide how images interact with text, including whether they occupy cells, overlay text, or move with scrolling. A safe implementation needs clear rules.
Deep Dive into the Concept
Sixel images are traditionally treated as pixel graphics that occupy a rectangular region on the screen. Terminals vary in how they integrate these graphics with text. A simple model is to treat the image as a block of cells: compute the number of cells required based on cell size and place the image at the current cursor position. This approach integrates with scrolling: when the cursor moves or the screen scrolls, the image moves with its cell region, or it can be clipped if it moves out of view.
For safety and determinism, define explicit placement rules. For example: “Images are anchored to the cursor position at the time of receipt, measured in cell coordinates. The image occupies N rows and M columns, rounded up from pixel dimensions.” This makes scroll behavior predictable and testable. You can store images in a list with their cell anchors and draw them during rendering after drawing text cells, compositing the image on top.
Memory and performance constraints are crucial. Images can be large; you must set a maximum pixel count and reject images that exceed it. You should also limit the number of images stored in memory and evict old ones when scrollback grows. Otherwise, repeated image output can consume unbounded memory.
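A size check along these lines might run before any pixel buffer is allocated. It is a sketch under assumptions: `image_size_ok` is a hypothetical helper, the pixel cap is passed in by the caller, and pixels are assumed to be 32-bit values as in the `uint32_t` buffer used elsewhere in this guide. The comparison uses division so that the width-times-height product cannot overflow before it is checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 and stores the buffer size in *bytes if a w x h image of
 * 32-bit pixels stays within max_pixels; returns 0 otherwise. */
int image_size_ok(size_t w, size_t h, size_t max_pixels, size_t *bytes)
{
    assert(bytes != NULL);
    if (w == 0 || h == 0) return 0;
    /* Same as w * h > max_pixels, but safe against overflow. */
    if (w > max_pixels / h) return 0;
    /* Guard the byte count as well, in case max_pixels is very large. */
    if (w * h > SIZE_MAX / sizeof(uint32_t)) return 0;
    *bytes = w * h * sizeof(uint32_t);
    return 1;
}
```

With a cap of 1024 * 1024 pixels, a 120x60 image passes and needs 28800 bytes, while a 2048x1024 image is rejected before anything is allocated.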
Finally, determine how images interact with scrollback. A minimal implementation can discard images when they scroll out of the visible area, or store them as part of the scrollback buffer. Storing images in scrollback is more complex but more correct. For this project, you can implement a bounded image list and remove images that are fully outside the viewport.
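The bounded image list described above could look something like the following sketch. The type and function names (`placed_image`, `image_list`, `image_list_add`, `image_list_scroll_up`) and the tiny `MAX_IMAGES` value are assumptions for illustration. Pixel data is left out to keep the example short; a real store would also free each evicted image's pixel buffer.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IMAGES 4 /* assumed cap, kept small for illustration */

struct placed_image {
    int row, col;   /* anchor cell (top-left); row goes negative after scrolling */
    int rows, cols; /* footprint in cells */
};

struct image_list {
    struct placed_image items[MAX_IMAGES];
    size_t count;
};

/* Add an image; when the list is full, evict the oldest entry first. */
void image_list_add(struct image_list *l, struct placed_image img)
{
    assert(l != NULL);
    if (l->count == MAX_IMAGES) {
        for (size_t i = 1; i < l->count; i++)
            l->items[i - 1] = l->items[i];
        l->count--;
    }
    l->items[l->count++] = img;
}

/* The screen scrolled up by n rows: shift every anchor and drop images
 * whose last row is now above row 0 (fully outside the viewport). */
void image_list_scroll_up(struct image_list *l, int n)
{
    assert(l != NULL && n >= 0);
    size_t kept = 0;
    for (size_t i = 0; i < l->count; i++) {
        struct placed_image img = l->items[i];
        img.row -= n;
        if (img.row + img.rows <= 0)
            continue; /* nothing of this image is visible any more */
        l->items[kept++] = img;
    }
    l->count = kept;
}
```

An image that is only partly scrolled off keeps a negative anchor row and relies on clipping at draw time, so it disappears from the list only once its last row has left the viewport.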
How this fits on projects
This concept is used in P13 and P15.
Definitions & Key Terms
- Cell anchor -> the cell position where an image is placed.
- Compositing -> drawing images on top of text.
- Clipping -> cutting off parts of an image outside the viewport.
Mental Model Diagram (ASCII)
+-------------------+
| text text [img] |
| text text [img] |
+-------------------+
How It Works (Step-by-Step)
- Decode image to pixel buffer.
- Compute cell width/height for the image.
- Store image with anchor cell and dimensions.
- During render, draw image in its region.
Invariants:
- Image dimensions are bounded.
- Images are anchored to deterministic cell positions.
Failure modes:
- Unbounded images cause memory spikes.
- Incorrect anchoring causes images to drift.
Minimal Concrete Example
int cols = (img_w + cell_w - 1) / cell_w;
int rows = (img_h + cell_h - 1) / cell_h;
store_image(anchor_row, anchor_col, cols, rows, pixels);
Common Misconceptions
- “Images should always overlay text.” -> Many terminals treat them as cells.
- “Scrollback ignores images.” -> Users expect images to move with text.
Check-Your-Understanding Questions
- How do you map pixel dimensions to cell dimensions?
- Why define explicit anchoring rules?
- What is a safe image size policy?
Check-Your-Understanding Answers
- Divide by cell size and round up.
- To make placement deterministic and testable.
- Set a max pixel count and reject larger images.
Real-World Applications
- Inline plots in data tools
- Rich TUI interfaces
Where You’ll Apply It
- This project: Section 3.2 (placement rules), Section 7.1 (pitfalls)
- Also used in: P15-feature-complete-terminal-capstone
References
- Kitty graphics protocol docs
- iTerm2 image protocol docs
Key Insight
Image support is less about decoding and more about clear placement and memory policy.
Summary
Define deterministic placement rules and enforce strict size limits for safety.
Homework/Exercises to Practice the Concept
- Render a 32x32 image in a grid of 8x16-pixel cells and compute its cell footprint.
- Implement clipping for images partially off-screen.
- Enforce a maximum pixel count.
Solutions to the Homework/Exercises
- With 8x16 cells, a 32x32 image needs ceil(32/8) = 4 columns and ceil(32/16) = 2 rows.
- Skip drawing pixels outside viewport bounds.
- Reject images larger than a defined threshold.
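For the clipping exercise, one possible approach is to intersect the image rectangle with the framebuffer once and then copy only the overlapping pixels. The function `blit_clipped` and its row-major `uint32_t` framebuffer layout are assumptions made for this sketch, and it overwrites destination pixels without any blending.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy img (img_w x img_h) into fb (fb_w x fb_h) with its top-left
 * corner at pixel (dst_x, dst_y). Pixels that fall outside the
 * framebuffer are skipped, so dst_x and dst_y may be negative. */
void blit_clipped(uint32_t *fb, int fb_w, int fb_h,
                  const uint32_t *img, int img_w, int img_h,
                  int dst_x, int dst_y)
{
    assert(fb != NULL && img != NULL);
    assert(fb_w > 0 && fb_h > 0 && img_w > 0 && img_h > 0);

    /* Visible range in image coordinates: [x0, x1) by [y0, y1). */
    int x0 = dst_x < 0 ? -dst_x : 0;
    int y0 = dst_y < 0 ? -dst_y : 0;
    int x1 = dst_x + img_w > fb_w ? fb_w - dst_x : img_w;
    int y1 = dst_y + img_h > fb_h ? fb_h - dst_y : img_h;

    /* If the image is entirely off-screen, the loops run zero times. */
    for (int y = y0; y < y1; y++) {
        for (int x = x0; x < x1; x++) {
            size_t dst = (size_t)(dst_y + y) * (size_t)fb_w + (size_t)(dst_x + x);
            size_t src = (size_t)y * (size_t)img_w + (size_t)x;
            fb[dst] = img[src];
        }
    }
}
```

With cell-anchored placement, the caller would pass `dst_x = col * cell_w` and `dst_y = row * cell_h`, so an image whose anchor row has scrolled above the viewport is drawn with a negative `dst_y` and loses only its hidden rows.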
3. Project Specification
3.1 What You Will Build
An image protocol module that:
- Parses Sixel (or Kitty) sequences safely.
- Decodes payloads into pixel buffers.
- Places images into the terminal grid.
- Enforces size limits and memory caps.
Intentionally excluded:
- Full image scaling or advanced blending modes.
3.2 Functional Requirements
- DCS parsing: buffer until ST terminator.
- Sixel decoding: convert payload to bitmap.
- Placement: map pixels to cell region.
- Limits: enforce max pixels and max images.
- Integration: images drawn with text output.
3.3 Non-Functional Requirements
- Security: reject oversized payloads.
- Determinism: fixed test images and size caps.
- Performance: decode within reasonable time.
3.4 Example Usage / Output
$ ./img_term --sixel samples/demo.six
[img] decoded 120x60 image
[img] placed at row 5 col 10
3.5 Data Formats / Schemas / Protocols
- Image object:
{w,h,pixels,anchor_row,anchor_col}
3.6 Edge Cases
- Unterminated DCS sequence.
- Images larger than max size.
- Overlapping images.
3.7 Real World Outcome
Inline images render in the terminal without breaking text output.
3.7.1 How to Run (Copy/Paste)
cc -O2 -o img_term img_term.c
TZ=UTC LC_ALL=C ./img_term --sixel samples/demo.six
3.7.2 Golden Path Demo (Deterministic)
- Render a fixed 120x60 Sixel file.
- Verify placement at a known cursor position.
3.7.3 Failure Demo (Deterministic)
$ ./img_term --sixel samples/huge.six
error: image exceeds maximum size (max=1024x1024)
exit status: 65
3.7.4 If TUI: ASCII layout
+----------------------------------+
| text text [IMG] |
| text text [IMG] |
+----------------------------------+
4. Solution Architecture
4.1 High-Level Design
Parser -> Sixel Decoder -> Image Store -> Renderer
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| DCS Parser | Buffer payloads | Size caps |
| Sixel Decoder | Convert to pixels | Simple palette |
| Image Store | Track images and anchors | Bounded list |
| Renderer | Composite images | Draw after text |
4.3 Data Structures (No Full Code)
struct Image { int w,h; int row,col; uint32_t *pixels; };
4.4 Algorithm Overview
Key Algorithm: Decode Sixel
- Read chars, map to 6-bit columns.
- Apply repeat counts and line breaks.
- Write pixels into bitmap.
Complexity Analysis:
- Time: O(payload)
- Space: O(w*h)
5. Implementation Guide
5.1 Development Environment Setup
cc --version
5.2 Project Structure
img-term/
|-- src/
| |-- dcs.c
| |-- sixel.c
| `-- render.c
|-- samples/
| |-- demo.six
| `-- huge.six
|-- Makefile
`-- README.md
5.3 The Core Question You’re Answering
“How do you render images inside a text terminal safely?”
5.4 Concepts You Must Understand First
- DCS parsing and termination.
- Sixel decoding rules.
- Image placement and size limits.
5.5 Questions to Guide Your Design
- Will images occupy cells or overlay pixels?
- What is the maximum allowed image size?
- How will you handle overlapping images?
5.6 Thinking Exercise
Define a policy for images that exceed max size. Reject or scale?
5.7 The Interview Questions They’ll Ask
- How does Sixel encoding work?
- Why are size limits required for security?
- How do images interact with scrolling?
5.8 Hints in Layers
Hint 1: Start with a fixed palette. Ignore custom colors initially.
Hint 2: Enforce size limits early. Reject oversized payloads immediately.
Hint 3: Anchor to cursor position. Make placement deterministic.
Hint 4: Clip to viewport. Avoid drawing off-screen pixels.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Graphics | “Computer Graphics from Scratch” | Ch. 6-9 |
| Parsing | “Language Implementation Patterns” | Ch. 3 |
5.10 Implementation Phases
Phase 1: Parsing (1 week)
Goals: buffer DCS sequences.
Tasks:
- Implement DCS buffering and termination.
- Add size caps.
Checkpoint: DCS payload captured reliably.
Phase 2: Decoding (1-2 weeks)
Goals: decode Sixel into bitmap.
Tasks:
- Implement Sixel decoder.
- Add palette handling.
Checkpoint: Demo images decode correctly.
Phase 3: Placement (1 week)
Goals: render images in grid.
Tasks:
- Map pixels to cell region.
- Integrate with renderer.
Checkpoint: Images display at correct position.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Protocol | Sixel vs Kitty | Sixel | Simpler to implement |
| Placement | Cell-aligned vs pixel overlay | Cell-aligned | Predictable |
| Limits | Hard cap vs scaling | Hard cap | Security and simplicity |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Decode patterns | Known sixel payloads |
| Integration Tests | Render images | demo.six |
| Security Tests | Oversized payload | huge.six |
6.2 Critical Test Cases
- Termination: DCS ends at ST.
- Size limit: oversized images rejected.
- Placement: anchor at cursor position.
6.3 Test Data
Payload: simple 6x6 pattern
Expected: correct pixels in bitmap
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Unterminated DCS | Parser stuck | Add length cap and timeout |
| Wrong line breaks | Image shifted | Handle ‘-’ properly |
| No size limits | Memory spike | Enforce max pixels |
7.2 Debugging Strategies
- Render a bounding box for image placement.
- Log decode progress and row/col counts.
7.3 Performance Traps
Large image decoding can stall UI; consider async decoding later.
8. Extensions & Challenges
8.1 Beginner Extensions
- Add support for Kitty protocol.
- Add image scaling to fit cells.
8.2 Intermediate Extensions
- Add image caching and reuse.
- Add transparency handling.
8.3 Advanced Extensions
- Add animated image support.
- Add GPU texture upload pipeline.
9. Real-World Connections
9.1 Industry Applications
- Notebook-like terminals
- Data visualization tools
9.2 Related Open Source Projects
- xterm: Sixel support reference
- kitty: graphics protocol
9.3 Interview Relevance
- Protocol parsing and safety
- Graphics integration in text UIs
10. Resources
10.1 Essential Reading
- Sixel protocol reference
- Kitty graphics protocol docs
10.2 Video Resources
- Talks on terminal graphics
10.3 Tools & Documentation
- imgcat or kitty icat for test images
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain Sixel encoding.
- I can parse DCS reliably.
- I can describe image placement rules.
11.2 Implementation
- Images render correctly in the grid.
- Size limits are enforced.
- Tests pass for sample images.
11.3 Growth
- I can extend to Kitty protocol.
- I can add GPU upload.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Sixel decoding works for demo images.
- Images placed deterministically in grid.
Full Completion:
- Size limits and robust parsing.
- Integration with renderer and scroll behavior.
Excellence (Going Above & Beyond):
- Kitty protocol support.
- Image caching and GPU acceleration.