Project 11: Sixel/Image Protocol Support

Add inline image support (Sixel or Kitty) with safe decoding, placement, and scroll behavior.

Quick Reference

Attribute | Value
Difficulty | Level 3: Advanced
Time Estimate | 3-5 weeks
Main Programming Language | C (Alternatives: Rust)
Alternative Programming Languages | Rust
Coolness Level | Level 4: Hardcore Tech Flex
Business Potential | Level 2: Open Source Builder
Prerequisites | Parser, renderer, memory safety
Key Topics | Sixel/Kitty protocols, DCS parsing, image compositing

1. Learning Objectives

By completing this project, you will:

  1. Parse DCS or APC image sequences safely.
  2. Decode a chosen image protocol (Sixel or Kitty).
  3. Place images inside the terminal grid correctly.
  4. Enforce size and memory limits to prevent abuse.
  5. Integrate images with scrollback behavior.

2. All Theory Needed (Per-Concept Breakdown)

Concept 1: Sixel Encoding and DCS Parsing

Fundamentals

Sixel is a bitmap encoding in which each data character represents a column one pixel wide and six pixels high. It is transmitted as a DCS (Device Control String) sequence that begins with ESC P and ends with ST (ESC \). The payload encodes pixel data and optional color definitions. Correct parsing requires buffering until the terminator and then decoding the payload into a pixel buffer.

Deep Dive into the Concept

Sixel encodes images as a stream of printable characters in the range 0x3F-0x7E. Subtracting 0x3F from a character yields a 6-bit value that describes a vertical slice of six pixels, with the least significant bit as the top pixel. The protocol also supports color changes (#), repeat counts (!), carriage returns ($) that move back to the start of the current six-pixel band, and line breaks (-) that advance to the next band. A simple Sixel decoder reads these characters and builds a bitmap band by band. Because Sixel uses a 6-pixel vertical unit, the output height is a multiple of 6, and you must handle line breaks properly to increment the row group.
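The decoding loop described above can be sketched as follows. This is a minimal single-color sketch, not a complete decoder: the function name sixel_decode_mono, the one-byte-per-pixel mask layout, and the SIXEL_MAX_REPEAT clamp are assumptions made for illustration, and color (#) and raster (") commands are skipped rather than interpreted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SIXEL_MAX_REPEAT = 4096 };  /* assumed clamp, not taken from the spec */

/* Decode a Sixel body into a w*h mask (one byte per pixel, 1 = set).
 * Handles data characters, '!' repeat, '$' carriage return and '-' newline.
 * Color ('#') and raster ('"') commands are skipped with their parameters.
 * Returns one past the lowest row that received a pixel (0 if none). */
int sixel_decode_mono(const char *s, size_t n, uint8_t *mask, int w, int h)
{
    int x = 0, band = 0, used_rows = 0;
    size_t i = 0;

    if (w <= 0 || h <= 0) return 0;
    memset(mask, 0, (size_t)w * (size_t)h);
    while (i < n && band * 6 < h) {
        unsigned char c = (unsigned char)s[i++];
        int repeat = 1;

        if (c == '!') {                              /* !<count><char> */
            repeat = 0;
            while (i < n && s[i] >= '0' && s[i] <= '9') {
                if (repeat <= SIXEL_MAX_REPEAT)      /* stop growing, no overflow */
                    repeat = repeat * 10 + (s[i] - '0');
                i++;
            }
            if (repeat < 1) repeat = 1;
            if (repeat > SIXEL_MAX_REPEAT) repeat = SIXEL_MAX_REPEAT;
            if (i >= n) break;                       /* count without a character */
            c = (unsigned char)s[i++];
        }
        if (c == '$') { x = 0; continue; }           /* back to column 0 */
        if (c == '-') { x = 0; band++; continue; }   /* next six-pixel band */
        if (c == '#' || c == '"') {                  /* skip numeric parameters */
            while (i < n && ((s[i] >= '0' && s[i] <= '9') || s[i] == ';')) i++;
            continue;
        }
        if (c < 0x3F || c > 0x7E) continue;          /* ignore anything else */

        int bits = c - 0x3F;                         /* bit 0 is the top pixel */
        for (int r = 0; r < repeat && x < w; r++, x++) {
            for (int b = 0; b < 6; b++) {
                int y = band * 6 + b;
                if (((bits >> b) & 1) && y < h) {
                    mask[(size_t)y * (size_t)w + (size_t)x] = 1;
                    if (y + 1 > used_rows) used_rows = y + 1;
                }
            }
        }
    }
    return used_rows;
}
```

Pixels that fall outside the w x h mask are dropped instead of growing the buffer, which keeps memory use fixed no matter what the payload asks for.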

Sixel sequences are embedded in a DCS: ESC P, optional numeric parameters, and the final character q start a Sixel sequence, and ESC \ terminates it. The parser must collect all bytes until the terminator; both the payload and the two-byte terminator may be split across reads. This is a streaming parsing problem, similar to OSC but with potentially large payloads. You should enforce a maximum payload size to prevent memory exhaustion.
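One way to handle the streaming aspect is a byte-at-a-time state machine, so it does not matter where a read boundary falls. The sketch below is an illustration under assumptions: the names DcsParser and dcs_feed, the fixed DCS_MAX_PAYLOAD cap, and the choice to keep the parameters and the final q in the buffer are all hypothetical, and the timeout path is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define DCS_MAX_PAYLOAD 4096   /* assumed cap; tune for your terminal */

typedef enum { DCS_GROUND, DCS_ESC, DCS_BODY, DCS_BODY_ESC } DcsState;

typedef struct {
    DcsState state;
    int      overflow;               /* set once the cap has been hit */
    size_t   len;
    uint8_t  buf[DCS_MAX_PAYLOAD];   /* everything after ESC P, e.g. "q..." */
} DcsParser;

/* Feed one byte. Returns 1 when a complete payload is in buf[0..len),
 * -1 when a terminated sequence exceeded the cap, and 0 otherwise. */
int dcs_feed(DcsParser *p, uint8_t b)
{
    switch (p->state) {
    case DCS_GROUND:
        if (b == 0x1B) p->state = DCS_ESC;
        return 0;
    case DCS_ESC:
        if (b == 'P') { p->state = DCS_BODY; p->len = 0; p->overflow = 0; }
        else if (b != 0x1B) p->state = DCS_GROUND;
        return 0;
    case DCS_BODY:
        if (b == 0x1B) p->state = DCS_BODY_ESC;
        else if (p->len < DCS_MAX_PAYLOAD) p->buf[p->len++] = b;
        else p->overflow = 1;        /* drop the byte, remember the overflow */
        return 0;
    case DCS_BODY_ESC:
        if (b == '\\') {             /* ESC \ is ST: the string is complete */
            p->state = DCS_GROUND;
            return p->overflow ? -1 : 1;
        }
        p->state = DCS_ESC;          /* ESC + other byte aborts the string */
        return dcs_feed(p, b);       /* reinterpret b as following an ESC */
    }
    return 0;
}
```

A zero-initialized DcsParser starts in DCS_GROUND. Because the state lives in the struct, the caller can feed bytes from successive reads in a loop and a terminator split across two reads is still recognized.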

Color handling in Sixel is optional. The protocol allows defining palette entries and switching active colors. A minimal implementation can support a fixed palette and ignore custom definitions, but for correctness you should parse # commands to update the palette. Sixel pixels are drawn with the current color index, so the decoder must track active color state as it processes the stream.
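As a rough illustration of the # command, the sketch below parses the numeric parameters that follow it. It assumes the common form in which a single parameter selects a color register and five parameters (register; coordinate system; three components) define one, with coordinate system 2 meaning RGB components given as percentages from 0 to 100. The names SixelPalette and sixel_color_cmd and the 256-entry palette are hypothetical. HLS definitions are ignored here, and the sketch does not change the active color on a definition; terminals differ on that point, so check it against a reference implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define SIXEL_PALETTE_SIZE 256   /* assumed register count for this sketch */

typedef struct {
    uint32_t rgb[SIXEL_PALETTE_SIZE];   /* packed 0x00RRGGBB */
    int      active;                    /* index used for subsequent pixels */
} SixelPalette;

/* Scale a 0..100 percentage to 0..255, clamping out-of-range input. */
static uint32_t pct_to_byte(int pct)
{
    if (pct > 100) pct = 100;
    return (uint32_t)(pct * 255 / 100);
}

/* Parse the parameters after a '#'; s points just past the '#'.
 * Returns the number of bytes consumed. */
size_t sixel_color_cmd(SixelPalette *pal, const char *s, size_t n)
{
    int v[5] = {0, 0, 0, 0, 0};
    int count = 0;
    size_t i = 0;

    while (count < 5 && i < n && s[i] >= '0' && s[i] <= '9') {
        int x = 0;
        while (i < n && s[i] >= '0' && s[i] <= '9') {
            if (x < 1000) x = x * 10 + (s[i] - '0');   /* saturate, no overflow */
            i++;
        }
        v[count++] = x;
        if (count < 5 && i < n && s[i] == ';') i++;
        else break;
    }
    if (count == 0 || v[0] >= SIXEL_PALETTE_SIZE) return i;  /* ignore */
    if (count == 1) {
        pal->active = v[0];                                  /* select */
    } else if (count == 5 && v[1] == 2) {                    /* define, RGB */
        pal->rgb[v[0]] = (pct_to_byte(v[2]) << 16)
                       | (pct_to_byte(v[3]) << 8)
                       |  pct_to_byte(v[4]);
    }
    return i;
}
```

Out-of-range register numbers are ignored rather than clamped, so a malformed stream cannot write outside the palette array.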

The parser must also handle invalid sequences. Unterminated DCS should be discarded after a timeout or length cap. Repeated characters with invalid counts should be rejected or clamped. The key invariant is: the decoder must never read beyond its buffers or allocate unlimited memory.

How This Fits in the Projects

This concept is central to this project and reused in P13 and P15.

Definitions & Key Terms

  • Sixel -> 6-bit vertical bitmap encoding.
  • DCS -> Device Control String (ESC P … ESC \).
  • ST -> String Terminator (ESC \).
  • Palette -> color map used by Sixel.

Mental Model Diagram (ASCII)

ESC P q ... sixel data ... ESC \
           |--- payload ---|

How It Works (Step-by-Step)

  1. Parse DCS start and buffer payload.
  2. Decode Sixel characters into pixels.
  3. Apply palette and line breaks.
  4. Produce a bitmap image.

Invariants:

  • Payload size is bounded.
  • Decoder state resets after completion.

Failure modes:

  • Unterminated sequences consume memory.
  • Invalid repeat counts crash decoder.

Minimal Concrete Example

/* ST is ESC followed by '\'; prev persists across reads, so a split ST still matches. */
if (state == DCS && prev == ESC && byte == '\\') end_dcs();

Common Misconceptions

  • “Sixel is just base64.” -> It is a custom 6-bit bitmap encoding.
  • “DCS ends at ESC.” -> It ends at ESC followed by ‘\’.

Check-Your-Understanding Questions

  1. Why is Sixel height in multiples of 6?
  2. How do you detect the end of a DCS sequence?
  3. Why must Sixel payload size be capped?

Check-Your-Understanding Answers

  1. Each Sixel character encodes a 6-pixel column.
  2. Look for ESC followed by ‘\’ (ST).
  3. To prevent memory exhaustion and DoS.

Real-World Applications

  • Inline image rendering in terminals
  • Plotting tools like gnuplot

Where You’ll Apply It

References

  • Sixel specification
  • DEC terminal documentation

Key Insight

Sixel is a stateful bitmap protocol embedded inside a DCS stream.

Summary

Parsing and decoding Sixel safely is the first step to inline images.

Homework/Exercises to Practice the Concept

  1. Decode a minimal Sixel payload into a bitmap.
  2. Implement a size cap and test with oversized payloads.
  3. Verify termination with ST sequences.

Solutions to the Homework/Exercises

  1. Use a bitmap 1 pixel wide and 6 pixels high, and log the pixels for a single character.
  2. Reject payloads larger than your cap.
  3. Feed a known sequence and ensure it terminates correctly.

Concept 2: Image Placement, Compositing, and Scroll Behavior

Fundamentals

Images must be placed into the terminal grid. Some protocols define images in pixel coordinates, others in cell coordinates. The renderer must decide how images interact with text, including whether they occupy cells, overlay text, or move with scrolling. A safe implementation needs clear rules.

Deep Dive into the Concept

Sixel images are traditionally treated as pixel graphics that occupy a rectangular region on the screen. Terminals vary in how they integrate these graphics with text. A simple model is to treat the image as a block of cells: compute the number of cells required based on cell size and place the image at the current cursor position. This approach integrates with scrolling: when the screen scrolls, the image moves with its cell region, and it can be clipped if part of that region moves out of view.

For safety and determinism, define explicit placement rules. For example: “Images are anchored to the cursor position at the time of receipt, measured in cell coordinates. The image occupies N rows and M columns, rounded up from pixel dimensions.” This makes scroll behavior predictable and testable. You can store images in a list with their cell anchors and draw them during rendering after drawing text cells, compositing the image on top.

Memory and performance constraints are crucial. Images can be large; you must set a maximum pixel count and reject images that exceed it. You should also limit the number of images stored in memory and evict old ones when scrollback grows. Otherwise, repeated image output can consume unbounded memory.
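The two limits described here, a maximum pixel count and a maximum number of stored images, could look roughly like this. It is a sketch under assumptions: ImageStore, image_size_ok, and image_store_add are hypothetical names, IMG_MAX_PIXELS and IMG_MAX_COUNT are placeholder values, and eviction is simple oldest-first.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define IMG_MAX_PIXELS (1024L * 1024L)  /* assumed cap on w*h */
#define IMG_MAX_COUNT  4                /* assumed; small so eviction is easy to see */

typedef struct { int w, h, row, col; uint32_t *pixels; } Image;
typedef struct { Image items[IMG_MAX_COUNT]; int count; } ImageStore;

/* Returns 1 if w and h are positive and w*h fits the cap. The product is
 * computed in long long so it cannot overflow for any int inputs. */
int image_size_ok(int w, int h)
{
    if (w <= 0 || h <= 0) return 0;
    return (long long)w * (long long)h <= IMG_MAX_PIXELS;
}

/* Add an image anchored at (row, col). Takes ownership of pixels on
 * success. When the store is full the oldest image is freed first.
 * Returns 0 on success, -1 if the image is rejected. */
int image_store_add(ImageStore *s, int w, int h, int row, int col,
                    uint32_t *pixels)
{
    if (!image_size_ok(w, h) || pixels == NULL) return -1;
    if (s->count == IMG_MAX_COUNT) {
        free(s->items[0].pixels);
        memmove(&s->items[0], &s->items[1],
                (IMG_MAX_COUNT - 1) * sizeof(Image));
        s->count--;
    }
    s->items[s->count++] = (Image){ w, h, row, col, pixels };
    return 0;
}
```

Checking the size before allocating the pixel buffer, rather than after decoding, is what keeps an oversized image from costing any memory at all.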

Finally, determine how images interact with scrollback. A minimal implementation can discard images when they scroll out of the visible area, or store them as part of the scrollback buffer. Storing images in scrollback is more complex but more correct. For this project, you can implement a bounded image list and remove images that are fully outside the viewport.
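The simpler policy, dropping images once they are fully outside the viewport, might be sketched like this for upward scrolling. PlacedImage and images_scroll_up are hypothetical names, and the sketch assumes row 0 is the top visible row and that each image owns a heap-allocated pixel buffer.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int row, col;        /* anchor cell (top-left) */
    int rows, cols;      /* footprint in cells */
    uint32_t *pixels;    /* owned by the list; may be NULL in tests */
} PlacedImage;

/* Scroll the screen up by `lines`: shift every anchor, free images whose
 * last row is now above row 0, and compact the array in place.
 * Returns the new image count. */
int images_scroll_up(PlacedImage *imgs, int count, int lines)
{
    int kept = 0;
    for (int i = 0; i < count; i++) {
        imgs[i].row -= lines;
        if (imgs[i].row + imgs[i].rows <= 0) {   /* fully above the viewport */
            free(imgs[i].pixels);
            continue;
        }
        imgs[kept++] = imgs[i];                  /* still at least partly visible */
    }
    return kept;
}
```

An image whose anchor row has gone negative but whose footprint still reaches row 0 or below is kept, so the renderer can clip it instead of making it vanish early.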

How This Fits in the Projects

This concept is used in P13 and P15.

Definitions & Key Terms

  • Cell anchor -> the cell position where an image is placed.
  • Compositing -> drawing images on top of text.
  • Clipping -> cutting off parts of an image outside the viewport.

Mental Model Diagram (ASCII)

+-------------------+
| text text [img]   |
| text text [img]   |
+-------------------+

How It Works (Step-by-Step)

  1. Decode image to pixel buffer.
  2. Compute cell width/height for the image.
  3. Store image with anchor cell and dimensions.
  4. During render, draw image in its region.

Invariants:

  • Image dimensions are bounded.
  • Images are anchored to deterministic cell positions.

Failure modes:

  • Unbounded images cause memory spikes.
  • Incorrect anchoring causes images to drift.

Minimal Concrete Example

int cols = (img_w + cell_w - 1) / cell_w;
int rows = (img_h + cell_h - 1) / cell_h;
store_image(anchor_row, anchor_col, cols, rows, pixels);

Common Misconceptions

  • “Images should always overlay text.” -> Many terminals treat them as cells.
  • “Scrollback ignores images.” -> Users expect images to move with text.

Check-Your-Understanding Questions

  1. How do you map pixel dimensions to cell dimensions?
  2. Why define explicit anchoring rules?
  3. What is a safe image size policy?

Check-Your-Understanding Answers

  1. Divide by cell size and round up.
  2. To make placement deterministic and testable.
  3. Set a max pixel count and reject larger images.

Real-World Applications

  • Inline plots in data tools
  • Rich TUI interfaces

Where You’ll Apply It

References

  • Kitty graphics protocol docs
  • iTerm2 image protocol docs

Key Insight

Image support is less about decoding and more about clear placement and memory policy.

Summary

Define deterministic placement rules and enforce strict size limits for safety.

Homework/Exercises to Practice the Concept

  1. Render a 32x32 image in an 8x16 cell grid and compute cell footprint.
  2. Implement clipping for images partially off-screen.
  3. Enforce a maximum pixel count.

Solutions to the Homework/Exercises

  1. 32x32 with 8x16 cells becomes 4 cols x 2 rows.
  2. Skip drawing pixels outside viewport bounds.
  3. Reject images larger than a defined threshold.
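For solution 2, clipping can be done by trimming the source rectangle before the copy loop instead of testing every pixel. The sketch below assumes a 32-bit-per-pixel framebuffer in row-major order; blit_clipped and its parameter names are hypothetical.

```c
#include <stdint.h>

/* Copy src (sw x sh) into dst (dw x dh) with its top-left corner at pixel
 * (ox, oy) in dst. Anything outside dst is skipped.
 * Returns the number of pixels written. */
long blit_clipped(uint32_t *dst, int dw, int dh,
                  const uint32_t *src, int sw, int sh, int ox, int oy)
{
    int x0 = ox < 0 ? -ox : 0;                /* first visible source column */
    int y0 = oy < 0 ? -oy : 0;                /* first visible source row */
    int x1 = (ox + sw > dw) ? dw - ox : sw;   /* one past last visible column */
    int y1 = (oy + sh > dh) ? dh - oy : sh;   /* one past last visible row */
    long written = 0;

    for (int y = y0; y < y1; y++) {
        for (int x = x0; x < x1; x++) {
            dst[(long)(oy + y) * dw + (ox + x)] = src[(long)y * sw + x];
            written++;
        }
    }
    return written;
}
```

With cell-aligned placement, ox and oy would be the anchor column and row multiplied by the cell width and height. When the image lies entirely outside the destination, the trimmed range is empty and nothing is written.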

3. Project Specification

3.1 What You Will Build

An image protocol module that:

  • Parses Sixel (or Kitty) sequences safely.
  • Decodes payloads into pixel buffers.
  • Places images into the terminal grid.
  • Enforces size limits and memory caps.

Intentionally excluded:

  • Full image scaling or advanced blending modes.

3.2 Functional Requirements

  1. DCS parsing: buffer until ST terminator.
  2. Sixel decoding: convert payload to bitmap.
  3. Placement: map pixels to cell region.
  4. Limits: enforce max pixels and max images.
  5. Integration: images drawn with text output.

3.3 Non-Functional Requirements

  • Security: reject oversized payloads.
  • Determinism: fixed test images and size caps.
  • Performance: decode within reasonable time.

3.4 Example Usage / Output

$ ./img_term --sixel demo.six
[img] decoded 120x60 image
[img] placed at row 5 col 10

3.5 Data Formats / Schemas / Protocols

  • Image object: {w,h,pixels,anchor_row,anchor_col}

3.6 Edge Cases

  • Unterminated DCS sequence.
  • Images larger than max size.
  • Overlapping images.

3.7 Real World Outcome

Inline images render in the terminal without breaking text output.

3.7.1 How to Run (Copy/Paste)

cc -O2 -o img_term img_term.c
TZ=UTC LC_ALL=C ./img_term --sixel samples/demo.six

3.7.2 Golden Path Demo (Deterministic)

  1. Render a fixed 120x60 Sixel file.
  2. Verify placement at a known cursor position.

3.7.3 Failure Demo (Deterministic)

$ ./img_term --sixel samples/huge.six
error: image exceeds maximum size (max=1024x1024)
exit status: 65

3.7.4 If TUI: ASCII layout

+----------------------------------+
| text text [IMG]                  |
| text text [IMG]                  |
+----------------------------------+

4. Solution Architecture

4.1 High-Level Design

Parser -> Sixel Decoder -> Image Store -> Renderer

4.2 Key Components

Component | Responsibility | Key Decisions
DCS Parser | Buffer payloads | Size caps
Sixel Decoder | Convert to pixels | Simple palette
Image Store | Track images and anchors | Bounded list
Renderer | Composite images | Draw after text

4.3 Data Structures (No Full Code)

struct Image { int w,h; int row,col; uint32_t *pixels; };

4.4 Algorithm Overview

Key Algorithm: Decode Sixel

  1. Read chars, map to 6-bit columns.
  2. Apply repeat counts and line breaks.
  3. Write pixels into bitmap.

Complexity Analysis:

  • Time: O(payload)
  • Space: O(w*h)

5. Implementation Guide

5.1 Development Environment Setup

cc --version

5.2 Project Structure

img-term/
|-- src/
|   |-- dcs.c
|   |-- sixel.c
|   `-- render.c
|-- samples/
|   |-- demo.six
|   `-- huge.six
|-- Makefile
`-- README.md

5.3 The Core Question You’re Answering

“How do you render images inside a text terminal safely?”

5.4 Concepts You Must Understand First

  1. DCS parsing and termination.
  2. Sixel decoding rules.
  3. Image placement and size limits.

5.5 Questions to Guide Your Design

  1. Will images occupy cells or overlay pixels?
  2. What is the maximum allowed image size?
  3. How will you handle overlapping images?

5.6 Thinking Exercise

Define a policy for images that exceed max size. Reject or scale?

5.7 The Interview Questions They’ll Ask

  1. How does Sixel encoding work?
  2. Why are size limits required for security?
  3. How do images interact with scrolling?

5.8 Hints in Layers

Hint 1: Start with a fixed palette. Ignore custom colors initially.

Hint 2: Enforce size limits early. Reject oversized payloads immediately.

Hint 3: Anchor to cursor position. Make placement deterministic.

Hint 4: Clip to viewport. Avoid drawing off-screen pixels.

5.9 Books That Will Help

Topic | Book | Chapter
Graphics | “Computer Graphics from Scratch” | Ch. 6-9
Parsing | “Language Implementation Patterns” | Ch. 3

5.10 Implementation Phases

Phase 1: Parsing (1 week)

Goals: buffer DCS sequences.

Tasks:

  1. Implement DCS buffering and termination.
  2. Add size caps.

Checkpoint: DCS payload captured reliably.

Phase 2: Decoding (1-2 weeks)

Goals: decode Sixel into bitmap.

Tasks:

  1. Implement Sixel decoder.
  2. Add palette handling.

Checkpoint: Demo images decode correctly.

Phase 3: Placement (1 week)

Goals: render images in grid.

Tasks:

  1. Map pixels to cell region.
  2. Integrate with renderer.

Checkpoint: Images display at correct position.

5.11 Key Implementation Decisions

Decision | Options | Recommendation | Rationale
Protocol | Sixel vs Kitty | Sixel | Simpler to implement
Placement | Cell-aligned vs pixel overlay | Cell-aligned | Predictable
Limits | Hard cap vs scaling | Hard cap | Security and simplicity

6. Testing Strategy

6.1 Test Categories

Category | Purpose | Examples
Unit Tests | Decode patterns | Known Sixel payloads
Integration Tests | Render images | demo.six
Security Tests | Oversized payload | huge.six

6.2 Critical Test Cases

  1. Termination: DCS ends at ST.
  2. Size limit: oversized images rejected.
  3. Placement: anchor at cursor position.

6.3 Test Data

Payload: simple 6x6 pattern
Expected: correct pixels in bitmap

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall | Symptom | Solution
Unterminated DCS | Parser stuck | Add length cap and timeout
Wrong line breaks | Image shifted | Handle ‘-’ properly
No size limits | Memory spike | Enforce max pixels

7.2 Debugging Strategies

  • Render a bounding box for image placement.
  • Log decode progress and row/col counts.

7.3 Performance Traps

Large image decoding can stall UI; consider async decoding later.


8. Extensions & Challenges

8.1 Beginner Extensions

  • Add support for Kitty protocol.
  • Add image scaling to fit cells.

8.2 Intermediate Extensions

  • Add image caching and reuse.
  • Add transparency handling.

8.3 Advanced Extensions

  • Add animated image support.
  • Add GPU texture upload pipeline.

9. Real-World Connections

9.1 Industry Applications

  • Notebook-like terminals
  • Data visualization tools

9.2 Related Open Source Projects

  • xterm: Sixel support reference
  • kitty: graphics protocol

9.3 Interview Relevance

  • Protocol parsing and safety
  • Graphics integration in text UIs

10. Resources

10.1 Essential Reading

  • Sixel protocol reference
  • Kitty graphics protocol docs

10.2 Video Resources

  • Talks on terminal graphics

10.3 Tools & Documentation

  • imgcat or kitty icat for test images

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain Sixel encoding.
  • I can parse DCS reliably.
  • I can describe image placement rules.

11.2 Implementation

  • Images render correctly in the grid.
  • Size limits are enforced.
  • Tests pass for sample images.

11.3 Growth

  • I can extend to Kitty protocol.
  • I can add GPU upload.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Sixel decoding works for demo images.
  • Images placed deterministically in grid.

Full Completion:

  • Size limits and robust parsing.
  • Integration with renderer and scroll behavior.

Excellence (Going Above & Beyond):

  • Kitty protocol support.
  • Image caching and GPU acceleration.