Project 6: Custom Data Formatter

Teach LLDB to print your data structures in a human-friendly way.

Quick Reference

Attribute      Value
Difficulty     Expert
Time Estimate  1-2 weeks
Language       Python (LLDB data formatters)
Prerequisites  Project 5, pointers, Python scripting
Key Topics     SBValue, summaries, synthetic children

1. Learning Objectives

By completing this project, you will:

  1. Write a Python summary provider for a custom struct.
  2. Register the formatter in LLDB.
  3. Traverse a linked list safely with SBValue.
  4. (Optional) Add synthetic children for richer output.

2. Theoretical Foundation

2.1 Core Concepts

  • Summary provider: Returns a one-line description for a type.
  • Synthetic children: Expose custom child elements for pretty printing.
  • SBValue: The LLDB object representing a value in the debug session.
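
To make the summary-provider idea concrete, here is a minimal sketch. LLDB calls a Python function with the SBValue being displayed plus an internal dictionary, and shows the returned string next to the value. The function name node_summary and the exact output strings are illustrative assumptions, not names LLDB requires.

```python
def node_summary(valobj, internal_dict):
    """Summary provider sketch for a `Node *` value.

    `valobj` is the SBValue LLDB is about to display. The name
    `node_summary` is a hypothetical choice for this example.
    """
    # GetValueAsUnsigned(0) reads the pointer as an integer and falls
    # back to 0 if the value cannot be read.
    address = valobj.GetValueAsUnsigned(0)
    if address == 0:
        return "(null)"
    return "Node at 0x%x" % address
```

Because the function only calls methods on the object it receives, it can be exercised outside LLDB with a small stand-in object.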

2.2 Why This Matters

Readable data structures can cut debugging time substantially. A good formatter turns a raw address such as “0x7ffeefbff5e8” into a description of what that address actually holds, for example the length of the list it points to.

2.3 Common Misconceptions

  • “Formatters are cosmetic.” (They change how fast you understand state.)
  • “Formatters are fragile.” (With good guards, they are robust.)

3. Project Specification

3.1 What You Will Build

A C program with a linked list and a Python formatter that prints the list size next to the raw pointer value.

3.2 Functional Requirements

  1. LLDB prints a summary like “Linked list with 3 nodes”.
  2. The formatter handles null pointers safely.
  3. The formatter can be enabled/disabled.

3.3 Non-Functional Requirements

  • Safety: Avoid infinite loops in malformed lists.
  • Clarity: Output should be easy to scan.

3.4 Example Usage / Output

(lldb) command script import formatter.py
(lldb) fr v head
(Node *) head = 0x00007ffeefbff5e8 (Linked list with 3 nodes)

3.5 Real World Outcome

(lldb) command script import formatter.py
(lldb) b 10
(lldb) run
(lldb) fr v head
(Node *) head = 0x00007ffeefbff5e8 (Linked list with 3 nodes)

4. Solution Architecture

4.1 High-Level Design

LLDB -> SBValue -> Formatter -> Summary String

4.2 Key Components

Component    Responsibility          Key Decisions
C program    Provide data structure  Linked list is simple and clear
Formatter    Generate summary        Count nodes safely
LLDB config  Register formatter      Type summary add

5. Implementation Guide

5.1 Development Environment Setup

clang -g -O0 -o linkedlist linkedlist.c
lldb ./linkedlist

5.2 Project Structure

project-root/
|-- linkedlist.c
|-- formatter.py
`-- notes.md

5.3 The Core Question You’re Answering

“How can I make LLDB show my data structures in a human-friendly way?”

5.4 Concepts You Must Understand First

  1. SBValue traversal
    • Book: Building a Debugger - Ch. 2-3
  2. Pointers and structs
    • Book: Understanding and Using C Pointers - Ch. 4
  3. Python formatting
    • Book: Fluent Python - Ch. 7-9

5.5 Questions to Guide Your Design

  1. How will you traverse the list without infinite loops?
  2. How do you detect null pointers?
  3. How do you scope the formatter to Node * only?

5.6 Thinking Exercise

Draw the linked list in memory and annotate the pointer chain. Predict how LLDB will show it without a formatter.

5.7 The Interview Questions They’ll Ask

  1. What is the difference between a summary provider and synthetic children?
  2. How do you register a formatter in LLDB?
  3. How do you guard against cycles in data structures?
  4. Why should formatters avoid heavy computation?

5.8 Hints in Layers

Hint 1: Start with a simple summary

(lldb) type summary add --summary-string "nodes = ${var}" "Node *"

Hint 2: Use GetChildMemberWithName

next_ptr = valobj.GetChildMemberWithName("next")

Hint 3: Add a safety limit

max_nodes = 100
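
Putting Hints 2 and 3 together, a capped traversal might look like the sketch below. It assumes the struct's link field is named next (as in Hint 2) and that following a pointer-typed SBValue with GetChildMemberWithName reaches the pointee's members. The helper name count_nodes and the "more than" wording are illustrative choices.

```python
MAX_NODES = 100  # traversal cap from Hint 3


def count_nodes(head, max_nodes=MAX_NODES):
    """Return (count, truncated) for the list starting at `head`."""
    count = 0
    node = head
    # Stop on an unreadable value or a null pointer.
    while node.IsValid() and node.GetValueAsUnsigned(0) != 0:
        if count >= max_nodes:
            # More nodes remain, or the list is cyclic: give up here.
            return count, True
        count += 1
        node = node.GetChildMemberWithName("next")
    return count, False


def node_summary(valobj, internal_dict):
    if valobj.GetValueAsUnsigned(0) == 0:
        return "(null)"
    count, truncated = count_nodes(valobj)
    if truncated:
        return "Linked list with more than %d nodes" % count
    return "Linked list with %d nodes" % count
```

The cap bounds the work done per display, so a corrupted or cyclic list produces a truncated summary rather than a hung debugger.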

5.9 Books That Will Help

Topic               Book                                Chapter
Debugger internals  Building a Debugger                 Ch. 2-3
Pointers & structs  Understanding and Using C Pointers  Ch. 4
Python scripting    Fluent Python                       Ch. 7-9

5.10 Implementation Phases

Phase 1: Minimal Summary (1-2 days)

  • Print a static summary string for Node *.

Phase 2: Dynamic Traversal (2-4 days)

  • Count nodes by walking the list.

Phase 3: Polish (1-2 days)

  • Handle null pointers and limit traversal.

5.11 Key Implementation Decisions

Decision         Options              Recommendation  Rationale
Traversal limit  Unlimited vs capped  Capped          Avoid infinite loops

6. Testing Strategy

6.1 Test Categories

Category          Purpose          Examples
Formatter output  Correct summary  “Linked list with 3 nodes”
Safety checks     Prevent loops    Capped traversal

6.2 Critical Test Cases

  1. Formatter works on a 3-node list.
  2. Formatter returns “(null)” on null pointers.
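
Both cases can be checked without launching LLDB by passing the formatter a stand-in object. The sketch below is one possible harness: FakeValue, make_list, and check_critical_cases are hypothetical helpers, and FakeValue implements only the few SBValue methods a list formatter is assumed to call.

```python
class FakeValue:
    """Stand-in for the SBValue methods a list formatter is likely to use."""

    def __init__(self, address=0, next_value=None):
        self._address = address
        self._next = next_value

    def IsValid(self):
        return True

    def GetValueAsUnsigned(self, fail_value=0):
        return self._address

    def GetChildMemberWithName(self, name):
        if name == "next" and self._next is not None:
            return self._next
        return FakeValue()  # behaves like a null pointer


def make_list(length):
    """Build a fake chain of `length` nodes ending in a null pointer."""
    head = FakeValue()
    for i in range(length, 0, -1):
        head = FakeValue(address=0x1000 + 16 * i, next_value=head)
    return head


def check_critical_cases(summary_fn):
    """Run the two critical test cases against a summary function."""
    assert summary_fn(make_list(3), {}) == "Linked list with 3 nodes"
    assert summary_fn(FakeValue(), {}) == "(null)"
```

Calling check_critical_cases with your summary function raises AssertionError if either case fails. A run inside a real LLDB session is still needed to confirm registration and type matching.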

7. Common Pitfalls & Debugging

Pitfall                Symptom            Solution
Formatter not applied  Raw pointer shown  Verify type name and registration
Infinite loop          LLDB hangs         Add traversal limit or visited set
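
The visited-set option can be sketched as follows. Where a plain cap only bounds the work, tracking visited addresses lets the formatter report that a cycle exists. The function name, the status strings, and the next field name are assumptions for illustration.

```python
def count_nodes_detecting_cycles(head, max_nodes=100):
    """Return (count, status); status is 'ok', 'cycle', 'capped' or 'unreadable'."""
    seen = set()  # addresses of nodes already visited
    node = head
    while node.IsValid():
        address = node.GetValueAsUnsigned(0)
        if address == 0:
            return len(seen), "ok"       # reached the end of the list
        if address in seen:
            return len(seen), "cycle"    # pointer chain loops back
        if len(seen) >= max_nodes:
            return len(seen), "capped"   # too long to walk in a summary
        seen.add(address)
        node = node.GetChildMemberWithName("next")
    return len(seen), "unreadable"       # a value could not be read
```

Keeping the cap alongside the visited set still matters: a long but acyclic list would otherwise make every display of the variable slow.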

8. Extensions & Challenges

  • Add synthetic children to show list elements as an array.
  • Support doubly-linked lists.
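
For the synthetic-children extension, a provider class along these lines is one starting point. It is a sketch under several assumptions: the payload field is named value (the guide does not name it), the link field is next, and the class name NodeListSynthProvider is hypothetical. Each payload is exposed as a child named [0], [1], and so on.

```python
class NodeListSynthProvider:
    """Synthetic children sketch: one child per node payload."""

    MAX_NODES = 100  # same safety cap as the summary provider

    def __init__(self, valobj, internal_dict):
        self.valobj = valobj
        self.nodes = []

    def update(self):
        # Rebuild the node cache, guarding against cycles and long lists.
        self.nodes = []
        seen = set()
        node = self.valobj
        while node.IsValid() and len(self.nodes) < self.MAX_NODES:
            address = node.GetValueAsUnsigned(0)
            if address == 0 or address in seen:
                break
            seen.add(address)
            self.nodes.append(node)
            node = node.GetChildMemberWithName("next")
        return False  # tell LLDB not to reuse previously cached children

    def has_children(self):
        return len(self.nodes) > 0

    def num_children(self):
        return len(self.nodes)

    def get_child_index(self, name):
        # Child names look like "[3]"; anything else is not ours.
        try:
            return int(name.lstrip("[").rstrip("]"))
        except ValueError:
            return -1

    def get_child_at_index(self, index):
        if index < 0 or index >= len(self.nodes):
            return None
        # "value" is an assumed payload field name.
        payload = self.nodes[index].GetChildMemberWithName("value")
        return self.valobj.CreateValueFromAddress(
            "[%d]" % index, payload.GetLoadAddress(), payload.GetType()
        )
```

Assuming the class lives in formatter.py, registration would look like `type synthetic add -l formatter.NodeListSynthProvider "Node *"`.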

9. Real-World Connections

  • LLDB ships built-in formatters for C++ standard library types, and large codebases such as LLVM ship formatters for their own data structures.
  • Good formatters make debugging sessions faster and less error-prone.

10. Resources

  • LLDB Data Formatters: https://lldb.llvm.org/use/variable.html
  • LLDB Python Reference: https://lldb.llvm.org/use/python-reference.html

11. Self-Assessment Checklist

  • I can register a summary provider.
  • I can traverse SBValue safely.
  • I can avoid infinite loops in the formatter.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Summary string appears next to the raw pointer value.

Full Completion:

  • Formatter counts nodes correctly and handles null.

Excellence (Going Above & Beyond):

  • Add synthetic children for list elements.

This guide was generated from LEARN_LLDB_DEEP_DIVE.md. For the full learning path, see the parent directory README.