Project 6: Custom Data Formatter
Teach LLDB to print your data structures in a human-friendly way.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Expert |
| Time Estimate | 1-2 weeks |
| Language | Python (LLDB data formatters) |
| Prerequisites | Project 5, pointers, Python scripting |
| Key Topics | SBValue, summaries, synthetic children |
1. Learning Objectives
By completing this project, you will:
- Write a Python summary provider for a custom struct.
- Register the formatter in LLDB.
- Traverse a linked list safely with SBValue.
- (Optional) Add synthetic children for richer output.
2. Theoretical Foundation
2.1 Core Concepts
- Summary provider: Returns a one-line description for a type.
- Synthetic children: Expose custom child elements for pretty printing.
- SBValue: The LLDB object representing a value in the debug session.
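To make the summary provider and SBValue concepts concrete, here is a minimal sketch of the smallest useful provider. The function name `node_summary` and the output wording are assumptions for illustration; the two-argument signature and `GetValueAsUnsigned` are what LLDB passes and exposes for Python summary functions.

```python
def node_summary(valobj, internal_dict):
    """Return a one-line summary for a `Node *` value.

    LLDB passes `valobj` as an SBValue. `internal_dict` is LLDB's
    per-session scripting dictionary and is unused in this sketch.
    """
    # For a pointer, the unsigned value is the address it holds.
    address = valobj.GetValueAsUnsigned(0)
    if address == 0:
        return "(null)"
    return "Node at 0x%x" % address
```

Because the function only calls methods on the object it is given, it can be exercised outside LLDB with any stand-in that offers the same method.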
2.2 Why This Matters
Readable output shortens debugging sessions because you see program state directly instead of decoding it by hand. A good formatter turns a raw pointer such as “0x7ffeefbff5e8” into a description like “Linked list with 3 nodes”.
2.3 Common Misconceptions
- “Formatters are cosmetic.” (They change how fast you understand state.)
- “Formatters are fragile.” (With good guards, they are robust.)
3. Project Specification
3.1 What You Will Build
A C program with a linked list and a Python formatter that prints the list size next to the raw pointer value.
3.2 Functional Requirements
- LLDB prints a summary like “Linked list with 3 nodes”.
- The formatter handles null pointers safely.
- The formatter can be enabled/disabled.
3.3 Non-Functional Requirements
- Safety: Avoid infinite loops in malformed lists.
- Clarity: Output should be easy to scan.
3.4 Example Usage / Output
(lldb) command script import formatter.py
(lldb) fr v head
(Node *) head = 0x00007ffeefbff5e8 (Linked list with 3 nodes)
3.5 Real World Outcome
(lldb) command script import formatter.py
(lldb) b 10
(lldb) run
(lldb) fr v head
(Node *) head = 0x00007ffeefbff5e8 (Linked list with 3 nodes)
4. Solution Architecture
4.1 High-Level Design
LLDB -> SBValue -> Formatter -> Summary String
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| C program | Provide data structure | Linked list is simple and clear |
| Formatter | Generate summary | Count nodes safely |
| LLDB config | Register formatter | Use type summary add |
5. Implementation Guide
5.1 Development Environment Setup
clang -g -O0 -o linkedlist linkedlist.c
lldb ./linkedlist
5.2 Project Structure
project-root/
|-- linkedlist.c
|-- formatter.py
`-- notes.md
5.3 The Core Question You’re Answering
“How can I make LLDB show my data structures in a human-friendly way?”
5.4 Concepts You Must Understand First
- SBValue traversal
  - Book: Building a Debugger - Ch. 2-3
- Pointers and structs
  - Book: Understanding and Using C Pointers - Ch. 4
- Python formatting
  - Book: Fluent Python - Ch. 7-9
5.5 Questions to Guide Your Design
- How will you traverse the list without infinite loops?
- How do you detect null pointers?
- How do you scope the formatter to Node * only?
5.6 Thinking Exercise
Draw the linked list in memory and annotate the pointer chain. Predict how LLDB will show it without a formatter.
5.7 The Interview Questions They’ll Ask
- What is the difference between a summary provider and synthetic children?
- How do you register a formatter in LLDB?
- How do you guard against cycles in data structures?
- Why should formatters avoid heavy computation?
5.8 Hints in Layers
Hint 1: Start with a simple summary
(lldb) type summary add --summary-string "nodes = ${var}" "Node *"
Hint 2: Use GetChildMemberWithName
next_ptr = valobj.GetChildMemberWithName("next")
Hint 3: Add a safety limit
max_nodes = 100
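Putting the three hints together, a capped traversal might look like the sketch below. The helper name `count_nodes`, the tuple it returns, and the "more than" wording are assumptions; the sketch also assumes the struct's link field is called `next` and that asking a `Node *` value for that member reads it through the pointer.

```python
MAX_NODES = 100  # safety cap from Hint 3


def count_nodes(head, max_nodes=MAX_NODES):
    """Walk `next` pointers from `head`; return (count, truncated)."""
    count = 0
    node = head
    while node.IsValid() and node.GetValueAsUnsigned(0) != 0:
        if count >= max_nodes:
            # Another node exists beyond the cap, so stop counting.
            return count, True
        count += 1
        node = node.GetChildMemberWithName("next")
    return count, False


def node_summary(valobj, internal_dict):
    if valobj.GetValueAsUnsigned(0) == 0:
        return "(null)"
    count, truncated = count_nodes(valobj)
    if truncated:
        return "Linked list with more than %d nodes" % count
    noun = "node" if count == 1 else "nodes"
    return "Linked list with %d %s" % (count, noun)
```

The cap bounds the work done per summary, which matters because LLDB may call the function every time the variable is displayed.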
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Debugger internals | Building a Debugger | Ch. 2-3 |
| Pointers & structs | Understanding and Using C Pointers | Ch. 4 |
| Python scripting | Fluent Python | Ch. 7-9 |
5.10 Implementation Phases
Phase 1: Minimal Summary (1-2 days)
- Print a static summary string for Node *.
Phase 2: Dynamic Traversal (2-4 days)
- Count nodes by walking the list.
Phase 3: Polish (1-2 days)
- Handle null pointers and limit traversal.
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Traversal limit | Unlimited vs capped | Capped | Avoid infinite loops |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Formatter output | Correct summary | “Linked list with 3 nodes” |
| Safety checks | Prevent loops | Capped traversal |
6.2 Critical Test Cases
- Formatter works on a 3-node list.
- Formatter returns “(null)” on null pointers.
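Both cases can also be checked without launching LLDB by handing the provider a stand-in object. The sketch below is one possible harness: `FakePointer`, `build_list`, and `check_provider` are hypothetical test-only names, and the fake models only the three SBValue methods a simple traversal is likely to call.

```python
class FakePointer:
    """Stand-in for an SBValue holding a `Node *` (test-only)."""

    def __init__(self, address, next_ptr=None):
        self.address = address
        self.next_ptr = next_ptr

    def IsValid(self):
        return True

    def GetValueAsUnsigned(self, fail_value=0):
        return self.address

    def GetChildMemberWithName(self, name):
        # Only the `next` field is modeled; a missing link reads as NULL.
        if name == "next" and self.next_ptr is not None:
            return self.next_ptr
        return FakePointer(0)


def build_list(length):
    """Return the head of a fake list with `length` nodes (NULL if 0)."""
    head = FakePointer(0)
    for i in range(length, 0, -1):
        head = FakePointer(0x1000 + 16 * i, head)
    return head


def check_provider(summary_fn):
    """Run the two critical cases; return a list of failure messages."""
    cases = [
        (build_list(3), "Linked list with 3 nodes"),
        (build_list(0), "(null)"),
    ]
    failures = []
    for value, expected in cases:
        actual = summary_fn(value, {})
        if actual != expected:
            failures.append("expected %r, got %r" % (expected, actual))
    return failures
```

Calling `check_provider` with your summary function returns an empty list when both cases pass. A fake cannot reproduce real debug-info behavior, so a run inside LLDB against the compiled program is still needed.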
7. Common Pitfalls & Debugging
| Pitfall | Symptom | Solution |
|---|---|---|
| Formatter not applied | Raw pointer shown | Verify type name and registration |
| Infinite loop | LLDB hangs | Add traversal limit or visited set |
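The visited-set option from the table can be sketched as follows. The function name and the status strings are assumptions for illustration; the idea is to remember each node address and stop as soon as one repeats, with a cap kept as a second line of defense.

```python
def count_nodes_detecting_cycles(head, max_nodes=100):
    """Return (count, status); status is "ok", "cycle", "capped", or "unreadable"."""
    seen = set()
    node = head
    while node.IsValid():
        address = node.GetValueAsUnsigned(0)
        if address == 0:
            return len(seen), "ok"      # reached the NULL terminator
        if address in seen:
            return len(seen), "cycle"   # this node was already visited
        if len(seen) >= max_nodes:
            return len(seen), "capped"  # list is longer than the cap
        seen.add(address)
        node = node.GetChildMemberWithName("next")
    # An invalid value usually means the `next` member could not be read.
    return len(seen), "unreadable"
```

Reporting the status lets the summary say something like "cycle after 2 nodes" instead of silently showing a capped count.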
8. Extensions & Challenges
- Add synthetic children to show list elements as an array.
- Support doubly-linked lists.
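For the first extension, a synthetic children provider is a Python class with a small set of methods that LLDB calls. The sketch below is one possible shape, not a definitive implementation: the class name `NodeListSyntheticProvider` and the cap are assumptions, and it again assumes the link field is named `next`.

```python
MAX_CHILDREN = 100  # hypothetical cap on how many nodes are exposed


class NodeListSyntheticProvider:
    """Present the nodes reachable from a `Node *` as children [0], [1], ..."""

    def __init__(self, valobj, internal_dict):
        self.valobj = valobj
        self.addresses = []
        self.node_type = None

    def update(self):
        # Re-walk the list whenever LLDB refreshes the value.
        self.node_type = self.valobj.GetType().GetPointeeType()
        self.addresses = []
        seen = set()
        node = self.valobj
        while node.IsValid() and len(self.addresses) < MAX_CHILDREN:
            address = node.GetValueAsUnsigned(0)
            if address == 0 or address in seen:
                break  # end of list, or a cycle
            seen.add(address)
            self.addresses.append(address)
            node = node.GetChildMemberWithName("next")
        return False  # ask LLDB not to reuse a cached child list

    def has_children(self):
        return len(self.addresses) > 0

    def num_children(self):
        return len(self.addresses)

    def get_child_index(self, name):
        # Child names look like "[0]", "[1]", ...
        try:
            index = int(name.strip("[]"))
        except ValueError:
            return -1
        return index if 0 <= index < len(self.addresses) else -1

    def get_child_at_index(self, index):
        if index < 0 or index >= len(self.addresses):
            return None
        # Build a Node value located at the remembered address.
        return self.valobj.CreateValueFromAddress(
            "[%d]" % index, self.addresses[index], self.node_type
        )
```

Assuming the file is imported as `formatter`, registration would look like `type synthetic add -l formatter.NodeListSyntheticProvider "Node *"`. If a summary and a synthetic provider are both registered for the same type, the summary function may need to start from `valobj.GetNonSyntheticValue()` so that it still sees the raw `next` field.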
9. Real-World Connections
- Large C++ codebases often ship LLDB formatters for STL types.
- Good formatters make debugging sessions faster and less error-prone.
10. Resources
- LLDB Data Formatters: https://lldb.llvm.org/use/variable.html
- LLDB Python Reference: https://lldb.llvm.org/use/python-reference.html
11. Self-Assessment Checklist
- I can register a summary provider.
- I can traverse SBValue safely.
- I can avoid infinite loops in the formatter.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Summary string appears next to the raw pointer value.
Full Completion:
- Formatter counts nodes correctly and handles null.
Excellence (Going Above & Beyond):
- Add synthetic children for list elements.
This guide was generated from LEARN_LLDB_DEEP_DIVE.md. For the full learning path, see the parent directory README.