Project 6: Custom Data Formatter
Build a linked list in C and teach LLDB to display it with a Python formatter.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Expert |
| Time Estimate | 1-2 weeks |
| Language | Python (LLDB formatter), C (debug target) |
| Prerequisites | Project 5, pointers, linked lists |
| Key Topics | SBValue, type summaries, synthetic children |
1. Learning Objectives
By completing this project, you will:
- Create a Python summary formatter for a C struct pointer.
- Use the SBValue API to traverse a linked list.
- Register a type formatter in LLDB.
- Optionally implement synthetic children for richer views.
2. Theoretical Foundation
2.1 Core Concepts
- SBValue: LLDB’s object for a runtime value, used to inspect members and dereference pointers.
- Type Summaries: One-line display strings for a type (e.g., Node * -> "Linked list with 3 nodes").
- Synthetic Children: Custom views that make complex structures appear as arrays or readable fields.
2.2 Why This Matters
Debugger productivity depends on how quickly you can understand data. A custom formatter turns raw pointers into meaningful summaries, saving time in every debug session.
2.3 Historical Context / Background
Formatters evolved from GDB’s pretty printers. LLDB provides a structured, Python-first API for defining summaries and synthetic children.
2.4 Common Misconceptions
- “Formatters are just for C++ STL”: They are most powerful for your own types.
- “You need a full debugger plugin”: A simple Python file is enough.
3. Project Specification
3.1 What You Will Build
A C program that builds a small linked list and stops at a breakpoint. A Python formatter will print a human-friendly summary when you inspect the list pointer in LLDB.
3.2 Functional Requirements
- Linked list: A C struct with value and next.
- Python summary: A formatter that counts nodes or prints values.
- Formatter registration: Use type summary add to activate.
3.3 Non-Functional Requirements
- Safety: Avoid infinite loops in the formatter.
- Performance: Limit traversal depth to a safe count.
- Clarity: Output should be short and readable.
3.4 Example Usage / Output
(lldb) command script import formatter.py
(lldb) type summary add --python-function formatter.LinkedListSummary "Node *"
(lldb) b 10
(lldb) run
(lldb) fr v head
(Node *) head = 0x00007ffee7c0f5e8 (Linked list with 3 nodes)
3.5 Real World Outcome
You will inspect head in LLDB and see a human-friendly summary instead of just a pointer address. Example:
(lldb) fr v head
(Node *) head = 0x00007ffee7c0f5e8 (Linked list with 3 nodes)
4. Solution Architecture
4.1 High-Level Design
C program -> Stop at breakpoint -> LLDB loads formatter -> Summary displays linked list info
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| linkedlist.c | Construct linked list | Keep list small and static |
| formatter.py | Provide summary function | Cap traversal depth |
| LLDB config | Register formatter | Use explicit type summary add |
4.3 Data Structures
typedef struct Node {
int value;
struct Node *next;
} Node;
4.4 Algorithm Overview
Key Algorithm: Linked List Summary
- Take an SBValue for Node *.
- Dereference to access value and next.
- Walk nodes until next == NULL or max depth.
- Return summary string.
Complexity Analysis:
- Time: O(n) for n nodes (bounded)
- Space: O(1)
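The steps above can be sketched as a count-only summary function. This is a minimal sketch rather than a definitive implementation: the MAX_NODES constant and the exact wording of the returned strings are assumptions, and it relies on LLDB letting a pointer-to-struct SBValue look up the pointee's members by name (calling Dereference() first is an equivalent, more explicit route).

```python
MAX_NODES = 50  # assumed cap, matching the depth suggested in this guide


def LinkedListSummary(valobj, internal_dict):
    """Summary for a Node *: count nodes by following next, with a cap."""
    # A null head pointer has the numeric value 0.
    if valobj.GetValueAsUnsigned() == 0:
        return "NULL (empty list)"

    count = 0
    node = valobj
    while node.IsValid() and node.GetValueAsUnsigned() != 0:
        if count >= MAX_NODES:
            # More nodes remain, but we refuse to walk further.
            return "Linked list with more than %d nodes (traversal capped)" % MAX_NODES
        count += 1
        # Member lookup on a pointer-to-struct value reaches the pointee's field.
        node = node.GetChildMemberWithName("next")

    return "Linked list with %d node%s" % (count, "" if count == 1 else "s")
```

The cap is checked before each increment, so a corrupted or circular list costs at most MAX_NODES steps and the summary says plainly that it stopped early.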
5. Implementation Guide
5.1 Development Environment Setup
clang -g -o linkedlist linkedlist.c
5.2 Project Structure
project-root/
├── linkedlist.c
├── formatter.py
└── README.md
5.3 The Core Question You’re Answering
“How can I teach LLDB to understand my program’s data structures?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- Pointers and Linked Lists
  - How does next link nodes?
  - How do you detect NULL?
  - Book Reference: “C Interfaces and Implementations” Ch. 3
- SBValue API
  - How do you access struct fields?
  - How do you dereference a pointer?
- Formatter Registration
  - How does type summary add match types?
5.5 Questions to Guide Your Design
Before implementing, think through these:
- Should the summary show count or values?
- How will you avoid infinite loops on corrupted lists?
- What should the formatter show for NULL?
5.6 Thinking Exercise
Design the Output
Write two candidate summaries: one with just a node count, another with the first few values. Decide which is more useful during debugging.
5.7 The Interview Questions They’ll Ask
Prepare to answer these:
- “What is a data formatter in LLDB?”
- “How does SBValue represent a pointer?”
- “Why would you limit traversal depth?”
5.8 Hints in Layers
Hint 1: Start with count-only
Return a summary like Linked list with 3 nodes before attempting value lists.
Hint 2: Use GetChildMemberWithName
next_val = valobj.GetChildMemberWithName("next")
Hint 3: Cap traversal
Stop after, say, 50 nodes to avoid infinite loops.
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Linked lists | “C Interfaces and Implementations” | Ch. 3 |
| Debugging data structures | “Advanced Apple Debugging & Reverse Engineering” | Ch. 6 |
| LLDB API | LLDB Python docs | Summary formatters section |
5.10 Implementation Phases
Phase 1: Foundation (2-3 days)
Goals:
- Build the linked list program.
- Stop at a breakpoint.
Tasks:
- Write linkedlist.c with a 3-node list.
- Compile with -g.
- Confirm LLDB stops at the breakpoint.
Checkpoint: You can inspect head and see a raw pointer.
Phase 2: Core Functionality (3-5 days)
Goals:
- Implement the Python summary formatter.
- Register it in LLDB.
Tasks:
- Write formatter.py with LinkedListSummary.
- Import with command script import.
- Add type summary and verify output.
Checkpoint: LLDB prints the summary string for Node *.
Phase 3: Polish & Edge Cases (2-4 days)
Goals:
- Handle NULL and corrupted lists.
- Optionally add synthetic children.
Tasks:
- Return a short label such as "NULL" or "empty" for null pointers.
- Add max traversal depth.
- (Optional) Provide children for values.
Checkpoint: Formatter behaves safely on bad data.
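For the optional synthetic children task, LLDB expects a Python class with a fixed set of methods (update, num_children, has_children, get_child_index, get_child_at_index). The following is a hedged sketch, not a reference implementation: the class name NodeListProvider and the MAX_CHILDREN cap are hypothetical, and it presents each node's value field as a child named [0], [1], and so on.

```python
MAX_CHILDREN = 50  # assumed cap, same idea as the summary's depth limit


class NodeListProvider:
    """Expose each node's value as synthetic children named [0], [1], ..."""

    def __init__(self, valobj, internal_dict):
        self.valobj = valobj
        self.nodes = []

    def update(self):
        # Re-walk the list whenever LLDB refreshes the variable.
        self.nodes = []
        node = self.valobj
        while (node.IsValid() and node.GetValueAsUnsigned() != 0
               and len(self.nodes) < MAX_CHILDREN):
            self.nodes.append(node)
            node = node.GetChildMemberWithName("next")
        return False  # do not reuse cached children across stops

    def num_children(self):
        return len(self.nodes)

    def has_children(self):
        return True

    def get_child_index(self, name):
        # Child names look like "[3]"; anything else is unknown.
        try:
            return int(name.strip("[]"))
        except ValueError:
            return -1

    def get_child_at_index(self, index):
        if index < 0 or index >= len(self.nodes):
            return None
        value = self.nodes[index].GetChildMemberWithName("value")
        # Build a new value with an index-style name at the field's address.
        return self.valobj.CreateValueFromAddress(
            "[%d]" % index, value.GetLoadAddress(), value.GetType())
```

Assuming the class lives in formatter.py, it could be attached with type synthetic add --python-class formatter.NodeListProvider "Node *".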
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Summary content | Count vs values | Count (optionally plus the first 3 values) | Short and safe |
| Traversal depth | Unlimited vs capped | Capped (50) | Prevent infinite loops |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Formatter Load | Ensure import works | command script import |
| Summary Display | Validate output | fr v head |
| Edge Cases | NULL or corrupted list | head = NULL |
6.2 Critical Test Cases
- Summary appears: fr v head shows the summary string.
- NULL handling: a null head prints a safe message.
- Depth cap: a list longer than the cap stops at max depth.
6.3 Test Data
Linked list with 3 nodes
7. Common Pitfalls & Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Wrong type name | Formatter not used | Match exact type string |
| Infinite loop | LLDB hangs | Add depth cap |
| Missing debug symbols | Fields invisible | Compile with -g |
7.2 Debugging Strategies
- Validate SBValue: Check IsValid() and GetTypeName().
- Print debug logs: Use lldb.debugger.GetCommandInterpreter().HandleCommand to log.
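A small helper can make the validation step concrete. This sketch uses a hypothetical function name, describe_value, and in addition to IsValid() and GetTypeName() it calls GetName() and GetValue(), which are SBValue methods not mentioned above. Calling it from inside the summary function while developing shows what LLDB actually handed to the formatter.

```python
def describe_value(valobj):
    """Return a one-line description of an SBValue for debugging a formatter."""
    if not valobj.IsValid():
        return "<invalid SBValue>"
    # Type name, variable name, and the raw value string (the address for a pointer).
    return "%s %s = %s" % (valobj.GetTypeName(), valobj.GetName(), valobj.GetValue())
```

If the reported type name is not the string passed to type summary add, that mismatch is a likely reason the formatter is not being used.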
7.3 Performance Traps
Deep traversal on large lists can slow LLDB; keep summaries short.
8. Extensions & Challenges
8.1 Beginner Extensions
- Print the first 3 node values in the summary.
- Add a formatter for a binary tree node.
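For the first extension, one possible shape is a summary that reports the count plus a short preview of values. The function name LinkedListValuesSummary, the PREVIEW constant, and the output format are all assumptions made for illustration; the value field is read with GetValueAsSigned() because the struct declares it as int.

```python
MAX_NODES = 50  # assumed traversal cap
PREVIEW = 3     # how many leading values to show


def LinkedListValuesSummary(valobj, internal_dict):
    """Summary showing the node count and the first few values."""
    if valobj.GetValueAsUnsigned() == 0:
        return "NULL (empty list)"

    preview = []
    count = 0
    node = valobj
    while node.IsValid() and node.GetValueAsUnsigned() != 0 and count < MAX_NODES:
        if count < PREVIEW:
            value = node.GetChildMemberWithName("value")
            preview.append(str(value.GetValueAsSigned()))
        count += 1
        node = node.GetChildMemberWithName("next")

    # If nodes remain after the loop, the walk stopped at the cap.
    capped = node.IsValid() and node.GetValueAsUnsigned() != 0
    shown = " -> ".join(preview) + (" -> ..." if count > PREVIEW else "")
    return "count=%d%s, values: %s" % (count, "+" if capped else "", shown)
```

Only the first PREVIEW nodes have their value read, so the extra cost over the count-only summary stays small even near the cap.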
8.2 Intermediate Extensions
- Implement synthetic children for indexed access.
- Detect cycles in the list and label them.
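Cycle detection can be sketched by remembering the address of every node already visited. This is one simple approach under stated assumptions, not the only option: the function name LinkedListCycleSummary and the message wording are hypothetical, and the visited set raises the space cost from O(1) to O(n), bounded by the same cap.

```python
MAX_NODES = 50  # assumed traversal cap


def LinkedListCycleSummary(valobj, internal_dict):
    """Count nodes and label the list if next leads back to a visited node."""
    seen = set()
    count = 0
    node = valobj
    while node.IsValid():
        addr = node.GetValueAsUnsigned()
        if addr == 0:
            return "Linked list with %d nodes" % count
        if addr in seen:
            # The pointer value repeats, so the list loops back on itself.
            return "Linked list with %d nodes (cycle back to 0x%x)" % (count, addr)
        if count >= MAX_NODES:
            return "Linked list with more than %d nodes (traversal capped)" % MAX_NODES
        seen.add(addr)
        count += 1
        node = node.GetChildMemberWithName("next")
    return "Linked list with %d nodes (next could not be read)" % count
```

Unlike the plain depth cap, this reports a short cycle after one lap instead of walking it until the limit.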
8.3 Advanced Extensions
- Write formatters for a custom container library.
- Integrate formatter loading in ~/.lldbinit for a project.
9. Real-World Connections
9.1 Industry Applications
- Complex data debugging: Custom formatters for internal types.
- Team workflows: Share formatters across teams to standardize debugging.
9.2 Related Open Source Projects
- LLDB Formatters: https://lldb.llvm.org/use/variable.html#python-based-formatters
- Swift LLDB formatters: https://github.com/apple/swift-lldb
9.3 Interview Relevance
- Shows advanced debugger knowledge and tooling sophistication.
10. Resources
10.1 Essential Reading
- LLDB Data Formatters - https://lldb.llvm.org/use/variable.html#python-based-formatters
- C Interfaces and Implementations by David Hanson - Ch. 3
10.2 Video Resources
- LLDB data formatter demos - LLDB community videos
10.3 Tools & Documentation
- type summary add: https://lldb.llvm.org/use/command.html#type
10.4 Related Projects in This Series
- LLDB Python Scripting: the prerequisite for formatter work.
11. Self-Assessment Checklist
11.1 Understanding
- I can explain what SBValue represents.
- I can describe how a type summary is registered.
- I can explain why depth limits are needed.
11.2 Implementation
- Formatter loads and activates for Node *.
- Summary output is correct and stable.
- NULL lists are handled safely.
11.3 Growth
- I can extend the formatter to new types.
- I can share a formatter package with a team.
12. Submission / Completion Criteria
Minimum Viable Completion:
- Build a linked list program and register a summary formatter.
- fr v head shows a human-readable summary.
Full Completion:
- Handle NULL and long lists safely.
- Document how to load the formatter automatically.
Excellence (Going Above & Beyond):
- Add synthetic children for indexed access.
- Provide formatters for multiple data structures.
This guide was generated from LEARN_LLDB_DEEP_DIVE.md. For the complete learning path, see the parent directory.