Project 6: Custom Data Formatter

Build a linked list in C and teach LLDB to display it with a Python formatter.

Quick Reference

Attribute | Value
Difficulty | Expert
Time Estimate | 1-2 weeks
Language | Python (LLDB formatter), C (debug target)
Prerequisites | Project 5, pointers, linked lists
Key Topics | SBValue, type summaries, synthetic children

1. Learning Objectives

By completing this project, you will:

  1. Create a Python summary formatter for a C struct pointer.
  2. Use the SBValue API to traverse a linked list.
  3. Register a type formatter in LLDB.
  4. Optionally implement synthetic children for richer views.

2. Theoretical Foundation

2.1 Core Concepts

  • SBValue: LLDB’s object for a runtime value, used to inspect members and dereference pointers.
  • Type Summaries: One-line display strings for a type (e.g., Node * -> "Linked list with 3 nodes").
  • Synthetic Children: Custom views that make complex structures appear as arrays or readable fields.

2.2 Why This Matters

Debugger productivity depends on how quickly you can understand data. A custom formatter turns raw pointers into meaningful summaries, saving time in every debug session.

2.3 Historical Context / Background

LLDB’s data formatters fill the same role as GDB’s Python pretty printers. LLDB exposes them through a scriptable Python API for defining summaries and synthetic children.

2.4 Common Misconceptions

  • “Formatters are just for C++ STL”: They are most powerful for your own types.
  • “You need a full debugger plugin”: A simple Python file is enough.

3. Project Specification

3.1 What You Will Build

A C program that builds a small linked list and stops at a breakpoint. A Python formatter will print a human-friendly summary when you inspect the list pointer in LLDB.

3.2 Functional Requirements

  1. Linked list: A C struct with value and next.
  2. Python summary: A formatter that counts nodes or prints values.
  3. Formatter registration: Use type summary add to activate.

3.3 Non-Functional Requirements

  • Safety: Avoid infinite loops in the formatter.
  • Performance: Limit traversal depth to a safe count.
  • Clarity: Output should be short and readable.

3.4 Example Usage / Output

(lldb) command script import formatter.py
(lldb) type summary add --python-function formatter.LinkedListSummary "Node *"
(lldb) b 10
(lldb) run
(lldb) fr v head
(Node *) head = 0x00007ffee7c0f5e8 Linked list with 3 nodes

3.5 Real World Outcome

You will inspect head in LLDB and see a human-friendly summary instead of just a pointer address. Example:

(lldb) fr v head
(Node *) head = 0x00007ffee7c0f5e8 Linked list with 3 nodes

4. Solution Architecture

4.1 High-Level Design

C program -> Stop at breakpoint -> LLDB loads formatter -> Summary displays linked list info

4.2 Key Components

Component | Responsibility | Key Decisions
linkedlist.c | Construct linked list | Keep list small and static
formatter.py | Provide summary function | Cap traversal depth
LLDB config | Register formatter | Use explicit type summary add

4.3 Data Structures

typedef struct Node {
    int value;
    struct Node *next;
} Node;

4.4 Algorithm Overview

Key Algorithm: Linked List Summary

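  1. Take the SBValue for the Node * being summarized.
  2. Read value and next through the pointer (GetChildMemberWithName works on a pointer to a struct).
  3. Follow next until the pointer is NULL or the depth cap is reached.
  4. Return the summary string.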

Complexity Analysis:

  • Time: O(n) for n nodes (bounded)
  • Space: O(1)
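
The steps above can be sketched as a count-only summary function. This is a minimal sketch, not a definitive implementation: the name LinkedListSummary follows the example session, while MAX_NODES and the exact wording of the returned strings are assumptions chosen for illustration.

```python
MAX_NODES = 50  # assumed traversal cap; pick whatever bound suits your lists


def LinkedListSummary(valobj, internal_dict):
    """Summary for a `Node *`: count nodes until NULL or the cap is hit."""
    node = valobj
    count = 0
    # A NULL pointer is still a valid SBValue; its unsigned value is 0.
    while node.IsValid() and node.GetValueAsUnsigned(0) != 0:
        if count >= MAX_NODES:
            # Either a very long list or a cycle; stop instead of hanging LLDB.
            return "Linked list with more than %d nodes" % MAX_NODES
        count += 1
        # On a pointer-to-struct SBValue this reads the pointee's member.
        node = node.GetChildMemberWithName("next")
    if count == 0:
        return "empty list (NULL)"
    return "Linked list with %d node%s" % (count, "" if count == 1 else "s")
```

The function only calls IsValid, GetValueAsUnsigned, and GetChildMemberWithName, so it can be exercised outside LLDB with small stand-in objects that provide those three methods.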

5. Implementation Guide

5.1 Development Environment Setup

clang -g -o linkedlist linkedlist.c

5.2 Project Structure

project-root/
├── linkedlist.c
├── formatter.py
└── README.md

5.3 The Core Question You’re Answering

“How can I teach LLDB to understand my program’s data structures?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. Pointers and Linked Lists
    • How does next link nodes?
    • How do you detect NULL?
    • Book Reference: “C Interfaces and Implementations” Ch. 7 (Lists)
  2. SBValue API
    • How do you access struct fields?
    • How do you dereference a pointer?
  3. Formatter Registration
    • How does type summary add match types?

5.5 Questions to Guide Your Design

Before implementing, think through these:

  1. Should the summary show count or values?
  2. How will you avoid infinite loops on corrupted lists?
  3. What should the formatter show for NULL?

5.6 Thinking Exercise

Design the Output

Write two candidate summaries: one with just a node count, another with the first few values. Decide which is most useful during debugging.
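
To compare the two candidates concretely, it can help to render both from the same list of values before touching LLDB. The helper names count_summary and values_summary and the output layouts below are hypothetical; they only illustrate the trade-off between brevity and detail.

```python
def count_summary(values):
    """Candidate 1: node count only."""
    n = len(values)
    return "Linked list with %d node%s" % (n, "" if n == 1 else "s")


def values_summary(values, shown=3):
    """Candidate 2: the first few values plus the total length."""
    head = ", ".join(str(v) for v in values[:shown])
    if len(values) > shown:
        head += ", ..."  # signal that the list continues
    return "[%s] (len=%d)" % (head, len(values))


if __name__ == "__main__":
    sample = [10, 20, 30, 40, 50]
    print(count_summary(sample))
    print(values_summary(sample))
```

The count-only form stays the same width for any list, while the values form answers "what is in it?" at the cost of a longer line.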

5.7 The Interview Questions They’ll Ask

Prepare to answer these:

  1. “What is a data formatter in LLDB?”
  2. “How does SBValue represent a pointer?”
  3. “Why would you limit traversal depth?”

5.8 Hints in Layers

Hint 1: Start with count-only

Return a summary like Linked list with 3 nodes before attempting value lists.

Hint 2: Use GetChildMemberWithName

next_val = valobj.GetChildMemberWithName("next")

Hint 3: Cap traversal

Stop after, say, 50 nodes to avoid infinite loops.

5.9 Books That Will Help

Topic | Book | Chapter
Linked lists | “C Interfaces and Implementations” | Ch. 7
Debugging data structures | “Advanced Apple Debugging & Reverse Engineering” | Ch. 6
LLDB API | LLDB Python docs | Summary formatters section

5.10 Implementation Phases

Phase 1: Foundation (2-3 days)

Goals:

  • Build the linked list program.
  • Stop at a breakpoint.

Tasks:

  1. Write linkedlist.c with a 3-node list.
  2. Compile with -g.
  3. Confirm LLDB stops at the breakpoint.

Checkpoint: You can inspect head and see a raw pointer.

Phase 2: Core Functionality (3-5 days)

Goals:

  • Implement the Python summary formatter.
  • Register it in LLDB.

Tasks:

  1. Write formatter.py with LinkedListSummary.
  2. Import with command script import.
  3. Add type summary and verify output.

Checkpoint: LLDB prints the summary string for Node *.

Phase 3: Polish & Edge Cases (2-4 days)

Goals:

  • Handle NULL and corrupted lists.
  • Optionally add synthetic children.

Tasks:

  1. Return a short string such as "NULL" or "empty list" for null pointers.
  2. Add max traversal depth.
  3. (Optional) Provide children for values.

Checkpoint: Formatter behaves safely on bad data.
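
For the optional synthetic children task, LLDB expects a Python class with a fixed set of methods (num_children, get_child_index, get_child_at_index, update). The following is a sketch under stated assumptions: the class name NodeListSyntheticProvider and MAX_CHILDREN are made up for illustration, and each node's value member is shown as a child named [0], [1], and so on.

```python
MAX_CHILDREN = 50  # assumed cap, mirroring the summary's traversal limit


class NodeListSyntheticProvider:
    """Presents a `Node *` list as indexed children [0], [1], ..."""

    def __init__(self, valobj, internal_dict):
        self.valobj = valobj
        self.values = []
        self.update()

    def update(self):
        # Rebuild the child list by walking the list up to the cap.
        self.values = []
        node = self.valobj
        while (node.IsValid() and node.GetValueAsUnsigned(0) != 0
               and len(self.values) < MAX_CHILDREN):
            value = node.GetChildMemberWithName("value")
            # Clone gives the child a display name such as "[0]".
            self.values.append(value.Clone("[%d]" % len(self.values)))
            node = node.GetChildMemberWithName("next")
        return False  # ask LLDB to call update() again at the next stop

    def num_children(self):
        return len(self.values)

    def has_children(self):
        return True

    def get_child_index(self, name):
        try:
            return int(name.lstrip("[").rstrip("]"))
        except ValueError:
            return -1

    def get_child_at_index(self, index):
        if 0 <= index < len(self.values):
            return self.values[index]
        return None
```

Assuming the class lives in formatter.py, it would be registered with type synthetic add --python-class formatter.NodeListSyntheticProvider "Node *". If a summary is registered for the same type, the summary function may need valobj.GetNonSyntheticValue() to reach the raw next member.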

5.11 Key Implementation Decisions

Decision | Options | Recommendation | Rationale
Summary content | Count vs values | Count (plus first 3 values optional) | Short and safe
Traversal depth | Unlimited vs capped | Capped (50) | Prevent infinite loops

6. Testing Strategy

6.1 Test Categories

Category | Purpose | Examples
Formatter Load | Ensure import works | command script import
Summary Display | Validate output | fr v head
Edge Cases | NULL or corrupted list | head = NULL

6.2 Critical Test Cases

  1. Summary appears: fr v head prints the summary string after the pointer value.
  2. NULL handling: A NULL head prints a safe message instead of raising an error.
  3. Depth cap: A list longer than the cap stops at the max depth.

6.3 Test Data

Linked list with 3 nodes

7. Common Pitfalls & Debugging

7.1 Frequent Mistakes

Pitfall | Symptom | Solution
Wrong type name | Formatter not used | Match exact type string
Infinite loop | LLDB hangs | Add depth cap
Missing debug symbols | Fields invisible | Compile with -g

7.2 Debugging Strategies

  • Validate SBValue: Check IsValid() and GetTypeName().
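  • Print debug logs: Call print() inside the formatter; the output typically appears in the LLDB console. Avoid relying on lldb.debugger there, since that convenience variable is only meant for the interactive script interpreter.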
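
A small helper makes the validity check routine. The name describe is hypothetical; it only uses the SBValue methods IsValid, GetName, GetTypeName, and GetValue, and is meant as a debugging aid rather than part of the formatter's output.

```python
def describe(valobj):
    """Return a one-line description of an SBValue for debugging."""
    if valobj is None or not valobj.IsValid():
        return "<invalid SBValue>"
    return "name=%s type=%s value=%s" % (
        valobj.GetName(),      # variable or member name, e.g. "head"
        valobj.GetTypeName(),  # the string LLDB matches formatters against
        valobj.GetValue(),     # the raw value as text, e.g. a pointer address
    )
```

Printing describe(valobj) at the top of a summary function shows the type name LLDB actually sees, which helps when a formatter is silently not applied.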

7.3 Performance Traps

Deep traversal on large lists can slow LLDB; keep summaries short.


8. Extensions & Challenges

8.1 Beginner Extensions

  • Print the first 3 node values in the summary.
  • Add a formatter for a binary tree node.

8.2 Intermediate Extensions

  • Implement synthetic children for indexed access.
  • Detect cycles in the list and label them.
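
One way to approach cycle detection is to remember every node address already visited and stop when an address repeats. This is a sketch of that visited-set technique, not the only option; find_cycle and max_nodes are hypothetical names, and the memory used grows with the number of nodes walked.

```python
def find_cycle(head, max_nodes=1000):
    """Walk a `Node *` SBValue; return (nodes_seen, is_cyclic)."""
    seen = set()
    node = head
    while (node.IsValid() and node.GetValueAsUnsigned(0) != 0
           and len(seen) < max_nodes):
        addr = node.GetValueAsUnsigned(0)
        if addr in seen:
            # This node was already visited, so `next` loops back.
            return len(seen), True
        seen.add(addr)
        node = node.GetChildMemberWithName("next")
    return len(seen), False
```

A summary function could call this and append a label such as "(cyclic)" when the second element of the result is True.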

8.3 Advanced Extensions

  • Write formatters for a custom container library.
  • Integrate formatter loading in ~/.lldbinit for a project.

9. Real-World Connections

9.1 Industry Applications

  • Complex data debugging: Custom formatters for internal types.
  • Team workflows: Share formatters across teams to standardize debugging.

9.2 Related Open Source Projects

  • LLDB Formatters: https://lldb.llvm.org/use/variable.html#python-based-formatters
  • Swift LLDB formatters: https://github.com/apple/swift-lldb

9.3 Interview Relevance

  • Shows advanced debugger knowledge and tooling sophistication.

10. Resources

10.1 Essential Reading

  • LLDB Data Formatters - https://lldb.llvm.org/use/variable.html#python-based-formatters
  • C Interfaces and Implementations by David Hanson - Ch. 7 (Lists)

10.2 Video Resources

  • LLDB data formatter demos - LLDB community videos

10.3 Tools & Documentation

  • type summary add: https://lldb.llvm.org/use/command.html#type

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain what SBValue represents.
  • I can describe how a type summary is registered.
  • I can explain why depth limits are needed.

11.2 Implementation

  • Formatter loads and activates for Node *.
  • Summary output is correct and stable.
  • NULL lists are handled safely.

11.3 Growth

  • I can extend the formatter to new types.
  • I can share a formatter package with a team.

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Build a linked list program and register a summary formatter.
  • fr v head shows a human-readable summary.

Full Completion:

  • Handle NULL and long lists safely.
  • Document how to load the formatter automatically.

Excellence (Going Above & Beyond):

  • Add synthetic children for indexed access.
  • Provide formatters for multiple data structures.

This guide was generated from LEARN_LLDB_DEEP_DIVE.md. For the complete learning path, see the parent directory.