LEARN LLDB DEEP DIVE

Learn LLDB: From Zero to Debugging Master

Goal: Deeply understand the LLVM Debugger (LLDB)—from basic commands and crash analysis to advanced Python scripting and data formatters.

Why Learn LLDB?

LLDB is the modern, high-performance debugger from the LLVM project. It’s the default on macOS and a first-class citizen in the C++, Objective-C, and Swift ecosystems. While GDB is the GNU standard, mastering LLDB gives you a debugger that is often faster, more scriptable, and better integrated with modern Clang/LLVM toolchains, especially for complex C++ types.

After completing these projects, you will:

Confidently navigate the state of any C/C++/Swift/Obj-C program.
Diagnose and fix crashes with LLDB’s powerful analysis tools.
Use structured commands and conditional breakpoints to efficiently find bugs.
Automate your debugging workflows with LLDB’s rich Python API.
Create custom “data formatters” to teach LLDB how to pretty-print your custom types.

Core Concept Analysis

1. The LLDB Command Structure

LLDB’s commands are structured and compositional, following a (noun) (verb) [options] pattern. This makes them more predictable and discoverable than GDB’s flatter command space.

(lldb) breakpoint set --file main.c --line 10
        ^          ^    ^          ^
        |          |    |          |
      (noun)      (verb) (option)  (argument)

(lldb) thread step-over
        ^     ^
        |     |
      (noun) (verb)

Common nouns: target, process, thread, frame, breakpoint, watchpoint
Common verbs: create, set, list, launch, attach, step-in, step-over, read, select

Tip: LLDB provides GDB command aliases, but learning the native syntax is key to mastery.
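
Because every command has the same shape, a command line can be assembled mechanically. The sketch below is a hypothetical helper (lldb_command is not part of LLDB) that only formats a string from a noun, a verb, long options, and trailing arguments. It assumes each keyword maps directly to a `--name value` pair and that values contain no spaces.

```python
def lldb_command(noun, verb, *args, **options):
    """Compose an LLDB command line: noun verb [--option value ...] [args ...].

    Hypothetical helper for illustration; it only builds a string and
    performs no quoting, so values must not contain whitespace.
    """
    parts = [noun, verb]
    for name, value in options.items():
        # Python identifiers use underscores; LLDB long options use dashes.
        parts.append("--" + name.replace("_", "-"))
        parts.append(str(value))
    parts.extend(str(arg) for arg in args)
    return " ".join(parts)


print(lldb_command("breakpoint", "set", file="main.c", line=10))
print(lldb_command("thread", "step-over"))
```

The two calls reproduce the commands diagrammed above, which is a quick way to convince yourself that the noun/verb/option layout really is regular.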

2. The Python Scripting Bridge

LLDB is built from the ground up to be scriptable. It exposes a comprehensive, object-oriented Python API that allows you to control every aspect of the debugger.

# A simple Python script inside LLDB (run it from the interactive `script` interpreter)
import lldb

# `lldb.debugger` is a convenience variable that is only valid inside the
# interactive interpreter; it refers to the current SBDebugger.
debugger = lldb.debugger
target = debugger.GetSelectedTarget()
process = target.GetProcess()
thread = process.GetSelectedThread()
frame = thread.GetSelectedFrame()

# Read the 'rax' register (x86_64 only; the process must be stopped)
rax_value = frame.FindRegister("rax").GetValue()
print(f"RAX: {rax_value}")

# Evaluate a C++ expression in the context of the selected frame
my_var = frame.EvaluateExpression("my_vector[2]")
print(f"my_vector[2] = {my_var.GetValue()}")

This structured API is the foundation for building powerful custom commands and data visualizers.
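
The same walk from debugger to frame can be wrapped in a function that takes the frame explicitly and checks validity at each step. This is a minimal sketch: describe_frame is a hypothetical name, and it only relies on SBFrame.IsValid, GetFunctionName, FindRegister, and EvaluateExpression plus SBValue.IsValid and GetValue.

```python
def describe_frame(frame, register_name="rax", expression=None):
    """Return text lines describing an SBFrame-like object.

    Hypothetical helper for illustration; pass `lldb.frame` from the
    interactive interpreter, or any frame obtained from a thread.
    """
    if frame is None or not frame.IsValid():
        return ["no valid frame (is the process stopped?)"]

    lines = [f"function: {frame.GetFunctionName()}"]

    reg = frame.FindRegister(register_name)
    # FindRegister returns an invalid SBValue when the register does not
    # exist on this architecture (for example, "rax" on arm64).
    if reg.IsValid():
        lines.append(f"{register_name}: {reg.GetValue()}")
    else:
        lines.append(f"{register_name}: <not available>")

    if expression:
        value = frame.EvaluateExpression(expression)
        # GetValue() is None when the result has no scalar representation.
        lines.append(f"{expression} = {value.GetValue()}")
    return lines
```

Returning strings instead of printing keeps the function usable both from `script` and from a custom command that writes to its result object.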

Project List

The best way to learn LLDB is to solve a series of increasingly difficult debugging challenges. Each “project” is a C program with a specific bug, designed to teach you a new LLDB skill.

Project 1: The Basics - First Steps

File: LEARN_LLDB_DEEP_DIVE.md
Main Programming Language: LLDB Commands (on a C target)
Alternative Programming Languages: N/A
Coolness Level: Level 2: Practical but Forgettable
Business Potential: 1. The “Resume Gold”
Difficulty: Level 1: Beginner
Knowledge Area: Debugging Fundamentals
Software or Tool: LLDB, Clang/GCC
Main Book: The LLDB Tutorial

What you’ll build: A simple C program with a for loop and a function call. You will use LLDB to step through it, inspect variables, and understand the flow of execution.

Why it teaches LLDB: This project builds the foundational muscle memory for the most common debugging loop, but with LLDB’s specific command syntax.

Core challenges you’ll face:

Compiling for debugging → maps to using the -g flag in clang or gcc
Creating a target and setting breakpoints → maps to target create and breakpoint set
Controlling execution → maps to run, continue, thread step-over (next), thread step-in (step)
Inspecting state → maps to frame variable for locals and arguments, p / print for arbitrary expressions, and bt for the call stack

Key Concepts:

Targets and Processes: The LLDB Tutorial
Stepping vs. Nexting: help thread
Stack Frames: help frame, help bt

Difficulty: Beginner
Time estimate: 1-2 hours
Prerequisites: Basic C knowledge.

Real world outcome: You will compile this C code with clang -g -o target target.c and debug it.

// target.c
#include <stdio.h>

void greet(int count) {
    printf("Hello for the %dth time!\n", count);
}

int main() {
    int i;
    for (i = 0; i < 5; ++i) {
        greet(i);
    }
    return 0;
}

Debugging Session:

$ lldb ./target
(lldb) target create "./target"
Current executable set to './target' (x86_64).
(lldb) breakpoint set --name main  # Or just `b main`
(lldb) run
(lldb) thread step-over           # Or just `n`; moves from the `for` line to `greet(i)`
(lldb) frame variable i           # Or `fr v i`; `p i` evaluates it as an expression
(int) i = 0
(lldb) thread step-in             # Or just `s`
(lldb) bt                         # See the call stack
* thread #1, queue = 'com.apple.main-thread', stop reason = step in
  * frame #0: 0x0000000100003f44 target`greet(count=0) at target.c:4
    frame #1: 0x0000000100003f7a target`main at target.c:10
(lldb) continue
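
Once the interactive loop feels natural, a session like this one can be replayed non-interactively. The sketch below is one way to do that from Python, assuming an lldb binary is on PATH: `--batch` and `-o` are LLDB's own flags, while lldb_batch_argv and run_session are hypothetical helper names.

```python
import subprocess


def lldb_batch_argv(executable, commands, lldb_path="lldb"):
    """Build the argument list for a non-interactive LLDB run."""
    argv = [lldb_path, "--batch"]
    for command in commands:
        # Each -o runs one command after the executable has been loaded.
        argv += ["-o", command]
    argv.append(executable)
    return argv


def run_session(executable, commands):
    """Run LLDB in batch mode and return everything it printed."""
    completed = subprocess.run(
        lldb_batch_argv(executable, commands),
        capture_output=True,
        text=True,
        check=False,
    )
    return completed.stdout + completed.stderr


if __name__ == "__main__":
    # Only show the command line here; call run_session() to execute it.
    argv = lldb_batch_argv("./target", ["breakpoint set --name main", "run", "bt"])
    print(" ".join(argv))
```

Keeping the argument builder separate from the subprocess call makes it easy to inspect the exact command line before anything runs.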

Learning milestones:

Set a breakpoint and run to it → You can control where execution starts.
Distinguish thread step-over from thread step-in → You understand the core stepping commands.
Print a variable’s value → You can inspect program state.
View the call stack (bt) → You understand program structure.

Project 2: The Crash - Crash Report Analysis

File: LEARN_LLDB_DEEP_DIVE.md
Main Programming Language: LLDB Commands (on a C target)
Alternative Programming Languages: N/A
Coolness Level: Level 3: Genuinely Clever
Business Potential: 1. The “Resume Gold”
Difficulty: Level 1: Beginner
Knowledge Area: Crash Analysis / Debugging
Software or Tool: LLDB
Main Book: Advanced Apple Debugging & Reverse Engineering (though focused on Apple, concepts are general)

What you’ll build: A C program that reliably segfaults. You will learn to use LLDB to perform a “post-mortem” analysis of the crash.

Why it teaches LLDB: Post-mortem debugging is a critical skill. On macOS, this often involves loading a .crash report. On Linux, it involves loading a core dump. LLDB excels at both, providing rich context about the crashed state.

Core challenges you’ll face:

Loading a crashed process → maps to lldb -c <corefile> on Linux, or loading from crash logs on macOS
Inspecting the crashed thread → maps to bt and frame select
Finding the faulting instruction → maps to LLDB automatically showing you the line and reason
Reading registers at crash time → maps to register read to find the bad address

Key Concepts:

Crash Reports vs. Core Dumps: Platform-specific analysis.
Stop Reasons: LLDB provides detailed reasons for stopping (e.g., EXC_BAD_ACCESS).

Difficulty: Beginner
Time estimate: 1-2 hours
Prerequisites: Project 1.

Real world outcome: Compile clang -g -o crash crash.c, run it, and analyze the crash.

// crash.c
#include <stddef.h>
void crash_me() {
    char *p = NULL;
    *p = 'A'; // Segfault!
}
int main() { crash_me(); return 0; }

Debugging Session (on Linux with a core dump):

$ ulimit -c unlimited && ./crash
Segmentation fault (core dumped)
$ lldb ./crash -c core

(lldb) target create "./crash" --core "core"
Core file '/path/to/core' successfully loaded.
Process 12345 stopped
* thread #1, stop reason = signal SIGSEGV: invalid address (fault address: 0x0)
    frame #0: 0x000055555555513d crash`crash_me() at crash.c:4:5
   2   void crash_me() {
   3       char *p = NULL;
-> 4       *p = 'A'; // Segfault!
   5   }
(lldb) bt
* thread #1, stop reason = signal SIGSEGV
  * frame #0: 0x000055555555513d crash`crash_me at crash.c:4
    frame #1: 0x0000555555555152 crash`main at crash.c:6
(lldb) frame variable p
(char *) p = 0x0000000000000000
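
The same triage (which thread stopped, why, and where) can be collected from Python once a core file or live process is loaded. This is a minimal sketch, assuming an object that behaves like lldb.SBProcess; crash_summary is a hypothetical name, and the calls it uses (GetNumThreads, GetThreadAtIndex, GetStopDescription, GetFrameAtIndex, GetLineEntry) are part of the SB API.

```python
def crash_summary(process):
    """Return one line per thread: stop description plus innermost frame.

    Hypothetical helper; `process` is expected to behave like
    lldb.SBProcess (for example, `lldb.process` in the interpreter).
    """
    lines = []
    for index in range(process.GetNumThreads()):
        thread = process.GetThreadAtIndex(index)
        frame = thread.GetFrameAtIndex(0)  # frame 0 is where the thread stopped
        line_entry = frame.GetLineEntry()
        lines.append(
            "thread #{}: {} in {} at {}:{}".format(
                thread.GetIndexID(),
                thread.GetStopDescription(256),  # 256 = max description length
                frame.GetFunctionName(),
                line_entry.GetFileSpec().GetFilename(),
                line_entry.GetLine(),
            )
        )
    return lines
```

From the interactive interpreter this could be called as `script print("\n".join(crash_summary(lldb.process)))` after importing the module that defines it.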

Learning milestones:

Load a crashed process into LLDB → You can begin post-mortem analysis.
Use bt to find the faulting thread and frame → You can pinpoint the crash location.
Read the stop reason → You understand what kind of crash it was.
Inspect variables to find the root cause → You can determine why it crashed.

Project 3: The Hang - Attaching to a Running Process

File: LEARN_LLDB_DEEP_DIVE.md
Main Programming Language: LLDB Commands (on a C target)
Alternative Programming Languages: N/A
Coolness Level: Level 3: Genuinely Clever
Business Potential: 1. The “Resume Gold”
Difficulty: Level 2: Intermediate
Knowledge Area: Dynamic Analysis / Process Control
Software or Tool: LLDB, ps
Main Book: N/A (use help process)

What you’ll build: A program with an infinite loop. You’ll then attach LLDB to the already-running process, pause it, inspect its state, and detach without killing it.

Why it teaches LLDB: This is the core workflow for debugging unresponsive services. LLDB’s process attach is fast and provides immediate, rich state information.

Core challenges you’ll face:

Finding the Process ID (PID) → maps to using ps or pgrep
Attaching LLDB by PID or name → maps to process attach --pid <PID> or process attach --name <name>
Inspecting state → maps to bt, frame variable, register read
Detaching cleanly → maps to the process detach command

Key Concepts:

Process Attachment: help process attach.
Live vs. Post-Mortem: Understanding the difference in available information.

Difficulty: Intermediate
Time estimate: 1 hour
Prerequisites: Basic command-line knowledge.

Real world outcome: Compile clang -g -o hang hang.c, run it in one terminal, and debug from another.

// hang.c
#include <stdio.h>
#include <unistd.h>
int main() {
    volatile int counter = 0;
    while (1) { counter++; sleep(1); }
    return 0;
}

Debugging Session:

# Terminal 1
$ ./hang

# Terminal 2
$ pgrep hang
12345
$ lldb
(lldb) process attach --pid 12345
Process 12345 stopped
* thread #1, stop reason = signal SIGSTOP
    frame #0: ... in sleep ...
(lldb) # It's paused. Let's see the stack.
(lldb) bt
(lldb) # Select the frame in our code
(lldb) frame select 1
frame #1: 0x0000000100003f7a hang`main at hang.c:6
(lldb) frame variable counter
(int) counter = 10
(lldb) # Let's change the value
(lldb) expr counter = 100
(int) $0 = 100
(lldb) # We're done, let it run
(lldb) detach
Process 12345 detached
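
The attach, inspect, detach cycle is a natural candidate for automation, since forgetting to detach leaves a service frozen. The sketch below drives it through SBDebugger.HandleCommand; snapshot_process is a hypothetical name, and the code assumes each command finishes before the next one is issued (in a standalone script you would typically call debugger.SetAsync(False) first).

```python
def snapshot_process(debugger, pid, commands=("bt", "frame variable")):
    """Attach to `pid`, run a few inspection commands, then detach.

    Hypothetical helper; `debugger` is expected to behave like
    lldb.SBDebugger. Output goes wherever the debugger normally prints.
    """
    debugger.HandleCommand(f"process attach --pid {int(pid)}")
    try:
        for command in commands:
            debugger.HandleCommand(command)
    finally:
        # Always detach so the target keeps running even if a command raised.
        debugger.HandleCommand("process detach")
```

The try/finally is the design point: whatever happens during inspection, the last command sent is always `process detach`.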

Learning milestones:

Successfully attach to a running process by PID → You can debug live systems.
Set a breakpoint while attached → You can intercept future behavior.
Use expr to change a variable’s value live → You can modify a running program’s state.
Detach without killing the process → You can perform non-destructive inspection.

Project 4: The Corruption - Using Watchpoints

File: LEARN_LLDB_DEEP_DIVE.md
Main Programming Language: LLDB Commands (on a C target)
Alternative Programming Languages: N/A
Coolness Level: Level 4: Hardcore Tech Flex
Business Potential: 1. The “Resume Gold”
Difficulty: Level 2: Intermediate
Knowledge Area: Advanced Debugging / Memory Analysis
Software or Tool: LLDB
Main Book: LLDB Quick Start Guide

What you’ll build: A C program with a memory corruption bug. You’ll use an LLDB “watchpoint” to have the debugger stop at the exact instruction that overwrites your data.

Why it teaches LLDB: Watchpoints are a debugging superpower. LLDB implements them with the CPU’s hardware debug registers, so the program runs at full speed until the watched memory is written; the trade-off is that only a handful can be active at once. This technique saves you hours of manual stepping.

Core challenges you’ll face:

Setting a watchpoint on a variable → maps to watchpoint set variable my_var
Setting a watchpoint on a memory address → maps to watchpoint set expression 0x...
Running to the trigger point → maps to continue and letting the hardware trap stop execution
Analyzing the context of the write → maps to bt to see who the culprit is

Key Concepts:

Watchpoints: help watchpoint.
Expressions: help expr. LLDB uses its Clang expression parser here.

Difficulty: Intermediate
Time estimate: 2 hours
Prerequisites: Project 1.

Real world outcome: Compile and debug clang -g -o corrupt corrupt.c.

// corrupt.c
#include <stdio.h>
void buggy_function(int *values) {
    int *p = &values[0];
    *(p + 1) = 0; // Off-by-one write clobbers values[1]
}
int main() {
    int values[2] = {100, 200};
    buggy_function(values);
    printf("values[1] = %d\n", values[1]); // Why is this now 0?
    return 0;
}

Debugging Session:

$ lldb ./corrupt
(lldb) target create "./corrupt"
(lldb) b main
(lldb) run
(lldb) # We stop before `values` is initialized, so step over that line first
(lldb) next
(lldb) # Now watch the element that gets corrupted
(lldb) watchpoint set variable values[1]
Watchpoint created: Watchpoint 1: addr = 0x7ffeefbff5ec size = 4
(lldb) continue
Process 12345 resuming
Watchpoint 1 hit:
old value: 200
new value: 0
* thread #1, stop reason = watchpoint 1
    frame #0: 0x0000000100003f48 corrupt`buggy_function at corrupt.c:4
   1    #include <stdio.h>
   2    void buggy_function(int *values) {
   3        int *p = &values[0];
-> 4        *(p + 1) = 0; // Off-by-one write clobbers values[1]
   5    }
   6    int main() {
(lldb) bt
* thread #1, stop reason = watchpoint 1
  * frame #0: 0x0000000100003f48 corrupt`buggy_function at corrupt.c:4
    frame #1: 0x0000000100003f7a corrupt`main at corrupt.c:8

Learning milestones:

Set a watchpoint on a local variable → You can monitor the stack.
Let LLDB find the exact line of corruption → You’ve automated memory bug hunting.
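Set a conditional watchpoint → watchpoint modify -c '<condition>' <watchpoint-id>, where the condition is an ordinary expression over program variables (for example, 'my_var > 100').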
List and delete watchpoints → You can manage your debugging session.
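
Watchpoints can also be set from Python through SBValue.Watch, which is useful when a script decides at run time what to monitor. This is a minimal sketch: watch_writes is a hypothetical name, my_var stands for whichever variable you care about, and the caller is assumed to supply a fresh lldb.SBError().

```python
def watch_writes(frame, variable_name, error):
    """Set a write watchpoint on a variable visible in `frame`.

    Hypothetical helper. `error` should be an lldb.SBError() created by
    the caller; LLDB fills it in if the watchpoint cannot be set (for
    example, when the hardware watchpoint slots are exhausted).
    Returns the SBWatchpoint, or None on failure.
    """
    value = frame.FindVariable(variable_name)
    if not value.IsValid():
        return None
    # Watch(resolve_location, read, write, error): stop on writes only.
    watchpoint = value.Watch(True, False, True, error)
    if error.Fail() or not watchpoint.IsValid():
        return None
    return watchpoint
```

In the interactive interpreter a call might look like `wp = watch_writes(lldb.frame, "my_var", lldb.SBError())`; passing the error object in keeps the function free of any direct dependency on the lldb module.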

Project 5: LLDB Python Scripting

File: LEARN_LLDB_DEEP_DIVE.md
Main Programming Language: Python (in LLDB)
Alternative Programming Languages: N/A
Coolness Level: Level 4: Hardcore Tech Flex
Business Potential: 1. The “Resume Gold”
Difficulty: Level 3: Advanced
Knowledge Area: Debugging Automation
Software or Tool: LLDB with Python support
Main Book: LLDB Python Scripting Reference

What you’ll build: A Python script that adds a new command to LLDB. This command will inspect the current frame and print a custom summary of its state.

Why it teaches LLDB: This is the gateway to true mastery. LLDB’s Python API allows you to automate any repetitive task, create domain-specific commands, and extend the debugger to understand your program’s data structures.

Core challenges you’ll face:

Creating a ~/.lldbinit file → maps to making your scripts load automatically
Importing a Python module → maps to command script import
Writing a Python function that takes debugger state → maps to the debugger, command, result, internal_dict function signature
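Accessing the target, process, thread, and frame → maps to walking down from the debugger argument (debugger.GetSelectedTarget() and so on); the lldb.debugger and lldb.target convenience variables are only valid in the interactive script interpreter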

Key Concepts:

LLDB Python Module: lldb.SBDebugger, lldb.SBTarget, lldb.SBFrame.
Initialization files: The role of ~/.lldbinit.

Difficulty: Advanced
Time estimate: 1 week
Prerequisites: Python knowledge, Projects 1-4.

Real world outcome: A Python script (myscripts.py) that adds a frame_summary command.

# myscripts.py
import lldb

def frame_summary_command(debugger, command, result, internal_dict):
    """
    A custom LLDB command that prints a summary of the current frame.
    """
    target = debugger.GetSelectedTarget()
    if not target:
        result.SetError("Invalid target")
        return

    thread = target.GetProcess().GetSelectedThread()
    frame = thread.GetSelectedFrame()

    if not frame.IsValid():
        result.SetError("Invalid frame")
        return

    func_name = frame.GetFunctionName()
    line_entry = frame.GetLineEntry()
    file_path = line_entry.GetFileSpec().fullpath
    line_num = line_entry.GetLine()

    result.AppendMessage(f"You are in function '{func_name}'")
    result.AppendMessage(f"at {file_path}:{line_num}")
    result.AppendMessage("Local variables:")

    for var in frame.get_locals():
        result.AppendMessage(f"  - {var.GetName()}: {var.GetValue()}")

def __lldb_init_module(debugger, internal_dict):
    # Add the command to LLDB.
    # The first arg is the command name, the second is the python function.
    debugger.HandleCommand('command script add -f myscripts.frame_summary_command frame_summary')
    print("Custom command 'frame_summary' loaded.")

Debugging Session:

$ lldb ./my_program
(lldb) command script import myscripts.py
Custom command 'frame_summary' loaded.
(lldb) b main
(lldb) run
(lldb) n
(lldb) frame_summary
You are in function 'main'
at /path/to/my_program.c:10
Local variables:
  - i: 0
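
Most real commands take arguments. Everything typed after the command name arrives as a single string in the `command` parameter, so a common approach is to split it with shlex. The sketch below assumes a hypothetical show_vars command that prints only the named variables, and it registers itself under whatever module name the file is imported as.

```python
import shlex


def show_vars_command(debugger, command, result, internal_dict):
    """Print only the named variables: `show_vars i count` (hypothetical)."""
    wanted = shlex.split(command)  # text typed after the command name
    if not wanted:
        result.SetError("usage: show_vars <name> [<name> ...]")
        return

    frame = (debugger.GetSelectedTarget().GetProcess()
             .GetSelectedThread().GetSelectedFrame())
    if not frame.IsValid():
        result.SetError("no selected frame; is the process stopped?")
        return

    for name in wanted:
        value = frame.FindVariable(name)
        if value.IsValid():
            result.AppendMessage(f"{name} = {value.GetValue()}")
        else:
            result.AppendMessage(f"{name}: no such variable in this frame")


def __lldb_init_module(debugger, internal_dict):
    # __name__ is the module name LLDB imported this file under.
    debugger.HandleCommand(
        f"command script add -f {__name__}.show_vars_command show_vars")
```

Reporting a usage error through result.SetError, rather than printing, lets LLDB mark the command as failed, which matters when commands are chained in scripts.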

Learning milestones:

Create a ~/.lldbinit file → You can customize your LLDB environment.
Write a custom command → You can extend LLDB’s vocabulary.
Access program state (variables, registers) from Python → You can perform complex, automated analysis.
Use the result object to print formatted output → You can create professional-looking custom commands.

Final Overall Project: Custom Data Formatter

File: LEARN_LLDB_DEEP_DIVE.md
Main Programming Language: Python (in LLDB)
Alternative Programming Languages: N/A
Coolness Level: Level 5: Pure Magic (Super Cool)
Business Potential: 2. The “Micro-SaaS / Pro Tool” (as part of a larger debugging suite)
Difficulty: Level 4: Expert
Knowledge Area: Debugger Customization / Data Visualization
Software or Tool: LLDB
Main Book: LLDB Python Data Visualization Docs

What you’ll build: A C program that uses a linked list, and a Python “data formatter” that teaches LLDB how to print that list in a clean, readable way, instead of just showing a pointer address.

Why it’s the final goal: This is LLDB’s killer feature. It transforms the debugger from a generic tool into one that deeply understands your specific program. It’s the pinnacle of debugger automation and makes debugging complex data structures 100x easier.

Core challenges you’ll face:

Understanding SBValue → maps to the Python object representing a variable
Writing a “summary” function → maps to a Python function that returns a one-line summary string
Traversing data structures from Python → maps to using the SBValue API to follow pointers (GetChildMemberWithName, Dereference)
Registering the formatter → maps to the type summary add command

Real world outcome: The difference between a useless pointer and a beautiful summary.

// linkedlist.c
#include <stddef.h> // for NULL
typedef struct Node {
    int value;
    struct Node *next;
} Node;

int main() {
    Node c = {3, NULL};
    Node b = {2, &c};
    Node a = {1, &b};
    Node *head = &a; // Breakpoint here
    return 0;
}

Without Formatter:

(lldb) fr v head
(Node *) head = 0x00007ffeefbff5e8

With Your Python Formatter:

# formatter.py
MAX_NODES = 10000  # Safety cap so a cyclic or corrupted list cannot hang LLDB

def LinkedListSummary(valobj, internal_dict):
    """
    Python summary function for a Node*.
    """
    node = valobj
    count = 0
    # Follow `next` until we reach a NULL pointer; a NULL head counts as 0 nodes.
    while node.GetValueAsUnsigned() != 0 and count < MAX_NODES:
        count += 1
        node = node.GetChildMemberWithName("next")
    return f"Linked list with {count} nodes"

def __lldb_init_module(debugger, internal_dict):
    debugger.HandleCommand(
        'type summary add --python-function formatter.LinkedListSummary "Node *"'
    )

Debugging Session with Formatter:

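(lldb) command script import formatter.py
(lldb) breakpoint set --file linkedlist.c --source-pattern-regexp "Breakpoint here"
(lldb) run
(lldb) # The breakpoint stops before the assignment runs, so execute it first
(lldb) next
(lldb) fr v head
(Node *) head = 0x00007ffeefbff5e8 (Linked list with 3 nodes)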

Learning milestones:

Write a simple summary for a struct → You can add one-line summaries.
Traverse a linked list in Python → You can follow pointers using the SBValue API.
Create a synthetic child provider → (Advanced) Teach LLDB to show a C++ std::vector as if it were a C-style array, with subscriptable children.
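
For the last milestone, a synthetic child provider is a Python class with a small set of methods that LLDB calls by name. The sketch below applies the idea to the Node list rather than std::vector: it assumes the Node struct from linkedlist.c (members value and next), exposes one child per element, and caps traversal with an arbitrary MAX_NODES limit. NodeListProvider is a hypothetical name.

```python
class NodeListProvider:
    """Synthetic children for a Node*: one child per element's `value`.

    Sketch only: assumes struct members named `value` and `next`, and
    stops after MAX_NODES elements so cyclic lists cannot hang LLDB.
    """

    MAX_NODES = 1000

    def __init__(self, valobj, internal_dict):
        self.valobj = valobj
        self.values = []

    def update(self):
        # Called by LLDB whenever the underlying value may have changed.
        self.values = []
        node = self.valobj
        while node.GetValueAsUnsigned() != 0 and len(self.values) < self.MAX_NODES:
            self.values.append(node.GetChildMemberWithName("value"))
            node = node.GetChildMemberWithName("next")

    def has_children(self):
        return True

    def num_children(self):
        return len(self.values)

    def get_child_index(self, name):
        # Children are addressed as "[0]", "[1]", ...
        try:
            return int(name.strip("[]"))
        except ValueError:
            return -1

    def get_child_at_index(self, index):
        if 0 <= index < len(self.values):
            return self.values[index]
        return None
```

If the class lives in formatter.py, it could be registered with `type synthetic add --python-class formatter.NodeListProvider "Node *"`. Each child is shown under its member name (value) in this sketch; giving the children index-style names is a refinement left out here.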

Summary

Project	Main Programming Language
The Basics	LLDB Commands (on a C target)
The Crash	LLDB Commands (on a C target)
The Hang	LLDB Commands (on a C target)
The Corruption	LLDB Commands (on a C target)
Python Scripting	Python (in LLDB)
Custom Data Formatter	Python (in LLDB)