Project 4: The Automated Crash Detective

Build a Python script that automates initial crash dump triage, extracting backtraces, registers, and crash signals from core files to generate concise analysis reports.

Quick Reference

Attribute Value
Difficulty Intermediate
Time Estimate 1-2 weeks
Language Python
Prerequisites Project 2 (GDB Backtrace), Basic Python scripting
Key Topics GDB batch mode, GDB Python API, subprocess management, crash automation

1. Learning Objectives

By completing this project, you will:

  1. Understand GDB’s batch mode and how to run non-interactive debugging sessions
  2. Master the GDB Python API for programmatic access to debugging information
  3. Learn subprocess management for orchestrating external tools from Python
  4. Build robust text parsing to extract structured data from GDB output
  5. Design reusable automation scripts that handle multiple crash scenarios
  6. Develop practical SRE/DevOps skills for crash triage at scale

2. Theoretical Foundation

2.1 Core Concepts

GDB Batch Mode: Non-Interactive Debugging

GDB’s batch mode allows you to run debugging commands without human interaction. This is the foundation of automated crash analysis.

Interactive GDB Session                Batch Mode Execution
┌───────────────────────────┐         ┌───────────────────────────┐
│ $ gdb ./app core.1234     │         │ $ gdb --batch --quiet \   │
│ (gdb) bt                  │         │     --command=cmds.gdb \  │
│ #0  main() at app.c:10    │         │     ./app core.1234       │
│ (gdb) info registers      │         │                           │
│ rax  0x0 ...              │         │ #0  main() at app.c:10    │
│ (gdb) quit                │         │ rax  0x0 ...              │
└───────────────────────────┘         └───────────────────────────┘
        │                                       │
        ▼                                       ▼
  Human types commands               Script reads stdout output
  Human reads output                 Script parses and processes

Key batch mode flags:

  • --batch: Exit after processing commands (implies --quiet)
  • --quiet or -q: Suppress introductory and copyright messages
  • --command=FILE or -x FILE: Execute GDB commands from FILE
  • --eval-command=COMMAND or -ex COMMAND: Execute a single GDB command
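
For example, a one-off triage run can be driven entirely from the command line with -ex, using the same example executable and core file names as the rest of this guide:

$ gdb --batch --quiet -ex "bt" -ex "info registers" ./my_app core.1234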

GDB Python API: Structured Access to Debugging Data

GDB embeds a Python interpreter that provides programmatic access to debugging information. Unlike parsing text output, the Python API gives you structured data.

┌─────────────────────────────────────────────────────────────────┐
│                     GDB Architecture                             │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────┐    ┌─────────────────┐    ┌────────────────┐ │
│  │ Core Dump    │───▶│  GDB Core       │───▶│ Command Line   │ │
│  │ + Executable │    │  (symbol table, │    │ Interface      │ │
│  └──────────────┘    │   stack unwinder,│    └────────────────┘ │
│                      │   expression     │              │         │
│                      │   evaluator)     │              ▼         │
│                      └────────┬─────────┘    ┌────────────────┐ │
│                               │              │ Text Output    │ │
│                               │              └────────────────┘ │
│                               │                                  │
│                               ▼                                  │
│                      ┌─────────────────┐                        │
│                      │ Python API      │                        │
│                      │ (gdb module)    │                        │
│                      └────────┬────────┘                        │
│                               │                                  │
│                               ▼                                  │
│                      ┌─────────────────┐                        │
│                      │ Your Script     │                        │
│                      │ (analyzer.py)   │                        │
│                      └─────────────────┘                        │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Essential GDB Python API functions:

Function Purpose Example
gdb.execute(cmd) Run a GDB command, return output as string gdb.execute("bt")
gdb.parse_and_eval(expr) Evaluate expression, return gdb.Value gdb.parse_and_eval("$rip")
gdb.selected_frame() Get current stack frame frame = gdb.selected_frame()
gdb.selected_inferior() Get current inferior (debugged process) inf = gdb.selected_inferior()
gdb.newest_frame() Get the innermost (newest) frame top = gdb.newest_frame()
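
A minimal sketch of these calls in use, written as a script that GDB executes (for example via gdb --batch -x snippet.py ./my_app core.1234, where snippet.py is a placeholder name):

# snippet.py - executed by GDB's embedded Python interpreter
import gdb

frame = gdb.newest_frame()                     # innermost frame at the point of the crash
rip = gdb.parse_and_eval("$rip")               # instruction pointer as a typed gdb.Value
backtrace = gdb.execute("bt", to_string=True)  # same output as the bt command, as a string

print(f"Crashed in {frame.name()} at {rip}")
print(backtrace)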

Subprocess Management: Orchestrating External Tools

Python’s subprocess module allows your main script to invoke GDB as a child process and capture its output.

┌─────────────────────────────────────────────────────────────────┐
│                 Two-Process Architecture                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Process 1: Your Python Script (auto_analyzer.py)               │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                                                          │   │
│  │  executable = sys.argv[1]                               │   │
│  │  core_file = sys.argv[2]                                │   │
│  │                                                          │   │
│  │  result = subprocess.run(                                │   │
│  │      ["gdb", "--batch", "-x", "analyzer.py",            │   │
│  │       executable, core_file],                           │   │
│  │      capture_output=True, text=True                     │   │
│  │  )                                                       │   │
│  │                                                          │   │
│  │  report = parse_output(result.stdout)                   │   │
│  │                                                          │   │
│  └──────────────────────────┬──────────────────────────────┘   │
│                              │                                   │
│                              │ spawns                            │
│                              ▼                                   │
│  Process 2: GDB with Python Extension                           │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                                                          │   │
│  │  GDB loads core dump and executable                     │   │
│  │  GDB runs analyzer.py inside its Python interpreter     │   │
│  │  analyzer.py uses gdb.execute() and gdb.parse_and_eval()│   │
│  │  Output goes to stdout → captured by Process 1          │   │
│  │                                                          │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
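
A minimal runnable sketch of Process 1 using inline -ex commands instead of a separate script file; a real wrapper would parse result.stdout into a report rather than printing it verbatim:

# wrapper sketch: invoke GDB in batch mode and capture its output
import subprocess
import sys

def main():
    executable, core_file = sys.argv[1], sys.argv[2]

    result = subprocess.run(
        ["gdb", "--batch", "--quiet",
         "-ex", "bt", "-ex", "info registers",
         executable, core_file],
        capture_output=True, text=True, timeout=60,
    )

    print(result.stdout)  # raw GDB output for now

if __name__ == "__main__":
    main()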

2.2 Why This Matters

The scale problem:

  • A single server might generate 1-5 crashes per day during development
  • A fleet of 1000 servers might generate 100+ crashes per day
  • A mobile app with 1 million users might generate 10,000+ crash reports per day

Manual analysis does not scale. Every SRE team, every crash reporting service (Sentry, Crashlytics, Breakpad), and every serious debugging workflow relies on automation.

What automation enables:

  1. Immediate triage: Categorize crashes by type within seconds
  2. Deduplication: Group identical crashes to focus engineering effort
  3. Alerting: Notify on-call engineers when new crash types appear
  4. Trending: Track crash rates over time to detect regressions
  5. Integration: Connect crash data to CI/CD, ticketing, and monitoring systems

2.3 Historical Context

Before automation (pre-2000s):

  • Engineers would manually run gdb on each core dump
  • No systematic tracking of which crashes had been analyzed
  • “I’ll look at that later” often meant “never”
  • Major crashes slipped through the cracks

The evolution of crash automation:

  1. Shell scripts (1990s): Simple gdb --batch wrappers
  2. GDB command files (2000s): Reusable .gdb scripts
  3. GDB Python API (2009): Structured programmatic access
  4. Crash reporting services (2010s): Sentry, Crashlytics, Raygun
  5. Modern pipelines (2020s): Integration with observability platforms

Why the GDB Python API was a game-changer (GDB 7.0, 2009):

  • Before: Parse text output with fragile regex
  • After: Access typed values, iterate frames, query symbols programmatically
  • Made complex automation reliable and maintainable

2.4 Common Misconceptions

Misconception 1: “Batch mode means no interactivity, so it’s limited”

Reality: Batch mode gives you the SAME capabilities as interactive mode. You can:

  • Set breakpoints, watchpoints, and catchpoints
  • Evaluate any expression
  • Examine any memory
  • Walk the stack

The only difference is input comes from a script instead of a keyboard.

Misconception 2: “The GDB Python API is just for writing GDB extensions”

Reality: The API is equally useful for:

  • One-off analysis scripts
  • CI/CD integration
  • Automated crash reports
  • Custom debugging tools

You don’t need to modify GDB or create plugins.

Misconception 3: “Parsing GDB’s text output is good enough”

Reality: Text parsing is fragile because:

  • Output format changes between GDB versions
  • Localization can change output language
  • Edge cases produce unexpected formatting

The Python API provides stable, typed access to the same data.

Misconception 4: “This is only useful for C/C++ crashes”

Reality: GDB (and your automation) can analyze:

  • C and C++ programs
  • Rust programs (with debug symbols)
  • Go programs (with limitations)
  • Any language that produces ELF executables with DWARF debug info

3. Project Specification

3.1 What You Will Build

A Python script (auto_analyzer.py) that:

  1. Takes an executable path and a core dump file as arguments
  2. Programmatically invokes GDB to load the crash
  3. Extracts key crash information (signal, backtrace, registers)
  4. Produces a formatted summary report
┌─────────────────────────────────────────────────────────────────┐
│                    auto_analyzer.py                              │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  INPUT:                                                          │
│  ┌─────────────┐    ┌─────────────┐                            │
│  │ Executable  │    │ Core Dump   │                            │
│  │ (./my_app)  │    │ (core.1234) │                            │
│  └──────┬──────┘    └──────┬──────┘                            │
│         │                  │                                     │
│         └────────┬─────────┘                                    │
│                  ▼                                               │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                  GDB + Python Script                     │   │
│  │                                                          │   │
│  │  1. Load core dump                                       │   │
│  │  2. Extract signal information                          │   │
│  │  3. Get backtrace                                        │   │
│  │  4. Read register values                                 │   │
│  │  5. Identify crash location                             │   │
│  │                                                          │   │
│  └──────────────────────────┬──────────────────────────────┘   │
│                              │                                   │
│                              ▼                                   │
│  OUTPUT:                                                         │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                 Crash Analysis Report                    │   │
│  │                                                          │   │
│  │  Executable: ./my_app                                   │   │
│  │  Core File:  core.1234                                  │   │
│  │  Signal:     SIGSEGV (Segmentation fault)               │   │
│  │  Crashing IP: 0x55555555513d                            │   │
│  │                                                          │   │
│  │  --- Backtrace ---                                      │   │
│  │  #0  main () at crashing_program.c:4                    │   │
│  │                                                          │   │
│  │  --- Registers ---                                      │   │
│  │  RAX: 0x0                                               │   │
│  │  RBX: 0x0                                               │   │
│  │  ...                                                    │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

3.2 Functional Requirements

ID Requirement Priority
FR-1 Accept executable and core file paths as command-line arguments Must Have
FR-2 Validate that both files exist and are readable Must Have
FR-3 Extract the signal that caused the crash (SIGSEGV, SIGABRT, etc.) Must Have
FR-4 Extract the crash address (if applicable) Must Have
FR-5 Extract the full backtrace Must Have
FR-6 Extract key register values (RIP, RSP, RAX, RBX, etc.) Must Have
FR-7 Handle core dumps without debug symbols gracefully Should Have
FR-8 Output a human-readable report to stdout Must Have
FR-9 Support different crash types (SIGSEGV, SIGFPE, SIGABRT, SIGBUS) Should Have
FR-10 Provide JSON output option for machine processing Nice to Have

3.3 Non-Functional Requirements

ID Requirement Metric
NFR-1 Analysis completes quickly < 5 seconds for typical core dump
NFR-2 Works with GDB 8.0+ Tested on Ubuntu 20.04, 22.04
NFR-3 Error messages are clear and actionable User can resolve issues without docs
NFR-4 Script is portable Works on any Linux distribution
NFR-5 No external Python dependencies (stdlib only) Easy deployment

3.4 Example Usage / Output

Basic usage:

$ python3 auto_analyzer.py ./my_app core.1234

--- Crash Analysis Report ---
Executable: ./my_app
Core File:  core.1234
Signal:     SIGSEGV (Segmentation fault) at 0x0
Crashing IP (RIP): 0x55555555513d

--- Backtrace ---
#0  0x000055555555513d in main () at crashing_program.c:4

--- Registers ---
RAX: 0x0
RBX: 0x0
RCX: 0x7ffff7f9aa80
RDX: 0x7fffffffe528
RSI: 0x7fffffffe518
RDI: 0x1
RBP: 0x7fffffffe420
RSP: 0x7fffffffe420
RIP: 0x55555555513d

With JSON output:

$ python3 auto_analyzer.py --json ./my_app core.1234
{
  "executable": "./my_app",
  "core_file": "core.1234",
  "signal": {
    "name": "SIGSEGV",
    "description": "Segmentation fault",
    "address": "0x0"
  },
  "crash_ip": "0x55555555513d",
  "backtrace": [
    {
      "frame": 0,
      "address": "0x000055555555513d",
      "function": "main",
      "file": "crashing_program.c",
      "line": 4
    }
  ],
  "registers": {
    "rax": "0x0",
    "rbx": "0x0",
    ...
  }
}

Error handling:

$ python3 auto_analyzer.py ./missing_app core.1234
ERROR: Executable not found: ./missing_app

$ python3 auto_analyzer.py ./my_app core.wrong
ERROR: Core file not found: core.wrong

$ python3 auto_analyzer.py ./wrong_app core.1234
WARNING: Core file was not generated by this executable.
         Provided executable:   ./wrong_app
         Core was generated by: ./my_app

3.5 Real World Outcome

After completing this project, you will have:

  1. A reusable crash analysis tool that can be dropped into any project
  2. Foundation for a crash reporting pipeline (add HTTP upload, database storage)
  3. Skills transferable to commercial tools like Sentry, Datadog, or custom SRE tooling
  4. Understanding of how tools like coredumpctl work internally

How this connects to production systems:

Your Script              Production Evolution
┌──────────────┐        ┌─────────────────────────────────────────┐
│auto_analyzer │   ──▶  │ Crash Pipeline                          │
│   .py        │        │                                         │
└──────────────┘        │ ┌─────────┐  ┌──────┐  ┌────────────┐  │
                        │ │ Collect │──│Analyze│──│ Store/Alert│  │
                        │ └─────────┘  └──────┘  └────────────┘  │
                        │                                         │
                        │ • systemd-coredump captures crashes    │
                        │ • Your script analyzes automatically   │
                        │ • Results go to Elasticsearch/Splunk   │
                        │ • PagerDuty alerts for new crash types │
                        └─────────────────────────────────────────┘

4. Solution Architecture

4.1 High-Level Design

There are two main approaches to implementing this project. You should understand both:

Approach A: GDB Batch File (Simpler)

┌────────────────────────────────────────────────────────────────┐
│                    Approach A: Batch Commands                   │
├────────────────────────────────────────────────────────────────┤
│                                                                 │
│  auto_analyzer.py                                               │
│  ┌───────────────────────────────────────────────────────────┐ │
│  │ 1. Create commands.gdb file:                              │ │
│  │    set pagination off                                     │ │
│  │    bt                                                      │ │
│  │    info registers                                         │ │
│  │    quit                                                    │ │
│  │                                                            │ │
│  │ 2. Run: gdb --batch -x commands.gdb ./app core.1234       │ │
│  │                                                            │ │
│  │ 3. Parse stdout text output                               │ │
│  │                                                            │ │
│  │ 4. Generate report                                         │ │
│  └───────────────────────────────────────────────────────────┘ │
│                                                                 │
│  Pros: Simple, no Python inside GDB                            │
│  Cons: Fragile text parsing, limited flexibility              │
│                                                                 │
└────────────────────────────────────────────────────────────────┘

Approach B: GDB Python API (More Robust)

┌────────────────────────────────────────────────────────────────┐
│                    Approach B: Python API                       │
├────────────────────────────────────────────────────────────────┤
│                                                                 │
│  auto_analyzer.py (wrapper)                                     │
│  ┌───────────────────────────────────────────────────────────┐ │
│  │ 1. Run: gdb --batch -x gdb_script.py ./app core.1234      │ │
│  │                                                            │ │
│  │ 2. Capture stdout                                          │ │
│  │                                                            │ │
│  │ 3. Present report (already formatted by gdb_script.py)    │ │
│  └───────────────────────────────────────────────────────────┘ │
│                                                                 │
│  gdb_script.py (runs inside GDB)                                │
│  ┌───────────────────────────────────────────────────────────┐ │
│  │ import gdb                                                 │ │
│  │                                                            │ │
│  │ # Use API for structured access                           │ │
│  │ rip = gdb.parse_and_eval("$rip")                          │ │
│  │ frame = gdb.selected_frame()                               │ │
│  │ bt = gdb.execute("bt", to_string=True)                    │ │
│  │                                                            │ │
│  │ # Print formatted output                                   │ │
│  │ print(f"RIP: {rip}")                                       │ │
│  └───────────────────────────────────────────────────────────┘ │
│                                                                 │
│  Pros: Structured data, reliable, extensible                   │
│  Cons: Two-file setup, requires understanding GDB Python       │
│                                                                 │
└────────────────────────────────────────────────────────────────┘

Recommended approach: Start with Approach A to understand the basics, then refactor to Approach B for robustness.

4.2 Key Components

┌─────────────────────────────────────────────────────────────────┐
│                    Component Architecture                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │              CLI Interface (main)                        │   │
│  │  • Parse command-line arguments                          │   │
│  │  • Validate inputs                                       │   │
│  │  • Handle --json flag                                    │   │
│  │  • Print final report                                    │   │
│  └────────────────────────────┬────────────────────────────┘   │
│                               │                                  │
│                               ▼                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │              GDB Invoker                                  │   │
│  │  • Build GDB command line                                │   │
│  │  • Manage subprocess                                     │   │
│  │  • Handle GDB errors                                     │   │
│  │  • Return raw output                                     │   │
│  └────────────────────────────┬────────────────────────────┘   │
│                               │                                  │
│                               ▼                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │              GDB Script (runs inside GDB)                │   │
│  │  • Extract signal info                                   │   │
│  │  • Generate backtrace                                    │   │
│  │  • Read registers                                        │   │
│  │  • Format output                                         │   │
│  └────────────────────────────┬────────────────────────────┘   │
│                               │                                  │
│                               ▼                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │              Report Formatter                             │   │
│  │  • Parse GDB output                                      │   │
│  │  • Build report structure                                │   │
│  │  • Output text or JSON                                   │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

4.3 Data Structures

# Crash analysis data model

from dataclasses import dataclass
from typing import List, Optional, Dict

@dataclass
class StackFrame:
    """Represents a single frame in the call stack."""
    frame_number: int           # #0, #1, #2, ...
    address: str               # 0x55555555513d
    function: Optional[str]    # main (or None if no symbols)
    file: Optional[str]        # crashing_program.c (or None)
    line: Optional[int]        # 4 (or None)

@dataclass
class SignalInfo:
    """Information about the terminating signal."""
    name: str                  # SIGSEGV
    description: str           # Segmentation fault
    fault_address: Optional[str]  # 0x0 (address that caused fault)

@dataclass
class RegisterState:
    """CPU register values at crash time."""
    registers: Dict[str, str]  # {"rax": "0x0", "rbx": "0x7f...", ...}

@dataclass
class CrashReport:
    """Complete crash analysis report."""
    executable: str
    core_file: str
    signal: SignalInfo
    crash_address: str         # RIP value
    backtrace: List[StackFrame]
    registers: RegisterState
    has_symbols: bool          # True if debug symbols present

4.4 Algorithm Overview

Main Analysis Flow:

┌─────────────────────────────────────────────────────────────────┐
│                    Analysis Algorithm                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  1. VALIDATE INPUTS                                              │
│     ├─ Check executable exists and is ELF                       │
│     ├─ Check core file exists and is ELF core dump              │
│     └─ Verify core matches executable (optional warning)        │
│                                                                  │
│  2. PREPARE GDB SESSION                                          │
│     ├─ Create temporary command/script file                     │
│     └─ Build subprocess command line                            │
│                                                                  │
│  3. INVOKE GDB                                                   │
│     ├─ Run: gdb --batch --quiet -x script executable core       │
│     ├─ Capture stdout and stderr                                │
│     └─ Check return code                                        │
│                                                                  │
│  4. EXTRACT INFORMATION (inside GDB script)                      │
│     ├─ Signal: Parse "Program terminated with signal" message   │
│     ├─ RIP: gdb.parse_and_eval("$rip") or "info registers"     │
│     ├─ Backtrace: gdb.execute("bt") or "bt" command            │
│     └─ Registers: gdb.parse_and_eval("$rax") or "info regs"    │
│                                                                  │
│  5. FORMAT REPORT                                                │
│     ├─ Structure data into CrashReport                          │
│     ├─ Format as text (default) or JSON (--json flag)          │
│     └─ Print to stdout                                          │
│                                                                  │
│  6. CLEANUP                                                      │
│     └─ Remove temporary files                                   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
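
A sketch of step 4 as it could look inside the GDB-side script for Approach B. The "Program terminated with signal ..." line is printed by GDB itself while loading the core, so the wrapper can read it from captured stdout; the script below only emits the backtrace and a fixed register list:

# gdb_analyzer.py sketch - run via: gdb --batch -q -x gdb_analyzer.py ./my_app core.1234
import gdb

REGISTERS = ["rip", "rsp", "rbp", "rax", "rbx", "rcx", "rdx", "rsi", "rdi"]

print("--- Backtrace ---")
print(gdb.execute("bt", to_string=True), end="")

print("--- Registers ---")
for reg in REGISTERS:
    try:
        value = gdb.parse_and_eval(f"${reg}")  # typed gdb.Value; str() gives a printable form
        print(f"{reg.upper()}: {value}")
    except gdb.error:
        print(f"{reg.upper()}: <unavailable>")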

5. Implementation Guide

5.1 Development Environment Setup

Required packages:

# Ubuntu/Debian
sudo apt-get update
sudo apt-get install gdb python3 build-essential

# Verify GDB has Python support
gdb --batch --eval-command="python print('Python works!')"
# Should output: Python works!

# Check GDB version (need 8.0+)
gdb --version

Create a test crash program:

// crashing_program.c
#include <stdio.h>

void inner_function(int *ptr) {
    *ptr = 42;  // Crash here if ptr is NULL
}

void outer_function(int *ptr) {
    printf("About to crash...\n");
    inner_function(ptr);
}

int main(int argc, char **argv) {
    int *ptr = NULL;  // NULL pointer
    outer_function(ptr);
    return 0;
}

Generate a core dump:

# Compile with debug symbols
gcc -g -o crashing_program crashing_program.c

# Enable core dumps
ulimit -c unlimited

# On some systems, configure core_pattern
# Check current setting:
cat /proc/sys/kernel/core_pattern

# Run and crash
./crashing_program
# Output: Segmentation fault (core dumped)

# Find the core file (location depends on core_pattern)
ls -la core* /var/lib/apport/coredump/ /var/crash/

5.2 Project Structure

auto_crash_analyzer/
├── auto_analyzer.py        # Main entry point (wrapper script)
├── gdb_analyzer.py         # GDB Python script (runs inside GDB)
├── test_programs/
│   ├── null_deref.c        # NULL pointer dereference
│   ├── stack_overflow.c    # Stack buffer overflow
│   ├── div_by_zero.c       # Division by zero (SIGFPE)
│   ├── abort_call.c        # Explicit abort() (SIGABRT)
│   └── Makefile            # Build all test programs
├── test_cores/             # Generated core dumps for testing
├── tests/
│   ├── test_analyzer.py    # Unit tests
│   └── test_integration.sh # Integration tests (shell)
└── README.md

5.3 The Core Question You’re Answering

“How can I programmatically extract meaningful information from a crash dump without manual intervention?”

This question underpins all crash automation. The answer involves:

  1. Understanding what data GDB can extract
  2. Knowing how to invoke GDB non-interactively
  3. Structuring output for both human and machine consumption
  4. Handling edge cases (no symbols, corrupted dumps, etc.)

5.4 Concepts You Must Understand First

Before starting implementation, verify you understand:

Concept Self-Check Question Resource if Unsure
GDB basics Can you run bt, info registers, and p <var> in GDB? Project 2 of this series
Python subprocess Can you capture stdout from a shell command in Python? Python docs: subprocess module
ELF format Can you use file command to identify ELF executables and core dumps? man elf, man file
Unix signals Can you list 5 common signals and when they occur? man 7 signal
Debug symbols What’s the difference between compiling with and without -g? Project 2 of this series

5.5 Questions to Guide Your Design

Input handling:

  • How will you verify the executable is the correct one for the core dump?
  • What should happen if the files don’t exist?
  • Should you support absolute and relative paths?

GDB interaction:

  • Will you use a command file, inline -ex commands, or a Python script?
  • How will you handle GDB errors (e.g., “No stack” or “Cannot access memory”)?
  • Should your script work with both Python 2 and Python 3 in GDB?

Information extraction:

  • What if there’s no signal info (e.g., the core was generated by gcore)?
  • How many stack frames should you show by default?
  • Which registers are most important to include?

Output format:

  • Should the text output be colorized?
  • How will you handle very long backtraces?
  • What metadata should the JSON output include?

5.6 Thinking Exercise

Before writing code, trace through this scenario manually:

Exercise: You have a core dump from a multi-threaded program. Thread 2 crashed with SIGSEGV. Thread 1 was waiting in select(). Thread 3 was in malloc().

  1. Draw a diagram showing what information GDB will show for each thread.
  2. List the GDB commands needed to examine each thread.
  3. Design your output format: How will you represent multiple threads?
  4. What should happen if the user’s core dump has 100 threads?

This exercise prepares you for the multi-threaded extension in a later project.

5.7 Hints in Layers

Hint 1 - Starting Point (Conceptual Direction): Start with the simplest possible implementation: a command file with bt and info registers, invoked via subprocess. Get basic extraction working before adding Python API usage.

Hint 2 - Next Level (More Specific Guidance): Create commands.gdb:

set pagination off
set print pretty on
bt
info registers
quit

Invoke with: subprocess.run(["gdb", "--batch", "-q", "-x", "commands.gdb", exe, core], capture_output=True)

Hint 3 - Technical Details (Approach/Pseudocode):

# Wrapper script structure
def main():
    exe, core = parse_args()
    validate_inputs(exe, core)

    # Create temp command file
    with tempfile.NamedTemporaryFile(mode='w', suffix='.gdb') as f:
        f.write("set pagination off\n")
        f.write("bt\n")
        f.write("info registers\n")
        f.write("quit\n")
        f.flush()

        result = subprocess.run(
            ["gdb", "--batch", "-q", "-x", f.name, exe, core],
            capture_output=True, text=True, timeout=30
        )

    report = parse_gdb_output(result.stdout, result.stderr)
    print_report(report)

Hint 4 - Tools/Debugging (Verification Methods):

  • Test with a known-good core dump first
  • Print raw GDB output before parsing to see exact format
  • Use --eval-command="show version" to verify GDB is being invoked correctly
  • Check result.returncode - GDB returns 0 even if core is corrupt

5.8 The Interview Questions They’ll Ask

  1. “How does GDB load a core dump?”
    • Expected: GDB reads the ELF core file which contains memory segments, register values, and metadata. It uses the executable for symbol information.
  2. “What’s the difference between gdb.execute() and gdb.parse_and_eval()?”
    • Expected: execute() runs a command and returns text output. parse_and_eval() evaluates an expression and returns a typed gdb.Value object.
  3. “How would you handle a core dump from a stripped binary?”
    • Expected: The backtrace will show addresses instead of function names. You can still examine registers and memory. If you have the original debug symbols separately, you can load them with symbol-file.
  4. “How would you scale this to analyze 1000 crashes per hour?”
    • Expected: Parallelize analysis, deduplicate by backtrace signature, cache symbol files, use a queue system for crash files, store results in a database.
  5. “What are the security implications of automated crash analysis?”
    • Expected: Core dumps may contain sensitive data (passwords in memory, encryption keys). Analysis should happen in isolated environments. Results should be redacted before storage.

5.9 Books That Will Help

Topic Book Chapter
GDB basics “The Art of Debugging with GDB” - Matloff & Salzman Ch. 1-3
GDB scripting “Debugging with GDB” - GNU Manual Ch. 23 (Python)
Process memory “Computer Systems: A Programmer’s Perspective” - Bryant & O’Hallaron Ch. 9 (Virtual Memory)
Automation mindset “Black Hat Python” - Justin Seitz Ch. 1-2
Crash reporting systems “Site Reliability Engineering” - Google Ch. 15 (Postmortems)

5.10 Implementation Phases

Phase 1: Basic Extraction (Days 1-3)

Goals:

  • Create test crash programs
  • Invoke GDB via subprocess
  • Capture backtrace output

Deliverable: Script that prints raw GDB output for any core dump.

Phase 2: Structured Parsing (Days 3-6)

Goals:

  • Parse backtrace into stack frames
  • Extract signal information
  • Parse register values

Deliverable: Script that prints formatted report with sections.

Phase 3: GDB Python API Migration (Days 6-9)

Goals:

  • Rewrite extraction using gdb.execute() and gdb.parse_and_eval()
  • Improve reliability of data extraction
  • Handle edge cases (no symbols, truncated stacks)

Deliverable: Robust analyzer using Python API.

Phase 4: Polish and Testing (Days 9-14)

Goals:

  • Add JSON output option
  • Add comprehensive error handling
  • Test with multiple crash types
  • Document usage

Deliverable: Production-ready analyzer script.

5.11 Key Implementation Decisions

Decision 1: Command file vs. Python API

Factor Command File Python API
Simplicity Easier to start More code
Reliability Fragile parsing Structured data
Flexibility Limited High
Debugging Harder Easier

Recommendation: Start with command file, migrate to Python API.

Decision 2: Single script vs. two scripts

Factor Single Script Two Scripts
Deployment One file Two files
Complexity Higher (embed GDB script) Lower (separation)
Maintainability Harder Easier
Testing Harder Can test independently

Recommendation: Two scripts (wrapper + GDB script).

Decision 3: Text parsing vs. structured output

For the command file approach, you must parse text. Key patterns:

# Backtrace line pattern
#0  0x000055555555513d in main () at crashing_program.c:4
^   ^                     ^          ^                   ^
|   |                     |          |                   +-- line number
|   |                     |          +-- file name
|   |                     +-- function name
|   +-- address
+-- frame number

# Register pattern
rax            0x0                      0
^              ^                        ^
|              |                        +-- decimal value
|              +-- hex value
+-- register name

# Signal pattern
Program terminated with signal SIGSEGV, Segmentation fault.
                               ^        ^
                               |        +-- description
                               +-- signal name
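
A sketch of parsers for these patterns, assuming the StackFrame and SignalInfo dataclasses from Section 4.3 are defined in the same module; the regular expressions cover the common cases shown above, not every format GDB can emit:

# Assumes StackFrame and SignalInfo (Section 4.3) are defined in this module.
import re
from typing import Optional, Tuple

BACKTRACE_RE = re.compile(
    r'^#(?P<num>\d+)\s+'                          # frame number
    r'(?:(?P<addr>0x[0-9a-f]+)\s+in\s+)?'         # address (GDB omits it for some frames)
    r'(?P<func>[^\s(]+)\s*\([^)]*\)'              # function name and argument list
    r'(?:\s+at\s+(?P<file>\S+):(?P<line>\d+))?'   # file:line, only with debug symbols
)

def parse_backtrace_line(line: str) -> Optional[StackFrame]:
    m = BACKTRACE_RE.match(line.strip())
    if not m:
        return None
    func = m.group('func')
    return StackFrame(
        frame_number=int(m.group('num')),
        address=m.group('addr') or "",
        function=None if func == "??" else func,
        file=m.group('file'),
        line=int(m.group('line')) if m.group('line') else None,
    )

def parse_register_line(line: str) -> Optional[Tuple[str, str]]:
    # "rax            0x0                      0" -> ("rax", "0x0")
    m = re.match(r'^(\w+)\s+(0x[0-9a-f]+)\s+', line)
    return (m.group(1), m.group(2)) if m else None

def parse_signal_line(line: str) -> Optional[SignalInfo]:
    # "Program terminated with signal SIGSEGV, Segmentation fault."
    m = re.search(r'terminated with signal (\w+), (.+?)\.?$', line)
    if not m:
        return None
    return SignalInfo(name=m.group(1), description=m.group(2), fault_address=None)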

6. Testing Strategy

Unit Tests

# tests/test_analyzer.py

import unittest
from auto_analyzer import parse_backtrace_line, parse_register_line, parse_signal_line

class TestBacktraceParsing(unittest.TestCase):
    def test_full_frame_with_symbols(self):
        line = "#0  0x000055555555513d in main () at crashing_program.c:4"
        frame = parse_backtrace_line(line)
        self.assertEqual(frame.frame_number, 0)
        self.assertEqual(frame.address, "0x000055555555513d")
        self.assertEqual(frame.function, "main")
        self.assertEqual(frame.file, "crashing_program.c")
        self.assertEqual(frame.line, 4)

    def test_frame_without_symbols(self):
        line = "#0  0x000055555555513d in ?? ()"
        frame = parse_backtrace_line(line)
        self.assertEqual(frame.function, None)
        self.assertEqual(frame.file, None)

    def test_frame_with_args(self):
        line = "#1  0x0000555555555160 in foo (x=42, y=0x7fff) at test.c:10"
        frame = parse_backtrace_line(line)
        self.assertEqual(frame.function, "foo")

class TestRegisterParsing(unittest.TestCase):
    def test_standard_register(self):
        line = "rax            0x0                      0"
        name, value = parse_register_line(line)
        self.assertEqual(name, "rax")
        self.assertEqual(value, "0x0")

    def test_register_with_large_value(self):
        line = "rsp            0x7fffffffe420           140737488348192"
        name, value = parse_register_line(line)
        self.assertEqual(name, "rsp")
        self.assertEqual(value, "0x7fffffffe420")

class TestSignalParsing(unittest.TestCase):
    def test_sigsegv(self):
        line = "Program terminated with signal SIGSEGV, Segmentation fault."
        signal = parse_signal_line(line)
        self.assertEqual(signal.name, "SIGSEGV")
        self.assertEqual(signal.description, "Segmentation fault")

    def test_sigabrt(self):
        line = "Program terminated with signal SIGABRT, Aborted."
        signal = parse_signal_line(line)
        self.assertEqual(signal.name, "SIGABRT")

Integration Tests

#!/bin/bash
# tests/test_integration.sh

set -e

SCRIPT_DIR=$(dirname "$0")
ANALYZER="$SCRIPT_DIR/../auto_analyzer.py"
TEST_PROGS="$SCRIPT_DIR/../test_programs"
TEST_CORES="$SCRIPT_DIR/../test_cores"

# Build test programs
make -C "$TEST_PROGS"

# Generate core dumps
ulimit -c unlimited
cd "$TEST_CORES"

for prog in null_deref stack_overflow div_by_zero abort_call; do
    echo "Testing $prog..."

    # Generate core (will crash)
    "$TEST_PROGS/$prog" 2>/dev/null || true

    # Find core file (name depends on system)
    CORE=$(ls -t core* 2>/dev/null | head -1)

    if [ -z "$CORE" ]; then
        echo "ERROR: No core file generated for $prog"
        exit 1
    fi

    # Run analyzer
    OUTPUT=$(python3 "$ANALYZER" "$TEST_PROGS/$prog" "$CORE")

    # Verify output contains expected sections
    echo "$OUTPUT" | grep -q "Crash Analysis Report" || { echo "Missing report header"; exit 1; }
    echo "$OUTPUT" | grep -q "Backtrace" || { echo "Missing backtrace"; exit 1; }
    echo "$OUTPUT" | grep -q "Registers" || { echo "Missing registers"; exit 1; }

    # Verify signal detection
    case "$prog" in
        null_deref)    echo "$OUTPUT" | grep -q "SIGSEGV" ;;
        div_by_zero)   echo "$OUTPUT" | grep -q "SIGFPE" ;;
        abort_call)    echo "$OUTPUT" | grep -q "SIGABRT" ;;
    esac

    echo "  PASSED"
    rm -f "$CORE"
done

echo "All integration tests passed!"

Test with Different Scenarios

Scenario How to Create What to Verify
Simple SIGSEGV Dereference NULL Signal detected, address is 0x0
SIGFPE Divide by zero Signal is SIGFPE
SIGABRT Call abort() Signal is SIGABRT
Deep stack Recursive function Many frames shown
No symbols Strip executable Shows ?? for functions
Multi-threaded pthread crash Thread info included
Corrupted core Truncate core file Graceful error message

7. Common Pitfalls & Debugging

Pitfall 1: GDB Not Finding Python

Symptom:

$ gdb --batch -x script.py exe core
Python scripting is not supported in this copy of GDB.

Cause: GDB was compiled without Python support.

Fix:

# Check GDB Python support
gdb --batch --eval-command="python print('test')"

# If unsupported, install GDB with Python:
# Ubuntu/Debian
sudo apt-get install gdb

# From source
./configure --with-python

Pitfall 2: Core Pattern Redirects Core Dumps

Symptom: Core dump not appearing in current directory.

Cause: System’s core_pattern redirects to a different location.

Fix:

# Check current pattern
cat /proc/sys/kernel/core_pattern

# Common patterns and where to find cores:
# |/usr/share/apport/apport ... -> /var/crash/ (Ubuntu)
# |/usr/lib/systemd/systemd-coredump ... -> /var/lib/systemd/coredump/, via coredumpctl
# core.%p -> ./core.<pid>

# For testing, set simple pattern (requires root):
sudo sysctl kernel.core_pattern=core.%p

# Or use coredumpctl on systemd systems:
coredumpctl list
coredumpctl dump <pid> > core.file

Pitfall 3: GDB Output Format Changes

Symptom: Parser works on one system, fails on another.

Cause: Different GDB versions format output differently.

Fix: Use Python API instead of text parsing, or be defensive:

# Fragile (exact format)
frame_match = re.match(r'^#(\d+)\s+0x([0-9a-f]+)\s+in\s+(\S+)\s+\(\)', line)

# More robust (flexible whitespace, optional parts)
frame_match = re.match(
    r'^#(\d+)\s+'               # Frame number
    r'(?:0x)?([0-9a-f]+)\s+'    # Address (optional 0x prefix)
    r'in\s+(\S+)\s*'            # Function name
    r'(?:\([^)]*\))?\s*'        # Arguments (optional)
    r'(?:at\s+(\S+):(\d+))?',   # File:line (optional)
    line
)

Pitfall 4: subprocess Timeout

Symptom: Script hangs when analyzing large core dump.

Cause: GDB can take a long time for large or complex cores.

Fix:

try:
    result = subprocess.run(
        gdb_command,
        capture_output=True,
        text=True,
        timeout=60  # 60 second timeout
    )
except subprocess.TimeoutExpired:
    print("ERROR: GDB analysis timed out. Core file may be too large.")
    sys.exit(1)

Pitfall 5: Mismatched Executable and Core

Symptom: GDB shows wrong symbols or “warning: core file may not match”.

Cause: Core was generated by a different build of the executable.

Fix:

# Check before analysis (eu-readelf is part of elfutils)
import os
import subprocess

def verify_executable_matches_core(exe_path, core_path):
    """Best-effort check that the core was generated by this executable."""
    exe_name = os.path.basename(exe_path)

    # The core's NT_FILE note lists the files that were mapped into the process
    result = subprocess.run(
        ["eu-readelf", "-n", core_path],
        capture_output=True, text=True
    )

    if result.returncode != 0:
        return True  # Can't verify, proceed anyway

    # If the executable's name appears among the mapped files, assume a match
    for line in result.stdout.splitlines():
        if exe_name in line:
            return True

    return False  # Executable name not found in the core's file mappings

Debugging Techniques

1. Print raw GDB output:

result = subprocess.run(gdb_command, capture_output=True, text=True)
print("=== STDOUT ===")
print(result.stdout)
print("=== STDERR ===")
print(result.stderr)
print("=== RETURN CODE ===")
print(result.returncode)

2. Interactive debugging:

# Run the same commands interactively to see what GDB shows
gdb ./exe core
(gdb) bt
(gdb) info registers

3. Test Python API in GDB:

gdb -q ./exe core
(gdb) python
>>> import gdb
>>> print(gdb.parse_and_eval("$rip"))
>>> frame = gdb.selected_frame()
>>> print(frame.name())
>>> end

8. Extensions & Challenges

Extension 1: JSON Output

Add --json flag for machine-readable output:

import json
from datetime import datetime

def format_as_json(report: CrashReport) -> str:
    return json.dumps({
        "executable": report.executable,
        "core_file": report.core_file,
        "signal": {
            "name": report.signal.name,
            "description": report.signal.description,
            "address": report.signal.fault_address
        },
        "crash_ip": report.crash_address,
        "backtrace": [
            {
                "frame": f.frame_number,
                "address": f.address,
                "function": f.function,
                "file": f.file,
                "line": f.line
            }
            for f in report.backtrace
        ],
        "registers": report.registers.registers,
        "has_symbols": report.has_symbols,
        "analyzed_at": datetime.utcnow().isoformat()
    }, indent=2)

Extension 2: Crash Signature Generation

Generate a stable “fingerprint” for crash deduplication:

import hashlib

def generate_crash_signature(report: CrashReport) -> str:
    """Generate a stable signature for crash deduplication."""
    # Use top 3 frames (excluding library frames)
    significant_frames = []
    for frame in report.backtrace[:5]:
        if frame.function and not frame.function.startswith("__"):
            significant_frames.append(f"{frame.function}:{frame.file}")
        if len(significant_frames) >= 3:
            break

    # Include signal type
    signature_parts = [report.signal.name] + significant_frames

    # Hash for compact representation
    signature_string = "|".join(signature_parts)
    return hashlib.sha256(signature_string.encode()).hexdigest()[:16]

Extension 3: Batch Processing

Analyze multiple cores at once:

$ python3 auto_analyzer.py --batch /var/crash/*.core

Processing 15 core files...
[1/15] core.app1.1234 - SIGSEGV in main()
[2/15] core.app1.1235 - SIGSEGV in main() (duplicate of #1)
[3/15] core.app2.5678 - SIGABRT in abort()
...

Summary:
  Total crashes: 15
  Unique signatures: 3
  Most common: SIGSEGV in main() (12 occurrences)
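
A sketch of the batch loop, assuming a hypothetical analyze_core() wrapper around the single-core analysis and the generate_crash_signature() helper from Extension 2:

from collections import Counter

def analyze_batch(executable, core_paths):
    seen = Counter()
    for i, core in enumerate(core_paths, 1):
        report = analyze_core(executable, core)   # hypothetical: returns a CrashReport
        sig = generate_crash_signature(report)
        top = report.backtrace[0].function if report.backtrace else "??"
        dup = " (duplicate)" if seen[sig] else ""
        seen[sig] += 1
        print(f"[{i}/{len(core_paths)}] {core} - {report.signal.name} in {top}(){dup}")

    print("\nSummary:")
    print(f"  Total crashes: {sum(seen.values())}")
    print(f"  Unique signatures: {len(seen)}")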

Extension 4: Memory Analysis

Add memory examination to find what was being accessed:

# In GDB Python script
def analyze_crash_memory(fault_address):
    """Try to understand what memory was being accessed."""
    try:
        # The mapping table shows whether the fault address lies in any mapped region
        maps = gdb.execute("info proc mappings", to_string=True)

        # Try to read memory around the fault address
        if fault_address != "0x0":
            nearby = gdb.execute(f"x/8wx {fault_address}", to_string=True)
            return nearby
        return maps  # NULL dereference: return the mapping table for context instead
    except gdb.error:
        return None

Extension 5: Source Context

Show source code around the crash:

# In GDB Python script
def get_source_context(frame, context_lines=3):
    """Get source code around the crashing line."""
    if frame.file and frame.line:
        try:
            output = gdb.execute(
                f"list {frame.file}:{frame.line - context_lines},"
                f"{frame.line + context_lines}",
                to_string=True
            )
            return output
        except gdb.error:
            return "Source not available"
    return None

9. Real-World Connections

How This Relates to Production Systems

systemd-coredump:

Your Script                     systemd-coredump
┌────────────────┐             ┌─────────────────────────────────┐
│ Takes core     │             │ Intercepts ALL core dumps       │
│ as argument    │             │ via core_pattern                │
├────────────────┤             ├─────────────────────────────────┤
│ Invokes GDB    │             │ Compresses and stores in        │
│ manually       │             │ /var/lib/systemd/coredump/     │
├────────────────┤             ├─────────────────────────────────┤
│ Parses output  │             │ Indexes by executable, PID,     │
│                │             │ timestamp                       │
├────────────────┤             ├─────────────────────────────────┤
│ Prints report  │             │ coredumpctl provides interface  │
└────────────────┘             └─────────────────────────────────┘

Sentry/Crashlytics:

Your Script                     Sentry
┌────────────────┐             ┌─────────────────────────────────┐
│ Single core    │             │ Millions of crashes per day     │
│ local analysis │             │ from thousands of users         │
├────────────────┤             ├─────────────────────────────────┤
│ GDB Python API │             │ Custom parsers for minidumps,   │
│                │             │ symbolication servers          │
├────────────────┤             ├─────────────────────────────────┤
│ Text/JSON      │             │ Web UI with graphs, trends,     │
│ output         │             │ alerts, integrations           │
├────────────────┤             ├─────────────────────────────────┤
│ Manual run     │             │ SDK in app, automatic upload    │
└────────────────┘             └─────────────────────────────────┘

Industry Use Cases

  1. Game Development: Every crash from QA testers is analyzed automatically. New crash types page engineers.

  2. Mobile Apps: Crashlytics/Bugsnag receive millions of crash reports, deduplicate, and show trends.

  3. Cloud Infrastructure: AWS/GCP/Azure automate analysis of their internal service crashes.

  4. Automotive: Safety-critical systems require automated post-crash analysis for regulatory compliance.

  5. Financial Services: Every trading system crash is analyzed to determine if it caused incorrect trades.


10. Resources

Official Documentation
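
  • “Debugging with GDB” (the official GDB manual), including the Python API chapters: https://sourceware.org/gdb/documentation/
  • Python standard library subprocess documentation: https://docs.python.org/3/library/subprocess.html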

Books

Book Relevant Chapters
“The Art of Debugging with GDB” - Matloff & Salzman Ch. 1-3 (GDB basics), Ch. 7 (Scripting)
“Debugging with GDB” - GNU Manual Ch. 23 (Python Extensions)
“Black Hat Python” - Seitz Ch. 1-2 (Automation mindset)
“The Linux Programming Interface” - Kerrisk Ch. 22 (Signals)

Online Resources

Tool Purpose
coredumpctl systemd interface for core dumps
eu-readelf Examine ELF files (including cores)
gcore Generate core dump of running process
pstack Print stack of running process
minidump_stackwalk Google Breakpad minidump analyzer

11. Self-Assessment Checklist

Functionality

  • Script accepts executable and core file as arguments
  • Script validates that both files exist
  • Script extracts signal name and description
  • Script extracts crash address (fault address)
  • Script generates full backtrace
  • Script extracts key register values
  • Script handles cores without debug symbols
  • Script outputs formatted report

Robustness

  • Script handles missing files gracefully
  • Script handles GDB errors
  • Script has timeout protection
  • Script works with SIGSEGV, SIGFPE, SIGABRT
  • Script output is consistent across GDB versions

Code Quality

  • Code is organized into functions
  • Functions have docstrings
  • Error messages are clear
  • No hardcoded paths

Extensions (Optional)

  • JSON output option implemented
  • Crash signature generation implemented
  • Batch processing implemented

12. Submission / Completion Criteria

You have successfully completed this project when:

  1. Basic Analysis Works:
    $ python3 auto_analyzer.py ./crashing_program core.1234
    # Produces report with signal, backtrace, and registers
    
  2. Error Handling Works:
    $ python3 auto_analyzer.py ./nonexistent core.1234
    ERROR: Executable not found: ./nonexistent
    
  3. Multiple Crash Types Work:
    • SIGSEGV (segmentation fault)
    • SIGFPE (floating point exception)
    • SIGABRT (abort)
  4. Works Without Symbols:
    $ strip ./crashing_program
    $ ./crashing_program  # generates core
    $ python3 auto_analyzer.py ./crashing_program core.xxx
    # Shows addresses instead of function names, but doesn't crash
    
  5. Tests Pass:
    $ python3 -m pytest tests/
    # All tests pass
    

Stretch Goals:

  • JSON output option (--json flag)
  • Analysis of multi-threaded core dumps
  • Crash signature for deduplication
  • Integration with a simple web interface

What’s Next?

After completing this project, you’re ready for:

  • Project 5: Multi-threaded Mayhem - Apply your automation skills to complex concurrent crashes
  • Project 7: The Minidump Parser - Build a parser for a different crash dump format
  • Project 10: Building a Centralized Crash Reporter - Scale this to a full crash pipeline

The automation skills you’ve built here are the foundation of production crash analysis systems. Every technique you’ve learned—batch mode, subprocess management, output parsing—appears in real-world crash reporting infrastructure.