Project 9: Analyzing a Kernel Panic with crash

Use the crash utility to dissect a vmcore and understand exactly why the kernel died.

Quick Reference

Attribute	Value
Difficulty	Expert
Time Estimate	1-2 weeks
Language	crash utility commands
Prerequisites	Project 8 (vmcore capture)
Key Topics	crash utility, kernel debugging, kernel data structures, vmcore analysis

1. Learning Objectives

By completing this project, you will:

Learn to use the crash utility for kernel dump analysis
Understand how to navigate kernel data structures
Extract meaningful information from a vmcore file
Identify the root cause of a kernel panic
Learn essential crash commands for debugging
Understand kernel symbol tables and debug information

2. Theoretical Foundation

2.1 Core Concepts

What is the crash Utility?

The crash utility is an interactive analysis tool for kernel crash dumps and live systems. Think of it as “GDB for the kernel”—but specialized for post-mortem analysis of vmcore files.

┌─────────────────────────────────────────────────────────────────┐
│                  GDB vs CRASH COMPARISON                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  GDB                              crash                          │
│  ───                              ─────                          │
│  User-space debugger              Kernel debugger                │
│  Works on core dumps              Works on vmcore files          │
│  Needs debug symbols              Needs kernel debuginfo         │
│  Understands C structures         Understands kernel structures  │
│  One process at a time            Entire system state            │
│  bt shows user stack              bt shows kernel stack          │
│  Generic commands                 Kernel-specific commands       │
│                                                                  │
│  Similar Commands:                crash-Specific Commands:       │
│  bt (backtrace)                   ps (process list)              │
│  p (print)                        files (open files)             │
│  x (examine)                      net (network info)             │
│  info registers                   dev (device info)              │
│                                   mod (module info)              │
│                                   log (dmesg output)             │
│                                   struct (kernel structures)     │
│                                   task (task_struct info)        │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

The vmlinux File

To analyze a vmcore, crash needs the vmlinux file—the uncompressed kernel image with debug symbols:

┌─────────────────────────────────────────────────────────────────┐
│                 VMCORE ANALYSIS REQUIREMENTS                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  You Need:                                                       │
│                                                                  │
│  1. vmcore file                                                  │
│     └─ The crash dump from kdump                                │
│     └─ Located in /var/crash/*/vmcore                           │
│                                                                  │
│  2. vmlinux file (with debug symbols)                           │
│     └─ Uncompressed kernel with DWARF info                      │
│     └─ From kernel-debuginfo package                            │
│     └─ Usually in /usr/lib/debug/lib/modules/<version>/        │
│                                                                  │
│  3. (Optional) Module debug symbols                              │
│     └─ If crash is in a module                                  │
│     └─ .ko.debug files                                          │
│                                                                  │
│  crash vmlinux vmcore                                            │
│        ↓        ↓                                                │
│        │        └─ Memory snapshot                               │
│        └─ Symbol information                                     │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Essential crash Commands

┌─────────────────────────────────────────────────────────────────┐
│                  CRASH COMMAND REFERENCE                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  NAVIGATION & BASICS                                             │
│  ───────────────────                                             │
│  bt        - Backtrace of current/specified task                 │
│  ps        - Show process list at time of crash                  │
│  log       - Display kernel message buffer (dmesg)               │
│  sys       - System information                                  │
│  help CMD  - Get help on any command                             │
│                                                                  │
│  MEMORY EXAMINATION                                              │
│  ──────────────────                                              │
│  rd ADDR   - Read memory (default: 8 bytes)                      │
│  x/FMT     - GDB-style examine (x/16wx, x/s, etc.)              │
│  search    - Search memory for patterns                          │
│  vm        - Virtual memory information for a task               │
│                                                                  │
│  DATA STRUCTURES                                                 │
│  ───────────────                                                 │
│  struct    - Display structure definition or instance            │
│  p EXPR    - Print expression/variable value                     │
│  whatis    - Show type of symbol                                 │
│  sym       - Translate address to symbol or vice versa           │
│                                                                  │
│  PROCESS/TASK ANALYSIS                                           │
│  ─────────────────────                                           │
│  task      - Display task_struct contents                        │
│  set       - Set current context (task, CPU)                     │
│  files     - Open files for a process                            │
│  sig       - Signal information                                  │
│                                                                  │
│  MODULE ANALYSIS                                                 │
│  ───────────────                                                 │
│  mod       - Loaded modules information                          │
│  dis ADDR  - Disassemble at address                              │
│                                                                  │
│  ADVANCED                                                        │
│  ────────                                                        │
│  foreach   - Execute command across all tasks/CPUs               │
│  runq      - Display run queue                                   │
│  kmem      - Kernel memory usage                                 │
│  timer     - Timer information                                   │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Kernel Stack Trace Anatomy

Understanding a kernel backtrace:

crash> bt
PID: 1234   TASK: ffff88810a4d8000  CPU: 1   COMMAND: "insmod"
 #0 [ffffc90000a77e30] machine_kexec at ffffffff8105f370
 #1 [ffffc90000a77e80] __crash_kexec at ffffffff8111d0c1
 #2 [ffffc90000a77f48] crash_kexec at ffffffff8111e1b3
 #3 [ffffc90000a77f60] oops_end at ffffffff81027a5b
 #4 [ffffc90000a77f80] do_trap at ffffffff81021a7c
 #5 [ffffc90000a77fd0] do_error_trap at ffffffff81021c93
 #6 [ffffc90000a77ff0] exc_divide_error at ffffffff81a00c90
--- <EXCEPTION FRAME> ---
 #7 [ffffc90000a78018] buggy_init+0x15/0x30 [buggy_module]  <-- YOUR CODE
 #8 [ffffc90000a78038] do_one_initcall at ffffffff81001eb3
 #9 [ffffc90000a78098] do_init_module at ffffffff8111a16e
#10 [ffffc90000a780b8] load_module at ffffffff81119f67
#11 [ffffc90000a78200] __do_sys_finit_module at ffffffff8111af31
#12 [ffffc90000a78250] do_syscall_64 at ffffffff81a0007c
#13 [ffffc90000a78280] entry_SYSCALL_64_after_hwframe at ffffffff81c0008b

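Breaking this down:
────────────────────
Frames #0-6:   Exception and crash-handling path (runs after the fault)
Frame #7:      YOUR CODE, where the bug actually occurred
Frames #8-11:  Module loading code path
Frames #12-13: System call entry point

Note: the layout above is simplified for teaching. Real crash output typically marks the faulting location with an [exception RIP: ...] line followed by a register dump, as in the example in section 3.4.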
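
The breakdown above can also be done mechanically once a backtrace has been saved to a text file. The following Python sketch is an illustration only: it assumes the frame layout shown in this section (real crash output varies between versions), and the helper names parse_frames and first_module_frame are hypothetical, not part of crash.

```python
import re

# Matches frame lines such as:
#   " #7 [ffffc90000a78018] buggy_init+0x15/0x30 [buggy_module]"
#   "#10 [ffffc90000a780b8] load_module at ffffffff81119f67"
FRAME_RE = re.compile(
    r"^\s*#(?P<num>\d+)\s+\[(?P<stack>[0-9a-f]+)\]\s+(?P<symbol>\S+)"
    r"(?:\s+at\s+(?P<addr>[0-9a-f]+))?"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(bt_text):
    """Return one dict per frame line; non-frame lines are skipped."""
    frames = []
    for line in bt_text.splitlines():
        m = FRAME_RE.match(line)
        if m is None:
            continue
        frame = m.groupdict()
        frame["num"] = int(frame["num"])
        frames.append(frame)
    return frames


def first_module_frame(frames):
    """Return the first frame that carries a [module] tag, or None."""
    for frame in frames:
        if frame["module"] is not None:
            return frame
    return None


# Excerpt of the sample backtrace from this section.
SAMPLE_BT = """\
 #5 [ffffc90000a77fd0] do_error_trap at ffffffff81021c93
 #6 [ffffc90000a77ff0] exc_divide_error at ffffffff81a00c90
--- <EXCEPTION FRAME> ---
 #7 [ffffc90000a78018] buggy_init+0x15/0x30 [buggy_module]  <-- YOUR CODE
 #8 [ffffc90000a78038] do_one_initcall at ffffffff81001eb3
"""

frames = parse_frames(SAMPLE_BT)
suspect = first_module_frame(frames)
print(suspect["num"], suspect["symbol"], suspect["module"])
```

On the excerpt, the first module-tagged frame is frame #7, which is the same frame the manual breakdown singles out.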

2.2 Why This Matters

Kernel crash analysis is the final frontier of debugging:

You’re debugging the operating system itself
A crashed kernel cannot be inspected with ordinary user-space tools; the vmcore is the evidence you work from
The entire system state is at your fingertips
Understanding this gives insight into how Linux actually works

2.3 Historical Context

1999: crash utility created by Dave Anderson (initially at Mission Critical Linux, later maintained at Red Hat)
2000s: Became standard tool for enterprise Linux support
Today: Maintained as open source, used by all major Linux vendors

2.4 Common Misconceptions

Misconception 1: “crash is just GDB pointed at a kernel”

Reality: crash embeds gdb for symbol and type handling, but adds kernel-aware commands (ps, log, mod, kmem, and others) that plain GDB does not provide

Misconception 2: “You need source code to use crash”

Reality: Debug symbols are sufficient; source helps but isn’t required

Misconception 3: “Kernel debugging is too hard for non-kernel developers”

Reality: Basic crash analysis is accessible with some learning

3. Project Specification

3.1 What You Will Build

A complete analysis workflow using crash:

Load the vmcore from Project 8 into crash
Identify the crashing task and its backtrace
Examine the kernel log at crash time
Inspect the crashing module
Determine root cause of the panic
Document the complete analysis process

3.2 Functional Requirements

Load vmcore successfully
- Install crash utility and debug symbols
- Load vmcore with correct vmlinux
Basic crash navigation
- Get system information
- View process list at crash time
- Examine kernel log
Backtrace analysis
- Get backtrace of crashing task
- Identify the faulting function
- Understand the call chain
Memory examination
- Examine registers at crash time
- View memory at crash location
- Understand the failing instruction
Root cause determination
- Identify what caused the panic
- Document the analysis
- Explain how to fix the bug

3.3 Non-Functional Requirements

All analysis reproducible from documentation
Clear explanation of each crash command used
Understanding demonstrated, not just commands copied

3.4 Example Usage / Output

Loading the vmcore:

$ sudo crash /usr/lib/debug/lib/modules/6.1.0-13-amd64/vmlinux \
             /var/crash/127.0.0.1-2025-12-20-15:00:00/vmcore

crash 8.0.2
Copyright (C) 2002-2023  Red Hat, Inc.
...

      KERNEL: /usr/lib/debug/lib/modules/6.1.0-13-amd64/vmlinux
    DUMPFILE: /var/crash/127.0.0.1-2025-12-20-15:00:00/vmcore
        CPUS: 2
        DATE: Sat Dec 20 15:00:00 EST 2025
      UPTIME: 00:05:23
LOAD AVERAGE: 0.08, 0.03, 0.01
       TASKS: 142
    NODENAME: crash-lab
     RELEASE: 6.1.0-13-amd64
     VERSION: #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1 (2023-09-29)
     MACHINE: x86_64  (2400 Mhz)
      MEMORY: 4 GB
       PANIC: "Kernel panic - not syncing: Fatal exception"
         PID: 1234
     COMMAND: "insmod"
        TASK: ffff88810a4d8000  [THREAD_INFO: ffff88810a4d8000]
         CPU: 1
       STATE: TASK_RUNNING (PANIC)

crash>

Getting the backtrace:

crash> bt
PID: 1234   TASK: ffff88810a4d8000  CPU: 1   COMMAND: "insmod"
 #0 [ffffc90000a77e30] machine_kexec at ffffffff8105f370
 #1 [ffffc90000a77e80] __crash_kexec at ffffffff8111d0c1
 #2 [ffffc90000a77f48] crash_kexec at ffffffff8111e1b3
 #3 [ffffc90000a77f60] oops_end at ffffffff81027a5b
 #4 [ffffc90000a77f80] no_context at ffffffff81073a9c
 #5 [ffffc90000a77fd0] __bad_area_nosemaphore at ffffffff81073cc1
 #6 [ffffc90000a77ff0] do_page_fault at ffffffff81074a94
 #7 [ffffc90000a78018] page_fault at ffffffff81a01204
    [exception RIP: buggy_init+0x15]
    RIP: ffffffffc06ef015  RSP: ffffc90000a780d0  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffffffffc06f0000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffffffffc06ef000
    RBP: ffffc90000a780d0   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
    R13: ffffffffc06ef000  R14: 0000000000000000  R15: 0000000000000001
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #8 [ffffc90000a78108] do_one_initcall at ffffffff81001eb3
...
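
The register dump in the exception frame is the most valuable part of this output, so it helps to pull it into a form you can compute with. The sketch below is a hypothetical helper (parse_exception_registers is not a crash command) that assumes the NAME: value layout shown above. BUGGY_INIT_START is the start address of buggy_init taken from the disassembly listing in section 5.6.

```python
import re

# Register names printed in an x86-64 exception frame by bt.
REG_NAMES = {
    "RIP", "RSP", "RFLAGS", "RAX", "RBX", "RCX", "RDX", "RSI", "RDI", "RBP",
    "R8", "R9", "R10", "R11", "R12", "R13", "R14", "R15",
    "ORIG_RAX", "CS", "SS",
}
REG_RE = re.compile(r"([A-Z0-9_]+):\s+([0-9a-f]+)\b")


def parse_exception_registers(bt_text):
    """Collect NAME: hexvalue pairs from the register dump lines."""
    regs = {}
    for line in bt_text.splitlines():
        stripped = line.strip()
        # Skip frame lines ("#7 [...]") and the "[exception RIP: ...]" label.
        if stripped.startswith(("#", "[")):
            continue
        for name, value in REG_RE.findall(stripped):
            if name in REG_NAMES:
                regs[name] = int(value, 16)
    return regs


# Excerpt of the sample output above.
SAMPLE = """\
PID: 1234   TASK: ffff88810a4d8000  CPU: 1   COMMAND: "insmod"
 #7 [ffffc90000a78018] page_fault at ffffffff81a01204
    [exception RIP: buggy_init+0x15]
    RIP: ffffffffc06ef015  RSP: ffffc90000a780d0  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffffffffc06f0000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffffffffc06ef000
"""

# Start of buggy_init in this guide's sample (see the dis listing in 5.6).
BUGGY_INIT_START = 0xFFFFFFFFC06EF000

regs = parse_exception_registers(SAMPLE)
offset = regs["RIP"] - BUGGY_INIT_START
print(hex(offset), "RAX is NULL" if regs["RAX"] == 0 else "RAX is non-NULL")
```

The computed offset agrees with the buggy_init+0x15 label, and a zero RAX is the first hint that the faulting instruction dereferenced a NULL pointer.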

Viewing the kernel log:

crash> log
[    0.000000] Linux version 6.1.0-13-amd64 ...
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-6.1.0-13-amd64 root=...
...
[  123.456789] buggy_module: loading out-of-tree module taints kernel.
[  123.457123] buggy_module: module loaded, about to crash...
[  123.458456] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  123.459789] #PF: supervisor write access in kernel mode
[  123.460123] #PF: error_code(0x0002) - not-present page
[  123.461456] Oops: 0002 [#1] PREEMPT SMP NOPTI
...
[  123.470123] Kernel panic - not syncing: Fatal exception
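
The error_code(0x0002) line in the log is a bit field defined by the x86 architecture, and decoding it by hand is a useful habit. The following is a minimal sketch covering the five low bits only; the function name decode_pf_error_code is made up for this example.

```python
# x86 page-fault error code bits (low five bits of the architectural code).
PF_PROT = 1 << 0    # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1   # 0: read access, 1: write access
PF_USER = 1 << 2    # 0: kernel (supervisor) mode, 1: user mode
PF_RSVD = 1 << 3    # reserved bit set in a page-table entry
PF_INSTR = 1 << 4   # fault was an instruction fetch


def decode_pf_error_code(code):
    """Return a list of human-readable facts encoded in the error code."""
    facts = [
        "protection violation" if code & PF_PROT else "not-present page",
        "write access" if code & PF_WRITE else "read access",
        "user mode" if code & PF_USER else "kernel mode",
    ]
    if code & PF_RSVD:
        facts.append("reserved bit set in page table")
    if code & PF_INSTR:
        facts.append("instruction fetch")
    return facts


print(decode_pf_error_code(0x0002))
# ['not-present page', 'write access', 'kernel mode']
```

This matches the two #PF lines in the log: a supervisor write to a page that is not present, which is what a store through a NULL pointer looks like.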

3.5 Real World Outcome

After this project, you’ll be able to:

Analyze kernel crashes from production systems
Provide meaningful crash analysis to kernel developers
Understand root causes of kernel panics
Debug kernel modules and drivers
Support Linux systems at the deepest level

4. Solution Architecture

4.1 High-Level Design

┌─────────────────────────────────────────────────────────────────┐
│                    CRASH ANALYSIS WORKFLOW                       │
└─────────────────────────────────────────────────────────────────┘

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│    vmcore    │     │   vmlinux    │     │ Module debug │
│              │     │              │     │   symbols    │
│ (crash dump) │     │(debug syms)  │     │  (optional)  │
└──────┬───────┘     └──────┬───────┘     └──────┬───────┘
       │                    │                    │
       └────────────────────┼────────────────────┘
                            │
                            ▼
                  ┌─────────────────────┐
                  │   crash utility     │
                  │                     │
                  │  Interactive shell  │
                  └──────────┬──────────┘
                             │
        ┌────────────────────┼────────────────────┐
        ▼                    ▼                    ▼
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│ System Info  │     │  Backtrace   │     │   Memory     │
│              │     │              │     │  Analysis    │
│ • ps         │     │ • bt         │     │ • rd         │
│ • sys        │     │ • dis        │     │ • search     │
│ • mod        │     │ • struct     │     │ • vm         │
└──────────────┘     └──────────────┘     └──────────────┘
        │                    │                    │
        └────────────────────┼────────────────────┘
                             │
                             ▼
                  ┌─────────────────────┐
                  │   ROOT CAUSE        │
                  │   DETERMINATION     │
                  └─────────────────────┘

4.2 Key Components

crash utility: Interactive kernel debugger
vmlinux: Kernel with debug symbols
vmcore: Kernel memory dump
Your analysis: Systematic investigation process

4.3 Data Structures

Key kernel structures you’ll examine:

// task_struct - represents a process/thread
struct task_struct {
    unsigned int __state;          // Current state ("volatile long state" before kernel 5.14)
    void *stack;                   // Kernel stack pointer
    unsigned int flags;            // Process flags
    int on_cpu;                    // CPU running on
    struct mm_struct *mm;          // Memory descriptor
    char comm[TASK_COMM_LEN];      // Process name
    pid_t pid;                     // Process ID
    // ... hundreds more fields
};

// pt_regs - CPU registers at exception
struct pt_regs {
    unsigned long r15, r14, r13, r12;
    unsigned long bp, bx, r11, r10, r9, r8;
    unsigned long ax, cx, dx, si, di;
    unsigned long orig_ax;
    unsigned long ip;              // Instruction pointer (RIP)
    unsigned long cs;
    unsigned long flags;
    unsigned long sp;              // Stack pointer (RSP)
    unsigned long ss;
};
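
To see how a structure listing like pt_regs maps onto raw memory, you can rebuild it in user space and check the member offsets. The sketch below uses Python's ctypes purely as an illustration; PtRegs and regs_from_bytes are hypothetical names, and the field order is copied from the listing above. Inside crash, struct pt_regs -o should report the real offsets for your kernel.

```python
import ctypes

# Field order copied from the pt_regs listing above; every field is 64 bits.
_FIELDS = (
    "r15 r14 r13 r12 bp bx r11 r10 r9 r8 "
    "ax cx dx si di orig_ax ip cs flags sp ss"
).split()


class PtRegs(ctypes.LittleEndianStructure):
    """User-space replica of x86-64 struct pt_regs, for illustration only."""
    _fields_ = [(name, ctypes.c_uint64) for name in _FIELDS]


def regs_from_bytes(raw):
    """Interpret a raw memory dump (at least 168 bytes) as pt_regs."""
    return PtRegs.from_buffer_copy(raw)


# Offsets of the saved instruction pointer and stack pointer, and total size.
print(PtRegs.ip.offset, PtRegs.sp.offset, ctypes.sizeof(PtRegs))
```

With 21 eight-byte fields, the saved ip lands at byte offset 128 and sp at 152 in a 168-byte structure. This is the same arithmetic crash performs when it prints a structure instance at an address.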

4.4 Algorithm Overview

Crash Analysis Process:

Load and Verify
- Load vmcore with vmlinux
- Verify system information matches expected
Initial Reconnaissance
- Check panic message
- View kernel log
- Identify crashing task
Backtrace Analysis
- Get full backtrace
- Identify the exception frame
- Find your code in the trace
Register/Memory Examination
- Examine registers at crash
- Look at memory being accessed
- Understand the failing instruction
Root Cause Determination
- Connect evidence to cause
- Document findings
- Propose fix

5. Implementation Guide

5.1 Development Environment Setup

# Install crash utility
# Debian/Ubuntu
sudo apt-get install crash

# RHEL/CentOS/Fedora
sudo dnf install crash

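# Install debug symbols for the kernel that crashed
# ($(uname -r) is only right if you analyze on the machine that crashed
# and it is still running the same kernel version)
# Debian (Ubuntu ships -dbgsym packages from its ddebs repository instead)
sudo apt-get install linux-image-$(uname -r)-dbg

# RHEL/CentOS/Fedora
sudo debuginfo-install kernel-$(uname -r)

# Find the vmlinux file
ls /usr/lib/debug/lib/modules/$(uname -r)/vmlinux
# On Debian/Ubuntu it may instead be /usr/lib/debug/boot/vmlinux-$(uname -r)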

5.2 Project Structure

crash_analysis_project/
├── vmcore/
│   └── symlink or copy of vmcore from Project 8
├── analysis/
│   ├── session_log.txt        # Record of crash session
│   ├── root_cause.md          # Analysis conclusions
│   └── commands_used.md       # Command reference
└── scripts/
    └── analysis_commands.txt  # Batch commands for crash
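
One possible way to populate scripts/analysis_commands.txt is sketched below. It assumes that your crash build accepts -i FILE (run the commands in FILE before reading user input) and -s (skip the startup banner), as described in the crash man page; check crash -h on your system. The helper names write_command_file and crash_argv are hypothetical, and the script only builds the command line without running crash.

```python
from pathlib import Path

# First-pass commands, all taken from this guide; "exit" ends the session so
# a non-interactive run terminates on its own.
FIRST_PASS = ["sys", "log", "bt", "bt -a", "ps", "mod", "exit"]


def write_command_file(path, commands=FIRST_PASS):
    """Write one crash command per line and return the path."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(commands) + "\n")
    return path


def crash_argv(vmlinux, vmcore, command_file):
    """Build (but do not run) a crash invocation that replays the file."""
    return ["crash", "-s", "-i", str(command_file), str(vmlinux), str(vmcore)]


# Placeholder paths; substitute your own vmlinux and vmcore.
print(" ".join(crash_argv("/path/to/vmlinux", "/path/to/vmcore",
                          "scripts/analysis_commands.txt")))
```

Redirecting the output of such a run into analysis/session_log.txt gives you a repeatable first pass before you start exploring interactively.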

5.3 The Core Question You’re Answering

“Given a complete snapshot of kernel memory at the moment of death, how do you determine exactly what went wrong and why?”

This involves:

Finding the failing code location
Understanding what that code was trying to do
Examining the state that caused the failure
Connecting to the root cause

5.4 Concepts You Must Understand First

Kernel address space layout
- Where kernel code lives
- Where modules are loaded
- Stack vs heap in kernel
x86-64 exception handling
- Page fault handling
- Exception frames on stack
task_struct and process representation
- How Linux represents processes
- Current task pointer
Kernel stack layout
- Stack frames in kernel
- Return addresses and saved registers
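
For the address-space layout in particular, a quick classifier makes the address ranges concrete. The sketch below hard-codes an assumed x86-64 layout (4-level paging, a KASLR-capable build where modules start at ffffffffc0000000, and unrandomized direct-map and vmalloc bases). Treat the ranges as assumptions to verify against the x86_64 memory map document in the kernel source for your version. The function name classify is made up for this example.

```python
# Assumed x86-64 layout; see the lead-in for the assumptions behind it.
REGIONS = [
    ("direct map",  0xFFFF888000000000, 0xFFFFC87FFFFFFFFF),
    ("vmalloc",     0xFFFFC90000000000, 0xFFFFE8FFFFFFFFFF),
    ("kernel text", 0xFFFFFFFF80000000, 0xFFFFFFFFBFFFFFFF),
    ("module",      0xFFFFFFFFC0000000, 0xFFFFFFFFFEFFFFFF),
]


def classify(addr):
    """Name the region an address falls in, or 'unknown'."""
    for name, start, end in REGIONS:
        if start <= addr <= end:
            return name
    return "unknown"


# Addresses taken from the sample output in this guide.
for text in ("ffffffff81001eb3", "ffffffffc06ef015",
             "ffffc90000a78018", "ffff88810a4d8000"):
    print(text, "->", classify(int(text, 16)))
```

Under these assumptions, do_one_initcall falls in kernel text, the faulting RIP in module space, the stack address in the vmalloc area, and the task_struct pointer in the direct map.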

5.5 Questions to Guide Your Design

About the crash session:

What information does the initial crash banner tell you?
How do you know which task was running when it crashed?
What’s the difference between the current task and other tasks?

About backtrace analysis:

How do you identify where in the trace your code is?
What do the frame numbers mean?
What’s an “exception frame” in the trace?

About memory examination:

Which register held the bad pointer, and where is the faulting address reported (for a page fault)?
How do you decode the error code?
What was the code trying to do?

5.6 Thinking Exercise

Given this crash output:

PANIC: "Kernel panic - not syncing: Fatal exception"
PID: 1234    COMMAND: "insmod"

crash> bt
 #7 [ffffc90000a78018] buggy_init+0x15/0x30 [buggy_module]
    RIP: ffffffffc06ef015
    RAX: 0000000000000000

crash> dis -l buggy_init
0xffffffffc06ef000 <buggy_init>:        push   %rbp
0xffffffffc06ef001 <buggy_init+0x1>:    mov    %rsp,%rbp
0xffffffffc06ef004 <buggy_init+0x4>:    sub    $0x10,%rsp
0xffffffffc06ef008 <buggy_init+0x8>:    movq   $0x0,-0x8(%rbp)
0xffffffffc06ef010 <buggy_init+0x10>:   mov    -0x8(%rbp),%rax
0xffffffffc06ef015 <buggy_init+0x15>:   movl   $0x2a,(%rax)  <- crash here

Questions:

What instruction crashed? What is it trying to do?
What value is in %rax? What does that mean?
What’s at offset -0x8(%rbp)? How did it get there?
What’s the C code that produced this assembly?

5.7 Hints in Layers

Hint 1 - Getting Started:

# Load the crash dump
sudo crash /path/to/vmlinux /path/to/vmcore

# First commands to run:
crash> sys          # System info
crash> log | tail   # Last kernel messages
crash> bt           # Backtrace of current task

Hint 2 - Understanding the Backtrace:

# See all tasks and their states
crash> ps

# Get detailed info about the crashing task
crash> task

# Get backtrace with more detail
crash> bt -l        # With line numbers
crash> bt -f        # With frame contents

Hint 3 - Examining Memory:

# Disassemble the crashing function
crash> dis buggy_init

# Examine registers
crash> bt           # Shows registers at exception

# Read memory at address
crash> rd 0xffffc90000a78018 16

Hint 4 - Module Information:

# See loaded modules
crash> mod

# Load symbol and debug data for a specific module
crash> mod -s buggy_module

# Find module's location in memory
crash> sym buggy_init

5.8 The Interview Questions They’ll Ask

“How do you load a vmcore in crash?”
- Expected: crash vmlinux vmcore, need matching debug symbols
“How do you find what caused a kernel panic?”
- Expected: Check bt, look for exception frame, examine registers
“What’s the difference between bt in GDB and crash?”
- Expected: crash shows kernel stack, includes register state, understands kernel structures
“How do you examine a kernel data structure in crash?”
- Expected: Use struct command with address or p for variables
“What does it mean when RIP points to a module address?”
- Expected: Crash occurred in a loadable module, need module debug symbols
“How do you find all processes that were running at crash time?”
- Expected: ps lists all tasks with their states; tasks that were active on a CPU are marked with ">"

5.9 Books That Will Help

Topic	Book	Chapter(s)
crash Usage	“Red Hat Crash Course”	Online documentation
Kernel Internals	“Linux Kernel Development” - Love	Ch. 3-5
Process Management	“Understanding the Linux Kernel”	Ch. 3
Memory Management	“Linux Kernel Development”	Ch. 15-16

5.10 Implementation Phases

Phase 1: Setup and Loading (Day 1-2)

Install crash and debug symbols
Successfully load vmcore
Verify version match

Phase 2: Basic Navigation (Day 3-4)

Learn essential commands
Navigate system information
View process list and logs

Phase 3: Backtrace Analysis (Day 5-7)

Analyze the full backtrace
Identify the crashing function
Understand the call chain

Phase 4: Deep Dive (Day 8-10)

Examine registers and memory
Disassemble crashing code
Trace the root cause

Phase 5: Documentation (Day 11-14)

Write up complete analysis
Document commands used
Create reproducible walkthrough

5.11 Key Implementation Decisions

Recording Session: Use script command to record your crash session
Command Notes: Keep notes on what each command revealed
Iterative Analysis: Expect to make several passes before the full picture is clear

6. Testing Strategy

Verification Checklist

Before analysis:

crash loads vmcore without errors
Kernel version matches between vmcore and vmlinux
Module symbols available (if crash in module)

During analysis:

Can get backtrace of crashing task
Can identify the faulting instruction
Can examine relevant registers
Can read memory at crash location

After analysis:

Root cause identified and documented
Evidence supports conclusion
Analysis is reproducible

Self-Test Questions

After your analysis, you should be able to answer:

What function crashed?
What instruction failed?
Why did it fail (what was the invalid access)?
What would fix the bug?

7. Common Pitfalls & Debugging

Pitfall 1: Version Mismatch

Problem: crash reports symbol errors or wrong data

Symptom:

crash: cannot resolve "init_task"

Solution:

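# Verify versions match
crash> sys | grep RELEASE
# Compare with the vmlinux package version (run in a shell, not inside crash)
rpm -qf /usr/lib/debug/lib/modules/*/vmlinux    # RHEL/Fedora
dpkg -S /usr/lib/debug/lib/modules/*/vmlinux    # Debian/Ubuntu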
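
If you keep several dumps and debug kernels around, the comparison can be scripted. The sketch below is a hypothetical helper, not part of crash: it assumes the RELEASE: line format shown in section 3.4 and a vmlinux path of the form .../lib/modules/<version>/vmlinux. A matching release string is necessary but not sufficient, since a locally rebuilt kernel can reuse the same string.

```python
import re


def release_from_sys(sys_output):
    """Extract the RELEASE field from crash's sys (or banner) output."""
    m = re.search(r"^\s*RELEASE:\s*(\S+)", sys_output, re.MULTILINE)
    return m.group(1) if m else None


def release_from_vmlinux_path(path):
    """Extract <version> from .../lib/modules/<version>/vmlinux."""
    m = re.search(r"/lib/modules/([^/]+)/vmlinux$", path)
    return m.group(1) if m else None


def releases_match(sys_output, vmlinux_path):
    """True only if both releases were found and are identical."""
    dumped = release_from_sys(sys_output)
    symbols = release_from_vmlinux_path(vmlinux_path)
    return dumped is not None and dumped == symbols


# Values taken from the sample session in section 3.4.
banner = "    NODENAME: crash-lab\n     RELEASE: 6.1.0-13-amd64\n"
path = "/usr/lib/debug/lib/modules/6.1.0-13-amd64/vmlinux"
print(releases_match(banner, path))
```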

Pitfall 2: Missing Module Symbols

Problem: Backtrace shows ?? for module functions

Symptom:

#7 [address] ?? ()

Solution:

# Load module with debug info
crash> mod -s buggy_module /path/to/buggy_module.ko.debug

# Or load symbols for all modules, searching a directory tree for the debug files
crash> mod -S /path/to/debug/modules/

Pitfall 3: Confusing Frame Numbers

Problem: Not understanding which frame is “your code”

Solution: Look for the exception frame (in crash output, the [exception RIP: ...] line followed by the register dump) or for addresses in a different range: in the examples here, module code sits at ffffffffc0... while core kernel text sits at ffffffff81...

Pitfall 4: Not Recording Analysis

Problem: Can’t reproduce analysis later

Solution:

# Use script to record everything
script analysis_session.txt
crash vmlinux vmcore
# ... do analysis...
exit

8. Extensions & Challenges

Extension 1: Analyze Different Panic Types

Trigger and analyze:

Stack overflow panic
Deadlock (soft lockup)
Memory corruption (use-after-free)
Hardware error simulation

Extension 2: Multi-CPU Analysis

For SMP systems:

crash> bt -a          # Backtrace all CPUs
crash> foreach bt     # Foreach task backtrace
crash> runq           # Run queues per CPU

Extension 3: Automated Analysis Script

Write a crash extension that:

Automatically extracts key information
Generates a structured report
Highlights common issues

Extension 4: Network/Storage Analysis

Analyze crashes in:

Network drivers
Filesystem code
Block device drivers

9. Real-World Connections

Enterprise Support

Support teams at Red Hat, SUSE, and Canonical (Ubuntu) use crash daily:

Customer sends vmcore
Engineer analyzes with crash
Root cause identified
Fix developed or workaround provided

Kernel Development

Kernel developers use crash to:

Understand complex bugs
Verify fixes are correct
Analyze race conditions
Debug lock ordering issues

Forensics

Security researchers use crash for:

Analyzing kernel exploits
Understanding rootkit behavior
Post-incident analysis

10. Resources

Official Documentation

Tutorials

Source Code

11. Self-Assessment Checklist

Before You Start

Have vmcore from Project 8
crash utility installed
Matching vmlinux with debug symbols
Basic understanding of kernel concepts

After Completion

Can load vmcore successfully
Can navigate crash interface
Can get and interpret backtraces
Can examine kernel memory and registers
Can identify root cause of panic
Can document analysis clearly
Feel confident debugging kernel issues

12. Submission / Completion Criteria

Your project is complete when you have:

Successful crash Session
- Loaded vmcore with correct vmlinux
- No symbol resolution errors
Complete Analysis
- Documented backtrace interpretation
- Identified crashing function and instruction
- Explained why the crash occurred
- Described how to fix the bug
Command Mastery
- Used at least 10 different crash commands
- Understand what each command shows
- Can explain when to use each
Documentation
- Session log recorded
- Analysis written up clearly
- Commands and their purposes documented
Understanding Demonstrated
- Can answer interview questions in 5.8
- Can explain the analysis to someone else
- Could analyze a different vmcore

Next: Project 10: Building a Centralized Crash Reporter - Scale crash analysis to production systems