Project 9: Analyzing a Kernel Panic with crash
Use the crash utility to dissect a vmcore and understand exactly why the kernel died.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Expert |
| Time Estimate | 1-2 weeks |
| Language | crash utility commands |
| Prerequisites | Project 8 (vmcore capture) |
| Key Topics | crash utility, kernel debugging, kernel data structures, vmcore analysis |
1. Learning Objectives
By completing this project, you will:
- Learn to use the crash utility for kernel dump analysis
- Understand how to navigate kernel data structures
- Extract meaningful information from a vmcore file
- Identify the root cause of a kernel panic
- Learn essential crash commands for debugging
- Understand kernel symbol tables and debug information
2. Theoretical Foundation
2.1 Core Concepts
What is the crash Utility?
The crash utility is an interactive analysis tool for kernel crash dumps and live systems. Think of it as “GDB for the kernel”—but specialized for post-mortem analysis of vmcore files.
┌─────────────────────────────────────────────────────────────────┐
│                     GDB vs CRASH COMPARISON                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  GDB                           crash                            │
│  ───                           ─────                            │
│  User-space debugger           Kernel debugger                  │
│  Works on core dumps           Works on vmcore files            │
│  Needs debug symbols           Needs kernel debuginfo           │
│  Understands C structures      Understands kernel structures    │
│  One process at a time         Entire system state              │
│  bt shows user stack           bt shows kernel stack            │
│  Generic commands              Kernel-specific commands         │
│                                                                 │
│  Similar Commands:             crash-Specific Commands:         │
│  bt (backtrace)                ps (process list)                │
│  p (print)                     files (open files)               │
│  x (examine)                   net (network info)               │
│  info registers                dev (device info)                │
│                                mod (module info)                │
│                                log (dmesg output)               │
│                                struct (kernel structures)       │
│                                task (task_struct info)          │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
The vmlinux File
To analyze a vmcore, crash needs the vmlinux file—the uncompressed kernel image with debug symbols:
┌─────────────────────────────────────────────────────────────────┐
│ VMCORE ANALYSIS REQUIREMENTS │
├─────────────────────────────────────────────────────────────────┤
│ │
│ You Need: │
│ │
│ 1. vmcore file │
│ └─ The crash dump from kdump │
│ └─ Located in /var/crash/*/vmcore │
│ │
│ 2. vmlinux file (with debug symbols) │
│ └─ Uncompressed kernel with DWARF info │
│ └─ From kernel-debuginfo package │
│ └─ Usually in /usr/lib/debug/lib/modules/<version>/ │
│ │
│ 3. (Optional) Module debug symbols │
│ └─ If crash is in a module │
│ └─ .ko.debug files │
│ │
│ crash vmlinux vmcore │
│ ↓ ↓ │
│ │ └─ Memory snapshot │
│ └─ Symbol information │
│ │
└─────────────────────────────────────────────────────────────────┘
Essential crash Commands
┌─────────────────────────────────────────────────────────────────┐
│ CRASH COMMAND REFERENCE │
├─────────────────────────────────────────────────────────────────┤
│ │
│ NAVIGATION & BASICS │
│ ─────────────────── │
│ bt - Backtrace of current/specified task │
│ ps - Show process list at time of crash │
│ log - Display kernel message buffer (dmesg) │
│ sys - System information │
│ help CMD - Get help on any command │
│ │
│ MEMORY EXAMINATION │
│ ────────────────── │
│ rd ADDR - Read memory (default: 8 bytes) │
│ x/FMT - GDB-style examine (x/16wx, x/s, etc.) │
│ search - Search memory for patterns │
│ vm - Virtual memory information for a task │
│ │
│ DATA STRUCTURES │
│ ─────────────── │
│ struct - Display structure definition or instance │
│ p EXPR - Print expression/variable value │
│ whatis - Show type of symbol │
│ sym - Translate address to symbol or vice versa │
│ │
│ PROCESS/TASK ANALYSIS │
│ ───────────────────── │
│ task - Display task_struct contents │
│ set - Set current context (task, CPU) │
│ files - Open files for a process │
│ sig - Signal information │
│ │
│ MODULE ANALYSIS │
│ ─────────────── │
│ mod - Loaded modules information │
│ dis ADDR - Disassemble at address │
│ │
│ ADVANCED │
│ ──────── │
│ foreach - Execute command across all tasks/CPUs │
│ runq - Display run queue │
│ kmem - Kernel memory usage │
│ timer - Timer information │
│ │
└─────────────────────────────────────────────────────────────────┘
Kernel Stack Trace Anatomy
Understanding a kernel backtrace:
crash> bt
PID: 1234 TASK: ffff88810a4d8000 CPU: 1 COMMAND: "insmod"
#0 [ffffc90000a77e30] machine_kexec at ffffffff8105f370
#1 [ffffc90000a77e80] __crash_kexec at ffffffff8111d0c1
#2 [ffffc90000a77f48] crash_kexec at ffffffff8111e1b3
#3 [ffffc90000a77f60] oops_end at ffffffff81027a5b
#4 [ffffc90000a77f80] do_trap at ffffffff81021a7c
#5 [ffffc90000a77fd0] do_error_trap at ffffffff81021c93
#6 [ffffc90000a77ff0] exc_divide_error at ffffffff81a00c90
--- <EXCEPTION FRAME> ---
#7 [ffffc90000a78018] buggy_init+0x15/0x30 [buggy_module] <-- YOUR CODE
#8 [ffffc90000a78038] do_one_initcall at ffffffff81001eb3
#9 [ffffc90000a78098] do_init_module at ffffffff8111a16e
#10 [ffffc90000a780b8] load_module at ffffffff81119f67
#11 [ffffc90000a78200] __do_sys_finit_module at ffffffff8111af31
#12 [ffffc90000a78250] do_syscall_64 at ffffffff81a0007c
#13 [ffffc90000a78280] entry_SYSCALL_64_after_hwframe at ffffffff81c0008b
Breaking this down:
────────────────────
Frame #0-6: Crash handling code (after the panic)
Frame #7: YOUR CODE - where the bug actually occurred
Frame #8-11: Module loading code path
Frame #12-13: System call entry point
2.2 Why This Matters
Kernel crash analysis is the final frontier of debugging:
- You’re debugging the operating system itself
- Ordinary user-space debuggers cannot inspect a kernel that has already panicked; the vmcore is usually the best evidence left
- The entire system state is at your fingertips
- Understanding this gives insight into how Linux actually works
2.3 Historical Context
- 1999: crash utility created by Dave Anderson at Mission Critical Linux; he continued its development at Red Hat
- 2000s: Became standard tool for enterprise Linux support
- Today: Maintained as open source, used by all major Linux vendors
2.4 Common Misconceptions
Misconception 1: “crash is just GDB pointed at a kernel”
- Reality: crash embeds GDB for symbol and type handling, but adds its own kernel-aware commands (ps, log, mod, files) that plain GDB does not have
Misconception 2: “You need source code to use crash”
- Reality: Debug symbols are sufficient; source helps but isn’t required
Misconception 3: “Kernel debugging is too hard for non-kernel developers”
- Reality: Basic crash analysis is accessible with some learning
3. Project Specification
3.1 What You Will Build
A complete analysis workflow using crash:
- Load the vmcore from Project 8 into crash
- Identify the crashing task and its backtrace
- Examine the kernel log at crash time
- Inspect the crashing module
- Determine root cause of the panic
- Document the complete analysis process
3.2 Functional Requirements
- Load vmcore successfully
  - Install crash utility and debug symbols
  - Load vmcore with correct vmlinux
- Basic crash navigation
  - Get system information
  - View process list at crash time
  - Examine kernel log
- Backtrace analysis
  - Get backtrace of crashing task
  - Identify the faulting function
  - Understand the call chain
- Memory examination
  - Examine registers at crash time
  - View memory at crash location
  - Understand the failing instruction
- Root cause determination
  - Identify what caused the panic
  - Document the analysis
  - Explain how to fix the bug
3.3 Non-Functional Requirements
- All analysis reproducible from documentation
- Clear explanation of each crash command used
- Understanding demonstrated, not just commands copied
3.4 Example Usage / Output
Loading the vmcore:
$ sudo crash /usr/lib/debug/lib/modules/6.1.0-13-amd64/vmlinux \
/var/crash/127.0.0.1-2025-12-20-15:00:00/vmcore
crash 8.0.2
Copyright (C) 2002-2023 Red Hat, Inc.
...
KERNEL: /usr/lib/debug/lib/modules/6.1.0-13-amd64/vmlinux
DUMPFILE: /var/crash/127.0.0.1-2025-12-20-15:00:00/vmcore
CPUS: 2
DATE: Sat Dec 20 15:00:00 EST 2025
UPTIME: 00:05:23
LOAD AVERAGE: 0.08, 0.03, 0.01
TASKS: 142
NODENAME: crash-lab
RELEASE: 6.1.0-13-amd64
VERSION: #1 SMP PREEMPT_DYNAMIC Debian 6.1.55-1 (2023-09-29)
MACHINE: x86_64 (2400 Mhz)
MEMORY: 4 GB
PANIC: "Kernel panic - not syncing: Fatal exception"
PID: 1234
COMMAND: "insmod"
TASK: ffff88810a4d8000 [THREAD_INFO: ffff88810a4d8000]
CPU: 1
STATE: TASK_RUNNING (PANIC)
crash>
Getting the backtrace:
crash> bt
PID: 1234 TASK: ffff88810a4d8000 CPU: 1 COMMAND: "insmod"
#0 [ffffc90000a77e30] machine_kexec at ffffffff8105f370
#1 [ffffc90000a77e80] __crash_kexec at ffffffff8111d0c1
#2 [ffffc90000a77f48] crash_kexec at ffffffff8111e1b3
#3 [ffffc90000a77f60] oops_end at ffffffff81027a5b
#4 [ffffc90000a77f80] no_context at ffffffff81073a9c
#5 [ffffc90000a77fd0] __bad_area_nosemaphore at ffffffff81073cc1
#6 [ffffc90000a77ff0] do_page_fault at ffffffff81074a94
#7 [ffffc90000a78018] page_fault at ffffffff81a01204
[exception RIP: buggy_init+0x15]
RIP: ffffffffc06ef015 RSP: ffffc90000a780d0 RFLAGS: 00010246
RAX: 0000000000000000 RBX: ffffffffc06f0000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffc06ef000
RBP: ffffc90000a780d0 R8: 0000000000000000 R9: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffffffffc06ef000 R14: 0000000000000000 R15: 0000000000000001
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#8 [ffffc90000a78108] do_one_initcall at ffffffff81001eb3
...
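When you document an analysis, it can help to turn the exception frame into data rather than reading it by eye. The following is a minimal sketch, not part of crash itself: it assumes you have pasted the register dump from the bt output above into a string, and the function name parse_exception_frame is hypothetical.

```python
import re

# Register dump copied from the `bt` output shown above.
EXCEPTION_FRAME = """\
    [exception RIP: buggy_init+0x15]
    RIP: ffffffffc06ef015  RSP: ffffc90000a780d0  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffffffffc06f0000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffffffffc06ef000
    RBP: ffffc90000a780d0   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
    R13: ffffffffc06ef000  R14: 0000000000000000  R15: 0000000000000001
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
"""


def parse_exception_frame(text):
    """Return (symbol, offset, registers) parsed from a crash exception frame."""
    match = re.search(r"\[exception RIP: (\w+)\+(0x[0-9a-f]+)\]", text)
    symbol = match.group(1) if match else None
    offset = int(match.group(2), 16) if match else None
    # Every "NAME: hexvalue" pair becomes a dictionary entry.
    registers = {
        name: int(value, 16)
        for name, value in re.findall(r"\b([A-Z][A-Z0-9_]*): ([0-9a-f]+)\b", text)
    }
    return symbol, offset, registers


symbol, offset, regs = parse_exception_frame(EXCEPTION_FRAME)
# RIP minus the offset gives the start address of the faulting function.
function_start = regs["RIP"] - offset
```

Here regs["RAX"] is 0, which is the first hint of a NULL pointer, and function_start equals the value in RDI and R13, the address of buggy_init.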
Viewing the kernel log:
crash> log
[ 0.000000] Linux version 6.1.0-13-amd64 ...
[ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-6.1.0-13-amd64 root=...
...
[ 123.456789] buggy_module: loading out-of-tree module taints kernel.
[ 123.457123] buggy_module: module loaded, about to crash...
[ 123.458456] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 123.459789] #PF: supervisor write access in kernel mode
[ 123.460123] #PF: error_code(0x0002) - not-present page
[ 123.461456] Oops: 0002 [#1] PREEMPT SMP NOPTI
...
[ 123.470123] Kernel panic - not syncing: Fatal exception
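The error_code(0x0002) and Oops: 0002 values in the log are the x86 page-fault error code, a bit field set by the CPU. A small sketch of how to decode the low bits follows; the function name decode_pf_error_code is hypothetical, and only the five lowest architectural bits are covered.

```python
# Low bits of the x86 page-fault error code: (mask, meaning if set, meaning if clear).
PF_BITS = [
    (0x01, "protection violation", "not-present page"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "supervisor (kernel) mode"),
    (0x08, "reserved bit set in page table entry", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error_code(code):
    """Return a list of human-readable facts encoded in a page-fault error code."""
    facts = []
    for mask, if_set, if_clear in PF_BITS:
        description = if_set if code & mask else if_clear
        if description:
            facts.append(description)
    return facts


print(decode_pf_error_code(0x0002))
```

For 0x0002 this yields a not-present page, a write access, and supervisor mode, which matches the two #PF lines the kernel printed.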
3.5 Real World Outcome
After this project, you’ll be able to:
- Analyze kernel crashes from production systems
- Provide meaningful crash analysis to kernel developers
- Understand root causes of kernel panics
- Debug kernel modules and drivers
- Support Linux systems at the deepest level
4. Solution Architecture
4.1 High-Level Design
┌─────────────────────────────────────────────────────────────────┐
│ CRASH ANALYSIS WORKFLOW │
└─────────────────────────────────────────────────────────────────┘
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ vmcore │ │ vmlinux │ │ Module debug │
│ │ │ │ │ symbols │
│ (crash dump) │ │(debug syms) │ │ (optional) │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
│ │ │
└────────────────────┼────────────────────┘
│
▼
┌─────────────────────┐
│ crash utility │
│ │
│ Interactive shell │
└──────────┬──────────┘
│
┌────────────────────┼────────────────────┐
▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ System Info │ │ Backtrace │ │ Memory │
│ │ │ │ │ Analysis │
│ • ps │ │ • bt │ │ • rd │
│ • sys │ │ • dis │ │ • search │
│ • mod │ │ • struct │ │ • vm │
└──────────────┘ └──────────────┘ └──────────────┘
│ │ │
└────────────────────┼────────────────────┘
│
▼
┌─────────────────────┐
│ ROOT CAUSE │
│ DETERMINATION │
└─────────────────────┘
4.2 Key Components
- crash utility: Interactive kernel debugger
- vmlinux: Kernel with debug symbols
- vmcore: Kernel memory dump
- Your analysis: Systematic investigation process
4.3 Data Structures
Key kernel structures you’ll examine:
// task_struct - represents a process/thread
struct task_struct {
unsigned int __state; // Current state (was "volatile long state" before Linux 5.14)
void *stack; // Kernel stack pointer
unsigned int flags; // Process flags
int on_cpu; // CPU running on
struct mm_struct *mm; // Memory descriptor
char comm[TASK_COMM_LEN]; // Process name
pid_t pid; // Process ID
// ... hundreds more fields
};
// pt_regs - CPU registers at exception
struct pt_regs {
unsigned long r15, r14, r13, r12;
unsigned long bp, bx, r11, r10, r9, r8;
unsigned long ax, cx, dx, si, di;
unsigned long orig_ax;
unsigned long ip; // Instruction pointer (RIP)
unsigned long cs;
unsigned long flags;
unsigned long sp; // Stack pointer (RSP)
unsigned long ss;
};
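Because every pt_regs member listed above is 8 bytes wide on x86-64, the byte offset of each saved register follows directly from its position. The sketch below mirrors the simplified layout shown here with ctypes so you can compute offsets when reading a raw exception frame with rd. The class name PtRegs is hypothetical, and on a real dump the struct -o pt_regs command in crash is the authoritative source for offsets.

```python
import ctypes

# Field order copied from the simplified pt_regs definition above.
PT_REGS_FIELDS = (
    "r15", "r14", "r13", "r12",
    "bp", "bx", "r11", "r10", "r9", "r8",
    "ax", "cx", "dx", "si", "di",
    "orig_ax", "ip", "cs", "flags", "sp", "ss",
)


class PtRegs(ctypes.Structure):
    """Userspace mirror of the x86-64 pt_regs layout (illustration only)."""
    _fields_ = [(name, ctypes.c_uint64) for name in PT_REGS_FIELDS]


def pt_regs_offsets():
    """Map each register name to its byte offset inside pt_regs."""
    return {name: getattr(PtRegs, name).offset for name in PT_REGS_FIELDS}


offsets = pt_regs_offsets()
print(hex(offsets["ip"]), hex(offsets["sp"]), ctypes.sizeof(PtRegs))
```

With this layout the saved instruction pointer sits 0x80 bytes into the frame and the saved stack pointer at 0x98.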
4.4 Algorithm Overview
Crash Analysis Process:
- Load and Verify
  - Load vmcore with vmlinux
  - Verify system information matches expected
- Initial Reconnaissance
  - Check panic message
  - View kernel log
  - Identify crashing task
- Backtrace Analysis
  - Get full backtrace
  - Identify the exception frame
  - Find your code in the trace
- Register/Memory Examination
  - Examine registers at crash
  - Look at memory being accessed
  - Understand the failing instruction
- Root Cause Determination
  - Connect evidence to cause
  - Document findings
  - Propose fix
5. Implementation Guide
5.1 Development Environment Setup
# Install crash utility
# Debian/Ubuntu
sudo apt-get install crash
# RHEL/CentOS/Fedora
sudo dnf install crash
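# Install debug symbols for the kernel that produced the vmcore
# ($(uname -r) is only correct if you are running that same kernel)
# Debian
sudo apt-get install linux-image-$(uname -r)-dbg
# Ubuntu (needs the ddebs debug-symbol repository enabled)
sudo apt-get install linux-image-$(uname -r)-dbgsym
# RHEL/CentOS
sudo debuginfo-install kernel-$(uname -r)
# Find the vmlinux file
ls /usr/lib/debug/lib/modules/$(uname -r)/vmlinux
# Ubuntu's dbgsym package typically installs it here instead
ls /usr/lib/debug/boot/vmlinux-$(uname -r)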
5.2 Project Structure
crash_analysis_project/
├── vmcore/
│ └── symlink or copy of vmcore from Project 8
├── analysis/
│ ├── session_log.txt # Record of crash session
│ ├── root_cause.md # Analysis conclusions
│ └── commands_used.md # Command reference
└── scripts/
└── analysis_commands.txt # Batch commands for crash
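The scripts/analysis_commands.txt file can be generated and replayed instead of typed by hand. The sketch below builds such a file and the matching crash invocation. It assumes crash's -i option (read commands from a file) and -s option (suppress the startup banner); the helper names and the default command list are hypothetical.

```python
# Hypothetical default command list; adjust it to the questions you are asking.
DEFAULT_COMMANDS = ["sys", "log", "bt", "bt -a", "ps", "mod"]


def render_command_file(commands=DEFAULT_COMMANDS):
    """Return the text of a crash input file: one command per line, then exit."""
    return "\n".join(list(commands) + ["exit"]) + "\n"


def build_crash_argv(vmlinux, vmcore, command_file):
    """Build the argument list for a non-interactive crash run."""
    # -s: skip the banner, -i: execute the commands in command_file.
    return ["crash", "-s", "-i", command_file, vmlinux, vmcore]


print(render_command_file())
print(build_crash_argv("/path/to/vmlinux", "/path/to/vmcore",
                       "scripts/analysis_commands.txt"))
```

Write the rendered text to scripts/analysis_commands.txt, then pass the argument list to subprocess.run and redirect its output to analysis/session_log.txt. The trailing exit matters: without it crash stays at its prompt after the last command.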
5.3 The Core Question You’re Answering
“Given a complete snapshot of kernel memory at the moment of death, how do you determine exactly what went wrong and why?”
This involves:
- Finding the failing code location
- Understanding what that code was trying to do
- Examining the state that caused the failure
- Connecting to the root cause
5.4 Concepts You Must Understand First
- Kernel address space layout
  - Where kernel code lives
  - Where modules are loaded
  - Stack vs heap in kernel
- x86-64 exception handling
  - Page fault handling
  - Exception frames on stack
- task_struct and process representation
  - How Linux represents processes
  - Current task pointer
- Kernel stack layout
  - Stack frames in kernel
  - Return addresses and saved registers
5.5 Questions to Guide Your Design
About the crash session:
- What information does the initial crash banner tell you?
- How do you know which task was running when it crashed?
- What’s the difference between the current task and other tasks?
About backtrace analysis:
- How do you identify where in the trace your code is?
- What do the frame numbers mean?
- What’s an “exception frame” in the trace?
About memory examination:
- What register caused the fault (for a page fault)?
- How do you decode the error code?
- What was the code trying to do?
5.6 Thinking Exercise
Given this crash output:
PANIC: "Kernel panic - not syncing: Fatal exception"
PID: 1234 COMMAND: "insmod"
crash> bt
#7 [ffffc90000a78018] buggy_init+0x15/0x30 [buggy_module]
RIP: ffffffffc06ef015
RAX: 0000000000000000
crash> dis -l buggy_init
0xffffffffc06ef000 <buggy_init>: push %rbp
0xffffffffc06ef001 <buggy_init+0x1>: mov %rsp,%rbp
0xffffffffc06ef004 <buggy_init+0x4>: sub $0x10,%rsp
0xffffffffc06ef008 <buggy_init+0x8>: movq $0x0,-0x8(%rbp)
0xffffffffc06ef010 <buggy_init+0x10>: mov -0x8(%rbp),%rax
0xffffffffc06ef015 <buggy_init+0x15>: movl $0x2a,(%rax) <- crash here
Questions:
- What instruction crashed? What is it trying to do?
- What value is in %rax? What does that mean?
- What’s at offset -0x8(%rbp)? How did it get there?
- What’s the C code that produced this assembly?
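One way to check your answers to the first two questions is to match RIP against the disassembly mechanically. The sketch below does that for the listing above. It assumes the dis output has been pasted into a string, and both helper names are hypothetical.

```python
import re

# Disassembly copied from the `dis -l buggy_init` output above.
DISASSEMBLY = """\
0xffffffffc06ef000 <buggy_init>: push %rbp
0xffffffffc06ef001 <buggy_init+0x1>: mov %rsp,%rbp
0xffffffffc06ef004 <buggy_init+0x4>: sub $0x10,%rsp
0xffffffffc06ef008 <buggy_init+0x8>: movq $0x0,-0x8(%rbp)
0xffffffffc06ef010 <buggy_init+0x10>: mov -0x8(%rbp),%rax
0xffffffffc06ef015 <buggy_init+0x15>: movl $0x2a,(%rax)
"""


def instruction_at(disassembly, rip):
    """Return (location, instruction) for the line whose address equals rip."""
    for line in disassembly.splitlines():
        match = re.match(r"\s*0x([0-9a-f]+) <([^>]+)>:\s+(.*\S)", line)
        if match and int(match.group(1), 16) == rip:
            return match.group(2), match.group(3)
    return None


def memory_base_register(instruction):
    """Return the base register of the first memory operand, e.g. 'rax'."""
    match = re.search(r"\(%(\w+)\)", instruction)
    return match.group(1) if match else None


location, instruction = instruction_at(DISASSEMBLY, 0xffffffffc06ef015)
print(location, "|", instruction, "| base register:", memory_base_register(instruction))
```

The base register it reports is the one to look up in the register dump: if that register holds 0, the instruction was writing through a NULL pointer.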
5.7 Hints in Layers
Hint 1 - Getting Started:
# Load the crash dump
sudo crash /path/to/vmlinux /path/to/vmcore
# First commands to run:
crash> sys # System info
crash> log | tail # Last kernel messages
crash> bt # Backtrace of current task
Hint 2 - Understanding the Backtrace:
# See all tasks and their states
crash> ps
# Get detailed info about the crashing task
crash> task
# Get backtrace with more detail
crash> bt -l # With line numbers
crash> bt -f # With frame contents
Hint 3 - Examining Memory:
# Disassemble the crashing function
crash> dis buggy_init
# Examine registers
crash> bt # Shows registers at exception
# Read memory at address
crash> rd 0xffffc90000a78018 16
Hint 4 - Module Information:
# See loaded modules
crash> mod
# Get info about specific module
crash> mod -s buggy_module
# Find module's location in memory
crash> sym buggy_init
5.8 The Interview Questions They’ll Ask
- “How do you load a vmcore in crash?”
  - Expected: crash vmlinux vmcore, need matching debug symbols
- “How do you find what caused a kernel panic?”
  - Expected: Check bt, look for exception frame, examine registers
- “What’s the difference between bt in GDB and crash?”
  - Expected: crash shows kernel stack, includes register state, understands kernel structures
- “How do you examine a kernel data structure in crash?”
  - Expected: Use struct command with address or p for variables
- “What does it mean when RIP points to a module address?”
  - Expected: Crash occurred in a loadable module, need module debug symbols
- “How do you find all processes that were running at crash time?”
  - Expected: ps command shows all tasks with their states
5.9 Books That Will Help
| Topic | Book | Chapter(s) |
|---|---|---|
| crash Usage | “Red Hat Crash Course” | Online documentation |
| Kernel Internals | “Linux Kernel Development” - Love | Ch. 3-5 |
| Process Management | “Understanding the Linux Kernel” | Ch. 3 |
| Memory Management | “Linux Kernel Development” | Ch. 15-16 |
5.10 Implementation Phases
Phase 1: Setup and Loading (Day 1-2)
- Install crash and debug symbols
- Successfully load vmcore
- Verify version match
Phase 2: Basic Navigation (Day 3-4)
- Learn essential commands
- Navigate system information
- View process list and logs
Phase 3: Backtrace Analysis (Day 5-7)
- Analyze the full backtrace
- Identify the crashing function
- Understand the call chain
Phase 4: Deep Dive (Day 8-10)
- Examine registers and memory
- Disassemble crashing code
- Trace the root cause
Phase 5: Documentation (Day 11-14)
- Write up complete analysis
- Document commands used
- Create reproducible walkthrough
5.11 Key Implementation Decisions
- Recording Session: Use the script command to record your crash session
- Command Notes: Keep notes on what each command revealed
- Iterative Analysis: Expect to need multiple passes before you fully understand the crash
6. Testing Strategy
Verification Checklist
Before analysis:
- crash loads vmcore without errors
- Kernel version matches between vmcore and vmlinux
- Module symbols available (if crash in module)
During analysis:
- Can get backtrace of crashing task
- Can identify the faulting instruction
- Can examine relevant registers
- Can read memory at crash location
After analysis:
- Root cause identified and documented
- Evidence supports conclusion
- Analysis is reproducible
Self-Test Questions
After your analysis, you should be able to answer:
- What function crashed?
- What instruction failed?
- Why did it fail (what was the invalid access)?
- What would fix the bug?
7. Common Pitfalls & Debugging
Pitfall 1: Version Mismatch
Problem: crash reports symbol errors or wrong data
Symptom:
crash: cannot resolve "init_task"
Solution:
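# Verify versions match
crash> sys | grep RELEASE
# Compare with the package that owns vmlinux (run these in a regular shell)
rpm -qf /usr/lib/debug/lib/modules/*/vmlinux    # RHEL/Fedora
dpkg -S /usr/lib/debug/lib/modules/*/vmlinux    # Debian/Ubuntu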
Pitfall 2: Missing Module Symbols
Problem: Backtrace shows ?? for module functions
Symptom:
#7 [address] ?? ()
Solution:
# Load module with debug info
crash> mod -s buggy_module /path/to/buggy_module.ko.debug
# Or load debug symbols for all modules, searching a directory tree
crash> mod -S /path/to/debug/modules/
Pitfall 3: Confusing Frame Numbers
Problem: Not understanding which frame is “your code”
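Solution: Look for the exception frame. crash marks it with an [exception RIP: ...] line followed by a register dump, as in the backtrace in Section 3.4. Module code also stands out by address: in these examples module addresses start with ffffffffc0, while core kernel text starts with ffffffff81.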
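The address-prefix check can be made explicit. The sketch below classifies an address against the documented x86-64 layout for 4-level paging, as listed in the kernel's x86_64 mm.rst memory map. This is an assumption-laden illustration: with KASLR enabled the region bases are randomized, and on a real dump the sym and kmem commands in crash give the authoritative answer. The function name classify_address is hypothetical.

```python
# (name, first address, last address) for x86-64 with 4-level paging, KASLR off.
REGIONS = [
    ("direct mapping of physical memory", 0xffff888000000000, 0xffffc87fffffffff),
    ("vmalloc/ioremap space", 0xffffc90000000000, 0xffffe8ffffffffff),
    ("kernel text mapping", 0xffffffff80000000, 0xffffffff9fffffff),
    ("module mapping space", 0xffffffffa0000000, 0xfffffffffeffffff),
]


def classify_address(address):
    """Return the name of the kernel region containing address, if any."""
    for name, start, end in REGIONS:
        if start <= address <= end:
            return name
    return "unknown or not a kernel address"


# Addresses taken from the example output in this project.
for label, address in [
    ("RIP", 0xffffffffc06ef015),
    ("do_one_initcall", 0xffffffff81001eb3),
    ("stack frame", 0xffffc90000a78018),
    ("task_struct", 0xffff88810a4d8000),
]:
    print(f"{label:16} {address:#018x} -> {classify_address(address)}")
```

Seen this way, the RIP falls in module space while do_one_initcall is in kernel text, which is why frame #7 stands out as module code.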
Pitfall 4: Not Recording Analysis
Problem: Can’t reproduce analysis later
Solution:
# Use script to record everything
script analysis_session.txt
crash vmlinux vmcore
# ... do analysis ...
exit
8. Extensions & Challenges
Extension 1: Analyze Different Panic Types
Trigger and analyze:
- Stack overflow panic
- Deadlock (soft lockup)
- Memory corruption (use-after-free)
- Hardware error simulation
Extension 2: Multi-CPU Analysis
For SMP systems:
crash> bt -a # Backtrace all CPUs
crash> foreach bt # Foreach task backtrace
crash> runq # Run queues per CPU
Extension 3: Automated Analysis Script
Write a crash extension that:
- Automatically extracts key information
- Generates a structured report
- Highlights common issues
Extension 4: Network/Storage Analysis
Analyze crashes in:
- Network drivers
- Filesystem code
- Block device drivers
9. Real-World Connections
Enterprise Support
Red Hat, SUSE, and Ubuntu support teams use crash daily:
- Customer sends vmcore
- Engineer analyzes with crash
- Root cause identified
- Fix developed or workaround provided
Kernel Development
Kernel developers use crash to:
- Understand complex bugs
- Verify fixes are correct
- Analyze race conditions
- Debug lock ordering issues
Forensics
Security researchers use crash for:
- Analyzing kernel exploits
- Understanding rootkit behavior
- Post-incident analysis
10. Resources
Official Documentation
Tutorials
Source Code
11. Self-Assessment Checklist
Before You Start
- Have vmcore from Project 8
- crash utility installed
- Matching vmlinux with debug symbols
- Basic understanding of kernel concepts
After Completion
- Can load vmcore successfully
- Can navigate crash interface
- Can get and interpret backtraces
- Can examine kernel memory and registers
- Can identify root cause of panic
- Can document analysis clearly
- Feel confident debugging kernel issues
12. Submission / Completion Criteria
Your project is complete when you have:
- Successful crash Session
  - Loaded vmcore with correct vmlinux
  - No symbol resolution errors
- Complete Analysis
  - Documented backtrace interpretation
  - Identified crashing function and instruction
  - Explained why the crash occurred
  - Described how to fix the bug
- Command Mastery
  - Used at least 10 different crash commands
  - Understand what each command shows
  - Can explain when to use each
- Documentation
  - Session log recorded
  - Analysis written up clearly
  - Commands and their purposes documented
- Understanding Demonstrated
  - Can answer interview questions in 5.8
  - Can explain the analysis to someone else
  - Could analyze a different vmcore
Next: Project 10: Building a Centralized Crash Reporter - Scale crash analysis to production systems