# Project 6: PE vs ELF Mapping Report
Build a report that maps file offsets to runtime addresses for the same sandbox built on Windows and Linux.
## Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Level 3 |
| Time Estimate | 1 week |
| Main Programming Language | Tool-driven (readelf/objdump) |
| Alternative Programming Languages | N/A |
| Coolness Level | Level 3 |
| Business Potential | Level 1 |
| Prerequisites | Basic binary format knowledge |
| Key Topics | PE/ELF headers, mapping |
## 1. Learning Objectives
By completing this project, you will:
- Extract and interpret PE and ELF headers.
- Convert file offsets to runtime addresses.
- Document the differences in loader behavior.
## 2. All Theory Needed (Per-Concept Breakdown)
### Binary Format Mapping (PE vs ELF)
- **Fundamentals**: PE and ELF describe how a binary is laid out on disk and how it should be mapped into memory. The loader reads the headers, maps segments, and applies relocations. If you understand how file offsets translate to runtime addresses, you can move between debugger views and file analysis reliably.
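
As a rough illustration of what "reading the headers" involves, the sketch below unpacks the fixed 64-byte ELF64 file header with Python's `struct` module. The helper name `parse_elf64_header` and the dictionary keys are illustrative choices, and the sketch assumes a little-endian 64-bit ELF; `readelf -h` reports the same fields.

```python
import struct

# ELF64 file header (little-endian): e_ident[16], e_type, e_machine,
# e_version, e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize,
# e_phnum, e_shentsize, e_shnum, e_shstrndx -> 64 bytes in total.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")

ET_EXEC = 2  # fixed-address executable
ET_DYN = 3   # shared object or PIE executable


def parse_elf64_header(data: bytes) -> dict:
    """Hypothetical helper: decode the fields a mapping report needs."""
    if len(data) < ELF64_HEADER.size or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("sketch only handles 64-bit little-endian ELF")
    (_ident, e_type, _machine, _version, e_entry, e_phoff, _shoff, _flags,
     _ehsize, e_phentsize, e_phnum, _shentsize, _shnum, _shstrndx) = \
        ELF64_HEADER.unpack_from(data)
    return {
        "pie_or_shared": e_type == ET_DYN,  # base is chosen at load time
        "entry": e_entry,                   # entry point virtual address
        "phoff": e_phoff,                   # file offset of program headers
        "phentsize": e_phentsize,
        "phnum": e_phnum,
    }
```

For a PIE binary the `entry` value is relative to the load base; for an `ET_EXEC` binary it is already an absolute address.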
- **Deep Dive into the concept**: PE binaries define sections such as `.text`, `.data`, and `.rdata`, each with a virtual address and a file offset. ELF binaries define both sections and program headers. Sections are for the linker and debugger; program headers describe what the loader actually maps into memory. The key to mapping is that the runtime address of a section is the image base plus the section's virtual offset (its RVA in PE terms); a non-PIE ELF binary already stores absolute virtual addresses, so its base is effectively zero. File offsets are a separate coordinate system that locates bytes on disk: to convert one, subtract the section's file offset and add its virtual offset. When ASLR or PIE is enabled, the image base changes, but the relative offsets remain stable.

A common pitfall is assuming sections are the same as segments. In ELF, program headers define the memory mappings, and several sections are usually merged into one segment. This is why you should use program headers to understand runtime mappings. Another pitfall is ignoring relocations. If a binary is loaded at a different base address, relocations adjust the absolute references inside it. This matters for Windows PE files when ASLR is enabled and for Linux binaries built as PIE, which many distributions do by default.
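
The base-plus-offset rule from the Deep Dive can be sketched in a few lines of Python. The `Region` class and `offset_to_runtime` function are hypothetical names, and the section size and base address in the demo are assumptions chosen to match the example values used elsewhere in this project.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """One mapping row: a PE section or an ELF PT_LOAD segment."""
    name: str
    file_offset: int   # where the bytes start on disk
    file_size: int     # how many bytes are backed by the file
    virt_offset: int   # RVA (PE) or p_vaddr (ELF), relative to the load base


def offset_to_runtime(regions, file_offset, load_base):
    """Translate a file offset to a runtime address, or raise if unmapped."""
    for r in regions:
        if r.file_offset <= file_offset < r.file_offset + r.file_size:
            delta = file_offset - r.file_offset
            return load_base + r.virt_offset + delta
    raise ValueError(f"offset {file_offset:#x} is not inside any region")


# Assumed demo values: a PE .text section at file offset 0x400, RVA 0x1000,
# loaded at the preferred base 0x140000000.
pe = [Region(".text", 0x400, 0x200, 0x1000)]
print(hex(offset_to_runtime(pe, 0x400, 0x140000000)))  # 0x140001000
```

For a non-PIE ELF binary, pass `load_base=0` and store the absolute `p_vaddr` in `virt_offset`; for a PIE binary, pass the base reported by the debugger.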
Mapping is not only about addresses; it also tells you permissions. Code segments are normally read/execute, data segments are read/write, and constant data is read-only. If a write breakpoint fires in a read/execute region, something is wrong: either the code is being patched at runtime or you are looking at the wrong region. Understanding permissions helps you detect such anomalies and verify that you are working with the correct region. For a reverse engineer, this knowledge connects static file analysis to dynamic debugging.
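
The permission bits live in different places in the two formats: ELF program headers carry `p_flags`, while PE section headers carry a `Characteristics` field. The sketch below decodes both into the familiar `rwx` string; the function names `elf_perms` and `pe_perms` are illustrative, while the bit values are the ones defined by the respective formats.

```python
# ELF program header p_flags bits
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4

# PE section header Characteristics bits (memory permissions only)
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def elf_perms(p_flags: int) -> str:
    """Render ELF segment flags as an 'rwx' style string."""
    return (("r" if p_flags & PF_R else "-")
            + ("w" if p_flags & PF_W else "-")
            + ("x" if p_flags & PF_X else "-"))


def pe_perms(characteristics: int) -> str:
    """Render PE section characteristics as an 'rwx' style string."""
    return (("r" if characteristics & IMAGE_SCN_MEM_READ else "-")
            + ("w" if characteristics & IMAGE_SCN_MEM_WRITE else "-")
            + ("x" if characteristics & IMAGE_SCN_MEM_EXECUTE else "-"))


print(elf_perms(PF_R | PF_X))   # r-x
print(pe_perms(0x60000020))     # r-x (a typical .text value)
```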
Finally, mapping allows you to produce a portable report. When you document offsets and runtime addresses for a known function, you create a reproducible map that helps future analysis, especially when tooling changes. This is why the report is a deliverable in this project.
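
One possible way to keep the report reproducible is to generate the table from data instead of typing it by hand. The sketch below renders rows with the four columns this project asks for (section, offset, runtime, permissions); the `Row` type, the extra `fmt` field, and the `to_markdown` helper are illustrative assumptions, not a required format.

```python
from typing import NamedTuple


class Row(NamedTuple):
    fmt: str        # "ELF" or "PE"
    section: str
    offset: int     # file offset
    runtime: int    # runtime address observed or computed
    perms: str      # e.g. "r-x"


def to_markdown(rows) -> str:
    """Render mapping rows as a Markdown table with hex-formatted numbers."""
    lines = ["| Format | Section | Offset | Runtime | Permissions |",
             "|---|---|---|---|---|"]
    for r in rows:
        lines.append(f"| {r.fmt} | {r.section} | {r.offset:#x} "
                     f"| {r.runtime:#x} | {r.perms} |")
    return "\n".join(lines)


rows = [
    Row("ELF", ".text", 0x1000, 0x401000, "r-x"),
    Row("PE", ".text", 0x400, 0x140001000, "r-x"),
]
print(to_markdown(rows))
```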
- **How this fits into the projects**: This concept informs Projects 4, 5, and the Capstone by bridging runtime and static analysis.
- **Definitions & key terms**
  - Image base: Preferred load address.
  - Section: Logical division in a binary file.
  - Segment: Loadable memory mapping defined by program headers.
  - Relocation: Patch applied when load base changes.
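
To make the "Relocation" definition concrete, here is a minimal sketch of what a loader conceptually does to one 64-bit absolute pointer when the image lands at a different base. The function name and the addresses are hypothetical; real loaders walk a relocation table to find every such slot.

```python
import struct


def apply_abs64_relocation(image: bytearray, offset: int,
                           preferred_base: int, actual_base: int) -> None:
    """Patch one 64-bit little-endian absolute pointer stored at `offset`."""
    delta = actual_base - preferred_base
    (value,) = struct.unpack_from("<Q", image, offset)
    # Wrap to 64 bits so a negative delta still yields a valid pointer.
    struct.pack_into("<Q", image, offset, (value + delta) & 0xFFFFFFFFFFFFFFFF)


# Assumed demo: a pointer to preferred_base + 0x2000 stored at offset 8.
image = bytearray(16)
struct.pack_into("<Q", image, 8, 0x140002000)
apply_abs64_relocation(image, 8, 0x140000000, 0x7FF612340000)
print(hex(struct.unpack_from("<Q", image, 8)[0]))  # 0x7ff612342000
```

The relative offset (0x2000) is unchanged; only the base part of the pointer moves.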
- Mental model diagram ``` Disk -> Loader -> Memory
[File Offset 0x400] -> [Runtime 0x140001000]
- **How it works (step-by-step)**
1. Extract headers (PE/ELF).
2. Identify sections/segments and bases.
3. Convert file offset to runtime address.
4. Validate in debugger.
- **Minimal concrete example**
ELF: .text offset 0x1000 -> runtime 0x401000
PE: .text offset 0x400 -> runtime 0x140001000
- **Common misconceptions**
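- “Sections are the same as segments.” (Not always; in ELF one segment usually spans several sections.)
- “Runtime addresses equal file offsets.” (Sections are aligned differently on disk and in memory, and ASLR changes the base.)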
- **Check-your-understanding questions**
1. Why do PIE binaries have different bases each run?
2. How do relocations help binaries execute correctly?
- **Check-your-understanding answers**
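1. With ASLR, the loader picks a randomized base for a position-independent image on each run, which makes absolute addresses harder to predict.
2. Relocations adjust absolute addresses to the new base so that pointers stored in the image still refer to the right locations.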
- **Real-world applications**
- Crash dump analysis.
- Malware reverse engineering.
- **Where you’ll apply it**
- This project’s §5.4 and §6.2.
- Also used in: P05-ollydbg-debug-patch.md, P10-cross-platform-game-hacking-lab.md.
- **References**
- Practical Binary Analysis - Ch. 1-3.
- ELF specification (Linux Foundation).
- **Key insights**
Mapping is the glue between static and dynamic analysis.
- **Summary**
PE/ELF mapping lets you translate addresses and reason about loader behavior.
- **Homework/Exercises to practice the concept**
1. Compute the runtime address for a function given a base address.
2. Identify the permissions of each segment.
- **Solutions to the homework/exercises**
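1. Add the base address to the function's virtual offset (its RVA in PE terms).
2. Read the flags of each program header (`readelf -l`) or the characteristics of each PE section (`dumpbin /headers`).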
---
## 3. Project Specification
### 3.1 What You Will Build
A report that lists PE and ELF sections, their file offsets, runtime addresses, and permissions for the same sandbox binary built on Windows and Linux. The report must include at least one manual address conversion verified in a debugger.
### 3.2 Functional Requirements
1. Extract headers for PE and ELF.
2. Build a mapping table (offset -> runtime).
3. Validate one function address in a debugger.
### 3.3 Non-Functional Requirements
- **Performance**: Report completed in under 1 week.
- **Reliability**: Address conversion verified by debugger.
- **Usability**: Table is clear and reproducible.
### 3.4 Example Usage / Output
ELF .text: offset 0x1000 -> runtime 0x401000 (r-x)
PE .text: offset 0x400 -> runtime 0x140001000 (r-x)
### 3.5 Data Formats / Schemas / Protocols
- **Mapping table**: section, offset, runtime, permissions.
### 3.6 Edge Cases
- PIE enabled vs disabled.
- Stripped symbols.
### 3.7 Real World Outcome
#### 3.7.1 How to Run (Copy/Paste)
1. Run `readelf -h -l -S` (or `objdump -h`) on the Linux binary.
2. Run `dumpbin /headers` on the Windows binary, or open it in a PE viewer.
#### 3.7.2 Golden Path Demo (Deterministic)
Your report lists consistent mappings and a verified address.
#### 3.7.3 Failure Demo
Confusing sections with segments leads to incorrect mapping.
---
## 4. Solution Architecture
### 4.1 High-Level Design
[Binary] -> [Headers] -> [Mapping Table] -> [Debugger Validation]
### 4.2 Key Components
| Component | Responsibility | Key Decisions |
|-----------|----------------|---------------|
| Mapping Table | Offset->runtime | Use correct base |
| Validation | Debugger proof | One function |
### 4.3 Data Structures (No Full Code)
- **Mapping table** entries: section, offset, runtime, permissions.
### 4.4 Algorithm Overview
**Key Algorithm: Address Conversion**
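1. Find the section (PE) or loadable segment (ELF) whose file range contains the offset.
2. Compute the virtual offset: section virtual address + (file offset - section file offset).
3. Add the actual image base (for a non-PIE ELF binary the virtual address is already absolute).
4. Compare to the debugger address.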
**Complexity Analysis**
- Time: O(number of sections).
- Space: O(number of sections).
---
## 5. Implementation Guide
### 5.1 Development Environment Setup
- Linux: readelf/objdump.
- Windows: dumpbin or PE viewer.
### 5.2 Project Structure
project/
├── pe-mapping.md
├── elf-mapping.md
└── screenshots/
### 5.3 The Core Question You're Answering
> "How do I move between file offsets and runtime addresses reliably?"
### 5.4 Concepts You Must Understand First
- Image base and relocations
- Section vs segment
### 5.5 Questions to Guide Your Design
1. Is PIE/ASLR enabled?
2. Which section contains your target function?
### 5.6 Thinking Exercise
Pick one function and compute its runtime address manually.
### 5.7 The Interview Questions They'll Ask
1. "Why do file offsets differ from runtime addresses?"
2. "What is PIE?"
3. "How do you validate a mapping?"
### 5.8 Hints in Layers
**Hint 1:** Start with the entry point address.
**Hint 2:** Use program headers for ELF.
**Hint 3:** Verify with a debugger.
### 5.9 Books That Will Help
| Topic | Book | Chapter |
|-------|------|---------|
| ELF/PE | Practical Binary Analysis | Ch. 1-3 |
### 5.10 Implementation Phases
#### Phase 1: Extract Headers (2 hours)
- Produce header dumps.
**Checkpoint:** Section list created.
#### Phase 2: Build Mapping Table (4 hours)
- Create offset/runtime table.
**Checkpoint:** Table complete.
#### Phase 3: Validate (2 hours)
- Verify one address in debugger.
**Checkpoint:** Verified mapping.
### 5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|----------|---------|----------------|-----------|
| Validation | One function vs many | One | Focus on correctness |
---
## 6. Testing Strategy
### 6.1 Test Categories
| Category | Purpose | Examples |
|----------|---------|----------|
| Correctness | Address match | Debugger check |
### 6.2 Critical Test Cases
1. Entry point mapping validated.
2. One function address validated.
### 6.3 Test Data
Entry point: image base + entry point RVA == debugger address ```
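
The entry-point check can be written as a small assertion, together with the reverse calculation that recovers the load base from one address seen in the debugger. This is a sketch under stated assumptions: both function names are hypothetical, and the default `alignment` of 0x1000 assumes 4 KiB pages (Windows places images on 64 KiB boundaries, so 0x10000 is a stricter check there).

```python
def infer_load_base(debugger_addr: int, rva: int,
                    alignment: int = 0x1000) -> int:
    """Back out the load base from one observed address and its RVA."""
    base = debugger_addr - rva
    if base < 0 or base % alignment != 0:
        raise ValueError(f"implausible base {base:#x}: check the RVA")
    return base


def check_entry_point(image_base: int, entry_rva: int,
                      debugger_addr: int) -> bool:
    """True when base + RVA matches what the debugger reports."""
    return image_base + entry_rva == debugger_addr


# Assumed values: entry RVA 0x1000, debugger shows the entry at 0x7FF612341000.
base = infer_load_base(0x7FF612341000, 0x1000, alignment=0x10000)
print(hex(base))                                        # 0x7ff612340000
print(check_entry_point(base, 0x1000, 0x7FF612341000))  # True
```

A misaligned inferred base is usually a sign that a file offset was used where an RVA was expected.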
## 7. Common Pitfalls & Debugging
### 7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Section/segment confusion | Wrong addresses | Use program headers |
| Ignoring ASLR | Addresses off by a constant delta | Rebase using the actual load base from the debugger |
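
To illustrate the "use program headers" advice, the sketch below maps a virtual address back to a file offset using only `PT_LOAD` fields. The `Load` tuple, the function name, and the segment values in the demo are made up for illustration; take real values from `readelf -l`.

```python
from typing import NamedTuple, Optional


class Load(NamedTuple):
    """The PT_LOAD fields that matter for address translation."""
    p_offset: int
    p_vaddr: int
    p_filesz: int
    p_memsz: int


def vaddr_to_file_offset(loads, vaddr: int) -> Optional[int]:
    """Return the file offset backing `vaddr`, or None if zero-filled."""
    for seg in loads:
        if seg.p_vaddr <= vaddr < seg.p_vaddr + seg.p_memsz:
            delta = vaddr - seg.p_vaddr
            if delta >= seg.p_filesz:
                return None  # tail beyond p_filesz (e.g. .bss): not on disk
            return seg.p_offset + delta
    raise ValueError(f"{vaddr:#x} is not inside any PT_LOAD segment")


# Hypothetical segments; note the data segment's p_vaddr != p_offset.
loads = [
    Load(0x0000, 0x0000, 0x600, 0x600),
    Load(0x1000, 0x1000, 0x200, 0x200),
    Load(0x2E00, 0x3E00, 0x210, 0x220),
]
print(hex(vaddr_to_file_offset(loads, 0x3E10)))  # 0x2e10
```

Section headers can be stripped from a binary, but the loader always needs the program headers, which is why they are the reliable source for runtime mappings.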
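### 7.2 Debugging Strategies
- Compare your computed addresses with the debugger's memory map (for example `info proc mappings` in GDB, or `/proc/<pid>/maps` on Linux).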
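
On Linux, one way to do this comparison is to read a line of `/proc/<pid>/maps` and apply the same offset arithmetic. The sketch below parses a single line and translates a file offset through it; the helper names and the sample line (including the `/tmp/sandbox` path) are assumptions for illustration.

```python
def parse_maps_line(line: str) -> dict:
    """Parse one /proc/<pid>/maps line: range, perms, offset, dev, inode, path."""
    fields = line.split(None, 5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    return {
        "start": start,
        "end": end,
        "perms": fields[1][:3],          # drop the shared/private flag
        "offset": int(fields[2], 16),    # file offset of the mapping start
        "path": fields[5].strip() if len(fields) > 5 else "",
    }


def runtime_from_maps(entry: dict, file_offset: int) -> int:
    """Translate a file offset through one mapping, or raise if outside it."""
    size = entry["end"] - entry["start"]
    if not entry["offset"] <= file_offset < entry["offset"] + size:
        raise ValueError("file offset is not covered by this mapping")
    return entry["start"] + (file_offset - entry["offset"])


# Hypothetical line for a PIE binary's executable mapping.
line = "555555555000-555555556000 r-xp 00001000 08:01 131 /tmp/sandbox"
entry = parse_maps_line(line)
print(entry["perms"], hex(runtime_from_maps(entry, 0x1040)))
# r-x 0x555555555040
```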
## 8. Extensions & Challenges
### 8.1 Beginner Extensions
- Map one additional section.
### 8.2 Intermediate Extensions
- Compare symbols in stripped vs unstripped binaries.
### 8.3 Advanced Extensions
- Include relocation entries in the report.
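
For the relocation extension on the PE side, the base relocation table is a sequence of blocks: an 8-byte header (page RVA, block size) followed by 16-bit entries whose top 4 bits are the type and low 12 bits are the offset within the page. The sketch below decodes one block; the function name is hypothetical and the demo bytes are synthetic.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, skipped
IMAGE_REL_BASED_HIGHLOW = 3    # 32-bit absolute address
IMAGE_REL_BASED_DIR64 = 10     # 64-bit absolute address


def decode_reloc_block(data: bytes, pos: int = 0):
    """Decode one base relocation block; return (fixups, next_pos)."""
    page_rva, block_size = struct.unpack_from("<II", data, pos)
    count = (block_size - 8) // 2
    entries = struct.unpack_from(f"<{count}H", data, pos + 8)
    fixups = []
    for entry in entries:
        rtype, offset = entry >> 12, entry & 0x0FFF
        if rtype != IMAGE_REL_BASED_ABSOLUTE:
            fixups.append((rtype, page_rva + offset))  # (type, RVA to patch)
    return fixups, pos + block_size


# Synthetic block: page 0x2000, one DIR64 fixup at +0x018, one padding entry.
block = struct.pack("<IIHH", 0x2000, 12,
                    (IMAGE_REL_BASED_DIR64 << 12) | 0x018, 0)
fixups, next_pos = decode_reloc_block(block)
print([(t, hex(rva)) for t, rva in fixups], next_pos)
# [(10, '0x2018')] 12
```

Each reported RVA is a location the loader patches by the base delta, so listing them in the report shows exactly which bytes differ between disk and memory.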
## 9. Real-World Connections
### 9.1 Industry Applications
- Crash dump triage.
### 9.2 Related Open Source Projects
- Sandbox from P01.
### 9.3 Interview Relevance
- Demonstrates understanding of binary loading.
## 10. Resources
### 10.1 Essential Reading
- Practical Binary Analysis - Ch. 1-3
### 10.2 Video Resources
- ELF/PE format explainers.
### 10.3 Tools & Documentation
- readelf, objdump, dumpbin docs.
### 10.4 Related Projects in This Series
- P05-ollydbg-debug-patch.md
- P10-cross-platform-game-hacking-lab.md
## 11. Self-Assessment Checklist
### 11.1 Understanding
- I can explain the difference between sections and segments.
- I can compute runtime addresses from offsets.
### 11.2 Implementation
- Mapping table complete.
- Address validation documented.
### 11.3 Growth
- I can explain loader behavior to someone else.
## 12. Submission / Completion Criteria
**Minimum Viable Completion:**
- Mapping table with at least one verified address.

**Full Completion:**
- Report covers PE and ELF and includes screenshots.

**Excellence (Going Above & Beyond):**
- Includes relocation analysis and symbols comparison.