Project 11: BMP Header Decoder
- File: P11-bmp-header-decoder.md
- Main Programming Language: C
- Alternative Programming Languages: Rust, Go, Python
- Coolness Level: Level 3 (See REFERENCE.md)
- Business Potential: Level 1 (See REFERENCE.md)
- Difficulty: Level 3 (See REFERENCE.md)
- Knowledge Area: File Formats
- Software or Tool: CLI
- Main Book: “Practical Binary Analysis”
What you will build: A tool that reads a BMP file header and prints width, height, and pixel format.
Why it teaches binary/hex: You must parse structured fields with known byte order.
Core challenges you will face:
- Field extraction -> Endianness
- Offset-based parsing -> Encoding & Forensics
- Display formatting -> Bits/Bytes/Nibbles
Real World Outcome
$ bmpinfo sample.bmp
width: 640
height: 480
bits_per_pixel: 24
pixel_data_offset: 0x00000036
The Core Question You Are Answering
“How do bytes become structured metadata in a real file format?”
Concepts You Must Understand First
- Byte order
- Which fields are little-endian?
- Book Reference: “Computer Systems: A Programmer’s Perspective” - Ch. 2
- Offsets and sizes
- How do you compute field boundaries?
- Book Reference: “Practical Binary Analysis” - Ch. 3
Questions to Guide Your Design
- Parsing
- Will you parse by reading fixed offsets or by reading structs?
- Validation
- How will you validate that input is really BMP?
Thinking Exercise
Field Map
Sketch a simple header layout: offset, size, meaning. Decide which fields must be validated first.
Questions to answer:
- Why must you know endianness before reading integers?
- Which field tells you where pixel data starts?
The Interview Questions They Will Ask
- “How do you parse a fixed-width binary header?”
- “Why is endianness essential for file formats?”
- “How would you validate the signature bytes?”
- “What problems arise with struct padding?”
- “How would you extend your parser to other formats?”
Hints in Layers
Hint 1: Starting Point Read the first 64 bytes into a buffer and parse fields by offset.
Hint 2: Next Level Use explicit little-endian conversions when assembling integers.
Hint 3: Technical Details Pseudocode:
read header bytes
width = little_endian_32(bytes[18..21])
height = little_endian_32(bytes[22..25])
Hint 4: Tools/Debugging Compare your output with a known image viewer or metadata tool.
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Binary formats | “Practical Binary Analysis” | Ch. 3 |
| Data representation | “Computer Systems: A Programmer’s Perspective” | Ch. 2 |
Common Pitfalls and Debugging
Problem 1: “Widths are absurdly large”
- Why: You treated little-endian fields as big-endian.
- Fix: Implement explicit little-endian reads.
- Quick test: Use a 2x2 image and verify width/height.
Definition of Done
- Correctly parses BMP header fields
- Validates signature before parsing
- Reports width, height, bits per pixel
- Works on at least three test files