Project 11: BMP Header Decoder

  • File: P11-bmp-header-decoder.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Go, Python
  • Coolness Level: Level 3 (See REFERENCE.md)
  • Business Potential: Level 1 (See REFERENCE.md)
  • Difficulty: Level 3 (See REFERENCE.md)
  • Knowledge Area: File Formats
  • Software or Tool: CLI
  • Main Book: “Practical Binary Analysis”

What you will build: A tool that reads a BMP file header and prints width, height, and pixel format.

Why it teaches binary/hex: You must parse structured fields with known byte order.

Core challenges you will face:

  • Field extraction -> Endianness
  • Offset-based parsing -> Encoding & Forensics
  • Display formatting -> Bits/Bytes/Nibbles

Real World Outcome

$ bmpinfo sample.bmp
width: 640
height: 480
bits_per_pixel: 24
pixel_data_offset: 0x00000036

The Core Question You Are Answering

“How do bytes become structured metadata in a real file format?”

Concepts You Must Understand First

  1. Byte order
    • Which fields are little-endian?
    • Book Reference: “Computer Systems: A Programmer’s Perspective” - Ch. 2
  2. Offsets and sizes
    • How do you compute field boundaries?
    • Book Reference: “Practical Binary Analysis” - Ch. 3

Questions to Guide Your Design

  1. Parsing
    • Will you parse by reading fixed offsets or by reading structs?
  2. Validation
    • How will you validate that input is really BMP?

Thinking Exercise

Field Map

Sketch a simple header layout: offset, size, meaning. Decide which fields must be validated first.

Questions to answer:

  • Why must you know endianness before reading integers?
  • Which field tells you where pixel data starts?

The Interview Questions They Will Ask

  1. “How do you parse a fixed-width binary header?”
  2. “Why is endianness essential for file formats?”
  3. “How would you validate the signature bytes?”
  4. “What problems arise with struct padding?”
  5. “How would you extend your parser to other formats?”

Hints in Layers

Hint 1: Starting Point Read the first 64 bytes into a buffer and parse fields by offset.

Hint 2: Next Level Use explicit little-endian conversions when assembling integers.

Hint 3: Technical Details Pseudocode:

read header bytes
width = little_endian_32(bytes[18..21])
height = little_endian_32(bytes[22..25])

Hint 4: Tools/Debugging Compare your output with a known image viewer or metadata tool.

Books That Will Help

Topic Book Chapter
Binary formats “Practical Binary Analysis” Ch. 3
Data representation “Computer Systems: A Programmer’s Perspective” Ch. 2

Common Pitfalls and Debugging

Problem 1: “Widths are absurdly large”

  • Why: You treated little-endian fields as big-endian.
  • Fix: Implement explicit little-endian reads.
  • Quick test: Use a 2x2 image and verify width/height.

Definition of Done

  • Correctly parses BMP header fields
  • Validates signature before parsing
  • Reports width, height, bits per pixel
  • Works on at least three test files