Project 9: Hexdump Clone

  • File: P09-hexdump-clone.md
  • Main Programming Language: C
  • Alternative Programming Languages: Rust, Go, Python
  • Coolness Level: Level 4 (See REFERENCE.md)
  • Business Potential: Level 2 (See REFERENCE.md)
  • Difficulty: Level 3 (See REFERENCE.md)
  • Knowledge Area: Binary Inspection
  • Software or Tool: CLI
  • Main Book: “The C Programming Language”

What you will build: A hexdump tool that mirrors xxd-style output, with offsets and ASCII columns.

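Why it teaches binary/hex: A hexdump is the standard human-readable view of raw bytes. Building one makes you convert every byte into two hex digits, track file offsets, and decide which bytes are printable.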

Core challenges you will face:

  • Byte formatting -> Bits/Bytes/Nibbles
  • Offset handling -> Endianness
  • ASCII rendering -> Encoding & Forensics

Real World Outcome

$ hexview sample.bin
00000000: 48 65 6c 6c 6f 0a 00 00  Hello...

The Core Question You Are Answering

“How do I render raw bytes so humans can interpret them?”

Concepts You Must Understand First

  1. Hex encoding
    • How do bytes map to hex pairs?
    • Book Reference: “Computer Systems: A Programmer’s Perspective” - Ch. 2
  2. ASCII rendering
    • How do you show printable vs non-printable bytes?
    • Book Reference: “The C Programming Language” - Ch. 7
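
Both concepts fit in a few lines of C. The sketch below is one possible way to do it, not a required design: byte_to_hex and ascii_cell are hypothetical helper names, lowercase hex digits are assumed because the sample output uses them, and isprint is assumed to run in the default "C" locale.

```c
#include <ctype.h>

/* Hypothetical helper: store the two lowercase hex digits of `byte`
 * in out[0] and out[1] (no NUL terminator is written). */
static void byte_to_hex(unsigned char byte, char out[2])
{
    static const char digits[] = "0123456789abcdef";
    out[0] = digits[byte >> 4];    /* high nibble: top 4 bits */
    out[1] = digits[byte & 0x0F];  /* low nibble: bottom 4 bits */
}

/* Hypothetical helper: the character to show in the ASCII column.
 * Printable bytes (space through '~' in the "C" locale) are shown
 * as-is; everything else becomes '.'. */
static char ascii_cell(unsigned char byte)
{
    return isprint(byte) ? (char)byte : '.';
}
```

For the byte 0x48 this yields the digits '4' and '8' and the ASCII cell 'H'; the newline byte 0x0a yields '0' and 'a' and is rendered as '.'.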

Questions to Guide Your Design

  1. Line width
    • How many bytes per line?
    • Will you support a configurable width?
  2. Output layout
    • How will you align hex columns and ASCII columns?
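
One way to reason about alignment is to compute the column positions from the line width. The sketch below assumes the layout of the sample output above (8 offset digits, ": ", each byte as two digits plus one space, then one extra space before the ASCII text). The names HEX_START, ascii_start, and line_length are hypothetical.

```c
#include <stddef.h>

/* Column where the hex field starts: 8 offset digits plus ": ". */
enum { HEX_START = 10 };

/* 0-based column at which the ASCII text begins when each line
 * holds `width` bytes. Each byte occupies 3 columns ("xx "), and
 * one extra space separates the hex field from the ASCII field. */
static size_t ascii_start(size_t width)
{
    return HEX_START + 3 * width + 1;
}

/* Length of a full line (all `width` bytes present), without the newline. */
static size_t line_length(size_t width)
{
    return ascii_start(width) + width;
}
```

With 16 bytes per line the ASCII column starts at column 59 and a full line is 75 characters long. Keeping the width as a parameter makes a configurable width a small change rather than a rewrite.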

Thinking Exercise

Dump Layout

Sketch the columns for a 16-byte hexdump line and label the offsets.

Questions to answer:

  • Why include offsets at all?
  • Why does an ASCII column help?

The Interview Questions They Will Ask

  1. “What does xxd do?”
  2. “How do you represent non-printable bytes in a hexdump?”
  3. “Why is hex preferred over binary for dumps?”
  4. “How would you handle large files efficiently?”
  5. “What is the relationship between a hexdump and encoding?”

Hints in Layers

Hint 1: Starting Point
Read the file in fixed-size blocks (e.g., 16 bytes).

Hint 2: Next Level
Render the offset as 8 hex digits, then the hex bytes, then the ASCII column.

Hint 3: Technical Details
Pseudocode:

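offset = 0
for each block of up to 16 bytes:
  print offset as 8 hex digits, then ": "
  print each byte as 2 hex digits; pad missing bytes with spaces
  print each byte as ASCII if printable, else '.'
  offset += number of bytes in the block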
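
The pseudocode above could be turned into C roughly as follows. This is a minimal sketch under stated assumptions, not a reference implementation: format_line, dump_stream, WIDTH, and LINE_CAP are hypothetical names, the width is fixed at 16, and the spacing follows the sample output (one space after each byte, one extra space before the ASCII column).

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

enum { WIDTH = 16 };  /* bytes per line (assumed default) */

/* Buffer size for one line: offset field + hex field + separator +
 * ASCII field + NUL, plus 8 spare bytes in case the offset needs
 * more than 8 hex digits on platforms with a 64-bit unsigned long. */
enum { LINE_CAP = 10 + 3 * WIDTH + 1 + WIDTH + 1 + 8 };

/* Format up to WIDTH bytes from `buf` into `out` (at least LINE_CAP
 * bytes). Returns the length of the formatted line, without a newline. */
size_t format_line(char *out, unsigned long offset,
                   const unsigned char *buf, size_t n)
{
    char *p = out;

    p += sprintf(p, "%08lx: ", offset);

    for (size_t i = 0; i < WIDTH; i++) {
        if (i < n) {
            p += sprintf(p, "%02x ", buf[i]);
        } else {
            memset(p, ' ', 3);  /* pad a missing byte on a short line */
            p += 3;
        }
    }

    *p++ = ' ';  /* gap between the hex and ASCII columns */

    for (size_t i = 0; i < n; i++)
        *p++ = isprint(buf[i]) ? (char)buf[i] : '.';

    *p = '\0';
    return (size_t)(p - out);
}

/* Dump `in` to `out` one block at a time, so memory use stays constant
 * regardless of file size. Returns 0 on success, -1 on an I/O error. */
int dump_stream(FILE *in, FILE *out)
{
    unsigned char buf[WIDTH];
    char line[LINE_CAP];
    unsigned long offset = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        format_line(line, offset, buf, n);
        if (fprintf(out, "%s\n", line) < 0)
            return -1;
        offset += n;
    }
    return ferror(in) ? -1 : 0;
}
```

Separating formatting from I/O is a design choice rather than a requirement: it lets the line layout be checked in isolation, while dump_stream can be called with a file opened in binary mode ("rb") or with stdin.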

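Hint 4: Tools/Debugging
Compare your output with xxd for the same file. By default xxd groups bytes in pairs; xxd -g 1 separates every byte, as in the sample output above.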

Books That Will Help

Topic    | Book                         | Chapter
File I/O | “The C Programming Language” | Ch. 7

Common Pitfalls and Debugging

Problem 1: “Columns drift after short last line”

  • Why: You do not pad the final line.
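  • Fix: Pad each missing byte with spaces as wide as a rendered byte (two hex digits plus its separator) so the ASCII column starts at the same position on every line.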
  • Quick test: Dump a 5-byte file and check alignment.
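
A small sketch of the padding fix, assuming each byte is rendered as two hex digits plus one space. The names hex_padding and print_hex_field are hypothetical. It relies on printf's "%*s" with an empty string to emit a run of spaces.

```c
#include <stdio.h>

/* Hypothetical helper: number of spaces needed to fill out the hex
 * field when only `n` of `width` bytes are present. Each missing
 * byte would have occupied 3 columns ("xx "). */
static int hex_padding(int width, int n)
{
    return 3 * (width - n);
}

/* Print the hex field for `n` bytes, then pad it to the full width
 * so whatever follows starts in the same column on every line. */
void print_hex_field(FILE *out, const unsigned char *buf, int n, int width)
{
    for (int i = 0; i < n; i++)
        fprintf(out, "%02x ", buf[i]);

    /* "%*s" with "" prints exactly hex_padding(...) spaces (none if 0). */
    fprintf(out, "%*s", hex_padding(width, n), "");
}
```

For the 5-byte quick test with 16 bytes per line, 11 bytes are missing, so 33 spaces are emitted before the ASCII column.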

Definition of Done

  • Matches xxd layout for 16-byte lines
  • Prints offset, hex, ASCII columns
  • Handles short final line gracefully
  • Works on large files without loading entire file