P01: Binary & Hex Visualization Tool

The Core Question: “How does a CPU see numbers, and why must we become fluent in reading hex dumps and binary patterns before understanding machine instructions?”


Learning Objectives

By completing this project, you will:

  1. Master two’s complement representation - Understand how CPUs encode signed integers and why -1 looks like all 1s
  2. Internalize the binary-hex relationship - Instantly recognize that 4 binary bits equal 1 hex digit (the nibble)
  3. Understand bit widths and their significance - Know why 8-bit, 16-bit, 32-bit, and 64-bit representations matter
  4. Visualize byte order (endianness) - See how the same value appears differently in little-endian vs big-endian memory layouts
  5. Build professional CLI tools in C - Create robust command-line utilities with proper input parsing and error handling
  6. Think like a CPU - See numbers not as decimal abstractions but as raw bit patterns that CPUs manipulate
  7. Read hex dumps fluently - After this project, debugger output and memory inspectors become readable

Project Overview

Attribute              Value
─────────              ─────
Main Language          C
Alternative Languages  Python, Rust, Go
Difficulty             Beginner
Time Estimate          Weekend (6-12 hours)
Prerequisites          Basic C programming, understanding of decimal numbers
Knowledge Area         Number Systems / Data Representation
Main Book              “Code: The Hidden Language of Computer Hardware and Software” by Charles Petzold

Theoretical Foundation

Core Concepts

Before building this tool, you must internalize these fundamental concepts that underpin all of computing:

1. Binary: The Language CPUs Speak

CPUs don’t understand decimal. At the hardware level, everything is electrical signals that are either ON or OFF, HIGH or LOW, 1 or 0. This is why binary exists - it’s the only number system that maps directly to transistor states.

Hardware Reality:

    Transistor States → Binary Digits → Numbers
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

    HIGH voltage (>2.5V)  →  1  ─┐
    LOW voltage  (<0.8V)  →  0  ─┼─→ Combine into patterns
                                 │
    8 transistors together:      │
    [ON][ON][ON][ON][ON][ON][ON][ON] = 11111111 = 255
    [ON][OFF][OFF][OFF][OFF][OFF][OFF][OFF] = 10000000 = 128

2. Positional Notation: The Universal Pattern

Every number system works the same way. Each digit’s position represents a power of the base:

DECIMAL: 1,234
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Position:    3       2       1       0
Power:      10³     10²     10¹     10⁰
Value:     1000     100      10       1
           ─────   ─────   ─────   ─────
Digit:       1       2       3       4
           ─────   ─────   ─────   ─────
Total:   1×1000 + 2×100 + 3×10 + 4×1 = 1234

BINARY: 1011
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Position:    3       2       1       0
Power:       2³      2²      2¹      2⁰
Value:        8       4       2       1
            ─────   ─────   ─────   ─────
Digit:        1       0       1       1
            ─────   ─────   ─────   ─────
Total:      1×8  +  0×4  +  1×2  +  1×1 = 11

HEXADECIMAL: 2F
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Position:    1       0
Power:      16¹     16⁰
Value:       16       1
            ─────   ─────
Digit:        2      F(=15)
            ─────   ─────
Total:      2×16 + 15×1 = 47
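
To see that this is one rule rather than three, here is a tiny C sketch (the eval_digits helper is purely illustrative) that evaluates a digit sequence in any base using the same running-total pattern:

#include <stdint.h>
#include <stdio.h>

// Positional notation as code: total = total * base + next_digit,
// processed from the most significant digit to the least.
static uint64_t eval_digits(const int *digits, int count, int base) {
    uint64_t total = 0;
    for (int i = 0; i < count; i++) {
        total = total * base + (uint64_t)digits[i];
    }
    return total;
}

int main(void) {
    int dec[] = {1, 2, 3, 4};   // 1234 in base 10
    int bin[] = {1, 0, 1, 1};   // 1011 in base 2
    int hex[] = {2, 15};        // 2F   in base 16
    printf("%llu %llu %llu\n",
           (unsigned long long)eval_digits(dec, 4, 10),
           (unsigned long long)eval_digits(bin, 4, 2),
           (unsigned long long)eval_digits(hex, 2, 16));   // prints: 1234 11 47
    return 0;
}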

3. The Magic of Hexadecimal

Hexadecimal exists because of a beautiful mathematical relationship: 16 = 2^4. This means exactly 4 binary bits map to exactly 1 hex digit. This isn’t coincidence - it’s why hex became the standard for representing binary data:

THE NIBBLE RELATIONSHIP (Memorize This!)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Binary    Hex  │  Binary    Hex
──────────────┼────────────────
0000   =   0  │  1000   =   8
0001   =   1  │  1001   =   9
0010   =   2  │  1010   =   A
0011   =   3  │  1011   =   B
0100   =   4  │  1100   =   C
0101   =   5  │  1101   =   D
0110   =   6  │  1110   =   E
0111   =   7  │  1111   =   F
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Example: Convert 0xDEADBEEF to binary
D    E    A    D    B    E    E    F
↓    ↓    ↓    ↓    ↓    ↓    ↓    ↓
1101 1110 1010 1101 1011 1110 1110 1111

Result: 11011110101011011011111011101111

Notice: No calculation needed! Just table lookup.
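
The same table lookup works in code. A minimal sketch (the function name is illustrative) that prints a 32-bit value in hex by peeling off one nibble at a time:

#include <stdint.h>
#include <stdio.h>

// One hex digit per 4-bit nibble: shift the nibble down, mask it,
// and look it up in a 16-entry table - no arithmetic conversion needed.
static void print_hex_nibbles(uint32_t value) {
    const char digits[] = "0123456789ABCDEF";
    for (int shift = 28; shift >= 0; shift -= 4) {
        putchar(digits[(value >> shift) & 0xF]);
    }
    putchar('\n');
}

int main(void) {
    print_hex_nibbles(0xDEADBEEF);   // prints DEADBEEF
    return 0;
}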

4. Two’s Complement: How CPUs Handle Negative Numbers

CPUs can’t store minus signs. Instead, they use a brilliant encoding called two’s complement where the most significant bit (MSB) indicates sign:

TWO'S COMPLEMENT (8-bit examples)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Positive numbers: Same as unsigned
   00000001 =  1
   00000010 =  2
   01111111 = 127  (largest positive 8-bit signed)

The sign bit:
   0xxxxxxx = positive (MSB = 0)
   1xxxxxxx = negative (MSB = 1)

Negative numbers: Invert all bits, add 1
   To get -1:
   Step 1: Start with +1:      00000001
   Step 2: Invert all bits:    11111110
   Step 3: Add 1:              11111111
   Result: -1 = 11111111 (0xFF)

   To get -128:
   10000000 = -128 (most negative 8-bit signed)

VERIFICATION: Adding -1 + 1 should equal 0
   11111111  (-1)
 + 00000001  (+1)
 ──────────
  100000000  (9 bits!)
  ↑
  This bit "overflows" and is discarded in 8-bit math

  Result: 00000000 = 0 ✓

WHY THIS WORKS:
The same addition hardware works for both signed and unsigned!
The CPU doesn't care about interpretation - just bit patterns.
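
You can watch this difference in interpretation directly in C. A small sketch (the signed cast is implementation-defined before C23, but every mainstream two's-complement machine gives the results shown):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t raw = 0xFF;                 // bit pattern 11111111
    int8_t  as_signed = (int8_t)raw;    // same bits, reinterpreted as signed

    printf("unsigned view: %u\n", (unsigned)raw);                   // 255
    printf("signed view:   %d\n", as_signed);                       // -1
    printf("raw + 1:       %u\n", (unsigned)(uint8_t)(raw + 1));    // 0: wraps, matching -1 + 1 = 0
    return 0;
}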

5. Bit Widths: Why Size Matters

Different data types use different numbers of bits. Understanding widths is essential for reading memory dumps:

COMMON BIT WIDTHS AND THEIR RANGES
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Width    Bytes  Unsigned Range           Signed Range
─────    ─────  ──────────────           ────────────
8-bit      1    0 to 255                 -128 to 127
16-bit     2    0 to 65,535              -32,768 to 32,767
32-bit     4    0 to 4,294,967,295       -2,147,483,648 to 2,147,483,647
64-bit     8    0 to 18,446,744,073,709,551,615   -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807

VISUAL REPRESENTATION:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

8-bit:  ████████
        └──────┘
          1 byte

16-bit: ████████ ████████
        └──────┘ └──────┘
         byte 0   byte 1

32-bit: ████████ ████████ ████████ ████████
        └──────┘ └──────┘ └──────┘ └──────┘
         byte 0   byte 1   byte 2   byte 3

64-bit: ████████ ████████ ████████ ████████ ████████ ████████ ████████ ████████
        └───────────────────────────────────────────────────────────────────────┘
                                        8 bytes
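
These ranges come straight from the standard limits in <stdint.h>. A quick sketch that prints them, so you can cross-check the table on your own machine:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    // Unsigned max is 2^w - 1; signed range is -2^(w-1) to 2^(w-1) - 1.
    printf("8-bit : 0 to %llu, signed %lld to %lld\n",
           (unsigned long long)UINT8_MAX,  (long long)INT8_MIN,  (long long)INT8_MAX);
    printf("16-bit: 0 to %llu, signed %lld to %lld\n",
           (unsigned long long)UINT16_MAX, (long long)INT16_MIN, (long long)INT16_MAX);
    printf("32-bit: 0 to %llu, signed %lld to %lld\n",
           (unsigned long long)UINT32_MAX, (long long)INT32_MIN, (long long)INT32_MAX);
    printf("64-bit: 0 to %llu, signed %lld to %lld\n",
           (unsigned long long)UINT64_MAX, (long long)INT64_MIN, (long long)INT64_MAX);
    return 0;
}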

6. Endianness: Byte Order in Memory

Different CPU architectures store multi-byte values in different orders. This is called “endianness”:

THE ENDIANNESS PROBLEM
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

The 32-bit value 0x12345678: How is it stored in memory?

LITTLE-ENDIAN (x86, x86-64, ARM default):
Least significant byte at lowest address

Memory Address:  100    101    102    103
Memory Content:  0x78   0x56   0x34   0x12
                 └─────────────────────────┘
                 Reversed! LSB first.

BIG-ENDIAN (Network byte order, older PowerPC):
Most significant byte at lowest address

Memory Address:  100    101    102    103
Memory Content:  0x12   0x34   0x56   0x78
                 └─────────────────────────┘
                 "Natural" reading order.

REAL EXAMPLE - 0xDEADBEEF:
━━━━━━━━━━━━━━━━━━━━━━━━━━━
Little-endian memory dump:  EF BE AD DE
Big-endian memory dump:     DE AD BE EF

When you see a hex dump, you MUST know the endianness!
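
You can check your own machine's byte order with a few lines of C (a minimal sketch; memcpy is used so the bytes are viewed exactly as stored, without aliasing tricks):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    uint32_t value = 0x12345678;
    unsigned char bytes[4];
    memcpy(bytes, &value, sizeof value);   // copy the bytes as laid out in memory

    printf("bytes at increasing addresses: %02X %02X %02X %02X\n",
           bytes[0], bytes[1], bytes[2], bytes[3]);
    // Little-endian machines (x86, x86-64) print: 78 56 34 12
    // Big-endian machines print:                  12 34 56 78
    printf("this machine is %s-endian\n", bytes[0] == 0x78 ? "little" : "big");
    return 0;
}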

Why This Matters

Understanding binary and hex representation is the foundation for:

  1. Reading debugger output - GDB, LLDB, and other debuggers show memory in hex
  2. Understanding assembly language - Instructions and addresses are in hex
  3. Analyzing network packets - Protocol data is binary
  4. Writing embedded systems code - Direct hardware register manipulation
  5. Security analysis - Exploit development requires bit-level understanding
  6. Understanding CPU architecture - Instruction encoding is pure bit patterns

Historical Context

Binary computing dates back to the late 1930s, when Claude Shannon showed that Boolean algebra could be implemented with relay switching circuits. The shift from decimal computers (like ENIAC) to binary was driven by hardware simplicity - a switch with two states is easier and cheaper to build than one with ten states.

Hexadecimal emerged in the 1960s as programmers needed a way to represent bytes (8 bits) more compactly than binary. The IBM System/360 (1964) popularized hex notation, and it became the universal standard for representing binary data.

Common Misconceptions

Misconception 1: “Negative numbers have a minus sign stored somewhere” Reality: There’s no minus sign. Two’s complement uses the MSB pattern to represent negativity. The same bit pattern can be positive or negative depending on interpretation.

Misconception 2: “Hex is harder to understand than decimal” Reality: Hex is actually simpler for binary data because of the exact 4-bit mapping. With practice, you’ll read hex faster than decimal for byte values.

Misconception 3: “Little-endian is backwards and wrong” Reality: Little-endian has practical advantages - adding numbers can start from the lowest address, and casting between sizes is trivial. Both are valid design choices.

Misconception 4: “My computer stores numbers as text like ‘255’” Reality: Numbers are stored as raw binary patterns. The text ‘255’ and the integer 255 are completely different in memory.


Project Specification

What You Will Build

A command-line tool called bitview that takes a number (decimal, hex, or binary) and displays its representation in all formats, with visual highlighting of important patterns like sign bits, byte boundaries, and endianness.

Functional Requirements

  1. Multi-format Input (-d, -x, -b flags):
    • Accept decimal numbers (default): ./bitview 255
    • Accept hexadecimal with 0x prefix or -x flag: ./bitview 0xDEADBEEF
    • Accept binary with 0b prefix or -b flag: ./bitview 0b11111111
  2. Complete Output Display:
    • Show decimal representation
    • Show binary representation with byte grouping
    • Show hexadecimal representation
    • Show signed interpretation (two’s complement)
    • Show bit width used
  3. Bit Width Selection (-w flag):
    • Support 8-bit, 16-bit, 32-bit, and 64-bit representations
    • Default to smallest width that fits the number
    • Show overflow warning if number exceeds selected width
  4. Endianness Display (-e flag):
    • Show both little-endian and big-endian memory layouts
    • Highlight byte order differences visually
  5. Byte Highlighting (--bytes):
    • Separate binary output into byte groups
    • Show hex digit alignment with binary nibbles
  6. Sign Bit Visualization:
    • Clearly mark the sign bit in signed interpretations
    • Show two’s complement breakdown for negative numbers

Non-Functional Requirements

  • Performance: Handle all inputs up to 64-bit instantly
  • Portability: Compile and run on any POSIX system with standard C library
  • Robustness: Handle all invalid inputs gracefully with meaningful error messages
  • Usability: Self-documenting with --help flag

Example Usage/Output

$ ./bitview 255
Decimal:     255
Binary:      00000000 00000000 00000000 11111111
Hex:         0x000000FF
Signed:      255 (positive)
Bit width:   32-bit

$ ./bitview -1
Decimal:     -1
Binary:      11111111 11111111 11111111 11111111
             ^
             └─ Sign bit (1 = negative)
Hex:         0xFFFFFFFF
Signed:      -1 (two's complement)
Bit width:   32-bit

$ ./bitview 0xDEADBEEF
Decimal:     3735928559
Binary:      11011110 10101101 10111110 11101111
             ^^^^^^^^ ^^^^^^^^ ^^^^^^^^ ^^^^^^^^
                DE       AD       BE       EF
Hex:         0xDEADBEEF
Signed:      -559038737 (two's complement)
Bit width:   32-bit

$ ./bitview -w 8 127
Decimal:     127
Binary:      01111111
             ^
             └─ Sign bit (0 = positive)
Hex:         0x7F
Signed:      127 (positive, max for signed 8-bit)
Bit width:   8-bit

$ ./bitview -w 8 128
Decimal:     128
Binary:      10000000
             ^
             └─ Sign bit (1 = negative when interpreted as signed)
Hex:         0x80
Unsigned:    128
Signed:      -128 (two's complement, min for signed 8-bit)
Bit width:   8-bit

$ ./bitview -e 0x12345678
Decimal:     305419896
Binary:      00010010 00110100 01010110 01111000
Hex:         0x12345678

Memory Layout (32-bit):
  Big-endian:    [0x12] [0x34] [0x56] [0x78]  (address 0 → 3)
  Little-endian: [0x78] [0x56] [0x34] [0x12]  (address 0 → 3)
                 ^ x86/x64 systems are little-endian

$ ./bitview --help
Usage: bitview [OPTIONS] NUMBER

Display number in binary, hexadecimal, and decimal formats.

Input Formats:
  123           Decimal (default)
  0xFF          Hexadecimal (0x prefix)
  0b1010        Binary (0b prefix)

Options:
  -w, --width N   Force bit width (8, 16, 32, or 64)
  -e, --endian    Show endianness memory layouts
  -s, --signed    Interpret as signed (default for negative input)
  -u, --unsigned  Interpret as unsigned
  -h, --help      Show this help message

Examples:
  bitview 255           Show 255 in all formats
  bitview -w 8 -1       Show -1 as 8-bit signed
  bitview 0xCAFE        Show hex input in all formats
  bitview -e 0x12345678 Show byte order layouts

Real World Outcome

After completing this tool, you’ll use it constantly for:

  • Debugging sessions: Quickly verify what a value looks like in binary/hex
  • Understanding error codes: System error codes are often hex
  • Analyzing bit flags: See which bits are set in configuration values
  • Learning assembly: Verify instruction encoding by checking opcodes
  • Network debugging: Convert between address formats

Solution Architecture

High-Level Design

┌──────────────────────────────────────────────────────────────────────────┐
│                             bitview                                       │
├──────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌───────────┐ │
│  │   Argument  │───▶│   Input     │───▶│   Display   │───▶│  Output   │ │
│  │   Parser    │    │   Parser    │    │  Formatter  │    │  Printer  │ │
│  └─────────────┘    └─────────────┘    └─────────────┘    └───────────┘ │
│         │                  │                  │                 │        │
│         ▼                  ▼                  ▼                 ▼        │
│  ┌────────────────────────────────────────────────────────────────────┐ │
│  │                        Data Model                                   │ │
│  │  ┌─────────────────────────────────────────────────────────────┐   │ │
│  │  │  struct number_display {                                     │   │ │
│  │  │      uint64_t value;         // The raw value                │   │ │
│  │  │      int64_t  signed_value;  // Signed interpretation        │   │ │
│  │  │      int      bit_width;     // 8, 16, 32, or 64             │   │ │
│  │  │      bool     is_negative;   // Original input was negative  │   │ │
│  │  │      enum     input_base;    // DEC, HEX, BIN                │   │ │
│  │  │  }                                                           │   │ │
│  │  └─────────────────────────────────────────────────────────────┘   │ │
│  └────────────────────────────────────────────────────────────────────┘ │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘

Key Components

Component          Responsibility                               Key Functions
─────────          ──────────────                               ─────────────
Argument Parser    Parse command-line flags and extract input   parse_args(), validate options
Input Parser       Convert input string to numeric value        parse_decimal(), parse_hex(), parse_binary()
Display Formatter  Format value for output in all bases         format_binary(), format_hex(), format_decimal()
Output Printer     Render formatted output to stdout            print_display(), handle alignment

Data Structures

// Input parsing result
typedef enum {
    BASE_DECIMAL,
    BASE_HEXADECIMAL,
    BASE_BINARY
} input_base_t;

// Command-line options
typedef struct {
    int bit_width;           // 0 = auto-detect, or 8/16/32/64
    bool show_endian;        // Display byte order layouts
    bool force_signed;       // Force signed interpretation
    bool force_unsigned;     // Force unsigned interpretation
    input_base_t input_base; // Which base the input is in
} options_t;

// Parsed and processed number
typedef struct {
    uint64_t raw_value;      // The unsigned representation
    int64_t signed_value;    // Signed interpretation (if applicable)
    int bit_width;           // Determined or specified width
    bool input_was_negative; // True if input started with '-'
    input_base_t input_base; // How the input was specified
} number_data_t;

// Output strings ready for printing
typedef struct {
    char decimal[32];        // Decimal representation
    char hex[20];            // Hex with prefix
    char binary[80];         // Binary with spaces
    char signed_info[64];    // "positive" or "negative" with value
    char endian_big[64];     // Big-endian layout
    char endian_little[64];  // Little-endian layout
} display_strings_t;

Algorithm Overview

Main Program Flow:

1. Parse command-line arguments
   ├── Extract flags (-w, -e, -s, -u)
   ├── Extract input number string
   └── Validate options (e.g., width must be 8/16/32/64)

2. Detect input format
   ├── Starts with "0x" or "0X" → hexadecimal
   ├── Starts with "0b" or "0B" → binary
   ├── Starts with "-" → negative decimal
   └── Otherwise → positive decimal

3. Parse input to numeric value
   ├── For hex: iterate chars, multiply by 16, add digit value
   ├── For binary: iterate chars, shift left, add bit
   └── For decimal: standard strtol/strtoll

4. Determine bit width (if not specified)
   ├── Value fits in 8 bits → 8
   ├── Value fits in 16 bits → 16
   ├── Value fits in 32 bits → 32
   └── Otherwise → 64

5. Calculate signed interpretation
   ├── Check if MSB is set for given width
   ├── If set: calculate two's complement value
   └── Store both unsigned and signed values

6. Format output strings
   ├── Binary: convert each bit, group by 8s
   ├── Hex: convert each nibble, pad to width
   └── Decimal: sprintf the values

7. Print formatted output
   ├── Print each representation with labels
   ├── If -e flag: print endian layouts
   └── Align output for readability

Binary Conversion Algorithm (Decimal to Binary String):

Input: value (uint64_t), width (int)
Output: binary string with spaces between bytes

1. Create output buffer of appropriate size
2. For i from (width-1) down to 0:
   a. Extract bit i: (value >> i) & 1
   b. Append '0' or '1' to output
   c. If (i % 8 == 0) and (i != 0): append space
3. Return output string
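
A minimal C sketch of this algorithm (the function name and buffer contract are illustrative, not dictated by the spec):

#include <stdint.h>
#include <stdio.h>

// Write `value` as a `width`-bit binary string into `out`, inserting a
// space after every 8 bits. `out` must hold at least width + width/8 bytes
// (bits + separators + terminator); 80 bytes covers the 64-bit case.
static void format_binary(uint64_t value, int width, char *out) {
    int pos = 0;
    for (int i = width - 1; i >= 0; i--) {
        out[pos++] = ((value >> i) & 1) ? '1' : '0';
        if (i % 8 == 0 && i != 0) {
            out[pos++] = ' ';              // byte boundary
        }
    }
    out[pos] = '\0';
}

int main(void) {
    char buf[80];
    format_binary(255, 32, buf);
    printf("%s\n", buf);   // 00000000 00000000 00000000 11111111
    return 0;
}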

Two’s Complement Interpretation:

Input: value (uint64_t), width (int)
Output: signed_value (int64_t)

1. If width == 64: signed_value = value reinterpreted as int64_t (already full width)
2. Create mask for sign bit: 1 << (width - 1)
3. If (value & mask) is non-zero:
   a. Number is negative
   b. Extend sign bits: signed_value = value | ~((1 << width) - 1)
4. Else:
   a. Number is positive
   b. signed_value = value
5. Return signed_value
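
The same steps in C (a sketch; the explicit width-64 branch avoids shifting a 64-bit value by 64, which is undefined behaviour, and the final unsigned-to-signed conversion is implementation-defined but yields the expected result on two's-complement machines):

#include <stdint.h>
#include <stdio.h>

static int64_t signed_interpretation(uint64_t value, int width) {
    if (width == 64) {
        return (int64_t)value;             // already full width, nothing to extend
    }
    uint64_t sign_bit = 1ULL << (width - 1);
    if (value & sign_bit) {
        // Negative: fill all bits above `width` with 1s (sign extension)
        return (int64_t)(value | ~((1ULL << width) - 1));
    }
    return (int64_t)value;                 // positive: value unchanged
}

int main(void) {
    printf("%lld\n", (long long)signed_interpretation(0xFF, 8));          // -1
    printf("%lld\n", (long long)signed_interpretation(0x80, 8));          // -128
    printf("%lld\n", (long long)signed_interpretation(0xDEADBEEF, 32));   // -559038737
    return 0;
}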

Implementation Guide

Development Environment Setup

# Required tools
# On macOS:
xcode-select --install

# On Linux:
sudo apt-get install build-essential

# Create project structure
mkdir -p bitview/{src,include,tests}
cd bitview

# Verify compiler
gcc --version
# or
clang --version

Project Structure

bitview/
├── src/
│   ├── main.c              # Entry point, main loop
│   ├── parser.c            # Input parsing functions
│   ├── formatter.c         # Output formatting functions
│   └── display.c           # Printing and layout functions
├── include/
│   ├── bitview.h           # Shared data structures
│   ├── parser.h            # Parser function declarations
│   ├── formatter.h         # Formatter function declarations
│   └── display.h           # Display function declarations
├── tests/
│   ├── test_parser.c       # Unit tests for parser
│   ├── test_formatter.c    # Unit tests for formatter
│   └── run_tests.sh        # Integration test script
├── Makefile
└── README.md

The Core Question You’re Answering

“How can I instantly visualize any number as the CPU sees it - in binary, with clear byte boundaries, sign bits, and endianness?”

This question drives every design decision:

  • Why binary output? Because that’s what the CPU actually stores
  • Why byte grouping? Because memory is addressed by bytes
  • Why sign bit marking? Because signed vs unsigned interpretation changes meaning
  • Why endianness display? Because memory layout affects debugging

Concepts You Must Understand First

Before writing code, verify you can answer these questions:

Concept                       Self-Test Question                             Where to Learn
───────                       ──────────────────                             ──────────────
Positional notation           How does binary 1011 work out to decimal 11?   “Code” Ch. 7-9
Division-remainder algorithm  How do you convert 25 to binary by hand?       CS:APP Ch. 2.1
Two’s complement              Why is -1 represented as 0xFF in 8 bits?       CS:APP Ch. 2.2
Bit widths                    What’s the range of a signed 8-bit integer?    CS:APP Ch. 2.2
Endianness                    How is 0x12345678 stored in little-endian?     CS:APP Ch. 2.1.3
C bit manipulation            What does (x >> 4) & 0xF extract?              K&R Ch. 2.9

Questions to Guide Your Design

Work through these before writing code:

  1. How will you detect the input format?
    • Check for “0x”/”0X” prefix → hex
    • Check for “0b”/”0B” prefix → binary
    • Check for leading “-“ → negative decimal
    • What about invalid prefixes like “0z”?
  2. How will you handle negative numbers?
    • Parse the absolute value, then apply two’s complement
    • Or parse as signed directly?
    • What if someone inputs “-0xFF”?
  3. How will you determine bit width if not specified?
    • Find minimum width that contains the value
    • Special case for negative numbers (need sign bit)
    • What about 0? (Could be any width)
  4. How will you format binary output for readability?
    • Space every 8 bits (byte boundary)
    • Align hex digits under corresponding nibbles?
    • Leading zeros to fill the width?
  5. How will you structure your code for testability?
    • Pure functions that take input, return output
    • No global state
    • Separate parsing from formatting from printing

Thinking Exercise

Before coding, trace through these by hand:

Exercise 1: Manual Conversion Convert each of these without a calculator:

  • 173 decimal to binary
  • 0xBEEF to decimal
  • 10110011 binary to hex
  • -5 to 8-bit two’s complement

Exercise 2: Width Determination For each value, what’s the minimum bit width needed?

  • 255: ____ bits (unsigned)
  • 256: ____ bits (unsigned)
  • -1: ____ bits (signed)
  • -129: ____ bits (signed)

Exercise 3: Endianness Layout Write out the memory layout for 0x12345678:

  • Big-endian:    [__] [__] [__] [__]
  • Little-endian: [__] [__] [__] [__]

Exercise 4: Design Sketch On paper, write pseudocode for:

  • parse_hex_string(const char* str) → returns uint64_t
  • format_as_binary(uint64_t value, int width) → returns char*
  • get_signed_interpretation(uint64_t value, int width) → returns int64_t

Hints in Layers

Layer 1: Getting Started

If you’re stuck on where to begin:

  • Start with just decimal to binary conversion for unsigned numbers
  • Hardcode 32-bit width initially
  • Ignore command-line arguments - just use a hardcoded test value
  • Get the core algorithm working before adding features

Layer 2: Core Algorithm Hints

For decimal to binary:

// Extract each bit from MSB to LSB
for (int i = width - 1; i >= 0; i--) {
    int bit = (value >> i) & 1;
    // Append '0' or '1' to output
}

For hex to decimal:

// Each hex digit adds to the running total (most significant digit first)
uint64_t result = 0;
for (const char *p = input; *p != '\0'; p++) {
    result = result * 16 + hex_char_to_int(*p);   // hex_char_to_int is shown in Layer 3
}

For two’s complement:

// Check if sign bit is set
uint64_t sign_bit = 1ULL << (width - 1);
if (value & sign_bit) {
    // Negative: extend the sign bits into the upper positions.
    // (1ULL << width) is undefined when width == 64, so guard that case.
    uint64_t mask = (width == 64) ? 0 : ~((1ULL << width) - 1);
    return (int64_t)(value | mask);
}

Layer 3: Implementation Details

For hex character to value:

int hex_char_to_int(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1; // Invalid
}

For value to hex character:

char int_to_hex_char(int value) {
    if (value < 10) return '0' + value;
    return 'A' + value - 10;
}
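
For detecting the input base from the prefix (a sketch that mirrors the input_base_t enum from the Data Structures section; how you combine it with the -d/-x/-b flags is up to your argument parser):

#include <stdbool.h>
#include <string.h>

typedef enum { BASE_DECIMAL, BASE_HEXADECIMAL, BASE_BINARY } input_base_t;

// Decide the base from the start of the string. A leading '-' is recorded
// separately so inputs like "-0xFF" can be handled (or rejected) explicitly.
static input_base_t detect_base(const char *s, bool *negative) {
    *negative = false;
    if (s[0] == '-') {
        *negative = true;
        s++;
    }
    if (strncmp(s, "0x", 2) == 0 || strncmp(s, "0X", 2) == 0) return BASE_HEXADECIMAL;
    if (strncmp(s, "0b", 2) == 0 || strncmp(s, "0B", 2) == 0) return BASE_BINARY;
    return BASE_DECIMAL;
}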

Layer 4: Debugging Hints

Common bugs to watch for:

  • Integer overflow when parsing (use unsigned long long)
  • Off-by-one in bit positions
  • Forgetting that C strings need null terminators
  • Sign extension when casting between sizes
  • Using %d for values larger than int

Test with these values:

0       → Should work (edge case: zero)
1       → Simplest positive
255     → Max 8-bit unsigned
256     → First value needing 9+ bits
-1      → All bits set
0xFFFFFFFF → 32-bit max
0x8000000000000000 → 64-bit MSB only

Interview Questions

After completing this project, you should be able to answer:

  1. “How would you convert a decimal number to binary without built-in functions?”
    • Explain division-remainder algorithm
    • Mention that you reverse the remainders (or build string backwards)
    • Complexity is O(log n)
  2. “What is two’s complement and why do we use it?”
    • Same addition hardware works for signed and unsigned
    • Only one representation of zero
    • Negation is simple: invert bits, add 1
  3. “Why do we use hexadecimal instead of decimal for memory addresses?”
    • Perfect mapping: 4 bits = 1 hex digit
    • Much more compact than binary
    • Easy to see byte boundaries
  4. “What is endianness and when does it matter?”
    • Order of bytes in multi-byte values
    • Matters for: network protocols, file formats, cross-platform code
    • x86 is little-endian, network order is big-endian
  5. “How can the same bit pattern represent different values?”
    • Interpretation depends on context (signed vs unsigned)
    • 0xFF is 255 unsigned but -1 signed in 8 bits
    • CPU doesn’t care - uses same hardware for both

Books That Will Help

Topic                  Book                Chapter      Why
─────                  ────                ───────      ───
Binary fundamentals    “Code” by Petzold   Ch. 7-9      Builds intuition from first principles
Two’s complement       CS:APP              Ch. 2.2      Rigorous explanation with examples
Endianness             CS:APP              Ch. 2.1.3    Shows exactly how bytes are laid out
Bit manipulation in C  K&R                 Ch. 2.9      Classic reference for C operators
Number representation  CS:APP              Ch. 2.1-2.2  Complete coverage of data formats

Implementation Phases

Phase 1: Core Parsing (Day 1, 2-3 hours)

Goals:

  • Parse a decimal string to uint64_t
  • Parse a hex string (with 0x prefix) to uint64_t
  • Handle basic error cases (invalid characters)

Checkpoint: You can run ./bitview 255 and ./bitview 0xFF and get the same internal value.

Phase 2: Binary Formatting (Day 1, 2-3 hours)

Goals:

  • Convert uint64_t to binary string
  • Add byte-boundary spacing
  • Handle specified bit widths

Checkpoint: You can see 255 displayed as 11111111 and 256 as 00000001 00000000.

Phase 3: Complete Output (Day 2, 2-3 hours)

Goals:

  • Display all three representations (dec, hex, bin)
  • Show signed interpretation
  • Proper output formatting and alignment

Checkpoint: Your output matches the example specification.

Phase 4: Polish and Features (Day 2, 2-3 hours)

Goals:

  • Add command-line argument parsing
  • Add endianness display
  • Add help message
  • Handle all edge cases

Checkpoint: Tool is complete and handles all test cases.

Key Implementation Decisions

Decision          Options                               Recommendation         Rationale
────────          ───────                               ──────────────         ─────────
Integer size      int, long, int64_t                    uint64_t/int64_t       Consistent 64-bit support across platforms
String handling   Dynamic allocation vs static buffers  Static buffers         Simpler, no memory leaks, fixed max sizes
Argument parsing  getopt vs manual                      getopt or getopt_long  Standard, robust, handles edge cases
Error handling    errno + return codes vs fprintf       fprintf + exit         Simple for a CLI tool, immediate feedback

Testing Strategy

Test Categories

Category         Purpose                        Examples
────────         ───────                        ────────
Boundary values  Test limits of each bit width  0, 127, 128, 255, 256, 32767, etc.
Sign handling    Verify two’s complement works  -1, -128, INT_MIN, etc.
Input formats    All prefix styles work         0x, 0X, 0b, 0B, no prefix
Error cases      Invalid input rejected         “hello”, “0xGG”, “0b123”
Edge cases       Special situations             Leading zeros, very large numbers

Critical Test Cases

# Boundary values
./bitview 0           # Zero
./bitview 1           # Smallest positive
./bitview 127         # Max 7-bit signed
./bitview 128         # First value needing 8th bit
./bitview 255         # Max 8-bit unsigned
./bitview 256         # First 9-bit value
./bitview 65535       # Max 16-bit unsigned
./bitview 2147483647  # Max 32-bit signed
./bitview 4294967295  # Max 32-bit unsigned

# Negative numbers
./bitview -1          # Should show all 1s
./bitview -128        # Min 8-bit signed
./bitview -129        # Needs 16 bits

# Hexadecimal input
./bitview 0xFF        # 255
./bitview 0xDEADBEEF  # Famous test value
./bitview 0xffffffff  # 32-bit max (lowercase)
./bitview 0XABCD      # Uppercase prefix

# Binary input
./bitview 0b11111111  # 255
./bitview 0b10000000  # 128

# With width flag
./bitview -w 8 255    # Show as 8-bit
./bitview -w 16 255   # Show as 16-bit
./bitview -w 8 256    # Should warn about overflow

# Error cases (should fail gracefully)
./bitview hello       # Not a number
./bitview 0xGHIJ      # Invalid hex
./bitview 0b123       # Invalid binary
./bitview ""          # Empty input

Test Data File

#!/bin/bash
# test_bitview.sh - Run all tests

PASS=0
FAIL=0

test_case() {
    local input="$1"
    local expected_contains="$2"
    local result=$(./bitview $input 2>&1)

    if echo "$result" | grep -q "$expected_contains"; then
        echo "PASS: $input"
        ((PASS++))
    else
        echo "FAIL: $input - expected '$expected_contains'"
        echo "  Got: $result"
        ((FAIL++))
    fi
}

# Run tests
test_case "255" "0xFF"
test_case "255" "11111111"
test_case "-1" "0xFFFFFFFF"
test_case "0xDEADBEEF" "3735928559"
test_case "0b10101010" "170"

echo ""
echo "Results: $PASS passed, $FAIL failed"

Common Pitfalls & Debugging

Frequent Mistakes

Pitfall                        Symptom                           Solution
───────                        ───────                           ────────
Using int instead of uint64_t  Overflow on large values          Always use fixed-width types
Forgetting null terminator     Garbage characters in output      Ensure all strings are terminated
Wrong shift direction          Bits in wrong position            Draw out the operation on paper
Sign extension on cast         Unexpected large positive values  Use explicit masking
Not handling zero              Empty output or crash             Special case: if value == 0, output “0”

Debugging Strategies

Print intermediate values:

// Add during development, remove later
// (cast uint64_t values so the format specifiers match the argument types)
printf("DEBUG: parsed value = %llu (0x%llX)\n",
       (unsigned long long)value, (unsigned long long)value);
printf("DEBUG: sign bit position = %d\n", width - 1);
printf("DEBUG: sign bit value = %d\n", (int)((value >> (width - 1)) & 1));

Test each function in isolation:

// Test parser alone
assert(parse_hex("FF") == 255);
assert(parse_hex("0xFF") == 255);
assert(parse_binary("11111111") == 255);

// Test formatter alone
assert(strcmp(format_hex(255), "0xFF") == 0);
assert(strcmp(format_binary(255, 8), "11111111") == 0);

Use a debugger:

# Compile with debug symbols
gcc -g -O0 main.c -o bitview

# Run in GDB
gdb ./bitview
(gdb) break main
(gdb) run 255
(gdb) print value
(gdb) step

Performance Traps

For this project, performance isn’t critical (everything is O(log n) at worst), but watch for:

  • Unnecessary string copies
  • Reallocating buffers in loops
  • Computing the same value multiple times

Extensions & Challenges

Beginner Extensions

  • Color output: Use ANSI codes to highlight sign bits in red
  • ASCII display: For byte-sized values, show the ASCII character if printable
  • Octal output: Add octal (base 8) representation

Intermediate Extensions

  • Interactive mode: REPL that accepts continuous input
  • Bit field highlighting: Highlight specific bit ranges (e.g., bits 4-7)
  • IEEE 754 floating point: Show float/double bit layout (sign, exponent, mantissa)
  • Arbitrary bases: Support base 3, base 7, etc.

Advanced Extensions

  • Instruction decoder: For x86, decode common instruction patterns
  • Memory dump parsing: Accept hex dump format and decode
  • GUI version: Create a graphical version with bit toggles
  • Network byte order: Convert between host and network byte order

Real-World Connections

Industry Applications

  • Debugging: Every debugger shows memory as hex dumps
  • Embedded systems: Direct register manipulation requires bit-level understanding
  • Network protocols: Packet headers are parsed bit by bit
  • Cryptography: Hash functions and encryption work at bit level
  • Graphics: Color values, pixel formats all use binary/hex

Related tools worth studying:

  • xxd: Hex dump utility (compare your output to this)
  • od: Octal dump (the original Unix tool)
  • hexdump: Another standard hex viewer
  • Python struct module: Binary packing/unpacking

Interview Relevance

This project demonstrates:

  • Understanding of fundamental computer science concepts
  • Ability to build useful CLI tools
  • Knowledge of how CPUs represent data
  • C programming competence

Resources

Essential Reading

  • “Code” by Charles Petzold - Chapters 7-9 on binary and counting
  • CS:APP - Chapter 2 on information representation
  • K&R - Chapter 2.9 on bitwise operators

Online References

  • Wikipedia: “Two’s complement” - Clear explanation with examples
  • Stanford CS107: Binary and Data lab exercises
  • Computerphile YouTube: Binary and number systems videos

Tools

  • Calculator with programmer mode: Windows Calculator, the macOS Calculator, online tools
  • GDB/LLDB: Practice reading hex in a debugger
  • Python: Quick verification with bin(), hex(), int(x, base)

Self-Assessment Checklist

Before considering this project complete, verify:

Conceptual Understanding

  • I can convert between decimal, binary, and hex without tools
  • I understand why -1 is 0xFFFFFFFF in 32 bits
  • I can explain two’s complement to someone else
  • I know the nibble table (4 bits to hex) by heart
  • I understand little-endian vs big-endian memory layout

Implementation Skills

  • My tool correctly handles all positive values up to 64 bits
  • My tool correctly shows negative number representations
  • Invalid input produces meaningful error messages
  • All command-line options work as specified
  • Output is properly formatted and aligned

Interview Readiness

  • I can explain the division-remainder algorithm clearly
  • I can describe why hex is used for memory addresses
  • I can discuss signed vs unsigned representation tradeoffs
  • I can explain endianness and when it matters

Submission/Completion Criteria

Minimum Viable Completion:

  • Accepts decimal input
  • Outputs binary and hex representations
  • Handles values 0 through 2^32-1
  • Basic error handling for invalid input

Full Completion:

  • All input formats (dec, hex, bin) work
  • Negative numbers with two’s complement
  • Bit width selection (-w flag)
  • Proper sign bit visualization
  • Help message with usage examples

Excellence:

  • Endianness display mode
  • Color-coded output
  • Byte and nibble alignment in output
  • Comprehensive test suite
  • Clean, well-documented code

This project is the foundation for understanding CPU architecture. The ability to read and interpret binary and hex values fluently is essential for every project that follows. Take your time, do the paper exercises, and ensure you truly understand before moving on.