Project 1: Card Number Validator & BIN Intelligence Service

Goal: Build practical expertise in payment security by implementing core controls (validation, tokenization, encryption), understanding PCI scope, and producing auditable, compliant artifacts.

Payment Data Boundaries

Payment systems live or die by data boundaries: where PANs can exist, how they move, and who can touch them. You need to draw a clear boundary between the Cardholder Data Environment (CDE) and everything else to reduce scope and risk.

Cryptographic Controls and Key Management

Payments rely on strong symmetric encryption, deterministic tokenization, and strict key lifecycle controls. Key custody, rotation, and HSM-backed operations are as important as the algorithms themselves.

Transaction Flow and Compliance Guarantees

Authorization, capture, and settlement have different security requirements. Compliance programs (PCI DSS, PCI PIN, 3DS) set minimum guarantees that must be reflected in system design.

Concept Summary Table

Concept Cluster      | What You Need to Internalize
Data classification  | PAN vs token, CDE boundaries, data minimization.
Cryptography         | AES, KDFs, tokenization, key hierarchy.
Transaction security | Auth vs settlement, 3DS, P2PE.
Compliance           | PCI DSS scope, audit controls, evidence.
Risk controls        | Rate limits, fraud signals, logging.

Deep Dive Reading by Concept

Concept            | Book & Chapter
PCI DSS            | PCI DSS v4.0, Requirements overview
Tokenization       | PCI Tokenization Guidelines, Implementation sections
Crypto in payments | Cryptography Engineering, Ch. 6-9
Payment flows      | Payment Systems in the U.S., transaction chapters
Fraud controls     | The Anatomy of the Payment Card Industry, risk sections

Project Overview

Attribute            | Value
Difficulty           | Level 1: Beginner
Time Estimate        | Weekend
Programming Language | C
Knowledge Area       | Payments / Data Validation
Key Technologies     | Luhn Algorithm, BIN Database
Coolness Level       | Level 2: Practical but Forgettable
Business Potential   | 3. The “Service & Support” Model

Learning Objectives

By completing this project, you will:

  1. Understand PAN (Primary Account Number) anatomy - Learn how card numbers encode network, issuer, and validation information
  2. Implement the Luhn algorithm - Master this checksum algorithm used across financial systems
  3. Build BIN (Bank Identification Number) parsing - Learn how payment routing works at the data level
  4. Handle variable-length input validation - Process 13-19 digit card numbers correctly
  5. Design a clean C library API - Create a reusable validation library with proper error handling
  6. Understand validation vs. authentication vs. authorization - Critical distinction in payment systems

The Core Question You’re Answering

“How do payment systems know if a card number is valid BEFORE contacting the bank?”

This isn’t just about data validation—it’s about understanding the economics and security design of payment networks. Every time you enter a card number online, the merchant validates it instantly (before network calls). Why?

  1. Cost reduction: Sending invalid card numbers to processors costs money (merchant pays per transaction attempt)
  2. Fraud prevention: Invalid PANs are often manual typing errors OR deliberate probing attacks
  3. User experience: Instant feedback prevents user frustration and cart abandonment
  4. System design: Understanding why card numbers encode their own validity check reveals how distributed systems handle untrusted input

The Luhn algorithm (invented in 1954 by IBM scientist Hans Peter Luhn) is a checksum, not cryptography. It catches typing errors, not fraud. Why is this distinction critical? Because many developers confuse validation with authentication—Luhn tells you “this could be a real card number structure,” not “this card exists and has funds.”


Deep Theoretical Foundation

1. Primary Account Number (PAN) Structure

A PAN is NOT a random number—it’s a structured identifier with semantic meaning.

Format: [IIN/BIN (6-8 digits)][Account Number (variable)][Check Digit (1)]

Total length: 13-19 digits (varies by network)

┌──────────────────────────────────────────────────────────────────┐
│                    PAN STRUCTURE (16 digits)                      │
├──────────────────────────────────────────────────────────────────┤
│                                                                   │
│   4532 0151 1283 0366                                            │
│   │    │         │  │                                            │
│   │    │         │  └── Check Digit (position 16)                │
│   │    │         │      Calculated via Luhn algorithm            │
│   │    │         │                                               │
│   │    └─────────┴───── Account Number (positions 7-15)          │
│   │                     Unique identifier within issuer           │
│   │                                                               │
│   └────────────────────── BIN/IIN (positions 1-6)                │
│                           Identifies network + issuer             │
│                                                                   │
└──────────────────────────────────────────────────────────────────┘

The first digit is the Major Industry Identifier (MII):

  • 1-2: Airlines (2 also covers other industry assignments, which is why Mastercard's 2221-2720 range starts with 2)
  • 3: Travel and entertainment (Amex, Diners)
  • 4: Banking (Visa)
  • 5: Banking (Mastercard)
  • 6: Merchandising and banking (Discover)

Why this matters: The structure IS the routing information. When you swipe a card, the terminal reads the BIN and knows “send this to Visa network” before any authentication happens.
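
To make the layout concrete, here is a minimal sketch that splits a digits-only PAN into the fields shown in the diagram. The names PanParts and split_pan are hypothetical, and the sketch assumes a 6-digit BIN; an 8-digit BIN would need a wider field.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: the fields of a digits-only PAN. */
typedef struct {
    int  mii;           /* Major Industry Identifier: the first digit */
    char bin[7];        /* first 6 digits (assumes a 6-digit BIN) */
    char account[13];   /* digits between BIN and check digit (at most 19 - 6 - 1 = 12) */
    int  check_digit;   /* last digit */
} PanParts;

/* Splits a digits-only PAN of 13-19 characters.
   Returns 0 on success, -1 on NULL, bad length, or a non-digit. */
int split_pan(const char *pan, PanParts *out)
{
    if (pan == NULL || out == NULL) return -1;

    size_t len = strlen(pan);
    if (len < 13 || len > 19) return -1;
    for (size_t i = 0; i < len; i++)
        if (pan[i] < '0' || pan[i] > '9') return -1;

    out->mii = pan[0] - '0';

    memcpy(out->bin, pan, 6);
    out->bin[6] = '\0';

    size_t acct_len = len - 6 - 1;          /* everything except BIN and check digit */
    memcpy(out->account, pan + 6, acct_len);
    out->account[acct_len] = '\0';

    out->check_digit = pan[len - 1] - '0';
    return 0;
}
```

For 4532015112830366 this yields MII 4, BIN 453201, account number 511283036, and check digit 6.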

2. The Luhn Algorithm (Modulus 10)

The Luhn algorithm is a checksum—not encryption. It’s designed to catch common transcription errors.

What it catches:

  • 100% of single-digit errors
  • Nearly all adjacent transpositions (typing “12” instead of “21”); the one swap it misses is 09 ↔ 90, which is where the often-quoted 98% figure comes from

What it does NOT do:

  • Prevent deliberate tampering
  • Prove the card exists
  • Authenticate the cardholder

Algorithm Steps:

  1. Starting from the rightmost digit (the check digit) and moving left, double every second digit; the check digit itself is not doubled
  2. If doubling results in a two-digit number, sum those digits (e.g., 16 → 1+6=7)
  3. Sum all the digits
  4. If total modulo 10 is 0, the number is valid

Visual Trace:

Card number: 4  5  3  2  0  1  5  1  1  2  8  3  0  3  6  6
Position:    16 15 14 13 12 11 10 9  8  7  6  5  4  3  2  1   (counted from the right)

Step 1: Mark positions to double (even positions from the right):
  Positions 2, 4, 6, 8, 10, 12, 14, 16

  4  5  3  2  0  1  5  1  1  2  8  3  0  3  6  6
  *     *     *     *     *     *     *     *
  ↓     ↓     ↓     ↓     ↓     ↓     ↓     ↓

Step 2: Double marked digits (unmarked digits are copied unchanged):
  8  5  6  2  0  1  10 1  2  2  16 3  0  3  12 6

Step 3: Sum digits of doubled values > 9:
  10 → 1+0 = 1
  16 → 1+6 = 7
  12 → 1+2 = 3

  Result: 8  5  6  2  0  1  1  1  2  2  7  3  0  3  3  6

Step 4: Sum all:
  8+5+6+2+0+1+1+1+2+2+7+3+0+3+3+6 = 50

Step 5: Check 50 % 10 = 0 → VALID

The Key Insight: Position numbering starts from the CHECK DIGIT (rightmost), and we double digits at EVEN positions. In a 16-digit PAN that means the 1st, 3rd, 5th, ... digits from the left; in a 15-digit PAN (Amex) it means the 2nd, 4th, 6th, ... A left-to-right implementation that always doubles the first digit gets 16-digit cards right and 15-digit cards wrong, so a correct implementation processes right-to-left.
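
The algorithm steps above can be sketched in C roughly as follows. This is a minimal version that assumes the input has already been reduced to digits only; it leaves the 13-19 length check to the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if a digits-only string passes the Luhn check.
   Sketch only: length limits (13-19) are the caller's job. */
bool luhn_check(const char *card_number)
{
    if (card_number == NULL) return false;

    size_t len = strlen(card_number);
    if (len == 0) return false;

    int sum = 0;
    bool should_double = false;            /* position 1 (check digit) is not doubled */

    for (size_t i = len; i-- > 0; ) {      /* walk right to left */
        char c = card_number[i];
        if (c < '0' || c > '9') return false;

        int digit = c - '0';
        if (should_double) {
            digit *= 2;
            if (digit > 9) digit -= 9;     /* same result as adding the two digits */
        }
        sum += digit;
        should_double = !should_double;    /* alternate: odd, even, odd, ... */
    }
    return sum % 10 == 0;
}
```

Because the loop starts at the check digit and toggles a flag, the same code handles 15-digit and 16-digit numbers without any length-dependent branching.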

3. Bank Identification Number (BIN) / Issuer Identification Number (IIN)

The BIN is the first 6-8 digits of a PAN. It uniquely identifies the issuing bank and card network.

Standard BIN Ranges:

Prefix            | Network          | Notes
4                 | Visa             | All cards starting with 4
51-55             | Mastercard       | Traditional range
2221-2720         | Mastercard       | New range (2017+)
34, 37            | American Express | 15-digit cards
6011, 644-649, 65 | Discover         |
300-305, 36, 38   | Diners Club      |
3528-3589         | JCB              |
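
One way to turn the table above into code is to read the leading 1 to 4 digits as integers and compare them against the ranges. This is a sketch covering only the prefixes listed in this guide; leading_digits is a hypothetical helper, and production BIN tables are far more detailed.

```c
#include <stddef.h>

/* Reads the first n characters of pan as a decimal number.
   Returns -1 if the string is shorter than n or contains a non-digit. */
static int leading_digits(const char *pan, size_t n)
{
    int value = 0;
    for (size_t i = 0; i < n; i++) {
        if (pan[i] < '0' || pan[i] > '9') return -1;   /* also catches '\0' */
        value = value * 10 + (pan[i] - '0');
    }
    return value;
}

/* Maps a digits-only PAN to a network name using the ranges in the table. */
const char *identify_network(const char *card_number)
{
    if (card_number == NULL) return "Unknown";

    int p1 = leading_digits(card_number, 1);
    int p2 = leading_digits(card_number, 2);
    int p3 = leading_digits(card_number, 3);
    int p4 = leading_digits(card_number, 4);

    if (p1 == 4)                                    return "Visa";
    if (p2 >= 51 && p2 <= 55)                       return "Mastercard";
    if (p4 >= 2221 && p4 <= 2720)                   return "Mastercard";
    if (p2 == 34 || p2 == 37)                       return "American Express";
    if (p4 == 6011 || (p3 >= 644 && p3 <= 649) || p2 == 65)
                                                    return "Discover";
    if ((p3 >= 300 && p3 <= 305) || p2 == 36 || p2 == 38)
                                                    return "Diners Club";
    if (p4 >= 3528 && p4 <= 3589)                   return "JCB";
    return "Unknown";
}
```

Numeric range checks keep entries such as 2221-2720 to a single comparison instead of hundreds of literal prefixes.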

Why BINs enable distributed routing:

┌─────────────────────────────────────────────────────────────────────┐
│                     PAYMENT ROUTING VIA BIN                          │
├─────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  Card: 4532 0151 1283 0366                                          │
│        │                                                             │
│        └──→ BIN: 453201                                             │
│             │                                                        │
│             ▼                                                        │
│  ┌─────────────────────────────────────────────┐                    │
│  │           BIN LOOKUP TABLE                   │                    │
│  ├─────────────────────────────────────────────┤                    │
│  │  453201 → Network: Visa                     │                    │
│  │           Issuer:  Bank of America          │                    │
│  │           Type:    Credit                    │                    │
│  │           Country: USA                       │                    │
│  └─────────────────────────────────────────────┘                    │
│                      │                                               │
│                      ▼                                               │
│  Terminal Decision: Route to Visa network → Bank of America          │
│                                                                      │
│  Without BINs: Would need central database lookup for every card!    │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘

4. Validation vs. Authentication vs. Authorization

Critical distinction every payment developer must understand:

Concept        | What It Answers                      | Where It Happens               | Security Level
Validation     | “Is this structurally correct?”      | Client-side (browser)          | None (checksum)
Authentication | “Is this the legitimate cardholder?” | Issuer (CVV, PIN, 3DS)         | Medium
Authorization  | “Can this transaction proceed?”      | Issuing bank                   | High (funds check)

Luhn validation is UX, not security. Confirming that a card actually exists and can be charged requires:

  1. Contacting the card network
  2. Routing to the issuing bank
  3. Receiving an authorization response

5. ISO/IEC 7812 Standard

The international standard defining PAN structure:

  • Specifies MII (Major Industry Identifier)
  • Defines IIN (Issuer Identification Number)
  • Specifies check digit algorithm (Luhn)
  • Defines variable-length PANs (13-19 digits)

Why Amex is 15 digits: Historical design decisions. Different networks made different choices about account number length vs. issuer identification space.


Project Specification

What You’ll Build

A CLI tool and library that validates card numbers using the Luhn algorithm, identifies card networks from BIN ranges, and extracts metadata about the issuing bank.

Expected Output

$ ./cardvalidate 4532015112830366

Card Analysis Report
====================
Card Number:     4532015112830366
Length:          16 digits
Valid Luhn:      ✓ VALID
Card Network:    Visa
Card Type:       Credit
Issuer:          Bank of America
Country:         United States
BIN:             453201

Structure Breakdown:
┌──────┬────────────────┬─┐
│ BIN  │ Account Number │C│
│453201│ 511283036      │6│
└──────┴────────────────┴─┘
  IIN      Unique ID    Check

$ ./cardvalidate 378282246310005

Card Analysis Report
====================
Card Number:     378282246310005
Length:          15 digits
Valid Luhn:      ✓ VALID
Card Network:    American Express
Card Type:       Credit
Issuer:          American Express
Country:         United States
BIN:             378282

$ ./cardvalidate 1234567890123456

Card Analysis Report
====================
Card Number:     1234567890123456
Length:          16 digits
Valid Luhn:      ✗ INVALID (Expected check digit: 2, got: 6)
Card Network:    Unknown

Project Structure

cardvalidate/
├── src/
│   ├── main.c              # CLI entry point
│   ├── cardvalidate.c      # Core validation library
│   ├── cardvalidate.h      # Public API
│   ├── bin_database.c      # BIN range lookup
│   ├── bin_database.h      # BIN database interface
│   └── utils.c             # String utilities
├── data/
│   └── bin_ranges.csv      # BIN to issuer mappings
├── tests/
│   ├── test_luhn.c         # Luhn algorithm tests
│   ├── test_bin_lookup.c   # BIN lookup tests
│   └── known_cards.txt     # Test vectors
├── Makefile
└── README.md

Core API Design

// cardvalidate.h

#ifndef CARDVALIDATE_H
#define CARDVALIDATE_H

#include <stdbool.h>
#include <stddef.h>                // size_t

typedef struct {
    bool valid;                    // Luhn check passed
    char network[32];              // "Visa", "Mastercard", etc.
    char issuer[64];               // "Bank of America", etc.
    char card_type[16];            // "credit" or "debit"
    char country[64];              // "United States", etc.
    char bin[9];                   // First 6-8 digits
    int expected_check_digit;      // What check digit should be
    int actual_check_digit;        // What check digit is
    int length;                    // Number of digits
} CardValidationResult;

// Primary API
CardValidationResult validate_card(const char* card_number);

// Individual functions
bool luhn_check(const char* card_number);
int calculate_check_digit(const char* card_number);
const char* identify_network(const char* card_number);

// Utility
void strip_non_digits(const char* input, char* output, size_t output_size);
void mask_pan(const char* pan, char* masked, size_t masked_size);

#endif

Solution Architecture

System Design

┌─────────────────────────────────────────────────────────────────────────┐
│                        CARD VALIDATOR ARCHITECTURE                       │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  INPUT                                                                   │
│    │                                                                     │
│    ▼                                                                     │
│  ┌─────────────────────────────────────────┐                            │
│  │         INPUT SANITIZER                  │                            │
│  │  • Strip spaces, dashes                  │                            │
│  │  • Validate characters (digits only)     │                            │
│  │  • Check length (13-19)                  │                            │
│  └─────────────────┬───────────────────────┘                            │
│                    │                                                     │
│         ┌─────────┴──────────┐                                          │
│         ▼                    ▼                                          │
│  ┌─────────────────┐  ┌─────────────────────┐                           │
│  │  LUHN VALIDATOR │  │   BIN LOOKUP        │                           │
│  │                 │  │                     │                           │
│  │  • Right-to-left│  │  • Extract BIN      │                           │
│  │    processing   │  │  • Search database  │                           │
│  │  • Double/sum   │  │  • Return metadata  │                           │
│  │  • Mod 10 check │  │                     │                           │
│  └────────┬────────┘  └─────────┬───────────┘                           │
│           │                     │                                        │
│           └──────────┬──────────┘                                        │
│                      ▼                                                   │
│  ┌─────────────────────────────────────────┐                            │
│  │         RESULT AGGREGATOR               │                            │
│  │  • Combine validation result            │                            │
│  │  • Format output                        │                            │
│  │  • Generate breakdown                   │                            │
│  └─────────────────────────────────────────┘                            │
│                      │                                                   │
│                      ▼                                                   │
│                   OUTPUT                                                 │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

BIN Database Design

┌─────────────────────────────────────────────────────────────────────────┐
│                        BIN DATABASE OPTIONS                              │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│  OPTION A: Array with Linear Search                                      │
│  ┌─────────────────────────────────────────────┐                        │
│  │  BINInfo bins[] = {                         │                        │
│  │    {"4", "Visa", ...},                      │  Time: O(n)            │
│  │    {"51", "Mastercard", ...},               │  Space: O(n)           │
│  │    {"453201", "Bank of America", ...},      │  Simple but slow       │
│  │    ...                                      │                        │
│  │  };                                         │                        │
│  └─────────────────────────────────────────────┘                        │
│                                                                          │
│  OPTION B: Sorted Array with Binary Search                               │
│  ┌─────────────────────────────────────────────┐                        │
│  │  Sort BINs by prefix                        │  Time: O(log n)        │
│  │  Binary search for match                    │  Space: O(n)           │
│  │  Search longest prefix first                │  Better for larger DB  │
│  └─────────────────────────────────────────────┘                        │
│                                                                          │
│  OPTION C: Trie (Prefix Tree)                                           │
│  ┌─────────────────────────────────────────────┐                        │
│  │       Root                                  │  Time: O(k) k=BIN len  │
│  │      / | \                                  │  Space: O(n*k)         │
│  │     4  5  6                                 │  Best for prefix match │
│  │    /   |   \                                │                        │
│  │   ...  1-5  0                               │                        │
│  └─────────────────────────────────────────────┘                        │
│                                                                          │
│  RECOMMENDATION: Start with Option A, optimize to B if needed            │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

Key Design Decisions

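  1. String vs. Integer for PAN storage: Use strings. A 19-digit number can exceed the range of a signed 64-bit integer (long long). It does fit in unsigned long long, but a PAN is an identifier, not a quantity: you index its digits and never do arithmetic on the whole value.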

  2. Right-to-left vs. left-to-right Luhn: Process right-to-left (as defined) using index arithmetic, or reverse the string.

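  3. Non-digit handling: Strip spaces and dashes before validation rather than rejecting them, because users type card numbers the way they are printed, in groups. Reject any other non-digit character.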

  4. BIN search order: Search longest prefixes first. 453201 (specific bank) should match before 4 (generic Visa).

  5. Memory management: Use stack allocation for fixed-size buffers. PANs are at most 19 digits, so a 20-byte buffer holds any sanitized PAN plus its terminator.
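
Decision 4 can be sketched with the linear-search option: scan every entry and keep the longest prefix that matches, so table order no longer matters. BinEntry, BIN_TABLE, and bin_lookup are hypothetical names, and the three entries are illustrative only (the issuer mapping simply mirrors the example used in this guide).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout for one BIN table row. */
typedef struct {
    const char *prefix;    /* digits the PAN must start with */
    const char *network;
    const char *issuer;
} BinEntry;

/* Illustrative entries only, deliberately listed with the generic prefix first. */
static const BinEntry BIN_TABLE[] = {
    {"4",      "Visa",       "Various"},
    {"453201", "Visa",       "Bank of America"},
    {"51",     "Mastercard", "Various"},
};

/* Linear scan that returns the entry with the longest matching prefix,
   or NULL if nothing matches. */
const BinEntry *bin_lookup(const char *pan)
{
    const BinEntry *best = NULL;
    size_t best_len = 0;

    if (pan == NULL) return NULL;

    for (size_t i = 0; i < sizeof BIN_TABLE / sizeof BIN_TABLE[0]; i++) {
        size_t plen = strlen(BIN_TABLE[i].prefix);
        if (plen > best_len && strncmp(pan, BIN_TABLE[i].prefix, plen) == 0) {
            best = &BIN_TABLE[i];
            best_len = plen;
        }
    }
    return best;
}
```

Tracking the best length costs one extra comparison per entry and avoids having to keep the table sorted by prefix length.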


Implementation Guide

Phase 1: Basic Luhn Validation

Goal: Implement and test the Luhn algorithm.

Steps:

  1. Create luhn_check() function
  2. Handle digit extraction from string
  3. Implement the doubling logic
  4. Test with known valid/invalid cards

Test Cards (from payment network documentation):

// Valid test cards
"4111111111111111"  // Visa
"5555555555554444"  // Mastercard
"378282246310005"   // Amex
"6011111111111117"  // Discover

// Invalid (fail Luhn)
"1234567890123456"
"4111111111111112"  // Wrong check digit

Key Implementation Detail: The digit -= 9 trick for doubled values > 9:

  • 10 → 1+0=1, and 10-9=1
  • 12 → 1+2=3, and 12-9=3
  • 18 → 1+8=9, and 18-9=9
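
The equivalence can be checked for every digit with two small functions. Both names are hypothetical; the second is the literal "add the two digits" version, kept only for comparison.

```c
/* Doubles one Luhn digit (0-9) and folds a two-digit result back to one digit. */
int luhn_double(int digit)
{
    int doubled = digit * 2;
    return doubled > 9 ? doubled - 9 : doubled;
}

/* Reference version: adds the tens digit and the units digit of the doubled value. */
int luhn_double_by_digit_sum(int digit)
{
    int doubled = digit * 2;
    return doubled / 10 + doubled % 10;
}
```

The two agree for all inputs 0 through 9 because a doubled digit is at most 18, so its tens digit is either 0 or 1, and removing a 1 from the tens place while adding 1 to the sum is the same as subtracting 9.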

Phase 2: BIN Database

Goal: Build BIN lookup with network identification.

Steps:

  1. Create BIN data structure
  2. Populate with major network prefixes
  3. Implement prefix-matching search
  4. Return complete card info

Minimum BIN data:

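// prefix, network, issuer, card type (one row per prefix so plain prefix matching works)
{"4",      "Visa",             "Various",  "credit/debit"},
{"51",     "Mastercard",       "Various",  "credit"},
{"52",     "Mastercard",       "Various",  "credit"},
{"53",     "Mastercard",       "Various",  "credit"},
{"54",     "Mastercard",       "Various",  "credit"},
{"55",     "Mastercard",       "Various",  "credit"},
{"34",     "American Express", "Amex",     "credit"},
{"37",     "American Express", "Amex",     "credit"},
{"6011",   "Discover",         "Discover", "credit"},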

Phase 3: CLI Interface

Goal: Build user-friendly command-line interface.

Features:

  • Single card validation: ./cardvalidate 4111111111111111
  • Batch mode: ./cardvalidate --batch file.txt
  • Explain mode: ./cardvalidate --explain 4111111111111111
  • Help: ./cardvalidate --help

Use getopt_long() for argument parsing.
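
A possible shape for the argument parsing is sketched below. CliMode, CliOptions, and parse_cli are hypothetical names, and the short options (-b, -e, -h) are an assumption; only the long options come from the feature list above.

```c
#include <getopt.h>
#include <stddef.h>

typedef enum { MODE_SINGLE, MODE_BATCH, MODE_EXPLAIN, MODE_HELP, MODE_ERROR } CliMode;

typedef struct {
    CliMode mode;
    const char *argument;    /* card number, or batch file path */
} CliOptions;

CliOptions parse_cli(int argc, char *argv[])
{
    static const struct option long_opts[] = {
        {"batch",   required_argument, NULL, 'b'},
        {"explain", required_argument, NULL, 'e'},
        {"help",    no_argument,       NULL, 'h'},
        {NULL, 0, NULL, 0}
    };
    CliOptions opts = { MODE_SINGLE, NULL };
    int c;

    /* Restart scanning so the function can be called more than once.
       This works with glibc; other platforms may need a different reset. */
    optind = 1;

    while ((c = getopt_long(argc, argv, "b:e:h", long_opts, NULL)) != -1) {
        switch (c) {
        case 'b': opts.mode = MODE_BATCH;   opts.argument = optarg; break;
        case 'e': opts.mode = MODE_EXPLAIN; opts.argument = optarg; break;
        case 'h': opts.mode = MODE_HELP;    return opts;
        default:  opts.mode = MODE_ERROR;   return opts;   /* unknown option or missing value */
        }
    }

    if (opts.mode == MODE_SINGLE) {
        if (optind < argc) opts.argument = argv[optind];   /* first positional argument */
        else               opts.mode = MODE_ERROR;         /* no card number given */
    }
    return opts;
}
```

Returning a small struct keeps main() to a single switch on opts.mode, and keeps the parsing testable without spawning the binary.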

Phase 4: Polish and Edge Cases

Edge cases to handle:

  • NULL input
  • Empty string
  • String with spaces/dashes: 4532 0151 1283 0366
  • Too short (<13 digits)
  • Too long (>19 digits)
  • Contains letters
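
The edge cases above can be handled in one sanitizing pass. This sketch uses a hypothetical sanitize_pan function and PanStatus codes rather than the strip_non_digits signature from the header, so that each failure has its own return value; it assumes spaces and dashes are the only separators worth accepting.

```c
#include <stddef.h>

/* Hypothetical status codes, one per edge case. */
typedef enum {
    PAN_OK = 0,
    PAN_ERR_NULL,
    PAN_ERR_BAD_CHAR,
    PAN_ERR_TOO_SHORT,
    PAN_ERR_TOO_LONG
} PanStatus;

#define PAN_MIN_DIGITS 13
#define PAN_MAX_DIGITS 19

/* Copies the digits of input into out, which must hold PAN_MAX_DIGITS + 1 bytes.
   Spaces and dashes are skipped; any other non-digit is rejected.
   On failure out is set to the empty string. */
PanStatus sanitize_pan(const char *input, char out[PAN_MAX_DIGITS + 1])
{
    if (input == NULL || out == NULL) return PAN_ERR_NULL;

    size_t n = 0;
    for (const char *p = input; *p != '\0'; p++) {
        if (*p == ' ' || *p == '-') continue;
        if (*p < '0' || *p > '9') { out[0] = '\0'; return PAN_ERR_BAD_CHAR; }
        if (n == PAN_MAX_DIGITS)  { out[0] = '\0'; return PAN_ERR_TOO_LONG; }
        out[n++] = *p;
    }
    out[n] = '\0';

    if (n < PAN_MIN_DIGITS) { out[0] = '\0'; return PAN_ERR_TOO_SHORT; }
    return PAN_OK;
}
```

An empty string falls out naturally as PAN_ERR_TOO_SHORT, and the length check happens while copying, so the output buffer can never overflow.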

PCI consideration: Even for validation, consider:

  • Don’t log full PANs
  • Clear memory after use
  • Support masked output
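
A minimal sketch of the last two points follows. mask_pan matches the signature in cardvalidate.h and keeps the first six and last four digits, which is a common display convention; wipe_buffer is a hypothetical helper name.

```c
#include <stddef.h>
#include <string.h>

/* Writes a masked copy of a digits-only PAN: first 6 and last 4 digits kept,
   everything in between replaced by '*'. Writes "" if the PAN is NULL,
   shorter than 13 digits, or does not fit in the output buffer. */
void mask_pan(const char *pan, char *masked, size_t masked_size)
{
    if (masked == NULL || masked_size == 0) return;
    masked[0] = '\0';
    if (pan == NULL) return;

    size_t len = strlen(pan);
    if (len < 13 || len + 1 > masked_size) return;

    for (size_t i = 0; i < len; i++)
        masked[i] = (i < 6 || i >= len - 4) ? pan[i] : '*';
    masked[len] = '\0';
}

/* Zeroes a buffer through a volatile pointer so the compiler
   cannot discard the stores as dead writes. */
void wipe_buffer(void *buf, size_t size)
{
    volatile unsigned char *p = buf;
    while (size--) *p++ = 0;
}
```

Log the masked form only, and call wipe_buffer on any stack buffer that held the full PAN before the function returns.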

Testing Strategy

Unit Tests

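// test_luhn.c (build with the src/ directory on the include path)

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include "cardvalidate.h"

void test_luhn_valid_visa(void) {
    assert(luhn_check("4111111111111111") == true);
}

void test_luhn_valid_amex(void) {
    assert(luhn_check("378282246310005") == true);
}

void test_luhn_invalid_single_digit_error(void) {
    assert(luhn_check("4111111111111112") == false);
}

void test_luhn_empty_string(void) {
    assert(luhn_check("") == false);
}

void test_luhn_with_spaces(void) {
    // After stripping: should pass
    char clean[20];
    strip_non_digits("4111 1111 1111 1111", clean, sizeof(clean));
    assert(luhn_check(clean) == true);
}

// Run every test; assert() aborts on the first failure
int main(void) {
    test_luhn_valid_visa();
    test_luhn_valid_amex();
    test_luhn_invalid_single_digit_error();
    test_luhn_empty_string();
    test_luhn_with_spaces();
    puts("All Luhn tests passed");
    return 0;
}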

Integration Tests

# test_cli.sh
set -e

# Test valid Visa ("INVALID" also contains "VALID", so match the check mark too)
./cardvalidate 4111111111111111 | grep -q "✓ VALID"

# Test invalid
./cardvalidate 1234567890123456 | grep -q "✗ INVALID"

# Test batch mode: expect exactly two valid cards
printf '4111111111111111\n5555555555554444\n' > /tmp/test.txt
[ "$(./cardvalidate --batch /tmp/test.txt | grep -c "✓ VALID")" -eq 2 ]

Performance Tests

# Generate 1M test cards and validate
time ./cardvalidate --batch 1million_cards.txt

# Target: < 1 second for 1M cards

Common Pitfalls & Debugging

Pitfall 1: Wrong Position Numbering

Symptom: Valid cards fail, invalid cards pass.

Cause: Processing left-to-right instead of right-to-left, or off-by-one in position calculation.

Debug: Add verbose output showing each step:

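// Assumes pan is the digits-only string and len == (int)strlen(pan)
for (int i = len - 1; i >= 0; i--) {
    int digit = pan[i] - '0';
    int pos_from_right = len - i;                    // 1 = check digit
    bool should_double = (pos_from_right % 2 == 0);
    printf("Index %d (position %d from right): digit=%d, double=%s\n",
           i, pos_from_right, digit, should_double ? "yes" : "no");
}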

Pitfall 2: Integer Overflow

Symptom: Validation fails for long cards.

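Cause: Storing the PAN in an integer type. A 32-bit int overflows at 10 digits, and a signed 64-bit integer cannot hold every 19-digit value.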

Fix: Always use strings for PAN storage.
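
A quick way to see the problem is to let strtoll report the overflow. The helper name overflows_long_long is hypothetical, and the sketch assumes a platform where long long is 64 bits.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Returns true if the decimal string does not fit in a long long.
   strtoll sets errno to ERANGE when the value is out of range. */
bool overflows_long_long(const char *digits)
{
    errno = 0;
    (void)strtoll(digits, NULL, 10);
    return errno == ERANGE;
}
```

With a 64-bit long long, a 16-digit PAN converts cleanly, but the 19-digit string 9999999999999999999 is out of range, since the largest representable value is 9223372036854775807.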

Pitfall 3: Forgetting to Handle Non-Digits

Symptom: Crashes or wrong results with formatted input.

Cause: Not stripping spaces/dashes before validation.

Fix: Always sanitize input first.

Pitfall 4: BIN Match Order

Symptom: All Visa cards show “Various” instead of specific bank.

Cause: Generic prefix 4 matches before specific 453201.

Fix: Search longest prefixes first, or sort BIN database by prefix length descending.


Extensions & Challenges

Extension 1: Check Digit Calculator

Generate valid Luhn check digits:

$ ./cardvalidate --generate 411111111111111
Check digit for 411111111111111 is: 1
Complete valid number: 4111111111111111
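
The generator is the validator run in reverse: sum the payload as if the check digit were already appended, then pick the digit that brings the total to a multiple of 10. The sketch below reads the header's calculate_check_digit as taking the number without its check digit, which is an assumption about the intended API.

```c
#include <stddef.h>
#include <string.h>

/* Returns the Luhn check digit (0-9) for a digits-only payload that does not
   yet include a check digit, or -1 on NULL, empty, or non-digit input. */
int calculate_check_digit(const char *payload)
{
    if (payload == NULL || payload[0] == '\0') return -1;

    int sum = 0;
    int should_double = 1;     /* rightmost payload digit will sit at position 2 */

    for (size_t i = strlen(payload); i-- > 0; ) {
        if (payload[i] < '0' || payload[i] > '9') return -1;

        int digit = payload[i] - '0';
        if (should_double) {
            digit *= 2;
            if (digit > 9) digit -= 9;
        }
        sum += digit;
        should_double = !should_double;
    }
    return (10 - sum % 10) % 10;   /* outer % 10 maps a result of 10 to 0 */
}
```

For 411111111111111 the weighted sum is 29, so the check digit is 1.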

Extension 2: Card Generator

Generate valid test card numbers for a given network:

$ ./cardvalidate --generate-card visa
Generated valid Visa test card: 4916338506082832
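
A minimal sketch of the generator for the 16-digit Visa case: fix the leading 4, fill 14 random digits, then append the check digit. The function name generate_test_visa is hypothetical, and rand() is used only because the output is test data; it is not suitable for anything security-related.

```c
#include <stdlib.h>

/* Fills out (17 bytes) with a 16-digit, Luhn-valid number starting with '4'.
   Call srand() beforehand if different numbers are wanted on each run. */
void generate_test_visa(char out[17])
{
    int sum = 0;

    out[0] = '4';
    for (int i = 1; i < 15; i++)
        out[i] = (char)('0' + rand() % 10);

    /* In a 16-digit number, even indexes from the left are the
       even positions from the right, so those digits get doubled. */
    for (int i = 0; i < 15; i++) {
        int digit = out[i] - '0';
        if (i % 2 == 0) {
            digit *= 2;
            if (digit > 9) digit -= 9;
        }
        sum += digit;
    }

    out[15] = (char)('0' + (10 - sum % 10) % 10);   /* check digit */
    out[16] = '\0';
}
```

Numbers produced this way only pass the checksum; they do not correspond to issued cards.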

Extension 3: Real BIN Database

Integrate with a real BIN database API (like binlist.net) for accurate issuer information.

Extension 4: WebAssembly Build

Compile to WASM for browser-based validation in checkout forms.

Extension 5: Fuzzing

Use AFL or libFuzzer to find edge cases in your input handling.


Interview Questions This Prepares You For

Conceptual

  1. “What is the Luhn algorithm, and why do payment cards use it?”
    • Checksum from 1954, catches transcription errors, not a security mechanism
  2. “What’s the difference between a checksum and a cryptographic hash?”
    • Checksums: error detection, trivial to forge, no security
    • Hashes: one-way, collision-resistant, for integrity/authentication
  3. “What is a BIN, and how does it enable payment routing?”
    • First 6-8 digits, identifies network and issuer, enables distributed routing

Implementation

  1. “How would you implement Luhn without string conversion?”
    • Extract digits with modulo/division, process right-to-left
  2. “What data structure for BIN lookup?”
    • Hash table for O(1), binary search for O(log n), trie for prefix matching
  3. “How do you handle 19-digit cards in C?”
    • String representation, not numeric types

Security

  1. “Can I store a full PAN in my database?”
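    • Only if it is protected. PCI DSS Requirement 3 allows storing the PAN when there is a business need and it is rendered unreadable (encryption, truncation, tokenization, or keyed hashing); sensitive authentication data such as the CVV must never be stored after authorization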
  2. “What’s the attack surface of a card validator?”
    • Buffer overflow, logging sensitive data, timing attacks (unlikely for Luhn)

Resources

Books

Topic              | Book                            | Chapter
Luhn & Checksums   | Serious Cryptography (Aumasson) | Ch. 7: Keyed Hashing (MACs)
Input Validation   | Fluent C (Preschern)            | Ch. 5: Input Validation
String Processing  | C Primer Plus (Prata)           | Ch. 11: Strings
Modular Arithmetic | CSAPP (Bryant & O’Hallaron)     | Ch. 2.1: Information Storage
PAN Structure      | PCI DSS v4.0                    | Requirement 3
Data Structures    | Algorithms in C (Sedgewick)     | Ch. 12, 15

Online Resources

  • PCI DSS v4.0 documentation (free): pcisecuritystandards.org
  • ISO 7812 summary articles
  • Payment network test card numbers (Visa, Mastercard developer docs)

Test Card Numbers

Visa:       4111111111111111, 4012888888881881
Mastercard: 5555555555554444, 5105105105105100
Amex:       378282246310005,  371449635398431
Discover:   6011111111111117, 6011000990139424

NEVER use real card numbers for testing!


Self-Assessment Checklist

Before considering this project complete:

  • Luhn algorithm correctly validates all test cards
  • Handles 13-19 digit cards correctly
  • Strips non-digit characters (spaces, dashes)
  • BIN lookup returns correct network for major card types
  • CLI provides clear, formatted output
  • Edge cases handled (empty, too short, too long, invalid chars)
  • Can explain the algorithm to someone else
  • Understand why Luhn is validation, not security
  • Know the difference between BIN and full PAN
  • Batch mode processes files efficiently

What’s Next?

After completing this project, you understand the foundation of card data structure. Move to Project 2: Payment Tokenization Vault to learn how real payment systems protect this data using encryption and tokens.