Project 1: Card Number Validator & BIN Intelligence Service
Goal: Build practical expertise in payment security by implementing core controls (validation, tokenization, encryption), understanding PCI scope, and producing auditable, compliant artifacts.
Payment Data Boundaries
Payment systems live or die by data boundaries: where PANs can exist, how they move, and who can touch them. You need to draw a clear boundary between the Cardholder Data Environment (CDE) and everything else to reduce scope and risk.
Cryptographic Controls and Key Management
Payments rely on strong symmetric encryption, deterministic tokenization, and strict key lifecycle controls. Key custody, rotation, and HSM-backed operations are as important as the algorithms themselves.
Transaction Flow and Compliance Guarantees
Authorization, capture, and settlement have different security requirements. Compliance programs (PCI DSS, PCI PIN, 3DS) set minimum guarantees that the system design must reflect.
Concept Summary Table
| Concept Cluster | What You Need to Internalize |
|---|---|
| Data classification | PAN vs token, CDE boundaries, data minimization. |
| Cryptography | AES, KDFs, tokenization, key hierarchy. |
| Transaction security | Auth vs settlement, 3DS, P2PE. |
| Compliance | PCI DSS scope, audit controls, evidence. |
| Risk controls | Rate limits, fraud signals, logging. |
Deep Dive Reading by Concept
| Concept | Book & Chapter |
|---|---|
| PCI DSS | PCI DSS v4.0 — Requirements overview |
| Tokenization | PCI Tokenization Guidelines — Implementation sections |
| Crypto in payments | Cryptography Engineering — Ch. 6-9 |
| Payment flows | Payment Systems in the U.S. — transaction chapters |
| Fraud controls | The Anatomy of the Payment Card Industry — risk sections |
Project Overview
| Attribute | Value |
|---|---|
| Difficulty | Level 1: Beginner |
| Time Estimate | Weekend |
| Programming Language | C |
| Knowledge Area | Payments / Data Validation |
| Key Technologies | Luhn Algorithm, BIN Database |
| Coolness Level | Level 2: Practical but Forgettable |
| Business Potential | 3. The “Service & Support” Model |
Learning Objectives
By completing this project, you will:
- Understand PAN (Primary Account Number) anatomy: learn how card numbers encode network, issuer, and validation information
- Implement the Luhn algorithm: master this checksum algorithm used across financial systems
- Build BIN (Bank Identification Number) parsing: learn how payment routing works at the data level
- Handle variable-length input validation: process 13-19 digit card numbers correctly
- Design a clean C library API: create a reusable validation library with proper error handling
- Understand validation vs. authentication vs. authorization: a critical distinction in payment systems
The Core Question You’re Answering
“How do payment systems know if a card number is valid BEFORE contacting the bank?”
This isn’t just about data validation—it’s about understanding the economics and security design of payment networks. Every time you enter a card number online, the merchant validates it instantly (before network calls). Why?
- Cost reduction: Sending invalid card numbers to processors costs money (merchant pays per transaction attempt)
- Fraud prevention: Invalid PANs are often manual typing errors OR deliberate probing attacks
- User experience: Instant feedback prevents user frustration and cart abandonment
- System design: Understanding why card numbers encode their own validity check reveals how distributed systems handle untrusted input
The Luhn algorithm (invented in 1954 by IBM scientist Hans Peter Luhn) is a checksum, not cryptography. It catches typing errors, not fraud. Why is this distinction critical? Because many developers confuse validation with authentication—Luhn tells you “this could be a real card number structure,” not “this card exists and has funds.”
Deep Theoretical Foundation
1. Primary Account Number (PAN) Structure
A PAN is NOT a random number—it’s a structured identifier with semantic meaning.
Format: [IIN/BIN (6-8 digits)][Account Number (variable)][Check Digit (1)]
Total length: 13-19 digits (varies by network)
┌──────────────────────────────────────────────────────────────────┐
│ PAN STRUCTURE (16 digits) │
├──────────────────────────────────────────────────────────────────┤
│ │
│ 4532 0151 1283 0366 │
│ │ │ │ │ │
│ │ │ │ └── Check Digit (position 16) │
│ │ │ │ Calculated via Luhn algorithm │
│ │ │ │ │
│ │ └─────────┴───── Account Number (positions 7-15) │
│ │ Unique identifier within issuer │
│ │ │
│ └────────────────────── BIN/IIN (positions 1-6) │
│ Identifies network + issuer │
│ │
└──────────────────────────────────────────────────────────────────┘
The first digit identifies the Major Industry Identifier (MII):
- 1-2: Airlines
- 3: Travel and entertainment (Amex, Diners)
- 4: Banking (Visa)
- 5: Banking (Mastercard)
- 6: Merchandising and banking (Discover)
Why this matters: The structure IS the routing information. When you swipe a card, the terminal reads the BIN and knows “send this to Visa network” before any authentication happens.
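The layout above can be sketched in C as a small helper that slices a digits-only PAN into its three fields. The names `PanParts` and `split_pan` are hypothetical, and the fixed 6-digit BIN length is an assumption taken from the diagram (issuers may use up to 8 digits).

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for the three PAN fields (not a standard API). */
typedef struct {
    char bin[9];       /* 6-digit BIN in this sketch; sized for 8 digits + NUL */
    char account[13];  /* at most 19 - 6 - 1 = 12 digits + NUL */
    char check_digit;  /* last digit, kept as a character */
    int  mii;          /* Major Industry Identifier: the first digit */
} PanParts;

/* Splits a digits-only PAN of 13-19 characters. Returns false on bad input. */
bool split_pan(const char *pan, PanParts *out)
{
    if (pan == NULL || out == NULL) return false;

    size_t len = strlen(pan);
    if (len < 13 || len > 19) return false;
    for (size_t i = 0; i < len; i++) {
        if (pan[i] < '0' || pan[i] > '9') return false;
    }

    const size_t bin_len = 6;             /* assumption: 6-digit BIN */
    size_t acct_len = len - bin_len - 1;  /* digits between BIN and check digit */

    memcpy(out->bin, pan, bin_len);
    out->bin[bin_len] = '\0';
    memcpy(out->account, pan + bin_len, acct_len);
    out->account[acct_len] = '\0';
    out->check_digit = pan[len - 1];
    out->mii = pan[0] - '0';
    return true;
}
```

For `4532015112830366` this yields the BIN `453201`, the account number `511283036`, the check digit `6`, and an MII of 4.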
2. The Luhn Algorithm (Modulus 10)
The Luhn algorithm is a checksum—not encryption. It’s designed to catch common transcription errors.
What it catches:
- 100% of single-digit errors
- About 98% of adjacent transpositions (typing “12” instead of “21”); the only swap it misses is 09 ↔ 90
What it does NOT do:
- Prevent deliberate tampering
- Prove the card exists
- Authenticate the cardholder
Algorithm Steps:
- Moving left from the rightmost digit (the check digit, which is not doubled), double every second digit
- If doubling results in a two-digit number, sum those digits (e.g., 16 → 1+6=7)
- Sum all the digits
- If total modulo 10 is 0, the number is valid
Visual Trace:
Card number:  4   5   3   2   0   1   5   1   1   2   8   3   0   3   6   6
Position:    16  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   (counted from the right)
Step 1: Mark the digits at even positions from the right (2, 4, 6, 8, 10, 12, 14, 16):
              4   5   3   2   0   1   5   1   1   2   8   3   0   3   6   6
              *       *       *       *       *       *       *       *
Step 2: Double the marked digits:
              8   5   6   2   0   1  10   1   2   2  16   3   0   3  12   6
Step 3: Replace doubled values above 9 with the sum of their digits:
10 → 1+0 = 1
16 → 1+6 = 7
12 → 1+2 = 3
Result:       8   5   6   2   0   1   1   1   2   2   7   3   0   3   3   6
Step 4: Sum all:
8+5+6+2+0+1+1+1+2+2+7+3+0+3+3+6 = 50
Step 5: Check 50 % 10 = 0 → VALID
Position numbering matters. If you also double the 1 at position 11 and the 3 at position 3 by mistake, the sum becomes 54 and a valid card is wrongly rejected.
The Key Insight: Position numbering starts from the CHECK DIGIT (rightmost), and we double the digits at EVEN positions. A correct implementation processes right-to-left.
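The right-to-left rule can be sketched as follows. This is one possible implementation of the `luhn_check` function named in the project API, not the only valid one. It assumes the input has already been stripped to digits and rejects anything else.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true when a digits-only string passes the Luhn (mod 10) check. */
bool luhn_check(const char *card_number)
{
    if (card_number == NULL) return false;

    size_t len = strlen(card_number);
    if (len == 0) return false;

    int sum = 0;
    bool double_it = false;             /* the check digit itself is not doubled */

    for (size_t i = len; i-- > 0; ) {   /* walk from the rightmost digit leftward */
        char c = card_number[i];
        if (c < '0' || c > '9') return false;

        int digit = c - '0';
        if (double_it) {
            digit *= 2;
            if (digit > 9) digit -= 9;  /* same as summing the two digits */
        }
        sum += digit;
        double_it = !double_it;
    }
    return sum % 10 == 0;
}
```

Because the loop starts at the end of the string, the same code handles 13-digit through 19-digit numbers without any position arithmetic.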
3. Bank Identification Number (BIN) / Issuer Identification Number (IIN)
The BIN is the first 6-8 digits of a PAN. It uniquely identifies the issuing bank and card network.
Standard BIN Ranges:
| Prefix | Network | Notes |
|---|---|---|
| 4 | Visa | All cards starting with 4 |
| 51-55 | Mastercard | Traditional range |
| 2221-2720 | Mastercard | New range (2017+) |
| 34, 37 | American Express | 15-digit cards |
| 6011, 644-649, 65 | Discover | |
| 300-305, 36, 38 | Diners Club | |
| 3528-3589 | JCB | |
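The prefix table above could be turned into code along these lines. This is a minimal sketch of the `identify_network` function from the project API, and `prefix_value` is a hypothetical helper. It covers only the ranges listed in the table, which is a simplification of real network assignments.

```c
#include <stddef.h>
#include <string.h>

/* Reads the first n digits of pan as an integer; returns -1 if unavailable. */
static long prefix_value(const char *pan, size_t n)
{
    if (strlen(pan) < n) return -1;
    long value = 0;
    for (size_t i = 0; i < n; i++) {
        if (pan[i] < '0' || pan[i] > '9') return -1;
        value = value * 10 + (pan[i] - '0');
    }
    return value;
}

/* Maps a digits-only PAN to a network name using the table's prefix ranges. */
const char *identify_network(const char *pan)
{
    if (pan == NULL) return "Unknown";

    long p1 = prefix_value(pan, 1);
    long p2 = prefix_value(pan, 2);
    long p3 = prefix_value(pan, 3);
    long p4 = prefix_value(pan, 4);

    /* Longer, more specific prefixes are tested before shorter ones. */
    if (p4 >= 3528 && p4 <= 3589) return "JCB";
    if (p4 >= 2221 && p4 <= 2720) return "Mastercard";
    if (p4 == 6011)               return "Discover";
    if (p3 >= 644 && p3 <= 649)   return "Discover";
    if (p3 >= 300 && p3 <= 305)   return "Diners Club";
    if (p2 == 34 || p2 == 37)     return "American Express";
    if (p2 == 36 || p2 == 38)     return "Diners Club";
    if (p2 >= 51 && p2 <= 55)     return "Mastercard";
    if (p2 == 65)                 return "Discover";
    if (p1 == 4)                  return "Visa";
    return "Unknown";
}
```

Comparing numeric prefix values keeps range rows such as 2221-2720 to a single test instead of hundreds of string entries.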
Why BINs enable distributed routing:
┌─────────────────────────────────────────────────────────────────────┐
│ PAYMENT ROUTING VIA BIN │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Card: 4532 0151 1283 0366 │
│ │ │
│ └──→ BIN: 453201 │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────┐ │
│ │ BIN LOOKUP TABLE │ │
│ ├─────────────────────────────────────────────┤ │
│ │ 453201 → Network: Visa │ │
│ │ Issuer: Bank of America │ │
│ │ Type: Credit │ │
│ │ Country: USA │ │
│ └─────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Terminal Decision: Route to Visa network → Bank of America │
│ │
│ Without BINs: Would need central database lookup for every card! │
│ │
└─────────────────────────────────────────────────────────────────────┘
4. Validation vs. Authentication vs. Authorization
Critical distinction every payment developer must understand:
| Concept | What It Answers | Where It Happens | Security Level |
|---|---|---|---|
| Validation | “Is this structurally correct?” | Client-side (browser) | None (checksum) |
| Authentication | “Does this card exist?” | Card network/issuer | Medium |
| Authorization | “Can this transaction proceed?” | Issuing bank | High (funds check) |
Luhn validation is UX, not security. Confirming that a card is real and usable requires:
- Contacting the card network
- Routing to the issuing bank
- Receiving an authorization response
5. ISO/IEC 7812 Standard
The international standard defining PAN structure:
- Specifies MII (Major Industry Identifier)
- Defines IIN (Issuer Identification Number)
- Specifies check digit algorithm (Luhn)
- Defines variable-length PANs (13-19 digits)
Why Amex is 15 digits: Historical design decisions. Different networks made different choices about account number length vs. issuer identification space.
Project Specification
What You’ll Build
A CLI tool and library that validates card numbers using the Luhn algorithm, identifies card networks from BIN ranges, and extracts metadata about the issuing bank.
Expected Output
$ ./cardvalidate 4532015112830366
Card Analysis Report
====================
Card Number: 4532015112830366
Length: 16 digits
Valid Luhn: ✓ VALID
Card Network: Visa
Card Type: Credit
Issuer: Bank of America
Country: United States
BIN: 453201
Structure Breakdown:
┌──────┬────────────────┬─┐
│ BIN  │ Account Number │C│
│453201│   511283036    │6│
└──────┴────────────────┴─┘
  IIN      Unique ID    Check
$ ./cardvalidate 378282246310005
Card Analysis Report
====================
Card Number: 378282246310005
Length: 15 digits
Valid Luhn: ✓ VALID
Card Network: American Express
Card Type: Credit
Issuer: American Express
Country: United States
BIN: 378282
$ ./cardvalidate 1234567890123456
Card Analysis Report
====================
Card Number: 1234567890123456
Length: 16 digits
Valid Luhn: ✗ INVALID (Expected check digit: 2, got: 6)
Card Network: Unknown
Project Structure
cardvalidate/
├── src/
│ ├── main.c # CLI entry point
│ ├── cardvalidate.c # Core validation library
│ ├── cardvalidate.h # Public API
│ ├── bin_database.c # BIN range lookup
│ ├── bin_database.h # BIN database interface
│ └── utils.c # String utilities
├── data/
│ └── bin_ranges.csv # BIN to issuer mappings
├── tests/
│ ├── test_luhn.c # Luhn algorithm tests
│ ├── test_bin_lookup.c # BIN lookup tests
│ └── known_cards.txt # Test vectors
├── Makefile
└── README.md
Core API Design
// cardvalidate.h
#ifndef CARDVALIDATE_H
#define CARDVALIDATE_H
#include <stdbool.h>
#include <stddef.h>  // size_t
typedef struct {
    bool valid;                // Luhn check passed
    char network[32];          // "Visa", "Mastercard", etc.
    char issuer[64];           // "Bank of America", etc.
    char card_type[16];        // "credit" or "debit"
    char country[64];          // "United States", etc.
    char bin[9];               // First 6-8 digits plus NUL terminator
    int expected_check_digit;  // What check digit should be
    int actual_check_digit;    // What check digit is
    int length;                // Number of digits
} CardValidationResult;
// Primary API
CardValidationResult validate_card(const char* card_number);
// Individual functions
bool luhn_check(const char* card_number);
int calculate_check_digit(const char* card_number);
const char* identify_network(const char* card_number);
// Utility
void strip_non_digits(const char* input, char* output, size_t output_size);
void mask_pan(const char* pan, char* masked, size_t masked_size);
#endif
Solution Architecture
System Design
┌─────────────────────────────────────────────────────────────────────────┐
│ CARD VALIDATOR ARCHITECTURE │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ INPUT │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ INPUT SANITIZER │ │
│ │ • Strip spaces, dashes │ │
│ │ • Validate characters (digits only) │ │
│ │ • Check length (13-19) │ │
│ └─────────────────┬───────────────────────┘ │
│ │ │
│ ┌─────────┴──────────┐ │
│ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────────┐ │
│ │ LUHN VALIDATOR │ │ BIN LOOKUP │ │
│ │ │ │ │ │
│ │ • Right-to-left│ │ • Extract BIN │ │
│ │ processing │ │ • Search database │ │
│ │ • Double/sum │ │ • Return metadata │ │
│ │ • Mod 10 check │ │ │ │
│ └────────┬────────┘ └─────────┬───────────┘ │
│ │ │ │
│ └──────────┬──────────┘ │
│ ▼ │
│ ┌─────────────────────────────────────────┐ │
│ │ RESULT AGGREGATOR │ │
│ │ • Combine validation result │ │
│ │ • Format output │ │
│ │ • Generate breakdown │ │
│ └─────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ OUTPUT │
│ │
└─────────────────────────────────────────────────────────────────────────┘
BIN Database Design
┌─────────────────────────────────────────────────────────────────────────┐
│ BIN DATABASE OPTIONS │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ OPTION A: Array with Linear Search │
│ ┌─────────────────────────────────────────────┐ │
│ │ BINInfo bins[] = { │ │
│ │ {"4", "Visa", ...}, │ Time: O(n) │
│ │ {"51", "Mastercard", ...}, │ Space: O(n) │
│ │ {"453201", "Bank of America", ...}, │ Simple but slow │
│ │ ... │ │
│ │ }; │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ OPTION B: Sorted Array with Binary Search │
│ ┌─────────────────────────────────────────────┐ │
│ │ Sort BINs by prefix │ Time: O(log n) │
│ │ Binary search for match │ Space: O(n) │
│ │ Search longest prefix first │ Better for larger DB │
│ └─────────────────────────────────────────────┘ │
│ │
│ OPTION C: Trie (Prefix Tree) │
│ ┌─────────────────────────────────────────────┐ │
│ │ Root │ Time: O(k) k=BIN len │
│ │ / | \ │ Space: O(n*k) │
│ │ 4 5 6 │ Best for prefix match │
│ │ / | \ │ │
│ │ ... 1-5 0 │ │
│ └─────────────────────────────────────────────┘ │
│ │
│ RECOMMENDATION: Start with Option A, optimize to B if needed │
│ │
└─────────────────────────────────────────────────────────────────────────┘
Key Design Decisions
- String vs. integer for PAN storage: Use strings. The largest 19-digit values exceed the range of a signed long long (about 9.22 × 10^18), and although unsigned long long can hold them, a numeric type makes per-digit access awkward.
- Right-to-left vs. left-to-right Luhn: Process right-to-left (as defined) using index arithmetic, or reverse the string.
- Non-digit handling: Strip separators before validation rather than rejecting the input. Users often type card numbers with spaces or dashes.
- BIN search order: Search longest prefixes first. 453201 (specific bank) should match before 4 (generic Visa).
- Memory management: Use stack allocation for fixed-size buffers. PANs are at most 19 digits.
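The longest-prefix rule can be illustrated with the linear-search layout from Option A. This is a minimal sketch: the `BINInfo` field names and the `bin_lookup` function are assumptions, and the rows are sample data taken from this guide rather than a real BIN database.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout; field names are assumptions for this sketch. */
typedef struct {
    const char *prefix;   /* leading digits to match, e.g. "4" or "453201" */
    const char *network;
    const char *issuer;
} BINInfo;

/* Sample rows only; the 453201 issuer is this guide's illustrative value. */
static const BINInfo bins[] = {
    {"4",      "Visa",             "Various"},
    {"34",     "American Express", "Amex"},
    {"37",     "American Express", "Amex"},
    {"6011",   "Discover",         "Discover"},
    {"453201", "Visa",             "Bank of America"},
};

/* Linear scan that keeps the longest prefix matching the PAN. */
const BINInfo *bin_lookup(const char *pan)
{
    if (pan == NULL) return NULL;

    const BINInfo *best = NULL;
    size_t best_len = 0;

    for (size_t i = 0; i < sizeof bins / sizeof bins[0]; i++) {
        size_t plen = strlen(bins[i].prefix);
        if (plen > best_len && strncmp(pan, bins[i].prefix, plen) == 0) {
            best = &bins[i];
            best_len = plen;
        }
    }
    return best;   /* NULL when no prefix matches */
}
```

Tracking the best match length makes the result independent of row order, so the table does not have to be sorted by prefix length.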
Implementation Guide
Phase 1: Basic Luhn Validation
Goal: Implement and test the Luhn algorithm.
Steps:
- Create the luhn_check() function
- Handle digit extraction from string
- Implement the doubling logic
- Test with known valid/invalid cards
Test Cards (from payment network documentation):
// Valid test cards
"4111111111111111" // Visa
"5555555555554444" // Mastercard
"378282246310005" // Amex
"6011111111111117" // Discover
// Invalid (fail Luhn)
"1234567890123456"
"4111111111111112" // Wrong check digit
Key Implementation Detail: The digit -= 9 trick for doubled values > 9:
- 10 → 1+0=1, and 10-9=1
- 12 → 1+2=3, and 12-9=3
- 18 → 1+8=9, and 18-9=9
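The shortcut works because a doubled digit above 9 has the form 10 + r, whose digit sum is 1 + r, which equals (10 + r) - 9. The two hypothetical helpers below compute the value both ways so the equivalence can be checked for every digit.

```c
/* Digit sum of 2*d for a single decimal digit d (0-9), computed the long way. */
int doubled_digit_sum(int d)
{
    int doubled = 2 * d;
    return doubled / 10 + doubled % 10;
}

/* Same result using the subtraction shortcut. */
int doubled_minus_nine(int d)
{
    int doubled = 2 * d;
    return doubled > 9 ? doubled - 9 : doubled;
}
```

Both functions agree for all ten digits, so an implementation can use whichever form reads more clearly.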
Phase 2: BIN Database
Goal: Build BIN lookup with network identification.
Steps:
- Create BIN data structure
- Populate with major network prefixes
- Implement prefix-matching search
- Return complete card info
Minimum BIN data:
{"4", "Visa", "Various", "credit/debit"},
{"51-55", "Mastercard", "Various", "credit"},
{"34", "American Express", "Amex", "credit"},
{"37", "American Express", "Amex", "credit"},
{"6011", "Discover", "Discover", "credit"},
Phase 3: CLI Interface
Goal: Build user-friendly command-line interface.
Features:
- Single card validation: ./cardvalidate 4111111111111111
- Batch mode: ./cardvalidate --batch file.txt
- Explain mode: ./cardvalidate --explain 4111111111111111
- Help: ./cardvalidate --help
Use getopt_long() for argument parsing.
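A possible shape for the option parsing is sketched below. The `CliOptions` struct, the `parse_cli` function, and the short option letters are assumptions for illustration. Only the long flags `--batch`, `--explain`, and `--help` come from the feature list.

```c
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parsed-options struct; names are assumptions for this sketch. */
typedef struct {
    const char *batch_file;   /* set by --batch FILE, else NULL */
    bool explain;             /* set by --explain */
    bool help;                /* set by --help */
    const char *card_number;  /* first positional argument, else NULL */
} CliOptions;

/* Returns true on success, false on an unknown option or missing argument. */
bool parse_cli(int argc, char *argv[], CliOptions *opts)
{
    static const struct option long_opts[] = {
        {"batch",   required_argument, NULL, 'b'},
        {"explain", no_argument,       NULL, 'e'},
        {"help",    no_argument,       NULL, 'h'},
        {NULL, 0, NULL, 0}
    };

    *opts = (CliOptions){0};
    optind = 1;   /* restart scanning so the function can be called again */

    int c;
    while ((c = getopt_long(argc, argv, "b:eh", long_opts, NULL)) != -1) {
        switch (c) {
        case 'b': opts->batch_file = optarg; break;
        case 'e': opts->explain = true;      break;
        case 'h': opts->help = true;         break;
        default:  return false;   /* getopt_long has already reported the error */
        }
    }
    if (optind < argc) opts->card_number = argv[optind];
    return true;
}
```

Keeping the parsing in its own function leaves `main` free to dispatch on the resulting struct and makes the flag handling testable without spawning a process.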
Phase 4: Polish and Edge Cases
Edge cases to handle:
- NULL input
- Empty string
- String with spaces/dashes: 4532 0151 1283 0366
- Too short (<13 digits)
- Too long (>19 digits)
- Contains letters
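One way to cover these cases in a single pass is sketched below. The `PanStatus` codes and the `sanitize_pan` name are hypothetical. The sketch assumes a policy of skipping spaces and dashes while rejecting any other non-digit character. This is stricter than a helper that strips every non-digit.

```c
#include <stddef.h>

typedef enum {          /* hypothetical status codes for this sketch */
    PAN_OK = 0,
    PAN_ERR_NULL,
    PAN_ERR_BAD_CHAR,
    PAN_ERR_TOO_SHORT,
    PAN_ERR_TOO_LONG
} PanStatus;

/* Copies the digits of input into out (at least 20 bytes), skipping spaces
   and dashes. On any error, out is left as an empty string. */
PanStatus sanitize_pan(const char *input, char *out, size_t out_size)
{
    if (input == NULL || out == NULL || out_size < 20) return PAN_ERR_NULL;

    size_t n = 0;
    for (const char *p = input; *p != '\0'; p++) {
        if (*p == ' ' || *p == '-') continue;   /* common separators */
        if (*p < '0' || *p > '9') { out[0] = '\0'; return PAN_ERR_BAD_CHAR; }
        if (n == 19)              { out[0] = '\0'; return PAN_ERR_TOO_LONG; }
        out[n++] = *p;
    }
    out[n] = '\0';
    if (n < 13) { out[0] = '\0'; return PAN_ERR_TOO_SHORT; }
    return PAN_OK;
}
```

Returning a distinct status per failure lets the CLI print a specific message instead of a generic "invalid input".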
PCI consideration: Even for validation, consider:
- Don’t log full PANs
- Clear memory after use
- Support masked output
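The masking and clearing points could look like the sketch below. `mask_pan` follows the signature in the project API. `secure_clear` is a hypothetical helper. Showing the first six and last four digits is an assumption based on common practice, so check the current PCI DSS display rules before relying on it.

```c
#include <stddef.h>
#include <string.h>

/* Writes pan into masked with the middle digits replaced by '*'.
   Keeps the first 6 and last 4 digits; leaves masked empty on bad input
   or when the buffer is too small. */
void mask_pan(const char *pan, char *masked, size_t masked_size)
{
    if (masked == NULL || masked_size == 0) return;
    masked[0] = '\0';
    if (pan == NULL) return;

    size_t len = strlen(pan);
    if (len < 13 || len + 1 > masked_size) return;

    for (size_t i = 0; i < len; i++) {
        masked[i] = (i < 6 || i >= len - 4) ? pan[i] : '*';
    }
    masked[len] = '\0';
}

/* Overwrites a buffer through a volatile pointer so the compiler is less
   likely to optimize the wipe away. */
void secure_clear(void *buf, size_t size)
{
    volatile unsigned char *p = buf;
    while (size-- > 0) *p++ = 0;
}
```

With this sketch, `4532015112830366` is masked as `453201******0366`. Where the platform provides a dedicated wipe function such as `explicit_bzero`, that is usually a better choice than a hand-written loop.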
Testing Strategy
Unit Tests
// test_luhn.c
#include <assert.h>
#include "cardvalidate.h"

void test_luhn_valid_visa(void) {
    assert(luhn_check("4111111111111111") == true);
}

void test_luhn_valid_amex(void) {
    assert(luhn_check("378282246310005") == true);
}

void test_luhn_invalid_single_digit_error(void) {
    assert(luhn_check("4111111111111112") == false);
}

void test_luhn_empty_string(void) {
    assert(luhn_check("") == false);
}

void test_luhn_with_spaces(void) {
    // After stripping: should pass
    char clean[20];
    strip_non_digits("4111 1111 1111 1111", clean, sizeof(clean));
    assert(luhn_check(clean) == true);
}
Integration Tests
# test_cli.sh
# Test valid Visa (-w stops "VALID" from matching inside "INVALID")
./cardvalidate 4111111111111111 | grep -qw "VALID"
# Test invalid
./cardvalidate 1234567890123456 | grep -qw "INVALID"
# Test batch mode (expects exactly two lines reporting VALID)
printf '4111111111111111\n5555555555554444\n' > /tmp/test.txt
[ "$(./cardvalidate --batch /tmp/test.txt | grep -cw "VALID")" -eq 2 ]
Performance Tests
# Generate 1M test cards and validate
time ./cardvalidate --batch 1million_cards.txt
# Target: < 1 second for 1M cards
Common Pitfalls & Debugging
Pitfall 1: Wrong Position Numbering
Symptom: Valid cards fail, invalid cards pass.
Cause: Processing left-to-right instead of right-to-left, or off-by-one in position calculation.
Debug: Add verbose output showing each step:
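// Assumes card_number is the digits-only input and len is its length.
for (int i = len - 1; i >= 0; i--) {
    int digit = card_number[i] - '0';
    int from_right = len - i;                   // 1 = check digit
    int should_double = (from_right % 2 == 0);  // even positions from the right
    printf("Index %d (from right %d): digit=%d, double=%s\n",
           i, from_right, digit, should_double ? "yes" : "no");
}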
Pitfall 2: Integer Overflow
Symptom: Validation fails for long cards.
Cause: Trying to store PAN as integer.
Fix: Always use strings for PAN storage.
Pitfall 3: Forgetting to Handle Non-Digits
Symptom: Crashes or wrong results with formatted input.
Cause: Not stripping spaces/dashes before validation.
Fix: Always sanitize input first.
Pitfall 4: BIN Match Order
Symptom: All Visa cards show “Various” instead of specific bank.
Cause: Generic prefix 4 matches before specific 453201.
Fix: Search longest prefixes first, or sort BIN database by prefix length descending.
Extensions & Challenges
Extension 1: Check Digit Calculator
Generate valid Luhn check digits:
$ ./cardvalidate --generate 411111111111111
Check digit for 411111111111111 is: 1
Complete valid number: 4111111111111111
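Generating a check digit reuses the Luhn sum with one change: the payload has no check digit yet, so its rightmost digit is the first one doubled. The function name `compute_check_digit` is hypothetical, and the sketch assumes a digits-only payload.

```c
#include <stddef.h>
#include <string.h>

/* Returns the Luhn check digit (0-9) for a digits-only payload that does not
   yet include a check digit, or -1 on invalid input. */
int compute_check_digit(const char *payload)
{
    if (payload == NULL) return -1;
    size_t len = strlen(payload);
    if (len == 0) return -1;

    int sum = 0;
    int double_it = 1;   /* rightmost payload digit sits next to the check digit */

    for (size_t i = len; i-- > 0; ) {
        if (payload[i] < '0' || payload[i] > '9') return -1;
        int digit = payload[i] - '0';
        if (double_it) {
            digit *= 2;
            if (digit > 9) digit -= 9;
        }
        sum += digit;
        double_it = !double_it;
    }
    return (10 - sum % 10) % 10;   /* outer % 10 maps a result of 10 back to 0 */
}
```

For the payload `411111111111111` the weighted sum is 29, so the check digit is 1, matching the sample output above.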
Extension 2: Card Generator
Generate valid test card numbers for a given network:
$ ./cardvalidate --generate-card visa
Generated valid Visa test card: 4916338506082832
Extension 3: Real BIN Database
Integrate with a real BIN database API (like binlist.net) for accurate issuer information.
Extension 4: WebAssembly Build
Compile to WASM for browser-based validation in checkout forms.
Extension 5: Fuzzing
Use AFL or libFuzzer to find edge cases in your input handling.
Interview Questions This Prepares You For
Conceptual
- “What is the Luhn algorithm, and why do payment cards use it?”
  - Checksum from 1954, catches transcription errors, not a security mechanism
- “What’s the difference between a checksum and a cryptographic hash?”
  - Checksums: error detection, easy to forge, no security
  - Hashes: one-way, collision-resistant, for integrity/authentication
- “What is a BIN, and how does it enable payment routing?”
  - First 6-8 digits, identifies network and issuer, enables distributed routing
Implementation
- “How would you implement Luhn without string conversion?”
  - Extract digits with modulo/division, process right-to-left
- “What data structure for BIN lookup?”
  - Hash table for O(1), binary search for O(log n), trie for prefix matching
- “How do you handle 19-digit cards in C?”
  - String representation, not numeric types
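The modulo/division answer can be sketched as below. The function name `luhn_check_u64` is hypothetical. The sketch assumes the number fits in a `uint64_t`, which covers every 19-digit value. It cannot see leading zeros or verify the digit count, which is one reason the project itself works on strings.

```c
#include <stdbool.h>
#include <stdint.h>

/* Luhn check on a number held in a 64-bit unsigned integer.
   Peeling digits with % 10 visits them right-to-left. */
bool luhn_check_u64(uint64_t n)
{
    int sum = 0;
    bool double_it = false;   /* the rightmost (check) digit is not doubled */

    while (n > 0) {
        int digit = (int)(n % 10);
        n /= 10;
        if (double_it) {
            digit *= 2;
            if (digit > 9) digit -= 9;
        }
        sum += digit;
        double_it = !double_it;
    }
    return sum % 10 == 0;
}
```

Note that an input of 0 returns true here because the sum is 0, so a real caller would still need a separate length check.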
Security
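- “Can I store a full PAN in my database?”
  - Yes, but it puts that database in PCI DSS scope: stored PANs must be rendered unreadable (strong encryption, truncation, tokenization, or hashing) and storage needs a business justification. Sensitive authentication data such as the CVV must never be stored after authorization.
- “What’s the attack surface of a card validator?”
  - Buffer overflow, logging sensitive data, timing attacks (unlikely for Luhn)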
Resources
Books
| Topic | Book | Chapter |
|---|---|---|
| Luhn & Checksums | Serious Cryptography (Aumasson) | Ch. 7: Keyed Hashing (MACs) |
| Input Validation | Fluent C (Preschern) | Ch. 5: Input Validation |
| String Processing | C Primer Plus (Prata) | Ch. 11: Strings |
| Modular Arithmetic | CSAPP (Bryant & O’Hallaron) | Ch. 2.1: Information Storage |
| PAN Structure | PCI DSS v4.0 | Requirement 3 |
| Data Structures | Algorithms in C (Sedgewick) | Ch. 12, 15 |
Online Resources
- PCI DSS v4.0 documentation (free): pcisecuritystandards.org
- ISO 7812 summary articles
- Payment network test card numbers (Visa, Mastercard developer docs)
Test Card Numbers
Visa: 4111111111111111, 4012888888881881
Mastercard: 5555555555554444, 5105105105105100
Amex: 378282246310005, 371449635398431
Discover: 6011111111111117, 6011000990139424
NEVER use real card numbers for testing!
Self-Assessment Checklist
Before considering this project complete:
- Luhn algorithm correctly validates all test cards
- Handles 13-19 digit cards correctly
- Strips non-digit characters (spaces, dashes)
- BIN lookup returns correct network for major card types
- CLI provides clear, formatted output
- Edge cases handled (empty, too short, too long, invalid chars)
- Can explain the algorithm to someone else
- Understand why Luhn is validation, not security
- Know the difference between BIN and full PAN
- Batch mode processes files efficiently
What’s Next?
After completing this project, you understand the foundation of card data structure. Move to Project 2: Payment Tokenization Vault to learn how real payment systems protect this data using encryption and tokens.