Project 1: Card Number Validator & BIN Intelligence Service
Project Overview
| Attribute | Value |
|---|---|
| Difficulty | Level 1: Beginner |
| Time Estimate | Weekend |
| Programming Language | C |
| Knowledge Area | Payments / Data Validation |
| Key Technologies | Luhn Algorithm, BIN Database |
| Coolness Level | Level 2: Practical but Forgettable |
| Business Potential | 3. The "Service & Support" Model |
Learning Objectives
By completing this project, you will:
- Understand PAN (Primary Account Number) anatomy - Learn how card numbers encode network, issuer, and validation information
- Implement the Luhn algorithm - Master this checksum algorithm used across financial systems
- Build BIN (Bank Identification Number) parsing - Learn how payment routing works at the data level
- Handle variable-length input validation - Process 13-19 digit card numbers correctly
- Design a clean C library API - Create a reusable validation library with proper error handling
- Understand validation vs. authentication vs. authorization - Critical distinction in payment systems
The Core Question You're Answering
"How do payment systems know if a card number is valid BEFORE contacting the bank?"
This isn't just about data validation; it's about understanding the economics and security design of payment networks. Every time you enter a card number online, the merchant validates it instantly (before network calls). Why?
- Cost reduction: Sending invalid card numbers to processors costs money (merchant pays per transaction attempt)
- Fraud prevention: Invalid PANs are often manual typing errors OR deliberate probing attacks
- User experience: Instant feedback prevents user frustration and cart abandonment
- System design: Understanding why card numbers encode their own validity check reveals how distributed systems handle untrusted input
The Luhn algorithm (invented in 1954 by IBM scientist Hans Peter Luhn) is a checksum, not cryptography. It catches typing errors, not fraud. Why is this distinction critical? Because many developers confuse validation with authentication: Luhn tells you "this could be a real card number structure," not "this card exists and has funds."
Deep Theoretical Foundation
1. Primary Account Number (PAN) Structure
A PAN is NOT a random number; it's a structured identifier with semantic meaning.
Format: [IIN/BIN (6-8 digits)][Account Number (variable)][Check Digit (1)]
Total length: 13-19 digits (varies by network)
┌────────────────────────────────────────────────────────────────┐
│                   PAN STRUCTURE (16 digits)                    │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│    4532 0151 1283 0366                                         │
│    └──┬──┘└────┬────┘│                                         │
│       │        │     └── Check Digit (position 16)             │
│       │        │         Calculated via Luhn algorithm         │
│       │        │                                               │
│       │        └──────── Account Number (positions 7-15)       │
│       │                  Unique identifier within issuer       │
│       │                                                        │
│       └───────────────── BIN/IIN (positions 1-6)               │
│                          Identifies network + issuer           │
│                                                                │
└────────────────────────────────────────────────────────────────┘
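To make the layout concrete, here is a minimal sketch that slices a digits-only PAN into the three fields shown in the diagram. The `PanParts` type and `pan_split()` function are hypothetical names used only for illustration, and the sketch assumes a 6-digit BIN (real BINs may be up to 8 digits).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: the three fields of a PAN. */
typedef struct {
    char bin[7];       /* first 6 digits + NUL (6-digit BIN assumed) */
    char account[13];  /* up to 12 account digits + NUL */
    char check;        /* final digit, as a character */
} PanParts;

/* Splits a digits-only PAN of 13-19 characters.
 * Returns 0 on success, -1 if the input is NULL or its length is out of range. */
int pan_split(const char *pan, PanParts *out)
{
    if (pan == NULL || out == NULL)
        return -1;

    size_t len = strlen(pan);
    if (len < 13 || len > 19)
        return -1;

    memcpy(out->bin, pan, 6);
    out->bin[6] = '\0';

    size_t account_len = len - 6 - 1;   /* everything between BIN and check digit */
    memcpy(out->account, pan + 6, account_len);
    out->account[account_len] = '\0';

    out->check = pan[len - 1];
    return 0;
}
```

For `4532015112830366` this yields BIN `453201`, account number `511283036`, and check digit `6`.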
The first digit identifies the Major Industry Identifier (MII):
- 1-2: Airlines
- 3: Travel and entertainment (Amex, Diners)
- 4: Banking (Visa)
- 5: Banking (Mastercard)
- 6: Merchandising and banking (Discover)
Why this matters: The structure IS the routing information. When you swipe a card, the terminal reads the BIN and knows "send this to Visa network" before any authentication happens.
2. The Luhn Algorithm (Modulus 10)
The Luhn algorithm is a checksum, not encryption. It's designed to catch common transcription errors.
What it catches:
- 100% of single-digit errors
- About 98% of adjacent transpositions (typing "12" instead of "21"); the only swap it misses is 09 ↔ 90
What it does NOT do:
- Prevent deliberate tampering
- Prove the card exists
- Authenticate the cardholder
Algorithm Steps:
- Starting from the rightmost digit (the check digit) and moving left, double every second digit; the check digit itself is not doubled
- If doubling results in a two-digit number, sum those digits (e.g., 16 → 1+6=7)
- Sum all the digits
- If total modulo 10 is 0, the number is valid
Visual Trace:
Card number:  4  5  3  2  0  1  5  1  1  2  8  3  0  3  6  6
Position:    16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1   (right to left)
Step 1: Mark positions to double (even positions from right):
Position 2, 4, 6, 8, 10, 12, 14, 16
              4  5  3  2  0  1  5  1  1  2  8  3  0  3  6  6
              *     *     *     *     *     *     *     *
Step 2: Double marked digits:
              8  5  6  2  0  1 10  1  2  2 16  3  0  3 12  6
Step 3: Sum digits of doubled values > 9:
10 → 1+0 = 1
16 → 1+6 = 7
12 → 1+2 = 3
Result:       8  5  6  2  0  1  1  1  2  2  7  3  0  3  3  6
Step 4: Sum all:
8+5+6+2+0+1+1+1+2+2+7+3+0+3+3+6 = 50
Step 5: Check 50 % 10 = 0 → VALID
The Key Insight: Position numbering starts from the CHECK DIGIT (rightmost), and we double digits at EVEN positions. For a 16-digit number those are the 1st, 3rd, 5th, ... digits from the left, but for a 15-digit Amex number they are the 2nd, 4th, 6th, ... A correct implementation therefore processes right-to-left instead of assuming a fixed starting parity.
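The steps above can be sketched in C as follows. This is one possible shape for the `luhn_check()` function declared later in the API, not the only valid one; it assumes the caller has already removed spaces and dashes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if a digits-only string passes the Luhn check. */
bool luhn_check(const char *card_number)
{
    if (card_number == NULL)
        return false;

    size_t len = strlen(card_number);
    if (len == 0)
        return false;

    int sum = 0;
    bool should_double = false;          /* the check digit itself is never doubled */

    for (size_t i = len; i-- > 0; ) {    /* walk right to left */
        char c = card_number[i];
        if (c < '0' || c > '9')
            return false;                /* reject anything that is not a digit */

        int digit = c - '0';
        if (should_double) {
            digit *= 2;
            if (digit > 9)
                digit -= 9;              /* same as summing the two digits */
        }
        sum += digit;
        should_double = !should_double;
    }
    return sum % 10 == 0;
}
```

Toggling `should_double` while walking from the right means the same loop works for 15-digit and 16-digit numbers without any length-dependent parity logic.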
3. Bank Identification Number (BIN) / Issuer Identification Number (IIN)
The BIN is the first 6-8 digits of a PAN. It uniquely identifies the issuing bank and card network.
Standard BIN Ranges:
| Prefix | Network | Notes |
|---|---|---|
| 4 | Visa | All cards starting with 4 |
| 51-55 | Mastercard | Traditional range |
| 2221-2720 | Mastercard | New range (2017+) |
| 34, 37 | American Express | 15-digit cards |
| 6011, 644-649, 65 | Discover | |
| 300-305, 36, 38 | Diners Club | |
| 3528-3589 | JCB | |
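As a rough illustration, the ranges in the table can be checked by reading the first few digits as integers and comparing them against each range. The helper names `leading_digits()` and `in_range()` are hypothetical; `identify_network()` matches the name used in the API later in this guide, but this body is only a sketch that covers the table rows above.

```c
#include <stddef.h>

/* Reads the first n digits of pan as an integer, or returns -1 if pan is
 * shorter than n or has a non-digit in that span. */
static int leading_digits(const char *pan, size_t n)
{
    int value = 0;
    for (size_t i = 0; i < n; i++) {
        if (pan[i] < '0' || pan[i] > '9')   /* also stops at the terminating NUL */
            return -1;
        value = value * 10 + (pan[i] - '0');
    }
    return value;
}

static int in_range(int v, int lo, int hi) { return v >= lo && v <= hi; }

const char *identify_network(const char *pan)
{
    if (pan == NULL)
        return "Unknown";

    int p1 = leading_digits(pan, 1);
    int p2 = leading_digits(pan, 2);
    int p3 = leading_digits(pan, 3);
    int p4 = leading_digits(pan, 4);

    /* Most specific (longest) ranges are tested first. */
    if (in_range(p4, 3528, 3589)) return "JCB";
    if (in_range(p4, 2221, 2720)) return "Mastercard";
    if (p4 == 6011)               return "Discover";
    if (in_range(p3, 644, 649))   return "Discover";
    if (in_range(p3, 300, 305))   return "Diners Club";
    if (in_range(p2, 51, 55))     return "Mastercard";
    if (p2 == 34 || p2 == 37)     return "American Express";
    if (p2 == 36 || p2 == 38)     return "Diners Club";
    if (p2 == 65)                 return "Discover";
    if (p1 == 4)                  return "Visa";
    return "Unknown";
}
```

Integer range comparison handles rows such as 2221-2720 that would be awkward to express as a list of string prefixes.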
Why BINs enable distributed routing:
┌────────────────────────────────────────────────────────────────────┐
│                      PAYMENT ROUTING VIA BIN                       │
├────────────────────────────────────────────────────────────────────┤
│                                                                    │
│  Card: 4532 0151 1283 0366                                         │
│        │                                                           │
│        └── BIN: 453201                                             │
│                    │                                               │
│                    ▼                                               │
│  ┌──────────────────────────────────────────────┐                  │
│  │               BIN LOOKUP TABLE               │                  │
│  ├──────────────────────────────────────────────┤                  │
│  │  453201 → Network: Visa                      │                  │
│  │           Issuer:  Bank of America           │                  │
│  │           Type:    Credit                    │                  │
│  │           Country: USA                       │                  │
│  └──────────────────────────────────────────────┘                  │
│                    │                                               │
│                    ▼                                               │
│  Terminal Decision: Route to Visa network → Bank of America        │
│                                                                    │
│  Without BINs: Would need central database lookup for every card!  │
│                                                                    │
└────────────────────────────────────────────────────────────────────┘
4. Validation vs. Authentication vs. Authorization
Critical distinction every payment developer must understand:
| Concept | What It Answers | Where It Happens | Security Level |
|---|---|---|---|
| Validation | "Is this structurally correct?" | Client-side (browser) | None (checksum) |
| Authentication | "Does this card exist?" | Card network/issuer | Medium |
| Authorization | "Can this transaction proceed?" | Issuing bank | High (funds check) |
Luhn validation is UX, not security. Confirming that a card is real and usable requires:
- Contacting the card network
- Routing to the issuing bank
- Receiving an authorization response
5. ISO/IEC 7812 Standard
The international standard defining PAN structure:
- Specifies MII (Major Industry Identifier)
- Defines IIN (Issuer Identification Number)
- Specifies check digit algorithm (Luhn)
- Defines variable-length PANs (13-19 digits)
Why Amex is 15 digits: Historical design decisions. Different networks made different choices about account number length vs. issuer identification space.
Project Specification
What You'll Build
A CLI tool and library that validates card numbers using the Luhn algorithm, identifies card networks from BIN ranges, and extracts metadata about the issuing bank.
Expected Output
$ ./cardvalidate 4532015112830366
Card Analysis Report
====================
Card Number: 4532015112830366
Length: 16 digits
Valid Luhn: ✓ VALID
Card Network: Visa
Card Type: Credit
Issuer: Bank of America
Country: United States
BIN: 453201
Structure Breakdown:
┌──────┬────────────────┬─┐
│ BIN  │ Account Number │C│
│453201│   511283036    │6│
└──────┴────────────────┴─┘
  IIN       Unique ID  Check
$ ./cardvalidate 378282246310005
Card Analysis Report
====================
Card Number: 378282246310005
Length: 15 digits
Valid Luhn: ✓ VALID
Card Network: American Express
Card Type: Credit
Issuer: American Express
Country: United States
BIN: 378282
$ ./cardvalidate 1234567890123456
Card Analysis Report
====================
Card Number: 1234567890123456
Length: 16 digits
Valid Luhn: ✗ INVALID (Expected check digit: 2, got: 6)
Card Network: Unknown
Project Structure
cardvalidate/
├── src/
│   ├── main.c              # CLI entry point
│   ├── cardvalidate.c      # Core validation library
│   ├── cardvalidate.h      # Public API
│   ├── bin_database.c      # BIN range lookup
│   ├── bin_database.h      # BIN database interface
│   └── utils.c             # String utilities
├── data/
│   └── bin_ranges.csv      # BIN to issuer mappings
├── tests/
│   ├── test_luhn.c         # Luhn algorithm tests
│   ├── test_bin_lookup.c   # BIN lookup tests
│   └── known_cards.txt     # Test vectors
├── Makefile
└── README.md
Core API Design
// cardvalidate.h
#ifndef CARDVALIDATE_H
#define CARDVALIDATE_H

#include <stdbool.h>
#include <stddef.h>   // size_t

typedef struct {
    bool valid;                 // Luhn check passed
    char network[32];           // "Visa", "Mastercard", etc.
    char issuer[64];            // "Bank of America", etc.
    char card_type[16];         // "credit" or "debit"
    char country[64];           // "United States", etc.
    char bin[9];                // First 6-8 digits plus terminating NUL
    int expected_check_digit;   // What check digit should be
    int actual_check_digit;     // What check digit is
    int length;                 // Number of digits
} CardValidationResult;

// Primary API
CardValidationResult validate_card(const char *card_number);

// Individual functions
bool luhn_check(const char *card_number);
int calculate_check_digit(const char *card_number);
const char *identify_network(const char *card_number);

// Utility
void strip_non_digits(const char *input, char *output, size_t output_size);
void mask_pan(const char *pan, char *masked, size_t masked_size);

#endif // CARDVALIDATE_H
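The two utility declarations leave their exact behavior open. Here is one possible sketch of both, under two assumptions that are not stated in the header: `strip_non_digits()` silently drops digits that do not fit the output buffer, and `mask_pan()` keeps the first 6 and last 4 digits and masks the rest with `*`.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copies only the decimal digits of input into output, always NUL-terminating.
 * Digits that do not fit in output_size - 1 characters are dropped. */
void strip_non_digits(const char *input, char *output, size_t output_size)
{
    if (output == NULL || output_size == 0)
        return;

    size_t j = 0;
    if (input != NULL) {
        for (size_t i = 0; input[i] != '\0' && j + 1 < output_size; i++) {
            if (isdigit((unsigned char)input[i]))
                output[j++] = input[i];
        }
    }
    output[j] = '\0';
}

/* Writes pan with the middle digits replaced by '*', keeping the first 6
 * and last 4 (an assumed masking policy). Inputs shorter than 13 characters
 * are fully masked; inputs that do not fit produce an empty string. */
void mask_pan(const char *pan, char *masked, size_t masked_size)
{
    if (masked == NULL || masked_size == 0)
        return;

    size_t len = (pan != NULL) ? strlen(pan) : 0;
    if (len + 1 > masked_size) {
        masked[0] = '\0';
        return;
    }

    for (size_t i = 0; i < len; i++) {
        int keep = (len >= 13) && (i < 6 || i >= len - 4);
        masked[i] = keep ? pan[i] : '*';
    }
    masked[len] = '\0';
}
```

Note that this `strip_non_digits()` also discards letters, so a caller that wants to report "contains letters" as an error needs to check the raw input before stripping.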
Solution Architecture
System Design
┌────────────────────────────────────────────────────────────────────────┐
│                      CARD VALIDATOR ARCHITECTURE                       │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│                                 INPUT                                  │
│                                   │                                    │
│                                   ▼                                    │
│              ┌─────────────────────────────────────────┐               │
│              │             INPUT SANITIZER             │               │
│              │  • Strip spaces, dashes                 │               │
│              │  • Validate characters (digits only)    │               │
│              │  • Check length (13-19)                 │               │
│              └────────────────────┬────────────────────┘               │
│                                   │                                    │
│                  ┌────────────────┴─────────────────┐                  │
│                  ▼                                  ▼                  │
│      ┌───────────────────────┐         ┌─────────────────────────┐     │
│      │    LUHN VALIDATOR     │         │       BIN LOOKUP        │     │
│      │                       │         │                         │     │
│      │  • Right-to-left      │         │  • Extract BIN          │     │
│      │    processing         │         │  • Search database      │     │
│      │  • Double/sum         │         │  • Return metadata      │     │
│      │  • Mod 10 check       │         │                         │     │
│      └───────────┬───────────┘         └────────────┬────────────┘     │
│                  │                                  │                  │
│                  └────────────────┬─────────────────┘                  │
│                                   ▼                                    │
│              ┌─────────────────────────────────────────┐               │
│              │            RESULT AGGREGATOR            │               │
│              │  • Combine validation result            │               │
│              │  • Format output                        │               │
│              │  • Generate breakdown                   │               │
│              └────────────────────┬────────────────────┘               │
│                                   │                                    │
│                                   ▼                                    │
│                                OUTPUT                                  │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
BIN Database Design
┌────────────────────────────────────────────────────────────────────────┐
│                          BIN DATABASE OPTIONS                          │
├────────────────────────────────────────────────────────────────────────┤
│                                                                        │
│  OPTION A: Array with Linear Search                                    │
│  ┌────────────────────────────────────────────┐                        │
│  │ BINInfo bins[] = {                         │                        │
│  │   {"4", "Visa", ...},                      │  Time:  O(n)           │
│  │   {"51", "Mastercard", ...},               │  Space: O(n)           │
│  │   {"453201", "Bank of America", ...},      │  Simple but slow       │
│  │   ...                                      │                        │
│  │ };                                         │                        │
│  └────────────────────────────────────────────┘                        │
│                                                                        │
│  OPTION B: Sorted Array with Binary Search                             │
│  ┌────────────────────────────────────────────┐                        │
│  │ Sort BINs by prefix                        │  Time:  O(log n)       │
│  │ Binary search for match                    │  Space: O(n)           │
│  │ Search longest prefix first                │  Better for larger DB  │
│  └────────────────────────────────────────────┘                        │
│                                                                        │
│  OPTION C: Trie (Prefix Tree)                                          │
│  ┌────────────────────────────────────────────┐                        │
│  │        Root                                │  Time:  O(k), k=BIN len│
│  │       / | \                                │  Space: O(n*k)         │
│  │      4  5  6                               │  Best for prefix match │
│  │        / | \                               │                        │
│  │     ...  1-5  0                            │                        │
│  └────────────────────────────────────────────┘                        │
│                                                                        │
│  RECOMMENDATION: Start with Option A, optimize to B if needed          │
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘
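A minimal sketch of Option A might look like the following. The `BINInfo` field layout and the `bin_lookup()` name are assumptions (the guide only shows the array's opening lines), and the table entries are illustrative values taken from the examples in this guide.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record layout; the guide does not define BINInfo's fields. */
typedef struct {
    const char *prefix;
    const char *network;
    const char *issuer;
} BINInfo;

/* Illustrative entries only. */
static const BINInfo bins[] = {
    {"4",      "Visa",             "Various"},
    {"51",     "Mastercard",       "Various"},
    {"453201", "Visa",             "Bank of America"},
    {"34",     "American Express", "American Express"},
    {"37",     "American Express", "American Express"},
};

/* Option A: scan every entry and keep the longest prefix that matches.
 * Returns NULL when nothing matches. */
const BINInfo *bin_lookup(const char *pan)
{
    const BINInfo *best = NULL;
    size_t best_len = 0;

    if (pan == NULL)
        return NULL;

    for (size_t i = 0; i < sizeof bins / sizeof bins[0]; i++) {
        size_t plen = strlen(bins[i].prefix);
        if (plen > best_len && strncmp(pan, bins[i].prefix, plen) == 0) {
            best = &bins[i];
            best_len = plen;
        }
    }
    return best;
}
```

Tracking the best match so far makes the result independent of the order in which entries appear in the table, at the cost of always scanning the whole array.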
Key Design Decisions
- String vs. integer for PAN storage: Use strings. A 19-digit number can exceed the range of a signed 64-bit long long, and an integer type also drops leading zeros and makes per-digit access awkward.
- Right-to-left vs. left-to-right Luhn: Process right-to-left (as defined) using index arithmetic, or reverse the string.
- Non-digit handling: Strip spaces and dashes before validation instead of rejecting them. Card numbers are printed and typed in groups, so formatted input is normal.
- BIN search order: Search longest prefixes first. 453201 (specific bank) should match before 4 (generic Visa).
- Memory management: Use stack allocation for fixed-size buffers. PANs are at most 19 digits, so a 20-byte buffer holds any stripped PAN plus its terminator.
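To see why an integer type is a fragile home for a PAN, the sketch below checks whether a digit string survives conversion to a signed `long long` using the standard `strtoll()` call. The function name `fits_in_long_long()` is hypothetical, and the comments assume the common case of a 64-bit `long long`.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Returns true if the decimal string s converts to a long long without
 * overflow and without leftover characters. */
bool fits_in_long_long(const char *s)
{
    char *end = NULL;

    errno = 0;
    (void)strtoll(s, &end, 10);          /* sets errno to ERANGE on overflow */

    return errno != ERANGE && end != s && *end == '\0';
}
```

With a 64-bit `long long` (maximum 9223372036854775807), a 16-digit PAN fits but a 19-digit value such as 9999999999999999999 does not.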
Implementation Guide
Phase 1: Basic Luhn Validation
Goal: Implement and test the Luhn algorithm.
Steps:
- Create luhn_check() function
- Handle digit extraction from string
- Implement the doubling logic
- Test with known valid/invalid cards
Test Cards (from payment network documentation):
// Valid test cards
"4111111111111111" // Visa
"5555555555554444" // Mastercard
"378282246310005" // Amex
"6011111111111117" // Discover
// Invalid (fail Luhn)
"1234567890123456"
"4111111111111112" // Wrong check digit
Key Implementation Detail: The digit -= 9 trick for doubled values > 9:
- 10 → 1+0=1, and 10-9=1
- 12 → 1+2=3, and 12-9=3
- 18 → 1+8=9, and 18-9=9
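The shortcut works because a doubled digit above 9 lies between 10 and 18: its tens digit is always 1 and its ones digit is the value minus 10, so the digit sum is 1 + (value - 10) = value - 9. The two hypothetical helpers below compute the reduction both ways so you can compare them for every digit.

```c
/* Doubles a single digit (0-9) and reduces it the long way: add the two
 * decimal digits of the product. */
int double_then_sum_digits(int d)
{
    int doubled = d * 2;
    return doubled / 10 + doubled % 10;
}

/* Same reduction using the subtract-9 shortcut. */
int double_then_subtract_9(int d)
{
    int doubled = d * 2;
    return doubled > 9 ? doubled - 9 : doubled;
}
```

Looping `d` from 0 to 9 and comparing the two results is a quick unit test to add before relying on the shortcut.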
Phase 2: BIN Database
Goal: Build BIN lookup with network identification.
Steps:
- Create BIN data structure
- Populate with major network prefixes
- Implement prefix-matching search
- Return complete card info
Minimum BIN data:
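{"4",    "Visa",             "Various",  "credit/debit"},
{"51",   "Mastercard",       "Various",  "credit"},   // 51-55: one entry per prefix
{"52",   "Mastercard",       "Various",  "credit"},
{"53",   "Mastercard",       "Various",  "credit"},
{"54",   "Mastercard",       "Various",  "credit"},
{"55",   "Mastercard",       "Various",  "credit"},
{"34",   "American Express", "Amex",     "credit"},
{"37",   "American Express", "Amex",     "credit"},
{"6011", "Discover",         "Discover", "credit"},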
Phase 3: CLI Interface
Goal: Build user-friendly command-line interface.
Features:
- Single card validation: ./cardvalidate 4111111111111111
- Batch mode: ./cardvalidate --batch file.txt
- Explain mode: ./cardvalidate --explain 4111111111111111
- Help: ./cardvalidate --help
Use getopt_long() for argument parsing.
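A minimal sketch of that parsing, assuming `--batch` takes a file argument while `--explain` and `--help` are plain flags and the card number is the first positional argument. The `CliOptions` struct, the `parse_cli()` name, and the short options `-b`, `-e`, `-h` are illustrative choices, not part of the specification above.

```c
#include <getopt.h>
#include <stddef.h>

typedef struct {
    const char *batch_file;  /* argument of --batch, or NULL */
    int explain;             /* 1 if --explain was given */
    int help;                /* 1 if --help was given */
    const char *card;        /* first positional argument, or NULL */
} CliOptions;

/* Returns 0 on success, -1 on an unrecognized option or missing argument. */
int parse_cli(int argc, char *argv[], CliOptions *opts)
{
    static const struct option long_opts[] = {
        {"batch",   required_argument, NULL, 'b'},
        {"explain", no_argument,       NULL, 'e'},
        {"help",    no_argument,       NULL, 'h'},
        {NULL, 0, NULL, 0}
    };

    opts->batch_file = NULL;
    opts->explain = 0;
    opts->help = 0;
    opts->card = NULL;

    int c;
    while ((c = getopt_long(argc, argv, "b:eh", long_opts, NULL)) != -1) {
        switch (c) {
        case 'b': opts->batch_file = optarg; break;
        case 'e': opts->explain = 1;         break;
        case 'h': opts->help = 1;            break;
        default:  return -1;   /* getopt_long has already printed a message */
        }
    }
    if (optind < argc)
        opts->card = argv[optind];   /* remaining argument is the card number */
    return 0;
}
```

`getopt_long()` is a GNU extension that is also available on the BSDs and macOS; on other platforms you may need a fallback parser.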
Phase 4: Polish and Edge Cases
Edge cases to handle:
- NULL input
- Empty string
- String with spaces/dashes: 4532 0151 1283 0366
- Too short (<13 digits)
- Too long (>19 digits)
- Contains letters
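One way to cover these cases is to classify the raw input before any other processing and return a distinct status for each failure. The `PanInputStatus` enum and `check_pan_input()` function below are hypothetical names; the sketch treats spaces and dashes as allowed separators and counts only digits toward the length.

```c
#include <stddef.h>

typedef enum {
    PAN_OK = 0,
    PAN_ERR_NULL,
    PAN_ERR_EMPTY,
    PAN_ERR_BAD_CHAR,
    PAN_ERR_TOO_SHORT,
    PAN_ERR_TOO_LONG
} PanInputStatus;

/* Classifies raw user input. Any character other than a digit, space, or
 * dash is reported as PAN_ERR_BAD_CHAR. */
PanInputStatus check_pan_input(const char *input)
{
    if (input == NULL)
        return PAN_ERR_NULL;
    if (input[0] == '\0')
        return PAN_ERR_EMPTY;

    size_t digits = 0;
    for (size_t i = 0; input[i] != '\0'; i++) {
        char c = input[i];
        if (c >= '0' && c <= '9')
            digits++;
        else if (c != ' ' && c != '-')
            return PAN_ERR_BAD_CHAR;
    }

    if (digits < 13)
        return PAN_ERR_TOO_SHORT;
    if (digits > 19)
        return PAN_ERR_TOO_LONG;
    return PAN_OK;
}
```

Returning an enum instead of a bare `false` lets the CLI print a specific message for each edge case.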
PCI consideration: Even for validation, consider:
- Don't log full PANs
- Clear memory after use
- Support masked output
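For the "clear memory after use" point, a plain `memset()` on a buffer that is never read again may be removed by the optimizer. A common workaround, sketched here under the hypothetical name `secure_clear()`, is to write through a `volatile` pointer. Where your platform provides `explicit_bzero()` or C11 Annex K `memset_s()`, those serve the same purpose.

```c
#include <stddef.h>

/* Overwrites a buffer with zeros through a volatile pointer so the compiler
 * is less likely to discard the stores as dead writes. */
void secure_clear(void *buf, size_t len)
{
    volatile unsigned char *p = buf;

    while (len--)
        *p++ = 0;
}
```

Call it on every stack buffer that held a full PAN just before the buffer goes out of scope.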
Testing Strategy
Unit Tests
// test_luhn.c
#include <assert.h>
#include <stdbool.h>
#include "cardvalidate.h"

void test_luhn_valid_visa(void) {
    assert(luhn_check("4111111111111111") == true);
}

void test_luhn_valid_amex(void) {
    assert(luhn_check("378282246310005") == true);
}

void test_luhn_invalid_single_digit_error(void) {
    assert(luhn_check("4111111111111112") == false);
}

void test_luhn_empty_string(void) {
    assert(luhn_check("") == false);
}

void test_luhn_with_spaces(void) {
    // After stripping: should pass
    char clean[20];
    strip_non_digits("4111 1111 1111 1111", clean, sizeof(clean));
    assert(luhn_check(clean) == true);
}

int main(void) {
    test_luhn_valid_visa();
    test_luhn_valid_amex();
    test_luhn_invalid_single_digit_error();
    test_luhn_empty_string();
    test_luhn_with_spaces();
    return 0;
}
Integration Tests
# test_cli.sh
set -e
# Test valid Visa (-w stops "VALID" from matching inside "INVALID")
./cardvalidate 4111111111111111 | grep -qw "VALID"
# Test invalid
./cardvalidate 1234567890123456 | grep -q "INVALID"
# Test batch mode (expects exactly two lines reporting VALID)
printf '4111111111111111\n5555555555554444\n' > /tmp/test.txt
./cardvalidate --batch /tmp/test.txt | grep -cw "VALID" | grep -qx "2"
Performance Tests
# Generate 1M test cards and validate
time ./cardvalidate --batch 1million_cards.txt
# Target: < 1 second for 1M cards
Common Pitfalls & Debugging
Pitfall 1: Wrong Position Numbering
Symptom: Valid cards fail, invalid cards pass.
Cause: Processing left-to-right instead of right-to-left, or off-by-one in position calculation.
Debug: Add verbose output showing each step:
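for (int i = len - 1; i >= 0; i--) {
    int digit = card_number[i] - '0';
    int from_right = len - i;                      // 1 = check digit
    bool should_double = (from_right % 2 == 0);    // even positions from the right
    printf("Index %d (from right %d): digit=%d, double=%s\n",
           i, from_right, digit, should_double ? "yes" : "no");
}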
Pitfall 2: Integer Overflow
Symptom: Validation fails for long cards.
Cause: Trying to store PAN as integer.
Fix: Always use strings for PAN storage.
Pitfall 3: Forgetting to Handle Non-Digits
Symptom: Crashes or wrong results with formatted input.
Cause: Not stripping spaces/dashes before validation.
Fix: Always sanitize input first.
Pitfall 4: BIN Match Order
Symptom: All Visa cards show "Various" instead of specific bank.
Cause: Generic prefix 4 matches before specific 453201.
Fix: Search longest prefixes first, or sort BIN database by prefix length descending.
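The second fix can be sketched with the standard `qsort()`: order the table by prefix length, longest first, so that a simple first-match scan returns the most specific entry. `BinEntry`, `sort_bins()`, and `first_match()` are hypothetical, pared-down names for illustration.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *prefix;
    const char *issuer;
} BinEntry;   /* hypothetical, pared-down record */

/* qsort comparator: longer prefixes sort first. */
static int by_prefix_len_desc(const void *a, const void *b)
{
    size_t la = strlen(((const BinEntry *)a)->prefix);
    size_t lb = strlen(((const BinEntry *)b)->prefix);
    return (la < lb) - (la > lb);
}

void sort_bins(BinEntry *entries, size_t count)
{
    qsort(entries, count, sizeof entries[0], by_prefix_len_desc);
}

/* With the table sorted, the first prefix that matches is the longest one. */
const BinEntry *first_match(const BinEntry *entries, size_t count, const char *pan)
{
    for (size_t i = 0; i < count; i++) {
        if (strncmp(pan, entries[i].prefix, strlen(entries[i].prefix)) == 0)
            return &entries[i];
    }
    return NULL;
}
```

Sorting once at load time keeps the lookup loop trivial and lets it stop at the first hit.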
Extensions & Challenges
Extension 1: Check Digit Calculator
Generate valid Luhn check digits:
$ ./cardvalidate --generate 411111111111111
Check digit for 411111111111111 is: 1
Complete valid number: 4111111111111111
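A sketch of the calculation behind this output, assuming the input is the number without its check digit. The function name `check_digit_for_payload()` is hypothetical, chosen to avoid implying anything about how the API's `calculate_check_digit()` interprets its argument.

```c
#include <stddef.h>
#include <string.h>

/* Returns the Luhn check digit (0-9) for a digits-only payload that does NOT
 * yet include a check digit, or -1 on NULL, empty, or non-digit input. */
int check_digit_for_payload(const char *payload)
{
    if (payload == NULL)
        return -1;

    size_t len = strlen(payload);
    if (len == 0)
        return -1;

    int sum = 0;
    int should_double = 1;   /* rightmost payload digit sits next to the check digit */

    for (size_t i = len; i-- > 0; ) {
        if (payload[i] < '0' || payload[i] > '9')
            return -1;

        int digit = payload[i] - '0';
        if (should_double) {
            digit *= 2;
            if (digit > 9)
                digit -= 9;
        }
        sum += digit;
        should_double = !should_double;
    }
    return (10 - sum % 10) % 10;   /* digit that brings the total to a multiple of 10 */
}
```

The only difference from validation is the starting parity: because the check digit is missing, the rightmost digit present is the first one to be doubled.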
Extension 2: Card Generator
Generate valid test card numbers for a given network:
$ ./cardvalidate --generate-card visa
Generated valid Visa test card: 4916338506082832
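One possible approach is to start from a network prefix, fill the account digits randomly, and append the Luhn check digit. The sketch below uses a hypothetical `generate_test_card()` function and the standard `rand()`, which is fine for test data but not for anything security-sensitive. It assumes the prefix contains only digits.

```c
#include <stdlib.h>
#include <string.h>

/* Fills out with a Luhn-valid number of total_len digits that starts with
 * prefix. out must hold total_len + 1 bytes. Returns 0 on success, -1 if
 * the arguments are inconsistent. Seed with srand() for varied output. */
int generate_test_card(const char *prefix, size_t total_len, char *out)
{
    size_t plen = strlen(prefix);
    if (total_len < 13 || total_len > 19 || plen >= total_len)
        return -1;

    memcpy(out, prefix, plen);
    for (size_t i = plen; i < total_len - 1; i++)
        out[i] = (char)('0' + rand() % 10);      /* random account digits */

    /* Luhn sum over the payload; its rightmost digit is doubled. */
    int sum = 0;
    int should_double = 1;
    for (size_t i = total_len - 1; i-- > 0; ) {
        int digit = out[i] - '0';
        if (should_double) {
            digit *= 2;
            if (digit > 9)
                digit -= 9;
        }
        sum += digit;
        should_double = !should_double;
    }

    out[total_len - 1] = (char)('0' + (10 - sum % 10) % 10);
    out[total_len] = '\0';
    return 0;
}
```

For example, `generate_test_card("4", 16, buf)` produces a 16-digit number in the Visa range that passes the Luhn check.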
Extension 3: Real BIN Database
Integrate with a real BIN database API (like binlist.net) for accurate issuer information.
Extension 4: WebAssembly Build
Compile to WASM for browser-based validation in checkout forms.
Extension 5: Fuzzing
Use AFL or libFuzzer to find edge cases in your input handling.
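With libFuzzer, the harness is a single function named `LLVMFuzzerTestOneInput` that receives arbitrary bytes. The sketch below feeds those bytes to a stand-in `strip_non_digits()` and aborts if the output violates its invariants. The stand-in exists only so the example compiles on its own; in the real project you would include cardvalidate.h and link the library. A typical build line, with a hypothetical file name, is `clang -g -fsanitize=fuzzer,address harness.c`.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the code under test (assumed behavior: keep digits only). */
static void strip_non_digits(const char *input, char *output, size_t output_size)
{
    size_t j = 0;
    for (size_t i = 0; input[i] != '\0' && j + 1 < output_size; i++)
        if (input[i] >= '0' && input[i] <= '9')
            output[j++] = input[i];
    output[j] = '\0';
}

/* libFuzzer calls this once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    char input[256];
    char clean[20];

    /* Fuzz data is not NUL-terminated, so copy it into a bounded C string. */
    size_t n = size < sizeof input - 1 ? size : sizeof input - 1;
    if (n > 0)
        memcpy(input, data, n);
    input[n] = '\0';

    strip_non_digits(input, clean, sizeof clean);

    /* Invariants: output fits the buffer and contains only digits. */
    if (strlen(clean) >= sizeof clean)
        abort();
    for (size_t i = 0; clean[i] != '\0'; i++)
        if (clean[i] < '0' || clean[i] > '9')
            abort();

    return 0;
}
```

AddressSanitizer reports any out-of-bounds write inside the function under test, which is the class of bug most likely to hide in hand-written string handling.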
Interview Questions This Prepares You For
Conceptual
- "What is the Luhn algorithm, and why do payment cards use it?"
  - Checksum from 1954, catches transcription errors, not a security mechanism
- "What's the difference between a checksum and a cryptographic hash?"
  - Checksums: error detection, trivial to forge, no security
  - Hashes: one-way, collision-resistant, for integrity/authentication
- "What is a BIN, and how does it enable payment routing?"
  - First 6-8 digits, identifies network and issuer, enables distributed routing
Implementation
- "How would you implement Luhn without string conversion?"
  - Extract digits with modulo/division, process right-to-left
- "What data structure for BIN lookup?"
  - Hash table for O(1), binary search for O(log n), trie for prefix matching
- "How do you handle 19-digit cards in C?"
  - String representation, not numeric types
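For the first question, a sketch of the modulo/division approach on a 64-bit unsigned integer might look like this. The function name `luhn_check_u64()` is hypothetical. Note that an integer cannot represent leading zeros, which is one reason the project itself stores PANs as strings.

```c
#include <stdbool.h>
#include <stdint.h>

/* Luhn check on a number held in a uint64_t. Digits are peeled off right to
 * left with % 10 and / 10. */
bool luhn_check_u64(uint64_t n)
{
    if (n == 0)
        return false;               /* treat "no digits" as invalid */

    unsigned sum = 0;
    bool should_double = false;     /* first digit seen is the check digit */

    while (n > 0) {
        unsigned digit = (unsigned)(n % 10);
        n /= 10;

        if (should_double) {
            digit *= 2;
            if (digit > 9)
                digit -= 9;
        }
        sum += digit;
        should_double = !should_double;
    }
    return sum % 10 == 0;
}
```

Because `% 10` naturally yields the rightmost digit first, the integer version gets right-to-left processing without any index arithmetic.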
Security
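- "Can I store a full PAN in my database?"
  - Only with a real business need and only under PCI DSS Requirement 3: the stored PAN must be rendered unreadable (encryption, truncation, tokenization, or keyed hashing), and doing so brings your systems into PCI scope. Most merchants avoid this by storing a processor-issued token instead.
- "What's the attack surface of a card validator?"
  - Buffer overflow, logging sensitive data, timing attacks (unlikely for Luhn)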
Resources
Books
| Topic | Book | Chapter |
|---|---|---|
| Luhn & Checksums | Serious Cryptography (Aumasson) | Ch. 14: MACs |
| Input Validation | Fluent C (Preschern) | Ch. 5: Input Validation |
| String Processing | C Primer Plus (Prata) | Ch. 11: Strings |
| Modular Arithmetic | CSAPP (Bryant & O'Hallaron) | Ch. 2.1: Information Storage |
| PAN Structure | PCI DSS v4.0 | Requirement 3 |
| Data Structures | Algorithms in C (Sedgewick) | Ch. 12, 15 |
Online Resources
- PCI DSS v4.0 documentation (free): pcisecuritystandards.org
- ISO 7812 summary articles
- Payment network test card numbers (Visa, Mastercard developer docs)
Test Card Numbers
Visa: 4111111111111111, 4012888888881881
Mastercard: 5555555555554444, 5105105105105100
Amex: 378282246310005, 371449635398431
Discover: 6011111111111117, 6011000990139424
NEVER use real card numbers for testing!
Self-Assessment Checklist
Before considering this project complete:
- Luhn algorithm correctly validates all test cards
- Handles 13-19 digit cards correctly
- Strips non-digit characters (spaces, dashes)
- BIN lookup returns correct network for major card types
- CLI provides clear, formatted output
- Edge cases handled (empty, too short, too long, invalid chars)
- Can explain the algorithm to someone else
- Understand why Luhn is validation, not security
- Know the difference between BIN and full PAN
- Batch mode processes files efficiently
What's Next?
After completing this project, you understand the foundation of card data structure. Move to Project 2: Payment Tokenization Vault to learn how real payment systems protect this data using encryption and tokens.