Project 4: Custom Validators and Types
Build a library of custom Pydantic types and validators for common use cases: phone numbers, credit cards, URLs with specific patterns, monetary amounts, and domain-specific types.
Learning Objectives
By completing this project, you will:
- Master the Pydantic V2 validator pipeline - Understand before, after, and wrap validation modes
- Create reusable custom types with Annotated - Build domain-specific types that work across your codebase
- Implement __get_pydantic_core_schema__ - Deep integration with pydantic-core for custom classes
- Use cross-field validation - Validate relationships between multiple fields with @model_validator
- Chain and compose validators - Combine multiple validation steps for complex requirements
- Build production-ready validation libraries - Design validators that are testable, documented, and reusable
Deep Theoretical Foundation
The Validator Pipeline in Pydantic V2
Pydantic V2 introduced a completely redesigned validation system built on pydantic-core (written in Rust). Understanding this pipeline is essential for creating effective custom validators.
------------------------------------------------------------------------
                    PYDANTIC V2 VALIDATION PIPELINE
------------------------------------------------------------------------
  Raw Input
      |
      v
  BEFORE VALIDATORS (mode='before'): run BEFORE type coercion
      - Receive raw input (could be any type)
      - Transform, normalize, or pre-process data
      - Example: strip whitespace, convert formats
      |
      v
  PYDANTIC CORE VALIDATION
      - Type coercion (string "123" -> int 123)
      - Built-in constraints (min_length, ge, pattern)
      - Schema validation from the Rust core
      |
      v
  AFTER VALIDATORS (mode='after'): run AFTER type coercion
      - Receive the validated, typed value
      - Additional business logic validation
      - Example: check email domain, validate checksums
      |
      v
  Validated Output
------------------------------------------------------------------------
The Three Validator Modes
1. Before Validators (mode='before')
Before validators run before Pydantic's internal validation. They receive raw input and can transform it:
from pydantic import BaseModel, field_validator

class User(BaseModel):
    email: str

    @field_validator('email', mode='before')
    @classmethod
    def normalize_email(cls, v):
        # v could be anything - string, bytes, None, etc.
        if isinstance(v, str):
            return v.lower().strip()
        return v  # Let Pydantic handle type errors
Use before validators for:
- Normalizing data (lowercase, strip whitespace)
- Converting between formats
- Handling multiple input representations
- Pre-processing before type coercion
2. After Validators (mode='after', the default)
After validators run after Pydantic has validated and coerced the type:
from pydantic import BaseModel, field_validator

class User(BaseModel):
    age: int

    @field_validator('age')  # mode='after' is default
    @classmethod
    def validate_age(cls, v: int) -> int:
        # v is guaranteed to be an int here
        if v < 0:
            raise ValueError('age must not be negative')
        return v
Use after validators for:
- Business rule validation
- Checksum verification
- Cross-reference validation
- Complex constraints not expressible with Field()
3. Wrap Validators (mode='wrap')
Wrap validators give you complete control over the validation process:
from pydantic import BaseModel, field_validator, ValidationInfo
from pydantic_core import PydanticCustomError

class Config(BaseModel):
    timeout: int

    @field_validator('timeout', mode='wrap')
    @classmethod
    def validate_timeout(cls, v, handler, info: ValidationInfo):
        # handler is Pydantic's internal validator
        try:
            validated = handler(v)  # Call Pydantic's validation
            return validated
        except Exception:
            # Provide fallback or custom error
            if v == 'default':
                return 30  # Default timeout
            raise PydanticCustomError(
                'invalid_timeout',
                'Invalid timeout value: {value}',
                {'value': v}
            )
Use wrap validators for:
- Custom error messages
- Fallback values
- Conditional validation
- Performance optimization (skip validation in certain cases)
How Annotated Validators Work
Pydantic V2 embraces Python's Annotated type for attaching validation metadata to types. This is the preferred way to create reusable validators.
------------------------------------------------------------------------
                         ANNOTATED TYPE ANATOMY
------------------------------------------------------------------------
PhoneNumber = Annotated[str, BeforeValidator(normalize), AfterValidator(check)]
                        |    |                           |
                        |    |                           +-- Run after coercion
                        |    +-- Run before coercion
                        +-- Base type for coercion

Multiple validators are applied in order:

  Input
    |
    +--> BeforeValidator(normalize)   <-- Transform raw input
    |
    +--> str type coercion            <-- Pydantic's internal validation
    |
    +--> AfterValidator(check)        <-- Validate coerced value
    |
    v
  Validated Output
------------------------------------------------------------------------
Creating Reusable Types with Annotated:
from typing import Annotated
from pydantic import AfterValidator, BaseModel, BeforeValidator
import re

def normalize_phone(v: str) -> str:
    """Remove all non-digit characters"""
    return re.sub(r'\D', '', v)

def validate_phone(v: str) -> str:
    """Validate phone number format"""
    if len(v) == 10:
        return f"+1{v}"  # Assume US
    elif len(v) == 11 and v.startswith('1'):
        return f"+{v}"
    elif len(v) >= 11:
        return f"+{v}"
    raise ValueError("Invalid phone number: must have at least 10 digits")

# Reusable type - use anywhere in your codebase!
PhoneNumber = Annotated[str, BeforeValidator(normalize_phone), AfterValidator(validate_phone)]

# Usage
class Contact(BaseModel):
    name: str
    phone: PhoneNumber  # Automatically validated!
    fax: PhoneNumber    # Same validation applied
Creating Custom Types with __get_pydantic_core_schema__
For complex custom types (classes, not just validated strings), you implement __get_pydantic_core_schema__. This hooks directly into pydantic-core.
------------------------------------------------------------------------
                     CUSTOM TYPE SCHEMA INTEGRATION
------------------------------------------------------------------------
class Money:
    amount: Decimal
    currency: str

    @classmethod
    def __get_pydantic_core_schema__(cls, source, handler):
        #   source:  the source type (Money)
        #   handler: handler for nested types
        #
        # Return a CoreSchema that tells pydantic-core:
        #   - What input types to accept
        #   - How to validate them
        #   - How to serialize output

CoreSchema tree:

    union_schema([
        is_instance_schema(Money),    <-- Accept Money objects
        str_schema(),                 <-- Accept strings like "USD 99"
        float_schema(),               <-- Accept floats like 99.99
        dict_schema(...)              <-- Accept dicts
    ])
        |
        v
    no_info_after_validator_function( <-- Then validate/convert
        cls._validate,
        ...union_schema...
    )
------------------------------------------------------------------------
Complete Example: Money Type
from decimal import Decimal
from typing import Any
from pydantic import GetCoreSchemaHandler, GetJsonSchemaHandler
from pydantic_core import CoreSchema, core_schema
import re

class Money:
    """Represents a monetary amount with currency."""
    __slots__ = ('amount', 'currency')

    def __init__(self, amount: Decimal, currency: str = "USD"):
        self.amount = Decimal(str(amount)).quantize(Decimal("0.01"))
        self.currency = currency.upper()

    def __repr__(self) -> str:
        return f"{self.currency} {self.amount}"

    def __eq__(self, other) -> bool:
        if isinstance(other, Money):
            return self.amount == other.amount and self.currency == other.currency
        return False

    @classmethod
    def __get_pydantic_core_schema__(
        cls,
        _source_type: Any,
        _handler: GetCoreSchemaHandler
    ) -> CoreSchema:
        """Define how Pydantic validates this type."""
        return core_schema.no_info_after_validator_function(
            cls._validate,
            core_schema.union_schema([
                # Accept existing Money objects
                core_schema.is_instance_schema(Money),
                # Accept strings like "USD 99.99" or "99.99"
                core_schema.str_schema(),
                # Accept numbers
                core_schema.float_schema(),
                core_schema.int_schema(),
                # Accept dicts like {"amount": 99.99, "currency": "USD"}
                core_schema.dict_schema(
                    keys_schema=core_schema.str_schema(),
                    values_schema=core_schema.any_schema(),
                ),
            ]),
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda m: {"amount": str(m.amount), "currency": m.currency},
                info_arg=False,
            ),
        )

    @classmethod
    def _validate(cls, value: Any) -> "Money":
        """Convert various inputs to Money."""
        if isinstance(value, Money):
            return value
        if isinstance(value, (int, float, Decimal)):
            return Money(Decimal(str(value)))
        if isinstance(value, str):
            # Parse "USD 100.00" or just "100.00"
            match = re.match(r'^([A-Z]{3})?\s*(-?\d+\.?\d*)$', value.strip())
            if match:
                currency = match.group(1) or "USD"
                amount = Decimal(match.group(2))
                return Money(amount, currency)
            raise ValueError(f"Cannot parse Money from string: {value}")
        if isinstance(value, dict):
            return Money(
                amount=Decimal(str(value.get('amount', 0))),
                currency=value.get('currency', 'USD')
            )
        raise ValueError(f"Cannot create Money from {type(value).__name__}")

    @classmethod
    def __get_pydantic_json_schema__(
        cls,
        _core_schema: CoreSchema,
        handler: GetJsonSchemaHandler
    ) -> dict:
        """Define the JSON Schema for documentation."""
        return {
            "type": "object",
            "properties": {
                "amount": {"type": "string", "pattern": r"^-?\d+\.?\d*$"},
                "currency": {"type": "string", "pattern": r"^[A-Z]{3}$"}
            },
            "required": ["amount"],
            "examples": [
                {"amount": "99.99", "currency": "USD"},
                "USD 99.99",
                99.99
            ]
        }
The Pydantic-Core Schema System
Pydantic V2 uses a schema system to describe validation logic. Understanding key schema types helps you create sophisticated validators.
------------------------------------------------------------------------
                        COMMON CORE SCHEMA TYPES
------------------------------------------------------------------------
PRIMITIVE SCHEMAS
    str_schema()            - Validates strings
    int_schema()            - Validates integers
    float_schema()          - Validates floats
    bool_schema()           - Validates booleans
    none_schema()           - Validates None

CONTAINER SCHEMAS
    list_schema(items)      - Validates lists with item schema
    dict_schema(k, v)       - Validates dicts with key/value schemas
    set_schema(items)       - Validates sets
    tuple_schema(items)     - Validates tuples

COMPOSITE SCHEMAS
    union_schema([...])     - Accept any of multiple schemas
    nullable_schema(s)      - Schema s or None
    chain_schema([...])     - Apply schemas in sequence

VALIDATION SCHEMAS
    no_info_before_validator_function(fn, schema)
    no_info_after_validator_function(fn, schema)
    with_info_before_validator_function(fn, schema)
    with_info_after_validator_function(fn, schema)

INSTANCE SCHEMAS
    is_instance_schema(cls) - Check isinstance
    model_schema(cls)       - Validate as Pydantic model
------------------------------------------------------------------------
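The schema builders listed above can also be exercised directly, without a model. The following is a minimal sketch that composes three of them and runs the result through pydantic-core's SchemaValidator; the must_be_even function is a made-up check used only for illustration.

```python
from pydantic_core import SchemaValidator, ValidationError, core_schema


def must_be_even(value: int) -> int:
    """Hypothetical business rule: reject odd integers."""
    if value % 2:
        raise ValueError("value must be even")
    return value


# int coercion runs first, then the custom check; None is accepted as well.
schema = core_schema.nullable_schema(
    core_schema.no_info_after_validator_function(
        must_be_even,
        core_schema.int_schema(),
    )
)
validator = SchemaValidator(schema)

coerced = validator.validate_python("42")  # lax mode coerces the string to int
nothing = validator.validate_python(None)  # allowed by nullable_schema

try:
    validator.validate_python(3)
except ValidationError as exc:
    print(exc.error_count(), "error for odd input")
```

Building the schema by hand like this is mostly useful for experimenting; inside a library the same tree would normally be returned from __get_pydantic_core_schema__.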
Cross-Field Validation Patterns
When validation depends on multiple fields, use @model_validator:
from datetime import date
from pydantic import BaseModel, model_validator

class DateRange(BaseModel):
    start_date: date
    end_date: date

    @model_validator(mode='after')
    def validate_date_order(self) -> 'DateRange':
        if self.end_date < self.start_date:
            raise ValueError('end_date must not be before start_date')
        return self
Model Validator Modes:
------------------------------------------------------------------------
                         MODEL VALIDATOR MODES
------------------------------------------------------------------------
mode='before'

    @model_validator(mode='before')
    @classmethod
    def validate(cls, data: Any) -> Any:
        # data is the raw input (usually dict)
        # Runs BEFORE any field validation
        # Can transform the entire input
        return data

    Use cases:
    - Flatten nested structures
    - Rename fields
    - Add computed fields

mode='after' (mode must always be passed explicitly; it has no default)

    @model_validator(mode='after')
    def validate(self) -> 'Self':
        # self is the fully validated model instance
        # All fields are validated and assigned
        # Can validate relationships between fields
        return self

    Use cases:
    - Cross-field validation
    - Computed properties that need all fields
    - Consistency checks
------------------------------------------------------------------------
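As an illustration of the "flatten nested structures" use case for mode='before', here is a minimal sketch. The Account model and the shape of the incoming "address" dict are assumptions made up for this example.

```python
from typing import Any

from pydantic import BaseModel, model_validator


class Account(BaseModel):
    username: str
    city: str

    @model_validator(mode="before")
    @classmethod
    def flatten_address(cls, data: Any) -> Any:
        # Only reshape dict input; anything else is left for Pydantic to reject.
        if isinstance(data, dict) and isinstance(data.get("address"), dict):
            data = dict(data)  # copy so the caller's dict is not mutated
            address = data.pop("address")
            data.setdefault("city", address.get("city"))
        return data


account = Account.model_validate(
    {"username": "ann", "address": {"city": "Oslo"}}
)
print(account.city)
```

Because the validator runs before field validation, the city value it lifts out of the nested dict still goes through the normal str validation afterwards.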
Validator Information Context
Validators can receive context about the validation:
from pydantic import BaseModel, field_validator, ValidationInfo

class User(BaseModel):
    password: str
    password_confirm: str

    @field_validator('password_confirm')
    @classmethod
    def passwords_match(cls, v: str, info: ValidationInfo) -> str:
        # Access other field values through info.data
        if 'password' in info.data and v != info.data['password']:
            raise ValueError('passwords do not match')
        return v
ValidationInfo provides:
- info.data - Dict of already validated field values
- info.field_name - Name of the current field
- info.config - Model configuration
- info.context - Custom context passed via model_validate(..., context={})
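The info.context entry is the least obvious of these, so here is a minimal sketch of passing per-call context into a field validator. The Signup model and the "allowed_domains" key are hypothetical names chosen for this example.

```python
from pydantic import BaseModel, ValidationError, ValidationInfo, field_validator


class Signup(BaseModel):
    email: str

    @field_validator("email")
    @classmethod
    def check_domain(cls, v: str, info: ValidationInfo) -> str:
        # info.context is None unless the caller passes context=...
        allowed = (info.context or {}).get("allowed_domains")
        if allowed is not None and v.rsplit("@", 1)[-1] not in allowed:
            raise ValueError(f"{info.field_name} must use one of: {sorted(allowed)}")
        return v


ok = Signup.model_validate(
    {"email": "ann@example.com"},
    context={"allowed_domains": {"example.com"}},
)

try:
    Signup.model_validate(
        {"email": "ann@other.org"},
        context={"allowed_domains": {"example.com"}},
    )
except ValidationError as exc:
    print(exc.errors()[0]["msg"])
```

Context keeps the validator free of global state: the same model can be validated against different domain lists per request.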
Project Specification
Functional Requirements
Build a validation library called pydantic-types that provides:
- Phone Number Type
  - Accept various formats: (555) 123-4567, 555-123-4567, +1-555-123-4567
  - Normalize to international format: +15551234567
  - Support configurable country default
- Credit Card Type
  - Validate card number using Luhn algorithm
  - Identify card type (Visa, Mastercard, Amex)
  - Mask number for display: **** **** **** 1234
- URL Type with Pattern Matching
  - Validate URL structure
  - Restrict to specific domains
  - Require HTTPS
  - Extract components (domain, path, query)
- Monetary Amount Type
  - Handle currency with amount
  - Support multiple input formats
  - Proper decimal precision
  - Arithmetic operations
- Domain-Specific Types
  - Social Security Number (SSN) with format validation
  - Email with domain restrictions
  - Slug for URLs
  - Color (hex, RGB, named colors)
Usage Examples
from pydantic import BaseModel
from pydantic_types import (
    PhoneNumber,
    CreditCard,
    SecureURL,
    Money,
    SSN,
    DomainEmail,
    Slug,
    Color
)

class PaymentForm(BaseModel):
    cardholder_name: str
    card_number: CreditCard
    billing_phone: PhoneNumber
    amount: Money

class UserProfile(BaseModel):
    email: DomainEmail['company.com']  # Only @company.com
    phone: PhoneNumber
    website: SecureURL
    ssn: SSN
    profile_slug: Slug
    favorite_color: Color

# All of these should work:
payment = PaymentForm(
    cardholder_name="John Doe",
    card_number="4111-1111-1111-1111",  # Visa test number
    billing_phone="(555) 123-4567",
    amount="USD 99.99"
)
print(payment.card_number.masked)     # **** **** **** 1111
print(payment.card_number.card_type)  # visa
print(payment.billing_phone)          # +15551234567
print(payment.amount)                 # USD 99.99
Solution Architecture
Component Design
------------------------------------------------------------------------
                         PYDANTIC-TYPES LIBRARY
------------------------------------------------------------------------
Base Components
    Validators (functions) | Patterns (regex) | Errors (custom)
        |
        v
Type Definitions

    Annotated Types (simple string-based)
        PhoneNumber = Annotated[str, BeforeValidator, AfterValidator]
        SSN         = Annotated[str, BeforeValidator, AfterValidator]
        Slug        = Annotated[str, BeforeValidator, AfterValidator]

    Custom Classes (complex types with methods)
        class CreditCard:
            __get_pydantic_core_schema__()
            @property masked
            @property card_type
        class Money:
            __get_pydantic_core_schema__()
            __add__, __sub__, etc.
        class Color:
            __get_pydantic_core_schema__()
            to_hex(), to_rgb()

    Generic Types (parameterized)
        class DomainEmail(Generic[T]):
            __class_getitem__() for DomainEmail['company.com']
        class SecureURL:
            with_domains(), require_https()
------------------------------------------------------------------------
File Structure
pydantic-types/
|-- pyproject.toml
|-- src/
|   `-- pydantic_types/
|       |-- __init__.py        # Public exports
|       |-- _validators.py     # Validator functions
|       |-- _patterns.py       # Regex patterns
|       |-- _errors.py         # Custom error types
|       |-- phone.py           # PhoneNumber type
|       |-- credit_card.py     # CreditCard type
|       |-- url.py             # SecureURL type
|       |-- money.py           # Money type
|       |-- identifiers.py     # SSN, Slug, etc.
|       |-- email.py           # DomainEmail type
|       `-- color.py           # Color type
`-- tests/
    |-- test_phone.py
    |-- test_credit_card.py
    |-- test_url.py
    |-- test_money.py
    |-- test_identifiers.py
    |-- test_email.py
    `-- test_color.py
Phased Implementation Guide
Phase 1: Project Setup and Base Infrastructure (1-2 hours)
Goal: Set up the project structure and common utilities.
- Create project with pyproject.toml:
[project]
name = "pydantic-types"
version = "0.1.0"
dependencies = [
    "pydantic>=2.0",
]

[project.optional-dependencies]
dev = ["pytest", "pytest-cov", "mypy"]
- Create _patterns.py with common regex patterns:
import re

PHONE_DIGITS = re.compile(r'\d')
PHONE_FULL = re.compile(r'^\+?1?[\s.-]?\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})$')
CREDIT_CARD = re.compile(r'^[\d\s-]+$')
SSN = re.compile(r'^(\d{3})-?(\d{2})-?(\d{4})$')
SLUG = re.compile(r'^[a-z0-9]+(?:-[a-z0-9]+)*$')
HEX_COLOR = re.compile(r'^#?([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$')
- Create _errors.py with custom error types:
from pydantic_core import PydanticCustomError

def phone_error(value: str) -> PydanticCustomError:
    return PydanticCustomError(
        'phone_number_invalid',
        'Invalid phone number format: {value}',
        {'value': value}
    )

def credit_card_error(value: str, reason: str) -> PydanticCustomError:
    # Only the first four digits are echoed back in the error message
    return PydanticCustomError(
        'credit_card_invalid',
        '{reason}: {value}',
        {'value': value[:4] + '****', 'reason': reason}
    )
Checkpoint: Project structure exists with base utilities.
Phase 2: PhoneNumber Type (2-3 hours)
Goal: Create a robust phone number type with normalization.
- Implement phone validation logic:
# phone.py
from typing import Annotated
from pydantic import BeforeValidator, AfterValidator

def normalize_phone(value: str) -> str:
    """Remove all non-digit characters."""
    if not isinstance(value, str):
        # Raise ValueError, not TypeError: Pydantic V2 only converts
        # ValueError and AssertionError into a ValidationError.
        raise ValueError('Phone number must be a string')
    return ''.join(c for c in value if c.isdigit())

def validate_phone(value: str) -> str:
    """Validate and format phone number."""
    digits = value
    if len(digits) == 10:
        return f"+1{digits}"
    elif len(digits) == 11 and digits.startswith('1'):
        return f"+{digits}"
    elif len(digits) > 11:
        return f"+{digits}"
    else:
        raise ValueError(
            f"Phone number must have at least 10 digits, got {len(digits)}"
        )

PhoneNumber = Annotated[
    str,
    BeforeValidator(normalize_phone),
    AfterValidator(validate_phone)
]
- Add phone number parsing for different formats.
- Write comprehensive tests:
# test_phone.py
import pytest
from pydantic import TypeAdapter, ValidationError
from pydantic_types import PhoneNumber

# TypeAdapter validates a bare value against the Annotated type
validate = TypeAdapter(PhoneNumber).validate_python

def test_us_formats():
    assert validate("(555) 123-4567") == "+15551234567"
    assert validate("555-123-4567") == "+15551234567"
    assert validate("555.123.4567") == "+15551234567"
    assert validate("5551234567") == "+15551234567"

def test_international():
    assert validate("+1-555-123-4567") == "+15551234567"
    assert validate("+44 20 7946 0958") == "+442079460958"

def test_invalid():
    with pytest.raises(ValidationError):
        validate("123")  # Too short
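The specification also asks for a configurable country default, which the fixed "+1" prefix does not cover. One possible approach is a small factory that builds the Annotated type per country code. This is only a sketch: phone_number_type is a hypothetical helper name, and it assumes national numbers are written with 10 digits and that a full number has 11 to 15 digits.

```python
import re
from typing import Annotated, Any

from pydantic import AfterValidator, BeforeValidator, TypeAdapter


def phone_number_type(default_country_code: str = "1") -> Any:
    """Build a phone type that prefixes bare national numbers (hypothetical helper)."""

    def normalize(value: Any) -> Any:
        if not isinstance(value, str):
            return value  # let the str schema report the type error
        stripped = value.strip()
        # Keep a leading "+" so we know the caller supplied a country code.
        prefix = "+" if stripped.startswith("+") else ""
        return prefix + re.sub(r"\D", "", stripped)

    def validate(value: str) -> str:
        if value.startswith("+"):
            digits = value[1:]
        elif len(value) == 10:
            # Assumption: a bare 10-digit number is a national number.
            digits = default_country_code + value
        else:
            digits = value
        if not 11 <= len(digits) <= 15:
            raise ValueError(f"Phone number has an unexpected length: {len(digits)} digits")
        return "+" + digits

    return Annotated[str, BeforeValidator(normalize), AfterValidator(validate)]


# Usage: one type per default country
USPhone = phone_number_type()       # defaults to country code 1
UKPhone = phone_number_type("44")

us_number = TypeAdapter(USPhone).validate_python("(555) 123-4567")
uk_number = TypeAdapter(UKPhone).validate_python("20 7946 0958")
print(us_number, uk_number)
```

A factory keeps the validators as plain closures, so each generated type remains an ordinary Annotated alias that can be used as a model field.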
Checkpoint: PhoneNumber validates and normalizes various formats.
Phase 3: CreditCard Type (3-4 hours)
Goal: Create credit card type with Luhn validation and card type detection.
- Implement Luhn algorithm:
def luhn_checksum(card_number: str) -> bool:
    """Validate credit card number using Luhn algorithm."""
    digits = [int(d) for d in card_number if d.isdigit()]
    odd_digits = digits[-1::-2]
    even_digits = digits[-2::-2]
    checksum = sum(odd_digits)
    for d in even_digits:
        checksum += sum(divmod(d * 2, 10))
    return checksum % 10 == 0
- Implement card type detection:
def detect_card_type(number: str) -> str:
    """Detect credit card type from number."""
    if number.startswith('4'):
        return 'visa'
    elif number[:2] in ('51', '52', '53', '54', '55'):
        return 'mastercard'
    elif number[:2] in ('34', '37'):
        return 'amex'
    elif number[:4] == '6011' or number[:2] == '65':
        return 'discover'
    return 'unknown'
- Create CreditCard class with __get_pydantic_core_schema__:
class CreditCard:
    __slots__ = ('number', 'card_type')

    def __init__(self, number: str):
        self.number = number
        self.card_type = detect_card_type(number)

    @property
    def masked(self) -> str:
        return '*' * 12 + self.number[-4:]

    @property
    def last_four(self) -> str:
        return self.number[-4:]

    @classmethod
    def __get_pydantic_core_schema__(cls, source, handler):
        # Implementation here
        pass
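The schema hook is left as a stub above. One way to fill it in, following the same pattern as the Money example, is sketched below. This is not the only valid design: the 13 to 19 digit length check, the error wording, and the choice to emit the masked number only in JSON output (when_used="json") are assumptions of this sketch, and card type detection is omitted to keep it short.

```python
from typing import Any

from pydantic import BaseModel, GetCoreSchemaHandler, ValidationError
from pydantic_core import CoreSchema, PydanticCustomError, core_schema


def luhn_ok(number: str) -> bool:
    """Luhn check over a string that contains only digits."""
    digits = [int(d) for d in number]
    total = sum(digits[-1::-2])
    total += sum(sum(divmod(d * 2, 10)) for d in digits[-2::-2])
    return total % 10 == 0


class CreditCard:
    __slots__ = ("number",)

    def __init__(self, number: str):
        self.number = number

    @property
    def masked(self) -> str:
        return "*" * 12 + self.number[-4:]

    @classmethod
    def _validate(cls, value: Any) -> "CreditCard":
        if isinstance(value, cls):
            return value
        digits = value.replace(" ", "").replace("-", "")
        if not digits.isdigit() or not 13 <= len(digits) <= 19:
            raise PydanticCustomError(
                "credit_card_invalid", "Card number must contain 13 to 19 digits"
            )
        if not luhn_ok(digits):
            raise PydanticCustomError(
                "credit_card_invalid", "Card number failed the Luhn check"
            )
        return cls(digits)

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source: Any, handler: GetCoreSchemaHandler
    ) -> CoreSchema:
        return core_schema.no_info_after_validator_function(
            cls._validate,
            # Accept an existing instance or a string; _validate does the rest.
            core_schema.union_schema(
                [core_schema.is_instance_schema(cls), core_schema.str_schema()]
            ),
            # Assumption: JSON output shows only the masked number, while
            # model_dump() in Python mode keeps the CreditCard object.
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda card: card.masked, info_arg=False, when_used="json"
            ),
        )


class Payment(BaseModel):
    card: CreditCard


payment = Payment(card="4111-1111-1111-1111")
print(payment.card.number, payment.model_dump_json())
```

Keeping the object intact in Python mode means Payment.model_validate(payment.model_dump()) still round-trips, whereas the JSON form deliberately cannot be turned back into a full card number.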
Checkpoint: CreditCard validates numbers and provides useful properties.
Phase 4: Money Type (3-4 hours)
Goal: Create a full-featured monetary type with currency support.
- Implement Money class with Decimal precision:
from decimal import Decimal, ROUND_HALF_UP

class Money:
    __slots__ = ('amount', 'currency')
    CURRENCIES = {'USD', 'EUR', 'GBP', 'JPY', 'CAD', 'AUD'}

    def __init__(self, amount: Decimal, currency: str = 'USD'):
        if currency.upper() not in self.CURRENCIES:
            raise ValueError(f"Unknown currency: {currency}")
        # Round to 2 decimal places (or 0 for JPY)
        precision = '1' if currency.upper() == 'JPY' else '0.01'
        self.amount = Decimal(str(amount)).quantize(
            Decimal(precision), rounding=ROUND_HALF_UP
        )
        self.currency = currency.upper()

    def __add__(self, other: 'Money') -> 'Money':
        if self.currency != other.currency:
            raise ValueError(f"Cannot add {self.currency} and {other.currency}")
        return Money(self.amount + other.amount, self.currency)

    def __sub__(self, other: 'Money') -> 'Money':
        if self.currency != other.currency:
            raise ValueError(f"Cannot subtract {self.currency} and {other.currency}")
        return Money(self.amount - other.amount, self.currency)

    def __mul__(self, factor: int | float | Decimal) -> 'Money':
        return Money(self.amount * Decimal(str(factor)), self.currency)
- Add parsing for various input formats:
  - 99.99 (number, default USD)
  - "99.99" (string number)
  - "USD 99.99" (currency prefixed)
  - {"amount": 99.99, "currency": "USD"} (dict)
- Implement serialization for JSON output.
Checkpoint: Money handles arithmetic and various input formats.
Phase 5: URL and Email Types (2-3 hours)
Goal: Create constrained URL and email types.
- Implement SecureURL with domain restrictions:
from urllib.parse import urlparse
from pydantic_core import core_schema

class SecureURL(str):
    @classmethod
    def __get_pydantic_core_schema__(cls, source, handler):
        return core_schema.no_info_after_validator_function(
            cls._validate,
            core_schema.str_schema()
        )

    @classmethod
    def _validate(cls, value: str) -> 'SecureURL':
        parsed = urlparse(value)
        if parsed.scheme != 'https':
            raise ValueError('URL must use HTTPS')
        if not parsed.netloc:
            raise ValueError('Invalid URL: missing domain')
        return cls(value)
- Create DomainEmail with parameterized domain:
from typing import Annotated
from pydantic import AfterValidator

class DomainEmailMeta(type):
    def __getitem__(cls, domain: str):
        """Allow DomainEmail['company.com'] syntax."""
        return Annotated[
            str,
            AfterValidator(lambda v: cls._validate(v, domain))
        ]

class DomainEmail(metaclass=DomainEmailMeta):
    @classmethod
    def _validate(cls, value: str, allowed_domain: str) -> str:
        if '@' not in value:
            raise ValueError('Invalid email format')
        domain = value.split('@')[1].lower()
        if domain != allowed_domain.lower():
            raise ValueError(f'Email must be from @{allowed_domain}')
        return value.lower()
Checkpoint: URL and Email types with domain constraints work.
Phase 6: Identifier Types and Documentation (2-3 hours)
Goal: Complete remaining types and add comprehensive documentation.
- Implement SSN type:
# identifiers.py
import re
from typing import Annotated
from pydantic import AfterValidator, BeforeValidator

def normalize_ssn(value: str) -> str:
    return ''.join(c for c in value if c.isdigit())

def validate_ssn(value: str) -> str:
    if len(value) != 9:
        raise ValueError('SSN must be exactly 9 digits')
    # Format as XXX-XX-XXXX
    return f"{value[:3]}-{value[3:5]}-{value[5:]}"

SSN = Annotated[str, BeforeValidator(normalize_ssn), AfterValidator(validate_ssn)]
- Implement Slug type:
def normalize_slug(value: str) -> str:
    # Convert to lowercase, replace spaces with hyphens
    slug = value.lower().strip()
    slug = re.sub(r'[^\w\s-]', '', slug)
    slug = re.sub(r'[\s_]+', '-', slug)
    return slug.strip('-')

def validate_slug(value: str) -> str:
    if not re.match(r'^[a-z0-9]+(?:-[a-z0-9]+)*$', value):
        raise ValueError('Invalid slug format')
    return value

Slug = Annotated[str, BeforeValidator(normalize_slug), AfterValidator(validate_slug)]
- Implement Color type (supporting hex, RGB, named colors).
- Add docstrings and type hints to all public APIs.
- Create __init__.py with clean exports:
from .phone import PhoneNumber
from .credit_card import CreditCard
from .money import Money
from .url import SecureURL
from .email import DomainEmail
from .identifiers import SSN, Slug
from .color import Color

__all__ = [
    'PhoneNumber',
    'CreditCard',
    'Money',
    'SecureURL',
    'DomainEmail',
    'SSN',
    'Slug',
    'Color',
]
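The Color step above is left as an exercise. As a starting point, here is a minimal sketch of one possible shape for it. The NAMED_COLORS table is a tiny illustrative subset, and accepting RGB input only as a 3-item tuple or list of ints is an assumption of this sketch rather than a requirement from the specification.

```python
import re
from typing import Any

from pydantic import BaseModel, GetCoreSchemaHandler, ValidationError
from pydantic_core import CoreSchema, core_schema

HEX_COLOR = re.compile(r"^#?([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$")
# Illustrative subset only; a real library would carry a full table.
NAMED_COLORS = {"black": "#000000", "white": "#ffffff", "red": "#ff0000"}


class Color:
    __slots__ = ("r", "g", "b")

    def __init__(self, r: int, g: int, b: int):
        for channel in (r, g, b):
            if not 0 <= channel <= 255:
                raise ValueError("RGB channels must be between 0 and 255")
        self.r, self.g, self.b = r, g, b

    def to_hex(self) -> str:
        return f"#{self.r:02x}{self.g:02x}{self.b:02x}"

    def to_rgb(self) -> tuple:
        return (self.r, self.g, self.b)

    @classmethod
    def _validate(cls, value: Any) -> "Color":
        if isinstance(value, cls):
            return value
        if (
            isinstance(value, (tuple, list))
            and len(value) == 3
            and all(isinstance(c, int) for c in value)
        ):
            return cls(*value)
        if isinstance(value, str):
            text = value.strip()
            text = NAMED_COLORS.get(text.lower(), text)
            match = HEX_COLOR.match(text)
            if match:
                digits = match.group(1)
                if len(digits) == 3:  # expand shorthand such as "f00"
                    digits = "".join(ch * 2 for ch in digits)
                return cls(*(int(digits[i:i + 2], 16) for i in (0, 2, 4)))
        raise ValueError(f"Cannot parse color from {value!r}")

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source: Any, handler: GetCoreSchemaHandler
    ) -> CoreSchema:
        # A plain validator takes over all input handling for this type.
        return core_schema.no_info_plain_validator_function(
            cls._validate,
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda color: color.to_hex(), info_arg=False
            ),
        )


class Theme(BaseModel):
    accent: Color


theme = Theme(accent="#f00")
print(theme.accent.to_rgb(), theme.model_dump())
```

Serializing to the hex string keeps dumps round-trippable, since the same string is accepted again on input.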
Checkpoint: All types implemented with documentation.
Testing Strategy
Unit Tests
# tests/test_credit_card.py
import pytest
from pydantic import BaseModel, ValidationError
from pydantic_types import CreditCard

class Payment(BaseModel):
    card: CreditCard

class TestCreditCardValidation:
    def test_valid_visa(self):
        payment = Payment(card="4111111111111111")
        assert payment.card.card_type == "visa"
        assert payment.card.last_four == "1111"

    def test_valid_mastercard(self):
        payment = Payment(card="5555555555554444")
        assert payment.card.card_type == "mastercard"

    def test_valid_amex(self):
        payment = Payment(card="378282246310005")
        assert payment.card.card_type == "amex"

    def test_masked_output(self):
        payment = Payment(card="4111111111111111")
        assert payment.card.masked == "************1111"

    def test_invalid_luhn(self):
        with pytest.raises(ValidationError) as exc_info:
            Payment(card="4111111111111112")  # Invalid checksum
        # Assert outside the with block so the check actually runs
        assert "luhn" in str(exc_info.value).lower()

    def test_formats_with_spaces(self):
        payment = Payment(card="4111 1111 1111 1111")
        assert payment.card.number == "4111111111111111"

    def test_formats_with_dashes(self):
        payment = Payment(card="4111-1111-1111-1111")
        assert payment.card.number == "4111111111111111"

# tests/test_money.py
import pytest
from decimal import Decimal
from pydantic import BaseModel, ValidationError
from pydantic_types import Money

class Order(BaseModel):
    total: Money

class TestMoneyValidation:
    def test_from_number(self):
        order = Order(total=99.99)
        assert order.total.amount == Decimal("99.99")
        assert order.total.currency == "USD"

    def test_from_string(self):
        order = Order(total="EUR 50.00")
        assert order.total.currency == "EUR"
        assert order.total.amount == Decimal("50.00")

    def test_from_dict(self):
        order = Order(total={"amount": "100", "currency": "GBP"})
        assert order.total.currency == "GBP"

    def test_arithmetic(self):
        m1 = Money(Decimal("10.00"), "USD")
        m2 = Money(Decimal("5.50"), "USD")
        result = m1 + m2
        assert result.amount == Decimal("15.50")
        result = m1 - m2
        assert result.amount == Decimal("4.50")
        result = m1 * 2
        assert result.amount == Decimal("20.00")

    def test_cannot_mix_currencies(self):
        m1 = Money(Decimal("10.00"), "USD")
        m2 = Money(Decimal("5.00"), "EUR")
        with pytest.raises(ValueError, match="Cannot add"):
            m1 + m2

# tests/test_phone.py
import pytest
from pydantic import BaseModel, ValidationError
from pydantic_types import PhoneNumber

class Contact(BaseModel):
    phone: PhoneNumber

class TestPhoneNumber:
    @pytest.mark.parametrize("input_value,expected", [
        ("(555) 123-4567", "+15551234567"),
        ("555-123-4567", "+15551234567"),
        ("555.123.4567", "+15551234567"),
        ("5551234567", "+15551234567"),
        ("+1 555 123 4567", "+15551234567"),
        ("+44 20 7946 0958", "+442079460958"),
    ])
    def test_valid_formats(self, input_value, expected):
        contact = Contact(phone=input_value)
        assert contact.phone == expected

    def test_too_short(self):
        with pytest.raises(ValidationError):
            Contact(phone="123456")
Integration Tests
# tests/test_integration.py
from decimal import Decimal
from pydantic import BaseModel
from pydantic_types import PhoneNumber, CreditCard, Money, DomainEmail

class PaymentForm(BaseModel):
    customer_email: DomainEmail['example.com']
    phone: PhoneNumber
    card: CreditCard
    amount: Money

class TestPaymentForm:
    def test_complete_form(self):
        form = PaymentForm(
            customer_email="john@example.com",
            phone="(555) 123-4567",
            card="4111-1111-1111-1111",
            amount="USD 99.99"
        )
        assert form.customer_email == "john@example.com"
        assert form.phone == "+15551234567"
        assert form.card.card_type == "visa"
        assert form.card.masked == "************1111"
        assert form.amount.amount == Decimal("99.99")

    def test_serialization(self):
        form = PaymentForm(
            customer_email="john@example.com",
            phone="+15551234567",
            card="4111111111111111",
            amount=99.99
        )
        data = form.model_dump()
        assert 'customer_email' in data
        assert 'phone' in data
        # Can recreate from serialized data
        form2 = PaymentForm.model_validate(data)
        assert form2.phone == form.phone
Property-Based Testing
# tests/test_properties.py
import random
from hypothesis import given, strategies as st
from pydantic_types.credit_card import luhn_checksum

@given(st.text(alphabet='0123456789', min_size=13, max_size=19))
def test_luhn_checksum_deterministic(card_number):
    """Luhn checksum always returns the same result for the same input."""
    result1 = luhn_checksum(card_number)
    result2 = luhn_checksum(card_number)
    assert result1 == result2

def generate_valid_card():
    """Generate a valid credit card number using Luhn algorithm."""
    # Start with 15 random digits
    digits = [random.randint(0, 9) for _ in range(15)]
    # Try each check digit; exactly one of the ten satisfies Luhn
    partial = ''.join(map(str, digits))
    for check in range(10):
        if luhn_checksum(partial + str(check)):
            return partial + str(check)

@given(st.builds(generate_valid_card))
def test_valid_cards_pass_luhn(card_number):
    """Generated valid cards always pass Luhn check."""
    assert luhn_checksum(card_number) is True
Common Pitfalls and Debugging
Pitfall 1: Validator Order Matters
Problem: Validators attached through Annotated do not simply run top to bottom, so reading a declaration literally can mislead you.
# MISLEADING: reads as if validate runs first, but Pydantic still runs the
# before validator ahead of core validation and the after validator behind it
Email = Annotated[
    str,
    AfterValidator(validate_email),    # Still runs after core validation
    BeforeValidator(normalize_email),  # Still runs first
]
# CLEARER: declare validators in the order they execute
Email = Annotated[
    str,
    BeforeValidator(normalize_email),  # Runs first
    AfterValidator(validate_email),    # Runs second
]
Solution: Remember the order:
- All BeforeValidators and WrapValidators (right to left: the last one declared runs first)
- Core type validation
- All AfterValidators (left to right, in declaration order)
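The execution order is easy to check empirically by recording each call. The sketch below uses a hypothetical trace helper; in Pydantic V2, before validators declared later run first, while after validators run in the order they are declared.

```python
from typing import Annotated

from pydantic import AfterValidator, BeforeValidator, TypeAdapter

calls: list[str] = []


def trace(label: str):
    """Return a pass-through validator that records when it runs (illustrative helper)."""

    def validator(value):
        calls.append(label)
        return value

    return validator


Traced = Annotated[
    str,
    BeforeValidator(trace("before-1")),
    BeforeValidator(trace("before-2")),
    AfterValidator(trace("after-1")),
    AfterValidator(trace("after-2")),
]

TypeAdapter(Traced).validate_python("x")
print(calls)  # ['before-2', 'before-1', 'after-1', 'after-2']
```

If a normalization step depends on another before validator having run, declare it earlier in the Annotated list so that it runs later.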
Pitfall 2: Modifying Values in After Validators
Problem: An after validator can return a transformed value, but by then core validation and any built-in constraints have already run against the untransformed input.
# WRONG PLACE: normalizing after validation
@field_validator('email', mode='after')
@classmethod
def lowercase_email(cls, v: str) -> str:
    # This works, but conceptually wrong place
    return v.lower()

# CORRECT: Use before validator for transformations
@field_validator('email', mode='before')
@classmethod
def lowercase_email(cls, v):
    # v is raw input here, so guard against non-string values
    if isinstance(v, str):
        return v.lower()
    return v
Solution: Use before validators for transformations, after validators for validation.
Pitfall 3: ValidationInfo.data May Not Have All Fields
Problem: In field validators, info.data only contains previously validated fields.
class User(BaseModel):
    password: str
    password_confirm: str
    @field_validator('password')
    @classmethod
    def validate_password(cls, v, info):
        # info.data does NOT contain password_confirm yet!
        # password_confirm hasn't been validated
        if 'password_confirm' in info.data:  # This will be False!
            ...
        return v
Solution: Use @model_validator(mode='after') for cross-field validation, or validate in the later field's validator:
@field_validator('password_confirm')
@classmethod
def passwords_match(cls, v, info):
    # Now info.data contains 'password' because it was validated first
    if 'password' in info.data and v != info.data['password']:
        raise ValueError('passwords do not match')
    return v
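The field-level version above depends on declaration order. A model-level check does not, because it runs after every field has been validated. A minimal sketch, assuming Pydantic V2 and using a hypothetical SignupForm model:

```python
from pydantic import BaseModel, ValidationError, model_validator


class SignupForm(BaseModel):
    password: str
    password_confirm: str

    @model_validator(mode="after")
    def passwords_match(self) -> "SignupForm":
        # All fields are populated here, regardless of declaration order.
        if self.password != self.password_confirm:
            raise ValueError("passwords do not match")
        return self


form = SignupForm(password="s3cret", password_confirm="s3cret")

try:
    SignupForm(password="s3cret", password_confirm="typo")
except ValidationError as exc:
    print(exc.errors()[0]["msg"])
```

Note that an after-mode model validator only runs if every field passed its own validation, so field-level errors are reported first.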
Pitfall 4: Custom Types Not Serializing Properly
Problem: Custom class doesn't serialize to JSON correctly.
class Money:
    def __init__(self, amount, currency):
        self.amount = amount
        self.currency = currency
# When serializing (with arbitrary_types_allowed=True so the field is accepted at all):
# model_dump() returns the Money object, not a dict
# model_dump_json() raises PydanticSerializationError ("Unable to serialize unknown type")
Solution: Add serialization to your core schema:
@classmethod
def __get_pydantic_core_schema__(cls, source, handler):
    return core_schema.no_info_after_validator_function(
        cls._validate,
        core_schema.union_schema([...]),
        # Add serialization!
        serialization=core_schema.plain_serializer_function_ser_schema(
            lambda m: {"amount": str(m.amount), "currency": m.currency},
            info_arg=False,
        ),
    )
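To see the serializer in a complete class, here is a self-contained sketch. It is an assumption-laden simplification: it uses no_info_plain_validator_function instead of the after-validator over a union shown above, the _validate logic and the Invoice model are illustrative, and only Money instances or dicts are accepted as input.

```python
from decimal import Decimal, InvalidOperation
from typing import Any

from pydantic import BaseModel, GetCoreSchemaHandler
from pydantic_core import core_schema


class Money:
    def __init__(self, amount: Decimal, currency: str) -> None:
        self.amount = amount
        self.currency = currency

    @classmethod
    def _validate(cls, value: Any) -> "Money":
        if isinstance(value, cls):
            return value
        if isinstance(value, dict) and {"amount", "currency"} <= value.keys():
            try:
                amount = Decimal(str(value["amount"]))
            except InvalidOperation:
                raise ValueError("amount is not a valid decimal") from None
            return cls(amount, str(value["currency"]))
        raise ValueError("expected Money or a dict with 'amount' and 'currency'")

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source: Any, handler: GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        return core_schema.no_info_plain_validator_function(
            cls._validate,
            # Emit a plain dict for both model_dump() and model_dump_json().
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda m: {"amount": str(m.amount), "currency": m.currency},
                info_arg=False,
            ),
        )


class Invoice(BaseModel):
    total: Money


invoice = Invoice(total={"amount": "19.99", "currency": "USD"})
print(invoice.model_dump())
print(invoice.model_dump_json())
```

Serializing the amount as a string avoids float rounding, and because the validator accepts the same dict shape, the dumped data can be fed back into model_validate.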
Pitfall 5: Generic Types Losing Type Information
Problem: DomainEmail['company.com'] loses the domain at runtime.
# WRONG: Type parameter not accessible
class DomainEmail(Generic[T]):
    pass
# At runtime, you can't easily get 'company.com' from DomainEmail['company.com']
Solution: Use a metaclass __getitem__ (or __class_getitem__ on the class itself) to capture the parameter:
class DomainEmailMeta(type):
    # Values are Annotated aliases, one per domain
    _cache: dict[str, object] = {}
    def __getitem__(cls, domain: str):
        if domain not in cls._cache:
            def validate(v: str) -> str:
                if not v.endswith(f'@{domain}'):
                    raise ValueError(f'Email must end with @{domain}')
                return v
            # Create new annotated type with captured domain
            cls._cache[domain] = Annotated[str, AfterValidator(validate)]
        return cls._cache[domain]
class DomainEmail(metaclass=DomainEmailMeta):
    pass
# Now DomainEmail['company.com'] creates a proper validator
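For completeness, here is how the parameterized type might be used in a model. The metaclass is repeated so the sketch runs on its own; CompanyEmail and Employee are illustrative names, and Pydantic V2 is assumed.

```python
from typing import Annotated

from pydantic import AfterValidator, BaseModel, ValidationError


class DomainEmailMeta(type):
    _cache: dict = {}

    def __getitem__(cls, domain: str):
        if domain not in cls._cache:
            def validate(value: str) -> str:
                if not value.endswith(f"@{domain}"):
                    raise ValueError(f"Email must end with @{domain}")
                return value
            # Each domain gets its own Annotated alias, built once and reused.
            cls._cache[domain] = Annotated[str, AfterValidator(validate)]
        return cls._cache[domain]


class DomainEmail(metaclass=DomainEmailMeta):
    pass


# Assigning the alias to a name keeps the annotation readable.
CompanyEmail = DomainEmail["company.com"]


class Employee(BaseModel):
    email: CompanyEmail


print(Employee(email="ada@company.com").email)

try:
    Employee(email="ada@other.org")
except ValidationError as exc:
    print(exc.errors()[0]["msg"])
```

Static type checkers generally will not understand a string subscript like this, so binding the result to an alias such as CompanyEmail is one way to keep annotations tidy.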
Debugging Tips
- Print the core schema:
from pydantic import TypeAdapter
adapter = TypeAdapter(YourType)
print(adapter.core_schema)
- Use validation context for debugging:
model.model_validate(data, context={'debug': True})
- Check error details:
try:
    Model(**data)
except ValidationError as e:
    for error in e.errors():
        print(f"Field: {error['loc']}")
        print(f"Input: {error['input']}")
        print(f"Type: {error['type']}")
Extensions and Challenges
Extension 1: Configurable Phone Number Type
Create a phone number type that accepts configuration:
from pydantic_types import PhoneNumber
# Configure default country
USPhone = PhoneNumber.configure(country='US', format='national')
UKPhone = PhoneNumber.configure(country='GB', format='international')
class Contact(BaseModel):
    us_phone: USPhone  # Validates as US number
    uk_phone: UKPhone  # Validates as UK number
Challenge: Implement the configure method using class factories.
Extension 2: Currency Conversion
Add currency conversion to the Money type:
from decimal import Decimal
from pydantic_types import Money
Money.set_exchange_rates({
    ('USD', 'EUR'): Decimal('0.85'),
    ('EUR', 'USD'): Decimal('1.18'),
})
usd = Money(Decimal('100'), 'USD')
eur = usd.convert_to('EUR') # EUR 85.00
Challenge: Handle rate updates and missing rates gracefully.
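As a starting point for this extension, the sketch below shows one possible shape for set_exchange_rates and convert_to using only the standard library. Everything beyond those two names is an assumption: the frozen dataclass, the MissingRateError exception, and rounding to two decimal places are design choices for illustration, not the project's implementation.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP
from typing import ClassVar


class MissingRateError(LookupError):
    """Hypothetical error raised when no rate exists for a currency pair."""


@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str
    # Shared rate table keyed by (source, target) currency codes.
    _rates: ClassVar[dict[tuple[str, str], Decimal]] = {}

    @classmethod
    def set_exchange_rates(cls, rates: dict[tuple[str, str], Decimal]) -> None:
        # Replace the table wholesale so stale pairs do not linger.
        cls._rates = dict(rates)

    def convert_to(self, target: str) -> "Money":
        if target == self.currency:
            return self
        try:
            rate = self._rates[(self.currency, target)]
        except KeyError:
            raise MissingRateError(
                f"no exchange rate for {self.currency} -> {target}"
            ) from None
        converted = (self.amount * rate).quantize(
            Decimal("0.01"), rounding=ROUND_HALF_UP
        )
        return Money(converted, target)


Money.set_exchange_rates({
    ("USD", "EUR"): Decimal("0.85"),
    ("EUR", "USD"): Decimal("1.18"),
})
eur = Money(Decimal("100"), "USD").convert_to("EUR")
print(eur)
```

A class-level rate table is simple but global; passing a rate provider into convert_to would be easier to test and to update concurrently.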
Extension 3: Async Validators
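Create validators that can perform async operations. Pydantic V2 runs validators synchronously and does not ship a model_validate_async method, so the API below is a design target that you would implement yourself:
from pydantic_types import AsyncEmail
class User(BaseModel):
    email: AsyncEmail  # Checks if email exists via API
async def create_user(data):
    # model_validate_async is the method to build in this extension
    user = await User.model_validate_async(data)
    # Email has been verified to exist
Challenge: Combine Pydantic's synchronous validation with an awaited verification step without blocking the event loop.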
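Because Pydantic V2 validators run synchronously, one possible approach is a two-phase design: validate the shape of the data first, then await the external check. The sketch below assumes that design. The email_exists coroutine is a hypothetical stand-in for a real API call, and model_validate_async is a method defined here, not a Pydantic feature.

```python
import asyncio

from pydantic import BaseModel


async def email_exists(email: str) -> bool:
    """Hypothetical stand-in for a network lookup."""
    await asyncio.sleep(0)
    return not email.endswith("@invalid.example")


class User(BaseModel):
    email: str

    @classmethod
    async def model_validate_async(cls, data: dict) -> "User":
        # Phase 1: ordinary synchronous Pydantic validation.
        user = cls.model_validate(data)
        # Phase 2: awaited checks that need I/O.
        if not await email_exists(user.email):
            raise ValueError(f"email {user.email!r} does not exist")
        return user


async def create_user(data: dict) -> User:
    return await User.model_validate_async(data)


user = asyncio.run(create_user({"email": "ada@example.com"}))
print(user.email)
```

Keeping the awaited step outside the validators means the model stays usable in synchronous code, at the cost of the existence check not appearing in ValidationError output.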
Extension 4: Validation with External Services
Create a credit card type that validates with a payment processor:
class VerifiedCreditCard(CreditCard):
    """Card validated against payment processor."""
    @classmethod
    async def verify_with_stripe(cls, card: CreditCard) -> 'VerifiedCreditCard':
        # Call Stripe API to verify card
        result = await stripe.tokens.create(card=card.to_stripe_format())
        if result.card.cvc_check == 'fail':
            raise ValueError('CVC check failed')
        return cls(card.number, verified=True)
Challenge: Handle API failures and rate limiting.
Extension 5: Type Composition
Create a system for composing validators:
from pydantic_types import compose, NonEmpty, Lowercase, TrimWhitespace, Pattern
# Compose multiple transformations
Username = compose(
    str,
    TrimWhitespace,
    Lowercase,
    NonEmpty,
    Pattern(r'^[a-z0-9_]+$')
)
class User(BaseModel):
    username: Username  # All validators applied
Challenge: Make composition type-safe and preserve error messages.
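One simple way to approach composition is to treat every step as an AfterValidator and let compose build an Annotated type from them. The sketch below is only one possible design under that assumption; compose, TrimWhitespace, Lowercase, NonEmpty, and Pattern are the hypothetical names from the example above, implemented here for illustration.

```python
import re
from typing import Annotated, Any

from pydantic import AfterValidator, TypeAdapter, ValidationError


def compose(base: Any, *steps: Any) -> Any:
    """Build Annotated[base, step1, step2, ...] from a variable number of steps."""
    return Annotated[(base, *steps)]


def _non_empty(value: str) -> str:
    if not value:
        raise ValueError("value must not be empty")
    return value


def Pattern(regex: str) -> AfterValidator:
    compiled = re.compile(regex)  # compiled once per composed type

    def check(value: str) -> str:
        if not compiled.fullmatch(value):
            raise ValueError(f"value does not match {regex}")
        return value

    return AfterValidator(check)


# After validators run in listing order, so steps execute as written.
TrimWhitespace = AfterValidator(lambda v: v.strip())
Lowercase = AfterValidator(lambda v: v.lower())
NonEmpty = AfterValidator(_non_empty)

Username = compose(
    str,
    TrimWhitespace,
    Lowercase,
    NonEmpty,
    Pattern(r"^[a-z0-9_]+$"),
)

adapter = TypeAdapter(Username)
print(adapter.validate_python("  Alice_01 "))
```

Because each step raises its own ValueError, the message of the first failing step is what appears in the validation error, which goes some way toward the "preserve error messages" part of the challenge.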
Real-World Connections
Where This Pattern Appears
- Payment Processing: Stripe, Square, and payment APIs use similar validation patterns for card numbers, amounts, and currencies.
- Form Libraries: Django forms, WTForms, and React Hook Form implement custom field types with validators.
- API Frameworks: FastAPI and Flask-RESTX use similar patterns for request validation.
- Data Pipelines: Apache Beam and Airflow validate data types in ETL processes.
Industry Examples
- Stripe Python SDK: Custom types for CreditCard, BankAccount, Money with validation
- phonenumbers library: Google's library for phone number parsing (can integrate with Pydantic)
- python-money: Django-based money handling with currency support
- email-validator: Used by Pydantic's built-in EmailStr
Production Considerations
- Performance: Cache compiled regex patterns
- Localization: Support international formats (phone, dates, currencies)
- Security: Never log full credit card numbers
- Extensibility: Design types for subclassing
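Two of these considerations, caching compiled patterns and never logging full card numbers, can be illustrated with a small standard-library helper. The function name mask_card_number and the choice to keep only the last four digits are assumptions for this sketch.

```python
import re

# Compiled once at import time instead of on every call.
_NON_DIGITS = re.compile(r"\D")


def mask_card_number(number: str) -> str:
    """Return a log-safe form of a card number, keeping only the last four digits."""
    digits = _NON_DIGITS.sub("", number)
    if len(digits) <= 4:
        # Too short to reveal anything safely; mask everything.
        return "*" * len(digits)
    return "*" * (len(digits) - 4) + digits[-4:]


print(mask_card_number("4111 1111 1111 1111"))
```

Giving a custom card type a __repr__ that returns the masked form is one way to keep full numbers out of logs and tracebacks by default.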
Self-Assessment Checklist
Core Understanding
- Can I explain the difference between before, after, and wrap validators?
- Can I describe when to use @field_validator vs @model_validator?
- Can I implement __get_pydantic_core_schema__ for a custom class?
- Can I explain how Annotated types work in Pydantic V2?
- Can I describe the validation pipeline order?
Implementation Skills
- Can I create a reusable Annotated type with validators?
- Can I implement the Luhn algorithm for credit card validation?
- Can I handle multiple input formats for a single type?
- Can I add proper JSON serialization to custom types?
- Can I create parameterized types like DomainEmail['domain.com']?
Design Understanding
- Can I decide when to use Annotated types vs custom classes?
- Can I design validators that are testable and reusable?
- Can I provide helpful error messages for validation failures?
- Can I handle edge cases (empty strings, None, wrong types)?
Mastery Indicators
- Types handle all reasonable input formats
- Error messages are clear and actionable
- Types are well-documented with examples
- Code is type-safe (passes mypy)
- Types work correctly with JSON serialization
- Performance is acceptable for production use
Resources
Documentation
Books
- "Fluent Python" by Luciano Ramalho - Chapter 8: Type Hints in Functions
- "Robust Python" by Patrick Viafore - Chapter 5: Annotated Types
- "Python Type Checking" by Dusty Phillips - Custom Types
Libraries to Study
- pydantic-extra-types - Official extra types
- phonenumbers - Phone parsing
- email-validator - Email validation
- python-money - Money handling
Articles
- Pydantic V2: What's Changed
- Building Custom Types in Pydantic V2 (Check official blog)
- Type-Safe Python with Pydantic