Project 3: SPF Record Parser and Validator

Build a full SPF evaluator that resolves includes and determines whether an IP address is authorized to send mail for a domain.

Quick Reference

Attribute        Value
Difficulty       Advanced
Time Estimate    2-3 weeks
Language         Python (Alternatives: Go, Rust, C)
Prerequisites    DNS basics, CIDR math, parsing
Key Topics       SPF grammar, recursive includes, DNS limits

1. Learning Objectives

  1. Parse SPF TXT records into structured mechanisms.
  2. Implement SPF evaluation logic with qualifiers.
  3. Resolve include and redirect chains within DNS lookup limits.
  4. Return correct results for pass, fail, softfail, neutral, none, temperror, and permerror.

2. Theoretical Foundation

2.1 Core Concepts

  • SPF records: TXT records starting with v=spf1 containing mechanisms and modifiers.
  • Qualifiers: + pass, - fail, ~ softfail, ? neutral.
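  • Mechanisms: all, include, a, mx, ptr (discouraged by RFC 7208), ip4, ip6, exists.
  • DNS lookup limit: At most 10 DNS-querying terms (include, a, mx, ptr, exists, and the redirect modifier) per evaluation, to prevent abuse; exceeding the limit yields permerror.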
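
The qualifier rule above can be sketched in a few lines. This is a minimal illustration, not a full parser; the names QUALIFIER_RESULTS and split_qualifier are made up for this example. It relies on the RFC 7208 rule that a mechanism with no prefix is treated as +.

```python
# Qualifier prefixes and the result each one yields when its mechanism matches.
QUALIFIER_RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def split_qualifier(term: str) -> tuple[str, str]:
    """Return (qualifier, rest); a missing prefix means '+'."""
    if term and term[0] in QUALIFIER_RESULTS:
        return term[0], term[1:]
    return "+", term


for term in ["ip4:192.0.2.0/24", "~all", "-all", "?include:example.net"]:
    qualifier, rest = split_qualifier(term)
    print(f"{term!r}: {rest!r} yields {QUALIFIER_RESULTS[qualifier]} on match")
```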

2.2 Why This Matters

SPF is typically the first check a receiver runs against envelope-sender spoofing. A correct evaluator is required for deliverability tools, authentication dashboards, and mail server policy decisions.

2.3 Historical Context / Background

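SPF grew out of early anti-spoofing proposals such as Reverse MX (RMX) and Designated Mailers Protocol (DMP); Sender ID was a later derivative of SPF, not its predecessor. SPF was designed to be DNS-based and simple to deploy, and its lookup limit reflects real-world abuse concerns.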

2.4 Common Misconceptions

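  • Misconception: SPF is about message content. Reality: It checks the connecting IP against the domain in the envelope sender (MAIL FROM) or HELO identity, not the message body or the From: header.
  • Misconception: SPF is deterministic. Reality: The result depends on live DNS; timeouts yield temperror, and malformed or over-limit records yield permerror.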

3. Project Specification

3.1 What You Will Build

A CLI tool that accepts a domain and an IP address, fetches the SPF record, recursively resolves mechanisms, and prints the final result with an evaluation trace.

3.2 Functional Requirements

  1. SPF TXT lookup for a domain.
  2. Parser for mechanisms and modifiers.
  3. Evaluation engine implementing RFC 7208 semantics.
  4. DNS lookup counter enforcing limit.
  5. Trace output showing which mechanism matched.

3.3 Non-Functional Requirements

  • Performance: Resolve a typical SPF record in under 1 second.
  • Reliability: Clear errors for permerror and temperror.
  • Usability: Output should explain match path.

3.4 Example Usage / Output

$ ./spf-check google.com 209.85.220.41
SPF record: v=spf1 include:_spf.google.com ~all
Include: _spf.google.com
Match: ip4:209.85.128.0/17
Result: pass
Lookups: 4/10

3.5 Real World Outcome

You can prove whether a sending IP is authorized for a domain and explain which SPF mechanism caused the decision.


4. Solution Architecture

4.1 High-Level Design

CLI
  -> DNS Resolver
  -> SPF Parser
  -> Evaluator
  -> Trace Reporter

4.2 Key Components

Component    Responsibility                      Key Decisions
Parser       Tokenize SPF mechanisms             Simple whitespace-splitting lexer
Evaluator    Apply qualifiers and match logic    Left-to-right evaluation
Resolver     DNS lookups with counter            Centralized lookup budget
Trace        Explain decisions                   Store match path

4.3 Data Structures

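class Mechanism:
    """One parsed SPF mechanism, e.g. '~include:_spf.example.com'."""

    def __init__(self, qualifier, kind, value):
        self.qualifier = qualifier  # '+', '-', '~', or '?'
        self.kind = kind            # 'ip4', 'ip6', 'a', 'mx', 'include', 'all', ...
        self.value = value          # text after ':' (empty for a bare 'all')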
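
One possible way to fill this structure from a raw record is sketched below. The function name parse_record and the choice to return modifiers as a dict are assumptions for illustration; macro expansion and strict grammar checks are deliberately left out.

```python
class Mechanism:
    """Same shape as the class above, repeated so the sketch runs on its own."""

    def __init__(self, qualifier, kind, value):
        self.qualifier = qualifier
        self.kind = kind
        self.value = value


def parse_record(record):
    """Split an SPF record into (mechanisms, modifiers)."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF record")
    mechanisms, modifiers = [], {}
    for term in terms[1:]:
        # Modifiers look like name=value; a ':' or '/' before the '=' means
        # the term is a mechanism whose argument happens to contain '='.
        name, eq, value = term.partition("=")
        if eq and ":" not in name and "/" not in name:
            modifiers[name.lower()] = value
            continue
        has_prefix = term[0] in "+-~?"
        qualifier = term[0] if has_prefix else "+"
        body = term[1:] if has_prefix else term
        kind, sep, value = body.partition(":")
        if not sep and "/" in kind:
            # e.g. "a/24": keep the CIDR suffix as the value.
            kind, _, cidr = kind.partition("/")
            value = "/" + cidr
        mechanisms.append(Mechanism(qualifier, kind.lower(), value))
    return mechanisms, modifiers
```

Splitting on the first ':' only keeps IPv6 values such as ip6:2001:db8::/32 intact.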

4.4 Algorithm Overview

Key Algorithm: SPF Evaluation

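  1. Start with the domain's v=spf1 record.
  2. For each mechanism, left to right:
    • Resolve if needed (a, mx, include, exists), charging each to the lookup budget.
    • Check for a match against the IP; an include matches only if the included record evaluates to pass.
    • If it matches, return the result for its qualifier.
  3. If no mechanism matches, follow redirect= if present; otherwise return neutral.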

Complexity Analysis:

  • Time: O(m + d), where m is the number of mechanisms and d is the number of DNS lookups
  • Space: O(m)
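
The left-to-right loop can be sketched without any DNS by restricting it to ip4, ip6, and all. This is a simplified illustration under that assumption; the function name evaluate and the (qualifier, kind, value) tuple shape are choices made for this example.

```python
import ipaddress

RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def evaluate(mechanisms, ip):
    """Evaluate (qualifier, kind, value) tuples left to right.

    Only ip4, ip6, and all are handled; a, mx, include, and exists need
    DNS and are left out of this sketch.
    """
    addr = ipaddress.ip_address(ip)
    trace = []
    for qualifier, kind, value in mechanisms:
        if kind == "all":
            matched = True
        elif kind in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            matched = addr.version == network.version and addr in network
        else:
            raise NotImplementedError(f"{kind} needs DNS")
        trace.append((kind, value, matched))
        if matched:
            # First match wins; its qualifier decides the result.
            return RESULTS[qualifier], trace
    # Nothing matched and this sketch has no redirect: neutral.
    return "neutral", trace
```

The returned trace records every mechanism tried, which is enough to print the match path the CLI is supposed to show.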

5. Implementation Guide

5.1 Development Environment Setup

python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip

5.2 Project Structure

spf-validator/
├── spf_parser.py
├── spf_eval.py
├── dns_client.py
└── README.md

5.3 The Core Question You’re Answering

“Given a domain and an IP, should this sender be trusted by SPF rules?”

5.4 Concepts You Must Understand First

Stop and research these before coding:

  1. SPF Grammar
    • Mechanisms vs modifiers
    • Qualifier prefixes
  2. CIDR Matching
    • IP range math
    • IPv4 vs IPv6
  3. DNS Lookup Limits
    • Why the 10-lookup limit exists
    • How include and mx consume lookups
  4. Return Codes
    • none, neutral, pass, fail, softfail, temperror, permerror
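
For the CIDR item, Python's standard ipaddress module already does the range math. The sketch below is one way to use it; the helper names normalize and in_cidr are made up here. It also applies the RFC 7208 rule that an IPv4-mapped IPv6 address is evaluated as IPv4.

```python
import ipaddress


def normalize(ip):
    """Parse an address; treat IPv4-mapped IPv6 (::ffff:a.b.c.d) as IPv4."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 6 and addr.ipv4_mapped is not None:
        return addr.ipv4_mapped
    return addr


def in_cidr(ip, cidr):
    """True if ip falls inside cidr; different address families never match."""
    addr = normalize(ip)
    # strict=False accepts networks written with host bits set.
    network = ipaddress.ip_network(cidr, strict=False)
    return addr.version == network.version and addr in network
```

With these helpers, in_cidr("209.85.220.41", "209.85.128.0/17") is true, while an IPv6 client never matches an ip4 range.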

5.5 Questions to Guide Your Design

  1. How will you avoid infinite recursion in includes?
  2. What should happen if DNS times out?
  3. How will you count lookups for mx and a mechanisms?
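
For the first question, one option is to track the chain of domains currently being evaluated and stop when a domain reappears. The sketch below assumes a hypothetical includes dict (domain to list of included domains) in place of real DNS; IncludeLoop and include_chain are names invented for this example.

```python
class IncludeLoop(Exception):
    """Raised when an include chain revisits a domain it is already inside."""


def include_chain(domain, includes, stack=()):
    """Walk include targets depth-first and return them in visit order.

    `includes` is a hypothetical dict standing in for parsed TXT records.
    """
    if domain in stack:
        raise IncludeLoop(" -> ".join(stack + (domain,)))
    visited = [domain]
    for target in includes.get(domain, []):
        visited += include_chain(target, includes, stack + (domain,))
    return visited
```

The lookup budget also bounds runaway recursion, so a real evaluator could rely on it alone and report permerror; an explicit check just gives a clearer trace message.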

5.6 Thinking Exercise

If a record is:

v=spf1 include:a.example.com include:b.example.com -all

What happens if a.example.com returns permerror? Does evaluation continue?
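
To check your answer, RFC 7208 section 5.2 defines how the result of the included record maps onto the include mechanism. The table below restates that mapping as data; the names INCLUDE_OUTCOME and apply_include are illustrative only.

```python
# Result of evaluating the included record -> effect on the include mechanism.
INCLUDE_OUTCOME = {
    "pass": "match",
    "fail": "no-match",
    "softfail": "no-match",
    "neutral": "no-match",
    "temperror": "temperror",   # evaluation stops with temperror
    "permerror": "permerror",   # evaluation stops with permerror
    "none": "permerror",        # included domain has no SPF record
}


def apply_include(inner_result):
    """Return 'match', 'no-match', or the error that ends evaluation."""
    return INCLUDE_OUTCOME[inner_result]
```

On "no-match" the evaluator moves on to the next mechanism; on an error it stops immediately.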

5.7 The Interview Questions They’ll Ask

  1. “What is the SPF lookup limit and why does it exist?”
  2. “How does include work and when does it match?”
  3. “Why is SPF not sufficient without DKIM and DMARC?”

5.8 Hints in Layers

Hint 1: Build a token list

  • Split on whitespace and parse the qualifier prefix.

Hint 2: Centralize DNS lookups

  • Every resolver call should update a shared counter.
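
A shared counter could look like the sketch below. The class name BudgetedResolver and the injected query callable are assumptions for illustration; injecting the callable keeps the sketch independent of any particular DNS library and makes it easy to test with a fake.

```python
class LookupLimitExceeded(Exception):
    """Raised when an evaluation needs more than the allowed DNS lookups."""


class BudgetedResolver:
    def __init__(self, query, limit=10):
        self._query = query  # callable: (name, rtype) -> list of answers
        self.limit = limit
        self.used = 0

    def lookup(self, name, rtype, counted=True):
        """Run one query and charge it to the budget.

        counted=False is meant for the initial TXT fetch, which this
        sketch assumes does not count toward the limit.
        """
        if counted:
            if self.used >= self.limit:
                raise LookupLimitExceeded(f"more than {self.limit} lookups")
            self.used += 1
        return self._query(name, rtype)
```

The evaluator would catch LookupLimitExceeded at the top level and report permerror, and the used counter can feed the "Lookups: n/10" line in the trace.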

Hint 3: Make evaluation traceable

  • Keep a list of evaluation steps for output.

5.9 Books That Will Help

Topic       Book                                Chapter
SPF spec    RFC 7208                            Sections 4-8
DNS         TCP/IP Illustrated Vol 1            Ch. 11
Parsing     Language Implementation Patterns    Ch. 3

5.10 Implementation Phases

Phase 1: Foundation (3-4 days)

Goals:

  • Fetch SPF TXT record
  • Parse mechanisms

Tasks:

  1. TXT lookup and record selection.
  2. Parse mechanism tokens.

Checkpoint: Print parsed mechanisms for a domain.
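
Record selection in task 1 follows RFC 7208 section 4.5: keep only TXT records whose version is exactly v=spf1, return none if there is no such record, and permerror if there is more than one. A minimal sketch, assuming each TXT record has already been joined into a single string (the function name select_spf_record is made up here):

```python
def select_spf_record(txt_records):
    """Pick the SPF record from a domain's TXT strings.

    Returns ("ok", record), ("none", None), or ("permerror", None).
    """
    candidates = [
        txt for txt in txt_records
        # The version must be followed by a space or the end of the record,
        # so "v=spf10 ..." is not an SPF record.
        if txt.lower() == "v=spf1" or txt.lower().startswith("v=spf1 ")
    ]
    if not candidates:
        return "none", None
    if len(candidates) > 1:
        return "permerror", None
    return "ok", candidates[0]
```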

Phase 2: Core Functionality (5-7 days)

Goals:

  • Implement evaluation with DNS lookup budget

Tasks:

  1. Implement ip4 and ip6 matching.
  2. Add include and mx resolution.

Checkpoint: Correctly evaluate a known SPF record.

Phase 3: Polish and Edge Cases (3-4 days)

Goals:

  • Handle errors and trace output

Tasks:

  1. Add permerror and temperror handling.
  2. Provide evaluation trace.

Checkpoint: Output explains match or failure path.

5.11 Key Implementation Decisions

Decision        Options                      Recommendation      Rationale
Parser          Regex vs manual              Manual tokenizer    Easier to handle qualifiers
DNS library     system resolver vs custom    system resolver     SPF logic is main focus
Trace output    verbose vs concise           verbose option      Useful for debugging

6. Testing Strategy

6.1 Test Categories

Category             Purpose               Examples
Unit Tests           Parser correctness    Qualifiers and modifiers
Integration Tests    Real SPF records      google.com, yahoo.com
Edge Case Tests      DNS failures          timeout, NXDOMAIN

6.2 Critical Test Cases

  1. Lookup limit exceeded returns permerror.
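  2. An include whose record evaluates to pass counts as a match and returns the include's own qualifier result (pass by default).
  3. A match on a ~ mechanism returns softfail.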

6.3 Test Data

v=spf1 ip4:192.0.2.0/24 -all

7. Common Pitfalls and Debugging

7.1 Frequent Mistakes

Pitfall                  Symptom                Solution
Ignoring lookup limit    Always pass locally    Enforce 10 lookups
Wrong qualifier          Incorrect result       Parse prefix carefully
Mishandling include      Always match           Evaluate included record separately

7.2 Debugging Strategies

  • Print a trace of evaluated mechanisms.
  • Compare results against dig TXT output and established SPF checkers.

7.3 Performance Traps

  • Excessive DNS calls. Cache lookups across includes.
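
A per-run cache can be a thin wrapper around whatever query function the tool uses. The sketch below assumes a hypothetical query(name, rtype) callable and ignores TTLs, which is reasonable only for a short-lived CLI run.

```python
def cached(query):
    """Wrap a (name, rtype) -> answers callable with a per-run cache."""
    cache = {}

    def lookup(name, rtype):
        # DNS names are case-insensitive; a trailing dot is equivalent.
        key = (name.lower().rstrip("."), rtype.upper())
        if key not in cache:
            cache[key] = query(name, rtype)
        return cache[key]

    return lookup
```

Caching saves network round trips, but RFC 7208 counts DNS-querying terms rather than packets, so a cache hit should probably still be charged to the lookup budget.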

8. Extensions and Challenges

8.1 Beginner Extensions

  • Add JSON output for results.
  • Support redirect= modifier.

8.2 Intermediate Extensions

  • Implement exp= explanations.
  • Add IPv6 support for ip6.

8.3 Advanced Extensions

  • Add parallel DNS resolution with a lookup budget.
  • Integrate with DKIM and DMARC results.

9. Real-World Connections

9.1 Industry Applications

  • Mail servers use SPF checks during SMTP.
  • Security tools use SPF to evaluate spoofing risk.

9.2 Related Open Source Projects

  • pyspf: https://www.openspf.org/ - SPF reference implementation
  • OpenDMARC: https://github.com/trusteddomainproject/OpenDMARC

9.3 Interview Relevance

  • SPF logic and DNS recursion are common in email security roles.

10. Resources

10.1 Essential Reading

  • RFC 7208 - SPF specification
  • RFC 5321 - SMTP sending context

10.2 Video Resources

  • SPF and email authentication walkthroughs

10.3 Tools and Documentation

  • dig for TXT records
  • mxtoolbox for cross-checking

11. Self-Assessment Checklist

11.1 Understanding

  • I can explain how SPF works
  • I understand lookup limits
  • I can interpret SPF qualifiers

11.2 Implementation

  • Evaluates real domains correctly
  • Produces correct pass/fail/softfail
  • Enforces lookup limit

11.3 Growth

  • I can debug SPF issues with a trace
  • I can explain why SPF alone is not enough

12. Submission / Completion Criteria

Minimum Viable Completion:

  • Parse SPF record and evaluate ip4 mechanisms

Full Completion:

  • Handle include, mx, a mechanisms with lookup limits

Excellence (Going Above and Beyond):

  • Implement redirect and exp modifiers
  • Add caching and trace visualization

This guide was generated from EMAIL_SYSTEMS_DEEP_DIVE_PROJECTS.md. For the complete learning path, see the parent directory.