Project 4: DKIM Signature Verifier
Build a verifier that parses DKIM-Signature headers and validates signatures against DNS-published keys.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Expert |
| Time Estimate | 3-4 weeks |
| Language | Python (Alternatives: Go, Rust, C) |
| Prerequisites | DNS TXT, RSA signatures, MIME headers |
| Key Topics | Canonicalization, body hash, DKIM tags |
1. Learning Objectives
- Parse DKIM-Signature headers into tag-value pairs.
- Implement canonicalization (simple and relaxed) for headers and body.
- Fetch DKIM public keys via DNS and validate signatures.
- Output pass/fail with clear diagnostic messages.
2. Theoretical Foundation
2.1 Core Concepts
- DKIM: DomainKeys Identified Mail uses cryptographic signatures so a receiver can check that the signed content is intact and that a domain has taken responsibility for the message.
- Canonicalization: Rules for normalizing headers and body before hashing.
- Body hash: The bh= tag is the base64 hash of the canonicalized body.
- Selector and domain: s= and d= build the DNS name for the public key.
2.2 Why This Matters
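DKIM is the integrity layer of email authentication. The verifier must canonicalize exactly as the signer did: a single stray space changes the hash, so a canonicalization bug makes valid signatures fail and causes deliverability issues for legitimate mail.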
2.3 Historical Context / Background
DKIM merges DomainKeys and Identified Internet Mail. It allows recipients to verify that content was not modified and that it was authorized by the domain.
2.4 Common Misconceptions
- Misconception: DKIM validates the sender address. Reality: It validates a signing domain.
- Misconception: Only the body is signed. Reality: Selected headers are signed too.
3. Project Specification
3.1 What You Will Build
A tool that accepts a raw RFC 5322 message, extracts DKIM signatures, canonicalizes headers and body, fetches the public key, and verifies the signature.
3.2 Functional Requirements
- Parse headers and body, preserving raw header order.
- Parse DKIM-Signature tags (v, a, d, s, h, bh, b, c).
- Canonicalize headers and body per c=.
- Compute body hash and compare to bh=.
- Verify signature over the canonicalized header set.
3.3 Non-Functional Requirements
- Performance: Should verify a message in under 1 second.
- Reliability: Must handle folded headers and multiple signatures.
- Usability: Return detailed failure reasons.
3.4 Example Usage / Output
$ ./dkim-verify message.eml
Signature 1: d=example.com s=selector1
Body hash: PASS
Header signature: PASS
Result: DKIM PASS
3.5 Real World Outcome
You can audit messages to prove whether content and headers were altered in transit and which domain authorized them.
4. Solution Architecture
4.1 High-Level Design
Message Parser
-> DKIM Tag Parser
-> Canonicalizer
-> DNS Key Fetcher
-> Signature Verifier
4.2 Key Components
| Component | Responsibility | Key Decisions |
|---|---|---|
| Parser | Split headers/body | Preserve raw header order |
| Canonicalizer | Apply relaxed/simple rules | Follow RFC 6376 |
| Key Fetcher | Lookup public key | TXT query on selector._domainkey |
| Verifier | RSA verify | Use crypto library |
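The Key Fetcher row can be made concrete with a minimal sketch, assuming the TXT strings have already been fetched and joined into one string by whatever resolver you choose. The function names `key_query_name`, `parse_key_record`, and `public_key_der` are illustrative, not part of any library.

```python
import base64


def key_query_name(selector: str, domain: str) -> str:
    """DNS name of the key record, built from the s= and d= tag values."""
    return f"{selector}._domainkey.{domain}"


def parse_key_record(txt: str) -> dict[str, str]:
    """Parse a key record such as 'v=DKIM1; k=rsa; p=...' into a dict."""
    tags: dict[str, str] = {}
    for part in txt.split(";"):
        # Split on the first '=' only: base64 values may end in '=' padding.
        name, sep, value = part.partition("=")
        if sep:
            tags[name.strip()] = value.strip()
    return tags


def public_key_der(record: dict[str, str]) -> bytes:
    """Return the decoded p= value (the DER-encoded public key)."""
    if record.get("v", "DKIM1") != "DKIM1":
        raise ValueError("unsupported key record version")
    # Whitespace inside p= is not significant, so drop it before decoding.
    p = "".join(record.get("p", "").split())
    if not p:
        raise ValueError("key revoked or missing (empty p=)")
    return base64.b64decode(p)
```

Keeping the name construction separate from the lookup makes the "wrong key name" class of bugs easy to unit test without touching the network.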
4.3 Data Structures
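class DkimSignature:
    """One parsed DKIM-Signature header field."""

    def __init__(self, tags: dict):
        self.tags = tags    # tag name -> raw value, e.g. {"d": "example.com"}
        self.headers = []   # signed header names from h=, in signing order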
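One way the tags dict and header list above could be populated is sketched here. The helper names (`parse_tag_list`, `signed_header_names`, `decode_b64_tag`) are hypothetical, and the parser is deliberately strict about duplicates.

```python
import base64
import re


def parse_tag_list(value: str) -> dict[str, str]:
    """Parse a DKIM tag list such as 'v=1; a=rsa-sha256; ...' into a dict."""
    tags: dict[str, str] = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        # Split on the first '=' only: b= and bh= may end in '=' padding.
        name, sep, raw = part.partition("=")
        if not sep:
            raise ValueError(f"malformed tag (no '='): {part!r}")
        name = name.strip()
        if name in tags:
            raise ValueError(f"duplicate tag: {name}")
        tags[name] = raw.strip()
    return tags


def signed_header_names(tags: dict[str, str]) -> list[str]:
    """Split h= into lowercase header names, preserving their order."""
    return [h.strip().lower() for h in tags["h"].split(":")]


def decode_b64_tag(raw: str) -> bytes:
    """Decode b= or bh=, ignoring folding whitespace inside the value."""
    return base64.b64decode(re.sub(r"\s+", "", raw))
```

Order in h= matters later when the signing input is built, which is why the names are kept in a list rather than a set.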
4.4 Algorithm Overview
Key Algorithm: Header Canonicalization (relaxed)
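- Lowercase the header field name.
- Unfold continuation lines, compress each run of WSP to a single space, and trim WSP from the end of the value.
- Remove any WSP around the colon and rebuild the field as “name:value” followed by CRLF.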
Complexity Analysis:
- Time: O(n) for message length
- Space: O(n)
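The relaxed header steps above can be sketched in a few lines. This is a minimal version that assumes the raw field is passed as a string including its folding and trailing CRLF; `relaxed_header` is an illustrative name.

```python
import re


def relaxed_header(raw: str) -> str:
    """Canonicalize one raw header field (possibly folded) with relaxed rules."""
    name, _, value = raw.partition(":")
    value = re.sub(r"\r?\n", "", value)     # unfold continuation lines
    value = re.sub(r"[ \t]+", " ", value)   # compress each WSP run to one space
    value = value.strip(" \t")              # drop WSP next to the colon and at the end
    # Lowercase the name, drop WSP before the colon, and end with CRLF.
    return f"{name.strip().lower()}:{value}\r\n"
```

For example, `relaxed_header("B : Y\t\r\n\tZ  \r\n")` yields `"b:Y Z\r\n"`, which is the shape shown in the RFC 6376 canonicalization examples. Simple header canonicalization needs no function at all: the raw field is used unchanged.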
5. Implementation Guide
5.1 Development Environment Setup
python -m venv .venv
source .venv/bin/activate
python -m pip install cryptography
5.2 Project Structure
dkim-verify/
├── message_parser.py
├── dkim_tags.py
├── canonicalize.py
├── dns_keys.py
└── verify.py
5.3 The Core Question You’re Answering
“Can I prove this message was authorized by the signing domain and not modified?”
5.4 Concepts You Must Understand First
Stop and research these before coding:
- Header Folding
- RFC 5322 header continuation rules
- Canonicalization
- Simple vs relaxed for headers and body
- DKIM Tags
- Required tags: v, a, d, s, h, bh, b
- RSA Signatures
- Base64 decoding and verification
5.5 Questions to Guide Your Design
- How will you choose which headers are signed when multiple appear?
- How will you handle body length tag l= if present?
- How will you parse and preserve header order?
5.6 Thinking Exercise
If a header appears twice (e.g., two Received headers), which instance is signed? Why does order matter?
5.7 The Interview Questions They’ll Ask
- “What is DKIM canonicalization and why is it necessary?”
- “What does the bh= tag represent?”
- “How do you build the DNS name for the DKIM public key?”
5.8 Hints in Layers
Hint 1: Parse tags into a dict
- Split on semicolons, then on the first ‘=’ only, because base64 values in b= and bh= can end in ‘=’ padding.
Hint 2: Verify body hash first
- It is simpler and isolates issues early.
Hint 3: Build the signing string carefully
- The DKIM-Signature header itself is included last, with the b= value emptied and without a trailing CRLF.
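Hint 3 can be sketched as follows, assuming headers are held as (name, raw field) pairs in message order and decoded as latin-1 so that every byte survives the round trip. All function names are hypothetical. The sketch picks repeated headers from the bottom of the message upward, which is how RFC 6376 describes the selection, and silently skips names in h= that have no remaining instance.

```python
import re
from typing import Callable


def select_signed_headers(headers: list[tuple[str, str]], h_names: list[str]) -> list[str]:
    """Pick one raw field per name in h=, consuming instances bottom-up."""
    remaining = list(headers)
    chosen: list[str] = []
    for wanted in h_names:
        for i in range(len(remaining) - 1, -1, -1):
            if remaining[i][0].lower() == wanted.lower():
                chosen.append(remaining.pop(i)[1])
                break
        # No remaining instance: the name contributes nothing to the input.
    return chosen


def strip_b_value(dkim_field: str) -> str:
    """Return the DKIM-Signature field with the b= value emptied."""
    return re.sub(r"([;:]\s*b\s*=)[^;]*", r"\1", dkim_field, count=1)


def build_signing_input(
    headers: list[tuple[str, str]],
    h_names: list[str],
    dkim_field: str,
    canon: Callable[[str], str] = lambda field: field,  # identity = simple
) -> bytes:
    """Concatenate canonicalized signed headers plus the emptied signature field."""
    parts = [canon(field) for field in select_signed_headers(headers, h_names)]
    # The signature field goes last and carries no trailing CRLF.
    parts.append(canon(strip_b_value(dkim_field)).removesuffix("\r\n"))
    return "".join(parts).encode("latin-1")
```

Logging the bytes returned by `build_signing_input` is usually the fastest way to find a mismatch, because they can be compared directly against another verifier's canonicalized output.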
5.9 Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| DKIM spec | RFC 6376 | Sections 3-6 |
| Cryptography | Serious Cryptography | Ch. 6 (hash functions), Ch. 10 (RSA) |
| Message format | RFC 5322 | Sections 2-3 |
5.10 Implementation Phases
Phase 1: Foundation (1 week)
Goals:
- Parse message and DKIM tags
Tasks:
- Split headers and body.
- Parse DKIM tag list.
Checkpoint: Print tags and header list.
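A minimal sketch of the Phase 1 split, assuming the message uses CRLF line endings throughout. `split_message` is an illustrative name; it keeps each header field exactly as received, folding included, so that simple canonicalization remains possible later.

```python
def split_message(raw: bytes) -> tuple[list[tuple[str, str]], bytes]:
    """Split a raw message into ordered header fields and the body.

    Returns ([(name, raw_field), ...], body). Each raw_field keeps its
    original folding and trailing CRLF.
    """
    head, _, body = raw.partition(b"\r\n\r\n")
    headers: list[tuple[str, str]] = []
    # latin-1 maps every byte to one code point, so nothing is lost on decode.
    for line in head.decode("latin-1").split("\r\n"):
        if not line:
            continue
        if line[0] in " \t" and headers:
            # Continuation line: append to the previous field, folding intact.
            name, field = headers[-1]
            headers[-1] = (name, field + line + "\r\n")
        else:
            name = line.split(":", 1)[0].strip()
            headers.append((name, line + "\r\n"))
    return headers, body
```

A list of pairs is used instead of a dict because repeated headers such as Received must stay distinct and in order.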
Phase 2: Core Functionality (1-2 weeks)
Goals:
- Canonicalize and compute hashes
Tasks:
- Implement relaxed and simple canonicalization.
- Compute body hash and compare.
Checkpoint: Correct bh= validation on known message.
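The Phase 2 tasks can be sketched as below, assuming a CRLF-delimited body and SHA-256 only. The function names and the `length` parameter (standing in for an l= value) are illustrative.

```python
import base64
import hashlib
import re
from typing import Optional


def simple_body(body: bytes) -> bytes:
    """Simple: drop trailing empty lines, then end with exactly one CRLF."""
    while body.endswith(b"\r\n"):
        body = body[:-2]
    return body + b"\r\n"


def relaxed_body(body: bytes) -> bytes:
    """Relaxed: compress WSP runs, strip line-end WSP, drop trailing empty lines."""
    lines = [re.sub(rb"[ \t]+", b" ", ln).rstrip(b" ") for ln in body.split(b"\r\n")]
    while lines and lines[-1] == b"":
        lines.pop()
    # An empty body stays empty under relaxed canonicalization.
    return b"".join(ln + b"\r\n" for ln in lines)


def body_hash_b64(body: bytes, c_tag: str = "simple/simple",
                  length: Optional[int] = None) -> str:
    """Compute the value to compare against bh= (SHA-256 assumed)."""
    # c= is "header/body"; a missing body part defaults to simple.
    _, _, body_canon = c_tag.partition("/")
    data = relaxed_body(body) if body_canon == "relaxed" else simple_body(body)
    if length is not None:
        data = data[:length]  # l= counts bytes of the canonicalized body
    return base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")
```

Compare the result with the bh= value after removing any whitespace from the tag, since the signer may have folded it.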
Phase 3: Polish and Edge Cases (1 week)
Goals:
- Verify signature
Tasks:
- Fetch public key from DNS.
- Verify RSA signature for signed headers.
Checkpoint: DKIM PASS on a real message.
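To show what an rsa-sha256 check does under the hood, here is a pure-Python sketch of RSASSA-PKCS1-v1_5 verification. It is for understanding only: it assumes the modulus `n` and exponent `e` have already been extracted from the key record, and the function names are hypothetical. In the tool itself, the cryptography library recommended in this guide remains the better choice.

```python
import hashlib
import hmac

# DER prefix of the DigestInfo structure for SHA-256 (RFC 8017, section 9.2).
SHA256_DIGESTINFO_PREFIX = bytes.fromhex("3031300d060960864801650304020105000420")


def pkcs1_v15_expected(message: bytes, k: int) -> bytes:
    """Build the k-byte encoded message a valid signature must decrypt to."""
    t = SHA256_DIGESTINFO_PREFIX + hashlib.sha256(message).digest()
    if k < len(t) + 11:
        raise ValueError("modulus too short for SHA-256")
    # 0x00 0x01 | 0xFF padding | 0x00 | DigestInfo
    return b"\x00\x01" + b"\xff" * (k - len(t) - 3) + b"\x00" + t


def rsa_sha256_verify(n: int, e: int, signature: bytes, message: bytes) -> bool:
    """Return True if signature is valid for message under public key (n, e)."""
    k = (n.bit_length() + 7) // 8
    if len(signature) != k:
        return False
    s = int.from_bytes(signature, "big")
    if s >= n:
        return False
    em = pow(s, e, n).to_bytes(k, "big")
    return hmac.compare_digest(em, pkcs1_v15_expected(message, k))
```

Here `message` is the signing input (canonicalized signed headers plus the emptied DKIM-Signature field) and `signature` is the decoded b= value. Comparing the whole encoded block, rather than parsing the padding, avoids the lenient-parsing mistakes that have led to signature forgeries in other implementations.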
5.11 Key Implementation Decisions
| Decision | Options | Recommendation | Rationale |
|---|---|---|---|
| Crypto library | cryptography vs openssl CLI | cryptography | Cleaner API |
| Canonicalization | relaxed only vs both | implement both | Most real messages use relaxed, but simple is the default when c= is absent |
| Multi-signature | first vs all | verify all | More accurate |
6. Testing Strategy
6.1 Test Categories
| Category | Purpose | Examples |
|---|---|---|
| Unit Tests | Canonicalization | folded headers, trailing spaces |
| Integration Tests | Real messages | Gmail and Yahoo samples |
| Edge Case Tests | Multiple signatures | Verify all or report failures |
6.2 Critical Test Cases
- Relaxed canonicalization matches RFC examples.
- Body hash mismatch returns DKIM fail.
- A key record that does not exist returns permerror (no key for signature); a DNS timeout or server failure returns temperror.
6.3 Test Data
DKIM-Signature: v=1; a=rsa-sha256; d=example.com; s=sel; ...
7. Common Pitfalls and Debugging
7.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Mishandled header folding | Signature fails | Preserve raw header lines |
| Incorrect canonicalization | False failures | Follow RFC examples |
| Wrong key name | NXDOMAIN | Query s._domainkey.d using the s= and d= values, e.g. selector1._domainkey.example.com |
7.2 Debugging Strategies
- Compare with opendkim-testmsg output.
- Log canonicalized header string.
7.3 Performance Traps
- Re-fetching DNS keys for each message. Cache key records by full query name (selector plus domain) and honor the record TTL.
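A small cache along these lines could look like the sketch below. `KeyCache` and its `fetch` callback are hypothetical: the callback is assumed to perform the real TXT lookup and return the record text together with its TTL in seconds.

```python
import time
from typing import Callable


class KeyCache:
    """Cache DKIM key records by full query name until their TTL expires."""

    def __init__(self, fetch: Callable[[str], tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch      # name -> (record text, ttl seconds)
        self._clock = clock      # injectable so tests can control time
        self._entries: dict[str, tuple[str, float]] = {}

    def get(self, selector: str, domain: str) -> str:
        # DNS names are case-insensitive, so normalize the cache key.
        name = f"{selector}._domainkey.{domain}".lower()
        now = self._clock()
        hit = self._entries.get(name)
        if hit is not None and hit[1] > now:
            return hit[0]
        record, ttl = self._fetch(name)
        self._entries[name] = (record, now + ttl)
        return record
```

Keying on the full name matters because the same selector string, such as "default", is used by many unrelated domains.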
8. Extensions and Challenges
8.1 Beginner Extensions
- Report which header failed canonicalization.
- Output JSON results.
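For the JSON extension, one possible shape is sketched here. The field names and the `SignatureResult` type are assumptions, as is the policy that the message passes when at least one signature verifies.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SignatureResult:
    domain: str
    selector: str
    body_hash_ok: bool
    signature_ok: bool
    reason: str = ""  # human-readable failure detail, empty on success

    @property
    def result(self) -> str:
        return "pass" if self.body_hash_ok and self.signature_ok else "fail"


def report_json(results: list[SignatureResult]) -> str:
    """Serialize per-signature results plus an overall verdict."""
    if not results:
        overall = "none"  # no DKIM-Signature header was present
    elif any(r.result == "pass" for r in results):
        overall = "pass"
    else:
        overall = "fail"
    payload = {
        "result": overall,
        "signatures": [{**asdict(r), "result": r.result} for r in results],
    }
    return json.dumps(payload, indent=2)
```

Reporting every signature, not only the first passing one, keeps the diagnostics useful when a forwarder has broken one signature but left another intact.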
8.2 Intermediate Extensions
- Support ed25519-sha256 when present.
- Verify the l= body length tag by hashing only the first l bytes of the canonicalized body.
8.3 Advanced Extensions
- Implement ARC verification chain.
- Build a DKIM lint tool for domains.
9. Real-World Connections
9.1 Industry Applications
- Mail gateways use DKIM verification for trust scoring.
- Security tools validate integrity of inbound messages.
9.2 Related Open Source Projects
- OpenDKIM: https://github.com/trusteddomainproject/OpenDKIM
- dkimpy: https://launchpad.net/dkimpy
9.3 Interview Relevance
- Canonicalization and signature verification are common email security topics.
10. Resources
10.1 Essential Reading
- RFC 6376 - DKIM specification
- RFC 5322 - Message format
10.2 Video Resources
- DKIM verification walkthroughs
10.3 Tools and Documentation
- opendkim-testmsg for reference verification
- dig for DKIM TXT lookups
10.4 Related Projects in This Series
11. Self-Assessment Checklist
11.1 Understanding
- I can explain DKIM tags and selectors
- I understand relaxed vs simple canonicalization
- I can explain the body hash
11.2 Implementation
- Verifies real DKIM signatures
- Handles multiple signatures
- Reports clear failure reasons
11.3 Growth
- I can debug DKIM failures by inspecting canonicalized data
- I can explain DKIM limitations without DMARC
12. Submission / Completion Criteria
Minimum Viable Completion:
- Parse DKIM header and fetch public key
Full Completion:
- Verify body hash and header signature
Excellence (Going Above and Beyond):
- Support multiple algorithms and ARC
- Provide detailed diagnostics and linting
This guide was generated from EMAIL_SYSTEMS_DEEP_DIVE_PROJECTS.md. For the complete learning path, see the parent directory.