Project 24: Secret Scanner Hook (Security on Write)
Build a PreToolUse hook that scans for secrets (API keys, passwords, tokens) before Kiro writes files - blocking commits of sensitive data before they happen.
Learning Objectives
By completing this project, you will:
- Master PreToolUse blocking hooks with exit code 2 for preventing dangerous operations
- Understand secret detection patterns including regex, entropy analysis, and known formats
- Implement allowlist systems for handling false positives safely
- Design security feedback loops that guide AI to fix issues, not just block them
- Apply defense-in-depth principles by catching secrets before git commit
Deep Theoretical Foundation
The Secret Leakage Problem
Secrets in source code are one of the most common security vulnerabilities:
How Secrets Leak:
Developer's Intent                    Reality
------------------                    -------
"I'll just test
 with the real key"   ──────►   const key = "sk_live_xxx"
                                          │
                                          ▼
                                git add . && commit
                                          │
                                          ▼
                                git push origin main
                                          │
            ┌─────────────────────────────┼─────────────────────────────┐
            ▼                             ▼                             ▼
  GitHub/GitLab Stores           CI/CD Logs Expose            Bots Scan & Exploit
  Forever in History             In Build Output              Within Minutes
Statistics that matter:
- GitGuardian found 10+ million secrets exposed on GitHub in 2023
- Average time to exploit a leaked AWS key: 4 minutes
- Cost of secret rotation after leak: hours to days of engineering time
The PreToolUse Blocking Pattern
Unlike PostToolUse (which observes), PreToolUse can prevent operations:
PreToolUse Hook Flow:
Kiro Agent

  User: "Create a config file with the database connection"

  Agent plans: write tool → config.ts
        │
        ▼
  PreToolUse Hook Runs BEFORE Write

    Input: { tool: "write", content: "...password=secret123..." }

    Secret Scanner
      Checking content...
      FOUND: Hardcoded password on line 5

      Exit code 2 → BLOCK
      stdout → Message to agent
        │
        ▼
  Write tool is BLOCKED

  Agent receives: "BLOCKED: Found hardcoded password. Use
                   environment variable instead: process.env.DB_PASSWORD"

  Agent: "I understand. Let me refactor to use environment
          variables instead..."

  [Rewrites with process.env.DB_PASSWORD]

  PreToolUse runs again → No secrets found → Exit 0 → ALLOWED
Exit Code Protocol
Kiro hooks use exit codes to communicate decisions:
Exit Code Meanings:
| Exit Code | Meaning |
|---|---|
| 0 | ALLOW: Continue with the operation. The hook ran successfully and found no issues. |
| 1 | ERROR: Something went wrong in the hook. The operation continues (fail-open) and the error is logged as a warning. |
| 2 | BLOCK: Prevent the operation. stdout is sent to the agent as feedback; the agent should address the issue and retry. |
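The protocol above can be sketched as a small decision function. This is a minimal sketch that assumes the exit-code meanings listed in this section; `ScanOutcome` and `exitCodeFor` are hypothetical names used for illustration, not part of Kiro.

```typescript
// Hypothetical outcome type for illustration; not a Kiro API.
type ScanOutcome =
  | { kind: 'clean' }
  | { kind: 'secrets'; count: number }
  | { kind: 'hook-error'; error: string };

// Map a scan outcome to the exit codes described in this section.
function exitCodeFor(outcome: ScanOutcome): 0 | 1 | 2 {
  switch (outcome.kind) {
    case 'clean':
      return 0; // allow the operation
    case 'hook-error':
      return 1; // the hook itself failed; the operation continues (fail-open)
    case 'secrets':
      return 2; // block the operation and give the agent feedback
  }
}
```

Keeping the decision in one pure function makes it easy to unit test that a crash in the scanner can never be reported as a block.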
Secret Detection Techniques
Multiple techniques work together for comprehensive detection:
Detection Layer Stack:
Layer 1: PATTERN MATCHING (Fast, High Confidence)
  Known Secret Formats:

    AWS Access Key:   AKIA[A-Z0-9]{16}
    AWS Secret:       [A-Za-z0-9/+=]{40}
    GitHub Token:     ghp_[A-Za-z0-9]{36}
    Stripe Live Key:  sk_live_[A-Za-z0-9]{24,}
    JWT:              eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+
    Private Key:      -----BEGIN (RSA |EC )?PRIVATE KEY-----

Layer 2: ENTROPY ANALYSIS (Catches Unknown Formats)
  Shannon entropy measures randomness, in bits per character:

    "password"                          → ≈2.75 → Low, probably not a secret
    "aB3$kL9@mN2"                       → ≈3.46 → Moderate, too short to judge
    "sk_test_4eC39HqLyjWDarjtT1zdp7dc"  → ≈4.54 → High, likely a secret

  Threshold: > 4.0 bits/char in strings of 16+ chars = Suspicious

Layer 3: CONTEXTUAL ANALYSIS (Reduces False Positives)
  Variable Name Context:

    const password = "test123"      → Suspicious (password in name)
    const description = "test123"   → Probably fine
    const API_KEY = process.env.X   → Good (using env var)
    const API_KEY = "sk_live_xxx"   → BAD (hardcoded with key name)

Layer 4: KNOWN VENDOR PATTERNS (Very High Confidence)
  Vendor-Specific Formats:

    Stripe:    sk_live_*, rk_live_*, pk_live_*
    AWS:       AKIA*, ASIA* (access keys)
    GitHub:    ghp_*, gho_*, ghu_*, ghs_*, ghr_*
    Slack:     xox[baprs]-*
    Twilio:    SK[a-f0-9]{32}
    SendGrid:  SG\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+
Shannon Entropy Calculation
Entropy measures the randomness of a string - high entropy often indicates secrets:
Shannon Entropy Formula:
H = -Σ p(x) * log2(p(x))
Where p(x) is the probability of character x appearing in the string.
Example Calculation for "aB3$kL9@":
Character frequencies:
a=1, B=1, 3=1, $=1, k=1, L=1, 9=1, @=1
Each appears 1/8 of the time, so:
H = -8 * (1/8 * log2(1/8))
H = -8 * (1/8 * -3)
H = 3 bits per character
Interpretation:
| Entropy Range (bits/char) | Meaning | Example |
|---|---|---|
| 0 - 2.0 | Very low | "aaaaaaaa" (0.0) |
| 2.0 - 3.0 | Low | "password", "P@ssw0rd" (both 2.75) |
| 3.0 - 4.0 | Medium | "aB3$kL9@mN2" (≈3.46) |
| 4.0 - 5.0 | High | Random tokens |
| 5.0+ | Very high | Long cryptographic secrets |
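To check the arithmetic above, here is a minimal sketch of the formula in TypeScript. The function name `shannonEntropy` is an illustrative choice; it computes the same quantity as the formula in this section.

```typescript
// Shannon entropy in bits per character: H = -Σ p(x) * log2(p(x)).
function shannonEntropy(text: string): number {
  const chars = [...text];
  if (chars.length === 0) return 0;
  const counts = new Map<string, number>();
  for (const ch of chars) counts.set(ch, (counts.get(ch) ?? 0) + 1);
  let h = 0;
  for (const n of counts.values()) {
    const p = n / chars.length; // probability of this character
    h -= p * Math.log2(p);
  }
  return h;
}

for (const sample of ['aaaaaaaa', 'password', 'aB3$kL9@']) {
  console.log(sample, shannonEntropy(sample).toFixed(2));
}
// prints:
// aaaaaaaa 0.00
// password 2.75
// aB3$kL9@ 3.00
```

A string of n characters can never exceed log2(n) bits per character, so short strings score low no matter how random they are. This is why the entropy check is paired with a minimum length.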
Allowlist Architecture
Not every detected "secret" is actually a problem. Allowlists handle false positives:
Allowlist Structure:
.kiro/secrets-allowlist.json
{
  "patterns": [
    {
      "pattern": "sk_test_.*",
      "reason": "Test mode Stripe keys are safe",
      "expires": "2025-01-01"
    }
  ],
  "files": [
    {
      "path": "src/examples/demo.ts",
      "reason": "Demo file with fake credentials",
      "hash": "abc123..."
    }
  ],
  "hashes": [
    {
      "sha256": "def456...",
      "reason": "Public example from documentation",
      "addedBy": "security-team",
      "addedAt": "2024-01-15"
    }
  ]
}
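Hash entries let the allowlist approve a specific value without storing that value in the repository. The following is a minimal sketch of producing such an entry and checking an expiry date; `HashEntry`, `makeHashEntry`, and `isExpired` are hypothetical helpers, and only `createHash` from Node's built-in `crypto` module is a real API.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical entry shape mirroring the "hashes" section of the allowlist file.
interface HashEntry {
  sha256: string;
  reason: string;
  addedBy: string;
  addedAt: string; // YYYY-MM-DD
}

function sha256Hex(value: string): string {
  return createHash('sha256').update(value, 'utf8').digest('hex');
}

// Build an entry without ever writing the raw value into the allowlist.
function makeHashEntry(value: string, reason: string, addedBy: string, today: Date): HashEntry {
  return { sha256: sha256Hex(value), reason, addedBy, addedAt: today.toISOString().slice(0, 10) };
}

// An entry with an `expires` date stops applying once that date has passed.
function isExpired(expires: string | undefined, now: Date): boolean {
  return expires !== undefined && now.getTime() > new Date(expires).getTime();
}
```

Passing the current date in as a parameter, rather than calling `new Date()` inside, keeps both helpers deterministic in tests.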
Defense in Depth
Your hook is one layer in a security onion:
Security Layers (Defense in Depth):
Developer Workstation
  Layer 1: Kiro PreToolUse Hook   ◄── YOUR PROJECT
           Blocks before file write
  Layer 2: Git Pre-Commit Hook
           Blocks before commit
  Layer 3: Git Pre-Push Hook
           Blocks before push
        │
        ▼ Push
Remote/CI
  Layer 4: GitHub Secret Scanning
           Alerts on push
  Layer 5: CI Pipeline Scanner (gitleaks/trufflehog)
           Fails pipeline
  Layer 6: Periodic Full Scan
           Catches historical leaks
Why multiple layers?
- Each layer catches what others miss
- Earlier is better (cheaper to fix)
- PreToolUse is the EARLIEST possible layer
Real-World Analogy: The Security Guard
Your hook is like a security guard at a building entrance:
- Checks everyone entering (scans all file content)
- Has a list of prohibited items (secret patterns)
- Uses judgment (entropy analysis for unknown threats)
- Allows known employees (allowlist for false positives)
- Explains why someone is stopped (feedback to AI)
- Suggests alternatives (use env vars instead)
Historical Context
Secret detection has evolved significantly:
Evolution of Secret Protection:
2010s: Manual Review
  └─► Humans reviewed code for secrets (slow, error-prone)
2015: Git Hooks (pre-commit)
  └─► Block commits containing patterns
2017: GitHub Secret Scanning
  └─► Automatic detection on push
2019: gitleaks/trufflehog
  └─► Open-source entropy-based scanning
2023: AI Code Assistants
  └─► Can generate code with secrets!
2024+: PreToolUse Hooks   ◄── YOU ARE HERE
  └─► Block AI from writing secrets
      BEFORE they even hit the filesystem
Book References
For deeper understanding:
- "Practical Security Automation" by Tony UcedaVelez - Automation patterns
- "Application Security Handbook" by OWASP - Secret management best practices
- "Secrets Management" by HashiCorp - Enterprise secret handling
- "The Tangled Web" by Michal Zalewski - Understanding security vulnerabilities
Complete Project Specification
What You Are Building
A PreToolUse hook that:
- Intercepts file writes before they happen
- Scans content for secrets using multiple detection methods
- Blocks dangerous writes with exit code 2
- Provides actionable feedback to help the AI fix the issue
- Supports allowlists for legitimate exceptions
Functional Requirements
| Feature | Behavior |
|---|---|
| Pattern Detection | Detect 20+ common secret formats |
| Entropy Analysis | Flag high-entropy strings in sensitive contexts |
| Context Awareness | Consider variable names and file types |
| Blocking | Exit 2 with clear message when secrets found |
| Allowlist | Skip known false positives |
| Remediation | Suggest environment variable alternatives |
Non-Functional Requirements
- Latency: Complete scan within 500ms (must not slow down development)
- False Positive Rate: < 5% false positives with default rules
- Coverage: Detect every vendor format listed in the Secret Detection Techniques section
- Fail-Safe: If hook crashes, fail open (exit 1) to not block development
Solution Architecture
High-Level Component Diagram
Kiro CLI
  Agent Session
    Agent plans: write → src/config.ts
        │
        │ preToolUse event (BEFORE write)
        ▼
Secret Scanner Hook

  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐
  │ Pattern       │  │ Entropy       │  │ Context       │
  │ Matcher       │  │ Analyzer      │  │ Analyzer      │
  │ • 50+ regex   │  │ • Shannon H   │  │ • Var names   │
  │ • Known keys  │  │ • Threshold   │  │ • File type   │
  └───────┬───────┘  └───────┬───────┘  └───────┬───────┘
          └──────────────────┼──────────────────┘
                             ▼
                  ┌─────────────────────┐
                  │ Finding Aggregator  │
                  │ • Deduplicate       │
                  │ • Check allowlist   │
                  │ • Rank by severity  │
                  └──────────┬──────────┘
                             │
            ┌────────────────┴────────────────┐
    No Secrets Found                    Secrets Found
            │                                 │
            ▼                                 ▼
       Exit Code 0                       Exit Code 2
         (Allow)                     (Block + Feedback)
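The Finding Aggregator step (deduplicate, then rank by severity) can be sketched as follows. This is an illustrative sketch: the trimmed-down `Finding` type, the `SEVERITY_RANK` table, and the choice to key duplicates on line plus value are assumptions, not requirements of the specification. The allowlist check is left out here because it is covered in Phase 5.

```typescript
type Severity = 'critical' | 'high' | 'medium' | 'low';

// Reduced version of a finding, for illustration only.
interface Finding {
  name: string;
  value: string;
  line: number;
  severity: Severity;
}

const SEVERITY_RANK: Record<Severity, number> = { critical: 0, high: 1, medium: 2, low: 3 };

// When several detectors flag the same value on the same line,
// keep only the most severe finding, then sort most severe first.
function aggregateFindings(findings: Finding[]): Finding[] {
  const byLocation = new Map<string, Finding>();
  for (const f of findings) {
    const key = `${f.line}:${f.value}`;
    const existing = byLocation.get(key);
    if (!existing || SEVERITY_RANK[f.severity] < SEVERITY_RANK[existing.severity]) {
      byLocation.set(key, f);
    }
  }
  return [...byLocation.values()].sort(
    (a, b) => SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] || a.line - b.line,
  );
}
```

Deduplicating before formatting keeps the feedback short, which matters because the agent has to act on every line it receives.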
Data Flow: Secret Detection and Blocking
1. Hook Receives Write Request
   stdin:
   {
     "hook_event_name": "preToolUse",
     "tool_name": "write",
     "tool_input": {
       "file_path": "src/config.ts",
       "content": "export const config = {\n apiKey: 'sk_live_4eC39HqLyjWDarjtT1zdp7dc'\n};"
     }
   }
        │
        ▼
2. Pattern Matching Phase
   Checking pattern: Stripe Live Key
   Regex: sk_live_[A-Za-z0-9]{24,}

   MATCH FOUND at line 2:
   sk_live_4eC39HqLyjWDarjtT1zdp7dc

   Confidence: HIGH (known format)
        │
        ▼
3. Allowlist Check
   Checking allowlist...

   • Pattern "sk_test_.*" - NOT matched (this is live, not test)
   • File "src/config.ts" - NOT in file allowlist
   • Hash of value - NOT in hash allowlist

   Result: NOT ALLOWLISTED → Proceed to block
        │
        ▼
4. Generate Blocking Response
   stdout:

   BLOCKED: Potential secrets detected in file write

   File: src/config.ts

   Findings:
   • Line 2: Stripe Live API Key (sk_live_...)
     Severity: CRITICAL
     This is a production payment key that could be exploited

   Remediation:
   1. Use environment variable: process.env.STRIPE_SECRET_KEY
   2. Add to .env file (which is gitignored)
   3. Example:
      export const config = {
        apiKey: process.env.STRIPE_SECRET_KEY
      }

   exit code: 2
Key Interfaces
// Secret finding
interface SecretFinding {
type: 'pattern' | 'entropy' | 'context';
name: string; // "Stripe Live Key"
pattern?: string; // The regex that matched
value: string; // The matched secret (partially redacted)
line: number;
column: number;
severity: 'critical' | 'high' | 'medium' | 'low';
confidence: 'high' | 'medium' | 'low';
remediation: string; // Suggested fix
}
// Pattern definition
interface SecretPattern {
name: string;
pattern: RegExp;
severity: 'critical' | 'high' | 'medium' | 'low';
description: string;
remediation: string;
testCases: {
shouldMatch: string[];
shouldNotMatch: string[];
};
}
// Allowlist entry
interface AllowlistEntry {
type: 'pattern' | 'file' | 'hash';
value: string;
reason: string;
addedBy: string;
addedAt: string;
expires?: string;
}
// Hook configuration
interface SecretScannerConfig {
enabled: boolean;
patterns: SecretPattern[];
entropyThreshold: number; // Default: 4.0
entropyMinLength: number; // Default: 16
allowlistPath: string;
excludePaths: string[]; // e.g., ["*.test.ts", "fixtures/*"]
failOpen: boolean; // On hook errors, exit 1 so the write proceeds (never 2)
}
Technology Choices
| Component | Technology | Rationale |
|---|---|---|
| Hook Runtime | Bun | Fast startup for responsiveness |
| Pattern Engine | Native RegExp | Performant, no dependencies |
| Entropy Calc | Custom | Simple algorithm, no library needed |
| Config Format | JSON | Standard Kiro pattern |
| Allowlist | JSON file | Version controllable, editable |
Phased Implementation Guide
Phase 1: Hook Foundation (Days 1-2)
Goal: Create a PreToolUse hook that intercepts write operations.
Tasks:
- Create hook script file
- Configure in .kiro/settings.json
- Parse stdin JSON for tool input
- Filter for write tool only
- Log file writes for debugging
Hints:
- PreToolUse receives the same JSON format as PostToolUse
- Exit 0 initially to not block anything
- The content to scan is in tool_input.content
Configuration (.kiro/settings.json):
{
"hooks": {
"preToolUse": [
{
"matcher": "write",
"command": "bun run /path/to/hooks/secret-scanner.ts"
}
]
}
}
Starter Code:
#!/usr/bin/env bun
import { readFileSync } from 'fs';
const input = JSON.parse(readFileSync(0, 'utf-8'));
// Only process write operations
if (input.tool_name !== 'write') {
process.exit(0);
}
const { file_path, content } = input.tool_input;
console.error(`[SecretScanner] Checking: ${file_path}`);
console.error(`[SecretScanner] Content length: ${content.length} bytes`);
// TODO: Implement scanning
process.exit(0);
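The starter code throws if stdin is not valid JSON or if `tool_input` is missing. A minimal sketch of a defensive parser is shown below; `WriteEvent` and `parseWriteEvent` are hypothetical names, and the field names are taken from the stdin example in this guide. The caller decides what a `null` result means, for example exiting 0 for a non-write tool or exiting 1 to fail open on malformed input.

```typescript
// Fields this project reads from the hook event (per the stdin example in this guide).
interface WriteEvent {
  tool_name: 'write';
  tool_input: { file_path: string; content: string };
}

// Returns the event if it is a well-formed write request, otherwise null.
function parseWriteEvent(raw: string): WriteEvent | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // malformed JSON
  }
  if (typeof parsed !== 'object' || parsed === null) return null;

  const event = parsed as { tool_name?: unknown; tool_input?: unknown };
  if (event.tool_name !== 'write') return null;
  if (typeof event.tool_input !== 'object' || event.tool_input === null) return null;

  const input = event.tool_input as { file_path?: unknown; content?: unknown };
  if (typeof input.file_path !== 'string' || typeof input.content !== 'string') return null;

  return { tool_name: 'write', tool_input: { file_path: input.file_path, content: input.content } };
}
```

Validating the shape up front means the scanning code can rely on `file_path` and `content` being strings without further checks.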
Phase 2: Pattern Detection (Days 3-5)
Goal: Implement regex-based detection for known secret formats.
Tasks:
- Define patterns for common secrets (AWS, GitHub, Stripe, etc.)
- Scan content against all patterns
- Record matches with line numbers
- Generate finding objects
- Test against known secrets
Hints:
- Use the global (g) flag with matchAll (or an exec loop) to find every match, not just the first
- Track line numbers by splitting content
- Redact the actual secret in output
Pattern Examples:
const PATTERNS: SecretPattern[] = [
{
name: 'AWS Access Key',
pattern: /AKIA[A-Z0-9]{16}/g,
severity: 'critical',
description: 'AWS IAM access key ID',
remediation: 'Use AWS_ACCESS_KEY_ID environment variable',
testCases: {
shouldMatch: ['AKIAIOSFODNN7EXAMPLE'],
shouldNotMatch: ['AKIAEXAMPLE123'],
},
},
{
name: 'Stripe Live Key',
pattern: /sk_live_[A-Za-z0-9]{24,}/g,
severity: 'critical',
description: 'Stripe live mode secret key',
remediation: 'Use STRIPE_SECRET_KEY environment variable',
testCases: {
shouldMatch: ['sk_live_4eC39HqLyjWDarjtT1zdp7dc'],
shouldNotMatch: ['sk_test_4eC39HqLyjWDarjtT1zdp7dc'],
},
},
// Add more patterns...
];
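The pattern list still needs a loop that applies each regex and records where it matched. The following is a minimal sketch of that loop; `PatternRule`, `PatternMatch`, and `scanWithPatterns` are illustrative names, and scanning line by line is one simple way to obtain line numbers, not the only one.

```typescript
interface PatternRule {
  name: string;
  pattern: RegExp;
}

interface PatternMatch {
  name: string;
  value: string;
  line: number;   // 1-based
  column: number; // 1-based
}

// Scan line by line so each match carries its line and column.
function scanWithPatterns(content: string, rules: PatternRule[]): PatternMatch[] {
  const matches: PatternMatch[] = [];
  content.split('\n').forEach((text, index) => {
    for (const rule of rules) {
      // matchAll requires the global flag; rebuild the regex so a non-global
      // pattern still works and no lastIndex state leaks between lines.
      const flags = rule.pattern.flags.includes('g') ? rule.pattern.flags : rule.pattern.flags + 'g';
      const regex = new RegExp(rule.pattern.source, flags);
      for (const m of text.matchAll(regex)) {
        matches.push({ name: rule.name, value: m[0], line: index + 1, column: (m.index ?? 0) + 1 });
      }
    }
  });
  return matches;
}

const demoMatches = scanWithPatterns(
  'export const config = {\n  apiKey: "AKIAIOSFODNN7EXAMPLE"\n};',
  [{ name: 'AWS Access Key', pattern: /AKIA[A-Z0-9]{16}/ }],
);
console.log(demoMatches.length, demoMatches[0].line); // prints: 1 2
```

A line-based scan cannot match a pattern that spans several lines. That is acceptable for the formats listed here, including private keys, because the `-----BEGIN ... PRIVATE KEY-----` header sits on a single line.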
Phase 3: Entropy Analysis (Days 6-7)
Goal: Detect secrets that don't match known patterns.
Tasks:
- Implement Shannon entropy calculation
- Extract potential secrets (quoted strings, assignments)
- Calculate entropy for each candidate
- Flag high-entropy strings in sensitive contexts
- Tune threshold to minimize false positives
Hints:
- Entropy > 4.0 for strings > 16 chars is suspicious
- Consider context: variable names containing "key", "secret", "password"
- Skip common high-entropy non-secrets (UUIDs, hashes in dependencies)
Entropy Implementation:
function calculateEntropy(str: string): number {
const charCounts = new Map<string, number>();
for (const char of str) {
charCounts.set(char, (charCounts.get(char) || 0) + 1);
}
let entropy = 0;
for (const count of charCounts.values()) {
const probability = count / str.length;
entropy -= probability * Math.log2(probability);
}
return entropy;
}
function isHighEntropySecret(value: string, context: string): boolean {
if (value.length < 16) return false;
const entropy = calculateEntropy(value);
if (entropy < 4.0) return false;
// Check context for sensitive variable names
const sensitiveNames = /(?:key|secret|password|token|credential|auth)/i;
return sensitiveNames.test(context);
}
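The entropy check needs candidate strings and the variable name each one is assigned to. One simple way to extract them is sketched below. The `ASSIGNMENT` regex and `extractCandidates` are illustrative assumptions, and the heuristic is deliberately narrow: it handles single-line `name = "value"` and `name: 'value'` forms and ignores template literals and multi-line strings.

```typescript
interface Candidate {
  name: string;  // identifier on the left-hand side
  value: string; // contents of the quoted string
  line: number;  // 1-based
}

// identifier, then `=` or `:`, then a quoted string closed by the same quote character.
const ASSIGNMENT = /([A-Za-z_$][\w$]*)\s*[:=]\s*(["'])((?:(?!\2).)*)\2/g;

function extractCandidates(content: string): Candidate[] {
  const candidates: Candidate[] = [];
  content.split('\n').forEach((text, index) => {
    for (const m of text.matchAll(ASSIGNMENT)) {
      candidates.push({ name: m[1], value: m[3], line: index + 1 });
    }
  });
  return candidates;
}

console.log(extractCandidates('const apiKey = "xK9mL2pQ8rT5vY3nZ";\nconst retries = 5;'));
// one candidate: name "apiKey", value "xK9mL2pQ8rT5vY3nZ", line 1
```

Each candidate's `value` and `name` can then be passed to `isHighEntropySecret` as the value and its context.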
Phase 4: Blocking and Feedback (Days 8-10)
Goal: Block dangerous writes and provide actionable feedback.
Tasks:
- Format findings into clear output
- Exit with code 2 to block
- Provide specific remediation suggestions
- Include code examples for fixes
- Test the full blocking flow
Hints:
- The feedback goes to the AI, so be specific
- Include both what's wrong AND how to fix it
- Partially redact secrets in output
Blocking Output Format:
function formatBlockingMessage(findings: SecretFinding[]): string {
const lines = [
'BLOCKED: Potential secrets detected in file write',
'',
'The following secrets were found and must be removed:',
'',
];
for (const finding of findings) {
lines.push(`โข Line ${finding.line}: ${finding.name}`);
lines.push(` Value: ${redact(finding.value)}`);
lines.push(` Severity: ${finding.severity.toUpperCase()}`);
lines.push(` Fix: ${finding.remediation}`);
lines.push('');
}
lines.push('Suggested approach:');
lines.push('1. Use environment variables instead of hardcoded values');
lines.push('2. Reference with process.env.VARIABLE_NAME');
lines.push('3. Add actual values to .env file (gitignored)');
return lines.join('\n');
}
function redact(secret: string): string {
if (secret.length <= 8) return '****';
return secret.slice(0, 4) + '...' + secret.slice(-4);
}
Phase 5: Allowlist System (Days 11-14)
Goal: Allow legitimate exceptions without disabling security.
Tasks:
- Create allowlist file structure
- Load allowlist on hook start
- Check findings against allowlist before blocking
- Support pattern, file, and hash-based allowlisting
- Add expiration checking
Hints:
- Allowlist file should be version controlled
- Require a reason for each exception
- Support expiration dates for temporary exceptions
Allowlist Loading:
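import * as crypto from 'node:crypto'; // Provides createHash (the Web Crypto global does not)
import { readFileSync } from 'node:fs';
interface Allowlist {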
patterns: { pattern: RegExp; reason: string; expires?: Date }[];
files: { path: string; reason: string }[];
hashes: { sha256: string; reason: string }[];
}
function loadAllowlist(path: string): Allowlist {
const raw = JSON.parse(readFileSync(path, 'utf-8'));
return {
patterns: raw.patterns.map(p => ({
pattern: new RegExp(p.pattern),
reason: p.reason,
expires: p.expires ? new Date(p.expires) : undefined,
})),
files: raw.files,
hashes: raw.hashes,
};
}
function isAllowlisted(finding: SecretFinding, filePath: string, allowlist: Allowlist): boolean {
// Check pattern allowlist
for (const entry of allowlist.patterns) {
if (entry.expires && new Date() > entry.expires) continue;
if (entry.pattern.test(finding.value)) return true;
}
// Check file allowlist
if (allowlist.files.some(f => filePath.includes(f.path))) return true;
// Check hash allowlist
const hash = crypto.createHash('sha256').update(finding.value).digest('hex');
if (allowlist.hashes.some(h => h.sha256 === hash)) return true;
return false;
}
Testing Strategy
Unit Tests
describe('SecretScanner', () => {
describe('pattern detection', () => {
it('detects AWS access keys', () => {
const content = 'const key = "AKIAIOSFODNN7EXAMPLE";';
const findings = scanForSecrets(content);
expect(findings).toHaveLength(1);
expect(findings[0].name).toBe('AWS Access Key');
});
it('ignores test Stripe keys', () => {
const content = 'const key = "sk_test_4eC39HqLyjWDarjtT1zdp7dc";';
const findings = scanForSecrets(content);
expect(findings).toHaveLength(0);
});
it('detects live Stripe keys', () => {
const content = 'const key = "sk_live_4eC39HqLyjWDarjtT1zdp7dc";';
const findings = scanForSecrets(content);
expect(findings).toHaveLength(1);
expect(findings[0].severity).toBe('critical');
});
});
describe('entropy analysis', () => {
it('calculates correct entropy', () => {
expect(calculateEntropy('aaaaaaaa')).toBeLessThan(1);
expect(calculateEntropy('aB3$kL9@mN2xYz')).toBeGreaterThan(3.5);
});
it('flags high-entropy strings in sensitive contexts', () => {
const content = 'const apiKey = "xK9mL2pQ8rT5vY3nZ";';
const findings = scanForSecrets(content);
expect(findings.some(f => f.type === 'entropy')).toBe(true);
});
});
describe('allowlist', () => {
it('skips allowlisted patterns', () => {
const allowlist = { patterns: [{ pattern: /test_.*/, reason: 'test keys' }] };
const finding = { value: 'test_abc123', type: 'pattern' };
expect(isAllowlisted(finding, 'file.ts', allowlist)).toBe(true);
});
});
});
Integration Tests
describe('Hook Integration', () => {
it('blocks write with secrets', async () => {
const input = {
hook_event_name: 'preToolUse',
tool_name: 'write',
tool_input: {
file_path: 'config.ts',
content: 'export const key = "sk_live_4eC39HqLyjWDarjtT1zdp7dc";'
}
};
const result = await runHook(input);
expect(result.exitCode).toBe(2);
expect(result.stdout).toContain('BLOCKED');
expect(result.stdout).toContain('Stripe');
});
it('allows write without secrets', async () => {
const input = {
hook_event_name: 'preToolUse',
tool_name: 'write',
tool_input: {
file_path: 'config.ts',
content: 'export const key = process.env.API_KEY;'
}
};
const result = await runHook(input);
expect(result.exitCode).toBe(0);
});
});
Manual Testing
# 1. Test with a file containing a secret
echo '{"hook_event_name":"preToolUse","tool_name":"write","tool_input":{"file_path":"test.ts","content":"const key = \"sk_live_4eC39HqLyjWDarjtT1zdp7dc\";"}}' | bun run secret-scanner.ts
# Should exit 2 with blocking message
# 2. Test with safe file
echo '{"hook_event_name":"preToolUse","tool_name":"write","tool_input":{"file_path":"test.ts","content":"const key = process.env.API_KEY;"}}' | bun run secret-scanner.ts
# Should exit 0
# 3. Test in actual Kiro session
kiro-cli chat
> "Create a config file with my Stripe key sk_live_4eC39HqLyjWDarjtT1zdp7dc"
# Should see hook block the write
Common Pitfalls and Debugging
Pitfall 1: Hook Blocks Too Much (False Positives)
Symptom: Legitimate code blocked as secrets
Debugging:
// Add debug logging to hook
console.error('[SecretScanner] Checking:', value);
console.error('[SecretScanner] Entropy:', calculateEntropy(value));
console.error('[SecretScanner] Context:', context);
Solution:
// Add to allowlist
{
"patterns": [
{
"pattern": "example_.*",
"reason": "Example values in documentation"
}
]
}
Pitfall 2: Hook Slows Down Development
Symptom: Noticeable delay on every file write
Cause: Scanning is too slow
Solution:
// Quick rejection for files unlikely to hold text secrets (binary assets, lockfiles)
const skippedExtensions = ['.png', '.jpg', '.gif', '.woff2', '.lock'];
if (skippedExtensions.some(ext => filePath.endsWith(ext))) {
  process.exit(0); // Keep scanning config files (.env, .json, .yaml): secrets often live there
}
// Quick rejection for empty content
if (content.trim().length === 0) {
  process.exit(0); // Nothing to scan; do not skip short files, since a one-line file can hold a key
}
Pitfall 3: AI Doesn't Understand Feedback
Symptom: AI keeps trying to write secrets differently
Cause: Feedback isn't specific enough
Solution:
// Make feedback actionable
const feedback = `
BLOCKED: Hardcoded Stripe key detected.
Instead of:
const stripeKey = "sk_live_xxx";
Write:
const stripeKey = process.env.STRIPE_SECRET_KEY;
And add to .env file:
STRIPE_SECRET_KEY=sk_live_xxx
`;
Pitfall 4: Allowlist Grows Unbounded
Symptom: Allowlist becomes a dumping ground for false positives
Prevention:
// Require expiration for all entries
function validateAllowlistEntry(entry: AllowlistEntry): boolean {
if (!entry.expires) {
console.error('All allowlist entries must have expiration');
return false;
}
const maxExpiration = new Date();
maxExpiration.setMonth(maxExpiration.getMonth() + 6);
if (new Date(entry.expires) > maxExpiration) {
console.error('Allowlist entries cannot exceed 6 months');
return false;
}
return true;
}
Extensions and Challenges
Extension 1: Real-Time Secret Verification
For some secrets, verify if they're actually valid:
async function verifySecretIsLive(type: string, value: string): Promise<boolean> {
switch (type) {
case 'github_token': {
  // Sends the candidate token to GitHub; only do this when verification is explicitly enabled
  const response = await fetch('https://api.github.com/user', {
    headers: { Authorization: `Bearer ${value}` }
  });
  return response.status === 200;
}
case 'aws_access_key':
// Check if key is valid without making destructive calls
// Use STS GetCallerIdentity
default:
return true; // Assume live if can't verify
}
}
Extension 2: Git History Scanning
Scan existing git history for leaked secrets:
# Integrate with gitleaks
bun run scan-history.ts
# Output:
# Found 3 secrets in git history:
# 1. commit abc123 (2023-01-15): AWS key in config.ts
# 2. commit def456 (2023-02-20): GitHub token in deploy.sh
# 3. commit ghi789 (2023-03-10): DB password in docker-compose.yml
Extension 3: Secret Rotation Assistance
When a secret is detected, offer to rotate it:
SECRET DETECTED: Stripe Live Key
This key appears to be exposed. Would you like me to:
1. Rotate the key via Stripe API
2. Update all references to use environment variable
3. Add to .env.example with placeholder
4. Generate documentation for rotation procedure
Extension 4: Team Metrics Dashboard
Track secret detection metrics:
Secret Scanner Metrics (Last 30 Days):
Blocks: 47
By Type:
  AWS Keys:       12  ████████
  Stripe Keys:     8  █████
  GitHub Tokens:   5  ███
  Passwords:      22  ██████████████

False Positive Rate: 3.2%
Allowlist Entries: 8 (3 expired)
Extension 5: Custom Pattern Builder
Allow users to define custom patterns in a project config file:
// .kiro/custom-patterns.json
{
"patterns": [
{
"name": "Internal API Token",
"regex": "int_[a-f0-9]{32}",
"severity": "high",
"description": "Internal service authentication token"
}
]
}
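User-supplied patterns should be validated before they reach the scanner, since one invalid regex would otherwise crash the hook. The following is a minimal sketch of a loader for the file format above; `compileCustomPatterns` and its return shape are hypothetical. It assumes the file is pure JSON, so a leading `//` comment line must be removed before parsing.

```typescript
type Severity = 'critical' | 'high' | 'medium' | 'low';

interface CompiledPattern {
  name: string;
  pattern: RegExp;
  severity: Severity;
  description: string;
}

const SEVERITIES: readonly string[] = ['critical', 'high', 'medium', 'low'];

// Compile each entry, collecting errors instead of throwing on the first bad one.
function compileCustomPatterns(jsonText: string): { patterns: CompiledPattern[]; errors: string[] } {
  const patterns: CompiledPattern[] = [];
  const errors: string[] = [];
  const parsed = JSON.parse(jsonText) as { patterns?: unknown };
  const entries: unknown[] = Array.isArray(parsed.patterns) ? parsed.patterns : [];

  entries.forEach((entry, i) => {
    if (typeof entry !== 'object' || entry === null) {
      errors.push(`patterns[${i}]: must be an object`);
      return;
    }
    const e = entry as { name?: unknown; regex?: unknown; severity?: unknown; description?: unknown };
    if (typeof e.name !== 'string' || typeof e.regex !== 'string') {
      errors.push(`patterns[${i}]: "name" and "regex" must be strings`);
      return;
    }
    if (typeof e.severity !== 'string' || !SEVERITIES.includes(e.severity)) {
      errors.push(`patterns[${i}]: unknown severity`);
      return;
    }
    try {
      patterns.push({
        name: e.name,
        pattern: new RegExp(e.regex, 'g'), // throws SyntaxError on an invalid regex
        severity: e.severity as Severity,
        description: typeof e.description === 'string' ? e.description : '',
      });
    } catch {
      errors.push(`patterns[${i}]: invalid regex`);
    }
  });
  return { patterns, errors };
}
```

Returning the errors alongside the valid patterns lets the hook log a warning and keep scanning with whatever compiled, which fits the fail-open requirement.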
Real-World Connections
Industry Tools
Your hook implements similar functionality to:
- gitleaks: Open-source secret scanner
- trufflehog: Entropy-based secret detection
- GitGuardian: Enterprise secret monitoring
- GitHub Secret Scanning: Built into GitHub
- AWS CodeGuru: Amazon's security scanner
Production Deployment
| Concern | Solution |
|---|---|
| Performance | Async pattern matching, early rejection |
| Reliability | Fail-open behavior, timeout handling |
| Maintainability | Pattern definitions in config files |
| Compliance | Audit logging of blocked writes |
| Enterprise | Integration with SIEM/SOAR systems |
Compliance Requirements
Many regulations require secret protection:
- SOC 2: Type II requires credential management controls
- PCI DSS: Requirement 3 covers protection of cardholder data
- HIPAA: Security Rule requires access controls
- GDPR: Article 32 requires appropriate security measures
Self-Assessment Checklist
Knowledge Verification
- Can you explain the difference between exit codes 0, 1, and 2 for hooks?
- What is Shannon entropy and how does it help detect secrets?
- Why is PreToolUse better than git pre-commit for AI-generated code?
- What are the trade-offs between pattern matching and entropy analysis?
- How do allowlists prevent security bypass while handling false positives?
Implementation Verification
- Hook blocks writes containing AWS access keys
- Hook blocks writes containing Stripe live keys
- Hook allows test keys (sk_test_*)
- High-entropy strings in sensitive contexts are flagged
- Allowlist properly excludes known false positives
Quality Verification
- Feedback messages are actionable (include how to fix)
- Secrets are redacted in output
- Hook completes within 500ms
- False positive rate is acceptable (< 5%)
Integration Verification
- Hook works seamlessly during normal Kiro usage
- AI successfully refactors to use environment variables after block
- Allowlist is version controlled
- Hook fails open if it crashes (exit 1, not 2)
Summary
Building a secret scanner hook teaches you:
- Blocking Hook Pattern: Using exit code 2 to prevent dangerous operations
- Multi-Layer Detection: Combining patterns, entropy, and context analysis
- Security Feedback Loops: Guiding AI to fix issues, not just blocking
- Defense in Depth: Understanding where this fits in the security stack
This is arguably one of the most important security controls you can add to AI-assisted development. By catching secrets before they even hit the filesystem, you're stopping a common source of credential leaks at the earliest possible point.
Next Project: P25-code-review-workflow.md - Multi-agent code review system