Project 24: Secret Scanner Hook (Security on Write)

Build a PreToolUse hook that scans for secrets (API keys, passwords, tokens) before Kiro writes files, stopping sensitive data before it reaches the filesystem, let alone a commit.

Learning Objectives

By completing this project, you will:

Master PreToolUse blocking hooks with exit code 2 for preventing dangerous operations
Understand secret detection patterns including regex, entropy analysis, and known formats
Implement allowlist systems for handling false positives safely
Design security feedback loops that guide AI to fix issues, not just block them
Apply defense-in-depth principles by catching secrets before git commit

Deep Theoretical Foundation

The Secret Leakage Problem

Secrets in source code are one of the most common security vulnerabilities:

How Secrets Leak:

Developer's Intent                    Reality
─────────────────                    ───────
"I'll just test               ┌─────────────────────────────┐
 with the real key"   ───────►│ const key = "sk_live_xxx"  │
                              └─────────────────────────────┘
                                            │
                                            ▼
                              ┌─────────────────────────────┐
                              │    git add . && commit      │
                              └─────────────────────────────┘
                                            │
                                            ▼
                              ┌─────────────────────────────┐
                              │    git push origin main     │
                              └─────────────────────────────┘
                                            │
              ┌─────────────────────────────┼─────────────────────────────┐
              │                             │                             │
              ▼                             ▼                             ▼
┌───────────────────────┐   ┌───────────────────────┐   ┌───────────────────────┐
│ GitHub/GitLab Stores  │   │ CI/CD Logs Expose    │   │ Bots Scan & Exploit  │
│ Forever in History    │   │ In Build Output      │   │ Within Minutes       │
└───────────────────────┘   └───────────────────────┘   └───────────────────────┘

Statistics that matter:

GitGuardian found 10+ million secrets exposed on GitHub in 2023
Average time to exploit a leaked AWS key: 4 minutes
Cost of secret rotation after leak: hours to days of engineering time

The PreToolUse Blocking Pattern

Unlike PostToolUse (which observes), PreToolUse can prevent operations:

PreToolUse Hook Flow:

┌─────────────────────────────────────────────────────────────────────┐
│                           Kiro Agent                                 │
│                                                                      │
│  User: "Create a config file with the database connection"          │
│                                                                      │
│  Agent plans: write tool → config.ts                                │
│         │                                                            │
│         ▼                                                            │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ PreToolUse Hook Runs BEFORE Write                               ││
│  │                                                                  ││
│  │   Input: { tool: "write", content: "...password=secret123..." } ││
│  │                                                                  ││
│  │   ┌──────────────────────────────────────────────────────────┐ ││
│  │   │ Secret Scanner                                           │ ││
│  │   │                                                          │ ││
│  │   │ Checking content...                                      │ ││
│  │   │ ⚠ FOUND: Hardcoded password on line 5                   │ ││
│  │   │                                                          │ ││
│  │   │ Exit code 2 → BLOCK                                      │ ││
│  │   │ stdout → Message to agent                                │ ││
│  │   └──────────────────────────────────────────────────────────┘ ││
│  └─────────────────────────────────────────────────────────────────┘│
│         │                                                            │
│         ▼                                                            │
│  Write tool is BLOCKED                                              │
│                                                                      │
│  Agent receives: "BLOCKED: Found hardcoded password. Use            │
│                   environment variable instead: process.env.DB_PASS"│
│                                                                      │
│  Agent: "I understand. Let me refactor to use environment           │
│          variables instead..."                                       │
│                                                                      │
│  [Rewrites with process.env.DB_PASSWORD]                            │
│                                                                      │
│  PreToolUse runs again → No secrets found → Exit 0 → ALLOWED       │
└─────────────────────────────────────────────────────────────────────┘

Exit Code Protocol

Kiro hooks use exit codes to communicate decisions:

Exit Code Meanings:

┌────────────┬────────────────────────────────────────────────────────┐
│ Exit Code  │ Meaning                                                │
├────────────┼────────────────────────────────────────────────────────┤
│     0      │ ALLOW - Continue with the operation                   │
│            │ Hook ran successfully, no issues found                │
├────────────┼────────────────────────────────────────────────────────┤
│     1      │ ERROR - Something went wrong in the hook              │
│            │ Operation continues (fail-open)                       │
│            │ Error is logged as warning                            │
├────────────┼────────────────────────────────────────────────────────┤
│     2      │ BLOCK - Prevent the operation                         │
│            │ stdout is sent to agent as feedback                   │
│            │ Agent should address the issue and retry              │
└────────────┴────────────────────────────────────────────────────────┘
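
The table above can be sketched as a small decision function. This is a minimal illustration, not Kiro's own API: `decide`, `Decision`, and the `EXIT_*` constants are hypothetical names, and the scanner is passed in as a callback so a crash can be mapped to exit code 1. Which output stream the agent actually reads is taken from this document's protocol and should be checked against the hook documentation for your Kiro version.

```typescript
// Exit codes as described in the table above.
const EXIT_ALLOW = 0;
const EXIT_ERROR = 1;
const EXIT_BLOCK = 2;

interface Decision {
  exitCode: number;
  message: string; // Text intended for the agent (empty when allowing)
}

// Hypothetical helper: turns a scan outcome into an exit code and a message.
function decide(scan: () => string[]): Decision {
  let findings: string[];
  try {
    findings = scan();
  } catch (err) {
    // A crash in the scanner is reported as an error, not a block (fail-open).
    const reason = err instanceof Error ? err.message : String(err);
    return { exitCode: EXIT_ERROR, message: `scanner error: ${reason}` };
  }

  if (findings.length === 0) {
    return { exitCode: EXIT_ALLOW, message: "" };
  }

  const lines = ["BLOCKED: potential secrets detected", ...findings.map((f) => `- ${f}`)];
  return { exitCode: EXIT_BLOCK, message: lines.join("\n") };
}
```

In the hook script, the result would be printed and then passed to `process.exit(decision.exitCode)` as the last step, so that no code path exits with 2 by accident.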

Secret Detection Techniques

Multiple techniques work together for comprehensive detection:

Detection Layer Stack:

Layer 1: PATTERN MATCHING (Fast, High Confidence)
┌─────────────────────────────────────────────────────────────────────┐
│ Known Secret Formats:                                                │
│                                                                      │
│ AWS Access Key:     AKIA[A-Z0-9]{16}                               │
│ AWS Secret:         [A-Za-z0-9/+=]{40}                              │
│ GitHub Token:       ghp_[A-Za-z0-9]{36}                             │
│ Stripe Live Key:    sk_live_[A-Za-z0-9]{24,}                        │
│ JWT:                eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+│
│ Private Key:        -----BEGIN (RSA |EC )?PRIVATE KEY-----          │
└─────────────────────────────────────────────────────────────────────┘

Layer 2: ENTROPY ANALYSIS (Catches Unknown Formats)
┌─────────────────────────────────────────────────────────────────────┐
│ Shannon Entropy measures randomness:                                 │
│                                                                      │
│ "password"                 → 2.75 bits/char → Low, not flagged       │
│ "aB3$kL9@mN2"              → 3.46 bits/char → Below threshold        │
│ "4eC39HqLyjWDarjtT1zdp7dc" → 4.42 bits/char → High, likely a secret  │
│                                                                      │
│ Threshold: >= 4.0 bits/char in strings of 16+ chars = Suspicious     │
└─────────────────────────────────────────────────────────────────────┘

Layer 3: CONTEXTUAL ANALYSIS (Reduces False Positives)
┌─────────────────────────────────────────────────────────────────────┐
│ Variable Name Context:                                               │
│                                                                      │
│ const password = "test123"     → Suspicious (password in name)     │
│ const description = "test123"  → Probably fine                     │
│ const API_KEY = process.env.X  → Good (using env var)              │
│ const API_KEY = "sk_live_xxx"  → BAD (hardcoded with key name)    │
└─────────────────────────────────────────────────────────────────────┘

Layer 4: KNOWN VENDOR PATTERNS (Very High Confidence)
┌─────────────────────────────────────────────────────────────────────┐
│ Vendor-Specific Formats:                                             │
│                                                                      │
│ Stripe:    sk_live_*, rk_live_*, pk_live_*                         │
│ AWS:       AKIA*, ASIA* (access keys)                               │
│ GitHub:    ghp_*, gho_*, ghu_*, ghs_*, ghr_*                       │
│ Slack:     xox[baprs]-*                                             │
│ Twilio:    SK[a-f0-9]{32}                                           │
│ SendGrid:  SG\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+                      │
└─────────────────────────────────────────────────────────────────────┘
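
Layer 3 is the least mechanical of the four, so a small sketch may help. The code below classifies the four example assignments from the Layer 3 box. It is a simplified illustration under stated assumptions: `classifyAssignment`, `ContextVerdict`, and the list of sensitive names are hypothetical, and only single-line `const`/`let`/`var` assignments are recognized.

```typescript
type ContextVerdict = "hardcoded-secret" | "env-reference" | "not-sensitive";

// Names that suggest the value is a credential (assumed list; extend as needed).
const SENSITIVE_NAME = /(key|secret|password|passwd|token|credential)/i;

// Matches simple `const NAME = VALUE` (or let/var) assignments on one line.
const ASSIGNMENT = /\b(?:const|let|var)\s+([A-Za-z_$][\w$]*)\s*=\s*(.+?);?\s*$/;

function classifyAssignment(line: string): ContextVerdict {
  const match = ASSIGNMENT.exec(line);
  if (!match) return "not-sensitive";

  const name = match[1] ?? "";
  const value = match[2] ?? "";
  if (!SENSITIVE_NAME.test(name)) return "not-sensitive";

  // Reading from the environment is the pattern the hook should encourage.
  if (/^process\.env\b/.test(value)) return "env-reference";

  // A quoted literal assigned to a sensitive name is treated as hardcoded.
  if (/^(['"`]).+\1$/.test(value)) return "hardcoded-secret";

  return "not-sensitive";
}
```

Only the `hardcoded-secret` verdict would produce a finding; the other two let the write continue to the remaining layers.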

Shannon Entropy Calculation

Entropy measures the randomness of a string - high entropy often indicates secrets:

Shannon Entropy Formula:

H = -Σ p(x) * log2(p(x))

Where p(x) is the probability of character x appearing in the string.

Example Calculation for "aB3$kL9@":

Character frequencies:
a=1, B=1, 3=1, $=1, k=1, L=1, 9=1, @=1

Each appears 1/8 of the time, so:
H = -8 * (1/8 * log2(1/8))
H = -8 * (1/8 * -3)
H = 3 bits per character

Interpretation:
┌───────────────────┬──────────────┬────────────────────────────────┐
│ Entropy Range     │ Meaning      │ Example                        │
├───────────────────┼──────────────┼────────────────────────────────┤
│ 0 - 2.0          │ Very low     │ "aaaaaaaa"                     │
│ 2.0 - 3.0        │ Low          │ "password"                     │
│ 3.0 - 4.0        │ Medium       │ "aB3$kL9@mN2"                  │
│ 4.0 - 5.0        │ High         │ Random tokens                  │
│ 5.0+             │ Very high    │ Cryptographic secrets          │
└───────────────────┴──────────────┴────────────────────────────────┘
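
One consequence of the formula is worth checking in code: a string of n characters can never exceed log2(n) bits per character, so short strings cannot reach the upper rows of the table no matter how random they look. The sketch below recomputes a few of the examples; `shannonEntropy` and `maxEntropyForLength` are illustrative names, not part of any library.

```typescript
// Shannon entropy in bits per character (same formula as above).
function shannonEntropy(text: string): number {
  const counts = new Map<string, number>();
  let length = 0;
  for (const ch of text) {
    counts.set(ch, (counts.get(ch) ?? 0) + 1);
    length++;
  }

  let entropy = 0;
  for (const count of counts.values()) {
    const p = count / length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

// Upper bound: reached only when every character in the string is distinct.
function maxEntropyForLength(n: number): number {
  return n > 0 ? Math.log2(n) : 0;
}

for (const sample of ["aaaaaaaa", "password", "aB3$kL9@"]) {
  const h = shannonEntropy(sample).toFixed(2);
  const cap = maxEntropyForLength(sample.length).toFixed(2);
  console.log(`${sample}: ${h} bits/char (cap ${cap})`);
}
// aaaaaaaa: 0.00 bits/char (cap 3.00)
// password: 2.75 bits/char (cap 3.00)
// aB3$kL9@: 3.00 bits/char (cap 3.00)
```

This is why an entropy threshold is paired with a minimum length: a threshold of 4.0 is only reachable from 16 characters upward.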

Allowlist Architecture

Not every detected “secret” is actually a problem. Allowlists handle false positives:

Allowlist Structure:

.kiro/secrets-allowlist.json
┌─────────────────────────────────────────────────────────────────────┐
│ {                                                                    │
│   "patterns": [                                                      │
│     {                                                                │
│       "pattern": "sk_test_.*",                                      │
│       "reason": "Test mode Stripe keys are safe",                   │
│       "expires": "2025-01-01"                                       │
│     }                                                                │
│   ],                                                                 │
│   "files": [                                                         │
│     {                                                                │
│       "path": "src/examples/demo.ts",                               │
│       "reason": "Demo file with fake credentials",                  │
│       "hash": "abc123..."                                           │
│     }                                                                │
│   ],                                                                 │
│   "hashes": [                                                        │
│     {                                                                │
│       "sha256": "def456...",                                        │
│       "reason": "Public example from documentation",               │
│       "addedBy": "security-team",                                   │
│       "addedAt": "2024-01-15"                                       │
│     }                                                                │
│   ]                                                                  │
│ }                                                                    │
└─────────────────────────────────────────────────────────────────────┘
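
Because every allowlist entry weakens the scanner, it can be useful to validate entries when the file is loaded. The following is a sketch under assumptions: `validatePatternEntry` and `RawPatternEntry` are hypothetical names, and the field names follow the `patterns` entries in the structure above.

```typescript
interface RawPatternEntry {
  pattern: string;
  reason?: string;
  expires?: string; // ISO date, e.g. "2025-01-01"
}

// Hypothetical validator: returns a list of problems, empty if the entry is usable.
function validatePatternEntry(entry: RawPatternEntry, now: Date): string[] {
  const problems: string[] = [];

  // Every exception should say why it exists.
  if (!entry.reason || entry.reason.trim() === "") {
    problems.push("missing reason");
  }

  // The pattern must compile, otherwise loading the allowlist would throw.
  try {
    new RegExp(entry.pattern);
  } catch {
    problems.push("pattern is not a valid regular expression");
  }

  // Expired or unparseable dates make the entry unusable.
  if (entry.expires !== undefined) {
    const expiry = new Date(entry.expires);
    if (Number.isNaN(expiry.getTime())) {
      problems.push("expires is not a valid date");
    } else if (expiry.getTime() <= now.getTime()) {
      problems.push("entry has expired");
    }
  }

  return problems;
}
```

Passing `now` as a parameter rather than calling `new Date()` inside keeps the function deterministic, which makes expiry behavior easy to unit test.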

Defense in Depth

Your hook is one layer in a security onion:

Security Layers (Defense in Depth):

┌─────────────────────────────────────────────────────────────────────┐
│                        Developer Workstation                         │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ Layer 1: Kiro PreToolUse Hook ◄── YOUR PROJECT                  ││
│  │          Blocks before file write                               ││
│  └─────────────────────────────────────────────────────────────────┘│
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ Layer 2: Git Pre-Commit Hook                                    ││
│  │          Blocks before commit                                   ││
│  └─────────────────────────────────────────────────────────────────┘│
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ Layer 3: Git Pre-Push Hook                                      ││
│  │          Blocks before push                                     ││
│  └─────────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────────┘
                              │
                              ▼ Push
┌─────────────────────────────────────────────────────────────────────┐
│                           Remote/CI                                  │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ Layer 4: GitHub Secret Scanning                                 ││
│  │          Alerts on push                                         ││
│  └─────────────────────────────────────────────────────────────────┘│
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ Layer 5: CI Pipeline Scanner (gitleaks/trufflehog)             ││
│  │          Fails pipeline                                         ││
│  └─────────────────────────────────────────────────────────────────┘│
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │ Layer 6: Periodic Full Scan                                     ││
│  │          Catches historical leaks                               ││
│  └─────────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────────┘

Why multiple layers?
• Each layer catches what others miss
• Earlier is better (cheaper to fix)
• PreToolUse is the EARLIEST possible layer

Real-World Analogy: The Security Guard

Your hook is like a security guard at a building entrance:

Checks everyone entering (scans all file content)
Has a list of prohibited items (secret patterns)
Uses judgment (entropy analysis for unknown threats)
Allows known employees (allowlist for false positives)
Explains why someone is stopped (feedback to AI)
Suggests alternatives (use env vars instead)

Historical Context

Secret detection has evolved significantly:

Evolution of Secret Protection:

2010s: Manual Review
       └─► Humans reviewed code for secrets (slow, error-prone)

2015: Git Hooks (pre-commit)
       └─► Block commits containing patterns

2017: GitHub Secret Scanning
       └─► Automatic detection on push

2019: gitleaks/trufflehog
       └─► Open-source entropy-based scanning

2023: AI Code Assistants
       └─► Can generate code with secrets!

2024+: PreToolUse Hooks ◄─── YOU ARE HERE
       └─► Block AI from writing secrets
           BEFORE they even hit the filesystem

Book References

For deeper understanding:

“Practical Security Automation” by Tony UcedaVelez - Automation patterns
“Application Security Handbook” by OWASP - Secret management best practices
“Secrets Management” by HashiCorp - Enterprise secret handling
“The Tangled Web” by Michal Zalewski - Understanding security vulnerabilities

Complete Project Specification

What You Are Building

A PreToolUse hook that:

Intercepts file writes before they happen
Scans content for secrets using multiple detection methods
Blocks dangerous writes with exit code 2
Provides actionable feedback to help the AI fix the issue
Supports allowlists for legitimate exceptions

Functional Requirements

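┌───────────────────┬─────────────────────────────────────────────────┐
│ Feature           │ Behavior                                        │
├───────────────────┼─────────────────────────────────────────────────┤
│ Pattern Detection │ Detect 20+ common secret formats                │
│ Entropy Analysis  │ Flag high-entropy strings in sensitive contexts │
│ Context Awareness │ Consider variable names and file types          │
│ Blocking          │ Exit 2 with clear message when secrets found    │
│ Allowlist         │ Skip known false positives                      │
│ Remediation       │ Suggest environment variable alternatives       │
└───────────────────┴─────────────────────────────────────────────────┘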

Non-Functional Requirements

Latency: Complete scan within 500ms (must not slow down development)
False Positive Rate: < 5% false positives with default rules
Coverage: Detect all OWASP top secret patterns
Fail-Safe: If hook crashes, fail open (exit 1) to not block development

Solution Architecture

High-Level Component Diagram

┌─────────────────────────────────────────────────────────────────────┐
│                           Kiro CLI                                   │
│  ┌─────────────────────────────────────────────────────────────────┐│
│  │                    Agent Session                                 ││
│  │                                                                  ││
│  │  Agent plans: write → src/config.ts                             ││
│  │         │                                                        ││
│  └─────────┼────────────────────────────────────────────────────────┘│
└────────────┼────────────────────────────────────────────────────────┘
             │ preToolUse event (BEFORE write)
             ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    Secret Scanner Hook                               │
│                                                                      │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐           │
│  │ Pattern       │  │ Entropy       │  │ Context       │           │
│  │ Matcher       │  │ Analyzer      │  │ Analyzer      │           │
│  │ • 20+ regex   │  │ • Shannon H   │  │ • Var names   │           │
│  │ • Known keys  │  │ • Threshold   │  │ • File type   │           │
│  └───────────────┘  └───────────────┘  └───────────────┘           │
│          │                  │                  │                    │
│          └──────────────────┼──────────────────┘                    │
│                             ▼                                        │
│                   ┌─────────────────────┐                           │
│                   │  Finding Aggregator │                           │
│                   │  • Deduplicate      │                           │
│                   │  • Check allowlist  │                           │
│                   │  • Rank by severity │                           │
│                   └─────────────────────┘                           │
│                             │                                        │
│                             ▼                                        │
│          ┌──────────────────┴──────────────────┐                    │
│          │                                      │                    │
│   No Secrets Found                      Secrets Found               │
│          │                                      │                    │
│          ▼                                      ▼                    │
│   ┌─────────────────┐               ┌─────────────────┐            │
│   │   Exit Code 0   │               │   Exit Code 2   │            │
│   │   (Allow)       │               │   (Block)       │            │
│   └─────────────────┘               │   + Feedback    │            │
│                                      └─────────────────┘            │
└─────────────────────────────────────────────────────────────────────┘

Data Flow: Secret Detection and Blocking

1. Hook Receives Write Request
   ┌─────────────────────────────────────────────────────────────────┐
   │ stdin: {                                                        │
   │   "hook_event_name": "preToolUse",                              │
   │   "tool_name": "write",                                         │
   │   "tool_input": {                                               │
   │     "file_path": "src/config.ts",                               │
   │     "content": "export const config = {\n  apiKey: 'sk_live_4eC39HqLyjWDarjtT1zdp7dc'\n};"
   │   }                                                              │
   │ }                                                                │
   └─────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
2. Pattern Matching Phase
   ┌─────────────────────────────────────────────────────────────────┐
   │ Checking pattern: Stripe Live Key                               │
   │ Regex: sk_live_[A-Za-z0-9]{24,}                                 │
   │                                                                  │
   │ MATCH FOUND at line 2:                                          │
   │   sk_live_4eC39HqLyjWDarjtT1zdp7dc                              │
   │                                                                  │
   │ Confidence: HIGH (known format)                                 │
   └─────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
3. Allowlist Check
   ┌─────────────────────────────────────────────────────────────────┐
   │ Checking allowlist...                                            │
   │                                                                  │
   │ • Pattern "sk_test_.*" - NOT matched (this is live, not test)  │
   │ • File "src/config.ts" - NOT in file allowlist                 │
   │ • Hash of value - NOT in hash allowlist                        │
   │                                                                  │
   │ Result: NOT ALLOWLISTED → Proceed to block                     │
   └─────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
4. Generate Blocking Response
   ┌─────────────────────────────────────────────────────────────────┐
   │ stdout:                                                         │
   │                                                                  │
   │ BLOCKED: Potential secrets detected in file write              │
   │                                                                  │
   │ File: src/config.ts                                             │
   │                                                                  │
   │ Findings:                                                        │
   │ • Line 2: Stripe Live API Key (sk_live_...)                    │
   │   Severity: CRITICAL                                            │
   │   This is a production payment key that could be exploited     │
   │                                                                  │
   │ Remediation:                                                     │
   │ 1. Use environment variable: process.env.STRIPE_SECRET_KEY     │
   │ 2. Add to .env file (which is gitignored)                      │
   │ 3. Example:                                                      │
   │    export const config = {                                       │
   │      apiKey: process.env.STRIPE_SECRET_KEY                      │
   │    }                                                             │
   │                                                                  │
   │ exit code: 2                                                     │
   └─────────────────────────────────────────────────────────────────┘

Key Interfaces

// Secret finding
interface SecretFinding {
  type: 'pattern' | 'entropy' | 'context';
  name: string;           // "Stripe Live Key"
  pattern?: string;       // The regex that matched
  value: string;          // The matched secret, unredacted; redact before printing
  line: number;
  column: number;
  severity: 'critical' | 'high' | 'medium' | 'low';
  confidence: 'high' | 'medium' | 'low';
  remediation: string;    // Suggested fix
}

// Pattern definition
interface SecretPattern {
  name: string;
  pattern: RegExp;
  severity: 'critical' | 'high' | 'medium' | 'low';
  description: string;
  remediation: string;
  testCases: {
    shouldMatch: string[];
    shouldNotMatch: string[];
  };
}

// Allowlist entry
interface AllowlistEntry {
  type: 'pattern' | 'file' | 'hash';
  value: string;
  reason: string;
  addedBy: string;
  addedAt: string;
  expires?: string;
}

// Hook configuration
interface SecretScannerConfig {
  enabled: boolean;
  patterns: SecretPattern[];
  entropyThreshold: number;      // Default: 4.0
  entropyMinLength: number;      // Default: 16
  allowlistPath: string;
  excludePaths: string[];        // e.g., ["*.test.ts", "fixtures/*"]
  failOpen: boolean;             // On hook errors, allow the write (exit 1) instead of blocking
}
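
The `excludePaths` globs need a matcher. The sketch below is one possible interpretation, not a defined Kiro behavior: it assumes `*` is the only wildcard, that it may match across `/`, and that a pattern may match at the start of the path or directly after any `/`. The names `globToRegExp` and `isExcluded` are illustrative.

```typescript
// Converts a simple glob (only `*` is special) into an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  // Escape every regex metacharacter except `*`.
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  // Assumption: `*` matches any run of characters, including `/`.
  const body = escaped.replace(/\*/g, ".*");
  return new RegExp(`(^|/)${body}$`);
}

function isExcluded(filePath: string, excludePaths: string[]): boolean {
  const normalized = filePath.replace(/\\/g, "/"); // tolerate Windows separators
  return excludePaths.some((glob) => globToRegExp(glob).test(normalized));
}
```

With the example globs from the interface comment, `isExcluded("src/foo.test.ts", ["*.test.ts", "fixtures/*"])` is true and `isExcluded("src/config.ts", ["*.test.ts", "fixtures/*"])` is false. Excluded paths skip scanning entirely, so the list should stay short.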

Technology Choices

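┌────────────────┬───────────────┬─────────────────────────────────────┐
│ Component      │ Technology    │ Rationale                           │
├────────────────┼───────────────┼─────────────────────────────────────┤
│ Hook Runtime   │ Bun           │ Fast startup for responsiveness     │
│ Pattern Engine │ Native RegExp │ Performant, no dependencies         │
│ Entropy Calc   │ Custom        │ Simple algorithm, no library needed │
│ Config Format  │ JSON          │ Standard Kiro pattern               │
│ Allowlist      │ JSON file     │ Version controllable, editable      │
└────────────────┴───────────────┴─────────────────────────────────────┘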

Phased Implementation Guide

Phase 1: Hook Foundation (Days 1-2)

Goal: Create a PreToolUse hook that intercepts write operations.

Tasks:

Create hook script file
Configure in .kiro/settings.json
Parse stdin JSON for tool input
Filter for write tool only
Log file writes for debugging

Hints:

PreToolUse receives the same JSON format as PostToolUse
Exit 0 initially to not block anything
The content to scan is in tool_input.content

Configuration (.kiro/settings.json):

{
  "hooks": {
    "preToolUse": [
      {
        "matcher": "write",
        "command": "bun run /path/to/hooks/secret-scanner.ts"
      }
    ]
  }
}

Starter Code:

#!/usr/bin/env bun

import { readFileSync } from 'fs';

const input = JSON.parse(readFileSync(0, 'utf-8'));

// Only process write operations
if (input.tool_name !== 'write') {
  process.exit(0);
}

const { file_path, content } = input.tool_input;

console.error(`[SecretScanner] Checking: ${file_path}`);
console.error(`[SecretScanner] Content length: ${content.length} chars`);

// TODO: Implement scanning
process.exit(0);
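
The starter code destructures `tool_input` directly, which throws if a field is missing. A slightly more defensive variant is sketched below. The field names (`tool_name`, `tool_input.file_path`, `tool_input.content`) are taken from the stdin example in this guide and should be verified against the payload your Kiro version actually sends; `parseWriteRequest` and `WriteRequest` are hypothetical names.

```typescript
interface WriteRequest {
  filePath: string;
  content: string;
}

// Returns the write request to scan, or null when there is nothing to scan
// (a different tool, or missing fields). Throws on malformed JSON.
function parseWriteRequest(rawStdin: string): WriteRequest | null {
  const input: unknown = JSON.parse(rawStdin);
  if (typeof input !== "object" || input === null) return null;

  const event = input as { tool_name?: unknown; tool_input?: unknown };
  if (event.tool_name !== "write") return null;

  const toolInput = event.tool_input;
  if (typeof toolInput !== "object" || toolInput === null) return null;

  const { file_path, content } = toolInput as { file_path?: unknown; content?: unknown };
  if (typeof file_path !== "string" || typeof content !== "string") return null;

  return { filePath: file_path, content };
}
```

The hook would call this with the text read from stdin and exit 0 when it returns null. Keeping the parser a pure function of a string makes it testable without spawning the hook.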

Phase 2: Pattern Detection (Days 3-5)

Goal: Implement regex-based detection for known secret formats.

Tasks:

Define patterns for common secrets (AWS, GitHub, Stripe, etc.)
Scan content against all patterns
Record matches with line numbers
Generate finding objects
Test against known secrets

Hints:

Use global (g-flag) regexes with matchAll or an exec loop to find every match; a non-global regex stops at the first one
Track line numbers by splitting content
Redact the actual secret in output

Pattern Examples:

const PATTERNS: SecretPattern[] = [
  {
    name: 'AWS Access Key',
    pattern: /AKIA[A-Z0-9]{16}/g,
    severity: 'critical',
    description: 'AWS IAM access key ID',
    remediation: 'Use AWS_ACCESS_KEY_ID environment variable',
    testCases: {
      shouldMatch: ['AKIAIOSFODNN7EXAMPLE'],
      shouldNotMatch: ['AKIAEXAMPLE123'],
    },
  },
  {
    name: 'Stripe Live Key',
    pattern: /sk_live_[A-Za-z0-9]{24,}/g,
    severity: 'critical',
    description: 'Stripe live mode secret key',
    remediation: 'Use STRIPE_SECRET_KEY environment variable',
    testCases: {
      shouldMatch: ['sk_live_4eC39HqLyjWDarjtT1zdp7dc'],
      shouldNotMatch: ['sk_test_4eC39HqLyjWDarjtT1zdp7dc'],
    },
  },
  // Add more patterns...
];
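
Scanning with these patterns and recording line numbers could look like the sketch below. It is an illustration rather than the reference implementation: `scanWithPatterns`, `Match`, and `DEMO_PATTERNS` are hypothetical names, and only two of the patterns above are included. Each scan uses a fresh copy of the regex, because a shared `g`-flag regex carries `lastIndex` state between calls.

```typescript
interface Match {
  name: string;
  line: number;   // 1-based
  column: number; // 1-based
  value: string;
}

// Two patterns copied from the list above.
const DEMO_PATTERNS: { name: string; pattern: RegExp }[] = [
  { name: "AWS Access Key", pattern: /AKIA[A-Z0-9]{16}/g },
  { name: "Stripe Live Key", pattern: /sk_live_[A-Za-z0-9]{24,}/g },
];

function scanWithPatterns(content: string): Match[] {
  const matches: Match[] = [];

  content.split("\n").forEach((text, index) => {
    for (const { name, pattern } of DEMO_PATTERNS) {
      // Fresh copy per line so lastIndex always starts at 0.
      const re = new RegExp(pattern.source, pattern.flags);
      let m: RegExpExecArray | null;
      while ((m = re.exec(text)) !== null) {
        matches.push({ name, line: index + 1, column: m.index + 1, value: m[0] });
        // Guard against zero-length matches, which would loop forever.
        if (m[0].length === 0) re.lastIndex++;
      }
    }
  });

  return matches;
}
```

For the `src/config.ts` content shown in the data flow section, this reports one Stripe Live Key match on line 2.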

Phase 3: Entropy Analysis (Days 6-7)

Goal: Detect secrets that don’t match known patterns.

Tasks:

Implement Shannon entropy calculation
Extract potential secrets (quoted strings, assignments)
Calculate entropy for each candidate
Flag high-entropy strings in sensitive contexts
Tune threshold to minimize false positives

Hints:

Entropy > 4.0 for strings > 16 chars is suspicious
Consider context: variable names containing “key”, “secret”, “password”
Skip common high-entropy non-secrets (UUIDs, hashes in dependencies)

Entropy Implementation:

function calculateEntropy(str: string): number {
  const charCounts = new Map<string, number>();
  for (const char of str) {
    charCounts.set(char, (charCounts.get(char) || 0) + 1);
  }

  let entropy = 0;
  for (const count of charCounts.values()) {
    const probability = count / str.length;
    entropy -= probability * Math.log2(probability);
  }

  return entropy;
}

function isHighEntropySecret(value: string, context: string): boolean {
  if (value.length < 16) return false;

  const entropy = calculateEntropy(value);
  if (entropy < 4.0) return false;

  // Check context for sensitive variable names
  const sensitiveNames = /(?:key|secret|password|token|credential|auth)/i;
  return sensitiveNames.test(context);
}
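
The task "extract potential secrets (quoted strings, assignments)" is not covered by the functions above, so here is one way it might be done. This is a deliberately simple sketch: `extractCandidates` and `Candidate` are hypothetical names, and the regex handles only single- and double-quoted literals without escapes, not template literals.

```typescript
interface Candidate {
  value: string;   // The string literal's contents, without quotes
  context: string; // Text before the literal on the same line
  line: number;    // 1-based
}

// Single- or double-quoted literals with no embedded quotes or escapes.
const STRING_LITERAL = /(["'])([^"'\\\n]+)\1/g;

function extractCandidates(content: string, minLength = 16): Candidate[] {
  const candidates: Candidate[] = [];

  content.split("\n").forEach((text, index) => {
    // Fresh regex per line so lastIndex always starts at 0.
    const re = new RegExp(STRING_LITERAL.source, STRING_LITERAL.flags);
    let m: RegExpExecArray | null;
    while ((m = re.exec(text)) !== null) {
      const value = m[2] ?? "";
      // Short literals cannot reach the entropy threshold, so skip them early.
      if (value.length >= minLength) {
        candidates.push({ value, context: text.slice(0, m.index), line: index + 1 });
      }
    }
  });

  return candidates;
}
```

Each candidate's `value` and `context` can then be passed to `isHighEntropySecret(value, context)`, so the variable name on the left of the assignment serves as the sensitive-context check.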

Phase 4: Blocking and Feedback (Days 8-10)

Goal: Block dangerous writes and provide actionable feedback.

Tasks:

Format findings into clear output
Exit with code 2 to block
Provide specific remediation suggestions
Include code examples for fixes
Test the full blocking flow

Hints:

The feedback goes to the AI, so be specific
Include both what’s wrong AND how to fix it
Partially redact secrets in output

Blocking Output Format:

function formatBlockingMessage(findings: SecretFinding[]): string {
  const lines = [
    '⛔ BLOCKED: Potential secrets detected in file write',
    '',
    'The following secrets were found and must be removed:',
    '',
  ];

  for (const finding of findings) {
    lines.push(`• Line ${finding.line}: ${finding.name}`);
    lines.push(`  Value: ${redact(finding.value)}`);
    lines.push(`  Severity: ${finding.severity.toUpperCase()}`);
    lines.push(`  Fix: ${finding.remediation}`);
    lines.push('');
  }

  lines.push('Suggested approach:');
  lines.push('1. Use environment variables instead of hardcoded values');
  lines.push('2. Reference with process.env.VARIABLE_NAME');
  lines.push('3. Add actual values to .env file (gitignored)');

  return lines.join('\n');
}

function redact(secret: string): string {
  if (secret.length <= 8) return '****';
  return secret.slice(0, 4) + '...' + secret.slice(-4);
}

Phase 5: Allowlist System (Days 11-14)

Goal: Allow legitimate exceptions without disabling security.

Tasks:

Create allowlist file structure
Load allowlist on hook start
Check findings against allowlist before blocking
Support pattern, file, and hash-based allowlisting
Add expiration checking

Hints:

Allowlist file should be version controlled
Require a reason for each exception
Support expiration dates for temporary exceptions

Allowlist Loading:

interface Allowlist {
  patterns: { pattern: RegExp; reason: string; expires?: Date }[];
  files: { path: string; reason: string }[];
  hashes: { sha256: string; reason: string }[];
}

function loadAllowlist(path: string): Allowlist {
  const raw = JSON.parse(readFileSync(path, 'utf-8'));

  return {
    patterns: raw.patterns.map(p => ({
      pattern: new RegExp(p.pattern),
      reason: p.reason,
      expires: p.expires ? new Date(p.expires) : undefined,
    })),
    files: raw.files,
    hashes: raw.hashes,
  };
}

function isAllowlisted(finding: SecretFinding, filePath: string, allowlist: Allowlist): boolean {
  // Check pattern allowlist
  for (const entry of allowlist.patterns) {
    if (entry.expires && new Date() > entry.expires) continue;
    if (entry.pattern.test(finding.value)) return true;
  }

  // Check file allowlist
  if (allowlist.files.some(f => filePath.includes(f.path))) return true;

  // Check hash allowlist
  const hash = crypto.createHash('sha256').update(finding.value).digest('hex');
  if (allowlist.hashes.some(h => h.sha256 === hash)) return true;

  return false;
}
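
One caveat with the file check above: `filePath.includes(f.path)` is a substring match, so an entry such as `test` would also allowlist `src/latest/config.ts`. A stricter alternative, sketched here under the assumption that entries are meant to name whole directories or files, compares complete path segments. The helper names `toSegments` and `matchesAllowlistedPath` are hypothetical.

```typescript
// Split a path into segments, tolerating Windows separators and "./" prefixes.
function toSegments(p: string): string[] {
  return p.replace(/\\/g, '/').split('/').filter(s => s !== '' && s !== '.');
}

// True when `entryPath` matches a contiguous run of whole segments in `filePath`.
// Note: ".." segments are not resolved; normalize paths first if they can occur.
function matchesAllowlistedPath(filePath: string, entryPath: string): boolean {
  const file = toSegments(filePath);
  const entry = toSegments(entryPath);
  if (entry.length === 0 || entry.length > file.length) return false;
  for (let start = 0; start + entry.length <= file.length; start++) {
    if (entry.every((seg, i) => file[start + i] === seg)) return true;
  }
  return false;
}
```

With this helper, `matchesAllowlistedPath('src/latest/config.ts', 'test')` is false, while `matchesAllowlistedPath('/repo/test/fixtures/keys.ts', 'test/fixtures')` is true.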

Testing Strategy

Unit Tests

describe('SecretScanner', () => {
  describe('pattern detection', () => {
    it('detects AWS access keys', () => {
      const content = 'const key = "AKIAIOSFODNN7EXAMPLE";';
      const findings = scanForSecrets(content);
      expect(findings).toHaveLength(1);
      expect(findings[0].name).toBe('AWS Access Key');
    });

    it('ignores test Stripe keys', () => {
      const content = 'const key = "sk_test_4eC39HqLyjWDarjtT1zdp7dc";';
      const findings = scanForSecrets(content);
      expect(findings).toHaveLength(0);
    });

    it('detects live Stripe keys', () => {
      const content = 'const key = "sk_live_4eC39HqLyjWDarjtT1zdp7dc";';
      const findings = scanForSecrets(content);
      expect(findings).toHaveLength(1);
      expect(findings[0].severity).toBe('critical');
    });
  });

  describe('entropy analysis', () => {
    it('calculates correct entropy', () => {
      expect(calculateEntropy('aaaaaaaa')).toBeLessThan(1);
      expect(calculateEntropy('aB3$kL9@mN2xYz')).toBeGreaterThan(3.5);
    });

    it('flags high-entropy strings in sensitive contexts', () => {
      const content = 'const apiKey = "xK9mL2pQ8rT5vY3nZ";';
      const findings = scanForSecrets(content);
      expect(findings.some(f => f.type === 'entropy')).toBe(true);
    });
  });

  describe('allowlist', () => {
    it('skips allowlisted patterns', () => {
      const allowlist = { patterns: [{ pattern: /test_.*/, reason: 'test keys' }], files: [], hashes: [] };
      const finding = { value: 'test_abc123', type: 'pattern' };
      expect(isAllowlisted(finding, 'file.ts', allowlist)).toBe(true);
    });
  });
});
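
The thresholds in the entropy test can be checked by hand. The sketch below assumes `calculateEntropy` computes Shannon entropy in bits per character; `shannonEntropy` is a stand-in written for this illustration, not necessarily the project's implementation.

```typescript
// Shannon entropy in bits per character: H = -sum(p_i * log2(p_i)),
// where p_i is the relative frequency of each distinct character.
function shannonEntropy(s: string): number {
  const chars = Array.from(s);
  if (chars.length === 0) return 0;

  // Count occurrences of each character.
  const counts = new Map<string, number>();
  for (let i = 0; i < chars.length; i++) {
    counts.set(chars[i], (counts.get(chars[i]) ?? 0) + 1);
  }

  let h = 0;
  counts.forEach(count => {
    const p = count / chars.length;
    h -= p * Math.log2(p);
  });
  return h;
}
```

A string of one repeated character has a single symbol with probability 1, so its entropy is 0, comfortably below 1. The 14 characters of `aB3$kL9@mN2xYz` are all distinct, so the entropy is log2(14), roughly 3.81, which clears the 3.5 threshold. A string with n distinct characters can never exceed log2(n), which is why very short strings cannot reach high thresholds.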

Integration Tests

describe('Hook Integration', () => {
  it('blocks write with secrets', async () => {
    const input = {
      hook_event_name: 'preToolUse',
      tool_name: 'write',
      tool_input: {
        file_path: 'config.ts',
        content: 'export const key = "sk_live_abc123def456ghi789";'
      }
    };

    const result = await runHook(input);

    expect(result.exitCode).toBe(2);
    expect(result.stdout).toContain('BLOCKED');
    expect(result.stdout).toContain('Stripe');
  });

  it('allows write without secrets', async () => {
    const input = {
      hook_event_name: 'preToolUse',
      tool_name: 'write',
      tool_input: {
        file_path: 'config.ts',
        content: 'export const key = process.env.API_KEY;'
      }
    };

    const result = await runHook(input);

    expect(result.exitCode).toBe(0);
  });
});
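
The integration tests call a `runHook` helper that is not shown. A minimal sketch of one, assuming the hook is started with `bun run secret-scanner.ts` as in the manual tests and receives the event JSON on stdin, could look like this. The `HookResult` type, the `runHookSync` name, and the 5-second timeout are assumptions.

```typescript
import { spawnSync } from 'node:child_process';

interface HookResult { exitCode: number; stdout: string; stderr: string }

// Runs a hook command, feeding the event JSON on stdin and capturing both streams.
function runHookSync(
  input: unknown,
  command = 'bun',
  args: string[] = ['run', 'secret-scanner.ts'],
): HookResult {
  const result = spawnSync(command, args, {
    input: JSON.stringify(input),
    encoding: 'utf-8',
    timeout: 5000, // stop a hung hook so the test suite does not stall
  });
  // Set when the process could not be started or timed out.
  if (result.error) throw result.error;
  return {
    exitCode: result.status ?? -1, // status is null when killed by a signal
    stdout: result.stdout,
    stderr: result.stderr,
  };
}

// Async wrapper so tests can keep the `await runHook(input)` call shape.
async function runHook(input: unknown): Promise<HookResult> {
  return runHookSync(input);
}
```

Capturing stderr alongside stdout is deliberate: it lets a test assert on whichever stream the hook uses for its blocking message.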

Manual Testing

# 1. Test with a file containing a secret
echo '{"hook_event_name":"preToolUse","tool_name":"write","tool_input":{"file_path":"test.ts","content":"const key = \"sk_live_1234567890abcdef\";"}}' | bun run secret-scanner.ts
# Should exit 2 with blocking message

# 2. Test with safe file
echo '{"hook_event_name":"preToolUse","tool_name":"write","tool_input":{"file_path":"test.ts","content":"const key = process.env.API_KEY;"}}' | bun run secret-scanner.ts
# Should exit 0

# 3. Test in actual Kiro session
kiro-cli chat
> "Create a config file with my Stripe key sk_live_abc123"
# Should see hook block the write

Common Pitfalls and Debugging

Pitfall 1: Hook Blocks Too Much (False Positives)

Symptom: Legitimate code blocked as secrets

Debugging:

// Add debug logging to hook (write to stderr, and redact so logs never contain the secret)
console.error('[SecretScanner] Checking:', redact(value));
console.error('[SecretScanner] Entropy:', calculateEntropy(value));
console.error('[SecretScanner] Context:', context);

Solution:

// Add to allowlist
{
  "patterns": [
    {
      "pattern": "example_.*",
      "reason": "Example values in documentation"
    }
  ]
}

Pitfall 2: Hook Slows Down Development

Symptom: Noticeable delay on every file write

Cause: Scanning is too slow

Solution:

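// Quick rejection for file types that are not worth scanning as text.
// Skip by exclusion list rather than by a list of source extensions:
// config files (.env, .json, .yaml) are where secrets most often end up.
const skippedExtensions = ['.png', '.jpg', '.gif', '.ico', '.woff2'];
if (skippedExtensions.some(ext => filePath.endsWith(ext))) {
  process.exit(0);  // Skip binary assets
}

// Quick rejection for content shorter than the shortest secret you detect.
// Keep this threshold low: a one-line file can still hold a live key.
if (content.length < 16) {
  process.exit(0);
}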
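
Early exits are one lever; another is to avoid running every regex on every file. A keyword prefilter, sketched below, runs a cheap literal `includes` check first and only evaluates a pattern's regex when its keyword is present. The pattern formats, `FastPattern`, and `prefilteredScan` are assumptions for illustration, not the project's actual pattern list.

```typescript
interface FastPattern {
  name: string;
  keyword: string; // cheap literal that must be present for the regex to match
  regex: RegExp;
}

// Illustrative patterns only; substitute the hook's real definitions.
const fastPatterns: FastPattern[] = [
  { name: 'AWS Access Key', keyword: 'AKIA', regex: /AKIA[0-9A-Z]{16}/g },
  { name: 'Stripe Live Key', keyword: 'sk_live_', regex: /sk_live_[0-9a-zA-Z]{24,}/g },
  { name: 'GitHub Token', keyword: 'ghp_', regex: /ghp_[0-9A-Za-z]{36}/g },
];

function prefilteredScan(content: string): { name: string; value: string }[] {
  const findings: { name: string; value: string }[] = [];
  for (const p of fastPatterns) {
    // Literal search is far cheaper than a regex; skip the regex when it cannot match.
    if (!content.includes(p.keyword)) continue;
    const matches = content.match(p.regex) ?? [];
    for (const value of matches) findings.push({ name: p.name, value });
  }
  return findings;
}
```

This only helps pattern-based detection. Entropy analysis has no fixed keyword, so it still needs a pass over the candidate strings.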

Pitfall 3: AI Doesn’t Understand Feedback

Symptom: AI keeps trying to write secrets differently

Cause: Feedback isn’t specific enough

Solution:

// Make feedback actionable
const feedback = `
BLOCKED: Hardcoded Stripe key detected.

Instead of:
  const stripeKey = "sk_live_xxx";

Write:
  const stripeKey = process.env.STRIPE_SECRET_KEY;

And add to .env file:
  STRIPE_SECRET_KEY=sk_live_xxx
`;

Pitfall 4: Allowlist Grows Unbounded

Symptom: Allowlist becomes a dumping ground for false positives

Prevention:

// Require expiration for all entries
function validateAllowlistEntry(entry: AllowlistEntry): boolean {
  if (!entry.expires) {
    console.error('All allowlist entries must have expiration');
    return false;
  }
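  // An unparseable date would otherwise slip past the comparison below
  const expires = new Date(entry.expires);
  if (Number.isNaN(expires.getTime())) {
    console.error('Allowlist entry has an unparseable expiration date');
    return false;
  }
  const maxExpiration = new Date();
  maxExpiration.setMonth(maxExpiration.getMonth() + 6);
  if (expires > maxExpiration) {
    console.error('Allowlist entries cannot exceed 6 months');
    return false;
  }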
  return true;
}

Extensions and Challenges

Extension 1: Real-Time Secret Verification

For some secrets, verify if they’re actually valid:

async function verifySecretIsLive(type: string, value: string): Promise<boolean> {
  switch (type) {
    case 'github_token': {
      // Braces keep `response` scoped to this case
      const response = await fetch('https://api.github.com/user', {
        headers: { Authorization: `Bearer ${value}` }
      });
      return response.status === 200;
    }

    case 'aws_access_key':
      // Not implemented in this sketch.
      // Check if key is valid without making destructive calls
      // Use STS GetCallerIdentity
      // Until then, fall through and treat the key as live.

    default:
      return true;  // Assume live if can't verify
  }
}

Extension 2: Git History Scanning

Scan existing git history for leaked secrets:

# Integrate with gitleaks
bun run scan-history.ts

# Output:
# Found 3 secrets in git history:
# 1. commit abc123 (2023-01-15): AWS key in config.ts
# 2. commit def456 (2023-02-20): GitHub token in deploy.sh
# 3. commit ghi789 (2023-03-10): DB password in docker-compose.yml
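
If you want to prototype this without an external tool, one approach is to scan the text of `git log -p` yourself. The sketch below assumes the standard `git log -p` layout (`commit <sha>` headers, `+++ b/<path>` file markers, added lines prefixed with `+`); `scanGitLog`, `HistoryFinding`, and the pattern list are hypothetical, and a dedicated scanner such as gitleaks will be more thorough.

```typescript
interface HistoryFinding { commit: string; file: string; name: string }

// Illustrative patterns; a real scanner would reuse the hook's pattern list.
const historyPatterns = [
  { name: 'AWS Access Key', regex: /AKIA[0-9A-Z]{16}/ },
  { name: 'GitHub Token', regex: /ghp_[0-9A-Za-z]{36}/ },
];

// Scans the text produced by `git log -p` and reports secrets on added lines.
function scanGitLog(logText: string): HistoryFinding[] {
  const findings: HistoryFinding[] = [];
  let commit = '';
  let file = '';
  for (const line of logText.split('\n')) {
    if (line.startsWith('commit ')) {
      // Header looks like "commit <sha>" (possibly followed by ref decorations).
      commit = line.split(' ')[1] ?? '';
    } else if (line.startsWith('+++ ')) {
      // "+++ b/path" names the new file version; "+++ /dev/null" is a deletion.
      file = line.startsWith('+++ b/') ? line.slice('+++ b/'.length) : '';
    } else if (line.startsWith('+')) {
      // Added line: check it against every pattern.
      for (const p of historyPatterns) {
        if (p.regex.test(line)) findings.push({ commit, file, name: p.name });
      }
    }
  }
  return findings;
}
```

Only added lines are checked, because a secret that was ever added remains recoverable from history even if a later commit removes it.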

Extension 3: Secret Rotation Assistance

When a secret is detected, offer to rotate it:

SECRET DETECTED: Stripe Live Key

This key appears to be exposed. Would you like me to:
1. Rotate the key via Stripe API
2. Update all references to use environment variable
3. Add to .env.example with placeholder
4. Generate documentation for rotation procedure

Extension 4: Team Metrics Dashboard

Track secret detection metrics:

Secret Scanner Metrics (Last 30 Days):
┌─────────────────────────────────────────────────────────────────┐
│ Blocks: 47                                                       │
│ By Type:                                                         │
│   AWS Keys:      12 ████████                                    │
│   Stripe Keys:    8 █████                                       │
│   GitHub Tokens:  5 ███                                         │
│   Passwords:     22 ██████████████                              │
│                                                                  │
│ False Positive Rate: 3.2%                                       │
│ Allowlist Entries: 8 (3 expired)                                │
└─────────────────────────────────────────────────────────────────┘
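
The numbers in such a dashboard could be derived from an audit log of blocked writes. The sketch below assumes each blocked write is recorded as an event with a timestamp, a secret type, and a flag set later if the block turned out to be a false positive; `BlockEvent`, `Metrics`, and `summarize` are hypothetical names.

```typescript
interface BlockEvent { timestamp: string; type: string; falsePositive: boolean }

interface Metrics {
  blocks: number;
  byType: Record<string, number>;
  falsePositiveRate: number; // fraction between 0 and 1
}

// Aggregates events from the last `windowDays` days, counted back from `now`.
function summarize(events: BlockEvent[], now: Date, windowDays = 30): Metrics {
  const cutoff = now.getTime() - windowDays * 24 * 60 * 60 * 1000;
  const recent = events.filter(e => Date.parse(e.timestamp) >= cutoff);

  const byType: Record<string, number> = {};
  let falsePositives = 0;
  for (const e of recent) {
    byType[e.type] = (byType[e.type] ?? 0) + 1;
    if (e.falsePositive) falsePositives++;
  }

  return {
    blocks: recent.length,
    byType,
    // Avoid dividing by zero when nothing was blocked in the window.
    falsePositiveRate: recent.length === 0 ? 0 : falsePositives / recent.length,
  };
}
```

Passing `now` as a parameter rather than reading the clock inside the function keeps the aggregation deterministic and easy to test.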

Extension 5: Custom Pattern Builder

Allow users to define custom patterns in a config file (a UI could edit this file later):

// .kiro/custom-patterns.json
{
  "patterns": [
    {
      "name": "Internal API Token",
      "regex": "int_[a-f0-9]{32}",
      "severity": "high",
      "description": "Internal service authentication token"
    }
  ]
}
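
User-supplied patterns should be validated before use, since a typo in a regex or severity would otherwise crash the hook or silently disable a rule. The sketch below shows one way to compile such a file; the set of severity levels and the names `CustomPatternConfig`, `CompiledPattern`, and `compileCustomPatterns` are assumptions.

```typescript
type Severity = 'low' | 'medium' | 'high' | 'critical';

interface CustomPatternConfig {
  name: string;
  regex: string;
  severity: string;
  description?: string;
}

interface CompiledPattern { name: string; regex: RegExp; severity: Severity }

// Assumed severity scale; align this with the levels the hook actually uses.
const SEVERITIES: Severity[] = ['low', 'medium', 'high', 'critical'];

// Compiles each entry, collecting errors instead of throwing on the first bad one.
function compileCustomPatterns(
  configs: CustomPatternConfig[],
): { patterns: CompiledPattern[]; errors: string[] } {
  const patterns: CompiledPattern[] = [];
  const errors: string[] = [];
  for (const c of configs) {
    const severity = SEVERITIES.find(s => s === c.severity);
    if (!severity) {
      errors.push(`${c.name}: unknown severity "${c.severity}"`);
      continue;
    }
    try {
      // new RegExp throws a SyntaxError on malformed patterns.
      patterns.push({ name: c.name, regex: new RegExp(c.regex), severity });
    } catch (err) {
      errors.push(`${c.name}: invalid regex (${(err as Error).message})`);
    }
  }
  return { patterns, errors };
}
```

Collecting errors rather than throwing lets the hook keep the valid patterns active and report the broken ones to stderr.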

Real-World Connections

Industry Tools

Your hook implements similar functionality to:

gitleaks: Open-source secret scanner
TruffleHog: Secret detection with credential verification (early versions relied on entropy)
GitGuardian: Enterprise secret monitoring
GitHub Secret Scanning: Built into GitHub
AWS CodeGuru: Amazon’s security scanner

Production Deployment

Concern	Solution
Performance	Async pattern matching, early rejection
Reliability	Fail-open behavior, timeout handling
Maintainability	Pattern definitions in config files
Compliance	Audit logging of blocked writes
Enterprise	Integration with SIEM/SOAR systems
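
The fail-open row deserves a concrete shape, because it is easy to get backwards. A minimal sketch, assuming exit code 2 blocks the write and any crash should surface as a non-blocking exit code 1, wraps the scan so that an exception can never turn into a block. `decideExitCode` is a hypothetical helper.

```typescript
type ExitCode = 0 | 1 | 2;

// Maps a scan outcome to a hook exit code:
//   0 = allow the write, 2 = block it (findings present),
//   1 = the scanner itself failed, reported as a non-blocking error.
function decideExitCode(scan: () => unknown[]): ExitCode {
  try {
    return scan().length > 0 ? 2 : 0;
  } catch (err) {
    // Fail open: a scanner bug must not stop all development work.
    console.error('[SecretScanner] scan failed, failing open:', err);
    return 1;
  }
}
```

In the hook's entry point this would be used along the lines of `process.exit(decideExitCode(() => scanForSecrets(content)))`. The trade-off is explicit: failing open keeps development moving when the scanner is broken, at the cost of writes going unscanned until the error is noticed, so the stderr message should be monitored.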

Compliance Requirements

Many regulations require secret protection:

SOC 2: Logical access criteria (CC6) call for credential management controls
PCI DSS: Requirement 3 covers protection of cardholder data
HIPAA: Security Rule requires access controls
GDPR: Article 32 requires appropriate security measures

Self-Assessment Checklist

Knowledge Verification

Can you explain the difference between exit codes 0, 1, and 2 for hooks?
What is Shannon entropy and how does it help detect secrets?
Why is PreToolUse better than git pre-commit for AI-generated code?
What are the trade-offs between pattern matching and entropy analysis?
How do allowlists prevent security bypass while handling false positives?

Implementation Verification

Hook blocks writes containing AWS access keys
Hook blocks writes containing Stripe live keys
Hook allows test keys (sk_test_*)
High-entropy strings in sensitive contexts are flagged
Allowlist properly excludes known false positives

Quality Verification

Feedback messages are actionable (include how to fix)
Secrets are redacted in output
Hook completes within 500ms
False positive rate is acceptable (< 5%)

Integration Verification

Hook works seamlessly during normal Kiro usage
AI successfully refactors to use environment variables after block
Allowlist is version controlled
Hook fails open if it crashes (exit 1, not 2)

Summary

Building a secret scanner hook teaches you:

Blocking Hook Pattern: Using exit code 2 to prevent dangerous operations
Multi-Layer Detection: Combining patterns, entropy, and context analysis
Security Feedback Loops: Guiding AI to fix issues, not just blocking
Defense in Depth: Understanding where this fits in the security stack

This is arguably one of the most valuable security controls you can add to AI-assisted development. By catching secrets before they reach the filesystem, you close off one of the most common sources of credential leaks.

Next Project: P25-code-review-workflow.md - Multi-agent code review system