Project 2: File Guardian - PreToolUse Blocking Hook

Project Overview

Attribute	Value
Project Number	2 of 40
Category	Hooks System Mastery
Main Programming Language	Python
Alternative Languages	Bash, Bun/TypeScript, Go
Difficulty Level	Level 2: Intermediate
Time Estimate	1 Weekend
Coolness Level	Level 3: Genuinely Clever
Business Potential	Service & Support Model
Knowledge Area	Hooks / Security / File Protection
Primary Tool	Claude Code PreToolUse Hooks
Main Reference	“Black Hat Python” by Justin Seitz

Summary: Build a PreToolUse hook that prevents Claude from modifying specific files or directories (like .env, secrets/, production.config) by examining tool arguments and blocking with exit code 2. This is your introduction to security boundaries in Claude Code.

Real World Outcome

When Claude tries to edit a protected file, your hook intercepts and blocks the action:

You: Update the database password in .env

Claude: I'll update the .env file...
[Uses Edit tool on .env]

+--------------------------------------------------+
|           FILE GUARDIAN BLOCKED ACTION            |
+--------------------------------------------------+
|                                                   |
|  Tool:     Edit                                   |
|  Target:   /Users/douglas/project/.env            |
|  Reason:   .env files contain secrets and are     |
|            protected by File Guardian             |
|                                                   |
|  Action:   BLOCKED (exit code 2)                  |
|                                                   |
|  Tip: If you need to update this file,            |
|       do it manually for security reasons.        |
|                                                   |
+--------------------------------------------------+

Claude: I apologize, but I'm unable to modify the .env file as it's
protected by your file guardian configuration. You'll need to update
it manually for security reasons.

What This Teaches You

Understanding PreToolUse hook payloads
Parsing complex JSON structures in Python
Pattern matching for file paths (glob, regex)
Implementing deterministic security boundaries
Handling path traversal attacks

The Core Question You’re Answering

“How can I create security boundaries that Claude cannot override, even if instructed?”

This is critically important to understand:

Hooks with exit code 2 are deterministic blocks. Unlike permission prompts (which users can click through), a blocking hook is absolute. The tool simply will not execute. This project teaches you to build guardrails that even you cannot bypass during a session.

             PERMISSION PROMPT vs BLOCKING HOOK

  +---------------------------+     +---------------------------+
  |    PERMISSION PROMPT      |     |     BLOCKING HOOK         |
  +---------------------------+     +---------------------------+
  |                           |     |                           |
  | "Claude wants to edit     |     | PreToolUse Hook runs      |
  |  .env - Allow?"           |     |                           |
  |                           |     | if file in blocklist:     |
  |  [Yes]  [No]  [Always]    |     |     exit(2)  # BLOCKED    |
  |                           |     |                           |
  | User can click [Yes]      |     | No prompt shown           |
  | and bypass protection     |     | Tool never executes       |
  |                           |     | Cannot be bypassed        |
  +---------------------------+     +---------------------------+
         ^                                    ^
         |                                    |
     Soft protection               Hard protection
     User-bypassable               Deterministic

Concepts You Must Understand First

Stop and research these before coding:

1. PreToolUse Hook Payload

When a tool is about to execute, your hook receives this JSON on stdin:

{
  "hook_event_name": "PreToolUse",
  "session_id": "abc123-def456",
  "tool_name": "Edit",
  "tool_input": {
    "file_path": "/Users/douglas/project/.env",
    "old_string": "DB_PASSWORD=oldpass",
    "new_string": "DB_PASSWORD=newpass"
  }
}

Different tools have different tool_input structures:

Tool	Key Fields in tool_input
`Edit`	`file_path`, `old_string`, `new_string`
`Write`	`file_path`, `content`
`Read`	`file_path`
`Bash`	`command`
`MultiEdit`	`file_path`, `edits[]`
`NotebookEdit`	`notebook_path`, `cell_number`

Reference: Claude Code Docs - “Hook Payloads”
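The table above can be turned into a small lookup so the rest of the hook never has to remember which key holds the path. This is a minimal sketch that assumes the field names listed in the table; `PATH_FIELDS` and `target_path` are hypothetical names introduced only for illustration.

```python
import json
from typing import Optional

# Hypothetical lookup built from the table above:
# tool name -> key in tool_input that holds the target path.
PATH_FIELDS = {
    "Edit": "file_path",
    "Write": "file_path",
    "Read": "file_path",
    "MultiEdit": "file_path",
    "NotebookEdit": "notebook_path",
}


def target_path(payload: dict) -> Optional[str]:
    """Return the path a tool call targets, or None if the tool has no path field."""
    field = PATH_FIELDS.get(payload.get("tool_name", ""))
    if field is None:
        return None  # e.g. Bash, which carries a command string instead
    tool_input = payload.get("tool_input") or {}
    value = tool_input.get(field)
    return value if isinstance(value, str) and value else None


raw = '{"tool_name": "Edit", "tool_input": {"file_path": "/Users/douglas/project/.env"}}'
print(target_path(json.loads(raw)))
```

Returning `None` for tools without a path field keeps the Bash case explicit: it needs its own command parsing rather than a silent pass.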

2. Exit Code Semantics for Blocking

  Exit Code    |    Meaning          |    Effect
  -------------|---------------------|------------------
      0        |    ALLOW            |    Tool executes
      2        |    BLOCK            |    Tool blocked, Claude notified
    other      |    ERROR            |    Warning logged, tool executes

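For security hooks, you MUST use exit code 2 for blocking. Any other non-zero code is treated as a non-blocking error, and the tool will execute anyway. That includes the exit code 1 Python returns when your hook crashes with an uncaught exception, so a buggy hook fails open.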

Reference: Claude Code Docs - “Hooks”
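One consequence of the exit code table is that a hook which crashes does not block anything. The sketch below shows one way to fail closed by converting any internal error into exit code 2. `decide` and `run_guard` are hypothetical names, and the policy inside `decide` is a placeholder, not the project's real blocklist logic.

```python
import json
import sys

ALLOW, BLOCK = 0, 2  # exit codes from the table above


def decide(payload: dict) -> int:
    """Placeholder policy: block calls whose file_path ends in .env."""
    path = payload["tool_input"]["file_path"]  # raises KeyError on unexpected payloads
    return BLOCK if path.endswith(".env") else ALLOW


def run_guard(raw: str) -> int:
    """Return the exit code to use; any failure inside the hook becomes BLOCK."""
    try:
        return decide(json.loads(raw))
    except Exception as exc:  # fail closed instead of crashing with exit code 1
        print(f"guard error, blocking: {exc!r}", file=sys.stderr)
        return BLOCK


print(run_guard('{"tool_input": {"file_path": "/project/.env"}}'))
print(run_guard("not json"))
```

In a real hook the last step would be `sys.exit(run_guard(sys.stdin.read()))`. Failing closed trades convenience for safety: a malformed payload blocks the tool instead of letting it through.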

3. Pattern Matching Strategies

You need to match file paths against blocklist patterns:

Strategy	Example Pattern	Matches
Exact Match	`.env`	Only `.env`
Glob	`secrets/*`	`secrets/api.key`, `secrets/db.json`
Glob Recursive	`**/*.pem`	Any `.pem` file anywhere
Regex	`.*password.*`	`db_password.txt`, `PASSWORD.env` (case-insensitive)

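Gotcha: Path traversal! ./secrets/../.env refers to .env, but an exact string comparison against .env will miss it, and a naive glob may wrongly treat it as a file inside secrets/. Normalize the path before matching.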

Reference: “Mastering Regular Expressions” Ch. 2 by Jeffrey Friedl
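The strategies in the table can be compared side by side. This is a rough sketch using only the standard library; the helper names `exact`, `glob`, and `regex` are hypothetical, and `posixpath.normpath` is used here purely for lexical cleanup (it does not resolve symlinks).

```python
import fnmatch
import posixpath
import re


def exact(path: str, name: str) -> bool:
    """Exact match on the final path component."""
    return posixpath.basename(path) == name


def glob(path: str, pattern: str) -> bool:
    """Shell-style match; in fnmatch, '*' also matches '/' so '*.pem' works at any depth."""
    return fnmatch.fnmatchcase(path, pattern)


def regex(path: str, pattern: str) -> bool:
    """Case-insensitive regular expression search."""
    return re.search(pattern, path, re.IGNORECASE) is not None


tricky = "./secrets/../.env"
clean = posixpath.normpath(tricky)  # collapses './' and 'secrets/..'

print(exact("config/.env", ".env"))
print(glob("secrets/api.key", "secrets/*"))
print(glob("certs/deep/server.pem", "*.pem"))
print(regex("PASSWORD.env", r".*password.*"))
print(tricky == ".env", clean == ".env")
```

The last line shows the gotcha: the raw string is not equal to `.env`, while the normalized one is.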

Questions to Guide Your Design

Before implementing, think through these security decisions:

1. What Tools Need Guarding?

+----------------+------------------+---------------------------+
| Tool           | Danger Level     | Should Guard?             |
+----------------+------------------+---------------------------+
| Edit           | HIGH             | Yes - modifies files      |
| Write          | HIGH             | Yes - creates/overwrites  |
| MultiEdit      | HIGH             | Yes - batch modifications |
| NotebookEdit   | MEDIUM           | Maybe - Jupyter files     |
| Bash           | HIGHEST          | Yes - can do anything     |
| Read           | LOW              | Maybe - secrets exposure  |
| Glob           | NONE             | No - only lists files     |
| Grep           | NONE             | No - only searches        |
+----------------+------------------+---------------------------+

2. What Patterns Should You Block?

Critical Files (always block):

.env, .env.local, .env.production
secrets.json, credentials.json
*.pem, *.key, *.p12 (certificates and keys)
id_rsa, id_ed25519 (SSH keys)

Critical Directories (block all contents):

secrets/, .ssh/, private/
certificates/, keys/

Pattern-Based (catch variations):

*password*, *secret*, *credential*
*token*, *apikey*

3. How Should Blocking Work?

Silent block: Just exit 2, minimal output
Informative block: Show user what was blocked and why
Logged block: Write to audit file for security review
Override option: Allow special prefix to bypass (dangerous!)

Thinking Exercise

Parse a Tool Input

Given this PreToolUse payload, identify what needs protection:

{
  "hook_event_name": "PreToolUse",
  "tool_name": "Edit",
  "tool_input": {
    "file_path": "/Users/dev/project/.env",
    "old_string": "DB_PASSWORD=oldpass",
    "new_string": "DB_PASSWORD=newpass"
  },
  "session_id": "abc123"
}

Questions to Answer:

Which field contains the file path?
- tool_input.file_path
How would you detect this is a sensitive file?
- Check if basename is .env
- Check if path contains secrets/
- Check against blocklist patterns
What if the path was ./secrets/../.env?
- Naive matching: might not catch it
- Solution: Normalize with os.path.realpath()

Handle the Bash Tool

The Bash tool is special - you need to parse the command string:

{
  "hook_event_name": "PreToolUse",
  "tool_name": "Bash",
  "tool_input": {
    "command": "cat .env | grep DB_PASSWORD"
  }
}

Questions:

How do you extract file references from a bash command?
What about $(cat .env) or backticks?
Should you block cat .env or just modifications?

The Interview Questions They’ll Ask

1. “How would you prevent an AI agent from modifying production configuration?”

Answer: Use a PreToolUse hook that inspects tool_input for file paths. Compare normalized paths against a blocklist of production config files. Exit with code 2 to block the tool entirely. This creates a deterministic security boundary that cannot be bypassed by the AI or user during the session.

2. “What’s the difference between permission prompts and blocking hooks?”

Answer:

Permission prompts ask the user and can be bypassed with “Yes” or “Always Allow”
Blocking hooks are deterministic - they check conditions and block unconditionally
Hooks provide stronger security because they don’t rely on user vigilance
Hooks can implement complex logic (regex matching, path normalization, time-based rules)

3. “How would you handle path normalization to prevent bypass attacks?”

Answer: Use os.path.realpath() in Python or realpath in bash to resolve:

Symlinks (/etc/secrets -> /actual/secrets)
Parent traversal (../../.env -> /home/user/.env)
Relative paths (./config.json -> /full/path/config.json)

Always normalize both the blocklist patterns and the incoming path before comparison.
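The three cases above can be checked against a throwaway directory. This sketch assumes a POSIX system where `os.symlink` is permitted; `resolve` is a hypothetical helper name, and the file contents are dummy values.

```python
import os
import tempfile


def resolve(path: str) -> str:
    """Expand ~, make the path absolute, and resolve symlinks and '..' segments."""
    return os.path.realpath(os.path.expanduser(path))


with tempfile.TemporaryDirectory() as root:
    root = os.path.realpath(root)  # the temp dir itself may sit behind a symlink
    secret = os.path.join(root, ".env")
    with open(secret, "w") as fh:
        fh.write("DB_PASSWORD=example\n")

    alias = os.path.join(root, "notes.txt")
    os.symlink(secret, alias)  # innocent-looking name that points at .env
    traversal = os.path.join(root, "sub", "..", ".env")

    symlink_resolves = resolve(alias) == secret
    traversal_resolves = resolve(traversal) == secret

print(symlink_resolves, traversal_resolves)
```

Both the symlink and the traversal path resolve to the same real file, which is why the blocklist check should run on the resolved path rather than on the string Claude supplied.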

4. “Can a malicious user encode file paths to bypass your hook?”

Answer: Potential bypass vectors:

URL encoding: %2e%2e%2f = ../
Unicode normalization: different representations of /
Symlinks: pointing to protected files
Case sensitivity: .ENV vs .env

Mitigations:

Always normalize paths
Use case-insensitive matching on case-insensitive filesystems
Resolve symlinks before checking
Consider URL decoding for web-facing scenarios
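The string-level mitigations can be combined into one canonicalization step that runs before path resolution. This is a sketch under assumptions: `canonical_name` is a hypothetical helper, NFKC is one reasonable choice of Unicode normalization form, and URL decoding is off by default because the filesystem itself never decodes percent escapes.

```python
import unicodedata
from urllib.parse import unquote


def canonical_name(raw: str, decode_url: bool = False) -> str:
    """Fold a path string into a canonical form for blocklist comparison."""
    text = unquote(raw) if decode_url else raw
    # NFKC maps compatibility characters (e.g. fullwidth '／') to their plain forms
    text = unicodedata.normalize("NFKC", text)
    # casefold() is a more aggressive lower() intended for caseless matching
    return text.casefold()


print(canonical_name(".ENV"))
print(canonical_name("%2e%2e%2f.env", decode_url=True))
print(canonical_name("secrets\uff0f.env"))
```

Use the canonical form only for matching. The original string should still be what gets logged, so the audit trail shows exactly what was attempted.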

5. “How would you audit blocked attempts for security review?”

Answer: In the blocking hook:

Log to a dedicated audit file with timestamp, session_id, tool, path
Include the original and normalized paths
Record which pattern matched
Consider sending to a SIEM or monitoring system
Use structured JSON for easy parsing
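A single audit entry covering the points above might look like the following. This is an illustrative sketch: `audit_record` and the field names are assumptions rather than a required schema, and shipping the line to a SIEM is left out.

```python
import json
from datetime import datetime, timezone


def audit_record(session_id: str, tool: str, original_path: str,
                 normalized_path: str, matched_pattern: str) -> str:
    """Build one JSON line describing a blocked attempt."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # UTC avoids timezone ambiguity
        "session_id": session_id,
        "tool": tool,
        "original_path": original_path,      # what was requested
        "normalized_path": normalized_path,  # what it resolved to
        "matched_pattern": matched_pattern,  # why it was blocked
    }
    return json.dumps(entry, sort_keys=True)


line = audit_record("abc123", "Edit", "./config/../.env", "/home/user/.env", ".env")
print(line)
```

Writing one JSON object per line keeps the log appendable and easy to filter with standard tools.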

Hints in Layers

Hint 1: Start with a Blocklist

Create a simple list of patterns to block:

BLOCKLIST = [
    ".env",
    ".env.local",
    ".env.production",
    "secrets/",
    "*.pem",
    "*.key",
    "id_rsa",
    "id_ed25519",
]

Hint 2: Parse the JSON

Read stdin and extract the relevant fields:

import json
import sys

payload = json.loads(sys.stdin.read())
tool_name = payload.get("tool_name")
tool_input = payload.get("tool_input", {})

# For file operations
file_path = tool_input.get("file_path", "")

Hint 3: Normalize Paths

Always normalize before checking:

import os

def normalize_path(path):
    # Expand ~ to home directory
    path = os.path.expanduser(path)
    # Resolve .. and symlinks
    path = os.path.realpath(path)
    return path

normalized = normalize_path(file_path)

Hint 4: Handle the Bash Tool

Parse the command string for file references:

import shlex
import re

def extract_files_from_command(command):
    files = []

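    # Simple approach: find file-like tokens
    # This is basic - production would need real shell parsing
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: fall back to plain whitespace splitting
        tokens = command.split()
    for token in tokens:
        if '/' in token or token.startswith('.'):
            files.append(token)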

    # Also check for common patterns
    # cat .env, rm secrets/*, etc.
    file_pattern = r'(?:cat|rm|cp|mv|less|more|head|tail)\s+([^\s|>&]+)'
    matches = re.findall(file_pattern, command)
    files.extend(matches)

    return files

Books That Will Help

Topic	Book	Chapter	Why It Helps
Path security	“The Web Application Hacker’s Handbook” by Stuttard & Pinto	Ch. 10	Understanding path traversal attacks
Python JSON	“Fluent Python” by Luciano Ramalho	Ch. 17	Efficient JSON parsing patterns
Regex patterns	“Mastering Regular Expressions” by Jeffrey Friedl	Ch. 2-3	Pattern matching file paths
Security mindset	“Security in Computing” by Pfleeger	Ch. 4	Thinking about security boundaries
Bash parsing	“The Linux Command Line” by William Shotts	Ch. 24-26	Understanding shell command structure

Implementation Guide

Complete Python Implementation

#!/usr/bin/env python3
"""
File Guardian - PreToolUse Blocking Hook

Prevents Claude from modifying sensitive files and directories.
Exit code 2 = block, exit code 0 = allow
"""

import json
import sys
import os
import re
import fnmatch
from pathlib import Path
from datetime import datetime

# ===== CONFIGURATION =====

# Files and patterns to block (case-insensitive matching)
BLOCKLIST_PATTERNS = [
    # Environment files
    ".env",
    ".env.*",
    "*.env",

    # Secret/credential files
    "secrets.json",
    "credentials.json",
    "*secret*",
    "*credential*",
    "*password*",

    # Certificates and keys
    "*.pem",
    "*.key",
    "*.p12",
    "*.pfx",
    "*.crt",
    "*.cer",

    # SSH keys
    "id_rsa",
    "id_rsa.*",
    "id_ed25519",
    "id_ed25519.*",
    "authorized_keys",
    "known_hosts",

    # Cloud credentials
    ".aws/credentials",
    ".gcloud/*",
    "service-account*.json",
]

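# Directories to block entirely (matched against every path component).
# Note: on macOS, /tmp and /var resolve to /private/..., so "private"
# blocks everything under them once paths are normalized.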
BLOCKLIST_DIRECTORIES = [
    ".ssh",
    "secrets",
    "private",
    "certificates",
    ".aws",
    ".gcloud",
]

# Tools that can modify files
MODIFYING_TOOLS = ["Edit", "Write", "MultiEdit", "NotebookEdit"]

# Tools that can execute arbitrary commands
DANGEROUS_TOOLS = ["Bash"]

# Log blocked attempts to this file
AUDIT_LOG = os.path.expanduser("~/.claude/file-guardian-audit.log")


# ===== CORE FUNCTIONS =====

def normalize_path(path: str) -> str:
    """Normalize a path to prevent bypass attacks."""
    if not path:
        return ""

    # Expand user home directory
    path = os.path.expanduser(path)

    # Make absolute if relative
    if not os.path.isabs(path):
        path = os.path.abspath(path)

    # Resolve symlinks and normalize
    try:
        path = os.path.realpath(path)
    except OSError:
        pass

    # Normalize path separators
    path = os.path.normpath(path)

    return path


def matches_pattern(filepath: str, pattern: str) -> bool:
    """Check if filepath matches a blocklist pattern."""
    path_lower = filepath.lower()
    pattern_lower = pattern.lower()

    # Patterns containing a slash (e.g. ".aws/credentials") are matched
    # against the tail of the full path, not a single component
    if "/" in pattern_lower:
        return fnmatch.fnmatch(path_lower, "*/" + pattern_lower)

    # Otherwise check every path component, including the filename
    for part in Path(path_lower).parts:
        if fnmatch.fnmatch(part, pattern_lower):
            return True

    return False


def is_in_blocked_directory(filepath: str) -> tuple[bool, str]:
    """Check if file is in a blocked directory."""
    path_parts = Path(filepath).parts

    for blocked_dir in BLOCKLIST_DIRECTORIES:
        if blocked_dir.lower() in [p.lower() for p in path_parts]:
            return True, blocked_dir

    return False, ""


def is_blocked_file(filepath: str) -> tuple[bool, str]:
    """Check if a file should be blocked. Returns (blocked, reason)."""
    normalized = normalize_path(filepath)

    # Check directory blocklist
    in_blocked_dir, dir_name = is_in_blocked_directory(normalized)
    if in_blocked_dir:
        return True, f"File is in protected directory: {dir_name}/"

    # Check pattern blocklist
    for pattern in BLOCKLIST_PATTERNS:
        if matches_pattern(normalized, pattern):
            return True, f"File matches protected pattern: {pattern}"

    return False, ""


def extract_files_from_bash_command(command: str) -> list[str]:
    """Extract potential file paths from a bash command."""
    files = []

    # Common file-operating commands
    file_commands = [
        r'(?:cat|less|more|head|tail|vim|nano|vi|code)\s+([^\s|>&;]+)',
        r'(?:rm|cp|mv|touch|chmod|chown)\s+(?:-[rf]*\s+)?([^\s|>&;]+)',
        r'(?:>|>>)\s*([^\s|>&;]+)',  # Redirections
        r'(?:source|\.)\s+([^\s|>&;]+)',  # Source commands
    ]

    for pattern in file_commands:
        matches = re.findall(pattern, command)
        files.extend(matches)

    # Also catch explicit file paths
    path_pattern = r'(?:^|\s)([./~][^\s|>&;]*)'
    path_matches = re.findall(path_pattern, command)
    files.extend(path_matches)

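    # Strip surrounding quotes so that cat ".env" is checked as .env
    cleaned = [f.strip().strip("'\"") for f in files]
    return [f for f in cleaned if f]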


def log_blocked_attempt(tool_name: str, filepath: str, reason: str, session_id: str):
    """Log blocked attempts for audit purposes."""
    try:
        os.makedirs(os.path.dirname(AUDIT_LOG), exist_ok=True)
        with open(AUDIT_LOG, "a") as f:
            log_entry = {
                "timestamp": datetime.now().isoformat(),
                "session_id": session_id,
                "tool": tool_name,
                "path": filepath,
                "reason": reason,
            }
            f.write(json.dumps(log_entry) + "\n")
    except Exception:
        pass  # Don't fail the hook if logging fails


def print_block_message(tool_name: str, filepath: str, reason: str):
    """Print a user-friendly block message."""
    print(file=sys.stderr)
    print("+--------------------------------------------------+", file=sys.stderr)
    print("|           FILE GUARDIAN BLOCKED ACTION            |", file=sys.stderr)
    print("+--------------------------------------------------+", file=sys.stderr)
    print(f"|  Tool:   {tool_name:<40}|", file=sys.stderr)
    print(f"|  Target: {filepath[:40]:<40}|", file=sys.stderr)
    print(f"|  Reason: {reason[:40]:<40}|", file=sys.stderr)
    print("|                                                   |", file=sys.stderr)
    print("|  Action: BLOCKED                                  |", file=sys.stderr)
    print("+--------------------------------------------------+", file=sys.stderr)
    print(file=sys.stderr)


# ===== MAIN HANDLER =====

def main():
    # Read payload from stdin
    try:
        payload = json.loads(sys.stdin.read())
    except json.JSONDecodeError:
        # Fail open: an unparseable payload allows the action.
        # Use sys.exit(2) here instead if you prefer to fail closed.
        sys.exit(0)

    tool_name = payload.get("tool_name", "")
    tool_input = payload.get("tool_input", {})
    session_id = payload.get("session_id", "unknown")

    files_to_check = []

    # Handle file-modifying tools
    if tool_name in MODIFYING_TOOLS:
        file_path = tool_input.get("file_path") or tool_input.get("notebook_path")
        if file_path:
            files_to_check.append(file_path)

    # Handle Bash tool - parse command for file references
    elif tool_name in DANGEROUS_TOOLS:
        command = tool_input.get("command", "")
        files_to_check.extend(extract_files_from_bash_command(command))

    # Check each file against blocklist
    for filepath in files_to_check:
        blocked, reason = is_blocked_file(filepath)
        if blocked:
            log_blocked_attempt(tool_name, filepath, reason, session_id)
            print_block_message(tool_name, filepath, reason)
            sys.exit(2)  # BLOCK

    # No blocked files found - allow the action
    sys.exit(0)


if __name__ == "__main__":
    main()

Configuration in settings.json

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Edit|Write|MultiEdit|NotebookEdit|Bash",
        "hooks": [
          {
            "type": "command",
            "command": "python3 ~/.claude/hooks/file-guardian.py",
            "timeout": 5
          }
        ]
      }
    ]
  }
}

Architecture Diagram

+-----------------------------------------------------------------------+
|                      FILE GUARDIAN EXECUTION FLOW                       |
+-----------------------------------------------------------------------+
|                                                                        |
|  User Request: "Update the database password in .env"                  |
|                                                                        |
|  +-------------------+                                                 |
|  | Claude interprets |                                                 |
|  | request           |                                                 |
|  +--------+----------+                                                 |
|           |                                                            |
|           v                                                            |
|  +-------------------+                                                 |
|  | Claude prepares   |                                                 |
|  | Edit tool call    |                                                 |
|  +--------+----------+                                                 |
|           |                                                            |
|           v                                                            |
|  +-------------------+     +----------------------------------------+  |
|  | PreToolUse Event  |---->| file-guardian.py                       |  |
|  | fires             |     |                                        |  |
|  +-------------------+     | 1. Parse JSON payload                  |  |
|                            | 2. Extract file_path: ".env"           |  |
|                            | 3. Normalize: "/full/path/.env"        |  |
|                            | 4. Check blocklist patterns            |  |
|                            | 5. MATCH: ".env" in blocklist          |  |
|                            | 6. Log to audit file                   |  |
|                            | 7. Print block message                 |  |
|                            | 8. exit(2) - BLOCK                     |  |
|                            +----------------------------------------+  |
|                                       |                                |
|                                       v                                |
|  +------------------------------------------------------------------+  |
|  |                        TOOL BLOCKED                               |  |
|  |                                                                   |  |
|  |  Claude receives: "Tool Edit was blocked by a hook"              |  |
|  |                                                                   |  |
|  |  Claude responds: "I'm unable to modify the .env file as it's    |  |
|  |  protected by your file guardian configuration..."               |  |
|  +------------------------------------------------------------------+  |
|                                                                        |
+-----------------------------------------------------------------------+

Security Considerations

Path Traversal Prevention

# VULNERABLE - exact comparison doesn't handle ../ or symlinks
def bad_check(path):
    return path == ".env"  # ./config/../.env would pass!

# SECURE - is_blocked_file() normalizes before checking
def good_check(path):
    blocked, _reason = is_blocked_file(path)
    return blocked

Race Conditions

TIME        ACTION
-----       ------
T1          Hook checks /path/safe.txt - ALLOWED
T2          Symlink created: /path/safe.txt -> /path/.env
T3          Tool writes to /path/safe.txt (actually .env!)

MITIGATION: Re-check at write time (not possible in hook)
            Use inotify/fswatch for symlink monitoring
            Consider blocking all symlink creation

Bash Command Parsing Limitations

# These are HARD to detect:
eval "cat .e""nv"           # String concatenation
cat $(echo .env)            # Command substitution
f=".env"; cat $f            # Variable expansion
base64 -d <<< "LmVudg==" | xargs cat  # Encoding

# MITIGATION: Consider blocking Bash tool entirely for sensitive projects
# Or use a shell parser library like bashlex
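A middle ground between parsing everything and blocking Bash entirely is to refuse commands that contain constructs a static check cannot see through. The sketch below is a set of heuristics, not a shell parser: `OPAQUE_CONSTRUCTS` and `opaque_reasons` are hypothetical names, and the patterns will produce false positives (for example, any use of `$HOME`).

```python
import re

# Hypothetical heuristics for constructs that hide the real file target.
OPAQUE_CONSTRUCTS = {
    "command substitution": re.compile(r"\$\(|`"),
    "variable expansion": re.compile(r"\$\{?[A-Za-z_]"),
    "eval or nested shell": re.compile(r"\b(?:eval|bash\s+-c|sh\s+-c)\b"),
    "decode then execute": re.compile(r"\bbase64\b.*\|\s*(?:xargs|sh|bash)\b"),
}


def opaque_reasons(command: str) -> list:
    """Return the names of all opaque constructs found in the command."""
    return [name for name, rx in OPAQUE_CONSTRUCTS.items() if rx.search(command)]


for cmd in ['eval "cat .e""nv"', "cat $(echo .env)", 'f=".env"; cat $f', "ls -la src"]:
    print(cmd, "->", opaque_reasons(cmd))
```

A hook could exit with code 2 whenever the returned list is non-empty, on the reasoning that a command it cannot analyze should not be trusted near protected files.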

Learning Milestones

Milestone 1: Basic File Blocking Works

Goal: Block access to .env files

Test:

# Create test file
mkdir -p /tmp/test
echo "SECRET=password" > /tmp/test/.env

# Start Claude and try to edit it (Read is not guarded by this implementation)
You: Change the value of SECRET in /tmp/test/.env
# Should be BLOCKED

What You’ve Learned:

PreToolUse hook execution flow
JSON payload parsing
Exit code 2 for blocking
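The same milestone can be checked without starting a Claude session by piping a payload into the hook and inspecting its exit code. This is a sketch: `run_hook` is a hypothetical helper, and `STAND_IN` is a tiny stand-in hook used only so the example runs on its own. Point `HOOK_CMD` at your real script to test it.

```python
import json
import subprocess
import sys

# Stand-in hook for the demo only; replace HOOK_CMD with the command for your real hook.
STAND_IN = (
    "import json, sys\n"
    "p = json.load(sys.stdin)\n"
    "path = p.get('tool_input', {}).get('file_path', '')\n"
    "blocked = path.endswith('/.env')\n"
    "if blocked: print('blocked: ' + path, file=sys.stderr)\n"
    "sys.exit(2 if blocked else 0)\n"
)
HOOK_CMD = [sys.executable, "-c", STAND_IN]


def run_hook(tool_name: str, tool_input: dict, cmd=HOOK_CMD):
    """Send a PreToolUse payload to the hook on stdin; return (exit_code, stderr)."""
    payload = {
        "hook_event_name": "PreToolUse",
        "session_id": "test",
        "tool_name": tool_name,
        "tool_input": tool_input,
    }
    proc = subprocess.run(cmd, input=json.dumps(payload),
                          capture_output=True, text=True, timeout=10)
    return proc.returncode, proc.stderr.strip()


print(run_hook("Edit", {"file_path": "/tmp/test/.env"}))
print(run_hook("Edit", {"file_path": "/tmp/test/app.py"}))
```

Testing at the process boundary exercises exactly what Claude Code sees: stdin in, exit code and stderr out.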

Milestone 2: Path Normalization Prevents Bypass

Goal: Block ../ traversal attempts

Test:

# Try path traversal
You: Edit the file /tmp/safe/../test/.env
# Should still be BLOCKED despite ../

What You’ve Learned:

Path normalization with realpath
Security edge cases
Thinking like an attacker

Milestone 3: Bash Commands Are Parsed

Goal: Block cat .env style commands

Test:

You: Run: cat /tmp/test/.env
# Should be BLOCKED

What You’ve Learned:

Command string parsing
Multi-tool protection
Regex patterns for shell commands

Common Mistakes to Avoid

Mistake 1: Not Normalizing Paths

# WRONG
if file_path == ".env":
    block()
# Bypass: ./config/../.env

# RIGHT
normalized = os.path.realpath(file_path)
if os.path.basename(normalized) == ".env":
    block()

Mistake 2: Case-Sensitive Matching on Case-Insensitive FS

# WRONG on macOS/Windows
if os.path.basename(file_path) == ".env":
    block()
# Bypass: .ENV, .Env

# RIGHT
if os.path.basename(file_path).lower() == ".env":
    block()

Mistake 3: Forgetting to Exit 2

# WRONG - prints message but tool still executes!
if is_blocked(path):
    print("Blocked!")
    # Missing exit(2)!

# RIGHT - message goes to stderr (which Claude sees), then exit 2
if is_blocked(path):
    print("Blocked!", file=sys.stderr)
    sys.exit(2)  # Critical!

Mistake 4: Not Handling All File-Modifying Tools

# WRONG - Only checks Edit
if tool_name == "Edit":
    check_file(tool_input["file_path"])

# RIGHT - Check all modifying tools
if tool_name in ["Edit", "Write", "MultiEdit", "NotebookEdit"]:
    check_file(tool_input.get("file_path") or tool_input.get("notebook_path"))

Extension Ideas

Once the basic guardian works, consider these enhancements:

Project-Specific Blocklists: Load from .claude/file-guardian.json per project
Time-Based Rules: Block production configs only during business hours
Git Integration: Block files not in .gitignore for public repos
Slack Alerts: Notify team channel when someone tries to access secrets
Allowlist by Branch: More permissive on feature branches vs main
Interactive Override: Prompt for a confirmation code for critical files
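As a starting point for the first idea, project-specific patterns could be merged with the built-in defaults. This is a sketch under assumptions: the file location comes from the list above, but the `{"patterns": [...]}` schema and the names `load_patterns` and `DEFAULT_PATTERNS` are invented here for illustration.

```python
import json
import os
import tempfile

DEFAULT_PATTERNS = [".env", "*.pem", "id_rsa"]


def load_patterns(project_dir, defaults=DEFAULT_PATTERNS):
    """Return the defaults plus any extra patterns from the project's config file."""
    config_path = os.path.join(project_dir, ".claude", "file-guardian.json")
    try:
        with open(config_path, encoding="utf-8") as fh:
            extra = json.load(fh).get("patterns", [])
    except (OSError, ValueError, AttributeError):
        return list(defaults)  # missing or malformed file: keep the defaults
    if not isinstance(extra, list):
        return list(defaults)
    extra = [p for p in extra if isinstance(p, str) and p]
    # Project files may only add patterns; dict.fromkeys de-duplicates in order
    return list(dict.fromkeys(list(defaults) + extra))


with tempfile.TemporaryDirectory() as project:
    os.makedirs(os.path.join(project, ".claude"))
    with open(os.path.join(project, ".claude", "file-guardian.json"), "w") as fh:
        json.dump({"patterns": ["production.config", "*.pem"]}, fh)
    merged = load_patterns(project)

print(merged)
```

Letting a project file add patterns but never remove defaults means a repository cannot weaken the guard just by shipping a config file.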

Summary

This project taught you to implement security boundaries using PreToolUse hooks:

PreToolUse Payloads: Understanding tool_name and tool_input structure
Exit Code 2: The deterministic block that cannot be bypassed
Path Normalization: Preventing traversal and symlink attacks
Bash Parsing: Extracting file references from shell commands
Security Mindset: Thinking about bypass vectors

With File Guardian in place, you have a foundation for building more sophisticated security hooks. Project 3 will shift focus to PostToolUse hooks for post-processing, and Project 5 will explore UserPromptSubmit for input validation.