Project 13: Skill Auto-Activation via Prompt Analysis
Build a UserPromptSubmit hook that analyzes user prompts and automatically suggests or activates relevant skills using NLP techniques.
Quick Reference
| Attribute | Value |
|---|---|
| Difficulty | Expert |
| Time Estimate | 2-3 weeks |
| Language | Python + Markdown |
| Prerequisites | Projects 9-12 completed, NLP basics, embedding APIs |
| Key Topics | Intent classification, semantic embeddings, prompt augmentation, hooks |
| Knowledge Area | Skills / NLP / Intent Classification |
| Main Book | "Natural Language Processing with Python" by Bird, Klein & Loper |
1. Learning Objectives
By completing this project, you will:
- Implement intent classification: Analyze user prompts to determine their intent
- Use semantic embeddings: Compare prompt meaning to skill descriptions using vector similarity
- Master the UserPromptSubmit hook: Modify prompts before they reach Claude
- Balance precision and recall: Avoid false positives while catching relevant requests
- Build hybrid matching systems: Combine keyword rules with semantic similarity
- Handle edge cases: Manage ambiguity, multiple matches, and confidence thresholds
2. Theoretical Foundation
2.1 The Skill Discovery Problem
Default skill discovery relies on Claude's pattern matching. This project adds intelligence:
DEFAULT vs ENHANCED DISCOVERY
─────────────────────────────

Default discovery:
    User: "check if api works"
          ▼
    Claude interprets the prompt directly
          ▼
    Might or might not use the web-testing skill
          ▼
    Inconsistent skill activation

Enhanced discovery:
    User: "check if api works"
          ▼
    UserPromptSubmit hook runs: analyze intent, match to skills (score: 0.87)
          ▼
    Augmented prompt:
        "check if api works
         [Consider using web-testing]"
          ▼
    Claude reliably uses the skill
2.2 Intent Classification Approaches
Three main approaches to understanding user intent:
INTENT CLASSIFICATION METHODS
─────────────────────────────

1. KEYWORD MATCHING (simple, fast)

   Rules:
     "test" + "api" → web-testing (0.6)
     "commit"       → git-commit (0.8)
     "document"     → doc-generator (0.7)

   Pros: fast, predictable, no API calls
   Cons: misses synonyms, can't understand context

2. SEMANTIC EMBEDDING (powerful, slower)

   Steps:
     1. Embed user prompt → vector [0.1, -0.3, 0.8, ...]
     2. Compare to skill description embeddings
     3. Find nearest neighbor by cosine similarity

   Pros: understands meaning, handles paraphrasing
   Cons: slower, requires API calls, costs money

3. HYBRID (best of both)

   Pipeline:
     1. Try keyword matching first (fast)
     2. If confident match (>0.7), use it
     3. If low confidence, use embeddings
     4. Combine scores for final decision

   Pros: fast for common cases, accurate for edge cases
   Cons: more complex to implement
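The hybrid pipeline above can be sketched as one small function. This is a sketch with toy `keyword_match`/`embedding_match` stand-ins; the real matchers are built in section 11.

```python
def hybrid_match(prompt, keyword_match, embedding_match,
                 fast_confidence=0.7, kw_weight=0.3, emb_weight=0.7):
    """Try keywords first; fall back to embeddings only when unsure."""
    kw_scores = keyword_match(prompt)
    if kw_scores and max(kw_scores.values()) > fast_confidence:
        return kw_scores  # confident keyword hit: skip the embedding API call
    emb_scores = embedding_match(prompt)
    skills = set(kw_scores) | set(emb_scores)
    # Weighted average of both signals for the final decision
    return {s: kw_weight * kw_scores.get(s, 0.0) + emb_weight * emb_scores.get(s, 0.0)
            for s in skills}

# Toy matchers for illustration only
kw = lambda p: {"git-commit": 0.9} if "commit" in p else {}
emb = lambda p: {"web-testing": 0.87}  # pretend embedding result

print(hybrid_match("commit my changes", kw, emb))   # keyword fast path, no embedding call
print(hybrid_match("check if api works", kw, emb))  # falls back to embedding scores
```

The fast path is what keeps the common case cheap: when a keyword rule is already confident, the embedding call never happens.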
2.3 Semantic Embeddings
Embeddings convert text to vectors that capture meaning:
EMBEDDING COMPARISON
────────────────────

Text → Embedding API → Vector

    "test the login flow"
          ▼
    [0.12, -0.45, 0.78, 0.23, -0.91, ...]   (1536 dimensions)

Skill descriptions (pre-computed):

    web-testing: "Automate web browser testing..."
        → [0.14, -0.42, 0.75, 0.21, -0.88, ...]

    git-commit: "Help create git commit messages..."
        → [-0.33, 0.67, 0.12, -0.54, 0.29, ...]

Cosine similarity:

    prompt vs web-testing: 0.87   ← high similarity!
    prompt vs git-commit:  0.23   ← low similarity

Result: activate the web-testing skill
2.4 The UserPromptSubmit Hook
This hook runs before every prompt reaches Claude:
USER PROMPT SUBMIT HOOK
───────────────────────

User types prompt
      ▼
UserPromptSubmit hook

    Input (stdin):
        {
          "prompt": "check if the api is working",
          "session_id": "...",
          "cwd": "/path/to/project"
        }

    Your code runs:
        - Analyze intent
        - Match to skills
        - Decide to augment or pass through

    Output (stdout):
        {
          "modified_prompt": "check if the api is working\n\n[System: Consider using the web-testing skill]"
        }

    Or exit(0) to pass through unchanged
      ▼
Modified prompt reaches Claude
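The contract above can be exercised with a minimal pass-through skeleton. This is a sketch: the `modified_prompt` field follows this guide's output convention, and the `"api" in prompt` check is a placeholder for real intent matching.

```python
import json
import sys

def handle(payload):
    """Return a modified-prompt payload, or None to pass through unchanged."""
    prompt = payload.get("prompt", "")
    if not prompt or prompt.startswith("/"):   # skip empty prompts and slash commands
        return None
    if "api" in prompt.lower():                # placeholder for real intent matching
        return {"modified_prompt": f"{prompt}\n\n[System: Consider using the web-testing skill]"}
    return None

def main():
    # In the real hook this runs under: if __name__ == "__main__": main()
    try:
        payload = json.loads(sys.stdin.read())
    except json.JSONDecodeError:
        sys.exit(0)                            # malformed input: pass through
    result = handle(payload)
    if result is None:
        sys.exit(0)                            # no match: pass through
    print(json.dumps(result))

print(handle({"prompt": "check if the api is working"}))
print(handle({"prompt": "/help"}))  # None: slash commands pass through
```

Keeping the decision logic in `handle()` makes the hook testable without piping JSON through stdin.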
2.5 Cosine Similarity
The standard way to compare embeddings:
import numpy as np

def cosine_similarity(a, b):
    """
    Compute cosine similarity between two vectors.

    cos(θ) = (A · B) / (||A|| × ||B||)

    Returns a value between -1 and 1:
       1.0 = identical direction (most similar)
       0.0 = orthogonal (unrelated)
      -1.0 = opposite direction (least similar)
    """
    dot_product = np.dot(a, b)
    norm_a = np.linalg.norm(a)
    norm_b = np.linalg.norm(b)
    return dot_product / (norm_a * norm_b)
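A quick sanity check of the three reference values in the docstring:

```python
import numpy as np

def cosine_similarity(a, b):
    # Same formula as above: (A · B) / (||A|| × ||B||)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))   # 1.0  (identical direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # 0.0  (orthogonal)
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # -1.0 (opposite direction)
```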
2.6 Precision vs Recall Trade-off
Tuning thresholds affects behavior:
PRECISION vs RECALL
───────────────────

High threshold (0.9)                    Low threshold (0.5)

High precision:                         High recall:
  - Only activates when VERY sure         - Activates on loose matches
  - Few false positives                   - Catches more relevant requests
  - Misses some valid requests            - More false positives

Example at 0.9:                         Example at 0.5:
  "test login"  (0.87) → NO               "test login"  (0.87) → YES
  "verify api"  (0.75) → NO               "verify api"  (0.75) → YES
  "check tests" (0.55) → NO               "check tests" (0.55) → YES

RECOMMENDATION: start at 0.7
  - Balances precision and recall
  - Adjust based on user feedback
  - Consider different thresholds per skill
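To pick a threshold empirically, score a small labeled set of prompts and sweep. A sketch, using the illustrative scores from the table above plus one negative example:

```python
def precision_recall(scored, labels, threshold):
    """scored: prompt -> similarity score; labels: prompt -> should the skill fire?"""
    tp = sum(1 for p, s in scored.items() if s >= threshold and labels[p])
    fp = sum(1 for p, s in scored.items() if s >= threshold and not labels[p])
    fn = sum(1 for p, s in scored.items() if s < threshold and labels[p])
    # Precision defaults to 1.0 when nothing fires at all
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall

scored = {"test login": 0.87, "verify api": 0.75, "check tests": 0.55, "write a poem": 0.40}
labels = {"test login": True, "verify api": True, "check tests": True, "write a poem": False}

for t in (0.5, 0.7, 0.9):
    p, r = precision_recall(scored, labels, t)
    print(f"threshold={t}: precision={p:.2f} recall={r:.2f}")
```

Raising the threshold here trades recall for precision exactly as the table describes: at 0.9 nothing fires, at 0.5 everything relevant does.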
3. Project Specification
3.1 What You Will Build
A UserPromptSubmit hook that:
- Analyzes every user prompt
- Extracts keywords and computes embeddings
- Matches against all available skill descriptions
- Augments the prompt with skill hints when confident
- Passes through unchanged when uncertain
3.2 Functional Requirements
- Prompt Analysis:
- Extract keywords from the prompt
- Compute semantic embedding of the prompt
- Handle edge cases (empty, very short, commands)
- Skill Matching:
- Keyword-based matching (fast, first pass)
- Embedding-based matching (accurate, second pass)
- Combine scores using weighted average
- Decision Making:
- Confidence threshold for activation (default: 0.7)
- Handle multiple matches gracefully
- Skip when user explicitly names a skill
- Prompt Augmentation:
- Append skill hint to the prompt
- Use unobtrusive format
- Preserve original prompt meaning
- Configuration:
- Configurable threshold
- Enable/disable keyword matching
- Enable/disable embedding matching
- Skill blacklist
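One way the configuration options above could be backed by config.json; the file path and key names here are assumptions that mirror the requirement list:

```python
import json
import os

# Defaults used when config.json is missing or omits a key (names assumed)
DEFAULTS = {
    "threshold": 0.7,
    "use_keywords": True,
    "use_embeddings": True,
    "blacklist": [],
}

def load_config(path="~/.claude/hooks/skill_matcher_data/config.json"):
    """Merge config.json over the defaults; a missing file means all defaults."""
    path = os.path.expanduser(path)
    config = dict(DEFAULTS)
    if os.path.exists(path):
        with open(path) as f:
            config.update(json.load(f))
    return config

config = load_config("/tmp/does_not_exist/config.json")
print(config)  # all defaults when the file is missing
```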
3.3 Non-Functional Requirements
- Speed: Hook should complete in under 500ms
- Accuracy: > 80% precision on skill activation
- Efficiency: Cache embeddings to minimize API calls
- Robustness: Handle API failures gracefully
4. Real World Outcome
When you complete this project, here's exactly what you'll experience:
You: i need to check if the api is working on prod
# Behind the scenes:
# 1. UserPromptSubmit hook receives prompt
# 2. Extracts keywords: ["check", "api", "working", "prod"]
# 3. Computes embedding of prompt
# 4. Compares to skill descriptions:
#    - web-testing: 0.87  ← Best match!
# - code-review: 0.42
# - git-commit: 0.31
# 5. Confidence 0.87 > threshold 0.7
# 6. Augments prompt with hint
Skill Matcher Analysis:
Intent: "API testing/verification"
Matched skill: web-testing (confidence: 0.87)
Augmenting prompt with skill context...
# Claude receives:
# "i need to check if the api is working on prod
#
# [System: Consider using the web-testing skill for this task]"
Claude: I'll help you verify the API is working on production.
[Invokes web-testing skill automatically]
Let me test the key endpoints...
Without the skill matcher, Claude might just describe how to test. With it, Claude automatically invokes the appropriate skill.
5. The Core Question You're Answering
"How can I make skill discovery smarter by analyzing user intent and proactively activating the right skill?"
This project teaches you:
- How to augment LLM behavior with pre-processing
- NLP techniques for intent classification
- The trade-offs between keyword and semantic matching
- How to build reliable classification systems
6. Concepts You Must Understand First
6.1 Intent Classification
| Concept | Questions to Answer | Reference |
|---|---|---|
| Intent signals | What words/patterns indicate intent? | "NLP with Python" Ch. 6 |
| Rule-based vs ML | When to use each approach? | This guide, section 2.2 |
| Feature extraction | What features predict intent? | Text preprocessing |
6.2 Semantic Similarity
| Concept | Questions to Answer | Reference |
|---|---|---|
| Embeddings | What are they? How are they computed? | OpenAI embeddings docs |
| Cosine similarity | How do you compare vectors? | Linear algebra basics |
| Threshold tuning | How do you choose the right threshold? | Experimentation |
6.3 Prompt Augmentation
| Concept | Questions to Answer | Reference |
|---|---|---|
| UserPromptSubmit | How does the hook work? | Claude Code docs |
| Prompt injection | How to add hints without confusing Claude? | Careful formatting |
| Pass-through behavior | When should you NOT modify? | Low confidence cases |
7. Questions to Guide Your Design
7.1 What Signals Indicate Skill Intent?
Think about indicators for each skill:
| Skill | Keywords | Semantic Concepts |
|---|---|---|
| web-testing | test, api, verify, check | browser, automation, selenium |
| git-commit | commit, stage, changes | version control, save |
| doc-generator | document, docstring, jsdoc | explain, describe |
| code-review | review, pr, quality | feedback, improve |
7.2 How to Augment the Prompt?
Options for injecting skill hints:
| Approach | Example | Pros/Cons |
|---|---|---|
| Append | prompt + "\n[Use X skill]" | Simple, visible |
| Prepend | "[Skill: X]\n" + prompt | Immediate context |
| System note | prompt + "\n[System: Consider X]" | Less intrusive |
| Context injection | Modify session context | Invisible to user |
7.3 How to Handle Ambiguity?
What to do when:
- Multiple skills match with similar scores?
- No skill matches strongly?
- User explicitly names a different skill?
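A margin-based decision rule handles the first two cases in one place (the 0.1 margin is an assumption to tune); the third case, an explicitly named skill, is best handled earlier by skipping the matcher entirely.

```python
def pick_skill(scores, threshold=0.7, margin=0.1):
    """Return the winning skill, or None when nothing matches or the top two tie."""
    if not scores:
        return None
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_skill, best_score = ranked[0]
    if best_score < threshold:
        return None                                   # no skill matches strongly
    if len(ranked) > 1 and best_score - ranked[1][1] < margin:
        return None                                   # ambiguous: two skills too close
    return best_skill

print(pick_skill({"web-testing": 0.87, "code-review": 0.42}))  # web-testing
print(pick_skill({"web-testing": 0.72, "code-review": 0.70}))  # None (too close to call)
print(pick_skill({"git-commit": 0.55}))                        # None (below threshold)
```

Passing through on ambiguity is usually safer than guessing: a wrong hint is worse than no hint.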
8. Thinking Exercise
8.1 Design the Matching Pipeline
SKILL MATCHER PIPELINE
──────────────────────

Input: "i need to check if the api is working on prod"
      ▼
1. PREPROCESS (lowercase, tokenize):
      ["i", "need", "to", "check", "if", "the", "api", "is", "working", "on", "prod"]
   Remove stopwords:
      ["check", "api", "working", "prod"]
      ▼
2. KEYWORD MATCH (check keyword → skill mappings):
      "api"     → web-testing (0.6)
      "check"   → web-testing (0.4)
      "working" → web-testing (0.3)
   Keyword score: 0.6 (max), or combine somehow
      ▼
3. EMBEDDING SIMILARITY (compare prompt embedding to skill description embeddings):
      web-testing:   0.87  ← best match
      code-review:   0.42
      git-commit:    0.31
      doc-generator: 0.28
      ▼
4. COMBINE SCORES (weighted average: final = 0.3 * keyword + 0.7 * embedding):
      web-testing: 0.3 * 0.6 + 0.7 * 0.87 = 0.79
   Threshold: 0.79 > 0.7 → ACTIVATE
      ▼
Output: augmented prompt with skill hint
Questions to consider:
- What threshold triggers activation?
- Should you always augment, or only when confident?
- How do you handle embedding API latency?
9. The Interview Questions They'll Ask
- "How would you implement intent classification for a skill system?"
  - Expected: Combine keyword matching with semantic similarity
  - Bonus: Discuss hybrid approaches and fallback strategies
- "What's the difference between keyword matching and semantic similarity?"
  - Expected: Keywords are exact/rule-based; embeddings capture meaning
  - Bonus: Trade-offs of speed, accuracy, and cost
- "How do you balance precision and recall in skill activation?"
  - Expected: Threshold tuning, different thresholds per skill
  - Bonus: Discuss consequences of false positives vs false negatives
- "How would you handle latency from embedding API calls?"
  - Expected: Caching, pre-computation, async calls
  - Bonus: Fallback to keyword-only when API is slow
- "What are the risks of automatic skill activation?"
  - Expected: Wrong skill activated, user confusion
  - Bonus: Logging, user override, feedback mechanisms
10. Solution Architecture
10.1 System Component Diagram
SKILL MATCHER ARCHITECTURE
──────────────────────────

hooks/skill_matcher.py
    Main hook script that runs on UserPromptSubmit
      │
      ├── keyword_rules/
      │       keywords.json
      │         - api → testing
      │         - commit → git
      │
      ├── embeddings/
      │       skill_cache.npy  (pre-computed embeddings)
      │
      └── config.json
              threshold: 0.7
              weights: {...}
              blacklist: []

External services:
    OpenAI Embeddings API (text-embedding-3-small)
    or Anthropic embeddings (if available)
10.2 Hook Execution Flow
HOOK EXECUTION FLOW
───────────────────

User submits prompt
      ▼
Claude Code triggers the UserPromptSubmit hook
      ▼
Read stdin: parse the JSON payload {prompt: "..."}
      ▼
Skip if command (/help, /clear, etc.): exit(0)
      ▼
Keyword match (fast first pass)
      │
      ├─ High score (>0.8): use it directly
      │
      └─ Low score (<0.8): embedding match (semantic comparison)
      ▼
Above threshold?
      ├─ Yes: augment prompt, print JSON to stdout {"modified_prompt": "..."}
      └─ No:  exit(0) (pass through unchanged)
11. Implementation Guide
11.1 Phase 1: Set Up the Hook Structure
# Create hooks directory if it doesn't exist
mkdir -p ~/.claude/hooks
# Create the skill matcher hook
touch ~/.claude/hooks/skill_matcher.py
# Create supporting files
mkdir -p ~/.claude/hooks/skill_matcher_data
touch ~/.claude/hooks/skill_matcher_data/keywords.json
touch ~/.claude/hooks/skill_matcher_data/config.json
11.2 Phase 2: Implement Keyword Matching
#!/usr/bin/env python3
"""
Skill Matcher Hook - UserPromptSubmit
Analyzes user prompts and suggests relevant skills.
"""
import json
import sys
from typing import Dict, List

# Configuration
THRESHOLD = 0.7
KEYWORD_WEIGHT = 0.3
EMBEDDING_WEIGHT = 0.7

# Keyword → skill mappings with weights
KEYWORD_RULES = {
    # web-testing
    "test": [("web-testing", 0.5)],
    "testing": [("web-testing", 0.6)],
    "api": [("web-testing", 0.6)],
    "verify": [("web-testing", 0.4)],
    "check": [("web-testing", 0.3)],
    "browser": [("web-testing", 0.7)],
    "login": [("web-testing", 0.5)],
    # git-commit
    "commit": [("git-commit", 0.9)],
    "stage": [("git-commit", 0.6)],
    "changes": [("git-commit", 0.4)],
    # doc-generator
    "document": [("doc-generator", 0.8)],
    "docstring": [("doc-generator", 0.9)],
    "jsdoc": [("doc-generator", 0.9)],
    "documentation": [("doc-generator", 0.8)],
    # code-review
    "review": [("code-review", 0.8)],
    "pr": [("code-review", 0.7)],
    "quality": [("code-review", 0.5)],
}

def extract_keywords(text: str) -> List[str]:
    """Extract keywords from text."""
    # Simple tokenization
    words = text.lower().split()
    # Remove common stopwords
    stopwords = {'i', 'the', 'a', 'an', 'is', 'are', 'was', 'were', 'to', 'of',
                 'and', 'or', 'for', 'on', 'in', 'it', 'this', 'that', 'with',
                 'be', 'have', 'do', 'if', 'my', 'your', 'need', 'want'}
    # Keep tokens of 2+ characters so rules like "pr" can still fire
    return [w for w in words if w not in stopwords and len(w) > 1]

def keyword_match(keywords: List[str]) -> Dict[str, float]:
    """Match keywords to skills, keeping the highest weight seen per skill."""
    scores: Dict[str, float] = {}
    for keyword in keywords:
        for skill, weight in KEYWORD_RULES.get(keyword, []):
            scores[skill] = max(scores.get(skill, 0.0), weight)
    return scores

def main():
    # Read input from stdin
    try:
        payload = json.loads(sys.stdin.read())
    except json.JSONDecodeError:
        sys.exit(0)  # Pass through on invalid input

    prompt = payload.get("prompt", "")

    # Skip empty prompts and slash commands
    if not prompt or prompt.startswith("/"):
        sys.exit(0)

    # Extract keywords
    keywords = extract_keywords(prompt)
    if not keywords:
        sys.exit(0)

    # Keyword matching
    scores = keyword_match(keywords)
    if not scores:
        sys.exit(0)

    # Find best match
    best_skill = max(scores, key=scores.get)
    best_score = scores[best_skill]

    # Check threshold
    if best_score < THRESHOLD:
        sys.exit(0)

    # Augment the prompt
    augmented = f"{prompt}\n\n[System: Consider using the {best_skill} skill for this task]"

    # Output modified prompt
    print(json.dumps({"modified_prompt": augmented}))

if __name__ == "__main__":
    main()
11.3 Phase 3: Add Embedding Matching
#!/usr/bin/env python3
"""
Skill Matcher Hook with Embeddings - UserPromptSubmit

Reuses extract_keywords(), keyword_match(), and KEYWORD_RULES from Phase 2;
define them in this file or import them from the Phase 2 module.
"""
import json
import os
import sys
import numpy as np
from typing import Dict, List, Optional

# Try to import OpenAI for embeddings
try:
    from openai import OpenAI
    EMBEDDINGS_AVAILABLE = True
except ImportError:
    EMBEDDINGS_AVAILABLE = False

# Configuration
THRESHOLD = 0.7
KEYWORD_WEIGHT = 0.3
EMBEDDING_WEIGHT = 0.7
EMBEDDING_CACHE_PATH = os.path.expanduser(
    "~/.claude/hooks/skill_matcher_data/embeddings.json")

# Skill descriptions for embedding comparison
SKILL_DESCRIPTIONS = {
    "web-testing": "Automate web browser testing including login flows, form submissions, and UI verification. Use for testing web functionality.",
    "git-commit": "Help create well-formatted git commit messages following conventional commit format. Use for committing changes.",
    "doc-generator": "Generate documentation for code including Python docstrings, JSDoc comments, and README sections.",
    "code-review": "Comprehensive code review for security, performance, style, and testing. Use for reviewing PRs or code quality.",
}

def load_embedding_cache() -> Dict[str, List[float]]:
    """Load cached embeddings."""
    if os.path.exists(EMBEDDING_CACHE_PATH):
        with open(EMBEDDING_CACHE_PATH) as f:
            return json.load(f)
    return {}

def save_embedding_cache(cache: Dict[str, List[float]]):
    """Save embeddings to cache."""
    os.makedirs(os.path.dirname(EMBEDDING_CACHE_PATH), exist_ok=True)
    with open(EMBEDDING_CACHE_PATH, 'w') as f:
        json.dump(cache, f)

def get_embedding(text: str, cache: Dict[str, List[float]]) -> Optional[List[float]]:
    """Get embedding for text, using cache when available."""
    if text in cache:
        return cache[text]
    if not EMBEDDINGS_AVAILABLE:
        return None
    try:
        client = OpenAI()
        response = client.embeddings.create(
            input=text,
            model="text-embedding-3-small"
        )
        embedding = response.data[0].embedding
        cache[text] = embedding
        return embedding
    except Exception:
        return None  # Fall back to keyword-only matching on API failure

def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Compute cosine similarity between two vectors."""
    a_arr = np.array(a)
    b_arr = np.array(b)
    return float(np.dot(a_arr, b_arr) / (np.linalg.norm(a_arr) * np.linalg.norm(b_arr)))

def embedding_match(prompt: str, cache: Dict[str, List[float]]) -> Dict[str, float]:
    """Match prompt to skills using embeddings."""
    prompt_embedding = get_embedding(prompt, cache)
    if prompt_embedding is None:
        return {}
    scores = {}
    for skill, description in SKILL_DESCRIPTIONS.items():
        skill_embedding = get_embedding(description, cache)
        if skill_embedding:
            scores[skill] = cosine_similarity(prompt_embedding, skill_embedding)
    return scores

def combine_scores(keyword_scores: Dict[str, float],
                   embedding_scores: Dict[str, float]) -> Dict[str, float]:
    """Combine keyword and embedding scores with a weighted average."""
    all_skills = set(keyword_scores) | set(embedding_scores)
    combined = {}
    for skill in all_skills:
        kw_score = keyword_scores.get(skill, 0)
        emb_score = embedding_scores.get(skill, 0)
        combined[skill] = KEYWORD_WEIGHT * kw_score + EMBEDDING_WEIGHT * emb_score
    return combined

def main():
    # Read input
    try:
        payload = json.loads(sys.stdin.read())
    except json.JSONDecodeError:
        sys.exit(0)

    prompt = payload.get("prompt", "")
    if not prompt or prompt.startswith("/"):
        sys.exit(0)

    # Load embedding cache
    cache = load_embedding_cache()

    # Keyword matching
    keywords = extract_keywords(prompt)
    keyword_scores = keyword_match(keywords) if keywords else {}

    # Embedding matching (if available)
    embedding_scores = embedding_match(prompt, cache) if EMBEDDINGS_AVAILABLE else {}

    # Persist any newly computed embeddings now, even if we pass through below
    save_embedding_cache(cache)

    # Combine scores
    if embedding_scores:
        scores = combine_scores(keyword_scores, embedding_scores)
    else:
        scores = keyword_scores

    if not scores:
        sys.exit(0)

    # Find best match
    best_skill = max(scores, key=scores.get)
    best_score = scores[best_skill]

    # Check threshold
    if best_score < THRESHOLD:
        sys.exit(0)

    # Augment prompt
    augmented = (f"{prompt}\n\n[System: Consider using the {best_skill} skill "
                 f"for this task (confidence: {best_score:.2f})]")
    print(json.dumps({"modified_prompt": augmented}))

if __name__ == "__main__":
    main()
11.4 Phase 4: Configure the Hook
Add to your Claude Code settings (.claude/settings.json):
{
  "hooks": {
    "UserPromptSubmit": {
      "command": "python3 ~/.claude/hooks/skill_matcher.py"
    }
  }
}
11.5 Phase 5: Pre-compute Embeddings
Create a script to pre-compute skill description embeddings:
#!/usr/bin/env python3
"""Pre-compute embeddings for skill descriptions."""
import json
import os
from openai import OpenAI

# NOTE: these must match the hook's SKILL_DESCRIPTIONS exactly,
# since the cache is keyed by description text.
SKILL_DESCRIPTIONS = {
    "web-testing": "Automate web browser testing including login flows, form submissions, and UI verification.",
    "git-commit": "Help create well-formatted git commit messages following conventional commit format.",
    "doc-generator": "Generate documentation for code including Python docstrings, JSDoc comments.",
    "code-review": "Comprehensive code review for security, performance, style, and testing.",
}

CACHE_PATH = os.path.expanduser("~/.claude/hooks/skill_matcher_data/embeddings.json")

def main():
    client = OpenAI()
    cache = {}
    for skill, description in SKILL_DESCRIPTIONS.items():
        print(f"Computing embedding for {skill}...")
        response = client.embeddings.create(
            input=description,
            model="text-embedding-3-small"
        )
        cache[description] = response.data[0].embedding
    os.makedirs(os.path.dirname(CACHE_PATH), exist_ok=True)
    with open(CACHE_PATH, 'w') as f:
        json.dump(cache, f)
    print(f"Saved {len(cache)} embeddings to {CACHE_PATH}")

if __name__ == "__main__":
    main()
12. Hints in Layers
Hint 1: Start with Keywords
Create a simple keyword → skill mapping:
if "test" in prompt or "api" in prompt:
    suggest_skill("web-testing")
Get this working before adding embeddings.
Hint 2: Add Embeddings
Pre-compute embeddings for all skill descriptions:
# At startup or in a setup script
for skill, description in SKILL_DESCRIPTIONS.items():
    SKILL_EMBEDDINGS[skill] = get_embedding(description)
Compare prompt embedding to these cached embeddings.
Hint 3: Confidence Threshold
Only augment when confident:
if confidence > 0.7:
    # Augment the prompt
    augmented = f"{prompt}\n\n[System: Consider {skill}]"
else:
    # Pass through unchanged
    sys.exit(0)
Hint 4: Cache Embeddings
Store embeddings to avoid recomputing:
# Load cache at startup
cache = load_cache("embeddings.json")

# Check cache before API call
if text in cache:
    return cache[text]
else:
    embedding = api_call(text)
    cache[text] = embedding
    save_cache(cache)
13. Common Pitfalls & Debugging
13.1 Frequent Mistakes
| Pitfall | Symptom | Solution |
|---|---|---|
| Hook not running | Prompts unchanged | Check settings.json, permissions |
| Slow hook | Noticeable delay | Cache embeddings, use keywords first |
| API errors | Crashes on missing key | Handle exceptions, fallback to keywords |
| Over-activation | Wrong skill suggested | Raise threshold, improve descriptions |
| Under-activation | Skills never suggested | Lower threshold, add more keywords |
13.2 Debugging Steps
- Test hook standalone: echo '{"prompt":"test api"}' | python hook.py
- Check output format: must be valid JSON with modified_prompt
- Log to file: add logging to debug without breaking stdout
- Test thresholds: try different values and observe behavior
14. Extensions & Challenges
14.1 Beginner Extensions
- Add logging to track activations
- Support skill blacklist (never suggest certain skills)
- Show confidence score in output
14.2 Intermediate Extensions
- Learn from feedback (user accepts/rejects suggestions)
- Context-aware matching (consider recent conversation)
- Multiple skill suggestions for complex requests
14.3 Advanced Extensions
- Train custom classifier on usage data
- A/B testing different thresholds
- Integration with skill usage analytics
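As a starting point for the "learn from feedback" extension, a per-skill threshold can be nudged after each accepted or rejected suggestion; the step size and bounds here are assumptions:

```python
def update_threshold(thresholds, skill, accepted, step=0.02,
                     lo=0.5, hi=0.95, default=0.7):
    """Lower the bar when a suggestion is accepted, raise it when rejected."""
    current = thresholds.get(skill, default)
    current += -step if accepted else step
    thresholds[skill] = min(hi, max(lo, current))  # clamp to sane bounds
    return thresholds[skill]

thresholds = {}
update_threshold(thresholds, "web-testing", accepted=True)    # ~0.68
update_threshold(thresholds, "web-testing", accepted=False)   # back to ~0.70
print(thresholds)
```

The clamp keeps a run of rejections from pushing the threshold so high the skill can never fire again.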
15. Books That Will Help
| Topic | Book/Resource | Chapter/Section |
|---|---|---|
| Intent classification | "NLP with Python" by Bird et al. | Chapter 6 |
| Embeddings | "Speech and Language Processing" | Chapter 6 |
| Semantic similarity | "Foundations of Statistical NLP" | Chapter 15 |
| Cosine similarity | Linear algebra textbook | Vector operations |
16. Self-Assessment Checklist
Understanding
- I can explain keyword matching vs semantic similarity
- I understand how embeddings capture meaning
- I know how to tune precision/recall with thresholds
- I understand the UserPromptSubmit hook lifecycle
Implementation
- Keyword matching correctly identifies skills
- Embeddings improve matching accuracy
- Threshold prevents false positives
- Hook completes in reasonable time (<500ms)
Growth
- I can add new skills to the matcher
- I can adjust thresholds based on feedback
- I understand when embeddings help vs hurt
17. Learning Milestones
| Milestone | Indicator |
|---|---|
| Keyword matching works | You understand basic intent signals |
| Embeddings improve accuracy | You understand semantic similarity |
| Threshold tuning works | You've balanced precision and recall |
| Skills activate reliably | You've built intelligent discovery |
This guide was expanded from CLAUDE_CODE_MASTERY_40_PROJECTS.md. For the complete learning path, see the project index.