Project 20: “The Git Context Injector” — Context Management
| Attribute | Value |
|---|---|
| File | KIRO_CLI_LEARNING_PROJECTS.md |
| Main Programming Language | Bash |
| Coolness Level | Level 2: Practical |
| Difficulty | Level 2: Intermediate |
| Knowledge Area | Context Management |
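What you’ll build: A UserPromptSubmit hook that appends the output of git diff --staged (a synonym for git diff --cached, the spelling used in the rest of this project) to each prompt.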
Why it teaches Dynamic Context: The AI always sees the current change set.
Success criteria:
- Prompt includes diff content automatically.
Real World Outcome
You’ll have a context injector that automatically enriches Kiro prompts with git state information, so the AI can see which code is currently changed, staged, or uncommitted. This eliminates the need to manually paste git diff output.
Without the hook:
$ kiro "write tests for the changes I just made"
Kiro: I don't see any recent changes in the conversation. Can you share what files you modified?
With the hook:
$ git add src/auth.ts # Stage your changes
$ kiro "write tests for the changes I just made"
[UserPromptSubmit Hook] Injecting git context...
Enhanced prompt sent to Kiro:
────────────────────────────────────
Original: "write tests for the changes I just made"
Git Context:
Branch: feature/oauth-login
Status: 1 file changed, 45 insertions(+), 12 deletions(-)
Staged Changes:
diff --git a/src/auth.ts b/src/auth.ts
index 1234567..abcdefg 100644
--- a/src/auth.ts
+++ b/src/auth.ts
@@ -10,7 +10,15 @@ export class AuthService {
- async login(username: string, password: string) {
- return this.basicAuth(username, password);
+ async login(provider: 'google' | 'github', token: string) {
+ const user = await this.oauthVerify(provider, token);
+ return this.createSession(user);
}
+
+ private async oauthVerify(provider: string, token: string) {
+ // New OAuth verification logic
+ }
────────────────────────────────────
Kiro: I can see you've refactored the login method to support OAuth. I'll write comprehensive tests for both Google and GitHub OAuth flows, covering token validation, user creation, and session management.
[Kiro writes auth.test.ts with OAuth-specific tests]
Context injection report:
$ bash analyze-context-usage.sh
Git Context Injector Report (Last 30 Days)
───────────────────────────────────────────
Total Prompts: 1,847
Context Injected: 1,245 (67%)
Context Skipped: 602 (33% - no staged changes)
Average Context Size:
- Staged diff: 234 lines
- Unstaged diff: 127 lines
- Recent commits: 3 commits
Top Use Cases:
1. "Write tests for these changes" (387 prompts)
2. "Review this code" (298 prompts)
3. "Fix the bug I introduced" (156 prompts)
4. "Document what I changed" (89 prompts)
Token Budget Impact:
- Average context added: 1,200 tokens
- Prompts that exceeded budget: 12 (0.6%)
- Context truncation applied: 8 times
The hook decides which git information is relevant to each prompt and formats it so the model can parse it easily.
The Core Question You’re Answering
“How do I give AI visibility into my current work context without manually pasting git diffs every time?”
Before you start coding, consider: AI is stateless—it doesn’t know what files you changed, what branch you’re on, or what you committed yesterday. Developers waste time copying git diff output or explaining “I just modified the auth file.” A context injector automates this, making every prompt git-aware. This project teaches you to augment prompts with dynamic, session-specific context that makes AI more effective.
Concepts You Must Understand First
Stop and research these before coding:
- UserPromptSubmit Hook Lifecycle
- When does UserPromptSubmit execute (before prompt is sent to LLM)?
- Can you modify the user’s prompt text?
- How do you append context without overwriting the original prompt?
- What is the maximum context size before truncation is needed?
- Book Reference: Kiro CLI documentation - Hook System Architecture
- Git Plumbing Commands
- How do you get staged changes only (git diff --cached)?
- How do you get unstaged changes (git diff)?
- How do you get recent commit history (git log -n 3 --oneline)?
- How do you check if you’re in a git repository (git rev-parse --is-inside-work-tree)?
- Book Reference: “Pro Git” by Scott Chacon - Ch. 10 (Git Internals)
- Context Relevance Heuristics
- When should you inject diff (user mentions “changes”, “modified”, “tests”)?
- When should you skip injection (generic questions about unrelated topics)?
- How do you detect if the user is asking about code vs asking about concepts?
- Should you always inject branch name and commit history?
- Book Reference: None - requires experimentation and user feedback
- Token Budget Management
- How many tokens does a typical diff consume?
- How do you truncate large diffs (>100 files changed)?
- Should you prioritize staged changes over unstaged?
- How do you summarize commits vs including full diffs?
- Book Reference: “Designing Data-Intensive Applications” by Martin Kleppmann - Ch. 4 (Encoding)
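The git commands listed under Git Plumbing Commands above can be combined into one summary function. This is a minimal sketch rather than a reference solution: the function name git_context_summary and the output labels are made up for illustration, and git symbolic-ref is used for the branch name because it also works in a repository that has no commits yet.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a short summary of the current git state.
# Returns 1 (and prints nothing) when not inside a git work tree.
git_context_summary() {
  git rev-parse --is-inside-work-tree >/dev/null 2>&1 || return 1

  local branch
  # symbolic-ref fails on a detached HEAD, so fall back to a label.
  branch=$(git symbolic-ref --short -q HEAD) || branch='(detached HEAD)'

  echo "Branch: $branch"
  echo "Staged:"
  git diff --cached --stat
  echo "Unstaged:"
  git diff --stat
  echo "Recent commits:"
  # git log exits non-zero when the branch has no commits yet.
  git log -n 3 --oneline 2>/dev/null || echo "(no commits yet)"
}
```

The --stat form keeps the summary to a few lines per file, which leaves the full diff as a separate decision for the token budget.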
Questions to Guide Your Design
Before implementing, think through these:
- Context Selection
- Should you inject staged changes, unstaged changes, or both?
- Do you include recent commits, or only uncommitted work?
- How do you decide between git diff and git show HEAD?
- Should you include file renames, binary file changes, or submodule updates?
- Prompt Augmentation
- Where do you inject context (before prompt, after, or in structured fields)?
- How do you format diffs for readability (syntax highlight, collapse large hunks)?
- Should you summarize (“3 files changed, 45 insertions”) or show full diffs?
- Do you annotate the context (“Git Context:” header) or inject silently?
- Trigger Logic
- Do you inject context on every prompt or only when relevant?
- How do you detect relevance (keyword matching, NLP, always-on)?
- Should users be able to opt out (--no-git-context flag)?
- What if there are no changes—do you inject “no changes” or skip entirely?
- Performance and Safety
- How do you handle repositories with thousands of files changed?
- Should you run git diff synchronously or cache results?
- What if git diff takes 10 seconds (large binary files)?
- How do you avoid leaking secrets in diffs (API keys, passwords)?
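For the opt-out question, one possible convention is a marker typed inside the prompt itself. The sketch below assumes the marker is the literal text --no-git-context (the name comes from the question above; Kiro itself is not assumed to know about it) and that the hook sees the raw prompt text. The function name split_opt_out is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: detect and strip an opt-out marker from the prompt.
# Prints the prompt with the marker removed.
# Returns 0 when git context should be injected, 1 when the user opted out.
split_opt_out() {
  local prompt=$1 marker='--no-git-context'
  case $prompt in
    *"$marker"*)
      # Remove every occurrence of the marker (quoted, so it is literal).
      prompt=${prompt//"$marker"/}
      # Trim trailing whitespace left behind by the removed marker.
      prompt=${prompt%"${prompt##*[![:space:]]}"}
      printf '%s\n' "$prompt"
      return 1
      ;;
  esac
  printf '%s\n' "$prompt"
  return 0
}
```

A hook could call it as clean=$(split_opt_out "$prompt") and branch on the exit status, printing only the cleaned prompt when the status is 1.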
Thinking Exercise
Manual Context Injection Walkthrough
Before writing code, trace how your hook enhances different prompts:
Scenario 1: User asks about recent changes
User prompt: "Review the authentication changes I made"
Hook detects keywords: ["changes", "made"]
Hook runs: git diff --cached
Injected context:
Branch: feature/oauth
Staged: src/auth.ts (+45, -12)
Enhanced prompt:
"Review the authentication changes I made
Git Context:
<diff output>
"
Scenario 2: User asks generic question
User prompt: "What is the difference between OAuth and JWT?"
Hook detects: No code-specific keywords
Hook decision: Skip git context (not relevant)
Prompt sent unchanged:
"What is the difference between OAuth and JWT?"
Scenario 3: Large diff (500 files changed)
User prompt: "Write tests for my refactor"
Hook detects: 500 files changed in staging area
Hook decision: Truncate to top 10 most-changed files
Enhanced prompt:
"Write tests for my refactor
Git Context (truncated to top 10 files):
src/auth.ts (+200, -50)
src/db.ts (+150, -30)
...
[490 more files not shown]
"
Scenario 4: No changes staged
User prompt: "Explain this error message"
Hook runs: git diff --cached
Result: No output (nothing staged)
Hook decision: Inject "No staged changes" summary
Enhanced prompt:
"Explain this error message
Git Context: No staged changes. Branch: main (up to date)
"
Questions while tracing:
- How do you balance verbosity (full diffs) vs conciseness (summaries)?
- Should you always show branch name, even if it’s not relevant?
- How do you handle merge conflicts in diffs?
- What if the user’s prompt is already very long—do you still inject context?
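Scenario 3 needs a way to rank files by how much they changed. One option is to rank the output of git diff --cached --numstat, which prints added lines, deleted lines, and the path separated by tabs, with - in both count columns for binary files. The sketch below reads that format on stdin; the function name top_changed_files and the default limit of 10 are illustrative choices.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `git diff --numstat` lines on stdin and print the
# N most-changed files, followed by a note about how many were left out.
top_changed_files() {
  local limit=${1:-10} ranked total
  ranked=$(awk -F'\t' '{
    added   = ($1 == "-") ? 0 : $1   # binary files report "-" for both counts
    deleted = ($2 == "-") ? 0 : $2
    printf "%d\t%s (+%d, -%d)\n", added + deleted, $3, added, deleted
  }' | sort -k1,1nr)

  [ -n "$ranked" ] || return 0       # nothing staged, nothing to print

  total=$(printf '%s\n' "$ranked" | awk 'END { print NR }')
  # Drop the sort key (first tab-separated field) before printing.
  printf '%s\n' "$ranked" | head -n "$limit" | cut -f2-
  if [ "$total" -gt "$limit" ]; then
    echo "[$((total - limit)) more files not shown]"
  fi
}
```

Typical use would be git diff --cached --numstat | top_changed_files 10. Ranking by added plus deleted lines is only one heuristic; ranking by path (for example, source files before lock files) may serve some prompts better.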
The Interview Questions They’ll Ask
Prepare to answer these:
- “How would you design a heuristic to determine when git context is relevant to a user’s prompt vs when it’s just noise?”
- “Explain the difference between git diff, git diff --cached, and git diff HEAD. When would you use each?”
- “How would you handle a scenario where the git diff output contains sensitive information like API keys or passwords?”
- “What strategies would you use to truncate large diffs (500+ files changed) while preserving the most important information?”
- “How would you implement caching for git commands to avoid running expensive operations on every prompt?”
- “Explain how you would detect if a user’s prompt is asking about code (inject context) vs asking a conceptual question (skip context).”
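One way to prepare for the git diff question is to build a throwaway repository in which the three commands give different answers. The sketch below does that in a temporary directory; the function name, file name, and commit identity are placeholders.

```bash
#!/usr/bin/env bash
# Hypothetical demo: one staged and one unstaged change on top of a commit.
demo_diff_modes() {
  local repo
  repo=$(mktemp -d) || return 1
  (
    cd "$repo" || exit 1
    git init -q .
    printf 'one\n' > f.txt
    git add f.txt
    git -c user.name=demo -c user.email=demo@example.com \
        -c commit.gpgsign=false commit -q -m 'initial'

    printf 'one\ntwo\n' > f.txt          # change A
    git add f.txt                        # A is now staged
    printf 'one\ntwo\nthree\n' > f.txt   # change B stays unstaged

    # Count added content lines ("+..." but not the "+++" file header).
    count_added() { grep -c '^+[^+]'; }
    echo "unstaged (git diff): $(git diff | count_added)"
    echo "staged (git diff --cached): $(git diff --cached | count_added)"
    echo "both (git diff HEAD): $(git diff HEAD | count_added)"
  )
  local status=$?
  rm -rf "$repo"
  return "$status"
}
```

Calling demo_diff_modes reports 1 added line for git diff (index to working tree), 1 for git diff --cached (HEAD to index), and 2 for git diff HEAD (HEAD to working tree).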
Hints in Layers
Hint 1: Start with Always-On Injection
Begin by injecting git context on every prompt. Don’t implement keyword detection yet. Get the basic flow working: read stdin (user prompt), run git diff --cached, append to prompt, output to stdout.
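The flow in Hint 1 can be sketched as a single function. It assumes, as the hint does, that the hook receives the prompt text on stdin and that whatever it prints is what reaches the model; check the Kiro CLI hook documentation for the actual input format before relying on this. The function name inject_git_context and the header text are illustrative.

```bash
#!/usr/bin/env bash
# Hypothetical always-on injector: print the prompt, then any staged diff.
inject_git_context() {
  local prompt=$1 staged

  # Outside a repository (or without git installed), pass the prompt through.
  if ! git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    printf '%s\n' "$prompt"
    return 0
  fi

  staged=$(git diff --cached)
  printf '%s\n' "$prompt"
  # Only add the header when something is actually staged.
  if [ -n "$staged" ]; then
    printf '\nGit Context (staged changes):\n%s\n' "$staged"
  fi
}
```

As a hook entry point, the script's last line would be inject_git_context "$(cat)", which reads the whole prompt from stdin. Keeping the logic in a function makes it easy to test without a running Kiro session.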
Hint 2: Check for Git Repository First
Before running git commands, verify you’re in a repo:
if ! git rev-parse --is-inside-work-tree &>/dev/null; then
  # Not a git repo, skip injection
  echo "$original_prompt"
  exit 0
fi
Hint 3: Format Context for Readability
Use markdown fences to make diffs clear:
echo "$original_prompt"
echo ""
echo "Git Context:"
echo '```diff'
git diff --cached
echo '```'
Hint 4: Truncate Large Diffs
Limit diff size to avoid token budget issues:
diff_lines=$(git diff --cached | wc -l)
if [ "$diff_lines" -gt 500 ]; then
  git diff --cached --stat # Show summary only
else
  git diff --cached
fi
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Git Internals | “Pro Git” by Scott Chacon | Ch. 10 (Git Internals), Ch. 2 (Git Basics) |
| Hook System | Kiro CLI documentation | Hooks System, UserPromptSubmit |
| Shell Scripting | “The Linux Command Line” by William Shotts | Ch. 27 (Flow Control), Ch. 24 (Script Debugging) |
| Token Management | Kiro CLI docs | Context Window Management |
| Text Processing | “Unix Power Tools” by Shelley Powers | Ch. 13 (Searching and Substitution) |
Common Pitfalls & Debugging
Problem 1: “Hook adds context to every prompt, even unrelated questions”
- Why: No relevance detection implemented
- Fix: Add keyword matching:
if echo "$prompt" | grep -qiE '(change|diff|commit|modify|test|review)'; then
  inject_git_context
fi
- Quick test: Ask “What is Python?” and verify no git context is added
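Plain substring matching has a side effect worth testing: diff also matches “difference” and test matches “latest”, so the OAuth vs JWT question from Scenario 2 would still trigger injection. The sketch below matches whole words instead. The function name is_code_prompt and the keyword list are illustrative and would need tuning against real prompts.

```bash
#!/usr/bin/env bash
# Hypothetical relevance check: succeed when the prompt contains a
# code-related keyword as a whole word (case-insensitive).
is_code_prompt() {
  local prompt=$1
  local keywords='changes?|changed|diffs?|commit(s|ted)?|modif(y|ied)|staged|tests?|review|refactor(ed)?'
  # -w: whole-word matches only, -i: ignore case, -q: exit status only.
  printf '%s\n' "$prompt" | grep -qiwE "$keywords"
}
```

With this check, “Review the authentication changes I made” matches, while “What is the difference between OAuth and JWT?” does not.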
Problem 2: “Diff output contains API keys or secrets”
- Why: No secret scanning before injecting context
- Fix: Filter out sensitive patterns:
git diff --cached | grep -vE '(API_KEY|SECRET|PASSWORD|TOKEN)='
- Quick test: Stage a file with API_KEY=abc123, verify that line no longer appears in the injected context
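Dropping matching lines with grep -v hides the secret but also removes lines from the diff, so hunk line counts no longer add up. An alternative is to keep each line and mask only the value. The sketch below does that with sed; the function name redact_secrets and the pattern list are illustrative, and a pattern list like this is a first line of defence, not a real secret scanner.

```bash
#!/usr/bin/env bash
# Hypothetical filter: read a diff on stdin and mask values assigned to
# names containing API_KEY, SECRET, PASSWORD, or TOKEN (uppercase only).
redact_secrets() {
  # Keep everything up to and including "=" or ":", replace the rest of the line.
  sed -E 's/((API_KEY|SECRET|PASSWORD|TOKEN)[A-Z_]*[[:space:]]*[=:][[:space:]]*).*/\1[REDACTED]/'
}
```

Typical use would be git diff --cached | redact_secrets. A staged line +API_KEY=abc123 comes out as +API_KEY=[REDACTED], and lines without a matching name pass through unchanged.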
Problem 3: “Hook is slow, takes 5+ seconds per prompt”
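- Why: Running git diff on a massive repository every time
- Fix: Cache diff results and invalidate on file changes. Key the cache on the index as well as HEAD, because staging a file changes the staged diff without moving HEAD:
# Checksum of the index file changes whenever the staging area changes
index_sum=$(cksum < "$(git rev-parse --git-path index)" | cut -d' ' -f1)
cache_file="/tmp/git-context-$(git rev-parse HEAD)-$index_sum.cache"
if [ ! -f "$cache_file" ]; then
  git diff --cached > "$cache_file"
fi
cat "$cache_file"
- Quick test: Time hook execution; it should take <100ms with a warm cache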
Problem 4: “Large diffs break Kiro’s context window”
- Why: 500-file refactor generates 50,000 lines of diff
- Fix: Implement smart truncation:
if [ "$(git diff --cached | wc -l)" -gt 500 ]; then
  echo "Git Context (large changeset, showing summary):"
  git diff --cached --stat | head -20
  echo "[...truncated...]"
else
  git diff --cached
fi
- Quick test: Create a large diff, verify it’s summarized
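A line count is only a rough proxy for token usage, since one very long line can cost more than many short ones. A character budget is a closer fit. The sketch below assumes roughly 4 characters per token, a common rule of thumb for English text that real tokenizers (especially on code) will not match exactly, so its default of 4800 characters corresponds to about the 1,200-token average mentioned in the report above. The function name limit_context is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pass stdin through unchanged if it fits within
# max_chars, otherwise cut it at a line boundary and append a marker.
limit_context() {
  local max_chars=${1:-4800} context kept
  context=$(cat)

  if [ "${#context}" -le "$max_chars" ]; then
    printf '%s\n' "$context"
    return 0
  fi

  kept=${context:0:max_chars}
  # Drop the last line of the cut, which may be partial. If the cut
  # contains no newline at all, the partial first line is kept as-is.
  kept=${kept%$'\n'*}
  printf '%s\n' "$kept"
  echo "[...truncated...]"
}
```

Typical use would be git diff --cached | limit_context 4800. Cutting at a line boundary avoids handing the model half of a diff line.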
Problem 5: “Hook doesn’t work in subdirectories”
- Why: Git commands run from hook’s directory, not user’s cwd
- Fix: Detect git root and run commands there:
git_root=$(git rev-parse --show-toplevel)
cd "$git_root" || exit 0
git diff --cached
- Quick test: Run Kiro from a subdirectory, verify context injection works
Definition of Done
- Hook intercepts all UserPromptSubmit events
- Git context is injected when user mentions code/changes (keyword detection)
- Staged changes are shown with git diff --cached
- Unstaged changes are optionally included based on prompt
- Branch name and recent commits are included in summary
- Large diffs (>500 lines) are truncated with stats summary
- Hook skips injection when not in a git repository
- Secret patterns (API keys, passwords) are filtered from diffs
- Context is formatted in markdown code fences for readability
- Cache invalidation prevents stale diff data
- Hook completes in <100ms for cached results, <1s for fresh diffs
- Documentation explains how to disable context injection per-prompt