Project 30: “The Recursive Prompt Improver” — Metacognition
| Attribute | Value |
|---|---|
| File | KIRO_CLI_LEARNING_PROJECTS.md |
| Main Programming Language | Natural Language |
| Coolness Level | Level 5: Pure Magic |
| Difficulty | Level 3: Advanced |
| Knowledge Area | Metacognition |
What you’ll build: Ask Kiro to analyze its own steering and improve it based on mistakes.
Why it teaches Self-Correction: Your system gets smarter over time.
Success criteria:
- A new steering rule prevents a repeated mistake.
Real World Outcome
You will have a Kiro CLI workflow that enables self-improvement through metacognitive analysis - Kiro analyzes its own behavior, identifies patterns of mistakes, and proposes steering rules to prevent future errors. When you run it, you’ll see:
Triggering Self-Analysis:
$ kiro "You keep running eslint without --fix, so I have to manually fix errors. Analyze this pattern and improve your steering."
🧠 Metacognitive Analysis Mode
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Analyzing recent conversation history...
├─ Reviewing last 50 messages
├─ Identifying repeated patterns
└─ Searching for corrective feedback
🔍 Pattern Detected: ESLint Usage Anti-Pattern
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Occurrences: 8 times in last 2 weeks
Example 1 (2024-01-01):
Kiro: "Running ESLint to check for issues..."
$ eslint src/
User: "You found 23 errors but didn't fix them. Run it with --fix!"
Kiro: "Sorry! Running eslint src/ --fix"
Example 2 (2024-01-02):
Kiro: "Let me check for linting errors..."
$ eslint src/components/
User: "Again, you forgot --fix. Always use --fix!"
Kiro: "My apologies. Running eslint src/components/ --fix"
Example 3 (2023-12-29):
[Similar pattern repeated 6 more times]
Root Cause Analysis:
├─ Current steering doesn't specify --fix flag
├─ Default behavior is to report errors, not fix them
└─ User expects automatic fixing in 100% of cases
📝 Proposed Steering Rule
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Title: "Always Use ESLint --fix Flag"
Rule:
When running ESLint for any reason, ALWAYS include the --fix flag
to automatically fix correctable errors. Do not run eslint without
--fix unless the user explicitly requests a "dry run" or "report only".
Correct:
$ eslint src/ --fix
$ eslint src/**/*.js --fix
Incorrect:
$ eslint src/
$ eslint --debug src/
Exceptions:
- User says "check for errors but don't fix"
- User says "dry run"
- Running in CI/CD (--fix would modify files)
Priority: High (affects 40% of linting tasks)
Impact:
Before: 8 occurrences of this mistake
After: 0 expected occurrences
Time saved: ~2 minutes per occurrence = ~8 min/week
Do you want me to add this steering rule to ~/.kiro/steering.md? (y/n)
User Approves and Rule is Added:
$ y
✅ Steering Rule Added
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
File: ~/.kiro/steering.md
Section: ## Linting & Code Quality
Rule: "Always Use ESLint --fix Flag"
Full rule saved. This will apply to all future conversations.
Would you like me to analyze other patterns? (y/n)
Subsequent Behavior (Rule in Effect):
$ kiro "Check the code quality in src/"
🔧 Running Code Quality Check
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
$ eslint src/ --fix
[Applying steering rule: "Always Use ESLint --fix Flag"]
Fixed 12 errors automatically:
├─ 7 × no-unused-vars
├─ 3 × prefer-const
└─ 2 × no-console
Remaining 3 errors require manual intervention:
├─ src/utils.js:45 - Undefined variable 'config'
├─ src/app.js:102 - Missing return statement
└─ src/api.js:23 - Async function without await
✓ Auto-fixable errors resolved
⚠ Manual errors remain - see above
Advanced: Multi-Pattern Analysis:
$ kiro "/meta-analyze last 100 messages"
🧠 Comprehensive Metacognitive Analysis
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Analyzing conversation history...
└─ Processed 100 messages, 42 tool calls, 18 corrections
Patterns Detected:
1. 📦 NPM Install Forgetting (12 occurrences)
- You forget to run npm install after adding dependencies
- Proposed Rule: "Always run npm install after package.json changes"
2. 🧪 Test Running Incomplete (8 occurrences)
- You run tests but don't check if they passed
- Proposed Rule: "Always verify test results before proceeding"
3. 📝 Commit Messages Too Vague (15 occurrences)
- Commit messages like "fix bug" instead of descriptive
- Proposed Rule: "Commit messages must describe WHAT and WHY"
4. 🔒 Secrets Exposure (2 occurrences)
- You almost committed API keys in .env files
- Proposed Rule: "NEVER commit .env files, always .gitignore them"
Generate steering rules for all 4 patterns? (y/n)
You’re seeing exactly what metacognition enables - a system that learns from its mistakes and self-improves through reflective analysis!
The Core Question You’re Answering
“How do you make an AI agent that learns from its own mistakes without retraining or fine-tuning?”
Before you write any code, sit with this question. Most AI systems are static - they repeat the same mistakes forever because they have no mechanism for self-correction. But humans learn through:
- Reflection - “I made a mistake”
- Analysis - “Why did I make it?”
- Abstraction - “What pattern caused this?”
- Rule Formation - “What rule would prevent it?”
- Application - “Follow the rule next time”
This is metacognition - thinking about thinking. The system watches itself, identifies failure modes, and updates its behavior.
Traditional approach (static):
Kiro: [makes mistake]
User: [corrects]
Kiro: "Sorry, fixed it"
[2 days later]
Kiro: [makes same mistake again]
Metacognitive approach (self-improving):
Kiro: [makes mistake]
User: [corrects]
Kiro: "I notice I've made this mistake 8 times. Let me analyze..."
Kiro: [proposes steering rule]
User: "Yes, add that rule"
Kiro: [updates steering]
[2 days later]
Kiro: [applies rule, avoids mistake]
This is how AlphaGo learned (self-play), how humans learn (reflection), and how expert systems evolve (knowledge base updates).
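The self-improving loop above can be sketched in a few lines of Python. Every name here (the message shape, the `kind` field, the threshold) is illustrative for this exercise, not part of any real Kiro API:

```python
# Minimal sketch of the metacognitive loop (illustrative names only).
def metacognitive_loop(history, threshold=3):
    # 1. Reflection: find user corrections in the conversation history
    corrections = [m for m in history
                   if m["role"] == "user" and m.get("is_correction")]
    # 2. Analysis: group corrections by the kind of mistake
    by_kind = {}
    for c in corrections:
        by_kind.setdefault(c["kind"], []).append(c)
    # 3. Rule formation: propose a rule once a kind repeats enough times
    return [f"Rule: avoid repeated mistake '{kind}'"
            for kind, items in by_kind.items() if len(items) >= threshold]

history = [
    {"role": "user", "is_correction": True, "kind": "forgot npm install"},
    {"role": "user", "is_correction": True, "kind": "forgot npm install"},
    {"role": "user", "is_correction": True, "kind": "forgot npm install"},
    {"role": "user", "is_correction": True, "kind": "vague commit message"},
]
print(metacognitive_loop(history))
# → ["Rule: avoid repeated mistake 'forgot npm install'"]
```

The one-off "vague commit message" correction stays below the threshold, so no rule is proposed for it — exactly the 3+ occurrences heuristic discussed later.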
Concepts You Must Understand First
Stop and research these before coding:
- Metacognition (Thinking About Thinking)
- What is metacognition? (awareness of one’s own thought processes)
- How do humans self-correct? (error detection → analysis → strategy change)
- What is the OODA loop? (Observe, Orient, Decide, Act)
- Book Reference: “Thinking, Fast and Slow” Ch. 20-21 (Self-Monitoring) - Daniel Kahneman
- Conversation Analysis & Pattern Mining
- How do you detect repeated patterns in text? (regex, n-grams, semantic clustering)
- What is cosine similarity for semantic patterns? (vector comparison)
- How do you extract “correction events”? (user says “no, do it this way”)
- Paper Reference: “Extracting Patterns from Conversational Data” - NLP literature
- Steering/System Prompts
- What is a system prompt? (instructions that guide LLM behavior)
- How do steering rules work? (constraints added to every request)
- What’s the difference between few-shot examples and rules? (examples vs constraints)
- Docs Reference: Anthropic’s “Prompt Engineering Guide”
- Rule Synthesis from Examples
- How do you generalize from specific examples? (abstraction)
- What makes a good rule? (clear, actionable, measurable)
- How do you avoid overfitting rules? (too specific = not generalizable)
- Book Reference: “AI: A Modern Approach” Ch. 19 (Learning from Examples) - Russell & Norvig
- Feedback Loops & System Stability
- What is a feedback loop? (output affects future input)
- What is positive vs negative feedback? (amplifying vs dampening)
- How do you prevent runaway rule creation? (too many rules = conflict)
- Book Reference: “Thinking in Systems” Ch. 1 (Feedback Loops) - Donella Meadows
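For the pattern-mining concept above, cosine similarity is easy to verify by hand — a self-contained sketch using only the standard library:

```python
import math

def cosine_similarity(a, b):
    # Dot product of a and b divided by the product of their magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing in the same direction score 1.0; orthogonal vectors score 0.0
print(round(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]), 3))  # → 1.0
print(round(cosine_similarity([1.0, 0.0], [0.0, 1.0]), 3))            # → 0.0
```

In practice you would feed this real embedding vectors; the clustering hint later uses scikit-learn's cosine metric rather than computing similarities by hand.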
Questions to Guide Your Design
Before implementing, think through these:
- Pattern Detection
- How do you identify a “mistake”? (user correction keywords: “no”, “actually”, “you forgot”)
- How many occurrences make a “pattern”? (3+ times = pattern, <3 = one-off)
- How do you cluster similar mistakes? (semantic similarity of corrections)
- Analysis Triggering
- User-initiated (“/meta-analyze”) vs automatic (after 3 corrections)?
- Real-time (during conversation) vs batch (end of day)?
- Threshold-based (trigger after N mistakes)?
- Rule Formulation
- Template-based (“Always X when Y”) vs freeform?
- Should rules include examples (few-shot) or just constraints?
- How specific should rules be? (per-project vs global)
- Rule Storage & Application
- Where are rules stored? (steering.md, JSON config, database)
- How are rules loaded? (startup vs dynamic reload)
- Priority/precedence: What if rules conflict? (specific > general)
- Validation & Testing
- How do you test if a rule works? (simulate past mistakes, verify prevention)
- How do you detect bad rules? (too restrictive, blocks valid actions)
- Should rules expire? (remove if not triggered in 3 months)
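As a sketch of the storage question, a steering file can be parsed into discrete rules by splitting on section headings. This assumes each rule lives under its own `## ` heading in `steering.md` — a convention for this exercise, not a documented Kiro format:

```python
import os
import re

def load_steering(path="~/.kiro/steering.md"):
    """Parse a steering markdown file into a list of {title, body} rules.

    Assumes each rule starts with a '## ' heading (illustrative convention).
    """
    path = os.path.expanduser(path)
    if not os.path.exists(path):
        return []
    with open(path) as f:
        text = f.read()
    # Split on heading lines; the chunk before the first '## ' is preamble
    sections = re.split(r"^## ", text, flags=re.MULTILINE)[1:]
    rules = []
    for section in sections:
        title, _, body = section.partition("\n")
        rules.append({"title": title.strip(), "body": body.strip()})
    return rules
```

Loading returns structured rules you can filter per action, which also makes the later pruning and conflict questions tractable (you can count, merge, or expire entries instead of treating the file as one opaque prompt).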
Thinking Exercise
Trace Metacognitive Loop
Before coding, manually trace this self-improvement cycle:
Given:
- Conversation history: 50 messages
- User has corrected Kiro 3 times for forgetting npm install
Trace each step:
- Error Detection (Reflection)
Message 12:
Kiro: "I've added express to package.json"
User: "You forgot to run npm install!"
Kiro: "Installing now: npm install"
Message 28:
Kiro: "Added jsonwebtoken to dependencies"
User: "npm install? You always forget this!"
Kiro: "Sorry! Running npm install"
Message 45:
Kiro: "Updated to React 18 in package.json"
User: "AGAIN! npm install!!"
Kiro: "My apologies. Running npm install"
- Question: How do you detect the correction pattern? (user frustration escalates)
- Pattern Extraction
# Pseudocode
corrections = []
for i, msg in enumerate(messages):
    if user_corrected(msg):  # Contains "forgot", "you always", "again"
        corrections.append({
            'index': i,
            'context': messages[i-1],   # What Kiro did wrong
            'correction': msg,
            'fix': messages[i+1]        # What Kiro did to fix
        })
# Group similar corrections
clusters = cluster_by_similarity(corrections)
# Cluster 1: "npm install" corrections (3 occurrences)
- Question: What similarity threshold defines a cluster? (cosine > 0.8?)
- Root Cause Analysis
Cluster: "NPM Install Forgetting" (3 occurrences)
Common pattern:
1. Kiro modifies package.json (add/update dependency)
2. Kiro does NOT run npm install
3. User reminds Kiro to run npm install
4. Kiro runs npm install
Root cause:
- Current steering doesn't link package.json changes → npm install
- Kiro treats them as independent actions
- Question: How do you infer causality? (sequence analysis: A always followed by B)
- Rule Synthesis
Proposed Steering Rule:
## Dependency Management
**Always run `npm install` after modifying package.json**
When you add, update, or remove dependencies in package.json,
IMMEDIATELY run `npm install` to sync node_modules.
Correct sequence:
1. Edit package.json (add dependency)
2. Run npm install
3. Verify installation succeeded
Don't forget this step - it's required for dependencies to be usable.
- Question: Is this rule too specific? (what about yarn, pnpm?)
- User Approval & Application
User: y (approves rule)
# Rule added to ~/.kiro/steering.md
# Next conversation, rule is loaded
Kiro: "Adding lodash to package.json..."
[Applying rule: "Always run npm install after modifying package.json"]
Kiro: "Running npm install..."
$ npm install
Kiro: "✓ lodash installed successfully"
- Question: How do you verify the rule prevented the mistake? (no correction needed)
Questions while tracing:
- What if the user corrects something that’s actually context-specific? (rule would be wrong)
- What if two rules conflict? (“Always X” vs “Never X in situation Y”)
- What if a rule is too broad? (blocks valid edge cases)
The Interview Questions They’ll Ask
Prepare to answer these:
- “Explain the difference between metacognition in humans and self-improvement in AI systems. What are the key similarities and differences?”
- “Your Kiro agent proposes a steering rule that’s too specific: ‘Always use port 3000 for Express servers.’ How would you generalize this into a better rule?”
- “You’ve added 50 steering rules over 6 months. Now Kiro is slow and rules conflict. How do you prune and consolidate rules?”
- “Walk me through how you would detect that a steering rule is harmful (blocking valid actions). What metrics would you track?”
- “How would you prevent an adversarial user from poisoning the steering rules by giving intentionally bad corrections?”
- “Explain the concept of ‘overfitting’ in machine learning. How does it relate to creating overly specific steering rules?”
Hints in Layers
Hint 1: Start with Manual Analysis Don’t automate pattern detection immediately. First, manually review your conversation history and identify 3 real mistakes Kiro made repeatedly. Write them down with examples.
Hint 2: Implement Correction Detection Scan conversation history for user corrections using keyword matching:
correction_keywords = [
    "you forgot",
    "you always",
    "again",
    "no, do it this way",
    "that's wrong",
    "actually",
    "incorrect",
]

corrections = []
for msg in messages:
    if any(kw in msg.content.lower() for kw in correction_keywords):
        corrections.append(msg)  # Mark as correction
Hint 3: Cluster Similar Corrections Use embeddings to group semantically similar corrections:
import numpy as np
from sklearn.cluster import DBSCAN

# Embed each correction
correction_embeddings = np.array([embed(c.content) for c in corrections])

# Cluster with cosine distance (eps = 1 - similarity threshold)
clustering = DBSCAN(eps=0.3, min_samples=2, metric='cosine').fit(correction_embeddings)

# clustering.labels_ groups the corrections, e.g.:
# Cluster 0: npm install corrections
# Cluster 1: eslint --fix corrections
# Cluster 2: commit message corrections
Hint 4: Extract Pattern Context For each cluster, extract the Kiro action that preceded the correction:
# Assumes clusters have been grouped into objects with a .corrections list
for cluster in clusters:
    for correction in cluster.corrections:
        prev_msg = get_previous_message(correction)  # What Kiro did
        next_msg = get_next_message(correction)      # How Kiro fixed it
        pattern = {
            'mistake': prev_msg.content,
            'correction': correction.content,
            'fix': next_msg.content,
        }
Hint 5: Generate Rule Template Use an LLM to synthesize a rule from the pattern:
prompt = f"""
Based on these examples of a repeated mistake:
Example 1: {pattern_1}
Example 2: {pattern_2}
Example 3: {pattern_3}
Generate a steering rule that would prevent this mistake in the future.
Format:
## [Category]
**[Rule Title]**
[Rule description with examples of correct behavior]
"""
proposed_rule = llm(prompt)
Hint 6: Present for User Approval Display the proposed rule and ask for confirmation:
print(f"""
Proposed Steering Rule:
{proposed_rule}
Impact:
- Occurrences: {len(pattern.examples)}
- Estimated time saved: {time_estimate}
Add this rule to steering.md? (y/n)
""")
Hint 7: Append Rule to Steering File
If approved, append to ~/.kiro/steering.md:
import os

if user_approves:
    with open(os.path.expanduser('~/.kiro/steering.md'), 'a') as f:
        f.write(f"\n\n{proposed_rule}\n")
    print("✅ Rule added successfully")
Hint 8: Verify Rule Application In future conversations, check if the rule prevents the mistake:
# Load steering rules at startup
steering_rules = load_steering('~/.kiro/steering.md')

# Before each action, check rules
if action == 'modify package.json':
    relevant_rules = [r for r in steering_rules if 'npm install' in r]
    if relevant_rules:
        print("[Applying rule: 'Always run npm install after package.json changes']")
        run_npm_install()
Books That Will Help
| Topic | Book | Chapter |
|---|---|---|
| Metacognition & Self-Monitoring | “Thinking, Fast and Slow” by Daniel Kahneman | Ch. 20-21 |
| Learning from Examples | “Artificial Intelligence: A Modern Approach” by Russell & Norvig | Ch. 19 |
| Feedback Loops & Systems | “Thinking in Systems” by Donella Meadows | Ch. 1 |
| Pattern Mining in Text | “Speech and Language Processing” by Jurafsky & Martin | Ch. 8 (Sequence Labeling) |
| Prompt Engineering | “The Prompt Engineering Guide” (online) | All chapters |
Common Pitfalls & Debugging
Problem 1: “Too many false positives - normal feedback detected as corrections”
- Why: Overly broad keyword matching. “Actually, that looks good” is not a correction.
- Fix: Use sentiment analysis or semantic similarity. Corrections have negative sentiment + suggest alternative action.
- Quick test: Review 10 detected “corrections” - should all be actual mistakes, not positive feedback.
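A cheap first-pass filter (a heuristic sketch, not a substitute for real sentiment analysis): count a keyword hit as a correction only when the message also suggests an alternative action. The keyword and verb lists here are illustrative:

```python
import re

# Keywords that signal a correction; "actually" alone is too noisy, so it's omitted
CORRECTION_KEYWORDS = ["you forgot", "you always", "that's wrong", "do it this way"]
# A correction usually proposes an alternative action (imperative verb or command)
ACTION_HINT = re.compile(r"\b(run|use|add|fix|install|include)\b", re.IGNORECASE)

def is_correction(message):
    """Heuristic: keyword hit AND the message suggests an alternative action."""
    has_keyword = any(kw in message.lower() for kw in CORRECTION_KEYWORDS)
    return has_keyword and bool(ACTION_HINT.search(message))

print(is_correction("You forgot to run npm install!"))         # → True
print(is_correction("Actually, that looks good. Ship it."))    # → False
```

This two-signal check drops most positive-feedback false positives while keeping genuine corrections; embeddings or sentiment scoring can refine it further.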
Problem 2: “Proposed rules are too specific - don’t generalize”
- Why: Rule synthesized from a single example, not a pattern.
- Fix: Require 3+ examples before creating a rule. Use LLM to generalize.
- Quick test: Proposed rule should apply to related scenarios, not just the exact mistake.
Problem 3: “Steering file grows unbounded - 100+ rules conflict”
- Why: No pruning or consolidation mechanism.
- Fix: Implement rule expiry (remove if not triggered in 6 months), rule merging (combine similar rules).
- Quick test: Count rules - should be <50 active rules at any time.
Problem 4: “Rule prevents valid actions - too restrictive”
- Why: Rule doesn’t account for edge cases or exceptions.
- Fix: Add exception clauses to rules (“unless user explicitly requests X”).
- Quick test: User should be able to override any rule with explicit instruction.
Problem 5: “Metacognitive analysis is too slow - takes 30 seconds”
- Why: Embedding all messages and clustering on every request.
- Fix: Cache analysis results, run batch analysis periodically (not real-time).
- Quick test: Analysis should complete in <5 seconds for 100-message history.
Definition of Done
- Correction detection works: Identifies when user corrects Kiro’s behavior
- Pattern clustering works: Groups similar corrections into clusters (3+ examples)
- Root cause analysis works: Identifies the Kiro action that triggered correction
- Rule synthesis works: LLM generates a clear, actionable steering rule
- User approval flow works: Displays proposed rule and waits for confirmation
- Rule persistence works: Approved rules are saved to steering.md
- Rule application works: New conversations load and apply steering rules
- Mistake prevention verified: Past mistakes no longer occur after rule is added
- Rule quality checks: Rules are general enough to apply broadly, specific enough to be actionable
- Performance is acceptable: Metacognitive analysis completes in <10 seconds