Project 3: “The Context Detective” — LLM Context Window

| Attribute | Value |
| --- | --- |
| File | KIRO_CLI_MASTERY.md |
| Main Programming Language | Bash / Kiro Commands |
| Coolness Level | Level 2: Practical |
| Difficulty | Level 1: Beginner |
| Knowledge Area | LLM Context Window |

What you’ll build: Experiments to visualize token usage. Load a large file, use /context show, /compact, and /context remove.

Why it teaches Context: You learn to garbage-collect the AI's memory, pruning and summarizing context to maintain performance and accuracy.

Core challenges you’ll face:

  • Distinguishing file context from chat history.
  • Detecting when instructions are being forgotten.

Success criteria:

  • You can reduce context size without losing key constraints.

Real World Outcome

You’ll understand exactly how Kiro manages its 200K token context window and learn to use commands to prevent context overflow. You’ll be able to see what’s consuming context and strategically free up space.

Example Output:

$ kiro chat

You: "Load the entire src/ directory"

Kiro: [reads 250 files totaling 180K tokens]
  Context usage: 90% (180K / 200K tokens)
  ⚠️  WARNING: Approaching context limit

  I've loaded your source code, but we're at 90% context capacity.
  Consider using `/compact` to summarize our conversation or
  `/context remove` to free up space.

You: "/context show"

Kiro:
┌─────────────────────────────────────────────────┐
│ Context Window Usage: 180K / 200K (90%)         │
├─────────────────────────────────────────────────┤
│ Category            │ Tokens    │ Percentage   │
├─────────────────────┼───────────┼──────────────┤
│ Files (src/)        │ 140K      │ 70%          │
│ Chat History        │  25K      │ 12.5%        │
│ Steering Files      │   8K      │  4%          │
│ System Prompt       │   5K      │  2.5%        │
│ Tools/Resources     │   2K      │  1%          │
└─────────────────────┴───────────┴──────────────┘

Top file consumers:
  1. src/components/Dashboard.tsx      → 18K tokens
  2. src/api/handlers.ts               → 15K tokens
  3. src/lib/database.ts               → 12K tokens

You: "/context remove src/components/Dashboard.tsx"

Kiro: Removed src/components/Dashboard.tsx (18K tokens)
  New context usage: 162K / 200K (81%)

You: "/compact"

Kiro: [summarizes conversation history]
  Compacted 25K tokens of chat history → 5K token summary
  New context usage: 142K / 200K (71%)

  Summary preserved:
  - Your project uses Next.js 14 with TypeScript
  - Current task: debugging authentication flow
  - Key files: src/api/handlers.ts, src/lib/database.ts

You: "Now analyze the auth flow"

Kiro: [works with 71% context usage]
  Looking at src/api/handlers.ts...
  [analysis continues with plenty of context headroom]

The context usage meter updates in real time as files and messages consume space.


The Core Question You’re Answering

“What happens when I load too much code into Kiro’s context, and how do I manage the 200K token budget without losing important information?”

Before experimenting, understand: Kiro’s context window is like RAM—finite and precious. Once full, either Kiro auto-compacts (potentially losing details) or refuses to load more. This project teaches you to be a context window architect: strategic about what you load, when to summarize, and how to preserve critical constraints.


Concepts You Must Understand First

Stop and research these before experimenting:

  1. Token Counting
    • What is a “token” in LLM terms?
    • How many tokens does a typical code file consume?
    • Do comments, whitespace, and variable names count as tokens?
    • Reference: Kiro Context Management
  2. Context Window Composition
    • What’s the breakdown of Kiro’s 200K context? (files, chat, steering, system prompt)
    • Which components are fixed (system prompt) vs dynamic (chat history)?
    • How does adding a steering file affect available space?
    • Book Reference: “Designing Data-Intensive Applications” by Kleppmann - Ch. 1 (Foundations)
  3. Compaction vs Removal
    • What’s the difference between /compact (summarize) and /context remove (delete)?
    • What information is lost during compaction?
    • When should you compact vs when should you remove?
    • Reference: Slash Commands Reference
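
While researching token counting, it helps to have a rough way to estimate a file's cost before loading it. Here is a minimal shell sketch, assuming the common ~4-characters-per-token rule of thumb; Kiro's actual tokenizer will produce different numbers:

```shell
# estimate_tokens: rough per-file token estimate.
# Assumption: ~4 characters per token, a common rule of thumb for
# English text and code; real tokenizers vary.
estimate_tokens() {
  for f in "$@"; do
    chars=$(wc -c < "$f")
    printf '%s: ~%d tokens\n' "$f" "$((chars / 4))"
  done
}
```

Run it as `estimate_tokens src/auth/login.tsx` before a `/context add` to judge whether a file is worth its share of the budget.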

Questions to Guide Your Design

Before experimenting, think through these:

  1. Loading Strategy
    • Should you load the entire codebase at once or selectively load files as needed?
    • How do you decide which files are “important enough” to keep in context?
    • What’s the tradeoff between having more context vs faster responses?
  2. Compaction Timing
    • Should you wait for Kiro’s auto-compact (80% threshold) or manually compact earlier?
    • What information must survive compaction? (steering rules, architectural decisions, bug context)
    • How do you verify that compaction preserved the right details?
  3. Multi-File Workflows
    • When debugging across 10 files, how do you keep all relevant context loaded?
    • How do you avoid reloading files you’ve already removed?
    • Should you use subagents for parallel file analysis instead?

Thinking Exercise

Exercise: Context Budget Allocation

You have 200K tokens. You’re debugging a Next.js authentication bug. Plan your context budget:

Available: 200K tokens

Fixed costs:
- System prompt:       5K
- Steering files:      8K
- Tools/Resources:     2K
─────────────────────────
Remaining budget:    185K

You need to analyze:

  • src/auth/login.tsx (12K tokens)
  • src/api/auth.ts (8K tokens)
  • src/lib/session.ts (6K tokens)
  • src/middleware.ts (4K tokens)
  • .env.example (1K tokens)
  • Chat history will grow over 50 messages (~25K tokens)

Questions while planning:

  • How much space should you reserve for growing chat history?
  • If you run out of space mid-conversation, which file would you remove first?
  • Should you proactively compact at 60% or wait until 80%?
  • Could you use /grep to search files instead of loading them entirely?
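
One way to sanity-check your plan is to total the figures above with shell arithmetic. A sketch using the exercise's numbers (all values in K tokens; the 80% check mirrors the auto-compact threshold mentioned earlier):

```shell
# Context budget check, in K tokens, using the exercise's figures.
TOTAL=200
FIXED=$((5 + 8 + 2))             # system prompt + steering files + tools
FILES=$((12 + 8 + 6 + 4 + 1))    # the five files listed above
CHAT=25                          # projected chat-history growth

USED=$((FIXED + FILES + CHAT))
PCT=$((USED * 100 / TOTAL))
echo "Projected usage: ${USED}K / ${TOTAL}K (${PCT}%)"   # 71K / 200K (35%)

# Compare against the 80% auto-compact threshold
if [ "$PCT" -ge 80 ]; then
  echo "Over threshold: plan a /compact or /context remove up front"
else
  echo "Headroom: $((TOTAL - USED))K tokens"
fi
```

With these numbers the plan fits comfortably, which suggests the real risk in this exercise is chat history growing past the 25K estimate, not the initial file load.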

The Interview Questions They’ll Ask

  1. “Explain how Kiro’s 200K context window is allocated between files, chat history, and system components.”
  2. “What’s the difference between /compact and /context remove? When would you use each?”
  3. “How would you debug an issue across 20 files without exceeding the context window?”
  4. “What information is lost when Kiro auto-compacts at 80% context usage?”
  5. “How do steering files affect available context space?”
  6. “What strategies would you use to work with a codebase larger than 200K tokens?”

Hints in Layers

Hint 1: Monitor Before You Act. Always run /context show before making decisions. Don’t guess about usage; measure it. This shows exactly what’s consuming space.

Hint 2: Load Incrementally. Don’t run /context add src/ to load everything. Instead, load specific files: /context add src/auth/login.tsx. Add more only when needed. Start small, expand gradually.

Hint 3: Use Grep for Reconnaissance. Before loading a file into context, use /grep to search it. Example: /grep "authenticate" src/auth/login.tsx. This finds info without burning context tokens.

Hint 4: Compact Early and Often. Don’t wait until 90% usage. When you finish a subtask (e.g., “fixed login bug”), run /compact to summarize that work and free up space for the next subtask.
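
Hint 3’s reconnaissance step can be rehearsed outside Kiro with plain grep. A self-contained sketch follows; it creates a throwaway sample file so the commands run as-is, but in practice you would point grep at your real source tree:

```shell
# Reconnaissance with standard grep (outside Kiro).
# A throwaway sample file is created so the commands run as-is.
mkdir -p /tmp/ctx-demo/src/auth
cat > /tmp/ctx-demo/src/auth/login.tsx <<'EOF'
export async function authenticate(user: string, pass: string) {
  return checkCredentials(user, pass);
}
EOF

# -n: line numbers; -C 1: one line of context around each match
grep -n -C 1 "authenticate" /tmp/ctx-demo/src/auth/login.tsx

# Count matches per file to decide which files are worth loading
grep -rc "authenticate" /tmp/ctx-demo/src | sort -t: -k2 -nr
```

The match counts give you a cheap ranking of which files actually touch the topic, so you only spend context tokens on the ones near the top.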


Books That Will Help

| Topic | Book | Chapter |
| --- | --- | --- |
| Token-based language models | “Speech and Language Processing” by Jurafsky & Martin | Ch. 3: N-gram Language Models |
| Memory management principles | “Computer Systems: A Programmer’s Perspective” by Bryant & O’Hallaron | Ch. 9: Virtual Memory |
| Resource allocation strategies | “Designing Data-Intensive Applications” by Kleppmann | Ch. 1: Foundations of Data Systems |

Common Pitfalls & Debugging

Problem 1: “Kiro auto-compacted and forgot my instructions”

  • Why: Instructions were only in chat history, not in steering files or repeated in context
  • Fix: Put persistent instructions in .kiro/steering/*.md, not in chat messages
  • Quick test: After /compact, ask Kiro to repeat your key constraints—they should still be remembered

Problem 2: “Context fills up instantly when I load files”

  • Why: Loaded entire directory with /context add src/ instead of specific files
  • Fix: Remove with /context clear, then load only essential files one by one
  • Quick test: /context show → check if you’re loading files you don’t actually need

Problem 3: “Can’t load more files even though context shows 40%”

  • Why: A single file may be too large (>50K tokens), or you may be hitting rate limits
  • Fix: Use /grep to search file instead of loading it, or load only relevant sections
  • Quick test: Check file size with wc -w filename.tsx (words × 1.3 ≈ tokens)

Problem 4: “After compaction, Kiro gives different answers”

  • Why: Compaction creates a lossy summary; nuance is lost
  • Fix: Before compacting, explicitly state what must be remembered: “Key constraint: authentication must use JWT tokens with 1-hour expiry”
  • Quick test: Ask Kiro “what are the key constraints?” before and after /compact

Definition of Done

  • Ran /context show and understand the breakdown of usage
  • Loaded a large file (>10K tokens) and observed context percentage increase
  • Used /context remove <file> to free up space and verified percentage decrease
  • Triggered auto-compaction by reaching 80% context usage (or manually ran /compact)
  • Verified that key instructions survived compaction
  • Experimented with /grep as an alternative to loading full files
  • Identified which components are fixed (system prompt) vs dynamic (chat history)
  • Can explain when to use compact vs remove vs subagents for large tasks