Project 6: Cognitive Load Survey & Heatmap

Build a data-driven survey instrument and visualization tool that measures the cognitive load of teams, identifying which are “Drowning” vs “Thriving.”

Quick Reference

Attribute	Value
Difficulty	Intermediate
Time Estimate	1 Week (15-20 hours)
Primary Language	Python (Data Analysis)
Alternative Languages	R, JavaScript (D3.js)
Prerequisites	Basic data analysis, survey design
Key Topics	Cognitive Load Theory, Team Topologies, Organizational Health

1. Learning Objectives

By completing this project, you will:

Measure cognitive load using survey-based methodology
Distinguish load types (Intrinsic, Extraneous, Germane)
Visualize team health using heatmaps
Identify teams at risk of burnout or failure
Recommend operating model changes based on data

2. Theoretical Foundation

2.1 Core Concepts

The Three Types of Cognitive Load

Cognitive Load Theory (Sweller, 1988) identifies three types of mental burden:

┌─────────────────────────────────────────────────────────────────┐
│                    TOTAL COGNITIVE CAPACITY                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │            INTRINSIC LOAD (Domain Complexity)           │   │
│  │  "How complex is the problem we're solving?"            │   │
│  │  - Business logic                                       │   │
│  │  - Domain knowledge                                     │   │
│  │  - Technical architecture                               │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │            EXTRANEOUS LOAD (Waste)                      │   │
│  │  "What overhead is getting in the way?"                 │   │
│  │  - Broken tools                                         │   │
│  │  - Unclear processes                                    │   │
│  │  - Coordination overhead                                │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │            GERMANE LOAD (Value-Add)                     │   │
│  │  "What productive thinking are we doing?"               │   │
│  │  - Learning new skills                                  │   │
│  │  - Solving novel problems                               │   │
│  │  - Building features                                    │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

GOAL: Minimize EXTRANEOUS, Maintain INTRINSIC, Maximize GERMANE

The Team Capacity Model

HIGH EXTRANEOUS LOAD               HEALTHY LOAD BALANCE
(Team Drowning)                    (Team Thriving)

┌───────────────────┐              ┌───────────────────┐
│ ██████████████████│ Extraneous   │ ████             │ Extraneous
│ ████████          │ Intrinsic    │ ████████████     │ Intrinsic
│ ██                │ Germane      │ ████████████████ │ Germane
└───────────────────┘              └───────────────────┘
   0%          100%                   0%          100%

   "We're busy all day                "We ship valuable
    but ship nothing"                  features daily"

Brooks’ Law Applied

“Adding manpower to a late software project makes it later.”

Why? Because each new person adds cognitive load:

Onboarding requires others’ time
Communication paths grow quadratically (n × (n-1) / 2)
Context sharing becomes harder
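
The quadratic growth is easy to check numerically. The sketch below is only an illustration of the n × (n-1) / 2 formula above; the function name `communication_paths` and the team sizes are assumptions chosen for the example.

```python
def communication_paths(team_size: int) -> int:
    """Distinct person-to-person channels in a team: n * (n - 1) / 2."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


# Illustrative team sizes (not taken from any real organization)
for n in (3, 5, 8, 12):
    print(f"{n:>2} people -> {communication_paths(n):>2} paths")
```

Growing a team from 5 to 8 people adds three colleagues but 18 new communication paths (10 becomes 28), which is why the coordination cost rises faster than headcount.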

2.2 Why This Matters

Cognitive overload is the silent killer of software teams.

Symptoms:

High turnover (“burnout”)
Low velocity despite working overtime
Increasing bugs and incidents
Long cycle times

Managers often try to fix “speed” by:

Adding people (increases coordination overhead)
Working longer hours (increases burnout)
Adding process (increases extraneous load)

The correct fix: Reduce extraneous load through better operating model design.

2.3 Historical Context

Cognitive Load Theory (1988): John Sweller’s research on learning
Team Topologies (2019): Applied cognitive load to team design
DORA Research (2018+): Correlated team cognitive load with performance

2.4 Common Misconceptions

Misconception	Reality
“Smart people can handle more”	Everyone has cognitive limits
“Work harder = more output”	Overload causes errors and rework
“Meetings are necessary”	Many meetings are extraneous load
“Complexity is unavoidable”	Extraneous complexity IS avoidable

3. Project Specification

3.1 What You Will Build

Survey Instrument: Questions that measure cognitive load by type
Data Collection System: Deploy survey, collect responses
Analysis Pipeline: Process responses, calculate scores
Heatmap Visualization: Show team-by-team cognitive load
Recommendations Engine: Suggest operating model changes

3.2 Functional Requirements

Survey Design
- 10-15 questions covering all load types
- Likert scale (1-5) for consistency
- Anonymous to ensure honesty
Data Collection
- Per-team aggregation
- At least 3 responses per team for validity
- Quarterly cadence
Analysis
- Calculate load scores by type
- Identify outliers (high extraneous, low germane)
- Track trends over time
Visualization
- Heatmap showing all teams
- Drill-down to team detail
- Compare quarters
Recommendations
- Based on Team Topologies patterns
- Actionable next steps
- Risk level per team

3.3 Non-Functional Requirements

Survey must complete in < 5 minutes
Analysis must handle 100+ teams
Visualization must be interactive (web-based)
All data must be anonymized at individual level

3.4 Example Usage / Output

Survey Questions:

## Intrinsic Load (Domain Complexity)
How complex is the domain you work in? (1=Simple, 5=Very Complex)
How much specialized knowledge is required? (1=None, 5=Extensive)
How often do you need to understand cross-team dependencies? (1=Never, 5=Constantly)

## Extraneous Load (Waste/Friction)
How often do broken tools slow you down? (1=Never, 5=Constantly)
How much time do you spend in meetings? (1=Very Little, 5=Too Much)
How often do you context-switch between unrelated tasks? (1=Never, 5=Constantly)
How clear are the processes you need to follow? (1=Very Clear, 5=Unclear)
How easy is it to get the information you need? (1=Very Easy, 5=Difficult)

## Germane Load (Value-Add)
How much time do you spend on valuable work? (1=Very Little, 5=Most of My Time)
How often do you learn new things? (1=Never, 5=Often)
How often do you solve novel problems? (1=Never, 5=Often)
How much autonomy do you have? (1=None, 5=Full)

Raw Results (per team):

team,intrinsic_avg,extraneous_avg,germane_avg,response_count
checkout,3.2,4.1,2.0,5
payments,2.8,2.2,3.8,4
platform,3.5,3.0,3.2,6
identity,2.5,4.5,1.8,3
data,4.0,2.0,4.2,5
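
A file in this shape can be read with the standard library alone. The following is a minimal sketch, assuming the column names shown above; the helper `load_results` and the shortened dictionary keys are hypothetical names, and Pandas would work equally well.

```python
import csv
import io

RESULTS_CSV = """team,intrinsic_avg,extraneous_avg,germane_avg,response_count
checkout,3.2,4.1,2.0,5
payments,2.8,2.2,3.8,4
platform,3.5,3.0,3.2,6
identity,2.5,4.5,1.8,3
data,4.0,2.0,4.2,5
"""


def load_results(text):
    """Parse per-team results into dicts with numeric fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "team": row["team"],
            "intrinsic": float(row["intrinsic_avg"]),
            "extraneous": float(row["extraneous_avg"]),
            "germane": float(row["germane_avg"]),
            "responses": int(row["response_count"]),
        })
    return rows


results = load_results(RESULTS_CSV)

# Rank teams by friction, highest extraneous load first
by_friction = sorted(results, key=lambda r: r["extraneous"], reverse=True)
for r in by_friction:
    print(f"{r['team']:<10} extraneous={r['extraneous']:.1f} germane={r['germane']:.1f}")
```

Sorting by extraneous load puts Identity and Checkout at the top, the same two teams the heatmap flags as high risk.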

Heatmap Output:

               COGNITIVE LOAD HEATMAP

         Intrinsic  Extraneous  Germane   STATUS
         ─────────  ──────────  ───────   ──────
Checkout    ●●●○○     ●●●●○     ●●○○○    🔴 HIGH RISK
Payments    ●●●○○     ●●○○○     ●●●●○    🟢 HEALTHY
Platform    ●●●●○     ●●●○○     ●●●○○    🟡 MODERATE
Identity    ●●●○○     ●●●●●     ●●○○○    🔴 HIGH RISK
Data        ●●●●○     ●●○○○     ●●●●○    🟢 HEALTHY

Legend: each ● is one point of the team's average score, rounded half up (out of 5); ○ is an unfilled point
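
One way to produce the dot columns is to round each average and pad to five symbols. This is a sketch under the assumption that scores are rounded half up (so 3.5 shows four dots and 2.5 shows three); the helper name `dots` is hypothetical.

```python
import math


def dots(score: float, scale: int = 5) -> str:
    """Render a 1-5 average as filled and empty dots, rounding half up."""
    filled = math.floor(score + 0.5)          # round half up (assumed convention)
    filled = max(0, min(scale, filled))       # clamp to the scale
    return "●" * filled + "○" * (scale - filled)


# Checkout row from the raw results: intrinsic, extraneous, germane
row = (3.2, 4.1, 2.0)
print("Checkout   ", "     ".join(dots(s) for s in row))
```

Python's built-in `round` uses banker's rounding (2.5 becomes 2, 3.5 becomes 4), so an explicit rule like the one above keeps the display predictable at the .5 boundaries.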

Team Analysis:

## Team: Checkout

### Scores
- Intrinsic Load: 3.2/5 (Moderate)
- Extraneous Load: 4.1/5 (HIGH - RED FLAG)
- Germane Load: 2.0/5 (LOW - RED FLAG)

### Diagnosis
The Checkout team is **drowning in toil**. They have moderate domain
complexity but extremely high friction from tools and processes.
Only 40% of their cognitive capacity is going toward valuable work.

### Symptoms You Might See
- Missed deadlines despite team working overtime
- Low morale, potential turnover risk
- Increasing bug rate
- Avoidance of new initiatives

### Recommended Actions
1. **Platform Intervention**: Platform team should investigate tooling pain
2. **Process Audit**: Review meetings, remove non-essential ones
3. **Boundary Review**: Consider if Checkout scope is too broad
4. **Enabling Team**: Assign enabling team to clear knowledge gaps

### Risk Level: HIGH
Immediate intervention recommended. Team is at burnout risk.

3.5 Real World Outcome

After implementing this system, you should be able to:

Identify 2-3 teams at high risk before they fail
Reduce extraneous load through targeted interventions (target: 30%)
Improve team satisfaction scores
Make re-org decisions based on data

4. Solution Architecture

4.1 High-Level Design

┌─────────────────────────────────────────────────────────────────┐
│                  COGNITIVE LOAD SYSTEM                          │
└─────────────────────────────────────────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        │                     │                     │
        ▼                     ▼                     ▼
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│  SURVEY       │     │  ANALYSIS     │     │  VISUALIZATION│
│               │     │               │     │               │
│ - Google Form │     │ - Pandas      │     │ - Matplotlib  │
│ - Typeform    │     │ - Statistics  │     │ - Plotly      │
│ - Custom      │     │ - Scoring     │     │ - Dash        │
└───────────────┘     └───────────────┘     └───────────────┘
        │                     │                     │
        ▼                     ▼                     ▼
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│  Responses    │     │  Team Scores  │     │  Heatmap      │
│  (CSV)        │     │  (JSON)       │     │  (HTML/PNG)   │
└───────────────┘     └───────────────┘     └───────────────┘

4.2 Key Components

Survey Builder: Design and deploy questionnaire
Response Collector: Gather and validate responses
Score Calculator: Compute load scores per team
Heatmap Generator: Visualize scores across org
Report Generator: Create team-specific recommendations

4.3 Data Structures

# models.py
from dataclasses import dataclass
from enum import Enum
from typing import List

class LoadType(Enum):
    INTRINSIC = "intrinsic"
    EXTRANEOUS = "extraneous"
    GERMANE = "germane"

@dataclass
class SurveyResponse:
    team_id: str
    timestamp: str
    answers: dict  # question_id -> score (1-5)

@dataclass
class TeamScore:
    team_id: str
    intrinsic_score: float  # 1-5 average
    extraneous_score: float
    germane_score: float
    response_count: int
    risk_level: str  # "low", "medium", "high"

@dataclass
class Question:
    id: str
    text: str
    load_type: LoadType
    reverse_scored: bool = False  # Some questions are inverted

# questions.yaml
questions:
  - id: q1
    text: "How complex is the domain you work in?"
    load_type: intrinsic

  - id: q4
    text: "How often do broken tools slow you down?"
    load_type: extraneous

  - id: q9
    text: "How much time do you spend on valuable work?"
    load_type: germane
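
The YAML entries map directly onto the `Question` dataclass. The sketch below assumes the file has already been parsed into a dictionary (for example with PyYAML's `yaml.safe_load`); `parse_questions`, `normalized_score`, the optional `reverse_scored` key, and the `q_deploy` entry are hypothetical additions for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List


class LoadType(Enum):
    INTRINSIC = "intrinsic"
    EXTRANEOUS = "extraneous"
    GERMANE = "germane"


@dataclass
class Question:
    id: str
    text: str
    load_type: LoadType
    reverse_scored: bool = False


def parse_questions(raw: Dict[str, Any]) -> List[Question]:
    """Turn the parsed YAML structure into Question objects."""
    return [
        Question(
            id=item["id"],
            text=item["text"],
            load_type=LoadType(item["load_type"]),  # raises ValueError on a typo
            reverse_scored=item.get("reverse_scored", False),
        )
        for item in raw["questions"]
    ]


def normalized_score(question: Question, score: int, scale_max: int = 5) -> int:
    """Flip reverse-scored answers so that a higher value always means more load."""
    if not 1 <= score <= scale_max:
        raise ValueError(f"score {score} is outside 1-{scale_max}")
    return scale_max + 1 - score if question.reverse_scored else score


# Shape that a YAML parser would return for a file like the one above,
# plus one hypothetical reverse-scored question.
raw = {
    "questions": [
        {"id": "q1", "text": "How complex is the domain you work in?", "load_type": "intrinsic"},
        {"id": "q4", "text": "How often do broken tools slow you down?", "load_type": "extraneous"},
        {"id": "q_deploy", "text": "How easy is it to deploy?", "load_type": "extraneous",
         "reverse_scored": True},
    ]
}
questions = parse_questions(raw)
print(normalized_score(questions[2], 5))  # "very easy to deploy" counts as low load
```

Converting `load_type` through the enum catches misspelled load types at load time rather than silently dropping a question from the averages.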

4.4 Algorithm Overview

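# scoring.py
from collections import defaultdict
from statistics import mean
from typing import Dict, List

from .models import LoadType, Question, SurveyResponse, TeamScore

MIN_RESPONSES = 3  # Need minimum responses for validity

# Question definitions, populated from questions.yaml before scoring
questions: List[Question] = []

def load_average(team_responses: List[SurveyResponse], load_type: LoadType) -> float:
    # Mean of every answer the team gave to questions of one load type.
    # Reverse-scored answers are flipped on the 1-5 scale (6 - score).
    return mean(
        6 - r.answers[q.id] if q.reverse_scored else r.answers[q.id]
        for r in team_responses
        for q in questions
        if q.load_type == load_type
    )

def calculate_team_scores(responses: List[SurveyResponse]) -> List[TeamScore]:
    # Group responses by team
    by_team: Dict[str, List[SurveyResponse]] = defaultdict(list)
    for r in responses:
        by_team[r.team_id].append(r)

    scores = []
    for team_id, team_responses in by_team.items():
        # Skip teams below the minimum response count
        if len(team_responses) < MIN_RESPONSES:
            continue

        # Calculate averages per load type
        intrinsic = load_average(team_responses, LoadType.INTRINSIC)
        extraneous = load_average(team_responses, LoadType.EXTRANEOUS)
        germane = load_average(team_responses, LoadType.GERMANE)

        # Determine risk level
        risk = calculate_risk(intrinsic, extraneous, germane)

        scores.append(TeamScore(
            team_id=team_id,
            intrinsic_score=intrinsic,
            extraneous_score=extraneous,
            germane_score=germane,
            response_count=len(team_responses),
            risk_level=risk
        ))

    return scores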

def calculate_risk(intrinsic, extraneous, germane) -> str:
    # High extraneous + low germane = high risk
    if extraneous > 3.5 and germane < 2.5:
        return "high"
    # Moderate extraneous (3.0 and above) OR low germane
    elif extraneous >= 3.0 or germane < 3.0:
        return "medium"
    else:
        return "low"
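
Applying the risk rules to the five example teams shows how the thresholds behave at the edges. This sketch restates the rules with the moderate extraneous threshold treated as inclusive (`>=`), which is an assumption made so that Platform, at exactly 3.0, lands in the moderate band as in the heatmap example.

```python
def calculate_risk(intrinsic: float, extraneous: float, germane: float) -> str:
    # High extraneous + low germane = high risk
    if extraneous > 3.5 and germane < 2.5:
        return "high"
    # Moderate extraneous (assumed inclusive at 3.0) OR low germane
    if extraneous >= 3.0 or germane < 3.0:
        return "medium"
    return "low"


# (intrinsic, extraneous, germane) from the raw results table
TEAMS = {
    "checkout": (3.2, 4.1, 2.0),
    "payments": (2.8, 2.2, 3.8),
    "platform": (3.5, 3.0, 3.2),
    "identity": (2.5, 4.5, 1.8),
    "data": (4.0, 2.0, 4.2),
}

risk_by_team = {team: calculate_risk(*scores) for team, scores in TEAMS.items()}
for team, risk in risk_by_team.items():
    print(f"{team:<10} {risk}")
```

Note that `intrinsic` does not influence the result: under these rules a team with a hard domain but little friction is still classified as healthy.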

5. Implementation Guide

5.1 Development Environment Setup

# Create project
mkdir cognitive-load-survey && cd cognitive-load-survey
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install pandas matplotlib seaborn plotly pyyaml jupyter

5.2 Project Structure

cognitive-load-survey/
├── data/
│   ├── questions.yaml
│   └── responses.csv  # Collected survey data
├── src/
│   ├── __init__.py
│   ├── models.py      # Data classes
│   ├── loader.py      # Load survey responses
│   ├── scoring.py     # Calculate load scores
│   ├── heatmap.py     # Generate visualizations
│   └── report.py      # Generate team reports
├── notebooks/
│   └── analysis.ipynb # Exploratory analysis
├── output/
│   ├── heatmap.html
│   └── reports/
│       └── team_checkout.md
└── survey/
    └── form.md        # Survey questions in Markdown

5.3 The Core Question You’re Answering

“Is this team slow because they are ‘bad,’ or because we’ve given them an impossible amount of things to remember?”

Cognitive load is the “Silent Killer” of software teams. Designing the operating model means removing extraneous load, not adding people or process.

5.4 Concepts You Must Understand First

Stop and research these before coding:

Intrinsic vs. Extraneous vs. Germane Load
- Which one do we want to maximize?
- Which one do we want to minimize?
- Book Reference: “Team Topologies” Ch. 3 (Team-First Thinking)
Brooks’ Law
- Why does adding people sometimes slow things down?
- Book Reference: “The Mythical Man-Month” by Fred Brooks
Likert Scales
- Why use 5-point vs 7-point scales?
- How do you handle neutral responses?
- Reference: Survey methodology literature

5.5 Questions to Guide Your Design

Before implementing, think through these:

Survey Design

How do you ask “How much do you have to think?” without being vague?
Should you ask about time or about difficulty?
How do you avoid leading questions?

Validity

How many responses per team is “enough”?
What if a team only has 2 members?
How do you handle outliers?

Actionability

Once you find a “Red Team,” what change do you recommend?
Is the recommendation operational (change process) or structural (change boundary)?
Who acts on the recommendation?

5.6 Thinking Exercise

The “Context Switch” Counter

Pick a single work day. Every time you have to:

Stop coding to answer a question
Attend a meeting
Fix a broken tool
Search for information

Mark a tally.

Questions:

How many tallies by noon?
What percentage was “Value Add” (Germane) vs “Frustration” (Extraneous)?
If everyone on your team has 10+ tallies, what does that tell you?

Write down:

Your tally count for 4 hours
Categorize each as Intrinsic / Extraneous / Germane
Calculate percentages

5.7 Hints in Layers

Hint 1: Use the “Four Question” Method

Keep it simple. Ask teams to rate 1-5:

“How easy is it to deploy?” (Extraneous inverse)
“How much of the domain do you understand?” (Intrinsic inverse)
“How much time is spent on ‘toil’?” (Extraneous)
“How often do you get interrupted?” (Extraneous)

Hint 2: Aggregate by Team

Operating model design is about teams, not individuals. Always average at the team level.

Hint 3: Visualize the Gap

Create a scatter plot:

X-axis: Domain Complexity (Intrinsic)
Y-axis: Tooling Friction (Extraneous)
Teams in the top-right corner are your biggest risk.
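
Before plotting, the same quadrant test can be run directly on the scores. This is a minimal sketch that assumes the midpoint of the 1-5 scale (3.0) as the boundary between "low" and "high" on both axes; the cut-off and the name `top_right_teams` are illustrative choices, not part of the survey design.

```python
from typing import Dict, List, Tuple

MIDPOINT = 3.0  # assumed cut-off: the middle of the 1-5 scale


def top_right_teams(teams: Dict[str, Tuple[float, float]],
                    cutoff: float = MIDPOINT) -> List[str]:
    """Teams whose intrinsic (x) and extraneous (y) scores both exceed the cutoff."""
    return [
        name
        for name, (intrinsic, extraneous) in teams.items()
        if intrinsic > cutoff and extraneous > cutoff
    ]


# (intrinsic, extraneous) pairs from the raw results table
TEAMS = {
    "checkout": (3.2, 4.1),
    "payments": (2.8, 2.2),
    "platform": (3.5, 3.0),
    "identity": (2.5, 4.5),
    "data": (4.0, 2.0),
}

flagged = top_right_teams(TEAMS)
print(flagged)
```

With the example data only Checkout falls in the top-right quadrant. Identity sits in the top-left (simple domain, very high friction), so the scatter plot complements the risk level rather than replacing it.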

Hint 4: Map to Team Topologies

High Tooling Complexity → Need Platform Team intervention
High Domain Complexity + Large Team → Consider boundary split
Low Germane Load → Enabling Team needed
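
These three mappings can be encoded as simple rules. The sketch below is one possible reading: it treats "tooling complexity" as the extraneous score, and the thresholds (`high`, `low`) and the `large_team` size are hypothetical defaults you would tune to your own organization.

```python
from typing import List


def recommend(intrinsic: float, extraneous: float, germane: float,
              team_size: int, high: float = 3.5, low: float = 2.5,
              large_team: int = 9) -> List[str]:
    """Map load scores to Team Topologies style interventions."""
    actions = []
    if extraneous >= high:
        actions.append("Platform team intervention: investigate tooling friction")
    if intrinsic >= high and team_size >= large_team:
        actions.append("Boundary review: consider splitting the team's scope")
    if germane <= low:
        actions.append("Enabling team: support learning and close knowledge gaps")
    return actions


# Checkout scores from the example; the team size of 6 is a made-up value
checkout_actions = recommend(intrinsic=3.2, extraneous=4.1, germane=2.0, team_size=6)
for action in checkout_actions:
    print("-", action)
```

Returning a list rather than a single label lets one team receive several recommendations, as in the Checkout report, and an empty list signals that no intervention is suggested.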

5.8 The Interview Questions They’ll Ask

Prepare to answer these:

“How do you measure cognitive load in a software team?”
- Survey-based (perceived load), plus objective metrics (services owned, on-call burden)
“What is the difference between Extraneous and Germane cognitive load?”
- Extraneous = waste (tools, process, coordination). Germane = valuable work (learning, building)
“How does team size affect cognitive load?”
- Communication overhead grows O(n²). Larger teams = more coordination load.
“What are the signs that a team is suffering from too much cognitive load?”
- Low velocity, high burnout, increasing bugs, avoiding new work
“How can an ‘Enabling Team’ help reduce cognitive load?”
- Knowledge transfer reduces Intrinsic load. Tooling improvements reduce Extraneous load.

5.9 Books That Will Help

Topic	Book	Chapter
Cognitive Load Theory	“Team Topologies” by Skelton & Pais	Ch. 3: Team-First Thinking
Team Size Effects	“The Mythical Man-Month” by Brooks	Ch. 2
Psychological Safety	“The Fearless Organization” by Edmondson	All
Survey Design	Various methodology texts	-

5.10 Implementation Phases

Phase 1: Survey Design (2-3 hours)

Write 12-15 questions covering all load types
Test with 2-3 colleagues for clarity
Deploy using Google Forms / Typeform

Phase 2: Data Collection (1-2 weeks, passive)

Send survey to all teams
Remind after 3 days
Close after 1 week
Export to CSV

Phase 3: Analysis (3-4 hours)

Load data into Pandas
Calculate team-level averages
Identify high-risk teams
Calculate correlations

Phase 4: Visualization (3-4 hours)

Create heatmap using Matplotlib/Seaborn
Add interactive version with Plotly
Create scatter plot of risk dimensions

Phase 5: Reporting (3-4 hours)

Write template for team reports
Generate report for each high-risk team
Include specific recommendations
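
A report template can be as small as a function that fills in a Markdown skeleton. The following sketch assumes the `TeamScore` fields from the data structures section; the `band` cut-offs (4.0 and 3.0) and the function names are hypothetical, and the layout only loosely follows the Checkout example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TeamScore:
    team_id: str
    intrinsic_score: float
    extraneous_score: float
    germane_score: float
    response_count: int
    risk_level: str


def band(score: float) -> str:
    # Assumed label cut-offs; tune them to your own survey data
    if score >= 4.0:
        return "High"
    if score >= 3.0:
        return "Moderate"
    return "Low"


def render_report(score: TeamScore, actions: List[str]) -> str:
    """Build a Markdown report for one team."""
    lines = [
        f"## Team: {score.team_id.title()}",
        "",
        "### Scores",
        f"- Intrinsic Load: {score.intrinsic_score:.1f}/5 ({band(score.intrinsic_score)})",
        f"- Extraneous Load: {score.extraneous_score:.1f}/5 ({band(score.extraneous_score)})",
        f"- Germane Load: {score.germane_score:.1f}/5 ({band(score.germane_score)})",
        "",
        "### Recommended Actions",
    ]
    lines += [f"{i}. {action}" for i, action in enumerate(actions, start=1)]
    lines += ["", f"### Risk Level: {score.risk_level.upper()}"]
    return "\n".join(lines)


checkout = TeamScore("checkout", 3.2, 4.1, 2.0, 5, "high")
report = render_report(checkout, [
    "Platform team investigates tooling pain",
    "Review meetings, remove non-essential ones",
])
print(report)
```

Keeping the report as plain Markdown text means it can be written straight to `output/reports/` and tracked in version control.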

5.11 Key Implementation Decisions

Decision	Option A	Option B	Recommendation
Survey tool	Google Forms	Custom form	Google Forms (faster)
Anonymity	Fully anonymous	Team-level only	Team-level (actionable)
Visualization	Static PNG	Interactive HTML	Interactive (Plotly)
Reporting	Markdown	PDF	Markdown (version control)

6. Testing Strategy

Data Validation

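# tests/test_scoring.py
import pytest

# make_response is a small test helper you write yourself: it builds a
# SurveyResponse for the given team and fills in any answers not listed.

def test_minimum_responses():
    # Teams with < 3 responses should be excluded
    responses = [make_response(team="small", answers={...})]
    scores = calculate_team_scores(responses)
    assert len(scores) == 0

def test_score_calculation():
    # Assumes q1 and q2 are the only intrinsic questions in the test fixture
    responses = [
        make_response(team="test", answers={"q1": 4, "q2": 3})
    ] * 5
    scores = calculate_team_scores(responses)
    assert scores[0].intrinsic_score == pytest.approx(3.5)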

Visualization Testing

Verify heatmap renders without errors
Verify risk colors match thresholds
Manual review of output

Survey Testing

Test with 5 people before sending
Check completion time < 5 min
Verify questions are unambiguous

7. Common Pitfalls & Debugging

Problem	Symptom	Root Cause	Fix
Low response rate	< 50% completion	Survey too long or unclear value	Shorten survey, communicate purpose
All teams look similar	No differentiation	Questions too vague	Add more specific questions
Data seems wrong	Doesn’t match intuition	Sample bias or bad questions	Investigate outliers, refine questions
No action taken	Reports ignored	No clear owner for action	Assign accountability for each recommendation

8. Extensions & Challenges

Extension 1: Longitudinal Tracking

Run quarterly, track trends over time. Show if interventions are working.
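
Trend tracking reduces to subtracting one quarter's team scores from the next. A minimal sketch, assuming scores are kept as one dictionary per survey round; the names `quarter_deltas` and `is_improving` are hypothetical, and the second-quarter numbers are invented purely for illustration.

```python
from typing import Dict

LOAD_TYPES = ("intrinsic", "extraneous", "germane")
Scores = Dict[str, Dict[str, float]]


def quarter_deltas(previous: Scores, current: Scores) -> Scores:
    """Per-team change in each load score for teams present in both rounds."""
    deltas = {}
    for team in sorted(previous.keys() & current.keys()):
        deltas[team] = {
            load: round(current[team][load] - previous[team][load], 2)
            for load in LOAD_TYPES
        }
    return deltas


def is_improving(delta: Dict[str, float]) -> bool:
    # An intervention is working when friction falls and valuable work rises
    return delta["extraneous"] < 0 and delta["germane"] > 0


q1 = {"checkout": {"intrinsic": 3.2, "extraneous": 4.1, "germane": 2.0}}  # example table
q2 = {"checkout": {"intrinsic": 3.2, "extraneous": 3.4, "germane": 2.9}}  # hypothetical follow-up

deltas = quarter_deltas(q1, q2)
print(deltas["checkout"], "improving:", is_improving(deltas["checkout"]))
```

Only teams present in both rounds are compared, so a team that fell below the minimum response count in one quarter is skipped instead of being reported with a misleading change.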

Extension 2: Correlation Analysis

Correlate cognitive load with:

Deployment frequency
Incident rate
Employee satisfaction (eNPS)
Turnover
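
For a first pass, a Pearson correlation between a load score and one outcome metric is enough. The sketch below implements the standard formula with the standard library; the extraneous scores come from the example table, while the incident counts are made-up numbers used only to show the call.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


extraneous = [4.1, 2.2, 3.0, 4.5, 2.0]  # checkout, payments, platform, identity, data
incidents = [9, 3, 5, 11, 2]            # hypothetical incidents per quarter

r = pearson(extraneous, incidents)
print(f"extraneous load vs incidents: r = {r:.2f}")
```

With only a handful of teams the coefficient is very noisy, and a strong correlation still does not show that friction causes incidents, so treat the number as a prompt for investigation.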

Extension 3: Predictive Model

Build ML model to predict which teams will have high turnover based on load scores.

Extension 4: Real-Time Load

Integrate with Jira/GitHub to estimate load from work-in-progress, not just surveys.

9. Real-World Connections

Research:

DORA State of DevOps reports correlate cognitive load with performance
Google re:Work research on team effectiveness

Tools:

DX Survey (getdx.com)
SPACE Framework (GitHub/Microsoft)
Team Health Check (Spotify model)

10. Resources

Team Topologies

Survey Tools

Visualization

P01: Team Interaction Audit - Map communication patterns
P08: Dependency Visualizer - Visualize dependencies

11. Self-Assessment Checklist

Before considering this project complete, verify:

I can explain the three types of cognitive load
Survey has 10+ questions covering all load types
Survey completed by at least 3 teams (15+ responses)
Heatmap clearly shows high-risk teams in red
At least one team report written with specific recommendations
I can explain why Brooks’ Law matters for operating models
Results have been shared with at least one stakeholder

12. Submission / Completion Criteria

This project is complete when you have:

questions.yaml with 12+ validated questions
responses.csv with 20+ responses from 5+ teams
Heatmap visualization (HTML or PNG)
Team reports for at least 2 high-risk teams
Executive summary with org-level findings
Presentation of findings to stakeholder (optional)
